AI อะไรเนี่ย

Snap Accelerates A/B Testing with NVIDIA GPUs, Cuts Costs

Snap, the parent company of Snapchat, has sped up its A/B testing infrastructure by running NVIDIA's open data processing libraries on Google Cloud. The change produced a 4x speedup in data processing runtime and a 76% reduction in daily operational costs. The work, detailed in a recent NVIDIA Blog post by Sid Sharma, illustrates how GPU-accelerated computing is being applied to large-scale data analytics at consumer internet companies.

What Happened: Faster Experiments, Lower Costs

Snapchat, which serves over 940 million monthly active users, relies heavily on A/B testing to refine and launch new features. The company runs thousands of these experiments each month, which means processing more than 10 petabytes of data with Apache Spark within a three-hour daily window. The challenge was to scale this experimentation without escalating computing costs.
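To put the three-hour window in perspective, the short calculation below estimates the average throughput those figures imply. It is a rough sketch: it assumes decimal (SI) petabytes and treats "more than 10 petabytes" as exactly 10, so the result is a lower bound. The function name is illustrative and does not come from the source.

```python
# Rough lower bound on sustained throughput implied by the article's figures.
PETABYTE = 10**15  # bytes; assumes decimal (SI) units, which the source does not specify


def required_throughput_bytes_per_s(total_bytes: float, window_hours: float) -> float:
    """Average bytes per second needed to finish within the window."""
    return total_bytes / (window_hours * 3600)


# "More than 10 petabytes" in a three-hour window, taken as exactly 10 PB.
rate = required_throughput_bytes_per_s(10 * PETABYTE, 3)
print(f"{rate / 10**9:.0f} GB/s sustained")  # about 926 GB/s
```

In other words, the pipeline has to sustain close to a terabyte per second across the cluster for the whole window.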

To tackle this, Snap adopted NVIDIA cuDF, an open-source library that allows Apache Spark applications to run on NVIDIA GPUs with minimal code changes. By deploying these GPU-accelerated Spark applications on NVIDIA L4 GPUs within Google Cloud's G2 virtual machines, managed via Google Kubernetes Engine, Snap achieved:

  • 4x speedups in runtime for their massive data processing workloads.
  • 76% daily cost savings compared to their previous CPU-only workflows.
  • Optimized pipelines that required only 2,100 concurrent GPUs, a significant reduction from an initial projection of 5,500 GPUs.
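The percentages behind these figures are simple to check. The sketch below uses only the numbers quoted above. The helper name is illustrative, and the calculation says nothing about how Snap measured its results.

```python
def fractional_reduction(before: float, after: float) -> float:
    """Share of the original quantity that was eliminated."""
    return (before - after) / before


# 5,500 GPUs initially projected vs. 2,100 concurrent GPUs after optimization.
gpu_cut = fractional_reduction(5500, 2100)
print(f"{gpu_cut:.1%} fewer GPUs than projected")  # 61.8%

# A 4x speedup leaves 1/4 of the original runtime;
# a 76% saving leaves 24% of the original daily cost.
runtime_left = 1 / 4
cost_left = 1 - 0.76
print(f"runtime: {runtime_left:.0%} of CPU baseline, cost: {cost_left:.0%} of CPU baseline")
```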

The gains also draw on NVIDIA CUDA-X libraries, the broader GPU-optimized software stack that cuDF belongs to, giving Snap a full-stack platform for data processing at scale.

Why It Matters: Scaling Innovation for Users

For a platform like Snapchat, which constantly evolves its features – from AI-generated stickers to performance optimizations – rapid and cost-effective A/B testing is crucial for continuous innovation. The ability to run more experiments faster means new features and improvements can reach users more quickly and reliably. As Prudhvi Vatala, senior engineering manager at Snap, noted, "Experimentation is at the core of our company. Changing our data infrastructure from CPUs to GPUs allows us to efficiently scale this experimentation to more features, more metrics and more users over time."

This successful migration demonstrates a sustainable path to scale for companies facing similar big data challenges. It highlights how integrating specialized hardware like GPUs with open-source software like NVIDIA cuDF for Apache Spark can lead to dramatic improvements in both performance and cost efficiency.

What It Means for Developers and the Ecosystem

The success story at Snap provides a powerful blueprint for other organizations dealing with vast datasets and the need for accelerated analytics. The ease of deployment, which allows developers to run existing Apache Spark applications on GPUs without extensive code rewriting, lowers the barrier to entry for GPU acceleration.
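The article does not show Snap's configuration. As a minimal sketch of what "without extensive code rewriting" usually looks like, the example below assumes the RAPIDS Accelerator for Apache Spark, the plugin through which cuDF typically accelerates Spark jobs. Acceleration is switched on through Spark settings, and the job's own code stays the same. The jar path, app name, function names, and GPU-sharing values are hypothetical placeholders, not Snap's settings.

```python
from typing import Dict


def rapids_spark_conf(plugin_jar: str, gpus_per_executor: int = 1,
                      tasks_per_gpu: int = 4) -> Dict[str, str]:
    """Spark settings that load the RAPIDS Accelerator plugin.

    plugin_jar is a hypothetical path to the accelerator jar; the
    GPU amounts are illustrative defaults, not tuned values.
    """
    return {
        "spark.jars": plugin_jar,
        "spark.plugins": "com.nvidia.spark.SQLPlugin",  # loads the accelerator
        "spark.rapids.sql.enabled": "true",
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        # Fraction of a GPU per task, so several tasks can share one device.
        "spark.task.resource.gpu.amount": str(1 / tasks_per_gpu),
        # Log operators that could not be placed on the GPU.
        "spark.rapids.sql.explain": "NOT_ON_GPU",
    }


def build_session(conf: Dict[str, str], app_name: str = "ab-test-metrics"):
    """Create a SparkSession with the given settings (requires pyspark)."""
    from pyspark.sql import SparkSession  # imported lazily; needs a Spark install

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


conf = rapids_spark_conf("/path/to/rapids-4-spark.jar")  # placeholder path
```

With a session built this way, existing DataFrame and SQL code runs as written. Operators the plugin supports execute on the GPU and the rest fall back to the CPU, which is why the explain setting is useful when first migrating a pipeline.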

Looking ahead, Snap plans to extend this GPU-accelerated Spark framework beyond its A/B testing team to a wider range of production workloads, which could bring similar runtime and cost gains to other parts of the company's data operations.

To dive deeper into Snap's journey, you can tune into a detailed session at NVIDIA GTC, where engineering managers from Snap will share their insights and experiences.

Read more: "Snap Decisions: How Open Libraries for Accelerated Data Processing Boost A/B Testing for Snapchat" on the NVIDIA Blog.