Expanding Meta’s Custom Silicon to Power Our AI Workloads


In 2023, we developed the Meta Training and Inference Accelerator (MTIA), a family of custom-built silicon chips to power our AI workloads efficiently. Now, we’re developing and deploying four new generations of chips within the next two years — a much faster pace than typical chip cycles — to support ranking, recommendations, and GenAI workloads. 

As our current AI workloads continue to grow and evolve, we’re taking a portfolio approach to scale our infrastructure capacity by sourcing silicon from a range of industry leaders, while keeping our own MTIA custom silicon at the center of our AI infrastructure strategy. 

A Custom Solution

We deploy hundreds of thousands of MTIA chips for inference workloads across both organic content and ads on our apps. These chips are designed specifically for our workloads and are part of a custom full-stack solution, helping us create a highly optimized system tailored to our needs. For our intended purposes, this system achieves greater compute efficiency than general-purpose chips, making MTIA much more cost efficient. 

Four Chips in Two Years 

We’re continuing to advance the MTIA roadmap by developing four new generations of chips, each bringing significant improvements in compute, memory bandwidth, and efficiency. MTIA 300 will be used for ranking and recommendations training, and is already in production. MTIA 400, 450, and 500 will be capable of handling all workloads, but we will primarily use these chips to support GenAI inference production in the near future and into 2027.

The modularity of our silicon allows these new chips to drop into existing rack system infrastructure, accelerating time-to-production.

Our MTIA Strategy

We’ve developed a competitive strategy for MTIA by prioritizing rapid, iterative development, an inference-first focus, and frictionless adoption by building natively on industry standards. 

Rapid, Iterative Development 

While the industry typically launches a new AI chip every one to two years, we’ve developed the capacity to release ours every six months or less by building on our modular, reusable designs. This accelerated pace enables us to quickly adapt to evolving AI techniques, adopt the latest hardware technologies, and minimize costs associated with developing and deploying new chip generations. 

Inference-First Focus 

Mainstream chips are typically built for the most demanding workload — large-scale GenAI pre-training — and then applied, often less cost-effectively, to other workloads like GenAI inference. We take the opposite approach: MTIA 450 and 500 are optimized first for GenAI inference, and they can then be used to support other workloads as needed, including ranking and recommendations training and inference, as well as GenAI training. This keeps MTIA well-tuned to the anticipated growth in GenAI inference demand.

Building on Industry Standards

From the beginning, MTIA has been built on industry-standard software and hardware ecosystems, like PyTorch, vLLM, Triton, and the Open Compute Project (OCP), enabling frictionless adoption of MTIA chips. Beyond industry-standard software, MTIA’s system and rack solutions align with OCP standards, enabling MTIA to be seamlessly deployed in data centers.

Our Portfolio Approach 

No single chip can meet the demands of all our workloads, which is why we’re working to deploy a variety of chips, each optimized for a different workload. We believe our portfolio approach will enable us to advance and innovate at an unmatched pace, bringing us closer to our goal of creating personal superintelligence for all. 

To learn more about the MTIA roadmap, head to the Meta AI blog.




