Facebook owner Meta Platforms plans to deploy into its data centers this year a new version of a custom chip aimed at supporting its artificial intelligence push, according to an internal company document.

The chip, a second generation of an in-house silicon line Meta announced last year, could help to reduce Meta’s dependence on the Nvidia chips that dominate the market and control the spiralling costs associated with running AI workloads as it races to launch AI products.

The world’s biggest social media company has been scrambling to boost its computing capacity for the power-hungry generative AI products it is pushing into Facebook, Instagram and WhatsApp and hardware devices like its Ray-Ban smart glasses, spending billions of dollars to amass arsenals of specialised chips and reconfigure data centers to accommodate them.

At the scale at which Meta operates, a successful deployment of its own chip could potentially shave off hundreds of millions of dollars in annual energy costs and billions in chip purchasing costs, according to Dylan Patel, founder of the silicon research group SemiAnalysis.

The chips, infrastructure and energy required to run AI applications has become a giant sinkhole of investment for tech companies, to some degree offsetting gains made in the rush of excitement around the technology.

A Meta spokesman confirmed the plan to put the updated chip into production in 2024, saying it would work in coordination with the hundreds of thousands of off-the-shelf graphics processing units (GPUs) — the go-to chips for AI — the company was buying.

“We see our internally developed accelerators to be highly complementary to commercially available GPUs in delivering the optimal mix of performance and efficiency on Meta-specific workloads,” the spokesman said.

H100 processors

Meta CEO Mark Zuckerberg last month said the company planned to have by the end of the year roughly 350 000 flagship H100 processors from Nvidia, which produces the most sought-after GPUs used for AI. Combined with other suppliers, Meta would accumulate the equivalent compute capacity of 600 000 H100s in total, he said.

The deployment of its own chip as part of that plan is a positive turn for Meta’s in-house AI silicon project, after a decision by executives in 2022 to pull the plug on the chip’s first iteration.

Read: Meta’s extraordinary comeback

The company instead opted to buy billions of dollars’ worth of Nvidia’s GPUs, which have a near monopoly on an AI process called training that involves feeding enormous data sets into models to teach them how to perform tasks.

The new chip, referred to internally as “Artemis”, like its predecessor can perform only a process known as inference in which the models are called on to use their algorithms to make ranking judgments and generate responses to user prompts.

Reuters last year reported that Meta was also working on a more ambitious chip that, like GPUs, would be capable of performing both training and inference.

The company shared details about the first generation of its Meta Training and Inference Accelerator (MTIA) program last year. The announcement portrayed that version of the chip as a learning opportunity.

Despite those early stumbles, an inference chip could be considerably more efficient at crunching Meta’s recommendation models than the energy thirsty Nvidia processors, according to Patel. “There is a lot of money and power being spent that could be saved.” — Katie Paul, Stephen Nellis and Max A Cherney, (c) 2024 Reuters