Page Loader
Summarize
Musk's xAI cooks up world's most powerful AI training cluster
Around 100,000 liquid-cooled GPUs from NVIDIA are being used

Musk's xAI cooks up world's most powerful AI training cluster

Jul 22, 2024
07:51 pm

What's the story

Elon Musk's start-up, xAI, has commenced training its next artificial intelligence (AI) model at a new supercomputing facility in Memphis, Tennessee. The billionaire entrepreneur confirmed this development on X. "Nice work by xAI team, X team, NVIDIA & supporting companies getting Memphis Supercluster training started," Musk posted. The company is utilizing 100,000 liquid-cooled H100s GPUs from NVIDIA for this purpose, making it the most powerful AI training cluster in the world.

Tech collaboration

xAI utilizes NVIDIA's H100 GPUs for AI training

The training of the next version of xAI's chatbot, Grok, is being facilitated by 100,000 liquid-cooled H100s from NVIDIA on a single RDMA fabric. Musk praised the setup, stating "With 100k liquid-cooled H100s on a single RDMA fabric, it's the most powerful AI training cluster in the world!" These GPUs from NVIDIA are specifically designed for training AI models that require substantial energy and computing power.

Energy consumption

New supercomputing facility raises environmental concerns

The supercomputing facility, dubbed the "Gigafactory of Compute," is located in a former manufacturing building in Memphis. However, environmental groups have raised concerns about its energy and water consumption. The Memphis Community Against Pollution and two other groups warned that the facility creates a significant "energy burden." They stated, "xAI is also expected to need at least one million gallons of water per day for its cooling towers."

Environmental impact

Community fears over energy and water supply

The CEO of Memphis Light, Gas, and Water estimated that xAI's facility may use up to 150mW of electricity an hour, equivalent to the energy required to power 100,000 homes. Memphis City Council member Pearl Walker voiced the community's concerns last week, stating "People are afraid. They're afraid of what's possibly going to happen with the water and they are afraid about the energy supply." The environmental groups urged xAI to invest in a wastewater reuse system.

Competition

Comparison with tech giants

Despite its size, xAI's Memphis facility may not be the biggest computing facility globally. Other tech giants such as Microsoft, Google, and Meta also use data centers for training and operating their AI models. Meta CEO Mark Zuckerberg has pledged to buy 350,000 NVIDIA H100s this year. Musk had previously announced plans to release Grok 2 next month, but it remains unclear whether this supercomputing cluster might be utilized for this purpose.