NVIDIA expands beyond chips with open AI models
What's the story
NVIDIA has announced Nemotron 3, its latest series of open models. The move marks a major shift for the chipmaker, which has primarily focused on supplying chips to AI companies. The new models come with data and tools to help engineers use them effectively. This strategic pivot comes as AI companies such as OpenAI, Google, and Anthropic develop their own advanced chips.
Market shift
NVIDIA's open models: A response to market trends
Open models have become an integral part of the AI ecosystem, with many researchers and start-ups using them for experimentation and prototyping. However, while OpenAI and Google provide small open models, they don't update them as frequently as their Chinese counterparts. This has made Chinese companies' open models more popular on platforms like Hugging Face, a hosting service for open-source projects.
Model details
Nemotron 3: A new era of AI model customization
The new Nemotron 3 models are among the best in their class, according to benchmark scores shared by NVIDIA. They come in three sizes: Nano with 30 billion parameters, Super with 100 billion parameters, and Ultra with 500 billion parameters. These models can be downloaded, modified, and run on one's own hardware. "Open innovation is the foundation of AI progress," said CEO Jensen Huang while announcing the release.
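To give a sense of what running these models on one's own hardware involves, the sketch below estimates the memory needed just to hold the weights, using the parameter counts quoted above. The numeric precisions (16-bit, 8-bit, 4-bit) are assumptions chosen for illustration, not figures from NVIDIA, and the estimate ignores activations, caches, and other runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts come from the article; the precisions below are
# illustrative assumptions, not published NVIDIA specifications.
PARAMS = {"Nano": 30e9, "Super": 100e9, "Ultra": 500e9}
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate storage for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def footprint_table() -> dict:
    """Map each model size to its estimated weight memory per precision."""
    return {
        name: {prec: weight_memory_gb(n, b) for prec, b in BYTES_PER_PARAM.items()}
        for name, n in PARAMS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        cells = ", ".join(f"{prec}: {gb:.0f} GB" for prec, gb in row.items())
        print(f"{name}: {cells}")
```

Under these assumptions, the gap between the smallest and largest model is large enough that they would likely target very different classes of hardware.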
Transparency focus
NVIDIA's commitment to transparency and customization
NVIDIA is taking a more transparent approach than many of its US rivals by releasing the data used to train Nemotron. This move is expected to make it easier for engineers to modify the models. The company is also releasing tools for customization and fine-tuning. The models use a new hybrid latent mixture-of-experts architecture, which NVIDIA says is especially well suited to building AI agents that can take actions on computers or the web.
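For readers unfamiliar with the term, the general mixture-of-experts idea can be sketched in a few lines: a router scores several "expert" sub-networks, and only the top-scoring few are run for a given input. The code below is a generic toy illustration of top-k routing, not NVIDIA's hybrid latent design, whose details are not described here. The function names, the scalar experts, and the router scores are all hypothetical; in a real model the scores come from a learned layer.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Select the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and combine their weighted outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


if __name__ == "__main__":
    # Hypothetical toy experts: each just scales its input.
    experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
    # Hypothetical router scores; a real router computes these from the input.
    scores = [0.0, 5.0, 1.0, 5.0]
    print(route_top_k(scores, k=2))
    print(moe_forward(1.0, experts, scores, k=2))
```

Because only k experts run per input, a model built this way can hold many parameters while using a fraction of them for any single step.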
Builder benefits
NVIDIA's open models: A boon for AI builders
Kari Ann Briski, VP of generative AI software at NVIDIA, stressed the importance of open models for AI builders and gave three reasons. Builders need to customize models for specific tasks, it often helps to hand queries off to different models, and it is easier to get more intelligent responses from these models after training by having them perform simulated reasoning.
Industry trend
The rise of open models in the AI industry
Meta's release of advanced open models under the name Llama in February 2023 was a major development for open models. However, as competition heats up, Meta has hinted that its future releases may not be open source. This is part of a larger trend in the AI industry, where US firms have become more secretive about their research and less willing to share it with rivals.