Google launches Gemma 4 AI model designed for laptops
What's the story
Google has unveiled Gemma 4 12B, a new addition to its Gemma 4 family of AI models. It is designed to run on consumer laptops with as little as 16GB of system RAM, whereas previous models required much more memory. The new model also comes with Multi-Token Prediction (MTP) drafters, which are meant to improve speed and efficiency.
Performance
The new model can perform complex multistep reasoning
Despite being smaller than other models in the Gemma family, Gemma 4 12B can handle complex multistep reasoning and agentic workflows, tasks that were previously reserved for those larger models. It also comes with a streamlined embedding module for vision, which passes image data to the LLM with proper spatial awareness and eliminates the need for a bulky middleman encoder.
Multimodality
Gemma 4 is natively multimodal
The Gemma 4 family is natively multimodal, meaning it can take text, audio, or images as inputs. Most generative AI models use dedicated encoders to process non-text inputs and then pass that data to the LLM. With the new mid-weight model, however, Google has implemented a streamlined embedding module for vision and found a way to project raw audio signals into the vector representation used for text tokens.
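Google's actual embedding module is not described here, so the following is only a minimal sketch of the general idea, assuming a generic patch-projection scheme: an image is cut into tiles, each tile is mapped by a single linear projection to a vector of the same width as a text-token embedding, and a per-position vector is added so the model knows where each tile came from. All names and sizes (`EMBED_DIM`, `PATCH`, `split_into_patches`, `project`, `add_positions`) are hypothetical and the weights are random.

```python
import random

EMBED_DIM = 8  # hypothetical embedding width shared by text and image tokens
PATCH = 4      # hypothetical patch edge length in pixels


def split_into_patches(image, patch):
    """Cut a 2D image (list of rows) into flattened patch x patch tiles, row-major."""
    height, width = len(image), len(image[0])
    tiles = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            tiles.append([image[top + r][left + c]
                          for r in range(patch) for c in range(patch)])
    return tiles


def project(vectors, weights):
    """Multiply each input vector by an (in_dim x out_dim) weight matrix."""
    out_dim = len(weights[0])
    return [[sum(v[i] * weights[i][j] for i in range(len(v)))
             for j in range(out_dim)] for v in vectors]


def add_positions(embeddings, position_table):
    """Add one position vector per patch so spatial order is preserved."""
    return [[e + p for e, p in zip(emb, position_table[idx])]
            for idx, emb in enumerate(embeddings)]


rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]  # toy 8x8 grayscale image
weights = [[rng.gauss(0, 0.02) for _ in range(EMBED_DIM)]
           for _ in range(PATCH * PATCH)]                       # random, untrained
positions = [[rng.gauss(0, 0.02) for _ in range(EMBED_DIM)] for _ in range(4)]

patches = split_into_patches(image, PATCH)
image_tokens = add_positions(project(patches, weights), positions)

# Stand-in text-token embeddings of the same width.
text_tokens = [[rng.gauss(0, 0.02) for _ in range(EMBED_DIM)] for _ in range(3)]

# Image and text vectors can now sit in one sequence for the language model.
sequence = image_tokens + text_tokens
print(len(sequence), len(sequence[0]))  # 7 8
```

The point of the sketch is that no separate encoder network sits between the image and the model: one projection puts the patches directly into the token vector format. A similar projection could in principle be applied to short frames of an audio signal, though the article gives no details on how Google does this.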
Access
You can try out the new AI model online
The Gemma 4 12B model weights, which are just under 18GB in size, are available for download on Kaggle and Hugging Face, provided you have the required RAM. If you would rather not manage the files yourself, you can try out the new AI model through apps like LM Studio or Google AI Edge Gallery.
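For readers who want to relate parameter count to file size, here is a rough back-of-the-envelope sketch. It assumes that "12B" means exactly 12 billion parameters and that 1GB is 10^9 bytes; neither is confirmed by Google, and the function names are hypothetical. The article does not say which numeric precision the released weights use, so the results are illustrative only.

```python
def bytes_per_parameter(weights_gb, params_billion):
    """Average storage per parameter, assuming 1 GB = 1e9 bytes."""
    return (weights_gb * 1e9) / (params_billion * 1e9)


def weights_size_gb(params_billion, bits_per_param):
    """Approximate weight file size in GB at a given precision."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9


# Upper bound implied by the reported "just under 18GB" for an assumed 12e9 parameters.
print(bytes_per_parameter(18, 12))  # 1.5 bytes, i.e. about 12 bits per parameter

# Size of the same assumed parameter count at common precisions.
for bits in (16, 8, 4):
    print(bits, weights_size_gb(12, bits))
```

Under these assumptions, 16-bit weights would take about 24GB, 8-bit about 12GB, and 4-bit about 6GB.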