Google launches Gemma 4 AI model designed for laptops
It supports Multi-Token Prediction (MTP) drafting

Jun 04, 2026
09:34 am

What's the story

Google has unveiled a new addition to its Gemma 4 family of AI models, called the Gemma 4 12B. The latest offering is designed to run on consumer laptops with as little as 16GB of system RAM. This is a significant improvement over previous models that required much more memory. The new model also comes with Multi-Token Prediction (MTP) drafters for improved speed and efficiency.
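The announcement does not say how the MTP drafters are used at inference time. One common way a drafter speeds up generation is draft-and-verify (speculative) decoding: a cheap drafter guesses several tokens ahead, and the main model keeps only the guesses it agrees with. The toy sketch below illustrates that general idea, assuming greedy decoding. `target_next` and `draft_next` are hypothetical stand-ins for the models, not Gemma APIs, and whether Gemma 4 works exactly this way is an assumption.

```python
# Toy draft-and-verify decoding. All functions are hypothetical stand-ins;
# nothing here is taken from Gemma 4 itself.

def target_next(prefix):
    """Stand-in for the large model: deterministic next token for a prefix."""
    return (sum(prefix) + len(prefix)) % 7


def draft_next(prefix):
    """Stand-in for the cheap drafter: usually agrees with the target,
    but guesses wrong whenever the prefix length is a multiple of 4."""
    guess = target_next(prefix)
    return (guess + 1) % 7 if len(prefix) % 4 == 0 else guess


def draft_tokens(prefix, k):
    """Let the drafter propose k tokens in a row."""
    proposed = []
    for _ in range(k):
        proposed.append(draft_next(prefix + proposed))
    return proposed


def speculative_step(prefix, k):
    """Verify the drafted tokens against the target model.

    Accept guesses until the first mismatch, then substitute the target's
    own token. A real system would check all k guesses in one batched
    forward pass; here the check is written as a loop for clarity.
    """
    accepted = []
    for token in draft_tokens(prefix, k):
        expected = target_next(prefix + accepted)
        if token != expected:
            accepted.append(expected)  # correct the first wrong guess and stop
            return accepted
        accepted.append(token)
    # Every guess was accepted, so the target contributes one extra token.
    accepted.append(target_next(prefix + accepted))
    return accepted


def generate(prefix, n_new, k=3):
    """Generate n_new tokens using repeated draft-and-verify steps."""
    out = list(prefix)
    while len(out) - len(prefix) < n_new:
        out.extend(speculative_step(out, k))
    return out[:len(prefix) + n_new]


def generate_plain(prefix, n_new):
    """Baseline: one target-model call per generated token."""
    out = list(prefix)
    for _ in range(n_new):
        out.append(target_next(out))
    return out
```

Because every accepted token is checked against the main model, `generate` returns the same sequence as `generate_plain`; the potential saving comes from needing fewer verification rounds than tokens produced.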

Performance

The new model can perform complex multistep reasoning

Despite its smaller size, the Gemma 4 12B model can handle complex multistep reasoning and agentic workflows, tasks that were previously reserved for larger models in the Gemma family. It also includes a streamlined embedding module for vision that passes image data to the LLM with its spatial layout preserved, eliminating the need for a bulky intermediate encoder.

Multimodality

Gemma 4 is natively multimodal

The Gemma 4 family is natively multimodal, meaning it can take text, audio, or images as inputs. Most generative AI models use dedicated encoders to process non-text inputs before passing that data to the LLM. With the new mid-weight model, Google instead relies on a streamlined embedding module for vision and projects raw audio signals directly into the same kind of vectors used for text tokens.
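To make the idea of projecting audio into text-token vectors concrete, here is a minimal sketch, assuming a plain linear projection over fixed-size audio frames. The dimensions, the random weights, and every function name are hypothetical; Google's actual design is not described in this article.

```python
import random

EMBED_DIM = 8    # hypothetical embedding width shared by text and audio
FRAME_SIZE = 4   # hypothetical number of raw samples per audio frame


def make_matrix(rows, cols, seed):
    """Build a small random matrix as a stand-in for learned weights."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def frame_audio(samples, frame_size=FRAME_SIZE):
    """Split raw samples into fixed-size frames, dropping any short tail."""
    usable = len(samples) - len(samples) % frame_size
    return [samples[i:i + frame_size] for i in range(0, usable, frame_size)]


def project(frame, weights):
    """Linear projection: one frame of FRAME_SIZE samples -> EMBED_DIM values."""
    return [sum(x * w for x, w in zip(frame, row)) for row in weights]


def embed_text(token_ids, table):
    """Look up one EMBED_DIM vector per text token id."""
    return [table[token_id] for token_id in token_ids]


def build_sequence(token_ids, samples, table, weights):
    """Concatenate text and audio vectors into one input sequence."""
    text_vectors = embed_text(token_ids, table)
    audio_vectors = [project(frame, weights) for frame in frame_audio(samples)]
    return text_vectors + audio_vectors


# Usage: 3 text tokens plus 10 raw samples (2 full frames) give 5 vectors.
table = make_matrix(10, EMBED_DIM, seed=0)            # toy vocabulary of 10 tokens
weights = make_matrix(EMBED_DIM, FRAME_SIZE, seed=1)  # toy audio projection
sequence = build_sequence([1, 2, 3], [0.1] * 10, table, weights)
print(len(sequence), len(sequence[0]))
```

The point of the sketch is that text and audio end up as vectors of the same width in a single sequence, so the language model can consume them without a separate audio encoder in between.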

Access

You can try out the new AI model online

The Gemma 4 12B model is available for download on Kaggle and Hugging Face, provided your machine has enough RAM to run it. The model weights are just under 18GB in size. If you want to try the model without downloading it, you can do so through platforms like LM Studio or Google AI Edge Gallery.
