Meet Reflection 70B, AI model that can fix its mistakes

By Akash Pandey

Sep 06, 2024

06:57 pm

What's the story

HyperWrite, an AI writing start-up led by Matt Shumer, has introduced a groundbreaking large language model (LLM) named Reflection 70B. This innovative model is based on Meta's open-source Llama 3.1-70B Instruct and incorporates a unique error self-correction technique, which means it can fix its own mistakes. Shumer announced on X that Reflection-70B is now "the world's top open-source AI model."

Benchmark performance

Surpassing competitors in benchmark tests

Reflection 70B has undergone extensive testing across various benchmarks, including MMLU and HumanEval. The LLM Decontaminator from LMSys was used to ensure contamination-free results. These tests revealed that Reflection consistently outperforms models from Meta's Llama series and competes closely with top commercial models. However, due to high demand following the announcement, the demo site is currently experiencing heavy traffic.

Unique features

Unique error identification and correction capabilities

Shumer highlighted that Reflection 70B not only competes with top-tier models but also offers unique capabilities, specifically error identification and correction. The model's name, "Reflection," signifies its ability to reflect on its generated text and assess its accuracy before delivering outputs to the user. This is achieved through a technique called reflection tuning, which allows the model to detect errors in its own reasoning and correct them before finalizing a response.

Twitter Post

Take a look at benchmark scores

I'm excited to announce Reflection 70B, the world’s top open-source model.

Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes.

405B coming next week - we expect it to be the best model in the world.

Built w/ @GlaiveAI.

Read on ⬇️: pic.twitter.com/kZPW1plJuo
— Matt Shumer (@mattshumer_) September 5, 2024

User interaction

Reflection 70B introduces new tokens for user interaction

Reflection 70B introduces several new special tokens for reasoning and error correction, enhancing user interaction with the model. During inference, the model displays its reasoning within special tags, allowing for real-time corrections if it detects a mistake. This feature makes the model particularly useful for tasks that require high accuracy as it separates reasoning into distinct steps to improve precision.

Download availability

Reflection 70B is now available for download

The model can be downloaded via the AI code repository Hugging Face, and API access is expected to be available through GPU service provider Hyperbolic Labs. Shumer also revealed plans for an even larger model, Reflection 405B, set to launch next week. He further stated that HyperWrite is working on integrating the Reflection 70B model into its primary AI writing assistant product.

Training insights

Reflection 70B's training process and benchmarks

Shumer confirmed that the underlying model for Reflection 70B is built on Meta's Llama 3.1-70B Instruct and uses the stock Llama chat format, ensuring compatibility with existing tools and pipelines. He also credited Glaive, a start-up specializing in creating use-case-specific datasets, for enabling rapid AI model training. The synthetic data generated by Glaive significantly accelerated the development process of Reflection 70B.