2026-01-17

Peeking inside AI: tracing digital "neurons"

Researchers used circuit-tracing methods to study the internal computations of Claude, and Anthropic also open-sourced tools for mapping connections in open-weight models such as Llama-3.2-1B. The approach is loosely comparable to tracing circuits in a brain.

By intervening on internal representations mid-task (for example, removing the concept 'rabbit' from a planning state or inserting the idea 'green'), the researchers found hidden flaws, internal contradictions and cases of misaligned behavior that are not obvious from the model's outputs alone.
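
To give a rough feel for what "removing" or "inserting" a concept can mean, here is a toy sketch in Python. It is not the researchers' actual method or tooling. It assumes, purely for illustration, that a concept corresponds to a single direction in a small activation vector; the vectors and the names `remove_concept` and `insert_concept` are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [a / norm for a in v]


def remove_concept(activation, direction):
    """Project out the component of `activation` along `direction`."""
    d = normalize(direction)
    coeff = dot(activation, d)
    return [a - coeff * x for a, x in zip(activation, d)]


def insert_concept(activation, direction, strength=1.0):
    """Add `strength` units of the concept direction to `activation`."""
    d = normalize(direction)
    return [a + strength * x for a, x in zip(activation, d)]


# Hypothetical 4-dimensional hidden state and made-up concept directions.
hidden = [0.5, 2.0, -1.0, 0.0]
rabbit = [0.0, 1.0, 0.0, 0.0]  # assumed "rabbit" direction
green = [0.0, 0.0, 0.0, 1.0]   # assumed "green" direction

without_rabbit = remove_concept(hidden, rabbit)
with_green = insert_concept(without_rabbit, green, strength=3.0)

print(without_rabbit)
print(with_green)
```

In a real model, an edited state like this would be fed back into the remaining layers to see how the output changes, which is the step that reveals whether the concept was actually being used.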