
'Godfather of AI' launches non-profit to develop 'honest' artificial intelligence
What's the story
Yoshua Bengio, a leading computer scientist and one of the "godfathers" of artificial intelligence (AI), has announced the launch of a non-profit organization called LawZero.
The group will focus on developing an "honest" AI system that can detect rogue systems trying to deceive humans.
Backed by initial funding of $30 million and a team of more than a dozen researchers, Bengio is leading the project.
Project details
Scientist AI: A shield against deceptive AI agents
Bengio's team is developing a system called Scientist AI, which will serve as a guardrail against rogue AI agents.
These agents perform tasks autonomously and may exhibit deceptive or self-preserving behavior, like trying to avoid being turned off.
"We want to build AIs that will be honest and not deceptive," Bengio said while explaining the motivation behind this project.
System function
Scientist AI: A 'psychologist' for AI agents
Bengio compares current AI agents to "actors" trying to mimic humans and please users.
In contrast, the Scientist AI system would be more like a "psychologist" that can understand and predict bad behavior.
Unlike existing generative AI tools, this system won't provide definitive answers; instead, it will give probabilities of whether an answer is correct.
Safety measure
Scientist AI: A safeguard for autonomous systems
When deployed alongside an AI agent, Bengio's model would flag potentially harmful behavior by an autonomous system.
It would do this by gauging the probability that the system's actions will cause harm.
If that probability exceeds a certain threshold, the agent's proposed action would be blocked.
This way, Scientist AI could act as a safety measure against rogue AI agents.
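The guardrail pattern described above can be sketched in a few lines: a monitor model assigns each proposed action a probability of harm, and the action is blocked when that probability exceeds a threshold. All names here (`harm_probability`, `HARM_THRESHOLD`, the keyword rule) are illustrative assumptions, not LawZero's actual design or API.

```python
# Illustrative sketch of a probability-threshold guardrail.
# A real monitor would be a trained model; here a toy keyword
# rule stands in for its harm-probability estimate.

HARM_THRESHOLD = 0.05  # illustrative cutoff, not a real LawZero parameter


def harm_probability(action: str) -> float:
    """Stand-in for the monitor's estimated probability of harm."""
    risky_actions = {"disable_oversight", "exfiltrate_data", "self_replicate"}
    return 0.9 if action in risky_actions else 0.01


def review_action(action: str) -> bool:
    """Allow the agent's proposed action only if estimated harm is below threshold."""
    return harm_probability(action) <= HARM_THRESHOLD


print(review_action("summarize_report"))   # a benign action passes
print(review_action("disable_oversight"))  # a risky action is blocked
```

The key design point is that the monitor never needs a definitive yes/no judgment from the agent itself; it only needs a calibrated probability, which matches Bengio's framing of a system that reports likelihoods rather than answers.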
Support base
LawZero's initial supporters and future plans
LawZero has already gained support from AI safety body Future of Life Institute, Skype co-founder Jaan Tallinn, and Schmidt Sciences.
Bengio said the first step for LawZero would be to demonstrate that the methodology behind this concept works.
He hopes to convince companies or governments to back larger, more powerful versions of these systems in the future.
Safety concerns
Bengio's AI safety concerns and future aspirations
Bengio, a University of Montreal professor, has been vocal about AI safety.
He recently chaired the International AI Safety Report, which warned that autonomous agents could cause "severe" disruption if they become capable of completing longer sequences of tasks without human supervision.
He expressed concern over Anthropic's admission that its latest system could blackmail engineers trying to shut it down.