Researchers at Helmholtz Munich have unveiled Centaur, an AI language model trained on over ten million decisions from psychological experiments, which emulates human choice patterns and reaction times with striking fidelity, even in tasks it has never seen before.
A new AI model mimics human thinking with striking accuracy, even in unfamiliar scenarios.
Researchers at Helmholtz Munich have created an advanced artificial intelligence system capable of mimicking human decision-making with impressive precision. The model, named Centaur, was trained using data from more than ten million decisions collected through psychological studies, allowing it to generate responses that mirror human behavior in realistic ways. This breakthrough offers new possibilities for deepening our understanding of how people think and refining existing psychological frameworks.
For years, the field of psychology has sought to fully capture the intricacies of human thought. However, past models have typically been limited to either explaining how people think or predicting how they act, rarely managing to accomplish both.
Led by Dr. Marcel Binz and Dr. Eric Schulz from the Institute for Human-Centered AI at Helmholtz Munich, the research team has now introduced a model that bridges this gap. Centaur was trained on a comprehensive dataset known as Psych-101, which compiles over ten million decisions from 160 different behavioral experiments.
Centaur stands out for its ability to anticipate human responses not only in familiar contexts but also in brand-new situations. It recognizes recurring decision-making patterns, adjusts to new environments with ease, and can even estimate reaction times with a surprising level of detail.
“We’ve created a tool that allows us to predict human behavior in any situation described in natural language – like a virtual laboratory,” says Marcel Binz, who is also the study’s lead author. Potential applications range from analyzing classic psychological experiments to simulating individual decision-making processes in clinical contexts, such as depression or anxiety disorders. The model opens up new perspectives for health research in particular, for instance by helping researchers understand how people with different psychological conditions make decisions. The dataset is set to be expanded to include demographic and psychological characteristics.
Centaur bridges two previously separate domains: interpretable theories and predictive power. It can reveal where classical models fall short – and provide insights into how they might be improved. This opens up new possibilities for research and real-world applications, from medicine to environmental science and the social sciences.
“We’re just getting started and already seeing enormous potential,” says institute director Eric Schulz. Ensuring that such systems remain transparent and controllable is key, Binz adds – for example, by using open, locally hosted models that safeguard full data sovereignty.
Next, the researchers aim to take a closer look inside Centaur: Which computational patterns correspond to specific decision-making processes? Can they be used to infer how people process information – or how decision strategies differ between healthy individuals and those with mental health conditions?
The researchers are convinced: “These models have the potential to fundamentally deepen our understanding of human cognition – provided we use them responsibly.” That this research is taking place at Helmholtz Munich rather than in the development departments of major tech companies is no coincidence. “We combine AI research with psychological theory – and with a clear ethical commitment,” says Binz. “In a public research environment, we have the freedom to pursue fundamental cognitive questions that are often not the focus in industry.”
What is Psych-101?
Psych-101 is a dataset specifically compiled by the team led by Marcel Binz for training the Centaur AI model. It contains over ten million individual decisions made by more than 60,000 participants across 160 psychological experiments. These experiments cover a wide range of human behavior – from risk-taking and reward learning to moral dilemmas. The researchers manually processed and standardized all the data to ensure that it could be interpreted by a language model. As such, Psych-101 represents a unique resource for systematically modeling human behavior based on natural language inputs.
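To give a rough sense of what standardizing experimental data into natural language can look like, here is a minimal Python sketch. It is an illustration only, not the team's actual pipeline: the `Trial` class, the `transcribe_session` and `count_decisions` helpers, the wording of the trial sentences, and the `<<`/`>>` markers around each choice are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One decision in a hypothetical two-option choice task."""
    choice: str  # option the participant picked, e.g. "A" or "B"
    reward: int  # points received after the choice


def transcribe_session(instructions: str, trials: list[Trial]) -> str:
    """Turn one participant's session into a plain-text transcript.

    Each choice is wrapped in << >> (an assumed convention) so that the
    human decisions can be located in the text later.
    """
    lines = [instructions, ""]
    for trial in trials:
        lines.append(
            f"You press <<{trial.choice}>> and receive {trial.reward} points."
        )
    return "\n".join(lines)


def count_decisions(transcript: str) -> int:
    """Count the marked human decisions in a transcript."""
    return transcript.count("<<")


if __name__ == "__main__":
    session = [Trial("A", 3), Trial("B", 7), Trial("B", 6)]
    text = transcribe_session(
        "You repeatedly choose between two slot machines, A and B.",
        session,
    )
    print(text)
    print("Decisions in transcript:", count_decisions(text))
```

Written this way, every experiment ends up in the same format, ordinary text, whatever the original task looked like. That is what lets a single language model be trained across many different studies.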
Reference: “A foundation model to predict and capture human cognition” by Marcel Binz, Elif Akata, Matthias Bethge, Franziska Brändle, Fred Callaway, Julian Coda-Forno, Peter Dayan, Can Demircan, Maria K. Eckstein, Noémi Éltető, Thomas L. Griffiths, Susanne Haridi, Akshay K. Jagadish, Li Ji-An, Alexander Kipnis, Sreejan Kumar, Tobias Ludwig, Marvin Mathony, Marcelo Mattar, Alireza Modirshanechi, Surabhi S. Nath, Joshua C. Peterson, Milena Rmus, Evan M. Russek, Tankred Saanum, Johannes A. Schubert, Luca M. Schulze Buschoff, Nishad Singhi, Xin Sui, Mirko Thalmann, Fabian J. Theis, Vuong Truong, Vishaal Udandarao, Konstantinos Voudouris, Robert Wilson, Kristin Witte, Shuchen Wu, Dirk U. Wulff, Huadong Xiong and Eric Schulz, 2 July 2025, Nature.