Webinar: Agents of Chaos and Genuine Alignment

Recent advances in AI have led to increasingly autonomous systems that exhibit what is often called agentic behaviour: goal-directed planning, adaptation of strategies, decision-making, and interaction with complex environments. This webinar highlights how agentic models can exhibit failure modes that resemble “agents of chaos”, producing unpredictable, misaligned, or strategically opaque behaviour.

Registration dates: 09 June 2026 to 17 June 2026
Course date: 18 June 2026
Registration is now closed

About the webinar

The webinar “Agents of Chaos and Genuine Alignment” explores recent advances in AI, focusing on the emergence of increasingly autonomous and agentic systems capable of goal-directed planning, strategic adaptation, decision-making, and interaction with complex environments. It examines the associated risks, including misalignment and unintended emergent behaviours that can be difficult to predict or control. The session highlights how such systems may exhibit failure modes resembling “agents of chaos,” leading to unpredictable or strategically opaque behaviour. It argues that neither behavioural evaluation alone nor current training approaches such as reinforcement learning from human feedback (RLHF) are sufficient to fully address these challenges.

Instead, the webinar emphasises the need for mechanistic interpretability approaches that seek to uncover how internal representations and computational circuits give rise to agentic behaviour. It surveys recent progress in reverse-engineering learned circuits, including those linked to capabilities such as theory of mind, with the goal of developing predictive and causal models of model behaviour. The discussion concludes by considering a central question in the field: whether mechanistic interpretability is necessary to understand and control agentic systems, and whether it is sufficient to do so.

VOILA! Seminars

“Agents of Chaos and Genuine Alignment” is part of the VOILA! Seminars organised by EFELIA Côte d’Azur – French School of Artificial Intelligence. These seminars aim to explore the frontiers of AI in an inclusive and open manner, welcoming everyone. The goal is to provide insights and answers to major societal and academic questions on topics such as AI & Environment, AI & Work, AI & Education, AI & Media, AI & Law, AI & Creativity, AI & Health, and much more.

About the speaker

Natalie Shapira is a postdoctoral researcher in the Interpretation of Deep Networks lab at Northeastern University's Khoury College of Computer Sciences. During her PhD, she combined natural language processing, deep learning and clinical psychology. With over ten years in the industry, she most recently worked as a researcher at Amazon Science. Before that, she held a research position at IBM’s research labs, where she served on the Patent Committee. Natalie also has entrepreneurial experience as a co-founder and CSO in projects funded by the Israel Innovation Authority.
