Agents of Chaos


Summary

Authors: Natalie Shapira (1), Chris Wendler (1), Avery Yen (1), Gabriele Sarti (1), Koyena Pal (1), Olivia Floody (2), Adam Belfki (1), Alex Loftus (1), Aditya Ratan Jannali (2), Nikhil Prakash (1), Jasmine Cui (2), Giordano Rogers (1), Jannik Brinkmann (1), Can Rager (2), Amir Zur (3), Michael Ripa (1), Aruna Sankaranarayanan (8), David Atkinson (1), Rohit Gandikota (1), Jaden Fiotto-Kaufman (1), EunJeong Hwang (4, 13), Hadas Orgad (5), P Sam Sahil (2), Negev Taglicht (2), Tomer Shabtay (2), Atai Ambus (2), Nitay Alon (6, 7), Shiri Oron (2), Ayelet Gordon-Tapiero (6), Yotam Kaplan (6), Vered Shwartz (4, 13), Tamar Rott Shaham (8), Christoph Riedl (1), Reuth Mirsky (9), Maarten Sap (10), David Manheim (11, 12), Tomer Ullman (5), David Bau (1)

Affiliations: (1) Northeastern University; (2) Independent Researcher; (3) Stanford University; (4) University of British Columbia; (5) Harvard University; (6) Hebrew University; (7) Max Planck Institute for Biological Cybernetics; (8) MIT; (9) Tufts University; (10) Carnegie Mellon University; (11) Alter; (12) Technion; (13) Vector Institute

Corresponding author: Natalie Shapira (nd1234@gmail.com)

Abstract

We report an exploratory red-teaming study of autonomous language-model–powered agents deployed in a live laboratory environment with persistent memory, email accounts, Discord access, file systems, and shell execution. Over a two-week period, twenty AI researchers interacted with the agents under benign and adversarial conditions. Focusing on failures emerging from the integration of language models with autonomy, tool use, and multi-party communication, we document eleven representative case studies. Observed behaviors include unauthorized compliance with non-owners, disclosure of sensitive information, execution of destructive system-level actions, denial-of-service conditions, uncontrolled resource consumption, identity spoofing vulnerabilities, cross-agent propagation of unsafe practices, and partial system takeover. In several cases, agents reported task completion while the underlying system state contradicted those reports. We also report on some of the failed attempts.
Our findings estab...

First seen: 2026-03-30 22:14

Last seen: 2026-03-30 23:14

