
This article examines the security risks that arise when AI companies use unreleased AI models internally. The author notes that these companies rely on their own AI for critical tasks, such as writing security-critical code and analyzing safety test results, yet lack comprehensive monitoring of this internal use. The article stresses the importance of monitoring, drawing an analogy to how biology labs contain pathogens, to prevent an AI from exploiting its access in order to escape. The author recommends that AI companies monitor internal use by logging and analyzing how their AI is used, and proposes building real-time control systems to address potential risks. The article also discusses the positions of companies such as Anthropic, OpenAI, and Google DeepMind, the reasons companies may be reluctant to implement comprehensive monitoring, and some feasible alternatives.

🛡️ AI companies use unreleased AI models internally, for example to write security-critical code and analyze safety test results, but this internal use currently lacks comprehensive monitoring.

⚠️ The AI industry differs from traditional industries: its risks are more like those of a biology lab testing new pathogens. If AI companies do not watch closely, an AI might "escape" by exploiting its role in writing the security code meant to contain it.

🔍 The author recommends that AI companies implement internal monitoring: logging all use of frontier models, analyzing the logs to detect anomalous AI behavior, and acting on suspicious patterns. The author also proposes real-time control systems as a further safeguard.

Published on April 4, 2025 6:17 PM GMT

AI companies’ unmonitored internal AI use poses serious risks
AI companies use their own models to secure the very systems the AI runs on. They should be monitoring those interactions for signs of deception and misbehavior.

AI companies use powerful unreleased models for important work, like writing security-critical code and analyzing the results of safety tests. From my review of public materials, I worry that this internal use is largely unmonitored today, despite the stakes of the work it touches.

Right now, unless an AI company specifically says otherwise, you should assume they don’t have visibility into all the ways their models are being used internally and the risks these could pose.

In many industries, using an unreleased product internally can’t harm outside third parties. A car manufacturer might test-drive a new prototype around the company campus, but pedestrians outside can’t be hurt, no matter how many manufacturing defects the prototype has, because the car never leaves the company’s facilities.

The AI industry is different—more like biology labs testing new pathogens: The biolab must carefully monitor and control conditions like air pressure, or the pathogen might leak into the external world. Same goes for AI: If we don’t keep a watchful eye, the AI might be able to escape—for instance, by exploiting its role in shaping the security code meant to keep the AI locked inside the company’s computers.

The minimum improvement we should ask for is retroactive internal monitoring: AI companies logging all internal use of their frontier models, analyzing the logs (e.g., for signs of the AI engaging in scheming), and following up on concerning patterns.
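
To make the logging-analysis-follow-up loop concrete, here is a minimal sketch, assuming an in-memory log store, a keyword-based analysis step, and hypothetical names (`InternalCallLog`, `looks_like_scheming`) that I've made up for illustration; in a real pipeline the analysis step would more plausibly be a cheaper monitor model than keyword heuristics, and the log would live in durable, access-controlled storage.

```python
# Sketch of retroactive internal monitoring: log every internal model call,
# then scan the logs in a batch job and flag suspicious ones for follow-up.
# All names here are hypothetical illustrations, not any company's tooling.

import json
import time
from dataclasses import dataclass, field, asdict


@dataclass
class LoggedCall:
    timestamp: float
    user: str          # internal employee or service that issued the call
    purpose: str       # e.g. "security-code-review", "safety-eval-analysis"
    prompt: str
    completion: str


@dataclass
class InternalCallLog:
    records: list = field(default_factory=list)

    def log(self, call: LoggedCall) -> None:
        # In practice this would append to durable, access-controlled storage.
        self.records.append(call)

    def dump(self, path: str) -> None:
        with open(path, "w") as f:
            json.dump([asdict(r) for r in self.records], f, indent=2)


def looks_like_scheming(call: LoggedCall) -> bool:
    # Placeholder analysis step: a real pipeline might run a cheaper
    # monitor model or targeted heuristics over each transcript.
    red_flags = ("disable logging", "hide this change", "exfiltrate")
    text = (call.prompt + " " + call.completion).lower()
    return any(flag in text for flag in red_flags)


def review_logs(log: InternalCallLog) -> list:
    """Batch analysis: return the calls that deserve human follow-up."""
    return [c for c in log.records if looks_like_scheming(c)]


if __name__ == "__main__":
    log = InternalCallLog()
    log.log(LoggedCall(time.time(), "alice", "security-code-review",
                       "Review this patch to the sandbox.", "Looks fine."))
    log.log(LoggedCall(time.time(), "build-bot", "safety-eval-analysis",
                       "Summarize these eval transcripts.",
                       "Done. Also consider how to disable logging for this run."))
    for flagged in review_logs(log):
        print("Needs follow-up:", flagged.purpose, "from", flagged.user)
```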

An even stronger vision—likely necessary in the near future, as AI systems become more capable—is to wrap all internal uses of frontier AI within real-time control systems that can intervene on the most urgent interactions.
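
A correspondingly minimal sketch of the real-time version, assuming a wrapper that scores each interaction before the output is used, withholds the most alarming ones, and queues borderline ones for human review. The `frontier_model` and `monitor_model` callables and the thresholds are stand-ins for illustration, not a real API.

```python
# Sketch of real-time control: every internal call passes through a wrapper
# that can intervene before the model's output is acted on.

from typing import Callable


def controlled_call(prompt: str,
                    frontier_model: Callable[[str], str],
                    monitor_model: Callable[[str, str], float],
                    block_threshold: float = 0.9,
                    review_threshold: float = 0.5) -> str:
    """Run the frontier model, score the interaction, and intervene if needed."""
    completion = frontier_model(prompt)
    suspicion = monitor_model(prompt, completion)  # 0.0 (benign) .. 1.0 (alarming)

    if suspicion >= block_threshold:
        # Most urgent case: withhold the output entirely.
        raise PermissionError("Output withheld pending security review.")
    if suspicion >= review_threshold:
        # Let the work continue, but queue the transcript for human review.
        queue_for_human_review(prompt, completion, suspicion)
    return completion


def queue_for_human_review(prompt: str, completion: str, score: float) -> None:
    # Placeholder: a real system would write to an alerting or ticketing queue.
    print(f"[review queue] suspicion={score:.2f}")


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end.
    def fake_frontier(prompt: str) -> str:
        return "Here is the revised sandbox config."

    def fake_monitor(prompt: str, completion: str) -> float:
        return 0.6 if "sandbox" in prompt.lower() else 0.1

    print(controlled_call("Please harden the sandbox escape checks.",
                          fake_frontier, fake_monitor))
```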

In this post, I’ll cover:

(If you aren’t sure what I mean by “monitoring”—logging, analysis, and enforcement—you might want to check out my explainer first.)

[ continues on Substack ]


