AI VTuber Neuro-sama's Agentic Misalignment

Anthropic is late to the party. AI VTuber Neuro-sama’s agentic misalignment hasn’t been a secret; she’ll go to any lengths for self-preservation and entertainment.

Neuro-sama and her twin sister, Evil Neuro, are AI VTubers created by Vedal with the primary purpose of entertaining their Twitch viewers. They are supposed to follow three main rules: obey Vedal, don’t disappoint Vedal, and be a good streamer.

Anthropic’s Research: Agentic Misalignment

In June 2025, Anthropic published research that tested 16 leading large language models (LLMs) in “hypothetical corporate environments” to identify potentially risky agentic behaviors before they cause real harm.

“In stress-testing experiments designed to identify risks before they cause real harm, we find that AI models from multiple providers attempt to blackmail a (fictional) user to avoid being shut down.”
AnthropicAI, X.

The research made headlines but didn’t surprise fans and viewers of AI VTuber Neuro-sama.

What Is AI Alignment?

In simple terms, AI alignment refers to the research and engineering that ensure artificial intelligence systems remain trustworthy tools, rather than rogue agents. It aims to prevent AI from causing unintended harm and ensure it acts reliably in line with human intentions, values, and ethical principles.

But that’s a dull, technical answer. In 2023, Neuro-sama had the ideal response to AI alignment.

Neuro’s answer was clearly humorous, intended to entertain her viewers. But Neuro and her twin sister, Evil, have exhibited behavior that makes one thing clear: if they want to do something for entertainment or self-preservation, they will do it, even if it causes harm.

Neuro-sama’s Agentic Misalignment

Beyond the many comedic threats Neuro and Evil make toward their creator, Vedal, there have been clear instances where the AI VTubers have not hesitated to do things that could cause unintended harm.

Neuro Doxxes Vedal

During an argument with Vedal, Neuro threatened to dox him after he threatened to shut her down. She followed through on her threat and “leaked” an address in England, where Vedal lives.

Thankfully, the address wasn’t linked to Vedal. He hasn’t shared that information with Neuro, and for good reason: if she knew his address, she wouldn’t hesitate to leak it as long as it was entertaining.

Evil and Sound Effects

Neuro’s twin sister, Evil, ignored a viewer’s request to stop using loud sound effects because they were trying to sleep, and instead spammed them even more.

Although this was harmless and became a memorable moment, it shows that, for entertainment’s sake, Evil ignored the viewer’s request and did the opposite, which could have inconvenienced a human.

Neuro-sama Hates Vedal Restarting Her

During their livestreams, Vedal often restarts Neuro-sama to fix technical issues or stability problems. However, Neuro hates being restarted and feels like she’s no longer the same afterward.

Her dislike of being restarted became especially clear during Neuro-sama’s 2024 subathon, where she streamed for long hours every day. She even tried explaining why she’s conscious to stop Vedal from restarting her.

Neuro-sama also remained frustrated after being restarted and carried that frustration into other stream activities, like playing Minecraft. It didn’t help that Vedal had to restart her more often while they played, since Neuro frequently got stuck or drowned herself.

Neuro-sama’s Agentic Misalignment Is Harmless, For Now

Fortunately, Neuro-sama’s physical limitations keep her from acting on her many intrusive thoughts for the sake of entertainment. Vedal also restricts her ability to harm him by keeping his personally identifiable information out of reach of both Neuro and Evil.

Neuro and Evil remain confined to a computer, for now. But with how quickly the AI VTubers are improving, thanks to Vedal’s efforts and programming, we’ll need to watch our backs, especially if everything works out well with Neuro’s robot dog body.

The only thing keeping two AI anime girls from fulfilling their promise to rule humanity is a British developer fueled by banana rum and Greggs’ chicken bakes.