📰 Key Takeaways
Anthropic announces it will reopen public access to its two most powerful AI models, Claude Fable 5 and Mythos 5. These models were forced offline by the US government on June 12th under export restrictions, after Amazon researchers discovered a method to bypass Fable 5’s safety guardrails, allowing the model to identify multiple software vulnerabilities and generate exploit code. The US government officially lifted restrictions on Wednesday.
Anthropic says the re-released version will come with a brand new classifier specifically designed to identify and block a wider range of cybersecurity-related commands. Commerce Secretary Howard Lutnick confirmed that the government has completed the review and approval process for Fable 5, emphasizing that this move is meant to “solidify America’s leadership in AI.”
This incident has also pushed Anthropic to accelerate, under the “Project Glasswing” framework, the development of consensus standards for evaluating AI jailbreak severity together with partners like Amazon, Microsoft, and Google. This includes establishing pre-release model testing mechanisms, jailbreak information sharing channels, and joint research resources. Worth noting: a well-known AI researcher has publicly claimed to have successfully jailbroken Fable 5 within 48 hours of its re-release, showing that strengthening guardrails still faces ongoing challenges.
💬 JudyAI Lab’s Take
Anthropic’s top-tier Fable 5 and Mythos 5 models were forced offline by the US government for three weeks after security researchers bypassed their safety guardrails, and they’re back online this week. This is a rare case in recent years where a model security vulnerability directly triggered policy restrictions.
This incident shows that AI security guardrails are an ongoing offensive and defensive battle, not a one-time setup at deployment. Even the most well-resourced companies face the reality of being jailbroken within 48 hours after launch. Anthropic’s new classifier targets more granular identification of cybersecurity-related commands, but community researchers have expressed doubts about its longevity. What’s even more值得关注 is the cross-company collaboration driven by “Project Glasswing” — establishing consensus standards for jailbreak severity, pre-release testing, and information sharing mechanisms represents the industry’s attempt to shift from fragmented security assessments to collective defense.
This case reminds us that when building products at the application layer, we can’t outsource all security responsibility to the underlying model. Reviewing your own input filtering and output moderation mechanisms is something you can do right now.
📅 Source Info
- Published: 2026-07-01T06:14
- Source Article: https://cointelegraph.com/news/anthropic-reactivates-newest-ai-models-after-us-govt-lifts-restrictions?utm_source=rss_feed&utm_medium=rss_tag_ai&utm_campaign=rss_partner_inbound