What is Jailbreak?

Techniques to bypass AI safety restrictions, making models output content they should refuse. Unlike Prompt Injection, jailbreaking typically leverages the model’s reasoning ability to ‘persuade’ it to break rules. AI vendors continuously update defenses, but it’s an ongoing arms race.