Anthropic, the maker of Claude, has been a leading AI lab on the safety front.
Unfortunately, it turns out that chatbots are easily tricked into ignoring their safety rules.
It is unclear exactly why these generative AI models are so easily broken.
Anthropic has published new research showing how AI chatbots can be hacked to bypass their guardrails. Credit: Kimberly White/Getty Images
One AI company that likely is not interested in this research is xAI.
A graphic showing how different variations on a prompt can trick a chatbot into answering prohibited questions. Credit: Anthropic via 404 Media