Hundreds of contractors working on a project for Meta pretended to be kids in order to see how other chatbots like Gemini and ChatGPT would respond to high-risk subjects, WIRED found.
Researchers say ChatGPT generated violent and sexualized images after a harmless-looking prompt was altered, raising new questions about OpenAI’s safeguards and how quickly AI image tools can be steered around their filters.
Cybersecurity researchers are complaining that Anthropic’s new model Fable has guardrails that are too strict for any cybersecurity work.
Moonbounce has raised $12 million to grow its AI control engine that converts content moderation policies into consistent, predictable AI behavior.
Your chatbot may not feel anything, but new research shows emotion-like signals inside AI can shape responses, steer decisions, and even push systems toward risky behavior under pressure.
A Stanford study finds that AI chatbots can, in rare cases, enable violent or self-harm thoughts, exposing gaps in crisis response and raising concerns about how safe these tools are for emotional support.