Safety and Ethics

AI Safety

The discipline of building AI systems that are reliably beneficial and avoid harmful outputs.

AI safety encompasses technical and policy efforts to ensure AI systems behave predictably, honestly, and without causing harm. It covers both near-term concerns (jailbreaks, biased outputs, prompt injection) and long-term concerns (misaligned superintelligence). Safety research is core to Anthropic's mission and shapes Claude's design through Constitutional AI and reinforcement learning from human feedback (RLHF).

Related Terms