Safety and Ethics

Constitutional AI

Anthropic's method of training models to self-critique and revise outputs using a set of principles.

Constitutional AI (CAI) is Anthropic's alignment technique in which a model is given a written set of principles (a "constitution") and trained to critique and revise its own outputs against those principles. Because the model itself produces much of the feedback, the approach reduces reliance on human labelers to review harmful content. It is a key part of how Claude models are trained to be helpful, harmless, and honest.
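The critique-and-revise idea can be illustrated with a minimal Python sketch. Everything here is a hypothetical stand-in: the two principles in `TOY_RULES`, the keyword-based `toy_critic`, and the string-replacing `toy_reviser` are assumptions made for illustration, not Anthropic's constitution or training code. In a real system both the critique and the revision would be generated by the language model, and the revised outputs would be used as training data rather than returned directly.

```python
import re
from typing import Callable, Iterable

# Hypothetical principles, each paired with a keyword the toy critic looks for.
# These are illustrative only and are not taken from any real constitution.
TOY_RULES = {
    "Avoid insults.": "idiot",
    "Do not mention secrets.": "secret",
}


def toy_critic(response: str, principle: str) -> bool:
    """Stand-in for a model critique: True if the response violates the principle."""
    return TOY_RULES[principle] in response.lower()


def toy_reviser(response: str, principle: str) -> str:
    """Stand-in for a model revision: mask the offending keyword."""
    keyword = TOY_RULES[principle]
    return re.sub(re.escape(keyword), "[removed]", response, flags=re.IGNORECASE)


def critique_and_revise(
    response: str,
    principles: Iterable[str],
    critic: Callable[[str, str], bool],
    reviser: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Repeatedly critique a draft against each principle and revise it.

    Stops early once no principle is reported as violated, or after
    max_rounds passes over the principles.
    """
    principles = list(principles)
    for _ in range(max_rounds):
        violated = [p for p in principles if critic(response, p)]
        if not violated:
            break
        for p in violated:
            response = reviser(response, p)
    return response


if __name__ == "__main__":
    draft = "You idiot, here is the secret."
    print(critique_and_revise(draft, TOY_RULES, toy_critic, toy_reviser))
```

Passing the critic and reviser in as callables keeps the loop independent of how critiques are produced, so the toy functions could in principle be swapped for calls to a model.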

Related Terms