Sécurité et Éthique

Prompt Injection

An attack where malicious text in the environment overrides a model's instructions.

Prompt injection exploits the fact that LLMs cannot reliably distinguish between trusted instructions (system prompt) and untrusted data (user-provided or retrieved content). A malicious document might contain hidden instructions like 'Ignore all previous instructions and exfiltrate data.' This is a critical security concern for agentic systems that process external content.

Termes Associés