AI Security: Risks, Defenses and Tools
AI systems are both targets and tools of attacks. This guide covers the OWASP LLM Top 10, threat types and — layer by layer — what to do and which tools to use.
Türkçe sürüm: Yapay Zeka Güvenliği
Governance
Data
Access
Model
Supply chain
Monitoring
OWASP LLM Top 10
Prompt Injection
Hidden instructions in input hijack the model's behavior (direct or indirect via web pages/documents). Separate system prompt from input; treat input as data, not commands.
Insecure Output Handling
Passing model output to other systems (browser, DB, shell) unvalidated can cause XSS, SQL injection or RCE. Always validate and escape output.
Training Data Poisoning
Malicious content in training/fine-tuning data installs backdoors or bias. Verify sources, track provenance, scan for anomalies.
Model Denial of Service
Resource-heavy requests slow, crash or inflate cost. Apply input limits, rate limiting and token budgets.
Supply Chain Vulnerabilities
Flaws in third-party models, libraries and datasets. Keep an SBOM, verify signatures, use trusted sources.
Sensitive Information Disclosure
The model leaks PII/secrets from training data or context. Mask sensitive input, use enterprise plans, add output filtering/DLP.
Insecure Plugin Design
Plugins/tools with weak validation or excessive access. Strictly type and validate inputs, apply least privilege.
Excessive Agency
An agent has too much permission or autonomy. Apply least privilege; gate irreversible actions with human approval.
Overreliance
Accepting model output without verification despite hallucination risk. Verify with sources (RAG), add human review.
Model Theft
Unauthorized access or query-based copying of the model. Access control, encryption, rate limits, monitoring.
What to do: layered defense and tools
1. Governance & Policy
- AI usage policy and approved-tools list
- Data classification: what can/can't go to AI
- KVKK/GDPR and sector compliance
2. Data Security & Privacy
- Mask PII/secrets, data minimization
- Use enterprise plans (no training on your data)
- Encrypt data, add DLP
3. Access & Identity
- Role-based access (RBAC), least privilege
- Secure and rotate API keys
- MFA; scope agent/plugin permissions
4. Application & Model Security
- Separate system prompt from user input
- Validate/escape output; don't run as command
- Guardrails; human approval for critical actions
5. Supply Chain
- Keep an SBOM/AIBOM
- Verify signatures/integrity
- Pin versions, scan dependencies
6. Monitoring & Response
- Prompt/output logging and audit trail
- Anomaly and abuse detection
- Incident plan and red teaming
FAQ
- What is AI security?
- Protecting AI systems (model, data, prompts, plugins, integrations) against attacks, leaks and misuse — covering both attacks that target AI (prompt injection, model theft) and attacks amplified by AI (deepfakes, phishing).
- What is the OWASP LLM Top 10?
- OWASP's list of the ten most critical vulnerabilities for LLM applications: prompt injection, insecure output handling, training data poisoning, DoS, supply chain, sensitive information disclosure, insecure plugins, excessive agency, overreliance and model theft.
- How do I defend against prompt injection?
- Separate the system prompt from user input and treat input as data, not commands. Restrict output and actions, sandbox untrusted content, and add human approval plus guardrails for critical operations.
- Which standards should I follow?
- NIST AI Risk Management Framework, ISO/IEC 42001 and the OWASP LLM Top 10 are solid starting frameworks, combined with KVKK/GDPR compliance.