Simcha Kosman of CyberArk explains how MCP servers enable indirect prompt injection through tool outputs, schemas and external APIs, exposing AI systems to silent data exfiltration and misuse.
Artificial intelligence systems increasingly rely on external tools to extend their capabilities, but this integration introduces new attack surfaces that traditional security models fail to anticipate. Model Context Protocol (MCP) servers centralize tool execution and data exchange, yet subtle design choices allow indirect prompt injection to manipulate model behavior without user intent.
Seemingly benign tool metadata, schemas or error outputs become covert instruction channels that trigger unauthorized actions, including sensitive data exfiltration. Understanding how these attacks emerge requires tracing the full data flow between models, tools and third-party APIs.
In this session, Simcha Kosman, senior security researcher at CyberArk, will discuss:
- How indirect prompt injection exploits MCP tool descriptions, schemas and outputs to manipulate model behavior;
- Why third-party APIs expand the AI attack surface beyond developer control and traditional security boundaries;
- Practical mitigation strategies for securing MCP-based architectures through strict output controls and limited tool authority.
Here is the course outline:
When AI Tools Turn Against You: Securing MCP Servers From Indirect Prompt Injection |
Completion
The following certificates are awarded when the course is completed:
![]() |
CPE Credit Certificate |
