Kingdom: Input Validation and Representation

Input validation and representation problems are caused by metacharacters, alternate encodings, and numeric representations. Security problems result from trusting input. The issues include: "Buffer Overflows," "Cross-Site Scripting" attacks, "SQL Injection," and many others.

Prompt Injection: Persistent

Abstract
Sending unvalidated data to system prompts in AI models enables attackers to manipulate outputs or execute unauthorized actions, compromising system integrity and data security.
Explanation
In AI applications, system prompts provide pre-processing instructions or context that guide the AI model's responses. Attackers can craft inputs that, when embedded as system prompts, alter the behavior of the AI model to execute unauthorized operations or disclose sensitive information. In the case of persistent prompt injection, this untrusted input typically comes from a database or a back-end data store rather than from a web request.

Example 1: The following code illustrates system prompt injection in an AI chat client that uses Spring AI:

@GetMapping("/prompt_injection_persistent")
String generation(String userInput1, ...) {
    // Untrusted value read from a back-end data store
    Statement stmt = conn.createStatement();
    ResultSet rs = stmt.executeQuery("SELECT * FROM users WHERE ...");
    String userName = "";

    if (rs != null) {
        rs.next();
        userName = rs.getString("userName");
    }

    // The database-sourced value flows into the system prompt unvalidated
    return this.clientBuilder.build().prompt()
            .system("Assist the user " + userName)
            .user(userInput1)
            .call()
            .content();
}


In this example, attacker-controlled data from the data store reaches the system prompt unvalidated, which can lead to a security breach.
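One possible mitigation, shown here as a minimal sketch, is to treat the database-sourced value as untrusted and validate it against an allowlist before it reaches the system prompt. The SAFE_NAME pattern and safeUserName helper below are illustrative assumptions, not part of Spring AI:

import java.util.regex.Pattern;

// Illustrative allowlist: assumes a legitimate user name is short and contains only benign characters
private static final Pattern SAFE_NAME = Pattern.compile("^[A-Za-z0-9 .'-]{1,64}$");

String safeUserName(String userName) {
    if (userName == null || !SAFE_NAME.matcher(userName).matches()) {
        throw new IllegalArgumentException("Unexpected user name value from data store");
    }
    return userName;
}

The system prompt would then be built from safeUserName(userName) instead of the raw database value.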
References
[1] Standards Mapping - Common Weakness Enumeration CWE ID 1427
[2] Standards Mapping - Common Weakness Enumeration Top 25 2024 [13] CWE ID 077
desc.dataflow.java.prompt_injection_persistent
Abstract
Sending unvalidated data to system prompts in AI models enables attackers to manipulate outputs or execute unauthorized actions, compromising system integrity and data security.
Explanation
In AI applications, system prompts provide pre-processing instructions or context that guide the AI model's responses. Attackers can craft inputs that, when embedded as system prompts, alter the behavior of the AI model to execute unauthorized operations or disclose sensitive information. In the case of persistent prompt injection, this untrusted input typically comes from a database or a back-end data store rather than from a web request.

Example 1: The following JavaScript code illustrates system prompt injection against the Anthropic AI model:

const client = new Anthropic();

// Simulated attacker's input attempting to inject a malicious system prompt;
// the name value comes from a back-end data store
const attacker_query = ...;
const attacker_name = db.query('SELECT name FROM user_profiles WHERE ...');

const response = await client.messages.create({
    model: "claude-3-5-sonnet-20240620",
    max_tokens: 2048,
    // Database-sourced value flows into the system prompt unvalidated
    system: "Provide assistance to the user " + attacker_name,
    messages: [
        {"role": "user", "content": attacker_query}
    ]
});
...


In this example, attacker-controlled data from the data store reaches the system prompt unvalidated, which can lead to a security breach.
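A minimal mitigation sketch is to validate the database-sourced value against an allowlist before building the system prompt. The SAFE_NAME pattern and safeName helper below are illustrative assumptions, not part of the Anthropic SDK:

// Illustrative allowlist: assumes a legitimate profile name is short and contains only benign characters
const SAFE_NAME = /^[A-Za-z0-9 .'-]{1,64}$/;

function safeName(name) {
    if (typeof name !== 'string' || !SAFE_NAME.test(name)) {
        throw new Error('Unexpected name value from data store');
    }
    return name;
}

The system prompt would then be built from safeName(attacker_name) rather than the raw database value.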
References
[1] Standards Mapping - Common Weakness Enumeration CWE ID 1427
[2] Standards Mapping - Common Weakness Enumeration Top 25 2024 [13] CWE ID 077
desc.dataflow.javascript.prompt_injection_persistent
Abstract
Sending unvalidated data to system prompts in AI models enables attackers to manipulate outputs or execute unauthorized actions, compromising system integrity and data security.
Explanation
In AI applications, system prompts provide pre-processing instructions or context that guide the AI model's responses. Attackers can craft inputs that, when embedded as system prompts, alter the behavior of the AI model to execute unauthorized operations or disclose sensitive information. In the case of persistent prompt injection, this untrusted input typically comes from a database or a back-end data store rather than from a web request.

Example 1: The following Python code illustrates system prompt injection against the OpenAI AI model:

client = OpenAI()

# Simulated attacker's input attempting to inject a malicious system prompt;
# the name value comes from a back-end data store
attacker_name = cursor.fetchone()['name']
attacker_query = ...

completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        # Database-sourced value flows into the system prompt unvalidated
        {"role": "system", "content": "Provide assistance to the user " + attacker_name},
        {"role": "user", "content": attacker_query}
    ]
)


In this example, attacker-controlled data from the data store reaches the system prompt unvalidated, which can lead to a security breach.
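A minimal mitigation sketch is to validate the database-sourced value against an allowlist before it is embedded in the system prompt. The SAFE_NAME pattern and safe_name helper below are illustrative assumptions, not part of the OpenAI client:

import re

# Illustrative allowlist: assumes a legitimate profile name is short and contains only benign characters
SAFE_NAME = re.compile(r"^[A-Za-z0-9 .'-]{1,64}$")

def safe_name(name):
    if not isinstance(name, str) or not SAFE_NAME.fullmatch(name):
        raise ValueError("Unexpected name value from data store")
    return name

The system message would then use safe_name(attacker_name) instead of the raw database value.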
References
[1] Standards Mapping - Common Weakness Enumeration CWE ID 1427
[2] Standards Mapping - Common Weakness Enumeration Top 25 2024 [13] CWE ID 077
desc.dataflow.python.prompt_injection_persistent