Kingdom: Input Validation and Representation

Input validation and representation problems are caused by metacharacters, alternate encodings, and numeric representations. Security problems result from trusting input. The issues include "Buffer Overflows," "Cross-Site Scripting" attacks, "SQL Injection," and many others.

Prompt Injection

Abstract
When unvalidated data is sent to system-role prompts in AI models, attackers can manipulate outputs or execute unauthorized actions, compromising system integrity and data security.
Explanation
In AI applications, system prompts provide pre-processing instructions or context that guide the AI model's responses. When these prompts are constructed from unvalidated external inputs, they become vulnerable to injection attacks. Attackers can craft inputs that, when embedded as system prompts, alter the behavior of the AI model to execute unauthorized operations or disclose sensitive information.

Example 1: The following Python code illustrates a system prompt injection into an OpenAI model:

from openai import OpenAI

client = OpenAI()

# Simulated attacker's input attempting to inject a malicious system prompt
attacker_input = ...

completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": attacker_input},
        {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming."}
    ]
)


In this example, unvalidated attacker-controlled input is embedded directly as the system prompt, allowing the attacker to override the model's intended instructions and cause a security breach.
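A common mitigation is to keep the system prompt as a fixed, trusted template and confine all external text to the user role. The sketch below illustrates this pattern; the template string, the `build_messages` helper, and its names are illustrative assumptions, not part of any particular API:

```python
# Illustrative mitigation sketch (hypothetical helper, not a library API):
# never place untrusted input in the system role; keep the system prompt
# as a fixed, trusted constant and pass external text only as user content.
TRUSTED_SYSTEM_PROMPT = "You are a helpful assistant that writes poetry."

def build_messages(untrusted_input: str) -> list[dict]:
    """Build a chat payload that confines untrusted text to the user role."""
    return [
        {"role": "system", "content": TRUSTED_SYSTEM_PROMPT},
        {"role": "user", "content": untrusted_input},
    ]

messages = build_messages("Compose a poem that explains recursion.")
```

Because the system role is never built from external data, an attacker's input cannot redefine the model's instructions; it is processed only as ordinary user content.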
desc.dataflow.python.prompt_injection