Kingdom: Input Validation and Representation

Input validation and representation problems ares caused by metacharacters, alternate encodings and numeric representations. Security problems result from trusting input. The issues include: "Buffer Overflows," "Cross-Site Scripting" attacks, "SQL Injection," and many others.

Buffer Overflow

Abstract
Writing outside the bounds of a block of allocated memory can corrupt data, crash the program, or cause the execution of malicious code.
Explanation
Buffer overflow is probably the best known form of software security vulnerability. Most software developers know what a buffer overflow vulnerability is, but buffer overflow attacks against both legacy and newly-developed applications are still quite common. Part of the problem is due to the wide variety of ways buffer overflows can occur, and part is due to the error-prone techniques often used to prevent them.

In a classic buffer overflow exploit, the attacker sends data to a program, which it stores in an undersized stack buffer. The result is that information on the call stack is overwritten, including the function's return pointer. The data sets the value of the return pointer so that when the function returns, it transfers control to malicious code contained in the attacker's data.

Although this type of stack buffer overflow is still common on some platforms and in some development communities, there are a variety of other types of buffer overflow, including heap buffer overflows and off-by-one errors among others. There are a number of excellent books that provide detailed information on how buffer overflow attacks work, including Building Secure Software [1], Writing Secure Code [2], and The Shellcoder's Handbook [3].

At the code level, buffer overflow vulnerabilities usually involve the violation of a programmer's assumptions. Many memory manipulation functions in C and C++ do not perform bounds checking and can easily overwrite the allocated bounds of the buffers they operate upon. Even bounded functions, such as strncpy(), can cause vulnerabilities when used incorrectly. The combination of memory manipulation and mistaken assumptions about the size or makeup of a piece of data is the root cause of most buffer overflows.

Buffer overflow vulnerabilities typically occur in code that:

- Relies on external data to control its behavior.

- Depends upon properties of the data that are enforced outside of the immediate scope of the code.

- Is so complex that a programmer cannot accurately predict its behavior.



The following examples demonstrate all three of the scenarios.

Example 1.a: The following sample code demonstrates a simple buffer overflow that is often caused by the first scenario in which the code relies on external data to control its behavior. The code uses the gets() function to read an arbitrary amount of data into a stack buffer. Because there is no way to limit the amount of data read by this function, the safety of the code depends on the user to always enter fewer than BUFSIZE characters.


...
char buf[BUFSIZE];
gets(buf);
...
Example 1.b: This example shows how easy it is to mimic the unsafe behavior of the gets() function in C++ by using the >> operator to read input into a char[] string.


...
char buf[BUFSIZE];
cin >> (buf);
...
Example 2: The code in this example also relies on user input to control its behavior, but it adds a level of indirection with the use of the bounded memory copy function memcpy(). This function accepts a destination buffer, a source buffer, and the number of bytes to copy. The input buffer is filled by a bounded call to read(), but the user specifies the number of bytes that memcpy() copies.


...
char buf[64], in[MAX_SIZE];
printf("Enter buffer contents:\n");
read(0, in, MAX_SIZE-1);
printf("Bytes to copy:\n");
scanf("%d", &bytes);
memcpy(buf, in, bytes);
...


Note: This type of buffer overflow vulnerability (where a program reads data and then trusts a value from the data in subsequent memory operations on the remaining data) has turned up with some frequency in image, audio, and other file processing libraries.

Example 3: This is an example of the second scenario in which the code depends on properties of the data that are not verified locally. In this example a function named lccopy() takes a string as its argument and returns a heap-allocated copy of the string with all uppercase letters converted to lowercase. The function performs no bounds checking on its input because it expects str to always be smaller than BUFSIZE. If an attacker bypasses checks in the code that calls lccopy(), or if a change in that code makes the assumption about the size of str untrue, then lccopy() will overflow buf with the unbounded call to strcpy().


char *lccopy(const char *str) {
char buf[BUFSIZE];
char *p;

strcpy(buf, str);
for (p = buf; *p; p++) {
if (isupper(*p)) {
*p = tolower(*p);
}
}
return strdup(buf);
}
Example 4: The following code demonstrates the third scenario in which the code is so complex its behavior cannot be easily predicted. This code is from the popular libPNG image decoder, which is used by a wide array of applications.

The code appears to safely perform bounds checking because it checks the size of the variable length, which it later uses to control the amount of data copied by png_crc_read(). However, immediately before it tests length, the code performs a check on png_ptr->mode, and if this check fails a warning is issued and processing continues. Since length is tested in an else if block, length would not be tested if the first check fails, and is used blindly in the call to png_crc_read(), potentially allowing a stack buffer overflow.

Although the code in this example is not the most complex we have seen, it demonstrates why complexity should be minimized in code that performs memory operations.


if (!(png_ptr->mode & PNG_HAVE_PLTE)) {
/* Should be an error, but we can cope with it */
png_warning(png_ptr, "Missing PLTE before tRNS");
}
else if (length > (png_uint_32)png_ptr->num_palette) {
png_warning(png_ptr, "Incorrect tRNS chunk length");
png_crc_finish(png_ptr, length);
return;
}
...
png_crc_read(png_ptr, readbuf, (png_size_t)length);
Example 5: This example also demonstrates the third scenario in which the program's complexity exposes it to buffer overflows. In this case, the exposure is due to the ambiguous interface of one of the functions rather than the structure of the code (as was the case in the previous example).

The getUserInfo() function takes a username specified as a multibyte string and a pointer to a structure for user information, and populates the structure with information about the user. Since Windows authentication uses Unicode for usernames, the username argument is first converted from a multibyte string to a Unicode string. This function then incorrectly passes the size of unicodeUser in bytes rather than characters. The call to MultiByteToWideChar() may therefore write up to (UNLEN+1)*sizeof(WCHAR) wide characters, or
(UNLEN+1)*sizeof(WCHAR)*sizeof(WCHAR) bytes, to the unicodeUser array, which has only (UNLEN+1)*sizeof(WCHAR) bytes allocated. If the username string contains more than UNLEN characters, the call to MultiByteToWideChar() will overflow the buffer unicodeUser.


void getUserInfo(char *username, struct _USER_INFO_2 info){
WCHAR unicodeUser[UNLEN+1];
MultiByteToWideChar(CP_ACP, 0, username, -1,
unicodeUser, sizeof(unicodeUser));
NetUserGetInfo(NULL, unicodeUser, 2, (LPBYTE *)&info);
}
References
[1] J. Viega, G. McGraw Building Secure Software Addison-Wesley
[2] M. Howard, D. LeBlanc Writing Secure Code, Second Edition Microsoft Press
[3] J. Koziol et al. The Shellcoder's Handbook: Discovering and Exploiting Security Holes John Wiley & Sons
[4] About Strsafe.h Microsoft
[5] Standards Mapping - Common Weakness Enumeration CWE ID 120, CWE ID 129, CWE ID 131, CWE ID 787
[6] Standards Mapping - Common Weakness Enumeration Top 25 2019 [1] CWE ID 119, [3] CWE ID 020, [12] CWE ID 787
[7] Standards Mapping - Common Weakness Enumeration Top 25 2020 [5] CWE ID 119, [3] CWE ID 020, [2] CWE ID 787
[8] Standards Mapping - Common Weakness Enumeration Top 25 2021 [1] CWE ID 787, [4] CWE ID 020, [17] CWE ID 119
[9] Standards Mapping - Common Weakness Enumeration Top 25 2022 [1] CWE ID 787, [4] CWE ID 020, [19] CWE ID 119
[10] Standards Mapping - Common Weakness Enumeration Top 25 2023 [1] CWE ID 787, [6] CWE ID 020, [17] CWE ID 119
[11] Standards Mapping - DISA Control Correlation Identifier Version 2 CCI-002754, CCI-002824
[12] Standards Mapping - General Data Protection Regulation (GDPR) Indirect Access to Sensitive Data
[13] Standards Mapping - Motor Industry Software Reliability Association (MISRA) C Guidelines 2012 Rule 1.3
[14] Standards Mapping - Motor Industry Software Reliability Association (MISRA) C Guidelines 2023 Directive 4.14, Rule 1.3, Rule 21.17
[15] Standards Mapping - Motor Industry Software Reliability Association (MISRA) C++ Guidelines 2008 Rule 0-3-1, Rule 18-0-5
[16] Standards Mapping - NIST Special Publication 800-53 Revision 4 SI-10 Information Input Validation (P1), SI-16 Memory Protection (P1)
[17] Standards Mapping - NIST Special Publication 800-53 Revision 5 SI-10 Information Input Validation, SI-16 Memory Protection
[18] Standards Mapping - OWASP Application Security Verification Standard 4.0 5.1.3 Input Validation Requirements (L1 L2 L3), 5.1.4 Input Validation Requirements (L1 L2 L3), 5.4.1 Memory/String/Unmanaged Code Requirements (L1 L2 L3), 5.4.2 Memory/String/Unmanaged Code Requirements (L1 L2 L3), 14.1.2 Build (L2 L3)
[19] Standards Mapping - OWASP Mobile 2014 M7 Client Side Injection
[20] Standards Mapping - OWASP Mobile 2024 M4 Insufficient Input/Output Validation
[21] Standards Mapping - OWASP Mobile Application Security Verification Standard 2.0 MASVS-CODE-4
[22] Standards Mapping - OWASP Top 10 2004 A5 Buffer Overflow
[23] Standards Mapping - OWASP Top 10 2013 A1 Injection
[24] Standards Mapping - OWASP Top 10 2017 A1 Injection
[25] Standards Mapping - Payment Card Industry Data Security Standard Version 1.1 Requirement 6.5.5
[26] Standards Mapping - Payment Card Industry Data Security Standard Version 1.2 Requirement 6.3.1.1
[27] Standards Mapping - Payment Card Industry Data Security Standard Version 2.0 Requirement 6.5.2
[28] Standards Mapping - Payment Card Industry Data Security Standard Version 3.0 Requirement 6.5.2
[29] Standards Mapping - Payment Card Industry Data Security Standard Version 3.1 Requirement 6.5.2
[30] Standards Mapping - Payment Card Industry Data Security Standard Version 3.2 Requirement 6.5.2
[31] Standards Mapping - Payment Card Industry Data Security Standard Version 3.2.1 Requirement 6.5.2
[32] Standards Mapping - Payment Card Industry Data Security Standard Version 4.0 Requirement 6.2.4
[33] Standards Mapping - Payment Card Industry Software Security Framework 1.0 Control Objective 4.2 - Critical Asset Protection
[34] Standards Mapping - Payment Card Industry Software Security Framework 1.1 Control Objective 4.2 - Critical Asset Protection, Control Objective B.3.1 - Terminal Software Attack Mitigation, Control Objective B.3.1.1 - Terminal Software Attack Mitigation, Control Objective B.3.1.2 - Terminal Software Attack Mitigation
[35] Standards Mapping - Payment Card Industry Software Security Framework 1.2 Control Objective 4.2 - Critical Asset Protection, Control Objective B.3.1 - Terminal Software Attack Mitigation, Control Objective B.3.1.1 - Terminal Software Attack Mitigation, Control Objective B.3.1.2 - Terminal Software Attack Mitigation, Control Objective C.3.2 - Web Software Attack Mitigation
[36] Standards Mapping - SANS Top 25 2009 Risky Resource Management - CWE ID 119
[37] Standards Mapping - SANS Top 25 2010 Risky Resource Management - CWE ID 120, Risky Resource Management - CWE ID 129, Risky Resource Management - CWE ID 131
[38] Standards Mapping - SANS Top 25 2011 Risky Resource Management - CWE ID 120, Risky Resource Management - CWE ID 131
[39] Standards Mapping - Security Technical Implementation Guide Version 3.1 APP3510 CAT I, APP3590.1 CAT I
[40] Standards Mapping - Security Technical Implementation Guide Version 3.4 APP3510 CAT I, APP3590.1 CAT I
[41] Standards Mapping - Security Technical Implementation Guide Version 3.5 APP3510 CAT I, APP3590.1 CAT I
[42] Standards Mapping - Security Technical Implementation Guide Version 3.6 APP3510 CAT I, APP3590.1 CAT I
[43] Standards Mapping - Security Technical Implementation Guide Version 3.7 APP3510 CAT I, APP3590.1 CAT I
[44] Standards Mapping - Security Technical Implementation Guide Version 3.9 APP3510 CAT I, APP3590.1 CAT I
[45] Standards Mapping - Security Technical Implementation Guide Version 3.10 APP3510 CAT I, APP3590.1 CAT I
[46] Standards Mapping - Security Technical Implementation Guide Version 4.2 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[47] Standards Mapping - Security Technical Implementation Guide Version 4.3 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[48] Standards Mapping - Security Technical Implementation Guide Version 4.4 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[49] Standards Mapping - Security Technical Implementation Guide Version 4.5 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[50] Standards Mapping - Security Technical Implementation Guide Version 4.6 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[51] Standards Mapping - Security Technical Implementation Guide Version 4.7 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[52] Standards Mapping - Security Technical Implementation Guide Version 4.8 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[53] Standards Mapping - Security Technical Implementation Guide Version 4.9 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[54] Standards Mapping - Security Technical Implementation Guide Version 4.10 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[55] Standards Mapping - Security Technical Implementation Guide Version 4.11 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[56] Standards Mapping - Security Technical Implementation Guide Version 4.1 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[57] Standards Mapping - Security Technical Implementation Guide Version 5.1 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[58] Standards Mapping - Security Technical Implementation Guide Version 5.2 APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[59] Standards Mapping - Security Technical Implementation Guide Version 5.3 APSC-DV-002530 CAT II, APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[60] Standards Mapping - Security Technical Implementation Guide Version 6.1 APSC-DV-002530 CAT II, APSC-DV-002560 CAT I, APSC-DV-002590 CAT I
[61] Standards Mapping - Web Application Security Consortium Version 2.00 Buffer Overflow (WASC-07)
[62] Standards Mapping - Web Application Security Consortium 24 + 2 Buffer Overflow
desc.dataflow.cpp.buffer_overflow