Kingdom: Code Quality
Poor code quality leads to unpredictable behavior. From a user's perspective that often manifests itself as poor usability. For an attacker it provides an opportunity to stress the system in unexpected ways.
Encoding Confusion: BiDi Control Characters
Abstract
Bidirectional control characters in source code can enable Trojan Source attacks.
Explanation
Source code that contains Unicode bidirectional override control characters can be a sign of an insider threat attack. Such an attack can be carried out through the supply chain for programming languages such as C, C++, C#, Go, Java, JavaScript, Python, and Rust. Nicholas Boucher and Ross Anderson have published several variants of the attack, including Early Returns, Commenting-Out, and Stretched Strings.
Example 1: The following code exhibits a control character, present in a C source code file, which leads to an Early Return attack:
#include <stdio.h>
int main() {
/* Nothing to see here; newline RLI /*/ return 0 ;
printf("Do we get here?\n");
return 0;
}
The Right-to-Left Isolate (RLI) Unicode bidirectional control character in Example 1 causes the code to be viewed as the following:
#include <stdio.h>
int main() {
/* Nothing to see here; newline; return 0 /*/
printf("Do we get here?\n");
return 0;
}
Of particular note, a developer who reviews the code in a vulnerable editor or viewer does not see what a vulnerable compiler processes, specifically the early return statement that changes the program flow.
References
[1] Nicholas Boucher and Ross Anderson, Trojan Source: Invisible Vulnerabilities
[2] Standards Mapping - Common Weakness Enumeration CWE ID 451
[3] Standards Mapping - OWASP Top 10 2017 A1 Injection
[4] Standards Mapping - OWASP Top 10 2021 A03 Injection
[5] Standards Mapping - Smart Contract Weakness Classification SWC-130
desc.regex.universal.encoding_confusion_bidi_control_characters