Kingdom: Code Quality

Poor code quality leads to unpredictable behavior. From a user's perspective that often manifests itself as poor usability. For an attacker it provides an opportunity to stress the system in unexpected ways.

Code Correctness: Byte Array to String Conversion

Abstract
Converting a byte array into a String may lead to data loss.
Explanation
When data from a byte array is converted into a String, it is unspecified what happens to any data that falls outside the applicable character set. This can lead to data loss, or to weakened security when the original binary data is needed for a security measure to work correctly.
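
The loss described above can be observed with a small round-trip check. The following is a minimal sketch, not part of the original example: the class name LossyConversionDemo and the method survivesRoundTrip are hypothetical, and UTF-8 is passed explicitly as an assumption so the result does not depend on the platform's default character set.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
class LossyConversionDemo {

    // Returns true if the bytes are unchanged after byte[] -> String -> byte[] in UTF-8.
    static boolean survivesRoundTrip(byte[] original) {
        String decoded = new String(original, StandardCharsets.UTF_8);
        byte[] reencoded = decoded.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(original, reencoded);
    }

    public static void main(String[] args) {
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] binaryA = { (byte) 0xFF }; // never valid in UTF-8
        byte[] binaryB = { (byte) 0xFE }; // never valid in UTF-8

        System.out.println(survivesRoundTrip(text));    // prints "true"
        System.out.println(survivesRoundTrip(binaryA)); // prints "false"

        // Both invalid bytes decode to the replacement character U+FFFD,
        // so two different inputs become the same String.
        String a = new String(binaryA, StandardCharsets.UTF_8);
        String b = new String(binaryB, StandardCharsets.UTF_8);
        System.out.println(a.equals(b));                // prints "true"
    }
}
```

Because distinct byte arrays collapse to the same String, any value derived from that String (such as a hash) can no longer tell the original inputs apart.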

Example 1: The following code converts data into a String in order to create a hash.


...
FileInputStream fis = new FileInputStream(myFile);
byte[] byteArr = new byte[BUFSIZE];
...
int count = fis.read(byteArr);
...
String fileString = new String(byteArr);
String fileSHA256Hex = DigestUtils.sha256Hex(fileString);
// use fileSHA256Hex to validate file
...


Assuming the file is smaller than BUFSIZE, this works as long as the contents of myFile are encoded in the default character set. If the file uses a different encoding, or is a binary file, the conversion loses information. The resulting SHA-256 hash is then less reliable, and collisions may become far easier to produce, especially if every byte sequence outside the default character set is replaced by the same value, such as a question mark.
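
One way to avoid the problem in Example 1 is to hash the raw bytes and never build an intermediate String from them. The following is a minimal sketch using the JDK's java.security.MessageDigest in place of the DigestUtils call in the example. The class name RawByteHash, its method names, and the 8192-byte buffer size are hypothetical choices made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class; names are illustrative only.
class RawByteHash {

    private static MessageDigest newSha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Hashes the bytes exactly as given; no character set is involved.
    static String sha256Hex(byte[] data) {
        return toHex(newSha256().digest(data));
    }

    // Hashes a stream of any length, feeding only the bytes actually read.
    static String sha256Hex(InputStream in) throws IOException {
        MessageDigest digest = newSha256();
        byte[] buffer = new byte[8192];
        int count;
        while ((count = in.read(buffer)) != -1) {
            digest.update(buffer, 0, count);
        }
        return toHex(digest.digest());
    }
}
```

With this approach, a call such as RawByteHash.sha256Hex(fis) would replace both the String conversion and the hash call in Example 1, and it also removes the assumption that the file fits in a single buffer. If Apache Commons Codec is already in use, its DigestUtils class offers sha256Hex overloads that take a byte array or an InputStream, which serve the same purpose.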
References
[1] STR03-J. Do not encode noncharacter data as a string CERT
[2] When 'EFBFBD' and Friends Come Knocking: Observations of Byte Array to String Conversions GDS Security
[3] Standards Mapping - Common Weakness Enumeration CWE ID 486
desc.semantic.java.code_correctness_byte_array_to_string_conversion