[SalesForce] Why does trim() NOT remove char 160

I have a scenario where the label of a PicklistEntry contains a char 160 (non-breaking space, U+00A0) at the end of the value. So when I call pe.label.getChars(), the output is the following list of characters: (104, 101, 108, 108, 111, 160).

If I call trim() then getChars(), I'm expecting the trailing 160 character to be removed. However, it's not. When I use char 32, trim() will remove it correctly.

Additionally, using normalizeSpace() DOES remove the char 160.

So my main question is why doesn't trim() remove this character while normalizeSpace() does?

See code example below.

// get 'hello ' from character array using 160 for space
String hello = String.fromCharArray(new Integer[] { 104, 101, 108, 108, 111, 160 });
System.debug('==>' + hello.trim()); //output ==> 'hello '

// get 'hello ' from character array using 32 for space
hello = String.fromCharArray(new Integer[] { 104, 101, 108, 108, 111, 32 });
System.debug('==>' + hello.trim()); //output ==> 'hello'

// get 'hello ' from character array using 160 for space and call normalizeSpace()
hello = String.fromCharArray(new Integer[] { 104, 101, 108, 108, 111, 160 });
System.debug('==>' + hello.normalizeSpace()); //output ==> 'hello'

EDIT
Additionally, when calling normalizeSpace(), the char 160 is actually converted to char 32 rather than removed. So in order to completely trim the 160 and the resulting 32, I have to call normalizeSpace().trim().

String hello = String.fromCharArray(new Integer[] { 104, 101, 108, 108, 111, 160 });
String normalized = hello.normalizeSpace();
System.debug('==>' + normalized); //output ==> 'hello ' (the trailing character is now char 32)
System.debug('==>' + normalized.getChars()); //output ==> (104, 101, 108, 108, 111, 32)

Best Answer

The documentation for trim says:

Leading and trailing ASCII control characters such as tabs and newline characters are also removed. White space and control characters that aren’t at the beginning or end of the sentence aren’t removed.

Taking this literally, only space (ASCII 32), tab (ASCII 9), line feed (ASCII 10), and carriage return (ASCII 13) would be removed, leaving other whitespace, such as the non-breaking space, the zero-width space, and so on, unaffected. This is probably because trim is a very old method, dating back to the beginning of Apex, while normalizeSpace is relatively new.
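
If you want to check which trailing characters trim() actually strips in your org, something like the following can be run as anonymous Apex. It is only a sketch: the list of code points and the variable names are my own choices, and the only results I would rely on are the ones already shown in the question (32 is removed, 160 is not).

```apex
// Sketch: append a single code point to 'hello' and record whether trim() strips it.
// The code points below are illustrative choices, not an exhaustive list.
List<Integer> codePoints = new List<Integer>{ 9, 10, 13, 32, 160 };
Map<Integer, Boolean> removedByTrim = new Map<Integer, Boolean>();

for (Integer cp : codePoints) {
    String padded = 'hello' + String.fromCharArray(new List<Integer>{ cp });
    // trim() removed the character if only the five letters remain
    removedByTrim.put(cp, padded.trim().length() == 5);
    System.debug('code point ' + cp + ' removed by trim(): ' + removedByTrim.get(cp));
}
```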


As a quick alternative that should do what you want:

// get ' hello ' from character array using 160 for space
String hello = String.fromCharArray(new Integer[] { 160, 104, 101, 108, 108, 111, 160 });
System.debug('==>"' + hello.replaceAll('^\\p{IsWhite_Space}+|\\p{IsWhite_Space}+$','')+'"'); //output ==>"hello"
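
If you need this in more than one place, the same regular expression could be wrapped in a small helper. This is a minimal sketch rather than anything built in: the class name PicklistLabelCleaner and the methods trimUnicode and cleanLabels are hypothetical names I made up for illustration.

```apex
// Hypothetical helper class; the names are illustrative and not part of Apex.
public class PicklistLabelCleaner {

    // Strips leading and trailing Unicode white space (including char 160).
    // White space inside the value is left untouched.
    public static String trimUnicode(String value) {
        if (value == null) {
            return null;
        }
        return value.replaceAll('^\\p{IsWhite_Space}+|\\p{IsWhite_Space}+$', '');
    }

    // Returns the cleaned label of every picklist entry, in the original order.
    public static List<String> cleanLabels(List<Schema.PicklistEntry> entries) {
        List<String> labels = new List<String>();
        for (Schema.PicklistEntry pe : entries) {
            labels.add(trimUnicode(pe.getLabel()));
        }
        return labels;
    }
}
```

For example, PicklistLabelCleaner.cleanLabels(Account.Industry.getDescribe().getPicklistValues()) would return the trimmed labels for that field; Account.Industry is only a stand-in for whichever picklist field you are describing.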