Char vs String in Apex

apex string

I'm currently working on creating a tiny expression language in Apex. I'm following https://craftinginterpreters.com, which develops a simple lexer / parser in Java. I'm working on the lexer now, which traverses the source character by character. Apex does not have a native character type, so I'm wondering whether it is worth the extra work of treating characters as Integers:

Integer c = 'Hello, world!'.charAt(currentPosition);

or

String c = 'Hello, world!'.substring(currentPosition, currentPosition + 1);

CPU time is a consideration, as it might have to parse thousands of small source expressions in a regular transaction. I suspect the low-friction path is to implement characters as strings of length 1.

I thought someone here might have been down this path and would have some advice or a pointer or two.

Thanks,
Peter

Best Answer

I compared both of those methods to another method you may have overlooked: getChars. I got the following times in milliseconds for each (lower is better):

Method      Time (ms)
charAt      2512
substring   3338
getChars    744

I tested each against a String consisting of 100,000 characters for dramatic effect (and to get more accurate measurements). All three methods were called in the same transaction to avoid variances caused by server load between transactions.
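For anyone who wants to repeat the comparison, here is a minimal sketch of how such a measurement could be set up in anonymous Apex. It is not the code that produced the timings above; the variable names and the use of Limits.getCpuTime() for timing are assumptions, and the absolute numbers will vary by org and server load.

```apex
// Hypothetical benchmark sketch; NOT the code behind the timings quoted above.
// Build a 100,000-character test string.
String source = 'a'.repeat(100000);
Integer len = source.length();

// 1. charAt: one method call per character, returns an Integer.
Integer start = Limits.getCpuTime();
for (Integer i = 0; i < len; i++) {
    Integer c = source.charAt(i);
}
Integer charAtMs = Limits.getCpuTime() - start;

// 2. substring: one method call per character, returns a 1-character String.
start = Limits.getCpuTime();
for (Integer i = 0; i < len; i++) {
    String s = source.substring(i, i + 1);
}
Integer substringMs = Limits.getCpuTime() - start;

// 3. getChars: convert once, then index into the resulting list.
start = Limits.getCpuTime();
List<Integer> chars = source.getChars();
for (Integer i = 0; i < len; i++) {
    Integer c = chars[i];
}
Integer getCharsMs = Limits.getCpuTime() - start;

System.debug('charAt:    ' + charAtMs + ' ms');
System.debug('substring: ' + substringMs + ' ms');
System.debug('getChars:  ' + getCharsMs + ' ms');
```

Running all three loops in one execution keeps the comparison fair, but it also means the combined work counts against a single transaction's CPU limit, so reduce the string length if the run gets close to it.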

If you don't mind working with integer character codes instead of one-character strings, converting the string once via getChars pays off: in this test it ran about 3.4 times faster than charAt and about 4.5 times faster than substring. The tradeoff is heap usage: each "character" costs 4 bytes as an Integer, which is 4 times as much as it costs as a String. Overall, if you can spare the heap, the CPU savings are definitely worth it.
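As a rough illustration of what a lexer built on getChars might look like, here is a minimal sketch. The class name CharScanner and all of its methods are hypothetical, invented for this example; only getChars, charAt and String.fromCharArray are standard String methods.

```apex
// Hypothetical helper; the class and method names are illustrative only.
public class CharScanner {
    // Character codes computed once, so the hot loop compares Integers only.
    private static final Integer ZERO = '0'.charAt(0);
    private static final Integer NINE = '9'.charAt(0);

    private final List<Integer> chars; // source converted once via getChars
    private Integer current = 0;       // index of the next unread character

    public CharScanner(String source) {
        this.chars = source.getChars();
    }

    public Boolean isAtEnd() {
        return current >= chars.size();
    }

    // Returns the next character code without consuming it, or -1 at the end.
    public Integer peek() {
        return isAtEnd() ? -1 : chars[current];
    }

    // Consumes and returns the next character code.
    public Integer advance() {
        Integer c = chars[current];
        current++;
        return c;
    }

    public static Boolean isDigit(Integer c) {
        return c >= ZERO && c <= NINE;
    }

    // Consumes a run of digits and returns it as a String lexeme.
    public String readNumber() {
        List<Integer> buf = new List<Integer>();
        while (!isAtEnd() && isDigit(peek())) {
            buf.add(advance());
        }
        return String.fromCharArray(buf);
    }
}
```

With this sketch, new CharScanner('123+4').readNumber() returns '123' and leaves the scanner positioned on the '+'. String.fromCharArray converts the collected codes back into a String only when a lexeme is complete, so the per-character work stays on Integers.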