[SalesForce] REST API Unicode characters not working

I'm using the REST API for a texting app (Lightning components). When I send a message that includes special characters such as 妮, 暗示, or 😀, the characters go out correctly and are saved correctly to a Text field on my custom object. However, when I receive a text, the special characters arrive as gibberish: not Unicode or UTF-8 escape sequences, but ððð or ç±è°è°.

Here's my code where I'm pulling the incoming parameters:

global class ReceiveSMS {
    @HttpPost
    global static void saveSMS() {
        // Store the request received
        RestRequest req = RestContext.request;
        System.debug(req);

        // Store the HTTP parameters received in a Map
        Map<String, String> smsParams = req.params;

        String fromMobile;
        String msgBody;
        Contact contactRecord;

        // Extract SMS Sender's phone number and store it in a variable
        if (smsParams.containsKey('From')) {
            fromMobile = smsParams.get('From');
        }

        // Extract the body of SMS and store it in a variable
        if (smsParams.containsKey('Body')) {
            msgBody = smsParams.get('Body');
        }

        // code goes on to construct and insert the message, etc.

I'm not sure what I'm missing… I know the field on the custom object can display the characters (when I send instead of receive).

Here is the raw HTTP POST body (access codes obscured, and line breaks inserted for readability). Note that the Body parameter carries the bytes F0 9F 8D 8E, percent-encoded as %F0%9F%8D%8E. That's the UTF-8 encoding of 🍎 (Red Apple), but when it's saved in my custom object field, I see ð instead.

ApiVersion=2010-04-01&SmsSid=SM9f774axxxxxf48a3a145&SmsStatus=received&
SmsMessageSid=SM9f774xxxxxxxxx4b34f48a3a145&NumSegments=1&
From=%2B14151234567&ToState=CA&MessageSid=SM9fxxxxx4f48a3a145&
AccountSid=AC3xxxxx7a9f58&ToZip=&FromCountry=US&ToCity=&
FromCity=SAN+FRANCISCO&To=%2B1987654321&FromZip=95550&
Body=%F0%9F%8D%8E&ToCountry=US&FromState=CA&NumMedia=0

Suggestions?


@IllusiveBrian, I'm not sure, but I think something goes missing when my Apex ReceiveSMS class extracts the Body parameter and inserts it into the custom object field. I have other code that saves outgoing messages into that same field, and putting an emoji in is no problem: it displays as an emoji, not as UTF-8 bytes or some other encoding.


@DavidReed, thanks for that bit of info about that character. That would suggest that it's picking up the first byte (F0) and showing it as ð, but not the whole sequence? Not sure what to make of that.
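The "first byte only" observation can be checked outside Salesforce. The snippet below is a minimal Python sketch, not Apex, and it assumes the receiving side reads the bytes as Latin-1 (the variable names are illustrative). It shows that all four bytes survive as four separate characters, but only the first one is printable, which would explain why only ð is visible.

```python
import unicodedata

# The four bytes behind Body=%F0%9F%8D%8E in the raw request.
raw = bytes.fromhex("F09F8D8E")

# Read as UTF-8, the four bytes form one code point.
as_utf8 = raw.decode("utf-8")
print(unicodedata.name(as_utf8))

# Read as Latin-1 (an assumption about the receiver), each byte
# becomes its own character.
as_latin1 = raw.decode("latin-1")
for ch in as_latin1:
    kind = "printable" if ch.isprintable() else "control character, invisible"
    print(f"U+{ord(ch):04X}: {kind}")
```

If this matches what Salesforce does, the remaining three characters are still stored in the field; they just do not render.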


@Sumaga, thanks for your insights. I can't figure out how to dump the entire raw incoming request where I can see/share it. I tried adding System.debug, but the logs don't seem to run on an @RestResource class static method. I also tried dumping the full request into a text field, but the RestRequest class isn't a string, and string.valueOf() doesn't work. Open to suggestions on this.


@DavidReed, here's the header:

Accept: */*, CipherSuite: ECDHE-RSA-AES256-GCM-SHA384 TLSv1.2 256-bits, Referer: http://[my instance].na73.force.com/services/apexrest/ReceiveSMS, User-Agent: TwilioProxy/1.1, Host: [my instance].na73.force.com, X-Salesforce-VIP: FORCE, X-Salesforce-SIP: 34.227.11.139, Cache-Control: max-age=259200, X-Twilio-Signature: af9fTva9ra2SUagUSxCNvQTztig=, X-Cnection: close, X-Salesforce-Forwarded-To: na73.salesforce.com, Content-Type: application/x-www-form-urlencoded

Same header, with line breaks inserted for easier reading:

Accept: */*
CipherSuite: ECDHE-RSA-AES256-GCM-SHA384 TLSv1.2 256-bits
Referer: http://[my instance].na73.force.com/services/apexrest/ReceiveSMS
User-Agent: TwilioProxy/1.1
Host: [my instance].na73.force.com
X-Salesforce-VIP: FORCE
X-Salesforce-SIP: 34.227.11.139
Cache-Control: max-age=259200
X-Twilio-Signature: af9fTva9ra2SUagUSxCNvQTztig=
X-Cnection: close
X-Salesforce-Forwarded-To: na73.salesforce.com
Content-Type: application/x-www-form-urlencoded

I think the key element is that last line, Content-Type, but that's what I was expecting to get.


MORE STUFF I'VE TRIED…
I tried grabbing the incoming request as a Blob and then decoding it, with no luck. According to the documentation, base64Encode "converts a Blob to an unencoded String representing its normal form."

RestRequest req = RestContext.request;
Blob reqBlob = req.requestBody;
String reqString = EncodingUtil.base64Encode(reqBlob);

For a message sent as 'Cherries 🍒🍒', I still get: Body: Cherries ðð

I also tried converting the blob with a simple String.valueOf(reqBlob), but same result.

I was hopeful about the EncodingUtil.urlDecode, which decodes a string and lets you specify the format, but it only takes a string, and the string I have access to has already corrupted the emoji. I tried this, with the same result as before.

    RestRequest req = RestContext.request;
    Blob reqBlob = req.requestBody;
    String reqString = EncodingUtil.urlDecode(String.valueOf(reqBlob), 'UTF-8');
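A string that has "already corrupted the emoji" is not necessarily beyond repair. The following is a minimal Python sketch, not Apex, under the assumption that the corruption came from UTF-8 bytes being decoded as Latin-1, which maps every byte to exactly one character and so loses nothing. The helper name `repair_latin1_mojibake` is hypothetical, and whether an equivalent round trip is available or reliable in Apex is not something the thread establishes.

```python
def repair_latin1_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were decoded as Latin-1 (hypothetical helper).

    Returns the input unchanged if it does not look like that kind of damage.
    """
    try:
        # Turn each character back into its original byte, then decode properly.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# 'Cherries' followed by two cherries emoji (UTF-8 bytes F0 9F 8D 92),
# as they would look after a Latin-1 decode.
corrupted = "Cherries " + "\u00f0\u009f\u008d\u0092" * 2
repaired = repair_latin1_mojibake(corrupted)
print(repaired)
```

The `try`/`except` keeps the helper safe to call on text that was never damaged, such as a message that already contains real emoji.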

QUESTION: is there a way to grab the incoming rest request without using RestContext?

ANOTHER QUESTION: I notice that the HTTP request starts with ApiVersion=2010-04-01. Could it be calling an older version of Salesforce that doesn't support UTF-8?

Best Answer

I've been able to reproduce this in a test environment using Workbench. It appears that Salesforce by default interprets text in the body of a request with Content-Type: application/x-www-form-urlencoded as either ISO Latin-1 or UTF-16, which is why that specific piece of text yields an eth character (Unicode/Latin-1 F0).
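To illustrate the effect of the charset assumption, here is a small Python sketch (not Apex, and not Salesforce's actual parsing code) that parses a shortened copy of the form body from the question twice, once as UTF-8 and once as Latin-1. Only the `encoding` argument differs between the two calls.

```python
from urllib.parse import parse_qs

# Shortened version of the form body from the question.
raw_body = "From=%2B14151234567&Body=%F0%9F%8D%8E&NumMedia=0"

# Same percent-encoded bytes, two different charset assumptions.
as_utf8 = parse_qs(raw_body, encoding="utf-8")
as_latin1 = parse_qs(raw_body, encoding="latin-1")

print(repr(as_utf8["Body"][0]))    # one character: the emoji
print(repr(as_latin1["Body"][0]))  # four characters, starting with eth

# ASCII-only parameters are unaffected by the choice of charset.
print(as_utf8["From"][0] == as_latin1["From"][0])
```

ASCII-only parameters such as From come out identical either way, which is consistent with only the emoji and CJK text being affected.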

If you can get Twilio to send the header Content-Type: application/x-www-form-urlencoded; charset=utf-8, I've confirmed that this fixes the issue.
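For anyone who wants to reproduce the test with their own client, this is a rough Python sketch of a request carrying that header. The endpoint URL and the `build_sms_post` helper are placeholders, not part of Twilio or Salesforce, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute your own Apex REST URL.
ENDPOINT = "https://example.invalid/services/apexrest/ReceiveSMS"


def build_sms_post(from_number: str, body: str) -> Request:
    """Build (but do not send) a form POST that declares its charset."""
    payload = urlencode(
        {"From": from_number, "Body": body}, encoding="utf-8"
    ).encode("ascii")
    return Request(
        ENDPOINT,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded; charset=utf-8"
        },
    )


req = build_sms_post("+14151234567", "\U0001F34E")
print(req.data)
print(req.get_header("Content-type"))
```

Passing `req` to `urllib.request.urlopen` would send it. Note that `urllib` stores header names capitalized as `Content-type`, which is why the lookup uses that spelling.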

Unfortunately, it looks like Apex grabs the request body and deserializes it to params before we actually ask for params, and the original binary request content is no longer available by the time it gets to us. Further, setting the Content-Type header on the RestRequest object once it gets to Salesforce has no effect.
