[SalesForce] List of special characters that are not accepted in Content Note conversion

I am converting old Notes to new Notes. For some special characters I can use escapeHTML4() and it works like a charm, but for some characters, such as '£' and '~', it fails. I tried escapeUnicode() but it still fails.

So I plan to search for the notes that contain special characters that cannot be converted, and convert those manually.

Is there a list of the special characters that cannot be converted, so I can build my conversion plan around it?

Best Answer

I ran into this very issue, and since I have thousands of Content Notes to import, I wrote a simple Python script that I call 'notedog' to search for the disallowed characters and replace them with their allowed equivalents. The script takes one or more file names as input and modifies the files in place. As a disclaimer, I recommend backing up your files first.

Sample usage:

python notedog.py c:\migration\notes\23.txt
# (build a script with your entire list of files)

Here's the script:

import fileinput
import re


# With inplace=True, anything printed is written back into the file being read.
for line in fileinput.input(inplace=True):
    line = line.rstrip()
    # '&' must be replaced first so the entities added below are not escaped again.
    line = re.sub('&', '&amp;', line)
    line = re.sub("'", '&#39;', line)
    line = re.sub('"', '&quot;', line)
    line = re.sub('<', '&lt;', line)
    line = re.sub('>', '&gt;', line)

    print(line)

That's it!
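
If you would rather find the affected notes before changing anything, the following is a minimal sketch of that idea, not part of the original script. The helper names escape_note and non_ascii_chars are made up for illustration. It assumes that escaping &, <, >, " and ' is what the conversion needs, and that the decimal form &#39; is wanted for the apostrophe. It only reports non-ASCII characters such as '£', so an ASCII character like '~' would not be flagged, and it does not tell you which characters Salesforce actually rejects.

```python
import html
from collections import Counter


def escape_note(text):
    """Escape &, <, >, double quotes and apostrophes in a note body."""
    # html.escape with quote=True handles all five characters in a safe order.
    escaped = html.escape(text, quote=True)
    # html.escape writes the apostrophe as &#x27;; switch to the decimal form
    # (assumption: the decimal form is the one wanted here).
    return escaped.replace("&#x27;", "&#39;")


def non_ascii_chars(text):
    """Count each non-ASCII character so suspicious notes can be reviewed."""
    return Counter(ch for ch in text if ord(ch) > 127)


if __name__ == "__main__":
    sample = "Price < £5 & 'cheap'"
    print(escape_note(sample))
    print(non_ascii_chars(sample))
```

Running non_ascii_chars over every note file and collecting the counts gives a list of characters to test by hand against the conversion.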
