Is the "Regex too complicated" error, thrown while attempting to upload a CSV via Apex into SFDC, an undocumented limit? What should be the optimal data volume for a CSV upload via Apex? I was trying 10,000 rows of 8 columns each.
[Salesforce] “Regex too complicated” error
Related Solutions
Good question. I personally would not try to handle this in the client; it's too fragile an environment. It's better to upload the file and handle it on the server side, or to adjust your requirements / user expectations to fit the Salesforce Data Loader tools — see my summary below.
Client Side
- Client Library and Communications. You can likely find a number of CSV parsing libraries online, then send the parsed results back to the server using JavaScript Remoting. This avoids hitting the daily API governor, and also avoids handling this via apex:actionFunction (which would run you into view state issues).
- Scalability and Robustness. This approach has the advantage that you can handle larger files, but it will likely be slower and depends on the user keeping the page open during the processing. Keep in mind you still need to be able to upload the CSV file to store it, as per your requirements, so...
Server Side
The following should get you started on a number of things you need to address here.
- Uploading. You can indeed use many of the file upload examples available, such as the one jkraybill references here. To avoid view state governor issues, make sure you pay attention to clearing the Body of the Document, or, as I prefer, simply use the transient modifier on the member variable in your controller.
- Parsing. You can also find here a good CSV parser library written in Apex by Marty Y. Chang. When actually parsing a CSV with any reasonable number of columns and rows, you will likely hit the Apex statement governor before anything else (upload or heap size limits).
- Scalability. To get the most out of the platform you might want to consider parsing your CSV file using Batch Apex. There is a good example from Agustina García here of how to combine Marty's library with Batch Apex to accomplish this. With this approach, depending on your data, you can see around 5,000 rows of 100 columns being processed, for an approximately 2 MB file. If you implement this you might also be interested in this recent question and answer on monitoring Batch Apex jobs from your UI.
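To see why a real parser library is worth using, note that a naive split(',') breaks on quoted fields that contain commas or escaped quotes. Below is a minimal, illustrative sketch of the core parsing loop for one CSV record — written in Java, whose String API closely mirrors Apex's. This is my own simplification for illustration, not the actual logic of Marty Chang's library:

```java
import java.util.ArrayList;
import java.util.List;

public class CsvFieldParser {
    // Walks one CSV record character by character, honoring double-quoted
    // fields and the "" escape sequence inside them. A plain split(',')
    // would incorrectly break "b,c" below into two fields.
    public static List<String> parseRecord(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"');       // "" is an escaped quote
                        i++;
                    } else {
                        inQuotes = false;        // closing quote
                    }
                } else {
                    field.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;                 // opening quote
            } else if (c == ',') {
                fields.add(field.toString());    // field delimiter
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());            // last field of the record
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseRecord("a,\"b,c\",d")); // three fields, not four
    }
}
```

Every character processed costs script statements in Apex, which is why large files push you toward Batch Apex as described above.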
Summary and Salesforce Data Loaders
Parsing CSV files can be very expensive in Apex statements, which will limit the size and complexity of the files you can consume. So consider whether you really need to build this yourself versus using one of the Salesforce Data Loaders (there is a desktop and a web-based version). Neither of these, however, will retain the uploaded CSV file, so if that is a strict requirement then rolling your own is the way to go.
Hope the above helps give you the pointers you need, enjoy!
if (attachmentProperty != null && attachmentProperty.Body != null) {
    String attachmentBodyString = attachmentProperty.Body.toString();
    // the string above is populated correctly with the attachment content

    List<String> modifiedFileLinesList = new List<String>();
    List<String> fileLinesList = new List<String>();

    while (attachmentBodyString.contains('\n')) {
        Integer newlineIndex = attachmentBodyString.indexOf('\n');
        fileLinesList.add(attachmentBodyString.substring(0, newlineIndex));
        // advance past the newline: the original substring(newlineIndex, length())
        // left the '\n' at position 0, so the loop never made progress
        attachmentBodyString = attachmentBodyString.substring(newlineIndex + 1);
    }
    if (attachmentBodyString != '') {
        fileLinesList.add(attachmentBodyString); // final line with no trailing newline
    }

    attachmentBodyString.split('\n');
    // the split above is what throws 'Regex too complicated' on large bodies,
    // because split() treats its argument as a regular expression
    // some logic here
}
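Since Apex's String.split() treats its argument as a regex, very large inputs can trigger the "Regex too complicated" error even for a trivial pattern. A regex-free alternative is to walk the string with indexOf/substring, which never touches the regex engine. Here is a sketch in Java; the indexOf, substring, and length methods used here behave the same way in Apex:

```java
import java.util.ArrayList;
import java.util.List;

public class LineSplitter {
    // Splits a body into lines on '\n' without invoking the regex engine,
    // avoiding the limit that split('\n') can hit on very large strings.
    public static List<String> splitLines(String body) {
        List<String> lines = new ArrayList<>();
        int start = 0;
        while (start < body.length()) {
            int nl = body.indexOf('\n', start);
            if (nl == -1) {
                lines.add(body.substring(start)); // last line, no trailing newline
                break;
            }
            lines.add(body.substring(start, nl));
            start = nl + 1;                       // skip past the delimiter
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(splitLines("a,b\nc,d\ne,f")); // three lines
    }
}
```

In Apex you would still want to do this inside Batch Apex for large files, since each iteration consumes script statements.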
Best Answer
Looks like you found the solution, but for other folks: I've seen this error occur in three different ways, and one of them is a pattern like
(\n|\r|.)
(it seems silly, but this is a valid workaround since Apex's Pattern.compile doesn't accept a DOTALL flag argument). In all the cases I've seen, I've had to refactor the regex to actually make it less complex, although "complex" is a misnomer in this example: it's the repeated alternation, not any conceptual complexity, that makes the engine work so hard.
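Apex's regex engine is Java's java.util.regex, so the behavior is easy to demonstrate in Java. By default '.' does not match line terminators, which is why people reach for (\n|\r|.); the inline (?s) flag turns on DOTALL mode and is a simpler equivalent that avoids the repeated alternation. Whether (?s) avoids the Apex limit in any particular case is worth testing yourself; this sketch only shows the matching behavior:

```java
public class DotallDemo {
    // The workaround pattern: matches any character, including \n and \r,
    // but forces the engine to track an alternation for every repetition.
    public static boolean alternationMatch(String s) {
        return s.matches("(\\n|\\r|.)*");
    }

    // The inline (?s) flag makes '.' match line terminators too, with no
    // alternation to backtrack through. (In plain Java you could also pass
    // Pattern.DOTALL to Pattern.compile; Apex's Pattern.compile takes no
    // flags argument, so the inline form is the one to try there.)
    public static boolean dotallMatch(String s) {
        return s.matches("(?s).*");
    }

    public static void main(String[] args) {
        String s = "line1\nline2\rline3";
        System.out.println(alternationMatch(s));  // true
        System.out.println(dotallMatch(s));       // true
        System.out.println(s.matches(".*"));      // false: '.' skips \n by default
    }
}
```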