[SalesForce] CSV File Upload into SFDC

I am looking for an approach to perform the below:

[1] Read a CSV file from the local drive and insert the contents into SFDC (via a typical Browse button)
[2] Once the read is complete, upload the CSV into Content
[3] Locally create an error log containing the failed rows and error messages, or write the error against each failed row back into the same CSV file

The design I am thinking of is a VF page with JavaScript to locate and read the rows, while the controller does the inserts. Is this design viable? If not, what would be the best approach?

Best Answer

Good question. I personally would not try to handle this in the client; it's too fragile an environment. Better to upload the file and handle it on the server side, or consider adjusting your requirements / user expectations to fit the Salesforce Data Loader tools (see my summary below).

Client Side

  • Client Library and Communications. You can likely find a number of CSV parsing libraries online, then send the parsed rows back to the server using JavaScript Remoting (see the sketch after this list). This avoids consuming the daily API limit or attempting to handle this via apex:actionFunction (which would run you into view state issues).
  • Scalability and Robustness. While this has the advantage that you can handle larger files, it will likely be slower and depends on the user keeping the page open during processing. Keep in mind you still need to be able to upload the CSV file to store it, as per your requirements, so...
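
If you do go down the client-side route, the server piece is just a remote action that accepts the parsed rows and reports per-row failures. Here is a minimal sketch, assuming a hypothetical CsvRemotingController and an Account target with the name in the first column; the page-side JavaScript would read the file (e.g. FileReader plus a CSV library) and pass the lines to this method in manageable chunks via Visualforce.remoting.Manager.invokeAction.

    // Hypothetical controller exposing a remote action for parsed CSV rows.
    global with sharing class CsvRemotingController {

        // Receives a chunk of CSV lines and returns a per-row error message
        // ('' for success) so the page can build the error log from step [3].
        @RemoteAction
        global static List<String> insertRows(List<String> csvLines) {
            List<Account> records = new List<Account>();
            for (String line : csvLines) {
                // Naive split for illustration; a proper CSV parser should handle quoted commas.
                records.add(new Account(Name = line.split(',')[0]));
            }
            List<String> errors = new List<String>();
            // allOrNone = false so one bad row does not roll back the whole chunk.
            for (Database.SaveResult sr : Database.insert(records, false)) {
                errors.add(sr.isSuccess() ? '' : sr.getErrors()[0].getMessage());
            }
            return errors;
        }
    }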

Server Side

The following should get you started on a number of things you need to address here.

  • Uploading. You can indeed use many of the file upload examples available, such as the one jkraybill references here. To avoid view state governor issues, make sure you pay attention to clearing the Body of the Document once you have read it, or, as I prefer, simply use the 'transient' modifier on the member variable in your controller (see the controller sketch after this list).
  • Parsing. You can also find here a good CSV parser library written in Apex by Marty Y. Chang. When parsing a CSV with any reasonable number of columns and rows, you will likely hit the Apex statement governor before anything else (upload or heap size limits).
  • Scalability. To get the most out of the platform, you might want to consider parsing your CSV file using Batch Apex; a minimal batch sketch follows below. There is a good example from Agustina García of how to combine Marty's library with Batch Apex to accomplish this here. With this approach you can see, depending on your data, around 5000 rows of 100 columns being processed for an approximately 2MB file. If you implement this, you might also be interested in this recent question and answer on monitoring Batch Apex jobs from your UI.
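
To make the Uploading and Parsing points concrete, here is a minimal sketch of a controller behind an apex:inputFile, assuming a hypothetical CsvUploadController and an Account target with the name in the first column. The naive split is illustration only and should be replaced with Marty Chang's CSVReader for real data; swap Document for ContentVersion if you are using Content proper.

    // Hypothetical controller backing <apex:inputFile value="{!csvBody}" filename="{!csvFileName}"/>
    public with sharing class CsvUploadController {

        // 'transient' keeps the uploaded file out of the view state, avoiding the
        // view state size governor; the value is still available during the postback.
        public transient Blob csvBody { get; set; }
        public String csvFileName { get; set; }

        public PageReference processCsv() {
            String contents = csvBody.toString();

            // Naive parsing for illustration only; use a proper CSV parser to
            // handle quoted values and embedded commas or line breaks.
            List<String> lines = contents.split('\n');

            List<Account> records = new List<Account>();
            for (Integer i = 1; i < lines.size(); i++) { // skip the header row
                records.add(new Account(Name = lines[i].split(',')[0]));
            }

            // allOrNone = false: valid rows commit, failed rows are reported.
            Database.SaveResult[] results = Database.insert(records, false);

            // Build the error log of failed rows (requirement [3]).
            String errorLog = 'Row,Error\n';
            for (Integer i = 0; i < results.size(); i++) {
                if (!results[i].isSuccess()) {
                    errorLog += (i + 2) + ',' + results[i].getErrors()[0].getMessage() + '\n';
                }
            }

            // Store the original CSV and the error log (requirement [2]).
            insert new List<Document>{
                new Document(Name = csvFileName, Body = Blob.valueOf(contents),
                             FolderId = UserInfo.getUserId(), ContentType = 'text/csv'),
                new Document(Name = 'Error Log.csv', Body = Blob.valueOf(errorLog),
                             FolderId = UserInfo.getUserId(), ContentType = 'text/csv')
            };
            return null;
        }
    }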
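
And a minimal sketch of the Batch Apex idea, iterating the data rows (header already stripped) with Database.Batchable over strings; the class name is hypothetical and, again, the naive split stands in for a real CSV parser. You would start it with something like Database.executeBatch(new CsvImportBatch(dataLines), 200).

    // Hypothetical batch job processing CSV lines in chunks; Database.Stateful
    // preserves the accumulated error log across execute() invocations.
    public class CsvImportBatch implements Database.Batchable<String>, Database.Stateful {

        private final List<String> lines; // data rows, header already removed
        private String errorLog = 'Row,Error\n';

        public CsvImportBatch(List<String> lines) {
            this.lines = lines;
        }

        public Iterable<String> start(Database.BatchableContext bc) {
            return lines; // each line becomes one record in the batch scope
        }

        public void execute(Database.BatchableContext bc, List<String> scope) {
            List<Account> records = new List<Account>();
            for (String line : scope) {
                records.add(new Account(Name = line.split(',')[0]));
            }
            Database.SaveResult[] results = Database.insert(records, false);
            for (Integer i = 0; i < results.size(); i++) {
                if (!results[i].isSuccess()) {
                    errorLog += scope[i] + ',' + results[i].getErrors()[0].getMessage() + '\n';
                }
            }
        }

        public void finish(Database.BatchableContext bc) {
            // Persist the error log so the user can review it once the job completes.
            insert new Document(Name = 'CSV Import Errors.csv',
                                Body = Blob.valueOf(errorLog),
                                FolderId = UserInfo.getUserId(),
                                ContentType = 'text/csv');
        }
    }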

Summary and Salesforce Data Loaders

Parsing CSV files can be very expensive in Apex statements, which will limit the size and complexity of the files you can consume. So you should consider whether you really need to build this yourself versus using either of the Salesforce Data Loaders (there is a desktop and a web-based version). Unfortunately, neither of these will retain the uploaded CSV file, so if that is a strict requirement then rolling your own is the way to go.

Hope the above helps give you the pointers you need, enjoy!