[Salesforce] How to query archived Tasks using the Bulk API

I'm trying to retrieve all Tasks in our Salesforce instance, including those that were archived more than 12 months ago. Since we have over 3 million records, I decided to try the Bulk API. It seemed promising until I ran into this error:

 MALFORMED_QUERY: ALL ROWS not allowed in this context

Does the Bulk API not support querying for archived Tasks? I believe you simply add 'ALL ROWS' to the end of your SOQL query to retrieve deleted and archived records. Is there another way to do this using the Bulk API? If not, this seems like a huge limitation, since archived Tasks seem like a prime candidate for the Bulk API.

Any help is appreciated,
Andrew

Here's an example of one of my queries:

Select Id from Task Where CreatedDate >= 2013-01-01T00:00:00z and CreatedDate < 2013-03-01T00:00:00z ORDER BY Id ASC ALL ROWS

Best Answer

Over Twitter I was able to get an answer to my question. In short, the Bulk API really isn't the right tool for this sort of problem. It is most helpful for mass inserting, updating, and deleting, and it isn't much more efficient than the regular API when querying. It would be great if the Bulk API could break a query with many results into chunks of 2000 rows, so that the job could download all of those chunks in parallel. In theory this is possible, but you as the developer need to work out all of the chunks (queries) yourself and then add each one to the job. It would be amazing if the Bulk API could 'chunkify' the query for you. As of today, it cannot.
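The manual chunking described above could be sketched roughly as follows. This is only an illustration: the `chunk_queries` helper and the 7-day window are assumptions, not anything the Bulk API provides, and the window size would need tuning to the real record density. It splits the date range from the example query into half-open `CreatedDate` windows and builds one SOQL string per window. The generated queries leave out ALL ROWS, since the Bulk API rejects it.

```python
from datetime import datetime, timedelta

SOQL_DATETIME = "%Y-%m-%dT%H:%M:%SZ"  # SOQL datetime literal, UTC


def chunk_queries(start, end, step_days=7):
    """Yield one SOQL string per [lo, hi) CreatedDate window between start and end.

    Hypothetical helper: the window size is a guess, not a Salesforce setting.
    """
    step = timedelta(days=step_days)
    lo = start
    while lo < end:
        hi = min(lo + step, end)  # clamp the last window to the overall end
        yield (
            "SELECT Id FROM Task "
            f"WHERE CreatedDate >= {lo.strftime(SOQL_DATETIME)} "
            f"AND CreatedDate < {hi.strftime(SOQL_DATETIME)} "
            "ORDER BY Id ASC"
        )
        lo = hi


# Same range as the example query in the question.
queries = list(chunk_queries(datetime(2013, 1, 1), datetime(2013, 3, 1)))
```

Each resulting string would then be submitted as its own batch or query. Because the windows are half-open, a record created exactly on a boundary falls into only one chunk.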

I found it simpler to just use the normal queryAll() calls through the Web Services API and run 10 queries at a time (using Spring Batch). The only downside is that you can only have 10 query locators open at a time.
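I did this with Spring Batch, but the idea of capping concurrency at 10 can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `run_all` is a hypothetical helper, and `query_all` stands in for whatever client function actually issues the queryAll() call and returns a list of rows for one query.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_OPEN_QUERIES = 10  # the query locator limit mentioned above


def run_all(queries, query_all, max_workers=MAX_OPEN_QUERIES):
    """Run every query through query_all, never more than max_workers at once.

    query_all is a caller-supplied function (hypothetical here) that takes one
    SOQL string and returns a list of rows. Rows come back in query order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even though the calls overlap.
        results = pool.map(query_all, queries)
        return [row for rows in results for row in rows]
```

The thread pool size is what enforces the cap: an eleventh query only starts once one of the first ten has finished and released its locator.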

So I don't have a solution for doing this with the Bulk API. In fact, if you want archived Tasks, it isn't possible, because the Bulk API does not support the ALL ROWS keywords.

Andrew
