[Salesforce] Loading 1 million records using Bulk API: exceeding batch limit

Whenever records are inserted into Contacts (usually in batches of 1,000), we have a batch job that looks up the related customer account in SFDC and populates the Contact.AccountId field plus a couple of other fields. This has worked fine so far.
Now we have a scenario where 1 million Contact records (approximately 5,000 batches) must be inserted into SFDC using the Bulk API, and even with the flex queue enabled, I am running out of available batches.

Trigger:

trigger conTrigger on Contact (after insert) {
    ContactHandler.managecontacts(Trigger.new);
}

ContactHandler.class

public static void managecontacts(List<Contact> acList) {
    // Enqueues a new batch job for every trigger invocation,
    // i.e. for every chunk of up to 200 records.
    Database.executeBatch(new UpdateContactsBatch(acList));
}

UpdateContactsBatch.class

public class UpdateContactsBatch implements Database.Batchable<sObject>, Database.Stateful {

    List<Contact> acList = new List<Contact>();

    public UpdateContactsBatch(List<Contact> acList) {
        this.acList = acList;
    }

    public Iterable<sObject> start(Database.BatchableContext BC) {
        return [SELECT Id, AccountId FROM Contact WHERE Id IN :acList];
    }

    public void execute(Database.BatchableContext BC, List<Contact> scope) {
        // Delegates the account matching and field updates to the handler.
        ContactTriggerHandler.managecontacts_execute(scope);
    }

    public void finish(Database.BatchableContext BC) { }
}

One option I have is to turn off the trigger during this mass load and run the batch manually from the anonymous window (sketched below) after all contacts are inserted. (I tested this scenario and it works fine in the PreProd environment.) In Production, though, this requires two deployments (one to deactivate the trigger and one to reactivate it), and I'd rather not use a custom setting to control trigger behaviour.
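For reference, a minimal sketch of what that manual run could look like, assuming a hypothetical no-argument variant of the batch (called UpdateContactsFullBatch here) that queries every Contact still missing an AccountId instead of taking an explicit Id list. The AccountId = null filter is an assumption about how unprocessed contacts can be identified:

// Hypothetical one-off batch for the post-load run: instead of taking
// an Id list from the trigger, it walks every Contact that still has
// no Account assigned (an assumed definition of "unprocessed").
public class UpdateContactsFullBatch implements Database.Batchable<sObject> {

    public Database.QueryLocator start(Database.BatchableContext BC) {
        // A QueryLocator can serve up to 50 million records, so a single
        // job comfortably covers the 1 million-row load.
        return Database.getQueryLocator(
            [SELECT Id, AccountId FROM Contact WHERE AccountId = null]
        );
    }

    public void execute(Database.BatchableContext BC, List<Contact> scope) {
        ContactTriggerHandler.managecontacts_execute(scope);
    }

    public void finish(Database.BatchableContext BC) { }
}

After the Bulk API load finishes, it would be launched once from the anonymous window:

// 2,000 is the maximum scope size per execute() chunk.
Database.executeBatch(new UpdateContactsFullBatch(), 2000);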

But I would like to know if there is any other option to handle this scenario.

Problem: I changed the batch size to 5000 in my Data Loader, which is set to "Use Bulk API", but my trigger still seems to receive 200 records at a time, and for each chunk of records in Trigger.new a batch job gets queued.


Best Answer

So I think the problem is that the Bulk API is running multiple batches asynchronously, and each one is kicking off its own Batch Apex jobs.

It looks like your batch process doesn't actually care which Ids you pass in and instead could just process everything?

If this is the case and you don't need real-time (or close to real-time) updates, it might be a better idea to just schedule this to run.

Otherwise, if you need it to happen in near real time, you could use batch "chaining" to have the batch keep running as long as there are records to process.

The basic steps would look something like this:

  1. In the finish method, add a check to see if there are more records to process. If so, kick off another batch.
  2. In your trigger, have it first check whether the batch is already running. If so, you don't need to do anything; otherwise, have it kick off the batch. (A sketch of both steps follows this list.)
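
A rough sketch of both steps, assuming a hypothetical self-requeuing batch (ChainedContactBatch) and again using AccountId = null as the stand-in test for "still unprocessed":

public class ChainedContactBatch implements Database.Batchable<sObject> {

    public Database.QueryLocator start(Database.BatchableContext BC) {
        // Grab whatever is currently unassigned; rows that arrive while
        // this run is in flight get picked up by the next link in the chain.
        return Database.getQueryLocator(
            [SELECT Id, AccountId FROM Contact WHERE AccountId = null]
        );
    }

    public void execute(Database.BatchableContext BC, List<Contact> scope) {
        ContactTriggerHandler.managecontacts_execute(scope);
    }

    public void finish(Database.BatchableContext BC) {
        // Step 1: if more records arrived mid-run, chain another batch.
        if ([SELECT COUNT() FROM Contact WHERE AccountId = null LIMIT 1] > 0) {
            Database.executeBatch(new ChainedContactBatch());
        }
    }
}

And the trigger handler would only start the chain when no job is already in flight:

public static void managecontacts(List<Contact> acList) {
    // Step 2: skip if a ChainedContactBatch is already queued or running.
    Integer active = [
        SELECT COUNT()
        FROM AsyncApexJob
        WHERE JobType = 'BatchApex'
          AND ApexClass.Name = 'ChainedContactBatch'
          AND Status IN ('Holding', 'Queued', 'Preparing', 'Processing')
    ];
    if (active == 0) {
        Database.executeBatch(new ChainedContactBatch());
    }
}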

Update

If you have access to the JobInfo concurrencyMode param, you can set it to "Serial" and it will prevent multiple batches from running in parallel. I think this might solve your problem. If that alone doesn't work, try also reducing the batch size. Obviously, this will make the import take much (much) longer.
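
For completeness, when opening the job against the Bulk API directly (rather than through the Data Loader), the concurrency mode is set in the jobInfo request body; a minimal sketch:

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <operation>insert</operation>
    <object>Contact</object>
    <concurrencyMode>Serial</concurrencyMode>
    <contentType>CSV</contentType>
</jobInfo>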

In the Data Loader, the same thing can be turned on by checking "Enable serial mode for Bulk API" in its settings.
