[Salesforce] Dealing with auto-batching of triggers

I've recently been trying to come to terms with the oddities of the auto-"batching" of triggers. What I'm referring to is how, when more than 200 records are involved in a trigger execution, Salesforce automatically "batches" the records into groups of 200. For example, if I run a trigger on 500 Account records, the trigger logic runs three times in succession: on the first 200 records, then the next 200, then the last 100, and governor limits are never reset between those runs.

This doesn't pose much of a problem in a small-scale operation, but when working in a Salesforce instance with thousands of users, large quantities of data, and extensive automation, these issues are bound to come up.

My first question: what is the best practice for handling a trigger firstRun variable? Since workflows can often cause a trigger to run twice, it's important to pass through certain pieces of logic only the first time. As for order of execution, I've determined that the first 200 records move through the trigger, then workflow, then the trigger again; then the next "batch" is processed, and so on. Because of this predictable order, the best approach I've come up with is the following (I've simplified the overarching structure to demonstrate the logic):

public class MyObjectServices {
    public static Set<Id> recordsProcessed = new Set<Id>();

    public static void myTriggerMethod(Map<Id, MyObject__c> newMap){
        //The number of records processed in all "batches" to this point
        Integer sizeBefore = recordsProcessed.size();

        //Add the Ids of all records being processed in this "batch"; the
        //Set silently ignores any Ids that have already been processed
        recordsProcessed.addAll(newMap.keySet());

        //If the Set grew, then at least one record in the current
        //"batch" has not been processed before
        if (recordsProcessed.size() != sizeBefore){
            //Trigger logic
        }
    }
}

However, this 1) doesn't work for before insert, since the records don't have Ids yet, and 2) feels legitimately hacky. Are there any better ways to do this?

My second question: Is there any way to get the total number of records being processed in your trigger? Trigger.size will only give you the number being processed in that particular batch. We could use the recordsProcessed variable from above, but that will only give you the total number during the last "batch" of the trigger. And there's no way of knowing which batch that will be 100% of the time! Yes, if Trigger.size != 200, you know it's the last batch, but what if you're dealing with an exact multiple of 200?

My third question: finally, how do you handle limits? Suppose you update 10,000 records of one type. That means 50 separate "batches" within the same execution! Even if you follow bulkification practices perfectly, that leaves you only 2 queries per batch (of the 100 total) and 3 DML statements per batch (of the 150 total) before you hit governor limits. This is straight-up unworkable. All the ideas I can come up with involve incredibly over-engineered methods of separating records into groups of 200 or fewer and using Queueables to process them. But that makes it very difficult to do any validation, and any implementation would be complicated enough that a dedicated technical architect would need to furiously review all incoming code.

So, do any of you have any insight how to face these problems in doing large-scale trigger automation?

EDIT: Adding an example to clear up some misconceptions about how trigger "batching" affects limits and static variables. Trigger:

trigger FooTrigger on Foo__c (before update) {

    System.debug('Trigger before handler call');
    FooTriggerHandler fth = new FooTriggerHandler(
        trigger.oldMap,
        trigger.newMap,
        trigger.old,
        trigger.new,
        trigger.isInsert,
        trigger.isUpdate,
        trigger.isDelete,
        trigger.isUndelete,
        trigger.isBefore,
        trigger.isAfter,
        trigger.size
    );
    System.debug('Trigger after handler call');
}

Apex class:

public class FooTriggerHandler {

    public static Set<Id> allRecordIds = new Set<Id>();
    public static Boolean firstRun = true;
    public static Integer count = 0;

    public FooTriggerHandler(Map<Id,Foo__c> oldMap, Map<Id,Foo__c> newMap, List<Foo__c> triggerOld, List<Foo__c> triggerNew, 
    Boolean isInsert, Boolean isUpdate, Boolean isDelete, Boolean isUndelete, Boolean isBefore, Boolean isAfter, Integer size){
        System.debug('Trigger.size: ' + size);
        System.debug('Entering dispatcher constructor');
        System.debug('DML before: ' + Limits.getDmlStatements());
        System.debug('DML rows before: ' + Limits.getDmlRows());
        System.debug('Queries before: ' + Limits.getQueries());
        System.debug('Query rows before: ' + Limits.getQueryRows());
        System.debug('Firstrun: ' + firstRun);
        firstRun = false;
        System.debug('Record Ids before: ' + allRecordIds.size());
        List<User> someUsers = [SELECT Id FROM User LIMIT 2];
        for (Foo__c f : triggerNew){
           allRecordIds.add(f.Id);
        }
        System.debug('Record Ids after: ' + allRecordIds.size());
        Contact contact1 = new Contact(LastName = 'TestContact' + count++);
        Contact contact2 = new Contact(LastName = 'TestContact' + count++);
        insert new List<Contact>{contact1, contact2};
        System.debug('DML after: ' + Limits.getDmlStatements());
        System.debug('DML rows after: ' + Limits.getDmlRows());
        System.debug('Queries after: ' + Limits.getQueries());
        System.debug('Query rows after: ' + Limits.getQueryRows());
    }

}

Meanwhile, I have a workflow rule running an update every time I update a Foo__c (in order to demonstrate the trigger re-running).

I run the following anonymous code: update [SELECT Id FROM Foo__c LIMIT 300];

And my debug logs:

//First trigger batch, first run
09:18:00:143 USER_DEBUG [3]|DEBUG|Trigger before handler call
09:18:00:144 USER_DEBUG [9]|DEBUG|Trigger.size: 200
09:18:00:144 USER_DEBUG [10]|DEBUG|Entering dispatcher constructor
09:18:00:144 USER_DEBUG [11]|DEBUG|DML before: 1
09:18:00:144 USER_DEBUG [12]|DEBUG|DML rows before: 300
09:18:00:144 USER_DEBUG [13]|DEBUG|Queries before: 1
09:18:00:144 USER_DEBUG [14]|DEBUG|Query rows before: 300
09:18:00:144 USER_DEBUG [15]|DEBUG|Firstrun: true
09:18:00:144 USER_DEBUG [17]|DEBUG|Record Ids before: 0
09:18:00:391 USER_DEBUG [22]|DEBUG|Record Ids after: 200
09:18:01:161 USER_DEBUG [26]|DEBUG|DML after: 2
09:18:01:161 USER_DEBUG [27]|DEBUG|DML rows after: 302
09:18:01:161 USER_DEBUG [28]|DEBUG|Queries after: 2
09:18:01:161 USER_DEBUG [29]|DEBUG|Query rows after: 302
09:18:01:161 USER_DEBUG [17]|DEBUG|Trigger after handler call
//First trigger batch, after workflow
09:18:01:798 USER_DEBUG [3]|DEBUG|Trigger before handler call
09:18:01:799 USER_DEBUG [9]|DEBUG|Trigger.size: 200
09:18:01:799 USER_DEBUG [10]|DEBUG|Entering dispatcher constructor
09:18:01:799 USER_DEBUG [11]|DEBUG|DML before: 2
09:18:01:799 USER_DEBUG [12]|DEBUG|DML rows before: 302
09:18:01:799 USER_DEBUG [13]|DEBUG|Queries before: 2
09:18:01:799 USER_DEBUG [14]|DEBUG|Query rows before: 302
09:18:01:799 USER_DEBUG [15]|DEBUG|Firstrun: false
09:18:01:799 USER_DEBUG [17]|DEBUG|Record Ids before: 200
09:18:01:890 USER_DEBUG [22]|DEBUG|Record Ids after: 200
09:18:01:952 USER_DEBUG [26]|DEBUG|DML after: 3
09:18:01:952 USER_DEBUG [27]|DEBUG|DML rows after: 304
09:18:01:952 USER_DEBUG [28]|DEBUG|Queries after: 3
09:18:01:952 USER_DEBUG [29]|DEBUG|Query rows after: 304
09:18:01:952 USER_DEBUG [17]|DEBUG|Trigger after handler call
//Second trigger batch, first run
09:18:02:623 USER_DEBUG [3]|DEBUG|Trigger before handler call
09:18:02:623 USER_DEBUG [9]|DEBUG|Trigger.size: 100
09:18:02:623 USER_DEBUG [10]|DEBUG|Entering dispatcher constructor
09:18:02:624 USER_DEBUG [11]|DEBUG|DML before: 3
09:18:02:624 USER_DEBUG [12]|DEBUG|DML rows before: 304
09:18:02:624 USER_DEBUG [13]|DEBUG|Queries before: 3
09:18:02:624 USER_DEBUG [14]|DEBUG|Query rows before: 304
09:18:02:624 USER_DEBUG [15]|DEBUG|Firstrun: false
09:18:02:624 USER_DEBUG [17]|DEBUG|Record Ids before: 200
09:18:02:624 USER_DEBUG [22]|DEBUG|Record Ids after: 300
09:18:02:624 USER_DEBUG [26]|DEBUG|DML after: 4
09:18:02:624 USER_DEBUG [27]|DEBUG|DML rows after: 306
09:18:02:624 USER_DEBUG [28]|DEBUG|Queries after: 4
09:18:02:624 USER_DEBUG [29]|DEBUG|Query rows after: 306
09:18:02:624 USER_DEBUG [17]|DEBUG|Trigger after handler call
//Second trigger batch, after workflow
09:18:03:180 USER_DEBUG [3]|DEBUG|Trigger before handler call
09:18:03:180 USER_DEBUG [9]|DEBUG|Trigger.size: 100
09:18:03:180 USER_DEBUG [10]|DEBUG|Entering dispatcher constructor
09:18:03:180 USER_DEBUG [11]|DEBUG|DML before: 4
09:18:03:180 USER_DEBUG [12]|DEBUG|DML rows before: 306
09:18:03:180 USER_DEBUG [13]|DEBUG|Queries before: 4
09:18:03:180 USER_DEBUG [14]|DEBUG|Query rows before: 306
09:18:03:180 USER_DEBUG [15]|DEBUG|Firstrun: false
09:18:03:180 USER_DEBUG [17]|DEBUG|Record Ids before: 300
09:18:03:511 USER_DEBUG [22]|DEBUG|Record Ids after: 300
09:18:03:603 USER_DEBUG [26]|DEBUG|DML after: 5
09:18:03:604 USER_DEBUG [27]|DEBUG|DML rows after: 308
09:18:03:604 USER_DEBUG [28]|DEBUG|Queries after: 5
09:18:03:604 USER_DEBUG [29]|DEBUG|Query rows after: 308
09:18:03:604 USER_DEBUG [17]|DEBUG|Trigger after handler call

Observations:

  1. Using a static firstRun variable only affects the first batch of 200 records; the variable remains false while processing all subsequent batches. Only use this when you legitimately want code to run exactly once per transaction, not once per batch.
  2. Limits are not reset between batches, nor between re-runs due to workflow. You could technically query the same 500 records each time, and they would count repeatedly toward your query rows, potentially causing you to hit limits without actually working in bulk.
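
Since limits accumulate this way, one modest mitigation is to instrument your handler with the same Limits methods used above, so each batch can report how much headroom it has left. A minimal, illustrative sketch:

Integer queriesLeft = Limits.getLimitQueries() - Limits.getQueries();
Integer dmlLeft = Limits.getLimitDmlStatements() - Limits.getDmlStatements();
//Usage carries over from every previous batch and workflow re-run,
//so this headroom shrinks as the transaction progresses
System.debug('SOQL headroom: ' + queriesLeft + ', DML headroom: ' + dmlLeft);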

Best Answer

However, this 1) doesn't work for before insert, since the records don't have Ids yet, and 2) feels legitimately hacky. Are there any better ways to do this?

Your approach (Set<Id> recordsProcessed) is a good one. You do not have to prevent before insert trigger recursion, so it's really not an issue that records do not have an Id yet in that case. I don't find it to be "hacky", and this approach is more robust than a simple Boolean flag (which will only operate correctly on the first batch).
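
For illustration, here's a per-record variant of that guard (class and method names are hypothetical). Rather than gating the whole batch on whether the Set grew, it filters down to just the unprocessed records, which also handles a batch that mixes first-time and re-run records:

public class MyObjectTriggerHandler {
    private static Set<Id> processedIds = new Set<Id>();

    public static void onAfterUpdate(Map<Id, MyObject__c> newMap){
        //Set.add() returns true only when the Id was not already present,
        //so this keeps just the records this transaction hasn't seen yet
        Map<Id, MyObject__c> firstTimers = new Map<Id, MyObject__c>();
        for (Id recordId : newMap.keySet()){
            if (processedIds.add(recordId)){
                firstTimers.put(recordId, newMap.get(recordId));
            }
        }
        if (firstTimers.isEmpty()){
            return;
        }
        //Trigger logic, against firstTimers only
    }
}

And before insert needs no guard at all: the same records cannot fire before insert twice in one transaction, so the missing Ids are moot there.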

Is there any way to get the total number of records being processed in your trigger?

I don't believe so, unless you set it from the calling context. For instance, you could do something like:

public with sharing class LeadService
{
    public static Integer recordsToProcess = 0;

    // service methods
}

/*VVV calling context VVV*/
List<Lead> toUpdate; // = <some_list>
LeadService.recordsToProcess = toUpdate.size();
update toUpdate;

However, you might have further updates to leads with different values. I would avoid this strategy and find other ways around this limitation. It shouldn't matter which batch comes last. If you want to make sure some logic happens after your trigger logic completes, consider asynchronous processing.

how do you handle limits?

Two common strategies for easing limits usage are to:

  • Use asynchronous processing for heavy lifting
  • Use lazy loading to re-use common data

With the former strategy, you can trade queries/dml/cpu for async calls. It can be more difficult to prevent trigger recursion, but you should be able to work around it by careful application of criteria (filters).
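
To make that hand-off concrete, here's a rough sketch (the object, class, and criteria are placeholders). The Queueable runs in its own transaction with a fresh set of governor limits. One caveat I'd flag: a synchronous transaction can enqueue at most 50 Queueable jobs, so enqueueing once per batch in the 10,000-record scenario sits right at that ceiling.

public class FooHeavyLiftingJob implements Queueable {
    private Set<Id> recordIds;

    public FooHeavyLiftingJob(Set<Id> recordIds){
        this.recordIds = recordIds;
    }

    public void execute(QueueableContext context){
        //Runs later in its own transaction; re-query by Id rather than
        //serializing whole sObjects into the job
        List<Foo__c> records = [SELECT Id FROM Foo__c WHERE Id IN :recordIds];
        //Heavy queries/DML go here, counted against this job's limits
    }
}

//In the trigger handler: filter, then hand off
Set<Id> needsWork = new Set<Id>();
for (Foo__c f : triggerNew){
    //Collect only records meeting your processing criteria; the same
    //criteria are what keep workflow re-runs from re-enqueueing them
    needsWork.add(f.Id);
}
if (!needsWork.isEmpty()){
    System.enqueueJob(new FooHeavyLiftingJob(needsWork));
}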

The latter can help when you have configuration data you will need in all your batches. It would look something like:

public static List<ConfigObject> configData
{
    get
    {
        if (configData == null)
            configData = [/*query*/];
        return configData;
    }
    private set;
}
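
Because statics survive across all of a trigger's internal batches (as the debug logs above demonstrate), every batch shares that single query. A hypothetical call site, assuming the property lives on a class named ConfigService:

//The first access runs the query; every later batch in the same
//transaction reuses the cached list at no additional query cost
for (ConfigObject config : ConfigService.configData){
    //Apply configuration to the current batch
}

In the 10,000-record example, that turns 50 potential queries into 1.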