Rarely, batch jobs do fail to execute with no error message.
One of the modes that causes this is when a SOQL query cannot be executed within the time limits. For example, a particularly non-selective query running during the batch execute() method will cause the rest of the job to fail to execute. That's not an "abort"; the job genuinely fails to execute.
The really kooky details are:
- the job does not look as though System.abortJob took place,
- the remaining execute() calls are discarded,
- the finish() method does run.
If you are able to raise a case with Salesforce, they may be able to analyze it to the extent of:
"After processing N batches, one query in Class at line 46 has timed out as it is running for more than 2 minutes and the job was aborted there."
At one point, during API 34.0, they were also able to confirm that there is some erroneous behaviour around the UI display of the AsyncApexJob, which should show Failed but doesn't.
Perhaps, by exceeding the heap size with Database.Stateful, it has incurred a similar failure mode, where the whole job has to be trashed rather than continuing, because of memory limits?
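For illustration, a minimal sketch (hypothetical class and field names) of the kind of Database.Stateful batch where retained state grows on every execute() and can eventually approach the heap limit:

// Hypothetical example: state retained across execute() calls grows each chunk.
public class StatefulGrowthBatch implements Database.Batchable<SObject>, Database.Stateful {
    // Because of Database.Stateful, this list is carried between chunks
    // and keeps growing for the life of the job.
    private List<String> processedDetails = new List<String>();

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id, Name FROM Account');
    }

    public void execute(Database.BatchableContext bc, List<Account> scope) {
        for (Account a : scope) {
            // Each chunk appends more data; over a large job this can approach heap limits.
            processedDetails.add(a.Id + ':' + a.Name);
        }
    }

    public void finish(Database.BatchableContext bc) {
        System.debug('Accumulated entries: ' + processedDetails.size());
    }
}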
I found a partial answer here: Schedulable equivalent of Database.Stateful?
The context is keeping the attributes of the scheduled instance.
Let's say you have a log inside your Schedulable class that you want to keep all day long and then email when a specific hour is reached. Or store it in a custom setting (you would need an Id to get it back anyway). So you store an attribute.
Attributes are saved for the next run at the moment you schedule your instance (the object is serialized at that moment).
In the example the job is aborted before it is scheduled again (previously I was scheduling at the very beginning). It has been tested by users, so even though I haven't tested it myself I assume the code is right.
public void execute(SchedulableContext sc) {
    // Change the instance state first so the new value is captured when the
    // instance is serialized again at scheduling time.
    changeMyAttribute();
    // Abort the current scheduled job before scheduling the next occurrence.
    System.abortJob(sc.getTriggerId());
    schedule();
}
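For completeness, a minimal sketch of what a schedule() helper like the one called above could look like (the job name and cron expression are placeholders):

private void schedule() {
    // Re-scheduling this same instance serializes it again, so the attribute
    // changed above is what the next run will see.
    String cron = '0 0 * * * ?'; // placeholder: top of every hour
    System.schedule('MyStatefulSchedulable', cron, this);
}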
Now I am going to work on this: send the instance to the batch class and run some method on it in the batch finish() method.
Maybe in addition a nice strategy design pattern (obj 1: don't abort, obj 2: abort, obj 3: send email) ;)
I will give some feedback later.
OK, I did some tests and it works great for a while.
Strategy design pattern:
The scheduler knows 3 kinds of Action objects:
- ActionInit: run the batch, don't abort, reschedule
- ActionDefault: run the batch, abort, reschedule
- ActionFinish: send the email, abort
Action exposes 2 abstract methods, run() and finish(), for polymorphism, plus some "toolbox" methods used by the child classes.
I swap them in the execute() method depending on a counter (1 = ActionInit, >1 = ActionDefault, hour > maxHour = ActionFinish), as sketched below.
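A minimal sketch of that hierarchy, assuming simplified, hypothetical names (MyScheduler, MyBatch, reschedule(), runCount, maxHour, currentAction); the real batch launching and email logic are omitted:

// Hypothetical sketch; each class would live in its own file.
public abstract class Action {
    public abstract void run(MyScheduler scheduler);     // launch the batch or send the email
    public abstract void finish(MyScheduler scheduler, SchedulableContext sc); // abort and/or reschedule
    // ... shared "toolbox" methods for the child classes ...
}

public class ActionDefault extends Action {              // ActionInit and ActionFinish look similar
    public override void run(MyScheduler scheduler) {
        Database.executeBatch(new MyBatch(scheduler));    // run the batch
    }
    public override void finish(MyScheduler scheduler, SchedulableContext sc) {
        System.abortJob(sc.getTriggerId());               // abort the current trigger...
        scheduler.reschedule();                           // ...then reschedule
    }
}

// Inside MyScheduler.execute(): pick the action from the run count and the hour.
if (Datetime.now().hour() > maxHour) {
    currentAction = new ActionFinish();
} else if (runCount == 1) {
    currentAction = new ActionInit();
} else {
    currentAction = new ActionDefault();
}
currentAction.run(this);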
Where it fails:
It works well for a very quick run in the middle of the day, but fails if I try to run it through the whole day, without any error message despite try/catch blocks everywhere.
I have a log object that could be too big and throw an exception before my try/catch blocks are reached. I am going to clear this up.
Too bad it seems I can't do:
public override String toString() {
    // Intended to report how big the serialized instance is.
    return 'Serialized size: ' + JSON.serialize(this).length();
}
It triggers an internal error, even with .clone(). It would have let me check how big the serialized record is ...
OK, so here is the working version (for those who might be interested in this kind of problem).
It looks like it was the log object's size: it was growing to nearly 2 MB by the very last working batch. I guess this is a serialization limit.
I replaced it with a very compact one with just:
- 1 string for initialization information (parameters, URL for the callout, etc.)
- a list of successes and a list of errors, with 1 string for each run (depending on how the batch ended, calculated in finish()), as sketched below.
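A minimal sketch of such a compact log holder, with hypothetical names:

// Hypothetical compact log: a handful of strings instead of a large object graph,
// so the serialized scheduler instance stays well under the size limit.
public class CompactLog {
    public String initInfo;                               // parameters, callout URL, etc.
    public List<String> successes = new List<String>();   // one short string per successful run
    public List<String> errors = new List<String>();      // one short string per failed run
}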
And here, finally, is how it works:
- execute() in the scheduler chooses an object for the strategy design pattern, then launches the polymorphic run() method.
- That runs a batch.
- The batch's finish() method calls the scheduler's polymorphic finish() method, which adds the result, reschedules, sends the email, whatever (see the sketch below).
- The scheduler's trigger is aborted inside the polymorphic finish() method before the reschedule (or not!).
It works very well! :)
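For reference, a minimal sketch of the batch side of that flow, assuming hypothetical names (MyScheduler, onBatchFinished()); the scheduler instance is passed into the batch so its callback can run when the job completes:

// Hypothetical sketch: the batch keeps a reference to the scheduler instance
// and hands control back to it when the job completes.
public class MyBatch implements Database.Batchable<SObject> {
    private MyScheduler scheduler;

    public MyBatch(MyScheduler scheduler) {
        this.scheduler = scheduler;
    }

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account'); // placeholder query
    }

    public void execute(Database.BatchableContext bc, List<SObject> scope) {
        // ... process the records and append one success/error string to the compact log ...
    }

    public void finish(Database.BatchableContext bc) {
        // Delegate to the current Action's finish(): add the result, abort the
        // old trigger, reschedule, or send the summary email as appropriate.
        scheduler.onBatchFinished();
    }
}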
BTW, if you are interested in this kind of code, don't hesitate to comment or ask a question.
Best Answer
When you have a batch finish() launch a second batch via executeBatch(), you must use two test methods:
SFDC won't actually execute a batch in a test method until Test.stopTest() is reached, so you won't be able to assert the results of batch 2 from the same test method.
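A minimal sketch of how the two test methods might be split, assuming a hypothetical FirstBatch that chains into SecondBatch from its finish():

@isTest
private class ChainedBatchTest {
    // Test 1: only the first job runs synchronously at Test.stopTest(),
    // so assert FirstBatch's results here and nothing about batch 2.
    @isTest
    static void testFirstBatch() {
        Test.startTest();
        Database.executeBatch(new FirstBatch());
        Test.stopTest();
        // assert side effects of FirstBatch only
    }

    // Test 2: exercise SecondBatch directly, since the chained call from
    // FirstBatch.finish() is not executed in the first test method.
    @isTest
    static void testSecondBatch() {
        Test.startTest();
        Database.executeBatch(new SecondBatch());
        Test.stopTest();
        // assert side effects of SecondBatch
    }
}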
Testing batches with HTTP callouts and setup of test data requires calling start(), execute(), and finish() explicitly if you get an "uncommitted work pending" error. See here and here.
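A minimal sketch of that explicit approach (MyCalloutBatch and MyCalloutMock are hypothetical, and this method would live inside an @isTest class):

@isTest
static void testBatchWithCallout() {
    insert new Account(Name = 'Test');                         // test data setup (DML)
    Test.setMock(HttpCalloutMock.class, new MyCalloutMock());  // mock the HTTP callout

    MyCalloutBatch b = new MyCalloutBatch();
    Test.startTest();
    // Drive the batch methods directly instead of using Database.executeBatch(),
    // so the DML setup and the callout don't collide in one transaction.
    b.start(null);
    b.execute(null, [SELECT Id, Name FROM Account]);
    b.finish(null);
    Test.stopTest();
    // assert the expected side effects here
}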