Rather than scheduling a one-time job, schedule a recurring job.
Schedule the job to run hourly. In the finish phase of your job, cancel this hourly schedule and replace it with a similar hourly schedule whose first execution is a short period (let's say 5 minutes) after the job finishes.
This works in much the same way as a "one off" schedule (as per your existing implementation): in both cases the job is rescheduled in the finish phase. The added benefit of a recurring schedule is that if the job does not execute for any reason, the platform will attempt to run it again an hour later, and every hour after that until it succeeds.
Note that we don't know why the job may fail to execute - but we're assuming that it relates to platform maintenance. Chaining one-off scheduled jobs together relies on the successful start and completion of each job for the integrity of the chain, whereas using a recurring scheduled job provides "auto-resume" behaviour regardless of the successful start / completion of an individual job.
Example process flow:
(1) at 12:00 we schedule a job to run every hour, at 5 minutes
past the hour: 12:05, 13:05, 14:05, etc.
(2) at 12:05 the batch manager job is started according to the hourly
schedule, and this checks your custom batch job object records to see
if there is any work currently running or waiting.
It finds that there are no jobs running but there is a job waiting:
"Foo". The batch manager therefore starts the batch process for Foo.
(3) at 13:05 the batch manager job is started according to the hourly
schedule.
On this occasion it finds that job Foo is in progress and so quits
taking no action.
(4) at 13:35 job Foo finishes.
In the finish phase, the existing hourly scheduled job is cancelled,
and a new hourly job is scheduled, this time to run at 40
minutes past the hour: 13:40, 14:40, 15:40, etc.
(5) at 13:40 the batch manager job is due to start according to the
hourly schedule, but this fails (we assume because of platform
maintenance).
(6) at 14:40 the batch manager job is started according to the hourly
schedule.
It finds that there are no jobs running but there is a job waiting: "Bar". The batch manager therefore starts the batch process for Bar.
etc.
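The reschedule in step (4) could be sketched roughly as below. This is only an illustration: the job name prefix 'Batch Manager', the class names BatchManagerRescheduler and BatchManagerSchedulable, and the 5-minute delay are all assumptions, not taken from your implementation. Each new schedule gets a unique name (the prefix plus a timestamp), and the old one is found by that prefix.

```apex
// Hypothetical helper, intended to be called from the finish() method of the batch job.
public class BatchManagerRescheduler {
    // Assumed prefix for the name under which the hourly schedule is registered.
    private static final String JOB_NAME_PREFIX = 'Batch Manager';

    public static void rescheduleHourly(Integer delayMinutes) {
        // Cancel any existing hourly schedule created with this prefix.
        String namePattern = JOB_NAME_PREFIX + '%';
        for (CronTrigger ct : [SELECT Id FROM CronTrigger
                               WHERE CronJobDetail.Name LIKE :namePattern]) {
            System.abortJob(ct.Id);
        }

        // The first run is delayMinutes from now; after that the job repeats
        // every hour at the same minute past the hour.
        Datetime firstRun = System.now().addMinutes(delayMinutes);
        String cronExpression = '0 ' + firstRun.minute() + ' * * * ?';

        // BatchManagerSchedulable stands in for your own Schedulable class.
        System.schedule(JOB_NAME_PREFIX + ' ' + firstRun,
                        cronExpression,
                        new BatchManagerSchedulable());
    }
}
```

With this in place, a job finishing at 13:35 and calling BatchManagerRescheduler.rescheduleHourly(5) would produce the expression '0 40 * * * ?', matching the 13:40, 14:40, 15:40 sequence in the flow above.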
I use this pattern. I've seen lots of others, but they all do something similar:
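private static Integer getCurrentJobCount()
{
    // Count batch jobs that occupy one of the 5 concurrent slots.
    // 'Queued' is included because queued jobs also count towards that limit.
    return (Integer)[Select count() From AsyncApexJob Where JobType = 'BatchApex' and ( Status = 'Queued' or Status = 'Preparing' or Status = 'Processing' )];
}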
private void submitJob( IReschedulable job, SchedulableContext SC )
{
    if( getCurrentJobCount() > 4 )
    {
        // all 5 batch slots are in use: try again in 15 minutes
        Datetime sysTime = System.now().addSeconds( 900 );
        String cronExpression = '' + sysTime.second() + ' ' + sysTime.minute() + ' ' + sysTime.hour() + ' ' + sysTime.day() + ' ' + sysTime.month() + ' ? ' + sysTime.year();
        // the timestamp suffix keeps the scheduled job name unique
        System.schedule( job.getJobName() + sysTime, cronExpression, (System.Schedulable)job );
    }
    else
    {
        Database.executeBatch( (Database.Batchable<Object>)job, job.getBatchSize() );
        // abort scheduled job if this was as a result of a reschedule
        if( SC != null && job.getAbortAfterSubmit() )
        {
            System.abortJob( SC.getTriggerId() );
        }
    }
}
I've missed off some particulars, but hopefully you get the gist. You should basically put the count check just prior to calling Database.executeBatch, which I'm guessing is in the BatchHandler class. I'm afraid it's not completely infallible, because there is of course a moment between the query returning a count below 5 and the job actually starting (at which point another job could have gotten in), but I've found it to be very reliable (and have it deployed in many orgs).
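The IReschedulable type used above is one of the particulars left out. Purely as a guess at its shape, inferred from how submitJob calls it, it might look something like this (the return types are assumptions based on how each value is used):

```apex
// Hypothetical reconstruction: not the original interface, only what the calls imply.
public interface IReschedulable {
    // Base name for the scheduled job; a timestamp is appended when rescheduling.
    String getJobName();

    // Scope size passed to Database.executeBatch.
    Integer getBatchSize();

    // Whether the triggering scheduled job should be aborted once the batch is submitted.
    Boolean getAbortAfterSubmit();
}
```

Since submitJob also casts the job to System.Schedulable and Database.Batchable<Object>, any implementing class would presumably implement those interfaces as well.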
PS. Just noticed you're using System.scheduleBatch. I personally wouldn't trust that to be safe; see this post.
Best Answer
Why not just perform the callout in your Batch Apex start method? This means you will always start the Batch Apex job from your Scheduled Apex job, but I think that is, on balance, better than having to marshal scheduled jobs between @future jobs.
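A minimal sketch of that idea follows, assuming the batch class is marked with Database.AllowsCallouts. The class names, the endpoint URL and the Account query are placeholders for illustration only, and the endpoint would also need to be allowed in the org (for example via a Remote Site Setting).

```apex
// Sketch only: CalloutBatch, the endpoint and the query are placeholders.
public class CalloutBatch implements Database.Batchable<SObject>, Database.AllowsCallouts {

    public Database.QueryLocator start(Database.BatchableContext bc) {
        // Database.AllowsCallouts permits the callout here, so no @future hop is needed.
        HttpRequest req = new HttpRequest();
        req.setEndpoint('https://example.com/api/status'); // hypothetical endpoint
        req.setMethod('GET');
        HttpResponse res = new Http().send(req);
        // ... inspect res here to decide what the batch should process ...

        return Database.getQueryLocator([SELECT Id FROM Account]); // placeholder query
    }

    public void execute(Database.BatchableContext bc, List<SObject> scope) {
        // Process each chunk of records here.
    }

    public void finish(Database.BatchableContext bc) {
        // Nothing to do in this sketch.
    }
}

// In a separate class file: the scheduled job only starts the batch.
public class CalloutBatchScheduler implements Schedulable {
    public void execute(SchedulableContext sc) {
        Database.executeBatch(new CalloutBatch());
    }
}
```

The scheduled job stays trivial this way, and the callout happens inside the batch transaction rather than being handed off to an @future method.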