Rather than scheduling a one-time job, schedule a recurring job.
Schedule the job to run hourly. In the finish phase of your job, cancel this hourly schedule and replace it with a similar hourly schedule whose first execution is a short period (say 5 minutes) after the job finishes.
This works much like the "one-off" schedule in your existing implementation: in both cases the job is rescheduled in the finish phase. The recurring schedule has the added benefit that if the job does not execute for any reason, the platform will attempt to run it again an hour later, and every hour after that until it succeeds.
Note that we don't know why the job may fail to execute - but we're assuming that it relates to platform maintenance. Chaining one-off scheduled jobs together relies on the successful start and completion of each job for the integrity of the chain, whereas using a recurring scheduled job provides "auto-resume" behaviour regardless of the successful start / completion of an individual job.
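The "cancel and replace" step in the finish phase could look roughly like the sketch below. The class name HourlyRescheduler, the method name and its parameters are illustrative assumptions; only System.schedule, System.abortJob and the CronTrigger query are standard platform features.

```apex
// Hypothetical helper; class, method and parameter names are illustrative only.
public class HourlyRescheduler {
    // Cancels any schedule registered under jobName and creates a new hourly
    // schedule whose first run falls roughly delayMinutes from now.
    public static Id rescheduleHourly(String jobName, Schedulable job, Integer delayMinutes) {
        // Abort the existing hourly schedule(s) with this name, if any.
        for (CronTrigger ct : [SELECT Id FROM CronTrigger
                               WHERE CronJobDetail.Name = :jobName]) {
            System.abortJob(ct.Id);
        }

        // Cron fields: seconds minutes hours day-of-month month day-of-week.
        // Fixing only the minute makes the job fire once every hour.
        Integer minute = Datetime.now().addMinutes(delayMinutes).minute();
        String cron = '0 ' + minute + ' * * * ?';

        return System.schedule(jobName, cron, job);
    }
}
```

From the finish() method of the batch this might be called as HourlyRescheduler.rescheduleHourly('Batch Manager Hourly', new BatchManager(), 5), where BatchManager stands for your own Schedulable class. Because the cron expression fires at second 0 of the chosen minute, the first run can come slightly earlier than the full delay after the job finishes.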
Example process flow:
(1) at 12:00 we schedule a job to run every hour, at 5 minutes
past the hour: 12:05, 13:05, 14:05, etc.
(2) at 12:05 the batch manager job is started according to the hourly
schedule, and this checks your custom batch job object records to see
if there is any work currently running or waiting.
It finds that there are no jobs running but there is a job waiting:
"Foo". The batch manager therefore starts the batch process for Foo.
(3) at 13:05 the batch manager job is started according to the hourly
schedule.
On this occasion it finds that job Foo is in progress and so quits,
taking no action.
(4) at 13:35 job Foo finishes.
In the finish phase, the existing hourly scheduled job is cancelled,
and another new hourly job is scheduled, this time to run at 40
minutes past the hour: 13:40, 14:40, 15:40...etc…
(5) at 13:40 the batch manager job is due to start according to the
hourly schedule, but this fails (we assume because of platform
maintenance).
(6) at 14:40 the batch manager job is started according to the hourly
schedule.
It finds that there are no jobs running but there is a job waiting: "Bar". The batch manager therefore starts the batch process for Bar.
etc.
You can chain batches only through the finish() method. The reason is that execute() is called many times for a single batch job (once per chunk of records returned by start()), so starting batches from it could quickly get out of hand with too many batches scheduled.
Have your batchable class implement Database.Stateful and save any information you need in class member variables; these remain available to the finish() method for starting the next batch.
From the Apex doc:
Starting with Apex saved using Salesforce.com API version 26.0, you
can call Database.executeBatch or System.scheduleBatch from the finish
method. This enables you to start or schedule a new batch job when the
current batch job finishes. For previous versions, you can’t call
Database.executeBatch or System.scheduleBatch from any batch Apex
method. Note that the version used is the version of the running batch
class that starts or schedules another batch job. If the finish method
in the running batch class calls a method in a helper class to start
the batch job, the Salesforce.com API version of the helper class
doesn’t matter.
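As a minimal sketch of chaining from finish() with state carried in member variables, assuming API version 26.0 or later: the class name ChainedBatch, its fields and the example queries are hypothetical and only illustrate the pattern.

```apex
// Hypothetical self-chaining batch; names and queries are illustrative only.
public class ChainedBatch implements Database.Batchable<SObject>, Database.Stateful {
    private final List<String> queries;  // remaining SOQL queries, one per batch
    private Integer totalProcessed;      // running total carried along the chain

    public ChainedBatch(List<String> queries, Integer totalProcessed) {
        this.queries = queries;
        this.totalProcessed = totalProcessed;
    }

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(queries[0]);
    }

    public void execute(Database.BatchableContext bc, List<SObject> scope) {
        // Called once per chunk of records; no other batch is started here.
        // The update survives between chunks because of Database.Stateful.
        totalProcessed += scope.size();
    }

    public void finish(Database.BatchableContext bc) {
        // Called once, after the last chunk: the only place the chain continues.
        List<String> remaining = queries.clone();
        remaining.remove(0);
        if (!remaining.isEmpty()) {
            Database.executeBatch(new ChainedBatch(remaining, totalProcessed));
        } else {
            System.debug('Chain complete, records seen: ' + totalProcessed);
        }
    }
}
```

The chain could be started with Database.executeBatch(new ChainedBatch(new List<String>{ 'SELECT Id FROM Account', 'SELECT Id FROM Contact' }, 0)). Without Database.Stateful, the running total would revert to its initial value in each execute() call.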
Best Answer
Yes, currently 'execute' methods run sequentially, though the documentation is not perfectly clear on this. In the future Salesforce may enable parallel running, though I'd expect this to be enabled via an annotation or interface marker on the class, so hopefully we would get the chance to opt in or not.
So it may be possible that your users are running your job again, and thus in parallel over the same data, or indeed that other activities are responsible, as you suspect.
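One way to check whether a second copy of the job is already queued or running is to query AsyncApexJob before starting it. This is only a sketch: BatchGuard and isActive are hypothetical names, and the check is not a lock, so two requests arriving at the same moment could still both pass it.

```apex
// Hypothetical guard; class and method names are illustrative only.
public class BatchGuard {
    // Returns true if a batch job of the given Apex class is queued or running.
    public static Boolean isActive(String className) {
        Integer active = [SELECT COUNT() FROM AsyncApexJob
                          WHERE JobType = 'BatchApex'
                          AND ApexClass.Name = :className
                          AND Status IN ('Holding', 'Queued', 'Preparing', 'Processing')];
        return active > 0;
    }
}
```

The code that launches the job could then skip Database.executeBatch when BatchGuard.isActive('YourBatchClass') returns true, with 'YourBatchClass' standing for the name of your own batch class.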
Note: Also this comment from the Batch Apex docs, which might be important to you.