I'm trying to find the best solution for running periodic tasks in parallel. Requirements:
- Java (Spring w/o Hibernate).
- Tasks are managed by a front-end application and stored in a MySQL DB (fields: `id`, `frequency` (in seconds), <other attributes/settings about the task scenario>). Something like crontab, only with a `frequency` (seconds) field instead of minutes/hours/days/months/days of week.
I'm thinking about:
- A `TaskImporter` thread polling tasks from the DB (via `TasksDAO.findToProcess()`) and submitting them to a queue.
- A `java.util.concurrent.ThreadPoolExecutor` running tasks (from the queue) in parallel.
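To make the proposed architecture concrete, here is a minimal sketch of the importer-plus-pool idea. Only `TaskImporter`, `TasksDAO.findToProcess()` and `ThreadPoolExecutor` come from the description above; the `Task` class, the `handler` callback, the pool size and the `pollOnce`/`stopAndWait` helpers are assumptions made for illustration, and the DAO is left as an interface so no real MySQL access is implied.

```java
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical task holder: only id and frequency are named in the question.
final class Task {
    final long id;
    final int frequencySeconds;

    Task(long id, int frequencySeconds) {
        this.id = id;
        this.frequencySeconds = frequencySeconds;
    }
}

// Assumed DAO contract; a real implementation would query the tasks table.
interface TasksDAO {
    List<Task> findToProcess();
}

// Polls the DAO on a single thread and hands each due task to a worker pool.
final class TaskImporter {
    private final TasksDAO dao;
    private final Consumer<Task> handler;
    private final ThreadPoolExecutor workers;
    private final ScheduledExecutorService poller =
            Executors.newSingleThreadScheduledExecutor();

    TaskImporter(TasksDAO dao, int poolSize, Consumer<Task> handler) {
        this.dao = dao;
        this.handler = handler;
        // The executor's own unbounded queue plays the role of "the queue".
        this.workers = new ThreadPoolExecutor(poolSize, poolSize,
                0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    // Starts periodic polling; pollMillis is an assumed tuning knob.
    void start(long pollMillis) {
        poller.scheduleWithFixedDelay(this::pollOnce, 0, pollMillis, TimeUnit.MILLISECONDS);
    }

    // One polling round: fetch due tasks and submit each to the pool.
    void pollOnce() {
        try {
            for (Task task : dao.findToProcess()) {
                workers.submit(() -> handler.accept(task));
            }
        } catch (RuntimeException e) {
            // Swallowed so that a failed poll does not cancel the schedule;
            // a real application would log this.
        }
    }

    // Stops polling and waits for already submitted tasks to finish.
    boolean stopAndWait(long timeoutMillis) {
        poller.shutdownNow();
        workers.shutdown();
        try {
            return workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In this sketch the importer itself never runs a task, so a slow task only occupies a worker thread and does not delay the next poll.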
The trickiest part of this architecture is `TasksDAO.findToProcess()`:
- How do I know which tasks are due to run right now?
  - I'm thinking about a `next_run` task field, which would be populated (`UPDATE tasks SET next_run = TIMESTAMPADD(SECOND, frequency, NOW()) WHERE id = ?`) straight after selection (`SELECT * FROM tasks WHERE next_run IS NULL OR next_run <= NOW() FOR UPDATE`). The problem: this requires many UPDATEs for the many SELECTed tasks (one UPDATE per task, or a bulk UPDATE), plus concurrency problems (see below).
- Ability to run several concurrent processing applications (cloud), using/polling same DB.
  - Across all of the concurrent processing applications, any given task must be run only once. SELECTs from all other apps must be blocked until app A finishes updating `next_run` for all of its selected tasks. The problem: locking the production table (used by the front-end app) would slow things down. A table mirror?
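The intended `next_run` semantics can be sketched in memory, without a database. This is only an analogy under stated assumptions: the `ScheduledTask` and `InMemoryTaskStore` classes and the `claimDue` method are hypothetical names, and the `synchronized` method stands in for the transaction that would wrap the `SELECT ... FOR UPDATE` and the following `UPDATE` in MySQL. It says nothing about how the real locking would perform.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory row: id, frequency (seconds) and next_run.
final class ScheduledTask {
    final long id;
    final int frequencySeconds;
    Instant nextRun; // null means the task has never been scheduled

    ScheduledTask(long id, int frequencySeconds) {
        this.id = id;
        this.frequencySeconds = frequencySeconds;
    }
}

// Stand-in for the tasks table; synchronized methods mimic one transaction at a time.
final class InMemoryTaskStore {
    private final List<ScheduledTask> tasks = new ArrayList<>();

    synchronized void add(ScheduledTask task) {
        tasks.add(task);
    }

    // Selects every task with next_run NULL or <= now and, in the same
    // critical section, moves its next_run forward by its frequency.
    // A second caller arriving with the same "now" therefore gets nothing.
    synchronized List<Long> claimDue(Instant now) {
        List<Long> dueIds = new ArrayList<>();
        for (ScheduledTask task : tasks) {
            if (task.nextRun == null || !task.nextRun.isAfter(now)) {
                task.nextRun = now.plusSeconds(task.frequencySeconds);
                dueIds.add(task.id);
            }
        }
        return dueIds;
    }
}
```

Because selection and rescheduling happen in one critical section, two pollers calling `claimDue` at the same instant cannot both receive the same task, which is the property the locking requirement above is after.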
I love simple and clean solutions and believe there's a better way to implement this processing application. Do you see any? :)
Thanks in advance.
EDIT: Using Quartz as a scheduler/executor is not an option because of syncing latency. The front-end app is not written in Java, so it cannot interact with Quartz except through a web-service-oriented solution, which is not an option either: the front-end app has more data associated with the previously mentioned tasks and needs direct access (read + write) to all data in the DB.