Solved

Job Scheduling in Java

Posted on 2010-05-17
28
Medium Priority
5,189 Views
Last Modified: 2013-12-02
Hi,

We have a web-based application running on Apache Tomcat v6.0.10, and reports are available as part of it. We now plan to introduce a new feature called report scheduling. With it, application users can schedule reports of their choice and have them delivered to their mailbox. We want to give users as much flexibility as possible in scheduling the reports.

I also heard & read about job scheduling in Java at:

Open Source Job Schedulers in Java
http://java-source.net/open-source/job-schedulers

What is Quartz
http://onjava.com/lpt/a/6207

JobServer 1.4 Open Source Java Job Scheduler
http://www.javalobby.org/java/forums/t68751.html

Considering my use case explained above, my questions are:
1) Is it possible to use Sun's own java.util.TimerTask for my complex report scheduling?
2) What are the strong reasons against, or limitations of, java.util.TimerTask compared to other job scheduler frameworks? I want solid reasons of my own before choosing a third-party job scheduler framework.
3) There are a maximum of 100-200 users in my application. Suppose users have scheduled reports in such a way that at one time there are 100 report requests in the queue. How does the job scheduler framework or java.util.TimerTask handle such scenarios? Do we have control over this?
4) At any time, users are allowed to change their report schedules. Does the job scheduler framework support this?
5) Obviously, each report schedule has report inputs that have to be passed to it. Is there an option for passing parameters to the job scheduler framework?
6) Which is the better way: integrating the job scheduler framework with the web application, or running it standalone?

NOTE
Because of a memory leak in our application, we restart the Tomcat service daily at a low-usage time. I mention this because reports scheduled by users must persist across server/Tomcat restarts. Please take this into consideration.

Expert opinions pointing in the right direction are appreciated.
Question by:Zoniac
28 Comments
 
LVL 86

Expert Comment

by:CEHJ
ID: 32779098
This is really a no-brainer: you shouldn't attempt to implement this yourself with java timers, otherwise you'll end up reimplementing something like Quartz, but probably not as well.

I don't know JobServer, but Quartz is the standard implementation and supports your requirements
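
For questions 1) and 2), one concrete limitation is worth seeing in code. A java.util.Timer executes all of its tasks sequentially on a single background thread, so one slow report delays every other task on that timer, and an unchecked exception thrown by a task ends the timer thread. It also keeps nothing across a restart. The following is only a minimal sketch of the single-thread behaviour; the class name TimerSingleThreadDemo and the thread name report-timer are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

class TimerSingleThreadDemo {

    // Schedules taskCount tasks on one Timer and returns the name of the
    // thread that executed each of them.
    static List<String> runTasks(int taskCount) {
        final ConcurrentLinkedQueue<String> threadNames = new ConcurrentLinkedQueue<>();
        final CountDownLatch done = new CountDownLatch(taskCount);
        Timer timer = new Timer("report-timer", true); // one daemon thread for all tasks
        for (int i = 0; i < taskCount; i++) {
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    threadNames.add(Thread.currentThread().getName());
                    done.countDown();
                }
            }, 0L);
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            timer.cancel();
        }
        return new ArrayList<>(threadNames);
    }

    public static void main(String[] args) {
        // Every entry is the same thread name: the tasks never run in parallel.
        System.out.println(runTasks(3));
    }
}
```

A scheduler with a worker pool and a persistent job store avoids both problems, which is the main argument for a framework here.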
 
LVL 4

Expert Comment

by:nfaria
ID: 32779157
I have some scheduled jobs implemented with the OS utilities.

I record the report configs and the delivery schedule in the DB.

Then I use cron on Linux to start Java programs that read from the DB and issue all the required reports. The report system runs as a stand-alone executable. On Windows you can also schedule jobs to launch your executables.

For asynchronous tasks that don't influence the outputs and data of your application this is enough. I think reporting can be treated this way.

Scheduling jobs inside the application is more for jobs that interact with ongoing processes and need to know the current state of the application.
 
LVL 1

Author Comment

by:Zoniac
ID: 32779255
Hi CEHJ,

I'll also go through the documentation to understand the functionality that meets my requirements.

There is one specific point that keeps coming to mind before I go ahead with the Quartz framework:
4) At any time, users are allowed to change their report schedules. Does the job scheduler framework support this?

Do you have any sample code for manipulating or rescheduling an already scheduled job using the Quartz framework? Any pointers to relevant documentation are also appreciated.
 
LVL 86

Expert Comment

by:CEHJ
ID: 32779296
 
LVL 1

Author Comment

by:Zoniac
ID: 32779312
Hi nfaria,

It looks like, with an in-house job scheduling implementation at the OS level, the DB schema for report configs and delivery schedule becomes very important, for example in determining when the next run is due.

Could you share the scheduling portion of your schema?
 
LVL 4

Expert Comment

by:nfaria
ID: 32779416

I only need to set up daily, weekly or monthly reports, so it is enough to have something like

scheduling_interval TINYINT (1, 7, 30 defines daily, weekly or monthly, special value 0 means no sending)
scheduling_last_sent DATE (I don't need the time)
scheduling_user_id INT (owner of the config and target of the e-mail)
scheduling_report_id INT (report to run)

And my Java program issues a SELECT similar to this (in MySQL syntax)

SELECT st.scheduling_report_id AS report_id, u.user_id, u.user_name, u.user_email
FROM scheduling_table st
INNER JOIN users u ON st.scheduling_user_id = u.user_id
WHERE DATEDIFF(NOW(), st.scheduling_last_sent) >= st.scheduling_interval AND st.scheduling_interval > 0;

And for each record I process the report identified by report_id, dispatch it to user_email and update the field scheduling_last_sent.
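
The "is this report due?" rule from the WHERE clause can also be expressed on the Java side, which makes it easy to unit test. This is a minimal sketch, not nfaria's actual program; the class ReportSchedule and its field names are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class ReportSchedule {
    final int intervalDays;   // 1, 7 or 30; 0 means "do not send"
    final LocalDate lastSent; // date only, as in scheduling_last_sent

    ReportSchedule(int intervalDays, LocalDate lastSent) {
        this.intervalDays = intervalDays;
        this.lastSent = lastSent;
    }

    // Mirrors: DATEDIFF(today, last_sent) >= interval AND interval > 0
    boolean isDue(LocalDate today) {
        if (intervalDays <= 0) {
            return false; // scheduling disabled
        }
        long daysSinceLastSend = ChronoUnit.DAYS.between(lastSent, today);
        return daysSinceLastSend >= intervalDays;
    }

    public static void main(String[] args) {
        ReportSchedule weekly = new ReportSchedule(7, LocalDate.of(2010, 5, 10));
        System.out.println(weekly.isDue(LocalDate.of(2010, 5, 17)));
    }
}
```

Keeping the rule in one method means the nightly program and any "next run" preview shown to the user can share it.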
 
LVL 1

Author Comment

by:Zoniac
ID: 32779603
Hi nfaria,

In your case:

1) Daily: At what time does it trigger?  Based on your schema, I think it is not user-configurable.  Am I correct?
2) Weekly: At what day and time does it trigger?  Again is it user-configurable?
3) Monthly:  At what date and time does it trigger?

Overall, I see there is no recurrence pattern in your scheduling mechanism.  Is my understanding correct?

Is your cron entry scheduled to query database every 1 minute?
 
LVL 4

Expert Comment

by:nfaria
ID: 32779847
For my requirements it is enough to launch the executable every night, and I don't have a specific day for sending, like every Monday or every 1st of the month.

So each user gets their report counting from the day they enabled it, plus the time interval.

If you need to set an explicit day of month, day of week and/or hour for each user and report, you just have to add those fields and adjust the query. For example

scheduling_day_of_month TINYINT (1 to 31, 0 no use)
scheduling_day_of_week TINYINT (1 to 7, 0 no use)

and issue an execution every day with adjusted query

SELECT scheduling_report_id AS report_id, u.user_id, u.user_name, u.user_email
FROM scheduling_table
INNER JOIN users u ON scheduling_user_id = u.user_id
WHERE
DATEDIFF(NOW(), scheduling_last_sent) >= scheduling_interval AND
(
DAYOFMONTH(NOW()) = scheduling_day_of_month OR
DAYOFWEEK(NOW()) = scheduling_day_of_week
) AND
scheduling_interval > 0;

If you need hourly reports, or need to send them at a given hour of the day, you just extend this with the same logic and run your executable every hour or so. You could do it by the minute, but you have to make sure that each launch only starts after the previous one has ended.
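
One way to enforce "no new run while the previous one is still going" is a small guard around the run itself. The sketch below only works inside a single JVM, for example when the polling loop lives in one long-running Java process. When cron starts a fresh process each time, the same idea needs a lock file or a flag row in the DB instead. The class name NonOverlappingRunner is hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class NonOverlappingRunner {
    private final AtomicBoolean running = new AtomicBoolean(false);

    // Runs the task unless a previous run is still in progress.
    // Returns true if the task ran, false if it was skipped.
    boolean tryRun(Runnable task) {
        if (!running.compareAndSet(false, true)) {
            return false; // previous run has not finished yet
        }
        try {
            task.run();
            return true;
        } finally {
            running.set(false); // always release, even if the task throws
        }
    }

    public static void main(String[] args) {
        NonOverlappingRunner runner = new NonOverlappingRunner();
        boolean ran = runner.tryRun(() -> System.out.println("sending due reports"));
        System.out.println("ran = " + ran);
    }
}
```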
 
LVL 20

Expert Comment

by:ChristoferDutz
ID: 32781459
If your application is based on Spring, I can certainly recommend having a look at the Spring Scheduling features. They do rely on Quartz, but wrap all the "dirty stuff" of scheduling. http://static.springsource.org/spring/docs/2.0.x/reference/scheduling.html
 
LVL 1

Author Comment

by:Zoniac
ID: 32787237
3) There are a maximum of 100-200 users in my application. Suppose users have scheduled reports in such a way that at one particular time there are 100 report requests in the queue. How does the job scheduler framework handle such scenarios? Do we have control over this?
 
LVL 20

Expert Comment

by:ChristoferDutz
ID: 32787974
Well, a scheduling framework usually has a predefined number of worker threads. By default, I think Spring configures cron with 5 worker threads in a thread pool. Report generation seems to be quite calculation-intensive, so I think this behaviour is rather desirable in your case. Otherwise the scheduler would take down your system every night for a few minutes.

Do I understand correctly? Your system doesn't need to run user-defined jobs at a user-defined time, but rather work through user-defined reports at a predefined, system-wide time? If this is the case, I'd certainly recommend adding report jobs to a list and having a report worker (or several) work off the list at a given time, triggered by Quartz (either directly or using the Spring wrapper).
 
LVL 1

Author Comment

by:Zoniac
ID: 32788027
Hi ChristoferDutz,

To answer your question: so far my idea was to run user-defined jobs at user-defined times. But your points bring another question to mind. Can you clear this up?

If my Quartz scheduler worker threads run in parallel with my web application, will that slow the web application down? I should add that each report job may take approximately 5 minutes to complete.
 
LVL 4

Expert Comment

by:nfaria
ID: 32788053
5 minutes x 100 reports in queue?

If they run in parallel they will strain your app, but if they share the same DB even if they are executed apart they can strain your DB and your app as a consequence.

Each user request is an independent report with nothing in common with each other?
Couldn't you generate a report once and send it to the users that 'subscribed' to it?

If not, I hope you have a mirror DB exclusively for reading and report generation.
 
LVL 20

Expert Comment

by:ChristoferDutz
ID: 32788112
Hi,

On a normal Quad-Core-Duo you will be able to execute 4 threads in parallel (Intel may say 8, but the "duo" cores are not full cores). So if you have 100 jobs each taking 5 minutes, you will need roughly 2 hours to work off the load (100 x 5 / 4 = 125 minutes). If you increase the number of parallel executions you will certainly get less performance, as the operating system has to do a lot of thread switching, thread scheduling, lock maintenance etc.

Of course your reports will strain the web server, and if for example a lot of people want fresh reports at noon, your website may even go offline or be very sluggish for a while. A solution would be to give the report threads a really low priority. This would at least give the main threads enough power to keep the web server available.
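
The estimate above is simple enough to put into a helper, which helps when trying other pool sizes or job durations. This is a back-of-the-envelope sketch that assumes perfectly parallel workers and ignores DB contention; the class and method names are made up.

```java
class ReportLoadEstimate {

    // Wall-clock minutes to finish `jobs` jobs of `minutesPerJob` each when
    // at most `workers` jobs run at the same time (jobs run in "waves").
    static long totalMinutes(int jobs, int minutesPerJob, int workers) {
        long waves = (jobs + workers - 1) / workers; // ceiling division
        return waves * minutesPerJob;
    }

    public static void main(String[] args) {
        // 100 reports, 5 minutes each, 4 workers -> 25 waves of 5 minutes
        System.out.println(totalMinutes(100, 5, 4) + " minutes");
    }
}
```

With a single worker the same load takes 500 minutes, so the pool size is the main lever as long as the DB keeps up.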

I'd recommend a dedicated Report-Server, that does the generating of reports and that generates them at night if the report-generation is a DB intense operation.
 
LVL 1

Author Comment

by:Zoniac
ID: 32788310
Hi nfaria & ChristoferDutz,

Let me give some more detail on my production server setup:

SERVER SETUP DETAILS
RAM: 7.5 GB
Both my web server (Apache Tomcat) and database (PostgreSQL) run on the same server. I also plan to run the Quartz framework on the same machine. How to integrate Quartz is still my choice: either integrating it with Apache Tomcat, or running it standalone and adding jobs via RMI calls.

YES. Each user request is an independent report with nothing in common with the others.

  1. How do I give report threads a really low priority from within Apache Tomcat, in case the Quartz server is integrated with Apache Tomcat?
 
LVL 20

Assisted Solution

by:ChristoferDutz
ChristoferDutz earned 1200 total points
ID: 32788373
The following code would reduce the priority of the thread executing it to the minimum priority:

        Thread.currentThread().setPriority(Thread.MIN_PRIORITY);

I would suggest getting the priority first, saving it, setting it to minimum in a try block and resetting it to the old value in the finally block.
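
The save, lower, restore sequence described above could look like this. It is a minimal sketch; the helper name runAtMinPriority is made up, and thread priorities are only a hint to the OS scheduler, so the effect varies by platform.

```java
class LowPriorityTask {

    // Runs the task at minimum priority and restores the thread's previous
    // priority afterwards, even if the task throws.
    static void runAtMinPriority(Runnable task) {
        Thread current = Thread.currentThread();
        int previous = current.getPriority(); // remember the old value
        try {
            current.setPriority(Thread.MIN_PRIORITY);
            task.run();
        } finally {
            current.setPriority(previous); // hand the thread back unchanged
        }
    }

    public static void main(String[] args) {
        runAtMinPriority(() ->
                System.out.println("priority inside task: "
                        + Thread.currentThread().getPriority()));
        System.out.println("priority afterwards: "
                + Thread.currentThread().getPriority());
    }
}
```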

I would predict that you will probably run into trouble with your setup. If your reports are very DB intense, you may be able to lower the priority of the Java thread but not of the DB. So even if your Java thread does not slow down the system, the DB thread answering the Java thread may still bring down your server.
 
LVL 1

Author Comment

by:Zoniac
ID: 32789623
Hi ChristoferDutz,

By setting thread priority, you mean setting this at the quartz.properties level?

# "THREAD_PRIO" can be any int between Thread.MIN_PRIORITY (1) and
# Thread.MAX_PRIORITY (10).  The default is Thread.NORM_PRIORITY (5).
org.quartz.threadPool.threadPriority = 1


OR, do you mean handling this at each job-level at the time of adding & scheduling:
scheduler.addJob(job, true);
scheduler.scheduleJob(cronTrigger);

Also, can you explain the significance of "getting the priority first, saving it, setting it to minimum in a try block and resetting it to the old value in the finally block"?  I assume there may be reason behind for you saying this, let me understand your view point further.
 
LVL 20

Expert Comment

by:ChristoferDutz
ID: 32789974
Oh well, if Quartz allows you to set this, I would prefer the "org.quartz.threadPool.threadPriority = 1" option. Your "OR" option is not going to help you.

My approach lets you set the currently executing thread to a predefined priority and makes sure it is reset to its previous value after the job is finished (actually shortly before it is finished). If you don't save and reset the priority, you run into problems when you have jobs of mixed priority (some jobs with normal priority, some with lower priority): as soon as a thread executes a low-priority job, its priority is reduced, and after it has finished its job it is returned to the thread pool still with low priority. When a normal-priority job then gets a thread from the pool, it might get one with normal priority, but it also might get one with low priority. I doubt this is what you want.

I think this may also be the big difference between setting "org.quartz.threadPool.threadPriority" and my approach: my approach allows jobs to execute at different priorities, whereas the "org.quartz.threadPool.threadPriority" approach makes them all run at the same one.
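
For comparison, here is what "one priority for the whole pool" looks like with a plain JDK executor. This is not Quartz code; it only illustrates the effect of a pool-wide priority setting by using a ThreadFactory. The class name LowPriorityPool and the thread name report-worker are assumptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

class LowPriorityPool {

    // Every worker thread is created with minimum priority, so all jobs
    // submitted to this pool run at the same (low) priority.
    static ExecutorService create(int threads) {
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, "report-worker");
            t.setPriority(Thread.MIN_PRIORITY);
            t.setDaemon(true);
            return t;
        };
        return Executors.newFixedThreadPool(threads, factory);
    }

    // Asks a worker thread which priority it is running with.
    static int prioritySeenByWorker(ExecutorService pool) {
        Callable<Integer> probe = () -> Thread.currentThread().getPriority();
        try {
            return pool.submit(probe).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = create(2);
        System.out.println("worker priority: " + prioritySeenByWorker(pool));
        pool.shutdown();
    }
}
```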
 
LVL 1

Author Comment

by:Zoniac
ID: 32806053
Hi ChristoferDutz,

I understand your approach. In my case, all generated reports have equal priority. So I have no need to change job thread priority programmatically, and there would be nothing to distinguish the job threads by at runtime, since, as I said, all queued report jobs have equal priority.

So, as you said and as I also think, I had better set org.quartz.threadPool.threadPriority = 1 and make all threads run with the same priority.
 
LVL 20

Expert Comment

by:ChristoferDutz
ID: 32806662
I'd suggest giving it a try :-)

Additionally, I'd recommend limiting the number of threads in the pool. As I mentioned, 1000 threads working on 4-8 cores does not really make sense and will certainly be the cause of some performance issues under heavy load.
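
The effect of a bounded pool can be checked with a small experiment: however many jobs are queued, no more than the configured number run at once. The sketch uses a JDK fixed thread pool rather than Quartz, and a short sleep stands in for report generation; all names are illustrative.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedReportPool {

    // Queues `jobs` dummy report tasks on `workers` threads and returns the
    // highest number of tasks that were running at the same moment.
    static int peakConcurrency(int jobs, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            pool.execute(() -> {
                int now = active.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stand-in for generating one report
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    active.decrementAndGet();
                }
            });
        }
        pool.shutdown(); // no new jobs; queued ones still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("jobs did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + peakConcurrency(100, 4));
    }
}
```

The remaining jobs simply wait in the queue, which is the "control" asked about in question 3).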

If it turns out that your system still has some performance/scalability issues, feel free to come back and let us help you with them. I think there are quite some performance specialists here.
 
LVL 1

Author Comment

by:Zoniac
ID: 32806751
Hi ChristoferDutz,

Thanks for your valuable suggestion.  Sure, I'll get back.

I need another idea/suggestion w.r.t Quartz framework:

Based on my use case explained in my original post, I need to have many job instances doing same thing (generating report) but with different parameters (report input) and different time intervals (based on each User's schedule).

1. Is it right to create a separate JobDetail and a separate CronTrigger for each user?
2. If a user wants to reschedule a report (that is, different report input and a different schedule), which approach would you suggest:
           1. Deleting both JobDetail and CronTrigger and creating a new JobDetail & CronTrigger
               whenever the user wants to change the report input and reschedule.
           2. Locating the CronTrigger for the JobDetail and rescheduling the job using
               Scheduler.rescheduleJob(String triggerName, String groupName, Trigger newTrigger)
 
LVL 20

Expert Comment

by:ChristoferDutz
ID: 32806806
No, I wouldn't suggest that.

I would recommend a generic CronTrigger and JobDetail, that works off a List (preferably a concurrent List) of ReportJob objects. These ReportJob objects are simple containers, that contain everything that a users report needs.
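
As a rough illustration of this suggestion: a ReportJob is a plain container, and the single generic scheduled job drains a concurrent queue of them each time its trigger fires. This is a sketch under assumptions, not ChristoferDutz's code; the field names and the drain() method are invented, and the string built inside the loop is a placeholder for generating and mailing the report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Simple container holding everything one user's report needs.
final class ReportJob {
    final String userId;
    final String reportId;
    final Map<String, String> parameters;

    ReportJob(String userId, String reportId, Map<String, String> parameters) {
        this.userId = userId;
        this.reportId = reportId;
        this.parameters = parameters;
    }
}

class ReportQueueWorker {
    private final Queue<ReportJob> pending = new ConcurrentLinkedQueue<>();

    void enqueue(ReportJob job) {
        pending.add(job);
    }

    // What the one generic scheduled job would do when its trigger fires:
    // take jobs off the queue until it is empty.
    List<String> drain() {
        List<String> processed = new ArrayList<>();
        ReportJob job;
        while ((job = pending.poll()) != null) {
            // Placeholder for "generate the report and mail it".
            processed.add(job.userId + ":" + job.reportId);
        }
        return processed;
    }
}
```

The trade-off is that every report runs at the generic trigger's time, so this fits a system-wide schedule better than per-user schedules.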
 
LVL 1

Author Comment

by:Zoniac
ID: 32806881
Hi ChristoferDutz,

In my case, the CronTrigger could perhaps be taken from a predefined list, but that is not possible for the report input (which has to be passed as a parameter to the JobDetail), because in my case the report input has too many parameters.

Again, my question is a little different: if a user (more than one user is allowed) wants to change the report input and its schedule, I need to rebuild that particular user's job with the new report input parameters and the new schedule anyway. Which of these would you suggest?

1. Delete the JobDetail and its associated CronTrigger and create new ones (the CronTrigger from the predefined list, that is).
2. Keep the JobDetail as it is, but overwrite the JobDataMap (report parameters) and the CronTrigger (schedule).
 
LVL 20

Accepted Solution

by:ChristoferDutz
ChristoferDutz earned 1200 total points
ID: 32806975
Well, I just had a look at one of my applications that uses Quartz (wrapped by Spring): on update I unschedule the job and then schedule the changed one.

Here is my code for scheduling/unscheduling. I also have to pass quite a lot of information to the job, so it's relatively similar to what you need. Keep in mind that one or another class may only be available when using the Spring Quartz wrapper. ("data" is the real job data; "spring-context" and "job-executer-service" are references to the Spring context and to a service that I need inside my job.)
    // Quartz 1.x API. Needs java.util.Calendar and, from org.quartz,
    // CronTrigger, JobDataMap, JobDetail and SchedulerException.
    // schedulerFactory is used here like an org.quartz.Scheduler.
    protected void scheduleJob(final SchedulerJob aJob) {
        // Create a job context (Quartz object).
        final JobDataMap jobDataMap = new JobDataMap();
        jobDataMap.put("data", aJob);
        jobDataMap.put("spring-context", ctx);
        jobDataMap.put("job-executer-service", jobEcecuterService);

        // Create a job execution detail (Quartz object).
        // A null group name puts the job into the "DEFAULT" group.
        final JobDetail jobDetail = new JobDetail(aJob.getId(),
                null, DefaultSchedulerExecuter.class);
        jobDetail.setJobDataMap(jobDataMap);

        try {
            // Add the job definition to the scheduler, replacing any existing one.
            schedulerFactory.addJob(jobDetail, true);

            // Create a Quartz trigger to start the job.
            final CronTrigger trigger = new CronTrigger();
            // Set the name to the id.
            trigger.setName(aJob.getId());
            trigger.setJobName(aJob.getId());
            // Start now.
            trigger.setStartTime(Calendar.getInstance().getTime());
            // Schedule as specified in the cron expression.
            trigger.setCronExpression(aJob.getCronExpression());

            if (log.isInfoEnabled()) {
                log.info("[SCHEDULER] - Scheduling Job: " +
                        aJob.getId() + " with executer " +
                        jobDetail.getJobClass().getCanonicalName());
            }

            // Schedule the new job.
            schedulerFactory.scheduleJob(trigger);
        } catch (final Exception e) {
            // Log as an error; the original only logged this when debug was enabled.
            log.error("Error Scheduling Job " + aJob.getName() + ". Job deactivated.", e);

            // Deactivate the job, so it is not started automatically again.
            aJob.setEnabled(false);

            // Save the modified job.
            try {
                updateJob(aJob);
            } catch (final Exception ex) {
                throw new RuntimeException(ex);
            }
        }
    }

    protected void unscheduleJob(final String aJobId) {
        try {
            // Only delete the job if the scheduler actually knows it.
            if (schedulerFactory.getJobDetail(aJobId, "DEFAULT") != null) {
                if (log.isInfoEnabled()) {
                    log.info("[SCHEDULER] - Unscheduling Job: " + aJobId);
                }

                schedulerFactory.deleteJob(aJobId, "DEFAULT");
            }
        } catch (final SchedulerException e) {
            throw new RuntimeException(
                    "Error unscheduling job with id " + aJobId, e);
        }
    }
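
The same "unschedule, then schedule the changed job" pattern can be tried out without Quartz on the classpath. The sketch below is a JDK-only analogue built on ScheduledExecutorService; it keeps schedules in memory only, so unlike a Quartz job store it does not survive a Tomcat restart. All names are hypothetical and it is not part of the code above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class ReschedulableJobs {
    private final ScheduledExecutorService scheduler =
            Executors.newScheduledThreadPool(1);
    private final Map<String, ScheduledFuture<?>> scheduled = new HashMap<>();

    // Cancels any existing schedule for jobId, then registers the new one.
    // Returns true if an earlier schedule was replaced.
    synchronized boolean reschedule(String jobId, Runnable task, long periodSeconds) {
        ScheduledFuture<?> old = scheduled.remove(jobId);
        boolean replaced = old != null;
        if (replaced) {
            old.cancel(false); // let a run that is in progress finish
        }
        scheduled.put(jobId, scheduler.scheduleAtFixedRate(
                task, periodSeconds, periodSeconds, TimeUnit.SECONDS));
        return replaced;
    }

    synchronized int activeCount() {
        return scheduled.size();
    }

    void shutdown() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        ReschedulableJobs jobs = new ReschedulableJobs();
        jobs.reschedule("user-1-report", () -> System.out.println("report"), 3600);
        // The user changes the schedule: the old entry is cancelled and replaced.
        boolean replaced = jobs.reschedule("user-1-report", () -> System.out.println("report"), 7200);
        System.out.println("replaced existing schedule: " + replaced);
        jobs.shutdown();
    }
}
```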

 
LVL 1

Author Comment

by:Zoniac
ID: 32807227
Hi ChristoferDutz,

First, thanks for sharing the scheduling portion code of your application.

As we discussed, in your case too, to reschedule a job already assigned to the Quartz scheduler, the existing job is first located and deleted along with its associated trigger, and then a new job with a new trigger is created and scheduled.

Am I right?  As you pointed correctly, your case closely matches mine.

Just out of curiosity after reading your code: in the catch block of scheduleJob() you call updateJob(aJob). Can you tell me what this does when scheduling a job throws an exception?
 
LVL 20

Expert Comment

by:ChristoferDutz
ID: 32807289
Oh, this is nothing to worry about. My job definitions are saved using JPA. A user can specify a job, and within it is a boolean field that lets the user activate the job, so only active jobs are scheduled automatically. We had cases where some job definitions caused errors (for example, if the job contains a year setting of 2009). That is why jobs causing errors are deactivated automatically, and to persist that change my updateJob method simply persists the JPA object.
 
LVL 1

Author Comment

by:Zoniac
ID: 32817584
Hi ChristoferDutz,

Thank you for your update on updateJob(aJob).
 
LVL 1

Author Closing Comment

by:Zoniac
ID: 32817627
Solution arrived for my use case with Quartz scheduler framework.
