

How to design email management (send only) for a multiple-JVM environment

Posted on 2007-11-13
Medium Priority
Last Modified: 2010-03-30
I am generating email alerts through a timer thread which runs once a day. Now the app is being clustered across two app servers, which is leading to duplicate emails. Is there any good mechanism/pattern/architecture for generating and sending emails in a clustered environment?
Question by:sunilramu

Expert Comment

ID: 20275209
There's a tonne of different strategies that you could use here:

1.  If it's a standard "shared nothing" cluster, then just place a lock (e.g. a lock file or something similar) on a shared resource that can only be accessed by the active node at any point in time.  Before your timer thread sends a mail, it should check whether it has access to this resource.  In a standard active/passive cluster arrangement, the clustering software will take care of failing this resource over to the secondary/tertiary nodes, so that another node's mail thread automatically acquires ownership if the active node fails, thereby ensuring that only one mailer can operate at a time.  What you're basically doing here is creating an external mutex that only one mailer can hold at a time.

2.  You could use a similar strategy by simulating the same sort of shared lock in a shared resource such as a database.  Each mailer thread periodically updates its own row in a shared table with a timestamp and a random number.  When a mailer wants to send a mail, it queries this table for the highest-numbered entry (including its own).  If it doesn't have the highest-numbered entry, it needs to check whether the entries "outranking" it have a timestamp that is within a timeout threshold.  (Say a minute - but you should make this configurable.)  If all the entries that outrank a given mailer have "timed out", then it needs to set its own number to be higher than theirs, and then update its timestamp.  This is basically a variation on what's called a lottery algorithm.  When a failed mailer comes back online, it rolls the dice to generate its rank, and updates its timestamp.

3.  A similar sort of lottery setup, but based on mailer nodes broadcasting their timestamp/rank information across the subnet.  So instead of storing the state in a database, you store it on the network, so to speak.  You need to think about convergence (which is what has to happen when a mailer fails) in this scenario, because the networking protocol that you cook up might become more complicated than you want it to be.
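
As a rough illustration of strategy 1, here is a minimal Java sketch of the "check the lock before sending" step.  The class and method names (MailerLock, runIfLockHeld) are made up for this example, and it assumes the lock file lives on storage where the operating system actually enforces file locks across machines; that is platform-dependent and often unreliable on network filesystems, so treat it as a sketch and not a drop-in solution.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: runs the mail job only while holding an exclusive file lock.
final class MailerLock {

    /**
     * Tries to take an exclusive lock on lockFile without blocking.
     * Returns true if the lock was obtained and the job ran, false otherwise.
     */
    static boolean runIfLockHeld(Path lockFile, Runnable job) throws IOException {
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock;
            try {
                lock = channel.tryLock();   // returns null if another process holds it
            } catch (OverlappingFileLockException e) {
                return false;               // already locked elsewhere in this JVM
            }
            if (lock == null) {
                return false;               // another node is the active mailer
            }
            try {
                job.run();                  // e.g. send the daily alert mails
                return true;
            } finally {
                lock.release();
            }
        }
    }
}
```

The timer thread would call something like `MailerLock.runIfLockHeld(sharedPath, this::sendAlerts)` once a day, where `sharedPath` and `sendAlerts` stand in for your own lock location and mail routine.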

Hope this helps...just a few quick thoughts..


Expert Comment

ID: 20275242
Not really. You should probably configure the mail server to simply drop mails from one or more IP addresses

Expert Comment

ID: 20275261
But then the IP address would change depending on which node in the cluster is active - what would happen at the mail server if the mailer service migrated to a different node (with different IP)?



Author Comment

ID: 20275342
JC and CEHJ, thanks for your comments. JC, I agree with you. Also, dropping mails from one IP address would not provide any failover. CEHJ, your thoughts?

Can you elaborate on point #2? I believe I understand most of it. It's just that you mention these are quick thoughts: is there any drawback you foresee with this strategy, and is there any online material you could point me to for further reference?

thanks again

Accepted Solution

jcoombes earned 2000 total points
ID: 20275527

Well to elaborate:

1.  Each mailer node (i.e. JVM) updates its mail processing thread so that on each pass it does the following:

2.  Selects a list of (nodeid, timestamp, integer) values from a known, shared database table.  Here the nodeid could be the IP/MAC of the mailer machine.  These should probably be sorted in descending order based on the integer value.

3.  It checks to see if it has already added an entry to this table and, if not, selects a random number within a specific range (which should be larger than the number of nodes), generates a timestamp and adds itself to this table.  It then re-reads the table (in order to pick up the latest records from the other nodes).

4.  Starting at the top of the list (highest integer), the mailer node then checks to see if the timestamp for the highest entry is within a given amount of time (a timeout value) of the current time.  This will indicate whether or not the node that is associated with this entry has updated the timestamp recently (i.e. it's still alive).

4a.  If the timestamp is stale, then the node moves on to the next highest entry in the list and performs the same check.  (i.e. go back to 4)

4b.  If the timestamp isn't stale then either:

       i)  The node id is itself, in which case it sends any queued outgoing mail items that need processing.

       ii) The node id is another node, in which case it knows that there is another active node with a higher integer ranking and so lets it do the processing.  (Stops the mail processing loop, and tries again later).

Eventually, the node will either find another live node (with a higher integer value than itself) that is responsible for the delivery of mail items, or it will find that it is itself the next available (i.e. next highest) node and so do the processing itself.
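
The selection in step 4 can be sketched in Java roughly as follows.  This is only an in-memory illustration of the ranking logic: the names MailerElection, Entry, electActive and shouldSend are invented for the example, the rows are assumed to have already been read from the shared table, and the tie-break on node id is an added assumption so that every node picks the same winner when two ranks collide.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the rank/timestamp check; database access is left out.
final class MailerElection {

    /** One row of the shared table: (nodeid, timestamp, integer). */
    static final class Entry {
        final String nodeId;
        final long lastSeenMillis;
        final int rank;

        Entry(String nodeId, long lastSeenMillis, int rank) {
            this.nodeId = nodeId;
            this.lastSeenMillis = lastSeenMillis;
            this.rank = rank;
        }
    }

    /** Returns the id of the highest-ranked node with a fresh timestamp, or null if none. */
    static String electActive(List<Entry> rows, long nowMillis, long timeoutMillis) {
        List<Entry> sorted = new ArrayList<>(rows);
        // Highest rank first; equal ranks ordered by node id (an assumption, see above).
        sorted.sort(Comparator.comparingInt((Entry e) -> e.rank).reversed()
                .thenComparing((Entry e) -> e.nodeId));
        for (Entry e : sorted) {
            if (nowMillis - e.lastSeenMillis <= timeoutMillis) {
                return e.nodeId;    // step 4b: first entry that is not stale wins
            }
            // step 4a: stale entry, move on to the next highest
        }
        return null;
    }

    /** True if this node is the one that should send mail on this pass. */
    static boolean shouldSend(String selfId, List<Entry> rows, long nowMillis, long timeoutMillis) {
        return selfId.equals(electActive(rows, nowMillis, timeoutMillis));
    }
}
```

Each node would refresh its own row, re-read the table, and then only process the mail queue when `shouldSend` returns true for its own id.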

If you want to load-balance between nodes, then at the end of each sweep (successful or not) you should get each node to re-generate a new random number and update the centralised table.  This should also be done at "node startup" to mix things up a bit.

I think this should work - and I've used similar strategies in the past which have worked given that:

1.  The numerical ranking provides a selection criterion for a node to do the processing.

2.  If all nodes are inactive, the first one up will process backlog.

Collisions (where nodes have the same integer value) shouldn't matter because the ordering returned by the database will effectively decide which entry wins, if you see what I mean.  

Of course, this assumes that the mails to be processed are also centralised and can be queued in some way.

Another *very simple* approach to this is to assume that all nodes are active, and then in your outgoing mail queue, assign each mail item an integer id.   Then assign each node a distinct value modulo the number of nodes, e.g. if you have 5 nodes, give them the values 0, 1, 2, 3, 4.   Then on each pass, a node only takes and processes those mail items whose id, taken mod the number of nodes, equals its assigned value.

So for instance if I have 10 messages (numbered 1 to 10) and 4 nodes (0, 1, 2, 3 mod 4 respectively):

Node 1 (0 mod 4) processes messages 4, 8
Node 2 (1 mod 4) processes messages 1, 5, 9
Node 3 (2 mod 4) processes messages 2, 6, 10
Node 4 (3 mod 4) processes messages 3, 7
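
For what it's worth, the same split can be sketched in a few lines of Java.  The names here (ModuloPartition, ownedBy, messagesFor) are made up for illustration, and the sketch assumes each node knows its own index and the total node count from configuration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each node claims the mail items whose id matches its index mod nodeCount.
final class ModuloPartition {

    /** True if the message with this id belongs to the node with this index. */
    static boolean ownedBy(long messageId, int nodeIndex, int nodeCount) {
        // floorMod keeps the result in 0..nodeCount-1 even for negative ids
        return Math.floorMod(messageId, nodeCount) == nodeIndex;
    }

    /** Filters the queued message ids down to the ones this node should process. */
    static List<Long> messagesFor(int nodeIndex, int nodeCount, List<Long> queuedIds) {
        List<Long> mine = new ArrayList<>();
        for (long id : queuedIds) {
            if (ownedBy(id, nodeIndex, nodeCount)) {
                mine.add(id);
            }
        }
        return mine;
    }
}
```

With ids 1 to 10 and 4 nodes, `messagesFor(0, 4, ids)` gives 4 and 8, matching the first row of the table above.  Note that this relies on the "all nodes are active" assumption: if a node is down, nothing else picks up its share of the messages.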

Different strategy, but could work equally well.  (And it's less complicated)  :0)

Hope this helps


Author Comment

ID: 20275944
Thanks JC
