How to design email management (send only) for a multi-JVM environment?

Posted on 2007-11-13
Last Modified: 2010-03-30
I am generating email alerts through a timer thread that runs once a day. Now the app is being clustered across two app servers, which is leading to duplicate emails. Is there a good mechanism, pattern, or architecture for generating and sending emails in a clustered environment?
Question by:sunilramu

Expert Comment

ID: 20275209
There's a tonne of different strategies that you could use here:

1.  If it's a standard "shared nothing" cluster, then just place a lock resource (e.g. a lock file or something similar) on a shared resource that can only be accessed by the active node at any point in time.  Then, before your timer thread sends a mail, it should check whether it has access to this resource.   In a standard active/passive cluster arrangement, the clustering software will take care of failing over this resource to the secondary/tertiary nodes, so that one of your other mail threads automatically acquires ownership if a node fails, thereby ensuring that only one mailer can operate at a time.  What you're basically doing here is creating an external mutex-type object that only one mailer can hold a lock on at a time.

2.  You could use a similar strategy by simulating the same sort of shared lock in a shared resource such as a database.  Each mailer thread periodically updates a shared table row with a timestamp and a random number.   When a mailer wants to send a mail, it queries this table for the highest number (including its own).   If it doesn't have the highest-numbered entry, then it needs to check that the entries "outranking" it have a timestamp that is within a timeout threshold.  (Say a minute - but you should make this configurable).   If the entries that outrank a given mailer have "timed out", then it needs to set its own number to be higher than those that outrank it, and then update its timestamp.   This is basically a variation on what's called a lottery algorithm.   When a failed mailer comes back online, it rolls the dice to generate its rank, and updates its timestamp.

3.  Similar sort of lottery setup, but based on mailer nodes just broadcasting their timestamp/rank information across the subnet.   So instead of storing stuff in a database, just store it on the network so to speak.  You need to think about convergence (which is what needs to happen when a mailer fails) in this scenario because the networking protocol that you cook up might become a bit complicated.  (More so than you want it to be).
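
As a rough illustration of strategy 1, here is a minimal Java sketch of the "external mutex" idea using a lock file. The class name `SharedLockMailer`, the method `runIfLockHolder`, and the lock file path are all made up for this example, and it assumes the lock file lives on storage that every node can reach and that honours file locks (behaviour on network file systems varies by platform, so treat that as an assumption to verify).

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: runs the mail task only while holding an exclusive file lock.
final class SharedLockMailer {

    /**
     * Tries to lock lockFile and, if successful, runs sendMail.
     * Returns true if the task ran, false if someone else holds the lock.
     */
    static boolean runIfLockHolder(Path lockFile, Runnable sendMail) throws IOException {
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock;
            try {
                // tryLock() returns null when another process holds the lock
                lock = channel.tryLock();
            } catch (OverlappingFileLockException e) {
                // the lock is already held elsewhere inside this same JVM
                return false;
            }
            if (lock == null) {
                return false;
            }
            try {
                sendMail.run();
                return true;
            } finally {
                lock.release();
            }
        }
    }
}
```

The timer thread would call something like `SharedLockMailer.runIfLockHolder(Path.of("/shared/mailer.lock"), task)` on each pass. Note that this sketch releases the lock as soon as the task finishes, so on its own it only stops two nodes sending at the same moment; to stop a second node sending later the same day you would also need the failover behaviour described in strategy 1, or a shared record of the last successful run.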

Hope this helps...just a few quick thoughts..

Expert Comment

ID: 20275242
Not really. You should probably configure the mail server to simply drop mails from one or more IP addresses

Expert Comment

ID: 20275261
But then the IP address would change depending on which node in the cluster is active - what would happen at the mail server if the mailer service migrated to a different node (with different IP)?

Author Comment

ID: 20275342
JC and CEHJ, thanks for your comments. JC, I agree with you. Also, dropping mails from one IP address would not provide any failover. CEHJ, your thoughts?

Can you elaborate on point #2? I believe I understand most of it. Since you mention these are quick thoughts, do you see any drawbacks to using this strategy in the future, and is there any online material you could point me to for further reference?

Thanks again.

Accepted Solution

jcoombes earned 500 total points
ID: 20275527

Well to elaborate:

1.  Each mailer node (i.e. JVM) updates its mail processing thread so that on each pass it:

2.  Selects a list of (nodeid, timestamp, integer) values from a known, shared database table.  Here the nodeid could be the IP/MAC of the mailer machine.  These should probably be sorted in descending order based on the integer value.

3.  It checks to see if it has already added an entry to this table and, if not, selects a random number within a specific range (the range should be larger than the number of nodes), generates a timestamp and adds itself to this table.  It then re-reads the table (so that its view of the other nodes' records is up to date).

4.  Starting at the top of the list (highest integer), the mailer node then checks to see if the timestamp for the highest entry is within a given amount of time (a timeout value) of the current time.  This will indicate whether or not the node that is associated with this entry has updated the timestamp recently (i.e. it's still alive).

4a.  If the timestamp is stale, then the node moves on to the next highest entry in the list and performs the same check.  (i.e. go back to 4)

4b.  If the timestamp isn't stale then either:

       i)  The node id is itself, in which case it sends any queued outgoing mail items that need processing.

       ii) The node id is another node, in which case it knows that there is another active node with a higher integer ranking and so lets it do the processing.  (Stops the mail processing loop, and tries again later).

Eventually, the node will either find another node (with a higher integer value than itself) that is responsible for the delivery of mail items, or it will find that it is itself the next available (i.e. next highest) node and so do the processing itself.
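
To make steps 4, 4a and 4b concrete, here is a minimal Java sketch of just the decision logic, working on rows that have already been read from the shared table. Everything named here (`MailerElection`, `Entry`, `shouldSend`) is hypothetical, the database access is left out, and the tie-break on node id is an assumption added so that the in-memory ordering is deterministic. It also assumes the calling node has refreshed its own timestamp before checking.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "highest live rank wins" check.
final class MailerElection {

    /** One row of the shared table: (nodeid, timestamp, integer). */
    static final class Entry {
        final String nodeId;
        final long timestampMillis;
        final int rank;

        Entry(String nodeId, long timestampMillis, int rank) {
            this.nodeId = nodeId;
            this.timestampMillis = timestampMillis;
            this.rank = rank;
        }
    }

    /**
     * Walks the rows from highest rank down, skipping stale entries,
     * and returns true only if the first live entry belongs to selfId.
     */
    static boolean shouldSend(List<Entry> rows, String selfId,
                              long nowMillis, long timeoutMillis) {
        List<Entry> sorted = new ArrayList<>(rows);
        // Highest rank first; equal ranks are ordered by node id (an assumption).
        sorted.sort((a, b) -> a.rank != b.rank
                ? Integer.compare(b.rank, a.rank)
                : a.nodeId.compareTo(b.nodeId));
        for (Entry e : sorted) {
            boolean stale = nowMillis - e.timestampMillis > timeoutMillis;
            if (stale) {
                continue; // step 4a: move on to the next highest entry
            }
            return e.nodeId.equals(selfId); // step 4b: first live entry decides
        }
        return false; // no live entries at all, including our own
    }
}
```

Each node would call `shouldSend` with its own id on every pass and only process the mail queue when it returns true.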

If you want to load-balance between nodes, then at the end of each sweep (successful or not) you should get each node to re-generate a new random number and update the centralised table.  This should also be done at "node startup" to mix things up a bit.

I think this should work - and I've used similar strategies in the past which have worked given that:

1.  The numerical ranking provides a selection criterion for a node to do the processing.

2.  If all nodes are inactive, the first one up will process backlog.

Collisions (where nodes have the same integer value) shouldn't matter because the ordering returned by the database will effectively decide which entry wins, if you see what I mean.  

Of course, this assumes that the mails to be processed are also centralised and can be queued in some way.

Another *very simple* approach to this is to assume that all nodes are active, and then in your outgoing mail queue, assign each mail item an integer id.   Then, assign each node a distinct value from 0 to (number of nodes - 1), e.g. if you have 5 nodes, then give them values 0, 1, 2, 3, 4.   Then on each pass, a node only takes and processes those mail items whose id, taken mod the number of nodes, equals the node's own value.

So for instance if I have 10 messages (numbered 1 to 10) and 4 nodes (0, 1, 2, 3 mod 4 respectively):

Node 1 (0 mod 4) processes messages 4, 8
Node 2 (1 mod 4) processes messages 1, 5, 9
Node 3 (2 mod 4) processes messages 2, 6, 10
Node 4 (3 mod 4) processes messages 3, 7
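
The same split can be sketched in a few lines of Java. The names `ModuloPartition` and `idsForNode` are made up for illustration, and the sketch assumes every node knows its own index and the total node count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each node picks only the mail ids that map to its index.
final class ModuloPartition {

    /** Returns the ids from messageIds that the node with nodeIndex should process. */
    static List<Integer> idsForNode(List<Integer> messageIds, int nodeIndex, int nodeCount) {
        List<Integer> mine = new ArrayList<>();
        for (int id : messageIds) {
            // floorMod keeps the result in [0, nodeCount) even for negative ids
            if (Math.floorMod(id, nodeCount) == nodeIndex) {
                mine.add(id);
            }
        }
        return mine;
    }
}
```

With ids 1 to 10 and `nodeCount` 4, calling `idsForNode` with `nodeIndex` 0, 1, 2 and 3 gives the four groups listed above. Bear in mind that if a node is down, its share of the queue simply waits, since this approach assumes all nodes are active.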

Different strategy, but could work equally well.  (And it's less complicated)  :0)

Hope this helps


Author Comment

ID: 20275944
Thanks JC
