How to design email management (send only) for multiple jvm environment.

I am generating email alerts through a timer thread which runs once a day. now the app is being clustered into two app servers, which is leading to duplicate emails. Is there any good mechanism/pattern/architecture to generate and send emails on a clustered environment.
Who is Participating?

Well to elaborate:

1.  Each mailer node (i.e. JVM) updates it's mail processing thread so that on each pass it:

2.  Selects a list of (nodeid, timestamp, integer) values from a known, shared database table.  Here the nodeid could be the IP/MAC of the mailer machine.  These should probably be sorted in descending order based on the integer value.

3.  It checks to see if it has already added an entry to this table and if not, selects a random number within a specific range (should be more than the number of nodes), generates a timestamp and adds itself into this table.  It then re-reads the table (in order to update the other records).

4.  Starting at the top of the list (highest integer) the mailer node then checks to see if the time-stamp for the highest entry is within a given amount of time (a timeout value) to the current time.  This will indicate whether or not the node that is associated with this entry has updated the timestamp recently (i.e. it's still alive) or not.

4a.  If the timestamp is stale, then the node moves on to the next highest entry in the list and performs the same check.  (i.e. go back to 4)

4b.  If the timestamp isn't stale then either:

       i)  The node id is itself, in which case it sends any queued outgoing mail items that need processing.

       ii) The node id is another node, in which case it knows that there is another active node with a higher integer ranking and so lets it do the processing.  (Stops the mail processing loop, and tries again later).

Eventually, the node will either find another node (with a higher integer value than itself) that is responsible for the delivery of mail items, or it will find that itself is the next available (i.e. next highest) node and so do the processing itself.

If you want to load-balance between nodes, then at the end of each sweep (successful or not) you should get each node to re-generate a new random number and update the centralised table.  This should also be done at "node startup" to mix things up a bit.

I think this should work - and I've used similar strategies in the past which have worked given that:

1.  The numerical ranking provides a selection criteria for a node to do the processing.

2.  If all nodes are inactive, the first one up will process backlog.

Collisions (where nodes have the same integer value) shouldn't matter because the ordering returned by the database will effectively decide which entry wins, if you see what I mean.  

Of course, this assumes that the mails to be processed are also centralised and can be queued in some way.

Another *very simple* approach to this is to assume that all nodes are active, and then in your outgoing mail queue, assign each mail item an integer id.   Then, assign each node a value modulo number of nodes, e.g. if you have 5 nodes, then give them values 0, 1, 2, 3, 4.   Then on each pass, a node only takes and processes those mail items whose id's match their modulo value taken mod #nodes.

So for instance if I have 10 messages (numbered 1 to 10) and 4 nodes (0, 1, 2, 3 mod 4 respectively):

Node 1 (0 mod 4) processes messages 4, 8
Node 2 (1 mod 4) processes messages 1, 5, 9
Node 3 (2 mod 4) processes messages 2, 6, 10
Node 4 (3 mod 4) processes messages 3, 7

Different strategy, but could work equally well.  (And it's less complicated)  :0)

Hope this helps

There's a tonne of different strategies that you could use here:

1.  If it's a standard "shared nothing" cluster, then just place a lock resource (i.e. a lock file or something similar) on a shared resource that can only be accessed by the active node at any point in time.  Then prior to your timer thread sending a mail it should check to see whether or not it has access to this resource.   In a standard active/passive cluster arrangement, the clustering software will take care of failing over this resource to the secondary/ternary nodes so that your other mail threads automatically acquire ownership if one node fails, thereby ensuring that only one mailer can operate at a time.  What you're basically doing here is creating an external mutex type object that only one mailer can acquire a lock on at a time.

2.  You could use a similar strategy by basically simulating the same sort of shared lock, but in a shared resource such as a database.  Basically each mailer thread also has to periodically update a shared table row with a timestamp and a random number.   Each mailer queries this table for the highest number when it wants to send a mail (including it's own).   If it doesn't have the highest numbered entry, then it needs to check that the those entries "outranking" it have set a timestamp that is within a timeout threshold.  (Say a minute - but you should make this configurable).   If entries that outrank a given mailer have "timed out" then it needs to set it's own number to be higher than those that outrank it, and then update it's timestamp.   This is basically a variation on what's called a lottery algorithm.   When a failed mailer comes back online, it rolls the dice to generate it's rank, and updates it's timestamp.

3.  Similar sort of lottery setup, but based on mailer nodes just broadcasting their timestamp/rank information across the subnet.   So instead of storing stuff in a database, just store it on the network so to speak.  You need to think about convergence (which is what needs to happen when a mailer fails) in this scenario because the networking protocol that you cook up might become a bit complicated.  (More so than you want it to be).

Hope this helps...just a few quick thoughts..

Not really. You should probably configure the mail server to simply drop mails from one or more IP addresses
Upgrade your Question Security!

Your question, your audience. Choose who sees your identity—and your question—with question security.

But then the IP address would change depending on which node in the cluster is active - what would happen at the mail server if the mailer service migrated to a different node (with different IP)?
sunilramuAuthor Commented:
JC and CEHJ thakns for your comments. JC i agree with you. also, dropping mails from one ip address would not provide any failover. CHEJ, your thoughts?

can you elaborate on #2 point. i believe i understand most of it. its just that you mention quick thoughts, is there any drawback you see in future using this statergy, and are there any online material you could refer me to so for further reference.

thanks again
sunilramuAuthor Commented:
Thanks JC
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.