Solved

How to design email management (send only) for a multiple-JVM environment

Posted on 2007-11-13
Last Modified: 2010-03-30
I am generating email alerts through a timer thread that runs once a day. The app is now being clustered across two app servers, which leads to duplicate emails. Is there a good mechanism, pattern, or architecture for generating and sending emails in a clustered environment?
Question by:sunilramu
 
LVL 4

Expert Comment

by:jcoombes
There are a tonne of different strategies that you could use here:

1.  If it's a standard "shared nothing" cluster, then just place a lock resource (e.g. a lock file or something similar) on a shared resource that can only be accessed by the active node at any point in time.  Then, before your timer thread sends a mail, it should check whether it has access to this resource.   In a standard active/passive cluster arrangement, the clustering software will take care of failing over this resource to the secondary/tertiary nodes so that another node's mail thread automatically acquires ownership if one node fails, thereby ensuring that only one mailer can operate at a time.  What you're basically doing here is creating an external mutex-type object that only one mailer can hold a lock on at a time.

2.  You could use a similar strategy by simulating the same sort of shared lock in a shared resource such as a database.  Each mailer thread periodically updates a row in a shared table with a timestamp and a random number.   Each mailer queries this table for the highest number (including its own) when it wants to send a mail.   If it doesn't have the highest-numbered entry, then it needs to check whether the entries "outranking" it have a timestamp that is within a timeout threshold (say a minute, but you should make this configurable).   If the entries that outrank a given mailer have "timed out", then it needs to set its own number to be higher than those that outrank it, and then update its timestamp.   This is basically a variation on what's called a lottery algorithm.   When a failed mailer comes back online, it rolls the dice to generate its rank, and updates its timestamp.

3.  Similar sort of lottery setup, but based on mailer nodes just broadcasting their timestamp/rank information across the subnet.   So instead of storing stuff in a database, just store it on the network so to speak.  You need to think about convergence (which is what needs to happen when a mailer fails) in this scenario because the networking protocol that you cook up might become a bit complicated.  (More so than you want it to be).
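
As a rough illustration of strategy 1, here is a minimal Java sketch that uses an OS-level file lock as the external mutex. The class name NodeLock and the lock file location are hypothetical, and it assumes the lock file sits on storage that every node can reach and that actually honours file locks (many network file systems do not, so test that first). A node calls tryAcquire() at startup and before each send, and keeps the lock for as long as it is alive, so the other node only takes over when the holder goes away.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: the node that holds the file lock is the active mailer.
final class NodeLock implements AutoCloseable {
    private final Path lockFile;
    private FileChannel channel;
    private FileLock lock;

    NodeLock(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** Returns true if this node holds (or has just obtained) the lock. */
    synchronized boolean tryAcquire() {
        if (lock != null && lock.isValid()) {
            return true; // already the active node
        }
        try {
            channel = FileChannel.open(lockFile,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            lock = channel.tryLock(); // null if another process holds the lock
        } catch (OverlappingFileLockException e) {
            lock = null; // held elsewhere inside this same JVM
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (lock == null) {
            close();
            return false;
        }
        return true;
    }

    /** Gives up the lock so that another node can become active. */
    @Override
    public synchronized void close() {
        try {
            if (lock != null && lock.isValid()) {
                lock.release();
            }
            if (channel != null) {
                channel.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            lock = null;
            channel = null;
        }
    }

    public static void main(String[] args) {
        // Assumed location; in a real cluster this would be on the shared resource.
        Path path = Paths.get(System.getProperty("java.io.tmpdir"), "mailer.lock");
        try (NodeLock nodeLock = new NodeLock(path)) {
            if (nodeLock.tryAcquire()) {
                System.out.println("active node: would send the daily alert");
            } else {
                System.out.println("passive node: skipping this run");
            }
        }
    }
}
```

In a real mailer you would create one NodeLock per JVM and leave it open rather than closing it after each run as main does here; if the lock is released after every send, the second node's timer could fire later the same day and send a duplicate.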

Hope this helps...just a few quick thoughts..


JC
 
LVL 86

Expert Comment

by:CEHJ
Not really. You should probably configure the mail server to simply drop mails from one or more IP addresses
 
LVL 4

Expert Comment

by:jcoombes
But then the IP address would change depending on which node in the cluster is active - what would happen at the mail server if the mailer service migrated to a different node (with different IP)?
 

Author Comment

by:sunilramu
JC and CEHJ, thanks for your comments. JC, I agree with you. Also, dropping mails from one IP address would not provide any failover. CEHJ, your thoughts?

JC,
can you elaborate on point #2? I believe I understand most of it. It's just that you mentioned these were quick thoughts: do you see any drawback in using this strategy in future, and is there any online material you could point me to for further reference?

thanks again
 
LVL 4

Accepted Solution

by:
jcoombes earned 500 total points
Hi,

Well to elaborate:

1.  Each mailer node (i.e. JVM) updates its mail processing thread so that on each pass it:

2.  Selects a list of (nodeid, timestamp, integer) values from a known, shared database table.  Here the nodeid could be the IP/MAC of the mailer machine.  These should probably be sorted in descending order based on the integer value.

3.  It checks to see if it has already added an entry to this table and, if not, selects a random number within a specific range (the range should be larger than the number of nodes), generates a timestamp and adds itself to this table.  It then re-reads the table (in order to pick up the other nodes' current records).

4.  Starting at the top of the list (highest integer) the mailer node then checks to see if the time-stamp for the highest entry is within a given amount of time (a timeout value) to the current time.  This will indicate whether or not the node that is associated with this entry has updated the timestamp recently (i.e. it's still alive) or not.

4a.  If the timestamp is stale, then the node moves on to the next highest entry in the list and performs the same check.  (i.e. go back to 4)

4b.  If the timestamp isn't stale then either:

       i)  The node id is itself, in which case it sends any queued outgoing mail items that need processing.

       ii) The node id is another node, in which case it knows that there is another active node with a higher integer ranking and so lets it do the processing.  (Stops the mail processing loop, and tries again later).

Eventually, the node will either find another node (with a higher integer value than itself) that is responsible for the delivery of mail items, or it will find that it is itself the next available (i.e. next highest) node and so do the processing itself.
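
To make steps 2 to 4b concrete, here is a small Java sketch of just the decision part, written as a pure function over rows that have already been read from the shared table. The names MailerElection, Entry and shouldSend are made up for illustration, the database access and the heartbeat update are left out, and the tie-break on node id is my own addition so that every node resolves equal ranks the same way.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the "highest live rank wins" check described above.
final class MailerElection {

    /** One row of the shared table: (nodeid, timestamp, integer). */
    static final class Entry {
        final String nodeId;
        final Instant heartbeat;
        final int rank;

        Entry(String nodeId, Instant heartbeat, int rank) {
            this.nodeId = nodeId;
            this.heartbeat = heartbeat;
            this.rank = rank;
        }
    }

    /**
     * Returns true if the node selfId should send mail on this pass:
     * it must own the highest-ranked entry whose heartbeat is not stale.
     */
    static boolean shouldSend(String selfId, List<Entry> rows,
                              Instant now, Duration timeout) {
        List<Entry> sorted = new ArrayList<>(rows);
        // Step 2: highest integer first; ties broken by node id (an assumption).
        sorted.sort(Comparator.comparingInt((Entry e) -> e.rank).reversed()
                .thenComparing((Entry e) -> e.nodeId));
        for (Entry e : sorted) {
            boolean stale = Duration.between(e.heartbeat, now).compareTo(timeout) > 0;
            if (stale) {
                continue; // step 4a: owner looks dead, try the next entry
            }
            return e.nodeId.equals(selfId); // step 4b: first live entry decides
        }
        return false; // no live entry at all, not even our own
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        List<Entry> rows = new ArrayList<>();
        rows.add(new Entry("node-a", now.minusSeconds(600), 90)); // stale
        rows.add(new Entry("node-b", now.minusSeconds(10), 50));  // live
        rows.add(new Entry("node-c", now.minusSeconds(5), 20));   // live
        Duration timeout = Duration.ofMinutes(1);
        System.out.println("node-a sends: " + shouldSend("node-a", rows, now, timeout)); // false
        System.out.println("node-b sends: " + shouldSend("node-b", rows, now, timeout)); // true
        System.out.println("node-c sends: " + shouldSend("node-c", rows, now, timeout)); // false
    }
}
```

Note that this compares timestamps against the local clock, so it assumes the nodes' clocks are roughly in sync (or that the timestamps all come from the database server).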

If you want to load-balance between nodes, then at the end of each sweep (successful or not) you should get each node to re-generate a new random number and update the centralised table.  This should also be done at "node startup" to mix things up a bit.

I think this should work - and I've used similar strategies in the past which have worked given that:

1.  The numerical ranking provides a selection criterion for a node to do the processing.

2.  If all nodes are inactive, the first one up will process backlog.

Collisions (where nodes have the same integer value) shouldn't matter because the ordering returned by the database will effectively decide which entry wins, if you see what I mean.  

Of course, this assumes that the mails to be processed are also centralised and can be queued in some way.


Another *very simple* approach to this is to assume that all nodes are active, and then in your outgoing mail queue, assign each mail item an integer id.   Then assign each node a distinct value modulo the number of nodes, e.g. if you have 5 nodes, then give them values 0, 1, 2, 3, 4.   Then on each pass, a node only takes and processes those mail items whose ids, taken mod #nodes, match its own value.

So for instance if I have 10 messages (numbered 1 to 10) and 4 nodes (0, 1, 2, 3 mod 4 respectively):

Node 1 (0 mod 4) processes messages 4, 8
Node 2 (1 mod 4) processes messages 1, 5, 9
Node 3 (2 mod 4) processes messages 2, 6, 10
Node 4 (3 mod 4) processes messages 3, 7
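
For what it's worth, the modulo split above boils down to a one-line check in Java. This is only a sketch: ModuloPartition, isMine and select are names I've made up, and it assumes each node is configured with its own index (0 to nodeCount - 1) and the total node count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "id mod #nodes" split described above.
final class ModuloPartition {

    /** True if the mail item with this id belongs to the node with this index. */
    static boolean isMine(long messageId, int nodeIndex, int nodeCount) {
        return Math.floorMod(messageId, (long) nodeCount) == nodeIndex;
    }

    /** Filters the queued ids down to the ones this node should process. */
    static List<Long> select(List<Long> queuedIds, int nodeIndex, int nodeCount) {
        List<Long> mine = new ArrayList<>();
        for (long id : queuedIds) {
            if (isMine(id, nodeIndex, nodeCount)) {
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        List<Long> queue = new ArrayList<>();
        for (long id = 1; id <= 10; id++) {
            queue.add(id);
        }
        int nodeCount = 4;
        for (int index = 0; index < nodeCount; index++) {
            // Prints [4, 8], [1, 5, 9], [2, 6, 10], [3, 7] for indexes 0 to 3.
            System.out.println("index " + index + " -> " + select(queue, index, nodeCount));
        }
    }
}
```

The catch, as with the description above, is that it assumes all nodes are up: if one node is down, the items that map to its index just sit in the queue until it comes back.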

Different strategy, but could work equally well.  (And it's less complicated)  :0)


Hope this helps


JC
 

Author Comment

by:sunilramu
Thanks JC