Solved

How to design email management (send-only) for a multi-JVM environment

Posted on 2007-11-13
Last Modified: 2010-03-30
I am generating email alerts through a timer thread that runs once a day. Now the app is being clustered across two app servers, which leads to duplicate emails. Is there a good mechanism, pattern, or architecture for generating and sending emails in a clustered environment?
Question by:sunilramu
6 Comments
 
LVL 4

Expert Comment

by:jcoombes
ID: 20275209
There's a tonne of different strategies that you could use here:

1.  If it's a standard "shared nothing" cluster, then just place a lock (e.g. a lock file or something similar) on a shared resource that can only be accessed by the active node at any point in time.  Prior to sending a mail, your timer thread should check whether it has access to this resource.   In a standard active/passive cluster arrangement, the clustering software will take care of failing over this resource to the secondary/tertiary nodes, so that your other mail threads automatically acquire ownership if one node fails, thereby ensuring that only one mailer can operate at a time.  What you're basically doing here is creating an external mutex-type object that only one mailer can hold a lock on at a time.

2.  You could use a similar strategy by simulating the same sort of shared lock in a shared resource such as a database.  Each mailer thread periodically updates a shared table row with a timestamp and a random number.   Each mailer queries this table for the highest number (including its own) when it wants to send a mail.   If it doesn't have the highest-numbered entry, then it needs to check that the entries "outranking" it have set a timestamp that is within a timeout threshold.  (Say a minute - but you should make this configurable.)   If the entries that outrank a given mailer have "timed out" then it needs to set its own number to be higher than those that outrank it, and then update its timestamp.   This is basically a variation on what's called a lottery algorithm.   When a failed mailer comes back online, it rolls the dice to generate its rank, and updates its timestamp.

3.  Similar sort of lottery setup, but based on mailer nodes just broadcasting their timestamp/rank information across the subnet.   So instead of storing stuff in a database, just store it on the network so to speak.  You need to think about convergence (which is what needs to happen when a mailer fails) in this scenario because the networking protocol that you cook up might become a bit complicated.  (More so than you want it to be).
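As a rough illustration of option 1, here is a minimal Java sketch of the lock-file check using java.nio file locking. The class name MailerFileLock and the method tryAcquire are made up for this example, and it assumes the lock file lives on storage that every node can reach; it is a sketch of the idea, not a tested cluster setup.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: holds an exclusive lock on a shared file while mail is sent. */
class MailerFileLock implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    private MailerFileLock(FileChannel channel, FileLock lock) {
        this.channel = channel;
        this.lock = lock;
    }

    /** Returns a held lock, or null if another mailer already owns it. */
    static MailerFileLock tryAcquire(Path lockFile) throws IOException {
        FileChannel ch = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock l = null;
        try {
            // Non-blocking: returns null if another process holds the lock.
            l = ch.tryLock();
        } catch (OverlappingFileLockException e) {
            // The lock is already held somewhere inside this same JVM.
        } finally {
            if (l == null) {
                ch.close(); // nothing acquired, so do not leak the channel
            }
        }
        return l == null ? null : new MailerFileLock(ch, l);
    }

    /** Releases the lock so that another mailer can take over. */
    @Override
    public void close() throws IOException {
        lock.release();
        channel.close();
    }
}
```

The timer thread would call tryAcquire before sending and skip the run if it gets null back. Bear in mind that file-lock behaviour is platform dependent, and on some network file systems locks may be advisory or unreliable, so check this against your actual shared storage before relying on it.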

Hope this helps...just a few quick thoughts..


JC
 
LVL 86

Expert Comment

by:CEHJ
ID: 20275242
Not really. You should probably configure the mail server to simply drop mails from one or more IP addresses
 
LVL 4

Expert Comment

by:jcoombes
ID: 20275261
But then the IP address would change depending on which node in the cluster is active - what would happen at the mail server if the mailer service migrated to a different node (with different IP)?

Author Comment

by:sunilramu
ID: 20275342
JC and CEHJ, thanks for your comments. JC, I agree with you. Also, dropping mails from one IP address would not provide any failover. CEHJ, your thoughts?

JC,
can you elaborate on point #2? I believe I understand most of it. Since you describe these as quick thoughts, do you see any drawbacks to using this strategy in future, and is there any online material you could point me to for further reference?

thanks again
 
LVL 4

Accepted Solution

by:
jcoombes earned 500 total points
ID: 20275527
Hi,

Well to elaborate:

1.  Each mailer node (i.e. JVM) updates its mail processing thread so that on each pass it:

2.  Selects a list of (nodeid, timestamp, integer) values from a known, shared database table.  Here the nodeid could be the IP/MAC of the mailer machine.  These should probably be sorted in descending order based on the integer value.

3.  It checks to see if it has already added an entry to this table and, if not, selects a random number within a specific range (which should be larger than the number of nodes), generates a timestamp and adds itself to this table.  It then re-reads the table (in order to refresh its view of the other records).

4.  Starting at the top of the list (highest integer) the mailer node then checks to see if the timestamp for the highest entry is within a given amount of time (a timeout value) of the current time.  This will indicate whether or not the node that is associated with this entry has updated the timestamp recently (i.e. it's still alive).

4a.  If the timestamp is stale, then the node moves on to the next highest entry in the list and performs the same check.  (i.e. go back to 4)

4b.  If the timestamp isn't stale then either:

       i)  The node id is itself, in which case it sends any queued outgoing mail items that need processing.

       ii) The node id is another node, in which case it knows that there is another active node with a higher integer ranking and so lets it do the processing.  (Stops the mail processing loop, and tries again later).

Eventually, the node will either find another node (with a higher integer value than itself) that is responsible for the delivery of mail items, or it will find that it is itself the next available (i.e. next highest) node and so do the processing itself.
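To make steps 4, 4a and 4b concrete, here is a small Java sketch of just the decision logic, working on rows that have already been read from the shared table. NodeEntry, MailerElection and shouldSend are names invented for this example, and the database access (steps 2 and 3) is left out. The tie-break on node id is my own assumption, added so that every node orders equal ranks the same way.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** One row of the hypothetical shared table: (nodeid, timestamp, integer). */
final class NodeEntry {
    final String nodeId;
    final long lastSeenMillis;
    final int rank;

    NodeEntry(String nodeId, long lastSeenMillis, int rank) {
        this.nodeId = nodeId;
        this.lastSeenMillis = lastSeenMillis;
        this.rank = rank;
    }
}

final class MailerElection {
    /**
     * Returns true if selfId is the highest-ranked entry whose timestamp
     * is still within timeoutMillis of nowMillis.
     */
    static boolean shouldSend(List<NodeEntry> rows, String selfId,
                              long nowMillis, long timeoutMillis) {
        List<NodeEntry> sorted = new ArrayList<>(rows);
        // Highest rank first; equal ranks are ordered by node id (assumed tie-break).
        sorted.sort(Comparator.comparingInt((NodeEntry e) -> e.rank).reversed()
                .thenComparing(e -> e.nodeId));
        for (NodeEntry e : sorted) {
            if (nowMillis - e.lastSeenMillis > timeoutMillis) {
                continue; // step 4a: stale entry, move on to the next highest
            }
            // step 4b: the first live entry decides who sends on this pass
            return e.nodeId.equals(selfId);
        }
        return false; // no live entries at all, including our own
    }
}
```

A node would refresh its own row first (step 3), re-read the table, and then only send queued mail when shouldSend returns true for its own node id.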

If you want to load-balance between nodes, then at the end of each sweep (successful or not) you should get each node to re-generate a new random number and update the centralised table.  This should also be done at "node startup" to mix things up a bit.

I think this should work - and I've used similar strategies in the past which have worked given that:

1.  The numerical ranking provides a selection criterion for a node to do the processing.

2.  If all nodes are inactive, the first one up will process backlog.

Collisions (where nodes have the same integer value) shouldn't matter because the ordering returned by the database will effectively decide which entry wins, if you see what I mean.  

Of course, this assumes that the mails to be processed are also centralised and can be queued in some way.


Another *very simple* approach to this is to assume that all nodes are active, and then in your outgoing mail queue, assign each mail item an integer ID.   Then assign each node a distinct value modulo the number of nodes, e.g. if you have 5 nodes, give them the values 0, 1, 2, 3, 4.   On each pass, a node only takes and processes those mail items whose IDs, taken mod #nodes, match its own value.

So for instance if I have 10 messages (numbered 1 to 10) and 4 nodes (0, 1, 2, 3 mod 4 respectively):

Node 1 (0 mod 4) processes messages 4, 8
Node 2 (1 mod 4) processes messages 1, 5, 9
Node 3 (2 mod 4) processes messages 2, 6, 10
Node 4 (3 mod 4) processes messages 3, 7

Different strategy, but could work equally well.  (And it's less complicated)  :0)
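The modulo split can be sketched in a few lines of Java. ModuloPartition, ownedBy and messagesFor are hypothetical names for this example, and it assumes each node knows its own index (0 to nodeCount - 1) and the total number of nodes from configuration.

```java
import java.util.ArrayList;
import java.util.List;

final class ModuloPartition {
    /** True if the mail item with this id belongs to the node with this index. */
    static boolean ownedBy(long messageId, int nodeIndex, int nodeCount) {
        // floorMod keeps the result in 0..nodeCount-1 even for negative ids.
        return Math.floorMod(messageId, (long) nodeCount) == nodeIndex;
    }

    /** Lists the ids in [firstId, lastId] that a given node would process. */
    static List<Long> messagesFor(int nodeIndex, int nodeCount, long firstId, long lastId) {
        List<Long> mine = new ArrayList<>();
        for (long id = firstId; id <= lastId; id++) {
            if (ownedBy(id, nodeIndex, nodeCount)) {
                mine.add(id);
            }
        }
        return mine;
    }
}
```

With 4 nodes and messages 1 to 10, messagesFor(0, 4, 1, 10) gives 4 and 8, matching the example above. In practice the filter would more likely go into the query that reads the outgoing mail queue than into a loop like this.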


Hope this helps


JC
 

Author Comment

by:sunilramu
ID: 20275944
Thanks JC
Question has a verified solution.
