How to troubleshoot stuck JMS queue

We have a messaging application. We recently converted it "back" to using SQL-backed JMS queues. We had this problem before and switched to non-persistent in-memory queues to eliminate it. I now need persistence, so I am back to trying to solve this problem.

Typically the messages we serve are SMS messages. Messages come into the system, parameters are retrieved from an LDAP database, and then a reply is sent back.

Eventually the replies get stuck. I then have to restart the application; thankfully the messages are persistent, so processing resumes.

When the queues get stuck there are no errors that I can detect, and I have logging at a pretty high level. The queues never unstick themselves no matter how much time passes, so I have to do the restart.

We are using Mule 1.4.4, ActiveMQ 5.4.2 and Java 1.6.26

I am desperate and need some advice on how to troubleshoot this. Needless to say, we HAVE tried many things, like tuning the queues and forcing them all to be vm queues.

Looking forward to some advice!  
Asked by Michael Sole (Director of Support)

mccarl (IT Business Systems Analyst / Software Developer) commented:
> there are no errors that I can detect and I have logging at a pretty high level

Is this application logging that you are referring to? Have you looked at the ActiveMQ logs?

Also, you mention "sql backed" JMS queues, have you tried other persistence options, eg. Kaha?

Michael Sole (Director of Support, author) commented:
Sorry, we are using KahaDB, and yes, I am referring to the ActiveMQ logs (Mule and app as well). We have everything piped into one log.

And I believe we are using embedded ActiveMQ, so it's not the full-on installation.

mccarl (IT Business Systems Analyst / Software Developer) commented:
Oh ok. My initial thought would be that it is 'Producer Flow Control' that is kicking in and stopping your app from putting messages on the queue, although I would have thought that that would produce some sort of logging. Can you see if this is enabled or not in the ActiveMQ config (note that it may be a default setting, so it may not be explicitly enabled) and try disabling it and see if you get exceptions in your app?

Also, what type of cursor do you currently have set up for the queue that the replies are going to? And what other queues have you tried?

Are you monitoring the queues while your app is running using something like the web interface (don't know if you can do this if you are using it as embedded) or via JMX? You might want to try firing ActiveMQ up as a standalone process; it may make monitoring things a little bit easier! (I have always run our ActiveMQ instances this way, not embedded, so that is where I am coming from with my ideas for troubleshooting.)

Michael Sole (Director of Support, author) commented:
Answers below:

> My initial thought would be that it is 'Producer Flow Control' that is kicking in and stopping your app from putting messages on the queue, although I would have thought that that would produce some sort of logging. Can you see if this is enabled or not in the ActiveMQ config (note that it may be a default setting, so it may not be explicitly enabled) and try disabling it and see if you get exceptions in your app?

This is the current producer flow control configuration. We still get a sticky queue.
<policyEntry queue=">" producerFlowControl="false" memoryLimit="1mb" useCache="false">

> Also, what type of cursor do you currently have set up for the queue that the replies are going to?

We are using persistent queues and the default ActiveMQ 5.0 destination policy. We haven't configured any destination policies.

> And what other queues have you tried?

Not sure what you mean. Please explain a little further.

> Are you monitoring the queues while your app is running using something like the web interface (don't know if you can do this if you are using it as embedded) or via JMX?

We are using the JMX console and are able to actively monitor the queues. The strange thing is that when the queues are stuck, it never shows up in the monitor as I thought it would. The monitor shows the stuck queue as being empty.


----

I am also checking whether this is a server-specific issue; we are testing that now. If it turns out not to be, I will try running ActiveMQ non-embedded. In the meantime, please see if any of the info above illuminates something. Thank you very much for your responses :)

mccarl (IT Business Systems Analyst / Software Developer) commented:
Hi skione,

Sorry for the delay, it was the weekend here and I hadn't a chance to reply to you.

The configuration to disable producerFlowControl 'looks' OK, but perhaps some small part of the syntax (or the location of the attribute) is not quite right. I thought that ActiveMQ validates the configuration against its XML Schema definition, but maybe that doesn't happen when it is embedded. Validation should definitely determine whether that config is OK or not, but if it isn't being validated and there is an error, perhaps the attempt to disable flow control is being silently disregarded. *shrugs* Just some thoughts!

> We are using persistent queues and the default ActiveMQ 5.0 destination policy

By cursors, I meant: which one of vmCursor, fileCursor or storeCursor are you using? I noted that you mentioned vmCursor in your original post. From my experience, vmCursors can be faster but are more likely to trigger things like flow control that result in things getting stuck. Currently we are using fileCursors, and although things can slow down a little when we get a high number of pending messages, we no longer reach a point where things get 'stuck' or crash, and everything just runs smoother.

> The monitor shows the stuck queue as being empty

From what I understand of your problem, you have at least 2 queues, one to receive messages on and one to send the replies out. So, you say that one of the queues is empty, but what is the state of the other? It may sound illogical but what I found is that the problem doesn't always look like it is where it should be, if you get what I mean.
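
To compare the state of both queues from inside the application, a small JMX poller is one option. The following is only a minimal sketch and not something taken from this thread. It assumes the embedded broker has JMX enabled and registers its MBeans with the JVM's platform MBeanServer, and that queue MBeans are named along the lines of `org.apache.activemq:BrokerName=...,Type=Queue,Destination=...`, which I believe is the 5.4.x convention (newer releases use different keys, so adjust the pattern to whatever your JMX console shows). The class name `QueueSnapshot` and the attribute list are illustrative choices; any attribute a given MBean does not expose is printed as `n/a`.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class QueueSnapshot {
    // Assumption: ActiveMQ 5.4.x style MBean names; check your JMX console.
    static final String QUEUE_PATTERN = "org.apache.activemq:Type=Queue,*";

    // Assumption: attribute names as listed for queue MBeans in the JMX console.
    static final String[] QUEUE_ATTRIBUTES = {
        "QueueSize", "EnqueueCount", "DequeueCount",
        "InFlightCount", "ConsumerCount", "MemoryPercentUsage"
    };

    /** Returns one line per MBean matching the pattern, listing the requested attributes. */
    static String describe(MBeanServerConnection conn, String pattern, String[] attributes) {
        StringBuilder out = new StringBuilder();
        try {
            // Sort the names so that successive snapshots are easy to compare.
            Set<ObjectName> names =
                new TreeSet<ObjectName>(conn.queryNames(new ObjectName(pattern), null));
            for (ObjectName name : names) {
                out.append(name.getCanonicalName());
                for (String attribute : attributes) {
                    Object value;
                    try {
                        value = conn.getAttribute(name, attribute);
                    } catch (Exception e) {
                        value = "n/a"; // attribute missing or unreadable on this MBean
                    }
                    out.append(' ').append(attribute).append('=').append(value);
                }
                out.append('\n');
            }
        } catch (Exception e) {
            throw new IllegalStateException("JMX query failed for " + pattern, e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints nothing unless a broker is registered in this same JVM.
        System.out.print(describe(ManagementFactory.getPlatformMBeanServer(),
                                  QUEUE_PATTERN, QUEUE_ATTRIBUTES));
    }
}
```

Calling `describe` every few seconds from a timer thread inside the application and logging the result would show whether one queue's enqueue count keeps rising while the other queue's dequeue count has stopped moving.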

Here is a scenario that I think might be happening (it is similar to one we had). You are receiving messages on queue A, processing them and posting replies to queue B. This is done transactionally, so that if the reply doesn't successfully get to queue B, the receive is rolled back on queue A and will eventually be retried. The messages in queue A are all being stored in memory (generally what happens when using the vmCursor, I think), and ActiveMQ has a finite amount of memory that it can use. Generally things work fine because messages are only relatively short-lived in ActiveMQ. But if the number of pending messages on incoming queue A goes up, the memory usage goes up. Eventually there is no memory left to post the reply, so that fails, and the message gets rolled back on queue A, which *doesn't* free up any memory, so it can't go any further.
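
The scenario above can be illustrated with a toy model. This is a deliberately simplified sketch, not how ActiveMQ accounts for memory: the hypothetical `StuckQueueModel` counts memory in whole messages rather than bytes, treats queues A and B as sharing one limit, and assumes replies on B are consumed immediately once posted. It only shows why a rollback that frees nothing can leave the system stuck forever.

```java
public class StuckQueueModel {
    private final int memoryLimit; // total messages the model broker can hold
    private int pendingOnA;        // requests waiting on queue A (held in memory)
    private int pendingOnB;        // replies waiting on queue B (held in memory)
    private int delivered;         // replies that reached their consumer

    StuckQueueModel(int memoryLimit, int pendingOnA) {
        this.memoryLimit = memoryLimit;
        this.pendingOnA = pendingOnA;
    }

    /** One transaction: take a request from A and post its reply to B. */
    boolean processOne() {
        if (pendingOnA == 0) {
            return false; // nothing to do
        }
        if (pendingOnA + pendingOnB >= memoryLimit) {
            // No room for the reply: roll back. The request stays on A,
            // so memory usage is unchanged and the next attempt fails too.
            return false;
        }
        pendingOnB++; // reply stored on B
        pendingOnA--; // commit removes the request from A
        return true;
    }

    /** The reply consumer drains queue B, freeing its memory. */
    void drainReplies() {
        delivered += pendingOnB;
        pendingOnB = 0;
    }

    /** Runs a number of processing attempts and returns how many replies were delivered. */
    static int run(int memoryLimit, int backlogOnA, int attempts) {
        StuckQueueModel model = new StuckQueueModel(memoryLimit, backlogOnA);
        for (int i = 0; i < attempts; i++) {
            if (model.processOne()) {
                model.drainReplies();
            }
        }
        return model.delivered;
    }

    public static void main(String[] args) {
        System.out.println("limit=10 backlog=9 delivered=" + run(10, 9, 20));   // delivered=9
        System.out.println("limit=10 backlog=10 delivered=" + run(10, 10, 20)); // delivered=0
    }
}
```

With one free slot the backlog drains completely; once the backlog fills the limit, every attempt rolls back and nothing is ever delivered, however many times it is retried.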

I still would have thought that you would see an error about not being able to post the reply, or from ActiveMQ filling its memory allocation, but maybe that gives you a clue about where to look. So, yeah, I would recommend trying a different cursor type (e.g. fileCursor), and running ActiveMQ as a separate server process might highlight configuration issues or hidden exception messages.

Hope this helps, let us know how you go!

Michael Sole (Director of Support, author) commented:
I am awarding you the points because your suggestion of using file cursors helped illuminate more details.

Thank you!

mccarl (IT Business Systems Analyst / Software Developer) commented:
Not a problem! Also, if you did end up finding the real issue, or you do find it in the near future, it would be great if you could post some info here, just to help others if they ever stumble upon this question at a later date!

Michael Sole (Director of Support, author) commented:
Specifically, using file cursors allows us to "see" a stuck queue, and that in and of itself was the essence of my problem. While there may end up being a further root cause or more questions, that specifically answered the one I asked here.

Happy holidays!