Question

MQSeries Queue Manager rejecting connection with 2009 and 2058

Asked by: Harmsy2008

Hi

I have a multi-threaded Java MQSeries application which runs OK most of the time. For throughput reasons, the application can be configured so that many threads can be run to read messages from a remote Queue Manager. Typically, four threads are configured to run.

There are two problems. Sometimes, after the application has been restarted, one or two of the threads will successfully connect to the remote Queue Manager, but the others will fail with either a 2009 or a 2058 error, i.e.

    MQJE001: An MQException occurred: Completion Code 2, Reason 2009
    MQJE016: MQ queue manager closed channel immediately during connect
    Closure reason = 2009

and

    MQJE001: An MQException occurred: Completion Code 2, Reason 2058
    MQJE036: Queue manager rejected connection attempt

The code has retry logic, so the connection is retried five minutes later (the code retries five times in total before giving up). Sometimes the code recovers. Sometimes a connection that had failed with a 2009 fails with a 2058 on the retry. Sometimes a retry after a 2058 fails again with a 2058.
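
As a rough illustration of the retry policy described above (a bounded number of attempts with a fixed pause between them), here is a minimal Java sketch. `BoundedRetry` and `Outcome` are hypothetical names, not part of the MQ classes, and the action passed in is assumed to wrap the real connect call.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of the MQ classes): bounded retry with a fixed pause. */
final class BoundedRetry {

    /** What happened: the value on success, the attempt count, and the last failure. */
    static final class Outcome<T> {
        final T value;             // null when every attempt failed
        final int attempts;        // number of attempts actually made
        final Exception lastError; // null on success

        Outcome(T value, int attempts, Exception lastError) {
            this.value = value;
            this.attempts = attempts;
            this.lastError = lastError;
        }
    }

    /**
     * Calls the action up to maxAttempts times, sleeping delayMillis after each
     * failed attempt. The action is assumed to wrap the real connect call,
     * for example () -> new MQQueueManager(queueManagerName).
     */
    static <T> Outcome<T> run(Callable<T> action, int maxAttempts, long delayMillis) {
        Exception last = null;
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            try {
                return new Outcome<>(action.call(), attempts, null);
            } catch (Exception e) {
                last = e; // keep the most recent failure for diagnostics
            }
            if (attempts < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop retrying
                    break;
                }
            }
        }
        return new Outcome<>(null, attempts, last);
    }
}
```

With the values from the post this would be called with `maxAttempts` of 5 and `delayMillis` of `5 * 60 * 1000L`. Keeping the last exception makes it easier to see whether the final failure was a 2009 or a 2058.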

The code is quite simple. The queue manager name, host name, channel and port are all configuration parameters.

    MQEnvironment.hostname = hostname;
    MQEnvironment.channel = channel;
    MQEnvironment.port = port;
    MQQueueManager queueMan = new MQQueueManager(queueManagerName);

The problem sometimes occurs after the application has started ok, and been running for a while. Due to a firewall terminating the connection, a 2009 error is incurred. On trying to reconnect to the remote Queue Manager, a 2058 is incurred.

One thought I have is that the previous incarnation of the program has left some resources open somewhere in the MQSeries environment. When it is restarted (new Java VM started), the MQSeries node rejects it with a 2058, which is misleading. The same would apply when the code is trying to recover from a 2009 error.

A slight complication is that the application has this problem at a client's site, so it's not easy for me to reproduce personally. The client gets this every couple of days.

Regards
Pat

This question has been solved and verified by the asker.

Asked on: 2008-03-20 at 20:34:06 (ID 23259076)
Tags: MQSeries, 2058, 2009, queue, manager, connection, rejected
Topics: Message Queue, Java Programming Language, Java Standard Edition
Participating experts: 2
Points: 500
Comments: 25

Answers

 

by: HonorGod, posted on 2008-03-21 at 06:46:04, ID: 21179814

What version of MQ is being used?

When you say "... after the application has been restarted..." does that mean
the Application Server hasn't been restarted, just the application?

If so, what commands/tool/technique was used to "restart" the application?

The possibility of another instance of the application remaining around is slight,
but I guess that it is possible that some thread had a resource still marked as
"in use" somewhere.

 

by: Harmsy2008, posted on 2008-03-21 at 07:57:03, ID: 21180328

The client is MQ 5.3, the remote server is 6.

The application is a vanilla POJ program. It is not running in a "container". In the actual configuration, the Java VM is a Microsoft Cluster resource. The Java VM is stopped and restarted via the Microsoft Cluster Manager tool.

I suspect that the client side is "clean". I question whether the server knows the client has been restarted; it might still hold resources from the client's previous incarnation.


 

by: HonorGod, posted on 2008-03-21 at 08:32:27, ID: 21180603

Sounds likely.

What "session timeout" value do you have set?

e.g., for WebSphere Application Server 6.1, this is available using the information
documented here:

http://publib.boulder.ibm.com/infocenter/wasinfo/v6r1/index.jsp?topic=/com.ibm.websphere.base.doc/info/aes/ae/uprs_rsession_manager.html

 

by: Harmsy2008, posted on 2008-03-21 at 16:21:16, ID: 21183911

The client is NOT running under the Application Server. It is essentially an application that runs as a "Daemon" process (on Windows, as a Service).

I suspect that having this question in the "Software/Server Software/Application Servers/Java/IBM Websphere" zone is a bit misleading. I put it here because I found a previous query about 2058 here. I've now published it in the "Software/Message Queue/IBM MQ" zone.

 

by: Harmsy2008, posted on 2008-03-21 at 16:36:10, ID: 21183976

Aaggghhh. Just realised I can't publish it to the "Software/Message Queue/IBM MQ" zone by myself. I have to get a Community Support Zone moderator to do it for me. Being an EE newbie I didn't realise this.

 

by: HonorGod, posted on 2008-03-21 at 16:52:36, ID: 21184024

The fact that the client is not running on the AppServer is immaterial.
The session timeout is on the AppServer.  So, that is where we have to find the session timeout value.

 

by: Harmsy2008, posted on 2008-03-21 at 17:17:02, ID: 21184142

Ok. I don't know the actual setup at the server side. My customer has told me it is v6. Pardon my ignorance here, as I haven't worked with MQSeries v6: can it be just MQSeries by itself, as with v5.3, or does v6 come tightly coupled with WAS? I can send an email to my customer (a large financial institution), but with this being Easter weekend, it is unlikely that I will get an answer from them until Tuesday.

 

by: HonorGod, posted on 2008-03-21 at 17:56:50, ID: 21184282

No, they aren't tightly coupled.  As a matter of fact, WSAS V6 has its own JMS messaging engine.
However, it can be configured to use MQ instead of the one that comes with WSAS 6.x.

 

by: Harmsy2008, posted on 2008-03-21 at 20:02:14, ID: 21184599

Ok. I suspect my customer just has MQ then. So the WAS session-timeout setting won't be applicable. It is so strange that an error 2058 is returned. Any ideas on that? And how to recover from it. After 5 retries, the code treats it as a hard error. When my customer restarts it, it works. Really strange.

 

by: HonorGod, posted on 2008-03-22 at 05:07:32, ID: 21185719

I'm sorry, I didn't realize that I had said it in that way.  Now, I understand your confusion better.

Session Timeouts are configurable for TCP/IP connections.  I presume that the connection from your messaging client is using TCP/IP to connect to the Queue Manager.  In order for that to occur, the Queue Manager has to be listening on a TCP/IP port when the client sends a connection request.

Once the connection is established, data should flow in the form of requests and responses.
If the parties are quiet (i.e., not sending or receiving), it is possible for something to disrupt the client such that as far as the Queue Manager is concerned, it (the client) just "goes away."

However, the resources on the Queue Manager associated with that session can remain "in use" until the session timeout occurs, which is when the QM decides that the client has gone away.

Does this make sense?

 

by: Harmsy2008, posted on 2008-03-22 at 15:36:07, ID: 21187963

Yes. That matches my understanding as well. Let me explain the problem again in more detail. There are two scenarios:

1) In the first scenario, the TCP/IP session to the Queue Manager is terminated by a firewall between the client reading a message from the queue and sending back a response. It is terminated because this takes more than an hour; why it takes this long is a separate issue. The code gets a 2009 error on the operation that opens the response queue, i.e.
            // Open the reply queue named in the request for output, passing identity context
            int replyOptions = MQC.MQOO_OUTPUT | MQC.MQOO_PASS_IDENTITY_CONTEXT;
            replyQueue = queueMan.accessQueue(request.replyToQueueName,
                                              replyOptions,
                                              request.replyToQueueManagerName,
                                              null,
                                              null);

When the code catches the 2009 MQException, it closes the input queue and disconnects from the Queue Manager, ignoring any MQExceptions incurred while doing so, i.e.
            try {
                if ( inputQueue != null ) {
                    inputQueue.close();
                    inputQueue = null;
                }
            } catch (MQException e) {
                // Do nothing
            }
           
            try {
                if ( queueMan != null ) {
                    queueMan.disconnect();
                    queueMan = null;
                }
            } catch (MQException e) {
                // Do nothing
            }

The Queue Manager object is then re-created. It is during this that the 2058 occurs i.e.
            queueMan = new MQQueueManager(queueManagerName);
           

2) In the second scenario, the application has been shut down and restarted. The shutdown has not been clean; it has been a "sledgehammer" shutdown. Why it is not elegant is a separate issue. Our application is running in a Microsoft clustered environment, and the VM that starts the client is a non-cluster-aware "Generic Application Cluster" resource. When my customer needs to shut down the system, the resource is just stopped, so it is instant. On the restart, we sometimes get 2009 and 2058 errors on creating the Queue Manager object. Sometimes a retry works. Sometimes the retry of a 2009 then gets a 2058. Sometimes the retry of a 2058 still gets a 2058.

I haven't been able to recreate either scenario. For scenario 1, whilst debugging the code, I terminated the TCP/IP connection just before the accessQueue operation (using the TCPView utility from Sysinternals). A 2009 error is incurred, but the re-creation of the Queue Manager is successful. Similarly, for scenario 2, I have terminated the Java process at various points, and the restart always works. However, my testing is against a 5.3 MQSeries server. Whether 6 is less robust, or the default configuration of 5.3 makes it more robust, I don't know.

If we assume that the server has retained some of the resources from previous sessions, why does it return a 2058? Why do some of the threads get a successful connection, whereas others with the same configuration fail with the 2058? Is there anything programmatic that can be done to force the server to do a cleanup?
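
On the cleanup in scenario 1: because both catch blocks do nothing, any reason code raised by close() or disconnect() is lost, which makes it harder to tell what state the connection was left in. Below is a minimal sketch of a cleanup that still runs every step but keeps the failures. `QuietCleanup` and `Step` are hypothetical names, and the steps are assumed to stand in for calls such as inputQueue.close() and queueMan.disconnect(). It does not force the server to release anything; it only preserves the evidence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: runs all cleanup steps and reports failures instead of discarding them. */
final class QuietCleanup {

    /** Stand-in for a cleanup call that may throw, e.g. inputQueue.close(). */
    interface Step {
        void run() throws Exception;
    }

    /**
     * Runs every step in map order, even if an earlier one fails.
     * Returns one "name: message" entry per failed step (empty when all succeed).
     */
    static List<String> runAll(Map<String, Step> steps) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, Step> entry : steps.entrySet()) {
            try {
                entry.getValue().run();
            } catch (Exception e) {
                // Record the failure so it can be logged, then carry on with the next step
                failures.add(entry.getKey() + ": " + e.getMessage());
            }
        }
        return failures;
    }
}
```

Passing a LinkedHashMap keeps the steps in the order they were added, so the queue is closed before the Queue Manager is disconnected.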

 

by: lgacs, posted on 2008-03-23 at 05:24:14, ID: 21189349

You can get more detailed information by reading the error log on the server side.
A possible reason is that the maximum number of client connections has been reached.

The error log can be found in the <MQR>/qmgrs/<QMGR_NAME>/errors directory.
On Unix (Linux) systems MQR is generally /var/mqm;
on Windows systems it is the install directory of WMQ.

 

by: lgacs, posted on 2008-03-23 at 05:27:52, ID: 21189356

... to continue the previous comment ....

An extremely large number of client connections can be caused by a misconfigured
firewall, which keeps the channel TCP connections open on the server side even after the client
has terminated.

 

by: Harmsy2008, posted on 2008-03-23 at 23:37:16, ID: 21192177

I need to confirm with my customer what is in the log for each time the error does occur. I know that they have found errors like:

    23/03/2008  08:28:36
    AMQ9208: Error on receive from host seng20 (192.168.0.188).

    EXPLANATION:
    An error occurred receiving data from seng20 (192.168.0.188) over TCP/IP. This
    may be due to a communications failure.
    ACTION:
    The return code from the TCP/IP (recv) call was 10054 (X'2746'). Record these
    values and tell the systems administrator.

where the host being reported is the client machine.

If it was a firewall issue, wouldn't the error be a 2059 (MQRC_Q_MGR_NOT_AVAILABLE), as it is a TCP/IP issue, rather than a 2058 (MQRC_Q_MGR_NAME_ERROR)? And wouldn't the server log something about rejecting the connection?
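
For reference while reading the logs, the three reason codes discussed so far can be kept in a small lookup. The constant names are the standard MQRC names (2058 and 2059 as quoted above, with 2009 being MQRC_CONNECTION_BROKEN). The `ReasonCodes` class and the `pointsAtNaming` heuristic are hypothetical, only a sketch of how a client might label these codes when logging.

```java
/** Hypothetical lookup for the reason codes discussed in this thread. */
final class ReasonCodes {
    static final int MQRC_CONNECTION_BROKEN = 2009;
    static final int MQRC_Q_MGR_NAME_ERROR = 2058;
    static final int MQRC_Q_MGR_NOT_AVAILABLE = 2059;

    /** Returns the symbolic name for a known reason code, or a fallback string. */
    static String describe(int reason) {
        switch (reason) {
            case MQRC_CONNECTION_BROKEN:
                return "MQRC_CONNECTION_BROKEN";
            case MQRC_Q_MGR_NAME_ERROR:
                return "MQRC_Q_MGR_NAME_ERROR";
            case MQRC_Q_MGR_NOT_AVAILABLE:
                return "MQRC_Q_MGR_NOT_AVAILABLE";
            default:
                return "unknown reason " + reason;
        }
    }

    /**
     * Assumption, not an MQ rule: a name error hints at a configuration mismatch
     * (queue manager name, host or channel) rather than at the network itself.
     */
    static boolean pointsAtNaming(int reason) {
        return reason == MQRC_Q_MGR_NAME_ERROR;
    }
}
```

Logging `describe(reason)` next to the hostname and channel actually used for the attempt would show whether a 2058 coincides with unexpected connection settings.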

 

by: lgacs, posted on 2008-03-24 at 07:32:53, ID: 21193748

It is quite common for client applications to disconnect from the qmgr abnormally,
i.e. by closing the application instead of disconnecting from the qmgr first.

This can be caused by your "Microsoft clustered environment", which I am not familiar with.

 

by: Harmsy2008, posted on 2008-03-24 at 14:15:51, ID: 21196985

When you say "quite common", do you mean that it is quite common for 2058 errors to occur if the client application disconnects from the queue manager abnormally?

 

by: lgacs, posted on 2008-03-24 at 15:34:51, ID: 21197933

I mean it is "quite common" to have log entries like the ones described, because of abnormal termination of the
network connection.

The 2058 is most probably the consequence of either a lack of available connections
or some odd network behavior of the MS cluster.

Do you have a cluster (either MQ cluster or MS cluster) on the server side too?

 

by: Harmsy2008, posted on 2008-03-24 at 17:09:08, ID: 21198598

Ok.

I'm asking my customer via email what clustering they have on the server side. I will post a comment when they come back with this.

If we assume that the server has retained some of the resources from previous sessions, is there anything programmatic that can be done to force the server to do a cleanup?

 

by: Harmsy2008, posted on 2008-03-25 at 05:39:38, ID: 21201263

Or is it a known deficiency of IBM MQSeries that it cannot handle abnormally terminated sessions?

If the server cannot be forced programmatically to clean itself up, what is the setting on the server that determines how long it takes to clean itself up? My customer generally finds these errors occurring around the 4am mark, and programmatic recovery logic around that time fails. A couple of hours later, a restart of the application works.

I'm now coming to the conclusion that, in these error scenarios, my customer should stop and restart the MQ queue manager, or the whole MQ node (if it is the only QM, which I don't know).

 

by: HonorGod, posted on 2008-03-25 at 07:04:46, ID: 21201996

You can use the "Display Channel Status" command (DISPLAY CHSTATUS in MQSC) to check which channel instances are still active.  Additionally, you can configure the KeepAlive setting.
Unfortunately, I believe that the default KeepAlive setting is 2 hours, which is (IMHO) much too long.  I prefer a setting of something like 5 minutes.  This allows the Queue Manager to check on the status of the client, close the connection, and release all associated resources.

 

by: Harmsy2008, posted on 2008-03-26 at 22:55:07, ID: 21218893

Guys. I've got some outstanding questions to my customer re some of the above statements.

 

by: Harmsy2008, posted on 2008-03-27 at 22:49:51, ID: 21228276

Latest update. There is a possible solution, which my customer is going to test next week. With the multithreaded application, the system can be configured to have one group of threads using one set of configuration values (i.e. host, queue manager, channel) and another group using a different set (host, queue manager and channel). The customer's MQ configuration person got back to me today with some errors that were occurring around the restart which had the wrong channel value: it was one of the other values! Why they hadn't provided this information before, I'm cranky about, as previously they had only indicated the errors I documented before.

It got me thinking about the static MQ class MQEnvironment: the hostname and channel in it are static. So I'm thinking that the groups are stepping on each other. My customer is reconfiguring the system, but unfortunately (it being a production system at a large financial institution) this is not going to occur until next week.

I'll wait to see what happens with this. I've also got my customer looking at configuring the KAINT to something less than the firewall timeout, so that the server cleans things up from its end. Depending on what happens, I'll assign multiple solutions to be fair, as I have got something from both "Experts" responses on this.
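
The suspicion about the static MQEnvironment fields can be illustrated without MQ at all. The sketch below replays, on a single thread, one unlucky ordering of two thread groups that write their settings into shared static fields before connecting. `StaticConfigClash`, `SharedEnv` and both connect methods are hypothetical stand-ins, and the host, channel and queue manager names are made up.

```java
/** Deterministic replay of two thread groups sharing static connection settings. */
final class StaticConfigClash {

    /** Hypothetical stand-in for a static settings holder such as MQEnvironment. */
    static final class SharedEnv {
        static String hostname;
        static String channel;
    }

    /** Stand-in for a connect call that reads the shared static fields when it runs. */
    static String connectUsingShared(String queueManagerName) {
        return queueManagerName + "@" + SharedEnv.hostname + "/" + SharedEnv.channel;
    }

    /** Same idea, but the settings are passed with the call, so nothing is shared. */
    static String connectUsingOwn(String queueManagerName, String hostname, String channel) {
        return queueManagerName + "@" + hostname + "/" + channel;
    }

    /** Replays one unlucky ordering: group B writes its settings before group A connects. */
    static String[] unluckyInterleaving() {
        SharedEnv.hostname = "hostA";   // group A sets its values
        SharedEnv.channel = "CHAN.A";
        SharedEnv.hostname = "hostB";   // group B overwrites them
        SharedEnv.channel = "CHAN.B";
        String a = connectUsingShared("QMA"); // group A connects with B's host and channel
        String b = connectUsingShared("QMB"); // group B gets what it expected
        return new String[] { a, b };
    }
}
```

In this replay group A ends up asking hostB for queue manager QMA, a mismatch that could plausibly surface as a 2058, although that link is an assumption. As far as can be told from the MQ base Java documentation, MQQueueManager also has a constructor that takes a Hashtable of per-connection properties, which would avoid the shared statics; that should be checked against the client version in use.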

 

by: Harmsy2008, posted on 2008-04-04 at 21:34:19, ID: 21287082

Latest status. My customer has LOTS and LOTS of bureaucracy, and they also mucked up a change. So I'm waiting till early next week for the above changes to be implemented and tested.

 

by: Harmsy2008, posted on 2008-04-09 at 22:51:05, ID: 31441713

The two experts gave me information that helped solve the problem. However, in the end, I solved the issue myself, by getting a vital piece of information from my customer about the actual issue (i.e. that the WebSphere log contained the wrong server channel name; they had withheld that information from me, agggghh). This led me to the conclusion that, because the MQEnvironment class is static, having two threads with different hostname/channel values would clash.

 

by: Harmsy2008, posted on 2008-04-09 at 22:56:37, ID: 21322028

I got my client to do two things.
1) Most importantly, the threads were grouped so that each hostname/channel/port combination runs in its own virtual machine. This stopped any issue with the static MQEnvironment class.
2) The KAINT on the MQ servers was set to a value less than the firewall timeout value. That way, the server would clean up the connections.
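
The grouping in step 1 can be sketched as follows: thread configurations are bucketed by their hostname/channel/port combination, and each bucket would then be launched in its own Java VM. `ConnectionGrouping` and `ThreadConfig` are hypothetical names, and how each VM is actually started is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: bucket reader threads by the connection settings they use. */
final class ConnectionGrouping {

    /** One reader thread's configuration (field names are illustrative). */
    static final class ThreadConfig {
        final String threadName;
        final String hostname;
        final String channel;
        final int port;

        ThreadConfig(String threadName, String hostname, String channel, int port) {
            this.threadName = threadName;
            this.hostname = hostname;
            this.channel = channel;
            this.port = port;
        }
    }

    /**
     * Returns thread names grouped under a "hostname/channel/port" key, in first-seen order.
     * Each group is a candidate for its own VM, so static settings are never shared
     * between threads that need different values.
     */
    static Map<String, List<String>> groupByConnection(List<ThreadConfig> configs) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (ThreadConfig c : configs) {
            String key = c.hostname + "/" + c.channel + "/" + c.port;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(c.threadName);
        }
        return groups;
    }
}
```

Within one VM every thread then writes the same values into MQEnvironment, so the order in which the threads run no longer matters.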
