ConnieCA

asked on

E2K Mail transport problem

Once upon a time there was an Exchange server that lived all by itself on a domain…

FIRST SITE (SITE A)

Exchange 2000 SP3 server A
Also the GC for the domain (domain is looney.uar.edu)
Sits in Win2K Site A
IP range for this site is 152.26.134….

Member of First Administrative Group
Member of First Routing Group
SMTP Connector (for internet mail)
      Using DNS to route
      SMTP * address space, entire organization, allow relay
      Local bridgehead is itself
DNS handled outside of our company by the university
Dynamic DNS not supported by the university (all DNS entries are manually entered)
DNS not allowed to be installed on any of our W2K servers
MX record on university DNS server
Exchange applicable ports:
      25 listening
      389 listening
      691 listening
      3268 listening



Email worked well and everyone was happy

One day, a second Exchange server came along to live in that same domain in a different site…

NEW SITE (SITE B)

Exchange 2000 SP3 server B (part of the looney.uar.edu domain also)
Sits in Win2K Site B
Not a domain controller (but a DC/GC for the domain also lives in the site)
IP range for this site is 192.46.102…
Member of First Administrative Group
Member of First Routing Group
MX record for this server added to university DNS server (with a lower priority than Server A)
Exchange applicable ports:
      25 listening
      389 not listening (but is listening on the DC/GC in this site)
      691 listening
      3268 not listening (but is listening on the DC/GC in this site)

PIX firewall between the two sites.
Mailguard is turned OFF.
No Exchange related ports being dropped.
Can nslookup between the two E2K servers with no problem.


Now here’s what is happening with email…

Email from the internet is received and routed properly to both Server A and Server B
Server A and Server B can send TO the internet
Email from Server B is routed to mailboxes on Server A.
Email from Server A to Server B hangs in the queue for Server B on Server A and never leaves.
Telnet to port 25 from Server A to Server B works and sending a test SMTP mail from a telnet session works

TONS of info I know, but we’re stuck on what the problem is here.

Can anyone see where the problem might be in this scenario???





OneHump

Check my last two posts in the other thread.  :)
ConnieCA

ASKER

25 is listening on both exchange servers
691 is listening on both exchange servers (but it is not listening on the GC/DC in Site B, a separate box from the Exchange Server)

Re: 389 and 3268, these are not listening on the Exchange Server in Site B but it was my understanding that LDAP is handled by the domain controller and that the Exchange 'member server' doesn't need it.

The messages are sticking in the queue for Server B on Server A.

Maximum logging has been turned on but nothing jumps out as being a problem...

It looks like Server A sends a message to a user on Server B who sends one back to Server A in the same minute, according to the log, but no message ever shows up anywhere and they 'seem to be' just sitting in the queue.
What is the name of that queue?

Because your servers are separated by a firewall, can we put them in separate routing groups?

Did you have a chance to check out logging?

Also, when you checked your ports, you checked them from source server to destination, right?  I'm less concerned that a server is listening than server A can connect to server B and vice versa.
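A minimal sketch of the source-to-destination check described above, in Python. It only attempts a TCP connect from the machine it runs on to the listed ports; the function names and the default port list (25 and 691, taken from this thread) are assumptions for illustration, not part of any Exchange tool. A successful connect shows the port is reachable, not that the SMTP conversation completes.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_ports(host, ports=(25, 691), timeout=5.0):
    """Map each port to whether a TCP connect from this machine succeeded."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

Run `check_ports("<other server>")` once on each Exchange server, pointing at the other one, so that both directions through the firewall are covered.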

OneHump
name of the queue is zoot.itsc.uah.edu

Tried putting them in different routing groups with a routing group connector between the two with no luck

All looks ok in the log, from what I can tell. Not really sure what to look for

Let me try the tool you suggested for checking the ports. If I stop the system attendant, my email will be disrupted, correct?
I can connect to port 25 and port 691 from the source server (ServerA) to the destination server (ServerB) using the tool you suggested.

Is there any chance the problem could be coming from both E2K servers having MX records for the same namespace??? (ServerA has a higher priority than ServerB). Just wondering...
OK, so it can't connect to Zoot.  I'm glad we established that.  :)

So... if you check properties on that queue, it's telling you that it's in 'retry' state, right?

You are looking for anything that talks about why it can't connect.  The messages have been categorized, so my guess is that there is a port 25 issue.  Can you telnet from the queueing server to zoot on port 25 and back from zoot to that server on port 25?  Do you know how to send a manual SMTP message by telnet?  If not, let me know, if so, try that.  You might see the problem there as you are entering commands.

If you stop the system attendant, all other Exchange services will stop and mail will be down.  Try my telnet suggestion first.

OneHump
I can use telnet to open port 25 on zoot from home and get an answer back from my helo command

What is the command to finish the email message and send it?

mail from: cthompson@itsc.uah.edu
rcpt to: ztest@itsc.uah.edu
data

then what???

btw...when typing these commands I get no response such as 'sender ok' other than with the helo command
Type some data "asdfkljasd;fjkaj;" and then hit enter, type a period "." and then hit enter again.

Let me show you what you should get on an Exchange 2000 server:

220 server.sub.domain.net Microsoft ESMTP MAIL Service, Version: 5.0.2195.5329 ready at  Thu, 30 Oct 2003 15:36:43 -0800
helo mydomain.com
250 server.sub.domain.net Hello [111.111.111.111]
mail from:me@mydomain.com
250 2.1.0 me@mydomain.com....Sender OK
rcpt to:me@mydomain.com
250 2.1.5 me@mydomain.com
data
354 Start mail input; end with <CRLF>.<CRLF>
This is a test
.
250 2.6.0 <ServernameYAPNXRPm00000082@server.sub.domain.net> Queued mail for delivery

If you aren't getting responses to each command, try that from the local server itself.  From Zoot, telnet to Zoot locally.
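The manual telnet session above can also be scripted. The following is a rough Python sketch, not an Exchange utility: it walks through the same commands over a raw socket and records the reply to each step, so a step that gets no answer shows up as `None`. The function names, the HELO name `probe.example` and the timeout are assumptions made for illustration.

```python
import socket


def _read_reply(reader):
    """Read one SMTP reply, following 'NNN-' continuation lines."""
    lines = []
    while True:
        line = reader.readline()
        if not line:
            break
        lines.append(line.decode("ascii", "replace").rstrip())
        if line[3:4] != b"-":  # last line of a reply has a space after the code
            break
    return "\n".join(lines) or None


def smtp_probe(host, port, sender, recipient, timeout=10.0):
    """Run a minimal SMTP dialogue; return a list of (step, reply) pairs.

    reply is None when nothing arrived before the timeout, and the probe
    stops at that step.
    """
    steps = [
        ("connect", None),  # just wait for the 220 banner
        ("helo", b"HELO probe.example\r\n"),  # hypothetical HELO name
        ("mail from", f"MAIL FROM:<{sender}>\r\n".encode("ascii")),
        ("rcpt to", f"RCPT TO:<{recipient}>\r\n".encode("ascii")),
        ("data", b"DATA\r\n"),
        ("body", b"This is a test\r\n.\r\n"),
        ("quit", b"QUIT\r\n"),
    ]
    results = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rb") as reader:
            for name, payload in steps:
                if payload is not None:
                    sock.sendall(payload)
                try:
                    reply = _read_reply(reader)
                except socket.timeout:
                    reply = None
                results.append((name, reply))
                if reply is None:
                    break
    return results
```

Printing the returned pairs gives roughly the same view as the telnet session: each command alongside the server's answer, with the first silent step standing out.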

OneHump
No response back on home using telnet as described above. Tried from zoot locally...type in mail from: thompson@itsc.uah.edu it gives back the 'sender ok' message but...

when I try rcpt to: ztest@itsc.uah.edu (my test user on zoot) it says '501 5.5.4 Invalid Address'

That user is in AD Users & Computers with that email address.

???
"No response back on home using telnet as described above."

Does that mean that you telnetted from zoot to home and got no response between commands?

Can you please log onto a mailbox on Zoot and paste ztest@itsc.uah.edu from this thread right into the To field on a new note and CTRL-K.  It should resolve to the display name of your test mailbox.  That is very very strange that you get invalid address.

Also, telnet back into zoot, from zoot and type expn ztest and tell me what you get.

OneHump
Sorry my response wasn't very clear...

I telnet on home, connect to zoot via port 25, type in the commands for the test SMTP message. After each command (mail from:, rcpt to:, etc.) I get no response back AND when I'm done I get no message saying 'message queued' (like you would normally see). Just the cursor sitting down on the next line after you hit <enter>, period (.), <enter> and it never responds back.

The rest I will need to do tomorrow because it's 8:45 and if I don't go let my puppy out he will go nuts and I don't have VPN into these servers yet. Thank you SO much. If I could give more than 500 points I would because this problem is so frustrating and you have been a BIG help.

Connie

Something must have been 'messed up' with Exchange last night because this morning it doesn't give me the invalid address error when I telnet a message from zoot to zoot to ztest@itsc.uah.edu.

when I telnet on zoot to zoot and type in 'expn ztest' I get '500 5.3.3 Unrecognized Command'

Just to make sure I'm doing it right, you want me to type that in after:
set local_echo
open zoot 25
helo itsc.uah.edu

Right?
btw...I went in and deleted one of the messages that was stuck in the queue and requested an NDR...this is what I got

Your message did not reach some or all of the intended recipients.

      Subject:      test using zoot.itsc.uah.edu
      Sent:      10/31/2003 10:20 AM

The following recipient(s) could not be reached:

      'btherat@zoot.itsc.uah.edu' on 10/31/2003 10:32 AM
            You do not have permission to send to this recipient.  For assistance, contact your system administrator.
            <home.itsc.uah.edu #5.7.1 smtp;550 5.7.1 Unable to relay for btherat@zoot.itsc.uah.edu>

Don't know if this is a 'real' error or if it's just generated because I deleted the message from the queue.
Oops...that last NDR was from a test email that I had sent to @zoot.itsc.uah.edu, not just itsc.uah.edu.

Here is the NDR I get if I send it the right way (to btherat@itsc.uah.edu)...

The following recipient(s) could not be reached:

      Bernie Therat on 10/31/2003 10:45 AM
            This message was rejected due to the current administrative policy by the destination server.  Please retry at a later time.  If that fails, contact your system administrator.
            <home.itsc.uah.edu #4.3.2>
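The two NDRs above carry different enhanced status codes (5.7.1 and 4.3.2). As a small illustration, this Python sketch pulls the first such code out of an NDR line and names its class: per the enhanced mail status code scheme, a leading 2 means success, 4 a transient failure, and 5 a permanent failure. The function name and regular expression are assumptions for this example; the pattern is a heuristic and would also match other dotted numbers of the same shape, and it says nothing about what caused the error.

```python
import re

# class.subject.detail, where class is 2, 4 or 5
_STATUS = re.compile(r"\b([245])\.(\d{1,3})\.(\d{1,3})\b")

_CLASSES = {
    "2": "success",
    "4": "transient failure",
    "5": "permanent failure",
}


def classify_status(text):
    """Return (code, class description) for the first enhanced status code
    found in text, or None if there is none."""
    match = _STATUS.search(text)
    if match is None:
        return None
    return match.group(0), _CLASSES[match.group(1)]
```

Fed the two diagnostic lines from the NDRs, it reports the first as a permanent failure and the second as a transient one.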
Sorry, Exchange is SMTP stupid and can't expand addresses.  I should have asked you to use vrfy instead.

That error is because you deleted the message from the queue.

So, let's pick all this back up.  You can now send to ztest, right?

We have established that you are not getting the proper SMTP responses back, but I'm still unclear about when this occurs.  Please answer these questions:

1.  When you telnet from zoot to zoot, do you get responses after each SMTP command?

2.  When you telnet from home to home, do you get responses after each SMTP command?

3.  When you telnet from zoot to home, do you get responses after each SMTP command?

4.  When you telnet from home to zoot, do you get responses after each SMTP command?

5.  What kind of puppy do you have?

Don't sweat the points, that's not what motivates most of us.  :)

OneHump
1.  Yes - zoot to zoot on port 25, I see the responses and the email is received by the test user on zoot

2.  Yes - home to home on port 25, I see the responses and the mail is received by the test user on zoot

3.  Yes - zoot to home, see responses and mail is received

4.  No - home to zoot on port 25, no responses so no mail but...added port 26 to zoot's SMTP virtual server port list. Telnet from home to zoot on port 26, see responses and mail is received.

5.  An evil one ;-) He's a beagle named Bernie (test user on zoot is btherat which stands for Bernie The Rat :-) )

Is it possible to test this with zoot using port 26 for smtp communication between him and home?

C.
Why is port 26 open on the firewall??  :)

It seems to me that something is screwy with the firewall that is messing with traffic on 25 from home to zoot.  Can you shut down SMTP on zoot and use blues port tool to verify TCP AND UDP on port 25 from home to zoot?

Beagles are trouble.  I used to have one when I was a kid.  Little rascals they are.  I have a Golden Retriever puppy in the oven.  I should get it 3/1/2004.
Oh, one more thing.  Ask your firewall guys to delete the rule allowing 25 and change the rule that allows 26 to "25".  :)
Ok...stopped the SMTP service on zoot. On home, used blues port tool to try to connect to port 25. TCP reads 'connecting', then 'connected to [IP]' (for a quick sec), then 'disconnected from [IP]'. UDP stays connected.

Regarding port 26...

The firewall guys weren't comfortable turning off mailguard so they have now turned it back on.  Unfortunately, this firewall is out of our control so to ensure that mailguard isn't causing us these problems, we decided to try to route SMTP mail on zoot through port 26.

Is this A. possible and B. a bad idea for some reason?

C.
Should have added this to the last post but ...

according to the firewall guys, the mailguard feature can't be turned off point to point...it can only be turned off for the entire firewall which handles other organizations, not just ours...
With SMTP turned back on on zoot, blues port tool can open port 25
OneHump...

I seem to have lost you and just wanted to clarify one thing...

Mailguard was not turned on during our troubleshooting of this process. It was only turned on by the firewall guys immediately preceding my post at 11:41 on 10/31 asking about using a different port (26). The reason for that particular question was because of the notification from the firewall guys that this PIX feature was going to be enabled once again for security purposes.

Just didn't want you thinking I had you troubleshooting this whole time with mailguard running.


Here I am, took a weekend away.  Let me catch up on this massive thread and I'll get back to you.  :)
OK, to answer this:
-----------
Regarding port 26...

The firewall guys weren't comfortable turning off mailguard so they have now turned it back on.  Unfortunately, this firewall is out of our control so to ensure that mailguard isn't causing us these problems, we decided to try to route SMTP mail on zoot through port 26.

Is this A. possible and B. a bad idea for some reason?

C.
----------

MailGuard is garbage and is notorious for causing tons and tons of problems.  At work, I refuse to work with anyone who has it on.  Vahik has already agreed.  If you post a 50 point question about it, I'm sure Kidego and others will chime in the same way.  All I can tell you is that Mailguard corrupts the SMTP transaction, as seen from telnet, and should never ever be turned on.
Glad you're back!
ASKER CERTIFIED SOLUTION
OneHump
Should have added this to the last post but ...

according to the firewall guys, the mailguard feature can't be turned off point to point...it can only be turned off for the entire firewall which handles other organizations, not just ours...

-------

True, but it's such a piece of garbage, they would be doing themselves a favor to forget it even exists.
OneHump...

I seem to have lost you and just wanted to clarify one thing...

Mailguard was not turned on during our troubleshooting of this process. It was only turned on by the firewall guys immediately preceding my post at 11:41 on 10/31 asking about using a different port (26). The reason for that particular question was because of the notification from the firewall guys that this PIX feature was going to be enabled once again for security purposes.

Just didn't want you thinking I had you troubleshooting this whole time with mailguard running.

---------------
Mailguard is not the cause of this particular problem.  It may, however, cause other problems that may make it difficult to narrow this one down.  It will certainly cause problems in the future.

You can leave it on, but I think we should make a case to have it turned off.  That's another story.
You aren't supposed to accept until the problem is solved.  :)

OK, back to our story.  You can't telnet to port 25 on zoot, right?  Sorry I keep having to review; I'm getting lost.  :)

Can you get a dump of the PIX rules?  Don't post them here, but I would like you to email the rules between zoot and home to me.  

Oops...sorry :)

I can telnet to port 25 on zoot and I can send the 'helo itsc.uah.edu' message and get the 'hello...' back but when I type 'mail from:...', etc. I get no response and the email is never sent/received.

I'm going to check with my firewall guys about getting a dump of the rules...I'll let you know ASAP.
OK.  I don't want to beat a dead horse, but that lack of response is what I see a lot with MailGuard.  :)

I will say that it could be caused by a protocol issue on the firewall.  Is there a way you can bring the servers into the same data center and plug them into the same switch?  I'll bet a buck that everything will work.
We have it set up in our 'test lab' (one domain, two sites, two subnets, one routing group) and all works great but, of course, in our test lab the firewall right now is wide open. Second exchange server came online and test users on either server can send/receive mail.

Interesting idea about moving zoot over to the subnet where home resides temporarily. I think I'll do that. Let you know how it goes.
Two questions...

First, for internal routing between servers in the same routing group, I don't need an mx record for the second server(zoot...specifically, for this test), right?

Second, last week on Friday when we were 'playing with' trying to get Exchange to use port 26, we saw a very big increase in our bandwidth usage (actually the campus guys saw it) on port 691. Are the two related? Changing exchange to try to use port 26 and a BIG increase in bandwidth on port 691 between home and zoot?
Have them cut and paste the rules from the production firewall into the test lab firewall.  

No MX record for internal routing.  Exchange checks the link state tables for a route and then looks in AD for the GUID of the server on the next hop.

Port 691 is used by the Link State Algorithm (LSA).  Servers in the same routing group share link state tables on port 691.  That traffic would be between the routing group master and other servers in the routing group.  That's another reason to put servers separated by a firewall into separate routing groups.

Changing the port probably forced the routing tables to be rebuilt and republished.  I can't see it being a lot of data unless the data didn't get through.

LSA is critical for servers in the same routing group.
Thanks!
No problemo.
would something like that take as long as 4 hours???? That's how long the high bandwidth usage lasted.
I wouldn't think so, but it looks like it did.  :)

If there is a firewall issue on 691, the data could have looped in some way.  I would definitely put those servers in different routing groups and connect them with a routing group connector.  You won't see the problem again if you do that.
Voila...the server is sitting in the data center with the first exchange and...you are right...all works perfectly. Definitely a firewall and/or mailguard issue.
Probably firewall in this case.  Did you like those comments posted to your mailguard thread?  Do a google search.  Some people get very emotional about it.  It's universally hated by the messaging community.

Good work in getting that server moved.  I know it's a pain, but it proves your point and gives you more info for your firewall people.  The problem with firewall/network people is that they get blamed for everything so they tend to be defensive.
I've definitely noticed that about firewall people and even understand their reluctance in allowing anything at all.

Man...you're not kidding about that fixup (so called) feature being well known and a touchy subject.  Posting that thread and getting the additional answers back gives me more fuel when I ask that it be disabled...again.

I'm looking through the logs to see if anything weird showed up when I had the server at the data center.

Thanks for all your help on this!
One last question....

If (and it's a strong if being a university) the firewall guys AND the powers that be absolutely refuse to disable that smtp fixup thing, is it at all possible to get exchange running between zoot and home using a different port?  I'm going to push as hard as possible to get it turned off but I'm also going to prepare myself for the worst.
I wouldn't change ports.  Technically, it's possible.  You'd be better off disabling ESMTP.  I would put a business case together.  Do some research and gather evidence.  Call Cisco if you need to.  Present a case that you cannot do business with a defective feature enabled.

Simply do a telnet session and show them the corrupted SMTP session.  It looks like this:

5********w*******e64***************
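For gathering that evidence, a captured greeting line can be checked mechanically. The Python sketch below is only a heuristic under stated assumptions: it treats a line as masked when at least half of its characters are asterisks, and both the function names and the 0.5 threshold are invented for this example rather than taken from any Cisco or Microsoft documentation.

```python
def asterisk_ratio(line):
    """Fraction of characters in the stripped line that are '*'."""
    stripped = line.strip()
    if not stripped:
        return 0.0
    return stripped.count("*") / len(stripped)


def looks_masked(line, threshold=0.5):
    """Heuristic: True when asterisks make up at least `threshold` of the line."""
    return asterisk_ratio(line) >= threshold
```

A normal Exchange 2000 greeting such as the 220 line shown earlier in this thread contains no asterisks and is not flagged, while the sample line above is.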