Greetings, Exchange Experts...
On Saturday, my Exchange server stopped receiving email from external sources. Internal mail flow was normal, and we could still send to external parties. While checking the Exchange queues and the server's Event Viewer, I found a couple of things. First, a large number of emails were stuck in the queue, including two that were over 1.7GB each. Many of the queued items were OOO replies, but not all. Second, the Event Viewer showed four 15004 errors and a couple of 1009 errors. Confusingly, these errors were from the previous day.
Here is an example of one of the 15004 events:
The resource pressure increased from Medium to High.
The following resources are under pressure:
Version buckets = 208 [High] [Normal=80 Medium=120 High=200]
The following components are disabled due to back pressure:
Inbound mail submission from Hub Transport servers
Inbound mail submission from the Internet
Mail submission from Pickup directory
Mail submission from Replay directory
Mail submission from Mailbox server
Mail delivery to remote domains
The following resources are in normal state:
Queue database path ("C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\data\Queue\mail.que") = 60% [Normal] [Normal=95% Medium=97% High=99%]
Queue database logging path ("C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\data\Queue\") = 71% [Normal] [Normal=95% Medium=97% High=99%]
Private bytes = 6% [Normal] [Normal=71% Medium=73% High=75%]
Physical memory load = 76% [limit is 94% to start dehydrating messages.]
Batch Point = 0 [Normal] [Normal=2000 Medium=4000 High=8000]
Submission Queue = 0 [Normal] [Normal=1000 Medium=2000 High=4000]
The 1009 errors read:
The Microsoft Exchange Mail Submission service is currently unable to contact any Hub Transport servers in the local Active Directory site. The servers may be too busy to accept new connections at this time.
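For anyone who wants to scan several of these events at once, the threshold lines in the 15004 text follow a regular pattern (name, current value, level, then the Normal/Medium/High limits), so they can be pulled apart with a short script. The following is only a rough sketch in Python, assuming the event text has been copied out of Event Viewer as plain text. The names `parse_resources` and `under_pressure` are made up for illustration and are not part of Exchange or any Microsoft tooling.

```python
import re

# Matches resource lines such as:
#   Version buckets = 208 [High] [Normal=80 Medium=120 High=200]
# Percent signs are optional because some resources are reported as percentages.
RESOURCE_LINE = re.compile(
    r"^(?P<name>.+?) = (?P<value>\d+)%? "
    r"\[(?P<level>Normal|Medium|High)\] "
    r"\[Normal=(?P<normal>\d+)%? Medium=(?P<medium>\d+)%? High=(?P<high>\d+)%?\]$"
)


def parse_resources(event_text):
    """Return one dict per resource line found in the pasted event text."""
    resources = []
    for line in event_text.splitlines():
        match = RESOURCE_LINE.match(line.strip())
        if match is None:
            continue  # headings, component names, the memory-load line, etc.
        row = match.groupdict()
        for key in ("value", "normal", "medium", "high"):
            row[key] = int(row[key])
        resources.append(row)
    return resources


def under_pressure(resources):
    """Names of resources whose reported level is anything other than Normal."""
    return [r["name"] for r in resources if r["level"] != "Normal"]


# A few lines taken from the event quoted above.
sample = """\
Version buckets = 208 [High] [Normal=80 Medium=120 High=200]
Private bytes = 6% [Normal] [Normal=71% Medium=73% High=75%]
Physical memory load = 76% [limit is 94% to start dehydrating messages.]
Batch Point = 0 [Normal] [Normal=2000 Medium=4000 High=8000]
Submission Queue = 0 [Normal] [Normal=1000 Medium=2000 High=4000]
"""

resources = parse_resources(sample)
print(under_pressure(resources))
```

Run against the lines above, this reports only the version buckets entry, which matches what the event itself lists under "resources are under pressure". The script relies on the level the event already reports and does not try to recompute it from the thresholds.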
What would drive the version buckets value?
What could have triggered 15004 events?
Why would internal emails be held up in the queues?
I deleted the emails that I could from the queues and rebooted the servers (restarting the transport service alone had no effect). After the reboot, everything generally performed normally. Yes, the 1.7GB files are an issue, but that is another ticket.