
2011 SBS keeps shutting down every 60-90 minutes ONLY at office site. RDP Attack??

burkem3434
Looking for some help here so we can keep the server running long enough for our migration to the new one we ordered. We have an ongoing issue where the server blue screens every 60-90 minutes all day long. I have pulled the server and run it at my own site overnight and now over two weekends, with not one reboot or BSOD (current uptime is 2 days and 18 hours). But based on history, when I return it to the office it will start again like clockwork. So far:

• I have moved to another power outlet (but power doesn’t seem likely, because once I disabled auto restart it always stays on, just with a BSOD).
• Scanned drives both at boot and with external tools
• Ran Windows Memory Diagnostic tool many times (no problems detected)
• Unplugged every attached device (even changed the USB keyboard and mouse)

I couldn’t see it being hardware, a driver, or a service since it runs clean at my site. I even ran it all day today, since our office is closed, just in case it was some “work hours” thing. No issues. I spent the last 48 hours going entry by entry through the event viewer and found some disturbing items. It looks like we were attacked, based on a share we did not create in

HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\shares.

Value name
Content.IE5
CSCFlags=2048
MaxUses=4294967295
Path=D:\Virus Files\User Folders\Tom H\Tom H\AppData\Local\Microsoft\Windows\Temporary Internet Files\Low\Content.IE5
Permissions:=0
Remark=
ShareName=Content.IE5
Type=0
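For anyone auditing shares the same way, here is a minimal Python sketch of the check, assuming the share definition has been copied out as a list of "Key=Value" strings as shown above. The EXPECTED allowlist is hypothetical and should be replaced with the shares you actually created. MaxUses=4294967295 is simply 0xFFFFFFFF, the largest 32-bit value.

```python
def parse_share(entries):
    """Turn ['Key=Value', ...] strings into a dict, splitting on the first '='."""
    share = {}
    for entry in entries:
        key, _, value = entry.partition("=")
        share[key.strip()] = value.strip()
    return share


def unexpected_shares(shares, expected_names):
    """Return share names that are not on the expected list (case-insensitive)."""
    expected = {name.lower() for name in expected_names}
    return [name for name in shares if name.lower() not in expected]


# Values copied verbatim from the post above.
content_ie5 = parse_share([
    "CSCFlags=2048",
    "MaxUses=4294967295",
    r"Path=D:\Virus Files\User Folders\Tom H\Tom H\AppData\Local\Microsoft\Windows\Temporary Internet Files\Low\Content.IE5",
    "Permissions:=0",
    "Remark=",
    "ShareName=Content.IE5",
    "Type=0",
])

# Hypothetical allowlist: substitute the shares you actually created.
EXPECTED = {"NETLOGON", "SYSVOL", "Company"}

shares = {content_ie5["ShareName"]: content_ie5}
for name in unexpected_shares(shares, EXPECTED):
    print("Unexpected share:", name, "->", shares[name]["Path"])
```

Anything this prints is only a candidate for review, not proof of compromise.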

I also found several TermDD Event 56 entries in the log that say “The Terminal Server security layer detected an error in the protocol stream and has disconnected the client. Client IP: xxx.xxx.xxx.xxx”
These IPs all trace back to ISPs in Russia and Ukraine.
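To see how concentrated those disconnects are, the event messages can be tallied per client IP. The sketch below is in Python and assumes the Event 56 message text has been exported to a list of strings. The addresses in the sample are placeholders from the reserved documentation ranges, since the real IPs are masked in the post.

```python
import re
from collections import Counter

# Matches the "Client IP: a.b.c.d" suffix of the TermDD Event 56 message.
CLIENT_IP = re.compile(r"Client IP:\s*(\d{1,3}(?:\.\d{1,3}){3})")


def count_client_ips(messages):
    """Count how many Event 56 messages were logged per client IP."""
    counts = Counter()
    for message in messages:
        match = CLIENT_IP.search(message)
        if match:
            counts[match.group(1)] += 1
    return counts


PREFIX = ("The Terminal Server security layer detected an error in the "
          "protocol stream and has disconnected the client. ")

# Hypothetical sample data; real entries would come from an event log export.
sample = [
    PREFIX + "Client IP: 203.0.113.10",
    PREFIX + "Client IP: 198.51.100.7.",
    PREFIX + "Client IP: 203.0.113.10",
]

counts = count_client_ips(sample)
for ip, hits in counts.most_common():
    print(ip, hits)
```

A handful of addresses with many hits each would fit a scanning or brute-force pattern, though the count alone does not prove one.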

Here are a few of the BSOD messages.

DRIVER_IRQL_NOT_LESS_OR_EQUAL
stop 0x000000D1 (0X0000000000000000, 0x0000000000000002,0x0000000000000000, 0xFFFFF88002FEE006)

termdd.sys – Address FFFFF88002FEE006 base at FFFFF88002FEB000 Date Stamp 4ce7ab0c


Driver_IRQL_NOT_LESS_OR_EQUAL
stop 0x0000000A (0X0000000000000000, 0x0000000000000002,0x0000000000000001, 0xFFFFF8000405A48E)

No other info


Driver_IRQL_NOT_LESS_OR_EQUAL
stop 0x0000000A (0X0000000000000000, 0x0000000000000002,0x0000000000000001, 0xFFFFF800040A748E)

No other info
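The stop lines above can be decoded with a little arithmetic. This is a rough Python sketch, not a substitute for opening the dump files in a debugger. It parses a stop line, looks up the standard bug check name (0x000000D1 is DRIVER_IRQL_NOT_LESS_OR_EQUAL, while 0x0000000A is IRQL_NOT_LESS_OR_EQUAL without the DRIVER_ prefix), works out how far into termdd.sys the faulting address lies, and converts the date stamp, which is a count of seconds since 1970 in hex. The helper names are made up for illustration.

```python
from datetime import datetime, timezone

# Standard names for the two bug check codes that appear in the post.
BUGCHECK_NAMES = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x000000D1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}


def parse_stop_line(line):
    """Parse 'stop 0x.. (p1, p2, p3, p4)' into (code, [p1, p2, p3, p4])."""
    head, _, rest = line.partition("(")
    code = int(head.split()[1], 16)
    params = [int(p.strip(), 16) for p in rest.rstrip(") ").split(",")]
    return code, params


def module_offset(address, base):
    """Offset of a faulting address from the start of the loaded driver image."""
    return address - base


stop_d1 = ("stop 0x000000D1 (0X0000000000000000, 0x0000000000000002,"
           "0x0000000000000000, 0xFFFFF88002FEE006)")
code, params = parse_stop_line(stop_d1)

print(BUGCHECK_NAMES[code])
print("memory referenced:", hex(params[0]))  # 0x0 means a null pointer
print("IRQL at time of fault:", params[1])   # 2 is DISPATCH_LEVEL
print("offset into termdd.sys:", hex(module_offset(params[3], 0xFFFFF88002FEB000)))

# The date stamp shown next to termdd.sys on the blue screen.
stamp = datetime.fromtimestamp(0x4CE7AB0C, tz=timezone.utc)
print("termdd.sys build date:", stamp.date())
```

In all three screens the first parameter is zero and the second is 2, meaning something dereferenced a null pointer at DISPATCH_LEVEL. In the first one the faulting address falls inside termdd.sys, the driver that also logs the Event 56 entries.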

So I will delete that share, turn off RDP since we don’t need it (And block 3389 at the Sonicwall) but what else can be or should be done? I ran scans with pro versions of both Malwarebytes and AVG Server Business and they come up clean. Plus have been on with daily scans the whole time. Thoughts?
Distinguished Expert 2017

Commented:
SBS at the office site? Do you have multiple locations? It sounds as though there may be another DC that took over the AD roles.

It could be a hardware issue: network, memory?

Does the hardware include hardware logs?

Failing after 30-90 minutes of use could point to a flawed memory module that is only reached once enough resources have been consumed over time.

Depending on what the system does and on the load, that point is reached sooner rather than later.

RDP might only add to the demand on memory.

Author

Commented:
One site, One Server.

I physically have removed the server to troubleshoot over an extended off time never once having the issue.

I thought maybe RAM as well, since the crashes do happen during hours when everyone is working. But I see no errors or warnings relating to hardware in the event logs, and at this point I have read through each one. I have also run Windows Memory Diagnostic Tool and memtest86. I may just replace the RAM anyway since it is an inexpensive replacement.
Philip Elder, Technical Architect - HA/Compute/Storage

Commented:
Take the UPS the box is plugged into at your office back to their site and plug the server into it there.

Betcha the problem "goes away".

Another possibility is to also take the switch you have it plugged into.
Distinguished Expert 2017

Commented:
The event log will reflect the panic but may not indicate it as a memory issue.

What hardware is it on? It may have a way to check the hardware log.

Run sfc /scannow
What type of storage?

Author

Commented:
Arnold,

Ran sfc /scannow multiple times ("Windows resource protection did not find any integrity violations").

RAID 1: replaced BOTH drives (imaged first, then actually rebuilt the mirror).

Philip,

Bought a new UPS, 1 week old. Moved the server to the other side of the building on a different breaker. But again, I don't see power as an issue: since I turned off auto restart, it just stops at the BSOD and never loses power.
Distinguished Expert 2017
Commented:
Hardware? Branded? Does it have hardware access such as HP Insight, Dell OpenManage, or IBM Director? ipmitool to query the hardware?

Look at blocking those ranges on the firewall if you do not need RDP, or allow RDP only from predefined IPs if you do.
This is possibly a brute-force attempt, or it is being attacked to create an overflow condition to gain access.

Author

Commented:
RAM was only $90.00 for 4 x 4 GB, so I bought enough to replace all of it. Bad RAM getting hit at peak usage has some merit. But I am still concerned about the unknown share and the Russian IPs. I have been reading about the brute-force attacks that are so prevalent this year, and that might be what is overwhelming the server and triggering the bad RAM. On paper, does that sound possible?

Author

Commented:
It's a Dell PowerEdge T110 II
Distinguished Expert 2017

Commented:
For this entry-level small business server, I think you can get Dell OpenManage Server Administrator, through which you could get access to the hardware log.

Is your RAID controller the S100?

SBS 2011, if I'm not mistaken, has reached end of support. Consider getting a newer used T430 with a newer OS, a better/newer RAID controller, and faster hot-swap 2.5" disks.

Author

Commented:
SBS 2011 (2008 R2) is End of Life next month 1-14-2020. Already ordered new T340 with 2019 Server but still need to run until then and during the migration.

My RAID controller is the Dell PERC H200 and 6Gbps SAS HBA.

I haven't tried OpenManage, though I think I have it hanging around. I wondered about the RAID controller too. But then it would resync the drives after these events and run like a dream in my office (day 3 of uptime now).
Edmond Hawila, Chief Operating Officer

Commented:
It does seem from what you say that this might be related to the RDP attacks. Just take it back to the office, make sure that any NAT for 3389 is removed from the firewall, and see if it behaves better. You could also install RDPGuard to monitor and protect this; they have a 30-day trial which you can use without any cost.
kevinhsieh, Network Engineer

Commented:
Fully patched? This could be due to the BlueKeep vulnerability in RDP. Definitely shut down RDP access at the firewall.

RDP should not be allowed directly from the Internet. Protect it via VPN, RD Gateway, or restrict to specific static IPs at the firewall.
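The "restrict to specific static IPs" idea amounts to a simple allowlist test. The real rule belongs on the firewall, not in a script, so the Python sketch below only illustrates the logic. The networks in it are placeholders from the reserved documentation ranges, not real addresses.

```python
import ipaddress

# Hypothetical allowlist: replace with the static IPs or ranges that truly need RDP.
ALLOWED_SOURCES = [
    ipaddress.ip_network("192.0.2.0/28"),      # e.g. a branch office block
    ipaddress.ip_network("198.51.100.25/32"),  # e.g. one admin's static IP
]


def rdp_allowed(source_ip):
    """Return True only if the source address falls inside an allowed network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_SOURCES)


for candidate in ("192.0.2.5", "198.51.100.25", "203.0.113.9"):
    verdict = "allow" if rdp_allowed(candidate) else "deny"
    print(candidate, verdict)
```

Everything not explicitly listed is denied, which is the safer default for a service like RDP.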

Does the Sonicwall support rules based on the country of the source or destination? If so, only permit inbound traffic from countries that need it.

We host our own Exchange on site. All access to OWA and from Outlook or mobile device can only be done domestically. Any access from a foreign country requires notification from the user of the countries they will be in and the dates. We then put rules in the firewall that have schedules to allow access from the specified country during the specified dates. It helps keep crap off the network.
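The scheduled per-country rule described above boils down to a date-window check. Here is a small Python sketch of that logic, assuming the firewall has already resolved the source country. The TravelRule type, the country codes, and the dates are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TravelRule:
    """A temporary exception: one country, allowed between two dates inclusive."""
    country: str
    start: date
    end: date


def access_allowed(country, on, home_country, rules):
    """Allow domestic traffic always; foreign traffic only inside a travel window."""
    if country == home_country:
        return True
    return any(
        rule.country == country and rule.start <= on <= rule.end
        for rule in rules
    )


# Hypothetical example: a user reported a trip to Germany for one week.
rules = [TravelRule("DE", date(2020, 3, 2), date(2020, 3, 8))]

print(access_allowed("US", date(2020, 3, 5), "US", rules))  # domestic
print(access_allowed("DE", date(2020, 3, 5), "US", rules))  # inside the window
print(access_allowed("DE", date(2020, 3, 9), "US", rules))  # window expired
print(access_allowed("RU", date(2020, 3, 5), "US", rules))  # never requested
```

Because the exception carries its own end date, access closes again without anyone having to remember to remove the rule.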

Author

Commented:
I am going to follow all these suggestions as well as change the RAM when it comes on Thursday. Hopefully we get clear and buy enough time to get to our upgrade. I will follow up on Friday.

I was checking some dump files, and ntoskrnl.exe was also named in one of the BSODs. I believe this also points to faulty RAM?
Robert R, Computer Service Technician

Commented:
It could be that the temperature in the server closet is hotter than where it is set up at your workstation, but then you would have overheating events in the event log.
Distinguished Expert 2017

Commented:
You would need to download and install Dell OpenManage Server Administrator, which, when run in a browser such as IE, will provide info on the hardware and any events there.

Author

Commented:
Update.

I haven't gone back to the site. The server has had zero BSODs and 100% uptime for a week, other than a restart I made to check the RAM type. I installed OpenManage and all the hardware has green checks, with no hardware errors in the logs. Tomorrow I will return it to the site, see where we are at, and follow up.

Author

Commented:
Last weekend I brought the server back and shut off RDP at the Sonicwall, as well as removing the unknown share in HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\shares. I also made sure all updates were current. The RAM I ordered was incorrect, so that was never replaced. But OpenManage showed no hardware issues in the logs, and the other event logs show nothing new. The server has run for 6 days, at least 4 of them fully staffed.

The new T340 is ordered with 2019, and it is nice to have the customer functioning in the meantime. Plus the migration is going to be much easier with a running server. I never felt this was a heat, power, or hardware issue but ran the gauntlet anyway. It sure feels like suspicious activity was taking the server down.

Thanks all for the help. I'm never sure how points get dispersed when so many give helpful advice, but I'll do the best I can.
Philip Elder, Technical Architect - HA/Compute/Storage

Commented:
Never, ever, open an RDP listener on any port to the Internet. That's just asking for trouble. TSGrinder is the main tool used to attack a system this way.

Microsoft built RD Gateway into the Remote Desktop Services setup to provide an extra layer of security. I suggest using it.