floyd197

DELL PowerEdge T320 Server VLT0204 PS2 PG Fail Voltage Is Outside Of Range. Contact Support Message

Just been to a site that I haven't visited for a while and noticed the above message in orange on the front of the server.

I don't know how long the message has been on and when it first started. Is this message a serious error message
and what is the likeliest cause?

The server is out of warranty.

Thanks
Sajid Shaik M

PS2 is power supply 2, so this is a voltage issue on that supply.

If possible, shut down the server, remove the power cables, press the power button for 30 seconds, and then plug the cables back in. That will flush the error logs, and most of the time it solves the issue.

If the error remains, boot the server on the 2nd power supply and check again. If the issue still remains, replace it.

all the best
ASKER CERTIFIED SOLUTION
Jane Updegraff

floyd197

ASKER

I only had a quick look but it looked like both Power supplies were lit up green. It looked like there was an amber light pulsing
at the back towards the bottom left.
Hmmm, was there any code on the LCD other than what you already posted? An EXXXX code, for instance, possibly at the very beginning of the scrolling message?
Don't know to be honest. Didn't notice but there may have been. Have attached some screenshots but they obviously do not
show everything
Dell1.jpg
Dell2.jpg
Dell3.jpg
Well if you can get someone at the location to get the error code from the very beginning of the string that displays on that LCD, we can almost certainly get you closer to whatever the error means.
Thanks. Will try and get the full message for you.
I have asked somebody in the office to read what the message says,

The message reads "VLT0204 System board PS2PG fail voltage is outside of range. Contact Support".
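
For anyone logging these front-panel messages, the message ID at the start of the string is the part worth capturing. Below is a minimal Python sketch, assuming the ID is three uppercase letters followed by four digits (as in VLT0204 above). The function name split_lcd_message is made up for illustration and is not part of any Dell tool.

```python
import re

# Assumption: the message ID is 3 uppercase letters + 4 digits (e.g. VLT0204),
# followed by free text. This pattern is inferred from the message in this
# thread, not taken from Dell documentation.
MESSAGE_RE = re.compile(r"^\s*(?P<msg_id>[A-Z]{3}\d{4})\s+(?P<text>.+?)\s*$")


def split_lcd_message(raw):
    """Return (message_id, text); message_id is None if no ID is found."""
    match = MESSAGE_RE.match(raw)
    if match is None:
        return None, raw.strip()
    return match.group("msg_id"), match.group("text")


if __name__ == "__main__":
    lcd = ("VLT0204 System board PS2PG fail voltage is outside of range. "
           "Contact Support")
    msg_id, text = split_lcd_message(lcd)
    print(msg_id)
    print(text)
```

Searching the Dell support site for the ID alone is usually easier than searching for the whole sentence.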

Is there anywhere else the message would be recorded, in case anything has been missed?

Thanks
Interesting. Well, there might be something to be found in the Dell OpenManage utility, assuming it's installed on the server and that the server is able to run. There is likely also a diagnostic utility in the BIOS that you can run pre-OS boot, either in person with a monitor and keyboard/mouse or remotely using the iDRAC ... once again, that is assuming you have an iDRAC onboard and already plugged in to remotely manage the server. There is also an online tool for diagnosis but that is only for older systems. Here are the options for diagnostics: https://www.dell.com/support/article/us/en/19/sln292190/diagnostic-tools-for-advanced-troubleshooting-on-poweredge-servers?lang=en

Personally, I would start with the diagnostics available in the BIOS, but obviously that means taking the server out of production. If that is not feasible, then log into the server and see if it has the "OpenManage" application installed and, if so, run it to see what it offers. If that doesn't tell you much, move on to the next diagnostic method, etc.

Also, I found out that "PS2PG" means "power supply 2 power ground", so I'm guessing that it's basically saying that the voltage on the power ground is wrong on power supply 2. If it's expected to be at 0 (which is what I would expect, since ground isn't supposed to carry any voltage) and for some reason (due to environment, hardware failure, whatever) some power is detected on the ground of a power supply, in this case on the PSU2, then this error would be triggered.

Try swapping positions of the two power supplies, then clear nvram by unplugging the power cables and holding in the power button for several seconds, then plug everything back in and start it up and see if the same error re-appears and if so, you need to find out if it reappears for the same power supply position or the ps1 position. That will tell you if the power supply is to blame or not.
Thanks for your detailed response. They had a power issue a month or so ago where the office didn't have enough power or
something like that and all the other businesses on the park were affected as well. I am guessing that may have caused something
though I can't be sure as I haven't visited the site for a while.

The server is SBS 2011 and was purchased in December 2013. It doesn't have the Dell OpenManage utility. Where can this be
downloaded from, and is it safe to install in a working environment?

Thanks
Dell OpenManage is safe to use wherever, I have used it on production machines for years. I doubt if it calls for a reboot after install, but if it does, you can always just schedule a reboot for off-hours (assuming you have off-hours) and return to it the next day. OpenManage is free from Dell, and normally it would have been pre-loaded on the server and a shortcut would be on the desktop, had the server been loaded by Dell themselves. It sounds like you bought a pre-loaded SBS server, so yours was possibly built not by Dell but perhaps wiped and reloaded by the reseller who sold you SBS in a package with the server.

download openmanage here:
http://www.dell.com/support/contents/us/en/19/article/product-support/self-support-knowledgebase/enterprise-resource-center/systemsmanagement/OMSA

step-by-step installation walkthrough here:
https://www.dell.com/support/article/us/en/19/sln170723/openmanage-server-administrator---installation--step-bystep-?lang=en

Then if you need more information about their diagnostics or want to do some online diagnostics (try that next), you can always look up your specific machine using their support site. It doesn't have to be under warranty to look up the documentation and diagnostics for it, and the documentation online includes all the hardware part numbers for your specific server, as it was when it shipped from the Dell factory, ... very handy if you should find that you need to buy a new power supply. Their site is also just a good place to begin for diagnostic help, warranty or no.

To do that you just go to http://www.dell.com/support/home/us/en/4 and enter the service tag number (serial number, really - it's alphanumeric and for Dell servers I think it's 7 or 8 characters) that is on the sticker on the back or side of the machine. You can also get the service tag from the command prompt by entering the command: wmic bios get serialnumber
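
If the service tag needs to be collected from a script rather than typed by hand, the wmic command above can be wrapped as follows. This is a rough Python sketch, assuming wmic is present on the machine and prints a "SerialNumber" header line followed by the value. The helper names and the sample tag ABC1234 are invented for illustration.

```python
import subprocess


def parse_wmic_serial(output):
    """Pull the serial number out of 'wmic bios get serialnumber' output.

    Assumes the first non-empty line is the header 'SerialNumber' and the
    next non-empty line is the value.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if len(lines) >= 2 and lines[0].lower() == "serialnumber":
        return lines[1]
    return None


def read_service_tag():
    """Run wmic (Windows only) and return the service tag, or None."""
    result = subprocess.run(
        ["wmic", "bios", "get", "serialnumber"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_wmic_serial(result.stdout)


if __name__ == "__main__":
    # Made-up sample output, so the parser can be tried on any OS.
    sample = "SerialNumber  \r\nABC1234  \r\n\r\n"
    print(parse_wmic_serial(sample))
```

Only read_service_tag() needs Windows; the parser can be checked anywhere against pasted output.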
That's great, thank you. It was purchased direct from Dell so I would have assumed it would have been.
Just installed Dell OM.
Dell-OM.png
So it's pretty much the same message. That didn't help much but at least you know that the other components are good. Well, since this server is running, first back it up thoroughly, because during this troubleshooting there is always a chance that you will turn it off and it will never come back on. If it really is damaged from the brownout you had, there is just no telling. So back it up in whatever way you use. If it has SQL on it, back SQL databases up separately using the SQL backup feature. If it has file shares on it, copy the file shares to some other place before starting. Then you can move on to troubleshooting.

If it were me, I would start by removing the power supply in the PSU 2 position (the PSU slots are clearly marked on the chassis with a "1" and "2"), then clearing nvram and seeing if the message clears. You clear nvram by unplugging the power cords and holding in the power button for 10-20 seconds. This drains the remaining power that nvram uses to hang on to the error message between reboots. Plug it back in and turn it on. If the message clears, then don't put power supply 2 back in, it'll just display the same message ... just order a new power supply to replace it, and that may clear the error. You'll have to live with one power supply until the new one arrives. BUT if removing PSU 2 doesn't change anything, try swapping the positions of the two power supplies. You won't be able to do that second suggestion with the server running. Like I said, there is a chance that if you turn it completely off and clear nvram, it may shut itself down in fail-safe mode or not turn on at all. Cross your fingers and hold your breath when you go to turn it back on. :-)

But here's the idea .. once you swap them and drain the ram power, then turn it back on .. see if you get the same message and if you do, which power supply it reports being on. If it still reports that it's power supply on position 2 after you have swapped power supplies, then you probably could get rid of the message by replacing the motherboard with a new motherboard, although let's be clear here ... it might run properly, without issue, for years to come while still displaying that error. It depends on how MUCH voltage is detected on the ground, something the error doesn't tell you. If the error is transferred to power supply 1, then you will need to buy a replacement power supply and replace the one you moved from psu2 slot to psu1 slot to see if that clears it.
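
The swap test described above boils down to a small decision rule, which can be written out like this. It is only a sketch of the reasoning in this thread, not a Dell diagnostic. The function name and the returned labels are made up for illustration.

```python
def interpret_swap_test(slot_before, slot_after):
    """Interpret the result of swapping the two power supplies.

    slot_before: slot number (1 or 2) the error named before the swap.
    slot_after:  slot number the error names after the swap and an
                 nvram clear, or None if the error no longer appears.
    """
    if slot_after is None:
        # The message did not come back after the power drain.
        return "cleared"
    if slot_after == slot_before:
        # The error stayed with the slot, not with the unit that moved.
        return "system board or slot"
    # The error followed the physical power supply to its new slot.
    return "power supply unit"


if __name__ == "__main__":
    print(interpret_swap_test(2, 2))
    print(interpret_swap_test(2, 1))
    print(interpret_swap_test(2, None))
```

A "cleared" result does not prove the hardware is healthy, only that the stored message went away.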

But as I said, if you are the gambling type, you might be able to run this server indefinitely with that error continually displaying, because it tells you very little other than that it detected some amount of voltage on the ground. And it says a LOT that it hasn't actually died yet or shut down in fail-safe mode. On the other hand, it might be a really bad ground leak and eventually it might damage the server even more. So back that thing up ASAP.

Your instinct of it being due to the power brownout that they had in the office park is a very good one, because brownouts wreak absolute havoc with capacitors and tiny little circuits like those on the motherboard inside this server. If you filed an insurance claim for that event, I would put this server on the list of damaged items.  But if you must keep it running, you can do nothing and hope it doesn't get worse, or you can try the above to see what part is bad.
Many thanks again for the detailed steps. Just to clarify: remove the power supply in PSU 2 position,
unplug the power cable and hold the power button in for 10-20 secs?

Can I do this with the server running? Will pushing the power button not turn the server off?
Just removing the power supply (to see if the error clears) will not turn the server off, but yes, that second suggestion where I mentioned clearing NVRAM must be done with the server off.
Trying to do the right thing here but the customer doesn't have a service contract and wouldn't renew the Dell on-site
warranty when it expired. I don't want to get into a situation that renders the Server unusable but don't want to leave it in case it
does further damage down the line.
I think one of the power supplies is plugged into the mains and the other into a UPS
but cannot remember which one is in which. I am hoping no. 1 is in the mains in case
for some reason the UPS doesn't support it even though it should.
I can certainly understand your quandary. But if the customer didn't want to extend their support for the product there isn't much you can do. Of course you can explain the situation and explain that there is no way to know exactly what is causing the error message without further diagnostics. And let me state it a different way ... if it's going to die when you run a diagnostic, it was also going to die the next time it lost power. So would they rather risk it now (on a schedule where it can be predicted) or risk it then (when it will be surprise)? That's their call and not yours, but you should at least explain that if they choose to ignore the error message, everything might continue to work fine or it might not and that you have no way of knowing which. The only way to be sure that the server remains reliable is to replace whatever part is damaged and the only way to determine exactly what part is damaged is to do certain diagnostic procedures. See what they want to do and go with the customer's directive.

It's interesting that one of the power supplies would be plugged into a UPS while the other is not. I wonder why it's set up that way? I plug everything sensitive into a UPS to avoid damage from power spikes and dips (brownouts) like the one your server just experienced. But let's say that UPS is in bad condition, I wonder if it could introduce a trickle of power on the ground? Maybe you should find out by moving the power cable to a wall socket instead of the UPS and see?
I think the reasoning behind that is that if both were plugged into the UPS and the UPS failed overnight or
something, the server would go down, whereas this way it would stay on. Though that would seem unlikely.
I cannot think of any other reason.


It is a good point and I think that it would be better if both were plugged into the UPS to protect against surges and uneven power.


The UPS had new batteries only a couple of months ago.
There may be nothing wrong with the UPS but it may also be where the ground leak is coming from, and it's easy to test for that, so why not? Just move that one cable that is plugged into the UPS into the wall and re-run the diagnostic in Dell OpenManage and see if it still shows voltage on the ground on PS2. It won't affect the server since you are not unplugging BOTH at the same time, just the one. That's exactly why you have two. :-)
Many thanks again. Will try that first.
Some more logs attached - don't know if this helps any more or not. I have put something
in writing to them so will take it from there.
This issue still exists as the client is dragging their feet. I may need to reboot the server for another issue.
Should this be OK, as I am not shutting it down?

Many thanks
Does anybody know if it is likely to be safe to reboot this server without shutting it down? Thanks
Rebooted the server which cleared the error. Thanks for the help.
Sorry for the late reply. Many thanks for the help and detailed responses.