This is a question about an Internet outage at a client today. Up front:
1. I decided to ask a question instead of writing an article: My choice.
2. I rated it high priority even though I know the answer to provide incentive for experts to answer.
3. I do not want and will not accept Googled answers. I already know how to use Google. Use only your own words, skill, knowledge and experience.
4. I could not find a solution here but that does not mean there isn't one.
5. I will select the first two best answers: 1,500 points each.
Background. This is a downtown client for which I do the financial consulting work. I went there this morning and the Internet was just fine from 8:00 am to about noon, when it went "thud" and was gone. Normally that is an external problem, so I hooked up my Rocket Stick to finish a few things and asked the Office Administrator if she had called the ISP. She was already on the phone to them. No one has internet, so no (hosted Exchange) Outlook email, but servers and printers are running.
Hookup: ISP modem, business internet, 6 static IP addresses allocated. At this point, one IP is for the office, one IP is for Wireless Guest access (no access to servers or network), and one IP is for the wireless POS solution for our ticketing system.
The modem is attached to an ISP Cisco 891 (?) box; this lashup is there to provide high speed internet to the office. Hooked to this is a Juniper Netscreen VPN router / firewall. Hooked to the Juniper are a couple of HP Switches that distribute Internet services to employees and servers; the same switches connect employees to folders and printers.
There is another router that has one of the static IP addresses and provides wireless internet to the POS devices for ticketing.
Troubleshooting. The internet is now out (as I stated earlier) and I go down to the room where the servers and the switches / internet gear are. On the way I pass the Office Admin's desk, where she is being fed ISP Pablum that it is our issue and not theirs. In hindsight it was our issue, but that certainly was not apparent at the time. We asked for technical support onsite and they agreed.
We rent our facilities from a host and we learn their internet went out at the same time. Too easy! Must be an external issue but it turns out it was not.
By this time, I have contacted my colleague who normally provides IT services to this client, but he was engaged at another of our clients today. We start talking. I do not see any lights (even power) on the Cisco 891. Let's restart it. We do - nothing. It turns out Cisco (who cannot design user friendly gear for love nor money) puts the lights on the back where you cannot see them. I pull out the Netscreen and Cisco and see the lights.
Did you restart the Juniper Netscreen? No, I said. Let's restart it. But I will observe here that both he (somewhere else) and I (on my Rocket Stick) could make tunnels to the Netscreen. He disconnects and I restart the Netscreen. Nothing. He logs into the Netscreen and tells me the Netscreen CPU is running full tilt and showing orange on the dashboard. Why?
Did you restart the HP Switches? No, I said. If you want to do that, I need everyone to log off so they do not lose work. I go to the office, gather everyone and tell them to log off immediately. They do. Back to the server room, unplug / re-plug the HP Switches. Nothing. Now the switches, router, Cisco box have all been restarted. Let's restart the ISP Modem. We do and nothing. My colleague tells me that his pings through the VPN tunnel are 4/5 lost and 1/5 connect.
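To put a number on that last observation, here is a minimal sketch of the arithmetic, assuming "4/5 lost" means that out of every 5 pings sent only 1 got a reply. The function name and the counts are mine for illustration; they are not taken from any tool output.

```python
def packet_loss_percent(sent: int, received: int) -> float:
    """Return the percentage of probes that got no reply."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return 100.0 * (sent - received) / sent


# Assumed numbers matching the "4/5 lost" report: 5 pings sent, 1 reply.
loss = packet_loss_percent(sent=5, received=1)
print(f"{loss:.0f}% packet loss")
```

Loss at that level on a link that still passes the occasional packet points more toward something saturating or confusing the path than toward a dead line.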
Back to the office and the ISP Technician has arrived. We know each other because he has been here before (a year ago) when the ISP put in the Cisco box to raise speeds. I am happy to have him instead of a stranger. We go down to the server room and he asks: Did you restart this, that and the other thing? Yes, said I.
He checks connections. Nothing obviously wrong. He verifies there is no internet, calls "home" to check a few things about our system, starts his computer. It is now a couple of hours or so later and still no internet.
He connects to the POS router (which is a different IP on the same ISP modem) and gets a decent signal. Interesting but that is not our business IP. No internet at the business IP address.
Now, have you grasped all this? Have you started to form an opinion?
He asked: Have you restarted the POS router? No, I have not. Different system I said. He agrees. But he says, I am going to restart it. He does. After a couple of minutes, Internet to the business has returned, we walk down to the office and all is well. My colleague sends me an email that the Juniper Netscreen CPU is back to normal.
The question: What was wrong, and what caused the internet to go out (or at least to be so utterly slow as to not respond)?
Have you seen this before? Any thoughts?