Solved

My network loses connection with the server at approximately the same time each day

Posted on 2011-03-21
Last Modified: 2012-05-11
I have a network of 6 computers running Windows XP with a server running Windows Server 2003. We work three evenings a week, and the network loses connection with the server at approximately 5pm on each of these evenings. For example, we run a program called Oasis which stores client data on the server; when the slowness starts, the computers freeze up when you try to pick a different client and look at their info, and at the same time a shared directory on the server becomes inaccessible.  Logging out on the computers resolves the issue.  This problem doesn't seem to arise at any other time of day.  All computers have the same problem at the same time when it starts. I am also using a Cisco RVS4000 router, a TP-Link TD8810(?) modem/router, and Vet antivirus.  I have had a computer tech look at this but he couldn't work out the issue.  I don't want to exclude any of your possible solutions, but I have tried:
- briefly checking for scheduled tasks at this time; I'm not sure if I may have missed something
- changing the time on the computers (seemed to work the first day but then it happened again)
- turning off XP auto disconnect from server if computer is idle
- running a batch file that continually pinged the server so that the connection was never idle
I have some computer literacy but am not an expert by any means, so please provide as much instruction as possible with your solutions if you can.  Thanks.
Question by:mhwolog
77 Comments
 
LVL 16

Expert Comment

by:Malmensa
ID: 35186377
Set up one machine to ping the server continuously over the "bad time" (type PING servername -t at a command prompt). This will help narrow things down a bit.
 

Author Comment

by:mhwolog
ID: 35186477
Malmensa, if I set this up, what do I look for then?
 
LVL 1

Expert Comment

by:BigBlake
ID: 35186594
Check the following

1. Anti virus scans - are there any tasks within the antivirus software that kick off a full HDD scan at a particular time?

2. Backup Software - What time do your backups run?


Also does the ping drop out when the share disappears? Are any other resources on the server affected like shared printers or other shares?
 

Author Comment

by:mhwolog
ID: 35186617
BigBlake, I've checked the antivirus software and backup software and they are both set up to run in the middle of the night, outside of business hours.  I'll have to check the ping next time the share disappears.  The shared printer on the server does stop responding/printing.
 
LVL 1

Accepted Solution

by:
BigBlake earned 250 total points
ID: 35186649
Couple more questions to help narrow it down...

What are the physical specs on the server?

What sort of database is Oasis running?

Are there any other services running on the server (AD, email, web sites?)

When the server is unavailable can the PCs still access external resources - internet or other shares?

Cheers
 
LVL 16

Expert Comment

by:Malmensa
ID: 35186696
If you have a problem with cabling, a switch, a network card, network card drivers, or any TCP/IP stack problems, you will see a heap of lost packets.
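
As a rough sketch of how lost packets could be counted rather than watched for, the batch file below sends a fixed number of pings and then searches the saved output. The name servername and the file name pingtest.txt are placeholders, not taken from this network.

```batch
@ECHO OFF
REM Placeholder name (assumption): replace with the real server name or IP address.
SET SERVER=servername

REM Send 100 pings, roughly one per second, and save the full output.
PING -n 100 %SERVER% > pingtest.txt

REM Count the replies that were lost.
FIND /C "Request timed out" pingtest.txt

REM Show the summary line that PING prints at the end (Sent / Received / Lost).
FIND "Lost" pingtest.txt
```

Running it once at a quiet time and once over the "bad time" gives two loss counts that can be compared directly.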
 

Author Comment

by:mhwolog
ID: 35195334
Hi,
I've checked a few things out for you.  I've set up the continuous ping to the server; I'll see how it goes this evening.  And I will also check if the ping disappears when the share disappears, and if the other computers can still access the internet.

The physical specs on server are: Intel Pentium 4, 3.00GHz, 1.5GB ram running Microsoft Windows Server 2003 for Small Business Server.
Oasis database is mimer SQL data, with .dbf data files
The server is not running any websites, or significant email, I'm not sure what AD stands for?

Thanks.
 
LVL 1

Expert Comment

by:BigBlake
ID: 35195359
AD - shorthand for Active Directory - handles Microsoft domain authentication etc. If you only have 6 PCs you may not be using it.

By the way - is your network wired or wireless?
 

Author Comment

by:mhwolog
ID: 35195577
Active directory is loaded on the server and there are details/computers set up in there, but how do I tell if it is actively running/doing anything?

The network is all wired.
 
LVL 1

Expert Comment

by:BigBlake
ID: 35195667
If you are logging on to the domain in the morning then AD is working. It is a fairly integral part of SBS server.

On the server is there a console / program installed to manage the mimer SQL database where you can check if it has any scheduled maintenance jobs running?
 

Author Comment

by:mhwolog
ID: 35195951
The login for each XP computer does have the domain listed so then yes it sounds like AD is working.

Regarding scheduled maintenance jobs for the mimer SQL database, I called the oasis program support, they said the only one would be in windows scheduled tasks.  I looked into this and found all but one to be working outside of business hours, but I'm not sure what some of them are and whether they can be disabled:
oasis backup 10 PM each workday
collect server performance data every 1 hour from 2:45PM Each day
collect usage data 4:30AM Each day
SBS - server status report - server performance report 6AM Each day
SBS - server status report - server usage report 6:30AM Each fortnight
 

Author Comment

by:mhwolog
ID: 35196821
Have some more info:

All the computers slowed down at the usual time, I logged out/logged in the user for all of them except the one running the constant ping to the server.  Now all are working normally except that one.  
The slow computer had:
- no problem continually pinging the server; it is still going through now.  I did see one timed out ping.  Individual pings also go through.
- oasis program freezing when changing client file
- My computer was also freezing as it usually shows the server's shared/network drives
- shared directory on server inaccessible
- Internet connection through server is working as normal
- Task manager shows main usage is by oasis 50,000K, iexplore 42,000K, AAWservice 17,000K, svchost 13,000K, explore 12,000K, Hicaps connect 11,000K.
- cpu usage 1%-3% usually, occasional spike to 47%
- PING DOES NOT GO THROUGH FROM SERVER TO THE SLOW COMPUTER (not sure if this is good information), but ping goes from server to other computers.

Server at this point shows task manager main tasks are
sqlserver 112,000K
services 63,000K
sqlserver 50,000K
isafe 39,000K
svchost 38,000K
sqlserver 35,000K
lsass 31,000K
oasis 30,000K
svchost 7,000K
svchost 7,000K
3%cpu usage
sqlserver and svchost show up 3 times each.
Thanks
 
LVL 1

Expert Comment

by:BigBlake
ID: 35197146
Well it doesn't appear to be a lack of CPU / memory on the server side causing this, although out of interest when was the server last rebooted?

The fact that the server can't ping out to the one PC either is interesting. I see the Cisco RVS4000 has four LAN ports, but you have at least 7 devices, so what is plugged into the RVS and what other switch / hub are you using?
 

Author Comment

by:mhwolog
ID: 35197364
Server was last rebooted maybe a week or two ago.

Server and other computers connected to linksys SR224 24 port 10/100 switch
Router connected to tplink modem/router and portswitch above.
 
LVL 1

Expert Comment

by:BigBlake
ID: 35204083
OK, Have you done any troubleshooting around the Linksys switch yet? Not that they have much in the way of diagnostics on them, but if the pattern of lights on the front changes during the trouble period it could help us narrow this down.

Also, is it possible there are scheduled tasks running on any of the other PCs connected to the network (or even on a small NAS unit) at around this time?
 

Author Comment

by:mhwolog
ID: 35204700
Due to slow down soon, I will check
 

Author Comment

by:mhwolog
ID: 35205291
I checked the other PCs for scheduled tasks (I don't know why my IT tech didn't?). Most of them have:
OGALogon - OGAverify.exe to run at logon
OGADaily - OGAverify.exe to run at 6:00pm each day
I looked it up - Windows Genuine software check.  (can this be deleted from the list?)
I changed the scheduled time on all the computers - only one seemed to slow down + the printer was lost off the network at around 5:30pm - could this still be related to the problem even though time wise it slowed a bit before 6?  Maybe I just didn't get to that computer in time and the process had already started?  However, I also had the PC windows clock set 3 hours early and it still happened - is there an internal clock that scheduled tasks might run to?
Is there anywhere else to look for scheduled tasks apart from windows task scheduler?  And could the UPS or printers have scheduled tasks that would affect this?

I didn't really get a chance to look at the port switch because they didn't all slow down as usual.  I will have to see how it goes Monday evening.
 
LVL 1

Expert Comment

by:BigBlake
ID: 35205367
Hi Mhwolog,

UPS and printer won't have scheduled tasks. I have come across some small NAS units that will run a task to back up a remote machine (like the Oasis data files on the server), scheduled from the NAS end. I have also seen the same done with PCs running a scheduled task for a similar reason. Nothing wrong with this as such, but it is one of the things you tend to lose track of as people change.

I see no problem leaving the OGA task running - especially if you have now set it for a time that it won't be a problem.
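
For checking scheduled tasks without clicking through each one, a minimal sketch using the SCHTASKS and AT commands, which ship with XP Professional and Server 2003. The output file name tasks.txt and the PC name in the commented-out line are placeholders.

```batch
@ECHO OFF
REM List every scheduled task on this PC with full details (next run time, command, account).
SCHTASKS /Query /FO LIST /V > tasks.txt

REM Also list any jobs that were created with the older AT command.
AT >> tasks.txt

REM Optional: query another PC over the network (placeholder name, needs admin rights there).
REM SCHTASKS /Query /S fdesk1 /FO LIST /V >> tasks.txt

NOTEPAD tasks.txt
```

Looking through tasks.txt for anything due around the problem time is quicker than opening each task in turn, and the same file can be collected from every PC and the server.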
 

Author Comment

by:mhwolog
ID: 35205423
Just to add another twist...
Just now oasis is still working in that I can change client files and look up their info.  But when I go to access one of their x-rays which is saved as a jpg on a shared directory on the server all the pcs are telling me "cannot create patient directory ...."  even though the directory is already there.
And when I go to explorer to access a shared directory on the server I get the message "\\... not accessible.  You might not have permission to use this network resource.  Contact the administrator of this server to find out if you have access permission.  The system detected a possible attempt to compromise security.  Please ensure that you can contact the server that authorised you."

The port switch looks like it is blinking as normal.
 

Author Comment

by:mhwolog
ID: 35205527
To enable the printer and scanner to be connected to the network, these are plugged into a D-Link 10/100 Fast Ethernet switch which is plugged into the main portswitch if this makes any difference.
 
LVL 1

Expert Comment

by:BigBlake
ID: 35210679
How quickly did the "not accessible" message come up? Was it almost instant (which means permission was explicitly denied somewhere) or did it take more like 20 - 30 seconds (which is more likely to mean something has lost connection and timed out somewhere)?

Also if you don't restart a pc, how long does it take before that PC can connect again?
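
One way to capture what the PC itself reports at the moment the share fails, sketched with a placeholder server name. NET VIEW and NET USE are standard Windows commands; the log file name sharetest.log is arbitrary.

```batch
@ECHO OFF
REM Placeholder name (assumption): replace with the real server name.
SET SERVER=servername

ECHO ===== %DATE% %TIME% =====>> sharetest.log
REM Ask the server for its list of shares; any error text is appended to the log as well.
NET VIEW \\%SERVER% >> sharetest.log 2>&1
REM Record the state of this PC's current connections to network shares.
NET USE >> sharetest.log 2>&1
```

In the log, "System error 5" means access was denied, while "System error 53" means the network path was not found, which roughly matches the instant-versus-timeout distinction above.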

Regards
 

Author Comment

by:mhwolog
ID: 35210815
Well last night as mentioned only one computer went slow, but they all had the not accessible message.  It was instant when you tried to look up the shared directory.  These directories also become inaccessible when the computers are running slow, but not specifically with that message, they either freeze or start asking for a username and password to access - which I would assume is the same problem.

Last night I left one that had the "not accessible" message for 1.5-2 hours and it still couldn't connect.  Then I restarted.
 

Author Comment

by:mhwolog
ID: 35230599
Only very short term slow down tonight, nothing really to speak of.  Will see how it goes over the next couple of days.
 

Author Comment

by:mhwolog
ID: 35254941
No, it's still happening.  It doesn't always happen the same way: today there was no inaccessible error message, but I still lost connection with the printer, the computers started freezing when changing client accounts, and shared directories couldn't be accessed.  The portswitch looks normal even in the middle of the problem.  Is there anything else to try?
 
LVL 1

Expert Comment

by:BigBlake
ID: 35265285
Is anyone running reports at this time of the day? Or doing anything different at this time?
 

Author Comment

by:mhwolog
ID: 35268704
No, the computers are mostly sitting idle at this time.  During this time there may only be one computer in use at a time.  However, normally during the day many computers may be sitting idle for up to 30mins before they are briefly used again.
 
LVL 10

Expert Comment

by:ReneGe
ID: 35303525
1) When this is happening, have you tried to continuously ping two computers (not the server) in your LAN that are plugged into the same switch?

2) Have you considered that your server's ethernet port may be a bottleneck? If that is the case, you may benefit from adding another NIC to either load balance the traffic, or at least route the traffic from your Oasis software to the IP address of the second NIC.

Cheers,
Rene
 
LVL 10

Expert Comment

by:ReneGe
ID: 35303539
I need to rephrase No.1

1) When this is happening, have you tried to continuously ping two computers (not the server) in your LAN that are plugged into the same switch that the server is plugged into?


Also,

If my ideas did not inspire you for a solution, would you mind drawing how your network is interconnected?
 

Author Comment

by:mhwolog
ID: 35305106
ADSL connection -> tplink modem/router -> cisco router -> port switch -> 7 PCs one of which is server
-> also 1 cable from port switch to D-Link 10/100 Fast Ethernet switch -> printer and another cable to scanner
I hope this is clear, I don't think I am missing anything.  All wired network.

Next time it happens I will try pinging two computers as you mentioned.

What do you think from the network layout?  Could it be a bottleneck? If it is the case, how do I "add another NIC to either load balance the traffic, or at least route the traffic from your Oasis software to the IP address of the second NIC"?
Would it make any difference changing the network layout?
 
LVL 10

Expert Comment

by:ReneGe
ID: 35305694
A bottleneck could indeed happen. If it's the case, we must find where.

For around how long do you lose ethernet connectivity with the server?

Also, you mentioned "port switch", then a "D-Link". They are 2 different switches, right?

If there are 2 switches:
-Are you sure that in no way a second port of the D-Link loops back to the port switch?
-Only the printer and the scanner are plugged into the D-Link, right?
-May sound stupid, but before it happens, have you tried disconnecting the ethernet cable between the D-Link and the switch?

Before changing anything, we need to understand what is really happening.

Use the following batch file to test ping. Of course, you will need to customize the IP addresses.

Each log file created will be named after its destination IP address, with a .log extension.

Also, if you have a managed switch that has an IP address, you could add it to the script as well.

 
@ECHO OFF

SETLOCAL enabledelayedexpansion

SET IPofSERVER=192.168.0.10
SET IPofaPC=192.168.0.100
SET IPofDefaultDW=192.168.0.1


:Home

REM PING EACH TARGET ONCE. DATE AND TIME USE DELAYED EXPANSION SO EACH LINE GETS ITS OWN TIMESTAMP
FOR %%A IN (%IPofSERVER%,%IPofaPC%,%IPofDefaultDW%) DO (
	PING -n 1 -l 1000 "%%A" >NUL
	IF !errorlevel! == 0 (
		ECHO !date! !time! IS OK : [%%A]
	) ELSE (
		ECHO !date! !time! ERROR : [%%A]
		ECHO !date! !time! ERROR : [%%A]>>%%A.log
	)
)
ECHO.

REM WILL TIME DELAY FOR 15 SECONDS
CHOICE /D Y /T 15 >NUL	
GOTO Home


REM BEFORE RUNNING THE SCRIPT, TYPE THE COMMAND "CHOICE" IN A DOS WINDOW AND SEE IF IT GIVES YOU A YES NO OPTION
REM IF IT TELLS YOU SOMETHING LIKE "UNKNOWN COMMAND" YOU NEED TO INSTALL THE XP TOOLS FOUND ON THE XP INSTALLATION CD
REM YOU CAN FIND IT ON THE MICROSOFT WEBSITE OR http://www.dynawell.com/download/reskit/microsoft/win2000/choice.zip

 

Author Comment

by:mhwolog
ID: 35307335
Usually I end up restarting the computer or logging out the user to regain connection with the server, but when I have left a computer it normally won't regain connection with the server even after a couple of hours.

Yes that is correct, there are two switches.  I will double check at work, but I'm pretty sure there is no loop back.

Just to confirm, IPofSERVER= local IP of server
SET IPofaPC= local IP of the computer the batch file is working on?
SET IPofDefaultDW= local IP of the router?
Is this batch file different from pinging between two computers other than the server that you mentioned?
 
LVL 10

Expert Comment

by:ReneGe
ID: 35307357
IPofSERVER= local IP of server
SET IPofaPC= local IP of another PC, plugged into the same switch as the server and the computer running this script.
SET IPofDefaultDW= local IP of the router.

If you have a PC connected to the second router, add it to the script as well.
 
LVL 10

Expert Comment

by:ReneGe
ID: 35307377
Also, don't run this script on the PC that freezes.
I would also add, as target IPs, another PC and the one that freezes.
 
LVL 10

Expert Comment

by:ReneGe
ID: 35307383
@ECHO OFF

SETLOCAL enabledelayedexpansion

SET IPofSERVER=192.168.0.10
SET IPofaPC1=192.168.0.100
SET IPofaPC2=192.168.0.101
SET IPofaPC3=192.168.0.102
REM AND SO ON
SET IPofDefaultDW=192.168.0.1


:Home

REM PING EACH TARGET ONCE. DATE AND TIME USE DELAYED EXPANSION SO EACH LINE GETS ITS OWN TIMESTAMP
FOR %%A IN (%IPofSERVER%,%IPofaPC1%,%IPofaPC2%,%IPofaPC3%,%IPofDefaultDW%) DO (
	PING -n 1 -l 1000 "%%A" >NUL
	IF !errorlevel! == 0 (
		ECHO !date! !time! IS OK : [%%A]
	) ELSE (
		ECHO !date! !time! ERROR : [%%A]
		ECHO !date! !time! ERROR : [%%A]>>%%A.log
	)
)
ECHO.

REM WILL TIME DELAY FOR 15 SECONDS
CHOICE /D Y /T 15 >NUL	
GOTO Home


REM BEFORE RUNNING THE SCRIPT, TYPE THE COMMAND "CHOICE" IN A DOS WINDOW AND SEE IF IT GIVES YOU A YES NO OPTION
REM IF IT TELLS YOU SOMETHING LIKE "UNKNOWN COMMAND" YOU NEED TO INSTALL THE XP TOOLS FOUND ON THE XP INSTALLATION CD
REM YOU CAN FIND IT ON THE MICROSOFT WEBSITE OR http://www.dynawell.com/download/reskit/microsoft/win2000/choice.zip

 

Author Comment

by:mhwolog
ID: 35307581
If you have a PC connected to the second router, add it to the script as well.
Do you mean connect another computer to the second switch, the D-Link?

Also, don't run this script on the PC that freezes.
I would also add, as target IPs, another PC and the one that freezes.

But they all tend to freeze (only tasks related to network, other programs work normally) at the same time... how do I run it on one that isn't frozen... should I just log out/log in on one and run it on that?  And do I only run the script when another computer is frozen - not when everything is working properly?

Thanks, I will try this tomorrow evening at work.
 
LVL 10

Expert Comment

by:ReneGe
ID: 35307864
I did make a mistake. I mentioned "second router" but I meant "second switch", the D-Link.

Thanks for pointing that out.

-You could just let it run all the time; it does not matter
-I would disable all scheduled tasks on the PC running the script.
-You could also run the script on the server

Here is the thing.
-If all pings go well for the router but none of the PCs, that will tell us that the switch is not overloaded
-If only the server does not ping, you need to load balance its NIC with another NIC
-If no ping works, either the switch is overloaded or the PC with the script can't communicate with the network.
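
The reasoning in that list can be sketched as a one-shot check. This is only an illustration under assumptions: the three addresses are placeholders, and only three of the possible result combinations are spelled out.

```batch
@ECHO OFF
SETLOCAL
REM Placeholder addresses (assumptions): replace with your router, server and one other PC.
SET ROUTER=192.168.0.1
SET SERVER=192.168.0.2
SET PC=192.168.0.3

REM A real reply contains "TTL="; timeouts and "unreachable" messages do not.
SET R=FAIL
SET S=FAIL
SET P=FAIL
PING -n 1 %ROUTER% | FIND "TTL=" >NUL && SET R=OK
PING -n 1 %SERVER% | FIND "TTL=" >NUL && SET S=OK
PING -n 1 %PC% | FIND "TTL=" >NUL && SET P=OK

ECHO Router=%R% Server=%S% PC=%P%
IF "%R%%S%%P%"=="OKFAILOK" ECHO Only the server fails: look at the server's NIC.
IF "%R%%S%%P%"=="FAILFAILFAIL" ECHO Nothing answers: suspect the switch or this PC's own connection.
IF "%R%%S%%P%"=="OKOKOK" ECHO Everything answers right now.
```

Checking for "TTL=" rather than the PING exit code avoids counting a "Destination host unreachable" message as a success.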

I just modified the script and added a CPU load monitor feature. It will log the CPU load of the PC running the script when a connection problem occurs.

 
@ECHO OFF

SETLOCAL enabledelayedexpansion

REM CUSTOMISE TARGETS TO BE TESTED for the remote destinations, add a "comma" at the end
REM IF YOU DO NOT WISH TO MONITOR LOCAL DESTINATIONS, JUST PUT NOTHING ON THE RIGHT OF "="
	SET RemoteDestinations=yahoo.com,youtube.com,
	SET LocalSubnet=192.168.0
	SET LocalIPs=1,10,100,101,102

REM BUILDING TARGET VARIABLE
	FOR %%A IN (%LocalIPs%) DO SET Destinations=%LocalSubnet%.%%A,!Destinations!
	SET Destinations=%RemoteDestinations%%Destinations:~0,-1%
	TITLE MONITORING DESTINATIONS: %Destinations%

REM THE LOG FILE WILL TAKE THE NAME OF THE BATCH FILE WITH THE LOG EXTENSION
	SET LogFile=%~n0.log

:Home

FOR %%A IN (%Destinations%) DO (
	PING -n 1 -l 1000 %%A >NUL
	IF !errorlevel! == 0 (
		PING -n 1 -l 1000 %%A | Findstr -i unreachable >NUL
		IF !errorlevel! NEQ 0 (
			ECHO !date! !time! IS OK : [%%A]
		) ELSE (
			CALL :ReportError "%%A" "1"
		)
	) ELSE (
		CALL :ReportError "%%A" "2"
	)
)

ECHO.

REM WILL TIME DELAY FOR 60 SECONDS
CHOICE /D Y /T 5 >NUL	
GOTO Home

REM BEFORE RUNNING THE SCRIPT, TYPE THE COMMAND "CHOICE" IN A DOS WINDOW AND SEE IF IT GIVES YOU A YES NO OPTION
REM IF IT TELLS YOU SOMETHING LIKE "UNKNOWN COMMAND" YOU NEED TO INSTALL THE XP TOOLS FOUND ON THE XP INSTALLATION CD
REM YOU CAN FIND IT ON THE MICROSOFT WEBSITE OR http://www.dynawell.com/download/reskit/microsoft/win2000/choice.zip

:ReportError
FOR /F "skip=1" %%B IN ('WMIC CPU get LoadPercentage') DO (
	IF %%B GTR 0 (
		SET /a Counter+=1
		ECHO %date% %time% ERR:%~2 : %~1 LOCALHOST.LOAD.CPU!Counter!=[%%B]>>"%LogFile%"
		ECHO %date% %time% ERR:%~2 : %~1 LOCALHOST.LOAD.CPU!Counter!=[%%B]
	)
)
SET COUNTER=0
EXIT /b

 
LVL 10

Expert Comment

by:ReneGe
ID: 35307870
Oops,

Change: CHOICE /D Y /T 5 >NUL
        To: CHOICE /D Y /T 60 >NUL
 

Author Comment

by:mhwolog
ID: 35311509
Hi, just noticed that within the properties of the server's local area connection,
network load balancing is disabled.  You were talking about possibly load balancing the traffic; could this setting have anything to do with the problem?


 

Author Comment

by:mhwolog
ID: 35311522
With the batch file, local IPs are 192.168.0.2, 192.168.0.3, 192.168.0.4 etc
what do I put here?

SET LocalSubnet=192.168.0
SET LocalIPs=1,10,100,101,102

Should it be:
SET LocalSubnet=192.168.0
SET LocalIPs=2,3,4,5,6,7
 
LVL 10

Expert Comment

by:ReneGe
ID: 35311524
No, if you only have 1 NIC
 
LVL 10

Expert Comment

by:ReneGe
ID: 35311529
SET LocalSubnet=192.168.0
SET LocalIPs=2,3,4,5,6,7

that is good
 
LVL 10

Expert Comment

by:ReneGe
ID: 35311545
Also, if you did not install the tools from the XP CD or the resource kit, you may not have the CHOICE command.

Open a DOS prompt and type CHOICE.

If you get a Y/N choice, then you have the command. If not, you can either:
-download it from the Microsoft site or from the link I provided to you
-install the tools from the XP installation CD
-install the Microsoft resource kit
-or replace the command "CHOICE /D Y /T 60 >NUL" with "PING 127.0.0.1 -n 60 >NUL"
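
The PING-based delay from the last option could be wrapped in a small subroutine, sketched below. The label name :Delay is made up for illustration. PING sends one echo per second, so N+1 echoes to the local machine take about N seconds.

```batch
@ECHO OFF
ECHO Waiting...
CALL :Delay 60
ECHO Done.
GOTO :EOF

:Delay
REM %1 = number of seconds to wait.
SET /A Pings=%1+1
PING 127.0.0.1 -n %Pings% >NUL
GOTO :EOF
```

This needs nothing beyond what XP already has, which avoids installing CHOICE on every PC.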

Cheers,
Rene
 

Author Comment

by:mhwolog
ID: 35313206
Okay here are the results from the batch files, have a look and tell me what you think.  All computers were running slow today, not just specific to evening, although none were restarted once the script was running.  I only added 192.168.0.50 later once I thought of it.

ADSL connection -> tplink modem/router (192.168.1.1) -> cisco router (192.168.0.1) -> port switch -> 7 PCs (192.168.0.3,6,7,45) one of which is server (192.168.0.2)
-> also 1 cable from port switch to D-Link 10/100 Fast Ethernet switch -> printer (192.168.0.5) and another cable to scanner (192.168.0.50) and I plugged in one of the pcs here (192.168.0.4)

Internet has been shaped to 56k; I'm not sure if that's why the websites were dropped sometimes.
major problems seemed to be with 192.168.0.6 (about 5years old, exactly the same model/age as 7 and 45) and 192.168.0.3 (older than 5years)
server02.log
fdesk1-03.log
fdesk2-04.log
lsurg06.log
ksurg07.log
backoffice045.log
 

Author Comment

by:mhwolog
ID: 35313217
1 pc was not running at all today.
192.168.0.3 is close enough to run through the second switch if you think this is  a wire problem.
 
LVL 10

Expert Comment

by:ReneGe
ID: 35314376
mhwolog, I've got to go. I'll look at this later today.
 

Author Comment

by:mhwolog
ID: 35320086
Just out of interest I ran the batch again this morning from the server.  I put notes in the log describing when I changed network wires around.  I haven't tried moving the PC at 192.168.0.6 to a different network cable yet.
server5thApril.txt
 
LVL 10

Expert Comment

by:ReneGe
ID: 35320403
Plug either the .0.3 or .0.6 PC to the D-Link, but not both, and do the same test.
 

Author Comment

by:mhwolog
ID: 35320457
that last log was with just the 0.3 plugged into the dlink
 

Author Comment

by:mhwolog
ID: 35320511
If anything it seemed to start the ping problem with the 0.3, and then even when this computer was plugged back into the original network port the problem remained.  I've written details in server5thApril.log
 
LVL 10

Expert Comment

by:ReneGe
ID: 35320529
You're right. Thanks for pointing this out.

Then, I'm going to ask you to do this.

Exchange the plugs of fdesk1-03 and fdesk2-04 on the switch.

And have the scripts running all day.

By the way, what is your time zone? Mine is GMT-5.

Cheers,
Rene
 

Author Comment

by:mhwolog
ID: 35320560
GMT+10 I'm in Australia!

Sorry so to confirm, originally both 0.3 and 0.4 were plugged into network ports which link back to the main linksys portswitch.  Do you just want me to switch the 0.3 and 0.4 to each other's network ports?  Or switch the plugs on the main linksys portswitch?  I just wasn't sure if you were talking about the extension onto the smaller dlink switch...
 
LVL 10

Assisted Solution

by:ReneGe
ReneGe earned 250 total points
ID: 35320856
Lets focus on the Linksys switch.

I would need to isolate the problem by eliminating options.

Q: What processes take the most CPU when you have 0.3 connected to the D-Link?

We see that the CPU loads are not high enough to cause an ethernet communication issue, and that all communication seems to be OK except:
Network -> 0.3
Network -> 0.6
0.3 -> 0.6
0.6 -> 0.3

So obviously, 0.3 and 0.6 are not receiving packets very well. You could have a wiring or NIC issue.

You then need to isolate whether it's a switch, wiring, NIC or software issue.

So when the problem occurs, try:
-completely unplugging the D-Link switch;
-rebooting the switch and seeing if the problem goes away (of course, if a program times out and stops what it's doing, this may not tell us a lot);
-swapping the Linksys ports of a good and a bad PC (e.g. 0.3 and 0.4);
-swapping the wall ethernet plugs of a good and a bad PC;
-interchanging patch cables;
-etc.

Basically, without knowing a lot about networking, you can make tests until you isolate what is wrong.

At some point, you may also benefit from having a good ethernet cable tester.

Also, it would help if you would identify the switches as Linksys or D-Link, not the "usual one" or the "second one"; it gets me confused.

Have fun,

Rene
 

Author Comment

by:mhwolog
ID: 35330101
okay so after playing..
192.168.0.3 and 0.6 still cannot be pinged properly when plugged straight into the cisco router (which I am assuming bypasses the linksys port switch)  so I think they both have a problem with settings, software or NIC.

the cable from the linksys port switch to the usual position of 192.168.0.6 is also faulty as a working PC plugged in that spot/room loses pingability also.  The wiring and linksys port switch for 192.168.0.3's normal spot/room works fine with a properly functioning pc.

What should I try next?
 
LVL 10

Expert Comment

by:ReneGe
ID: 35330421
I feel that we are going somewhere here.

By the way, by saying "spot/room", Did you mean "Ethernet Wall plug"?

1) Your LAN is plugged into the Linksys, not the Cisco router. Therefore, a test using another Cisco port may not tell us anything, as these ports may be configured to work in a way that is not currently desirable. However, if you tried another PC on that same Cisco port and it worked, then your assumption is most likely correct.

2) [the cable from the linksys port switch to the usual position of 192.168.0.6 is also faulty as a working PC plugged in that spot/room loses pingability also.]

Do you mean that you unplugged the ethernet plug of a pingable PC from a Linksys port and plugged it into the port used by the 0.6 PC?

If that's the case, I would come to the same conclusion as you.

3) [The wiring and linksys port switch for 192.168.0.3's normal spot/room works fine with a properly functioning pc.]

So here, you plugged a pingable PC into the same wall jack that the 0.3 PC is usually plugged into, and that PC was still pingable. Right? If that's the case, I will conclude the same as you.

NEXT (in the order that makes sense to you):
-[...The wiring and linksys port switch for 192.168.0.3's normal...] also test the ethernet wiring that connects the 0.6.

-Also, directly plug 0.6 into a known working Linksys port with a known working patch cable. If it then works, you will know that the 0.6 PC is OK.

-While at it, test if the Cisco port can see the LAN, by plugging in a known pingable PC and patch cable.

-If 0.6 still does not receive pings, I'd make sure to deactivate any firewall and antivirus (because some also have a firewall), then try again.

-If still not working, from Device Manager, uninstall the NIC's driver, then reinstall it by scanning for new devices (you will need to reconfigure the IP settings as they were). Then try to ping again.

-If still not working, I'd make sure the NIC's pins are straight and not flattened out, as if someone had taken a flat screwdriver and forced the pins down (some cheap ethernet plugs may actually do that).

Try all these tests on both 0.3 and 0.6 PCs.
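
Since reinstalling the NIC driver means re-entering the IP settings, it may help to save them first. A minimal sketch, where the file name and its location on the desktop are arbitrary choices:

```batch
@ECHO OFF
REM Save the current network settings to a text file before removing the NIC driver.
IPCONFIG /ALL > "%USERPROFILE%\Desktop\ip-settings-before.txt"

REM Open the file to confirm that the IP address, subnet mask, gateway and DNS servers are listed.
NOTEPAD "%USERPROFILE%\Desktop\ip-settings-before.txt"
```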

If everything fails and none of these tests lets you ping 0.3 and 0.6, go buy and install 2 new 10/100 NICs (not Gb NICs, unless you know that your ethernet wiring is made for that; they are not very expensive anyway) and try again. Don't forget to reconfigure the IP settings. If 10/100 NICs are not available anymore, and you only have CAT5e or lower ethernet cable, make sure you configure the NICs to work at 100 Mbps.

If you feel that an ethernet cable may be faulty, try replacing the patch cable with a known good one. If still not working, you may want to replace the plugs in the wall jack...

But get an ethernet cable tester. It might have saved you a lot of time here.

Good luck!

Cheers,
Rene
 

Author Comment

by:mhwolog
ID: 35338788
Okay I'm glad I started with checking the firewalls - Windows Firewall seems to have been blocking the pinging on those computers.  When that is turned off and nothing else is changed (same cables, same sockets on linksys etc) the pings all work - however, I've noticed:
1. some computers are really fast going through the batch file, others not as fast
2. occasionally some pings fail although after watching for a short time only, there doesn't seem to be a pattern as to which computers are failing.
3. Now because everything is working for the most part I'm not sure if there is any problem with cables or linksys etc..

Do I actually need the windows firewalls enabled on each PC if the router has it enabled?  What should I do from here?  Should I just run the batch again for a day and see if there is any pattern?
 
LVL 10

Expert Comment

by:ReneGe
ID: 35338880
Good idea, try again but this time, by removing "-l 1000" from the ping command line.

By the way, was everything working ok once you turned off the firewalls?

Also, WMIC (see the batch file) may take some processing power to run, so if you have slow PCs, it will make some difference.

In Windows Firewall, you may benefit from allowing ICMP Echo packets. You will find this in the Firewall's Advanced tab.

Now that you know the firewall is blocking needed packets, you should learn how to use Windows Firewall in your type of network and which ports need to be open. Maybe Oasis requires some special ports to be open as well.
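
The ICMP Echo setting can also be changed from a command prompt instead of the Advanced tab. A minimal sketch, assuming XP with Service Pack 2 or later, where the NETSH FIREWALL commands are available:

```batch
@ECHO OFF
REM Show whether Windows Firewall is on and which profile is active.
NETSH FIREWALL SHOW STATE

REM Show the current ICMP settings.
NETSH FIREWALL SHOW ICMPSETTING

REM Allow incoming echo requests (ICMP type 8) so this PC answers pings.
NETSH FIREWALL SET ICMPSETTING 8 ENABLE
```

With this in place the firewall can stay switched on while the ping tests continue. Any ports that Oasis itself needs would still have to be confirmed with its support.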
0
 

Author Comment

by:mhwolog
ID: 35363527
Hi,
I'm just running the batch files without the firewall, will post results asap.
Thanks.
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 35363548
ok
0
 

Author Comment

by:mhwolog
ID: 35364997
Hi, not much luck, I'm afraid.  There is the odd failed ping to another computer on the network every now and then, but no real pattern.  There were a fair few failed pings to external websites.  I left the Windows firewalls on the PCs off.  The computers were running fine all day until the usual bad time of around 6pm.  0.4 froze for a moment then kept working, while 0.3 wasn't responding; there were really no failed pings on 0.3 at this time.
I'm not sure where to go from here.  I thought we were onto something with the failed pings....
11th-02.log
11th-03.log
11th-04.log
11th-06.log
11th-07.log
11th-012.log
11th-045.log
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 35374090
1) Your trouble reaching the websites may be caused by the fact that your Internet connection is slow (56k). I'm saying this because all the other pings are running OK when these failures occur.

2) It seems that removing the firewall resolved the issue.  You may benefit from learning how to manage Windows security within a domain. "CBT Nuggets", "Trainsignal" and "CBT Planet" have great computer-based training.

3) Your few missing LAN pings may have several causes. But I would try replacing the router with a faster one and see how it goes.

4) Have you got an Ethernet cable tester?

5) When you say 0.4 froze, what did you mean by that?  I don't see anything in the log files that indicates this.

6) I've seen network problems caused by electrical equipment emitting bursts of EMF, such as electric motors or neon ballasts. Could that be the case?

7) How stable is your voltage?  Do you have surge protectors?  Do you have UPSs?
0
 

Author Comment

by:mhwolog
ID: 35374550
1) Your trouble reaching the websites may be caused by the fact that your Internet connection is slow (56k). I'm saying this because all the other pings are running OK when these failures occur.
Speed was back to normal, up to 20 Mbps, though of course it never gets close to that.

2) It seems that removing the firewall resolved the issue.  You may benefit from learning how to manage Windows security within a domain. "CBT Nuggets", "Trainsignal" and "CBT Planet" have great computer-based training.
I will look into this. I'm happy to leave the firewall off until the initial problem is hopefully resolved.

3) Your few missing LAN pings may have several causes. But I would try replacing the router with a faster one and see how it goes.
The specs of the router say:
- 4 port gigabit router with full duplex 10/100/1000 ethernet switch
- NAT throughput 800Mbps when IPS disabled - I'm not sure what relates to the speed.
But should this router be good enough for a small network?  And how would this relate to a slow down at a specific time of day, when it runs okay the rest of the time?
Could the problem be related to any of the settings in the router?

4) Have you got an Ethernet cable tester?
No, but I might have to look into getting one.

5) When you say 0.4 froze, what did you mean by that?  I don't see anything in the log files that indicates this.
This is the actual problem.  I'm not sure if this is the part that needs to be focused on.  By freezing I meant that when using the oasis program and changing client files (I'm assuming the PC looks up the data on the server when you change client files), the oasis program stops responding.  Also, any shared directories on the server become inaccessible.  Most commonly it happens to all the PCs at the same time, and only a reboot or logging out of the Windows user will fix the problem.  Are there any other possible problems that could cause this "loss of access" to the server at a specific time of the day without losing "pingability"?  For your information, this is how the oasis program runs: any data changed on one PC immediately gets updated on the server and all the other PCs at the same time, e.g. when an appointment gets added to the appt book on one PC, it will appear on all PCs.

6) I've seen network problems caused by electrical equipment emitting bursts of EMF, such as electric motors or neon ballasts. Could that be the case?
I will have to check again, but would this need to be something that changes only at a specific time of day?  You did remind me of something I have to test (I'm not sure if I did already): at that time of day there is a manual switch to divert all the phone lines to one phone port, to which a cordless phone is attached.  I will try tomorrow just leaving everything the same as it is throughout the day.

7) How stable is your voltage?  Do you have surge protectors?  Do you have UPSs?
The server is connected to a UPS.  The other PCs are not, but some are plugged into a power board with a built-in surge protector.

Thanks for coming back and responding, I thought I might have lost you with this difficult problem.  I'm not sure if the points should go up... :)
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 35375698
Oops,

In: "3) Your few missing LAN pings may have several causes. But I would try replacing the router with a faster one and see how it goes."

I meant "changing the switch"


I will review the rest of your answer later.

In the meantime, assuming all of the Cisco router ports are in the same VLAN, and if there are enough ports, plug the server, 0.3 and 0.4 into the Cisco router and run the test again, all day.
0
 

Author Comment

by:mhwolog
ID: 35380870
In the meantime, assuming all of the Cisco router ports are in the same VLAN, and if there are enough ports, plug the server, 0.3 and 0.4 into the Cisco router and run the test again, all day.
Okay I am trying that now.
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 35380926
If there is an extra port, why not also add 0.6?
0
 

Author Comment

by:mhwolog
ID: 35382523
I plugged 0.3, 0.4 and the server into the Cisco router's ports; there was no room for 0.6.  They still had the slow/non-responsive problem, as did the other PCs that were plugged into the Linksys switch.  Still nothing showing up in the batch file logs, except that every now and then the external websites couldn't be pinged.
-> Does this eliminate the possibility of a faster switch making a difference?

I tested the phone/answering machine also, no change, the computers still ran slowly.

I was just thinking, the problem is resolved by logging out and in again, doesn't necessarily have to be a full shut down.  Could there be anything to explore here?  What happens when windows logs out that resolves the problem?
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 35383759
Does this eliminate the possibility of a faster switch making a difference?
Yes

 What happens when windows logs out that resolves the problem?
Programs started within your session are closed. You can review the programs running under your session's user name using Task Manager.
0
 

Author Comment

by:mhwolog
ID: 35398396
I'm going to try running them Monday evening with one PC having the oasis program closed to see if the same problem occurs, and maybe one just opening oasis when in use and then closing it again.
Let me know if you  have any other ideas to test out.
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 35398894
By the way, Internet websites can miss a ping once in a while. To rule out your DNS server as the cause, add the address of your ISP's web page to the ping list, and also add its IP address.

If the named destinations all miss once in a while but the IP one does not, we may suspect that the DNS server has something to do with it.

If pings to your ISP's web page never miss, by both IP and name, we may suspect you're good.
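
The comparison above can be written down as a small decision rule. This is only a minimal Python sketch of that reasoning, not part of the batch file; the function name `diagnose` and the sample counts are made up for illustration, and the real counts would have to come from your own ping logs.

```python
def diagnose(name_failures: int, ip_failures: int) -> str:
    """Classify one destination that was pinged both by name and by IP.

    Hypothetical helper: the two counts are the numbers of failed pings
    you tallied from your own logs for the same destination.
    """
    if name_failures == 0 and ip_failures == 0:
        return "ok"           # never missed, by name or by IP
    if name_failures > 0 and ip_failures == 0:
        return "suspect DNS"  # only the named destination misses
    return "suspect link"     # the IP itself misses, so DNS alone does not explain it


# Made-up example counts, not taken from the attached logs.
for counts in [(0, 0), (3, 0), (2, 2)]:
    print(counts, "->", diagnose(*counts))
```

The third case is deliberately broad: once the IP address itself misses pings, name resolution cannot be the whole story, so the connection or the remote site is the more likely suspect.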

Have yourself a nice weekend.

Rene
0
 

Author Comment

by:mhwolog
ID: 35415066
Hello,
I tried pinging my ISP - the ping is blocked for URL and IP.  Not sure if the same test can be applied for the IP of yahoo.com or youtube.com

I left two PCs out of oasis, there was still a slow down accessing the server, both with these PCs and the others.  Although I suppose this still doesn't rule out a problem in oasis, as the PCs with oasis running could have slowed the network?
If I had two PCs running slow, and I restarted one computer and not the other, so that one is running normally and the other is slow - does this rule out any specific network problems?
0
 
LVL 2

Expert Comment

by:ghemstrom
ID: 35415541

What about users logged on to the PCs being logged out by a User Manager setting that does not allow those users after 5 pm? Far-fetched, yes, but never discussed...?

Why do you have the cisco router in series with the modem/router? What function does it have?

Have you searched the lmhosts and hosts files in windows\system32\drivers\etc for inconsistencies with your actual configuration, on the server as well as on the PCs?

But really, this happens every day at 5 pm. For how long has it occurred? Do you have a change log for the server and PCs?

According to my experience




0
 
LVL 2

Expert Comment

by:ghemstrom
ID: 35415625
Drop "According to my experience"  in my former comment... I have not stopped thinking....

Your network looks quite OK to me. There is nothing wrong in principle except for the two routers...

You have obviously checked the static settings for all the devices you have on the network. And you have thus checked that they all have unique IP addresses but the same mask and standard gateway. The mask could be 255.255.255.0, of course, and the gateway should be pointing to the Cisco router. The Cisco router would not have any gateway of course. But I still do not understand why it is there!
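
If you write the settings of each machine down, the check described above (unique IPs, same mask, same gateway) can be done mechanically. Below is a minimal Python sketch using the standard `ipaddress` module. The function name, the machine names and all the 192.168.0.x addresses are assumptions for illustration only; nothing here reads the real network configuration.

```python
import ipaddress
from collections import Counter


def check_static_settings(hosts, gateway):
    """Return a list of problems found in {name: (ip, mask, gateway)}.

    'hosts' holds whatever you noted down by hand from each machine;
    'gateway' is the LAN address of the router they should all point to.
    """
    problems = []
    gw = ipaddress.ip_address(gateway)

    # 1. Every device needs a unique IP address.
    ip_counts = Counter(ip for ip, _, _ in hosts.values())
    for ip, count in ip_counts.items():
        if count > 1:
            problems.append(f"duplicate IP {ip}")

    # 2. Every device should use the same mask.
    masks = {mask for _, mask, _ in hosts.values()}
    if len(masks) > 1:
        problems.append(f"different masks in use: {sorted(masks)}")

    for name, (ip, mask, host_gw) in hosts.items():
        # 3. Every device should point at the same gateway.
        if ipaddress.ip_address(host_gw) != gw:
            problems.append(f"{name}: gateway {host_gw} is not {gateway}")
        # 4. The gateway must lie inside the device's own subnet.
        network = ipaddress.ip_network(f"{ip}/{mask}", strict=False)
        if gw not in network:
            problems.append(f"{name}: gateway {gateway} is outside {network}")
    return problems


# Made-up example: two machines share an IP and one has a different mask.
example = {
    "server": ("192.168.0.2", "255.255.255.0", "192.168.0.1"),
    "pc-a": ("192.168.0.3", "255.255.255.0", "192.168.0.1"),
    "pc-b": ("192.168.0.3", "255.255.0.0", "192.168.0.1"),
}
for problem in check_static_settings(example, "192.168.0.1"):
    print(problem)
```

An empty result only means the noted settings are consistent with each other; it does not prove that the machines are actually configured that way.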
0
 

Author Comment

by:mhwolog
ID: 35421103
What about users logged on to the PCs being logged out by a User Manager setting that does not allow those users after 5 pm? Far-fetched, yes, but never discussed...?
Where would I check for this?  I don't have that much IT experience.

Why do you have the cisco router in series with the modem/router? What function does it have?
It was installed by an IT tech for greater security because I was logging in to the work network from home via VPN.  Does this sound like it was the right decision?  The modem/router is now in bridging mode; otherwise the VPN wasn't working.

Have you searched the lmhosts and hosts files in windows\system32\drivers\etc for inconsistencies with your actual configuration, on the server as well as on the PCs?
Where would I look for this?

But really, this happens every day at 5 pm. For how long has it occurred? Do you have a change log for the server and PCs?
About a year ago we started working past 5pm, until 7:30pm.  We started noticing it then and hadn't noticed it before.  By change log do you mean a written one (no), or one within the server?

Your network looks quite OK to me. There is nothing wrong in principle except for the two routers...
Would it be worth trying to run without the cisco router? - but see the VPN note above

You have obviously checked the static settings for all the devices you have on the network. And you have thus checked that they all have unique IP addresses but the same mask and standard gateway. The mask could be 255.255.255.0, of course, and the gateway should be pointing to the Cisco router.
I'll double check, but I'm pretty sure I put all the same settings in, because during all the ping tests I set all the PCs to have a static IP.

The Cisco router would not have any gateway of course. But I still do not understand why it is there!
0
 
LVL 2

Expert Comment

by:ghemstrom
ID: 35433409
User Manager is part of Windows Server.
Router configuration sounds sound!
windows\system32\drivers\etc is the path where you would find lmhosts and hosts. They are text files and can be edited with Notepad.
It is sound routine for an administrator to keep a written log of all changes to the server and PCs.
Trying without VPN - No!
The Cisco has got the VPN built in, that's why - keep it!
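
Since hosts is a plain text file, it can also be checked with a few lines of script instead of by eye. This is a minimal Python sketch, assuming the usual layout of one "IP name [name ...]" entry per line with "#" starting a comment; the function names and the sample contents are made up for illustration, and the sketch only looks at the localhost entry.

```python
def parse_hosts(text):
    """Parse hosts-file text into a list of (ip, [names]) entries."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        if names:  # ignore lines that have an address but no name
            entries.append((ip, names))
    return entries


def suspicious_localhost(entries):
    """Return entries mapping 'localhost' to anything but a loopback address."""
    loopback = ("127.0.0.1", "::1")
    return [
        (ip, names)
        for ip, names in entries
        if "localhost" in [n.lower() for n in names] and ip not in loopback
    ]


# Made-up sample contents; on a real PC you would read the file found in
# windows\system32\drivers\etc and pass its text to parse_hosts().
sample = """# sample hosts file (illustration only)
192.168.0.50   localhost
192.168.0.2    server   # file server
"""
for ip, names in suspicious_localhost(parse_hosts(sample)):
    print("check this entry:", ip, " ".join(names))
```

An entry flagged here is only a candidate for review; whether it actually matters depends on what the programs on that machine look up.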

No problem - if you were not the one - check User Manager?

Happy Easter!
0
 

Author Comment

by:mhwolog
ID: 35459014
Hello,
I checked these things; the only thing to note was:
The hosts file didn't have much in it, only 1 line with an incorrect IP for localhost, and the lmhosts file had nothing in it.  Is this a problem??

I spoke to the oasis support, they tried changing a setting and asked me to check how everything runs - I'm only back at work Wed week.  If that doesn't work they suggested more RAM
0
 

Author Comment

by:mhwolog
ID: 35703740
Thanks anyway guys, looking into new server.
0
 

Author Closing Comment

by:mhwolog
ID: 35703761
Splitting the points as a couple of people put a lot of time into this... Thanks.   Sorted out ping problem with Rene's help.  Seems like overall the problem might be related to inadequate specs of server and workstations.  Looking into updating the server, hopefully this will solve the issue.
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 35703894
Thanks for the points.

Good luck and let us know when and how you figured it out.

Cheers,
Rene
0
