sunhux
asked on
Users at a remote location slow to download a web server's page
We have a few IIS web servers (all in the same subnet/VLAN)
that are load-balanced using NLB.
Users at a remote office (with only 3 PCs) have been complaining of
frequent slowness when loading a web page: not even when running a data
query, but simply on logging in to the main menu page (i.e. before any
DB servers are involved).
These 3 users normally access the NLB's URL. When the slowness occurred,
we tried accessing each web server's URL directly from the users' PCs
(to bypass the NLB) & could reproduce the problem:
2 of the web servers are slow to load the main page (25-39 secs)
while the rest of the web servers load the same main
menu page within 1 sec.
I installed HttpWatch Basic Edition on one of the users' PCs to test this.
Attachment 1 is the screen showing web007's fast loading time.
Attachment 2 is the screen showing web010's slow loading time.
Occasionally web010 would load within 2 secs, but most of the
time it takes 25-39 secs.
To rule out DNS, I've hardcoded web010's & web007's IP addresses in
that user PC's c:\windows\system32\drivers\etc\hosts but web010 still
takes as long to send back mainmenu.aspx.
During this testing, I issued 'pathping -q 10 IP_addr_of the web servers'
& they all came back with good timings of less than 3 secs for every
hop.
During these slow loading times, the CPU/RAM utilization of all the web
servers is less than 20%. Also, users from other remote locations
can load the very same web010 page very fast. I also tried using the
user's PC to load a .js file (from web010) that's about 15 times larger
than mainmenu.aspx & it loads in less than 1 sec.
Someone in EE suggested this could be an MTU issue, but I've checked
the "show int xxx" outputs of the Cisco ports of the 3 PCs as well as
the Cisco ports that the web servers are connected to: the counter/
column for "Giants" is 0 (so I suppose this means no MTU issue at
either the server or the PC end).
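One hedged way to test the MTU theory from the PC itself, rather than from switch counters: Windows ping accepts -f (set Don't Fragment) and -l (payload size), so the largest payload that still gets a reply hints at the usable path MTU. The sketch below only does the arithmetic and the search. `probe` is a hypothetical callable you would supply yourself (for example, one that runs the generated command and reports whether a reply came back), and the 1500-byte starting MTU is an assumption about an Ethernet path.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def windows_ping_df_command(host, payload):
    """Build a Windows ping command: -f sets Don't Fragment,
    -l sets the payload size, -n 1 sends a single echo request."""
    return ["ping", "-f", "-l", str(payload), "-n", "1", host]


def largest_passing_payload(probe, low=0, high=1472):
    """Binary search for the largest payload for which probe(payload) is True.

    Assumes probe is monotonic: if a size passes, every smaller size passes.
    Returns None if even the smallest size fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


if __name__ == "__main__":
    payload = icmp_payload_for_mtu(1500)
    # "web010" is the server name from this thread; run the command by hand.
    print(" ".join(windows_ping_df_command("web010", payload)))
```

If the full-size ping fails from the remote office but smaller sizes pass, that would point towards an MTU problem somewhere on the path; if the full size passes, MTU looks less likely.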
When web010's page was consistently slow to load, if I entered web010's
IP addr in the IE browser to load the same url, it loaded within 1 sec
consistently. Strangely, when I hardcode web010's IP in the PC's
hosts file (& I verified that the hosts entry takes effect by intentionally
entering an unused IP & doing a ping), it loads slowly (25-39 secs), so I
really can't conclude whether this is a DNS issue.
I've seen one case where pinging web010 on this PC resolved to one
IP addr & the next minute it resolved to a different IP addr. However,
ipconfig /all on the PC showed the DNS server in use was still the same.
How else can I troubleshoot this further?
Would the Event Viewer logs on the DNS server help? If so, I'll ask the
DNS admin to provide them.
web07.png
web10.png
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
The attachments I uploaded are from HttpWatch, which I feel is more
user-friendly, but if you prefer Fiddler, I'll get it installed again.
Do tell me what you would like me to test so that we get a more
fruitful diagnosis this round.
ASKER CERTIFIED SOLUTION
ASKER
Ok, I finally got permission to access the users' PCs & install
Fiddler on them. Note the 2 results from Fiddler
below: the 1st one took about 21 secs while the 2nd one took 1 sec:
When browsing/loading the webserver by using
its FQDN in IE, got the following:
ACTUAL PERFORMANCE
--------------
ClientConnected: 23:15:19.933
ClientBeginRequest: 23:15:19.933
GotRequestHeaders: 23:15:19.933
ClientDoneRequest: 23:15:19.933
Determine Gateway: 0ms
DNS Lookup: 0ms
TCP/IP Connect: 2ms
HTTPS Handshake: 0ms
ServerConnected: 23:15:19.933
FiddlerBeginRequest: 23:15:19.933
ServerGotRequest: 23:15:19.933
ServerBeginResponse: 23:15:41.554
GotResponseHeaders: 23:15:41.554
ServerDoneResponse: 23:15:41.569
ClientBeginResponse: 23:15:41.554
ClientDoneResponse: 23:15:41.569
Overall Elapsed: 00:00:21.6364382
RESPONSE BYTES (by Content-Type)
--------------
text/html: 53,573
~headers~: 530
==========================
When browsing/loading the webserver by using
its IP address in IE, got the following:
Request Count: 1
Bytes Sent: 321 (headers:321; body:0)
Bytes Received: 710 (headers:627; body:83)
ACTUAL PERFORMANCE
--------------
ClientConnected: 23:20:42.224
ClientBeginRequest: 23:21:27.604
GotRequestHeaders: 23:21:27.604
ClientDoneRequest: 23:21:27.604
Determine Gateway: 0ms
DNS Lookup: 0ms
TCP/IP Connect: 0ms
HTTPS Handshake: 0ms
ServerConnected: 23:20:42.240
FiddlerBeginRequest: 23:21:27.604
ServerGotRequest: 23:21:27.604
ServerBeginResponse: 23:21:27.604
GotResponseHeaders: 23:21:27.604
ServerDoneResponse: 23:21:27.604
ClientBeginResponse: 23:21:27.604
ClientDoneResponse: 23:21:27.604
RESPONSE BYTES (by Content-Type)
--------------
~headers~: 627
text/html: 83
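To see where the 21 secs in the first (FQDN) capture go, the Fiddler timestamps can be subtracted phase by phase. This is a minimal sketch that assumes the timer names mean what they appear to mean and that a capture does not cross midnight; `parse_fiddler_times` and `seconds_between` are names made up for illustration.

```python
from datetime import datetime

# Timer lines copied from the first (FQDN) capture above.
CAPTURE_FQDN = """\
ClientDoneRequest: 23:15:19.933
TCP/IP Connect: 2ms
ServerGotRequest: 23:15:19.933
ServerBeginResponse: 23:15:41.554
ServerDoneResponse: 23:15:41.569
Overall Elapsed: 00:00:21.6364382
"""


def parse_fiddler_times(text):
    """Return {timer name: datetime} for lines shaped like 'Name: HH:MM:SS.mmm'.

    Lines whose value is not a clock time (e.g. '2ms') are skipped.
    """
    times = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        try:
            times[key.strip()] = datetime.strptime(value.strip(), "%H:%M:%S.%f")
        except ValueError:
            continue
    return times


def seconds_between(times, start, end):
    """Elapsed seconds between two named timers (same day assumed)."""
    return (times[end] - times[start]).total_seconds()


times = parse_fiddler_times(CAPTURE_FQDN)
server_wait = seconds_between(times, "ServerGotRequest", "ServerBeginResponse")
download = seconds_between(times, "ServerBeginResponse", "ServerDoneResponse")
print(f"waiting for first response byte: {server_wait:.3f} s")
print(f"downloading the response:        {download:.3f} s")
```

On these figures, nearly all of the delay (21.621 s) falls between the request being sent and the first response byte arriving, while the 53,573-byte body itself arrives in 0.015 s. Note also that the by-IP capture returned only 83 bytes of text/html, so it may not have fetched the same page and the two captures may not be directly comparable.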
ASKER
As for the commands you suggested, a couple did
not work (probably because I don't have admin rights):
C:\fiddler2>nbtstat -c
Local Area Connection:
Node IpAddress: [10.231.6.54] Scope Id: []
NetBIOS Remote Cache Name Table
Name Type Host Address Life [sec]
-------------------------- ---------- ---------- ---------- ----
KAMKCCN12060199<20> UNIQUE 10.231.6.55 507
C:\fiddler2>arp -a
Interface: 10.231.6.54 --- 0x12
Internet Address Physical Address Type
10.231.6.1 00-07-b4-00-05-02 dynamic
10.231.6.55 ac-16-2d-10-c0-03 dynamic
10.231.6.255 ff-ff-ff-ff-ff-ff static
224.0.0.22 01-00-5e-00-00-16 static
224.0.0.252 01-00-5e-00-00-fc static
239.255.255.250 01-00-5e-7f-ff-fa static
255.255.255.255 ff-ff-ff-ff-ff-ff static
C:\fiddler2>
C:\fiddler2>ipconfig /flushdns
The requested operation requires elevation.
C:\fiddler2>nbtstat -R
Failed to Purge the NBT Remote Cache Table.
C:\fiddler2>arp -d *
The ARP entry deletion failed: The requested operation requires elevation.
ASKER
Some observations that I've just noted:
For the slow web server, ping & nslookup give
different IP addresses:
C:\Windows\System32\drivers\etc> ping web010
Pinging web010 [10.198.11.142] with 32 bytes of data:
Reply from 10.198.11.142: bytes=32 time=2ms TTL=120 <== this is HB IP of web010
Reply from 10.198.11.142: bytes=32 time=2ms TTL=120
For a web server that loads fast (i.e. <1 sec), both ping
& nslookup give the same IP address:
C:\Windows\System32\drivers\etc> ping web007
Pinging web007.shsh.sss.com [10.198.11.87] with 32 bytes of data:
Reply from 10.198.11.87: bytes=32 time=2ms TTL=120
Reply from 10.198.11.87: bytes=32 time=2ms TTL=120
> web010
Server: ad2.kkk.com
Address: 10.188.1.88
Non-authoritative answer:
Name: oascghweb010.shes.shs.com.sg
Address: 10.198.10.101 <==direct login to web010 gives this IP; ping gave another IP
> oascghweb007
Server: kkhdrad2.kkh.shs.com.sg
Address: 10.188.1.88
Non-authoritative answer:
Name: oascghweb007.shes.shs.com.sg
Address: 10.198.11.87 <== this is the same IP as what ping gave
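The ping-versus-nslookup check above can be repeated for every server in one go. A rough sketch, on the assumption that ping goes through the operating system's resolver (hosts file, caches, then DNS) while nslookup asks the DNS server directly: `socket.gethostbyname_ex` stands in for what ping sees, and the nslookup text is parsed rather than re-queried. The function names are invented for illustration.

```python
import re
import socket

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# nslookup answer for web010 as posted above (annotation removed).
SAMPLE = """\
Server: ad2.kkk.com
Address: 10.188.1.88

Non-authoritative answer:
Name: oascghweb010.shes.shs.com.sg
Address: 10.198.10.101
"""


def parse_nslookup_answer(text):
    """Collect IPv4 addresses that appear after the first 'Name:' line,
    which skips the address of the DNS server itself."""
    answers = []
    in_answer = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("name:"):
            in_answer = True
            continue
        if in_answer:
            answers.extend(IPV4.findall(stripped))
    return answers


def os_resolver_addresses(name):
    """Addresses the OS resolver returns for a name (empty list on failure)."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []


def compare(os_ips, dns_ips):
    """Report whether the two views of a name agree."""
    os_set, dns_set = set(os_ips), set(dns_ips)
    return {
        "match": os_set == dns_set,
        "only_os": sorted(os_set - dns_set),
        "only_dns": sorted(dns_set - os_set),
    }


if __name__ == "__main__":
    # 10.198.11.142 is what ping returned for web010 in the post above.
    print(compare(["10.198.11.142"], parse_nslookup_answer(SAMPLE)))
```

On the PC itself, `compare(os_resolver_addresses("web010"), parse_nslookup_answer(output))` would flag any server whose two answers differ.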
ASKER
Provided all caches are cleared (for consistent test results),
I've also noted that if I browse web010 by the IP addr
10.198.10.101, it takes 21-35 secs to load, but if I browse
using web010's other IP, i.e. 10.198.11.142, it always
loads in less than 1 sec.
SOLUTION
ASKER
> edit host file and map server's ip
Have tried the above a few times, did not help.
I suspect the return packets are being held up (possibly by DNS
lookups) & take a long way to come back: I noted that
when the user's PC tries to load the web page, within
1 sec, ' netstat -an | find "user_PC_IP" ' shows the
connection as established, but the return packets
take more than 21 secs (sometimes over 30 secs)
to get back to the user's PC/browser.
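The netstat observation (connection established at once, response arriving much later) can be measured without a browser or proxy in the way. Below is a minimal sketch using plain sockets over HTTP on port 80; it assumes the page is reachable without authentication, and `time_http_get` and `phase_durations` are illustrative names. Passing the IP to connect to and the name as `host_header` separately also lets the same IP be tested with and without the server's name in the request.

```python
import socket
import time


def phase_durations(t_start, t_connected, t_first_byte, t_done):
    """Split one request into connect, wait-for-first-byte and download times."""
    return {
        "connect_s": t_connected - t_start,
        "first_byte_s": t_first_byte - t_connected,
        "download_s": t_done - t_first_byte,
    }


def time_http_get(ip, port=80, path="/", host_header=None, timeout=60.0):
    """Send one HTTP GET to ip:port and time each phase.

    host_header is the name placed in the Host: header (defaults to the IP).
    """
    host_header = host_header or ip
    t_start = time.perf_counter()
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        t_connected = time.perf_counter()
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host_header}\r\n"
            "Connection: close\r\n\r\n"
        )
        sock.sendall(request.encode("ascii"))
        first = sock.recv(4096)          # blocks until the first bytes arrive
        t_first_byte = time.perf_counter()
        total = len(first)
        chunk = first
        while chunk:                     # read until the server closes
            chunk = sock.recv(65536)
            total += len(chunk)
        t_done = time.perf_counter()
    result = phase_durations(t_start, t_connected, t_first_byte, t_done)
    result["bytes_received"] = total
    result["status_line"] = first.split(b"\r\n", 1)[0].decode("latin-1")
    return result


if __name__ == "__main__":
    # Addresses and page name are the ones quoted in this thread.
    for ip in ("10.198.10.101", "10.198.11.142"):
        print(ip, time_http_get(ip, path="/mainmenu.aspx", host_header="web010"))
```

A large `first_byte_s` with a small `connect_s` would match what netstat showed: the TCP handshake completes quickly and the wait happens afterwards.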
I checked the various DNS servers & they map the web
servers that are slow to load to 10.198.10.x addresses,
while the web servers that are fast to load are
mapped to 10.198.11.x addresses: one set of addresses
is the actual IP addresses while the other set is the
heartbeat IP addresses.
I don't have admin rights (only normal user rights that don't
allow me to change IP/DNS etc) to amend the configs on the
users' PCs. Will attempt what you've suggested on Mon
when I get in touch with the PCs' admin.
ASKER
Give me 1 more week: raising CR to update this suspected
incorrect entry, thereafter, will update in this thread again.
ASKER
Now, even browsing by the IP address 10.231.11.x,
which used to load fast, has become slow.
The users' frustration has escalated.
I've just done another Fiddler capture, which did not reveal much:
Request Count: 1
Bytes Sent: 2,869 (headers:2,869; body:0)
Bytes Received: 8,607 (headers:582; body:8,025)
ACTUAL PERFORMANCE
--------------
ClientConnected: 22:37:21.546
ClientBeginRequest: 22:38:30.905
GotRequestHeaders: 22:38:30.905
ClientDoneRequest: 22:38:30.905
Determine Gateway: 0ms
DNS Lookup: 1ms
TCP/IP Connect: 11ms
HTTPS Handshake: 0ms
ServerConnected: 22:38:30.918
FiddlerBeginRequest: 22:38:30.918
ServerGotRequest: 22:38:30.918
ServerBeginResponse: 22:38:30.961
GotResponseHeaders: 22:38:30.961
ServerDoneResponse: 22:38:49.446
ClientBeginResponse: 22:38:30.961
ClientDoneResponse: 22:38:49.446
Overall Elapsed: 00:00:18.5410000 <== 18 secs
A good response is usually <1sec
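The timers in this capture can be broken down a little further. A small worked calculation, using only the figures posted above and assuming the Fiddler timer names mean what they appear to mean:

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"


def elapsed(start, end):
    """Seconds between two HH:MM:SS.mmm clock times on the same day."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# ServerGotRequest -> ServerBeginResponse: wait for the first response byte.
server_wait_s = elapsed("22:38:30.918", "22:38:30.961")
# ServerBeginResponse -> ServerDoneResponse: time to receive the response.
transfer_s = elapsed("22:38:30.961", "22:38:49.446")

bytes_received = 8607  # "Bytes Received" from the capture
throughput = bytes_received / transfer_s

print(f"server wait: {server_wait_s:.3f} s")
print(f"transfer:    {transfer_s:.3f} s")
print(f"throughput:  {throughput:.0f} bytes/s")
```

Here the first response byte arrives after 0.043 s, and almost all of the 18 secs (18.485 s) is spent receiving 8,607 bytes, roughly 466 bytes/s. That differs from the earlier FQDN capture, where the wait came before the first response byte, so the two slow cases may not share the same cause.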
I've got our DNS admin to deregister the heartbeat address
from DNS but no joy.
Also tried the suggestions below:
> Do this, on one of the client system
> leave IP/mask/gateway as it is
> Change the dns to use public dns e.g. 8.8.8.8 only => yep, done & tested it took effect
> remove wins ip (if configured) => under the NIC properties, WINS not configured
> ==> but ipconfig/all still shows WINS servers are there
> edit host file and map server's ip
Done the above for web007 & it loads very slowly (> 30 secs)
ASKER
Ok, just found that to get rid of the 2 WINS servers, I just have to key
in a public DNS IP in their place & the 2 internal WINS servers will
be gone, replaced with the public DNS IP.
ASKER
There are 2 processes on the users' PCs:
BESClientUI.exe &
BESPower.exe
What are these? I've heard of Norton Antivirus
slowing things down by a factor of hundreds.
SOLUTION
ASKER
I've set the 2 BigFix processes to Lowest priority, no joy.
Is there anything else I can do to isolate further?
SOLUTION
ASKER
With the entries hard-coded in the hosts file, the users' PCs now ping
the correct & consistent IP, but there are still some servers
out there that resolve to a different IP address: would this matter?
ASKER
on that PC but they got nothing out of it. They've since deleted the
fiddler outputs.
Is there anything specific that you would like me to capture so that this
time round the Fiddler output will give more clues?