Question

500 Point question!!!! Ethereal checksum errors and slow Internet access

Asked by: jcistaro

OK, here is a good one. I've been experiencing slow Internet access from my site for a while now. One site in particular will not load: www.good.com. See the text below of a sniffer trace (Ethereal). Any ideas? I've attached some troubleshooting that has happened already. We are looking for fresh ideas now.

 
                                 
                                OK, I've been looking into xxxx this evening, and I found something that isn't right.
                                 
                                First, a summary of what you have in NJ. As you may be aware, NJEDS has three (3) T1s for its site. The three are aggregated into a PPP Multilink bundle, giving a total of ~4.5Mb/s. Currently, each connection is limited to 1.5Mb/s given how things are configured; more on that in a bit.

                                 
                                Here are the three interfaces:
                                 
                                NJEDS-ER-Site#
                                NJEDS-ER-Site#
                                NJEDS-ER-Site#show int ser0/0/0:0
                                Serial0/0/0:0 is up, line protocol is up
                                  Hardware is GT96K Serial
                                  Description: T1 to Paetec (CID# xx)
                                  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
                                     reliability 255/255, txload 88/255, rxload 5/255
                                  Encapsulation PPP, LCP Open, multilink Open
                                  Link is a member of Multilink bundle Multilink1, loopback not set
                                  Keepalive set (10 sec)
                                  Last input 00:00:00, output 00:00:00, output hang never
                                  Last clearing of "show interface" counters 6d22h
                                  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
                                  Queueing strategy: fifo
                                  Output queue: 0/40 (size/max)
                                  5 minute input rate 34000 bits/sec, 32 packets/sec
                                  5 minute output rate 535000 bits/sec, 98 packets/sec
                                     22848457 packets input, 2888792823 bytes, 0 no buffer
                                     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
                                     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
                                     62672123 packets output, 2924484279 bytes, 0 underruns
                                     0 output errors, 0 collisions, 0 interface resets
                                     0 output buffer failures, 0 output buffers swapped out
                                     0 carrier transitions
                                  Timeslot(s) Used:1-24, SCC: 0, Transmitter delay is 0 flags
                                NJEDS-ER-Site#
                                NJEDS-ER-Site#
                                NJEDS-ER-Site#
                                NJEDS-ER-Site#show int ser0/0/1:0
                                Serial0/0/1:0 is up, line protocol is up
                                  Hardware is GT96K Serial
                                  Description: T1 to Paetec (CID# xxxxx)
                                  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
                                     reliability 255/255, txload 89/255, rxload 5/255
                                  Encapsulation PPP, LCP Open, multilink Open
                                  Link is a member of Multilink bundle Multilink1, loopback not set
                                  Keepalive set (10 sec)
                                  Last input 00:00:00, output 00:00:00, output hang never
                                  Last clearing of "show interface" counters 6d22h
                                  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
                                  Queueing strategy: fifo
                                  Output queue: 0/40 (size/max)
                                  5 minute input rate 32000 bits/sec, 30 packets/sec
                                  5 minute output rate 537000 bits/sec, 98 packets/sec
                                     22880386 packets input, 2887525686 bytes, 0 no buffer
                                     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
                                     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
                                     62690729 packets output, 2936854900 bytes, 0 underruns
                                     0 output errors, 0 collisions, 0 interface resets
                                     0 output buffer failures, 0 output buffers swapped out
                                     0 carrier transitions
                                  Timeslot(s) Used:1-24, SCC: 1, Transmitter delay is 0 flags
                                NJEDS-ER-Site#
                                NJEDS-ER-Site#
                                NJEDS-ER-Site#
                                NJEDS-ER-Site#show int ser0/1/0:0
                                Serial0/1/0:0 is up, line protocol is up
                                  Hardware is GT96K Serial
                                  Description: T1 to Paetec (CID# xxxx..NJ)
                                  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
                                     reliability 255/255, txload 88/255, rxload 5/255
                                  Encapsulation PPP, LCP Open, multilink Open
                                  Link is a member of Multilink bundle Multilink1, loopback not set
                                  Keepalive set (10 sec)
                                  Last input 00:00:00, output 00:00:00, output hang never
                                  Last clearing of "show interface" counters 6d22h
                                  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
                                  Queueing strategy: fifo
                                  Output queue: 0/40 (size/max)
                                  5 minute input rate 32000 bits/sec, 30 packets/sec
                                  5 minute output rate 535000 bits/sec, 97 packets/sec
                                     22813342 packets input, 2862958091 bytes, 0 no buffer
                                     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
                                     107811 input errors, 82285 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
                                     62693039 packets output, 2933002928 bytes, 0 underruns
                                     0 output errors, 0 collisions, 0 interface resets
                                     0 output buffer failures, 0 output buffers swapped out
                                     0 carrier transitions
                                  Timeslot(s) Used:1-24, SCC: 0, Transmitter delay is 0 flags
                                NJEDS-ER-Site#
                                NJEDS-ER-Site#
                                 
                                 
                                Those three physical interfaces above are bundled into a single logical interface called a Multilink interface.  Here is the Multilink info on it:

                                 
                                Multilink1, bundle name is Px
                                  Endpoint discriminator is Pxx1
                                  Bundle up for 11w6d, 90/255 load
                                  Receive buffer limit 36000 bytes, frag timeout 1000 ms
                                    0/0 fragments/bytes in reassembly list
                                    41895 lost fragments, 6878661 reordered
                                    3186/3540062 discarded fragments/bytes, 3186 lost received
                                    0xAC53EB received sequence, 0x9D314F sent sequence
                                  Member links: 3 active, 0 inactive (max not set, min not set)
                                    Se0/1/0:0, since 11w6d
                                    Se0/0/0:0, since 11w6d
                                    Se0/0/1:0, since 9w1d
                                 
                                 
                                When you have a multilink path, reordering is normal and expected, as packets can be sent by the other side of the link "out of order", at which point we have to hold the packet(s) in memory and reorder them. To me, though, the number seems a bit high given how long the link has been up, but I'm just guessing.

                                 
                                With all that said, here's what I think is happening, and what I need you to do:
                                 
                                1) Notice the errors on Serial0/1/0:0. A while back we were getting errors on one of the serial interfaces, and we called Paetec about it, but they said they were clean on their side. Testing on our side AT THAT POINT IN TIME showed no errors, so we couldn't continue forward on it. I kept an eye on it for a little bit, and it seemed fine. But it looks like the errors have returned, and I need you to work with Paetec and TelCo to get this resolved. The above output of the interfaces should show Paetec the errors, which is why I included them. As far as I'm aware, the Circuit IDs (CID#) should be accurate, assuming the cables haven't been swapped. As I see it, this is Paetec's to solve, so put the work on them (if possible), but they will need your assistance to let TelCo in, etc.

                                 
                                2) Because of the errors on Serial0/1/0:0, some packets are being dropped. These dropped packets are causing retries, thus introducing additional delays. Do they account for the large delays that Seth experienced? I doubt it, but it's not helping. The good thing, and this is by design, is that the three T1s are in a PPP Multilink. As such, if we think the errors are really causing the issues, we can disconnect that T1, and the PPP Multilink will continue to function.

                                 
                                3) Once we get this T1 resolved, we can see about changing the way your PPP Multilink sends packets out. Right now, each session/stream of traffic is bound to a SINGLE T1. We can change it so that traffic is load balanced not by session but on a per-packet basis. This essentially will fully maximize your total bandwidth. But I don't want to do this until the serial interface is fixed and is not taking a greater percentage of errors than your other links. Enabling this before that would be bad, as every 3rd packet could have a problem. Also, for this to be fully useful, we need the ISP to do the same thing on their end. So, once we get this corrected, we'll need to have Paetec do the same. I'd highly suggest that you communicate our request to them sooner rather than later, and get it from them in an e-mail that they can/will do it upon request. I say this because ISPs are VERY reluctant to do this, as it increases the CPU load on their routers, since each packet needs to be inspected by the CPU. Hence, they will often fight doing it...and for good reason (from their point of view). FYI, the command that does this is called "ip load-sharing per-packet", but they should know that. I'm mentioning it in case they are not familiar with it and want to look it up in the Cisco documentation.
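The difference between per-session and per-packet balancing described in point 3 can be illustrated with a small Python sketch. This is only a toy model: the CRC-based flow hash and the round-robin counter are assumptions chosen for illustration, not the algorithm a Cisco router actually uses. The link names come from the router output above; the flow tuple is taken from the packet trace.

```python
import zlib
from collections import Counter

# Member links of the bundle, as named in the router output.
LINKS = ["Se0/0/0:0", "Se0/0/1:0", "Se0/1/0:0"]


def per_flow_link(flow: tuple) -> str:
    """Pick a link from a hash of the flow, so one session always uses one T1.

    The CRC32 hash is a hypothetical stand-in for whatever the router does.
    """
    key = "|".join(map(str, flow)).encode()
    return LINKS[zlib.crc32(key) % len(LINKS)]


def per_packet_link(packet_index: int) -> str:
    """Pick a link per packet in round-robin order, ignoring the session."""
    return LINKS[packet_index % len(LINKS)]


# One TCP session (source IP, source port, destination IP, destination port).
flow = ("10.14.66.0", 3461, "65.200.201.183", 80)
packet_count = 300

per_flow_usage = Counter(per_flow_link(flow) for _ in range(packet_count))
per_packet_usage = Counter(per_packet_link(i) for i in range(packet_count))

print("per-flow:  ", dict(per_flow_usage))
print("per-packet:", dict(per_packet_usage))
```

In this model a single session stays on one link under per-flow balancing, which matches the 1.5Mb/s per-connection limit mentioned earlier, while per-packet balancing spreads the same session evenly over all three links, including the one that is currently taking errors.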

                                 
                               x
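To put the error counters on Serial0/1/0:0 in proportion, the figures can be pulled out of the "show interface" text with a few regular expressions. The following is a rough Python sketch; the function name and the two-line sample are illustrative, and the percentage is only approximate because it simply divides the error counter by the "packets input" counter.

```python
import re


def input_error_stats(show_int_text: str) -> dict:
    """Extract input packet and error counters from 'show interface' text."""
    packets = int(re.search(r"(\d+) packets input", show_int_text).group(1))
    errors = int(re.search(r"(\d+) input errors", show_int_text).group(1))
    crc = int(re.search(r"(\d+) CRC", show_int_text).group(1))
    return {
        "packets": packets,
        "errors": errors,
        "crc": crc,
        # Approximate ratio of errored frames to received packets, in percent.
        "error_pct": 100.0 * errors / packets,
    }


# The two relevant lines from the Serial0/1/0:0 output quoted in the email.
SAMPLE = """\
     22813342 packets input, 2862958091 bytes, 0 no buffer
     107811 input errors, 82285 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
"""

stats = input_error_stats(SAMPLE)
print(f"input errors: {stats['errors']} ({stats['error_pct']:.2f}% of packets input)")
print(f"CRC errors:   {stats['crc']}")
```

Running the same function over the output of the other two interfaces gives zero errors, which is the contrast the email points Paetec at.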




220       "9.149825"       "10.14.66.0"       "65.200.201.183"       "TCP"       "3461 > http [ACK] Seq=551 Ack=2720 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"                              
221       "9.149881"       "65.200.201.189"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
222       "9.149924"       "65.200.201.189"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
223       "9.149934"       "10.14.66.0"       "65.200.201.189"       "TCP"       "3463 > http [ACK] Seq=263 Ack=1461 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"                              
224       "9.150675"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
225       "9.150692"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
226       "9.150713"       "10.14.66.0"       "65.200.201.183"       "TCP"       "3461 > http [ACK] Seq=551 Ack=4180 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"                              
227       "9.150791"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
228       "9.151453"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
229       "9.151478"       "10.14.66.0"       "65.200.201.183"       "TCP"       "3461 > http [ACK] Seq=551 Ack=5640 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"                              
230       "9.151664"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
231       "9.152200"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
232       "9.152228"       "10.14.66.0"       "65.200.201.183"       "TCP"       "3461 > http [ACK] Seq=551 Ack=7100 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"                              
233       "9.169350"       "65.200.201.189"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
234       "9.169372"       "65.200.201.189"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
235       "9.169395"       "10.14.66.0"       "65.200.201.189"       "TCP"       "3463 > http [ACK] Seq=263 Ack=2921 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"                              
236       "9.169412"       "65.200.201.189"       "10.14.66.0"       "HTTP"       "HTTP/1.1 200 OK (application/x-javascript)"                              
237       "9.209488"       "65.200.201.189"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
238       "9.209523"       "65.200.201.189"       "10.14.66.0"       "HTTP"       "HTTP/1.1 200 OK (application/x-javascript)"                              
239       "9.209542"       "10.14.66.0"       "65.200.201.189"       "TCP"       "3464 > http [ACK] Seq=269 Ack=1266 Win=64270 [TCP CHECKSUM INCORRECT] Len=0"                              
240       "9.229354"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
241       "9.229385"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
242       "9.229408"       "10.14.66.0"       "65.200.201.183"       "TCP"       "3461 > http [ACK] Seq=551 Ack=8560 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"                              
243       "9.229467"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
244       "9.229480"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
245       "9.229490"       "10.14.66.0"       "65.200.201.183"       "TCP"       "3461 > http [ACK] Seq=551 Ack=10020 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"                              
246       "9.265050"       "10.14.66.0"       "10.14.64.254"       "DNS"       "Standard query A us.bc.yahoo.com"                              
247       "9.270274"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
248       "9.270299"       "65.200.201.183"       "10.14.66.0"       "TCP"       "[TCP segment of a reassembled PDU]"                              
249       "9.270318"       "10.14.66.0"       "65.200.201.183"       "TCP"       "3461 > http [ACK] Seq=551 Ack=11480 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"                              

This question has been solved and verified by the asker.

Asked On: 2006-04-19 at 10:46:02, ID 21819561
Tags: tcp, checksum, reassembled
Topics: Miscellaneous Networking, Network Switches & Hubs, Network Auditing Software
Participating Experts: 2
Points: 500
Comments: 7


Answers

 

by: rsivanandan, posted on 2006-04-19 at 12:12:36, ID: 16491242

How is the line clock set up? I mean, do you and your ISP agree on the clock settings? Usually the clock is taken from the line, I believe.

If so, did you try 'clear counters' to flush all the errors and input problems from the router interfaces? Or can you do that?

Then see if you still get any problems. It sounds crazy, but it happens with Cisco routers; I've seen it a couple of times. This won't solve the underlying problem, though.

Cheers,
Rajesh

 

by: jcistaro, posted on 2006-04-19 at 12:15:57, ID: 16491291

The counters have been cleared and reset. This didn't seem to help the situation.

 

by: rsivanandan, posted on 2006-04-19 at 12:18:10, ID: 16491320

Did your interfaces come up okay? In the router diag and also by the link lights?

Cheers,
Rajesh

 

by: Abs_jaipur, posted on 2006-04-19 at 20:29:55, ID: 16494832

Hi,

One forum result:

If the packets that have incorrect TCP checksums are all being sent by the machine on which Ethereal is running, this is probably because the network interface on which you're capturing does TCP checksum offloading. That means that the TCP checksum is added to the packet by the network interface, not by the OS's TCP/IP stack; when capturing on an interface, packets being sent by the host on which you're capturing are directly handed to the capture interface by the OS, which means that they are handed to the capture interface without a TCP checksum being added to them.
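Whether the flagged packets all come from the capturing machine can be checked directly against an exported trace like the one in the question. The following Python sketch assumes the export format shown there (a packet number followed by five quoted fields: time, source, destination, protocol, info); the function name is illustrative, and the sample is the first five rows of that trace with the column spacing shortened.

```python
import re
from collections import Counter

# First five rows of the trace posted in the question.
TRACE = '''\
220  "9.149825"  "10.14.66.0"  "65.200.201.183"  "TCP"  "3461 > http [ACK] Seq=551 Ack=2720 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"
221  "9.149881"  "65.200.201.189"  "10.14.66.0"  "TCP"  "[TCP segment of a reassembled PDU]"
222  "9.149924"  "65.200.201.189"  "10.14.66.0"  "TCP"  "[TCP segment of a reassembled PDU]"
223  "9.149934"  "10.14.66.0"  "65.200.201.189"  "TCP"  "3463 > http [ACK] Seq=263 Ack=1461 Win=65535 [TCP CHECKSUM INCORRECT] Len=0"
224  "9.150675"  "65.200.201.183"  "10.14.66.0"  "TCP"  "[TCP segment of a reassembled PDU]"
'''


def bad_checksums_by_source(trace: str) -> Counter:
    """Count packets flagged with an incorrect checksum, grouped by source IP."""
    counts = Counter()
    for line in trace.splitlines():
        fields = re.findall(r'"([^"]*)"', line)
        if len(fields) != 5:
            continue  # skip anything that is not a full trace row
        _time, src, _dst, _proto, info = fields
        if "CHECKSUM INCORRECT" in info:
            counts[src] += 1
    return counts


bad = bad_checksums_by_source(TRACE)
print(dict(bad))
```

If the only key in the result is the address of the machine running the capture, as it is for these rows, the pattern is consistent with checksum offloading rather than corruption on the wire.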

The only way to prevent this from happening would be to disable TCP checksum offloading, but

   1. that might not even be possible on some OSes;
   2. that could reduce networking performance significantly.

However, you can disable the check that Ethereal does of the TCP checksum, so that it won't report any packets as having TCP checksum errors, and so that it won't refuse to do TCP reassembly due to a packet having an incorrect TCP checksum. That can be set as an Ethereal preference by selecting "Preferences" from the "Edit" menu, opening up the "Protocols" list in the left-hand pane of the "Preferences" dialog box, selecting "TCP" from that list, turning off the "Check the validity of the TCP checksum when possible" option, clicking "Save" if you want to save that setting in your preference file, and clicking "OK".

It can also be set on the Ethereal or Tethereal command line with a -o tcp.check_checksum:false command-line flag, or manually set in your preferences file by adding a tcp.check_checksum:false line.


Checksum notes:

Ethereal checksum validation

Ethereal will validate the checksums of several protocols, e.g.: IP, TCP, ...

It will do the same calculation as a "normal receiver" would do, and shows the checksum fields in the packet details with a comment, e.g.: [correct], [invalid, must be 0x12345678] or the like.
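The calculation a receiver performs for IP and TCP is the 16-bit ones'-complement Internet checksum. Here is a minimal Python sketch of it, applied to a sample 20-byte IPv4 header that is not taken from the trace in this thread. A real TCP checksum also covers a pseudo-header and the payload, which is left out here to keep the example short.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones'-complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample IPv4 header with its checksum field (bytes 10-11) set to 0xB861.
header = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")

# What a sender computes: checksum over the header with the field zeroed.
zeroed = header[:10] + b"\x00\x00" + header[12:]
print(hex(internet_checksum(zeroed)))

# What a receiver checks: the checksum over the complete header must be 0.
print(internet_checksum(header) == 0)

# With offloading, the capture sees the field before the NIC fills it in,
# so the same check fails even though the packet on the wire is fine.
print(internet_checksum(zeroed) == 0)
```

The last check is the situation a capture on the sending host is in when offloading is active: the field has not been filled in yet, so validation reports the packet as invalid.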

Checksum validation can be switched off for various protocols in the Ethereal protocol preferences, e.g. to (very slightly) increase performance.

If the checksum validation is enabled and it detected an invalid checksum, features like packet reassembling won't be processed. This is avoided as incorrect connection data could "confuse" the internal database.

Checksum offloading

The checksum calculation might be done by the network driver, protocol driver or even in hardware.

For example: The Ethernet transmitting hardware calculates the Ethernet CRC32 checksum and the receiving hardware validates this checksum. If the received checksum is wrong Ethereal won't even see the packet, as the Ethernet hardware internally throws away the packet.

Higher level checksums are "traditionally" calculated by the protocol implementation and the completed packet is then handed over to the hardware.

Recent network hardware can perform advanced features such as IP checksum calculation, also known as checksum offloading. The network driver won't calculate the checksum itself but simply hand over an empty (zero or garbage filled) checksum field to the hardware.

Note!

Checksum offloading often causes confusion as the network packets to be transmitted are handed over to Ethereal before the checksums are actually calculated. Ethereal gets these "empty" checksums and displays them as invalid, even though the packets will contain valid checksums when they leave the network hardware later.

Checksum offloading can be confusing and having a lot of [invalid] messages on the screen can be quite annoying. As mentioned above, invalid checksums may lead to unreassembled packets, making the analysis of the packet data much harder.

You can do two things to avoid this checksum offloading problem:

    * Turn off the checksum offloading in the network driver, if this option is available.
    * Turn off checksum validation of the specific protocol in the Ethereal preferences.

 

by: Abs_jaipur, posted on 2006-04-19 at 20:33:42, ID: 16494854

Hi,

Slow problem:

Ethereal has one domain where it is poor, and that is NCP decodes.
No one so far has taken the time to include an exhaustive list of NCP
decodes into the program. For this reason, Ethereal is not very good
to diagnose Netware client connection problems, at least as long as
there are no obvious problems at IP level. Actually, most IP
performance problems (e.g. IPX fine, IP slow) are due to
configuration problems at switch or NIC level and are not related to
protocol problems which could be seen with Ethereal. So the first
thing to do is verify your switches and NICs and see if the actually
used duplex settings for NICs and switches are the same. Don't trust
the configured values, but if possible, try to find out what the
devices decided to finally use. On the server for instance, most NIC
drivers either show the duplex setting at load time (and you can find
it in console.log), or in the lan/wan statistics in monitor.nlm (under
custom counters).

Note that you posted your message as a new thread rather than as a
followup to previous discussions on this issue you may have had. So I
have no idea on what you might already have done or discussed with
other sysops. I recommend you to keep posting in the same thread as
long as you are working on the same issue. This would make it easier
for everyone who is involved in the discussion.



Forum discussion, according to Ronnie Sahlberg:
The problem now is that for VERY large captures, ethereal is always slow
under all circumstances.
So let us start with just a simple random generic capture and measure for it
to try to keep the number of variables low.
(If, as you say, the number of sessions affects it as well, do you mean
the number of TCP sessions, or what kind of sessions?
 At some point, when the worst performance problem has been addressed, this
would be a very interesting area to look at.
 (I could create different synthetic capture files to measure with: same
number of packets, same payload, just a different number of sessions.)
 Make a note that you have observed the number of sessions to possibly have
an effect on the dissection speed so we don't forget to look at
it further down the track.
)

I currently believe that during refiltering of a capture, most time would be
spent inside file.c/add_packet_to_packet_list().
It would be VERY VERY useful to verify that this assumption is correct.
I would really like someone to look at gprof data and analyze where most
time is consumed, to either verify my claim about add_packet_to_packet_list()
or invalidate it.

The part of this function that I believe consumes the most CPU is
where we call epan_dissect_run() and perform a full dissection of the
packet.

As I see it, apart from the initial time we encounter the packet during file
read (or live capture), there are not that many instances where we really
must dissect the packet at all.
OK. If we select a packet in the list so it gets displayed in the dissect
pane, that might be an exception, but that is not something we do 100,000
times per capture anyway, so the performance of that is irrelevant.
We might also need to do a full rescan/redissect of all packets IF we have
changed the preferences in such a way that the packets will be dissected
differently, or when we have changed stuff using DecodeAs.

However, for me and many other users, the MAIN reason ethereal rescans the
packet list is because we have applied or changed a filter. Some users will
filter and refilter a capture file over and over and over, ten, twenty,
thirty if not more times for each capture they work with.
A rescan also happens when a ConversationList dialog or a
ServiceResponseTime dialog is opened.

Well enough of that. To my idea:

Hypothesis:  A significant part of the slowness of ethereal when refiltering
a capture file comes from the expensive calls to epan_dissect_run() called
from add_packet_to_packet_list() in file.c
Potential fix: Reduce the number of calls made to epan_dissect_run() at the
expense of additional memory requirements (enabled by a preference)

This assumes that most of the time we perform a full rescan/redissect of the
capture file, we really just want to reapply a display filter (and
are not doing anything that affects how a packet is dissected).

What do we need in order to refilter the packet list  if we do not allow
calling epan_dissect_run()?
1, We need to remember all COL values for all packets so that we can just
reapply them when adding the packet to the packet list, without calling the
dissector and recreating them that way. This will consume additional
memory.
2, For every packet we need to keep a list of all the hf_fields that were
encountered in the packet.
    This list contains the index of the hf variable as well as the value it
has.
    Nothing else needs to be stored there (in order to reduce the impact on
memory).
    This list may NOT be pruned as the edt structs are. This is because we
want to be able to still use this list even after the filters have changed
and thus the pruning would be different. No pruning.

The "ApplyFilterToEdtStructure" functions would need to be changed (or
duplicated) so they could operate on the list from item 2 instead of the edt
structure.
These functions might also need to be looked at so that they would be
efficient even for very large lists (no pruning).

1 would allow us to rebuild the packet list without needing to call the
dissector (?)
2 would allow us to refilter the entire trace without calling any
dissectors.
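
The two items above can be sketched roughly as follows. This is only an illustration of the caching idea, not Ethereal code: PacketCache, CachedPacket, toy_dissect and field_equals are made-up names, and toy_dissect is a stand-in for the expensive epan_dissect_run() call. Each packet is dissected once when it is first read; after that, refiltering only walks the stored field lists and reuses the stored column values.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

FieldList = List[Tuple[str, Any]]


@dataclass
class CachedPacket:
    number: int
    columns: Dict[str, str]  # item 1: remembered COL values
    fields: FieldList        # item 2: unpruned (field, value) list


class PacketCache:
    def __init__(self, dissect: Callable[[bytes], Tuple[Dict[str, str], FieldList]]):
        self._dissect = dissect
        self.dissect_calls = 0  # counts how often the expensive step runs
        self.packets: List[CachedPacket] = []

    def add(self, number: int, raw: bytes) -> None:
        """Dissect once, at file read or live capture time."""
        self.dissect_calls += 1
        columns, fields = self._dissect(raw)
        self.packets.append(CachedPacket(number, columns, list(fields)))

    def refilter(self, predicate: Callable[[FieldList], bool]):
        """Rebuild the packet list from cached data only; no dissection."""
        return [(p.number, p.columns) for p in self.packets if predicate(p.fields)]


def toy_dissect(raw: bytes) -> Tuple[Dict[str, str], FieldList]:
    """Hypothetical two-byte 'packet': protocol number, then port."""
    proto = "tcp" if raw[0] == 6 else "udp"
    columns = {"Protocol": proto.upper(), "Info": f"port {raw[1]}"}
    fields = [("ip.proto", raw[0]), (f"{proto}.port", raw[1])]
    return columns, fields


def field_equals(name: str, value: Any) -> Callable[[FieldList], bool]:
    """Filter predicate in the spirit of a 'name == value' display filter."""
    return lambda fields: any(n == name and v == value for n, v in fields)


cache = PacketCache(toy_dissect)
for num, raw in enumerate([bytes([6, 80]), bytes([17, 53]), bytes([6, 25])], start=1):
    cache.add(num, raw)

print(cache.refilter(field_equals("tcp.port", 80)))
print(cache.refilter(field_equals("ip.proto", 6)))
print(cache.dissect_calls)  # stays at 3 no matter how often we refilter
```

The trade-off is the one named in the proposal: memory grows with every stored column and field value, in exchange for refilters that never touch a dissector.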

ideas, comments?

Right now it would be nice if someone could create a capture as I proposed
earlier and use gprof to check where most of the CPU is spent when
refiltering the capture, to verify my assumptions or
invalidate them.

(
As a nice benefit in the future, IF we were to have that list of fields for
each packet, easily available, we could do things like merging this list
between packets.
Say #6 is the Call and #27 is the Response.
Since these packets are paired we could merge the lists from these two
packets into a single one.
Then when searching for something that occurred in the Response packet, we
would automatically also pick up the matching Call packet since their lists
were merged.
I.e. filtering for smb.error==foo would find both the Response that barfed
saying foo and the matching Call for this Response.
That would also be useful.
)
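
A minimal sketch of that merging idea, assuming each packet's field list is a plain list of (name, value) pairs keyed by packet number. merge_paired and matching_packets are hypothetical helpers, and the field names other than smb.error are invented for the example.

```python
from typing import Any, Dict, Iterable, List, Tuple

FieldList = List[Tuple[str, Any]]


def merge_paired(field_lists: Dict[int, FieldList],
                 pairs: Iterable[Tuple[int, int]]) -> Dict[int, FieldList]:
    """Give each paired Call/Response the combined fields of both packets."""
    merged = {num: list(fields) for num, fields in field_lists.items()}
    for call, response in pairs:
        combined = field_lists[call] + field_lists[response]
        merged[call] = combined
        merged[response] = combined
    return merged


def matching_packets(lists: Dict[int, FieldList], name: str, value: Any) -> List[int]:
    """Packet numbers whose field list contains name == value."""
    return sorted(num for num, fields in lists.items() if (name, value) in fields)


field_lists = {
    6: [("smb.cmd", "open")],     # the Call
    27: [("smb.error", "foo")],   # the Response
    30: [("smb.cmd", "close")],   # unrelated packet
}
merged = merge_paired(field_lists, [(6, 27)])

print(matching_packets(field_lists, "smb.error", "foo"))  # unmerged lists
print(matching_packets(merged, "smb.error", "foo"))       # merged lists
```

With the unmerged lists only the Response matches; with the merged lists the filter returns the Call as well.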

 

by: rsivanandan, Posted on 2006-05-23 at 07:15:09, ID: 16742605

thnx.

Cheers,
Rajesh
