VOIP interruptions, spurious noises, malfunctioning.

Fred Marshall
I'm dealing with performance issues with a VOIP phone system.
The VOIP service provider provides a dedicated internet connection for VOIP and the "PBX" is externally provided by the VSP.
Since the PBX is external, even internal extension-to-extension VOIP calls cause external traffic.
We provide a dedicated VOIP firewall in the form of an RV320 followed by cascaded SG300 switches - configured with a VOIP VLAN with QoS set up.
We provide a dedicated internet connection and firewall for site data - independent of VOIP traffic.
There are 3 sites, each one with separate internet connections, firewalls, etc.
The largest site has about 20 phones and 25 workstations.
The smallest site has about 6 phones and 10 workstations.
The middle site has about 9 phones and 10 workstations.
Data traffic is modest.

I believe the VOIP system is working overall as intended so the "problems" are a matter of service quality I'd say.
Problems are intermittent and include:
- audio is heard at one end and not the other.
- a very loud "screeching noise" is heard at one end or the other and can be audible at one or both ends.  This is reported to be rather high-pitched and not like loud TV white noise.
- some incoming calls don't arrive on site and go directly to voice mail.
Overall, it's reasonable to say: "while the system seems to "work", service is unacceptable".

Since the 3 sites are each independent of the other re: VOIP, if all sites behave similarly (re: problems) then one might conclude that the problems are likely "external" and with the VSP.  Yet, the type of internet feed, the VOIP firewall model, the internal switch models are also common.
It may be that the largest site is more affected with problems than the smaller 2 sites.  That's a little hard to say.

Some thoughts:
- We could hire an expert in VOIP problems and SG300 switch QoS to assess our switch configurations and related network architecture.  But going from *no* particular QoS settings to VOIP-favoring priority settings seems to have had little effect, if any.
- We could figure out a way to measure things in order to determine what part of the overall VOIP system is responsible for the problems (thus WHO?).
- We are willing to tackle internal problems but don't know now what more to do.  It would be good to be able to conclusively determine who needs to tackle the problems (us or the VSP or both?).
- We have limited confidence in the VSP's ability or motivation to address these issues.  Often, the suggestions from the VSP are that we should instrument this and that.  This is, in part, understandable.  But motivation seems low.  Is that normal?

Any recommendations on any of this?
Distinguished Expert 2018

Commented:
It gets trickier when the PBX is in the cloud, partly because the vendor now has much more flexibility to blame things on your side.

One approach I've taken in this scenario is making a VLAN for the phones, then reserving bandwidth for it. I'm not 100% sure your Cisco routers can do it.
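
To get a feel for how much bandwidth such a reservation would need, here is a rough sizing sketch. It assumes the G.711 codec with 20 ms packetization (160-byte payload, 50 packets per second) carried over IPv4 and Ethernet. The thread does not say which codec the phones use, so treat the numbers as illustrative only; the function names and the 20% headroom are made up for this example.

```python
# Rough per-call bandwidth estimate for RTP voice on Ethernet.
# Assumption (not from the thread): G.711, 20 ms packets -> 160-byte payload, 50 pps.
# Overhead per packet: RTP 12 + UDP 8 + IPv4 20 + Ethernet header/FCS 18 = 58 bytes.

def rtp_bandwidth_kbps(payload_bytes=160, packets_per_second=50, overhead_bytes=58):
    """Bandwidth of one call in one direction, in kbit/s."""
    return (payload_bytes + overhead_bytes) * 8 * packets_per_second / 1000


def reserve_kbps(concurrent_calls, headroom=1.2):
    """Bandwidth to reserve per direction for a number of simultaneous calls.

    headroom is an arbitrary safety margin chosen for this sketch.
    """
    return concurrent_calls * rtp_bandwidth_kbps() * headroom


if __name__ == "__main__":
    # Worst case for the largest site: all 20 phones on a call at once.
    print(f"per call: {rtp_bandwidth_kbps():.1f} kbit/s")
    print(f"20 calls with headroom: {reserve_kbps(20):.0f} kbit/s")
```

Because the PBX is external, an extension-to-extension call may consume two such streams per direction on the WAN link (one per phone), so the worst-case call count should be chosen with that in mind.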

Also, who is the VSP? It is possible that you have a pretty crappy one.
noci, Software Engineer
Distinguished Expert 2018

Commented:
If blame needs to be assigned somewhere, try to isolate all VOIP traffic so that it never mixes with other traffic (separate VLAN, no trunking of that VLAN together with others to the core switches, etc.). A separate switch interconnect for VOIP may help.

Network recordings (SIP & RTP) of the traffic might help the analysis, depending on the causes:
- SIP traffic: to see which messages do and do not arrive (session start / tear down).
- RTP: to check that there is a regular stream from both sides. There should not be any irregular gaps or sudden bursts of packets (the screeches might easily be caused by this).
For internal comparison, measure between router & phone (capture on both ports).
For external measurement, try to measure on the WAN side.
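
The "irregular gaps or sudden bursts" check can be sketched in a few lines. The example below assumes the arrival time and RTP sequence number of each packet in one direction of one call have already been exported from a capture into a list of tuples. The 20 ms expected spacing and the thresholds are assumptions chosen for illustration, and the function name is hypothetical.

```python
# Flag gaps, bursts and sequence jumps in one direction of an RTP stream.
# Assumptions: packets are in capture order; the nominal spacing is 20 ms.

def find_rtp_anomalies(packets, expected_interval=0.020,
                       gap_factor=3.0, burst_factor=0.25):
    """packets: list of (arrival_time_seconds, rtp_sequence_number).

    Returns a list of (packet_index, kind, detail) where kind is
    "gap", "burst" or "sequence".
    """
    anomalies = []
    for i in range(1, len(packets)):
        prev_time, prev_seq = packets[i - 1]
        time, seq = packets[i]
        delta = time - prev_time
        # RTP sequence numbers are 16 bits wide and wrap around.
        seq_step = (seq - prev_seq) % 65536
        if seq_step != 1:
            anomalies.append((i, "sequence", seq_step))
        if delta > expected_interval * gap_factor:
            anomalies.append((i, "gap", round(delta, 6)))
        elif delta < expected_interval * burst_factor:
            anomalies.append((i, "burst", round(delta, 6)))
    return anomalies


if __name__ == "__main__":
    sample = [(0.000, 100), (0.020, 101), (0.040, 102),
              (0.200, 103), (0.201, 104), (0.221, 106)]
    for index, kind, detail in find_rtp_anomalies(sample):
        print(index, kind, detail)
```

Running the same check on a capture taken at the phone port and on one taken at the WAN port of the same call gives a hint about where the irregularity is introduced: if the stream is already irregular when it arrives on the WAN side, the cause is unlikely to be the local switches.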

Besides these recordings: keep a log of when good conversations and failed conversations occurred. Then you can quickly select the relevant parts of the earlier packet captures.
One-sided conversations hint at NAT problems. Those should be visible in such traces too.

Some VSPs operate on small margins: spending 1 hour on a customer's problem might need months of phone conversations to compensate.
So if there is a quick fix it will be taken; if it takes too long they might rather have you take your business elsewhere.

Some food for thought: you can fairly easily set up an on-prem PBX (FusionPBX or 3CX, for example) that will at least take some of the issues (internal conversations) out of the equation.
(You can also have the "internal" calls between sites handled without paying a 3rd party.)

Author

Commented:
noci:  Thank you for some very interesting ideas!
I'm not a VOIP expert so I'm not sure just how far I can go with recording traffic.  I'm sure that I can record it but not at all sure that I can analyze it.
Capturing "bad" events means capturing all the traffic - as bad events are unpredictable.

As far as switches, VLANs and trunks:
The largest site has a central LAN switch where the VOIP LAN enters.
There are 3 types of downstream connections:
1) immediate trunked ports on the central SG300 switch for phones/computers.
2) LAG trunked ports downstream to cascaded SG300 switches.
3) trunked ports downstream of the cascaded SG300 switches for phones/computers.
I don't think this is important but just to be complete: at some "desks" there are multiple computers, network printers.  So those are supported with a DGS-1100-08 switch that is VLAN capable.

The LAGs between the SG300 switches are made up of 3 cables and ports each.  We could use one of those cables entirely for VOIP and then either use the remaining 2 cables in a LAG for data or have no LAG and just use one cable for data - leaving the 3rd cable as an unused spare.  

The smaller sites have no LAGs but the downstream (single) cables are trunked.  
So, a typical smaller site has:
VOIP firewall to:
Central SG300 switch VOIP VLAN port.
Trunked ports to downstream SG300 switches that have phone/computer combinations on trunked ports.
In one case there is a 3rd downstream layer with SG300 switch on a trunked port.
Phone/computer combinations downstream from the 3rd switch.
(Except for the central switch, these are all 10-port SG300 switches)
noci, Software Engineer
Distinguished Expert 2018

Commented:
Wireshark has some VOIP tools (one of the menus is named "Telephony"), so there is help there.
The SIP & RTP submenus give extra information once you have a recording.

Author

Commented:
noci:  Yes, I've used those tools just a bit.  Still not an expert on VOIP.  Well, I could *become* an expert but there's not the personal nor customer bandwidth to justify that.  As I mentioned at the beginning, a VOIP / SG300 expert might be the right approach.

masnrock: the VSP is Silver Star Telecom in Vancouver, WA.
One way transmission is unlikely to be solved by applying QoS.

With a SIP-based system, the call setup for an extension-to-extension call goes via the SIP server, but with some systems the "call" itself (the audio) then runs directly between the endpoints. It is possible that the system in question is attempting this and falling back to going via the SIP server, and that the one-way transmission occurs when that fallback fails. A SIP trace (or a simple packet capture) should show this.

As regards the loud "tone", I would guess that this is a codec mismatch, something that can only be resolved by the SIP provider.

Author

Commented:
AnreLovius:  Thank you!

I don't see how the "call" can be between the endpoints no matter what.  
I'm pretty sure the phones themselves aren't smart enough to do that and there is NO other equipment on site.  But I suppose it's possible.

We've experienced the following:
- one-way transmission as you've addressed
- bursts of what I would call white noise without interruption of the audio that I could tell
- bursts of very loud screeching and I can't say if the audio was disrupted or not on these cases.  I have somewhat confirmed that this isn't the same as the white noise bursts.  
More about this reported:  When the call is inside to outside then the outside caller is the only one to hear the screeching.  When the call is extension to extension then it apparently can be heard at both ends.
- interrupted audio in short segments with no noises (I was using a cell phone but have never experienced this on the cell phone otherwise).

So, I can do a Wireshark capture at the VOIP firewall interface to presumably see all the VOIP traffic.
I can use the Wireshark VOIP analysis tools.
I can capture a ton of "normal" data this way and hope to capture an "event".
I'm pretty sure that I won't know what I'm looking at......
Even so, if this might help us know what's wrong with our network - that would be helpful.
And, if it's not our network, knowing that would be helpful also (I hope).
Thanks for your comment about one-way transmission and QoS.  Your further explanation suggests that not only won't QoS fix it but that it's not anything internal to our network that we could address, right?
Then too, this is happening at 3 sites that are internally independent.
The only common element is the local communications service provider where fiber-based internet service for VOIP is being used - with fiber internet service to each site.
noci, Software Engineer
Distinguished Expert 2018
Commented:
About one way transmission....

SIP, as designed:  phone -> pbx1 -> vsp interconnect -> pbx2 -> phone
SIP, in practice:  phone -> pbx1 -> firewall1 (nat) -> vsp interconnect -> pbx2 -> firewall2 (nat) -> phone

In most cases the RTP connection is phone -> firewall1 (nat) -> firewall2 (nat) -> phone
   With transcoding or (lawful) intercept, one of the PBXs / VSP interconnects can be in the media path as well, either to copy a conversation or to convert it from one format to another.
In the SIP negotiation the port numbers for the RTP sessions need to be exchanged, and the firewall also needs to do something with them.
The PBXs need to support something called FENT (Far End NaT) for this.

One-way audio is mostly about one of the two transmission streams failing, most often because NAT is not mapping it correctly.
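
One quick thing to look for in a SIP trace is which address each side advertises for its RTP stream. A minimal sketch follows: it extracts the IPv4 connection addresses (`c=` lines) from the SDP body of an INVITE or 200 OK and flags the RFC 1918 private ones. The helper names are hypothetical and the sample addresses are invented. A private address seen on the WAN side is only a hint, since a PBX with far-end NAT support may ignore the advertised address and reply to the packet's source address instead.

```python
import ipaddress

# RFC 1918 private IPv4 ranges.
PRIVATE_NETS = [ipaddress.ip_network(n)
                for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def sdp_media_addresses(sdp_text):
    """Return the IPv4 addresses from all 'c=IN IP4 <addr>' lines of an SDP body."""
    addresses = []
    for line in sdp_text.splitlines():
        line = line.strip()
        if line.startswith("c=IN IP4 "):
            # The address may carry a '/ttl' suffix for multicast; drop it.
            addresses.append(line.split()[2].split("/")[0])
    return addresses


def private_addresses_in_sdp(sdp_text):
    """Return the advertised media addresses that fall in RFC 1918 space."""
    hits = []
    for addr in sdp_media_addresses(sdp_text):
        ip = ipaddress.ip_address(addr)
        if any(ip in net for net in PRIVATE_NETS):
            hits.append(addr)
    return hits


if __name__ == "__main__":
    # Invented SDP body, as a phone behind NAT might send it.
    sample_sdp = "\r\n".join([
        "v=0",
        "o=- 1 1 IN IP4 192.168.10.25",
        "s=-",
        "c=IN IP4 192.168.10.25",
        "t=0 0",
        "m=audio 16384 RTP/AVP 0 101",
    ])
    print(private_addresses_in_sdp(sample_sdp))
```

The port on the `m=audio` line is also worth comparing against the RTP port range the VSP specified for the firewall.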

Many firewalls have something called ALG (Application Layer Gateway) support for SIP. In MOST cases this is broken due to implementation errors.
ALGs for SIP should be disabled as a general rule.
Either that, or the port range for RTP is not set up completely correctly. (From the phones, ACCEPT all outgoing UDP packets, and also accept incoming UDP packets in the phone's RTP port range that are destined for the phone.)

If you have a unix/linux based system as firewall or PBX, try sngrep; it is a tool to visualize SIP negotiations.

wrt. recordings: you can instruct Wireshark to open a new file for every interval (e.g. each hour). The written log can help to select the right files. If there were no complaints for a given period, you can remove the files for it (and keep a few as comparison material).
(Some firewalls can deliver a PCAP (capture file) on their own; Wireshark should be able to process those.)
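
Matching the written incident log against hourly capture files can be automated. The sketch below assumes a made-up naming scheme, `voip_YYYYMMDD_HHMM.pcap`, where the timestamp is the start of the interval. This is not Wireshark's own file naming, so the format string would need adapting to whatever names the capture setup actually produces. The function names and sample times are hypothetical too.

```python
from datetime import datetime


def capture_file_for(when, prefix="voip", interval_minutes=60):
    """Name of the capture file whose interval contains the given time.

    Assumes intervals start at midnight and files are named after the
    interval start (hypothetical scheme: <prefix>_YYYYMMDD_HHMM.pcap).
    """
    minute_of_day = when.hour * 60 + when.minute
    start_minute = minute_of_day // interval_minutes * interval_minutes
    start = when.replace(hour=start_minute // 60, minute=start_minute % 60,
                         second=0, microsecond=0)
    return f"{prefix}_{start:%Y%m%d_%H%M}.pcap"


def files_to_keep(incident_times, **kwargs):
    """Sorted, de-duplicated list of capture files covering the logged incidents."""
    return sorted({capture_file_for(t, **kwargs) for t in incident_times})


if __name__ == "__main__":
    # Invented incident log entries.
    incidents = [
        datetime(2018, 3, 14, 13, 47),
        datetime(2018, 3, 14, 13, 5),
        datetime(2018, 3, 14, 9, 5),
    ]
    for name in files_to_keep(incidents):
        print(name)
```

Everything not in the returned list is a candidate for deletion, apart from the few files kept as known-good comparison material.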

White noise can be inserted by the phones themselves (comfort noise), as otherwise there would be absolute silence.

Author

Commented:
noci:  Thank you!

Re: white noise:
I doubt that white noise of the intensity I heard would be intentional.  Also I doubt it would be so intermittent.
But that's *interesting*!

Yes, I have experience with Wireshark and the file handling.  Thanks.

At our end, the only firewall for VOIP (dedicated) is the simple RV320.  No other data traverses this router.  SIP Alg is disabled at the request of the VSP - which agrees with what you've said.  That the system mostly just *works* is an indicator that things are set up correctly at least to a point.  If QoS isn't going to help then I'm rather stuck as to what *we* can do to our network.

I have not encountered, on our side, anything mentioning RTP nor portrange beyond port settings prescribed for the firewall by the VSP.  I find in many of these cases that they say we have to have certain ports open when they mean we need to have some of those ports *outgoing* open - which they are by default anyway.  In those cases, no ports need to be opened and certainly not *incoming* as would be the connotation or blind adherence to the "spec's" provided.

If there are things that you've described that we have to somehow further pay attention to and accommodate, I haven't been told anything about it.  So, I suspect that we are at the mercy of the VSP in that regard (?).

Said another way, I'm at a loss to know what else we might do or to pursue.  So, if I've not translated some of what you're telling me then a bit of coaching may still be needed.  :-o
noci, Software Engineer
Distinguished Expert 2018

Commented:
Eh, yes: as the PBX is "in the cloud", your VSP has to tell you what to set up and how
(the capabilities of the PBX determine what is needed).

Comfort noise should be low-key, as a background noise, not a loud hiss. (Real silence makes people wonder if things still work, as we are used to some static discharge on long lines from the POTS days ;-) ).
