How To: Address Choppy VoIP

Telephones have been around for longer than anyone reading this article has been alive. They’ve been a ubiquitous part of our daily lives for decades, and there is no reason to believe they’re going away anytime soon. But they are changing. Cell phones, cordless phones, dedicated voice over IP (VoIP) hardware phones, and software-based VoIP (e.g. Skype) are all commonplace. Many of these, especially VoIP, deliver technology convergence by combining several related technologies. For example, mom’s desk phone delivers voice, video, instant messaging, and address book features.

Where do they converge? The most user-centric answer is on the device. From the IT perspective, they converge on the cable as well: the Ethernet network usually carries the data for all of these technologies. And with the continuing speed breakthroughs in wired Ethernet and wireless networking, there’s no reason to think this trend won’t continue.

The Problem with VoIP

Succinctly, VoIP shares the Ethernet infrastructure with everything else that uses Ethernet. This includes typical data networking like file sharing, Internet browsing, and database management. So the busier a network is, the more likely VoIP services will have to compete with other traffic for bandwidth. When that competition occurs, VoIP packets may be delayed or dropped, and callers experience the result as distortion, a drop in quality, or even connection loss.

The most common types of network issues that impact VoIP include:

Latency. This term refers to the delay between the transmission and reception of data packets throughout a conversation. Applications often compensate for short and medium latencies transparently by changing their use of codecs and bitrates. Exceptionally long latency can prevent communication. Unpredictable latency is jitter.

Jitter. Jitter is a significant variance in the elapsed time between transmission and reception of data packets over the course of a conversation. Where latency measures the delay itself, jitter measures how unpredictably that delay changes from one packet to the next.

Dropped packets. Most applications can handle the occasional loss of data packets with simple retry methods. VoIP deals with dropped packets poorly because the conversation happens in real time. If a retry is attempted, the entire conversation is delayed. If the packet is discarded, part of the conversation is lost.

Out-of-order data delivery. This is a combination of several other traffic issues where the transmitted data is all received by the target, but not in the proper order and likely with at least one packet delayed. Although most protocols can handle out-of-order delivery as long as all the packets arrive within a reasonable time threshold, late packets may be dropped or discarded to preserve the integrity of the others.
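To make these four conditions concrete, here is a minimal Python sketch that derives each of them from a list of probe results. Everything in it is hypothetical and for illustration only: the summarize function, the sample numbers, and the choice to measure jitter as the mean absolute difference between consecutive latencies. Real VoIP stacks typically use a smoothed estimator instead (RTP defines one), so treat this as a simplification rather than the way any particular product calculates its numbers.

```python
from statistics import mean


def summarize(sent_count, received):
    """Summarize probe results.

    sent_count: number of packets transmitted.
    received:   list of (sequence_number, latency_ms) in arrival order.
    """
    latencies = [latency for _, latency in received]

    # Dropped packets: everything that was sent but never arrived.
    loss_pct = 100.0 * (sent_count - len(received)) / sent_count

    # Jitter (simplified): average change in latency between
    # consecutive arrivals.
    changes = [abs(b - a) for a, b in zip(latencies, latencies[1:])]
    jitter_ms = mean(changes) if changes else 0.0

    # Out-of-order delivery: a packet whose sequence number is lower
    # than one that has already arrived.
    out_of_order = 0
    highest_seen = -1
    for seq, _ in received:
        if seq < highest_seen:
            out_of_order += 1
        else:
            highest_seen = seq

    return {
        "latency_ms": mean(latencies) if latencies else None,
        "jitter_ms": jitter_ms,
        "loss_pct": loss_pct,
        "out_of_order": out_of_order,
    }


# Made-up sample: six packets sent, packet 4 lost, packet 2 arrives late.
stats = summarize(6, [(0, 50), (1, 52), (3, 90), (2, 54), (5, 51)])
print(stats)
```

The same four numbers are what the tools discussed later in this article are ultimately trying to give you, just measured on a real network instead of a hand-written list.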

Consider these network data transmission factors for a moment. They are mostly transparently handled by the protocols and hardware when “traditional” data is being moved. But in the case of VoIP, the built-in error correction mechanisms are not enough to deal with network bandwidth competition and the resulting resource constraints.

Hardcore network nerds have already made allowances for this kind of resource competition. They designed wired Ethernet to allow for traffic prioritization using quality of service (QoS) mechanisms. QoS is, at its root, largely an honor-based system. Network applications identify their traffic and network infrastructure like switches and routers honor the identification. There are certainly applications that miscategorize their traffic to get better performance but these are usually resolved over time.
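As an illustration of the "applications identify their traffic" half of that honor system, the sketch below marks a UDP socket with a DSCP value in the IP header. DSCP is one common marking scheme that operates at the IP layer; it is used here as an assumption, since the article does not say which QoS mechanism a given network relies on. The helper names are hypothetical, the value 46 (Expedited Forwarding) is the class commonly used for voice, and whether the operating system applies the marking and whether any switch or router honors it depends entirely on the platform and network configuration.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice traffic


def dscp_to_tos(dscp):
    """Convert a 6-bit DSCP value to the 8-bit TOS/traffic-class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    # DSCP occupies the upper six bits; the lower two bits are ECN.
    return dscp << 2


def open_marked_udp_socket(dscp=DSCP_EF):
    """Create a UDP socket whose outgoing packets request the given DSCP.

    Assumption: the platform exposes socket.IP_TOS and allows it to be set.
    Some operating systems ignore or restrict this option.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(hex(dscp_to_tos(DSCP_EF)))
```

Calling open_marked_udp_socket() returns a socket that is used like any other. The marking is only a request, which is exactly why the system is honor-based: the infrastructure decides whether to respect it.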

Identifying Specific Network Issues

Now that you understand the issues that impact VoIP, you should know that there are numerous tools out there to help identify and resolve them. These tools provide various levels of assistance.

For example, ping is a useful troubleshooting tool built into most operating systems and network hardware. Consider this common output that most IT pros ignore:

C:\Windows\system32>ping -a www.example.com
Pinging www.example.com [97.74.104.201] with 32 bytes of data:
Reply from 97.74.104.201: bytes=32 time=52ms TTL=117
Reply from 97.74.104.201: bytes=32 time=52ms TTL=117
Reply from 97.74.104.201: bytes=32 time=52ms TTL=117
Reply from 97.74.104.201: bytes=32 time=53ms TTL=117
Ping statistics for 97.74.104.201:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 52ms, Maximum = 53ms, Average = 52ms

Now look at it within the context of VoIP as an initial tool to get some very early data:

Successful name resolution, which is always a good start

0% loss indicates that no packets were lost in this small sample

Latency varies between 52ms and 53ms, indicating low jitter

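TTL=117 suggests that the reply crossed about 11 routers on its way back, assuming the server started with a TTL of 128 (a common default). This is only an estimate, and the path in the other direction may be a different length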
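The same reading of the ping output can be automated. The sketch below is a hypothetical helper, not part of any ping tool: it parses the reply lines shown above, reports the latency spread, and estimates the hop count from the TTL. The hop estimate assumes the server started from one of the common initial TTL values (64, 128, or 255), which cannot be confirmed from the output alone.

```python
import re

SAMPLE = """\
Reply from 97.74.104.201: bytes=32 time=52ms TTL=117
Reply from 97.74.104.201: bytes=32 time=52ms TTL=117
Reply from 97.74.104.201: bytes=32 time=52ms TTL=117
Reply from 97.74.104.201: bytes=32 time=53ms TTL=117
"""

# Matches the "time=52ms TTL=117" portion of a Windows-style reply line.
REPLY = re.compile(r"time[=<](\d+)ms TTL=(\d+)")


def parse_ping(text):
    """Return (round-trip times in ms, TTL values) for every reply line."""
    times, ttls = [], []
    for match in REPLY.finditer(text):
        times.append(int(match.group(1)))
        ttls.append(int(match.group(2)))
    return times, ttls


def estimate_hops(ttl, initial_ttls=(64, 128, 255)):
    """Guess the hop count, assuming the sender used the smallest
    common initial TTL that is not below the observed value."""
    start = min(t for t in initial_ttls if t >= ttl)
    return start - ttl


times, ttls = parse_ping(SAMPLE)
print("replies:", len(times))
print("spread (ms):", max(times) - min(times))
print("estimated hops:", estimate_hops(ttls[0]))
```

A small spread between the fastest and slowest reply is the quick indicator of low jitter; four probes are far too few to draw firm conclusions, so a longer run gives a better picture.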

Moving to the next level of complexity, you can leverage another frequently built-in tool called traceroute. From our perspective, it is an enhanced version of ping that shows each host along the network route between the client and server. For the same connection shown above, this is the traceroute:

C:\Windows\system32>tracert www.example.com
Tracing route to www.example.com [97.74.104.201] over a maximum of 30 hops:
1   <1 ms   <1 ms   <1 ms  Router.home [192.168.1.1]
2    5 ms    5 ms    4 ms  L110.STTLWA-VFT.verizon-gni.net [98.111.210.116]
3    6 ms    6 ms    6 ms  G14.STTLWA-LCR-01.ncnetwork.net [184.19.242.141]
4    7 ms    6 ms    6 ms  so-6-0.SEA01-BB-RTR1.verizon-gni.net [108.57.128.194]
5    6 ms    7 ms    7 ms  0.so-7-1-0.XT1.SEA7.ALTER.NET [152.63.105.57]
6    9 ms    8 ms    9 ms  0.so-2-0-0.XT1.SEA1.ALTER.NET [152.63.104.225]
7    9 ms    8 ms    8 ms  POS4-0.BR1.SEA1.ALTER.NET [152.63.105.81]
8    9 ms    9 ms    9 ms  192.205.33.121
9   50 ms   51 ms   50 ms  cr1.st6wa.ip.att.net [12.122.146.50]
10   52 ms   52 ms   52 ms  cr2.sffca.ip.att.net [12.122.31.194]
11   49 ms   49 ms   50 ms  cr1.sffca.ip.att.net [12.122.3.69]
12   50 ms   50 ms   51 ms  cr1.la2ca.ip.att.net [12.122.3.122]
13   50 ms   50 ms   50 ms  cr2.phmaz.ip.att.net [12.122.31.190]
14   49 ms   57 ms   50 ms  gar4.phmaz.ip.att.net [12.123.206.209]
15   52 ms   52 ms   52 ms  12.122.255.106
16   50 ms   49 ms   50 ms  mdf001c7613r0004-gig-12-1.phx1.attens.net [63.241.130.174]
17   49 ms   50 ms   53 ms  63.241.142.126
18   53 ms   52 ms   53 ms  corpweb-v101.prod.mesa1.secureserver.net [97.74.104.201]
19   52 ms   54 ms   52 ms  corpweb-v101.prod.mesa1.secureserver.net [97.74.104.201]
Trace complete.

This is a wealth of information for VoIP troubleshooting. For example, the jump in latency between hops 8 and 9 (from about 9ms to about 50ms) could easily impact VoIP if the latency threshold for the data rate and codec in use is below 50ms. That means clients connecting to VoIP hosts on the near side of that router will experience great VoIP performance, while those connecting to hosts on the far side might experience lower performance.
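Spotting that jump by eye works for one trace, but a few lines of code can do it for many. The following sketch is an illustrative helper (the function names are made up) that parses Windows tracert hop lines and reports the pair of consecutive hops with the largest increase in average round-trip time. It skips hops that timed out and treats "<1 ms" as 1 ms. Keep in mind that per-hop times reflect how quickly each router answers the probe, so a jump is a lead to investigate rather than proof of a fault.

```python
import re

SAMPLE = """\
7    9 ms    8 ms    8 ms  POS4-0.BR1.SEA1.ALTER.NET [152.63.105.81]
8    9 ms    9 ms    9 ms  192.205.33.121
9   50 ms   51 ms   50 ms  cr1.st6wa.ip.att.net [12.122.146.50]
10   52 ms   52 ms   52 ms  cr2.sffca.ip.att.net [12.122.31.194]
"""

# Hop number, three round-trip times, then the host name or address.
HOP = re.compile(
    r"^\s*(\d+)\s+(<?\d+)\s*ms\s+(<?\d+)\s*ms\s+(<?\d+)\s*ms\s+(\S.*)$"
)


def parse_hops(text):
    """Return a list of (hop_number, average_rtt_ms) tuples."""
    hops = []
    for line in text.splitlines():
        match = HOP.match(line)
        if not match:
            continue  # header lines and timed-out hops are skipped
        rtts = [int(value.lstrip("<")) for value in match.group(2, 3, 4)]
        hops.append((int(match.group(1)), sum(rtts) / len(rtts)))
    return hops


def biggest_jump(hops):
    """Return (from_hop, to_hop, increase_ms) for the largest increase."""
    pairs = zip(hops, hops[1:])
    (a, rtt_a), (b, rtt_b) = max(pairs, key=lambda p: p[1][1] - p[0][1])
    return a, b, rtt_b - rtt_a


hops = parse_hops(SAMPLE)
jump = biggest_jump(hops)
print("largest increase between hops", jump[0], "and", jump[1])
```

Running the same comparison on traces collected at different times of day helps separate a link that is always slow from one that only slows down under load.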

There are many other tools available to identify the VoIP issues. In fact, these two are the most basic available. But they will get you started and help you identify basic issues. For more advanced VoIP debugging and issue identification you should consider tools that take VoIP-specific network factors into account.

As an interesting sidebar, one useful approach to finding trouble spots is to deliberately exacerbate them (for a very short time) to see how the network behaves. This process essentially creates a short-lived traffic jam on your network. Tools that accomplish this, many of which are only available through third parties, flood your network with specific types and volumes of network traffic. This flooding tests things like jitter and packet loss under load, so you can see how network services hold up. Many basic network monitoring solutions lack this capability, so keep an eye out for tools that offer it; it is a great addition to your troubleshooting toolkit.

Resolving VoIP Issues

Now that you’ve got the tools to hunt down VoIP issues it is time to integrate them into a problem resolution process. The process I tend to use is fairly straightforward. It starts at the client reporting issues.

1. Is this the only client reporting issues? Try your own VoIP client and see if the problem is systemic.

2. Run the basic analysis tools to get core networking information.

3. Analyze the information to determine which of the network conditions outlined earlier exist.

4. Now this is a broader network problem with specific conditions, so you can use network problem identification and resolution approaches.

It may seem at first glance that #4 is a bit of a cop-out. But it really isn’t. Once the VoIP problem is identified in terms of typical network issues (e.g. latency, packet loss) there is usually an entire set of tools and personnel waiting to fix the problem. Presenting the problem as a VoIP problem such as “My mom’s voice is garbled” is far different from “Over the last 30 minutes, the router at 12.122.3.122 has been dropping more than 50% of UDP packets under load when a connection is opened from the 12.122.3.69 side.”
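The analysis step of the process above, turning measurements into named network conditions, can be sketched in a few lines of Python. The classify function and its thresholds are assumptions for illustration only. The default limits are placeholders in the range often quoted as planning guidelines for voice, and they should be replaced with values appropriate to the codec and service in use.

```python
def classify(latency_ms, jitter_ms, loss_pct, out_of_order=0,
             max_latency_ms=150.0, max_jitter_ms=30.0, max_loss_pct=1.0):
    """Translate measurements into network-level findings.

    All thresholds are illustrative defaults, not authoritative limits.
    """
    findings = []
    if latency_ms > max_latency_ms:
        findings.append(
            f"latency {latency_ms:.0f} ms exceeds {max_latency_ms:.0f} ms")
    if jitter_ms > max_jitter_ms:
        findings.append(
            f"jitter {jitter_ms:.0f} ms exceeds {max_jitter_ms:.0f} ms")
    if loss_pct > max_loss_pct:
        findings.append(
            f"packet loss {loss_pct:.1f}% exceeds {max_loss_pct:.1f}%")
    if out_of_order > 0:
        findings.append(f"{out_of_order} packets arrived out of order")
    return findings


# A healthy-looking path versus a troubled one (made-up numbers).
healthy = classify(latency_ms=52, jitter_ms=1, loss_pct=0.0)
troubled = classify(latency_ms=200, jitter_ms=45, loss_pct=5.0, out_of_order=2)

print(healthy or "no network conditions detected")
for finding in troubled:
    print("-", finding)
```

The output of a helper like this is the kind of statement a network team can act on, which is the whole point of restating the VoIP complaint in network terms.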

Tracking Down VoIP Issues Isn’t Difficult

Tracking down VoIP issues is probably simpler than you thought. They are usually network traffic issues. But they’re subtle issues that other services probably don’t even notice, or address transparently if they do. While VoIP may be distorted or unreliable when a switch is misbehaving, other traffic may be routing its way around the issue without any perceptible problems. That’s why you should use specialized tools or take a closer look at the basic tools to get the data you need. Once you’ve retrieved that data and reduced the problem to a core networking issue, either you or others can use it to resolve the issue using common network troubleshooting techniques.

Anonymous