Why Deploying Cisco’s IP SLA Isn’t a Performance Management Strategy
by Team AppNeta, February 19, 2013
Filed under: Performance Monitoring
IP SLA is a network monitoring capability deployed by Cisco-powered enterprises to measure several network health metrics between locations, but it’s far from a complete network performance monitoring tool. One of our financial services customers recently shared how they used PathView Cloud to pinpoint an unexpected WAN provider issue that both IP SLA and PRTG missed.
To review, IP SLA is an active network probe enabled on a source Cisco device that tests the connectivity and status of a network path between sites. Like PathView Cloud, it can take dual-ended measurements between two capable devices, or single-ended measurements between a capable device and an IP-addressable target.
To make use of the device feature, IP SLA must be enabled and configured on the chosen Cisco routers. Next, you need to deploy a network device monitoring tool to periodically collect the IP SLA data so you can view and use it. Examples of these are available from SolarWinds, PRTG, and ManageEngine. You’ll need a server to host the reporting application and, depending on the scale of your environment, another host for the database. When it’s all deployed, you’ll still be missing at least three keys to your own network performance analysis.
Key 1: Total Path Capacity
Knowing the actual network capacity being delivered by your WAN provider is critical to understanding true SLA delivery. Are you getting what you’re paying for? Do you have the available bandwidth for the services you’re supporting? Customers are often surprised to learn they’re missing out on contracted bandwidth and relieved to have the capacity analysis capabilities PathView provides while watching their networks.
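As a rough illustration of the "are you getting what you’re paying for" question, the Python sketch below compares a measured total capacity against the contracted rate. The function name, the 90% tolerance, and the sample figures are assumptions made up for this example; they are not PathView output or terms from any carrier SLA.

```python
def capacity_delivery(contracted_mbps: float, measured_mbps: float,
                      tolerance: float = 0.90) -> tuple[float, bool]:
    """Return (delivered_fraction, within_tolerance) for a WAN circuit.

    tolerance is a hypothetical cutoff: the share of contracted
    bandwidth we are willing to accept as "delivered".
    """
    if contracted_mbps <= 0:
        raise ValueError("contracted_mbps must be positive")
    delivered = measured_mbps / contracted_mbps
    return delivered, delivered >= tolerance


# Hypothetical circuit: 100 Mbps contracted, 72 Mbps measured end to end.
fraction, ok = capacity_delivery(100.0, 72.0)
print(f"delivered {fraction:.0%} of contract, within tolerance: {ok}")
```

The comparison only means something if the measured figure is total path capacity rather than the throughput of whatever traffic happened to be flowing at the time.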
Key 2: Troubleshooting Performance Problems
When measuring one or two performance parameters with IP SLA, it’s easy to miss signs of performance impairments between sites. The financial customer mentioned above had deployed PRTG in addition to tools from NetScout and Riverbed. The customer deployed PathView Cloud to monitor WAN performance between a data center and several remote trading offices.
During their initial deployment, the customer reported transaction issues at a site in the same region as the data center. PRTG showed only a bump in round-trip latency between locations, as observed in the IP SLA data from the edge routers.
Here’s a screenshot the customer shared from PRTG showing the RTT bump in green.
The customer had also configured path monitoring between a branch office microAppliance and a target server in the data center.
PathView detected violations in several network performance KPIs in addition to RTT. As you can see in this screenshot, total capacity was reduced, path utilization increased to nearly the total capacity, and PathView measured excessive jitter, elevated latency (one-way and round-trip), and intermittent, low-grade packet loss.
PathView automatically segments the nearest hop where impairments are confirmed, and here it highlighted the WAN carrier’s service. See where the green severity marker at hop 2 changes to a red X at hop 3.
After attaching PathView reports to the WAN provider’s trouble ticket, the customer was advised of an unexpected ‘maintenance’ event that caused its WAN circuit to be shared across additional customer traffic for a short time.
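The incident above is a case where a single metric understated the problem. A minimal Python sketch of the idea follows: one path sample is checked against several KPI limits at once, not RTT alone. The `PathSample` type, the threshold constants, and the sample values are all hypothetical and chosen only for illustration; they do not come from the customer’s data or from PathView.

```python
from dataclasses import dataclass


@dataclass
class PathSample:
    total_capacity_mbps: float
    utilized_mbps: float
    rtt_ms: float
    jitter_ms: float
    loss_pct: float


# Illustrative limits only; real thresholds depend on the application.
MIN_CAPACITY_FRACTION = 0.80   # share of baseline capacity expected
MAX_UTILIZATION = 0.90         # utilized / total capacity
MAX_RTT_MS = 80.0
MAX_JITTER_MS = 10.0
MAX_LOSS_PCT = 0.1


def kpi_violations(sample: PathSample, baseline_capacity_mbps: float) -> list[str]:
    """Return the names of every KPI that exceeds its limit."""
    violations = []
    if sample.total_capacity_mbps < MIN_CAPACITY_FRACTION * baseline_capacity_mbps:
        violations.append("capacity")
    # Treat a path with no measurable capacity as fully utilized.
    if (sample.total_capacity_mbps <= 0
            or sample.utilized_mbps / sample.total_capacity_mbps > MAX_UTILIZATION):
        violations.append("utilization")
    if sample.rtt_ms > MAX_RTT_MS:
        violations.append("rtt")
    if sample.jitter_ms > MAX_JITTER_MS:
        violations.append("jitter")
    if sample.loss_pct > MAX_LOSS_PCT:
        violations.append("loss")
    return violations


# Hypothetical degraded sample on a path with a 100 Mbps baseline.
degraded = PathSample(total_capacity_mbps=40.0, utilized_mbps=38.0,
                      rtt_ms=95.0, jitter_ms=12.0, loss_pct=0.4)
print(kpi_violations(degraded, baseline_capacity_mbps=100.0))
```

A monitor that tracks only `rtt_ms` would report one violation for this sample, while the full check reports five.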
Key 3: QoS Verification
In the above screenshot you can see a column on the right for quality of service verification. DSCP EF markings are indicated by the ‘46’ in the outgoing direction, and you can see QoS markings being altered to other values at hops 2 and 4. IP SLA is not QoS-aware, and many customers use PathView to verify QoS configuration in support of unified communications.
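To make the ‘46’ concrete: DSCP occupies the upper six bits of the IP header’s ToS/DS byte, so EF (46) appears on the wire as 184 (0xB8). The Python sketch below shows that conversion and a simple way to find the hops where a marking changes. The helper names and the per-hop values are hypothetical and invented for illustration; they are not the values from the customer’s screenshot.

```python
EF_DSCP = 46  # Expedited Forwarding


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper bits of the ToS/DS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def tos_to_dscp(tos: int) -> int:
    """Extract the DSCP value from a ToS/DS byte (drops the 2 ECN bits)."""
    return (tos & 0xFF) >> 2


def remarking_hops(sent_dscp: int, observed: list[tuple[int, int]]) -> list[int]:
    """Return hop numbers where the DSCP differs from the previous hop's.

    observed is a list of (hop_number, dscp) pairs in path order.
    """
    changed = []
    previous = sent_dscp
    for hop, dscp in observed:
        if dscp != previous:
            changed.append(hop)
        previous = dscp
    return changed


# Made-up path: EF is sent, reset to 0 at hop 2, then set to 26 at hop 4.
path = [(1, 46), (2, 0), (3, 0), (4, 26), (5, 26)]
print(dscp_to_tos(EF_DSCP))           # 184
print(remarking_hops(EF_DSCP, path))  # [2, 4]
```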
Cisco’s backing of path-based performance monitoring is a plus, but there’s more to performance than uptime, latency, loss and jitter.