Redefining MTTR, MTTI for the Cloud Era
According to the Cloud Adoption and Risk Report 2019 from McAfee, cloud usage grew 15 percent last year, establishing a new high-water mark for enterprise adoption. The average enterprise now employs roughly 1,935 cloud apps, with business applications (e.g., Google Drive, Office 365, or Salesforce) accounting for 70 percent of total cloud services. (This may seem high, but McAfee likely breaks out every component of Office 365 separately.)
That’s a lot of applications vying for network capacity -- not to mention a lot of end users IT needs to keep satisfied. But the sheer volume of apps, users and data that IT needs to parse in order to respond to events is just one factor keeping enterprise IT from shortening its mean-time-to-resolution (MTTR).
Across the enterprise landscape, companies are retiring their data-center-centric network architectures for more flexible deployments, forcing teams to adopt different techniques for getting to the root cause of performance issues than they used in the past. That’s because although IT controlled all of the connections that supported legacy apps in the pre-SaaS and pre-cloud days, they don’t, on their own, have visibility into programs and routes managed by third parties or partners -- e.g., SaaS and cloud vendor networks or ISPs.
With enterprise networks now defined by a mix of cloud, MPLS, and direct internet access (DIA), it’s increasingly difficult for IT, on their own, to even spot where issues occur on the network in the first place, let alone provide a speedy resolution. This is compounded when a problem isn’t actually the fault of the network, but an issue with the SaaS app itself, which enterprise IT no longer controls.
Today, IT has become the one-stop shop for blame when applications, increasingly served through web browsers over vast networks, aren’t performing the way users expect. Thanks to mobile apps and the prevalence of WiFi, users expect instantaneous load times. When that expectation isn’t met, they turn to the people close by with the technical skills to fix what they perceive to be the problem: the network. In this increasingly common scenario, a new acronym comes to define a big piece of the network-diagnosis workflow: MTTI, or mean-time-to-innocence, where IT is assumed guilty until proven innocent.
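To make the two metrics concrete, here is a minimal sketch of how a team might compute them from its own incident log. The record layout, the field names, and the choice to measure both metrics from the moment an issue is reported are assumptions made for illustration, not definitions from the McAfee report or any particular tool.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: when the issue was reported, when the
# network was ruled out ("innocence"), and when the issue was resolved.
incidents = [
    {"reported": datetime(2019, 6, 3, 9, 0),
     "network_cleared": datetime(2019, 6, 3, 9, 20),
     "resolved": datetime(2019, 6, 3, 11, 0)},
    {"reported": datetime(2019, 6, 4, 14, 0),
     "network_cleared": datetime(2019, 6, 4, 14, 40),
     "resolved": datetime(2019, 6, 4, 15, 0)},
]

def mean_minutes(records, start_key, end_key):
    """Average elapsed minutes between two timestamps across records."""
    return mean(
        (r[end_key] - r[start_key]).total_seconds() / 60 for r in records
    )

# Assumed definitions: both clocks start when the issue is reported.
mtti = mean_minutes(incidents, "reported", "network_cleared")
mttr = mean_minutes(incidents, "reported", "resolved")
print(f"MTTI: {mtti:.0f} min, MTTR: {mttr:.0f} min")
# prints "MTTI: 30 min, MTTR: 90 min"
```

In this made-up data the network is cleared long before the issue is resolved, which is the gap between the two metrics that the rest of this piece is concerned with.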
While IT may not always be able to manage or fix SaaS applications that they don’t control, they should, at the very least, be able to give end users an explanation as to what went down, where, and who was really to blame. For frustrated users, “I don’t know” is not an acceptable answer.
To shorten MTTI (and ideally MTTR), teams need to look beyond standard network monitoring or management solutions. They need a tool that can see the entire delivery path, end-to-end, of all apps leveraging the network, regardless of where they originate or where they’re going.
For instance, if an app appears to be down from the perspective of an end user, teams need to be able to take a “problem domain isolation” approach that moves step-by-step, beginning with confirming that the app was delivered to the end user successfully in the first place. An SD-WAN controller may be able to tell IT that the app made it to the LAN gateway at a remote office, but that’s only the beginning of tracking down which party is actually to blame -- the app provider or the network.
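The step-by-step idea can be sketched as an ordered list of checks that stops at the first domain that fails. The domain names, their order, and the stubbed check results below are hypothetical placeholders; a real deployment would plug in its own probes.

```python
from typing import Callable, List, Optional, Tuple

# Each entry is (domain name, check). A check returns True when that
# domain looks healthy. The domains and their order are illustrative.
Check = Tuple[str, Callable[[], bool]]

def isolate_problem_domain(checks: List[Check]) -> Optional[str]:
    """Walk the delivery path in order; return the first failing domain."""
    for domain, is_healthy in checks:
        if not is_healthy():
            return domain
    return None  # every domain passed its check

# Stubbed results standing in for real probes (assumed for this example).
checks = [
    ("end-user device / LAN", lambda: True),
    ("LAN gateway (SD-WAN edge)", lambda: True),
    ("ISP / WAN path", lambda: True),
    ("SaaS provider", lambda: False),
]

culprit = isolate_problem_domain(checks)
print(culprit or "no fault found")
# prints "SaaS provider"
```

Ordering the checks from the end user outward mirrors the workflow described above: each passing step rules out one more domain before blame is assigned.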
A comprehensive monitoring solution should be able to fill in the important hop-by-hop details that show the actual journey the app took from A to B.
If there was a server error, for instance, that caused an application to be rerouted over a sub-optimal network pathway that delays delivery, teams can quickly point to the network as the culprit for latency. But the situation isn’t always so easy to suss out.
The app delivery path may be relatively routine and unchanged from a point in time when app performance was generally meeting end user expectations. But still, users aren’t experiencing their apps as expected, and they’ll want to know why.
In this scenario, if teams are employing a monitoring solution that can look at the entire network path in detail, they have all of the evidence they need to demonstrate that nothing was out of the ordinary at the network level, and that they’ll need to put pressure on the SaaS provider or app owner to resolve the issue. While this situation doesn’t, on its own, result in faster MTTR, IT can still demonstrate that they’re on top of the issue, even if a third party ultimately needs to resolve it.
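One way to picture that evidence is a comparison of the current hop-by-hop path against a baseline captured when performance was acceptable. The sketch below is an assumption-laden illustration: the hop names, latency figures, and the 20 ms tolerance are invented, and real monitoring tools will expose this data in their own formats.

```python
def path_deviations(baseline, current, tolerance_ms=20.0):
    """Return human-readable differences between two hop-by-hop paths."""
    # A different sequence of hops means the route itself changed.
    if [h["host"] for h in baseline] != [h["host"] for h in current]:
        return ["route changed"]
    findings = []
    # Same route: flag any hop that is notably slower than its baseline.
    for base, now in zip(baseline, current):
        delta = now["latency_ms"] - base["latency_ms"]
        if delta > tolerance_ms:
            findings.append(f"{now['host']}: +{delta:.0f} ms vs baseline")
    return findings

# Hypothetical measurements; hostnames are placeholders.
baseline = [
    {"host": "office-gw", "latency_ms": 2.0},
    {"host": "isp-edge", "latency_ms": 12.0},
    {"host": "saas-edge", "latency_ms": 35.0},
]
current = [
    {"host": "office-gw", "latency_ms": 3.0},
    {"host": "isp-edge", "latency_ms": 14.0},
    {"host": "saas-edge", "latency_ms": 36.0},
]

findings = path_deviations(baseline, current)
print(findings or "network path matches baseline")
# prints "network path matches baseline"
```

An empty result does not fix the user’s problem, but it gives IT something specific to hand to the SaaS provider or app owner.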
While IT teams would ideally like to achieve MTTI in lock-step with MTTR, they can buy themselves a bit more credibility by getting past the visibility challenges that otherwise plague network teams that adopt new workflows without a monitoring solution.