APM Deployments Shouldn’t Fail

The topic of why APM deployments fail comes up often. APM Digest published an article in 2012 listing 15 reasons offered by industry experts, and Jonah Kowall over at AppDynamics recently revisited the topic. Something’s in the water, apparently.

From our point of view as an APM vendor, APM deployments don’t fail. APM is just one part of the monitoring space and there are many other tools and different technologies involved in it. We do see three big challenges that companies may encounter when deploying monitoring.

Our customers rarely, if ever, do a deployment and then look at it after six months and say, “What a disaster.” The APM tools that were available in 2012 were big and cumbersome, and you could potentially fail with them. Today, projects stall instead. Outright failure is rare because both the tools and the environments have changed, but those same changes mean there are more places for a project to simply sit, costing money and providing no value. Let’s look at what’s changed:

Lots of Services, Lots of Applications

Environments have become so big that phased rollouts are necessary. You can’t take an environment with 1,500 servers running 15 different applications, inherited from the seven companies you acquired, and decide that everyone is going to start using the same tools next month. We advise our customers to deploy our APM solution on five or 10 servers, get comfortable with the tooling, build expertise with the system and then scale the deployment up, either to more servers in the same service or to different services. At every stage, you have someone who has gone in, quickly obtained value and is now moving forward. If there are any organizational challenges, the entire organization isn’t immediately involved. You also avoid the type of failure where you reach some apex and have to scale back or start over.
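
As a rough illustration of that kind of phased rollout, here is a minimal Python sketch that splits a server list into waves, starting small and growing each time. The function name `plan_waves`, the starting size of five and the doubling factor are assumptions made for the example, not part of any APM product, and in practice you would only start the next wave once the current team is comfortable.

```python
def plan_waves(servers, first_wave=5, growth=2):
    """Split servers into rollout waves that start small and grow.

    first_wave and growth are illustrative defaults, not recommendations.
    """
    if first_wave < 1 or growth < 1:
        raise ValueError("first_wave and growth must be at least 1")
    waves = []
    start = 0
    size = first_wave
    while start < len(servers):
        # Slicing past the end is safe: the last wave takes whatever remains.
        waves.append(servers[start:start + size])
        start += size
        size *= growth
    return waves


# Hypothetical inventory matching the 1,500-server example in the text.
servers = [f"server-{i:04d}" for i in range(1500)]
waves = plan_waves(servers)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {len(wave)} servers")
```

The first wave covers only five servers, so any organizational or technical problem surfaces while very few people are involved.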

How to take advantage: Make sure each team is up to speed before going to another team. 100-person rollouts are failures waiting to happen.

A Better Lifecycle

An environment isn’t big just because there are numerous different applications in it. It’s big because software development has fundamentally changed how phased rollouts happen.

Older APM tools would just be dropped into every environment and were expected to provide monitoring value everywhere at once. Today, when customers use our tools, there are several different approaches to deploying them. You can deploy straight into production because you’re fighting fires, and then move backwards and try to catch those fires earlier in your build. Or you could start by wanting to explore why an application seems systematically slow. In that case, you start investigating and building a culture of performance, then use the tool in development and in QA, and finally end up in production. You don’t start by deploying in every place possible; you start where the tool will be most effective, where you can really make some progress and know it’s going to work. That way, you aren’t taking a huge risk.

How to take advantage: Figure out where APM can help you and your team specifically. Solving everything at once sounds appealing in the same way that waterfall and an annual release are appealing … and is approximately as successful.

An Overwhelming Number of Tools

Now that APM, logging, user monitoring, and many other tools have become more powerful, you no longer have to try to solve problems with a single, across-the-board tool. You should start with a tool that fixes one particular problem, or one particular part of your problem, or maybe one that addresses an issue you were unable to address before. This issue could be something like trying to tie slowness on pages to a specific component, such as a database. Or it could be a pain point such as trying to find the 2% of requests that are draining 40% of your resources. Start with the tool or tools that fix that problem; then, as you build up the number of tools you have, periodically go back and remove redundancies. That way, you always have a set of tools that can solve your problems, while at the same time staying lithe.
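
To make the "2% of requests draining 40% of your resources" idea concrete, here is a minimal Python sketch. It assumes you can export requests as (endpoint, cost) pairs, where cost is any resource measure such as milliseconds of server time. The function name `heaviest_requests` and the sample data are hypothetical and chosen only so the arithmetic is easy to check.

```python
def heaviest_requests(requests, fraction=0.02):
    """Return the most expensive `fraction` of requests and their share of total cost.

    requests is a list of (endpoint, cost) pairs; this format is an assumption.
    """
    ordered = sorted(requests, key=lambda r: r[1], reverse=True)
    # Always look at at least one request, even for very small samples.
    k = max(1, round(len(ordered) * fraction))
    top = ordered[:k]
    total = sum(cost for _, cost in ordered)
    share = sum(cost for _, cost in top) / total if total else 0.0
    return top, share


# Hypothetical sample: 98 cheap requests and 2 expensive ones.
sample = [("/home", 30)] * 98 + [("/report", 980)] * 2
top, share = heaviest_requests(sample)
print(f"{len(top)} of {len(sample)} requests use {share:.0%} of total cost")
```

In this made-up sample, two requests out of 100 account for 1,960 of the 4,900 total milliseconds, which is the kind of imbalance worth chasing first.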

How to take advantage: Remember why you’re monitoring, and choose tools that support those goals. Everybody is a bit different, but you can’t go wrong thinking about your users first.

So, there is no need to think about reasons (or make excuses) for why APM deployments fail. APM deployment failure shouldn’t be a concern. If companies think about their environment, take an incremental approach both to APM rollouts and to monitoring across the software development lifecycle, and use the right set of tools for addressing issues, APM deployment failures shouldn’t happen at all.

TR Jordan: A veteran of MIT’s Lincoln Labs, TR is a reformed physicist and full-stack hacker – for some limited definition of full stack. TR still harbors a not-so-secret love for Matlab-esque graphs and half-baked statistics, as well as elegant and highly-performant code.