Companies have written extensively on the debate between using application performance monitoring (APM) and logging to analyze web applications. Logging is a catch-all tool that’s both convenient and necessary for running web applications in production. Have an app? I bet it makes logs; drop in your favorite solution here!
But if you’re concerned with end-user performance, latency, scalability, and distributed systems, it’s not “APM versus logging”: it’s just APM.
You need APM because you care about the application performance for your end users and saving your own time by fixing issues as quickly as possible. APM tools are built to diagnose performance problems, and the good solutions will lead you directly to issues like serialization delay, queuing, loops, or exceptions that cause slow apps. They provide a specific toolset developed to locate and fix web-based application performance problems.
Below you’ll find some of the top reasons why you should select an APM solution to reduce the overall time it takes to put out daily fires and prioritize your next development sprint.
The ability to correlate latency and errors across multiple devices and multiple layers of an application is a key APM strength. With logging, correlation can be a manual process that requires engineers to recognize in advance which exceptions to catch and log. With request tracing, requests are recorded from the initial user interaction through every layer of the web application until the request completes at page load or data return.
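To make the idea of request tracing concrete, here is a minimal sketch of per-layer timing in Python. The layer names, `time.sleep` stand-ins, and `handle_request` function are all hypothetical illustrations, not TraceView’s API:

```python
import time
import uuid

def trace_span(trace, name):
    """Record the wall-clock duration of one layer under the given span name."""
    class Span:
        def __enter__(self):
            self.start = time.monotonic()
            return self
        def __exit__(self, *exc):
            trace["spans"].append(
                {"name": name, "ms": (time.monotonic() - self.start) * 1000}
            )
    return Span()

def handle_request():
    # One trace follows the request through every layer of the app.
    trace = {"id": uuid.uuid4().hex, "spans": []}
    with trace_span(trace, "auth"):
        time.sleep(0.01)      # stand-in for the authentication service
    with trace_span(trace, "db.query"):
        time.sleep(0.02)      # stand-in for the database layer
    with trace_span(trace, "render"):
        time.sleep(0.005)     # stand-in for template rendering
    return trace

trace = handle_request()
for span in trace["spans"]:
    print(trace["id"], span["name"], round(span["ms"], 1))
```

Because every span shares the request’s trace ID, the layers can be correlated automatically rather than stitched together by hand from separate log files.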
Every app has noise, and hidden in that noise are edge cases that are hard to pinpoint, harder to reproduce, and often ignored because they only affect 5% of customers. Cache timeouts and query loops are subsets of performance issues that behave completely differently from the rest of your app. They’re also extremely easy to miss with logging, because logs only work with the data you put into them.
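A query loop is a good example: each individual query looks fast in the logs, but per-request data makes the repetition obvious. Here is a hedged sketch of spotting one; the sample queries and the `find_query_loops` helper are hypothetical:

```python
from collections import Counter

# Hypothetical per-request query list captured by an APM agent.
queries = [
    "SELECT * FROM users WHERE id = ?",
    "SELECT * FROM orders WHERE user_id = ?",
] + ["SELECT * FROM items WHERE order_id = ?"] * 50   # N+1 query loop

def find_query_loops(queries, threshold=10):
    """Flag query shapes repeated often enough in one request to suggest a loop."""
    counts = Counter(queries)
    return {q: n for q, n in counts.items() if n >= threshold}

print(find_query_loops(queries))
```

Nothing here requires deciding up front what to log: the loop falls out of the per-request view itself.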
With TraceView’s Heatmap, devops teams get a three-dimensional view of slow requests: when they occurred, how long they took, and how often they appeared.
SaaS APM tools are continually providing updated features and technology to better support your team and help you find issues faster. With APM there’s no need to decide what data to dump into logs, deploy changes, and build your own dashboard to view the results.
At AppNeta our passion is performance in production. Let’s look at some examples where APM can better identify and diagnose application performance issues.
Slow Login Page
It’s a common problem: users or operations complain that the web app login is slow. It’s often difficult to figure out which dependencies in the authentication chain are underperforming, especially if authentication has been split into a separate service from the web frontend. TraceView shows you the queries made during a request, along with the frequency of and average time spent in each. Sorting and filtering queries is also easy through the tabular interface.
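The per-query frequency and average-time view described above could be sketched like this; the sample timings and the `summarize` helper are illustrative, not TraceView output:

```python
from collections import defaultdict

# Hypothetical (query, duration_ms) samples collected during login requests.
samples = [
    ("SELECT salt FROM users WHERE name = ?", 2.1),
    ("SELECT * FROM sessions WHERE token = ?", 180.4),
    ("SELECT * FROM sessions WHERE token = ?", 210.0),
    ("SELECT salt FROM users WHERE name = ?", 1.9),
]

def summarize(samples):
    """Frequency and average duration per query shape, slowest first."""
    buckets = defaultdict(list)
    for query, ms in samples:
        buckets[query].append(ms)
    rows = [(q, len(ts), sum(ts) / len(ts)) for q, ts in buckets.items()]
    return sorted(rows, key=lambda r: r[2], reverse=True)

for query, count, avg in summarize(samples):
    print(f"{count:>3}x {avg:8.1f} ms  {query}")
```

Sorting by average time surfaces the session lookup as the slow dependency in the authentication chain, even though the user query runs just as often.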
Slow Page Load
Every app has pages that load and render at different speeds. Content, queries, or customizations can lead to a single page loading dramatically slower than others. Logs make it hard to spot that a particular query is rolling up daily data: that rollup takes very little time at 8am, but by 10pm it can take a significant toll on page performance. In TraceView’s Heatmap this shows up as an easily identified sawtooth wave: the query starts fast and ends slow, resetting every day. Or perhaps logging shows a slow database query, but with TraceView it’s clear that a cache miss is causing a full round trip to the database on every load. Optimizing the query will help, but fixing the cache code will do a lot more.
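The cache-miss scenario above follows the classic cache-aside read: a miss silently pays the full database round trip. A minimal sketch, with all names and the `slow_db_query` stand-in hypothetical:

```python
import time

cache = {}

def slow_db_query(key):
    time.sleep(0.05)              # stand-in for a full database round trip
    return f"rows for {key}"

def get(key):
    """Cache-aside read: a hit is nearly free, a miss pays the full round trip."""
    if key in cache:
        return cache[key], "hit"
    value = slow_db_query(key)
    cache[key] = value
    return value, "miss"

print(get("daily_report"))   # miss: full database round trip
print(get("daily_report"))   # hit: served from cache
```

In logs, both paths look like “a page load with a database query” unless the hit/miss outcome is explicitly logged; a trace shows the extra round trip directly.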
Decreased Cache Performance
Using caches like memcache can be beneficial, but sometimes they can mask deeper issues. Perhaps you’re serializing results into the cache, and because data is constantly being appended, cache performance slows. While a cache restart can fix the error, at peak traffic this issue can dramatically degrade the user experience. With TraceView you could also see that while the cache has a dramatic effect on page latency, the database calls being serialized account for a smaller share of overall latency but show fivefold increases in execution time between Drupal server restarts and cache clears.
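The append-heavy serialization pattern described above can be sketched as follows; the in-process dict is a hypothetical stand-in for memcache, and the growing JSON blob is the illustrative failure mode:

```python
import json

cache = {"events": "[]"}     # serialized JSON blob stored under one cache key

def append_event(event):
    """Deserialize, append, re-serialize: the cost grows with the blob size."""
    events = json.loads(cache["events"])
    events.append(event)
    cache["events"] = json.dumps(events)
    return len(cache["events"])   # bytes re-serialized on this write

sizes = [append_event({"n": i}) for i in range(1000)]
print(sizes[0], sizes[-1])   # every append re-serializes the entire payload
```

Each write re-serializes the whole blob, so cost climbs steadily until a restart or cache clear resets it, matching the sawtooth-style degradation a trace view makes visible.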
Overloaded Services
Logging on a particular service can easily identify when it is overloaded. But APM lets users cross-examine which secondary services are calling into the first (service A). Perhaps an overloaded service (A) is merely a function of a number of other services (e.g., B and C) calling into it. From logging’s perspective one might recommend optimizing service A itself, but with APM you can identify unnecessary calls and reduce the load coming from the secondary services (B and C).
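Attributing load on service A to its upstream callers, as a trace-based view allows, might look like this minimal sketch; the span records and `load_by_caller` helper are invented for illustration:

```python
from collections import Counter

# Hypothetical trace spans: each call into service A records its caller.
spans = [
    {"service": "A", "caller": "B"},
    {"service": "A", "caller": "B"},
    {"service": "A", "caller": "C"},
    {"service": "A", "caller": "B"},
]

def load_by_caller(spans, service="A"):
    """Attribute load on one service to the upstream services calling into it."""
    return Counter(s["caller"] for s in spans if s["service"] == service)

print(load_by_caller(spans))
```

Service A’s own logs would show only the total request count; the caller breakdown is what points at B as the service whose calls should be trimmed.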
There are myriad other examples where the first problem found is not the root cause. Fixing the root cause will always lead to better applications and a better user experience. And that’s what APM is designed for.