Move Toward Proactive Alerts for Network and Application Monitoring
Too many software alerts can frustrate IT teams. Though they’re designed to be useful and prevent bigger problems, alerts from multiple systems can easily overload IT inboxes. We hear from customers that they want to use alerting more proactively, rather than continually reacting to alerts that aren’t all urgent or even necessary.
Alert chatter happens when a measured value keeps moving back and forth between an alert state and a normal state. It often happens with threshold-based alerting that has no time element. Ideally, a value that exceeds the alert threshold should have to stay in that alert state for a set period of time before a notification is sent.
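To make the time element concrete, here is a minimal Python sketch that counts notifications for the same readings with and without a hold period. The function name, the `hold` parameter and the latency values are all hypothetical and chosen for illustration; this is not how AppNeta implements alerting.

```python
def count_notifications(samples, threshold, hold=1):
    """Count alert notifications for a series of evenly spaced samples.

    A notification is sent once the value has stayed above `threshold`
    for `hold` consecutive samples. The alert clears on the first sample
    at or below the threshold.
    """
    notifications = 0
    streak = 0        # consecutive samples above the threshold
    alerting = False  # True once a notification has been sent
    for value in samples:
        if value > threshold:
            streak += 1
            if not alerting and streak >= hold:
                alerting = True
                notifications += 1
        else:
            streak = 0
            alerting = False
    return notifications


# Hypothetical latency readings (ms) that hover around a 100 ms threshold.
latency_ms = [95, 105, 98, 110, 97, 120, 125, 130, 128, 96]

naive = count_notifications(latency_ms, threshold=100, hold=1)      # no time element
sustained = count_notifications(latency_ms, threshold=100, hold=3)  # must persist for 3 samples
print(naive, sustained)
```

With these readings, the version without a time element notifies on every brief spike, while the version with a hold period notifies only for the sustained run of high values.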
We’ve put together some tips here to help you get ahead of alert chatter and proactively find the systemic problems that cause alerts.
How to Make Alerts Work Better for IT
More than one AppNeta customer has told us they need to review alert profiles regularly, continuously adjusting thresholds to match long-term historical levels. One customer needed historical data and alert correlation working together to find the sources of issues more quickly without extra noise.
First-level alerting is the reactive type: it fires when a metric transitions from OK to bad. Second-level alerting is proactive, showing when a value or set of values is trending toward a potential issue.
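One simple way to picture second-level alerting is to fit a straight line to recent samples and estimate how long until it reaches the threshold. The sketch below assumes evenly spaced samples and a linear trend; the function name and the packet loss figures are hypothetical, and real trend detection would likely need something more robust.

```python
def samples_until_breach(values, threshold):
    """Estimate how many more samples until a rising trend reaches `threshold`.

    Fits a least-squares line through evenly spaced samples. Returns 0 if the
    latest value is already at or above the threshold, and None if there are
    too few samples or the trend is flat or falling.
    """
    n = len(values)
    if n < 2:
        return None
    if values[-1] >= threshold:
        return 0

    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    if slope <= 0:
        return None

    intercept = mean_y - slope * mean_x
    crossing = (threshold - intercept) / slope  # sample index where the line hits the threshold
    return max(0.0, crossing - (n - 1))


# Hypothetical packet loss readings (%) creeping up toward a 5% threshold.
loss_pct = [1.0, 1.5, 2.0, 2.5, 3.0]
remaining = samples_until_breach(loss_pct, threshold=5.0)
print(remaining)
```

A proactive alert could then fire when the estimate drops below some lead time, well before the first-level threshold is actually crossed.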
AppNeta Performance Manager lets you set alert profiles for network paths, web paths and flow analysis. Each of these profiles can include multiple conditions for any of the tracked metrics. You can set alerts to trigger immediately or only when the value violates the alert threshold for a period of time (this helps avoid alert chatter). These alert profiles also define the “clear” event, which occurs when a metric that triggered an alert has since left the alert state.
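The relationship between a violation and its clear event can be sketched as a small state machine. The `Condition` class, its field names and the sample values below are hypothetical stand-ins written for this illustration. They are not AppNeta’s data model or API.

```python
from dataclasses import dataclass


@dataclass
class Condition:
    """One hypothetical alert condition on a single metric."""
    metric: str
    violate_above: float      # enter the alert state above this value
    clear_below: float        # leave the alert state below this value
    violate_samples: int = 1  # consecutive samples required to violate
    clear_samples: int = 1    # consecutive samples required to clear


def events_for(condition, samples):
    """Return (sample_index, event) pairs, where event is 'violate' or 'clear'."""
    events = []
    alerting = False
    streak = 0  # consecutive samples supporting a state change
    for i, value in enumerate(samples):
        if not alerting:
            streak = streak + 1 if value > condition.violate_above else 0
            if streak >= condition.violate_samples:
                alerting, streak = True, 0
                events.append((i, "violate"))
        else:
            streak = streak + 1 if value < condition.clear_below else 0
            if streak >= condition.clear_samples:
                alerting, streak = False, 0
                events.append((i, "clear"))
    return events


# A profile here is just a list of conditions; this one holds a single condition.
profile = [
    Condition("data_loss_pct", violate_above=3.0, clear_below=1.0,
              violate_samples=2, clear_samples=2),
]
readings = {"data_loss_pct": [0.5, 4.0, 4.5, 2.0, 0.8, 0.5, 0.4]}

events = {c.metric: events_for(c, readings[c.metric]) for c in profile}
print(events)
```

Using a lower value for clearing than for violating, plus a sample count on each side, is one way to keep a metric that sits near the threshold from generating a stream of violate and clear notifications.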
One way we recommend cutting down on unnecessary alerts is to measure retransmit rate by tracking application usage. A rising retransmit rate is often an early indicator of bad performance for a user or app. Instead of setting up blanket alerts for all applications, try this on business-critical apps first. Tolerance for retransmits varies widely, so the right threshold for a source like an ad placement will be very different from the one for a productivity app like Asana. If an ad requires retransmits, the user probably won’t notice, but if a productivity app doesn’t load part of the site, a user will certainly notice and be affected. (We’ve got a list of tips here on setting good alert thresholds.)
Here's what an AppNeta alert notification looks like.
Tips for Setting Up and Managing AppNeta Monitoring Alerts
If you’d like to get more proactive with your AppNeta alerting, start by setting up a schedule to revisit your alerting policies. Do this on a cadence that makes sense for your business, whether that’s once a quarter or twice a year. The policies you originally set when you deployed AppNeta might not be useful later on. For example, if there’s a certain alert you and your IT team always ignore, then remove it or consider another way to use it. If a particular alert has never fired, then re-evaluate whether there was a scenario where it should have.
Part of your alert setup is crafting an effective threshold for alerting. When you first deploy AppNeta, wait to set alerts. Give it some time, whether a few days, a week or a month, to identify normal traffic patterns, so you can then tailor your alerts to your particular infrastructure. In the first few weeks of a deployment, many IT users see the highest level of activity in a system because of onboarding tasks. Once you have a baseline of network activity, you can set thresholds that reflect real problems rather than normal variation.
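As a rough illustration of turning a baseline into a threshold, the sketch below skips an initial warm-up period and sets the threshold a few standard deviations above the mean of the remaining history. The function name, the `warmup` and `k` parameters and the sample values are assumptions made for this example; mean plus k standard deviations is only one of several reasonable rules, and a percentile of the history would work too.

```python
from statistics import mean, pstdev


def baseline_threshold(history, warmup=0, k=3.0):
    """Suggest an alert threshold from historical samples.

    Drops the first `warmup` samples (for example, the onboarding period),
    then returns mean + k * standard deviation of what remains.
    """
    baseline = history[warmup:]
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    return mean(baseline) + k * pstdev(baseline)


# Hypothetical daily average latency (ms); the first two days reflect onboarding activity.
latency_ms = [40, 38, 20, 22, 21, 19, 23, 20, 22, 21]

with_onboarding = baseline_threshold(latency_ms)
after_onboarding = baseline_threshold(latency_ms, warmup=2)
print(round(with_onboarding, 1), round(after_onboarding, 1))
```

Including the onboarding days inflates both the mean and the spread, so the suggested threshold ends up well above normal operating levels. That is one reason to wait for a stable baseline before setting alerts.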
When our customers have alerts in place for any of the three parts of AppNeta Performance Manager (network paths, web paths and flow analysis), it’s like our monitoring solution has been turbocharged. This has a lot to do with one of the most dramatic results customers get from using AppNeta: a hugely improved mean time to resolution (MTTR).