Finding and Fixing Ghost Issues
Modern enterprise IT is complex and distributed. IT network teams today are managing a range of technologies, both old and new, for users who are ever more likely to work remotely or at a branch office. One recent survey found that 83% of businesses are increasing their number of WAN-connected sites. That means more money being spent on WAN connectivity and optimization and more pressure on IT to give end users a good experience at scale and at a distance.
Providing these end users with the tools and performance they need is harder when most remote or branch offices don’t have IT teams on site. No longer can the network engineer walk down the hall to troubleshoot a user’s desktop computer. Now, that same engineer could get a call, chat message or email asking for help from a user who’s thousands of miles away. And user problems can stem from a lot of different causes, whether it’s the cloud provider’s network, the SaaS application or the local network or WAN.
There’s also the challenge that apps are increasingly dependent on each other for data, which can cause hard-to-find problems. Additionally, the classic “Is it the app or is it the network?” question comes up right away when there’s an issue, adding more people and time to the equation.
How You Can Tell You Have a Ghost Issue
User “ghost” issues are the ones that are hard to troubleshoot because you can’t replicate them. A typical ghost issue might be an important SaaS app (something like Office 365) slowing to a halt for one user at a remote office at random times. You’ll get repeated helpdesk tickets, but when you call the user, the application is working fine and you can’t get more details.
By nature, ghost issues come and go, which makes them time-consuming and frustrating for IT teams managing remote offices or locations. While a ghost issue stays unresolved, your helpdesk queue will be hard to empty. There’s also the possibility that it’s a symptom of a major problem that you just can’t see yet, and that it will get worse and affect many more users. Even experienced network or systems admins can get bogged down for weeks or months looking for the needle in the haystack.
One ghost issue a customer told us about involved a cloud application with recurring performance problems. In that case, a member of the IT networking team investigated (using the delivery and experience features of AppNeta Performance Manager, naturally) and found that the app was hosted on too small an instance, in a provider data center that was too far away. Another customer traced a ghost issue to their network service provider, which had made an automation error that threw off QoS enforcement for cloud app traffic.
Other possible causes of ghost issues include:
- Integration issues between new cloud services and existing third-party software
- Lack of network visibility into WAN connections
- Service delays resulting from limited staff resources
- Shadow IT applications or cloud use
- Recreational applications (e.g., social media, streaming video) sucking up bandwidth
Troubleshooting Ghost Issues, the Old Way
These mystery issues can really frustrate IT. Many enterprise IT teams today are stuck between frustrated users and the cloud provider or SaaS application support center. Instead of taking the burden off IT, cloud and SaaS use has added another layer of complexity and obscured visibility into what the end user is experiencing. When a user calls IT for help with a SaaS application issue, IT is often blind to much of the application delivery path. But the slowdown the user sees is very real and is hurting productivity, and IT is still responsible for solving that problem.
Here’s a typical troubleshooting checklist when you get a call from a user at a remote office:
1. Try to access the application yourself. If it looks OK, check the provider’s status page. It’ll show you the status of that app’s infrastructure, which won’t reflect your user’s experience if their office is accessing a different infrastructure or cloud. It may have a log of previous status, but it will likely be a point-in-time snapshot.
2. Check your own infrastructure. Log in to the firewall to make sure ISP service is up. If it is, check the status of your ISP on their status page.
3. Do some tests. Use synthetic testing like webpagetest.org, which produces synthetic traffic data from computers around the world, to see if it shows the issue your user reported. Next, open a remote desktop or desktop-sharing tool to try to reproduce the issue on a computer at the remote office. Use a bandwidth tool like Ookla Speedtest to see if anything looks problematic there.
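The first step of that checklist is easy to script so it can be repeated the same way each time a ticket comes in. The following is a minimal sketch in Python, not a description of any particular product: it times a single HTTP request to an application URL and records whether it succeeded. The names `ProbeResult` and `probe` are made up for this illustration, and the `fetch` and `clock` parameters exist only so the function can be exercised without a real network.

```python
import time
import urllib.request
from dataclasses import dataclass


@dataclass
class ProbeResult:
    url: str
    ok: bool          # True when the request returned a 2xx or 3xx status
    seconds: float    # wall-clock time the request took
    detail: str       # HTTP status or the error that was raised


def _fetch_status(url: str, timeout: float) -> int:
    """Default fetcher: perform a GET and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def probe(url, timeout=10.0, fetch=_fetch_status, clock=time.monotonic):
    """Time one request to `url` and report success or failure.

    `fetch` and `clock` are injectable (an assumption of this sketch) so the
    logic can be tested offline.
    """
    start = clock()
    try:
        status = fetch(url, timeout)
        ok = 200 <= status < 400
        detail = f"HTTP {status}"
    except (OSError, ValueError) as exc:
        # urllib's URLError and HTTPError, plus socket timeouts, are all
        # OSError subclasses; ValueError covers a malformed URL.
        ok = False
        detail = f"{type(exc).__name__}: {exc}"
    return ProbeResult(url=url, ok=ok, seconds=clock() - start, detail=detail)
```

Calling `probe("https://example.com/")` from your own desk only tells you what the application looks like from where you sit, which is exactly the limitation of the checklist: the same probe would need to run from the remote office to say anything about that user’s experience.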
Are you still coming up empty after all that? It may appear that there’s nothing wrong, but these tools and tests don’t show you the complete picture. You’ll keep getting calls and helpdesk tickets from users experiencing issues if the root cause isn’t fixed.
Troubleshooting Ghost Issues Now
The ghost issue resolution process can be rocky, largely because you have to depend on users’ feedback to figure out how to approach the issue. Make sure that users aren’t just reacting to changes in infrastructure that affect their daily application use. Moving from an on-premises app to a cloud-based one, like Office 365, is a change for employees. Try to get ahead of these transitions with training and communication so you don’t end up getting helpdesk tickets unnecessarily.
When you do start investigating a real issue, focus first on the infrastructure components under your control to see if the problem is on your end. Once you have ruled those out, look toward the providers responsible for the hops along the network path between your user and the app or service that’s causing them trouble. This may take a good deal of persistence. We’ve heard from plenty of our users who find an ISP issue before the provider knows about it, and then have to convince the provider of the problem with AppNeta data.
Remote access to a server or other machine at the affected site lets you dig into the environment that is actually experiencing problems. From there, troubleshooting is a process of elimination. Break the problem down into variables and test one piece at a time until you can identify a root cause.
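One way to make "break the problem down into variables" concrete is to log each test session with the variables you changed and whether the slowdown appeared, then compare the slow rate for each value of each variable. The sketch below assumes a hypothetical record format (plain dictionaries with a boolean `slow` field); the variable names `dns` and `link` are only examples.

```python
from collections import defaultdict


def slow_rate_by_value(observations, variable):
    """Return {value: fraction of observations marked slow} for one variable.

    Each observation is a dict with the tested variables plus a boolean
    "slow" field (a format assumed for this sketch).
    """
    counts = defaultdict(lambda: [0, 0])  # value -> [slow count, total count]
    for obs in observations:
        tally = counts[obs[variable]]
        if obs["slow"]:
            tally[0] += 1
        tally[1] += 1
    return {value: slow / total for value, (slow, total) in counts.items()}


# Hypothetical test sessions from a remote office.
sessions = [
    {"dns": "local", "link": "wifi", "slow": True},
    {"dns": "local", "link": "wired", "slow": True},
    {"dns": "public", "link": "wifi", "slow": False},
    {"dns": "public", "link": "wired", "slow": False},
]

for name in ("dns", "link"):
    print(name, slow_rate_by_value(sessions, name))
```

In this made-up data the slow rate splits cleanly on `dns` and not at all on `link`, which would point the investigation at name resolution rather than the access medium. Real data is rarely this clean, and a handful of sessions is a hint, not proof.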
Continuous monitoring gives you historical data so there’s no need to pore over millions of event log entries to find something out of place. You can capture route history in real time and do remote packet captures to prevent or fix ghost issues for users at remote offices.
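To illustrate why history matters, here is a minimal sketch of what you can do once latency samples are being collected continuously. It is not how AppNeta or any other product works internally; it simply takes timestamped latency samples, uses their median as a baseline, and groups unusually slow samples into episodes. The function name, the `factor` multiplier and the `max_gap` window are all assumptions chosen for the example.

```python
from statistics import median


def find_episodes(samples, factor=3.0, max_gap=120.0):
    """Group slow samples into episodes.

    samples: iterable of (timestamp_seconds, latency_ms) pairs.
    A sample is "slow" when its latency exceeds factor * median latency.
    Slow samples no more than max_gap seconds apart join the same episode.
    Returns a list of (start_ts, end_ts, peak_latency_ms) tuples.
    """
    ordered = sorted(samples)
    if not ordered:
        return []
    threshold = factor * median(lat for _, lat in ordered)
    episodes = []
    for ts, lat in ordered:
        if lat <= threshold:
            continue
        if episodes and ts - episodes[-1][1] <= max_gap:
            # Extend the current episode and keep the worst latency seen.
            start, _, peak = episodes[-1]
            episodes[-1] = (start, ts, max(peak, lat))
        else:
            episodes.append((ts, ts, lat))
    return episodes


# Hypothetical one-minute samples; three of them are far above the baseline.
history = [
    (0, 20), (60, 22), (120, 21), (180, 300), (240, 280),
    (300, 19), (360, 20), (420, 21), (480, 250), (540, 20),
]
print(find_episodes(history))
```

With episode start and end times in hand, the intermittent slowdown a user reported can be lined up against route changes, packet captures or helpdesk ticket timestamps instead of waiting for it to happen again while someone is watching.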
Continuous monitoring with solutions like AppNeta fills in the gaps for monitoring cloud and SaaS applications, so you get to see the entire network, end to end. We designed our tools for these modern environments, where visibility often disappears for IT teams. Ghost issues take up a lot less time when you can see every part of the application delivery path to find and fix problems fast.