
Hubot, AppNeta. AppNeta, Hubot.

AppNeta no longer blogs on DevOps topics like this one.

Feel free to enjoy it, and check out what we can do for monitoring end user experience of the apps you use to drive your business at www.appneta.com.

Time was that everyone used Subversion, and connectivity problems were the #2 programmer excuse for legitimately slacking off. One of the strengths of Git is the ability to switch branches and continue working even if you can’t fetch or push, eliminating the “how do I svn commit on a plane” problem. But what if I need to do a code review, or push an emergency fix to my Git-backed deployment workflow? Diffs, compression, and other optimizations make it easy to forget that remote Git commands are still heavily network-bound.

Hubot greeting our new intern.

Like most engineering teams, we fall back to an instinctive reflex whenever we suspect a network outage: is it down for everyone or just me? Luckily, being on the TraceView team means having an answer at your fingertips:

PathView tracks changing network conditions on a minute-by-minute basis.

PathView is AppNeta’s cloud-based network performance management solution. We use it to monitor network paths critical to daily development tasks, like connections to our servers in EC2 or from our primary Boston office to our coworkers in Providence or Vancouver. We’re able to track the total, utilized, and available capacity of every path we monitor, as well as the latency, round-trip time, and both data and voice jitter and loss of the packets traveling over it. Because PathView collects this data without any overhead, we can see how conditions have changed without changing them for the worse.

This time around, the culprit was severe packet loss affecting our Providence office’s outgoing connections. The frequency of dropped packets meant that remote Git commands on large repositories became unusable even while other, lighter-weight applications (like HipChat!) kept working without a hitch. Dan was able to log into PathView, trace the symptom back to its root cause, take a screenshot, and upload it to HipChat in just a few minutes. Blazing fast!

But PathView is a live monitoring tool with up-to-the-minute data, which means that by the time our coworkers in Providence saw the image in HipChat, it was already out of date. I decided to improve the status quo in the most logical way: by making a machine do it for us.

Thanks to Hubot and the PathView Cloud API, the TraceView team can monitor network visibility without even switching to a web browser.

The TraceView team has slowly been ceding control to Hubot, the malevolent AI that now runs GitHub. Having watched plenty of Battlestar Galactica, we initially only trusted it for tracking Bitcoin exchange rates and pug bombings. But Hubot’s seductively simple syntax, inherently social nature, and adorable animal gifs won us over. Soon, its reign of terror extended to telling us if GitHub is experiencing site-wide outages.

Hubot is ready to take the next step on its path to world domination with pathview.coffee, a script that allows it to mine the PathView Cloud API for data about network conditions. The script lives in our fork of the hubot-scripts repository, but as of this blog post, here’s the source code:


# Description:
#   Display network path performance from PathView Cloud.
#
# Dependencies:
#
# Configuration:
#   HUBOT_PVC_SUBDOMAIN
#   HUBOT_PVC_ACCOUNT
#   HUBOT_PVC_PASSWORD
#
# Commands:
#   hubot (pathview|pvc) me <start> (to)? <finish> - Return the performance of matching PathView Cloud paths
#
# Author:
#   Eronarn

module.exports = (robot) ->
  robot.respond /(pathview|pvc) me (\w*)\s*(to)?\s*(\w*)/i, (msg) ->

    # Get start and finish parameters.
    start = msg.match[2]
    finish = msg.match[4]

    # Get PVC credentials.
    subdomain = process.env.HUBOT_PVC_SUBDOMAIN
    account = process.env.HUBOT_PVC_ACCOUNT
    password = process.env.HUBOT_PVC_PASSWORD

    # Get the base64-encoded auth string.
    auth = Buffer.from(account+":"+password).toString('base64')

    # Make a call to the PVC API.
    robot.http("https://"+subdomain+".pathviewcloud.com/pvc-ws/v1/paths?name=*"+start+"*"+finish+"*")
      .header('Authorization', 'Basic '+auth)
      .get() (err, res, body) ->
        if err
          msg.send "didn't work, bro :( #{err}"
          return

        # Get an array of paths.
        paths = JSON.parse body

        # Sort paths by name.
        paths.sort (a,b) ->
          if a.pathName.toUpperCase() >= b.pathName.toUpperCase() then 1 else -1

        # Format paths.
        print = (path) ->
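          # Date() called without `new` ignores its arguments and returns the
          # current time as a string, so construct the Date explicitly. This
          # keeps the original arithmetic, which treats serviceQualityTime as
          # a duration in milliseconds.
          since = new Date(Date.now() - path.serviceQualityTime)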
          msg.send("#{path.pathName} @ #{path.target}: #{path.serviceQuality} since #{since}")

        print path for path in paths
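
To make the command parsing concrete, here is a minimal sketch that reruns the same regular expression and string concatenation outside of Hubot, written in plain JavaScript for Node.js (the runtime Hubot itself uses). The helper name `buildPathQuery` and the sample subdomain and credentials are illustrative assumptions, not part of the script above.

```javascript
// Same pattern the script passes to robot.respond.
const COMMAND = /(pathview|pvc) me (\w*)\s*(to)?\s*(\w*)/i;

// Hypothetical helper: turns a chat message into the URL and Authorization
// header the script would send. Returns null if the message doesn't match.
function buildPathQuery(text, subdomain, account, password) {
  const match = COMMAND.exec(text);
  if (!match) return null;

  const start = match[2];   // first word after "me"
  const finish = match[4];  // optional second word, with or without "to"

  return {
    url: `https://${subdomain}.pathviewcloud.com/pvc-ws/v1/paths?name=*${start}*${finish}*`,
    authorization: 'Basic ' + Buffer.from(`${account}:${password}`).toString('base64'),
  };
}

// Placeholder values for illustration only.
const query = buildPathQuery('pvc me boston to providence', 'example', 'user', 'pass');
console.log(query.url);
console.log(query.authorization);
```

Leaving out the second word (`pvc me boston`) produces `name=*boston**`, because `finish` is then an empty string.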


Getting a local development environment ready required installing Node.js for the first time, getting `npm` working, spinning up a local Hubot, installing its Redis-based brain, reading about how to use the PVC API, and learning enough CoffeeScript to retrieve path data via GET requests. That’s a lot of tasks, but the API was simple enough to use that I still managed to have a release working the same day as the Git outage described above.

The PVC API exposes a lot of data, including an implementation of the Observer pattern that allows users to register a callback URL to be notified instantly about important events like latency spikes and route changes. Clearly the next step for `pathview.coffee` is to write an extension to Hubot’s HTTPD script to allow it to announce service interruptions as they happen! Since Hubot has persistent storage in Redis, you could even have certain paths (like GitHub) trigger an @all in HipChat while others (like dev boxes) remain opt-in.
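
A rough sketch of what such a callback receiver might look like, in plain JavaScript for Node.js. It relies on Hubot's built-in Express router (`robot.router`), `robot.messageRoom`, and `robot.brain`. Everything about the notification itself is an assumption, since the PVC callback format isn't covered here: the `/hubot/pathview` route, the `pathName` and `message` fields, the `pvc-loud-paths` brain key, and the `engineering` room name are all hypothetical.

```javascript
// Hypothetical: word an alert for one notification.
// `loudPaths` lists the path names that should ping the whole room.
function formatAlert(event, loudPaths) {
  const prefix = loudPaths.includes(event.pathName) ? '@all ' : '';
  return `${prefix}${event.pathName}: ${event.message}`;
}

// Hubot script entry point: registers an HTTP endpoint for callbacks.
function pathviewWebhook(robot) {
  robot.router.post('/hubot/pathview', (req, res) => {
    // Assumes a JSON body with pathName and message fields
    // (an invented shape, not the real PVC schema).
    const event = req.body;
    // Read the list of "loud" paths from Hubot's brain, defaulting to none.
    const loudPaths = robot.brain.get('pvc-loud-paths') || [];
    robot.messageRoom('engineering', formatAlert(event, loudPaths));
    res.send('OK');
  });
}

module.exports = pathviewWebhook;
```

In this sketch, paths that aren't on the list are still announced, just without the @all mention. Per-user opt-in would need extra bookkeeping in the brain and is left out.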

While I started out working with the PathView Cloud API, I didn’t want to neglect the TraceView side of the family. A few days later I wrote traceview.coffee to pull in basic latency and traffic volume information from the TraceView data API (previously blogged about by TR). Again, writing a basic Hubot script was no problem at all:

Don’t tell me what you “can” and “can’t” access.

Hubot is hard at work figuring out how to replace me with a convincing simulation written in CoffeeScript, but until then I plan to keep working on `pathview.coffee` and `traceview.coffee`. There’s a lot more I could pull out of TraceView’s API than just hourly app latency summaries. For example, did you know that real user latency data is also available in the API? The data presentation could also be improved, and as a roguelike developer, I particularly like the idea of using Clark to draw ASCII sparklines to represent the more detailed time-series data available in both APIs.
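
To give a feel for the sparkline idea, here is a small hand-rolled sketch in plain JavaScript for Node.js. It does not use Clark or its API. The `sparkline` function is a hypothetical stand-in that scales a series onto eight Unicode block characters, and the latency numbers are made up for illustration.

```javascript
// Eight block characters, from lowest to highest.
const TICKS = ['▁', '▂', '▃', '▄', '▅', '▆', '▇', '█'];

// Hypothetical helper: render an array of numbers as a one-line sparkline.
function sparkline(values) {
  if (values.length === 0) return '';

  const min = Math.min(...values);
  const max = Math.max(...values);
  const range = max - min;

  return values
    .map((v) => {
      // A flat series has no range to scale by, so draw the lowest tick.
      if (range === 0) return TICKS[0];
      // Scale v into an index between 0 and TICKS.length - 1.
      const index = Math.round(((v - min) / range) * (TICKS.length - 1));
      return TICKS[index];
    })
    .join('');
}

// Made-up latency samples in milliseconds.
console.log(sparkline([120, 135, 128, 410, 390, 140, 125]));
```

Because the result is a single short string, it could be sent straight to chat with `msg.send`.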

James Meickle: James started as a hobbyist web developer, even though his academic background is in social psychology and political science. His favorite language is Python, his favorite editor is Sublime, and his favorite game is Dwarf Fortress.