Granular Insight with Akka Actor Instrumentation
February 11, 2016

Akka actors help you build scalable, high-throughput, concurrent applications. Since its introduction in 2009 as a Scala implementation of the Actor Model, Akka has been the driving force behind some of the largest e-commerce applications, including those of Walmart and the Gilt Groupe.

For demanding environments, Akka makes an ideal choice thanks to several fundamental features:

  • Isolated processing with no shared memory or locks. Actors process units or streams of work concurrently by design, and this loose coupling allows better use of computing power and resources.
  • No assumptions about an actor’s location or about the configuration and composition of the actor system, only a simple model of messages and addresses. This allows easy scaling and clustering through simple configuration changes.
  • “Let it crash” fault handling. Fault conditions are handled by the supervising actor, which chooses one of four strategies: stop, resume, restart, or escalate. This separation of concerns lets the processing actor focus on the normal cases only, while simplifying and standardizing actor life-cycle handling for fault conditions.
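
The “let it crash” split described above can be sketched with Akka’s classic actor API. This is a minimal illustration, not code from any production system: it assumes the akka-actor library is on the classpath, and the message types (Work, GetTotal), the actor classes, and the choice of Resume for bad input are all hypothetical.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.{Escalate, Resume}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical message protocol for this sketch.
final case class Work(amount: Int)
case object GetTotal

// The worker handles only the normal case; bad input simply throws.
class Worker extends Actor {
  private var total = 0
  def receive: Receive = {
    case Work(n) if n < 0 => throw new IllegalArgumentException(s"negative amount: $n")
    case Work(n)          => total += n
    case GetTotal         => sender() ! total
  }
}

// The supervisor owns the fault policy and forwards everything else to its child.
class Supervisor extends Actor {
  override val supervisorStrategy: SupervisorStrategy =
    OneForOneStrategy() {
      case _: IllegalArgumentException => Resume   // drop the bad message, keep worker state
      case _: Exception                => Escalate // anything else goes up the hierarchy
    }

  private val worker: ActorRef = context.actorOf(Props[Worker](), "worker")

  def receive: Receive = { case msg => worker.forward(msg) }
}

object SupervisionSketch {
  // Sends two valid messages and one invalid one, then asks the worker for its total.
  def run(): Int = {
    val system = ActorSystem("supervision-sketch")
    implicit val timeout: Timeout = Timeout(3.seconds)
    try {
      val supervisor = system.actorOf(Props[Supervisor](), "supervisor")
      supervisor ! Work(1)
      supervisor ! Work(-1) // fails inside the worker; the supervisor resumes it
      supervisor ! Work(2)
      Await.result((supervisor ? GetTotal).mapTo[Int], timeout.duration)
    } finally {
      system.terminate()
    }
  }
}
```

Because the supervisor resumes the worker, the worker keeps its running total and only the failing message is lost. Swapping Resume for Restart would instead recreate the worker with fresh state.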

A sample Play-Akka worker stack

The proliferation of e-commerce and cloud-based services continually raises the bar for backend services. Systems are expected to maintain high availability and quick response times even under heavy traffic and stress. A common thread is that many of these critical systems are built on the Akka framework.

In recent years, several tools, such as Kamon, have emerged to provide monitoring for the Akka framework. They offer good metrics coverage, including statistics on components such as message queue time in the mailbox and actor message processing time. As a step forward, Kamon’s Tracing API gives developers a way to record metrics for a particular logic flow.

However, some concerns remain inherently difficult to address even with the aid of those tools, due to the nature of the Actor Model:

  • Exact Execution Flow: Problems are hard to debug because code executes asynchronously, and the loose coupling between actors compounds the difficulty: messages between actors carry no explicit call stack, so the exact execution flow is hard to identify. Stack traces provide little information (they always show the same “actor received a message” stack), and metrics usually do not help in this scenario.
  • End-to-end Performance: It is hard to understand the per-request performance that end users experience from start to finish. An actor system runs continuously and has no defined boundaries (other than actor lifecycles, which are mostly irrelevant to end-user performance), yet most requests do have boundaries as a series of interactions. Aggregate metrics from frameworks such as Play and Spray can give clues, but not enough detail to investigate per-request problems.

Fortunately, with TraceView’s latest Akka support, users can now configure a list of Akka actors they wish to monitor, which helps them understand and debug Akka performance problems from a fresh perspective.

The new instrumentation not only gives a general idea of the average time spent on actor processing, but also provides finer details on each message processed in the flow. Users can easily trace the flow of a message by inspecting the “sender actor” and the “receiving actor”, along with the time the message spent waiting in the mailbox. This information is available for every message involved in the flow.
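
To make the per-message view concrete, here is a small plain-Scala model of what such a record could contain and how a flow can be read back from it. It is only a sketch: the MessageEvent fields, the timestamps, and the “spray-endpoint” sender name are hypothetical and do not reflect TraceView’s actual data schema.

```scala
// Hypothetical record of one traced message (illustrative fields only).
final case class MessageEvent(
  sender: String,
  receiver: String,
  enqueuedAtMs: Long, // message placed in the receiver's mailbox
  dequeuedAtMs: Long, // receiver starts processing the message
  finishedAtMs: Long  // receiver finishes processing the message
) {
  def mailboxWaitMs: Long = dequeuedAtMs - enqueuedAtMs
  def processingMs: Long  = finishedAtMs - dequeuedAtMs
}

object MessageFlow {
  // Order events by enqueue time and render each hop as "sender -> receiver".
  def hops(events: Seq[MessageEvent]): Seq[String] =
    events.sortBy(_.enqueuedAtMs).map(e => s"${e.sender} -> ${e.receiver}")

  // The message that waited longest in a mailbox, if there are any events.
  def slowestMailbox(events: Seq[MessageEvent]): Option[MessageEvent] =
    if (events.isEmpty) None else Some(events.maxBy(_.mailboxWaitMs))

  // Made-up sample data, deliberately out of order.
  val sample: Seq[MessageEvent] = Seq(
    MessageEvent("akka-controller", "my-akka-router/$a", 10L, 12L, 40L),
    MessageEvent("spray-endpoint", "akka-controller", 0L, 1L, 9L),
    MessageEvent("akka-controller", "my-akka-router/$b", 11L, 18L, 45L)
  )
}
```

Sorting by enqueue time recovers the order of hops even though actors share no call stack, and comparing mailboxWaitMs with processingMs separates time spent queued from time spent working.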

Akka Actor Raw Extent Data

On top of that, the per-request Trace Details page visualizes all the Akka messages associated with a particular request. This provides a full picture of all the moving parts within the Akka system, including database operations and cache access. For example, the screenshot below shows Akka actor handling and Slick database access triggered by a request sent to a Spray HTTP service endpoint.

Trace Details with Akka

The example above shows the Trace Details page for a web request to a Spray endpoint that triggers work handled by Akka actors. The green and blue bars at the top indicate time spent in the Spray framework, while the striped orange bars represent Akka actor work. The first orange bar on the left is “akka-controller” (the legend is in the left panel), the main controller actor, which delegates work to worker actors via “my-akka-router”. Two worker actors, “my-akka-router/$a” and “my-akka-router/$b”, handled 3 units of work (the 3 orange bars on the right), each triggering asynchronous JDBC operations via Slick, illustrated by the brown bars below each worker’s orange bar. Note that two of the orange bars overlap vertically, which demonstrates the asynchronous nature of the Akka actor model.
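
A router topology like the one in this trace can be approximated with a round-robin pool in Akka’s classic API. The sketch below is an assumption about how such a setup might look, not the traced application’s code: it requires the akka-actor library, the Job and Done messages and the JobWorker class are hypothetical, and only the router name “my-akka-router” is taken from the example above.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.routing.RoundRobinPool
import akka.util.Timeout
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical messages for this sketch.
final case class Job(id: Int)
final case class Done(id: Int, worker: String)

// Each routee replies with the job id and its own actor name.
class JobWorker extends Actor {
  def receive: Receive = {
    case Job(id) => sender() ! Done(id, self.path.name)
  }
}

object RouterSketch {
  // Sends three jobs through a pool of two workers and collects the replies.
  def run(): Seq[Done] = {
    val system = ActorSystem("router-sketch")
    implicit val timeout: Timeout = Timeout(3.seconds)
    import system.dispatcher // execution context for Future.sequence
    try {
      val router = system.actorOf(
        RoundRobinPool(2).props(Props[JobWorker]()),
        "my-akka-router"
      )
      val replies = (1 to 3).map(id => (router ? Job(id)).mapTo[Done])
      Await.result(Future.sequence(replies), timeout.duration)
    } finally {
      system.terminate()
    }
  }
}
```

With two routees and three jobs, one worker handles two units of work and the other handles one, and the jobs can run at the same time. That concurrency is what the overlapping bars in the trace visualize.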

This deep insight into parent web requests and the resulting Akka actor interactions is provided out of the box with AppNeta. Just install the AppNeta Scala instrumentation and add the actors you would like traced to the config, and that’s it! No additional configuration or code changes are required, and you get end-to-end visibility into your distributed Scala applications.

We love hearing from you, so please feel free to get in touch at feedback@appneta.com with your comments or feedback.