Back in November, we gave a very high-level overview of how we do Automation Testing. Today, we are going to show how we prepare a specific part of the infrastructure for Automation Testing: the virtual appliance.
Most of our modules (AppView, FlowView, and PathView) require physical or virtual appliances (we’ll refer to both as “appliance” from this point onward) to collect network and application performance measurements. These appliances connect to a server that analyzes and persists the data. An appliance can only connect to one server. Given this limitation, each engineer is assigned a small number of appliances to connect to their development server, and each appliance has a unique hostname since they all exist on the same network.
This creates friction when writing shared Selenium automation tests to run on our Continuous Integration server, because the tests refer to these appliances by hostname.
For example: I was given two m20 and two m22 appliances, with sheeva75, sheeva51, dream29, and dream30 as the identifying hostnames. In the Selenium automation tests, I refer to them as ‘sheeva75’ and ‘sheeva51’. These hostnames won’t exist on any other system.
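To make the friction concrete, here is a minimal sketch of what hostname-bound test code looks like. The helper name and selector shape are our own illustration, not the real test code:

```python
# Appliance hostnames from one engineer's lab. "sheeva75" exists only
# on that engineer's network, so tests built on these names cannot run
# unchanged on the CI server or on another workstation.
MY_APPLIANCES = ["sheeva75", "sheeva51", "dream29", "dream30"]

def appliance_row_selector(hostname):
    # Build a CSS selector for the appliance's row in the UI list
    # (the selector shape here is an assumption for illustration).
    return "tr[data-hostname='%s']" % hostname

selectors = [appliance_row_selector(h) for h in MY_APPLIANCES]
```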
Another challenge we face is the process of connecting these appliances to the server: manually copying the connection configuration to the intended appliances, either via USB or via SCP. This connection configuration is unique per server because it is generated on demand by the server. It contains the server hostname, the port on which the message queue runs, and the identifier of the organization to which the appliance belongs. This prohibits us from deploying everything in the cloud in an automated fashion!
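As a rough sketch, a connection configuration like the one described above might be modeled as follows. The field names and values are assumptions; the real configuration is generated on demand by the server and its format may differ:

```python
import json

def make_connection_config(server_hostname, mq_port, org_id):
    # Hypothetical shape of the per-server connection configuration:
    # the server hostname, the message queue port, and the identifier
    # of the organization the appliance belongs to.
    return json.dumps({
        "server_hostname": server_hostname,
        "message_queue_port": mq_port,
        "organization_id": org_id,
    })

config = make_connection_config("dev-server.example.com", 5672, "org-1234")
```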
Ideally, we would like a set of consistent test data so we can write automation tests without worrying about whether this data exists across different environments.
Enter Docker. Docker is a lightweight Linux containerization tool that allows us to create multiple Linux instances cheaply! At AppNeta, we use Docker to deploy multiple consistent and reproducible virtual appliances for each engineer.
To show how Docker solves our problem, here is the newly improved development environment setup for an engineer:
Given this setup, we can now assume that everybody has a set of “known” appliances: the front-end uses the hostname, which we can pass when instantiating a Docker instance, as the default display name of the appliance unless overridden by the user. This reduces the friction of writing automation tests.
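A minimal sketch of how the hostname can be passed at instantiation time, using Docker's real `--hostname` flag (the image name is a placeholder, not our actual image):

```python
def docker_run_command(image, hostname):
    # `docker run --hostname` sets the container's hostname; the
    # appliance software inside reports that name to the server, so
    # every engineer ends up with the same set of "known" appliances.
    return ["docker", "run", "-d", "--hostname", hostname, image]

cmd = docker_run_command("virtual-appliance:latest", "appliance01")
```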
Once we had this setup in place, we quickly realized that we had opened ourselves up to further automation opportunities.
First, instead of ordering additional physical appliances, we can prepare a pair of virtual machines for each engineer: one is the current development workstation VM, and the other is a VM dedicated to hosting Docker instances.
Second, with the advancement in infrastructure automation, we can potentially use tools like Packer to generate both AMIs and Vagrant boxes from the same Packer configuration via the Docker builder and related post-processors.
Third, our Docker setup gives us a way to inject, via the command line when instantiating the Docker instance, the connection configuration the appliance needs to connect to the server. This lets us host the whole automation test environment on ephemeral cloud infrastructure such as AWS EC2, because the infrastructure setup is now reproducible in a consistent manner. With this setup in place, we can now update our Jenkins workflow to execute the Selenium automation tests as follows:
1. Deploy the infrastructure and app on an EC2 instance
2. [Subproject of 1] Copy and import the Docker image (template) from Jenkins job step 1
3. Instantiate another EC2 instance
4. Instantiate the necessary Docker instances (on that EC2 instance) based on the image above
5. [Subproject of 2] Run the test automation
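The Docker-related portion of this workflow can be sketched as follows: import the image template copied from the build job (`docker load` is the real command for importing a saved image), then start one container per “known” appliance hostname. The tarball name, image tag, and hostnames are placeholders:

```python
def ci_appliance_commands(image_tarball, image_tag, hostnames):
    # Import the Docker image template produced by the build job.
    commands = [["docker", "load", "-i", image_tarball]]
    # Start one detached container per "known" appliance hostname.
    for hostname in hostnames:
        commands.append(
            ["docker", "run", "-d", "--hostname", hostname, image_tag])
    return commands

cmds = ci_appliance_commands("appliance.tar", "virtual-appliance:latest",
                             ["appliance01", "appliance02"])
```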
Last, but not least, we can now add or remove these virtual appliances by instantiating or shutting down Docker instances as we see fit. The ability to self-provision these virtual appliances without purchasing physical appliances is probably where we see the most value in using Docker. It lets us take our performance/stress testing against our server to a level that was previously limited by the number of physical appliances on hand. As an added bonus, a virtual appliance is not susceptible to hardware failure the way a physical appliance is.
Self-provisioning also helps us quickly add a new type of virtual appliance. For example, let’s say AppNeta decided to add a new type of appliance, the S50 (an imaginary type). Since the development version of our virtual appliance can identify itself as any type of appliance, we can easily set the type to S50, create a Docker instance based on that configuration, and start testing front-end logic specific to the S50 without a physical S50 appliance ever existing.
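One plausible way to set the appliance type at instantiation time is through an environment variable passed with Docker's real `-e` flag. `APPLIANCE_TYPE` is a hypothetical variable name; the actual mechanism our appliance software uses to set its reported model may differ:

```python
def run_appliance_with_type(image, hostname, appliance_type):
    # Pass the desired appliance type (e.g. the imaginary "S50") into
    # the container via an environment variable, so the appliance
    # software identifies itself as that type to the server.
    return ["docker", "run", "-d",
            "--hostname", hostname,
            "-e", "APPLIANCE_TYPE=%s" % appliance_type,
            image]

cmd = run_appliance_with_type("virtual-appliance:latest", "s50-test01", "S50")
```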
We’ve gone through a big change in our setup by introducing Docker and by relying on virtual appliances more than before. While the overall effort was not too great, we encountered a few challenges along the way.
We tried an S3-backed Docker private registry to share our images but ran into issues (bad checksums were frequently reported when pushing images). We ran out of time to correct the problem, but found a workaround: we use our private Archiva repository and make the virtual appliance Docker base image a dependency of our infrastructure. As we make more use of Docker, we’ll return to testing the latest release of the Docker private registry.
While it is possible to prepare this infrastructure using VMs, we believe our approach is simpler, more maintainable, and cheaper. Using a VM to host each individual virtual appliance increases the cost of everything: multiple EC2 instances per virtual appliance, more resources to host multiple VMs on our workstations, longer startup time per VM, additional overhead to manage multiple VMs (via additional automation tools such as Rundeck), and gigabytes of OS images to manage and store versus hundreds of megabytes of Docker images.