When we set out to make a super fast and super accurate application-aware flow collector, we found ourselves with a dilemma. How do we test it? How do we generate complex combinations of business and recreational application traffic at 10GigE speeds? How do we generate Candy Crush, YouTube and SalesForce plus hundreds of other Layer 7 applications simultaneously at controlled rates, all the way up to full wire speed?
Traditionally, hardware traffic generators are used to test network products. Their ability to generate packets at wire rates usually overshadows the fact that hand-crafting packets is tedious. Hand-crafting may be workable for IP Flow / NetFlow collectors that only decode Layer 3 network protocols and IP addresses. However, Layer 7 application-intelligent NetFlow collectors such as FlowView are not sufficiently stressed by synthetic traffic. To stress these collectors to the levels experienced in live networks, they need to be tested with real-life network traffic containing a variety of network applications. We concluded that the only practical method was to replay previously recorded network traffic into our devices. We investigated several commercial and open source solutions, and we finally selected Tcpreplay as our NetFlow test platform.
The selection of Tcpreplay was the start of a long journey for me personally. I knew that it was not ready for serious flow testing, so I would have to make it so. Tcpreplay was primarily designed to edit captures and replay malicious traffic patterns to Intrusion Detection Systems (IDS). While it is sometimes used for performance testing, version 3.x was not optimized for speed. In our 10GigE environment, we were only able to achieve 60% of full wire rate. Flow features were nonexistent.
To add flow capabilities to Tcpreplay I had to:
- Make Tcpreplay fast – One of the many tricks that we use to make FlowView fast is bypassing the kernel and allowing the application to access the network adapter buffers directly. This reduces CPU usage by up to 50%, reduces memory use by 75% and increases performance by nearly 800%. I used the same trick in Tcpreplay: I integrated optional netmap kernel driver support to achieve near 10GigE wire rates and over 150,000 flows per second (fps) on commodity hardware.
- Make Tcpreplay accurate – I rewrote the main TX loop, redesigned the timestamping algorithm and updated the rate calculations. Selecting playback speeds is much more efficient, and the typical deviation from the requested speed has dropped from 5% to 0.01%.
- Add flow capabilities – The replayed traffic pattern is analyzed and flow information is collected. Statistics such as the number of flows, flows per second and number of non-flow packets are presented. Expired flow analysis statistics are also available. Additionally, a new option efficiently modifies IP addresses on every loop iteration, allowing generation of very high fps rates.
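To make the flow idea concrete, here is a minimal Python sketch of why rewriting IP addresses on each loop multiplies the flow count. It is not Tcpreplay's implementation: the `Packet` tuple, the `flow_key`, `shift_ip` and `count_flows` helpers, and the "add the loop number to both addresses" scheme are all assumptions made up for illustration. A flow is taken to be a unique 5-tuple of source, destination, protocol and ports.

```python
from collections import namedtuple
import ipaddress

# Hypothetical, simplified packet record (not a Tcpreplay structure).
Packet = namedtuple("Packet", "src dst proto sport dport")


def flow_key(pkt):
    """Identify a flow by its 5-tuple."""
    return (pkt.src, pkt.dst, pkt.proto, pkt.sport, pkt.dport)


def shift_ip(addr, iteration):
    """Offset an IPv4 address by the loop iteration, wrapping at 2**32.

    Illustrative scheme only; the real option may rewrite addresses differently.
    """
    value = (int(ipaddress.IPv4Address(addr)) + iteration) % 2**32
    return str(ipaddress.IPv4Address(value))


def count_flows(packets, loops, unique_ip):
    """Replay `packets` `loops` times and count the distinct flows seen."""
    flows = set()
    for iteration in range(loops):
        for pkt in packets:
            if unique_ip:
                pkt = pkt._replace(
                    src=shift_ip(pkt.src, iteration),
                    dst=shift_ip(pkt.dst, iteration),
                )
            flows.add(flow_key(pkt))
    return len(flows)


# Toy capture: two packets of one TCP flow plus one DNS packet.
capture = [
    Packet("10.0.0.1", "192.168.1.10", 6, 40000, 443),
    Packet("10.0.0.1", "192.168.1.10", 6, 40000, 443),
    Packet("10.0.0.2", "192.168.1.20", 17, 5353, 53),
]

same_ips = count_flows(capture, loops=3, unique_ip=False)    # 2 flows
unique_ips = count_flows(capture, loops=3, unique_ip=True)   # 6 flows
print(same_ips, unique_ips)
```

Without address rewriting, a collector sees the same two flows on every loop. With it, each loop contributes fresh flows, so the flow rate scales with the replay rate instead of being capped by the size of the capture.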
I presented my modifications to the author of Tcpreplay, Aaron Turner. He accepted my modifications, and then Aaron surprised me by asking if I would agree to take over the project. I agreed, and on December 20, 2013 Aaron announced that I would be the maintainer of Tcpreplay with sponsorship from AppNeta, and support from the AppNeta Engineering Team.
root@pw29:~# tcpreplay -i eth7 -tK --loop 50000 --netmap --unique-ip smallFlows.pcap
Switching network driver for eth7 to netmap bypass mode... done!
File Cache is enabled
Actual: 713050000 packets (460826550000 bytes) sent in 385.07 seconds.
Rated: 1194660947.8 Bps, 9557.28 Mbps, 1848532.79 pps
Flows: 60450000 flows, 156712.44 fps, 712150000 flow packets, 900000 non-flow
Statistics for network device: eth7
        Attempted packets:         713050000
        Successful packets:        713050000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
Switching network driver for eth7 to normal mode... done!
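For readers who want to see how the figures in that run relate to each other, the short Python sketch below derives a few per-loop numbers from the reported totals. The only inputs are values copied from the output above; the `per_loop` helper and the variable names are my own, and the sketch assumes every one of the 50,000 loops replays the identical capture, so the totals should divide evenly.

```python
# Totals copied from the tcpreplay run above.
loops = 50_000
packets = 713_050_000
total_bytes = 460_826_550_000
flows = 60_450_000
flow_packets = 712_150_000
non_flow_packets = 900_000
rate_bytes_per_sec = 1_194_660_947.8


def per_loop(total, loop_count):
    """Divide a run total by the loop count, insisting on an exact split."""
    quotient, remainder = divmod(total, loop_count)
    if remainder:
        raise ValueError("total is not a whole multiple of the loop count")
    return quotient


# Flow and non-flow packets should account for every packet sent.
assert flow_packets + non_flow_packets == packets

packets_per_loop = per_loop(packets, loops)             # 14261
bytes_per_loop = per_loop(total_bytes, loops)           # 9216531
flows_per_loop = per_loop(flows, loops)                 # 1209
non_flow_per_loop = per_loop(non_flow_packets, loops)   # 18

avg_packet_bytes = total_bytes / packets                # about 646.28
mbps = rate_bytes_per_sec * 8 / 1e6                     # about 9557.29

print(packets_per_loop, bytes_per_loop, flows_per_loop, non_flow_per_loop)
print(round(avg_packet_bytes, 2), round(mbps, 2))
```

In other words, each pass over smallFlows.pcap contributes 14,261 packets and 1,209 flows, and because the addresses change on every loop, those 1,209 flows are counted again each time. The average packet is roughly 646 bytes, which is a far more realistic mix than a stream of minimum-size synthetic frames.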
In summary, FlowView performance drove the need to improve Tcpreplay performance. In turn, the updated version of Tcpreplay helped us to improve FlowView. And so the cycle continues.
Watch this space: I will demonstrate how easy it is to test Layer 7 NetFlow performance with Tcpreplay, and share some revealing performance results …