Build Your Own 10GigE Wire-Rate NetFlow Traffic Generator Using Tcpreplay 4.0
Want to build a packet generator using free software and commodity hardware? Wouldn’t it be nice if that packet generator was just as fast as expensive hardware-based packet generators? What if it was more powerful, flexible and easier to use than commercial alternatives?
My blog post “Why Test NetFlow with Tcpreplay 4.0” introduced Tcpreplay 4.0 as an alternative to hardware packet generators. It demonstrated tcpreplay and its ability to generate unique flow traffic at up to 10GigE wire rates. Today I will walk you through the steps required to build your own high-performance packet generator. Using these instructions, you will be able to recreate the following test results on your own NetFlow device.
Let’s assume that you have a network “device under test” (DUT) and a test device running the Tcpreplay 4.0 product suite. The DUT can be any network device that is able to process traffic to/from many sources. Do not test against an application or web server. Testing against a web server is possible with tcpliveplay, but that is beyond the scope of this article.
On your test device you will run tcpreplay to read “pcap” network capture files and replay them to the DUT. Try to connect the test device directly to the DUT. If you must connect through a switch, you need to update your pcap files. Using tcprewrite, set the destination MAC address of all packets to reflect the DUT, for example:
$ tcprewrite --enet-dmac=00:55:22:AF:C6:37 --infile=input.pcap --outfile=output.pcap
Get your hands on some good hardware. If you are only testing at GigE rates, any modern Intel 64-bit x86 system will do as long as you are able to install Linux or FreeBSD. To achieve full wire rates, you will need to install one of the netmap-supported network drivers.
If you are testing at 10GigE rates, you will need better hardware. We suggest an Intel i7 processor system, and an Intel 82599 or X540 10GigE network adapter connected to an 8-lane PCIe slot. It is also beneficial if the machine has the fastest memory available.
Install the Tcpreplay 4.0 Suite
You need to compile Tcpreplay, but first ensure that you have prerequisite software installed. For example, on a base Ubuntu or Debian system you may need to do the following:
sudo apt-get install build-essential libpcap-dev
Download the Tcpreplay source code and extract the tarball. Change to the root directory of the source tree, then run:
./configure
make
sudo make install
Build netmap feature
Netmap is a Linux/BSD kernel driver that will enhance Tcpreplay performance. When installed, tcpreplay can bypass the network stack and write directly to the NIC buffers. This bypass allows tcpreplay to achieve full line rates on commodity network adapters.
Note that bypassing the network driver will disrupt other applications connected through the test interface. For example, you may see interruptions while testing on the same interface you ssh'ed into.
FreeBSD 10 and higher already contain netmap capabilities. It is automatically detected when running configure. To enable the netmap driver on the system you will need to recompile the kernel with "device netmap" included.
For Linux, download and install the latest netmap from http://info.iet.unipi.it/~luigi/netmap/. If you extract netmap into /usr/src/ or /usr/local/src you can build without extra configure options. Otherwise you must specify the netmap source directory, for example:
./configure --with-netmap=/home/fklassen/git/netmap
make
sudo make install
You can also find netmap source at http://code.google.com/p/netmap/
How to do a Performance Test for a Flow Appliance
In this example we are sending high flows-per-second (fps) traffic to a DUT running nprobe. We use nprobe logging features to record the number of packets, bytes and flows per second captured. This allows us to understand the ratio of flows captured vs. lost.
I created enhancements in Tcpreplay 4.0, allowing it to generate wire-rate traffic and extreme flow-per-second rates. Typically tcpreplay is so fast that it outperforms the receiving station. It is important to track the amount of traffic sent and received, and determine the amount of lost data. We will do this by enabling logging in nprobe.
The type of test data sent is significant with some IP Flow/NetFlow devices. Tcpreplay should replay traffic that is random, chatty and arbitrary. Consider capturing a large amount of traffic at an Internet access point, so that it contains real traffic from hundreds of users. That’s often hard to do, so we have provided an excellent packet capture in this example.
To prepare for this test, do the following:
- Download and install the latest release of Tcpreplay on the test machine
- Download bigFlows.pcap onto the test machine (see captures wiki for details)
- Download and install nprobe on the DUT
- Start nprobe on DUT ...
# nprobe -b1 -T "%IPV4_SRC_ADDR %IPV4_DST_ADDR %IPV4_NEXT_HOP \
%INPUT_SNMP %OUTPUT_SNMP %IN_PKTS %IN_BYTES %FIRST_SWITCHED \
%LAST_SWITCHED %L4_SRC_PORT %L4_DST_PORT %TCP_FLAGS \
%PROTOCOL %SRC_TOS %SRC_AS %DST_AS %IPV4_SRC_MASK \
%IPV4_DST_MASK" -i eth7
03/Jan/2014 20:50:24 [nprobe.c:2822] Welcome to nprobe v.6.7.3 for x86_64
03/Jan/2014 20:50:24 [plugin.c:143] No plugins found in ./plugins
03/Jan/2014 20:50:24 [plugin.c:143] No plugins found in /usr/local/lib/nprobe/plugins
03/Jan/2014 20:50:24 [nprobe.c:4004] Welcome to nprobe v.6.7.3 for x86_64
03/Jan/2014 20:50:24 [plugin.c:665] 0 plugin(s) enabled
03/Jan/2014 20:50:24 [nprobe.c:3145] Using packet capture length 128
03/Jan/2014 20:50:24 [nprobe.c:4222] Flows ASs will not be computed
03/Jan/2014 20:50:24 [nprobe.c:4298] Capturing packets from interface eth7
03/Jan/2014 20:50:24 [util.c:3262] nProbe changed user to 'nobody'
On the test device, send some traffic with tcpreplay …
# tcpreplay -i eth7 -K --mbps 7000 --loop 50 --unique-ip --netmap bigFlows.pcap
Switching network driver for eth7 to netmap bypass mode... done!
File Cache is enabled
Actual: 39580750 packets (17770889200 bytes) sent in 20.03 seconds.
Rated: 874999640.5 Bps, 6999.99 Mbps, 1948869.39 pps
Flows: 2034300 flows, 100164.47 fps, 39558950 flow packets, 21800 non-flow
Statistics for network device: eth7
Attempted packets: 39580750
Successful packets: 39580750
Failed packets: 0
Truncated packets: 0
Retried packets (ENOBUFS): 0
Retried packets (EAGAIN): 0
Switching network driver for eth7 to normal mode... done!
Here are some notes and observations:
- The --unique-ip option will ensure that IP addresses are unique for every loop iteration, which will result in unique flows for every iteration
- The -K option will preload the PCAP file into memory before testing. Omit if you do not have enough memory for this option
- The --mbps option will limit the speed at which traffic is sent, and thereby limits the flows-per-second
- The --netmap option bypasses network drivers, allowing tcpreplay to achieve up to full 10GigE wire rates
- In this example we are only sending for about 20 seconds, but we recommend increasing the --loop option to generate several minutes of traffic
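To turn that last note into a number, here is a minimal Python sketch (not part of Tcpreplay) that estimates the --loop count for a target run time. It assumes tcpreplay paces by the pcap byte count at the requested --mbps rate, which is consistent with the "Rated" line above. The function name loops_for_duration is made up for illustration.

```python
import math

def loops_for_duration(bytes_per_loop, mbps, target_seconds):
    """Estimate the --loop count needed to keep sending for target_seconds.

    Assumption: the send rate is mbps * 1,000,000 bits per second,
    counted over the bytes stored in the pcap file.
    """
    bytes_per_second = mbps * 1_000_000 / 8
    seconds_per_loop = bytes_per_loop / bytes_per_second
    return math.ceil(target_seconds / seconds_per_loop)

# Figures taken from the run above: 17770889200 bytes were sent over 50 loops.
BYTES_PER_LOOP = 17770889200 // 50

# Loops needed for about six minutes at --mbps 7000.
print(loops_for_duration(BYTES_PER_LOOP, 7000, 360))  # prints 887
```

Treat the result as a starting point and check the duration reported by tcpreplay after the first run.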
On the DUT observe the flow statistics received ...
03/Jan/2014 20:52:30 [nprobe.c:1557] ---------------------------------
03/Jan/2014 20:52:30 [nprobe.c:1558] Average traffic: [1.870 M pps][6 Gb/sec]
03/Jan/2014 20:52:30 [nprobe.c:1563] Current traffic: [569.089 K pps][2 Gb/sec]
03/Jan/2014 20:52:30 [nprobe.c:1568] Current flow export rate: [0.0 flows/sec]
03/Jan/2014 20:52:30 [nprobe.c:1571] Flow drops: [export queue too long=0] [too many flows=0]
03/Jan/2014 20:52:30 [nprobe.c:1575] Export Queue: 0/512000 [0.0 %]
03/Jan/2014 20:52:30 [nprobe.c:1582] Flow Buckets: [active=749448][allocated=749448] [toBeExported=0][frags=0]
03/Jan/2014 20:52:30 [nprobe.c:1465] Processed packets: 13089188 (max bucket search: 162)
03/Jan/2014 20:52:30 [nprobe.c:1468] Flow export stats: [0 bytes][0 pkts][0 flows]
03/Jan/2014 20:52:30 [nprobe.c:1477] Flow drop stats: [0 bytes][0 pkts][0 flows]
03/Jan/2014 20:52:30 [nprobe.c:1482] Total flow stats: [0 bytes][0 pkts][0 flows]
03/Jan/2014 20:52:30 [nprobe.c:235] Packet stats (pcap): 13100072/234355 pkts rcvd/dropped [1.8%] [Last 13334427/234355 pkts rcvd/dropped]
03/Jan/2014 20:53:00 [nprobe.c:1557] ---------------------------------
03/Jan/2014 20:53:00 [nprobe.c:1558] Average traffic: [626.567 K pps][2 Gb/sec]
03/Jan/2014 20:53:00 [nprobe.c:1563] Current traffic: [336.465 K pps][1 Gb/sec]
03/Jan/2014 20:53:00 [nprobe.c:1568] Current flow export rate: [0.0 flows/sec]
03/Jan/2014 20:53:00 [nprobe.c:1571] Flow drops: [export queue too long=0] [too many flows=0]
03/Jan/2014 20:53:00 [nprobe.c:1575] Export Queue: 0/512000 [0.0 %]
03/Jan/2014 20:53:00 [nprobe.c:1582] Flow Buckets: [active=1542197][allocated=1542197] [toBeExported=0][frags=0]
03/Jan/2014 20:53:00 [nprobe.c:1465] Processed packets: 23183007 (max bucket search: 336)
03/Jan/2014 20:53:00 [nprobe.c:1468] Flow export stats: [0 bytes][0 pkts][0 flows]
03/Jan/2014 20:53:00 [nprobe.c:1477] Flow drop stats: [0 bytes][0 pkts][0 flows]
03/Jan/2014 20:53:00 [nprobe.c:1482] Total flow stats: [0 bytes][0 pkts][0 flows]
03/Jan/2014 20:53:00 [nprobe.c:235] Packet stats (pcap): 23183007/11039913 pkts rcvd/dropped [32.3%] [Last 20888493/10805558 pkts rcvd/dropped]
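If you want to track loss across many runs, the "Packet stats (pcap)" lines can be parsed with a few lines of Python. This is a rough sketch, not an nprobe feature. It assumes the bracketed percentage is dropped / (rcvd + dropped), which agrees with both samples in the log above. The regular expression and the name drop_percent are my own.

```python
import re

# Matches the "rcvd/dropped" counters in an nprobe "Packet stats (pcap)" log line.
PACKET_STATS = re.compile(r"Packet stats \(pcap\): (\d+)/(\d+) pkts rcvd/dropped")

def drop_percent(log_line):
    """Return the percentage of packets dropped, or None if the line has no stats."""
    match = PACKET_STATS.search(log_line)
    if match is None:
        return None
    received, dropped = int(match.group(1)), int(match.group(2))
    total = received + dropped
    return 100.0 * dropped / total if total else 0.0

# The two samples from the log above.
first = "03/Jan/2014 20:52:30 [nprobe.c:235] Packet stats (pcap): 13100072/234355 pkts rcvd/dropped [1.8%]"
second = "03/Jan/2014 20:53:00 [nprobe.c:235] Packet stats (pcap): 23183007/11039913 pkts rcvd/dropped [32.3%]"

print(round(drop_percent(first), 1))   # prints 1.8
print(round(drop_percent(second), 1))  # prints 32.3
```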
Notice that when the DUT is receiving 100K fps it is not able to process everything. This loss is common for many flow products. Loss will most likely increase with the amount of processing that the flow tool does on a per-flow basis. For example, deep packet inspection (DPI) classification is more expensive than packet-header classification.
It is not unreasonable to see a flow tool lose packets at 100K fps. Consider that the industry expects NetFlow tools to process 5 fps per Mbps. This suggests that a 10GigE link should be able to perform at 50K fps (50,000 flows per second). To get a clear picture of this device, test at several rates to discover the rate at which there is no loss.
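The rule-of-thumb arithmetic is simple enough to write down. The sketch below only restates the 5 fps per Mbps figure quoted above and compares it with the flow rate from the earlier tcpreplay run. The function name expected_fps is hypothetical.

```python
def expected_fps(link_mbps, fps_per_mbps=5):
    """Flow rate a NetFlow tool is expected to handle for a given link speed."""
    return link_mbps * fps_per_mbps

ten_gige = expected_fps(10_000)
print(ten_gige)  # prints 50000

# The earlier test sent 100164.47 fps, roughly twice the rule-of-thumb rate.
print(round(100164.47 / ten_gige, 1))  # prints 2.0
```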
Real-life NetFlow Testing
Regardless of the flow collector tested, it is important to be able to control the TX and observe the RX flow rates.
The --mbps option controls the tcpreplay TX rate, and with it the TX flow rate. The relationship between --mbps and fps depends on the packet capture file used. At full 10GigE wire speed, bigFlows.pcap saturates the link at 136,610 fps and smallFlows.pcap saturates it at 156,712 fps.
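As a first guess at the --mbps value for a target flow rate, you can scale linearly from the saturation figures. This Python sketch assumes fps is proportional to --mbps, which is only approximately true: the earlier run at --mbps 7000 reported about 100K fps, somewhat more than the linear estimate. The name mbps_for_target_fps is made up for illustration.

```python
def mbps_for_target_fps(target_fps, saturation_fps, link_mbps=10_000):
    """Linear estimate of the --mbps setting that yields target_fps.

    saturation_fps is the flow rate the capture file produces at link_mbps.
    Assumption: flows per second scale in proportion to the send rate.
    """
    return target_fps / saturation_fps * link_mbps

# Saturation figures quoted above for the two capture files.
print(round(mbps_for_target_fps(50_000, 136_610)))  # bigFlows.pcap, prints 3660
print(round(mbps_for_target_fps(50_000, 156_712)))  # smallFlows.pcap, prints 3191
```

Always confirm the result against the "Flows:" line that tcpreplay prints at the end of the run.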
RX flow rate depends on the amount of traffic the flow collector drops. Advanced logging allows you to observe the RX rate in fine detail. If advanced logging is unavailable, you will want to observe NetFlow summary reports.
When testing NetFlow devices, you should strive to discover the sustained flows-per-second rates. By adjusting the --loop parameter, you can increase the amount of time that the test runs. Let's look at the above graph as an example. Each point on the graph represents 6 minutes of sustained traffic at a given rate. The total data set takes about 8 hours to collect.
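A sweep like this is easy to script. The following Python sketch only builds the tcpreplay command lines and works out how many data points fit in a time budget; it does not run anything. It uses the same options as the example earlier in this article. The helper names and the loop counts in the plan are illustrative placeholders that you should size for your own capture file.

```python
import shlex

def points_in_budget(total_hours, minutes_per_point):
    """Number of test points that fit in the available time."""
    return int(total_hours * 60 // minutes_per_point)

def sweep_commands(plan, interface="eth7", pcap="bigFlows.pcap"):
    """Build one tcpreplay argument list per (mbps, loops) pair in the plan."""
    commands = []
    for mbps, loops in plan:
        commands.append([
            "tcpreplay", "-i", interface, "-K",
            "--mbps", str(mbps), "--loop", str(loops),
            "--unique-ip", "--netmap", pcap,
        ])
    return commands

# 8 hours at 6 minutes per point.
print(points_in_budget(8, 6))  # prints 80

# Placeholder plan: (rate in Mbps, loop count) pairs.
plan = [(1000, 127), (2000, 254), (3000, 380)]
for command in sweep_commands(plan):
    print(shlex.join(command))
```

Each printed line can be pasted into a shell or passed to a process runner, with the DUT statistics collected between runs.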
More information is available on the Tcpreplay wiki.
There are several screencast videos related to this article. The entire playlist of screencasts is available here, or you can watch all of the screencasts below. If you are more interested in detailed descriptions, visit the Tcpreplay How-to wiki.