Simple nfdump Setup (in Containers) for Netflow Collection and Analysis

This post outlines my simple Netflow collection setup, which runs the nfdump tooling in containers I built for this purpose.

Assumptions:

I’m going to make the following assumptions:

  • You’re in a Unix-like environment
  • Docker is the targeted container engine
  • Netflow binaries will be stored in /flows on your system
  • You already have another system that can send Netflow data to a remote host, such as a switch or router (I will not cover how to set up sflow on a Linux machine)

Modify the content to your needs if any of these assumptions don’t line up with your setup.

Why nfdump?

At a previous role I was introduced to nfdump and really liked it. It’s like tcpdump in that it’s a command line tool that lets you inspect Netflow data using filters such as time, or source and destination hosts and ports. Unlike tcpdump, though, nfdump reads Netflow binary files that have already been captured to your filesystem.

The repository for nfdump is here: phaag/nfdump

Set Up nfcapd

Nfcapd is a daemon included with nfdump that receives Netflow data and writes it to disk for inspection with nfdump.

First, let’s create the /flows directory and make sure that UID 1000 owns it (that UID/GID is used in the container):

sudo mkdir -p /flows
sudo chown -R 1000 /flows
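
To confirm that the ownership change took effect, a small check along these lines may help. This is only a minimal sketch: dir_owner_uid is a hypothetical helper name introduced here (it is not part of nfdump or the container image), and it assumes a POSIX-style ls and awk are available.

```shell
# Hypothetical helper: print the numeric UID that owns a directory.
# `ls -ldn` prints numeric IDs, and the third column is the owner's UID.
dir_owner_uid() {
  ls -ldn -- "$1" | awk '{print $3}'
}

# Example check (commented out; run it once /flows exists):
# [ "$(dir_owner_uid /flows)" = "1000" ] && echo "ownership looks right"
```

If the check prints nothing, the directory is probably owned by a different UID and the container may not be able to write to it.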

We can spin up an nfcapd receiver with the following docker command:

docker run -d --restart=unless-stopped --name=nfcapd -p 9995:9995/udp -v /flows:/flows heywoodlh/nfcapd:latest

Now that nfcapd is set up, you can send Netflow to UDP port 9995 on your Docker host and it should start writing the received data to binary files in the /flows directory. By default, my container image organizes flows under /flows by year/month/day/hour. For example, flows captured between 4:00 p.m. and 5:00 p.m. on July 20th, 2022 would be in the folder /flows/2022/07/20/16.
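
The year/month/day/hour layout described above makes it straightforward to compute the directory for a given hour. Here is a minimal sketch, assuming the default layout is in use. The names flow_dir and FLOW_ROOT are hypothetical and only introduced for illustration, and whether the hour is in local time or UTC depends on the container's time zone, which this sketch does not check.

```shell
# Hypothetical helper: build the directory path for one hour of flows.
# Arguments are zero-padded strings: year, month, day, hour (24-hour clock).
# FLOW_ROOT is an assumed variable name; it defaults to /flows as in this post.
flow_dir() {
  printf '%s/%s/%s/%s/%s\n' "${FLOW_ROOT:-/flows}" "$1" "$2" "$3" "$4"
}

# Example (commented out): list the capture files for 4:00 p.m. on July 20th, 2022.
# ls "$(flow_dir 2022 07 20 16)"
```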

If you do not like the defaults I have set in my heywoodlh/nfcapd image, you can check out nfcapd’s arguments by running this command:

docker run -it --rm heywoodlh/nfcapd --help

You can then provide your desired nfcapd arguments to the docker run command. If you go this route, I will assume you are technical enough to figure out what you need, so I will not cover anything beyond the defaults I have set.

Accessing the Captured Netflow Data:

To actually access the data you need to use the nfdump tool. You can either install nfdump on your machine or just run a Docker container I have set up for this purpose. I will cover how to use the container, as that will be a bit more predictable.

Use the following docker command to recursively access ALL of the Netflow data in the /flows/2022 directory:

docker run -it --rm -v /flows:/flows heywoodlh/nfdump -R /flows/2022

As I stated earlier in this post, you can use filters with nfdump to narrow the data down to what you actually want to see. For example, to see all traffic to destination port 80 coming from the IP address 192.168.1.143, I could run this command:

docker run -it --rm -v /flows:/flows heywoodlh/nfdump -R /flows/2022 "dst port 80 and src host 192.168.1.143"
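
Since the docker invocation gets repetitive, a small wrapper function can shorten it. The following is a sketch under a few assumptions: nfq and FLOW_ROOT are hypothetical names introduced here, and the commented examples use nfdump's -t (time window), -s (statistics) and -n (number of rows) options as I understand them, so check the --help output of your version before relying on them.

```shell
# Hypothetical wrapper: run nfdump from the container with any arguments.
# -it is dropped on purpose so the wrapper also works without a terminal
# (for example from cron); that is a choice made for this sketch.
nfq() {
  docker run --rm -v "${FLOW_ROOT:-/flows}:/flows" heywoodlh/nfdump "$@"
}

# Example calls (commented out so that sourcing this file starts no containers).
# Same filter as the command above:
# nfq -R /flows/2022 "dst port 80 and src host 192.168.1.143"
# Restrict the search to one hour with -t:
# nfq -R /flows/2022 -t 2022/07/20.16:00:00-2022/07/20.17:00:00 "dst port 80"
# Top 10 source addresses by bytes with -s and -n:
# nfq -R /flows/2022 -s srcip/bytes -n 10 "dst port 80"
```

Keeping the filter as a single quoted argument matters, because nfdump expects the whole filter expression as one string.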

You can access nfdump’s help section with the following command:

docker run -it --rm heywoodlh/nfdump --help

If you install nfdump on your machine directly, all of the above arguments to nfdump should work the same.

The filters provided with nfdump can reveal a lot about what’s going on in your network, such as which hosts are talking to which ports and when. Using nfdump also gives you the flexibility to build simple but effective monitoring tools around Netflow.
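
As one illustration of a simple monitoring building block, the sketch below counts how often each value appears in a stream, which can turn nfdump output into a quick "busiest sources" list. The name count_desc is hypothetical, and the commented example assumes that your nfdump version supports -q (quiet) and -o "fmt:%sa" (print only the source address); treat it as a starting point rather than a finished tool.

```shell
# Hypothetical helper: read one value per line on stdin (surrounding
# whitespace is ignored) and print "count value" pairs, most frequent first.
count_desc() {
  awk 'NF {print $1}' | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Example (commented out): busiest sources of port 80 traffic.
# -q and -o "fmt:%sa" are assumed to print only source addresses.
# docker run --rm -v /flows:/flows heywoodlh/nfdump -R /flows/2022 \
#   -q -o "fmt:%sa" "dst port 80" | count_desc | head -n 10
```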

Additional Reading:

GitHub repo: phaag/nfdump

Man pages: nfcapd, nfdump

Dockerfiles: heywoodlh/nfcapd, heywoodlh/nfdump

GitHub Actions used to build the images: nfcapd-buildx, nfdump-buildx

Written on July 20, 2022