Reducing a network problem to a file problem (fuzzing)

A number of issues have been found lately by fuzzing input files (strings, lesspipe, various image issues, etc.), but these are all stateless apps / libraries. It’s much easier to fuzz the input of strings, for example, than client input to a MySQL server. The fuzzing project actually concentrates on local input only for now and keeps protocols (network and devices) out of scope. But provided we can agree to some compromises, many of those network communications can be reduced to stateless file processing problems.

Some kinds of network fuzzing exercises will be easier than others. For example, anything that uses a stateless protocol by design will be easier than something that requires a lot of state, authentication, potential cache warmup, etc. Most UDP applications, for example, should be easier to fuzz than TCP ones, because they need to act on separate messages and preserve any state explicitly by design. It also matters a lot how well the application is designed. If it keeps a lot of magic global state around and you can’t easily extract parts of its functionality, that’s going to cause problems.

Let’s look at one such candidate: UDP messages, no auth, functions with no state - systemd-resolve, the DNS resolver for systemd. While it normally initiates the queries, any host can send back a response, and that response needs to be processed safely. In this case, let’s try to fuzz only one very specific part of the code: message parsing.

Instead of starting the actual daemon though and trying to send it messages, we can use the required parts and construct a new program that will be a self-contained stateless file processor. First, we can treat the whole systemd as a big library and provide our own main() for it. This is actually what a lot of the test-* applications in systemd already do. Here’s a tiny file parser:

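#include <stdio.h>
#include <stdlib.h>

#include "resolved-dns-packet.h"

/* Read buffer size; the exact value is an arbitrary choice. */
#define BUFSIZE 65536

int main(int argc, char **argv) {
    char buf[BUFSIZE];
    FILE *in;
    size_t l;
    int res;
    DnsPacket *p = NULL;

    /* The fuzzer passes the input file as the first argument. */
    if (argc < 2 || !(in = fopen(argv[1], "rb")))
        return EXIT_FAILURE;
    l = fread(buf, 1, BUFSIZE, in);
    fclose(in);
    printf("read: %zu bytes\n", l);

    /* Create a packet object and copy the file contents into it. */
    if (dns_packet_new(&p, DNS_PROTOCOL_DNS, -1) < 0)
        return EXIT_FAILURE;
    res = dns_packet_append_blob(p, buf, l, NULL);
    printf("append_blob: %i\n", res);

    /* The code under test: parse the raw bytes into memory structures. */
    res = dns_packet_extract(p);
    printf("extract: %i\n", res);

    dns_packet_unref(p);
    return EXIT_SUCCESS;
}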

The dns_* functions are provided by systemd-resolve and are called here in a way close to what the application does in real life. Specifically, if this program crashes then we can be close to certain that the real app will also crash - and that’s what we care about most.

Now we can just add a new entry to the automake build definition to actually build it:

tests += \
       test-dns-parser

# Add the file holding the main() above to this list as well.
test_dns_parser_SOURCES = \
       src/resolve/resolved-dns-packet.h \
       src/resolve/resolved-dns-packet.c \
       src/resolve/resolved-dns-question.h \
       src/resolve/resolved-dns-question.c \
       src/resolve/resolved-dns-answer.h \
       src/resolve/resolved-dns-answer.c \
       src/resolve/resolved-dns-rr.h \
       src/resolve/resolved-dns-rr.c \
       src/resolve/resolved-dns-domain.h \
       src/resolve/resolved-dns-domain.c

This app can now be checked easily with most common fuzzers, AFL for example. Just run CC=…/afl-gcc ./configure && make test-dns-parser and point the fuzzer at the resulting binary. The printed information is only there for quick verification of later results.

Sample input vectors can be created from Wireshark network captures, or even with just some trivial generator.

What about the compromises then? The main issue is about the tested scope. As seen in the resolver’s example, the only tested parts are parsing and a bit of the memory management. That’s not a lot and it doesn’t involve any of the business logic dealing with caches, constructing responses, etc. It only checks whether a random message arriving at this port is going to crash the program while it’s being parsed into memory structures, or not.

Is it worth it in the end? The systemd-resolve case seems to show it is. There was an assert abort case (function result not set), null pointer read (string vectors api misuse) and one infinite loop hiding in the parsing code. All fixed upstream now. Even though they didn’t allow arbitrary code execution, they’d be trivial to use to (remotely!) DoS a server by using 100% CPU and stopping any hostname resolution.
