A number of issues have been found lately by fuzzing input files
(strings, lesspipe, various image issues, etc.), but these are all
stateless apps / libraries. It’s much easier to fuzz the input of
strings, for example, than client input to a MySQL server. The
Fuzzing Project is actually concentrating on
local input only for now and keeps protocols (network and devices) out
of scope. But provided we can agree to some compromises, it’s possible
to reduce many of those network communications to stateless file
processing problems.
Some kinds of network fuzzing exercises will be easier than others. For
example, anything that uses a protocol that is stateless by design will
be easier than something that requires a lot of state, authentication,
potential cache warmup, etc. Most UDP applications, for example, should
be easier to fuzz than TCP ones, because they need to act on separate messages and
preserve any state explicitly by design. It also matters a lot how well
the application is designed. If it keeps a lot of magic global state
around and you can’t extract parts of functionality simply, then it’s
going to bring some problems.
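To make the point about global state concrete, here is a minimal sketch
in C. The message layout, toy_msg and toy_parse are hypothetical and not
taken from any real project. The parser's result depends only on the
bytes it is handed, which is the property that lets a test program feed
it the contents of a file directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy message: 1 byte type, 1 byte length, then payload. */
struct toy_msg {
    uint8_t type;
    uint8_t len;
    const uint8_t *payload; /* points into the caller's buffer */
};

/*
 * Parse one message from buf. Returns 0 on success, -1 on malformed
 * input. No global or static state is read or written, so every call
 * is independent of the calls before it.
 */
int toy_parse(const uint8_t *buf, size_t size, struct toy_msg *out) {
    if (size < 2)
        return -1;              /* header is truncated */
    if ((size_t)buf[1] > size - 2)
        return -1;              /* declared length overruns the buffer */
    out->type = buf[0];
    out->len = buf[1];
    out->payload = buf + 2;
    return 0;
}
```

A function shaped like this can be called from a ten-line main() that
reads a file into a buffer. A parser that consults connection tables or
caches set up elsewhere would first need that state reconstructed.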
Let’s look at one such candidate: UDP messages, no auth, functions with
no state - systemd-resolve, the DNS resolver for systemd. While
normally it initiates the queries, any host can send back a response
that needs to be processed safely. In this case, let’s try to fuzz only
one very specific part of the code: message parsing.
Instead of starting the actual daemon though and trying to send it
messages, we can use the required parts and construct a new program that
will be a self-contained stateless file processor. First, we can treat
the whole systemd as a big library and provide our own main() for it.
This is actually what a lot of the test-* applications in systemd
already do. Here’s a tiny file parser:
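    #include "resolved-dns-packet.h"
    #include <stdio.h>

    #define BUFSIZE (DNS_PACKET_SIZE_MAX-100)

    int main(int argc, char **argv) {
        char buf[BUFSIZE];
        FILE *in;
        size_t l;
        int res;
        DnsPacket *p;

        /* The fuzzer passes the path of the generated input file. */
        if (argc < 2 || !(in = fopen(argv[1], "rb"))) {
            fprintf(stderr, "usage: test-dns-parser <packet-file>\n");
            return 1;
        }
        l = fread(buf, 1, BUFSIZE, in);
        fclose(in);
        printf("read: %zu bytes\n", l);

        res = dns_packet_new(&p, DNS_PROTOCOL_DNS, -1);
        if (res < 0)
            return 1;
        /* dns_packet_new() reserves room for a header; reset the size
         * so the file contents become the entire packet. */
        p->size = 0;
        res = dns_packet_append_blob(p, buf, l, NULL);
        printf("append_blob: %i\n", res);

        /* This is the code under test: parse the raw bytes. */
        res = dns_packet_extract(p);
        dns_packet_unref(p);
        printf("%i\n", res);
        return 0;
    }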
The dns_* functions are provided by systemd-resolve and are called here
in a way close to what the application does in real life.
Specifically, if this program crashes then we can be close to certain
that the real app will also crash - and that’s what we care about most.
Now we can just add a new entry to Makefile.am to actually build it:
    tests += \
    	test-dns-parser

    test_dns_parser_SOURCES = \
    	src/resolve/resolved-dns-packet.h \
    	src/resolve/resolved-dns-packet.c \
    	src/resolve/resolved-dns-question.h \
    	src/resolve/resolved-dns-question.c \
    	src/resolve/resolved-dns-answer.h \
    	src/resolve/resolved-dns-answer.c \
    	src/resolve/resolved-dns-rr.h \
    	src/resolve/resolved-dns-rr.c \
    	src/resolve/resolved-dns-domain.h \
    	src/resolve/resolved-dns-domain.c \
    	src/resolve/test-dns-parser.c
This app can now be easily checked with most common fuzzers, such as
AFL. Just CC=…/afl-gcc ./configure && make test-dns-parser
and run the fuzzer on it. The printed out information is only for quick
verification of later results.
Sample input vectors can be created from Wireshark network captures, or
even by some trivial generator.
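As an illustration of such a trivial generator, here is a sketch in C
that builds one small DNS response by hand: a question for example.com
and a single A record answer. The function names, the file path passed
in by the caller and the field values (ID 0x1234, TTL 60, address
192.0.2.1) are all arbitrary choices made for this example, and whether
dns_packet_extract accepts the result has not been checked here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Append a big-endian 16-bit value; returns the new write offset. */
static size_t put16(uint8_t *buf, size_t off, uint16_t v) {
    buf[off] = (uint8_t)(v >> 8);
    buf[off + 1] = (uint8_t)(v & 0xff);
    return off + 2;
}

/* Append one DNS label: a length byte followed by the characters. */
static size_t put_label(uint8_t *buf, size_t off, const char *label) {
    size_t n = strlen(label);
    buf[off++] = (uint8_t)n;
    memcpy(buf + off, label, n);
    return off + n;
}

/* Fill buf (at least 64 bytes) with the packet; returns its length. */
size_t build_seed_packet(uint8_t *buf) {
    size_t off = 0;
    off = put16(buf, off, 0x1234);  /* ID (arbitrary) */
    off = put16(buf, off, 0x8180);  /* flags: response, RD, RA, no error */
    off = put16(buf, off, 1);       /* QDCOUNT */
    off = put16(buf, off, 1);       /* ANCOUNT */
    off = put16(buf, off, 0);       /* NSCOUNT */
    off = put16(buf, off, 0);       /* ARCOUNT */
    /* Question section: example.com, type A, class IN. */
    off = put_label(buf, off, "example");
    off = put_label(buf, off, "com");
    buf[off++] = 0;                 /* root label ends the name */
    off = put16(buf, off, 1);       /* QTYPE A */
    off = put16(buf, off, 1);       /* QCLASS IN */
    /* Answer section: name is a compression pointer to offset 12. */
    off = put16(buf, off, 0xC00C);
    off = put16(buf, off, 1);       /* TYPE A */
    off = put16(buf, off, 1);       /* CLASS IN */
    off = put16(buf, off, 0);       /* TTL, high 16 bits */
    off = put16(buf, off, 60);      /* TTL, low 16 bits */
    off = put16(buf, off, 4);       /* RDLENGTH */
    buf[off++] = 192; buf[off++] = 0; buf[off++] = 2; buf[off++] = 1;
    return off;
}

/* Write the packet to path; returns 0 on success, -1 on failure. */
int write_seed_packet(const char *path) {
    uint8_t buf[64];
    size_t len = build_seed_packet(buf);
    FILE *out = fopen(path, "wb");
    if (!out)
        return -1;
    size_t written = fwrite(buf, 1, len, out);
    if (fclose(out) != 0 || written != len)
        return -1;
    return 0;
}
```

Calling write_seed_packet from a one-line main() with a path inside the
fuzzer's input directory gives it a well-formed starting point to mutate.
A few such files with different record types would likely cover more of
the parser than a single one.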
What about the compromises then? The main issue is about the tested
scope. As seen in the resolver’s example, the only tested parts are
parsing and a bit of the memory management. That’s not a lot and it
doesn’t involve any of the business logic dealing with caches,
constructing responses, etc. It only checks whether a random message
arriving at this port is going to crash the program while it’s being
parsed into memory structures, or not.
Is it worth it in the end? The systemd-resolve case seems to show it is.
There was an assert abort case (function result not set),
a null pointer read (string vectors API misuse),
and one infinite loop
hiding in the parsing code. All are fixed upstream now. Even though they
didn’t allow arbitrary code execution, they’d be trivial to use to
(remotely!) DoS a server by using 100% CPU and stopping any hostname resolution.