Fuzzer Crash Root Cause Analysis With ASAN (AddressSanitizer)

In our attempt to "re-discover" the sudoedit vulnerability (CVE-2021-3156), we use the address sanitation tool to investigate a heap overflow. After fixing it, we investigate several other unique crashes registered by the AFL fuzzer.

The Video

Introduction

In the last episode of the sudoedit vulnerability series (CVE-2021-3156), we found a crash using AFL fuzzing that looks quite similar to the actual vulnerability, at least in its format. However, the crash is actually an abort triggered by malloc because the heap seems to be corrupted. This abort is a symptom of a memory corruption that must have happened earlier in the program execution. Now, we need to find when it is that the heap overflows. Finding the root cause of this vulnerability is thus our objective for the day.

If you program in C, there's a tool that you should know about: address sanitizer. It's actually a compiler tool, that is extremely helpful for debugging all kinds of bugs – not just security ones. The reason for us integrating this tool in our workflow in the sudoedit vulnerability serise is that it'll help us track down some more information about this crash.

Investigating the Memory Corruption with ASAN

Creating the ASAN Build

Let's build sudo with ASAN (that's "address sanitizer"), so that we can investigate why the memory was corrupted. From The Fuzzing Project, we can find information that will assist us in finding bugs using compiler features, as well as other content, relating to flags to use during compilation.

The `-fsanitize=address` flag will be coming in handy.

We added this flag to our compilation of sudo. Due to not fuzzing with afl anymore (as we switched to afl++), we can go ahead and just use the vanilla clang, instead of afl-clang. We then build sudo with make, and contemplate the output in the console. You'll note that there is a lot going on in said console – that's -fsanitize=address at work.

Unfortunately, the build fails... wait, we probably forgot the LD flags. We quickly put them into our build configuration command, and hit the Return key again.

./configure CFLAGS="-fsanitize=address,undefined" LDFLAGS="-fsanitize=address" CC=clang

With this step complete, we re-run make, which seems to finish successfully. If we execute the binary, with ./src/sudo, we can see that ASAN is working, as it mentions in the console that it is finding memory leaks. However, these memory leaks are not important for security – they just mean that some memory has been allocated, and not freed at a later point. We can (and did) disable the memory leak reporting by ASAN, so that the output is a little more cleaned up.

We pipe in our malicious input into our new sudo build, and let sudo crash with ASAN keeping watch! It... didn't crash. What? We get a password prompt, and then another memory leak? What is going on?

Ironing Out Some Issues

As it turns out, there is a big issue we had to deal with behind the scenes. We wound up killing a lot of time trying to iron it out. We decided to discuss it so that you can see our troubleshooting methodology and avoid making the same mistake. Research and projects (whether security-oriented or even outside of this sector) are rarely straightforward and smooth; there are often snags, detours, and problems to address. Sometimes, these issues are really simple to sort out, and sometimes, they are awfully difficult. It's just normal.

Back to the memory leak: throwing the input at the normally-compiled sudo via our wrapper still works. What gives? There are a few things that we can consider from here.

First: the recompilation. Recompiling with massive changes such as AFL instrumentation can introduce issues – as can using address sanitation, for that matter. Large changes such as this can move code around in the binary and thus create a different environment in which the malicious input can no longer cause a crash. It can also be that the input crashed the program because of something in AFL. Here, it seems like something with the ASAN build has changed the binary too significantly, such that the input doesn't break anything. This is weird, as ASAN builds are typically more sensitive, and should be immediately reporting overflow cases, even if they don't lead to an abort condition later. We were therefore quite suspicious at this point, as we expected the report that in the end, we did not get.

Where do we go from here? We could try the original crash, as opposed to the minimized one. Alternately, we could increase the overwrite and the data, or generally impart more modifications on the input. Unfortunately, nothing was working.

We tried a variety of inputs, to no avail.

We therefore decided to take a step back, and assess the big picture to try and identify what we should do. We checked whether the argument parsing still worked, by running sudo -l :

echo -en "sudo\x00-l" |./src/sudo

Surprisingly, the password prompt popped up in the console. We instead expected to see a different output.

This is what we expected to see instead.

It seems like ASAN has changed something else... but that doesn't make a lot of sense. It has to be something else, as using ASAN should not affect the binary like this.

What we have here is a typical weird technical issue. Somewhere, somehow, we have made a mistake. We must have! Backtracking further makes a lot of sense. Removing some of our previous modifications to code as part of preparation for fuzzing might help. We decided to rebuild the "normal" sudo, this time with ASAN, so that we can specify the arguments directly ourselves. In the root folder of the Docker container, there is the unmodified and regular version of sudo, which we build with ASAN. Let's see what happens: ./src/sudo -l works with this version. This is a good sign! We can now create a symbolic link to 0edit and attempt to call 0edit with the malicious argument... but it doesn't crash.

Accidentally Fixing the Issue

It's time for more sanity-checking. Here is a previous Docker system where sudoedit clearly crashes – and that was built from the same source, just without ASAN. In our desperation, we decided to install this ASAN version of sudo as the system version with make install. We created the symbolic link from that new ASAN sudo version as 0edit, tried it, and bam! It works?! Indeed it does: ASAN reports a heap buffer overflow at a specified address. This left us pleased, yet confused. Why did this happen? What was our mistake?

For absolutely no reason, the file ./src/sudo is not the binary anymore.

Previously, this was the path to the built binary, and now, it's a shell script. The real sudo is located in .libs.

That's it. That's the mistake, that's the error that cost us hours of troubleshooting.

This is, as it turns out, where `sudo` is *actually* located.

If we symlink the actual binary in .libs and try again, we get the ASAN heap-overflow detection. We really, really love computers, but on some days, we really don't.

Improving the ASAN Debug Output

At any rate, we have our ASAN output now. The overflow seems to be happening in a function called set_cmnd(), but the report only shows us the binary offset in the object file and not its exact source code line. We could use a sudo debug build... so we made one with the -g compiler flag. We extended the CFLAGS, so that the compiler includes the -g flag during the configuration. Unfortunately, this didn't really help. Time for gdb!

We tried to set a breakpoint to set_cmnd(), but the breakpoint isn't hit, and thus this approach didn't work. Looking more closely, we noticed the address range of this function: looks like a library address! As a result, we considered the virtual memory map for the sudo process; we can see that the address previously listed must belong to an already-loaded library, perhaps the libsudo_util one.

Libraries may have something to do with all of this.

It's worth it for us to try to build sudo without external libraries – a static build, with everything contained within the binary. This would after all greatly facilitate the debugging procedure.

We can do this with the --disabled_shared flag. You might've noticed in previous builds that we used this very same flag. We saw it in The Fuzzing Project's tutorial and figured that we should just use it too. However, when we built sudo with ASAN, we did not specify this flag. Adding it again should disable the creation of shared libraries and make this simpler and nicer to debug.

CFLAGS="-ggdb" CXXFLAGS="-ggdb" ./configure --disable-shared

Disabling shared libraries.

By the way, we noticed that the --disabled_shared flag dictates whether we get a binary in .libs, or if ./src/sudo is already the binary. Therefore, this was the reason why the binary used to be in ./src/sudo, and why the earlier ASAN build wound up in .libs, which is absolutely not what we expected. This is the issue that ate up hours of our time. So, if you retain any lesson from this video and article... use the --disabled_shared flag.

Now that we have a working and proper sudo build that involves ASAN and disabled shared libraries. When we test the crash (again), we get painfully clear information about which file and which code line the crash happened at. There we go! Thank you, ASAN. Now, let's have a gander.

ASAN giving us the file and line where the overflow occurs.

At this code line, we find a loop that copies some bytes around. It copies them from the *from pointer to the location of the *to. We also noticed that this loop depends on a check for a backslash, one level upstream. This might explain why our proof-of-concept crash had a backslash at the end. And now, we can start analyzing the bug that we found.

Triaging Other Unique Crashes

It's worth mentioning that during this entire time, afl++ has been at work, trying all kinds of different inputs to generate a crash. It noted three unique vulnerabilities! We'll triage them, and see what their root cause is. Maybe we found a zero-day after all!

We can use afl-tmin to minimize the two other crashes, which goes quickly. We then look at the two cases, and surprise surprise: they look the same as the one we've been using thus far. When minimized, they follow the same structure: a variation of the sudoedit call to -s, with a backslash at the end. So, these are the "same crash", but as we explained previously, afl++ looks at the executed edges to determine if you've reached a different functionality, and it's a unique crash because different edges were executed to reach this conclusion.

Anyway, we're still curious to see if regular afl will find the bug, because so far only afl++ has. We also want to see if specifying sudoedit determines whether the fuzzer can still converge on the vulnerability. We'll leave the fuzzers to continue working while we progress with our series on CVE-2021-3156.

The Next Steps

What could we do next? We could jump forward into exploit development, and try to exploit the crash we found. It could prove to be too hard to exploit, though, without a more thorough and complete understanding of its mechanism. Having this stronger foundation would also mean that exploit development would be an easier task than without the extra background.

It's worth noting that the original Qualys advisory did explain in some detail why this overflow actually happened. It would be beneficial for us to reach the same conclusion. So how does one figure out this detailed explanation?

After all, the theme and the learning methodology of this video series is for us to imagine that we are the researchers, and that we generate this knowledge as organically as possible. Therefore, our role also includes writing a report that goes in-depth to explain the root cause of this vulnerability. So, let's do that next!