Previously in this series of articles, we loaded aln into Ghidra by hand. It is an executable artifact in the traditional Unix a.out format, a file format that Ghidra does not support. In this article, we will start reverse-engineering aln with the intent to port it to new environments.

Reverse-engineering tricks for lazy people

In order to port aln, we will need some understanding of how it is put together. The aln executable itself is about 116 KiB in size, which makes it a large program to reverse-engineer. Unlike our past case study, we will not attempt a thorough or detailed analysis of aln, since that would simply take way too long. Instead, we’ll cheat and exploit every trick and bit of leverage we can to cut down on the amount of work required.

One such source of leverage is the fact that aln is a statically-linked program. It is self-sufficient and doesn’t require external libraries to run: any library the program needs is embedded within it. In particular, we can expect it to contain parts of a C runtime library somewhere in there.

Hunting for the C standard library

Given its age and context, aln can reasonably be expected to contain a glibc 1.x static library from a Linux distribution of this era. Unfortunately, getting a hold of derelict Linux userlands can be challenging, and getting them to run even more so. After trying a bunch of distributions and much gnashing of teeth, we found that Slackware 2.3’s libc.a is a near-perfect match.

Loading the C standard library

A static library is an archive of object files, but we’re going to use Ghidra’s Version Tracking tool, which can only work on one source/destination program pair at a time. We need to extract the static archive (ar x libc.a) and then combine all of the extracted object files into one (ld -r *.o -o libc.o) so that the tool can work with it.
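To see what an archive contains before extracting it, the ar container can be walked with a few lines of Python. This is a minimal sketch, assuming the common ar layout (an 8-byte global magic followed by 60-byte member headers, with member data padded to an even size). The function names and the tiny in-memory archive are made up for illustration, and special members such as symbol tables are listed like any other member.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # name(16) mtime(12) uid(6) gid(6) mode(8) size(10) end(2)


def list_ar_members(data: bytes):
    """Return (name, size) pairs for each member of an ar archive image."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    offset = len(AR_MAGIC)
    while offset + HEADER_SIZE <= len(data):
        header = data[offset:offset + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError(f"bad member header at offset {offset}")
        name = header[0:16].decode("ascii").rstrip()
        size = int(header[48:58].decode("ascii"))
        members.append((name, size))
        # Member data is padded with one byte when its size is odd.
        offset += HEADER_SIZE + size + (size & 1)
    return members


def make_member(name: str, payload: bytes) -> bytes:
    """Build one archive member (hypothetical helper, used only for the demo)."""
    header = f"{name:<16}{0:<12}{0:<6}{0:<6}{'644':<8}{len(payload):<10}`\n"
    padding = b"\n" if len(payload) & 1 else b""
    return header.encode("ascii") + payload + padding


# Demo on a made-up archive; a real run would read libc.a from disk instead.
sample = AR_MAGIC + make_member("brk.o", b"abc") + make_member("errno.o", b"wxyz")
for member_name, member_size in list_ar_members(sample):
    print(member_name, member_size)
```

With a real archive, the same function could be fed open("libc.a", "rb").read() to get a rough idea of how many object files ld -r is about to combine.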

The upside is that after combining libc.a into libc.o we have one object file which contains every symbol from the various .o files that libc.a is made up of. The downside is that libc.a doesn’t carry any debugging symbols, but we can cross-reference the original glibc source code if we need to make up for it.
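The combined libc.o should itself be an a.out file, which can be sanity-checked by decoding its header. The sketch below assumes the classic 32-byte a.out exec header with eight little-endian 32-bit fields, as used on i386 Linux; the byte order, the function name and the sample values are assumptions for illustration, not something taken from the actual file.

```python
import struct

# Magic numbers stored in the low 16 bits of a_info.
MAGIC_NAMES = {0o407: "OMAGIC", 0o410: "NMAGIC", 0o413: "ZMAGIC", 0o314: "QMAGIC"}
FIELDS = ("a_info", "a_text", "a_data", "a_bss",
          "a_syms", "a_entry", "a_trsize", "a_drsize")


def parse_aout_header(data: bytes) -> dict:
    """Decode the exec header found at the start of an a.out file."""
    if len(data) < 32:
        raise ValueError("file too short for an a.out header")
    header = dict(zip(FIELDS, struct.unpack("<8I", data[:32])))
    header["magic"] = MAGIC_NAMES.get(header["a_info"] & 0xFFFF, "unknown")
    return header


# Made-up header values; a real run would read the first 32 bytes of libc.o.
sample_header = struct.pack("<8I", 0o407, 0x1000, 0x200, 0x40, 0x180, 0, 0x30, 0x10)
info = parse_aout_header(sample_header)
print(info["magic"], hex(info["a_text"]), hex(info["a_syms"]))
```

A non-zero a_syms field is the interesting part here: it is the size of the symbol table, which is the metadata we want the loader to pick up.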

Now that we have one file to work with, we need to actually load it. As of Ghidra 10.4 the a.out file format isn’t supported, but there is a pull request that implements it. Since libc.o contains a lot of metadata like symbols we want to use, unlike last time we will not try to load this object file by hand. Instead, after building this modified version of Ghidra we can load libc.o using the UNIX A.out loader format:

Version tracking to the rescue

Now that we have both our source program (libc.o) and our destination program (aln), we can finally start Ghidra’s Version Tracking tool by clicking on the footprints icon in the project window:

Since we directly launched this tool, it started without a session opened. Start a new session by clicking on the footprints icon (or File > New Session...):

After running the precondition checks and finishing the new session wizard, Ghidra will open two CodeBrowser windows, one for the source program and one for the destination program.

From there, to leverage the source file we need to start matching it to the destination file. To get matches, we need to run correlators by clicking on the green plus icon (or File > Add to session) and selecting which correlation algorithms to use:

Since we have a source artifact that is a very close match and a destination artifact that has no symbols, we’ll wing it with just the exact match correlation algorithms with the default settings for now.
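Conceptually, an exact match correlator pairs up functions whose contents are identical in both programs. The sketch below illustrates the idea by hashing function bytes and keeping only the pairs that are unambiguous on both sides. It is not Ghidra’s implementation, and the function bodies are placeholder bytes rather than real code from libc.o or aln.

```python
import hashlib
from collections import defaultdict


def index_by_hash(functions):
    """Group function names by the SHA-256 digest of their bytes."""
    by_hash = defaultdict(list)
    for name, body in functions.items():
        by_hash[hashlib.sha256(body).hexdigest()].append(name)
    return by_hash


def exact_matches(source_funcs, dest_funcs):
    """Map destination names to source names for unique, identical bodies."""
    source_index = index_by_hash(source_funcs)
    dest_index = index_by_hash(dest_funcs)
    matches = {}
    for digest, source_names in source_index.items():
        dest_names = dest_index.get(digest, [])
        # Skip ambiguous cases where several functions share the same bytes.
        if len(source_names) == 1 and len(dest_names) == 1:
            matches[dest_names[0]] = source_names[0]
    return matches


# Placeholder bodies: only the first destination function mirrors a source one.
source = {"_brk": b"\x01\x02\x03\x04", "_sbrk": b"\x05\x06\x07"}
dest = {"FUN_0000fe78": b"\x01\x02\x03\x04", "FUN_00001000": b"\xff\xee"}
print(exact_matches(source, dest))
```

Comparing raw bytes is the strictest variant. Ghidra also ships exact correlators that compare instructions or mnemonics instead, which tolerate operand differences such as addresses that only get fixed up at link time.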

After running the correlation algorithms, over 300 potential matches appeared in the version tracking window:

We can see the details of a particular match by selecting it, such as the exact function mnemonics match for _brk:

The bottom part of the version tracking tool shows how the matched pieces of the two programs correlate. From there, we have three options:

  • Accepting a match, giving it a green flag status;
  • Rejecting a match, giving it a forbidden icon status;
  • Applying a match’s markup, removing it from the list of matches.

We’ll accept this match by clicking on the green flag (or by right-clicking it and selecting Accept):

Both matches for _brk have been accepted and the FUN_0000fe78 function in the destination program has been renamed _brk. Furthermore, two new implied data matches for ____brk_addr and _errno appeared with red highlights, since both references inside the _brk function are identical between the two programs. We can retire accepted matches by clicking on the green checkmark icon (or right-clicking them and selecting Apply Markup), which will make them disappear from the matches list to unclutter it.

Using the version tracking tool effectively requires triaging matches. This in turn may generate further implied matches, transferring more and more of the markups from the source program to the destination program.
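The way an accepted function match produces implied matches can be pictured as pairing up references position by position. The following is a simplified sketch of that idea, not a description of how Ghidra computes implied matches; the destination symbol names are hypothetical placeholders.

```python
def implied_matches(source_refs, dest_refs):
    """Pair the i-th reference of a matched source function with the
    i-th reference of its destination counterpart."""
    if len(source_refs) != len(dest_refs):
        return {}  # reference lists differ, nothing can be implied safely
    implied = {}
    for source_name, dest_name in zip(source_refs, dest_refs):
        # A destination symbol may only map to a single source symbol.
        if implied.setdefault(dest_name, source_name) != source_name:
            return {}
    return implied


# References made by _brk in libc.o, in order of appearance.
brk_source_refs = ["____brk_addr", "_errno"]
# Hypothetical unnamed data labels referenced by the matched function in aln.
brk_dest_refs = ["DAT_0001a000", "DAT_0001a004"]
print(implied_matches(brk_source_refs, brk_dest_refs))
```

Each implied match that gets accepted names another piece of the destination program, which is why triaging matches tends to snowball.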

After processing every match, the destination program (aln) is fully marked up with all the information from the source program (libc.o):

That’s a lot of information inside aln we didn’t need to reverse-engineer by hand.

The files for this case study can be found here: case-study.tar.gz

Conclusion

We have learned how to use Ghidra’s version tracking tool to transfer markup from one program to another, and used it to annotate parts of aln with a closely matching C static library. Next time, we’ll start our porting journey by converting this a.out executable into a modern ELF executable.