Porting the Atari Jaguar SDK part 1: loading executables into Ghidra the hard way
After laying out the context for this series of articles, let’s begin our journey on how to make software ports of programs with no source code.
Our case study will be aln, the Atari linker for the Atari Jaguar; specifically, its original Linux port in the a.out executable file format for the 32-bit x86 architecture.
Running aln
Before trying to reverse-engineer the artifact head-on, let’s run it first as-is on a modern x86_64 Linux system.
After downloading the aln artifact, we’ll use Kees Cook’s a.out loader to execute it.
Let’s download and build it:
$ wget https://raw.githubusercontent.com/kees/kernel-tools/trunk/a.out/aout.c
$ i686-linux-gnu-gcc aout.c -o aout -static
Then, we can use the loader to run aln:
$ ./aout ./aln -v
***********************************
* ATARI LINKER (Mar 17 1995) *
* Adds from Atari version 1.11 *
* and PC/DOS&Linux ports *
* Copyright 1993-95 Brainstorm. *
** Copyright 1987-95 Atari Corp. **
***********************************
No object files to link.
Link aborted.
a.out QMAGIC executables are loaded very low in the virtual address space (at address 4096), which is likely below the minimum mmap() address configured on a modern Linux system.
Either lower that minimum address with sysctl -w vm.mmap_min_addr=4096 or use QEMU’s user-mode emulation to run aout.
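As a rough illustration of the check involved, here is a minimal Python sketch that compares the QMAGIC load address with the system's current limit. The helper names are our own, and we assume the limit is exposed at /proc/sys/vm/mmap_min_addr, the procfs counterpart of the sysctl named above; privileged processes may be exempt from this limit, which the sketch ignores.

```python
from pathlib import Path

QMAGIC_BASE = 0x1000  # 4096, the load address quoted in the text


def can_map_at(base, mmap_min_addr):
    """True if an unprivileged process may map memory at `base`."""
    return base >= mmap_min_addr


def read_mmap_min_addr(path="/proc/sys/vm/mmap_min_addr"):
    """Return the current limit, or None if it cannot be read (assumed path)."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


limit = read_mmap_min_addr()
if limit is None:
    print("could not read vm.mmap_min_addr on this system")
elif can_map_at(QMAGIC_BASE, limit):
    print(f"vm.mmap_min_addr={limit}: mapping at {QMAGIC_BASE} should be allowed")
else:
    print(f"vm.mmap_min_addr={limit}: too high to map at {QMAGIC_BASE}")
```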
cubanismo maintains a packaging of the complete Jaguar SDK that is ready to use on modern systems.
It offers the choice between running the original tools and their modern replacements, where these exist.
After setting it up, we can build the included sample with the original aln executable like so:
$ make LINK=aln V=1
rmac -fb -g2 -rd -v +o0 +o1 +o2 startup.s
startup.s 81: Warning: RISC code generated with no origin defined
[Writing object file: startup.o]
TEXT segment: 624 bytes
DATA segment: 0 bytes
BSS segment: 64080 bytes
Total : 64704 bytes
TextRel size: 248 bytes
DataRel size: 0 bytes
m68k-aout-gcc -DJAGUAR -I/home/boricj/Documents/jaguar-sdk/jaguar/include -O2 -c -o jag.o jag.c
aln -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
***********************************
* ATARI LINKER (Mar 17 1995) *
* Adds from Atari version 1.11 *
* and PC/DOS&Linux ports *
* Copyright 1993-95 Brainstorm. *
** Copyright 1987-95 Atari Corp. **
***********************************
Output file is jaghello.cof
Sizes: Text Data Bss Syms
(hex) 3C0 410 FA54 1C00
Link complete.
$ file jaghello.cof
jaghello.cof: mc68k COFF object not stripped
$ sha256sum jaghello.cof
f9c8269cdc998de01c0ac7a3e815c16b7ced106e25f10f92a7078c722a220dbb jaghello.cof
We’ll consider our future ports of aln successful if they can reproduce an identical jaghello.cof.
This doesn’t prove that the ports are fit for purpose, since it is hardly an exhaustive stress test of the linker, but it will demonstrate at least a basic level of functionality.
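This comparison is easy to automate. The following is a minimal Python sketch, not part of the original tooling, that checks a linker output against the SHA-256 digest shown above; the function names are hypothetical.

```python
import hashlib

# Digest of the reference jaghello.cof, copied from the sha256sum output above.
EXPECTED_SHA256 = "f9c8269cdc998de01c0ac7a3e815c16b7ced106e25f10f92a7078c722a220dbb"


def sha256_hex(data):
    """Return the SHA-256 digest of a bytes object as a hex string."""
    return hashlib.sha256(data).hexdigest()


def file_matches(path, expected=EXPECTED_SHA256):
    """True if the file at `path` hashes to the expected digest."""
    with open(path, "rb") as f:
        return sha256_hex(f.read()) == expected.lower()
```

Calling file_matches("jaghello.cof") after a build with a ported linker would then tell us whether the output is identical to the reference.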
Loading aln into Ghidra
The Linux port of aln is a statically-linked QMAGIC a.out executable.
At the time of writing this article, vanilla Ghidra doesn’t have an a.out file loader.
There is a pull request that aims to implement it, but we’ll demonstrate how to load aln into Ghidra by hand.
For now, we’ll keep things simple and assume the a.out file format is composed of just four parts:
- A header;
- A .text segment;
- A .data segment;
- A .bss segment.
The header has the following structure:
#include <stdint.h>

struct aout_header
{
    uint32_t a_info;   /* machine type, magic, etc */
    uint32_t a_text;   /* text size */
    uint32_t a_data;   /* data size */
    uint32_t a_bss;    /* desired bss size */
    uint32_t a_syms;   /* symbol table size */
    uint32_t a_entry;  /* entry address */
    uint32_t a_trsize; /* text relocation size */
    uint32_t a_drsize; /* data relocation size */
};
When running an a.out QMAGIC executable, the Linux kernel will load it as follows:
- The header and .text segment are mapped read-execute at address 4096 (0x1000);
- The .data segment is mapped read-write-execute immediately after the .text segment;
- The .bss segment is mapped read-write immediately after the .data segment.
First, let’s analyze the header:
$ hexdump -C aln | head -n 2
00000000 cc 00 64 00 00 c0 01 00 00 10 00 00 94 09 00 00 |..d.............|
00000010 00 00 00 00 20 10 00 00 00 00 00 00 00 00 00 00 |.... ...........|
Using the data structure above (and remembering that the executable is in little-endian order), we can read the following interesting values:
Field | Offset | Value |
---|---|---|
a_text | 4 | 0x0001c000 |
a_data | 8 | 0x00001000 |
a_bss | 12 | 0x00000994 |
a_entry | 20 | 0x00001020 |
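The same values can be read programmatically. Here is a minimal Python sketch that unpacks the 32 bytes from the hexdump above as eight little-endian 32-bit integers. The function name is our own, and the interpretation of a_info (magic in the low 16 bits, machine type in the next 8 bits, with 0314 octal for QMAGIC and 100 for M_386) is taken from our reading of Linux's a.out.h rather than from this article.

```python
import struct

# The first 32 bytes of aln, copied from the hexdump above.
HEADER_BYTES = bytes.fromhex(
    "cc006400 00c00100 00100000 94090000"
    "00000000 20100000 00000000 00000000"
)

FIELDS = ("a_info", "a_text", "a_data", "a_bss",
          "a_syms", "a_entry", "a_trsize", "a_drsize")


def parse_aout_header(raw):
    """Decode the eight little-endian uint32 fields of an a.out header."""
    if len(raw) < 32:
        raise ValueError("an a.out header is 32 bytes long")
    return dict(zip(FIELDS, struct.unpack("<8I", raw[:32])))


header = parse_aout_header(HEADER_BYTES)
magic = header["a_info"] & 0xFFFF           # assumed layout: 0o314 is QMAGIC
machtype = (header["a_info"] >> 16) & 0xFF  # assumed layout: 100 is M_386

for name in FIELDS:
    print(f"{name:9} 0x{header[name]:08x}")
```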
Now that we have all the information needed from the header, we’ll import this executable into Ghidra as a raw file and use the following settings:
- Language: x86:LE:32:default:gcc;
- Block name: .text;
- Base address: 0x00001000;
- Length: 0x0001d000.
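The length is simply a_text plus a_data, since both are stored back to back in the file. As a sanity check, this small Python sketch derives the addresses of the three segments from the header values, following the loading rules described earlier; the function name is hypothetical.

```python
QMAGIC_BASE = 0x1000  # load address of the header and .text segment


def aout_qmagic_layout(a_text, a_data, a_bss, base=QMAGIC_BASE):
    """Return {segment name: (start address, length)} for a QMAGIC executable."""
    text_start = base
    data_start = text_start + a_text  # .data follows .text immediately
    bss_start = data_start + a_data   # .bss follows .data immediately
    return {
        ".text": (text_start, a_text),
        ".data": (data_start, a_data),
        ".bss": (bss_start, a_bss),
    }


# Values read from the header of aln.
layout = aout_qmagic_layout(a_text=0x1C000, a_data=0x1000, a_bss=0x994)
raw_length = 0x1C000 + 0x1000  # bytes actually stored in the file

for name, (start, length) in layout.items():
    end = start + length - 1
    print(f"{name:6} start=0x{start:08x} end=0x{end:08x} length=0x{length:08x}")
print(f"raw import length: 0x{raw_length:08x}")
```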
Open the file and skip auto-analysis for now.
Click on Window > Memory Map to display the memory map.
Ghidra has loaded the whole of aln as one .text section (as instructed), but this isn’t how this a.out file is actually structured.
We need to fix up the sections by hand so that Ghidra’s memory map matches that of the a.out file.
First, we’ll split off .data from .text.
Select the .text section and click on the orange “🟰” button on the top-right corner with the tooltip Split a block.
Fill in the dialog box as follows:
- Block name: .data;
- Block length: 0x00001000.
The rest of the fields will automatically adjust.
Click on OK to split the memory block.
Every initialized section is now correctly set up, but .bss is still missing. It is an uninitialized section, meaning its bytes aren’t actually stored inside the a.out file.
Click on the green “➕” button on the top-right corner with the tooltip Add a new block to memory.
Fill in the dialog box as follows:
- Block name: .bss;
- Start address: ram:0x0001e000;
- Block length: 0x00000994;
- Permissions: read, write;
- Uninitialized.
Click on OK to add the memory block.
The memory map should now look like this:
The artifact is now properly loaded, but there is one last piece of information from the header that we can apply: the entrypoint.
Select the address 0x00001020 in the listing view and hit F (or right-click and select Create Function).
Then, hit the L key, rename the function to _start and mark it as an entry point.
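It is worth noting where this address lands in the file. Since the file is mapped from its very first byte at 0x00001000, an address in .text or .data converts to a file offset by subtracting the base address. The Python sketch below (with a hypothetical helper name) shows that the entry point sits at file offset 0x20, right after the 32-byte header.

```python
BASE_ADDRESS = 0x1000        # where the start of the file is mapped
FILE_BACKED_LENGTH = 0x1D000  # header + .text + .data, as imported into Ghidra
HEADER_SIZE = 32              # eight uint32_t fields


def address_to_file_offset(address, base=BASE_ADDRESS, length=FILE_BACKED_LENGTH):
    """Convert an address in .text or .data to an offset inside the a.out file."""
    if not base <= address < base + length:
        # .bss and anything outside the image has no bytes in the file.
        raise ValueError(f"0x{address:08x} is not backed by the file")
    return address - base


entry_offset = address_to_file_offset(0x1020)
print(f"entry point file offset: 0x{entry_offset:x}")
```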
The files for this case study can be found here: case-study.tar.gz
Conclusion
We have loaded the aln executable artifact into Ghidra by hand.
Next time, we’ll start reverse-engineering this artifact so that we can begin porting aln to new environments.