In this series of articles, we will study a program written in C that prints the ASCII table on its standard output. This particular program will:

  • use the Executable and Linkable Format for its binary artifacts;
  • target the 32-bit, little-endian MIPS III instruction set architecture;
  • run on the Linux operating system.

Tooling up

We will use the Debian Bullseye operating system as our build environment.

First, we need to install the toolchain for our target:

$ sudo apt-get install \
    build-essential \
    binutils-mips-linux-gnu \
    gcc-mips-linux-gnu

Among other things, this will provide us with the following programs used in this part:

  • make: the traditional Unix build automation tool;
  • mips-linux-gnu-gcc: the cross-compiler driver for the MIPS platform (a program that invokes the compiler, assembler and linker for us).

Since this series focuses on reverse-engineering the binary artifacts made by a toolchain and not on the toolchain itself, we will not study in detail how a toolchain works. For more on this topic, please refer to Fabien Sanglard’s Driving Compilers series of articles.

Anatomy of our case study

To keep things simple, this program is self-contained and does not rely on any libraries. This means we must provide ourselves the C runtime library functionality that the program relies on. The source tree layout is as follows:

  • ascii-table.c contains the main source code for this program;
  • ctype.c is an implementation of the ctype.h header, adapted from OpenBSD’s source code;
  • libstd.c provides write(), exit() and __start().
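To give an idea of what providing runtime functionality ourselves can look like, here is a minimal sketch of a putchar()-like function layered on top of write(). It is not taken from the case study: the name sketch_putchar is hypothetical, and the sketch assumes a hosted POSIX system so that it can be tried on the build machine, whereas the case study supplies its own write().

```c
#include <unistd.h>   /* POSIX write(); the case study provides its own instead */

/* Hypothetical putchar() replacement built on top of write(). */
int sketch_putchar(int c) {
    unsigned char byte = (unsigned char)c;

    /* File descriptor 1 is the standard output. */
    if (write(1, &byte, 1) != 1)
        return -1;   /* mirrors EOF on failure */

    return byte;
}
```

Calling sketch_putchar('A') writes a single byte to the standard output and returns the character written.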

In our program, the main() function iterates over all 128 characters of the ASCII table and prints the table entry for each one. The table is laid out in multiple columns: the loop visits the output cells row by row and computes, for each cell, the character that belongs there so that the table reads in column order:

int main() {
    for (int i = 0; i < 128; i++) {
        int x = i % COLUMNS;
        int y = i / COLUMNS;
        int character = x * 128 / COLUMNS + y;

        print_ascii_entry(character, s_ascii_properties, NUM_ASCII_PROPERTIES);

        putchar(i % COLUMNS == COLUMNS - 1 ? '\n' : '\t');
    }

    return 0;
}
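The index arithmetic in this loop can be isolated into a small helper to see what it does. This is only a sketch: the helper name table_character is hypothetical, and since the value of COLUMNS is not shown here, the number of columns is passed as a parameter.

```c
/* Hypothetical helper isolating the index arithmetic from main().
 * i is the position of the output cell (counted row by row),
 * columns is the number of columns in the table. */
int table_character(int i, int columns) {
    int x = i % columns;   /* column of the output cell */
    int y = i / columns;   /* row of the output cell */

    /* Each column covers a contiguous slice of 128 / columns characters. */
    return x * 128 / columns + y;
}
```

Assuming four columns, the first row holds characters 0, 32, 64 and 96, and the second row holds characters 1, 33, 65 and 97: reading down a column gives consecutive character codes.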

The print_ascii_entry() function prints a single ASCII table entry. It prints the character's code as a number, the character itself if it has a graphical representation (a space otherwise), and a series of flags representing various ASCII character properties:

void print_ascii_entry(char character, const ascii_property properties[], int num_ascii_properties) {
    print_number(character);
    putchar(' ');

    if (isgraph(character))
        putchar(character);
    else
        putchar(' ');
    putchar(' ');

    for (int k = 0; k < num_ascii_properties; k++) {
        const ascii_property *property = &properties[k];

        if (property->matches(character))
            putchar(property->flag);
        else
            putchar(' ');
    }
}
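The function above relies on a table of properties, each pairing a predicate with a flag character. The sketch below shows one way such a table could be declared. The structure layout is inferred from the property->matches and property->flag accesses; the table entries, the flag letters and the names example_properties and format_flags are assumptions made for illustration, and the sketch uses the host's ctype.h rather than the case study's own ctype.c.

```c
#include <ctype.h>

/* Assumed shape of ascii_property, inferred from its use in print_ascii_entry(). */
typedef struct {
    int (*matches)(int c);   /* predicate such as isdigit() */
    char flag;               /* character printed when the predicate matches */
} ascii_property;

/* Hypothetical table: the real entries and flag letters may differ. */
static const ascii_property example_properties[] = {
    { isalpha, 'a' },
    { isdigit, 'd' },
    { ispunct, 'p' },
    { isspace, 's' },
};

#define NUM_EXAMPLE_PROPERTIES \
    ((int)(sizeof(example_properties) / sizeof(example_properties[0])))

/* Same loop as in print_ascii_entry(), but writing the flags into a buffer
 * instead of calling putchar(). out needs NUM_EXAMPLE_PROPERTIES + 1 bytes. */
void format_flags(int character, char *out) {
    for (int k = 0; k < NUM_EXAMPLE_PROPERTIES; k++) {
        const ascii_property *property = &example_properties[k];

        out[k] = property->matches(character) ? property->flag : ' ';
    }
    out[NUM_EXAMPLE_PROPERTIES] = '\0';
}
```

With this hypothetical table, format_flags('A', out) fills out with an 'a' followed by three spaces, while '7' only sets the 'd' flag in the second position.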

The print_number() function prints an integer in hexadecimal as a zero-padded, right-justified, four-character-wide string:

void print_number(int num) {
    for (int n = 3; n >= 0; n--) {
        int digit = (num >> (4 * n)) % 16;

        if (digit < 10)
            putchar('0' + digit);
        else
            putchar('a' + digit - 10);
    }
}
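To check the digit extraction by hand, the same logic can be written against a buffer instead of putchar(). This is a sketch for experimentation only: the name format_hex4 is hypothetical, and like the original it assumes a non-negative input.

```c
/* Hypothetical buffer-based variant of print_number(): same digit logic,
 * but the four hexadecimal digits are stored in out instead of printed. */
void format_hex4(int num, char out[5]) {
    for (int n = 3; n >= 0; n--) {
        /* Extract nibble n, most significant first. */
        int digit = (num >> (4 * n)) % 16;

        out[3 - n] = (char)(digit < 10 ? '0' + digit : 'a' + digit - 10);
    }
    out[4] = '\0';
}
```

For example, format_hex4(65, out) yields "0041" and format_hex4(127, out) yields "007f".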

Building the case study

To build our case study, we will use the make command:

$ make CC=mips-linux-gnu-gcc LD=mips-linux-gnu-gcc ascii-table.elf
mips-linux-gnu-gcc -Os -g -EL -ffreestanding -fno-pic -fno-plt -nostdinc -I. -mno-abicalls -mno-check-zero-division -fverbose-asm -S -o ascii-table.S ascii-table.c
mips-linux-gnu-gcc -Os -g -EL -ffreestanding -fno-pic -fno-plt -nostdinc -I. -mno-abicalls -mno-check-zero-division -c -o ascii-table.o ascii-table.S
mips-linux-gnu-gcc -Os -g -EL -ffreestanding -fno-pic -fno-plt -nostdinc -I. -mno-abicalls -mno-check-zero-division -fverbose-asm -S -o ctype.S ctype.c
mips-linux-gnu-gcc -Os -g -EL -ffreestanding -fno-pic -fno-plt -nostdinc -I. -mno-abicalls -mno-check-zero-division -c -o ctype.o ctype.S
mips-linux-gnu-gcc -Os -g -EL -ffreestanding -fno-pic -fno-plt -nostdinc -I. -mno-abicalls -mno-check-zero-division -fverbose-asm -S -o libstd.S libstd.c
mips-linux-gnu-gcc -Os -g -EL -ffreestanding -fno-pic -fno-plt -nostdinc -I. -mno-abicalls -mno-check-zero-division -c -o libstd.o libstd.S
mips-linux-gnu-gcc -EL -static -no-pie -nostdlib -o ascii-table.elf ascii-table.o ctype.o libstd.o

Taking ascii-table.c as an example, the build goes through the following files:

  • ascii-table.c: the original C source code file written by the programmer;
  • ascii-table.S: the assembly file generated from the source file by the compiler;
  • ascii-table.o: the relocatable object file generated from the assembly file by the assembler;
  • ascii-table.elf: the executable file generated from the object files by the linker.

Note that in addition to the object files and the executable file described above, we have also generated assembly files. They are kept here for reference, as an intermediate step between a source code file and an object file. In real-world practice this step is usually skipped: the compiler driver goes straight from source file to object file without keeping the assembly.

The files for this case study can be found here: case-study.tar.gz

Conclusion

We have created binary artifacts to study by compiling and linking a program using a toolchain. Next time, we will use the toolchain to introspect the artifacts we have just created.