Previously in this series of articles, we ported aln as a whole from a Linux a.out executable to a modern Linux ELF executable, aln.elf. However, that executable still contains the old, glibc 1.xx C standard library aln was originally built with. If we are to make more ambitious ports of aln, we need to get rid of it.

Static library switcheroo

This time, rather than just repackaging aln from the a.out file format to ELF, we’ll swap out the C standard library aln was statically linked with for a contemporary glibc 2.xx one. To do so, instead of exporting the whole of aln as an object file (aln.whole.o) like we did previously, we’ll export aln without its C standard library bits (aln.o), then link it as if it were a normal, everyday object file.

We’ll separate the different components of aln inside of a new program tree. After creating folders and memory fragments, then triaging the program bits by dragging and dropping selections of the program into memory fragments, we end up with the following pieces:

Slicing up aln in that manner makes it easy to export each piece by right-clicking on a piece and selecting Select Addresses, then exporting the selection as an object file with my Ghidra extension, like in the previous part.

After exporting aln.o, we can inspect the undefined symbols for this object file with nm:

$ i686-linux-gnu-nm --undefined-only aln.o
         U calloc
         U close
         U _ctype_b
         U _ctype_tolower
         U _ctype_toupper
         U DAT_0001de20
         U exit
         U free
         U FUN_0000fba0
         U getenv
         U index
         U _IO_fflush
         U _IO_fprintf
         U _IO_gets
         U _IO_printf
         U longjmp
         U lseek
         U malloc
         U memmove
         U memset
         U open
         U puts
         U read
         U realloc
         U rindex
         U scanf
         U _setjmp
         U sprintf
         U stderr
         U stdout
         U strcat
         U strcmp
         U strcpy
         U strncmp
         U strncpy
         U write

If we can provide compatible replacements for every undefined symbol in this file, then aln.o will link successfully and should run, regardless of the provenance of these replacements.
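As a rough mental model of that claim, here is a minimal sketch of the bookkeeping a linker does when resolving names. The function names are made up for illustration, and a real linker does far more (archives, weak symbols, versioning); this only shows that resolution is a matter of matching names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears in the `provided` list, 0 otherwise. */
static int is_provided(const char *name, const char *const *provided, size_t n_provided)
{
    for (size_t i = 0; i < n_provided; i++) {
        if (strcmp(name, provided[i]) == 0)
            return 1;
    }
    return 0;
}

/* Counts the undefined symbols that nothing else defines. A real linker
 * would report each of them as an "undefined reference". */
size_t count_unresolved(const char *const *undefined, size_t n_undefined,
                        const char *const *provided, size_t n_provided)
{
    size_t missing = 0;

    assert(undefined != NULL || n_undefined == 0);
    assert(provided != NULL || n_provided == 0);
    for (size_t i = 0; i < n_undefined; i++) {
        if (!is_provided(undefined[i], provided, n_provided))
            missing++;
    }
    return missing;
}
```

Feeding it the names from the nm listing and the names a given C library exports would tell us how many symbols still need a replacement from somewhere else.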

What could possibly go wrong?

For starters, let’s link aln.o statically as if this was a normal, everyday object file:

$ i686-linux-gnu-gcc -static -o aln.static.elf aln.o
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o: in function `FUN_00002824':
aln.o:(.text+0x1867): undefined reference to `_ctype_toupper'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:(.text+0x1a42): undefined reference to `_ctype_toupper'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:(.text+0x1b22): undefined reference to `_ctype_toupper'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:(.text+0x1c02): undefined reference to `_ctype_toupper'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:(.text+0x2194): undefined reference to `_ctype_toupper'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:aln.o:(.text+0x2572): more undefined references to `_ctype_toupper' follow
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o: in function `main':
aln.o:(.text+0x3123): undefined reference to `FUN_0000fba0'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o: in function `FUN_00004bc4':
aln.o:(.text+0x3bce): undefined reference to `_ctype_b'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o: in function `FUN_00004c74':
aln.o:(.text+0x3c61): undefined reference to `_ctype_tolower'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:(.text+0x3c7e): undefined reference to `_ctype_b'
collect2: error: ld returned 1 exit status
make: *** [Makefile:11: aln.static.elf] Error 1

Well, it wouldn’t be worthy of a blog article if it was that simple…

Just put some Flex TAPE® stubs on it

We have several undefined references to fix:

  • _ctype_toupper, _ctype_tolower and _ctype_b are global variables, part of glibc 1.xx’s implementation of ctype.h;
  • FUN_0000fba0 is, as far as I can tell, an internal initialization function of the C standard library that is called from main for some reason.

The first issue occurs because we’re trying to use glibc 2.xx from the host Linux system, yet aln was originally built with glibc 1.xx. What we see here are the first signs of ABI mismatches, where the expectations of the original program don’t line up with its new environment.
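To illustrate why these ctype symbols are data rather than functions, here is a minimal sketch of a table-driven ctype implementation in the general style of older C libraries. All names (my_ctype_b, my_ctype_toupper, MY_LOWER and so on) and the bit layout are invented for this example; this is not the actual glibc 1.xx layout.

```c
#include <assert.h>

/* Hypothetical classification bits; real libraries define many more. */
#define MY_UPPER 0x01u
#define MY_LOWER 0x02u

static unsigned short class_table[256];
static int toupper_table[256];

/* Global *data* symbols, analogous in spirit to _ctype_b and
 * _ctype_toupper: compiled code indexes these tables directly, so the
 * object file ends up referencing variables, not calling functions. */
const unsigned short *my_ctype_b = class_table;
const int *my_ctype_toupper = toupper_table;

/* Fills the tables for ASCII letters; every other value maps to itself. */
void my_ctype_init(void)
{
    for (int c = 0; c < 256; c++) {
        class_table[c] = 0;
        toupper_table[c] = c;
    }
    for (int c = 'a'; c <= 'z'; c++) {
        class_table[c] = MY_LOWER;
        toupper_table[c] = c - 'a' + 'A';
    }
    for (int c = 'A'; c <= 'Z'; c++)
        class_table[c] = MY_UPPER;
}

/* What a header macro would boil down to: a plain table lookup. */
int my_toupper(int c)
{
    assert(c >= 0 && c < 256);
    return my_ctype_toupper[c];
}

int my_islower(int c)
{
    assert(c >= 0 && c < 256);
    return (my_ctype_b[c] & MY_LOWER) != 0;
}
```

Code compiled against such headers bakes the table names and layout into the object file, which is why a library that no longer exports them under those names leaves the link incomplete.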

Instead of fixing this at the source (which we don’t have), we’ll cheese it by stubbing out FUN_0000fba0 and borrowing the reconstructed ctype.o from the original aln:

$ cat FUN_0000fba0.c
void FUN_0000fba0() {
}
$ i686-linux-gnu-gcc -static -o aln.static.elf aln.o ctype.o FUN_0000fba0.c
$ file aln.static.elf 
aln.static.elf: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), statically linked, BuildID[sha1]=004086f9962fe01ffdf12dab5f3c264773cf922b, for GNU/Linux 3.2.0, with debug_info, not stripped

We have a successful link!

We’re not aiming for purity here but rather pragmatism: as long as aln can be tricked into running successfully, anything goes.

Now that we have an aln.static.elf file, let’s try something simple like printing out the help message with debug mode on:

$ ./aln.static.elf -z -?
Option `?'
Segmentation fault (core dumped)

Hmm… It managed to write out some output, but it crashed fairly quickly.

You say “Application Binary Interface”, I say [ɛ̃tɛɾfasə daplikasjõ binɛɾə]

Let’s see what’s happening with GDB:

$ gdb-multiarch --args ./aln.static.elf -z -?
...
Reading symbols from ./aln.static.elf...
(gdb) run
Starting program: /home/boricj/Documents/atari-sdk-elf/aln.static.elf -z -\?
Option `?'

Program received signal SIGSEGV, Segmentation fault.
0x0806d6e8 in fflush ()
(gdb) backtrace
#0  0x0806d6e8 in fflush ()
#1  0x0804b6b6 in ?? ()
#2  0x0804d005 in ?? ()
#3  0x08059528 in __libc_start_main ()
#4  0x08049ce2 in _start ()
(gdb) 

GDB isn’t telling us much, but we have a thread to pull on, fflush():

(gdb) break fflush
Breakpoint 1 at 0x806d674
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/boricj/Documents/atari-sdk-elf/aln.static.elf -z -\?
Option `?'

Breakpoint 1, 0x0806d674 in fflush ()
(gdb) x/4wx $ebp
0xffffd188:     0xffffd1d0      0x0804b6b6      0x08118bb0      0xffffd314
(gdb) info symbol 0x08118bb0
stdout in section .data of /home/boricj/Documents/atari-sdk-elf/aln.static.elf
(gdb) x/wx &stdout
0x8118bb0 <stdout>:     0x08118940
(gdb) x/wx (void*)stdout
0x8118940 <_IO_2_1_stdout_>:    0xfbad2a84

GDB isn’t providing much information, but we can recover what we need by hand. With the standard calling convention on i386, the arguments to a function are passed on the stack. They can be inspected using the frame pointer register, starting at $ebp+8. fflush() takes a single FILE* argument and we can deduce that its raw value here is 0x08118bb0, a pointer to stdout.
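The frame layout described above can be written down as a small sketch. It takes the words printed by x/4wx $ebp and picks out the return address and the first argument; the helper names are mine, and it assumes a conventional frame where %ebp has already been set up by the callee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets from %ebp in a conventional i386 cdecl stack frame. */
enum {
    SAVED_EBP_OFFSET = 0,   /* caller's frame pointer  */
    RETURN_ADDR_OFFSET = 4, /* address to return to    */
    FIRST_ARG_OFFSET = 8    /* first stack argument    */
};

/* Returns stack argument `index` (0-based), given the 32-bit words found
 * at %ebp, %ebp+4, %ebp+8, ... in that order. */
uint32_t cdecl_arg(const uint32_t *words_at_ebp, size_t n_words, size_t index)
{
    size_t slot = FIRST_ARG_OFFSET / sizeof(uint32_t) + index;

    assert(slot < n_words);
    return words_at_ebp[slot];
}

/* Returns the saved return address of the frame. */
uint32_t cdecl_return_address(const uint32_t *words_at_ebp, size_t n_words)
{
    assert(n_words > RETURN_ADDR_OFFSET / sizeof(uint32_t));
    return words_at_ebp[RETURN_ADDR_OFFSET / sizeof(uint32_t)];
}
```

With the four words from the GDB session, the return address comes out as 0x0804b6b6 (frame #1 in the backtrace) and the first argument as 0x08118bb0.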

Looking inside glibc’s source code, we can see that the opaque FILE structure from the C standard library is a typedef for the _IO_FILE structure private to glibc. The first field of this structure, _flags, carries the magic number 0xFBAD in its upper 16 bits. The 32-bit word at 0x8118940 <_IO_2_1_stdout_> contains it (0xfbad2a84), but the 32-bit word at 0x8118bb0 <stdout> is merely a pointer to the former.
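That check can be expressed in a few lines. The two constants below mirror _IO_MAGIC and _IO_MAGIC_MASK as I read them in glibc’s libio headers; the function names are my own, and this is only a sketch of the sanity check done by eye in GDB.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors of glibc's _IO_MAGIC and _IO_MAGIC_MASK (assumed values). */
#define IO_MAGIC      0xFBAD0000u
#define IO_MAGIC_MASK 0xFFFF0000u

/* Returns 1 if `first_word` looks like the _flags field of an _IO_FILE. */
int looks_like_io_file(uint32_t first_word)
{
    return (first_word & IO_MAGIC_MASK) == IO_MAGIC;
}

/* Returns the flag bits stored below the magic number. */
uint32_t io_file_flag_bits(uint32_t first_word)
{
    assert(looks_like_io_file(first_word));
    return first_word & ~IO_MAGIC_MASK;
}
```

The word 0xfbad2a84 passes the check, whereas 0x08118940 does not: it is just an address.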

Somehow, things got mixed up and a FILE** value was passed to fflush() instead of the FILE* it expects, which led to the segmentation fault. To fix this, we need to rename the symbol stdout to _IO_2_1_stdout_ so that fflush() gets called with the correct value.

Rather than adjusting it inside the Ghidra database (whose contents are supposedly correct for an executable with glibc 1.xx), we’ll rename it after the fact with objcopy:

$ cat glibc-1xx-to-glibc-2xx.syms
stdin           _IO_2_1_stdin_
stdout          _IO_2_1_stdout_
stderr          _IO_2_1_stderr_
$ i686-linux-gnu-objcopy --redefine-syms=glibc-1xx-to-glibc-2xx.syms aln.o aln.glibc20.o
$ i686-linux-gnu-gcc -g -static -no-pie -fno-pic -o aln.static.elf aln.glibc20.o ctype.o FUN_0000fba0.c

glibc-1xx-to-glibc-2xx.syms is a text file containing all symbols to redefine, one per line:

Old symbol name   New name (glibc 2.xx)
stdin             _IO_2_1_stdin_
stdout            _IO_2_1_stdout_
stderr            _IO_2_1_stderr_
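Conceptually, the rename is a lookup in that two-column map applied to the symbol table. The sketch below is not how objcopy is implemented; it is a hypothetical helper that assumes a well-formed map with exactly two whitespace-separated names per line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Looks `name` up in a rename map written as "old new" pairs, one pair per
 * line, and copies the resulting name into `out`. Names without an entry
 * are copied unchanged. Returns 1 if a rename happened, 0 otherwise. */
int redefine_symbol(const char *map_text, const char *name, char *out, size_t out_size)
{
    char old_name[128], new_name[128];
    const char *line = map_text;

    assert(out_size > 0);
    while (*line != '\0') {
        if (sscanf(line, "%127s %127s", old_name, new_name) == 2 &&
            strcmp(old_name, name) == 0) {
            snprintf(out, out_size, "%s", new_name);
            return 1;
        }
        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line == NULL)
            break;
        line++;
    }
    snprintf(out, out_size, "%s", name);
    return 0;
}
```

Every reference to stdout in aln.o then resolves against the FILE object itself instead of the pointer variable, while symbols absent from the map, such as malloc, are left alone.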

We’re trying to mend ABIs together that aren’t supposed to be compatible with one another. As noted above, anything goes as long as it runs, no matter how dubious the hacks are.

With that out of the way, let’s see if it made a difference:

$ ./aln.static.elf -z -?
Option `?'
Usage: ./aln.static.elf [-options] <files|-x file|-i[i] <fname> <label>>
Where options are:
?: print this
a <text> <data> <bss>: output absolute file
        hex value: segment address
        r: relocatable segment
        x: contiguous segment
b: don't remove multiply defined local labels
d: wait for key after link
...
No object files to link.

Good, it’s no longer crashing. What about our nominal test case, linking the “Hello, world!” sample from the Atari Jaguar community SDK?

$ rm -f jaghello.cof && make LINK=aln.static.elf V=1
aln.static.elf -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
***********************************
*   ATARI LINKER (Mar 17 1995)    *
*  Adds from Atari version 1.11   *
*     and PC/DOS&Linux ports      *
*  Copyright 1993-95 Brainstorm.  *
** Copyright 1987-95 Atari Corp. **
***********************************
Output file is jaghello.cof
make: *** [Makefile:12: jaghello.cof] Segmentation fault (core dumped)

Sigh.

It’s never that simple, is it?

Despair kicks in

Let’s run aln with the debug mode on to see what’s going on:

$ aln.static.elf -z -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
...
***********************************
*   ATARI LINKER (Mar 17 1995)    *
*  Adds from Atari version 1.11   *
*     and PC/DOS&Linux ports      *
*  Copyright 1993-95 Brainstorm.  *
** Copyright 1987-95 Atari Corp. **
***********************************
...
bsddoobj(startup.o,0x854f240,<none>)
bsdaddsyms for file startup.o
add_global(gSetOLP,startup.o,80,a200,GLOBAL)
...
add_global(_vidmem,startup.o,50,a100,GLOBAL)
add_unresolved(___main,startup.o)
bsddoobj(jag.o,0x8552420,<none>)
AlignSect: off=0X20, size=0X14C, align=0X10, padd=0x4
AlignSect: off=0X170, size=0X40C, align=0X10, padd=0x4
bsdaddsyms for file jag.o
add_global(_textfont,jag.o,0,a400,GLOBAL)
...
add_global(___main,jag.o,100,a200,GLOBAL)
Initialize symbols
add_local(_TEXT_E,(null))
...
add_global(_BSS_E,(null),0,a100,GLOBAL)
find_global(___main) =>  global  in BSD object jag.o
DOUNRESOLVED
DOCOMMON
add_local(_jagscreen,(null))
DOSYM(startup.o)
add_local(DRAM,(null))
...
add_local(VC,(null))
Segmentation fault (core dumped)

It’s crashing in the middle of the linking process. What can GDB tell us?

$ gdb-multiarch --args aln.static.elf -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
...
Reading symbols from aln.static.elf...
(gdb) r
Starting program: /home/boricj/Documents/atari-sdk-elf/aln.static.elf -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
***********************************
*   ATARI LINKER (Mar 17 1995)    *
*  Adds from Atari version 1.11   *
*     and PC/DOS&Linux ports      *
*  Copyright 1993-95 Brainstorm.  *
** Copyright 1987-95 Atari Corp. **
***********************************
Output file is jaghello.cof

Program received signal SIGSEGV, Segmentation fault.
0x0804d801 in FUN_00004a04 ()
(gdb) bt
#0  0x0804d801 in FUN_00004a04 ()
#1  0x0805760f in ?? ()
#2  0x08057922 in ?? ()
#3  0x08057c7a in ?? ()
#4  0x0804d179 in LAB_00004388 ()
#5  0x08118000 in ?? ()
#6  0x08059528 in __libc_start_main ()
#7  0x08049ce2 in _start ()
(gdb) 

Oh no.

It’s not crashing inside glibc, it’s crashing deep inside aln. That means I have no obvious threads to pull on to figure this one out…

I’ll admit, this one had me stumped for quite a long time, chasing one dead end after another. I’ll skip ahead to the resolution.

FUN_00004a04() seems to be part of a hash table implementation and its decompilation contains the following line:

iVar1 = *(int *)(&DAT_0001de20 + (uVar2 & 0xff) * 4);
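The index arithmetic in that line puts a bound on how large the table must be. Here is a small sketch that spells it out; the constant and function names are mine, not recovered from aln.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    BUCKET_COUNT = 256,                      /* possible values of (hash & 0xff) */
    BUCKET_SIZE = 4,                         /* one 32-bit pointer per bucket    */
    TABLE_BYTES = BUCKET_COUNT * BUCKET_SIZE /* 1024 bytes in total              */
};

/* Byte offset into the table selected by a hash value, mirroring the
 * `(uVar2 & 0xff) * 4` expression from the decompilation. */
size_t bucket_offset(uint32_t hash)
{
    size_t offset = (size_t)(hash & 0xffu) * BUCKET_SIZE;

    /* Every reachable bucket must fit inside the table. */
    assert(offset + BUCKET_SIZE <= TABLE_BYTES);
    return offset;
}
```

The highest reachable offset is 1020, so the code can touch any byte from DAT_0001de20 up to DAT_0001de20 + 1023.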

From this, we can assume that DAT_0001de20 is an array of 256 elements, each 4 bytes wide, for a total of 1024 bytes. Say, what does that array contain? Let’s dump it in the working aln.elf first, then in aln.static.elf:

(gdb) info address DAT_0001de20
Symbol "DAT_0001de20" is at 0x8065e20 in a file compiled without debugging.
(gdb) x/256wx 0x8065e20
0x8065e20 <DAT_0001de20>:       0x00000000      0x00000000      0x00000000      0x00000000
0x8065e30:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065e40:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065e50:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065e60:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065e70:      0x00000000      0x08078ee0      0x00000000      0x00000000
0x8065e80:      0x00000000      0x00000000      0x00000000      0x08078e20
0x8065e90:      0x00000000      0x08078da0      0x00000000      0x08078f40
0x8065ea0:      0x00000000      0x00000000      0x08078ea0      0x00000000
0x8065eb0:      0x08078ec0      0x00000000      0x00000000      0x00000000
0x8065ec0:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065ed0:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065ee0:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065ef0:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065f00:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065f10:      0x00000000      0x00000000      0x08078d20      0x00000000
0x8065f20:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065f30:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065f40:      0x08078de0      0x00000000      0x00000000      0x00000000
0x8065f50:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065f60:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065f70:      0x00000000      0x00000000      0x00000000      0x08078e00
0x8065f80:      0x00000000      0x08078fa0      0x00000000      0x00000000
0x8065f90:      0x00000000      0x00000000      0x00000000      0x08078e80
0x8065fa0:      0x00000000      0x00000000      0x08078d40      0x08078fe0
0x8065fb0:      0x08078e60      0x00000000      0x08078000      0x00000000
0x8065fc0:      0x00000000      0x00000000      0x00000000      0x00000000
0x8065fd0:      0x00000000      0x00000000      0x00000000      0x08078f00
0x8065fe0:      0x08078dc0      0x00000000      0x00000000      0x00000000
0x8065ff0:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066000:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066010:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066020:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066030:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066040:      0x00000000      0x00000000      0x00000000      0x08078f20
0x8066050:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066060:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066070:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066080:      0x00000000      0x08078fc0      0x00000000      0x00000000
0x8066090:      0x00000000      0x00000000      0x00000000      0x00000000
0x80660a0:      0x00000000      0x00000000      0x00000000      0x00000000
0x80660b0:      0x00000000      0x00000000      0x00000000      0x00000000
0x80660c0:      0x08078e40      0x00000000      0x00000000      0x00000000
0x80660d0:      0x00000000      0x00000000      0x00000000      0x00000000
0x80660e0:      0x00000000      0x00000000      0x00000000      0x00000000
0x80660f0:      0x00000000      0x00000000      0x00000000      0x08078f60
0x8066100:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066110:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066120:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066130:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066140:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066150:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066160:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066170:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066180:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066190:      0x00000000      0x00000000      0x00000000      0x00000000
0x80661a0:      0x00000000      0x00000000      0x00000000      0x00000000
0x80661b0:      0x00000000      0x00000000      0x00000000      0x00000000
0x80661c0:      0x00000000      0x00000000      0x00000000      0x08078d60
0x80661d0:      0x00000000      0x00000000      0x00000000      0x00000000
0x80661e0:      0x00000000      0x00000000      0x00000000      0x00000000
0x80661f0:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066200:      0x00000000      0x00000000      0x00000000      0x00000000
0x8066210:      0x00000000      0x00000000      0x00000000      0x00000000
(gdb) info address DAT_0001de20
Symbol "DAT_0001de20" is at 0x8118600 in a file compiled without debugging.
(gdb) x/256wx 0x8118600
0x8118600 <DAT_0001de20>:       0x00000000      0x00000000      0x00000000      0x00000000
0x8118610:      0x00000000      0x00000000      0x00000000      0x00000000
0x8118620:      0x00000000      0x00000000      0x00000000      0x00000000
0x8118630:      0x00000000      0x00000000      0x00000000      0x00000000
0x8118640:      0x00000000      0x00000000      0x00000000      0x00000000
0x8118650:      0x00000000      0x08124060      0x00000000      0x00000000
0x8118660:      0x00000000      0x00000000      0x00000000      0x08124040
0x8118670:      0x00000000      0x081245b0      0x00000000      0x08124590
0x8118680:      0x00000000      0x00000000      0x08124140      0x00000000
0x8118690:      0x08124120      0x00000000      0x00000000      0x00000000
0x81186a0:      0x00000000      0x00000000      0x00000000      0x00000000
0x81186b0:      0x00000000      0x00000000      0x00000000      0x00000000
0x81186c0:      0x00000000      0x00000000      0x00000000      0x00000000
0x81186d0:      0x00000000      0x00000000      0x00000000      0x00000000
0x81186e0:      0x00000000      0x00000000      0x00000000      0x00000000
0x81186f0:      0x00000000      0x00000000      0x08124080      0x00000000
0x8118700:      0x00000000      0x00000000      0x00000000      0x00000000
0x8118710:      0x00000000      0x00000000      0x00000000      0x00000000
0x8118720:      0x08124570      0x00000000      0x00000000      0x00000000
0x8118730:      0x00000000      0x00000000      0x00000000      0x00000000
0x8118740:      0x00000000      0x00000000      0x00000000      0x00000000
0x8118750:      0x00000000      0x00000000      0x00000000      0x08123fa0
0x8118760:      0x00000000      0x081245d0      0x00000000      0x00000000
0x8118770:      0x00000000      0x00000000      0x00000000      0x081241a0
0x8118780:      0x00000000      0x00000000      0x08124000      0x08124610
0x8118790:      0x08123f60      0x00000000      0x08123fc0      0x00000000
0x81187a0:      0x00000000      0x00000000      0x00000000      0x00000000
0x81187b0:      0x00000000      0x00000000      0x00000000      0x08124180
0x81187c0:      0x08123fe0      0x00000000      0x00000000      0x00000000
0x81187d0:      0x00000000      0x00000000      0x00000000      0x00000000
0x81187e0 <_ctype_b>:   0x080588e2      0x08058ae3      0x08058be4      0x080e32bc
0x81187f0 <__exit_funcs>:       0x0811a8e0      0x080e1b40      0x08118800      0x00000000
0x8118800 <_IO_2_1_stderr_>:    0xfbad2086      0x00000000      0x00000000      0x00000000
0x8118810 <_IO_2_1_stderr_+16>: 0x00000000      0x00000000      0x00000000      0x00000000
0x8118820 <_IO_2_1_stderr_+32>: 0x00000000      0x00000000      0x00000000      0x08123f40
0x8118830 <_IO_2_1_stderr_+48>: 0x00000000      0x08118940      0x00000002      0x00000000
0x8118840 <_IO_2_1_stderr_+64>: 0xffffffff      0x00000000      0x0811ab04      0xffffffff
0x8118850 <_IO_2_1_stderr_+80>: 0xffffffff      0x00000000      0x081188a0      0x00000000
0x8118860 <_IO_2_1_stderr_+96>: 0x00000000      0x081241c0      0x00000000      0x00000000
0x8118870 <_IO_2_1_stderr_+112>:        0x00000000      0x00000000      0x00000000      0x00000000
0x8118880 <_IO_2_1_stderr_+128>:        0x00000000      0x00000000      0x00000000      0x00000000
0x8118890 <_IO_2_1_stderr_+144>:        0x00000000      0x08119980      0x00000000      0x00000000
0x81188a0 <_IO_wide_data_2>:    0x08124020      0x00000000      0x00000000      0x00000000
0x81188b0 <_IO_wide_data_2+16>: 0x00000000      0x00000000      0x00000000      0x00000000
0x81188c0 <_IO_wide_data_2+32>: 0x00000000      0x00000000      0x00000000      0x00000000
0x81188d0 <_IO_wide_data_2+48>: 0x00000000      0x00000000      0x00000000      0x08124160
0x81188e0 <_IO_wide_data_2+64>: 0x00000000      0x00000000      0x00000000      0x00000000
0x81188f0 <_IO_wide_data_2+80>: 0x00000000      0x00000000      0x00000000      0x00000000
0x8118900 <_IO_wide_data_2+96>: 0x00000000      0x00000000      0x00000000      0x00000000
0x8118910 <_IO_wide_data_2+112>:        0x00000000      0x00000000      0x00000000      0x00000000
0x8118920 <_IO_wide_data_2+128>:        0x00000000      0x00000000      0x08119860      0x00000000
0x8118930:      0x00000000      0x00000000      0x00000000      0x00000000
0x8118940 <_IO_2_1_stdout_>:    0xfbad2a84      0x0811cdb0      0x0811cdb0      0x0811cdb0
0x8118950 <_IO_2_1_stdout_+16>: 0x0811cdb0      0x0811cdb0      0x0811cdb0      0x0811cdb0
0x8118960 <_IO_2_1_stdout_+32>: 0x0811d1b0      0x00000000      0x00000000      0x00000000
0x8118970 <_IO_2_1_stdout_+48>: 0x00000000      0x08118a80      0x00000001      0x00000000
0x8118980 <_IO_2_1_stdout_+64>: 0xffffffff      0x00000000      0x0811ab10      0xffffffff
0x8118990 <_IO_2_1_stdout_+80>: 0xffffffff      0x00000000      0x081189e0      0x00000000
0x81189a0 <_IO_2_1_stdout_+96>: 0x00000000      0x00000000      0xffffffff      0x081245f0
0x81189b0 <_IO_2_1_stdout_+112>:        0x00000000      0x00000000      0x00000000      0x00000000
0x81189c0 <_IO_2_1_stdout_+128>:        0x00000000      0x00000000      0x00000000      0x00000000
0x81189d0 <_IO_2_1_stdout_+144>:        0x00000000      0x08119980      0x00000000      0x00000000
0x81189e0 <_IO_wide_data_1>:    0x00000000      0x00000000      0x00000000      0x00000000
0x81189f0 <_IO_wide_data_1+16>: 0x00000000      0x00000000      0x00000000      0x00000000

Whereas the array in aln.elf seems in good order, inside aln.static.elf it appears to be truncated to 480 bytes (0x81187e0 - 0x8118600 = 0x1e0), with unrelated variables such as _ctype_b and _IO_2_1_stderr_ following it. However, aln’s code still assumes it is 1024 bytes long, so these variables were overwritten until that memory corruption led to a segmentation fault.

How come DAT_0001de20 was truncated in aln.static.elf and not in aln.elf? That array was originally located inside aln from 0x1de20 to 0x1e21f. However, the .data section stops at 0x1dfff and the .bss section begins at 0x1e000. As far as I can tell, that variable straddles the two sections.

I might not give a damn about computer engineering conventions in this series of articles, but apparently neither did toolchains in the 90s.

(╯°□°)╯︵ ┻━┻

When the aln.o object file got exported, these two sections became untethered. Later on, the linker didn’t place these bits next to each other, splitting that variable in two, effectively truncating it down to 480 bytes.
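The 480-byte figure falls straight out of the addresses. As a quick sanity check, here is a sketch (the helper name is mine) that computes how much of an object lies below a section boundary:

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes of the object [start, start + size) that lie below
 * `boundary`, i.e. in the lower of two adjacent sections. */
uint32_t bytes_below_boundary(uint32_t start, uint32_t size, uint32_t boundary)
{
    assert(size <= UINT32_MAX - start); /* no address wrap-around */
    if (boundary <= start)
        return 0;
    if (boundary >= start + size)
        return size;
    return boundary - start;
}
```

For an object starting at 0x1de20, 1024 bytes long, with the boundary at 0x1e000, this gives 480 bytes on the .data side and leaves the remaining 544 bytes on the .bss side.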

That was a fun one to track down.

Work, goddammit!

So, let’s excise that problematic array from aln.o and substitute it with a clean stub:

$ cat DAT_0001de20.c
int DAT_0001de20[256];
$ i686-linux-gnu-gcc -g -static -no-pie -fno-pic -o aln.static.elf aln.glibc20.o ctype.o FUN_0000fba0.c DAT_0001de20.c

We don’t care about the type of DAT_0001de20; we only want it to be at least 1024 bytes long, which 256 ints are on i386. Unlike a compiler, a traditional linker only processes symbol names and doesn’t care about typing information.
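To make the point about types concrete, here is a sketch with two differently typed declarations that occupy the same number of bytes. They carry distinct, made-up names so that the file compiles on its own; in the real build only one definition may be named DAT_0001de20. int32_t is used instead of int so that the size doesn’t depend on the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two candidate stubs: different element types, same footprint. */
int32_t stub_as_words[256];
unsigned char stub_as_bytes[1024];

/* Checked at compile time: both cover the 1024 bytes aln's code indexes. */
static_assert(sizeof(stub_as_words) == 1024, "stub must cover all 256 buckets");
static_assert(sizeof(stub_as_bytes) == sizeof(stub_as_words), "same footprint, different type");

/* Size in bytes the linker would reserve for the stub. */
size_t stub_footprint(void)
{
    return sizeof(stub_as_words);
}
```

Either definition would satisfy the undefined reference in aln.o, since the linker matches on the name and merely reserves the requested number of bytes.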

Please work this time…

$ rm -f jaghello.cof && aln.static.elf -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
***********************************
*   ATARI LINKER (Mar 17 1995)    *
*  Adds from Atari version 1.11   *
*     and PC/DOS&Linux ports      *
*  Copyright 1993-95 Brainstorm.  *
** Copyright 1987-95 Atari Corp. **
***********************************
Output file is jaghello.cof

Sizes:   Text   Data    Bss   Syms
(hex)     3C0    410   FA54   1C00

Link complete.
$ sha256sum jaghello.cof
f9c8269cdc998de01c0ac7a3e815c16b7ced106e25f10f92a7078c722a220dbb  jaghello.cof

Finally, aln lives on with a transplanted C standard library from the 21st century and it only took the better part of my sanity.

Despite all this nonsense, it wasn’t as bad as it could have been. In particular, aln doesn’t use errno, which was an ordinary global variable in the old days and is a thread-local one nowadays. These two storage schemes are mutually incompatible.

There are ways to work around this problem, but thankfully I didn’t have to deal with this nonsense here.
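For the curious, the difference between the two schemes can be sketched as follows. The names are invented for the example; the only glibc-specific fact relied upon is that its errno.h defines errno as (*__errno_location ()), which this mimics using C11 thread-local storage.

```c
#include <assert.h>

/* Old scheme: errno is one ordinary global shared by the whole process.
 * Compiled code refers to its address directly through the symbol. */
int old_style_errno;

/* Modern scheme: each thread owns a copy, reached through a function call. */
static _Thread_local int per_thread_errno;

int *my_errno_location(void)
{
    return &per_thread_errno;
}

/* Stand-in for the errno macro of a modern C library. */
#define my_errno (*my_errno_location())

/* Example of a function reporting failure through the modern scheme. */
int fail_with(int code)
{
    assert(code != 0);
    my_errno = code;
    return -1;
}
```

Old code that reads a global directly never sees values stored through the function-based accessor, which is why the two schemes can’t simply be bridged by renaming a symbol.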

Dynamic library displacement

We’ve swapped a C standard library with another, how about removing it altogether from the executable? We can do so by dynamically linking it:

$ i686-linux-gnu-gcc -g -no-pie -fno-pic -o aln.dynamic.elf aln.glibc20.o ctype.o FUN_0000fba0.c DAT_0001de20.c
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.glibc20.o: warning: relocation against `_IO_2_1_stderr_@@GLIBC_2.1' in read-only section `.text'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: warning: creating DT_TEXTREL in a PIE
$ file aln.dynamic.elf 
aln.dynamic.elf: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=d6aabe004514845f6530384228f12741aa87dbc1, for GNU/Linux 3.2.0, with debug_info, not stripped

The linker isn’t very happy about it, but it did spit out an executable. Let’s try it out:

$ rm -f jaghello.cof && aln.dynamic.elf -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
***********************************
*   ATARI LINKER (Mar 17 1995)    *
*  Adds from Atari version 1.11   *
*     and PC/DOS&Linux ports      *
*  Copyright 1993-95 Brainstorm.  *
** Copyright 1987-95 Atari Corp. **
***********************************
Output file is jaghello.cof

Sizes:   Text   Data    Bss   Syms
(hex)     3C0    410   FA54   1C00

Link complete.
$ sha256sum jaghello.cof
f9c8269cdc998de01c0ac7a3e815c16b7ced106e25f10f92a7078c722a220dbb  jaghello.cof

It worked out of the box? Well, after dealing with all these shenanigans, it had better!

The files for this case study can be found here: case-study.tar.gz

Conclusion

We have delinked aln back into a standard object file without its C standard library, then produced both statically and dynamically linked versions of aln using contemporary versions of glibc. Next time, we’ll attempt our most ambitious port yet: turning aln from a Linux program into a native Windows executable.