Porting the Atari Jaguar SDK part 4: where we're going, we don't need the C standard library
Previously in this series of articles, we ported aln as a whole from a Linux a.out executable to a modern Linux ELF executable, aln.elf.
However, that executable still contains the old glibc 1.xx C standard library aln was originally built with.
If we are to make more ambitious ports of aln, we need to get rid of it.
Static library switcheroo
This time, rather than just repackaging aln from the a.out file format to ELF, we’ll swap out the C standard library aln was statically linked with for a contemporary glibc 2.xx one.
To do so, instead of exporting the whole of aln as an object file (aln.whole.o) like we did previously, we’ll export aln without its C standard library bits (aln.o), then link it as if it were a normal, everyday object file.
We’ll separate the different components of aln inside of a new program tree.
After creating folders and memory fragments, then triaging the program bits by dragging and dropping selections of the program into memory fragments, we end up with the following pieces:
Slicing up aln in that manner makes it easy to export each piece by right-clicking on it and selecting Select Addresses, then exporting the selection as an object file with my Ghidra extension, like in the previous part.
After exporting aln.o, we can inspect the undefined symbols for this object file with nm:
$ i686-linux-gnu-nm --undefined-only aln.o
U calloc
U close
U _ctype_b
U _ctype_tolower
U _ctype_toupper
U DAT_0001de20
U exit
U free
U FUN_0000fba0
U getenv
U index
U _IO_fflush
U _IO_fprintf
U _IO_gets
U _IO_printf
U longjmp
U lseek
U malloc
U memmove
U memset
U open
U puts
U read
U realloc
U rindex
U scanf
U _setjmp
U sprintf
U stderr
U stdout
U strcat
U strcmp
U strcpy
U strncmp
U strncpy
U write
If we can provide compatible replacements for every undefined symbol in this file, then aln.o will link successfully and should run, regardless of the provenance of these replacements.
What could possibly go wrong?
For starters, let’s link aln.o statically as if it were a normal, everyday object file:
$ i686-linux-gnu-gcc -static -o aln.static.elf aln.o
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o: in function `FUN_00002824':
aln.o:(.text+0x1867): undefined reference to `_ctype_toupper'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:(.text+0x1a42): undefined reference to `_ctype_toupper'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:(.text+0x1b22): undefined reference to `_ctype_toupper'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:(.text+0x1c02): undefined reference to `_ctype_toupper'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:(.text+0x2194): undefined reference to `_ctype_toupper'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:aln.o:(.text+0x2572): more undefined references to `_ctype_toupper' follow
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o: in function `main':
aln.o:(.text+0x3123): undefined reference to `FUN_0000fba0'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o: in function `FUN_00004bc4':
aln.o:(.text+0x3bce): undefined reference to `_ctype_b'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o: in function `FUN_00004c74':
aln.o:(.text+0x3c61): undefined reference to `_ctype_tolower'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.o:(.text+0x3c7e): undefined reference to `_ctype_b'
collect2: error: ld returned 1 exit status
make: *** [Makefile:11: aln.static.elf] Error 1
Well, it wouldn’t be worthy of a blog article if it was that simple…
Just put some Flex TAPE® stubs on it
We have several undefined references to fix:
- _ctype_toupper, _ctype_tolower and _ctype_b are global variables, part of glibc 1.xx’s implementation of ctype.h;
- FUN_0000fba0 is, as far as I can tell, an internal initialization function of the C standard library that is called from main for some reason.
The first issue occurs because we’re trying to use glibc 2.xx from the host Linux system, yet aln was originally built with glibc 1.xx; what we see here are the first signs of ABI mismatches, where the expectations of the original program don’t line up with its new environment.
Instead of fixing this at the source (which we don’t have), we’ll cheese it by stubbing out FUN_0000fba0 and borrowing the reconstructed ctype.o from aln into aln.elf:
$ cat FUN_0000fba0.c
void FUN_0000fba0() {
}
$ i686-linux-gnu-gcc -static -o aln.static.elf aln.o ctype.o FUN_0000fba0.c
$ file aln.static.elf
aln.static.elf: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), statically linked, BuildID[sha1]=004086f9962fe01ffdf12dab5f3c264773cf922b, for GNU/Linux 3.2.0, with debug_info, not stripped
We have a successful link!
We’re not aiming for purity here but rather pragmatism: as long as aln can be tricked into running successfully, anything goes.
Now that we have an aln.static.elf file, let’s try something simple like printing out the help message with debug mode on:
$ ./aln.static.elf -z -?
Option `?'
Segmentation fault (core dumped)
Hmm… It managed to write out some output, but it crashed fairly quickly.
You say “Application Binary Interface”, I say [ɛ̃tɛɾfasə daplikasjõ binɛɾə]
Let’s see what’s happening with GDB:
$ gdb-multiarch --args ./aln.static.elf -z -?
...
Reading symbols from ./aln.static.elf...
(gdb) run
Starting program: /home/boricj/Documents/atari-sdk-elf/aln.static.elf -z -\?
Option `?'
Program received signal SIGSEGV, Segmentation fault.
0x0806d6e8 in fflush ()
(gdb) backtrace
#0 0x0806d6e8 in fflush ()
#1 0x0804b6b6 in ?? ()
#2 0x0804d005 in ?? ()
#3 0x08059528 in __libc_start_main ()
#4 0x08049ce2 in _start ()
(gdb)
GDB isn’t telling us much, but we have a thread to pull on, fflush():
(gdb) break fflush
Breakpoint 1 at 0x806d674
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/boricj/Documents/atari-sdk-elf/aln.static.elf -z -\?
Option `?'
Breakpoint 1, 0x0806d674 in fflush ()
(gdb) x/4wx $ebp
0xffffd188: 0xffffd1d0 0x0804b6b6 0x08118bb0 0xffffd314
(gdb) info symbol 0x08118bb0
stdout in section .data of /home/boricj/Documents/atari-sdk-elf/aln.static.elf
(gdb) x/wx &stdout
0x8118bb0 <stdout>: 0x08118940
(gdb) x/wx (void*)stdout
0x8118940 <_IO_2_1_stdout_>: 0xfbad2a84
GDB isn’t providing much information, but we can recover what we need by hand.
With the standard calling convention on i386, the arguments to a function are passed on the stack.
They can be inspected using the frame pointer register, starting at $ebp+8.
fflush() takes a single FILE* argument and we can deduce that its raw value here is 0x08118bb0, a pointer to stdout.
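To make the stack layout concrete, here is a minimal sketch in C that reads the first argument out of the four words shown by the x/4wx $ebp dump above. The helper stack_arg() and the enum names are hypothetical, made up for illustration; the layout assumes a conventional frame set up with push %ebp; mov %esp,%ebp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word offsets from the frame pointer in a conventional i386 stack frame. */
enum {
    SAVED_EBP_WORD   = 0, /* $ebp+0: caller's saved frame pointer */
    RETURN_ADDR_WORD = 1, /* $ebp+4: return address               */
    FIRST_ARG_WORD   = 2  /* $ebp+8: first stack argument         */
};

/* Hypothetical helper: returns the n-th (0-based) stack argument,
 * given the 32-bit words starting at $ebp. */
static uint32_t stack_arg(const uint32_t *words_at_ebp, size_t n)
{
    assert(words_at_ebp != NULL);
    return words_at_ebp[FIRST_ARG_WORD + n];
}

/* The four words printed by `x/4wx $ebp` in the GDB session above. */
static const uint32_t dump_at_ebp[4] = {
    0xffffd1d0u, 0x0804b6b6u, 0x08118bb0u, 0xffffd314u
};
```

Calling stack_arg(dump_at_ebp, 0) yields 0x08118bb0, and the word at RETURN_ADDR_WORD matches frame #1 of the earlier backtrace.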
Looking inside glibc’s source code, we can see that the opaque FILE structure from the C standard library is a typedef for the _IO_FILE structure private to glibc.
This structure begins with a flags word whose upper 16 bits hold the magic number 0xFBAD.
The 32-bit word at 0x8118940 <_IO_2_1_stdout_> contains it, but the 32-bit word at 0x8118bb0 <stdout> is a pointer to the former.
Somehow, things got mixed up where a FILE** value was passed to fflush() instead of a FILE* as expected, which led to the segmentation fault.
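The check done by eye above can be written down as a small C sketch. The constants mirror _IO_MAGIC and _IO_MAGIC_MASK from glibc’s libio.h; the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GLIBC_IO_MAGIC      0xFBAD0000u /* mirrors _IO_MAGIC in libio.h      */
#define GLIBC_IO_MAGIC_MASK 0xFFFF0000u /* mirrors _IO_MAGIC_MASK in libio.h */

/* Returns true if a 32-bit word looks like the first word of a glibc FILE. */
static bool has_file_magic(uint32_t first_word)
{
    return (first_word & GLIBC_IO_MAGIC_MASK) == GLIBC_IO_MAGIC;
}

/* Replays the two words read in the GDB session above. */
static void check_session_words(void)
{
    /* x/wx (void*)stdout: the start of a real FILE structure. */
    assert(has_file_magic(0xfbad2a84u));
    /* x/wx &stdout: merely a pointer to that structure. */
    assert(!has_file_magic(0x08118940u));
}
```

A word that fails this check where a FILE is expected is a strong hint that one level of indirection too many (or too few) was passed.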
To fix this, we need to rename the symbol stdout to _IO_2_1_stdout_ so that fflush() gets called with the correct value.
Rather than adjusting it inside the Ghidra database (whose contents are supposedly correct for an executable with glibc 1.xx), we’ll rename it after the fact with objcopy:
$ cat glibc-1xx-to-glibc-2xx.syms
stdin _IO_2_1_stdin_
stdout _IO_2_1_stdout_
stderr _IO_2_1_stderr_
$ i686-linux-gnu-objcopy --redefine-syms=glibc-1xx-to-glibc-2xx.syms aln.o aln.glibc20.o
$ i686-linux-gnu-gcc -g -static -no-pie -fno-pic -o aln.static.elf aln.glibc20.o ctype.o FUN_0000fba0.c
glibc-1xx-to-glibc-2xx.syms is a text file containing all symbols to redefine, one per line:
Old symbol name | New name (glibc 2.xx) |
---|---|
stdin | _IO_2_1_stdin_ |
stdout | _IO_2_1_stdout_ |
stderr | _IO_2_1_stderr_ |
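Conceptually, the renaming boils down to a name lookup applied to every symbol in the object file. The C sketch below models only that idea with the three pairs from the table; it is not how objcopy is implemented, and redefine_symbol() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct symbol_rename {
    const char *old_name; /* name as exported from the glibc 1.xx era binary */
    const char *new_name; /* name of the matching glibc 2.xx object          */
};

/* The three pairs listed in glibc-1xx-to-glibc-2xx.syms. */
static const struct symbol_rename renames[] = {
    { "stdin",  "_IO_2_1_stdin_"  },
    { "stdout", "_IO_2_1_stdout_" },
    { "stderr", "_IO_2_1_stderr_" },
};

/* Returns the replacement name if one is listed, otherwise the name itself. */
static const char *redefine_symbol(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof renames / sizeof renames[0]; i++) {
        if (strcmp(name, renames[i].old_name) == 0)
            return renames[i].new_name;
    }
    return name;
}
```

Symbols that are not listed, such as malloc, pass through unchanged.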
We’re trying to mend ABIs together that aren’t supposed to be compatible with one another. As noted above, anything goes as long as it runs, no matter how dubious the hacks are.
With that out of the way, let’s see if it made a difference:
$ ./aln.static.elf -z -?
Option `?'
Usage: ./aln.static.elf [-options] <files|-x file|-i[i] <fname> <label>>
Where options are:
?: print this
a <text> <data> <bss>: output absolute file
hex value: segment address
r: relocatable segment
x: contiguous segment
b: don't remove multiply defined local labels
d: wait for key after link
...
No object files to link.
Good, it’s no longer crashing. What about our nominal test case, linking the “Hello, world!” sample from the Atari Jaguar community SDK?
$ rm -f jaghello.cof && make LINK=aln.static.elf V=1
aln.static.elf -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
***********************************
* ATARI LINKER (Mar 17 1995) *
* Adds from Atari version 1.11 *
* and PC/DOS&Linux ports *
* Copyright 1993-95 Brainstorm. *
** Copyright 1987-95 Atari Corp. **
***********************************
Output file is jaghello.cof
make: *** [Makefile:12: jaghello.cof] Segmentation fault (core dumped)
Sigh.
It’s never that simple, is it?
Despair kicks in
Let’s run aln with the debug mode on to see what’s going on:
$ aln.static.elf -z -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
...
***********************************
* ATARI LINKER (Mar 17 1995) *
* Adds from Atari version 1.11 *
* and PC/DOS&Linux ports *
* Copyright 1993-95 Brainstorm. *
** Copyright 1987-95 Atari Corp. **
***********************************
...
bsddoobj(startup.o,0x854f240,<none>)
bsdaddsyms for file startup.o
add_global(gSetOLP,startup.o,80,a200,GLOBAL)
...
add_global(_vidmem,startup.o,50,a100,GLOBAL)
add_unresolved(___main,startup.o)
bsddoobj(jag.o,0x8552420,<none>)
AlignSect: off=0X20, size=0X14C, align=0X10, padd=0x4
AlignSect: off=0X170, size=0X40C, align=0X10, padd=0x4
bsdaddsyms for file jag.o
add_global(_textfont,jag.o,0,a400,GLOBAL)
...
add_global(___main,jag.o,100,a200,GLOBAL)
Initialize symbols
add_local(_TEXT_E,(null))
...
add_global(_BSS_E,(null),0,a100,GLOBAL)
find_global(___main) => global in BSD object jag.o
DOUNRESOLVED
DOCOMMON
add_local(_jagscreen,(null))
DOSYM(startup.o)
add_local(DRAM,(null))
...
add_local(VC,(null))
Segmentation fault (core dumped)
It’s crashing in the middle of the linking process. What can GDB tell us?
$ gdb-multiarch --args aln.static.elf -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
...
Reading symbols from aln.static.elf...
(gdb) r
Starting program: /home/boricj/Documents/atari-sdk-elf/aln.static.elf -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
***********************************
* ATARI LINKER (Mar 17 1995) *
* Adds from Atari version 1.11 *
* and PC/DOS&Linux ports *
* Copyright 1993-95 Brainstorm. *
** Copyright 1987-95 Atari Corp. **
***********************************
Output file is jaghello.cof
Program received signal SIGSEGV, Segmentation fault.
0x0804d801 in FUN_00004a04 ()
(gdb) bt
#0 0x0804d801 in FUN_00004a04 ()
#1 0x0805760f in ?? ()
#2 0x08057922 in ?? ()
#3 0x08057c7a in ?? ()
#4 0x0804d179 in LAB_00004388 ()
#5 0x08118000 in ?? ()
#6 0x08059528 in __libc_start_main ()
#7 0x08049ce2 in _start ()
(gdb)
Oh no.
It’s not crashing inside glibc, it’s crashing deep inside aln.
That means I have no obvious threads to pull on to figure this one out…
I’ll admit, this one had me stumped for quite a long time, chasing one dead end after another. I’ll skip ahead to the resolution.
FUN_00004a04() seems to be part of a hash table implementation and its decompilation contains the following line:
iVar1 = *(int *)(&DAT_0001de20 + (uVar2 & 0xff) * 4);
From this, we can assume that DAT_0001de20 is an array of 256 elements, each 4 bytes wide, for a total of 1024 bytes.
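The size estimate follows directly from the indexing expression, as this small C sketch shows. The names are hypothetical; only the mask and the multiplier come from the decompiled line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    BUCKET_COUNT = 256, /* (uVar2 & 0xff) can take 256 distinct values */
    BUCKET_SIZE  = 4    /* each bucket is read as a 4-byte int         */
};

/* Byte offset into the table for a given hash value, mirroring
 * `(uVar2 & 0xff) * 4` from the decompilation. */
static size_t bucket_byte_offset(uint32_t hash)
{
    return (size_t)(hash & 0xff) * BUCKET_SIZE;
}

/* Total number of bytes the code may touch through that expression. */
static size_t table_size_bytes(void)
{
    size_t last_offset = bucket_byte_offset(0xffu);
    assert(last_offset == (BUCKET_COUNT - 1) * BUCKET_SIZE);
    return last_offset + BUCKET_SIZE;
}
```

The highest reachable offset is 1020, so the last bucket ends exactly at byte 1024.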
Say, what does that array contain, first in aln.elf and then in aln.static.elf?
(gdb) info address DAT_0001de20
Symbol "DAT_0001de20" is at 0x8065e20 in a file compiled without debugging.
(gdb) x/256wx 0x8065e20
0x8065e20 <DAT_0001de20>: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065e30: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065e40: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065e50: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065e60: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065e70: 0x00000000 0x08078ee0 0x00000000 0x00000000
0x8065e80: 0x00000000 0x00000000 0x00000000 0x08078e20
0x8065e90: 0x00000000 0x08078da0 0x00000000 0x08078f40
0x8065ea0: 0x00000000 0x00000000 0x08078ea0 0x00000000
0x8065eb0: 0x08078ec0 0x00000000 0x00000000 0x00000000
0x8065ec0: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065ed0: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065ee0: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065ef0: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065f00: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065f10: 0x00000000 0x00000000 0x08078d20 0x00000000
0x8065f20: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065f30: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065f40: 0x08078de0 0x00000000 0x00000000 0x00000000
0x8065f50: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065f60: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065f70: 0x00000000 0x00000000 0x00000000 0x08078e00
0x8065f80: 0x00000000 0x08078fa0 0x00000000 0x00000000
0x8065f90: 0x00000000 0x00000000 0x00000000 0x08078e80
0x8065fa0: 0x00000000 0x00000000 0x08078d40 0x08078fe0
0x8065fb0: 0x08078e60 0x00000000 0x08078000 0x00000000
0x8065fc0: 0x00000000 0x00000000 0x00000000 0x00000000
0x8065fd0: 0x00000000 0x00000000 0x00000000 0x08078f00
0x8065fe0: 0x08078dc0 0x00000000 0x00000000 0x00000000
0x8065ff0: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066000: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066010: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066020: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066030: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066040: 0x00000000 0x00000000 0x00000000 0x08078f20
0x8066050: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066060: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066070: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066080: 0x00000000 0x08078fc0 0x00000000 0x00000000
0x8066090: 0x00000000 0x00000000 0x00000000 0x00000000
0x80660a0: 0x00000000 0x00000000 0x00000000 0x00000000
0x80660b0: 0x00000000 0x00000000 0x00000000 0x00000000
0x80660c0: 0x08078e40 0x00000000 0x00000000 0x00000000
0x80660d0: 0x00000000 0x00000000 0x00000000 0x00000000
0x80660e0: 0x00000000 0x00000000 0x00000000 0x00000000
0x80660f0: 0x00000000 0x00000000 0x00000000 0x08078f60
0x8066100: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066110: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066120: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066130: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066140: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066150: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066160: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066170: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066180: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066190: 0x00000000 0x00000000 0x00000000 0x00000000
0x80661a0: 0x00000000 0x00000000 0x00000000 0x00000000
0x80661b0: 0x00000000 0x00000000 0x00000000 0x00000000
0x80661c0: 0x00000000 0x00000000 0x00000000 0x08078d60
0x80661d0: 0x00000000 0x00000000 0x00000000 0x00000000
0x80661e0: 0x00000000 0x00000000 0x00000000 0x00000000
0x80661f0: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066200: 0x00000000 0x00000000 0x00000000 0x00000000
0x8066210: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) info address DAT_0001de20
Symbol "DAT_0001de20" is at 0x8118600 in a file compiled without debugging.
(gdb) x/256wx 0x8118600
0x8118600 <DAT_0001de20>: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118610: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118620: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118630: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118640: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118650: 0x00000000 0x08124060 0x00000000 0x00000000
0x8118660: 0x00000000 0x00000000 0x00000000 0x08124040
0x8118670: 0x00000000 0x081245b0 0x00000000 0x08124590
0x8118680: 0x00000000 0x00000000 0x08124140 0x00000000
0x8118690: 0x08124120 0x00000000 0x00000000 0x00000000
0x81186a0: 0x00000000 0x00000000 0x00000000 0x00000000
0x81186b0: 0x00000000 0x00000000 0x00000000 0x00000000
0x81186c0: 0x00000000 0x00000000 0x00000000 0x00000000
0x81186d0: 0x00000000 0x00000000 0x00000000 0x00000000
0x81186e0: 0x00000000 0x00000000 0x00000000 0x00000000
0x81186f0: 0x00000000 0x00000000 0x08124080 0x00000000
0x8118700: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118710: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118720: 0x08124570 0x00000000 0x00000000 0x00000000
0x8118730: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118740: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118750: 0x00000000 0x00000000 0x00000000 0x08123fa0
0x8118760: 0x00000000 0x081245d0 0x00000000 0x00000000
0x8118770: 0x00000000 0x00000000 0x00000000 0x081241a0
0x8118780: 0x00000000 0x00000000 0x08124000 0x08124610
0x8118790: 0x08123f60 0x00000000 0x08123fc0 0x00000000
0x81187a0: 0x00000000 0x00000000 0x00000000 0x00000000
0x81187b0: 0x00000000 0x00000000 0x00000000 0x08124180
0x81187c0: 0x08123fe0 0x00000000 0x00000000 0x00000000
0x81187d0: 0x00000000 0x00000000 0x00000000 0x00000000
0x81187e0 <_ctype_b>: 0x080588e2 0x08058ae3 0x08058be4 0x080e32bc
0x81187f0 <__exit_funcs>: 0x0811a8e0 0x080e1b40 0x08118800 0x00000000
0x8118800 <_IO_2_1_stderr_>: 0xfbad2086 0x00000000 0x00000000 0x00000000
0x8118810 <_IO_2_1_stderr_+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118820 <_IO_2_1_stderr_+32>: 0x00000000 0x00000000 0x00000000 0x08123f40
0x8118830 <_IO_2_1_stderr_+48>: 0x00000000 0x08118940 0x00000002 0x00000000
0x8118840 <_IO_2_1_stderr_+64>: 0xffffffff 0x00000000 0x0811ab04 0xffffffff
0x8118850 <_IO_2_1_stderr_+80>: 0xffffffff 0x00000000 0x081188a0 0x00000000
0x8118860 <_IO_2_1_stderr_+96>: 0x00000000 0x081241c0 0x00000000 0x00000000
0x8118870 <_IO_2_1_stderr_+112>: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118880 <_IO_2_1_stderr_+128>: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118890 <_IO_2_1_stderr_+144>: 0x00000000 0x08119980 0x00000000 0x00000000
0x81188a0 <_IO_wide_data_2>: 0x08124020 0x00000000 0x00000000 0x00000000
0x81188b0 <_IO_wide_data_2+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x81188c0 <_IO_wide_data_2+32>: 0x00000000 0x00000000 0x00000000 0x00000000
0x81188d0 <_IO_wide_data_2+48>: 0x00000000 0x00000000 0x00000000 0x08124160
0x81188e0 <_IO_wide_data_2+64>: 0x00000000 0x00000000 0x00000000 0x00000000
0x81188f0 <_IO_wide_data_2+80>: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118900 <_IO_wide_data_2+96>: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118910 <_IO_wide_data_2+112>: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118920 <_IO_wide_data_2+128>: 0x00000000 0x00000000 0x08119860 0x00000000
0x8118930: 0x00000000 0x00000000 0x00000000 0x00000000
0x8118940 <_IO_2_1_stdout_>: 0xfbad2a84 0x0811cdb0 0x0811cdb0 0x0811cdb0
0x8118950 <_IO_2_1_stdout_+16>: 0x0811cdb0 0x0811cdb0 0x0811cdb0 0x0811cdb0
0x8118960 <_IO_2_1_stdout_+32>: 0x0811d1b0 0x00000000 0x00000000 0x00000000
0x8118970 <_IO_2_1_stdout_+48>: 0x00000000 0x08118a80 0x00000001 0x00000000
0x8118980 <_IO_2_1_stdout_+64>: 0xffffffff 0x00000000 0x0811ab10 0xffffffff
0x8118990 <_IO_2_1_stdout_+80>: 0xffffffff 0x00000000 0x081189e0 0x00000000
0x81189a0 <_IO_2_1_stdout_+96>: 0x00000000 0x00000000 0xffffffff 0x081245f0
0x81189b0 <_IO_2_1_stdout_+112>: 0x00000000 0x00000000 0x00000000 0x00000000
0x81189c0 <_IO_2_1_stdout_+128>: 0x00000000 0x00000000 0x00000000 0x00000000
0x81189d0 <_IO_2_1_stdout_+144>: 0x00000000 0x08119980 0x00000000 0x00000000
0x81189e0 <_IO_wide_data_1>: 0x00000000 0x00000000 0x00000000 0x00000000
0x81189f0 <_IO_wide_data_1+16>: 0x00000000 0x00000000 0x00000000 0x00000000
Whereas aln.elf seems in good order, inside aln.static.elf this array appears to be truncated to 480 bytes, with unrelated variables following it.
However, aln’s code still assumed it was 1024 bytes long, so these variables were overwritten until that memory corruption led to a segmentation fault.
How come DAT_0001de20 was truncated in aln.static.elf and not in aln.elf?
That array was originally located inside aln from 0x1de20 to 0x1e21f.
However, the .data segment stops at 0x1dfff and the .bss segment begins at 0x1e000.
As far as I can tell, that variable is laid across two sections.
I might not give a damn about computer engineering conventions in this series of articles, but apparently neither did toolchains in the 90s.
(╯°□°)╯︵ ┻━┻
When the aln.o object file got exported, these two sections became untethered.
Later on, the linker didn’t place these bits next to each other, splitting that variable in two and effectively truncating it down to 480 bytes.
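The 480-byte figure can be double-checked with a bit of address arithmetic. The C sketch below uses the addresses quoted above; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct split_sizes {
    uint32_t bytes_before; /* part of the object below the boundary       */
    uint32_t bytes_after;  /* part of the object at or above the boundary */
};

/* Splits an object of `size` bytes starting at `start` at the address
 * `boundary`, which must lie within the object. */
static struct split_sizes split_at_boundary(uint32_t start, uint32_t size,
                                            uint32_t boundary)
{
    assert(start <= boundary && boundary <= start + size);
    struct split_sizes s = { boundary - start, start + size - boundary };
    return s;
}

/* Addresses from the original aln a.out executable. */
enum {
    ARRAY_START = 0x1de20, /* DAT_0001de20                */
    ARRAY_SIZE  = 0x400,   /* 256 entries of 4 bytes each */
    BSS_START   = 0x1e000  /* first address of .bss       */
};
```

With these values, split_at_boundary(ARRAY_START, ARRAY_SIZE, BSS_START) gives 480 bytes in .data and 544 bytes in .bss, which matches the 0x1e0 bytes between DAT_0001de20 and _ctype_b in the aln.static.elf dump.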
That was a fun one to track down.
Work, goddammit!
So, let’s excise that problematic array from aln.o and substitute it with a clean stub:
$ cat DAT_0001de20.c
int DAT_0001de20[256];
$ i686-linux-gnu-gcc -g -static -no-pie -fno-pic -o aln.static.elf aln.glibc20.o ctype.o FUN_0000fba0.c DAT_0001de20.c
We don’t care about the type of DAT_0001de20, we only want it to be at least 1024 bytes long.
Unlike a compiler, a traditional linker only processes symbol names and doesn’t care about typing information.
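As an illustration of that point, here is an alternative stub that should satisfy the linker just as well, since only the symbol name and the storage behind it matter. This variant is an assumption of mine and was not part of the case study; it uses a fixed-width element type and a compile-time check so that the size requirement is explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Same symbol name as the stub above, with a fixed-width element type so
 * the size does not depend on how wide `int` is on the build machine. */
uint32_t DAT_0001de20[256];

/* aln indexes 256 buckets of 4 bytes each, so anything smaller than
 * 1024 bytes would bring the memory corruption back. */
_Static_assert(sizeof DAT_0001de20 >= 256 * 4,
               "DAT_0001de20 must be at least 1024 bytes long");
```

If the array were ever declared too small, the build would fail at compile time instead of corrupting memory at run time.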
Please work this time…
$ rm -f jaghello.cof && aln.static.elf -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
***********************************
* ATARI LINKER (Mar 17 1995) *
* Adds from Atari version 1.11 *
* and PC/DOS&Linux ports *
* Copyright 1993-95 Brainstorm. *
** Copyright 1987-95 Atari Corp. **
***********************************
Output file is jaghello.cof
Sizes: Text Data Bss Syms
(hex) 3C0 410 FA54 1C00
Link complete.
$ sha256sum jaghello.cof
f9c8269cdc998de01c0ac7a3e815c16b7ced106e25f10f92a7078c722a220dbb jaghello.cof
Finally, aln lives on with a transplanted C standard library from the 21st century, and it only took the better part of my sanity.
Despite all this nonsense, it wasn’t as bad as it could have been.
In particular, aln doesn’t use errno, which was a plain global variable in the old days and is a thread-local one nowadays.
These two storage schemes are mutually incompatible.
There are ways to work around this problem, but thankfully I didn’t have to deal with this nonsense here.
Dynamic library displacement
We’ve swapped one C standard library for another; how about removing it altogether from the executable? We can do so by dynamically linking it:
$ i686-linux-gnu-gcc -g -no-pie -fno-pic -o aln.dynamic.elf aln.glibc20.o ctype.o FUN_0000fba0.c DAT_0001de20.c
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: aln.glibc20.o: warning: relocation against `_IO_2_1_stderr_@@GLIBC_2.1' in read-only section `.text'
/usr/lib/gcc-cross/i686-linux-gnu/10/../../../../i686-linux-gnu/bin/ld: warning: creating DT_TEXTREL in a PIE
$ file aln.dynamic.elf
aln.dynamic.elf: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=d6aabe004514845f6530384228f12741aa87dbc1, for GNU/Linux 3.2.0, with debug_info, not stripped
The linker isn’t very happy about it, but it did spit out an executable. Let’s try it out:
$ rm -f jaghello.cof && aln.dynamic.elf -e -l -g2 -rd -a 4000 x x -v -o jaghello.cof startup.o jag.o
***********************************
* ATARI LINKER (Mar 17 1995) *
* Adds from Atari version 1.11 *
* and PC/DOS&Linux ports *
* Copyright 1993-95 Brainstorm. *
** Copyright 1987-95 Atari Corp. **
***********************************
Output file is jaghello.cof
Sizes: Text Data Bss Syms
(hex) 3C0 410 FA54 1C00
Link complete.
$ sha256sum jaghello.cof
f9c8269cdc998de01c0ac7a3e815c16b7ced106e25f10f92a7078c722a220dbb jaghello.cof
It worked out of the box? Well, after dealing with all these shenanigans, it had better!
The files for this case study can be found here: case-study.tar.gz
Conclusion
We have delinked aln back into a standard object file without its C standard library, then produced both statically and dynamically linked versions of aln using contemporary versions of glibc.
Next time, we’ll attempt our most ambitious port yet: turning aln from a Linux program into a native Windows executable.