Just a nifty little executable format...

Posted: Tue Feb 24, 2015 5:46 pm
by SoulofDeity
I wanted to use relocatable object files as modules for a project I'm working on, but didn't feel up to writing an ELF loader. It's supposed to be cross-platform anyway, so ELF-specific loading is a pain. I just wanted a lightweight container that I could throw into RAM and make executable with mprotect. The format I'm going with is based on the Zelda64 overlay format. It's basically a raw dump of the .text, .data, and .rodata sections, followed by a table listing the size of each section (plus the .bss), followed by the number of relocations and the relocation table. There are no symbols, and the last word of the file is a relative pointer to the size table: the distance in bytes from the end of the file back to the start of the table.
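
To make the layout concrete, here is a rough sketch of reading the trailer back out of a finished file. The function names `read_word` and `actor_info` are made up for illustration, and it assumes the 32-bit big-endian words that the conversion script writes with printf "%08x".

```shell
#!/bin/sh
# Hypothetical helpers (not part of the conversion script): print the size
# table of an actor file. Assumes 32-bit big-endian words.

# Print the 32-bit big-endian word at byte offset $2 of file $1 in decimal.
read_word ()
{
  hex="$(dd if="$1" bs=1 skip="$2" count=4 2>/dev/null |
         od -An -tx1 | tr -d ' \n')"
  echo $((0x$hex))
}

# Follow the relative pointer in the last word back to the size table.
actor_info ()
{
  file="$1"
  size="$(wc -c < "$file" | tr -d ' ')"
  # The last word is the distance from the end of the file to the table.
  rel="$(read_word "$file" $((size - 4)))"
  table=$((size - rel))
  echo "text=$(read_word "$file" "$table")"
  echo "data=$(read_word "$file" $((table + 4)))"
  echo "rodata=$(read_word "$file" $((table + 8)))"
  echo "bss=$(read_word "$file" $((table + 12)))"
  echo "relocs=$(read_word "$file" $((table + 16)))"
}
```

Calling `actor_info foo.actor` (with `foo.actor` standing in for any converted file) prints the four section sizes and the relocation count, which is a quick way to sanity-check the converter's output.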

It may not be OS-dev specific, but it's an extremely simple format to implement. I figured some people may find it useful as a temporary stand-in in their projects.

Below is the shell script that does the conversion:

#!/bin/sh

# Display the usage information.
usage ()
{
cat << EOF
Usage: $0 [options] file...
Options:
  -h           Show this message
  -o FILE      Place output into FILE
  -T FILE      Use FILE as linker script
EOF
}

OPTIND=1
while getopts "ho:T:" OPTION
do
  case $OPTION in
    h)
      usage
      exit 0
      ;;
    o)
      export OUTPUT="$OPTARG"
      ;;
    T)
      export LDSCRIPT="$OPTARG"
      ;;
  esac
done

if [ "$OUTPUT" = "" ]
then
  echo "$0: no output file specified" >&2
  echo "$0: use the -h option for usage information" >&2
  exit 1
else
  export LDOUT="`echo "$OUTPUT" | sed 's/\.actor/.o/'`"
fi

if [ "$LDSCRIPT" = "" ]
then
  export LDSCRIPT="actor.x"
fi

shift $((OPTIND - 1))

# Compile and link the code for the actor.
cc -c -o "$LDOUT.1" "$@" \
  -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables
ld -r -o "$LDOUT" -T "$LDSCRIPT" "$LDOUT.1"

# Remove the old file for the actor if it exists and create a blank file.
rm -f "$OUTPUT" "$LDOUT.1"
touch "$OUTPUT"

# Dump the .text, .data, and .rodata sections into the actor.
objdump -s -j ".text" "$LDOUT"         |
  grep '^ [0-9a-f]'                    |  # Grab only hex dump lines.
  sed -r 's/^ +[0-9a-f]+ +//;s/  .*//' |  # Remove offsets and ASCII.
  sed ':a;N;$!ba;s/\n/ /g;s/ //g'      |  # Remove whitespace.
  xxd -r -p - >> "$OUTPUT"                # Convert to binary code.
export TEXTSIZE="`stat -c%s \"$OUTPUT\"`"
objdump -s -j ".data" "$LDOUT"         |
  grep '^ [0-9a-f]'                    |  # Grab only hex dump lines.
  sed -r 's/^ +[0-9a-f]+ +//;s/  .*//' |  # Remove offsets and ASCII.
  sed ':a;N;$!ba;s/\n/ /g;s/ //g'      |  # Remove whitespace.
  xxd -r -p - >> "$OUTPUT"                # Convert to binary code.
export DATASIZE="`stat -c%s \"$OUTPUT\"`"
export DATASIZE="`expr $DATASIZE - $TEXTSIZE`"
objdump -s -j ".rodata" "$LDOUT"       |
  grep '^ [0-9a-f]'                    |  # Grab only hex dump lines.
  sed -r 's/^ +[0-9a-f]+ +//;s/  .*//' |  # Remove offsets and ASCII.
  sed ':a;N;$!ba;s/\n/ /g;s/ //g'      |  # Remove whitespace.
  xxd -r -p - >> "$OUTPUT"                # Convert to binary code.
export TABLEOFFSET="`stat -c%s \"$OUTPUT\"`"
export RODATASIZE="`expr $TABLEOFFSET - $DATASIZE - $TEXTSIZE`"

# Get the size of the .bss section and append the size table to the file.
export BSSSIZE="`size -A \"$LDOUT\" |
                 grep .bss | sed -r 's/\.bss *//;s/ [0-9]+//'`"
printf "%08x%08x%08x%08x" $TEXTSIZE $DATASIZE $RODATASIZE $BSSSIZE |
  xxd -r -p - >> "$OUTPUT"

# Write the number of relocations
export TEXTRC="`objdump -r -j \".text\" \"$LDOUT\" |
                grep '^[0-9]' | wc -l`"
export DATARC="`objdump -r -j \".data\" \"$LDOUT\" |
                grep '^[0-9]' | wc -l`"
export RODATARC="`objdump -r -j \".rodata\" \"$LDOUT\" |
                  grep '^[0-9]' | wc -l`"
export BSSRC="`objdump -r -j \".bss\" \"$LDOUT\" |
               grep '^[0-9]' | wc -l`"
printf "%08x" `expr $TEXTRC + $DATARC + $RODATARC + $BSSRC` |
  xxd -r -p - >> "$OUTPUT"

# ---
# Eventually, I'll need to canonicalize the relocations and insert them
# into the file here; note: they can be listed with 'objdump -r'.
# ---

# Write a relative pointer to the size table at the end of the file.
export ENDOFFSET="`stat -c%s \"$OUTPUT\"`"
export RELATIVEOFFSET="`expr $ENDOFFSET - $TABLEOFFSET + 4`"
printf "%08x" $RELATIVEOFFSET | xxd -r -p - >> "$OUTPUT"

# Print the location of the symbol for the actor data.
echo `nm -g "$LDOUT" | grep '\s*actor_data$'`

The only real difference between this format and the original overlay format is the lack of section padding. It could have been a lot simpler if the GNU developers hadn't removed the '-k' option from objdump. The relocations are going to take a bit more work; I still need to figure out what the types listed by 'objdump -r' mean. Also, it's normal if your files come out really fricken tiny. I fed a 17.5 KiB file into it for testing and got a 60-byte file out of it.
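
As a possible starting point for the relocation work, the sketch below flattens 'objdump -r' output into one line per relocation. The function name `list_relocs` is hypothetical, and nothing here defines how the entries would be encoded in the actor file. The type column holds names that are specific to the target architecture (R_X86_64_PC32, for example, on x86-64 ELF), so their meaning has to be looked up in that target's ABI documentation.

```shell
#!/bin/sh
# Hypothetical helper: summarize 'objdump -r' output read from stdin as
# "section offset type value", one relocation per line.
list_relocs ()
{
  awk '
    # A header such as "RELOCATION RECORDS FOR [.text]:" names the section.
    /^RELOCATION RECORDS FOR / {
      section = $4
      sub(/^\[/, "", section)
      sub(/\]:$/, "", section)
      next
    }
    # Record lines start with a hex offset; this skips the column header.
    NF >= 3 && $1 ~ /^[0-9a-fA-F]+$/ { print section, $1, $2, $3 }
  '
}
```

It would be used as `objdump -r "$LDOUT" | list_relocs`. Grouping by section like this should make it easier to see which relocation types actually occur before deciding on a table layout.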

Re: Just a nifty little executable format...

Posted: Wed Feb 25, 2015 1:35 am
by alexfru
You seem to be reinventing the a.out format. :)

Re: Just a nifty little executable format...

Posted: Wed Feb 25, 2015 10:59 am
by SoulofDeity
alexfru wrote: You seem to be reinventing the a.out format. :)
It's actually a format created by Nintendo for their actors in Zelda64 games. The project I'm working on needs to have external modules, but I don't want to use a scripting language or build my modules as shared libraries. The script here hackishly uses objdump, size, and nm to extract the raw data for sections and relocations and dump them into this format. This way, it can be used to convert other formats like COFF.

Since there are no symbols, I'll be using objdump to export the symbols in the main program and include them in the linker script used by the actors. This way, only internal relocations have to be made.
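
A minimal sketch of that export step, under a few assumptions: it uses 'nm -g' rather than objdump because its three-column output is easier to parse, the function name `export_symbols` is made up, and the main program is assumed to be linked at a fixed address so the values printed by nm are the run-time addresses.

```shell
#!/bin/sh
# Hypothetical helper: turn 'nm -g' output (read from stdin) into linker
# script assignments of the form "name = 0xADDRESS;".
export_symbols ()
{
  awk '
    # Keep defined global text/data/rodata/bss symbols; undefined ones
    # have no address column and are skipped.
    NF == 3 && $2 ~ /^[TDRB]$/ { printf "%s = 0x%s;\n", $3, $1 }
  '
}
```

Something like `nm -g main_program | export_symbols > exports.x` would then produce a file that the actor linker script can pull in with an INCLUDE command (`main_program` and `exports.x` are placeholder names).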