School of ITEE, The University of Queensland

David Ung

Retargetable Loader


David's 1996 Honours thesis, under the supervision of Cristina Cifuentes, involved the development of a specification language for describing the internal structure of binary-file formats, and its use in building retargetable loaders automatically from such specifications.

David implemented a retargetable loader called SRL (simple retargetable loader) which took BFF (binary-file format) specifications for the MS-DOS EXE format, the Solaris ELF format, and the Microsoft Windows NE (new EXE) format.  The DOS version of the SRL loader was integrated, for testing purposes, with the dcc decompiler.  

Abstract

The operating system (OS) loader decodes the object file and creates a memory image when an executable object or binary file needs to be executed. Apart from image creation in operating systems, the loader is also used to extract important file information in some of the more complex machine-code manipulation tools: disassemblers, decompilers, debuggers, binary translators and tracers/profilers. Traditional loaders, like the ones used in the OS, can understand only one type of binary file format (BFF). The ideal model is a generic loader that is capable of understanding several BFFs.

A binary file has a set of attributes that describe its environment: the machine architecture M on which it runs, the operating system OS for that machine, and its binary file format BFF. The notation we use to describe the environment is the tuple (M, OS, BFF).
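As a small illustration of this notation, the tuple can be written as a record type. This is only a sketch: the `Environment` class and its field names are hypothetical and are not taken from the thesis; the three example values are the environments the thesis reports testing.

```python
from typing import NamedTuple


class Environment(NamedTuple):
    """Hypothetical record for the (M, OS, BFF) tuple."""
    machine: str  # M: machine architecture
    os: str       # OS: operating system for that machine
    bff: str      # BFF: binary file format


# The three environments used to test the SRL.
ENVIRONMENTS = [
    Environment("x86", "DOS", "EXE"),
    Environment("x86", "Windows", "NE"),
    Environment("Sparc", "Solaris", "ELF"),
]

# A traditional approach needs one loader per distinct format listed here.
formats = sorted({env.bff for env in ENVIRONMENTS})
```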

Traditionally, we need to write a decoder for every BFF we want to manipulate, i.e., for n different object files or n (M, OS, BFF) tuples, we need to write n different loaders. The idea of a (generic) retargetable loader (RL) is to eliminate the effort needed to create different loaders. The RL is designed to be generic, so that a single loader can understand a wide range of different binary file formats.

This thesis looks into the different approaches for developing an RL and develops a prototype for such a tool. The approach used is based on BFF specifications. Specifications are unambiguous, which makes them well suited to developing an RL based on a BFF grammar. There are a few differences between the grammars used in traditional programming languages and the grammar used for BFFs. In an object file, parts or regions of the file are inter-related. Addresses and segment sizes are usually controlled by definitions found in the file header, and their values are determined only at run-time. Hence, a BFF grammar must be able to re-reference information that was defined earlier in the file.
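The re-referencing requirement can be sketched in a few lines of Python. This is not the SRL specification language, whose syntax is not shown on this page; the toy format, the `TOY_SPEC` table and the `decode` function are all hypothetical. The point it illustrates is that the size of a later region is not known from the specification alone, and is only resolved when the header value is read from the file.

```python
import struct

# Hypothetical toy format (not a real BFF): a 2-byte magic number, a 2-byte
# little-endian code size, then a code segment of that many bytes.
# Each entry is (field name, size). A string size is a back-reference to the
# decoded value of an earlier field.
TOY_SPEC = [
    ("magic", 2),
    ("code_size", 2),
    ("code", "code_size"),
]

# struct format strings for the fixed-size unsigned integers we support.
INT_FORMATS = {1: "<B", 2: "<H", 4: "<I"}


def decode(spec, data):
    """Decode `data` according to `spec`, resolving back-references."""
    fields = {}
    offset = 0
    for name, size in spec:
        dependent = isinstance(size, str)
        if dependent:
            size = fields[size]  # value defined earlier in the file
        chunk = data[offset:offset + size]
        if len(chunk) != size:
            raise ValueError(f"file truncated while reading {name!r}")
        if dependent:
            fields[name] = chunk  # variable-size regions stay as raw bytes
        else:
            fields[name] = struct.unpack(INT_FORMATS[size], chunk)[0]
        offset += size
    return fields


# Build a small in-memory "file"; bytes after the code segment are ignored.
image = b"TF" + struct.pack("<H", 3) + b"\x90\x90\xc3" + b"trailing"
result = decode(TOY_SPEC, image)
```

Supporting another format in this sketch means writing another table like `TOY_SPEC` rather than another decoder, which is the saving a retargetable loader aims for.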

The simple retargetable loader (SRL) is a first attempt to develop an RL with a simple BFF grammar developed by the author. Three environments, (x86, DOS, EXE), (x86, Windows, NE) and (Sparc, Solaris, ELF), were used as the basis for testing the SRL. These three environments give good coverage of the different BFFs currently in use by OSs for RISC and CISC machines. Overall, the structure of ELF is complex, EXE is simple, and NE is in between.

Code

The code base for SRL is not currently available.

Documentation

The Retargetable Loader thesis describes the BFF specification language and the SRL tool that implements support for it (HTML format).

Bibtex entry:


@MASTERSTHESIS{Ung96,
   author = "D. Ung",
   title = "Retargetable Loader",
   school = "The University of Queensland",
   type = "Honours Thesis",
   address = "Department of Computer Science and Electrical Engineering",
   month = nov,
   year = 1996
}


Last updated: 28 April 2002

This page: http://www.itee.uq.edu.au/~cristina/students/david/david.html