Extracting data from ELF Files

I don't know if this question has already been posted (can't perform a search on "elf") so sorry for the dupe if this is the case:

Does anyone know how to browse/extract content of .elf files ?

There are some Saturn games with only one big .elf file which contains all the data. It is some sort of ISO or container. IIRC, if you open it in a hex editor, you can find the word TANTALUS multiple times.

The House of the Dead is one of these games.

If no tool exists for that, maybe someone with the knowledge could check the file which loads/grants access to the .elf file (the 0 file in most cases).

Thanks for your help.
 
ELF on its own is just an executable format. You can pick them apart to some extent with objdump (part of GNU binutils and should be in any reasonably complete Saturn toolchain), but it's likely that the data is encoded in some game-specific format. The "TANTALUS" string might be a clue as to the format, though I wouldn't count on it.
 
No success with objdump :( (or maybe I don't know how to use it).

It seems that all games with a big Elf file have been developed in some way by Tantalus Interactive, so it is a proprietary format :(

On a side note, the name of Tantalus Interactive appears nowhere on the game/manual/cover, so it is difficult to track those developers when they are not credited for their work.
 
I had a quick look at Manx TT and the "CDIMAGE.ELF" file certainly isn't an ELF file (it would have been rather surprising if it were - the format was still quite new back then).

It appears to contain multiple "BIFF" files, which look like they're based on IFF - they contain the same type of "BODY" and "JUNK" chunk tags as e.g. RIFF files. If they follow the same structure as IFF files it would be easy to automatically extract all the files contained in the image file. There are also some AVI Truemotion videos, which are identifiable by the "RIFF" start tag.

I think all the code is contained in the initial load file (0.AAA). I don't know if it's an artifact of their build system, but it also appears to contain the file names of all the art data files, plus bits of debugging and script code. And also this:
 

[Attachment: Picture 10.png]
 
Lol, always some piece of text secretly inserted in the code :D

I am not familiar with BIFF and IFF files. Is it an archive format to simply organize files into one big container? If so, it could be interesting to develop a tool to extract all the files from them (unfortunately I don't have the knowledge to do that myself).

I also found another type of archive format: the .BIG file (extension probably chosen because of the huge file ;) ) for Atlantis by Cryo. Definitely not the same as .ELF files but the same purpose: putting all the files into one big archive.
 
IFF is a very simple file format which was (and still is) popular in games. "BIFF" is probably a custom variant but the chunk ID + length structure seems to be the same as plain IFF, with the exception that the length is stored as a little-endian value. Dunno about the data itself.
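
To make the chunk ID + length idea concrete, here is a minimal Python sketch that walks a flat run of IFF-style chunks. The function name `iter_chunks` is made up for illustration, and two things are assumptions rather than facts about BIFF: that chunks simply follow one another, and that odd-length chunks are padded to an even size as in plain IFF (switchable with `pad_to_even`).

```python
import struct

def iter_chunks(data: bytes, pad_to_even: bool = True):
    """Yield (tag, payload) pairs from a flat run of IFF-style chunks."""
    pos = 0
    while pos + 8 <= len(data):
        tag = data[pos:pos + 4]
        # Little-endian 32-bit length, as observed in the BIFF files.
        (length,) = struct.unpack_from("<I", data, pos + 4)
        start = pos + 8
        end = start + length
        if end > len(data):
            break  # truncated data, or this is not really a chunk header
        yield tag, data[start:end]
        # Assumption: odd-sized chunks get one pad byte, like plain IFF.
        pos = end + ((length & 1) if pad_to_even else 0)

# Synthetic example data, not taken from any game file.
sample = (b"BODY" + struct.pack("<I", 3) + b"abc" + b"\x00"
          + b"JUNK" + struct.pack("<I", 2) + b"zz")
for tag, payload in iter_chunks(sample):
    print(tag, len(payload))
```

If real BIFF data turns out to nest chunks or to skip the pad byte, the loop would need adjusting accordingly.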

Based on some other strings in the code I would hazard the guess that the start of the "ELF" file is an index that stores data about the contained files which are identified by the hashed filename (which would explain why all the filenames are included in the code). If you figured out the index structure and the hash algorithm you could then get the original filenames for all files in the "ELF" image.
 
The start of the "ELF" file in Manx TT contains these bytes:

Code:
00 00 00 93   00 00 00 01   00 00 00 02   00 00 00 00

97 D6 30 44   00 00 00 11   00 00 01 A8   00 00 00 01

97 D6 30 75   00 00 00 12   00 00 0D F0   00 00 00 02

DD 25 F3 A4   00 00 00 14   00 00 27 3C   00 00 00 05

The first 16 bytes probably describe the index itself, so let's forget them for now and look at the next 16 bytes. The presence of "JUNK" tags in the BIFF files hints that the data is optimized for accessing in sector-sized chunks. 17*2048 = 34816, and sure enough the first "BIFF" file is found at offset 0x8800 in the "ELF". At 0x89a8 there is a load of "TANTALUSTANTALUS" which is obviously used as padding. The next file starts at offset 0x9000, or 18*2048, and the next "TANTALUS" padding at 0x9df0.

If a filename hash is really used to index this table that is probably the first four bytes of each entry. The last four bytes tell how many sectors each file occupies (startsector + size in sectors = startsector of the next file).
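
The arithmetic above can be checked with a few lines of Python. This is only a sketch: it reads the three entries quoted above as big-endian 32-bit words and treats the third word as a byte length, which is an inference from where the "TANTALUS" padding starts. The names `SAMPLE` and `layout` are made up for the example.

```python
import struct

SECTOR_SIZE = 2048  # bytes per CD-ROM data sector

# The three 16-byte entries quoted above, as big-endian 32-bit words.
SAMPLE = bytes.fromhex(
    "97D63044 00000011 000001A8 00000001 "
    "97D63075 00000012 00000DF0 00000002 "
    "DD25F3A4 00000014 0000273C 00000005"
)

def layout(entries: bytes):
    """Return (data_offset, padding_offset, next_start_sector) per entry."""
    rows = []
    for _first, start, size, sectors in struct.iter_unpack(">4I", entries):
        offset = start * SECTOR_SIZE
        # Assumption: the third word is the file size in bytes.
        rows.append((offset, offset + size, start + sectors))
    return rows

for offset, pad, nxt in layout(SAMPLE):
    print(f"data at 0x{offset:X}, padding from 0x{pad:X}, "
          f"next file at sector 0x{nxt:X}")
```

The first two rows come out as 0x8800/0x89A8 and 0x9000/0x9DF0, matching the offsets seen in the hex editor.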
 
antime said:
The start of the "ELF" file in Manx TT contains these bytes:

Code:
00 00 00 93   00 00 00 01   00 00 00 02   00 00 00 00

97 D6 30 44   00 00 00 11   00 00 01 A8   00 00 00 01

97 D6 30 75   00 00 00 12   00 00 0D F0   00 00 00 02

DD 25 F3 A4   00 00 00 14   00 00 27 3C   00 00 00 05

.ELF Table of Contents

0x00000093 -- number of files in the .ELF

0x00000001 -- hash weight (for the filename hash)

0x00000002 -- number of sectors used for the .ELF TOC

0x00000000 -- volume ID (N/A)

The rest of the TOC is 16 bytes per file. The first one is:

0x97D63044 -- hashed filename

0x00000011 -- start sector (offset from start of the .ELF)

0x000001A8 -- filesize in bytes

0x00000001 -- number of sectors used for the file [not set in HOTD]
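
For anyone who wants to turn the layout above into a tool, here is a rough Python sketch of a TOC reader. The names `TocEntry`, `read_toc` and `extract` are invented for the example, and it assumes 2048-byte sectors and big-endian fields (both consistent with the Manx TT bytes quoted above). It uses the byte size rather than the sector count to cut out each file, since the sector count is not set in HOTD.

```python
import struct
from typing import NamedTuple

SECTOR_SIZE = 2048  # assumed sector size

class TocEntry(NamedTuple):
    name_hash: int     # hashed filename
    start_sector: int  # offset from start of the .ELF, in sectors
    size: int          # file size in bytes
    sector_count: int  # sectors used (not set in HOTD)

def read_toc(image: bytes):
    """Return (hash_weight, entries) parsed from the start of the image."""
    count, hash_weight, _toc_sectors, _volume_id = struct.unpack_from(">4I", image, 0)
    entries = [
        TocEntry(*struct.unpack_from(">4I", image, 16 * (i + 1)))
        for i in range(count)
    ]
    return hash_weight, entries

def extract(image: bytes, entry: TocEntry) -> bytes:
    """Slice one file out of the image using its start sector and byte size."""
    start = entry.start_sector * SECTOR_SIZE
    return image[start:start + entry.size]
```

A typical use would be to read the whole .ELF into memory, call `read_toc`, and write each `extract` result to a file named after its hash (e.g. `97D63044.bin`), since the real names are not stored in the image.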

The filename hash is calculated thusly:

Code:
long hashval = 0

for each ASCII character in filename -> char c

  hashval = hashval * (hashweight + 1) + c

Any overflow in hashval is ignored (i.e., only the low 32 bits are kept).

As demonstrated in House of the Dead, the filename can include a directory (e.g., 'dialog\dialog.rbh'). Theoretically, it could be any text since the length and format of the string is largely irrelevant.

If you want to test out the hash, try this one (from HOTD):

Filename = dialog\dialog.rbh

Hash weight = 9

The hashed value is 0x1E769DC4 [if you're curious, the unclipped value is 0xF7C7B421E769DC4].
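
A small Python version of the hash, for anyone who wants to check the test value above. The function name `tantalus_hash` is made up; since Python integers don't overflow, the 32-bit clipping is done explicitly with a mask.

```python
def tantalus_hash(filename: str, hash_weight: int) -> int:
    """Hash a filename as described above, keeping only the low 32 bits."""
    hashval = 0
    for c in filename.encode("ascii"):
        # Mask on every step to emulate 32-bit overflow being ignored.
        hashval = (hashval * (hash_weight + 1) + c) & 0xFFFFFFFF
    return hashval

print(hex(tantalus_hash("dialog\\dialog.rbh", 9)))  # 0x1e769dc4
```

Dropping the mask gives the unclipped value 0xF7C7B421E769DC4 for the same input.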

Enjoy!
 
Great, this is all the info we need to understand the Elf file structure. Thanks a lot man! Where did you find this info ?

Hashed filenames are a little bit strange to me. Is it easy to de-hash them?

With this information, could someone make a proggy to extract data from .elf files?

Also, found in the 0.AAA file of THOTD:

Code:
Tantalus Executive Kernel V2.1c Oct 24 1997 (C) Tantalus Entertainment ACN 061 390 458 1994-1996
and

Code:
Tantalus Saturn Engine V2.0a Apr 30 1997 (C) Tantalus Entertainment ACN 061 390 458 1994-1996
 
Madroms said:
Great, this is all the info we need to understand the Elf file structure. Thanks a lot man! Where did you find this info ?

I extrapolated it. :) Seriously, most of it is in the executable.


Madroms said:
Hashed filenames are a little bit strange to me. Is it easy to de-hash them?

It's pretty much impossible. The only way you'll figure it out is to look for filenames in the executable, run those through the hash, then look them up in the .ELF's TOC.
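
The lookup approach described above could be sketched like this in Python. It is only a rough illustration: `hash_name` and `match_names` are made-up names, and it assumes the filenames sit in the executable as plain runs of printable ASCII, in exactly the form that was hashed (same case, same path separators).

```python
import re

def hash_name(name: bytes, weight: int) -> int:
    """Filename hash as described earlier in the thread, clipped to 32 bits."""
    h = 0
    for c in name:
        h = (h * (weight + 1) + c) & 0xFFFFFFFF
    return h

def match_names(executable: bytes, toc_hashes, weight: int):
    """Map TOC hashes to candidate filenames found in the executable."""
    wanted = set(toc_hashes)
    names = {}
    # Every run of 4+ printable ASCII characters is a candidate filename.
    for m in re.finditer(rb"[\x20-\x7e]{4,}", executable):
        candidate = m.group()
        h = hash_name(candidate, weight)
        if h in wanted:
            names[h] = candidate.decode("ascii")
    return names

# Synthetic stand-in for an executable; the hash is the HOTD test value.
exe = b"\x00\x01dialog\\dialog.rbh\x00junk text\x00"
print(match_names(exe, [0x1E769DC4], 9))
```

Hashes that stay unmatched would mean the name is built at runtime or stored in some other form, so those would still have to be tracked down by hand.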


If you want to figure out the filenames in HOTD's .ELF, I'll help you track them down.
 
In fact, my goal was to look inside elf files to find readable files (txt) and movie files to identify which type of video files they used (to fill the databases on my site). So I don't really need all the filenames in each Elf file. It would just be a great extra. Thanks for the offer.

In THOTD's Elf file, I can't find any AVI/CPK video files. Do you know if they used any?
 
No, House of the Dead doesn't use any movie files. All of the cutscenes are engine scripted, which is common for such 3D arcade games given the storage issues. But I'm glad I could help anyway.
 