view doc/Binary-file-formats @ 902:8ddb16a37273

tree org: move TCH and VM utils from miscutil to tchtools
author Mychaela Falconia <falcon@freecalypso.org>
date Wed, 28 Dec 2022 07:52:30 +0000
parents b6b8307d195b

In FreeCalypso we use 3 different file formats for Calypso binary images, i.e.,
code images to be loaded into either flash or RAM or retrieved flash dumps.
These 3 different file formats are straight binary (*.bin), moko-style m0 (*.m0)
and little-endian S-records (*.srec).

Straight binary (*.bin)
=======================

Straight binary is our preferred format for flash dumps.  It is written in the
native little-endian byte order of the Calypso ARM7 processor, i.e., the order
of bytes in the raw binary file directly corresponds to incrementing byte
addresses as visible to the ARM7 - any ASCII strings in the image thus appear
naturally.  We also use the same straight binary format in native LE byte order
for flashable code images built with the gcc+binutils toolchain (as opposed
to TI's TMS470) and produced with arm-elf-objcopy -O binary, although we
currently have few such code images, given that neither FC Citrine nor FC
Selenite ever achieved production quality.

Another unrelated use of this straight binary format is for RAM-loadable code
images that are fed to Compal's bootloader (Motorola C1xx and Sony Ericsson
J100) as opposed to Calypso boot ROM.  Our generally preferred image format for
RAM-loadable code pieces is little-endian S-records (*.srec, see below), but
for Compal's bootloader we use straight binary instead because of the way this
bootloader protocol works.

moko-style m0 (*.m0)
====================

This format is a variant of Motorola hex (S-records) invented by TI rather than
by us.  It is produced by TI's hex470 tool when run with
-m -memwidth 16 -romwidth 16 options, which is the configuration used by TI in
the Calypso program, and is read by TI's flash programming tool called FLUID.
TI used this format not only for flashable firmware images, but also for various
RAM-loadable code pieces, particularly those that comprise the target-side
component of FLUID.

The special quirk of this S-record variant format is its peculiar byte order.
TI viewed it as "16-bit hex", meaning that the image is logically viewed as
consisting of 16-bit words rather than 8-bit bytes, each S3 record carries an
even number of bytes to be loaded at an even address, and each 16-bit word
(4 hex digits) appears in these S3 records with its most-significant hex nibble
toward the left, just like the address field of each S-record.  But if this
image gets interpreted by some more naive tool (for example, objcopy from GNU
binutils) as bytes rather than 16-bit words, the result will have the two bytes
of every 16-bit word swapped, with all strings etc. garbled.
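
The byte order quirk can be illustrated with a minimal Python sketch.  The
function name swap16 and the sample data are made up for this illustration and
are not part of any FreeCalypso tool.  The sample is the data field of a
hypothetical S3 record holding the string "Hi!" followed by a NUL byte, written
the way a 16-bit hex file would carry it.

```python
def swap16(data):
    """Swap the two bytes of every 16-bit word (illustrative helper)."""
    if len(data) % 2:
        raise ValueError("16-bit hex data must have an even length")
    out = bytearray(data)
    # Even-offset bytes take the odd-offset values and vice versa.
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)


# Data field of a hypothetical moko-style S3 record: two 16-bit words,
# 0x6948 and 0x0021, each written most-significant nibble first.
m0_data_field = "69480021"

naive = bytes.fromhex(m0_data_field)   # what a byte-oriented reader sees
native = swap16(naive)                 # ARM7 little-endian memory order

print(naive)    # b'iH\x00!'
print(native)   # b'Hi!\x00'
```

Swapping is its own inverse, so the same helper converts in either direction
between the two byte orders.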

In FreeCalypso we use this moko-style m0 format (our new name for what TI called
16-bit hex) only for flashable firmware images built with TI's TMS470 toolchain
(either our own FC Magnetite or historical images built by Openmoko or similar
vendors), but never for any RAM-loadable code pieces - we use little-endian SREC
for the latter as explained below.

And what about the name?  Why do we call it moko-style m0 rather than just m0?
The reason is that Compal muddied our waters by introducing their own *.m0
files that were generated with -memwidth 8 -romwidth 8 instead of -memwidth 16
-romwidth 16, producing 8-bit hex instead of 16-bit hex.  We do not support
Compal's different *.m0 files at all, and we needed some name to specifically
identify TI-style m0 files rather than Compal-style.  We ended up with the name
moko-style rather than TI-style because we already had our mokosrec2bin program
going back to 2013-04-15, one of the very first programs written in the
FreeCalypso family of projects: our very first encounter with this file format
was in the mokoN firmware images put out in this *.m0 format by That Company.

Little-endian S-records (*.srec)
================================

Back at the beginning of FreeCalypso in the spring/summer of 2013 I (Mother
Mychaela) decided to use S-records instead of straight binary for our
RAM-loadable code pieces, i.e., code that is loaded into RAM either through the
Calypso boot ROM (fc-iram) or by chain-loading via loadagent (fc-xram).  I made
this decision based on two factors:

1) ARM code generated by common toolchains (both TI's TMS470 and gcc+binutils)
   without special contortions is not position-independent: a code image that
   was linked for a given address needs to be loaded at that specific address,
   not some other.

2) An S-record image has its load address embedded in the image itself, whereas
   a raw binary naturally does not carry any such extra metadata.

With SREC as the standardized hand-off format from code generation tools to
loadtools, the choice of load address is made entirely on the code generation
side; loadtools do not impose a fixed load address, nor do they require it to
be communicated via extra command line arguments or options.
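
As an illustration of point 2, here is a minimal Python sketch of how a reader
can recover the load address from the S3 records themselves.  This is not the
loadtools code; the function names and the sample record (4 bytes at the
arbitrary example address 0x00800000) are assumptions made for this sketch.
The record layout used is the standard one: "S3", a byte count, a 4-byte
address, the data bytes, and a checksum that makes the low 8 bits of the sum of
all those bytes equal to 0xFF.

```python
def parse_s3(line):
    """Return (load_address, data) for one S3 record, checking its checksum."""
    line = line.strip()
    if not line.startswith("S3"):
        raise ValueError("not an S3 record: %r" % line)
    raw = bytes.fromhex(line[2:])   # count, 4 address bytes, data, checksum
    if raw[0] != len(raw) - 1:
        raise ValueError("byte count mismatch")
    if sum(raw) & 0xFF != 0xFF:
        raise ValueError("checksum mismatch")
    return int.from_bytes(raw[1:5], "big"), raw[5:-1]


def load_regions(lines):
    """Collect (address, data) pairs from every S3 record in an image."""
    return [parse_s3(l) for l in lines if l.startswith("S3")]


# Hypothetical two-record image; the addresses are arbitrary examples.
image = ["S3090080000048692100A4", "S705008000007A"]
for address, data in load_regions(image):
    print("load %d bytes at 0x%08X" % (len(data), address))
```

No address is passed in from outside: everything the loader needs to place the
data comes from the records.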

However, the variant of SREC we use for RAM-loadable code pieces is not the same
as moko-style m0 - the byte order is the opposite, with our RAM-loadable code
pieces using the native little-endian byte order of the ARM7 processor as the
byte order within S3 records.  Prior to the introduction of RAM-loadable FC
Magnetite fw images for Pirelli DP-L10 in late 2016 (and then likewise for our
own FCDEV3B), the only RAM-loadable code pieces we had were built with
gcc+binutils, not with TMS470, and GNU binutils takes a different view of the
S-record format than TI did: it generates byte-oriented SREC files, with the
byte order being the same as it would be in a straight binary file, matching
the target processor's memory byte addressing order.  Thus GNU-style SREC has
been adopted as the format for our RAM-loadable code images for both fc-iram
and fc-xram, as opposed to TI-style SREC aka moko-style m0.  The convention we
have adopted is that *.m0 filename suffix means TI-style aka moko-style,
whereas *.srec means GNU-style.

Besides the S3 record byte order, there is one other difference between TI-built
*.m0 code images and GNU-built *.srec ones: the final S7 record carries the
entry point address in GNU-built *.srec images, whereas TI's *.m0 images always
have a zero dummy address in there.  Our fc-iram and fc-xram tools require the
real entry point address in the S7 record.
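
The S7 difference can be seen with a minimal Python sketch that extracts the
address field from a terminating S7 record.  The function name and both sample
records are made up for this illustration (0x00800000 is an arbitrary example
entry point), and the sketch does not reproduce how fc-iram or fc-xram actually
process this record.

```python
def s7_entry_point(line):
    """Return the 32-bit address carried by an S7 record."""
    line = line.strip()
    if not line.startswith("S7"):
        raise ValueError("not an S7 record: %r" % line)
    raw = bytes.fromhex(line[2:])   # count, 4 address bytes, checksum
    if len(raw) != 6 or raw[0] != 5:
        raise ValueError("malformed S7 record")
    if sum(raw) & 0xFF != 0xFF:
        raise ValueError("checksum mismatch")
    return int.from_bytes(raw[1:5], "big")


gnu_style = "S705008000007A"   # hypothetical record with a real entry point
ti_style = "S70500000000FA"    # zero dummy address, as in TI's *.m0 images

print("0x%08X" % s7_entry_point(gnu_style))   # 0x00800000
print("0x%08X" % s7_entry_point(ti_style))    # 0x00000000
```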

How do we generate ramimage.srec RAM-loadable images for fc-xram in FC
Magnetite?  The FC Magnetite build system includes a special ad hoc converter
program that reads ramimage.m0 produced by TI's hex470 tool and produces
ramimage.srec: it reverses the byte order within each 16-bit word, adds another
S3 record that writes the boot-ROM-redirected interrupt and exception vectors,
and generates an S7 record with the right entry point address.
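
Two of these three steps can be sketched in Python as follows.  This is a rough
illustration under stated assumptions, not the actual FC Magnetite converter:
the function names are invented, the entry point is simply passed in as a
parameter, input checksums are not verified, and the extra S3 record with the
interrupt and exception vectors is left out because its contents are not
described here.

```python
def swap16(data):
    """Swap the two bytes of every 16-bit word."""
    out = bytearray(data)
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)


def encode_record(rectype, address, data=b""):
    """Format an S3 or S7 record with a 32-bit address and fresh checksum."""
    body = address.to_bytes(4, "big") + data
    count = len(body) + 1                    # body bytes plus the checksum
    checksum = ~(count + sum(body)) & 0xFF   # one's complement of the sum
    return "S%d%02X%s%02X" % (rectype, count, body.hex().upper(), checksum)


def m0_to_srec(m0_lines, entry_point):
    """Convert moko-style m0 records to little-endian SREC (sketch only)."""
    out = []
    for line in m0_lines:
        line = line.strip()
        if line.startswith("S3"):
            raw = bytes.fromhex(line[2:])    # count, address, data, checksum
            address = int.from_bytes(raw[1:5], "big")
            out.append(encode_record(3, address, swap16(raw[5:-1])))
        elif line.startswith("S7"):
            # Replace the zero dummy address with the real entry point.
            out.append(encode_record(7, entry_point))
        elif line:
            out.append(line)                 # pass other records through
    return out


# Hypothetical input; the address 0x00800000 is an arbitrary example.
m0_image = ["S3090080000069480021A4", "S70500000000FA"]
for record in m0_to_srec(m0_image, entry_point=0x00800000):
    print(record)
```

Note that the S3 checksum comes out the same before and after the swap: it is
computed from the sum of the bytes, and reordering the bytes does not change
that sum.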

This little-endian *.srec format is actively used only for RAM-loadable code
pieces in FreeCalypso, not for anything that goes into or gets read from flash.
We do have flash dump2srec and flash program-srec commands in fc-loadtool; they
were implemented back in the founding stage of FreeCalypso in 2013 for the sake
of completeness and symmetry (it seemed right to support both binary and
S-record formats), but they never got any practical use: if you are making a
flash dump, you would normally want to examine it afterward, and any such
examination almost always needs a straight binary image, not S-records.