view objgrep/README @ 185:a820d9f9adbf

leo-obj: started analyzing tpudrv.lib objects
author Michael Spacefalcon <msokolov@ivan.Harhan.ORG>
date Mon, 11 Aug 2014 21:32:30 +0000
parents 10a9a0ca9d07

We have TI's reference firmware for the Calypso/Iota/Rita chipset (Leonardo) in
the form of linkable COFF objects and some source pieces, but when it comes to
practically usable "dumbphones" based on this chipset, we only have the binary
fw images read out of flash, without any kind of symbolic info.

The tools in this directory perform a kind of grep operation, searching an
unknown binary fw image for the bits of code or data contained in a linkable
COFF object.  The objective was to determine whether or not our "reference"
Leonardo objects could be found verbatim in the set of proprietary firmwares
from Compal and Foxconn (Pirelli DP-L10) that run on our "dumbphone" targets.

The tools are as follows:

objgrep

	This tool extracts one section (e.g., .text or .const, to be specified
	on the command line) from a "needle" COFF object and searches for it in
	the "haystack" unknown binary.  The byte positions in the sought-for
	object section where relocs are to be applied at linking time are masked
	as appropriate for each reloc type, and the section is expected to start
	on a 4-byte-aligned boundary in the unknown binary.  If a match is
	found, objgrep can print out the list of symbol addresses in the
	sought-for and found section, and it can also deduce some symbols
	external to the module or belonging to the module's other sections by
	looking where the relocs that were masked for the match point to in the
	unknown binary.

	In order for this form of grep to be effective, the section being
	searched for should be "meaty", i.e., mostly code or constant data with
	some interspersed relocs.  If the sought-for section is very small, fits
	the same pattern after reloc masking as other unrelated bits of code,
	or consists mostly of relocs, the most likely result will be a useless
	false hit.
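
The masked comparison described above can be sketched in a few lines of Python.
This is only an illustration, not objgrep's actual code.  The function names,
the per-byte mask representation (1 bits must match, 0 bits are covered by a
reloc) and the treatment of a reloc as a little-endian 32-bit absolute address
with a zero addend are all assumptions made for the example.

```python
def find_masked(haystack, needle, mask, align=4):
    """Return the aligned offsets in haystack where needle matches under mask.

    mask[i] has a 1 bit for every bit of needle[i] that must match;
    bits that a reloc will overwrite at link time are 0 and are ignored.
    """
    n = len(needle)
    # Apply the mask to the needle once, up front.
    want = bytes(b & m for b, m in zip(needle, mask))
    hits = []
    # Only try offsets that are multiples of the alignment.
    for off in range(0, len(haystack) - n + 1, align):
        window = haystack[off:off + n]
        if all((h & m) == w for h, m, w in zip(window, mask, want)):
            hits.append(off)
    return hits


def reloc_target(haystack, hit_off, reloc_off):
    """Read the 32-bit little-endian word found at a masked reloc position.

    Assumption: the reloc is a plain absolute address with a zero addend,
    so the word found in the binary is the address of the referenced symbol.
    """
    pos = hit_off + reloc_off
    return int.from_bytes(haystack[pos:pos + 4], "little")


# Toy data: 4 fixed bytes followed by one fully masked 32-bit reloc word.
needle = bytes([0x10, 0x20, 0x30, 0x40, 0, 0, 0, 0])
mask = bytes([0xFF] * 4 + [0x00] * 4)
haystack = bytes([0xAA] * 4) + bytes([0x10, 0x20, 0x30, 0x40,
                                      0x78, 0x56, 0x34, 0x12])
hits = find_masked(haystack, needle, mask)
```

With the toy data the needle is found at the one aligned offset where the fixed
bytes agree, and reloc_target(haystack, hits[0], 4) recovers the word that sits
in the masked position.  A needle that is short or mostly masked would match at
many offsets, which is the false hit problem noted above.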

objgrep-fe

	This program is a front-end to objgrep.  It reads a line-based text file
	listing the objects and sections to be grepped for, and invokes objgrep
	for each listed section.  The output of objgrep is captured through a
	pipe; objgrep-fe collates together the symbol addresses found with each
	individual objgrep hit and produces a sorted symbol listing.
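
The collation step can be pictured roughly as follows.  This is a sketch under
assumptions, not objgrep-fe's real logic: the (name, address) pair
representation, the collate() function name and the way conflicting addresses
are reported are all hypothetical, since the exact objgrep output format is not
described here.

```python
def collate(hits):
    """Merge per-hit symbol lists into one listing sorted by address.

    hits: iterable of lists of (name, address) pairs, one list per hit.
    Returns (listing, conflicts), where listing is sorted by address and
    conflicts names the symbols that were reported at more than one address.
    """
    addr_of = {}
    conflicts = set()
    for symbols in hits:
        for name, addr in symbols:
            # Keep the first address seen; remember any disagreement.
            if addr_of.setdefault(name, addr) != addr:
                conflicts.add(name)
    listing = sorted(addr_of.items(), key=lambda item: (item[1], item[0]))
    return listing, sorted(conflicts)


# Hypothetical symbol names and addresses, for illustration only.
hit_a = [("sym_a", 0x200), ("sym_b", 0x100)]
hit_b = [("sym_b", 0x100), ("sym_c", 0x180)]
listing, conflicts = collate([hit_a, hit_b])
```

A symbol deduced from relocs in several modules shows up once in the merged
listing; if two hits disagree about its address, at least one of them is
probably a false hit.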

Results
=======

The idea proved quite successful in the case of Pirelli DP-L10 firmware,
specifically version D910.0.3.98: this fw appears to have been built with
exactly the same RTS, Nucleus and GPF libraries that are featured in our
Leonardo semi-src as "very stable blobs", i.e., *.lib files in the source tree
itself, rather than blobs under g23m/__out__ for which TI's closed source
police excluded the corresponding source.  Every object that comes from these
libraries in our leo2moko build was also found in Pirelli's fw.

It is worth noting that the GPF libraries in particular contain a few objects
with embedded second-granularity timestamps, courtesy of the C compiler's
__DATE__ and __TIME__ preprocessor definitions, i.e., the timestamp strings
with times to the second are emitted into the code image built with these
libraries.  These timestamped objects were found in Pirelli's fw with our
objgrep tools along with the rest of GPF, proving beyond any doubt that this fw
has been built with exactly the same GPF libs as our leo2moko.
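
Because __DATE__ expands to a string of the form "Mmm dd yyyy" (with the day
space-padded) and __TIME__ to "hh:mm:ss", such embedded strings can also be
spotted directly in a binary image.  The following Python sketch is not part of
the objgrep tools and was not used for the results above; the function name and
the sample bytes are made up for illustration.

```python
import re

# __DATE__ is "Mmm dd yyyy" (day padded with a space), __TIME__ is "hh:mm:ss".
DATE_RE = re.compile(
    rb"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [ 0-3]\d \d{4}")
TIME_RE = re.compile(rb"[0-2]\d:[0-5]\d:[0-5]\d")


def find_timestamps(image):
    """Return sorted (offset, text) pairs for date- and time-like strings."""
    found = []
    for rx in (DATE_RE, TIME_RE):
        for m in rx.finditer(image):
            found.append((m.start(), m.group().decode("ascii")))
    return sorted(found)


# Made-up sample bytes standing in for a piece of a firmware image.
sample = b"\x00\x01Aug  7 2006\x00junk\x0012:34:56\x00"
stamps = find_timestamps(sample)
```

A scan like this only shows that some timestamp strings are present; it is the
verbatim match of the whole timestamped object that ties the fw to a particular
library build.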

This confirmation in the case of Pirelli's fw is very reassuring because this
fw has received a lot of real-life testing: I've been using a Pirelli running
its original proprietary fw (as no free fw exists yet, for this or any other
dumbphone) as my personal everyday cellphone for over a year now.  That is a
lot more real-life experience than I can get with anything Openmoko-based, and
it is reassuring to know that the GPF libraries we have painstakingly
reconstructed are used not only in the largely-untested moko firmware, but also
in the much more real-life-tested Pirelli DP-L10 fw.

Attempting the same grep against Compal's fw yielded far fewer hits, however.
A lot of RTS modules were found, but very little from Nucleus or GPF libs.
Nucleus' tct and tmt assembly modules were found, but not much else.  Manual
examination of Compal's INC_Initialize() function (which is easy to locate even
in a totally unknown fw binary, as it's only one ARM->Thumb call veneer away
from the boilerplate code at the boot entry point) has revealed that it's the
same code, but compiled slightly differently, probably by a slightly newer
version of the C compiler.  (The version in our reference libs saves one more
call-preserved register than necessary; the version that appears in Compal's fw
is fully optimal in this regard.)  I reason that the same compiler difference
must be responsible for the great scarcity of hits in general, as these kinds of
compiler changes would produce differences in just about every module.