diff objgrep/README @ 176:10a9a0ca9d07

objgrep/README written
author Michael Spacefalcon <msokolov@ivan.Harhan.ORG>
date Sun, 06 Jul 2014 20:22:09 +0000
parents
children
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/objgrep/README	Sun Jul 06 20:22:09 2014 +0000
@@ -0,0 +1,82 @@
+We have TI's reference firmware for the Calypso/Iota/Rita chipset (Leonardo) in
+the form of linkable COFF objects and some source pieces, but when it comes to
+practically usable "dumbphones" based on this chipset, we only have the binary
+fw images read out of flash, without any kind of symbolic info.
+
+The tools in this directory perform a kind of grep operation, searching an
+unknown binary fw image for the bits of code or data contained in a linkable
+COFF object.  The objective was to determine whether or not our "reference"
+Leonardo objects could be found verbatim in the set of proprietary firmwares
+from Compal and Foxconn (Pirelli DP-L10) that run on our "dumbphone" targets.
+
+The tools are as follows:
+
+objgrep
+
+	This tool extracts one section (e.g., .text or .const, to be specified
+	on the command line) from a "needle" COFF object and searches for it in
+	the "haystack" unknown binary.  The byte positions in the sought-for
+	object section where relocs are to be applied at linking time are masked
+	as appropriate for each reloc type, and the section is expected to start
+	on a 4-byte-aligned boundary in the unknown binary.  If a match is
+	found, objgrep can print out the list of symbol addresses in the
+	sought-for and found section, and it can also deduce some symbols
+	external to the module or belonging to the module's other sections by
+	looking where the relocs that were masked for the match point to in the
+	unknown binary.
+
+	In order for this form of grep to be effective, the section being
+	searched for should be "meaty", i.e., mostly code or constant data with
+	some interspersed relocs.  If the sought-for section is very small, fits
+	the same pattern after reloc masking as other unrelated bits of code,
+	or consists mostly of relocs, the most likely result will be a useless
+	false hit.
+
+objgrep-fe
+
+	This program is a front-end to objgrep.  It reads a line-based text file
+	listing the objects and sections to be grepped for, and invokes objgrep
+	for each listed section.  The output of objgrep is captured through a
+	pipe; objgrep-fe collates together the symbol addresses found with each
+	individual objgrep hit and produces a sorted symbol listing.
+
+Results
+=======
+
+The idea proved quite successful in the case of Pirelli DP-L10 firmware,
+specifically version D910.0.3.98: this fw appears to have been built with
+exactly the same RTS, Nucleus and GPF libraries that are featured in our
+Leonardo semi-src as "very stable blobs", i.e., *.lib files in the source tree
+itself, rather than blobs under g23m/__out__ for which TI's closed source
+police excluded the corresponding source.  Every object that comes from these
+libraries in our leo2moko build was also found in Pirelli's fw.
+
+It is worth noting that the GPF libraries in particular contain a few objects
+with embedded second-granularity timestamps, courtesy of the C compiler's
+__DATE__ and __TIME__ preprocessor definitions, i.e., the timestamp strings
+with times to the second are emitted into the code image built with these
+libraries.  These timestamped objects were found in Pirelli's fw with our
+objgrep tools along with the rest of GPF, proving beyond any doubt that this fw
+has been built with exactly the same GPF libs as our leo2moko.
+
+This confirmation in the case of Pirelli's fw is very reassuring because this
+fw has received a lot of real-life testing: I've been using a Pirelli running
+its original proprietary fw (as no free fw exists yet, for this or any other
+dumbphone) as my personal everyday cellphone for over a year now.  That is a
+lot more real life experience than I can get with anything Openmoko-based, and
+it is reassuring to know that the GPF libraries we have painstakingly
+reconstructed are used not only in the largely-untested moko firmware, but also
+in the much more real-life-tested Pirelli DP-L10 fw.
+
+Attempting the same grep against Compal's fw yielded far fewer hits, however.
+A lot of RTS modules were found, but very little from Nucleus or GPF libs.
+Nucleus' tct and tmt assembly modules were found, but not much else.  Manual
+examination of Compal's INC_Initialize() function (which is easy to locate even
+in a totally unknown fw binary, as it's only one ARM->Thumb call veneer away
+from the boilerplate code at the boot entry point) has revealed that it's the
+same code, but compiled slightly differently, probably by a slightly newer C
+compiler version.  (The version in our reference libs saves one more
+call-preserved register than necessary; the version in Compal's fw is fully
+optimal in this regard.)  I reason that the same compiler difference must
+be responsible for the great scarcity of hits in general, as these kinds of
+compiler changes would produce differences in just about every module.
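
The reloc-masked search described under objgrep above can be illustrated with
a minimal sketch.  This is not the actual objgrep code: the function names
(build_mask, masked_grep, resolve_symbols), the sample bytes, and the load
address of 0 are all hypothetical, and the sketch assumes that every reloc is a
plain 32-bit little-endian absolute address word with a zero addend, whereas
the real tool masks each reloc type differently.

```python
import struct


def build_mask(section_len, reloc_offsets, width=4):
    """Return a byte mask: 0xFF where bytes must match, 0x00 under each reloc.

    Assumption: every reloc covers a whole `width`-byte word.
    """
    mask = bytearray(b"\xff" * section_len)
    for off in reloc_offsets:
        mask[off:off + width] = b"\x00" * width
    return bytes(mask)


def masked_grep(haystack, needle, mask, align=4):
    """Return all `align`-byte-aligned offsets where needle matches under mask."""
    if len(needle) != len(mask):
        raise ValueError("needle and mask must have the same length")
    # Pre-mask the needle once so the inner comparison is a plain equality.
    want = bytes(b & m for b, m in zip(needle, mask))
    hits = []
    for off in range(0, len(haystack) - len(needle) + 1, align):
        window = haystack[off:off + len(needle)]
        if all((w & m) == v for w, m, v in zip(window, mask, want)):
            hits.append(off)
    return hits


def resolve_symbols(haystack, hit, load_addr, local_syms, relocs):
    """Map symbol names to addresses for one hit.

    local_syms: name -> offset within the needle section
    relocs:     list of (offset within section, referenced symbol name)
    """
    # Symbols defined in the found section move with the hit position.
    found = {name: load_addr + hit + off for name, off in local_syms.items()}
    # Symbols referenced through masked relocs are read back from the image
    # (assumption: the word holds the target address directly).
    for off, name in relocs:
        (target,) = struct.unpack_from("<I", haystack, hit + off)
        found[name] = target
    return found


# Hypothetical needle: 4 code bytes, one unresolved address word, 4 code bytes.
needle = bytes.fromhex("10b50048" "00000000" "804710bd")
relocs = [(4, "ext_func")]

# Hypothetical haystack: the same code with the address word filled in.
haystack = (b"\xff" * 8 + bytes.fromhex("10b50048")
            + struct.pack("<I", 0x00301234)
            + bytes.fromhex("804710bd") + b"\xff" * 4)

mask = build_mask(len(needle), [off for off, _ in relocs])
hits = masked_grep(haystack, needle, mask)
symbols = resolve_symbols(haystack, hits[0], 0, {"my_func": 0}, relocs)
```

A plain byte search for the needle fails here because the linker has
overwritten the address word; masking that word lets the match succeed, and
reading it back from the haystack recovers the address of the external symbol.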