# HG changeset patch
# User Michael Spacefalcon <msokolov@ivan.Harhan.ORG>
# Date 1372565700 0
# Node ID c9f7a4afccc93d11a5dd2997304c92aed7c6b4eb
# Parent  343b6b2f178b9bce8cbee4cd06bb3341792a7174
Mokopir-FFS: verbal description finished

diff -r 343b6b2f178b -r c9f7a4afccc9 mpffs/Description
--- a/mpffs/Description	Sun Jun 30 01:17:30 2013 +0000
+++ b/mpffs/Description	Sun Jun 30 04:15:00 2013 +0000
@@ -60,7 +60,7 @@
 However, the knowledge possessed by the present hacker (and conveyed in this
 document and the accompanying source code) is NOT sufficient for constructing a
 valid Mokopir-FFS image "in vitro" given a tree of directories and files, or
-for making modifications to the file or directory content on an existing image
+for making modifications to the file or directory content of an existing image
 and producing a content-modified image that is also valid; valid as in suitable
 for the original proprietary firmware to make its normal read and write
 operations without noticing anything amiss.
@@ -68,7 +68,7 @@
 Constructing "de novo" Mokopir-FFS images or modifying existing images in such
 a way that they remain 100% valid for all read and write operations of the
 original proprietary firmware would, at the very minimum, require an
-understanding of the meaning of *all* fields on the on-media FFS format.  Some
+understanding of the meaning of *all* fields of the on-media FFS format.  Some
 of these fields are still left as "non-understood" for now though: a read-only
 implementation can get away with simply ignoring them, but a writer/generator
 would have to put *something* in those fields.
@@ -230,7 +230,7 @@
 consisting of two flash writes: a 16-byte entry in the active index block, with
 the object type byte set to F2, and a corresponding 16-byte chunk in one of the
 data sectors, with the 16 bytes containing "gsm", a terminating NUL byte, and
-12 FF bytes to pad up to 16.  In the case of files, this name may be following
+12 FF bytes to pad up to 16.  In the case of files, this name may be followed
 by the first chunk of file data content, as explained further down.
 
 In order to parse the FFS directory tree (whether the objective is to dump the
@@ -278,3 +278,93 @@
  the active index block is going to fill up too, and will need to be rewritten
  into a new sector - at which point the previously-deleted index entries are
  omitted and the root node becomes #1 again...]
+
+Tree structure
+
+Once the root node has been found, the descendant and sibling pointers are used
+to traverse the tree structure.  For each directory object, including the root
+node, the descendant pointer points to the first child object of this directory:
+the first file or subdirectory contained therein.  (Descendant and sibling
+pointers take the form of index numbers in the active index block.  A "nil"
+pointer is indicated by all 1s (FFFF) - the usual all-0s NULL pointer convention
+couldn't be used because it's flash, where the blank state is all 1s.)  If the
+descendant pointer of a directory object is nil, the directory is empty.
+The sibling pointer of each file or directory points to its next sibling, i.e.,
+the next member of the same parent directory.  The sibling pointer of the root
+node is nil.
+
+Data content of files
+
+Objects of type F1 are the head chunks of files.  Each file has a head chunk,
+and may or may not have continuation chunks.  More precisely, the head chunk
+may contain only the name (or viewed alternatively, 0 bytes of data), or it may
+contain a nonzero number of payload bytes; orthogonally to this variability,
+there may or may not be continuation chunk(s) present.
+
+Continuation chunks
+
+The descendant pointer of each file head object (the object of type F1, the one
+reached by traversing the directory tree) indicates whether or not there are
+any continuation chunks present.  If this descendant pointer is nil, there are
+no continuation chunks; otherwise it points to the first continuation chunk
+object.  File continuation objects have type F4, don't have any siblings (the
+sibling pointer is nil), and the descendant pointer of each continuation object
+points to the next continuation object, if there is one - nil otherwise.
+
+Payload data delineation
+
+Each chunk, whether head or continuation, always has a length that is a nonzero
+multiple of 16 bytes.  The length of the chunk here means the amount of flash
+space it occupies in its data sector - which is NOT equal to the payload data
+length.
+
+The head chunk of each file begins with the filename, terminated by a NUL byte.
+If there are any payload data bytes present in this head chunk (I'll explain
+momentarily how you would tell), the byte immediately after the NUL that
+terminates the filename is the first byte of the payload.  In the case of a
+continuation chunk, there is no filename and the first byte of the chunk is the
+first byte of that chunk's portion of the user data payload.
+
+Each data-containing chunk (head or continuation) has the following termination
+after the last byte of that chunk's payload data: one 00 byte, followed by as
+many FF bytes as are needed (0 to 15) to pad to a 16-byte boundary.  A file
+head chunk that has no payload data has the same format as a directory name
+chunk: the filename, followed by its terminating NUL, followed by 0 to 15 FF
+bytes to pad to the next 16-byte boundary.
+
+When working with a head chunk, find the beginning of possible payload data (1
+byte after the filename terminating NUL) and find the end per the standard
+termination logic: scanning backward from the end of the chunk, skip FFs until
+a 00 is found (encountering anything else is an error); that 00 marks the end.
+If the head chunk has no data, the effective data length (end_pointer -
+start_pointer) will be 0 or -1.  (The latter is the most likely, as there will
+normally be a "shared" 00 byte, serving as both the filename terminator and the
+00 before the padding FF bytes.)
+
+-------------------------------------------------------------------------------
+
+That's all I can think of right now.  If anything is unclear, see the
+accompanying source code for the listing/extraction utilities: with the general
+explanation given by this document, it should be clear what my code does and
+why.  And if a given piece of knowledge is found neither in this document nor
+in my source code, then I don't know it myself either, and my read-only
+Mokopir-FFS implementation makes do without it.
+
+All knowledge contained herein has been recovered by reverse engineering.
+Believe it or not, I have figured it out by staring at the hex dump of FFS
+sectors, reasoning about how one could possibly implement an FFS given the
+requirement of dynamic writability and the physical constraints of flash memory,
+and writing listing/extraction test code iteratively until I got something that
+appears to correctly parse all FFS images available to me - the result is the
+code in this package.
+
+I never got as far as attempting to locate the FFS implementation routines
+within the proprietary firmware binary code images, and I most certainly don't
+have anything from TI that would help in this case.  (The TSM30 code doesn't
+seem to be of any use as its FFS appears to be totally different, and I haven't
+looked at the FFS code in the more recently found LoCosto code leak because I
+assumed from the documentation in the latter that the FFS implemented there is
+different as well.)
+
+Michael Spacefalcon
+SE 52 Mes 11