# HG changeset patch
# User Michael Spacefalcon <msokolov@ivan.Harhan.ORG>
# Date 1372994766 0
# Node ID 86a494a5f2b03f7bd699f0a0f9d60ee71f695df2
# Parent  7ceab8bfacb3090c254772b57a0db2ec6aada3e2
MPFFS description: documented relocated chunks and the journal file

diff -r 7ceab8bfacb3 -r 86a494a5f2b0 mpffs/Description
--- a/mpffs/Description	Mon Jul 01 07:04:01 2013 +0000
+++ b/mpffs/Description	Fri Jul 05 03:26:06 2013 +0000
@@ -6,7 +6,7 @@
 further to MPFFS.
 
 (I have previously called the FFS in question MysteryFFS; but now that I've
- successfully reverse-engineered it, it isn't such a mystery any more :-)
+ successfully reverse-engineered it, it isn't as much of a mystery any more :-)
 
 At a high functional level, Mokopir-FFS presents the following features:
 
@@ -308,8 +308,9 @@
 any continuation chunks present.  If this descendant pointer is nil, there are
 no continuation chunks; otherwise it points to the first continuation chunk
 object.  File continuation objects have type F4, don't have any siblings (the
-sibling pointer is nil), and the descendant pointer of each continuation object
-points to the next continuation object, if there is one - nil otherwise.
+sibling pointer is nil - but see below regarding relocated chunks), and the
+descendant pointer of each continuation object points to the next continuation
+object, if there is one - nil otherwise.
 
 Payload data delineation
 
@@ -341,6 +342,86 @@
 byte, serving as both the filename terminator and the 00 before the padding
 FF bytes.)
 
+Relocated chunks
+
+Let's go back to the scenario in which a particular data sector is full (no more
+usable free space left) and contains a mixture of active and dirty (deleted or
+invalidated) data.  How does the dirty flash space get reclaimed, so that the
+amount of available space (blank flash ready to hold new data) becomes equal to
+the total FFS size minus the total size of active files and overhead?  It can
+only be done by relocating the still-active objects from the full sector to a
+new one, invalidating the old copies, and once the old sector consists of
+nothing but invalidated data, subjecting it to flash erasure.
+
+So how do the active FFS objects get relocated from a "condemned" sector to a
+new one?  If the object is a directory, a new index entry is created, pointing
+to the newly relocated name chunk, but it is then made to fit into the old tree
+structure without disrupting the latter: the new index entry is added at the
+tail of the sibling-chain of the parent directory's descendants, the old index
+entry for the same directory is invalidated (as if the directory were rmdir'ed),
+and the descendant pointer of the newly written index entry is set to a copy of
+the descendant pointer from the old index entry for the same directory.  The
+same approach is used when the head chunk of a file needs to be relocated; in
+both cases a read-only FFS implementation doesn't need to do anything special to
+support reading file and directory objects that have been relocated in this
+manner.
+
+However, if the relocated object is a file continuation chunk, then the manner
+in which such objects get relocated does affect file reading code.  What if a
+chunk in the middle of a chain linked by "descend" pointers needs to be moved?
+What happens in this case is that the old copy of the chunk gets invalidated
+(the object type byte turned to 00) as in the other object relocation cases,
+and the sibling pointer of that old index entry (which was originally FFFF as
+continuation objects have no siblings) is set to point to the new index entry
+for the same chunk.  The "descend" pointer in the new index entry is a copy of
+that pointer from the old index entry.
+
+The manner of chunk relocation just described has been observed in the FFS
+images read out of my most recent batch of Pirelli phones - the same ones in
+which the root directory object is not at index #1.  Thinking about it as I
+write this, I've realized that the way in which continuation objects get
+relocated is exactly the same as for other object types - thus the compaction
+code in the firmware doesn't need to examine what object type it is moving.
+However, the case of continuation chunk relocation deserves special attention
+because it affects a read-only implementation like ours - the utilities whose
+source accompanies this document used to fail on these FFS images until I
+implemented the following additional handling:
+
+When following the chunk chain of a file, normally the only object type that's
+expected is F4 - any other object type is an error.  However, as a result of
+chunk relocation, one can also encounter deleted objects, i.e., type == 00.
+If such a deleted object is encountered, follow its sibling pointer, which must
+be non-nil.
+
+Journal file
+
+Every Mokopir-FFS image I've seen so far contains a special file named
+/.journal; this file is special in the following ways:
+
+* The object type byte is E1 instead of F1;
+* Unlike regular files, this special file is internally-writable.
+
+What I mean by the above is that regular files are mostly immutable: once a
+file has been created with some data content in the head chunk, it can only be
+either appended to (one or more continuation chunks added), or overwritten by
+creating a new file with the same name at the same level in the tree hierarchy
+and invalidating the old one.  But the special /.journal file is different: I
+have never observed it to consist of more than the head chunk, and this head
+chunk is pre-allocated with some largish and apparently fixed length (4 KiB on
+my GTA02, 16 KiB on the Pirelli).  This pre-allocated chunk contains what look
+like 16-byte records at the beginning (on the first 4-byte boundary after the
+NUL terminating the ".journal" name), followed by blank flash for the remainder
+of the pre-allocated chunk - so it surely looks like new flash writes happen
+within this chunk.
+
+I do not currently know the purpose of this /.journal file or the meaning of the
+records it seems to contain.  This understanding would surely be needed if one
+wanted to create FFS images from scratch or to implement FFS write operations,
+but I reason that a read-only implementation can get away with simply ignoring
+this file: it can't be necessary in order to parse an FFS image for reading,
+because one needs to parse the tree structure first in order to locate this
+journal file itself.
+
 -------------------------------------------------------------------------------
 
 That's all I can think of right now.  If anything is unclear, see the
@@ -359,12 +440,13 @@
 code in this package.
 
 I never got as far as attempting to locate the FFS implementation routines
-within the proprietary firmware binary code images, and I most certainly don't
-have anything from TI that would help in this case.  (The TSM30 code doesn't
-seem to be of any use as its FFS appears to be totally different, and I haven't
-looked at the FFS code in the more recently found LoCosto code leak because I
-assumed from the documentation in the latter that the FFS implemented there is
-different as well.)
+within the proprietary firmware binary code images, and I haven't found an
+implementation of this particular FFS in any of the leaked sources yet either.
+The TSM30 code doesn't seem to be of any use as its FFS appears to be totally
+different.  As to the more recently found LoCosto code leak, I found that one a
+few days *after* I got the Moko/Pirelli "MysteryFFS" reverse-engineered on my
+own, and when I did look at the FFS in the LoCosto code later, I saw what seems
+to be a different FFS as well.
 
 Michael Spacefalcon
-SE 52 Mes 11
+SE 52 Mes 16