comparison mpffs/Description @ 27:343b6b2f178b
beginning of Mokopir-FFS verbal description
author | Michael Spacefalcon <msokolov@ivan.Harhan.ORG> |
---|---|
date | Sun, 30 Jun 2013 01:17:30 +0000 |
parents | |
children | c9f7a4afccc9 |
comparison
26:d19b4e20ff9f | 27:343b6b2f178b |
---|---|
1 This is a description, based on reverse engineering, of the flash file system | |
2 (FFS) implemented in Pirelli's original firmware for the DP-L10 GSM/WiFi dual | |
3 mode mobile phone, and in the Closedmoko GTA0x modem firmware. Not knowing the | |
4 "proper" name for this FFS, and needing _some_ identifier to refer to it, I | |
5 have named it Mokopir-FFS, from "Moko" and "Pirelli" - sometimes abbreviated | |
6 further to MPFFS. | |
7 | |
8 (I have previously called the FFS in question MysteryFFS; but now that I've | |
9 successfully reverse-engineered it, it isn't such a mystery any more :-) | |
10 | |
11 At a high functional level, Mokopir-FFS presents the following features: | |
12 | |
13 * Has a directory tree structure like UNIX file systems; | |
14 | |
15 * The file system API that must be implemented inside the proprietary firmware | |
16 appears to use UNIX-style pathnames; doing strings on firmware images reveals | |
17 pathname strings like these: | |
18 | |
19 /var/dbg/dar | |
20 /gsm/l3/rr_white_list | |
21 /gsm/l3/rr_medium_rxlev_thr | |
22 /gsm/l3/rr_upper_rxlev_thr | |
23 /gsm/l3/shield | |
24 | |
25 Parsing the corresponding FFS image with tools included in the present | |
26 package has confirmed that the directory structure implied by these pathnames | |
27 does indeed exist in the FFS. | |
28 | |
29 * Absolutely no DOS-ish semantics seen anywhere: no 8.3 filenames and no | |
30 colon-separated device names (seen in the TSM30 file system source, for | |
31 example) are visible in the Closedmoko/Pirelli FFS. | |
32 | |
33 * File contents are stored uncompressed, but not necessarily contiguous: one | |
34 could probably store a file in FFS which is bigger than the flash sector | |
35 size, in which case it can never be contiguous in a writable FFS (see below), | |
36 and the firmware implementation seems to limit chunk sizes to a fairly small | |
37 number: on the Pirelli phones all largish files are divided into chunks of | |
38 8 KiB each, and on my GTA02 the largest observed chunk size is only 2 KiB. | |
39 | |
40 The smaller files, like the IMEI and the firmware ID strings in my GTA02 FFS, | |
41 are contiguous. | |
42 | |
43 * The FFS structure is such that the length of "user" payload data stored in | |
44 each chunk (and consequently, in each file) can be known exactly in bytes, | |
45 with the files/chunks able to contain arbitrary binary data. (This property | |
46 may seem obvious or trivial, as all familiar UNIX and DOS file systems have | |
47 it, but contrast with RT-11 for example.) | |
48 | |
49 * The flash file system is a writable one: the running firmware can create, | |
50 delete and overwrite files (and possibly directories too) in the live FFS; | |
51 thus the FFS design is such that it allows these operations to be performed | |
52 within the physical constraints of NOR flash write operations. | |
53 | |
54 I have reverse-engineered this Mokopir-FFS on a read-only level. What it means | |
55 is that I, or anyone else who can read this document and the accompanying | |
56 source for the listing/extraction utilities, can take a Mokopir-FFS image read | |
57 out of a device and see/extract its full content: the complete directory tree | |
58 and the exact binary byte content of all files contained therein. | |
59 | |
60 However, the knowledge possessed by the present hacker (and conveyed in this | |
61 document and the accompanying source code) is NOT sufficient for constructing a | |
62 valid Mokopir-FFS image "in vitro" given a tree of directories and files, or | |
63 for making modifications to the file or directory content on an existing image | |
64 and producing a content-modified image that is also valid; valid as in suitable | |
65 for the original proprietary firmware to make its normal read and write | |
66 operations without noticing anything amiss. | |
67 | |
68 Constructing "de novo" Mokopir-FFS images or modifying existing images in such | |
69 a way that they remain 100% valid for all read and write operations of the | |
70 original proprietary firmware would, at the very minimum, require an | |
71 understanding of the meaning of *all* fields of the on-media FFS format. Some | |
72 of these fields are still left as "non-understood" for now though: a read-only | |
73 implementation can get away with simply ignoring them, but a writer/generator | |
74 would have to put *something* in those fields. | |
75 | |
76 As you read the "read-only" description of the Mokopir-FFS on-media format in | |
77 the remainder of this document, it should become fairly obvious which pieces | |
78 are missing before our understanding of this FFS can be elevated to a | |
79 "writable" level. | |
80 | |
81 However, when it comes to writing new code to run on the two Calypso phones in | |
82 question (Closedmoko and Pirelli), it seems, at least to the present hacker, | |
83 that a read-only understanding of Mokopir-FFS should be sufficient: | |
84 | |
85 * In the case of Closedmoko GTA0x modems, the FFS is seen to contain the IMEI | |
86 and the RF calibration data. The format of the former is obvious; the latter | |
87 not so much - but in any case, the information of interest is clearly of a | |
88 read-only nature. It's difficult to tell (or rather, I haven't bothered to | |
89 experiment enough) whether the Closedmoko firmware does any writes to FFS or | |
90 if the FFS is treated as read-only outside of the production line environment, | |
91 but in any case, it seems to me that for any 3rd party replacement firmware, | |
92 the best strategy would be to treat the FFS as a read-only source of IMEI and | |
93 RF calibration data, and nothing more. | |
94 | |
95 * In the case of Pirelli phones, the FFS is used to store user data: sent and | |
96 received SMS (and MMS/email/whatever), call history, UI settings, pictures | |
97 taken with the camera, and whatever else. It also stores a ton of files | |
98 which I can only presume were meant to be immutable except at the time of | |
99 firmware updates: graphics for the UI, ringtones, i18n UI strings, and even | |
100 "helper" firmware images for the WiFi and VoIP processors. However, no IMEI | |
101 or RF calibration data are anywhere to be found in the FFS - instead this | |
102 information appears to be stored in the "factory block" at the end of the | |
103 flash (in its own sector) outside of the FFS. | |
104 | |
105 Being able to parse FFS images extracted out of Pirelli phones "in vitro" | |
106 allows us to steal some of these helper files (UI artwork, ringtones, | |
107 WiFi/VoIP helpers), and some of these might even prove useful to firmware | |
108 replacement projects, but it seems to me that a replacement firmware would | |
109 be better off using its own FFS design for storing user data, and as to | |
110 retrieving the original IMEI and RF calibration data, the original FFS isn't | |
111 of any use for that anyway. | |
112 | |
113 ======================= | |
114 Moko/Pirelli FFS format | |
115 ======================= | |
116 | |
117 OK, now that I'm done with the introduction, we can get to the actual | |
118 Mokopir-FFS format. | |
119 | |
120 * On the GTA0x modem (or at least on my GTA02; my sample size is 1) the FFS | |
121 occupies 7 flash sectors of 64 KiB each at offsets 0x380000 through 0x3E0000, | |
122 inclusive. | |
123 | |
124 (The 4 MiB NOR flash chip used by Closedmoko has an independent R/W bank | |
125 division between the first 3 MiB and the last 1 MiB. The first 3 MiB are used | |
126 to hold the field-flashable closed firmware images distributed as *.m0 files; | |
127 the independent last megabyte holds the FFS, and thus the FW could be | |
128 implemented to do FFS writes while running from flash in the main bank. | |
129 Less than half of that last megabyte appears to be used for the FFS though; | |
130 the rest appears to be unused - blank flash observed.) | |
131 | |
132 * On the Pirelli the FFS occupies 18 sectors of 256 KiB each at offsets 0 | |
133 through 0x440000 (inclusive) of the 2nd flash chip select, the one wired to | |
134 nCS3 on the Calypso. | |
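The two layouts above can be written down as a small table of constants. The following is only an illustrative Python sketch, not part of the accompanying utilities: the names FfsGeometry, GTA02 and PIRELLI are made up here, and the numbers are simply the ones quoted above (a sample size of one in the GTA02 case).

```python
from typing import NamedTuple


class FfsGeometry(NamedTuple):
    """Where the FFS sectors live within one flash chip select (hypothetical helper)."""
    base: int         # offset of the first FFS sector
    sector_size: int  # size of each flash sector in bytes
    nsectors: int     # number of sectors allocated to the FFS

    def sector_offsets(self):
        """Return the flash offset of every FFS sector, in order."""
        return [self.base + i * self.sector_size for i in range(self.nsectors)]


# Values taken from the text above; other units may differ.
GTA02 = FfsGeometry(base=0x380000, sector_size=64 * 1024, nsectors=7)
PIRELLI = FfsGeometry(base=0, sector_size=256 * 1024, nsectors=18)
```

Taking the last element of sector_offsets() is a quick way to check the "inclusive" end offsets quoted for each phone.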
135 | |
136 Each flash sector allocated to FFS begins with the following signature: | |
137 | |
138 00000000: 46 66 73 23 10 02 xx yy zz FF FF FF FF FF FF FF Ffs#............ | |
139 | |
140 The bytes shown as xx and yy above serve a non-understood purpose; as a guess, | |
141 they may hold some info for the flash wear leveling algorithm: in a "virgin" | |
142 FFS image like that found in my GTA02 (which never had a SIM card in it and | |
143 never made or received a call) or read out of a "virgin" Pirelli phone that | |
144 hasn't seen any active use yet, both of these bytes are FFs, but when I look at | |
145 FFS images read out of the Pirelli which I currently use as my everyday-use | |
146 cellphone, I see other values in sectors which must have been erased and | |
147 rewritten. A read-only implementation can ignore these bytes, as mine does. | |
148 | |
149 The byte shown as zz is more important though, even to a read-only | |
150 implementation. The 3 values I've encountered in this byte so far are AB, BD | |
151 and BF. Per my current understanding, in a "healthy" FFS exactly one sector | |
152 will have AB in its header, exactly one will have BF, and the rest will have | |
153 BD. The meanings are (or appear to be): | |
154 | |
155 AB: the sector holds a vital data structure which I have called the active | |
156 index block; | |
157 BD: the sector holds regular data; | |
158 BF: the sector is blank except for the header, can be turned into a new AB or | |
159 BD. | |
160 | |
161 (Note that a flash program operation, which can turn 1s into 0s but not the | |
162 other way around, can turn BF into either AB or BD - but neither AB nor BD can | |
163 be turned into any other valid value.) | |
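The NOR constraint in the note above can be checked mechanically: a program operation can only clear bits, so a new value is reachable only if it has no 1 bit where the old value has a 0. A minimal Python sketch (the function name can_program is hypothetical):

```python
def can_program(old: int, new: int) -> bool:
    """True if a NOR flash program operation (1 -> 0 only) can turn old into new."""
    # Every 1 bit in the new value must already be 1 in the old value.
    return (old & new) == new


# The three sector state bytes discussed above.
AB, BD, BF = 0xAB, 0xBD, 0xBF
```

With these values, can_program(BF, AB) and can_program(BF, BD) hold, while neither AB nor BD can be programmed into the other.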
164 | |
165 In a "virgin" FFS image (as explained above) the first FFS sector is AB, the | |
166 last one is BF, and the ones in between are BDs. | |
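A sketch of how a reader might classify sectors by their header and pick out the one marked AB. This is illustrative Python under stated assumptions, not the code of the accompanying utilities: the function names are invented, only the 'Ffs#' part of the signature is checked (the 10 02 bytes are merely what has been observed so far), and the image is assumed to be a bytes object covering the FFS sectors.

```python
FFS_MAGIC = b"Ffs#"
STATE_OFFSET = 8  # position of the zz byte within the sector header


def sector_state(image: bytes, offset: int):
    """Return the zz state byte of the sector at offset, or None if unsigned."""
    if image[offset:offset + 4] != FFS_MAGIC:
        return None
    return image[offset + STATE_OFFSET]


def find_active_index(image: bytes, sector_size: int, nsectors: int, base: int = 0) -> int:
    """Return the offset of the single sector whose header carries AB."""
    hits = [base + i * sector_size
            for i in range(nsectors)
            if sector_state(image, base + i * sector_size) == 0xAB]
    if len(hits) != 1:
        # A "healthy" FFS is expected to have exactly one AB sector.
        raise ValueError("expected exactly one AB sector, found %d" % len(hits))
    return hits[0]
```

Raising an error when the AB count is not exactly one is a design choice of this sketch; how the original firmware copes with such an image is not known.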
167 | |
168 An FFS read operation (a search for a given pathname, or a listing of all | |
169 present directories and files) needs to start with locating the active index | |
170 block - the FFS sector with AB in the header. Following this header, which is | |
171 treated as being 16 bytes long (almost everything in Mokopir-FFS is aligned on | |
172 16-byte boundaries), the active index block contains a linear array of 16-byte | |
173 records, each record describing an FFS object: directory, file or file | |
174 continuation chunk. | |
175 | |
176 Here is my current understanding of the 16-byte index block record structure: | |
177 | |
178 2 bytes: Length of the described chunk in bytes | |
179 1 byte: Purpose/meaning not understood, ignored by my current code | |
180 1 byte: Object type | |
181 2 bytes: Descendant pointer | |
182 2 bytes: Sibling pointer | |
183 4 bytes: Data pointer | |
184 4 bytes: Purpose/meaning not understood, ignored by my current code | |
185 | |
186 (On the Calypso phones of interest, all multibyte fields are in the native | |
187 little-endian byte order of the ARM7TDMI processor.) | |
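The record layout above maps directly onto a little-endian struct format. The Python sketch below is only an illustration; the field names (in particular unknown1 and unknown2 for the two non-understood fields) are invented here.

```python
import struct
from typing import NamedTuple

# 2+1+1+2+2+4+4 = 16 bytes, little-endian, no padding.
RECORD = struct.Struct("<HBBHHII")


class IndexRecord(NamedTuple):
    length: int      # length of the described chunk in bytes
    unknown1: int    # purpose not understood
    objtype: int     # object type byte
    descendant: int  # descendant pointer
    sibling: int     # sibling pointer
    data_ptr: int    # data pointer
    unknown2: int    # purpose not understood


def parse_record(raw: bytes) -> IndexRecord:
    """Decode one 16-byte index block record."""
    if len(raw) != RECORD.size:
        raise ValueError("index records are exactly 16 bytes long")
    return IndexRecord(*RECORD.unpack(raw))
```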
188 | |
189 The active index block gets filled with these records as objects are created; | |
190 the first record goes right after the 'Ffs#'...AB header (padded to 16 bytes); | |
191 the last record (at any given moment) is followed by blank flash for the | |
192 remainder of the sector. Records thus appear in the order in which they are | |
193 created, which bears no direct relation to the directory tree structure. | |
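Walking the records in creation order could look like the Python sketch below. Two assumptions are made here and should be treated as such: the slot occupied by the sector header counts as index 0, so that the first real record is #1, and a slot consisting entirely of FF bytes is taken to mark the start of the blank remainder.

```python
RECORD_SIZE = 16
BLANK = b"\xff" * RECORD_SIZE


def iter_records(sector: bytes):
    """Yield (index, raw 16-byte record) for each record in an index sector."""
    # Slot 0 holds the 'Ffs#' header, so records start at slot 1.
    for n in range(1, len(sector) // RECORD_SIZE):
        raw = sector[n * RECORD_SIZE:(n + 1) * RECORD_SIZE]
        if raw == BLANK:
            break  # assumed: blank flash follows the last record
        yield n, raw
```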
194 | |
195 The objects, each described by a record in the index block, are organized into | |
196 a tree structure by the descendant and sibling pointers, plus the object type | |
197 indicator byte. Let's start with the latter; the following objtype byte values | |
198 have been observed: | |
199 | |
200 00: deleted object - a read-only implementation should ignore everything except | |
201 the descendant and sibling pointers. (A write-capable implementation would | |
202 need more care - it would need a way of reclaiming dirty flash space taken | |
203 up by deleted/overwritten files.) | |
204 | |
205 E1: a special file - see the description of the /.journal file further down | |
206 F1: a regular file (head chunk thereof) | |
207 F2: a directory | |
208 F4: file continuation chunk (explained below) | |
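For code that consumes these values, the observed object types could be collected in an enumeration. This Python sketch uses invented member names; only the numeric values come from the list above, and any other byte value is reported as unknown rather than guessed at.

```python
from enum import IntEnum


class ObjType(IntEnum):
    DELETED = 0x00       # deleted object
    SPECIAL = 0xE1       # special file (/.journal)
    FILE = 0xF1          # regular file, head chunk
    DIRECTORY = 0xF2     # directory
    CONTINUATION = 0xF4  # file continuation chunk


def classify(objtype_byte: int):
    """Return the matching ObjType, or None for a value not observed so far."""
    try:
        return ObjType(objtype_byte)
    except ValueError:
        return None
```

Since 00 has no 1 bits, it can be reached from any of the other values by a flash program operation, which fits the NOR constraint noted earlier.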
209 | |
210 Each record in the index block has an associated chunk in one of the data | |
211 sectors; the index record contains fields giving the address and length of this | |
212 chunk. The length of a chunk is always a nonzero multiple of 16 bytes, and is | |
213 stored (as a number in bytes) in the first 16-bit field of the 16-byte index | |
214 entry. The address of each chunk is given by the data pointer field of the | |
215 index record, and it is reckoned in 16-byte units (thereby 16-byte alignment is | |
216 required) from the beginning of the FFS sector group in the flash address space. | |
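Turning a record's length and data pointer into a byte range is a matter of one multiplication. The Python sketch below assumes the FFS sector group has been read into a single buffer starting at its first sector; chunk_bounds is a hypothetical helper name, and the sanity checks reflect the constraints stated above rather than anything known about the firmware's own validation.

```python
def chunk_bounds(length: int, data_ptr: int, ffs_size: int):
    """Return (start, end) byte offsets of a chunk within the FFS buffer."""
    if length == 0 or length % 16 != 0:
        raise ValueError("chunk length must be a nonzero multiple of 16")
    start = data_ptr * 16  # the data pointer counts 16-byte units
    end = start + length
    if end > ffs_size:
        raise ValueError("chunk extends past the end of the FFS")
    return start, end
```

For example, with a 7 x 64 KiB FFS, a record with length 16 and data pointer 0x1000 describes the first 16 bytes of the second sector.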
217 | |
218 For objects of type F1 and F2 (regular files and directories) the just-described | |
219 chunk begins with the name of the file or subdirectory as a NUL-terminated ASCII | |
220 string. This name is just for the current level of the directory tree, just | |
221 like in UNIX directories, thus one will have chunk names like gsm, l3, eplmn | |
222 etc, rather than /gsm/l3/eplmn. One practical effect is that one can't readily | |
223 see pathnames or any of the directory structure by looking at an FFS image as a | |
224 raw hex dump; the structure is only revealed when one uses a parsing program | |
225 like those which accompany this document. | |
226 | |
227 In the case of directories, the "chunk" part of the object contains only the | |
228 name of the directory itself, padded with FFs to a 16-byte boundary. For | |
229 example, an FFS directory named /gsm would be represented by an object | |
230 consisting of two flash writes: a 16-byte entry in the active index block, with | |
231 the object type byte set to F2, and a corresponding 16-byte chunk in one of the | |
232 data sectors, with the 16 bytes containing "gsm", a terminating NUL byte, and | |
233 12 FF bytes to pad up to 16. In the case of files, this name may be followed | |
234 by the first chunk of file data content, as explained further down. | |
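Extracting the name from the start of an F1 or F2 chunk can be sketched in Python as follows. The helper name split_name is invented; the bytes after the terminating NUL are returned untouched, since for directories they are just FF padding and for files their interpretation depends on details explained further down.

```python
def split_name(chunk: bytes):
    """Split a chunk into its NUL-terminated ASCII name and the remaining bytes."""
    nul = chunk.find(b"\x00")
    if nul < 0:
        raise ValueError("no NUL terminator found in chunk")
    name = chunk[:nul].decode("ascii")
    return name, chunk[nul + 1:]
```

Applied to the /gsm example above, the 16-byte chunk yields the name "gsm" and 12 bytes of FF padding.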
235 | |
236 In order to parse the FFS directory tree (whether the objective is to dump the | |
237 whole thing recursively or to find a specific file given a pathname), one needs | |
238 to first (well, after finding the active AB block) find the root directory node. | |
239 The root directory object is similar to other directory objects: it has a type | |
240 of F2, and an associated chunk of 16 bytes in one of the data sectors. The | |
241 latter contains the name of the root node: on the Pirelli it is "/", whereas on | |
242 my GTA02 it is "/ffs-root". | |
243 | |
244 The astute reader should notice that it really makes no sense to store a name | |
245 for the root node, and indeed, this name plays no part in the traversal of the | |
246 directory tree given an absolute pathname. But instead this name, or rather | |
247 its first character, appears to be used for the purpose of locating the root | |
248 node itself. At first I had assumed that the index record for the root node is | |
249 always the first record in the active index block right after the signature | |
250 header - that is how it is in "virgin" FFS images, and also in some quite non- | |
251 virgin ones I have pulled from my daily-use Pirelli. Naturally my first version | |
252 of the Mokopir-FFS (then called MysteryFFS) extraction utility expected the root | |
253 node to always be at index #1. But then I got some additional Pirelli phones, | |
254 and discovered that in certain cases, index record #1 is a deleted object (the | |
255 original root node which has been deleted), and the new active root node is | |
256 somewhere in the middle of the index! | |
257 | |
258 Thus it appears that in order to find the active root node, one needs to scan | |
259 the active index block linearly from the beginning (disregarding the tree | |
260 structure pointers in this initial pass), looking for a non-deleted object of | |
261 type F2 (a directory) whose corresponding name chunk sports a name beginning | |
262 with the '/' character. (Anyone who's been raised in UNIX will immediately | |
263 know that the path separator character '/' is the only character other than NUL | |
264 that's absolutely forbidden in the individual filenames - so this special | |
265 "root node name" is the only case of a '/' character appearing in what would | |
266 otherwise be a regular filename.) | |
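The linear scan described above could be sketched in Python like this. It is an illustration under assumptions rather than the accompanying extraction utility: ffs is a buffer holding the whole FFS sector group from its first sector, index_offset is where the AB sector starts within it, the header slot counts as index 0, and an all-FF slot is taken as the end of the records. The name find_root is hypothetical.

```python
import struct

RECORD = struct.Struct("<HBBHHII")  # the 16-byte index record layout
OBJTYPE_DIR = 0xF2


def find_root(ffs: bytes, index_offset: int, sector_size: int):
    """Return the index number of the active root node, or None if not found."""
    for n in range(1, sector_size // 16):
        pos = index_offset + n * 16
        raw = ffs[pos:pos + 16]
        if raw == b"\xff" * 16:
            break  # assumed end of records: blank flash
        _length, _unk1, objtype, _desc, _sib, data_ptr, _unk2 = RECORD.unpack(raw)
        if objtype != OBJTYPE_DIR:
            continue  # skips deleted objects (00) and all non-directories
        # The data pointer counts 16-byte units from the start of the FFS.
        if ffs[data_ptr * 16:data_ptr * 16 + 1] == b"/":
            return n
    return None
```

Because the tree pointers are ignored in this pass, a deleted former root at index #1 is simply skipped and the scan carries on to the live one.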
267 | |
268 [What causes the root node to be somewhere other than at index #1? I assume it | |
269 has to do with the dirty space reclamation / data movement algorithm. In a | |
270 "virgin" FFS image the very first sector is the active index block, and the | |
271 following sector is the first to hold chunks, beginning with the name chunk of | |
272 the root node. Now what happens if all data in that sector aside from the | |
273 root node name and some other mostly-static directory names becomes dirty, | |
274 i.e., belonging to deleted or overwritten files? How would that flash space | |
275 get reclaimed? I assume that the FFS firmware algorithm moves all still-active | |
276 chunks to a new flash sector, invalidating the old copies - turning the latter | |
277 into deleted objects. The root node will be among them. Then at some point | |
278 the active index block is going to fill up too, and will need to be rewritten | |
279 into a new sector - at which point the previously-deleted index entries are | |
280 omitted and the root node becomes #1 again...] |