comparison doc/Firmware_Architecture @ 868:d92b110e06e0
doc/Firmware_Architecture written
author | Space Falcon <falcon@ivan.Harhan.ORG> |
---|---|
date | Sun, 17 May 2015 03:45:19 +0000 |
867:c4da570dca83 | 868:d92b110e06e0 |
---|---|
1 Our FreeCalypso GSM firmware follows the same architecture as TI's TCS211; | |
2 this document is an attempt to describe this architecture. | |
3 | |
4 Nucleus environment | |
5 =================== | |
6 | |
7 Like all classic TI firmwares, ours is based on the Nucleus PLUS RTOS. Just | |
8 like TI's original code on which we are based, we use only a small subset of | |
9 the functionality provided by Nucleus - but because the latter is a library, | |
10 the pieces we don't use simply don't get pulled into the link. The main | |
11 function we get out of Nucleus is the scheduling of threads, or tasks as | |
12 Nucleus calls them. | |
13 | |
14 Our entry point code receives control from the Calypso boot ROM, from other | |
15 bootloaders on crippled targets, or from loadagent in the case of fc-xram | |
16 loadable builds. It does some absolutely minimal initialization (sets up | |
17 sensible memory access timings, copies iram.text to IRAM and .data to XRAM if | |
18 we are booting from flash, and zeroes out our two bss segments, int.bss and | |
19 ext.bss) and then jumps to Nucleus' assembly init entry point. Prior to jumping | |
20 to Nucleus, we don't even have a stack (all init code prior to that point is | |
21 pure assembly and uses only ARM registers); Nucleus then sets up the stack | |
22 pointer for everything running under its control. | |
23 | |
24 Aside from just a few exceptions (ARM exception handlers come to mind, never | |
25 mind the pun), every piece of code in the firmware executes in one of the | |
26 following contexts: | |
27 | |
28 * Application_Initialize(): this function and everything called from it execute | |
29 just before Nucleus' thread scheduler starts; at this point interrupts are | |
30 disabled at the ARM7 core level (in the CPSR) and must not be enabled; the | |
31 stack is Nucleus' "system stack" which is also used by the scheduler and LISRs | |
32 as explained below. | |
33 | |
34 * Regular threads or tasks: once Application_Initialize() finishes, all code | |
35 with the exception of interrupt handlers (LISRs and HISRs as explained below) | |
36 runs in the context of some Nucleus task. Whenever you are trying to debug | |
37 or simply understand some piece of code in the firmware, the first question | |
38 you should ask is "which task does this code execute in?". Most functional | |
39 components run in their own tasks, i.e., a given piece of code is only | |
40 intended to run within the Nucleus task that belongs to the component in | |
41 question. On the other hand, some components are implemented as APIs, | |
42 functions to be called from other components: these don't have their own task | |
43 associated with them, and instead they run in the context of whatever task | |
44 they were called from. Some only get called from one task: for example, the | |
45 "uartfax" driver API calls only get called from the protocol stack's UART | |
46 entity, which is its own task. Other component API functions like FFS and | |
47 trace can get called from just about any task in the system. Many components | |
48 have both their own task and some API functions to be called from other tasks, | |
49 and the API functions oftentimes post messages to the task to be worked on by | |
50 the latter; the just-mentioned FFS and trace functions work in this manner. | |
51 | |
52 In our current GSM firmware (just like in TCS211) every Nucleus task is | |
53 created either through Riviera or through GPF, and not in any other way - see | |
54 the description of Riviera and GPF below. | |
55 | |
56 * LISRs (Low level Interrupt Service Routines): these are the interrupt handlers | |
57 that run immediately when an ARM IRQ or FIQ comes in. The code at the IRQ and | |
58 FIQ vector entry points calls Nucleus' magic stack switching function | |
59 (switches the CPU from IRQ/FIQ into SVC mode, saves the interrupted thread's | |
60 registers on that thread's stack, and switches to the "system" stack) and | |
61 then calls TI's IRQ dispatcher implemented in C. The latter figures out | |
62 which Calypso interrupt needs to be handled and calls the handler configured | |
63 in the compiled-in table. Nucleus' LISR registration framework is not used | |
64 by the GSM fw, but these interrupt handlers should be viewed as LISRs | |
65 nonetheless. | |
66 | |
67 There is one additional difference between canonical Nucleus and TI's version | |
68 (we've replicated the latter): canonical Nucleus was designed to support | |
69 nested LISRs, i.e., IRQs re-enabled in the magic stack switching function, | |
70 but in TI's version which we follow this IRQ re-enabling is removed: each LISR | |
71 runs with interrupts disabled and cannot be interrupted. (The corner case of | |
72 an FIQ interrupting an IRQ remains to be looked at more closely as bugs may be | |
73 hiding there, but Calypso doesn't really use FIQ interrupts.) There is really | |
74 no need for LISR nesting in our GSM fw, as each LISR is very short: most LISRs | |
75 do nothing more than trigger the corresponding HISR. | |
76 | |
77 * HISRs (High level Interrupt Service Routines): these hold an intermediate | |
78 place between LISRs and tasks, similar to softirqs in the Linux kernel. A | |
79 HISR can be activated by a LISR calling NU_Activate_HISR(), and when the LISR | |
80 returns, the HISR will run before the interrupted task (or some higher | |
81 priority task, see below) can resume. HISRs run with CPU interrupts enabled, | |
82 thus more interrupts can occur, with their LISRs executing and possibly | |
83 triggering other HISRs. All triggered HISRs must complete and thereby go | |
84 "quiescent" before task scheduling resumes, i.e., all HISRs as a group have a | |
85 higher scheduling priority than tasks. | |
86 | |
87 Nucleus implements priority scheduling for tasks. Tasks have their priority set | |
88 when they are created (through Riviera or GPF, see below), and a higher priority | |
89 task will run until it gets blocked waiting for something, at which time lower | |
90 priority tasks will run. If a lower priority task sends a message to a higher | |
91 priority task, unblocking the latter which was waiting for incoming messages, | |
92 the lower priority task will effectively suspend itself immediately while the | |
93 higher priority task runs to process the message it was sent. | |
94 | |
95 HISRs oftentimes post messages to their associated tasks as well; if one of | |
96 these messages unblocks a higher priority task, that unblocked task will run | |
97 upon the completion of the HISR instead of the original lower priority task | |
98 that was interrupted by the LISR that triggered the HISR. Nucleus' scheduler | |
99 is fun! | |
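The ordering rules described above (activated HISRs first, then the highest
priority ready task) can be illustrated with a toy model. This is a minimal
sketch and not Nucleus code: the struct, the pick_next() function and the
convention that a larger number means a higher priority are all assumptions
made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not Nucleus code: every name here is hypothetical. */
enum kind { KIND_TASK, KIND_HISR };

struct runnable {
    const char *name;
    enum kind   kind;
    int         priority;  /* assumption: larger value = higher priority */
    int         ready;     /* nonzero if activated (HISR) or unblocked (task) */
};

/*
 * Pick what runs next: any activated HISR beats every task; within the
 * same kind, the higher priority ready entry wins.
 * Returns NULL when nothing is ready (idle).
 */
const struct runnable *pick_next(const struct runnable *r, size_t n)
{
    const struct runnable *best = NULL;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!r[i].ready)
            continue;
        if (best == NULL) {
            best = &r[i];
            continue;
        }
        if (r[i].kind == KIND_HISR && best->kind == KIND_TASK)
            best = &r[i];   /* HISRs as a group outrank tasks */
        else if (r[i].kind == best->kind && r[i].priority > best->priority)
            best = &r[i];   /* same kind: compare priorities */
    }
    return best;
}
```

In this model, the case from the last paragraph looks as follows: a low
priority task is interrupted, the HISR marks a high priority task as ready,
and once the HISR is no longer ready, pick_next() returns the high priority
task rather than the one that was interrupted.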
100 | |
101 Major functional blocks | |
102 ======================= | |
103 | |
104 At the highest level, all code in TI's classic firmwares and in our FreeCalypso | |
105 fw can be divided into 3 broad groupings: | |
106 | |
107 * GSM Layer 1: this code was developed by TI, is highly specific to TI's | |
108 baseband chipset family in general and to specific individual chips in | |
109 particular (the code is liberally sprinkled with conditional compilation | |
110 based on DBB type, ABB type, DSP ROM version and so on), and is absolutely | |
111 necessary in order to operate a Calypso device as a GSM MS (mobile station) | |
112 and not merely as a general purpose microprocessor platform. This code can | |
113 be considered to be the most important part of the entire firmware. | |
114 | |
115 L1 interties with Nucleus and with the G23M stack (with which it needs to | |
116 communicate) in a very peculiar way described later in this article. | |
117 | |
118 * G23M protocol stack: at the beginning of TI's involvement in the GSM baseband | |
119 chipset business, they only developed and maintained their own L1 code, while | |
120 the rest of the protocol stack (which is hardware-independent) was licensed | |
121 from another company called Condat. Later Condat as a company was fully | |
122 acquired by TI, and the once-customer of this code became its owner. The | |
123 name of TI/Condat's implementation of GSM layers 2&3 for the MS side is G23M, | |
124 and it forms its own major division of the overall fw architecture. | |
125 | |
126 Underlying the G23M stack is a special layer called GPF, which was originally | |
127 Condat's Generic Protocol stack Framework. Apparently Condat was in the | |
128 business of developing and maintaining a whole bunch of protocol stacks: GSM | |
129 MS side, GSM network side, TETRA and who knows what else. GPF was their | |
130 common underpinning for all of their protocol stack projects, which ran on top | |
131 of many different OS environments: Nucleus, pSOS, VxWorks, Unix/Linux, Win32 | |
132 and who knows what else. | |
133 | |
134 In the case of FreeCalypso GSM fw, both the protocol stack and the underlying | |
135 OS environment are fixed: GSM and Nucleus, respectively. But GPF is still a | |
136 critically important layer in the firmware architecture: in addition to | |
137 serving as the glue between the G23M stack and Nucleus, it provides some | |
138 important support infrastructure for the protocol stack. | |
139 | |
140 * Miscellaneous peripheral accessories: under this category I (Space Falcon) | |
141 place everything implemented through TI's Riviera framework. Historical | |
142 evidence indicates that TI's earliest firmwares did not have this part, i.e., | |
143 Riviera and everything built on top of it is a "non-essential" later | |
144 addition. It appears that TI originally invented Riviera in order to support | |
145 the development of fancy "feature phone" UI/application layers, complete with | |
146 Java, MMS, WAP, games and whatnot - things upon which our FreeCalypso project | |
147 looks with disdain - but in the TCS211 firmware from 2007 which I used as the | |
148 reference for FreeCalypso this Riviera framework serves as the foundation for | |
149 some small but essential pieces of functionality: the FFS implementation, the | |
150 SPI-based ABB access driver, the RTC driver and the debug trace facility. | |
151 | |
152 While it is certain that TI had some non-Riviera implementation of the just- | |
153 listed essential pieces in their earliest pre-Riviera days, trying to find | |
154 surviving sources from those days would be a "mission impossible" task. OTOH, | |
155 reusing the Riviera code from TCS211 was quite easy, as the copy of TCS211 we | |
156 got has it in full source form with nothing omitted. Therefore, I took the | |
157 sensible easy road and kept Riviera in FreeCalypso. | |
158 | |
159 The above division of the firmware into 3 broad functional groupings also | |
160 corresponds quite neatly with where each piece of our source code originally | |
161 came from. Our versions of L1 and G23M came in their entirety from TI's TCS3.2 | |
162 program targeting their later LoCosto chipset (specifically from the | |
163 TCS3.2_N5.24_M18_V1.11_M23BTH_PSL1_src.zip release from Peek/FGW), whereas | |
164 everything in the 3rd division (Riviera and everything built on top of it) came | |
165 from our TCS211/Leonardo source from Sotovik. | |
166 | |
167 The just-listed divisions of the firmware are really separate software | |
168 environments which are linked together into one final image, but which have | |
169 very little in the way of interties. Each of the 3 realms has its own very | |
170 different coding style, its own set of header files and its own defined types. | |
171 It is very rare for a module from one realm to include any header files or call | |
172 any functions from another realm, and while they all ultimately run on top of | |
173 Nucleus, they interface with Nucleus in different ways: G23M goes through GPF, | |
174 everything in Riviera land goes through Riviera, and L1 uses its own bizarre | |
175 mechanism which in our fw ends up going through GPF but hasn't always been this | |
176 way - to be explained later in this article. | |
177 | |
178 Also note that there is no mention of any handset UI code (or MMI in the GSM | |
179 industry's sexist speak) in the above breakdown of code divisions. This | |
180 document describes the architecture of TI's modem firmware in which the highest | |
181 layer is the AT command interface (part of the G23M suite, or its uppermost | |
182 layer to be precise), and which does not include any UI code. Our TI reference | |
183 sources do include their "MMI" code, but I haven't studied it closely enough | |
184 yet to comment on it properly, and the version of TCS211 which serves as our | |
185 primary reference is set up for the modem configuration without this "MMI" part. | |
186 Making sense of TI's "MMI" code is a task to be tackled later in the project | |
187 when we have a working modem and are ready to start building a usable handset | |
188 with UI. | |
189 | |
190 Riviera and GPF | |
191 =============== | |
192 | |
193 Riviera and GPF are two parallel/independent/competing wrappers around or | |
194 layers above Nucleus. The way in which they are treated in our FreeCalypso fw | |
195 architecture is somewhat inverted: originally GPF was the essential framework | |
196 underlying the G23M stack (and to which L1 was also attached in a hacky way) | |
197 while Riviera was added to support non-essential frills, but in our current FC | |
198 fw Riviera is always included just like Nucleus, whereas GPF only needs to be | |
199 included in the build when building with feature gsm (full GSM MS functionality) | |
200 or feature l1stand (L1 standalone) - but is not needed if one wishes to build | |
201 an "in vivo" FFS editing agent, for example. | |
202 | |
203 This peculiar arrangement happened because of the source code availability | |
204 situation we found ourselves in. TCS211 uses real Riviera that is fully | |
205 independent of GPF (see below), and our copy thereof came with this part in | |
206 full source form. On the other hand, we never got the complete original source | |
207 for GPF in one piece, thus our FC version of GPF had to be reconstructed from | |
208 bits and pieces. For this reason I made the decision early on to include | |
209 Riviera and some RV-based components in the "mandatory core" part of our FC fw | |
210 architecture, while leaving GPF to be worked on later. And when I did get to | |
211 reintegrating GPF, at that point it was natural to make it into an "optional" | |
212 component that is included only when needed. | |
213 | |
214 At some point in their post-Calypso TCS3.x program TI decided to eliminate | |
215 Riviera as an independent framework and to reimplement Riviera APIs (used by | |
216 peripheral but necessary code such as FFS, ETM, various drivers etc) over GPF. | |
217 This arrangement is used in the TCS3.2 LoCosto code from which we lifted our | |
218 versions of L1 and G23M. However, I (Space Falcon) chose not to adopt this | |
219 approach for FreeCalypso, and mimic the TCS211 way (Riviera entirely | |
220 independent of GPF) instead. The reasons were twofold: (1) there was no full | |
221 source for GPF and a painstaking reconstruction effort was required before we | |
222 could have our own working version of GPF in our gcc-built fw, and (2) I felt | |
223 more comfortable and familiar with following TCS211. | |
224 | |
225 Start-up process | |
226 ================ | |
227 | |
228 I mentioned earlier that every Nucleus task in our firmware gets created and | |
229 started either through Riviera or through GPF. All GPF tasks are created and | |
230 placed into the runnable state in the Application_Initialize() context: the work | |
231 is done by GPF init code in gsm-fw/gpf/frame/frame.c, and the top level GPF | |
232 init function called from Application_Initialize() is StartFrame(). Thus when | |
233 Application_Initialize() finishes and the Nucleus thread scheduler starts | |
234 running for the first time, all GPF tasks are there to be scheduled. | |
235 | |
236 There is a compiled-in table of all protocol stack entities and the tasks in | |
237 which they need to run which (in our fw) lives under gsm-fw/gpf/conf and which | |
238 logically belongs to GPF. Canonically each protocol stack entity runs in its | |
239 own task, but sometimes two or more are combined to run in the same task: for | |
240 example, in the minimal GSM "voice only" configuration (no CSD, fax or GPRS) | |
241 CC, SMS and SS entities share the same task named CM. Unlike Riviera, GPF does | |
242 not support dynamic starting and stopping of tasks. | |
243 | |
244 As each GPF task starts running (immediately upon entry into Nucleus' scheduling | |
245 loop as Application_Initialize() finishes), the pf_TaskEntry() function in | |
246 gsm-fw/gpf/frame/frame.c is the first code it runs. This function creates the | |
247 queue for messages to be sent to all entities running within the task in | |
248 question, calls each entity's pei_init() function (repeatedly until it succeeds: | |
249 it will fail until the other entities to which this entity needs to send | |
250 messages have created their message queues), and then falls into the main body | |
251 of the task: for all "regular" entities/tasks except L1, this main body consists | |
252 of waiting for messages (or signals or timeouts) to arrive on the queue and | |
253 dispatching each received message to the appropriate handler in the right | |
254 entity. | |
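The "call pei_init() repeatedly until it succeeds" step can be sketched as
follows. This is only an illustration of the retry pattern, not the code in
frame.c: the struct, the function names and the yield callback are
hypothetical, and how the real frame waits between attempts is an assumption
here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real GPF types or calls. */
struct entity {
    const char *name;
    int (*init)(struct entity *self);  /* plays the role of pei_init() */
    int  attempts;                     /* how many times init was tried */
};

typedef void (*yield_fn)(void);

/* Set once the simulated peer entity has created its message queue. */
static int peer_queue_exists;
static int yields_until_peer_ready = 2;

/* An init that fails until the peer's queue exists. */
int init_needs_peer(struct entity *self)
{
    self->attempts++;
    return peer_queue_exists ? 0 : -1;
}

/* Simulates other tasks making progress while this one waits. */
void fake_yield(void)
{
    if (--yields_until_peer_ready <= 0)
        peer_queue_exists = 1;
}

/*
 * Initialize every entity hosted by one task, retrying each until it
 * succeeds. Returns the total number of failed attempts.
 */
int start_entities(struct entity *ents, size_t n, yield_fn yield)
{
    int retries = 0;

    assert(yield != NULL);
    for (size_t i = 0; i < n; i++) {
        while (ents[i].init(&ents[i]) != 0) {
            retries++;
            yield();  /* assumption: let other tasks create their queues */
        }
    }
    return retries;
}
```

Calling start_entities() with one entity whose init is init_needs_peer and
with fake_yield as the callback fails twice and succeeds on the third
attempt; only after that would the task fall into its message loop.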
255 | |
256 Riviera tasks get started in a different way. The same Application_Initialize() | |
257 function that calls StartFrame() to create and start all GPF tasks also calls | |
258 create_tasks() (found in gsm-fw/riviera/init/create_RVtasks.c), the appinit-time | |
259 function for starting the Riviera environment. But this function does not | |
260 create and start every configured Riviera task like StartFrame() does for GPF. | |
261 Instead it creates a special helper task which will do this work once scheduled. | |
262 Thus at the completion of Application_Initialize() and the beginning of | |
263 scheduling the set of runnable Nucleus tasks consists of all GPF ones plus the | |
264 special RV starter task. Once the RV starter task gets scheduled, it will call | |
265 rvm_start_swe() to launch every configured Riviera SWE (SoftWare Entity), which | |
266 in turn entails creating the tasks in which these SWEs are to run. | |
267 | |
268 Dynamic memory allocation | |
269 ========================= | |
270 | |
271 All dynamic memory allocation (i.e., all RAM usage beyond statically allocated | |
272 variables and buffers) is once again done either through Riviera or through GPF, | |
273 and in no other way. Ultimately all areas of the physical RAM that will ever | |
274 be used by the fw in any way are allocated when the fw is compiled and linked: | |
275 the areas from which Riviera and GPF serve their dynamic memory allocations are | |
276 statically allocated as char arrays in the respective C modules and placed in | |
277 the int.ram or ext.ram section as appropriate; Riviera and GPF then provide | |
278 API functions that allocate memory dynamically from these statically allocated | |
279 large pools. | |
280 | |
281 Riviera and GPF have entirely separate memory pools from which they serve their | |
282 respective clients, hence there is no possibility of one affecting the other. | |
283 Riviera's memory allocation scheme is very much like the classic malloc&free: | |
284 there is one large unstructured pool from which all allocations are made, one | |
285 can allocate a chunk of any size, free chunks are merged when physically | |
286 adjacent, and fragmentation is an issue: a memory allocation request may fail | |
287 even when there is enough memory available in total if it is too fragmented. | |
288 | |
289 GPF's dynamic memory allocation facility is considerably more robust: while it | |
290 does maintain one or two (depending on configuration) memory pools of the | |
291 traditional "dynamic" kind (like malloc&free, susceptible to fragmentation), | |
292 most GPF memory allocation works on "partition" memory instead. Here GPF | |
293 maintains 3 separate groups of pools: PRIM, TEST and DMEM; each allocation | |
294 request must specify the appropriate pool group and cannot affect the others. | |
295 Within each pool there is a fixed number of partitions of a fixed size: for | |
296 example, in TI's TCS211 GSM+GPRS configuration the PRIM pool group consists of | |
297 190 partitions of 60 bytes, 110 partitions of 128 bytes, 50 partitions of 632 | |
298 bytes and 7 partitions of 1600 bytes. An allocation request from a given pool | |
299 group (e.g., PRIM) can request any arbitrary size in bytes, but it gets rounded | |
300 up to the nearest partition size and allocated out of the respective pool. If | |
301 no free partitions are available, the requesting task is suspended until another | |
302 task frees one. Because these partitions are used primarily for intertask | |
303 communication, if none are free, it can only mean (assuming that the firmware | |
304 functions correctly) that all partitions have been allocated and sent to some | |
305 queue for some task to work on, hence eventually they will get freed. | |
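The rounding-up rule can be shown with the PRIM figures quoted above. This is
a minimal sketch, not GPF code: the names are hypothetical, returning 0 for an
oversized request is an assumption, and an exhausted pool is reported with a
0 return where the real firmware would suspend the calling task.

```c
#include <assert.h>
#include <stddef.h>

/* One fixed-size partition pool; names are hypothetical. */
struct pool {
    size_t   part_size;  /* bytes per partition */
    unsigned total;      /* number of partitions in the pool */
    unsigned in_use;     /* partitions currently handed out */
};

/* PRIM pool group as quoted above for the TCS211 GSM+GPRS config. */
static struct pool prim_group[] = {
    {   60, 190, 0 },
    {  128, 110, 0 },
    {  632,  50, 0 },
    { 1600,   7, 0 },
};
#define PRIM_POOLS (sizeof prim_group / sizeof prim_group[0])

/*
 * Return the pool with the smallest partition size that fits the
 * request, or NULL if the request exceeds the largest partition.
 */
struct pool *pool_for_size(size_t nbytes)
{
    for (size_t i = 0; i < PRIM_POOLS; i++)
        if (nbytes <= prim_group[i].part_size)
            return &prim_group[i];
    return NULL;
}

/*
 * Claim one partition. Returns the partition size actually handed out,
 * or 0 if the request is too large or the pool is exhausted.
 */
size_t claim_partition(size_t nbytes)
{
    struct pool *p = pool_for_size(nbytes);

    if (p == NULL || p->in_use == p->total)
        return 0;
    p->in_use++;
    assert(p->in_use <= p->total);
    return p->part_size;
}
```

With this table a 61-byte request consumes a whole 128-byte partition, and
after seven outstanding 1600-byte claims the next one cannot be satisfied
until a partition is released.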
306 | |
307 This scheme implemented in GPF is extremely robust in the opinion of this | |
308 author, and the other purely "dynamic" scheme is used (in the case of GPF) only | |
309 for init-time allocations which are never freed, such as task stacks - hence | |
310 the GPF-based part of the firmware is not susceptible at all to the problem of | |
311 memory fragmentation. But Riviera does suffer from this problem, and the | |
312 concern is more than just theoretical: one major user of Riviera-based dynamic | |
313 memory allocation is the trace facility (described in its own section below), | |
314 and my observation of the trace output from Pirelli's proprietary fw (which | |
315 appears to use the same architecture with separate Riviera and GPF) suggests | |
316 that after the fw has been running for a while, Riviera memory gets fragmented | |
317 to a point where many traces are being dropped. Replacing Riviera's poor | |
318 dynamic memory allocation scheme with a GPF-like partition-based one is a to-do | |
319 item for our project. | |
320 | |
321 Message-based intertask communication | |
322 ===================================== | |
323 | |
324 Even though all entities of the G23M protocol stack are linked together into | |
325 one monolithic fw image and there is nothing to stop them from calling each | |
326 other's functions and accessing each other's variables, they don't work that | |
327 way. Instead all communication between entities is done through messages, just | |
328 as if they ran in separate address spaces or even on separate processors. | |
329 Buffers for this message exchange are allocated from a GPF partition pool: an | |
330 entity that needs to send a message to another entity allocates a buffer of the | |
331 needed size, fills it with the message to be sent, and posts it on the recipient | |
332 entity's message queue, all through GPF services. The other entity simply | |
333 processes the stream of messages that arrives on its message queue, freeing each | |
334 message (returning the buffer to the partition pool it came from) as it is | |
335 processed. | |
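The allocate, fill, post, process and free cycle can be sketched like this.
It is a toy model under stated assumptions, not the GPF API: the message
layout, the queue and both function names are hypothetical, and malloc() and
free() merely stand in for the partition pool.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define QUEUE_DEPTH 8

/* Hypothetical message layout; opcodes are assumed to be non-negative. */
struct msg {
    int  opcode;
    char payload[32];
};

/* Simple ring of message pointers standing in for an entity's queue. */
struct queue {
    struct msg *slot[QUEUE_DEPTH];
    unsigned head, tail, count;
};

/* Sender side: allocate a buffer, fill it, post it. 0 on success. */
int send_msg(struct queue *q, int opcode, const char *text)
{
    struct msg *m;

    assert(q != NULL);
    if (q->count == QUEUE_DEPTH)
        return -1;
    m = malloc(sizeof *m);  /* stand-in for a partition allocation */
    if (m == NULL)
        return -1;
    m->opcode = opcode;
    snprintf(m->payload, sizeof m->payload, "%s", text);
    q->slot[q->tail] = m;
    q->tail = (q->tail + 1) % QUEUE_DEPTH;
    q->count++;
    return 0;
}

/*
 * Receiver side: take the oldest message, copy out its payload, and
 * return the buffer to the allocator. Returns the opcode, or -1 if the
 * queue is empty.
 */
int receive_and_free(struct queue *q, char *out, size_t outlen)
{
    struct msg *m;
    int opcode;

    assert(q != NULL);
    if (q->count == 0)
        return -1;
    m = q->slot[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    opcode = m->opcode;
    if (out != NULL && outlen > 0)
        snprintf(out, outlen, "%s", m->payload);
    free(m);  /* the receiver, not the sender, releases the buffer */
    return opcode;
}
```

The point of the model is the ownership rule: once a buffer has been posted,
the sender never touches it again, and the receiving side is the one that
frees it.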
336 | |
337 Riviera-based tasks use a similar mechanism: unlike G23M protocol stack | |
338 entities, most Riviera-based functional modules provide APIs that are called as | |
339 functions from other tasks, but these API functions typically allocate a memory | |
340 buffer (through Riviera), fill it with the call parameters, and post it to the | |
341 associated task's message queue (also in the Riviera land) to be worked on. | |
342 Once the worker task gets the job done, it will either call a callback function | |
343 or post a response message back to the requestor - the latter option is only | |
344 possible if the requesting entity is also Riviera-based. | |
345 | |
346 A closer look at GPF | |
347 ==================== | |
348 | |
349 There are certain sublayers within GPF which need to be pointed out. The 3 | |
350 major subdivisions within GPF are: | |
351 | |
352 * The meaty core of GPF: this part is the code under gsm-fw/gpf/frame in our | |
353 source tree. It appears that this part was originally intended to be both | |
354 project-independent (same for GSM, TETRA etc) and OS-independent (same for | |
355 Nucleus, pSOS, VxWorks etc). This is the part of GPF that matters for the | |
356 G23M stack: all APIs called by PS entities are implemented here, and so are | |
357 all other PS-facing functions such as startup. (PS = protocol stack) | |
358 | |
359 * OS adaptation layer (OSL): this is the part of GPF that adapts it to a given | |
360 underlying OS, in our case Nucleus. | |
361 | |
362 * Test interface: see the code under gsm-fw/gpf/tst_drv and gsm-fw/gpf/tst_pei. | |
363 This part handles the trace output from all entities that run under GPF and | |
364 the mechanism for sending external debug commands to the GPF+PS subsystem. | |
365 | |
366 GPF was a difficult step in our GSM firmware reintegration process because no | |
367 complete source for it could be found anywhere: apparently GPF was so stable | |
368 and so independent of firmware particulars (Calypso or LoCosto, GSM only or | |
369 GSM+GPRS, modem or complete phone with UI etc) that it appears to have been | |
370 used and distributed as prebuilt binary libraries even inside TI. All TI fw | |
371 (semi-)sources we have use GPF in prebuilt library form and are not set up to | |
372 recompile any part of it from source. (They had to include all GPF header | |
373 files though, as most of them are included by G23M C modules, and it would be | |
374 too much hassle to figure out which ones are or aren't needed, hence all were | |
375 included.) | |
376 | |
377 Fortunately though, we were able to find the sources for most parts of GPF: | |
378 | |
379 * The LoCosto source in TCS3.2_N5.24_M18_V1.11_M23BTH_PSL1_src.zip features the | |
380 source for the "core" part of GPF under gpf/FRAME - these sources aren't | |
381 actually used by that fw's build system (it only uses the prebuilt binary | |
382 libs for GPF), but they are there. | |
383 | |
384 * Our TCS211 semi-src doesn't have any sources for the core part of GPF, but | |
385 instead it features the source for the test interface and some "misc" parts: | |
386 under gpf/MISC and gpf/tst in that source tree - these sources are not present | |
387 in the LoCosto version from Peek. | |
388 | |
389 But one critical piece was still missing: the OS adaptation layer. It appears | |
390 that the GPF core (vsi_??? modules) and OSL (os_??? modules) were maintained | |
391 and built together, ending up together in frame_<blah>.lib files in the binary | |
392 form used to build firmwares, but the source for the "frame" part in the Peek | |
393 find contained only vsi_*.c and others, but not any of os_*.c. | |
394 | |
395 Thus we had to reconstruct GPF from the shattered bits and pieces we had. I | |
396 took the frame sources from Peek and the misc and tst sources from Sotovik, and | |
397 saw that they compiled w/o problems in our gcc environment. Attempting to link | |
398 any firmware that uses GPF would have been futile at this point, as it would | |
399 have failed with undefined references to os_*() functions. Then I had to do | |
400 the hard work: disassemble the missing os_??? modules from the binary libs in | |
401 the TCS211 version (hey, at least this one was known to work reliably) and write | |
402 new C code replicating the exact logic found in the disassembly of the known | |
403 working and fitting binary. This work is now mostly done (some non-essential | |
404 functions have been stubbed out to be revisited later), and the version of GPF | |
405 used by FreeCalypso is a significant work of reconstruction, not merely lifted | |
406 from a readily available source and plopped in. | |
407 | |
408 A closer look at L1 | |
409 =================== | |
410 | |
411 The L1 code is remarkable in how little intertie it has with the rest of the | |
412 firmware it is linked into. It is almost entirely self-contained, expecting | |
413 only 4 functions to be provided by the underlying OS environment: | |
414 | |
415 os_alloc_sig -- allocate message buffer | |
416 os_free_sig -- free message buffer | |
417 os_send_sig -- send message to upper layers | |
418 os_receive_sig -- receive message from upper layers | |
419 | |
420 It helps to remember that at the beginning of TI's involvement in the GSM | |
421 baseband chipset business, L1 was the only thing they "owned", while Condat, | |
422 the maintainers of the higher level protocol stack, was a separate company. | |
423 TI's "turnkey" solution must have consisted of their own L1 code plus G23M code | |
424 (including GPF etc) licensed from Condat, but I'm guessing that TI probably | |
425 wanted to retain the ability to sell their chips with their L1 without being | |
426 entangled by Condat: let the customer use their own GSM L23 stack, or perhaps | |
427 work out their own independent licensing arrangements with Condat. I'm | |
428 guessing that L1 was maintained as its own highly independent and at least | |
429 conceptually portable entity for these reasons. | |
430 | |
431 The way in which L1 is intertied into our FreeCalypso GSM fw is the same as how | |
432 it is done in TI's production firmwares, including both our TCS211 reference | |
433 and the TCS3.2 version from which we got our L1 source. There is a module | |
434 called OSX, which is an extremely thin adaptation layer that implements the | |
435 APIs expected by L1 in terms of GPF. Furthermore, this OSX layer provides | |
436 header file isolation: the only "outside" (non-L1) header included by L1 is | |
437 cust_os.h, and it defines the necessary interface to OSX *without* including | |
438 any other headers (no GPF headers in particular), using only the C language's | |
439 native types. Apart from this cust_os.h header, the entire OSX layer is | |
440 implemented in one C module (osx.c, which we had to reconstruct from osx.obj as | |
441 the source was missing - but it's very simple) which does include some GPF | |
442 headers and implements the OSX API in terms of GPF services. Thus in TI's | |
443 production firmwares and in our FC GSM fw L1 does sit on top of GPF, but very | |
444 indirectly. | |
445 | |
446 More specifically, the "production" version of OSX implements its API in terms | |
447 of *high-level* GPF functions, i.e., VSI. However, they also had an interesting | |
448 OP_L1_STANDALONE configuration which omitted not only all of G23M, but also the | |
449 core of GPF and possibly the Riviera environment as well. We don't have a way | |
450 to recreate this configuration exactly as it existed inside TI because we don't | |
451 have the source bits specific to this configuration (our own standalone L1 | |
452 configuration is implemented differently, see below), but we do have a little | |
453 bit of insight into how it worked. | |
454 | |
455 It appears that TI's OP_L1_STANDALONE build used a special "gutted" version of | |
456 GPF in which the "meaty core" (VSI etc) was removed. The OS layer (os_??? | |
457 modules implementing os_*() functions) that interfaces to Nucleus was kept, and | |
458 so was OSX used by L1 - but this time the OSX API functions were implemented in | |
459 terms of os_*() ones (low-level wrappers around Nucleus) instead of the higher- | |
460 level VSI APIs provided by the "meaty core" of GPF. It is purely a guess on my | |
461 part, but perhaps this hack was also done in the days before TI's acquisition | |
462 of Condat, and by omitting the "meaty core" of GPF, TI could claim that their | |
463 OP_L1_STANDALONE configuration did not contain any of Condat's "intellectual | |
464 property". | |
465 | |
466 In FreeCalypso we do have a way to build a firmware image that includes L1 but | |
467 not G23M: it is our own L1 standalone configuration, enabled with a | |
468 feature l1stand line in build.conf. However, because IP considerations don't | |
469 apply to us (we operate under the doctrine of eminent domain), we are not | |
470 replicating TI's gutting of GPF: *our* L1 standalone configuration includes the | |
471 full GPF (with OSX for L1 implemented in terms of VSI), but with a greatly | |
472 reduced set of tasks when G23M is omitted. | |
473 | |
474 Run-time structure of L1 | |
475 ======================== | |
476 | |
477 L1 consists of two major parts: L1S and L1A. L1S is the synchronous part where | |
478 the most time-critical functions are performed; it runs as a Nucleus HISR. The | |
479 hardware in the Calypso generates an interrupt on every TDMA frame (4.615 ms), | |
480 and the LISR handler for this interrupt triggers the L1S HISR. L1S communicates | |
481 with L1A through a shared memory data structure, and also sometimes allocates | |
482 message buffers and posts them to L1A's incoming message queue (both via OSX | |
483 API functions, i.e., via GPF in disguise). | |
484 | |
485 L1A runs as a regular task under Nucleus, and includes a blocking call (to GPF | |
486 via OSX) to wait for incoming messages on its queue. It is one big loop that | |
487 waits for incoming messages, then processes each received message and commands | |
488 L1S to do most of the work. The entry point to L1A in the L1 code proper is | |
489 l1a_task(), although the responsibility for running it as a task falls on some | |
490 "glue" code outside of L1 proper. TI's production firmwares with G23M included | |
491 have an L1 protocol stack entity within G23M whose only job (aside from some | |
492 initialization) is to run l1a_task() in the Nucleus task created by GPF for | |
493 that protocol stack entity; we do the same in our firmware. | |
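
The division of labour between the frame interrupt, L1S and L1A can be
simulated in plain C as follows. This is only a sketch under stated
assumptions: all demo_* names are hypothetical, the "report every 4th frame"
rule is arbitrary, and an empty-queue return stands in for the blocking wait
that the real L1A task performs.

```c
#include <assert.h>

#define DEMO_QDEPTH 8

static int      demo_queue[DEMO_QDEPTH];   /* L1A's incoming message queue */
static unsigned demo_q_head, demo_q_count;
static int      demo_hisr_pending;
unsigned        demo_frame_number;
unsigned        demo_msgs_handled;

/* Stand-in for the LISR: do the bare minimum and request the HISR. */
void demo_frame_lisr(void)
{
    demo_hisr_pending = 1;
}

/* Stand-in for the L1S HISR: the once-per-frame synchronous work. */
void demo_l1s_hisr(void)
{
    assert(demo_hisr_pending);
    demo_hisr_pending = 0;
    demo_frame_number++;
    /* Arbitrary rule for the demo: report to L1A on every 4th frame. */
    if (demo_frame_number % 4 == 0 && demo_q_count < DEMO_QDEPTH) {
        demo_queue[(demo_q_head + demo_q_count) % DEMO_QDEPTH] =
            (int)demo_frame_number;
        demo_q_count++;
    }
}

/* One iteration of the L1A-like loop; returns 0 when the queue is empty
 * (the real task would block instead of returning). */
int demo_l1a_step(void)
{
    if (demo_q_count == 0)
        return 0;
    (void)demo_queue[demo_q_head];      /* "process" the message */
    demo_q_head = (demo_q_head + 1) % DEMO_QDEPTH;
    demo_q_count--;
    demo_msgs_handled++;
    return 1;
}

/* Simulate n TDMA frames; returns the total number of messages handled. */
unsigned demo_run_frames(unsigned n)
{
    unsigned i;

    for (i = 0; i < n; i++) {
        demo_frame_lisr();
        demo_l1s_hisr();
        while (demo_l1a_step())
            ;
    }
    return demo_msgs_handled;
}
```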
494 | |
495 Communication between L1 and G23M | |
496 ================================= | |
497 | |
498 It is remarkable that L1 and G23M don't have any header files in common: L1 | |
499 uses its own (almost fully self-contained), whereas the G23M+GPF realm is its | |
500 own world with its own header files. One has to ask then: how do they | |
501 communicate? OK, we know they communicate through primitives (messages in | |
502 buffers allocated from GPF's PRIM partition memory pool) passed via message | |
503 queues, but what about the data structures in these messages? Where are those | |
504 defined if there are no header files in common between L1 and G23M? | |
505 | |
506 The answer is that there are separate definitions of the L1<->G23M interface on | |
507 each side, and TI must have kept them in sync manually. Not exactly a | |
508 recommended programming or software maintenance practice for sure, but TI took | |
509 care of it, and the existing proprietary products based on TI's firmware are | |
510 rock solid, so it is not really our place to complain. | |
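
To make the "two definitions kept in sync by hand" situation concrete, here is
a hypothetical sketch. The two structures and their field names are invented
for illustration and do not correspond to any real primitive; the layout check
is something I am adding for the demo and is not claimed to exist in TI's code.
The sender fills a buffer through its own definition, the receiver reads it
through a separately written one, and it works only because the layouts agree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Definition as it might appear in a header on the L1 side. */
struct demo_l1_side_ind {
    unsigned short arfcn;
    unsigned short rxlev;
    unsigned long  frame;
};

/* Independently written definition on the G23M side. */
struct demo_g23_side_ind {
    unsigned short radio_freq;
    unsigned short rx_level;
    unsigned long  fn;
};

/* Returns nonzero if the two hand-maintained layouts agree. */
int demo_layouts_match(void)
{
    return sizeof(struct demo_l1_side_ind) == sizeof(struct demo_g23_side_ind)
        && offsetof(struct demo_l1_side_ind, rxlev) ==
           offsetof(struct demo_g23_side_ind, rx_level)
        && offsetof(struct demo_l1_side_ind, frame) ==
           offsetof(struct demo_g23_side_ind, fn);
}

/* Send through one definition, receive through the other. */
unsigned demo_roundtrip(unsigned short arfcn)
{
    unsigned char buf[sizeof(struct demo_l1_side_ind)];
    struct demo_l1_side_ind  out;
    struct demo_g23_side_ind in;

    assert(demo_layouts_match());
    memset(&out, 0, sizeof out);
    out.arfcn = arfcn;
    memcpy(buf, &out, sizeof out);      /* "primitive" placed in a buffer */
    memcpy(&in, buf, sizeof in);        /* receiver's view of the same bytes */
    return in.radio_freq;
}
```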
511 | |
512 TI's firmwares from the era we are working with (the TCS3.2/LoCosto source from | |
513 20090327 from which we took our L1 and G23M and the binary libs version of | |
514 TCS211 from 20070608 which serves as our reference) also include a component | |
515 called ALR. It resides in the G23M code realm: G23M coding style, uses Condat | |
516 header files, runs as its own protocol stack entity under GPF. This component | |
517 appears to serve as a glue layer between the rest of the G23M stack (which is | |
518 supposed to be truly hardware-independent) and TI's L1. | |
519 | |
520 Speaking of ALR, it is worth mentioning that there is a little naming | |
521 inconsistency here. ALR is known to the connect-by-name logic in GPF as "PL" | |
522 (physical layer, apparently), while the ACI entity (Application Control | |
523 Interface, the top level entity) is known to the same logic as "MMI". No big | |
524 deal really, but hopefully knowing this quirk will save someone some confusion. | |
525 | |
526 Debug trace facility | |
527 ==================== | |
528 | |
529 See the RVTMUX document in the same directory as this one for general background | |
530 information about the debug and development interface provided by TI-based | |
531 firmwares. Our FreeCalypso GSM firmware implements an RVTMUX interface as well, | |
532 and the most immediate use to which it is put is debug trace output. In this | |
533 section I'm going to describe how this debug trace output is generated inside | |
534 the fw. | |
535 | |
536 The firmware component that "owns" the physical UART channel assigned to RVTMUX | |
537 is RVT, implemented in gsm-fw/riviera/rvt. It is a Riviera-based component, | |
538 and it has a Nucleus task that is created and started through Riviera. All | |
539 calls to the actual driver for the UART are made from RVT. In the case of | |
540 output from the Calypso GSM device to an external host, all such output is | |
541 performed in the context of RVT's Nucleus task; this task drains RVT's message | |
542 queue and emits the content of allocated buffers posted to it, freeing them | |
543 afterward. (The dynamic memory allocation system in this case is Riviera's, | |
544 which is susceptible to fragmentation - see discussion earlier in this article.) | |
545 Therefore, every trace or other output packet emitted from a GSM device running | |
546 our fw (or any of the proprietary firmwares based on the same architecture) | |
547 appears as a result of a message in a dynamically allocated buffer having been | |
548 posted to RVT's queue. | |
549 | |
550 RVT exports several API functions that are intended to be called from other | |
551 tasks; it is by way of these functions that most output is submitted to RVT. | |
552 One can call rvt_send_trace_cpy() with a fully prepared output message, and | |
553 that function will allocate a buffer from Riviera's dynamic memory allocator | |
554 properly accounted to RVT, fill it and post it to the RVT task's queue. | |
555 Alternatively, one can call rvt_mem_alloc() to allocate the buffer, fill it in | |
556 and then pass it to rvt_send_trace_no_cpy(). | |
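
The difference between the copying and non-copying submission paths, and the
way the output task drains its queue, can be modelled as below. This is a
self-contained simulation, not the RVT code: the real RVT function signatures
are deliberately not reproduced, all demo_* names are hypothetical, malloc()
stands in for Riviera's allocator, and a byte counter stands in for the UART.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_QDEPTH 8

/* One queued output message: an owned heap buffer plus its length. */
struct demo_msg {
    char  *buf;
    size_t len;
};

static struct demo_msg demo_queue[DEMO_QDEPTH];
static unsigned demo_q_head, demo_q_count;
size_t demo_bytes_emitted;          /* what the "UART" has seen so far */

/* Hypothetical stand-in for the allocate-a-buffer call. */
char *demo_mem_alloc(size_t len)
{
    return malloc(len);
}

/* No-copy path: the caller filled buf itself; ownership passes to the queue. */
int demo_send_no_cpy(char *buf, size_t len)
{
    struct demo_msg *m;

    assert(buf != NULL);
    if (demo_q_count == DEMO_QDEPTH) {
        free(buf);
        return -1;
    }
    m = &demo_queue[(demo_q_head + demo_q_count) % DEMO_QDEPTH];
    m->buf = buf;
    m->len = len;
    demo_q_count++;
    return 0;
}

/* Copy path: allocate, copy the caller's message, then post it. */
int demo_send_cpy(const char *msg, size_t len)
{
    char *buf = demo_mem_alloc(len);

    if (buf == NULL)
        return -1;
    memcpy(buf, msg, len);
    return demo_send_no_cpy(buf, len);
}

/* Body of the output task: emit every queued buffer, then free it.
 * Returns the number of messages emitted. */
unsigned demo_output_task_drain(void)
{
    unsigned emitted = 0;

    while (demo_q_count > 0) {
        struct demo_msg *m = &demo_queue[demo_q_head];

        demo_bytes_emitted += m->len;   /* stand-in for the UART write */
        free(m->buf);
        demo_q_head = (demo_q_head + 1) % DEMO_QDEPTH;
        demo_q_count--;
        emitted++;
    }
    return emitted;
}
```

The no-copy path saves one memcpy when the caller can format its message
directly into the allocated buffer.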
557 | |
558 At higher levels, there are a total of 3 kinds of debug traces that can be | |
559 emitted: | |
560 | |
561 * Riviera traces: these are generated by various components implemented in | |
562 Riviera land, although in reality any component can generate a trace of this | |
563 form by calling rvf_send_trace() - this function can be called from any task. | |
564 | |
565 * L1 traces: L1 has its own trace facility implemented in | |
566 gsm-fw/L1/cfile/l1_trace.c; it generates its traces as ASCII messages and | |
567 sends them out via rvt_send_trace_cpy(). | |
568 | |
569 * GPF traces: code that runs in GPF/G23M land and uses those header files and | |
570 coding conventions etc can emit traces through GPF. GPF's trace functions | |
571 (implemented in gsm-fw/gpf/frame/vsi_trc.c) allocate a memory partition from | |
572 GPF's TEST pool, format the trace into it, and send the trace primitive to | |
573 GPF's special test interface task. That task receives trace and other GPF | |
574 test interface primitives on its queue, performs some manipulations on them, | |
575 and ultimately generates RVT trace output, i.e., a new dynamic memory buffer | |
576 is allocated in the Riviera land, the trace is copied there, and the Riviera | |
577 buffer goes to the RVT task for the actual output. | |
578 | |
579 Trace masking | |
580 ============= | |
581 | |
582 The RV trace facility invoked via rvf_send_trace() has a crude masking ability, | |
583 but by default all traces are enabled. In TI's standard firmwares most of the | |
584 trace output comes from L1: L1's trace output is very voluminous, and appears | |
585 to be fully enabled by default. I have yet to look more closely at whether | |
586 L1 has any trace masking functionality and what the default trace verbosity | |
587 level should be. | |
588 | |
589 On the other hand, GPF and therefore G23M traces are mostly disabled by default. | |
590 One can turn the trace verbosity level of any GPF-based entity up or down by | |
591 sending a "system primitive" command to the running fw, and another such command | |
592 can be used to save these masks in FFS, so that they will be restored on the | |
593 next boot cycle and be effective at the earliest possible time. Enabling *all* | |
594 GPF trace output for all entities is generally not useful though, as it is so | |
595 verbose that a developer trying to make sense of it will likely drown in it. | |
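
A per-entity trace mask of the general kind described here can be sketched as
follows. The entity numbering, the trace class bits and all demo_* names are
assumptions made up for the illustration; the actual GPF mask layout and the
system primitive syntax are not shown. All masks start at zero, mirroring the
"mostly disabled by default" behaviour.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ENTITIES 3

/* Hypothetical trace classes, one bit each. */
#define DEMO_TC_EVENT 0x1UL
#define DEMO_TC_ERROR 0x2UL

/* One mask per entity; static storage means all classes start disabled. */
static unsigned long demo_trace_mask[DEMO_ENTITIES];
unsigned demo_traces_emitted;

/* What a "set trace mask" command from the host would boil down to. */
void demo_set_trace_mask(int entity, unsigned long mask)
{
    assert(entity >= 0 && entity < DEMO_ENTITIES);
    demo_trace_mask[entity] = mask;
}

/* Emit a trace only if its class is enabled for this entity.
 * Returns 1 if the trace was emitted, 0 if it was masked out. */
int demo_trace(int entity, unsigned long trace_class, const char *text)
{
    assert(entity >= 0 && entity < DEMO_ENTITIES);
    if ((demo_trace_mask[entity] & trace_class) == 0)
        return 0;
    (void)text;                 /* a real implementation would format it */
    demo_traces_emitted++;
    return 1;
}
```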
596 | |
597 GPF compressed trace hack | |
598 ========================= | |
599 | |
600 TI's Windows-based GSM firmware build systems include a hack called str2ind. | |
601 Seeking to reduce the fw image size by eliminating trace ASCII strings from it, | |
602 and seeking to reduce the load on the RVTMUX serial interface by eliminating | |
603 the transmission time of these strings, they passed their sources through an | |
604 ad hoc preprocessor that replaces these ASCII strings with numeric indices. | |
605 The compilation process with this str2ind hack becomes very messy: each source | |
606 file is first passed through the C preprocessor, then the intermediate form is | |
607 passed through str2ind, and finally the de-string-ified form is compiled, with | |
608 the compiler being told not to run the C preprocessor again. | |
609 | |
610 TI's str2ind tool maintains a table of correspondence between the original trace | |
611 ASCII strings and the indices they've been turned into, and a copy of this table | |
612 becomes essential for making sense of GPF trace output: the firmware now emits | |
613 only numeric indices which are useless without this str2ind.tab mapping table. | |
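
The idea behind the string-to-index table can be shown in a few lines of C.
This is not a reconstruction of TI's tool (of which we have no source): the
function names, the in-memory table and its size limit are all hypothetical.
One function plays the build-time role of assigning an index to each distinct
string, the other plays the host-side role of turning an index back into text.

```c
#include <assert.h>
#include <string.h>

#define DEMO_TAB_MAX 16

/* The mapping table; it stores pointers, so the strings must outlive it
 * (string literals do). */
static const char *demo_tab[DEMO_TAB_MAX];
static int demo_tab_count;

/* Build-time role: return the index for s, adding it to the table if new. */
int demo_str2ind(const char *s)
{
    int i;

    for (i = 0; i < demo_tab_count; i++)
        if (strcmp(demo_tab[i], s) == 0)
            return i;
    assert(demo_tab_count < DEMO_TAB_MAX);
    demo_tab[demo_tab_count] = s;
    return demo_tab_count++;
}

/* Host-side role: map an index seen in trace output back to its string,
 * or return NULL if the table does not contain it. */
const char *demo_ind2str(int index)
{
    if (index < 0 || index >= demo_tab_count)
        return NULL;
    return demo_tab[index];
}
```

The NULL case is what a host tool runs into when it is given a table that does
not match the firmware image being traced.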
614 | |
615 Our FreeCalypso firmware does not currently implement this str2ind aka | |
616 compressed trace hack, i.e., all GPF trace output from our fw is in full ASCII | |
617 string form. I have not bothered to implement compressed traces because: | |
618 | |
619 * We have not yet encountered a case of the full ASCII strings causing a problem | |
620 either with fw images not fitting into the available memory or excessive load | |
621 on the RVTMUX interface; | |
622 | |
623 * Implementing the hack in question would require extra work: the str2ind tool | |
624 would have to be reimplemented anew, as of the original we have no source, | |
625 only a Windows binary, and requiring our free fw build process to run a | |
626 Windows binary under Wine is a no-no; | |
627 | |
628 * I don't feel like doing all that extra work for what appears to be no real | |
629 gain; | |
630 | |
631 * Having to run gcc with separate cpp and actual compilation steps with str2ind | |
632 sandwiched in between would be ugly and gross; | |
633 | |
634 * Having to keep track of which str2ind.tab goes with which fw image and supply | |
635 the right table to our rvinterf tools would likely be a pita. | |
636 | |
637 So we shall stick with full ASCII string traces until and unless we run into an | |
638 actual (as opposed to hypothetical) problem with either fw image size or serial | |
639 interface load. | |
640 | |
641 RVTMUX command input | |
642 ==================== | |
643 | |
644 RVTMUX is not just debug trace output: it is also possible for an external host | |
645 to send commands to the running fw via RVTMUX. | |
646 | |
647 Inside the fw RVTMUX input is handled by the RVT entity by way of a Nucleus | |
648 HISR. This HISR gets triggered when Rx bytes arrive at the designated UART, | |
649 and it calls the UART driver to collect the input. RVT code running in this | |
650 HISR parses the message structure and figures out which fw component the | |
651 incoming message is addressed to. Any fw component can register to receive | |
652 RVTMUX packets, and provides a callback function with this registration; this | |
653 callback function is called in the context of the HISR. | |
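
The register-then-dispatch arrangement can be sketched as below. The sketch
makes assumptions that are not taken from the RVT source: that the first byte
of a received message selects the recipient, that the identifier 0x10 used in
the usage is meaningful (it is arbitrary), and all demo_* names. The example
callback merely counts bytes; a real one would do as little as possible in
HISR context, such as posting the message to its component's own queue.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_USERS 4

typedef void (*demo_rx_callback)(const unsigned char *payload, size_t len);

/* Registration table: one (id, callback) pair per registered component. */
static struct {
    unsigned char    id;
    demo_rx_callback cb;
} demo_users[DEMO_MAX_USERS];
static int demo_user_count;

/* Called by a component that wants to receive host messages. */
int demo_register(unsigned char id, demo_rx_callback cb)
{
    assert(cb != NULL);
    if (demo_user_count == DEMO_MAX_USERS)
        return -1;
    demo_users[demo_user_count].id = id;
    demo_users[demo_user_count].cb = cb;
    demo_user_count++;
    return 0;
}

/* Called from the (simulated) receive HISR with one complete message.
 * Returns 1 if a registered recipient was found, 0 otherwise. */
int demo_dispatch(const unsigned char *msg, size_t len)
{
    int i;

    if (len < 1)
        return 0;
    for (i = 0; i < demo_user_count; i++) {
        if (demo_users[i].id == msg[0]) {
            demo_users[i].cb(msg + 1, len - 1);
            return 1;
        }
    }
    return 0;                   /* nobody registered for this id */
}

/* Example recipient: just counts the payload bytes it has been handed. */
size_t demo_rx_payload_bytes;

void demo_example_rx(const unsigned char *payload, size_t len)
{
    (void)payload;
    demo_rx_payload_bytes += len;
}
```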
654 | |
655 In our current FC GSM fw there are two components that register to receive | |
656 external host commands via RVTMUX: ETM and GPF. ETM is described in my earlier | |
657 RVTMUX write-up. ETM is implemented as a Riviera SWE and has its own Nucleus | |
658 task; the callback function that gets called from the RVT HISR posts received | |
659 messages onto ETM's own queue drained by its task. The ETM task gets scheduled, | |
660 picks up the command posted to its queue, executes it, and sends a response | |
661 message back to the external host through RVT. | |
662 | |
663 Because all ETM commands funnel through ETM's queue and task, and that task | |
664 won't start looking at a new command until it has finished handling the | |
665 one, all ETM commands and responses are in strict lock-step: it is not possible | |
666 to send two commands and have their responses come in out of order, and it makes | |
667 no sense to send another ETM command prior to receiving the response to the | |
668 previous one. (But there can still be debug traces or other traffic intermixed | |
669 on RVTMUX in between an ETM command and the corresponding response!) | |
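
For a host-side program this means that after sending an ETM command it must
keep reading packets, setting aside anything that is not the response, until
the response shows up. A minimal sketch of that waiting logic follows; the
packet classification enum and the function name are hypothetical, and the
incoming stream is modelled as an array rather than a serial port.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of packets arriving from the target. */
enum demo_pkt_kind { DEMO_PKT_TRACE, DEMO_PKT_ETM_RESPONSE };

/* Scan the incoming stream for the response to the one outstanding command.
 * Returns the position of the response, or -1 if it has not arrived within
 * the first n packets; *traces_seen receives the number of unrelated
 * packets that were passed over on the way. */
int demo_wait_response(const enum demo_pkt_kind *stream, size_t n,
                       size_t *traces_seen)
{
    size_t i;

    assert(stream != NULL && traces_seen != NULL);
    *traces_seen = 0;
    for (i = 0; i < n; i++) {
        if (stream[i] == DEMO_PKT_ETM_RESPONSE)
            return (int)i;
        (*traces_seen)++;
    }
    return -1;
}
```

Because of the lock-step property, the first response packet seen is always
the one for the outstanding command; no sequence matching is needed.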
670 | |
671 The other component that can receive external commands is GPF. GPF's test | |
672 interface can receive so-called "system primitives", which are ASCII string | |
673 commands parsed and acted upon by GPF, and also binary protocol stack | |
674 primitives. Remember how all entities in the G23M stack communicate by sending | |
675 messages to each other? Well, GPF's test interface allows such messages to be | |
676 injected externally as well, directed to any entity in the running fw. System | |
677 primitive commands can also be used to cause entities to send their outgoing | |
678 primitives to the test interface, either instead of or in addition to the | |
679 originally intended recipient. | |
680 | |
681 Firmware subsetting | |
682 =================== | |
683 | |
684 We have built our firmware up incrementally, piece by piece, starting from a | |
685 very small skeleton. As we added pieces working toward full GSM MS | |
686 functionality, the ability to build less functional fw images corresponding to | |
687 our earlier stages of development has been retained. Each piece we added is | |
688 "optional" from the viewpoint of our build system, even if it is absolutely | |
689 required for normal usage, and is enabled by the appropriate feature line in | |
690 build.conf. | |
691 | |
692 Our minimal baseline with absolutely no "features" enabled consists of: | |
693 | |
694 * Nucleus | |
695 * Riviera | |
696 * TI's basic drivers for GPIO, ABB etc | |
697 * RVTMUX on the UART port chosen by the user (RVTMUX_UART_port Bourne shell | |
698 variable in build.conf) and the UART driver for it | |
699 * FFS code operating on a fake FFS image in RAM | |
700 | |
701 If one runs this minimal "firmware" on a Calypso device, one will see some | |
702 startup messages in RV trace format followed by a System Time trace every 20 s. | |
703 This "firmware" can't do anything more; there is not even a way to command it | |
704 to power off or reboot. | |
705 | |
706 Working toward full GSM MS functionality, pieces can be added to this skeleton | |
707 in this order: | |
708 | |
709 * GPF | |
710 * L1 | |
711 * G23M | |
712 | |
713 feature gsm enables all of the above for normal usage; feature l1stand can be | |
714 used alternatively to build an L1 standalone image without G23M - we expect | |
715 that we may end up using a ramImage form of the latter for RF calibration on | |
716 our own Calypso hardware. | |
717 | |
718 ETM and various FFS configurations are orthogonal features to the choice of | |
719 core functionality level. | |
720 | |
721 Further reading | |
722 =============== | |
723 | |
724 Believe it or not, some of the documentation that was written by the original | |
725 vendors of the software in question and which we've been able to locate turns | |
726 out to be fairly relevant and helpful, such that I recommend reading it. | |
727 | |
728 Documentation for Nucleus PLUS RTOS: | |
729 | |
730 ftp://ftp.ifctf.org/pub/embedded/Nucleus/nucleus_manuals.tar.bz2 | |
731 | |
732 Quite informative, and fits our version of Nucleus just fine. | |
733 | |
734 Riviera environment: | |
735 | |
736 ftp://ftp.ifctf.org/pub/GSM/Calypso/riviera_preso.pdf | |
737 | |
738 It's in slide presentation form, not a detailed technical document, but | |
739 it covers a lot of points, and all that Riviera stuff described in the | |
740 preso *is* present in our fw for real, hence it should be considered | |
741 relevant. | |
742 | |
743 GPF documentation: | |
744 | |
745 http://scottn.us/downloads/peek/SW%20doc/frame_users_guide.pdf | |
746 http://scottn.us/downloads/peek/SW%20doc/vsipei_api.pdf | |
747 | |
748 Very good reading, helped me understand GPF when I first reached this | |
749 part of firmware reintegration. | |
750 | |
751 TCS3.x/LoCosto fw architecture: | |
752 | |
753 http://scottn.us/downloads/peek/SW%20doc/TCS2_1_to_3_2_Migration_v0_8.pdf | |
754 ftp://ftp.ifctf.org/pub/GSM/LoCosto/LoCosto_Software_Architecture_Specification_Document.pdf | |
755 | |
756 These TI docs focus mostly on how they changed the fw architecture from | |
757 their TCS2.x program (Calypso) to their newer TCS3.x (LoCosto), but one | |
758 can still get a little insight into the "old" TCS211 architecture they | |
759 were moving away from, which is the architecture I've adopted for | |
760 FreeCalypso. |