comparison doc/Firmware_Architecture @ 868:d92b110e06e0

doc/Firmware_Architecture written
author Space Falcon <falcon@ivan.Harhan.ORG>
date Sun, 17 May 2015 03:45:19 +0000
1 Our FreeCalypso GSM firmware follows the same architecture as TI's TCS211;
2 this document is an attempt to describe this architecture.
3
4 Nucleus environment
5 ===================
6
7 Like all classic TI firmwares, ours is based on the Nucleus PLUS RTOS. Just
8 like TI's original code on which we are based, we use only a small subset of
9 the functionality provided by Nucleus - but because the latter is a library,
10 the pieces we don't use simply don't get pulled into the link. The main
11 function we get out of Nucleus is the scheduling of threads, or tasks as
12 Nucleus calls them.
13
14 Our entry point code receives control from the Calypso boot ROM, from other
15 bootloaders on crippled targets, or from loadagent in the case of fc-xram
16 loadable builds. It does some absolutely minimal initialization (sets up
17 sensible memory access timings, copies iram.text to IRAM and .data to XRAM if
18 we are booting from flash, zeroes out our two bss segments (int.bss and
19 ext.bss)) and jumps to Nucleus' assembly init entry point. Prior to jumping to
20 Nucleus, we don't even have a stack (all init code prior to that point is pure
21 assembly and uses only ARM registers); Nucleus then sets up the stack pointer
22 for everything running under its control.
23
24 Aside from just a few exceptions (ARM exception handlers come to mind, never
25 mind the pun), every piece of code in the firmware executes in one of the
26 following contexts:
27
28 * Application_Initialize(): this function and everything called from it execute
29 just before Nucleus' thread scheduler starts; at this point interrupts are
30 disabled at the ARM7 core level (in the CPSR) and must not be enabled; the
31 stack is Nucleus' "system stack" which is also used by the scheduler and LISRs
32 as explained below.
33
34 * Regular threads or tasks: once Application_Initialize() finishes, all code
35 with the exception of interrupt handlers (LISRs and HISRs as explained below)
36 runs in the context of some Nucleus task. Whenever you are trying to debug
37 or simply understand some piece of code in the firmware, the first question
38 you should ask is "which task does this code execute in?". Most functional
39 components run in their own tasks, i.e., a given piece of code is only
40 intended to run within the Nucleus task that belongs to the component in
41 question. On the other hand, some components are implemented as APIs,
42 functions to be called from other components: these don't have their own task
43 associated with them, and instead they run in the context of whatever task
44 they were called from. Some only get called from one task: for example, the
45 "uartfax" driver API calls only get called from the protocol stack's UART
46 entity, which is its own task. Other component API functions like FFS and
47 trace can get called from just about any task in the system. Many components
48 have both their own task and some API functions to be called from other tasks,
49 and the API functions oftentimes post messages to the task to be worked on by
50 the latter; the just-mentioned FFS and trace functions work in this manner.
51
52 In our current GSM firmware (just like in TCS211) every Nucleus task is
53 created either through Riviera or through GPF, and not in any other way - see
54 the description of Riviera and GPF below.
55
56 * LISRs (Low level Interrupt Service Routines): these are the interrupt handlers
57 that run immediately when an ARM IRQ or FIQ comes in. The code at the IRQ and
58 FIQ vector entry points calls Nucleus' magic stack switching function
59 (switches the CPU from IRQ/FIQ into SVC mode, saves the interrupted thread's
60 registers on that thread's stack, and switches to the "system" stack) and
61 then calls TI's IRQ dispatcher implemented in C. The latter figures out
62 which Calypso interrupt needs to be handled and calls the handler configured
63 in the compiled-in table. Nucleus' LISR registration framework is not used
64 by the GSM fw, but these interrupt handlers should be viewed as LISRs
65 nonetheless.
66
67 There is one additional difference between canonical Nucleus and TI's version
68 (we've replicated the latter): canonical Nucleus was designed to support
69 nested LISRs, i.e., IRQs re-enabled in the magic stack switching function,
70 but in TI's version which we follow this IRQ re-enabling is removed: each LISR
71 runs with interrupts disabled and cannot be interrupted. (The corner case of
72 an FIQ interrupting an IRQ remains to be looked at more closely as bugs may be
73 hiding there, but Calypso doesn't really use FIQ interrupts.) There is really
74 no need for LISR nesting in our GSM fw, as each LISR is very short: most LISRs
75 do nothing more than trigger the corresponding HISR.
76
77 * HISRs (High level Interrupt Service Routines): these hold an intermediate
78 place between LISRs and tasks, similar to softirqs in the Linux kernel. A
79 HISR can be activated by a LISR calling NU_Activate_HISR(), and when the LISR
80 returns, the HISR will run before the interrupted task (or some higher
81 priority task, see below) can resume. HISRs run with CPU interrupts enabled,
82 thus more interrupts can occur, with their LISRs executing and possibly
83 triggering other HISRs. All triggered HISRs must complete and thereby go
84 "quiescent" before task scheduling resumes, i.e., all HISRs as a group have a
85 higher scheduling priority than tasks.
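
The LISR path described above (a C dispatcher looking up the handler in a
compiled-in table, with most LISRs doing nothing more than triggering a HISR)
can be sketched roughly as follows. This is only an illustration: the table
contents, the handler names and the activate_hisr() stand-in for
NU_Activate_HISR() are all made up for the example and are not TI's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not TI's dispatcher. */
#define NUM_IRQS 4

static unsigned hisr_pending;   /* one bit per HISR awaiting execution */

/* Stand-in for NU_Activate_HISR(): just mark the HISR as pending. */
static void activate_hisr(unsigned hisr_id)
{
    hisr_pending |= 1u << hisr_id;
}

/* Typical LISRs: do nothing beyond triggering the matching HISR. */
static void frame_lisr(void) { activate_hisr(0); }
static void uart_lisr(void)  { activate_hisr(1); }

/* Compiled-in table mapping interrupt number to handler; NULL = unused. */
static void (*const irq_table[NUM_IRQS])(void) = {
    frame_lisr, NULL, uart_lisr, NULL
};

/* Called with interrupts disabled, after the stack switch.  Returns 0 if
 * a handler ran, -1 for an out-of-range or unconfigured interrupt. */
static int irq_dispatch(unsigned irq_num)
{
    if (irq_num >= NUM_IRQS || irq_table[irq_num] == NULL)
        return -1;
    irq_table[irq_num]();
    return 0;
}
```

Calling irq_dispatch(2) in this sketch runs uart_lisr(), which leaves bit 1 set
in hisr_pending; the HISR itself would run only after the LISR has returned.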
86
87 Nucleus implements priority scheduling for tasks. Tasks have their priority set
88 when they are created (through Riviera or GPF, see below), and a higher priority
89 task will run until it gets blocked waiting for something, at which time lower
90 priority tasks will run. If a lower priority task sends a message to a higher
91 priority task, unblocking the latter which was waiting for incoming messages,
92 the lower priority task will effectively suspend itself immediately while the
93 higher priority task runs to process the message it was sent.
94
95 HISRs oftentimes post messages to their associated tasks as well; if one of
96 these messages unblocks a higher priority task, that unblocked task will run
97 upon the completion of the HISR instead of the original lower priority task
98 that was interrupted by the LISR that triggered the HISR. Nucleus' scheduler
99 is fun!
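
The ordering rules above (pending HISRs always run before any task, and among
tasks the most urgent ready one wins) can be condensed into a small decision
function. This is a sketch of the rule only, not of Nucleus internals: the
struct layout and function name are invented, and the sketch assumes that a
smaller priority number means a more urgent task, which should be checked
against the Nucleus documentation.

```c
#include <assert.h>

#define NUM_TASKS 3

struct task {
    int priority;   /* assumption: smaller number = more urgent */
    int ready;      /* nonzero when the task is not blocked */
};

/* Returns -1 if some HISR is pending (HISRs run before any task),
 * -2 if nothing is runnable, else the index of the task to run. */
static int pick_next(unsigned hisr_pending, const struct task *t, int n)
{
    int best = -2;

    if (hisr_pending)
        return -1;
    for (int i = 0; i < n; i++) {
        if (!t[i].ready)
            continue;
        if (best < 0 || t[i].priority < t[best].priority)
            best = i;
    }
    return best;
}
```

When a message unblocks a more urgent task, that task's ready flag becomes set
and the next call to pick_next() selects it, which is the "lower priority task
effectively suspends itself" behaviour described above.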
100
101 Major functional blocks
102 =======================
103
104 At the highest level, all code in TI's classic firmwares and in our FreeCalypso
105 fw can be divided into 3 broad groupings:
106
107 * GSM Layer 1: this code was developed by TI, is highly specific to TI's
108 baseband chipset family in general and to specific individual chips in
109 particular (the code is liberally sprinkled with conditional compilation
110 based on DBB type, ABB type, DSP ROM version and so on), and is absolutely
111 necessary in order to operate a Calypso device as a GSM MS (mobile station)
112 and not merely as a general purpose microprocessor platform. This code can
113 be considered to be the most important part of the entire firmware.
114
115 L1 interties with Nucleus and with the G23M stack (with which it needs to
116 communicate) in a very peculiar way described later in this article.
117
118 * G23M protocol stack: at the beginning of TI's involvement in the GSM baseband
119 chipset business, they only developed and maintained their own L1 code, while
120 the rest of the protocol stack (which is hardware-independent) was licensed
121 from another company called Condat. Later Condat as a company was fully
122 acquired by TI, and the once-customer of this code became its owner. The
123 name of TI/Condat's implementation of GSM layers 2&3 for the MS side is G23M,
124 and it forms its own major division of the overall fw architecture.
125
126 Underlying the G23M stack is a special layer called GPF, which was originally
127 Condat's Generic Protocol stack Framework. Apparently Condat was in the
128 business of developing and maintaining a whole bunch of protocol stacks: GSM
129 MS side, GSM network side, TETRA and who knows what else. GPF was their
130 common underpinning for all of their protocol stack projects, which ran on top
131 of many different OS environments: Nucleus, pSOS, VxWorks, Unix/Linux, Win32
132 and who knows what else.
133
134 In the case of FreeCalypso GSM fw, both the protocol stack and the underlying
135 OS environment are fixed: GSM and Nucleus, respectively. But GPF is still a
136 critically important layer in the firmware architecture: in addition to
137 serving as the glue between the G23M stack and Nucleus, it provides some
138 important support infrastructure for the protocol stack.
139
140 * Miscellaneous peripheral accessories: under this category I (Space Falcon)
141 place everything implemented through TI's Riviera framework. Historical
142 evidence indicates that TI's earliest firmwares did not have this part, i.e.,
143 Riviera and everything built on top of it is a "non-essential" later
144 addition. It appears that TI originally invented Riviera in order to support
145 the development of fancy "feature phone" UI/application layers, complete with
146 Java, MMS, WAP, games and whatnot - things upon which our FreeCalypso project
147 looks with disdain - but in the TCS211 firmware from 2007 which I used as the
148 reference for FreeCalypso this Riviera framework serves as the foundation for
149 some small but essential pieces of functionality: the FFS implementation, the
150 SPI-based ABB access driver, the RTC driver and the debug trace facility.
151
152 While it is certain that TI had some non-Riviera implementation of the just-
153 listed essential pieces in their earliest pre-Riviera days, trying to find
154 surviving sources from those days would be a "mission impossible" task. OTOH,
155 reusing the Riviera code from TCS211 was quite easy, as the copy of TCS211 we
156 got has it in full source form with nothing omitted. Therefore, I took the
157 sensible easy road and kept Riviera in FreeCalypso.
158
159 The above division of the firmware into 3 broad functional groupings also
160 corresponds quite neatly with where each piece of our source code originally
161 came from. Our versions of L1 and G23M came in their entirety from TI's TCS3.2
162 program targeting their later LoCosto chipset (specifically from the
163 TCS3.2_N5.24_M18_V1.11_M23BTH_PSL1_src.zip release from Peek/FGW), whereas
164 everything in the 3rd division (Riviera and everything built on top of it) came
165 from our TCS211/Leonardo source from Sotovik.
166
167 The just-listed divisions of the firmware are really separate software
168 environments which are linked together into one final image, but which have
169 very little in the way of interties. Each of the 3 realms has its own very
170 different coding style, its own set of header files and its own defined types.
171 It is very rare for a module from one realm to include any header files or call
172 any functions from another realm, and while they all ultimately run on top of
173 Nucleus, they interface with Nucleus in different ways: G23M goes through GPF,
174 everything in Riviera land goes through Riviera, and L1 uses its own bizarre
175 mechanism which in our fw ends up going through GPF but hasn't always been this
176 way - to be explained later in this article.
177
178 Also note that there is no mention of any handset UI code (or MMI in the GSM
179 industry's sexist speak) in the above breakdown of code divisions. This
180 document describes the architecture of TI's modem firmware in which the highest
181 layer is the AT command interface (part of the G23M suite, or its uppermost
182 layer to be precise), and which does not include any UI code. Our TI reference
183 sources do include their "MMI" code, but I haven't studied it closely enough
184 yet to comment on it properly, and the version of TCS211 which serves as our
185 primary reference is set up for the modem configuration without this "MMI" part.
186 Making sense of TI's "MMI" code is a task to be tackled later in the project
187 when we have a working modem and are ready to start building a usable handset
188 with UI.
189
190 Riviera and GPF
191 ===============
192
193 Riviera and GPF are two parallel/independent/competing wrappers around or
194 layers above Nucleus. The way in which they are treated in our FreeCalypso fw
195 architecture is somewhat inverted: originally GPF was the essential framework
196 underlying the G23M stack (and to which L1 was also attached in a hacky way)
197 while Riviera was added to support non-essential frills, but in our current FC
198 fw Riviera is always included just like Nucleus, whereas GPF only needs to be
199 included in the build when building with feature gsm (full GSM MS functionality)
200 or feature l1stand (L1 standalone) - but is not needed if one wishes to build
201 an "in vivo" FFS editing agent, for example.
202
203 This peculiar arrangement happened because of the source code availability
204 situation we found ourselves in. TCS211 uses real Riviera that is fully
205 independent of GPF (see below), and our copy thereof came with this part in
206 full source form. On the other hand, we never got the complete original source
207 for GPF in one piece, thus our FC version of GPF had to be reconstructed from
208 bits and pieces. For this reason I made the decision early on to include
209 Riviera and some RV-based components in the "mandatory core" part of our FC fw
210 architecture, while leaving GPF to be worked on later. And when I did get to
211 reintegrating GPF, at that point it was natural to make it into an "optional"
212 component that is included only when needed.
213
214 At some point in their post-Calypso TCS3.x program TI decided to eliminate
215 Riviera as an independent framework and to reimplement Riviera APIs (used by
216 peripheral but necessary code such as FFS, ETM, various drivers etc) over GPF.
217 This arrangement is used in the TCS3.2 LoCosto code from which we lifted our
218 versions of L1 and G23M. However, I (Space Falcon) chose not to adopt this
219 approach for FreeCalypso, and mimic the TCS211 way (Riviera entirely
220 independent of GPF) instead. The reasons were twofold: (1) there was no full
221 source for GPF and a painstaking reconstruction effort was required before we
222 could have our own working version of GPF in our gcc-built fw, and (2) I felt
223 more comfortable and familiar with following TCS211.
224
225 Start-up process
226 ================
227
228 I mentioned earlier that every Nucleus task in our firmware gets created and
229 started either through Riviera or through GPF. All GPF tasks are created and
230 placed into the runnable state in the Application_Initialize() context: the work
231 is done by GPF init code in gsm-fw/gpf/frame/frame.c, and the top level GPF
232 init function called from Application_Initialize() is StartFrame(). Thus when
233 Application_Initialize() finishes and the Nucleus thread scheduler starts
234 running for the first time, all GPF tasks are there to be scheduled.
235
236 There is a compiled-in table of all protocol stack entities and the tasks in
237 which they need to run which (in our fw) lives under gsm-fw/gpf/conf and which
238 logically belongs to GPF. Canonically each protocol stack entity runs in its
239 own task, but sometimes two or more are combined to run in the same task: for
240 example, in the minimal GSM "voice only" configuration (no CSD, fax or GPRS)
241 CC, SMS and SS entities share the same task named CM. Unlike Riviera, GPF does
242 not support dynamic starting and stopping of tasks.
243
244 As each GPF task starts running (immediately upon entry into Nucleus' scheduling
245 loop as Application_Initialize() finishes), the pf_TaskEntry() function in
246 gsm-fw/gpf/frame/frame.c is the first code it runs. This function creates the
247 queue for messages to be sent to all entities running within the task in
248 question, calls each entity's pei_init() function (repeatedly until it succeeds:
249 it will fail until the other entities to which this entity needs to send
250 messages have created their message queues), and then falls into the main body
251 of the task: for all "regular" entities/tasks except L1, this main body consists
252 of waiting for messages (or signals or timeouts) to arrive on the queue and
253 dispatching each received message to the appropriate handler in the right
254 entity.
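
The "retry pei_init() until it succeeds" step can be illustrated with a small
model in which an entity's init fails until the peer task has created its
message queue. Everything here (the struct, the function names, the use of a
flag array in place of real queues) is an assumption made for the sake of the
example and does not reproduce the code in frame.c.

```c
#include <assert.h>

#define NUM_TASKS 2

static int queue_exists[NUM_TASKS];  /* set once a task has created its queue */

struct entity {
    int peer_task;      /* task whose queue this entity must open, or -1 */
    int initialized;
};

/* Stand-in for an entity's pei_init(): fails (returns -1) until the
 * peer's message queue has been created. */
static int entity_init(struct entity *e)
{
    if (e->peer_task >= 0 && !queue_exists[e->peer_task])
        return -1;
    e->initialized = 1;
    return 0;
}

/* One attempt at a task's startup: create its own queue, then try the
 * entity's init.  Returns 0 when the task may enter its message loop,
 * -1 if it has to try again after other tasks have had a chance to run. */
static int task_startup_attempt(int self, struct entity *e)
{
    queue_exists[self] = 1;
    return entity_init(e);
}
```

With two tasks whose entities each need the other's queue, the first task's
first attempt fails, the second task succeeds at once, and the first task
succeeds on its retry.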
255
256 Riviera tasks get started in a different way. The same Application_Initialize()
257 function that calls StartFrame() to create and start all GPF tasks also calls
258 create_tasks() (found in gsm-fw/riviera/init/create_RVtasks.c), the appinit-time
259 function for starting the Riviera environment. But this function does not
260 create and start every configured Riviera task like StartFrame() does for GPF.
261 Instead it creates a special helper task which will do this work once scheduled.
262 Thus at the completion of Application_Initialize() and the beginning of
263 scheduling the set of runnable Nucleus tasks consists of all GPF ones plus the
264 special RV starter task. Once the RV starter task gets scheduled, it will call
265 rvm_start_swe() to launch every configured Riviera SWE (SoftWare Entity), which
266 in turn entails creating the tasks in which these SWEs are to run.
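
The difference between the two start-up styles can be shown with a toy model:
appinit creates every GPF task directly but only one starter task for Riviera,
and the remaining Riviera tasks appear only after the scheduler has run that
starter. The task names, table contents and function names below are invented
for illustration.

```c
#include <assert.h>

#define MAX_TASKS 16

static const char *runnable[MAX_TASKS];   /* tasks known to the scheduler */
static int n_runnable;

static void create_task(const char *name)
{
    assert(n_runnable < MAX_TASKS);
    runnable[n_runnable++] = name;
}

/* Made-up configuration tables; the real ones are far longer. */
static const char *const gpf_tasks[] = { "gpf_task_a", "gpf_task_b", "gpf_task_c" };
static const char *const rv_swes[]   = { "swe_x", "swe_y" };
#define COUNT(a) ((int)(sizeof(a) / sizeof((a)[0])))

/* Appinit context: the scheduler is not running yet. */
static void app_init_sketch(void)
{
    for (int i = 0; i < COUNT(gpf_tasks); i++)
        create_task(gpf_tasks[i]);        /* the StartFrame() role */
    create_task("rv_starter");            /* the create_tasks() role */
}

/* Body of the starter task, run only once the scheduler picks it. */
static void rv_starter_body(void)
{
    for (int i = 0; i < COUNT(rv_swes); i++)
        create_task(rv_swes[i]);
}
```

After app_init_sketch() the model holds the three GPF tasks plus the starter;
the two SWE tasks exist only after rv_starter_body() has run.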
267
268 Dynamic memory allocation
269 =========================
270
271 All dynamic memory allocation (i.e., all RAM usage beyond statically allocated
272 variables and buffers) is once again done either through Riviera or through GPF,
273 and in no other way. Ultimately all areas of the physical RAM that will ever
274 be used by the fw in any way are allocated when the fw is compiled and linked:
275 the areas from which Riviera and GPF serve their dynamic memory allocations are
276 statically allocated as char arrays in the respective C modules and placed in
277 the int.ram or ext.ram section as appropriate; Riviera and GPF then provide
278 API functions that allocate memory dynamically from these statically allocated
279 large pools.
280
281 Riviera and GPF have entirely separate memory pools from which they serve their
282 respective clients, hence there is no possibility of one affecting the other.
283 Riviera's memory allocation scheme is very much like the classic malloc&free:
284 there is one large unstructured pool from which all allocations are made, one
285 can allocate a chunk of any size, free chunks are merged when physically
286 adjacent, and fragmentation is an issue: a memory allocation request may fail
287 even when there is enough memory available in total if it is too fragmented.
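
The fragmentation failure mode is easy to reproduce in a toy first-fit
allocator. The sketch below models the pool as eight equal units with a
used/free flag each, so merging of adjacent free chunks is implicit; it is a
deliberately simplified model and not Riviera's actual allocator.

```c
#include <assert.h>

#define POOL_UNITS 8
static int used[POOL_UNITS];   /* 1 = unit allocated */

/* First-fit: find n contiguous free units (n >= 1); return the start
 * index, or -1 if no sufficiently long free run exists. */
static int pool_alloc(int n)
{
    for (int start = 0; start + n <= POOL_UNITS; start++) {
        int len = 0;
        while (len < n && !used[start + len])
            len++;
        if (len == n) {
            for (int i = 0; i < n; i++)
                used[start + i] = 1;
            return start;
        }
    }
    return -1;
}

static void pool_free(int start, int n)
{
    for (int i = 0; i < n; i++)
        used[start + i] = 0;
}

/* Total free units, regardless of whether they are contiguous. */
static int pool_free_total(void)
{
    int total = 0;
    for (int i = 0; i < POOL_UNITS; i++)
        total += !used[i];
    return total;
}
```

If all eight units are allocated singly and every other one is then freed,
pool_free_total() reports 4 free units, yet pool_alloc(2) fails because no two
free units are adjacent.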
288
289 GPF's dynamic memory allocation facility is considerably more robust: while it
290 does maintain one or two (depending on configuration) memory pools of the
291 traditional "dynamic" kind (like malloc&free, susceptible to fragmentation),
292 most GPF memory allocation works on "partition" memory instead. Here GPF
293 maintains 3 separate groups of pools: PRIM, TEST and DMEM; each allocation
294 request must specify the appropriate pool group and cannot affect the others.
295 Within each pool there is a fixed number of partitions of a fixed size: for
296 example, in TI's TCS211 GSM+GPRS configuration the PRIM pool group consists of
297 190 partitions of 60 bytes, 110 partitions of 128 bytes, 50 partitions of 632
298 bytes and 7 partitions of 1600 bytes. An allocation request from a given pool
299 group (e.g., PRIM) can request any arbitrary size in bytes, but it gets rounded
300 up to the nearest partition size and allocated out of the respective pool. If
301 no free partitions are available, the requesting task is suspended until another
302 task frees one. Because these partitions are used primarily for intertask
303 communication, if none are free, it can only mean (assuming that the firmware
304 functions correctly) that all partitions have been allocated and sent to some
305 queue for some task to work on, hence eventually they will get freed.
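
The round-up rule can be sketched using the PRIM figures quoted above. The
struct and function names are invented, and the return-value convention is an
assumption of the sketch: where the real system would suspend the requesting
task, this model simply returns 0. Following the description above, the sketch
does not fall back to a larger pool when the matching one is empty.

```c
#include <assert.h>
#include <stddef.h>

struct pool {
    size_t part_size;   /* size of each partition in bytes */
    int    free_count;  /* partitions currently free */
};

/* PRIM pool group with the TCS211 GSM+GPRS figures quoted above,
 * ordered by increasing partition size. */
static struct pool prim_group[] = {
    { 60, 190 }, { 128, 110 }, { 632, 50 }, { 1600, 7 }
};
#define PRIM_POOLS (sizeof prim_group / sizeof prim_group[0])

/* Returns the partition size actually granted; 0 if the matching pool
 * is exhausted (the real system would suspend the caller instead);
 * (size_t)-1 if the request exceeds the largest partition size. */
static size_t prim_alloc(size_t nbytes)
{
    for (size_t i = 0; i < PRIM_POOLS; i++) {
        if (nbytes <= prim_group[i].part_size) {
            if (prim_group[i].free_count == 0)
                return 0;
            prim_group[i].free_count--;
            return prim_group[i].part_size;
        }
    }
    return (size_t)-1;
}
```

A request for 61 bytes is served from the 128-byte pool, and a request for
1000 bytes from the 1600-byte pool, of which only 7 partitions exist.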
306
307 This scheme implemented in GPF is extremely robust in the opinion of this
308 author, and the other purely "dynamic" scheme is used (in the case of GPF) only
309 for init-time allocations which are never freed, such as task stacks - hence
310 the GPF-based part of the firmware is not susceptible at all to the problem of
311 memory fragmentation. But Riviera does suffer from this problem, and the
312 concern is more than just theoretical: one major user of Riviera-based dynamic
313 memory allocation is the trace facility (described in its own section below),
314 and my observation of the trace output from Pirelli's proprietary fw (which
315 appears to use the same architecture with separate Riviera and GPF) suggests
316 that after the fw has been running for a while, Riviera memory gets fragmented
317 to a point where many traces are being dropped. Replacing Riviera's poor
318 dynamic memory allocation scheme with a GPF-like partition-based one is a to-do
319 item for our project.
320
321 Message-based intertask communication
322 =====================================
323
324 Even though all entities of the G23M protocol stack are linked together into
325 one monolithic fw image and there is nothing to stop them from calling each
326 other's functions and accessing each other's variables, they don't work that
327 way. Instead all communication between entities is done through messages, just
328 as if they ran in separate address spaces or even on separate processors.
329 Buffers for this message exchange are allocated from a GPF partition pool: an
330 entity that needs to send a message to another entity allocates a buffer of the
331 needed size, fills it with the message to be sent, and posts it on the recipient
332 entity's message queue, all through GPF services. The other entity simply
333 processes the stream of messages that arrives on its message queue, freeing each
334 message (returning the buffer to the partition pool it came from) as it is
335 processed.
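
The allocate / fill / post / process / free cycle can be modelled in a few
dozen lines. The sketch below uses a tiny fixed set of buffers and a ring
buffer as the queue; none of the names correspond to real GPF services, and
the real receive call blocks where this one returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_DEPTH 4
#define NUM_BUFS    4
#define BUF_SIZE    60

struct msg { int opcode; char payload[BUF_SIZE - sizeof(int)]; };

static struct msg bufs[NUM_BUFS];      /* stand-in for a partition pool */
static int        buf_used[NUM_BUFS];

static struct msg *msg_alloc(void)
{
    for (int i = 0; i < NUM_BUFS; i++)
        if (!buf_used[i]) { buf_used[i] = 1; return &bufs[i]; }
    return NULL;
}

static void msg_free(struct msg *m) { buf_used[m - bufs] = 0; }

struct queue { struct msg *slot[QUEUE_DEPTH]; int head, count; };

static int queue_post(struct queue *q, struct msg *m)
{
    if (q->count == QUEUE_DEPTH)
        return -1;
    q->slot[(q->head + q->count) % QUEUE_DEPTH] = m;
    q->count++;
    return 0;
}

static struct msg *queue_get(struct queue *q)
{
    struct msg *m;
    if (q->count == 0)
        return NULL;
    m = q->slot[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return m;
}

/* Sender side: allocate a buffer, fill it, post it to the recipient. */
static int send_text(struct queue *to, int opcode, const char *text)
{
    struct msg *m = msg_alloc();
    if (!m)
        return -1;
    m->opcode = opcode;
    strncpy(m->payload, text, sizeof m->payload - 1);
    m->payload[sizeof m->payload - 1] = '\0';
    if (queue_post(to, m) < 0) { msg_free(m); return -1; }
    return 0;
}

/* Receiver side: take one message, act on it, return the buffer.
 * Returns the opcode handled, or -1 if the queue was empty. */
static int process_one(struct queue *q)
{
    struct msg *m = queue_get(q);
    int op;
    if (!m)
        return -1;
    op = m->opcode;
    msg_free(m);
    return op;
}
```

Note that ownership of the buffer passes from sender to receiver along with
the message: the sender never touches it again once it has been posted.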
336
337 Riviera-based tasks use a similar mechanism: unlike G23M protocol stack
338 entities, most Riviera-based functional modules provide APIs that are called as
339 functions from other tasks, but these API functions typically allocate a memory
340 buffer (through Riviera), fill it with the call parameters, and post it to the
341 associated task's message queue (also in the Riviera land) to be worked on.
342 Once the worker task gets the job done, it will either call a callback function
343 or post a response message back to the requestor - the latter option is only
344 possible if the requesting entity is also Riviera-based.
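
A function-style API that only packages a request for a worker task, with the
result delivered through a callback, might look roughly like this. The names
are hypothetical, malloc() and free() stand in for Riviera's own allocator,
and a linked list stands in for the task's message queue.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*done_cb)(int result);

struct request { int a, b; done_cb cb; struct request *next; };

static struct request *worker_queue;   /* head of a singly linked FIFO */

/* API function: runs in the caller's task and only packages the job. */
static int api_add_async(int a, int b, done_cb cb)
{
    struct request *r = malloc(sizeof *r), **pp = &worker_queue;
    if (!r)
        return -1;
    r->a = a; r->b = b; r->cb = cb; r->next = NULL;
    while (*pp)                 /* append at the tail */
        pp = &(*pp)->next;
    *pp = r;
    return 0;
}

/* One iteration of the worker task's loop: take a request, do the
 * work, report through the callback, free the buffer.  Returns -1 if
 * the queue was empty (the real task would block instead). */
static int worker_step(void)
{
    struct request *r = worker_queue;
    if (!r)
        return -1;
    worker_queue = r->next;
    r->cb(r->a + r->b);
    free(r);
    return 0;
}

/* Example callback supplied by the requesting component. */
static int last_result;
static void remember(int result) { last_result = result; }
```

The caller of api_add_async() gets control back immediately; the callback
fires later, in the worker's context, which is worth keeping in mind when
asking "which task does this code execute in?".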
345
346 A closer look at GPF
347 ====================
348
349 There are certain sublayers within GPF which need to be pointed out. The 3
350 major subdivisions within GPF are:
351
352 * The meaty core of GPF: this part is the code under gsm-fw/gpf/frame in our
353 source tree. It appears that this part was originally intended to be both
354 project-independent (same for GSM, TETRA etc) and OS-independent (same for
355 Nucleus, pSOS, VxWorks etc). This is the part of GPF that matters for the
356 G23M stack: all APIs called by PS entities are implemented here, and so are
357 all other PS-facing functions such as startup. (PS = protocol stack)
358
359 * OS adaptation layer (OSL): this is the part of GPF that adapts it to a given
360 underlying OS, in our case Nucleus.
361
362 * Test interface: see the code under gsm-fw/gpf/tst_drv and gsm-fw/gpf/tst_pei.
363 This part handles the trace output from all entities that run under GPF and
364 the mechanism for sending external debug commands to the GPF+PS subsystem.
365
366 GPF was a difficult step in our GSM firmware reintegration process because no
367 complete source for it could be found anywhere: apparently GPF was so stable
368 and so independent of firmware particulars (Calypso or LoCosto, GSM only or
369 GSM+GPRS, modem or complete phone with UI etc) that it appears to have been
370 used and distributed as prebuilt binary libraries even inside TI. All TI fw
371 (semi-)sources we have use GPF in prebuilt library form and are not set up to
372 recompile any part of it from source. (They had to include all GPF header
373 files though, as most of them are included by G23M C modules, and it would be
374 too much hassle to figure out which ones are or aren't needed, hence all were
375 included.)
376
377 Fortunately though, we were able to find the sources for most parts of GPF:
378
379 * The LoCosto source in TCS3.2_N5.24_M18_V1.11_M23BTH_PSL1_src.zip features the
380 source for the "core" part of GPF under gpf/FRAME - these sources aren't
381 actually used by that fw's build system (it only uses the prebuilt binary
382 libs for GPF), but they are there.
383
384 * Our TCS211 semi-src doesn't have any sources for the core part of GPF, but
385 instead it features the source for the test interface and some "misc" parts:
386 under gpf/MISC and gpf/tst in that source tree - these sources are not present
387 in the LoCosto version from Peek.
388
389 But one critical piece was still missing: the OS adaptation layer. It appears
390 that the GPF core (vsi_??? modules) and OSL (os_??? modules) were maintained
391 and built together, ending up together in frame_<blah>.lib files in the binary
392 form used to build firmwares, but the source for the "frame" part in the Peek
393 find contained only vsi_*.c and others, but not any of os_*.c.
394
395 Thus we had to reconstruct GPF from the shattered bits and pieces we had. I
396 took the frame sources from Peek and the misc and tst sources from Sotovik, and
397 saw that they compiled w/o problems in our gcc environment. Attempting to link
398 any firmware that uses GPF would have been futile at this point, as it would
399 have failed with undefined references to os_*() functions. Then I had to do
400 the hard work: disassemble the missing os_??? modules from the binary libs in
401 the TCS211 version (hey, at least this one was known to work reliably) and write
402 new C code replicating the exact logic found in the disassembly of the known
403 working and fitting binary. This work is now mostly done (some non-essential
404 functions have been stubbed out to be revisited later), and the version of GPF
405 used by FreeCalypso is a significant work of reconstruction, not merely lifted
406 from a readily available source and plopped in.
407
408 A closer look at L1
409 ===================
410
411 The L1 code is remarkable in how little intertie it has with the rest of the
412 firmware it is linked into. It is almost entirely self-contained, expecting
413 only 4 functions to be provided by the underlying OS environment:
414
415 os_alloc_sig -- allocate message buffer
416 os_free_sig -- free message buffer
417 os_send_sig -- send message to upper layers
418 os_receive_sig -- receive message from upper layers
419
420 It helps to remember that at the beginning of TI's involvement in the GSM
421 baseband chipset business, L1 was the only thing they "owned", while Condat,
422 the maintainers of the higher level protocol stack, was a separate company.
423 TI's "turnkey" solution must have consisted of their own L1 code plus G23M code
424 (including GPF etc) licensed from Condat, but I'm guessing that TI probably
425 wanted to retain the ability to sell their chips with their L1 without being
426 entangled by Condat: let the customer use their own GSM L23 stack, or perhaps
427 work out their own independent licensing arrangements with Condat. I'm
428 guessing that L1 was maintained as its own highly independent and at least
429 conceptually portable entity for these reasons.
430
431 The way in which L1 is intertied into our FreeCalypso GSM fw is the same as how
432 it is done in TI's production firmwares, including both our TCS211 reference
433 and the TCS3.2 version from which we got our L1 source. There is a module
434 called OSX, which is an extremely thin adaptation layer that implements the
435 APIs expected by L1 in terms of GPF. Furthermore, this OSX layer provides
436 header file isolation: the only "outside" (non-L1) header included by L1 is
437 cust_os.h, and it defines the necessary interface to OSX *without* including
438 any other headers (no GPF headers in particular), using only the C language's
439 native types. Apart from this cust_os.h header, the entire OSX layer is
440 implemented in one C module (osx.c, which we had to reconstruct from osx.obj as
441 the source was missing - but it's very simple) which does include some GPF
442 headers and implements the OSX API in terms of GPF services. Thus in TI's
443 production firmwares and in our FC GSM fw L1 does sit on top of GPF, but very
444 indirectly.
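
The header isolation idea can be illustrated as follows: the client side sees
only an opaque struct and native C types, while a single module that does know
the framework's types forwards each call. The shim_* and fw_* names and all
signatures below are invented for this sketch; they are not the real
cust_os.h, osx.c or GPF interfaces.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what a client-facing header could expose: native types only ---- */
struct sig;                                   /* opaque to the client */
struct sig *shim_alloc_sig(int size);
void        shim_free_sig(struct sig *s);
void        shim_send_sig(struct sig *s, int queue);
struct sig *shim_receive_sig(int queue);

/* ---- stand-in for the framework underneath (hypothetical) ---- */
#define FW_QUEUES 2
struct fw_buf { struct fw_buf *next; int size; };
static struct fw_buf *fw_queue[FW_QUEUES];

static struct fw_buf *fw_alloc(int size)
{
    struct fw_buf *b = malloc(sizeof *b + (size_t)size);
    if (b) { b->next = NULL; b->size = size; }
    return b;
}

static void fw_post(int q, struct fw_buf *b)  /* append to the FIFO */
{
    struct fw_buf **pp = &fw_queue[q];
    while (*pp)
        pp = &(*pp)->next;
    *pp = b;
}

static struct fw_buf *fw_take(int q)  /* NULL if empty; a real one blocks */
{
    struct fw_buf *b = fw_queue[q];
    if (b) { fw_queue[q] = b->next; b->next = NULL; }
    return b;
}

/* ---- the shim itself: each call is a thin forwarder ---- */
struct sig { struct fw_buf buf; };    /* completed only on this side */

struct sig *shim_alloc_sig(int size)  { return (struct sig *)fw_alloc(size); }
void shim_free_sig(struct sig *s)     { free(s); }
void shim_send_sig(struct sig *s, int queue) { fw_post(queue, &s->buf); }
struct sig *shim_receive_sig(int queue) { return (struct sig *)fw_take(queue); }
```

Because the client never sees struct fw_buf, the layer underneath the shim can
be swapped (for example, higher-level services replaced by lower-level ones)
without recompiling the client against different headers.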
445
446 More specifically, the "production" version of OSX implements its API in terms
447 of *high-level* GPF functions, i.e., VSI. However, they also had an interesting
448 OP_L1_STANDALONE configuration which omitted not only all of G23M, but also the
449 core of GPF and possibly the Riviera environment as well. We don't have a way
450 to recreate this configuration exactly as it existed inside TI because we don't
451 have the source bits specific to this configuration (our own standalone L1
452 configuration is implemented differently, see below), but we do have a little
453 bit of insight into how it worked.
454
455 It appears that TI's OP_L1_STANDALONE build used a special "gutted" version of
456 GPF in which the "meaty core" (VSI etc) was removed. The OS layer (os_???
457 modules implementing os_*() functions) that interfaces to Nucleus was kept, and
458 so was OSX used by L1 - but this time the OSX API functions were implemented in
459 terms of os_*() ones (low-level wrappers around Nucleus) instead of the higher-
460 level VSI APIs provided by the "meaty core" of GPF. It is purely a guess on my
461 part, but perhaps this hack was also done in the days before TI's acquisition
462 of Condat, and by omitting the "meaty core" of GPF, TI could claim that their
463 OP_L1_STANDALONE configuration did not contain any of Condat's "intellectual
464 property".
465
466 In FreeCalypso we do have a way to build a firmware image that includes L1 but
467 not G23M: it is our own L1 standalone configuration, enabled with a
468 feature l1stand line in build.conf. However, because IP considerations don't
469 apply to us (we operate under the doctrine of eminent domain), we are not
470 replicating TI's gutting of GPF: *our* L1 standalone configuration includes the
471 full GPF (with OSX for L1 implemented in terms of VSI), but with a greatly
472 reduced set of tasks when G23M is omitted.
473
474 Run-time structure of L1
475 ========================
476
477 L1 consists of two major parts: L1S and L1A. L1S is the synchronous part where
478 the most time-critical functions are performed; it runs as a Nucleus HISR. The
479 hardware in the Calypso generates an interrupt on every TDMA frame (4.615 ms),
480 and the LISR handler for this interrupt triggers the L1S HISR. L1S communicates
481 with L1A through a shared memory data structure, and also sometimes allocates
482 message buffers and posts them to L1A's incoming message queue (both via OSX
483 API functions, i.e., via GPF in disguise).
484
485 L1A runs as a regular task under Nucleus, and includes a blocking call (to GPF
486 via OSX) to wait for incoming messages on its queue. It is one big loop that
487 waits for incoming messages, then processes each received message and commands
488 L1S to do most of the work. The entry point to L1A in the L1 code proper is
489 l1a_task(), although the responsibility for running it as a task falls on some
490 "glue" code outside of L1 proper. TI's production firmwares with G23M included
491 have an L1 protocol stack entity within G23M whose only job (aside from some
492 initialization) is to run l1a_task() in the Nucleus task created by GPF for
493 that protocol stack entity; we do the same in our firmware.
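
The division of labor between L1S and L1A described above can be modeled on a
host machine. The following is only a sketch under stated assumptions: every
demo_* name is made up for illustration and none of them come from TI's L1 or
OSX code, the queue is a plain ring buffer rather than a GPF queue, and where
the real L1A blocks waiting for a message, the model simply returns.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side model only; none of these names exist in TI's L1 or OSX. */
#define DEMO_QUEUE_DEPTH 8

struct demo_msg { int signal; unsigned frame; };

struct demo_queue {
    struct demo_msg slot[DEMO_QUEUE_DEPTH];
    size_t head, count;
};

/* Shared memory block: L1A writes commands into it, L1S reads them. */
struct demo_l1_shared {
    int task_requested;   /* written by L1A, consumed by L1S */
    unsigned frames_run;  /* incremented by L1S on every frame */
};

static struct demo_queue l1a_queue;
static struct demo_l1_shared l1_shared;

static int demo_post(struct demo_queue *q, struct demo_msg m)
{
    assert(q != NULL);
    if (q->count == DEMO_QUEUE_DEPTH)
        return -1;                        /* queue full, message dropped */
    q->slot[(q->head + q->count) % DEMO_QUEUE_DEPTH] = m;
    q->count++;
    return 0;
}

static int demo_get(struct demo_queue *q, struct demo_msg *out)
{
    if (q->count == 0)
        return -1;                        /* the real L1A would block here */
    *out = q->slot[q->head];
    q->head = (q->head + 1) % DEMO_QUEUE_DEPTH;
    q->count--;
    return 0;
}

/* L1A side: command L1S through the shared structure. */
void demo_l1a_request(int task)
{
    l1_shared.task_requested = task;
}

/* Stands in for the L1S HISR: called once per TDMA frame interrupt. */
void demo_l1s_frame(unsigned frame)
{
    l1_shared.frames_run++;
    if (l1_shared.task_requested) {
        struct demo_msg done;
        done.signal = l1_shared.task_requested;
        done.frame = frame;
        l1_shared.task_requested = 0;
        demo_post(&l1a_queue, done);      /* report back to L1A's queue */
    }
}

/* Stands in for one pass of the L1A loop; returns messages handled. */
int demo_l1a_poll(void)
{
    struct demo_msg m;
    int handled = 0;

    while (demo_get(&l1a_queue, &m) == 0)
        handled++;                        /* message processing goes here */
    return handled;
}
```

In this model a frame tick with no outstanding request posts nothing, so
demo_l1a_poll() returns 0; after demo_l1a_request() the next frame tick posts
exactly one message for L1A to pick up.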
494
495 Communication between L1 and G23M
496 =================================
497
498 It is remarkable that L1 and G23M don't have any header files in common: L1
499 uses its own (almost fully self-contained), whereas the G23M+GPF realm is its
500 own world with its own header files. One has to ask then: how do they
501 communicate? OK, we know they communicate through primitives (messages in
502 buffers allocated from GPF's PRIM partition memory pool) passed via message
503 queues, but what about the data structures in these messages? Where are those
504 defined if there are no header files in common between L1 and G23M?
505
506 The answer is that there are separate definitions of the L1<->G23M interface on
507 each side, and TI must have kept them in sync manually. Not exactly a
508 recommended programming or software maintenance practice for sure, but TI took
509 care of it, and the existing proprietary products based on TI's firmware are
510 rock solid, so it is not really our place to complain.
511
512 TI's firmwares from the era we are working with (the TCS3.2/LoCosto source from
513 20090327 from which we took our L1 and G23M and the binary libs version of
514 TCS211 from 20070608 which serves as our reference) also include a component
515 called ALR. It resides in the G23M code realm: G23M coding style, uses Condat
516 header files, runs as its own protocol stack entity under GPF. This component
517 appears to serve as a glue layer between the rest of the G23M stack (which is
518 supposed to be truly hardware-independent) and TI's L1.
519
520 Speaking of ALR, it is worth mentioning that there is a little naming
521 inconsistency here. ALR is known to the connect-by-name logic in GPF as "PL"
522 (physical layer, apparently), while the ACI entity (Application Control
523 Interface, the top level entity) is known to the same logic as "MMI". No big
524 deal really, but hopefully knowing this quirk will save someone some confusion.
525
526 Debug trace facility
527 ====================
528
529 See the RVTMUX document in the same directory as this one for general background
530 information about the debug and development interface provided by TI-based
531 firmwares. Our FreeCalypso GSM firmware implements an RVTMUX interface as well,
532 and the most immediate use to which it is put is debug trace output. In this
533 section I'm going to describe how this debug trace output is generated inside
534 the fw.
535
536 The firmware component that "owns" the physical UART channel assigned to RVTMUX
537 is RVT, implemented in gsm-fw/riviera/rvt. It is a Riviera-based component,
538 and it has a Nucleus task that is created and started through Riviera. All
539 calls to the actual driver for the UART are made from RVT. In the case of
540 output from the Calypso GSM device to an external host, all such output is
541 performed in the context of RVT's Nucleus task; this task drains RVT's message
542 queue and emits the content of allocated buffers posted to it, freeing them
543 afterward. (The dynamic memory allocation system in this case is Riviera's,
544 which is susceptible to fragmentation - see discussion earlier in this article.)
545 Therefore, every trace or other output packet emitted from a GSM device running
546 our fw (or any of the proprietary firmwares based on the same architecture)
547 appears as a result of a message in a dynamically allocated buffer having been
548 posted to RVT's queue.
549
550 RVT exports several API functions that are intended to be called from other
551 tasks; it is by way of these functions that most output is submitted to RVT.
552 One can call rvt_send_trace_cpy() with a fully prepared output message, and
553 that function will allocate a buffer from Riviera's dynamic memory allocator
554 properly accounted to RVT, fill it and post it to the RVT task's queue.
555 Alternatively, one can call rvt_mem_alloc() to allocate the buffer, fill it in
556 and then pass it to rvt_send_trace_no_cpy().
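
The two submission paths and the draining task can be illustrated with a small
host-side model. This is a sketch, not RVT's code: the demo_rvt_* functions are
hypothetical stand-ins with simplified arguments (the real rvt_* functions take
other parameters not shown here), malloc() stands in for Riviera's allocator,
and "emitting" a buffer means appending it to a caller-supplied array instead
of writing to a UART.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Host-side model; the demo_rvt_* names are hypothetical, not RVT's API. */
#define DEMO_RVT_QUEUE_DEPTH 16

struct demo_rvt_msg { char *buf; size_t len; };

static struct demo_rvt_msg demo_rvt_queue[DEMO_RVT_QUEUE_DEPTH];
static size_t demo_rvt_head, demo_rvt_count;

/* No-copy path, step 1: the caller obtains a buffer to fill in place. */
char *demo_rvt_mem_alloc(size_t len)
{
    return malloc(len);
}

/* No-copy path, step 2: ownership of buf passes to the queue.
 * If the queue is full, the buffer is freed and -1 is returned. */
int demo_rvt_send_no_cpy(char *buf, size_t len)
{
    size_t tail;

    assert(buf != NULL);
    if (demo_rvt_count == DEMO_RVT_QUEUE_DEPTH) {
        free(buf);
        return -1;
    }
    tail = (demo_rvt_head + demo_rvt_count) % DEMO_RVT_QUEUE_DEPTH;
    demo_rvt_queue[tail].buf = buf;
    demo_rvt_queue[tail].len = len;
    demo_rvt_count++;
    return 0;
}

/* Copy path: allocate, copy the caller's message, then queue the copy.
 * The caller keeps ownership of msg and may reuse it immediately. */
int demo_rvt_send_cpy(const char *msg, size_t len)
{
    char *buf = demo_rvt_mem_alloc(len);

    if (buf == NULL)
        return -1;
    memcpy(buf, msg, len);
    return demo_rvt_send_no_cpy(buf, len);
}

/* Stands in for the RVT task: emit each queued buffer in order, then free
 * it. Stops early if out has no room. Returns the number of bytes written. */
size_t demo_rvt_drain(char *out, size_t out_size)
{
    size_t written = 0;

    while (demo_rvt_count > 0) {
        struct demo_rvt_msg *m = &demo_rvt_queue[demo_rvt_head];

        if (written + m->len > out_size)
            break;                        /* leave the rest queued */
        memcpy(out + written, m->buf, m->len);
        written += m->len;
        free(m->buf);
        demo_rvt_head = (demo_rvt_head + 1) % DEMO_RVT_QUEUE_DEPTH;
        demo_rvt_count--;
    }
    return written;
}
```

The point of the model is the ownership rule: with the copy variant the caller
may overwrite its message right after the call, whereas with the no-copy
variant the buffer belongs to the queue from the moment it is submitted.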
557
558 At higher levels, there are a total of 3 kinds of debug traces that can be
559 emitted:
560
561 * Riviera traces: these are generated by various components implemented in
562 Riviera land, although in reality any component can generate a trace of this
563 form by calling rvf_send_trace() - this function can be called from any task.
564
565 * L1 traces: L1 has its own trace facility implemented in
566 gsm-fw/L1/cfile/l1_trace.c; it generates its traces as ASCII messages and
567 sends them out via rvt_send_trace_cpy().
568
569 * GPF traces: code that runs in GPF/G23M land and uses those header files and
570 coding conventions etc can emit traces through GPF. GPF's trace functions
571 (implemented in gsm-fw/gpf/frame/vsi_trc.c) allocate a memory partition from
572 GPF's TEST pool, format the trace into it, and send the trace primitive to
573 GPF's special test interface task. That task receives trace and other GPF
574 test interface primitives on its queue, performs some manipulations on them,
575 and ultimately generates RVT trace output, i.e., a new dynamic memory buffer
576 is allocated in the Riviera land, the trace is copied there, and the Riviera
577 buffer goes to the RVT task for the actual output.
578
579 Trace masking
580 =============
581
582 The RV trace facility invoked via rvf_send_trace() has a crude masking ability,
583 but by default all traces are enabled. In TI's standard firmwares most of the
584 trace output comes from L1: L1's trace output is very voluminous, and appears
585 to be fully enabled by default. I have yet to look more closely at whether
586 there is any trace masking functionality in L1 and what the default trace
587 verbosity level should be.
588
589 On the other hand, GPF and therefore G23M traces are mostly disabled by default.
590 One can turn the trace verbosity level from any GPF-based entity up or down by
591 sending a "system primitive" command to the running fw, and another such command
592 can be used to save these masks in FFS, so that they will be restored on the
593 next boot cycle and be effective at the earliest possible time. Enabling *all*
594 GPF trace output for all entities is generally not useful though, as it is so
595 verbose that a developer trying to make sense of it will likely drown in it.
596
597 GPF compressed trace hack
598 =========================
599
600 TI's Windows-based GSM firmware build systems include a hack called str2ind.
601 Seeking to reduce the fw image size by eliminating trace ASCII strings from it,
602 and seeking to reduce the load on the RVTMUX serial interface by eliminating
603 the transmission time of these strings, they passed their sources through an
604 ad hoc preprocessor that replaces these ASCII strings with numeric indices.
605 The compilation process with this str2ind hack becomes very messy: each source
606 file is first passed through the C preprocessor, then the intermediate form is
607 passed through str2ind, and finally the de-string-ified form is compiled, with
608 the compiler being told not to run the C preprocessor again.
609
610 TI's str2ind tool maintains a table of correspondence between the original trace
611 ASCII strings and the indices they've been turned into, and a copy of this table
612 becomes essential for making sense of GPF trace output: the firmware now emits
613 only numeric indices which are useless without this str2ind.tab mapping table.
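
To make the role of the mapping table concrete, here is a minimal sketch of the
host-side lookup that turns an index back into its string. It assumes a table
already loaded into memory as index/string pairs; the actual str2ind.tab file
format is not described in this document and is not assumed here, and the
three trace strings are invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of an index-to-string table. The entries
 * below are made up; they are not taken from any real str2ind.tab. */
struct demo_str2ind_entry { unsigned index; const char *text; };

static const struct demo_str2ind_entry demo_table[] = {
    { 0, "entity started" },
    { 1, "timer expired" },
    { 2, "unexpected primitive" },
};

/* Map a numeric trace index back to its original ASCII string.
 * Returns NULL when the index is not in the table, which is what one
 * would see when the table does not match the running fw image. */
const char *demo_str2ind_lookup(unsigned index)
{
    size_t i;

    for (i = 0; i < sizeof demo_table / sizeof demo_table[0]; i++) {
        assert(demo_table[i].text != NULL);
        if (demo_table[i].index == index)
            return demo_table[i].text;
    }
    return NULL;
}
```

A NULL result for an index the firmware actually emitted is the failure mode
that makes keeping each table paired with its fw image so important.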
614
615 Our FreeCalypso firmware does not currently implement this str2ind aka
616 compressed trace hack, i.e., all GPF trace output from our fw is in full ASCII
617 string form. I have not bothered to implement compressed traces because:
618
619 * We have not yet encountered a case of the full ASCII strings causing a problem
620 either with fw images not fitting into the available memory or excessive load
621 on the RVTMUX interface;
622
623 * Implementing the hack in question would require extra work: the str2ind tool
624 would have to be reimplemented anew, as of the original we have no source,
625 only a Windows binary, and requiring our free fw build process to run a
626 Windows binary under Wine is a no-no;
627
628 * I don't feel like doing all that extra work for what appears to be no real
629 gain;
630
631 * Having to run gcc with separate cpp and actual compilation steps with str2ind
632 sandwiched in between would be ugly and gross;
633
634 * Having to keep track of which str2ind.tab goes with which fw image and supply
635 the right table to our rvinterf tools would likely be a pita.
636
637 So we shall stick with full ASCII string traces until and unless we run into an
638 actual (as opposed to hypothetical) problem with either fw image size or serial
639 interface load.
640
641 RVTMUX command input
642 ====================
643
644 RVTMUX is not just debug trace output: it is also possible for an external host
645 to send commands to the running fw via RVTMUX.
646
647 Inside the fw, RVTMUX input is handled by the RVT entity by way of a Nucleus
648 HISR. This HISR gets triggered when Rx bytes arrive at the designated UART,
649 and it calls the UART driver to collect the input. RVT code running in this
650 HISR parses the message structure and figures out which fw component the
651 incoming message is addressed to. Any fw component can register to receive
652 RVTMUX packets, and provides a callback function with this registration; this
653 callback function is called in the context of the HISR.
654
655 In our current FC GSM fw there are two components that register to receive
656 external host commands via RVTMUX: ETM and GPF. ETM is described in my earlier
657 RVTMUX write-up. ETM is implemented as a Riviera SWE and has its own Nucleus
658 task; the callback function that gets called from the RVT HISR posts received
659 messages onto ETM's own queue drained by its task. The ETM task gets scheduled,
660 picks up the command posted to its queue, executes it, and sends a response
661 message back to the external host through RVT.
662
663 Because all ETM commands funnel through ETM's queue and task, and that task
664 won't start looking at a new command until it has finished handling the previous
665 one, all ETM commands and responses are in strict lock-step: it is not possible
666 to send two commands and have their responses come in out of order, and it makes
667 no sense to send another ETM command prior to receiving the response to the
668 previous one. (But there can still be debug traces or other traffic intermixed
669 on RVTMUX in between an ETM command and the corresponding response!)
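
The registration, HISR-context callback and hand-off to a task described above
can be sketched as follows. This is a host-side model under assumptions: the
demo_* names are hypothetical, the recipient is selected by the first byte of
an already-collected packet (an assumption about addressing made only for this
example), and a single-slot mailbox stands in for ETM's real message queue.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side model; names and the one-byte addressing are assumptions. */
#define DEMO_MAX_USERS 8

typedef void (*demo_rx_callback)(const unsigned char *payload, size_t len);

static demo_rx_callback demo_users[DEMO_MAX_USERS];

/* A fw component registers to receive packets addressed to user_id. */
int demo_register(unsigned user_id, demo_rx_callback cb)
{
    if (user_id >= DEMO_MAX_USERS || cb == NULL)
        return -1;
    demo_users[user_id] = cb;
    return 0;
}

/* Stands in for the RVT HISR once a complete packet has been collected:
 * the first byte selects the recipient, the rest is its payload.
 * Returns 0 if a callback ran, -1 if the packet was dropped. */
int demo_dispatch(const unsigned char *pkt, size_t len)
{
    assert(pkt != NULL);
    if (len < 1 || pkt[0] >= DEMO_MAX_USERS || demo_users[pkt[0]] == NULL)
        return -1;
    demo_users[pkt[0]](pkt + 1, len - 1);  /* runs in "HISR" context */
    return 0;
}

/* Single-slot mailbox standing in for ETM's queue. */
static unsigned char demo_etm_pending[32];
static size_t demo_etm_pending_len;
static int demo_etm_has_pending;

/* ETM-style callback: do no real work here, only hand the command over. */
void demo_etm_callback(const unsigned char *payload, size_t len)
{
    size_t i;

    if (demo_etm_has_pending || len > sizeof demo_etm_pending)
        return;                            /* model limitation: drop it */
    for (i = 0; i < len; i++)
        demo_etm_pending[i] = payload[i];
    demo_etm_pending_len = len;
    demo_etm_has_pending = 1;
}

/* Stands in for one iteration of the ETM task: take the pending command,
 * "execute" it, and return its payload length, or -1 if none is pending. */
int demo_etm_task_step(void)
{
    if (!demo_etm_has_pending)
        return -1;
    demo_etm_has_pending = 0;
    return (int)demo_etm_pending_len;
}
```

Because the callback only records the command and the task handles one command
per step, responses in this model necessarily come out in the order the
commands went in, which is the lock-step behavior described above.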
670
671 The other component that can receive external commands is GPF. GPF's test
672 interface can receive so-called "system primitives", which are ASCII string
673 commands parsed and acted upon by GPF, and also binary protocol stack
674 primitives. Remember how all entities in the G23M stack communicate by sending
675 messages to each other? Well, GPF's test interface allows such messages to be
676 injected externally as well, directed to any entity in the running fw. System
677 primitive commands can also be used to cause entities to send their outgoing
678 primitives to the test interface, either instead of or in addition to the
679 originally intended recipient.
680
681 Firmware subsetting
682 ===================
683
684 We have built our firmware up incrementally, piece by piece, starting from a
685 very small skeleton. As we added pieces working toward full GSM MS
686 functionality, the ability to build less functional fw images corresponding to
687 our earlier stages of development has been retained. Each piece we added is
688 "optional" from the viewpoint of our build system, even if it is absolutely
689 required for normal usage, and is enabled by the appropriate feature line in
690 build.conf.
691
692 Our minimal baseline with absolutely no "features" enabled consists of:
693
694 * Nucleus
695 * Riviera
696 * TI's basic drivers for GPIO, ABB etc
697 * RVTMUX on the UART port chosen by the user (RVTMUX_UART_port Bourne shell
698 variable in build.conf) and the UART driver for it
699 * FFS code operating on a fake FFS image in RAM
700
701 If one runs this minimal "firmware" on a Calypso device, one will see some
702 startup messages in RV trace format followed by a System Time trace every 20 s.
703 This "firmware" can't do anything more, there is not even a way to command it
704 to power off or reboot.
705
706 Working toward full GSM MS functionality, pieces can be added to this skeleton
707 in this order:
708
709 * GPF
710 * L1
711 * G23M
712
713 feature gsm enables all of the above for normal usage; feature l1stand can be
714 used alternatively to build an L1 standalone image without G23M - we expect
715 that we may end up using a ramImage form of the latter for RF calibration on
716 our own Calypso hardware.
717
718 ETM and various FFS configurations are orthogonal features to the choice of
719 core functionality level.
720
721 Further reading
722 ===============
723
724 Believe it or not, some of the documentation that was written by the original
725 vendors of the software in question and which we've been able to locate turns
726 out to be fairly relevant and helpful, such that I recommend reading it.
727
728 Documentation for Nucleus PLUS RTOS:
729
730 ftp://ftp.ifctf.org/pub/embedded/Nucleus/nucleus_manuals.tar.bz2
731
732 Quite informative, and fits our version of Nucleus just fine.
733
734 Riviera environment:
735
736 ftp://ftp.ifctf.org/pub/GSM/Calypso/riviera_preso.pdf
737
738 It's in slide presentation form, not a detailed technical document, but
739 it covers a lot of points, and all that Riviera stuff described in the
740 preso *is* present in our fw for real, hence it should be considered
741 relevant.
742
743 GPF documentation:
744
745 http://scottn.us/downloads/peek/SW%20doc/frame_users_guide.pdf
746 http://scottn.us/downloads/peek/SW%20doc/vsipei_api.pdf
747
748 Very good reading, helped me understand GPF when I first reached this
749 part of firmware reintegration.
750
751 TCS3.x/LoCosto fw architecture:
752
753 http://scottn.us/downloads/peek/SW%20doc/TCS2_1_to_3_2_Migration_v0_8.pdf
754 ftp://ftp.ifctf.org/pub/GSM/LoCosto/LoCosto_Software_Architecture_Specification_Document.pdf
755
756 These TI docs focus mostly on how they changed the fw architecture from
757 their TCS2.x program (Calypso) to their newer TCS3.x (LoCosto), but one
758 can still get a little insight into the "old" TCS211 architecture they
759 were moving away from, which is the architecture I've adopted for
760 FreeCalypso.