view TCS211-fw-arch @ 32:78c2cc6ebbb8

DUART-cable: update for DUART28
author Mychaela Falconia <falcon@freecalypso.org>
date Thu, 24 Sep 2020 02:47:14 +0000

This document describes the architecture of TI's TCS211 firmware and that of
our FreeCalypso Magnetite and Selenite firmwares which are based on it.

What is TCS211, and why we use it as our reference
==================================================

TI were in the business of making GSM baseband chipsets for about a decade
from the late 1990s up until 2009, and over that time span both their silicon
and their firmware architecture evolved in many different ways.  All of our
work in the FreeCalypso family of projects is based on a single, rather
arbitrarily picked point in that long evolutionary line: we use the Calypso
chipset as opposed to the ones before and after it, and we use TI's TCS211
firmware from 2007 as our golden reference, as opposed to other equally valid
ways of architecting the fw that came before and after our chosen snapshot.

Q: Why do we use the Calypso chipset as opposed to LoCosto or E-Costo or
whatever was TI's very last offering before they got out of that business?

A: Because that's what Openmoko used: their Neo FreeRunner aka GTA02 smartphones
were our primary hardware target for many years before we gathered the money
and the courage to build our own board-level hardware starting from just chips
bought on the Chinese surplus market.

Q: Why do we use TI's TCS211 firmware from 2007 and its architecture as our
golden reference, as opposed to any of the other infinitely many equally valid
ways of architecting a working firmware implementation for the same Calypso
chipset?

A: Because it works flawlessly, and is extremely stable as a commercial product.
The firmware which Openmoko got from TI had only a tiny difference from TI's
internal TCS211 mainline (TSPACT signal definitions in tpudrv12.h which are
different between the quadband RFFE on TI's internal reference hw and the
triband one in FIC's commercial implementation), and with only a few additional
changes related to our use of a newer flash chip that wasn't supported back in
TI's and Openmoko's days, this golden reference fw can run equally well on our
own FCDEV3B.

Relation between TCS211 and FreeCalypso
=======================================

The only "pure" TCS211 firmware we have is the one that was salvaged from
the ruins of Openmoko.  To the best of our knowledge, it is the world's only
surviving copy of any version of TCS211 - it is entirely possible that even TI
may not have it any more in any of their archives, given the length of time that
has passed and the total lack of interest in this "ancient junk".  In its pure
form, this world's sole surviving copy of TI's TCS211 fw is laden with blobs
(many components exist only as binary object libraries with no corresponding
source), and it features a build system that is very thoroughly Windows-based.
And to top it off, that configuration and build system has many critical
components which also exist only as compiled binaries (Windows executables or
Java bytecode) with no corresponding source.

We started by replacing the original configuration and build system of TCS211
with our own one that is Unix-based rather than Windows-based, and implemented
in Bourne shell with a few C helpers instead of XML, Java and Perl.  The result
was named FreeCalypso Magnetite.  At first we changed only the configuration
and build system, but kept all of the original TCS211 code, including all of
the binary-only components.  Then we deblobbed it gradually, replacing binary-
only components with source, one component at a time.  Where did we get the
source for the pieces that came as binary objects with no corresponding source?
The answer is different for different components:

* For GSM Layer 1 (a very critical and highly chipset-dependent component), we
  did a painstaking reconstruction which you can see in the tcs211-l1-reconst
  repository.  The world's last surviving copy of TCS211 which we got had only
  the *.c files censored out, while all of the original *.h files were
  preserved -
  and thanks to the preserved configuration and build system, we also got all
  of the original compilation lines including compiler options, -D definitions
  and -I include paths.  For most of the missing *.c files we got a "wrong"
  version from the TCS3/LoCosto source.  The reconstruction proceeded by taking
  these "wrong version" *.c files, putting them one module (one *.c file) at a
  time into the TCS211 build environment, and massaging each individual *.c
  file until it compiled into a perfect match to the original binary object.
  Thus we have reconstructed a full C source for the L1 component which for all
  practical purposes can be treated as if it were the lost original source.

* For some small pieces like the tpudrv12 RF driver and the OSL and OSX
  components of GPF it was more of a translation from disassembly to C: the C
  code we use is of our own writing, but it faithfully matches the logic
  implemented by the original blobs as recovered through disassembly.

* The G23M protocol stack is a very large and complex component, and our copy
  of TCS211 (the world's only surviving copy to the best of our knowledge) has
  it in binary-only form.  Reconstructing its source precisely like we did
  with L1 would have been infeasible, hence we took a different approach: we
  put together a TCS2/TCS3 hybrid in which all G23M components are replaced
  wholesale with the newer version from TCS3, without trying to recreate the
  old version.

* Both TCS211 and TI's newer TCS3.2 fw for the LoCosto chipset are based on
  Nucleus PLUS RTOS (different versions), and both firmwares have their Nucleus
  only as binary object libraries, no source.  However, we got another version
  of Nucleus from about the same time frame (slightly newer than the one TI used
  in TCS211, but slightly older than the one in TCS3.2) from a non-TI source
  (it was posted on a Russian web forum by Comrade XVilka), and in FreeCalypso
  Selenite we use this new Nucleus as a replacement for the original TCS211
  version, in the same manner as we had earlier made a wholesale replacement
  of the G23M protocol stack.

With two major components (Nucleus and the G23M PS) replaced with non-TCS211
versions, our Magnetite hybrid and Selenite firmwares are no longer TCS211, but
they still faithfully follow the _architecture_ of TCS211: in each case when we
replaced the code, we made the new code version fit perfectly into the original
architecture without any disruptive changes.  Thus anyone who desires to
understand our current FreeCalypso firmwares (Magnetite and Selenite) needs to
first understand the original TCS211 architecture, as it is essentially
unchanged.

Why not use the LoCosto chipset and its TCS3.2 firmware?
========================================================

We went the Calypso route and not the LoCosto route because of the circumstances
that surrounded the beginning of our family of projects.  We did not get all of
the tools needed for working with LoCosto chips and TI's TCS3.2 fw (CSST and
SBuild) until the spring of 2015, and by that time we had invested too much into
the Calypso to throw it all away and restart anew in the uncharted waters of
LoCosto.  Another factor is that the software for talking to LoCosto's ROM
bootloader (CSST) exists only as Windows binaries sans source, and it would
require some effort to reverse-engineer the protocol and implement a free and
Unix-based alternative - whereas for the Calypso this work was already done by
OsmocomBB folks before we entered the scene.  Finally, in the case of the
Calypso we have read out the actual content of the ROMs (both the ARM boot ROM
and the DSP ROM) and the ARM boot ROM code has been disassembled and thoroughly
understood - whereas in the case of LoCosto it is not certain if we can even
read out the ROM content, as it is said to be protected against reading.

If someone else desires to play with LoCosto, either by hacking a Peek device
or by building an I-Sample board from the available PADS PCB file, go for it!
But the FreeCalypso core team is sticking with the Calypso chipset for now, and
our actively maintained Magnetite and Selenite firmwares follow the architecture
of TCS211, not that of TCS3.2.

Relation between the ARM and DSP cores in the Calypso
=====================================================

The Calypso digital baseband processor chip has two processor cores in it: an
ARM7TDMI core that runs the main firmware and a C54x DSP core that performs the
more burdensome signal processing tasks.  The DSP is subservient to the ARM:
only the ARM comes out of reset and starts executing code upon power-up, while
the DSP is held in reset (does not run) until and unless the ARM firmware starts
it running.

The ARM core executes code from outside of the Calypso chip itself: in normal
operation (outside of development) there is a flash memory chip connected to
Calypso's external memory bus, and the Calypso's ARM core executes firmware
stored in this flash.  There is an optional (enabled or disabled by a hardware
pin) ARM boot ROM inside the Calypso chip; when this boot ROM is enabled by
nIBOOT pin strapping on the board (like it is on Openmoko and FreeCalypso
hardware), the ARM core executes code from this boot ROM first upon power-up or
reset before jumping to external flash.  The tiny piece of code that is hard-
cast in this mask ROM acts as an unbricking aid: it gives a certain time window
during which the boot process can be interrupted and diverted if certain magic
characters are sent into either of Calypso's two UARTs by an external
development host, and if nothing is received on either UART during that time
window (as would be the case in normal usage of a Calypso phone or modem), the
boot ROM transfers control to the firmware image in the external flash.  The
end result is that the ARM core always runs code from outside of the Calypso
chip itself, either the firmware image in the flash or whatever code is fed by
an external development host to the boot ROM serially over a UART.
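
The decision made by the boot ROM can be modeled in a few lines of Python.
This is only a sketch of the behaviour described above: the window length,
the magic byte string and all names are placeholders invented for the
illustration, not values taken from the actual ROM code.

```python
# Hypothetical model of the boot ROM's decision.  The window length and
# the magic bytes are placeholders, NOT the real values from the ROM.
PLACEHOLDER_WINDOW_MS = 100
PLACEHOLDER_MAGIC = b"MAGIC"

def boot_rom_decide(uart_events, window_ms=PLACEHOLDER_WINDOW_MS):
    """Return where control goes after the interrupt-boot window.

    uart_events is a list of (time_ms, uart_index, data) tuples that
    describe bytes arriving on either of the two UARTs after reset.
    """
    for time_ms, uart_index, data in sorted(uart_events):
        if time_ms < window_ms and data == PLACEHOLDER_MAGIC:
            # A development host spoke up in time on this UART:
            # divert to serial code download.
            return ("serial_download", uart_index)
    # Nothing valid was heard on either UART: boot the flash image.
    return ("flash", None)
```

With no UART traffic at all, as in normal phone or modem usage, the function
returns ("flash", None); the same happens if the magic bytes arrive only after
the window has closed.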

There is also an internal RAM inside the Calypso from which the ARM can execute
code (512 KiB on the full Calypso version or 256 KiB on Calypso Lite silicon
used in some historical low-end phones); the primary purpose of this internal
RAM is to allow chosen sections of code to execute faster without the
performance penalty of the external memory bus, but it is volatile RAM, not ROM
or flash, hence it doesn't have any code in it until and unless loaded by the
firmware copying code from flash or via the serial boot protocol.

The DSP, in contrast, is very different: its core can never execute any
code from outside the chip, and has no access to the Calypso chip's external
memory bus at all.  Instead the only two memories accessible to the DSP are a
mask ROM and a fast internal RAM.  The DSP's dedicated mask ROM is 128 Kwords;
the DSP's RAM is 28 Kwords, out of which 8 Kwords constitute the so-called API
RAM which is accessible to both ARM and DSP cores.  (The C54x DSP addresses
memory by words instead of bytes, hence the memory sizes are given in Kwords
instead of KiB.)
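
For readers more used to byte counts, the figures above can be converted as
follows.  The sketch assumes 16-bit words, as on TI's C54x family; the
constant names are made up for this illustration.

```python
BYTES_PER_WORD = 2          # assumption: C54x words are 16 bits wide

def kwords_to_kib(kwords):
    """Convert a size in Kwords (1024 words) to KiB (1024 bytes)."""
    return kwords * BYTES_PER_WORD

DSP_ROM_KWORDS = 128        # mask ROM, from the text
DSP_RAM_KWORDS = 28         # total DSP RAM, from the text
API_RAM_KWORDS = 8          # part of the RAM shared with the ARM

sizes = [
    ("DSP ROM", DSP_ROM_KWORDS),
    ("DSP RAM", DSP_RAM_KWORDS),
    ("API RAM", API_RAM_KWORDS),
    ("DSP-private RAM", DSP_RAM_KWORDS - API_RAM_KWORDS),
]
for label, kwords in sizes:
    print(f"{label}: {kwords} Kwords = {kwords_to_kib(kwords)} KiB")
```

Under that assumption the ROM is 256 KiB and the whole RAM is 56 KiB, of
which 16 KiB is the API RAM.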

The main bulk of the DSP's operating program is already hard-cast in the silicon
in the 128 Kword mask ROM.  The DSP ROM code is structured in such a way that
any part of it can be overridden by downloadable patch codes which get loaded
somewhere in the DSP's 28 Kword RAM, but because the RAM is significantly
smaller than the ROM, downloadable DSP code cannot replace the entirety of the
ROM code - instead the code needs to be patched very selectively only where
necessary to fix a bug that was discovered after the silicon was made or to
extend the DSP functionality with a new feature.

The DSP ROM code in the Calypso silicon we are using has been successfully read
out, but it is only the executable binary code and data - we never found a copy
of the source for this DSP ROM code.  And even if we had this source, we would
not be able to casually modify and recompile it without spending millions of
dollars to fab a new chip revision with a modified mask ROM.  Having this source
would allow us to develop our own DSP patch codes and to understand and maintain
the existing ones, hence we need to make an effort to convince TI to release
the source for the DSP ROM if they have it in their archives, but if no
surviving copy of this source exists anywhere in the world, the fallback plan
would be to reverse-engineer the DSP ROM code by disassembly.  The latter plan
has not been pursued yet because of the very high labor cost it would involve.

It is possible to run the Calypso DSP without any patches, i.e., have it run
only the code that is already in the mask ROM.  Our competitor OsmocomBB
operates in this manner, and we have also built and run modified versions of
our TCS211-based FreeCalypso firmware with DSP patch loading disabled as an
experiment.  However, all ARM-side firmwares that have been officially released
by TI for production use including our TCS211-20070608 golden reference do apply
downloadable patches to the DSP, and are designed to run with this patched DSP;
running them with DSP patching disabled results in unstable operation.

DSP patch codes that are included in ARM-side Calypso firmwares take the form
of const char arrays initialized with hex bytes; these C source files with hex
char arrays inside were apparently produced from C54x COFF files with a tool
called coff2c, but we never got any of those COFF files or whatever source (C
or assembly) they were built from.  At the present time in the FreeCalypso
family of projects we use the DSP patch codes (hex char arrays) which we got
with our copy of TCS211 from 20070608, and we treat the entire DSP block (the
combination of mask ROM plus patches) as a functional black box.
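
To give an idea of what such a hex char array looks like, here is a small
Python sketch that renders arbitrary bytes in that general form.  It is not
coff2c and does not reproduce its exact output layout, which we can only
guess at; the function name, the array name and the sample bytes are all
hypothetical.

```python
def bytes_to_c_array(name, data, per_line=8):
    """Render data as a C 'const char' array initialized with hex bytes."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("    " + ", ".join(f"0x{b:02X}" for b in chunk) + ",")
    body = "\n".join(lines)
    return f"const char {name}[{len(data)}] = {{\n{body}\n}};\n"

# Sample bytes are arbitrary, not a real DSP patch.
print(bytes_to_c_array("example_patch", bytes([0x12, 0xAB, 0x00])))
```

The ARM-side firmware only needs the resulting array as opaque data to be
downloaded into DSP RAM, which is why the project can use the patches without
having their source.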

Having to treat the DSP as a black box is certainly a major shortcoming of our
FreeCalypso solution.  However, I (Mother Mychaela) would much rather have a
phone or modem in which only the DSP is a black box while I get to maintain all
of the upper layers with full freedom, as opposed to the status quo alternative
of a very high-level black box with FOTA backdoors.  Unlike the ubiquitous
high-level black boxes from the likes of Qualcomm, the DSP in the Calypso cannot
be backdoored: it has no access to the ARM address space, thus no access to the
flash (cannot surreptitiously modify the firmware) and no access to any of the
higher-level radio protocol state maintained by the ARM; all it can do is
modulate and demodulate bursts and run voice codecs _as commanded by the ARM_.
Furthermore, the DSP has no access to the Calypso chip's TPU (Time Processing
Unit, the block that controls board-level RF hardware) and thus has no direct
control over any of the RF hardware: it cannot initiate radio transmission or
even reception on its own; instead the ARM firmware has to configure the RF
hardware via the TPU for each and every Rx or Tx time window.

Finally, if anyone is truly paranoid about the possibility of backdoors in the
DSP, the DSP ROM code has been read out - you are welcome to hire a professional
reverser of your choice to disassemble and audit it as thoroughly as you like.
This code is unchangeable by virtue of being hard-cast in a mask ROM in the
silicon.

The rest of this document covers the firmware that runs on the ARM core; it
controls the DSP via its API RAM, a form of shared memory interface.

High-level structure of TCS211 firmware
=======================================

The code base that makes up TI's TCS211 firmware consists of 3 main divisions:
chipset software, Condat G23M (GSM and GPRS L23 protocol stacks, ACI and
optional handset UI layers) and GPF.  Let us look at them in turn:

chipsetsw division
------------------

In the original TCS211 delivery there was a top-level directory named chipsetsw
(chipset software), containing code that is specific to TI's chipsets in
particular and was never intended to run on any other hardware.  This code
division has been retained intact in our FreeCalypso Magnetite and Selenite
firmwares, taken in its entirety from our TCS211 golden reference, although we
have shortened the name: this code division now resides under src/cs in
Magnetite and Selenite.  Aside from a few bits of system glue, this chipsetsw
breaks down into two further subdivisions: the L1+drivers core and the SSA
division.

L1+drivers core
---------------

This division resides under chipsetsw/layer1 and chipsetsw/drivers/drv_core, or
under src/cs/layer1 and src/cs/drivers/drv_core in our version.  The most
important piece here is L1 (GSM Layer 1): this code drives the DSP and the RF
hardware, and thereby makes the Calypso function as a GSM MS (mobile station)
and not merely as a general purpose microprocessor platform.  This code can be
considered to be the most important part of the entire firmware.

At one time TI had a so-called standalone L1 configuration, selected by the
OP_L1_STANDALONE C preprocessor symbol.  We don't have the bits that are needed
to build this configuration (they were probably never released outside of TI at
all), but it appears that this fw build configuration consisted of just Nucleus,
L1, the drivers under drv_core, the OSL and OSX parts of GPF without the rest,
and some stubs for the few higher-level functions that are intertied with L1.

The drivers under chipsetsw/drivers are divided into drv_core and drv_app: the
former are the most essential or fundamental ones, used by L1 and/or needed for
the OP_L1_STANDALONE config; the latter belong to the higher-level SSA division
described below.

SSA division
------------

TI had a group called System Software and Applications (SSA), and they supplied
those parts of the firmware that are neither L1+drv_core nor Condat G23M.  The
more interesting pieces here include the flash file system (FFS), the debug
trace facility (RVT), the Enhanced Test Mode (ETM) facility that allows
external development and production tools to poke at the firmware, RiViera Audio
Service (playing various beeps and ringtones through the DSP, a front-end to L1
audio functions), LCD and keypad drivers for Calypso-based handsets, and various
supportive functions implemented via the Iota ABB: switch-on and switch-off
logic, battery monitoring and charging, backlight LED control.

All firmware components in the SSA division are built on top of a framework
called RiViera - more will be said about it later.  Everything under
chipsetsw/drivers/drv_app, chipsetsw/riviera and chipsetsw/services (or under
src/cs/drivers/drv_app, src/cs/riviera and src/cs/services in our version)
belongs to the SSA realm.

Condat G23M division
--------------------

At the beginning of TI's involvement in the GSM baseband chipset business, they
only developed and maintained their own L1 code, which eventually grew into the
larger chipsetsw division described above, while the rest of the protocol stack
(which is hardware-independent) was licensed from another company called Condat.
Later Condat as a company was fully acquired by TI, and the once-customer of
this code became its owner.  The name of TI/Condat's implementation of GSM
layers 2&3 for the MS side is G23M, and it forms its own major division of the
overall fw architecture.

The overall Condat code realm can be further subdivided into GSM and GPRS L23
protocol stacks, the Application Control Interface (ACI) which includes the AT
command interpreter (ATI), and additional phone UI layers which are only
included in handset but not modem firmwares.

We don't know exactly how TI maintained this software internally: given that it
is mostly hardware-independent aside from integration details and some minor
features which may be present on one hw platform but not on another, it would
have made the most sense for TI to maintain a single internal mainline common
to both Calypso and LoCosto, and then integrate the code from this mainline into
chipset-specific customer releases.  We have no way of knowing if TI indeed
followed this approach or not, but when we took the version of G23M from the
TCS3.2 source for the LoCosto chipset and grafted it onto the chipsetsw
foundation from TCS211 for the Calypso to produce our TCS2/TCS3 hybrid, the
integration went surprisingly smoothly.  The full-source version of G23M which
we took from TCS3/LoCosto is newer than the binary-only version featured in the
world's last surviving copy of TCS211 from Openmoko.

GPF island of stability
-----------------------

Underlying the G23M protocol stack is a special layer called GPF, which was
originally Condat's Generic Protocol stack Framework.  Apparently Condat were
in the business of developing and maintaining a whole bunch of protocol stacks:
GSM MS side, GSM network side, TETRA and who knows what else.  GPF was their
common underpinning for all of their protocol stack projects, which ran on top
of many different OS environments: Nucleus, pSOS, VxWorks, Unix/Linux, Win32
and who knows what else.

In the case of TI/FreeCalypso GSM fw, both the protocol stack and the underlying
OS environment are fixed: GSM and Nucleus, respectively.  But GPF is still a
critically important layer in the firmware architecture: in addition to serving
as the glue between the G23M stack and Nucleus, it provides some important
support infrastructure for the protocol stack.

However, what makes GPF very special is the way in which it relates to the rest
of the firmware architecture.  GPF remained common and unchanged across TI's
many different projects, and it is so independent from the rest of the firmware
and its build configuration that TI were able to make company-wide GPF library
builds and then plop them into multiple fw projects which used them as
configuration-independent prebuilt libraries.  All TI firmware (semi-)sources
we've got use GPF in prebuilt library form and are not set up to recompile any
part of it from source.

Our FC Magnetite firmware uses the original binary libs from TCS211-Openmoko
for its GPF component, but for FC Selenite the project requirement is to be
completely blob-free, hence we had to reconstruct the source for GPF.  The
original source for most parts of GPF was found between TCS3.2 from Peek/FGW
and TCS211 from OM (the former had the source for the core "frame" modules and
the latter had the source for misc and tst), but we never got the source for the
OSL and OSX components, hence we had to reconstruct them from disassembly.  OSL
is the glue layer between GPF and Nucleus, OSX is the glue layer between GPF
and L1.

Firmware boot process
=====================

As already mentioned earlier, the Calypso chip itself includes an ARM boot ROM
in the silicon that serves as an unbricking aid: it provides a certain time
window during which the boot process can be interrupted and diverted if certain
magic characters are sent into either of Calypso's two UARTs by an external
host, and if nothing is received on either UART during that time window, the
boot ROM transfers control to the firmware image in the external flash.  As we
understand it, Calypso was TI's first DBB (digital baseband processor) chip to
include this boot ROM, and their previous DBB chips did not have such: they
would always execute code directly from external flash immediately out of reset.

TI's TCS211 and earlier firmwares are structured in such a way that they boot
and run exactly the same way whether the Calypso boot ROM is present and
enabled, present but disabled, or not present at all.  They put magic constant
0x00000001 in the 32-bit word at flash address 0x2000, which tells the Calypso
boot ROM (if it is present and enabled) to boot the flash fw image in legacy
mode: after providing the unbricking time window, the boot ROM moves itself out
of the way (sets two bits in the FFFF:FB10 register which tell the chip to unmap
the boot ROM and to map external memory at address 0) and induces a watchdog
reset, causing the chip to re-execute the reset vector, this time directly out
of external flash - thus the firmware boots as if the boot ROM weren't there,
but the ROM's unbricking function is retained.
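
A host-side tool could check a firmware image for this legacy-mode marker
roughly as shown below.  The address and the constant come from the
description above; the function name is made up, and the sketch assumes that
the image file starts at flash address 0 and that the word is stored in
little-endian byte order.

```python
import struct

LEGACY_BOOT_FLAG_ADDR = 0x2000      # flash address, from the text
LEGACY_BOOT_MAGIC = 0x00000001      # magic constant, from the text

def is_legacy_boot_image(image):
    """Check the 32-bit word at 0x2000 of an image that starts at 0.

    Assumption: the word is stored little-endian.
    """
    if len(image) < LEGACY_BOOT_FLAG_ADDR + 4:
        return False
    (word,) = struct.unpack_from("<I", image, LEGACY_BOOT_FLAG_ADDR)
    return word == LEGACY_BOOT_MAGIC
```

An image that is too short to contain the word is reported as not being a
legacy-mode image rather than raising an error.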

In order to make it easier to load new firmware images during development on
pre-Calypso platforms which didn't have a boot ROM, TI had developed a flash-
resident bootloader stage and included it in their fw architecture.  This
bootloader stage is placed at the beginning of the flash at the reset vector,
and the rest of the firmware begins at an erase unit boundary.  The bootloader
stage executes first, and before it jumps to the main firmware entry point
(_INT_Initialize) for normal boot, it offers an opportunity for the boot process
to be interrupted and diverted if an external host sends certain magic command
packets into either of the two UARTs during the allotted time window.  If the
external host does interrupt and divert the boot process in this manner, it can
feed a code image to the bootloader to be written somewhere in target RAM, and
then command the bootloader to jump to it.  It is exactly the same functionality
(though with different serial protocol specifics) as implemented in the Calypso
boot ROM.  The ROM version is obviously superior because it is unbrickable, but
the flash-resident, built-with-firmware version is what TI used before they
came up with the idea of the boot ROM for the Calypso.

When the boot-ROM-equipped Calypso came along, TI kept the flash-resident
bootloader in the firmware: it does no harm aside from adding a little bit of
delay to the boot process, it does not conflict with the ROM bootloader as the
two speak different serial protocols and respond to different interrupt-boot
sequences, and it allowed TI to keep the same firmware architecture for
platforms with and without a boot ROM.  However, in our FreeCalypso firmwares
starting with Magnetite we have removed this extra bootloader stage for the
following reasons:

* It is not useful to us on any of our hardware targets: on those devices that
  have the Calypso boot ROM enabled, we use that boot ROM and get full
  unbrickability, whereas on Mot C1xx phones we have to work with Mot/Compal's
  own different bootloader and serial protocol at least initially, hence it
  makes the most sense to stick with the same after the conversion to
  FreeCalypso as well.

* As delivered by TI with their full production TCS211 fw releases, their
  firmware-resident bootloader works as intended only on hw platforms with
  13 MHz VCXOs like the original D-Sample (Clara RF), and is broken on platforms
  like Rita RF (the only RF chip for which we have driver code!) with 26 MHz
  VCXOs: there is no conditionally-compiled code anywhere in the bootloader
  code path to set the VCLKOUT_DIV2 bit in the CNTL_CLK register on 26 MHz
  platforms, thus the UARTs are fed with 26 MHz instead of the standard 13 MHz
  clock expected in normal operation, and the intended baud rate of 115200 bps
  turns into 230400.  Because 230400 bps is a baud rate which Calypso UARTs
  *cannot* produce in normal GSM operation (when the peripheral clock network
  runs at the expected 13 MHz), tools that are designed to talk to Calypso GSM
  devices are typically not designed to support this baud rate.  In particular
  for CP2102 USB-serial adapters, the precedent established by the factory
  CP2102 EEPROM programming in the Pirelli DP-L10 phone is that the baud rate
  entry for 230400 bps is replaced with 203125 bps, which is a valid baud rate
  for Calypso UARTs running at 13 MHz.

* We have no source for TI's firmware-resident bootloader, only linkable binary
  objects that came with our world's last surviving copy of TCS211, which are
  incompatible with our goal of blob-free firmware.
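
The baud rate arithmetic in the second point above can be checked with a few
lines of Python.  The sketch assumes a conventional UART that divides its
input clock by 16 times an integer divisor; that formula and the function
names are assumptions made for the illustration.

```python
# Assumption: baud = clock / (16 * divisor) with an integer divisor.

def uart_baud(clock_hz, divisor):
    """Baud rate produced by a given UART input clock and divisor."""
    return clock_hz / (16 * divisor)

def rounded_divisor(clock_hz, target_baud):
    """Integer divisor obtained by rounding the ideal ratio."""
    return max(1, round(clock_hz / (16 * target_baud)))

def error_percent(actual, target):
    """Signed deviation of the actual baud rate from the target."""
    return 100.0 * (actual - target) / target

div = rounded_divisor(13_000_000, 115200)   # divisor chosen for 13 MHz
normal = uart_baud(13_000_000, div)         # UART fed with 13 MHz
doubled = uart_baud(26_000_000, div)        # same divisor, undivided 26 MHz
best_230400 = uart_baud(13_000_000, rounded_divisor(13_000_000, 230400))

print(div, round(normal), round(doubled), round(best_230400))
# prints: 7 116071 232143 203125
```

With a divisor of 7 the 13 MHz clock gives about 116071 bps, within 1% of
115200, and the same divisor on an undivided 26 MHz clock gives about 232143
bps, within 1% of 230400.  The closest a 13 MHz UART can get to 230400 under
this formula is 203125 bps (divisor 4), which is almost 12% off and explains
why 203125 rather than 230400 appears in Calypso-oriented tools.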

Because this extra bootloader stage is ultimately unnecessary in our
environment, the deblobbing goal was more easily accomplished by removing it
altogether instead of expending effort on a blob-free replacement.  Because I
wasn't comfortable with modifying TMS470 assembly code and linker script magic,
the removal of the bootloader was accomplished by stubbing out its C body with
an empty function.  In the gcc-built FC Selenite version it is removed
completely, without any leftover stubs.

Finally, it needs to be noted for the sake of completeness that Compal's
bootloader used on Mot C1xx phones is a modified version based on TI's original
bootloader.  However, this factoid matters only for historians and genealogists;
for all practical purposes it is an unrelated animal, as Mot/Compal's serial
protocol for interrupting and diverting the boot process is their own and bears
no resemblance to TI's version.  And yes, Mot/Compal's version does set the
VCLKOUT_DIV2 bit in the CNTL_CLK register to adjust for the 26 MHz clock input
as its first order of business; it was probably the very first issue they had
to fix.

When we build FC Magnetite or FC Selenite TMS470 firmware for Mot C1xx targets,
we use dd to strip off the first 64 KiB of the image produced by TI's linker
(the part where TI's bootloader resides, be it intact or stubbed out) and flash
the remaining image (the main body of the fw) starting at flash address 0x10000.
In the gcc-built Selenite version we natively link images that are designed to
be flashed at 0x10000 without any dirty hacks.  Common to all FC firmwares for
C1xx targets, the bootloader image we put at 0 (in the brickable flash sector)
is a modified version based on one of Mot/Compal's originals: we have binary-
patched it to redirect the exception vectors from Mot/Compal's 0x20A0 to 0x10000
and to move the main fw entry point from Mot/Compal's 0x20F8 to TI's 0x10058.
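
The image surgery for TMS470-built C1xx images amounts to the following,
shown here in Python rather than with dd.  The 64 KiB size and the addresses
are from the paragraph above; the function and constant names are
hypothetical.

```python
BOOTLOADER_REGION = 0x10000     # first 64 KiB, where TI's bootloader lives
MAIN_FW_ENTRY = 0x10058         # TI's main fw entry point, from the text

def strip_bootloader(full_image):
    """Drop the first 64 KiB of an image linked to start at address 0."""
    if len(full_image) <= BOOTLOADER_REGION:
        raise ValueError("image is not larger than the bootloader region")
    return full_image[BOOTLOADER_REGION:]

def offset_in_stripped_image(flash_addr):
    """File offset of a flash address inside the stripped image."""
    if flash_addr < BOOTLOADER_REGION:
        raise ValueError("address lies in the stripped-off region")
    return flash_addr - BOOTLOADER_REGION
```

The stripped image is then flashed starting at 0x10000, so the entry point at
flash address 0x10058 sits at offset 0x58 in the stripped file.  This is
roughly what a dd invocation with a 64 KiB block size and a skip count of one
block does.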

None of this muckery applies to our own FreeCalypso hardware or to our
predecessor Openmoko's hw: on these good hw targets the complete fw image as
built is flashed at 0, and there is no possibility of bricking because we use
the boot ROM to gain access irrespective of what's in the flash.

Main firmware entry point
-------------------------

With the bootloader distraction out of the way, the main fw entry point is at
the _INT_Initialize symbol in the int.s assembly module, located in
src/cs/system/main/int.s in Magnetite and Selenite.  The functional equivalent
for the gcc environment in Selenite is in src/cs/system/main/gcc/bootentry.S.
This assembly code performs some basic hardware initialization, sets up
sensible memory timings for the boot path phase before DPLL setup, copies the
IRAM code (the code that is intended to execute out of the fast internal RAM)
from flash to where it needs to be, zeros both IRAM and XRAM .bss regions, does
TI's cinit/auto_init business for initialized data in the TMS470 environment
(Selenite gcc version copies .data from flash to RAM instead), sets up the
system, IRQ, FIQ and exception stacks, does some assembly initialization for
Nucleus and finally jumps to Nucleus' C entry point INC_Initialize().
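
The memory-related part of this sequence can be sketched as a toy model in
Python, with flash, IRAM and XRAM represented as byte buffers.  The layout
dictionary and all of its key names are invented for the illustration; the
real code is ARM assembly driven by linker-defined symbols, and hardware
init and stack setup are not modeled here.  The .data copy follows the
Selenite gcc variant.

```python
def copy_region(dst, dst_off, src, src_off, length):
    """Copy length bytes from src[src_off:] to dst[dst_off:]."""
    dst[dst_off:dst_off + length] = src[src_off:src_off + length]

def zero_region(mem, off, length):
    """Fill length bytes of mem starting at off with zeros."""
    mem[off:off + length] = bytes(length)

def early_init(flash, iram, xram, lay):
    """Toy model of the memory setup done before Nucleus starts."""
    # 1. Copy the code meant to run from fast internal RAM out of flash.
    copy_region(iram, lay["iram_code_dst"], flash,
                lay["iram_code_src"], lay["iram_code_len"])
    # 2. Zero the .bss regions in both IRAM and XRAM.
    zero_region(iram, lay["iram_bss_off"], lay["iram_bss_len"])
    zero_region(xram, lay["xram_bss_off"], lay["xram_bss_len"])
    # 3. Copy initialized data from flash to RAM (gcc-style .data copy).
    copy_region(xram, lay["data_dst"], flash,
                lay["data_src"], lay["data_len"])
    # 4. Hand over to the Nucleus C entry point.
    return "INC_Initialize"
```

The returned string merely stands in for the final jump to INC_Initialize().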

Further initialization takes place in the Init_Target() and Init_Drivers()
functions called from Application_Initialize(), which is the last function
called by INC_Initialize() before starting the Nucleus task scheduler.

Nucleus environment
===================

Like all classic TI firmwares, ours is based on the Nucleus PLUS RTOS.  Just
like TI's original code on which we are based, we use only a small subset of
the functionality provided by Nucleus - but because the latter is a library,
the pieces we don't use simply don't get pulled into the link.  The main
function we get out of Nucleus is the scheduling of threads, or tasks as
Nucleus calls them.

Aside from pre-stack-setup assembly init code and ARM exception handlers, every
piece of code in the firmware executes in one of the following contexts:

* Application_Initialize(): this function and everything called from it execute
  just before Nucleus' thread scheduler starts; at this point interrupts are
  disabled at the ARM7 core level (in the CPSR) and must not be enabled; the
  stack is Nucleus' "system stack" which is also used by the scheduler and LISRs
  as explained below.

* Regular threads or tasks: once Application_Initialize() finishes, all code
  with the exception of interrupt handlers (LISRs and HISRs as explained below)
  runs in the context of some Nucleus task.  Whenever you are trying to debug
  or simply understand some piece of code in the firmware, the first question
  you should ask is "which task does this code execute in?".  Most functional
  components run in their own tasks, i.e., a given piece of code is only
  intended to run within the Nucleus task that belongs to the component in
  question.  On the other hand, some components are implemented as APIs,
  functions to be called from other components: these don't have their own task
  associated with them, and instead they run in the context of whatever task
  they were called from.  Some only get called from one task: for example, the
  "uartfax" driver API calls only get called from the protocol stack's UART
  entity, which is its own task.  Other component API functions like FFS and
  trace can get called from just about any task in the system.  Many components
  have both their own task and some API functions to be called from other tasks,
  and the API functions oftentimes post messages to the task to be worked on by
  the latter; the just-mentioned FFS and trace functions work in this manner.

  In our TCS211-mimicking Magnetite and Selenite firmwares every Nucleus task is
  created either through RiViera or through GPF, and not in any other way - see
  the description of RiViera and GPF below.

* LISRs (Low level Interrupt Service Routines): these are the interrupt handlers
  that run immediately when an ARM IRQ or FIQ comes in.  The code at the IRQ and
  FIQ vector entry points calls Nucleus' magic stack switching function
  (switches the CPU from IRQ/FIQ into SVC mode, saves the interrupted thread's
  registers on that thread's stack, and switches to the "system" stack) and
  then calls TI's IRQ dispatcher implemented in C.  The latter figures out
  which Calypso interrupt needs to be handled and calls the handler configured
  in the compiled-in table.  Nucleus' LISR registration framework is not used
  by the GSM fw, but these interrupt handlers should be viewed as LISRs
  nonetheless.

  There is one additional difference between canonical Nucleus and TI's version
  (we've replicated the latter): canonical Nucleus was designed to support
  nested LISRs, i.e., IRQs re-enabled in the magic stack switching function,
  but in TI's version which we follow this IRQ re-enabling is removed: each LISR
  runs with interrupts disabled and cannot be interrupted.  (The corner case of
  an FIQ interrupting an IRQ remains to be looked at more closely as bugs may be
  hiding there, but Calypso doesn't really use FIQ interrupts.)  There is really
  no need for LISR nesting in our GSM fw, as each LISR is very short: most LISRs
  do nothing more than trigger the corresponding HISR.

* HISRs (High level Interrupt Service Routines): these hold an intermediate
  place between LISRs and tasks, similar to softirqs in the Linux kernel.  A
  HISR can be activated by a LISR calling NU_Activate_HISR(), and when the LISR
  returns, the HISR will run before the interrupted task (or some higher
  priority task, see below) can resume.  HISRs run with CPU interrupts enabled,
  thus more interrupts can occur, with their LISRs executing and possibly
  triggering other HISRs.  All triggered HISRs must complete and thereby go
  "quiescent" before task scheduling resumes, i.e., all HISRs as a group have a
  higher scheduling priority than tasks.
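
The LISR/HISR split described above can be illustrated with a toy model in C.
This is only a sketch under stated assumptions: the table size, the interrupt
number and all toy_* names are hypothetical, and toy_activate_hisr() merely
stands in for NU_Activate_HISR() without reproducing Nucleus internals.

```c
#include <stddef.h>

#define NUM_IRQ 4	/* hypothetical; the real chip has more sources */

typedef void (*lisr_fn)(void);

static int frame_hisr_pending;	/* set when the toy HISR is activated */
static unsigned frames_handled;	/* work done in HISR context */

/* Toy stand-in for NU_Activate_HISR(): just mark the HISR as pending. */
static void toy_activate_hisr(void)
{
	frame_hisr_pending = 1;
}

/* A typical LISR is very short: it does nothing but trigger its HISR. */
static void frame_lisr(void)
{
	toy_activate_hisr();
}

/* Compiled-in handler table, indexed by interrupt number. */
static const lisr_fn irq_table[NUM_IRQ] = { NULL, frame_lisr, NULL, NULL };

/* Dispatcher: look up and call the handler for the given interrupt. */
void toy_irq_dispatch(unsigned irq_num)
{
	if (irq_num < NUM_IRQ && irq_table[irq_num])
		irq_table[irq_num]();
}

/* Run the HISR if it was activated; returns 1 if it ran, 0 otherwise.
 * In the real system this happens before any task is allowed to resume. */
int toy_run_hisrs(void)
{
	if (!frame_hisr_pending)
		return 0;
	frame_hisr_pending = 0;
	frames_handled++;
	return 1;
}

unsigned toy_frames_handled(void)
{
	return frames_handled;
}
```

The point of the model is the ordering: the LISR itself performs no real work,
and the work counter only advances once the HISR stage runs.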

Nucleus implements priority scheduling for tasks.  Tasks have their priority set
when they are created (through RiViera or GPF, see below), and a higher priority
task will run until it gets blocked waiting for something, at which time lower
priority tasks will run.  If a lower priority task sends a message to a higher
priority task, unblocking the latter which was waiting for incoming messages,
the lower priority task will effectively suspend itself immediately while the
higher priority task runs to process the message it was sent.

HISRs oftentimes post messages to their associated tasks as well; if one of
these messages unblocks a higher priority task, that unblocked task will run
upon the completion of the HISR instead of the original lower priority task
that was interrupted by the LISR that triggered the HISR.  Nucleus' scheduler
is fun!
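
The scheduling rule can be condensed into a few lines of C.  The sketch below
is a toy model, not Nucleus code: the struct, the function name and the
convention that a smaller number means a higher priority are all assumptions
of this example.

```c
#include <stddef.h>

struct toy_task {
	const char *name;
	int prio;	/* toy convention: smaller number = higher priority */
	int ready;	/* nonzero if the task is not blocked */
};

/* Return the ready task with the highest priority, or NULL if every
 * task is blocked waiting for something. */
const struct toy_task *toy_pick_next(const struct toy_task *tasks, size_t n)
{
	const struct toy_task *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!tasks[i].ready)
			continue;
		if (!best || tasks[i].prio < best->prio)
			best = &tasks[i];
	}
	return best;
}
```

When a message send or a HISR flips the ready flag of a blocked higher
priority task, the next pick returns that task rather than the one that was
running, which is exactly the preemption behaviour described above.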

RiViera and GPF
===============

RiViera and GPF are two parallel/independent/competing wrappers around or
layers above Nucleus.  GPF comes from Condat and is used by the G23M protocol
stack and indirectly by L1 (the peculiar way in which L1 ties in with the rest
of the firmware will be covered later), whereas RiViera is used by the fw
components from TI's SSA group: flash file system, debug trace, RiViera Audio
Service and so forth.

At some point in their post-Calypso TCS3.x program TI decided to eliminate
RiViera as an independent framework and to reimplement RiViera APIs (used by
peripheral but necessary code such as FFS, ETM, various drivers etc) over GPF.
This arrangement is used in the TCS3.2 LoCosto firmware from which we have
lifted our source replacements for much of the code that came as binary objects
in our reference TCS211 version.  However, our current Magnetite and Selenite
firmwares follow the architecture of TCS211, not that of TCS3.2, and because
the entire SSA division of the fw including the RiViera core came in full source
form in our copy of TCS211, it was only natural to keep this code and its
architecture.

Start-up process continued
==========================

As mentioned earlier, Nucleus calls the application's software init function
called Application_Initialize() after it initializes itself but before starting
the task scheduler.  This function in TCS211 is just the following:

Application_Initialize()
{
	Init_Target();
	Init_Drivers();
	Cust_Init_Layer1();
	Init_Serial_Flows();
	StartFrame();
	Init_Unmask_IT();
}

Cust_Init_Layer1() is in L1, StartFrame() is in GPF, and the remaining 4 init
functions live in the init.c module under src/cs/system/main.

The Init_Target() function finishes the hardware initialization that was
started by the assembly code at the firmware boot entry point (int.s): among
other things, it sets up the final memory timings that will be used by the
running fw and configures the Calypso DPLL which provides multiplied internal
clocks to both ARM and DSP cores.  On Calypso C035 silicon which is used on our
own FreeCalypso boards and on most of our pre-existing hw targets the DPLL and
the DSP run at 104 MHz and the ARM gets half of that, running at 52 MHz.
Init_Target() also calls AI_InitIOConfig(), the function that initializes
Calypso GPIO directions and initial outputs; both of these functions typically
need to be tweaked when adding support for a new Calypso board target.

The Init_Drivers() function is primarily responsible for initializing RiViera
and FFS, although it also does a bit of init related to ABB and SIM drivers.

I mentioned earlier that every Nucleus task in our firmware gets created and
started either through RiViera or through GPF.  All GPF tasks are created and
placed into the runnable state in the Application_Initialize() context: the work
is done by GPF init code in gpf/frame/frame.c, and the top level GPF init
function called from Application_Initialize() is StartFrame().  Thus when
Application_Initialize() finishes and the Nucleus thread scheduler starts
running for the first time, all GPF tasks are there to be scheduled.

There is a compiled-in table of all protocol stack entities and the tasks in
which they need to run; in TCS211 these GPF config bits live under
g23m/condat/frame/config for the GSM+GPRS configuration and under
g23m/condat/com/src/config for the GSM-only config without GPRS.  Canonically
each protocol stack entity runs in its own task, but sometimes two or more are
combined to run in the same task: for example, in the minimal GSM "voice only"
configuration (no CSD, fax or GPRS) CC, SMS and SS entities share the same task
named CM.  Unlike RiViera, GPF does not support dynamic starting and stopping
of tasks.

As each GPF task starts running (immediately upon entry into Nucleus' scheduling
loop as Application_Initialize() finishes), the pf_TaskEntry() function in
gpf/frame/frame.c is the first code it runs.  This function creates the queue
for messages to be sent to all entities running within the task in question,
calls each entity's pei_init() function (repeatedly until it succeeds: it will
fail until the other entities to which this entity needs to send messages have
created their message queues), and then falls into the main body of the task:
for all "regular" entities/tasks except L1, this main body consists of waiting
for messages (or signals or timeouts) to arrive on the queue and dispatching
each received message to the appropriate handler in the right entity.
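
The "retry init until it succeeds" step can be sketched as follows.  This is
not the actual pf_TaskEntry() code: the struct, the toy_* names and the return
value convention (0 for success) are assumptions made for this illustration,
and the example entity simply fails twice to mimic a peer whose message queue
does not exist yet.

```c
#include <stddef.h>

struct toy_entity {
	int (*init)(void);	/* stand-in for pei_init(); 0 means success */
};

/* Call each entity's init function repeatedly until it succeeds.
 * Returns the total number of init calls made, for demonstration. */
unsigned toy_init_entities(struct toy_entity *ents, size_t n)
{
	unsigned calls = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		do
			calls++;
		while (ents[i].init() != 0);
	}
	return calls;
}

/* Example entity: its init fails until its (imaginary) peer is ready. */
static int attempts;

int toy_example_init(void)
{
	return ++attempts < 3 ? -1 : 0;	/* fail twice, then succeed */
}
```

Only after this loop completes would the task fall into its main body of
waiting for messages and dispatching them.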

RiViera tasks get started in a different way.  The responsible code lives in
src/cs/system/main/create_RVtasks.c, and the create_tasks() function found in
that module is called by Init_Drivers() in the Application_Initialize() context.
But this function does not directly create and start every configured RiViera
task like StartFrame() does for GPF.  Instead it creates a special helper task
which will do this work once scheduled.  Thus at the completion of
Application_Initialize() and the beginning of scheduling the set of runnable
Nucleus tasks consists of all GPF ones plus the special RV starter task.  Once
the RV starter task gets scheduled, it will call rvm_start_swe() to launch
every configured RiViera SWE (SoftWare Entity), which in turn entails creating
the tasks in which these SWEs are to run.

Dynamic memory allocation
=========================

All dynamic memory allocation (i.e., all RAM usage beyond statically allocated
variables and buffers) is once again done either through RiViera or through GPF,
and in no other way.  Ultimately all areas of the physical RAM that will ever
be used by the fw in any way are allocated when the fw is compiled and linked:
the areas from which RiViera and GPF serve their dynamic memory allocations are
statically allocated as char arrays in the respective C modules and placed in
the appropriate IRAM or XRAM .bss section by the linker script; RiViera and GPF
then provide API functions that allocate memory dynamically from these
statically allocated large pools.

RiViera and GPF have entirely separate memory pools from which they serve their
respective clients, hence there is no possibility of one affecting the other.
Riviera's memory allocation scheme is very much like the classic malloc&free:
there is one large unstructured pool from which all allocations are made, one
can allocate a chunk of any size, free chunks are merged when physically
adjacent, and fragmentation is an issue: a memory allocation request may fail
even when there is enough memory available in total if it is too fragmented.

GPF's dynamic memory allocation facility is considerably more robust: while it
does maintain one or two (depending on configuration) memory pools of the
traditional "dynamic" kind (like malloc&free, susceptible to fragmentation),
most GPF memory allocation works on "partition" memory instead.  Here GPF
maintains 3 separate groups of pools: PRIM, TEST and DMEM; each allocation
request must specify the appropriate pool group and cannot affect the others.
Within each pool there is a fixed number of partitions of a fixed size: for
example, in TI's TCS211 GSM+GPRS configuration the PRIM pool group consists of
190 partitions of 60 bytes, 110 partitions of 128 bytes, 50 partitions of 632
bytes and 7 partitions of 1600 bytes.  An allocation request from a given pool
group (e.g., PRIM) can request any arbitrary size in bytes, but it gets rounded
up to the nearest partition size and allocated out of the respective pool.  If
no free partition is available, the requesting task is suspended until another
task frees one.  Because these partitions are used primarily for intertask
communication, if none are free, it can only mean (assuming that the firmware
functions correctly) that all partitions have been allocated and sent to some
queue for some task to work on, hence eventually they will get freed.
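
The rounding-up rule is easy to show in code.  The following is a minimal
sketch using the PRIM pool group numbers quoted above; the data structure and
the toy_* function names are hypothetical and do not reflect GPF's real
implementation.  One deliberate simplification: where real GPF would suspend
the requesting task until a partition is freed, this toy returns 0.

```c
#include <stddef.h>

struct toy_pool {
	unsigned part_size;	/* bytes per partition */
	unsigned total;		/* number of partitions in this pool */
	unsigned in_use;	/* partitions currently allocated */
};

/* PRIM pool group as quoted for TI's TCS211 GSM+GPRS configuration. */
static struct toy_pool prim_group[] = {
	{ 60, 190, 0 }, { 128, 110, 0 }, { 632, 50, 0 }, { 1600, 7, 0 },
};
#define PRIM_NPOOLS (sizeof prim_group / sizeof prim_group[0])

/* Allocate from the pool with the smallest partition size that fits.
 * Returns the partition size granted, or 0 if the request is larger
 * than the biggest partition or the chosen pool is exhausted. */
unsigned toy_alloc(unsigned size)
{
	size_t i;

	for (i = 0; i < PRIM_NPOOLS; i++) {
		if (size > prim_group[i].part_size)
			continue;
		if (prim_group[i].in_use >= prim_group[i].total)
			return 0;
		prim_group[i].in_use++;
		return prim_group[i].part_size;
	}
	return 0;
}

/* Return one partition of the given size to its pool. */
void toy_free(unsigned part_size)
{
	size_t i;

	for (i = 0; i < PRIM_NPOOLS; i++)
		if (prim_group[i].part_size == part_size &&
		    prim_group[i].in_use > 0)
			prim_group[i].in_use--;
}
```

For example, a request for 61 bytes is served from the 128-byte pool, and
because every partition in a pool has the same size, freeing one always makes
room for the next request of that class: there is nothing to fragment.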

This scheme implemented in GPF is extremely robust in the opinion of this
author, and the other purely "dynamic" scheme is used (in the case of GPF) only
for init-time allocations which are never freed, such as task stacks - hence
the GPF-based part of the firmware is not susceptible at all to the problem of
memory fragmentation.  But Riviera does suffer from this problem, and the
concern is more than just theoretical: one major user of Riviera-based dynamic
memory allocation is the trace facility (described in its own section below),
and my observation of the trace output from Pirelli's proprietary fw (which
appears to use the same architecture with separate Riviera and GPF) suggests
that after the fw has been running for a while, Riviera memory gets fragmented
to a point where many traces are being dropped.  Replacing Riviera's poor
dynamic memory allocation scheme with a GPF-like partition-based one is a to-do
item for our project.

Message-based intertask communication
=====================================

Even though all entities of the G23M protocol stack are linked together into
one monolithic fw image and there is nothing to stop them from calling each
other's functions and accessing each other's variables, they don't work that
way.  Instead all communication between entities is done through messages, just
as if they ran in separate address spaces or even on separate processors.
Buffers for this message exchange are allocated from a GPF partition pool: an
entity that needs to send a message to another entity allocates a buffer of the
needed size, fills it with the message to be sent, and posts it on the recipient
entity's message queue, all through GPF services.  The other entity simply
processes the stream of messages that arrives on its message queue, freeing each
message (returning the buffer to the partition pool it came from) as it is
processed.
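
The allocate/fill/post/process/free cycle can be sketched like this.  It is a
toy model only: the message layout, the queue structure and the toy_* names
are hypothetical, and malloc() and free() stand in for GPF's partition
allocation services.

```c
#include <stdlib.h>

#define TOY_QLEN 8

struct toy_msg {
	int opcode;
	int param;
};

struct toy_queue {
	struct toy_msg *slot[TOY_QLEN];
	unsigned head, tail, count;
};

/* Sender side: allocate a buffer, fill it in, post it on the queue.
 * Returns 0 on success, -1 if the queue is full or memory ran out. */
int toy_send(struct toy_queue *q, int opcode, int param)
{
	struct toy_msg *m;

	if (q->count == TOY_QLEN)
		return -1;
	m = malloc(sizeof *m);	/* stand-in for a partition allocation */
	if (!m)
		return -1;
	m->opcode = opcode;
	m->param = param;
	q->slot[q->tail] = m;
	q->tail = (q->tail + 1) % TOY_QLEN;
	q->count++;
	return 0;
}

/* Receiver side: take the next message, process it, free the buffer.
 * Returns 1 if a message was processed, 0 if the queue was empty. */
int toy_receive(struct toy_queue *q, int *sum)
{
	struct toy_msg *m;

	if (q->count == 0)
		return 0;
	m = q->slot[q->head];
	q->head = (q->head + 1) % TOY_QLEN;
	q->count--;
	*sum += m->param;	/* "processing" in this toy */
	free(m);		/* buffer goes back where it came from */
	return 1;
}
```

Note that ownership of the buffer passes from sender to receiver with the
message: the sender never touches it again after posting.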

Riviera-based tasks use a similar mechanism: unlike G23M protocol stack
entities, most Riviera-based functional modules provide APIs that are called as
functions from other tasks, but these API functions typically allocate a memory
buffer (through Riviera), fill it with the call parameters, and post it to the
associated task's message queue (also in the Riviera land) to be worked on.
Once the worker task gets the job done, it will either call a callback function
or post a response message back to the requestor - the latter option is only
possible if the requesting entity is also Riviera-based.

A closer look at GPF
====================

There are certain sublayers within GPF which need to be pointed out.  The 3
major subdivisions within GPF are:

* The meaty core of GPF: this part is the code under src/gpf/frame in our
  Selenite GPF reconstruction, originating from gpf/FRAME in the TCS3.2 source
  from Peek/FGW.  It appears that this part was originally intended to be both
  project-independent (same for GSM, TETRA etc) and OS-independent (same for
  Nucleus, pSOS, VxWorks etc).  This is the part of GPF that matters for the
  G23M stack: all APIs called by PS entities are implemented here, and so are
  all other PS-facing functions such as startup.  (PS = protocol stack)

* OS adaptation layer (OSL): this is the part of GPF that adapts it to a given
  underlying OS, in our case Nucleus.

* Test interface: see the code under gpf/tst in TCS211 from Openmoko or in
  FC Selenite.  This part handles the trace output from all entities that run
  under GPF and the mechanism for sending external debug commands to the GPF+PS
  subsystem.

GPF was a difficult step in our GSM firmware deblobbing process because no
complete source for it could be found anywhere: apparently GPF was so stable
and so independent of firmware particulars (Calypso or LoCosto, GSM only or
GSM+GPRS, modem or complete phone with UI etc) that it appears to have been
used and distributed as prebuilt binary libraries even inside TI.  All TI fw
(semi-)sources we've got use GPF in prebuilt library form and are not set up to
recompile any part of it from source.  (They had to include all GPF header
files though, as most of them are included by G23M C modules, and it would be
too much hassle to figure out which ones are or aren't needed, hence all were
included.)

Fortunately though, we were able to find the sources for most parts of GPF:

* The LoCosto source in TCS3.2_N5.24_M18_V1.11_M23BTH_PSL1_src.zip features the
  source for the "core" part of GPF under gpf/FRAME - these sources aren't
  actually used by that fw's build system (it only uses the prebuilt binary
  libs for GPF), but they are there.

* Our TCS211 semi-src doesn't have any sources for the core part of GPF, but
  instead it features the source for the test interface and some "misc" parts:
  under gpf/MISC and gpf/tst in that source tree - these sources are not present
  in the LoCosto version from Peek.

The GPF frame, misc and tst sources we have found have been verified to match
the binary objects that came with TCS211 from OM: they can be compiled into a
bit-for-bit match.  However, one critical piece was still missing: the OS
adaptation layer.  It appears that the GPF core (vsi_??? modules) and OSL
(os_??? modules) were maintained and built together, ending up together in
frame_<blah>.lib files in the binary form used to build firmwares; however,
the source for the "frame" part in the Peek find contained vsi_*.c and other
modules, but none of os_*.c.

Our FC Magnetite firmware uses the original binary libs from TCS211-Openmoko
for its GPF component, but for FC Selenite the project requirement is to be
completely blob-free, hence we had to reconstruct the source for the OSL part
of GPF from disassembly.  This work was originally done in 2014 in the context
of our first attempt at gcc-built blob-free GSM fw (FC Citrine, now deemed to
be a dead end and fully retired); this reconstruction was then dug up and
adapted for Selenite in 2018.  As of this writing, this reconstruction is still
not 100% complete (one complex error handling function is stubbed out) and not
yet trusted to be fully correct, thus our fully deblobbed Selenite firmware is
currently considered experimental; our current production fw is still Magnetite
with blobs for GPF.

A closer look at L1
===================

The L1 code is remarkable in how little intertie it has with the rest of the
firmware it is linked into.  It is almost entirely self-contained, expecting
only 4 functions to be provided by the underlying OS environment:

os_alloc_sig	-- allocate message buffer
os_free_sig	-- free message buffer
os_send_sig	-- send message to upper layers
os_receive_sig	-- receive message from upper layers

It helps to remember that at the beginning of TI's involvement in the GSM
baseband chipset business, L1 was the only thing they "owned", while Condat,
the maintainers of the higher level protocol stack, was a separate company.
TI's "turnkey" solution must have consisted of their own L1 code plus G23M code
(including GPF etc) licensed from Condat, but I'm guessing that TI probably
wanted to retain the ability to sell their chips with their L1 without being
entangled by Condat: let the customer use their own GSM L23 stack, or perhaps
work out their own independent licensing arrangements with Condat.  I'm
guessing that L1 was maintained as its own highly independent and at least
conceptually portable entity for these reasons.

The way in which L1 is intertied into the rest of the fw is the same in all TI
production firmwares we have seen, including both our TCS211 reference and the
TCS3.2 LoCosto version.  There is a module called OSX, which is an extremely
thin adaptation layer that implements the APIs expected by L1 in terms of GPF.
Furthermore, this OSX layer provides header file isolation: the only "outside"
(non-L1) header included by L1 is cust_os.h, and it defines the necessary
interface to OSX *without* including any other headers (no GPF headers in
particular), using only the C language's native types.  Apart from this
cust_os.h header, the entire OSX layer is implemented in one C module (osx.c,
which we had to reconstruct from osx.obj as the source was missing - but it's
very simple) which does include some GPF headers and implements the OSX API in
terms of GPF services.  Thus in both TI's production firmwares and our own ones,
L1 does sit on top of GPF, but very indirectly.
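
The header isolation idea can be shown with a small sketch.  None of the names
or signatures below are taken from cust_os.h or osx.c: they are hypothetical,
and malloc() and free() merely stand in for the underlying framework services.

```c
#include <stdlib.h>

/* What an isolating header might declare: native C types only, so the
 * client code never needs to see any framework headers. */
void *toy_osx_alloc_msg(unsigned long size);
void toy_osx_free_msg(void *msg);
unsigned long toy_osx_outstanding(void);

/* The single adaptation module: only this part knows what sits
 * underneath (here just the C library). */
static unsigned long toy_outstanding;

void *toy_osx_alloc_msg(unsigned long size)
{
	void *p = malloc(size);

	if (p)
		toy_outstanding++;
	return p;
}

void toy_osx_free_msg(void *msg)
{
	if (msg) {
		free(msg);
		toy_outstanding--;
	}
}

/* Number of buffers allocated and not yet freed. */
unsigned long toy_osx_outstanding(void)
{
	return toy_outstanding;
}
```

With such an arrangement the layer underneath can be swapped by rewriting
only the adaptation module, leaving the client code untouched.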

More specifically, the "production" version of OSX implements its API in terms
of *high-level* GPF functions, i.e., VSI.  However, they also had an interesting
OP_L1_STANDALONE configuration which omitted not only all of G23M, but also the
core of GPF and possibly the Riviera environment as well.  We don't have a way
to recreate this configuration exactly as it existed inside TI because we don't
have the source bits specific to this configuration, but we do have a little
bit of insight into how it worked.

It appears that TI's OP_L1_STANDALONE build used a special "gutted" version of
GPF in which the "meaty core" (VSI etc) was removed.  The OS layer (os_???
modules implementing os_*() functions) that interfaces to Nucleus was kept, and
so was OSX used by L1 - but this time the OSX API functions were implemented in
terms of os_*() ones (low-level wrappers around Nucleus) instead of the higher-
level VSI APIs provided by the "meaty core" of GPF.  It is purely a guess on my
part, but perhaps this hack was also done in the days before TI's acquisition
of Condat, and by omitting the "meaty core" of GPF, TI could claim that their
OP_L1_STANDALONE configuration did not contain any of Condat's "intellectual
property".

Run-time structure of L1
========================

L1 consists of two major parts: L1S and L1A.  L1S is the synchronous part where
the most time-critical functions are performed; it runs as a Nucleus HISR.  The
hardware in the Calypso generates an interrupt on every TDMA frame (4.615 ms,
or more precisely 60/13 ms), and the LISR handler for this interrupt triggers
the L1S HISR.  L1S communicates with L1A through a shared memory data structure,
and also sometimes allocates message buffers and posts them to L1A's incoming
message queue (both via OSX API functions, i.e., via GPF in disguise).
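
The relationship between the exact 60/13 ms frame period and the rounded
4.615 ms figure can be checked with a little arithmetic.  The helper below is
a hypothetical illustration and is not part of L1.

```c
/* Elapsed time in microseconds for a given number of TDMA frames, using
 * the exact frame period of 60/13 ms (integer arithmetic, truncating).
 * The multiplication can overflow for very large frame counts on
 * platforms where unsigned long is 32 bits. */
unsigned long tdma_frames_to_us(unsigned long nframes)
{
	return nframes * 60000UL / 13UL;
}
```

One frame comes out as 4615 microseconds after truncation, while 13 frames
take exactly 60 ms and 26 frames exactly 120 ms, so the L1S HISR gets
triggered about 216.7 times per second.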

L1A runs as a regular task under Nucleus, and includes a blocking call (to GPF
via OSX) to wait for incoming messages on its queue.  It is one big loop that
waits for incoming messages, then processes each received message and commands
L1S to do most of the work.  The entry point to L1A in the L1 code proper is
l1a_task(), although the responsibility for running it as a task falls on some
"glue" code outside of L1 proper.  TI's production firmwares with G23M included
have an L1 protocol stack entity within G23M whose only job (aside from some
initialization) is to run l1a_task() in the Nucleus task created by GPF for
that protocol stack entity; we do the same in our firmwares.

Communication between L1 and G23M
=================================

It is remarkable that L1 and G23M don't have any header files in common: L1
uses its own (almost fully self-contained), whereas the G23M+GPF realm is its
own world with its own header files.  One has to ask then: how do they
communicate?  OK, we know they communicate through primitives (messages in
buffers allocated from GPF's PRIM partition memory pool) passed via message
queues, but what about the data structures in these messages?  Where are those
defined if there are no header files in common between L1 and G23M?

The answer is that there are separate definitions of the L1<->G23M interface on
each side, and TI must have kept them in sync manually.  Not exactly a
recommended programming or software maintenance practice for sure, but TI took
care of it, and the existing proprietary products based on TI's firmware are
rock solid, so it is not really our place to complain.
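
To make the hazard concrete, here is a sketch of two independently written
definitions of the same message, together with the kind of compile-time check
one could use to catch them drifting apart.  Everything here is hypothetical:
the struct and field names are invented for this example, the check requires a
C11 compiler, and nothing of the sort exists in TI's code, where the two sides
were kept in sync by hand.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The message as one side might define it (hypothetical layout)... */
struct l1_side_req {
	uint16_t radio_freq;
	uint8_t level;
	uint8_t pad;
};

/* ...and as the other side might define it, independently. */
struct ps_side_req {
	uint16_t arfcn;
	uint8_t rxlev;
	uint8_t reserved;
};

/* Compile-time checks that the two views agree on size and layout. */
_Static_assert(sizeof(struct l1_side_req) == sizeof(struct ps_side_req),
	       "message size mismatch");
_Static_assert(offsetof(struct l1_side_req, level) ==
	       offsetof(struct ps_side_req, rxlev),
	       "field offset mismatch");

/* The receiver interprets the raw buffer through its own definition. */
uint8_t toy_read_level_via_l1_view(const void *buf)
{
	struct l1_side_req view;

	memcpy(&view, buf, sizeof view);
	return view.level;
}
```

If someone added a field to only one of the two structs, the build would fail
at the assertions instead of the two sides silently misreading each other's
messages at run time.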

TI's firmwares from the era we are working with (both our TCS211 golden
reference and the TCS3.2/LoCosto source from which we took the newer full-source
version of G23M for our TCS2/TCS3 hybrid) also include a component called ALR.
It resides in the G23M code realm: G23M coding style, uses Condat header files,
runs as its own protocol stack entity under GPF.  This component appears to
serve as a glue layer between the rest of the G23M stack (which is supposed to
be truly hardware-independent) and TI's L1.

Speaking of ALR, it is worth mentioning that there is a little naming
inconsistency here.  ALR is known to the connect-by-name logic in GPF as "PL"
(physical layer, apparently), while the ACI entity (Application Control
Interface, the top level entity) is known to the same logic as "MMI".  No big
deal really, but hopefully knowing this quirk will save someone some confusion.

A closer look at our FreeCalypso TCS2/TCS3 hybrid
=================================================

Because we don't have an official TI firmware release for the Calypso in full
source form and because I am not willing to throw away all of our Calypso work
and restart anew with LoCosto with its own host of unknowns, the only currently
available way for us to have blob-free production-quality GSM mobile station fw
is the TCS2/TCS3 hybrid implemented in FC Magnetite and Selenite.  This hybrid
is made by taking the G23M version from TCS3/LoCosto and grafting it onto the
chipsetsw foundation from TCS211, including the original TCS211/Calypso version
of L1 which we have meticulously source-reconstructed.  The version of GPF used
for this hybrid is also the TCS211 version in Magnetite or our source
reconstruction thereof in Selenite.

The Condat G23M pieces have been hybridized as follows:

* cdginc generated header files are a special hybrid version described below;

* The include files under condat/com/inc and condat/com/include are the TCS3
  version, except for pwr.h and rtc.h for which we use the TCS2 version;

* comlib is the TCS2 version, except for cl_rlcmac.c which is from TCS3;

* config modules (condat/com/src/config and condat/frame/config) are the TCS2
  version, with some fixes for the needs of the TCS3 version of G23M PS and our
  own FreeCalypso fixes;

* Condat drivers (condat/com/src/driver) are the TCS2 version;

* All G23M PS components are the TCS3 version by necessity, as this is the part
  for which the source is missing in our TCS211 version, with the exception of
  ALR - the original source for the TCS211 version of ALR has miraculously
  survived: the ALR source in TCS211 from OM can be compiled into a perfect
  match for the binary lib version;

* We use the TCS2 version of ALR (the interface to our TCS211 L1) and not the
  TCS3 version (a change from Citrine), but it is compiled with the same hybrid
  cdginc headers as the rest of hybrid G23M, not the old TCS211 ones;

* ACI is the TCS3 version - we have the source for both versions, but trying to
  use the old TCS2 version of ACI on top of the new TCS3 version of the PS
  would cause untold breakage;

* The UI layers (MFW and BMI) for handset fw builds are handled like ACI: we
  have the source for both versions, but we use the TCS3 version which works
  with the TCS3 versions of ACI and cdginc;

* The CST (Customer Specific Task) component is the TCS2 version - while it
  logically belongs in the Condat realm, the code lives in the chipsetsw realm
  under chipsetsw/services/cst (yes, it's under services with SSA stuff even
  though it doesn't use RiViera) and thus our copy of TCS211 from OM has this
  source preserved.

With this hybrid arrangement the main splice point lies above ALR, and there
are many little splice points throughout the code where some upper-level code
from TCS3 needs to talk to lower-level code from TCS2.  There are no inversions,
i.e., no places where TCS2 code sits on top of code from TCS3, although there
are a few instances where TCS2 C code uses some TCS3 header files.

TCS3 feature flags
------------------

Our TCS3.2/LoCosto code from Peek/FGW from 20090327 supports several new GSM
features (apparently related to GSM release 99) which are not supported by our
TCS211-20070608 golden reference from OM.  All of these new features can be
enabled or disabled with conditional compilation flags.  Our TCS2/TCS3 hybrid
currently has all of these new features disabled: it was too difficult for me
to figure out if these new features require some support from the hardware or
the DSP which is present on LoCosto but not Calypso, and even if our hw and DSP
have all of the necessary capabilities, at least some of the new features
require adding some code to L1, which is incompatible with my approach of
reconstructing TCS211 L1 pristinely.

In any case, the GSM functionality we get by using the new version of G23M with
new feature flags disabled on top of pristine TCS211 L1 cannot be any worse
than what we would have had if we had the full corresponding source for our
TCS211-20070608 golden reference, and it is probably a little better because we
are using a newer version of G23M code.

cdginc headers
--------------

Much of the code in the Condat G23M realm makes heavy use of a set of machine-
generated C header files called cdginc.  These header files contain various
definitions related both to the GSM air protocols being implemented and to G23M
protocol stack internals (interfaces and message structures between components),
and they are generated from a set of message definition files (*.mdf) and
primitive definition files (*.pdf) by a tool called ccdgen.  The *.{mdf,pdf}
inputs to ccdgen are human-readable ASCII, and of course the generated C header
files are human-readable too, but we have no source for the ccdgen tool itself,
only a Windows binary which we can run under Wine.

The ccdgen binary problem is yet another instance of so far incomplete
liberation of the GSM firmware.  It is currently a very low-priority problem:
we do not casually edit any of the *.{mdf,pdf} inputs to ccdgen, and we don't
run ccdgen on every fw build - instead we have run ccdgen once and checked its
output files (generated C headers) into our Magnetite and Selenite trees as if
they were sources.  If we are not able to convince TI to dig up and release the
source for ccdgen, there is a viable albeit costly alternative: hire a Windows
reverser to RE the ccdgen.exe binary (262144 bytes) and produce a C
reimplementation that replicates all of its logic.  It is a Win32 console app,
no GUI, and it is a pure data processing application without any hardware access
or OS functions or any other muckery: it is probably pure ANSI C code that reads
and parses a bunch of ASCII input files, performs some business logic on the
data, and writes another bunch of ASCII text files as outputs.  It is currently
a very low-priority task though; reversing the Calypso DSP ROM code should
probably be a higher priority.

The set of cdginc headers for our TCS2/TCS3 hybrid has been generated as
follows:

* All of the *.mdf files are the TCS3 version;

* All of the *.pdf files except mphc.pdf and mphp.pdf are also the TCS3 version;

* mphc.pdf and mphp.pdf are the TCS211 version - this is the interface to
  TCS211 L1;

* All new feature flags (see discussion above) are set to disabled.

Condat Coder and Decoder (CCD)
------------------------------

CCD is a firmware component in the Condat G23M realm which I haven't really
studied yet.  It consists of two parts:

* A fixed portion which TI used to distribute in binary form and which various
  firmware projects used as a prebuilt library like GPF - technically TI
  considered it to be a part of GPF, although we prefer to treat it as its own
  more independent entity;

* The ccddata portion which needs to be compiled with cdginc headers for each
  given project.

We got the source for both parts of CCD only in the TCS3.2/LoCosto version, but
not in the TCS211 version, hence the decision was easy: we use the TCS3 version
of CCD (both parts) with the TCS3 version of cdginc with the TCS3 version of
the G23M PS.

TCS3.2 GPF discrepancy
----------------------

A careful examination of the prebuilt GPF libraries under gpf/LIB in the TCS3.2
LoCosto source tree has revealed that a few of the binary objects exhibit some
differences from the TCS211 version which we've been treating as our golden
reference:

* The os_mis module (OSL miscellany) in the IRAM library implements a new
  function called os_CheckQueueEvent() and defines a new global data object
  named my_os_mis_Protect;

* The os_tim module (OSL timer code) in the flash (XIP) library has some code
  differences;

* The vsi_tim module (VSI timer code) in the flash (XIP) library has some code
  differences;

* The vsi_tim module (VSI timer code) in the IRAM library has some code
  differences and makes use of the new os_CheckQueueEvent() function.

In the case of os_??? modules we have no corresponding source for either
version, but the vsi_tim difference is more bizarre: we got our vsi_tim.c source
(and the rest of vsi_*.c) from the TCS3.2/LoCosto source, but this source
matches the TCS211 binary version and not the newer and different binary version
used by the TCS3.2 build system!  (Remember that none of TI's firmware build
systems that we have seen are set up to recompile any part of GPF from source,
they used it only as prebuilt libraries.)

Because we have the corresponding source for the "old" version of GPF frame core
but not for the "new" version, we are continuing to treat the "old" TCS211
version as our golden reference: we use the source pieces which we got, and we
use the "old" os_???.obj blobs as our basis for reconstruction via disassembly.

Because the changes in the TCS3.2 binary version of GPF involve only the
implementation of a part of VSI but not its API (there are no changes to any
part of the GPF API presented to the G23M PS that I can see anywhere), I have
every good reason to believe that there is no problem with using the new TCS3.2
version of G23M with the old version of GPF from TCS211: it should work no worse
than pure TCS211.

It should also be noted that if we ever succeed in getting some more complete
GPF source out of TI (including the source for the OS adaptation layer which is
difficult to reconstruct), thanks to the great stability and independence of
GPF, we will be happy with *any* version: it does not need to match either
TCS211 or TCS3.2.

GPRS implementation differences
-------------------------------

There is a visible difference between the way GPRS is implemented in the old
TCS211-20070608 blob version of G23M and the way it is implemented in the newer
TCS3.2/LoCosto version we are using for our hybrid.  The new implementation adds
a new protocol stack entity named UPM (User Plane Manager), and the pre-existing
SM and SNDCP entities have been significantly changed to work with this UPM.
Because we are using the GPRS config modules (condat/frame/config) from TCS211,
we had to add a -DFF_UPM compilation flag to include UPM in the GPF
configuration for the GSM+GPRS protocol stack.

A closer look at ACI
====================

The Application Control Interface (ACI) is the crown that sits on top of the
G23M protocol stack.  It includes the AT command interpreter (ATI) component,
and this AT command interface is brought to the outside world via the UART
protocol stack entity.  The UART entity implements the GSM 07.10 MUX, can
operate the physical UART in either multiplexed or non-multiplexed mode (the
latter is the default on boot for a plain ASCII AT command interface) as
commanded by ACI, and establishes 1 to 4 logical channels carrying AT commands
to ACI.  When a CSD or fax call or a GPRS PPP session is in progress, the data
path is switched to run between the UART entity and the appropriate GSM or GPRS
protocol stack destination.  In the case of modem products that are designed to
be controlled by an external host via AT commands, this combination of ACI and
UART entities provides the ultimate end function of the device.

The set of implemented AT commands is defined in ati_cmd.c: this is the C file
where new AT commands get added; there is also an enum of command IDs in
aci_cmh.h which needs to be extended.  For every AT command listed in the table
in ati_cmd.c there is a handler function: for example, for the AT+CFUN command
there is a setatPlusCFUN() function that handles setting and a queatPlusCFUN()
function that handles querying.  For some simple AT commands like AT+CGxx
queries the function listed in ati_cmd.c does the entirety of the work, but for
most of the interesting GSM commands (including the AT+CFUN example just used)
the set and query functions implemented in the ATI layer only handle the parsing
of ASCII arguments and generation of ASCII output (if any), whereas the actual
command implementation happens in the CMH layer below.

Below ATI but still within ACI lies the sublayer of command handlers (CMH).
For each AT command that does something to the GSM mobile station there is a
functional equivalent, a C function that performs the same operation as the
spec-defined AT command, but is designed to be used natively from C code,
without AT command string parsing or output formatting.  For the AT+CFUN example
used above, the setatPlusCFUN() ATI function parses the arguments from ASCII
and then calls sAT_PlusCFUN() to perform the actual operation, whereas the
queatPlusCFUN() ATI function calls qAT_PlusCFUN() to retrieve the current state
and then prints it out in ASCII.  This functional interface is used by TI's
demo/prototype phone UI implementation described in the Handset-UI-fw companion
document.
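
The division of labor between ATI and CMH can be illustrated with a minimal
sketch.  Everything carrying a demo_ prefix is invented for this illustration:
the real setatPlusCFUN(), queatPlusCFUN(), sAT_PlusCFUN() and qAT_PlusCFUN()
functions in ACI take different arguments and return different types, and the
set of accepted <fun> values below is picked arbitrarily.  The point is only
the shape: ASCII parsing and formatting on top, a native C interface below.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the state kept below the CMH layer. */
static int demo_cfun_state = 0;

/* CMH-style functional interface: native C arguments, no ASCII. */
int demo_sAT_PlusCFUN(int fun)
{
    if (fun != 0 && fun != 1 && fun != 4)
        return -1;              /* value not accepted by this sketch */
    demo_cfun_state = fun;
    return 0;
}

int demo_qAT_PlusCFUN(int *fun)
{
    *fun = demo_cfun_state;
    return 0;
}

/* ATI-style set handler: parse the ASCII argument, then call CMH. */
int demo_setatPlusCFUN(const char *args)
{
    char *end;
    long fun = strtol(args, &end, 10);

    if (end == args || *end != '\0')
        return -1;              /* not a clean decimal number */
    return demo_sAT_PlusCFUN((int) fun);
}

/* ATI-style query handler: call CMH, then format the ASCII response. */
int demo_queatPlusCFUN(char *out, size_t outlen)
{
    int fun;

    if (demo_qAT_PlusCFUN(&fun) < 0)
        return -1;
    snprintf(out, outlen, "+CFUN: %d", fun);
    return 0;
}
```

A UI layer written in C would call the demo_sAT_/demo_qAT_ pair directly and
never touch the ASCII wrappers, which is the same relationship that TI's
demo/prototype UI has with the real CMH functions.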

Finally, at the bottom of ACI lies the sublayer of Protocol Stack Adapters
(PSA): these are pieces of code that execute within the ACI task and exchange
primitives with various G23M protocol stack entities below.

We have the source for both TCS2 and TCS3 versions of ACI.  The TCS2 version is
from Openmoko, containing OM's modifications, and we had to go through these
changes and additions by OM, reject the bogus ones and reimplement the sensible
ones in the new TCS3 version of ACI for our TCS2/TCS3 hybrid going forward.

Flash file system
=================

Every GSM device that is based on TI's firmware architecture contains not only
the firmware image proper, but also a flash file system that is separate from
the fw image and is maintained in a different part of the flash chip.  The FFS
implementation code is a mandatory part of the firmware; in TCS211 it resides
in chipsetsw/drivers/drv_app/ffs and logically belongs to the SSA realm.  This
code initializes early in the fw boot process in the Application_Initialize()
context before the start of Nucleus task scheduling; the responsible function
is ffs_main_init() called from Init_Drivers().

Flash driver support and FFS location
-------------------------------------

Determining the location of the flash area allocated for FFS and the flash
driver to be used to write to it is a combination of autodetection and hard-
coding.  The approach implemented in the original TCS211 code is as follows:
there is a piece of autodetection code that reads the flash chip ID, and the
autodetected ID is then looked up in a hard-coded table that gives the driver
and geometry details and the location of the FFS sectors for each supported
flash chip type.  However, this approach has its limitations:

* The sequence of write operations which TI's original autodetection code
  issues in order to put the flash chip into its Read ID mode worked for older
  flash chips that were used by TI and Openmoko, but does not work for the newer
  Spansion S71PL129NC0HFW4B flash chip which we (FreeCalypso) have copied from
  the Pirelli DP-L10 phone.  We have now fixed it, but until recently we had to
  disable flash autodetection and hard-code the flash chip type on Pirelli and
  FCDEV3B targets.

* While the physical flash chip used on a given phone or modem board is a
  physical property that can be autodetected, the choice of which flash sectors
  should be used for FFS is a matter of policy.  Before we built our own
  FreeCalypso hardware, we had to run our fw on some pre-existing "alien" hw
  targets, and we still support such usage to a limited extent.  When we run
  our FreeCalypso fw on an alien hw target as an aftermarket deal, our
  aftermarket FFS location needs to be chosen quite carefully.

* Some flash chips have two chip select banks, and with such chips it is
  generally desirable to put the FFS in the second bank.  However, it is a
  matter of board wiring whether that second flash chip select is connected to
  Calypso chip select nCS2, nCS3 or nCS4 - thus FFS addresses in the second bank
  have to be hard-coded with conditional compilation per board type and cannot
  be autodetected.

To support our new repertoire of possible hardware targets, we have made the
following changes in our Magnetite and Selenite firmwares:

* We have a new version of the ffsdrv_device_id_read() autodetection function
  that issues AMD's Read ID command sequence in a way that works with all flash
  chips which we've encountered so far in real life, including Openmoko's
  Samsung K5A3281 and our new (originally Pirelli's) Spansion flash chip.  We
  have also incorporated the logic from Pirelli's firmware that distinguishes
  between S71PL-J and S71PL-N chips: they have different sector sizes which FFS
  needs to know about, but they have the same ID codes and can only be
  distinguished through CFI.

* The autodetected flash ID code is looked up in a compiled-in table like
  before, but we now have 4 different versions of this table selected by
  conditional compilation based on the target for which the firmware is being
  built:

  - For our own FC hardware family (CONFIG_TARGET_FCFAM) we have our brand-new
    table of possible flash configurations which we keep free of any legacy
    gunk;

  - For Mot C1xx targets (CONFIG_TARGET_COMPAL) we have a dedicated table
    giving our aftermarket FFS configurations for Intel flash chip types found
    in these phones;

  - For the Pirelli DP-L10 target (CONFIG_TARGET_PIRELLI) we likewise have
    another dedicated table giving our aftermarket FFS config for Pirelli's
    S71PL-J or S71PL-N flash;

  - The #else clause is the original table from TI/Openmoko, used on
    dsample and gtamodem targets.
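
The arrangement of one lookup function over a table selected at compile time
can be sketched as follows.  This is only an illustration under stated
assumptions: the structure layout, the demo_ names, the ID codes, the FFS
addresses and the sector counts are all made-up placeholders, and the real
table additionally carries driver and sector geometry details.  Only the
CONFIG_TARGET_xxx symbol names are taken from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry; every field value below is a placeholder. */
struct demo_flash_info {
    uint16_t manuf;         /* manufacturer code returned by Read ID */
    uint16_t device;        /* device code returned by Read ID */
    uint32_t ffs_base;      /* start address of the FFS area */
    unsigned ffs_nsectors;  /* number of flash sectors given to FFS */
};

#if defined(CONFIG_TARGET_FCFAM)
/* our own hardware family: fresh table, no legacy entries */
static const struct demo_flash_info demo_flash_table[] = {
    { 0x00A1, 0x00B1, 0x01000000, 8 },
};
#elif defined(CONFIG_TARGET_COMPAL)
/* aftermarket FFS placement for the flash chips found in these phones */
static const struct demo_flash_info demo_flash_table[] = {
    { 0x00A2, 0x00B2, 0x00300000, 5 },
};
#elif defined(CONFIG_TARGET_PIRELLI)
/* aftermarket FFS placement for Pirelli's flash */
static const struct demo_flash_info demo_flash_table[] = {
    { 0x00A3, 0x00B3, 0x02000000, 6 },
};
#else
/* stand-in for the original TI/Openmoko table */
static const struct demo_flash_info demo_flash_table[] = {
    { 0x00A4, 0x00B4, 0x00380000, 7 },
};
#endif

/* Look up the autodetected ID; NULL means "unsupported flash chip". */
const struct demo_flash_info *
demo_flash_lookup(uint16_t manuf, uint16_t device)
{
    size_t n = sizeof demo_flash_table / sizeof demo_flash_table[0];
    size_t i;

    for (i = 0; i < n; i++)
        if (demo_flash_table[i].manuf == manuf &&
            demo_flash_table[i].device == device)
            return &demo_flash_table[i];
    return NULL;
}
```

Compiled with none of the CONFIG_TARGET_xxx symbols defined, the sketch falls
through to the #else table, mirroring how dsample and gtamodem builds get the
original table.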

The advantages of this new approach over our previous approach of disabling
flash autodetection and using a strictly fixed hard-coded FFS config for
FreeCalypso and Pirelli targets are:

* The high-capacity flash chip we are currently using (S71PL129NC0HFW4B) is
  great for development boards or perhaps for a high-end Pirelli-like feature
  phone, but it would be way overkill for an embedded modem product - for the
  latter device class a smaller flash chip like Openmoko's K5A32xx would be
  more appropriate.  The new autodetection approach makes it possible to build
  a single fw image that can run on both large-flash and small-flash boards.

* I've only seen Pirelli phones with S71PL-N flash so far, but their original
  fw supports both S71PL-J and S71PL-N with autodetection.  We can now do
  likewise in our FreeCalypso fw.

Finally, independent of flash chip type autodetection vs. hard-coding issues,
we have had to change the AMD multibank flash driver to issue write commands in
a way that is compatible with our new S71PL129NC0HFW4B chip.  It still works
just as well with Openmoko's K5A32xx.

FFS life cycle
--------------

In products that have been built according to TI's original way, including
Openmoko GTA01/02 and our own FreeCalypso devices, the FFS is formatted and
initialized with some essential content at the time of device manufacture, and
this factory-created and factory-initialized FFS then persists for the lifetime
of the device.  In our factory environment at FreeCalypso hardware manufacturing
we initialize the flash on our freshly assembled boards like this:

flash erase 0 0x800000
flash program-bin 0 fwimage.bin
flash2 erase 0 0x800000

This factory procedure (which should ONLY be executed at the factory and never
by any end users or even sw/fw developers and tinkerers) ensures that the flash
is completely blank everywhere except the fw image loaded at the time of
production, and when this fw image boots for the first time, it will see blank
flash in the FFS sectors.  When TI's FFS code in ffs_main_init() sees this
condition, it performs what TI called a preformat: it writes a basic FFS block
header into each FFS sector, but does not automatically perform a full format -
instead the latter needs to be commanded explicitly by the production station
via one of the TMFFS command packet protocols as described later in this
article.  In FreeCalypso we have adopted TMFFS2 as our choice of Test Mode FFS
access protocol; our host side implementation of this protocol is fc-fsio, and
we format and initialize the FFS on our devices with an fc-fsio command script
as part of our factory procedure.

FFS content and usage
---------------------

TI's firmware architecture uses the FFS for many purposes:

* The IMEI is stored in the FFS - GSMA can proclaim all they want that it
  "MUST" be stored in some kind of super-secure one-time programmable fuses,
  but in TI's architecture and in FreeCalypso it is just a regular file in the
  FFS.

* A number of RF calibration tables are stored in FFS and read by the RF code
  in L1.  If you have a Rohde&Schwarz CMU200 instrument which is itself in good
  repair and calibration standing and a metrology-grade RF cabling setup whose
  insertion loss at the relevant GSM frequencies is precisely known, creating
  or recreating these RF calibration values is as simple as executing one shell
  script that takes a few minutes to run - this is how we do it at FreeCalypso
  hw manufacturing - but if you are an ordinary user or sw/fw developer or
  tinkerer without a professional calibration station setup, you need to use
  the RF calibration values that have been written into the FFS by the device
  manufacturer.  These RF calibration tables live under /gsm/rf.

* /gsm/com/rfcap tells the RR component in the G23M protocol stack (not L1!)
  which frequency bands are supported on a given device - on our devices it is
  a factory-programmed file distinguishing between tri900 and tri850 units and
  telling the firmware which bands it should scan for possible GSM cells.

* Manufacturer, model and revision ID strings may be written into /pcm/CGMI,
  /pcm/CGMM and /pcm/CGMR, respectively, to be returned by the corresponding
  AT+CGxx query commands.

* The G23M protocol stack writes a number of dynamically updated files under
  the /gsm hierarchy and under /pcm.

* TI's demo/prototype UI code (see Handset-UI-fw companion document) writes its
  persistent state in files under /mmi.

* Audio mode configuration files are kept under /aud - see the Audio-mode-config
  article in freecalypso-tools.

* If a given product uses the Melody E1 mechanism, melody files to be played
  through the RiViera Audio Service are kept in FFS - see the Melody_E1 article
  in freecalypso-tools.

Building firmware for different targets
=======================================

TI's TCS3.2 firmware for their LoCosto chipset which was rejected by the Mother
for reasons described near the beginning of this article makes a complete break
from the past and has no possibility of supporting any pre-LoCosto chips such
as our beloved Calypso, but TI's previous evolutionary developments weren't so
drastic: the evolution to Calypso from previous chips such as Hercules and
Ulysse was smoother, and our reference TCS211 fw is littered with C preprocessor
conditionals supporting TI's earlier development boards prior to D-Sample and
DBB chips prior to Calypso.

TI's configuration management architecture supported only TI's own development
boards and not any of the end product boards: unfortunately they did not follow
a development model like the Linux kernel where everyone is encouraged to
contribute their custom board support bits upstream and the mainline kernel
strives to support every hw target that was ever supported with a single source
tree; instead it was the divergent model where every end device manufacturer
would take TI's reference firmware source and hack it for their specific needs
with no concern for upstreamability or support for targets or applications
other than their own.  TI's firmware build configuration model defined the
following C preprocessor symbols relating to support for different hw targets,
all numeric, i.e., each symbol is always defined to a number:

BOARD identifies which board is to be targeted, with numbers assigned for
different development boards made by various TI groups, but generally not for
customer boards.  The only Calypso-based BOARD number is 41, originally
assigned for the D-Sample but then also reused for the Leonardo; all other
BOARD numbers are for some other chipsets that aren't Calypso.  The previous
board before D-Sample was C-Sample, which is BOARD 9, but I am not sure exactly
what chipset it had - perhaps it was Ulysse/Nausica/Clara.  There is still
plenty of support for BOARD 9 and even earlier boards in the firmware source we
got.

CHIPSET identifies the main DBB chip.  The interesting numbers are 7 for the
very original Calypso C05 rev A, 8 for Calypso C05 rev B (found on the D-Sample
board which the Mother scored in 2015), 10 for Calypso C035 (the Calypso silicon
version we work with in FreeCalypso), 11 for Calypso Lite (same as the regular
Calypso except for smaller IRAM), 12 for Calypso+ (a short-lived intermediate
step between Calypso and LoCosto) and 15 for LoCosto.

ANLG_FAM (previously ANALOG) identifies the ABB chip.  The numbers are 1 for
Nausica, 2 for Iota (what we use) and 3 for Syren (typically used with Calypso+
like on the E-Sample board).

RF_FAM (previously just RF) identifies the RF hardware hooked up to the baseband
chipset.  The interesting numbers are 10 for Clara (D-Sample) and 12 for Rita,
the latter being the only RF chip for which we have driver support.

Naturally any code that cares about DBB register differences would use the
CHIPSET definition, ABB support code would use ANLG_FAM, RF support code would
use RF_FAM, and finally code that needs to know about board-level peripherals
like LCDs and keypads would use the BOARD symbol.  This model worked fine up to
D-Sample: for example, the code for C-Sample vs. D-Sample LCDs and keypads is
cleanly conditionalized on BOARD 9 vs. BOARD 41.  However, the waters got badly
muddied when TI introduced their Leonardo board and instead of giving it its
own BOARD number, reused BOARD number 41 from D-Sample.
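
A minimal sketch of this style of conditionalization follows.  The numeric
values are the ones listed above (BOARD 41, CHIPSET 10, ANLG_FAM 2, RF_FAM 12
describing a Calypso C035 + Iota + Rita configuration); the demo_ functions are
invented for illustration and do not correspond to any function in TI's code.
In a real build the symbols come from the generated config headers rather than
from fallback #defines.

```c
/* Fallback definitions so that the sketch compiles on its own. */
#ifndef BOARD
#define BOARD     41
#endif
#ifndef CHIPSET
#define CHIPSET   10
#endif
#ifndef ANLG_FAM
#define ANLG_FAM  2
#endif
#ifndef RF_FAM
#define RF_FAM    12
#endif

/* Board-level peripheral code keys off BOARD. */
const char *demo_ui_hw_variant(void)
{
#if BOARD == 9
    return "C-Sample";
#elif BOARD == 41
    /* D-Sample and Leonardo cannot be told apart at this point. */
    return "D-Sample";
#else
    return "unknown";
#endif
}

/* RF support code keys off RF_FAM. */
const char *demo_rf_chip(void)
{
#if RF_FAM == 10
    return "Clara";
#elif RF_FAM == 12
    return "Rita";
#else
    return "unknown";
#endif
}
```

With these settings demo_rf_chip() reports Rita while demo_ui_hw_variant()
still reports D-Sample: the RF selection is right for a Leonardo-class board,
but the board-level selection has no way to express it.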

D-Sample was TI's primary internal development platform for the Calypso,
featuring Iota for the ABB and Clara for the RF part.  It was a great solid
platform in every way except the RF part: the old Clara RF is inconvenient
(needs more external parts) and TI were marketing their newer Rita RF to real
end device manufacturers, but the D-Sample still worked great for development:
if you aren't working specifically on the RF part, it doesn't matter as long as
you have a working driver for it, which we lack.  Then TI made another Calypso
development board called Leonardo, featuring the same Calypso+Iota baseband
plus the newer Rita RF.  But this Leonardo never fully replaced the D-Sample
for any of the high-level development in the SSA and UI groups.

Openmoko's modem is a direct derivative of the Leonardo, the only change being
the RFFE (for some reason FIC didn't like TI's quadband RFFE as implemented on
Leonardo and E-Sample boards and used their own slightly hobbled triband RFFE
instead), and the firmware build given to OM was TI's Leonardo fw with just a
few tweaks in tpudrv12.h to account for the RFFE control signal differences.
However, because Leonardo never got its own BOARD number and the BOARD symbol
is still set to 41, all of the SSA/UI code (LCD, keypad, battery charging etc)
is still built as if for D-Sample - but none of that code is used on a pure AT
command modem without UI functions or UI hardware, hence OM probably never
noticed anything odd.

And it wasn't just Openmoko - it appears that TI used their Leonardo boards
mostly or perhaps even solely in the ACI configuration without UI layers
(MMI=0 build configuration), while all or most UI development was done on
D-Sample kits.  Their TCS211 reference fw product officially supported both
D-Sample and Leonardo targets in both ACI and BMI+MFW configurations, but if
one were to build a high-end UI-enabled config for the Leonardo like pdt_2272,
it would target a 176x220 pixel color LCD, the LCD output driver would be the
one for the D-Sample (expecting memory-mapped LCD registers on nCS3), and the
keypad driver would expect D-Sample keypad wiring.  Looking at the available
Leonardo schematics I see a serial (uWire) LCD interface instead and a more
basic keypad with different wiring, so I don't see how those Leonardo+UI
firmware builds could possibly work.  Perhaps some other group at TI did some
UI work on Leonardo boards, but that work never made it into the internal
mainline from which TCS211 releases were cut - who knows...

Finally, aside from the basic failure to distinguish properly between D-Sample
and Leonardo boards, this whole BOARD number system provides absolutely no
mechanism to distinguish between TI's development boards and end product boards
derived from them, or between end product boards of vendor A vs. vendor B, or
between end product model A and model B from the same vendor - it's always
BOARD 41 as far as TI's code is concerned.  When TI had to modify their code
for OM to support FIC's different TSPACT signal wiring, they just edited the
definitions in tpudrv12.h without any conditionals, so one couldn't build
binaries for the original Leonardo vs. OM's hardware from the same source tree
in different configs.

The build system of TCS211 produces a set of generated C header files named
*.cfg (instead of the more natural *.h); these generated config headers define
all of the C preprocessor symbols listed above and many more.  They are included
sometimes as #include "board.cfg" and at other times as #include "config/board.cfg"
(ditto for other *.cfg), thus the list of -I directories passed by the build
system on compiler invocation lines needs to include both the config directory
and its parent.  In our Magnetite and Selenite build systems we likewise
generate these *.cfg headers; some of the symbols defined therein are variable
and originate from Bourne shell variables in our own configuration system, but
many others are fixed.  See scripts/cfg-template in our Magnetite and Selenite
trees for the magic.

The BOARD symbol is always fixed at 41 in all FreeCalypso firmwares,
corresponding to TI's D-Sample and Leonardo, and we use our own different
mechanism to distinguish among our supported targets.  The solution adopted in
Magnetite and Selenite is as follows: we are supplementing TI's *.cfg and
rv_swe.h files with our own fc-target.h (included as #include "fc-target.h" or
as #include "config/fc-target.h" matching whatever existing TI code we are
gently extending), and this fc-target.h header is populated by the build system
by copying the appropriate targets/*.h header file.  These targets/*.h
header snippets define C preprocessor symbols of our own invention like
CONFIG_TARGET_xxx, and whenever we need to know our target in C code, we
#include "fc-target.h" and use #ifdef logic based on these preprocessor symbols
of our own addition.

RVTMUX debug and development interface
======================================

The Calypso chip has two UARTs, and TI's TCS211 firmware and its predecessors
are designed with the assumption that both of these UARTs are available.  Per
TI's fw architecture, Calypso's MODEM UART presents the standard AT command
interface with GSM 07.10 MUX, CSD, fax and GPRS capabilities as described
earlier when we looked at ACI and ATI, whereas the other UART (called the IrDA
UART in hardware docs but not used for that purpose) presents a vitally
important debug, development and production interface called RVTMUX.  This
RVTMUX interface can also be moved to the MODEM UART, in which case the standard
AT command interface is lost.

RVTMUX is a binary packet interface, and it got its name because it is a MUX of
multiple logical channels managed by the RiViera Trace (RVT) firmware component.
RVTMUX is often thought of as being primarily a debug trace interface, as that
is the primary use to which it is put: in normal operation the firmware emits
quite voluminous debug trace output on the IrDA UART, encapsulated in 3
different RVTMUX channels as explained below.  However, it is also possible to
send a number of different debug and development commands to the firmware via
this interface, and this functionality is used as a critical component in
Calypso GSM device factory production line processes: this RVTMUX interface is
the only way by which the FFS can be initialized, RF calibration and tests can
be performed and the IMEI can be set at the factory.

Communication with a running firmware over this RVTMUX interface in a
development or production setting (whether passively reading debug traces or
actively sending development or test commands to the running fw) requires
specialized host tools.  TI originally had a suite of Windows-based tools for
this purpose, but we are not using them in FreeCalypso: we only got Windows
binaries without any sources, and even in the case of those binaries we only
got an incomplete set with some important tools missing.  Instead we are using
our own Unix-based tools called FreeCalypso host tools; these tools have been
developed from scratch by Mother Mychaela after studying the firmware components
with which they need to communicate.

Debug trace output
==================

The firmware component that "owns" the physical UART channel assigned to RVTMUX
is RVT, contained in chipsetsw/riviera/rvt in TCS211 or in src/cs/riviera/rvt
in our Magnetite and Selenite firmwares.  It is a Riviera-based component,
and it has a Nucleus task that is created and started through Riviera.  All
calls to the actual driver for the UART are made from RVT.  In the case of
output from the Calypso GSM device to an external host, all such output is
performed in the context of RVT's Nucleus task; this task drains RVT's message
queue and emits the content of allocated buffers posted to it, freeing them
afterward.  (The dynamic memory allocation system in this case is Riviera's,
which is susceptible to fragmentation - see discussion earlier in this article.)
Therefore, every trace or other output packet emitted from a GSM device running
our fw (or any of the proprietary firmwares based on the same architecture)
appears as a result of a message in a dynamically allocated buffer having been
posted to RVT's queue.

RVT exports several API functions that are intended to be called from other
tasks; it is by way of these functions that most output is submitted to RVT.
One can call rvt_send_trace_cpy() with a fully prepared output message, and
that function will allocate a buffer from Riviera's dynamic memory allocator
properly accounted to RVT, fill it and post it to the RVT task's queue.
Alternatively, one can call rvt_mem_alloc() to allocate the buffer, fill it in
and then pass it to rvt_send_trace_no_cpy().
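
The difference between the copying and non-copying submission paths can be
sketched with a small host-runnable model.  This is not the RVT API: the demo_
functions below have simplified signatures invented for this sketch (the real
functions take additional arguments which are not reproduced here), malloc()
stands in for Riviera's dynamic memory allocator, and a fixed array stands in
for the RVT task's message queue.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_QUEUE_DEPTH 8

struct demo_msg {
    char  *buf;
    size_t len;
};

static struct demo_msg demo_queue[DEMO_QUEUE_DEPTH];
static unsigned demo_q_count;

/* Models rvt_mem_alloc(): the caller gets a buffer to fill in place. */
char *demo_rvt_mem_alloc(size_t len)
{
    return malloc(len);
}

/* Models rvt_send_trace_no_cpy(): takes ownership of buf and posts it. */
int demo_rvt_send_trace_no_cpy(char *buf, size_t len)
{
    if (demo_q_count >= DEMO_QUEUE_DEPTH) {
        free(buf);              /* queue full: this trace is lost */
        return -1;
    }
    demo_queue[demo_q_count].buf = buf;
    demo_queue[demo_q_count].len = len;
    demo_q_count++;
    return 0;
}

/* Models rvt_send_trace_cpy(): allocate, copy, then post. */
int demo_rvt_send_trace_cpy(const char *msg, size_t len)
{
    char *buf = demo_rvt_mem_alloc(len);

    if (!buf)
        return -1;
    memcpy(buf, msg, len);
    return demo_rvt_send_trace_no_cpy(buf, len);
}

/* Models the RVT task: emit each queued buffer, then free it.
   Returns the total number of bytes emitted. */
size_t demo_rvt_task_drain(FILE *out)
{
    size_t total = 0;
    unsigned i;

    for (i = 0; i < demo_q_count; i++) {
        fwrite(demo_queue[i].buf, 1, demo_queue[i].len, out);
        total += demo_queue[i].len;
        free(demo_queue[i].buf);
    }
    demo_q_count = 0;
    return total;
}
```

In this model the non-copying path saves one memcpy() when the caller can
format its output directly into the buffer it is about to post; either way the
buffer is freed only after the draining task has emitted it.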

At higher levels, there are a total of 3 kinds of debug traces that can be
emitted:

* Riviera traces: these are generated by various components implemented in
  Riviera land, although in reality any component can generate a trace of this
  form by calling rvf_send_trace() - this function can be called from any task.

* L1 traces: L1 has its own trace facility implemented in
  src/cs/layer1/cfile/l1_trace.c; it generates its traces as ASCII messages and
  sends them out via rvt_send_trace_cpy().

* GPF traces: code that runs in GPF/G23M land and uses those header files and
  coding conventions etc can emit traces through GPF.  GPF's trace functions
  (implemented in gpf/frame/vsi_trc.c) allocate a memory partition from
  GPF's TEST pool, format the trace into it, and send the trace primitive to
  GPF's special test interface task.  That task receives trace and other GPF
  test interface primitives on its queue, performs some manipulations on them,
  and ultimately generates RVT trace output, i.e., a new dynamic memory buffer
  is allocated in the Riviera land, the trace is copied there, and the Riviera
  buffer goes to the RVT task for the actual output.

Trace masking
=============

The RV trace facility invoked via rvf_send_trace() has a crude masking ability,
but by default all traces are enabled.  In TI's standard firmwares most of the
trace output comes from L1: L1's trace output is very voluminous, and most of
it is fully enabled by default.

On the other hand, GPF and therefore G23M traces are mostly disabled by default.
One can turn the trace verbosity level from any GPF-based entity up or down by
sending a "system primitive" command to the running fw, and another such command
can be used to save these masks in FFS, so that they will be restored on the
next boot cycle and be effective at the earliest possible time.  Enabling *all*
GPF trace output for all entities is generally not useful though, as it is so
verbose that a developer trying to make sense of it will likely drown in it -
and it will also overwhelm the debug trace facility itself, causing most of
these far too voluminous traces to be lost.  Therefore, a developer seeking to
debug an issue in the G23M protocol stack needs to enable traces very
judiciously.

GPF compressed trace hack
=========================

TI's Windows-based GSM firmware build systems include a hack called str2ind.
Seeking to reduce the fw image size by eliminating trace ASCII strings from it,
and seeking to reduce the load on the RVTMUX serial interface by eliminating
the transmission time of these strings, they passed their sources through an
ad hoc preprocessor that replaces these ASCII strings with numeric indices.
The compilation process with this str2ind hack becomes very messy: each source
file is first passed through the C preprocessor, then the intermediate form is
passed through str2ind, and finally the de-string-ified form is compiled, with
the compiler being told not to run the C preprocessor again.

TI's str2ind tool maintains a table of correspondence between the original trace
ASCII strings and the indices they've been turned into, and a copy of this table
becomes essential for making sense of GPF trace output: the firmware now emits
only numeric indices which are useless without this str2ind.tab mapping table.
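
What a host-side trace viewer has to do with such a table can be sketched as
below.  The on-disk format of str2ind.tab is not described in this article and
is not modeled here: the in-memory table, its two sample entries and the demo_
names are all made up for illustration.

```c
#include <stddef.h>

/* Hypothetical in-memory form of an index-to-string mapping table. */
struct demo_str2ind_entry {
    unsigned    index;
    const char *format;
};

static const struct demo_str2ind_entry demo_str2ind_tab[] = {
    { 1, "demo: entity started" },
    { 2, "demo: timer %d expired" },
};

/* Map an index received in a compressed trace back to its string.
   Returns NULL if the index is unknown, in which case the trace
   cannot be decoded with this table. */
const char *demo_str2ind_lookup(unsigned index)
{
    size_t n = sizeof demo_str2ind_tab / sizeof demo_str2ind_tab[0];
    size_t i;

    for (i = 0; i < n; i++)
        if (demo_str2ind_tab[i].index == index)
            return demo_str2ind_tab[i].format;
    return NULL;
}
```

The NULL case is the practical hazard: a table that does not match the fw
build being traced yields either no string or the wrong one.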

Our FC Magnetite build system retains the option of using str2ind, but it is
disabled by default: str2ind significantly increases firmware compilation times,
the resulting fw image sizes without str2ind are fine (the slight increase does
not push us over any limits), and we haven't had any issues with ASCII strings
overloading the trace interface.  However, there is an additional complication
stemming from the choice of two possible G23M PS versions, one of which is a
set of blob libraries:

* If Magnetite is compiled in a pure TCS211 configuration using the original
  blob version of G23M PS, these blobs already have str2ind indices baked into
  them instead of trace ASCII strings, hence the frozen str2ind.tab file from
  Openmoko that maps these indices back to strings needs to be used.

* If Magnetite is compiled in a TCS2/TCS3 hybrid config without G23M blobs,
  then unless you enable it explicitly with USE_STR2IND=1, no str2ind will be
  used at all.

Our blob-free FC Selenite firmware does not support str2ind at all - we shall
stick with full ASCII string traces until and unless we run into an actual (as
opposed to hypothetical) problem with either fw image size or serial interface
load.

RVTMUX command input
====================

RVTMUX is not just debug trace output: it is also possible for an external host
to send commands to the running fw via RVTMUX.

Inside the fw, RVTMUX input is handled by the RVT entity by way of a Nucleus
HISR.  This HISR gets triggered when Rx bytes arrive at the designated UART,
and it calls the UART driver to collect the input.  RVT code running in this
HISR parses the message structure and figures out which fw component the
incoming message is addressed to.  Any fw component can register to receive
RVTMUX packets by providing a callback function with its registration; this
callback function is called in the context of the HISR.
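
The registration and dispatch logic just described can be modeled with a short
Python sketch.  This is only a model, not the firmware's C API: the class and
method names are hypothetical, and it assumes that the first byte of each
already-deframed packet is the RVT packet type.

```python
class RvtInputModel:
    """Toy model of RVT's receive-side dispatch (hypothetical names)."""

    def __init__(self):
        self.handlers = {}          # packet type -> callback

    def register(self, packet_type, callback):
        """A fw component registers to receive packets of one type."""
        self.handlers[packet_type] = callback

    def on_packet(self, packet):
        """Stands in for the code running in HISR context.

        Returns True if the packet was delivered, False if it was dropped.
        """
        if not packet:
            return False
        handler = self.handlers.get(packet[0])
        if handler is None:
            return False            # nobody registered for this type
        handler(packet[1:])         # callback runs in the caller's context
        return True

received = []
rvt = RvtInputModel()
rvt.register(0x14, received.append)       # 0x14: the "TM" channel
delivered = rvt.on_packet(b"\x14\x01\x02")
dropped = rvt.on_packet(b"\x7f\x00")      # 0x7F: arbitrary unregistered type
```

Because the callback runs in the context of the caller, a real handler should
do as little work as possible there and defer the rest to its own task.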

In the original TCS211 fw there are only two components that register to receive
external host commands via RVTMUX: ETM and GPF, hence these are the only command
packet types that can be sent to this original fw.  In FreeCalypso we have kept
these, and we've also added some new RVTMUX channels of our own invention.

Test Mode (TM) and Enhanced Test Mode (ETM)
===========================================

A major use of the RVTMUX interface is sending so-called Test Mode commands
from an external host to a running GSM device.  Depending on the firmware
version, a GSM device can be commanded to do any of the following things
through this mechanism:

* Exercise RF test modes, e.g., transmit continuously at a set frequency and
  power level;
* Read and write arbitrary memory locations in the Calypso ARM7 address space;
* Read and write ABB chip registers;
* Reboot or power off;
* Access and manipulate the device's flash file system (FFS).

In the segment of history of interest to us TI has produced two different
target firmware components that can receive, interpret and act upon Test Mode
command packets:

* The original Test Mode component of Layer 1, called L1TM or TML1: this
  component handles all RF test modes (needed for RF calibration on device
  production lines), and originally it also implemented memory and ABB register
  read and write commands, and provided access to TMFFS1 (see below).  In the
  original implementation this component registered itself as the handler for
  the "TM" RVTMUX channel (RVT packet type 0x14), so it would receive all TM
  packets sent to the device.

* Enhanced Test Mode (ETM) is a later invention.  It registers itself (instead
  of the old TM in L1) with RVT as the handler for the "TM" RVTMUX channel, and
  then provides a registration service of its own, such that various components
  in the fw suite can register to receive external command packets passing
  first through RVT, then through ETM, and can send responses passing through
  ETM, then through RVT back to the external host.  If a given fw version
  contains both ETM and L1TM like TCS211 does, then L1TM registers itself with
  ETM; an external host would send exactly the same binary command packets to
  exercise RF test modes, but inside the firmware they now pass through ETM on
  their way to L1TM.

The ETM_CORE module contained within ETM itself provides some low-level debug
commands: by sending the right binary command packets to the GSM device via the
RVTMUX serial channel, an external host can examine or modify any memory
location and any hardware register, cause the device to reset, etc.  Prior to
ETM some of these functions (but not all) could be exercised through older TM3
commands, but in FreeCalypso we became familiar with the ETM versions of these
commands long before the older ones because we got the ETM component in full
source form, whereas the sole surviving copy of TCS211 that serves as our golden
reference came with L1TM in binary object form like the rest of L1, and we got
to source-reconstructing it only much later.

ETM is implemented as a Riviera SWE and has its own Nucleus task; the callback
function that gets called from the RVT HISR posts received messages onto ETM's
own queue drained by its task.  The ETM task gets scheduled, picks up the
command posted to its queue, executes it, and sends a response message back to
the external host through RVT.
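
The hand-off from callback to task can be modeled in Python as follows.  This
is a sketch under stated assumptions: a Python thread and queue.Queue stand in
for the Nucleus task and ETM's message queue, all names are hypothetical, and
the "OK:" response is a placeholder rather than any real ETM response format.

```python
import queue
import threading

class EtmModel:
    """Toy model: a callback that only enqueues, and a task that executes."""

    def __init__(self, send_response):
        self.mailbox = queue.Queue()
        self.send_response = send_response
        self.task = threading.Thread(target=self._task_body)
        self.task.start()

    def rvt_callback(self, packet):
        """Called from the HISR-like context: post the message and return."""
        self.mailbox.put(packet)

    def _task_body(self):
        """Task context: handle one command at a time, in arrival order."""
        while True:
            packet = self.mailbox.get()
            if packet is None:      # sentinel used only by this model
                break
            self.send_response(b"OK:" + packet)

    def shutdown(self):
        self.mailbox.put(None)
        self.task.join()

responses = []
etm = EtmModel(responses.append)
etm.rvt_callback(b"cmd1")
etm.rvt_callback(b"cmd2")
etm.shutdown()
print(responses)
```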

Because all ETM commands funnel through ETM's queue and task, and that task
won't start looking at a new command until it has finished handling the previous
one, all ETM commands and responses are in strict lock-step: it is not possible
to send two commands and have their responses come in out of order, and it makes
no sense to send another ETM command prior to receiving the response to the
previous one.  (But there can still be debug traces or other traffic intermixed
on RVTMUX in between an ETM command and the corresponding response!)
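
For a host program this lock-step model suggests a simple exchange loop: send
one command, then read packets until the response arrives, handing anything
else off to the side.  The Python sketch below illustrates this under a few
assumptions: the FakeLink class is a test stand-in rather than a real serial
port or rvinterf interface, the first byte of each packet is taken to be the
RVT packet type, and 0x11 is just an arbitrary non-TM type for the example.

```python
from collections import deque

TM_TYPE = 0x14   # "TM" RVTMUX channel, as given earlier in this article

class FakeLink:
    """Stand-in for the link to the target, preloaded with incoming packets."""

    def __init__(self, incoming):
        self.incoming = deque(incoming)
        self.sent = []

    def send(self, packet):
        self.sent.append(packet)

    def recv(self):
        return self.incoming.popleft()

def etm_exchange(link, command, on_other):
    """Send one ETM command and return the payload of its response.

    Packets on other channels that arrive in between (debug traces etc.)
    are passed to on_other instead of being treated as the response.
    """
    link.send(bytes([TM_TYPE]) + command)
    while True:
        packet = link.recv()
        if packet[0] == TM_TYPE:
            return packet[1:]
        on_other(packet)

others = []
link = FakeLink([b"\x11some trace", b"\x14\x01\x02"])
response = etm_exchange(link, b"\xaa", others.append)
```

A real tool would also want a timeout in place of the unbounded loop.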

L1TM commands get posted to the message queue of the L1A task and then executed
in that task's context.

FFS access via TM/ETM
=====================

One of the essential facilities provided in one form or another in all known
incarnations of the Test Mode mechanism (at least in TI's original architecture,
as opposed to Motorola's bastardized version) is the ability to access and
manipulate the GSM device's flash file system (FFS) that was described earlier
in this article.  TI's TMFFS1 and TMFFS2 protocols provide a command and
response packet interface to the FFS API functions inside the fw, and enable an
external host connected to the GSM device via the RVTMUX channel to perform
arbitrary read and write operations on the device file system.

In the segment of history of interest to us TI has produced two different
and entirely incompatible versions of the TMFFS protocol: TMFFS1 and TMFFS2.
Or rather, what is now called TMFFS1 was originally just TMFFS, and then came
TMFFS2.  TMFFS2 works only through ETM, whereas TMFFS1 predates ETM: in the
original implementation the tm_ffs() function in the FFS code was called from
L1TM code.

Our copy of TCS211 reference fw includes the source for both TMFFS1 and TMFFS2;
it is theoretically possible to build a firmware image that includes both TMFFS
versions (they won't conflict because they respond to different command
packets), but it is pretty clear that TI never intended to have both enabled
at the same time.  Our copy of TCS211 came with TMFFS1 enabled and we didn't
change it when we made the moko12 (leo2moko-r1) fw release for the Openmoko
community (the previous proprietary mokoN firmwares also implement TMFFS1),
but we have subsequently switched to TMFFS2 for our current Magnetite and
Selenite firmwares.

Our choice of TMFFS2 over TMFFS1 was driven by the need to develop our own host
tools to replace TI's original ones, which we never got.  These tools had to
operate on GSM device FFS via one of the two TMFFS protocols, and after
studying the fw source implementing both, I (Mother Mychaela) came to the
conclusion that TMFFS2 is both more capable and more reliable; my guess is that
TMFFS1 was kept around only because some of
TI's crappy Weendoze host software depended on it.  (See the implementation
code in chipsetsw/drivers/drv_app/ffs/board/tmffs.c in TCS211 if you would like
to judge for yourself.)  Our host tool that speaks the TMFFS2 protocol is
fc-fsio.

GPF external command input
==========================

The other component that can receive external commands is GPF.  GPF's test
interface can receive so-called "system primitives", which are ASCII string
commands parsed and acted upon by GPF, and also binary protocol stack
primitives.  Remember how all entities in the G23M stack communicate by sending
messages to each other?  Well, GPF's test interface allows such messages to be
injected externally as well, directed to any entity in the running fw.  System
primitive commands can also be used to cause entities to send their outgoing
primitives to the test interface, either instead of or in addition to the
originally intended recipient.

AT commands over RVTMUX
=======================

There is one more use to which we put the RVTMUX debug serial interface that is
an original FreeCalypso invention: communicating with the AT command interpreter
(ATI).  TI's original architecture assumes that if a product is to offer a
standard AT command interface (the product is either a GSM/GPRS modem for which
this AT command interface is the sole mode of usage or a feature phone that
offers a data port as one of its features), then it will be presented on a
dedicated UART separate from RVTMUX.

However, in the case of our FreeCalypso family of projects about 2 years had
passed between our first functional GSM fw attempts in 2015 and us successfully
building our own development board in 2017; during this time we had to work on
various crippled pre-existing Calypso devices, and many of them had only one
UART practically accessible.  In response to this situation we developed a way
to pass AT commands over RVTMUX.  We created a new RVTMUX channel for this
interface and assigned it RVT packet type 0x1A.  Packets sent from an external
host to the GSM device carry AT commands and SMS string input, whereas packets
flowing the other way carry ATI's responses to commands and asynchronous
notifications such as incoming calls.  The host utility for talking AT commands
to a FreeCalypso GSM device via RVTMUX is fc-shell, described below.
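
As a rough illustration of how a host program might wrap and unwrap traffic on
this channel, here is a minimal Python sketch.  It assumes that the first byte
of a deframed RVT packet is the packet type and that the rest of the packet is
the plain ASCII text; the function names are hypothetical, and the exact
payload conventions (line terminators and so on) are not covered here.

```python
AT_TYPE = 0x1A   # RVT packet type assigned to AT-over-RVTMUX

def make_at_packet(command):
    """Wrap an AT command string into a packet for the AT channel."""
    return bytes([AT_TYPE]) + command.encode("ascii")

def parse_at_packet(packet):
    """Return ATI's text if this packet is on the AT channel, else None."""
    if not packet or packet[0] != AT_TYPE:
        return None
    return packet[1:].decode("ascii", errors="replace")

outgoing = make_at_packet("AT+CGMR")
incoming_text = parse_at_packet(b"\x1aOK")
not_for_us = parse_at_packet(b"\x14\x00")   # TM channel packet, ignored here
```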

Now that we have built a proper FreeCalypso development board with two UARTs,
the use of this AT-over-RVTMUX hack is deprecated for general usage: this hack
does not support any data services (CSD or GPRS), and even for SMS it was
crippled for a long time because maximum-length messages could not be sent in
the more capable PDU mode until our recent extension that works around this
limitation.  However, it still comes in handy during certain casual testing
sessions, and it is required if one needs to run our FreeCalypso firmware on
Mot C1xx or Pirelli DP-L10 hardware.

FC host tools for talking to firmwares via RVTMUX
=================================================

The fundamental tool for talking to running firmwares via RVTMUX is a program
called rvinterf.  It runs on a Unix/Linux host machine, opens a serial port
that is expected to be connected to the RVTMUX UART on the target, and then
speaks TI's binary packet protocol on that serial port.  It then performs two
functions:

* If rvinterf is run in the foreground in a terminal window (or more precisely,
  if its default terminal output is not disabled), every packet received from
  the target is decoded and printed on stdout in human-readable ASCII.  For
  some packets like TM/ETM responses this "human-readable" form is just a hex
  dump, but the trace messages which the firmware emits on its own are printed
  in truly human-readable form.  This output can also be saved to a log file.

* Rvinterf creates a local UNIX domain socket on the machine it is running on,
  and other host tools can then connect to this socket to exchange packets with
  the firmware.  Client programs connected to rvinterf via this local socket
  interface can register to receive copies of packets sent by the target on
  specific RVTMUX channels, and they can also send arbitrary packets to the
  target.

Our main "client" programs for actively interacting with running firmwares via
rvinterf are:

fc-tmsh		This utility speaks the TM/ETM protocol.  It supports almost
		all ETM and L1TM commands that are supported by our reference
		TCS211 fw with the important exception of TMFFS; support means
		that fc-tmsh can issue these commands and decode the firmware's
		responses to them.  fc-tmsh operates asynchronously in that the
		issuance of commands to the target and the display of firmware
		responses are completely decoupled; this asynchronous model is
		a good match for L1/RF test mode commands and simple ETM
		operations, but is a poor fit for FFS manipulation.  fc-tmsh's
		companion fc-fsio implements FFS access via TMFFS2, and we
		don't have a host side implementation for TI's older TMFFS1
		protocol.

fc-fsio		This utility speaks the TMFFS2 protocol over the TM/ETM RVTMUX
		channel (same channel as used by fc-tmsh, so don't try to run
		both at the same time) and implements fairly high-level FFS read
		and write operations.  fc-fsio is used to format and initialize
		the FFS on newly made devices in our hardware manufacturing
		environment, it can upload files or entire subtrees into target
		device FFS, it has higher-level commands for writing some files
		like the IMEI, rfcap and AT+CGxx ID strings, and it can list and
		read out FFS content.  Unlike fc-tmsh, fc-fsio is synchronous:
		it is built on command-response (send a command and expect a
		response) primitives, and a single user command can turn into a
		large number of command-response exchanges on the RVTMUX
		interface.  fc-fsio also implements a few non-FFS commands
		because they naturally fit into this ETM synchronous model.

fc-shell	This tool is asynchronous like fc-tmsh, but instead of talking
		and listening on the TM/ETM RVTMUX channel, it talks and listens
		on GPF's channel and on the new AT-over-RVTMUX channel which we
		added in FreeCalypso.  fc-shell can be used to issue system
		primitive commands to GPF (and to see firmware responses to
		them), and to talk AT commands via RVTMUX.

Finally, if you only need to passively observe the firmware's debug trace output
and don't need to make any active pokes at the target, our rvtdump utility is a
stripped-down version of rvinterf (or historically its predecessor) that only
decodes and prints/logs the output from the target without sending anything to
it.

Further reading
===============

Believe it or not, some of the documentation that was written by the original
vendors of the software in question and which we've been able to locate turns
out to be fairly relevant and helpful, such that I recommend reading it.

Documentation for Nucleus PLUS RTOS:

	ftp://ftp.freecalypso.org/pub/embedded/Nucleus/nucleus_manuals.tar.bz2

	Quite informative, and fits our version of Nucleus just fine.

Riviera environment:

	ftp://ftp.freecalypso.org/pub/GSM/Calypso/riviera_preso.pdf

	It's in slide presentation form, not a detailed technical document, but
	it covers a lot of points, and all that Riviera stuff described in the
	preso *is* present in our fw for real, hence it should be considered
	relevant.

GPF documentation:

	https://www.freecalypso.org/LoCosto-docs/SW%20doc/frame_users_guide.pdf
	https://www.freecalypso.org/LoCosto-docs/SW%20doc/vsipei_api.pdf

	Very good reading, helped me understand GPF when I first reached this
	part of firmware reintegration.

TCS3.x/LoCosto fw architecture:

	https://www.freecalypso.org/LoCosto-docs/SW%20doc/TCS2_1_to_3_2_Migration_v0_8.pdf
	ftp://ftp.freecalypso.org/pub/GSM/LoCosto/LoCosto_Software_Architecture_Specification_Document.pdf

	These TI docs focus mostly on how they changed the fw architecture from
	their TCS2.x program (Calypso) to their newer TCS3.x (LoCosto), but one
	can still get a little insight into the "old" TCS211 architecture they
	were moving away from, which is the architecture we've adopted for
	FreeCalypso.