view doc/Nucleus-change @ 303:f76436d19a7a default tip

!GPRS config: fix long-standing AT+COPS chance hanging bug There has been a long-standing bug in FreeCalypso going back years: sometimes in the AT command bring-up sequence of an ACI-only MS, the AT+COPS command would produce only a power scan followed by cessation of protocol stack activity (only L1 ADC traces), instead of the expected network search sequence. This behaviour was seen in different FC firmware versions going back to Citrine, and seemed to follow some law of chance, not reliably repeatable. This bug has been tracked down and found to be specific to !GPRS configuration, stemming from our TCS2/TCS3 hybrid and reconstruction of !GPRS support that was bitrotten in TCS3.2/LoCosto version. ACI module psa_mms.c, needed only for !GPRS, was missing in the TCS3 version and had to be pulled from TCS2 - but as it turns out, there is a new field in the MMR_REG_REQ primitive that needs to be set correctly, and that psa_mms.c module is the place where this initialization needed to be added.
author Mychaela Falconia <falcon@freecalypso.org>
date Thu, 08 Jun 2023 08:23:37 +0000
parents a1799f6d6aa7
children
line wrap: on
line source

The specific integration of ATI Nucleus PLUS RTOS in TI's stable TCS211 fw
(which served as the baseline for several vendors' production fw) exhibits one
hair-raising bug.  While we don't know for sure where and how they maintained
Nucleus library sources for compilation (the version we got has them censored
out), we do see that Nucleus header files (nucleus.h and ??_defs.h) exist in
two different locations in the source tree in two different versions:

* One version exists under chipsetsw/os/nucleus
* The other version exists under gpf/inc/nuc & gpf/inc/nuc/arm7

The two versions of these header files under these two paths in TCS211 are not
the same!  The main nucleus.h header file is the same in both places, cs_defs.h
and tm_defs.h versions differ only in comments, but tc_defs.h is the real
kicker: the version under gpf/inc/nuc has an extra field added to the TC_HCB
aka NU_HISR structure, making this structure one word longer than in the other
version!  More specifically, in ATI's original Nucleus this structure is 22
words long with 4 unused dummy words at the end; TI's GPF version adds a fifth
dummy word (thankfully toward the end, not shifting any actually-used members
of the struct), putting the total struct size at 23 words.

It would be one thing if TI had made this change consistently, but they didn't:
some modules were compiled with one version of the headers and got the 22-word
version of the struct, while other modules were compiled with the other header
file version and got the 23-word version of the struct.  How can their fw work
with this bug in it?  Answer: TCS211 fw works despite this Nucleus integration
bug because:

* None of the actually-used members of the struct change offsets between the
  two versions;

* Some places in the code have 22-word structs allocated in memory while other
  places have 23-word structs, but when they pass pointers to these structs to
  Nucleus API functions, those functions don't access past the actually-used
  part at the beginning (the part before dummy words), and they never do
  anything like zeroing out the full size of the expected struct.

* The only place in TCS211 fw where the total size of the struct matters is
  where NU_HISR is embedded in another structure, and there is one such place
  in GPF.  Here breakage would result if different modules using these structs
  and arrays were compiled with different header file versions, but all modules
  that touch this part are compiled with the GPF version of nucleus.h, NU_DEBUG
  and tc_defs.h.

Needless to say, resolving this bogosity has been an important part of
FreeCalypso firmware deblobbing.  Naturally the most ideal solution would have
been to remove the bogus extra word added by TI and consistently use the
original 22-word struct everywhere, but there is one further complication: I
(Mother Mychaela) don't feel comfortable with moving away from the original blob
version of the OSL component of GPF, and these COFF objects have been compiled
with the 23-word version of TC_HCB aka NU_HISR.

The following alternative approach has been implemented in FC Tourmaline:

* The new source version of Nucleus by Comrade XVilka has been checked in under
  src/nucleus, and this new source version is the one we are using instead of
  TI's binary object version.

* The new Nucleus header files src/nucleus/nucleus.h and src/nucleus/??_defs.h
  are the only ones used in Tourmaline - both old versions have been removed
  from active -I include paths.

* The new src/nucleus/tc_defs.h header file has been patched to replicate TI's
  23-word version of TC_HCB aka NU_HISR, and the NU_HISR_SIZE definition in
  src/nucleus/nucleus.h has also been adjusted to match.

Thus we are using the 23-word version of TC_HCB aka NU_HISR everywhere, with 5
dummy words at the end rather than 4, adding 4 extra bytes of wasted RAM space
to every instance of this struct throughout the firmware - but there are only a
small number of these instances, thus the waste is negligible.  In return we
gain 100% consistency (the same version of the struct is used everywhere in our
fw), and we retain the ability to keep the original OSL blobs which I am not
ready to give up.