gsm-net-reveng: doc/TFO-xform/Theory comparison

comparison doc/TFO-xform/Theory @ 33:e828468b0afd

doc/TFO-xform/Theory: article written

author	Mychaela Falconia <falcon@freecalypso.org>
date	Sat, 31 Aug 2024 20:45:25 +0000


-:f6bb790e186a
+:e828468b0afd
+TFO transform from uplink to downlink
+=====================================
+With all 3 classic GSM codecs (FRv1, HRv1, EFR) the original architecture calls
+for a network-side transcoder (TRAU) on each individual call leg.  The
+implications are:
+* The uplink runs from the MS to the speech decoder in the TRAU that turns the
+mobile-generated speech into 64 kbit/s G.711.  The Rx DTX handler, a subblock
+of that speech decoder in the TRAU, handles error concealment (substitution
+and muting of lost frames) and comfort noise insertion during DTXu pauses,
+and once this speech stream has been transcoded to G.711, all trace of these
+GSM-specific effects disappears.
+* The downlink runs from the speech encoder in the TRAU to TCH DL radio output
+from the BTS.  Because the DL frame stream comes from a free-running speech
+encoder, it never contains errored frames or invalid SID or any other
+aberrations: without DTXd, this frame stream is 100% good speech frames, and
+with DTXd, it is a mixture of good speech and valid SID frames.
+But suppose you have two mobile call legs (mobile user Alice calls mobile user
+Bob), and you wish to eliminate the quality-degrading effect of double or tandem
+transcoding by passing compressed speech frames directly from Alice to Bob and
+vice-versa - what happens now?  The UL frame stream from each call leg will
+contain BFI frame gaps that are never allowed in DL, and if the network deploys
+DTX only in the UL direction (DTXu without DTXd, a very sensible choice for
+small-capacity single-carrier cells), the representation of DTXu pauses coming
+from each call leg (SID frames followed by prolonged BFI gaps) is also not
+suitable for direct passing to the DL of the opposite call leg.
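The mismatch between what UL can contain and what DL may carry can be made concrete with a small check. This is only an illustrative sketch: the `FrameKind` names and both functions are hypothetical and do not come from any spec or library, and real frames would of course need codec-specific classification first.

```python
from enum import Enum


class FrameKind(Enum):
    """Hypothetical frame classification used only for this illustration."""
    SPEECH = "speech"            # good speech frame
    SID = "sid"                  # valid SID frame
    INVALID_SID = "invalid_sid"  # SID frame that is not valid
    BFI = "bfi"                  # lost or errored frame


def allowed_on_dl(dtxd):
    """Frame kinds a free-running TRAU encoder could ever emit on DL."""
    allowed = {FrameKind.SPEECH}
    if dtxd:
        # With DTXd the DL stream is a mixture of good speech and valid SID.
        allowed.add(FrameKind.SID)
    return allowed


def dl_violations(stream, dtxd):
    """Return (index, kind) for every frame that could not be passed to DL as-is."""
    allowed = allowed_on_dl(dtxd)
    return [(i, kind) for i, kind in enumerate(stream) if kind not in allowed]


# A UL stream with one errored speech frame, then a DTXu pause:
# a SID followed by a BFI gap.
ul = [FrameKind.SPEECH, FrameKind.BFI, FrameKind.SPEECH,
      FrameKind.SID, FrameKind.BFI, FrameKind.BFI]
print(dl_violations(ul, dtxd=False))
print(dl_violations(ul, dtxd=True))
```

Every index reported by `dl_violations()` is a frame position where something other than the received frame has to be put into the stream going to the opposite call leg.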
+The solution offered in the TFO spec (GSM 08.62) is a special transform from
+call leg A UL to call leg B DL.  This transform has no official name that I
+could find, but I call it "TFO transform".  In the original GSM 08.62 spec (up
+to R99) this TFO transform is described in sections 8.2.1 and 8.2.2; when the
+spec changed to 28.062 with 3GPP Release 4 (adding AMR in GSM and AMR-only
+UMTS), the description of TFO transform for classic GSM codecs moved to section
+C.3.2.1.1.
+However, both spec versions only say what "shall" be done without any guidance
+on how to do it algorithmically: the spec language is "subject to manufacturer
+dependent future improvements and is not part of this recommendation."
+Distilling the problem to its essence, the addition of TFO introduces a new type
+of logical transform on codec frames (and a stateful one at that!) that never
+appeared previously anywhere in classic GSM architecture, is not mentioned in
+any other spec, and is not addressed at all by any of the reference codec
+sources.  This new transform is implemented only in the TFO block in TRAUs and
+nowhere else (in classic GSM architecture), and can be exercised only by
+establishing a TFO call between two interworking TRAUs.
+There are 3 main parts to this TFO transform, 3 main areas where anyone who
+seeks to implement this transform has to think hard and come up with an
+innovative solution:
+1) Error concealment in non-DTX speech: if an errored frame (BFI) appears after
+non-SID speech frames (meaning non-DTX speech), the transform has to fill in
+substitution/muting "speech" frames (meaning codec frames that look like
+valid speech frames) in the stream going to call leg B DL.
+2) Comfort noise insertion: if the incoming frame stream from call leg A UL
+contains SID frames (DTXu) but the same are not allowed on call leg B DL
+(no DTXd), the transform has to insert "speech" frames (in the same
+parenthetical meaning) that represent comfort noise, as intended by Alice's
+phone that transmitted SID with certain CN parameters.
+3) Comfort noise muting: handling the case where the incoming UL frame stream
+goes into CN insertion state (via one or more SID frames), but then goes
+total BFI, with no more SID update frames appearing in TAF positions.  In
+the case of a single codec leg from a source encoder to an end decoder,
+standard decoders are required by their respective DTX specs to gradually
+mute their CN output, to indicate channel breakdown to the user - the TFO
+transform has to produce the same effect.
+All 3 of the just-listed functions are explicitly called out in the TFO spec, in
+each case with the same language of "shall" followed by "subject to manufacturer
+dependent future improvements and is not part of this recommendation."
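The stateful nature of the transform and the division into the 3 functions can be sketched as a small state machine. This is a structural sketch only, not an implementation of any codec's algorithm: the class name, the string labels and the `MUTE_AFTER_LOST_SID` threshold are all assumptions made for illustration, invalid SID handling is left out, and the output is a label naming which function would have to produce the frame rather than an actual codec frame.

```python
# Assumed placeholder: how many TAF positions may pass without a SID update
# before CN muting begins.  The real rule comes from each codec's DTX spec.
MUTE_AFTER_LOST_SID = 2


class TfoTransformSketch:
    """Tracks just enough state to pick one of the 3 functions per frame."""

    def __init__(self):
        self.in_cn = False  # True once a SID has started CN insertion
        self.lost_sid = 0   # TAF positions seen without a SID update

    def step(self, kind, taf=False):
        """kind is "speech", "sid" or "bfi"; taf marks a TAF frame position."""
        if kind == "speech":
            self.in_cn = False
            self.lost_sid = 0
            return "pass"            # good speech goes through unchanged
        if kind == "sid":
            self.in_cn = True
            self.lost_sid = 0
            return "cn_insert"       # function 2: "speech" frame carrying CN
        # Everything else is treated as BFI in this sketch.
        if not self.in_cn:
            return "conceal"         # function 1: substitution/muting frame
        if taf:
            self.lost_sid += 1
        if self.lost_sid >= MUTE_AFTER_LOST_SID:
            return "cn_mute"         # function 3: gradually muted CN
        return "cn_insert"           # still inside a normal DTXu pause


xform = TfoTransformSketch()
stream = [("speech", False), ("bfi", False), ("sid", False),
          ("bfi", False), ("bfi", True), ("bfi", True)]
print([xform.step(kind, taf) for kind, taf in stream])
```

A real implementation would also need per-codec state behind each label, such as the last good frame parameters for concealment and the current CN parameters for insertion and muting.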
+
+DTXd or no DTXd
+===============
+When the destination call leg operates without DTXd, the TFO transform can only
+emit frames that are well-formed speech frames for the respective codec, no SID
+frames.  In this case the transform has to do "everything", all 3 of the listed
+functions, although the last function of CN muting may be either separate or
+absorbed into the CN generation function depending on the codec.
+OTOH, when call leg B has DTXd enabled/allowed, there is more room for
+additional complexity.  The simplest solution would be to not make use of DTXd
+capability and always emit speech frames - but the problem with this simple
+approach is teleological.  If a GSM network operator runs with DTXd enabled,
+presumably that operator seeks to reap the benefits of DTXd as in reduction of
+radio interference, in which case a TFO transform that fails to make use of DTXd
+capability would defeat the purpose.  Hence if someone sets out to implement a
+TFO transform that supports full utilization of DTXd, they would have to do
+additional work:
+* The function of CN insertion in the transform _mostly_ goes away: if a valid
+SID frame comes, the TRAU caches it and repeats it continuously until the
+next SID update, allowing the BTS to select which SID frames it will actually
+transmit based on its SACCH alignment.  But more complex handling is still
+needed if the first SID frame (the one that begins the CN insertion period)
+came in as an invalid SID, and the function of CN muting takes on new
+significance.
+* CN muting: when the cached SID expires and no new SID updates arrive in TAF
+positions, the TFO transform has to indicate somehow to Bob that Alice's call
+leg is having trouble, which will be easy or difficult depending on what rules
+are specified in the codec specs for SID interpolation in the final receiver.
+* Error concealment in non-DTX speech: at first glance this function appears to
+be exactly the same whether DTXd is used or not.  But consider the case of
+total channel breakdown, such that the incoming frame stream becomes all BFI:
+how should this case be handled?  In the absence of DTXd, the output of the
+TFO transform becomes a stream of silence frames, meaning some kind of
+"speech" frames that produce total silence at the end decoder.  But if the
+network operates with DTXd with the aim of reducing radio interference, these
+silence "speech" frames should be replaced with SIDs whose parameters are
+chosen to produce silent output.
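For the DTXd-enabled case, the SID caching behaviour from the first two bullets could look roughly like the following. This is a hedged sketch under stated assumptions: the class, its method and the `expire_after_lost_taf` parameter are hypothetical, the expiry count is a placeholder rather than a value taken from any DTX spec, and neither invalid SID handling nor the replacement of silence frames with silent SIDs is modelled.

```python
class DtxdSidCacheSketch:
    """Caches the last valid SID and repeats it toward call leg B DL."""

    def __init__(self, expire_after_lost_taf=2):
        self.cached_sid = None  # payload of the last valid SID, if any
        self.lost_taf = 0       # TAF positions seen without a SID update
        self.expire_after_lost_taf = expire_after_lost_taf  # assumed value

    def step(self, kind, payload=None, taf=False):
        """Return (label, payload) describing what goes to the BTS."""
        if kind == "speech":
            self.cached_sid = None
            self.lost_taf = 0
            return ("speech", payload)
        if kind == "sid":
            self.cached_sid = payload
            self.lost_taf = 0
            return ("sid", payload)
        # Everything else is treated as BFI in this sketch.
        if self.cached_sid is None:
            return ("conceal", None)   # error concealment, not modelled here
        if taf:
            self.lost_taf += 1
        if self.lost_taf >= self.expire_after_lost_taf:
            return ("cn_mute", None)   # cached SID expired: muting needed
        # Repeat the cached SID; the BTS picks which copies to transmit.
        return ("sid", self.cached_sid)


cache = DtxdSidCacheSketch()
print(cache.step("sid", b"cn-params"))
print(cache.step("bfi"))
print(cache.step("bfi", taf=True))
print(cache.step("bfi", taf=True))
```

What exactly has to be emitted for the `"cn_mute"` label is the hard part, since it depends on the SID interpolation rules of the codec in question.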
+
+Current approach in Themyscira libraries
+========================================
+There is a desire to implement the TFO transform for all 3 classic GSM codecs
+in the Themyscira Wireless GSM codec libraries suite, and the first question to
+be decided is the policy with regard to DTXd.
+The current approach is to not implement any DTXd support, i.e., implement the
+TFO transform only in its no-DTXd basic form.  This decision is based on the
+reality of small-capacity single-carrier cells: given that the
+total number of humans who actually _want_ to use GSM (as opposed to whatever
+latest 4G/5G/etc is peddled by Big Tech mafia) is vanishingly small, there is
+currently no justification for building higher-capacity GSM cells that use more
+than a single 200 kHz radio carrier.  And if each GSM cell consists of only one
+radio carrier (the BCCH carrier, also called C0 in the specs), then physical
+DTXd (as in actually turning off radio Tx, as opposed to "logical" DTXd where
+that effect is merely faked for the MS by transmitting dummy bursts or
+induced-BFI frames) is simply impossible.  Therefore, in the present state of
+the human condition, there is no justification for expending the effort to implement
+additional complexity for proper DTXd.
