view doc/Calypso-TCH-downlink @ 479:616b7ba1135b

doc/AMR-library-API: document AMR-EFR hybrid decoder
author Mychaela Falconia <falcon@freecalypso.org>
date Sun, 19 May 2024 22:22:40 +0000

It has been discovered that the implementation of standard signal processing
chains for speech TCH downlink and uplink in the DSP ROM in the Calypso GSM
baseband processor allows these signal processing chains to be tapped at certain
points, as detailed in the TCH-tap-modes article in our freecalypso-docs Hg
repository.  There is a mechanism to capture the stream of received traffic
frames on TCH DL, and there is another mechanism by which an externally supplied
stream can be "played" into TCH UL.

I (Mother Mychaela) previously played with this functionality back in 2016, and
it's been mostly shelved since then.  This functionality became interesting
once again in late 2022: now that we have a proper set of codec libraries (the
present package) and a proper understanding of Rx DTX handling requirements, we
can take another shot at decoding TCH downlink captures taken from Calypso GSM
MS.

The overall functionality is described in the TCH-tap-modes article in
freecalypso-docs; the mechanism for capturing TCH DL bits from Calypso DSP is
split between FreeCalypso GSM MS firmware (added to FC Tourmaline as of
2022-12-13) and the fc-shell utility in the FC host tools package, updated as
of fc-host-tools-r18 to support the new FreeCalypso fw.  There is also a set of
utilities included in the present GSM codec libraries & utilities package for
parsing and decoding these Calypso TCH DL captures; the present document
describes these utilities.

As explained in the TCH-tap-modes article in freecalypso-docs, the mechanism
for capturing TCH DL is currently implemented for TCH/FS, TCH/HS and TCH/EFS,
corresponding to FR1, HR1 and EFR codecs.  However, further parsing and decoding
support has only been implemented for FR1 and EFR codecs in the present package,
in the form of the following utilities:

gsmfr-dlcap-parse	This program reads a TCH/FS DL capture file and parses
			it for human analysis.  All input fields are passed
			through to the output, but the program also computes
			the ternary SID flag of GSM 06.31 section 6.1.1 from
			the payload bits (for comparison against what the DSP
			wrote in its status word 0) and prints all broken-down
			parameter fields of each GSM 06.10 FR1 codec frame.

gsmfr-dlcap-gsmx	This program reads a TCH/FS DL capture file and converts
			it into an extended-libgsm (gsmx) file containing a mix
			of FR1 codec frames and Themyscira BFI markers.  The
			latter BFI markers will be emitted in those frame
			positions where FACCH was received instead of speech,
			or where the DSP otherwise indicated BFI=1.  The gsmx
			output from this utility needs to be fed to gsmfr-decode
			from the present package, so that our FR1 Rx DTX
			preprocessor will take care of SIDs and BFIs, completing
			the required GSM MS processing chain for TCH/FS DL.

gsmefr-dlcap-parse	This program reads a TCH/EFS DL capture file and parses
			it for human analysis.  All input fields are passed
			through to the output, but the program also computes
			the ternary SID flag of GSM 06.81 section 6.1.1 from
			the payload bits (for comparison against what the DSP
			wrote in its status word 0) and prints all broken-down
			parameter fields of each EFR codec frame.  Finally, each
			triplicated bit group of GSM 05.03 section 3.1.1.2 is
			printed as an octal digit, to aid human analysis of how
			the DSP writes these bits in its a_dd_0 buffer.

gsmefr-dlcap-gsmx	This program reads a TCH/EFS DL capture file and
			converts it into a gsmx binary file, containing a mix
			of EFR codec frames and Themyscira BFI markers.  The
			latter BFI markers will be emitted in those frame
			positions where FACCH was received instead of speech,
			or where the DSP otherwise indicated BFI=1.  The gsmx
			output from this utility needs to be fed to
			gsmefr-decode (or gsmefr-decode-r) from the present
			package.

gsmefr-dlcap-dec	This program reads a TCH/EFS DL capture file and feeds
			it directly to the EFR reference decoder implemented in
			libgsmefr, without going through a gsmx intermediary.
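
Two of the small computations named in the descriptions above can be
illustrated with a short Python sketch.  This is not code from the present
package: the function names are hypothetical, the SID thresholds (fewer than 2
deviations for a valid SID, fewer than 16 for an invalid SID) are assumptions
to be checked against GSM 06.31 and GSM 06.81, and the bit order within each
triplicated group is likewise assumed.  The selection of which payload bits
form the SID field is codec-specific and is left to the caller.

```python
def classify_sid(sid_field_bits, codeword_bits):
    """Return the ternary SID flag (2, 1 or 0) for one received frame.

    sid_field_bits: received bits at the SID code word positions
    codeword_bits:  expected SID code word bits at the same positions
    Both are equal-length sequences of 0/1 integers.
    """
    if len(sid_field_bits) != len(codeword_bits):
        raise ValueError("SID field and code word must have equal length")
    # Count the positions where the received field deviates from the code word.
    deviations = sum(r != c for r, c in zip(sid_field_bits, codeword_bits))
    if deviations < 2:      # assumed threshold, see lead-in
        return 2            # valid SID frame
    if deviations < 16:     # assumed threshold, see lead-in
        return 1            # invalid SID frame
    return 0                # speech frame


def triplet_octal(b0, b1, b2):
    """Pack one triplicated bit group into a single octal digit character.

    Assumes b0 is the most significant bit of the digit.
    """
    return "01234567"[(b0 << 2) | (b1 << 1) | b2]
```

With this packing, a digit of 0 or 7 means that all three copies of the bit
agree; any other digit shows a disagreement within the group.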

Additional notes:

* The new gsmfr-dlcap-gsmx utility described above replaces the old fc-tch2fr
  utility from FC host tools - the latter should now be considered a bogon.
  The required GSM MS processing chain for TCH/FS DL includes the step of Rx
  DTX handler between the output of GSM 05.03 channel decoder and the input of
  GSM 06.10 speech decoder; the old chain of fc-tch2fr followed by libgsm
  decoding omitted this critical step and thus produced very unkind-on-ears
  sounds.

* gsmefr-dlcap-dec has been written as a bold attempt to replicate the complete
  Rx DTX handler and speech decoder (the part of TCH DL processing chain that
  sits past the a_dd_0 buffer) as they are implemented inside TI's DSP.  Such a
  feat won't be possible for FR1 codec (other than by a Herculean effort of full
  static reversing of the DSP ROM) because there is no bit-exact definition of
  FR1 Rx DTX functions in GSM specs, but for EFR there is a bit-exact reference
  implementation from ETSI.  *If* TI's DSP matches this bit-exact reference
  (there are some aspects of Rx DTX handling where this bit-exact reference is
  considered to be an example rather than normative, see GSM 06.61), then there
  is a chance we could replicate TI's DSP chain externally - but only if we can
  figure out exactly how the bits of a_dd_0[0] drive the logic of their Rx DTX
  handler.  The Mother's plan is to capture the DSP's decoded speech output from
  MCSI on an FCDEV3B using a small FPGA board with a PCM-to-UART logic function,
  while simultaneously capturing TCH DL bits in the a_dd_0 buffer, then run
  gsmefr-dlcap-dec on the captured TCH DL booty and see if we can replicate the
  DSP's end output - but until then, this gsmefr-dlcap-dec program should be
  treated as an unfinished experiment in progress.

* In the case of FR1 codec, there is no prescribed bit-exact definition for the
  Rx DTX handler (GSM 06.11, 06.12 and 06.31 specs define general requirements,
  but aren't bit-exact in most aspects), and the way in which we (Themyscira
  Wireless) have implemented our FR1 Rx DTX handler (libgsmfr2 in the present
  package) perfectly matches our gsmx binary file format for good vs bad frames.
  Therefore, in the case of FR1 codec there is nothing to be gained by skipping
  gsmx and calling library functions directly, and thus there is no FR1
  counterpart to gsmefr-dlcap-dec - just use gsmfr-dlcap-gsmx followed by
  gsmfr-decode or gsmfr-decode-r.

* In addition to TCH DL capture files, gsmfr-dlcap-parse also accepts the hex
  output from fc-vm2hex, originating from TCS211 voice memo recordings,
  including fc-vm2hex output in the case of VM recordings made in DTX mode.
  However, if the objective is to play that VM recording and not just look at
  parsed bits, the correct approach is to convert the VM file to gsmx with
  fc-vm2gsmx, and then decode with gsmfr-decode.  Using fc-vm2hex followed by
  gsmfr-dlcap-gsmx instead of fc-vm2gsmx won't work!

Catching the output of the network-side speech encoder
======================================================

The set of FR1 test sequences included with later versions of GSM 06.10 specs
and the set of EFR test sequences in GSM 06.54 each include special
synchronization sequences that can be fed to the G.711 PCMA or PCMU input of
the TRAU in the downlink direction, together with the set of 160 possible
speech encoder outputs that can result from the TRAU processing that DL input.
Which of the 160 outputs is produced depends on the alignment between the input
and the location of 20 ms frame boundaries for the encoder.  In the case of EFR,
there is a second dimension of uncertainty when experimenting with GSM networks
that aren't your own: in addition to the unknown alignment of G.711 input (160
possibilities), there is the unknown of whether the network transcoder
implements classic EFR or an AMR-EFR hybrid - see our AMR-EFR-philosophy and
AMR-EFR-hybrid-emu articles.  However, thanks to the work we did in the
vband-misc Hg repository, we now have a fully backward-compatible extended
version of ETSI's seqsync[au].inp TRAU DL inputs (the last frame of 160 samples
that isn't EHF is simply sent twice) that allows us to distinguish between the
two possible styles of EFR implementation in the network transcoder, producing
320 possible outputs on GSM Um DL: 160 possible alignments times two possible
EFR implementation options.

However, tools are still needed on the GSM MS side of the test setup, reading
the TCH DL capture produced with FreeCalypso tools and detecting which of the
possible 160 (FR1) or 320 (EFR) encoded frames has been produced.  (320 or
even 160 possible frames is too many to check by hand!)  These tools are
provided in gsmfr-dlcap-sync and gsmefr-dlcap-sync, added to the Themyscira GSM
codec libraries and utilities suite as of gsm-codec-lib-r3.  Each of these
utilities takes two command line arguments: the name of the TCH DL capture file
to read and analyze, and an "alaw" or "ulaw" keyword argument selecting the
match table to use.  Specify alaw if you are feeding seqsynca.inp to a
PCMA-native TRAU or other GSM network speech transcoder, or ulaw if you are
feeding seqsyncu.inp to a PCMU-native network.  The program will read the
entire TCH DL capture, looking for matches, and will report any matches it
finds.
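
The command line described above can be sketched in Python as follows.  The
helper name sync_command and the capture file name in the usage note are
hypothetical; the argument order (capture file first, then the alaw/ulaw
keyword) follows the description above and should be confirmed against the
tools themselves.

```python
# Map each codec to the sync utility named in the text above.
SYNC_TOOLS = {"FR1": "gsmfr-dlcap-sync", "EFR": "gsmefr-dlcap-sync"}


def sync_command(codec, capture_file, law):
    """Build the argv list for a dlcap-sync run (hypothetical helper).

    codec:        "FR1" or "EFR"
    capture_file: path to the TCH DL capture file
    law:          "alaw" for seqsynca.inp, "ulaw" for seqsyncu.inp
    """
    if codec not in SYNC_TOOLS:
        raise ValueError("codec must be FR1 or EFR")
    if law not in ("alaw", "ulaw"):
        raise ValueError("law must be alaw or ulaw")
    # Assumed argument order: capture file, then match table keyword.
    return [SYNC_TOOLS[codec], capture_file, law]
```

For example, subprocess.run(sync_command("EFR", "call1.dlcap", "alaw"),
check=True) would run the EFR tool on a capture file named call1.dlcap (a
made-up name) with the A-law match table.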

Both gsmfr-dlcap-sync and gsmefr-dlcap-sync implement the logic of looking for
the respective codec's DHF followed by one of 160 (FR1) or 320 (EFR) distinct
encoded frames.  In the case of EFR, if the network transcoder implements
AMR-EFR and the alignment shift happens to be in the [120,159] range, there
will also be an MR122 DHF sandwiched between the standard EFR DHF and the
distinct encoded frame (unique for each of the 40 possible alignments in this
range) if the AMR-EFR hybrid is implemented like our amr_dhf_subst_efr2()
function, matching the network of T-Mobile USA.  gsmefr-dlcap-sync looks for
both EFR and MR122 DHF.  For matches to an AMR-EFR offset in the [120,159]
range, the tool reports whether the unique frame was preceded by EFR or MR122
DHF, which shows how the alien network transcoder implements its DHF
transformation.  For all other matches, seeing MR122 DHF is an unexpected error
condition, and it is reported as such.
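
The matching logic described above can be sketched in Python.  This is an
illustration, not the code of gsmefr-dlcap-sync: the function name, the
representation of frames as bytes objects, and the layout of the match table
are all hypothetical, and the actual DHF bit patterns and table contents are
not reproduced here.

```python
def find_sync(frames, efr_dhf, mr122_dhf, match_table):
    """Look for a DHF immediately followed by a known distinct frame.

    frames:      list of received frames (bytes objects, or None for BFI)
    efr_dhf:     the EFR decoder homing frame
    mr122_dhf:   the MR122 decoder homing frame
    match_table: dict mapping distinct frame -> (style, offset), with
                 style "EFR" or "AMR-EFR" and offset in 0..159
    Returns a list of (index, style, offset, dhf_kind, unexpected) tuples.
    """
    results = []
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        if cur not in match_table:
            continue
        if prev == efr_dhf:
            dhf_kind = "EFR"
        elif prev == mr122_dhf:
            dhf_kind = "MR122"
        else:
            continue            # distinct frame not preceded by any DHF
        style, offset = match_table[cur]
        # MR122 DHF is only expected for AMR-EFR matches in [120,159].
        hybrid_tail = style == "AMR-EFR" and 120 <= offset <= 159
        unexpected = dhf_kind == "MR122" and not hybrid_tail
        results.append((i, style, offset, dhf_kind, unexpected))
    return results
```

In this sketch the FR1 case is the same loop with a 160-entry table and no
MR122 DHF to consider.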

These tools cover just one step in the workflow of reverse-engineering an alien
GSM network's speech transcoder and confirming if it matches standard EFR or
the AMR-EFR hybrid as currently found in the wild.  The complete workflow in
the GSM downlink direction will typically be as follows:

1) Using the sipout-test-voice utility from the sipout-test-utils suite,
   establish a test call from IP-PSTN to a test MS served by the GSM network
   under study.  AT%SPVER will typically need to be used to cause the network
   to assign the desired codec on this call.

2) While making a TCH DL recording on the FreeCalypso MS used in this test,
   play seqsync[au].inp (or the extended version with the last frame sent twice)
   into the G.711 PCM stream from the IP-PSTN side, using the 'play' command of
   sipout-test-voice.

3) Run gsmfr-dlcap-sync or gsmefr-dlcap-sync on the DL recording from the
   previous step, as appropriate.

4) Once the alignment is known, use the 'play-offset' command in
   sipout-test-voice to play a longer test sequence into the same call, and
   have another TCH DL recording running on the test MS.

5) If the longer test sequence begins with the same seqsync[au].inp preamble
   (which is recommended), playing it with the correct offset from the IP-PSTN
   side should result in gsm[e]fr-dlcap-sync reporting zero offset on the new
   DL recording.  The output of gsm[e]fr-dlcap-sync on this second DL capture
   should also indicate the line number where the interesting part begins.

6) Extract the part of interest identified in the previous step, convert it to
   gsmx format with gsm[e]fr-dlcap-gsmx, and compare it against the expected
   encoded frame sequence.
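
The comparison in the last step can be sketched in Python, assuming both the
converted capture and the expected sequence have already been split into lists
of frames.  The function name is hypothetical, and parsing of the gsmx format
is deliberately not shown.

```python
def first_mismatch(received, expected):
    """Return the index of the first differing frame, or None if identical.

    received, expected: lists of frames (for example bytes objects).
    If one list is a strict prefix of the other, the index just past the
    shorter list is returned.
    """
    for i, (r, e) in enumerate(zip(received, expected)):
        if r != e:
            return i
    if len(received) != len(expected):
        return min(len(received), len(expected))
    return None
```

A result of None means the network transcoder output matched the expected
frame sequence over the whole extracted part.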