diff doc/AMR-EFR-hybrid-emu @ 467:ad032051166a

doc: AMR-EFR-hybrid-emu new article
author Mychaela Falconia <falcon@freecalypso.org>
date Sun, 12 May 2024 23:54:43 +0000
parents doc/AMR-EFR-philosophy@9bcf65088006
children
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/AMR-EFR-hybrid-emu	Sun May 12 23:54:43 2024 +0000
@@ -0,0 +1,147 @@
+Emulation of other people's AMR-EFR hybrid implementations
+==========================================================
+
+[Please see AMR-EFR-philosophy article for background information on the
+ differences between classic GSM-EFR and the 12k2 mode of AMR, and how ETSI/3GPP
+ loosened their regulation on bit-exactness of EFR, then continue here.]
+
+Experiments reveal that the extant commercial GSM networks of T-Mobile USA and
+Telcel Mexico (and likely other countries' GSM networks too) use a GSM speech
+transcoder implementation that performs EFR encoding and decoding (for times
+when the MS declares no support for AMR and the network falls back to EFR) per
+the alternative which we call AMR-EFR hybrid.  The needed experiments are done
+by using a FreeCalypso phone or devboard as the MS (declaring yourself to the
+network as non-AMR-capable via AT%SPVER), capturing TCH DL and feeding TCH UL
+with FreeCalypso tools, and using a SIP-to-PSTN connectivity provider (BulkVS
+or Anveo) on the other end of the test call that allows the experimenter to
+receive the PCMU or PCMA sample stream coming out of the GSM network's speech
+transcoder and feed a crafted PCMU/PCMA sample stream in the other direction.
+
+In this setup, bit-exact details of how the GSM network under study (NUS)
+implements EFR decoding can be tested by feeding a controlled sequence of EFR
+codec frames (beginning with at least two DHFs) to GSM Um uplink and observing
+the PCMU or PCMA sample stream received on the IP-PSTN end of the call.
+Similarly, bit-exact details of how the NUS implements EFR encoding can be
+tested by feeding controlled PCMU/PCMA sample streams into the call from IP-PSTN
+and observing what the network emits on GSM Um downlink.  In the latter case,
+frame synchronization finding tricks described in ETSI/3GPP test sequence specs
+need to be included as part of the experiment.
+
+When these experiments were performed on the GSM networks of T-Mobile USA and
+Telcel Mexico, it was immediately apparent that they do not implement EFR
+following the original bit-exact code of GSM 06.53: feeding any of the original
+EFR test sequences from GSM 06.54 to the NUS does not produce matching results.
+However, when I tried feeding EFR codec frame sequences from amr122_efr.zip
+(the late addendum to GSM 06.54 for the AMR-EFR hybrid option) to GSM UL, the
+PCMU (T-Mobile USA) or PCMA (Telcel Mexico) output from the GSM network's EFR
+decoder matched _those_ test sequences, indicating that these networks use the
+AMR-EFR alternative implementation.
+
+Creating tinkerer-oriented FOSS tools that can emulate or replicate the poorly
+defined "EFR alternative 2" implemented by these extant commercial networks has
+been a sportive challenge ever since.  The present development in Themyscira
+GSM codec libraries and utilities suite is a step toward conquering that
+challenge: we are now able to replicate the mystery commercial transcoder in
+non-DTX operation, specifically:
+
+a) We can feed a SID-free stream of EFR codec frames to GSM UL, beginning with
+   DHF, and get the expected result on PCMU or PCMA;
+
+b) In the encoder direction, for the first 7 frames after EHF, before DTX is
+   allowed to kick in, we can get GSM DL output from the network that matches
+   our expectations.
+
+Encoder 5 ms delay and DHF transformation
+=========================================
+
+One of the differences between classic EFR and MR122 in the encoder direction is
+artificial delay of 5 ms introduced in the AMR version.  In true multirate
+operation this delay is needed to support seamless switching between codec
+modes, but when the only allowed codec rate is 12k2 (which is the case with EFR
+by definition), this delay is pure waste.  (Needless to say, an extra delay of
+5 ms is nothing compared to the egregious latencies introduced by today's ugly
+and horrible world of IP-based transport everywhere, but still...)  This
+artificial 5 ms delay in the encoder is the reason for the DHF difference
+between EFR and MR122 - but here is the wild part: instead of recognizing this
+artificial delay as unnecessary and wasteful for 12k2-only EFR and removing it
+from the AMR-EFR hybrid contraption, those commercial transcoder vendors and
+the people who prepared amr122_efr.zip for ETSI/3GPP (were they the same
+people?) kept this 5 ms encoder delay, keeping the whole encoder unchanged AMR
+except for whatever insane trickery they did to fit EFR DTX logic and EFR SID
+generation into it, but added special DHF transformation logic on the output of
+this AMR encoder to produce compliant EFR DHF when the input is EHF.
+
+Exactly how this DHF transformation is done in those actually-deployed AMR-EFR
+hybrid encoders is a bit of a mystery.  My first thought was to compare the
+speech parameters emitted by the AMR encoder against MR122 DHF, and if the
+result is a match, replace that MR122-DHF parameter set with EFR DHF.  This
+approach is implemented in the simple amr_dhf_subst_efr() function in libtwamr.
+One distinctive signature of this approach is that the output of a hybrid
+encoder following this method can never equal MR122 DHF: this one particular
+bit pattern is precluded from the set of possible outputs under all conditions.
+
+However, subsequent experiments quickly revealed that the logic implemented by
+the transcoder in the network of T-Mobile USA must be different.  One of the
+counter-intuitive effects of the 5 ms artificial delay in the MR122 encoder is
+what happens when the encoder is in its homed state and you feed it an input
+frame whose first 120 samples are all 0x0008, but some (as few as one or as many
+as all) of the last 40 samples are different.  This frame does not meet the
+definition of EHF and won't be recognized as such - the encoder won't get
+rehomed once again after processing this frame - yet the output will be
+bit-exact MR122 DHF.  How do those AMR-EFR hybrid encoders handle *this* case?
+
+Experiments on T-Mobile reveal that in the case in question, the encoded frame
+is emitted with the bit pattern of MR122 DHF, *not* transformed into EFR DHF.
+Because MR122-DHF output is impossible with an encoder that implements logic
+like our amr_dhf_subst_efr() first cut, we know (by modus tollens) that
+T-Mobile's implementation uses some different logic.
+
+Our new (current) working model is implemented in amr_dhf_subst_efr2(): we
+replace the output of the AMR encoder with EFR DHF if the raw encoder output
+was MR122 DHF *and* the input frame was EHF.  This version appears to match
+the observed behavior of T-Mobile USA so far.
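+
+The difference between the two working models can be sketched in C as follows.
+This is only an illustration under stated assumptions, not the libtwamr code:
+the function names subst_model1() and subst_model2() are hypothetical, and the
+two DHF arrays hold placeholder values rather than the real MR122 and EFR DHF
+parameter sets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SAMPLES	160	/* 20 ms of speech at 8 kHz */
#define NUM_PARAMS	57	/* speech parameters per 12k2 frame */
#define EHF_SAMPLE	0x0008	/* every sample of an encoder homing frame */

/* PLACEHOLDER patterns for illustration only: NOT the real DHF values. */
static const int16_t mr122_dhf[NUM_PARAMS] = { 1, 2, 3 };
static const int16_t efr_dhf[NUM_PARAMS] = { 4, 5, 6 };

/* Return 1 if all 160 input samples equal 0x0008, i.e. the frame is EHF. */
static int is_ehf(const int16_t *pcm)
{
	int i;

	for (i = 0; i < FRAME_SAMPLES; i++)
		if (pcm[i] != EHF_SAMPLE)
			return 0;
	return 1;
}

/* Return 1 if the encoder output parameters equal the MR122 DHF pattern. */
static int is_mr122_dhf(const int16_t *params)
{
	return memcmp(params, mr122_dhf, sizeof mr122_dhf) == 0;
}

/* First-cut model: substitute whenever the encoder output is MR122 DHF. */
static void subst_model1(int16_t *params)
{
	if (is_mr122_dhf(params))
		memcpy(params, efr_dhf, sizeof efr_dhf);
}

/* Current model: substitute only if the input frame was also EHF. */
static void subst_model2(int16_t *params, const int16_t *pcm)
{
	if (is_mr122_dhf(params) && is_ehf(pcm))
		memcpy(params, efr_dhf, sizeof efr_dhf);
}
```

+
+With an input frame that is 0x0008 everywhere except in its last 40 samples,
+subst_model1() turns an MR122-DHF output into EFR DHF, while subst_model2()
+leaves it alone, which is the behavior observed on T-Mobile USA.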
+
+EFR DHF in the decoder direction
+================================
+
+Decoder homing works the same way in all ETSI/3GPP-defined speech codecs: there
+is an explicit check against the known DHF bit pattern (first subframe only) at
+the beginning of the decoder (if the decoder is homed and the input is DHF per
+this reduced check, artificially emit EHF, stay homed and do nothing more), and
+a second similar check against the known DHF bit pattern (full frame comparison
+this time) at the end of the decoder, triggering the state reset function on
+match.  These checks are (and can only be) implemented by explicit comparison
+against a known hard-coded DHF pattern - hence it doesn't matter in the decoder
+case whether the DHF is natural (as in all properly ETSI-defined codecs) or
+artificial as in AMR-EFR hybrid.  Thus the "correct" handling of DHF in the
+AMR-EFR hybrid decoder is a matter of replacing the check against MR122 DHF bit
+pattern with a check against the different bit pattern of EFR DHF.
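+
+The two-check structure described above can be sketched in C as follows.  This
+is a hedged illustration, not the libtwamr decoder: struct dec_sketch,
+dec_core() and dec_frame() are hypothetical names, dec_core() is a stand-in
+that emits silence instead of decoding, and the count of 18 parameters for the
+first-subframe check (5 LSP parameters plus 13 for the first subframe) is an
+assumption.  The DHF pattern is passed in as an argument, so switching from
+MR122 DHF to EFR DHF amounts to passing a different table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SAMPLES	160	/* 20 ms of speech at 8 kHz */
#define NUM_PARAMS	57	/* speech parameters per 12k2 frame */
#define SUBFR1_PARAMS	18	/* assumed: 5 LSP + 13 first-subframe params */
#define EHF_SAMPLE	0x0008	/* every sample of an encoder homing frame */

/* Hypothetical decoder state: only the "homed" flag matters here. */
struct dec_sketch {
	int homed;
};

/* Compare the first n parameters of two frames. */
static int params_match(const int16_t *a, const int16_t *b, int n)
{
	return memcmp(a, b, n * sizeof(int16_t)) == 0;
}

/* State reset function: puts the decoder into its home state. */
static void dec_reset(struct dec_sketch *st)
{
	st->homed = 1;
}

/* Stand-in for the real decoder body: emits silence, leaves home state. */
static void dec_core(struct dec_sketch *st, const int16_t *params,
		     int16_t *pcm)
{
	(void) params;
	memset(pcm, 0, FRAME_SAMPLES * sizeof(int16_t));
	st->homed = 0;
}

/* Decode one frame; dhf selects which DHF pattern the checks use. */
static void dec_frame(struct dec_sketch *st, const int16_t *dhf,
		      const int16_t *params, int16_t *pcm)
{
	int i;

	/* Check 1: already homed and input is DHF up to first subframe. */
	if (st->homed && params_match(params, dhf, SUBFR1_PARAMS)) {
		for (i = 0; i < FRAME_SAMPLES; i++)
			pcm[i] = EHF_SAMPLE;
		return;		/* stay homed, do nothing more */
	}
	dec_core(st, params, pcm);
	/* Check 2: full-frame DHF comparison triggers the state reset. */
	if (params_match(params, dhf, NUM_PARAMS))
		dec_reset(st);
}
```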
+
+The decoder engine in libtwamr supports this different-DHF option for MR122
+decoding by way of a bit set in the mode field in struct amr_param_frame - see
+the detailed description in AMR-library-API article.
+
+Command line utilities for AMR-EFR hybrid
+=========================================
+
+The present package includes a small set of command line utilities that work
+with the AMR-EFR hybrid described above:
+
+amrefr-encode-r
+amrefr-decode-r
+
+	These two utilities function just like gsmefr-encode-r and
+	gsmefr-decode-r described in Codec-utils article, but implement the
+	AMR-EFR hybrid version of the codec instead of original EFR.  The
+	no-DTX limitation applies: amrefr-encode-r lacks -d option, and the
+	input to amrefr-decode-r must not contain any SID frames.
+
+amrefr-tseq-enc
+amrefr-tseq-dec
+
+	These two utilities are AMR-EFR counterparts to gsmefr-etsi-enc and
+	gsmefr-etsi-dec test programs described in EFR-testing article.  They
+	pass all tests on the non-DTX t??_efr.* sequences in ETSI's
+	amr122_efr.zip, but not on any of the DTX sequences included in the
+	same ZIP.  Just like amrefr-encode-r, amrefr-tseq-enc lacks -d option,
+	and amrefr-tseq-dec rejects input containing SID frames.