diff doc/AMR-EFR-hybrid-emu @ 467:ad032051166a
doc: AMR-EFR-hybrid-emu new article
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Sun, 12 May 2024 23:54:43 +0000 |
parents | doc/AMR-EFR-philosophy@9bcf65088006 |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/AMR-EFR-hybrid-emu Sun May 12 23:54:43 2024 +0000
@@ -0,0 +1,147 @@
+Emulation of other people's AMR-EFR hybrid implementations
+==========================================================
+
+[Please see AMR-EFR-philosophy article for background information on the
+ differences between classic GSM-EFR and the 12k2 mode of AMR, and how ETSI/3GPP
+ loosened their regulation on bit-exactness of EFR, then continue here.]
+
+Experiments reveal that the extant commercial GSM networks of T-Mobile USA and
+Telcel Mexico (and likely other countries' GSM networks too) use a GSM speech
+transcoder implementation that performs EFR encoding and decoding (for times
+when the MS declares no support for AMR and the network falls back to EFR) per
+the alternative which we call AMR-EFR hybrid. The needed experiments are done
+by using a FreeCalypso phone or devboard as the MS (declaring yourself to the
+network as non-AMR-capable via AT%SPVER), capturing TCH DL and feeding TCH UL
+with FreeCalypso tools, and using a SIP-to-PSTN connectivity provider (BulkVS
+or Anveo) on the other end of the test call that allows the experimenter to
+receive the PCMU or PCMA sample stream coming out of the GSM network's speech
+transcoder and feed a crafted PCMU/PCMA sample stream in the other direction.
+
+In this experimental setup, bit-exact details of how the GSM network under study
+implements EFR decoding can be tested by feeding a controlled sequence of EFR
+codec frames (beginning with at least two DHFs) to GSM Um uplink and observing
+the PCMU or PCMA sample stream received on the IP-PSTN end of the call.
+Similarly, bit-exact details of how the NUS implements EFR encoding can be
+tested by feeding controlled PCMU/PCMA sample streams into the call from IP-PSTN
+and observing what the network emits on GSM Um downlink. In the latter case,
+frame synchronization finding tricks described in ETSI/3GPP test sequence specs
+need to be included as part of the experiment.
+
+When these experiments were performed on the GSM networks of T-Mobile USA and
+Telcel Mexico, it was immediately apparent that they do not implement EFR
+following the original bit-exact code of GSM 06.53: feeding any of the original
+EFR test sequences from GSM 06.54 to the NUS does not produce matching results.
+However, when I tried feeding EFR codec frame sequences from amr122_efr.zip
+(the late addendum to GSM 06.54 for the AMR-EFR hybrid option) to GSM UL, the
+PCMU (T-Mobile USA) or PCMA (Telcel Mexico) output from the GSM network's EFR
+decoder matched _those_ test sequences, indicating that these networks use the
+AMR-EFR alternative implementation.
+
+Creating tinkerer-oriented FOSS tools that can emulate or replicate the poorly
+defined "EFR alternative 2" implemented by these extant commercial networks has
+been a sportive challenge ever since. The present development in Themyscira
+GSM codec libraries and utilities suite is a step toward conquering that
+challenge: we are now able to replicate the mystery commercial transcoder in
+non-DTX operation, specifically:
+
+a) We can feed a SID-free stream of EFR codec frames to GSM UL, beginning with
+   DHF, and get the expected result on PCMU or PCMA;
+
+b) In the encoder direction, for the first 7 frames after EHF, before DTX is
+   allowed to kick in, we can get GSM DL output from the network that matches
+   our expectations.
+
+Encoder 5 ms delay and DHF transformation
+=========================================
+
+One of the diffs between classic EFR and MR122 in the encoder direction is the
+artificial delay of 5 ms introduced in the AMR version. In true multirate
+operation this delay is needed to support seamless switching between codec
+modes, but when the only allowed codec rate is 12k2 (which is the case with EFR
+by definition), this delay is pure waste. (Needless to say, an extra delay of
+5 ms is nothing compared to the egregious latencies introduced by today's ugly
+and horrible world of IP-based transport everywhere, but still...) This
+artificial 5 ms delay in the encoder is the reason for the DHF difference
+between EFR and MR122 - but here is the wild part: instead of recognizing this
+artificial delay as unnecessary and wasteful for 12k2-only EFR and removing it
+from the AMR-EFR hybrid contraption, those commercial transcoder vendors and
+the people who prepared amr122_efr.zip for ETSI/3GPP (were they the same
+people?) kept this 5 ms encoder delay, keeping the whole encoder unchanged AMR
+except for whatever insane trickery they did to fit EFR DTX logic and EFR SID
+generation into it, but added special DHF transformation logic on the output of
+this AMR encoder to produce compliant EFR DHF when the input is EHF.
+
+Exactly how this DHF transformation is done in those actually-deployed AMR-EFR
+hybrid encoders is a bit of a mystery. My first thought was to compare the
+speech parameters emitted by the AMR encoder against MR122 DHF, and if the
+result is a match, replace that MR122-DHF parameter set with EFR DHF. This
+approach is implemented in the simple amr_dhf_subst_efr() function in libtwamr.
+One distinctive signature of this approach is that the output of a hybrid
+encoder following this method can never equal MR122 DHF: this one particular
+bit pattern is precluded from the set of possible outputs under all conditions.
+
+However, subsequent experiments quickly revealed that the logic implemented by
+the transcoder in the network of T-Mobile USA must be different. One of the
+counter-intuitive effects of the 5 ms artificial delay in the MR122 encoder is
+what happens when the encoder is in its homed state and you feed it an input
+frame whose first 120 samples are all 0x0008, but some (as few as one or as many
+as all) of the last 40 samples are different. This frame does not meet the
+definition of EHF and won't be recognized as such - the encoder won't get
+rehomed once again after processing this frame - yet the output will be
+bit-exact MR122 DHF. How do those AMR-EFR hybrid encoders handle *this* case?
+
+Experiments on T-Mobile reveal that in the case in question, the encoded frame
+is emitted with the bit pattern of MR122 DHF, *not* transformed into EFR DHF.
+Because MR122-DHF output is impossible with an encoder that implements logic
+like our amr_dhf_subst_efr() first cut, we know (by modus tollens) that
+T-Mobile's implementation uses some different logic.
+
+Our new (current) working model is implemented in amr_dhf_subst_efr2(): we
+replace the output of the AMR encoder with EFR DHF if the raw encoder output
+was MR122 DHF *and* the input frame was EHF. This version appears to match
+the observed behavior of T-Mobile USA so far.
+
+EFR DHF in the decoder direction
+================================
+
+The way decoder homing works in all ETSI/3GPP-defined speech codecs, there is
+an explicit check against known DHF bit pattern (up to first subframe only) at
+the beginning of the decoder (if the decoder is homed and the input is DHF per
+this reduced check, artificially emit EHF, stay homed and do nothing more), and
+a second similar check against the known DHF bit pattern (full frame comparison
+this time) at the end of the decoder, triggering the state reset function on
+match. These checks are (and can only be) implemented by explicit comparison
+against a known hard-coded DHF pattern - hence it doesn't matter in the decoder
+case whether the DHF is natural (as in all properly ETSI-defined codecs) or
+artificial as in AMR-EFR hybrid. Thus the "correct" handling of DHF in the
+AMR-EFR hybrid decoder is a matter of replacing the check against MR122 DHF bit
+pattern with a check against the different bit pattern of EFR DHF.
+
+The decoder engine in libtwamr supports this different-DHF option for MR122
+decoding by way of a bit set in the mode field in struct amr_param_frame - see
+the detailed description in AMR-library-API article.
+
+Command line utilities for AMR-EFR hybrid
+=========================================
+
+The present package includes a small set of command line utilities that work
+with the AMR-EFR hybrid described above:
+
+amrefr-encode-r
+amrefr-decode-r
+
+        These two utilities function just like gsmefr-encode-r and
+        gsmefr-decode-r described in Codec-utils article, but implement the
+        AMR-EFR hybrid version of the codec instead of original EFR. The
+        no-DTX limitation applies: amrefr-encode-r lacks -d option, and the
+        input to amrefr-decode-r must not contain any SID frames.
+
+amrefr-tseq-enc
+amrefr-tseq-dec
+
+        These two utilities are AMR-EFR counterparts to gsmefr-etsi-enc and
+        gsmefr-etsi-dec test programs described in EFR-testing article. They
+        pass all tests on the non-DTX t??_efr.* sequences in ETSI's
+        amr122_efr.zip, but not on any of the DTX sequences included in the
+        same ZIP. Just like amrefr-encode-r, amrefr-tseq-enc lacks -d option,
+        and amrefr-tseq-dec rejects input containing SID frames.
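
The two encoder-side DHF substitution models discussed under "Encoder 5 ms
delay and DHF transformation" can be sketched in C roughly as follows. This is
a minimal illustration under stated assumptions, not the libtwamr code: the
names sketch_subst_v1() and sketch_subst_v2(), the 57-parameter frame size and
both pattern arrays are hypothetical, and the arrays hold placeholder values
rather than the real MR122 and EFR DHF parameters. Only the 160-sample frame
(120 + 40) and the 0x0008 EHF sample value are taken from the article text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SAMPLES 160    /* 120 + 40 samples per frame, as in the text */
#define EHF_SAMPLE    0x0008 /* every sample of an encoder homing frame */
#define NPARAMS       57     /* ASSUMED parameter count of a 12k2 frame */

/* Placeholder patterns only: NOT the real MR122 or EFR DHF parameters. */
const int16_t mr122_dhf_placeholder[NPARAMS] = { 1 };
const int16_t efr_dhf_placeholder[NPARAMS] = { 2 };

/* Return 1 if all 160 input samples equal 0x0008, i.e. the frame is EHF. */
static int is_ehf(const int16_t *pcm)
{
    size_t i;

    assert(pcm != NULL);
    for (i = 0; i < FRAME_SAMPLES; i++)
        if (pcm[i] != EHF_SAMPLE)
            return 0;
    return 1;
}

/* Return 1 if the raw encoder output equals the (placeholder) MR122 DHF. */
static int is_mr122_dhf(const int16_t *params)
{
    assert(params != NULL);
    return memcmp(params, mr122_dhf_placeholder,
                  sizeof mr122_dhf_placeholder) == 0;
}

/*
 * First-cut model: substitute whenever the encoder output is MR122 DHF.
 * With this logic the hybrid encoder can never emit MR122 DHF.
 */
void sketch_subst_v1(int16_t *params)
{
    if (is_mr122_dhf(params))
        memcpy(params, efr_dhf_placeholder, sizeof efr_dhf_placeholder);
}

/*
 * Current working model: substitute only if the encoder output is
 * MR122 DHF *and* the input frame was a true EHF.
 */
void sketch_subst_v2(const int16_t *pcm, int16_t *params)
{
    if (is_ehf(pcm) && is_mr122_dhf(params))
        memcpy(params, efr_dhf_placeholder, sizeof efr_dhf_placeholder);
}
```

The two sketches differ only in the corner case described in the article: when
the first 120 input samples are 0x0008 but one of the last 40 differs, and the
raw encoder output is MR122 DHF, the first model rewrites the frame to EFR DHF
while the second leaves it as MR122 DHF, which is what was observed on
T-Mobile USA.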