doc: AMR-EFR-hybrid-emu new article
author Mychaela Falconia <falcon@freecalypso.org>
date Sun, 12 May 2024 23:54:43 +0000
Emulation of other people's AMR-EFR hybrid implementations
==========================================================

[Please see the AMR-EFR-philosophy article for background information on the
differences between classic GSM-EFR and the 12k2 mode of AMR, and how ETSI/3GPP
loosened their regulation on bit-exactness of EFR, then continue here.]

Experiments reveal that the extant commercial GSM networks of T-Mobile USA and
Telcel Mexico (and likely other countries' GSM networks too) use a GSM speech
transcoder implementation that performs EFR encoding and decoding (for times
when the MS declares no support for AMR and the network falls back to EFR) per
the alternative which we call AMR-EFR hybrid. The needed experiments are done
by using a FreeCalypso phone or devboard as the MS (declaring yourself to the
network as non-AMR-capable via AT%SPVER), capturing TCH DL and feeding TCH UL
with FreeCalypso tools, and using a SIP-to-PSTN connectivity provider (BulkVS
or Anveo) on the other end of the test call that allows the experimenter to
receive the PCMU or PCMA sample stream coming out of the GSM network's speech
transcoder and feed a crafted PCMU/PCMA sample stream in the other direction.

In this experimental setup, bit-exact details of how the GSM network under study
(NUS) implements EFR decoding can be tested by feeding a controlled sequence of
EFR codec frames (beginning with at least two DHFs) to GSM Um uplink and
observing the PCMU or PCMA sample stream received on the IP-PSTN end of the
call. Similarly, bit-exact details of how the NUS implements EFR encoding can be
tested by feeding controlled PCMU/PCMA sample streams into the call from IP-PSTN
and observing what the network emits on GSM Um downlink. In the latter case,
frame synchronization finding tricks described in ETSI/3GPP test sequence specs
need to be included as part of the experiment.
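In the decoder direction, the stream recorded on the IP-PSTN end starts at an
arbitrary point relative to the frames fed to the uplink, so the expected
decoder output first has to be located within the capture. A minimal sketch of
that comparison step, assuming the expected output has already been converted
to G.711 bytes (one byte per sample) and that the match is bit-exact.
find_expected() is a hypothetical helper written for this illustration, not one
of the FreeCalypso or Themyscira tools:

```python
def find_expected(captured: bytes, expected: bytes) -> int:
    """Return the sample offset at which `expected` occurs in `captured`,
    or -1 if the captured G.711 stream never matches it bit-exactly."""
    if not expected:
        raise ValueError("expected sequence must not be empty")
    return captured.find(expected)


# Toy data only: 3 samples of leading junk, then the expected bytes.
expected = bytes([0x10, 0x20, 0x30, 0x40])
captured = bytes([0xFF, 0xFF, 0xFF]) + expected + bytes([0xFF])
offset = find_expected(captured, expected)
```

A result of -1 means the capture never reproduces the expected sequence, which
is the outcome described below for the original GSM 06.54 test sequences.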

When these experiments were performed on the GSM networks of T-Mobile USA and
Telcel Mexico, it was immediately apparent that they do not implement EFR
following the original bit-exact code of GSM 06.53: feeding any of the original
EFR test sequences from GSM 06.54 to the network under study does not produce
matching results. However, when I tried feeding EFR codec frame sequences from
amr122_efr.zip (the late addendum to GSM 06.54 for the AMR-EFR hybrid option)
to GSM UL, the PCMU (T-Mobile USA) or PCMA (Telcel Mexico) output from the GSM
network's EFR decoder matched _those_ test sequences, indicating that these
networks use the AMR-EFR alternative implementation.

Creating tinkerer-oriented FOSS tools that can emulate or replicate the poorly
defined "EFR alternative 2" implemented by these extant commercial networks has
been a sporting challenge ever since. The present development in the Themyscira
GSM codec libraries and utilities suite is a step toward conquering that
challenge: we are now able to replicate the mystery commercial transcoder in
non-DTX operation, specifically:

a) We can feed a SID-free stream of EFR codec frames to GSM UL, beginning with
   DHF, and get the expected result on PCMU or PCMA;

b) In the encoder direction, for the first 7 frames after EHF, before DTX is
   allowed to kick in, we can get GSM DL output from the network that matches
   our expectations.

Encoder 5 ms delay and DHF transformation
=========================================

One of the differences between classic EFR and MR122 in the encoder direction is
the artificial delay of 5 ms introduced in the AMR version. In true multirate
operation this delay is needed to support seamless switching between codec
modes, but when the only allowed codec rate is 12k2 (which is the case with EFR
by definition), this delay is pure waste. (Needless to say, an extra delay of
5 ms is nothing compared to the egregious latencies introduced by today's ugly
and horrible world of IP-based transport everywhere, but still...) This
artificial 5 ms delay in the encoder is the reason for the DHF difference
between EFR and MR122. But here is the wild part: instead of recognizing this
artificial delay as unnecessary and wasteful for 12k2-only EFR and removing it
from the AMR-EFR hybrid contraption, those commercial transcoder vendors and
the people who prepared amr122_efr.zip for ETSI/3GPP (were they the same
people?) kept it. They left the whole encoder as unchanged AMR, except for
whatever insane trickery they did to fit EFR DTX logic and EFR SID generation
into it, and added special DHF transformation logic on the output of this AMR
encoder to produce compliant EFR DHF when the input is EHF.

Exactly how this DHF transformation is done in those actually-deployed AMR-EFR
hybrid encoders is a bit of a mystery. My first thought was to compare the
speech parameters emitted by the AMR encoder against MR122 DHF, and if the
result is a match, replace that MR122-DHF parameter set with EFR DHF. This
approach is implemented in the simple amr_dhf_subst_efr() function in libtwamr.
One distinctive signature of this approach is that the output of a hybrid
encoder following this method can never equal MR122 DHF: this one particular
bit pattern is precluded from the set of possible outputs under all conditions.

However, subsequent experiments quickly revealed that the logic implemented by
the transcoder in the network of T-Mobile USA must be different. One of the
counter-intuitive effects of the 5 ms artificial delay in the MR122 encoder is
what happens when the encoder is in its homed state and you feed it an input
frame whose first 120 samples are all 0x0008, but some (as few as one or as many
as all) of the last 40 samples (the final 5 ms of the frame) are different. This
frame does not meet the definition of EHF and won't be recognized as such - the
encoder won't get rehomed once again after processing this frame - yet the
output will be bit-exact MR122 DHF. How do those AMR-EFR hybrid encoders handle
*this* case?
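The frame classes involved here can be restated in a few lines. The sketch
below only encodes what the paragraph above says (160-sample frames at 8000
samples/s, so 5 ms is 40 samples); it is not derived from the codec source, and
the function names are hypothetical:

```python
SAMPLE_RATE = 8000       # samples per second
FRAME_SAMPLES = 160      # one 20 ms speech frame
EHF_SAMPLE = 0x0008      # sample value that fills an encoder homing frame

DELAY_SAMPLES = SAMPLE_RATE * 5 // 1000            # 5 ms delay in samples
LEADING_SAMPLES = FRAME_SAMPLES - DELAY_SAMPLES    # part that still matters


def is_ehf(frame):
    """True only if all 160 samples equal 0x0008 (the EHF definition)."""
    return len(frame) == FRAME_SAMPLES and all(s == EHF_SAMPLE for s in frame)


def gives_mr122_dhf_when_homed(frame):
    """Per the text: a homed MR122 encoder emits MR122 DHF whenever the
    leading 120 samples are 0x0008, whatever the remaining samples are."""
    head = frame[:LEADING_SAMPLES]
    return len(frame) == FRAME_SAMPLES and all(s == EHF_SAMPLE for s in head)


ehf = [EHF_SAMPLE] * FRAME_SAMPLES
near_ehf = [EHF_SAMPLE] * LEADING_SAMPLES + [0x0100] * DELAY_SAMPLES
```

The near_ehf frame is the interesting case: it is not EHF, yet it falls into
the class that produces MR122 DHF from a homed encoder.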

Experiments on T-Mobile reveal that in the case in question, the encoded frame
is emitted with the bit pattern of MR122 DHF, *not* transformed into EFR DHF.
Because MR122-DHF output is impossible with an encoder that implements logic
like our amr_dhf_subst_efr() first cut, we know (by modus tollens) that
T-Mobile's implementation uses some different logic.

Our new (current) working model is implemented in amr_dhf_subst_efr2(): we
replace the output of the AMR encoder with EFR DHF if the raw encoder output
was MR122 DHF *and* the input frame was EHF. This version appears to match
the observed behavior of T-Mobile USA so far.
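The two models can be contrasted as follows. This is a Python restatement of
the logic as described in this article, not the C code or the calling
convention of amr_dhf_subst_efr() and amr_dhf_subst_efr2() in libtwamr; the
function names are hypothetical and the DHF parameter sets are placeholders
rather than the real bit patterns:

```python
# Placeholder parameter sets, NOT the real DHF values: only the fact that
# they differ from each other matters for this illustration.
MR122_DHF = ("MR122-DHF",)
EFR_DHF = ("EFR-DHF",)


def subst_model1(enc_out):
    """First cut: any MR122 DHF coming out of the encoder becomes EFR DHF."""
    return EFR_DHF if enc_out == MR122_DHF else enc_out


def subst_model2(enc_out, input_was_ehf):
    """Current model: substitute only if the input frame was also EHF."""
    if input_was_ehf and enc_out == MR122_DHF:
        return EFR_DHF
    return enc_out


# Homed encoder fed a frame with 120 samples of 0x0008 and a non-EHF tail:
# the raw encoder output is MR122 DHF, but the input frame was not EHF.
model1_result = subst_model1(MR122_DHF)
model2_result = subst_model2(MR122_DHF, input_was_ehf=False)
```

Here model1_result is the EFR DHF placeholder while model2_result stays MR122
DHF; only the latter agrees with what was observed on T-Mobile USA for this
input.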

EFR DHF in the decoder direction
================================

The way decoder homing works in all ETSI/3GPP-defined speech codecs, there are
two explicit checks against the known DHF bit pattern. The first check (up to
the first subframe only) is made at the beginning of the decoder: if the decoder
is homed and the input is DHF per this reduced check, artificially emit EHF,
stay homed and do nothing more. The second check (full frame comparison this
time) is made at the end of the decoder, triggering the state reset function on
match. These checks are (and can only be) implemented by explicit comparison
against a known hard-coded DHF pattern - hence it doesn't matter in the decoder
case whether the DHF is natural (as in all properly ETSI-defined codecs) or
artificial as in AMR-EFR hybrid. Thus the "correct" handling of DHF in the
AMR-EFR hybrid decoder is a matter of replacing the check against MR122 DHF bit
pattern with a check against the different bit pattern of EFR DHF.
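The control flow described above can be sketched like this. The class and its
arguments are hypothetical; the core decoder and the state reset function are
passed in as callables, and the number of parameters covered by the reduced
check is left as an argument rather than asserted here:

```python
EHF = [0x0008] * 160  # encoder homing frame: 160 samples of 0x0008


class HomingDecoder:
    """Wraps a core speech decoder with the two DHF checks."""

    def __init__(self, dhf, first_sf_params, core_decode, reset_state):
        self.dhf = tuple(dhf)                   # hard-coded DHF parameter set
        self.first_sf_params = first_sf_params  # length of the reduced check
        self.core_decode = core_decode          # params -> list of PCM samples
        self.reset_state = reset_state          # state reset function
        self.homed = True   # a freshly initialized decoder is in home state

    def decode(self, params):
        params = tuple(params)
        n = self.first_sf_params
        # Check 1 (start of decoder): reduced comparison, only when homed.
        if self.homed and params[:n] == self.dhf[:n]:
            return list(EHF)        # emit EHF, stay homed, do nothing more
        pcm = self.core_decode(params)
        # Check 2 (end of decoder): full-frame comparison triggers the reset.
        self.homed = params == self.dhf
        if self.homed:
            self.reset_state()
        return pcm
```

With this structure, the only thing that differs between plain MR122 decoding
and the AMR-EFR hybrid is which pattern is passed in as dhf.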

The decoder engine in libtwamr supports this different-DHF option for MR122
decoding by way of a bit set in the mode field in struct amr_param_frame - see
the detailed description in the AMR-library-API article.

Command line utilities for AMR-EFR hybrid
=========================================

The present package includes a small set of command line utilities that work
with the AMR-EFR hybrid described above:

amrefr-encode-r
amrefr-decode-r

    These two utilities function just like gsmefr-encode-r and
    gsmefr-decode-r described in the Codec-utils article, but implement the
    AMR-EFR hybrid version of the codec instead of original EFR. The
    no-DTX limitation applies: amrefr-encode-r lacks the -d option, and the
    input to amrefr-decode-r must not contain any SID frames.

amrefr-tseq-enc
amrefr-tseq-dec

    These two utilities are AMR-EFR counterparts to the gsmefr-etsi-enc and
    gsmefr-etsi-dec test programs described in the EFR-testing article. They
    pass all tests on the non-DTX t??_efr.* sequences in ETSI's
    amr122_efr.zip, but not on any of the DTX sequences included in the
    same ZIP. Just like amrefr-encode-r, amrefr-tseq-enc lacks the -d
    option, and amrefr-tseq-dec rejects input containing SID frames.