comparison doc/AMR-EFR-hybrid-emu @ 467:ad032051166a
doc: AMR-EFR-hybrid-emu new article
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Sun, 12 May 2024 23:54:43 +0000 |
parents | doc/AMR-EFR-philosophy@9bcf65088006 |
children |
466:0c4e1bc06740 | 467:ad032051166a |
---|---|
1 Emulation of other people's AMR-EFR hybrid implementations | |
2 ========================================================== | |
3 | |
4 [Please see AMR-EFR-philosophy article for background information on the | |
5 differences between classic GSM-EFR and the 12k2 mode of AMR, and how ETSI/3GPP | |
6 loosened their regulation on bit-exactness of EFR, then continue here.] | |
7 | |
8 Experiments reveal that the extant commercial GSM networks of T-Mobile USA and | |
9 Telcel Mexico (and likely other countries' GSM networks too) use a GSM speech | |
10 transcoder implementation that performs EFR encoding and decoding (for times | |
11 when the MS declares no support for AMR and the network falls back to EFR) per | |
12 the alternative which we call AMR-EFR hybrid. The needed experiments are done | |
13 by using a FreeCalypso phone or devboard as the MS (declaring yourself to the | |
14 network as non-AMR-capable via AT%SPVER), capturing TCH DL and feeding TCH UL | |
15 with FreeCalypso tools, and using a SIP-to-PSTN connectivity provider (BulkVS | |
16 or Anveo) on the other end of the test call that allows the experimenter to | |
17 receive the PCMU or PCMA sample stream coming out of the GSM network's speech | |
18 transcoder and feed a crafted PCMU/PCMA sample stream in the other direction. | |
19 | |
20 In this experimental setup, bit-exact details of how the GSM network under study | |
21 (NUS) implements EFR decoding can be tested by feeding a controlled sequence of EFR | |
22 codec frames (beginning with at least two DHFs) to GSM Um uplink and observing | |
23 the PCMU or PCMA sample stream received on the IP-PSTN end of the call. | |
24 Similarly, bit-exact details of how the NUS implements EFR encoding can be | |
25 tested by feeding controlled PCMU/PCMA sample streams into the call from IP-PSTN | |
26 and observing what the network emits on GSM Um downlink. In the latter case, | |
27 frame synchronization finding tricks described in ETSI/3GPP test sequence specs | |
28 need to be included as part of the experiment. | |
29 | |
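For the decoder-direction experiment, the comparison step can be sketched as a
plain byte search: given the PCMU or PCMA octets captured on the IP-PSTN end
and the octets one expects from a given test sequence, find where (if anywhere)
the expected run appears in the capture. The function below is a hypothetical
helper written for this illustration only; it is not part of any FreeCalypso
or Themyscira tool, and it is not the frame synchronization method from the
ETSI/3GPP test sequence specs, which is needed for the encoder direction.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): return the byte offset at which
 * the expected PCMU/PCMA octet run first appears in the captured stream,
 * or -1 if it does not appear at all.
 */
long find_expected_output(const unsigned char *capture, size_t capture_len,
                          const unsigned char *expect, size_t expect_len)
{
    size_t off;

    if (expect_len == 0 || expect_len > capture_len)
        return -1;
    /* Try every possible alignment; the match must be bit-exact. */
    for (off = 0; off + expect_len <= capture_len; off++) {
        if (memcmp(capture + off, expect, expect_len) == 0)
            return (long) off;
    }
    return -1;
}
```

A result of -1 for the classic GSM 06.54 sequences combined with a valid offset
for the amr122_efr.zip sequences would correspond to the outcome described in
this article.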
30 When these experiments were performed on the GSM networks of T-Mobile USA and | |
31 Telcel Mexico, it was immediately apparent that they do not implement EFR | |
32 following the original bit-exact code of GSM 06.53: feeding any of the original | |
33 EFR test sequences from GSM 06.54 to the NUS does not produce matching results. | |
34 However, when I tried feeding EFR codec frame sequences from amr122_efr.zip | |
35 (the late addendum to GSM 06.54 for the AMR-EFR hybrid option) to GSM UL, the | |
36 PCMU (T-Mobile USA) or PCMA (Telcel Mexico) output from the GSM network's EFR | |
37 decoder matched _those_ test sequences, indicating that these networks use the | |
38 AMR-EFR alternative implementation. | |
39 | |
40 Creating tinkerer-oriented FOSS tools that can emulate or replicate the poorly | |
41 defined "EFR alternative 2" implemented by these extant commercial networks has | |
42 been a sportive challenge ever since. The present development in Themyscira | |
43 GSM codec libraries and utilities suite is a step toward conquering that | |
44 challenge: we are now able to replicate the mystery commercial transcoder in | |
45 non-DTX operation, specifically: | |
46 | |
47 a) We can feed a SID-free stream of EFR codec frames to GSM UL, beginning with | |
48 DHF, and get the expected result on PCMU or PCMA; | |
49 | |
50 b) In the encoder direction, for the first 7 frames after EHF, before DTX is | |
51 allowed to kick in, we can get GSM DL output from the network that matches | |
52 our expectations. | |
53 | |
54 Encoder 5 ms delay and DHF transformation | |
55 ========================================= | |
56 | |
57 One of the diffs between classic EFR and MR122 in the encoder direction is the | |
58 artificial delay of 5 ms introduced in the AMR version. In true multirate | |
59 operation this delay is needed to support seamless switching between codec | |
60 modes, but when the only allowed codec rate is 12k2 (which is the case with EFR | |
61 by definition), this delay is pure waste. (Needless to say, an extra delay of | |
62 5 ms is nothing compared to the egregious latencies introduced by today's ugly | |
63 and horrible world of IP-based transport everywhere, but still...) This | |
64 artificial 5 ms delay in the encoder is the reason for the DHF difference | |
65 between EFR and MR122 - but here is the wild part: instead of recognizing this | |
66 artificial delay as unnecessary and wasteful for 12k2-only EFR and removing it | |
67 from the AMR-EFR hybrid contraption, those commercial transcoder vendors and | |
68 the people who prepared amr122_efr.zip for ETSI/3GPP (were they the same | |
69 people?) kept this 5 ms encoder delay, keeping the whole encoder unchanged AMR | |
70 except for whatever insane trickery they did to fit EFR DTX logic and EFR SID | |
71 generation into it, but added special DHF transformation logic on the output of | |
72 this AMR encoder to produce compliant EFR DHF when the input is EHF. | |
73 | |
74 Exactly how this DHF transformation is done in those actually-deployed AMR-EFR | |
75 hybrid encoders is a bit of a mystery. My first thought was to compare the | |
76 speech parameters emitted by the AMR encoder against MR122 DHF, and if the | |
77 result is a match, replace that MR122-DHF parameter set with EFR DHF. This | |
78 approach is implemented in the simple amr_dhf_subst_efr() function in libtwamr. | |
79 One distinctive signature of this approach is that the output of a hybrid | |
80 encoder following this method can never equal MR122 DHF: this one particular | |
81 bit pattern is precluded from the set of possible outputs under all conditions. | |
82 | |
83 However, subsequent experiments quickly revealed that the logic implemented by | |
84 the transcoder in the network of T-Mobile USA must be different. One of the | |
85 counter-intuitive effects of the 5 ms artificial delay in the MR122 encoder is | |
86 what happens when the encoder is in its homed state and you feed it an input | |
87 frame whose first 120 samples are all 0x0008, but some (as few as one or as many | |
88 as all) of the last 40 samples are different. This frame does not meet the | |
89 definition of EHF and won't be recognized as such - the encoder won't get | |
90 rehomed once again after processing this frame - yet the output will be | |
91 bit-exact MR122 DHF. How do those AMR-EFR hybrid encoders handle *this* case? | |
92 | |
93 Experiments on T-Mobile reveal that in the case in question, the encoded frame | |
94 is emitted with the bit pattern of MR122 DHF, *not* transformed into EFR DHF. | |
95 Because MR122-DHF output is impossible with an encoder that implements logic | |
96 like our amr_dhf_subst_efr() first cut, we know (by modus tollens) that | |
97 T-Mobile's implementation uses some different logic. | |
98 | |
99 Our new (current) working model is implemented in amr_dhf_subst_efr2(): we | |
100 replace the output of the AMR encoder with EFR DHF if the raw encoder output | |
101 was MR122 DHF *and* the input frame was EHF. This version appears to match | |
102 the observed behavior of T-Mobile USA so far. | |
103 | |
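The difference between the two substitution rules discussed in this section
can be sketched as follows. This is only an illustration of the logic as
described above, not the actual libtwamr code: the function names subst_rule1()
and subst_rule2() are made up for this sketch, the real signatures of
amr_dhf_subst_efr() and amr_dhf_subst_efr2() are not reproduced here, and the
two DHF parameter sets are passed in by the caller rather than hard-coded.

```c
#include <stdint.h>
#include <string.h>

#define FRAME_SAMPLES 160     /* one 20 ms frame at 8 kHz */
#define EHF_SAMPLE    0x0008  /* every sample of an encoder homing frame */
#define NPARAMS       57      /* assumed parameter count of one 12k2 frame */

/* Return 1 only if all 160 input samples are 0x0008, i.e. the frame is EHF. */
int is_ehf(const int16_t *pcm)
{
    int i;

    for (i = 0; i < FRAME_SAMPLES; i++)
        if (pcm[i] != EHF_SAMPLE)
            return 0;
    return 1;
}

/*
 * First-cut rule: whenever the raw encoder output equals MR122 DHF,
 * overwrite it with EFR DHF.  Returns 1 if a substitution was made.
 */
int subst_rule1(int16_t *params, const int16_t *mr122_dhf,
                const int16_t *efr_dhf)
{
    if (memcmp(params, mr122_dhf, NPARAMS * sizeof(int16_t)) != 0)
        return 0;
    memcpy(params, efr_dhf, NPARAMS * sizeof(int16_t));
    return 1;
}

/*
 * Current working model: substitute only if the raw encoder output was
 * MR122 DHF *and* the PCM input frame was EHF.
 */
int subst_rule2(int16_t *params, const int16_t *pcm,
                const int16_t *mr122_dhf, const int16_t *efr_dhf)
{
    if (!is_ehf(pcm))
        return 0;
    return subst_rule1(params, mr122_dhf, efr_dhf);
}
```

With an input frame whose first 120 samples are 0x0008 but whose last 40 are
not, is_ehf() returns 0, so subst_rule2() leaves an MR122-DHF output untouched
while subst_rule1() would still rewrite it. That is exactly the case that
distinguishes the two models in the T-Mobile experiment.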
104 EFR DHF in the decoder direction | |
105 ================================ | |
106 | |
107 The way decoder homing works in all ETSI/3GPP-defined speech codecs, there is | |
108 an explicit check against the known DHF bit pattern (up to the first subframe only) at | |
109 the beginning of the decoder (if the decoder is homed and the input is DHF per | |
110 this reduced check, artificially emit EHF, stay homed and do nothing more), and | |
111 a second similar check against the known DHF bit pattern (full frame comparison | |
112 this time) at the end of the decoder, triggering the state reset function on | |
113 match. These checks are (and can only be) implemented by explicit comparison | |
114 against a known hard-coded DHF pattern - hence it doesn't matter in the decoder | |
115 case whether the DHF is natural (as in all properly ETSI-defined codecs) or | |
116 artificial as in AMR-EFR hybrid. Thus the "correct" handling of DHF in the | |
117 AMR-EFR hybrid decoder is a matter of replacing the check against MR122 DHF bit | |
118 pattern with a check against the different bit pattern of EFR DHF. | |
119 | |
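The two-stage homing check described above can be sketched in skeleton form.
Everything here is an assumption made for illustration: struct dec_sketch,
dec_sketch_frame() and fake_decode() are hypothetical names, the stand-in
"decoder" merely writes silence, the count of 18 parameters for the first
subframe check is assumed, and the DHF pattern is supplied by the caller
instead of being a real hard-coded table. The point is only that the choice
between MR122 DHF and EFR DHF reduces to which pattern the comparisons use.

```c
#include <stdint.h>
#include <string.h>

#define FRAME_SAMPLES   160
#define EHF_SAMPLE      0x0008
#define NPARAMS         57   /* assumed parameter count of one 12k2 frame */
#define FIRST_SF_PARAMS 18   /* assumed extent of the first-subframe check */

struct dec_sketch {
    int homed;            /* nonzero while the decoder is in its home state */
    const int16_t *dhf;   /* DHF pattern to compare against (MR122 or EFR) */
    int resets;           /* counts state resets, for demonstration only */
};

/* Stand-in for the real speech decoder: just emits silence. */
static void fake_decode(const int16_t *params, int16_t *pcm)
{
    (void) params;
    memset(pcm, 0, FRAME_SAMPLES * sizeof(int16_t));
}

void dec_sketch_frame(struct dec_sketch *st, const int16_t *params,
                      int16_t *pcm)
{
    int i;

    /* Check 1: homed decoder + DHF up to the first subframe => emit EHF. */
    if (st->homed &&
        memcmp(params, st->dhf, FIRST_SF_PARAMS * sizeof(int16_t)) == 0) {
        for (i = 0; i < FRAME_SAMPLES; i++)
            pcm[i] = EHF_SAMPLE;
        return;     /* stay homed and do nothing more */
    }
    fake_decode(params, pcm);
    st->homed = 0;
    /* Check 2: full-frame DHF match at the end triggers the state reset. */
    if (memcmp(params, st->dhf, NPARAMS * sizeof(int16_t)) == 0) {
        st->homed = 1;
        st->resets++;
    }
}
```

Pointing st->dhf at a different pattern changes which frames home the decoder
without touching any other part of the logic.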
120 The decoder engine in libtwamr supports this different-DHF option for MR122 | |
121 decoding by way of a bit set in the mode field in struct amr_param_frame - see | |
122 the detailed description in AMR-library-API article. | |
123 | |
124 Command line utilities for AMR-EFR hybrid | |
125 ========================================= | |
126 | |
127 The present package includes a small set of command line utilities that work | |
128 with the AMR-EFR hybrid described above: | |
129 | |
130 amrefr-encode-r | |
131 amrefr-decode-r | |
132 | |
133 These two utilities function just like gsmefr-encode-r and | |
134 gsmefr-decode-r described in Codec-utils article, but implement the | |
135 AMR-EFR hybrid version of the codec instead of original EFR. The | |
136 no-DTX limitation applies: amrefr-encode-r lacks -d option, and the | |
137 input to amrefr-decode-r must not contain any SID frames. | |
138 | |
139 amrefr-tseq-enc | |
140 amrefr-tseq-dec | |
141 | |
142 These two utilities are AMR-EFR counterparts to gsmefr-etsi-enc and | |
143 gsmefr-etsi-dec test programs described in EFR-testing article. They | |
144 pass all tests on the non-DTX t??_efr.* sequences in ETSI's | |
145 amr122_efr.zip, but not on any of the DTX sequences included in the | |
146 same ZIP. Just like amrefr-encode-r, amrefr-tseq-enc lacks -d option, | |
147 and amrefr-tseq-dec rejects input containing SID frames. |