comparison doc/TFO-xform/EFR @ 36:d9553c7ac6ea

doc/TFO-xform/EFR: beginning of article
author Mychaela Falconia <falcon@freecalypso.org>
date Tue, 03 Sep 2024 07:08:24 +0000
TFO transform for EFR
=====================

Unlike the situation with FRv1 and HRv1, the standard endpoint decoder for EFR
provides no help for implementing a TFO transform. The reference EFR decoder
source from ETSI includes bad frame handling and Rx DTX functions, but the logic
that implements these functions is interwoven throughout the body of the decoder
and does not form a separable front-end. Most saliently, this Rx DTX and ECU
(error concealment unit) logic in the reference decoder does not operate on
coded parameters as would be needed for a TFO transform; instead it operates on
linear values deeper in the decoder, after parameter dequantization.

Abis is a de facto proprietary interface that is not interoperable between
different vendors, and the same holds for Ater in those BSS designs that
separate the TRAU from the BSC. Given that, and given how daunting it seems to
implement a true TFO transform for EFR, before we got our Nokia TCSM2 lab setup
I wondered whether historical TRAU vendors really did implement this TFO
transform, or whether they perhaps used some kind of "cheating" trick on their
Abis similar to what we did in OsmoBTS in mid-2023. However, once I got our
Nokia TCSM2 gear working, set up a TFO connection between two active TRAU
channels in EFR mode and passed some test sequences through it, it became clear
that Nokia did implement a real "honest-to-god" TFO transform for EFR: the
TRAU-DL frame stream consists entirely of valid "speech" frames (no idle frames
or other aberrations inserted) even when the TRAU-UL stream fed via TFO contains
BFI speech frames and DTXu pauses - the TRAU really does apply bad frame
handling and comfort noise insertion at the parameter level.

Seeing that at least one major historical vendor did implement a TFO transform
for EFR, and seeing the output from that transform, has set up a sporting
challenge for me: I no longer have a valid excuse not to do it. I now want to
produce a FOSS implementation of the TFO transform for EFR in Themyscira
libraries (probably in libgsmefr), and to make it no worse than Nokia's
implementation in TCSM2.

Bad frame handling in speech mode
=================================

Looking at the DL speech frames that the TRAU synthesized in those frame
positions where the incoming UL stream via TFO had BFIs, we can make the
following observations:

* The 5 LPC parameters are different in each generated substitution/muting
  frame; hence it looks like the TFO transform runs the quantization algorithm
  for each output frame to produce LPC parameters that aim for the
  substitution/muting LSFs of the official "example solution".

* LTP lag parameters remain constant for each run of BFIs between good speech
  frames; the lag value encoded therein matches the LTP lag (integer part only)
  from the 4th subframe of the last good speech frame, just like in the
  official endpoint decoder.

* Surprising bit: the 4 LTP gain values from the last good speech frame are
  endlessly regurgitated verbatim in each substitution/muting frame, without
  any sign of the attenuation I expected to see based on the official "example
  solution".

* Another surprising bit: the 35-bit fixed codebook sequence in each subframe
  is taken from the corresponding subframe of the last good speech frame,
  contrary to the official "example solution", which takes these bits from the
  errored frames.

* The four fixed codebook gain parameters in the emitted substitution/muting
  frames differ from one frame to the next when multiple BFI frames occur in a
  row, and they also differ between subframes of the same frame - hence these
  parameters are clearly being regenerated as output progresses. However, the
  quantization algorithm for this parameter is so complex that I haven't been
  able to make a more intelligent analysis yet.
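The observations above can be summarized as a frame assembly routine. The
Python sketch below only illustrates the observed pattern; it is not code from
libgsmefr or from Nokia. It assumes the usual EFR array of 57 coded parameters
(5 LPC parameters, then 4 subframes of LTP lag, LTP gain, 10 fixed codebook
pulse parameters and fixed codebook gain). The requantized LPC, LTP lag and
fixed codebook gain values are taken as inputs because how to compute them is
not covered here. All function and constant names are hypothetical.

```python
# Assumed layout of the 57 coded EFR parameters: 5 LPC parameters, then
# 4 subframes of 13 parameters each.  All names here are hypothetical.
N_LPC = 5
N_SUBFRAMES = 4
SUBFRAME_PARAMS = 13    # lag, LTP gain, 10 pulse parameters, codebook gain
N_PARAMS = N_LPC + N_SUBFRAMES * SUBFRAME_PARAMS    # 57

# Offsets within one subframe's block of 13 parameters
OFF_LTP_LAG = 0
OFF_FCB_GAIN = 12       # offsets 1..11 (LTP gain, pulses) are copied as-is


def subframe_base(sf):
    """Index of the first parameter of subframe sf (0..3)."""
    return N_LPC + sf * SUBFRAME_PARAMS


def make_substitution_frame(last_good, new_lpc, new_lags, new_fcb_gains):
    """Assemble one substitution/muting frame following the observed pattern.

    last_good     -- 57 coded parameters of the last good speech frame
    new_lpc       -- 5 freshly quantized LPC parameters (computed elsewhere)
    new_lags      -- 4 coded LTP lag parameters (computed elsewhere)
    new_fcb_gains -- 4 regenerated fixed codebook gain parameters
    """
    if len(last_good) != N_PARAMS:
        raise ValueError("expected 57 coded parameters")
    if len(new_lpc) != N_LPC:
        raise ValueError("expected 5 LPC parameters")
    if len(new_lags) != N_SUBFRAMES or len(new_fcb_gains) != N_SUBFRAMES:
        raise ValueError("expected one lag and one gain per subframe")

    # Start from a copy: LTP gains and fixed codebook pulse parameters
    # carry over verbatim from the last good frame.
    out = list(last_good)
    out[:N_LPC] = list(new_lpc)
    for sf in range(N_SUBFRAMES):
        base = subframe_base(sf)
        out[base + OFF_LTP_LAG] = new_lags[sf]
        out[base + OFF_FCB_GAIN] = new_fcb_gains[sf]
    return out
```

Within a run of BFIs, a caller following the observed behaviour would pass the
same new_lags for every frame, while new_lpc and new_fcb_gains would change
from frame to frame.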

Looking at the first good speech frame that follows each BFI substitution/muting
insert, we see that it is mostly unaltered: in particular, no alterations were
seen to LPC or LTP parameters. However, the fixed codebook gain parameter shows
a different pattern: most of the time it is also unaltered, but sometimes it is
reduced, and even then only in certain subframes. Are we perhaps seeing a
capping of the fixed codebook gain in the first good frame following BFI,
similar to that implemented in the reference endpoint decoder? A better
understanding of the quantization mechanism for this parameter will be needed
to answer that.
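A comparison like the one described above can be tabulated mechanically. The
following Python sketch is a hypothetical helper, not part of any existing
tool. Assuming the usual EFR layout of 57 coded parameters (5 LPC parameters,
then 4 subframes of LTP lag, LTP gain, 10 fixed codebook pulse parameters and
fixed codebook gain), it lists which parameters of an output frame differ from
the corresponding input frame, labeled by parameter type and subframe.

```python
# Assumed layout of the 57 coded EFR parameters; names are hypothetical.
N_LPC = 5
SUBFRAME_PARAMS = 13
N_PARAMS = 57


def param_kind(idx):
    """Map a parameter index (0..56) to a (kind, subframe) pair.

    The subframe is None for LPC parameters, otherwise 0..3.
    """
    if idx < N_LPC:
        return ("LPC", None)
    sf, off = divmod(idx - N_LPC, SUBFRAME_PARAMS)
    if off == 0:
        kind = "LTP lag"
    elif off == 1:
        kind = "LTP gain"
    elif off == 12:
        kind = "FCB gain"
    else:
        kind = "FCB pulses"
    return (kind, sf)


def diff_frames(frame_in, frame_out):
    """List (kind, subframe, in_value, out_value) for each altered parameter."""
    if len(frame_in) != N_PARAMS or len(frame_out) != N_PARAMS:
        raise ValueError("expected 57 coded parameters per frame")
    changes = []
    for idx, (a, b) in enumerate(zip(frame_in, frame_out)):
        if a != b:
            kind, sf = param_kind(idx)
            changes.append((kind, sf, a, b))
    return changes
```

For the first good frame after a BFI run, the pattern described above would
show up as an empty list most of the time, and otherwise as a few "FCB gain"
entries in certain subframes whose output value is lower than the input value.
Interpreting those entries still requires understanding the gain quantizer.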