TFO transform for EFR
=====================

Unlike the situation with FRv1 and HRv1, the standard endpoint decoder for
EFR provides no help for implementing a TFO transform.  The reference EFR
decoder source from ETSI includes bad frame handling and Rx DTX functions,
but the logic that implements these functions is interwoven throughout the
body of the decoder and does not form a separable front end.  Most saliently,
the Rx DTX and ECU logic in the reference decoder does not operate on coded
parameters as would be needed for a TFO transform; instead it operates on
linear values deeper in the decoder, after parameter dequantization.

Given that Abis is a de facto proprietary interface that is not interoperable
between different vendors (and the same holds for Ater in those BSS designs
that separate the TRAU from the BSC), and given how daunting it seems to
implement a true TFO transform for EFR, prior to getting our Nokia TCSM2 lab
setup I was wondering if historical TRAU vendors really did implement this
TFO transform, or if perhaps they used some kind of "cheating" trick on their
Abis similar to what we did in OsmoBTS in mid-2023.  However, once I got our
Nokia TCSM2 gear working, set up a TFO connection between two active TRAU
channels in EFR mode and passed some test sequences through it, it became
clear that Nokia did implement a real "honest-to-god" TFO transform for EFR:
the TRAU-DL frame stream is 100% valid "speech" frames (no idle frames or
other aberrations inserted) even when the TRAU-UL stream fed via TFO contains
BFI speech frames and DTXu pauses - the TRAU really does apply bad frame
handling and comfort noise insertion on the parameter level.

Seeing that at least one major historical vendor did implement a TFO
transform for EFR, and seeing the output from that transform, has set up a
sportive challenge for me: I no longer have a valid excuse to not do it.
I now have a desire to produce a FOSS implementation of TFO transform for
EFR in Themyscira libraries (probably in libgsmefr), and make it no worse
than Nokia's implementation in TCSM2.

Bad frame handling in speech mode
=================================

Looking at the DL speech frames that were synthesized by the TRAU in those
frame positions where the incoming UL stream via TFO had BFIs, we can make
the following observations:

* The 5 LPC parameters are different in each generated substitution/muting
  frame, hence it looks like the TFO transform is running the quantization
  algorithm for each output frame to produce LPC parameters that aim for the
  substitution/muting LSFs of the official "example solution".

* LTP lag parameters remain constant for each run of BFIs between good speech
  frames; the lag value encoded therein matches the LTP lag (integer part
  only) from the 4th subframe of the last good speech frame, just like in the
  official endpoint decoder.

* Surprising bit: the 4 LTP gain values from the last good speech frame are
  endlessly regurgitated verbatim in each substitution/muting frame, without
  any signs of the attenuation I expected to see based on the official
  "example solution".

* Another surprising bit: the 35-bit fixed codebook sequence in each subframe
  is taken from the corresponding subframe of the last good speech frame,
  contrary to the official "example solution" that takes these bits from the
  errored frames.
* The four fixed codebook gain parameters in the emitted substitution/muting
  frames differ from one frame to the next in the case of multiple BFI frames
  in a row, and they also differ between subframes in the same frame - hence
  these parameters are clearly being regenerated as output progresses.
  However, the quantization algorithm for this parameter is so complex that
  I haven't been able to make a more intelligent analysis yet.

Looking at the first good speech frame that follows each BFI
substitution/muting insert, we see that it is mostly unaltered: no
alterations were seen to LPC or LTP parameters, in particular.  However, in
the case of the fixed codebook gain parameter we see a different behavioral
pattern: most of the time it is also unaltered, but sometimes we see a
reduction in this parameter, and even then only in certain subframes.  Are we
perhaps seeing a capping of the fixed codebook gain in the first good frame
following BFI, similar to that implemented in the reference endpoint decoder?
A better understanding of the quantization mechanism for this parameter will
be needed.
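
To summarize these observations in a more concrete form, here is a rough C
sketch of what a parameter-level substitution frame generator could look
like, modeled on the Nokia behavior described above.  It is only an
illustration: none of these struct or function names correspond to the
actual libgsmefr API (or to any other existing code), the parameters are
held in a convenient unpacked form rather than the packed 244-bit frame,
and the two pieces whose exact algorithms remain unknown (LPC regeneration
toward the muting LSFs and fixed codebook gain regeneration) are reduced to
placeholder stubs.

#include <stdint.h>

struct efr_subframe_params {
	unsigned ltp_lag;	/* integer LTP (adaptive codebook) lag */
	unsigned ltp_gain;	/* quantized LTP gain index, 4 bits */
	uint64_t fcb_pulses;	/* the 35 fixed codebook pulse/sign bits */
	unsigned fcb_gain;	/* quantized fixed codebook gain index, 5 bits */
};

struct efr_frame_params {
	unsigned lsf_idx[5];	/* the 5 LSF (LPC) quantizer indices */
	struct efr_subframe_params sub[4];
};

/*
 * Placeholder for the first unknown piece: Nokia's transform appears to
 * re-run the LSF quantizer for each output frame, aiming for the
 * substitution/muting LSF targets of the "example solution" ECU.  This
 * stub merely copies the last good frame's indices.
 */
static void regen_lpc_toward_muting_lsfs(const unsigned *last_good_lsf,
					 unsigned bfi_run, unsigned *out_lsf)
{
	unsigned i;

	(void) bfi_run;
	for (i = 0; i < 5; i++)
		out_lsf[i] = last_good_lsf[i];
}

/*
 * Placeholder for the other unknown piece: the fixed codebook gain is
 * clearly regenerated per frame and per subframe, presumably attenuated
 * and re-quantized, but the exact algorithm has not been worked out.
 * This stub merely returns the last good gain index.
 */
static unsigned regen_fcb_gain(unsigned last_good_gain, unsigned bfi_run,
			       unsigned sf)
{
	(void) bfi_run;
	(void) sf;
	return last_good_gain;
}

/*
 * Generate one substitution/muting frame on the parameter level, following
 * the behavior observed from the Nokia TCSM2 TRAU.
 */
void efr_tfo_bfi_subst(const struct efr_frame_params *last_good,
		       unsigned bfi_run, struct efr_frame_params *out)
{
	unsigned sf;

	/* LPC: regenerated in every output frame */
	regen_lpc_toward_muting_lsfs(last_good->lsf_idx, bfi_run,
				     out->lsf_idx);

	for (sf = 0; sf < 4; sf++) {
		/* LTP lag: integer part of the lag from the 4th subframe
		 * of the last good frame, held constant for the whole
		 * BFI run */
		out->sub[sf].ltp_lag = last_good->sub[3].ltp_lag;

		/* LTP gain: copied verbatim, no observed attenuation */
		out->sub[sf].ltp_gain = last_good->sub[sf].ltp_gain;

		/* fixed codebook pulses: copied from the corresponding
		 * subframe of the last good frame */
		out->sub[sf].fcb_pulses = last_good->sub[sf].fcb_pulses;

		/* fixed codebook gain: regenerated per subframe */
		out->sub[sf].fcb_gain =
			regen_fcb_gain(last_good->sub[sf].fcb_gain,
				       bfi_run, sf);
	}
}

A real implementation working on actual TRAU or TFO frames would of course
operate on the coded bit fields directly, and would have to deal with the
different LTP lag encodings of odd and even subframes as well as with the
possible fixed codebook gain capping in the first good frame after a BFI
run, once that behavior is better understood.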