view doc/AMR-EFR-philosophy @ 530:96c4ed5529bf
libgsmfr2 preproc: implement support for DTXd
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Thu, 19 Sep 2024 20:15:54 +0000 |
parents | ad032051166a |
Relation between GSM-EFR and 12k2 mode of AMR
=============================================

What are the differences between GSM-EFR codec and the highest 12k2 mode of
AMR, or MR122 for short? The most obvious difference is in DTX: the format of
SID frames and even the very paradigm of how DTX works are completely different
between EFR and AMR. But what about non-DTX operation? If a codec session
consists solely of good speech frames, no SIDs and no BFI frame gaps, are EFR
and MR122 strictly identical?

The correct answer is that in the absence of SIDs, EFR and MR122 are directly
interoperable in that the output of an EFR encoder can be fed to the input of
an AMR decoder, and vice-versa. However, the two codecs are NOT identical at
the bit-exact level! The differences are subtle, such that finding them
requires some intense study; this article documents some of these study
findings:

https://www.freecalypso.org/hg/efr-experiments/file/tip/Theory-and-mystery

What other DSP/transcoder vendors have done
===========================================

ETSI had a tradition of defining standard GSM codecs (FR, HR, EFR) in bit-exact
form, and every production implementation was required to match the output of
the official reference bit for bit. However, once AMR came out, the regulation
on EFR was loosened. GSM 06.54 document from 2000-08 (ETSI TS 100 725 V5.2.0)
has an appendix-like chapter (chapter 10) whose first paragraph reads:

    The 12.2 kbit/s mode of the Adaptive Multi Rate speech coder described in
    TS 26.071 is functionally equivalent to the GSM Enhanced Full Rate speech
    coder. An alternative implementation of the Enhanced Full Rate speech
    service based on the 12.2 kbit/s mode of the Adaptive Multi Rate coder is
    allowed. Alternative implementations shall implement the functionality
    specified in TS 26.071 for the 12.2 kbit/s mode, with the exception that
    the DTX transmission format (GSM 06.81) and the comfort noise generation
    (GSM 06.62) shall be used.

It appears that DSP vendors (for GSM MS or for network transcoders, or perhaps
both) weren't too happy with the prospect of having to include two different
versions of _almost_ the same codec algorithm with a bunch of interspersed
subtle diffs, and so the rules were bent: EFR implementors were given
permission to deviate from the original bit-exact definition of EFR in order to
have more commonality with MR122.

Approach adopted for Themyscira GSM codec libraries suite
=========================================================

I (Mother Mychaela) previously entertained the idea of creating a unified codec
library that supports both AMR and EFR with common code, producing a
published-source, FOSS-culture equivalent of what most proprietary vendors have
done. However, on further reflection, that idea has been rejected. The current
situation as of 2024-05 is as follows:

* Libgsmefr is our production-oriented implementation of GSM-EFR codec. It
  implements the original bit-exact definition of EFR, not the AMR-EFR hybrid
  version, and it includes full support for DTX encoding and SID decoding with
  comfort noise generation per GSM 06.62.

* Libtwamr is our librification of 3GPP AMR reference code. The library is
  structured in such a way that libtwamr stateful encoder and decoder functions
  can be combined with stateless EFR frame packing and unpacking functions from
  libgsmefr, allowing AMR-EFR hybrid encoders and decoders to be built. The
  decoder homing function in libtwamr can be told to trigger on EFR DHF instead
  of MR122 version, and for the encoder direction there is a simple utility
  function that artificially transforms MR122 DHF into EFR DHF post-encoder.
  However, there is no support for AMR-EFR hybrid encoding with DTX enabled,
  and the low-effort version of AMR-EFR hybrid decoder constructed in this
  manner cannot grok EFR SID frames or generate CN per GSM 06.62.

Production implementations of GSM network elements that need to perform EFR speech transcoding should use libgsmefr, not libtwamr. The limited support that is provided for AMR-EFR hybrid encoding and decoding with the combination of libtwamr and libgsmefr is intended for experimentation and reverse engineering of other people's implementations, for times when it becomes necessary to model, simulate or replicate bit-exact operation of someone else's network element.
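
The capability split between the two options described above can be summarized
as a small check. What follows is only an illustrative sketch in C, not code
from libgsmefr or libtwamr: every enum, struct and function name in it is
hypothetical and invented for this example. Only the capability facts (full DTX
and SID support in libgsmefr, none in the low-effort AMR-EFR hybrid) are taken
from the text above.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; neither library defines them. */
enum efr_impl {
    IMPL_LIBGSMEFR,       /* original bit-exact EFR */
    IMPL_AMR_EFR_HYBRID   /* libtwamr codec core + libgsmefr frame packing */
};

/* What a given codec session requires from the EFR implementation. */
struct efr_session_needs {
    bool dtx_encoding;    /* encoder must be able to emit EFR SID frames */
    bool sid_decoding;    /* decoder must accept EFR SID frames and
                             generate comfort noise per GSM 06.62 */
};

/* Return true if the chosen implementation covers the session's needs. */
bool impl_can_serve(enum efr_impl impl, const struct efr_session_needs *needs)
{
    switch (impl) {
    case IMPL_LIBGSMEFR:
        /* full EFR including DTX encoding and SID decoding */
        return true;
    case IMPL_AMR_EFR_HYBRID:
        /* usable only when no EFR DTX or SID handling is required */
        return !needs->dtx_encoding && !needs->sid_decoding;
    }
    return false;
}
```

Under these assumptions, a session with DTX or SID requirements is covered only
by IMPL_LIBGSMEFR, which matches the recommendation for production network
elements; the hybrid passes the check only for sessions made up entirely of
good speech frames.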