view doc/AMR-EFR-philosophy @ 477:4c9222d95647
libtwamr encoder: always emit frame->mode = mode;
In the original implementation of amr_encode_frame(), the 'mode' member
of the output struct was set to 0xFF if the output frame type is TX_NO_DATA.
This design was made to mimic the mode field (16-bit word) being set to
0xFFFF (or -1) in 3GPP test sequence format - but nothing actually depends
on this struct member being set in any way, and amr_frame_to_tseq()
generates the needed 0xFFFF on its own, based on frame->type being equal
to TX_NO_DATA.
It is simpler and more efficient to always set frame->mode to the actual
encoding mode in amr_encode_frame(), and this new behavior has already
been documented in doc/AMR-library-API description in anticipation of
the present change.
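
The behavior described above can be pictured with a minimal sketch. All names
here (the sketch_ prefix, the struct layout, the numeric frame type values) are
hypothetical stand-ins chosen for illustration, not the actual libtwamr
definitions; the only point is that the 16-bit mode word of the test sequence
format can be derived from the frame type alone, so the encoder is free to
store the real mode in every frame.

```c
#include <stdint.h>

/* Hypothetical stand-in values, not taken from libtwamr headers. */
enum { SKETCH_TX_SPEECH_GOOD = 0, SKETCH_TX_NO_DATA = 3 };

/* Hypothetical stand-in for an encoder output frame. */
struct sketch_frame {
    uint8_t type;   /* frame type, e.g. SKETCH_TX_NO_DATA */
    uint8_t mode;   /* always the actual encoding mode */
};

/*
 * Derive the 16-bit mode word for test sequence output.
 * The 0xFFFF marker depends only on the frame type, so nothing
 * requires the encoder to put a special value into f->mode.
 */
static uint16_t sketch_tseq_mode_word(const struct sketch_frame *f)
{
    if (f->type == SKETCH_TX_NO_DATA)
        return 0xFFFF;
    return f->mode;
}
```

With this arrangement a no-data frame that carries mode 7 still yields 0xFFFF
in the test sequence output, while a speech frame yields its mode unchanged.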
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Sat, 18 May 2024 22:30:42 +0000 |
parents | ad032051166a |
children |
Relation between GSM-EFR and 12k2 mode of AMR
=============================================

What are the differences between GSM-EFR codec and the highest 12k2 mode of AMR,
or MR122 for short?  The most obvious difference is in DTX: the format of SID
frames and even the very paradigm of how DTX works are completely different
between EFR and AMR.  But what about non-DTX operation?  If a codec session
consists solely of good speech frames, no SIDs and no BFI frame gaps, are EFR
and MR122 strictly identical?

The correct answer is that in the absence of SIDs, EFR and MR122 are directly
interoperable in that the output of an EFR encoder can be fed to the input of
an AMR decoder, and vice-versa.  However, the two codecs are NOT identical at
the bit-exact level!  The differences are subtle, such that finding them
requires some intense study; this article documents some of these study
findings:

https://www.freecalypso.org/hg/efr-experiments/file/tip/Theory-and-mystery

What other DSP/transcoder vendors have done
===========================================

ETSI had a tradition of defining standard GSM codecs (FR, HR, EFR) in bit-exact
form, and every production implementation was required to match the output of
the official reference bit for bit.  However, once AMR came out, the regulation
on EFR was loosened.  GSM 06.54 document from 2000-08 (ETSI TS 100 725 V5.2.0)
has an appendix-like chapter (chapter 10) whose first paragraph reads:

  The 12.2 kbit/s mode of the Adaptive Multi Rate speech coder described in
  TS 26.071 is functionally equivalent to the GSM Enhanced Full Rate speech
  coder.  An alternative implementation of the Enhanced Full Rate speech
  service based on the 12.2 kbit/s mode of the Adaptive Multi Rate coder is
  allowed.  Alternative implementations shall implement the functionality
  specified in TS 26.071 for the 12.2 kbit/s mode, with the exception that the
  DTX transmission format (GSM 06.81) and the comfort noise generation
  (GSM 06.62) shall be used.

It appears that DSP vendors (for GSM MS or for network transcoders, or perhaps
both) weren't too happy with the prospect of having to include two different
versions of _almost_ the same codec algorithm with a bunch of interspersed
subtle diffs, and so the rules were bent: EFR implementors were given permission
to deviate from the original bit-exact definition of EFR in order to have more
commonality with MR122.

Approach adopted for Themyscira GSM codec libraries suite
=========================================================

I (Mother Mychaela) previously entertained the idea of creating a unified codec
library that supports both AMR and EFR with common code, producing a
published-source, FOSS-culture equivalent of what most proprietary vendors have
done.  However, on further reflection, that idea has been rejected.  The current
situation as of 2024-05 is as follows:

* Libgsmefr is our production-oriented implementation of GSM-EFR codec.  It
  implements the original bit-exact definition of EFR, not the AMR-EFR hybrid
  version, and it includes full support for DTX encoding and SID decoding with
  comfort noise generation per GSM 06.62.

* Libtwamr is our librification of 3GPP AMR reference code.  The library is
  structured in such a way that libtwamr stateful encoder and decoder functions
  can be combined with stateless EFR frame packing and unpacking functions from
  libgsmefr, allowing AMR-EFR hybrid encoders and decoders to be built.  The
  decoder homing function in libtwamr can be told to trigger on EFR DHF instead
  of MR122 version, and for the encoder direction there is a simple utility
  function that artificially transforms MR122 DHF into EFR DHF post-encoder.
  However, there is no support for AMR-EFR hybrid encoding with DTX enabled,
  and the low-effort version of AMR-EFR hybrid decoder constructed in this
  manner cannot grok EFR SID frames or generate CN per GSM 06.62.

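
The post-encoder DHF transformation mentioned in the libtwamr item can be
pictured with the following minimal sketch.  It is an illustration under stated
assumptions, not the actual libtwamr utility function: the sketch_ names are
hypothetical, and the two homing frame tables hold dummy placeholder values
rather than the real MR122 and EFR DHF contents.  The idea shown is only the
shape of the step: it sits between the stateful MR122 encoder and the stateless
EFR frame packing, compares the parameter frame against the MR122 homing frame
and, on a match, substitutes the EFR one.

```c
#include <stdint.h>
#include <string.h>

/* Both EFR and MR122 carry 57 codec parameters per 20 ms frame. */
#define SKETCH_NPARAMS 57

/* Placeholder tables with dummy contents: NOT the real homing frames. */
static const int16_t sketch_dhf_mr122[SKETCH_NPARAMS] = { 1, 2, 3, 4 };
static const int16_t sketch_dhf_efr[SKETCH_NPARAMS]   = { 1, 2, 9, 4 };

/*
 * Post-encoder fixup: if the parameter frame equals the MR122 homing
 * frame, overwrite it with the EFR homing frame.
 * Returns 1 if a substitution was made, 0 if the frame was left alone.
 */
static int sketch_dhf_subst(int16_t params[SKETCH_NPARAMS])
{
    if (memcmp(params, sketch_dhf_mr122, sizeof sketch_dhf_mr122) != 0)
        return 0;
    memcpy(params, sketch_dhf_efr, sizeof sketch_dhf_efr);
    return 1;
}
```

Ordinary speech frames pass through such a step unchanged; only a frame that
matches the MR122 homing frame in full gets rewritten, after which it can be
handed to the EFR packing function like any other frame.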
Production implementations of GSM network elements that need to perform EFR speech transcoding should use libgsmefr, not libtwamr. The limited support that is provided for AMR-EFR hybrid encoding and decoding with the combination of libtwamr and libgsmefr is intended for experimentation and reverse engineering of other people's implementations, for times when it becomes necessary to model, simulate or replicate bit-exact operation of someone else's network element.