author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Sun, 25 Aug 2024 02:19:37 +0000 |

Themyscira libtwamr general description
=======================================

Libtwamr is a librification of the official AMR reference C code from 3GPP,
produced by Themyscira Wireless and styled to match our libraries for more
classic GSM codecs. This library has been created with the following two goals
in mind:

1) At the present time we (ThemWi) operate our GSM network with only GSM-FR and
   GSM-EFR codecs, with the latter being preferred. We do not currently operate
   with AMR because the conditions under which AMR becomes advantageous do not
   currently exist in our network operation. However, we need to be prepared
   for the possibility that the conditions which make AMR desirable may arise
   some day, and we may need to start deploying AMR. In order to make AMR
   deployment a possibility, many parts will need to be implemented, one of
   which is a speech transcoding library that implements the AMR codec in the
   same way that libgsmfr2 and libgsmefr implement the more classic codecs
   which we use currently.

2) Many other commercial GSM networks have implemented EFR speech service using
   a type of AMR-EFR hybrid described in the AMR-EFR-philosophy and
   AMR-EFR-hybrid-emu articles. As part of certain behavioral reverse
   engineering experiments, we sometimes need to model the bit-exact operation
   of those other-people-controlled commercial implementations of AMR-EFR, and
   our current libtwamr provides one way to do so. Knowing that a proper
   implementation of an AMR codec library is likely to be needed some day for
   reason 1 above, justification was obtained for expending the effort to
   produce the present libtwamr.

Compared to other plausible ways in which someone could reasonably approach the
task of librifying the AMR reference code from 3GPP, the design of libtwamr
includes two somewhat original choices:

* Separation of core and I/O: the stateful encoder and decoder engines in
  libtwamr operate on a custom frame structure that includes the array of codec
  parameters in their broken-down form (e.g., 57 parameters for MR122), the
  frame type as in original RXFrameType and TXFrameType, and the codec mode.
  Conversion between this internal canonical form (which is most native to the
  guts of the encoder and decoder engines) and external I/O formats (the 3GPP
  test sequence format and the more practical RFC 4867 format used in RTP and
  in .amr recording files) is relegated to stateless utility functions.

* Both VAD1 and VAD2 included: the reference code from 3GPP includes two
  alternative versions of Voice Activity Detection algorithm, VAD1 and VAD2.
  Implementors are allowed to use either version and be compliant; 3GPP code
  uses conditional compilation to select between the two, and it appears that
  no thought was given to the possibility that a real implementation would
  incorporate both VAD versions, to be selected at run time. However, given
  our (ThemWi) desire for bit-exact testing against other people's
  implementations, it made no sense for us to arbitrarily select one VAD
  version and drop the other - hence we took the unconventional route of
  incorporating both VAD1 and VAD2 into libtwamr, and designing our encoder API
  so that library users get to select which VAD they wish to apply.

Like all other Themyscira GSM codec libraries, libtwamr includes the codec
homing feature in both encoder and decoder directions, as required by 3GPP
specs.
Furthermore, the libtwamr implementation of this codec homing feature includes
the following simple extensions (simple in terms of low implementation cost) to
facilitate construction of an AMR-EFR hybrid encoder and decoder:

* In the decoder direction, the main AMR frame decoder function includes a DHF
  detector as required by 3GPP architecture. In libtwamr this function can be
  told to trigger on the EFR DHF instead of the MR122 version, by way of a flag
  set in the mode field of the frame structure passed to amr_decode_frame().

* In the encoder direction, the regular call to amr_encode_frame() - standard
  for AMR - can be followed with a call to amr_dhf_subst_efr() or
  amr_dhf_subst_efr2() before passing the array of encoded parameters to
  EFR_params2frame() from libgsmefr. See the AMR-EFR-hybrid-emu article for
  more information. The AMR-EFR hybrid test sequences in amr122_efr.zip pass
  with both amr_dhf_subst_efr() and amr_dhf_subst_efr2(), but the latter
  additionally matches the observable behavior of T-Mobile USA.

The mechanism that allows libtwamr to be used for AMR-EFR hybrid implementation
(as opposed to the more conventional use case of implementing standard AMR-NB)
is kept out of the main stateful paths: there are no separate AMR-EFR hybrid
encoder or decoder sessions that are distinguishable from regular AMR encoding
and decoding in terms of state. In the decoder direction, the main AMR frame
decoder function needs to know which DHF it should check for, but this
indication is embedded in the mode field in struct amr_param_frame and not in
the state. In the encoder direction, the mechanism is a separate stateless
function that needs to be called between amr_encode_frame() and
EFR_params2frame(). This approach dovetails nicely with the core vs I/O
separation: the option of AMR-EFR hybrid can be viewed as a different I/O front
end to the same AMR engine, alongside the 3GPP AMR test sequence and RFC 4867
I/O options.

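The idea of keeping the hybrid option out of the codec state can be sketched in
a few lines of C. This is an illustrative sketch only, not the libtwamr code:
every sketch_* name, the flag value 0x80 and the mode mask are hypothetical, the
homing frame patterns are supplied by the caller instead of being reproduced
here, and a full-frame comparison is assumed for simplicity (the real DHF
detector and the real amr_dhf_subst_efr() functions may well compare
differently).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_PARAMS    57    /* MR122 carries 57 parameters */
#define SKETCH_MODE_MASK     0x07  /* hypothetical: low bits hold the codec mode */
#define SKETCH_FLAG_EFR_DHF  0x80  /* hypothetical: "look for EFR DHF" flag */

/* Hypothetical stand-in for the canonical frame structure. */
struct sketch_param_frame {
    uint8_t  type;                      /* RX or TX frame type */
    uint8_t  mode;                      /* codec mode plus option flag */
    uint16_t param[SKETCH_MAX_PARAMS];  /* broken-down codec parameters */
};

/* Homing frame patterns are passed in by the caller (assumption of this sketch). */
struct sketch_dhf_set {
    const uint16_t *amr_dhf;   /* MR122 decoder homing frame parameters */
    const uint16_t *efr_dhf;   /* EFR decoder homing frame parameters */
    size_t nparams;            /* number of parameters to compare */
};

static int sketch_params_equal(const uint16_t *a, const uint16_t *b, size_t n)
{
    return memcmp(a, b, n * sizeof(uint16_t)) == 0;
}

/*
 * Decoder side: the choice of which DHF to look for is read from the frame's
 * mode field, so no per-session state is needed to express the hybrid option.
 */
int sketch_frame_is_dhf(const struct sketch_param_frame *f,
                        const struct sketch_dhf_set *set)
{
    const uint16_t *ref;

    assert(set->nparams <= SKETCH_MAX_PARAMS);
    ref = (f->mode & SKETCH_FLAG_EFR_DHF) ? set->efr_dhf : set->amr_dhf;
    return sketch_params_equal(f->param, ref, set->nparams);
}

/*
 * Encoder side: a stateless post-processing step applied to the encoder
 * output.  If the frame is the AMR homing frame, it is replaced with the EFR
 * one.  Returns 1 if a substitution was made, 0 otherwise.
 */
int sketch_dhf_subst(struct sketch_param_frame *f,
                     const struct sketch_dhf_set *set)
{
    assert(set->nparams <= SKETCH_MAX_PARAMS);
    if (!sketch_params_equal(f->param, set->amr_dhf, set->nparams))
        return 0;
    memcpy(f->param, set->efr_dhf, set->nparams * sizeof(uint16_t));
    return 1;
}
```

Because both functions take everything they need as arguments, a caller can mix
regular AMR and hybrid operation freely against the same encoder or decoder
session, which is the property described above.
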
Please refer to AMR-library-API article for further details.
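
As a small illustration of what a stateless I/O helper for the RFC 4867 format
can look like, the following sketch builds and parses the one-octet frame
header used in .amr recording files (4-bit frame type FT in bits 6..3, quality
bit Q in bit 2, payload sizes per RFC 4867). The sketch_* names are
hypothetical and are not part of the libtwamr API; the sketch covers only the
header octet, not the packing of codec parameters into payload bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FT_SID      8    /* AMR SID frame */
#define SKETCH_FT_NO_DATA  15   /* no payload follows the header octet */

/* Speech payload sizes in bytes for FT 0..7 (MR475 through MR122). */
static const uint8_t sketch_payload_bytes[8] = {12, 13, 15, 17, 19, 20, 26, 31};

/* Build the frame header octet from a frame type and a quality indicator. */
uint8_t sketch_make_toc(unsigned ft, int quality_ok)
{
    return (uint8_t)(((ft & 0x0F) << 3) | (quality_ok ? 0x04 : 0x00));
}

/*
 * Parse a frame header octet.  Stores the frame type and quality bit through
 * the pointers and returns the number of payload bytes that follow, or -1
 * for a frame type that this sketch does not handle.  The top bit (padding
 * in .amr files, F bit in RTP) is ignored.
 */
int sketch_parse_toc(uint8_t toc, unsigned *ft, int *quality_ok)
{
    unsigned t = (toc >> 3) & 0x0F;

    assert(ft != NULL && quality_ok != NULL);
    *ft = t;
    *quality_ok = (toc >> 2) & 1;
    if (t < 8)
        return sketch_payload_bytes[t];
    if (t == SKETCH_FT_SID)
        return 5;
    if (t == SKETCH_FT_NO_DATA)
        return 0;
    return -1;
}
```

Neither function keeps anything between calls, so such helpers can sit beside
the stateful codec engines without affecting encoder or decoder state.
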