author Mychaela Falconia <falcon@freecalypso.org>
date Sat, 15 Jun 2024 06:52:37 +0000

Themyscira libtwamr general description
=======================================

Libtwamr is a librification of the official AMR reference C code from 3GPP,
produced by Themyscira Wireless and styled to match our libraries for more
classic GSM codecs.  This library has been created with the following two goals
in mind:

1) At the present time we (ThemWi) operate our GSM network with only GSM-FR and
   GSM-EFR codecs, with the latter being preferred.  We do not currently operate
   with AMR because the conditions under which AMR becomes advantageous do not
   currently exist in our network operation.  However, we need to be prepared
   for the possibility that the conditions which make AMR desirable may arise
   some day, and we may need to start deploying AMR.  In order to make AMR
   deployment a possibility, many parts will need to be implemented, one of
   which is a speech transcoding library that implements the AMR codec in the
   same way that libgsmfr2 and libgsmefr implement the more classic codecs
   which we use currently.

2) Many other commercial GSM networks have implemented EFR speech service using
   a type of AMR-EFR hybrid described in the AMR-EFR-philosophy and
   AMR-EFR-hybrid-emu articles.  As part of certain behavioral reverse
   engineering experiments, we sometimes need to model the bit-exact operation
   of those other-people-controlled commercial implementations of AMR-EFR, and
   our current libtwamr provides one way to do so.  Because a proper
   implementation of an AMR codec library is likely to be needed some day for
   reason 1 above, the effort expended to produce the present libtwamr was
   justified.

Compared to other plausible ways in which someone could reasonably approach the
task of librifying the AMR reference code from 3GPP, the design of libtwamr
includes two somewhat original choices:

* Separation of core and I/O: the stateful encoder and decoder engines in
  libtwamr operate on a custom frame structure that includes the array of codec
  parameters in their broken-down form (e.g., 57 parameters for MR122), the
  frame type as in original RXFrameType and TXFrameType, and the codec mode.
  Conversion between this internal canonical form (which is most native to the
  guts of the encoder and decoder engines) and external I/O formats (the 3GPP
  test sequence format and the more practical RFC 4867 format used in RTP and
  in .amr recording files) is relegated to stateless utility functions.

* Both VAD1 and VAD2 included: the reference code from 3GPP includes two
  alternative versions of the Voice Activity Detection algorithm, VAD1 and VAD2.
  Implementors are allowed to use either version and be compliant; 3GPP code
  uses conditional compilation to select between the two, and it appears that
  no thought was given to the possibility that a real implementation would
  incorporate both VAD versions, to be selected at run time.  However, given our
  (ThemWi) desire for bit-exact testing against other people's implementations,
  it made no sense for us to arbitrarily select one VAD version and drop the
  other - hence we took the unconventional route of incorporating both VAD1 and
  VAD2 into libtwamr, and designing our encoder API so that library users get
  to select which VAD they wish to apply.
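
The core vs I/O split described above can be illustrated with a minimal sketch.
Every name below (the sk_ prefix, the structure layout, the frame type enum) is
hypothetical and is not the libtwamr API; only the RFC 4867 header octet layout
(FT in bits 6..3, Q in bit 2) and the FT codes (0..7 for speech modes, 8 for
SID, 15 for no data) come from the RFC.  The point is that the I/O helper is a
pure function of the canonical frame and touches no codec state:

```c
#include <stdint.h>

/* Hypothetical names throughout; this is not the libtwamr API. */
#define SK_MAX_PRM 57	/* MR122 carries 57 parameters, per the text above */

enum sk_frame_type { SK_SPEECH, SK_SID, SK_NO_DATA };

struct sk_frame {
	int16_t param[SK_MAX_PRM];	/* broken-down codec parameters */
	uint8_t type;			/* one of enum sk_frame_type */
	uint8_t mode;			/* codec mode 0..7; 7 is MR122 */
};

/*
 * Stateless I/O helper: derive the RFC 4867 frame header octet
 * from the canonical frame.  FT occupies bits 6..3, Q is bit 2.
 */
uint8_t sk_rfc4867_header(const struct sk_frame *frm)
{
	uint8_t ft;

	switch (frm->type) {
	case SK_SPEECH:
		ft = frm->mode & 7;	/* FT equals the codec mode */
		break;
	case SK_SID:
		ft = 8;			/* AMR SID */
		break;
	default:
		ft = 15;		/* no data */
		break;
	}
	return (uint8_t)((ft << 3) | 0x04);	/* Q=1: frame not damaged */
}
```

For an MR122 speech frame this yields 0x3C, the octet that begins each such
frame in a .amr recording file.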

Like all other Themyscira GSM codec libraries, libtwamr includes the codec
homing feature in both encoder and decoder directions, as required by 3GPP
specs.  Furthermore, the libtwamr implementation of this codec homing feature
includes the following simple extensions (simple in terms of low implementation
cost) to facilitate construction of an AMR-EFR hybrid encoder and decoder:

* In the decoder direction, the main AMR frame decoder function includes a DHF
  detector as required by 3GPP architecture.  In libtwamr this function can be
  told to trigger on EFR DHF instead of the MR122 version, by way of a flag set
  in the mode field of the frame structure passed to amr_decode_frame().

* In the encoder direction, the regular call to amr_encode_frame() - standard
  for AMR - can be followed with a call to amr_dhf_subst_efr() or
  amr_dhf_subst_efr2() before passing the array of encoded parameters to
  EFR_params2frame() from libgsmefr.  See the AMR-EFR-hybrid-emu article for
  more information.  The AMR-EFR hybrid test sequences in amr122_efr.zip pass
  with both the amr_dhf_subst_efr() and amr_dhf_subst_efr2() versions, but the
  latter additionally matches the observable behavior of T-Mobile USA.
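
The decoder-side idea of selecting the DHF pattern through a flag in the mode
field can be sketched as follows.  This is an illustration only: the flag
value, the structure layout and all dsk_ names are assumptions, and the two
pattern tables hold short placeholder values, not the real MR122 or EFR DHF
parameters.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names and values; this is not the libtwamr API. */
#define DSK_MODE_MASK		0x07	/* low bits: codec mode */
#define DSK_MODE_EFR_DHF	0x80	/* assumed flag bit: check for EFR DHF */
#define DSK_NPRM		4	/* shortened for illustration */

/* Placeholder patterns, NOT the real DHF parameter values. */
static const int16_t dsk_dhf_mr122[DSK_NPRM] = {1, 2, 3, 4};
static const int16_t dsk_dhf_efr[DSK_NPRM] = {5, 6, 7, 8};

struct dsk_frame {
	int16_t param[DSK_NPRM];
	uint8_t mode;		/* codec mode plus optional flag bit */
};

/*
 * Return 1 if the frame matches the DHF pattern selected by the flag
 * carried in the mode field, 0 otherwise.  The choice of pattern comes
 * from the frame itself, so no decoder state is consulted.
 */
int dsk_is_dhf(const struct dsk_frame *frm)
{
	const int16_t *ref;

	ref = (frm->mode & DSK_MODE_EFR_DHF) ? dsk_dhf_efr : dsk_dhf_mr122;
	return memcmp(frm->param, ref, sizeof(dsk_dhf_mr122)) == 0;
}
```

Because the selection travels with each frame, the same decoder session could
in principle serve either use case without any change to its state layout.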

The mechanism that allows libtwamr to be used for AMR-EFR hybrid implementation
(as opposed to the more conventional use case of implementing standard AMR-NB)
is kept out of the main stateful paths: there are no separate AMR-EFR hybrid
encoder or decoder sessions that are distinguishable from regular AMR encoding
and decoding in terms of state.  In the decoder direction, the main AMR frame
decoder function needs to know which DHF it should check for, but this
indication is embedded in the mode field in struct amr_param_frame and not in
the state.  In the encoder direction, the mechanism is a separate function
(stateless) that needs to be called between amr_encode_frame() and
EFR_params2frame().  This approach dovetails nicely with the core vs I/O
separation: the option of AMR-EFR hybrid can be viewed as a different I/O front
end to the same AMR engine, alongside the 3GPP AMR test sequence and RFC 4867
I/O options.
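
To make the encoder-side arrangement concrete, here is a sketch of what a
stateless substitution step between the encoder and the EFR frame packer might
look like.  It assumes that the substitution works by comparing the encoded
parameters against a known pattern, which this document does not spell out;
the esk_ names are hypothetical and the tables hold placeholder values rather
than real DHF parameters.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names and values; this is not the libtwamr API. */
#define ESK_NPRM 4	/* shortened for illustration */

/* Placeholder patterns, NOT the real DHF parameter values. */
static const int16_t esk_dhf_mr122[ESK_NPRM] = {1, 2, 3, 4};
static const int16_t esk_dhf_efr[ESK_NPRM] = {5, 6, 7, 8};

/*
 * Stateless post-processing step, assumed to work by pattern comparison:
 * if the encoder output equals the MR122 DHF pattern, overwrite it in
 * place with the EFR DHF pattern.  Returns 1 if a substitution was made,
 * 0 if the parameters were left untouched.
 */
int esk_dhf_subst(int16_t *param)
{
	if (memcmp(param, esk_dhf_mr122, sizeof(esk_dhf_mr122)) != 0)
		return 0;
	memcpy(param, esk_dhf_efr, sizeof(esk_dhf_efr));
	return 1;
}
```

A step of this shape is a no-op for ordinary speech frames and keeps no memory
between calls, which is why it can sit outside the encoder session.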

Please refer to the AMR-library-API article for further details.