view doc/FR1-library-desc @ 408:8847c1740e78

libtwamr: integrate VAD1
author Mychaela Falconia <falcon@freecalypso.org>
date Tue, 07 May 2024 00:56:10 +0000
parents 6b479cfb06a4

Themyscira libgsmfr2 is our new (2024) library offering for the GSM FRv1 codec,
replacing the previous combination of libgsm (the classic 1990s free software
offering) and libgsmfrp (our add-on).  That combination appeared satisfactory at
first because of how the decoder processing chain is defined for FRv1 (the Rx
DTX handler forms a modular piece, passing a frame of 260 bits to an unmodified
"pure" GSM 06.10 decoder), but use of legacy libgsm presents some difficulties:

* Inconvenience and inconsistency: for all other supported GSM speech codecs,
  Themyscira libraries provide the complete solution, not depending on anyone
  else's software, but for FRv1 we depended on a library that was created in a
  different era for non-GSM applications but just happens, almost by accident,
  to be a bit-exact implementation of GSM 06.10 spec.

* Poor design in frame packing and unpacking functions: the operations of
  bit-shuffling a GSM-FR codec frame between the array-of-parameters form
  (76 words) and the packed RTP format used in IP-based GSM RAN (33 bytes) are
  stateless pure functions, but their implementations in libgsm (gsm_explode()
  and gsm_implode()) require a state structure.  (Libgsm supports the WAV49
  format in addition to the RTP-adopted one, and WAV49 packing is stateful, but
  this WAV49 feature is of no use and no relevance in real GSM applications.)

* No ability to implement homing: the internal state structure used by libgsm
  is set to the home state when it is allocated with gsm_create(), but there is
  no reset function, and such a function cannot be implemented externally when
  the state structure is private to the library and not exposed.  Therefore,
  the optional codec homing feature defined in later versions of the GSM 06.10
  spec cannot be implemented in a wrapper around legacy libgsm, causing the
  resulting FOSS implementation to be inferior to existing commercial
  implementations (deployed in practice by incumbent nation-scale networks)
  which do implement this feature.
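
To illustrate the frame packing point above, here is a minimal sketch of how
the conversion between the 76-parameter array and the 33-byte RTP frame can be
written as a pair of stateless functions.  This is not the actual libgsmfr2
code: the names fr_pack_sketch(), fr_unpack_sketch() and fr_param_bits[] are
hypothetical, chosen only for this example.  The sketch assumes the standard
RTP packing for GSM-FR (a 4-bit 0xD signature followed by the 260 parameter
bits, most significant bit first) and the GSM 06.10 parameter widths.

```c
#include <stdint.h>
#include <string.h>

#define FR_NUM_PARAMS    76
#define FR_PACKED_BYTES  33
#define FR_RTP_SIGNATURE 0xD    /* upper nibble of the first byte */

/* Bit width of each parameter, in transmission order (sums to 260). */
static const uint8_t fr_param_bits[FR_NUM_PARAMS] = {
    6, 6, 5, 5, 4, 4, 3, 3,                              /* LARc[0..7] */
    7, 2, 2, 6, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,   /* subframe 0 */
    7, 2, 2, 6, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,   /* subframe 1 */
    7, 2, 2, 6, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,   /* subframe 2 */
    7, 2, 2, 6, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,   /* subframe 3 */
};  /* each subframe row: Nc, bc, Mc, xmaxc, xMc[0..12] */

/* Append the low nbits of val to buf at bit offset *pos, MSB first. */
static void put_bits(uint8_t *buf, unsigned *pos, unsigned val, unsigned nbits)
{
    while (nbits--) {
        if ((val >> nbits) & 1)
            buf[*pos >> 3] |= (uint8_t)(0x80 >> (*pos & 7));
        (*pos)++;
    }
}

/* Read nbits from buf at bit offset *pos, MSB first. */
static unsigned get_bits(const uint8_t *buf, unsigned *pos, unsigned nbits)
{
    unsigned val = 0;

    while (nbits--) {
        val = (val << 1) | ((buf[*pos >> 3] >> (7 - (*pos & 7))) & 1u);
        (*pos)++;
    }
    return val;
}

/* Pure function: no codec state argument, output depends only on input. */
void fr_pack_sketch(const int16_t *params, uint8_t *frame)
{
    unsigned pos = 0;
    int i;

    memset(frame, 0, FR_PACKED_BYTES);
    put_bits(frame, &pos, FR_RTP_SIGNATURE, 4);
    for (i = 0; i < FR_NUM_PARAMS; i++)
        put_bits(frame, &pos, (unsigned) params[i], fr_param_bits[i]);
}

/* Returns 0 on success, -1 if the 0xD signature is missing. */
int fr_unpack_sketch(const uint8_t *frame, int16_t *params)
{
    unsigned pos = 0;
    int i;

    if (get_bits(frame, &pos, 4) != FR_RTP_SIGNATURE)
        return -1;
    for (i = 0; i < FR_NUM_PARAMS; i++)
        params[i] = (int16_t) get_bits(frame, &pos, fr_param_bits[i]);
    return 0;
}
```

Neither function takes a state pointer, so a debug utility could call them
without allocating any encoder or decoder instance.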

In response to the above issues, we now have a new library named libgsmfr2 that
provides all needed functions for the GSM-FR codec "under one roof", harmonized
with our support for other GSM speech codecs.  However, the modularity that is
inherent in the way this codec is defined in the specs (contrast with EFR) is
still retained in the design of our library, which exhibits the following four
functional divisions:

1) Libgsmfr2 includes a "pure" GSM 06.10 encoder and decoder, directly
   corresponding to old libgsm.  This code is in fact lifted from libgsm,
   ported into Themyscira gsm-codec-lib style in terms of C dialect, defined
   types and naming conventions.  In addition, the reset function that is
   missing in libgsm is now provided, fixing that defect.

2) The Rx DTX handler component is unchanged from our previous libgsmfrp.  Per
   the relevant specs (GSM 06.11, 06.12 and 06.31) this component is a modular
   piece: it emits a standard 260-bit frame of GSM 06.10 parameters that can be
   fed to anyone else's implementation of the latter standard.

3) Full decoder: this component is a wrapper around an 06.10 decoder instance
   and an Rx DTX instance, providing functionality equivalent to the standard
   decoder function in other GSM speech codecs.

4) Stateless utility functions for frame format conversions (packing and
   unpacking) and for incoming SID classification.  An application can freely
   use just these functions, without pulling in any encoder or decoder or
   stateful preprocessor functionality, making the present library very
   convenient for debug utilities.

The homing feature is available in both encoder and decoder directions, but it
is implemented differently between the two:

* In the encoder direction, if the application wishes to enable the possibility
  of in-band homing, it needs to call gsmfr_0610_encoder_homing() after the
  regular call to gsmfr_0610_encode_frame() or gsmfr_0610_encode_params().

* In the decoder direction, the homing feature is always included if one uses
  the "fulldec" (full decoder) wrapper (it is implemented in that layer), and
  never included if one uses the "basic" GSM 06.10 decoder by itself, or the
  Rx DTX handler block by itself.
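
The encoder-side call order described above can be sketched as follows.  This
is an illustration only, not the libgsmfr2 implementation: the toy_* names and
the toy state structure are hypothetical stand-ins, and the actual signature of
gsmfr_0610_encoder_homing() is not shown in this document.  The sketch also
assumes that the encoder homing frame consists of 160 PCM samples all equal to
0x0008; verify this against the GSM 06.10 spec before relying on it.

```c
#include <stdint.h>
#include <string.h>

#define FRAME_SAMPLES 160
#define EHF_SAMPLE    0x0008    /* assumed encoder homing frame sample value */

/* Hypothetical stand-in for the real (much larger) encoder state. */
struct toy_encoder_state {
    int32_t history;
};

/* Put the state back into its home (freshly initialized) condition. */
static void toy_encoder_reset(struct toy_encoder_state *st)
{
    memset(st, 0, sizeof *st);
}

/* Stand-in for the regular encode call: it merely perturbs the state. */
static void toy_encode_frame(struct toy_encoder_state *st, const int16_t *pcm)
{
    int i;

    for (i = 0; i < FRAME_SAMPLES; i++)
        st->history += pcm[i];
}

/* Returns 1 if every input sample matches the homing pattern, else 0. */
static int is_encoder_homing_frame(const int16_t *pcm)
{
    int i;

    for (i = 0; i < FRAME_SAMPLES; i++)
        if (pcm[i] != EHF_SAMPLE)
            return 0;
    return 1;
}

/* Stand-in for the homing call: reset only if the input was a homing frame. */
static void toy_encoder_homing(struct toy_encoder_state *st, const int16_t *pcm)
{
    if (is_encoder_homing_frame(pcm))
        toy_encoder_reset(st);
}

/*
 * Application-level pattern: encode first, then run the homing check on the
 * same input.  The homing frame itself is encoded with the pre-reset state,
 * and the reset takes effect starting with the next frame.
 */
void toy_encode_with_homing(struct toy_encoder_state *st, const int16_t *pcm)
{
    toy_encode_frame(st, pcm);
    toy_encoder_homing(st, pcm);
}
```

An application that does not want in-band homing would simply omit the second
call, which is why the feature is opt-in on the encoder side.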

The only major feature of the GSM-FR codec that is currently absent from
libgsmfr2 is application of DTX in the encoder direction: GSM 06.32 VAD followed
by DTX hangover logic and SID output.  This omission is currently acceptable for
Themyscira Wireless: DTXd (DTX in the radio downlink direction) is not allowed
when each BTS operates at only one carrier frequency, which makes it pointless
to enable DTX for speech encoding at the network edge transcoder, and we are not
trying to replace TI Calypso DSP on the MS side of GSM.  However, if this
situation changes and some need arises for a FOSS implementation of a
DTX-enabled GSM-FR encoder, the architecture of Themyscira libgsmfr2 should make
it possible to integrate such an addition.