view doc/FR1-library-API @ 391:be8edf9e6bc1

libtwamr: integrate pitch_fr.c
author Mychaela Falconia <falcon@freecalypso.org>
date Mon, 06 May 2024 18:30:00 +0000

Libgsmfr2 general usage
=======================

The external public interface to Themyscira libgsmfr2 consists of a single
header file <tw_gsmfr.h>; it should be installed in some system include
directory.

The dialect of C used by all Themyscira GSM codec libraries is ANSI C (with
function prototypes); the const qualifier is used where appropriate, and the
interface is defined in terms of <stdint.h> types, which <tw_gsmfr.h> includes
itself.  The old libgsm defined types (gsm_byte, gsm_frame and gsm_signal) were
abolished in the migration from libgsm+libgsmfrp to libgsmfr2.

GSM 06.10 encoder and decoder
=============================

Both the encoder and the decoder are stateful; each running instance of either
element needs its own state structure.  However, this GSM 06.10 component of
libgsmfr2 shares a peculiar property with old libgsm from which it was derived:
the same state structure (struct gsmfr_0610_state) is used by both entities.
Needless to say, each given instance of struct gsmfr_0610_state must be used
for only one purpose, either for the encoder or for the decoder; mixing calls
to encoder and decoder functions with the same state structure is an invalid
operation with undefined results.

State structures for the basic encoder or decoder are allocated with this
function:

struct gsmfr_0610_state *gsmfr_0610_create(void);

This function allocates dynamic memory for the state structure with malloc()
(the size of the struct is internal to the library and not exposed) and returns
a pointer to the allocated and initialized struct if successful, or NULL if
malloc() fails.  The state structure is malloc'ed as a single chunk, hence when
you are done with it, simply free() it.

The initialization or reset portion of the gsmfr_0610_create() operation can
always be repeated with this function:

void gsmfr_0610_reset(struct gsmfr_0610_state *state);

Immediately after gsmfr_0610_create() or gsmfr_0610_reset(), the "virgin" state
structure can be used either for the encoder or for the decoder; however, once
that state struct has been passed to functions of either group, it can only be
used for that functional group.
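
The library itself does not record which role a state struct has been
committed to, so an application may want to track that on its own.  The
following is a minimal sketch of one way to do so; struct tagged_state, enum
codec_role, claim_role() and mark_reset() are hypothetical names invented for
this illustration and are not part of libgsmfr2.

```c
#include <stddef.h>

/* Opaque type from <tw_gsmfr.h>; only forward-declared here so that the
 * sketch compiles on its own. */
struct gsmfr_0610_state;

/* Hypothetical application-side bookkeeping, not a libgsmfr2 API. */
enum codec_role { ROLE_VIRGIN, ROLE_ENCODER, ROLE_DECODER };

struct tagged_state {
	struct gsmfr_0610_state *st;	/* from gsmfr_0610_create() */
	enum codec_role role;		/* ROLE_VIRGIN after create or reset */
};

/*
 * Call before passing ts->st to an encoder or decoder function.
 * Returns 0 if the state may be used in the requested role, -1 if it
 * has already been committed to the other role.
 */
static int claim_role(struct tagged_state *ts, enum codec_role want)
{
	if (want == ROLE_VIRGIN)
		return -1;		/* not a usable role */
	if (ts->role == ROLE_VIRGIN)
		ts->role = want;	/* first use commits the state */
	return ts->role == want ? 0 : -1;
}

/* Call right after gsmfr_0610_reset(ts->st): the state is virgin again. */
static void mark_reset(struct tagged_state *ts)
{
	ts->role = ROLE_VIRGIN;
}
```

A caller would invoke claim_role() with ROLE_ENCODER before each encoder call
and with ROLE_DECODER before each decoder call, treating a -1 return as a
programming error.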

Encoder specifics
-----------------

The most elementary single-frame processing function of the libgsmfr2 GSM 06.10
encoder is:

void gsmfr_0610_encode_params(struct gsmfr_0610_state *st, const int16_t *pcm,
			      struct gsmfr_param_frame *param);

The input is an array of 160 linear PCM samples (left-justified in int16_t),
and the output is this structure:

struct gsmfr_param_frame {
	int16_t	LARc[8];
	int16_t	Nc[4];
	int16_t	bc[4];
	int16_t	Mc[4];
	int16_t	xmaxc[4];
	int16_t	xMc[4][13];
};

Most of the time the following wrapper function is more useful:

void gsmfr_0610_encode_frame(struct gsmfr_0610_state *st, const int16_t *pcm,
			     uint8_t *frame);

The output is a 33-byte buffer, filled with the encoded GSM-FR speech frame in
the RTP format specified in ETSI TS 101 318 and IETF RFC 3551.
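
The 33-byte size follows from the GSM 06.10 bit allocation: a 4-bit signature
followed by 260 bits of codec parameters.  The arithmetic can be sketched as
below.  The field widths are the standard ones from GSM 06.10 and RFC 3551; the
identifiers are made up for this illustration and do not appear in
<tw_gsmfr.h>.

```c
/* Bit widths of the eight LARc parameters. */
static const int larc_bits[8] = { 6, 6, 5, 5, 4, 4, 3, 3 };

/* Bit widths of the per-subframe parameters and of the signature. */
enum {
	NC_BITS    = 7,
	BC_BITS    = 2,
	MC_BITS    = 2,
	XMAXC_BITS = 6,
	XMC_BITS   = 3,		/* each of the 13 xMc pulses */
	SIG_BITS   = 4		/* RTP signature nibble 0xD */
};

/* Total number of bits in one RTP-format GSM-FR frame. */
static int fr_rtp_frame_bits(void)
{
	int bits = SIG_BITS;
	int i;

	for (i = 0; i < 8; i++)
		bits += larc_bits[i];		/* 36 bits of LARc */
	/* four subframes of 7 + 2 + 2 + 6 + 13*3 = 56 bits each */
	bits += 4 * (NC_BITS + BC_BITS + MC_BITS + XMAXC_BITS + 13 * XMC_BITS);
	return bits;				/* 4 + 36 + 224 = 264 */
}
```

The total of 264 bits divides evenly into 33 bytes, so the RTP format has no
padding bits.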

If the optional encoder homing feature is desired, call this function right
after the call to gsmfr_0610_encode_frame() or gsmfr_0610_encode_params():

void gsmfr_0610_encoder_homing(struct gsmfr_0610_state *st, const int16_t *pcm);

This function checks whether the PCM frame (160 linear PCM samples) is an
encoder homing frame (EHF); if it is, the function calls gsmfr_0610_reset().
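
For illustration, the EHF test can be sketched as follows, assuming the usual
definition of the GSM-FR encoder homing frame: 160 identical samples whose
13-bit value is 1, i.e., 0x0008 when left-justified in int16_t.  The function
and macro names are hypothetical, and this document does not say whether the
library compares all 16 bits or masks off the 3 padding bits, so this is a
sketch of the idea rather than a description of the library's internals.

```c
#include <stdint.h>

#define	EHF_SAMPLE	0x0008	/* 13-bit value 1, left-justified in 16 bits */
#define	FRAME_SAMPLES	160

/* Returns 1 if all 160 samples match the assumed EHF value, 0 otherwise. */
static int pcm_frame_is_ehf(const int16_t *pcm)
{
	int i;

	for (i = 0; i < FRAME_SAMPLES; i++) {
		if (pcm[i] != EHF_SAMPLE)
			return 0;
	}
	return 1;
}
```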

Decoder specifics
-----------------

The internal native form of the 06.10 decoder once again uses
struct gsmfr_param_frame:

void gsmfr_0610_decode_params(struct gsmfr_0610_state *st,
			      const struct gsmfr_param_frame *param,
			      int16_t *pcm);

The more commonly used RTP-format version is:

void gsmfr_0610_decode_frame(struct gsmfr_0610_state *st, const uint8_t *frame,
			     int16_t *pcm);

Please note:

1) The basic GSM 06.10 decoder is just that: there is no SID (silence
   descriptor) recognition or DTX (discontinuous transmission) handling; every
   possible input bit pattern will be interpreted and decoded as a GSM 06.10
   speech frame.

2) There is no decoder homing function at this layer, and no check for the
   decoder homing frame (DHF).

3) The RTP signature nibble 0xD is ignored (not checked) by
   gsmfr_0610_decode_frame().
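
Because the decoder does not verify the signature, an application that wants to
reject malformed RTP payloads has to perform that check itself before calling
gsmfr_0610_decode_frame().  A minimal sketch, with a hypothetical helper name
that is not part of libgsmfr2:

```c
#include <stdint.h>

/* Returns 1 if the upper nibble of the first byte is the 0xD signature. */
static int fr_frame_has_signature(const uint8_t *frame)
{
	return (frame[0] & 0xF0) == 0xD0;
}
```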

Rx DTX preprocessor block
=========================

The Rx DTX preprocessor is its own stateful element, independent from the 06.10
decoder to which it is usually coupled.  Libgsmfr2 provides a "fulldec" wrapper
that incorporates both elements, but the ability to use the Rx DTX preprocessor
by itself still remains, unchanged from our previous libgsmfrp offering.  One
potential application for this preprocessor by itself, without immediately
following it with the GSM 06.10 decode step, is the possibility of implementing
the TFO/TrFO transform of 3GPP TS 28.062 section C.3.2.1.1 for GSM-FR: our Rx
DTX preprocessor does exactly what that section calls for, specifically in
"case 1" where the input UL frame stream may contain SIDs and BFI frame gaps,
but the output must be 100% valid frames and SID-free.

The state structure for this block is struct gsmfr_preproc_state, and it is
allocated with this function:

struct gsmfr_preproc_state *gsmfr_preproc_create(void);

Like other state structures in Themyscira GSM codec libraries, this opaque
state is malloc'ed as a single chunk and can be simply freed afterward.  A
reset function is also provided:

void gsmfr_preproc_reset(struct gsmfr_preproc_state *state);

Preprocessing good frames
-------------------------

For every good traffic frame (BFI=0, i.e., no bad frame indication) you receive
from the radio subsystem, you need to call this preprocessor function:

void gsmfr_preproc_good_frame(struct gsmfr_preproc_state *state,
				uint8_t *frame);

The second argument is both input and output, i.e., the frame is modified in
place.  If the received frame is not SID (specifically, if the SID field
deviates from the SID codeword by 16 or more bits, per GSM 06.31 section 6.1.1),
then the frame (considered a good speech frame) will be left unmodified (i.e.,
it is to be passed unchanged to the GSM 06.10 decoder), but preprocessor state
will be updated.  OTOH, if the received frame is classified as either valid or
invalid SID per GSM 06.31, then the output frame will contain comfort noise
generated by the preprocessor using a PRNG, or a silence frame in one particular
corner case.

GSM-FR RTP (originally libgsm) 0xD magic: the upper nibble of the first byte
can be anything on input to gsmfr_preproc_good_frame(), but the output frame
will always have the correct magic in it.

Handling BFI conditions
-----------------------

If you received a lost/missing frame indication instead of a good traffic frame,
call this preprocessor function:

void gsmfr_preproc_bfi(struct gsmfr_preproc_state *state, int taf,
			uint8_t *frame_out);

TAF (time alignment flag) is defined in GSM 06.31 section 6.1.1; if you don't
have this flag, pass 0 - you will lose the function of comfort noise muting in
the event of prolonged SID loss, but all other Rx DTX functions will still work
the same.

With this function the 33-byte frame buffer is only an output, i.e., prior
buffer content is a don't-care; unlike in EFR, there is no provision for making
any use of erroneous frames.  The frame generated by the preprocessor may be a
substitution/muting frame, comfort noise or silence, depending on the state.

GSM-FR full decoder
===================

The full decoder is a high-level feature of libgsmfr2, incorporating both the
Rx DTX preprocessor block and the GSM 06.10 decoder block.  The state structure
for the full decoder (struct gsmfr_fulldec_state) internally incorporates both
struct gsmfr_0610_state and struct gsmfr_preproc_state, but because it is
implemented inside libgsmfr2, it is still malloc'ed as a single chunk and can
thus be released with a single free() call.  The functions for allocating and
initializing this state follow the established pattern:

struct gsmfr_fulldec_state *gsmfr_fulldec_create(void);

void gsmfr_fulldec_reset(struct gsmfr_fulldec_state *state);

The reset function internally calls gsmfr_0610_reset() and
gsmfr_preproc_reset(), initializing both processing blocks.

Frame processing functions are also straightforward:

void gsmfr_fulldec_good_frame(struct gsmfr_fulldec_state *state,
				const uint8_t *frame, int16_t *pcm);

void gsmfr_fulldec_bfi(struct gsmfr_fulldec_state *state, int taf,
			int16_t *pcm);

These functions follow the same pattern as gsmfr_preproc_good_frame() and
gsmfr_preproc_bfi(), but the output is a 160-sample linear PCM buffer.  Also
note that the frame input to gsmfr_fulldec_good_frame() is const, unlike the
situation with gsmfr_preproc_good_frame() - the copying into a scratchpad
buffer (on the stack) happens inside this "fulldec" wrapper.

The "fulldec" layer also adds the decoder homing feature:
gsmfr_fulldec_good_frame() detects decoder homing frames and invokes
gsmfr_fulldec_reset() when required, and also implements EHF output per the
spec.

Stateless utility functions
===========================

Conversions between RTP packed format and broken-down codec parameters are
stateless and implemented with highly efficient code.  There are two versions;
this version converts between packed frames and struct gsmfr_param_frame used
by 06.10 encoder and decoder functions:

void gsmfr_pack_frame(const struct gsmfr_param_frame *param, uint8_t *frame);
void gsmfr_unpack_frame(const uint8_t *frame, struct gsmfr_param_frame *param);

and this version converts between packed frames and a straight linear array of
76 parameters:

void gsmfr_pack_from_array(const int16_t *params, uint8_t *frame);
void gsmfr_unpack_to_array(const uint8_t *frame, int16_t *params);

The latter functions gsmfr_pack_from_array() and gsmfr_unpack_to_array() are
drop-in replacements for gsm_implode() and gsm_explode() from old libgsm.  The
order of parameters in this array is the canonical one: first all LARc, then
all params for the first subframe, then the second subframe, then the third and
the fourth.  OTOH, struct gsmfr_param_frame uses functional grouping, chosen
for ease of porting of original libgsm code.
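
The relation between the two layouts can be sketched as below.  The struct is
repeated from the definition given earlier only so that the sketch compiles on
its own; params_struct_to_array() is a hypothetical helper, not a libgsmfr2
function.  The order within each subframe (Nc, bc, Mc, xmaxc, then the 13 xMc
pulses) is assumed here to be the usual GSM 06.10 / libgsm order.

```c
#include <stdint.h>

/* Same layout as in <tw_gsmfr.h>; repeated here for a standalone sketch. */
struct gsmfr_param_frame {
	int16_t	LARc[8];
	int16_t	Nc[4];
	int16_t	bc[4];
	int16_t	Mc[4];
	int16_t	xmaxc[4];
	int16_t	xMc[4][13];
};

/* Flatten functional grouping into the canonical 76-parameter order. */
static void params_struct_to_array(const struct gsmfr_param_frame *p,
				   int16_t *arr)
{
	int i, sub, n = 0;

	for (i = 0; i < 8; i++)
		arr[n++] = p->LARc[i];
	for (sub = 0; sub < 4; sub++) {
		/* 17 parameters per subframe */
		arr[n++] = p->Nc[sub];
		arr[n++] = p->bc[sub];
		arr[n++] = p->Mc[sub];
		arr[n++] = p->xmaxc[sub];
		for (i = 0; i < 13; i++)
			arr[n++] = p->xMc[sub][i];
	}
	/* n is now 8 + 4 * 17 = 76 */
}
```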

Both unpacking functions (gsmfr_unpack_frame() and gsmfr_unpack_to_array())
ignore the upper nibble of the first byte, i.e., the 0xD signature is not
enforced.  However, this signature is always set correctly by
gsmfr_pack_frame() and gsmfr_pack_from_array(), and also by the
gsmfr_0610_encode_frame() function, which calls gsmfr_pack_frame() as its
finishing step.

The last remaining stateless utility function performs SID classification of
received GSM-FR frames:

int gsmfr_preproc_sid_classify(const uint8_t *frame);

This function analyzes an RTP-encoded FR frame (the upper nibble of the first
byte is NOT checked for 0xD signature) for the SID codeword of GSM 06.12 and
classifies the frame as SID=0, SID=1 or SID=2 per the rules of GSM 06.31
section 6.1.1.  This classification is the first processing step performed by
gsmfr_preproc_good_frame().
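
The final decision step of this classification can be sketched as a mapping
from the number of SID field bits that deviate from the codeword to the SID
class.  The threshold of 16 bits is the one quoted earlier in this document;
the threshold of 2 bits between valid and invalid SID is this sketch's reading
of GSM 06.31 and should be checked against the spec.  The function name is
hypothetical, and counting the deviating bits in an RTP-format frame is not
shown.

```c
/*
 * Map the number of bits deviating from the SID codeword to the SID
 * class.  Thresholds: 16 per the text above, 2 assumed from GSM 06.31.
 */
static int sid_class_from_deviations(int n_deviating_bits)
{
	if (n_deviating_bits < 2)
		return 2;	/* SID=2: valid SID frame */
	if (n_deviating_bits < 16)
		return 1;	/* SID=1: invalid SID frame */
	return 0;		/* SID=0: good speech frame */
}
```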

Public constant definitions
===========================

Our public header file <tw_gsmfr.h> provides these constant definitions: the
length of an RTP-format frame in bytes, and the number of parameters in the
linear parameter array:

#define	GSMFR_RTP_FRAME_LEN	33
#define	GSMFR_NUM_PARAMS	76

Public const data items
=======================

There are two special GSM-FR frame bit patterns defined in the specs: there is
the silence frame of GSM 06.11, and there is the decoder homing frame specified
in later versions of GSM 06.10.  RTP-packed representations of both frames are
included in libgsmfr2, and are made public:

extern const uint8_t gsmfr_preproc_silence_frame[GSMFR_RTP_FRAME_LEN];
extern const uint8_t gsmfr_decoder_homing_frame[GSMFR_RTP_FRAME_LEN];