diff doc/EFR-library-API @ 123:92fdb499b5c3

doc/EFR-library-API article written
author Mychaela Falconia <falcon@freecalypso.org>
date Sat, 10 Dec 2022 22:01:14 +0000
parents
children 1c529bb31219
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/EFR-library-API	Sat Dec 10 22:01:14 2022 +0000
@@ -0,0 +1,182 @@
+The external public interface to Themyscira libgsmefr consists of a single
+header file <gsm_efr.h>, which should be installed in the same system include
+directory as <gsm.h> from classic libgsm (a 1990s free software product) for
+the original FR codec.  The API of libgsmefr is modeled after that of libgsm.
+
+The dialect of C we chose for libgsmefr is ANSI C (function prototypes).  The
+const qualifier is used where appropriate, and the interface is defined in terms
+of <stdint.h> types; <gsm_efr.h> includes <stdint.h>.
+
+State allocation and freeing
+============================
+
+In order to use the EFR encoder, you will need to allocate an encoder state
+structure, and to use the EFR decoder, you will need to allocate a decoder state
+structure.  The necessary state allocation functions are:
+
+extern struct EFR_encoder_state *EFR_encoder_create(int dtx);
+extern struct EFR_decoder_state *EFR_decoder_create(void);
+
+struct EFR_encoder_state and struct EFR_decoder_state are opaque structures to
+library users: you only get pointers which you remember and pass around, but
+<gsm_efr.h> does not give you full definitions of these structs.  As a library
+user, you don't even get to know the size of these structs, hence the necessary
+malloc() operation happens inside EFR_encoder_create() and EFR_decoder_create().
+However, each structure is malloc'ed as a single chunk, hence when you are done
+with it, simply call free() to relinquish each encoder or decoder state
+instance.
+
+Both EFR_encoder_create() and EFR_decoder_create() return NULL if the malloc()
+call inside them fails.
+
+The dtx argument to EFR_encoder_create() is a Boolean flag represented as an
+int; it tells the EFR encoder whether it should operate with DTX enabled (run
+GSM 06.82 VAD and emit SID frames instead of speech frames per GSM 06.81) or DTX
+disabled (skip VAD and always emit speech frames).
+
+Using the EFR encoder
+=====================
+
+To encode one 20 ms audio frame per EFR, call EFR_encode_frame():
+
+extern void EFR_encode_frame(struct EFR_encoder_state *st, const int16_t *pcm,
+			     uint8_t *frame, int *sp, int *vad);
+
+You need to provide an encoder state structure allocated earlier with
+EFR_encoder_create(), a block of 160 linear PCM samples, and an output buffer of
+31 bytes (EFR_RTP_FRAME_LEN constant also defined in <gsm_efr.h>) into which the
+encoded EFR frame will be written; the frame format is that defined in ETSI TS
+101 318 for EFR in RTP, including the 0xC signature in the upper nibble of the
+first byte.
+
+The last two arguments of type (int *) are optional pointers to extra output
+flags SP and VAD, defined in GSM 06.81 section 5.1.1; either pointer or both of
+them can be NULL if these extra output flags aren't needed.  Both of these flags
+are needed in order to test our libgsmefr encoder implementation against
+official ETSI test sequences (GSM 06.54), but they typically aren't needed
+otherwise.
+
+Using the EFR decoder
+=====================
+
+The main interface to our EFR decoder is this function:
+
+extern void EFR_decode_frame(struct EFR_decoder_state *st, const uint8_t *frame,
+			     int bfi, int taf, int16_t *pcm);
+
+The inputs consist of 244 bits of frame payload (the 4 upper bits of the first
+byte are ignored - there is NO enforcement of 0xC signature in our frame
+decoder) and BFI and TAF flags defined in GSM 06.81 section 6.1.1.  Note the
+absence of a SID flag argument: EFR_decode_frame() calls our own utility
+function EFR_sid_classify() to determine SID from the frame itself per the rules
+of GSM 06.81 section 6.1.1.
+
+Many EFR decoder applications will also be faced with a situation where they
+receive a frame gap (no data at all), and they need to run the EFR decoder with
+BFI=1, but don't have any frame-bits input.  If you find yourself in this
+situation, call the following function:
+
+extern void EFR_decode_bfi_nodata(struct EFR_decoder_state *st, int taf,
+				  int16_t *pcm);
+
+EFR_decode_bfi_nodata() is equivalent to calling EFR_decode_frame() with a frame
+buffer of 31 zero bytes (or 0xC signature followed by 244 zero bits) and BFI=1,
+but is slightly more efficient in that the internal steps of EFR_frame2params()
+and EFR_sid_classify() are skipped, and the made-up "frame" of 244 zero bits is
+passed to the decoder core at the params array level.
+
+Note that the official EFR decoder from ETSI, which we've replicated in our
+librified form in libgsmefr, does make use of some presumed-invalid frame data
+bits under BFI=1 conditions: see the description in GSM 06.61 section 6.1, where
+the last sentence reads "The received fixed codebook excitation pulses from the
+erroneous frame are always used as such."  With our current implementation, the
+"erroneous frame" in the case of completely lost or missing frames is a made-up
+frame of 244 zero bits; the question of whether this approach is good enough or
+if we need to do something more complex remains for further study.
+
+Stateless utility functions
+===========================
+
+All functions in this section are stateless (no encoder state or decoder state
+structure is needed); they merely manipulate bit fields.
+
+extern void EFR_frame2params(const uint8_t *frame, int16_t *params);
+
+This function unpacks an EFR codec frame in ETSI TS 101 318 RTP encoding (the
+upper nibble of the first byte is NOT checked, i.e., there is NO enforcement of
+0xC signature) into an array of 57 (EFR_NUM_PARAMS) parameter words for the
+codec.  The signed int16_t type is used for the params array (even though all
+parameters are actually unsigned) in order to match the guts of the ETSI-based
+EFR codec.  EFR_frame2params() is called internally by EFR_decode_frame().
+
+extern void EFR_params2frame(const int16_t *params, uint8_t *frame);
+
+This function takes an array of 57 (EFR_NUM_PARAMS) EFR codec parameter words
+and packs them into a 31-byte (EFR_RTP_FRAME_LEN) frame in ETSI TS 101 318
+format.  The 0xC signature is generated by this function, and every byte of the
+output buffer is fully written without regard to any previous content.  This
+function is called internally by EFR_encode_frame().
+
+extern int EFR_sid_classify(const uint8_t *frame);
+
+This function analyzes an RTP-encoded EFR frame (the upper nibble of the first
+byte is NOT checked for 0xC signature) for the SID codeword of GSM 06.62 and
+classifies the frame as SID=0, SID=1 or SID=2 per the rules of GSM 06.81
+section 6.1.1.
+
+extern void EFR_insert_sid_codeword(uint8_t *frame);
+
+This function inserts the SID codeword of GSM 06.62 into the frame in the
+pointed-to buffer; specifically, the 95 bits that make up the SID field are all
+set to 1s, but all other bits remain unchanged.  This function is arguably the
+least useful to external users of libgsmefr, but it exists because of how the
+original code from ETSI generates SID frames produced by the encoder in DTX
+mode.
+
+Parameter-based encoder and decoder functions
+=============================================
+
+The EFR_encode_frame() and EFR_decode_frame() functions described earlier in
+this document are the interfaces intended for practical use, but they are
+actually wrappers around these parameter-based functions:
+
+extern void EFR_encode_params(struct EFR_encoder_state *st, const int16_t *pcm,
+			      int16_t *params, int *sp, int *vad);
+
+This function is similar to EFR_encode_frame(), but the output is an array of
+57 (EFR_NUM_PARAMS) codec parameter words rather than a finished frame.  The two
+extra output flags are optional (pointers may be NULL) just like with
+EFR_encode_frame(), but there is a catch: if the output frame is a SID (which
+can only happen if DTX is enabled), the bits inside parameter words that would
+correspond to SID codeword bits are NOT set; instead, one MUST call
+EFR_insert_sid_codeword() after packing the frame with EFR_params2frame().  The
+wrapper in EFR_encode_frame() does exactly as described, and the overall logic
+follows the original code structure from ETSI.
+
+extern void EFR_decode_params(struct EFR_decoder_state *st,
+			      const int16_t *params, int bfi, int sid, int taf,
+			      int16_t *pcm);
+
+This function is similar to EFR_decode_frame() with the frame input replaced
+with params array input, but the SID classification per the rules of GSM 06.81
+section 6.1.1 needs to be provided by the caller.  The wrapper in
+EFR_decode_frame() calls both EFR_frame2params() and EFR_sid_classify() before
+passing the work to EFR_decode_params().
+
+State reset functions
+=====================
+
+extern void EFR_encoder_reset(struct EFR_encoder_state *st, int dtx);
+extern void EFR_decoder_reset(struct EFR_decoder_state *st);
+
+These functions reset the state of the encoder or the decoder, respectively;
+the entire state structure is fully initialized to the respective home state
+defined in GSM 06.60 section 8.5 for the encoder or section 8.6 for the decoder.
+
+EFR_encoder_reset() is called internally by EFR_encoder_create() and by the
+encoder itself when it encounters the ETSI-prescribed encoder homing frame;
+EFR_decoder_reset() is called internally by EFR_decoder_create() and by the
+decoder itself when it encounters the ETSI-prescribed decoder homing frame.
+Therefore, there is generally no need for libgsmefr users to call these
+functions directly - but they are made public for the sake of completeness.
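
The frame layout described in the article above (a 0xC signature in the upper
nibble of the first byte, followed by 244 payload bits, for 31 bytes in total)
can be illustrated with a minimal sketch.  Everything named sketch_* or SKETCH_*
below is hypothetical and is NOT part of libgsmefr; the constants only mirror
the values stated in the article so that the sketch compiles without
<gsm_efr.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for values stated in the article; with the real header
 * one would use EFR_RTP_FRAME_LEN instead of SKETCH_FRAME_LEN. */
#define SKETCH_FRAME_LEN	31	/* bytes per RTP-encoded EFR frame */
#define SKETCH_PAYLOAD_BITS	244	/* codec payload bits per frame */
#define SKETCH_SIGNATURE	0xC	/* upper nibble of the first byte */

/*
 * Hypothetical helper: returns 1 if the first byte carries the 0xC
 * signature, 0 otherwise.  The article states that the library's frame
 * decoder does not check this nibble, so a caller that wants the check
 * has to do it on its own.
 */
static int sketch_has_signature(const uint8_t *frame)
{
	return (frame[0] >> 4) == SKETCH_SIGNATURE;
}

/*
 * Hypothetical helper: fills buf with the 0xC signature followed by
 * 244 zero bits, i.e. the made-up "frame" the article describes for
 * the no-data case.
 */
static void sketch_make_zero_frame(uint8_t *buf)
{
	memset(buf, 0, SKETCH_FRAME_LEN);
	buf[0] = SKETCH_SIGNATURE << 4;
}
```

Per the equivalence stated in the article, passing a buffer prepared by
sketch_make_zero_frame() to EFR_decode_frame() with BFI=1 should behave like
EFR_decode_bfi_nodata(), only slightly less efficiently.  A caller that wants
signature enforcement could call sketch_has_signature() on each received frame
before handing it to the decoder.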