view doc/EFR-library-API @ 143:195911f2211c
document PCM format conversion utilities
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Wed, 14 Dec 2022 07:04:59 +0000 |
The external public interface to Themyscira libgsmefr consists of a single header file <gsm_efr.h>; it should be installed in the same system include directory as <gsm.h> from classic libgsm (1990s free software product) for the original FR codec, and the API of libgsmefr is modeled after that of libgsm.

The dialect of C we chose for libgsmefr is ANSI C (function prototypes). The const qualifier is used where appropriate, and the interface is defined in terms of <stdint.h> types; <gsm_efr.h> includes <stdint.h>.

State allocation and freeing
============================

In order to use the EFR encoder, you will need to allocate an encoder state structure, and to use the EFR decoder, you will need to allocate a decoder state structure. The necessary state allocation functions are:

extern struct EFR_encoder_state *EFR_encoder_create(int dtx);
extern struct EFR_decoder_state *EFR_decoder_create(void);

struct EFR_encoder_state and struct EFR_decoder_state are opaque structures to library users: you only get pointers which you remember and pass around, but <gsm_efr.h> does not give you full definitions of these structs. As a library user, you don't even get to know the size of these structs, hence the necessary malloc() operation happens inside EFR_encoder_create() and EFR_decoder_create(). However, each structure is malloc'ed as a single chunk, hence when you are done with it, simply call free() to relinquish each encoder or decoder state instance.

EFR_encoder_create() and EFR_decoder_create() can fail if the malloc() call inside fails, in which case the function in question returns NULL.

The dtx argument to EFR_encoder_create() is a Boolean flag represented as an int; it tells the EFR encoder whether it should operate with DTX enabled (run GSM 06.82 VAD and emit SID frames instead of speech frames per GSM 06.81) or DTX disabled (skip VAD and always emit speech frames).
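
The allocation and freeing rules above can be sketched as a pair of small helpers. This is only an illustration: efr_session_open() and efr_session_close() are hypothetical names, not part of libgsmefr, and the struct definitions and create functions at the top are stand-ins that let the sketch compile by itself. A real program would include <gsm_efr.h> and link with libgsmefr instead.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Stand-in definitions so that this sketch compiles without libgsmefr.
 * In a real program, remove this part, #include <gsm_efr.h> instead and
 * link with libgsmefr; the real state structures are opaque to the caller.
 */
struct EFR_encoder_state { int dtx; };
struct EFR_decoder_state { int unused; };

struct EFR_encoder_state *EFR_encoder_create(int dtx)
{
	struct EFR_encoder_state *st = malloc(sizeof *st);

	if (st)
		st->dtx = dtx;
	return st;
}

struct EFR_decoder_state *EFR_decoder_create(void)
{
	return calloc(1, sizeof(struct EFR_decoder_state));
}
/* End of stand-in definitions. */

/*
 * Hypothetical helper: allocate one encoder and one decoder state.
 * Returns 0 on success or -1 if either allocation failed; on failure
 * nothing is leaked and both output pointers are set to NULL.
 */
int efr_session_open(int dtx, struct EFR_encoder_state **enc,
		     struct EFR_decoder_state **dec)
{
	assert(enc != NULL && dec != NULL);
	*enc = EFR_encoder_create(dtx);
	*dec = EFR_decoder_create();
	if (*enc == NULL || *dec == NULL) {
		free(*enc);	/* free(NULL) is a harmless no-op */
		free(*dec);
		*enc = NULL;
		*dec = NULL;
		return -1;
	}
	return 0;
}

/*
 * Hypothetical helper: each state is a single malloc'ed chunk,
 * so a plain free() is all the cleanup that is needed.
 */
void efr_session_close(struct EFR_encoder_state *enc,
		       struct EFR_decoder_state *dec)
{
	free(enc);
	free(dec);
}
```

Because the library never hands out anything but a single malloc'ed pointer per state, the failure path can simply free() whichever allocation succeeded.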

Using the EFR encoder
=====================

To encode one 20 ms audio frame per EFR, call EFR_encode_frame():

extern void EFR_encode_frame(struct EFR_encoder_state *st, const int16_t *pcm,
			     uint8_t *frame, int *sp, int *vad);

You need to provide an encoder state structure allocated earlier with EFR_encoder_create(), a block of 160 linear PCM samples, and an output buffer of 31 bytes (EFR_RTP_FRAME_LEN constant also defined in <gsm_efr.h>) into which the encoded EFR frame will be written; the frame format is that defined in ETSI TS 101 318 for EFR in RTP, including the 0xC signature in the upper nibble of the first byte.

The last two arguments of type (int *) are optional pointers to extra output flags SP and VAD, defined in GSM 06.81 section 5.1.1; either pointer or both of them can be NULL if these extra output flags aren't needed. Both of these flags are needed in order to test our libgsmefr encoder implementation against official ETSI test sequences (GSM 06.54), but they typically aren't needed otherwise.

Using the EFR decoder
=====================

The main interface to our EFR decoder is this function:

extern void EFR_decode_frame(struct EFR_decoder_state *st, const uint8_t *frame,
			     int bfi, int taf, int16_t *pcm);

The inputs consist of 244 bits of frame payload (the 4 upper bits of the first byte are ignored - there is NO enforcement of 0xC signature in our frame decoder) and BFI and TAF flags defined in GSM 06.81 section 6.1.1. Note the absence of a SID flag argument: EFR_decode_frame() calls our own utility function EFR_sid_classify() to determine SID from the frame itself per the rules of GSM 06.81 section 6.1.1.

Many EFR decoder applications will also be faced with a situation where they receive a frame gap (no data at all), and they need to run the EFR decoder with BFI=1, but don't have any frame-bits input.
If you find yourself in this situation, call the following function:

extern void EFR_decode_bfi_nodata(struct EFR_decoder_state *st, int taf,
				  int16_t *pcm);

EFR_decode_bfi_nodata() is equivalent to calling EFR_decode_frame() with a frame buffer of 31 zero bytes (or 0xC signature followed by 244 zero bits) and BFI=1, but is slightly more efficient in that the internal steps of EFR_frame2params() and EFR_sid_classify() are skipped, and the made-up "frame" of 244 zero bits is passed to the decoder core at the params array level.

Note that the official EFR decoder from ETSI, which we've replicated in our librified form in libgsmefr, does make use of some presumed-invalid frame data bits under BFI=1 conditions: see the description in GSM 06.61 section 6.1, where the last sentence reads "The received fixed codebook excitation pulses from the erroneous frame are always used as such." With our current implementation, the "erroneous frame" in the case of completely lost or missing frames is a made-up frame of 244 zero bits; the question of whether this approach is good enough or if we need to do something more complex remains for further study.

Stateless utility functions
===========================

All functions in this section are stateless (no encoder state or decoder state structure is needed); they merely manipulate bit fields.

extern void EFR_frame2params(const uint8_t *frame, int16_t *params);

This function unpacks an EFR codec frame in ETSI TS 101 318 RTP encoding (the upper nibble of the first byte is NOT checked, i.e., there is NO enforcement of 0xC signature) into an array of 57 (EFR_NUM_PARAMS) parameter words for the codec. int16_t signed type is used for the params array (even though all parameters are actually unsigned) in order to match the guts of ETSI-based EFR codec, and EFR_frame2params() is called internally by EFR_decode_frame().
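
Since neither EFR_decode_frame() nor EFR_frame2params() checks the 0xC signature, an application that wants that check has to perform it on its own before handing a frame to the library. The following is a minimal sketch of such a check, together with a helper that builds the made-up frame described above (0xC signature followed by 244 zero bits). Both helper names and the FRAME_LEN macro are hypothetical and exist only for this illustration; real code would use EFR_RTP_FRAME_LEN from <gsm_efr.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local name for this sketch; same value as EFR_RTP_FRAME_LEN in <gsm_efr.h>. */
#define FRAME_LEN	31

/* Hypothetical helper: nonzero if the upper nibble of the first byte is 0xC. */
int efr_frame_has_signature(const uint8_t *frame)
{
	assert(frame != NULL);
	return (frame[0] & 0xF0) == 0xC0;
}

/*
 * Hypothetical helper: build the made-up frame consisting of the
 * 0xC signature followed by 244 zero bits (4 + 244 = 248 bits = 31 bytes).
 */
void efr_make_zero_frame(uint8_t *frame)
{
	assert(frame != NULL);
	memset(frame, 0, FRAME_LEN);
	frame[0] = 0xC0;	/* low nibble holds the first 4 payload bits */
}
```

A receiver could call efr_frame_has_signature() on each incoming RTP payload and treat a mismatch however its own policy dictates; the library itself will decode the frame either way.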

extern void EFR_params2frame(const int16_t *params, uint8_t *frame);

This function takes an array of 57 (EFR_NUM_PARAMS) EFR codec parameter words and packs them into a 31-byte (EFR_RTP_FRAME_LEN) frame in ETSI TS 101 318 format. The 0xC signature is generated by this function, and every byte of the output buffer is fully written without regard to any previous content. This function is called internally by EFR_encode_frame().

extern int EFR_sid_classify(const uint8_t *frame);

This function analyzes an RTP-encoded EFR frame (the upper nibble of the first byte is NOT checked for 0xC signature) for the SID codeword of GSM 06.62 and classifies the frame as SID=0, SID=1 or SID=2 per the rules of GSM 06.81 section 6.1.1.

extern void EFR_insert_sid_codeword(uint8_t *frame);

This function inserts the SID codeword of GSM 06.62 into the frame in the pointed-to buffer; specifically, the 95 bits that make up the SID field are all set to 1s, but all other bits remain unchanged. This function is arguably the least useful to external users of libgsmefr, but it exists because of how the original code from ETSI generates SID frames produced by the encoder in DTX mode.

Parameter-based encoder and decoder functions
=============================================

The EFR_encode_frame() and EFR_decode_frame() functions described earlier in this document are the interfaces intended for actual use, but they are actually wrappers around these parameter-based functions:

extern void EFR_encode_params(struct EFR_encoder_state *st, const int16_t *pcm,
			      int16_t *params, int *sp, int *vad);

This function is similar to EFR_encode_frame(), but the output is an array of 57 (EFR_NUM_PARAMS) codec parameter words rather than a finished frame.
The two extra output flags are optional (pointers may be NULL) just like with EFR_encode_frame(), but there is a catch: if the output frame is a SID (which can only happen if DTX is enabled), the bits inside parameter words that would correspond to SID codeword bits are NOT set; instead, one MUST call EFR_insert_sid_codeword() after packing the frame with EFR_params2frame(). The wrapper in EFR_encode_frame() does exactly as described, and the overall logic follows the original code structure from ETSI.

extern void EFR_decode_params(struct EFR_decoder_state *st, const int16_t *params,
			      int bfi, int sid, int taf, int16_t *pcm);

This function is similar to EFR_decode_frame() with the frame input replaced with params array input, but the SID classification per the rules of GSM 06.81 section 6.1.1 needs to be provided by the caller. The wrapper in EFR_decode_frame() calls both EFR_frame2params() and EFR_sid_classify() before passing the work to EFR_decode_params().

State reset functions
=====================

extern void EFR_encoder_reset(struct EFR_encoder_state *st, int dtx);
extern void EFR_decoder_reset(struct EFR_decoder_state *st);

These functions reset the state of the encoder or the decoder, respectively; the entire state structure is fully initialized to the respective home state defined in GSM 06.60 section 8.5 for the encoder or section 8.6 for the decoder.

EFR_encoder_reset() is called internally by EFR_encoder_create() and by the encoder itself when it encounters the ETSI-prescribed encoder homing frame; EFR_decoder_reset() is called internally by EFR_decoder_create() and by the decoder itself when it encounters the ETSI-prescribed decoder homing frame. Therefore, there is generally no need for libgsmefr users to call these functions directly - but they are made public for the sake of completeness.
If you call EFR_encoder_reset() manually, you can change the DTX enable/disable flag from its initial value given to EFR_encoder_create() - the new value of this flag passed to EFR_encoder_reset() always takes effect. There is no provision for changing this mode within an encoder session without a full reset.
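
Because the encoder state is opaque and a reset returns the whole state to its home condition, an application that wants to switch DTX on or off may prefer to remember the current mode itself and reset only when the mode actually changes. The following is a minimal sketch of that idea. The struct my_encoder wrapper and its two functions are hypothetical names, and the definitions at the top are stand-ins that let the sketch compile without libgsmefr; a real program would include <gsm_efr.h> instead.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Stand-in definitions so that this sketch compiles without libgsmefr.
 * In a real program, remove this part and #include <gsm_efr.h> instead.
 */
struct EFR_encoder_state { int dtx; };

void EFR_encoder_reset(struct EFR_encoder_state *st, int dtx)
{
	st->dtx = dtx;
}

struct EFR_encoder_state *EFR_encoder_create(int dtx)
{
	struct EFR_encoder_state *st = malloc(sizeof *st);

	if (st)
		EFR_encoder_reset(st, dtx);
	return st;
}
/* End of stand-in definitions. */

/*
 * Hypothetical wrapper: the library state is opaque, so the current
 * DTX mode is remembered on the application side.
 */
struct my_encoder {
	struct EFR_encoder_state *st;
	int dtx;
};

/* Returns 0 on success, -1 if the state allocation failed. */
int my_encoder_open(struct my_encoder *enc, int dtx)
{
	assert(enc != NULL);
	enc->dtx = dtx ? 1 : 0;
	enc->st = EFR_encoder_create(enc->dtx);
	return enc->st ? 0 : -1;
}

/*
 * Switch the DTX mode.  Returns 1 if a full encoder reset was performed,
 * or 0 if the requested mode was already in effect and nothing was done.
 */
int my_encoder_set_dtx(struct my_encoder *enc, int dtx)
{
	assert(enc != NULL && enc->st != NULL);
	dtx = dtx ? 1 : 0;
	if (enc->dtx == dtx)
		return 0;
	EFR_encoder_reset(enc->st, dtx);	/* wipes all encoder history */
	enc->dtx = dtx;
	return 1;
}
```

Skipping the reset when the mode is unchanged is a design choice of this sketch, made to avoid discarding encoder history needlessly; calling EFR_encoder_reset() unconditionally is equally valid if a fresh home state is wanted.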