view doc/FR1-library-API @ 542:f2d0f2f15d5f
libgsmefr: add wrapper for TW-TS-001 RTP input
author: Mychaela Falconia <falcon@freecalypso.org>
date: Sat, 28 Sep 2024 06:38:08 +0000
Libgsmfr2 general usage
=======================

The external public interface to Themyscira libgsmfr2 consists of a single header file <tw_gsmfr.h>; it should be installed in some system include directory.

The dialect of C used by all Themyscira GSM codec libraries is ANSI C (function prototypes), const qualifier is used where appropriate, and the interface is defined in terms of <stdint.h> types; <tw_gsmfr.h> includes <stdint.h>. The use of old libgsm defined types (gsm_byte, gsm_frame and gsm_signal) has been abolished in the migration from libgsm+libgsmfrp to libgsmfr2.

GSM 06.10 encoder and decoder
=============================

Both the encoder and the decoder are stateful; each running instance of either element needs its own state structure. However, this GSM 06.10 component of libgsmfr2 shares a peculiar property with old libgsm from which it was derived: the same state structure (struct gsmfr_0610_state) is used by both entities. Needless to say, each given instance of struct gsmfr_0610_state must be used for only one purpose, either for the encoder or for the decoder; mixing calls to encoder and decoder functions with the same state structure is an invalid operation with undefined results.

State structures for the basic encoder or decoder are allocated with this function:

    struct gsmfr_0610_state *gsmfr_0610_create(void);

This function allocates dynamic memory for the state structure with malloc() and returns a pointer to the allocated and initialized struct if successful, or NULL if malloc() fails. The state structure is malloc'ed as a single chunk, hence when you are done with it, simply free() it.
The initialization or reset portion of gsmfr_0610_create() operation can always be repeated with this function:

    void gsmfr_0610_reset(struct gsmfr_0610_state *state);

To support applications that need (or prefer) to use some different method of managing their memory allocations, the library also exports this const datum:

    extern const unsigned gsmfr_0610_state_size;

Using this feature, one can replace gsmfr_0610_create() with something like the following (example for applications based on Osmocom libraries):

    struct gsmfr_0610_state *st;

    st = talloc_size(ctx, gsmfr_0610_state_size);
    if (st)
        gsmfr_0610_reset(st);

Immediately after gsmfr_0610_create() or gsmfr_0610_reset(), the "virgin" state structure can be used either for the encoder or for the decoder; however, once that state struct has been passed to functions of either group, it can only be used for that functional group.

Encoder specifics
-----------------

The most elementary single-frame processing function of libgsmfr2 GSM 06.10 encoder is:

    void gsmfr_0610_encode_params(struct gsmfr_0610_state *st,
                                  const int16_t *pcm,
                                  struct gsmfr_param_frame *param);

The input is an array of 160 linear PCM samples (left-justified in int16_t), and the output is this structure:

    struct gsmfr_param_frame {
        int16_t LARc[8];
        int16_t Nc[4];
        int16_t bc[4];
        int16_t Mc[4];
        int16_t xmaxc[4];
        int16_t xMc[4][13];
    };

Most of the time the following wrapper function is more useful:

    void gsmfr_0610_encode_frame(struct gsmfr_0610_state *st,
                                 const int16_t *pcm, uint8_t *frame);

The output is a 33-byte buffer, filled with the encoded GSM-FR speech frame in the RTP format specified in ETSI TS 101 318 and IETF RFC 3551.
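The 33-byte frame length can be cross-checked against the parameter structure shown above. The following is a minimal sketch, not part of libgsmfr2: the bit widths are taken from GSM 06.10 rather than from this document, and every name in it (larc_bits, frame_param_bits and so on) is made up for illustration.

```c
#include <assert.h>

/* Bit widths of the 8 LARc parameters (assumption: from GSM 06.10,
 * not stated in this document) */
static const unsigned larc_bits[8] = {6, 6, 5, 5, 4, 4, 3, 3};

/* Bit widths of the per-subframe parameters, same source */
enum { NC_BITS = 7, BC_BITS = 2, MC_BITS = 2, XMAXC_BITS = 6, XMC_BITS = 3 };

/* 8 LARc plus, for each of 4 subframes, Nc, bc, Mc, xmaxc and 13 xMc */
static unsigned frame_param_count(void)
{
    return 8 + 4 * (4 + 13);
}

/* Total number of codec parameter bits in one speech frame */
static unsigned frame_param_bits(void)
{
    unsigned i, bits = 0;

    for (i = 0; i < 8; i++)
        bits += larc_bits[i];
    bits += 4 * (NC_BITS + BC_BITS + MC_BITS + XMAXC_BITS + 13 * XMC_BITS);
    return bits;
}

/* RTP frame: 4-bit 0xD signature followed by the parameter bits */
static unsigned rtp_frame_bytes(void)
{
    unsigned total_bits = 4 + frame_param_bits();

    assert(total_bits % 8 == 0);    /* the frame is octet-aligned */
    return total_bits / 8;
}
```

Under these assumptions the count comes to 76 parameters and 260 bits, and the signature nibble brings the total to 264 bits, i.e., 33 octets.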
If the optional encoder homing feature is desired, call this function right after the call to gsmfr_0610_encode_frame() or gsmfr_0610_encode_params():

    void gsmfr_0610_encoder_homing(struct gsmfr_0610_state *st,
                                   const int16_t *pcm);

This function checks to see if the PCM frame (160 linear PCM samples) is an EHF; if the input frame is indeed EHF, the function calls gsmfr_0610_reset().

Decoder specifics
-----------------

The internal native form of the 06.10 decoder once again uses struct gsmfr_param_frame:

    void gsmfr_0610_decode_params(struct gsmfr_0610_state *st,
                                  const struct gsmfr_param_frame *param,
                                  int16_t *pcm);

The more commonly used RTP-format version is:

    void gsmfr_0610_decode_frame(struct gsmfr_0610_state *st,
                                 const uint8_t *frame, int16_t *pcm);

Please note:

1) The basic GSM 06.10 decoder is just that: there is no SID recognition or DTX handling, every possible input bit pattern will be interpreted and decoded as a GSM 06.10 speech frame.

2) There is no decoder homing function at this layer, and no check for DHF.

3) The RTP signature nibble 0xD is ignored (not checked) by gsmfr_0610_decode_frame().

Rx DTX preprocessor block
=========================

The Rx DTX preprocessor is its own stateful element, independent from the 06.10 decoder to which it is usually coupled. Libgsmfr2 provides a "fulldec" wrapper that incorporates both elements, but the ability to use the Rx DTX preprocessor by itself still remains, unchanged from our previous libgsmfrp offering. One significant application for this preprocessor by itself, without immediately following it with the GSM 06.10 decode step, is the TFO/TrFO transform of 3GPP TS 28.062 section C.3.2.1.1 for GSM-FR: our Rx DTX preprocessor does exactly what that section calls for, specifically in "case 1" where the input UL frame stream may contain SIDs and BFI frame gaps, but the output must be 100% valid frames and SID-free.
The current version of libgsmfr2 includes some additional provisions for using our preprocessor block as a TFO transform in both non-DTXd and DTXd-enabled configurations, as detailed in a later section of this document.

The state structure for this block is struct gsmfr_preproc_state, and it is allocated with this function:

    struct gsmfr_preproc_state *gsmfr_preproc_create(void);

Like other state structures in Themyscira GSM codec libraries, this opaque state is malloc'ed as a single chunk and can be simply freed afterward. A reset function is also provided:

    void gsmfr_preproc_reset(struct gsmfr_preproc_state *state);

There is also a public const datum with the size of this structure, allowing use of talloc and other alternative schemes:

    extern const unsigned gsmfr_preproc_state_size;

Preprocessing good frames
-------------------------

For every good traffic frame (BFI=0) you receive from the radio subsystem, you need to call this preprocessor function:

    void gsmfr_preproc_good_frame(struct gsmfr_preproc_state *state,
                                  uint8_t *frame);

The second argument is both input and output, i.e., the frame is modified in place. If the received frame is not SID (specifically, if the SID field deviates from the SID codeword by 16 or more bits, per GSM 06.31 section 6.1.1), then the frame (considered a good speech frame) will be left unmodified (i.e., it is to be passed unchanged to the GSM 06.10 decoder), but preprocessor state will be updated. OTOH, if the received frame is classified as either valid or invalid SID per GSM 06.31, then the output frame will contain comfort noise generated by the preprocessor using a PRNG, or a speech muting or silence frame in some corner cases involving invalid SID.

GSM-FR RTP (originally libgsm) 0xD magic: the upper nibble of the first byte can be anything on input to gsmfr_preproc_good_frame(), but the output frame will always have the correct magic in it.
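The three-way classification described above (speech, invalid SID, valid SID) reduces to two thresholds on the number of bits by which the SID field deviates from the codeword. A minimal sketch of that decision follows. It is not the library's code: the 16-bit threshold is the one quoted above, while the 95-bit codeword length and the "fewer than 2 deviations" threshold for a valid SID are assumptions taken from a reading of GSM 06.31 section 6.1.1, and all names are hypothetical. Counting the deviating bits requires the SID field bit positions, which are not covered here.

```c
#include <assert.h>

#define SID_CODEWORD_BITS 95    /* assumption: codeword length per GSM 06.31 */

/* Hypothetical names; the numeric values mirror the SID flag of GSM 06.31 */
enum frame_class {
    FRAME_SPEECH = 0,       /* not SID: passed through unchanged */
    FRAME_INVALID_SID = 1,
    FRAME_VALID_SID = 2
};

/* Map the count of SID field bits that differ from the codeword to a class */
static enum frame_class sid_class_from_deviations(unsigned deviations)
{
    assert(deviations <= SID_CODEWORD_BITS);
    if (deviations < 2)
        return FRAME_VALID_SID;
    if (deviations < 16)
        return FRAME_INVALID_SID;
    return FRAME_SPEECH;    /* 16 or more deviating bits */
}
```

For classifying real frames, the library's own gsmfr_preproc_sid_classify() should be used instead of a hand-rolled counter.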
There is also a variant of this function (implemented as a wrapper) that applies homing logic:

    void gsmfr_preproc_good_frame_hm(struct gsmfr_preproc_state *state,
                                     uint8_t *frame);

This function operates just like plain gsmfr_preproc_good_frame() except for one difference: if the input matches the decoder homing frame (DHF), the state is reset with an internal call to gsmfr_preproc_reset(). (Because the DHF is still a good speech frame, it is always passed through to the output unchanged by both functions - the only difference is the effect on subsequent state.) The homing version of good frame preproc is intended for TFO applications, and is invoked internally by gsmfr_tfo_xfrm_main() function described in a later section of this document.

Handling BFI conditions
-----------------------

If you received a lost/missing frame indication instead of a good traffic frame, call one of these preprocessor functions:

    void gsmfr_preproc_bfi(struct gsmfr_preproc_state *state, int taf,
                           uint8_t *frame_out);

or

    void gsmfr_preproc_bfi_bits(struct gsmfr_preproc_state *state,
                                const uint8_t *bad_frame, int taf,
                                uint8_t *frame_out);

gsmfr_preproc_bfi_bits() should be called if you received payload bits along with the BFI flag; plain gsmfr_preproc_bfi() should be called if you received BFI with no data. The bad frame passed to gsmfr_preproc_bfi_bits() is used only to check if the BFI should be handled as an invalid SID rather than the more common case of an unusable frame - see GSM 06.31 for definitions of these terms. Past the SID check, the bad frame content is a don't-care, and there is no provision for making any use of erroneous frames like in EFR.

TAF is a flag defined in GSM 06.31 section 6.1.1; if you don't have this flag, pass 0 - you will lose the function of comfort noise muting in the event of prolonged SID loss, but all other Rx DTX functions will still work the same.
With both functions the 33-byte buffer pointed to by frame_out is only an output, i.e., prior buffer content is a don't-care. The frame generated by the preprocessor may be substitution/muting, comfort noise or silence depending on the state.

gsmfr_preproc_bfi_bits() arguments bad_frame and frame_out can point to the same memory: the function finishes analyzing bad_frame input before it starts writing to frame_out.

GSM-FR full decoder
===================

The full decoder is a high-level feature of libgsmfr2, incorporating both the Rx DTX preprocessor block and the GSM 06.10 decoder block. The state structure for the full decoder (struct gsmfr_fulldec_state) internally incorporates both struct gsmfr_0610_state and gsmfr_preproc_state, but because it is implemented inside libgsmfr2, it is still malloc'ed as a single chunk and can thus be released with a single free() call. The functions for allocating and initializing this state follow the established pattern:

    struct gsmfr_fulldec_state *gsmfr_fulldec_create(void);

    void gsmfr_fulldec_reset(struct gsmfr_fulldec_state *state);

    extern const unsigned gsmfr_fulldec_state_size;

The reset function internally calls gsmfr_0610_reset() and gsmfr_preproc_reset(), initializing both processing blocks.

Frame processing functions are also straightforward:

    void gsmfr_fulldec_good_frame(struct gsmfr_fulldec_state *state,
                                  const uint8_t *frame, int16_t *pcm);

    void gsmfr_fulldec_bfi(struct gsmfr_fulldec_state *state, int taf,
                           int16_t *pcm);

    void gsmfr_fulldec_bfi_bits(struct gsmfr_fulldec_state *state,
                                const uint8_t *bad_frame, int taf,
                                int16_t *pcm);

These functions follow the same pattern as gsmfr_preproc_good_frame(), gsmfr_preproc_bfi() and gsmfr_preproc_bfi_bits(), but the output is a 160-sample linear PCM buffer. Also note that the frame input to gsmfr_fulldec_good_frame() is const, unlike the situation with gsmfr_preproc_good_frame() - the copying into a scratchpad buffer (on the stack) happens inside this "fulldec" wrapper.
The "fulldec" layer also adds the decoder homing feature: gsmfr_fulldec_good_frame() detects decoder homing frames and invokes gsmfr_fulldec_reset() when required, and also implements EHF output per the spec.

Full decoder RTP input
----------------------

If a network element is receiving GSM-FR input via RTP and needs to feed this input to the decoder, the RTP payload handler needs to support both the basic RTP format of ETSI TS 101 318 (also RFC 3551) and the extended RTP format of TW-TS-001. Depending on the format received, and depending on bit flags in the TEH octet in the case of TW-TS-001, one of the 3 main processing functions listed above will need to be called. Seeing that this complex logic should be abstracted away from applications into the library, we've added the following wrapper function:

    int gsmfr_fulldec_rtp_in(struct gsmfr_fulldec_state *state,
                             const uint8_t *rtp_pl, unsigned rtp_pl_len,
                             int16_t *pcm);

The input is the received RTP payload: array of bytes and length. It is acceptable to pass 0 as rtp_pl_len, in which case rtp_pl pointer can be NULL. The function proceeds as follows:

* If the input is valid RTP format for GSM-FR (either basic or extended), it is passed to the appropriate main processing function. Unlike the permissive stance taken in lower-level functions, RTP input validation includes a check of 0xD signature of GSM-FR, as well as validation of TEH octet signature and consistency in the case of TW-TS-001. The return value is 0, indicating that good input was received.

* If the input is a zero-length payload (rtp_pl_len is 0, rtp_pl may be NULL), it is treated like BFI-no-data with TAF=0. The return value is 0, meaning that this input is still considered valid.

* All other inputs are considered invalid. Linear PCM output is still generated by calling gsmfr_fulldec_bfi(), but the return value is -1, signaling invalid RTP input.
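The three cases listed above can be illustrated with a small classifier. This is only a sketch under a deliberate simplification: it recognizes the basic RTP format and the zero-length payload, and it does not attempt TW-TS-001 parsing, so an extended-format payload that the real gsmfr_fulldec_rtp_in() would accept is reported as invalid here. All names (classify_basic_rtp, rtp_in_return_value, the enum) are hypothetical and not part of libgsmfr2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FR_RTP_FRAME_LEN 33     /* same value as GSMFR_RTP_FRAME_LEN */

/* Hypothetical outcome of RTP input validation */
enum rtp_in_class {
    RTP_IN_GOOD_FRAME,  /* would go to the good frame function */
    RTP_IN_BFI_NODATA,  /* zero-length payload: BFI with TAF=0 */
    RTP_IN_INVALID      /* BFI processing, but flagged as invalid */
};

/* Classify a payload, considering only the basic (RFC 3551) format */
static enum rtp_in_class classify_basic_rtp(const uint8_t *pl, unsigned len)
{
    if (len == 0)
        return RTP_IN_BFI_NODATA;   /* pl may be NULL in this case */
    assert(pl != NULL);
    if (len == FR_RTP_FRAME_LEN && (pl[0] & 0xF0) == 0xD0)
        return RTP_IN_GOOD_FRAME;   /* right length and 0xD signature */
    return RTP_IN_INVALID;
}

/* Return value convention described above: 0 for valid input, else -1 */
static int rtp_in_return_value(enum rtp_in_class c)
{
    return c == RTP_IN_INVALID ? -1 : 0;
}
```

An application linking against libgsmfr2 has no need for such code; it simply calls gsmfr_fulldec_rtp_in() and checks the return value.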
TFO transform
=============

"TFO transform" is the term adopted by Themyscira Wireless for the non-trivial transform on GSM codec frames called for by the TFO spec, 3GPP TS 28.062 section C.3.2.1.1. For each of the 3 classic GSM codecs, this transform can operate in two modes:

DTXd=0: the input UL frame stream from call leg A may contain SIDs and BFI frame gaps, but the output to call leg B DL must be 100% valid frames and SID-free.

DTXd=1: the output to call leg B DL is allowed to contain both good speech and valid SID frames, just like the output of a DTX-enabled speech encoder. Furthermore, it can be presumed that network operators who enable DTXd seek to reap its benefits in terms of radio interference reduction, hence the DTXd-enabled TFO transform should actually make use of DTXd capability.

In the case of GSM-FR codec, the TFO transform with DTXd=0 is identical to the Rx DTX preprocessor part of the standard endpoint decoder, hence our "preproc" block is directly suited to serve as such. OTOH, the case of DTXd=1 is different: heeding the implied need to actually make use of DTXd when possible requires implementing a transform that is not the same as the preprocessor to be applied just prior to local GSM 06.10 decoding, hence the DTXd-enabled TFO transform is a different entity.

The approach implemented in Themyscira libgsmfr2 is a hybrid:

* The preprocessor block described earlier in this document functions both as the necessary component of the full endpoint decoder and as the TFO transform for DTXd=0.

* TFO transform for DTXd=1 is implemented as a two-step process:

  1) Regular main processing functions of the preproc block produce output that is SID-free, containing synthetic "speech" frames in the case of comfort noise or silence.

  2) A special post-processor function needs to be called immediately afterward. This function selectively transforms some output frames into SIDs based on a flag set in the state structure.
In order to make this approach possible, all main processing functions of the preproc block do a little bit of extra housekeeping to keep track of whether or not their output can be replaced with SID, logic that is unnecessary when this block functions as part of the full endpoint decoder or as non-DTXd TFO transform. However, this logic is very simple and the overhead is very light.

TFO transform API
-----------------

The state structure was already described earlier: it is struct gsmfr_preproc_state, created either with gsmfr_preproc_create() or by externally allocating the needed memory based on gsmfr_preproc_state_size and then initializing it with gsmfr_preproc_reset(). The following API functions are then available:

    int gsmfr_tfo_xfrm_main(struct gsmfr_preproc_state *state,
                            const uint8_t *rtp_in, unsigned rtp_in_len,
                            uint8_t *frame_out);

    int gsmfr_tfo_xfrm_dtxd(struct gsmfr_preproc_state *state,
                            uint8_t *frame_out);

gsmfr_tfo_xfrm_main() is the TFO transform counterpart to gsmfr_fulldec_rtp_in(), described in detail earlier. It is also possible (and allowed) to call gsmfr_preproc_* main processing functions directly, but the RTP wrapper is convenient for the same reasons as in the case of the full decoder. In this mode of usage, the only difference between the full decoder and the TFO transform is that the former emits linear PCM output, whereas the latter emits 33-byte GSM-FR codec frames to be sent to call leg B downlink.

The return value from gsmfr_tfo_xfrm_main() is the same as that of gsmfr_fulldec_rtp_in(): 0 if the RTP input was considered good, or -1 if it was invalid. In the case of invalid RTP input that produces -1 return value, gsmfr_tfo_xfrm_main() calls gsmfr_preproc_bfi(), just like how gsmfr_fulldec_rtp_in() calls gsmfr_fulldec_bfi() under the same conditions.
If DTXd is in use, then the call to gsmfr_tfo_xfrm_main() needs to be directly followed by a call to gsmfr_tfo_xfrm_dtxd(), operating on the same output buffer with the same state structure. The output will then be changed to SID when appropriate for the current state. The return value from gsmfr_tfo_xfrm_dtxd() is the SP flag of GSM 06.31: 1 if the output frame is speech or 0 if it is SID.

TFO transform homing
--------------------

3GPP specs are silent on whether or not TFO transforms should implement homing, i.e., whether or not they should reset to home state when a decoder homing frame passes through. However, at Themyscira Wireless we believe in building deterministic systems whose bit-exact behavior can be modeled and relied upon; for this reason, our implementation of TFO transform does include in-band homing. In accord with this design decision, gsmfr_tfo_xfrm_main() internally calls gsmfr_preproc_good_frame_hm() described earlier instead of plain gsmfr_preproc_good_frame().

With DTXd=1, if a stream of DHFs is input to the TFO transform, the same stream of DHFs will appear on the output, i.e., DTXd won't kick in. (The same behavior occurs in a standard 3GPP-compliant speech encoder whose input is a stream of 0xD5 octets in PCMA or 0xFE in PCMU.) However, any BFIs following this DHF will be immediately converted to SID, under the same conditions when our TFO transform with DTXd=0 emits silence frames of GSM 06.11.

Stateless utility functions
===========================

Conversions between RTP packed format and broken-down codec parameters are stateless and implemented with highly efficient code.
There are two versions; this version converts between packed frames and struct gsmfr_param_frame used by 06.10 encoder and decoder functions:

    void gsmfr_pack_frame(const struct gsmfr_param_frame *param,
                          uint8_t *frame);

    void gsmfr_unpack_frame(const uint8_t *frame,
                            struct gsmfr_param_frame *param);

and this version converts between packed frames and a straight linear array of 76 parameters:

    void gsmfr_pack_from_array(const int16_t *params, uint8_t *frame);

    void gsmfr_unpack_to_array(const uint8_t *frame, int16_t *params);

The latter functions gsmfr_pack_from_array() and gsmfr_unpack_to_array() are drop-in replacements for gsm_implode() and gsm_explode() from old libgsm. The order of parameters in this array is the canonical one: first all LARc, then all params for the first subframe, then the second subframe, then the third and the fourth. OTOH, struct gsmfr_param_frame uses functional grouping, chosen for ease of porting of original libgsm code.

Both unpacking functions (gsmfr_unpack_frame() and gsmfr_unpack_to_array()) ignore the upper nibble of the first byte, i.e., the 0xD signature is not enforced. However, this signature is always set correctly by gsmfr_pack_frame() and gsmfr_pack_from_array(), and also by gsmfr_0610_encode_frame() function which calls gsmfr_pack_frame() as its finishing step.

The last remaining stateless utility function performs SID classification of received GSM-FR frames:

    int gsmfr_preproc_sid_classify(const uint8_t *frame);

This function analyzes an RTP-encoded FR frame (the upper nibble of the first byte is NOT checked for 0xD signature) for the SID codeword of GSM 06.12 and classifies the frame as SID=0, SID=1 or SID=2 per the rules of GSM 06.31 section 6.1.1. This classification is the first processing step performed by gsmfr_preproc_good_frame().
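The difference between the functional grouping of struct gsmfr_param_frame and the canonical linear order of the 76-parameter array, both described above, can be made concrete with a pair of conversion helpers. The following is a sketch rather than library code: the helper names are hypothetical, the struct is a local copy so that the example stands alone (a real application would include <tw_gsmfr.h> instead), and the order of parameters within each subframe (Nc, bc, Mc, xmaxc, then 13 xMc) is an assumption taken from GSM 06.10, since the text above only states that parameters are grouped subframe by subframe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PARAMS 76   /* same value as GSMFR_NUM_PARAMS */

/* Local copy of the public struct, for a self-contained example */
struct gsmfr_param_frame {
    int16_t LARc[8];
    int16_t Nc[4];
    int16_t bc[4];
    int16_t Mc[4];
    int16_t xmaxc[4];
    int16_t xMc[4][13];
};

/* Functional grouping -> canonical linear order */
static void params_struct_to_array(const struct gsmfr_param_frame *p,
                                   int16_t *arr)
{
    unsigned i, sub, n = 0;

    assert(p != NULL && arr != NULL);
    for (i = 0; i < 8; i++)
        arr[n++] = p->LARc[i];
    for (sub = 0; sub < 4; sub++) {
        arr[n++] = p->Nc[sub];
        arr[n++] = p->bc[sub];
        arr[n++] = p->Mc[sub];
        arr[n++] = p->xmaxc[sub];
        for (i = 0; i < 13; i++)
            arr[n++] = p->xMc[sub][i];
    }
    assert(n == NUM_PARAMS);
}

/* Canonical linear order -> functional grouping */
static void params_array_to_struct(const int16_t *arr,
                                   struct gsmfr_param_frame *p)
{
    unsigned i, sub, n = 0;

    assert(p != NULL && arr != NULL);
    for (i = 0; i < 8; i++)
        p->LARc[i] = arr[n++];
    for (sub = 0; sub < 4; sub++) {
        p->Nc[sub] = arr[n++];
        p->bc[sub] = arr[n++];
        p->Mc[sub] = arr[n++];
        p->xmaxc[sub] = arr[n++];
        for (i = 0; i < 13; i++)
            p->xMc[sub][i] = arr[n++];
    }
    assert(n == NUM_PARAMS);
}
```

With this layout each subframe occupies 17 consecutive array slots starting at index 8, so Nc of the second subframe sits at index 25.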
Public constant definitions
===========================

Our public header file <tw_gsmfr.h> provides these constant definitions, which should be self-explanatory:

    #define GSMFR_RTP_FRAME_LEN 33
    #define GSMFR_NUM_PARAMS 76

Public const data items
=======================

There are two special GSM-FR frame bit patterns defined in the specs: there is the silence frame of GSM 06.11, and there is the decoder homing frame specified in later versions of GSM 06.10. RTP-packed representations of both frames are included in libgsmfr2, and are made public:

    extern const uint8_t gsmfr_preproc_silence_frame[GSMFR_RTP_FRAME_LEN];
    extern const uint8_t gsmfr_decoder_homing_frame[GSMFR_RTP_FRAME_LEN];