changeset 535:bf7bbc7d494f

doc/FR1-library-API: document new additions
author Mychaela Falconia <falcon@freecalypso.org>
date Fri, 20 Sep 2024 07:27:18 +0000
parents 516e84085a15
children a3300483ae74
files doc/FR1-library-API
diffstat 1 files changed, 210 insertions(+), 22 deletions(-)
--- a/doc/FR1-library-API	Fri Sep 20 00:17:35 2024 +0000
+++ b/doc/FR1-library-API	Fri Sep 20 07:27:18 2024 +0000
@@ -29,16 +29,28 @@
 struct gsmfr_0610_state *gsmfr_0610_create(void);
 
 This function allocates dynamic memory for the state structure with malloc()
-(the size of the struct is internal to the library and not exposed) and returns
-a pointer to the allocated and initialized struct if successful, or NULL if
-malloc() fails.  The state structure is malloc'ed as a single chunk, hence when
-you are done with it, simply free() it.
+and returns a pointer to the allocated and initialized struct if successful, or
+NULL if malloc() fails.  The state structure is malloc'ed as a single chunk,
+hence when you are done with it, simply free() it.
 
 The initialization or reset portion of gsmfr_0610_create() operation can always
 be repeated with this function:
 
 void gsmfr_0610_reset(struct gsmfr_0610_state *state);
 
+To support applications that need (or prefer) to use some different method of
+managing their memory allocations, the library also exports this const datum:
+
+extern const unsigned gsmfr_0610_state_size;
+
+Using this feature, one can replace gsmfr_0610_create() with something like the
+following (example for applications based on Osmocom libraries):
+
+	struct gsmfr_0610_state *st;
+	st = talloc_size(ctx, gsmfr_0610_state_size);
+	if (st)
+		gsmfr_0610_reset(st);
+
 Immediately after gsmfr_0610_create() or gsmfr_0610_reset(), the "virgin" state
 structure can be used either for the encoder or for the decoder; however, once
 that state struct has been passed to functions of either group, it can only be
@@ -114,12 +126,15 @@
 decoder to which it is usually coupled.  Libgsmfr2 provides a "fulldec" wrapper
 that incorporates both elements, but the ability to use the Rx DTX preprocessor
 by itself still remains, unchanged from our previous libgsmfrp offering.  One
-potential application for this preprocessor by itself, without immediately
-following it with the GSM 06.10 decode step, is the possibility of implementing
-the TFO/TrFO transform of 3GPP TS 28.062 section C.3.2.1.1 for GSM-FR: our Rx
-DTX preprocessor does exactly what that section calls for, specifically in
-"case 1" where the input UL frame stream may contain SIDs and BFI frame gaps,
-but the output must be 100% valid frames and SID-free.
+significant application for this preprocessor by itself, without immediately
+following it with the GSM 06.10 decode step, is the TFO/TrFO transform of 3GPP
+TS 28.062 section C.3.2.1.1 for GSM-FR: our Rx DTX preprocessor does exactly
+what that section calls for, specifically in "case 1" where the input UL frame
+stream may contain SIDs and BFI frame gaps, but the output must be 100% valid
+frames and SID-free.  The current version of libgsmfr2 includes some additional
+provisions for using our preprocessor block as a TFO transform in both non-DTXd
+and DTXd-enabled configurations, as detailed in a later section of this
+document.
 
 The state structure for this block is struct gsmfr_preproc_state, and it is
 allocated with this function:
@@ -132,6 +147,11 @@
 
 void gsmfr_preproc_reset(struct gsmfr_preproc_state *state);
 
+There is also a public const datum with the size of this structure, allowing
+use of talloc and other alternative schemes:
+
+extern const unsigned gsmfr_preproc_state_size;
+
 Preprocessing good frames
 -------------------------
 
@@ -148,30 +168,63 @@
 it is to be passed unchanged to the GSM 06.10 decoder), but preprocessor state
 will be updated.  OTOH, if the received frame is classified as either valid or
 invalid SID per GSM 06.31, then the output frame will contain comfort noise
-generated by the preprocessor using a PRNG, or a silence frame in one particular
-corner case.
+generated by the preprocessor using a PRNG, or a speech muting or silence frame
+in some corner cases involving invalid SID.
 
 GSM-FR RTP (originally libgsm) 0xD magic: the upper nibble of the first byte
 can be anything on input to gsmfr_preproc_good_frame(), but the output frame
 will always have the correct magic in it.
 
+There is also a variant of this function (implemented as a wrapper) that applies
+homing logic:
+
+void gsmfr_preproc_good_frame_hm(struct gsmfr_preproc_state *state,
+				 uint8_t *frame);
+
+This function operates just like plain gsmfr_preproc_good_frame() except for
+one difference: if the input matches the decoder homing frame (DHF), the state
+is reset with an internal call to gsmfr_preproc_reset().  (Because the DHF is
+still a good speech frame, it is always passed through to the output unchanged
+by both functions - the only difference is the effect on subsequent state.)
+The homing version of good frame preproc is intended for TFO applications, and
+is invoked internally by the gsmfr_tfo_xfrm_main() function described in a
+later section of this document.
+
 Handling BFI conditions
 -----------------------
 
 If you received a lost/missing frame indication instead of a good traffic frame,
-call this preprocessor function:
+call one of these preprocessor functions:
 
 void gsmfr_preproc_bfi(struct gsmfr_preproc_state *state, int taf,
 			uint8_t *frame_out);
 
+or
+
+void gsmfr_preproc_bfi_bits(struct gsmfr_preproc_state *state,
+			    const uint8_t *bad_frame, int taf,
+			    uint8_t *frame_out);
+
+gsmfr_preproc_bfi_bits() should be called if you received payload bits along
+with the BFI flag; plain gsmfr_preproc_bfi() should be called if you received
+BFI with no data.  The bad frame passed to gsmfr_preproc_bfi_bits() is used
+only to check if the BFI should be handled as an invalid SID rather than the
+more common case of an unusable frame - see GSM 06.31 for definitions of these
+terms.  Past the SID check, the bad frame content is a don't-care, and there is
+no provision for making any use of erroneous frames like in EFR.
+
 TAF is a flag defined in GSM 06.31 section 6.1.1; if you don't have this flag,
 pass 0 - you will lose the function of comfort noise muting in the event of
 prolonged SID loss, but all other Rx DTX functions will still work the same.
 
-With this function the 33-byte frame buffer is only an output, i.e., prior
-buffer content is a don't-care and there is no provision for making any use of
-erroneous frames like in EFR.  The frame generated by the preprocessor may be
-substitution/muting, comfort noise or silence depending on the state.
+With both functions the 33-byte buffer pointed to by frame_out is only an
+output, i.e., prior buffer content is a don't-care.  The frame generated by the
+preprocessor may be substitution/muting, comfort noise or silence depending on
+the state.
+
+gsmfr_preproc_bfi_bits() arguments bad_frame and frame_out can point to the
+same memory: the function finishes analyzing bad_frame input before it starts
+writing to frame_out.
 
 GSM-FR full decoder
 ===================
@@ -188,6 +241,8 @@
 
 void gsmfr_fulldec_reset(struct gsmfr_fulldec_state *state);
 
+extern const unsigned gsmfr_fulldec_state_size;
+
 The reset function internally calls gsmfr_0610_reset() and
 gsmfr_preproc_reset(), initializing both processing blocks.
 
@@ -199,17 +254,150 @@
 void gsmfr_fulldec_bfi(struct gsmfr_fulldec_state *state, int taf,
 			int16_t *pcm);
 
-These functions follow the same pattern as gsmfr_preproc_good_frame() and
-gsmfr_preproc_bfi(), but the output is a 160-sample linear PCM buffer.  Also
-note that the frame input to gsmfr_fulldec_good_frame() is const, unlike the
-situation with gsmfr_preproc_good_frame() - the copying into a scratchpad
-buffer (on the stack) happens inside this "fulldec" wrapper.
+void gsmfr_fulldec_bfi_bits(struct gsmfr_fulldec_state *state,
+			    const uint8_t *bad_frame, int taf, int16_t *pcm);
+
+These functions follow the same pattern as gsmfr_preproc_good_frame(),
+gsmfr_preproc_bfi() and gsmfr_preproc_bfi_bits(), but the output is a 160-sample
+linear PCM buffer.  Also note that the frame input to gsmfr_fulldec_good_frame()
+is const, unlike the situation with gsmfr_preproc_good_frame() - the copying
+into a scratchpad buffer (on the stack) happens inside this "fulldec" wrapper.
 
 The "fulldec" layer also adds the decoder homing feature:
 gsmfr_fulldec_good_frame() detects decoder homing frames and invokes
 gsmfr_fulldec_reset() when required, and also implements EHF output per the
 spec.
 
+Full decoder RTP input
+----------------------
+
+If a network element is receiving GSM-FR input via RTP and needs to feed this
+input to the decoder, the RTP payload handler needs to support both the basic
+RTP format of ETSI TS 101 318 (also RFC 3551) and the extended RTP format of
+TW-TS-001.  Depending on the format received, and depending on bit flags in the
+TEH octet in the case of TW-TS-001, one of the 3 main processing functions
+listed above will need to be called.  Seeing that this complex logic should be
+abstracted away from applications into the library, we've added the following
+wrapper function:
+
+int gsmfr_fulldec_rtp_in(struct gsmfr_fulldec_state *state,
+			 const uint8_t *rtp_pl, unsigned rtp_pl_len,
+			 int16_t *pcm);
+
+The input is the received RTP payload: an array of bytes and its length.  It
+is acceptable to pass 0 as rtp_pl_len, in which case the rtp_pl pointer can be
+NULL.  The function proceeds as follows:
+
+* If the input is valid RTP format for GSM-FR (either basic or extended), it is
+  passed to the appropriate main processing function.  Unlike the permissive
+  stance taken in lower-level functions, RTP input validation includes a check
+  of the 0xD signature of GSM-FR, as well as validation of the TEH octet
+  signature and consistency in the case of TW-TS-001.  The return value is 0,
+  indicating that good input was received.
+
+* If the input is a zero-length payload (rtp_pl_len is 0, rtp_pl may be NULL),
+  it is treated like BFI-no-data with TAF=0.  The return value is 0, meaning
+  that this input is still considered valid.
+
+* All other inputs are considered invalid.  Linear PCM output is still generated
+  by calling gsmfr_fulldec_bfi(), but the return value is -1, signaling invalid
+  RTP input.
+
+TFO transform
+=============
+
+"TFO transform" is the term adopted by Themyscira Wireless for the non-trivial
+transform on GSM codec frames called for by the TFO spec, 3GPP TS 28.062
+section C.3.2.1.1.  For each of the 3 classic GSM codecs, this transform can
+operate in two modes:
+
+DTXd=0: the input UL frame stream from call leg A may contain SIDs and BFI
+frame gaps, but the output to call leg B DL must be 100% valid frames and
+SID-free.
+
+DTXd=1: the output to call leg B DL is allowed to contain both good speech and
+valid SID frames, just like the output of a DTX-enabled speech encoder.
+Furthermore, it can be presumed that network operators who enable DTXd seek to
+reap its benefits in terms of radio interference reduction, hence the
+DTXd-enabled TFO transform should actually make use of DTXd capability.
+
+In the case of GSM-FR codec, the TFO transform with DTXd=0 is identical to the
+Rx DTX preprocessor part of the standard endpoint decoder, hence our "preproc"
+block is directly suited to serve as such.  OTOH, the case of DTXd=1 is
+different: heeding the implied need to actually make use of DTXd when possible
+requires implementing a transform that is not the same as the preprocessor to
+be applied just prior to local GSM 06.10 decoding, hence the DTXd-enabled TFO
+transform is a different entity.
+
+The approach implemented in Themyscira libgsmfr2 is a hybrid:
+
+* The preprocessor block described earlier in this document functions both as
+  the necessary component of the full endpoint decoder and as the TFO transform
+  for DTXd=0.
+
+* TFO transform for DTXd=1 is implemented as a two-step process:
+
+1) Regular main processing functions of the preproc block produce output that
+   is SID-free, containing synthetic "speech" frames in the case of comfort
+   noise or silence.
+
+2) A special post-processor function needs to be called immediately afterward.
+   This function selectively transforms some output frames into SIDs based on
+   a flag set in the state structure.
+
+In order to make this approach possible, all main processing functions of the
+preproc block do a little bit of extra housekeeping to keep track of whether or
+not their output can be replaced with SID, logic that is unnecessary when this
+block functions as part of the full endpoint decoder or as non-DTXd TFO
+transform.  However, this logic is very simple and the overhead is very light.
+
+TFO transform API
+-----------------
+
+The state structure was already described earlier: it is
+struct gsmfr_preproc_state, created either with gsmfr_preproc_create() or by
+externally allocating the needed memory based on gsmfr_preproc_state_size and
+then initializing it with gsmfr_preproc_reset().  The following API functions
+are then available:
+
+int gsmfr_tfo_xfrm_main(struct gsmfr_preproc_state *state,
+			const uint8_t *rtp_in, unsigned rtp_in_len,
+			uint8_t *frame_out);
+
+int gsmfr_tfo_xfrm_dtxd(struct gsmfr_preproc_state *state, uint8_t *frame_out);
+
+gsmfr_tfo_xfrm_main() is the TFO transform counterpart to
+gsmfr_fulldec_rtp_in(), described in detail earlier.  It is also possible (and
+allowed) to call gsmfr_preproc_* main processing functions directly, but the
+RTP wrapper is convenient for the same reasons as in the case of the full
+decoder.  In this mode of usage, the only difference between the full decoder
+and the TFO transform is that the former emits linear PCM output, whereas the
+latter emits 33-byte GSM-FR codec frames to be sent to call leg B downlink.
+
+If DTXd is in use, then the call to gsmfr_tfo_xfrm_main() needs to be directly
+followed by a call to gsmfr_tfo_xfrm_dtxd(), operating on the same output buffer
+with the same state structure.  The output will then be changed to SID when
+appropriate for the current state.
+
+TFO transform homing
+--------------------
+
+3GPP specs are silent on whether or not TFO transforms should implement homing,
+i.e., whether or not they should reset to home state when a decoder homing frame
+passes through.  However, at Themyscira Wireless we believe in building
+deterministic systems whose bit-exact behavior can be modeled and relied upon;
+for this reason, our implementation of TFO transform does include in-band
+homing.  In accord with this design decision, gsmfr_tfo_xfrm_main() internally
+calls gsmfr_preproc_good_frame_hm() described earlier instead of plain
+gsmfr_preproc_good_frame().
+
+With DTXd=1, if a stream of DHFs is input to the TFO transform, the same stream
+of DHFs will appear on the output, i.e., DTXd won't kick in.  (The same behavior
+occurs in a standard 3GPP-compliant speech encoder whose input is a stream of
+0xD5 octets in PCMA or 0xFE in PCMU.)  However, any BFIs following these DHFs
+will be immediately converted to SID, under the same conditions in which our
+TFO transform with DTXd=0 emits silence frames of GSM 06.11.
+
 Stateless utility functions
 ===========================