changeset 478:936a08cc73ce

doc/AMR-library-API: describe the decoder
author Mychaela Falconia <falcon@freecalypso.org>
date Sun, 19 May 2024 21:32:31 +0000
parents 4c9222d95647
children 616b7ba1135b
files doc/AMR-library-API
diffstat 1 files changed, 138 insertions(+), 2 deletions(-)
--- a/doc/AMR-library-API	Sat May 18 22:30:42 2024 +0000
+++ b/doc/AMR-library-API	Sun May 19 21:32:31 2024 +0000
@@ -247,6 +247,142 @@
 After this transformation, call EFR_params2frame() from libgsmefr (see
 EFR-library-API) with param[] array in struct amr_param_frame as input.
 
-Using the AMR decoder
-=====================
+Using the AMR decoder: native interface
+=======================================
+
+The internal native form of the stateful AMR decoder engine is:
+
+void amr_decode_frame(struct amr_decoder_state *st,
+			const struct amr_param_frame *frame, int16_t *pcm);
+
+The input frame is given as struct amr_param_frame, same structure as is used
+for the output of the encoder.  However, the required input to
+amr_decode_frame() is different from amr_encode_frame() output:
+
+* The 'type' member of the struct must be a code from enum RXFrameType, *not*
+  enum TXFrameType!
+
+* All 3GPP-defined Rx frame types are allowed.
+
+* The 'mode' member of the input struct is ignored if the Rx frame type is
+  RX_NO_DATA, but must be valid for every other frame type.
+
+If frame->type is not RX_NO_DATA, frame->mode is interpreted as follows:
+
+* The 3 least significant bits (mask 0x07) are taken to indicate the codec mode
+  used for this frame;
+
+* The most significant bit (mask 0x80) has meaning only if the mode is MR122
+  and frame->type is RX_SPEECH_GOOD.  Under these conditions, if this bit is
+  set, the DHF check is modified to match against the bit pattern of EFR DHF
+  instead of regular MR122 DHF.
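The bit layout of frame->mode described above can be sketched as follows.  This is only an illustration, not the library's own code: the helper names and SKETCH_* constants are hypothetical, and the numeric value 7 for MR122 is an assumption based on the usual 3GPP mode numbering.  The frame type condition (RX_SPEECH_GOOD) is not modelled here.

```c
#include <stdint.h>

#define SKETCH_MODE_MASK     0x07  /* codec mode bits, per the text above */
#define SKETCH_EFR_DHF_FLAG  0x80  /* flag requesting the EFR DHF pattern */
#define SKETCH_MR122         7     /* ASSUMED numeric value of MR122 */

/* Return the codec mode carried in the 3 least significant bits. */
static int sketch_codec_mode(uint8_t mode_field)
{
	return mode_field & SKETCH_MODE_MASK;
}

/*
 * Return 1 if the EFR DHF flag is set and the mode is MR122, else 0.
 * The flag is documented as meaningful only for MR122, so it is
 * disregarded here for every other mode.
 */
static int sketch_efr_dhf_requested(uint8_t mode_field)
{
	return (mode_field & SKETCH_EFR_DHF_FLAG) != 0 &&
	       sketch_codec_mode(mode_field) == SKETCH_MR122;
}
```

Under these assumptions, a mode byte of 0x87 would select MR122 with EFR DHF matching, while 0x07 would select MR122 with the regular DHF check.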
+
+amr_decode_frame() contains no guards against invalid (undefined) frame types
+in frame->type, or against any of the codec parameters being out of range.
+struct amr_param_frame coming into this function must come only from trusted
+sources inside the application program, usually from one of the provided input
+format conversion functions.
+
+Decoder homing frame check
+--------------------------
+
+The definition of AMR decoder per 3GPP includes two mandatory checks for the
+possibility of the input frame being one of the defined per-mode decoder homing
+frames (DHFs): one check at the beginning of the decoder, checking only up to
+the first subframe and acting only when the current state is homed, and the
+second check at the end of the decoder, checking all parameters (the full frame)
+and resetting the decoder on match.
+
+This DHF check operation, called from those two places in the stateful decoder
+as just described, is factored out into its own function that is exported as
+part of the public API:
+
+int amr_check_dhf(const struct amr_param_frame *frame, int first_sub_only);
+
+The struct amr_param_frame passed to amr_check_dhf() must be prepared the same
+way as for amr_decode_frame(); the latter function in fact calls amr_check_dhf()
+on its input.  The Boolean flag argument (first_sub_only) tells the function to
+check only to the end of the first subframe if nonzero, or the entire frame if
+zero.  The return value is 1 if the input matches DHF, 0 otherwise.
+
+frame->type must be RX_SPEECH_GOOD for the frame to be a DHF candidate, and the
+interpretation of frame->mode, including the special mode of matching against
+EFR DHF, is implemented in this function.
+
+Using the AMR decoder: input preparation
+========================================
+
+Stateless utility functions are provided for preparing decoder inputs,
+converting from RFC 4867 or 3GPP test sequence format into the internal form
+described above.
 
+Decoding RFC 4867 input
+-----------------------
+
+If the entire RFC 4867 frame (read from .amr storage format or received in RTP
+as an octet-aligned payload) is already in memory, decode it with this function:
+
+int amr_frame_from_ietf(const uint8_t *bytes, struct amr_param_frame *frame);
+
+The string of bytes input to this function must begin with the ToC octet.  Out
+of this ToC octet, only bits falling under the mask 0x7C (FT and Q bit fields)
+are checked.  The remaining 3 bits are not checked: in the case of .amr storage
+format, RFC 4867 describes these bits as "padding" (P bits) and stipulates that
+they MUST be ignored by readers.  However, in the case of RTP payloads received
+in a live session, the uppermost bit of the ToC octet becomes F rather than P,
+and it is the responsibility of the application to ensure that F=0: multiframe
+payloads are NOT supported.
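To make the ToC octet layout concrete, here is a minimal sketch of how an application might pick the octet apart before handing a frame to the library.  The helper names and SKETCH_* constants are hypothetical; the bit positions follow the octet-aligned ToC layout of RFC 4867 (F, then 4 bits of FT, then Q, then 2 padding bits), which agrees with the 0x7C mask mentioned above.

```c
#include <stdint.h>

#define SKETCH_TOC_F_BIT    0x80  /* F in RTP payloads, padding in .amr files */
#define SKETCH_TOC_FT_MASK  0x78  /* 4-bit frame type field */
#define SKETCH_TOC_Q_BIT    0x04  /* frame quality indicator */

/* Extract the frame type (0..15) from a ToC octet. */
static unsigned sketch_toc_ft(uint8_t toc)
{
	return (toc & SKETCH_TOC_FT_MASK) >> 3;
}

/* Extract the Q bit as 0 or 1. */
static unsigned sketch_toc_q(uint8_t toc)
{
	return (toc & SKETCH_TOC_Q_BIT) >> 2;
}

/*
 * RTP-side precheck: return 1 if F=0 (single-frame payload),
 * 0 if F=1 (more frames follow, which is not supported).
 */
static int sketch_rtp_toc_is_single_frame(uint8_t toc)
{
	return (toc & SKETCH_TOC_F_BIT) == 0;
}
```

An RTP receiver following this sketch would drop any payload for which sketch_rtp_toc_is_single_frame() returns 0 before calling amr_frame_from_ietf().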
+
+FT in the input frame may be [0,7] (MR475 through MR122), 8 (MRDTX) or 15
+(AMR_FT_NODATA).  In all of these cases amr_frame_from_ietf() succeeds and
+returns 0 to indicate so; the resulting struct amr_param_frame is then good to
+be passed to amr_decode_frame().  If FT instead falls into the invalid range of
+[9,14], amr_frame_from_ietf() returns -1 to indicate invalid input.
+
+Applications that read from a .amr file will need to read just the ToC (aka
+frame header) octet and decode it to determine how many additional octets need
+to be read to absorb one frame.  Similarly, RTP applications may need to
+validate incoming payloads by cross-checking between the FT indicated in the
+ToC octet and the received payload length.  Both applications can use this
+function:
+
+int amr_ietf_grok_first_octet(uint8_t fo);
+
+The argument is the first octet, and the function only considers the FT field
+thereof.  The return value is:
+
+-1 for invalid FT [9,14]
+0 for FT=15 (the ToC octet is the entirety of the payload)
+>0 for valid FT [0,8], indicating the number of additional bytes to be read
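The return convention listed above can be illustrated with a stand-alone sketch.  This is not the library's implementation: sketch_grok_first_octet() and its table are hypothetical, and the per-mode lengths are the octet-aligned frame sizes of RFC 4867 as assumed here (speech bits rounded up to whole octets, plus 5 octets for SID).

```c
#include <stdint.h>

/* ASSUMED payload octets following the ToC, indexed by FT 0..8. */
static const int sketch_extra_octets[9] = {
	12, 13, 15, 17, 19, 20, 26, 31,	/* MR475 through MR122 */
	5				/* MRDTX (SID) */
};

/*
 * Mirror of the documented convention:
 *   -1 for invalid FT [9,14],
 *    0 for FT=15 (no further octets),
 *   >0 for FT [0,8]: number of additional octets to read.
 * Only the FT field of the first octet is considered.
 */
static int sketch_grok_first_octet(uint8_t fo)
{
	unsigned ft = (fo >> 3) & 0x0F;

	if (ft == 15)
		return 0;
	if (ft > 8)
		return -1;
	return sketch_extra_octets[ft];
}
```

A .amr file reader built along these lines would read one octet, call the function, and then read exactly that many further octets to complete the frame; an RTP receiver would instead compare the result plus one against the received payload length.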
+
+Decoding 3GPP test sequence input
+---------------------------------
+
+To decode a frame from 3GPP .cod file format, call this function:
+
+int amr_frame_from_tseq(const uint16_t *cod, int use_rxtype,
+			struct amr_param_frame *frame);
+
+The argument 'use_rxtype' should be 1 if the input uses Rx frame types (enum
+RXFrameType) or 0 if it uses Tx frame types (enum TXFrameType); this argument
+directly corresponds to -rxframetype command line option in the reference
+decoder program from 3GPP.
+
+Unlike raw amr_decode_frame(), amr_frame_from_tseq() does guard against invalid
+input.  The return value from this function is:
+
+0 means the input was good and the output is good to pass to amr_decode_frame();
+-1 means the frame type field in the input is invalid;
+-2 means the mode field in the input is invalid.
+
+Frame type conversion
+---------------------
+
+The operation of mapping from enum TXFrameType to enum RXFrameType, optionally
+but very commonly invoked from amr_frame_from_tseq(), is factored out into its
+own function, exported as part of the public API:
+
+int amr_txtype_to_rxtype(enum TXFrameType tx_type, enum RXFrameType *rx_type);
+
+The return value is 0 if tx_type is valid and *rx_type has been filled
+accordingly, or -1 if tx_type is invalid.