comparison doc/TFO-transform @ 553:ebcf414b7d99

doc/TFO-transform: describe details for FRv1, both modes
author Mychaela Falconia <falcon@freecalypso.org>
date Mon, 07 Oct 2024 08:24:24 +0000
parents 8f44d7064c56
children
552:6ab066180ec2 553:ebcf414b7d99
34 Both input and output files are in TW-TS-005 Annex A hexadecimal format. The
35 input will typically consist of TW-TS-001 extended RTP format, whereas the
36 output is always emitted in the basic format, pure GSM-FR codec frames only.
37
38 -d option enables DTXd, which is disabled by default.
39
40 Details of FRv1 TFO transform with DTXd=0
41 -----------------------------------------
42
43 Our implementation of TFO transform in DTXd=0 configuration is mostly identical
44 to the Rx DTX handler preprocessor stage of regular speech decoding; the
45 details are covered in the FR1-Rx-DTX-detail article.
46
47 ThemWi implementation of TFO transform includes the feature of in-band homing:
48 if the input to the transform is the spec-defined decoder homing frame (DHF),
49 this DHF is passed through to the output just like any other good speech frame,
50 but the internal state is reset to the initial "home" state.
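The in-band homing behaviour described above can be sketched as follows. This
is only an illustration in Python, not the ThemWi code: the class names, the
state fields and the placeholder DHF byte string are all assumptions made for
the example. The real DHF bit pattern is the one defined by the spec, and the
real transform state holds more than two fields.

```python
from dataclasses import dataclass

# Placeholder only: NOT the real decoder homing frame bit pattern.
HYPOTHETICAL_DHF = bytes([0xD0]) + bytes(32)


@dataclass
class TransformState:
    """Hypothetical stand-in for the transform's internal state."""
    frames_seen: int = 0
    in_dtx_pause: bool = False


class HomingTransform:
    """Sketch of the good-speech path with in-band homing."""

    def __init__(self, dhf: bytes = HYPOTHETICAL_DHF):
        self.dhf = dhf
        self.state = TransformState()

    def process_good_speech(self, frame: bytes) -> bytes:
        if frame == self.dhf:
            # In-band homing: reset to the initial "home" state...
            self.state = TransformState()
        else:
            self.state.frames_seen += 1
            self.state.in_dtx_pause = False
        # ...but pass the frame through like any other good speech frame.
        return frame
```

The point of the sketch is that the DHF is never altered or suppressed on the
output; only the state behind it changes.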
51
52 Details of FRv1 TFO transform with DTXd=1
53 -----------------------------------------
54
55 We implement the DTXd=1 version of TFO transform as a post-processor stage
56 after executing the "regular" logic for DTXd=0 case; more precisely, our
57 "regular" Rx DTX handler code sets some flags that are only used by the TFO
58 DTXd=1 post-processor, and the latter element acts on one of those flags.
59
60 The resulting visible behaviour of our TFO transform is as follows:
61
62 * Whenever a valid SID frame comes in, it is re-emitted on the output in the
63 same frame position with the same parameters, even if it has different Xmaxc
64 in different subframes. However, it is "rejuvenated" in that any possible
65 single bit error in the SID codeword is corrected, and all unused bits are
66 also cleared. This behaviour agrees with GSM 08.62 section 8.2.2.
67
68 * Also in agreement with GSM 08.62 section 8.2.2, any unusable frames or invalid
69 SID frames that come in after that valid SID (but before that cached SID
70 expires by way of two lost SID events, or a good speech frame ends the DTX
71 pause) are replaced with output that repeats the last processed valid SID.
72 This output consists of repeated SID frames just like the original, but with
73 all 4 Xmaxc parameters set to the one from the last subframe.
74
75 * If an invalid SID frame is received directly after good speech, indicating a
76 need to start comfort noise insertion but lacking usable parameters for it,
77 the output from the TFO transform is just like that described in
78 FR1-Rx-DTX-detail article, but in the form of SID frames rather than "speech"
79 frames that represent CN.
80
81 * If two consecutive lost SID events occur and the Rx DTX handler has to enter
82 CN muting state, our TFO transform breaks out of DTX and emits the CN muting
83 sequence as "speech" frames rather than altered SID. This tactic is used in
84 order to produce an immediate effect on the receiving end. Once the muting fully
85 decays, the transform emits 4 silence frames of GSM 06.11 Table 1, then
86 switches to endlessly emitting SIDs derived from this silence frame (same
87 LARc, Xmaxc=0).
88
89 * Any other time the Rx DTX handler is in NO_DATA state (initial reset state or
90 fully decayed state after speech muting), the TFO transform in DTXd=1 mode
91 emits SIDs derived from the silence frame instead of actual silence frames.
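The SID handling rules in the list above can be sketched at the parameter
level as follows. This is a hedged illustration in Python, not the actual
implementation: SidParams and all function names are hypothetical, frame
packing and SID codeword correction are not modelled, and the deviation
thresholds are assumed from the GSM 06.31 SID detector rather than taken from
this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SidParams:
    """Hypothetical container: only the parameters the text talks about."""
    larc: List[int]   # the 8 LARc parameters
    xmaxc: List[int]  # one Xmaxc per subframe, 4 in total


def classify_by_sid_deviations(deviating_bits: int) -> str:
    """Assumed GSM 06.31 thresholds on SID codeword bit deviations."""
    if deviating_bits < 2:
        return "valid_sid"      # 0 or 1 bit errors: correctable
    if deviating_bits < 16:
        return "invalid_sid"
    return "speech"


def rejuvenate_valid_sid(received: SidParams) -> SidParams:
    """Valid SID in: same parameters out, per-subframe Xmaxc preserved.

    Codeword correction and clearing of unused bits happen when the
    frame is re-packed, which this parameter-level sketch does not model.
    """
    return SidParams(list(received.larc), list(received.xmaxc))


def repeat_cached_sid(cached: SidParams) -> SidParams:
    """Substitute for unusable or invalid-SID input during a DTX pause."""
    last_subframe_xmaxc = cached.xmaxc[3]
    return SidParams(list(cached.larc), [last_subframe_xmaxc] * 4)


def sid_from_silence(silence_larc: List[int]) -> SidParams:
    """SID derived from the silence frame: same LARc, Xmaxc=0."""
    return SidParams(list(silence_larc), [0] * 4)
```

For example, a cached SID whose four subframes carry Xmaxc values 5, 6, 7, 9
would be re-emitted unchanged when it first arrives, and then repeated with
Xmaxc 9 in all four subframes for any unusable frames that follow it.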
92
93 Emission of transform-synthesized SID frames during muting states is done in
94 order to help achieve the network operator's presumed goal of DTX maximization
95 and radio interference reduction. However, if the input to the transform is
96 all good speech frames without DTX pauses, the transform does not attempt to
97 apply VAD and perform its own DTXd.
97 apply VAD and make its own DTXd.