comparison doc/FR1-Rx-DTX @ 303:4034c2b06ec8
doc/FR1-Rx-DTX: update for libgsmfr2 and the new landscape
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Mon, 15 Apr 2024 22:07:00 +0000 |
parents | 731c98b67da1 |
children |
302:f469bad44c0e | 303:4034c2b06ec8 |
---|---|
24 SID flag, but it is determined from frame payload bits), and then the | 24 SID flag, but it is determined from frame payload bits), and then the |
25 interface from the Rx DTX handler to the GSM 06.10 decoder is another FR frame | 25 interface from the Rx DTX handler to the GSM 06.10 decoder is another FR frame |
26 of 260 bits. | 26 of 260 bits. |
27 | 27 |
28 What are the implications of this situation for the GSM published-source | 28 What are the implications of this situation for the GSM published-source |
29 software community? Prior to the present libgsmfrp offering, there has always | 29 software community? Prior to the present Themyscira offering, there has always |
30 been libgsm, but no Rx DTX handler. If you are working with a GSM uplink RTP | 30 been libgsm, but no Rx DTX handler. If you are working with a GSM uplink RTP |
31 stream from a BTS or a GSM downlink frame stream read out of TI Calypso DSP or | 31 stream from a BTS or a GSM downlink frame stream read out of TI Calypso DSP or |
32 some other GSM MS PHY, feeding that stream directly to libgsm (without passing | 32 some other GSM MS PHY, feeding that stream directly to libgsm (without passing |
33 through an Rx DTX handler) is NOT acceptable: a "bare" GSM 06.10 decoder won't | 33 through an Rx DTX handler) is NOT acceptable: a "bare" GSM 06.10 decoder won't |
34 recognize SID frames and won't produce the expected comfort noise output, and | 34 recognize SID frames and won't produce the expected comfort noise output, and |
39 will be garbage during those frame windows when no good frame was received; | 39 will be garbage during those frame windows when no good frame was received; |
40 feeding that garbage to libgsm produces noises that are very unkind on ears. | 40 feeding that garbage to libgsm produces noises that are very unkind on ears. |
41 | 41 |
42 The correct solution is to implement an Rx DTX handler, pass the stream of | 42 The correct solution is to implement an Rx DTX handler, pass the stream of |
43 frames and flags from the BTS or the MS PHY to this handler first, and then pass | 43 frames and flags from the BTS or the MS PHY to this handler first, and then pass |
44 the output of this handler to libgsm 06.10 decoder. Themyscira libgsmfrp is a | 44 the output of this handler to the standard GSM 06.10 decoder (classic libgsm or |
45 Free Software implementation of Rx DTX handler for GSM FR, implementing SID | 45 some updated port thereof). Themyscira libgsmfrp was our first Free Software |
46 classification, comfort noise generation and error concealment. | 46 implementation of Rx DTX handler for GSM-FR, implementing SID classification, |
47 comfort noise generation and error concealment. Our new libgsmfr2 offering | |
48 takes the harmonization effort (between GSM-FR and other GSM codecs) one step | |
49 further, eliminating the dependency on old libgsm and putting all GSM-FR codec | |
50 functions "under one roof". | |
51 | |
52 libgsmfrp/libgsmfr2 API documentation | |
53 ===================================== | |
54 | |
55 The Rx DTX component of libgsmfr2 has the same API as our previous libgsmfrp, | |
56 except for dropping the use of <gsm.h> and its types and needing to include our | |
57 new API header <tw_gsmfr.h>. The present article previously contained the full | |
58 description of this API; that description has now been moved to FR1-library-API | |
59 article, where the whole of libgsmfr2 is documented. | |
60 | |
61 Standalone exerciser utility | |
62 ============================ | |
63 | |
64 The present GSM codec libraries and utilities package includes a standalone | |
65 utility that exercises our Rx DTX handler for GSM-FR. This utility is | |
66 gsmfr-preproc, to be run as follows: | |
67 | |
68 gsmfr-preproc input.gsmx output.gsm | |
69 | |
70 The input is an extended-libgsm file that can contain SIDs and BFI frame gaps | |
71 in addition to regular GSM 06.10 speech frames (see Binary-file-format article); | |
72 the output is GSM 06.10 speech frames only. | |
73 | |
74 False SID detection | |
75 =================== | |
76 | |
77 The intent of GSM-FR spec authors was that the sets of possible speech frames | |
78 and possible SID frames be disjoint. Prior to introduction of DTX, there were | |
79 only regular speech frames per GSM 06.10, no SID, and a receiver had to deal | |
80 with only two possibilities: either a good speech frame was received, or the | |
81 frame was lost to radio errors or FACCH stealing (unusable frame). When SID | |
82 frames were introduced for the purpose of intentional DTX as distinct from | |
83 radio errors, the intent was that SID was to be a "new animal" not seen before, | |
84 distinct from regular speech frames. There is, however, a small blemish in the | |
85 actual system as realized: if the SID frame detector and the Rx DTX handler | |
86 that follows it in the Rx chain follow the rules of GSM 06.31 sections 6.1.1 | |
87 and 6.1.2, respectively (like our implementation does), then some speech frames | |
88 may be mistaken for invalid SID, or perhaps even for valid SID, producing a | |
89 nonzero failure rate in this mechanism. | |
90 | |
91 Official test sequence 02 in the set of 5 provided by ETSI exhibits this effect: | |
92 Seq02.inp is a legitimate 13-bit linear PCM input to the speech encoder, and the | |
93 corresponding output of GSM 06.10 encoder is contained in Seq02.cod. However, | |
94 that output contains some frames that are mistakenly classified as SID=1 | |
95 (invalid SID) by the rules of GSM 06.31 section 6.1.1! It is true that these | |
96 ancient test sequences chronologically predate the invention of DTX and | |
97 GSM 06.31, but we still need to bear in mind that this problematic Seq02.cod is | |
98 not an artificially constructed sequence of 06.10 codec parameters: it is the | |
99 required output of the prescribed bit-exact encoder given a legitimate PCM | |
100 input! There does not exist a perfect solution to this problem: as usual, | |
101 real-world engineering is all about trade-offs and compromises, and occasionally | |
102 a gear will slip. The best we can do is to model the probability of such | |
103 gear-slip or wrong detection events, and engineer our systems to reduce this | |
104 probability to a level that is deemed acceptable - which is exactly what GSM | |
105 spec designers did here. | |
106 | |
107 As of gsm-codec-lib-r3, gsmrec-dump utility shows the SID classification result | |
108 (GSM 06.31 section 6.1.1) in addition to parsed 06.10 codec parameters for each | |
109 frame, thus one can inspect FR-encoded streams and check for this blemish. | |
47 | 110 |
48 Effect of extra preprocessing | 111 Effect of extra preprocessing |
49 ============================= | 112 ============================= |
50 | 113 |
51 One key detail deserves extra emphasis before going into library API details: | 114 What will happen if the output of our Rx DTX preprocessor (e.g., the output of |
52 if the input to libgsmfrp consists entirely of good speech frames (no SID frames | 115 gsmfr-preproc utility) is fed to another utility such as gsmfr-decode that also |
53 and no BFIs), then the preprocessor becomes an identity transform. Therefore, | 116 applies the same preprocessor to its input? In other words, what is the effect |
54 if the output of our libgsmfrp preprocessor were to be fed to an additional | 117 of a secondary preprocessor application to previous preprocessor output? |
55 instance of the same further down the processing chain, no extra transformation | |
56 of any kind will happen. | |
57 | 118 |
58 Using libgsmfrp | 119 Most of the time, the second preprocessor pass will be an identity transform |
59 =============== | 120 under these conditions, as the input to that second pass will consist entirely |
60 | 121 of good speech frames, no SIDs and no BFIs. Any speech frames in the original |
61 The external public interface to Themyscira libgsmfrp consists of a single | 122 input that were mistakenly classified as SID (valid or invalid) have already |
62 header file <gsm_fr_preproc.h>; it should be installed in the same system | 123 been converted to comfort noise (or to the silence frame in one corner case of |
63 include directory as <gsm.h> from libgsm. Please note that <gsm_fr_preproc.h> | 124 invalid SID), hence they are no longer present in the output to trigger this |
64 includes <gsm.h>, as needed for gsm_byte and gsm_frame defined types. | 125 effect a second time. However, there is still a small possibility that a |
65 | 126 second pass will be a non-identity transform: pseudorandom RPE pulse parameters |
66 The dialect of C we chose for libgsmfrp is ANSI C (function prototypes), const | 127 in our comfort noise output are uniformly distributed between 1 and 6 (GSM 06.12 |
67 qualifier is used where appropriate; however, unlike libgsmefr, the interface | 128 section 6.1), and if PRNG dice roll such that at least 80 out of 95 SID codeword |
68 to libgsmfrp is defined in terms of gsm_byte type defined in <gsm.h>, included | 129 bit positions (all in the xMc part of the frame) are all zeros, the resulting |
69 from <gsm_fr_preproc.h>. | 130 CN frame will be liable to misinterpretation as SID (invalid SID most of the |
70 | 131 time, or even more rarely valid SID if at least 94 out of 95 SID codeword bit |
71 State allocation and freeing | 132 positions are all zeros) if fed to the preprocessor a second time. That second |
72 ============================ | 133 pass would then further alter those affected frames, but no others. |
73 | |
74 The Rx DTX handler is stateful, hence you will need to allocate a preprocessor | |
75 state structure in addition to the usual libgsm state structure for your GSM FR | |
76 Rx session. The necessary function is: | |
77 | |
78 extern struct gsmfr_preproc_state *gsmfr_preproc_create(void); | |
79 | |
80 struct gsmfr_preproc_state is an opaque structure to library users: you only get | |
81 a pointer which you remember and pass around, but <gsm_fr_preproc.h> does not | |
82 give you a full definition of this struct. As a library user, you don't even | |
83 get to know the size of this struct, hence the necessary malloc() operation | |
84 happens inside gsmfr_preproc_create(). However, the structure is malloc'ed as | |
85 a single chunk, hence when you are done with it, simply call free() on the | |
86 pointer you got from gsmfr_preproc_create(). | |
87 | |
88 gsmfr_preproc_create() can fail if the malloc() call inside fails, in which case | |
89 it returns NULL. | |
90 | |
91 Preprocessing good frames | |
92 ========================= | |
93 | |
94 For every good traffic frame (BFI=0) you receive from the radio subsystem, you | |
95 need to call this preprocessor function: | |
96 | |
97 extern void gsmfr_preproc_good_frame(struct gsmfr_preproc_state *state, | |
98 gsm_byte *frame); | |
99 | |
100 The second argument is both input and output, i.e., the frame is modified in | |
101 place. If the received frame is not SID (specifically, if the SID field | |
102 deviates from the SID codeword by 16 or more bits, per GSM 06.31 section 6.1.1), | |
103 then the frame (considered a good speech frame) will be left unmodified (i.e., | |
104 it is to be passed unchanged to the GSM 06.10 decoder), but preprocessor state | |
105 will be updated. OTOH, if the received frame is classified as either valid or | |
106 invalid SID per GSM 06.31, then the output frame will contain comfort noise | |
107 generated by the preprocessor using a PRNG, or a silence frame in one particular | |
108 corner case. | |
109 | |
110 GSM-FR RTP (or libgsm) 0xD magic: the upper nibble of the first byte can be | |
111 anything on input to gsmfr_preproc_good_frame(), but the output frame will | |
112 always have the correct magic in it. | |
113 | |
114 Handling BFI conditions | |
115 ======================= | |
116 | |
117 If you received a lost/missing frame indication instead of a good traffic frame, | |
118 call this preprocessor function: | |
119 | |
120 extern void gsmfr_preproc_bfi(struct gsmfr_preproc_state *state, int taf, | |
121 gsm_byte *frame_out); | |
122 | |
123 TAF is a flag defined in GSM 06.31 section 6.1.1; if you don't have this flag, | |
124 pass 0 - you will lose the function of comfort noise muting in the event of | |
125 prolonged SID loss, but all other Rx DTX functions will still work the same. | |
126 | |
127 With this function the 33-byte frame buffer is only an output, i.e., prior | |
128 buffer content is a don't-care and there is no provision for making any use of | |
129 erroneous frames like in EFR. The frame generated by the preprocessor may be | |
130 substitution/muting, comfort noise or silence depending on the state. | |
131 | |
132 Other miscellaneous functions | |
133 ============================= | |
134 | |
135 extern void gsmfr_preproc_reset(struct gsmfr_preproc_state *state); | |
136 | |
137 This function resets the preprocessor state to what it is right out of | |
138 gsmfr_preproc_create(), which is naturally just a combination of malloc() and | |
139 gsmfr_preproc_reset(). Given that our Rx DTX handler state is much simpler | |
140 than, for example, EFR codec state, there does not seem to be any need for | |
141 explicit resets, but the reset function is made public for the sake of | |
142 completeness. | |
143 | |
144 extern int gsmfr_preproc_sid_classify(const gsm_byte *frame); | |
145 | |
146 This function analyzes an RTP-encoded FR frame (the upper nibble of the first | |
147 byte is NOT checked for 0xD signature) for the SID codeword of GSM 06.12 and | |
148 classifies the frame as SID=0, SID=1 or SID=2 per the rules of GSM 06.31 | |
149 section 6.1.1. | |
150 | |
151 Silence frame datum | |
152 =================== | |
153 | |
154 extern const gsm_frame gsmfr_preproc_silence_frame; | |
155 | |
156 Many implementors make the mistake of thinking that a GSM FR silence frame is a | |
157 frame of 260 zero bits, but the official specs disagree: the silence frame given | |
158 in GSM 06.11 (3GPP TS 46.011, at the very end of the spec) is quite different. | |
159 Themyscira libgsmfrp implements the correct silence frame per the spec, and that | |
160 datum is also made public. | |
161 | |
162 libgsmfrp change history: version 1.0.1 to version 1.0.2 | |
163 ======================================================== | |
164 | |
165 There are only two changes, both involving corner cases with invalid SID frames | |
166 being received: | |
167 | |
168 1) An invalid SID frame was received immediately following a good speech frame. | |
169 In this case we start CN generation, but we take the needed LARc and Xmaxc | |
170 parameters from the last speech frame, instead of the usual procedure of | |
171 extracting them from a valid SID frame. The change from 1.0.1 to 1.0.2 | |
172 concerns the Xmaxc parameter in this corner case: in 1.0.1 we took Xmaxc | |
173 from the last subframe and used it for ensuing CN generation, but in 1.0.2 | |
174 we compute a more proper mean Xmaxc from all 4 subframes, by dequantizing, | |
175 summing and requantizing. | |
176 | |
177 2) An invalid SID frame was received in the speech muting state. The sequence | |
178 of inputs would have to be: | |
179 | |
180 - a good speech frame; | |
181 - one or more BFIs, but not too many, so that the cached speech frame | |
182 does not decay fully by Xmaxc reduction; | |
183 - an invalid SID frame. | |
184 | |
185 In version 1.0.1 we handled this even more obscure corner case by entering | |
186 the CN muting state, i.e., the state that is normally entered upon the | |
187 second lost SID. In version 1.0.2 we ignore invalid SID in the speech | |
188 muting state and act as if we got BFI, i.e., continue speech muting rather | |
189 than switch to CN muting. | |
190 | |
191 libgsmfrp change history: version 1.0.0 to version 1.0.1 | |
192 ======================================================== | |
193 | |
194 Version 1.0.0 exhibited the following defects, which are fixed in 1.0.1: | |
195 | |
196 1) The last received valid SID was cached forever for the purpose of | |
197 handling future invalid SIDs - we could have received some valid | |
198 SID ages ago, then lots of speech or NO_DATA, and if we then get | |
199 an invalid SID, we would resurrect the last valid SID from ancient | |
200 history - a bad design. In our new design, we handle invalid SID | |
201 based on the current state, much like BFI. | |
202 | |
203 2) GSM 06.11 spec says clearly that after the second lost SID | |
204 (received BFI=1 && TAF=1 in CN state) we need to gradually decrease | |
205 the output level, rather than jump directly to emitting silence | |
206 frames - we previously failed to implement such logic. | |
207 | |
208 3) Per GSM 06.12 section 5.2, Xmaxc should be the same in all 4 subframes | |
209 in a SID frame. What should we do if we receive an otherwise valid | |
210 SID frame with different Xmaxc? Our previous approach would | |
211 replicate this Xmaxc oddity in every subsequent generated CN frame, | |
212 which is rather bad. In our new design, the very first CN frame | |
213 (which can be seen as a transformation of the SID frame itself) | |
214 retains the original 4 distinct Xmaxc, but all subsequent CN frames | |
215 are based on the Xmaxc from the last subframe of the most recent SID. |
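
The SID classification thresholds that this article refers to can be sketched in C as follows. Per the text, a frame is not SID when its SID field deviates from the 95-bit SID codeword by 16 or more bits, and at least 94 of the 95 positions must match for a valid SID. This is only an illustration under stated assumptions: the enum and function names are hypothetical and are not part of libgsmfrp or libgsmfr2, and the caller is assumed to have already extracted the 95 SID codeword bit positions from the frame, a step this sketch does not perform.

```c
#include <stddef.h>

/* Number of bit positions in the SID codeword, per the text. */
#define SID_CODEWORD_BITS 95

/* Result values follow the SID=0/1/2 naming used in the text.
 * The enum names themselves are hypothetical. */
enum sid_class {
    SID_CLASS_SPEECH  = 0,  /* SID=0: treated as a good speech frame */
    SID_CLASS_INVALID = 1,  /* SID=1: invalid SID */
    SID_CLASS_VALID   = 2   /* SID=2: valid SID */
};

/* Count how many of the extracted codeword bit positions are nonzero.
 * The SID codeword is all zeros, so every set bit is one deviation.
 * Input: one array element (0 or 1) per codeword bit position;
 * producing this array from a packed frame is left to the caller. */
static unsigned count_deviations(const unsigned char bits[SID_CODEWORD_BITS])
{
    unsigned n = 0;
    size_t i;

    for (i = 0; i < SID_CODEWORD_BITS; i++)
        if (bits[i])
            n++;
    return n;
}

/* Map a deviation count to a classification:
 * 16 or more deviations -> speech, 2 to 15 -> invalid SID,
 * 0 or 1 -> valid SID. */
static enum sid_class sid_class_from_deviation(unsigned ndev)
{
    if (ndev >= 16)
        return SID_CLASS_SPEECH;
    if (ndev >= 2)
        return SID_CLASS_INVALID;
    return SID_CLASS_VALID;
}
```

With these thresholds, a comfort noise frame whose 95 codeword positions happen to contain 15 or fewer set bits would be classified as SID on a second preprocessor pass, which is the corner case discussed under "Effect of extra preprocessing".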