view doc/FR1-Rx-DTX @ 407:5a1d18542f8a
libtwamr: integrate dtx_dec.c and dtx_enc.c
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Tue, 07 May 2024 00:05:12 +0000 |
parents | 4034c2b06ec8 |
At the level of provided functionality and architectural structure, ETSI GSM
specifications for DTX (discontinuous transmission) are very symmetric between
FR and EFR: the same DTX functionality is specified for both codecs, with the
same overall architecture. However, there is one important difference: in the
case of EFR the complete implementation of all DTX functions (for both Tx and
Rx) forms an integral and inseparable part of the reference codec (implemented
in C) from the beginning, whereas in the case of FR1 the addition of DTX is
somewhat of an afterthought. GSM 06.10 defines a "pure" FR codec without any
DTX functions, and this most basic spec can be and has been implemented in this
"pure" form - classic Unix libgsm from the 1990s is a proper, fully compliant
implementation of GSM 06.10, but only this spec, without any DTX. In contrast,
there has never existed a "pure" implementation of the GSM 06.60 EFR codec
without associated Tx and Rx DTX functions.

Furthermore, there is an important distinction between Tx and Rx DTX handlers
for FR1:

* Anyone who seeks to implement Tx DTX for FR1 would have to dig into the guts
  of the GSM 06.10 encoder and augment it with VAD and SID encoding functions
  per GSM 06.32 and 06.12 specs.

* In contrast, the Rx DTX handler for FR1 is modular: the way it is specified
  in GSM 06.11, 06.12 and 06.31 is a front-end to an unmodified GSM 06.10
  decoder. On the Rx side, the interface from the radio subsystem to the Rx
  DTX handler consists of 260 bits of frame plus BFI and TAF flags (the spec
  also defines a SID flag, but it is determined from frame payload bits), and
  then the interface from the Rx DTX handler to the GSM 06.10 decoder is
  another FR frame of 260 bits.

What are the implications of this situation for the GSM published-source
software community? Prior to the present Themyscira offering, there has always
been libgsm, but no Rx DTX handler.
If you are working with a GSM uplink RTP stream from a BTS or a GSM downlink
frame stream read out of TI Calypso DSP or some other GSM MS PHY, feeding that
stream directly to libgsm (without passing through an Rx DTX handler) is NOT
acceptable: a "bare" GSM 06.10 decoder won't recognize SID frames and won't
produce the expected comfort noise output, and what are you going to do in
those 20 ms windows in which no good traffic frame was received? The situation
becomes especially bad if you are reading received downlink frames out of TI
Calypso DSP: the DSP's buffer will have *some* bit content in every 20 ms
window, but naturally this bit content will be garbage during those frame
windows when no good frame was received; feeding that garbage to libgsm
produces noises that are very unkind on ears.

The correct solution is to implement an Rx DTX handler, pass the stream of
frames and flags from the BTS or the MS PHY to this handler first, and then
pass the output of this handler to the standard GSM 06.10 decoder (classic
libgsm or some updated port thereof).

Themyscira libgsmfrp was our first Free Software implementation of an Rx DTX
handler for GSM-FR, implementing SID classification, comfort noise generation
and error concealment. Our new libgsmfr2 offering takes the harmonization
effort (between GSM-FR and other GSM codecs) one step further, eliminating the
dependency on old libgsm and putting all GSM-FR codec functions "under one
roof".

libgsmfrp/libgsmfr2 API documentation
=====================================

The Rx DTX component of libgsmfr2 has the same API as our previous libgsmfrp,
except for dropping the use of <gsm.h> and its types and needing to include our
new API header <tw_gsmfr.h>. The present article previously contained the full
description of this API; that description has now been moved to the
FR1-library-API article, where the whole of libgsmfr2 is documented.

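As a conceptual illustration only, the Rx chain described earlier (radio
subsystem, then Rx DTX handler, then unmodified GSM 06.10 decoder) can be
sketched in Python as plain data flow. This is NOT the libgsmfr2 API (see the
FR1-library-API article for that): RxFrame, decode_stream and both callables
are hypothetical names invented for this sketch, and the handler and decoder
are supplied by the caller rather than implemented here.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple

FRAME_BITS = 260  # payload bits in one GSM-FR traffic frame

Bits = Tuple[int, ...]


class RxFrame(NamedTuple):
    """Hypothetical container for one 20 ms delivery from the radio subsystem."""
    bits: Bits   # 260 payload bits; content may be garbage when bfi is set
    bfi: bool    # bad frame indication
    taf: bool    # time alignment flag


def decode_stream(rx_frames: Iterable[RxFrame],
                  rx_dtx_handler: Callable[[RxFrame], Bits],
                  fr_decoder: Callable[[Bits], List[int]]) -> List[int]:
    """Pass every received frame through the Rx DTX handler, then the decoder.

    rx_dtx_handler is a placeholder for the front-end: it sees the frame bits
    plus BFI and TAF, and must always return a 260-bit frame.
    fr_decoder is a placeholder for a plain GSM 06.10 decoder that knows
    nothing about SID, BFI or TAF.
    """
    pcm: List[int] = []
    for rx in rx_frames:
        clean = rx_dtx_handler(rx)       # all DTX-related decisions happen here
        if len(clean) != FRAME_BITS:
            raise ValueError("Rx DTX handler must emit a full 260-bit frame")
        pcm.extend(fr_decoder(clean))    # the decoder only ever sees FR frames
    return pcm
```

The point of the sketch is the shape of the two interfaces: flags go into the
handler but never reach the decoder, which is why an existing GSM 06.10
decoder can be used without modification.
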
Standalone exerciser utility
============================

The present GSM codec libraries and utilities package includes a standalone
utility that exercises our Rx DTX handler for GSM-FR. This utility is
gsmfr-preproc, to be run as follows:

gsmfr-preproc input.gsmx output.gsm

The input is an extended-libgsm file that can contain SIDs and BFI frame gaps
in addition to regular GSM 06.10 speech frames (see Binary-file-format
article); the output is GSM 06.10 speech frames only.

False SID detection
===================

The intent of GSM-FR spec authors was that the sets of possible speech frames
and possible SID frames be disjoint. Prior to introduction of DTX, there were
only regular speech frames per GSM 06.10, no SID, and a receiver had to deal
with only two possibilities: either a good speech frame was received, or the
frame was lost to radio errors or FACCH stealing (unusable frame). When SID
frames were introduced for the purpose of intentional DTX as distinct from
radio errors, the intent was that SID was to be a "new animal" not seen before,
distinct from regular speech frames.

There is, however, a small blemish in the actual system as realized: if the SID
frame detector and the Rx DTX handler that follows it in the Rx chain follow
the rules of GSM 06.31 sections 6.1.1 and 6.1.2, respectively (like our
implementation does), then some speech frames may be mistaken for invalid SID,
or perhaps even for valid SID, producing a nonzero failure rate in this
mechanism. Official test sequence 02 in the set of 5 provided by ETSI exhibits
this effect: Seq02.inp is a legitimate 13-bit linear PCM input to the speech
encoder, and the corresponding output of GSM 06.10 encoder is contained in
Seq02.cod. However, that output contains some frames that are mistakenly
classified as SID=1 (invalid SID) by the rules of GSM 06.31 section 6.1.1!

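The classification rule referenced above can be sketched in Python using the
thresholds quoted later in this article (at least 94 of the 95 SID codeword
bit positions zero for valid SID, at least 80 of 95 zero for invalid SID).
This is a minimal sketch, not the library's code: the function name
classify_sid is made up here, the return values follow the SID flag convention
of GSM 06.31 (2 = valid SID, 1 = invalid SID, 0 = speech), and the caller is
assumed to have already extracted the 95 codeword bits from the frame, since
the table of bit positions is not reproduced in this article.

```python
SID_CODEWORD_BITS = 95  # number of bit positions forming the SID codeword


def classify_sid(codeword_bits):
    """Classify a frame from the 95 bits found at the SID codeword positions.

    Returns 2 for valid SID, 1 for invalid SID, 0 for a regular speech frame.
    The SID codeword is all zeros, so the decision depends only on how many
    of these positions hold a 1.
    """
    if len(codeword_bits) != SID_CODEWORD_BITS:
        raise ValueError("expected exactly 95 codeword bits")
    ones = sum(1 for b in codeword_bits if b)
    if ones <= 1:        # at least 94 of 95 positions are zero
        return 2
    if ones <= 15:       # at least 80 of 95 positions are zero
        return 1
    return 0
```

Nothing in this rule can tell an encoder-produced speech frame that happens
to have 80 or more zeros in these positions apart from a damaged SID, which is
the blemish described above.
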
It is true that these ancient test sequences chronologically predate the
invention of DTX and GSM 06.31, but we still need to bear in mind that this
problematic Seq02.cod is not an artificially constructed sequence of 06.10
codec parameters: it is the required output of the prescribed bit-exact encoder
given a legitimate PCM input!

There does not exist a perfect solution to this problem: as usual, real-world
engineering is all about trade-offs and compromises, and occasionally a gear
will slip. The best we can do is to model the probability of such gear-slip or
wrong detection events, and engineer our systems to reduce this probability to
a level that is deemed acceptable - which is exactly what GSM spec designers
did here.

As of gsm-codec-lib-r3, gsmrec-dump utility shows the SID classification result
(GSM 06.31 section 6.1.1) in addition to parsed 06.10 codec parameters for each
frame, thus one can inspect FR-encoded streams and check for this blemish.

Effect of extra preprocessing
=============================

What will happen if the output of our Rx DTX preprocessor (e.g., the output of
gsmfr-preproc utility) is fed to another utility such as gsmfr-decode that also
applies the same preprocessor to its input? In other words, what is the effect
of a secondary preprocessor application to previous preprocessor output?

Most of the time, the second preprocessor pass will be an identity transform
under these conditions, as the input to that second pass will consist entirely
of good speech frames, no SIDs and no BFIs. Any speech frames in the original
input that were mistakenly classified as SID (valid or invalid) have already
been converted to comfort noise (or to the silence frame in one corner case of
invalid SID), hence they are no longer present in the output to trigger this
effect a second time.

However, there is still a small possibility that a second pass will be a
non-identity transform: pseudorandom RPE pulse parameters in our comfort noise
output are uniformly distributed between 1 and 6 (GSM 06.12 section 6.1), and
if the PRNG dice roll such that at least 80 out of 95 SID codeword bit
positions (all in the xMc part of the frame) are zeros, the resulting CN frame
will be liable to misinterpretation as SID if fed to the preprocessor a second
time. Most of the time such a frame would be taken for invalid SID; even more
rarely, if at least 94 out of 95 SID codeword bit positions are zeros, it would
be taken for valid SID. That second pass would then further alter those
affected frames, but no others.
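
To get a feel for how small this possibility is, the following Python sketch
estimates the per-frame probability under a simplifying assumption that is NOT
taken from the specs or from the library: each of the 95 codeword bits is
treated as an independent fair coin. The fair-coin part is sound, since each of
the three bits of an xMc value drawn uniformly from 1 to 6 is set for exactly 3
of the 6 values (the sketch checks this), but bits that belong to the same
pulse are correlated, so the result is only a rough order-of-magnitude figure.
The function name prob_at_most_ones is made up for this illustration.

```python
from fractions import Fraction
from math import comb

N_BITS = 95  # SID codeword bit positions, all within the xMc part of the frame

# Marginal check: for xMc uniform on 1..6, each of the 3 bits is 1 for
# exactly 3 of the 6 values, i.e. each bit alone is a fair coin.
ones_per_bit = [sum((v >> bit) & 1 for v in range(1, 7)) for bit in range(3)]


def prob_at_most_ones(max_ones, n=N_BITS):
    """P(at most max_ones of n independent fair bits are 1), as an exact fraction.

    Independence between bits is an assumption of this sketch.
    """
    favourable = sum(comb(n, k) for k in range(max_ones + 1))
    return Fraction(favourable, 2 ** n)


# At least 80 of 95 zeros means at most 15 ones: frame taken for some kind of SID.
p_any_sid = prob_at_most_ones(15)
# At least 94 of 95 zeros means at most 1 one: frame taken for valid SID.
p_valid_sid = prob_at_most_ones(1)

print("P(CN frame re-classified as SID)       ~", float(p_any_sid))
print("P(CN frame re-classified as valid SID) ~", float(p_valid_sid))
```

Under this approximation both events are extremely rare per frame, and the
valid SID case is many orders of magnitude rarer than the invalid SID case,
which matches the qualitative statement above.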