comparison doc/EFR-rationale @ 122:b33f2168fdec
doc/EFR-rationale article written
| author | Mychaela Falconia <falcon@freecalypso.org> |
|---|---|
| date | Sat, 10 Dec 2022 08:51:01 +0000 |
| parents | |
| children | 4af99bf8671a |
| 121:b51295fcbbae | 122:b33f2168fdec |
|---|---|
| 1 Problem in need of solving | |
| 2 ========================== | |
| 3 | |
| 4 When the Themyscira libgsmefr project was undertaken (late 2022), | |
| 5 there was no readily available library solution for the GSM EFR codec. | |
| 6 The FOSS community offers classic libgsm from the 1990s for the FR1 codec (it's an | |
| 7 implementation of GSM 06.10, on top of which we had to implement our own Rx DTX | |
| 8 handler) and opencore-amrnb for AMR (based on Android OpenCORE framework) - but | |
| 9 nothing for EFR. This situation creates a problem for anyone seeking to deploy | |
| 10 their own GSM network with a voice interface to PSTN or other networks: such | |
| 11 voice interface generally requires implementing a transcoder, and doing the | |
| 12 latter in turn requires a library that implements the codec to be supported. | |
| 13 In the present situation, anyone who wishes to implement a speech transcoder | |
| 14 for GSM networks can easily support FR1 and AMR codecs, but not EFR. | |
| 15 | |
| 16 EFR is more than just 12k2 mode of AMR! | |
| 17 ======================================= | |
| 18 | |
| 19 There is a common misconception in the GSM hacker community that EFR is nothing | |
| 20 but the highest 12k2 mode of AMR, and that any library that implements AMR, | |
| 21 such as opencore-amrnb, is thus sufficient to support EFR as well. However, | |
| 22 the reality is more complex: | |
| 23 | |
| 24 * If an AMR encoder operates with DTX disabled, such that the output contains | |
| 25 only speech frames and no SID, and the mode is forced to 12k2, then indeed a | |
| 26 simple reshuffling of bits will produce speech frames that can be fed to an | |
| 27 EFR decoder on the other end. Note that the two encoders (EFR and AMR 12k2) | |
| 28 will produce *different* encoded speech parameters from the same input, and | |
| 29 the decoded speech output on the other end will also be different, but the | |
| 30 two versions are expected to be equally good for human ears. | |
| 31 | |
| 32 * In the other direction, if an EFR input stream contains only good speech | |
| 33 frames (no SID and no lost, FACCH-stolen or DTX-suppressed frames), one can | |
| 34 likewise do a simple bit reordering and feed these frames to an AMR decoder. | |
| 35 The output of this AMR decoder will once again be different from a proper | |
| 36 (bit-exact) EFR decoder for the same speech parameter inputs, but as long as | |
| 37 the EFR input stream is all good speech frames, the output will be good enough | |
| 38 for human ears. | |
| 39 | |
| 40 * The real problem occurs when the EFR input stream contains SID frames and BFI | |
| 41 frame gaps, as will always happen in reality if this stream is an uplink from | |
| 42 a GSM call. The AMR SID mechanism is different from that of EFR, and an AMR | |
| 43 decoder will NOT recognize EFR SID frames. A quick experiment confirms that | |
| 44 when a real GSM EFR uplink RTP capture is converted to AMR by non-SID-aware | |
| 45 bit reshuffling and then fed to amrnb-dec from opencore-amrnb, unpleasant | |
| 46 sounds appear in the output whenever GSM uplink goes into SID. | |
| 47 | |
| 48 EFR reference code from ETSI | |
| 49 ============================ | |
| 50 | |
| 51 A published-source bit-exact implementation of GSM EFR encoder and decoder, | |
| 52 complete with all beyond-speech functions of DTX, VAD, comfort noise generation, | |
| 53 error concealment etc., does exist in the form of reference code from ETSI. | |
| 54 However, this code has never been turned into a usable codec library by anyone | |
| 55 prior to us (at least not by anyone who freely published their work), and doing | |
| 56 such librification (producing an EFR analogue to what Android OpenCORE people | |
| 57 did with AMR) is no easy feat! The original EFR code from ETSI exhibits two | |
| 58 problems which need to be remedied in the librification project: | |
| 59 | |
| 60 1) The original code maintains all codec state in global variables (lots of | |
| 61 them) that are scattered throughout. The 3GPP reference code for AMR, which | |
| 62 came later than EFR, is better in this regard: there the global vars were | |
| 63 gathered into structs that are passed by pointer, although as many | |
| 64 separately-malloc'ed structs rather than a single unified encoder state and | |
| 65 decoder state. However, we need the EFR version for correct handling of all | |
| 66 beyond-speech aspects, and this version is all | |
| 67 global vars. | |
| 68 | |
| 69 2) These reference codes from ETSI/3GPP (both EFR and AMR versions, it seems) | |
| 70 were intended to serve as simulations, not as production code, and the code | |
| 71 is very inefficient. | |
| 72 | |
| 73 Themyscira libgsmefr | |
| 74 ==================== | |
| 75 | |
| 76 Libgsmefr presented in this code repository is our current solution for EFR. | |
| 77 It is a library styled after classic libgsm for FR1, but its guts consist of a | |
| 78 librified derivative of ETSI EFR code. The problem of global vars has been | |
| 79 solved in this library version - they've been gathered into one unified struct | |
| 80 for encoder state and another unified struct for decoder state - but the problem | |
| 81 of poor performance (significantly worse than opencore-amrnb) still remains for | |
| 82 now. |
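
The gathering of global vars into one unified state struct, as described above, can be illustrated with a minimal C sketch. Everything in it is hypothetical: the struct, its fields and the toy_* function names are invented for illustration and are not the actual libgsmefr API or the real EFR state variables. The "processing" step is a trivial stand-in, not codec logic.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Reference-code style (shown only as a comment): state lives in
 * file-scope globals, so a process can hold only one codec instance.
 *
 *     static int16_t old_speech[4];
 *     static int16_t dtx_counter;
 *
 * Librified style: the same variables are gathered into one struct,
 * and every function takes a pointer to it.  All names below are
 * hypothetical and exist only for this illustration.
 */
struct toy_encoder_state {
    int16_t old_speech[4];  /* stand-in for signal history buffers */
    int16_t dtx_counter;    /* stand-in for DTX/VAD bookkeeping */
};

/* Allocate one zero-initialized encoder instance; NULL on failure. */
struct toy_encoder_state *toy_encoder_create(void)
{
    return calloc(1, sizeof(struct toy_encoder_state));
}

/*
 * Toy "frame processing": remembers the 4 input samples and counts
 * frames.  Returns the number of frames this instance has seen.
 */
int toy_encoder_process(struct toy_encoder_state *st, const int16_t *pcm)
{
    assert(st != NULL && pcm != NULL);
    memcpy(st->old_speech, pcm, sizeof st->old_speech);
    st->dtx_counter++;
    return st->dtx_counter;
}
```

With this layout a transcoder handling many calls at once can create one state per call with toy_encoder_create() and release it with free(); the instances do not interfere with each other, which is not possible when the state sits in global variables.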
