comparison doc/TFO-xform/EFR @ 37:4ab7cc414ed2
doc/TFO-xform/EFR: document CN insertion
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Tue, 03 Sep 2024 21:20:47 +0000 |
parents | d9553c7ac6ea |
children |
comparison
36:d9553c7ac6ea | 37:4ab7cc414ed2 |
---|---|
42 * The 5 LPC parameters are different in each generated substitution/muting | 42 * The 5 LPC parameters are different in each generated substitution/muting |
43 frame, hence it looks like the TFO transform is running the quantization | 43 frame, hence it looks like the TFO transform is running the quantization |
44 algorithm for each output frame to produce LPC parameters that aim for the | 44 algorithm for each output frame to produce LPC parameters that aim for the |
45 substitution/muting LSFs of the official "example solution". | 45 substitution/muting LSFs of the official "example solution". |
46 | 46 |
47 If the series of BFI inputs continues for a while, the emitted LPC parameters | |
48 settle into an oscillating pattern that alternates between two sets of | |
49 numbers. | |
50 | |
47 * LTP lag parameters remain constant for each run of BFIs between good speech | 51 * LTP lag parameters remain constant for each run of BFIs between good speech |
48 frames; the lag value encoded therein matches the LTP lag (integer part only) | 52 frames; the lag value encoded therein matches the LTP lag (integer part only) |
49 from the 4th subframe of the last good speech frame, just like in the official | 53 from the 4th subframe of the last good speech frame, just like in the official |
50 endpoint decoder. | 54 endpoint decoder. |
51 | 55 |
64 in a row, and they also differ between subframes in the same frame - hence | 68 in a row, and they also differ between subframes in the same frame - hence |
65 these parameters are clearly being regenerated as output progresses. However, | 69 these parameters are clearly being regenerated as output progresses. However, |
66 the quantization algorithm for this parameter is so complex that I haven't | 70 the quantization algorithm for this parameter is so complex that I haven't |
67 been able to make a more intelligent analysis yet. | 71 been able to make a more intelligent analysis yet. |
68 | 72 |
73 If the series of BFI inputs continues for a while, the emitted fixed codebook | |
74 gain parameters slowly go down and eventually become all zeros - although the | |
75 exact meaning is still unclear given the highly non-intuitive quantization | |
76 algorithm. | |
77 | |
69 Looking at the first good speech frame that follows each BFI substitution/muting | 78 Looking at the first good speech frame that follows each BFI substitution/muting |
70 insert, we see that it is mostly unaltered: no alterations were seen to LPC or | 79 insert, we see that it is mostly unaltered: no alterations were seen to LPC or |
71 LTP parameters, in particular. However, in the case of the fixed codebook gain | 80 LTP parameters, in particular. However, in the case of the fixed codebook gain |
72 parameter we see a different behavioral pattern: most of the time it is also | 81 parameter we see a different behavioral pattern: most of the time it is also |
73 unaltered, but sometimes we see reduction in this parameter, and even then it | 82 unaltered, but sometimes we see reduction in this parameter, and even then it |
74 is only in certain subframes. Are we perhaps seeing a capping of the fixed | 83 is only in certain subframes. Are we perhaps seeing a capping of the fixed |
75 codebook gain in the first good frame following BFI, similar to that implemented | 84 codebook gain in the first good frame following BFI, similar to that implemented |
76 in the reference endpoint decoder? A better understanding of the quantization | 85 in the reference endpoint decoder? A better understanding of the quantization |
77 mechanism for this parameter will be needed. | 86 mechanism for this parameter will be needed. |
87 | |
88 CN insertion by TFO transform | |
89 ============================= | |
90 | |
91 Looking at the DL speech frames that were synthesized by the TRAU in those | |
92 frame positions where the incoming UL stream via TFO had DTXu pauses (valid SID | |
93 frames followed by BFIs), we can make the following observations: | |
94 | |
95 * The 5 LPC parameters appear to be generated anew on each output frame just | |
96 like in the substitution/muting case, and it likewise appears that the TFO | |
97 transform is running the regular LSF quantization algorithm taken from the | |
98 encoder. | |
99 | |
100 * The 4 LTP lag parameters are set to {135, 33, 135, 33} in each generated CN | |
101 frame, in agreement with how the official endpoint decoder sets the pitch | |
102 delay to constant value 40. | |
103 | |
104 * The 4 LTP gain parameters are all set to 0, also in agreement with CN | |
105 generation in the official endpoint decoder. | |
106 | |
107 * The 35-bit fixed codebook part of each subframe appears to be set to a | |
108 pseudorandom sequence, different in each emitted frame and subframe. My | |
109 analysis tells me it should be possible to construct fixed codebook sequences | |
110 in "speech" output frames that would produce the same excitation as the | |
111 official bit-exact CN - although the final PCM output probably won't match | |
112 the official bit-exact CN because of LSF and fixed codebook gain | |
113 requantization. However, we won't know whether or not the output from | |
114 Nokia's TFO transform matches our idea of official-CN-matching fixed codebook | |
115 excitation until we have our own implementation of this idea and compare | |
116 the two. | |
117 | |
118 * The four fixed codebook gain parameters in the emitted CN frames are once | |
119 again too difficult to understand for now - but they are definitely being | |
120 recomputed anew for each emitted CN frame and subframe. | |
121 | |
122 If CN muting kicks in on the second lost SID (BFI instead of SID received in | |
123 TAF position), we see the following additional behaviour: | |
124 | |
125 * On the TAF-position frame that initiates CN muting, the emitted LPC parameters | |
126 break out of the alternating pattern they previously settled into. They go | |
127 through a few unique number sets, then settle into a two-state oscillating | |
128 pattern once again. Is the TFO transform perhaps making a switch from | |
129 last-SID LSF numbers to the static "mean" ones when it goes into CN muting? | |
130 | |
131 * The emitted fixed codebook gain parameters start going down and eventually | |
132 become all zeros. | |
133 | |
134 Looking at the first good speech frame that follows each CN insertion period, | |
135 we see only two alterations made by the TFO transform: the 5 LPC parameters and | |
136 the first subframe fixed codebook gain parameter are modified, presumably to | |
137 compensate for the lack of quantizer state reset that happens when the end | |
138 decoder has seen a CN insert. No more speech parameter alterations are seen | |
139 past the first subframe of the first frame following the DTXu pause. |
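
The LTP lag observation in the CN section ({135, 33, 135, 33} corresponding to a constant pitch delay of 40) can be cross-checked with a few lines of arithmetic. The sketch below is written in Python for convenience and is not taken from the repository. It assumes the EFR pitch lag index formulas as I recall them from the GSM 06.60 reference encoder: an absolute 9-bit index in subframes 1 and 3, and a 6-bit index relative to a search window starting 5 below the previous integer lag in subframes 2 and 4. All function names are hypothetical, and the formulas should be verified against the reference source before relying on them.

```python
# Assumed EFR pitch lag limits (integer lag range), per GSM 06.60 as recalled.
PIT_MIN = 18
PIT_MAX = 143


def encode_lag_abs(t0, frac=0):
    """Absolute lag index for subframes 1 and 3 (assumed formula).

    Lags up to 94 carry a 1/6-resolution fraction; higher lags are
    integer-only, so frac is expected to be 0 in that range.
    """
    if t0 <= 94:
        return t0 * 6 - 105 + frac
    return t0 + 368


def search_window_min(prev_t0):
    """Lower bound of the relative search window for subframes 2 and 4."""
    t0_min = max(prev_t0 - 5, PIT_MIN)
    if t0_min + 9 > PIT_MAX:
        t0_min = PIT_MAX - 9
    return t0_min


def encode_lag_rel(t0, t0_min, frac=0):
    """Relative lag index for subframes 2 and 4 (assumed formula)."""
    return (t0 - t0_min) * 6 + 3 + frac


def cn_lag_params(pitch=40):
    """Four LTP lag parameters for a frame with one constant integer lag."""
    absolute = encode_lag_abs(pitch)
    relative = encode_lag_rel(pitch, search_window_min(pitch))
    return [absolute, relative, absolute, relative]


print(cn_lag_params())
```

Under these assumptions a constant integer lag of 40 with zero fraction encodes to the same four values seen in the TFO transform output, which is consistent with the interpretation given above.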