changeset 108:e26623146358 default tip

new article DSP-speech-decoder
author Mychaela Falconia <falcon@freecalypso.org>
date Tue, 29 Oct 2024 22:11:41 +0000
parents dfa5f99631a6
children
files DSP-speech-decoder
diffstat 1 files changed, 101 insertions(+), 0 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/DSP-speech-decoder	Tue Oct 29 22:11:41 2024 +0000
@@ -0,0 +1,101 @@
+As described in the Calypso-digital-voice article, we can use Calypso MCSI
+(an auxiliary hardware interface brought out on FreeCalypso dev boards) in the
+so-called "Bluetooth headset" mode to capture the voice call downlink in
+digital form, specifically 16-bit linear PCM, after speech decoding performed
+by the DSP, but before D-to-A conversion for analog audio output.  As of
+2024-10, we have a set of tools (Lattice Icestick FPGA board, a gateware design
+for this FPGA, and corresponding host software tools - see the fc-pcm-if Hg
+repository) that put these ideas into practice; using these tools, we can now
+answer some long-standing questions about exactly how Calypso DSP implements
+speech decoding for various GSM codecs.
+
+FRv1 decoder implementation
+===========================
+
+The original set of Full Rate codec specs did not include the feature of in-band
+encoder and decoder homing frames; this feature was included in HRv1 and EFR
+from day 1, and was also back-ported to FRv1 in later versions of the 06.10
+spec - but it was not present originally, and it is defined as optional in the
+later spec versions.  Going into these Calypso MCSI experiments, we had this
+question: does Calypso DSP implement decoder homing frames for FRv1, or does it
+not?  Experimental observation: in-band decoder homing is NOT implemented for
+FRv1 in this DSP.  If we feed the DHF defined in later versions of the 06.10
+spec to our Calypso MS via TCH DL, it gets treated as a regular speech frame
+without any special handling; more specifically, we don't see PCM16 output
+frames consisting of 0x0008 words as required by the spec when the optional
+in-band decoder homing feature is present.
+
+The lack of an in-band homing feature makes it difficult to test the decoder
+against official ETSI test sequences, as we have no way of getting it into a
+defined state.  The ancient GSM 11.10 test spec provides a mechanism called DAI
+that includes the necessary reset signal, and it appears that Calypso DSP does
+support operating MCSI in DAI mode rather than "Bluetooth" mode - but we
+haven't done any experiments with this DAI mode yet.  Hence our exploration of
+the FRv1 decoder implementation in our dear Calypso DSP ends here for now.
+
+EFR decoder implementation
+==========================
+
+Unlike the situation with FRv1, the in-band codec homing feature is mandatory
+for EFR per the specs.  And with Calypso DSP we got good news: it does implement
+this feature at least in the decoding direction, which is the only one we've
+tested so far.  Exactly per the spec, the first DHF arriving from TCH DL is
+decoded like a regular speech frame, but the decoder state is reset at the end.
+All subsequent DHFs turn into 0x0008 output, and we indeed see the latter coming
+out on MCSI.
+
+Once we established that the decoder homing feature works as specified, our
+next question was: does our dear Calypso DSP implement the original bit-exact
+definition of EFR, or does it implement what we call the AMR-EFR hybrid?
+Experimental observation: at least in the decoding direction, it implements the
+original bit-exact EFR decoder!  This finding is remarkable because all of our
+experiments in this series were performed on DSP ROM version 3606, the final
+one that not only includes AMR support, but has been subject to whatever rework
+TI did between 3416 and 3606 ROMs.  If we assume (based on some experiments in
+past years where we disabled loading of TI's official DSP patches) that AMR
+support is already present in the DSP ROM, as opposed to being contained
+entirely in the patches, it follows that the DSP ROM must contain both EFR and
+AMR codec implementations, yet the EFR implementation has not been AMR-ized.
+(In contrast, the network-side speech transcoder implementation used by T-Mobile
+USA and Telcel Mexico appears to use maximally-shared code between EFR and AMR,
+resulting in the AMR-EFR hybrid that differs from the original bit-exact
+definition, but has been blessed by 3GPP as an acceptable alternative.)
+
+Strange corruption bug
+----------------------
+
+As we fed ETSI test sequences from GSM 06.54 to our Calypso DSP under test via
+TCH DL, checking for an exact match, we observed what appears to be a corruption
+bug of some kind (corrupting either codec parameter bits or state variables)
+that manifests in a manner that seems random.  In our test, we concatenated all
+21 test sequences from GSM 06.54 (test0.cod through test20.cod) and fed the
+resulting super-long sequence into TCH DL.  The test was performed twice; on
+the second try we had 'tch record' running in fc-shell, capturing TCH DL as
+received by the Calypso MS, and we confirmed that the entire super-long test
+sequence was received without errors.  The decoded output (captured via MCSI)
+was then split into 21 separate robe files following frame counts: because each
+test*.cod input sequence begins with two DHFs, the decoder resets between these
+concatenated test sequences, allowing the output to be compared against
+test*.out reference files with per-sequence granularity.  The final diff
+comparison between split-out robe files and the official reference decoder
+output revealed the following oddities:
+
+* In the first run, decoded outputs for test0, test6, test11, test14, test15,
+  test16 and test20 showed a mismatch.  The difference in test0 outputs was
+  examined more closely: the diff begins on a subframe boundary and continues
+  until the next DHF.  All other test sequences got a perfect match to ETSI
+  reference decoder output.
+
+* In the second run, a different set of sequences showed a mismatch: this time
+  the failing sequences were test1, test5, test6, test7, test15 and test16.
+  The other sequences that produced mismatches in the first run produced
+  perfectly good output (matching ETSI reference decoder) in this run!
+
+* test6, test15 and test16 were the three sequences that failed in both runs.
+  However, diffing between the two runs shows that the errors in each of these
+  3 twice-failed sequences are different between runs, further strengthening
+  the appearance of corruption that seems totally random.
+
+We do not currently have any better explanation for this oddity than a data
+corruption bug somewhere in the DSP.  For the record, these tests were performed
+on an FCDEV3B with Calypso DSP ROM version 3606 and DSP patch version 6840.
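
The split-and-compare procedure described in the article (cutting the MCSI
capture into per-sequence pieces by frame count, then locating where each piece
first departs from the reference output) can be sketched roughly as follows.
This is only an illustration, not the actual tooling used for the experiments:
the function names are hypothetical, and the sketch assumes the capture and the
reference data are both raw 16-bit PCM in the same byte order, with 160 samples
per 20 ms frame and 4 subframes of 40 samples each for EFR.

```python
SAMPLES_PER_FRAME = 160     # one 20 ms speech frame at 8 kHz
SAMPLES_PER_SUBFRAME = 40   # assumption: EFR frame = 4 subframes of 5 ms
BYTES_PER_SAMPLE = 2        # 16-bit linear PCM


def split_by_frame_counts(pcm, frame_counts):
    """Cut one long PCM capture into consecutive per-sequence chunks.

    frame_counts lists the number of speech frames in each concatenated
    test sequence (hypothetical input, derived from the test*.cod files).
    """
    chunks = []
    offset = 0
    for count in frame_counts:
        size = count * SAMPLES_PER_FRAME * BYTES_PER_SAMPLE
        chunk = pcm[offset:offset + size]
        if len(chunk) != size:
            raise ValueError("capture is shorter than the frame counts imply")
        chunks.append(chunk)
        offset += size
    return chunks


def _locate(sample_index):
    """Convert a sample index into (frame, subframe, offset in subframe)."""
    frame, pos = divmod(sample_index, SAMPLES_PER_FRAME)
    subframe, offset = divmod(pos, SAMPLES_PER_SUBFRAME)
    return frame, subframe, offset


def first_mismatch(actual, reference):
    """Return the position of the first differing sample, or None if equal."""
    common = min(len(actual), len(reference)) // BYTES_PER_SAMPLE
    for i in range(common):
        lo = i * BYTES_PER_SAMPLE
        hi = lo + BYTES_PER_SAMPLE
        if actual[lo:hi] != reference[lo:hi]:
            return _locate(i)
    if len(actual) != len(reference):
        # One input is a prefix of the other: the difference starts
        # where the shorter one ends.
        return _locate(common)
    return None
```

A result of None means the chunk matches its reference exactly; a result whose
last element is 0 means the difference begins on a subframe boundary, which is
the pattern reported for test0 in the first run.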