comparison TCH-tap-modes @ 95:8a45cd92e3c3

TCH-tap-modes: new article
author Mychaela Falconia <falcon@freecalypso.org>
date Mon, 19 Dec 2022 02:02:28 +0000
1 It has been discovered that the DSP ROM in the Calypso GSM baseband processor
2 makes it possible to "tap" into speech traffic on GSM traffic channels (TCH):
3
4 1) In the downlink direction, the signal processing chain which every GSM MS
5 must implement includes a GSM 05.03 channel decoder, operating in one of
6 several variants as necessary for each supported TCH mode, followed by speech
7 decoders for each supported codec. TI's DSP naturally implements this
8 required signal processing chain, and this implementation includes one nifty
9 feature: the bits that make up the internal interface from GSM 05.03 channel
10 decoder output to the input of speech decoders are written into the NDB API
11 RAM page that is also accessible to the ARM core, and these bits can be
12 externally read out. The act of reading these bits is completely
13 non-invasive (we are only reading bits that are already there, not modifying
14 anything), thus we can sniff TCH downlink on any voice call in real time
15 without disrupting or impacting standard type-approved GSM MS operation in
16 any way.
17
18 2) In the uplink direction, there is a reverse signal processing chain in which
19 the output of the internal speech encoder for the selected codec feeds into
20 the input of the corresponding GSM 05.03 channel encoder. In this direction
21 there are two tapping possibilities:
22
23 2a) There is a buffer in the NDB API RAM page from which one can read the bits
24 that pass from the speech encoder output to the channel encoder input -
25 let's call this form of TCH tap "uplink sniffing";
26
27 2b) There is a special mode in which the output of the internal speech encoder
28 is effectively suppressed and the input to the channel encoder comes from
29 another NDB API RAM buffer that needs to be filled by ARM firmware - let's
30 call this form of TCH tap "uplink substitution".
31
32 Sources of knowledge about these DSP functions
33 ==============================================
34
35 For the functions of TCH DL sniffing (tap 1 in the above summary) and TCH UL
36 substitution (tap 2b in the above summary), the primary source of knowledge is
37 the defunct '#if TRACE_TYPE==3' code in TSM30 and LoCosto L1 sources. I call
38 this code defunct because the TRACE_TYPE preprocessor symbol is set to 4 (not 3)
39 in both TCS211 and LoCosto versions, and appears to be set to 0 (all trace
40 disabled) in the ancient TSM30 build. This code appears to be some very old
41 test mode, apparently sending some test bit patterns into TCH UL and expecting
42 the same bit patterns back on TCH DL, presumably with a test instrument such as
43 CMU200 providing a loopback from UL to DL on this test TCH, and has only
44 survived in an incomplete form:
45
46 * There are '#if TRACE_TYPE==3' stanzas in l1_cmplx.c, in both TSM30 and LoCosto
47 versions, that implement DSP buffer writing for TCH UL substitution (TCH/F
48 only) and timing control for TCH DL buffer reading (both TCH/F and TCH/H),
49 calling a function named play_trace() for the latter.
50
51 * There is no play_trace() code in the LoCosto source, but there is an
52 hw_debug.c source module in the TSM30 code drop under MCU/Layer1/L1c/Src,
53 and it contains (presumed) TI-legacy play_trace() and play_diagnostics()
54 functions, once again under '#if (TRACE_TYPE==3)'. play_trace() reads the
55 DSP's TCH DL buffer and saves the bits in an ARM firmware RAM buffer, and
56 then play_diagnostics() analyzes the captured booty - and studying the second
57 function is how we learn the apparent original intent of doing test bit
58 patterns on TCH.
59
60 * The code that feeds "UL play" test bit patterns to the earlier-mentioned
61 '#if TRACE_TYPE==3' TCH UL substitution code in l1_cmplx.c (apparently once
62 hacked into dll_read_dcch() and tx_tch_data()) has not been found anywhere.
63
64 For TCH tap 2a in our summary at the beginning of this article (non-invasive
65 sniffing of TCH UL bits produced by the internal speech encoder) there does not
66 exist any authoritative source of knowledge. It naturally follows from
67 otherwise-known Calypso DSP architecture that these internally produced TCH UL
68 bits should reside in the "main" a_du_0 buffer (or in a_du_1 when TCH/H
69 subchannel 1 is active), and I (Mother Mychaela) have heard an anecdotal report
70 (from someone who once worked with Calypso in a non-community-based manner) that
71 these UL bits could indeed be read out of this buffer - but in the absence of
72 an authoritative source, we don't know when the correct time to read this
73 buffer would be.
74
75 In our current state of knowledge, only TCH DL sniffing can be exercised safely:
76 for UL sniffing we don't know the correct time when the buffer would need to be
77 read, while active UL substitution is obviously an invasive hack involving a DSP
78 debug or test feature that is never used in standard GSM MS operation.
79
80 Support for different speech codecs
81 ===================================
82
83 When it comes to passively sniffing TCH DL and/or UL, we are merely reading bits
84 that are already there, and basic reasoning tells us that the DSP's DL and UL
85 buffers involved in this exercise exist in all speech TCH modes supported by
86 the DSP: FR1, HR1, EFR and AMR. However:
87
88 * The ancient '#if TRACE_TYPE==3' reference code exists only for FR1, HR1 and
89 EFR - it clearly predates the addition of AMR in the later Calypso DSP
90 versions.
91
92 * FR1, HR1 and EFR are the only codecs for which we (FreeCalypso community) know
93 the format in which TCH DL bits appear in the DSP's a_dd_0 and a_dd_1 buffers.
94
95 * I (Mother Mychaela) have heard an anecdotal report (from the same
96 non-community-based party mentioned earlier) that TCH DL bits could be read
97 out of a_dd_0 buffer in TCH/AFS (AMR) mode - but I never got any details.
98
99 In contrast with passive sniffing, active TCH UL substitution requires explicit
100 support from the DSP - and this explicit DSP support is known for certain to
101 exist only for TCH/FS and TCH/EFS channel modes, i.e., for FR1 and EFR codecs
102 only. In the case of TCH/HS channel mode (HR1 codec), it *appears* that the DSP
103 supports UL substitution in this mode too, but this combination has only been
104 exercised by OsmocomBB people (the original '#if TRACE_TYPE==3' code for UL play
105 only supports TCH/F), and FreeCalypso policy is to treat everything coming out
106 of OBB as highly suspect.
107
108 What about AMR? The anecdotal report (from the same already-mentioned party) is
109 that TCH UL substitution that works for FR1 and EFR appears to NOT work for AMR
110 - that's all I know - but frankly speaking, given that it's a weird DSP debug
111 mode that is never needed in standard GSM MS operation, I find it more
112 surprising that it works for FR1 and EFR than the observation that it doesn't
113 work for AMR.
114
115 FreeCalypso support for TCH tap functions
116 =========================================
117
118 TCH DL sniffing and UL substitution provisions were initially implemented in
119 FreeCalypso back in 2016, but only in the Citrine version, which was deemed to
120 be a dead end later that same year. However, this functionality is now being
121 resurrected, and it has been incorporated into our production FC Tourmaline
122 firmware as of 2022-12-13.
123
124 In order to activate the function of TCH DL sniffing and save the recording of a
125 TCH DL session into a file, one needs to use the fc-shell utility from FC host
126 tools, specifically the tch record command in an interactive fc-shell session.
127 The format in which TCH DL tap traffic is passed over RVTMUX (an original
128 FreeCalypso invention) has changed in a slight but incompatible way between the
129 original hackish version from 2016 and the new production version as of 2022,
130 and capturing TCH DL with new firmware requires the updated version of fc-shell
131 that will be released as part of fc-host-tools-r18. The current (late 2022)
132 incarnation of FreeCalypso TCH DL sniffing feature supports FR1, HR1 and EFR
133 codecs, although only FR1 and EFR have been tested so far.
134
135 The function of TCH UL substitution is currently implemented in FC Tourmaline
136 only for FR1 and EFR (no HR1, no AMR), and it likewise requires running an
137 interactive fc-shell session in which you would invoke the tool's tch play
138 command. In the case of TCH UL play feature there has been NO change in the
139 RVTMUX transport format between 2016 and 2022 versions.
140
141 TCH DL DSP buffers and capture format
142 =====================================
143
144 The DSP's NDB API page has two buffers in which TCH DL bits appear: a_dd_0 and
145 a_dd_1. All TCH/F modes use a_dd_0, but TCH/H uses one buffer or the other
146 depending on the subchannel: subchannel 0 uses a_dd_0, subchannel 1 uses a_dd_1.
147 (It is certainly a strange design - the DSP won't be able to receive and decode
148 the "wrong" subchannel because it doesn't know the ciphering key for the other
149 MS - but perhaps the designers of this DSP architecture aeons ago found this
150 design to somehow flow more naturally with their scheduling of DSP tasks.) Each
151 buffer consists of 22 16-bit words - the buffers were originally 20 words each,
152 but were then extended to 22 words to support CSD 14.4 kbps mode.
153
154 Each TCH buffer in the DSP's NDB API page consists of 3 status or header words
155 followed by N words of payload, where N depends on TCH mode: 17 for TCH/FS and
156 TCH/EFS, 8 for TCH/HS, and not-yet-studied for AMR and CSD. Let's begin our
157 analysis with the 3 status words that make up the buffer header:
158
159 Status word 0 (a_dd_0[0] or a_dd_1[0]) is a word of flag bits. We don't know
160 the meaning of every bit in this word, but at least for TCH/FS and TCH/EFS (we
161 haven't exercised TCH/HS at all) we know the following bits:
162
163 * Bit 15 (B_BLUD) is a "buffer filled" or "data present" flag. This flag is
164 observed as 1 in *almost* every 20 ms window in which a traffic frame is
165 expected (fn_report_mod13_mod4 == 0 in l1s_read_dedic_dl(), case TCHTF),
166 except for certain instances early in the call setup process which remain to
167 be studied.
168
169 * Bit 14 (B_AF) will be set if the block of 8 half-bursts (block diagonal
170 interleaving of GSM 05.03) corresponding to this buffer was channel-decoded
171 as speech rather than as FACCH - see further analysis below.
172
173 * Bit 9 (B_ECRC) has only ever been observed as 1 when B_AF is set, i.e., when
174 the speech-not-FACCH channel decoder was invoked. In the case of TCH/EFS this
175 bit is set to 1 if the EFR-added CRC-8 was bad, and cleared if this CRC-8 was
176 good; in the case of TCH/FS this bit has always been observed as 1 and should
177 be ignored because there is no CRC-8 in TCH/FS.
178
179 * Bit 7 has always been observed as 1 whenever B_BLUD is set but B_AF is
180 cleared, i.e., whenever the block was channel-decoded in FACCH rather than
181 speech mode.
182
183 * Bits 6:5 indicate the result of FIRE decoding in the event that the FACCH
184 decoder was invoked.
185
186 * Bits 4:3 carry the ternary SID flag encoded as in section 6.1.1 of GSM 06.31
187 and 06.81, but only when the speech-not-FACCH channel decoder was invoked as
188 indicated by B_AF.
189
190 * Bit 2 is BFI as defined in section 6.1.1 of GSM 06.31 and 06.81. Whenever
191 the block was decoded as FACCH (bit 14 clear, bit 7 set), bit 2 has always
192 been observed as set, agreeing with the stipulation in GSM 06.31 and 06.81
193 that BFI=1 whenever a FACCH frame has been received. However, in the case of
194 TCH/EFS it appears that CRC-8 status (reported in bit 9) is NOT factored into
195 the logic that sets bit 2 - it appears that the subsequent speech decoding
196 logic is expected to OR bits 2 and 9 together to get the BFI flag for the Rx
197 DTX handler of GSM 06.81.
198
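The flag bits listed above can be pulled apart with a few masks. The following
Python sketch is meant only for host-side analysis of captured status words; it
is not code from any TI or FreeCalypso source. The names TchDlStatus,
decode_status0 and effective_bfi are made up for this illustration, and the
OR-ing of bits 2 and 9 for EFR is the article's conjecture, not a confirmed
description of the DSP.

```python
from typing import NamedTuple


class TchDlStatus(NamedTuple):
    """Known fields of status word 0 (hypothetical container for this sketch)."""
    blud: bool   # bit 15 (B_BLUD): buffer filled / data present
    af: bool     # bit 14 (B_AF): block was channel-decoded as speech, not FACCH
    ecrc: bool   # bit 9 (B_ECRC): EFR CRC-8 was bad; to be ignored for TCH/FS
    facch: bool  # bit 7: observed set when the block was decoded as FACCH
    fire: int    # bits 6:5: FIRE decoding result (value meanings not covered here)
    sid: int     # bits 4:3: ternary SID flag of GSM 06.31 / 06.81
    bfi: bool    # bit 2: BFI, apparently without the EFR CRC-8 factored in


def decode_status0(word: int) -> TchDlStatus:
    """Split a 16-bit status word 0 into the fields described in the text."""
    return TchDlStatus(
        blud=bool(word & 0x8000),
        af=bool(word & 0x4000),
        ecrc=bool(word & 0x0200),
        facch=bool(word & 0x0080),
        fire=(word >> 5) & 3,
        sid=(word >> 3) & 3,
        bfi=bool(word & 0x0004),
    )


def effective_bfi(word: int, efr: bool) -> bool:
    """BFI for a software Rx DTX handler, ASSUMING bits 2 and 9 are ORed for EFR.

    For TCH/FS bit 9 carries no information and is ignored.
    """
    st = decode_status0(word)
    return st.bfi or (efr and st.ecrc)
```

For example, decode_status0(0x8084) reports a filled buffer that was decoded as
FACCH with BFI set, matching the FACCH case described above.
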
199 In the case of 20 ms blocks (reassembled from 8 half-bursts) that were channel-
200 decoded as speech rather than FACCH, the observed behavior is that bits 15 and
201 14 are set, the payload portion of the buffer is filled with the output from the
202 channel decoder, and bits 4:3 are set from this payload by the bit-counting rule
203 of section 6.1.1 of GSM 06.31 and 06.81 irrespective of the good-or-bad status
204 in bits 2 and 9. However, when bit 14 is clear and bit 7 is set, indicating
205 that the block (from 8 half-bursts) was channel-decoded in FACCH mode, the
206 following additional behavior is observed:
207
208 * The payload portion of the buffer remains unchanged from its previous content,
209 last written when a frame was channel-decoded in speech-not-FACCH mode;
210
211 * Bit 2 is set, bit 9 is cleared;
212
213 * Bits 4:3 are cleared even when they previously indicated SID based on the bit
214 pattern in the payload portion of the buffer, even when that SID-encoding
215 payload is still there.
216
217 In the standard TCH DL signal processing chain, GSM 05.03 channel decoding is
218 followed by the Rx DTX handler of GSM 06.31 or 06.81 for TCH/FS or TCH/EFS,
219 respectively. It appears that the Rx DTX handler implemented in TI's DSP is
220 driven by this status word 0 at the head of the buffer, and we can only guess
221 as to its exact logic. At this point it bears reminding that the functions of
222 the Rx DTX handler are not rigidly prescribed in the specs: in the case of EFR
223 the bit-exact reference implementation is normative only in certain aspects
224 (e.g., comfort noise generation after receiving SID), but is considered a non-
225 normative example in some other key aspects (all GSM 06.61 functions, including
226 what happens when a FACCH block was received when speech frames were expected),
227 and in the case of FR1 there is no bit-exact reference implementation at all,
228 only general guidance.
229
230 Having the curiosity of a cat, I (Mother Mychaela) naturally desire to know
231 exactly how the Rx DTX handler (the bridge between the channel decoder and the
232 speech decoder) works in TI's DSP. A full static reversing job on the DSP ROM
233 would provide complete answers, but is a very daunting proposition, thus I am
234 also looking at the idea of behavioral analysis: the output of the speech
235 decoder can be captured from MCSI on FCDEV3B hardware, or from the VSP tap on
236 FC Venus if we ever build that board, and if we combine that speech decoder
237 output capture with the currently-discussed capture of TCH DL buffers, we may
238 be able to glean some insight into the workings of the Rx DTX handler block: we
239 could implement a candidate Rx DTX handler clone in software and compare the
240 output (of this proposed handler followed by the spec-defined speech decoder)
241 against the actual speech output from the DSP.
242
243 Back to our exposition of TCH DL buffer content:
244
245 Status word 1 (a_dd_0[1] or a_dd_1[1]) is some kind of DSP measurement or count
246 which Calypso ARM fw does not need to look at, except when debugging - the only
247 code which I (Mother Mychaela) could find that does anything with this DSP
248 status word is the ancient play_diagnostics() code in the TSM30 version
249 (obviously never included in any production fw); this code looks at the unknown
250 word in question and calls it "D_MACC". This play_diagnostics() code compares
251 the D_MACC reading against a threshold, and if the per-block reading is below
252 the threshold, an error message is printed. That's all we know!
253
254 Status word 2 (a_dd_0[2] or a_dd_1[2]) is a bit error count: the code in
255 l1s_read_dedic_dl() reads this error count and uses it for RXQUAL computation
256 for measurement reports.
257
258 If one's area of interest is in replicating Rx DTX handling and speech decoding
259 that happens in the DSP, status words 1 and 2 can probably be ignored - instead
260 the important parts are status word 0 (extensively covered above) and the
261 payload portion of the buffer.
262
263 The payload portion of the buffer consists of some number of 16-bit words: 17
264 of them for TCH/FS and TCH/EFS, or 8 of them for TCH/HS. The DSP does not have
265 any notion of 8-bit bytes, instead it operates on 16-bit words as its elementary
266 data unit. The ordering of bits within these 16-bit words (in the payload
267 portion of TCH buffers) is from the most-significant bit toward the least-
268 significant bit, thus when these TCH buffers are transferred via octet-oriented
269 interfaces, the upper byte of each word should be transferred first, even though
270 this byte order is counter to the little-endian byte order of the Calypso ARM
271 core.
272
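The byte ordering rule just described amounts to big-endian serialization of
each 16-bit word. A minimal Python sketch, with a made-up function name, for
anyone handling such buffers on the host side:

```python
def words_to_octets(words):
    """Serialize 16-bit DSP words into octets, upper byte of each word first."""
    out = bytearray()
    for w in words:
        out.append((w >> 8) & 0xFF)  # upper byte goes first
        out.append(w & 0xFF)         # lower byte goes second
    return bytes(out)
```

With this ordering the first payload bit (the most-significant bit of buffer
word 3) becomes the most-significant bit of the first octet.
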
273 In the case of TCH/FS and TCH/EFS, the fill order of bits in the payload words
274 is as follows, starting with the most-significant bit of buffer word 3 (first
275 word of the payload portion):
276
277 * 182 bits of class 1;
278
279 * 4 dummy bits (always observed as 0);
280
281 * 78 bits of class 2;
282
283 * the last 8 bits of a_dd_0[19] are unused.
284
285 In the case of TCH/HS, the fill order is similar, but modified as appropriate
286 for TCH/HS:
287
288 * 95 bits of class 1;
289
290 * 4 dummy bits;
291
292 * 17 bits of class 2;
293
294 * the last 12 bits of a_dd_0[10] or a_dd_1[10] are unused.
295
296 Aside from the insertion of 4 extra dummy bits at the boundary between class 1
297 and class 2, the overall bit order is that of GSM 05.03 Figure 1 interface 1.
298
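The fill order above can be turned into a small unpacking routine. The Python
sketch below is an illustration only: the LAYOUT table and the function names
are invented here, and the input is assumed to be a list holding a complete
DSP buffer (3 status words followed by the payload), as read from a_dd_0 or
a_dd_1.

```python
# Payload words, class 1 bits, class 2 bits per mode, as given in the text.
LAYOUT = {
    "FS": (17, 182, 78),  # TCH/FS and TCH/EFS
    "HS": (8, 95, 17),    # TCH/HS
}
DUMMY_BITS = 4  # inserted between class 1 and class 2


def unpack_bits(words):
    """Expand 16-bit words into a bit list, most-significant bit first."""
    return [(w >> (15 - i)) & 1 for w in words for i in range(16)]


def split_payload(buf, mode):
    """Return (class1, dummy, class2) bit lists from a full TCH DL buffer."""
    nwords, n1, n2 = LAYOUT[mode]
    bits = unpack_bits(buf[3:3 + nwords])  # skip the 3 status words
    class1 = bits[:n1]
    dummy = bits[n1:n1 + DUMMY_BITS]
    class2 = bits[n1 + DUMMY_BITS:n1 + DUMMY_BITS + n2]
    return class1, dummy, class2
```

Concatenating class1 and class2 gives 260 bits for TCH/FS and TCH/EFS or 112
bits for TCH/HS, i.e., the frame in the order of GSM 05.03 interface 1 with
the dummy bits removed.
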
299 In the case of TCH/EFS, the following additional considerations apply:
300
301 * Bits [65:73] in all received DL frames, where CRC-8 would go in the 260-bit
302 frame of GSM 05.03 interface 1 for EFR, are always observed as 0, whether
303 this CRC-8 was good (a_dd_0[0] bit 9 clear) or bad (a_dd_0[0] bit 9 set).
304
305 * The handling of repetition bits (4 bits of 244-bit EFR codec frame, each of
306 which is triplicated in the channel encoding for transmission) is unclear.
307
308 Further detail regarding the repetition bits of TCH/EFS: distinct bit positions
309 exist in the 260-bit frame of GSM 05.03 interface 1 (which is the frame format
310 in the TCH buffers of TI's DSP) for each of the 3 copies of each of the 4
311 triplicated bits. It is obvious that correct decoding of these triplicated bits
312 requires a majority-vote function just like the one implemented in TMR systems
313 in space gear - but it is not absolutely and unquestionably obvious where this
314 TMR voting function is implemented in the Rx processing chain of TI's DSP. It
315 *appears* that this majority-vote function has already been performed by the DSP
316 function that writes a_dd_0, and that the first bit position out of each group
317 of 3 holds the output of this voting function, so that the subsequent speech
318 decoder only needs to use those "cooked" bits - but there is this mystery:
319
320 * At certain times, particularly during the main part of a test call, TCH DL
321 buffer readouts contain zeros in the "extra" repetition bit positions: for
322 each group of 3 bits, the first will contain 0 or 1, but the other two will
323 always be 0.
324
325 * At other times, seemingly in the beginning and ending parts of test calls,
326 TCH DL buffer readouts contain matching bit values in all 3 positions: for
327 each group of 3 bits, if the first bit is 0, the other two will also be 0, or
328 if the first bit is 1, then the other two will also be 1.
329
330 One possibility is that the DSP applies the required majority-voting function,
331 writes its output into the first bit position of each group of 3, but then
332 sometimes (and not at other times) applies another function that writes the
333 voting function output into the remaining bit positions, perhaps for loopback
334 of TCH DL into TCH UL. More study is needed in this area.
335
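To make the two observed patterns concrete, here is a small Python sketch that
classifies one group of 3 repetition bit positions and contrasts it with a
plain majority vote. The function names and category labels are invented for
this illustration, and the bit positions of each group within the 260-bit frame
are deliberately left to the caller.

```python
def majority(a, b, c):
    """Plain 2-out-of-3 vote over three bits."""
    return 1 if a + b + c >= 2 else 0


def classify_group(a, b, c):
    """Label one group of 3 repetition bit positions from a TCH DL readout."""
    if a == b == c:
        # All-zero groups fit both observed patterns, so they tell us nothing.
        return "matching" if a else "ambiguous-zero"
    if b == 0 and c == 0:
        return "first-only"  # first bit set, the other two zero
    return "mixed"           # neither of the two patterns described above
```

Note that for a "first-only" group of (1, 0, 0) a plain vote yields 0, while
the "cooked first bit" interpretation yields 1 - so a software decoder fed from
these buffers would have to use the first bit of each group rather than vote
again, if the hypothesis above is correct.
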
336 FreeCalypso file format for TCH DL captures
337 ===========================================
338
339 The file format written by fc-shell tch record command is ASCII hex, line-based,
340 with one line for every captured 20 ms window. The new format as of 2022 is:
341
342 * Each line begins with an FR, HR or EFR keyword indicating which variant of
343 TCH DL has been captured;
344
345 * This keyword is followed by 3 space-separated DSP status words, each written
346 as 4 hex digits;
347
348 * The main body of the frame is written as 33 (TCH/FS & TCH/EFS) or 15 (TCH/HS)
349 hex bytes, produced from the payload portion of the TCH DL buffer by turning
350 each 16-bit word into 2 bytes (MSB first) and discarding the last byte that
351 is unused (always 0);
352
353 * Each line ends with a frame number in decimal, specifically the value of
354 fn_mod_104 variable in the l1s_read_dedic_dl() function when the DSP buffer
355 was read.
356
357 The addition of the frame number field allows these TCH DL captures to be
358 reconciled against the SACCH multiframe structure, which matters for the rules
359 of DTX.
360
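A reader for this line format could look like the following Python sketch. It
is not part of FC host tools; the names are made up, and because the
description above does not say whether the body bytes are separated by spaces,
the parser simply joins every field between the third status word and the
final frame number before hex-decoding.

```python
from typing import NamedTuple

BODY_LEN = {"FR": 33, "EFR": 33, "HR": 15}  # body bytes per keyword


class TchDlRecord(NamedTuple):
    codec: str        # "FR", "HR" or "EFR"
    status: tuple     # the 3 DSP status words as integers
    payload: bytes    # 33 or 15 payload bytes, MSB-first per 16-bit word
    fn_mod_104: int   # frame number field (decimal in the file)


def parse_record_line(line):
    """Parse one line of a 2022-format TCH DL capture file."""
    tok = line.split()
    if len(tok) < 6 or tok[0] not in BODY_LEN:
        raise ValueError("not a TCH DL record line")
    if any(len(t) != 4 for t in tok[1:4]):
        raise ValueError("status words must be 4 hex digits each")
    status = tuple(int(t, 16) for t in tok[1:4])
    payload = bytes.fromhex("".join(tok[4:-1]))
    if len(payload) != BODY_LEN[tok[0]]:
        raise ValueError("wrong body length for " + tok[0])
    return TchDlRecord(tok[0], status, payload, int(tok[-1]))
```

The first status word of each parsed record is the flag word analyzed earlier,
and the fn_mod_104 field is what ties the record to the SACCH multiframe.
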
361 TCH UL substitution: open questions
362 ===================================
363
364 Moving from the mostly-understood realm of TCH DL capture into the much more
365 experimental realm of TCH UL substitution, we have some open questions: how does
366 this DSP special mode really work? Here is what we know: if we load externally
367 sourced speech frames into otherwise-unused a_du_1 DSP buffer at the time of
368 (fn_report_mod13_mod4 == 3), which is the same time when FACCH or CSD UL would
369 be expected, and set B_PLAY_UL bit in DSP NDB API word d_tch_mode, the speech
370 frame stream going to the other end of the call will be the one we feed into
371 a_du_1 instead of the one produced from the microphone input by the internal
372 speech encoder. But here are the parts we don't know:
373
374 * If one were to set B_PLAY_UL in d_tch_mode but not feed external UL input
375 into a_du_1 buffer at the needed time, what will happen?
376
377 * Conversely, if one were to load a_du_1 and set its B_BLUD bit without setting
378 B_PLAY_UL in d_tch_mode, what will happen?
379
380 * Can the frame stream fed into a_du_1 be encoded in DTX-enabled mode, including
381 SID frames? If this possibility is allowed, what magic bits would need to be
382 set where in order to get the correct behavior from the DSP's subsequent
383 burst-by-burst DTX logic?
384
385 TCH UL substitution: implemented PoC
386 ====================================
387
388 Back in 2016 we implemented a proof-of-concept TCH UL play feature in
389 FreeCalypso (only for TCH/FS and TCH/EFS), and the same PoC was retained
390 when the overall TCH tap facility was mainlined in late 2022. Having this
391 highly experimental (not fit for production use) TCH UL play code present in our
392 current production fw is deemed acceptable because this code will never be
393 invoked unless the user sends TCH_ULBITS_REQ packets to the running fw via
394 RVTMUX - and if you do send such packets (via tch play command in an fc-shell
395 session or by any other means), you are leaving the realm of production-approved
396 functionality and entering the realm of wild experimentation.
397
398 The PoC TCH UL play mechanism consists of a small buffer (holding up to 4 FR1 or
399 EFR frames) implemented in the ARM firmware; this buffer is filled by arriving
400 TCH_ULBITS_REQ packets and drained by the tchf_substitute_uplink() function
401 called from l1s_ctrl_tchtf(). Specifically, a flag named tch_ul_play_mode is
402 set when TCH_ULBITS_REQ input is received, telling l1s_ctrl_tchtf() to start
403 calling tchf_substitute_uplink() when (fn_report_mod13_mod4 == 3); the called
404 function drains an uplink frame from the ring buffer, writes it into the DSP's
405 a_du_1 buffer, sets B_PLAY_UL in d_tch_mode and sends a TCH_ULBITS_CONF packet
406 back to the host. If the ring buffer is empty, the function clears both
407 B_PLAY_UL and the firmware's tch_ul_play_mode flag, ending the special TCH UL
408 play mode.
409
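The control flow just described can be modeled on the host for study purposes.
The Python class below is only a behavioral sketch of the description above,
not a transcription of the firmware: the class itself, the numeric value of
B_PLAY_UL, the conf_sent counter and the handling of a full buffer are all
assumptions made for illustration, while the other names mirror those
mentioned in the text.

```python
from collections import deque

B_PLAY_UL = 0x0001  # HYPOTHETICAL bit value; the real position is not given here
RING_SIZE = 4       # the PoC buffer holds up to 4 FR1 or EFR frames


class UlPlayModel:
    def __init__(self):
        self.ring = deque()
        self.tch_ul_play_mode = False
        self.d_tch_mode = 0
        self.a_du_1 = None   # stands in for the DSP's a_du_1 buffer
        self.conf_sent = 0   # counts TCH_ULBITS_CONF packets "sent"

    def tch_ulbits_req(self, frame):
        """Arrival of a TCH_ULBITS_REQ packet carrying one uplink frame."""
        if len(self.ring) >= RING_SIZE:
            return False     # assumption: a full buffer rejects the frame
        self.ring.append(frame)
        self.tch_ul_play_mode = True
        return True

    def l1s_ctrl_tchtf(self, fn_report_mod13_mod4):
        """Per-frame hook; substitution happens only at position 3."""
        if self.tch_ul_play_mode and fn_report_mod13_mod4 == 3:
            self.tchf_substitute_uplink()

    def tchf_substitute_uplink(self):
        if not self.ring:
            # Buffer ran dry: leave the special TCH UL play mode.
            self.d_tch_mode &= ~B_PLAY_UL & 0xFFFF
            self.tch_ul_play_mode = False
            return
        self.a_du_1 = self.ring.popleft()
        self.d_tch_mode |= B_PLAY_UL
        self.conf_sent += 1
```

Since each drained frame produces one confirmation, a host that sends its next
frame only after receiving a confirmation cannot overrun the 4-frame buffer.
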
410 This PoC mechanism is meant to be exercised with tch play command in an
411 interactive fc-shell session: this command reads an ASCII line-based uplink data
412 file and sends it to the firmware frame by frame, paced by TCH_ULBITS_CONF
413 packets from the target. The input to this command is a line-based ASCII hex
414 file similar to the format written by tch record, but simplified: each line is
415 just the 33-byte frame to be sent (in TI DSP buffer format, following GSM 05.03
416 interface 1), without any flags or status words or frame numbers.
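
A reader for such an uplink data file could be sketched in Python as follows.
This is an illustration with a made-up function name, not code from fc-shell;
tolerating blank lines and whitespace between hex bytes is an assumption of
this sketch rather than a documented property of the tool.

```python
def read_ul_play_file(lines):
    """Parse tch play input lines into a list of 33-byte frames."""
    frames = []
    for lineno, line in enumerate(lines, 1):
        text = "".join(line.split())  # drop any whitespace between hex bytes
        if not text:
            continue                  # assumption: blank lines are skipped
        frame = bytes.fromhex(text)
        if len(frame) != 33:
            raise ValueError(
                "line %d: expected 33 bytes, got %d" % (lineno, len(frame)))
        frames.append(frame)
    return frames
```

The function accepts any iterable of strings, so an open text file can be
passed to it directly.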