comparison doc/Audio-mode-config @ 847:6e137995c9c8

doc/Audio-mode-config: elaborate on AEC and FIR blocks
author Mychaela Falconia <falcon@freecalypso.org>
date Tue, 10 Aug 2021 01:50:41 +0000
parents 6a0fcbca8ac7
children 6c306705f503
846:a2e17e0f9622 847:6e137995c9c8
266 * If neither setting is given, a 164 byte file will be produced, with the 5 AEC
267 words of the old type being all zeros. Thanks to the modified audio mode
268 loading code in our firmwares, these 164 byte mode files can still be used
269 with current Tourmaline fw, with AEC set to its default disabled state.
270
271 New AEC parameter words
272 =======================
273
274 The 12 words that configure AEC of the L1_NEW_AEC flavor (appearing on the
275 aec-new line in tiaud-compile input or in an fc-tmsh auw 12 command) map as
276 follows:
277
278 Word 0: aec_enable
279
280 This word must be set to 0 to disable AEC or 2 to enable it. The Audio Service
281 layer translates this word into a single DSP control bit, so no other values
282 may be written into it.
283
284 Word 1: continuous_filtering
285
286 This word is written directly into the DSP, and we have no documentation for it
287 beyond "enable (1) or disable (0) continuous mode filtering".
288
289 Word 2: granularity_attenuation
290
291 This word is written directly into the DSP, and we have no documentation for it
292 beyond "granularity of the smoothed attenuation".
293
294 Word 3: smoothing_coefficient
295
296 This word is written directly into the DSP, and we have no documentation for it
297 beyond "smoothing coefficient".
298
299 Word 4: max_echo_suppression_level
300
301 This word is written directly into the DSP; it is described as "maximum
302 attenuation level", and the following constants are defined for it:
303
304 #define AUDIO_MAX_ECHO_0dB (0x7FFF)
305 #define AUDIO_MAX_ECHO_2dB (0x65AA)
306 #define AUDIO_MAX_ECHO_3dB (0x59AD)
307 #define AUDIO_MAX_ECHO_6dB (0x4000)
308 #define AUDIO_MAX_ECHO_12dB (0x1FFF)
309 #define AUDIO_MAX_ECHO_18dB (0x0FFF)
310 #define AUDIO_MAX_ECHO_24dB (0x07FF)
311
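The names of these constants suggest a simple relationship between each value
and its dB label. The Python sketch below checks one possible reading: it
ASSUMES (this is not confirmed by any documentation we have) that the word is a
linear gain factor scaled by 32768, so that attenuation in dB is
-20*log10(word/32768). The function name q15_gain_to_db is made up for this
illustration.

```python
import math

# Constants copied from the definitions above: nominal dB label -> word value.
MAX_ECHO_CONSTANTS = {
    0: 0x7FFF,
    2: 0x65AA,
    3: 0x59AD,
    6: 0x4000,
    12: 0x1FFF,
    18: 0x0FFF,
    24: 0x07FF,
}


def q15_gain_to_db(word):
    """Attenuation in dB, ASSUMING the word is a linear gain scaled by 32768."""
    return -20.0 * math.log10(word / 32768.0)


if __name__ == "__main__":
    for nominal_db, word in MAX_ECHO_CONSTANTS.items():
        computed = q15_gain_to_db(word)
        print(f"0x{word:04X}: nominal {nominal_db:2d} dB, computed {computed:.2f} dB")
```

Under this assumption every constant lands within 0.2 dB of its label, which is
consistent with the "linear gain" reading but does not prove it.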
312 Word 5: vad_factor
313
314 This word is written directly into the DSP, and we have no documentation for it
315 beyond "VAD factor relative to the current estimated energy". VAD presumably
316 stands for "voice activity detector", but our knowledge ends there.
317
318 Word 6: absolute_threshold
319
320 This word is written directly into the DSP, and we have no documentation for it
321 beyond "VAD absolute offset relative to the current estimated energy".
322
323 Word 7: factor_asd_filtering
324
325 This word is written directly into the DSP, and we have no documentation for it
326 beyond "modifying factor of d_far_end_noise for filtering decision".
327
328 Word 8: factor_asd_muting
329
330 This word is written directly into the DSP, and we have no documentation for it
331 beyond "modifying factor of d_far_end_noise for muting decision".
332
333 Word 9: aec_visibility
334
335 This word must be set to 0 for normal operation or 0x200 for "AEC visibility"
336 debug mode. The Audio Service layer translates this word into a single L1
337 control bit, so no other values may be written into it.
338
339 Word 10: noise_suppression_enable
340
341 This word must be set to 0 to disable the SPENH algorithm or 4 to enable it.
342 The Audio Service layer translates this word into a single DSP control bit, so
343 no other values may be written into it. We don't know what this "speech
344 enhancement" algorithm does, or whether it is the same thing as "noise
345 suppression".
346
347 Word 11: noise_suppression_level
348
349 This config word is mapped to just two bits in the actual DSP control word by
350 the Audio Service layer, thus there are only 4 possible valid values here:
351
352 #define AUDIO_NOISE_NO_LIMITATION (0x0000)
353 #define AUDIO_NOISE_6dB (0x0020)
354 #define AUDIO_NOISE_12dB (0x0040)
355 #define AUDIO_NOISE_18dB (0x0060)
356
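The value restrictions described above for words 0, 9, 10 and 11 can be
checked before a parameter set is written anywhere. The following Python
sketch is only an illustration and is not part of any FreeCalypso tool; the
names RESTRICTED_WORDS and check_aec_words are hypothetical. Words that go
directly into the DSP are only checked for fitting into 16 bits, as we have no
documented limits for them.

```python
# Allowed values for the words that must hold one of a fixed set of values.
# Key = word index among the 12 AEC parameter words.
RESTRICTED_WORDS = {
    0: {0, 2},                             # aec_enable
    9: {0, 0x200},                         # aec_visibility
    10: {0, 4},                            # noise_suppression_enable
    11: {0x0000, 0x0020, 0x0040, 0x0060},  # noise_suppression_level
}


def check_aec_words(words):
    """Return a list of problem descriptions; an empty list means none found."""
    if len(words) != 12:
        return [f"expected 12 words, got {len(words)}"]
    problems = []
    for index, word in enumerate(words):
        if not 0 <= word <= 0xFFFF:
            problems.append(f"word {index}: 0x{word:X} does not fit in 16 bits")
        elif index in RESTRICTED_WORDS and word not in RESTRICTED_WORDS[index]:
            allowed = ", ".join(f"0x{v:X}" for v in sorted(RESTRICTED_WORDS[index]))
            problems.append(f"word {index}: 0x{word:X} is not one of {allowed}")
    return problems


if __name__ == "__main__":
    # Placeholder inputs for demonstration only, not recommended settings.
    print(check_aec_words([0] * 12))
    print(check_aec_words([1] + [0] * 11))
```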
357 Some known-good AEC configurations
358 ==================================
359
360 The terse descriptions of parameter words given above unfortunately constitute
361 the total extent of our knowledge of the AEC block in our dear Calypso DSP and
362 its tuning parameters - we don't know anything more. However, we do have 3
363 example configurations to look at: we have the default values of the tuning
364 parameters that appear to be initialized by the DSP itself on boot, and we have
365 two AEC-enabled configurations set by Pirelli DP-L10 firmware: one for handheld
366 and wired headset modes, the other for the hands-free loudspeaker mode. Here
367 are the 12 parameter words in the 3 available configurations:
368
369 Parameter word              Default         Pirelli         Pirelli
370                             (AEC disabled)  handheld        hands-free
371 --------------------------------------------------------------------------
372 aec_enable                  0 (off)         2 (on)          2 (on)
373 continuous_filtering        0 (off)         1 (on)          1 (on)
374 granularity_attenuation     0x0001          0x0014          0x0014
375 smoothing_coefficient       0x7FFF          0x0CCC          0x0CCC
376 max_echo_suppression_level  0x1FFF (12 dB)  0x59AD (3 dB)   0x0FFF (18 dB)
377 vad_factor                  0x4000          0x4000          0x4000
378 absolute_threshold          0x0032          0x0032          0x0032
379 factor_asd_filtering        0x1000          0x1000          0x1000
380 factor_asd_muting           0x1000          0x1000          0x1000
381 aec_visibility              0 (off)         0 (off)         0 (off)
382 noise_suppression_enable    0 (off)         4 (on)          4 (on)
383 noise_suppression_level     0 (none)        0 (none)        0x0060 (18 dB)
384
385 The following observations can be made:
386
387 * The 4 parameters vad_factor, absolute_threshold, factor_asd_filtering and
388 factor_asd_muting remain unchanged between TI's DSP default and Pirelli's
389 production configs. On the basis of this observation, I (Mother Mychaela)
390 get the feeling that these four should be left alone.
391
392 * Besides the obvious steps of enabling AEC and SPENH, Pirelli did change
393 continuous_filtering (from off to on), granularity_attenuation and
394 smoothing_coefficient. Unfortunately, unless we recover the source code for
395 our Calypso DSP ROM or some documents explaining this version of AEC in
396 detail, we have no way of understanding what these parameters do, let alone
397 evaluating the merits of Pirelli's change.
398
399 * max_echo_suppression_level and noise_suppression_level seem to be the two
400 parameters most amenable to tuning.
401
402 * It is interesting to note that Pirelli's fw enables AEC not only in the
403 loudspeaker mode, but also in the more basic handheld and wired headset
404 modes. The two AEC configs differ only in max_echo_suppression_level and
405 noise_suppression_level parameters, with the loudspeaker mode AEC config
406 being more aggressive.
407
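The observations above can be checked mechanically against the table. The
Python sketch below transcribes the three configurations from the table and
lists which parameter words differ between any two of them. The dictionary
keys and the helper name differing_words are made up for this illustration.

```python
WORD_NAMES = [
    "aec_enable", "continuous_filtering", "granularity_attenuation",
    "smoothing_coefficient", "max_echo_suppression_level", "vad_factor",
    "absolute_threshold", "factor_asd_filtering", "factor_asd_muting",
    "aec_visibility", "noise_suppression_enable", "noise_suppression_level",
]

# Values transcribed from the table of the 3 available configurations.
CONFIGS = {
    "default": [0, 0, 0x0001, 0x7FFF, 0x1FFF, 0x4000,
                0x0032, 0x1000, 0x1000, 0, 0, 0],
    "pirelli_handheld": [2, 1, 0x0014, 0x0CCC, 0x59AD, 0x4000,
                         0x0032, 0x1000, 0x1000, 0, 4, 0],
    "pirelli_handsfree": [2, 1, 0x0014, 0x0CCC, 0x0FFF, 0x4000,
                          0x0032, 0x1000, 0x1000, 0, 4, 0x0060],
}


def differing_words(name_a, name_b):
    """Names of the parameter words whose values differ between two configs."""
    pairs = zip(WORD_NAMES, CONFIGS[name_a], CONFIGS[name_b])
    return [name for name, a, b in pairs if a != b]


if __name__ == "__main__":
    print(differing_words("pirelli_handheld", "pirelli_handsfree"))
    print(differing_words("default", "pirelli_handheld"))
```

The first call reports only max_echo_suppression_level and
noise_suppression_level, matching the last observation in the list.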
408 Prior to seeing what Pirelli's fw does, my (Mychaela's) own thinking was that
409 AEC is only needed in loudspeaker configurations, not handheld or headset.
410 After seeing Pirelli's AEC configs, I reason that enabling a less aggressive
411 AEC configuration in those less echo-prone modes probably doesn't hurt - thus
412 until and unless we recover more documentation or other knowledge, the plan for
413 our own FreeCalypso Libre Dumbphone handset is to do what Pirelli does: use a
414 less aggressive AEC config in handheld and headset modes, and a more aggressive
415 one in the hands-free loudspeaker mode.
416
417 On our current FCDEV3B setup with a SparkFun COM-09151 loudspeaker and a CUI
418 CMC-9745-130T microphone, applying Pirelli's loudspeaker-mode AEC config
419 produces echo cancellation that sounds acceptable to our subjective human
420 evaluator on the far end of test calls.
421
422 FIR filter details
423 ==================
424
425 Calypso DSP has two FIR filters in the voice paths, one in the uplink path and
426 one in the downlink path. Aside from their placement, the two FIR filters are
427 identical. Each FIR block has 31 taps (making a 30th order filter), and each of
428 the 31 coefficients is a 16-bit fixed-point number. The fixed point format is
429 F2.14 aka Q14: to get the real coefficient from the physical 16 bits, treat the
430 16-bit datum as a two's complement signed integer, then divide by 16384.
431 Examples: 0x4000 means 1, 0x2000 means 0.5, 0xC000 means -1, 0xE000 means -0.5.
432
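The Q14 conversion rule just described can be written out as a short Python
sketch. The function names q14_to_float and float_to_q14 are made up for this
illustration; they do not come from any FreeCalypso tool.

```python
def q14_to_float(word):
    """Convert a 16-bit Q14 word (0..0xFFFF) to the real coefficient."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    # Reinterpret the 16 bits as a two's complement signed integer.
    signed = word - 0x10000 if word & 0x8000 else word
    return signed / 16384.0


def float_to_q14(value):
    """Convert a real coefficient to the nearest 16-bit Q14 word."""
    scaled = round(value * 16384)
    if not -32768 <= scaled <= 32767:
        raise ValueError("value is outside the Q14 range")
    return scaled & 0xFFFF


if __name__ == "__main__":
    for word in (0x4000, 0x2000, 0xC000, 0xE000):
        print(f"0x{word:04X} -> {q14_to_float(word)}")
```

Note that the representable range is -2 to just under +2, so a main tap such
as 0x7D0C stands for a coefficient a little below 2.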
433 In principle you can set all 31 coefficients to whatever you like, but in
434 practice only two possible configurations are used:
435
436 * When the FIR filter is disabled (identity transform), coefficient 0 is set to
437 0x4000 (unity) and all other coefficients are set to 0. In this configuration
438 the FIR block does not introduce any extra delay: all delayed samples are
439 multiplied by 0 and thus produce no effect.
440
441 * When some non-identity frequency response transformation is desired, a linear
442 phase filter is set up: coefficient #15 becomes the main tap (significantly
443 greater in absolute value than all others) and all other coefficients mirror
444 around it symmetrically: #0 equals #30, #1 equals #29 and so forth, until #14
445 equals #16. This filter adds 1.875 ms of delay (15 sample times) to the voice
446 path in which it is active, and an equal amount of "pre-ringing".
447
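The two configurations described above can be told apart programmatically.
The Python sketch below is an illustration only: the function name
classify_fir is hypothetical, and the check covers just the mirror symmetry of
the coefficients, not whether tap #15 actually dominates the others. The 8000
samples per second rate is inferred from the statement that 15 sample times
equal 1.875 ms.

```python
SAMPLE_RATE_HZ = 8000  # inferred: 15 sample times = 1.875 ms
NUM_TAPS = 31


def classify_fir(coeffs):
    """Classify a list of 31 coefficient words (16-bit Q14 values)."""
    if len(coeffs) != NUM_TAPS:
        raise ValueError(f"expected {NUM_TAPS} coefficients, got {len(coeffs)}")
    # Disabled filter: unity in coefficient 0, zeros everywhere else.
    if coeffs[0] == 0x4000 and all(c == 0 for c in coeffs[1:]):
        return "identity"
    # Linear phase: #0 equals #30, #1 equals #29, ..., #14 equals #16.
    if all(coeffs[i] == coeffs[NUM_TAPS - 1 - i] for i in range(NUM_TAPS // 2)):
        return "linear-phase"
    return "other"


def linear_phase_delay_ms():
    """Delay of the symmetric 31-tap filter: 15 sample times, in milliseconds."""
    return (NUM_TAPS // 2) * 1000 / SAMPLE_RATE_HZ


if __name__ == "__main__":
    print(classify_fir([0x4000] + [0] * 30))
    print(linear_phase_delay_ms())
```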
448 The presumed purpose of these two FIR filters (uplink and downlink) is to
449 flatten the frequency response of the speaker and microphone transducers, or
450 perhaps even more ambitiously, the frequency response of the modeled acoustic
451 environment. However, actually coming up with a good set of FIR filter
452 coefficients given a desired frequency response is a hard problem, one where
453 forward engineering is much more difficult than reverse.
454
455 When it comes to reverse engineering of existing Calypso DSP FIR filters, a
456 total of 7 specimens have been captured in the wild so far: one downlink FIR
457 filter from Openmoko's non-functional para0.cfg (no way of knowing which speaker
458 it was once designed for), and a set of 6 filters extracted from Pirelli DP-L10,
459 3 uplink and 3 downlink, corresponding to the 3 audio routing modes supported on
460 this phone model (handheld, hands-free and wired headset). All 7 are linear
461 phase filters as described above. Analyzing the frequency response of a given
462 already existing FIR filter is easy: just use the fir2freq program in our
463 freecalypso-reveng Hg repository. OTOH, coming up with a new set of FIR filter
464 coefficients for some desired frequency response (e.g., for a new phone handset
465 being designed) is a much harder problem, one which we will probably have to
466 outsource to a hired DSP/FIR expert.
467
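To show why the analysis direction is the easy one, here is a minimal Python
sketch of the textbook calculation: the frequency response of a FIR filter is
the discrete-time Fourier transform of its coefficients. This is NOT the
fir2freq source code and makes no claim about how that program works
internally; the function names are made up, and the 8000 Hz sample rate is
inferred from 15 sample times being 1.875 ms.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000  # inferred from the delay figure given in the text


def q14_to_float(word):
    """Convert a 16-bit Q14 word to the real coefficient."""
    signed = word - 0x10000 if word & 0x8000 else word
    return signed / 16384.0


def fir_gain_db(coeff_words, freq_hz):
    """Magnitude response in dB of the FIR filter at the given frequency."""
    omega = 2.0 * math.pi * freq_hz / SAMPLE_RATE_HZ
    response = sum(
        q14_to_float(word) * cmath.exp(-1j * omega * n)
        for n, word in enumerate(coeff_words)
    )
    magnitude = abs(response)
    if magnitude == 0.0:
        return float("-inf")
    return 20.0 * math.log10(magnitude)


if __name__ == "__main__":
    identity = [0x4000] + [0] * 30
    for freq in (300, 1000, 3400):
        print(freq, "Hz:", round(fir_gain_db(identity, freq), 3), "dB")
```

For the identity configuration the gain is 0 dB at every frequency, as
expected for a filter that passes the signal through unchanged.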
468 Calypso FIR support in FC host tools
469 ------------------------------------
470
471 Uplink and downlink FIR filter coefficients can be included in the input to
472 tiaud-compile. Each coefficient is given as the actual 16-bit word going into
473 the DSP (Q14 scaling included), and can be specified either in hex or as a
474 signed decimal integer.
475
476 We also have a dedicated ASCII file format for a FIR filter coefficient set by
477 itself, like this example:
478
479 fir-coeff-table
480
481 0x0178 0x0AB5 0xF43D 0xFED5 0xFCA7 0x04D8 0x00B8 0x0371
482 0x032F 0x0007 0x151C 0xF24C 0x19A6 0xE918 0xF7CD 0x7D0C
483 0xF7CD 0xE918 0x19A6 0xF24C 0x151C 0x0007 0x032F 0x0371
484 0x00B8 0x04D8 0xFCA7 0xFED5 0xF43D 0x0AB5 0x0178
485
486 (This example is the FIR filter extracted from Openmoko's non-functional
487 para0.cfg.) This standalone FIR coefficient set format is accepted as input to
488 the auw-fir command in fc-tmsh (allowing experimental FIR filters to be uploaded
489 to a running Calypso device for testing) and to our fir2freq analysis program.
490 We will probably use the same format if and when we embark on a venture to
491 design our own FIR filters for our own handset hardware.
492
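As an illustration of how simple this format is to read, the Python sketch
below parses the example shown above. It ASSUMES only what the example shows:
a fir-coeff-table keyword followed by 31 whitespace-separated hex words. Any
other syntax the real tools may accept (comments, decimal numbers and so on)
is not handled, and the function name parse_fir_coeff_table is hypothetical.

```python
EXAMPLE = """\
fir-coeff-table

0x0178 0x0AB5 0xF43D 0xFED5 0xFCA7 0x04D8 0x00B8 0x0371
0x032F 0x0007 0x151C 0xF24C 0x19A6 0xE918 0xF7CD 0x7D0C
0xF7CD 0xE918 0x19A6 0xF24C 0x151C 0x0007 0x032F 0x0371
0x00B8 0x04D8 0xFCA7 0xFED5 0xF43D 0x0AB5 0x0178
"""


def parse_fir_coeff_table(text):
    """Return the 31 coefficient words from a fir-coeff-table text."""
    tokens = text.split()
    if not tokens or tokens[0] != "fir-coeff-table":
        raise ValueError("missing fir-coeff-table keyword")
    coeffs = [int(token, 16) for token in tokens[1:]]
    if len(coeffs) != 31:
        raise ValueError(f"expected 31 coefficients, got {len(coeffs)}")
    if any(not 0 <= c <= 0xFFFF for c in coeffs):
        raise ValueError("coefficient does not fit in 16 bits")
    return coeffs


if __name__ == "__main__":
    coeffs = parse_fir_coeff_table(EXAMPLE)
    print("main tap: 0x%04X" % coeffs[15])
    print("symmetric:", coeffs == coeffs[::-1])
```

Running it on the example confirms that this filter is symmetric around
coefficient #15, i.e., it is a linear phase filter.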
493 fc-tmsync aur and aur-all addition
494 ==================================
495
496 New addition as of fc-host-tools-r16: our aur command which natively resides in
497 fc-tmsh (audio mode full access read operation via ETM) has also been