# HG changeset patch
# User Mychaela Falconia
# Date 1628560241 0
# Node ID 6e137995c9c80698124b0732921cf67bf81ce441
# Parent  a2e17e0f9622ee67a5de277db2300835f8ede1ee
doc/Audio-mode-config: elaborate on AEC and FIR blocks

diff -r a2e17e0f9622 -r 6e137995c9c8 doc/Audio-mode-config
--- a/doc/Audio-mode-config  Sat Jul 31 22:37:34 2021 +0000
+++ b/doc/Audio-mode-config  Tue Aug 10 01:50:41 2021 +0000
@@ -268,6 +268,228 @@
 loading code in our firmwares, these 164 byte mode files can still be used with
 current Tourmaline fw, with AEC set to its default disabled state.
 
+New AEC parameter words
+=======================
+
+The 12 words that configure AEC of the L1_NEW_AEC flavor (appearing on the
+aec-new line in tiaud-compile input or in an fc-tmsh auw 12 command) map as
+follows:
+
+Word 0: aec_enable
+
+This word must be set to 0 to disable AEC or 2 to enable it. This word is
+translated to a single DSP control bit by the Audio Service layer, thus no
+other values must be written into it.
+
+Word 1: continuous_filtering
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "enable (1) or disable (0) continuous mode filtering".
+
+Word 2: granularity_attenuation
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "granularity of the smoothed attenuation".
+
+Word 3: smoothing_coefficient
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "smoothing coefficient".
+
+Word 4: max_echo_suppression_level
+
+This word is written directly into the DSP; it is described as "maximum
+attenuation level", and the following constants are defined for it:
+
+    #define AUDIO_MAX_ECHO_0dB      (0x7FFF)
+    #define AUDIO_MAX_ECHO_2dB      (0x65AA)
+    #define AUDIO_MAX_ECHO_3dB      (0x59AD)
+    #define AUDIO_MAX_ECHO_6dB      (0x4000)
+    #define AUDIO_MAX_ECHO_12dB     (0x1FFF)
+    #define AUDIO_MAX_ECHO_18dB     (0x0FFF)
+    #define AUDIO_MAX_ECHO_24dB     (0x07FF)
+
+Word 5: vad_factor
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "VAD factor relative to the current estimated energy". VAD must stand
+for "voice activity detector", but our knowledge ends here.
+
+Word 6: absolute_threshold
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "VAD absolute offset relative to the current estimated energy".
+
+Word 7: factor_asd_filtering
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "modifying factor of d_far_end_noise for filtering decision".
+
+Word 8: factor_asd_muting
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "modifying factor of d_far_end_noise for muting decision".
+
+Word 9: aec_visibility
+
+This word must be set to 0 for normal operation or 0x200 for "AEC visibility"
+debug mode. This word is translated to a single L1 control bit by the Audio
+Service layer, thus no other values must be written into it.
+
+Word 10: noise_suppression_enable
+
+This word must be set to 0 to disable SPENH algorithm or 4 to enable it. This
+word is translated to a single DSP control bit by the Audio Service layer, thus
+no other values must be written into it. We don't know what this "speech
+enhancement" algorithm does, and whether or not it is the same as "noise
+suppression".
+
+Word 11: noise_suppression_level
+
+This config word is mapped to just two bits in the actual DSP control word by
+the Audio Service layer, thus there are only 4 possible valid values here:
+
+    #define AUDIO_NOISE_NO_LIMITATION   (0x0000)
+    #define AUDIO_NOISE_6dB             (0x0020)
+    #define AUDIO_NOISE_12dB            (0x0040)
+    #define AUDIO_NOISE_18dB            (0x0060)
+
+Some known-good AEC configurations
+==================================
+
+The terse descriptions of parameter words given above unfortunately constitute
+the total extent of our knowledge of the AEC block in our dear Calypso DSP and
+its tuning parameters - we don't know anything more. However, we do have 3
+example configurations to look at: we have the default values of the tuning
+parameters that appear to be initialized by the DSP itself on boot, and we have
+two AEC-enabled configurations set by Pirelli DP-L10 firmware: one for handheld
+and wired headset modes, the other for the hands-free loudspeaker mode. Here
+are the 12 parameter words in the 3 available configurations:
+
+Parameter word              Default         Pirelli         Pirelli
+                            (AEC disabled)  handheld        hands-free
+--------------------------------------------------------------------------
+aec_enable                  0 (off)         2 (on)          2 (on)
+continuous_filtering        0 (off)         1 (on)          1 (on)
+granularity_attenuation     0x0001          0x0014          0x0014
+smoothing_coefficient       0x7FFF          0x0CCC          0x0CCC
+max_echo_suppression_level  0x1FFF (12 dB)  0x59AD (3 dB)   0x0FFF (18 dB)
+vad_factor                  0x4000          0x4000          0x4000
+absolute_threshold          0x0032          0x0032          0x0032
+factor_asd_filtering        0x1000          0x1000          0x1000
+factor_asd_muting           0x1000          0x1000          0x1000
+aec_visibility              0 (off)         0 (off)         0 (off)
+noise_suppression_enable    0 (off)         4 (on)          4 (on)
+noise_suppression_level     0 (none)        0 (none)        0x0060 (18 dB)
+
+The following observations can be made:
+
+* The 4 parameters vad_factor, absolute_threshold, factor_asd_filtering and
+  factor_asd_muting remain unchanged between TI's DSP default and Pirelli's
+  production configs.
+  On the basis of this observation, I (Mother Mychaela)
+  get the feeling that these four should be left alone.
+
+* Besides the obvious steps of enabling AEC and SPENH, Pirelli did change
+  continuous_filtering (from off to on), granularity_attenuation and
+  smoothing_coefficient. Unfortunately, unless we recover the source code for
+  our Calypso DSP ROM or some documents explaining this version of AEC in
+  detail, we have no way of understanding what these parameters do, let alone
+  evaluating the merits of Pirelli's change.
+
+* max_echo_suppression_level and noise_suppression_level seem to be the two
+  parameters most amenable to tuning.
+
+* It is interesting to note that Pirelli's fw enables AEC not only in the
+  loudspeaker mode, but also in the more basic handheld and wired headset
+  modes. The two AEC configs differ only in max_echo_suppression_level and
+  noise_suppression_level parameters, with the loudspeaker mode AEC config
+  being more aggressive.
+
+Prior to seeing what Pirelli's fw does, my (Mychaela's) own thinking was that
+AEC is only needed in loudspeaker configurations, not handheld or headset.
+After seeing Pirelli's AEC configs, I reason that enabling a less aggressive
+AEC configuration in those less echo-prone modes probably doesn't hurt - thus
+until and unless we recover more documentation or other knowledge, the plan for
+our own FreeCalypso Libre Dumbphone handset is to do what Pirelli does: use a
+less aggressive AEC config in handheld and headset modes, and a more aggressive
+one in the hands-free loudspeaker mode.
+
+On our current FCDEV3B setup with a SparkFun COM-09151 loudspeaker and a CUI
+CMC-9745-130T microphone, applying Pirelli's loudspeaker-mode AEC config
+produces echo cancellation that sounds acceptable to our subjective human
+evaluator on the far end of test calls.
+
+FIR filter details
+==================
+
+Calypso DSP has two FIR filters in the voice paths, one in the uplink path and
+one in the downlink path.
+Aside from their placement, the two FIR filters are
+identical. Each FIR block has 31 taps (making a 30th order filter), and each of
+the 31 coefficients is a 16-bit fixed-point number. The fixed point format is
+F2.14 aka Q14: to get the real coefficient from the physical 16 bits, treat the
+16-bit datum as a two's complement signed integer, then divide by 16384.
+Examples: 0x4000 means 1, 0x2000 means 0.5, 0xC000 means -1, 0xE000 means -0.5.
+
+In principle you can set all 31 coefficients to whatever you like, but in
+practice only two possible configurations are used:
+
+* When the FIR filter is disabled (identity transform), coefficient 0 is set to
+  0x4000 (unity) and all other coefficients are set to 0. In this configuration
+  the FIR block does not introduce any extra delay: all delayed samples are
+  multiplied by 0 and thus produce no effect.
+
+* When some non-identity frequency response transformation is desired, a linear
+  phase filter is set up: coefficient #15 becomes the main tap (significantly
+  greater in absolute value than all others) and all other coefficients mirror
+  around it symmetrically: #0 equals #30, #1 equals #29 and so forth, until #14
+  equals #16. This filter adds 1.875 ms of delay (15 sample times) to the voice
+  path in which it is active, and an equal amount of "pre-ringing".
+
+The presumed purpose of these two FIR filters (uplink and downlink) is to
+flatten the frequency response of the speaker and microphone transducers, or
+perhaps even more ambitiously, the frequency response of the modeled acoustic
+environment. However, actually coming up with a good set of FIR filter
+coefficients given a desired frequency response is a hard problem, one where
+forward engineering is much more difficult than reverse.
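
The Q14 decoding rule and the two coefficient layouts described above can be
sketched in Python as follows. This is only an illustrative sketch, not code
from any FreeCalypso tool: the function names (q14_to_float, is_identity,
is_linear_phase, fir_delay_ms) are made up for this example, and the 8000 Hz
sample rate is an assumption inferred from the statement that 15 sample times
amount to 1.875 ms. The OPENMOKO_FIR list is copied from the fir-coeff-table
example given elsewhere in this document.

```python
def q14_to_float(word):
    """Decode one 16-bit Q14 (F2.14) word into a real coefficient."""
    word &= 0xFFFF                  # keep only the physical 16 bits
    if word & 0x8000:               # sign bit set: two's complement negative
        word -= 0x10000
    return word / 16384.0


def is_identity(coeffs):
    """True if tap 0 is unity (0x4000) and every other tap is zero."""
    return len(coeffs) == 31 and coeffs[0] == 0x4000 and not any(coeffs[1:])


def is_linear_phase(coeffs):
    """True if the 31 taps mirror around tap #15 (#0 == #30 ... #14 == #16)."""
    taps = list(coeffs)
    return len(taps) == 31 and taps == taps[::-1]


def fir_delay_ms(coeffs, sample_rate_hz=8000):
    """Delay added by either of the two layouts, in milliseconds.

    sample_rate_hz=8000 is an assumption (15 sample times == 1.875 ms).
    """
    if is_identity(coeffs):
        return 0.0
    if is_linear_phase(coeffs):
        # The main tap sits in the middle, at index len // 2 (15 of 31).
        return 1000.0 * (len(coeffs) // 2) / sample_rate_hz
    raise ValueError("neither identity nor symmetric linear-phase layout")


# Hypothetical usage: the identity config and the Openmoko para0.cfg filter.
IDENTITY = [0x4000] + [0] * 30
OPENMOKO_FIR = [
    0x0178, 0x0AB5, 0xF43D, 0xFED5, 0xFCA7, 0x04D8, 0x00B8, 0x0371,
    0x032F, 0x0007, 0x151C, 0xF24C, 0x19A6, 0xE918, 0xF7CD, 0x7D0C,
    0xF7CD, 0xE918, 0x19A6, 0xF24C, 0x151C, 0x0007, 0x032F, 0x0371,
    0x00B8, 0x04D8, 0xFCA7, 0xFED5, 0xF43D, 0x0AB5, 0x0178,
]

if __name__ == "__main__":
    print([q14_to_float(w) for w in (0x4000, 0x2000, 0xC000, 0xE000)])
    print(fir_delay_ms(IDENTITY), fir_delay_ms(OPENMOKO_FIR))
```

A check of this kind only tells which of the two layouts a coefficient set
follows; it says nothing about the resulting frequency response, for which
the fir2freq program remains the tool to use.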
+
+When it comes to reverse engineering of existing Calypso DSP FIR filters, a
+total of 7 specimens have been captured out in the wild so far: one downlink FIR
+filter from Openmoko's non-functional para0.cfg (no way of knowing which speaker
+it was once designed for), and a set of 6 filters extracted from Pirelli DP-L10,
+3 uplink and 3 downlink, corresponding to the 3 audio routing modes supported on
+this phone model (handheld, hands-free and wired headset). All 7 are linear
+phase filters as described above. Analyzing the frequency response of a given
+already existing FIR filter is easy: just use the fir2freq program in our
+freecalypso-reveng Hg repository. OTOH, coming up with a new set of FIR filter
+coefficients for some desired frequency response (e.g., for a new phone handset
+being designed) is a much harder problem, one which we will probably have to
+outsource to a hired DSP/FIR expert.
+
+Calypso FIR support in FC host tools
+------------------------------------
+
+Uplink and downlink FIR filter coefficients can be included in the input to
+tiaud-compile. Each coefficient is given as the actual 16-bit word going into
+the DSP (Q14 scaling included), and can be specified either in hex or as a
+signed decimal integer.
+
+We also have a dedicated ASCII file format for a FIR filter coefficient set by
+itself, like this example:
+
+fir-coeff-table
+
+0x0178 0x0AB5 0xF43D 0xFED5 0xFCA7 0x04D8 0x00B8 0x0371
+0x032F 0x0007 0x151C 0xF24C 0x19A6 0xE918 0xF7CD 0x7D0C
+0xF7CD 0xE918 0x19A6 0xF24C 0x151C 0x0007 0x032F 0x0371
+0x00B8 0x04D8 0xFCA7 0xFED5 0xF43D 0x0AB5 0x0178
+
+(This example is the FIR filter extracted from Openmoko's non-functional
+para0.cfg.) This by-itself FIR filter coeff set format is accepted as input to
+the auw-fir command in fc-tmsh (allowing experimental FIR filters to be uploaded
+to a running Calypso device for testing) and to our fir2freq analysis program.
+We will probably use the same format if and when we embark on a venture to
+design our own FIR filters for our own handset hardware.
+
 fc-tmsync aur and aur-all addition
 ==================================