CHANGES: fc-buzplay basic 'play' command extension
author: Mychaela Falconia <falcon@freecalypso.org>
date: Sun, 03 Apr 2022 08:41:34 +0000
There exist a number of tunable settings in the Iota ABB (the chip that
performs A-to-D and D-to-A conversion for the voice path) and in the Calypso
DSP which in TI's firmware architecture are meant to be configured through
the audio mode facility of the RiViera Audio Service. The ABB settings
grouped under the audio mode are as follows:

* The selection of which analog interface pins the downlink audio should be
  sent to: EARN&EARP (earpiece), AUXON&AUXOP (auxiliary) or HSO (headset).

* The selection of which analog interface pins the uplink audio should be
  taken from: MICIN&MICIP (main microphone), AUXI (auxiliary input) or
  HSMICP (headset microphone).

* The selection of AUXI input levels when this analog input is in use for
  the voice uplink.

* Analog gains for the uplink, the downlink and the analog sidetone from the
  uplink input to the downlink output.

* Selection of a special filter bypass mode for the voice downlink.

* The selection of MICBIAS (or HSMICBIAS) voltage between 2.0 V and 2.5 V.

The DSP voice path settings grouped under the audio mode are as follows:

* The selection of the digital voice path as being between GSM and the ABB
  (the default for analog voice interfaces), between GSM and MCSI (the
  external digital voice interface) or between MCSI and the ABB (non-GSM
  operation).

* FIR filter coefficients for the voice uplink and for the voice downlink.

* Enabling/disabling and configuration of the Acoustic Echo Cancellation
  (AEC) mechanism.

The firmware paradigm for working with all of the above settings is as
follows:

* In a lab environment, each of the listed settings can be independently
  tweaked and read back through ETM packets over the RVTMUX debug serial
  interface; the corresponding fc-tmsh commands (matching TI's original
  Windows-based TMSH) are auw for writing individual audio parameters and
  aur for reading them back.
* In end-use operation, TI's intent as realized in the firmware design is
  that all of the listed audio settings will only be changed as a group,
  loaded from audio mode configuration files in FFS. Each audio mode
  configuration needs to be assigned a name between 1 and 9 characters long,
  and for each named configuration there are two files in FFS:

  /aud/modename.cfg is the main configuration file
  /aud/modename.vol is the corresponding volume setting file

This paradigm is a good fit for "dumbphone" handsets in which there usually
will be several different voice audio configurations for classic handheld
operation, for the hands-free loudspeaker mode, for operation with a wired
headset, and if the phone uses its hands-free loudspeaker plus the Calypso
DSP to play ringtones (as opposed to using a buzzer on BU/PWT or a ringtone
player chip that drives the speaker bypassing the voice path), there will
also need to be an output-only audio configuration for ringing.

How do the audio mode config files under /aud come into being? It appears
that TI's original intent was that a configuration would be manually
constructed on a test device via TMSH auw commands, saved in the FFS of that
test device with the aus command, then read out of that test device FFS in
binary form and reuploaded as an opaque blob to all devices on the production
line. One can do the same procedure with our fc-tmsh and fc-fsio which fully
replicate the relevant functionality of TI's original TMSH (to the best of
our knowledge), but in FreeCalypso we have an alternate way which fits better
with our UNIX philosophy: we have created our own ASCII text format for
representing all of the content in TI's /aud/*.cfg binary files and tiaud-*
utilities for compiling TI's binary cfg files from our ASCII source format,
disassembling a *.cfg file read out of FFS into the same ASCII format, and
creating the required *.vol companion files, which are also binary.
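The file naming rule stated above (a mode name between 1 and 9 characters
long, stored as /aud/modename.cfg plus /aud/modename.vol) can be sketched in
a few lines of Python. This is only an illustration: the function name is
hypothetical and is not part of any FreeCalypso tool, and only the length
rule is checked because the permitted character set is not stated here.

```python
def audio_mode_paths(mode_name):
    """Return (cfg_path, vol_path) for a named audio mode.

    Hypothetical helper: it applies only the 1-to-9 character length rule.
    """
    if not 1 <= len(mode_name) <= 9:
        raise ValueError("audio mode name must be 1 to 9 characters long")
    return "/aud/%s.cfg" % mode_name, "/aud/%s.vol" % mode_name
```

For example, a mode named "handheld" would map to /aud/handheld.cfg and
/aud/handheld.vol.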
A note about volume settings: the Iota ABB has two variable gain controls in
the voice downlink path: the main "volume" gain in rather coarse 6 dB steps
(the choices being 0 dB, -6 dB, -12 dB, -18 dB, -24 dB and mute) and a finer
"calibration" gain in 1 dB steps between -6 and +6 dB. It appears that TI's
intent was that only the coarse volume control in 6 dB steps is to be visible
to the user, with just 5 possible non-mute volume levels, and that the finer
gain control be set at the factory in the audio mode config files for each
mode as some form of calibration. Pirelli DP-L10 significantly deviates from
this model by providing 10 non-mute volume levels to the user with 2 dB or
3 dB steps between them by changing both VOLCTL and VDLPG fields in the
VBDCTRL register, but at the present time we have no plans to make a similar
drastic change in FreeCalypso.

Another noteworthy feature of the audio mode system with respect to volume
control is that there is a separate *.vol file that stores the current volume
setting for each mode. In a "dumbphone" handset firmware built according to
TI's paradigm, the /aud/*.cfg files will be written once on the factory
production line and only read afterward, but whenever the user turns the
volume up or down in the UI, the *.vol file _corresponding to the current
mode_ will be updated by the running fw. Thus the fw would maintain a
separate notion of the current volume for ringing, for the earpiece speaker,
for the hands-free loudspeaker and for the wired headset, something which
Pirelli's fw very notoriously fails to do.

Old vs. new AEC
===============

One of the settings in the audio mode config structure underwent an
evolutionary change within the span of history that is relevant to
FreeCalypso - this setting is the configuration for AEC, the Acoustic Echo
Cancellation functional block of the Calypso DSP.
As TI's GSM DSPs evolved (before, during and after the Calypso era), their
AEC implementation evolved along with the rest, and different evolutionary
versions of AEC require different configuration and tuning parameters.

When the audio mode facility was first implemented, the AEC block in TI's GSM
DSPs of that time was controlled with a single 16-bit control word; the
people in the SSA group who implemented RiViera Audio Service then decided to
split different bits from this one DSP control word into 5 different
parameter words, and the result was the "old" 5-word AEC config. But the
version of AEC implemented in the DSP ROM in the Calypso silicon version we
work with is slightly newer; this version corresponds to what TI's L1 code
calls L1_NEW_AEC.

However, the waters then got muddied: for reasons which we (FreeCalypso team)
cannot understand (perhaps miscommunication between different groups at TI),
TI's TCS211 reference firmware shipped with L1_NEW_AEC disabled
(C preprocessor symbol set to 0 instead of 1), even though the underlying DSP
AEC block (combination of ROM and official patches) is the "new" kind and not
the "old" one. There are two fallouts from this software misconfiguration on
TI's part:

1) If one takes stock TCS211 from TI or any derivative version in which this
   aspect is unchanged (all mokoN firmwares, and all FC firmwares up to
   Magnetite) and tries to enable AEC, the result will be a poor AEC
   configuration: the old echo level and long vs short settings do nothing on
   the new DSP, whereas the new tunable parameters will remain at their
   defaults with no way to tweak them. I (Mother Mychaela) can only guess
   that this situation is what Openmoko must have run into when they tried to
   get AEC working.
2) When someone downstream of TI figures out that L1_NEW_AEC needs to be
   changed from 0 to 1 and actually makes that change, like we did in our
   Tourmaline fw, the format and size of the audio mode binary structure
   change, and all old audio mode config files become invalid.

Our FreeCalypso work is affected by point 2 above: we started working with
audio mode config files in 2017, using the old AEC configuration, and only
made the switch to L1_NEW_AEC in 2021. We now have two kinds of audio mode
config binary files: the old kind that are 164 bytes long, and the new kind
that are 176 bytes long. Our Tourmaline firmware has L1_NEW_AEC enabled,
while Magnetite (our legacy backward compatibility fw) has it disabled.

To prevent loading of garbage into AEC config when an audio mode file of the
wrong kind is loaded, we have implemented the following workaround in both
Tourmaline and Magnetite: if the loaded mode config file has the wrong
length, the AEC config is set to the default disabled state instead of
whatever is in the mode file - loading an AEC config of the wrong format is
not possible.
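The length-based workaround just described can be modeled roughly as follows.
This is a Python sketch of the decision only, not the actual firmware code
(which is C); the function name and the returned strings are hypothetical,
and the sketch covers only the two known file sizes, treating anything else
as out of scope.

```python
OLD_AEC_FILE_SIZE = 164  # bytes: mode file with the old 5-word AEC config
NEW_AEC_FILE_SIZE = 176  # bytes: mode file with the L1_NEW_AEC 12-word config

def aec_config_source(file_size, fw_has_new_aec):
    """Model where the AEC config comes from when a mode file is loaded.

    Returns "from-file" when the file kind matches the firmware build, or
    "default-disabled" when the file is of the other kind.
    """
    if file_size not in (OLD_AEC_FILE_SIZE, NEW_AEC_FILE_SIZE):
        # Assumption: other sizes are outside the scope of this sketch.
        raise ValueError("not a known audio mode config file size")
    file_has_new_aec = (file_size == NEW_AEC_FILE_SIZE)
    if file_has_new_aec == fw_has_new_aec:
        return "from-file"
    return "default-disabled"
```

With this model, a 164 byte file loaded on Tourmaline (fw_has_new_aec True)
yields "default-disabled", while the same file on Magnetite yields
"from-file".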
Default audio configuration
===========================

The default audio config set in the Iota ABB registers and in the DSP when no
named audio mode config has been loaded with the audio_mode_load() API call
(accessible via AT@AUL or via fc-tmsh aul command) is as follows, in the
syntax which our tiaud-compile utility accepts as input and which our
tiaud-decomp utility emits as output:

voice-path 0
mic default {
    gain 3
    output-bias 0
    fir 0 0x4000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
    fir 8 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
    fir 16 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
    fir 24 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
}
speaker ear+aux {
    gain 0
    audio-filter 0
    fir 0 0x4000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
    fir 8 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
    fir 16 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
    fir 24 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
}
sidetone -5
aec 0 0 0 0 0

The above version is the one produced by Magnetite and earlier firmwares
without L1_NEW_AEC; in the new Tourmaline version the last line changes to:

aec-new 0 0 0x1 0x7FFF 0x1FFF 0x4000 0x32 0x1000 0x1000 0 0 0

The meaning is as follows:

* voice-path is the DSP digital voice path setting, 0 means the standard
  configuration with the voice channel going between GSM and the local analog
  voice hardware attached to the ABB.

* The default microphone input is used for the voice uplink (MICIN&MICIP
  pins), whereas the voice downlink is presented on both EARN&EARP and
  AUXON&AUXOP pins, i.e., both "ear" and "aux" VDL amplifiers are enabled.

* The microphone gain is 3 dB, the fine gain adjustment in the voice downlink
  path is 0 dB, and the sidetone gain is -5 dB.

* output-bias 0 under mic means that the MICBIAS voltage is set to 2.0 V.

* audio-filter 0 under speaker means that the VFBYP bit in the VBCTRL1
  register is NOT set, i.e., the normal configuration.
* DSP FIR filters do nothing, as coefficient 0 is set to unity and all other
  coefficients are set to zero.

* The AEC mechanism in the DSP is disabled, although the format of the bits
  that say so is different between old and new AEC versions. In the new
  version there are a number of tunable settings that only kick in when AEC
  is enabled; when AEC is disabled by default, these tunable knobs still have
  sensible defaults that aren't all zeros.

Creating your own audio mode configurations
===========================================

The input to our tiaud-compile utility can contain every setting shown in the
default case above, or any desired subset thereof. For any settings not
given in the input, the defaults from the above will be used, except that
tiaud-compile's current default for the speaker mode is just ear rather than
ear+aux. (It is a default which you should NOT depend on; set it explicitly
if it matters!) A few notes:

* For all settings given as numbers, the number given in the ASCII input is
  the number that goes into TI's binary structure, without any
  transformation, even in those cases where the result is counter-intuitive,
  such as "audio-filter 0" meaning that the filter is *enabled*.

* The 3 possible mode keywords for the mic mode are default, aux and headset,
  corresponding to MICIN&MICIP, AUXI and HSMICP analog inputs, respectively.

* The 5 possible mode keywords for the speaker mode are ear, aux, headset,
  buzzer and ear+aux. The buzzer speaker mode exists only on TI's Nausica
  ABB predating Iota, i.e., it won't work on any of the Calypso+Iota+Rita
  devices built or supported by FreeCalypso, but our tiaud-compile and
  tiaud-decomp utilities support it because it is nominally supported by TI's
  RiViera Audio Service and its binary data structure for audio mode
  configuration.

* When mic is set to aux, an additional mic setting called extra-gain becomes
  available.
  If extra-gain is set to 0, the AUXI gain will be set to 28.2 dB; if
  extra-gain is set to 1, the AUXI gain will be set to 4.6 dB; all other
  values will be considered invalid by the firmware.

* Each of the two FIR filters in the DSP (one for uplink, one for downlink)
  has a total of 31 coefficients, numbered 0 through 30, inclusive. In the
  ASCII input to tiaud-compile you can put each coefficient on its own fir
  line, put all 31 coefficients on the same line, or group them in any other
  way you like. The grouping used in the tiaud-decomp output has been chosen
  for line length reasons.

aec vs aec-new in tiaud-compile input
=====================================

tiaud-compile accepts both aec (old) and aec-new settings; aec must be
followed by 5 numbers, aec-new must be followed by 12 numbers. Each number
is a 16-bit value, and they go into the binary structure without further
interpretation by tiaud-compile - instead the firmware is the entity that
gives them meaning. Numbers without 0x prefix are interpreted as decimal.

tiaud-compile will generate one type or the other of the binary output file,
following these rules:

* If an aec setting is given, a 164 byte file will be produced, with the
  5 AEC words being the given ones.

* If an aec-new setting is given, a 176 byte file will be produced, with the
  12 AEC words being the given ones.

* If neither setting is given, a 164 byte file will be produced, with the
  5 AEC words of the old type being all zeros. Thanks to the modified audio
  mode loading code in our firmwares, these 164 byte mode files can still be
  used with current Tourmaline fw, with AEC set to its default disabled
  state.

New AEC parameter words
=======================

The 12 words that configure AEC of the L1_NEW_AEC flavor (appearing on the
aec-new line in tiaud-compile input or in an fc-tmsh auw 12 command) map as
follows:

Word 0: aec_enable

This word must be set to 0 to disable AEC or 2 to enable it.
This word is translated to a single DSP control bit by the Audio Service
layer, thus no other values must be written into it.

Word 1: continuous_filtering

This word is written directly into the DSP, and we have no documentation for
it beyond "enable (1) or disable (0) continuous mode filtering".

Word 2: granularity_attenuation

This word is written directly into the DSP, and we have no documentation for
it beyond "granularity of the smoothed attenuation".

Word 3: smoothing_coefficient

This word is written directly into the DSP, and we have no documentation for
it beyond "smoothing coefficient".

Word 4: max_echo_suppression_level

This word is written directly into the DSP; it is described as "maximum
attenuation level", and the following constants are defined for it:

#define AUDIO_MAX_ECHO_0dB   (0x7FFF)
#define AUDIO_MAX_ECHO_2dB   (0x65AA)
#define AUDIO_MAX_ECHO_3dB   (0x59AD)
#define AUDIO_MAX_ECHO_6dB   (0x4000)
#define AUDIO_MAX_ECHO_12dB  (0x1FFF)
#define AUDIO_MAX_ECHO_18dB  (0x0FFF)
#define AUDIO_MAX_ECHO_24dB  (0x07FF)

Word 5: vad_factor

This word is written directly into the DSP, and we have no documentation for
it beyond "VAD factor relative to the current estimated energy". VAD must
stand for "voice activity detector", but our knowledge ends here.

Word 6: absolute_threshold

This word is written directly into the DSP, and we have no documentation for
it beyond "VAD absolute offset relative to the current estimated energy".

Word 7: factor_asd_filtering

This word is written directly into the DSP, and we have no documentation for
it beyond "modifying factor of d_far_end_noise for filtering decision".

Word 8: factor_asd_muting

This word is written directly into the DSP, and we have no documentation for
it beyond "modifying factor of d_far_end_noise for muting decision".

Word 9: aec_visibility

This word must be set to 0 for normal operation or 0x200 for "AEC visibility"
debug mode.
This word is translated to a single L1 control bit by the Audio Service
layer, thus no other values must be written into it.

Word 10: noise_suppression_enable

This word must be set to 0 to disable SPENH algorithm or 4 to enable it.
This word is translated to a single DSP control bit by the Audio Service
layer, thus no other values must be written into it. We don't know what
this "speech enhancement" algorithm does, and whether or not it is the same
as "noise suppression".

Word 11: noise_suppression_level

This config word is mapped to just two bits in the actual DSP control word by
the Audio Service layer, thus there are only 4 possible valid values here:

#define AUDIO_NOISE_NO_LIMITATION  (0x0000)
#define AUDIO_NOISE_6dB            (0x0020)
#define AUDIO_NOISE_12dB           (0x0040)
#define AUDIO_NOISE_18dB           (0x0060)

Some known-good AEC configurations
==================================

The terse descriptions of parameter words given above unfortunately
constitute the total extent of our knowledge of the AEC block in our dear
Calypso DSP and its tuning parameters - we don't know anything more.
However, we do have 3 example configurations to look at: we have the default
values of the tuning parameters that appear to be initialized by the DSP
itself on boot, and we have two AEC-enabled configurations set by Pirelli
DP-L10 firmware: one for handheld and wired headset modes, the other for the
hands-free loudspeaker mode.
Here are the 12 parameter words in the 3 available configurations:

Parameter word              Default          Pirelli          Pirelli
                            (AEC disabled)   handheld         hands-free
--------------------------------------------------------------------------
aec_enable                  0 (off)          2 (on)           2 (on)
continuous_filtering        0 (off)          1 (on)           1 (on)
granularity_attenuation     0x0001           0x0014           0x0014
smoothing_coefficient       0x7FFF           0x0CCC           0x0CCC
max_echo_suppression_level  0x1FFF (12 dB)   0x59AD (3 dB)    0x0FFF (18 dB)
vad_factor                  0x4000           0x4000           0x4000
absolute_threshold          0x0032           0x0032           0x0032
factor_asd_filtering        0x1000           0x1000           0x1000
factor_asd_muting           0x1000           0x1000           0x1000
aec_visibility              0 (off)          0 (off)          0 (off)
noise_suppression_enable    0 (off)          4 (on)           4 (on)
noise_suppression_level     0 (none)         0 (none)         0x0060 (18 dB)

The following observations can be made:

* The 4 parameters vad_factor, absolute_threshold, factor_asd_filtering and
  factor_asd_muting remain unchanged between TI's DSP default and Pirelli's
  production configs. On the basis of this observation, I (Mother Mychaela)
  get the feeling that these four should be left alone.

* Besides the obvious steps of enabling AEC and SPENH, Pirelli did change
  continuous_filtering (from off to on), granularity_attenuation and
  smoothing_coefficient. Unfortunately, unless we recover the source code
  for our Calypso DSP ROM or some documents explaining this version of AEC in
  detail, we have no way of understanding what these parameters do, let alone
  evaluating the merits of Pirelli's change.

* max_echo_suppression_level and noise_suppression_level seem to be the two
  parameters most amenable to tuning.

* It is interesting to note that Pirelli's fw enables AEC not only in the
  loudspeaker mode, but also in the more basic handheld and wired headset
  modes. The two AEC configs differ only in max_echo_suppression_level and
  noise_suppression_level parameters, with the loudspeaker mode AEC config
  being more aggressive.
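To make the table easier to reuse, here is a small Python sketch that checks
a set of 12 words against the value restrictions stated in the parameter word
descriptions and formats it as an aec-new line. The helper name and the
all-hex output style are choices made for this illustration, not something
tiaud-compile or tiaud-decomp prescribes; the word values are copied from the
two Pirelli columns of the table.

```python
# Words whose valid values are restricted per the parameter descriptions.
RESTRICTED_WORDS = {
    0: {0, 2},                          # aec_enable
    9: {0, 0x200},                      # aec_visibility
    10: {0, 4},                         # noise_suppression_enable
    11: {0x0000, 0x0020, 0x0040, 0x0060},  # noise_suppression_level
}

PIRELLI_HANDHELD = (2, 1, 0x0014, 0x0CCC, 0x59AD, 0x4000,
                    0x0032, 0x1000, 0x1000, 0, 4, 0)
PIRELLI_HANDSFREE = (2, 1, 0x0014, 0x0CCC, 0x0FFF, 0x4000,
                     0x0032, 0x1000, 0x1000, 0, 4, 0x0060)

def aec_new_line(words):
    """Format 12 AEC words as an aec-new line (hypothetical helper)."""
    if len(words) != 12:
        raise ValueError("aec-new requires exactly 12 words")
    for index, word in enumerate(words):
        if not 0 <= word <= 0xFFFF:
            raise ValueError("word %d is not a 16-bit value" % index)
        allowed = RESTRICTED_WORDS.get(index)
        if allowed is not None and word not in allowed:
            raise ValueError("word %d has an invalid value" % index)
    return "aec-new " + " ".join("0x%04X" % word for word in words)

print(aec_new_line(PIRELLI_HANDHELD))
print(aec_new_line(PIRELLI_HANDSFREE))
```

Every number is emitted with a 0x prefix, which relies on tiaud-compile
treating only unprefixed numbers as decimal.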
Prior to seeing what Pirelli's fw does, my (Mychaela's) own thinking was that
AEC is only needed in loudspeaker configurations, not handheld or headset.
After seeing Pirelli's AEC configs, I reason that enabling a less aggressive
AEC configuration in those less echo-prone modes probably doesn't hurt - thus
until and unless we recover more documentation or other knowledge, the plan
for our own FreeCalypso Libre Dumbphone handset is to do what Pirelli does:
use a less aggressive AEC config in handheld and headset modes, and a more
aggressive one in the hands-free loudspeaker mode.

On our current FCDEV3B setup with a SparkFun COM-09151 loudspeaker and a CUI
CMC-9745-130T microphone, applying Pirelli's loudspeaker-mode AEC config
produces echo cancellation that sounds acceptable to our subjective human
evaluator on the far end of test calls.

FIR filter details
==================

Calypso DSP has two FIR filters in the voice paths, one in the uplink path
and one in the downlink path. Aside from their placement, the two FIR
filters are identical. Each FIR block has 31 taps (making a 30th order
filter), and each of the 31 coefficients is a 16-bit fixed-point number. The
fixed point format is F2.14 aka Q14: to get the real coefficient from the
physical 16 bits, treat the 16-bit datum as a two's complement signed
integer, then divide by 16384. Examples: 0x4000 means 1, 0x2000 means 0.5,
0xC000 means -1, 0xE000 means -0.5.

In principle you can set all 31 coefficients to whatever you like, but in
practice only two possible configurations are used:

* When the FIR filter is disabled (identity transform), coefficient 0 is set
  to 0x4000 (unity) and all other coefficients are set to 0. In this
  configuration the FIR block does not introduce any extra delay: all delayed
  samples are multiplied by 0 and thus produce no effect.
* When some non-identity frequency response transformation is desired, a
  linear phase filter is set up: coefficient #15 becomes the main tap
  (significantly greater in absolute value than all others) and all other
  coefficients mirror around it symmetrically: #0 equals #30, #1 equals #29
  and so forth, until #14 equals #16. This filter adds 1.875 ms of delay
  (15 sample times) to the voice path in which it is active, and an equal
  amount of "pre-ringing".

The presumed purpose of these two FIR filters (uplink and downlink) is to
flatten the frequency response of the speaker and microphone transducers, or
perhaps even more ambitiously, the frequency response of the modeled acoustic
environment. However, actually coming up with a good set of FIR filter
coefficients given a desired frequency response is a hard problem, one where
forward engineering is much more difficult than reverse.

When it comes to reverse engineering of existing Calypso DSP FIR filters, a
total of 7 specimens have been captured out in the wild so far: one downlink
FIR filter from Openmoko's non-functional para0.cfg (no way of knowing which
speaker it was once designed for), and a set of 6 filters extracted from
Pirelli DP-L10, 3 uplink and 3 downlink, corresponding to the 3 audio routing
modes supported on this phone model (handheld, hands-free and wired headset).
All 7 are linear phase filters as described above.

Analyzing the frequency response of a given already existing FIR filter is
easy: just use the fir2freq program in our freecalypso-reveng Hg repository.
OTOH, coming up with a new set of FIR filter coefficients for some desired
frequency response (e.g., for a new phone handset being designed) is a much
harder problem, one which we will probably have to outsource to a hired
DSP/FIR expert.

Calypso FIR support in FC host tools
------------------------------------

Uplink and downlink FIR filter coefficients can be included in the input to
tiaud-compile.
Each coefficient is given as the actual 16-bit word going into the DSP (Q14
scaling included), and can be specified either in hex or as a signed decimal
integer. We also have a dedicated ASCII file format for a FIR filter
coefficient set by itself, like this example:

fir-coeff-table
0x0178 0x0AB5 0xF43D 0xFED5 0xFCA7 0x04D8 0x00B8 0x0371
0x032F 0x0007 0x151C 0xF24C 0x19A6 0xE918 0xF7CD 0x7D0C
0xF7CD 0xE918 0x19A6 0xF24C 0x151C 0x0007 0x032F 0x0371
0x00B8 0x04D8 0xFCA7 0xFED5 0xF43D 0x0AB5 0x0178

(This example is the FIR filter extracted from Openmoko's non-functional
para0.cfg.)

This by-itself FIR filter coeff set format is accepted as input to the
auw-fir command in fc-tmsh (allowing experimental FIR filters to be uploaded
to a running Calypso device for testing) and to our fir2freq analysis
program. We will probably use the same format if and when we embark on a
venture to design our own FIR filters for our own handset hardware.

fc-tmsync aur and aur-all addition
==================================

New addition as of fc-host-tools-r16: our aur command which natively resides
in fc-tmsh (audio mode full access read operation via ETM) has also been
implemented in fc-tmsync for scripted usage. Furthermore, we also
implemented an aur-all command that issues the same sequence of aur
operations as the firmware's built-in audio_mode_save() and emits the output
on stdout in the same format as tiaud-decomp. The end effect is that
fc-tmsync aur-all is a much shorter and more direct way of obtaining exactly
the same result as would previously be obtained by saving the current audio
mode config with aus, reading out the resulting binary file with fc-fsio and
decoding it with tiaud-decomp.

The implementation of aur-all and the more elementary aur 12 command in
fc-tmsync works only with firmware versions that have L1_NEW_AEC enabled -
therefore, these commands work with FC Tourmaline but not Magnetite.
Furthermore, our aur command in both fc-tmsh and fc-tmsync and the new
fc-tmsync aur-all command also work against Pirelli's firmware - this alien
fw implements ETM aur operation exactly the same as standard TCS211, and it
has L1_NEW_AEC enabled, such that aur 12 returns the 24 byte long L1_NEW_AEC
version of T_AUDIO_AEC_CFG structure. The combination of this functionality
in Pirelli's fw and our fc-tmsync addition makes it possible to read out
Pirelli's highly tuned audio configurations in a very convenient manner, much
more convenient than reading ABB registers with abbr and reading DSP API
words with r16.

fc-audio-config repository
==========================

We have a separate repository for FC audio mode configurations, both complete
configs and individual config pieces (AEC and FIR):

https://www.freecalypso.org/hg/fc-audio-config/

These audio configuration bits are maintained in their own repository because
they are separate from the present FC host tools package and also separate
from our firmwares, evolving independently without strict synchronization
with these other components.
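For anyone inspecting the FIR pieces kept in that repository, the Q14
arithmetic and the symmetry property from the FIR filter details section can
be illustrated with a short Python sketch. The function names are
hypothetical and this is not how fir2freq is implemented; the coefficient
list is the Openmoko para0.cfg filter quoted earlier in this document, and
the symmetry test checks only the mirroring around coefficient #15, not the
relative size of the main tap.

```python
OPENMOKO_FIR = [
    0x0178, 0x0AB5, 0xF43D, 0xFED5, 0xFCA7, 0x04D8, 0x00B8, 0x0371,
    0x032F, 0x0007, 0x151C, 0xF24C, 0x19A6, 0xE918, 0xF7CD, 0x7D0C,
    0xF7CD, 0xE918, 0x19A6, 0xF24C, 0x151C, 0x0007, 0x032F, 0x0371,
    0x00B8, 0x04D8, 0xFCA7, 0xFED5, 0xF43D, 0x0AB5, 0x0178,
]

def q14_to_float(word):
    """Convert one 16-bit Q14 coefficient word to its real value."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("coefficient must be a 16-bit value")
    # Reinterpret as two's complement signed, then scale by 1/16384.
    signed = word - 0x10000 if word & 0x8000 else word
    return signed / 16384.0

def is_identity(coeffs):
    """True if coefficient 0 is unity (0x4000) and all others are zero."""
    return coeffs[0] == 0x4000 and all(c == 0 for c in coeffs[1:])

def is_symmetric(coeffs):
    """True if the 31 coefficients mirror around coefficient #15."""
    return len(coeffs) == 31 and list(coeffs) == list(coeffs)[::-1]

print([q14_to_float(w) for w in (0x4000, 0x2000, 0xC000, 0xE000)])
print(is_symmetric(OPENMOKO_FIR), is_identity(OPENMOKO_FIR))
```

The first print reproduces the four worked examples from the FIR section
(1, 0.5, -1 and -0.5).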