view doc/Codec-utils @ 487:cd1f0fa936cc
author: Mychaela Falconia <falcon@freecalypso.org>
date: Mon, 20 May 2024 21:53:11 +0000
Standalone command line utilities for FR and EFR codecs
=======================================================

The pre-existing FOSS opencore-amr package includes amrnb-enc and amrnb-dec
test programs: the first reads linear PCM from a WAV file and emits AMR encoder
output in a .amr file (RFC 4867 AMR storage format); the second reads this .amr
format and emits AMR decoder output as WAV. Inspired by these simple test
programs, the present package offers equivalent command line utilities for
GSM FR and EFR codecs. Here they are:

gsmfr-encode

This utility reads linear PCM from a WAV file, runs the bit-exact GSM 06.10
encoder and writes the output in the classic .gsm format (directly abutted
FR codec frames of 33 bytes each). We don't currently have a Tx-side DTX
implementation (VAD etc.) for GSM-FR, hence the output from gsmfr-encode will
always consist of good speech frames only.

gsmfr-decode

This utility reads our gsmx format (see the Binary-file-format article), which
is a superset of the classic libgsm format. The input to gsmfr-decode may be
a pure .gsm recording as produced by gsmfr-encode or by toast from the libgsm
package, or it can also contain SID frames and/or BFI markers. The processing
performed by gsmfr-decode begins with our FR1 Rx DTX handler preprocessor,
which will be an identity transform for pure .gsm input (most of the time) but
becomes important for real-world input containing SIDs and BFIs, and is
followed by the bit-exact GSM 06.10 decoder. The decoded output is written as
WAV.

gsmefr-encode

This utility reads linear PCM from a WAV file, runs our EFR encoder (Themyscira
libgsmefr) and writes the output in our gsmx format. There is an option to
enable or disable DTX: -d enables DTX, otherwise it is disabled. (This option
mirrors amrnb-enc.)

gsmefr-decode

This utility reads our gsmx format (which must be EFR, not FR1) and feeds all
frames and BFIs to our EFR decoder. The decoded output is written as WAV.

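To make the classic .gsm framing concrete, here is a minimal Python sketch, not
part of this package, that splits such a file into its 33-byte frames. The
33-byte frame length comes from the gsmfr-encode description; the check that
the upper nibble of each frame's first byte equals 0xD follows the libgsm
framing convention, and the 20 ms frame duration (160 samples at 8000
samples/s) is the standard GSM FR figure. The function names are hypothetical.
The sketch applies only to pure .gsm files and does not attempt to handle the
BFI markers that gsmx adds.

```python
FR_FRAME_LEN = 33        # bytes per GSM FR frame in the classic .gsm format
FR_SIGNATURE = 0xD       # upper nibble of byte 0 in libgsm framing
SAMPLES_PER_FRAME = 160  # one frame covers 20 ms at 8000 samples/s
SAMPLE_RATE = 8000


def split_gsm_frames(data: bytes) -> list:
    """Split the contents of a pure .gsm file into 33-byte frames.

    Raises ValueError if the length is not a whole number of frames or
    if any frame lacks the 0xD signature nibble.
    """
    if len(data) % FR_FRAME_LEN != 0:
        raise ValueError("length is not a multiple of 33 bytes")
    frames = []
    for off in range(0, len(data), FR_FRAME_LEN):
        frame = data[off:off + FR_FRAME_LEN]
        if frame[0] >> 4 != FR_SIGNATURE:
            raise ValueError(f"bad signature nibble at offset {off}")
        frames.append(frame)
    return frames


def duration_seconds(frame_count: int) -> float:
    """Return the speech duration represented by frame_count FR frames."""
    return frame_count * SAMPLES_PER_FRAME / SAMPLE_RATE
```

For a file produced by gsmfr-encode, every frame would be expected to pass the
signature check, and the frame count times 20 ms gives the recording length.
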
The utilities above are the original programs that read WAV input for encoding
and write WAV output from decoding. We now also have raw versions that read and
write our "robe" (raw big-endian) format instead:

gsmfr-encode-r   Just like gsm[e]fr-encode, but reading "robe" instead of WAV.
gsmefr-encode-r

gsmfr-decode-r   Just like gsm[e]fr-decode, but writing "robe" instead of WAV.
gsmefr-decode-r

Please see the PCM-file-formats article for the rationale.

Additions for libgsmfr2
=======================

With the introduction of libgsmfr2, gsmfr-* codec utilities have undergone some
changes:

* gsmfr-decode and gsmfr-decode-r now implement the optional decoder homing
  feature, detecting and acting upon GSM 06.10 decoder homing frames.

* gsmfr-encode-r takes an optional -h flag that enables the encoder homing
  function; it is disabled by default. The same feature was not replicated in
  WAV-reading gsmfr-encode, as WAV format is poorly suited for
  tinkering-oriented bit-exact work.

* There is a new utility named gsmfr-decode-rb, where rb stands for "raw
  basic". This utility emits "robe" output like gsmfr-decode-r, but it performs
  only "basic" GSM 06.10 decoding, without the Rx DTX preprocessor step. BFI
  frame gaps in input are not allowed, and there is no SID or DHF detection.

Standalone command line utilities for AMR codec
===============================================

As described above, gsm[e]fr-encode and gsm[e]fr-decode were modeled after
amrnb-enc and amrnb-dec from opencore-amr, a piece of pre-existing FOSS.
However, now that we have libtwamr, a Themyscira library for AMR codec that is
designed to serve as a replacement for libopencore-amrnb in our workflows
involving AMR, we also have our own twamr-encode and twamr-decode utilities
that directly replace amrnb-enc and amrnb-dec.

twamr-encode is a functional replacement for amrnb-enc: it reads 16-bit linear
PCM speech input from a WAV file and writes the AMR encoder output in a .amr
file (RFC 4867 storage format).
However, there is a difference in the command line structure and a small
difference in operation. The command line structure of twamr-encode is as
follows:

	twamr-encode [-d] [-2] input.wav mode output.amr

The middle argument specifies the codec mode to be used; there is no default.
Ordinarily the mode argument is one of these 8 keywords:

	MR475 MR515 MR59 MR67 MR74 MR795 MR102 MR122

However, this mode argument can also take the form of "file:$modefile", where
$modefile is an ASCII text file giving one of the above mode keywords per line.
This form is not likely to be useful in casual twamr-encode usage, but it
exists for the sake of symmetry with the twamr-tseq-enc program used for
verification testing with the official test sequences from 3GPP.

Aside from this difference in the command line structure, the small functional
difference between amrnb-enc and twamr-encode is that libopencore-amrnb (the
engine underlying amrnb-enc) omits the codec homing feature, whereas libtwamr
(the engine underlying twamr-encode) implements the homing feature as a
mandatory part of the codec definition per 3GPP specs.

twamr-encode flag options: -d enables DTX, -2 switches the VAD algorithm from
the default VAD1 to the alternative VAD2. The two options can be combined as
-d2.

twamr-decode is a more straightforward replacement for amrnb-dec, with this
simple command line structure:

	twamr-decode input.amr output.wav

The functional difference from amrnb-dec is once again in the codec homing
feature: present in libtwamr and hence twamr-decode, but absent in
libopencore-amrnb and hence amrnb-dec.

Finally, for the sake of completeness and symmetry with the other supported
codecs, the present suite includes twamr-encode-r and twamr-decode-r utilities.
They function just like twamr-encode and twamr-decode, with the same command
line structure, but the file format for 16-bit linear PCM speech is "robe"
instead of WAV.
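
For readers who want to inspect a .amr file such as twamr-encode writes, the
following Python sketch (an illustration only, not part of this suite) walks a
single-channel file in RFC 4867 storage format and counts frames by type. The
magic string, the position of the frame type field in each frame header octet
and the per-type payload sizes follow RFC 4867 and the AMR frame definitions;
the names AMR_MAGIC, FRAME_TYPES and count_amr_frames are hypothetical.

```python
from collections import Counter

AMR_MAGIC = b"#!AMR\n"  # single-channel AMR storage format header

# Frame type index -> (name, payload octets following the 1-octet header).
FRAME_TYPES = {
    0: ("MR475", 12), 1: ("MR515", 13), 2: ("MR59", 15), 3: ("MR67", 17),
    4: ("MR74", 19), 5: ("MR795", 20), 6: ("MR102", 26), 7: ("MR122", 31),
    8: ("SID", 5), 15: ("NO_DATA", 0),
}


def count_amr_frames(data: bytes) -> Counter:
    """Count the frames of each type in the contents of a .amr file.

    Raises ValueError on a missing magic string, an unsupported frame
    type or a truncated final frame.
    """
    if not data.startswith(AMR_MAGIC):
        raise ValueError("missing #!AMR magic")
    counts = Counter()
    pos = len(AMR_MAGIC)
    while pos < len(data):
        # Header octet layout: P | FT (4 bits) | Q | P | P
        ft = (data[pos] >> 3) & 0x0F
        if ft not in FRAME_TYPES:
            raise ValueError(f"unsupported frame type {ft} at offset {pos}")
        name, size = FRAME_TYPES[ft]
        if pos + 1 + size > len(data):
            raise ValueError(f"truncated frame at offset {pos}")
        counts[name] += 1
        pos += 1 + size
    return counts
```

When a single mode keyword is given and DTX is disabled, one would expect every
frame to carry that mode; with -d, SID and NO_DATA entries would be expected to
appear as well.
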