# HG changeset patch
# User Mychaela Falconia
# Date 1728089881 0
# Node ID d9f6b31252594ec0313b649f330eb73e1cb8aa0d
# Parent 583dc4cbee953462e3a89911a9869afc13c86e4f
document TW-TS-005 utilities

diff -r 583dc4cbee95 -r d9f6b3125259 doc/TW-TS-005
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/TW-TS-005	Sat Oct 05 00:58:01 2024 +0000
@@ -0,0 +1,81 @@
+The original set of Themyscira Wireless utilities for FR and EFR codecs uses an
+ad hoc binary file format to represent streams of FR or EFR codec frames - see
+Binary-file-format article. However, a newer hexadecimal format has now been
+standardized as Themyscira Wireless Technical Specification TW-TS-005:
+
+https://www.freecalypso.org/specs/tw-ts-005-v010003.txt
+
+The standard has two annexes intended for practical use:
+
+* TW-TS-005 Annex A defines a representation format for FR and EFR codecs;
+* TW-TS-005 Annex B defines a representation format for HR codec.
+
+The present version of ThemWi GSM codec libraries & utilities suite includes
+some utilities that operate on TW-TS-005 Annex A hex files; support for Annex B
+will appear in a future version when our work on GSM-HR codec integration
+progresses further.
+
+TW-TS-005 Annex A vs gsmx binary format
+=======================================
+
+For working with FR and EFR codecs, our original binary file format has one
+major defect: it cannot represent bad traffic frames (in GSM 06.31 & 06.81
+definition, i.e., BFI=1) that have payload data bits included, as happens in
+well-designed GSM networks that use GSM 08.60 TRAU-UL frames or TW-TS-001
+enhanced RTP transport. This file format deficiency leads to the following
+downstream defects:
+
+* The combination of "bad traffic frame" and "accepted SID frame" (again,
+  GSM 06.31 & 06.81 terminology) gets incorrectly treated as "unusable frame"
+  rather than "invalid SID frame" as the specs decree.
+
+* In the case of EFR, the reference decoder C code that forms the basis for
+  Themyscira libgsmefr makes use of "fixed codebook excitation pulses" portion
+  of bad frames during speech (as opposed to comfort noise) state - but these
+  bits were lost to file format shortcoming.
+
+The new hexadecimal format of TW-TS-005 Annex A solves this shortcoming: each
+frame is stored as a hex line that directly corresponds to a single RTP payload,
+hence the full capabilities of TW-TS-001 extended RTP format are made available
+in a file at rest.
+
+Because we have so many existing utilities that read and write gsmx binary
+files, and this binary format is so entrenched in Themyscira development
+environment, we are not doing a "forklift" migration of all of our tools to the
+new format. Instead we are taking a more tempered approach:
+
+* For the decoding operation (taking a frame stream from an Rx Radio Subsystem
+  and producing linear PCM output) that is most affected by the shortcomings of
+  gsmx format, we have new utilities that read TW-TS-005 Annex A input, while
+  the old gsmx-reading utilities are still preserved and maintained;
+
+* For most other workflows (for example, encoding of new speech) conversion
+  utilities between the two formats (described below) are deemed sufficient;
+
+* New developments such as TFO transform use TW-TS-005 Annex A format natively.
+
+Human-readable dump decoding of TW-TS-005 hex files
+===================================================
+
+A line-based hexadecimal file format with one line per stored codec frame is
+inherently more human-readable than a binary file, but we also desire a more
+complete decoding such as that produced by gsmrec-dump, showing all codec
+parameters and frame metadata flags. tw5a-dump produces such decoding for
+TW-TS-005 Annex A hex files; there will also be a corresponding tw5b-dump
+utility for TW-TS-005 Annex B when we finish integrating GSM-HR codec support.
+
+Conversion utilities (FR and EFR codecs)
+========================================
+
+gsmx-to-tw5a and tw5a-to-gsmx utilities do what their names suggest: convert
+FR/EFR speech recordings or test sequences between gsmx (binary) and TW-TS-005
+Annex A (hex) formats. Important semantic notes:
+
+* gsmx-to-tw5a emits basic RTP format (no TEH) for all good frames, while each
+  BFI marker record is converted to a TEH-only No_Data frame.
+
+* tw5a-to-gsmx is the lossy conversion: distinction between basic and extended
+  RTP formats is lost, ditto for TAF without BFI, all BFIs become BFI-no-data.
+
+A conversion from gsmx to tw5a back to gsmx is lossless, but not the other way
+around.
diff -r 583dc4cbee95 -r d9f6b3125259 doc/Utils-overview
--- a/doc/Utils-overview	Fri Oct 04 20:40:42 2024 +0000
+++ b/doc/Utils-overview	Sat Oct 05 00:58:01 2024 +0000
@@ -68,6 +68,8 @@
 
 gsmrec-dump	See Binary-file-format article.
 
+gsmx-to-tw5a	See TW-TS-005 article.
+
 pcm16-check13	This program reads a 16-bit linear PCM recording file (raw BE by
 		default, or raw LE with -l option) and checks if the 3 least
 		significant bits of every sample are all
@@ -84,6 +86,9 @@
 pcm16-to-ulaw
 pcm8-to-pcm16
 
+tw5a-dump	See TW-TS-005 article.
+tw5a-to-gsmx
+
 twamr-decode	See Codec-utils article.
 twamr-decode-r
 twamr-encode