comparison doc/TW-TS-005 @ 549:d9f6b3125259

document TW-TS-005 utilities
author Mychaela Falconia <falcon@freecalypso.org>
date Sat, 05 Oct 2024 00:58:01 +0000
The original set of Themyscira Wireless utilities for FR and EFR codecs uses an
ad hoc binary file format to represent streams of FR or EFR codec frames - see
the Binary-file-format article. However, a newer hexadecimal format has now
been standardized as Themyscira Wireless Technical Specification TW-TS-005:

https://www.freecalypso.org/specs/tw-ts-005-v010003.txt

The standard has two annexes intended for practical use:

* TW-TS-005 Annex A defines a representation format for FR and EFR codecs;
* TW-TS-005 Annex B defines a representation format for the HR codec.

The present version of the ThemWi GSM codec libraries & utilities suite includes
some utilities that operate on TW-TS-005 Annex A hex files; support for Annex B
will appear in a future version when our work on GSM-HR codec integration
progresses further.

TW-TS-005 Annex A vs gsmx binary format
=======================================

For working with FR and EFR codecs, our original binary file format has one
major defect: it cannot represent bad traffic frames (in the GSM 06.31 & 06.81
definition, i.e., BFI=1) that have payload data bits included, as happens in
well-designed GSM networks that use GSM 08.60 TRAU-UL frames or TW-TS-001
enhanced RTP transport. This file format deficiency leads to the following
downstream defects:

* The combination of "bad traffic frame" and "accepted SID frame" (again,
  GSM 06.31 & 06.81 terminology) gets incorrectly treated as "unusable frame"
  rather than "invalid SID frame" as the specs decree.

* In the case of EFR, the reference decoder C code that forms the basis for
  Themyscira libgsmefr makes use of the "fixed codebook excitation pulses"
  portion of bad frames during speech (as opposed to comfort noise) state -
  but these bits were lost to the file format's shortcoming.
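The first defect above can be illustrated with a small sketch of Rx frame
classification. This is not code from our libraries: the function names are
hypothetical, and the ternary SID flag encoding (0 = not SID, 1 = partially
matching SID code word, 2 = accepted SID frame) is our reading of GSM 06.31 and
should be checked against the spec. The only case confirmed by the text above
is BFI=1 combined with an accepted SID frame.

```python
def classify_rx_frame(bfi: int, sid: int) -> str:
    """Classify a received traffic frame from its BFI and SID flags.

    bfi: 0 or 1.  sid: 0, 1 or 2 (assumed encoding, see lead-in).
    """
    if bfi == 0:
        return {0: "good speech frame",
                1: "invalid SID frame",
                2: "valid SID frame"}[sid]
    # Bad traffic frame: any SID indication makes it an invalid SID frame,
    # otherwise the frame is unusable.
    return "unusable frame" if sid == 0 else "invalid SID frame"


def classify_after_gsmx(bfi: int, sid: int) -> str:
    """Same classification after a pass through a payload-less BFI marker.

    With the payload bits gone, the SID code word can no longer be checked,
    so every bad frame looks like SID=0.
    """
    if bfi:
        sid = 0
    return classify_rx_frame(bfi, sid)


print(classify_rx_frame(1, 2))    # payload bits available
print(classify_after_gsmx(1, 2))  # payload bits dropped by the file format
```

The two calls at the end classify the same received frame differently, which
is the misclassification described in the first bullet.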

The new hexadecimal format of TW-TS-005 Annex A solves this shortcoming: each
frame is stored as a hex line that directly corresponds to a single RTP payload,
hence the full capabilities of the TW-TS-001 extended RTP format are made
available in a file at rest.
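As a rough illustration of "one hex line = one RTP payload", the following
sketch decodes a single line and guesses what it carries from its length and
signature nibble. The 33-byte FR and 31-byte EFR payload sizes and their 0xD and
0xC signatures come from the standard RTP payload formats; the one-octet TEH
length is an assumption based on our reading of TW-TS-001, and the function
names are hypothetical. Other line types that the full spec may define are not
handled here.

```python
FR_LEN = 33    # FR RTP payload, upper nibble of first octet is 0xD
EFR_LEN = 31   # EFR RTP payload, upper nibble of first octet is 0xC
TEH_LEN = 1    # assumption: TEH is a single octet ahead of the codec frame


def decode_hex_line(line: str) -> bytes:
    """Turn one hex line into the RTP payload octets it represents."""
    return bytes.fromhex(line.strip())


def classify_payload(payload: bytes) -> str:
    """Guess codec and RTP format variant from length and signature."""
    if len(payload) == TEH_LEN:
        return "TEH-only"
    for has_teh in (False, True):
        body = payload[TEH_LEN:] if has_teh else payload
        variant = "with TEH" if has_teh else "basic RTP"
        if len(body) == FR_LEN and body[0] >> 4 == 0xD:
            return "FR, " + variant
        if len(body) == EFR_LEN and body[0] >> 4 == 0xC:
            return "EFR, " + variant
    return "unrecognized"


# Synthetic lines for demonstration only, not real codec frames.
fr_line = "D" + "0" * 65 + "\n"
print(classify_payload(decode_hex_line(fr_line)))
```

A real reader would also validate the TEH contents rather than trusting length
alone; this sketch only shows how little structure sits between the file and
the RTP payload.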

Because we have so many existing utilities that read and write gsmx binary
files, and this binary format is so entrenched in the Themyscira development
environment, we are not doing a "forklift" migration of all of our tools to the
new format. Instead we are taking a more tempered approach:

* For the decoding operation (taking a frame stream from an Rx Radio Subsystem
  and producing linear PCM output) that is most affected by the shortcomings of
  gsmx format, we have new utilities that read TW-TS-005 Annex A input, while
  the old gsmx-reading utilities are still preserved and maintained;

* For most other workflows (for example, encoding of new speech), conversion
  utilities between the two formats (described below) are deemed sufficient;

* New developments such as the TFO transform use TW-TS-005 Annex A format
  natively.

Human-readable dump decoding of TW-TS-005 hex files
===================================================

A line-based hexadecimal file format with one line per stored codec frame is
inherently more human-readable than a binary file, but we also desire a more
complete decoding such as that produced by gsmrec-dump, showing all codec
parameters and frame metadata flags. tw5a-dump produces such a decoding for
TW-TS-005 Annex A hex files; there will also be a corresponding tw5b-dump
utility for TW-TS-005 Annex B when we finish integrating GSM-HR codec support.

Conversion utilities (FR and EFR codecs)
========================================

gsmx-to-tw5a and tw5a-to-gsmx utilities do what their names suggest: convert
FR/EFR speech recordings or test sequences between gsmx (binary) and TW-TS-005
Annex A (hex) formats. Important semantic notes:

* gsmx-to-tw5a emits basic RTP format (no TEH) for all good frames, while each
  BFI marker record is converted to a TEH-only No_Data frame.

* tw5a-to-gsmx is the lossy conversion: the distinction between basic and
  extended RTP formats is lost, as is TAF without BFI, and all BFIs become
  BFI-no-data.

A round trip from gsmx to tw5a and back to gsmx is lossless, but the reverse
round trip (tw5a to gsmx and back to tw5a) is not.
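The round-trip asymmetry can be seen in a toy model of the two conversions.
This is only a sketch of the semantics listed above, not the actual code of
gsmx-to-tw5a or tw5a-to-gsmx: the record types and field names are
hypothetical, and we assume that a gsmx BFI marker carries a TAF flag and
nothing else.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GsmxRecord:
    payload: Optional[bytes]   # None means a BFI marker record
    taf: bool = False          # assumed to exist only on BFI markers


@dataclass(frozen=True)
class Tw5aFrame:
    payload: Optional[bytes]   # None means TEH-only (No_Data)
    has_teh: bool = False
    bfi: bool = False
    taf: bool = False


def gsmx_to_tw5a(rec: GsmxRecord) -> Tw5aFrame:
    if rec.payload is None:
        # BFI marker becomes a TEH-only No_Data frame
        return Tw5aFrame(None, has_teh=True, bfi=True, taf=rec.taf)
    # good frame becomes basic RTP format, no TEH
    return Tw5aFrame(rec.payload)


def tw5a_to_gsmx(frame: Tw5aFrame) -> GsmxRecord:
    if frame.bfi or frame.payload is None:
        # every BFI becomes BFI-no-data: payload bits are dropped
        return GsmxRecord(None, taf=frame.taf)
    # TEH presence and TAF-without-BFI are dropped
    return GsmxRecord(frame.payload)


good = GsmxRecord(bytes([0xD0]) + bytes(32))
marker = GsmxRecord(None, taf=True)
for rec in (good, marker):
    assert tw5a_to_gsmx(gsmx_to_tw5a(rec)) == rec   # lossless direction

bad_with_bits = Tw5aFrame(bytes([0xD0]) + bytes(32), has_teh=True, bfi=True)
assert gsmx_to_tw5a(tw5a_to_gsmx(bad_with_bits)) != bad_with_bits  # lossy
```

The last assertion shows the lossy direction: a bad frame that still had its
payload bits comes back as a TEH-only No_Data frame.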