comparison doc/RTP-TRAUlike-format @ 207:185225722714

doc: new extended RTP format
author Mychaela Falconia <falcon@freecalypso.org>
date Thu, 06 Apr 2023 21:30:33 -0800
parents
children f0b90591f67c
206:c76f42e0cd3f 207:185225722714
1 TRAU-UL-like RTP transport format for FR & EFR codecs
2 =====================================================
3
4 The generally accepted industry standard format for RTP transport of FR and EFR
5 codec frames in an IP-based GSM RAN is given in ETSI TS 101 318; the same format
6 is also codified in IETF RFC 3551. However, when compared to the classic
7 TRAU-UL format of 3GPP TS 48.060, the standard RTP format of RFC 3551 exhibits
8 the following two shortcomings:
9
10 1) no way to indicate a BFI condition and still send frame data bits;
11 2) no way to transport the Time Alignment Flag (TAF).
12
13 Both of these shortcomings will be explained in detail further in this document;
14 however, the primary purpose of this document is to propose a new, regrettably
15 non-standard, RTP transport format for FR & EFR codecs, for use only within a
16 GSM RAN and the immediately attached CN transcoder ("soft TRAU"), that provides
17 the same functionality as the classic TRAU-UL format of TS 48.060, but is
18 carried over RTP in IP rather than a 16 kbps TDM subchannel.
19
20 The non-standard RTP transport format presented in this document is implemented
21 in OsmoBTS on a private feature branch:
22
23 https://cgit.osmocom.org/osmo-bts/log/?h=falconia/rtp_traulike
24
25 OsmoBTS versions that include this code always accept TRAUlike FR/EFR packets
26 on their RTP input, following the principle of being liberal in what you accept
27 while being conservative in what you send, but emit such packets on their RTP
28 output only when this non-default vty config option is given:
29
30 rtp fr-efr-traulike
31
32 The recently added (mainline) "rtp continuous-streaming" vty config option also
33 needs to be enabled.
34
35 The present document serves as the formal specification for the TRAUlike RTP
36 transport format for FR and EFR.
37
38 Detailed description of shortcomings of standard RTP transport for FR & EFR
39 ===========================================================================
40
41 These shortcomings are solved in the TRAUlike RTP transport format defined in
42 this document; understanding these shortcomings provides the essential rationale
43 for TRAUlike RTP.
44
45 Indicating BFI along with data bits
46 -----------------------------------
47
48 In standard RTP (for FR/EFR), a BFI condition can be indicated in only two ways:
49 by sending no packet at all in the 20 ms window in question (industry standard
50 behavior and OsmoBTS default) or by sending an RTP packet with a zero-length payload
51 ("rtp continuous-streaming" option in OsmoBTS). The latter option provides a
52 timing tick for a CN-attached transcoder relying on the BTS-originating RTP
53 stream as its timing source, but there is still no way to send a frame of
54 marked-erroneous data bits. Contrast with TS 48.060 TRAU-UL format: in this
55 format the Dn bits carrying FR or EFR frame bits and the C12 bit carrying BFI
56 are orthogonal.
57
58 Why would one care about known-bad or deemed-to-be-bad frame data bits? They
59 do matter at least in the case of EFR: the official reference C-source EFR
60 decoder from ETSI makes use of the "fixed codebook excitation pulses" portion
61 of its EFR frame bits input (140 bits out of 244) even when BFI=1. This
62 portion of the reference C-source behavior is declared to be a non-normative example
63 by the text of the GSM 06.61 spec; thus there may be other compliant EFR decoder
64 implementations that never look at marked-erroneous data bits - but given the
65 ease of simply using the C code from ETSI as-is, or recoding it more efficiently
66 but keeping unchanged all bit-exact algorithms, including non-normative ones,
67 we should expect that the behavior of ETSI reference code is retained in many
68 production implementations and deployments.
69
70 Consider the case where a traditional E1-based BTS with a classic TRAU interface
71 is attached to an IP-based Osmocom RAN by way of OsmoMGW, and the resulting RTP
72 stream then (after passing through another OsmoMGW instance at the MSC) goes to
73 a "soft TRAU" transcoder (TC) in the CN. The TC will feed its RTP input to FR
74 and EFR decoders, and at least the EFR decoder makes use of "fixed codebook
75 excitation pulses" bits from erroneous frames. Furthermore, the TC may
76 implement in-band TFO (3GPP TS 28.062) inside its G.711 RTP output, in which
77 case it will need to insert a slightly modified TRAU-UL frame into that output.
78 The bits that would ideally be fed to the ETSI EFR decoder and emitted to the
79 outside world in TFO frames already exist at the output of the E1-based BTS,
80 but they get lost in the RTP transport when the industry standard RTP payload
81 format is used.
82
83 Consider another case where OsmoBTS does have an FR/EFR traffic frame that
84 could potentially be sent out, but it is suppressed by the
85 (tch_ind->lqual_cb >= bts->min_qual_norm) check in l1sap_tch_ind() in
86 src/common/l1sap.c. In this case it would be ideal to send out that frame
87 along with a BFI=1 indication, if the RTP transport format allowed such a
88 representation.
89
90 Lack of TAF bit in standard RTP transport
91 -----------------------------------------
92
93 The TRAU-UL frame format of TS 48.060 for FR and EFR includes a bit called TAF,
94 for Time Alignment Flag. Per the specs (TS 48.060 refers to TS 46.031 for
95 definition and coding of frame indicators) this bit shall be set to 1 in one
96 particular position in the 480 ms SACCH multiframe (the particular 20 ms frame
97 position in which a valid frame is always transmitted, even during DTX pauses)
98 and set to 0 in all other frames. This flag factors into the Rx DTX handler
99 logic prescribed in GSM 06.31 and 06.81 specs for FR and EFR, respectively, and
100 there exist production decoders for these codecs that implement their Rx DTX
101 handler function exactly to the letter of the specs, including the use of TAF
102 bit when deciding what to do with a BFI=1 frame received in the comfort noise
103 generation state. (These spec-compliant decoders include the reference ETSI
104 C-source decoder for EFR and Themyscira libgsmfrp for FR.)
105
106 This TAF bit does not exist in the standard RTP transport for FR & EFR. The
107 lack of this TAF bit causes the following problems for the CN-attached "soft
108 TRAU" transcoder:
109
110 1) The ability to implement spec-compliant handling of GSM 06.11 or 06.61
111 section 5.4 requirement (same section in both specs) is lost;
112
113 2) The TC won't know when to set the TAF bit in its outgoing TFO frames, if it
114 implements in-band TFO per 3GPP TS 28.062.
115
116 The TFO problem is particularly concerning because these TFO frames are emitted
117 to the outside world, outside of administrative and technical control of the
118 party implementing the Osmocom-based GSM network and the TC at its edge. The
119 resulting G.711 octet stream with TFO frames embedded inside can be carried
120 half-way around the world by the international toll telephone network, and there
121 is no telling what kind of implementation may be receiving and decoding these
122 bits on the other end. For this reason, "poor man's" workarounds in the
123 RTP-fed, TFO-generating TC are very unattractive:
124
125 * If the TC were to set TAF=0 in all TFO frames it generates, the receiver's
126 expectation of seeing TAF=1 in every 24th frame would be violated.
127
128 * If the TC were to arbitrarily set TAF=1 in every 24th frame by its own
129 free-running count, without knowledge of the actual SACCH alignment in the original
130 GSM call leg, these TAF-marked frames would not coincide with those frame
131 positions where the MS sends its SID frames, and the resulting TFO frame
132 stream would be invalid to the receiving Rx DTX handler on the far end.
133
134 The knowledge of which frames need to be marked with TAF=1 exists inside the
135 entity that generates the FR/EFR RTP stream: if this entity is a converter from
136 E1-based Abis to RTP, the TRAU-UL frames from the BTS contain this TAF bit, and
137 if the RTP-generating entity is a native IP BTS, it knows the frame number for
138 which it generates each RTP packet. The only problem is that there is no place
139 to insert this TAF bit in the standard RTP transport format of TS 101 318.
140
141 Why TRAU-UL and not TRAU-DL
142 ===========================
143
144 The present document argues the case that the industry standard RTP transport
145 format for FR & EFR is functionally crippled compared to the TRAU-UL transport
146 format of 3GPP TS 48.060, and defines an alternative RTP transport format that
147 can be used by those who desire TRAU-UL-like functionality badly enough to
148 accept the price of going totally non-standard in their IP RAN transport. The
149 new RTP transport format defined in this document explicitly mimics the
150 functionality and semantics of TS 48.060 TRAU-UL for FR and EFR.
151
152 At this point a reader may reasonably ask: why TRAU-UL and not TRAU-DL? The
153 answer is TFO: 3GPP TS 28.062 and its predecessor GSM 08.62 define the TFO frame
154 format as being based on TRAU-UL frames with only a few bits changed, and no
155 change in semantics of any of the frame indicator bits of TRAU-UL (C12 through
156 C17). Whereas the Abis interface is inherently asymmetric (TRAU-UL frames in
157 one direction, TRAU-DL frames in the other direction), end-to-end TFO is
158 directionally symmetric. If we imagine a TFO call between Alice in America and
159 Bob in Britain, there will be TRAU-UL frames flowing in both directions of the
160 trans-oceanic G.711 toll connection, one set coming almost unchanged from
161 Alice's BTS CCU and the other coming almost unchanged from Bob's BTS CCU. Of
162 course each party's GSM call DL will require TRAU-DL frames to be fed to it,
163 not TRAU-UL, but the necessary UL-to-DL conversion is the responsibility of the
164 TFO receiver on each end.
165
166 The general rules for turning a TRAU-UL frame into one for TRAU-DL are specified
167 in TS 28.062 section C.3.2.1.1; it should be noted that this section spells out
168 the requirements of what the UL-to-DL converter must do, but does not specify
169 exactly how to do it algorithmically - the wording it uses is "subject to
170 manufacturer dependent future improvements and is not part of this
171 recommendation." Implementing all of these section C.3.2.1.1 rules (hereafter
172 called C3211 rules for short) exactly to the letter is quite easy for the FR
173 codec (Themyscira libgsmfrp does everything that is needed, and is a simple and
174 lightweight FLOSS function library), but much harder for EFR. At the present
175 time it is unclear to the author of this document whether real historical T1/E1
176 TRAU implementations for which GSM 08.62 TFO was originally specified really did
177 implement C3211 rules to the letter, particularly for EFR, or if they cut some
178 corners.
179
180 Because the TRAUlike RTP transport format defined in this document is
181 semantically equivalent to TRAU-UL, any entity that receives such RTP packets
182 but internally needs to generate either TRAU-DL or some private functional
183 equivalent thereof will need to perform the same UL-to-DL conversion as called
184 for in TFO. The lack of a readily available function library that implements
185 the onerous rules of C3211 for EFR is certainly an obstacle, but it is also
186 possible to "cut corners" by doing the following:
187
188 1) Ignore Table C.3.2.1-1 case 1 and treat it like case 2, at least for EFR:
189 whenever SID frames are received on the incoming TRAU-UL or TRAUlike RTP
190 interface, forward them to call leg B even when that destination call leg
191 has no DTXd. Given that DTX and SID support has been an integral part of
192 the EFR codec from the beginning, as opposed to an after-addition in the
193 case of FR, every GSM MS that supports EFR can be expected to understand
194 SID frames on the downlink.
195
196 2) During speech pauses following transmission of a SID frame on call leg B DL,
197 if real DTXd (turning off Tx) is not allowed, do "fake DTXd" by transmitting
198 dummy FACCH with an L2 fill frame in the same 20 ms traffic frame windows in
199 which real DTXd would have been exercised if it were allowed.
200
201 3) Whenever a BFI condition is encountered in the incoming TRAU-UL or TRAUlike
202 RTP frame stream outside of SID, i.e., the case described in the first
203 paragraph of section C.3.2.1.1, induce an intentional BFI condition in the
204 receiving GSM MS by transmitting a dummy FACCH frame as above, instead of
205 trying to devise a parameter-level ECU for EFR.
206
207 It should be noted that the just-outlined "cut corners" method is exactly what
208 OsmoBTS (and a "pure" Osmocom network in general) does currently, hence nothing
209 is lost and no regression is introduced by continuing to do the same.
210
211 Seen another way, by making our RTP transport semantically equivalent to
212 TRAU-UL, we achieve harmonization between TFO and TrFO. TrFO (Transcoder-Free
213 Operation) is a scenario in which the RTP output from one IP BTS for call leg A
214 goes directly to the RTP input of another IP BTS for call leg B, possibly
215 passing through simple RTP forwarders like OsmoMGW, but never passing through
216 any transcoder. TrFO is what happens in a self-contained Osmocom network
217 without any external MNCC connected to OsmoMSC. The principal rules of what
218 transformations are inherently necessary in order to produce a fully proper DL
219 for call leg B from the UL of call leg A remain the same whether the transport
220 in between is old-fashioned TFO or modern TrFO, hence the same conversions that
221 are codified in TS 28.062 section C.3.2.1.1 are still needed - the only question
222 is where in the network are they to be performed. The original TDM-based GSM
223 designers at ETSI gave us a superb architecture end to end; by employing an RTP
224 transport that is semantically equivalent to TRAU-UL, we can preserve that whole
225 architecture fully intact in an all-IP implementation.
226
227 Specification for TRAUlike RTP payload format for FR and EFR
228 ============================================================
229
230 The modified RTP payload format shall consist of a single octet called TRAUlike
231 Extension Header (TEH), followed (most of the time) by the standard (same as in
232 RFC 3551) 33 octets for FR or 31 octets for EFR. The TEH octet has the
233 following structure:
234
235 +----+----+----+----+----+----+----+----+
236 Hex mask | 0xF0 |0x08|0x04|0x02|0x01|
237 +----+----+----+----+----+----+----+----+
238 Meaning | signature |DTXd|NDF |BFI |TAF |
239 +----+----+----+----+----+----+----+----+
240
241 (Bit numbers are identified by hex masks in order to avoid getting into an
242 argument over which bit numbering convention should be used.)
243
244 The following bit fields are defined within the TEH octet:
245
246 signature: the upper nibble of the TEH octet shall be set to 0xE. This
247 signature allows RTP packet receivers to identify the payload format by the
248 upper nibble of the first octet: if it equals 0xC, the format is EFR without
249 TEH, if it equals 0xD, the format is FR without TEH, and if it equals 0xE, then
250 the first octet is TEH.
251
252 DTXd: this bit is strictly identical with TRAU-UL frame bit C17.
253
254 No_Data flag (NDF): this bit shall be set to 1 if the TRAUlike payload consists
255 solely of TEH, with the standard 33-octet FR frame or 31-octet EFR frame
256 entirely omitted, and shall be 0 otherwise.
257
258 BFI: this bit is strictly identical with TRAU-UL frame bit C12.
259
260 TAF: this bit is strictly identical with TRAU-UL frame bit C15.
261
262 There are two possibilities for full composition of a TRAUlike RTP payload:
263
264 Possibility 1: TEH with NDF=0 is followed by a standard 33-octet FR frame or a
265 standard 31-octet EFR frame. The signature in the upper nibble of the octet
266 immediately following TEH shall be correct: 0xD for FR or 0xC for EFR.
267
268 Possibility 2: TEH with NDF=1 constitutes the entirety of the RTP payload for
269 the 20 ms time window in question.
270
271 If the No_Data flag is set, BFI must also be set: the combination of NDF=1 and
272 BFI=0 is invalid.
273
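The composition rules above can be checked mechanically on the receiving side. The following C fragment is a minimal sketch, not code taken from OsmoBTS: the function name traulike_payload_valid() and the enum are hypothetical, and only the masks, signatures and frame lengths come from this specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* TEH bit masks as given in the TEH octet table */
#define TEH_SIG_MASK   0xF0
#define TEH_SIGNATURE  0xE0
#define TEH_DTXD       0x08
#define TEH_NDF        0x04
#define TEH_BFI        0x02
#define TEH_TAF        0x01

#define FR_FRAME_LEN   33	/* RFC 3551 FR frame, signature 0xD */
#define EFR_FRAME_LEN  31	/* RFC 3551 EFR frame, signature 0xC */

enum codec { CODEC_FR, CODEC_EFR };	/* hypothetical, for illustration */

/* Return true if p[0..len-1] is a well-formed TRAUlike payload for codec. */
bool traulike_payload_valid(const uint8_t *p, size_t len, enum codec codec)
{
	assert(p != NULL || len == 0);
	if (len < 1 || (p[0] & TEH_SIG_MASK) != TEH_SIGNATURE)
		return false;
	if (p[0] & TEH_NDF) {
		/* Possibility 2: TEH alone; NDF=1 with BFI=0 is invalid */
		return len == 1 && (p[0] & TEH_BFI);
	}
	/* Possibility 1: TEH followed by a standard frame with its signature */
	size_t frame_len = (codec == CODEC_FR) ? FR_FRAME_LEN : EFR_FRAME_LEN;
	uint8_t frame_sig = (codec == CODEC_FR) ? 0xD0 : 0xC0;
	return len == 1 + frame_len && (p[1] & 0xF0) == frame_sig;
}
```

For example, a one-octet payload of 0xE6 (NDF=1, BFI=1) passes this check, while 0xE4 (NDF=1, BFI=0) is rejected.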
274 Per this specification, the sender of a BFI packet has the choice of sending it
275 in one of two forms: with or without presumed-erroneous frame bits. If the
276 TRAUlike RTP packet is generated from bits received in an actual TRAU-UL frame
277 (E1 Abis or TFO), erroneous frame bits shall be included, unchanged from the
278 TRAU-UL source. However, if the entity generating the TRAUlike RTP packet is
279 the ultimate point of origin (e.g., a native IP BTS), then it shall choose one
280 form or the other based on the situation at hand:
281
282 a) if the sender does have an FR or EFR frame "on hand" but that frame is
283 considered to be erroneous (for example, the link quality check in
284 l1sap_tch_ind() in OsmoBTS), the long form of BFI shall be sent, with the
285 presumed-erroneous frame bits included.
286
287 b) if the sender does not have any FR or EFR frame at all that could be sent
288 (for example, if the BFI condition arose because FACCH was
289 successfully received and decoded instead of a traffic frame), then the
290 No_Data form of BFI shall be sent.
291
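The choice between the two forms of BFI at the point of origin could be sketched as follows. This is an illustrative fragment under stated assumptions, not the OsmoBTS implementation: traulike_build_bfi() is a hypothetical name, a NULL frame pointer stands for "no frame on hand", and the DTXd bit is left at 0 for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TEH_SIGNATURE  0xE0
#define TEH_NDF        0x04
#define TEH_BFI        0x02
#define TEH_TAF        0x01

/*
 * Build a BFI payload into out[], which must hold 1 + frame_len octets.
 * frame is the presumed-erroneous FR/EFR frame in RFC 3551 format,
 * or NULL if the sender has no frame at all (case b).
 * Returns the RTP payload length.
 */
size_t traulike_build_bfi(uint8_t *out, const uint8_t *frame,
			  size_t frame_len, int taf)
{
	uint8_t teh = TEH_SIGNATURE | TEH_BFI;

	assert(out != NULL);
	if (taf)
		teh |= TEH_TAF;
	if (frame == NULL) {
		/* case b: No_Data form, TEH is the whole payload */
		out[0] = teh | TEH_NDF;
		return 1;
	}
	/* case a: long form, erroneous frame bits included unchanged */
	out[0] = teh;
	memcpy(out + 1, frame, frame_len);
	return 1 + frame_len;
}
```

With this sketch, a frame rejected by a link quality check would be passed in as frame, whereas a 20 ms window stolen by FACCH would be reported with frame set to NULL.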
292 The option of No_Data BFI is provided in this RTP transport format specification
293 because if this option were disallowed, senders would be tasked with an
294 additional burden of having to artificially generate dummy or "garbage" frame
295 bits. This task is slightly complicated, as explained in the following section,
296 and the present design moves that task from all senders to only those receivers
297 that need it.
298
299 Lack of SID classification bits matching TRAU-UL C13 & C14
300 ----------------------------------------------------------
301
302 TRAU-UL frame format includes two bits C13 & C14 that carry the ternary SID flag
303 (0, 1 or 2) as defined in GSM 06.31 and 06.81 section 6.1.1 (same section in
304 both specs). No equivalent bits are included in the TRAUlike RTP transport
305 format as defined by this specification - however, these bits are redundant.
306 The rules of section 6.1.1 in GSM 06.31 and 06.81, hereafter called S611 rules,
307 specify a strictly deterministic, unambiguous formula by which these C13 & C14
308 bits derive their values from the bit content of the FR/EFR frame payload -
309 thus if a TRAU-UL frame is received in which these C13 & C14 bits fail to match
310 the S611 value derived from the contained payload, then that TRAU-UL frame is
311 defective. There is no need to include such redundant bits in our TRAUlike RTP
312 format, only to create confusion for receivers as to which source of SID S611
313 classification they should use.
314
315 Feeding received TRAUlike BFI frames to an EFR decoder
316 ======================================================
317
318 If an EFR decoder implementation is based on the reference C source from ETSI,
319 this decoder requires that _some_ frame bits input be fed to it at all times,
320 even when BFI=1. But what if the BFI packet came in as No_Data? In that case
321 the receiver must synthesize its own fake "bad data" bits to feed to the
322 standard decoder. When synthesizing "bad data" bits in this manner, the
323 following rules should be observed:
324
325 * The 140 bits corresponding to "fixed codebook excitation pulses" (35 bits in
326 each of the 4 subframes) shall be filled using a PRNG. These bits are the
327 ones used by the standard decoder when its internal state, based on previous
328 good frames, puts it in GSM 06.61 substitution/muting mode as opposed to
329 GSM 06.62 comfort noise generation mode.
330
331 * The remaining 104 bits of the EFR frame shall be set to 0. These bits are
332 never used by the standard decoder under the condition of BFI=1, and setting
333 them to 0 prevents the possibility of S611 rules classifying the frame as SID
334 even if the PRNG output in the other 140 bits happens to be all 1s in those
335 SID codeword bit positions (70 out of 140) that fall within the "fixed
336 codebook excitation pulses" portion.
337
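The two rules above can be illustrated with the following C sketch. It rests on assumptions that are not stated in this document: the 244 EFR bits are assumed to follow the 4-bit 0xC signature in codec parameter order (38 LPC bits, then four subframes of 53, 50, 53 and 50 bits, with the 35 pulse bits of each subframe preceded by 13 or 10 bits of LTP lag and gain), which puts the pulse fields at the 0-based bit offsets listed in pulse_start[]. The function name is hypothetical, and rand() merely stands in for whatever PRNG an implementation prefers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EFR_FRAME_LEN     31	/* RFC 3551 EFR frame length in octets */
#define EFR_PULSE_BITS    35	/* fixed codebook pulse bits per subframe */

/* Assumed 0-based offsets of each subframe's pulse bits among the 244
 * frame bits; verify against the EFR bit tables before relying on them. */
static const unsigned pulse_start[4] = { 51, 101, 154, 204 };

/* Fill frame[] with synthetic "bad data" for use with a No_Data BFI. */
void efr_synth_bad_frame(uint8_t frame[EFR_FRAME_LEN])
{
	unsigned sf, i, pos;

	assert(frame != NULL);
	/* all bits 0 first: this covers the 104 non-pulse bits */
	memset(frame, 0, EFR_FRAME_LEN);
	frame[0] = 0xC0;	/* EFR signature in the upper nibble */
	for (sf = 0; sf < 4; sf++) {
		for (i = 0; i < EFR_PULSE_BITS; i++) {
			if (!((rand() >> 8) & 1))
				continue;
			/* skip the 4 signature bits; MSB-first packing */
			pos = 4 + pulse_start[sf] + i;
			frame[pos >> 3] |= 0x80 >> (pos & 7);
		}
	}
}
```

The resulting 31-octet frame would then be fed to the decoder together with BFI=1, exactly as if it had arrived in a long-form BFI packet.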
338 Converting from TRAU-UL to TRAUlike RTP
339 =======================================
340
341 There will be a need to convert from standard TS 48.060 TRAU-UL frames to our
342 TRAUlike RTP format in the following two scenarios:
343
344 1) When interfacing an E1 BTS to Osmocom RAN, when and if such support is to be
345 added to OsmoMGW;
346
347 2) In the CN transcoder operating in TFO mode, when forwarding received TFO
348 frames to the local RAN.
349
350 In both cases the conversion is straightforward:
351
352 * Always generate full-length TRAUlike RTP payloads; never generate No_Data in
353 the case of a properly received TRAU-UL speech (not idle) frame.
354
355 * Forward the payload bits directly from TRAU-UL to TRAUlike RTP, for both good
356 and bad frames.
357
358 * Directly forward BFI, TAF and DTXd frame indicator bits from TRAU-UL C-bits
359 to TEH octet bits.
360
361 * Ignore TRAU-UL C13 & C14 bits.
362
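A minimal sketch of this conversion in C follows. It assumes that an earlier step, not shown here, has already extracted the C-bits and repacked the TRAU-UL data bits into the 33-octet or 31-octet RFC 3551 frame layout; the function name and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TEH_SIGNATURE  0xE0
#define TEH_DTXD       0x08
#define TEH_BFI        0x02
#define TEH_TAF        0x01

/*
 * Build a TRAUlike payload from a received TRAU-UL speech frame.
 * frame: data bits already repacked into RFC 3551 format (assumption).
 * C13 & C14 are deliberately not taken as inputs: they are ignored.
 * Returns the RTP payload length; out[] must hold 1 + frame_len octets.
 */
size_t trau_ul_to_traulike(uint8_t *out, const uint8_t *frame,
			   size_t frame_len, bool c12_bfi, bool c15_taf,
			   bool c17_dtxd)
{
	uint8_t teh = TEH_SIGNATURE;	/* NDF stays 0: always full-length */

	assert(out != NULL && frame != NULL);
	if (c12_bfi)
		teh |= TEH_BFI;
	if (c15_taf)
		teh |= TEH_TAF;
	if (c17_dtxd)
		teh |= TEH_DTXD;
	out[0] = teh;
	/* payload bits are forwarded for both good and bad frames */
	memcpy(out + 1, frame, frame_len);
	return 1 + frame_len;
}
```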
363 Converting from TRAUlike RTP to TRAU-UL
364 =======================================
365
366 This direction of conversion will need to be performed in the CN transcoder when
367 emitting TFO frames toward the outside world. The following rules will need to
368 be applied:
369
370 * If the incoming TRAUlike RTP payload is full-length, as opposed to No_Data,
371 simply copy the payload bits into the constructed TRAU-UL frame, for both
372 good (BFI=0) and bad (BFI=1) frames.
373
374 * If the incoming TRAUlike RTP payload is No_Data, put the following filler in
375 the data bits portion of the TRAU-UL frame:
376
377 - For FR codec, use the silence frame of 3GPP TS 46.011 Table 1 as the filler.
378
379 - For EFR codec, perform the same PRNG procedure as detailed earlier in this
380 document for the case of feeding a No_Data BFI packet to the standard ETSI
381 decoder for EFR. Given that a TFO-frame-emitting transcoder still needs to
382 run its regular speech decoder in order to fill the upper 6 bits of each
383 outgoing G.711 sample octet, the same No_Data PRNG handler will typically
384 be run just once for both internal decoding and TFO frame output.
385
386 * Algorithmically set C13 & C14 bits in the generated TRAU-UL frame per the
387 rules of S611. This step can be done using osmo_{fr,efr}_sid_classify()
388 functions proposed in this Gerrit patch submission:
389
390 https://gerrit.osmocom.org/c/libosmocore/+/32183
391
392 or using equivalent functions in Themyscira libgsmefr and libgsmfrp.
393
394 * Directly forward BFI, TAF and DTXd frame indicator bits from TEH octet bits
395 to TRAU-UL C12, C15 and C17, respectively.
396
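The frame indicator mapping and the choice between received payload bits and filler could be sketched as follows. The struct and function names are hypothetical; the filler pointer is assumed to be prepared by the caller (the FR silence frame, or a synthesized EFR frame), and the S611 classification for C13 & C14 is assumed to be applied by the caller to the returned frame bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TEH_DTXD  0x08
#define TEH_NDF   0x04
#define TEH_BFI   0x02
#define TEH_TAF   0x01

/* Frame indicator bits to be placed into the generated TRAU-UL frame */
struct trau_ul_cbits {
	bool c12_bfi;
	bool c15_taf;
	bool c17_dtxd;
};

/*
 * payload: received TRAUlike RTP payload, starting with the TEH octet.
 * filler:  codec-appropriate filler frame to use in the No_Data case.
 * Returns a pointer to the RFC 3551 format frame whose bits are to be
 * copied into the TRAU-UL data bits; C13 & C14 are then derived from
 * those bits per S611 rules by the caller.
 */
const uint8_t *traulike_to_trau_ul(const uint8_t *payload,
				   const uint8_t *filler,
				   struct trau_ul_cbits *cb)
{
	uint8_t teh;

	assert(payload != NULL && cb != NULL);
	teh = payload[0];
	cb->c12_bfi = (teh & TEH_BFI) != 0;
	cb->c15_taf = (teh & TEH_TAF) != 0;
	cb->c17_dtxd = (teh & TEH_DTXD) != 0;
	/* No_Data: use filler; otherwise the bits following TEH */
	return (teh & TEH_NDF) ? filler : payload + 1;
}
```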
397 Mixing standard RFC 3551 and TRAUlike RTP payloads
398 ==================================================
399
400 An RTP stream receiver for FR/EFR codecs that supports the present non-standard
401 extension to the RTP payload format shall behave gracefully when it receives a
402 mixture of standard RFC 3551 payloads and TRAUlike payloads in the same RTP
403 stream. A receiver that has no interest in the additional information carried
404 in the TRAUlike Extension Header shall simply strip the TEH octet when one is
405 received, reducing the received payload to standard RFC 3551; if a BFI or
406 No_Data payload is received, it shall treat it the same as if nothing at all had been
407 received. A receiver that is interested in the TRAUlike Extension Header but
408 receives an FR/EFR payload without one should behave as if it had received a TEH
409 with BFI=0 and TAF=0; a received zero-length RTP payload should be treated the
410 same as a No_Data TRAUlike payload with TAF=0.
411
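For a receiver that is interested in the TEH, the rules above amount to normalizing every received payload into one internal representation. The following C fragment is a sketch of that idea only: struct rx_frame and rx_normalize() are hypothetical names, frame length validation is omitted, and treating a malformed NDF=1, BFI=0 octet as BFI is a lenient choice made here rather than something this specification requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TEH_NDF  0x04
#define TEH_BFI  0x02
#define TEH_TAF  0x01

struct rx_frame {
	const uint8_t *frame;	/* RFC 3551 format frame, NULL if No_Data */
	bool bfi;
	bool taf;
};

/* Returns false only if the payload format cannot be identified. */
bool rx_normalize(const uint8_t *p, size_t len, struct rx_frame *rx)
{
	assert(rx != NULL);
	/* zero-length payload: same as No_Data TRAUlike with TAF=0 */
	rx->frame = NULL;
	rx->bfi = true;
	rx->taf = false;
	if (len == 0)
		return true;
	switch (p[0] & 0xF0) {
	case 0xC0:	/* plain RFC 3551 EFR */
	case 0xD0:	/* plain RFC 3551 FR */
		/* behave as if a TEH with BFI=0, TAF=0 had been received */
		rx->frame = p;
		rx->bfi = false;
		return true;
	case 0xE0:	/* first octet is TEH */
		rx->taf = (p[0] & TEH_TAF) != 0;
		if (p[0] & TEH_NDF)
			return true;	/* No_Data: bfi stays true */
		if (len < 2)
			return false;
		rx->bfi = (p[0] & TEH_BFI) != 0;
		rx->frame = p + 1;
		return true;
	default:
		return false;
	}
}
```

A receiver with no interest in the TEH could use the same function and simply discard every result with bfi set, keeping only rx->frame from the good ones.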
412 There may even be cases in which an RTP sender alternates between sending
413 standard RFC 3551 payloads and TRAUlike payloads in the same session: for
414 example, a TFO-supporting CN transcoder may emit "plain" RFC 3551 payloads when
415 supplying the output of its free-running speech encoder, but switch to sending
416 TRAUlike payloads when it switches to forwarding bits received in TFO frames
417 from the far end.