view doc/RTP-TRAUlike-format @ 270:6f28a4377a99
doc/Local-short-numbers: first draft written
author: Mychaela Falconia <falcon@freecalypso.org>
date: Sun, 26 Nov 2023 17:08:10 -0800
parents: f0b90591f67c

TRAU-UL-like RTP transport format for FR & EFR codecs
=====================================================

The generally accepted industry standard format for RTP transport of FR and
EFR codec frames in an IP-based GSM RAN is given in ETSI TS 101 318; the same
format is also codified in IETF RFC 3551. However, when compared to the
classic TRAU-UL format of 3GPP TS 48.060, the standard RTP format of RFC 3551
exhibits the following two shortcomings:

1) no way to indicate a BFI condition and still send frame data bits;

2) no way to transport the Time Alignment Flag (TAF).

Both of these shortcomings will be explained in detail further in this
document; however, the primary purpose of this document is to propose a new,
regrettably non-standard, RTP transport format for FR & EFR codecs, for use
only within a GSM RAN and the immediately attached CN transcoder ("soft
TRAU"), that provides the same functionality as the classic TRAU-UL format of
TS 48.060, but is carried over RTP in IP rather than a 16 kbps TDM subchannel.

The non-standard RTP transport format presented in this document is
implemented in OsmoBTS on a private feature branch:

https://cgit.osmocom.org/osmo-bts/log/?h=falconia/rtp_traulike

OsmoBTS versions that include this code always accept TRAUlike FR/EFR packets
on their RTP input, following the principle of being liberal in what you
accept while being conservative in what you send, but emit such packets on
their RTP output only when this non-default vty config option is given:

    rtp fr-efr-traulike

The recently added (mainline) "rtp continuous-streaming" vty config option
also needs to be enabled.

The present document serves as the formal specification for the TRAUlike RTP
transport format for FR and EFR.

Detailed description of shortcomings of standard RTP transport for FR & EFR
===========================================================================

These shortcomings are solved in the TRAUlike RTP transport format defined in
this document; understanding these shortcomings provides the essential
rationale for TRAU-like RTP.

Indicating BFI along with data bits
-----------------------------------

The only way to indicate a BFI condition in standard RTP (for FR/EFR) is to
either send no packet at all in the 20 ms window in question (industry
standard behavior and OsmoBTS default) or send an RTP packet with a
zero-length payload ("rtp continuous-streaming" option in OsmoBTS). The latter
option provides a timing tick for a CN-attached transcoder relying on the
BTS-originating RTP stream as its timing source, but there is still no way to
send a frame of marked-erroneous data bits. Contrast with TS 48.060 TRAU-UL
format: in this format the Dn bits carrying FR or EFR frame bits and the C12
bit carrying BFI are orthogonal.

Why would one care about known-bad or deemed-to-be-bad frame data bits? They
do matter at least in the case of EFR: the official reference C-source EFR
decoder from ETSI makes use of the "fixed codebook excitation pulses" portion
of its EFR frame bits input (140 bits out of 244) even when BFI=1. This
portion of reference C-source behavior is declared to be a non-normative
example by the text of GSM 06.61 spec, thus there may be other compliant EFR
decoder implementations that never look at marked-erroneous data bits - but
given the ease of simply using the C code from ETSI as-is, or recoding it more
efficiently but keeping unchanged all bit-exact algorithms, including
non-normative ones, we should expect that the behavior of ETSI reference code
is retained in many production implementations and deployments.

Consider the case where a traditional E1-based BTS with a classic TRAU
interface is attached to an IP-based Osmocom RAN by way of OsmoMGW, and the
resulting RTP stream then (after passing through another OsmoMGW instance at
the MSC) goes to a "soft TRAU" transcoder (TC) in the CN. The TC will feed its
RTP input to FR and EFR decoders, and at least the EFR decoder makes use of
"fixed codebook excitation pulses" bits from erroneous frames. Furthermore,
the TC may implement in-band TFO (3GPP TS 28.062) inside its G.711 RTP output,
in which case it will need to insert a slightly modified TRAU-UL frame into
that output. The bits that would ideally be fed to the ETSI EFR decoder and
emitted to the outside world in TFO frames already exist at the output of the
E1-based BTS, but they get lost in the RTP transport when the industry
standard RTP payload format is used.

Consider another case where OsmoBTS does have an FR/EFR traffic frame that
could potentially be sent out, but it is suppressed by the
(tch_ind->lqual_cb >= bts->min_qual_norm) check in l1sap_tch_ind() in
src/common/l1sap.c. In this case it would be ideal to send out that frame
along with a BFI=1 indication, if the RTP transport format were to allow such
representation.

Lack of TAF bit in standard RTP transport
-----------------------------------------

The TRAU-UL frame format of TS 48.060 for FR and EFR includes a bit called
TAF, for Time Alignment Flag. Per the specs (TS 48.060 refers to TS 46.031 for
definition and coding of frame indicators) this bit shall be set to 1 in one
particular position in the 480 ms SACCH multiframe (the particular 20 ms frame
position in which a valid frame is always transmitted, even during DTX pauses)
and set to 0 in all other frames.
This flag factors into the Rx DTX handler logic prescribed in GSM 06.31 and
06.81 specs for FR and EFR, respectively, and there exist production decoders
for these codecs that implement their Rx DTX handler function exactly to the
letter of the specs, including the use of TAF bit when deciding what to do
with a BFI=1 frame received in the comfort noise generation state. (These
spec-compliant decoders include the reference ETSI C-source decoder for EFR
and Themyscira libgsmfrp for FR.)

This TAF bit does not exist in the standard RTP transport for FR & EFR. The
lack of this TAF bit causes the following problems for the CN-attached "soft
TRAU" transcoder:

1) The ability to implement spec-compliant handling of GSM 06.11 or 06.61
   section 5.4 requirement (same section in both specs) is lost;

2) The TC won't know when to set the TAF bit in its outgoing TFO frames, if it
   implements in-band TFO per 3GPP TS 28.062.

The TFO problem is particularly concerning because these TFO frames are
emitted to the outside world, outside of administrative and technical control
of the party implementing the Osmocom-based GSM network and the TC at its
edge. The resulting G.711 octet stream with TFO frames embedded inside can be
carried half-way around the world by the international toll telephone network,
and there is no telling what kind of implementation may be receiving and
decoding these bits on the other end. For this reason, "poor man's"
workarounds in the RTP-fed, TFO-generating TC are very unattractive:

* If the TC were to set TAF=0 in all TFO frames it generates, the receiver's
  expectation of seeing TAF=1 in every 24th frame will be violated.

* If the TC were to arbitrarily set TAF=1 in every 24th frame by its own
  free-running count, without knowledge of the actual SACCH alignment in the
  original GSM call leg, these TAF-marked frames won't coincide with those
  frame positions where the MS sends its SID frames, and the resulting TFO
  frame stream will be invalid to the receiving Rx DTX handler on the far end.

The knowledge of which frames need to be marked with TAF=1 exists inside the
entity that generates the FR/EFR RTP stream: if this entity is a converter
from E1-based Abis to RTP, the TRAU-UL frames from the BTS contain this TAF
bit, and if the RTP-generating entity is a native IP BTS, it knows the frame
number for which it generates each RTP packet. The only problem is that there
is no place to insert this TAF bit in the standard RTP transport format of
TS 101 318.

Why TRAU-UL and not TRAU-DL
===========================

The present document argues the case that the industry standard RTP transport
format for FR & EFR is functionally crippled compared to the TRAU-UL transport
format of 3GPP TS 48.060, and defines an alternative RTP transport format that
can be used by those who desire TRAU-UL-like functionality badly enough to
accept the price of going totally non-standard in their IP RAN transport. The
new RTP transport format defined in this document explicitly mimics the
functionality and semantics of TS 48.060 TRAU-UL for FR and EFR. At this point
a reader may reasonably ask: why TRAU-UL and not TRAU-DL?

The answer is TFO: 3GPP TS 28.062 and its predecessor GSM 08.62 define the TFO
frame format as being based on TRAU-UL frames with only a few bits changed,
and no change in semantics of any of the frame indicator bits of TRAU-UL (C12
through C17). Whereas the Abis interface is inherently asymmetric (TRAU-UL
frames in one direction, TRAU-DL frames in the other direction), end-to-end
TFO is directionally symmetric.
If we imagine a TFO call between Alice in America and Bob in Britain, there
will be TRAU-UL frames flowing in both directions of the trans-oceanic G.711
toll connection, one set coming almost unchanged from Alice's BTS CCU and the
other coming almost unchanged from Bob's BTS CCU.

Of course each party's GSM call DL will require TRAU-DL frames to be fed to
it, not TRAU-UL, but the necessary UL-to-DL conversion is the responsibility
of the TFO receiver on each end. The general rules for turning a TRAU-UL frame
into one for TRAU-DL are specified in TS 28.062 section C.3.2.1.1; it should
be noted that this section spells out the requirements of what the UL-to-DL
converter must do, but does not specify exactly how to do it algorithmically -
the wording it uses is "subject to manufacturer dependent future improvements
and is not part of this recommendation."

Implementing all of these section C.3.2.1.1 rules (hereafter called C3211
rules for short) exactly to the letter is quite easy for the FR codec
(Themyscira libgsmfrp does everything that is needed, and is a simple and
lightweight FLOSS function library), but much harder for EFR. At the present
time it is unclear to the author of this document whether real historical
T1/E1 TRAU implementations for which GSM 08.62 TFO was originally specified
really did implement C3211 rules to the letter, particularly for EFR, or if
they cut some corners.

Because the TRAUlike RTP transport format defined in this document is
semantically equivalent to TRAU-UL, any entity that receives such RTP packets
but internally needs to generate either TRAU-DL or some private functional
equivalent thereof will need to perform the same UL-to-DL conversion as called
for in TFO.
The lack of a readily available function library that implements the onerous
rules of C3211 for EFR is certainly an obstacle, but it is also possible to
"cut corners" by doing the following:

1) Ignore Table C.3.2.1-1 case 1 and treat it like case 2, at least for EFR:
   whenever SID frames are received on the incoming TRAU-UL or TRAUlike RTP
   interface, forward them to call leg B even when that destination call leg
   has no DTXd. Given that DTX and SID support has been an integral part of
   the EFR codec from the beginning, as opposed to an after-addition in the
   case of FR, every GSM MS that supports EFR can be expected to understand
   SID frames on the downlink.

2) During speech pauses following transmission of a SID frame on call leg B
   DL, if real DTXd (turning off Tx) is not allowed, do "fake DTXd" by
   transmitting dummy FACCH with an L2 fill frame in the same 20 ms traffic
   frame windows in which real DTXd would have been exercised if it were
   allowed.

3) Whenever a BFI condition is encountered in the incoming TRAU-UL or TRAUlike
   RTP frame stream outside of SID, i.e., the case described in the first
   paragraph of section C.3.2.1.1, induce an intentional BFI condition in the
   receiving GSM MS by transmitting a dummy FACCH frame as above, instead of
   trying to devise a parameter-level ECU for EFR.

It should be noted that the just-outlined "cut corners" method is exactly what
OsmoBTS (and a "pure" Osmocom network in general) does currently, hence
nothing is lost and no regression is introduced by continuing to do the same.

Seen another way, by making our RTP transport semantically equivalent to
TRAU-UL, we achieve harmonization between TFO and TrFO. TrFO (Transcoder-Free
Operation) is a scenario in which the RTP output from one IP BTS for call leg
A goes directly to the RTP input of another IP BTS for call leg B, possibly
passing through simple RTP forwarders like OsmoMGW, but never passing through
any transcoder.
TrFO is what happens in a self-contained Osmocom network without any external
MNCC connected to OsmoMSC. The principal rules of what transformations are
inherently necessary in order to produce a fully proper DL for call leg B from
the UL of call leg A remain the same whether the transport in between is
old-fashioned TFO or modern TrFO, hence the same conversions that are codified
in TS 28.062 section C.3.2.1.1 are still needed - the only question is where
in the network are they to be performed.

The original TDM-based GSM designers at ETSI gave us a superb architecture end
to end; by employing an RTP transport that is semantically equivalent to
TRAU-UL, we can preserve that whole architecture fully intact in an all-IP
implementation.

Specification for TRAUlike RTP payload format for FR and EFR
============================================================

The modified RTP payload format shall consist of a single octet called
TRAUlike Extension Header (TEH), followed (most of the time) by the standard
(same as in RFC 3551) 33 octets for FR or 31 octets for EFR. The TEH octet has
the following structure:

         +----+----+----+----+----+----+----+----+
Hex mask |       0xF0        |0x08|0x04|0x02|0x01|
         +----+----+----+----+----+----+----+----+
Meaning  |     signature     |DTXd|NDF |BFI |TAF |
         +----+----+----+----+----+----+----+----+

(Bit numbers are identified by hex masks in order to avoid getting into an
argument over which bit numbering convention should be used.)

The following bit fields are defined within the TEH octet:

signature: the upper nibble of the TEH octet shall be set to 0xE. This
signature allows RTP packet receivers to identify the payload format by the
upper nibble of the first octet: if it equals 0xC, the format is EFR without
TEH, if it equals 0xD, the format is FR without TEH, and if it equals 0xE,
then the first octet is TEH.

DTXd: this bit is strictly identical with TRAU-UL frame bit C17.

No_Data flag (NDF): this bit shall be set to 1 if the TRAUlike payload
consists solely of TEH, with the standard 33-octet FR frame or 31-octet EFR
frame entirely omitted, and shall be 0 otherwise.

BFI: this bit is strictly identical with TRAU-UL frame bit C12.

TAF: this bit is strictly identical with TRAU-UL frame bit C15.

There are two possibilities for full composition of a TRAUlike RTP payload:

Possibility 1: TEH with NDF=0 is followed by a standard 33-octet FR frame or a
standard 31-octet EFR frame. The signature in the upper nibble of the octet
immediately following TEH shall be correct: 0xD for FR or 0xC for EFR.

Possibility 2: TEH with NDF=1 constitutes the entirety of the RTP payload for
the 20 ms time window in question.

If the No_Data flag is set, BFI must also be set: the combination of NDF=1 and
BFI=0 is invalid.

Per this specification, the sender of a BFI packet has the choice of sending
it in one of two forms: with or without presumed-erroneous frame bits. If the
TRAUlike RTP packet is generated from bits received in an actual TRAU-UL frame
(E1 Abis or TFO), erroneous frame bits shall be included, unchanged from the
TRAU-UL source. However, if the entity generating the TRAUlike RTP packet is
the ultimate point of origin (e.g., a native IP BTS), then it shall choose one
form or the other based on the situation at hand:

a) if the sender does have an FR or EFR frame "on hand" but that frame is
   considered to be erroneous (for example, the link quality check in
   l1sap_tch_ind() in OsmoBTS), the long form of BFI shall be sent, with the
   presumed-erroneous frame bits included.

b) if the sender does not have any FR or EFR frame at all that could be sent
   (for example, if the reason for the BFI condition is because FACCH was
   successfully received and decoded instead of a traffic frame), then the
   No_Data form of BFI shall be sent.
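
The TEH layout and the two payload compositions described above can be
illustrated with a short sketch. This is not taken from OsmoBTS or any other
implementation; the names TraulikePayload, build_payload and parse_payload
are hypothetical, and the validation strategy (rejecting anything that does
not match the rules literally) is only one possible choice.

```python
from dataclasses import dataclass
from typing import Optional

# TEH bit masks, as given in the table in this specification
TEH_SIGNATURE = 0xE0
TEH_DTXD = 0x08
TEH_NDF = 0x04
TEH_BFI = 0x02
TEH_TAF = 0x01

# (length in octets, signature nibble) of standard RFC 3551 frames: FR, EFR
STD_FRAMES = {(33, 0xD0), (31, 0xC0)}


@dataclass
class TraulikePayload:          # hypothetical container, for illustration only
    dtxd: bool
    bfi: bool
    taf: bool
    frame: Optional[bytes]      # None selects the No_Data form (NDF=1)


def _check_frame(frame: bytes) -> None:
    # Possibility 1: the octets after TEH must be a well-formed FR or EFR frame
    if not frame or (len(frame), frame[0] & 0xF0) not in STD_FRAMES:
        raise ValueError("not a standard 33-octet FR or 31-octet EFR frame")


def build_payload(p: TraulikePayload) -> bytes:
    teh = TEH_SIGNATURE
    if p.dtxd:
        teh |= TEH_DTXD
    if p.bfi:
        teh |= TEH_BFI
    if p.taf:
        teh |= TEH_TAF
    if p.frame is None:
        # Possibility 2: TEH alone; NDF=1 requires BFI=1
        if not p.bfi:
            raise ValueError("NDF=1 with BFI=0 is invalid")
        return bytes([teh | TEH_NDF])
    _check_frame(p.frame)
    return bytes([teh]) + p.frame


def parse_payload(payload: bytes) -> TraulikePayload:
    if not payload or payload[0] & 0xF0 != TEH_SIGNATURE:
        raise ValueError("first octet is not a TEH")
    teh = payload[0]
    bfi = bool(teh & TEH_BFI)
    if teh & TEH_NDF:
        if len(payload) != 1 or not bfi:
            raise ValueError("invalid No_Data payload")
        frame = None
    else:
        frame = payload[1:]
        _check_frame(frame)
    return TraulikePayload(bool(teh & TEH_DTXD), bfi, bool(teh & TEH_TAF), frame)
```

In this sketch a sender in situation (a) would pass the presumed-erroneous
frame with bfi=True, while a sender in situation (b) would pass frame=None.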

The option of No_Data BFI is provided in this RTP transport format
specification because if this option were disallowed, senders would be tasked
with an additional burden of having to artificially generate dummy or
"garbage" frame bits. This task is slightly complicated, as explained in the
following section, and the present design moves that task from all senders to
only those receivers that need it.

Lack of SID classification bits matching TRAU-UL C13 & C14
----------------------------------------------------------

TRAU-UL frame format includes two bits C13 & C14 that carry the ternary SID
flag (0, 1 or 2) as defined in GSM 06.31 and 06.81 section 6.1.1 (same section
in both specs). No equivalent bits are included in the TRAUlike RTP transport
format as defined by this specification - however, these bits are redundant.
The rules of section 6.1.1 in GSM 06.31 and 06.81, hereafter called S611
rules, specify a strictly deterministic, unambiguous formula by which these
C13 & C14 bits derive their values from the bit content of the FR/EFR frame
payload - thus if a TRAU-UL frame is received in which these C13 & C14 bits
fail to match the S611 value derived from the contained payload, then that
TRAU-UL frame is defective. There is no need to include such redundant bits in
our TRAUlike RTP format, only to create confusion for receivers as to which
source of SID S611 classification they should use.

Feeding received TRAUlike BFI frames to an EFR decoder
======================================================

If an EFR decoder implementation is based on the reference C source from ETSI,
this decoder requires that _some_ frame bits input be fed to it at all times,
even when BFI=1. But what if the BFI packet came in as No_Data? In that case
the receiver must synthesize its own fake "bad data" bits to feed to the
standard decoder.
When synthesizing "bad data" bits in this manner, the following rules should
be observed:

* The 140 bits corresponding to "fixed codebook excitation pulses" (35 bits in
  each of the 4 subframes) shall be filled using a PRNG. These bits are the
  ones used by the standard decoder when its internal state, based on previous
  good frames, puts it in GSM 06.61 substitution/muting mode as opposed to
  GSM 06.62 comfort noise generation mode.

* The remaining 104 bits of the EFR frame shall be set to 0. These bits are
  never used by the standard decoder under the condition of BFI=1, and setting
  them to 0 prevents the possibility of S611 rules classifying the frame as
  SID even if the PRNG output in the other 140 bits happens to be all 1s in
  those SID codeword bit positions (70 out of 140) that fall within the "fixed
  codebook excitation pulses" portion.

Converting from TRAU-UL to TRAUlike RTP
=======================================

There will be a need to convert from standard TS 48.060 TRAU-UL frames to our
TRAUlike RTP format in the following two scenarios:

1) When interfacing an E1 BTS to Osmocom RAN, when and if such support is to
   be added to OsmoMGW;

2) In the CN transcoder operating in TFO mode, when forwarding received TFO
   frames to the local RAN.

In both cases the conversion is straightforward:

* Always generate full-length TRAUlike RTP payloads, never generate No_Data in
  the case of a properly received TRAU-UL speech (not idle) frame.

* Forward the payload bits directly from TRAU-UL to TRAUlike RTP, for both
  good and bad frames.

* Directly forward BFI, TAF and DTXd frame indicator bits from TRAU-UL C-bits
  to TEH octet bits.

* Ignore TRAU-UL C13 & C14 bits.

Converting from TRAUlike RTP to TRAU-UL
=======================================

This direction of conversion will need to be performed in the CN transcoder
when emitting TFO frames toward the outside world.
The following rules will need to be applied:

* If the incoming TRAUlike RTP payload is full-length, as opposed to No_Data,
  simply copy the payload bits into the constructed TRAU-UL frame, for both
  good (BFI=0) and bad (BFI=1) frames.

* If the incoming TRAUlike RTP payload is No_Data, put the following filler in
  the data bits portion of the TRAU-UL frame:

  - For FR codec, use the silence frame of 3GPP TS 46.011 Table 1 as the
    filler.

  - For EFR codec, perform the same PRNG procedure as detailed earlier in this
    document for the case of feeding a No_Data BFI packet to the standard ETSI
    decoder for EFR. Given that a TFO-frame-emitting transcoder still needs to
    run its regular speech decoder in order to fill the upper 6 bits of each
    outgoing G.711 sample octet, the same No_Data PRNG handler will typically
    be run just once for both internal decoding and TFO frame output.

* Algorithmically set C13 & C14 bits in the generated TRAU-UL frame per the
  rules of S611. This step can be done using osmo_{fr,efr}_sid_classify()
  functions in libosmocodec, or using equivalent functions in Themyscira
  libgsmefr and libgsmfrp.

* Directly forward BFI, TAF and DTXd frame indicator bits from TEH octet bits
  to TRAU-UL C12, C15 and C17, respectively.

Mixing standard RFC 3551 and TRAUlike RTP payloads
==================================================

An RTP stream receiver for FR/EFR codecs that supports the present
non-standard extension to the RTP payload format shall behave gracefully when
it receives a mixture of standard RFC 3551 payloads and TRAUlike payloads in
the same RTP stream. A receiver that has no interest in the additional
information carried in the TRAUlike Extension Header shall simply strip the
TEH octet when one is received, reducing the received payload to standard
RFC 3551; if a BFI or No_Data payload is received, treat it the same as if
nothing at all was received.

A receiver that is interested in the TRAUlike Extension Header but receives an
FR/EFR payload without one should behave as if it received a TEH with BFI=0
and TAF=0. A received zero-length RTP payload should be treated the same as
receiving a No_Data TRAUlike payload with TAF=0.

There may even be cases when an RTP sender may alternate between sending
standard RFC 3551 payloads and TRAUlike payloads in the same session: for
example, a TFO-supporting CN transcoder may emit "plain" RFC 3551 payloads
when supplying the output of its free-running speech encoder, but switch to
sending TRAUlike payloads when it switches to forwarding bits received in TFO
frames from the far end.
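
The receiver-side rules of this section can be illustrated with a minimal
sketch covering both kinds of receiver. The function names interpret and
strip_teh are hypothetical and do not correspond to any existing library.
Frame length checks are left out for brevity, and DTXd is not returned
because this section does not state what value to assume when no TEH is
present.

```python
from typing import Optional, Tuple

TEH_NDF = 0x04
TEH_BFI = 0x02
TEH_TAF = 0x01


def interpret(payload: bytes) -> Tuple[bool, bool, Optional[bytes]]:
    """TEH-aware receiver: return (bfi, taf, frame or None)."""
    if len(payload) == 0:
        # zero-length payload: same as No_Data (hence BFI=1) with TAF=0
        return True, False, None
    sig = payload[0] & 0xF0
    if sig in (0xC0, 0xD0):
        # standard RFC 3551 EFR or FR payload: assume BFI=0, TAF=0
        return False, False, payload
    if sig == 0xE0:
        teh = payload[0]
        frame = None if teh & TEH_NDF else payload[1:]
        return bool(teh & TEH_BFI), bool(teh & TEH_TAF), frame
    raise ValueError("unrecognized payload signature")


def strip_teh(payload: bytes) -> Optional[bytes]:
    """TEH-agnostic receiver: reduce to RFC 3551, or None for 'nothing received'."""
    if len(payload) == 0:
        return None
    if payload[0] & 0xF0 == 0xE0:
        if payload[0] & (TEH_NDF | TEH_BFI):
            return None             # BFI or No_Data: treat as nothing received
        return payload[1:]          # good frame: drop the TEH octet
    return payload                  # already a standard payload
```

Because both functions dispatch on the upper nibble of the first octet for
every packet, a sender that alternates between the two formats within one
session needs no special handling in this sketch.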