diff doc/RTP-TRAUlike-format @ 207:185225722714
doc: new extended RTP format
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Thu, 06 Apr 2023 21:30:33 -0800 |
parents | |
children | f0b90591f67c |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/RTP-TRAUlike-format Thu Apr 06 21:30:33 2023 -0800
@@ -0,0 +1,417 @@
+TRAU-UL-like RTP transport format for FR & EFR codecs
+=====================================================
+
+The generally accepted industry standard format for RTP transport of FR and EFR
+codec frames in an IP-based GSM RAN is given in ETSI TS 101 318; the same format
+is also codified in IETF RFC 3551. However, when compared to the classic
+TRAU-UL format of 3GPP TS 48.060, the standard RTP format of RFC 3551 exhibits
+the following two shortcomings:
+
+1) no way to indicate a BFI condition and still send frame data bits;
+2) no way to transport the Time Alignment Flag (TAF).
+
+Both of these shortcomings will be explained in detail further in this document;
+however, the primary purpose of this document is to propose a new, regrettably
+non-standard, RTP transport format for FR & EFR codecs, for use only within a
+GSM RAN and the immediately attached CN transcoder ("soft TRAU"), that provides
+the same functionality as the classic TRAU-UL format of TS 48.060, but is
+carried over RTP in IP rather than a 16 kbps TDM subchannel.
+
+The non-standard RTP transport format presented in this document is implemented
+in OsmoBTS on a private feature branch:
+
+https://cgit.osmocom.org/osmo-bts/log/?h=falconia/rtp_traulike
+
+OsmoBTS versions that include this code always accept TRAUlike FR/EFR packets
+on their RTP input, following the principle of being liberal in what you accept
+while being conservative in what you send, but emit such packets on their RTP
+output only when this non-default vty config option is given:
+
+rtp fr-efr-traulike
+
+The recently added (mainline) "rtp continuous-streaming" vty config option also
+needs to be enabled.
+
+The present document serves as the formal specification for the TRAUlike RTP
+transport format for FR and EFR.
+
+Detailed description of shortcomings of standard RTP transport for FR & EFR
+===========================================================================
+
+These shortcomings are solved in the TRAUlike RTP transport format defined in
+this document; understanding these shortcomings provides the essential rationale
+for TRAU-like RTP.
+
+Indicating BFI along with data bits
+-----------------------------------
+
+The only way to indicate a BFI condition in standard RTP (for FR/EFR) is to
+either send no packet at all in the 20 ms window in question (industry standard
+behavior and OsmoBTS default) or send an RTP packet with a zero-length payload
+("rtp continuous-streaming" option in OsmoBTS). The latter option provides a
+timing tick for a CN-attached transcoder relying on the BTS-originating RTP
+stream as its timing source, but there is still no way to send a frame of
+marked-erroneous data bits. Contrast with TS 48.060 TRAU-UL format: in this
+format the Dn bits carrying FR or EFR frame bits and the C12 bit carrying BFI
+are orthogonal.
+
+Why would one care about known-bad or deemed-to-be-bad frame data bits? They
+do matter at least in the case of EFR: the official reference C-source EFR
+decoder from ETSI makes use of the "fixed codebook excitation pulses" portion
+of its EFR frame bits input (140 bits out of 244) even when BFI=1. This
+portion of reference C-source behavior is declared to be a non-normative example
+by the text of GSM 06.61 spec, thus there may be other compliant EFR decoder
+implementations that never look at marked-erroneous data bits - but given the
+ease of simply using the C code from ETSI as-is, or recoding it more efficiently
+but keeping unchanged all bit-exact algorithms, including non-normative ones,
+we should expect that the behavior of ETSI reference code is retained in many
+production implementations and deployments.
+
+Consider the case where a traditional E1-based BTS with a classic TRAU interface
+is attached to an IP-based Osmocom RAN by way of OsmoMGW, and the resulting RTP
+stream then (after passing through another OsmoMGW instance at the MSC) goes to
+a "soft TRAU" transcoder (TC) in the CN. The TC will feed its RTP input to FR
+and EFR decoders, and at least the EFR decoder makes use of "fixed codebook
+excitation pulses" bits from erroneous frames. Furthermore, the TC may
+implement in-band TFO (3GPP TS 28.062) inside its G.711 RTP output, in which
+case it will need to insert a slightly modified TRAU-UL frame into that output.
+The bits that would ideally be fed to the ETSI EFR decoder and emitted to the
+outside world in TFO frames already exist at the output of the E1-based BTS,
+but they get lost in the RTP transport when the industry standard RTP payload
+format is used.
+
+Consider another case where OsmoBTS does have an FR/EFR traffic frame that
+could potentially be sent out, but it is suppressed by the
+(tch_ind->lqual_cb >= bts->min_qual_norm) check in l1sap_tch_ind() in
+src/common/l1sap.c. In this case it would be ideal to send out that frame
+along with a BFI=1 indication, if the RTP transport format were to allow such
+representation.
+
+Lack of TAF bit in standard RTP transport
+-----------------------------------------
+
+The TRAU-UL frame format of TS 48.060 for FR and EFR includes a bit called TAF,
+for Time Alignment Flag. Per the specs (TS 48.060 refers to TS 46.031 for
+definition and coding of frame indicators) this bit shall be set to 1 in one
+particular position in the 480 ms SACCH multiframe (the particular 20 ms frame
+position in which a valid frame is always transmitted, even during DTX pauses)
+and set to 0 in all other frames. This flag factors into the Rx DTX handler
+logic prescribed in GSM 06.31 and 06.81 specs for FR and EFR, respectively, and
+there exist production decoders for these codecs that implement their Rx DTX
+handler function exactly to the letter of the specs, including the use of TAF
+bit when deciding what to do with a BFI=1 frame received in the comfort noise
+generation state. (These spec-compliant decoders include the reference ETSI
+C-source decoder for EFR and Themyscira libgsmfrp for FR.)
+
+This TAF bit does not exist in the standard RTP transport for FR & EFR. The
+lack of this TAF bit causes the following problems for the CN-attached "soft
+TRAU" transcoder:
+
+1) The ability to implement spec-compliant handling of GSM 06.11 or 06.61
+   section 5.4 requirement (same section in both specs) is lost;
+
+2) The TC won't know when to set the TAF bit in its outgoing TFO frames, if it
+   implements in-band TFO per 3GPP TS 28.062.
+
+The TFO problem is particularly concerning because these TFO frames are emitted
+to the outside world, outside of administrative and technical control of the
+party implementing the Osmocom-based GSM network and the TC at its edge. The
+resulting G.711 octet stream with TFO frames embedded inside can be carried
+half-way around the world by the international toll telephone network, and there
+is no telling what kind of implementation may be receiving and decoding these
+bits on the other end. For this reason, "poor man's" workarounds in the
+RTP-fed, TFO-generating TC are very unattractive:
+
+* If the TC were to set TAF=0 in all TFO frames it generates, the receiver's
+  expectation of seeing TAF=1 in every 24th frame will be violated.
+
+* If the TC were to arbitrarily set TAF=1 in every 24th frame by its own free-
+  running count, without knowledge of the actual SACCH alignment in the original
+  GSM call leg, these TAF-marked frames won't coincide with those frame
+  positions where the MS sends its SID frames, and the resulting TFO frame
+  stream will be invalid to the receiving Rx DTX handler on the far end.
+
+The knowledge of which frames need to be marked with TAF=1 exists inside the
+entity that generates the FR/EFR RTP stream: if this entity is a converter from
+E1-based Abis to RTP, the TRAU-UL frames from the BTS contain this TAF bit, and
+if the RTP-generating entity is a native IP BTS, it knows the frame number for
+which it generates each RTP packet. The only problem is that there is no place
+to insert this TAF bit in the standard RTP transport format of TS 101 318.
+
+Why TRAU-UL and not TRAU-DL
+===========================
+
+The present document argues the case that the industry standard RTP transport
+format for FR & EFR is functionally crippled compared to the TRAU-UL transport
+format of 3GPP TS 48.060, and defines an alternative RTP transport format that
+can be used by those who desire TRAU-UL-like functionality badly enough to
+accept the price of going totally non-standard in their IP RAN transport. The
+new RTP transport format defined in this document explicitly mimics the
+functionality and semantics of TS 48.060 TRAU-UL for FR and EFR.
+
+At this point a reader may reasonably ask: why TRAU-UL and not TRAU-DL? The
+answer is TFO: 3GPP TS 28.062 and its predecessor GSM 08.62 define the TFO frame
+format as being based on TRAU-UL frames with only a few bits changed, and no
+change in semantics of any of the frame indicator bits of TRAU-UL (C12 through
+C17). Whereas the Abis interface is inherently asymmetric (TRAU-UL frames in
+one direction, TRAU-DL frames in the other direction), end-to-end TFO is
+directionally symmetric. If we imagine a TFO call between Alice in America and
+Bob in Britain, there will be TRAU-UL frames flowing in both directions of the
+trans-oceanic G.711 toll connection, one set coming almost unchanged from
+Alice's BTS CCU and the other coming almost unchanged from Bob's BTS CCU. Of
+course each party's GSM call DL will require TRAU-DL frames to be fed to it,
+not TRAU-UL, but the necessary UL-to-DL conversion is the responsibility of the
+TFO receiver on each end.
+
+The general rules for turning a TRAU-UL frame into one for TRAU-DL are specified
+in TS 28.062 section C.3.2.1.1; it should be noted that this section spells out
+the requirements of what the UL-to-DL converter must do, but does not specify
+exactly how to do it algorithmically - the wording it uses is "subject to
+manufacturer dependent future improvements and is not part of this
+recommendation." Implementing all of these section C.3.2.1.1 rules (hereafter
+called C3211 rules for short) exactly to the letter is quite easy for the FR
+codec (Themyscira libgsmfrp does everything that is needed, and is a simple and
+lightweight FLOSS function library), but much harder for EFR. At the present
+time it is unclear to the author of this document whether real historical T1/E1
+TRAU implementations for which GSM 08.62 TFO was originally specified really did
+implement C3211 rules to the letter, particularly for EFR, or if they cut some
+corners.
+
+Because the TRAUlike RTP transport format defined in this document is
+semantically equivalent to TRAU-UL, any entity that receives such RTP packets
+but internally needs to generate either TRAU-DL or some private functional
+equivalent thereof will need to perform the same UL-to-DL conversion as called
+for in TFO. The lack of a readily available function library that implements
+the onerous rules of C3211 for EFR is certainly an obstacle, but it is also
+possible to "cut corners" by doing the following:
+
+1) Ignore Table C.3.2.1-1 case 1 and treat it like case 2, at least for EFR:
+   whenever SID frames are received on the incoming TRAU-UL or TRAUlike RTP
+   interface, forward them to call leg B even when that destination call leg
+   has no DTXd. Given that DTX and SID support has been an integral part of
+   the EFR codec from the beginning, as opposed to an after-addition in the
+   case of FR, every GSM MS that supports EFR can be expected to understand
+   SID frames on the downlink.
+
+2) During speech pauses following transmission of a SID frame on call leg B DL,
+   if real DTXd (turning off Tx) is not allowed, do "fake DTXd" by transmitting
+   dummy FACCH with an L2 fill frame in the same 20 ms traffic frame windows in
+   which real DTXd would have been exercised if it were allowed.
+
+3) Whenever a BFI condition is encountered in the incoming TRAU-UL or TRAUlike
+   RTP frame stream outside of SID, i.e., the case described in the first
+   paragraph of section C.3.2.1.1, induce an intentional BFI condition in the
+   receiving GSM MS by transmitting a dummy FACCH frame as above, instead of
+   trying to devise a parameter-level ECU for EFR.
+
+It should be noted that the just-outlined "cut corners" method is exactly what
+OsmoBTS (and a "pure" Osmocom network in general) does currently, hence nothing
+is lost and no regression is introduced by continuing to do the same.
+
+Seen another way, by making our RTP transport semantically equivalent to
+TRAU-UL, we achieve harmonization between TFO and TrFO. TrFO (Transcoder-Free
+Operation) is a scenario in which the RTP output from one IP BTS for call leg A
+goes directly to the RTP input of another IP BTS for call leg B, possibly
+passing through simple RTP forwarders like OsmoMGW, but never passing through
+any transcoder. TrFO is what happens in a self-contained Osmocom network
+without any external MNCC connected to OsmoMSC. The principal rules of what
+transformations are inherently necessary in order to produce a fully proper DL
+for call leg B from the UL of call leg A remain the same whether the transport
+in between is old-fashioned TFO or modern TrFO, hence the same conversions that
+are codified in TS 28.062 section C.3.2.1.1 are still needed - the only question
+is where in the network are they to be performed. The original TDM-based GSM
+designers at ETSI gave us a superb architecture end to end; by employing an RTP
+transport that is semantically equivalent to TRAU-UL, we can preserve that whole
+architecture fully intact in an all-IP implementation.
+
+Specification for TRAUlike RTP payload format for FR and EFR
+============================================================
+
+The modified RTP payload format shall consist of a single octet called TRAUlike
+Extension Header (TEH), followed (most of the time) by the standard (same as in
+RFC 3551) 33 octets for FR or 31 octets for EFR. The TEH octet has the
+following structure:
+
+         +----+----+----+----+----+----+----+----+
+Hex mask |       0xF0        |0x08|0x04|0x02|0x01|
+         +----+----+----+----+----+----+----+----+
+Meaning  |     signature     |DTXd|NDF |BFI |TAF |
+         +----+----+----+----+----+----+----+----+
+
+(Bit numbers are identified by hex masks in order to avoid getting into an
+ argument over which bit numbering convention should be used.)
+
+The following bit fields are defined within the TEH octet:
+
+signature: the upper nibble of the TEH octet shall be set to 0xE. This
+signature allows RTP packet receivers to identify the payload format by the
+upper nibble of the first octet: if it equals 0xC, the format is EFR without
+TEH, if it equals 0xD, the format is FR without TEH, and if it equals 0xE, then
+the first octet is TEH.
+
+DTXd: this bit is strictly identical with TRAU-UL frame bit C17.
+
+No_Data flag (NDF): this bit shall be set to 1 if the TRAUlike payload consists
+solely of TEH, with the standard 33-octet FR frame or 31-octet EFR frame
+entirely omitted, and shall be 0 otherwise.
+
+BFI: this bit is strictly identical with TRAU-UL frame bit C12.
+
+TAF: this bit is strictly identical with TRAU-UL frame bit C15.
+
+There are two possibilities for full composition of a TRAUlike RTP payload:
+
+Possibility 1: TEH with NDF=0 is followed by a standard 33-octet FR frame or a
+standard 31-octet EFR frame. The signature in the upper nibble of the octet
+immediately following TEH shall be correct: 0xD for FR or 0xC for EFR.
+
+Possibility 2: TEH with NDF=1 constitutes the entirety of the RTP payload for
+the 20 ms time window in question.
+
+If the No_Data flag is set, BFI must also be set: the combination of NDF=1 and
+BFI=0 is invalid.
+
+Per this specification, the sender of a BFI packet has the choice of sending it
+in one of two forms: with or without presumed-erroneous frame bits. If the
+TRAUlike RTP packet is generated from bits received in an actual TRAU-UL frame
+(E1 Abis or TFO), erroneous frame bits shall be included, unchanged from the
+TRAU-UL source. However, if the entity generating the TRAUlike RTP packet is
+the ultimate point of origin (e.g., a native IP BTS), then it shall choose one
+form or the other based on the situation at hand:
+
+a) if the sender does have an FR or EFR frame "on hand" but that frame is
+   considered to be erroneous (for example, the link quality check in
+   l1sap_tch_ind() in OsmoBTS), the long form of BFI shall be sent, with the
+   presumed-erroneous frame bits included.
+
+b) if the sender does not have any FR or EFR frame at all that could be sent
+   (for example, if the BFI condition arises because FACCH was successfully
+   received and decoded instead of a traffic frame), then the No_Data form of
+   BFI shall be sent.
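
The sender-side rules above can be illustrated with a minimal Python sketch. It is not taken from the OsmoBTS feature branch; the function name `build_traulike_payload` and the constant names are hypothetical and chosen only for this illustration. The bit masks, the 0xE signature, the 33/31 octet frame lengths and the NDF=1 implies BFI=1 rule come from the specification text.

```python
# Hypothetical helper illustrating TEH construction; names are not from OsmoBTS.
TEH_SIGNATURE = 0xE0   # upper nibble 0xE marks the first octet as TEH
TEH_DTXD = 0x08        # same meaning as TRAU-UL bit C17
TEH_NDF = 0x04         # No_Data flag: payload is the TEH octet alone
TEH_BFI = 0x02         # same meaning as TRAU-UL bit C12
TEH_TAF = 0x01         # same meaning as TRAU-UL bit C15

# RFC 3551 frame length in octets, keyed by the signature nibble of octet 0.
FRAME_LEN = {0xD: 33, 0xC: 31}   # 0xD = FR, 0xC = EFR


def build_traulike_payload(frame, bfi, taf, dtxd=False):
    """Return a TRAUlike RTP payload.

    frame is a complete RFC 3551 FR or EFR frame (bytes), or None when the
    sender has no frame bits at all (the No_Data form of BFI).
    """
    teh = TEH_SIGNATURE
    if dtxd:
        teh |= TEH_DTXD
    if taf:
        teh |= TEH_TAF
    if frame is None:
        # No_Data form: NDF=1 with BFI=0 is invalid per the specification.
        if not bfi:
            raise ValueError("No_Data payload requires BFI=1")
        return bytes([teh | TEH_NDF | TEH_BFI])
    # Long form: the octet following TEH must carry the FR or EFR signature.
    if len(frame) == 0 or FRAME_LEN.get(frame[0] >> 4) != len(frame):
        raise ValueError("frame is not a valid RFC 3551 FR or EFR frame")
    if bfi:
        teh |= TEH_BFI
    return bytes([teh]) + bytes(frame)
```

With this sketch, case a) above corresponds to `build_traulike_payload(frame, bfi=True, taf=...)` with the presumed-erroneous frame passed in, and case b) to passing `None` as the frame.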
+
+The option of No_Data BFI is provided in this RTP transport format specification
+because if this option were disallowed, senders would be tasked with an
+additional burden of having to artificially generate dummy or "garbage" frame
+bits. This task is slightly complicated, as explained in the following section,
+and the present design moves that task from all senders to only those receivers
+that need it.
+
+Lack of SID classification bits matching TRAU-UL C13 & C14
+----------------------------------------------------------
+
+TRAU-UL frame format includes two bits C13 & C14 that carry the ternary SID flag
+(0, 1 or 2) as defined in GSM 06.31 and 06.81 section 6.1.1 (same section in
+both specs). No equivalent bits are included in the TRAUlike RTP transport
+format as defined by this specification - however, these bits are redundant.
+The rules of section 6.1.1 in GSM 06.31 and 06.81, hereafter called S611 rules,
+specify a strictly deterministic, unambiguous formula by which these C13 & C14
+bits derive their values from the bit content of the FR/EFR frame payload -
+thus if a TRAU-UL frame is received in which these C13 & C14 bits fail to match
+the S611 value derived from the contained payload, then that TRAU-UL frame is
+defective. There is no need to include such redundant bits in our TRAUlike RTP
+format, only to create confusion for receivers as to which source of SID S611
+classification they should use.
+
+Feeding received TRAUlike BFI frames to an EFR decoder
+======================================================
+
+If an EFR decoder implementation is based on the reference C source from ETSI,
+this decoder requires that _some_ frame bits input be fed to it at all times,
+even when BFI=1. But what if the BFI packet came in as No_Data? In that case
+the receiver must synthesize its own fake "bad data" bits to feed to the
+standard decoder. When synthesizing "bad data" bits in this manner, the
+following rules should be observed:
+
+* The 140 bits corresponding to "fixed codebook excitation pulses" (35 bits in
+  each of the 4 subframes) shall be filled using a PRNG. These bits are the
+  ones used by the standard decoder when its internal state, based on previous
+  good frames, puts it in GSM 06.61 substitution/muting mode as opposed to
+  GSM 06.62 comfort noise generation mode.
+
+* The remaining 104 bits of the EFR frame shall be set to 0. These bits are
+  never used by the standard decoder under the condition of BFI=1, and setting
+  them to 0 prevents the possibility of S611 rules classifying the frame as SID
+  even if the PRNG output in the other 140 bits happens to be all 1s in those
+  SID codeword bit positions (70 out of 140) that fall within the "fixed
+  codebook excitation pulses" portion.
+
+Converting from TRAU-UL to TRAUlike RTP
+=======================================
+
+There will be a need to convert from standard TS 48.060 TRAU-UL frames to our
+TRAUlike RTP format in the following two scenarios:
+
+1) When interfacing an E1 BTS to Osmocom RAN, when and if such support is to be
+   added to OsmoMGW;
+
+2) In the CN transcoder operating in TFO mode, when forwarding received TFO
+   frames to the local RAN.
+
+In both cases the conversion is straightforward:
+
+* Always generate full-length TRAUlike RTP payloads, never generate No_Data in
+  the case of a properly received TRAU-UL speech (not idle) frame.
+
+* Forward the payload bits directly from TRAU-UL to TRAUlike RTP, for both good
+  and bad frames.
+
+* Directly forward BFI, TAF and DTXd frame indicator bits from TRAU-UL C-bits
+  to TEH octet bits.
+
+* Ignore TRAU-UL C13 & C14 bits.
+
+Converting from TRAUlike RTP to TRAU-UL
+=======================================
+
+This direction of conversion will need to be performed in the CN transcoder when
+emitting TFO frames toward the outside world. The following rules will need to
+be applied:
+
+* If the incoming TRAUlike RTP payload is full-length, as opposed to No_Data,
+  simply copy the payload bits into the constructed TRAU-UL frame, for both
+  good (BFI=0) and bad (BFI=1) frames.
+
+* If the incoming TRAUlike RTP payload is No_Data, put the following filler in
+  the data bits portion of the TRAU-UL frame:
+
+  - For FR codec, use the silence frame of 3GPP TS 46.011 Table 1 as the filler.
+
+  - For EFR codec, perform the same PRNG procedure as detailed earlier in this
+    document for the case of feeding a No_Data BFI packet to the standard ETSI
+    decoder for EFR. Given that a TFO-frame-emitting transcoder still needs to
+    run its regular speech decoder in order to fill the upper 6 bits of each
+    outgoing G.711 sample octet, the same No_Data PRNG handler will typically
+    be run just once for both internal decoding and TFO frame output.
+
+* Algorithmically set C13 & C14 bits in the generated TRAU-UL frame per the
+  rules of S611. This step can be done using osmo_{fr,efr}_sid_classify()
+  functions proposed in this Gerrit patch submission:
+
+  https://gerrit.osmocom.org/c/libosmocore/+/32183
+
+  or using equivalent functions in Themyscira libgsmefr and libgsmfrp.
+
+* Directly forward BFI, TAF and DTXd frame indicator bits from TEH octet bits
+  to TRAU-UL C12, C15 and C17, respectively.
+
+Mixing standard RFC 3551 and TRAUlike RTP payloads
+==================================================
+
+An RTP stream receiver for FR/EFR codecs that supports the present non-standard
+extension to the RTP payload format shall behave gracefully when it receives a
+mixture of standard RFC 3551 payloads and TRAUlike payloads in the same RTP
+stream. A receiver that has no interest in the additional information carried
+in the TRAUlike Extension Header shall simply strip the TEH octet when one is
+received, reducing the received payload to standard RFC 3551; if a BFI or
+No_Data payload is received, treat it the same as if nothing at all was
+received. A receiver that is interested in the TRAUlike Extension Header but
+receives an FR/EFR payload without one should behave as if it received a TEH
+with BFI=0, TAF=0, and a received zero-length RTP payload should be treated the
+same as receiving a No_Data TRAUlike payload with TAF=0.
+
+There may even be cases when an RTP sender may alternate between sending
+standard RFC 3551 payloads and TRAUlike payloads in the same session: for
+example, a TFO-supporting CN transcoder may emit "plain" RFC 3551 payloads when
+supplying the output of its free-running speech encoder, but switch to sending
+TRAUlike payloads when it switches to forwarding bits received in TFO frames
+from the far end.
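
The receiver rules for mixed streams can be sketched in Python as follows. This is an illustration under stated assumptions, not code from any Osmocom component: the names `RxFrame`, `parse_payload` and `strip_teh` are hypothetical, the DTXd bit is left out of the result for brevity, and rejecting malformed payloads with an exception is a choice made here that the specification does not mandate.

```python
from typing import NamedTuple, Optional

# Signature nibbles and RFC 3551 frame lengths, as given in the specification.
FR_SIG, EFR_SIG, TEH_SIG = 0xD, 0xC, 0xE
FRAME_LEN = {FR_SIG: 33, EFR_SIG: 31}
TEH_NDF, TEH_BFI, TEH_TAF = 0x04, 0x02, 0x01


class RxFrame(NamedTuple):
    frame: Optional[bytes]   # RFC 3551 frame, or None for No_Data
    bfi: bool
    taf: bool


def _check_frame(frame):
    # Accept only a complete FR or EFR frame with a matching signature nibble.
    if len(frame) == 0 or FRAME_LEN.get(frame[0] >> 4) != len(frame):
        raise ValueError("not a valid RFC 3551 FR or EFR frame")
    return bytes(frame)


def parse_payload(payload):
    """Receiver that is interested in the TEH information."""
    if len(payload) == 0:
        # Zero-length payload: same as No_Data TRAUlike payload with TAF=0.
        return RxFrame(None, True, False)
    if payload[0] >> 4 != TEH_SIG:
        # Plain RFC 3551 payload: behave as if TEH had BFI=0, TAF=0.
        return RxFrame(_check_frame(payload), False, False)
    teh = payload[0]
    bfi = bool(teh & TEH_BFI)
    taf = bool(teh & TEH_TAF)
    if teh & TEH_NDF:
        # Assumption of this sketch: reject NDF=1 with BFI=0 or extra octets.
        if not bfi or len(payload) != 1:
            raise ValueError("malformed No_Data payload")
        return RxFrame(None, True, taf)
    return RxFrame(_check_frame(payload[1:]), bfi, taf)


def strip_teh(payload):
    """Receiver with no interest in TEH: reduce to RFC 3551 or nothing."""
    rx = parse_payload(payload)
    # BFI and No_Data payloads are treated as if nothing was received.
    return None if rx.bfi else rx.frame
```

Because the format is identified per packet from the upper nibble of the first octet, the same parser handles a sender that alternates between plain RFC 3551 and TRAUlike payloads within one session.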