view doc/SMS-PDU-decoding @ 31:19476164c54d
doc/SMS-PDU-decoding: document imported utilities
author: Mychaela Falconia <falcon@freecalypso.org>
date: Fri, 14 Jun 2024 18:48:58 +0000
The decoding part of the present sms-coding-utils suite consists of two
programs: sms-pdu-decode and pcm-sms-decode. Their functions are as follows:

* The input to sms-pdu-decode is an ASCII text stream (stdin or read from a
  file) in which every SMS PDU to be decoded appears as a long hex string.
  This input can originate from the GSM 07.05 interface on a FreeCalypso GSM
  MS (fcup-smdump utility in FC host tools), in which case every GSM 03.40
  TPDU will be preceded by an SC address field - the original use case for
  sms-pdu-decode, run it without special options. In the other alternative,
  the input to sms-pdu-decode can originate from test scenarios on the network
  side of GSM (SMSC development and testing), in which case input SMS PDUs
  will be pure GSM 03.40 TPDUs, without SC address prefix - use
  sms-pdu-decode -n option in this case.

* The input to pcm-sms-decode is a binary file with 176 bytes per record,
  corresponding to the format of EF_SMS elementary file on SIM cards. This
  program can be used to decode readouts of this EF_SMS file made with
  fc-simtool, or readouts of /pcm/SMS file in the flash file system of Pirelli
  DP-L10 phone, which uses the same format - the latter use case arose first
  in chronological order of FreeCalypso development, hence the name of the
  utility.

Common options: character set and dump format control
=====================================================

By default, sms-pdu-decode and pcm-sms-decode only emit 7-bit ASCII characters
in their output; any GSM7 or UCS-2 characters which fall outside of this plain
ASCII repertoire are converted into backslash escapes. This conservative
default behaviour can be modified as follows:

-e      option extends the potential output character repertoire from 7-bit
        ASCII to 8-bit ISO 8859-1. Any 8859-1 high characters are emitted as
        single bytes, i.e., are NOT encoded in UTF-8 - this option is intended
        for non-UTF-8 environments.

-u      option extends the potential output character repertoire to all of
        Unicode, and changes the output encoding to UTF-8.

Regardless of whether the source message character set is GSM7 or UCS-2 and
irrespective of -e or -u options, any backslash characters are always escaped
as \\, and any CR characters are represented as \r. Additional backslash
escape encodings depend on the source message character set:

* If the source message character set is GSM7, the following additional
  backslash escapes can be emitted:

  - In the absence of -u option, the Euro currency symbol is converted to \E;

  - Any GSM7 escape characters (0x1B) that aren't part of a valid escape
    sequence for [\]^ or {|}~ or \E are represented as \e;

  - Any GSM7 characters that either can't be represented in the output
    character set (ASCII or ISO 8859-1) or are outright invalid per GSM 03.38
    are represented as \xXX, where XX is the original GSM7 code point in
    2-digit hexadecimal form between 00 and 7F;

  - Invalid GSM7 escape sequences are emitted as \e\xXX.

* If the source message character set is UCS-2, the following additional
  backslash escapes can be emitted:

  - Invalid UCS-2 characters falling onto control character code points are
    emitted as \u00XX;

  - UCS-2 characters that can't be represented in ASCII or ISO 8859-1 (when
    running without -u option) are emitted as \uXXXX;

  - If UTF-16 surrogate pairs are detected in the input, the encoded
    high-plane Unicode character is reconstructed and emitted as \UXXXXXX in
    the absence of -u option, or as the appropriate UTF-8 byte sequence with
    -u.

-h      option causes the user data portion of every message to be displayed
        as a raw hex dump; in the case of GSM7-encoded messages, this hex dump
        shows the unpacked septets.

sms-pdu-decode specifics
========================

The input to the program may contain additional text besides SMS PDUs in the
form of long hex strings; all lines that are not hex strings are passed
through to the output.
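
The split between hex PDU lines and pass-through text can be sketched in
Python as shown here. This is an illustration only, not the code of
sms-pdu-decode itself: the names is_hex_pdu_line and split_input are
hypothetical, and the exact acceptance rule (a non-empty line made up solely
of an even number of hex digits, with no surrounding whitespace) is an
assumption that may differ from what the real program accepts.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def is_hex_pdu_line(line: str) -> bool:
    """Return True if the line is nothing but directly abutted hex bytes.

    Assumption: at least one byte, an even digit count, no embedded spaces.
    """
    text = line.rstrip("\r\n")
    if len(text) < 2 or len(text) % 2 != 0:
        return False
    return all(ch in HEX_DIGITS for ch in text)


def split_input(lines):
    """Yield ("pdu", bytes) for hex lines and ("text", str) for all others."""
    for line in lines:
        text = line.rstrip("\r\n")
        if is_hex_pdu_line(line):
            # A hex line is a candidate SMS PDU: convert it to raw bytes.
            yield "pdu", bytes.fromhex(text)
        else:
            # Anything else is passed through unchanged.
            yield "text", text
```

A caller would run the full decoder on each "pdu" item and print each "text"
item as is.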

Every input line that is purely a string of directly abutted hex bytes is
taken to be an SMS PDU in need of decoding, and the full decoding operation is
attempted.

The following additional options are available besides the common -e, -u and
-h options documented above:

-n      By default, sms-pdu-decode expects every hex-encoded SMS PDU to begin
        with an SC address field, followed by a GSM 03.40 TPDU - the format
        used on GSM 07.05 interface in PDU mode and in SIM SMS storage. With
        -n option, sms-pdu-decode expects pure GSM 03.40 TPDUs instead,
        without SC address prefix.

-p      Keep all hex-encoded PDU lines in the output: for each encountered
        hex PDU, first the original hex line is output, then the decoding
        result.

pcm-sms-decode specifics
========================

This program reads a binary file; the file to be read must be named on the
command line. The output is ASCII (or an extended character set with -e or -u
options as described in the common section above), naming each dumped record
as "Record #%u" and showing its content.

For a binary file of N records, the default record numbering is from 0 to N-1:
this numbering order is natural to this Mother's native world of CompSci, and
I implemented it when I originally wrote pcm-sms-decode for the purpose of
decoding /pcm/SMS readouts from Pirelli DP-L10 FFS. However, when I later
wrote fc-simtool and pcm-sms-decode acquired a second use case of decoding SIM
EF_SMS readouts, a mismatch became apparent: the record numbering used in
READ RECORD and UPDATE RECORD commands on the SIM-ME interface is 1..N instead
of 0..N-1. pcm-sms-decode -s option switches the record numbering scheme to
1..N to match the SIM application.
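
The two record numbering schemes can be illustrated with a short Python
sketch. It is not the implementation of pcm-sms-decode: iter_records and its
sim_numbering parameter are hypothetical names standing in for the default
behaviour and the -s option, and silently ignoring a trailing partial record
is an assumption made for this example only.

```python
RECORD_SIZE = 176  # bytes per EF_SMS record, as stated in this document


def iter_records(data: bytes, sim_numbering: bool = False):
    """Yield (record_number, record_bytes) for each complete record.

    sim_numbering=False numbers records 0..N-1 (the default scheme);
    sim_numbering=True numbers them 1..N, as the -s option does.
    """
    first = 1 if sim_numbering else 0
    count = len(data) // RECORD_SIZE  # assumption: a partial tail is ignored
    for index in range(count):
        start = index * RECORD_SIZE
        yield first + index, data[start:start + RECORD_SIZE]


def record_labels(data: bytes, sim_numbering: bool = False):
    """Return the "Record #n" labels that would head each dumped record."""
    return [f"Record #{num}" for num, _ in iter_records(data, sim_numbering)]
```

For a file holding two records, record_labels returns labels numbered 0 and 1
by default, and 1 and 2 when sim_numbering is set.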