comparison doc/SMS-PDU-decoding @ 31:19476164c54d
doc/SMS-PDU-decoding: document imported utilities
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Fri, 14 Jun 2024 18:48:58 +0000 |
The decoding part of the present sms-coding-utils suite consists of two
programs: sms-pdu-decode and pcm-sms-decode.  Their functions are as follows:

* The input to sms-pdu-decode is an ASCII text stream (stdin or read from a
  file) in which every SMS PDU to be decoded appears as a long hex string.
  This input can originate from the GSM 07.05 interface on a FreeCalypso GSM MS
  (fcup-smdump utility in FC host tools), in which case every GSM 03.40 TPDU
  will be preceded by an SC address field.  This is the original use case for
  sms-pdu-decode; run it without special options.  Alternatively, the input to
  sms-pdu-decode can originate from test scenarios on the network side of GSM
  (SMSC development and testing), in which case input SMS PDUs will be pure
  GSM 03.40 TPDUs, without SC address prefix; use the sms-pdu-decode -n option
  in this case.

* The input to pcm-sms-decode is a binary file with 176 bytes per record,
  corresponding to the format of the EF_SMS elementary file on SIM cards.  This
  program can be used to decode readouts of this EF_SMS file made with
  fc-simtool, or readouts of the /pcm/SMS file in the flash file system of the
  Pirelli DP-L10 phone, which uses the same format.  The latter use case arose
  first in the chronological order of FreeCalypso development, hence the name
  of the utility.

Common options: character set and dump format control
=====================================================

By default, sms-pdu-decode and pcm-sms-decode only emit 7-bit ASCII characters
in their output; any GSM7 or UCS-2 characters which fall outside of this plain
ASCII repertoire are converted into backslash escapes.  This conservative
default behaviour can be modified as follows:

-e option extends the potential output character repertoire from 7-bit ASCII to
8-bit ISO 8859-1.  Any 8859-1 high characters are emitted as single bytes,
i.e., are NOT encoded in UTF-8 - this option is intended for non-UTF-8
environments.

-u option extends the potential output character repertoire to all of Unicode,
and changes the output encoding to UTF-8.

Regardless of whether the source message character set is GSM7 or UCS-2 and
irrespective of -e or -u options, any backslash characters are always escaped
as \\, and any CR characters are represented as \r.  Additional backslash
escape encodings depend on the source message character set:

* If the source message character set is GSM7, the following additional
  backslash escapes can be emitted:

  - In the absence of -u option, the Euro currency symbol is converted to \E;

  - Any GSM7 escape characters (0x1B) that aren't part of a valid escape
    sequence for [\]^ or {|}~ or \E are represented as \e;

  - Any GSM7 characters that either can't be represented in the output character
    set (ASCII or ISO 8859-1) or are outright invalid per GSM 03.38 are
    represented as \xX, where xX is the original GSM7 code point in 2-digit
    hexadecimal form between 00 and 7F;

  - Invalid GSM7 escape sequences are emitted as \e\xX.

* If the source message character set is UCS-2, the following additional
  backslash escapes can be emitted:

  - Invalid UCS-2 characters falling onto control character code points are
    emitted as \u00XX;

  - UCS-2 characters that can't be represented in ASCII or ISO 8859-1 (when
    running without -u option) are emitted as \uXXXX;

  - If UTF-16 surrogate pairs are detected in the input, the encoded high-plane
    Unicode character is reconstructed and emitted as \UXXXXXX in the absence
    of -u option, or as the appropriate UTF-8 byte sequence with -u.

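
The UCS-2 rules above can be sketched in Python as follows.  This is a rough
model written for illustration, not the code of the actual utilities.  The
function name escape_ucs2 and its mode argument ("ascii", "latin1" for -e,
"unicode" for -u) are hypothetical.  The sketch also makes assumptions that the
text above does not settle: which code points count as control characters
(here C0 other than LF, DEL and C1), that LF passes through unchanged, and
that unpaired surrogates are escaped as \uXXXX.

```python
def escape_ucs2(data: bytes, mode: str = "ascii") -> str:
    """Render big-endian UCS-2 user data with backslash escapes (sketch)."""
    # Split into 16-bit code units; a trailing odd byte is ignored.
    units = [int.from_bytes(data[i:i + 2], "big")
             for i in range(0, len(data) - 1, 2)]
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        i += 1
        is_high = 0xD800 <= u <= 0xDBFF
        if is_high and i < len(units) and 0xDC00 <= units[i] <= 0xDFFF:
            # Reconstruct the high-plane character from the surrogate pair.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
            out.append(chr(cp) if mode == "unicode" else "\\U%06X" % cp)
        elif 0xD800 <= u <= 0xDFFF:
            out.append("\\u%04X" % u)      # assumption: lone surrogate
        elif u == 0x5C:
            out.append("\\\\")             # backslash is always escaped
        elif u == 0x0D:
            out.append("\\r")              # CR is always shown as \r
        elif u == 0x0A:
            out.append("\n")               # assumption: LF passes through
        elif u < 0x20 or 0x7F <= u <= 0x9F:
            out.append("\\u%04X" % u)      # control code points: \u00XX
        elif u < 0x7F:
            out.append(chr(u))             # plain ASCII
        elif u <= 0xFF:
            # ISO 8859-1 high half: literal only with -e or -u
            out.append(chr(u) if mode != "ascii" else "\\u%04X" % u)
        else:
            out.append(chr(u) if mode == "unicode" else "\\u%04X" % u)
    return "".join(out)
```

The returned string would still have to be written out in the matching
encoding (ASCII, ISO 8859-1 or UTF-8) by the caller.
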
-h option causes the user data portion of every message to be displayed as a
raw hex dump; in the case of GSM7-encoded messages, this hex dump shows the
unpacked septets.

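
For readers unfamiliar with GSM7 packing, the following Python sketch shows
what "unpacked septets" means: the standard GSM 03.38 packing stores 7-bit
characters back to back, least significant bit first, and unpacking recovers
one value in the range 00-7F per character.  The function name unpack_septets
is hypothetical, and the sketch assumes there is no user data header (and thus
no fill bits) in front of the text.

```python
def unpack_septets(packed: bytes, count: int) -> list:
    """Unpack `count` 7-bit septets from GSM 03.38 packed octets."""
    septets = []
    acc = 0     # bit accumulator; the lowest bits are the oldest
    nbits = 0   # number of valid bits currently held in acc
    for octet in packed:
        acc |= octet << nbits
        nbits += 8
        while nbits >= 7 and len(septets) < count:
            septets.append(acc & 0x7F)
            acc >>= 7
            nbits -= 7
    return septets


# The packed form of "hellohello" is a commonly used test vector.
septets = unpack_septets(bytes.fromhex("E8329BFD4697D9EC37"), 10)
print(" ".join("%02X" % s for s in septets))
# 68 65 6C 6C 6F 68 65 6C 6C 6F
```
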
sms-pdu-decode specifics
========================

The input to the program may contain additional text besides SMS PDUs in the
form of long hex strings; all lines that are not hex strings are passed through
to the output.  Every input line that is purely a string of directly abutted hex
bytes is taken to be an SMS PDU in need of decoding, and the full decoding
operation is attempted.  The following additional options are available besides
the common -e, -u and -h options documented above:

-n      By default, sms-pdu-decode expects every hex-encoded SMS PDU to begin
        with an SC address field, followed by a GSM 03.40 TPDU - the format used
        on GSM 07.05 interface in PDU mode and in SIM SMS storage.  With -n
        option, sms-pdu-decode expects pure GSM 03.40 TPDUs instead, without
        SC address prefix.

-p      Keep all hex-encoded PDU lines in the output: for each encountered hex
        PDU, first the original hex line is output, then the decoding result.

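
The line classification described above can be modelled with a short Python
sketch.  It is an illustration under assumptions, not the actual
implementation: the names is_hex_pdu_line and process are hypothetical, the
decode callback stands in for the real decoder, and the exact rules of the
real program (tolerance of surrounding whitespace, any minimum length) may
differ.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def is_hex_pdu_line(line: str) -> bool:
    """True if the line consists solely of directly abutted hex bytes."""
    s = line.strip()
    return len(s) > 0 and len(s) % 2 == 0 and all(c in HEX_DIGITS for c in s)


def process(lines, decode, keep_pdu=False):
    """Pass non-hex lines through; replace hex lines with decoder output.

    keep_pdu models the -p behaviour: the original hex line is emitted
    before its decoding result.
    """
    for line in lines:
        text = line.rstrip("\n")
        if is_hex_pdu_line(text):
            if keep_pdu:
                yield text
            yield decode(bytes.fromhex(text.strip()))
        else:
            yield text
```

For example, list(process(["hello\n", "0011\n"], lambda b: "len=%d" % len(b),
keep_pdu=True)) yields the lines "hello", "0011" and "len=2".
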
pcm-sms-decode specifics
========================

This program reads a binary file; the file to be read must be named on the
command line.  The output is ASCII (or an extended character set with -e or -u
options as described in the common section above), naming each dumped record as
"Record #%u" and showing its content.  For a binary file of N records, the
default record numbering is from 0 to N-1: this numbering order is natural to
this Mother's native world of CompSci, and I implemented it when I originally
wrote pcm-sms-decode for the purpose of decoding /pcm/SMS readouts from Pirelli
DP-L10 FFS.  However, when I later wrote fc-simtool and pcm-sms-decode acquired
a second use case of decoding SIM EF_SMS readouts, a mismatch became apparent:
the record numbering used in READ RECORD and UPDATE RECORD commands on the
SIM-ME interface is 1..N instead of 0..N-1.  pcm-sms-decode -s option switches
the record numbering scheme to 1..N to match the SIM application.
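
The two numbering schemes can be illustrated with a small Python sketch.  It
only models the splitting of a binary image into 176-byte records and the
choice of the first record number; the names iter_records and sim_numbering
are hypothetical, and dropping a trailing partial record is an assumption of
this sketch.

```python
RECORD_SIZE = 176  # bytes per record in EF_SMS and /pcm/SMS


def iter_records(image: bytes, sim_numbering: bool = False):
    """Yield (record_number, record_bytes) pairs from a binary image.

    sim_numbering models the -s option: records are numbered 1..N
    instead of the default 0..N-1.
    """
    first = 1 if sim_numbering else 0
    nrec = len(image) // RECORD_SIZE  # a trailing partial record is dropped
    for idx in range(nrec):
        start = idx * RECORD_SIZE
        yield first + idx, image[start:start + RECORD_SIZE]


# Example with a dummy three-record image of zero bytes.
image = bytes(RECORD_SIZE * 3)
for num, rec in iter_records(image, sim_numbering=True):
    print(f"Record #{num}: {len(rec)} bytes")
```
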