view doc/User-phone-tools @ 805:a43c5dc251dc
doc/User-phone-tools: new sms-pdu-decode backslash escapes
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Thu, 25 Mar 2021 05:10:43 +0000 |
parents | b5235f8240b9 |
children | 8cf7d41f2821 |
FreeCalypso User Phone Tools are a new software addition to the FreeCalypso
family. These tools are programs that run on a Unix host computer such as a
GNU/Linux PC or laptop and communicate with a FreeCalypso phone or modem via
the standard AT command interface, rather than any of the formerly proprietary
interfaces specific to TI's internal architecture. The following tools are
currently available:

fcup-at         Issues an arbitrary AT command given on the command line.

fcup-settime    Issues AT+CCLK command to the target to set its clock to the
                host computer's notion of local time.

fcup-smdump     Retrieves a dump of SMS records (received, sent or stored
                messages) from the FC device's SMS storage (currently SIM
                storage; ME storage may be implemented in the future),
                optionally deleting them from the severely space-limited
                SIM/ME storage afterward.

fcup-smsend*    Tools for sending outgoing SMS from a host computer through a
                FreeCalypso phone or modem and/or writing such outgoing SMS
                into the FC device's SMS storage.

fcup-smwrite    Debug and development tool: writes arbitrary message records
                into the FC device's SMS storage (currently SIM storage) in
                any of the possible 4 states, with arbitrary incoming or
                outgoing SMS PDU content.

Because these tools communicate with the target via standards-defined AT
commands, in theory they ought to work with any AT-command-speaking 3GPP phone
or modem and not just our own FreeCalypso. However, experience has shown that
in the case of the common proprietary implementations, practice does not match
theory: when I (Mychaela) tried these same AT commands against a random
off-the-shelf proprietary modem (Huawei E303 USB stick modem for 3G), the
following problems were seen:

* The essential AT+CMGL=4 command for retrieving the full set of SMS records
  from SIM storage in PDU mode appears to be broken: all I got was a hang. Its
  text mode counterpart AT+CMGL="ALL" produces incomplete output.
* Qualcomm/Huawei's implementation of the AT command interface does not allow
  AT+CSCS to be set to "HEX"; our fcup-smdump implementation uses this setting
  so that the phonebook names returned along with SMS PDUs in the +CMGL
  responses can be parsed reliably no matter what weird characters they might
  contain.
* Setting AT+CSCS to "8859-1" is not supported either; this setting is used by
  our fcup-smsend and fcup-smsendmult tools when sending in text mode.
* Sending outgoing SMS with fcup-smsend in PDU mode (which does not touch
  AT+CSCS) works in that the message goes out, but the tool complains afterward
  because the echo after the ^Z is different from what our tools expect.

Because of these quirks, our FC User Phone Tools officially work only with our
own FreeCalypso phones and modems, and are not expected to work against various
proprietary implementations. Let us not forget that the broken and buggy nature
of the common proprietary implementations is the very reason why we need
FreeCalypso in the first place.

Target interface options
========================

Our fcup-* tools can communicate with the AT-command-speaking target in one of
two ways:

* The default is the standard AT command interface over a dedicated UART. As of
  this writing, the only FreeCalypso device that provides a full-featured AT
  command interface of this kind is our FCDEV3B modem, but the ultimate goal of
  the project is to build our own end user phone handset (a Libre Dumbphone)
  that will also provide a full-featured AT command interface on its USB port
  via a built-in CP2102 or FT232R chip.
* As a dirty hack, one can run FreeCalypso GSM fw on some alien hw targets,
  currently Motorola C1xx and Pirelli DP-L10. In this hacked-up configuration
  there is no dedicated UART available for a standard AT command interface, but
  there is a hack that allows a limited subset of AT commands to be passed over
  the RVTMUX binary packet interface provided by the running FreeCalypso GSM fw.
  Our fcup-* tools can work with this alternate target interface option and
  thereby support these crippled targets.

The AT-over-RVTMUX mechanism was originally invented back in 2015 as a
development aid, and was never intended for production use or to support any
kind of end user functionality. One of the limitations of its original
incarnation was that the strings that are sent to ATI via this interface were
limited to 254 characters, whereas sending or writing SMS in hex format requires
longer strings. As of early 2019, this limitation has been lifted: our
Magnetite and Selenite firmwares from 20190109 onward support an extended
version of our AT-over-RVTMUX hack that allows longer strings to be sent in
pieces, and the present version of our FC User Phone Tools suite will send the
strings it generates via this extended mechanism whenever they exceed the old
254 character limit. The new mechanism works correctly starting with the
20190128 firmware release for modem products and the 20190129 fw release for
Mot C1xx phones, thus when the present version of FC User Phone Tools is used
to communicate with our current firmwares, both target interface options
provide equivalent functionality on all supported targets.

All fcup-* tools take the following common command line options for selecting
the AT command target interface:

-B baud         Valid only when -p is also given; selects a different baud rate
                than the default 115200 bps.

-n              Dry run debug mode with no target interface at all: the AT
                commands which would otherwise be sent to the target are simply
                printed on stdout.

-p ttyport      Names the serial port to be used to talk to the target.

-R              Use the AT-over-RVTMUX interface instead of the standard AT
                command interface over a dedicated UART.

-X program      Use the specified external program as the AT target
                communication back-end; read the source code for the details.

-R and -p options interact as follows:

Neither -R      The standard dedicated AT command interface is used;
nor -p          FC_GSM_DEVICE= environment variable needs to be set to point to
                the serial port.

-p only         The standard dedicated AT command interface is used; the serial
                port is named with the -p option.

-R only         AT-over-RVTMUX interface is used; the fcup-* tool connects to
                an already running rvinterf process.

-R and -p       AT-over-RVTMUX interface is used; a new rvinterf process is
                launched to talk RVTMUX on the specified serial port.

Retrieving and decoding stored SMS
==================================

As of this writing, our current FreeCalypso GSM firmware supports only SIM
storage for SMS, i.e., there is no working mechanism currently for storing SMS
records (received and sent messages) in the phone's or modem's own flash file
system. The capacity of this SIM SMS storage is determined by the SIM issuer,
but it is typically quite limited, on the order of 20 to 30 messages. The model
adopted for FreeCalypso is that incoming (and possibly saved outgoing) messages
initially accumulate in the SIM storage as they come in, and then the user
periodically transfers them to her larger host computer, simultaneously
deleting them from the SIM storage to reclaim the limited space.

The retrieval of stored SMS from FreeCalypso GSM devices is accomplished with
our fcup-smdump utility; like all SMS operations with the current
tools+firmware combination, this operation works exactly the same whether the
FC GSM device offers a full-featured AT command interface or only AT over
RVTMUX. SMS retrieval is always done in PDU mode, and the output from
fcup-smdump contains raw SMS PDUs in the form of long hex strings. A separate
utility called sms-pdu-decode then does what its name says.

The intended mode of usage is something like this:

fcup-smdump -d >> long-term-sms-log

The -d option to fcup-smdump tells it to delete the retrieved messages from the
SIM or future ME storage; this option should only be used when the output is
redirected into some kind of longer-term storage. In the above model the file
named long-term-sms-log becomes what its name says as new messages retrieved
from the FC GSM device get added to it; the format will look like this:

Received message:
XXXXXX...
Received message:
XXXXXX...
Sent message:
XXXXXX...
Stored unsent message:
XXXXXX...
Received message:
XXXXXX...

Each of the "XXXXXX..." lines will be a long hex string giving an SMS PDU. The
idea is that the complete record of all received and sent messages should be
stored on the user's big computer in raw PDU form, rather than decoded, and the
decoding utility sms-pdu-decode should be invoked by the user (with the message
log file as input) as needed for reading these messages.

The message decoding utility sms-pdu-decode does its best to decode and show
everything without dropping any bits: in addition to the actual decoded message
characters and the From/To address (the "end user" content of the message), it
decodes and shows the SC address, the first octet, the MR octet for outgoing
messages, PID and DCS octets, the SC timestamp or the validity period fields,
and the UDH bytes if present. However, some bits can still be lost in the
decoding, which is why it is important to archive messages in the raw PDU form:

* Padding bits used to round the From/To address and septet-based user data to
  an octet boundary and to round any UDH to a septet boundary are not decoded.
* If the user data portion of the message is 8-bit or compressed data (per the
  DCS octet), it is shown as a raw hex dump, which is lossless, but if it is
  GSM7 or UCS-2 text (GSM 03.38 character encodings), the characters are
  converted to the user's character set (plain ASCII only by default) for
  display, and some characters may not be displayable.

Character sets and encodings
----------------------------

By default, sms-pdu-decode only emits 7-bit ASCII characters in its output; any
GSM7 or UCS-2 characters which fall outside of this plain ASCII repertoire are
converted into backslash escapes. This conservative default behaviour can be
modified as follows:

-e option extends the potential output character repertoire from 7-bit ASCII to
8-bit ISO 8859-1. Any 8859-1 high characters are emitted as single bytes, i.e.,
are NOT encoded in UTF-8 - this option is intended for non-UTF-8 environments.

-u option extends the potential output character repertoire to all of Unicode,
and changes the output encoding to UTF-8.

Regardless of whether the source message character set is GSM7 or UCS-2 and
irrespective of -e or -u options, any backslash characters are always escaped
as \\, and any CR characters are represented as \r. Additional backslash escape
encodings depend on the source message character set:

* If the source message character set is GSM7, the following additional
  backslash escapes can be emitted:
  - In the absence of -u option, the Euro currency symbol is converted to \E;
  - Any GSM7 escape characters (0x1B) that aren't part of a valid escape
    sequence for [\]^ or {|}~ or \E are represented as \e;
  - Any GSM7 characters that either can't be represented in the output
    character set (ASCII or ISO 8859-1) or are outright invalid per GSM 03.38
    are represented as \xXX, where XX is the original GSM7 code point in
    2-digit hexadecimal form between 00 and 7F;
  - Invalid GSM7 escape sequences are emitted as \e\xXX.
* If the source message character set is UCS-2, the following additional
  backslash escapes can be emitted:
  - Invalid UCS-2 characters falling onto control character code points are
    emitted as \u00XX;
  - UCS-2 characters that can't be represented in ASCII or ISO 8859-1 (when
    running without -u option) are emitted as \uXXXX;
  - If UTF-16 surrogate pairs are detected in the input, the encoded
    high-plane Unicode character is reconstructed and emitted as \UXXXXXX in
    the absence of -u option, or as the appropriate UTF-8 byte sequence with
    -u.

-h option causes the user data portion of every message to be displayed as a
raw hex dump; in the case of GSM7-encoded messages, this hex dump shows the
unpacked septets.

Composing and sending outgoing SMS
==================================

When used in the default PDU mode (which now works on all targets with our
current firmware and tools), the primary SMS sending/writing tool fcup-smsend
offers the following capabilities:

* Sending outgoing messages in either GSM7 or UCS-2 encoding;
* Sending either single or long (concatenated) SMS;
* Message body input in ASCII, ISO 8859-1 or UTF-8;
* Message body input either on the command line or on stdin;
* Any messages sent through this tool (single or concatenated) may be
  multiline, i.e., may contain embedded newlines;
* Messages sent in GSM7 encoding can contain ASCII characters [\]^ and {|}~ -
  the tool is smart enough to do the necessary escape encoding.

The default and preferred AT command interface mode for sending/writing SMS is
PDU mode, which works great when the GSM device provides a proper AT command
interface. However, when a message of maximum or near-maximum length is being
submitted to the modem in PDU mode, the hex string that needs to be sent is
quite long, and at the time when our FC User Phone Tools were first designed
and written, our AT-over-RVTMUX mechanism could not handle such long strings.
Because we sought to have at least limited SMS sending and writing support for
crippled Motorola and Pirelli targets, we also implemented text mode support in
fcup-smsend and fcup-smsendmult, enabled with the -t option. In this text (-t)
mode the following restrictions apply:

* Only single SMS can be sent, not concatenated;
* Only GSM7-encoded messages can be sent, not UCS-2;
* No multiline messages can be sent, i.e., no newlines in the message body;
* ASCII characters [\]^ and {|}~ won't be sent correctly - GSM 07.05 text mode
  drops them.

Now that we have extended our AT-over-RVTMUX mechanism to support longer
strings and gained full support for PDU mode on all targets, the above -t mode
is no longer necessary for any use case, as the default PDU mode is a proper
superset in functionality. However, support for this -t mode has been retained,
as removing software functionality for no good reason is not the way of FOSS.

The invocation syntax is as follows:

fcup-smsend [options] dest-addr [message]

The destination address must be given on the command line; the address digits
may be optionally followed by a comma and an address type byte, either decimal
or hexadecimal with 0x prefix. The default address type is 0x91 if the number
begins with a '+' or 0x81 otherwise. If the message body is given on the
command line, it must be given as a single argument; if no message body
argument is given, the message body will be read from stdin. Any trailing
newlines are stripped before SMS encoding.

The following options are supported, in addition to the common target interface
options listed earlier:

-c              Enables concatenated SMS. Concatenated SMS will be sent only if
                the message body exceeds 160 GSM7 or 70 UCS-2 characters,
                otherwise plain SMS will be sent whether -c is given or not -
                but the -c option enables the possibility of sending
                concatenated SMS.

-C refno        Enables concatenated SMS like -c, but also explicitly sets the
                concatenated SMS reference number to be used.
                The number can be either decimal or hexadecimal with 0x prefix.

-q              Concatenated SMS quiet mode. If -c is given without -q, the
                tool prints a message on stdout indicating whether the message
                was sent as single or concatenated, and in how many parts.
                -q suppresses this additional output.

-t              Use text mode instead of PDU mode on the AT command interface.
                This option is incompatible with -c and with -U, and introduces
                other restrictions listed above.

-u              By default, if the message body input contains any 8-bit
                characters, they are interpreted as ISO 8859-1. With -u they
                are interpreted as UTF-8 instead. This option is only relevant
                for GSM7 output encoding, and it is implemented by converting
                the input first from UTF-8 to 8859-1, and then from 8859-1 to
                GSM7 - thus all UTF-8 input characters must fall into the
                8859-1 repertoire, and it is not currently possible to send
                GSM7-encoded messages containing the few Greek letters or the
                Euro currency symbol allowed by GSM 03.38 encoding.

-U              Send message in UCS-2 encoding instead of GSM7. Any 8-bit
                characters in the message body input are interpreted as UTF-8,
                and the entire Basic Multilingual Plane of Unicode is allowed.

-w              By default the outgoing message is sent out on the GSM network
                with the AT+CMGS command. With this -w option, the message is
                first written into SIM or future ME SMS storage with AT+CMGW,
                then sent out on the GSM network with AT+CMSS.

-W              Write only, not send: the message is written into storage with
                AT+CMGW and no further action is taken. The modem's +CMGW:
                responses with message storage indices are forwarded to stdout.
                With this option the destination address argument can be a null
                string or omitted altogether.

Concatenated SMS reference numbers
----------------------------------

Every concatenated SMS transmission needs a reference number, and this number
needs to increment from one concatenated SMS to the next, to help message
recipients sort out which is which.
If the reference number is not given explicitly with -C, fcup-smsend creates
(opens with O_RDWR|O_CREAT) a file named .concat_sms_refno in the invoking
user's $HOME directory; automatically incrementing reference numbers are
maintained in this file. The initial seed is an XOR of all bytes of the current
time returned by gettimeofday(2), followed by simple linear incrementing; these
reference numbers do not need to be random in any kind of cryptographically
secure sense.

fcup-smsendmult
===============

As an alternative to sending concatenated SMS, one can use the fcup-smsendmult
utility to send several single (no UDH) messages in one batch. This utility
supports both text and PDU modes (PDU mode is still the preferred default when
it can be used), and when PDU mode is used, it supports both GSM7 and UCS-2
output encodings just like fcup-smsend.

The messages to be sent are read from stdin, and each input line produces a new
message. The entire batch of messages can be sent to a single recipient, or
each message in the batch can have its own individual destination address. If
the destination address is given on the command line, each input line read from
stdin is just a message body; if no destination address is given on the command
line, each input line must have the following format:

<dest addr><white space><message body>

-t, -u, -U, -w and -W command line options are unchanged from fcup-smsend.

This fcup-smsendmult method of sending batched SMS was originally envisioned as
an alternative to concatenated SMS for crippled hw targets that couldn't
support sending SMS in PDU mode, but that limitation has now been lifted.
Because we do not remove already-implemented functionality for no good reason,
the tool currently remains in search of new potential use cases.
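
As an illustration only, the per-line input format of fcup-smsendmult can be
sketched in Python as follows. This is not code from fcup-smsendmult itself:
the function names parse_dest_addr and split_batch_line are hypothetical, the
address type defaulting (0x91 for a leading '+', 0x81 otherwise) is the rule
documented for fcup-smsend, and the idea that fcup-smsendmult accepts the same
optional ",type" suffix on each line is an assumption.

```python
def parse_dest_addr(arg):
    """Split 'digits[,type]' into (digits, address_type_byte).

    Hypothetical helper; mirrors the defaulting rule documented for
    fcup-smsend, not the tool's actual source code.
    """
    digits, sep, type_str = arg.partition(",")
    if sep:
        # Explicit type byte: decimal, or hexadecimal with 0x prefix.
        if type_str.lower().startswith("0x"):
            addr_type = int(type_str, 16)
        else:
            addr_type = int(type_str, 10)
    elif digits.startswith("+"):
        addr_type = 0x91    # default for numbers beginning with '+'
    else:
        addr_type = 0x81    # default for everything else
    return digits, addr_type


def split_batch_line(line):
    """Split '<dest addr><white space><message body>' into its two fields."""
    fields = line.rstrip("\n").split(None, 1)
    if len(fields) != 2:
        raise ValueError("expected destination address and message body")
    dest, body = fields
    return parse_dest_addr(dest), body
```

For example, split_batch_line("+15551234567 Hello there") yields the address
("+15551234567", 0x91) together with the body "Hello there".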

fcup-smsendpdu
==============

This utility sends out SMS PDUs that have been prepared externally; it only
works in PDU mode - originally it was limited to high-end FreeCalypso hardware
with a full AT command interface, but now we've got PDU mode working on all
targets. The PDUs to be sent out are read from stdin, one long hex string PDU
per line; one can send either a single message or a batch. Because the
destination address and all content details are encoded in the PDU, the tool
does not care if the messages are going to the same recipient or to different
recipients, nor does it care if they constitute a concatenated SMS transmission
or not.

-w and -W options work the same way as in fcup-smsend and fcup-smsendmult.

fcup-smwrite
============

This utility is a debug and development tool; it differs from fcup-smsendpdu in
the following ways:

* fcup-smsendpdu can send messages out with AT+CMGS, write them into memory
  with AT+CMGW, or do a write-then-send sequence (-w option) with AT+CMGW
  followed by AT+CMSS. fcup-smwrite only issues AT+CMGW commands.
* fcup-smwrite passes a second argument to AT+CMGW that sets the message state
  to any of the possible 4 values; fcup-smsend* -W put them in the "stored
  unsent" state.
* The input to fcup-smsendpdu is just PDU hex strings; the input to
  fcup-smwrite needs to have the same format as fcup-smdump output in order to
  indicate what state each message should be written in.
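
Since the fcup-smdump output format doubles as fcup-smwrite input, it may help
to see how such a file can be read back. The following Python sketch is only an
illustration under stated assumptions: it assumes that each state header (such
as "Received message:") and each hex PDU string sits on its own line, as in the
sample shown in the retrieval section, and the function name read_sms_log is
hypothetical. It is not derived from the fcup-smwrite or sms-pdu-decode
sources, and any other lines the real tools may emit are simply skipped.

```python
def read_sms_log(lines):
    """Yield (state_label, pdu_bytes) pairs from fcup-smdump style text.

    Assumed layout: a header line ending in 'message:' followed by one
    line holding the PDU as a hex string.
    """
    label = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                    # ignore blank lines
        if line.endswith("message:"):
            label = line[:-1]           # remember the state, drop the colon
        elif label is not None:
            # bytes.fromhex raises ValueError on malformed hex input
            yield label, bytes.fromhex(line)
            label = None
```

A typical use would be to open the long-term log file and iterate over
read_sms_log(f), keeping the raw PDU bytes alongside the state label.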