view doc/Arch-design @ 2:b203ebebe9b3
doc/Arch-design: fill out sections 2.4.[2-5]
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Fri, 22 Dec 2023 06:38:16 +0000 |
parents | c4f8a32af088 |
children | b084a9542471 |
Themyscira Wireless SMSC implementation
Architectural design specification

1. Purpose and scope of the software

The purpose of the present software project is to facilitate store-and-forward SMS exchange among the following parties:

* Locally owned mobile telephone numbers (LOMTNs) that belong to Themyscira Wireless, with Short Message Service accessed either via the local GSM network (Osmocom-based) or via direct command line access to the SMSC;

* The outside world: the total set of all SMS-capable E.164 telephone numbers in the world, with whom our users must be able to freely exchange SMS just like users of any other cellular phone carrier in USA;

* USA-specific 5-digit and 6-digit short codes: these services aren't accessible from anywhere in the world, only from USA (each country has its own services of this type), but because we are located in USA, we must provide the same access to public services as any other cellular phone carrier;

* Any downstream parties who enter into an interconnection agreement with ThemWi for the purpose of sharing our SMS uplink to the outside world.

1.1. NANP specifics

The design of our SMSC makes the following assumptions that are specific to North American Numbering Plan:

* All LOMTNs and all downstream peer MTNs are expected to be NANP numbers; any/all SMS source or destination numbers in country codes other than +1 are treated as belonging in the Outside World, accessible only via the SMPP "uplink" connection to our upstream SMS connectivity provider.

* The set of SMS destination numbers that can be sent to the upstream includes not only non-NANP and not-locally-known NANP E.164 numbers, but also any/all SMS short codes in USA-specific NXXXX or NXXXXX format.

* In the case of Mobile-Originated SMS from the local GSM network, if the user-entered destination number is not explicitly international (TON=1) and does not fit the format of a USA SMS short code, other USA-customary dialing formats are supported, as in 10-digit NPANXXXXXX or 11-digit 1NPANXXXXXX without '+' prefix.

themwi-nanp software package is a strict dependency for themwi-smsc: themwi-nanp utilities must be used to manage the database of locally owned NANP numbers, and the present software uses themwi-nanp libraries to access that database.

1.2. Hierarchical arrangement of upstream and downstream peers

The telecom landscape in USA is such that anyone can obtain 10-digit telephone numbers (TNs) very easily and very cheaply, but making them SMS-capable (able to function as Mobile Telephone Numbers or MTNs) is much more difficult. Suitably equipped providers such as Bandwidth.com are generally unwilling to provide service directly to small customers, and we (Themyscira Wireless team) were able to find only one company (Sopranica Telecom) who buys P2P SMS interconnection service from Bandwidth and was willing to resell to us.

Suppose that many different ultra-small parties wish to set up their own indie GSM networks in different parts of USA. Each of these tiny fiefdoms can serve as its own administration and get its own TNs from a provider such as BulkVS. How would all of these tiny fiefdoms then add SMS capability? The feedback we got from Sopranica is that asking them to set up a sub-account on their Bandwidth service for each microfiefdom would be too much work - hence San Diego 2G Association (the primary instance of Themyscira Wireless) will need to serve as a third-level reseller, getting Bandwidth SMS interconnection service from Sopranica and then further subletting it to other microfiefdoms.

Vertical hierarchy support in ThemWi-SMSC is designed to support the just-described use case.
Each SMSC instance has a set of locally owned mobile TNs (LOMTNs, owned by the local fiefdom operating this SMSC instance), a single upstream SMPP link pointing up the hierarchy tree (toward the Outside World) and any number of downstream SMPP links to downstream peers. The total set of phone numbers known to each SMSC instance is its own local set (themwi-nanp database of locally owned TNs) plus the set of numbers assigned to downstream peers - all other E.164 numbers everywhere in the world (plus all non-E.164 USA SMS short codes) belong in the Outside World and are sent to the "uplink" connection. Messages are then routed as follows:

* Any SM originating from a local GSM subscriber can go to another GSM subscriber, to a known downstream peer or to the Outside World.

* Any SMs that are injected directly into the SMSC from local shell access are treated the same way as Mobile-Originated SMS from local GSM users - hence this mechanism can be used to send SMS to the local GSM network or to the Outside World.

* Any SM coming from the uplink connection can be addressing a local GSM subscriber or a downstream peer - but either way it must be a number known to this SMSC, otherwise something is badly misconfigured somewhere.

* Any SM coming from a downlink connection can go to a local GSM subscriber, to a different downstream peer or to the Outside World.

1.2.1. Direction of SMPP connections

Despite the name "Short Message Peer to Peer", SMPP is an asymmetric client-server protocol, not symmetric peer-to-peer. Our primary, above-all-else requirement when it comes to SMPP is to connect to the "big daddy" SMSC of Bandwidth.com, the one that allows us to receive SMS from and send SMS to anywhere in the Outside World. BW requires that we connect to their SMSC server in the role of an SMPP client and bind as a bidirectional transceiver - both message directions then flow over this single long-lived TCP connection from our client to their server.
This externally imposed requirement dictates the entire architectural design of ThemWi-SMSC with respect to SMPP. Each instance of ThemWi-SMSC can have a single upstream peer to whom we connect in the role of an SMPP client, and it can optionally act as an SMPP server accepting TCP connections from downstream peers. The master instance of ThemWi-SMSC at smsc.sandiego2g.org will point its "upstream" link at Bandwidth.com SMPP server, using credentials given to us by Sopranica, whereas other small fiefdoms who wish to join our service resale tree will point the "upstream" link of their ThemWi-SMSC instances to smsc.sandiego2g.org, and we (SD2G) will assign them authentication credentials and manage their downstream number pools.

1.3. Possible use outside of originally intended North American use case

If your situation and/or interests do not match the very specific use case for which the present software is designed (if you are located outside of North America, and/or you have no interest in attaining SMS interconnection with the national mobile telephony environment of whichever country you call home), you can still play with the present implementation of GSM-oriented SMSC: the uplink connection to the Outside World can be omitted, and if you don't have real TNs (telephone numbers) in North American Numbering Plan (either because you are outside of North America or because you are in NA but not interested in official phone network interconnection), you can operate ThemWi-SMSC (plus the attached Osmocom GSM network) with fake NANP numbers instead.

To be clear, this support for modes of usage outside of the primary design goals of ThemWi-SMSC is intended only to facilitate "play" and evaluation (getting a feel for what may be the first SMSC implementation connecting to Osmocom CNI via GSUP), not for serious long-term usage.
If your actual desired use case is an isolated GSM network with a totally ad hoc or "free" numbering plan (the default which one gets with a "vanilla" installation of Osmocom CNI), or a GSM network that is interconnected with the national mobile telephony environment of some country other than USA, you need a different SMSC design that is tailored for your numbering plan (free-form or non-USA national) that will be different from NANP, and for local telecom environment quirks that will almost certainly be different from those in USA.

If you like the general idea and overall design of ThemWi-SMSC, but require an adaptation to a different numbering plan or a different telecom environment (isolated or a national interconnect in some other country), you should be able to take the present code base and modify just the numbering plan aspects, producing a derivative-work SMSC for your different needs.

2. ThemWi-SMSC software architecture

2.1. Modularity of components

A complete deployment of ThemWi-SMSC, as in our own use case at Themyscira Wireless, includes a local GSM network (Osmocom-based) and a connection to the hierarchical SMPP tree that eventually leads to the Outside World SMS connectivity provider at the top. However, our software implementation will be modular, divided into separate software components for:

* The internal core of the SMSC (one daemon process and some command line utilities);

* A pair of daemon processes devoted to the task of connecting the SMSC to the local Osmocom-based GSM network, to be omitted if you don't have one;

* A dedicated daemon process serving the SMPP link to the upstream peer, to be omitted if you have no upstream link;

* Another dedicated sw component serving downstream peer SMPP connections, one process instance per downstream peer, or none if you have no such peers.

This modularity allows the software to be used and (hopefully) appreciated outside of its primary intended use case.
At one extreme, someone could have an isolated Osmocom GSM network, modify it slightly to use MSISDNs that look like (fake) NANP numbers, hook up ThemWi-SMSC and use this SMSC as a replacement for the Osmocom-default one, paving the way for factoring the SMSC function out of OsmoMSC. At the other extreme, if someone is located in USA and wishes to interconnect to the world of SMS through the chain of 3 resellers (Bandwidth followed by Sopranica followed by San Diego 2G Association), they can run an instance of ThemWi-SMSC without any GSM network at all. (You will still need Osmocom libraries, but no Osmocom processes and no hardware.) In such a deployment, all incoming SMS to your number(s) will be written into the persistent store which you can read, and you can send outgoing SMS with a command line utility.

2.2. Persistent message store

Every SM that passes through ThemWi-SMSC gets written into an append-only persistent message store (PMS). Because this store is append-only, no messages are ever deleted - however, each message in PMS can be in one of two states: active or historical. An active SM is one for which the SMSC still needs to make delivery attempts, either attempts at GSM MT delivery or attempts at delivery to the appropriate upstream or downstream SMPP peer. A historical SM is one for which no further action will be taken by any component of our SMSC.

An SM can enter "historical" state in several ways:

* For some LOMTNs the act of writing incoming messages into PMS constitutes final delivery in itself, and no other delivery actions are needed. In this case a newly entered SM is directly written into PMS in the "historical" state, without ever going through "active".

* For messages that need to be delivered to a GSM MS or to an SMPP peer, once that delivery has been made successfully, the message transitions from active to historical.

* In the case of failed deliveries (permanent error, or expiration time reached after repeated temporary failures), the failed message also transitions from active to historical.

The persistent message store is a simple binary file (/var/sms/pms.bin) consisting of directly abutted 'struct sm_record' records. Each message record is exactly 256 bytes (see struct definition - we were able to fit everything we needed under the 256 byte mark, and then padded the struct to perfect round size), and this perfect power-of-2 record size makes it very easy to perform operations such as binary search via mmap or stripping initial megabytes of historical records - see subsequent sections for more detailed description.

PMS is append-only as already stated, but already-written records do not become fully immutable until they become historical. For as long as a given SM is in the active state, themwi-smsc-core daemon can and will update that record in pms.bin:

* For messages addressed to local GSM subscribers, dest_imsi will be filled when the MSISDN-to-IMSI lookup operation on the destination number succeeds;

* Upon discharge (successful delivery, permanent error or validity period expiration after temporary failures), themwi-smsc-core will transition the sm_record into historical state by filling disposition and time_disch struct members;

* Additional info may be written into dest_extra_info upon discharge, depending on the destination type and thus the mode of final delivery.

Once an sm_record transitions into historical state, it is then immutable for archival purposes; archives of historical messages can be kept for years or even decades, depending on local administration policy.

2.2.1. Historical megabyte count

Given the simple binary structure of the main PMS file, each megabyte (2**20 bytes) holds exactly 4096 messages.
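
The record arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the constant and function names are hypothetical and do not come from the ThemWi-SMSC code base; only the 256-byte record size and the 2**20-byte megabyte are taken from the text above.

```python
RECORD_SIZE = 256                          # bytes per struct sm_record (from the text)
MEGABYTE = 2 ** 20                         # bytes per megabyte (from the text)
RECORDS_PER_MB = MEGABYTE // RECORD_SIZE   # works out to 4096


def record_offset(index):
    """Byte offset in pms.bin of the record with the given 0-based index."""
    return index * RECORD_SIZE


def megabyte_of(index):
    """Which megabyte of pms.bin (0-based) holds the given record index."""
    return index // RECORDS_PER_MB
```

Because the record size divides the megabyte evenly, no record ever straddles a megabyte boundary.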
It is envisioned that as a busy SMSC runs for a long time, a significant number of historical messages will accumulate, and the content of PMS may become many megabytes of historical messages followed by some active SMs at the end. When themwi-smsc-core daemon restarts, it has to read the entire PMS in order to collect all still-active SMs. Having to read through many megabytes of historical SMs to get to active ones at the end becomes unacceptable at large archive sizes, hence a mechanism is needed for marking where the historical-only portion ends and the possibly-active portion begins.

There will be an auxiliary file named historical-mb, containing a single ASCII line giving the number of historical megabytes in pms.bin. If this file reads 1, the first 4096 SM records are historical; if it reads 2, the first 8192 SM records are historical, and so forth. This auxiliary file will be used as follows:

* Upon startup, themwi-smsc-core will read this historical-mb file and skip that many initial megabytes of pms.bin;

* At run time, themwi-smsc-core will track the index of the oldest still-active SM in PMS. Whenever this index crosses a megabyte boundary, historical-mb will be updated.

2.2.2. Offline storage

Even with the historical-mb mechanism of the previous section, the fact remains that disk space on live servers is not infinite.
If the archive of historical messages grows so big that it needs to be removed from the SMSC server to free up disk space, one can carry out the following procedure:

* Temporarily stop themwi-smsc-core daemon at the level of runit or systemctl or whatever you are using - this operation will bring down the entire SMSC, so do it during a scheduled maintenance window;

* Use dd to split pms.bin into historical and active portions (N being the historical megabyte count, as recorded in historical-mb):

    dd if=pms.bin of=pms-hist.bin bs=1048576 count=N
    dd if=pms.bin of=pms-new.bin bs=1048576 skip=N

* Move pms-hist.bin to offline storage;

* Replace the long file with the shortened one:

    mv pms-new.bin pms.bin
    echo 0 > historical-mb

* Re-enable themwi-smsc-core and restart all other SMSC daemons.

2.2.3. themwi-smsc-dump reading tool

The program named themwi-smsc-dump will be a standalone command line utility (fully static in its operation, not talking to any daemons or services) for reading and parsing (decoding) pms.bin. It will open pms.bin with O_RDONLY, do a read-only mmap on it, and then access this PMS as a memory-mapped file. Several different modes of operation will be provided:

* It will be possible to dump and decode the entire PMS, as needed during early debugging.

* It will be possible to specify a starting date/time at which the dump should begin. As records are added in strict forward chronological order, it is possible to find a record nearest (by time_entry timestamp) to a given time point by binary search, very efficient on a memory-mapped file.

* Once the dump has a starting point (beginning of the file or a time point found by binary search), the tool can be told to dump till the end, display some count of messages, or run until a certain ending date/time is crossed.

* The tool can dump all message records in the selected range, or only those matching specific filters such as a particular source or destination type, or a specific phone number.
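
The binary search for a starting time point can be sketched as follows. This is a Python illustration of the idea, not the planned implementation. The layout of struct sm_record is not given in this document, so the position and encoding of time_entry used here (an 8-byte little-endian integer at offset 0) are pure assumptions, and all function names are hypothetical. The sketch returns the first record at or after the requested time, which is one reasonable reading of "nearest".

```python
import mmap
import struct

RECORD_SIZE = 256     # from the text: each sm_record is exactly 256 bytes
TIME_OFFSET = 0       # ASSUMPTION: where time_entry sits inside the record
TIME_FORMAT = "<Q"    # ASSUMPTION: 64-bit little-endian timestamp


def open_pms(path):
    """Map the PMS file read-only; the mapping stays valid after the file closes."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def time_entry(buf, index):
    """Read the (assumed) time_entry field of the record at the given index."""
    return struct.unpack_from(TIME_FORMAT, buf, index * RECORD_SIZE + TIME_OFFSET)[0]


def first_record_at_or_after(buf, start_time):
    """Binary search: index of the first record with time_entry >= start_time.

    Relies on records being stored in forward chronological order.
    Returns the record count if every record is older than start_time.
    """
    lo, hi = 0, len(buf) // RECORD_SIZE
    while lo < hi:
        mid = (lo + hi) // 2
        if time_entry(buf, mid) < start_time:
            lo = mid + 1
        else:
            hi = mid
    return lo


def make_record(timestamp):
    """Build a dummy 256-byte record for experimentation (hypothetical helper)."""
    return struct.pack(TIME_FORMAT, timestamp).ljust(RECORD_SIZE, b"\0")
```

The search functions accept any buffer object, so they work equally on an mmap returned by open_pms() and on a bytes object assembled from make_record() calls. The fixed record size is what makes the index-to-offset step a single multiplication.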

The complexity described above is needed for the following reasons:

* One radical idea is to grant limited access (by way of a very strict wrapper) to themwi-smsc-dump to unprivileged users of the network served by the SMSC, i.e., to end users. The idea is that each individual user should be able to give their ssh public key to the administrator of the community network, and then ssh into a special restricted service on the SMSC that does not grant any system shell access, but allows them to access services under their own phone number. Such an empowered end user should be able to submit SMS from their own phone number using the power of a full-size computer (as opposed to very painful text entry on the numeric keypad of a traditional GSM phone), and to see a full log of all messages received by or sent from their own phone number.

* By the nature of her job, the administrator of the SMSC (and of the community GSM network to which this SMSC belongs) necessarily has access to every message that passes through the system, all metadata and actual content. While this access is technically necessary, an administrator who is worthy of her trusted position must not abuse this trust, and must do everything possible to avoid looking at users' private message content when it is not necessary to do so for technical troubleshooting reasons. Toward this objective, themwi-smsc-dump must make it easy to look at only technically necessary information, without throwing unnecessary private info into the operator's eyeballs.

2.3. themwi-smsc-core daemon operation

The core daemon (long-lived process) of ThemWi-SMSC is named themwi-smsc-core. Aside from themwi-smsc-dump read-only tool, themwi-smsc-core will be the only software component that accesses pms.bin directly - all other components of ThemWi-SMSC will connect to a UNIX domain local socket provided by themwi-smsc-core.
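
The following Python sketch illustrates what a connection to such a local socket could look like, using the SOCK_SEQPACKET type named in the function list of this section. The socket path is left to the caller and the packet contents are made-up placeholders: this document defines neither, so nothing here reflects the actual ThemWi-SMSC wire format. The demo uses a socketpair in place of a running daemon to show the property that matters, namely that each send arrives as one whole packet. AF_UNIX SOCK_SEQPACKET sockets are available on Linux but not on every other system.

```python
import socket


def connect_to_core(path):
    """Connect to a UNIX domain SOCK_SEQPACKET socket at the given path.

    The path is supplied by the caller; the real socket location used by
    themwi-smsc-core is not specified in this document.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    sock.connect(path)
    return sock


def demo_message_boundaries():
    """Send two packets over a SOCK_SEQPACKET pair and receive them separately."""
    core_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    with core_end, client_end:
        # Placeholder payloads, not real ThemWi-SMSC command packets.
        client_end.send(b"packet one")
        client_end.send(b"packet two, longer")
        # Each recv() returns exactly one packet, never a merged byte stream.
        first = core_end.recv(4096)
        second = core_end.recv(4096)
    return first, second
```

With a stream socket the two sends could be coalesced into a single read; the sequenced-packet type spares every component from having to implement its own framing.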

In more detail, the core daemon will perform the following functions:

* Read the potentially-active (not marked as historical-only) tail portion of PMS on startup, catch all still-active SMs and hold them in RAM-based data structures;

* Listen on a UNIX domain local socket of type SOCK_SEQPACKET, meaning connection- and message-oriented;

* Accept message submission (or entry) commands from other ThemWi-SMSC components connecting to this socket;

* Allow those socket-connecting SMSC components to register themselves as performing special roles (GSM network interface, IMSI resolver, uplink and downlink SMPP connection handlers), and send notification packets to those role-handlers when an active SM needs that type of processing;

* When these just-described role-handlers respond with success or failure of message handling, discharge the SM into historical state (either delivered or failed), or in one special case (successful completion of MSISDN-to-IMSI lookup) promote the SM from need-IMSI-lookup state into GSM-MT-delivery state.

The key feature of themwi-smsc-core daemon is that it can stay up and running even when all other ThemWi-SMSC daemon processes are shut down. It won't be particularly useful in this state, and won't be able to bring any outstanding active SMs any closer toward delivery, but the key point is that dependency graph arrows between sw components point in only one direction.

2.4. Message entry paths

Every new SM enters the SMSC by way of one of our sw components making a local socket connection to themwi-smsc-core and sending it a "submit new message" command packet.
The following ThemWi-SMSC sw components will be able to enter new SMs in this manner:

* A special command line utility named themwi-smsc-submit will perform just this function and nothing else;

* GSM network interface daemon themwi-smsc-gsmif will submit SMs received from GSM subscribers as MO messages;

* Upstream SMPP link handler themwi-smsc-uplink will submit SMs received from the upstream connection, i.e., from the outside world;

* Downstream SMPP link handlers will submit SMs received from downstream peers.

Most of the common processing functions, such as routing and validation steps, will be performed by themwi-smsc-core. Once all admission-time checks pass, the new SM will be written into PMS, and if the destination is anything other than write-into-PMS-only, the new active SM will also be added to the core daemon's in-RAM data structures. Further delivery steps will happen if and when the appropriate role-handler connects to themwi-smsc-core and accepts messages for processing.

2.4.1. Routing of Short Messages

For every incoming SM, themwi-smsc-core will apply routing based on the destination address in addr_to_orig member of the submitted struct sm_record. Referring to the general principles of section 1.1, this step is very specific to the numbering plan (NANP) for which ThemWi-SMSC is designed. The following routing rules will be applied:

* If the destination number is international (TON=1) and the country code is anything other than +1, the destination is set to SME_CLASS_UPSTREAM.

* If the destination number is NANP, entered in international TON=1 format or in one of local-culture formats (10-digit NPANXXXXXX or 11-digit 1NPANXXXXXX, TON=0), NANP validation rules are applied and outright-invalid numbers are rejected. The validated NANP number is looked up in themwi-nanp database of locally owned phone numbers; if the number is locally owned, the destination is either SME_CLASS_LOCAL or SME_CLASS_GSM, depending on how the number is assigned, or the message may be rejected if the locally-owned number is of a type that cannot receive SMS. If there is no hit in the database of locally owned numbers, another number database gets a lookup, the one for numbers of downstream peers - a hit in that database will set the destination to SME_CLASS_DOWNSTREAM. Finally, if the NANP destination number doesn't hit anywhere, the destination is SME_CLASS_UPSTREAM.

* If the destination number is a USA SMS short code of form NXXXX or NXXXXX, the destination is SME_CLASS_UPSTREAM.

* In the case of locally originated SMs only (coming from GSM MO or from themwi-smsc-submit command line utility), special 4-digit numbers may be defined in the number database of themwi-nanp that are meaningful only locally. If one of those numbers matches, the destination is SME_CLASS_LOCAL or SME_CLASS_GSM according to the exact number type.

* If none of the above conditions match, the message is rejected as unroutable.

What is the difference between SME_CLASS_LOCAL and SME_CLASS_GSM destinations? Answer: SME_CLASS_LOCAL means that writing the SM into PMS constitutes final delivery, and nothing more needs to be done. OTOH, destination of SME_CLASS_GSM means that an MSISDN-to-IMSI lookup needs to be performed, followed by GSM MT delivery.

There is one additional routing mode that is available only via themwi-smsc-submit, or perhaps future specialized network sw components that incorporate the same function: if a locally generated MT message needs to be sent to a local GSM MS addressed by IMSI, with no destination phone number existing at all, themwi-smsc-submit can instruct themwi-smsc-core to skip the routing step, with the destination preset to SME_CLASS_GSM and dest_imsi prefilled.

2.4.2. Permission to send to the uplink

Not every local phone number served by ThemWi-SMSC is allowed to send SMS to our upstream interconnection point with Bandwidth.com SMPP server. As explained in section 1.2, our access to Bandwidth P2P SMS interconnection service is through a reseller (Sopranica Telecom), and our arrangement is such that we have to pay for each individual phone number for which P2P SMS interconnection service is provided. The economics of the situation are such that the total set of NANP numbers (good for calls) we rent from BulkVS is greater than the subset for which we enable outside SMS interconnection service through Bandwidth+Sopranica. Therefore, we have a flag in our themwi-nanp database of locally owned numbers (NUMBER_FLAG_SMSPROV) which we set only on certain numbers, those that are provisioned for outside SMS interconnection and which are therefore allowed to send SMS to the outside world. All other locally owned phone numbers (those without this flag) can only exchange SMS within our fiefdom, including our downstream peers.

For each newly submitted SM, themwi-smsc-core will make a routing determination per the previous section, and if the destination is SME_CLASS_UPSTREAM, the identity of the sender will be checked. The sender will need to be a locally owned number with upstream SMS permission bit set, otherwise the message is rejected.

2.4.3. PID and DCS constraints

Special codes in PID and DCS octets can invoke many special functions that go far beyond ordinary human-to-human SMS: setting and clearing voice mail waiting indication flags, SIM OTA communication, silent SMS etc.
While there are legitimate use cases for all of these special services, and an SMSC implementation should provide a way for duly authorized network components to send such special SMS to local GSM subscribers, it would be irresponsible for a public MNO to allow any Alice to send such SMS-encoded trojans to any Bob, or to accept the same from Big Bad outside world and forward them directly to unsuspecting local users.

The solution adopted for ThemWi-SMSC is that each sw component that accepts SMs from untrusted parties will apply filtering rules to both PID and DCS octets. In the case of messages originated from local GSM MS, themwi-smsc-gsmif will be responsible for preening PID and DCS, whereas in the case of messages coming from the outside world, the responsibility falls on themwi-smsc-uplink instead. The specific masks or ACLs of which PID and DCS codes should be accepted will be configurable; the recommended default is to:

* allow any PID in 000xxxxx range (0x00 through 0x1F), but no others;

* allow DCS 0x00 (GSM7 text) and 0x08 (UCS-2 text), but no others.

2.4.4. Validity period and expiry time

Given the store-and-forward nature of SMS, the amount of work spent trying to deliver a message to a "difficult" destination must be bounded. The standard SMS architecture of GSM 03.40 provides the notion of a validity period, optionally specified by message senders, as the mechanism for limiting the lifetime of a message that cannot be delivered right away.

Message validity periods and expiry times will be handled as follows in ThemWi-SMSC:

* At the socket interface from message-submitting components to themwi-smsc-core, the VP will always be communicated in relative form, as a count of seconds. Special value 0 means that the source is not setting the VP, and a system-wide default needs to be applied.

* If themwi-smsc-gsmif receives an absolute-format VP from GSM MS, it will convert to relative seconds-from-present before submitting the SM to themwi-smsc-core.

* themwi-smsc-core will have two configurable settings with regard to message longevity: default VP and maximum VP. The default VP setting will be applied when no VP is set at the message source: themwi-smsc-submit without explicit VP, MO SM from a GSM MS without VP setting, or a message from the outside world where SMPP never provides a VP on incoming messages. OTOH, the maximum VP setting will serve as a cap in case a user did specify an explicit VP, but it is unreasonably long.

2.4.5. Duplicate message detection

One can easily envision various scenarios in which a duplicate copy is received for an earlier message which is still active, i.e., still queued for delivery to its destination. Instead of adding such duplicates to the queue, it is desirable to be able to detect and suppress them. The details remain to be worked out.

3. SMS communication via direct shell access

To be filled.

4. Interface to local Osmocom GSM network

GSUP and separate MSISDN-to-IMSI lookup, to be described.

5. SMPP connection handlers and outside-world SM exchange

To be filled.