diff doc/Arch-design @ 0:9e364c18e0e8

beginning of architectural design spec
author Mychaela Falconia <falcon@freecalypso.org>
date Wed, 20 Dec 2023 03:50:06 +0000
parents
children c4f8a32af088
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/Arch-design	Wed Dec 20 03:50:06 2023 +0000
@@ -0,0 +1,334 @@
+Themyscira Wireless SMSC implementation
+Architectural design specification
+
+1. Purpose and scope of the software
+
+The purpose of the present software project is to facilitate store-and-forward
+SMS exchange among the following parties:
+
+* Locally owned mobile telephone numbers (LOMTNs) that belong to Themyscira
+  Wireless, with Short Message Service accessed either via the local GSM network
+  (Osmocom-based) or via direct command line access to the SMSC;
+
+* The outside world: the total set of all SMS-capable E.164 telephone numbers
+  in the world, with whom our users must be able to freely exchange SMS just
+  like users of any other cellular phone carrier in USA;
+
+* USA-specific 5-digit and 6-digit short codes: these services are accessible
+  only from within USA, not from the rest of the world (each country has its
+  own services of this type), but because we are located in USA, we must
+  provide the same access to public services as any other cellular phone carrier;
+
+* Any downstream parties who enter into an interconnection agreement with ThemWi
+  for the purpose of sharing our SMS uplink to the outside world.
+
+1.1. NANP specifics
+
+The design of our SMSC makes the following assumptions that are specific to
+North American Numbering Plan:
+
+* All LOMTNs and all downstream peer MTNs are expected to be NANP numbers;
+  any/all SMS source or destination numbers in country codes other than +1 are
+  treated as belonging in the Outside World, accessible only via the SMPP
+  "uplink" connection to our upstream SMS connectivity provider.
+
+* The set of SMS destination numbers that can be sent to the upstream includes
+  not only non-NANP and not-locally-known NANP E.164 numbers, but also any/all
+  SMS short codes in USA-specific NXXXX or NXXXXX format.
+
+* In the case of Mobile-Originated SMS from the local GSM network, if the
+  user-entered destination number is not explicitly international (TON=1) and
+  does not fit the format of a USA SMS short code, other USA-customary dialing
+  formats are supported, as in 10-digit NPANXXXXXX or 11-digit 1NPANXXXXXX
+  without '+' prefix.
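The dialing-format rules above can be sketched as a small classifier.  This is
an illustrative Python sketch, not the ThemWi-SMSC code: the function name, the
returned labels and the "invalid" fallback are assumptions, and N is taken to
mean a digit 2-9 with X any digit, per the usual NANP notation.

```python
import re

# Assumed NANP notation: N is a digit 2-9, X is any digit 0-9.
SHORT_CODE = re.compile(r"[2-9]\d{4,5}")        # NXXXX or NXXXXX
NANP_10 = re.compile(r"[2-9]\d{2}[2-9]\d{6}")   # NPANXXXXXX
NANP_11 = re.compile(r"1[2-9]\d{2}[2-9]\d{6}")  # 1NPANXXXXXX

TON_INTERNATIONAL = 1


def normalize_mo_dest(ton, digits):
    """Classify a Mobile-Originated destination address.

    Returns (kind, number), where kind is "e164", "short_code" or "invalid".
    These labels are hypothetical; the spec does not name them.
    """
    if not (digits.isascii() and digits.isdigit()):
        return ("invalid", digits)
    if ton == TON_INTERNATIONAL:
        # Explicitly international: the digits already carry a country code.
        return ("e164", "+" + digits)
    if SHORT_CODE.fullmatch(digits):
        return ("short_code", digits)
    if NANP_10.fullmatch(digits):
        return ("e164", "+1" + digits)
    if NANP_11.fullmatch(digits):
        return ("e164", "+" + digits)
    # Assumption: anything else is rejected; the text does not say what happens.
    return ("invalid", digits)
```

The order of the checks mirrors the text: explicit international first, then
short codes, then the 10-digit and 11-digit USA-customary forms.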
+
+themwi-nanp software package is a strict dependency for themwi-smsc: themwi-nanp
+utilities must be used to manage the database of locally owned NANP numbers,
+and the present software uses themwi-nanp libraries to access that database.
+
+1.2. Hierarchical arrangement of upstream and downstream peers
+
+The telecom landscape in USA is such that anyone can obtain 10-digit telephone
+numbers (TNs) very easily and very cheaply, but making them SMS-capable (able
+to function as Mobile Telephone Numbers or MTNs) is much more difficult.
+Suitably equipped providers such as Bandwidth.com are generally unwilling to
+provide service directly to small customers, and we (Themyscira Wireless team)
+were able to find only one company (Sopranica Telecom) that buys P2P SMS
+interconnection service from Bandwidth and was willing to resell it to us.
+
+Suppose that many different ultra-small parties wish to set up their own indie
+GSM networks in different parts of USA.  Each of these tiny fiefdoms can serve
+as its own administration and get its own TNs from a provider such as BulkVS.
+How would all of these tiny fiefdoms then add SMS capability?  The feedback we
+got from Sopranica is that asking them to set up a sub-account on their
+Bandwidth service for each microfiefdom would be too much work - hence San Diego
+2G Association (the primary instance of Themyscira Wireless) will need to serve
+as a third-level reseller, getting Bandwidth SMS interconnection service from
+Sopranica and then further subletting it to other microfiefdoms.
+
+Vertical hierarchy support in ThemWi-SMSC is designed to support the just-
+described use case.  Each SMSC instance has a set of locally owned mobile TNs
+(LOMTNs, owned by the local fiefdom operating this SMSC instance), a single
+upstream SMPP link pointing up the hierarchy tree (toward the Outside World)
+and any number of downstream SMPP links to downstream peers.  The total set of
+phone numbers known to each SMSC instance is its own local set (themwi-nanp
+database of locally owned TNs) plus the set of numbers assigned to downstream
+peers - all other E.164 numbers everywhere in the world (plus all non-E.164 USA
+SMS short codes) belong in the Outside World and are sent to the "uplink"
+connection.  Messages are then routed as follows:
+
+* Any SM originating from a local GSM subscriber can go to another GSM
+  subscriber, to a known downstream peer or to the Outside World.
+
+* Any SMs that are injected directly into the SMSC from local shell access are
+  treated the same way as Mobile-Originated SMS from local GSM users - hence
+  this mechanism can be used to send SMS to the local GSM network or to the
+  Outside World.
+
+* Any SM coming from the uplink connection can be addressing a local GSM
+  subscriber or a downstream peer - but either way it must be a number known
+  to this SMSC, otherwise something is badly misconfigured somewhere.
+
+* Any SM coming from a downstream connection can go to a local GSM subscriber,
+  to a different downstream peer or to the Outside World.
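The four routing rules above reduce to one lookup order plus one sanity check.
The following Python sketch is an illustration under assumptions: the source
labels, the return tuples and the use of a set and a dict for the number pools
are all hypothetical, and the case of a downstream peer addressing one of its
own numbers is not covered by the text and is not handled here.

```python
# Hypothetical labels for where a message entered the SMSC.
LOCAL_GSM, LOCAL_SHELL, UPLINK, DOWNSTREAM = "gsm", "shell", "uplink", "downstream"


def route(source, dest, local_numbers, downstream_numbers):
    """Pick the outgoing leg for one message.

    local_numbers: set of locally owned numbers.
    downstream_numbers: dict mapping number -> downstream peer name.
    Returns ("local", None), ("downstream", peer) or ("uplink", None).
    """
    if dest in local_numbers:
        return ("local", None)
    if dest in downstream_numbers:
        return ("downstream", downstream_numbers[dest])
    if source == UPLINK:
        # A message from upstream must address a number known to this SMSC;
        # bouncing it back up would only hide a misconfiguration.
        raise ValueError("uplink message for unknown number: " + dest)
    # Everything not known locally belongs to the Outside World.
    return ("uplink", None)
```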
+
+1.2.1. Direction of SMPP connections
+
+Despite the name "Short Message Peer to Peer", SMPP is an asymmetric client-
+server protocol, not symmetric peer-to-peer.  Our primary, above-all-else
+requirement when it comes to SMPP is to connect to the "big daddy" SMSC of
+Bandwidth.com, the one that allows us to receive SMS from and send SMS to
+anywhere in the Outside World.  BW requires that we connect to their SMSC server
+in the role of an SMPP client and bind as a bidirectional transceiver - both
+message directions then flow over this single long-lived TCP connection from our
+client to their server.
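For orientation, the bind step described above can be sketched as building one
bind_transceiver PDU.  This Python sketch assumes SMPP v3.4 (the text does not
state a protocol version); the helper names and the credentials in the usage
note are placeholders, and empty system_type and address_range fields are an
assumption rather than anything the upstream provider is known to require.

```python
import struct

BIND_TRANSCEIVER = 0x00000009  # SMPP command_id for bind_transceiver
INTERFACE_VERSION_34 = 0x34    # assumption: SMPP v3.4


def cstr(s):
    """Encode a C-octet string: ASCII bytes plus a NUL terminator."""
    return s.encode("ascii") + b"\x00"


def build_bind_transceiver(system_id, password, sequence_number=1):
    """Return the raw bytes of a bind_transceiver PDU."""
    body = (
        cstr(system_id)
        + cstr(password)
        + cstr("")                       # system_type (left empty here)
        + bytes([INTERFACE_VERSION_34])  # interface_version
        + bytes([0, 0])                  # addr_ton, addr_npi
        + cstr("")                       # address_range (left empty here)
    )
    # Header: command_length (header included), command_id, command_status,
    # sequence_number, each a big-endian 32-bit integer.
    header = struct.pack(">IIII", 16 + len(body), BIND_TRANSCEIVER, 0,
                         sequence_number)
    return header + body
```

A client would write build_bind_transceiver("example-id", "example-pw") to the
TCP socket right after connecting and wait for the matching response before
sending or accepting any messages.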
+
+This externally imposed requirement dictates the entire architectural design of
+ThemWi-SMSC with respect to SMPP.  Each instance of ThemWi-SMSC can have a
+single upstream peer to whom we connect in the role of an SMPP client, and it
+can optionally act as an SMPP server accepting TCP connections from downstream
+peers.  The master instance of ThemWi-SMSC at smsc.sandiego2g.org will point
+its "upstream" link at Bandwidth.com SMPP server, using credentials given to us
+by Sopranica, whereas other small fiefdoms who wish to join our service resale
+tree will point the "upstream" link of their ThemWi-SMSC instances to
+smsc.sandiego2g.org, and we (SD2G) will assign them authentication credentials
+and manage their downstream number pools.
+
+1.3. Possible use outside of originally intended North American use case
+
+If your situation or interests do not match the very specific use case for
+which the present software is designed (you are located outside of North
+America, or you have no interest in SMS interconnection with the national
+mobile telephony environment of whichever country you call home), you can
+still play with the present implementation of GSM-oriented SMSC.  The uplink
+connection to the Outside World can be omitted.  If you don't have real TNs
+(telephone numbers) in North American Numbering Plan (either because you are
+outside of North America or because you are in NA but not interested in official
+phone network interconnection), you can operate ThemWi-SMSC (plus the attached
+Osmocom GSM network) with fake NANP numbers instead.
+
+To be clear, this support for modes of usage outside of the primary design goals
+of ThemWi-SMSC is intended only to facilitate "play" and evaluation (getting a
+feel for what may be the first SMSC implementation connecting to Osmocom CNI
+via GSUP), not for serious long-term usage.  If your actual desired use case is
+an isolated GSM network with a totally ad hoc or "free" numbering plan (the
+default which one gets with a "vanilla" installation of Osmocom CNI), or a GSM
+network that is interconnected with the national mobile telephony environment
+of some country other than USA, you need a different SMSC design, one tailored
+for your numbering plan (free-form or non-USA national) rather than NANP, and
+for local telecom environment quirks that will almost certainly be different
+from those in USA.
+
+If you like the general idea and overall design of ThemWi-SMSC, but require an
+adaptation to a different numbering plan or a different telecom environment
+(isolated or a national interconnect in some other country), you should be able
+to take the present code base and modify just the numbering plan aspects,
+producing a derivative-work SMSC for your different needs.
+
+2. ThemWi-SMSC software architecture
+
+2.1. Modularity of components
+
+A complete deployment of ThemWi-SMSC, as in our own use case at Themyscira
+Wireless, includes a local GSM network (Osmocom-based) and a connection to the
+hierarchical SMPP tree that eventually leads to the Outside World SMS
+connectivity provider at the top.  However, our software implementation will be
+modular, divided into separate software components for:
+
+* The internal core of the SMSC (one daemon process and some command line
+  utilities);
+
+* A pair of daemon processes devoted to the task of connecting the SMSC to the
+  local Osmocom-based GSM network, to be omitted if you don't have one;
+
+* A dedicated daemon process serving the SMPP link to the upstream peer, to be
+  omitted if you have no upstream link;
+
+* Another dedicated component serving downstream peer SMPP connections, one
+  process instance per downstream peer, or none if you have no such peers.
+
+This modularity allows the software to be used and (hopefully) appreciated
+outside of its primary intended use case.  At one extreme, someone could have
+an isolated Osmocom GSM network, modify it slightly to use MSISDNs that look
+like (fake) NANP numbers, hook up ThemWi-SMSC and use this SMSC as a replacement
+for the Osmocom-default one, paving the way for factoring the SMSC function out
+of OsmoMSC.  At the other extreme, if someone is located in USA and wishes to
+interconnect to the world of SMS through the chain of 3 resellers (Bandwidth
+followed by Sopranica followed by San Diego 2G Association), they can run an
+instance of ThemWi-SMSC without any GSM network at all.  (You will still need
+Osmocom libraries, but no Osmocom processes and no hardware.)  In such a
+deployment, all incoming SMS to your number(s) will be written into the
+persistent store which you can read, and you can send outgoing SMS with a
+command line utility.
+
+2.2. Persistent message store
+
+Every SM that passes through ThemWi-SMSC gets written into an append-only
+persistent message store (PMS).  Because this store is append-only, no messages
+are ever deleted - however, each message in PMS can be in one of two states:
+active or historical.  An active SM is one for which the SMSC still needs to
+make delivery attempts, either attempts at GSM MT delivery or attempts at
+delivery to the appropriate upstream or downstream SMPP peer.  A historical SM
+is one for which no further action will be taken by any component of our SMSC.
+An SM can enter "historical" state in several ways:
+
+* For some LOMTNs the act of writing incoming messages into PMS constitutes
+  final delivery in itself, and no other delivery actions are needed.  In this
+  case a newly entered SM is directly written into PMS in the "historical"
+  state, without ever going through "active".
+
+* For messages that need to be delivered to a GSM MS or to an SMPP peer, once
+  that delivery has been made successfully, the message transitions from active
+  to historical.
+
+* In the case of failed deliveries (permanent error, or expiration time reached
+  after repeated temporary failures), the failed message also transitions from
+  active to historical.
+
+The persistent message store is a simple binary file (/var/sms/pms.bin)
+consisting of directly abutted 'struct sm_record' records.  Each message record
+is exactly 256 bytes (see struct definition - we were able to fit everything we
+needed under the 256 byte mark, and then padded the struct to perfect round
+size), and this perfect power-of-2 record size makes it very easy to perform
+operations such as binary search via mmap or stripping initial megabytes of
+historical records - see subsequent sections for more detailed description.
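The fixed record size means a record index maps to a byte offset by plain
multiplication.  The Python sketch below treats each record as an opaque
256-byte blob, since the field layout of struct sm_record is not given here;
the function names are hypothetical.

```python
RECORD_SIZE = 256  # bytes per sm_record, as stated in the text


def record_count(buf):
    """Number of whole records in a PMS image (bytes or mmap object)."""
    if len(buf) % RECORD_SIZE != 0:
        raise ValueError("PMS length is not a multiple of the record size")
    return len(buf) // RECORD_SIZE


def record_at(buf, index):
    """Return the raw 256 bytes of record number index (0-based)."""
    if not 0 <= index < record_count(buf):
        raise IndexError(index)
    start = index * RECORD_SIZE
    return buf[start:start + RECORD_SIZE]
```

Because 256 divides 2**20 evenly, a megabyte boundary always falls between two
records, never inside one.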
+
+PMS is append-only as already stated, but already-written records do not become
+fully immutable until they become historical.  For as long as a given SM is in
+the active state, themwi-smsc-core daemon can and will update that record in
+pms.bin:
+
+* For messages addressed to local GSM subscribers, dest_imsi will be filled
+  when the MSISDN-to-IMSI lookup operation on the destination number succeeds;
+
+* Upon discharge (successful delivery, permanent error or validity period
+  expiration after temporary failures), themwi-smsc-core will transition the
+  sm_record into historical state by filling disposition and time_disch struct
+  members;
+
+* Additional info may be written into dest_extra_info upon discharge, depending
+  on the destination type and thus the mode of final delivery.
+
+Once an sm_record transitions into historical state, it is then immutable for
+archival purposes; archives of historical messages can be kept for years or even
+decades, depending on local administration policy.
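The active-to-historical life cycle can be modelled as a record that accepts
updates only until it is discharged.  This Python sketch borrows the member
names dest_imsi, disposition, time_disch and dest_extra_info from the text; the
class itself, the use of None to mean "not yet filled" and the disposition
strings are assumptions made for illustration.

```python
class SmRecord:
    """In-memory model of one PMS record (illustrative only)."""

    def __init__(self, time_entry, disposition=None, time_disch=None):
        self.time_entry = time_entry
        self.dest_imsi = None
        self.disposition = disposition  # assumption: None means still active
        self.time_disch = time_disch
        self.dest_extra_info = None

    @property
    def historical(self):
        return self.disposition is not None

    def _require_active(self):
        if self.historical:
            raise ValueError("historical records are immutable")

    def set_dest_imsi(self, imsi):
        """Record the result of a successful MSISDN-to-IMSI lookup."""
        self._require_active()
        self.dest_imsi = imsi

    def discharge(self, disposition, when, extra_info=None):
        """Move the record from active to historical, exactly once."""
        self._require_active()
        self.dest_extra_info = extra_info
        self.time_disch = when
        self.disposition = disposition
```

A message for which writing into PMS is itself the final delivery would be
created with its disposition already set, so it never passes through the
active state.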
+
+2.2.1. Historical megabyte count
+
+Given the simple binary structure of the main PMS file, each megabyte (2**20
+bytes) holds exactly 4096 messages.  It is envisioned that as a busy SMSC runs
+for a long time, a significant number of historical messages will accumulate,
+and the content of PMS may become many megabytes of historical messages followed
+by some active SMs at the end.  When themwi-smsc-core daemon restarts, it has
+to read the entire PMS in order to collect all still-active SMs.  Having to
+read through many megabytes of historical SMs to get to active ones at the end
+becomes unacceptable at large archive sizes, hence a mechanism is needed for
+marking where the historical-only portion ends and the possibly-active portion
+begins.
+
+There will be an auxiliary file named historical-mb, containing a single ASCII
+line giving the number of historical megabytes in pms.bin.  If this file reads
+1, the first 4096 SM records are historical; if it reads 2, the first 8192 SM
+records are historical, and so forth.  This auxiliary file will be used as
+follows:
+
+* Upon startup, themwi-smsc-core will read this historical-mb file and skip that
+  many initial megabytes of pms.bin;
+
+* At run time, themwi-smsc-core will track the index of the oldest still-active
+  SM in PMS.  Whenever this index crosses a megabyte boundary, historical-mb
+  will be updated.
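The arithmetic behind historical-mb is integer division by the number of
records per megabyte.  A minimal Python sketch follows; the function names are
hypothetical, and it assumes that when no message is active the "oldest
still-active index" equals the total record count.

```python
RECORD_SIZE = 256
RECORDS_PER_MB = (1 << 20) // RECORD_SIZE  # 4096 records per megabyte


def historical_mb(oldest_active_index):
    """Whole megabytes lying entirely before the oldest active record."""
    return oldest_active_index // RECORDS_PER_MB


def startup_skip_bytes(mb_count):
    """Byte offset in pms.bin at which the startup scan begins."""
    return mb_count << 20


def parse_historical_mb(text):
    """Parse the single ASCII line stored in the historical-mb file."""
    value = int(text.strip())
    if value < 0:
        raise ValueError("historical-mb must not be negative")
    return value
```

The stored value only needs rewriting when historical_mb() of the tracked
index differs from the value last written, which is exactly the moment the
index crosses a megabyte boundary.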
+
+2.2.2. Offline storage
+
+Even with the historical-mb mechanism of the previous section, the fact remains
+that disk space on live servers is not infinite.  If the archive of historical
+messages grows so big that it needs to be removed from the SMSC server to free
+up disk space, one can carry out the following procedure:
+
+* Temporarily stop themwi-smsc-core daemon at the level of runit or systemctl
+  or whatever you are using - this operation will bring down the entire SMSC,
+  so do it during a scheduled maintenance window;
+
+* Use dd to split pms.bin at N megabytes, where N is the historical-mb count:
+
+  dd if=pms.bin of=pms-hist.bin bs=1048576 count=N
+  dd if=pms.bin of=pms-new.bin bs=1048576 skip=N
+
+* Move pms-hist.bin to offline storage;
+
+* Replace the long file with the shortened one:
+
+  mv pms-new.bin pms.bin
+  echo 0 > historical-mb
+
+* Re-enable themwi-smsc-core and restart all other SMSC daemons.
+
+2.2.3. themwi-smsc-dump reading tool
+
+The program named themwi-smsc-dump will be a standalone command line utility
+(fully static in its operation, not talking to any daemons or services) for
+reading and parsing (decoding) pms.bin.  It will open pms.bin with O_RDONLY, do
+a read-only mmap on it, and then access this PMS as a memory-mapped file.
+Several different modes of operation will be provided:
+
+* It will be possible to dump and decode the entire PMS, as needed during early
+  debugging.
+
+* It will be possible to specify a starting date/time at which the dump should
+  begin.  As records are added in strict forward chronological order, it is
+  possible to find a record nearest (by time_entry timestamp) to a given time
+  point by binary search, very efficient on a memory-mapped file.
+
+* Once the dump has a starting point (beginning of the file or a time point
+  found by binary search), the tool can be told to dump till the end, display
+  some count of messages, or run until a certain ending date/time is crossed.
+
+* The tool can dump all message records in the selected range, or only those
+  matching specific filters such as a particular source or destination type, or
+  a specific phone number.
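The starting-point lookup described above is an ordinary lower-bound binary
search over fixed-size records.  In the Python sketch below, the position and
encoding of time_entry (an unsigned 64-bit little-endian Unix timestamp at
offset 0) are pure assumptions, since the real struct sm_record layout is not
given here, and "nearest" is interpreted as the first record at or after the
requested time.

```python
import struct

RECORD_SIZE = 256
# ASSUMPTION: time_entry is a little-endian uint64 at byte offset 0 of each
# record.  The actual struct sm_record layout may differ.
TIME_ENTRY = struct.Struct("<Q")
TIME_ENTRY_OFFSET = 0


def time_entry(buf, index):
    """Read the time_entry field of record number index."""
    offset = index * RECORD_SIZE + TIME_ENTRY_OFFSET
    return TIME_ENTRY.unpack_from(buf, offset)[0]


def first_at_or_after(buf, when):
    """Index of the first record with time_entry >= when.

    Returns the record count if every record is older than when.  Relies on
    records being stored in forward chronological order.
    """
    lo, hi = 0, len(buf) // RECORD_SIZE
    while lo < hi:
        mid = (lo + hi) // 2
        if time_entry(buf, mid) < when:
            lo = mid + 1
        else:
            hi = mid
    return lo
```

The buf argument can be a bytes object or a read-only memory map such as
mmap.mmap(fd, 0, access=mmap.ACCESS_READ); only the few records that the search
actually touches get paged in.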
+
+The complexity described above is needed for the following reasons:
+
+* One radical idea is to grant limited access (by way of a very strict wrapper)
+  to themwi-smsc-dump to unprivileged users of the network served by the SMSC,
+  i.e., to end users.  The idea is that each individual user should be able to
+  give their ssh public key to the administrator of the community network, and
+  then ssh into a special restricted service on the SMSC that does not grant
+  any system shell access, but allows them to access services under their own
+  phone number.  Such an empowered end user should be able to submit SMS from
+  their own phone number using the power of a full-size computer (as opposed to
+  very painful text entry on the numeric keypad of a traditional GSM phone),
+  and to see a full log of all messages received by or sent from their own
+  phone number.
+
+* By the nature of her job, the administrator of the SMSC (and of the community
+  GSM network to which this SMSC belongs) necessarily has access to every
+  message that passes through the system, all metadata and actual content.
+  While this access is technically necessary, an administrator who is worthy of
+  her trusted position must not abuse this trust, and must do everything
+  possible to avoid looking at users' private message content when it is not
+  necessary to do so for technical troubleshooting reasons.  Toward this
+  objective, themwi-smsc-dump must make it easy to look at only technically
+  necessary information, without throwing unnecessary private info into the
+  operator's eyeballs.