Defined-Trust Transport (DeftT) Protocol for Limited Domains

Internet-Draft	Defined-Trust Transport (DeftT)	March 2024
Nichols, et al.	Expires 2 October 2024

Abstract

This document describes a broadcast-oriented, many-to-many Defined-trust Transport (DeftT) framework that makes it simple to express and enforce application and deployment specific integrity, authentication, access control and behavior constraints directly in the protocol stack. DeftT enables secure and completely self-contained (e.g., no external identity servers or certificate authorities) overlay networks where credentialed members can join and leave at any time. DeftT is part of a Defined-trust Communications approach with a specific example implementation available. Combined with IPv6 multicast and modern hardware-based methods for securing keys and code, it provides an easy to use foundation for secure and efficient communications in Limited Domains (RFC8799), in particular for Operational Technology (OT) networks.¶

Conventional IP transports create optionally secured one-to-one sessions. Where member identities exist, they must either be validated via external servers or all member identities must be preconfigured in each member during enrollment. Synchronization of data across a domain is carried out through multiple two-party transport sessions. In contrast, DeftT is a multi-party transport that synchronizes collections of secured information across all enrolled members of a domain. Security in DeftT is not optional and members are preconfigured only with their own identities and the secured rules to authenticate other member identities.¶

This document describes an architecture that is not a standard and does not enjoy IETF consensus. (The community is invited to consider standardizing its concepts and the specification.)¶

1. Introduction

Decades of success in providing IP connectivity over any physical media ("IP over everything") have commoditized IP-based communications. This makes IP an attractive option for Internet of Things (IoT), Industrial Control Systems (ICS) and Operational Technologies (OT) applications, like building automation, embedded systems and transportation control, that previously required proprietary or analog connectivity. For the energy sector in particular, the growing use of Distributed Energy Resources (DER) like residential solar has created interest in low cost commodity networked devices with added features for security, robustness and low-power operation [MODOT][OPR][CIDS]. Other emerging uses include connecting controls and sensors in nuclear power plants [DIGN] and carbon capture monitoring [IIOT].¶

While use of an IP network layer is a major advance for OT, current Internet transport options are a poor match to its needs in both the communications and security models. TCP generalized the Arpanet transport notion of a packet "phone call" between two endpoints into a generic, reliable, bi-directional bytestream working over IP's stateless unidirectional best-effort delivery model. Just as the voice phone call model spawned a global voice communications infrastructure in the 1900s, TCP/IP's two-party packet sessions are the foundation of today's global data communication infrastructure. Yet "good for global communication" isn't the same as "good for everything". A significant number of OT uses can be characterized as Limited Domains [RFC8799] under a single administrative authority with a primary function of coordination and control and communication patterns that are many-to-many. Implementing many-to-many applications over two-party transport sessions changes the configuration burden and traffic scaling from the native media's O(n) to O(n²) (see Section 1.2). Further, as OT devices have specific, highly prescribed roles with strict constraints on who can say what to which, the opacity of modern encrypted two-party sessions can make it impossible to enforce or audit these constraints.¶

This memo describes Defined-trust Transport (DeftT) whose intended use is for communications in the constrained environments that can be characterized as Limited Domains [RFC8799]. As in [RFC8366] we define a domain as "The set of entities or infrastructure under common administrative control." Following [RFC8799] we use the term Limited Domain to refer to a region where network and end system requirements, behaviors, and semantics are applied only within a single domain where membership is assumed to be cryptographically secured. The region of a Limited Domain’s applicability could be a physical locality, e.g., within the same building, campus or immediate proximity, or could be distributed across geographies, as an overlay on the Internet or as a network parallel to the Internet. The definition of limited used here is "restricted" or "characterized by enforceable limitations" (https://www.merriam-webster.com/dictionary/limited). Trust domain denotes a Limited Domain where all network communications are via DeftT. Though the Limited Domain properties and general functional requirements enumerated in [RFC8799] are necessary, we do not claim they are sufficient.¶

In contrast to conventional two-party session-based IP transports with optional session security, DeftT is a multi-party transport that synchronizes collections of secured information across all enrolled members of a domain using broadcast when available. This feature keeps domain communications synchronized without the need to construct multiple two-party transport sessions, either between members or with a server or broker. Like conventional transports in limited domains, DeftT relies on the concept of secure, dynamic membership but the membership configuration is the only information needed to validate the authenticity and capabilities of other senders obviating the need to validate other members' identities via external servers or preconfiguration with those identities. Thus new members can be created and join a trust domain at any time.¶

The secure, dynamic membership of [RFC8799] is critical in DeftT and its use and enforcement is described in subsequent sections. Members use preconfigured identities and secured communications rules to validate other members' identities, to validate received communications both structurally and cryptographically and to construct compliant outbound communications. All communications are within the domain; no external servers (e.g., certificate authorities, identity servers) are used at any time.¶

In DeftT Domains, multipoint communications are enabled through use of a named collection abstraction and secured by an integrated trust management engine. DeftT employs IPv6 link-local multicast [RFC4291], a distributed set reconciliation communications model, flexible pub/sub APIs, chain-of-trust membership identities, and secured rules that define the local context and communication constraints of a deployment in a declarative language. The rules are used by DeftT's runtime trust management engine to enforce adherence to the constraints; together with a shared trust anchor, a member can validate every other member's identity at runtime. There is no need for members to have a priori knowledge of one another. The resulting system is efficient, secure and scalable: communication, signing and validation costs are constant per-publication, independent of the richness and complexity of the deployment's constraints or the number of entities deployed.¶

Like QUIC, DeftT is a user-space transport protocol that sits between an application and a system-provided transport like UDP or UDP multicast (see Figure 1). No UDP or TCP port assignment is required. DeftT's membership model means that an application is configured with its credentials before it starts communications. This includes the domain rules ("schema") distributed as a certificate signed by the domain trust anchor. Members use the SHA256 "thumbprint" (Section 2) of the schema cert as a compact, unique identifier for the domain and portions of this identifier are used to generate addresses in accordance with the Dynamic Ports of [RFC6335] to determine both the IPv6 link-local multicast address and UDP destination port (more in Section 1.5). (Note that UDP multicast requires different ports for send and receive, so this use doesn't conflict with the normal use of this range for ephemeral client source ports.) Unicast protocols require more configuration and the protocol, role ("listen" or "connect"), address and port information goes into the identity chains of the two parties.¶

Figure 1: DeftT in an IP stack

DeftT's trust domains are self-contained communications networks layered on IP via multicast UDP or unicast protocols (e.g., TCP and UDP) and do not directly interact with IP routing protocols. DeftT operates on IP networks without interfering with conventional IP protocols. In contrast with IETF standards track protocols like the client-server CoAP [RFC7252], DeftT is intended to serve the communication needs of a closed community with common objectives, a zero-trust Limited Domain (trust domain). Foremost among those needs is the ability to enforce community-specific policy constraints ("who can say what to which"). ABAC (Attribute-Based Access Control) [NIST] provides a model sufficient to express and enforce these constraints but a fundamental architectural choice remains to either:¶

(a) Start with Internet-based communication protocols then "harden them" by layering an ABAC framework on top, or¶

(b) Start with an ABAC framework that verifiably enforces the policy constraints then augment it with the minimum necessary communication primitives needed to function in a community's deployment environment.¶

Existing IETF protocols use approach (a) and, given how few enforceable security policies are possible on the open Internet, it's a reasonable choice. For Limited Domains, approach (a) imports all the (otherwise unneeded) Internet abstraction maintenance machinery (DHCP, DNS, CAs, Policy Decision Points (PDPs), Policy Information Points (PIPs), routing, address plans, etc.). When communication is expressed in terms of Internet abstractions (e.g., a TLS connection between two IP endpoints), there needs to be a translation layer to map between these abstractions and the community's entities, requirements and objectives. All this machinery is configuration intensive and recent history has demonstrated that it's all prime attack surface. DeftT has been created as a self-contained ABAC framework where the Policy Enforcement Point (PEP) and PDP are in the transport's narrow pub/sub waist. DeftT embeds the PIP function in certificate signing chains so it's self-authenticating and self-distributing.¶

Like CoAP/OSCORE nodes, DeftT trust domain members start with a pre-existing identity obtained out-of-band, which means that existing and evolving bootstrap and enrollment protocols and methodologies can be used. DeftT identities are more than a single key pair signed by a trust anchor; instead, the identities are in the form of certificate chains that contain all the attributes or roles the identity is granted. Entities are configured with a chain of public certificates terminating at the trust anchor, along with a private key corresponding to the unique identity cert at the chain's leaf. An identity only conveys membership in a specific trust domain that is using a particular set of rules and a particular trust anchor.¶

As with MUD, trust domain members have specific capabilities and permitted communications that are explicitly specified. Unlike MUD, each member gets the rules for the domain distributed in binary form in a certificate (a communications schema) signed by the same trust anchor that is at the root of the member identity. This compact and secured schema specifies the format for identity chains as well as the format of all permitted communications and the attributes required by identities that issue them. Each DeftT node has an integrated trust management engine that applies the schema at run-time. DeftT enrollment consists of configuring a device with identity bundles that contain the trust anchor certificate, the communication schema, and a membership identity which comprises all the certs in its signing chain terminated at the trust anchor. The private key corresponding to the leaf certificate of the member's identity should be securely configured (i.e., not exposed to any third party) while the security of the identity bundle can be deployment-specific (i.e., the public certificates it contains may optionally be protected from third parties).¶

In a Trust Domain, all members' identities and the schema share a common trust anchor, so the bundle suffices for a member to authenticate and authorize communication from peers and vice-versa. The identity bundle, DeftT's trust management engine, and a trust domain's certificate collection (where members publish their signing chains) allow new members to join and communicate with no specific knowledge of other members, thus obviating labor intensive and error-prone device-to-device association configuration. The signing key pairs used for communications are made locally at each entity on whatever rotation schedule is chosen for the application. Private identity keys are not used for signing DeftT packets, only to sign the public signing certificate that becomes the leaf certificate of a signing chain. The identity key can remain within protected hardware like a TPM for signing while the signing key is used in the communications path, trading the possibility of more exposure against the need for speed. (More on certificates and identities in DeftT in Section 3.6 and Section 5.)¶

Along with bootstrapped identity bundles, DeftT makes use of both its synchronized collections and its integrated trust management engine to securely join a particular Trust Domain. In synchronized collections, members communicate about their local version of the collection state and send additions to the collection the other members are missing.¶

DeftT's collections hold Publications that have lifetimes set as appropriate to their use. For example, Publications that carry application communications are normally ephemeral, like a UDP or TCP packet, while Publications that carry certificates have longer lifetimes, on the order of hours or even months. Collections are synchronized; members compare their local versions of the collection state and update the collection with Publications they have that others are missing. On a broadcast subnet (e.g., using IPv6 link local multicast) updates are sent to multiple members at once. On a unicast subnet (e.g., UDP) updates are sent to the other member of the subnet.¶

To understand how a DeftT joins a Trust Domain, we now describe the certificate distributor model from the point of view of a newly joining member DeftT. A state diagram of the joining process is shown in Figure 2. The joining member creates its first signing pair, then constructs a certificate holding the public signing key and signs it with the member's private identity key. The joining member adds its signing chain to its local copy of the cert collection and starts the process of joining the Trust Domain by publishing its cert collection state and subscribing to the domain certificate collection. The local collection state is compared to the received collection state and any certs that are not already in that received state will be sent on the network to be added to the domain collection. Note that a new member will always need to add its identity and signing certs, but other certificates of the chain may already have been added to the collection by previously joining members. A DeftT does not consider itself joined (or "connected to the domain") until it receives a collection state from the network that contains all of its certs, indicating that at least one other member will be able to receive its signed packets. Whether fully joined or not, the cert distributor receives all certs published on the network, adding them to its local collection when an entire validated signing chain is received and updating its local collection state.¶
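
The joining logic above can be sketched as a small state holder. This is a minimal illustration and not DeftT's implementation: certs are reduced to thumbprint strings, the class and method names (CertDistributor, start, on_state, on_cert_chain) are hypothetical, and chain validation is assumed to have happened before on_cert_chain is called.

```python
from dataclasses import dataclass, field


@dataclass
class CertDistributor:
    """Toy model of one member's view of the cert collection (thumbprints only)."""
    own_chain: frozenset                      # this member's signing chain certs
    local: set = field(default_factory=set)   # local copy of the cert collection
    joined: bool = False

    def start(self) -> frozenset:
        # Add our own signing chain locally, then publish the collection state.
        self.local |= self.own_chain
        return frozenset(self.local)

    def on_state(self, received: frozenset) -> set:
        """Handle a collection state heard from the network.

        Returns the certs held locally that the sender is missing; these are
        the ones to send on the network.
        """
        if self.own_chain <= received:
            # Another member already holds every cert in our chain.
            self.joined = True
        return self.local - received

    def on_cert_chain(self, chain: frozenset) -> None:
        # Assumption: `chain` was already validated back to the trust anchor.
        self.local |= chain
```

A member is only marked as joined once a received state covers its whole chain, mirroring the rule that at least one other member must be able to validate its signed packets.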

Figure 2: DeftT certificate distributor enables joining Trust Domain

1.1. Environment and use

Due to physical deployment constraints and the high cost of wiring, many OT networks preferentially use radio as their communication medium. Use of wires is impossible in many installations (untethered Things, adding connected devices to home and infrastructure networks, vehicular uses, etc.). Wiring costs far exceed the cost of current System-on-Chip Wi-Fi IoT devices and the cost differential is increasing [WSEN][COST]. For example, the popular ESP32 is a 32bit/320KB SRAM RISC with 60 analog and digital I/O channels plus complete 802.11b/g/n and bluetooth radios on a 5mm die that consumes 70uW in normal operation. It currently costs $0.13 in small quantities while the estimated cost of pulling cable to retrofit nuclear power plants is presently $2000/ft [NPPI].¶

Many OT networks are Limited Domains having a defined membership and communications that are often local, have a many-to-many pattern, and use application-specific identifiers ("topics") for rendezvous. This fits the generic Publish/Subscribe communications model ("pub/sub") and, as Table 1 in [PRAG] shows, nine of the eleven most widely used IoT protocols use a topic-based pub/sub transport. For example MQTT, an open standard developed in 1999 to monitor oil pipelines over satellite [MQTT][MHST], is now likely the most widely used application communication protocol in IoT (https://mqtt.org/use-cases/). Microsoft Azure, Amazon AWS, Google Cloud, and Cloudflare all offer hosted MQTT brokers for collecting and connecting sensor and control data in addition to providing local pub/sub in buildings, factories and homes. Pub/sub participants communicate by using the same topic but need no knowledge of one another. These protocols are typically implemented as an application layer protocol over two-party Internet transports like TCP or TLS, which require in-advance configuration of peer addresses and credentials at each endpoint and incur unnecessary communications overhead (see Section 1.2).¶

1.2. Transporting information

Figure 3 shows a smart lighting example with a topic-based pub/sub application layer protocol in a wireless broadcast subnet. Each switch is set up to do triple-duty: one click of its on/off paddle controls some particular light(s), two clicks control all the lights in the room, and three clicks control all available lights (five kitchen plus the four den ceiling). Thus a switch button push may require a message to as many as nine light devices. On a broadcast transmission network each packet sent by the switch is heard by all nine devices. IPv6 link-level multicast provides a network layer that can take advantage of this but current IP transport protocols cannot. Instead, each switch needs to establish nine bilateral transport associations in order to send the published message for all lights to turn on. Communicating devices must be configured with each other's IP address and enrolled identity so, for n devices, both the configuration burden and traffic scale as O(n²). For example, when an "all" event is triggered, every light's radio will receive nine messages but discard the eight determined to be "not mine." If a device sleeps, is out-of-range, or has partial connectivity, additional application-level mechanisms have to be implemented to accommodate it.¶

Figure 3: Smart lighting use of Pub/Sub

MQTT and other broker-based pub/sub approaches mitigate this by adding a broker (Figure 4). Each entity makes a single TCP transport connection with the broker and tells the broker to subscribe it to topics. Then the kitchen switch uses its single transport session to publish commands to topic kitchen/counter, topic kitchen or all. The kitchen counter light uses its broker session to subscribe to those same three topics. The kitchen ceiling lights subscribe to topics kitchen/ceiling, kitchen and all while den ceiling lights subscribe to topics den/ceiling, den and all. Use of a broker reduces the configuration burden from O(n²) to O(n): 18 transport sessions to 11 for this simple example but for realistic deployments the reduction is often greater. There are other advantages: besides their own IP addresses and identities, devices only need to be configured with those of the broker. Further, the broker can store messages for temporarily unavailable devices and use the transport session to confirm the reception of messages. This approach is popular because the pub/sub application layer protocol provides an easy-to-use API and the broker reduces configuration burden while maintaining secure, reliable delivery and providing short-term in-network storage of messages. Still, the broker implementation doubles the per-device configuration burden by adding an entity that exists only to implement transport and traffic still scales as O(n²), e.g., any switch publishing to all lights results in ten (unicast) message transfers over the Wi-Fi network. Further, the broker introduces a single point of failure into a network that is richly connected physically.¶

Figure 4: Brokers enable Pub/Sub over connection/session protocols

Clearly, a transport protocol able to exploit a physical network's broadcast capabilities would better suit this problem. (Since unicast is just multicast restricted to peer sets of size 2, a multicast transport handles all unicast use cases but the converse is not true.)¶
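
The session and message counts quoted in this section can be reproduced with a few lines of arithmetic. The sketch below is only a counting aid for the two-switch, nine-light example; the function names are hypothetical and the model ignores acknowledgements and retransmissions.

```python
def pairwise(switches: int, lights: int) -> dict:
    """Each switch holds a two-party session with every light it controls."""
    return {"sessions": switches * lights, "transfers_for_all": lights}


def brokered(switches: int, lights: int) -> dict:
    """Every device holds one session with the broker.

    An "all" command is one transfer to the broker plus one per light.
    """
    return {"sessions": switches + lights, "transfers_for_all": 1 + lights}


def broadcast(switches: int, lights: int) -> dict:
    """One multicast transmission is heard by every device on the subnet."""
    return {"sessions": 0, "transfers_for_all": 1}


if __name__ == "__main__":
    for model in (pairwise, brokered, broadcast):
        print(model.__name__, model(switches=2, lights=9))
```

With two switches and nine lights this gives 18 sessions for pairwise transport, 11 with a broker, and a single transmission per "all" command when the medium's broadcast capability is used.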

More general solutions for this communications paradigm are possible by moving the view of the problem from message exchange to the concept of coordinating shared objectives. In the distributed systems literature, communication associated with coordinating shared objectives has long been modeled as distributed set reconciliation [WegmanC81][Demers87]. In this approach, each domain of discourse is a named set, e.g., myhouse.iot. Each event or action, e.g., a switch button press, is added as a new element to the instance of myhouse.iot at its point of origin, then the reconciliation process ensures that every instance of myhouse.iot has this element. In 2000, [MINSKY03] developed a broadcast-capable set reconciliation algorithm whose communication cost equaled the set instance differences (which is optimal) but its polynomial computational cost impeded adoption. In 2011, [DIFF] used Invertible Bloom Lookup Tables (IBLTs) [IBLT][MPSR] to create a simple distributed set reconciliation algorithm that is optimal in both communication and computational cost. DeftT uses this algorithm (see Section 3.2) and takes advantage of IPv6's self-configuring link-local multicast to avoid all manual configuration and external dependencies. This restores the system design to Figure 3 where each device has a single, auto-configured transport that makes use of the broadcast radio medium without need for a broker or multiple transport associations. Each button push is broadcast exactly once to be added to the distributed set. (See Section 3.5 to see how members fill in missing elements.)¶
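
To make the IBLT idea concrete, here is a minimal, self-contained sketch of IBLT-based set difference. It is a simplified teaching version, not DeftT's wire format or code: the table size, the use of SHA-256 for cell selection, the 64-bit keys and all names (IBLT, key_of, peel) are assumptions. Each member summarizes its set in a fixed-size table; subtracting two tables cancels the common elements, and "peeling" recovers the elements that differ.

```python
import hashlib


def _h(value: int, salt: int) -> int:
    """64-bit hash of a 64-bit integer key, varied by `salt`."""
    data = bytes([salt]) + value.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def key_of(publication: bytes) -> int:
    """Map a publication to a 64-bit set element."""
    return int.from_bytes(hashlib.sha256(publication).digest()[:8], "big")


class IBLT:
    def __init__(self, cells: int = 60, hashes: int = 3):
        assert cells % hashes == 0
        self.cells, self.hashes = cells, hashes
        self.count = [0] * cells
        self.key_sum = [0] * cells    # XOR of keys hashed to the cell
        self.chk_sum = [0] * cells    # XOR of key checksums

    def _slots(self, key: int):
        # One slot per sub-table so the slots for a key are always distinct.
        part = self.cells // self.hashes
        return [i * part + _h(key, i) % part for i in range(self.hashes)]

    def _apply(self, key: int, sign: int) -> None:
        for s in self._slots(key):
            self.count[s] += sign
            self.key_sum[s] ^= key
            self.chk_sum[s] ^= _h(key, 255)

    def insert(self, key: int) -> None:
        self._apply(key, +1)

    def subtract(self, other: "IBLT") -> "IBLT":
        diff = IBLT(self.cells, self.hashes)
        for s in range(self.cells):
            diff.count[s] = self.count[s] - other.count[s]
            diff.key_sum[s] = self.key_sum[s] ^ other.key_sum[s]
            diff.chk_sum[s] = self.chk_sum[s] ^ other.chk_sum[s]
        return diff

    def peel(self):
        """Recover (keys only in self, keys only in other, fully decoded?)."""
        mine, theirs = set(), set()
        progress = True
        while progress:
            progress = False
            for s in range(self.cells):
                key = self.key_sum[s]
                # A "pure" cell holds exactly one differing key.
                if self.count[s] in (1, -1) and self.chk_sum[s] == _h(key, 255):
                    sign = self.count[s]
                    (mine if sign == 1 else theirs).add(key)
                    self._apply(key, -sign)   # remove it from all its cells
                    progress = True
        done = not any(self.count) and not any(self.key_sum)
        return mine, theirs, done
```

The table size depends on the expected number of differences, not on the size of the sets, which is why the exchanged state stays small even for large collections. Decoding is probabilistic: if the table is too small for the actual difference, peel reports that it did not fully decode and a larger table is needed.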

1.3. Securing information

Conventional session-based transports combine multiple publications with independent topics and purposes under a single session key, providing privacy by encrypting the sessions between endpoints. Credentials of endpoints (e.g., a website) are usually attested by a third party certificate authority (CA) and bound to a DNS name; each secure transport association requires the exchange of these credentials which allows for secure exchange of a nonce symmetric key. For example, in Figure 4 each transport session is a separate security association where each device validates the broker's credential and the broker has to validate each device's. Secured transport associations are between two enrolled devices (protecting against outsider and some MITM attacks) but, once the transport session has been established, there are no constraints whatsoever on what devices can say. Clearly, this does not protect against the insider attacks that currently plague OT, e.g., the [CHPT] description of a lightbulb taking over a network. For example, the basic function of a light switch requires that it be allowed to tell a light to turn on or off but it almost certainly shouldn't be allowed to tell the light to overwrite its firmware (fwupd), even though "on/off" and "fwupd" are both standard capabilities of most smart light APIs. Once a TLS session is established, the transport handles "fwupd" publications the same way as "on/off" publications. Such attacks can be prevented using trust management that operates per-publication, using rules that enable the "fwupd" from the light switch to be rejected. Combining per-publication trust decisions with many-to-many communications over broadcast infrastructure requires per-publication signing rather than session-based signing.¶

Securing each publication rather than the path it arrives on deals with a wider spectrum of threats while avoiding the quadratic session state and traffic burden. In OT, valid messages conform to rigid standards on syntax and semantics [IEC61850][ISO9506MMS][ONE][MATR][OSCAL][NMUD][ST][ZCL] that can be combined with site-specific requirements on identities and capabilities to create a system's communication rules. These rules can be employed to secure publications in a trust management system such as [DLOG] where each publisher is responsible for supplying all of the "who/what/where/when" information needed for each subscriber to prove the publication complies with system policies.¶

Instead of vulnerable third-party CAs [W509], domains employ a local root of trust and locally created certificates. Communication rules are expressed in a declarative language [DLOG] that can be validated for consistency and completeness then converted to a compact runtime form which is authorized and secured via signing with the system trust anchor. This communication schema is distributed as a certificate that can be validated using on-device trusted enclaves [TPM][HSE][ATZ] as part of the device enrollment process.¶

Figure 5 shows a simplified version of a communications schema for Figure 3. These member identities are simple two-level signing chains: the domain trust anchor signs the identities (device certs). The notation "<=" should be read as "is signed by" and the "& {}" notation indicates particular constraints. We give switches and lights different attributes in their identities ("switch" or "light") and then require that publications that send commands to turn on or off must be signed by switch identities. We limit light identities to signing publications that indicate their on or off status. When a trust domain member constructs or validates a light's publication, the "room" and "loc" fields will need to match the signer's attributes of "_myroom" and "_myloc" respectively. These rules specify that all certificates have three final components, the first of which is always "KEY"; the other two follow requirements within the trust engine (DeftT's approach is detailed in Section 3.3). In the longer discussion of the schema language (see Section 4.2 and [DCT]), signing/validation methods are specified, signing certificates that are signed by identities are specified, and signing chains contain more attributes.¶

_domain: "myLights"             //name the trust domain
//format of a publication
#lsPub: _domain/room/loc/arg/_ts & {_ts: timestamp()} <= devCert
//specify what a switch can say
switchPub: #lsPub & {room: "kitchen"|"den"|"all"} & {arg:"turnOn"|"turnOff"} <= switchCert
//specify what a light can say: status of on or off
lightPub: #lsPub & {room: _myroom} & {loc: _myloc} & {arg: "on"|"off"} <= lightCert
domainCert: _domain/_certFormat  //the domain trust anchor format
//format and rules for device certificates
devCert: _domain/_type/_myroom/_myloc/_certFormat <= domainCert
switchCert: devCert & {_type: "switch"} & {_myroom: "kitchen"|"den"}
lightCert: devCert & {_type: "light"} & {_myroom: "kitchen"|"den"}
                    & {_myloc: "ceiling"|"counter"}
//attributes of device signing chains accessed by trust management engine
#chainInfo: /_myroom/_myloc <= devCert
//appended to all certs to show carries a key, space for key identifier, date
_certFormat: "KEY"/_/_

Figure 5: Example communications schema for Figure 3's lighting system
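
As a reading aid, the rules of Figure 5 can be hand-translated into ordinary code. The sketch below is an assumption-laden illustration, not output of the schema compiler in [DCT]: the function names are hypothetical, names are treated as "/"-separated strings, and signature and timestamp checks are left out so that only the structural rules (switchPub, lightPub, switchCert, lightCert) are shown.

```python
ROOMS = {"kitchen", "den"}


def parse_cert(name: str) -> dict:
    """Split a device cert name: _domain/_type/_myroom/_myloc/KEY/_/_"""
    domain, typ, room, loc, key, _key_id, _date = name.split("/")
    if domain != "myLights" or key != "KEY":
        raise ValueError("not a myLights device cert")
    return {"type": typ, "myroom": room, "myloc": loc}


def cert_ok(cert: dict) -> bool:
    """switchCert / lightCert constraints from Figure 5."""
    if cert["myroom"] not in ROOMS:
        return False
    if cert["type"] == "switch":
        return True
    if cert["type"] == "light":
        return cert["myloc"] in {"ceiling", "counter"}
    return False


def pub_allowed(pub_name: str, signer_cert_name: str) -> bool:
    """True if the publication matches switchPub or lightPub for this signer."""
    parts = pub_name.split("/")
    if len(parts) != 5 or parts[0] != "myLights":
        return False
    _, room, loc, arg, _ts = parts
    try:
        cert = parse_cert(signer_cert_name)
    except ValueError:
        return False
    if not cert_ok(cert):
        return False
    if cert["type"] == "switch":
        # switchPub: commands to a room or to all lights
        return room in ROOMS | {"all"} and arg in {"turnOn", "turnOff"}
    # lightPub: status only, and only for the light's own room and location
    return (room == cert["myroom"] and loc == cert["myloc"]
            and arg in {"on", "off"})
```

For example, a publication named myLights/all/any/turnOn/<timestamp> is accepted when signed by a switch cert but rejected when signed by a light cert, and an argument such as "fwupd" matches neither rule, so it is rejected for every signer.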

The administrator of the "myLights" domain compiles the text version of the schema into binary and makes a schema cert, then makes a trust anchor, kitchen and den switch identities, four kitchen ceiling light identities, one kitchen counter light identity, and four den ceiling light identities. (Utilities for this are provided at [DCT].) The trust anchor private key is used to sign all of these and each device is preconfigured with the trust anchor cert, the schema cert, and its device cert (along with the device cert private key).¶

In DeftT's publication-based transport, the schema is used to both construct and validate publications, guaranteeing that all parts of the system always conform to and enforce the same rules, even as those rules evolve to meet new threats (more in Section 4.1). DeftT embeds the trust management mechanism described above directly in the publish and subscribe data paths as shown below:¶

Figure 6: Trust management elements of DeftT.

This approach extends LangSec's [LANGSEC] "be definite in what you accept" principle by using the authenticated common rules of the schema for belt-and-suspenders enforcement at both publication and subscription functions of the transport. If an application asks the Publication Builder to publish something and the schema shows it lacks credentials, an error is thrown and nothing is published. Independently, the Publication Validator ignores publications that:¶

don't have a locally validated, complete signing chain for the credential that signed them¶
have a signing chain that the schema shows isn't appropriate for the publication¶
have a publication signature that doesn't validate¶
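
The three checks listed above can be sketched as a short validation pipeline. This is a hypothetical outline, not the DeftT Publication Validator: the Cert and Publication types, the thumbprint-keyed store, and the schema_allows and signature_ok callables are all assumptions that stand in for the schema engine and the real signature verification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Cert:
    thumbprint: str   # identifier of this cert
    signer: str       # thumbprint of the cert that signed it


@dataclass(frozen=True)
class Publication:
    name: str
    signer: str       # thumbprint of the signing cert
    signature: bytes


def signing_chain(store: Dict[str, Cert], leaf: str,
                  anchor: str) -> Optional[List[Cert]]:
    """Walk from `leaf` to the trust anchor; None if any link is missing.

    Assumption: `store` holds only certs that were already validated locally.
    """
    chain, seen, tp = [], set(), leaf
    while tp != anchor:
        cert = store.get(tp)
        if cert is None or tp in seen:    # missing link or a loop
            return None
        seen.add(tp)
        chain.append(cert)
        tp = cert.signer
    return chain


def accept(pub: Publication, store: Dict[str, Cert], anchor: str,
           schema_allows: Callable, signature_ok: Callable) -> bool:
    chain = signing_chain(store, pub.signer, anchor)
    if not chain:                          # check 1: complete, validated chain
        return False
    if not schema_allows(pub, chain):      # check 2: chain fits this publication
        return False
    return signature_ok(pub, chain[0])     # check 3: signature validates
```

Publications that fail any check are simply dropped, which matches the "ignore" behavior described above; no error is returned to the sender.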

Since an application's subscriptions determine which publications it wants, only certificates from chains that can sign publications matching the subscriptions need to be validated or retained. Thus a device's communication state burden and computation costs are a function of how many different things are allowed to talk to it but not how many things it talks to or the total number of devices in the system. In particular, event driven, publish-only devices like sensors will spend no time or space on validation. Unlike most 'secure' systems, adding additional constraints to schemas to reduce attack surface results in devices doing less work.¶

1.4. Defined-trust Communications Domains

A Defined-trust Communications Limited Domain (or simply, trust domain) is a Limited Domain where communications are via DeftT (Figure 7) and all members are configured with the same trust anchor and schema (or some subset of the domain schema) as well as an individual schema-conformant DeftT identity cert chain that terminates at the trust anchor and the private key corresponding to the identity chain's leaf cert. The particular rules for any deployment are application-specific (e.g., Is it home IoT or a nuclear power plant?) and site-specific (specific form of credential and idiosyncrasies in rules) which DeftT accommodates by being invoked with a ruleset (schema) particular to a deployment. We anticipate that efforts to create common data models (e.g., [ONE]) for specific sectors will lead to easier and more forms-based configuration of DeftT deployments.¶

A trust domain is perimeterless and may operate over one or more subnets, sharing physical media with non-members. Domain members use DeftT to publish and subscribe using Publication Builders and Validators as shown in Figure 6. Publications become the elements of a set, or named collection, that is synchronized across each subnet of a trust domain that is using the same schema or schema subset. DeftT uses a distributed set reconciliation protocol on each collection and each subnet independently. A DeftT's set reconciliation protocol operates in a sync zone which is defined on a single subnet where members use the same communications schema or communications schema subset. Every DeftT maintains at least two collections: msgs for Publications constructed from application messages and cert where member signing chains are published.¶

Figure 7: Trust domain

Trust domains are extended across physically separated subnets, subnets using different media and/or groups of members using distinct subsets of the domain schema by relays (see Section 3.7). Relays have a DeftT in each sync zone and pass Publications between them as long as the Publications are valid at both the receiving and sending DeftTs. Since set reconciliation does not accept duplicates, relays are powerful elements in creating efficient configuration-free meshes. In Figure 8, different sync zones are denoted by shape and could represent different colocated media (e.g., Bluetooth, Wi-Fi, Ethernet) or physically distant sync zones. The shaded triangles represent DeftTs in a sync zone with only relay members, as for a unicast link between two relays. The set reconciliation protocol ensures that items only transit a sync zone once; an item must be specifically requested in order to be transmitted.¶

Any part of a verifiable defined-trust identity can be used in the delineation of sync zones, e.g., specific component(s) of identity names allowed to sign Publications can be constrained to be identical so that Publications are effectively only relayed to a particular "group" as identified by those components. This is enforced via the secured schema, i.e., non-"group" Publications will not validate. Relay identities have a relay capability (an attribute with a special meaning within DeftT) in their signing chain so that they can use specialized certificate and Publication movement while being prohibited from obtaining Publication encryption/decryption keys. Relay discussion is in Section 3.7 and Section 6.¶

Figure 8: Relayed trust domain

1.5. Discovery

Conventional IP services are discovered via well-known IP ports that are registered with IANA [RFC6335]. DeftT is not a service, so it doesn't require a port allocation. Instead, DeftT does peer discovery using shared security information. In particular, each DeftT is a member of a trust domain and communicates within a single sync zone where all members use the same schema cert, which is signed by the common trust anchor (see Section 1.4). The SHA256 thumbprint (Section 2) of the schema cert for a sync zone is a compact, unique identifier.¶

1.5.1. Multicast

The first 8 bytes of the schema cert thumbprint are used as the first component in all DeftT PDUs (see Section 3.3.1.1) and the lower 14 bytes are used for DeftT's link-local IPv6 addresses.¶

IPv6 multicast addresses have format [RFC4291]:¶

 8|  4  |  4  |  112   |
--+-----+-----+--------+
FF|flags|scope|group ID|
--+-----+-----+--------+

and DeftT sets these as:¶

FF|01|02|<lower 112 bits of schema cert thumbprint>

where flags=01 indicates a dynamically assigned address and scope=02 indicates link-local scope. The UDP dest port is set using the upper byte of the id to pick a random value in the [RFC6335] Dynamic Ports range 49152-65535. As UDP multicast requires different ports for send and receive, this use doesn't conflict at all with the normal use of this range for ephemeral client source ports.¶
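As a rough illustration of the address construction above, the following Python sketch derives the group address from a schema cert. It assumes that "lower 112 bits" means the trailing 14 bytes of the 32-byte SHA-256 digest and that the 4-bit flags and scope fields pack into the single byte 0x12. The function name and the placeholder certificate bytes are hypothetical, and port selection is left out because the exact mapping is not given here.

```python
import hashlib
import ipaddress

def deftt_multicast_address(schema_cert: bytes) -> ipaddress.IPv6Address:
    """Build FF12::/16 followed by the low 112 bits of the cert thumbprint."""
    thumbprint = hashlib.sha256(schema_cert).digest()   # 32 bytes
    flags, scope = 0x1, 0x2            # dynamically assigned, link-local
    prefix = bytes([0xFF, (flags << 4) | scope])
    group_id = thumbprint[-14:]        # assumption: "lower" = trailing bytes
    return ipaddress.IPv6Address(prefix + group_id)

# Placeholder bytes stand in for a real signed schema cert.
cert = b"example schema cert bytes"
addr = deftt_multicast_address(cert)
zone_id = hashlib.sha256(cert).digest()[:8]   # first component of PDU names
print(addr, zone_id.hex())
```

Because both the group address and the PDU name prefix come from the same thumbprint, members holding the same schema cert arrive at the same rendezvous point without any allocation step.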

1.5.2. Unicast

Unicast protocols require more configuration than multicast, partly because they're inherently asymmetric so each end has to know in advance whether to 'listen' or 'connect', partly because there are firewall and NAT implications, and partly due to DeftT wanting good peer authentication and privacy. Since unicast is between particular DeftT peers, the protocol, role, address and port information goes into the identity chains of those peers (Section 5.2) as an attribute. This allows them to connect, zero-RTT mutually authenticate, and establish a TLS-1.3-like EC Diffie-Hellman secure tunnel as needed.¶

1.6. Current status

An open-source Defined-trust Communications Toolkit [DCT] with an example implementation of DeftT is maintained by the corresponding author's company. [DCT] currently has examples of using DeftT to implement secure brokerless message-based pub/sub using multicast UDP/IPv6 and unicast {UDP/TCP}/IPv6 and includes extending a trust domain via a unicast connection or between two broadcast network segments.¶

Massive build out of the renewable energy sector is driving connectivity needs for both monitoring and control. Author King's company, Operant, is currently developing extensions of DeftT in a mix of open-source and proprietary software tailored for commercial deployment in support of distributed energy resources (DER). Current small scale use cases have performed well and expanded usage is underway. Pollere is also working on home IoT uses. The development philosophy for DeftT is to start from solving useful problems with a well-defined scope and extend from there. As the needs of our use cases expand, the Defined-trust communications framework will evolve with increased efficiencies. DeftT's code is open source, as befits any communications protocol, but even more critical for one attempting to offer security. DCT itself makes use of the open source cryptographic library libsodium [SOD] and the project is open to feedback on potential security issues as well as hearing from potential collaborators.¶

The well-known issues with 802.11 multicast [RFC9119] can make DeftT less efficient than it should be. Target OT deployments primarily use smaller packet sizes and DeftT's set reconciliation provides robust delivery that currently mitigates these concerns. DeftT use may become another force for improved multicast on 802.11, joining the critical network infrastructure applications of neighbor discovery, address resolution, DHCP, etc.¶

Cryptographic signing takes most of the application-to-network time in DeftT. Though not prohibitively costly (e.g., under 20 microseconds on a Mac Studio), increased use of signing in transports may incentivize creation of more efficient signing algorithms.¶

1.6.1. Performance considerations

DeftT is intended for the kinds of devices available today. The minimum expected capability is that of an ESP32 as noted in Section 1.1. Experience thus far on a number of platforms has not shown DeftT to be unduly burdensome. Section 3.8 reports measurement results for time to process application messages as well as memory burden.¶

Note that DeftT incurs some memory overhead that may not be immediately apparent. It keeps a copy of the current identity information for all members that can produce Publications to which it can subscribe (as per schema). This requires ~300 bytes per certificate. Collections hold Publications for their lifetime plus a skew time in order to prevent replay attacks.¶

1.6.2. Relationship to other approaches

OSCORE [RFC8613] adds object security to COAP specifically to get around the vulnerability of using only DTLS/TLS with proxies. OSCORE uses pre-shared keys, acquired out-of-band or via a key establishment protocol, to encrypt/sign COAP messages which are carried as payload in a COAP message with the OSCORE option. Its Security Context is between two endpoints and specific to sender ID and recipient ID where sender IDs may be established out-of-band. As Internet compatible protocols, COAP/OSCORE/ACE[RFC9200] use 1) cleartext options in their headers and 2) trusted third parties or resource servers, both of which can be exploited.¶

A DeftT PDU uses a hash of its compiled rules cert to identify its trust domain (Section 2) and uses no header options. In the Internet, PDU headers tell nodes how the packet should be handled; in a DeftT trust domain, the trust domain identifier indicates the packet is part of the domain whose rules will be enforced by any receiver. These are very different architectures both for communicating and for securing communications and are expected to serve different roles although the application spaces may overlap. Further, DeftT and Defined-trust Communications are early-stage work compared to COAP/OSCORE and other IETF work, but deployments are underway by Operant and Pollere.¶

Out-of-band configuration techniques developed for COAP and OSCORE should be adaptable for configuration of DeftT members.¶

3. DeftT and Defined-trust Communications

DeftT synchronizes and secures communications between its enrolled members. DeftT's multi-party synchronized collections of named, schema-conformant Publications contrast with the bilateral session of TCP or QUIC where a source and a destination coordinate with one another to transport undifferentiated streams of information. At any specific time, the DeftTs in a trust domain may hold different subsets of a collection (e.g., immediately after entities add elements to the collection) but the synchronization protocol ensures all converge to the complete set of elements within a few round-trip-times following the changes.¶

Applications use DeftT to add to and access from a collection of Publications. DeftT enforces "who can say what to which" as well as providing required integrity, authenticity and confidentiality. Transparently to applications, a DeftT both constructs and validates all Publications against its schema's formal, validated rules. The compiled binary communications schema is distributed as a trust-root-signed certificate and that certificate's thumbprint (see Section 3.3.1.4 and Section 2) uniquely identifies each trust domain. Each DeftT is configured with the trust anchor used in the domain, the schema cert, and its own credentials for membership in the domain. DeftTs must be in the same domain to communicate. Identity credentials comprise a unique private identity key along with a public certificate chain rooted at the domain's trust anchor. Certificates in identity chains are specified in the schema and contain the attributes granted to the identity. Thus, attributes are stored in the identity not on an external server.¶

As illustrated in Figure 2, each member publishes its credentials to the certificate collection in order to join the domain. DeftT validates credentials as a certificate chain against the schema and does not accept Publications without a fully validated signer. This unique approach enables fully distributed policy enforcement without a secured-perimeter physical network and/or extensive per-device configuration. DeftT can share an IP network with non-DeftT traffic as well as DeftT traffic of a different Domain. Privacy via AEAD (Authenticated Encryption with Associated Data) is automatically handled within DeftT if selected in the schema.¶

Figure 9: DeftT's interaction in a network stack

Figure 9 shows the data flow in and out of a DeftT. DeftT uses its schema to package application information into Publications that are added to its local view of the collection. Publications are carried in collection addition (cAdd) PDUs that are used along with collection state (cState) PDUs to communicate about and synchronize Collections. cStates report the state of the local collection; cAdds carry Publications to other members that need them. These PDUs are broadcast on their subnet (e.g., UDP multicast).¶

3.1. Inside DeftT

DeftT's example implementation [DCT] is organized in functional library modules that interact to prepare application-level information for transport and to extract application-level information from packets (see Figure 10). Extensions and alternate module implementations are possible but the functionality and interfaces must be preserved. Internals of DeftT are completely transparent to an application and the example implementation is efficient in both lines of code and performance. The schema determines which modules are used (Figure 11). A DeftT participates in two required collections and may participate in others if required by the schema-designated signature managers. One of the required collections, msgs, contains Publications constructed from application messages. The other required collection, cert, contains the certificates of the trust domain. Specific signature managers may require group key distribution in descriptively named collection keys.¶

Figure 10: Run-time library modules

A shim serves as the translator between application semantics and the named information objects (Publications) whose format is defined by the schema. The syncps module is the set reconciliation protocol used by DeftT (see Section 3.2). New signature managers, distributors, and face modules may be added to the library to extend features. More detail on each module can be found at [DCT] in both code files and documents.¶

The signing/validation modules (signature managers) are used for both Publications and cAdds. Figure 11 shows validator specifications for the two required Publication collections, one for Publications carrying application messages and one for Publications carrying certificates. The "#pduValidator" specifies AEAD, which requires a symmetric key; AEAD is thus an example of a sigmgr that requires a distributor module (a group key distributor) that is automatically instantiated at run-time. Changing a pduValidator from "EdDSA" to "AEAD" in a domain's communication schema is all that needs to be done to change from signed, cleartext PDUs to PDUs encrypted with a periodically regenerated shared key where only the minimal information of trust domain identifier and collection type is exposed.¶

#msgsValidator: "EdDSA"
#certValidator: "EdDSA"
#pduValidator:  "AEAD"  //for cAdds

Figure 11: Schema portion specifying the sigmgr to be used for signing and validation

Following good security practice, DeftT's Publications are constructed and signed early in their creation, then are validated (or discarded) early in the reception process. The schemaLib module provides certificate store access throughout DeftT along with access to distributors of group keys, Publication building and structural validation, and other functions of the trust management engine. This organization of interacting modules is not possible in a strictly layered implementation.¶

3.2. syncps: a set reconciliation protocol

DeftT requires a method or protocol that keeps collections of Publications synchronized. Required functionality for such a protocol can be understood through the example of the syncps protocol included in the example implementation. The syncps protocol uses IBLTs [DIFF][IBLT][MPSR] to solve the multi-party set-difference problem efficiently without the use of prior context and with communication proportional to the size of the difference between the sets being compared. The state of a local collection is encoded in an IBLT. A syncps announces its local collection state (set of currently known Publications) by sending a cState (Section 3.3.1.1) that also serves as a query for additional data not reflected in its local state. Receipt of a cState performs three simultaneous functions: (1) announces new Publications, (2) notifies of Publications that member(s) are missing and (3) acknowledges Publication receipt. The first may prompt the recipient to share its cState to get the new Publication(s). The second results in the recipient sending a cAdd (Section 3.3.1.2) containing all the locally available missing Publications that fit. The third is used optionally and may result in a progress notification sent to other local modules so anything waiting for delivery confirmation can proceed.¶

On broadcast media, syncps uses any cStates it hears to reduce (suppress) sending excess cStates and listens for cAdds that may add to its collection. This means that one-to-many Publications cause sending a single cState and a single cAdd independently of the number of members desiring the Publication (the theoretical minimum possible for reliable delivery). The digest size of a cState can be controlled by Publication lifetime, dynamically constructing the digest to maximize communication progress [Graphene][Graphene19] and, if necessary for a large network, dynamically adapting topic specificity.¶

A cAdd with new Publication(s) responds to a particular cState as per (Section 3.3.1.2 item 1). Any DeftT that is missing a Publication (due to being out-of-range, asleep, channel errors, etc.) can receive it from any other DeftT. A syncps will continue to send cAdds as long as cStates are received that are missing any of its active Publications. This results in reliability that is subscriber-oriented, not publisher-oriented, kept efficient with protocol features that prevent multiple redundant broadcasts. The example implementation of syncps prevents redundant broadcasts by having originating publishers send their responding Publications immediately while others delay before supplying missing Publications, canceling if a responding cAdd is overheard. Other approaches are possible.¶

The collection synchronization work of a syncps module is shown as a state diagram in Figure 12. When a new syncps is started, it always sends its local cState (starts unsuppressed) on the network and sets an expiration timer for the cState. If this timer expires, the "new local cState" actions are repeated and the cState may be suppressed (thus not sent). For most collections, the initial cState will show an empty collection (certificate collections will have the local identity chain). The events that can move the collection forward are (1) the arrival of a cState from the network, (2) the arrival of a cAdd from the network whose csID matches a hash value stored from a previously received or sent cState, or (3) arrival of a new Publication from its shim. For an arriving cAdd, each Publication is extracted, validated, and passed to any registered subscriber callback(s). Non-validating packets are silently discarded (may optionally set alerts or count discards). Reception of new Pub(s) may cause an application process to create and add new Publications to the local collection. Sending of new Pubs is deferred until the entire cAdd has been processed. If there are no new Pubs to send, syncps moves to its "set sendCStateTimer" state where a cancelable sendCState timer is set to the estimated dispersion delay of this local subnet. (Dispersion should be << cState lifetime. More on dispersion delay in Section 3.5.)¶

Figure 12: State diagram of a syncps module

Since new Publications are always eligible to send, if any were created while in "process cAdd", the next state is "process Pubs to send" with csID set to the csID field of the cAdd on entry to "process cAdd". In "process Pubs to send" any pending sendCState will be canceled and eligible Pubs are packaged as content for a new cAdd. Packaged Publications are subject to a hold time (during which they are ineligible to send) of twice the dispersion delay to avoid responding to cStates sent before reception of a cAdd containing the Pub. If there are Pubs to send, a cAdd with that content and the passed in csID is sent, then the set sendCStateTimer state is entered; when there are no Pubs to send, the action moves there directly.¶

Another path to exit the "wait for event" state is reception of a cState which moves to "process cState" where incoming cStates are recorded. If this cState matches the local cState, the syncps returns to the wait state. Otherwise, the IBLT is extracted from the cState, an IBLT is computed on the local Pub collection, and they are peeled to find the ones the received cState has that are not in the local collection ("needs") and the ones that are in the local collection and not in the received cState ("haves"). Syncps enters the "process Pubs to send" state with csID set to the hash of the cState's name. Eligible Pubs are "haves" that do not have a hold time set and locally generated Publications are sent preferentially. Publications obtained from others are not immediately eligible to send; members delay to give the originator time to respond, sending these when further cStates indicate a member continues to need them.¶
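The peeling step in "process cState" can be illustrated with a toy IBLT. This is a minimal sketch of the general technique, not the syncps implementation: the table size, the number of hash functions, the use of SHA-256 for cell selection, and all names (IBLT, h64, pub_id) are assumptions made for the example, and the run-length compression used on the wire is omitted.

```python
import hashlib

K = 3         # hash functions, one cell per partition
CELLS = 120   # table size, a multiple of K

def h64(key: int, salt: int) -> int:
    data = bytes([salt]) + key.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

class IBLT:
    def __init__(self):
        self.count = [0] * CELLS
        self.key_sum = [0] * CELLS
        self.chk_sum = [0] * CELLS

    def _update(self, key: int, delta: int) -> None:
        part = CELLS // K
        for i in range(K):
            c = i * part + h64(key, i) % part
            self.count[c] += delta
            self.key_sum[c] ^= key
            self.chk_sum[c] ^= h64(key, 255)

    def insert(self, key: int) -> None:
        self._update(key, 1)

    def subtract(self, other: "IBLT") -> "IBLT":
        d = IBLT()
        for i in range(CELLS):
            d.count[i] = self.count[i] - other.count[i]
            d.key_sum[i] = self.key_sum[i] ^ other.key_sum[i]
            d.chk_sum[i] = self.chk_sum[i] ^ other.chk_sum[i]
        return d

    def peel(self):
        """Return (only_in_self, only_in_other, fully_decoded)."""
        mine, theirs = set(), set()
        progress = True
        while progress:
            progress = False
            for i in range(CELLS):
                # A "pure" cell holds exactly one key; the checksum confirms it.
                pure = (self.count[i] in (1, -1)
                        and self.chk_sum[i] == h64(self.key_sum[i], 255))
                if pure:
                    key, sign = self.key_sum[i], self.count[i]
                    (mine if sign == 1 else theirs).add(key)
                    self._update(key, -sign)   # remove it from all its cells
                    progress = True
        done = not any(self.count) and not any(self.key_sum)
        return mine, theirs, done

def pub_id(name: str) -> int:
    # Hypothetical 64-bit identifier standing in for a Publication hash.
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:8], "big")

local, remote = IBLT(), IBLT()
shared = [f"pub{i}" for i in range(50)]
for n in shared + ["local-only-1", "local-only-2"]:
    local.insert(pub_id(n))
for n in shared + ["remote-only"]:
    remote.insert(pub_id(n))

# Subtracting the received table from the local one cancels shared items.
haves, needs, ok = local.subtract(remote).peel()
```

The 50 shared items cancel in the subtraction, so the work and the table size needed depend only on the three differing items. That is the property that keeps cState size proportional to the set difference rather than to the collection.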

A syncps also exits the wait state when the attached shim has a Publication to send. Since a new Pub will not appear in any previously issued cState, any one can be used, including one issued locally. In "get stored cState", the best one (the most recent cState from the network if available) is retrieved and passed to the "peel IBLT values" state where its cState.csID is used for the csID value. There will always be at least the one new Publication to send in this case.¶

This state diagram is intended to capture the major functionality of a syncps module while excluding excessive detail. In particular, Figure 12 does not show "housekeeping" tasks on the collection, e.g., removal of expired Publications.¶

3.3. DeftT formats

All DeftT Information is represented using TLV (Type, Length, Value) tuples. Types can either be containers (they contain a concatenated sequence of TLVs) or leaves (they contain a single non-TLV value with well-defined semantics and serialization). All TLVs have a boolean 'valid()' method that returns 'true' if and only if their content satisfies all the constraints associated with the TLV's type. For container types this means, at minimum, that the sum of all the enclosed TLV Lengths and header sizes exactly equals the Length of the container and that the valid() method of each of the enclosed TLVs returns true. Most container types have additional constraints on the type, ordering and value of the enclosed TLVs that are described below.¶
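The container rule above can be sketched as a recursive check. The sketch assumes NDN-style variable-length encoding of Type and Length (one byte below 253, otherwise a marker byte followed by 2, 4 or 8 bytes), which is consistent with the sizes in the dumps in this section but is not stated here. The function names and the reduced set of container types are illustrative only.

```python
CONTAINER_TYPES = {5, 7}   # cState and Name only, for illustration

def read_varnum(buf: bytes, pos: int):
    """Decode one variable-length number; return (value, next_position)."""
    first = buf[pos]
    if first < 253:
        return first, pos + 1
    width = {253: 2, 254: 4, 255: 8}[first]
    return int.from_bytes(buf[pos + 1:pos + 1 + width], "big"), pos + 1 + width

def write_varnum(n: int) -> bytes:
    if n < 253:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "big")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "big")
    return b"\xff" + n.to_bytes(8, "big")

def encode(typ: int, value: bytes) -> bytes:
    return write_varnum(typ) + write_varnum(len(value)) + value

def valid(buf: bytes) -> bool:
    """True iff buf is exactly one TLV whose nested lengths all add up."""
    try:
        typ, pos = read_varnum(buf, 0)
        length, pos = read_varnum(buf, pos)
    except IndexError:
        return False
    if pos + length != len(buf):
        return False
    if typ not in CONTAINER_TYPES:
        return True            # leaf: type-specific checks would go here
    while pos < len(buf):
        try:
            _, p = read_varnum(buf, pos)
            child_len, p = read_varnum(buf, p)
        except IndexError:
            return False
        if p + child_len > len(buf) or not valid(buf[pos:p + child_len]):
            return False
        pos = p + child_len
    return True

name = encode(7, encode(8, b"iot1") + encode(8, b"lock"))
corrupt = bytearray(name)
corrupt[3] = 5                 # inner component now claims one byte too many
print(valid(name), valid(name[:-1]), valid(bytes(corrupt)))
```

Type-specific constraints (ordering, required children, value ranges) would be layered on top of this structural check.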

3.3.1. Top level container TLVs

As shown in Figure 10 there are two kinds of top level containers: PDUs, which are exchanged with the system-provided network transport, and the Pubs they carry, which are the elements of the set synchronization protocol. PDUs and Pubs have similar structure and share most of their code but are designed to be unambiguously distinguishable. As indicated in Figure 9 and Figure 10, syncps uses a pub/sub model for both its shim facing and network facing interfaces. Thus the first TLV in any top level container is a Name container comprising the topic name used to mediate the pub/sub rendezvous. The other TLVs in the top level container depend on the kind of container.¶

There are two kinds of PDU containers, cState and cAdd, and two kinds of Publications, publications and certs. All four are described in the following sections.¶

3.3.1.1. cState PDUs

A cState PDU (TLV type 5) announces the items a member holds in a specific collection of a specific trust domain subnet. It must contain the following three TLVs and they must be in this order:¶

1. Name TLV containing exactly three type Generic (aka, byte array or binary blob) components:¶

   C1: Sync zone id consisting of the first 8 bytes of the SHA-256 thumbprint of the communications schema cert in use for this domain (may be the same as the trust domain id).¶

   C2: Collection name¶

   C3: Run-length compressed IBLT of the items in the publisher's instance of the collection (see Section 3.2 for more information on IBLTs).¶

2. Nonce (TLV type 10, leaf) whose value must be 4 random bytes chosen by the publisher at the time the cState is built. Duplicate cStates can arise from multiple members announcing the same Name because they hold the same items or because the network doesn't handle multicast well and lets PDUs loop. The nonce allows these two cases to be distinguished so looping cStates can be dropped.¶

3. Lifetime (TLV type 12, leaf) whose value is the lifetime (measured in milliseconds since this PDU's arrival) serialized as an unsigned big-endian integer with all leading zero bytes suppressed. A member receiving the cState and capable of publishing into the collection can hold onto the cState for this lifetime. If the member has an item to publish before the end of the cState's lifetime, the Publication can be sent immediately in a responding cAdd.¶

For example, the initial PDU sent by the home IoT "gate controller" sample app (in examples/hmIot of [DCT]) will be a cState in the cert collection and looks like:¶

5 (cState) size 128:
| 7 (Name) size 116:
| | 8 (Generic) size 8:  55d5 7f99 7d8d ba91
| | 8 (Generic) size 4:  cert
| | 8 (Generic) size 98:  8201 dd76 eb0f 46ed  89a8 8101 dd76 eb0f  46..
| |                       beb5 9922 fdd6 7401  cbbe 5bc1 5b57 1c63  84..
| |                       79aa ca17 8501 cbbe  5bc1 5b57 1c63 8801  ce..
| |                       92f8
| 10 (Nonce) size 4:  8b9f 8134
| 12 (Lifetime) size 2:  4789

Note that the format inspections of this section are produced by using the dctwatch tool from [DCT] with the -f option.¶
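The cState layout can be reproduced with a few lines of Python, which also shows how the sizes in the dump add up. This is a sketch under stated assumptions: one-byte Type and Length fields (sufficient for values under 253 bytes), a placeholder schema cert, and 98 zero bytes standing in for a real compressed IBLT. The helper names are hypothetical and are not part of [DCT].

```python
import hashlib
import os

def tlv(typ: int, value: bytes) -> bytes:
    # Sketch only: one-byte type and length, enough for short values.
    assert typ < 253 and len(value) < 253
    return bytes([typ, len(value)]) + value

def lifetime_bytes(ms: int) -> bytes:
    # Unsigned big-endian with leading zero bytes suppressed (ms > 0 assumed).
    assert ms > 0
    return ms.to_bytes((ms.bit_length() + 7) // 8, "big")

def build_cstate(schema_cert: bytes, collection: str,
                 iblt: bytes, lifetime_ms: int) -> bytes:
    zone_id = hashlib.sha256(schema_cert).digest()[:8]      # C1
    name = tlv(7, tlv(8, zone_id)
                  + tlv(8, collection.encode())             # C2
                  + tlv(8, iblt))                           # C3
    nonce = tlv(10, os.urandom(4))
    lifetime = tlv(12, lifetime_bytes(lifetime_ms))
    return tlv(5, name + nonce + lifetime)

# 98 zero bytes stand in for the compressed IBLT shown in the dump.
pdu = build_cstate(b"placeholder schema cert", "cert", bytes(98), 0x4789)
print(len(pdu), pdu[1], pdu[3])
```

The Name holds 10 + 6 + 100 = 116 bytes, and adding its 2-byte header, the 6-byte Nonce and the 4-byte Lifetime gives the 128-byte cState body reported by the dump.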

3.3.1.2. cAdd PDUs

Note: cAdds, Publications and Certificates all share the same Data (TLV type 6) container format but are distinguished by their Metainfo TLVs. They all contain the same five TLVs in the same order but each has different constraints on the value of those TLVs.¶

A cAdd PDU (TLV type 6) supplies one or more Pubs in response to some cState. It must contain the following five TLVs and they must be in this order:¶

1. Name TLV derived from the cState's Name: the first two components, sync zone id and collection name, are the same but the IBLT component is replaced by a csID (TLV type 35, leaf) whose value must be the 32 bit big-endian Murmurhash of the cState's entire Name TLV. (This is done because "Repeat the question and append the answer" is the common strategy for matching responses to requests in multicast protocols but an IBLT can be hundreds of bytes which would drastically reduce the cAdd's payload space so "the question" is replaced with a compact hash proxy.)¶

2. Metainfo (TLV type 20) saying the PDU's ContentType is cAdd (42), i.e., contains one or more Pubs and nothing else so it must be 'structurally validated' on arrival.¶

3. Content container (TLV type 21) which must contain one or more complete, valid Pubs. The Pubs must NOT already be in the cState's IBLT. I.e., the Pubs must be newly created on the cAdd publisher or in the 'need' set when the difference between the publisher's IBLT and the cState's IBLT is 'peeled' (see [DIFF] and the DeftT example implementation's handleCState code for details).¶

4. SigInfo container (TLV type 22) which must contain a SigType (TLV 27, leaf) containing a valid keyed or unkeyed signature type from the types listed in Section 3.3.2. If and only if the signature type is keyed (i.e., validation requires the public key cert of a public/private keypair), the SigInfo must contain KeyLocator (TLV 28) containing a KeyDigest (TLV 29, leaf) of length 32 bytes containing the thumbprint of the cert needed for validation. The SigType must match the type of the PDU signature validator associated with the collection.¶

5. SigValue (TLV type 23, leaf) containing the result of signing the cAdd PDU using the algorithm and key, if any, specified by the SigInfo. The Length of the TLV must match the length used by the signature type as per Section 3.3.2. The PDU signature validator must successfully validate the signature.¶

For example, the following is the frontdoor's cAdd responding to the cState, shown above, for the cert collection. The cert collection is synced by the certificate distributor which can't use any signature types that depend on keys since it's responsible for obtaining the certificates containing the keys that would be needed to validate a PDU's signature. Thus, it is the only collection allowed to use an unkeyed [RFC7693] BLAKE2 MAC to integrity check its PDUs (this is why Figure 11 does not specify a separate validator for cert PDUs). Since the content is self-authenticating public key certs, this doesn't cause security issues.¶

6 (Data) size 561:
| 7 (Name) size 22:
| | 8 (Generic) size 8:  55d5 7f99 7d8d ba91
| | 8 (Generic) size 4:  cert
| | 35 (csID) size 4:  f6d7 3d84
| 20 (MetaInfo) size 3:
| | 24 (ContentType) size 1:  42 (CAdd)
| 21 (Content) size 489:
         ... (489 bytes of Content elided)
| 22 (SigInfo) size 3:
| | 27 (SigType) size 1:  9 (RFC7693)
| 23 (SigValue) size 32:  af8e 1412 e659 103f  5237 f1e1 0e7b 0af8  9c..

Except for this collection, PDU and Publication signature types are specified in the schema. PDUs typically use AEAD with a locally elected cover key distributor to protect the content privacy. Publications typically use EdDSA to provide provenance and ABAC attributes via the signing chain or a combined AEAD and EdDSA signature type (AEADSGN) to constrain content disclosure to some limited group. All encrypted content must remain encrypted, in motion or at rest, from point of origin to point(s) of use. The syncps subscribe upcall may decrypt a piece of content for ephemeral use but the callee must NOT retain the plaintext form.¶

3.3.1.3. Publications

As noted above, a Publication must be in a Data TLV containing the same five TLVs in the same order as cAdds and Certificates. Publications are distinguished by having a Metainfo ContentType of Blob (0).¶

DeftT communicates via Publications which are currently organized into collections of msgs and keys.¶

A Publication (TLV type 6) must contain the following five TLVs and they must be in this order:¶

1. Name TLV which must contain at least three components and the first component's length must be non-zero. The schema specifies the format of the Name including number and type of components, allowed values, allowed signers, etc. Implementations must construct and sign Pubs so that they are consistent with the schema. (The example implementation's applications show that this can be done automatically with minimal application involvement, e.g., see the phone app in the office control example.) Implementations must fully validate Publications both cryptographically and against the schema before adding them to the collection. Implementations must NOT add a Publication to a collection that already contains it.¶

2. Metainfo (TLV type 20) saying the Publication's ContentType is Blob (0), i.e., contains arbitrary bytes that can't be 'structurally' validated (but are always cryptographically validated for integrity and authorization by the signature check).¶

3. Content container (TLV type 21) containing Length bytes. Length may be zero.¶

4. SigInfo container (TLV type 22) which must contain exactly two TLVs: a SigType (TLV type 27, leaf) containing a valid keyed signature type from the types listed in Section 3.3.2 followed by a KeyLocator (TLV type 28) containing a KeyDigest (TLV type 29, leaf) of length 32 bytes containing the thumbprint of the cert needed to validate the signature. The SigType in the Publication must match the collection's Publication validator which must match the #pubValidator specified in the schema.¶

5. SigValue (TLV type 23, leaf) containing the result of signing the Publication using the algorithm and key specified by the SigInfo. The Length of the TLV must match the length used by the signature type as per Section 3.3.2. The collection's publication signature validator must successfully validate the signature.¶

For example, what follows are two consecutive Publications made to the msgs collection. First, operator alice publishes a command for all lock devices to lock themselves (similar to the multiple subscriptions per-light shown in Figure 3, the schema requires that all lockable devices subscribe to the iot1/lock/command/all prefix in msgs):¶

6 (Data) size 216:
| 7 (Name) size 68:
| | 8 (Generic) size 4:  iot1
| | 8 (Generic) size 4:  lock
| | 8 (Generic) size 7:  command
| | 8 (Generic) size 3:  all
| | 8 (Generic) size 4:  lock
| | 8 (Generic) size 17:  p38863@aphone.local
| | 37 (SequenceNum) size 4:  b4a1 ea2a
| | 37 (SequenceNum) size 0:
| | 36 (Timestamp) size 7:  23-09-18@19:40:45.591793
| 20 (MetaInfo) size 3:
| | 24 (ContentType) size 1:  0 (Blob)
| 21 (Content) size 32:  Msg #3 from operator:alice-38863
| 22 (SigInfo) size 39:
| | 27 (SigType) size 1:  8 (EdDSA)
| | 28 (KeyLocator) size 34:
| | | 29 (KeyDigest) size 32:  7096 5de9 6848 7543  d2c8 e459 24fb 7b0..
| 23 (SigValue) size 64:  61b3 fc3c 03df 2c89  7a0c ddae 27a2 f883  dd..
|                         2699 899f 1c91 46c1  3127 9da8 8948 e783  68..

Three milliseconds later, the gate publishes that it has locked itself:¶

6 (Data) size 214:
| 7 (Name) size 69:
| | 8 (Generic) size 4:  iot1
| | 8 (Generic) size 4:  lock
| | 8 (Generic) size 5:  event
| | 8 (Generic) size 4:  gate
| | 8 (Generic) size 6:  locked
| | 8 (Generic) size 17:  p59280@rpi2.local
| | 37 (SequenceNum) size 4:  e131 5a4b
| | 37 (SequenceNum) size 0:
| | 36 (Timestamp) size 7:  23-09-18@19:40:45.594867
| 20 (MetaInfo) size 3:
| | 24 (ContentType) size 1:  0 (Blob)
| 21 (Content) size 29:  Msg #3 from device:gate-59280
| 22 (SigInfo) size 39:
| | 27 (SigType) size 1:  8 (EdDSA)
| | 28 (KeyLocator) size 34:
| | | 29 (KeyDigest) size 32:  3dde 0f21 beae 2c20  3ea3 5c2e 77ca 9d4..
| 23 (SigValue) size 64:  3913 011d 7e74 807c  94b5 e725 a8e7 5b2f  09..
|                         bc99 9c8b fa9f f929  4722 f23a 1fbe cd84  b6..

As described in Section 8, the schema is designed for spoofing and replay protection of Publications. Section 3.2 notes that the per-publication EdDSA signature prevents spoofing or modification. Since all collections ignore duplicates of an existing Publication, replays of anything in the collection will be ignored. Publications have a collection-dependent lifetime that is generally ephemeral. To keep collections from growing without bound, Publications are removed once the node's local time exceeds their arrival time plus lifetime. Arriving Publications are ignored if their timestamp (name component 9) plus a collection-dependent "expiry time" is before the node's local time. "lifetime" is substantially larger than "expiry time" to account for clock skew, so a Publication is still held (and any replay of it rejected as a duplicate) for at least as long as its timestamp would otherwise be accepted. The combination of these two mechanisms prevents all replay.¶
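
The interaction of the duplicate, expiry and lifetime checks can be illustrated with a minimal sketch. It is not taken from the example implementation: the class, its field names and the use of floating-point seconds are all assumptions made for illustration, and the expiry rule is read as rejecting Publications whose timestamp is older than the expiry window.¶

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical model of one node's local collection (illustration only)."""
    expiry: float     # assumed: maximum accepted age (seconds) of an arriving Publication
    lifetime: float   # assumed: retention time (seconds); chosen larger than expiry
    pubs: dict = field(default_factory=dict)  # Publication name -> local arrival time

    def on_arrival(self, name: bytes, timestamp: float, now: float) -> bool:
        """Return True if the Publication is added to the collection."""
        if name in self.pubs:
            return False              # duplicate of a held Publication: ignored
        if timestamp + self.expiry < now:
            return False              # timestamp outside the expiry window: ignored
        self.pubs[name] = now
        return True

    def purge(self, now: float) -> None:
        """Remove Publications whose arrival time plus lifetime has passed."""
        self.pubs = {n: t for n, t in self.pubs.items()
                     if t + self.lifetime >= now}
```

In this sketch a replayed Publication is rejected as a duplicate while the original is held, and rejected by the expiry check once the original has been purged, provided lifetime exceeds expiry by more than the clock skew.¶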

3.3.1.4. Certificates

As noted above, a Certificate must be in a Data TLV containing the same five TLVs in the same order as cAdds and Publications. Certificates are distinguished by having a Metainfo ContentType of Key (2) and by having a Validity Period specified according to a more rigorous subset of the rules in [RFC1422] section 3.3.6 as described in item 5 below.¶

A Certificate (TLV type 6) must contain the following five TLVs and they must be in this order:¶

Name TLV which must contain at least five components and the first component's length must be non-zero. The schema specifies the format of the Name including number and type of components, allowed values, allowed signers, etc. Implementations must construct and sign certs so that they are consistent with the schema. (Tools to do this are supplied with the example implementation.) Implementations must fully validate certs both cryptographically and against the schema before accepting them. "Fully validating" requires that the cert's signer has been accepted; thus a cert cannot be accepted until its entire signing chain has been accepted.¶
Metainfo (TLV type 20) saying the Cert's ContentType is Key (2). This means the container has no TLV structure to validate.¶
Content container (TLV type 21) containing Length bytes. Length must equal the size of the public key associated with the cert's SigInfo SigType.¶
SigInfo container (TLV type 22) which must contain exactly two TLVs: a SigType (TLV type 27, leaf) containing a valid keyed signature type from the types listed in Section 3.3.2 followed by a KeyLocator (TLV type 28) containing a KeyDigest (TLV type 29, leaf) of length 32 bytes containing the thumbprint of the cert needed to validate the signature. The KeyDigest must be followed by a Validity Period (TLV 253) containing a NotBefore (TLV 254, leaf) containing a valid 15 character ISO 8601-1:2019 format GMT timepoint followed by a NotAfter (TLV 255, leaf) containing a valid 15 character ISO 8601-1:2019 format GMT timepoint. The cert must be ignored if the NotBefore value is >= the NotAfter value, if the NotAfter value is < the current time or if the validity period is not completely contained within its signing cert's validity period. The SigType in the Cert must match the #certValidator type specified in the schema.¶
SigValue (TLV type 23, leaf) containing the result of signing the Cert using the algorithm and key specified by the SigInfo. The Length of the TLV must match the length used by the signature type as per Section 3.3.2.¶
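
The three Validity Period conditions in item 4 can be sketched as follows. This is an illustrative reading of the rules, not code from the example implementation: the function names are hypothetical, and the 15-character timepoint is assumed to have the form YYYYMMDDThhmmss as in the example cert below.¶

```python
from datetime import datetime, timezone

TIMEPOINT_FORMAT = "%Y%m%dT%H%M%S"   # assumed layout of the 15-character timepoint


def parse_timepoint(text: str) -> datetime:
    """Parse a 15-character GMT timepoint such as 20230219T021746."""
    if len(text) != 15:
        raise ValueError("timepoint must be exactly 15 characters")
    return datetime.strptime(text, TIMEPOINT_FORMAT).replace(tzinfo=timezone.utc)


def cert_validity_ok(not_before: str, not_after: str,
                     signer_not_before: str, signer_not_after: str,
                     now: datetime) -> bool:
    """Return False if the cert must be ignored under the rules of item 4."""
    try:
        nb, na = parse_timepoint(not_before), parse_timepoint(not_after)
        snb = parse_timepoint(signer_not_before)
        sna = parse_timepoint(signer_not_after)
    except ValueError:
        return False                  # malformed timepoint
    if nb >= na:
        return False                  # NotBefore >= NotAfter
    if na < now:
        return False                  # NotAfter is in the past
    # The validity period must lie completely within the signing cert's period.
    return snb <= nb and na <= sna
```

Note that, following the text, the sketch does not reject a cert merely because its NotBefore is later than the current time.¶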

For example, what follows is the frontdoor's identity cert used in the home IoT example which gets added to the "cert" collection:¶

6 (Data) size 240:
| 7 (Name) size 50:
| | 8 (Generic) size 4:  iot2
| | 8 (Generic) size 6:  device
| | 8 (Generic) size 9:  frontdoor
| | 8 (Generic) size 3:  KEY
| | 8 (Generic) size 4:  0eaf f793
| | 8 (Generic) size 3:  dct
| | 36 (Timestamp) size 7:  23-02-18@18:17:46.088971
| 20 (MetaInfo) size 3:
| | 24 (ContentType) size 1:  2 (Key)
| 21 (Content) size 32:  de19 4605 7f77 a7bd  1317 de41 002c fe15  1bc..
| 22 (SigInfo) size 81:
| | 27 (SigType) size 1:  8 (EdDSA)
| | 28 (KeyLocator) size 34:
| | | 29 (KeyDigest) size 32:  8c7f 1de9 ebc9 17b6  a8e9 dce9 056a 74c..
| | 253 (Validity) size 38:
| | | 254 (NotBefore) size 15:  20230219T021746
| | | 255 (NotAfter) size 15:  20240219T021746
| 23 (SigValue) size 64:  c8b9 5883 4b9a 8aac  9ad0 e5e4 5eef 0a18  4b..
|                         1b3a 1574 58d4 0528  1740 883e d90c 836f  ed..

3.3.2. Leaf TLVs

Most of DeftT's leaf TLVs were described above but there are two important enumeration types, name components and signature types, with particular constraints and implications.¶

There are four types of components allowed in a Name (TLV 7):¶

Table 1: Name Component Types
Type	TLV	Description
`Generic`	8	Arbitrary blob of bytes
`csID`	35	32-bit murmurhash of cState name (number)
`Timestamp`	36	GMT time point in microseconds (number)
`SequenceNum`	37	unsigned 64-bit integer (number)

"Number" types are encoded in big-endian order (MSB first) with all leading zero bytes suppressed. Thus their length can be zero to eight bytes. For example, a SequenceNum of 0 would be [37, 0], 100 would be [37, 1, 100] and 1,000,000 would be [37, 3, 15, 66, 64].¶

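The "number" encoding rule above can be checked with a short sketch. The function names are hypothetical and the one-byte length assumes the value fits in at most eight bytes, as the text requires.¶

```python
def encode_number(tlv_type: int, value: int) -> bytes:
    """Encode a number-type name component: big-endian, no leading zero bytes."""
    if not 0 <= value < 2 ** 64:
        raise ValueError("value must fit in an unsigned 64-bit integer")
    body = value.to_bytes((value.bit_length() + 7) // 8, "big")  # zero -> empty body
    return bytes([tlv_type, len(body)]) + body


def decode_number(component: bytes) -> int:
    """Decode a number-type name component, rejecting non-minimal encodings."""
    if len(component) < 2 or component[1] != len(component) - 2:
        raise ValueError("length byte does not match the value size")
    body = component[2:]
    if len(body) > 8:
        raise ValueError("number longer than eight bytes")
    if body and body[0] == 0:
        raise ValueError("leading zero byte is not allowed")
    return int.from_bytes(body, "big")
```

With these definitions, encode_number(37, 1_000_000) yields the bytes [37, 3, 15, 66, 64] given in the text.¶
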
There are five types of signature allowed in a SigType (TLV 27) and each requires the SigValue (TLV 23) in a Data with that SigType have a particular size:¶

Table 2: Signature Types
Type	Value	SigValue length	Description
stSHA256	0	32	SHA256 data integrity
stAEAD	7	40	[RFC8103] content privacy plus full data integrity
stEdDSA	8	64	Ed25519 provenance and full data integrity
stRFC7693	9	64	[RFC7693] full data integrity
stAEADSGN	13	104	[RFC8103] content privacy with Ed25519 provenance and data integrity
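
A validator can enforce the SigValue sizes of Table 2 with a simple lookup. The sketch below only restates the table; the constant and function names are hypothetical.¶

```python
# SigType value -> required SigValue length in bytes, copied from Table 2.
SIG_VALUE_LENGTH = {
    0: 32,     # stSHA256
    7: 40,     # stAEAD
    8: 64,     # stEdDSA
    9: 64,     # stRFC7693
    13: 104,   # stAEADSGN
}


def sig_value_length_ok(sig_type: int, sig_value: bytes) -> bool:
    """True only for a listed SigType whose SigValue has the required size."""
    expected = SIG_VALUE_LENGTH.get(sig_type)
    return expected is not None and len(sig_value) == expected
```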

3.3.3. TLV header details

All TLV headers use the same format. They occupy either 2 or 4 bytes, depending on the value of L. L specifies the length in bytes of V. Lengths in the range 0 to 252 occupy one byte. A length of zero is allowed and indicates there are no V bytes. Lengths in the range 253 to 65535 occupy three bytes: a 'flag byte' of 253 followed by the two bytes of the 16 bit length in big endian order. Lengths greater than 65535 (deliberately) can not be represented so a DeftT object can be no larger than 65535+4 = 65539 bytes. (Objects of arbitrary size can be handled by a segmentation/reassembly layer above DeftT such as dct/shims/mbps.hpp in the example implementation.)¶

L must use the minimum description length coding. For example, a length of 0 must be encoded as the single byte [0], not as the 3 bytes [253, 0, 0], 252 is encoded as [252], 253 as [253, 0, 253], 256 as [253, 1, 0] and 65535 as [253, 255, 255].¶

T specifies the type of data in the container. It occupies one byte, must be an element of the valid types set defined below, and must conform to that element's rules.¶
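
The header rules of this section can be sketched as an encoder and decoder. The function names are hypothetical. Rejecting length flag bytes 254 and 255 is an inference from the statement that lengths above 65535 cannot be represented, and checking T against the set of valid types is left out.¶

```python
def encode_header(tlv_type: int, length: int) -> bytes:
    """Build a 2- or 4-byte TLV header using the minimum-length coding of L."""
    if not 0 <= tlv_type <= 255:
        raise ValueError("T occupies one byte")
    if 0 <= length <= 252:
        return bytes([tlv_type, length])
    if length <= 65535:
        return bytes([tlv_type, 253]) + length.to_bytes(2, "big")
    raise ValueError("length cannot be represented")


def decode_header(buf: bytes, offset: int = 0):
    """Return (type, length, header_size) for the header at buf[offset:]."""
    if len(buf) - offset < 2:
        raise ValueError("truncated header")
    tlv_type, first = buf[offset], buf[offset + 1]
    if first <= 252:
        return tlv_type, first, 2
    if first != 253:
        raise ValueError("length flag bytes 254 and 255 are not accepted")
    if len(buf) - offset < 4:
        raise ValueError("truncated header")
    length = int.from_bytes(buf[offset + 2:offset + 4], "big")
    if length < 253:
        raise ValueError("length is not minimally encoded")
    return tlv_type, length, 4
```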

3.3.4. Design rationale

DeftT's Publication, PDU and serialization formats were strongly influenced by the [LANGSEC] observation that most security issues are due to improper input handling. For example, [LangSecErr] (section II) found that this class of errors accounted for 75% of the 47 OpenSSL security vulnerabilities reported in the 18 months following 2015-1-1. Also, as of 2023-7-5, all 25 of the protobuf CVEs listed in the NIST National Vulnerability Database are of this class.¶

[LangSecErr] suggests these vulnerabilities could have been avoided by designing the protocol following three rules:¶

The acceptable input to a program should be:¶

well-defined (i.e., via a grammar)¶

as simple as possible (on the Chomsky scale of syntactic complexity)¶

fully validated before use (no "shotgun parsing")¶

3.3.4.1. Making DeftT 'well-defined'

A DeftT domain's "acceptable inputs" are specified using its communication rules declarative language (see Section 4.1) then compiled by an LALR parser into a compact binary "schema" that avoids any need for runtime parsing -- given the schema, the DeftT runtime can construct or validate any legal domain input in constant time. The compiler will fail to construct a schema if the domain communication rules are incomplete or inconsistent.¶

After successful compilation, the schema is authorized, authenticated and integrity protected by cryptographic signing using the domain's trust anchor. This signed schema is supplied to every member as part of their identity bundle (Section 5.2) and the SHA-256 thumbprint (see Section 3.3.1.4) of the schema is the first component of every PDU's topic name. This ensures not only that the rules are well defined but also that all publishers and subscribers are playing by the same rules.¶

3.3.4.2. Making DeftT 'as simple as possible'

All DeftT Information is represented using TLV (Type, Length, Value) tuples for the reasons noted by Dan Bernstein [netstrings][tnetstrings]:¶

Unlike delimiter-based approaches like XML or JSON, TLVs are resistant to buffer overflow and false pairing attacks.¶
TLVs are self-describing and trivial to parse or validate.¶
They can be used recursively -- containers can contain other containers.¶
TLVs are fast, cache friendly and not resource intensive.¶
TLVs make no assumptions about contents and can store binary data without escaping or encoding.¶
TLVs are transport agnostic.¶

Attackers regard the 'seams' between protocol layers as prime attack surface since a lower layer can pass up partial information that it later finds to be inconsistent or invalid (an anti-pattern known as shotgun parsing [LangSecErr]). DeftT deliberately reuses a small set of formatting conventions to construct its TLV containers in contrast to the Internet convention of constructing its PDUs in separate layers with rules chosen by different committees. For example, DeftT PDUs, Publications and Certs have essentially the same format so they can all be structurally validated (e.g., the contents of a container are the type expected in the order expected and exactly fill their container) by one simple, generic, recursive descent validation pass over each arriving PDU performed at the point where it arrives.¶
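
A single generic, recursive descent structural pass of the kind described above might look like the following sketch. It is not the example implementation's validator: the grammar tables are read off the Publication dumps in Section 3.3.1.3, all names are hypothetical, and only structure is checked (no schema or signature validation).¶

```python
NAME_COMPONENTS = {8, 35, 36, 37}        # component types allowed inside a Name (7)
SEQUENCE = {                             # container type -> exact ordered child types
    6: [7, 20, 21, 22, 23],              # Data: Name, MetaInfo, Content, SigInfo, SigValue
    20: [24],                            # MetaInfo: ContentType
    22: [27, 28],                        # SigInfo: SigType, KeyLocator
    28: [29],                            # KeyLocator: KeyDigest
}
LEAVES = {8, 21, 23, 24, 27, 29, 35, 36, 37}


def read_header(buf, off, end):
    """Return (type, value_start, value_end) for the TLV starting at off."""
    if end - off < 2:
        raise ValueError("truncated header")
    tlv_type, length, header = buf[off], buf[off + 1], 2
    if length == 253:
        if end - off < 4:
            raise ValueError("truncated header")
        length, header = int.from_bytes(buf[off + 2:off + 4], "big"), 4
        if length < 253:
            raise ValueError("length is not minimally encoded")
    elif length > 253:
        raise ValueError("unsupported length flag")
    if off + header + length > end:
        raise ValueError("TLV overruns its container")
    return tlv_type, off + header, off + header + length


def validate(buf, off, end, expected):
    """Validate one TLV of the expected type; return the offset just past it."""
    tlv_type, start, stop = read_header(buf, off, end)
    if tlv_type != expected:
        raise ValueError("unexpected type or order")
    if tlv_type in LEAVES:
        return stop
    pos = start
    if tlv_type == 7:                    # Name: any number of allowed components
        while pos < stop:
            if buf[pos] not in NAME_COMPONENTS:
                raise ValueError("invalid name component type")
            pos = validate(buf, pos, stop, buf[pos])
        return stop
    for child in SEQUENCE[tlv_type]:     # fixed children in a fixed order
        pos = validate(buf, pos, stop, child)
    if pos != stop:
        raise ValueError("children do not exactly fill the container")
    return stop


def structurally_valid_publication(buf: bytes) -> bool:
    try:
        return validate(buf, 0, len(buf), 6) == len(buf)
    except (ValueError, KeyError):
        return False
```

Because each child is bounded by its parent's end offset, nothing is accepted or acted on until the whole object has been checked, which is the opposite of shotgun parsing.¶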

As described in Section 1.3, DeftT validates every Publication and PDU both cryptographically and syntactically using the domain's communications rules to enforce who-can-say-what-to-which-where-when. DeftT does both serialization and validation using rules bound at runtime (Figure 6) not compile time. It can do this at rates competitive with protobufs by taking advantage of the "definiteness" of local-domain communication:¶

Since the same rules are used both to produce and validate Publications/PDUs, encoding order is fixed and known in advance. Thus every top-level object can be validated by a single sequential pass through it.¶
Every party to the communication is guaranteed to be using the same rules so there are no options and no negotiation thus no combinatorial explosion of variants to check.¶
Communication rules can be extended and amended at any time and the resultant binary schema published to members with no changes to their code. Thus the current ruleset should always be the minimum necessary to support existing applications and policies, not the open-ended monster needed to support any possible future.¶

3.4. Application and network interface

Figure 9 and Figure 10 show the blocks and modules application information passes through in DeftT and may be useful references for this subsection. In a trust domain, applications pass information to be communicated to their DeftT, which packages it into Publication(s) that are added to the local collection copy. These Publications are also sent in a PDU via the system network interface to be received by other members of the domain which add the Publications to their local collections. If a received Publication matches a subscription, the information it contains is passed to the application. (For more detail, see the library at [DCT].) DeftT is organized into modules that perform its tasks. A DeftT shim exchanges information with applications. The example implementation [DCT] provides a message-based publish/subscribe (mbps) API that exchanges messages with the application and Publications with the sync protocol. (Other APIs are possible.) DeftT startup begins when a shim object is instantiated by the application and given its identity bundle. Startup includes creating a certificate distributor and, optionally, group key distributors, depending on schema-specified signing. After startup, the msgs syncps of each member will maintain a cState containing the IBLT of its view of the collection. (In the stable, synchronized state, all members of a collection will have the same IBLT.)¶

Applications subscribe to all messages, or to a subset by topic, by passing a callback function to the mbps subscribe method. Application subscriptions are turned into syncps subscriptions via mbps. An application with new information to communicate passes the formatted topic items as parameters and the other content in a message via a publish call. Only the topic components and the message, if any, are passed between the application and mbps. mbps adds mbps-specific components to the parameter list and invokes a schemaLib method that builds a valid (according to the schema) Publication that can be passed to syncps to publish. Messages that exceed the content size limits of a single Publication are segmented by mbps and carried in multiple Publications. If a member's identity chain lacks the attributes required for a specific Publication, no Publication is built. Otherwise, the Publication is signed using the sign method of the appropriate sigmgr and passed to syncps.¶

syncps adds this Publication to its collection and updates its IBLT to contain the new Publication. Since its application just created it, the Publication is a new addition and thus is always a response to the current cState. The Publication is packaged into a cAdd and signed using the sign method of the designated sigmgr and passed to the face. The updated IBLT is packaged into a new cState that is handed to the face.¶

Trust domain members only process cAdds that share their trust domain identifier (Section 3.3.1.1 and Section 3.3.1.2). When a new cAdd is received at a member, the face ensures it matches an outstanding cState and, if so, passes it on to its matching syncps(es). Syncps validates the cAdd (both structurally and cryptographically) using the appropriate sigmgr's validate method and, if the cAdd is valid, continues by extracting the Publications it carries. Each Publication is structurally validated via a sigmgr and valid Publications are added to the local collection and IBLT. syncps passes the resulting updated cState to the local face. If a Publication matches a subscription it is passed to mbps, invoking the sigmgr's decrypt if the Publication is encrypted. (Publication decryption is not available at relays.) mbps receives the Publication and passes any topic components of interest, along with the content (if any), to the application via the callback registered when it subscribed. (If the original content was spread across Publications, mbps will wait until all of the content is received. The sCnt component of a mbps Publication Name is used for this.)¶

3.5. Synchronizing a collection

DeftT works on unicast (as a special case of multicast) links, but is designed to take full advantage of a multicast subnet (e.g., link-level IPv6 multicast on broadcast media) with syncps orchestrating collection-based communications. Members' syncps modules interact on a multicast subnet to keep their collections synchronized. The following example illustrates member actions and communications to synchronize a collection (see also the sequence diagram in Figure 13).¶

Starting with all members connected to the collection (having confirmed publication of their identity credentials) and with an empty msgs collection (i.e., no applications have active Publications), member2's application passes content to its DeftT via an mbps.publish(). The content is packaged into a Publication (p1) and passed to syncps which creates and sends a cAdd PDU. The cAdd uses a hash of the shared (empty) cState as its cState identifier (the third component of the Name, see Section 3.3.1.2 item 1) to indicate the Publication(s) it carries are additions to the collection in that state. Member2's new local cState (containing p1) is scheduled to be sent at a delay of the subnet's dispersion time (d) plus a small random value (r). Dispersion time is an estimate of the expected time for a cAdd to reach every member's collection. It may be a fixed or adaptive estimate and syncps is robust to inaccuracies: an overestimate may lead to longer delays and an underestimate may lead to more cState traffic on the channel.¶

Members receive and validate the cAdd, then extract and validate p1, passing it to subscriptions. Each member schedules a sendCState after a small random delay r. (Scheduling a new sendCState cancels any pending sendCState.) When the sendCState timer expires, a new local cState is created with the IBLT of the collection (which will contain p1). This cState's expiration time is scheduled (value significantly longer than d) and the member sends the cState unless it is suppressed. DeftT suppresses cStates that are identical to one that has already been heard twice. If member2 is waiting to confirm p1, it can do so with the first of these cStates it receives.¶

In Figure 13, member6 did not receive the cAdd but reception of one of the new cStates shows the presence of p1 so member6 immediately sends its own local cState (which has an empty collection, lacking member2's Publication). In this example, all members receive member6's cState, but member2, as p1's originator, responds preferentially and sends p1 in a new cAdd immediately. All other members set a timer (to d+r) to send p1. That timer is cancelled if the member receives a cAdd responding to member6's cState that contains p1. Meanwhile, member6 receives the new cAdd, adds p1 to its collection and schedules a new cState for delay r. That cState will be suppressed as it matches those already sent by the other members.¶

Now the distributed collection is synchronized with a state of one Publication (p1). If no other application content is created, cStates will be sent at ~cStateLifetime. On the channel, we will see one cState per ~cStateLifetime since each overlaps enough to suppress others. When p1 expires, it will be removed at each local collection and the subsequent cState will show an empty collection.¶

Figure 13: Seven members using DeftT on a multicast subnet

Although Figure 13 shows one Publication at a time for clarity, the logic applies if multiple members are publishing simultaneously or at close intervals (less than d or the cStateLifetime). Distributed collections are always moving toward synchronization but during periods of intense interaction, times when all members are synchronized may be infrequent; this is not problematic.¶
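
The cState suppression rule used in the walkthrough (a member does not send a cState identical to one it has already heard twice) can be sketched as a small counter. This is a deliberately simplified illustration: the class and method names are hypothetical, cStates are compared by an opaque digest of their IBLT, and the timers, expiration and the point at which the counts are reset are assumptions not specified here.¶

```python
from collections import Counter


class CStateSuppressor:
    """Hypothetical helper that counts identical cStates heard on the subnet."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold       # "heard twice" in the text
        self.heard = Counter()           # IBLT digest -> times heard

    def on_heard(self, iblt_digest: bytes) -> None:
        """Record a cState received from another member."""
        self.heard[iblt_digest] += 1

    def should_send(self, local_iblt_digest: bytes) -> bool:
        """Send the local cState unless an identical one was heard often enough."""
        return self.heard[local_iblt_digest] < self.threshold

    def on_collection_change(self) -> None:
        """Assumed reset point: earlier counts refer to an older collection state."""
        self.heard.clear()
```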

3.6. Distributors

Distributors implement services a DeftT requires for its operation. Distributors optional to general operation are specified in the communications schema.¶

3.6.1. Certificate distributor

DeftT's certificate distributor is a required module. It implements a collection of all the signing chain certificates in the Domain. When a new DeftT is instantiated, it must publish all the certificates from its identity bundle as well as its locally created signing certificate. This joining process was shown in Figure 2. Since many certificates in a member's chain are shared, that will be reflected in each cState and those certs will not be sent on the subnet. A member DeftT must receive a cState showing its signing chain in another member's local collection before a DeftT can be considered "connected" to the trust domain. This ensures there is at least one other member that can receive the PDUs it sends.¶

3.6.2. Group key distributors

Group key distributors are optional in DeftT but required, and automatically supplied, if encryption is specified in the schema. When present, they are instantiated after their local certificate distributor has "connected." The example implementation contains two types of group key distributors. A group key distributor handles creation and distribution of a single symmetric key to all members of the Domain to use to encrypt either Publications or PDUs (if both are encrypted, there is a group key distributor for each). A subscriber group key distributor distinguishes subscribers that can decrypt PDUs and/or Publications from publishers that encrypt PDUs and/or Publications (a member can be both subscriber and publisher). Only the group key distributor is briefly described here.¶

A trust domain using group key encryption must have at least one member with the attribute or capability of "keymaker" in its identity chain. Keymaker-capable members of a Domain elect a keymaker that makes a new symmetric encryption key upon winning the election. The non-keymakers publish key requests that the keymaker uses to create a list of current members. Requests and the symmetric key both have limited lifetimes. The keymaker uses each member's signing cert to encrypt a copy of the current key and creates and publishes as many Publications as needed to carry all the encrypted keys. In these Publications, entries are indexed by the thumbprint of the associated signing cert and the range of thumbprints is used in the Publication name. Members only accept such Publications from keymaker-capable signers and, in case of conflict, use the key sent by a member whose signing cert thumbprint is the smallest.¶
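
The acceptance and conflict rule in the last sentence can be sketched as follows. The function name and data shapes are hypothetical, and comparing thumbprints as equal-length byte strings in lexicographic order is an assumption about what "smallest" means.¶

```python
def select_group_key(offers, keymaker_capable):
    """Pick the group key to use from the key Publications received.

    offers: iterable of (signer_thumbprint, key) pairs, both bytes.
    keymaker_capable: set of thumbprints of signing certs whose chain
    carries the keymaker capability.
    Returns the chosen key, or None if no acceptable offer exists.
    """
    # Only accept keys published by keymaker-capable signers.
    accepted = [(thumbprint, key) for thumbprint, key in offers
                if thumbprint in keymaker_capable]
    if not accepted:
        return None
    # In case of conflict, use the key from the smallest signer thumbprint.
    return min(accepted, key=lambda offer: offer[0])[1]
```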

If the keymaker receives a new key request in between making new keys, a copy of the key will be encrypted for it and published. There is no explicit revocation but a blacklist can be implemented and either published or passed from an application and a new group key can be made and distributed to non-blacklisted members ahead of the normal schedule.¶

3.6.3. Other distributors

Distributors may be used for other types of key distribution and for distributing other types of information, e.g. blacklisted members or domain statistics.¶

3.7. Schema-based information movement

Although the Internet's transport and routing protocols emphasize universal reachability with packet forwarding based on destination, a significant number of applications neither need nor desire to transit the Internet (e.g., see [RFC8799]). This is true for a wide class of OT applications. Further, liberal acceptance of packets while depending on the good sending practices of others leaves critical applications open to misconfiguration and attacks. Internet protocols use header information to tell them how to forward packets; a DeftT PDU's header only contains a sync zone id and a collection name. Each DeftT has a trust management engine with a copy of rules (a schema or subschema). DeftT only creates and moves its Publications in accordance with the fully specified communications schemas and never moves a PDU between sync zones. This approach differs in both intent and execution from Internet forwarding. It may not be appropriate for all use cases but offers new opportunities to address the specific security requirements of many Limited Domain use cases.¶

DeftT PDUs on the same subnet may be in different sync zones or trust domains and DeftT sync zones in the same trust domain may be on different subnets. In some cases, it is useful to define sync zones whose DeftTs have a compatible, but more limited, version of the trust domain's communications schema which is itself complete as a communications schema. This subschema concept was introduced in Section 1.3 and further discussed in Section 4. "Compatible" means there is at least one Publication type and associated signer specification in common or one schema may be a subset of the other. Different subschemas may be deployed for sync zones on the same subnet or on different subnets. A subschema cert will have a different thumbprint from that of the full trust domain and different sync zones can be identified by the thumbprint of the (sub)schema in use.¶

For example, a unicast link may be used to connect two remotely located subnets of the same trust domain and only certain types of Publications should pass through the unicast link. Relays can be used where the DeftTs on the unicast link have a restricted subschema (e.g. Figure 14-right). Further, different sync zones on the same subnet might be used where certain members have more limited access, either due to the technology of their devices or to restrict their access (e.g., guests of a network). Relays could limit Publications simply by filtering Publications or subscribing to subsets of Publications but use of a subschema in different sync zones provides enforcement of Publication movement.¶

Both cStates and cAdds contain their sync zone id and are not moved between subnets while Publications are defined in the trust domain's communications schema and can move to any DeftT that can validate them. In the case of DeftTs on the same subnet but with different (sub)schema certs, the cState and cAdd PDUs are differentiated by the sync zone id (thumbprint of the (sub)schema certificate as in Section 3.3.1.1 item C1). The sync zone id is used at the face module to determine whether or not to process a PDU. A DeftT's syncps manages a particular collection on a single subnet. Relays move Publications between separate sync zones of the same trust domain by moving Publications between the relay's multiple DeftT instances.¶

A relay is implemented [DCT] as an application running on a device with a DeftT interface in each sync zone (two or more), as shown in Figure 14. Each DeftT participates using a communication identity valid for the schema used by the DeftT. Only Publications (including certs) are relayed between DeftTs and the Publication must validate (to the extent possible) against the schema of each DeftT. Consequently cAdd encryption is unique per sync zone while Publication encryption holds across the domain.¶

Since relay applications merely pass Publications in the msgs collection, their DeftT API module (a "shim", see Section 3.1) performs pass-through of valid Publications. As a consequence and a further security measure for boundary devices, relays have no need for Publication encryption keys; this is enforced by use of a capability cert in relay identity chains. (The group symmetric key is never given to an identity with the relay capability in its chain.) For example, if we added a relay definition to the example of Figure 5:¶

rlyCert: _domain/"relay"/_myid/_certFormat <= rlyCap
capCert: _domain/"CAP"/capId/capArg/_certFormat
rlyCap: capCert & {capId: "RLY", capArg: _} <= domainCert

The relay of Figure 14-left is on three separate wireless subnets. If all three DeftTs are using an identical schema, a new validated cert added to the cert store of an incoming DeftT is then passed to the other two, which each validate the cert before adding to their own cert stores (superfluous in this case, but not a lot of overhead for additional security). When a valid Publication is received at one DeftT, it is passed to the other two DeftTs to validate against their schemas and published if it passes.¶

Figure 14: Relays connect subnets

A relay may have different identities and schemas for each DeftT, but its DeftTs must have the same trust anchor and schemas that are identical copies, proper subsets or overlapping subsets of the domain schema. Publications that are undefined for a particular DeftT are silently discarded if they do not validate upon relay, just as they are when received from a face. This means the relay application of Figure 14-left can remain the same but Publications will only be published to a different subnet if its DeftT has that specification in its schema. In addition, relays may filter Publications at the application level or restrict subscriptions on some of their DeftT interfaces. Figure 14-right shows extending a trust domain geographically by using a unicast connection (e.g., over a cell line or tunnel over the Internet) between two relays which also interface to local broadcast subnets. Everything on each local subnet shows up on the other. A communications schema subset could be used here to limit the types of Publications sent on the remote link, e.g., logs or alerts. Using this approach in Figure 14-right, local communications for subnet 1 can be kept local while subnet 2 might send commands and/or collect log files from subnet 1.¶

More generally, relays can form a mesh of broadcast subnets with no additional configuration (i.e., relays on a broadcast network do not need to be configured with others' identities and can join at any time). The mesh is efficient: a Publication is only added to an individual DeftT's collection once regardless of how it is received. A relay with overlapping broadcast physical media will only add a Publication to any of its DeftTs once; syncps ensures there are no duplicates. More on the applicability of DeftT meshes is in Section 6.¶

3.8. Performance

Measurements and profiling have been performed using examples from the open source [DCT] proof-of-concept codebase (v11.2) running on an Apple M1-max and an x86 Linux-based machine. On both machines the code was compiled with clang-16 at optimization level -O3 but no additional or platform-dependent flags. An instance of the application examples/hmIoT/app2.cpp [DCT] with an operator role issues commands to app2.cpp instances with device roles, which are expected to perform the operation (e.g., lock the door) and report their subsequent status (e.g., locked). The schema used calls for EdDSA signing of Publications and AEAD encryption of cAdds. Measuring the time from the appearance of the cAdd carrying the command Publication until the appearance of the cAdd carrying the status Publication captures the time for a device to receive a cAdd, process its Publication, convert the resulting application message into a Publication, add that Publication to its local collection, and package and send it in a cAdd. On Ethernet with negligible transmission delay, the Apple platform takes ~200us and the Linux x86 platform ~300us.¶

Memory use is dominated by the certificate store (the cert collection) and the msgs collection of Publications carrying application content. In the former case, each certificate is ~300 bytes and a member only stores its own signing chain as well as those of members that create Publications the communications schema says they can accept. Measurement has shown memory use to be consistently less than 2 Mbytes.¶

The home IoT example was profiled on an M1-max using Apple's Instruments tool. The time spent was classified both by the DeftT action it was invoked in and the codebase it was part of (DeftT, the libsodium cryptographic library, or the Boost async I/O library interface to the system's UDP/IP networking and scheduling). For each action, times were normalized by the total time spent on the action. The numbers in the table below are percentages of the total time taken for the action in the first column. For each codebase, the number given is the percentage time taken by its busiest function. The name of the action's largest item, irrespective of codebase, is shown in the final column.¶

Table 3: Profiling Results from an Apple M1-max
action	DeftT%	crypto%	system%	largest item
make msg-to-Pub	32	68	0	EdDSA signing
send cAdd	3	4	84	sendTo syscall
send cState	10	0	80	sendTo syscall
handle cAdd	1	6	81	recvFrom syscall
handle Pub-to-msg	4	94	0	EdDSA validation
handle cState	3	0	73	recvFrom syscall

Note that none of the absolute times are large. For example, the EdDSA validation that accounts for 94% of incoming Pub-to-msg handling takes 22 microseconds and the corresponding signing that's 68% of make msg-to-Pub takes 16 microseconds. In general, send and receive syscall handling dominates and these delays are experienced by any transport protocol. Roughly half of this appears to be due to Boost's lock-heavy "executor" model, which might be fixed by switching to C++20 coroutines. A real-time OS like Zephyr would be expected to remove much of the remaining system cost.¶

Thanks to the availability of hardware AES acceleration on both Arm and Intel platforms, the time to perform the cAdd AEGIS crypto is quite small: 300ns to simultaneously validate and decrypt on either platform. These measurements convince us that users and industries promoting the use of signing in networking would do well to focus on accelerating asymmetric key functions.¶

Table 4 shows the fixed and per-byte signing and validation costs for the current reference implementation [DCT] as measured by its time_signing.cpp tool. Hardware-accelerated AEGIS-128L and AEGIS-256 were added in the Sep 13, 2023 libsodium 1.0.19 release. AEAD-AEGIS is 4.5x faster than AEAD-IETF on the Intel platform and 7.8x faster on the Arm platform. Given the rapid evolution of high quality crypto algorithms and implementations, it's important that infrastructure and application code that relies on crypto be able to keep up. It took less than a day to add an AEGIS sigmgr to [DCT] and a one line change to the home IoT schema to switch the home IoT apps to use it. Since the sigmgr API is generic, no app or DeftT code changed.¶

Table 4: Measured Sigmgr per-operation times
operation	Intel	Arm M1-max
EdDSA signing	21us + 5.3us/KB	15us + 3.8us/KB
EdDSA validation	58us + 2.6us/KB	41us + 1.9us/KB
AEAD encryption	735ns + 885ns/KB	437ns + 2430ns/KB
AEAD decryption	719ns + 853ns/KB	433ns + 2420ns/KB
AEGIS encryption	209ns + 211ns/KB	176ns + 147ns/KB
AEGIS decryption	173ns + 192ns/KB	205ns + 141ns/KB
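
Each entry in Table 4 has the form "fixed cost + per-KB cost", so the expected time for one operation on a payload of a given size follows from a linear model. The sketch below transcribes the table and evaluates that model. The names TABLE4_NS and op_time_ns are illustrative, not part of [DCT], and treating "KB" as 1024 bytes is an assumption since the table does not say.

```python
# Fixed cost (ns) and per-KB cost (ns) per operation, transcribed from Table 4.
TABLE4_NS = {
    "EdDSA signing": {"Intel": (21_000, 5_300), "Arm M1-max": (15_000, 3_800)},
    "EdDSA validation": {"Intel": (58_000, 2_600), "Arm M1-max": (41_000, 1_900)},
    "AEAD encryption": {"Intel": (735, 885), "Arm M1-max": (437, 2_430)},
    "AEAD decryption": {"Intel": (719, 853), "Arm M1-max": (433, 2_420)},
    "AEGIS encryption": {"Intel": (209, 211), "Arm M1-max": (176, 147)},
    "AEGIS decryption": {"Intel": (173, 192), "Arm M1-max": (205, 141)},
}

BYTES_PER_KB = 1024  # assumption: the table does not say whether KB is 1000 or 1024 bytes


def op_time_ns(operation: str, platform: str, size_bytes: int) -> float:
    """Estimated time in ns: fixed cost plus per-KB cost scaled by payload size."""
    fixed_ns, per_kb_ns = TABLE4_NS[operation][platform]
    return fixed_ns + per_kb_ns * size_bytes / BYTES_PER_KB


if __name__ == "__main__":
    # Compare signing and AEGIS encryption of a 1 KB payload on each platform.
    for platform in ("Intel", "Arm M1-max"):
        sign = op_time_ns("EdDSA signing", platform, 1024)
        seal = op_time_ns("AEGIS encryption", platform, 1024)
        print(f"{platform}: sign {sign:.0f} ns, AEGIS encrypt {seal:.0f} ns")
```

Evaluating the model for small payloads makes the asymmetry visible: the fixed cost of the EdDSA operations is tens of microseconds, while the symmetric operations stay in the sub-microsecond to low-microsecond range.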

3.9. Congestion control

Each DeftT manages its collection on a single broadcast subnet (since unicast is a proper subset of multicast, a point-to-point connection is viewed as a trivial broadcast subnet), thus it only has to deal with that subnet's congestion. As described in the previous section, a device connected to two or more subnets may create DeftTs having the same collection name on each subnet with a Publication Relay between them but DeftT never forwards PDUs between subnets. It is, of course, possible to run DeftT over an extended broadcast network like a PIM multicast group but the result will generally require more configuration and be less reliable, efficient and secure than DeftT's self-configuring peer-to-peer Relay mesh.¶

DeftT sends at most one copy of any Publication over any fully connected subnet, independent of the number of publishers and subscribers on the subnet. Thus the total DeftT traffic on a subnet is strictly upper bounded by the application-level publication rate. As described in Section 3.2, DeftTs publish a cState specifying the set elements they currently hold. If a DeftT receives a cState specifying the same elements (Publications) it holds, it doesn't send its cState. Thus the upper bound on cState publication rate is the number of members on the subnet divided by the cState lifetime (typically seconds to minutes) but is typically one per cState lifetime due to the duplicate suppression. Each member can send at most one cAdd in response to a cState. This creates a strict request/response flow balance which upper bounds the cAdd traffic rate to (number of members - 1) times the cState publication rate. The flow balance ensures an instance can't send a new cState until its previous one is either obsoleted by a cAdd or times out. Similarly a cAdd can only be sent in response to the cState which it obsoletes. Thus the number of outstanding PDUs per instance is at most one and DeftT cannot cause subnet congestion collapse.¶
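
The rate bounds in this paragraph reduce to a few lines of arithmetic. The following sketch only restates those bounds. The function names and the example subnet (10 members, 2 second cState lifetime) are hypothetical and chosen for illustration.

```python
def cstate_rate_upper_bound(members: int, cstate_lifetime_s: float) -> float:
    """Worst case: every member sends one cState per cState lifetime."""
    return members / cstate_lifetime_s


def cstate_rate_typical(cstate_lifetime_s: float) -> float:
    """With duplicate suppression, about one cState per lifetime on the subnet."""
    return 1.0 / cstate_lifetime_s


def cadd_rate_upper_bound(members: int, cstate_rate: float) -> float:
    """Each cState can draw at most one cAdd from each of the other members."""
    return (members - 1) * cstate_rate


if __name__ == "__main__":
    members, lifetime_s = 10, 2.0  # hypothetical subnet
    worst = cstate_rate_upper_bound(members, lifetime_s)
    typical = cstate_rate_typical(lifetime_s)
    print("cState/s worst case:", worst, "typical:", typical)
    print("cAdd/s bound (worst-case cState rate):", cadd_rate_upper_bound(members, worst))
    print("cAdd/s bound (typical cState rate):", cadd_rate_upper_bound(members, typical))
```

These are PDU-rate ceilings implied by the flow balance, not predictions of actual load. The first sentence of the paragraph still applies: each Publication crosses a fully connected subnet at most once.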

If a Relay is used to extend a trust domain over a path whose bandwidth delay product is many times larger than typical subnet MTUs (1.5-9KB), the one-outstanding-PDU per member constraint can result in poor performance (1500 bytes per 100ms transcontinental RTT is only 120Kbps). DeftT can run over any lower layer transport and stream-oriented transports like TCP or QUIC allow for a 'virtual MTU' that can be set large enough for DeftT to relay at or above the average publication rate (the default is 64KB which can relay up to 5Mbps of Publications into a 100ms RTT). In this case there can be many lower layer packets in flight for each DeftT cAdd PDU but their congestion control is handled by TCP or QUIC.¶
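
The throughput figures quoted above follow from allowing one PDU per round trip. The sketch below reproduces that arithmetic and its inverse, the virtual MTU needed for a target rate. The function names are illustrative, and 64KB is taken as 64 x 1024 bytes, which is an assumption.

```python
def max_throughput_bps(pdu_bytes: int, rtt_ms: float) -> float:
    """Throughput ceiling when at most one PDU is outstanding per RTT."""
    return pdu_bytes * 8 * 1000 / rtt_ms


def virtual_mtu_bytes(target_bps: float, rtt_ms: float) -> float:
    """PDU size needed to sustain target_bps with one PDU per RTT."""
    return target_bps * rtt_ms / 8000


if __name__ == "__main__":
    # 1500 byte MTU over a 100 ms RTT: the 120 Kbps case from the text.
    print(max_throughput_bps(1500, 100))
    # 64 KB virtual MTU over the same path: roughly 5 Mbps.
    print(max_throughput_bps(64 * 1024, 100))
    # Virtual MTU needed to relay exactly 5 Mbps at 100 ms RTT.
    print(virtual_mtu_bytes(5_000_000, 100))
```

The second function shows why the virtual MTU has to scale with the path's bandwidth-delay product: doubling either the RTT or the target publication rate doubles the PDU size required.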

10. References

10.1. Normative References

[RFC1422]: Kent, S., "Privacy Enhancement for Internet Electronic Mail: Part II: Certificate-Based Key Management", RFC 1422, DOI 10.17487/RFC1422, February 1993, <https://www.rfc-editor.org/info/rfc1422>.
[RFC8366]: Watsen, K., Richardson, M., Pritikin, M., and T. Eckert, "A Voucher Artifact for Bootstrapping Protocols", RFC 8366, DOI 10.17487/RFC8366, May 2018, <https://www.rfc-editor.org/info/rfc8366>.
[RFC8613]: Selander, G., Mattsson, J., Palombini, F., and L. Seitz, "Object Security for Constrained RESTful Environments (OSCORE)", RFC 8613, DOI 10.17487/RFC8613, July 2019, <https://www.rfc-editor.org/info/rfc8613>.
[RFC8799]: Carpenter, B. and B. Liu, "Limited Domains and Internet Protocols", RFC 8799, DOI 10.17487/RFC8799, July 2020, <https://www.rfc-editor.org/info/rfc8799>.
[RFC9119]: Perkins, C., McBride, M., Stanley, D., Kumari, W., and JC. Zúñiga, "Multicast Considerations over IEEE 802 Wireless Media", RFC 9119, DOI 10.17487/RFC9119, October 2021, <https://www.rfc-editor.org/info/rfc9119>.
[RFC9200]: Seitz, L., Selander, G., Wahlstroem, E., Erdtman, S., and H. Tschofenig, "Authentication and Authorization for Constrained Environments Using the OAuth 2.0 Framework (ACE-OAuth)", RFC 9200, DOI 10.17487/RFC9200, August 2022, <https://www.rfc-editor.org/info/rfc9200>.

10.2. Informative References

[ATZ]: Ngabonziza, B., Martin, D., Bailey, A., Cho, H., and S. Martin, "TrustZone Explained: Architectural Features and Use Cases", 2016, <https://doi.org/10.1109/CIC.2016.065>.
[CAvuln]: Marlinspike, M., "More Tricks for Defeating SSL in Practice", 2009, <http://2015.hack.lu/archive/2009/moxie-marlinspike-some_tricks_for_defeating_ssl_in_practice.pdf>.
[CHPT]: CheckPoint, "The Dark Side of Smart Lighting: Check Point Research Shows How Business and Home Networks Can Be Hacked from a Lightbulb", February 2020, <https://www.globenewswire.com/news-release/2020/02/05/1980090/0/en/The-Dark-Side-of-Smart-Lighting-Check-Point-Research-Shows-How-Business-and-Home-Networks-Can-Be-Hacked-from-a-Lightbulb.html>.
[CIDS]: OperantNetworks, "Cybersecurity Intrusion Detection System for Large-Scale Solar Field Networks", 2021, <https://www.sbir.gov/sbirsearch/detail/2104327>.
[COMIS]: Lydersen, L., "Commissioning Methods for IoT", February 2019, <https://www.silabs.com/documents/public/presentations/ew-2019-iot-security-commissioning-methods-for-iot.pdf>.
[COST]: Guy, W., "Wireless Industrial Networking Alliance, Wired vs. Wireless: Cost and Reliability", October 2005, <https://www.fierceelectronics.com/embedded/wired-vs-wireless-cost-and-reliability>.
[ConfusedDep]: Support, G. C., "Additional authenticated data guide", July 2021, <https://cloud.google.com/kms/docs/additional-authenticated-data#confused_deputy_attack_example>.
[DCT]: Pollere, "Defined-trust Communications Toolkit", 2022, <https://github.com/pollere/DCT>.
[DER]: NERC, "North American Electric Reliability Corporation: Distributed Energy Resources: Connection, Modeling, and Reliability Considerations", February 2017, <https://www.nerc.com/pa/RAPA/ra/Reliability%20Assessments%20DL/Distributed_Energy_Resources_Report.pdf>.
[DIFF]: Eppstein, D., Goodrich, M. T., Uyeda, F., and G. Varghese, "What's the difference?: efficient set reconciliation without prior context", 2011.
[DIGN]: Bandyk, M., "As Dominion, others target 80-year nuclear plants, cybersecurity concerns complicate digital upgrades", November 2019, <https://www.utilitydive.com/news/as-nuclear-plants-look-to-digitize-controls-and-enhance-performance-cyber/566478/>.
[DLOG]: Li, N., Grosof, B., and J. Feigenbaum, "Delegation logic", February 2003, <https://doi.org/10.1145/605434.605438>.
[DMR]: al., M. C. E., "Device Management Requirements to Secure Enterprise IoT Edge Infrastructure", April 2021, <https://www.wwt.com/white-paper/device-management-requirements-to-secure-enterprise-iot-edge-infrastructure/>.
[DNMP]: Nichols, K., "Lessons Learned Building a Secure Network Measurement Framework Using Basic NDN", September 2019.
[DTM]: Blaze, M., Feigenbaum, J., and J. Lacy, "Decentralized Trust Management", June 1996, <https://doi.org/10.1109/SECPRI.1996.502679>.
[Demers87]: Demers, A. J., Greene, D. H., Hauser, C., Irish, W., Larson, J., Shenker, S., Sturgis, H. E., Swinehart, D. C., and D. B. Terry, "Epidemic Algorithms for Replicated Database Maintenance", 1987, <https://doi.org/10.1145/41840.41841>.
[Graphene]: Ozisik, A. P., Andresen, G., Bissias, G., Houmansadr, A., and B. N. Levine, "Graphene: A New Protocol for Block Propagation Using Set Reconciliation", 2017, <https://doi.org/10.1007/978-3-319-67816-0_24>.
[Graphene19]: Ozisik, A. P., Andresen, G., Levine, B. N., Tapp, D., Bissias, G., and S. Katkuri, "Graphene: efficient interactive set reconciliation applied to blockchain propagation", 2019, <https://doi.org/10.1145/3341302.3342082>.
[HSE]: Kaspersky, "Secure Element", 2022, <https://encyclopedia.kaspersky.com/glossary/secure-element/>.
[IAWS]: Ganapathy, K., "Using a Trusted Platform Module for endpoint device security in AWS IoT Greengrass", November 2019, <Using a Trusted Platform Module for endpoint device security in AWS IoT Greengrass>.
[IBLT]: Goodrich, M. T. and M. Mitzenmacher, "Invertible bloom lookup tables", 2011, <https://doi.org/10.1109/Allerton.2011.6120248>.
[IEC]: IEC, "Power systems management and associated information exchange - Data and communications security - Part 8: Role-based access control for power system management", 2022, <https://webstore.iec.ch/publication/61822>.
[IEC61850]: Wikipedia, "IEC 61850", 2021, <https://en.wikipedia.org/wiki/IEC_61850>.
[IIOT]: Rajiv, "Applications of Industrial Internet of Things (IIoT)", June 2018, <https://www.rfpage.com/applications-of-industrial-internet-of-things/>.
[IOTK]: Nichols, K., "Trust schemas and ICN: key to secure home IoT", 2021, <https://doi.org/10.1145/3460417.3482972>.
[ISO9506MMS]: ISO, "Industrial automation systems - Manufacturing Message Specification - Part 1: Service definition", 2003, <https://www.iso.org/obp/ui/#iso:std:iso:9506:-1:ed-2:v1:en>.
[LANGSEC]: LANGSEC, "LANGSEC: Language-theoretic Security "The View from the Tower of Babel"", 2021, <http://langsec.org>.
[LangSecErr]: Momot, F., Bratus, S., Hallberg, S. M., and M. L. Patterson, "The Seven Turrets of Babel: A Taxonomy of LangSec Errors and How to Expunge Them", 2016, <https://langsec.org/papers/langsec-cwes-secdev2016.pdf>.
[MATR]: Alliance, C. S., "Matter is the foundation for connected things", 2021, <https://buildwithmatter.com/>.
[MHST]: Wikipedia, "MQTT", 2022, <https://en.wikipedia.org/wiki/MQTT>.
[MINSKY03]: Minsky, Y., Trachtenberg, A., and R. Zippel, "Set reconciliation with nearly optimal communication complexity", 2003, <https://doi.org/10.1109/TIT.2003.815784>.
[MODOT]: Saleem, D., Granda, S., Touhiduzzaman, M., Hasandka, A., Hupp, W., Martin, M., Hossain-McKenzie, S., Cordeiro, P., Onunkwo, I., and D. Jose, "Modular Security Apparatus for Managing Distributed Cryptography for Command and Control Messages on Operational Technology Networks (Module-OT)", January 2022, <https://www.nrel.gov/docs/fy22osti/79974.pdf>.
[MPSR]: Mitzenmacher, M. and R. Pagh, "Simple multi-party set reconciliation", 2018.
[MQTT]: OASIS, "MQTT: The Standard for IoT Messaging", 2022, <mqtt.org>.
[NDNW]: Jacobson, V., "Watching NDN's Waist: How Simplicity Creates Innovation and Opportunity", July 2019, <http://ice-ar.named-data.net/meetings/2019-ICE-WEN-Annual/0-ICNWEN-Van-Keynote.pdf>.
[NERC]: NERC, "Emerging Technology Roundtable - Substation Automation/IEC 61850", November 2016, <https://www.nerc.com/pa/CI/Documents/roundtable%20-%20IEC%2061850%20slides%20%20(20161115).pdf>.
[NIST]: Hu, C., Ferraiolo, D., Kuhn, D., Schnitzer, A., Sandlin, K., Miller, R., and K. Scarfone, "Guide to Attribute Based Access Control (ABAC) Definition and Considerations", August 2019, <https://www.nist.gov/publications/guide-attribute-based-access-control-abac-definition-and-considerations-0>.
[NMUD]: al, D. D. E., "Securing Small-Business and Home Internet of Things (IoT) Devices: Mitigating Network-Based Attacks Using Manufacturer Usage Description (MUD)", May 2021, <https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.1800-15.pdf>.
[NPPI]: Hashemian, H. M., "Nuclear Power Plant Instrumentation and Control", 2011, <https://cdn.intechopen.com/pdfs/21051/InTechNuclear_power_plant_instrumentation_and_control.pdf>.
[NVR]: Gutmann, P., "Everything you Never Wanted to Know about PKI but were Forced to Find Out", 2002, <https://www.cs.auckland.ac.nz/~pgut001/pubs/pkitutorial.pdf>.
[ONE]: OneDM, "One Data Model", 2022, <https://onedm.org/>.
[OPR]: King, R., "Commercialization of NDN in Cybersecure Energy System Communications video", 2019, <https://www.nist.gov/news-events/events/2019/09/ndn-community-meeting>.
[OSCAL]: NIST, "OSCAL: the Open Security Controls Assessment Language", 2022, <https://pages.nist.gov/OSCAL/>.
[OTPM]: Hinds, L., "Keylime - An Open Source TPM Project for Remote Trust", November 2019, <https://www.youtube.com/watch?v=YtPsruEqGeY>.
[OWASP]: owasp.org/www-project-sidekek/, "SideKEK README", June 2020, <https://github.com/OWASP/SideKEK>.
[PRAG]: Wytrębowicz, J., Cabaj, K., and J. Krawiec, "Messaging Protocols for IoT Systems - A Pragmatic Comparison", 2021, <https://www.mdpi.com/1424-8220/21/20/6904>.
[QTPM]: Arthur, D. C. W., "Quick Tutorial on TPM 2.0", January 2015, <https://link.springer.com/chapter/10.1007/978-1-4302-6584-9_3>.
[RFC2693]: Ellison, C., Frantz, B., Lampson, B., Rivest, R., Thomas, B., and T. Ylonen, "SPKI Certificate Theory", RFC 2693, DOI 10.17487/RFC2693, September 1999, <https://www.rfc-editor.org/info/rfc2693>.
[RFC3552]: Rescorla, E. and B. Korver, "Guidelines for Writing RFC Text on Security Considerations", BCP 72, RFC 3552, DOI 10.17487/RFC3552, July 2003, <https://www.rfc-editor.org/info/rfc3552>.
[RFC3986]: Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, DOI 10.17487/RFC3986, January 2005, <https://www.rfc-editor.org/info/rfc3986>.
[RFC4291]: Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, DOI 10.17487/RFC4291, February 2006, <https://www.rfc-editor.org/info/rfc4291>.
[RFC4292]: Haberman, B., "IP Forwarding Table MIB", RFC 4292, DOI 10.17487/RFC4292, April 2006, <https://www.rfc-editor.org/info/rfc4292>.
[RFC4949]: Shirey, R., "Internet Security Glossary, Version 2", FYI 36, RFC 4949, DOI 10.17487/RFC4949, August 2007, <https://www.rfc-editor.org/info/rfc4949>.
[RFC6335]: Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S. Cheshire, "Internet Assigned Numbers Authority (IANA) Procedures for the Management of the Service Name and Transport Protocol Port Number Registry", BCP 165, RFC 6335, DOI 10.17487/RFC6335, August 2011, <https://www.rfc-editor.org/info/rfc6335>.
[RFC7252]: Shelby, Z., Hartke, K., and C. Bormann, "The Constrained Application Protocol (CoAP)", RFC 7252, DOI 10.17487/RFC7252, June 2014, <https://www.rfc-editor.org/info/rfc7252>.
[RFC7693]: Saarinen, M., Ed. and J. Aumasson, "The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC)", RFC 7693, DOI 10.17487/RFC7693, November 2015, <https://www.rfc-editor.org/info/rfc7693>.
[RFC8103]: Housley, R., "Using ChaCha20-Poly1305 Authenticated Encryption in the Cryptographic Message Syntax (CMS)", RFC 8103, DOI 10.17487/RFC8103, February 2017, <https://www.rfc-editor.org/info/rfc8103>.
[RFC8520]: Lear, E., Droms, R., and D. Romascanu, "Manufacturer Usage Description Specification", RFC 8520, DOI 10.17487/RFC8520, March 2019, <https://www.rfc-editor.org/info/rfc8520>.
[RFC8995]: Pritikin, M., Richardson, M., Eckert, T., Behringer, M., and K. Watsen, "Bootstrapping Remote Secure Key Infrastructure (BRSKI)", RFC 8995, DOI 10.17487/RFC8995, May 2021, <https://www.rfc-editor.org/info/rfc8995>.
[RSK]: Ellison, C. and B. Schneier, "Ten Risks of PKI: What You're Not Being Told About Public Key Infrastructure", 2000.
[SDSI]: Rivest, R. L. and B. W. Lampson, "SDSI - A Simple Distributed Security Infrastructure", April 1996.
[SIOT]: Truong, T., "How to Use the TPM to Secure Your IoT/Device Data", January 2017, <https://tonytruong.net/how-to-use-the-tpm-to-secure-your-iot-device-data/>.
[SKH]: Yates, T., "Secure key handling using the TPM", October 2018, <https://lwn.net/Articles/768419/>.
[SNC]: Smetters, D. K. and V. Jacobson, "Securing Network Content", October 2009, <https://named-data.net/wp-content/uploads/securing-network-content-tr.pdf>.
[SOD]: Bernstein, D., Lange, T., and P. Schwabe, "libsodium", 2022, <https://doc.libsodium.org/>.
[SPRV]: AgendalessConsulting, "Supervisor: A Process Control System", 2022, <http://supervisord.org/>.
[ST]: Samsung, "SmartThings API (v1.0-PREVIEW)", 2020, <https://smartthings.developer.samsung.com/docs/api-ref/st-api.html##operation/listCapabilities>.
[STNDN]: Yu, Y., Afanasyev, A., Clark, D. D., claffy, K., Jacobson, V., and L. Zhang, "Schematizing Trust in Named Data Networking", 2015.
[TATT]: Microsoft, "TPM attestation", June 2021, <https://docs.microsoft.com/en-us/azure/iot-dps/concepts-tpm-attestation>.
[TLSvuln]: al., C. B. E., "Using Frankencerts for Automated Adversarial Testing of Certificate Validation in SSL/TLS Implementations", November 2014, <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4232952/>.
[TPM]: Griffiths, P., "TPM 2.0 and Certificate-Based IoT Device Authentication", September 2020, <https://www.globalsign.com/en/resources/white-papers-ebooks/white-paper-tpm-20-and-certificate-based-iot-device-authentication>.
[W509]: Wikipedia, "X.509: Security", October 2021, <https://en.wikipedia.org/wiki/X.509#Security>.
[WSEN]: Kintner-Meyer, M., Brambley, M., Carlon, T., and N. Bauman, "Wireless Sensors: Technology and Cost-Savings for Commercial Buildings", 2002, <https://www.aceee.org/files/proceedings/2002/data/papers/SS02_Panel7_Paper10.pdf>.
[WegmanC81]: Wegman, M. N. and L. Carter, "New Hash Functions and Their Use in Authentication and Set Equality", 1981, <https://doi.org/10.1016/0022-0000(81)90033-7>.
[ZCL]: zigbeealliance, "Zigbee Cluster Library Specification Revision 6", 2019, <https://zigbeealliance.org/wp-content/uploads/2019/12/07-5123-06-zigbee-cluster-library-specification.pdf>.
[netstrings]: Bernstein, D. J., "Netstrings", February 1997, <https://cr.yp.to/proto/netstrings.txt>.
[tnetstrings]: tnetstrings, "About Tagged Netstrings", August 2011, <https://web.archive.org/web/20140210012056/http://tnetstrings.org/>.