Sensor Measurement Lists (SenML) Fields for Indicating Data Value Content-Format

Authors:
   Ericsson, Jorvas 02420, Finland (ari.keranen@ericsson.com)
   Universität Bremen TZI, Postfach 330440, Bremen D-28359, Germany (+49-421-218-63921, cabo@tzi.org)

Keywords: Internet of Things (IoT), Internet of Things, IOT, data model, media type

Abstract

The Sensor Measurement Lists (SenML) media types support multiple types
of values, from numbers to text strings and arbitrary binary Data Values.
In order to facilitate processing of binary Data Values, this document
specifies a pair of new SenML fields for indicating the
content format of those binary Data Values, i.e., their Internet media
type, including parameters as well as any content codings applied.

Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by
the Internet Engineering Steering Group (IESG). Further
information on Internet Standards is available in Section 2 of
RFC 7841.
Information about the current status of this document, any
errata, and how to provide feedback on it may be obtained at
.
Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Revised BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Revised BSD License.
Table of Contents
Introduction
  Evolution
Terminology
SenML Content-Format ("ct") Field
SenML Base Content-Format ("bct") Field
Examples
ABNF
Security Considerations
IANA Considerations
References
  Normative References
  Informative References
Acknowledgments
Authors' Addresses
Introduction

The Sensor Measurement Lists (SenML) media types can be used
to send various kinds of data. In the example given in
Figure 1, a temperature value, an indication whether a lock is open, and
a Data Value (with SenML field "vd") read from a Near Field Communication
(NFC) reader are sent in a single SenML Pack.
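
As a rough illustration of a Pack like the one described above, the following Python sketch builds three Records and serializes them as SenML JSON. The Record names, the readings, and the helper `b64url_nopad` are hypothetical and chosen only for this sketch; the one detail taken from the text is that the "vd" value is a base64url string without padding.

```python
import base64
import json


def b64url_nopad(data: bytes) -> str:
    """Encode bytes as base64url and strip the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# Hypothetical readings; names and values are illustrative only.
pack = [
    {"n": "temp", "u": "Cel", "v": 7.1},                # temperature value
    {"n": "open", "vb": False},                         # is the lock open?
    {"n": "nfc-reader", "vd": b64url_nopad(b"hi \n")},  # opaque Data Value
]

pack_json = json.dumps(pack)
```

Nothing in this Pack tells a consumer how to interpret the bytes carried in "vd"; that has to come from context.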
The example is given in SenML JSON representation, so the "vd" (Data
Value) field is encoded as a base64url string (without
padding), as per [RFC8428].

The receiver is expected to know how to interpret the data in the "vd"
field based on the context, e.g., the name of the data source and out-of-band
knowledge of the application. However, this context may not always be
easily available to entities processing the SenML Pack, especially if
the Pack is propagated over time and via multiple entities. To facilitate
automatic interpretation, it is useful to be able to indicate an Internet
media type and, optionally, content codings right in the SenML Record.

The Constrained Application Protocol (CoAP)
Content-Format (Section 12.3 of [RFC7252]) provides this
information in the form of a single unsigned integer. For instance,
[RFC8949] defines the Content-Format number 60 for
Content-Type application/cbor. Enclosing this Content-Format number in
the Record is illustrated in Figure 2. All registered CoAP Content-Format
numbers are listed in the "CoAP Content-Formats" registry
[IANA.core-parameters], as specified by Section 12.3 of [RFC7252].
Note that, at the time of writing, the structure of this registry only
provides for zero or one content coding; nothing in the present
document needs to change if the registry is extended to allow
sequences of content codings.

In this example SenML Record, the Data Value contains a string "foo" and a
number 42 encoded in a Concise Binary Object Representation (CBOR) array.
Since the example above uses the JSON format of SenML, the Data Value
containing the binary CBOR value is base64 encoded (Section 5 of [RFC4648]).
The Data Value after base64 decoding is shown
with CBOR diagnostic notation in Figure 3.

Evolution

As with SenML in general, there is no expectation that the creator of
a SenML Pack knows (or has negotiated with) each consumer of that Pack,
which may be very remote in space and particularly in time.
This means that the SenML creator in general has no way to know
whether the consumer knows:
*  each specific Media-Type-Name used,
*  each parameter and each parameter value used,
*  each content coding in use, and
*  each Content-Format number in use for a combination of these.
What SenML, as well as the new fields defined here, guarantees is that
a recipient implementation knows when it needs to be updated to
understand these field values and the values controlled by them;
registries are used to evolve these name spaces in a controlled way.
SenML Packs can be processed by a consumer while not understanding all
the information in them, and information can generally be preserved in
this processing such that it is useful for further consumers.

Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 [RFC2119] [RFC8174] when, and only when, they
appear in all capitals, as shown here.
Media type:
   A registered label for representations (byte strings) prepared for
   interchange [RFC1590] [RFC6838], identified by a Media-Type-Name.

Media-Type-Name:
   A combination of a type-name and a subtype-name registered in
   [IANA.media-types], as per [RFC6838], conventionally
   identified by the two names separated by a slash.

Content-Type:
   A Media-Type-Name, optionally associated with parameters
   (Section 5 of [RFC2045], separated from
   the Media-Type-Name and from each other by a semicolon).
   In HTTP and many other protocols, it is used in a Content-Type header field.

Content coding:
   A name registered in the "HTTP Content Coding Registry"
   [IANA.http-parameters], as specified by [RFC9110], indicating an encoding
   transformation with semantics further specified in the same document.
   Confusingly, in HTTP, content coding values are found in a header field
   called "Content-Encoding"; however, "content coding" is the correct
   term for the process and the registered values.

Content format:
   The combination of a Content-Type and zero or more content codings, identified
   by (1) a numeric identifier defined in the "CoAP Content-Formats" registry
   [IANA.core-parameters], as per Section 12.3 of [RFC7252] (referred to as
   Content-Format number), or (2) a Content-Format-String.

Content-Format-String:
   The string representation of the combination of a Content-Type and
   zero or more content codings.

Content-Format-Spec:
   The string representation of a content format; either a
   Content-Format-String or the (decimal) string representation of a
   Content-Format number.
Readers should also be familiar with the terms and concepts discussed in
[RFC8428].

SenML Content-Format ("ct") Field

When a SenML Record contains a Data Value field ("vd"), the Record MAY
also include a Content-Format indication field, using label "ct". The
value of this field is a Content-Format-Spec, i.e., one of the following:
*  a CoAP Content-Format number in decimal form with no leading
   zeros (except for the value "0" itself). This value represents an
   unsigned integer in the range of 0-65535, similar to the "ct"
   attribute defined in [RFC7252] for CoRE Link
   Format [RFC6690].
*  a Content-Format-String containing a Content-Type and
   zero or more content codings (see below).
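
The two alternatives above can be told apart mechanically. The following is a minimal Python sketch, assuming a hypothetical helper named `classify_ct`: a value made up only of digits is treated as a CoAP Content-Format number and checked for leading zeros and range, and anything else is passed through as a Content-Format-String without validating its syntax.

```python
import re


def classify_ct(value: str):
    """Classify a "ct" (or "bct") value.

    Returns ("number", int) for a CoAP Content-Format number, or
    ("string", value) for a Content-Format-String. The string form is
    not validated against the formal syntax in this sketch.
    """
    if re.fullmatch(r"[0-9]+", value):
        # Digits only: must be a Content-Format number.
        if len(value) > 1 and value.startswith("0"):
            raise ValueError("leading zeros are not allowed: %r" % value)
        number = int(value)
        if number > 65535:
            raise ValueError("Content-Format number out of range: %r" % value)
        return ("number", number)
    return ("string", value)
```

For instance, `classify_ct("60")` yields a number, while `classify_ct("application/json")` is reported as a string; `"060"` and `"65536"` are rejected.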
The syntax of this field is formally defined in the ABNF section.

The CoAP Content-Format number provides a simple and efficient way
to indicate the type of the data. Since some Internet media types and
their content coding and parameter alternatives do not have assigned
CoAP Content-Format numbers, using Content-Type and zero or more
content codings
is also allowed. Both methods use a string value in the "ct" field to
keep its data type consistent across uses. When the "ct" field
contains only digits, it is interpreted as a CoAP Content-Format
number.

To indicate that one or more content codings are used with a Content-Type,
each of the content coding values is appended to the Content-Type value (media
type and parameters, if any), separated by an "@" sign, in the order in which
the content codings were applied (the same order as in the Content-Encoding
header field of [RFC9110]).
For example (using a content coding value of "deflate", as defined in
[RFC9110]):
text/plain; charset=utf-8@deflate
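
A receiver might take such a string apart along the following lines. This is only a sketch: `split_content_format_string` and `decoding_order` are hypothetical helpers, and splitting on every "@" assumes that no parameter value itself contains an "@" sign, which the sketch does not check.

```python
from typing import List, Tuple


def split_content_format_string(value: str) -> Tuple[str, List[str]]:
    """Split a Content-Format-String into Content-Type and content codings.

    Assumption: "@" never occurs inside a parameter value.
    An empty list of codings means no content coding was specified.
    """
    content_type, *codings = value.split("@")
    return content_type, codings


def decoding_order(codings: List[str]) -> List[str]:
    """Codings are listed in the order applied, so undo them last-first."""
    return list(reversed(codings))
```

With the example above, the Content-Type is "text/plain; charset=utf-8" and the only content coding is "deflate". A string without any "@" sign yields an empty list of codings.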
If no "@" sign is present after the media type and parameters,
then no content coding has been specified, and the "identity"
content coding is used -- no encoding transformation is employed.

SenML Base Content-Format ("bct") Field

The Base Content-Format field, label "bct", provides a default value for
the Content-Format field (label "ct") within its range. The range of the
base field includes the Record containing it, up to (but not including)
the next Record containing a "bct" field, if any, or up to the end of the
Pack otherwise. The process of resolving (see [RFC8428]) this base
field is performed by adding its value with the label "ct" to all Records
in this range that carry a "vd" field but do not already contain a
Content-Format ("ct") field.

Figure 4 shows a variation of Figure 2 with multiple records, with the
"nfc-reader" records resolving to the base field value "60" and the
"iris-photo" record overriding this with the "image/png" media type
(actual data left out for brevity).

Examples

The following examples are valid values for the "ct" and "bct" fields
(explanation/comments in parentheses):

*  "60" (CoAP Content-Format number for "application/cbor")
*  "0" (CoAP Content-Format number for "text/plain" with parameter
   "charset=utf-8")
*  "application/json" (JSON Content-Type -- equivalent to "50" CoAP
   Content-Format number)
*  "application/json@deflate" (JSON Content-Type with "deflate" as
   content coding -- equivalent to "11050" CoAP Content-Format number)
*  "application/json@deflate@aes128gcm" (JSON Content-Type with
   "deflate" followed by "aes128gcm" as content codings)
*  "text/csv;header=present@gzip" (CSV with header row, using "gzip" as
   content coding)
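
The way a "bct" value supplies a default for "ct" within its range can be sketched in Python as follows. `resolve_bct` is a hypothetical helper and the Pack contents are made up for illustration. Removing "bct" from the resolved Records is an assumption of this sketch, modeled on how SenML base fields are generally handled during resolution.

```python
def resolve_bct(pack):
    """Return a copy of the Pack with "bct" applied as the default "ct"."""
    resolved = []
    base_ct = None  # no base value is in effect until a "bct" field is seen
    for record in pack:
        record = dict(record)  # leave the caller's Records untouched
        if "bct" in record:
            # A new "bct" starts a new range, beginning with this Record.
            base_ct = record.pop("bct")  # assumption: drop "bct" once resolved
        if base_ct is not None and "vd" in record and "ct" not in record:
            record["ct"] = base_ct
        resolved.append(record)
    return resolved


# Illustrative Pack; names and values are made up.
pack = [
    {"n": "nfc-reader", "vd": "gmNmb28YKg", "bct": "60"},
    {"n": "iris-photo", "vd": ".....", "ct": "image/png"},
    {"n": "nfc-reader", "vd": "gmNmb28YKg"},
    {"n": "temp", "v": 7.1},
]
resolved = resolve_bct(pack)
```

Here both "nfc-reader" Records end up with "ct" set to "60", the "iris-photo" Record keeps its own "image/png", and the "temp" Record is left alone because it carries no "vd" field.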
ABNF

This specification provides a formal definition of the syntax of
Content-Format-Spec strings using ABNF notation [RFC5234], which
contains three new rules and a number of rules collected and adapted
from various RFCs.

Security Considerations

The indication of a media type in the data does not exempt a consuming
application from properly checking its inputs.
Also, the ability for an attacker to supply crafted SenML data that
specifies media types chosen by the attacker may expose vulnerabilities
of handlers for these media types to the attacker.
This includes "decompression bombs", compressed data that is crafted
to decompress to extremely large data items.

IANA Considerations

IANA has assigned the following new labels in the
"SenML Labels" subregistry
of the "Sensor Measurement Lists (SenML)" registry (as defined in [RFC8428]) for the
Content-Format indication, as per [RFC8428]:
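IANA Registration for New SenML Labels

+=====================+=======+===========+==========+===========+
| Name                | Label | JSON Type | XML Type | Reference |
+=====================+=======+===========+==========+===========+
| Base Content-Format | bct   | String    | string   | RFC 9193  |
+---------------------+-------+-----------+----------+-----------+
| Content-Format      | ct    | String    | string   | RFC 9193  |
+---------------------+-------+-----------+----------+-----------+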
Note that, per [RFC8428], no CBOR labels nor Efficient XML Interchange (EXI)
schemaId values (EXI ID column) are supplied.

References

Normative References

[IANA.core-parameters]  IANA, "Constrained RESTful Environments (CoRE) Parameters".
[IANA.http-parameters]  IANA, "Hypertext Transfer Protocol (HTTP) Parameters".
[IANA.media-types]  IANA, "Media Types".
[IANA.senml]  IANA, "Sensor Measurement Lists (SenML)".
[RFC2045]  "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045.
[RFC2119]  "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119.
[RFC5234]  "Augmented BNF for Syntax Specifications: ABNF", RFC 5234.
[RFC7252]  "The Constrained Application Protocol (CoAP)", RFC 7252.
[RFC8174]  "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", RFC 8174.
[RFC8428]  "Sensor Measurement Lists (SenML)", RFC 8428.
[RFC9110]  "HTTP Semantics", RFC 9110.

Informative References

[RFC1590]  "Media Type Registration Procedure", RFC 1590.
[RFC4180]  "Common Format and MIME Type for Comma-Separated Values (CSV) Files", RFC 4180.
[RFC4648]  "The Base16, Base32, and Base64 Data Encodings", RFC 4648.
[RFC6690]  "Constrained RESTful Environments (CoRE) Link Format", RFC 6690.
[RFC6838]  "Media Type Specifications and Registration Procedures", RFC 6838.
[RFC8866]  "SDP: Session Description Protocol", RFC 8866.
[RFC8949]  "Concise Binary Object Representation (CBOR)", RFC 8949.

Acknowledgments

The authors would like to thank for the discussions leading
to the design of this extension and for reviews and
feedback.
suggested not burdening this document with a separate
mandatory-to-implement version of the fields.
, , and provided helpful
comments at Working Group Last Call.
asked for clarifying and using the term Content-Format-Spec.

Authors' Addresses

Ericsson
Jorvas 02420
Finland
Email: ari.keranen@ericsson.com

Universität Bremen TZI
Postfach 330440
Bremen D-28359
Germany
Phone: +49-421-218-63921
Email: cabo@tzi.org