Internet-Draft | JSON semantic format (JSON-NTV) | May 2024 |
THOMY | Expires 28 November 2024 | [Page] |
This document describes a set of simple rules for unambiguously and concisely encoding semantic data into the JSON Data Interchange Format. These rules are based on an NTV (Named and Typed Values) data structure applicable to any simple or complex data.¶
The JSON-NTV format is its JSON translation.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 28 November 2024.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The semantic level of JSON or CSV shared data remains low. It is often limited to the type of data defined in those exchange formats (strings for CSV formats; numbers, strings, arrays and objects for JSON formats).¶
JSON-NTV proposes to increase the semantic level of JSON entities [RFC8259] by adding two pieces of information to a JSON entity: a name and a type.¶
The NTV entity is thus a triplet with a mandatory element (value) and two additional elements (name, type).¶
For example, the location of Paris can be represented by:¶
The easiest way to add that information to a JSON value is to use a JSON object with a single member. The first term carries the additional elements, using the JSON-ND syntax [JSON-ND]. The second term is the JSON value.¶
With this approach, two NTV entities are defined :¶
as well as two JSON formats depending on the presence of the additional elements :¶
Example (entity composed of two other primitive entities): ¶
A JSON-NTV generator produces a JSON value from an NTV entity; conversely, a JSON-NTV parser transforms a JSON value into an NTV entity.¶
The document [NTV-TAB] presents a variation of this format for tabular and multidimensional data.¶
The conversion between an NTV entity and a native entity is outside the scope of this note.¶
The format is focused on simplicity, lightness and web usage.¶
The key features of this format are the following:¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document also uses the following terms:¶
NTVlist and NTVsingle entities can be abbreviated as:¶
NTV and JsonNTV structures are defined as shown in Figure 1:¶
Example:¶
Native layer¶
NTV layer¶
JsonNTV layer¶
JsonText layer¶
Two categories of entities (one primitive and one structured) are defined:¶
An NTV entity is therefore a tree where the leaf nodes are the NTVsingle entities and where the inner nodes are the NTVlist entities.¶
The data triplet of NTVsingle entities is composed of:¶
In other words, any entity that has both a function that encodes it into a JsonValue and a function that creates it from a JsonValue can be taken into account. This approach is very general because most computer objects are defined by a list of parameters (e.g. *args in Python) and/or a list of key/value pairs (e.g. **kwargs in Python), which translate directly into a JsonArray or a JsonObject.¶
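As an illustration of this pair of functions, the following Python sketch encodes and re-creates a native entity. The class `Segment` and the method names `to_json` and `from_json` are hypothetical and chosen only for this example; they are not defined by this document.

```python
import json


class Segment:
    """Hypothetical native entity, used only to illustrate the idea."""

    def __init__(self, *points, label=None):
        # Positional parameters (*args) map naturally to a JsonArray,
        # keyword parameters (**kwargs) to the members of a JsonObject.
        self.points = [list(point) for point in points]
        self.label = label

    def to_json(self):
        """Encode the entity into a JsonValue (here a JsonObject)."""
        return {"points": self.points, "label": self.label}

    @classmethod
    def from_json(cls, value):
        """Create the entity from a JsonValue produced by to_json()."""
        return cls(*value["points"], label=value["label"])


segment = Segment([0.0, 0.0], [1.5, 2.5], label="s1")
text = json.dumps(segment.to_json())             # JsonText
restored = Segment.from_json(json.loads(text))   # back to a native entity
```

Any entity offering such a round trip could, under this reading, be carried as an NTVsingleValue.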
The consistency between NTVsingleValue and NTVsingleType is outside the scope of this note.¶
The data triplet of NTVlist entities is composed of:¶
Example of equivalent JSON representations:¶
where NTVlistType is None for the global NTVlist¶
where NTVlistType is "point" for the global NTVlist¶
If JsonValue is { "::dat" : ["2022-01-28T18-23-54", {":point": [1.1, 2.2] } ] }, the parsers deduce that the first NTVvalue has a "dat" NTVtype and the second a "point" NTVtype.¶
A DataType is defined in a nested structure called Namespace.¶
This structuring of type makes it possible to reference any type of data that has a JSON representation and to consolidate all the shared data structures within the same tree of types.¶
A Namespace is defined by a name (NamespaceName) and a Namespace parent (NamespaceParent). The NamespaceName is unique in the NamespaceParent.¶
The root node of the Namespace tree is the GlobalNamespace.¶
The DataType represents the semantics of a piece of data and is structured in a flat classification. For example, "email" and "string" are two DataTypes.¶
A DataType is composed of a TypeBase and an optional TypeExtension. The TypeExtension defines an additional property. For example:¶
A DataType is defined by a name (DataTypeName) and a Namespace parent (NamespaceParent). The TypeBase of a DataType is unique in the NamespaceParent. The TypeExtension of a DataType is free.¶
TypeBase and the rules to encode or decode NTVvalues MUST be understood by data producers and data consumers. TypeBase and the associated rules therefore have to be defined in a specification shared by a large community. On the other hand, it must be possible for everyone to share data according to their own data structure.¶
There are therefore two categories of TypeBase:¶
For shared TypeBase, three sub-categories are defined (None, Simple, Generic).¶
Example:¶
A Namespace is defined by a string followed by a dot (NamespaceName).¶
A DataType is defined by a string (DataTypeName) composed of the TypeBaseName and the TypeExtensionName.¶
The representation of a Namespace (NamespaceLongName) is composed of all the nested NamespaceNames.¶
The representation of a DataType (DataTypeLongName) is composed of the NamespaceLongName and the DataTypeName.¶
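The composition described above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that a DataTypeName contains no dot; the TypeExtension syntax is not handled.

```python
def join_long_name(namespace_names, datatype_name=""):
    """Compose a long name from nested NamespaceNames and a DataTypeName."""
    for name in namespace_names:
        if not name.endswith("."):
            raise ValueError(f"a NamespaceName must end with a dot: {name!r}")
    return "".join(namespace_names) + datatype_name


def split_long_name(long_name):
    """Split a long name into (NamespaceLongName, DataTypeName)."""
    # Everything up to and including the last dot is the NamespaceLongName;
    # this relies on the assumption that DataTypeName contains no dot.
    head, dot, tail = long_name.rpartition(".")
    return head + dot, tail


dep_long_name = join_long_name(["fr."], "dep")
city_parts = split_long_name("fr.$city")
```

With these helpers, a DataType attached to the global Namespace (for example "point") has an empty NamespaceLongName.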
The DataTypeLongName is defined in Figure 2, which uses ABNF from [RFC5234].¶
The corresponding rules are as follows:¶
Example of the representation of a DataType defined in two nested Namespaces in the global Namespace:¶
Example of custom categories:¶
If "fr." is the name of a Namespace attached to the global Namespace and containing the Namespace 'BAN' and the DataType 'dep', then:¶
The JsonNTV format is the JSON representation of an NTV entity (JsonValue). This JsonValue is converted into JsonText by a JSON generator.¶
The JsonNTV format is defined in Figure 3, which uses ABNF from [RFC5234].¶
The JsonNTV format is built with the NTVname, NTVvalue and the JsonNTVtype.¶
Two JsonNTV formats are defined:¶
The corresponding rule is as follows:¶
Note :¶
JsonNTVname is the concatenation of NTVname and JsonSepType.¶
JsonSepType is composed of the separator (singleSep or listSep) and the JsonNTVtype.¶
JsonNTVname and JsonSepType are defined in Figure 4, which uses ABNF from [RFC5234].¶
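A minimal Python sketch of this concatenation is given below. It assumes that singleSep is ":" and listSep is "::" (the separators listed in Table 2) and that the separator is omitted when there is no JsonNTVtype; the function names are hypothetical and the sketch does not replace the ABNF.

```python
SINGLE_SEP = ":"    # singleSep
LIST_SEP = "::"     # listSep


def json_ntv_name(ntv_name="", json_ntv_type="", is_list=False):
    """Concatenate NTVname and JsonSepType (separator + JsonNTVtype)."""
    if not json_ntv_type:
        # Assumption: without a type, the separator is omitted.
        return ntv_name
    separator = LIST_SEP if is_list else SINGLE_SEP
    return ntv_name + separator + json_ntv_type


def to_json_ntv(ntv_value, ntv_name="", json_ntv_type="", is_list=False):
    """Return the named format when a key is needed, else the bare value."""
    key = json_ntv_name(ntv_name, json_ntv_type, is_list)
    return {key: ntv_value} if key else ntv_value


airport = to_json_ntv("CDG", "Paris Nord", "$iata")
point = to_json_ntv([1.1, 2.2], json_ntv_type="point")
dates = to_json_ntv(["2022-01-28", "2022-01-29"], json_ntv_type="date", is_list=True)
```

The sketch ignores the ambiguity that arises when an unnamed, untyped NTVsingle carries a single-member JsonObject as its value.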
For NTVsingle entities:¶
For NTVlist entities:¶
The JSON representation of an NTVtype (JsonNTVtype) is a compact representation of the NTVtype in the context of the NTVtypeParent.¶
The JsonNTVtype is defined in Figure 5, which uses ABNF from [RFC5234].¶
The corresponding rules are as follows:¶
Example:¶
The JsonNTVvalue is the JsonValue representation of NTVvalue as defined in Figure 6, which uses ABNF from [RFC5234].¶
For an NTVsingle, JsonNTVvalue is the NTVvalue.¶
For an NTVlist, JsonNTVvalue has two representations:¶
The corresponding rules are as follows:¶
Example:¶
Examples of JsonNTV representation of NTV entities:¶
Vsingle :¶
NVsingle : ¶
TVsingle: ¶
NTVsingle: ¶
Vlist (composed with JsonArray): ¶
Vlist (composed with JsonObject) :¶
NVlist : ¶
TVlist : ¶
NTVlist : ¶
NTVlist and NVlist (composed with JsonObject) :¶
JsonValue is parsed according to JSON structure (from root to leaves).¶
Several steps are considered:¶
This part is not detailed and consists of:¶
The NTV entity is inferred from the JSON structure of JsonValue or JsonNTVvalue:¶
JsonValue | NTV entity |
---|---|
JsonPrimitive | Vsingle |
JsonUnnamed | Vlist |
JsonArray | Vlist |
JsonNamed | see Table 2 |
Separator | JsonNTVvalue | NTV entity |
---|---|---|
None | JsonPrimitive | NVsingle |
None | JsonNamed | NVsingle |
None | JsonUnnamed | NVlist |
None | JsonArray | NVlist |
":" | JsonValue | TVsingle or NTVsingle |
"::" | JsonUnnamed | NVlist or TVlist or NTVlist |
"::" | JsonArray | NVlist or TVlist or NTVlist |
The NTVvalue is inferred from the JsonNTVvalue:¶
NTVtype is inferred from the JsonNTVtype:¶
If JsonNTVtype is a valid NTVtypeLongName: NTVtype is the decoded JsonNTVtype,¶
else if the concatenation of ParentNTVtypeLongName and JsonNTVtype is empty: NTVtype is "json",¶
else if the decoded concatenation of ParentNTVtypeLongName and JsonNTVtype is a valid NTVtypeLongName: NTVtype is that decoded value,¶
else NTVtype is the defaultNTVtype.¶
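The inference steps above (entity kind from Tables 1 and 2, then NTVtype) can be sketched in Python as follows. This is only an illustration under several assumptions: a JsonNamed is taken to be a single-member JSON object and a JsonUnnamed any other JSON object; validity of a long name is reduced to membership in a caller-supplied set; "decoding" is the identity; and the default type is passed in by the caller. All function names are hypothetical.

```python
def json_kind(value):
    """Classify a decoded JSON value (object categories are assumptions)."""
    if isinstance(value, list):
        return "JsonArray"
    if isinstance(value, dict):
        return "JsonNamed" if len(value) == 1 else "JsonUnnamed"
    return "JsonPrimitive"


def split_key(key):
    """Split a JsonNTVname into (NTVname, separator, JsonNTVtype)."""
    for sep in ("::", ":"):            # test listSep before singleSep
        if sep in key:
            name, _, json_type = key.partition(sep)
            return name, sep, json_type
    return key, None, ""


def entity_kind(value):
    """Apply Table 1, then Table 2, to a JsonValue."""
    kind = json_kind(value)
    if kind == "JsonPrimitive":
        return "Vsingle"
    if kind != "JsonNamed":
        return "Vlist"
    key, inner = next(iter(value.items()))
    name, sep, json_type = split_key(key)
    if sep == ":":
        return "NTVsingle" if name else "TVsingle"
    if sep == "::":
        if not json_type:
            return "NVlist"
        return "NTVlist" if name else "TVlist"
    is_list = json_kind(inner) in ("JsonUnnamed", "JsonArray")
    return "NVlist" if is_list else "NVsingle"


def infer_type(json_type, parent_long_name, known_types, default_type):
    """Apply the four NTVtype rules in order."""
    if json_type in known_types:
        return json_type
    combined = parent_long_name + json_type
    if combined == "":
        return "json"
    if combined in known_types:
        return combined
    return default_type


known = {"json", "dat", "point"}       # hypothetical registry of valid names
doc = {"::dat": ["2022-01-28T18-23-54", {":point": [1.1, 2.2]}]}
first_type = infer_type("", "dat", known, "json")
second_type = infer_type("point", "dat", known, "json")
```

In this usage, the first value inherits the "dat" type of the enclosing list while the second keeps its own "point" type.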
The [JSON-NTV] repository gives some examples of NTV usage.¶
An NTV entity is a tree where the leaf nodes are the NTVsingle entities and where the inner nodes are the NTVlist entities.¶
Therefore, the usual tree properties apply:¶
An NTV Pointer is a string syntax for identifying a specific NTV entity within an NTV tree.¶
The syntax defined for JSON Pointer [RFC6901] (json-pointer, reference-token) is transposable to NTV Pointer (ntv-pointer, reference-token):¶
The ntv-pointer is equal to the json-pointer of the JsonNTV representation in most cases except :¶
Three levels of equality are defined.¶
Strict equality:¶
Two NTV entities are strictly equal if:¶
Structural equality:¶
Two NTV entities are structurally equal if:¶
Example of structural equality (JsonNTV representation):¶
Semantic equality:¶
The NTVtype of the NTVlist entities is only useful for constructing the JSON representation (two NTVlists with two different NTVtypes are equal according to the structural equality criteria).¶
A canonical (single) format is chosen to facilitate sharing and analysis of NTV and JsonNTV data. It is defined by setting a value for the NTVtype of an NTVlist entity.¶
The rule is as follows:¶
NTVtype is empty, if an included entity is an NTVlist with an empty NTVtype,¶
NTVtype is empty, if all included entities have an NTVtype 'json',¶
NTVtype is the common Namespace, if the common Namespace is not the GlobalNamespace,¶
In other cases, the NTVtype is the same as that of the first included entity.¶
This rule is simple to implement and allows you to have a compact JsonNTV format (another more complex choice would have been to take the most frequent NTVtype among the included entities).¶
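The rule can be sketched in Python as below. Each included entity is described by a hypothetical pair (is_list, ntv_type), where ntv_type is a long name and the empty string stands for an empty NTVtype; the common Namespace is computed as the longest shared NamespaceLongName, which is an interpretation made for this sketch.

```python
def namespace_parts(long_name):
    """Return the NamespaceNames (without dots) that prefix a long name."""
    head, dot, _ = long_name.rpartition(".")
    return (head + dot).split(".")[:-1]


def common_namespace(long_names):
    """Longest NamespaceLongName shared by all the given long names."""
    common = []
    for level in zip(*(namespace_parts(name) for name in long_names)):
        if len(set(level)) != 1:
            break
        common.append(level[0] + ".")
    return "".join(common)     # "" stands for the GlobalNamespace


def canonical_list_type(children):
    """children: list of (is_list, ntv_type) pairs for the included entities."""
    if any(is_list and ntv_type == "" for is_list, ntv_type in children):
        return ""
    if all(ntv_type == "json" for _, ntv_type in children):
        return ""
    shared = common_namespace([ntv_type for _, ntv_type in children])
    if shared:
        return shared
    return children[0][1]


french = canonical_list_type([(False, "fr.dep"), (False, "fr.$city")])
mixed = canonical_list_type([(False, "dat"), (False, "point")])
```

Because the last rule only looks at the first included entity, the function never needs to count type frequencies.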
Appendix C shows that there is an equivalence between Json entities (JsonArray, JsonObject, JsonNumber, JsonString, JsonFalse, JsonTrue or JsonNull) and the corresponding NTV entities, called JNTV (respectively NTVarray, NTVobject, NTVnumber, NTVstring, NTVfalse, NTVtrue, NTVnull).¶
JNTV entities have several properties:¶
The NTVvalues of NTV entities are JsonValues. In the extended NTV structure, an NTVvalue can be any kind of data.¶
With this structure, the NTV representation is a "json like" data where:¶
Examples of extended NTV representation:¶
As defined in Section 2.2.1, any entity with a JSON representation can be taken into account.¶
Appendix A defines usual NTVtypes (e.g. date, coordinate, email) but also NTVtypes associated to structured data (e.g. dataset, NTV data, custom format):¶
The example in Figure 7 is the representation of an NTVlist with three NTVsingle entities:¶
Nested NTVsingle is treated as simple NTVsingle and can be used with extended NTV structures.¶
No IANA actions are required.¶
The format used for NTV data exchanges is the JSON format. So, all the security considerations of [RFC8259] apply.¶
The NTV structure provides no cryptographic integrity protection of any kind.¶
The structure of DataType by Namespace makes it possible to have DataType or Namespace corresponding to recognized standards at the global level.¶
A standard DataType / Namespace is a DataType / Namespace defined in the Global Namespace.¶
The standard DataType and Namespace are listed below.¶
Json DataTypes have a generic DataType: "json"¶
DataTypeName (generic) | NTVvalue | example NTVvalue |
---|---|---|
json | generic DataType | |
number (json) | JsonNumber [RFC8259] | 10 |
boolean (json) | JsonBoolean [RFC8259] | true |
null (json) | JsonNull [RFC8259] | null |
string (json) | JsonString [RFC8259] | "value" |
array (json) | JsonArray [RFC8259] | [1.1, 2.2] |
object (json) | JsonObject [RFC8259] | {"value1": 1, "value2": 2} |
DataTypeName | NTVvalue | comment |
---|---|---|
int | JsonNumber [RFC8259] | integer |
int8, int16, int32, int64 | JsonNumber [RFC8259] | signed integer |
uint8, uint16, uint32, uint64 | JsonNumber [RFC8259] | unsigned integer |
decimal64 | JsonNumber [RFC8259] | decimal floating point |
float | JsonNumber [RFC8259] | binary floating point |
float16, float32, float64 | JsonNumber [RFC8259] | binary floating point |
DataTypeName | NTVvalue | comment |
---|---|---|
bit | binary digit | string "0" or "1" |
binary | bit string | string of bit array |
base16 | base16 encoding [RFC4648] | string of hexadecimal array |
base32 | base32 encoding [RFC4648] | string of duotrigesimal array |
base64 | base64 encoding [RFC4648] | string of tetrasexagesimal array |
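As an illustration of the three [RFC4648] encodings, the following Python sketch uses the standard `base64` module to produce the string forms of a small byte sequence. The sample bytes are arbitrary, and the "bit" and "binary" DataTypes are not covered.

```python
import base64

data = b"NTV"   # arbitrary sample bytes

# Each call returns bytes; decode to get the JSON string value.
base16_value = base64.b16encode(data).decode("ascii")
base32_value = base64.b32encode(data).decode("ascii")
base64_value = base64.b64encode(data).decode("ascii")

# Decoding restores the original bytes in each case.
restored = [
    base64.b16decode(base16_value),
    base64.b32decode(base32_value),
    base64.b64decode(base64_value),
]
```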
Datation DataTypes have a generic DataType: "dat"¶
DataTypeName (generic) | NTVvalue | example NTVvalue |
---|---|---|
year | fullyear [RFC3339] | 1998 |
month | month [RFC3339] | 10 |
yearmonth | year + month (ISO8601-1) | 1998-10 |
day | mday [RFC3339] day of month | 21 |
wday | wday [RFC3339] day of week | 7 |
yday | yday [RFC3339] day of year | 360 |
week | week [RFC3339] | 38 |
hour | hour [RFC3339] | 20 |
minute | minute [RFC3339] | 18 |
second | second [RFC3339] | 54 |
dat | generic DataType | |
date (dat) | date [RFC3339] | "2022-01-28" |
time (dat) | timespec-base [time-fraction][RFC3339] | "T18:23:54", "18:23", "T18" |
timetz (dat) | timespec-base [time-fraction] time-zone[RFC3339] | "T18:23:54+0400" |
datetime (dat) | iso-date-time (without time-zone)[RFC3339] | "2022-01-28T18-23-54" |
datetimetz (dat) | iso-date-time (with time-zone)[RFC3339] | "2022-01-28T18-23-54+0400" |
Location DataTypes have a generic DataType: "loc".¶
The CRS (Coordinate Reference System) is geographic, using the World Geodetic System 1984 (WGS 84) datum, with longitude and latitude units of decimal degrees (EPSG:4326).¶
DataTypeName (generic) | NTVvalue | example NTVvalue |
---|---|---|
loc | generic DataType | |
point (loc) | Point coordinates [RFC7946] | [ 5.12, 45.256 ] (lon, lat) |
pointstr (loc) | Point coordinates (string) | "5.12, 45.256" (lon, lat) |
pointobj (loc) | Point coordinates (object) | {"lon": 5.12, "lat": 45.256} |
multipoint | MultiPoint coordinates [RFC7946] | [pt1, pt2, pt3] (ptx is a "point" NTVvalue) |
line (loc) | LineString coordinates [RFC7946] | [pt1, pt2, pt3] (ptx is a "point" NTVvalue) |
multiline | MultiLineString coordinates [RFC7946] | [li1, li2, li3] (lix is a "line" NTVvalue) |
polygon (loc) | Polygon coordinates [RFC7946] | [rg1, rg2, rg3] (rgx is a "line" NTVvalue, ring) |
multipolygon (loc) | MultiPolygon coordinates [RFC7946] | [pl1, pl2, pl3] (plx is a "polygon" NTVvalue) |
geometry | Geometry coordinates [RFC7946] | "point", "line" or "polygon" NTVvalue |
multigeometry | MultiGeometry coordinates [RFC7946] | [geo1, geo2, geo3] (geox is a "geometry" NTVvalue) |
box (loc) | box coordinates [RFC7946] | [ -10.0, -10.0, 10.0, 10.0 ] |
geojson (loc) | geoJSON object [RFC7946] | {"type": "Point", "coordinates": [40.0, 0.0]} |
codeolc (loc) | Open Location Code [OLC] | "8FW4V75V+8F6" |
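The three point representations in the table above carry the same information. The following Python sketch normalizes a "pointstr" or "pointobj" NTVvalue to the "point" form [lon, lat]; the function name `to_point` is hypothetical and no range checking of the coordinates is performed.

```python
def to_point(value):
    """Normalize a point, pointstr or pointobj NTVvalue to [lon, lat]."""
    if isinstance(value, str):
        # pointstr: "lon, lat"
        lon, lat = (float(part) for part in value.split(","))
        return [lon, lat]
    if isinstance(value, dict):
        # pointobj: {"lon": ..., "lat": ...}
        return [float(value["lon"]), float(value["lat"])]
    # point: [lon, lat]
    lon, lat = value
    return [float(lon), float(lat)]


from_string = to_point("5.12, 45.256")
from_object = to_point({"lon": 5.12, "lat": 45.256})
```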
DataTypeName | NTVvalue |
---|---|
row | row [W3C_TAB] |
field | column [W3C_TAB] |
tab | table [NTV-TAB] |
ndarray | multidimensional array [NTV-TAB] |
xndarray | labelled multidimensional array[NTV-TAB] |
xdataset | coordinated multidimensional array[NTV-TAB] |
ntv | JsonNTV |
sch | Schema |
The data structures associated with these DataTypeNames are defined in specific documents.¶
DataTypeName | NTVvalue | example NTVvalue |
---|---|---|
unit | Physical quantities | "kg / s2" |
uri | URI (RFC3986) | "https://www.ietf.org/rfc/rfc3986.txt", "geo:13.4125,103.86673" (RFC5870), "info:eu-repo/dai/nl/12345", "mailto:John.Doe@example.com", "tel:+1-201-555-0123" (RFC3966) |
uriref | URI-reference (RFC3986) | "www.example.com/questions/3456/my-document" |
iri | internationalized URI (RFC3987) | "http://www.example.org/D%FCrst" |
iriref | IRI-reference (RFC3987) | "www.example.org/D%FCrst" |
uritem | URI-template (RFC6570) | "{+x,hello,y}" |
uuid | UUID (RFC4122) | "f81d4fae-7dec-11d0-a765-00a0c91e6bf6" |
email | address (RFC5322) | "John Doe <jdoe@machine.example>" |
idnemail | internationalized address (RFC6531) | |
hostname | Host Names (RFC1123) | "en.wikipedia.org" |
idnhostname | internationalized Host Names (RFC5890) | |
jpointer | JSON pointer (RFC6901) | "/foo/0" |
rjpointer | relative JSON pointer (I-D) | "1/nested/objects" |
regex | regular expression (ECMA 262) | "[Ss]mith\\\\b" |
ipv4 | IPv4 address (RFC2673) | "192.168.1.1" |
ipv6 | IPv6 address (RFC2373) | "2001:0db8:85a3:0000:0000:8a2e:0370:7334" |
file | file-hier-part (RFC8089) | "///path/to/file", "//host.example.com/path/to/file" |
Keywords (e.g. "id", "mandatory", "units") are not defined as Normalized Strings; they can be used as custom DataTypes (e.g. "$id", "$mandatory", "$units").¶
The global Namespace includes Namespaces for countries, dependent territories and special areas as defined in [ISO_3166-1_alpha-2]¶
The JsonNamespace for those Namespaces is composed of the two-letter country code followed by a dot.¶
Example :¶
Each Namespace defines a list of included DataType and Namespace.¶
Custom DataType and Namespace can be created in any Namespace.¶
Table 11 below presents some examples of custom DataType.¶
JsonDataType | comment or JsonNTV example |
---|---|
"$id" | defined in the global Namespace, e.g. { ":$id": "5426849" } |
"$iata" | IATA airport code, e.g. {"Paris Nord:$iata": "CDG"} |
"$uic.station" | UIC station code, e.g. {"Nantes station:$uic.station": "8748100"} |
"fr.$city" | DataType "city" in "fr." Namespace, e.g. {":fr.$city": "Paris"} |
"$schemaorg." | "schemaorg" catalog, e.g. { ":$schemaorg.propertyID": "NO2" }, { ":$schemaorg.unitText": "mg/m3" } |
"$darwincore." | "darwincore" catalog, e.g. { ":$darwincore.acceptedNameUsage": "Tamias minimus" } |
This Appendix shows that we can associate with each JsonValue an NTV entity whose JSON representation (JsonNtv) is identical to the JsonValue.¶
To do this, let's call NTVarray, NTVobject, NTVnumber, NTVstring, NTVfalse, NTVtrue, NTVnull the entities defined below.¶
Json primitives¶
For example, the NTVstring entity associated with the JsonString "foo" has the following attributes:¶
Json Array¶
For example, for a JsonArray composed of the JsonNumber represented by 25 and the JsonArray represented by [1,2], the NTVarray has the following attributes:¶
JsonObject¶
If val is a Json Primitive, NTVmember is an NVsingle:¶
If val is a JsonArray or JsonObject, NTVmember is an NVlist:¶
For example, for a JsonObject composed of the JsonMember represented by "foo": 25 and the JsonMember represented by "bar": [1,2], the NTVobject will be:¶
Thus, every Json entity has an equivalent NTV entity.¶
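The association described in this Appendix can be sketched as a recursive Python function over a decoded JsonValue. The output dictionary layout and the function name `to_ntv` are hypothetical; the sketch follows the rule stated above that an object member is an NVsingle when its value is a Json primitive and an NVlist otherwise, and it assumes that array elements are unnamed entities.

```python
def to_ntv(value, name=None):
    """Describe the NTV entity associated with a decoded JsonValue."""
    prefix = "V" if name is None else "NV"
    if isinstance(value, list):
        # JsonArray: a list entity whose elements are unnamed entities.
        children = [to_ntv(item) for item in value]
        return {"entity": prefix + "list", "name": name, "value": children}
    if isinstance(value, dict):
        # JsonObject: a list entity whose elements are named members.
        children = [to_ntv(val, key) for key, val in value.items()]
        return {"entity": prefix + "list", "name": name, "value": children}
    # Json primitives (number, string, true, false, null).
    return {"entity": prefix + "single", "name": name, "value": value}


tree = to_ntv({"foo": 25, "bar": [1, 2]})
```

Here `tree` is a Vlist holding an NVsingle named "foo" and an NVlist named "bar", the latter containing two Vsingle entities.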
This Appendix presents the mapping between DataTypes and the types defined in Data schemas or Data languages.¶
The mapping concerns Table Schema types and formats [TABLE_SCHEMA]. Parsable and pattern formats (datation) and topojson (location) are not included.¶
type | format | DataType |
---|---|---|
string | default | string |
string | email | email |
string | uri | uri |
string | binary (base64 string) | base64 |
string | uuid | uuid |
number | default | number |
integer | default | int |
boolean | default | boolean |
object | default (json) | json |
array | default (json array) | array |
date | default (date ISO8601) | date |
time | default (time ISO8601) | time |
datetime | default (datetime ISO8601 in UTC) | datetime |
year | default | year |
yearmonth | default | yearmonth |
duration | default (lexical duration ISO8601) | duration |
geopoint | default (string "lon, lat") | pointstr |
geopoint | array (array [lon, lat]) | point |
geopoint | object (eg {"lon": 90, "lat": 45}) | pointobj |
geojson | default (geojson spec) | geojson |
-any- | -any- | $xxx (custom type) |
The mapping concerns JSON-schema [JSON_SCHEMA] primitive types, defined formats and String-Encoded data.¶
type | format | DataType |
---|---|---|
string | | string |
string | date-time | datetime |
string | time | time |
string | date | date |
string | duration | duration |
string | email | email |
string | idn-email | idnemail |
string | hostname | hostname |
string | idn-hostname | idnhostname |
string | ipv4 | ipv4 |
string | ipv6 | ipv6 |
string | uuid | uuid |
string | uri | uri |
string | uri-reference | uriref |
string | uri-template | uritem |
string | iri | iri |
string | iri-reference | iriref |
string | json-pointer | jpointer |
string | relative-json-pointer | rjpointer |
string | regex | regex |
number | | number |
integer | | int |
boolean | | boolean |
object | | object |
array | | array |
contentEncoding | base64 | base64 |
contentEncoding | base32 | base32 |
contentEncoding | base16 | base16 |
contentEncoding | binary | binary |
null | | null |
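A lookup based on a subset of the table above can be sketched in Python as follows. The function name `datatype_for` is hypothetical, and giving `contentEncoding` precedence over `format` is an assumption of this sketch rather than a rule of this document.

```python
# Subset of the JSON Schema "format" rows of the table above.
FORMAT_TO_DATATYPE = {
    "date-time": "datetime", "time": "time", "date": "date",
    "duration": "duration", "idn-email": "idnemail",
    "hostname": "hostname", "idn-hostname": "idnhostname",
    "ipv4": "ipv4", "ipv6": "ipv6", "uuid": "uuid",
    "uri": "uri", "uri-reference": "uriref", "uri-template": "uritem",
    "iri": "iri", "iri-reference": "iriref",
    "json-pointer": "jpointer", "relative-json-pointer": "rjpointer",
    "regex": "regex",
}

# Rows without a format: the primitive type alone selects the DataType.
TYPE_TO_DATATYPE = {
    "string": "string", "number": "number", "integer": "int",
    "boolean": "boolean", "object": "object", "array": "array",
    "null": "null",
}

ENCODINGS = {"base64", "base32", "base16", "binary"}


def datatype_for(schema):
    """Return the DataType name for a JSON Schema fragment, or None."""
    encoding = schema.get("contentEncoding")
    if encoding in ENCODINGS:
        return encoding
    fmt = schema.get("format")
    if schema.get("type") == "string" and fmt in FORMAT_TO_DATATYPE:
        return FORMAT_TO_DATATYPE[fmt]
    return TYPE_TO_DATATYPE.get(schema.get("type"))


timestamp_type = datatype_for({"type": "string", "format": "date-time"})
```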
The mapping concerns the YANG [RFC7950] built-in types. Enumeration, leafref, identityref, instance-identifier and union are not included.¶
type | DataType | comments |
---|---|---|
boolean | boolean | |
decimal64 | decimal64 | |
empty | null | |
int8 | int8 | |
int16 | int16 | |
int32 | int32 | |
int64 | int64 | |
uint8 | uint8 | |
uint16 | uint16 | |
uint32 | uint32 | |
uint64 | uint64 | |
string | string | |
bit | bit | |
binary | binary |
TBD¶