Document Status

This is an initial draft of the SDF32™ 1.0 specification.

While this draft has been released, it has not been approved by any standards body.

Implementors should periodically check the SDF32 online resources (see #Online Resources tab) for the current status of SDF32 documentation.

Abstract

The Signed DAG Format 32 Byte (SDF32) specification provides a convenient format for signed directional acyclic graphs (SDAGs).

The SDF32 specification is inspired by the patent-free SSF32 specification and is designed to provide a portable, legally unencumbered, highly secure, well-specified standard for encoding and decoding directional acyclic graphs constructed from SSF in a manner which ...

The initial motivation for the SDF32 standard was to have a convenient, extensible, compact & cryptographically secure method for proving two or more transformations of some source document are related, either directly or indirectly, through some arbitrary (non-cyclic) chain of transformations.

The SDF32 file format is a patent-free format licensed under the Apache 2.0 License.

This specification defines the Internet Media Type "application/sdf32".

Please review either the #License tab or the repository LICENSE.md for more information.

SDF32 retains the following features from the PNG Standard & SSF32 Standard:

  • Complete hardware and platform independence.
  • Effective, 100% lossless compression via gzdeflate.
  • Reliable, straightforward detection of data/file corruption.

SDF32 is designed to be:

  • Simple & portable.
  • Secure: Reasonably cryptographically resistant to tampering or data corruption.
  • Legally unencumbered: to the best knowledge of the SSF32 authors, no algorithms under legal challenge are used.
  • Well compressed, implementing the gzdeflate algorithm.
  • Flexible: Allows for future extensions without compromising the fundamentals of the specification.
  • Robust: Designed to support full file integrity, including: simple, quick detection of common transmission errors, while also supporting highly optimized, industry standard cryptographic validation algorithms.
  • Acyclic: Graphs which result in circular structures are not supported.
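The acyclic requirement can be checked mechanically. The following is a minimal sketch, assuming the graph has already been reduced to a plain mapping from each node to its list of children; the function name `has_cycle` and the short node labels are hypothetical and are not part of this specification.

```python
def has_cycle(edges: dict) -> bool:
    """Return True if the directed graph `edges` (node -> list of children) has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    state = {}

    def visit(node) -> bool:
        state[node] = GREY
        for child in edges.get(node, []):
            colour = state.get(child, WHITE)
            if colour == GREY:
                # The child is already on the current path: a back edge, so a cycle.
                return True
            if colour == WHITE and visit(child):
                return True
        state[node] = BLACK
        return False

    return any(state.get(node, WHITE) == WHITE and visit(node) for node in list(edges))
```

A graph for which `has_cycle` returns True would fall outside what this specification supports.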

The primary purpose of this specification is to provide the definition of the Signed DAG Format and recommendations for the minimum required behaviour for renderers, parsers and verifiers.

The #Appendix tab provides additional documentation, including rationale for all design decisions.

While the #Appendix tab and its content are not part of the formal specification, it helps implementors to understand the design and intent behind this document. Its other purpose is to provide cross-references to relevant sections of rationale, examples or other supporting material.

Terms

The key words in this document:

  • "MUST"
  • "MUST NOT"
  • "REQUIRED"
  • "SHALL"
  • "SHALL NOT"
  • "SHOULD"
  • "SHOULD NOT"
  • "RECOMMENDED"
  • "MAY"
  • and "OPTIONAL"

are to be interpreted as described in RFC-2119.

Pronunciation

Phonetically, SDF is pronounced: "ɛs-di-ɛf".

SDF32 is a platform-independent Directional Acyclic Graph format, designed to graph cryptographic transforms, encodings, etc. and verify the relationship of those transforms, such that if 'x' is some source string or file and 'f(x)' is some transform applied to the source, then by traversing the graph, the relationship between 'x' and 'f(x)' can be established and independently verified without revealing any underlying data mapped in the graph.

As with SSF32, an SDF32 structure's elements MUST be composed of binary SSF32 entries, which MUST either be tightly packed or padded to 32 Byte boundaries.

When SDF32 files are serialized as JSON, the encoded output consists of:

  • A single root object containing one or more unique keys.
  • Unique keys, each of which MUST consist of an underscore ('_') prefix followed by 64 hexadecimal characters (representing 32 8-bit Bytes) constituting the unique SDAG key of some text or file (if it is signed).
  • Each root key MUST be associated with an array containing one or more other unique keys, which MAY also contain associations to other arrays.
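The key format described above can be checked with a short routine. This is a minimal sketch rather than a reference validator: the names `KEY_PATTERN`, `is_valid_key` and `key_to_bytes` are hypothetical, and accepting uppercase hexadecimal digits is an assumption (the examples in this document use lowercase only).

```python
import re

# Underscore followed by exactly 64 hexadecimal characters (32 bytes).
# Accepting uppercase digits is an assumption, not a stated requirement.
KEY_PATTERN = re.compile(r"_[0-9a-fA-F]{64}")


def is_valid_key(key) -> bool:
    """Return True if `key` has the underscore-prefixed, 64-hex-digit form."""
    return isinstance(key, str) and KEY_PATTERN.fullmatch(key) is not None


def key_to_bytes(key: str) -> bytes:
    """Convert a JSON key into the 32 raw bytes it represents."""
    if not is_valid_key(key):
        raise ValueError(f"not a valid SDF32 JSON key: {key!r}")
    return bytes.fromhex(key[1:])
```

For example, `key_to_bytes("_" + "ab" * 32)` yields 32 bytes, while a key without the underscore prefix is rejected.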

File Header

There is no unique file header currently associated with .sdf32 files.

All .sdf32 encoded files MUST be RFC-2045 Base64 encoded .json files, which additionally adhere to the restrictions outlined in this specification when decoded and parsed.

Structure

Due to the flexibility of this standard, it is up to end users to construct SDAGs in the layout they desire, allowing a wide range of textual semantic relationships to be established.

Flat Layout

In this common layout, a single source text will have all possible transforms aggregated as single leaf nodes which do not have any other children.

This approach is convenient because semantic relationships involving higher-order functions, such as 'y = h(g(f(x)))', which might represent some complex series of transforms, can be expressed as a single-layer SDAG.

{
    "_87a95ba3eb40319184f32195a8c07dd79f4d56fb726bfbf3797406ee9ac9d8aa": [
        "_756191b167221a64ef5e6eabb5801e5ec7e2534cd916ad0d4aebadbe7dded056",
        "_c3b0f93c1751464f8a4f2aa506e797fe854e9f7097205666e3e1c0de96e27359",
        "_c689c8ba9c32d1c3c5420b0bdef0829ae28ef7d6b7d20ccd8b66516d2207f38c",
        "_4bf39b1950d66ee0a216284c4a59908d2d7e23fa2dae09b520cf640b82da0737",
        "_6c8f0f2aa6171af9102d82b38fb7ea1818e4ff1851acdff83cbec226deb597c8",
        "_ab4b6e0994480cae80428a8f733f52140d6b479eb53d7eb8d300438eb90f41b3",
        "_8719f8e14879dffa8159fcaed18055b117d490e749450b03c6c760447c436634",
        "_bddfc595e52ae72d5b6932a677aecc14c441b20c16ae2d5a6ea03da30d2d8558",
        "_532d2cb466fcd57a57f5a7e0ae2fa287e448379b9fcb164425f793e1d8ffd151",
        "_1f7e21479198c4979bb2bb5045a1ecd0ff38f4ed4b0ce6ee0037a3d352ae912c"
    ]
}
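A flat layout like the one above can be assembled programmatically. The sketch below assumes the keys have already been computed elsewhere (how SSF32 keys are derived is outside this example); the function name `build_flat_layout` is hypothetical.

```python
def build_flat_layout(root_key: str, transform_keys: list) -> dict:
    """Return a single-layer SDAG: one root key mapped to leaf keys only."""
    if not transform_keys:
        raise ValueError("a root key needs at least one associated key")
    if len(set(transform_keys)) != len(transform_keys):
        raise ValueError("leaf keys must be unique")
    if root_key in transform_keys:
        raise ValueError("a key cannot be its own child (the graph must be acyclic)")
    # Copy the list so later changes by the caller do not alter the graph.
    return {root_key: list(transform_keys)}
```

The returned dictionary can be passed directly to a JSON serializer to produce the structure shown above.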

Tiered Layout

This common layout is helpful for representing complex relationships such as higher-order functions. A single source text has all possible transforms separated into leaf nodes and branch nodes, which may or may not contain other branches or leaves (collectively known as children), such that each transform in 'y = h(g(f(x)))' adds a new child tier to the SDAG.

{
    "_87a95ba3eb40319184f32195a8c07dd79f4d56fb726bfbf3797406ee9ac9d8aa": [
        "_756191b167221a64ef5e6eabb5801e5ec7e2534cd916ad0d4aebadbe7dded056",
        "_c3b0f93c1751464f8a4f2aa506e797fe854e9f7097205666e3e1c0de96e27359",
        ...
        {
            "_45a51cbc8ea35aaae0249b7ef7659600f6a2cddd4339869d8edd1803cce2c7a2": [
                "_6c4eb5b38fe9563c122f2f7ba0a8ea5f6361f526c537a253e54c8cdab94640e4",
                ...
            ],
            "_c689c8ba9c32d1c3c5420b0bdef0829ae28ef7d6b7d20ccd8b66516d2207f38c": [
                "_182c729b559cd1779325adfb22357b97ef6db95b769d600cfc27cad843c495dc",
                ...
            ],
            ...
        }
    ]
}

See rationale: SDF32 Key Format.
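Establishing that two keys are related amounts to finding a path between them in the graph. The following is a minimal sketch, assuming the tiered JSON has been parsed into nested Python dictionaries, lists and strings as in the example above; the function name `find_path` and the shortened placeholder keys in the usage note are hypothetical.

```python
def find_path(node, target, path=()):
    """Depth-first search; return the tuple of keys leading to `target`, or None."""
    if isinstance(node, str):
        # A leaf: it either is the target or ends this branch.
        return path + (node,) if node == target else None
    if isinstance(node, dict):
        for key, children in node.items():
            if key == target:
                return path + (key,)
            for child in children:
                found = find_path(child, target, path + (key,))
                if found is not None:
                    return found
    return None
```

With `graph = {"_x": ["_a", {"_f": ["_gf"]}]}`, the call `find_path(graph, "_gf")` walks from the root through the `"_f"` branch to the leaf, and a key that is absent from the graph yields `None`.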

After the data structure has been serialized into a native data representation (in this example, JSON), it is 'gzdeflate'ed (to compress the payload), then Base64 encoded (to support transport across multiple internet protocols).
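The serialize, deflate, then Base64 sequence can be sketched as follows. This is an illustration under stated assumptions, not the reference encoder: it assumes 'gzdeflate' denotes a raw DEFLATE stream (as PHP's function of that name produces), and the function name `encode_sdf32`, the compact JSON separators and the compression level are choices made for this example.

```python
import base64
import json
import zlib


def encode_sdf32(graph: dict) -> bytes:
    """Serialize `graph` to JSON, raw-deflate it, then Base64 encode it."""
    payload = json.dumps(graph, separators=(",", ":")).encode("utf-8")
    # wbits=-15 selects a raw DEFLATE stream with no zlib or gzip wrapper,
    # which is assumed here to match the output of 'gzdeflate'.
    compressor = zlib.compressobj(level=9, wbits=-15)
    deflated = compressor.compress(payload) + compressor.flush()
    # encodebytes() emits MIME-style Base64, wrapped at 76 characters per line.
    return base64.encodebytes(deflated)
```

The result is a bytes object that can be written to a ".sdf32" file or sent over a text-safe transport.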

By default, SDF32 keys MUST be binary SSF32, but MAY optionally be encoded as an underscore-prefixed ('_') series of hexadecimal values (for compatibility with the JSON standard and for multiplatform support across Python, C/C++, JavaScript and other common programming languages).

Deflate/Inflate Compression

SDF32 files MUST be compressed using the lossless 'gzdeflate' method, then Base64 encoded in compliance with RFC-2045.

Deflate compression is an LZ77 derivative used in zip, gzip, pkzip, and related programs. Extensive research has been done supporting its patent-free status. Portable C implementations are freely available across multiple platforms.

Decoders MUST first decode the RFC-2045 compliant Base64 file, then decompress the resulting binary payload using 'gzinflate' to recover the JSON data structure.
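The decoding steps can be sketched in the same spirit. As before, this assumes 'gzinflate' denotes inflating a raw DEFLATE stream; the function name `decode_sdf32` and the final shape check are additions for illustration, not requirements taken from this specification.

```python
import base64
import json
import zlib


def decode_sdf32(encoded: bytes) -> dict:
    """Base64 decode, raw-inflate, then parse the JSON payload."""
    deflated = base64.decodebytes(encoded)
    # wbits=-15 expects a raw DEFLATE stream (assumed 'gzinflate' behaviour).
    payload = zlib.decompress(deflated, wbits=-15)
    graph = json.loads(payload.decode("utf-8"))
    # Illustrative sanity check: the root must be a non-empty JSON object.
    if not isinstance(graph, dict) or not graph:
        raise ValueError("decoded payload is not a non-empty JSON object")
    return graph
```

A failure at any step (invalid Base64, a corrupt DEFLATE stream, malformed JSON) raises an exception, which a caller would treat as a corrupt or invalid file.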

File Name Extension

Where systems customarily use a file extension to signify a file's type, the extension ".sdf32" is recommended for SDF32 files.

Lowercase ".sdf32" is preferred if names are case-sensitive.

SDF32 file extensions MUST contain an integer suffix (e.g. ".sdf" is not permitted) denoting the byte format (typically either 32 or 64 bytes) of the SSF encoding used to generate the SDAG's entries.
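The extension rule lends itself to a simple check. The sketch below is illustrative only: the names `EXTENSION_PATTERN` and `ssf_byte_width` are hypothetical, and matching the extension case-insensitively is an assumption made for systems whose file names are not case-sensitive.

```python
import re

# ".sdf" followed by a mandatory integer suffix at the very end of the name.
# Case-insensitive matching is an assumption for this example.
EXTENSION_PATTERN = re.compile(r"\.sdf([0-9]+)\Z", re.IGNORECASE)


def ssf_byte_width(filename: str) -> int:
    """Return the integer suffix of the extension, e.g. 32 for 'graph.sdf32'."""
    match = EXTENSION_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"missing '.sdf<N>' extension with integer suffix: {filename!r}")
    return int(match.group(1))
```

A name such as "graph.sdf" is rejected because it carries no integer suffix.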

Security Considerations

All SDF32 files MUST be non-executable and SHOULD be read-only in production environments.

Production SDF32 files SHOULD be pre-built during a separate build phase and SHOULD be read-only, to prevent runtime modification in production environments.

Use of dynamic, runtime-generated, read-write SDF32 files in production environments is highly discouraged. This is not explicitly a violation of the standard, but implementors must be aware that such behaviour may provide a potential attack vector for hostile actors.
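On POSIX-style systems, the non-executable and read-only recommendations can be applied as a final build step. This is a minimal sketch assuming POSIX permission bits; the function names `lock_down` and `is_locked_down` are hypothetical, and other platforms may need a different mechanism.

```python
import os
import stat

# Any write or execute bit, for owner, group or others.
_FORBIDDEN = (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH |
              stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def lock_down(path: str) -> None:
    """Make `path` read-only and non-executable for everyone (mode 0444)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)


def is_locked_down(path: str) -> bool:
    """Return True if no write or execute permission bit is set on `path`."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & _FORBIDDEN == 0
```

A deployment script might call `lock_down` after the build phase and have the application refuse to load any file for which `is_locked_down` returns False.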

A validator that fails to complete all checks specified in the #Validator tab MUST consider the decoded data corrupt or otherwise invalid.

Encoders

This section provides recommendations for SDF32 encoder behavior.

The only absolute requirement for an SDF32 encoder is that it ... However, best results will usually be achieved by following these recommendations.

MUST:

  • Produce files or strings that conform to this specification.
  • Support SSF32 Packed mode.

MAY:

  • Support SSF32 Padded mode.
  • Perform additional encryption, compression or any other transforms on the resulting output as needed.
  • Include either a signed or unsigned SSF32 header when transmitted over http(s) for content integrity and/or verification purposes.

Decoders

This section provides recommendations for decoder behavior.

The only absolute requirement for an SDF32 decoder is that it successfully reads any file conforming to the format specified in this document. However, best results will usually be achieved by following these recommendations.

MUST:

  • Support Packed mode.
  • Base64 decode, then gzinflate input before parsing the resulting JSON data.

MAY:

  • Support Padded mode.

Glossary of Terms

Byte: Eight bits; also called an octet.

Deflate: The name of the compression algorithm used in standard SDF32 files and PNG files, as well as in zip, gzip, pkzip, and other compression programs. Deflate is a member of the LZ77 family of compression methods.

Lossless compression: Any method of data compression that guarantees the original data can be reconstructed exactly, bit-for-bit.

zlib: A particular format for data that has been compressed using deflate-style compression. Also, the name of a library implementing this method. SDF32 implementations need not use the zlib library, but they must conform to its format for compressed data.

SSF32: An extensible, secure, non-executable binary format for structuring the hash and digital signature in a manner which is resistant to corruption, transferable over common internet protocols as a header, embedded within an HTML webpage, included as an email header/attachment or stored in a database server or blockchain.

SDAG: Signed Directional Acyclic Graph.

SDF32: SDAG Format 32 Byte Specification.

Appendix

This section is not part of the formal SDF32 specification.

This section provides reasoning behind some of the design decisions which went into standardizing SDF32. Many of these decisions were the subject of considerable debate.

Why a new file format?

Example Encoders

The official SDF32 encoder implementations are available here:

Example Decoders

The official SDF32 decoder implementations are available here:

Online Resources

The Official SDF32 PHP reference is available here.

Errata

Change Log:

Credits