For shared immutable key–value and time series databases
MultiChain streams enable a blockchain to be used as a general purpose append-only database, with the blockchain providing timestamping, notarization and immutability. A MultiChain blockchain can contain any number of streams, each of which has a name and permissions. If a node chooses to subscribe to a stream, it will index that stream’s contents in real-time to enable efficient retrieval in various ways.
Each stream is an ordered list of items, with the following characteristics:
- One or more publishers who have digitally signed that item.
- One or more keys between 0 and 256 bytes in length, to allow efficient retrieval.
- Some data in JSON, text or binary format, which can be on-chain (embedded in the transaction) or off-chain (represented by a hash in the transaction). Stream filters can be used to define custom validation rules for this data.
- Information about the item’s transaction and block, including its txid, blockhash, blocktime, and so on.
On-chain vs off-chain data
In MultiChain 2.x, the data in a stream item can be published either on-chain or off-chain. On-chain data is embedded directly within the blockchain transaction, meaning that it is received and stored in full by every node in the network, whether that node is subscribed to the stream or not. Still, only subscribed nodes will index the stream’s contents, enabling its items to be retrieved by order, key, publisher and so on.
For off-chain data, only a hash (digital fingerprint) of the data is embedded within the transaction. (If the data is large, it will be broken up into multiple chunks, each of whose hash is embedded.) After receiving a transaction with off-chain data, only nodes which are subscribed to the stream will request that data from other nodes in the network. Once the data is received, it is verified against the hash(es) inside the transaction. This delivery and verification of off-chain data takes place asynchronously, but usually finishes within a split second of the transaction being received. For more technical information on MultiChain’s implementation of off-chain data, please see here.
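The chunk-and-verify flow described above can be sketched in a few lines of Python. This is only an illustration, not MultiChain's implementation: it assumes unsalted items, a 1 MB chunk size, and the double SHA-256 chunk hash listed in the off-chain descriptor table later in this document. The function names are hypothetical, and the byte order in which the hash is stored inside the transaction is left out of scope.

```python
import hashlib

# Assumed chunk size for illustration (1 MB); the real limit comes from
# the maximum-chunk-size blockchain parameter.
MAX_CHUNK_SIZE = 1024 * 1024


def split_chunks(data: bytes, chunk_size: int = MAX_CHUNK_SIZE) -> list[bytes]:
    """Break data into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_hash(chunk: bytes) -> bytes:
    """Double SHA-256 of the chunk contents (unsalted items only)."""
    return hashlib.sha256(hashlib.sha256(chunk).digest()).digest()


def verify_chunks(chunks: list[bytes], expected_hashes: list[bytes]) -> bool:
    """Accept delivered data only if every chunk matches its hash, in order."""
    if len(chunks) != len(expected_hashes):
        return False
    return all(chunk_hash(c) == h for c, h in zip(chunks, expected_hashes))
```

A subscribed node would run something like `verify_chunks` on the data it receives from peers, using the hashes it already holds from the transaction, and discard the data if the check fails.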
For application developers using the MultiChain API, there is almost no distinction between items with on-chain and off-chain data. To publish a stream item with off-chain data, an extra offchain flag is added to the appropriate command. When querying stream items with off-chain data, the API response indicates whether the data is available. Many features in MultiChain Enterprise are based on off-chain data, including read-restricted streams, end-to-end encryption, selective retrieval and data purging.
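As a rough illustration of how small the difference is for API users, the sketch below builds a JSON-RPC request body for the publish command with and without the offchain flag, and checks availability on a returned item. The parameter order, the stream and key names, and the response field names offchain and available are assumptions made for this example and should be checked against the API reference for your MultiChain version.

```python
import json


def publish_request(stream: str, key: str, data_hex: str, offchain: bool = False) -> str:
    """Build a JSON-RPC request body for the publish command (assumed parameter order)."""
    params = [stream, key, data_hex]
    if offchain:
        params.append("offchain")  # extra flag requesting off-chain storage
    return json.dumps({"method": "publish", "params": params, "id": 1})


def data_is_available(item: dict) -> bool:
    """On-chain data arrives with the transaction; off-chain data may still be in transit.

    The 'offchain' and 'available' field names are assumptions for illustration.
    """
    if not item.get("offchain", False):
        return True
    return bool(item.get("available", False))
```

For example, `publish_request("stream1", "key1", b"hello".hex(), offchain=True)` yields a body whose params end with the string "offchain"; everything else is identical to the on-chain case.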
Below is a summary of the main differences between on-chain and off-chain data:
Aspect | On-chain data | Off-chain data |
Data availability | Immediately with transaction | Within a split second of transaction (can be longer if data is large or many off-chain items are queued for retrieval) |
Transaction content | Publisher(s), key(s), data format, full data | Publisher(s), key(s), data format, data size, data hash(es) |
Transaction size | Few hundred bytes + full data size | Few hundred bytes + 37 bytes per 1 MB of off-chain data (using default for maximum-chunk-size blockchain parameter) |
Maximum data size | Up to 64 MB (see max-std-op-return-size blockchain parameter) | Up to 1 GB (see maximum-chunk-size and maximum-chunk-count blockchain parameters) |
Enterprise features | Data feeds, selective indexing | Read restrictions, end-to-end encryption, data feeds, selective indexing, selective retrieval, data purging |
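The figure of 37 bytes per 1 MB follows from the off-chain descriptor layout given later in this document: each chunk contributes a 32-byte hash plus a variable-length integer for its size, and a size of 1 MB needs a 5-byte integer. A small sketch of that arithmetic, assuming a 1 MB chunk size and counting only the per-chunk entries (not the fixed descriptor fields or the rest of the transaction):

```python
HASH_SIZE = 32  # bytes per chunk hash


def varint_size(n: int) -> int:
    """Encoded length of a Bitcoin-style variable-length integer."""
    if n < 0xfd:
        return 1
    if n <= 0xffff:
        return 3
    if n <= 0xffffffff:
        return 5
    return 9


def chunk_overhead(data_size: int, chunk_size: int = 1024 * 1024) -> int:
    """Bytes added to the transaction by the per-chunk size and hash entries."""
    full_chunks, remainder = divmod(data_size, chunk_size)
    total = full_chunks * (varint_size(chunk_size) + HASH_SIZE)
    if remainder:
        total += varint_size(remainder) + HASH_SIZE
    return total
```

Under these assumptions a 10 MB item adds 370 bytes of chunk entries, regardless of what the data contains.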
Referring to streams
Like native assets, each MultiChain stream can be referred to in any of three ways:
- An optional stream name, chosen at the time of stream creation. If used, the name must be unique on a blockchain, between both streams and other created entities. Stream names are stored as UTF-8 encoded strings up to 32 bytes in size and are case insensitive.
- A createtxid, containing the txid of the transaction in which the stream was created.
- A streamref which encodes the block number and byte offset of the stream creation transaction, along with the first two bytes of its txid.
If root-stream-name in the blockchain parameters is a non-empty string, it defines a stream which is created with the blockchain and can be written to immediately. The root stream’s createtxid is the txid of the coinbase of the genesis block, and its streamref is 0-0-0.
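Client code that accepts any of the three reference forms may want to tell them apart. The following is a heuristic sketch, not MultiChain's own logic: it assumes a createtxid is written as 64 hexadecimal characters and a streamref as three dash-separated decimal numbers (as in 0-0-0), and treats everything else as a name. How the third number is derived from the first two bytes of the txid is not specified here, so it is returned as a plain integer.

```python
import re

TXID_RE = re.compile(r"[0-9a-fA-F]{64}")       # 32-byte txid in hex
STREAMREF_RE = re.compile(r"(\d+)-(\d+)-(\d+)")  # block-offset-txidprefix


def classify_stream_reference(ref: str) -> str:
    """Guess which of the three reference forms a string uses."""
    if TXID_RE.fullmatch(ref):
        return "createtxid"
    if STREAMREF_RE.fullmatch(ref):
        return "streamref"
    return "name"


def parse_streamref(ref: str) -> tuple[int, int, int]:
    """Split a streamref into (block number, byte offset, txid prefix value)."""
    match = STREAMREF_RE.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a streamref: {ref!r}")
    block, offset, prefix = (int(group) for group in match.groups())
    return block, offset, prefix
```

Because names are limited to 32 bytes, a name cannot be mistaken for a 64-character txid, but a stream deliberately named like a streamref would be misclassified by this heuristic.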
Permissions in streams
Streams are created by a special transaction output, which must only be signed by addresses which have the create permission (unless anyone-can-create is true in the blockchain parameters). This is easy to do using the create command in multichain-cli or the JSON-RPC API. The stream’s creator automatically receives admin, activate, write and read permissions for that stream, although the read permission is only relevant if the stream is read-protected (requires MultiChain Enterprise). It is not possible to create more than one stream in a single transaction, or to combine stream creation with initial or follow-on asset issuance.
Each stream item is encoded in a single transaction output (see below), which is easy to create using the publish command. A single transaction can write to multiple streams atomically using the publishmulti command or raw transactions. The publishers of a stream item are defined by the addresses used in the inputs which signed the output containing that item. (If an input spends a pay-to-script-hash (P2SH) multisig output, the P2SH address is considered to be the item publisher, independent of the actual public keys used in the input.)
When a stream is created, it can be restricted in a number of ways. In a write-restricted stream, every publisher of a stream item must have write permission for that stream, or the item and its transaction are not valid. In a read-restricted stream, all data is off-chain, and only certain addresses can retrieve that data – this requires both publishing and retrieving nodes to be using MultiChain Enterprise. Streams can also be restricted to only allow on-chain or off-chain data.
The per-stream write and read permissions for an address can be modified in a permissions transaction signed by an address with per-stream admin or activate permissions, while admin or activate permissions can be changed by those with per-stream admin permissions only. This is easy to do using the grant and revoke commands in multichain-cli or the JSON-RPC API.
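The rule for who may change which per-stream permission can be written as a small predicate. This is a toy model of the paragraph above, with a hypothetical function name; it ignores global permissions and any consensus details.

```python
def can_change_permission(signer_permissions: set[str], target: str) -> bool:
    """Return True if a signer holding these per-stream permissions may
    grant or revoke the target per-stream permission."""
    if target in {"write", "read"}:
        # Either admin or activate is enough for write and read.
        return bool(signer_permissions & {"admin", "activate"})
    if target in {"admin", "activate"}:
        # Only admin may change admin or activate.
        return "admin" in signer_permissions
    raise ValueError(f"unknown per-stream permission: {target!r}")
```

An address holding only activate can therefore manage writers and readers but cannot promote anyone to admin or activate.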
For the root stream, the creator of the chain’s genesis block automatically receives all permissions. The root stream is open for general writing if the root-stream-open blockchain parameter is true. Root streams are not restricted in any other way.
Streams in transaction data
For regular use of MultiChain, you can ignore the technical details below. They are only relevant if you want to work with the raw data within MultiChain transactions. Note that you can also use the raw transactions APIs to encode and decode this information.
Stream creation outputs
A transaction output creates a stream if it contains the following, followed by an OP_DROP (0x75) and OP_RETURN (0x6a):
Field | Size | Description |
Identifier | 4 bytes | spkn or 0x73 0x70 0x6b 0x6e |
Type | 1 byte | 0x02 for a stream. |
Repeat the below for each stream property | ||
Property key | Variable | If the first byte of the key is 0x00, it denotes a property with special meaning to MultiChain, and the second byte gives the property type (see table below). If the first byte of the property key is not 0x00, it contains the null-delimited name of a user-defined custom field, e.g. 0x75 0x72 0x6c 0x00 for url (no longer used in MultiChain 2.0). |
Length | 1-9 bytes | Bitcoin-style variable-length integer indicating the length of the property value in bytes. |
Value | Variable | The property’s value as raw binary. |
Below is a list of special property keys used in the above structure (all are optional):
Property key | Property value |
0x00 0x01 | Stream name in UTF-8 encoding. |
0x00 0x04 | Stream open/closed to all writers, where a value of 0x00 means closed and 0x01 means open (used in protocol versions 100xx). |
0x00 0x05 | All custom fields as a JSON object, serialized in UBJSON format (used in protocol versions 200xx). |
0x00 0x06 | Stream read/write restrictions as a 1-byte bitmap, where 0x80 means read-restricted and 0x08 means write-restricted (used in protocol versions 200xx). If a stream is read-restricted, it is recommended to also specify that on-chain data is not allowed and that off-chain items must be salted (see below). This ensures that no data can be embedded directly within transactions (which all nodes see) and prevents dictionary attacks against chunk hashes. |
0x00 0x07 | Item data restrictions as a 1-byte bitmap, where 0x01 means on-chain data is not allowed, 0x02 means off-chain is not allowed, and 0x04 means that all off-chain items must be salted (used in protocol versions 200xx). |
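To make the layout concrete, here is a minimal sketch that assembles the stream creation payload from the two tables above: the spkn identifier, the 0x02 type byte, and a sequence of key, length and value triples. It covers only the payload itself, not the surrounding script with OP_DROP and OP_RETURN. The function and constant names are hypothetical, and the order in which properties are written is a choice made for this example.

```python
# Bitmap values for property 0x00 0x06 (stream restrictions)
READ_RESTRICTED = 0x80
WRITE_RESTRICTED = 0x08
# Bitmap values for property 0x00 0x07 (item data restrictions)
NO_ONCHAIN = 0x01
NO_OFFCHAIN = 0x02
SALTED = 0x04


def encode_varint(n: int) -> bytes:
    """Bitcoin-style variable-length integer (1 to 9 bytes)."""
    if n < 0xfd:
        return bytes([n])
    if n <= 0xffff:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xffffffff:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")


def encode_property(key: bytes, value: bytes) -> bytes:
    """One property: key, then value length as a varint, then the raw value."""
    return key + encode_varint(len(value)) + value


def stream_creation_payload(name: str, restrictions: int = 0,
                            item_restrictions: int = 0) -> bytes:
    """Identifier, type byte, then the special properties that are set."""
    payload = b"spkn" + b"\x02"
    payload += encode_property(b"\x00\x01", name.encode("utf-8"))
    if restrictions:
        payload += encode_property(b"\x00\x06", bytes([restrictions]))
    if item_restrictions:
        payload += encode_property(b"\x00\x07", bytes([item_restrictions]))
    return payload
```

Following the recommendation in the table, a read-restricted stream would combine `READ_RESTRICTED` with `NO_ONCHAIN | SALTED`, giving restriction bytes 0x80 and 0x05.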
Stream item outputs
A transaction output containing an on-chain stream item has the following structure:
stream-identifier OP_DROP [item-key OP_DROP, ...] (item-format OP_DROP) OP_RETURN item-data
A transaction output containing an off-chain stream item has the following structure:
stream-identifier OP_DROP [item-key OP_DROP, ...] offchain-descriptor OP_DROP OP_RETURN
In either case, the stream-identifier has the following structure:
Field | Size | Description |
Prefix | 4 bytes | spke or 0x73 0x70 0x6b 0x65 |
Stream | 16 bytes | First 16 bytes of stream creation txid in reverse order. |
And each item-key has the following structure:
Field | Size | Description |
Prefix | 4 bytes | spkk or 0x73 0x70 0x6b 0x6b |
Key data | Variable | Item key in UTF-8 encoding (can be empty). |
For on-chain items, the optional item-format has the following structure (if it is omitted, the data is assumed to be raw binary):
Field | Size | Description |
Prefix | 4 bytes | spkf or 0x73 0x70 0x6b 0x66 |
Format | 1 byte | 0x01 to denote text data, or 0x02 to denote JSON data. |
For on-chain items, the item-data has no prefix and is embedded directly after the OP_RETURN. If the data format is text, it uses UTF-8 encoding. If the data format is JSON, it is serialized in the UBJSON format.
For off-chain items, the offchain-descriptor has the following structure:
Field | Size | Description |
Prefix | 4 bytes | spkf or 0x73 0x70 0x6b 0x66 |
Format | 1 byte | 0xf0 to denote off-chain data. |
Data format | 1 byte | 0x00 for binary data, 0x01 for text data, or 0x02 for JSON data. |
Salt length | 1 byte | Either 0x00 (unsalted) or a number of bytes between 0x08 and 0x20. |
Chunk count | Variable | Bitcoin-style variable-length integer indicating the number of chunks. |
Repeat the below for each chunk of data | ||
Chunk size | Variable | Bitcoin-style variable-length integer indicating the size of the chunk. |
Chunk hash | 32 bytes | Double SHA-256 hash of chunk contents, with bytes in reverse order. |
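The descriptor layout in the table above can be decoded with a short parser. This is a sketch under the assumption that the input is the raw descriptor bytes (already extracted from the script push); the function names and the returned dictionary keys are hypothetical, and chunk hashes are returned exactly as stored, without reversing their byte order.

```python
def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Read a Bitcoin-style variable-length integer; return (value, next position)."""
    first = buf[pos]
    if first < 0xfd:
        return first, pos + 1
    width = {0xfd: 2, 0xfe: 4, 0xff: 8}[first]
    value = int.from_bytes(buf[pos + 1:pos + 1 + width], "little")
    return value, pos + 1 + width


def parse_offchain_descriptor(buf: bytes) -> dict:
    """Decode prefix, format byte, data format, salt length and chunk list."""
    if len(buf) < 8 or buf[:4] != b"spkf" or buf[4] != 0xf0:
        raise ValueError("not an off-chain descriptor")
    data_format = {0x00: "binary", 0x01: "text", 0x02: "json"}[buf[5]]
    salt_length = buf[6]
    chunk_count, pos = read_varint(buf, 7)
    chunks = []
    for _ in range(chunk_count):
        size, pos = read_varint(buf, pos)
        chunk_hash = buf[pos:pos + 32]
        if len(chunk_hash) != 32:
            raise ValueError("truncated chunk hash")
        chunks.append((size, chunk_hash))
        pos += 32
    return {"data_format": data_format, "salt_length": salt_length, "chunks": chunks}
```

Summing the chunk sizes from the result gives the total size of the off-chain data before any of it has been retrieved.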