Contributor API

The API for managing the contribution of data to a BitBroker instance

The Contributor API is the API which is used for submitting data contributions into the BitBroker catalog. It is tightly connected with the concepts of entity types and their associated data connectors.

It is important that you understand these, and other key concepts, before you begin using the Contributor API.

1 - Contributing Records

How connectors contribute entity instance records to BitBroker

All the data managed by a BitBroker instance enters the system via the Contributor API. The process of contributing such data is documented in detail in this section.

In this section, we will consider the basic use case of contributing entity instance records. Later sections of this documentation will detail how you can contribute live, on-demand data and timeseries data.

Contributing Records to the Catalog

We will assume for the purposes of this section that an entity type and its associated data connector have been created and are present within the system. We will also assume that the connector ID and authorization token, which were obtained when the data connector was created, have been recorded and are available.

Data can now be contributed into the catalog by this data connector, but within the context of its parent entity type only. Hence, we say that a single connector contributes “entity instance records”. If one organization wants to contribute data to multiple entity types, then they must do this via multiple data connectors.

The process of contributing entity instance records into the catalog breaks down into three steps:

  1. Create a data contribution session
  2. Upsert and/or delete records into this session
  3. Close the session

These steps are achieved via an HTTP based API, which we outline in detail below. Each data connector will have a private end-point on this API which is waiting for its contributions.


Sessions are used by the Contributor API to manage inbound data coming from the community of data connectors. Sessions allow the connectors to contribute entity instance records in well-defined ways, which are respectful of the state management of the source data store.

BitBroker supports three types of sessions: stream, accrue and replace. Each one provides for different update and delete contexts.

The three types of session provide for different application logic in the following areas:

  • Whether data is available to consumers whilst the session is still open or only after it is closed.
  • Whether the data provided within a session adds to or replaces earlier data from your connector.

Here is the detail of how each session type functions:

Area                        Stream             Accrue            Replace
Data visibility             as soon as posted  on session close  on session close
Data from previous session  in addition to     in addition to    replaces entirely

Let’s explore each of these in more detail:

Stream Sessions

Stream sessions are likely to be the default mode of operation for most data connectors. Inbound entity instance records arrive in the catalog as soon as they are posted and whilst the session remains open. They are immediately available to consumers to view via the Consumer API.

New records are in addition to existing records in the catalog and removal must be explicitly requested. Closing a stream session is a moot operation, since the session type is essentially an “open pipe” into the catalog. In fact, stream sessions can be opened and left open indefinitely.

Action                Result
open                  session data is already visible, in addition to previous data
close (commit true)   no operation - session data is already visible, in addition to previous data
close (commit false)  no operation - session data is already visible, in addition to previous data

Accrue Sessions

Accrue sessions are useful when entity instance records should only become visible as complete sets. In this scenario, the entity instance records contributed within a session only become visible via the Consumer API when the session is closed, and hence only as a complete set.

New records are in addition to existing records in the catalog and removal must be explicitly requested. When you close an accrue session, you must specify a commit state as true or false. Closing the session with true makes the contributed records visible in the Consumer API, but closing it with false will discard all the records contributed within that session.

Action                Result
open                  session data not visible, but previous data is
close (commit true)   session data now becomes visible, in addition to previous data
close (commit false)  session data is discarded and previous data persists

Replace Sessions

Replace sessions are useful when contributed entity instance records should completely replace the set provided in previous sessions. In this scenario, the entity instance records contributed within a session become visible via the Consumer API, as a complete set, when the session is closed, and all the records contributed in earlier sessions are discarded. Replace sessions are useful when you cannot maintain state about earlier contributions, and hence each contribution is a complete statement of your record set.

New records replace existing records in the catalog and removal of these “old” records is implicit. When you close a replace session, you must specify a commit state as true or false. Closing the session with true makes the contributed records visible in the Consumer API and deletes records from previous sessions. However, closing it with false will discard all the records contributed within that session and previously contributed records will remain untouched.

Action                Result
open                  session data not visible, but previous data is
close (commit true)   session data now becomes visible and replaces all previous data
close (commit false)  session data is discarded and previous data persists

As you can see, picking the right session type is vitally important to ensure you make the best use of the catalog. In general, you should aim to use a stream type session where you can, as this is the simplest.

If you don’t want clients to be able to see intermediate updates in the catalog, then accrue and replace may be better options. Where you don’t want to (or can’t) store any state about what you previously sent to the catalog, then replace is probably the best option.
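The differences between the three session types can be easier to see in a small model. The following is a toy, in-memory sketch of the visibility rules described above; it is not BitBroker code, and the class and method names are invented for illustration.

```python
class CatalogModel:
    """Toy model of stream / accrue / replace semantics (illustrative only)."""

    MODES = ("stream", "accrue", "replace")

    def __init__(self):
        self.visible = {}      # records consumers can currently see
        self._mode = None      # mode of the open session, if any
        self._staged = {}      # upserts held back until the session closes
        self._removed = set()  # deletes held back until the session closes

    def open(self, mode):
        if mode not in self.MODES:
            raise ValueError(f"unknown session mode: {mode!r}")
        self._mode, self._staged, self._removed = mode, {}, set()

    def upsert(self, records):
        if self._mode is None:
            raise RuntimeError("no open session")
        # Stream sessions write straight through; the others stage the data.
        target = self.visible if self._mode == "stream" else self._staged
        for record in records:
            target[record["id"]] = record
            self._removed.discard(record["id"])

    def delete(self, ids):
        if self._mode is None:
            raise RuntimeError("no open session")
        for key in ids:
            if self._mode == "stream":
                self.visible.pop(key, None)
            else:
                self._staged.pop(key, None)
                self._removed.add(key)

    def close(self, commit):
        if self._mode == "accrue" and commit:
            for key in self._removed:
                self.visible.pop(key, None)
            self.visible.update(self._staged)
        elif self._mode == "replace" and commit:
            self.visible = dict(self._staged)
        # Stream sessions, and any close with commit false, change nothing here.
        self._mode, self._staged, self._removed = None, {}, set()
```

Running the same upserts through each mode and inspecting `visible` before and after `close` reproduces the rows of the tables above.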

Using Sessions

There are only three HTTP calls which your data connectors need to make in order to contribute records into the catalog.

Opening a Session

New sessions can be created by issuing an HTTP/GET to the /connector/:cid/session/open/:mode end-point.

In order to open a session, you must know the connector ID (cid). This should have been communicated to you by the coordinator user who created your data connector within BitBroker.

You will also need to select one of the three session modes from stream, accrue and replace. These should be specified in lowercase and without any spaces.

curl http://bbk-contributor:8002/v1/connector/9afcf3235500836c6fcd9e82110dbc05ffbb734b/session/open/stream \
     --include \
     --header "x-bbk-auth-token: your-token-goes-here"
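The same request could be assembled in Python using only the standard library. This is a minimal sketch: the function name `build_open_request` is our own, the base URL is the one from the curl example, and the request is only constructed here, not sent.

```python
import urllib.request

SESSION_MODES = ("stream", "accrue", "replace")


def build_open_request(base_url, cid, mode, token):
    """Build (but do not send) the request which opens a session."""
    if mode not in SESSION_MODES:
        raise ValueError(f"mode must be one of {SESSION_MODES}, got {mode!r}")
    url = f"{base_url}/connector/{cid}/session/open/{mode}"
    return urllib.request.Request(
        url, method="GET", headers={"x-bbk-auth-token": token}
    )


request = build_open_request(
    "http://bbk-contributor:8002/v1",
    "9afcf3235500836c6fcd9e82110dbc05ffbb734b",
    "stream",
    "your-token-goes-here",
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it. Checking the mode before sending catches misspellings locally rather than relying on an error from the server.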

This will result in a response as follows:

HTTP/1.1 200 OK

The body of this response will contain a session ID (sid), which should be recorded as it will be needed for subsequent API calls. For example:


Posting Records in a Session

Once you have an open session, you can post two types of actions to it in order to manipulate your catalog entries.

  • upsert to update or insert a record into the catalog
  • delete to remove an existing record from the catalog

Entity instance records can be upserted or deleted by issuing an HTTP/POST to the /connector/:cid/session/:sid/:action end-point.

In order to post record actions, you must know the connector ID (cid). This should have been communicated to you by the coordinator user who created your data connector within BitBroker. You must also know the session ID (sid), which was returned in the previous step where a session was opened.

Finally, you will also need to select one of the two valid actions from upsert and delete. These should be specified in lowercase and without any spaces.

curl http://bbk-contributor:8002/v1/connector/9afcf3235500836c6fcd9e82110dbc05ffbb734b/session/4527eff4-d9cf-41c0-9ecc-8e06b57fcf54/upsert \
     --request POST \
     --include \
     --header "Content-Type: application/json" \
     --header "x-bbk-auth-token: your-token-goes-here" \
     --data-binary @- << EOF
     [ ]
EOF

In the example above, we upsert an empty array, which is obviously not useful. Let’s now look in detail at how records are inserted, updated and deleted using this API call.
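For comparison, here is a hedged Python sketch of how the same POST might be assembled with the standard library. The helper name `build_action_request` is hypothetical, and the request is constructed but not sent.

```python
import json
import urllib.request

ACTIONS = ("upsert", "delete")


def build_action_request(base_url, cid, sid, action, payload, token):
    """Build (but do not send) an upsert or delete request for an open session."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {ACTIONS}, got {action!r}")
    url = f"{base_url}/connector/{cid}/session/{sid}/{action}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),  # JSON array as the post body
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-bbk-auth-token": token,
        },
    )


request = build_action_request(
    "http://bbk-contributor:8002/v1",
    "9afcf3235500836c6fcd9e82110dbc05ffbb734b",
    "4527eff4-d9cf-41c0-9ecc-8e06b57fcf54",
    "upsert",
    [],
    "your-token-goes-here",
)
print(request.get_method(), request.full_url)
```

The same helper serves both actions, since they differ only in the final path segment and in the shape of the array being posted.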

Upserting records

When you post an upsert request, you should include an array of entity instances in JSON format within your post body. Each record can contain the following attributes:

Attribute  Necessity  Validation Rules
id         required   String between 1 and 64 characters long
name       required   String between 1 and 64 characters long
entity     required   An object conforming to the entity schema for this entity type
instance   optional   An object containing other, ancillary information
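A connector may wish to check records against these rules before posting them. The sketch below assumes the four attributes are id, name, entity and instance, as used in the example records in this section, and that instance may be omitted. It does not check entity against the entity type's JSON schema, which would need a schema validation library.

```python
def validate_record(record):
    """Return a list of problems found in one record (empty when it looks valid)."""
    problems = []
    for key in ("id", "name"):
        value = record.get(key)
        if not isinstance(value, str) or not 1 <= len(value) <= 64:
            problems.append(f"{key}: must be a string between 1 and 64 characters long")
    if not isinstance(record.get("entity"), dict):
        problems.append("entity: must be an object")
    # Assumption: instance is optional, but must be an object when supplied.
    if "instance" in record and not isinstance(record["instance"], dict):
        problems.append("instance: must be an object when present")
    return problems


print(validate_record({"id": "GB", "name": "United Kingdom", "entity": {}}))
```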

It is important to understand the difference between the three classes of attributes which can be present within each entity instance record:

Global Attributes

These attributes are required to be present for every entity instance in the system, regardless of its entity type. This set consists of only these attributes:

Attribute  Description
id         Your domain key for this entity instance
name       A human-readable name describing this entity instance

Entity Attributes

These attributes are required to be present for every entity instance of a given entity type. This set of attributes will have been communicated to you by the coordinator user who created your connector within BitBroker. It will be presented in the form of a JSON schema.

Instance Attributes

These attributes only exist for a given entity instance in the system. This is a free format object which can be used to store additional or ancillary information.

This simple hierarchy of three classes (global, entity and instance) is designed to give consumers maximum assurance about which data can be expected to be available to them:

  • They can always expect to find the global data present
  • They have firm expectations about data availability within an entity type
  • They understand that instance data is ad-hoc and cannot be relied upon

Here is the post body for an example upsert request for a set of three records:

[
    {
        "id": "GB",
        "name": "United Kingdom",
        "entity": {
            "area": 242900,
            "calling_code": 44,
            "capital": "London",
            "code": "GB",
            "continent": "Europe",
            "currency": {
                "code": "GBP",
                "name": "Sterling"
            },
            "population": 66040229
        }
    },
    {
        "id": "IN",
        "name": "India",
        "entity": {
            "area": 3287263,
            "calling_code": 91,
            "capital": "New Delhi",
            "code": "IN",
            "continent": "Asia",
            "currency": {
                "code": "INR",
                "name": "Indian Rupee"
            },
            "population": 1344860000
        },
        "instance": {
            "independence": 1947
        }
    },
    {
        "id": "BR",
        "name": "Brazil",
        "entity": {
            "area": 8547403,
            "calling_code": 55,
            "capital": "Brasilia",
            "code": "BR",
            "continent": "South America",
            "currency": {
                "code": "BRL",
                "name": "Brazilian Real"
            },
            "population": 209659000
        },
        "instance": {}
    }
]

Whenever records are upserted into the catalog, it will return a report to the caller with information about how each posted record was processed. For example, for the three records above, you might get a report such as:

{
    "GB": "5ebb30afaa6ce33843b00bbff63f63b90e91028c",
    "IN": "917d0311c687e5ffb28c91a9ea57cd3a306890d0",
    "BR": "d5fa7d9d8e4625399da7771fc0e3e87886f2a5ac"
}

In the report, you will see a row for every record that was posted, alongside the BitBroker key which is being used for this entity instance. This is the key which consumers will use in order to retrieve this record via the Consumer API.
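A connector might use this report to confirm that every posted record was acknowledged. The following sketch assumes the report arrives as a JSON object mapping your domain keys to BitBroker keys, as in the example above; the helper name `unreported_ids` is our own.

```python
def unreported_ids(records, report):
    """Return the domain keys which were posted but have no row in the report."""
    return [record["id"] for record in records if record["id"] not in report]


posted = [{"id": "GB"}, {"id": "IN"}, {"id": "BR"}]
report = {
    "GB": "5ebb30afaa6ce33843b00bbff63f63b90e91028c",
    "IN": "917d0311c687e5ffb28c91a9ea57cd3a306890d0",
    "BR": "d5fa7d9d8e4625399da7771fc0e3e87886f2a5ac",
}
print(unreported_ids(posted, report))
```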

Deleting records

When deleting records from the catalog, you need to simply post an array of your domain keys for the records to be removed. These should be the same domain keys you specified when you upserted the records. For example, to remove two of the records upserted in the previous step, the post body would need to be:

[ "GB", "BR" ]

Whenever records are deleted from the catalog, it will return a report to the caller with information about how each posted ID was processed. For example, for the two IDs above, you might get a report such as:

{
    "GB": "5ebb30afaa6ce33843b00bbff63f63b90e91028c",
    "BR": "d5fa7d9d8e4625399da7771fc0e3e87886f2a5ac"
}

In the report, you will see a row for every ID that was posted, alongside the BitBroker key which was being used for this (now removed) entity instance. This is the key which consumers will have used in order to retrieve this record via the Consumer API.
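In stream and accrue sessions, where removal must be explicitly requested, a connector which remembers the domain keys it sent previously could derive the delete body by comparison with its current record set. This is only one possible approach, sketched under that assumption; the function name `plan_actions` is hypothetical.

```python
def plan_actions(previous_ids, current_records):
    """Return (upsert body, delete body) given the keys sent in earlier sessions."""
    current_ids = {record["id"] for record in current_records}
    # Anything sent before which no longer exists at source must be deleted.
    deletes = sorted(set(previous_ids) - current_ids)
    return list(current_records), deletes


upserts, deletes = plan_actions({"GB", "IN", "BR"}, [{"id": "IN"}])
print(deletes)
```

A connector which cannot keep such state may be better served by a replace session, as discussed earlier.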

Closing a Session

After entity instance records have been posted, you can close a session by issuing an HTTP/GET to the /connector/:cid/session/:sid/close/:commit end-point.

In order to close a session, you must know the connector ID (cid). This should have been communicated to you by the coordinator user who created your data connector within BitBroker. You must also know the session ID (sid), which was returned in the earlier step where a session was opened.

Finally, you will also need to select one of the two valid commits from true and false. These should be specified in lowercase and without any spaces.

curl http://bbk-contributor:8002/v1/connector/9afcf3235500836c6fcd9e82110dbc05ffbb734b/session/4527eff4-d9cf-41c0-9ecc-8e06b57fcf54/close/true \
     --include \
     --header "x-bbk-auth-token: your-token-goes-here"

This will result in a response as follows:

HTTP/1.1 200 OK

The exact mechanics of closing a session depend on the type of session that was specified when it was opened. This was covered in detail in the earlier section on session types.
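Putting the three calls together, the open, post and close sequence might be orchestrated as sketched below. This is illustrative only: `send` is a hypothetical callable which you would supply to perform the HTTP call and return the parsed response body, so that the open call yields the session ID. The `batch_size` parameter is also our own invention, since no limit on post size is stated here.

```python
def contribute(send, cid, mode, upserts, deletes=(), commit=True, batch_size=250):
    """Open a session, post upserts and deletes, then close it. Returns the sid."""
    sid = send("GET", f"/connector/{cid}/session/open/{mode}", None)
    base = f"/connector/{cid}/session/{sid}"
    try:
        # Post upserts in batches to keep each request body a modest size.
        for start in range(0, len(upserts), batch_size):
            send("POST", f"{base}/upsert", upserts[start:start + batch_size])
        if deletes:
            send("POST", f"{base}/delete", list(deletes))
    except Exception:
        # On failure, close with commit false so staged records are discarded.
        send("GET", f"{base}/close/false", None)
        raise
    send("GET", f"{base}/close/{'true' if commit else 'false'}", None)
    return sid
```

Note that closing with false only discards data for accrue and replace sessions; in a stream session, records posted before the failure are already visible.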

2 - Hosting a Webhook

How to use webhooks to incorporate live and on-demand data

It is an expectation that the BitBroker catalog contains information which is useful to enable search and discovery of entity instances. Hence, it contains key metadata - but it does not normally contain actual entity data. This is pulled on-demand via a webhook hosted by the data connector who contributed the entity record.

The distinction between data and metadata is covered in more detail in the key concepts documentation. Depending on how data and metadata are balanced in a BitBroker instance, there may or may not be a requirement to host a webhook.

In this section, we will outline how to implement a webhook within a data connector.

Registering your Webhook

The first step is to register your webhook with BitBroker. This is done when the connector is created or can be done later by updating the connector. These actions are part of the Coordinator API and hence can only be performed by a coordinator user on your behalf.

Your webhook should be an HTTP server which is capable of receiving calls from the BitBroker instance. You can host this server in any manner you like; however, the coordinator of your BitBroker instance may have their own hosting and security requirements for it.

You need to maintain your webhook so that it is always available to its connected BitBroker instance. If your webhook is down or inaccessible when BitBroker needs it, this will result in a poor experience for consumers using the Consumer API. In this scenario, they will only see partial records. Information about misbehaving data connectors will be available to coordinator users.

Required End-points

You are required to implement two end-points as part of your webhook deployment.

Entity End-point

The entity end-point is used by BitBroker to get a full data record for an entity instance which you previously submitted into the catalog.

The entity end-point has the following signature:

   HTTP/GET /entity/:type/:id


Attribute  Presence  Description
type       always    The entity type ID, for this entity instance
id         always    Your own domain key, which you previously submitted into the catalog

The entity type is presented here to allow for scenarios where one webhook is servicing the needs of multiple data connectors.
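As an illustration, an entity end-point could be sketched with Python's standard library as follows. The in-memory `LIVE_DATA` store, the 404 response for unknown records and the port number are all assumptions made for the sake of the example; the handler simply looks up live data by entity type and domain key and returns it as a JSON object holding entity and instance attributes.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical live data store, keyed by (entity type ID, domain key).
LIVE_DATA = {
    ("country", "GB"): {
        "entity": {"inflation": 4.3},
        "instance": {"temperature": 18.8},
    },
}


def entity_response(path):
    """Map a request path to an (HTTP status, response body) pair."""
    segments = path.split("?")[0].strip("/").split("/")
    if len(segments) != 3 or segments[0] != "entity":
        return 404, {}
    live = LIVE_DATA.get((segments[1], segments[2]))
    if live is None:
        return 404, {}  # assumption: unknown records are reported as 404
    return 200, live


class WebhookHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = entity_response(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the webhook until interrupted; the port is an arbitrary choice."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the server. Keeping the lookup in a plain function, separate from the handler class, makes it straightforward to test without opening a socket.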

In response to this call, you should return a JSON object consisting of an entity and instance attribute only - all other attributes will be ignored. The object you return will be merged with the catalog record, which you provided earlier. Hence, there is no need to resupply the catalog information you have already submitted in previous steps.
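The exact merge rules are not spelled out here, so the following is only a sketch of one plausible reading: a shallow merge of the entity and instance objects, in which values from the webhook win on a key clash and any other attributes in the webhook response are ignored. The function name `merge_record` is our own.

```python
def merge_record(catalog_record, webhook_response):
    """Combine a catalog record with the live data returned by a webhook."""
    merged = dict(catalog_record)
    for section in ("entity", "instance"):
        merged[section] = {
            **catalog_record.get(section, {}),
            # Assumption: webhook values take precedence on a key clash.
            **webhook_response.get(section, {}),
        }
    return merged


catalog = {
    "id": "GB",
    "name": "United Kingdom",
    "type": "country",
    "entity": {"capital": "London", "population": 66040229},
    "instance": {"independence": 1066},
}
live = {"entity": {"inflation": 4.3}, "instance": {"temperature": 18.8}}
print(merge_record(catalog, live))
```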

For example, consider this (previously submitted) catalog record:

{
    "id": "GB",
    "name": "United Kingdom",
    "type": "country",
    "entity": {
        "area": 242900,
        "calling_code": 44,
        "capital": "London",
        "code": "GB",
        "continent": "Europe",
        "currency": {
            "code": "GBP",
            "name": "Sterling"
        },
        "population": 66040229
    },
    "instance": {
        "independence": 1066
    }
}

If a call for the detail of this record is made on the Consumer API, the system will call back on the entity end-point as follows:

HTTP/GET /entity/country/GB

Then the webhook should respond with any extra / live / on-demand entity and instance data:

{
    "entity": {
        "inflation": 4.3
    },
    "instance": {
        "temperature": 18.8
    }
}

The system will then merge this live information with the catalog record to send a combined record to the consumer.

{
    "id": "GB",
    "name": "United Kingdom",
    "type": "country",
    "entity": {
        "area": 242900,
        "calling_code": 44,
        "capital": "London",
        "code": "GB",
        "continent": "Europe",
        "currency": {
            "code": "GBP",
            "name": "Sterling"
        },
        "population": 66040229,
        "inflation": 4.3  // this has been merged in
    },
    "instance": {
        "independence": 1066,
        "temperature": 18.8  // this has been merged in
    }
}

Timeseries End-point

The timeseries end-point is used by BitBroker to get timeseries information associated with an entity instance previously submitted into the catalog.

Not all entity types will have timeseries associated with them. When they do, this callback is vital, since no timeseries data points are held within the catalog itself. Only the existence of timeseries and key metadata about them is stored.

The timeseries end-point has the following signature:

HTTP/GET /timeseries/:type/:id/:tsid?start=:start&end=:end&limit=:limit


Attribute  Presence   Description
type       always     The entity type ID, for this entity instance
id         always     Your own domain key, which you previously submitted into the catalog
tsid       always     The ID of the timeseries associated with this entity instance
start      sometimes  The earliest timeseries data point being requested
                      When present, an ISO 8601 formatted date
end        sometimes  The latest timeseries data point being requested
                      When present, an ISO 8601 formatted date
limit      always     The maximum number of timeseries points to return
                      An integer greater than zero
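A webhook needs to pick these attributes out of the incoming request. Here is a minimal sketch using Python's standard library; the function name `parse_timeseries_request` and the timeseries ID `population` in the usage line are hypothetical. Note that `datetime.fromisoformat` only accepts a subset of ISO 8601 on older Python versions, so a production webhook may need a more tolerant parser.

```python
from datetime import datetime
from urllib.parse import parse_qs, urlsplit


def parse_timeseries_request(url):
    """Split a timeseries callback URL into its path and query attributes."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if len(segments) != 4 or segments[0] != "timeseries":
        raise ValueError(f"not a timeseries callback: {url!r}")
    query = parse_qs(parts.query)

    def first(name):
        values = query.get(name)
        return values[0] if values else None

    start, end, limit = first("start"), first("end"), first("limit")
    if limit is not None and int(limit) <= 0:
        raise ValueError("limit must be an integer greater than zero")
    return {
        "type": segments[1],
        "id": segments[2],
        "tsid": segments[3],
        "start": datetime.fromisoformat(start) if start else None,
        "end": datetime.fromisoformat(end) if end else None,
        "limit": int(limit) if limit is not None else None,
    }


parsed = parse_timeseries_request(
    "/timeseries/country/GB/population?start=2020-01-01&end=2021-01-01&limit=10"
)
print(parsed["tsid"], parsed["limit"])
```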

Further information about the possible URL parameters supplied with this callback is as follows:

Attribute  Information
start      Should be treated as inclusive of the range being requested
           When not supplied, assume a start from the latest timeseries point
end        Should be treated as exclusive of the range being requested
           When present, this will always be after the start
           Never present without start also being present
           When not supplied, defer to the limit count
limit      Takes precedence over the start and end range
           The end may not be reached, if limit is breached first
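These rules might be applied to a list of stored points as sketched below. Two readings here are our own assumptions: that the range is tested against each point's from value, and that a missing start means the most recent points should be returned, up to the limit. The points are assumed to be sorted in ascending order of from, and the function name `select_points` is hypothetical.

```python
def select_points(points, start=None, end=None, limit=None):
    """Pick the points to return for a timeseries callback."""
    if start is None:
        # Assumption: with no start, return the latest points, capped by limit.
        return points[-limit:] if limit else list(points)
    selected = [
        point for point in points
        # start is inclusive and end is exclusive.
        if point["from"] >= start and (end is None or point["from"] < end)
    ]
    # limit takes precedence, so the end of the range may not be reached.
    return selected[:limit] if limit else selected


points = [{"from": year, "to": year + 1, "value": year * 2} for year in range(1910, 1915)]
print([p["from"] for p in select_points(points, start=1911, end=1913)])
```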

Then the webhook should respond with timeseries data points as follows:

[
    {
        "from": 1910,
        "to": 1911,
        "value": 5231
    },
    {
        "from": 1911,
        "to": 1912,
        "value": 6253
    }
    // other timeseries points here
]


Attribute  Necessity  Description
from       required   An ISO 8601 formatted date
to         optional   When present, an ISO 8601 formatted date
value      required   A valid JSON data type or object

Specifying both from and to is rare - in most cases, only a from will be present. You can place any data type which makes sense for your timeseries in the value attribute. But this should be consistent across all the timeseries points you return.