JSON Schema

JSON Schema specification for upserting events to your tracking plan via the Data Catalog API.

JSON Schema is a vocabulary for annotating and validating JSON documents.

Follow this guide to standardize and define expectations for your events while upserting them into your tracking plan via the Data Catalog API.

Keywords

Keywords are properties appearing within a JSON schema object.

{
  "title": "Example Schema",
  "type": "object"
}

In the above snippet, title and type are keywords.

Type-specific keywords

The type keyword specifies the data type for a JSON schema. RudderStack supports the following data types, each with its own type-specific keywords:

Strings

The string data type is used to represent strings of text and can contain Unicode characters.

A sample schema definition for a string:

{
  "type": "string"
}

warning

RudderStack restricts usage of the following string keywords:

Integers and numbers

The JSON schema defines two numeric data types:

  • integer: Used for integral numbers.
  • number: Used for any numeric type like integers or floating point numbers.

A sample schema definition for an integer and number:
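As an illustration, assuming the numeric types are declared the same way as the string sample, the following sketch defines one integer and one number property inside an object. The property names quantity and price are hypothetical.

```json
{
  "type": "object",
  "properties": {
    "quantity": {
      "type": "integer"
    },
    "price": {
      "type": "number"
    }
  }
}
```

With this definition, 3 is valid for both properties, while 19.99 is valid only for price.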

warning

RudderStack restricts usage of the following numeric data type keywords:

Objects

You can use JSON objects to map specific keys to values.

While using the Data Catalog API, you can use the rules object to specify the property mappings for the event to be upserted in the tracking plan. A sample rules object is shown:

{
  "identify": {
    "type": "object",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "properties": {
      "anonymousId": {
        "type": "string"
      },
      "userId": {
        "type": "string"
      }
    }
  }
}

RudderStack supports the following keywords within an object:

| Keyword | Data type | Description |
| --- | --- | --- |
| properties | Object | Properties for the event to be upserted in tracking plan. |
| additionalProperties | Boolean | Determines if RudderStack should allow any other properties apart from the ones defined in properties. |
| required | Boolean | Determines if a property is required or optional for the event. |

A sample schema definition for an object:

{
  "type": "object",
  "properties": {
    "number": {
      "type": "number"
    },
    "street_name": {
      "type": "string"
    },
    "street_type": {
      "enum": ["Street", "Avenue", "Boulevard"]
    }
  },
  "additionalProperties": false
}

For the above example, the schema definition validates the following JSON:

{
  "number": 1600,
  "street_name": "Pennsylvania",
  "street_type": "Avenue"
}

The same schema definition rejects the following JSON because it contains an undefined property, direction, while additionalProperties is set to false:

{
  "number": 1600,
  "street_name": "Pennsylvania",
  "street_type": "Avenue",
  "direction": "North-west"
}
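
To also reject payloads that omit a property, JSON Schema draft-07 (the draft referenced by the $schema value in the rules sample) lists the required property names in a required array on the parent object. The following sketch extends the address schema above, assuming the Data Catalog API accepts this draft-07 form:

```json
{
  "type": "object",
  "properties": {
    "number": {
      "type": "number"
    },
    "street_name": {
      "type": "string"
    },
    "street_type": {
      "enum": ["Street", "Avenue", "Boulevard"]
    }
  },
  "required": ["number", "street_name"],
  "additionalProperties": false
}
```

Under this sketch, a payload containing only number fails validation because street_name is missing, while street_type remains optional.
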
warning

RudderStack restricts usage of the following object data type keywords:

Arrays

You can use arrays for ordered elements.

info
RudderStack expects only objects to be present within an array.

There are two ways of validating arrays in JSON Schema:

  • List validation: Sequence of arbitrary length where each item matches the same schema.
  • Tuple validation: Sequence of fixed length where each item can have a different schema.

info
RudderStack supports only list validation and the items keyword to validate the items in the array.

A sample schema definition for an array:

{
  "type": "array"
}
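
Because RudderStack expects objects inside an array and supports list validation through the items keyword, a fuller array definition might look like the sketch below. The property names product_id and quantity are hypothetical.

```json
{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "product_id": {
        "type": "string"
      },
      "quantity": {
        "type": "integer"
      }
    }
  }
}
```

Every element of the array is validated against the single schema under items, which is what distinguishes list validation from tuple validation.
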
Nesting properties

info
This section applies to the object and array data types.

RudderStack supports nested properties within an object or array when you define event properties.

A sample object with nested properties:

{
  "type": "object",
  "properties": {
    "traits": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string"
        },
        "industry": {
          "type": "string"
        },
        "plan": {
          "type": "string"
        }
      }
    }
  }
}

Note that:

  • RudderStack supports up to three levels of nesting within an event property.
  • You can nest properties only within an object or an array.
  • Removing the parent object or array automatically removes all the nested properties.
  • If a property's data type is not explicitly declared, RudderStack allows all data types for that property by default. However, it does not support nesting for that property.
  • You cannot nest properties within a property having both array and object data types.
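
The same pattern applies when an array sits inside an object. As a sketch with hypothetical property names (products, sku, price), the following nests an array of objects under an object property:

```json
{
  "type": "object",
  "properties": {
    "products": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "sku": {
            "type": "string"
          },
          "price": {
            "type": "number"
          }
        }
      }
    }
  }
}
```

Each parent in this sketch declares a single data type, either object or array, in line with the last note above.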

Boolean

The boolean data type supports only two values: true and false. RudderStack does not accept values that merely evaluate to true or false, like 1 and 0.

A sample schema definition for a boolean:

{
  "type": "boolean"
}

Null

The null data type accepts only one value: null.

A sample schema definition for null data type:

{
  "type": "null"
}

Multi data types

In addition to the single data types above, RudderStack supports specifying multiple data types for an event property.

A sample schema definition for multi data types:

{
  "type": ["string", "integer", "boolean", "null"]
}
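
Within an object, a multi-type declaration is attached to an individual property in the same way as a single type. A minimal sketch, with the hypothetical property name coupon, that accepts either a string or null:

```json
{
  "type": "object",
  "properties": {
    "coupon": {
      "type": ["string", "null"]
    }
  }
}
```

Here "SUMMER10" and null are both valid values for coupon, while 10 is not.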

Restricted keyword structures

Apart from the data type-specific keywords, RudderStack also restricts usage of the following keyword structures:

