JSON Import and Export

This how-to guide demonstrates how to import and export JSON data using Rel.

Download this guide as a RAI notebook by clicking here.

Goal

The goal of this how-to guide is to familiarize the reader with Rel’s importing and exporting functionalities for data in the JSON format. After reading this guide you should expect to know how to import and export files of JSON data using the different available options.

You may also find the CSV Import and CSV Export how-to guides relevant since they cover how to import and export other types of data.

Relational Representation of JSON

When representing JSON data relationally, we can typically think of it as a tree. For example, consider the following JSON data for a person:

{
  "first_name": "John",
  "last_name": "Smith",
  "address": { "city": "Seattle",
               "state": "WA" },
  "phone": [
    { "type": "home",
      "number": "206-456" },
    { "type": "work",
      "number": "206-123" }
  ]
}

We can view this data as a tree:

JSON Tree

In Rel, all data is expressed as relations (IDB or EDB), preferably in the highly normalized Graph Normal Form.

Defining the previous JSON data directly in Rel (not importing it) looks as follows:

query
def json[:first_name] = "John"
def json[:last_name] = "Smith"
def json[:address, :city] = "Seattle"
def json[:address, :state] = "WA"
def json[:phone, :[], 1, :type] = "home"
def json[:phone, :[], 1, :number] = "206-456"
def json[:phone, :[], 2, :type] = "work"
def json[:phone, :[], 2, :number] = "206-123"

def output = json

Relation: output

:address:city"Seattle"
:address:state"WA"
:first_name"John"
:last_name"Smith"
:phone:[]1:number"206-456"
:phone:[]1:type"home"
:phone:[]2:number"206-123"
:phone:[]2:type"work"

In Rel, the JSON keys turn into relation names (starting with :). Arrays are relations as well, using :[] as the relation name. Positional information within an array is indicated explicitly with an integer argument, as shown in the last four lines above. We discuss this in more detail in a later section about Exporting Arrays.

Importing JSON Data

Our KGMS supports importing JSON data natively through the use of load_json.

Let’s consider the following simple JSON file as our first example. This example has a flat structure (i.e., no nesting of data, no objects, no arrays) with all the primitive data types that we discussed earlier:

{
    "integer" : 123,
    "float" : 10.12,
    "scientific" : 10E+02,
    "string" : "hello world",
    "mynull" : null
}

file: simple.json

We can now import this simple file within the json relation as follows:

query
def config[:path] = "s3://relationalai-documentation-public/csv-import/simple.json"
def output = load_json[config]

Relation: output

:float10.12
:integer123
:mynullmissing
:scientific1000
:string"hello world"

As the previous example shows, some type conversions happen automatically when we load JSON data. For example, the null value is converted to missing, while the scientific notation was converted to a Float64.

Of course more elaborate examples of JSON data are also supported. Here is an example that has nested data and arrays:

{
  "first_name": "John",
  "last_name": "Smith",
  "address": { "city": "Seattle",
               "state": "WA" },
  "phone": [
    { "type": "home",
      "number": 206456 },
    { "type": "work",
      "number": 206-123 }
  ]
}

file: person.json

that we can also read as follows:

query
def config[:path] = "s3://relationalai-documentation-public/csv-import/person.json"
def output = load_json[config]

Relation: output

:address:city"Seattle"
:address:state"WA"
:first_name"John"
:last_name"Smith"
:phone:[]1:number206456
:phone:[]1:type"home"
:phone:[]2:number206123
:phone:[]2:type"work"

We can observe that phone is now an array, while the nested data (e.g., city within address) is represented with the additional column names within the json relation.

Importing Local JSON Files

We can read JSON data from a string instead of a file as follows:

query
def config[:data] = """
{"name": "John", "phone": [{"no.":"1234"}, {"no.":"4567"}]}
"""
def output = load_json[config]

Relation: output

:name"John"
:phone:[]1:no."1234"
:phone:[]2:no."4567"

Combined with the RAI notebook upload functionality, we can use this to import local JSON files into Rel:

  1. Upload the file as a string to a named relation, say, mydata.

  2. Use def config:data = mydata, and set other configuration options as needed.

  3. Do load_json[config].

Importing Options

When we use load_json we can specify the following parameters in the config relation:

parameterdescription
pathThe URI where we would like to load the data from. Please note that the Rel server needs to have access to this URI given the integration options set in the config.
integrationThis parameter allows for importing from the cloud. In this case we will also have to specify the :provider (e.g., “azure”) as well as the :credentials to be used for access to the cloud (e.g., a sas_token).

As an example of this last integration parameter, we can import JSON data from the cloud through the following example that imports data from Azure into the json relation:

def config[:path] = "https://myaccount.blob.core.windows.net/sascontainer/myfile.json"

def config[:integration, :provider] = "azure"
def config[:integration, :credentials, :azure_sas_token] = "sv=2014-02-14&sr=b&si=TestPolicy&sig=o%2B5%2F0C%2BLm7tWWftNKvQEGKHlSt%2Bfs8No7FZkUk5T%2Bv0%3D"

def json = load_json[config]

Exporting JSON Data

We will begin by looking into how to export JSON data. Similar to exporting CSV data, this is done using export_json. We will start with a simple example for exporting:

def json[:name] = "John"
def json[:age] = 12
def config:path = cloud_URI
def config:data = json

def export = export_json[config]

Similar to when we export to CSV files, we must specify the path and data parameters in config. Also, please note (again similar to exporting CSV data) that the configuration should include credentials that give write access to the URI where we want to write the data (cloud_URI).

The contents of the output file should be:

{"name":"John","age":12}

Export Options

We did manage to export the data, however, they are not nicely formatted. We can do this through the indent parameter that we can pass to export_json. For example, the following code:

def json[:name] = "John"
def json[:age] = 12

def export = export_json[(:path, cloud_URI); (:data, json); (:indent, 4)]

tells the system that we would like to use 4 spaces when formatting the output data. When we check the output file, the content will now be:

{
    "name": "John",
    "age": 12
}

When we use export_json we can specify the following parameters, all of which are optional:

parameterdescription
pathThis is the URI where we would like to write the output file. Please note that the Rel server needs to have write access to this URI given the integration options provided in the config. This parameter is optional. If it is not specified, no output file is written.
dataThis parameter provides the relation that we would like to output to the file. This parameter is optional. If it is not specified, the output file will just contain the word null.
indentThis parameter indicates the amount of indentation that we would like to use when writing the output data. This parameter is also optional. By default, if indent is not specified, all of the data is written in a single line to the output.
integrationThis parameter allows for exporting to the cloud. In this case we will also have to specify the :provider (e.g., “azure”) as well as the :credentials to be used for access to the cloud (e.g., a sas_token).

As an example of this last integration parameter, we can export JSON data to the cloud as follows:

def json[:name] = "John"
def json[:age] = 12

def integration[:provider] = "azure"
def integration[:credentials, :azure_sas_token] = "sv=2014-02-14&sr=b&si=TestPolicy&sig=o%2B5%2F0C%2BLm7tWWftNKvQEGKHlSt%2Bfs8No7FZkUk5T%2Bv0%3D"

def azure_url = "https://myaccount.blob.core.windows.net/sascontainer/myfile.json"

def export = export_json[
(:path, azure_url);
(:data, json);
(:indent, 4);
(:integration, integration)
]

Exporting Nested Data

We just explored exporting some data in the JSON format which were essentially all at the same level (i.e., there was no nesting of data). We will now examine how to export data that is nested in a more elaborate way. For example, consider the following data, where we have an object inside another object. More specifically, we have a car object in addition to the top-level person object.

def json[:name] = "John"
def json[:age] = 35
def json[:car, :brand] = "Mazda"
def json[:car, :color] = "red"

def export = export_json[(:path, cloud_URI; :data, json; :indent, 4)]

If we take a look at the output file, we see the following:

{
    "name": "John",
    "car": {
        "brand": "Mazda",
        "color": "red"
    },
    "age": 35
}

we observe that the car object is properly outputted. Of course, we can have more objects and additional nesting. Here is a more elaborate example:

def json[:name] = "John"
def json[:age] = 35

def json[:car1, :brand] = "Mazda"
def json[:car1, :color] = "red"
def json[:car1, :engine, :size] = 2000
def json[:car1, :engine, :cylinders, :setup] = "inline"
def json[:car1, :engine, :cylinders, :number] = 6

def json[:car2, :brand] = "Jaguar"
def json[:car2, :color] = "green"
def json[:car2, :engine, :size] = 3000
def json[:car2, :engine, :cylinders, :setup] = "V"
def json[:car2, :engine, :cylinders, :number] = 8

def export = export_json[(:path, cloud_URI; :data, json; :indent, 4)]

giving us the output:

{
    "name": "John",
    "car1": {
        "brand": "Mazda",
        "color": "red",
        "engine": {
            "size": 2000,
            "cylinders": {
                "number": 6,
                "setup": "inline"
            }
        }
    },
    "age": 35,
    "car2": {
        "brand": "Jaguar",
        "color": "green",
        "engine": {
            "size": 3000,
            "cylinders": {
                "number": 8,
                "setup": "V"
            }
        }
    }
}

Exporting Arrays

In this last example, we exported two cars, using two different keys, car1 and car2. JSON, however, supports arrays for this purpose. We will now examine how to export arrays. Instead of car1 and car2 we will have an array called cars containing the two objects:

def json[:name] = "John"
def json[:age] = 35

// note the :[] marker here
def json[:cars, :[], 1, :brand] = "Mazda"
def json[:cars, :[], 1, :color] = "red"
def json[:cars, :[], 1, :engine, :size] = 2000
def json[:cars, :[], 1, :engine, :cylinders, :setup] = "inline"
def json[:cars, :[], 1, :engine, :cylinders, :number] = 6

def json[:cars, :[], 2, :brand] = "Jaguar"
def json[:cars, :[], 2, :color] = "green"
def json[:cars, :[], 2, :engine, :size] = 3000
def json[:cars, :[], 2, :engine, :cylinders, :setup] = "V"
def json[:cars, :[], 2, :engine, :cylinders, :number] = 8

def export = export_json[(:path, cloud_URI; :data, json; :indent, 4)]

The :[] marker might seem redundant, but is needed to output the JSON data properly as an array. Executing the above code, we obtain the output:

{
    "name": "John",
    "cars": [
        {
            "brand": "Mazda",
            "color": "red",
            "engine": {
                "size": 2000,
                "cylinders": {
                    "number": 6,
                    "setup": "inline"
                }
            }
        },
        {
            "brand": "Jaguar",
            "color": "green",
            "engine": {
                "size": 3000,
                "cylinders": {
                    "number": 8,
                    "setup": "V"
                }
            }
        }
    ],
    "age": 35
}

Of course, we can have arrays inside arrays in JSON and Rel also supports that. Here is a simple example:

def json[:arr, :[], 1, :arr1, :[], 1] = 11
def json[:arr, :[], 1, :arr1, :[], 2] = 12

def json[:arr, :[], 2, :arr2, :[], 1] = 21
def json[:arr, :[], 2, :arr2, :[], 2] = 22

def export = export_json[(:path, cloud_URI; :data, json; :indent, 4)]

giving us the output containing two arrays within an array:

{
    "arr": [
        {
            "arr1": [
                11,
                12
            ]
        },
        {
            "arr2": [
                21,
                22
            ]
        }
    ]
}

JSON Data Format

JSON (JavaScript Object Notation) is a popular open standard file and data exchange format. Many applications use JSON to exchange data. JSON is a data format that is also human-readable. Its structure can optionally follow a strictly enforced schema (JSON schema) defining the format and structure of the data.

JSON supports the following data types:

Data TypeDescription
nullAn empty value. The actual word null is used to denote this.
NumberA signed number (integer or floating point) which can also be expressed in scientific notation (e.g., 3.2E+2).
StringA sequence of Unicode characters which can also be empty.
Booleantakes the obvious values of true and false. Please note that Boolean types are not (officially) supported by Rel at this point.
ArrayA list of zero or more values of either primitive types i.e., Number, String, Boolean or Object(see below). The arrays are enclosed in square brackets [ ], while the elements in the array are separated by a comma ,.
ObjectIn JSON, an object is a collection of key : value pairs. The keys are always strings, and must be unique within the object. An object’s goal is to represent an associative array. An object is defined with opening and closed curly brackets { }. The key : value pairs are separated with a comma ,, while the key and the value are separated by a colon :.

Here is a small example of a JSON array:

[ 1, "23", {"next": 45}]

And here is a small example of a JSON object that includes another object:

{
    "name" : "john",
    "birthdate": {
        "day": 10,
        "month": 12,
        "year": 1970
    }
}