JSON Import and Export

This how-to guide demonstrates how to import and export JSON data using Rel.

Download this guide as a RAI notebook by clicking here.

Goal

The goal of this how-to guide is to familiarize the reader with Rel’s importing and exporting functionalities for data in the JSON format. After reading this guide you should expect to know how to import and export files of JSON data using the different available options.

You may also find the CSV Import and CSV Export how-to guides relevant since they cover how to import and export other types of data.

Relational Representation of JSON

When representing JSON data relationally, you can typically think of it as a tree. For example, consider the following JSON data for a person:

{
  "first_name": "John",
  "last_name": "Smith",
  "address": { "city": "Seattle",
               "state": "WA" },
  "phone": [
    { "type": "home",
      "number": "206-456" },
    { "type": "work",
      "number": "206-123" }
  ]
}

You can view this data as a tree:

JSON Tree

In Rel, all data are expressed as relations (IDB or EDB), preferably in the highly normalized Graph Normal Form.

Defining the previous JSON data directly in Rel – instead of importing it – looks like this:

def json[:first_name] = "John"
def json[:last_name] = "Smith"
def json[:address, :city] = "Seattle"
def json[:address, :state] = "WA"
def json[:phone, :[], 1, :type] = "home"
def json[:phone, :[], 1, :number] = "206-456"
def json[:phone, :[], 2, :type] = "work"
def json[:phone, :[], 2, :number] = "206-123"

def output = json

Relation: output

:address:city"Seattle"
:address:state"WA"
:first_name"John"
:last_name"Smith"
:phone:[]1:number"206-456"
:phone:[]1:type"home"
:phone:[]2:number"206-123"
:phone:[]2:type"work"

In Rel, the JSON keys turn into relation names, starting with :. Arrays are relations as well, using :[] as the relation name. Positional information within an array is indicated explicitly with an integer argument, as shown in the last four lines above. This is discussed in more detail in a later section about Exporting Arrays.

Importing JSON Data

The RelationalAI KGMS supports importing JSON data natively through the use of load_json.

Consider the following simple JSON file as a first example. This example has a flat structure (i.e., no nesting of data, no objects, no arrays) with all the primitive data types discussed earlier:

{
    "integer" : 123,
    "float" : 10.12,
    "scientific" : 10E+02,
    "string" : "hello world",
    "mynull" : null
}

file: simple.json

You can now import this simple file within the json relation as follows:

def config[:path] = "s3://relationalai-documentation-public/csv-import/simple.json"
def output = load_json[config]

Relation: output

:float10.12
:integer123
:mynullmissing
:scientific1000
:string"hello world"

Note: Currently, S3 data must be in buckets in the (N. Virginia) us-east-1 region.

As the previous example shows, some type conversions happen automatically when you load JSON data. For example, the null value is converted to missing while the scientific notation is converted to a Float64.

More elaborate examples of JSON data are also supported. Here is an example that has nested data and arrays:

{
  "first_name": "John",
  "last_name": "Smith",
  "address": { "city": "Seattle",
               "state": "WA" },
  "phone": [
    { "type": "home",
      "number": 206456 },
    { "type": "work",
      "number": 206-123 }
  ]
}

file: person.json

You can also read this as follows:

def config[:path] = "s3://relationalai-documentation-public/csv-import/person.json"
def output = load_json[config]

Relation: output

:address:city"Seattle"
:address:state"WA"
:first_name"John"
:last_name"Smith"
:phone:[]1:number206456
:phone:[]1:type"home"
:phone:[]2:number206123
:phone:[]2:type"work"

You can observe that phone is now an array, while the nested data, for example, city within address, are represented with the additional column names within the json relation.

Importing Local JSON Files

You can read JSON data from a string instead of a file as follows:

def config[:data] = """
{"name": "John", "phone": [{"no.":"1234"}, {"no.":"4567"}]}
"""
def output = load_json[config]

Relation: output

:name"John"
:phone:[]1:no."1234"
:phone:[]2:no."4567"

Combined with the RAI notebook upload functionality, you can use this to import local JSON files into Rel:

  1. Upload the file as a string to a named relation, for example, mydata.

  2. Use def config:data = mydata, and set other configuration options as needed.

  3. Do load_json[config].

Import Options

When using load_json, you can specify the following parameters in the config relation:

ParameterDescription
pathThe URI where you would like to load the data from. Note that the Rel server needs to have access to this URI given the integration options set in the config.
integrationThis parameter allows for importing from the cloud. In this case, you also have to specify the :provider (for example, "azure") as well as the :credentials to be used for access to the cloud (for example, a sas_token).

As an example of this last integration parameter, you can import JSON data from the cloud through the following example that imports data from Azure into the json relation:

def config[:path] = "azure://myaccount.blob.core.windows.net/sascontainer/myfile.json"

def config[:integration, :provider] = "azure"
def config[:integration, :credentials, :azure_sas_token] = "sv=2014-02-14&sr=b&si=TestPolicy&sig=o%2B5%2F0C%2BLm7tWWftNKvQEGKHlSt%2Bfs8No7FZkUk5T%2Bv0%3D"

def json = load_json[config]

Exporting JSON Data

You can start by learning how to export JSON data. Similar to exporting CSV data, this is done using export_json. Here is a simple example for exporting:

def json[:name] = "John"
def json[:age] = 12
def config:path = cloud_URI
def config:data = json

def export = export_json[config]

Similar to when you export to CSV files, you must specify the path and data parameters in config. Note that, again similar to exporting CSV data, the configuration should include credentials that give write access to the URI where you want to write the data (cloud_URI).

The contents of the output file should be:

{"name":"John","age":12}

Given a relation R, you can obtain the JSON representation with json_string[R] and preview it in a RAI notebook using view_json[R]:

def json[:name] = "John"
def json[:age] = 12
def output = json_string[json]

Relation: output

" { "age": 12, "name": "John" }"

Export Options

The data are exported, however, they are not nicely formatted. You can do this through the indent parameter that you can pass to export_json. For example, the following code tells the system that you want to use four spaces when formatting the output data:

def json[:name] = "John"
def json[:age] = 12

def export = export_json[(:path, cloud_URI); (:data, json); (:indent, 4)]

When checking the output file, the content now looks like this:

{
    "name": "John",
    "age": 12
}

When using export_json, you can specify the following parameters, all of which are optional:

ParameterDescription
pathThis is the URI where you would like to write the output file. Note that the Rel server needs to have write access to this URI given the integration options provided in the config. This parameter is optional. If it is not specified, no output file is written.
dataThis parameter provides the relation that you would like to output to the file. This parameter is optional. If it is not specified, the output file will just contain the word null.
indentThis parameter indicates the amount of indentation that you would like to use when writing the output data. This parameter is also optional. By default, if indent is not specified, all of the data are written in a single line to the output.
integrationThis parameter allows for exporting to the cloud. In this case, you also have to specify the :provider (for example, "azure") as well as the :credentials to be used for access to the cloud (for example, a sas_token).

As an example of this last integration parameter, you can export JSON data to the cloud as follows:

def json[:name] = "John"
def json[:age] = 12

def integration[:provider] = "azure"
def integration[:credentials, :azure_sas_token] = "sv=2014-02-14&sr=b&si=TestPolicy&sig=o%2B5%2F0C%2BLm7tWWftNKvQEGKHlSt%2Bfs8No7FZkUk5T%2Bv0%3D"

def azure_url = "azure://myaccount.blob.core.windows.net/sascontainer/myfile.json"

def export = export_json[
(:path, azure_url);
(:data, json);
(:indent, 4);
(:integration, integration)
]

Exporting Nested Data

You have so far explored exporting some data in the JSON format which were essentially all at the same level, i.e., there was no nesting of data. You will now examine how to export data that are nested in a more elaborate way. For example, consider the following data, where you have an object inside another object. More specifically, you have a car object in addition to the top-level person object.

def json[:name] = "John"
def json[:age] = 35
def json[:car, :brand] = "Mazda"
def json[:car, :color] = "red"

def export = export_json[(:path, cloud_URI; :data, json; :indent, 4)]

If you take a look at the output file, you see the following:

{
    "name": "John",
    "car": {
        "brand": "Mazda",
        "color": "red"
    },
    "age": 35
}

You can observe that the car object is properly outputted. You can also have more objects and additional nesting. Here is a more elaborate example:

def json[:name] = "John"
def json[:age] = 35

def json[:car1, :brand] = "Mazda"
def json[:car1, :color] = "red"
def json[:car1, :engine, :size] = 2000
def json[:car1, :engine, :cylinders, :setup] = "inline"
def json[:car1, :engine, :cylinders, :number] = 6

def json[:car2, :brand] = "Jaguar"
def json[:car2, :color] = "green"
def json[:car2, :engine, :size] = 3000
def json[:car2, :engine, :cylinders, :setup] = "V"
def json[:car2, :engine, :cylinders, :number] = 8

def export = export_json[(:path, cloud_URI; :data, json; :indent, 4)]

This gives the output:

{
    "name": "John",
    "car1": {
        "brand": "Mazda",
        "color": "red",
        "engine": {
            "size": 2000,
            "cylinders": {
                "number": 6,
                "setup": "inline"
            }
        }
    },
    "age": 35,
    "car2": {
        "brand": "Jaguar",
        "color": "green",
        "engine": {
            "size": 3000,
            "cylinders": {
                "number": 8,
                "setup": "V"
            }
        }
    }
}

Exporting Arrays

In the last example, you exported two cars, using two different keys, car1 and car2. JSON, however, supports arrays for this purpose. You will now learn how to export arrays. Instead of car1 and car2, you have an array called cars containing the two objects:

def json[:name] = "John"
def json[:age] = 35

// note the :[] marker here
def json[:cars, :[], 1, :brand] = "Mazda"
def json[:cars, :[], 1, :color] = "red"
def json[:cars, :[], 1, :engine, :size] = 2000
def json[:cars, :[], 1, :engine, :cylinders, :setup] = "inline"
def json[:cars, :[], 1, :engine, :cylinders, :number] = 6

def json[:cars, :[], 2, :brand] = "Jaguar"
def json[:cars, :[], 2, :color] = "green"
def json[:cars, :[], 2, :engine, :size] = 3000
def json[:cars, :[], 2, :engine, :cylinders, :setup] = "V"
def json[:cars, :[], 2, :engine, :cylinders, :number] = 8

def export = export_json[(:path, cloud_URI; :data, json; :indent, 4)]

The :[] marker might seem redundant, but it is needed to output the JSON data properly as an array. Executing the above code, you obtain the output:

{
    "name": "John",
    "cars": [
        {
            "brand": "Mazda",
            "color": "red",
            "engine": {
                "size": 2000,
                "cylinders": {
                    "number": 6,
                    "setup": "inline"
                }
            }
        },
        {
            "brand": "Jaguar",
            "color": "green",
            "engine": {
                "size": 3000,
                "cylinders": {
                    "number": 8,
                    "setup": "V"
                }
            }
        }
    ],
    "age": 35
}

You can have arrays inside arrays in JSON, and Rel also supports that. Here is a simple example:

def json[:arr, :[], 1, :arr1, :[], 1] = 11
def json[:arr, :[], 1, :arr1, :[], 2] = 12

def json[:arr, :[], 2, :arr2, :[], 1] = 21
def json[:arr, :[], 2, :arr2, :[], 2] = 22

def export = export_json[(:path, cloud_URI; :data, json; :indent, 4)]

This gives output containing two arrays within an array:

{
    "arr": [
        {
            "arr1": [
                11,
                12
            ]
        },
        {
            "arr2": [
                21,
                22
            ]
        }
    ]
}

Rel JSON Utils

The Standard Library’s json_string[R] returns the JSON string representing the relation R.

The view_json[R] utility from the display library displays the relation R as a JSON object.

Appendix: The JSON Data Format

JSON (JavaScript Object Notation) is a popular open standard file and data exchange format. Many applications use JSON to exchange data. JSON is a data format that is also human-readable. Its structure can optionally follow a strictly enforced schema (JSON schema) defining the format and structure of the data.

JSON supports the following data types:

Data TypeDescription
NullAn empty value. The actual word null is used to denote this.
NumberA signed number, integer or floating point, which can also be expressed in scientific notation, for example, 3.2E+2.
StringA sequence of Unicode characters which can also be empty.
Booleantakes the obvious values of true and false. Note that Boolean types are not (officially) supported by Rel at this point.
ArrayA list of zero or more values of either primitive types, i.e., Number, String, Boolean or Object(see below). The arrays are enclosed in square brackets [ ], while the elements in the array are separated by a comma ,.
ObjectIn JSON, an object is a collection of key : value pairs. The keys are always strings and must be unique within the object. An object’s goal is to represent an associative array. An object is defined with opening and closed curly brackets { }. The key : value pairs are separated with a comma ,, while the key and the value are separated by a colon :.

Here is a small example of a JSON array:

[ 1, "23", {"next": 45}]

And here is a small example of a JSON object that includes another object:

{
    "name" : "john",
    "birthdate": {
        "day": 10,
        "month": 12,
        "year": 1970
    }
}

Summary

You have learned how to import JSON data in Rel using the load_json functionality. By following the examples, you are now able to import JSON data into your RKGMS database, including local files and files in cloud storage. You can also export JSON data, including arrays and nested data.