
JSON Import

This how-to guide demonstrates how to import JSON data using Rel.

Goal

By following this guide, you will learn how to import JSON data into a database using Rel.

This guide complements the JSON Export guide and the Working With JSON Data guides, which cover the supported JSON data types and representations, as well as how to work with them.

See the Data Import and Export guides for more information on how to import and export other types of data.

Importing JSON Data

The RKGS provides two JSON representations. Each representation has its own data loading operation:

Rel Relation      | JSON Schema  | Description
load_json         | Data-defined | Extracts the schema from the data and stores the data in a wide format.
load_json_general | General      | Accepts arbitrary schemas and stores the data in Graph Normal Form (GNF).

The two JSON loading operations share the same configuration options, which are typically defined in a relation called config:

def config = ...
 
def my_json = load_json[config]
def my_json_general = load_json_general[config]

The relation config (the name is arbitrary) defines all JSON load options. For details on the available options, see Import Options below.

See the Working With JSON Data guides for more details on how to work with JSON data using these two schema representations.

Getting Started

Consider the following JSON file, person.json, hosted in the cloud:

{
    "first_name": "John",
    "last_name": "Smith",
    "id": 10E+02,
    "gender": null,
    "address": 
    { 
        "city": "Seattle",
        "state": "WA" 
    },
    "phone": 
    [
        { 
            "type": "home",
            "number": 206456 
        },
        { 
            "type": "work",
            "number": "206-123"
        }
    ]
}

You can load this JSON file into the RKGS as follows:

// read query
 
def json = load_json["azure://raidocs.blob.core.windows.net/csv-import/person.json"]
def output = json

Note how the nested data structure around the object address is represented in the data-defined schema with additional columns in the relation. The field phone is an array.

Also, some data type conversions happen automatically during loading. For example, the null value in the field gender is converted to missing, while the scientific notation in the field id is converted to a Float64. See the JSON Data Types guide for more information on how native JSON types map to Rel types.

🔎

Using the data-defined schema in Rel turns JSON keys into relation names of type RelName. Arrays are relations as well, using :[] as the relation name. Positional information within an array is indicated explicitly with an integer argument, as shown in the example above.
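As an illustration, the data-defined relation for person.json above contains tuples along these lines. This is a sketch of the relation's shape, not the exact console output:

```
(:first_name, "John")
(:address, :city, "Seattle")
(:phone, :[], 1, :type, "home")
(:phone, :[], 1, :number, 206456)
(:phone, :[], 2, :type, "work")
```

Each JSON key becomes a RelName argument, the :[] relation name marks the phone array, and the integer argument gives the position within it.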

You can load it using the general schema as follows:

// read query
 
def json = load_json_general["azure://raidocs.blob.core.windows.net/csv-import/person.json"]
def output = json

Notice how the general representation creates relations with more depth — more rows than the data-defined schema. This is because each node in the JSON tree is represented by a unique ID (a hash), and each node is connected to its children through these IDs. See the JSON Data With a General Schema guide for more details.

Import Options

When you don't need to specify a configuration, you can simply provide the file path:

def my_json = load_json["azure://...."]

Or:

def my_json_general = load_json_general["azure://...."]

For more customized imports, load_json and load_json_general take a configuration relation as an argument:

module config
    // ...
end
 
def my_json = load_json[config]
def my_json_general = load_json_general[config]
🔎

The configuration relation is typically named config, but you can also use any other name.

Within the load configuration, you can specify:

Option      | Description
path        | A URL for the JSON data.
data        | The data itself, as a JSON-formatted string value.
integration | Credentials to access the data, if needed.

You can use the path option (together with integration, if needed) to import data from the cloud, and the data option to import strings and local files.

Path

The import option path is a string that specifies the location and name of the file you want to import. Currently, this string can point to azure://... (Microsoft Azure) or s3://... (Amazon S3) URLs. See the Accessing the Cloud guide for more details.

module config
    def path = "azure://myaccount.blob.core.windows.net/container/file.json"
end

Data

The import option data is a string specifying the actual data to be imported.

The path option takes precedence over the data option. If both are present, the data option is ignored.

module config
    def data = """{"name": "John", "age": 30}"""
end
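For instance, if both options are present, only path is used. The URL and data below are placeholders:

```
module config
    def path = "azure://myaccount.blob.core.windows.net/container/file.json"
    // The data option below is ignored, because path takes precedence.
    def data = """{"name": "John"}"""
end
```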

Integration

You may need to specify credentials for accessing private JSON data in cloud storage, using the import option integration. See the Accessing the Cloud guide for more details.
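For example, access to a private Azure container can be configured with a SAS token. The account, container, and token below are placeholders; see the Accessing the Cloud guide for the exact credential keys for your provider:

```
module config
    def path = "azure://myaccount.blob.core.windows.net/container/file.json"
    def integration[:provider] = "azure"
    def integration[:credentials, :azure_sas_token] = "sv=...&sig=..."
end
 
def my_json = load_json[config]
```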

Data Sources

This section covers the available options to import JSON data.

Strings

You can import JSON data directly from a string.

Here’s an example using the data-defined schema:

// read query
 
def config[:data] = 
    """
        {"name": "John", "phone": [{"no.":"1234"}, {"no.":"4567"}]}
    """
 
def output = load_json[config]

This is the same example using the general representation:

// read query
 
def config[:data] = 
    """
        {"name": "John", "phone": [{"no.":"1234"}, {"no.":"4567"}]}
    """
 
def output = load_json_general[config]

Local Files

You can load local JSON data using the import functionality of the RAI Console. The required steps are:

  1. Upload the file as a string to a named relation, for example, my_data.

  2. Use def config[:data] = my_data, and set other configuration options as needed.

  3. Run load_json[config] or load_json_general[config], depending on the schema representation you are using.
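Assuming the file contents have been uploaded as a string to the relation my_data, the last two steps can be sketched as:

```
def config[:data] = my_data
def output = load_json[config]
```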

See the Loading JSON Data guide for more details.

You can import files up to 64MB in size. If you try to import a larger file, you will receive a validation error. This limit applies to local files only; no limit is imposed for imports from cloud sources.

Cloud

You can import public and private JSON data from the supported cloud providers.

Here’s an example that loads public data:

module config
    def path = "s3://my-s3-bucket/myfile.json"
end
 
def my_json = load_json[config]
def my_json_general = load_json_general[config]

See the Accessing the Cloud guide for more details.

Compressed Files

You can import compressed JSON files from the cloud. No specific options are required, so you can just provide the file path with the compression extension. For example:

module config
    def path = "azure://myaccount.blob.core.windows.net/sascontainer/myfile.json.gz"
end
 
def my_json_compressed = load_json[config]
def my_json_general_compressed = load_json_general[config]

See Compressed Data for more details.

Loading JSON Lines

The RKGS supports the JSON Lines file format.

🔎

JSON Lines is a file format in which each line of a text file is a separate JSON object.

Rel provides the following built-in relations depending on the schema representation you are working with:

Rel Relation           | JSON Schema
load_jsonlines         | Data-defined
load_jsonlines_general | General

Here’s an example using the data-defined schema:

// read query
 
def data = 
    """
        {"name": "Anton", "age":56}
        {"name": "Alex", "age":44}
    """
 
def config[:data] = data
def output = load_jsonlines[config]

This is the same example using the general representation:

// read query
 
def data = 
    """
        {"name": "Anton", "age":56}
        {"name": "Alex", "age":44}
    """
 
def config[:data] = data
def output = load_jsonlines_general[config]

Import Errors

The RKGS expects the JSON data being imported to be well-formed.

The data import process is atomic: if there is an error in the data, the system does not attempt to load any of it.

In such cases, the whole transaction is aborted with an error message and no data are loaded. See the Loading Errors guide for more details.
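For example, the string below is missing its closing brace, so the import fails as a whole:

```
def config[:data] = """{"name": "John" """
 
// The transaction aborts with a parse error; no data are loaded.
def output = load_json[config]
```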

See Also

For more information on the supported JSON data types and formats as well as how to work with them, see the Working With JSON Data concept guides.

See the JSON Export guide for exporting JSON and the Data Import and Export guides for more information on how to import and export other types of data.
