JSON Representations
This guide presents the supported JSON representations in the Relational Knowledge Graph System (RKGS).
Introduction
Currently, there are two schema representations you can use to work with JSON data in the RKGS:
JSON Schema | Description |
---|---|
Data-defined | Extracts the schema from the data and stores them in a wide format. |
General | Stores the JSON data as a tree, the nodes of which are treated as entities. |
The two representations store data in the RKGS using different approaches, but they are otherwise equivalent in terms of the operations and computations that you can perform over the data.
Data Types
In both formats, data type conversions happen automatically when you load or export data.
See the JSON Data Types guide for more information on how JSON native types map to Rel types.
Loading and Exporting
Each format has its own set of data loading and exporting operations.
Note that the data-defined schema is more fully supported than the general schema: it is the only one with exporting and string conversion operations.
Currently, you can’t export JSON data in the general format.
Task | Data-Defined Schema | General Schema |
---|---|---|
Loading | load_json and load_jsonlines | load_json_general and load_jsonlines_general |
Exporting | export_json | n/a |
String conversion | json_string | n/a |
Data-Defined Schema
Consider the following JSON data:
{
  "first_name": "John",
  "last_name": "Smith",
  "address": {
    "city": "Seattle",
    "state": "WA"
  },
  "phone": [
    {
      "type": "home",
      "number": "206-456"
    },
    {
      "type": "work",
      "number": "206-123"
    }
  ]
}
[Figure: tree representation of the JSON data above]
You can load them using a data-defined schema as follows:
// read query
def my_json = load_json["azure://raidocs.blob.core.windows.net/working-with-json/json_example.json"]
def output = my_json
Note how each key within the JSON data has its respective values and children arranged in a wide relation within my_json.
See the JSON Data With a Data-Defined Schema guide for more details.
See also the JSON Data Types guide for more information on how JSON native types map to Rel types.
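To make the idea of a wide relation more concrete, here is a rough Python model of the data-defined layout. It is only a sketch, not how the RKGS implements load_json: the function name flatten, the "[]" array marker, and the 1-based array positions are assumptions made for illustration. Each leaf value in the JSON document becomes one tuple made of its key path followed by the value.

```python
import json

# The example document from this section.
DOC = json.loads("""
{
  "first_name": "John",
  "last_name": "Smith",
  "address": {"city": "Seattle", "state": "WA"},
  "phone": [
    {"type": "home", "number": "206-456"},
    {"type": "work", "number": "206-123"}
  ]
}
""")


def flatten(value, path=()):
    """Yield one tuple per leaf: the key path followed by the leaf value.

    Hypothetical helper: the "[]" marker and 1-based positions for array
    elements are illustrative assumptions, not a documented RKGS format.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, path + (key,))
    elif isinstance(value, list):
        for position, child in enumerate(value, start=1):
            yield from flatten(child, path + ("[]", position))
    else:
        yield path + (value,)


wide_rows = list(flatten(DOC))
for row in wide_rows:
    print(row)
```

In this model the schema (the key paths) is derived from the data itself, and nested keys simply make the tuples wider.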
General Schema
Here are the same JSON data, represented using the general schema:
// read query
def my_json = load_json_general["azure://raidocs.blob.core.windows.net/working-with-json/json_example.json"]
def output = my_json
Note how the general representation creates relations with more rows than the data-defined schema. This is because each node in the JSON tree is represented by a unique ID (a hash) and stored as a separate entry in the relation.
Additionally, when loading the data into the general schema, the relationships between nodes, most notably child and index, are also stored within the relation my_json.
See the JSON Data With a General Schema guide for more details.
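The following Python sketch models why the general representation produces more rows. It is an illustration under stated assumptions, not the RKGS implementation of load_json_general: the helper names node_id and build_general, the SHA-1 based IDs, and the object, array, and value row labels are hypothetical. Only child and index are taken from the text above.

```python
import hashlib
import json

# The example document from this guide.
DOC = {
    "first_name": "John",
    "last_name": "Smith",
    "address": {"city": "Seattle", "state": "WA"},
    "phone": [
        {"type": "home", "number": "206-456"},
        {"type": "work", "number": "206-123"},
    ],
}


def node_id(path):
    """Derive a stable ID from a node's position in the tree (illustrative hash)."""
    return hashlib.sha1(json.dumps(path).encode("utf-8")).hexdigest()[:12]


def build_general(value, path=(), rows=None):
    """Record one row per node plus one row per parent-to-child link.

    Returns the ID of the node at `path` and the accumulated rows.
    The "object", "array", and "value" labels are assumptions.
    """
    if rows is None:
        rows = []
    nid = node_id(path)
    if isinstance(value, dict):
        rows.append(("object", nid))
        for key, child in value.items():
            cid, _ = build_general(child, path + (key,), rows)
            rows.append(("child", nid, key, cid))  # link by key name
    elif isinstance(value, list):
        rows.append(("array", nid))
        for position, child in enumerate(value, start=1):
            cid, _ = build_general(child, path + (position,), rows)
            rows.append(("index", nid, position, cid))  # link by array position
    else:
        rows.append(("value", nid, value))
    return nid, rows


root_id, general_rows = build_general(DOC)
print(len(general_rows), "rows in the tree model")
```

Because every object, array, and leaf gets its own entry in addition to the links between them, this model stores 25 rows for the example document, compared with one row per leaf value (8) in a path-based wide layout.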
Summary
There are two representations of JSON data in the RKGS. The data-defined schema extracts the schema from the data and represents the data in a wide relation with many columns. The general schema approach stores the JSON data as a tree, where the nodes of the tree are treated as entities.
For more information, see the JSON Data With a General Schema and the JSON Data With a Data-Defined Schema guides.