Supported Data Sources
This guide describes the supported data sources for importing data into the Relational Knowledge Graph System (RKGS).
Goal
By following this guide, you’ll learn which data sources are supported by the system, allowing you to import data into the RKGS.
This guide complements the other Data Import and Export guides.
Supported Data Sources
These are the different sources you can use to import data into the RKGS:
Source | Description | Data Type |
---|---|---|
Strings | Load data from strings formatted in a supported data type. | String in Rel source code. |
Local files | Load files directly from your computer. | Binary, CSV, JSON, JSON Lines, and Parquet files. |
Cloud | Load data from the supported cloud providers. | Binary, CSV, JSON, JSON Lines, and Parquet files, as well as Iceberg tables. |
Snowflake | Load data from Snowflake through RAI data streams. | Snowflake object, which can be a SQL table or a view. |
The Configuration Module
When importing data from the cloud or strings, loading is controlled via a configuration module. This module is passed as an argument to the loading relations:
module config
    // ...
end

def my_json = load_json[config]
See Data Types for more details on supported data types and the available loading relations.
The configuration relation is usually named config, but any other name can be used.
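To illustrate, here is a minimal sketch that loads the same kind of inline JSON under a differently named configuration module (the name json_config and the sample data are arbitrary):

```rel
module json_config
    def data = """{"a": 1}"""
end

def output = load_json[json_config]
```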
The following configuration options are common to all Rel data-loading relations and control where the data are loaded from.
Option | Description | Used For |
---|---|---|
data | A formatted string value. | String. |
path | A URL for the data. | Cloud. |
integration | Credentials, if needed, to access private data. | Cloud. |
Data-loading relations for specific data types, like CSV, may have additional configuration options for import and export.
The path option is a string that specifies the location and name of the file you want to import. It can be combined with integration to specify credentials when accessing private data in cloud storage. See Cloud for more details.

The data option is a string containing the actual data to be imported. You use this to load data from strings.

The path option takes precedence over the data option: if both are present, data is ignored.
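As a minimal sketch of this precedence rule (the bucket URL and inline data are hypothetical), a module that defines both options loads from path and silently ignores data:

```rel
module config
    // path takes precedence: the file at this URL is loaded
    def path = "s3://my-s3-bucket/path/to/data.csv"

    // ignored, because path is also present
    def data = "a,b\n1,2"
end

def my_csv = load_csv[config]
```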
Strings
You can import data as a formatted string in Rel source code. This option is convenient for quick experiments and small datasets.
Here’s an example of loading some JSON data:
// read query
module config
    def data =
        """
        {"name": "John", "phone": [{"no.":"1234"}, {"no.":"4567"}]}
        """
end

def output = load_json[config]
Local Files
You can load files directly from your computer using:
RAI Interface | Details |
---|---|
RAI Console | Load data using the import functionality. |
CLI | Load CSV and JSON data using the loading data commands. |
RelationalAI SDKs | Load CSV and JSON data using the loading data functionality. |
RelationalAI VS Code Extension | Load CSV and JSON data either via RelationalAI View > Admin > Commands or the Command Palette. |
Here’s an example of loading a CSV file using the Python SDK:
from railib import api, config

ctx = api.Context(**config.read())  # credentials from ~/.rai/config

file_name = "my_data_file.csv"
with open(file_name) as fp:
    data = fp.read()

api.load_csv(ctx, "my_database", "my_engine", "my_csv", data)
You can load local files up to 64 MB in size. To upload larger files, use cloud storage instead.
Cloud
You can load data from several cloud providers. The system supports importing both public and private data. Importing private data requires setting up certain credentials within the loading configuration module.
Here’s an example of loading an Iceberg table from AWS S3:
module config
def path = "s3://my-s3-bucket/path/to/table_iceberg"
end
def my_iceberg = load_iceberg[config]
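For private data, the integration option carries the credentials alongside path. The sketch below assumes an Azure storage account and a SAS token bound to the relation my_sas_token; the exact provider names and credential keys for each cloud are described in the Accessing the Cloud guide:

```rel
module config
    def path = "azure://myaccount.blob.core.windows.net/container/data.csv"

    module integration
        def provider = "azure"
        def credentials = (:azure_sas_token, my_sas_token)
    end
end

def my_csv = load_csv[config]
```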
See Accessing the Cloud for more details on supported providers, regions, and examples.
Snowflake
The RAI Integration Services are a collection of features designed to enable RAI’s advanced modeling and querying capabilities to analyze your Snowflake data in new ways, including graph analytics.
A RAI data stream synchronizes your data from a Snowflake database to a RAI database. You can think of a RAI data stream as a materialized view that connects your Snowflake data with RAI. To use it, you need access to a RAI integration.
To load data this way, you will use Snowflake’s interface and write some SQL.
Here’s an example that creates a RAI data stream between the Snowflake table product and the RAI relation rai_product:
CREATE OR REPLACE TABLE product (ID INT, Name STRING, Category STRING)
AS SELECT * FROM VALUES
(11, 'Laptop', 'Electronics'),
(27, 'Shirt', 'Clothing');
CALL RAI.create_data_stream('my_sf_db.my_sf_schema.product', 'my_rai_db', 'rai_product');
See RAI Data Streams for more details.
See Also
Now that you know which data sources you can use to import data into RAI, check out the other Data Import and Export guides to learn how to interact with your data.