RelationalAI SDK for Python

This guide presents the main features of the RelationalAI SDK for Python, which can be used to interact with RelationalAI’s Relational Knowledge Graph System (RKGS).

The rai-sdk-python package is open source and is available in this GitHub repository:

RelationalAI/rai-sdk-python

It includes self-contained examples of the main API functionality. Contributions and pull requests are welcome.

Note: This guide applies to rai-sdk-python, the latest iteration of the RelationalAI SDK for Python. The relationalai-sdk package is deprecated.

See also Python SDK Through Visual Studio Code if you are using a Microsoft VS Code environment to work with the RelationalAI SDK for Python.

Requirements

Check the rai-sdk-python repository for the latest version requirements for using the RelationalAI SDK for Python with the RKGS.

Installation

The RelationalAI SDK for Python is a stand-alone client library and can be installed using the pip Python package manager as shown below:

# using the pip package manager
pip install rai-sdk

Additional ways of installing the library can be found in the rai-sdk-python repository.

Configuration

The RelationalAI SDK for Python can access your RAI Server credentials using a configuration file. See SDK Configuration for more details. If you get “local issuer certificate” errors, you may need to install local certificates for your Python installation.

The config.read() function takes the configuration file name and the profile name as optional arguments. For instance, you can run the following in a .py file or directly in the Python command-line interface:

from railib import config
cfg = config.read(fname = "~/.rai/config", profile = "default")

To load a different configuration, you can replace "default" with a different profile name.

Creating a Context

Most API operations use a context object, constructed with railib.api.Context. To apply the default profile in your ~/.rai/config file, you can use:

from railib import api, config
cfg = config.read()
# to specify a non-default profile instead:
# cfg = config.read(profile='myprofile')
ctx = api.Context(**cfg)

As an alternative to loading a configuration and using **cfg, you can also specify the different fields directly using keyword arguments in api.Context. See help(api.Context) for details.

You can test your configuration and context by running:

import json
rsp = api.list_databases(ctx)
print(json.dumps(rsp, indent=2))

This should return a list with database info, assuming your keys have the corresponding permissions. See Listing Databases below.

Another way of testing your setup is by running one of the examples from the repository. You can do that directly from a terminal:

$ cd examples
$ python3 ./list_databases.py

The remaining code examples in this document assume that you have a valid context in the ctx Python variable and the following imports:

from railib import api, config, show
import json
from urllib.error import HTTPError

Moreover, all of them can be run from the Python command-line interface or a .py file.

Most API requests return a JSON value, represented in Python as a dict, which can be converted to a string with json.dumps and then printed:

print(json.dumps(rsp, indent=2))

Managing Users

This section covers the API functions that you need to manage users.

🔎

Each user has a role associated with specific permissions. These permissions determine the operations that the user can execute. See User Roles in the Managing Users and OAuth Clients guide for more details.

Creating a User

api.create_user(ctx, userid, roles)

Here, userid is a string identifying the user, and roles is a list of railib.api.Role values. If roles is omitted (the default is None), the user is assigned the user role.

Valid roles are created with api.Role(<rolestring>). To list the available roles, run:

rsp = [r.value for r in api.Role]
print(json.dumps(rsp, indent=2))
 
["user", "admin"]

Disabling and Enabling a User

You can disable a user through:

api.disable_user(ctx, id)

Here, id is a string containing the user’s ID. You can re-enable the user as follows:

api.enable_user(ctx, id)

Listing Users

rsp = api.list_users(ctx)
print(json.dumps(rsp, indent=2))

Getting Information for a User

You can get information for a user as follows:

rsp = api.get_user(ctx, user)
print(json.dumps(rsp, indent=2))

Here, user is a string ID, for example, "auth0|XXXXXXXXXXXXXXXXXX".

Deleting a User

You can delete a user through:

api.delete_user(ctx, id)

Here, id is a string containing the user’s ID.

Managing OAuth Clients

This section covers the API functions that you need to manage OAuth clients.

🔎

Each OAuth client has a specific set of permissions. These permissions determine the operations that the OAuth client can execute. See Permissions for OAuth Clients in the Managing Users and OAuth Clients guide for more details.

Creating an OAuth Client

You can create an OAuth client as follows:

api.create_oauth_client(ctx, name, permissions)

Here, name is a string identifying the client, and permissions is a list of railib.api.Permission values. If permissions is omitted (the default is None), the client is assigned no permissions.

To see the list of permissions, run the following:

rsp = [p.value for p in api.Permission]
print(json.dumps(rsp, indent=2))

This gives you the following output:

[
  "create:engine", "delete:engine", "list:engine", "read:engine",
  "list:database", "update:database", "delete:database",
  "run:transaction", "read:transaction", "read:credits_usage",
  "create:oauth_client", "read:oauth_client", "list:oauth_client",
  "update:oauth_client", "delete:oauth_client",
  "rotate:oauth_client_secret",
  "create:user", "list:user", "read:user", "update:user",
  "list:role", "read:role",
  "list:permission", "create:accesskey", "list:accesskey"
]

Listing OAuth Clients

You can get a list of OAuth clients as follows:

api.list_oauth_clients(ctx)

Getting Information for an OAuth Client

You can get details for a specific OAuth client, identified by the string id, as follows:

api.get_oauth_client(ctx, id)

Deleting an OAuth Client

You can delete an OAuth client identified by the string id as follows:

api.delete_oauth_client(ctx, id)

Managing Engines

This section covers the API functions you need to use to manage engines.

Creating an Engine

You can create a new engine as follows:

engine="my_engine"
 
rsp = api.create_engine(ctx, engine)
print(json.dumps(rsp, indent=2))

By default, the engine size is XS. You can create an engine of a different size by specifying the size parameter:

engine="my_engine"
size="S"
 
rsp = api.create_engine(ctx, engine, size)
print(json.dumps(rsp, indent=2))

Valid sizes are given as strings and can be one of the following:

XS (extra small).
S (small).
M (medium).
L (large).
XL (extra large).

To wait until the engine is created, you can use api.create_engine_wait instead:

engine="my_engine"
size="S"
 
rsp = api.create_engine_wait(ctx, engine, size)
print(json.dumps(rsp, indent=2))

💡

Your engine may take some time to reach the “PROVISIONED” state, where it is ready for queries. It is in the “PROVISIONING” state until then.

Listing Engines

You can list all engines associated with your account as follows:

rsp = api.list_engines(ctx)
print(json.dumps(rsp, indent=2))

This returns a JSON array containing details for each engine:

[
    {
        "id": "ca******",
        "name": "my_engine",
        "region": "us-east",
        "account_name": "******",
        "created_by": "******",
        "created_on": "2023-07-10T17:15:22.000Z",
        "size": "S",
        "state": "PROVISIONED"
    }
]

To list engines that are in a given state, you can use the state parameter:

rsp = api.list_engines(ctx, "PROVISIONED")
print(json.dumps(rsp, indent=2))

Possible states are:

"REQUESTED".
"PROVISIONING".
"PROVISIONED".
"DEPROVISIONING".

If there is an error with the request, an HTTPError exception is raised.
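As a minimal sketch of handling this exception, the helper below wraps any call and turns an HTTPError into a short message. The name call_or_report is hypothetical and not part of railib; it only assumes that the wrapped function raises urllib's HTTPError on a failed request, as described above.

```python
from urllib.error import HTTPError


def call_or_report(fn, *args, **kwargs):
    """Call fn and return (result, None), or (None, message) on HTTPError.

    Hypothetical helper for illustration; not part of railib.
    """
    try:
        return fn(*args, **kwargs), None
    except HTTPError as e:
        # e.code is the HTTP status code, e.reason the short reason phrase.
        return None, f"HTTP {e.code}: {e.reason}"
```

With a valid context, a call might look like rsp, err = call_or_report(api.list_engines, ctx, "PROVISIONED"), after which err is None on success.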

Getting Information for an Engine

You can get information for a specific engine as follows:

api.get_engine(ctx, engine)

This gives you the following output:

{
    "id": "******",
    "name": "my_engine",
    "region": "us-east",
    "account_name": "******",
    "created_by": "******",
    "created_on": "2023-07-10T17:15:22.000Z",
    "size": "S",
    "state": "PROVISIONED"
}

If the engine does not exist, the output is an empty list.
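Because a missing engine yields an empty response, a create-if-missing step can be sketched as follows. The helper ensure_engine and its two callable parameters are hypothetical; the SDK calls are passed in from outside so the logic stays independent of railib, and the sketch assumes that an existing engine always yields a non-empty response.

```python
def ensure_engine(get_engine, create_engine, name, size="XS"):
    """Return (engine_info, created) for the engine called name.

    get_engine(name) and create_engine(name, size) are supplied by the
    caller. Hypothetical helper; not part of railib.
    """
    existing = get_engine(name)
    if existing:
        # Non-empty response: the engine already exists, so reuse it.
        return existing, False
    # Empty response: create the engine with the requested size.
    return create_engine(name, size), True
```

With a valid context, the callables could be lambda n: api.get_engine(ctx, n) and lambda n, s: api.create_engine_wait(ctx, n, s).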

Deleting an Engine

You can delete an engine as follows:

rsp = api.delete_engine(ctx, engine)

RelationalAI decouples computation from storage. Therefore, deleting an engine does not delete any cloud databases. See Managing Engines for more details.

Managing Databases

This section covers the API functions you need to use to manage databases.

Creating a Database

You can create a database as follows:

database = "my_database"
 
rsp = api.create_database(ctx, database)
print(json.dumps(rsp, indent=2))

The result from a successful create_database call looks like this:

{
    "id": "******",
    "name": "my_database",
    "region": "us-east",
    "account_name": "******",
    "created_by": "******",
    "created_on": "2023-07-20T08:03:03.616Z",
    "state": "CREATED"
}

A failed call returns a status message such as:

{
    "status": "Conflict",
    "message": "database already exists"
}
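The two response shapes above can be told apart by their keys. The sketch below assumes that a failed call always carries both "status" and "message", as in the example, and that a successful one carries "name" and "state"; the helper name describe_create_result is hypothetical.

```python
def describe_create_result(rsp: dict) -> str:
    """Summarize a create_database response in one line.

    Hypothetical helper; the key names follow the sample responses above.
    """
    if "status" in rsp and "message" in rsp:
        # Failure shape: {"status": ..., "message": ...}
        return f"failed ({rsp['status']}): {rsp['message']}"
    # Success shape: {"name": ..., "state": ..., ...}
    return f"{rsp.get('name')} is {rsp.get('state')}"
```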

Cloning a Database

You can clone a database by specifying the target and source databases:

database = "my_database"
 
rsp = api.create_database(ctx, "my_clone_database", source=database)
print(json.dumps(rsp, indent=2))

Any subsequent changes to either database will not affect the other. Cloning a database fails if the source database does not exist.

⚠

You cannot clone from a database until an engine has executed at least one transaction on that database.

Listing Databases

You can list the available databases associated with your account as follows:

rsp = api.list_databases(ctx)
print(json.dumps(rsp, indent=2))

This returns a JSON array containing details for each database:

[
    {
        "id": "******",
        "name": "my_database",
        "region": "us-east",
        "account_name": "******",
        "created_by": "******",
        "created_on": "2023-07-20T08:03:03.616Z",
        "state": "CREATED"
    }
]

To filter databases by state, you can use the state parameter. For instance:

state = "CREATED"
rsp = api.list_databases(ctx, state)
print(json.dumps(rsp, indent=2))
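To summarize databases across all states without issuing one request per state, the unfiltered list can be grouped on the client. This is a sketch that assumes each entry is a dict with a "state" key, as in the sample output above; count_by_state is a hypothetical helper.

```python
from collections import Counter


def count_by_state(items):
    """Count list entries per "state" value.

    Entries without a "state" key are counted under "UNKNOWN".
    Hypothetical helper; not part of railib.
    """
    return Counter(item.get("state", "UNKNOWN") for item in items)
```

For example, count_by_state(api.list_databases(ctx)) would return a Counter keyed by state name.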

Possible states are:

"CREATED".
"CREATING".
"CREATION_FAILED".
"DELETED".

Getting Information for a Database

You can get information for a specific database as follows:

database = "my_database"
rsp = api.get_database(ctx, database)
print(json.dumps(rsp, indent=2))

It gives this output:

{
    "id": "******",
    "name": "my_database",
    "region": "us-east",
    "account_name": "******",
    "created_by": "******",
    "created_on": "2023-07-20T08:03:03.616Z",
    "state": "CREATED"
}

If the database does not exist, [] is returned.

Deleting a Database

You can delete a database as follows:

rsp = api.delete_database(ctx, database_name)
print(json.dumps(rsp, indent=2))

If successful, the response will be of the form:

{"name": "my_database", "message": "deleted successfully"}

⚠

Deleting a database cannot be undone.

🔎

The remaining code examples in this guide assume that you have a running engine in engine and a database in database.

Managing Rel Models

This section covers the API functions you can use to manage Rel models.

Rel models are collections of Rel code that can be added to, updated in, or deleted from a dedicated database. Both a running engine and a database are required to perform operations on models.

Loading a Model

The api.install_model function loads a Rel model in a given database. The last argument is a Python dictionary, mapping names to models, so that more than one named model can be installed.

For example, this is how to add a Rel model to a database:

model_string = """ def countries = {"United States of America"; "Germany"; "Japan"; "Greece"}
def oceans = {"Arctic"; "Atlantic"; "Indian"; "Pacific"; "Southern"} """
 
api.install_model(ctx, database, engine, {"my_model" : model_string})

⚠

If the database already contains an installed model with the same name as the newly installed model, then the new model replaces the existing one.

If you need to load from a file, you can read it into a string first. For example:

from os import path
 
my_model = {}
with open(fname) as fp:
    my_model[path.basename(fname)] = fp.read()
 
api.install_model(ctx, database, engine, my_model)

The argument fname is a string.
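The same pattern extends to a whole directory of model files. The sketch below builds the dictionary that api.install_model expects; the helper name read_models and the .rel file suffix are assumptions made for illustration.

```python
from pathlib import Path


def read_models(directory, suffix=".rel"):
    """Map each matching file's base name to its contents.

    Hypothetical helper; the ".rel" suffix is an assumed naming convention.
    """
    models = {}
    for p in sorted(Path(directory).glob(f"*{suffix}")):
        # Use the base name as the model name, as in the single-file example.
        models[p.name] = p.read_text(encoding="utf-8")
    return models
```

The result can then be passed as the last argument, for instance api.install_model(ctx, database, engine, read_models("my_models")).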

Note that you can also place a model in a specific folder by prefixing the model name with the folder path:

api.install_model(ctx, database, engine, {"my_models/my_model" : model_string})

Loading Multiple Models

You can also provide a Python dictionary with a collection of models, together with their names.

Here is an example that loads multiple models at once:

model_string = """ def countries = {"United States of America"; "Germany"; "Japan"; "Greece"}"""
model_string2 = """ def oceans = {"Arctic"; "Atlantic"; "Indian"; "Pacific"; "Southern"}"""
 
api.install_model(ctx, database, engine, {"my_model" : model_string, "my_model2" : model_string2})

Listing Models

You can list the models in a database as follows:

api.list_models(ctx, database, engine)

This returns a JSON array of names:

[
  "rel/alglib",
  "rel/display",
  "rel/graph-basics",
  "rel/graph-centrality",
  "rel/graph-components",
  "rel/graph-degree",
  "rel/graph-measures",
  "rel/graph-paths",
  "rel/histogram",
  "rel/intrinsics",
  "rel/mathopt",
  "rel/mirror",
  "rel/net",
  "rel/stdlib",
  "rel/vega",
  "rel/vegalite"
]

In the example above, you can see all the built-in models associated with a database.
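To separate your own models from the built-in ones, the returned names can be filtered on the client. This sketch assumes that built-in models all share the "rel/" prefix seen in the listing above; user_models is a hypothetical helper.

```python
def user_models(names, builtin_prefix="rel/"):
    """Return the model names that do not start with builtin_prefix.

    Hypothetical helper; the "rel/" prefix is inferred from the sample
    listing and may not cover every built-in model.
    """
    return [n for n in names if not n.startswith(builtin_prefix)]
```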

Getting Information for a Model

To see the contents of a given model, you can use:

modelname = "my_model"
api.get_model(ctx, database, engine, modelname)

This gives the following output:

{
  "name": "my_model",
  "value": "def my_range(x) = range(1, 10, 1, x)"
}

In the example above, my_model defines a specific range.

Deleting Models

You can delete models from a database using the following code:

api.delete_model(ctx, database, engine, model_name)

Note that model_name is a list of strings containing the names of the model or models to be deleted.

Querying a Database

The API call for executing queries against the database is exec. It is a synchronous function: the calling code blocks until the transaction completes or until repeated timeouts indicate that the system may be inaccessible. Each query is a complete transaction, executed in the context of the provided database.

The exec function specifies a Rel query, which can be empty, and a set of input relations. It is defined as follows:

api.exec(
    ctx: api.Context,
    database: str,
    engine: str,
    command: str,
    inputs: dict = None,
    readonly: bool = True
)

Here’s an example of a read query using exec:

rsp = api.exec(ctx, database, engine,  "def output = {1; 2; 3}")
show.results(rsp)

This gives you the following output:

/:output/Int64
(1,)
(2,)
(3,)

By default, readonly is True.

Write queries, which update base relations through the control relations insert and delete, must use readonly=False.

Here’s an API call to load some CSV data and store them in the base relation my_base_relation:

data = """
name,lastname,id
John,Smith,1
Peter,Jones,2
"""
api.exec(ctx, database, engine,
"""
def config:schema:name="string"
def config:schema:lastname="string"
def config:schema:id="int"
def config:syntax:header_row=1
def config:data = my_data
 
def delete[:my_base_relation] = my_base_relation
def insert[:my_base_relation] = load_csv[config]
""",
inputs = {"my_data":data},
readonly=False
)
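When the schema varies between files, the Rel query string can be generated rather than written by hand. The sketch below reproduces the structure of the query above from a column-to-type dictionary. The helper name csv_load_query is hypothetical, and it assumes that column and relation names are valid Rel identifiers.

```python
def csv_load_query(relation: str, schema: dict, input_name: str = "my_data") -> str:
    """Build a Rel query that replaces `relation` with loaded CSV data.

    schema maps column names to Rel type names, e.g. {"id": "int"}.
    Hypothetical helper; not part of railib.
    """
    # One schema declaration per column.
    lines = [f'def config:schema:{col} = "{typ}"' for col, typ in schema.items()]
    lines += [
        "def config:syntax:header_row = 1",
        f"def config:data = {input_name}",
        # Clear the existing contents, then insert the freshly loaded rows.
        f"def delete[:{relation}] = {relation}",
        f"def insert[:{relation}] = load_csv[config]",
    ]
    return "\n".join(lines)
```

The generated string would be passed as the command argument of api.exec, together with inputs={"my_data": data} and readonly=False.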

The RelationalAI SDK for Python also supports asynchronous transactions, through exec_async. In summary, when you issue a query to the database, you get an output containing a transaction ID that can subsequently be used to retrieve the actual query results.

exec_async takes the same arguments as exec, but it returns without waiting for the transaction to finish, so the calling code is not blocked:

rsp_async= api.exec_async(
    ctx,
    database, engine,
    "def output = {1;2;3}"
)

Then, you can poll the transaction until it has completed or aborted. Finally, you can fetch the results:

if rsp_async.transaction['state'] == "COMPLETED":
    show.results(rsp_async)
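The polling step mentioned above can be sketched as a small loop. The helper wait_for_transaction is hypothetical, and so is its fetch_state parameter: a caller-supplied function that returns the current state string of the transaction. The state name "ABORTED" is assumed from the wording above; only "COMPLETED" appears in the sample code.

```python
import time

# Assumption: these two states mean that the transaction has finished.
TERMINAL_STATES = {"COMPLETED", "ABORTED"}


def wait_for_transaction(fetch_state, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll fetch_state() until a terminal state is seen or timeout elapses.

    Returns the terminal state, or raises TimeoutError.
    Hypothetical helper; not part of railib.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"transaction still {state!r} after {timeout}s")
        sleep(interval)
```

Passing the sleep function as a parameter keeps the loop easy to test without real delays.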

Similarly, you can get the results, metadata, and problems for a given transaction ID using the following functions:

results = api.get_transaction_results(ctx, rsp_async.transaction['id'])
metadata = api.get_transaction_metadata(ctx, rsp_async.transaction['id'])
problems = api.get_transaction_problems(ctx, rsp_async.transaction['id'])

⚠

The query size is limited to 64MB. An HTTPError exception will be thrown if the request exceeds this API limit.
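A rough client-side check can catch oversized requests before they are sent. This is only an estimate: it counts the UTF-8 bytes of the query and inputs, ignores any request envelope, and reads "64MB" as 64 MiB, which is an assumption. Both helper names are hypothetical.

```python
# Assumption: the documented "64MB" limit is interpreted as 64 * 1024 * 1024 bytes.
MAX_REQUEST_BYTES = 64 * 1024 * 1024


def estimated_request_size(command: str, inputs: dict = None) -> int:
    """Estimate the payload size in bytes of a query and its inputs."""
    size = len(command.encode("utf-8"))
    for name, value in (inputs or {}).items():
        size += len(name.encode("utf-8")) + len(str(value).encode("utf-8"))
    return size


def fits_request_limit(command, inputs=None, limit=MAX_REQUEST_BYTES):
    """Return True if the estimated size does not exceed limit."""
    return estimated_request_size(command, inputs) <= limit
```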

Getting Multiple Relations Back

To return multiple relations, you can define subrelations of output. For example:

rsp = api.exec(ctx, database, engine,
    "def a = 1;2 def b = 3;4 def output:one = a def output:two = b")
show.results(rsp)

It gives this output:

{
    "v1": [
        3,
        4
    ]
},
{
    "v1": [
        1,
        2
    ]
}

Result Structure

The response is a Python dictionary with the following keys:

Field	Meaning
Transaction	Information about the transaction status, including the identifier.
Metadata	Metadata information about the results key.
Results	Query output information.
Problems	Information about any existing problems in the database, which are not necessarily caused by the query.

Transaction

The transaction key is a JSON string with the following fields:

Field	Meaning
ID	Transaction identifier.
State	Transaction state. See Transaction States for more details.

For example:

{
    "id": "******",
    "state": "COMPLETED"
}

Metadata

The metadata key is a JSON string with the following fields:

Field	Meaning
Relation ID	This is a relation identifier, for example, `"/:output/:two/Int64"`. It describes the relation name `/:output/:two` followed by its data schema `Int64`.
Types	This is a JSON array that contains the key names of the relation and their data type.

For example:

{
    relation_id {
        arguments {
            tag: CONSTANT_TYPE
            constant_type {
                rel_type {
                    tag: PRIMITIVE_TYPE
                    primitive_type: STRING
                }
                value {
                    arguments {
                        tag: STRING
                        string_val: "output"
                    }
                }
            }
        }
        arguments {
            tag: PRIMITIVE_TYPE
            primitive_type: INT_64
        }
    }
}
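The string form of a relation ID described in the table above, such as "/:output/:two/Int64", can be split into its name segments and its type segments. This is a sketch under the assumption that segments are separated by "/" and that name segments start with ":"; it does not handle values that themselves contain a slash. The helper name split_relation_id is hypothetical.

```python
def split_relation_id(relation_id: str):
    """Split a relation ID string into (name_segments, type_segments).

    Hypothetical helper; not part of railib.
    """
    parts = [p for p in relation_id.split("/") if p]
    # Segments starting with ":" are symbols that make up the relation name.
    names = [p[1:] for p in parts if p.startswith(":")]
    # The remaining segments describe the data schema.
    types = [p for p in parts if not p.startswith(":")]
    return names, types
```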

Results

The results key is a list whose entries have the following fields:

Field	Meaning
Relation ID	This is a key for the relation, for example, `"v1"`. It refers to the column name in the Arrow table that contains the data, where `"v"` stands for variable, since a relation’s tuples contain several variables.
Table	This contains the results of the query in a JSON-array format.

For example:

v1: [[1,2,3]]

Problems

The problems key is a JSON string with the following fields:

Field	Meaning
error_code	The type of error that happened, for example, `"PARSE_ERROR"`.
is_error	Whether an error occurred or there was some other problem.
is_exception	Whether an exception occurred or there was some other problem.
message	A short description of the problem.
path	A file path for the cases when such a path was used.
report	A long description of the problem.
type	The type of problem, for example, `"ClientProblem"`.

For example:

{
    'is_error': True, 
    'error_code': 'PARSE_ERROR', 
    'path': '', 
    'report': '1| def output = {1; 2; 3\n  ^~~~~~~~\n', 'message': 'Missing closing `}`.', 
    'is_exception': False, 
    'type': 'ClientProblem'
}
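Using the fields listed above, the problems that represent actual errors can be picked out and summarized. The sketch assumes the problems are available as a list of dictionaries shaped like the example; error_messages is a hypothetical helper.

```python
def error_messages(problems):
    """Return "CODE: message" strings for entries flagged as errors or exceptions.

    Hypothetical helper; field names follow the problems table above.
    """
    return [
        f"{p.get('error_code', 'UNKNOWN')}: {p.get('message', '')}"
        for p in problems
        if p.get("is_error") or p.get("is_exception")
    ]
```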

Specifying Inputs

The exec API call takes an optional inputs dictionary that can be used to map relation names to string constants for the duration of the query. For example:

api.exec(ctx, database, engine, "def output = foo", inputs = {"foo" : "asdf"})

This returns the string "asdf".

Functions that transform a file and write the results to a base relation can be written in this way. The calls api.load_csv and api.load_json are special cases of this. See, for example, the sample code using load_csv in Querying a Database.

Printing Responses

The railib.show module can be used to print API responses. For example:

rsp = api.exec(ctx, database, engine, "def output = {1;2;3}")
show.results(rsp)
show.problems(rsp)

Loading Data

The load_csv and load_json functions allow you to load data into a database. These are not strictly necessary, since the Rel load utilities can also be used for this task. See the CSV Import and JSON Import guides for more information.

💡

It’s advisable to load data using built-in Rel utilities within queries, rather than these specific SDK functions. See Querying a Database for more details.

Loading CSV Data

The load_csv function loads CSV data and inserts the result into the base relation named by the relation argument.

api.load_csv(ctx: api.Context, database: str, engine: str, relation: str,
    data, syntax: dict = {})
    # `syntax`:
    #   * header: a map from col number to name (base 1)
    #   * header_row: row number of header, 0 means no header (default: 1)
    #   * delim: default: ,
    #   * quotechar: default: "
    #   * escapechar: default: \
    #
    # Schema: a map from col name to rel type name, eg:
    #   {'a': "int", 'b': "string"}

Here’s an example:

file_name = "my_data_file.csv"
with open(file_name) as fp:
    data = fp.read()
 
api.load_csv(ctx, database, engine, "my_csv", data)

By default, load_csv attempts to guess the schema of the data. The syntax argument allows you to specify how to parse a given CSV file, including its schema, delimiters, and escape characters. See this example for more details.

Loading JSON Data

The load_json function loads JSON data and inserts them into the base relation named by the relation argument.

api.load_json(ctx: api.Context, database: str, engine: str, relation: str, data)

Here’s an example:

api.load_json(ctx, database, engine, "my_json", """{"a" : "b"}""")

💡

In both the load_csv and load_json functions, the base relation named by relation is not cleared, allowing for multipart, incremental loads.

You can clear a base relation, such as my_base_relation, as follows:

rsp = api.exec(
    ctx, 
    database, 
    engine,  
    "def delete[:my_base_relation] = my_base_relation", 
    readonly=False
)

Listing Base Relations

You can list the base relations in a given database as follows:

api.list_edbs(ctx, database, engine)

The result is a JSON list of objects.

Managing Transactions

This section covers the API functions you can use to manage transactions.

Listing Transactions

You can list the transactions in your context ctx as follows:

rsp = api.list_transactions(ctx)
print(json.dumps(rsp, indent=2))

Canceling Transactions

You can cancel an ongoing transaction as follows:

api.cancel_transaction(ctx, id)

The argument id is a string that represents the transaction ID, for instance, the value of rsp_async.transaction['id'] from a previous exec_async call.
