RelationalAI SDK for Python
This guide presents the main features of the RelationalAI SDK for Python, which can be used to interact with RelationalAI’s Relational Knowledge Graph System (RKGS).
The rai-sdk-python package is open source and is available in this GitHub repository:
RelationalAI/rai-sdk-python
It includes self-contained examples of the main API functionality. Contributions and pull requests are welcome.
Note: This guide applies to rai-sdk-python, the latest iteration of the RelationalAI SDK for Python.
The relationalai-sdk package is deprecated.
See also Python SDK Through Visual Studio Code if you are using a Microsoft VS Code environment to work with the RelationalAI SDK for Python.
Requirements
You can check the rai-sdk-python repository for the latest version requirements to interact with the RKGS using the RelationalAI SDK for Python.
Installation
The RelationalAI SDK for Python is a stand-alone client library and can be installed using the pip Python package manager as shown below:
# using the pip package manager
pip install rai-sdk
Additional ways of installing the library can be found in the rai-sdk-python repository.
Configuration
The RelationalAI SDK for Python can access your RAI Server credentials using a configuration file. See SDK Configuration for more details. If you get “local issuer certificate” errors, you may need to install local certificates for your Python.
The Python API config.read() function takes the configuration file and the profile name as optional arguments. For instance, you can run the following code from a .py file or directly from the Python command-line interface:
from railib import config
cfg = config.read(fname="~/.rai/config", profile="default")
To load a different configuration, you can replace "default" with a different profile name.
Creating a Context
Most API operations use a context object, constructed with railib.api.Context.
To apply the default profile in your ~/.rai/config file, you can use:
from railib import api, config
cfg = config.read()
# to specify a non-default profile instead:
# cfg = config.read(profile='myprofile')
ctx = api.Context(**cfg)
As an alternative to loading a configuration and using **cfg, you can also specify the different fields directly using keyword arguments in api.Context.
See help(api.Context) for details.
You can test your configuration and context by running:
import json
rsp = api.list_databases(ctx)
print(json.dumps(rsp, indent=2))
This should return a list with database info, assuming your keys have the corresponding permissions. See Listing Databases below.
Another way of testing your setup is by running one of the examples from the repository. You can do that directly from a terminal:
$ cd examples
$ python3 ./list_databases.py
The remaining code examples in this document assume that you have a valid context in the ctx Python variable and the following imports:
from railib import api, config, show
import json
from urllib.error import HTTPError
Moreover, all of them can be run from the Python command-line interface or a .py file.
Most API requests return a JSON value, represented in Python as a dict, which can be converted to a string with json.dumps and then printed:
print(json.dumps(rsp, indent=2))
Managing Users
This section covers the API functions that you need to manage users.
Each user has a role associated with specific permissions. These permissions determine the operations that the user can execute. See User Roles in the Managing Users and OAuth Clients guide for more details.
Creating a User
api.create_user(ctx, userid, roles)
Here, userid is a string identifying the user, and roles is a list of railib.api.Role values. It defaults to None, which assigns the user role.
Valid roles are created with api.Role(<rolestring>). For a list of roles:
rsp = [r.value for r in api.Role]
print(json.dumps(rsp, indent=2))
["user", "admin"]
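The way role strings map to enum members can be sketched without the SDK. The Role class below is only a stand-in for railib.api.Role (its member names are assumptions), and parse_roles is a hypothetical helper that validates role strings before they are passed to api.create_user:

```python
from enum import Enum, unique


# Stand-in for railib.api.Role so the example runs without the SDK.
# The member names are assumptions; only the values come from this guide.
@unique
class Role(Enum):
    USER = "user"
    ADMIN = "admin"


def parse_roles(names):
    """Convert role strings to Role members, rejecting unknown names."""
    roles = []
    for name in names:
        try:
            roles.append(Role(name))
        except ValueError:
            valid = [r.value for r in Role]
            raise ValueError(
                f"unknown role {name!r}; expected one of {valid}"
            ) from None
    return roles


print([r.value for r in parse_roles(["user", "admin"])])  # ['user', 'admin']
```

With the real SDK, the resulting list would be passed as the roles argument of api.create_user.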
Disabling and Enabling a User
You can disable a user through:
api.disable_user(ctx, id)
Again, id is a string representing a given user’s ID.
You can reenable the user as follows:
api.enable_user(ctx, id)
Listing Users
rsp = api.list_users(ctx)
print(json.dumps(rsp, indent=2))
Getting Information for a User
You can get information for a user as follows:
rsp = api.get_user(ctx, user)
print(json.dumps(rsp, indent=2))
Here, user is a string ID, for example, "auth0|XXXXXXXXXXXXXXXXXX".
Deleting a User
You can delete a user through:
api.delete_user(ctx, id)
In this case, id is a string reflecting a given user’s ID.
Managing OAuth Clients
This section covers the API functions that you need to manage OAuth clients.
Each OAuth client has a specific set of permissions. These permissions determine the operations that the OAuth client can execute. See Permissions for OAuth Clients in the Managing Users and OAuth Clients guide for more details.
Creating an OAuth Client
You can create an OAuth client as follows:
api.create_oauth_client(ctx, name, permissions)
Here, name is a string identifying the client, and permissions is a list of railib.api.Permission values. It defaults to None, which assigns no permissions.
To see the list of permissions, run the following:
rsp = [p.value for p in api.Permission]
print(json.dumps(rsp, indent=2))
This gives you the following output:
[
  "create:engine", "delete:engine", "list:engine", "read:engine",
  "list:database", "update:database", "delete:database",
  "run:transaction", "read:transaction", "read:credits_usage",
  "create:oauth_client", "read:oauth_client", "list:oauth_client",
  "update:oauth_client", "delete:oauth_client",
  "rotate:oauth_client_secret",
  "create:user", "list:user", "read:user", "update:user",
  "list:role", "read:role",
  "list:permission", "create:accesskey", "list:accesskey"
]
Listing OAuth Clients
You can get a list of OAuth clients as follows:
api.list_oauth_clients(ctx)
Getting Information for an OAuth Client
You can get details for a specific OAuth client, identified by the string id, as follows:
api.get_oauth_client(ctx, id)
Deleting an OAuth Client
You can delete an OAuth client identified by the string id as follows:
api.delete_oauth_client(ctx, id)
Managing Engines
This section covers the API functions you need to use to manage engines.
Creating an Engine
You can create a new engine as follows:
engine="my_engine"
rsp = api.create_engine(ctx, engine)
print(json.dumps(rsp, indent=2))
By default, the engine size is XS.
You can create an engine of a different size by specifying the size parameter:
engine="my_engine"
size="S"
rsp = api.create_engine(ctx, engine, size)
print(json.dumps(rsp, indent=2))
Valid sizes are given as strings and can be one of the following:
XS (extra small).
S (small).
M (medium).
L (large).
XL (extra large).
To wait until the engine is created, you can use api.create_engine_wait instead:
engine="my_engine"
size="S"
rsp = api.create_engine_wait(ctx, engine, size)
print(json.dumps(rsp, indent=2))
Your engine may take some time to reach the “PROVISIONED” state, where it is ready for queries. It is in the “PROVISIONING” state until then.
Listing Engines
You can list all engines associated with your account as follows:
rsp = api.list_engines(ctx)
print(json.dumps(rsp, indent=2))
This returns a JSON array containing details for each engine:
[
{
"id": "ca******",
"name": "my_engine",
"region": "us-east",
"account_name": "******",
"created_by": "******",
"created_on": "2023-07-10T17:15:22.000Z",
"size": "S",
"state": "PROVISIONED"
}
]
To list engines that are in a given state, you can use the state parameter:
rsp = api.list_engines(ctx, "PROVISIONED")
print(json.dumps(rsp, indent=2))
Possible states are:
"REQUESTED".
"PROVISIONING".
"PROVISIONED".
"DEPROVISIONING".
If there is an error with the request, an HTTPError exception is raised.
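One way to handle such failures is to wrap SDK calls in a small helper that catches HTTPError and reports the status code and response body. The helper below, call_api, is hypothetical and not part of the SDK; it is a minimal sketch that works with any callable:

```python
from urllib.error import HTTPError


def call_api(fn, *args, **kwargs):
    """Call fn and return (response, None), or (None, message) on HTTPError."""
    try:
        return fn(*args, **kwargs), None
    except HTTPError as e:
        # HTTPError carries the status code, the reason phrase, and the body.
        body = e.read().decode("utf-8", errors="replace")
        return None, f"HTTP {e.code} {e.reason}: {body}"


# Intended use with the SDK (not executed here):
# rsp, err = call_api(api.list_engines, ctx, "PROVISIONED")
# if err is not None:
#     print(err)
```

Returning the error as a value keeps scripts that issue many requests from stopping at the first failure; re-raising would be equally reasonable in other settings.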
Getting Information for an Engine
You can get information for a specific engine as follows:
api.get_engine(ctx, engine)
This gives you the following output:
{
"id": "******",
"name": "my_engine",
"region": "us-east",
"account_name": "******",
"created_by": "******",
"created_on": "2023-07-10T17:15:22.000Z",
"size": "S",
"state": "PROVISIONED"
}
If the engine does not exist, the output is an empty list.
Deleting an Engine
You can delete an engine as follows:
rsp = api.delete_engine(ctx, engine)
RelationalAI decouples computation from storage. Therefore, deleting an engine does not delete any cloud databases. See Managing Engines for more details.
Managing Databases
This section covers the API functions you need to use to manage databases.
Creating a Database
You can create a database as follows:
database = "my_database"
rsp = api.create_database(ctx, database)
print(json.dumps(rsp, indent=2))
The result from a successful create_database call looks like this:
{
"id": "******",
"name": "my_database",
"region": "us-east",
"account_name": "******",
"created_by": "******",
"created_on": "2023-07-20T08:03:03.616Z",
"state": "CREATED"
}
A failed call returns a status message such as:
{
"status": "Conflict",
"message": "database already exists"
}
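Since a failed call returns a payload rather than the database details, a script may want to tell the two apart. The following sketch assumes that failure payloads always carry both a "status" and a "message" key, as in the example above; failure_message is a hypothetical helper, not an SDK function:

```python
def failure_message(rsp):
    """Return 'status: message' if rsp looks like a failure payload, else None."""
    # Assumption: failures have both keys; successful responses lack "status".
    if isinstance(rsp, dict) and "status" in rsp and "message" in rsp:
        return f"{rsp['status']}: {rsp['message']}"
    return None


failed = {"status": "Conflict", "message": "database already exists"}
created = {"name": "my_database", "state": "CREATED"}

print(failure_message(failed))   # Conflict: database already exists
print(failure_message(created))  # None
```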
Cloning a Database
You can clone a database by specifying the target and source databases:
database = "my_database"
rsp = api.create_database(ctx, "my_clone_database", source=database)
print(json.dumps(rsp, indent=2))
Any subsequent changes to either database will not affect the other. Cloning a database fails if the source database does not exist.
You cannot clone from a database until an engine has executed at least one transaction on that database.
Listing Databases
You can list the available databases associated with your account as follows:
rsp = api.list_databases(ctx)
print(json.dumps(rsp, indent=2))
This returns a JSON array containing details for each database:
[
{
"id": "******",
"name": "my_database",
"region": "us-east",
"account_name": "******",
"created_by": "******",
"created_on": "2023-07-20T08:03:03.616Z",
"state": "CREATED"
}
]
To filter databases by state, you can use the state parameter.
For instance:
state = "CREATED"
rsp = api.list_databases(ctx, state)
print(json.dumps(rsp, indent=2))
Possible states are:
"CREATED".
"CREATING".
"CREATION_FAILED".
"DELETED".
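When an overview of all states is more useful than a single filter, the unfiltered listing can be grouped on the client side. This is a minimal sketch that assumes each entry has "name" and "state" keys, as in the sample output above; names_by_state is a hypothetical helper:

```python
from collections import defaultdict


def names_by_state(items):
    """Group the names in a list_databases-style response by their state."""
    groups = defaultdict(list)
    for item in items:
        # Entries without a "state" key are collected under "UNKNOWN".
        groups[item.get("state", "UNKNOWN")].append(item["name"])
    return dict(groups)


sample = [
    {"name": "my_database", "state": "CREATED"},
    {"name": "old_database", "state": "DELETED"},
    {"name": "my_clone_database", "state": "CREATED"},
]
print(names_by_state(sample))
```

The same helper applies to the engine listing, since engine entries also carry "name" and "state" keys.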
Getting Information for a Database
You can get information for a specific database as follows:
database = "my_database"
rsp = api.get_database(ctx, database)
print(json.dumps(rsp, indent=2))
It gives this output:
{
"id": "******",
"name": "my_database",
"region": "us-east",
"account_name": "******",
"created_by": "******",
"created_on": "2023-07-20T08:03:03.616Z",
"state": "CREATED"
}
If the database does not exist, [] is returned.
Deleting a Database
You can delete a database as follows:
rsp = api.delete_database(ctx, database_name)
print(json.dumps(rsp, indent=2))
If successful, the response will be of the form:
{"name": "my_database", "message": "deleted successfully"}
Deleting a database cannot be undone.
The remaining code examples in this guide assume that you have a running engine in engine and a database in database.
Managing Rel Models
This section covers the API functions you can use to manage Rel models.
Rel models are collections of Rel code that can be added to, updated in, or deleted from a dedicated database. A running engine and a database are required to perform operations on models.
Loading a Model
The api.install_model function loads a Rel model in a given database.
The last argument is a Python dictionary, mapping names to models, so that more than one named model can be installed.
For example, this is how to add a Rel model to a database:
model_string = """ def countries = {"United States of America"; "Germany"; "Japan"; "Greece"}
def oceans = {"Arctic"; "Atlantic"; "Indian"; "Pacific"; "Southern"} """
api.install_model(ctx, database, engine, {"my_model" : model_string})
If the database already contains an installed model with the same name as the newly installed model, then the new model replaces the existing one.
If you need to load from a file, you can read it into a string first. For example:
from os import path
my_model = {}
with open(fname) as fp:
    my_model[path.basename(fname)] = fp.read()
api.install_model(ctx, database, engine, my_model)
Here, fname is a string containing the path to the model file.
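The same idea extends to a whole directory of model files. The sketch below builds the name-to-source dictionary that api.install_model expects; read_models is a hypothetical helper, and the ".rel" file extension and the use of relative paths as model names are assumptions:

```python
from pathlib import Path


def read_models(directory, pattern="*.rel"):
    """Map each model file's relative path to its source text."""
    root = Path(directory)
    models = {}
    # rglob also descends into subdirectories; sorting keeps the order stable.
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            models[name] = path.read_text(encoding="utf-8")
    return models


# Intended use with the SDK (not executed here):
# api.install_model(ctx, database, engine, read_models("my_models"))
```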
Note that you can also load models in a specific folder by adding the directory in the API function call:
api.install_model(ctx, database, engine, {"my_models/my_model" : model_string})
Loading Multiple Models
You can also provide a Python dictionary with a collection of models, together with their names.
Here is an example that loads multiple models at once:
model_string = """ def countries = {"United States of America"; "Germany"; "Japan"; "Greece"}"""
model_string2 = """ def oceans = {"Arctic"; "Atlantic"; "Indian"; "Pacific"; "Southern"}"""
api.install_model(ctx, database, engine, {"my_model" : model_string, "my_model2" : model_string2})
Listing Models
You can list the models in a database as follows:
api.list_models(ctx, database, engine)
This returns a JSON array of names:
[
"rel/alglib",
"rel/display",
"rel/graph-basics",
"rel/graph-centrality",
"rel/graph-components",
"rel/graph-degree",
"rel/graph-measures",
"rel/graph-paths",
"rel/histogram",
"rel/intrinsics",
"rel/mathopt",
"rel/mirror",
"rel/net",
"rel/stdlib",
"rel/vega",
"rel/vegalite"
]
In the example above, you can see all the built-in models associated with a database.
Getting Information for a Model
To see the contents of a given model, you can use:
modelname = "my_model"
api.get_model(ctx, database, engine, modelname)
This gives the following output:
{
"name": "my_model",
"value": "def my_range(x) = range(1, 10, 1, x)"
}
In the example above, my_model defines a specific range.
Deleting Models
You can delete models from a database using the following code:
api.delete_model(ctx, database, engine, model_name)
Note that model_name is a list of strings containing the names of the model or models to be deleted.
Querying a Database
The API call for executing queries against the database is exec.
It is a synchronous function: the calling code blocks until the transaction completes or until repeated timeouts indicate that the system may be inaccessible.
Each query is a complete transaction, executed in the context of the provided database.
The exec function takes a Rel query, which can be empty, and a set of input relations.
It is defined as follows:
api.exec(
    ctx: api.Context,
    database: str,
    engine: str,
    command: str,
    inputs: dict = None,
    readonly: bool = True
)
Here’s an example of a read query using exec:
rsp = api.exec(ctx, database, engine, "def output = {1; 2; 3}")
show.results(rsp)
This gives you the following output:
/:output/Int64
(1,)
(2,)
(3,)
By default, readonly is True.
Write queries, which update base relations through the control relations insert and delete, must use readonly=False.
Here’s an API call that loads some CSV data and stores it in the base relation my_base_relation:
data = """
name,lastname,id
John,Smith,1
Peter,Jones,2
"""
api.exec(ctx, database, engine,
    """
    def config:schema:name="string"
    def config:schema:lastname="string"
    def config:schema:id="int"
    def config:syntax:header_row=1
    def config:data = my_data
    def delete[:my_base_relation] = my_base_relation
    def insert[:my_base_relation] = load_csv[config]
    """,
    inputs={"my_data": data},
    readonly=False
)
The RelationalAI SDK for Python also supports asynchronous transactions, through exec_async.
When you issue an asynchronous query, the response contains a transaction ID that you can subsequently use to retrieve the actual query results.
exec_async takes the same arguments as exec, but it does not block the calling code:
rsp_async = api.exec_async(
    ctx,
    database, engine,
    "def output = {1;2;3}"
)
Then, you can poll the transaction until it has completed or aborted. Finally, you can fetch the results:
if rsp_async.transaction['state'] == "COMPLETED":
    show.results(rsp_async)
Similarly, you can get the results, metadata, and problems for a given transaction ID using the following functions:
results = api.get_transaction_results(ctx, rsp_async.transaction['id'])
metadata = api.get_transaction_metadata(ctx, rsp_async.transaction['id'])
problems = api.get_transaction_problems(ctx, rsp_async.transaction['id'])
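The polling step mentioned above, which precedes fetching the results, can be sketched as a loop with a timeout. The helper wait_for_transaction is hypothetical and takes any callable that returns the current state string, so it does not depend on a particular SDK function; treating "COMPLETED" and "ABORTED" as the only terminal states is an assumption based on the description above:

```python
import time

# Assumption: these are the states in which a transaction has finished.
TERMINAL_STATES = {"COMPLETED", "ABORTED"}


def wait_for_transaction(get_state, timeout=60.0, interval=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until a terminal state is reached or timeout expires."""
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(
                f"transaction still in state {state!r} after {timeout}s"
            )
        sleep(interval)


# Intended use (not executed here). api.get_transaction is assumed to
# return a dict with a "state" key for the given transaction ID:
# txn_id = rsp_async.transaction["id"]
# state = wait_for_transaction(
#     lambda: api.get_transaction(ctx, txn_id)["state"])
```

The clock and sleep parameters exist only to make the loop easy to test without real delays.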
The query size is limited to 64 MB. An HTTPError exception is raised if the request exceeds this API limit.
Getting Multiple Relations Back
To return multiple relations, you can define subrelations of output.
For example:
rsp = api.exec(ctx, database, engine,
"def a = 1;2 def b = 3;4 def output:one = a def output:two = b")
show.results(rsp)
It gives this output:
{
"v1": [
3,
4
]
},
{
"v1": [
1,
2
]
}
Result Structure
The response is a Python dictionary with the following keys:
Field | Meaning |
---|---|
Transaction | Information about the transaction status, including the identifier. |
Metadata | Metadata information about the results key. |
Results | Query output information. |
Problems | Information about any existing problems in the database — which are not necessarily caused by the query. |
Transaction
The transaction key is a JSON string with the following fields:
Field | Meaning |
---|---|
ID | Transaction identifier. |
State | Transaction state. See Transaction States for more details. |
For example:
{
"id": "******",
"state": "COMPLETED"
}
Metadata
The metadata key is a JSON string with the following fields:
Field | Meaning |
---|---|
Relation ID | This is a relation identifier, for example, "/:output/:two/Int64". It describes the relation name /:output/:two followed by its data schema Int64. |
Types | This is a JSON array that contains the key names of the relation and their data type. |
For example:
{
relation_id {
arguments {
tag: CONSTANT_TYPE
constant_type {
rel_type {
tag: PRIMITIVE_TYPE
primitive_type: STRING
}
value {
arguments {
tag: STRING
string_val: "output"
}
}
}
}
arguments {
tag: PRIMITIVE_TYPE
primitive_type: INT_64
}
}
}
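The string form of a relation identifier, such as "/:output/:two/Int64", can be split into its name part and its schema part. The sketch below is a naive parser that assumes segments starting with a colon are name symbols and all other segments are type names; parse_relation_id is a hypothetical helper, and identifiers with more complex types may need different handling:

```python
def parse_relation_id(relation_id):
    """Split '/:output/:two/Int64' into (['output', 'two'], ['Int64'])."""
    names, types = [], []
    for part in relation_id.strip("/").split("/"):
        if part.startswith(":"):
            names.append(part[1:])  # drop the leading colon of a symbol
        else:
            types.append(part)
    return names, types


print(parse_relation_id("/:output/:two/Int64"))  # (['output', 'two'], ['Int64'])
```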
Results
The results key is a vector with the following fields:
Field | Meaning |
---|---|
Relation ID | This is a key for the relation, for example, "v1". It refers to the column name in the Arrow table that contains the data, where "v" stands for variable, since a relation’s tuples contain several variables. |
Table | This contains the results of the query in a JSON-array format. |
For example:
v1: [[1,2,3]]
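Because the data arrives column by column, it is often convenient to turn a table into a list of tuples. This sketch assumes a table is a dictionary from column keys of the form "v1", "v2", and so on to flat lists of equal length; columns_to_rows is a hypothetical helper, not an SDK function:

```python
def columns_to_rows(table):
    """Convert {'v1': [...], 'v2': [...]} into a list of row tuples."""
    # Order the columns by their numeric suffix so that v2 follows v1.
    keys = sorted(table, key=lambda k: int(k[1:]))
    return list(zip(*(table[k] for k in keys)))


print(columns_to_rows({"v1": [1, 2, 3]}))                 # [(1,), (2,), (3,)]
print(columns_to_rows({"v2": ["a", "b"], "v1": [1, 2]}))  # [(1, 'a'), (2, 'b')]
```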
Problems
The problems key is a JSON string with the following fields:
Field | Meaning |
---|---|
error_code | The type of error that happened, for example, "PARSE_ERROR". |
is_error | Whether an error occurred or there was some other problem. |
is_exception | Whether an exception occurred or there was some other problem. |
message | A short description of the problem. |
path | A file path for the cases when such a path was used. |
report | A long description of the problem. |
type | The type of problem, for example, "ClientProblem". |
For example:
{
  'is_error': True,
  'error_code': 'PARSE_ERROR',
  'path': '',
  'report': '1| def output = {1; 2; 3\n ^~~~~~~~\n',
  'message': 'Missing closing `}`.',
  'is_exception': False,
  'type': 'ClientProblem'
}
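The fields above are enough to pick out the problems that matter. The following sketch assumes the problems are available as a list of dictionaries shaped like the example; error_summaries is a hypothetical helper that keeps only errors and exceptions and formats each as one line:

```python
def error_summaries(problems):
    """Return 'error_code: message' for every error or exception."""
    return [
        f"{p.get('error_code', 'UNKNOWN')}: {p.get('message', '')}"
        for p in problems
        if p.get("is_error") or p.get("is_exception")
    ]


problems = [
    {
        "is_error": True,
        "error_code": "PARSE_ERROR",
        "path": "",
        "report": "1| def output = {1; 2; 3\n ^~~~~~~~\n",
        "message": "Missing closing `}`.",
        "is_exception": False,
        "type": "ClientProblem",
    }
]
print(error_summaries(problems))  # ['PARSE_ERROR: Missing closing `}`.']
```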
Specifying Inputs
The exec API call takes an optional inputs dictionary that can be used to map relation names to string constants for the duration of the query.
For example:
api.exec(ctx, database, engine, "def output = foo", inputs = {"foo" : "asdf"})
This returns the string "asdf" back.
Functions that transform a file and write the results to a base relation can be written in this way.
The calls api.load_csv and api.load_json are special cases of this.
See, for example, the sample code using load_csv in Querying a Database.
Printing Responses
The railib.show module can be used to print API responses.
For example:
rsp = api.exec(ctx, database, engine, "def output = {1;2;3}")
show.results(rsp)
show.problems(rsp)
Loading Data
The load_csv and load_json functions allow you to load data into a database.
These are not strictly necessary, since the Rel load utilities can also be used for this task.
See the CSV Import and JSON Import guides for more information.
It’s advisable to load data using built-in Rel utilities within queries, rather than these specific SDK functions. See Querying a Database for more details.
Loading CSV Data
The load_csv function loads CSV data and inserts the result into the base relation named by the relation argument.
api.load_csv(ctx: api.Context, database: str, engine: str, relation: str,
             data, syntax: dict = {})
# `syntax`:
# * header: a map from col number to name (base 1)
# * header_row: row number of header, 0 means no header (default: 1)
# * delim: default: ,
# * quotechar: default: "
# * escapechar: default: \
#
# Schema: a map from col name to rel type name, eg:
# {'a': "int", 'b': "string"}
Here’s an example:
file_name = "my_data_file.csv"
with open(file_name) as fp:
    data = fp.read()
api.load_csv(ctx, database, engine, "my_csv", data)
By default, load_csv attempts to guess the schema of the data.
The syntax argument allows you to specify how to parse a given CSV file, including its schema, delimiters, and escape characters.
See this example for more details.
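As an illustration of the syntax argument, the sketch below writes a small pipe-delimited CSV string with the standard csv module and builds a matching syntax dictionary. The keys used (header_row, delim, quotechar) are the ones listed in the comment block above; the sample rows are made up, and the commented-out call shows the intended use rather than verified SDK behavior:

```python
import csv
import io

# Made-up sample rows, with the header in the first row.
rows = [["name", "lastname", "id"], ["John", "Smith", 1], ["Peter", "Jones", 2]]

buf = io.StringIO()
writer = csv.writer(buf, delimiter="|", quotechar='"', lineterminator="\n")
writer.writerows(rows)
data = buf.getvalue()

# The syntax options must describe the same dialect used to write the data.
syntax = {
    "header_row": 1,   # the first row holds the column names
    "delim": "|",
    "quotechar": '"',
}

# Intended use with the SDK (not executed here):
# api.load_csv(ctx, database, engine, "my_csv", data, syntax)
```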
Loading JSON Data
The load_json function loads JSON data and inserts it into the base relation named by the relation argument.
api.load_json(ctx: api.Context, database: str, engine: str, relation: str, data)
Here’s an example:
api.load_json(ctx, database, engine, "my_json", """{"a" : "b"}""")
In both the load_csv and load_json functions, the base relation relation is not cleared, allowing for multipart, incremental loads.
You can clear a base relation, such as my_base_relation, as follows:
rsp = api.exec(
ctx,
database,
engine,
"def delete[:my_base_relation] = my_base_relation",
readonly=False
)
Listing Base Relations
You can list the base relations in a given database as follows:
api.list_edbs(ctx, database, engine)
The result is a JSON list of objects.
Managing Transactions
This section covers the API functions you can use to manage transactions.
Listing Transactions
You can list the transactions in your context ctx as follows:
rsp = api.list_transactions(ctx)
print(json.dumps(rsp, indent=2))
Canceling Transactions
You can cancel an ongoing transaction as follows:
api.cancel_transaction(ctx, id)
The argument id is a string that represents the transaction ID, for instance, rsp[0]["id"] from a previous exec API call.