Database Transactions
This guide discusses transactions in the Relational Knowledge Graph System (RKGS) and the main transaction modes you can use to interact with a RAI database.
Introduction
A database transaction is a unit of work that takes the database from one state S
to a new state S_new
.
Transactions happen atomically, meaning that either the entire transaction succeeds,
with no externally visible intermediate states and no interference from other transactions,
or the database remains at S
.
In the case of a read-only transaction, the two states are the same.
Transactions serve as a safeguard, ensuring robustness against:
Type | Description |
---|---|
Data inconsistencies | When concurrent queries try to simultaneously update the same data within the database, the transaction must isolate them from each other to avoid conflicts. |
System failures | When a system error appears, all transactional changes prior to the error source can be rolled back, keeping the database in a consistent state at all times. |
In order to ensure that a database is reliable, transactions must be atomic, consistent, isolated, and durable. These properties are commonly abbreviated as ACID (opens in a new tab).
RAI is ACID compliant.
This guide initially introduces transaction states, followed by an overview of the primary transaction modes within a RAI database. Finally, it delves into concurrent transactions. Overall, the focus is on data transactions. Other type of transactions such as those related to the control plane — for instance, managing engines and users — are not covered in this guide. See the RAI Console overview for more information on those.
Transaction States
Transactions can be in one of five states:
Status | Explanation |
---|---|
Created | Transaction has been issued but is not yet running. |
Running | Transaction is currently running. Running transactions can be canceled. |
Canceling | Transaction has been asked to cancel. Canceled transactions will eventually move to an aborted state. |
Completed | Transaction has run successfully. |
Aborted | Transaction has been canceled by the user or aborted by the system. See Abort Reasons for more details. |
This is the transaction states flowchart:
See Managing Transactions in the RAI Console for more details.
Transaction Modes
The main transaction modes used to interact with a RAI database are:
Mode | Description |
---|---|
Write query | A write query is allowed to change the state of the database by modifying base relations. Write queries may also extract data via the output relation. |
Read query | A read query may extract data from the database via the output relation but is not allowed to change the state of the database. |
Model | A transaction that persists a Rel model in the database. |
Their mapping to the RelationalAI SDKs are as follows:
Mode | SDK Mapping |
---|---|
Write query | exec call with readonly = false . |
Read query | exec call with readonly = true . |
Model | load_model call. |
Rel code snippets across the documentation site indicate whether the code should be run as a read query, a write query, or a model.
Next, take a look at the transaction modes and how to execute them using different RAI interfaces.
Query
A query transaction allows you to:
- Retrieve information from the database.
- Add, delete, or modify persisted data, which are stored in base relations.
- Alter system behavior by modifying specific control relations.
Query transactions can’t be used to persist modifications to derived relations. To do this, you must load a model.
Here’s an example of a read query using the Python SDK.
It assumes that you have an operating database
, engine
, and context ctx
:
rsp = api.exec(ctx, database, engine,
"""
def output = {1; 2; 3}
""",
readonly = True
)
show.results(rsp)
This gives the output:
/:output/String/Int64
(1,)
(2,)
(3,)
Database State Flowchart
A query transaction can be divided into the following steps:
- The query code is sent to the system.
- Changes to base and derived relations are evaluated and applied. More details on this are discussed below in this section.
- The changes applied to
S
are in a partially committed state indicated byS_temp
. “Partially” means these changes have not yet been persisted in the database. - Integrity constraints (ICs) are evaluated in the state
S_temp
. If any of the integrity constraints fail, the entire transaction is aborted and the database remains at the initial stateS
. - Query outputs, if requested, are sent back to the client whether or not ICs have been violated. Query outputs refer to the partially committed state
S_temp
. - If the transaction is a read query, all changes captured by
S_temp
are discarded and the database remains in the initial stateS
. - All changes that were made to base relations in the intermediate state
S_temp
are committed to the database. Changes to derived relations are never committed in a query. - The database is now in the state
S_new
.
Outputs are evaluated and sent back to the client even if the transaction is aborted.
The following diagram explains how changes to base and derived relations — going from S
to S_temp
— are applied:
- The control relation
delete
specifies which data should be removed from the base relations when it is materialized. It is evaluated in stateS
. - The control relation
insert
specifies which data should be updated and/or added to the base relations when it is materialized. It is evaluated in stateS
. - Derived relations are updated by considering data changes to base relations from previous steps, and new declarations and changes to existing declarations of derived relations.
- ICs are evaluated.
- The database is now in the partially committed state
S_temp
.
See Inserts and Deletes in the Same Transaction for more details.
Write Query
Write queries can be used to extract and modify information stored in the database. A write query can execute several instructions that may change the state of the database by modifying some base relations. The changes are persisted in the database if no IC violation or abort has been triggered.
Write queries can modify the database state.
The control relations insert
and delete
are used to modify base relations.
If used in a write query transaction, these changes are persisted and change the database state.
If used in a read query, these base relation changes are not persisted.
See Working With Base Relations for more details.
For instance, consider inserting a tool inventory into the base relation inventory
, where inventory(x, y)
states the number of items, y
, of the tool, x
.
// write query
def data = {("shovel", 15); ("hammer", 27); ("bucket", 5)}
def insert[:inventory] = data
def output = inventory
Note that the derived relation data
is not persisted in the database after the transaction is completed, contrary to the base relation inventory
.
When using the RelationalAI SDKs, a write query is mapped into an exec
call with readonly = false
.
Following the example above, and assuming you have an operating database
, engine
, and context ctx
,
here’s an API call to insert inventory
using the RelationalAI SDK for Python:
rsp = api.exec(ctx, database, engine,
"""
def insert[:inventory] = {("shovel", 15); ("hammer", 27); ("bucket", 5)}
def output = inventory
""",
readonly=False
)
show.results(rsp)
This gives the output:
/:output/String/Int64
('bucket', 5)
('hammer', 27)
('shovel', 15)
Note that to execute a write query, the SDK readonly
flag must be set to false
.
Read Query
Read queries execute several instructions that update the state of the database. Unlike write queries, the outcome of these instructions is not persisted in the database.
Read queries cannot modify the database state.
They are used to extract information, perform data analysis and inspection, and for data visualization.
They can be used to investigate what-if scenarios, such as performing updates to base relations (via insert
and delete
) without persisting the changes.
For example, say you want to display the inventory
base relation:
// read query
def output = inventory
Using the RelationalAI SDKs, the read query is mapped into an exec
call with readonly = true
.
Following the example above, and assuming you have an operating database
, engine
, and context ctx
,
here’s an API call to query inventory
using the RelationalAI SDK for Python:
rsp = api.exec(ctx, database, engine, """ def output = inventory """, readonly=True)
show.results(rsp)
This gives the output:
/:output/String/Int64
('bucket', 5)
('hammer', 27)
('shovel', 15)
Note that to execute a read query, the SDK readonly
flag must be set to true
.
Model Loading
You can persist Rel code in the database by loading it into the database as a model.
A Rel model consists of a set of declarations, including rules, integrity constraints, and others.
When you load a model, the following steps are executed:
- The model code is sent to the system.
- Changes to the model are evaluated and applied. This includes updating derived relations and ICs within the model.
- ICs are evaluated.
If any of the integrity constraints fail, the entire transaction is aborted and the database remains in the initial state
S
. - If none of the integrity constraints fail, the transaction succeeds and the model is updated.
- The database state is now in the state
S_new
.
In the RAI Console, you can create, delete, and modify models using the Editor.
Example
For instance, consider loading a model to infer some ICs to the base relation inventory
.
Say you want to ensure that the amount of items per product cannot be negative.
You also want to set a cap on inventory items:
// model
def inventory_count = sum[inventory] <++ 0
ic inventory_is_full {
inventory_count < 500
}
ic inventory_positive(x, y) {
inventory(x, y) implies y > 0
}
Note that inventory_count
is a derived relation that is persisted in the database as part of the model.
Now, try to insert a product with a negative number of items into inventory
:
// write query
def insert[:inventory] = {("screws", -2)}
def output = inventory
You can see that an IC violation is returned:
When this happens, the entire transaction is aborted and the database is not updated. It remains in the previous state.
Using the RelationalAI SDKs
When using the RelationalAI SDKs, loading a model is mapped to a load_model
call.
Depending on the SDK language, the name of the load_model
method may differ slightly.
For instance, assuming you have an operating database
, engine
and context ctx
,
here’s an API call to load the Rel model inventory-ic
into the database database
using the RelationalAI SDK for Python:
model_code = """ ic inventory_positive(x, y) {inventory(x, y) implies y > 0} """
api.install_model(ctx, database, engine, {"inventory-ic" : model_code})
Using the CLI, the same API call would look as follows:
rai load-model database --engine engine --model inventory-ic code.rel
In this case, the Rel model is defined in the code.rel
file.
To load a model into the database, the RKGS internally performs a write query that inserts the model logic into the dedicated base relation rel:catalog
, which houses all the installed models.
def insert:rel:catalog:model["my_model"] = "my_rel_code"
Concurrent Transactions
This section discusses concurrent transactions and how multiple simultaneous transactions are handled and impact the database. The most common situations are:
The number of engines involved in processing transactions is important, as it affects how multiple transactions are handled. In a multiple-engine scenario, the need to synchronize between engines creates a higher level of complexity. This section focuses on using one engine only.
Currently, the default limit amount of concurrent transactions per engine is 100.
The RGKS doesn’t impose any restrictions on transaction duration. However, in certain cases, it may consider a transaction as aborted if the engine it is running on doesn’t send a signal for over an hour.
In general, you can verify that a certain transaction has been successful by querying the updated base relation. You can also check by inspecting the transaction log in the RAI Console.
In case of a conflict, the entire database state is impacted, meaning all or no changes are committed to the database.
The scope of conflict is the entire database you are working with.
Multiple Write Queries
Concurrent write queries are a frequent operation. For instance, when you are constantly updating a base relation by inserting new data.
When there may exist concurrent writes, only one of them proceeds and the rest are blocked until the one that got the “writing slot” is completed or aborted. Then, the operation is repeated among the waiting ones until all write transactions are completed. Therefore, write transactions using one engine at a time are essentially serialized.
Multiple write queries using one engine are executed sequentially.
Here’s an example with three concurrent write transactions:
- The database state is
S
. - Transaction
Q1
comes in and is immediately processed. - Transactions
Q2
andQ3
come in whileQ1
is still running. Both transactions wait untilQ1
finishes. - When transaction
Q1
is complete, the database state is nowS1
. - The next transaction to be processed is implemented using condition variables. Whichever is first to wake wins the writing slot. This means that the execution order doesn’t follow the order in which the queries were received.
- In the diagram above, transaction
Q2
wins the writing slot and is the next to be processed. TransactionQ2
is working with the database stateS1
. - When transaction
Q2
is complete, the database state isS2
. - Steps 4 and 5 are repeated until all waiting write transactions are processed.
The execution order of queued write queries is arbitrary and likely doesn’t correspond to the order they were received.
Thanks to our advanced incremental view maintenance policy, the changes to derived relations, which are triggered by each transaction, are maintained in a highly efficient way. This minimizes computational needs and ensures concurrent transactions are processed as fast as possible.
Multiple Read Queries
When multiple read queries are executed at the same time, each transaction is completely isolated. This means that each query is independent of each other and they run in parallel.
Multiple read queries using one engine are executed concurrently.
- All the concurrent read queries start from the same database state
S
and are processed simultaneously. - After performing changes to base and derived relations, each read query updates the database to a partially committed state
S_temp
. S_temp
is specific to each transaction.- Since they are all read queries, the changes triggered by each query (captured by each
S_temp
) are not persisted and the database remains in the initial stateS
.
Multiple Write and Read Queries
Here’s an example of concurrent write and read queries using one engine:
- The database state is
S
. - Write transaction
Q1
comes in and is immediately processed. - Transactions
Q2
andQ3
come in whileQ1
is still running.Q2
is processed in parallel since it is a read query and does not persist changes to the database. Write queryQ3
waits untilQ1
finishes. - When transaction
Q1
is complete, the database state isS1
. - The next write transaction to be processed is implemented using condition variables. Whichever is first to wake wins the writing slot. This means that the execution order doesn’t follow the order in which the queries were received.
- In the diagram above, there is only one write transaction waiting. Transaction
Q3
is processed, working with the database stateS1
. - Read query
Q4
comes in and it is immediately processed. TransactionQ4
is also working with the database stateS1
. - When transaction
Q3
is complete, the database state isS3
. - Steps 4 and 5 are repeated until all waiting write transactions are processed.
If the database receives a write query while another write query is already executing, the second write query is queued and executed sequentially.
The query engine evaluates read queries concurrently. Each query is executed against the state of the database at the time the query is received.
Summary
In summary, a transaction is a unit of work that sends requests from a client to the database where the request is processed. RAI is ACID-compliant.
To modify the base relations in a database, use a write query. To persist rules, integrity constraints and other declarations in the database, load those declarations as a model. Use read queries, which don’t change the database state and are executed concurrently, to extract data from the database.
RAI transactions can be executed concurrently and using different tools, such as the RAI Console or the RelationalAI SDKs.