Working With Base Relations

This concept guide discusses storing, updating, and deleting data in a RAI database using the Rel language. It explains how to interact with and change base relations, specifically through the use of the control relations insert and delete. This guide also covers how transactions update the database and then goes into the more advanced topic of reactive updates.

Introduction

Most databases are not just static collections of data; they evolve over time.

Generally speaking, a database can change in two ways:

The data themselves are updated — changed, added, or deleted.
The rules that define derived relations are updated — changed, added, or deleted.

It is important to distinguish relations stored on disk, which are often raw data imported from other sources, from relations that are derived from these data via rules. RelationalAI uses the technical terms from the Datalog literature:

Base relations are stored on disk or other nonvolatile media.
Derived relations are defined by rules, computed on-demand, and are not persisted — though their definitions can be persisted, as installed code. These are analogous to views in other database systems.

Normally, base relations contain data from external sources that cannot otherwise be captured; they are the “raw data” as far as the database is concerned. By contrast, derived relations are defined and modified by applying rules to the base relations.

Base relations often contain data loaded from CSV files or other external sources.

Simple Example:

You can directly create a small base relation using insert, described in more detail below:

// write query
 
def insert[:parent] = {("James", "Harry"); ("Harry", "Lily")}
def output = parent

While parent(x, y) is a base and binary relation, the tuples of which are explicitly stored in the database, grandparent(x, y) might be a derived relation, derived from parent and defined by the rule:

// read query
 
def grandparent(x, z) = parent(x, y) and parent(y, z) from y
def output = grandparent

Derived relations can change if the rules or the underlying base relations change.

Base relations only change when the underlying data change.

Adding Base Relation Data: `insert`

To add data to a new or existing base relation, you can use the control relation insert. The first argument for insert should be a Symbol, i.e., a specialized relation starting with : .

Data are added to the right-hand side of the definition to create a relation. Other than the important side effect of updating base relations, these definitions are just like any other in Rel, and all the usual Rel constructs can be used.

For example, consider a balance base relation, where balance(x, y) says that account x has balance y. To add data to balance, you can execute this write query:

// write query
 
def insert[:balance] = {("john", 20); ("jack", 30); ("jill", 60)}
def output = balance

This will add three pairs to the balance relation, creating the relation if it does not already exist.

The insert relation is computed from the current state of the database, determining what gets added to the base relation.

It is even possible to query insert, as in the example below, but it is not itself persisted. The same is true for the delete relation discussed below.

When requested as a query output, these relations are computed from the new state of the database. See the Transactions guide for more details. After the above, insert will have the tuples {(:balance, "john", 20) ; (:balance, "jack", 30) ; (:balance, "jill", 60)}.

In general, insert is defined as any other relation, so the right-hand side of the definition can be a relational expression, defined from other base and derived relations in the database. For example:

// write query
 
def insert[:pairs](x, y) = range(1, 6, 1, x) and x % 2 = 0 and y = x * 2
def output:insert = insert
def output:pairs = pairs

Removing Base Relation Data: `delete`

Removing data from the database is analogous to adding them, except here you use the control relation delete instead of insert.

You can specify a set of tuples that you want to remove through a relation. This is a typical operation needed in some data cleaning scenarios.

For example, consider the following data imported from a CSV file (see CSV Import):

// write query
 
def insert[:my_data] = load_csv["azure://raidocs.blob.core.windows.net/csv-import/simple-drinks.csv"]
def output = my_data

These data are represented in Graph Normal Form.

🔎

Note that the insertion above and the deletion in the next code block need to happen in separate transactions. For more details see Inserts and Deletes in the Same Transaction.

You can remove arbitrary data as follows:

// write query
 
// Specify two country-drink pairs.
def to_remove = {
    ("Argentina", "Yerba Mate");
    ("United States", "Coca-Cola");
}
 
// Find the rows to remove.
def to_delete(row) =
    my_data(:country, row, country) and
    my_data(:drink, row, drink) and
    to_remove(country, drink)
    from country, drink
 
// Delete them.
def delete[:my_data](col, row, v) = my_data(col, row, v) and to_delete(row)
 
def output = my_data

In the previous example, the data to be removed were initially specified and then the row IDs were identified before removing the actual data.

You can also remove data by specifying a condition. For example, you can remove all balances lower than 50 with:

// write query
 
def delete[:balance](x, y) = balance(x,y) and y < 50
def output = balance

This has removed the tuples ("jack", 30) and ("john", 20) from the balance relation.

You can clear the entire balance relation with:

// write query
 
def delete[:balance] = balance
def output = balance

In general, you can include def delete[:r] = r in a transaction to make sure that r is empty before any new inserts happen. This can produce a warning the first time, if r is not previously defined, but the warning can be ignored.

Inserts and Deletes in the Same Transaction

Regardless of the order of the definitions, within a single transaction, described in more detail below, deletions are always executed before insertions.

When inserts and deletes are issued together in the same transaction, the deletes are executed first, irrespective of whether the inserts take precedence in the code.

The example below first initializes a unary base relation t:

// write query
 
def insert[:t] = {1; 2}
def output:t = t

You will now add and remove data through an insert and a delete. Note: In the RelationalAI guides, blocks of Rel code are executed as a single transaction. Since they are in the same transaction, the order of these lines does not matter:

// write query
 
def delete[:t] = {1; 2; 10; 20}
def insert[:t] = {1; 2; 3}
def output:t = t
def output:insert = insert
def output:delete = delete

As you can see, 1 and 2 are still in t.

Deleting elements that don’t exist is not an error. This is why you can include extra elements in delete, such as 10 and 20 in this example, that are not in the t relation in the first place.

Updating Existing Values

In general, to update an existing value, the old value must be deleted. Otherwise, the new value will simply be added, and the old and new values will coexist.

For example, suppose you give "jack" a balance of 60:

// write query
 
def delete[:balance] = balance
def insert[:balance]["jack"] = 60
def output = balance

You can update the balance for "jack" with the following transaction, which also removes the old balance:

// write query
 
def delete[:balance]("jack", x) = balance("jack", x)
def insert[:balance]("jack", new) = balance("jack", old) and new = old + 40 from old
def output = balance

If balance is meant to be a function, you can add an integrity constraint for it. A functional dependency on a relation, indicating that the last argument is uniquely determined by the values of the previous ones, can be specified with the built-in relation function:

// model
 
ic balance_is_a_function { function(balance) }

This IC will make sure that no customer ever has two different account balances — any transaction that violates it will fail. See the Integrity Constraints section for other examples of ICs relevant to data updates.

Current and Updated States

Note that the body of an update refers to the “current” state of the database, before any of the updates are carried out. For example, consider a database where you initialize r to {1; 2} as follows:

// write query
 
def delete[:r] = r // clear r
def delete[:t] = t // clear t
def insert[:r] = {1; 2}
def output:t = t
def output:r = r

Suppose you add some elements to r, but also set the new relation t to be equal to r:

// write query
 
def insert[:r] = {3; 4; 5}
def insert[:t] = r
def output:insert = insert
def output:t = t
def output:r = r

As you can see, the updated r will contain {1; 2; 3; 4; 5}, but t will contain the previous value of r, that is, {1; 2}. This is because when evaluating def insert[:t] = r, the relation r still had the initial values. The current transaction has not been committed, and r has not been changed yet.

The situation is different when you evaluate the output relation. As you can see, insert[:t] in the output will contain the new value of r ({1; 2; 3; 4; 5}), rather than the old one. This means you cannot rely on the output value ofinsert and delete alone to see deltas. See the Change Tracking section for alternatives.

If you want t to get the latest values of r, you should execute def insert[:t] = r in a separate transaction, after r has been updated:

// write query
 
def insert[:t] = r
def output:t = t
def output:r = r

🔎

In a write query, insert and delete definitions are evaluated at the start of the transaction. Output relations are evaluated at the end of the transaction and reflect the updates — even if you ask for insert and delete.

Loading and Updating CSV and JSON Data

CSV files are also loaded using insert. See CSV Import for details. Here’s a quick recap:

def insert[:my_csv_relation] = load_csv["/my/filename.csv"]
def insert[:my_json_relation] = load_json["/my/filename.json"]

Note that the above will add new data to any previously existing relation. You can fully replace previously loaded data like this:

def delete[:my_csv_relation] = my_csv_relation
def insert[:my_csv_relation] = load_csv["/my/updated_filename.csv"]

The loaded relation can be transformed before being persisted — for example, if only a subset of the external data is persisted. Here, only columns A and B are kept:

@inline
def schema_map[CSV](k, v) = CSV(:A, pos, k) and CSV(:B, pos, v) from pos
def insert[:my_csv_relation] = schema_map[load_csv["/my/filename.csv"]]

Updating a Single Field in JSON Data

Updates can be used to modify loaded JSON data, as expected. To demonstrate, start by loading a simple JSON object:

// write query
 
def config[:data] = """{ "a": {"b": {"c": 1, "d": 2}} }"""
def insert[:myjson] = load_json[config]
def output = myjson

You can update one of its fields as follows:

// write query
 
def delete[:myjson][:a, :b, :c] = myjson[:a, :b, :c]
def insert[:myjson][:a, :b, :c] = 1234
def output = myjson

See JSON Import for more on JSON data.

Reactive Updates: Example

Installed update rules can be used to implement reactive database logic, where the values in the database are updated according to external events.

As an example, consider the balance base relation again. You can clear it and give "jack" and "jill" a balance of 100:

// write query
 
def delete[:balance] = balance
def insert[:balance] = {("jack", 100); ("jill", 100)}
def output = balance

To transfer money from one account to another, on the condition that the balances do not become negative, you can install the following update logic:

// model
 
def transfer = {}
def insert[:balance] = {(p1, b1_new) ; (p2, b2_new) }
    from
        p1, p2, val,
        b1_new, b2_new, b1_old, b2_old
    where
        transfer(p1, p2, val)
    and val != 0
    and balance(p1, b1_old)
    and balance(p2, b2_old)
    and b1_new = b1_old - val
    and b2_new = b2_old + val
    and b1_new >= 0
    and b2_new >= 0
 
def delete[:balance](x, y) = balance(x, y) and insert(:balance, x, new) from new

🔎

The code block above loaded a model, indicated by the model label above it. For more details on loading models, see Working With Models.

If the derived relation transfer(p1, p2, val) holds a nonzero value val, the logic above “triggers,” transferring an amount val from account p1 to account p2, provided neither account falls below 0.

Note that even though you have not populated transfer, you are able to install logic in the database that will update the base relation balance if transfer is present.

You can do this even if transfer is not defined. The system will report that transfer is undefined, but the code will still be installed. To avoid this warning, you can add def transfer = {} to the code above. Note that the RAI query engine combines all defs when installing the logic above, so this has no effect when other values are present in the relation. Also, the order in which the definitions are written down is not relevant. See Rel Primer: Basic Syntax for more details.

If the relation transfer has values, the balance transfer happens. The transfer relation is known as the trigger in some formulations, since the update does not happen without it.

Try triggering a balance transfer of 40 from "jack" to "jill" by running:

// write query
 
def transfer = ("jack", "jill", 40)
def output = balance

Since the installed code runs with every query, this tuple in the transfer relation caused the updates to balance to happen.

You can verify the balance updates by installing these integrity constraints:

// model
 
ic balance_check {
    balance["jack"] + balance["jill"] = 200
}
 
ic nonnegative_balances(x, y) {
    balance(x, y) implies y >= 0
}

Executing the same query again triggers another update to balance:

// write query
 
def transfer=("jack", "jill", 40)
def output = balance

If you execute it one more time, balance will be unchanged, since there are not enough funds in the "jack" account:

// write query
 
def transfer=("jack", "jill", 40)
def output = balance

The condition b1_new >= 0 prevented the insert from being triggered. Note that no error happened: The transaction succeeded, but no updates were made.

Installed Unconditional Updates

It is important to remember that installed insert or delete definitions are always executed with each transaction. This might lead to some confusing behavior, if you forget that the code is installed.

If you install an unconditional insert, as in insert[:p] = 1, that value will always be present in the relation and cannot be removed, unless you uninstall the insert. Note that an installed delete can possibly be overridden by an insert, since inserts are executed after deletes. An unconditional insert can only be undone by uninstalling or modifying the corresponding source code.

This behavior might be useful to ensure that certain elements of a relation can never be deleted.

Change Tracking

You can debug updates by using auxiliary base relations to see what was inserted and deleted. For example, you can install these rules for inserted and deleted:

// model
 
def delete[:inserted] = inserted
def delete[:deleted] = deleted
def insert[:inserted](p, x) = insert[p](x) and p != :inserted
def insert[:deleted](p, x) = delete[p](x) and p != :deleted

You can update t and check for inserted and deleted afterwards:

// write query
 
def delete[:t] = t
def insert[:t] = {10; 11; 12}
def output:inserted = inserted
def output:deleted = deleted

Note that inserted and deleted are really supersets of what was changed in the relation. For more accuracy, you can use a snapshot, as described below.

Snapshotting Relations

You can always snapshot a derived relation or base relation, saving it to disk. Currently, the relation t is {10; 11; 12} (see above). You can now snapshot and update it:

// write query
 
def delete[:t_snapshot] = t_snapshot
def insert[:t_snapshot] = t
def insert[:t] = {100; 101}
def delete[:t] = t
def output = t_snapshot

Note that the relation is captured before the updates are applied. Later, you can compare the snapshot with the latest version of the relation. For example:

// read query
 
def changed[:added](x...) = t(x...) and not t_snapshot(x...)
def changed[:deleted](x...) = t_snapshot(x...) and not(t(x...))
def output = changed

Note that this gives more accurate information about what has actually changed in t, compared to looking only at insert and delete.

Integrity Constraints

Recall that integrity constraints are evaluated after all deletes and inserts are applied. If an update violates an integrity constraint, the violation is reported and the entire transaction fails. Integrity constraints can therefore ensure that properties of the database are maintained by any update. See the Integrity Constraints concept guide for more details.

For example, you can make sure that a relation contains only integers with:

ic p_int { forall(x: p(x) implies Int(x) ) }

You can ensure that a relation is never empty by installing the constraint:

ic p_nonempty { not empty(p) }

As the diagram in the Transactions guide shows, integrity constraints are checked after all the updates are applied, and before the transaction commits. Thus, for example, the p_nonempty IC will not be triggered if, in the same transaction, you can delete all the elements of p and then insert a new one:

def delete[:p] = p
def insert[:p] = 1

You can disallow all inserts or deletes with this integrity constraint:

ic readonly_database { not exists(x...: insert(x...)) and not exists(x... : delete(x...))}

Recursive Inserts and Deletes

The insert and delete relations can be defined recursively, to handle “cascading updates,” where an update can depend on previous ones. For example:

def insert[:ancestor](x, y) = parent(x, y)
def insert[:ancestor](x, y) = parent(x, t) and insert[:ancestor](t, y) from t

This populates an ancestor base relation, starting with the existing parent relation, and then recursively adding new pairs if the inserts imply it. See the Recursion concept guide for more details on recursion in Rel.

As another example, assume that you have several products. You have a binary relation part_of(x, y), where x is a component of y, and that you want to mark products as imported if one of their parts is imported. Given a new_import fact, you can then update:

def insert[:imported](x) = new_import(x)
def insert[:imported](y) = part_of(x, y) and insert[:imported](x) from x

Note that this only updates the imported relation for products connected to the new_import fact.

Other Notes

Listing Base Relations

The SDK APIs allow you to list all installed base relations. For example, see this section of the RelationalAI SDK for Julia.

Overloaded Base Relations

Creating both derived relations and base relations with the same name is not allowed. For example, this will fail:

def insert[:r] = {1; 2}
def r = 3

Like derived relations, base relations can be overloaded by arity and type. If you insert different types or arities into the same relation name, overloaded versions of the relation will be created:

// write query
 
def insert[:q] = 3.0
def insert[:q] = "a"
def insert[:q] = (1, "b")
def output = q

Summary

Updating data in a RAI database using the Rel language depends on changing, adding, or deleting either the base relations — the raw data — or the rules that define derived relations. The control relations insert and delete can be used in base relations to add or remove data. They are also used to load and replace CSV and JSON files. Rel also allows you to implement reactive data logic, use auxiliary base relations to update bugs, and even recursively define insert and delete relations to handle updates that depend on previous transactions.

Working With Base Relations

Introduction

Simple Example:

Adding Base Relation Data: insert

Removing Base Relation Data: delete

Inserts and Deletes in the Same Transaction

Updating Existing Values

Current and Updated States

Loading and Updating CSV and JSON Data

Updating a Single Field in JSON Data

Reactive Updates: Example

Installed Unconditional Updates

Change Tracking

Snapshotting Relations

Integrity Constraints

Recursive Inserts and Deletes

Other Notes

Listing Base Relations

Overloaded Base Relations

Summary

See Also

Adding Base Relation Data: `insert`

Removing Base Relation Data: `delete`