# Rel Primer: Basic Syntax

An introduction to the basic syntax of Rel.


# Introduction

This Rel Primer is an introduction to the main features of Rel. It assumes basic knowledge of database concepts and programming.

Rel is RelationalAI’s language for building models and querying data. Rel is inspired by the relational calculus, the Datalog and HiLog programming/query languages, and a number of modeling formalisms and tools, such as Alloy.

Rel has a simple but powerful syntax designed to express a large set of relational operations, build declarative models, and capture general domain knowledge, including, in particular, Knowledge Graphs.

## Relations

Everything in Rel is a *relation*. Rel imports data into relations,
combines relations to define new ones, and queries relations to get answers back — in the form of new relations.

Relations are sets of *tuples*.

A *tuple* is a list of values, where the order matters. It corresponds to a *row* in a traditional database.
The values are the basic objects in the database — for example, numbers, strings, symbols, or dates.
Tuples are enclosed by `()` and use `,` to separate their values.

For example, `("Neymar", "PSG")` is a tuple with two values, and can represent the name of a soccer player and the club he currently plays for.
`("store8", "product156", 12.99)` is a tuple with three values, and can represent the price of a product at a particular store.
(In Rel, we prefer normalized relations to represent data.)

Relations are enclosed by curly braces (`{}`), and we use semicolons (`;`) to separate their tuples:

```rel
def players = { ("Neymar", "PSG"); // single-line comment
                ("Messi", "BFC");
                ("Werner", "Chelsea"); ("Pulisic", "Chelsea")
              }

/*
This is a multi-line comment in Rel.
*/

def output = players
```

```
Relation: output

"Messi"   | "BFC"     |
"Neymar"  | "PSG"     |
"Pulisic" | "Chelsea" |
"Werner"  | "Chelsea" |
```

Note that the order of the tuples does not matter, since relations are sets of tuples. The order of the values inside each tuple, in contrast, can matter a lot.

Rel chooses a default order to display the tuples in the relation, which may be different from the order we used to write the relation down, as seen in this example.

You should not assume that tuples will appear in any particular order;
but you can use utilities like `sort`, `top` and `bottom` to sort and rank them.

Note: The results shown in this Primer are the contents of the `output` relation.
This is the same behavior that RAI Notebook query cells have, where you may also
enter a single stand-alone Rel expression and get the result.

### Arity and Cardinality

If we think of relations as tables,
the *arity* of a relation is the number of columns, and the *cardinality* is the number of rows.

### Single Values Are Relations

In Rel, a single value is identified with a relation of arity 1 and cardinality 1.
For example, the number 7 is the same as the relation `{(7)}`.
We do not require curly braces for single-value relations, so `{4}` and `4` are the same relation,
with arity 1 and cardinality 1.

The constants `true` and `false` are also relations, of arity 0.
There are only two of these: `false` is `{}`, the empty relation (arity 0 and cardinality 0),
and `true` is `{()}`, that is, the relation with one empty tuple (arity 0 and cardinality 1).
This might seem strange at first, but will play quite nicely with other features of the language, as we will see below.

This way, everything is a relation, including the results of arithmetical operations
(`+`, `-`, `*`, etc.) and boolean operations (`and`, `or`, `not`, etc.).

### Relations Are Sets

As we have mentioned a few times already,
Rel relations are *sets* of tuples.
This means that order does not matter and *duplicates are ignored*: we only keep one of each different tuple.
Thus, for example, `{1; 1; 1}` is the same unary relation as `{1}`.
And `{(1, 2); (3, 4); (1, 2)}` is the same binary relation as `{(3, 4); (1, 2)}`.

This is important when computing aggregations, as explained in the group-by section in the next part of this Primer.

Thinking of relations as tables again, this means that, unlike columns, the order of the rows does not matter, and there are no duplicate rows.
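The set semantics can be tried outside Rel. The following is a minimal Python sketch, not Rel code, that models a relation as a Python set of tuples. The helper names `arity` and `cardinality` are hypothetical and exist only for this illustration, and the sketch assumes every tuple in a relation has the same length.

```python
# Illustration only: a Rel relation modeled as a Python set of tuples.

def arity(rel):
    """Number of columns. This sketch assumes all tuples have the same length
    and treats the empty set as arity 0."""
    lengths = {len(t) for t in rel}
    if len(lengths) > 1:
        raise ValueError("mixed arities are not handled in this sketch")
    return lengths.pop() if lengths else 0


def cardinality(rel):
    """Number of rows, that is, distinct tuples."""
    return len(rel)


# Duplicates collapse and row order is irrelevant: {(1, 2); (3, 4); (1, 2)}.
a = {(1, 2), (3, 4), (1, 2)}
b = {(3, 4), (1, 2)}
same = a == b

# The order of values inside a tuple does matter.
swapped_equal = {(1, 2)} == {(2, 1)}

players = {("Neymar", "PSG"), ("Messi", "BFC"),
           ("Werner", "Chelsea"), ("Pulisic", "Chelsea")}
print(arity(players), cardinality(players), same, swapped_equal)
# prints: 2 4 True False
```

Python sets discard duplicates and ignore insertion order, which is why they are a convenient stand-in for relations here.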

The parser allows but does not require parentheses and brackets when there is no ambiguity. For example, `{(1, 2); (3, 4)}` can also be written as `(1, 2); (3, 4)`, as several definitions in this Primer do.

### `()` as a Membership Test

For a relation `r`, the expression `r(x)` may be seen as testing set membership, and will hold exactly for those `x` in `r`.
In general, `r(t1,...,tn)` is true iff `(t1,...,tn)` is a tuple in `r`.

Furthermore, `r` can be any expression that evaluates to a relation. For example, `{1; 2}(x)` holds exactly when `x` is 1 or 2.
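To make the membership reading concrete, here is a minimal Python sketch (not Rel) that models relations as sets of tuples, with `true` and `false` modeled as `{()}` and `{}` as described earlier. The function name `applied` is a hypothetical helper introduced for this illustration.

```python
# Illustration only: relations as Python sets of tuples.
TRUE = {()}     # the relation with one empty tuple
FALSE = set()   # the empty relation


def applied(rel, *values):
    """Model r(t1, ..., tn): TRUE iff (t1, ..., tn) is a tuple in rel."""
    return TRUE if tuple(values) in rel else FALSE


players = {("Neymar", "PSG"), ("Messi", "BFC"),
           ("Werner", "Chelsea"), ("Pulisic", "Chelsea")}

werner_at_chelsea = applied(players, "Werner", "Chelsea")   # TRUE
werner_at_psg = applied(players, "Werner", "PSG")           # FALSE

# The relation being applied can itself be an expression, such as {1; 2}.
one_or_two = [x for x in range(5) if applied({(1,), (2,)}, x) == TRUE]
```

In this model the result of a membership test is itself a relation of arity 0, which matches the idea that everything in Rel is a relation.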

## Stored (EDB) vs. Derived (IDB) Relations

In our documentation, small relations with sample data are often built into the Rel code, as in the `def players = {...}` definition above.
Typically, though, data will be stored on disk, as relations, often representing “raw data” from outside sources, such as JSON or CSV files.
Once written to disk, no definitions are necessary for these relations, which we call *stored* relations.
In contrast, relations defined by rules are *derived* relations, also known in the database world as *views*.

In the Datalog world, stored relations are known as *EDB relations*,
and derived ones as *IDB relations* (for “extensional” and “intensional”, respectively).

Derived relations are transient, local only to a query, unless (1) we write their results back to disk,
creating new stored relations, or (2) we *install* their definitions (hence the term, “installed view”).
See the EDB vs. IDB relations Concept Guide for more details.

## Rules

Database logic is expressed as a set of rules. Each derived relation may refer to other derived relations, relations defined in libraries such as the stdlib, and stored (EDB) relations.

Some quick things to know about rules (`def`s):

- Their order does not matter.
- Definitions are unioned, so they add up. If we have:

  ```rel
  def myrel = {}
  def myrel = 1; 2
  def myrel = 2; 3
  ```

  then `myrel` will be `{1; 2; 3}`.
- Definitions can be recursive (see the Recursion Topic Guide).
- Relations can be overloaded by arity and type (see Advanced Syntax).
- Related definitions can be grouped and parameterized together, into *modules* (see the Modules Concept Guide).
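The "definitions are unioned" rule can be pictured with a small Python sketch (not Rel). The `definitions` registry and the `define` helper are hypothetical names used only to model how several `def`s for the same name accumulate.

```python
from collections import defaultdict

# Hypothetical registry: each name maps to the union of all its definitions.
definitions = defaultdict(set)


def define(name, tuples):
    """Model `def name = ...`: add tuples to whatever the name already holds."""
    definitions[name] |= set(tuples)


define("myrel", [])                # def myrel = {}
define("myrel", [(1,), (2,)])      # def myrel = 1; 2
define("myrel", [(2,), (3,)])      # def myrel = 2; 3

myrel = definitions["myrel"]       # {(1,), (2,), (3,)}
```

Because set union is commutative, calling `define` in any other order gives the same `myrel`, which mirrors the fact that the order of rules does not matter.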

### Schemas Are Optional

Rel does not require pre-defined schemas. When you create an EDB relation, or define an IDB relation in Rel, the system automatically tracks the arity and type of the tuples in the relation. Unlike traditional database systems, you do not have to specify this information beforehand as a “database schema.”

You can, however, enforce schemas if you want, in the form of integrity constraints that restrict the arity and types of relations, for instance — see the IC concept guide for examples.

## Uses of `,` and `;`

Both `,` and `;` are operators in their own right, and it is good to familiarize yourself with the multiple ways they can be used.

### Tupling, Filtering, Conjunction and Products (`,`)

The `,` operator can:

- make tuples
- serve as a boolean filter (in analogy to “and”)
- denote cross-products

For example, `{1; 2; 3}, {4; 5}` is the cross-product of the relations `{1; 2; 3}` and `{4; 5}`,
where `4` or `5` is appended to each element of `{1; 2; 3}`,
giving `{(1, 4); (1, 5); (2, 4); (2, 5); (3, 4); (3, 5)}`.

Similarly, the cross-product of `{1}` and `{2}` is `{(1, 2)}`, so tupling is a special case of cross-product.

Taking advantage of `true` and `false` being relations, `,` may also be used as a boolean filter,
analogous to an `and` operation (conjunction):

```rel
def myvalues = 1; 2; 3; 4; 5; 6; 7; 8; 9
def output(x) = myvalues(x), x > 3, x < 7
```

```
Relation: output

4 |
5 |
6 |
```

`true` is the relation `{()}` (arity 0 and cardinality 1),
and `false` is the relation `{}` (arity 0 and cardinality 0).
Any cross-product with `{}` will be `{}`, so `R, false` is always `false`.
And the cross-product of any relation `R` with `{()}` is `R`, hence `R, true` is always `R`.

You can also use the usual connectives `and`, `or`, `implies`, `not`; but, unlike the comma operator, they always expect arity 0 (`true` or `false`).
We will return to this later.
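The three roles of `,` follow from a single cross-product operation. The Python sketch below (not Rel) models relations as sets of tuples; the function name `comma` is a hypothetical helper, and the filter example hand-translates `def output(x) = myvalues(x), x > 3, x < 7` into a loop.

```python
from itertools import product

TRUE = {()}     # arity 0, cardinality 1
FALSE = set()   # arity 0, cardinality 0


def comma(left, right):
    """Model `left, right`: concatenate each tuple of left with each tuple of right."""
    return {l + r for l, r in product(left, right)}


# Cross-product: {1; 2; 3}, {4; 5}
pairs = comma({(1,), (2,), (3,)}, {(4,), (5,)})

# Tupling is the special case of two single-value relations.
tupling = comma({(1,)}, {(2,)})          # {(1, 2)}

# `R, true` is R and `R, false` is false.
kept = comma(pairs, TRUE)
dropped = comma(pairs, FALSE)

# Filtering: keep x only when both conditions are TRUE.
myvalues = {(x,) for x in range(1, 10)}
output = set()
for (x,) in myvalues:
    cond = comma(TRUE if x > 3 else FALSE, TRUE if x < 7 else FALSE)
    output |= comma({(x,)}, cond)
```

Concatenating with the empty tuple leaves a tuple unchanged, and a product with an empty set is empty, which is all that the filtering behavior relies on.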

### Union and Disjunction (`;`)

While `,` builds tuples and concatenates them, the `;` operator builds relations, and unions them:

```rel
def rel1 = (1, 2); (4, 8); (3, 6)
def rel2 = (3, 6); (2, 4)
def output = rel1; rel2
```

```
Relation: output

1 | 2 |
2 | 4 |
3 | 6 |
4 | 8 |
```

The `;` operator can be thought of as a disjunction (“or”), and it behaves exactly as `or` for arity 0 (boolean) expressions:
`true or false` is `true`, `false or false` is `false`, and so on.
While `or` works only for arity 0 expressions, the `;` operator also works with
higher arities, and still represents a choice. For example, `{1; 2}(x)` serves as `1=x or 2=x`.
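As a rough model of `;`, the Python sketch below (not Rel) treats relations as sets of tuples and implements `;` as set union. The helper name `semicolon` is hypothetical.

```python
TRUE = {()}     # arity 0, cardinality 1
FALSE = set()   # arity 0, cardinality 0


def semicolon(left, right):
    """Model `left; right` as the union of two relations."""
    return left | right


rel1 = {(1, 2), (4, 8), (3, 6)}
rel2 = {(3, 6), (2, 4)}
output = semicolon(rel1, rel2)      # (3, 6) appears only once

# For arity 0, union reproduces the truth table of `or`.
t_or_f = semicolon(TRUE, FALSE)     # TRUE
f_or_f = semicolon(FALSE, FALSE)    # FALSE

# {1; 2}(x) as a choice between x = 1 and x = 2.
choice = semicolon({(1,)}, {(2,)})
matches = [x for x in range(4) if (x,) in choice]
```

Seen this way, `or` is just the arity-0 case of union, which is why `;` can stand in for it at any arity.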

## Definitions

We have already seen a few examples of relation definitions (`def`s) in Rel,
and their meaning should be intuitively clear.
A Rel model is a collection of such definitions (called “Rel sources”),
which can be defined in terms of each other, even recursively.
The RAI query engine answers queries by combining these definitions with known data to compute a requested result.
The query comes in the form of a relation, often called `output`, that we want to compute.

The order in which the definitions are written down is not relevant. For example:

```rel
def output(x) = odddata(x) and x > 5
def mydata = {1; 2; 3; 4; 5; 7; 8}
def odddata(x) = mydata(x) and x%2 = 1
def mydata = {9; 10; 11}
```

```
Relation: output

7 |
9 |
11 |
```

Here, the two separate definitions for `mydata` are unioned,
and it is not a problem that `output` goes first,
or that `odddata` is introduced before `mydata`, on which it depends.

## Relational Abstraction (`:`)

Recall that a relation is a set of tuples. One way to describe a relation is to write down a formula that is `true` exactly for those tuples in the relation
and `false` for all others.

We can see this most clearly in definitions
that introduce a variable for each column in the left-hand side of the definition, as in
`def myrel(x1, x2, ..., xn) = ...`,
and then give a boolean expression on the right that constrains those variables:

```rel
def mydomain(x) = range(1, 7, 1, x) // x will range over 1; 2; 3; 4; 5; 6; 7
def myrel(x, y) = mydomain(x) and mydomain(y) and x + y = 9
def output = myrel
```

```
Relation: output

2 | 7 |
3 | 6 |
4 | 5 |
5 | 4 |
6 | 3 |
7 | 2 |
```

Note: The Rel standard library has many utilities, and we use one here:
`range(low, high, stride, x)` constrains `x` to range over all the values between `low` and `high`,
inclusive, skipping by `stride` each time.

We prefer this style, since it leads to more readable definitions,
but relations can also be specified using *relational abstraction*, indicated by `:` in Rel.
The above definitions are equivalent to:

```rel
def mydomain = x : range(1, 7, 1, x)
def myrel = x, y : mydomain(x) and mydomain(y) and x + y = 9
def output = myrel
```

```
Relation: output

2 | 7 |
3 | 6 |
4 | 5 |
5 | 4 |
6 | 3 |
7 | 2 |
```

This use of `:` is similar to standard mathematical notation to describe a set.
There are variables on the left of the `:`, a boolean condition over those variables on the right,
and the set contains all the combinations of values for those variables (tuples) that make the condition true.

But in Rel, `:` can do much more.
The expression `Expr` in `<vars> : Expr` does not have to be a boolean formula (that is, it can have arity greater than 0).
If `Expr` has arity $n$, the corresponding elements from `Expr` are appended to `<vars>`, and the
total arity will be the number of variables in `<vars>` plus $n$. For example:

```rel
def mydomain = x : range(1, 4, 1, x)
def myrel = x : x * 5, mydomain(x)
def output = myrel
```

```
Relation: output

1 | 5 |
2 | 10 |
3 | 15 |
4 | 20 |
```

This is similar to how `,` works. For both `:` and `,`,
when the right-hand side is a boolean expression (has arity 0), it works as a filter.
If the right-hand side has arity greater than 0,
then the expression is appended to the tuple.

This gives us a way to do group-by aggregations, as we will see later in the group-by section.
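Both behaviors of `:` described above (filtering when the right-hand side has arity 0, appending when it has higher arity) can be modeled with one function. The Python sketch below is not Rel; `abstraction` is a hypothetical helper, and it assumes each variable ranges over a finite list of candidate values, which stands in for the `mydomain(x)` restrictions in the Rel examples.

```python
from itertools import product

TRUE = {()}     # arity 0, cardinality 1
FALSE = set()   # arity 0, cardinality 0


def abstraction(domains, expr):
    """Model `x1, ..., xn : Expr` over finite candidate domains.

    domains: one iterable of candidate values per variable.
    expr: function mapping the variable values to a relation (set of tuples).
    Each result tuple is the variable values followed by a tuple from expr.
    """
    return {values + tail
            for values in product(*domains)
            for tail in expr(*values)}


mydomain = range(1, 8)   # 1 through 7, inclusive

# Arity-0 right-hand side: acts as a filter.
myrel = abstraction([mydomain, mydomain],
                    lambda x, y: TRUE if x + y == 9 else FALSE)

# Arity-1 right-hand side: its value is appended to the variable.
scaled = abstraction([range(1, 5)], lambda x: {(x * 5,)})
```

The resulting arity is the number of variables plus the arity of the expression: 2 + 0 for `myrel` and 1 + 1 for `scaled`.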

Note: The arity of `bindings : expression` is the number of variables in `bindings` plus the arity of `expression`.

Style Note:
When we define a relation using the `def myrel = x1, ..., xn : Expr` style,
the arity of `myrel` depends on the arity of the expression on the right (`Expr`),
which might not be clear at first sight. We just know that `myrel` will have arity of *at least* $n$.
In contrast, the `def myrel(x1, ..., xn) = ...` definition explicitly calls out the arity of the relation (*exactly* $n$), which is why we prefer it.

### Binding Shortcuts: `in` and `where`

It is often useful to put filters in the left-hand side of `:`, which we call the *bindings*.

For this, we can use `in` and `where`, which can be used directly wherever variables are introduced,
to restrict the domain of the variables. For example:

```rel
def mydomain(x in range[1, 7, 1]) = true // same as: def mydomain = range[1, 7, 1]
def myrel = x in mydomain, y in mydomain where x + y = 9 : x * y
def output = myrel
```

```
Relation: output

2 | 7 | 14 |
3 | 6 | 18 |
4 | 5 | 20 |
5 | 4 | 20 |
6 | 3 | 18 |
7 | 2 | 14 |
```

Note that while `in` restricts the domain of a single variable, `where` can restrict combinations of variables.

Note: `in` (and its equivalent symbol, `∈`) is not a standalone boolean predicate, and can be used only in bindings. If you want to express `x in R` in a boolean formula, just write `R(x)`.

We can already define many different kinds of relations using the machinery we have seen. Next we introduce a feature that is widely used in Rel to make our code more concise.

## Relational Application (`[]`)

Relations that represent functions often have their inputs first and the results second. Similarly, database tables are often defined with keys first and values second. It is natural, then, and very useful, to have a notation that easily identifies the latter based on the former.

Square brackets *restrict* a relation to a certain prefix, and then remove the prefix
(*project*, in relational parlance).
For example, `myrel["a", "b"]` indicates all the tuples
in `myrel` whose first element is `"a"` and second element is `"b"`, with those two elements removed.

This is what we mean by *relational application*.

Here is an example, where we fix and then remove the first two columns:

```rel
def myrel = {(1, 2, 3, 4); (1, 2, 6, 7); (1, 3, 10, 11)}
def output = myrel[1, 2]
```

```
Relation: output

3 | 4 |
6 | 7 |
```

The expression `myrel[x, y]` is equivalent to `myrel[x][y]`.

As an example from the Rel Standard Library, the function
`add` is a ternary relation, where `add(x, y, z)` holds if `z` is the sum of `x` and `y`.
We often just write `add[x, y]` to indicate that value; the expression `add[1, 2]` fixes `x` and `y`,
and gives us all the `z`'s. In this example, there is only one result, but this does not have to be the case,
as we discuss next.
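The restrict-then-project reading of `[]` can be written down directly. In the Python sketch below (not Rel), `rel_apply` is a hypothetical helper, and `add` is modeled as a small finite relation purely for illustration, since the real `add` is not a finite table.

```python
def rel_apply(rel, *prefix):
    """Model rel[p1, ..., pk]: keep tuples that start with the prefix, then drop it."""
    k = len(prefix)
    return {t[k:] for t in rel if t[:k] == prefix}


myrel = {(1, 2, 3, 4), (1, 2, 6, 7), (1, 3, 10, 11)}

result = rel_apply(myrel, 1, 2)                 # {(3, 4), (6, 7)}
curried = rel_apply(rel_apply(myrel, 1), 2)     # same as myrel[1][2]

# A finite stand-in for the ternary relation add(x, y, z).
add = {(x, y, x + y) for x in range(4) for y in range(4)}
three = rel_apply(add, 1, 2)                    # {(3,)}

# Applying to a full tuple leaves arity 0: {()} (true) or {} (false).
full_match = rel_apply(myrel, 1, 2, 3, 4)
no_match = rel_apply(myrel, 9)
```

Note that the result is always a relation: it may hold one tuple, several, or none.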

### `[]` is Not Always a Function

In many settings, `[]` looks like a function call, as in `add[1, 2]` or `cos[2 * pi_float64]`,
but we should remember that the output might be empty, or not unique.
In the example below, `neighbor[x]` can have zero, one, or two elements for each `x`:

```rel
def mydomain(x) = range(1, 5, 1, x) // numbers from 1 to 5, inclusive
def neighbor(x in mydomain, y in mydomain) = (y = x + 1) or (y = x - 1)
def output = x in {2; 3} : neighbor[x]
```

```
Relation: output

2 | 1 |
2 | 3 |
3 | 2 |
3 | 4 |
```

`[]` is *relational* application, which is a generalization of *functional* application. (Functions are special cases of relations: they are relations where the values are uniquely determined by the keys.) Rel code that uses `[]` could always be written without it, by introducing new variables,
but `[]` makes it more concise and natural to read.

### Operator Distribution

If `S` is a unary relation, Rel can also evaluate `R[S]`,
which will be the union of `R[x]` over all the tuples `x` in `S`.
For example, `def output = neighbor[{1; 3}]` will give `{2; 4}`, since `neighbor[1]` is `{2}` and `neighbor[3]` is `{2; 4}`.

Other operations distribute similarly. For example,
`1 + {2; 3}` is `{3; 4}`.
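Distribution over a unary relation amounts to a union of individual applications. The Python sketch below (not Rel) rebuilds the `neighbor` relation over the numbers 1 to 5 and uses two hypothetical helpers, `rel_apply` and `apply_set`, to model `R[x]` and `R[S]`.

```python
def rel_apply(rel, *prefix):
    """Model rel[p1, ..., pk]: keep tuples that start with the prefix, then drop it."""
    k = len(prefix)
    return {t[k:] for t in rel if t[:k] == prefix}


def apply_set(rel, s):
    """Model R[S] for a unary relation S: the union of R[x] over all (x,) in S."""
    out = set()
    for (x,) in s:
        out |= rel_apply(rel, x)
    return out


mydomain = range(1, 6)   # 1 through 5, inclusive
neighbor = {(x, y) for x in mydomain for y in mydomain
            if y == x + 1 or y == x - 1}

result = apply_set(neighbor, {(1,), (3,)})     # neighbor[{1; 3}]

# Arithmetic distributes the same way: 1 + {2; 3}
sums = {(1 + v,) for (v,) in {(2,), (3,)}}
```

The value 2 is a neighbor of both 1 and 3, but it appears only once in `result` because the union is a set.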

Note: In the expression `r[S]`, the relation `S` must have arity 1
(unless `r` is a “higher-order” `@inline` def; see the advanced syntax Primer).

### Using `[]` In Definitions

Square brackets can be used in the left-hand side of definitions, making them more concise, and often highlighting a functional dependency. For example:

```rel
def myvalues = (1, 2); (3, 4); (5, 6)
def output[x] = myvalues[x], x > 1, x < 4
```

```
Relation: output

3 | 4 |
```

An equivalent query using logical connectives is:

```rel
def myvalues = (1, 2); (3, 4); (5, 6)
def output(x, y) = myvalues(x, y) and x > 1 and x < 4
```

```
Relation: output

3 | 4 |
```

In this case,
we prefer the second formulation, using `and` instead of `,`, since it makes it easier to find typos and mistakes.
If any of the formulas used with `and` has an arity greater than 0, we will get a compiler error.
In contrast, in the previous formulation, the comma operator (`,`) will happily compute the cross product of any relations it gets, regardless of their arity.

- When we write `def myrel[v1,...,vn] = e`, the arity of `myrel` will be `n` plus the arity of `e`.
- This is also true of `def myrel(v1,...,vn) = e`, but in this case, `e` must have arity 0, and `myrel` has arity `n`.
- When `n` is 0, we just have `def myrel = e`, and the arity of `myrel` is the arity of `e`.

We can also use constants on the left-hand side; for instance, `def myrel[1] = 2` defines `myrel` as `{(1, 2)}`.

An example that combines `[]`, `in`, and `,` as a tuple constructor and filter:

```rel
def mydomain = range[1, 5, 1] // numbers from 1 to 5, inclusive
def output[x in mydomain, y in mydomain] = x - y, x + y, x * y, x + y = 5
```

```
Relation: output

1 | 4 | -3 | 5 | 4 |
2 | 3 | -1 | 5 | 6 |
3 | 2 | 1 | 5 | 6 |
4 | 1 | 3 | 5 | 4 |
```

Caveat: Not all combinations of `in` and `where` are yet supported on the head
of the `def` (that is, the left of the `=`),
but you can combine them on the body (the right of the `=`).
The above definition is equivalent to:

```rel
def output = x in mydomain, y in mydomain where x + y = 5 : x - y, x + y, x * y
```

## Thinking About Arity

### `()` vs. `[]`

The Rel compiler checks that all the arities in the code match up and reports an error if they do not.
When we write, say, `myrel(x,y,z)`, we are indicating that `myrel` *must* have arity 3, and the result has arity 0, which is the arity
expected by the boolean connectives (`and`, `or`, `implies`, etc.), and means that it is either `true` or `false`.

Writing `myrel[x,y,z]` implies that the arity of `myrel` is *at least* 3,
but it can, of course, be higher. The arity of `myrel[x,y,z]` is equal to `arity(myrel) - 3`.

If `myrel` has arity 3, you can still write `myrel[1,2,3]`; it will be equivalent to `myrel(1,2,3)`:
either `true` or `false`.

### Boolean Formulas Have Arity 0

Writing, say, `{1} and p(x)`, or `sin[x] or q(x)`, gives an arity error.
The boolean connectives require arity 0 for both arguments, and
their result has arity 0.

If we write `myrel[1,2,3]`, we expect `myrel` to have arity of *at least* 3. If we write `myrel(1,2,3)`, we require `myrel` to have arity of *exactly* 3.

### Arity and Defs

When we write `def myrel(a,b,c) = E`, the arity of `E` must be 0 (it should have a boolean value), and the arity of `myrel` is 3.

When we write `def myrel[a,b,c] = E`, then `E` can have arbitrary arity, and the arity of `myrel` will be 3 plus the arity of `E`.
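The arity rules in this section can be collected into a few lines of arithmetic. The Python sketch below is only a bookkeeping model, not how the Rel compiler is implemented; all four function names are hypothetical, and a `ValueError` stands in for the compiler's arity error.

```python
def paren_application(rel_arity, n_args):
    """r(t1, ..., tn): r must have arity exactly n; the result has arity 0."""
    if rel_arity != n_args:
        raise ValueError("arity mismatch")
    return 0


def bracket_application(rel_arity, n_args):
    """r[t1, ..., tn]: r must have arity at least n; the result has arity(r) - n."""
    if rel_arity < n_args:
        raise ValueError("arity mismatch")
    return rel_arity - n_args


def paren_def(n_vars, body_arity):
    """def r(v1, ..., vn) = E: E must have arity 0; r has arity n."""
    if body_arity != 0:
        raise ValueError("body must have arity 0")
    return n_vars


def bracket_def(n_vars, body_arity):
    """def r[v1, ..., vn] = E: r has arity n plus the arity of E."""
    return n_vars + body_arity


# myrel(1, 2, 3) on a relation of arity 4 is rejected in this model.
try:
    paren_application(4, 3)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

For instance, `bracket_def(2, 3)` returns 5, matching a definition with two head variables and a body that contributes three columns.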

## The Standard Library

Rel includes a Standard Library with many functions and utilities.
For example, `range`, `maximum`, `minimum`, `sin`, `cos`,
and many other mathematical operations.
The stdlib also includes “higher-order” definitions that take relations as arguments,
such as `argmax`, `argmin`, and `first`.

### Help!

The query `help[:<name>]` (or, thanks to module syntax, just `help:<name>`) will display a brief docstring for each utility.
For example, `help:argmax`, `help:range`, or `help:min`.

Note: Some common names are already used by the Standard Library, for example `domain`, `function`, `total`, `first`, `last`, `equal`.
A quick way to check if a name is reserved is to try `help:name`.

## Summary

This article has covered the basics of Rel syntax. The next one in this Rel Primer series focuses on aggregations, group-by and joins, followed by a Primer on more advanced Rel features.