Rel Primer: Basic Syntax
This Primer is an introduction to the basic syntax of Rel.
Introduction
This Rel Primer is an introduction to the main features of Rel. It assumes basic knowledge of database concepts and programming.
Rel is RelationalAI’s language for building models and querying data.
Rel has a simple but powerful syntax designed to express a large set of relational operations, build declarative models, and capture general domain knowledge, including, in particular, knowledge graphs.
Relations
Everything in Rel is a relation. Rel imports data into relations, combines relations to define new ones, and queries relations to get answers back — in the form of new relations.
Relations are sets of tuples.
A tuple is a list of elements, where the order matters.
It corresponds to a row in a traditional database.
The elements are the basic objects in the database — for example, numbers, strings, symbols, or dates.
A tuple may be specified in Rel by separating its elements with commas (,) and enclosing the list of elements in parentheses (()).
For example, ("Neymar", "PSG") is a tuple with two elements that represent the name of a soccer player and the club he currently plays for.
Similarly, ("store8", "product156", 12.99) is a tuple with three elements that represent a store ID, a product ID, and the price of the product at that store.
Note: Rel uses normalized relations to represent data.
A relation may be defined explicitly by using semicolons (;) to separate tuples and curly braces to enclose them:
// query
def players {
("Neymar", "PSG"); // single-line comment
("Messi", "BFC");
("Werner", "Chelsea");
("Pulisic", "Chelsea")
}
/*
This is a multiline comment in Rel.
*/
def output = players
Note that the order of the tuples does not matter, since relations are sets of tuples. The order of the elements inside each tuple, by contrast, can matter a lot.
Rel chooses a default order to display the tuples in the relation, which may be different from the order used to write the relation down, as seen in this example.
You should not assume that tuples will appear in any particular order; but you can use utilities like sort, top, and bottom to sort and rank them.
By convention, Rel query examples usually display the contents of the output relation.
RAI notebook query cells behave the same way; there, you can also enter a single stand-alone Rel expression and get the result.
Arity and Cardinality
If you think of relations as tables, the arity of a relation is the number of columns, and the cardinality is the number of rows.
Arity
To compute the arity of a relation, simply use arity:
// query
def R = {(1, "a"); (1, "b"); (1, "c")}
def output = arity[R]
In fact, arity computes the lengths of all tuples in a relation.
If a relation contains tuples of varying lengths, arity returns every tuple length that occurs in the relation:
// query
def R {
("SKI CHAMPIONSHIPS", 2009);
("SKI CHAMPIONSHIPS", 2009, "Lindsey", "Vonn", "Downhill")
}
def output = arity[R]
Cardinality
You can calculate the cardinality of a relation using count:
// query
def R = {1; 3; 5; 7; 9}
def output = count[R]
In contrast to arity, it does not matter whether a relation has tuples of various lengths: count returns the number of tuples, regardless of their length.
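As a small sketch of this point, the query below reuses the mixed-length relation from the arity example and counts its tuples. It relies only on count, introduced above; since the relation holds two tuples, the result should be 2.

```rel
// query
def R {
    ("SKI CHAMPIONSHIPS", 2009);
    ("SKI CHAMPIONSHIPS", 2009, "Lindsey", "Vonn", "Downhill")
}
// count ignores tuple length: it counts the two tuples in R
def output = count[R]
```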
You only have to watch out when calculating the cardinality of the empty relation {}.
For the empty relation, count evaluates to false.
To get zero for the cardinality, you can use one of the override relations:
// query
def R = {}
def output:count = count[R]
def output:cardinality = (count[R] <++ 0)
Notice that the output has no entry with :count, because count[R] evaluates to false.
Single Elements
In Rel, a single element is identified with a relation of arity 1 and cardinality 1.
For example, the number 7 is the same as the relation {(7)}.
Curly braces are not required for single-element relations; so {4} and 4 are the same relation, with arity 1 and cardinality 1.
The constants true and false are also relations, of arity 0.
There are only two of these: false is {} (the empty relation with arity 0), and true is {()}, that is, the relation with one empty tuple (arity 0 and cardinality 1).
Though this syntax might seem unusual, it is compatible with other features of the language, as you will see below.
This way, everything is a relation, including the results of arithmetical operations (+, -, *, etc.) and boolean operations (and, or, not, etc.).
Relations Are Sets
Rel relations are sets of tuples.
This means that order does not matter and duplicates are ignored — keeping only one of each different tuple.
Thus, for example, {1; 1; 1} is the same unary relation as {1}.
And {(1, 2); (3, 4); (1, 2)} is the same binary relation as {(3, 4); (1, 2)}.
This is important when computing aggregations, as explained in the Group-By section in the next part of this Primer.
Thinking of relations as tables again, this means that, unlike columns, the order of the rows does not matter, and there are no duplicate rows.
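A quick way to see deduplication at work is to count a relation that is written down with a repeated tuple. This is a minimal sketch that uses only count from the Cardinality section; because the repeated (1, 2) is kept once, the count should be 2 rather than 3.

```rel
// query
// (1, 2) is written twice but stored once
def R = {(1, 2); (3, 4); (1, 2)}
def output = count[R]
```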
The parser allows but does not require parentheses and brackets when there is no ambiguity. So you can write:
// query
def output = 1, 2 ; 3, 4
In Rel, relations are sets of tuples; they don’t have duplicates.
Testing Set Membership
For a relation R, the expression R(x) may be seen as testing set membership and will hold exactly for those x in R.
In general, R(t1,...,tn) is true if and only if (t1,...,tn) is a tuple in R.
Furthermore, R can be any expression that evaluates to a relation. For example:
// query
def output(x) = {1; 2}(x) or {2; 3}(x)
Base and Derived Relations
In the RelationalAI documentation, small relations with sample data are often built into the Rel code, as in the def players = {...} definition above.
Typically, though, data will be stored on disk, as relations, often representing “raw data” from outside sources, such as JSON or CSV files.
Once written to disk, no definitions are necessary for these relations, called base relations.
By contrast, relations defined by rules are derived relations, also known in the database world as views.
See Updating Data.
Derived relations are transient, local only to a query, unless:
- You write their results back to disk, creating new base relations.
- You install their definitions — hence the term “installed view.”
Database logic is defined by a set of rules. Each derived relation can refer to other derived relations, relations defined in libraries such as the stdlib, and base relations.
Some quick things to know about rules, or defs:
- Their order does not matter.
- They add up because relations in Rel are the union of their definitions. Suppose you have:
// query
def myrel = {}
def myrel = 1; 2
def myrel = 2; 3
myrel will be {1; 2; 3}.
- Definitions can be recursive; see Recursion.
- Relations can be overloaded by arity and type; see Advanced Syntax.
- Related definitions can be grouped and parameterized together into modules; see Modules.
Optional Schema Declaration
Rel does not require predefined schemas. When you create a base relation or a derived relation in Rel, the system automatically tracks the arity and type of the tuples in the relation. Unlike traditional database systems, you do not have to specify this information beforehand as a “database schema.”
You can, however, enforce schemas if you want, in the form of integrity constraints that restrict the arity and types of relations, for instance. See Integrity Constraints for examples.
Uses of , and ;
Both , and ; are operators in their own right, and it is helpful to be familiar with the multiple ways they can be used.
Tupling, Filtering, Conjunction, and Products
The , operator can:
- Make tuples.
- Serve as a boolean filter, analogous to and.
- Denote cross products.
In the following example, , gives the cross product of relations {1; 2; 3} and {4; 5}, where 4 or 5 is appended to each of {1; 2; 3}:
// query
def output = {1; 2; 3}, {4; 5}
The cross product of {1} and {2} is {(1, 2)}, so you can see tupling as a special case of cross product.
Taking advantage of true and false being relations, , can also be used as a boolean filter, analogous to an and operation:
// query
def myelements = 1; 2; 3; 4; 5; 6; 7; 8; 9
def output(x) = myelements(x), x > 3, x < 7
This works because true is the relation {()} (arity 0 and cardinality 1), and false is the relation {} (arity 0 and cardinality 0).
Any cross product with {} will be {}, so R, false is always false.
And the cross product of any relation R with {()} is R, hence R, true is always R.
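To make these two identities concrete, the sketch below applies the comma operator to a small relation, once with true and once with false. The names output:kept and output:dropped are illustrative and follow the output:count pattern used earlier.

```rel
// query
def R = {1; 2; 3}
// R, true is the cross product of R with {()}, which is R itself
def output:kept = R, true
// R, false is the cross product of R with {}, which is empty,
// so the output has no entry with :dropped
def output:dropped = R, false
```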
You can also use the usual connectives and, or, implies, and not; but, unlike the comma operator, they always expect arity 0 (true or false).
This will be further discussed later.
Union and Disjunction
While , builds tuples and can concatenate them, the ; operator builds relations and can combine their definitions in a union:
// query
def output = 2; {2; 2; 3} ; {3; 3; 4}
// query
def rel1 = (1, 2); (4, 8); (3, 6)
def rel2 = (3, 6); (2, 4)
def output = rel1; rel2
The ; operator can be thought of as a disjunction (“or”), and it behaves exactly as or for arity 0 (boolean) expressions: true or false is true, false or false is false, and so on.
While or works only for arity 0 expressions, the ; operator also works with higher arities and still represents a choice.
For example, {1; 2}(x) serves as 1 = x or 2 = x here:
// query
def output(x, y) {
{1; 2}(x)
and {4; 5; 6}(y)
and x + y < 7
}
Definitions
You have already seen a few examples of relation definitions (defs) in Rel.
A Rel model is a collection of such definitions, called “Rel sources”, which can be defined in terms of each other, even recursively.
The RAI query engine answers queries by combining these definitions with known data to compute a requested result.
The query comes in the form of a relation, often called output, that you want to compute.
The order in which the definitions are written down is not relevant. For example:
// query
def output(x) = odddata(x) and x > 5
def mydata = {1; 2; 3; 4; 5; 7; 8}
def odddata(x) = mydata(x) and x%2 = 1
def mydata = {9; 10; 11}
Here, mydata is the union of its two definitions, and it is not a problem that output goes first, or that odddata is introduced before mydata, on which it depends.
Relational Abstraction
Recall that a relation is a set of tuples. One way to describe a relation is to write down a formula that is true exactly for those tuples in the relation and false for all others.
You can see this most clearly in definitions that introduce a variable for each column in the left-hand side of the definition, as in def myrel(x1, x2, ..., xn) = ... and then give a boolean expression on the right-hand side that constrains those variables:
// query
def mydomain(x) = range(1, 7, 1, x) // x will range over 1; 2; 3; 4; 5; 6; 7
def myrel(x, y) {
mydomain(x)
and mydomain(y)
and x + y = 9
}
def output = myrel
Note: The Rel Standard Library has many utilities including the one used here: range(start, stop, step, x) constrains x to range over all the elements between start and stop, inclusive, skipping by step each time.
The style shown above, with the variables listed on the left-hand side, is preferred, since it leads to more readable definitions.
However, relations can also be specified using relational abstraction, indicated by : in Rel.
The definitions above are equivalent to:
// query
def mydomain = x : range(1, 7, 1, x)
def myrel {
x, y : mydomain(x)
and mydomain(y)
and x + y = 9
}
def output = myrel
This use of : is similar to the vertical bar used for describing mathematical sets, which separates the variables from the conditions.
There are variables on the left of the :, a boolean condition over those variables on the right, and the set contains all the combinations of elements for those variables (tuples) that make the condition true.
But in Rel, : can do much more.
The expression Expr in <vars> : Expr does not have to be a boolean formula, that is, it can have arity greater than 0.
If Expr has arity k, the corresponding elements from Expr are appended to <vars>, and the total arity will be the number of variables in <vars> plus k. For example:
// query
def mydomain = x : range(1, 4, 1, x)
def myrel = x : x * 5, mydomain(x)
def output = myrel
This is similar to how , works. For both : and the comma operator, when the right-hand side is a boolean expression (has arity 0), it works as a filter.
If the right-hand side has arity greater than 0, then the expression is appended to the tuple.
This offers a way to do group-by aggregations, as you will see later in the Group-By section.
The arity of bindings : expression is the number of variables in bindings plus the arity of expression.
Style note:
When you define a relation using the def myrel = x1, ..., xn : Expr style, the arity of myrel depends on the arity of the expression on the right (Expr), which might not be clear at first sight. You only know that myrel will have arity of at least n.
By contrast, the def myrel(x1, ..., xn) = ... definition explicitly calls out the arity of the relation (exactly n), which is why it is preferred.
Binding Shortcuts
It is often useful to put filters in the left-hand side of :, which is called the bindings.
For this, you can use in and where, which can be used directly wherever variables are introduced, to restrict the domain of the variables. For example:
// query
def mydomain(x in range[1, 7, 1]) = true // same as: def mydomain = range[1, 7, 1]
def myrel = x in mydomain, y in mydomain where x + y = 9 : x * y
def output = myrel
Note that while in restricts the domain of a single variable, where can restrict combinations of variables.
Note: in (and its equivalent symbol ∈) is not a stand-alone boolean predicate and can be used only in bindings. If you want to express x in R in a boolean formula, just write R(x).
You can define many different kinds of relations using the machinery you have seen. Check out this next feature that is widely used in Rel to make the code more concise.
Relational Application
Relational application is the term used when variables are “applied” to a relation. In Rel, a relational application looks as follows:
myrel(1, 2)
The parentheses () enclose the arguments (here the pair (1, 2)) that should be applied to the relation myrel.
You can also view relational application as a “look up” testing that the tuple (1, 2) is in myrel.
Relations in GNF organize data as a key-value pair, (k, v), where k is the key and v is the value.
This enables a relation to act like a dictionary where the value v is keyed by k.
To look up all values associated with the key 1, you can write myrel[1]:
// query
def myrel {
(1, 2);
(1, 4);
(3, 6)
}
def output = myrel[1]
This mimics the dictionary syntax of many popular computer languages.
In more technical terms, this is called partial relational application, where only some of the first arguments are applied.
The square brackets [] indicate that not all arguments are necessarily provided and some may be missing.
Furthermore, the syntax myrel[1] means that all remaining arguments (i.e., elements in the tuple) that haven’t been applied will be returned, which in the case above are the values 2 and 4.
With partial relational application, you can apply multiple arguments.
For example, myrel[1, 2] is used to return the remaining part of the tuples in myrel that have as first element 1 and as second element 2:
// query
def myrel {
(1, 2, 3, 4);
(1, 2, 6, 7);
(1, 3, 10, 11)
}
def output = myrel[1, 2]
The expression myrel[x, y] is equivalent to myrel[x][y].
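The equivalence can be checked against the relation from the previous example. In this sketch, output:together and output:chained are illustrative names; both entries should hold the same remaining tuples, (3, 4) and (6, 7).

```rel
// query
def myrel {
    (1, 2, 3, 4);
    (1, 2, 6, 7);
    (1, 3, 10, 11)
}
// both arguments applied in one pair of brackets
def output:together = myrel[1, 2]
// the same arguments applied one at a time
def output:chained = myrel[1][2]
```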
As an example from the Rel Standard Library, the function add is a ternary relation, where add(x, y, z) holds if z is the sum of x and y.
You can write add[x, y] to indicate the sum of x and y; the expression add[1, 2] binds x and y to 1 and 2, and evaluates to a single-element relation containing all values of z for which add(x, y, z) is true.
In this example, there is only one result, but this does not have to be the case, as you will see next.
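As a minimal sketch, the query below shows the full and the partial application of add side by side. The names output:full and output:partial are illustrative: the first entry is present because add(1, 2, 3) is true, and the second holds the value 3.

```rel
// query
// full application: true, because 3 is the sum of 1 and 2
def output:full = add(1, 2, 3)
// partial application: the remaining element z, here 3
def output:partial = add[1, 2]
```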
Partial Applications Are Not Functions
In many settings, [] looks like a function call, as in add[1, 2] or cos[2 * pi_float64], but you should remember that the output might be empty, or not unique.
In the example below, neighbor[x] can have zero, one, or two elements for each x:
// query
def mydomain(x) = range(1, 5, 1, x) // numbers from 1 to 5, inclusive
def neighbor(x in mydomain, y in mydomain) {
(y = x + 1)
or (y = x - 1)
}
def output = x in {2; 3} : neighbor[x]
In technical terms, [] is relational application, which is a generalization of functional application.
Rel code that uses [] could always be written without it, by introducing new variables, but [] makes it more concise and natural to read.
Operator Distribution
If S is a unary relation, Rel can also evaluate R[S], which will be the union of R[x] over all the tuples x in S.
For example:
def output = neighbor[{1; 3}]
This will give {2; 4}, since neighbor[1] is {2} and neighbor[3] is {2; 4}.
Other operations distribute similarly. For example, 1 + {2; 3} is {3; 4}.
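The distribution of + can be tried directly. This sketch only restates the expression above as a query; per the text, the result should be {3; 4}.

```rel
// query
// + is applied to 1 and each element of {2; 3}, and the results are unioned
def output = 1 + {2; 3}
```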
Note: In the expression R[S], the relation S must have arity 1, unless R is a “higher-order” @inline def.
See the Advanced Syntax Primer.
Square Brackets in Definitions
Square brackets can be used in the left-hand side of definitions, making them more concise, and often highlighting a functional dependency. For example:
// query
def f[x in {3}, y in {4; 5}] = x + y
def output = f
Here is another example:
// query
def myelements = (1, 2); (3, 4); (5, 6)
def output[x] = myelements[x], x > 1, x < 4
An equivalent query using logical connectives is:
// query
def myelements = (1, 2); (3, 4); (5, 6)
def output(x, y) = myelements(x, y) and x > 1 and x < 4
In this case, the second formulation is preferred, using and instead of the comma operator, since it makes it easier to find typos and mistakes.
If any of the formulas used with and have an arity greater than 0, you will get a compiler error.
By contrast, in the previous formulation, the comma operator (,) will compute the cross product of any relations it gets, regardless of their arity.
- When writing def myrel[v1,...,vn] = e, the arity of myrel will be n plus the arity of e.
- This is also true of def myrel(v1,...,vn) = e, but in this case, e must have arity 0, and myrel has arity n.
- When n is 0, you just have def myrel = e, and the arity of myrel is the arity of e.
You can use constants on the left-hand side:
// query
def output[1, 2] = 3, 4
Here’s an example that combines [], in, and , as a tuple constructor and filter:
// query
def mydomain = range[1, 5, 1] // numbers from 1 to 5, inclusive
def output[x in mydomain, y in mydomain] = x - y, x + y, x * y, x + y = 5
Caveat: Not all combinations of in and where are yet supported in the head of the def, that is, the left-hand side of the =, but you can combine them in the body, that is, the right-hand side of the =.
The definition above is equivalent to:
def output = x in mydomain, y in mydomain where x + y = 5 : x - y, x + y, x * y
Parentheses
Parentheses () can be thought of as a special case of relational application, with remaining arity 0:
The key is the full tuple, and the value is the empty tuple (), that is, true.
Arity
Parentheses Versus Square Brackets
The Rel compiler checks that all the arities in the code match up and reports an error if they do not.
When you write, say, myrel(x, y, z), you are indicating that myrel must have arity 3.
The expression myrel(x, y, z) will evaluate to either true or false, which are represented in Rel as relations of arity 0.
By contrast, writing myrel[x, y, z] implies that the arity of myrel is at least 3, and the expression evaluates to a relation with an arity that is 3 less than the arity of myrel.
For more details on how partial relational application works, see section Relational Application.
In the case that myrel has arity 3, the partial relational application myrel[1, 2, 3] is equivalent to a “complete” relational application myrel(1, 2, 3).
Both expressions evaluate to relations of arity 0, which are either true or false.
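The difference can be seen on a small ternary relation. This is a sketch with made-up data; the names output:parens, output:brackets, and output:rest are illustrative and follow the output:count pattern used earlier.

```rel
// query
def myrel = {(1, 2, 3); (4, 5, 6)}
// full application: arity 0, true because (1, 2, 3) is in myrel
def output:parens = myrel(1, 2, 3)
// partial application of all three arguments: also arity 0
def output:brackets = myrel[1, 2, 3]
// partial application of two arguments: arity 1, the remaining element 3
def output:rest = myrel[1, 2]
```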
Boolean Formulas
Expressions such as {1} and p(x), or sin[x] or q(x), give an arity error.
The boolean connectives require arity 0 for both arguments, and their result has arity 0.
When you write myrel[1, 2, 3], you expect myrel to have arity of at least 3.
If you write myrel(1, 2, 3), you require myrel to have arity of exactly 3.
Arity and Defs
When you write def myrel(a, b, c) = E, the arity of E must be 0 because it should have a boolean value, and the arity of myrel is 3.
When you write def myrel[a,b,c] = E, then E can have arbitrary arity, and the arity of myrel will be 3 plus the arity of E.
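The two rules can be compared with arity, introduced earlier. In this sketch, mytriple and mypairs are hypothetical relations made up for illustration; the reported arities should be 3 and 4, respectively.

```rel
// query
// head with parentheses: the body is a formula (arity 0), so mytriple has arity 3
def mytriple(a in {1; 2}, b in {1; 2}, c in {1; 2}) {
    a + b = c
}
// head with brackets: the body has arity 2, so mypairs has arity 2 + 2 = 4
def mypairs[a in {1; 2}, b in {1; 2}] = a + b, a * b
def output:parens = arity[mytriple]
def output:brackets = arity[mypairs]
```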
Format
In Rel, neither indentation nor line breaks matter. The style conventions adopted in the code examples on the website are for readability purposes.
Consider the following code:
// query
def players {
("Neymar", "PSG");
("Messi", "BFC");
("Werner", "Chelsea");
("Pulisic", "Chelsea")
}
def output = players
ic {count[players] = 4}
It could also be written like this:
// query
def players {
("Neymar", "PSG");
("Messi", "BFC");
("Werner", "Chelsea"); ("Pulisic", "Chelsea") }
def output = players ic {count[players] = 4}
Both yield the same result. The two code blocks differ only in whitespace, so they are equivalent in Rel and render the same output.
Even so, it is highly recommended to use spaces and line breaks to make your code readable and maintainable.
Standard Library
Rel includes a Standard Library with many functions and utilities; for example, range, maximum, minimum, sin, cos, and many other mathematical operations. The stdlib also includes “higher-order” definitions that take relations as arguments, such as argmax, argmin, and first.
Help
The query help[:name] (or, in module syntax, just help:name) will display a brief docstring for each relation.
For example, help:argmax, help:range, or help:min.
Standard Library relations are currently imported into the main Rel namespace.
Therefore, unexpected outcomes may occur when users define their own relations with the same names.
Some names to avoid are: domain, function, total, first, last, and equal.
A quick way to check if a name is reserved is to try help:name.
Summary
This article has covered the basics of Rel syntax. The next in the Rel Primer series focuses on aggregations, group-by and joins, followed by more advanced Rel features.