Rel Language Reference

Reference Guide for the Rel language.

Introduction

This is the reference guide for the Rel language. For an introduction to the language, see our upcoming tutorials.

Values

Every value in Rel is a first-order relation. We write relations as tuples separated by semicolons (;) and wrapped in curly braces ({}). For example:

{(1, 2); (3, 4); (5, 6)} is a relation with arity 2 and cardinality 3. It can be visualized as the following table:

12
34
56

{(1, 2, 3); (4, 5, 6)} is a relation with arity 3 and cardinality 2. It can be visualized as the following table:

123
456

{(1); (2); (3); (4)} is a relation with arity 1 (a unary relation) and cardinality 4. It can be visualized as the following table:

1
2
3
4

We typically leave out the parentheses for unary relations, so this relation can be written as {1; 2; 3; 4}.

The purpose of this first section is mainly to explain what notation we use for values computed by Rel; but the notation is also supported by the Rel language itself.

In Rel, the parentheses of the tuple are in general optional because , is an operator and it binds stronger than the ; operator (see Comma and Semicolon). Similar to other languages, tuples with a trailing comma are allowed, as in {(1,); (2,); (3,); (4,)}.

Values in Rel are restricted to first-order relations. This means that relations cannot contain relations. Relations can contain values such as integers, floating-point numbers, fixed-point decimals, characters, dates, datetimes, etc. Many of the types have some structure (such as dates), but are not variable-sized collections. The Rel system is designed to support arbitrary fixed-size Julia types as values.

Terminology

TermExplanation
ArityNumber of arguments in a relation
CardinalityNumber of tuples in a relation (size)
RelationA relation of any cardinality, arity, and with any properties. When we say ‘relation’, we do not mean that this relation is not a function, set, singleton etc.
FunctionRelation with a functional dependency from the initial arguments to the last argument
Empty relationRelation of cardinality 0 (any arity)
TupleRelation of cardinality 1 (any arity)
SetRelation of arity 1 (Unary relation)
Nullary relationRelation of arity 0. Nullary relations can only have cardinality 0 or 1. Nullary relations can be thought of as representing the boolean values true and false. See the Formulas section for a discussion of how this fits in the language design.
Unary relationRelation of arity 1 (Set)
Binary relationRelation of arity 2
Ternary relationRelation of arity 3
N-ary relationRelation of arity n
SingletonRelation of cardinality 1 and arity 1
IDB relationIDB is an abbreviation of intensional database. An IDB relation is a relation defined by a computation over IDB and EDB relations. IDB relations could be persisted or not, because they can always be recomputed from data that is already in the database. This corresponds to the definition of a view in SQL.
EDB relationEDB is an abbreviation of extensional database. An EDB relation has to be stored in the database because it is not defined by a computation. This corresponds to a table in SQL.

Lexical syntax

Identifiers

An identifier is a sequence of any Unicode letter or underscore, followed by zero or more Unicode letters, underscores, or numbers.

Identifier  := IdCharInit IdChar*
IdCharInit := [a-zA-Zα-ωΑ-Ω_]
IdChar     := [a-zA-Zα-ωΑ-Ω_0-9]

For example, valid identifiers are: x, a1, b1c, θ, φ, Σ, ß, β1, β2.

Due to missing features in the parser generator we use, the definition explicitly lists supported ranges of Unicode letters (hence the Greek ranges). This is currently incomplete, so we do not yet support other Unicode letters in identifiers.

Comments

Rel uses /* ... */ for block (multi-line) comments and // ... for end-of-line comments.

Infix symbols

Infix symbols that are not defined in the Rel libraries are available to be user-defined: , , , , , , , , , , , , .

For example:

@inline def (≈)(a, b) = a + 1 = b
def output = 1 ≈ 2

Note: the infix used in alglib.rel (Unicode 8764) is not the same character as your keyboard’s ~ (Unicode 126).

These symbols can also be useful as higher-order relational arguments, as this example from alglib.rel shows:

@inline
def unary_relation(D, ∼) =
    equal(arity[∼], 1)

Declarations

Definitions

Definitions define a named, computed relation.

Declaration :=
    Attribute* "def" Id FormalParamsBracket* FormalParamsParen? "=" Expr
    Attribute* "def" Id FormalParamsBracket* FormalParamsParen? "{" Expr "}"

In our examples, we write parent/2 as an abbreviation that means “relation parent has arity 2”.

ExampleDescription
def five = 5five is a unary, singleton relation {(5,)}
def numbers = {1; 2; 3; 4}numbers is a unary relation
def grandparent = a, b: ∃(t: parent(a, t) and parent(t, b))grandparent is a binary relation
def grandparent(a, b) { ∃(t: parent(a, t) and parent(t, b)) }grandparent is a binary relation
def netsales[x, y, z] = sales[x, y, z] - returns[x, y, z]Assuming sales/4 and returns/4, netsales is arity 4.

Note that in Rel any constant or variable can be used as a relation. For example, in def pi = 3.14 the value of pi is the singleton relation {(3.14,)}.

Definitions are similar to views in SQL, though Rel has fewer restrictions on what computations are supported in a view.

Section Relational Application explains the square brackets syntax.

Entities

Entity definitions in Rel have the form

Declaration :=
    Attribute* "entity" entity_name:Id constructor_name:Id FormalParamsBracket* FormalParamsParen? "=" Expr

See the Entities concept Guide for an introduction to using entities in Rel.

We can optionally use one of @hash and @auto_number as the Attribute to indicate how entities are constructed. By default, @hash is used. entity_name:Id is the name of the kind of entity, and constructor_name:Id is the constructor, used to create and reference the entities. The relation Expr describes the set of entities, each of which will be considered unique. The tuples in this relation will be the keys from which the entities are constructed.

The declaration @hash entity E c = r corresponds to def c = hash[r] and def E = last[c]. Similarly, @auto_number entity E c = r corresponds to def c = auto_number[r] and def E = last[c].

hash entities are 128-bit and auto_number entities are 64-bit. In both cases, do not make any assumption on the particular representation or ordering of the entities, other than their uniqueness.

Expressions

Formulas

Formulas are expressions that evaluate to either true or false. Formulas are the core of the Rel language. This means that other constructs are compiled to formulas.

Formulas and relation-valued expressions may appear to be distinct types of expressions, but in Rel true is equivalent to {()}, which is the zero arity relation containing its only possible value: the empty tuple. Similarly, false is the empty relation, {}, of arity 0. There are only two relations of arity 0.

true{()}
false{}

Formulas often have free variables (e.g. x and y in p(x, y) and q(y)). Formulas with free variables are not relations (they are just booleans, or equivalently, arity-0 relations).

Application (Atom)

Expr := Expr "(" {Expr ","}* ")"

The arity of the resulting expression must be zero. For example, in p(x, y, z), the arity of p must be three, and all three arguments are necessary. The formula p(x, y) would evaluate to false because there is no binary tuple in p/3.

Examples
age("Martin", 39)
parent(x, y)
foo()

In Section Relational Application we introduce a syntax with square brackets that supports partial application of the relation, for example age["Martin"] has the value {(39)}.

Conjunction

Expr := Expr "and" Expr

This construct is logical conjunction. Evaluates to true if an only if the left and right expression evaluates to true.

The arguments of and must be a formula (arity 0).

Examples
flying(x) and mammal(x)

The comma operator (introduced later) can be used instead of and, but is more general: it does not require arity-0 arguments as and does. Due to this, the and syntax can help catch mistakes and disambiguate overloaded logic. The arity of expression e1, e2 entirely depends on the arity of e1 and e2 (it can be any arity). On the other hand, the arity of e1 and e2 is always 0. If e1 and e2 are used with an incorrect assumption of their arity, then and will give an error, while , will not.

The possible uses of e1 and e2 are:

e1e2e1 and e2
falsefalsefalse
falsetruefalse
truefalsefalse
truetruetrue

Disjunction

Expr := Expr "or" Expr
Examples
flying(x) or floating(x)

The possible uses of e1 or e2 are:

e1e2e1 or e2
falsefalsefalse
falsetruetrue
truefalsetrue
truetruetrue

Similar to the role of the comma operator for and, the semicolon operator is a generalization of or (see Section Semicolon (Union and Disjunction)). The semicolon operator is disjunction (like or) for arity 0, and union for relations of an arity bigger than 0. The section on the semicolon operator explains how this matches the semantics of or exactly.

Implication

Expr := Expr "implies" Expr

Implication e1 implies e2 is defined as (not e1) or (e1 and e2)

The possible uses of e1 implies e2 are:

e1e2e1 implies e2
falsefalsetrue
falsetruetrue
truefalsefalse
truetruetrue

Negation

Expr := "not" Expr
Examples
not mammal(x)

Similar to the other logical operators, the argument of not must be a formula (an arity-0 relation).

The possible uses of not e are:

enot e
falsetrue
truefalse

Boolean Constant Relations

Expr := "true"
      | "false"

The constants true and false are relations, not scalar values. The constant true is equivalent to {()}. The constant false is equivalent to the arity-0 relation {}.

Because true and false are relations, there is never a relation persisted in the database that has true or false occurring as a value in the tuples (otherwise that relation would not be first-order). The Rel compiler statically eliminates any occurrences of true and false according the laws of the language.

Comparison Operators

Expr := Expr "<" Expr
      | Expr ">" Expr
      | Expr "=" Expr
      | Expr "!=" Expr
      | Expr "≠" Expr
      | Expr "<=" Expr
      | Expr "≤" Expr
      | Expr ">=" Expr
      | Expr "≥" Expr

Comparison operators are translated into applications of infinite relations defined in the standard library. For example, e1 = e2 is translated to eq(e1, e2).

Examples: x < 5, x > 5, x != 0.

Conditional Expression: If-Then-Else-End

Expr := "if" Expr "then" Expr "else" Expr "end"

The condition must be a formula (an arity-0 expression).

If the condition is true, then the value of the expression is the then branch. Otherwise the value is the else branch. The then and else branches can have arbitrary arity.

The semantics of conditional expressions is:

|[ if e1 then e2 else e3 end ]| => |[ (e1, e2); (not(e1), e3) ]|

Examples:

if 1 < 2 then 'a' else 'b' end

gives (‘a’,).

def p = 1
def q = 2,3
def r = 4,5
def output = if p(1) then q else r end

gives (2, 3).

def output = if p > 1 then q else r end

gives (4, 5).

def p = 1; 2; 3
def output = if x = 1 then 'a' else 'b' end for x in p

gives

(1, 'a')
(2, 'b')
(3, 'b')

Constants

Constants in Rel are semantically equivalent to singleton relations (arity 1, cardinality 1). In Rel, the value of every expression is a relation, and this includes constant expressions. For example, the value of the constant 1 is {(1)}.

Note that true and false (see Section Boolean Constant Relations) are arity-0 relations.

Integer Constants

Examples: 1, 2, 3

Floating-point Constants

Expr := [0-9]+ "." [0-9]* exponent?
      | [0-9]+ exponent
      | "." [0-9]+ exponent?

decimalDigit  := /[0-9]/
decimalDigits := decimalDigit+
exponent      := ('e' | 'E') ('+' | '-')? decimalDigits

Examples:

  • 0.0
  • 3.14
  • 1.5e4
  • 1e10
  • 5e-4
  • 5e+4
  • .5e4

String Constants

Strings are enclosed by double quotes ("). Multi-line strings are allowed, and enclosed by three double quotes ("""). A quote can be escaped as \", and the \ character can be specified as "\\". There is no need to escape single quotes inside triple-quoted strings, unless they are next to the triple quotes.

Examples:

def f1 = "abc"
def f2 = """
a,b,c
1,2,"three"
"""
def f3 = "\""
def output = i, c : char[f3, i, c]

f3 will have length 1, and the output will be {1, "}.

Characters are Unicode. char["中文例子", 2] = '文'.

Character Constants

Enclosed by single quotes.

Examples: 'c', ‘文’.

Date Constants

Date constants use the ISO 8601 extended format.

Expr := [0-9][0-9][0-9][0-9] "-" [0-9][0-9] "-" [0-9][0-9]

Examples: 2021-01-01, 2100-04-03

If a Date does not exist (for example 2021-02-30 or 2020-25-01) then a warning is reported and the value of the constant is the unary empty relation.

DateTime Constants

DateTime constants use the ISO 8601 extended format. A timezone is required.

Example: 2021-01-01T21:00:00+00:00

Expr := Date "T" Time Timezone

Date = [0-9][0-9][0-9][0-9] "-" [0-9][0-9] "-" [0-9][0-9]
Time = [0-9][0-9] ":" [0-9][0-9] ":" [0-9][0-9]
Timezone = "Z"
         | "+" [0-9][0-9] ":" [0-9][0-9]
         | "-" [0-9][0-9] ":" [0-9][0-9]

If a DateTime is not valid then a warning is reported and the value of the constant is the unary empty relation.

Relational Abstractions

Bindings

Bindings are the mechanism by which variables are introduced in Rel. It is often useful to immediately specify the expected domain of the new variables, and the binding syntax supports that with where and in.

Bindings := FormalParams
Bindings := FormalParams "where" Expr

FormalParams := FormalParam
FormalParams := "(" FormalParams ")"
FormalParams := FormalParams "," FormalParams

FormalParam := FormalId
             | FormalId "in" Expr
             | FormalId "∈" Expr
             | Constant

FormalId := Id
          | Id "..."

Bindings are not expressions themselves, but are used in many language constructs that are expressions. Because the syntax for bindings is somewhat elaborate, it is useful to understand it separately from the language constructs in which it is used.

Examples of bindings:

  • x

    • Introduces a single variable named x
  • x, y, z

    • Introduces three variables: x, y, z
  • x in A

    • Introduces a single variable x whose domain is restricted to A
  • x ∈ A

    • Equivalent to the previous example.
  • x where A(x)

    • Equivalent to the previous example.
  • x ∈ Int

    • Introduces a single variable x, with domain Int. Domains do not have to be finite.
  • z in {1; 2; 3; 4; 5}

    • Domains can also use literal relations
  • z in 1

    • A domain can be a singleton (or even empty, but that’s not often useful)
  • t where team(t) and soccer(t)

    • A domain can be a more complex logical formula.
  • x in A, y in B, z in C or equivalently x ∈ A, y ∈ B, z ∈ C

    • Introduces three variables, where x has domain A, y has B, z has C. A, B, and C need to have arity 1.
  • x ∈ Red, y ∈ Green where edge(x,y)

    • in and where can be combined.
  • x, y where edge(x,y) and Red(x) and Green(y)

    • Equivalent to the previous example.
  • x, y where edge(x, y)

    • Introduces two variables x and y in the edge domain.

If the domains of variables introduced are independent, then in can be used. If the domain involves multiple variables, then where must be used.

For the where case, the Expr must be arity 0 (a formula). The formula can use all variables in the binding.

For the in/ case, the expression must have arity 1 if a normal variable is used. Varargs are supported though, as in x... in {(1,2,3)} : true. In the varargs case, the arity of the expression (3, in our example) determines how many variables the vararg represents.

Relational Abstraction

Relational abstraction defines a relation without a name. This is by far the most important construct in the language. We will introduce some syntactic variations next, but they all translate into relational abstractions.

Expr := Bindings ":" Expr

The arity of a relational abstraction is the number of variables in the bindings + the arity of the expression. If the expression is a formula (arity 0), then only the variables from the bindings remain.

In the following examples, we use:

  • def p = {1; 2; 3}
  • def q = {2; 3; 4}
  • def r = {(1, "a"); (2, "b"); (3, "c")}
  • def abc = {"a"; "b"; "c"}
  • def parent = {("John", "Mary"); ("Mary", "Felix"); ("Felix", "George")}

It is common for a relational abstraction to have a formula as the expression. This is similar to a Datalog rule, but without a name for the relation that is defined. The following examples all use a formula.

  • x: p(x) and q(x)
    • Unary relation (one variable x)
    • Value: {2; 3}
  • x: p(x) or q(x)
    • Unary relation
    • Value: {1; 2; 3; 4}
  • x, s: q(x) and r(x, s)
    • Binary relation
    • Value: {(2, "b"); (3, "c")}
  • s: exists(x: q(x) and r(x, s))
    • Unary relation. The x is existentially quantified in the body of relational abstraction.
    • Value: {"b"; "c"}
  • x, y: p(x) and q(y)
    • Binary relation
    • This is the Cartesian product of p and q, which is {(1,2); (1,3); (1,4); (2,2); (2,3); (2,4), (3,2); (3,3); (3,4)}
  • x, t, y: parent(x, t) and parent(t, y)
    • Ternary relation
    • Value: {("John", "Mary", "Felix"); ("Mary", "Felix", "George")}
  • x, y: parent(x, t) and parent(t, y)
    • This is an error because variable t is never introduced.
  • x, y: ∃(t: parent(x, t) and parent(t, y))
    • Value: {("John", "Felix"); ("Mary", "George")}
  • x: Int(x) and x > 0
    • Value is the infinite relation {1; 2; 3; 4; ...}
  • x, y: x = y
    • Binary relation of all values that are equal, e.g (1,1) and ('a', 'a')
  • x: x = 1
    • Singleton relation {(1,)}
  • x: true
    • Infinite unary relation (e.g. contains 1, and "abc")
  • x: false
    • Empty, unary relation
  • x...: p(x...) and q(x...)
    • Varargs are explained later, but hopefully they are intuitively clear. In this case, x... will only represent a single variable (because p and q have arity 1) and the value is {1; 2}.
  • {x: p(x) and q(x)}
    • As discussed later, the only purpose of curly braces and parentheses is to indicate precedence. Sometimes a relational abstraction is more clear enclosed in curly braces. It does not change the semantics in any way.

The previous examples all use formulas for the body of the relational abstraction. Expressions with arity bigger than 0 are allowed as well though. This makes it a bit similar to a lambda, although it is important to keep in mind that the value is not restricted to functions: the expression can compute more than one value. Using the same sample data for p, q, r and parent as before:

  • x, y: x + y
    • Ternary, infinite relation
    • Example tuples: (1, 1, 2), (2, 3, 5), (3.5, 1.2, 4.7)
  • x: x + 1
    • Binary, infinite relation
    • Example tuples: (1, 2), (1.0, 2.0), (3.14, 4.14)
  • x in Int: x + 1
    • Binary, infinite relation
    • Example tuples: (1, 2), (2,3), (3,4)
  • x where Int(x): x + 1
    • Equivalent to the previous example.

A few slightly more advanced examples without sample data:

  • p: today - date_of_birth[p]
    • Binary relation of person and their age. The arity is 1 (for p) plus 1 for the body.
  • x: sum[v: scores(x, v)]
    • Sum scores v, grouped by x
    • Note that there are two relational abstractions here: one outermost, and one as a parameter to sum.
    • Aggregations are explained later in much more detail.
  • mean[x, y: (actual[x, y] - prediction[x, y]) ^ 2]
    • Compute the mse for a machine learning solution. Note that mse is also defined in the standard library, which can be used as mse[actual, prediction].
  • x: edge(x, x)
    • Unary relation of all nodes that have a self edge.
  • ∃(c ∈ course: ∀(s ∈ student: enrolled(c, s)))
    • The arguments to and are relational abstractions.
  • sum[date store_nbr item_nbr: train_unit_sales[date, store_nbr, item_nbr]]
    • Sum up the sales for every unique (date, store_nbr, item_nbr) triple in the train_unit_sales relation.

Bindings can also use constants as an abbreviation for having to introduce a separate variable. Examples of constants in a binding of a relational abstraction:

  • x, 1: abc(x)

    • Value: {("a", 1); ("b", 1); ("c", 1)}
  • def count_abc = sum[x, 1: abc(x)]

    • Sums the last argument of all tuples, which all are 1, so this defines the count of the number of tuples in abc, which is 3.
    • The standard library defines count in this way generically, and it can be used as count[abc] in this example.
  • 1: p(_)

    • Value: 1 (given that p is not empty)
    • Note that there does not need to be any explicit variable at all in the left-hand-side of the relational abstraction.
    • The underscore expression is syntactic sugar for an anonymous variable, see Section Underscore.

Inverse Relational Abstraction

Relational abstractions are often used as a parameter to higher-order abstractions, such as aggregations. In aggregations, typically the most important thing to see first is what value is being aggregated. To make this easier to see immediately, the bindings and expression of a relational abstraction can be inverted. The : becomes a | (or the keyword for) in this case. This syntax is designed to mimic the common mathematical notation.

Expr := Expr "|" Bindings
Expr := Expr "for" Bindings

This is purely a syntactic convenience: e | b is equivalent to e for b which is equivalent to b: e.

Examples:

  • x+1 | x in {1; 2; 3}

    • Value {(1,2); (2,3); (3,4)}
    • Equivalent to x in {1; 2; 3}: x+1
  • x+1 for x in {1; 2; 3}

    • Equivalent to the previous example.
  • x^2 for x in {1; 2; 3}

    • Value {(1,1); (2,4); (3,9)}
  • {x^2 | x ∈ {1; 2; 3}}

    • Equivalent to the previous example.
    • Similar to normal relational abstraction, curly braces can be used if that is more clear.
  • x^2, x^3 for x in {1; 2; 3}

    • The comma operator is explained in more detail later, but hopefully it is clear that the initial expression is now a tuple. The value of the entire expression is a ternary relation {(1,1,1); (2,4,8); (3,9,27)}
  • (x^2, x^3) for x in {1; 2; 3}

    • Equivalent to the previous example. Parentheses are used for precedence and have no other semantic here.

To users familiar with list comprehension in programming languages such as Python, Julia, and Haskell it may appear as strange that the variables introduced in the comprehension (e.g. x in x^2 for x in {1; 2; 3}) are part of the result. In these programming languages, the value of the comprehension is a list of the value before the for.

The distinction with Rel is that in Rel every value is a relation, and a relation is a set of tuples. This means that duplicates are eliminated. For some computations this is the intent, but often is not. Some examples to illustrate this:

  • age[p] for p in employee

    • This is a binary relation of Person and Int, not a unary relation of ages (which would contain every age value only once).
  • max[age[p] for p in employee]

    • This is the previous example used as an argument to the max aggregation. The result is a singleton of the age of the oldest employee. Note that for this particular aggregation it does not matter whether age values are kept for all employees.
  • sum[age[p] for p in employee]

    • The same example, but now in a sum aggregation. Here it is very important that even if two different persons p1 and p2 have the same age, the age of both persons contributes to the sum. This is ensured by having p be a part of the relation that is being aggregated, and we get the intended result.

Note that max (resp. sum) computes the maximum (resp. sum) of the last argument of the tuples in the relation, which can have any arity.

In the next section we introduce the “from” language construct, which does not keep the variables introduced, more akin to comprehension in programming languages.

“From” Expressions

From expressions are similar to relational abstractions, with one important distinction: the variables introduced by the binding are not part of the resulting value. The arity of the expression is entirely determined by the expression.

Expr := Expr "from" Bindings

The key difference between for (introduced in the previous section) and from (introduced here) with some examples:

ExampleDescription
x+1 for x in {1; 2; 3}Binary relation {(1,2); (2,3); (3,4)}
x+1 from x in {1; 2; 3}Unary relation {2; 3; 4}
x, x+1 from x in {1; 2; 3}Binary relation {(1,2); (2,3); (3,4)}
x, x+1 for x in {1; 2; 3}Ternary relation {(1, 1,2); (2, 2,3); (3, 3,4)}

The third example shows that from can be used in a similar way as the for construct by explicitly including the variables that introduced in the expression.

ExpressionValue
x+1 from x in {1; 2; 3}{2; 3; 4}
x^2 from x in {1; 2; 3}{1; 4; 9}
{x^2 from x in {1; 2; 3}}{1; 4; 9}
x^2, x^3 from x in {1; 2; 3}{(1,1); (4,8); (9,27)}
(x^2, x^3) from x in {1; 2; 3}{(1,1); (4,8); (9,27)}

The from construct is particularly useful if auxiliary variables are needed in some bigger definition. Typically this comes up in the domain of a binding. For example, suppose that we need to define a binary relation of a person x and the name of the father of x. The problem is here that we need to find the father of x, but the father is not part of the actual relation.

  • def fathers_name = x where exists(f: father(x, f)): name[f]

    • This definition is incorrect: the variable f is introduced with an existential quantifier in the domain of x, but it cannot be used outside of that scope, so name[f] will trigger a compilation error that f is undefined.
  • def fathers_name = name[f] | x: exists(f: father(x, f))

    • The inverse relational abstraction syntax has the exact same problem. It will given an error that f is not in scope in name[f].
  • def fathers_name = x: {name[f] from f where father(x, f)}

    • This correct example introduces the variable f in separate from. The curly braces are optional, but may be helpful to understand the precedence and scope of f.
  • def fathers_name = x, name[f] from x, f where father(x, f)

    • If what is actually computed gets confusing, it can also help to just explicitly list the desired result entirely, in this case x, name[f]. The variable ‘x’ must now be introduced in the from binding (otherwise it would be undefined).
  • def fathers_name[x] = name[father[x]]

    • In this example we actually did not need the variable f at all, so it could also have been written as in this example. The logic is not always this straightforward though.

From-Where

To make Rel more accessible to users who are familiar with LINQ or SQL, Rel supports a similar syntax for queries:

Expr := Expr "from" Bindings

Although where does not occur in this grammar rule, it is part of Bindings and leads to a syntax very similar to LINQ (Language-Integrated Query).

s.name from s in students
where s.age > 12 and s.age < 20
p.partkey from p in part
where like[p.name, '@@1%']

This example uses composite relations (e.g., s.age and p.name). The semantics of relational composition, also called navigation, are described in more detail below.

Overview of Relational Abstraction Alternatives

The previous sections introduced a few variants of relational abstractions, and it may be hard to remember the variants and when to use which one. When choosing how to write an abstraction, the first thing to consider is whether it is more readable to first introduce variables or state the value that is computed. This splits the options in two categories with only a few alternatives.

The following examples use p = {1; 2; 3}

Abstractions that have Bindings first and Expr second:

     x in p:              x+1       = {(1, 2); (2, 3); (3, 4)}
     x in p where x < 3:  x+1       = {(1, 2); (2, 3)}

Abstractions that have Expr first and Bindings second:

       x+1 |    x in p       = {(1, 2); (2, 3); (3, 4)}
       x+1 for  x in p       = {(1, 2); (2, 3); (3, 4)}
       x+1 from x in p       = {2; 3; 4}
       x+1 from x in p where x < 3  = {2; 3}

The first and second are identical. The third variant is more SQL-like and lists the computed value separately.

Relational Application

Expr := Expr "[" {Expr ","}* "]"

The square bracket syntax is used to project and select a slice of the relation. It is related to currying in functional languages in the sense that relations can be partially applied.

As opposed to application syntax with parentheses, the number of parameters can be less than the arity of the relation with relational application. For example, for a binary relation p, only applications with two arguments are valid, e.g p(x, y). Normal application with parentheses is a formula, or equivalently a nullary relation. For binary p, an example of a relational application is p[x], mimicking the syntax of arrays and matrix indexing from other languages. For a binary relation p, the value of p[x] is a unary relation. By doing so the part of the expression that gets evaluated will not be returned. This behavior is useful to access attributes/information that are stored in a key-value format.

We can use this syntax to define relations

  def name[1] = "Noether", "Emmy"
  def name[2] = "Hopper", "Grace"
  def name[3] = "Curie", "Marie"

  def age[1] = 53
  def age[2] = 85
  def age[3] = 66

and to query them

ExampleDescription
name[2]{"(Hopper", "Grace")}
name[3, "Curie"]{"Marie"}
name[x-1 from x in {1;2}]{("Noether", "Emmy")}
(name[p], age[p] from p)[_]{("Emmy", 53); ("Grace", 85); ("Marie", 66)}
(age[p], name[p] for p)[(x : x%2 = 1), (x: x>60)]{("Curie", "Marie")}

(In the last example, note that we have placed age first, so we can filter it with x : x > 60 after filtering for p.)

Comma (Cartesian Product and Conjunction)

The comma operator is technically a Cartesian product operator, but when applied to singleton values, then it easier to understand as a tupling operator.

Expr := Expr "," Expr

The comma operator is used in the following examples:

ExampleDescription
def p = 1, 2Singleton, binary relation (tuple)
1, 2, 3Ternary tuple
(1, 2, 3)Also a ternary tuple
{(1, 2, 3)}Also a ternary tuple
x^2, x^3 from x in {1; 2; 3}Binary relation (x is not in result)

The comma operator can also be used in some perhaps surprising ways:

  • def soccer_players[t] = players[t], soccer(t)

    • The comma serves here as an additional condition on the value of soccer players, without having to write out the logic as the point-wise variant def soccer_players(t, p) = players(t, p) and soccer(t).
  • def order_price[pos] = order[pos, :price], not exists order[pos, :load_errors]

    • Similarly, the comma introduces a filter here, skipping rows from a CSV file with any errors.
  • def count[R] = sum[{R, 1}]

    • Every tuple of R is extended with a 1, so that computing a sum aggregation actually computes the count.

Note that comma is really an operator and there is no dedicated syntax for tuples in the language. For example, (1, 2, 3) is an expression that uses parentheses (Expr := "(" Expr ")") and two applications of the comma operator. Tuples never occur as arguments of tuples in a relation. For example, ((1, 2), 3) is the ternary relation {(1, 2, 3)}, and (1, 2), (3, 4) is the arity 4 relation {(1, 2, 3, 4)}.

The comma operator is equivalent to and when its arguments are restricted to relations of arity 0. To illustrate this, here is the table of all possible arguments of arity-0 relations:

LeftRightLeft , Right
{}{}{}
{()}{}{}
{}{()}{}
{()}{()}{()}

Unfortunately the comma character is overloaded by the comma operator and the arguments of applications. In the context of an application, there is a semantic difference whether the comma is interpreted as a comma operator or separator of arguments.

  • foo[x: y, z]

    • In the absence of precedence rules, this example is ambiguous. It can be interpreted as either foo[{x: y}, z] or foo[{x: y, z}]

    • Independent of what the default interpretation is, it is advisable to be clear about the intent when any commas and relational abstractions are involved.

    • The comma operator binds stronger than relational abstraction, which means that foo[{x: y, z}] is the chosen representation in the absence of parentheses. If this is written without curly braces, then a warning will be reported [TODO].

    • As a general guideline, unless there is only a single relational argument (as in sum, count etc), it is best to always use curly braces or separate the arguments. For example, if the intent is to invoke foo with two arguments, then use either foo[x: y][z] or foo[{x: y}, z].

  • foo[a, b: c, d]

    • A similar problem exists in the bindings of the abstraction, because a comma is also used there to separate formal parameters. In this case, the possible interpretations are foo[{a, b: c, d}], foo[{a, b: c}, d], foo[a, {b: c}, d], foo[a, {b: c, d}]. Similar to the previous example, precedence rules select foo[{a, b: c, d}]. Similarly, the recommendation is to clarify the intent using curly braces.
  • sum[R, 1]

    • In this incorrect example, sum is required to have two arguments (which is not the case). The first argument is the relation R, and the second is the number 1. The intent of the logic is to count the number of tuples in R by appending 1 to all its tuples and then applying sum to the relation. To achieve this, the relation argument must be put in parentheses or curly braces: sum[{R, 1}].

As a general rule, if a relation is used as an argument of an application, then it should be wrapped in {...} unless it is the only parameter and no commas are involved.

We are planning to introduce warnings for cases where the interpretation is ambiguous, which should hopefully address the confusion that arises from this. The warning would trigger for an expression that has a comma as the right-most operand (i.e. not written in curly braces or parentheses).

Semicolon (Union and Disjunction)

The semicolon is a generalized union operator.

Expr := Expr ";" Expr
ExampleNormalized
p(x); q(x)p(x) or q(x)
p;qx: p(x) or q(x)
1;3x: x=1 or x=3
{};pp

The semicolon operator is equivalent to or when its arguments are restricted to relations of arity 0. To illustrate this, here is the table of all possible arguments of arity-0 relations:

LeftRightLeft ; Right
{}{}{}
{()}{}{()}
{}{()}{()}
{()}{()}{()}

The semicolon operator does not require its arguments to have the same arity. The compilation of expressions without a single arity is not entirely implemented yet though.

Precedence

Precedence can be indicated with either curly braces or parentheses. The two are exactly equivalent.

Expr := "{" Expr "}"
Expr := "(" Expr ")"

For readability, it can be helpful to always use {...} for non-singleton relations, and `(…)' for singleton relations (scalars). For example, ‘(x + y) * z’ vs ‘{x, y, z: p(x, y, z)}’.

All binary operators are left-associative unless otherwise stated below.

From highest precedence to lowest:

  1. _, variables, qualified names, literals, (), {}
  2. .
  3. relational application
  4. ^ (right assoc)
  5. unary -
  6. /, %, *, ÷, ×, ,
  7. -, +, , ,
  8. , ,
  9. =, !=, , , , <, >, <=, , >=, , , , , , , , , , ,
  10. ¬, not
  11. forall, , exists,
  12. , and
  13. , or
  14. (right assoc), implies (right assoc),
  15. , , , , , xor, iff
  16. <:, :>
  17. <++, ++>
  18. ,
  19. ;

Relation Literals

Relation literals do not have a dedicated syntax, but are by convention written using curly braces (outer), parentheses (around tuples), comma (in the tuples) and semicolon (between the tuples).

ExampleDescription
{1}Singleton relation (arity 1, cardinality 1)
{1; 2; 3; 4}Unary relation (arity 1, cardinality 4)
{(1, 2); (3, 4); (5, 6)}Arity 2, cardinality 3
{1; 2; 1; 2}Arity 1, cardinality 2 (duplicates are eliminated)
{()}Unit relation (cardinality 1, arity 0)
{}Empty relation (cardinality 0, any arity)

Note that the arguments do not have to be constants, so the relation literal is not a relation constant. It can contain variables, and the tuples do also not have to be written explicitly with scalar variables.

For example, {(friend, 1); (family, 2); (partner, 3)} for binary friend, family and partner has arity 3 and unknown cardinality. It can explicitly be defined as

x y v: friend(x, y) and v = 1 or
       family(x, y) and v = 2 or
       partner(x, y) and v = 3

Composition (Navigation)

Expr := Expr "." Expr

The semantics of R.S is x..., y... : exists(t: R(x..., t) and S(t, y...)). Note that the arguments must have at least arity 1.

For specific cases, the semantics is simplified a little. In particular,x.R is R[x].

ExampleNormalized
parent.parentx, y: ∃(t: parent(x, t) and parent(t, y))
t.playersp: players(t, p)
t.players.agev: ∃(p: players(t, p) and age(p, v))
brand_name."Tesla"b: brand_name(b, "Tesla")

Restriction

Expr := Expr "<:" Expr
Expr := Expr ":>" Expr

Prefix join (or prefix restriction) R <: S is syntax for prefix_join[R, S] defined in the standard library.

Suffix join (or suffix restriction) R :> S is syntax for suffix_join[R, S] defined in the standard library.

Override

Expr := Expr "<++" Expr
Expr := Expr "++>" Expr

Left override operator R <++ S is syntax for left_override[R, S] defined in the standard library.

Right override operator R ++> S is syntax for right_override[R, S] defined in the standard library.

Arithmetic Operators

Expr := Expr "+" Expr
      | Expr "-" Expr
      | Expr "*" Expr
      | Expr "/" Expr
      | Expr "%" Expr
      | Expr "÷" Expr
      | Expr "^" Expr

Underscore

Expr := "_"

Underscore is a variable without a name. It desugars into the relation x: true, which is the infinite relation of all values. After further rewriting, this typically becomes an existential quantification of a variable with a unique name. Underscore is mostly useful if you do not care about certain arguments of relation and want to avoid introducing a named variable and quantifier.

For example:

// Unary relation of all orders with a line item:
o: line_item(o, _)

// There does not exist an order for customer c:
not(order_customer(_, c))

// Count the number of distinct values of a function or relation `f`:
count[f[_]]

Splat

Expr := Expr "..."

Existential Quantification

Expr := "exists" Expr
      | "∃" Expr

The argument of existential quantification can be an expression of any arity. This is because the expression includes the variables that are quantified over (they are not syntactically associated with the quantifier). For example:

def foo = 1;2;3
def output = exists(x, y : foo(x), foo(y), x > 2)

which evaluates to true.

The value of an existential quantification is false if the argument is the empty relation, and true otherwise.

Examples
exists(p: person(p) and age[p] > 99)
exists(centenarian)
∃(centenarian)
c: course(c) and not ∃(s: enrolled(c, s))
c: course(c) and not ∃(enrolled[c])
def grandparent(a, b) = exists(t: parent(a, t) and parent(t, b))

Universal Quantification

Expr := "forall" "(" Bindings ":" Expr ")"
      | "∀"      "(" Bindings ":" Expr ")"

Universal quantification is true if the range (the Expr expression) is true for the entire domain (the domain specified in the bindings). For the empty domain, universal quantification evaluates to true.

The range (right-hand side) of universal quantification must be a formula (that is, an arity-0 relation).

ExamplesDescription
forall(p in person: age[p] > 99)Are all persons older than 99?
∀(p in person: centenarian(p))Are all persons centenarians? Here we use a Unicode quantifier.
exists(c in course: forall(s in student: enrolled(c, s))Does there exist a course where all students are enrolled?
∃(c in course: ∀(s in student: enrolled(c, s))Same with Unicode quantifiers.
c: course(c) and ∀(y in year: ∀(s where class[s] = y: enrolled(c, s)))Courses that have, for every year, all students from that year enrolled.
c: course(c) and ∃(y in year: ∀(s where class[s] = y: enrolled(c, s)))Courses that have all students from one year enrolled.

Writing forall x,y,... where E1(x,y,...) : E2 is preferred to forall x,y,... : E1(x,y,...) implies E2.

Aggregation

There is no dedicated syntax for aggregations. Instead, there is one higher-order native called reduce that is used to define various aggregations in stdlib.rel.

A few examples of aggregations:

ExampleDescription
sum[salary]Sum all salaries
sum[p in employee: salary[p]]Sum employee salaries
Σ[salary[p] | p ∈ employee]Sum employee salaries
Σ[employee.salary]Wrong: duplicates eliminated
d in dept: Σ[salary[p] | p ∈ member[d]]Sum salaries by department
d in dept: Σ[d.member.salary]Wrong: duplicates eliminated
count[employee]Count number of employees
d in dept: count[d.member]Count number of employees by department
d, v: dept(d) and count[d.member](v)Aggregations can be used in formulas
max[age[p] | p ∈ person]Max age of all persons
max[age[p] for p in person]Max age of all persons
max[t.member.age] | t ∈ teamMax age by team
max[Σ[salary[p] for p in member[t]]] for t in teamMax sum of salaries by team
argmax[Σ[salary[p] for p in member[t]] for t in team]Team with max sum of salaries
argmin[Σ[salary[p] for p in member[t]] for t in team]Team with min sum of salaries
t: max[age[p] for p in member[t]]Max age by team (again)

Variable Scope

All variables have to be explicitly introduced to clarify their scope.

There is no implicit quantification as in Datalog.

All variables are introduced by Bindings (although in definitions we support a subset of Bindings, FormalParams).

A local scalar variable x shadows relation definitions, schema, and natives with name x.

For example, in the definition of foo, the local variable nation shadows the previous definition of nation.

def nation = ...
def foo = nation: ...

Variable scope is static, not dynamic. This means that the local variable nation only shadows variables that are lexically in its scope.

Rel Libraries

When a database is created, a number of libraries are installed automatically as part of the database, including:

Type Predicates

The Rel standard library includes a number of predicates that test for a given type — see for example the Number relation and the ones that follow.

Integrity Constraints

Rel supports expressive Integrity Constraints, written using the Rel language itself — and the IC Concept Guide for an introduction to ICs, and the Integrity Constraint Reference for technical IC reference.

Modules

Rel modules, which group together definitions and can be parameterized by relations, are relations as well. See the Rel Modules Guide.

EDBs: insert and delete

EDB relations can be persisted and updated with the transaction control relations insert and delete. See the guide on Working with EDB Relations.