Integrity Constraints

This article is part of the Rel reference manual. See the Integrity Constraints concept guide for an extensive introduction to ICs.

Introduction

Integrity constraints define requirements that are imposed within a transaction. An integrity constraint (IC) can cover a single fact, a single relation, or multiple relations. If a defined IC is violated, the running transaction is aborted. If you are using the Console, the appropriate information is presented in a readable format.

For example, consider an IC on a graph stating that there cannot be nodes with self-edges. You can declare the IC in such a manner that a violation will cause the system to report, in a structured form, the nodes that violate this constraint.

Semantics

IC expressions are evaluated to either true or false depending on the state of the database. They are re-evaluated when the state of the database changes. If a constraint is violated (false), then the transaction is aborted and the database returns information about the constraint violation. A read-only transaction is not aborted, as it cannot modify the database.

Grammar and Syntax

Declaration :=
    Annotation* "ic" Id? FormalParamsBracket* FormalParamsParen? "=" Expr
    Annotation* "ic" Id? FormalParamsBracket* FormalParamsParen? "{" Expr "}"

Commentary

An integrity constraint (IC) is declared with the keyword ic.
An IC is a positive statement. It states what is expected to be true.
An IC has an optional name (the identifier). If the declaration does not contain an identifier, a random name is generated internally. The system uses the name to identify the constraint in its report of a constraint violation.
An IC can have optional formal parameters. The system uses these parameters to demonstrate counterexamples in its report of a constraint violation.
The constraint itself can be an arbitrary expression.

Examples

ic = forall(x, y: edge(x, y) implies x != y)

Logic: There is no edge from a node to itself.
An anonymous IC without parameters. Only an internally-generated name will be included in the report about a violation.

ic(x, y) = edge(x, y) implies x != y

Logic: There is no edge from a node to itself.
An anonymous IC with parameters. An internally-generated name and the values of the parameters will be included in the report about a violation.
Notice that x and y are introduced in the head, so they are universally quantified by default. There is no need for a quantifier in the body of the constraint.

ic no_self_edge(x, y) { edge(x, y) implies x != y }

Logic: There is no edge from a node to itself.
A named IC with parameters. The name and the values of the parameters will be included in the report about a violation.

ic total_age(x) {
    person(x) implies age(x, _)
}

Logic: Every person has an age.

ic total_age {
    total(person, age)
}

Logic: Every person has an age.
This version uses a higher-order abstraction from the Standard Library, total. Unlike the previous version, this will not display detailed information about the items in person that do not occur in the first column of age.

Functional Dependency

The following is an anonymous integrity constraint enforcing a functional dependency on age, making sure that nothing can have two different ages:

ic() = forall(p, a, b: age(p, a) and age(p, b) implies a = b)

This IC could also be written using function from the Standard Library:

ic { function[age] }

An equivalent named IC with parameters that will allow you to identify the violation:

ic fun_age_ic(p, a, b) {
    age(p, a) and age(p, b) implies a = b
}

Type Constraints

You can constrain the types of relations using type relations such as Int, String, etc. For example:

ic age_int_ic(x) { age(_, x) implies Int(x) }

You can also use other unary relations in this manner. For example, this IC checks that every student is also a person:

ic() {
    forall(x: student(x) implies person(x))
}

The following named IC also checks that every student is a person, but avoids the quantifier:

ic student_type(x) {
    student(x) implies person(x)
}

Point-free version using the higher-order infix operator ⊆ (subset):

ic() {
    student ⊆ person
    // Can also be written as `subset(student, person)`.
}

This IC checks that the parent relation holds only for values that are in person:

ic(x, y) {
    parent(x, y) implies person(x) and person(y)
}

Point-free version using the higher-order infix operator ⊆ (subset):

ic() {
    parent ⊆ (person, person)
    // Can also be written as `subset(parent, (person, person))`.
}

Deriving Constraint Violations

Integrity constraints are implemented as relations. The body of the definition of a relation that implements an IC is the negation of the constraint. If the IC is not violated, the corresponding relation will remain empty (false). But if the IC is violated, the corresponding relation will no longer be empty. If the definition of the IC contains parameters, the corresponding relation will be populated by counterexamples: instantiations of the parameters for which the IC does not hold.

The following example clarifies this. Here is an IC which checks that a graph contains no self-edges:

     ic no_self_edge(x) = node(x) implies not edge(x, x)

The IC is converted to a relation whose actual name may be different from the one shown here:

     def no_self_edge_def(x) = not( node(x) implies not edge(x, x) )

The compiler then applies a series of logical transformations to simplify the definition:

   // `expr1` implies `expr2` => not `expr1` or `expr2`
   => def no_self_edge_def(x) = not( (not node(x)) or (not edge(x, x)) )
   // `not( expr1 or expr2)` => `(not expr1)` and `(not expr2)`
   => def no_self_edge_def(x) = ( not(not node(x)) ) and  ( not(not edge(x, x)) )
   // `not(not expr)` => `expr`
   => def no_self_edge_def(x) = node(x) and edge(x, x)

As you can see, no_self_edge_def will contain those instantiations of x for which both node(x) and edge(x, x) are true. If the constraint is not violated, no_self_edge_def will be empty.

Representing IC Violations

One of the main decisions in the design of Rel was to separate the database logic from transaction control. To facilitate such separation, the effects of evaluating integrity constraints is captured in a relation named abort that evaluates to true if an IC is violated. It is up to the transaction control semantics to handle the violation of an IC.

You can access the abort relation:

// read query
 
def node = 1; 2
def edge = (1, 2); (1, 1)
ic no_self_edge(x) = node(x) implies not edge(x, x)
def output:A = edge
def output:B = abort

If you are using the Console, you don’t have to worry about abort. If an IC is violated, you will see this as a problem in the Console. For this particular example you will see:

A red box containing the words IC Violation: no-self-edge and a display of the number 1

When the relation edge was displayed in the example above, it contained the offending self-edge. This is because the query was run in a read-only transaction, so there was no danger of corrupting the database. If the transaction could write to the database, it would be aborted as soon as an IC was violated.

You can verify this as follows. Install node, edge and the integrity constraint in the database:

// model
 
def node = 1; 2; 3
def edge = (1, 2); (2, 3); (3, 1)
ic no_self_edge(x) = node(x) implies not edge(x, x)

You can temporarily extend edge in a query transaction. One of the new edges violates the IC, but you see it in the output:

// read query
 
def edge = (1, 1); (2, 1)
def output:A = edge

In the next transaction you won’t see those new edges:

// read query
 
def output = edge

Now try to permanently add new edges, one of which violates no_self_edge:

// model
 
def edge = (1, 1); (2, 1)

A query will show that the previous transaction didn’t install anything:

// read query
 
def output:A = edge

If you now try to add edges permanently without violating the IC, the transaction will be successful and the edges will be added:

// model
 
def edge = (1, 3); (2, 1)

// read query
 
def output = edge

Next: Other Topics

Was this doc helpful?