Integrity Constraints
This article is part of the Rel reference manual. See the Integrity Constraints concept guide for an extensive introduction to ICs.
Introduction
Integrity constraints define requirements that are imposed within a transaction. An integrity constraint (IC) can cover a single fact, a single relation, or multiple relations. If a defined IC is violated, the running transaction is aborted. If you are using the Console, the appropriate information is presented in a readable format.
For example, consider an IC on a graph stating that there cannot be nodes with self-edges. You can declare the IC in such a manner that a violation will cause the system to report, in a structured form, the nodes that violate this constraint.
Semantics
IC expressions are evaluated to either true
or false
depending on the state of the database.
They are re-evaluated when the state of the database changes.
If a constraint is violated (false
), then the transaction is aborted and the database returns information about the constraint violation.
A read-only transaction is not aborted, as it cannot modify the database.
Grammar and Syntax
Declaration :=
Annotation* "ic" Id? FormalParamsBracket* FormalParamsParen? "=" Expr
Annotation* "ic" Id? FormalParamsBracket* FormalParamsParen? "{" Expr "}"
Commentary
- An integrity constraint (IC) is declared with the keyword
ic
. - An IC is a positive statement. It states what is expected to be true.
- An IC has an optional name (the identifier). If the declaration does not contain an identifier, a random name is generated internally. The system uses the name to identify the constraint in its report of a constraint violation.
- An IC can have optional formal parameters. The system uses these parameters to demonstrate counterexamples in its report of a constraint violation.
- The constraint itself can be an arbitrary expression.
Examples
ic = forall(x, y: edge(x, y) implies x != y)
- Logic: There is no edge from a node to itself.
- An anonymous IC without parameters. Only an internally-generated name will be included in the report about a violation.
ic(x, y) = edge(x, y) implies x != y
- Logic: There is no edge from a node to itself.
- An anonymous IC with parameters. An internally-generated name and the values of the parameters will be included in the report about a violation.
- Notice that
x
andy
are introduced in the head, so they are universally quantified by default. There is no need for a quantifier in the body of the constraint.
ic no_self_edge(x, y) { edge(x, y) implies x != y }
- Logic: There is no edge from a node to itself.
- A named IC with parameters. The name and the values of the parameters will be included in the report about a violation.
ic total_age(x) {
person(x) implies age(x, _)
}
- Logic: Every person has an age.
ic total_age {
total(person, age)
}
- Logic: Every person has an age.
- This version uses a higher-order abstraction from the Standard Library,
total
. Unlike the previous version, this will not display detailed information about the items inperson
that do not occur in the first column ofage
.
Functional Dependency
The following is an anonymous integrity constraint enforcing a functional dependency on age
, making sure that nothing can have two different ages:
ic() = forall(p, a, b: age(p, a) and age(p, b) implies a = b)
This IC could also be written using function
from the Standard Library:
ic { function[age] }
An equivalent named IC with parameters that will allow you to identify the violation:
ic fun_age_ic(p, a, b) {
age(p, a) and age(p, b) implies a = b
}
Type Constraints
You can constrain the types of relations using type relations such as Int
, String
, etc.
For example:
ic age_int_ic(x) { age(_, x) implies Int(x) }
You can also use other unary relations in this manner. For example, this IC checks that every student is also a person:
ic() {
forall(x: student(x) implies person(x))
}
The following named IC also checks that every student is a person, but avoids the quantifier:
ic student_type(x) {
student(x) implies person(x)
}
Point-free version using the higher-order infix operator ⊆
(subset
):
ic() {
student ⊆ person
// Can also be written as `subset(student, person)`.
}
This IC checks that the parent
relation holds only for values that are in person
:
ic(x, y) {
parent(x, y) implies person(x) and person(y)
}
Point-free version using the higher-order infix operator ⊆
(subset
):
ic() {
parent ⊆ (person, person)
// Can also be written as `subset(parent, (person, person))`.
}
Deriving Constraint Violations
Integrity constraints are implemented as relations.
The body of the definition of a relation that implements an IC is the negation of the constraint.
If the IC is not violated, the corresponding relation will remain empty (false
).
But if the IC is violated, the corresponding relation will no longer be empty.
If the definition of the IC contains parameters, the corresponding relation will be populated by counterexamples: instantiations of the parameters for which the IC does not hold.
The following example clarifies this. Here is an IC which checks that a graph contains no self-edges:
ic no_self_edge(x) = node(x) implies not edge(x, x)
The IC is converted to a relation whose actual name may be different from the one shown here:
def no_self_edge_def(x) = not( node(x) implies not edge(x, x) )
The compiler then applies a series of logical transformations to simplify the definition:
// `expr1` implies `expr2` => not `expr1` or `expr2`
=> def no_self_edge_def(x) = not( (not node(x)) or (not edge(x, x)) )
// `not( expr1 or expr2)` => `(not expr1)` and `(not expr2)`
=> def no_self_edge_def(x) = ( not(not node(x)) ) and ( not(not edge(x, x)) )
// `not(not expr)` => `expr`
=> def no_self_edge_def(x) = node(x) and edge(x, x)
As you can see, no_self_edge_def
will contain those instantiations of x
for which both node(x)
and edge(x, x)
are true.
If the constraint is not violated, no_self_edge_def
will be empty.
Representing IC Violations
One of the main decisions in the design of Rel was to separate the database logic from transaction control.
To facilitate such separation, the effects of evaluating integrity constraints is captured
in a relation named abort
that evaluates to true
if an IC is violated.
It is up to the transaction control semantics to handle the violation of an IC.
You can access the abort
relation:
// read query
def node = 1; 2
def edge = (1, 2); (1, 1)
ic no_self_edge(x) = node(x) implies not edge(x, x)
def output:A = edge
def output:B = abort
If you are using the Console, you don’t have to worry about abort
.
If an IC is violated, you will see this as a problem in the Console.
For this particular example you will see:
When the relation edge
was displayed in the example above, it contained the offending self-edge.
This is because the query was run in a read-only transaction, so there was no danger of corrupting the database.
If the transaction could write to the database, it would be aborted as soon as an IC was violated.
You can verify this as follows.
Install node
, edge
and the integrity constraint in the database:
// model
def node = 1; 2; 3
def edge = (1, 2); (2, 3); (3, 1)
ic no_self_edge(x) = node(x) implies not edge(x, x)
You can temporarily extend edge
in a query transaction.
One of the new edges violates the IC, but you see it in the output:
// read query
def edge = (1, 1); (2, 1)
def output:A = edge
In the next transaction you won’t see those new edges:
// read query
def output = edge
Now try to permanently add new edges, one of which violates no_self_edge
:
// model
def edge = (1, 1); (2, 1)
A query will show that the previous transaction didn’t install anything:
// read query
def output:A = edge
If you now try to add edges permanently without violating the IC, the transaction will be successful and the edges will be added:
// model
def edge = (1, 3); (2, 1)
// read query
def output = edge
Next: Other Topics