# Rel Integrity Constraints Reference

Reference guide for Rel integrity constraints.

This is a reference document. See the Integrity Constraints concept guide for an introduction to ICs.

## Introduction

Integrity constraints define a requirement that is imposed within a transaction.
If a defined IC is violated, the running transaction is `aborted`

and the appropriate information is logged in the `:output`

relation.

An integrity constraint can cover a single fact, a single relation, or multiple relations.
When any IC is violated,
the transaction is aborted and an informative and structured output is presented to the user of the database.
For example, consider a constraint on a graph stating that there cannot be nodes with self-edges.
A violation of this IC can report, in a structured form, the nodes that violate this constraint.
The `output`

relation will include counterexamples to the constraint; in this case, the nodes with self-edges.

## Semantics

IC expressions are evaluated to either `true`

or `false`

depending on the state of the database.
If the constraint is violated (`false`

), then the transaction is aborted and the database returns information about the constraint violation to the user. In the current design, this always causes the transaction to abort.

## Grammar and Syntax

```
Declaration :=
Attribute* "ic" Id? FormalParamsBracket* FormalParamsParen? "=" Expr
Attribute* "ic" Id? FormalParamsBracket* FormalParamsParen? "{" Expr "}"
```

### Commentary

- Integrity constraints are declarations, and RelationalAI uses the keyword
`ic`

to declare them. - ICs are positive statements: They state what needs to be true.
- ICs have optional names (the identifier). If an identifier is not declared, a random name is generated internally. The system uses the name for reporting constraint violations to the application.
- ICs can have optional formal parameters. The system uses these parameters for reporting constraint violations; in particular, demonstrating the actual counterexamples.
- An arbitrary expression is used for the constraint evaluation.

The following examples give a high-level overview of how to declare integrity constraints:

Example | Description |
---|---|

`ic no_self_edge(x, y) {edge(x, y) implies x != y}` | `no_self_edge` is a constraint defined on the `edge` relation. |

`ic = forall(x, y: edge(x, y) implies x != y)` | You can omit the name of the constraint. |

`ic(a, b) = edge(a, b) implies a != b` | You can declare free variables. |

`ic age_constraint = forall(p, a, b where age(p, a) and age(p, b): a=b)` | The same person cannot have different ages. |

`ic all_age = forall(x: person(x) implies age(x, _))` | Every person has an age. |

## Further IC Examples

```
ic() {
forall(x : person(x) implies age(x, _))
}
```

- Logic: Every person needs to have an age.
- Anonymous constraint without parameters.
- Violations do not contain information that can be used by an application.

`ic total_age() = forall(x: person(x) implies age(x, _))`

- This is a named IC without parameters.
- Violations do not contain information that can be used by an application.

`ic total_age(x) = person(x) implies age(x, _)`

- Named IC, making
`x`

a parameter, which is used in identifying the actual violations. Note that`x`

is already introduced, so it is no longer quantified in the body of the constraint.

```
ic total_age(x) {
person(x) implies age(x, _)
}
```

- Named IC.
- Convenience of using curly braces to indicate the body of the constraint. This mimics Alloy syntax.

```
ic {
total(person, age)
}
```

- Anonymous IC that uses a higher-order abstraction from the Standard Library,
`total`

.

```
ic(x) {
person(x) implies age(x, _))
}
```

- Anonymous IC.
- Contains information about the violation (formal parameter
`x`

).

### Functional Dependency

The following is an anonymous IC enforcing a functional dependency on `age`

, making sure that nothing can
have two different ages:

`ic() = forall(p, a, b: age(p, a) and age(p, b) implies a = b)`

This IC could also be written using stdlib’s `function`

:

`ic { function[age] }`

An equivalent named IC, including parameters that will explain the violation, is:

```
ic fun_age_ic(p, a, b) {
age(p, a) and age(p, b) implies a = b
}
```

### Type Constraints

You can constrain the types of relation using type predicates such as `Int`

, `String`

, etc. For example:

`ic age_int_ic(x) { age(_, x) implies Int(x) }`

You can also use other unary relations as types. For example, this IC checks that every student is also a person:

```
ic() {
forall(x: student(x) implies person(x))
}
```

The following named IC also checks that every student is a person, but avoids the quantifier:

```
ic student_type(x) {
student(x) implies person(x)
}
```

Point-free version using the higher-order infix operator `⊆`

(`subset`

):

```
ic() {
student ⊆ person
// Can also be written as `subset(student, person)`.
}
```

This IC checks that the `parent`

relation holds only for values of `person`

type:

```
ic(x, y) {
parent(x, y) implies person(x) and person(y)
}
```

Point-free version using the higher-order infix operator `⊆`

(`subset`

):

```
ic() {
parent ⊆ (person, person)
// Can also be written as `subset(parent, (person, person))`.
}
```

## Deriving Constraint Violations

Constraints could be implemented by confirming that the IC formula evaluates to `true`

,
but because constraints are almost always universally quantified,
it is easier to derive `true`

if the constraint is violated,
i.e., it is sufficient and simpler to check the existence of counterexamples.
For this, `rAI`

internally evaluates the *negation* of the formula into a relation definition,
which changes the universal quantification that is common in constraints to an existential quantification.

### Step-by-Step Derivation Example

To demonstrate this process, here is a step-by-step example of the internal transformations applied:

- Define an IC that enforces no self-edges:

` ic no_self_edge() = forall(x: node(x) implies not edge(x, x))`

- Convert the IC to a relation defined as the negation of the expression body:

```
// ic rel() E => def rel() = not(E)
=> def no_self_edge_def() = not forall(x: node(x) implies not edge(x, x))
```

- Apply logical transformation rules on the negation of universal quantification and double negations:

```
// not(forall(domain, expr)) => exists(domain and not(expr))
=> def no_self_edge_def() = exists(x: node(x) and not(not edge(x, x)))
// not(not F) => F
=> def no_self_edge_def() = exists(x: node(x) and edge(x, x))
```

### Constraint Violation Derivation Examples

This guide now gives more examples of the derived negated definitions.
For demonstration purposes, this guide names the negated relation `violates`

, but it does not necessarily have this name internally:

```
ic() {forall(x: student(x) implies person(x))}
=> def violates() = exists(x: student(x) and not person(x))
```

```
ic() {forall(x, y: parent(x, y) implies ( person(x) and person(y)))}
=> def violates() = exists(x, y: parent(x, y) and not(person(x) and person(y)))
```

```
ic() {forall(x, y: age(x,y) implies ( person(x) and Int(y)))}
=> def violates() = exists(x, y: age(x,y) and not(person(x) and Int(y)))
```

```
ic() {forall(x: node(x) implies not edge(x, x))}
=> def violates() = exists(x: node(x) and edge(x, x))
```

```
ic() {forall(x: person(x) implies age(x, _))}
=> def violates() = exists(x: person(x) and not age(x, _))
```

```
ic() {forall(x, y: parent(x, y) implies ( person(x) and person(y) and x!=y and not parent(y, x)))}
=> def violates() = exists(parent(x, y) and (not person(x) or
not person(y) or
x = y or
parent(y, x)))
```

```
ic() {forall(p, a, b: (age(p, a) and age(p, b)) implies a=b)}
=> def violates() = exists(p, a, b: age(p, a) and age(p, b) and a !=b)
```

```
ic() {exists(x: president(x))}
=> def violates() = not exists(x: president(x))
```

### Handling of Formal Parameters

In definitions, formal parameters are moved into the body. For example:

```
def has_self_edge(x) = edge(x, x)
=> def has_self_edge = x: edge(x, x)
```

You can follow the same approach for constraints, as the following example shows:

```
ic no_self_edge(x) node(x) implies not edge(x, x)
=> // Move formal parameters to the body expression:
ic no_self_edge = x: node(x) implies not edge(x,x)
=> // Rewrite:
def no_self_edge_def = x: node(x) and edge(x, x)
```

## Representing IC Violations

One of RelationalAI’s main design decisions is to separate the database logic from transaction control. Therefore, this guide introduces two relations during the course of evaluating integrity constraints:

- The
`:abort`

relation, which evaluates to`true`

if a constraint is violated. Currently, when an`abort`

is triggered an exception is thrown to the end user. In the future, it will be up to the transaction control semantics to handle the abort. - The
`:output`

relation, which represents the IC violations relationally. The`:output`

relation includes information about the violating constraints.

Currently, RelationalAI implements a mechanism that expresses the `abort`

and `output`

relations
as disjunctions of constraint-violation definitions, as shown below:

```
def edge = {(2, 2); (5, 78); (3, 1); (2, 5); (6, 1)}
def node = {1; 2; 3; 5; 78; 6; -1}
ic positive_node(x) = node(x) implies x > 0
ic no_self_edge(a, b) = edge(a, b) implies a != b
```

Internally these constraints would be translated to the following relation definitions:

```
def positive_node_def(x) = not(node(x) implies x > 0)
def no_self_edge_def(x, y) = not(edge(a, b) implies a != b)
def abort = exists(positive_node_def) or exists(no_self_edge_def)
def output(icname) =
(icname = :positive_node_def and exists(positive_node_def)) or
(icname = :no_self_edge_def and exists(no_self_edge_def))
```