This concept guide introduces entities in Rel.


The goal of this concept guide is to introduce entities and show how they are defined and used throughout Rel.



An entity is something that exists, independently of its representation in the database, and can be uniquely identified. It can be a concrete object, such as a car or a person, or more abstract, such as an organization. A database often describes sets of entities, their attributes, and the relationships between them, forming a knowledge graph, where entities are the nodes of the graph.

Imagine a database where we have professionals and their countries of origin. A common approach is to assign integer IDs to each person, country, and occupation, and then express queries using these IDs. However, we might inadvertently join on person and country IDs. We might use IDs that don’t exist, or make typos when querying a particular occupation name.

Entities alleviate these problems. They give us a way to describe distinct objects uniquely and unambiguously, keeping distinct concepts separate when needed. Entities let us decouple the details of how we identify concepts from how we use them.


Constructing Entities

As a simple example of constructing entities, consider a database where we want to refer to several countries. To create unique entities for each one, we can write the following entity definition:

entity Country
country_from_name = {
"United States";

Here, Country refers to the entity type we are defining. On the right-hand side, we have an (anonymous) relation that contains the names (or more generally, the attributes) that uniquely identify the different Country instances. The constructor, country_from_name, will be a relation that maps the names to the respective entities, which will have a unique ID. In our case, this ID will be an Integer or a Hash – see section on “desugaring”, below, to see how entities are constructed from the identifying attributes.

Throughout this concept guide, we will use the term “entity” to refer both to the unique ID in the database, and the real-world entity that it refers to.

For clarity, we use the convention X_from_Y for the constructor, where X refers to the entity type and Y to the identifying attributes. Also by convention, we start the entity name (e.g., Country) with an uppercase letter, following the naming convention of other type predicates like String or Number (see stdlib). This naming convention is not mandatory.

In this example, the key is a single string value. As we will see in Section Multiple Identifying Attributes, entities can also be identified by multiple values, which is useful for more complex entities or recursive entity creation (see Section Entities and Recursion).

With the above definition in place, Country will be a unary relation containing the corresponding five entities – in this case, unique IDs:


Relation: Country


The constructor, country_from_name, will be a binary relation that maps each string to the corresponding entity:


Relation: country_from_name

"United States"    RelationalAITypes.HashValue(0x74688db32dd4b1f49d712b49acedba9e)

Applying the constructor to a string not in the list will not yield any result. Thus, country_from_name["United States"] will refer to an entity, but country_from_name["USA"] will be empty.
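
As a quick check of this lookup behavior, the sketch below applies the constructor to both strings. The relation names known_country and unknown_country are introduced here only for illustration and are not part of the guide's model.

```rel
// "United States" is one of the identifying names, so this holds one entity ID.
def known_country = country_from_name["United States"]

// "USA" was never passed to the constructor, so this relation is empty.
def unknown_country = country_from_name["USA"]

// count of an empty relation is itself empty, so we default it to 0 with <++.
def output = count[known_country], (count[unknown_country] <++ 0)
```

The left override (<++) is only needed because we want an explicit 0 for the missing name. Without it, the whole tuple would be absent from output.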

Behind the scenes, Country is actually derived from the constructor country_from_name (see Section on “desugaring” for details).

Entity Attributes

We now discuss how to assign attributes to the newly constructed entity instances. We prefer to do this in a fully normalized way, with separate definitions for each attribute, keyed by the attribute name.

As a convention, and to avoid overloading the entity type relation Country, we define an all-lowercase country(e, :attr, value) relation that maps the attribute :attr of entity e to its value value.

For example, let’s assign a :name attribute to Country entities – which is particularly easy in this case, since we can find it in the constructor:

def country(e, :name, value) = country_from_name(value, e)

At first glance, it might seem redundant to assign the identifying attribute in this way, since we can already access it via the constructor country_from_name. However, this has the advantage that all attributes can be accessed and queried in the same way.

Let’s define some more (non-identifying) attributes:

def country[e, :population] = 364134, country(e, :name, "Iceland")
def country[e, :population] = 328239523, country(e, :name, "United States")

def country(e, :name_alias, name) =
{"USA"; "United States of America"}(name)
and country(e, :name, "United States")

This assigns population information to Iceland and the United States, and two name aliases for the United States. Note that we don’t need missing values for attributes we don’t want to assign, or don’t know.

Note also how we used the identifying attribute from our previous definition, country(e, :name, value), to select the entity e. We could have used the constructor country_from_name(value, e) instead.
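
To illustrate how attributes stored this way can be queried, here is a minimal sketch that selects countries by population. The relation populous_country and the threshold of one million are assumptions made for this example.

```rel
// Hypothetical helper: countries with more than one million inhabitants.
def populous_country(e) = country(e, :population, pop) and pop > 1000000 from pop

// List them by name, using the same attribute-access pattern as above.
def output = country[e, :name] from e where populous_country(e)
```

With the populations assigned above, only the United States qualifies: Iceland's population of 364134 is below the threshold, and countries without a :population attribute simply do not match.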

It is time to look at all the information we know about our Country entities.


Relation: country

RelationalAITypes.HashValue(0x74688db32dd4b1f49d712b49acedba9e)    :name          "United States"
RelationalAITypes.HashValue(0x74688db32dd4b1f49d712b49acedba9e)    :name_alias    "United States of America"

As we can see, all country entities have a :name attribute; and we have extra :population and :name_alias attributes for Iceland and the United States.

Displaying Entities

It is useful to define a show relation that displays a more human-readable string than the internal entity ID. The identifying attribute(s) are usually a good choice, and, by definition, the entity will be uniquely identifiable. For our Country entity, this is the name:

def show[entity] = country[entity, :name]

Let’s list all countries where we have population information, using the human-readable show relation to identify the entities:

def output = show[e], country[e, :population] from e

Relation: output

"United States"    328239523

Multiple Identifying Attributes (n-ary constructors)

Often, more than one value is required to uniquely identify an entity. An obvious example is a person identified by first and last name, which, in a normalized model, are ideally stored as two separate strings rather than within a single string.

As an example, let’s define an Actor entity where an actor is uniquely identified by its first and last names:

def actor_names = {
    ("Sharon", "Stone");
    ("Tim", "Curry");
    ("Robert", "DeNiro");
    ("Michael", "Jordan")
}

entity Actor
actor_from_name = actor_names

This time, the relation that contains the identifying attributes is a named relation, actor_names. This relation has arity 2 (that is, two columns) and was defined outside the entity construction statement. The constructor actor_from_name is now a relation of arity 3, mapping each name pair to an entity ID that uniquely references the actor.


Relation: actor_from_name


It is easy to see how an entity can now be created from an arbitrarily large number of attributes:

entity E
E_from_values(x...) = identifying_attributes_relation(x...)
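
For the Actor entities defined above, a lookup through the ternary constructor might look like the following sketch. The relation names tim_curry and michaels are hypothetical.

```rel
// Both identifying values are needed to pin down a single Actor entity.
def tim_curry = actor_from_name["Tim", "Curry"]

// Supplying only the first name fixes the leading column, so this holds
// (last_name, entity) pairs for every actor whose first name is "Michael".
def michaels = actor_from_name["Michael"]
```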

Advanced Entity Construction

Multiple Constructors

It might be necessary to construct entities of the same type in different ways, if the set of identifying attributes is not uniform across all of the entities. To do so, we can define multiple constructors for the same entity type. For example, we can add an entity for “Cher” where she is uniquely identified by her single artist name.

entity Actor
actor_from_name = {"Cher"}

The entity type Actor now has two constructors. The binary actor_from_name defines one Actor entity. The ternary actor_from_name defines four Actor entities.

It is also possible to overload the same constructor by arity and/or type. Let’s demonstrate that by defining a few more entity types:

entity Musician
musician_from_name = {

entity Musician
musician_from_name = {
    ("Paul", "McCartney");
    ("John", "Lennon")
}

entity Athlete
athlete_from_name = {
    ("Michael", "Jordan")
}

entity Professor
professor_from_name = {
    ("Michael", "Jordan")
}

Here, we defined four constructors for three entity types.

  • The entity type Musician has two constructors of the same name, musician_from_name, but of different arities. One constructor has arity 2 and the other has arity 3.
  • We defined a Professor entity and an Athlete entity with the same identifying attributes (i.e., name pair). These two entities are, however, distinct. This is possible because the constructor name is also taken into account when generating the entity ID.
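
One way to make the second point checkable is an integrity constraint stating that no entity is both an Athlete and a Professor. This is a sketch, and the constraint name is our own.

```rel
// Hypothetical check: the two ("Michael", "Jordan") entities are distinct,
// so no entity ID should belong to both Athlete and Professor.
ic athlete_professor_disjoint(e) {
    Athlete(e) implies not Professor(e)
}
```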

Constructors can also be overloaded by data type, and will generate distinct entities for each type. For example:

entity Num num_constructor = 2
entity Num num_constructor = "2"
entity Num num_constructor = 2.0

defines three different Num entities.


Relation: num_constructor


Entities of Multiple Types

Cher is mostly known for her music, so we may want to classify her not just as Actor but also as Musician. We can do so by referring to the entity ID we have already created and adding her to the Musician relation.

def Musician(e) = actor_from_name("Cher", e)

We could have also created a new entity for Cher as a Musician by adding her identifying attributes to the musician_from_name relation. However, this would have resulted in two separate entities referring to the same underlying person. This would defeat the purpose of having entities in the first place, and should be avoided as much as possible. (When joining two knowledge bases, or importing large amounts of raw data, this situation may be unavoidable; resolving this ambiguity is known as entity resolution.)
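
To confirm that the single Cher entity is now classified under both types, we could add a constraint along these lines. This is a minimal sketch, and the constraint name is an assumption.

```rel
// The entity built by the binary Actor constructor for "Cher"
// should be a member of both Actor and Musician.
ic cher_is_actor_and_musician(e) {
    actor_from_name("Cher", e) implies Actor(e) and Musician(e)
}
```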

Entity Hierarchies

Hierarchies are a key semantic concept, ubiquitous in modeling real-world data (think OWL). Examples include any organizational structure, such as a corporation (e.g., CEO, CTO, …, intern), a government (e.g., President, Vice-President, …), the classification of life (life, domain, kingdom, …), or a family tree (children, parents, grandparents, …).

To model hierarchies between entities, we can, for example, define the concept of a Professional as a supertype that includes all professions we have defined earlier:

def Professional = Actor; Athlete; Musician; Professor

This definition does not preclude creating additional constructors for Professional directly. For example, we can add Roger Penrose directly as a Professional:

entity Professional
professional_from_name = ("Roger", "Penrose")

Since Roger Penrose is a Professor in Mathematics, we can add the entity we just created to the Professor relation:

def Professor(x) = professional_from_name("Roger", "Penrose", x)

In hindsight, we notice it would have been easier to add him first as Professor because the definition def Professional = Actor; Athlete; Musician; Professor would have automatically added him as Professional.

Entity Attributes (Continued)

How do we model attributes in the presence of a hierarchy? Do we need to assign attribute values at each hierarchical level again? You might guess the answer – we don’t.

However, we have a choice to make.

  • Bottom-up: Define the entity attributes at the lowest level of the hierarchy (Actor, Athlete, …) and propagate them up the hierarchy (to Professional).

  • Top-down: Define the attributes at the top level of the hierarchy (Professional) and propagate them down (to Actor, Athlete, …).

  • Mixed: Define the attributes on the level where the entity IDs are defined and propagate the information up and down. In our case, most people were defined on the bottom level, but Roger Penrose was defined on the higher Professional level.

We demonstrate here the bottom-up and top-down approaches.

Bottom-Up


In the bottom-up approach, we assign the attributes on the lowest-level, and we define relations actor, athlete, …, to collect the entity attributes of each. We follow the naming convention from Section Entity Attributes, where the name of the relation containing the attributes is all-lowercase.

def actor(entity, :first_name, x) = actor_from_name(x, y..., entity) from y...
def actor(entity, :last_name, x) = actor_from_name(_, x, entity)

def athlete(entity, :first_name, x) = athlete_from_name(x, y..., entity) from y...
def athlete(entity, :last_name, x) = athlete_from_name(_, x, entity)

def musician(entity, :first_name, x) = musician_from_name(x, y..., entity) from y...
def musician(entity, :last_name, x) = musician_from_name(_, x, entity)
def musician(entity in Musician, a, x) = actor(entity, a, x)

def professor(entity, :first_name, x) = professor_from_name(x, _, entity)
def professor(entity in Professor, :first_name, x) = professional_from_name(x, _, entity)
def professor(entity, :last_name, x) = professor_from_name(_, x, entity)
def professor(entity in Professor, :last_name, x) = professional_from_name(_, x, entity)

Note that some of our :first_name definitions use the y... varargs notation, to allow for 0 or more arguments in its place. This way, the definition uses the first argument regardless of whether the x_from_name constructor has one, two, or more arguments. (The last operator from the stdlib can be used similarly.)

Note also that since we know that some actors are musicians, our last rule for musician refers to actor – but only for those cases where we have a Musician entity.

We created two attributes first_name and last_name for the entity types Actor, Athlete, Musician, and Professor.

For Actor and Athlete, we construct the attributes directly from their constructors. Because the constructors are unique for each entity type, we don’t need to check that the entity ID entity is of the correct type.

The situation is a bit more complicated for the Musician and Professor entity types because they contain entities (more precisely, entity instances) that were initially defined for another entity type: “Cher” and “Roger Penrose” were initially defined as Actor and Professional and later also as Musician and Professor, respectively. Therefore, in the code above we use additional checks on the left-hand side: entity in Musician and entity in Professor ensure that the entity IDs also refer to the intended entity type.

Here, you see one main drawback of this modeling approach. If specific instances have multiple assigned entity types, then special treatment is needed, which complicates the logic.

Inspecting the musician and professor relations shows that only the attributes that we know are assigned:

Relation: musician


Relation: professor


Notice how “Cher” and “Roger Penrose” are included in these relations, even though we defined them initially as entities of a different type.

To propagate these definitions up, we simply repeat the same union as we did for the entity type relations (e.g., Actor), but this time for the relations containing the entity attributes (e.g., actor).

def professional = actor; athlete; musician; professor

Let’s look at all attributes associated with the first 3 entity instances in professional.

def output = professional[e] for e in last[top[3, Professional]]

Relation: output


As we can see, professional does include names that were initially assigned to actor, athlete, musician, or professor. This shows that we successfully propagated these attributes to the professional relation without having to redefine them. Furthermore, we can access the attributes exactly the same way in professional as we do in actor, et al.


Top-Down

We can also model the attributes in a top-down fashion where we first define professional and then define actor, athlete, musician, and professor in a second step.

def professional(e, :first_name, first) = {
actor_from_name; athlete_from_name; musician_from_name;
professor_from_name; professional_from_name
}(first, x..., e) from x...

def professional(e, :last_name, last) = {
actor_from_name; athlete_from_name; musician_from_name;
professor_from_name; professional_from_name
}(_, last, e)

The union over all the entity constructors can be expressed more compactly in the top-down approach than in the bottom-up approach above. However, this is only possible because all entity constructors have a very similar structure. If the individual constructors vary too much, their union must be expressed more verbosely and may require delicate handling.

Now, in the second step, we can propagate the entity attributes down to the individual (sub)-entity level.

def actor[x] = professional[x], Actor(x)
def athlete[x] = professional[x], Athlete(x)
def musician[x] = professional[x], Musician(x)
def professor[x] = professional[x], Professor(x)

Comparing the results for musician


Relation: musician


to the results from the previous section we find perfect agreement, demonstrating that the two modeling approaches are equivalent.

Displaying Entities (Continued)

Now we have defined several types of entities, and some entities even belong to multiple types. Fortunately, this does not complicate defining a show function for all of them.

Of course, we could define a show function for each entity type individually. However, it is much easier to define show for all of them at once. The supertype Professional comes in handy as it includes all people we have defined so far; we only need to define one show relation for all entities of type Professional:

def show[p in Professional] =
professional[p, :first_name],
(concat[" ",
professional[p, :last_name]
] <++ "")

Note that we use the left_override (<++) from stdlib to include professionals with no last name.

Let’s look at the entity instances from Section Bottom-Up. We use show to return human-readable strings instead of the internal entity ID:

def output = show[e], professional[e] from e in last[top[3, Professional]]

Relation: output

"Michael Jordan"    :first_name    "Michael"
"Michael Jordan"    :last_name     "Jordan"
"Roger Penrose"     :first_name    "Roger"
"Roger Penrose"     :last_name     "Penrose"

def output = show[e], actor[e] from e in last[top[3, Actor]]

Relation: output

"Michael Jordan"    :first_name    "Michael"
"Michael Jordan"    :last_name     "Jordan"
"Robert DeNiro"     :first_name    "Robert"
"Robert DeNiro"     :last_name     "DeNiro"
"Tim Curry"         :first_name    "Tim"
"Tim Curry"         :last_name     "Curry"

Connecting Entities

So far, we have defined a number of entity types and even an entity hierarchy, which can be used as the starting point for a Knowledge Graph, but we have not yet created any edges to connect different entities.

In Rel, an edge is just a binary relation, relating pairs of entities with each other. (We can also create hyper-edges, that connect three or more entities together.)

Edge information often comes from an external source like a CSV file. It can also come from domain knowledge. For instance, a father edge could be derived from a parent edge and a condition that the parent needs to be a Man.

In this concept guide, we will limit ourselves to a binary edge relation, has_nationality, that connects the Professional entities to the Country entities defined in Section Constructing Entities.

To avoid loading external data, we define a relation nationality_csv that mimics the relational structure that results from doing load_csv on a file with two rows:

def nationality_csv = {
    (:first_name, 1, "John");
    (:last_name, 1, "Lennon");
    (:country, 1, "England");
    (:first_name, 2, "Robert");
    (:last_name, 2, "DeNiro");
    (:country, 2, "United States")
}

(See the CSV Import how-to guide for more on the relational structure of imported data.)

The next step is to match the attributes to their corresponding entities and collect the entity pairs in the relation has_nationality.

def has_nationality(p, c) =
professional[p, :first_name] = nationality_csv[:first_name, pos]
and professional[p, :last_name] = nationality_csv[:last_name, pos]
and country[c, :name] = nationality_csv[:country, pos]
from pos

Relation: has_nationality


We have two entries in has_nationality as expected. We can use our show function to confirm that we have connected the right entities:

def output = show[p], show[c]
from p, c where has_nationality(p, c)

Relation: output

"John Lennon"      "England"
"Robert DeNiro"    "United States"

Entities and Integrity Constraints

We can use integrity constraints to check the consistency of our entity definitions. For example, to make sure that has_nationality connects the right types of entities, we can add this integrity constraint:

ic has_nationality_types(x,y) {
    has_nationality(x,y) implies Professional(x) and Country(y)
}

To make sure that first_name is defined for all professionals, we can add:

ic all_have_first_name(e) {
    Professional(e) implies professional(e, :first_name, _)
}

We can check that actor and musician attributes are only defined for entities of the corresponding type:

ic actor_type(e) {(actor(e, x...) from x...) implies Actor(e)}
ic musician_type(e) {(musician(e, x...) from x...) implies Musician(e)}
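
In the same spirit, a constraint can encode a modeling assumption about an edge relation. The sketch below assumes, purely for illustration, that the model allows each professional at most one nationality. The constraint name is our own, and a model that permits dual citizenship would not include it.

```rel
// Hypothetical uniqueness check on the has_nationality edge:
// two nationality edges from the same professional must point to the same country.
ic at_most_one_nationality(p, c1, c2) {
    has_nationality(p, c1) and has_nationality(p, c2) implies c1 = c2
}
```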

We can also write an integrity constraint that verifies that all Actor entities, for instance, are also of type Professional. Mathematically, this means that Actor should be a subset (⊆) of Professional.

ic Actor_subset_Professional {
    Actor ⊆ Professional
}

Entities and Recursion

Entities can be constructed from other entities, and created recursively. For example, we might want to represent paths through a directed acyclic graph, starting from a given set of source nodes, treating each path as a distinct entity. A path can then be nil (base case), or the result of extending a given path by adding one more edge (inductive step). The entity definition is then:

// define the graph
def edge = {(1,2); (2,3); (3,5); (1,4); (4,5)}

// starting nodes
def source = {1; 3}

// construct the Path entities
entity Path nil() = true
entity Path cons(path, x) = nil(path) and source(x)
entity Path cons(path, y) = cons(_, x, path) and edge(x, y) from x

The path constructors define three different kinds of paths: First, nil constructs a single entity for the empty path. Next, we define paths that are constructed from the nil path and one of the source nodes. Finally, we can recursively construct a new path from an existing path and a node y, if the existing path ends at x and there is an edge from x to y.

Note that this example does not lead to an infinite recursion because there are no cycles in the graph.

We can now define more complex attributes of path entities, such as their number of nodes and edges:

def attribute(:numnodes, p, 0) = nil(p)
def attribute[:numnodes, p] = attribute[:numnodes, prev] + 1
from prev where cons(prev, _, p)

def attribute(:length, p, 0) = cons(nilpath, _, p) and nil(nilpath)
from nilpath

def attribute[:length, p] = attribute[:length, prev] + 1
from prev where cons(prev, _, p)

Note that in this formulation, the nil path has a :numnodes attribute of 0, but the :length attribute is not defined for it.

We can assign string names to path entities with a recursive definition as well:

def show[p in nil] = "nil"

def show[p in Path] = string[end_node]
from end_node where cons(nil, end_node, p)

def show[p in Path] = concat[show[prefix], concat["-", string[end_node]]]
from end_node, prefix where cons(prefix, end_node, p) and not nil(prefix)

We can now use show to see this representation of the Path entities:

def sample_paths = show[p], p from p in Path

Relation: sample_paths
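
The readable names can also be combined with the path attributes defined earlier. The following sketch lists each path together with its node count and length.

```rel
// Each row: readable path name, number of nodes, number of edges.
// The nil path has no :length attribute, so it does not appear in this result.
def output = show[p], attribute[:numnodes, p], attribute[:length, p] from p in Path
```

Because the three expressions are combined for the same p, a path is listed only if all of them are defined for it, which is why the nil path is left out.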


Behind the Scenes

@auto_number vs. @hash Entities

By default, entities are constructed from hashes (in the database, entities are just unique internal IDs). This can be made explicit with the @hash annotation, as in @hash entity E c = r. The @auto_number annotation, as in @auto_number entity E c = r, instead assigns integer IDs to the created entities. If the entity set is very large, @auto_number might have better performance.

In both cases, you should make no assumptions about the entities themselves, other than their uniqueness, and they should only be accessed via the Entity and constructor relations. Their specific values, ordering, etc. remain unspecified.
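
As a sketch of how the two annotations might be used side by side, consider the declarations below. The entity types City and Airport and their constructors are hypothetical, and we assume that @auto_number precedes the entity keyword in the same way as @hash.

```rel
// Hypothetical entity types, used only to contrast the two annotations.
@hash entity City city_from_name = {"Reykjavik"; "London"}
@auto_number entity Airport airport_from_code = {"KEF"; "LHR"}

// Access entities through the constructor, never through literal ID values.
def output = city_from_name["London"]
```

Either way, downstream definitions should rely only on the constructor and the entity type relation, so switching annotations later does not require changing them.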

Desugaring Entity Definitions

It might be useful to see the Rel definitions that correspond to entity definitions – we are really defining a unary relation (the set that contains the entities), and a relation from keys to elements of that set.

Using point-free notation,

entity E c = r

is (roughly) equivalent to

def c = hash[r]
def E = last[c]

(the actual hashes will be different since the entity declaration also hashes the constructor and entity names). Note that hash adds the hash as the $(n+1)^\text{th}$ element when given a relation of arity $n$.


Similarly,

@auto_number entity E c = r

is (roughly) equivalent to

def c = auto_number[r]
def E = last[c]

(auto_number will give different results each time it is called).

Syntax Note

As shown in Desugaring Entity Definitions, the entity constructor is a relation as well. This has the benefit that we can use the same rewrites as for normal relations, to emphasize different aspects of the query definition or simply to write the definition more compactly.

For example, we could write a professional_from_first_last_occupation constructor that defines three different professionals with the same name, but different occupations, as follows:

entity Professional professional_from_first_last_occupation["Michael", "Jordan"] =
{"Athlete"; "Actor"; "Professor"}

or also as

entity Professional professional_from_first_last_occupation("Michael", "Jordan", x) =
{"Athlete"; "Actor"; "Professor"}(x)

However, if we want to account for the basketball player Michael Jordan acting in the movie Space Jam (1996) which makes him both an Actor and an Athlete, we might want to add a middle initial, to distinguish him from the (different) actor Michael B. Jordan.
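
One possible way to express this is a separate constructor that takes the middle initial as an additional identifying value. The constructor name professional_from_first_middle_last is an assumption made for this sketch.

```rel
// Hypothetical constructor with a middle initial as a third identifying value.
// It creates an entity distinct from any ("Michael", "Jordan") entity.
entity Professional professional_from_first_middle_last = {
    ("Michael", "B.", "Jordan")
}
```

The basketball player could then remain a single entity that is added to both Actor and Athlete, following the approach from Section Entities of Multiple Types.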