Elements of a Relational Knowledge Graph
This concept guide explains what a relational knowledge graph (RKG) is. It also covers the semantic meaning of graph components, the steps to build an RKG, and how to run simple queries over it.
Introduction
A relational knowledge graph (RKG) is a knowledge graph represented in accordance with the relational data model. What does this mean? A relational knowledge graph represents each component of a knowledge graph (node, edge, and hyperedge) in the form of relations. In this way a relational knowledge graph can be seen as a relational database composed of multiple relations, which fulfill specific roles.
To make full use of the relational data model, the relational knowledge graph is stored in a fully normalized form and each data point contains semantic meaning. This data modeling strategy is called Graph Normal Form (GNF). See the Graph Normal Form guide for more details.
This guide focuses on the schema and the semantic meaning of each graph component. Understanding these foundations is essential to successfully building a relational knowledge graph that represents your domain of interest.
This guide also illustrates each graph component with object-role modeling (ORM) diagrams. For more details, see Schema Visualization.
Here is an overview of all graph components, their definitions, and how they appear in Rel:
Graph Component | Definition | Example |
---|---|---|
graph | A module of relations containing labeled nodes, edges, and hyperedges. | graph |
node | A value or identifier representing a concept in the graph. Each node can be referenced by a unary (arity-1) relation named by a node label. | NodeLabel(node) |
edge | A pair of nodes in the graph. Each edge can be referenced by a binary (arity-2) relation named by an edge label. | edge_label(source_node, target_node) |
hyperedge | A tuple of three or more nodes in the graph. Each hyperedge can be referenced by a ternary or larger (arity-3+) relation named by a hyperedge label. | hyperedge_label(node_1, node_2, node_3) |
Those familiar with labeled-property graph (LPG) diagrams will recognize two additional graph component names, “node property” and “edge property.” In Rel, these are just special cases of edges and hyperedges respectively.
LPG Component | Definition | Example |
---|---|---|
node property | A value node connected to an entity node via an edge. The edge label is the property name. In a Rel graph, property values are nodes in their own right. | Name(person, "Alice") |
edge property | A value connected to two or more nodes via a hyperedge. The hyperedge label is the property name. In a Rel graph, edge property values are nodes in their own right. | employs_since(company, person, 2020) |
Graph
A relational knowledge graph is a knowledge graph where each component (node, edge, and hyperedge) of the graph is described by a relation.
In Rel, a relational knowledge graph is defined in the scope of a single module. This module acts as a container that consists of relations that represent the different graph components. The names of the relations act as labels for the graph components, grouping together sets of nodes, edges, or hyperedges.
For example, you can declare a module CompanyGraph as follows:
module CompanyGraph
//Defining nodes
def Person = . . .
//Defining edges
def employs = . . .
//Defining hyperedges
def employs_since = . . .
end
Data defined in a module can then be queried either by prefixing the relation name with the module name and a colon (module_name:) or by using the with <module> use <relation> syntax.
You will find a detailed example of how to populate and query a graph at the end of this guide.
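For readers who think in general-purpose languages, the idea of a module as a container of labeled relations can be sketched in plain Python. This is only an analogy and not Rel: the dictionary layout, the placeholder node names, and the helper labels_by_arity are assumptions made for illustration.

```python
# Analogy only: a graph "module" as a mapping from labels to relations,
# where each relation is a set of tuples.
company_graph = {
    "Person": {("alice",), ("bob",)},               # node label, arity 1
    "employs": {("rai", "alice"), ("rai", "bob")},  # edge label, arity 2
    "employs_since": {("rai", "alice", 2020)},      # hyperedge label, arity 3
}


def labels_by_arity(graph):
    """Group relation labels by the arity of the tuples they contain."""
    grouped = {}
    for label, relation in graph.items():
        for row in relation:
            grouped.setdefault(len(row), set()).add(label)
    return grouped


arities = labels_by_arity(company_graph)
```

In this picture, the arity of a relation is what distinguishes node labels (1), edge labels (2), and hyperedge labels (3 or more).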
Nodes
Each node represents a “thing” or “noun” in your schema. A node can represent a physical thing. For example, in retail, a “product” you hold in your hand would be represented as a node. Alternatively, a node may represent an abstract but well-understood concept, such as a “customer” or “supplier”.
Some nodes do not translate to “things” at all. Nonetheless, treating these abstractions as nodes is useful for understanding how your data are connected.
Nodes in a relational knowledge graph are either entity nodes or value nodes. Only entity nodes are required to be referenced with labeled node relations. Value nodes are usually used to represent object properties.
As a rule of thumb, entity nodes are represented by graph-unique keys, and value nodes contain human-readable data. More information on how to represent data using entity and value types can be found in Things Not Strings in Rel in the Graph Normal Form guide.
Entity Nodes
The sort of people, objects, or concepts described above are represented as entity nodes.
Defining Entity Nodes
For example, you can declare an entity type Person by specifying the schema of the identifying attributes of a person who is an employee in a company:
entity type Person = String
This specifies that the person’s name (String) identifies them.
Every entity type declaration creates an entity constructor relation ^Person(id..., node) that maps the identifiers, id..., to a unique key, node, which will be used to reference the entity node.
The data types of the identifiers are specified in the entity type declaration (see code above).
The leading caret (^) in ^Person indicates that it is a constructor.
Populating Entity Nodes
The constructor ^Person can be used to populate the graph with concrete persons.
For example:
def Person = {
^Person["Alice Zhao"];
^Person["Bob Yablonsky"];
^Person["Ava Nguyen"]
}
The relation Person now contains the node keys of the three person nodes.
Here you can use point-free syntax, where the expression ^Person[id...] evaluates to the unique node key.
Typically, Rel graph node labels are capitalized, and multiword labels are joined directly with no spaces (UpperCamelCase).
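The role of an entity constructor, mapping identifying attributes to a unique key, can be illustrated with a small Python sketch. This guide does not describe how Rel derives its keys, so the SHA-256 hashing and the helper make_constructor are purely assumptions for illustration.

```python
import hashlib


def make_constructor(type_name):
    """Return a function that maps identifiers to a deterministic key.

    Hypothetical stand-in for an entity constructor such as ^Person.
    """
    def construct(*identifiers):
        # Include the type name so equal identifiers of different
        # entity types produce different keys.
        payload = repr((type_name, identifiers)).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:16]
    return construct


person_key = make_constructor("Person")
company_key = make_constructor("Company")

# Populate the node relation with the keys, not with the names themselves.
Person = {person_key(name) for name in ["Alice Zhao", "Bob Yablonsky", "Ava Nguyen"]}
```

The same identifiers always yield the same key, which is why the constructor can be reused later when defining edges.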
What Should an Entity Node Represent?
Deciding which aspects of your domain should be modeled as entities is ultimately a design decision and up to the developer.
Generally speaking, the following concepts are great examples of entity nodes:
- Objects with many properties.
- Abstract concepts that have nontrivial substructures.
- Entities in an Entity Relationship (ER) diagram.
- Primary keys in a relational table.
- Subjects in an RDF triplestore.
Value Nodes
A node that represents data — like a node property value — is represented as a value node.
Defining Value Nodes
For example, you can define a new value type Quarter that represents three-month periods:
value type Quarter = Date, Date
A value type is defined by specifying the schema.
Here, Quarter is defined by two Date values: a start date and an end date.
Similar to entity declarations, value type declarations create a constructor relation, ^Quarter(id..., value), whose name starts with a caret (^).
The identifying attributes id... map to the value, value, which captures all the information in id... in one value.
For more details, see Value Types.
Populating Value Nodes
Unlike entity nodes, value nodes do not need to be explicitly defined in the graph. It is sufficient to just link to them when creating the edges of the graph. In Edges, you will see how to ensure that nodes have the appropriate value type.
You can also model value nodes in the same way as entity nodes, by defining the relation Quarter:
def Quarter = {
^Quarter[2022-01-01, 2022-03-31];
^Quarter[2022-04-01, 2022-06-30]
}
Here again, the constructor relation ^Quarter can be used to populate the node.
The name of the value node doesn’t need to match the name of the underlying value type, but matching them is recommended unless there is a modeling reason not to. For instance:
def FiscalQuarter = {
(^Quarter[2022-01-01, 2022-03-31])
}
This would also be a valid value node declaration.
What Should a Value Node Represent?
As with entities, deciding what should be modeled as a value node is ultimately a design decision and up to the developer.
Generally speaking, the following concepts are great examples of value nodes:
- Properties of an object.
- Attributes of an entity in an ER diagram.
- Columns of a relational table that are not primary or foreign keys.
- Objects in an RDF triplestore.
- Concepts with very few properties.
Edges
If nodes are the “things” of your graph, edges describe “relationships” between things. They are the verbs that connect your nouns. Edges can connect entity nodes to each other or to value nodes, and can even connect value nodes to each other.
Edges Between Entity Nodes
You can add an edge between two entity nodes by adding a relation to the graph.
In the company example, you can define the edge employs between the entity nodes Company and Person, and populate it using the constructors ^Company and ^Person:
def employs = {
(^Company["RAI"], ^Person["Ava Nguyen"]);
(^Company["RAI"], ^Person["Alice Zhao"]);
(^Company["Microsoft"], ^Person["Alice Zhao"]);
(^Company["Microsoft"], ^Person["Bob Yablonsky"])
}
Typically, Rel graph edge labels are lowercase, and multiword labels are joined with underscores (snake_case).
Strings can also be used as edge names, for example, :"employed by".

Edge definitions are inherently directional, from source to target.
A binary edge may be made undirected by making the underlying relation symmetric.
This can be done by using transpose:
def spouse = transpose[spouse]
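What making a relation symmetric means can be shown with a short Python sketch over sets of pairs. This is an analogy rather than Rel, and the helper symmetric_closure and the placeholder node names are assumptions for illustration.

```python
def symmetric_closure(edges):
    """Add the reversed pair for every (source, target) pair."""
    return edges | {(target, source) for source, target in edges}


# One directed fact is enough; the closure supplies the reverse direction.
spouse = symmetric_closure({("alice", "bob")})
```

Applying the closure a second time changes nothing, which is the property that makes the edge effectively undirected.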
Node Properties
Adding an entity node property is equivalent to connecting an entity and a value node via an edge. The process is very similar to connecting two entity nodes. However, instead of the edge’s target node containing an entity node identifier, it contains the property value itself.
Here’s how to populate the property born_on for each Person node:
def born_on = {
(^Person["Alice Zhao"], 1982-03-15);
(^Person["Bob Yablonsky"], 1991-11-22);
(^Person["Ava Nguyen"], 1979-09-09)
}
For property values, you can safely omit a value type declaration as long as the property name uniquely identifies the data. Value nodes are leaf nodes in the graph.
If the node property has a value type, add the edge just as you would an edge between two entity nodes.
For example, you can declare a value type Name, backed by a String, for the node property as follows:
value type Name = String
Then you can use the ^Name constructor to add a Name node property to each Person entity node as follows:
def has_name = {
(^Person["Alice Zhao"], ^Name["Alice Grace Zhao"]);
(^Person["Bob Yablonsky"], ^Name["Bob Matthew Yablonsky"]);
(^Person["Ava Nguyen"], ^Name["Ava Marie Nguyen"])
}

The purple bar in the ORM diagram above indicates a uniqueness constraint.
Here the purple bar above the edge has_name means “each Person has at most one Name.”
The purple dot indicates a mandatory role constraint.
In this case it means “each Person has at least one Name.”
Combining the constraints means “each Person has exactly one Name.”
Edges Between Value Nodes
For most applications, source nodes will be entity nodes.
However, in some circumstances, it makes sense to link value nodes together.
For example, consider the link between a Date value type and a Year value type:

In Rel, you can define this edge as follows:
def year(date, year) {
^Date(year, _, _, date)
}
In the company example, you can similarly connect a Quarter value node to the month in which it starts. For example:
def start_month(quarter_node, month) {
(^Quarter[start_date, _, quarter_node], date_month[start_date, month])
from start_date
}
Hyperedges
A hyperedge connects three or more nodes — usually multiple entity nodes — with each other. However, any mix of entity or value nodes is possible.
Hyperedges are relations with composite keys that are made up of two or more individual keys. Hyperedge relations — as any relation — may or may not have a value column. If they do, each composite key points to only one value, which is located in the last column in the relation. If the hyperedge relation describes a boolean-like relationship — one that can be answered with true or false — only the composite key is stored in the relation and no value column exists.
One common use case for hyperedges is to store higher dimensional data like embeddings.
Edge Properties
Just as nodes can have properties, so can edges. Edge properties are represented in Rel using hyperedges.
In the sentence analogy, if the source node is the subject, and the target node is the direct object, an edge property is the indirect object. When speaking about your data in natural language, edge properties are often objects of prepositions like “since,” “on,” “for,” “by,” etc.
In Rel, an edge property is a value node connected to two (or more) entity nodes via a ternary edge.
For example, “RelationalAI has employed Alice since 2020.” This can be understood as:
- A binary edge employs connecting the Company entity node for “RelationalAI” and the Person entity node for “Alice.”
- The ternary edge (hyperedge) employs_since connecting the “RelationalAI” and “Alice” entity nodes to a value node, a Year with the value 2020. The value node is the edge property.
Here is an ORM diagram for this relationship. Note that the ORM diagram can be read across the set of role boxes, just like a sentence: “Company employs Person since Date.”

Defining and Populating Edge Properties
Hyperedges are defined similarly to binary edges.
For the example above, the edge property employs_since can be defined as:
def employs_since = {
(^Company["RAI"], ^Person["Ava Nguyen"], 2019-12-01);
(^Company["RAI"], ^Person["Alice Zhao"], 2020-05-10);
(^Company["Microsoft"], ^Person["Alice Zhao"], 2021-08-24);
(^Company["Microsoft"], ^Person["Bob Yablonsky"], 2018-02-28)
}
What Should an Edge Property Represent?
Edge properties can represent a range of information about the graph. They often reflect a condition or degree under which the edge is valid.
The following are good examples of edge properties:
- Reliability (as a weight).
- Information source.
- Time or location.
- Time- or location-dependent datum, as a quaternary (arity-4) hyperedge.
Optional Edge Properties and Incomplete Data
Note that the hyperedge example employs_since only connects entities for which there is a value for the edge property.
If there is no property value for a particular edge, the hyperedge relation contains no tuple for it.
In this example, an employs edge will still connect the nodes.
In the vast majority of cases, this is the graph connectivity you want to represent.
However, if it is important to the schema to represent that the edge property is missing, you can use Rel’s Missing type.
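The relationship between an edge and its optional property can be sketched in Python as a set difference between the edge relation and the key columns of the hyperedge relation. This is an analogy, not Rel: the shortened names, the integer years, and the helper edges_without_property are assumptions for illustration.

```python
employs = {("RAI", "Ava"), ("RAI", "Alice"), ("Microsoft", "Bob")}

# Only two of the three edges carry a "since" property.
employs_since = {("RAI", "Ava", 2019), ("RAI", "Alice", 2020)}


def edges_without_property(edges, hyperedges):
    """Return the edges that have no matching hyperedge key."""
    # The key of a hyperedge is every column except the last one.
    keyed = {row[:-1] for row in hyperedges}
    return edges - keyed


missing = edges_without_property(employs, employs_since)
```

The edge between Microsoft and Bob still exists in employs; it is simply absent from employs_since.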
Hyperedges of Any Size
Edge properties are just one subset of hyperedges. Rel is agnostic to the dimension of a hyperedge or what combination of entity and value nodes are contained in the hyperedge.
Consider the following example fact: “Alice met Bob at the Strange Loop conference in 2021.”
This fact can be readily modeled as an arity-4 hyperedge met_at_place_in_year:
def met_at_place_in_year(person1, person2, conference, year) {
person1 = ^Person["Alice Zhao"],
person2 = ^Person["Bob Yablonsky"],
conference = ^Conference["Strange Loop"],
year = 2021
}
Reification
While Rel supports hypergraphs of any dimension, graph traversal algorithms are optimized for binary graphs. Reification is a process by which hyperedges can be transformed into binary relationships. Reifying a hyperedge is a two-step process:
- Define a new reified entity using the key of the hyperedge relation (every node but the last, target node):
  entity type ReifiedEntity = String, Date
  def ReifiedEntity(e) {
      exists(s, d :
          e = ^ReifiedEntity[s, d]
          and a_hyperedge(s, d, _)
      )
  }
- Connect a new labeled edge reified_edge from the new reified entity ReifiedEntity to the target node t:
  def reified_edge(source_node, target_node) {
      exists(s, d :
          source_node = ^ReifiedEntity[s, d],
          target_node = a_hyperedge[s, d]
      )
  }
New labeled edges can also be defined to connect to each node in the hyperedge key. The example “Company employs Person since Date” represented below can be reified following the two steps.

- Define and populate a new node (Employment) that captures the relationship “Company employs Person”:
  entity type Employment = String, String
  def Employment(employment) {
      exists(company, person :
          employment = ^Employment[company, person]
          and employs_since(company, person, _)
      )
  }
- Connect the reified Employment node to each node (Company, Person, and Date) of the employs_since hyperedge. This requires defining three new labeled edges: has_employer, has_employee, and employment_since. For example:
  def has_employer(employment, company) {
      exists(person :
          employment = ^Employment[company, person]
          and employs_since(company, person, _)
      )
  }
  def has_employee(employment, person) {
      exists(company :
          employment = ^Employment[company, person]
          and employs_since(company, person, _)
      )
  }
  def employment_since(employment, dt) {
      exists(company, person :
          employment = ^Employment[company, person]
          and employs_since(company, person, dt)
      )
  }
Here is the ORM diagram representation of the reification:

The new edges has_employer, has_employee, and employment_since are all binary.
The entity node label Employment may not be obvious in the source data, but the construction of hyperedges can point to natural places where new concepts, and new node labels, can be generated.
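The reification steps can also be traced in a small Python sketch that turns a ternary relation into three binary ones. It is an analogy rather than Rel: the tuple used as the reified node key, the date strings, and the helper reify_employment are assumptions for illustration.

```python
def reify_employment(employs_since):
    """Split a ternary relation into three binary edges around a new node."""
    has_employer, has_employee, employment_since = set(), set(), set()
    for company, person, start in employs_since:
        # The reified node is keyed by the hyperedge key
        # (every column but the last).
        employment = ("Employment", company, person)
        has_employer.add((employment, company))
        has_employee.add((employment, person))
        employment_since.add((employment, start))
    return has_employer, has_employee, employment_since


hyperedges = {
    ("RAI", "Alice Zhao", "2020-05-10"),
    ("Microsoft", "Alice Zhao", "2021-08-24"),
}
has_employer, has_employee, employment_since = reify_employment(hyperedges)
```

Each hyperedge yields exactly one Employment node, and every resulting edge is a pair, so binary graph algorithms can traverse the result.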
Embeddings
Vectors, matrices, and general tensors can be represented as relations with arity 2, 3, or $n + 1$, respectively, where $n$ is the rank of the tensor.
The schema of the underlying relation that represents the embedding is (key_graph..., key_tensor..., value).
The graph-related key key_graph... identifies the node or edge associated with the embedding.
The tensor indices are captured by key_tensor....
Both together make up the composite key of the relation.
Note that both keys may be composite keys themselves, which is indicated by the trailing ellipsis (...).
It’s good practice to store embeddings in a separate module, rather than as hyperedges. A single graph may have many sets of embeddings associated with it.
The following example initializes a 10-dimensional vector embedding for the Person entities with zeros:
def mygraph_embedding:Person(person, dimension, value) {
person = mygraph:Person,
dimension = range[1, 10, 1],
value = 0.0
}
Here, the graph-related key is the entity ID person and the tensor index is dimension.
Similarly, a 10x10 matrix embedding can be defined for an edge with uniformly distributed random values between 0 and 1 using the Threefry pseudorandom number generator random_threefry_float64:
def mygraph_embedding:employed_by(person, company, index1, index2, value) {
mygraph:employed_by(person, company),
index1 = range[1, 10, 1],
index2 = range[1, 10, 1],
value = random_threefry_float64[index1, index2]
}
Here, both the graph-related key (person, company) and the matrix indices (index1, index2) are composite keys.
Modeling this way provides great flexibility. Any kind of embedding, from a single value, vector, or matrix to an abstract tensor, can be realized. Because GNF closely resembles sparse matrix representations, embeddings in Rel are stored in the sparse coordinate (COO) format and don’t have to be dense.
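The (key_graph..., key_tensor..., value) layout can be illustrated in Python as a set of coordinate tuples that keeps only nonzero entries. This is a sketch of the general COO idea and not of Rel's storage engine; the helpers to_coo and lookup, the 1-based indices, and the placeholder key are assumptions for illustration.

```python
def to_coo(key, dense_matrix):
    """Convert a dense matrix into (key, row, col, value) tuples.

    Zero entries are dropped, which is what makes the relation sparse.
    """
    return {
        (key, i, j, value)
        for i, row in enumerate(dense_matrix, start=1)
        for j, value in enumerate(row, start=1)
        if value != 0.0
    }


def lookup(coo, key, i, j):
    """Read one entry; absent tuples are interpreted as zero."""
    for k, row, col, value in coo:
        if (k, row, col) == (key, i, j):
            return value
    return 0.0


embedding = to_coo("alice", [[0.0, 1.5], [0.0, 0.0]])
```

A 2x2 matrix with a single nonzero entry is stored as a single tuple rather than four.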
Building and Querying a Graph
Building a Graph
With an understanding of the elements of a relational knowledge graph, you can now build a simple graph and query it. Here is an ORM diagram of the knowledge graph you are about to build:

Defining the Schema
The first step is to define the schema. You can define the types of the value and entity nodes as follows:
// model
// entity nodes
entity type Company = String
entity type Person = String
// value nodes
value type Name = String
You can define the schema of the edges as well:
// model
module CompanyGraph
// entity nodes
bound Company = Entity
bound Person = Entity
// node attribute / edge: has_name
bound has_name = Company, Name
bound has_name = Person, Name
// edge: born_on
bound born_on = Person, Date
// edge: employs
bound employs = Company, Person
// edge attribute / hyperedge: employs_since
bound employs_since = Company, Person, Date
end
For more details on the bound syntax, see Bound Declarations in the Rel Reference manual.
Populating the Graph
Now that you have defined the schema, you can insert some data into your database.
It is best practice to organize data with modules and group information together. When building a knowledge graph, it makes sense to insert all the data into a module where the module represents the knowledge graph.
Updating the data within a module requires two steps:
- Defining the data within a temporary module (company_graph).
- Storing the data in a base relation (CompanyGraph), which persists in the database.
// write query
// defining the data within a temporary company_graph module
module company_graph
// entity node: Company
def Company = {
^Company["RAI"];
^Company["Microsoft"]
}
// entity node: Person
def Person = {
^Person["Alice Zhao"];
^Person["Bob Yablonsky"];
^Person["Ava Nguyen"]
}
// edge: has_name
def has_name = {
(^Company["RAI"], ^Name["RelationalAI"]);
(^Company["Microsoft"], ^Name["Microsoft Corporation"]);
(^Person["Alice Zhao"], ^Name["Alice Grace Zhao"]);
(^Person["Bob Yablonsky"], ^Name["Bob Matthew Yablonsky"]);
(^Person["Ava Nguyen"], ^Name["Ava Marie Nguyen"])
}
// edge: born_on
def born_on = {
(^Person["Alice Zhao"], 1982-03-15);
(^Person["Bob Yablonsky"], 1991-11-22);
(^Person["Ava Nguyen"], 1979-09-09)
}
// edge: employs
def employs = {
(^Company["RAI"], ^Person["Ava Nguyen"]);
(^Company["RAI"], ^Person["Alice Zhao"]);
(^Company["Microsoft"], ^Person["Alice Zhao"]);
(^Company["Microsoft"], ^Person["Bob Yablonsky"])
}
// hyperedge: employs_since
def employs_since = {
(^Company["RAI"], ^Person["Ava Nguyen"], 2019-12-01);
(^Company["RAI"], ^Person["Alice Zhao"], 2020-05-10);
(^Company["Microsoft"], ^Person["Alice Zhao"], 2021-08-24);
(^Company["Microsoft"], ^Person["Bob Yablonsky"], 2018-02-28)
}
end
// storing the data in the `CompanyGraph` base relation
def insert:CompanyGraph = company_graph
You can find a detailed explanation of the code above in My First Knowledge Graph.
Querying a Graph
Querying a relational knowledge graph allows you to find specific entities. In the following examples you will see a query based on attributes, an aggregation query, and a query with conditions.
What Company Does Alice Zhao Work For?
// read query
def output(company_name) = {
exists(company :
CompanyGraph:employs(company, ^Person["Alice Zhao"])
and CompanyGraph:has_name(company, company_name)
)
}
The variables in the parentheses of the def output statement indicate which elements will be displayed in the output.
In this case it is one value: the name of the company fulfilling the statement on the right-hand side of the equal sign.
The right-hand side of the definition can be verbalized as “there exists a company such that the company employs Alice Zhao and the company has a name.”
The conjunct and CompanyGraph:has_name(company, company_name) connects the entity hash with the entity name, making the output more readable.
Notice that employs and has_name are both preceded by CompanyGraph:.
This indicates that the information is retrieved from the CompanyGraph module.
How Many Employees Work at Each Company?
// read query
def output(company_name, head_count) {
exists(company :
head_count = count[CompanyGraph:employs[company]]
and company_name = CompanyGraph:has_name[company]
)
}
This query uses the count relation.
The output will display two values: the name of the company and its headcount.
The right-hand side of the definition can be verbalized as “there exists a company such that the company has a name and the headcount is the number of employs edges for that company.”
Note that company_name = CompanyGraph:has_name[company] is equivalent to CompanyGraph:has_name(company, company_name) in the previous query.
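The logic of this aggregation, grouping employs edges by company and then joining with has_name, can be mirrored in Python using the data from this guide. It is an analogy rather than Rel: plain strings stand in for entity keys, and the dictionary used for has_name is an assumption for illustration.

```python
from collections import Counter

employs = {
    ("RAI", "Ava Nguyen"),
    ("RAI", "Alice Zhao"),
    ("Microsoft", "Alice Zhao"),
    ("Microsoft", "Bob Yablonsky"),
}
has_name = {"RAI": "RelationalAI", "Microsoft": "Microsoft Corporation"}

# Count the outgoing employs edges per company.
head_counts = Counter(company for company, _person in employs)

# Join with has_name so the output shows readable names, not keys.
output = {(has_name[company], count) for company, count in head_counts.items()}
```

Both companies have two employs edges in the sample data, so each appears in the output with a headcount of 2.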
Who Was Hired at RelationalAI After 2020?
// read query
with CompanyGraph use Company, Person, employs_since, has_name
def RAI_hires_after_2020(person_name in Name, dt in Date) {
exists(company in Company, person in Person :
employs_since(company, person, dt)
and has_name(company, ^Name["RelationalAI"])
and dt > 2019-12-31
and has_name(person, person_name)
)
}
def output = RAI_hires_after_2020
The statement def RAI_hires_after_2020(person_name in Name, dt in Date) indicates that the relation RAI_hires_after_2020 contains pairs of values, person_name and dt, where person_name is a Name value node and dt is a Date value node.
The right-hand side of the definition can be verbalized as “there exists a company and a person such that the company employs the person since a date, the name of the company is RelationalAI, the start date is after 2019-12-31, and the person has a name.”
Starting the query with a with <module> use <relation> statement is another way of retrieving the information stored in the example module.
In this case, the module is CompanyGraph and the relations are the ones required for the query definition: Company, Person, employs_since, and has_name.
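The conditions of this query can be traced in Python against the sample data from this guide. It is an analogy rather than Rel: plain strings stand in for entity keys, and the helper hires_after and the single has_name dictionary are assumptions for illustration.

```python
from datetime import date

employs_since = {
    ("RAI", "Ava Nguyen", date(2019, 12, 1)),
    ("RAI", "Alice Zhao", date(2020, 5, 10)),
    ("Microsoft", "Alice Zhao", date(2021, 8, 24)),
    ("Microsoft", "Bob Yablonsky", date(2018, 2, 28)),
}
has_name = {
    "RAI": "RelationalAI",
    "Microsoft": "Microsoft Corporation",
    "Alice Zhao": "Alice Grace Zhao",
    "Bob Yablonsky": "Bob Matthew Yablonsky",
    "Ava Nguyen": "Ava Marie Nguyen",
}


def hires_after(company_name, cutoff):
    """Names and start dates of people hired at a company after a cutoff."""
    return {
        (has_name[person], start)
        for company, person, start in employs_since
        if has_name[company] == company_name and start > cutoff
    }


output = hires_after("RelationalAI", date(2019, 12, 31))
```

Only the employment that started on 2020-05-10 satisfies both conditions; the one that started on 2019-12-01 falls before the cutoff.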
Summary
A relational knowledge graph represents each component of a knowledge graph (nodes, edges, and hyperedges) in the form of relations. Those relations are defined within modules, and the data populating them are stored in base relations, creating a graph.
See Also
To practice some simple Rel that will allow you to create your first knowledge graph, see My First Knowledge Graph. For a more complex example, see Modeling and Reasoning: The Lehigh University Benchmark.