Derive facts with logic
Writing logic in PyRel is how you derive new facts from existing ones in order to capture domain knowledge and build up a rich semantic model. This guide shows you how to write logic with the PyRel DSL, using the building blocks of fragments, expressions, variables, and chains.
- PyRel is installed and importable in Python. See Set Up Your Environment for instructions.
- You have a Model instance and a declared semantic schema. See Create a Model Instance, Declare Concepts, and Declare Relationships and Properties.
- You have defined the base facts you want to derive from. See Define Base Facts.
Understand PyRel logic constructs
Derived facts are built from a small set of DSL building blocks. Understanding these building blocks and how they work together is key to writing effective logic in PyRel.
There are four main constructs you will use to build logic:
- Fragments represent a unit of logic that can be built up and then materialized to produce results. They are built by chaining calls to where, select, and define.
- Variables are placeholders for entities and values that can be reused in multiple expressions.
- Expressions are a type of variable that represents the result of performing operations like comparisons or relationship traversals.
- Chains represent paths through your model’s relationships and properties and can be used to traverse your model and extract values.
How fragments build logic
Methods like Model.where, Model.select, and Model.define return a Fragment.
A fragment is a composable, lazy unit of logic.
You build it up by chaining more calls to where, select, and define to add more conditions, outputs, and definitions.
Each call returns a new fragment.
The three main fragment methods have different semantics:
- Fragment.where adds filter conditions to the fragment. It does not specify outputs or definitions, only conditions.
- Fragment.select specifies what you want to return when you materialize the fragment. It does not add conditions or definitions.
- Fragment.define specifies what you want to define when you materialize the fragment. It does not add conditions or output values.
Building a fragment does not execute any logic or produce results until you materialize it. There are two main ways to materialize a fragment:
- Fragment.to_df compiles and executes the fragment and returns the results as a DataFrame.
- Fragment.into compiles and executes the fragment and writes the results into a Snowflake table.
For example, the following snippet builds a fragment that selects customer names for customers with pending orders, and then materializes it to a DataFrame:
    from relationalai.semantics import Model, String

    m = Model("MyModel")
    Customer = m.Concept("Customer")
    Order = m.Concept("Order")

    class Status(m.Enum):
        PENDING = "pending"
        SHIPPED = "shipped"
        CANCELLED = "cancelled"
        RETURNED = "returned"

    # Declare the properties and relationships the query uses.
    Customer.name = m.Property(f"{Customer} has name {String}")
    Customer.orders = m.Relationship(f"{Customer} places {Order}")
    Order.status = m.Property(f"{Order} has status {Status}")

    # Build a fragment that selects names of customers with pending orders.
    q = m.where(Customer.orders.status == Status.PENDING).select(Customer.name)

    # Materialize the fragment to a DataFrame.
    print(q.to_df())

- q is a Fragment instance that represents the logic of filtering customers with pending orders and selecting their names.
- m.where() creates a new fragment with the specified conditions, namely that the customer’s orders have a status of “pending”.
- .select() adds a selection to the fragment, specifying that we want to retrieve the names of the customers that meet the conditions.
- q.to_df() materializes the fragment by compiling and executing the logic, and returns the results as a DataFrame.
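To make the "lazy and composable" idea concrete, here is a minimal plain-Python sketch. ToyFragment, its to_rows method, and the sample customers are hypothetical names invented for illustration. This is not how PyRel implements Fragment. It only mimics two properties described above: each call returns a new object, and nothing is evaluated until you materialize.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyFragment:
    """Toy stand-in for a lazy, composable query (not PyRel's Fragment)."""

    rows: tuple
    conditions: tuple = ()
    outputs: tuple = ()

    def where(self, *conditions):
        # Return a new fragment; the original is left untouched.
        return ToyFragment(self.rows, self.conditions + conditions, self.outputs)

    def select(self, *columns):
        # Record the output columns without evaluating anything.
        return ToyFragment(self.rows, self.conditions, self.outputs + columns)

    def to_rows(self):
        # Evaluation happens only here, when the fragment is materialized.
        matches = [r for r in self.rows if all(c(r) for c in self.conditions)]
        return [tuple(r[col] for col in self.outputs) for r in matches]


# Hypothetical sample data.
customers = (
    {"name": "Ada", "status": "pending"},
    {"name": "Grace", "status": "shipped"},
)

base = ToyFragment(customers)
pending = base.where(lambda r: r["status"] == "pending")
q = pending.select("name")

result = q.to_rows()
print(result)
```

Because where and select return new objects, base and pending stay reusable after q is built.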
What a variable is
A Variable is a placeholder for an entity or value that can be reused in multiple expressions.
You never have to explicitly create a Variable instance, since most DSL objects are variables, and most DSL methods return variables.
For example, the following fragment uses Concept objects as variables to represent entities that are members of that concept:
    m.where(Customer.orders.status == Status.PENDING).select(Customer.name)

- Customer is a variable that represents customer entities.
- Customer.orders is a variable that represents orders placed by a customer.
- Customer.orders.status is a variable that represents the status of a customer’s order.
What an expression is
An Expression is a kind of variable that represents the result of performing operations like comparisons or relationship traversals.
Expressions can be used in Model.where to filter results, in Model.define to define new facts, and in Model.select to specify output values.
For instance, the following fragment uses expressions to filter customers with pending orders:
    m.where(Customer.orders.status == Status.PENDING).select(Customer.name)

- Customer.orders.status == Status.PENDING is a boolean expression that filters Customer entities based on the status of their orders.
- Customer.orders.status is an expression that traverses the relationship from Customer to Order and accesses the status property.
- Customer.orders and Customer.name are also expressions that represent the orders placed by a customer and the name of a customer, respectively.
- Customer is not an expression; it is a variable that represents the concept of customers.
Write conditional definitions with Model.where and Model.define
Derived facts are often conditional: for example, “an order needs review if its promised ship date is before a cutoff”. Conditional definitions are a two-step process:
- Write a condition with Model.where. Use Model.where or Fragment.where to express the condition that determines when the derived fact applies. This can include comparisons, relationship traversals, and any other expressions that evaluate to a boolean.
- Define the derived fact with Model.define. Use Model.define or Fragment.define to specify the derived fact you want to define when the condition is met. This can be a new relationship, a new property value, or concept membership.
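As a rough mental model of the two steps, the plain-Python sketch below computes the same kind of result by hand. It does not use PyRel. The sample orders, the needs_review_condition helper, and the cutoff value are assumptions chosen for illustration.

```python
from datetime import datetime

CUTOFF = datetime(2026, 1, 1)

# Hypothetical sample data standing in for base facts.
orders = [
    {"order_id": 1, "promised_ship_date": datetime(2025, 12, 1), "status": "shipped"},
    {"order_id": 2, "promised_ship_date": datetime(2025, 12, 15), "status": "pending"},
    {"order_id": 3, "promised_ship_date": datetime(2026, 2, 1), "status": "pending"},
]


# Step 1: the condition. Both parts must hold for an order to match.
def needs_review_condition(order):
    return order["status"] == "pending" and order["promised_ship_date"] < CUTOFF


# Step 2: the definition. Membership is derived only for matching orders.
needs_review = {o["order_id"] for o in orders if needs_review_condition(o)}
print(needs_review)
```

Only order 2 is both pending and promised before the cutoff, so it is the only derived member.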
What can go in a where clause
Model.where and Fragment.where can take any number of arguments that are boolean expressions.
This includes:
- Comparison expressions. These are expressions that compare values using operators like <, >, ==, etc:

    Order.promised_ship_date < cutoff
    Order.status != Status.SHIPPED
    100 <= Order.total_amount <= 500

  For a full list of supported operators, see the Variable reference documentation.
- Concept membership checks. Call a Concept and pass an entity variable to check if that entity is a member of the concept:

    DelayedOrder(Order)
    NeedsReview(Order)

- Relationship existence checks. Call a relationship Chain and pass one entity variable for each field in the final relationship to check if that relationship path exists:

    Customer.orders(Order)
    Order.shipments.carrier(Carrier)

- Logical expressions. You can combine expressions with logical operators, like & for AND, | for OR, and Model.not_ for NOT:

    (Order.promised_ship_date < cutoff) & (Order.status == Status.PENDING)
    (Order.status == Status.CANCELLED) | (Order.status == Status.RETURNED)
    m.not_(DelayedOrder(Order))

  - Python’s and, or, and not cannot be used for logical expressions in the PyRel DSL because they cannot be overloaded to build DSL logic. Always use &, |, and Model.not_ instead.
  - The & operator is often unnecessary because Model.where implicitly ANDs multiple arguments. Use it to group conditions with |, or to combine pre-built filter fragments. See Combine where clauses with & and |.
  - The | operator short-circuits: if the left side matches, the right side is not applied. For true set union semantics that include matches from both branches, use Model.union instead.
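The restriction on and, or, and not comes from Python itself, not from PyRel. The sketch below uses a hypothetical Cond class (not a PyRel class) to show the difference: & and | dispatch to methods that can build a new object, while the keywords only ask for a truth value.

```python
class Cond:
    """Toy condition node (hypothetical, not a PyRel class)."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        # `a & b` calls this method, so it can build a combined node.
        return Cond(f"({self.text} AND {other.text})")

    def __or__(self, other):
        # `a | b` calls this method in the same way.
        return Cond(f"({self.text} OR {other.text})")

    def __bool__(self):
        # `and`, `or`, and `not` only ask for truthiness. There is no hook
        # that would let them build a combined node.
        raise TypeError("use & and | instead of and/or/not")


pending = Cond("status == 'pending'")
late = Cond("promised_ship_date < cutoff")

combined = pending & late
either = pending | late
print(combined.text)
print(either.text)

try:
    pending and late
    keyword_failed = False
except TypeError:
    keyword_failed = True
print(keyword_failed)
```

In Python, & and | bind more tightly than comparison operators such as < and ==, which is why each comparison in the examples above is wrapped in parentheses.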
Put define before where
When you want the definition to read like a statement with conditions, such as “an order needs review if its promised ship date is before a cutoff”, put define before where:
    from datetime import datetime
    from relationalai.semantics import Integer, Model

    CUTOFF = datetime(2026, 1, 1)

    m = Model("MyModel")

    Order = m.Concept("Order", identify_by={"order_id": Integer})
    NeedsReview = m.Concept("NeedsReview")

    # Define some order entities. Each order needs a unique order_id.
    m.define(
        Order.new(order_id=1, promised_ship_date=datetime(2025, 12, 1), status="shipped"),
        Order.new(order_id=2, promised_ship_date=datetime(2025, 12, 15), status="pending"),
        Order.new(order_id=3, promised_ship_date=datetime(2026, 2, 1), status="pending"),
    )

    # Define the derived fact with the condition.
    m.define(NeedsReview(Order)).where(
        Order.status == "pending",
        Order.promised_ship_date < CUTOFF,
    )

    # Materialize the flagged orders.
    df = m.select(Order.order_id, Order.promised_ship_date).where(NeedsReview(Order)).to_df()
    print(df)

- m.define(NeedsReview(Order)) defines the derived fact as concept membership in NeedsReview.
- .where(...) adds the conditions that the order status is “pending” and the promised ship date is before the cutoff.
- The verification query selects orders that are members of NeedsReview to confirm the definition works as expected. It also illustrates how select can be put before where when you want the output to read like “select these columns where this condition holds”.
Put where before define
When you want the condition to read like a filter, such as “orders that are pending and before a cutoff”, put where before define:
    from datetime import datetime
    from relationalai.semantics import Integer, Model

    CUTOFF = datetime(2026, 1, 1)

    m = Model("MyModel")

    Order = m.Concept("Order", identify_by={"order_id": Integer})
    NeedsReview = m.Concept("NeedsReview")

    # Define some order entities.
    m.define(
        Order.new(order_id=1, promised_ship_date=datetime(2025, 12, 1), status="shipped"),
        Order.new(order_id=2, promised_ship_date=datetime(2025, 12, 15), status="pending"),
        Order.new(order_id=3, promised_ship_date=datetime(2026, 2, 1), status="pending"),
    )

    # Filter first, then define the derived fact.
    m.where(
        Order.status == "pending",
        Order.promised_ship_date < CUTOFF,
    ).define(NeedsReview(Order))

    # Materialize the flagged orders.
    df = m.where(NeedsReview(Order)).select(Order.order_id, Order.promised_ship_date).to_df()
    print(df)

- m.where(...) expresses the condition as a filter.
- .define(NeedsReview(Order)) defines the derived fact only for the matches.
- The verification query selects orders that are members of NeedsReview to confirm the definition works as expected. It also illustrates how where can be put before select when you want the output to read like “where this condition holds, select these columns”.
Chain multiple where clauses
When you want to modularize a query, build up your filter in reusable steps. This is especially useful when you have a base condition (like “pending orders”) that you want to reuse across multiple derived facts and verification queries.
This snippet uses Model.where to build a reusable filter fragment, then refines it with another where before calling define:
    from datetime import datetime
    from relationalai.semantics import Integer, Model

    CUTOFF = datetime(2026, 1, 1)

    m = Model("MyModel")

    Order = m.Concept("Order", identify_by={"order_id": Integer})
    NeedsReview = m.Concept("NeedsReview")

    # Define some order entities.
    m.define(
        Order.new(order_id=1, promised_ship_date=datetime(2025, 12, 1), status="shipped"),
        Order.new(order_id=2, promised_ship_date=datetime(2025, 12, 15), status="pending"),
        Order.new(order_id=3, promised_ship_date=datetime(2026, 2, 1), status="pending"),
    )

    # Build a reusable base filter.
    where_pending = m.where(Order.status == "pending")

    # Refine it in a second step, then define the derived fact.
    where_pending_before_cutoff = where_pending.where(Order.promised_ship_date < CUTOFF)
    where_pending_before_cutoff.define(NeedsReview(Order))

    # Materialize the flagged orders.
    df = m.where(NeedsReview(Order)).select(Order.order_id, Order.promised_ship_date).to_df()
    print(df)

- where_pending = m.where(...) builds a reusable filter fragment that filters for pending orders.
- where_pending_before_cutoff = where_pending.where(...) refines that fragment by adding another condition to filter pending orders before the cutoff.
- where_pending_before_cutoff.define(...) defines the derived fact for orders that match both conditions.

- Each where creates a new fragment, so you can keep the base filter (where_pending) and build multiple derived facts from it by refining it with different conditions.
- This is more modular and reusable than writing one big where with all conditions at once.
- It also allows you to materialize intermediate fragments (for example, where_pending.select(...).to_df()) to verify that each step is working as expected before you build on top of it.
Combine where clauses with & and |
When you already have separate where fragments and you want to combine them into one ANDed filter, combine them with &:
    from datetime import datetime
    from relationalai.semantics import Integer, Model

    CUTOFF = datetime(2026, 1, 1)

    m = Model("MyModel")

    Order = m.Concept("Order", identify_by={"order_id": Integer})
    NeedsReview = m.Concept("NeedsReview")

    # Define some order entities.
    m.define(
        Order.new(order_id=1, promised_ship_date=datetime(2025, 12, 1), status="shipped"),
        Order.new(order_id=2, promised_ship_date=datetime(2025, 12, 15), status="pending"),
        Order.new(order_id=3, promised_ship_date=datetime(2026, 2, 1), status="pending"),
    )

    # Two independent filter-only fragments.
    where_pending = m.where(Order.status == "pending")
    where_before_cutoff = m.where(Order.promised_ship_date < CUTOFF)

    # Combine the fragments with & to define the derived fact.
    where_pending_before_cutoff = where_pending & where_before_cutoff
    where_pending_before_cutoff.define(NeedsReview(Order))

    # Materialize the flagged orders.
    df = m.where(NeedsReview(Order)).select(Order.order_id, Order.promised_ship_date).to_df()
    print(df)

- where_pending and where_before_cutoff are filter-only fragments.
- & combines those fragments into a single ANDed filter.
- Use & rather than Python and because Python and cannot be overloaded to build DSL logic.
You can also combine fragments with | to express fallback logic, where you want to take the left branch if it matches, and if not, take the right branch:
    where_shipped = m.where(Order.status == "shipped")
    where_after_cutoff = m.where(Order.promised_ship_date > CUTOFF)
    where_shipped_or_after_cutoff = where_shipped | where_after_cutoff

    # Materialize the combined fragment.
    df = where_shipped_or_after_cutoff.select(Order.order_id, Order.promised_ship_date).to_df()
    print(df)

- | should be used only when you explicitly want fallback behavior (take the left branch if it matches, otherwise the right). If you are defining mutually exclusive categories, multiple conditional definitions (one per category) are often the clearest case-split pattern.
- For either/or logic, prefer Model.union when you want to combine matches from multiple branches.
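The difference between fallback and union can be sketched in plain Python. The fallback and union helpers below are hypothetical, not PyRel functions. For simplicity the sketch treats each branch as a whole list of matches. The granularity at which PyRel applies the fallback is not spelled out in this guide, so treat this only as an illustration of the wording "take the left branch if it matches, otherwise the right".

```python
def fallback(left_rows, right_rows):
    # Take the left branch when it produced anything, otherwise the right.
    return list(left_rows) if left_rows else list(right_rows)


def union(left_rows, right_rows):
    # Keep matches from both branches, dropping duplicates but keeping order.
    return list(dict.fromkeys([*left_rows, *right_rows]))


# Order ids from the sample data used in this section.
shipped = [1]        # orders with status "shipped"
after_cutoff = [3]   # orders promised after the cutoff

fallback_result = fallback(shipped, after_cutoff)
union_result = union(shipped, after_cutoff)
empty_left_result = fallback([], after_cutoff)

print(fallback_result)
print(union_result)
print(empty_left_result)
```

Under these assumptions, the fallback returns only the left branch whenever it has matches, while the union always keeps both.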
Classify entities with concept membership
Derived concept membership lets you name reusable categories of entities based on conditions over their properties and relationships.
For example, you might want to define a DelayedOrder concept for orders that have a promised ship date before their actual ship date:
    from relationalai.semantics import DateTime, Integer, Model
    from relationalai.semantics.std.datetime import datetime

    m = Model("MyModel")

    # Model schema
    Shipment = m.Concept("Shipment", identify_by={"id": Integer})
    Shipment.shipped_at = m.Property(f"{Shipment} shipped at {DateTime}")
    Order = m.Concept("Order", identify_by={"id": Integer})
    Order.promised_ship_date = m.Property(f"{Order} promised ship date is {DateTime}")
    Order.shipments = m.Relationship(f"{Order} has shipment {Shipment}")
    DelayedOrder = m.Concept("DelayedOrder", extends=[Order])

    # Define some base Order and Shipment entities
    m.define(
        o := Order.new(id=1, promised_ship_date=datetime(2025, 12, 1)),
        s := Shipment.new(id=1, shipped_at=datetime(2025, 12, 15)),
        o.shipments(s),
    )

    # Define derived concept membership for delayed orders
    m.where(
        Order.shipments.shipped_at > Order.promised_ship_date,
    ).define(
        DelayedOrder(Order)
    )

    # Verify: materialize delayed orders.
    df = m.select(DelayedOrder.id).to_df()
    print(df)

- DelayedOrder extends Order, so members are still orders.
- m.where(...) binds Shipment and Order and filters for late shipments.
- .define(DelayedOrder(Order)) defines membership for the matching orders.
- The verification query ends with Fragment.to_df to execute and materialize results.

- This pattern assumes you already defined base facts for Order.shipments, Shipment.shipped_at, and Order.promised_ship_date. If those facts are missing, the membership query will be empty.
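The point about missing base facts can be illustrated with a small plain-Python sketch. It does not use PyRel. The dictionary layout and the second order, which has no shipment facts, are assumptions added for illustration.

```python
from datetime import datetime

# Hypothetical base facts: order id -> promised date and shipment timestamps.
orders = {
    1: {"promised": datetime(2025, 12, 1), "shipped_at": [datetime(2025, 12, 15)]},
    2: {"promised": datetime(2025, 12, 1), "shipped_at": []},  # no shipment facts
}

# An order is delayed if at least one of its shipments left after the promised date.
delayed = {
    order_id
    for order_id, facts in orders.items()
    if any(shipped > facts["promised"] for shipped in facts["shipped_at"])
}
print(delayed)
```

Order 2 has nothing to compare against its promised date, so it can never satisfy the condition and is never classified as delayed.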
Compute derived property values
When you want to compute a property value based on other values and conditions, use a conditional definition with Model.define to assign the property value for matches of the condition.
For example, you might want to compute a delay_in_days property for orders that have shipped late:
    from relationalai.semantics import DateTime, Integer, Model
    from relationalai.semantics.std.datetime import datetime

    m = Model("MyModel")

    # Model schema
    Shipment = m.Concept("Shipment", identify_by={"id": Integer})
    Shipment.shipped_at = m.Property(f"{Shipment} shipped at {DateTime}")
    Order = m.Concept("Order", identify_by={"id": Integer})
    Order.promised_ship_date = m.Property(f"{Order} promised ship date is {DateTime}")
    # The property is declared on Order; this snippet does not define a DelayedOrder concept.
    Order.delay_in_days = m.Property(f"{Order} is delayed {Integer} days")
    Order.shipments = m.Relationship(f"{Order} has shipment {Shipment}")

    # Define some base Order and Shipment entities
    m.define(
        o := Order.new(id=1, promised_ship_date=datetime(2025, 12, 1)),
        s := Shipment.new(id=1, shipped_at=datetime(2025, 12, 1)),
        o.shipments(s),
    )

    # Define derived property values for delayed orders.
    m.where(
        Order.shipments.shipped_at > Order.promised_ship_date,
    ).define(
        Order.delay_in_days == datetime.diff("days", Order.shipments.shipped_at, Order.promised_ship_date)
    )

    # Verify: materialize orders and their delay in days.
    df = m.select(
        Order.id.alias("order_id"),
        Order.delay_in_days.alias("delay_in_days") | 0,
    ).to_df()
    print(df)

Compute derived relationships
You can compute derived relationships to capture connections between entities that are not explicitly stated in your base facts, but can be inferred from them.
For example, you might want to define a Customer concept and derive a Customer.delayed_orders relationship to link customers to their delayed orders:
    from relationalai.semantics import DateTime, Integer, Model
    from relationalai.semantics.std.datetime import datetime

    m = Model("MyModel")

    # Model schema
    Shipment = m.Concept("Shipment", identify_by={"id": Integer})
    Shipment.shipped_at = m.Property(f"{Shipment} shipped at {DateTime}")
    Order = m.Concept("Order", identify_by={"id": Integer})
    Order.promised_ship_date = m.Property(f"{Order} promised ship date is {DateTime}")
    Customer = m.Concept("Customer", identify_by={"id": Integer})
    Customer.orders = m.Relationship(f"{Customer} places {Order}")
    Order.shipments = m.Relationship(f"{Order} has shipment {Shipment}")

    # Derived relationship
    Customer.delayed_orders = m.Relationship(f"{Customer} has delayed order {Order}")

    # Define base facts
    m.define(
        c := Customer.new(id=1),
        o := Order.new(id=1, promised_ship_date=datetime(2025, 12, 1)),
        s := Shipment.new(id=1, shipped_at=datetime(2025, 12, 15)),
        c.orders(o),
        o.shipments(s),
    )

    # Define the derived relationship for delayed orders.
    # Customer.orders(Order) binds Order to the customer's orders so the
    # definition below links each customer to the matching order.
    m.where(
        Customer.orders(Order),
        Order.promised_ship_date < Order.shipments.shipped_at,
    ).define(
        Customer.delayed_orders(Order)
    )

    # Verify: materialize customers and their delayed orders.
    df = m.select(
        Customer.id.alias("customer_id"),
        Customer.delayed_orders.id.alias("delayed_order_id")
    ).to_df()
    print(df)

- The schema declares the base facts you need for the derivation: Customer.orders links customers to orders, and Order.shipments.shipped_at gives you the shipped timestamp to compare against Order.promised_ship_date.
- Customer.delayed_orders is a derived relationship. Its direction is explicit: it links a Customer (left side) to an Order (right side).
- The m.define(...) block creates one customer (c), one order (o), one shipment (s), and connects them with c.orders(o) and o.shipments(s).
- m.where(Customer.orders(Order), Order.promised_ship_date < Order.shipments.shipped_at) matches customer–order pairs where the order shipped after its promised ship date.
- .define(Customer.delayed_orders(Order)) defines new relationship facts for those matches.
- The verification query selects Customer.id and Customer.delayed_orders.id and ends with Fragment.to_df to execute and show the derived links.
Match multiple entities of the same type with Concept.ref
When you need to match multiple entities of the same concept in the same query, use Concept.ref to create distinct variable bindings for each match:
    from relationalai.semantics import Model, String

    m = Model("MyModel")

    # Model schema
    Order = m.Concept("Order", identify_by={"id": String})

    # Define some base facts.
    m.define(
        Order.new(id="o1"),
        Order.new(id="o2"),
        Order.new(id="o3"),
    )

    # Create two distinct variables over the Order concept.
    o1 = Order.ref()
    o2 = Order.ref()

    # Get pairs of order ids.
    df = m.where(
        o1.id < o2.id,
    ).select(
        o1.id.alias("order_1_id"),
        o2.id.alias("order_2_id"),
    ).to_df()
    print(df)

- Order.ref() creates a new variable binding for the Order concept.
- Each call to Order.ref() creates a distinct variable, so o1 and o2 can match different orders in the same query.
- The condition o1.id < o2.id keeps each pair of orders once and excludes pairs where o1 and o2 are the same order.
- .select(...).to_df() materializes the remaining pairs of order ids.
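To see what two distinct variables over the same concept buy you, here is a plain-Python analogy that does not use PyRel. Two independent loops play the role of the two variables, and the list of ids mirrors the sample orders above.

```python
order_ids = ["o1", "o2", "o3"]

# Two independent loops: every combination of (a, b), including a == b.
all_pairs = [(a, b) for a in order_ids for b in order_ids]

# Adding the a < b condition keeps each unordered pair once and drops self-pairs.
pairs = [(a, b) for a in order_ids for b in order_ids if a < b]

print(len(all_pairs))
print(pairs)
```

Without the ordering condition there are nine combinations; with it, three remain.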
Technically, only one of the two variables needs to come from Order.ref() to match multiple orders.
For example, you could have written the query with just one Order.ref() and Order itself:

    o1 = Order
    o2 = Order.ref()

    df = m.where(
        o1.id < o2.id,
    ).select(
        o1.id.alias("order_1_id"),
        o2.id.alias("order_2_id"),
    ).to_df()

Match multiple related values with Chain.ref
When you need to match multiple values from the same multi-valued relationship path in one query, use Chain.ref to create distinct path bindings.
This is a good fit for pairwise logic over related entities, like “two different shipments for the same order”:
    from relationalai.semantics import Integer, Model

    m = Model("MyModel")

    # Model schema
    Order = m.Concept("Order", identify_by={"id": Integer})
    Shipment = m.Concept("Shipment", identify_by={"id": Integer})
    Order.shipments = m.Relationship(f"{Order} has shipment {Shipment:shipment}")

    # Define some base facts.
    m.define(
        o := Order.new(id=1),
        s1 := Shipment.new(id=1),
        s2 := Shipment.new(id=2),
        s3 := Shipment.new(id=3),
        o.shipments(s1),
        o.shipments(s2),
        o.shipments(s3),
    )

    # Create two distinct bindings for the same relationship path.
    s1 = Order.shipments.ref()
    s2 = Order.shipments.ref()

    # Get pairs of shipments for each order.
    df = m.where(
        s1.id < s2.id,
    ).select(
        Order.id.alias("order_id"),
        s1.id.alias("shipment_1_id"),
        s2.id.alias("shipment_2_id"),
    ).to_df()
    print(df)

- Order.shipments is a Chain that represents the path from Order to Shipment through the shipments relationship.
- Order.shipments.ref() creates a new variable binding for that path, allowing you to match multiple shipments for the same order in the same query.
- Each call to Order.shipments.ref() creates a distinct path variable, so s1 and s2 can match different shipments for the same order in the same query.
- The condition s1.id < s2.id ensures that we only get each pair of shipments once, and that s1 and s2 are not the same shipment.

Chain.ref makes the path matches distinct when the same path can match multiple values. If you need multiple orders (a self-join on Order), you still need Order.ref.
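As a plain-Python analogy for pairing values along one relationship path (again, not PyRel code), the sketch below pairs shipments only within the same order. The shipments_by_order mapping and the second order with a single shipment are hypothetical.

```python
from itertools import combinations

# Hypothetical facts: order id -> shipment ids reachable through the relationship.
shipments_by_order = {1: [1, 2, 3], 2: [4]}

# Pair shipments within each order; pairs never cross orders.
rows = [
    (order_id, first, second)
    for order_id, shipment_ids in shipments_by_order.items()
    for first, second in combinations(sorted(shipment_ids), 2)
]
print(rows)
```

Order 2 has only one shipment, so it contributes no rows, and every pair shares the same order id.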