Meet PyRel
Welcome to RelationalAI! If you’re new to PyRel, you’re in the right spot. This tutorial gently guides you through the ins and outs of using PyRel to model your business domain and answer valuable business questions that support data-driven decision making.
You’ll learn how to:
- Declare a semantic model with concepts, properties, and relationships.
- Load data as base facts in your model.
- Build a graph from your model’s facts.
- Run a graph algorithm and store the results back in your model.
- Write queries that summarize and analyze your results.
Let’s get started!
Before you begin, make sure PyRel is installed. See Set up your environment for instructions.
What PyRel is
Before you write any code, it helps to have a simple mental model of how PyRel works.
PyRel is a Python library for building declarative models over your data. Instead of writing loops that compute outputs step-by-step, you declare facts and logic, and then ask PyRel to evaluate those declarations.
In practice, you use PyRel to:
- Create a model workspace. Use Model to hold your schema, definitions, and requirements.
- Declare a semantic schema. Declare concepts, properties, and relationships so you can talk about your data in domain terms.
- Load data from sources into the model. Start with in-memory rows for quick iteration or map from real data sources.
- Define the logic that governs your model. Definitions derive new facts from existing facts using declarative logic. Requirements are reusable checks that must hold. They help you catch missing or inconsistent data early.
- Query and materialize results. Build queries to describe what you want to see, then materialize results in Python to evaluate the query. Query results are computed on-demand based on the model’s data and declared logic.
What problem you will solve
Imagine you run an online store. You want to segment your customers so you can market to them more effectively. You have an orders table, but need a way to turn that raw data into actionable insights.
In this tutorial, you will segment customers based on co-purchases. The idea is simple:
- If two customers buy many of the same products, they are probably similar.
- Similar customers should end up in the same segment.
To make that “similarity” explicit, you will use graph reasoning to build a customer graph:
- Each customer is a node.
- You connect two customers when they bought the same product.
- The more products they share, the stronger the connection.
Then you will run a community detection algorithm to turn that connected structure into a segment label per customer. Finally, you will analyze the value of each segment by looking at how much revenue they generated.
Create a model
You will build your model in one runnable script named segment_customers.py.
You’ll start small, then add a few lines at a time until the script produces segment assignments and a simple segment summary.
- Create a raiconfig.yaml file

  From your project root, run:

      rai init

  This creates a raiconfig.yaml template. Fill in the fields for your Snowflake account and authentication. For more information, see the Configuration guides.

- Create segment_customers.py and add imports

  Create a file named segment_customers.py. Then add these imports at the top:

      from relationalai.semantics import Float, Integer, Model, String, distinct
      from relationalai.semantics.reasoners.graph import Graph
      from relationalai.semantics.std import aggregates

- Create a model

  Add the following code below your imports:

      model = Model("retail_customer_segmentation")
Declare the model schema
Now you will declare the schema you will use for the rest of the tutorial. In PyRel, the schema is the combination of concepts, properties, and relationships that lets you talk about raw data in domain terms (like customers and orders).
In this tutorial, your schema includes:
- Named entity types called concepts:
- Customers, products, and orders
- Customer segments (the output of community detection)
- Single-valued attributes called properties:
- Customer name and region
- Product name and category
- Order amount
- Links between concepts called relationships:
- Orders link to the customer who placed them and the product they contain.
- Customers link to the segment they belong to.
Build that schema in three small steps:
- Declare Customer, Product, and Order concepts and their properties

  Concepts are the entity types in your domain, like customers and orders. Properties are the single-valued attributes you want to query later, like a customer’s region or an order amount.

  Add the following code below the model = ... line you added earlier:

      Customer = model.Concept("Customer", identify_by={"id": Integer})
      Customer.name = model.Property(f"{Customer} has name {String:name}")
      Customer.region = model.Property(f"{Customer} has region {String:region}")

      Product = model.Concept("Product", identify_by={"id": Integer})
      Product.name = model.Property(f"{Product} has name {String:name}")
      Product.category = model.Property(f"{Product} in category {String:category}")

      Order = model.Concept("Order", identify_by={"id": Integer})
      Order.amount = model.Property(f"{Order} has amount {Float:amount}")

  - identify_by={"id": Integer} gives each concept a stable identity key.
  - This does not create any entities yet. It just declares the schema you will load data into later.

- Declare relationships between orders, customers, and products

  Relationships are the links between entity types. Here you declare that each order points to (1) the customer who placed it and (2) the product it contains.

  Add the following code below the concept declarations:

      Order.customer = model.Relationship(f"{Order} placed by {Customer}")
      Order.product = model.Relationship(f"{Order} contains {Product}")

  - The f"{...}" argument is a reading: a compact, human-readable description of what the relationship means.
  - The {Order}, {Customer}, and {Product} parts are fields. They tell PyRel what concept types participate in the relationship.
  - You can name a field by adding :<name> inside the braces (for example, {Order:order}), but this tutorial keeps the readings simple.

- Declare CustomerSegment and attach segments to customers

  You will store the output of community detection as a reusable concept in your model. In other words: segments are first-class entities, and each customer belongs to one segment.

  Add the following code below the relationship declarations:

      CustomerSegment = model.Concept("CustomerSegment", identify_by={"id": Integer})
      Customer.segment = model.Relationship(f"{Customer} belongs to {CustomerSegment}")
Load sample data as base facts
Now you will load a small set of sample rows into your model. In PyRel, these rows are your base facts: the starting truth your later definitions and queries build on.
Follow these steps to load the data and turn it into base facts in your model:
- Add customer rows

  Start by creating a small “customers” dataset. Each row is one customer, and the id value will be used as identity.

  Add the following rows below your schema declarations:

      customer_rows = [
          {"id": 1, "name": "Alice", "region": "North"},
          {"id": 2, "name": "Bob", "region": "North"},
          {"id": 3, "name": "Carol", "region": "South"},
          {"id": 4, "name": "Dan", "region": "South"},
          {"id": 5, "name": "Eve", "region": "West"},
          {"id": 6, "name": "Frank", "region": "West"},
      ]

- Add product rows

  Next, add products. These rows are referenced by orders later.

  Add the following below customer_rows:

      product_rows = [
          {"id": 101, "name": "Protein Powder", "category": "Fitness"},
          {"id": 102, "name": "Yoga Mat", "category": "Fitness"},
          {"id": 103, "name": "Wireless Earbuds", "category": "Electronics"},
          {"id": 104, "name": "Smart Watch", "category": "Electronics"},
          {"id": 105, "name": "Espresso Beans", "category": "Food"},
          {"id": 106, "name": "Coffee Grinder", "category": "Food"},
      ]

- Add order rows (including cross-cluster purchases)

  Now add orders. Notice that each order row includes customer_id and product_id. These are foreign-key style references that you will turn into real relationships in a later step.

  Add the following code below product_rows:

      order_rows = [
          # Fitness-oriented cluster
          {"id": 1001, "customer_id": 1, "product_id": 101, "amount": 95.0},
          {"id": 1002, "customer_id": 1, "product_id": 102, "amount": 40.0},
          {"id": 1003, "customer_id": 2, "product_id": 101, "amount": 90.0},
          {"id": 1004, "customer_id": 2, "product_id": 102, "amount": 42.0},
          # Electronics-oriented cluster
          {"id": 1005, "customer_id": 3, "product_id": 103, "amount": 160.0},
          {"id": 1006, "customer_id": 3, "product_id": 104, "amount": 240.0},
          {"id": 1007, "customer_id": 4, "product_id": 103, "amount": 155.0},
          {"id": 1008, "customer_id": 4, "product_id": 104, "amount": 235.0},
          # Food-oriented cluster
          {"id": 1009, "customer_id": 5, "product_id": 105, "amount": 24.0},
          {"id": 1010, "customer_id": 5, "product_id": 106, "amount": 130.0},
          {"id": 1011, "customer_id": 6, "product_id": 105, "amount": 28.0},
          {"id": 1012, "customer_id": 6, "product_id": 106, "amount": 125.0},
          # A few cross-cluster purchases to make segmentation realistic
          {"id": 1013, "customer_id": 2, "product_id": 103, "amount": 145.0},
          {"id": 1014, "customer_id": 4, "product_id": 106, "amount": 120.0},
          {"id": 1015, "customer_id": 6, "product_id": 102, "amount": 39.0},
      ]

- Turn the rows into model facts

  At this point, you have plain Python data. The next step is to convert it into something PyRel can define as facts.

  Add the following code below the row lists:

      # Wrap raw rows in model.data() to get table-like objects PyRel can work with
      customer_data = model.data(customer_rows)
      product_data = model.data(product_rows)
      order_data = model.data(order_rows)

      # Explicitly map columns to concept properties with keyword arguments
      model.define(
          Customer.new(
              id=customer_data.id,
              name=customer_data.name,
              region=customer_data.region,
          )
      )

      # Implicitly map columns to concept properties with .to_schema()
      model.define(Product.new(product_data.to_schema()))

  - model.data(...) wraps your Python rows in a table-like object. You can access columns as attributes (for example, customer_data.id).
  - Customer.new(id=..., name=..., region=...) uses keyword arguments to map columns to properties. This is the most explicit form for defining entities: you decide exactly which column populates each field, so you can map columns with different names or do simple column transformations.
  - .to_schema() turns the table-like object into a mapping of column names to column values. Product.new(product_data.to_schema()) uses that to map matching column names automatically. This is a more implicit form of defining entities for when you know that your column names already match your schema.
  - model.define(...) adds the resulting entities to the model as base facts.
  - No entities have been computed yet. You are just declaring what the facts are. PyRel will compute the actual entities when you materialize results later.
  - You can mix and match the explicit keyword-argument style and the implicit .to_schema() style as needed. An example is shown in the next step.

- Define orders with foreign-key style references

  Orders are slightly different because each row has two references: a customer and a product. Here you define the order facts and also define the links to the matching Customer and Product entities.

  Add the following code below the model.define(Product.new(...)) line:

      model.define(
          Order.new(
              order_data.to_schema(exclude=["customer_id", "product_id"]),
              customer=Customer.filter_by(id=order_data.customer_id),
              product=Product.filter_by(id=order_data.product_id),
          )
      )

  - exclude=["customer_id", "product_id"] keeps those raw ID columns out of the Order concept.
  - Customer.filter_by(id=order_data.customer_id) is how you “look up” the matching customer entity.
  - product=Product.filter_by(...) does the same lookup for products.

- Add a few simple requirements

  Requirements are checks that act like guardrails for your model. They help you fail fast when the input data is missing something you expect, like a customer name. They also catch values that should never happen, like a negative order amount.

  PyRel enforces requirements when you materialize results for a query. If a requirement fails, you will get an error.

  In this tutorial, you will require that customers and products have names, and that order amounts are positive.

  Add the following code below the model.define(...) calls from the previous steps:

      # Every customer and product must have a name.
      Customer.require(Customer.name)
      Product.require(Product.name)

      # Every order must have a positive amount.
      Order.require(Order.amount > 0.0)
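To make the foreign-key lookups and the requirements more concrete, here is a rough plain-Python analogy. It is not PyRel API: the helpers resolve_orders and check_requirements are hypothetical names introduced only for illustration, and the rows are a trimmed copy of the sample data. PyRel expresses the same ideas declaratively and evaluates them when you materialize a query.

```python
# Plain-Python analogy only; none of these helpers are part of PyRel.
customer_rows = [
    {"id": 1, "name": "Alice", "region": "North"},
    {"id": 2, "name": "Bob", "region": "North"},
]
product_rows = [
    {"id": 101, "name": "Protein Powder", "category": "Fitness"},
]
order_rows = [
    {"id": 1001, "customer_id": 1, "product_id": 101, "amount": 95.0},
    {"id": 1003, "customer_id": 2, "product_id": 101, "amount": 90.0},
]


def resolve_orders(orders, customers, products):
    """Replace raw ID columns with the rows they reference (hypothetical helper)."""
    customers_by_id = {row["id"]: row for row in customers}
    products_by_id = {row["id"]: row for row in products}
    resolved = []
    for order in orders:
        resolved.append({
            "id": order["id"],
            "amount": order["amount"],
            # Analogous in spirit to Customer.filter_by(id=order_data.customer_id).
            "customer": customers_by_id[order["customer_id"]],
            # Analogous in spirit to Product.filter_by(id=order_data.product_id).
            "product": products_by_id[order["product_id"]],
        })
    return resolved


def check_requirements(customers, products, orders):
    """Return a list of human-readable violations (hypothetical helper)."""
    violations = []
    for kind, rows in (("customer", customers), ("product", products)):
        for row in rows:
            if not row.get("name"):
                violations.append(f"{kind} {row['id']} has no name")
    for order in orders:
        if not order["amount"] > 0.0:
            violations.append(f"order {order['id']} has non-positive amount")
    return violations


orders = resolve_orders(order_rows, customer_rows, product_rows)
print(orders[0]["customer"]["name"])
print(check_requirements(customer_rows, product_rows, order_rows))
```

In this sketch a missing ID raises a KeyError during the dictionary lookup. How PyRel treats an order whose customer_id or product_id matches no entity is not covered in this tutorial, so do not read the sketch as a statement about that behavior.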
Build a customer graph
Now you will turn your order facts into a graph that a community detection algorithm can operate on. This graph is a derived structure. It is separate from your model schema, but built from the model’s facts.
In this tutorial, an edge means:
- Two customers are connected if they bought the same product.
- The edge weight is the number of shared products.
Follow these steps to build the graph from your order facts:
- Create a graph object

  Start by creating a Graph that uses customers as nodes. You will build an undirected, weighted graph and store edges as facts in your model.

  Add the following code below your requirements:

      graph = Graph(
          model,
          directed=False,
          weighted=True,
          node_concept=Customer,
          aggregator="sum",
      )

  - directed=False means the edge A → B is the same as B → A.
  - weighted=True means edges can have a numeric weight.
  - node_concept=Customer means each node represents a Customer entity.
  - aggregator="sum" collapses duplicate edges by summing their weights. This is necessary because two customers can share more than one product and therefore have multiple edges between them. In PyRel, graphs are simple, meaning they can only have one edge per node pair. The aggregator argument tells PyRel how to combine multiple edges into one.

- Define edges from shared products

  Now you will define the logic for connecting customers based on shared purchases. To do that, you will compare pairs of orders by creating two distinct references to Order.

  Add the following code below graph = Graph(...):

      left_order = Order.ref()
      right_order = Order.ref()

      model.where(
          left_order.product == right_order.product,
          left_order.customer.id < right_order.customer.id,
      ).define(
          graph.Edge.new(
              src=left_order.customer,
              dst=right_order.customer,
              weight=1.0,
          )
      )

  - Order.ref() creates an independent variable for the Order concept. This is different from a normal Python variable. Think of it as a labeled placeholder. Using two refs lets left_order and right_order match different orders in the same definition, so you can compare their properties to find shared purchases.
  - Read model.where(...).define(...) as: “find all matches, then create facts for each match.”
  - model.where(...) builds the filter. Each argument is a condition, and multiple conditions in one call are combined with AND. In this case:
    - left_order.product == right_order.product matches pairs of orders for the same product.
    - left_order.customer.id < right_order.customer.id keeps only one ordering of each customer pair so you do not create both A–B and B–A.
  - .define(...) says what to create for each matching pair. Here you define a graph edge fact with graph.Edge.new(...).
  - Each match contributes weight=1.0 to the edge between the two customers. Because the graph uses aggregator="sum", multiple shared products add up to a larger weight.
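As a sanity check on what the edge definition should produce, the following plain-Python sketch computes shared-product counts directly from the sample orders. It is an illustration, not PyRel code: the names purchases, buyers_by_product, and edge_weights are made up for this example, and it assumes each customer buys a given product at most once, which holds for the sample data.

```python
from collections import Counter, defaultdict
from itertools import combinations

# (customer_id, product_id) pairs taken from order_rows in this tutorial.
purchases = [
    (1, 101), (1, 102), (2, 101), (2, 102),
    (3, 103), (3, 104), (4, 103), (4, 104),
    (5, 105), (5, 106), (6, 105), (6, 106),
    (2, 103), (4, 106), (6, 102),
]

# Group the distinct customers who bought each product.
buyers_by_product = defaultdict(set)
for customer_id, product_id in purchases:
    buyers_by_product[product_id].add(customer_id)

# Each shared product contributes weight 1 to the pair (smaller id first),
# and summing merges parallel edges into a single weighted edge.
edge_weights = Counter()
for buyers in buyers_by_product.values():
    for left, right in combinations(sorted(buyers), 2):
        edge_weights[(left, right)] += 1

for pair, weight in sorted(edge_weights.items()):
    print(pair, weight)
```

Putting the smaller ID first plays the same role as the left_order.customer.id < right_order.customer.id condition, and the running sum plays the role of aggregator="sum". For the sample data this yields nine edges, with weight 2 for the pairs (1, 2), (3, 4), and (5, 6) and weight 1 for the cross-cluster pairs.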
Derive customer segments
Now that you have a customer graph, you can derive a segment for each customer with Louvain community detection. Louvain assigns a community label to each node based on how strongly it is connected to other nodes.
- Run Louvain and store the label on nodes

  Add the following code below your edge definition:

      graph.Node.community_label = graph.louvain()

  - graph.louvain() returns a relationship that maps each node to a community label.
  - graph.Node is the graph’s node concept. It is a special concept that represents the nodes in your graph. You can define new properties and relationships on it like any other concept. Here you define a new relationship called community_label to hold the Louvain output.

- Turn labels into segment entities

  Segments are first-class entities in your semantic model. You will create one CustomerSegment entity per distinct label.

  Add the following code below the Louvain line:

      model.define(CustomerSegment.new(id=graph.Node.community_label))

- Attach the segment to each customer

  Finally, you attach each customer to the segment with the same label.

  Add the following code below the CustomerSegment.new(...) definition:

      model.where(graph.Node == Customer).define(
          Customer.segment(CustomerSegment.filter_by(id=graph.Node.community_label))
      )
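Louvain is usually described as a greedy search for a partition with high modularity, a score that compares the edge weight inside communities with what you would expect from the node degrees alone. The sketch below computes that score in plain Python for the co-purchase weights implied by the sample orders. It assumes graph.louvain() follows this standard formulation, which this tutorial does not state; the modularity helper and the two example partitions are hypothetical and not part of PyRel.

```python
# Edge weights (shared-product counts) implied by the tutorial's order rows.
edge_weights = {
    (1, 2): 2, (1, 6): 1, (2, 3): 1, (2, 4): 1, (2, 6): 1,
    (3, 4): 2, (4, 5): 1, (4, 6): 1, (5, 6): 2,
}


def modularity(edges, community_of):
    """Standard weighted modularity for an undirected graph (hypothetical helper)."""
    total_weight = sum(edges.values())
    internal = {}    # weight of edges with both endpoints in the community
    degree_sum = {}  # sum of weighted degrees of the community's nodes
    for (u, v), weight in edges.items():
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + weight
        degree_sum[cv] = degree_sum.get(cv, 0) + weight
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + weight
    return sum(
        internal.get(c, 0) / total_weight
        - (degree_sum[c] / (2 * total_weight)) ** 2
        for c in degree_sum
    )


# Two illustrative partitions: one community per product cluster,
# and a single community that contains every customer.
by_cluster = {1: "a", 2: "a", 3: "b", 4: "b", 5: "c", 6: "c"}
everyone_together = {node: "all" for node in range(1, 7)}

print(round(modularity(edge_weights, by_cluster), 4))
print(round(modularity(edge_weights, everyone_together), 4))
```

The first partition scores about 0.1667 and the single-community partition scores 0.0, so grouping customers by their dominant product cluster captures more of the edge weight than lumping everyone together. Louvain is a heuristic, so the labels it returns are not guaranteed to be the highest-scoring partition.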
Summarize segment value
At this point, every customer belongs to a segment. Now you will build two small tables to help you understand the result:
- A segment-level summary with customers, orders, and revenue.
- A customer → segment table you can scan to sanity-check assignments.
- Define a segment-level summary query

  Add the following code near the bottom of your file:

      segment_value = (
          model.where(
              Customer.segment == CustomerSegment,
              Order.customer == Customer,
          )
          .select(
              CustomerSegment.id.alias("segment_id"),
              aggregates.count(distinct(Customer)).per(CustomerSegment).alias("customers"),
              aggregates.count(Order).per(CustomerSegment).alias("orders"),
              aggregates.sum(Order.amount).per(CustomerSegment).alias("revenue"),
              aggregates.avg(Order.amount).per(CustomerSegment).alias("avg_order_value"),
          )
          .to_df()
          .sort_values("revenue", ascending=False)
          .reset_index(drop=True)
      )

  - model.where(Customer.segment == CustomerSegment, Order.customer == Customer) joins customers to their segment and to their orders.
  - .per(CustomerSegment) groups each aggregate by segment. Without .per(...), you would get one overall total.
  - distinct(Customer) is important. Because each customer can match multiple orders, count(distinct(Customer)) counts customers instead of customer-order pairs.
  - .to_df() materializes the result. This triggers PyRel to compile the model and run the query, returning a pandas DataFrame you can work with in Python. This is also where you’ll see any compilation errors and requirement violations if there are issues with your model.

- Define a customer → segment membership table

  This table is a convenient “debug view” of the segmentation.

  Add the following code below the segment_value = (...) block:

      customer_segment_membership = (
          model.where(Customer.segment(CustomerSegment))
          .select(
              Customer.id.alias("customer_id"),
              Customer.name.alias("customer_name"),
              Customer.region,
              CustomerSegment.id.alias("segment_id"),
          )
          .to_df()
          .sort_values(["segment_id", "customer_id"])
      )

  - This query does not aggregate. It returns one row per customer.
  - Sorting by segment_id makes it easy to see which customers ended up together.

- Print both tables

  Add the following code below the membership table definition:

      print("\nCustomer -> Segment assignments")
      print(customer_segment_membership.to_string(index=False))
      print("\nSegment value analysis")
      print(segment_value.to_string(index=False))
Run the finished script
Run the script from your project root:

    python segment_customers.py

You will see two printed tables:

    Customer -> Segment assignments
     customer_id customer_name region  segment_id
               1         Alice  North           1
               2           Bob  North           1
               3         Carol  South           1
               4           Dan  South           2
               5           Eve   West           2
               6         Frank   West           2

    Segment value analysis
     segment_id  customers  orders  revenue  avg_order_value
              2          3       8    856.0            107.0
              1          3       7    812.0            116.0
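You can cross-check the segment value table by hand. The plain-Python sketch below groups the sample orders by the segment assignments printed above and recomputes each column, which also shows why the customer count needs distinct values while the order count does not. It is an illustration only; segment_of and summary are names made up for this example, and the segment IDs are copied from the sample output rather than computed.

```python
from collections import defaultdict

# (order_id, customer_id, amount) taken from order_rows in this tutorial.
orders = [
    (1001, 1, 95.0), (1002, 1, 40.0), (1003, 2, 90.0), (1004, 2, 42.0),
    (1005, 3, 160.0), (1006, 3, 240.0), (1007, 4, 155.0), (1008, 4, 235.0),
    (1009, 5, 24.0), (1010, 5, 130.0), (1011, 6, 28.0), (1012, 6, 125.0),
    (1013, 2, 145.0), (1014, 4, 120.0), (1015, 6, 39.0),
]
# Segment assignments copied from the sample output.
segment_of = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2}

customers = defaultdict(set)  # distinct customers per segment
amounts = defaultdict(list)   # one entry per order, per segment
for _order_id, customer_id, amount in orders:
    segment = segment_of[customer_id]
    customers[segment].add(customer_id)
    amounts[segment].append(amount)

summary = {
    segment: {
        "customers": len(customers[segment]),
        "orders": len(values),
        "revenue": sum(values),
        "avg_order_value": sum(values) / len(values),
    }
    for segment, values in amounts.items()
}

# Highest-revenue segment first, matching the sort order of the printed table.
for segment in sorted(summary, key=lambda s: summary[s]["revenue"], reverse=True):
    print(segment, summary[segment])
```

Segment 2 comes out with 3 customers, 8 orders, revenue 856.0, and an average order value of 107.0; segment 1 has 3 customers, 7 orders, revenue 812.0, and an average of 116.0. Both rows agree with the printed table.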