Meet PyRel

Build a small PyRel workflow that segments customers based on co-purchase behavior. This tutorial shows you how to declare a minimal retail model, load sample data, build a customer graph, run Louvain community detection, and summarize the results.

You’ll learn how to:

- Declare a semantic model with concepts, properties, and relationships.
- Load data as base facts in your model.
- Build a graph from your model’s facts.
- Run a graph algorithm and store the results back in your model.
- Write queries that summarize and analyze your results.

Before you begin, make sure PyRel is installed. See Set up your environment for instructions.

What PyRel is

Before you write any code, it helps to have a quick mental model of how PyRel works.

PyRel is a Python library for building declarative models over your data. Instead of writing loops that compute outputs step-by-step, you declare facts and logic, and then ask PyRel to evaluate those declarations.

You use PyRel to:

- Create a model: Use a Model object to hold your schema, definitions, and requirements.
- Declare a semantic schema: Declare concepts, properties, and relationships so you can talk about your data in domain terms.
- Load data from sources into the model: Start with in-memory rows for quick iteration or map from real data sources.
- Define the logic that governs your model: Definitions derive new facts from existing facts using declarative logic. Requirements are reusable checks that must hold. They help you catch missing or inconsistent data early.
- Query and materialize results: Build queries to describe what you want to see, then materialize results in Python to evaluate the query. Query results are computed on-demand based on the model’s data and declared logic.

What problem you will solve

Imagine you run an online store. You want to segment your customers so you can market to them more effectively. You have an orders table but need a way to turn that raw data into actionable insights.

You’ll segment customers based on co-purchases. The idea is simple:

- If two customers buy many of the same products, they are probably similar.
- Customers with stronger purchase overlap should end up in the same segment.

To make that similarity explicit, you’ll use graph reasoning to build a customer graph:

- Each customer is a node.
- You connect two customers when they bought the same product.
- The more products they share, the stronger the connection.

Then you’ll run a community detection algorithm to turn that connected structure into a segment label per customer. Finally, you’ll analyze the value of each segment by looking at how much revenue they generated.

Create a model

You’ll build your model in one runnable script named segment_customers.py. You’ll start small, then add a few lines at a time until the script produces segment assignments and a simple segment summary.

Create a raiconfig.yaml file

From your project root, run:
```
rai init
```
This creates a raiconfig.yaml template. Fill in the fields for your Snowflake account and authentication. For more information, see the Configuration guides.

Create segment_customers.py and add imports

Create a file named segment_customers.py. Then add these imports at the top:

```
from relationalai.semantics import Float, Integer, Model, String, distinct
from relationalai.semantics.reasoners.graph import Graph
from relationalai.semantics.std import aggregates
```

Create a model

Instantiate a Model object with a string for the model name:

```
from relationalai.semantics import Float, Integer, Model, String, distinct
from relationalai.semantics.reasoners.graph import Graph
from relationalai.semantics.std import aggregates

model = Model("retail_customer_segmentation")
```

Declare the model schema

Now you’ll declare the schema you’ll use for the rest of the tutorial. In PyRel, the schema is the combination of concepts, properties, and relationships that lets you talk about raw data in domain terms, such as customers and orders.

In this tutorial, your schema includes these parts:

| Schema part | Includes | Used for |
| --- | --- | --- |
| Concepts | Customers, products, orders, and customer segments | Represent the main entities in the model |
| Properties | Customer name and region, product name and category, and order amount | Store single-valued attributes |
| Relationships | Order-to-customer, order-to-product, and customer-to-segment links | Connect related entities |

Build that schema in three small steps:

Declare Customer, Product, and Order concepts and their properties

Concepts are the entity types in your domain, such as customers and orders. Properties are the single-valued attributes you want to query later, such as a customer’s region or an order amount.

Add the following code below the model = ... line you added earlier:
```
Customer = model.Concept("Customer", identify_by={"id": Integer})
Customer.name = model.Property(f"{Customer} has name {String:name}")
Customer.region = model.Property(f"{Customer} has region {String:region}")

Product = model.Concept("Product", identify_by={"id": Integer})
Product.name = model.Property(f"{Product} has name {String:name}")
Product.category = model.Property(f"{Product} in category {String:category}")

Order = model.Concept("Order", identify_by={"id": Integer})
Order.amount = model.Property(f"{Order} has amount {Float:amount}")
```
In this example:
- identify_by={"id": Integer} gives each concept a stable identity key.
- This does not create any entities yet. It just declares the schema you’ll load data into later.
Related resources:
- Declare concepts
- Declare relationships and properties

Declare relationships between orders, customers, and products

Relationships are the links between entity types. Here you declare that each order points to the customer who placed it and the product it contains.

Add the following code below the concept declarations:
```
Order.customer = model.Relationship(f"{Order} placed by {Customer}")
Order.product = model.Relationship(f"{Order} contains {Product}")
```
In this example:
- The f"{...}" argument is a reading: a compact, human-readable description of what the relationship means.
- The {Order}, {Customer}, and {Product} parts are fields. They tell PyRel what concept types participate in the relationship.
- You can name a field by adding :<name> inside the braces (for example, {Order:order}), but this tutorial keeps the readings simple.
Related resources:
- What the difference is between properties and relationships

Declare CustomerSegment and attach segments to customers

You’ll store the output of community detection as a reusable concept in your model. In other words, segments are first-class entities, and each customer belongs to one segment.

Add the following code below the relationship declarations:
```
CustomerSegment = model.Concept("CustomerSegment", identify_by={"id": Integer})
Customer.segment = model.Relationship(f"{Customer} belongs to {CustomerSegment}")
```

Load sample data as base facts

Now you’ll load a small set of sample rows into your model. In PyRel, these rows are your base facts: the starting truth your later definitions and queries build on.

Follow these steps to load the data and turn it into base facts in your model:

Add customer rows

Start by creating a small customer dataset. Each row is one customer, and the id value will be used as identity.

Add the following rows below your schema declarations:

```
customer_rows = [
    {"id": 1, "name": "Alice", "region": "North"},
    {"id": 2, "name": "Bob", "region": "North"},
    {"id": 3, "name": "Carol", "region": "South"},
    {"id": 4, "name": "Dan", "region": "South"},
    {"id": 5, "name": "Eve", "region": "West"},
    {"id": 6, "name": "Frank", "region": "West"},
]
```

Add product rows

Next, add products. These rows are referenced by orders later.

Add the following below customer_rows:

```
product_rows = [
    {"id": 101, "name": "Protein Powder", "category": "Fitness"},
    {"id": 102, "name": "Yoga Mat", "category": "Fitness"},
    {"id": 103, "name": "Wireless Earbuds", "category": "Electronics"},
    {"id": 104, "name": "Smart Watch", "category": "Electronics"},
    {"id": 105, "name": "Espresso Beans", "category": "Food"},
    {"id": 106, "name": "Coffee Grinder", "category": "Food"},
]
```

Add order rows, including cross-cluster purchases

Now add orders. Notice that each order row includes customer_id and product_id. These are foreign-key style references that you’ll turn into real relationships in a later step.

Add the following code below product_rows:

```
order_rows = [
    # Fitness-oriented cluster
    {"id": 1001, "customer_id": 1, "product_id": 101, "amount": 95.0},
    {"id": 1002, "customer_id": 1, "product_id": 102, "amount": 40.0},
    {"id": 1003, "customer_id": 2, "product_id": 101, "amount": 90.0},
    {"id": 1004, "customer_id": 2, "product_id": 102, "amount": 42.0},
    # Electronics-oriented cluster
    {"id": 1005, "customer_id": 3, "product_id": 103, "amount": 160.0},
    {"id": 1006, "customer_id": 3, "product_id": 104, "amount": 240.0},
    {"id": 1007, "customer_id": 4, "product_id": 103, "amount": 155.0},
    {"id": 1008, "customer_id": 4, "product_id": 104, "amount": 235.0},
    # Food-oriented cluster
    {"id": 1009, "customer_id": 5, "product_id": 105, "amount": 24.0},
    {"id": 1010, "customer_id": 5, "product_id": 106, "amount": 130.0},
    {"id": 1011, "customer_id": 6, "product_id": 105, "amount": 28.0},
    {"id": 1012, "customer_id": 6, "product_id": 106, "amount": 125.0},
    # A few cross-cluster purchases to make segmentation realistic
    {"id": 1013, "customer_id": 2, "product_id": 103, "amount": 145.0},
    {"id": 1014, "customer_id": 4, "product_id": 106, "amount": 120.0},
    {"id": 1015, "customer_id": 6, "product_id": 102, "amount": 39.0},
]
```
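Because each order refers to a customer and a product by ID, it can be worth confirming that every reference points at a row that exists before you turn the rows into facts. The following is an optional plain-Python sketch and is not part of segment_customers.py. The helper name find_dangling_orders and the abbreviated sample rows are assumptions made for illustration, not PyRel APIs.

```python
def find_dangling_orders(orders, customers, products):
    """Return IDs of orders whose customer_id or product_id has no matching row."""
    customer_ids = {row["id"] for row in customers}
    product_ids = {row["id"] for row in products}
    return [
        order["id"]
        for order in orders
        if order["customer_id"] not in customer_ids
        or order["product_id"] not in product_ids
    ]


# Abbreviated copies of the tutorial rows, plus one deliberately broken order.
sample_customers = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
sample_products = [{"id": 101, "name": "Protein Powder"}]
sample_orders = [
    {"id": 1001, "customer_id": 1, "product_id": 101, "amount": 95.0},
    {"id": 1003, "customer_id": 2, "product_id": 101, "amount": 90.0},
    {"id": 9999, "customer_id": 7, "product_id": 101, "amount": 10.0},  # no customer 7
]

dangling = find_dangling_orders(sample_orders, sample_customers, sample_products)
print(dangling)  # [9999]
```

Called on the full order_rows, customer_rows, and product_rows from this tutorial, the same helper returns an empty list, because all 15 sample orders reference existing customers and products.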

Turn the rows into model facts

At this point, you have plain Python data. The next step is to convert it into something PyRel can define as facts.

Add the following code below the row lists:
```
# Wrap raw rows in model.data() to get table-like objects PyRel can work with
customer_data = model.data(customer_rows)
product_data = model.data(product_rows)
order_data = model.data(order_rows)

# Explicitly map columns to concept properties with keyword arguments
model.define(
    Customer.new(
        id=customer_data.id,
        name=customer_data.name,
        region=customer_data.region,
    )
)

# Implicitly map columns to concept properties with .to_schema()
model.define(Product.new(product_data.to_schema()))
```
In this example:
- model.data(...) wraps your Python rows in a table-like object. You can access columns as attributes (for example, customer_data.id).
- Customer.new(id=..., name=..., region=...) uses keyword arguments to map columns to properties. It’s the most explicit form for defining entities. You decide exactly which column populates each field. You can map columns with different names or do simple column transformations.
- .to_schema() turns the table-like object into a mapping of column names to column values. Product.new(product_data.to_schema()) uses that to map matching column names automatically. It’s a more implicit form of defining entities when you know that your column names already match your schema.
- model.define(...) adds the resulting entities to the model as base facts.
Note that:
- No entities have been computed yet. You are just declaring what the facts are. PyRel will compute the actual entities when you materialize results later.
- You can mix and match the explicit keyword-argument style and the implicit .to_schema() style as needed. An example is shown in the next step.
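As a rough analogy in plain Python (not PyRel code), the two mapping styles resemble building a record with explicit keyword arguments versus unpacking a dict whose keys already match the field names. The dataclasses and the source column names cust_no and full_name below are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    id: int
    name: str
    region: str


@dataclass(frozen=True)
class ProductRecord:
    id: int
    name: str
    category: str


# Explicit style: the source columns are named differently, so each field is
# mapped by hand and can be transformed on the way in.
raw_customer = {"cust_no": 1, "full_name": " alice ", "region": "North"}
customer = CustomerRecord(
    id=raw_customer["cust_no"],
    name=raw_customer["full_name"].strip().title(),
    region=raw_customer["region"],
)

# Implicit style: the keys already match the field names, so the whole row
# can be unpacked in one step.
raw_product = {"id": 101, "name": "Protein Powder", "category": "Fitness"}
product = ProductRecord(**raw_product)

print(customer)
print(product)
```

The analogy stops at the mapping itself: this Python code builds objects immediately, whereas the PyRel definitions above only declare what the facts are.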
Related resources:
- Declare data sources
- Define base facts

Define orders with foreign-key style references

Orders are slightly different because each row has two references: a customer and a product. Here you define the order facts and also define the links to the matching Customer and Product entities.

Add the following code below the model.define(Product.new(...)) line:
```
model.define(
    Order.new(
        order_data.to_schema(exclude=["customer_id", "product_id"]),
        customer=Customer.filter_by(id=order_data.customer_id),
        product=Product.filter_by(id=order_data.product_id),
    )
)
```
In this example:
- exclude=["customer_id", "product_id"] keeps those raw ID columns out of the Order concept.
- Customer.filter_by(id=order_data.customer_id) is how you look up the matching customer entity.
- product=Product.filter_by(...) does the same lookup for products.

Add a few simple requirements

Requirements are checks that act as guardrails for your model. They help you fail fast when the input data is missing something you expect, such as a customer name. They also catch values that should never happen, such as a negative order amount.

PyRel enforces requirements when you materialize results for a query. If a requirement fails, you’ll get an error.

In this tutorial, you’ll require that customers and products have names, and that order amounts are positive.

Add the following code below the model.define(...) calls from the previous steps:
```
# Every customer and product must have a name.
Customer.require(Customer.name)
Product.require(Product.name)

# Every order must have a positive amount.
Order.require(Order.amount > 0.0)
```
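If you want to preview which rows would violate these rules before involving PyRel at all, a plain-Python pass over the raw rows can apply the same checks. This is only a sketch: the function name check_rows and the deliberately bad sample rows are made up for illustration, and PyRel's own error reporting will look different.

```python
def check_rows(customers, products, orders):
    """Collect human-readable problems that mirror the three requirements."""
    problems = []
    for kind, rows in (("customer", customers), ("product", products)):
        for row in rows:
            if not row.get("name"):  # missing key, None, or empty string
                problems.append(f"{kind} {row['id']} has no name")
    for row in orders:
        if not row.get("amount", 0.0) > 0.0:
            problems.append(f"order {row['id']} has a missing or non-positive amount")
    return problems


# Hypothetical rows with two deliberate problems.
bad_customers = [{"id": 1, "name": "Alice"}, {"id": 2}]
bad_products = [{"id": 101, "name": "Protein Powder"}]
bad_orders = [
    {"id": 1001, "customer_id": 1, "product_id": 101, "amount": 95.0},
    {"id": 1002, "customer_id": 2, "product_id": 101, "amount": -5.0},
]

problems = check_rows(bad_customers, bad_products, bad_orders)
for problem in problems:
    print(problem)
```

A pre-check like this is useful while iterating on input data. The requirements in the model remain the authoritative guardrail.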
Related resources:
- Define requirements

Build a customer graph

Now you’ll turn your order facts into a graph that a community detection algorithm can operate on. This graph is a derived structure. It is separate from your model schema, but built from the model’s facts.

In this tutorial, an edge means:

- Two customers are connected if they bought the same product.
- The edge weight is the number of shared products.

Follow these steps to build the graph from your order facts:

Create a graph object

Start by creating a Graph that uses customers as nodes. You’ll build an undirected, weighted graph and store edges as facts in your model.

Add the following code below your requirements:
```
graph = Graph(
    model,
    directed=False,
    weighted=True,
    node_concept=Customer,
    aggregator="sum",
)
```
In this example:
- directed=False means the edge A → B is the same as B → A.
- weighted=True means edges can have a numeric weight.
- node_concept=Customer means each node represents a Customer entity.
- aggregator="sum" collapses duplicate edges by summing their weights. This is necessary because two customers can share more than one product and therefore have multiple edges between them. In PyRel, graphs are simple, meaning they can only have one edge per node pair. The aggregator argument tells PyRel how to combine multiple edges into one.
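To make the combined effect of an undirected graph and a summing aggregator concrete, here is a plain-Python sketch of the same idea: each pair is normalized so that A-B and B-A share one key, and the weights of duplicate edges are added together. The function collapse_edges is a hypothetical helper, not part of PyRel, and it makes no claim about how PyRel stores edges internally.

```python
from collections import defaultdict


def collapse_edges(raw_edges):
    """Merge duplicate undirected edges by summing their weights."""
    combined = defaultdict(float)
    for src, dst, weight in raw_edges:
        key = (min(src, dst), max(src, dst))  # same key for A-B and B-A
        combined[key] += weight
    return dict(combined)


# Alice (1) and Bob (2) share two products; Bob and Frank (6) share one.
raw_edges = [(1, 2, 1.0), (2, 1, 1.0), (2, 6, 1.0)]
collapsed = collapse_edges(raw_edges)
print(collapsed)  # {(1, 2): 2.0, (2, 6): 1.0}
```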
Related resources:
- Create a graph

Define edges from shared products

Now you’ll define the logic for connecting customers based on shared purchases. To do that, you’ll compare pairs of orders by creating two distinct references to Order.

Add the following code below graph = Graph(...):
```
left_order = Order.ref()
right_order = Order.ref()

model.where(
    left_order.product == right_order.product,
    left_order.customer.id < right_order.customer.id,
).define(
    graph.Edge.new(
        src=left_order.customer,
        dst=right_order.customer,
        weight=1.0,
    )
)
```
In this example:
- Order.ref() creates an independent variable for the Order concept. It’s different from a normal Python variable. Think of it as a labeled placeholder. Using two refs lets left_order and right_order match different orders in the same definition, so you can compare their properties to find shared purchases.
- You can read model.where(...).define(...) as: find all matches, then create facts for each match.
- model.where(...) builds the filter. Each argument is a condition, and multiple conditions in one call are combined with AND.
- left_order.product == right_order.product matches pairs of orders that bought the same product.
- left_order.customer.id < right_order.customer.id keeps only one ordering of each customer pair, so you do not create both A-B and B-A. Because the comparison is strict, it also rules out pairing a customer with themselves.
- .define(...) says what to create for each matching pair. Here you define a graph edge fact with graph.Edge.new(...).
- Each match contributes weight=1.0 to the edge between the two customers. Because the graph uses aggregator="sum", multiple shared products add up to a larger weight.
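You can sanity-check the edges this definition should produce by replaying the same logic over the sample orders in plain Python. The sketch below is independent of PyRel: it pairs orders that share a product, keeps only pairs where the left customer ID is smaller, and adds 1.0 per match. The (customer_id, product_id) pairs are copied from order_rows; the variable names are illustrative.

```python
from collections import defaultdict

# (customer_id, product_id) pairs copied from order_rows.
purchases = [
    (1, 101), (1, 102), (2, 101), (2, 102),
    (3, 103), (3, 104), (4, 103), (4, 104),
    (5, 105), (5, 106), (6, 105), (6, 106),
    (2, 103), (4, 106), (6, 102),
]

edge_weights = defaultdict(float)
for left_customer, left_product in purchases:
    for right_customer, right_product in purchases:
        # Same product, and left < right so each pair is counted once.
        if left_product == right_product and left_customer < right_customer:
            edge_weights[(left_customer, right_customer)] += 1.0

for pair, weight in sorted(edge_weights.items()):
    print(pair, weight)
```

For the sample data this yields nine edges. The three pairs with weight 2.0, namely (1, 2), (3, 4), and (5, 6), are the customers who share both products in their cluster, and the remaining six edges come from the cross-cluster purchases.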
Related resources:
- What a Variable is
- Write conditional definitions with Model.where and Model.define

Derive customer segments

Now that you have a customer graph, you can derive a segment for each customer with Louvain community detection. Louvain assigns a community label to each node based on how strongly it is connected to other nodes.
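Louvain works by searching for a partition with high modularity, a score that compares the edge weight inside communities with what would be expected if the same edges were placed at random. The sketch below computes the standard modularity score for a toy graph in plain Python so you can see what "strongly connected" means numerically. It is not PyRel code and does not claim to reproduce PyRel's implementation; the toy graph and the function name are made up for illustration.

```python
def modularity(edges, community_of):
    """Standard weighted modularity Q for an undirected graph.

    edges: dict mapping (u, v) pairs to weights, one entry per undirected edge.
    community_of: dict mapping each node to its community label.
    """
    total_weight = sum(edges.values())  # m in the usual formula
    internal = {}  # community -> weight of edges that stay inside it
    degree = {}    # community -> sum of weighted degrees of its nodes
    for (u, v), weight in edges.items():
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0.0) + weight
        degree[cv] = degree.get(cv, 0.0) + weight
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + weight
    return sum(
        internal.get(c, 0.0) / total_weight - (degree[c] / (2 * total_weight)) ** 2
        for c in degree
    )


# Toy graph: two triangles (a, b, c) and (x, y, z) joined by one bridge edge.
toy_edges = {
    ("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
    ("x", "y"): 1.0, ("y", "z"): 1.0, ("x", "z"): 1.0,
    ("c", "x"): 1.0,
}
two_groups = {"a": 0, "b": 0, "c": 0, "x": 1, "y": 1, "z": 1}
one_group = {node: 0 for node in two_groups}

print(round(modularity(toy_edges, two_groups), 3))  # 0.357
print(round(modularity(toy_edges, one_group), 3))   # 0.0
```

Splitting the toy graph at the bridge scores higher than placing all six nodes in one community, which is the kind of trade-off a modularity-based method weighs when it assigns labels.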

Run Louvain and store the label on nodes

Add the following code below your edge definition:
```
graph.Node.community_label = graph.louvain()
```
In this example:
- graph.louvain() returns a relationship that maps each node to a community label.
- graph.Node is the concept that represents the nodes in your graph. You can define new properties and relationships on it like any other concept. Here you define a new relationship called community_label to hold the Louvain output.
Related resources:
- Run a graph algorithm

Turn labels into segment entities

Segments are first-class entities in your semantic model. You’ll create one CustomerSegment entity per distinct label.

Add the following code below the Louvain line:
```
model.define(CustomerSegment.new(id=graph.Node.community_label))
```

Attach the segment to each customer

Finally, you attach each customer to the segment with the same label.

Add the following code below the CustomerSegment.new(...) definition:
```
model.where(graph.Node == Customer).define(
    Customer.segment(CustomerSegment.filter_by(id=graph.Node.community_label))
)
```

Summarize segment value

At this point, every customer belongs to a segment. Now you’ll build two small tables to help you understand the result:

- A segment-level summary with customers, orders, and revenue.
- A customer-to-segment table you can scan to sanity-check assignments.

Define a segment-level summary query

Add the following code near the bottom of your file:
```
segment_value = (
    model.where(
        Customer.segment == CustomerSegment,
        Order.customer == Customer,
    )
    .select(
        CustomerSegment.id.alias("segment_id"),
        aggregates.count(distinct(Customer)).per(CustomerSegment).alias("customers"),
        aggregates.count(Order).per(CustomerSegment).alias("orders"),
        aggregates.sum(Order.amount).per(CustomerSegment).alias("revenue"),
        aggregates.avg(Order.amount).per(CustomerSegment).alias("avg_order_value"),
    )
    .to_df()
    .sort_values("revenue", ascending=False)
    .reset_index(drop=True)
)
```
In this example:
- model.where(Customer.segment == CustomerSegment, Order.customer == Customer) joins customers to their segment and to their orders.
- .per(CustomerSegment) groups each aggregate by segment. Without .per(...), you would get one overall total.
- distinct(Customer) is important. Because each customer can match multiple orders, count(distinct(Customer)) counts customers instead of customer-order pairs.
- .to_df() materializes the result. This triggers PyRel to compile the model and run the query, returning a pandas DataFrame you can work with in Python. Any compilation errors or requirement violations in your model also surface here.
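The grouping and distinct-count behavior can be checked by hand with a plain-Python equivalent. The sketch below is not PyRel code: it takes (customer_id, amount) pairs copied from order_rows and the customer-to-segment assignments from the sample output at the end of this tutorial, then computes the same four aggregates per segment. Variable names are illustrative.

```python
from collections import defaultdict

# (customer_id, amount) pairs copied from order_rows.
orders = [
    (1, 95.0), (1, 40.0), (2, 90.0), (2, 42.0),
    (3, 160.0), (3, 240.0), (4, 155.0), (4, 235.0),
    (5, 24.0), (5, 130.0), (6, 28.0), (6, 125.0),
    (2, 145.0), (4, 120.0), (6, 39.0),
]
# Segment assignments taken from the tutorial's sample output.
segment_of = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2}

customers = defaultdict(set)  # a set keeps each customer once (distinct)
amounts = defaultdict(list)   # one entry per customer-order pair
for customer_id, amount in orders:
    segment = segment_of[customer_id]
    customers[segment].add(customer_id)
    amounts[segment].append(amount)

summary = {
    segment: {
        "customers": len(customers[segment]),
        "orders": len(amounts[segment]),
        "revenue": sum(amounts[segment]),
        "avg_order_value": sum(amounts[segment]) / len(amounts[segment]),
    }
    for segment in amounts
}
for segment, row in sorted(summary.items(), key=lambda kv: -kv[1]["revenue"]):
    print(segment, row)
```

Counting customers without the set would report 7 and 8 (one per order) instead of 3 and 3, which is the mistake the distinct count avoids.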
Related resources:
- Query a model
- Aggregate and group data

Define a customer-to-segment membership table

This table is a convenient debug view of the segmentation.

Add the following code below the segment_value = (...) block:

```
customer_segment_membership = (
    model.where(Customer.segment(CustomerSegment))
    .select(
        Customer.id.alias("customer_id"),
        Customer.name.alias("customer_name"),
        Customer.region,
        CustomerSegment.id.alias("segment_id"),
    )
    .to_df()
    .sort_values(["segment_id", "customer_id"])
)
```

This query does not aggregate. It returns one row per customer.
Sorting by segment_id makes it easy to see which customers ended up together.

Print both tables

Add the following code below the membership table definition:

```
print("\nCustomer -> Segment assignments")
print(customer_segment_membership.to_string(index=False))

print("\nSegment value analysis")
print(segment_value.to_string(index=False))
```

Run the finished script

Run the script from your project root:

```
python segment_customers.py
```

You’ll see two printed tables:

```
Customer -> Segment assignments
customer_id customer_name region segment_id
          1         Alice  North          1
          2           Bob  North          1
          3         Carol  South          1
          4           Dan  South          2
          5           Eve   West          2
          6         Frank   West          2

Segment value analysis
segment_id customers orders  revenue  avg_order_value
         2         3      8    856.0            107.0
         1         3      7    812.0            116.0
```

Next steps

Check out our in-depth guides to learn more about modeling and reasoning with PyRel:

Build a semantic model

Use advanced reasoning