Meet PyRel

Welcome to RelationalAI! If you’re new to PyRel, you’re in the right spot. This tutorial gently guides you through the ins and outs of using PyRel to model your business domain and answer valuable business questions that support data-driven decision making.

You’ll learn how to:

  • Declare a semantic model with concepts, properties, and relationships.
  • Load data as base facts in your model.
  • Build a graph from your model’s facts.
  • Run a graph algorithm and store the results back in your model.
  • Write queries that summarize and analyze your results.

Let’s get started!

Before you write any code, it helps to have a simple mental model of how PyRel works.

PyRel is a Python library for building declarative models over your data. Instead of writing loops that compute outputs step-by-step, you declare facts and logic, and then ask PyRel to evaluate those declarations.

In practice, you use PyRel to:

  • Create a model workspace. Use Model to hold your schema, definitions, and requirements.
  • Declare a semantic schema. Declare concepts, properties, and relationships so you can talk about your data in domain terms.
  • Load data from sources into the model. Start with in-memory rows for quick iteration or map from real data sources.
  • Define the logic that governs your model. Definitions derive new facts from existing facts using declarative logic. Requirements are reusable checks that must hold. They help you catch missing or inconsistent data early.
  • Query and materialize results. Build queries to describe what you want to see, then materialize results in Python to evaluate the query. Query results are computed on-demand based on the model’s data and declared logic.

Imagine you run an online store. You want to segment your customers so you can market to them more effectively. You have an orders table, but need a way to turn that raw data into actionable insights.

In this tutorial, you will segment customers based on co-purchases. The idea is simple:

  • If two customers buy many of the same products, they are probably similar.
  • Similar customers should end up in the same segment.

To make that “similarity” explicit, you will use graph reasoning to build a customer graph:

  • Each customer is a node.
  • You connect two customers when they bought the same product.
  • The more products they share, the stronger the connection.

Then you will run a community detection algorithm to turn that connected structure into a segment label per customer. Finally, you will analyze the value of each segment by looking at how much revenue it generated.

You will build your model in one runnable script named segment_customers.py. You’ll start small, then add a few lines at a time until the script produces segment assignments and a simple segment summary.

  1. Create a raiconfig.yaml file

    From your project root, run:

    rai init

    This creates a raiconfig.yaml template. Fill in the fields for your Snowflake account and authentication. For more information, see the Configuration guides.

  2. Create segment_customers.py and add imports

    Create a file named segment_customers.py. Then add these imports at the top:

    segment_customers.py
    from relationalai.semantics import Float, Integer, Model, String, distinct
    from relationalai.semantics.reasoners.graph import Graph
    from relationalai.semantics.std import aggregates
  3. Create a model

    Add the following code below your imports:

    segment_customers.py
    model = Model("retail_customer_segmentation")

Now you will declare the schema you will use for the rest of the tutorial. In PyRel, the schema is the combination of concepts, properties, and relationships that lets you talk about raw data in domain terms (like customers and orders).

In this tutorial, your schema includes:

  • Named entity types called concepts:
    • Customers, products, and orders
    • Customer segments (the output of community detection)
  • Single-valued attributes called properties:
    • Customer name and region
    • Product name and category
    • Order amount
  • Links between concepts called relationships:
    • Orders link to the customer who placed them and the product they contain.
    • Customers link to the segment they belong to.

Build that schema in three small steps:

  1. Declare Customer, Product, and Order concepts and their properties

    Concepts are the entity types in your domain, like customers and orders. Properties are the single-valued attributes you want to query later, like a customer’s region or an order amount.

    Add the following code below the model = ... line you added earlier:

    Customer = model.Concept("Customer", identify_by={"id": Integer})
    Customer.name = model.Property(f"{Customer} has name {String:name}")
    Customer.region = model.Property(f"{Customer} has region {String:region}")
    Product = model.Concept("Product", identify_by={"id": Integer})
    Product.name = model.Property(f"{Product} has name {String:name}")
    Product.category = model.Property(f"{Product} in category {String:category}")
    Order = model.Concept("Order", identify_by={"id": Integer})
    Order.amount = model.Property(f"{Order} has amount {Float:amount}")
    • identify_by={"id": Integer} gives each concept a stable identity key.
    • This does not create any entities yet. It just declares the schema you will load data into later.
  2. Declare relationships between orders, customers, and products

    Relationships are the links between entity types. Here you declare that each order points to (1) the customer who placed it and (2) the product it contains.

    Add the following code below the concept declarations:

    Order.customer = model.Relationship(f"{Order} placed by {Customer}")
    Order.product = model.Relationship(f"{Order} contains {Product}")
    • The f"{...}" argument is a reading: a compact, human-readable description of what the relationship means.
    • The {Order}, {Customer}, and {Product} parts are fields. They tell PyRel what concept types participate in the relationship.
    • You can name a field by adding :<name> inside the braces (for example, {Order:order}), but this tutorial keeps the readings simple.
  3. Declare CustomerSegment and attach segments to customers

    You will store the output of community detection as a reusable concept in your model. In other words: segments are first-class entities, and each customer belongs to one segment.

    Add the following code below the relationship declarations:

    CustomerSegment = model.Concept("CustomerSegment", identify_by={"id": Integer})
    Customer.segment = model.Relationship(f"{Customer} belongs to {CustomerSegment}")

Now you will load a small set of sample rows into your model. In PyRel, these rows are your base facts: the starting truth your later definitions and queries build on.

Follow these steps to load the data and turn it into base facts in your model:

  1. Add customer rows

    Start by creating a small “customers” dataset. Each row is one customer, and the id value will be used as identity.

    Add the following rows below your schema declarations:

    customer_rows = [
        {"id": 1, "name": "Alice", "region": "North"},
        {"id": 2, "name": "Bob", "region": "North"},
        {"id": 3, "name": "Carol", "region": "South"},
        {"id": 4, "name": "Dan", "region": "South"},
        {"id": 5, "name": "Eve", "region": "West"},
        {"id": 6, "name": "Frank", "region": "West"},
    ]
  2. Add product rows

    Next, add products. These rows are referenced by orders later.

    Add the following below customer_rows:

    product_rows = [
        {"id": 101, "name": "Protein Powder", "category": "Fitness"},
        {"id": 102, "name": "Yoga Mat", "category": "Fitness"},
        {"id": 103, "name": "Wireless Earbuds", "category": "Electronics"},
        {"id": 104, "name": "Smart Watch", "category": "Electronics"},
        {"id": 105, "name": "Espresso Beans", "category": "Food"},
        {"id": 106, "name": "Coffee Grinder", "category": "Food"},
    ]
  3. Add order rows (including cross-cluster purchases)

    Now add orders. Notice that each order row includes customer_id and product_id. These are foreign-key style references that you will turn into real relationships in a later step.

    Add the following code below product_rows:

    order_rows = [
        # Fitness-oriented cluster
        {"id": 1001, "customer_id": 1, "product_id": 101, "amount": 95.0},
        {"id": 1002, "customer_id": 1, "product_id": 102, "amount": 40.0},
        {"id": 1003, "customer_id": 2, "product_id": 101, "amount": 90.0},
        {"id": 1004, "customer_id": 2, "product_id": 102, "amount": 42.0},
        # Electronics-oriented cluster
        {"id": 1005, "customer_id": 3, "product_id": 103, "amount": 160.0},
        {"id": 1006, "customer_id": 3, "product_id": 104, "amount": 240.0},
        {"id": 1007, "customer_id": 4, "product_id": 103, "amount": 155.0},
        {"id": 1008, "customer_id": 4, "product_id": 104, "amount": 235.0},
        # Food-oriented cluster
        {"id": 1009, "customer_id": 5, "product_id": 105, "amount": 24.0},
        {"id": 1010, "customer_id": 5, "product_id": 106, "amount": 130.0},
        {"id": 1011, "customer_id": 6, "product_id": 105, "amount": 28.0},
        {"id": 1012, "customer_id": 6, "product_id": 106, "amount": 125.0},
        # A few cross-cluster purchases to make segmentation realistic
        {"id": 1013, "customer_id": 2, "product_id": 103, "amount": 145.0},
        {"id": 1014, "customer_id": 4, "product_id": 106, "amount": 120.0},
        {"id": 1015, "customer_id": 6, "product_id": 102, "amount": 39.0},
    ]
  4. Turn the rows into model facts

    At this point, you have plain Python data. The next step is to convert it into something PyRel can define as facts.

    Add the following code below the row lists:

    # Wrap raw rows in model.data() to get table-like objects PyRel can work with
    customer_data = model.data(customer_rows)
    product_data = model.data(product_rows)
    order_data = model.data(order_rows)
    # Explicitly map columns to concept properties with keyword arguments
    model.define(
        Customer.new(
            id=customer_data.id,
            name=customer_data.name,
            region=customer_data.region,
        )
    )
    # Implicitly map columns to concept properties with .to_schema()
    model.define(Product.new(product_data.to_schema()))
    • model.data(...) wraps your Python rows in a table-like object. You can access columns as attributes (for example, customer_data.id).
    • Customer.new(id=..., name=..., region=...) uses keyword arguments to map columns to properties. This is the most explicit form for defining entities: you decide exactly which column populates each field, so you can map columns with different names or apply simple column transformations.
    • .to_schema() turns the table-like object into a mapping of column names to column values. Product.new(product_data.to_schema()) uses that to map matching column names automatically. This is a more implicit form of defining entities when you know that your column names already match your schema.
    • model.define(...) adds the resulting entities to the model as base facts.
    • No entities have been computed yet. You are just declaring what the facts are. PyRel will compute the actual entities when you materialize results later.
    • You can mix and match the explicit keyword-argument style and the implicit .to_schema() style as needed. An example is shown in the next step.
  5. Define orders with foreign-key style references

    Orders are slightly different because each row has two references: a customer and a product. Here you define the order facts and also define the links to the matching Customer and Product entities.

    Add the following code below the model.define(Product.new(...)) line:

    model.define(
        Order.new(
            order_data.to_schema(exclude=["customer_id", "product_id"]),
            customer=Customer.filter_by(id=order_data.customer_id),
            product=Product.filter_by(id=order_data.product_id),
        )
    )
    • exclude=["customer_id", "product_id"] keeps those raw ID columns out of the Order concept.
    • Customer.filter_by(id=order_data.customer_id) is how you “look up” the matching customer entity.
    • product=Product.filter_by(...) does the same lookup for products.
  6. Add a few simple requirements

    Requirements are checks that act like guardrails for your model. They help you fail fast when the input data is missing something you expect, like a customer name. They also catch values that should never happen, like a negative order amount.

    PyRel enforces requirements when you materialize results for a query. If a requirement fails, you will get an error.

    In this tutorial, you will require that customers and products have names, and that order amounts are positive.

    Add the following code below the model.define(...) calls from the previous steps:

    # Every customer and product must have a name.
    Customer.require(Customer.name)
    Product.require(Product.name)
    # Every order must have a positive amount.
    Order.require(Order.amount > 0.0)
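
To make the guardrail idea concrete, here is a minimal plain-Python sketch of the same three checks applied directly to row dictionaries. It does not use PyRel and does not show how PyRel enforces requirements. The helper find_violations and the deliberately bad rows are hypothetical, made up for illustration.

```python
# Hypothetical rows for illustration only. The None name and the
# negative amount are deliberate errors so the checks have something to find.
customer_rows = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": None},
]
product_rows = [
    {"id": 101, "name": "Protein Powder"},
]
order_rows = [
    {"id": 1001, "amount": 95.0},
    {"id": 1002, "amount": -5.0},
]


def find_violations(customers, products, orders):
    """Return a readable message for every row that breaks a check."""
    violations = []
    # Every customer and product must have a name.
    for kind, rows in (("customer", customers), ("product", products)):
        for row in rows:
            if row.get("name") is None:
                violations.append(f"{kind} {row['id']}: missing name")
    # Every order must have a positive amount.
    for row in orders:
        if not row["amount"] > 0.0:
            violations.append(f"order {row['id']}: amount must be positive")
    return violations


violations = find_violations(customer_rows, product_rows, order_rows)
for message in violations:
    print(message)
```

Running a check like this on raw rows before loading them can be a quick way to see which inputs would trip the requirements, while the requirements themselves keep guarding the model.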

Now you will turn your order facts into a graph that a community detection algorithm can operate on. This graph is a derived structure. It is separate from your model schema, but built from the model’s facts.

In this tutorial, an edge means:

  • Two customers are connected if they bought the same product.
  • The edge weight is the number of shared products.
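
As a sanity check on that edge definition, the following plain-Python sketch computes the shared-product count for every customer pair in the tutorial's order rows. It is an independent cross-check using set intersection, not PyRel code. The purchases list is simply the (customer_id, product_id) pairs copied from order_rows, and the variable names are illustrative.

```python
from collections import defaultdict
from itertools import combinations

# (customer_id, product_id) pairs copied from order_rows in this tutorial.
purchases = [
    (1, 101), (1, 102), (2, 101), (2, 102),
    (3, 103), (3, 104), (4, 103), (4, 104),
    (5, 105), (5, 106), (6, 105), (6, 106),
    (2, 103), (4, 106), (6, 102),
]

# Collect the set of products each customer bought.
products_by_customer = defaultdict(set)
for customer_id, product_id in purchases:
    products_by_customer[customer_id].add(product_id)

# One undirected edge per customer pair that shares at least one product.
edge_weights = {}
for a, b in combinations(sorted(products_by_customer), 2):
    shared = products_by_customer[a] & products_by_customer[b]
    if shared:
        edge_weights[(a, b)] = len(shared)

for pair, weight in edge_weights.items():
    print(pair, weight)
```

For this data the sketch yields nine edges: three of weight 2 inside the fitness, electronics, and food clusters, and six of weight 1 created by the cross-cluster purchases.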

Follow these steps to build the graph from your order facts:

  1. Create a graph object

    Start by creating a Graph that uses customers as nodes. You will build an undirected, weighted graph and store edges as facts in your model.

    Add the following code below your requirements:

    graph = Graph(
        model,
        directed=False,
        weighted=True,
        node_concept=Customer,
        aggregator="sum",
    )
    • directed=False means the edge A → B is the same as B → A.
    • weighted=True means edges can have a numeric weight.
    • node_concept=Customer means each node represents a Customer entity.
    • aggregator="sum" collapses duplicate edges by summing their weights. This is necessary because two customers can share more than one product and therefore have multiple edges between them. In PyRel, graphs are simple, meaning they can only have one edge per node pair. The aggregator argument tells PyRel how to combine multiple edges into one.
  2. Define edges from shared products

    Now you will define the logic for connecting customers based on shared purchases. To do that, you will compare pairs of orders by creating two distinct references to Order.

    Add the following code below graph = Graph(...):

    left_order = Order.ref()
    right_order = Order.ref()
    model.where(
        left_order.product == right_order.product,
        left_order.customer.id < right_order.customer.id,
    ).define(
        graph.Edge.new(
            src=left_order.customer,
            dst=right_order.customer,
            weight=1.0,
        )
    )
    • Order.ref() creates an independent variable for the Order concept. This is different from a normal Python variable. Think of it as a labeled placeholder. Using two refs lets left_order and right_order match different orders in the same definition, so you can compare their properties to find shared purchases.

    • Read model.where(...).define(...) as: “find all matches, then create facts for each match.”

    • model.where(...) builds the filter. Each argument is a condition, and multiple conditions in one call are combined with AND. In this case:

      • left_order.product == right_order.product matches pairs of orders for the same product.
      • left_order.customer.id < right_order.customer.id keeps only one ordering of each customer pair, so you do not create both A–B and B–A. The strict < also prevents pairing a customer with themselves.
    • .define(...) says what to create for each matching pair. Here you define a graph edge fact with graph.Edge.new(...).

    • Each match contributes weight=1.0 to the edge between the two customers. Because the graph uses aggregator="sum", multiple shared products add up to a larger weight.
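
The pattern in this step (pair up orders, filter, emit one unit of weight per match, then sum duplicates) can be mimicked in plain Python. The sketch below is only an analogy on a tiny hypothetical dataset with customers "A", "B", and "C". It does not use PyRel and does not describe how PyRel evaluates the definition.

```python
from collections import Counter

# Hypothetical orders, made up for illustration.
orders = [
    {"id": 1, "customer": "A", "product": "p1"},
    {"id": 2, "customer": "A", "product": "p2"},
    {"id": 3, "customer": "B", "product": "p1"},
    {"id": 4, "customer": "B", "product": "p2"},
    {"id": 5, "customer": "C", "product": "p2"},
]

# Analogue of where(...).define(...): pair every order with every order,
# keep pairs for the same product, and keep one ordering per customer pair.
raw_edges = [
    (left["customer"], right["customer"], 1.0)
    for left in orders
    for right in orders
    if left["product"] == right["product"]
    and left["customer"] < right["customer"]
]

# Analogue of aggregator="sum": collapse duplicate edges by adding weights.
edge_weights = Counter()
for src, dst, weight in raw_edges:
    edge_weights[(src, dst)] += weight

# Without the "<" filter, same-product pairs would also include both
# directions and every order paired with itself.
unfiltered_matches = sum(
    1 for left in orders for right in orders
    if left["product"] == right["product"]
)

print(dict(edge_weights))
print(len(raw_edges), unfiltered_matches)
```

Here A and B share two products, so two raw edges collapse into one edge of weight 2.0. The unfiltered count shows how many extra matches the less-than condition removes.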

Now that you have a customer graph, you can derive a segment for each customer with Louvain community detection. Louvain assigns a community label to each node based on how strongly it is connected to other nodes.

  1. Run Louvain and store the label on nodes

    Add the following code below your edge definition:

    graph.Node.community_label = graph.louvain()
    • graph.louvain() returns a relationship that maps each node to a community label.
    • graph.Node is the graph’s node concept. It is a special concept that represents the nodes in your graph. You can define new properties and relationships on it like any other concept. Here you define a new relationship called community_label to hold the Louvain output.
  2. Turn labels into segment entities

    Segments are first-class entities in your semantic model. You will create one CustomerSegment entity per distinct label.

    Add the following code below the Louvain line:

    model.define(CustomerSegment.new(id=graph.Node.community_label))
  3. Attach the segment to each customer

    Finally, you attach each customer to the segment with the same label.

    Add the following code below the CustomerSegment.new(...) definition:

    model.where(graph.Node == Customer).define(
        Customer.segment(CustomerSegment.filter_by(id=graph.Node.community_label))
    )

At this point, every customer belongs to a segment. Now you will build two small tables to help you understand the result:

  • A segment-level summary with customers, orders, and revenue.
  • A customer → segment table you can scan to sanity-check assignments.
  1. Define a segment-level summary query

    Add the following code near the bottom of your file:

    segment_value = (
        model.where(
            Customer.segment == CustomerSegment,
            Order.customer == Customer,
        )
        .select(
            CustomerSegment.id.alias("segment_id"),
            aggregates.count(distinct(Customer)).per(CustomerSegment).alias("customers"),
            aggregates.count(Order).per(CustomerSegment).alias("orders"),
            aggregates.sum(Order.amount).per(CustomerSegment).alias("revenue"),
            aggregates.avg(Order.amount).per(CustomerSegment).alias("avg_order_value"),
        )
        .to_df()
        .sort_values("revenue", ascending=False)
        .reset_index(drop=True)
    )
    • model.where(Customer.segment == CustomerSegment, Order.customer == Customer) joins customers to their segment and to their orders.

    • .per(CustomerSegment) groups each aggregate by segment. Without .per(...), you would get one overall total.

    • distinct(Customer) is important. Because each customer can match multiple orders, count(distinct(Customer)) counts customers instead of customer-order pairs.

    • .to_df() materializes the result. This triggers PyRel to compile the model and run the query, returning a pandas DataFrame you can work with in Python. This is also where any compilation errors or requirement violations surface if there are issues with your model.

  2. Define a customer → segment membership table

    This table is a convenient “debug view” of the segmentation.

    Add the following code below the segment_value = (...) block:

    customer_segment_membership = (
        model.where(Customer.segment(CustomerSegment))
        .select(
            Customer.id.alias("customer_id"),
            Customer.name.alias("customer_name"),
            Customer.region,
            CustomerSegment.id.alias("segment_id"),
        )
        .to_df()
        .sort_values(["segment_id", "customer_id"])
    )
    • This query does not aggregate. It returns one row per customer.

    • Sorting by segment_id makes it easy to see which customers ended up together.

  3. Print both tables

    Add the following code below the membership table definition:

    print("\nCustomer -> Segment assignments")
    print(customer_segment_membership.to_string(index=False))
    print("\nSegment value analysis")
    print(segment_value.to_string(index=False))

Run the script from your project root:

python segment_customers.py

You will see two printed tables:

Customer -> Segment assignments
customer_id  customer_name  region  segment_id
          1          Alice   North           1
          2            Bob   North           1
          3          Carol   South           1
          4            Dan   South           2
          5            Eve    West           2
          6          Frank    West           2

Segment value analysis
segment_id  customers  orders  revenue  avg_order_value
         2          3       8    856.0            107.0
         1          3       7    812.0            116.0
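
The segment summary can be cross-checked by hand. The plain-Python sketch below takes the segment assignments exactly as printed above, groups the tutorial's order amounts by segment, and recomputes each column. It does not use PyRel. The names segment_of and summary are illustrative, and the sets play the role of distinct(Customer).

```python
from collections import defaultdict

# customer_id -> segment_id, copied from the printed assignments above.
segment_of = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2}

# (customer_id, amount) pairs copied from order_rows in this tutorial.
orders = [
    (1, 95.0), (1, 40.0), (2, 90.0), (2, 42.0),
    (3, 160.0), (3, 240.0), (4, 155.0), (4, 235.0),
    (5, 24.0), (5, 130.0), (6, 28.0), (6, 125.0),
    (2, 145.0), (4, 120.0), (6, 39.0),
]

amounts = defaultdict(list)    # segment_id -> order amounts
customers = defaultdict(set)   # segment_id -> distinct customer ids
for customer_id, amount in orders:
    segment_id = segment_of[customer_id]
    amounts[segment_id].append(amount)
    # A set counts each customer once, even when they match several orders.
    customers[segment_id].add(customer_id)

summary = {
    segment_id: {
        "customers": len(customers[segment_id]),
        "orders": len(values),
        "revenue": sum(values),
        "avg_order_value": sum(values) / len(values),
    }
    for segment_id, values in amounts.items()
}

# Print segments from highest to lowest revenue, like the table above.
for segment_id, row in sorted(summary.items(), key=lambda kv: -kv[1]["revenue"]):
    print(segment_id, row)
```

Counting the entries of amounts instead of the customer sets would report 7 and 8 customers rather than 3 and 3, which is the customer-order double counting that distinct(Customer) avoids.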