
Work with strings

Use string expressions and the std.strings library to normalize, compare, and build text values in PyRel. In this guide, you’ll learn practical patterns for filtering facts, extracting segments, and avoiding common issues with whitespace, case, and missing values.

A string expression is any PyRel Expression that evaluates to a string value. You can pass string expressions to select() and define(), and use them in where() conditions.

A string expression usually shows up in one of these ways:

  • A literal string: A Python string constant like "open" or "billing:".
  • A relationship chain that ends in text: A traversal like Ticket.subject or Ticket.account.tier.
  • A derived string expression: A function call that returns a string, like strings.lower(Ticket.subject).

The following example shows one instance of each kind of string expression:

from relationalai.semantics import Model
from relationalai.semantics.std import strings
m = Model("SupportModel")
Account = m.Concept("Account")
Ticket = m.Concept("Ticket")
Ticket.account = m.Relationship(f"{Ticket} belongs to {Account:account}")
# (4 lines collapsed in the original page)
m.define(
    Account.new(id=1, tier="Enterprise"),
    Ticket.new(id=101, subject="[P0] Outage", account=Account.new(id=1)),
)
literal = "enterprise" # A literal string expression
chain = Ticket.account.tier # A relationship chain that ends in a string property
derived = strings.lower(Ticket.account.tier) # A derived string expression
df = m.select(literal, chain, derived).to_df()
print(df)

Use strings.string() when you need to treat a non-string value as text. This is most common when you build labels with strings.concat() and strings.join().

Values you can convert to strings include:

  • Numeric identifiers like ticket IDs.
  • Numeric measures like counts and amounts.
  • Dates and datetimes.

The following example builds a stable summary string from mixed-type properties:

from relationalai.semantics import Date, Integer, Model
from relationalai.semantics.std import strings, datetime as dt
m = Model("SupportModel")
Ticket = m.Concept("Ticket", identify_by={"id": Integer})
Ticket.opened_on = m.Property(f"{Ticket} opened on {Date}")
Ticket.sla_hours = m.Property(f"{Ticket} has SLA of {Integer:hours}")
# (4 lines collapsed in the original page)
m.define(
    Ticket.new(id=601, opened_on=dt.date(2026, 2, 26), sla_hours=4),
    Ticket.new(id=602, opened_on=dt.date(2026, 2, 27), sla_hours=12),
)
# Build a summary string that combines the ticket ID, opened date, and SLA hours.
ticket_key = strings.concat("ticket#", strings.string(Ticket.id))
sla_hours_str = strings.string(Ticket.sla_hours)
opened_on_str = strings.string(Ticket.opened_on)
summary = strings.join([ticket_key, opened_on_str, sla_hours_str], separator=" | ")
# Define a summary property that downstream logic can select and filter on.
m.define(Ticket.summary(summary))
df = m.select(Ticket.id, Ticket.summary).to_df()
print(df)
  • strings.string(...) converts an integer and a date into string expressions.
  • strings.concat(...) and strings.join(...) build a single derived string you can select or filter on.
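
As a rough analogy only, the plain-Python sketch below shows why explicit conversion matters when you mix text with numbers and dates. It uses Python's built-in str() rather than PyRel, so it says nothing definitive about how strings.string() renders a date; the values mirror ticket 601 from the example above.

```python
from datetime import date

# Sample values mirroring ticket 601 in the example above.
ticket_id = 601
opened_on = date(2026, 2, 26)
sla_hours = 4

# Plain Python refuses to concatenate str and int, so conversion must be explicit.
try:
    "ticket#" + ticket_id
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Convert each value to text first, then join with the same " | " separator.
parts = ["ticket#" + str(ticket_id), str(opened_on), str(sla_hours)]
summary = " | ".join(parts)

print(mixed_ok)  # False
print(summary)   # ticket#601 | 2026-02-26 | 4
```

The date text here is Python's ISO formatting; the text that PyRel produces for a Date may differ, so check the selected output before you filter on it.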

Use functions like strings.strip(), strings.lower(), and strings.upper() to normalize text before you compare it in conditions:

from relationalai.semantics import Integer, Model
from relationalai.semantics.std import strings
m = Model("SupportModel")
Ticket = m.Concept("Ticket", identify_by={"id": Integer})
# (6 lines collapsed in the original page)
m.define(
    Ticket.new(id=201, status_text=" Open"),
    Ticket.new(id=202, status_text="open "),
    Ticket.new(id=203, status_text="OPEN"),
    Ticket.new(id=204, status_text="Closed"),
)
# Normalize status text.
status_norm = strings.lower(strings.strip(Ticket.status_text))
# Define a normalized status property that downstream logic can reuse.
m.define(Ticket.status_norm(status_norm))
df = m.select(
    Ticket.id,
    Ticket.status_text.alias("status_raw"),
    Ticket.status_norm.alias("status_normalized"),
).to_df()
print(df)
  • strings.strip() removes leading and trailing whitespace that would cause mismatches.
  • strings.lower() makes the comparison case-insensitive so you can match "Open", "open", and "OPEN" with the same condition.
  • m.define(...) assigns the normalized expression to Ticket.status_norm so you can reuse it in multiple conditions without repeating the normalization logic.
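
To see why normalization comes before comparison, the following plain-Python sketch applies the same strip-then-lowercase order to the four sample status values. It is an illustration with Python's own str methods, not PyRel; the normalize() helper is hypothetical, and Python's strip() may treat some whitespace characters differently from strings.strip().

```python
# The four raw status values from the example above, keyed by ticket ID.
raw_statuses = {201: " Open", 202: "open ", 203: "OPEN", 204: "Closed"}


def normalize(text: str) -> str:
    """Trim surrounding whitespace, then lowercase (same order as the PyRel expression)."""
    return text.strip().lower()


normalized = {tid: normalize(s) for tid, s in raw_statuses.items()}

# Comparing the raw text to "open" misses every ticket.
raw_matches = [tid for tid, s in raw_statuses.items() if s == "open"]
# Comparing the normalized text catches all three spellings.
norm_matches = [tid for tid, s in normalized.items() if s == "open"]

print(raw_matches)   # []
print(norm_matches)  # [201, 202, 203]
```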

Build new string values with functions like strings.concat(), strings.join(), and strings.replace() to create derived labels or classifications:

from relationalai.semantics import Integer, Model
from relationalai.semantics.std import strings
m = Model("SupportModel")
Ticket = m.Concept("Ticket", identify_by={"id": Integer})
# (4 lines collapsed in the original page)
m.define(
    Ticket.new(id=301, subject="[P0] Outage: login ", channel="email"),
    Ticket.new(id=302, subject="Billing: invoice mismatch", channel="web"),
)
# Clean up and combine text properties into a display label.
subject_clean = strings.strip(Ticket.subject)
subject_clean = strings.replace(subject_clean, "[P0]", "P0")
channel_tag = strings.upper(strings.strip(Ticket.channel))
label = strings.join(
    [
        strings.concat("ticket#", strings.string(Ticket.id)),
        subject_clean,
        channel_tag,
    ],
    separator=" | ",
)
# Define a display label property that downstream logic can select and filter on.
m.define(Ticket.display_label(label))
df = m.select(Ticket.display_label).to_df()
print(df)
  • strings.string(Ticket.id) converts a non-string value into a string for label building.
  • strings.replace(...) rewrites the "[P0]" tag as "P0" so the brackets do not appear in the derived label.
  • strings.join([...], separator=" | ") produces one stable label that downstream filters can use.
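
For intuition about what the label expression computes per ticket, here is a plain-Python sketch of the same strip, replace, upper, and join steps. It is an analogy built on Python's str methods, not PyRel output; the display_label() helper is hypothetical.

```python
# The two sample tickets from the example above.
tickets = [
    {"id": 301, "subject": "[P0] Outage: login ", "channel": "email"},
    {"id": 302, "subject": "Billing: invoice mismatch", "channel": "web"},
]


def display_label(ticket: dict) -> str:
    """Mirror the steps above: trim, rewrite the tag, uppercase the channel, join."""
    subject_clean = ticket["subject"].strip().replace("[P0]", "P0")
    channel_tag = ticket["channel"].strip().upper()
    return " | ".join([f"ticket#{ticket['id']}", subject_clean, channel_tag])


labels = [display_label(t) for t in tickets]
for label in labels:
    print(label)
# ticket#301 | P0 Outage: login | EMAIL
# ticket#302 | Billing: invoice mismatch | WEB
```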

Use functions like strings.like(), strings.startswith(), and strings.endswith() to filter facts based on text conditions:

from relationalai.semantics import Integer, Model, String
from relationalai.semantics.std import strings
m = Model("SupportModel")
Ticket = m.Concept("Ticket", identify_by={"id": Integer})
Ticket.subject = m.Property(f"{Ticket} has {String:subject}")
PriorityTicket = m.Concept("PriorityTicket", extends=[Ticket])
# (7 lines collapsed in the original page)
m.define(
    Ticket.new(id=401, subject="[P0] Outage: login"),
    Ticket.new(id=402, subject="Re: [P0] outage"),
    Ticket.new(id=403, subject="Customer outage report (urgent)"),
    Ticket.new(id=404, subject="test outage - ignore"),
    Ticket.new(id=405, subject="Question about billing"),
)
# Normalize subject text for consistent matching.
subject_norm = strings.lower(strings.strip(Ticket.subject))
# Define PriorityTicket based on normalized subject patterns.
m.where(
    strings.like(subject_norm, r"%outage%"),
    (
        strings.startswith(subject_norm, "[p0]")
        | strings.endswith(subject_norm, "(urgent)")
    ),
).define(
    PriorityTicket(Ticket)
)
df = m.select(PriorityTicket.id, PriorityTicket.subject).to_df()
print(df)
  • strings.like(subject_norm, r"%outage%") matches any subject that contains the substring “outage” anywhere, not only as a separate word. Because subject_norm is lowercased, the match is effectively case-insensitive.
  • In like() patterns, % matches any-length text and _ matches a single character.
  • strings.startswith(subject_norm, "[p0]") matches subjects that start with the critical priority tag.
  • strings.endswith(subject_norm, "(urgent)") matches subjects that end with an urgency flag.
  • The combined condition captures tickets that mention an outage and are either tagged as P0 or marked urgent, using the | operator to express the OR logic.
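
The wildcard rules and the combined condition can be traced by hand with a small plain-Python model. The sketch below translates a like() pattern into a regular expression using only the two wildcards described above; like_to_regex(), like(), and is_priority() are hypothetical helpers, and any escape syntax that strings.like() might support is not modeled.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE-style pattern: % matches any-length text, _ matches one character."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # Everything else is matched literally.
    return re.compile("".join(parts), re.DOTALL)


def like(text: str, pattern: str) -> bool:
    """Return True when the whole text matches the pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None


# The five sample subjects from the example above, keyed by ticket ID.
subjects = {
    401: "[P0] Outage: login",
    402: "Re: [P0] outage",
    403: "Customer outage report (urgent)",
    404: "test outage - ignore",
    405: "Question about billing",
}


def is_priority(subject: str) -> bool:
    """Mentions an outage AND (starts with the P0 tag OR ends with the urgency flag)."""
    norm = subject.strip().lower()
    return like(norm, "%outage%") and (
        norm.startswith("[p0]") or norm.endswith("(urgent)")
    )


priority_ids = [tid for tid, s in subjects.items() if is_priority(s)]
print(priority_ids)  # [401, 403]
```

In this model, ticket 402 is excluded because its normalized subject starts with "re:" rather than the tag, which shows how a prefix check is stricter than a contains-style pattern.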

Use strings.split_part() to break a string into parts based on a delimiter and extract the segment you need:

from relationalai.semantics import Integer, Model
from relationalai.semantics.std import strings
m = Model("SupportModel")
Ticket = m.Concept("Ticket", identify_by={"id": Integer})
# (4 lines collapsed in the original page)
m.define(
    Ticket.new(id=501, external_key="INC-2026-0042"),
    Ticket.new(id=502, external_key="REQ-2025-0007"),
)
key_prefix = strings.split_part(Ticket.external_key, "-", 0)
key_year = strings.split_part(Ticket.external_key, "-", 1)
key_seq = strings.split_part(Ticket.external_key, "-", 2)
print(
    m.select(
        Ticket.id,
        Ticket.external_key,
        key_prefix.alias("key_prefix"),
        key_year.alias("key_year"),
        key_seq.alias("key_seq"),
    ).to_df()
)
  • strings.split_part(Ticket.external_key, "-", 0) extracts the first segment before the first hyphen (for example, “INC”).
  • strings.split_part(Ticket.external_key, "-", 1) extracts the second segment between the hyphens (for example, “2026”).
  • strings.split_part(Ticket.external_key, "-", 2) extracts the third segment after the second hyphen (for example, “0042”).
  • strings.split_part() uses zero-based indexing for the part number, so 0 is the first part, 1 is the second part, and so on.
  • If the specified part does not exist, strings.split_part() does not raise an error. Instead, the expression is treated as a missing value: depending on the context, the fact may be filtered out by a condition or appear as NULL in results.
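
The indexing and missing-part behavior described above can be modeled in a few lines of plain Python. This is a sketch of the stated semantics only, not the PyRel implementation; the local split_part() function is a hypothetical stand-in that uses None to represent a missing value.

```python
from typing import Optional


def split_part(text: str, separator: str, index: int) -> Optional[str]:
    """Zero-based part lookup that returns None when the part does not exist."""
    parts = text.split(separator)
    if 0 <= index < len(parts):
        return parts[index]
    return None  # Stand-in for a missing value; no exception is raised.


key = "INC-2026-0042"
print(split_part(key, "-", 0))  # INC
print(split_part(key, "-", 1))  # 2026
print(split_part(key, "-", 2))  # 0042
print(split_part(key, "-", 3))  # None
```

Because a missing part does not raise an error, it is worth selecting the extracted segments next to the raw key when a downstream condition returns fewer facts than expected.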

String-focused definitions often fail in subtle ways. The most common causes are missing values, missing relationship links, and inconsistent whitespace or case.

Use this checklist when results are empty, unexpectedly broad, or missing matches:

  • Normalize before comparing: apply strings.strip(), strings.lower(), or strings.upper() before equality checks and pattern matches.
  • Prefer bounded patterns: use strings.startswith() and strings.endswith() when possible instead of broad strings.like() wildcards. Bounded checks reduce false positives and make it easier to understand why a value matched.
  • Validate in small steps: select() the raw text, the normalized value, and the final condition side by side to confirm your logic is working as expected.