relationalai.std.re.Pattern

class Pattern

Represents a compiled regular expression object that can be used to match or search strings.

Use the compile() function to compile a regular expression into a Pattern object:

import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")
Company = model.Type("Company")

with model.rule():
    Person.add(id=1).set(name="Bob")
    Person.add(id=2).set(name="Sue")
    Person.add(id=3).set(name="Braxton")

with model.rule():
    Company.add(id=1).set(name="RelationalAI")
    Company.add(id=2).set(name="Snowflake")

# =======
# EXAMPLE
# =======

# Compile a regular expression pattern. If you pass a string literal to compile(),
# you may call compile() outside of a rule or query and reuse the Pattern object
# across multiple contexts.
pattern = re.compile(r"S.*")

# Get people whose names match the pattern.
with model.query() as select:
    person = Person()
    pattern.match(person.name)  # Filter names that match the pattern.
    response = select(person.id, person.name)

print(response.results)
#    id name
# 0   2  Sue

# Get companies whose names match the pattern.
with model.query() as select:
    company = Company()
    pattern.match(company.name)
    response = select(company.id, company.name)

print(response.results)
#    id       name
# 0   2  Snowflake

Pattern objects created from Python string literals can be reused across multiple rule and query contexts.
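For comparison, the same compile-once, reuse-everywhere idea can be sketched with Python's built-in re module. This is plain Python rather than the RAI API: the lists people and companies are hypothetical stand-ins for the Person and Company names, and the sketch assumes, as the results above suggest, that Pattern.match() anchors at the start of the string the way the standard library's method does.

```python
import re  # Python's standard library, not relationalai.std.re

# Hypothetical stand-ins for the Person and Company names defined above.
people = ["Bob", "Sue", "Braxton"]
companies = ["RelationalAI", "Snowflake"]

# Compile once from a string literal...
pattern = re.compile(r"S.*")

# ...then reuse the same object against unrelated collections.
matching_people = [name for name in people if pattern.match(name)]
matching_companies = [name for name in companies if pattern.match(name)]

print(matching_people)     # ['Sue']
print(matching_companies)  # ['Snowflake']
```

Nothing about the compiled object ties it to one collection, which is why a pattern built from a literal can be shared freely.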

You may also pass a Producer object to compile(). However, in that case, the Pattern object can only be used in the same rule or query context where it was created:

Regex = model.Type("Regex")

with model.rule():
    Regex.add(pattern=r"J.*")
    Regex.add(pattern=r"B.*")

with model.rule():
    regex = Regex()
    # Compile each regex pattern. Note that regex.pattern is an InstanceProperty,
    # which is a subclass of the Producer class.
    pattern = re.compile(regex.pattern)
    # Use the pattern object to assign Person objects whose names match the
    # pattern to a multi-valued matched_people property.
    person = Person()
    pattern.match(person.name)
    regex.matched_people.add(person)

# Get the names of people matched by each pattern.
with model.query() as select:
    regex = Regex()
    person = regex.matched_people
    response = select(regex.pattern, person.name)

print(response.results)
#   pattern     name
# 0     B.*      Bob
# 1     B.*  Braxton
# No name starts with "J", so the J.* pattern contributes no rows.

# pattern was created from an InstanceProperty, so it can't be used in a
# different rule or query context.
with model.rule():
    person = Person()
    pattern.match(person.name)  # Raises an error.

Since you can’t re-use Pattern objects created from Producer objects in different rule or query contexts, you may alternatively use the re.match() function directly instead of pre-compiling the pattern:

# Alternative to the rule in the previous example that uses re.match() directly.
with model.rule():
    regex = Regex()
    person = Person()
    re.match(regex.pattern, person.name)
    regex.matched_people.add(person)

Note that you can’t set properties of objects to a Pattern object. Doing so raises an error:

with model.rule():
    Regex.add(pattern=re.compile(r"J.*"))  # Raises an error.

| Name | Description | Type |
| :--- | :--- | :--- |
| .pattern | The pattern string from which the Pattern object was compiled. | Producer or Python str |

| Name | Description | Returns |
| :--- | :--- | :--- |
| .search() | Searches for a match anywhere in a string. | Match |
| .match() | Matches a string from the beginning. | Match |
| .fullmatch() | Matches a string to a pattern exactly. | Match |
| .findall() | Finds all matches in a string. | tuple[Expression] |
| .sub() | Replaces all matches in a string. | Expression |

Pattern.findall(string: str|Producer) -> tuple[Expression]

Finds all non-overlapping matches of a compiled pattern in a string. Returns a tuple (index, substring) of Expression objects representing the 1-based index of the match and the matched substring. If string is a Producer, then Pattern.findall() filters out non-string values from string. Must be used in a rule or query context.

| Name | Type | Description |
| :--- | :--- | :--- |
| string | Producer or Python str | The string to search. |
| pos | Producer or Python int | The starting position of the search. (Default: 0) |

A tuple of two Expression objects.

Use .findall() to retrieve all non-overlapping matches of a compiled pattern in a string, along with their 1-based match index:

import relationalai as rai
from relationalai.std import aggregates, re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(full_name="Alan Turing")
    Person.add(id=2).set(full_name="Gottfried Wilhelm Leibniz")
    Person.add(id=3).set(full_name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern to find all words in a full name.
pattern = re.compile(r"(\w+)")

with model.rule():
    person = Person()
    index, word = pattern.findall(person.full_name)
    # Count the number of words per person.
    num_words = aggregates.count(index, per=[person])
    with model.match():
        # Set the first_name property to the first word.
        with index == 1:
            person.set(first_name=word)
        # Set the last_name property to the last word.
        with index == num_words:
            person.set(last_name=word)
        # Set the middle_name property if there are more than 2 words.
        with model.case():
            person.set(middle_name=word)

with model.query() as select:
    person = Person()
    response = select(
        person.id,
        person.full_name,
        person.first_name,
        person.middle_name,
        person.last_name
    )

print(response.results)
#    id                  full_name first_name middle_name last_name
# 0   1                Alan Turing       Alan         NaN    Turing
# 1   2  Gottfried Wilhelm Leibniz  Gottfried     Wilhelm   Leibniz
# 2   3                         -1        NaN         NaN       NaN

In the preceding example, with statements handle conditional assignments based on the match index, setting first_name, middle_name, and last_name appropriately. See Expressing if-else Using model.match() for more details on conditional logic in RAI Python.
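The (index, substring) pairing can be illustrated outside RAI with Python's built-in re module. The sketch below is plain Python, not the RAI API: indexed_words() and split_name() are hypothetical helpers, and the if/elif/else chain stands in for the ordered branches of model.match() on the assumption that the first branch whose condition holds is the one that applies.

```python
import re  # Python's standard library, not relationalai.std.re

pattern = re.compile(r"(\w+)")

def indexed_words(full_name):
    """Return (index, word) pairs with a 1-based index; non-strings yield nothing."""
    if not isinstance(full_name, str):
        return []
    matches = pattern.finditer(full_name)
    return [(i, m.group(1)) for i, m in enumerate(matches, start=1)]

def split_name(full_name):
    """Choose first, middle, and last names by match index."""
    pairs = indexed_words(full_name)
    parts = {"first_name": None, "middle_name": None, "last_name": None}
    for index, word in pairs:
        if index == 1:
            parts["first_name"] = word
        elif index == len(pairs):
            parts["last_name"] = word
        else:
            # With more than three words, the last middle word wins in this sketch.
            parts["middle_name"] = word
    return parts

print(indexed_words("Alan Turing"))
# [(1, 'Alan'), (2, 'Turing')]
print(split_name("Gottfried Wilhelm Leibniz"))
# {'first_name': 'Gottfried', 'middle_name': 'Wilhelm', 'last_name': 'Leibniz'}
```

Because the index starts at 1, comparing it with the per-person word count identifies the last word without any offset arithmetic.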

Pattern.fullmatch(string: str|Producer, pos: int|Producer = 0) -> Match

Matches strings that fully match a compiled pattern starting from position pos and returns a Match object. If string or pos is a Producer, then Pattern.fullmatch() filters out non-string values from string and non-integer and negative values from pos. Must be used in a rule or query context.

| Name | Type | Description |
| :--- | :--- | :--- |
| string | Producer or Python str | The string to match against. |
| pos | Producer or Python int | The starting position of the match. Must be non-negative. (Default: 0) |

A Match object.

Use .fullmatch() to filter for strings that fully match a compiled pattern, starting from a specified position:

import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(name="Alan Turing")
    Person.add(id=2).set(name="Gottfried Wilhelm Leibniz")
    Person.add(id=3).set(name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern for matching full names with groups for first and last names.
pattern = re.compile(r"(\w+) (\w+)")

with model.rule():
    person = Person()
    # Match the pattern against each person's full name.
    match = pattern.fullmatch(person.name)
    # Use match.group() to set first_name and last_name properties. Since
    # fullmatch() filters out non-matching strings and non-string values, the
    # following does not set properties for Person objects with ID 2 or 3.
    person.set(first_name=match.group(1), last_name=match.group(2))

with model.query() as select:
    person = Person()
    response = select(
        person.id,
        person.name,
        person.first_name,
        person.last_name
    )

print(response.results)
#    id                       name first_name last_name
# 0   1                Alan Turing       Alan    Turing
# 1   2  Gottfried Wilhelm Leibniz        NaN       NaN
# 2   3                         -1        NaN       NaN
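The reason the three-word name is filtered out is easier to see next to a prefix match. The following sketch uses Python's built-in re module rather than the RAI API and assumes the two libraries agree on what "fully match" means: the pattern must consume the entire string, not just a leading portion of it.

```python
import re  # Python's standard library, not relationalai.std.re

pattern = re.compile(r"(\w+) (\w+)")
names = ["Alan Turing", "Gottfried Wilhelm Leibniz"]

# For each name, record whether a prefix match and a full match succeed.
results = {
    name: (pattern.match(name) is not None, pattern.fullmatch(name) is not None)
    for name in names
}

for name, (is_prefix_match, is_full_match) in results.items():
    print(f"{name!r}: match={is_prefix_match}, fullmatch={is_full_match}")
# 'Alan Turing': match=True, fullmatch=True
# 'Gottfried Wilhelm Leibniz': match=True, fullmatch=False
```

The second name begins with two words separated by a space, so a prefix match succeeds, but the trailing " Leibniz" is left over and the full match fails.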

Pattern.match(string: str|Producer, pos: int|Producer = 0) -> Match

Matches a string that begins with a compiled pattern starting from position pos and returns a Match object. If string or pos is a Producer, then .match() filters out non-string values from string and non-integer and negative values from pos. Must be used in a rule or query context.

| Name | Type | Description |
| :--- | :--- | :--- |
| string | Producer or Python str | The string to match against. |
| pos | Producer or Python int | The starting position of the match. Must be non-negative. (Default: 0) |

A Match object.

Use .match() to match a compiled pattern at the beginning of a string, starting from a specified position:

import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(name="Alan Turing")
    Person.add(id=2).set(name="Bob")
    Person.add(id=3).set(name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern for matching full names with groups for first and last names.
pattern = re.compile(r"(\w+) (\w+)")

with model.rule():
    person = Person()
    # Match the pattern within each person's name.
    match = pattern.match(person.name)
    # Use match.group() to set first_name and last_name properties. Since match()
    # filters out non-matching strings and non-string values, the following does
    # not set properties for Person objects with IDs 2 and 3.
    person.set(first_name=match.group(1), last_name=match.group(2))

with model.query() as select:
    person = Person()
    response = select(
        person.id,
        person.name,
        person.first_name,
        person.last_name
    )

print(response.results)
#    id         name first_name last_name
# 0   1  Alan Turing       Alan    Turing
# 1   2          Bob        NaN       NaN
# 2   3           -1        NaN       NaN
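The filtering behavior described above (non-string values, negative positions, and non-matching strings all drop out rather than raising) can be mimicked in plain Python. This is only a sketch with the built-in re module: match_or_none() is a hypothetical helper, the rows dictionary stands in for the Person objects, and the way RAI interprets pos internally is not modeled beyond rejecting invalid values.

```python
import re  # Python's standard library, not relationalai.std.re

pattern = re.compile(r"(\w+) (\w+)")

def match_or_none(value, pos=0):
    """Hypothetical helper: return a match, or None for any filtered-out input."""
    if not isinstance(value, str):
        return None  # Non-string values are filtered out.
    if not isinstance(pos, int) or pos < 0:
        return None  # Non-integer and negative positions are filtered out.
    return pattern.match(value, pos)

# Stand-in for the Person objects above, keyed by id.
rows = {1: "Alan Turing", 2: "Bob", 3: -1}

names = {}
for person_id, name in rows.items():
    m = match_or_none(name)
    if m is not None:
        names[person_id] = (m.group(1), m.group(2))

print(names)  # {1: ('Alan', 'Turing')}
```

Only the row with id 1 survives, which corresponds to the single populated row in the output above.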

Pattern.search(string: str|Producer, pos: int|Producer = 0) -> Match

Searches a string for a match to the compiled pattern starting from position pos and returns a Match object for the first match found. If string or pos is a Producer, then .search() filters out non-string values from string and non-integer and negative values from pos. Must be used in a rule or query context.

| Name | Type | Description |
| :--- | :--- | :--- |
| string | Producer or Python str | The string to search. |
| pos | Producer or Python int | The starting position of the search. Must be non-negative. (Default: 0) |

A Match object.

Use .search() to search for a substring that matches the compiled regular expression anywhere in a string, starting from a specified position:

import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Message = model.Type("Message")

with model.rule():
    Message.add(id=1).set(text="The party starts at 8:00 PM.")
    Message.add(id=2).set(text="Bring tacos")
    Message.add(id=3).set(text=-1)  # Non-string text

# =======
# EXAMPLE
# =======

# Compile a pattern for times in the format "HH:MM AM/PM"
pattern = re.compile(r"\d{1,2}:\d{2} [AP]M")

with model.rule():
    message = Message()
    # Search for the time pattern within each message starting at position 0.
    match = pattern.search(message.text)
    # Since search() filters out non-matching and non-string values, the
    # following properties are not set for messages with IDs 2 and 3.
    message.set(
        match_text=match,
        match_start=match.start(),
        match_end=match.end()
    )

with model.query() as select:
    message = Message()
    response = select(
        message.id,
        message.text,
        message.match_text,
        message.match_start,
        message.match_end,
    )

print(response.results)
#    id                          text match_text  match_start  match_end
# 0   1  The party starts at 8:00 PM.    8:00 PM         20.0       27.0
# 1   2                   Bring tacos        NaN          NaN        NaN
# 2   3                            -1        NaN          NaN        NaN
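To see where the offsets in the output come from, the same search can be run with Python's built-in re module. This is plain Python, not the RAI API. The standard library's start() is a 0-based offset and its end() is exclusive, and the values it produces for this message coincide with match_start and match_end above. Whether RAI's Match.start() and Match.end() follow the same conventions in every case is an assumption worth checking against the Match reference.

```python
import re  # Python's standard library, not relationalai.std.re

pattern = re.compile(r"\d{1,2}:\d{2} [AP]M")
text = "The party starts at 8:00 PM."

m = pattern.search(text)
matched, start, end = m.group(), m.start(), m.end()

print(matched, start, end)            # 8:00 PM 20 27
print(text[start:end])                # 8:00 PM
print(pattern.search("Bring tacos"))  # None
```

Unlike a prefix match, the search skips the 20 characters before the time and reports the first position where the pattern fits.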

Pattern.sub(repl: str|Producer, string: str|Producer) -> Expression

Replaces all occurrences of a compiled pattern in string with repl. If string or repl is a Producer, then .sub() filters out non-string values from string and repl. Must be used in a rule or query context.

| Name | Type | Description |
| :--- | :--- | :--- |
| repl | Producer or Python str | The string to replace the matches with. |
| string | Producer or Python str | The string to search for matches. |
| pos | Producer or Python int | The starting position of the search. (Default: 0) |

An Expression object.

Use .sub() to replace all occurrences of a compiled pattern in a string:

import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(full_name="Alan Turing")
    Person.add(id=2).set(full_name="Gottfried Wilhelm Leibniz")
    Person.add(id=3).set(full_name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern that captures the first letter of each word so that every
# name in the full name can be replaced with its initial.
pattern = re.compile(r"(\w)\w+")

with model.rule():
    person = Person()
    person.set(initials=pattern.sub(r"\1.", person.full_name))

with model.query() as select:
    person = Person()
    response = select(person.id, person.full_name, person.initials)

print(response.results)
#    id                  full_name  initials
# 0   1                Alan Turing     A. T.
# 1   2  Gottfried Wilhelm Leibniz  G. W. L.
# 2   3                         -1       NaN
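The replacement string r"\1." refers back to the first capture group, so each matched word collapses to its first letter followed by a period. The sketch below reproduces this with Python's built-in re module. It is plain Python rather than the RAI API, initials() is a hypothetical helper, and it assumes the RAI method handles group references in repl the same way the standard library does, as the output above suggests.

```python
import re  # Python's standard library, not relationalai.std.re

# (\w) captures the first letter; \w+ consumes the rest of the word.
pattern = re.compile(r"(\w)\w+")

def initials(full_name):
    """Hypothetical helper: replace each word of two or more characters with its initial."""
    if not isinstance(full_name, str):
        return None  # Mirror the documented filtering of non-string values.
    return pattern.sub(r"\1.", full_name)

print(initials("Alan Turing"))                # A. T.
print(initials("Gottfried Wilhelm Leibniz"))  # G. W. L.
print(initials(-1))                           # None
```

Because the pattern requires at least two word characters, single-letter words are left untouched: initials("J R Tolkien") gives "J R T." in this sketch.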