relationalai.std.re.Pattern
class Pattern
Represents a compiled regular expression object that can be used to match or search strings.
Use the compile() function to create a Pattern object.
Example
Use the compile() function to compile a regular expression into a Pattern object:

import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")
Company = model.Type("Company")

with model.rule():
    Person.add(id=1).set(name="Bob")
    Person.add(id=2).set(name="Sue")
    Person.add(id=3).set(name="Braxton")

with model.rule():
    Company.add(id=1).set(name="RelationalAI")
    Company.add(id=2).set(name="Snowflake")

# =======
# EXAMPLE
# =======

# Compile a regular expression pattern. If you pass a string literal to compile(),
# you may call compile() outside of a rule or query and reuse the Pattern object
# across multiple contexts.
pattern = re.compile(r"S.*")

# Get people whose names match the pattern.
with model.query() as select:
    person = Person()
    pattern.match(person.name)  # Filter names that match the pattern.
    response = select(person.id, person.name)

print(response.results)
#    id name
# 0   2  Sue

# Get companies whose names match the pattern.
with model.query() as select:
    company = Company()
    pattern.match(company.name)
    response = select(company.id, company.name)

print(response.results)
#    id       name
# 0   2  Snowflake

Pattern objects created from Python string literals can be reused across multiple rule and query contexts.
You may also pass a Producer object to compile().
However, in that case, the Pattern object can only be used in the same rule or query context where it was created:
Regex = model.Type("Regex")

with model.rule():
    Regex.add(pattern=r"J.*")
    Regex.add(pattern=r"B.*")

with model.rule():
    regex = Regex()
    # Compile each regex pattern. Note that regex.pattern is an InstanceProperty,
    # which is a subclass of the Producer class.
    pattern = re.compile(regex.pattern)
    # Use the pattern object to assign Person objects whose names match the
    # pattern to a multi-valued matched_people property.
    person = Person()
    pattern.match(person.name)
    regex.matched_people.add(person)

# Get the names of people matched by each pattern.
with model.query() as select:
    regex = Regex()
    person = regex.matched_people
    response = select(regex.pattern, person.name)

# No names start with "J", so only the B.* pattern has matches.
print(response.results)
#   pattern     name
# 0     B.*      Bob
# 1     B.*  Braxton

# pattern was created from an InstanceProperty, so it can't be used in a
# different rule or query context.
with model.rule():
    person = Person()
    pattern.match(person.name)  # Raises an error.

Because Pattern objects created from Producer objects can’t be reused in a different rule or query context, you can call the re.match() function directly instead of pre-compiling the pattern:

# Alternative to the rule in the previous example that uses re.match() directly.
with model.rule():
    regex = Regex()
    person = Person()
    re.match(regex.pattern, person.name)
    regex.matched_people.add(person)

Note that you can’t set properties of objects to a Pattern object.
Doing so raises an error:
with model.rule():
    Regex.add(pattern=re.compile(r"J.*"))  # Raises an error.

Attributes

| Name | Description | Type |
|---|---|---|
| .pattern | The pattern string from which the Pattern object was compiled. | Producer or Python str |
Methods
| Name | Description | Returns |
|---|---|---|
| .search() | Searches for a match anywhere in a string. | Match |
| .match() | Matches a string from the beginning. | Match |
| .fullmatch() | Matches a string to a pattern exactly. | Match |
| .findall() | Finds all matches in a string. | tuple[Expression] |
| .sub() | Replaces all matches in a string. | Expression |
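
For intuition about how the three matching methods differ, the following sketch uses Python's standard-library re module rather than relationalai.std.re. It assumes the anchoring behavior summarized in the table mirrors the standard library; it is not RAI code and runs outside any rule or query context.

```python
import re  # Python's standard library, NOT relationalai.std.re (assumed analogy)

pattern = re.compile(r"\d+")
text = "abc123"

# search() finds a match anywhere in the string.
found_anywhere = pattern.search(text) is not None

# match() succeeds only if the match starts at the beginning of the string.
found_at_start = pattern.match(text) is not None

# fullmatch() succeeds only if the entire string matches the pattern.
found_full = pattern.fullmatch(text) is not None
found_full_digits = pattern.fullmatch("123") is not None

print(found_anywhere, found_at_start, found_full, found_full_digits)
```

In RAI, the corresponding methods act as filters inside a rule or query instead of returning None for a failed match.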
.findall()
Pattern.findall(string: str|Producer) -> tuple[Expression]
Finds all non-overlapping matches of a compiled pattern in a string, starting from position pos.
Returns a tuple (index, substring) of Expression objects representing the 1-based index of the match and the matched substring.
If string is a Producer, then Pattern.findall() filters out non-string values from string.
Must be used in a rule or query context.
Parameters
| Name | Type | Description |
|---|---|---|
| string | Producer or Python str | The string to search. |
| pos | Producer or Python int | The starting position of the search. (Default: 0) |

Returns

A tuple of two Expression objects.
Example
Use .findall() to retrieve all non-overlapping matches of a compiled pattern in a string, along with their 1-based match index:

import relationalai as rai
from relationalai.std import aggregates, re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(full_name="Alan Turing")
    Person.add(id=2).set(full_name="Gottfried Wilhelm Leibniz")
    Person.add(id=3).set(full_name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern to find all words in a full name.
pattern = re.compile(r"(\w+)")

with model.rule():
    person = Person()
    index, word = pattern.findall(person.full_name)

    # Count the number of words per person.
    num_words = aggregates.count(index, per=[person])

    with model.match():
        # Set the first_name property to the first word.
        with index == 1:
            person.set(first_name=word)
        # Set the last_name property to the last word.
        with index == num_words:
            person.set(last_name=word)
        # Set the middle_name property if there are more than 2 words.
        with model.case():
            person.set(middle_name=word)

with model.query() as select:
    person = Person()
    response = select(
        person.id,
        person.full_name,
        person.first_name,
        person.middle_name,
        person.last_name
    )

print(response.results)
#    id                  full_name first_name middle_name last_name
# 0   1                Alan Turing       Alan         NaN    Turing
# 1   2  Gottfried Wilhelm Leibniz  Gottfried     Wilhelm   Leibniz
# 2   3                         -1        NaN         NaN       NaN

In the preceding example, with statements handle conditional assignments based on the match index, setting first_name, middle_name, and last_name appropriately.
See Expressing if-else Using model.match() for more details on conditional logic in RAI Python.
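
The index-based logic in the example can also be read as ordinary Python. The sketch below is a plain-Python analogy built on the standard-library re module, not RAI code; the split_name helper is hypothetical and exists only for illustration. It assumes, as in model.match(), that the first matching branch wins.

```python
import re  # Python's standard library, NOT relationalai.std.re (assumed analogy)


def split_name(full_name):
    """Hypothetical helper: return name parts, or None for non-string input."""
    if not isinstance(full_name, str):
        return None  # Mirrors findall() filtering out non-string values.

    # Pair each word with a 1-based index, like the (index, substring) tuple.
    matches = list(enumerate(re.findall(r"\w+", full_name), start=1))
    num_words = len(matches)

    parts = {"first_name": None, "middle_name": None, "last_name": None}
    for index, word in matches:
        if index == 1:
            parts["first_name"] = word
        elif index == num_words:
            parts["last_name"] = word
        else:
            # In this sketch, a later middle word overwrites an earlier one.
            parts["middle_name"] = word
    return parts


print(split_name("Gottfried Wilhelm Leibniz"))
```

Because the index == 1 branch is checked first, a single-word name sets only first_name in this sketch.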
.fullmatch()
Pattern.fullmatch(string: str|Producer, pos: int|Producer = 0) -> Match
Matches strings that fully match a compiled pattern starting from position pos and returns a Match object.
If string or pos is a Producer, then Pattern.fullmatch() filters out non-string values from string and non-integer and negative values from pos.
Must be used in a rule or query context.
Parameters
| Name | Type | Description |
|---|---|---|
| string | Producer or Python str | The string to match against. |
| pos | Producer or Python int | The starting position of the match. Must be non-negative. (Default: 0) |

Returns

A Match object.
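
To illustrate how a starting position interacts with a full match, here is a sketch using Python's standard-library re module, whose compiled patterns also accept a pos argument. It is an analogy only: it assumes RAI counts positions from 0 in the same way, which this page does not state explicitly.

```python
import re  # Python's standard library, NOT relationalai.std.re (assumed analogy)

pattern = re.compile(r"(\w+) (\w+)")
text = "Dr. Alan Turing"

# From position 0, the leading "Dr. " prevents the whole string from matching.
no_match = pattern.fullmatch(text)

# From position 4, the remainder "Alan Turing" matches the pattern exactly.
match = pattern.fullmatch(text, 4)

print(no_match, match.group(1), match.group(2))
```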
Example
Use .fullmatch() to filter for strings that fully match a compiled pattern, starting from a specified position:

import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(name="Alan Turing")
    Person.add(id=2).set(name="Gottfried Wilhelm Leibniz")
    Person.add(id=3).set(name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern for matching full names with groups for first and last names.
pattern = re.compile(r"(\w+) (\w+)")

with model.rule():
    person = Person()
    # Match the pattern against each person's full name.
    match = pattern.fullmatch(person.name)
    # Use match.group() to set first_name and last_name properties. Since
    # fullmatch() filters out non-matching strings and non-string values, the
    # following does not set properties for Person objects with ID 2 or 3.
    person.set(first_name=match.group(1), last_name=match.group(2))

with model.query() as select:
    person = Person()
    response = select(
        person.id,
        person.name,
        person.first_name,
        person.last_name
    )

print(response.results)
#    id                       name first_name last_name
# 0   1                Alan Turing       Alan    Turing
# 1   2  Gottfried Wilhelm Leibniz        NaN       NaN
# 2   3                         -1        NaN       NaN

.match()
Pattern.match(string: str|Producer, pos: int|Producer = 0) -> Match
Matches a string that begins with a compiled pattern starting from position pos and returns a Match object.
If string or pos is a Producer, then .match() filters out non-string values from string and non-integer and negative values from pos.
Must be used in a rule or query context.
Parameters
| Name | Type | Description |
|---|---|---|
| string | Producer or Python str | The string to match against. |
| pos | Producer or Python int | The starting position of the match. Must be non-negative. (Default: 0) |

Returns

A Match object.
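
The filtering behavior described above, where non-string and non-matching values are dropped rather than raising an error, can be pictured with an ordinary loop. This is a plain-Python analogy using the standard-library re module, not RAI code; the names dictionary is made-up sample data.

```python
import re  # Python's standard library, NOT relationalai.std.re (assumed analogy)

pattern = re.compile(r"(\w+) (\w+)")
names = {1: "Alan Turing", 2: "Bob", 3: -1}  # Illustrative sample data.

parsed = {}
for person_id, name in names.items():
    if not isinstance(name, str):
        continue  # Non-string values are skipped.
    match = pattern.match(name)
    if match is None:
        continue  # Strings that don't match at the start are skipped.
    parsed[person_id] = (match.group(1), match.group(2))

print(parsed)
```

Only the entry with ID 1 survives both checks, which parallels which Person objects receive first_name and last_name in the RAI example.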
Example
Use .match() to match a compiled pattern at the beginning of a string, starting from a specified position:

import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(name="Alan Turing")
    Person.add(id=2).set(name="Bob")
    Person.add(id=3).set(name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern for matching full names with groups for first and last names.
pattern = re.compile(r"(\w+) (\w+)")

with model.rule():
    person = Person()
    # Match the pattern at the beginning of each person's name.
    match = pattern.match(person.name)
    # Use match.group() to set first_name and last_name properties. Since match()
    # filters out non-matching strings and non-string values, the following does
    # not set properties for Person objects with IDs 2 and 3.
    person.set(first_name=match.group(1), last_name=match.group(2))

with model.query() as select:
    person = Person()
    response = select(
        person.id,
        person.name,
        person.first_name,
        person.last_name
    )

print(response.results)
#    id         name first_name last_name
# 0   1  Alan Turing       Alan    Turing
# 1   2          Bob        NaN       NaN
# 2   3           -1        NaN       NaN

.search()
Pattern.search(string: str|Producer, pos: int|Producer = 0) -> Match
Searches a string for a match to the compiled pattern starting from position pos and returns a Match object for the first match found.
If string or pos is a Producer, then .search() filters out non-string values from string and non-integer and negative values from pos.
Must be used in a rule or query context.
Parameters
| Name | Type | Description |
|---|---|---|
| string | Producer or Python str | The string to search. |
| pos | Producer or Python int | The starting position of the search. Must be non-negative. (Default: 0) |

Returns

A Match object.
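
As a point of comparison, the sketch below runs the same kind of search with Python's standard-library re module, not relationalai.std.re. The start and end values it produces agree with the match_start and match_end values shown in the RAI example that follows, which suggests (but does not prove) that RAI uses the same 0-based start and exclusive end convention.

```python
import re  # Python's standard library, NOT relationalai.std.re (assumed analogy)

pattern = re.compile(r"\d{1,2}:\d{2} [AP]M")
match = pattern.search("The party starts at 8:00 PM.")

# group() is the matched text; start() and end() delimit it in the string.
matched_text = match.group()
start, end = match.start(), match.end()

print(matched_text, start, end)
```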
Example
Use .search() to search for a substring that matches the compiled regular expression anywhere in a string, starting from a specified position:

import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Message = model.Type("Message")

with model.rule():
    Message.add(id=1).set(text="The party starts at 8:00 PM.")
    Message.add(id=2).set(text="Bring tacos")
    Message.add(id=3).set(text=-1)  # Non-string text

# =======
# EXAMPLE
# =======

# Compile a pattern for times in the format "HH:MM AM/PM"
pattern = re.compile(r"\d{1,2}:\d{2} [AP]M")

with model.rule():
    message = Message()
    # Search for the time pattern within each message starting at position 0.
    match = pattern.search(message.text)
    # Since search() filters out non-matching and non-string values, the
    # following properties are not set for messages with IDs 2 and 3.
    message.set(
        match_text=match,
        match_start=match.start(),
        match_end=match.end()
    )

with model.query() as select:
    message = Message()
    response = select(
        message.id,
        message.text,
        message.match_text,
        message.match_start,
        message.match_end,
    )

print(response.results)
#    id                          text match_text  match_start  match_end
# 0   1  The party starts at 8:00 PM.    8:00 PM         20.0       27.0
# 1   2                   Bring tacos        NaN          NaN        NaN
# 2   3                            -1        NaN          NaN        NaN

.sub()
Pattern.sub(repl: str|Producer, string: str|Producer) -> Expression
Replaces all occurrences of a compiled pattern in string with repl.
If string or repl is a Producer, then .sub() filters out non-string values from string and repl.
Must be used in a rule or query context.
Parameters
| Name | Type | Description |
|---|---|---|
| repl | Producer or Python str | The string to replace the matches with. |
| string | Producer or Python str | The string to search for matches. |
| pos | Producer or Python int | The starting position of the search. (Default: 0) |

Returns

An Expression object.
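
The replacement string may refer to capture groups, as in the \1 backreference used in the example for this method. The sketch below shows that substitution with Python's standard-library re module, not relationalai.std.re; the to_initials helper is hypothetical, and the isinstance check only imitates how .sub() is described as filtering out non-string values.

```python
import re  # Python's standard library, NOT relationalai.std.re (assumed analogy)

# Capture the first letter of each word; the rest of the word is discarded.
pattern = re.compile(r"(\w)\w+")


def to_initials(full_name):
    """Hypothetical helper: replace each word with its initial and a period."""
    if not isinstance(full_name, str):
        return None  # Imitates the filtering of non-string values.
    return pattern.sub(r"\1.", full_name)


print(to_initials("Gottfried Wilhelm Leibniz"))
```

Characters between matches, such as the spaces separating the words, are left untouched by the substitution.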
Example
Use .sub() to replace all occurrences of a compiled pattern in a string:

import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(full_name="Alan Turing")
    Person.add(id=2).set(full_name="Gottfried Wilhelm Leibniz")
    Person.add(id=3).set(full_name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern that captures the first letter of each word in a name.
pattern = re.compile(r"(\w)\w+")

with model.rule():
    person = Person()
    # Replace each word with its initial followed by a period.
    person.set(initials=pattern.sub(r"\1.", person.full_name))

with model.query() as select:
    person = Person()
    response = select(person.id, person.full_name, person.initials)

print(response.results)
#    id                  full_name  initials
# 0   1                Alan Turing     A. T.
# 1   2  Gottfried Wilhelm Leibniz  G. W. L.
# 2   3                         -1       NaN