relationalai.std.re.Pattern
class Pattern
Represents a compiled regular expression object that can be used to match or search strings.
Use the `compile()` function to create a `Pattern` object.
Example
Use the `compile()` function to compile a regular expression into a `Pattern` object:
```python
import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")
Company = model.Type("Company")

with model.rule():
    Person.add(id=1).set(name="Bob")
    Person.add(id=2).set(name="Sue")
    Person.add(id=3).set(name="Braxton")

with model.rule():
    Company.add(id=1).set(name="RelationalAI")
    Company.add(id=2).set(name="Snowflake")

# =======
# EXAMPLE
# =======

# Compile a regular expression pattern. If you pass a string literal to compile(),
# you may call compile() outside of a rule or query and reuse the Pattern object
# across multiple contexts.
pattern = re.compile(r"S.*")

# Get people whose names match the pattern.
with model.query() as select:
    person = Person()
    pattern.match(person.name)  # Filter names that match the pattern.
    response = select(person.id, person.name)

print(response.results)
#    id name
# 0   2  Sue

# Get companies whose names match the pattern.
with model.query() as select:
    company = Company()
    pattern.match(company.name)
    response = select(company.id, company.name)

print(response.results)
#    id       name
# 0   2  Snowflake
```
`Pattern` objects created from Python string literals can be reused across multiple rule and query contexts.
You may also pass a `Producer` object to `compile()`.
However, in that case, the `Pattern` object can only be used in the same rule or query context where it was created:
```python
Regex = model.Type("Regex")

with model.rule():
    Regex.add(pattern=r"S.*")
    Regex.add(pattern=r"B.*")

with model.rule():
    regex = Regex()
    # Compile each regex pattern. Note that regex.pattern is an InstanceProperty,
    # which is a subclass of the Producer class.
    pattern = re.compile(regex.pattern)
    # Use the pattern object to assign Person objects whose names match the
    # pattern to a multi-valued matched_people property.
    person = Person()
    pattern.match(person.name)
    regex.matched_people.add(person)

# Get the names of people matched by each pattern.
with model.query() as select:
    regex = Regex()
    person = regex.matched_people
    response = select(regex.pattern, person.name)

print(response.results)
#   pattern     name
# 0     B.*      Bob
# 1     B.*  Braxton
# 2     S.*      Sue

# pattern was created from an InstanceProperty, so it can't be used in a
# different rule or query context.
with model.rule():
    person = Person()
    pattern.match(person.name)  # Raises an error.
```
Because `Pattern` objects created from `Producer` objects can't be reused in other rule or query contexts, you can call the `re.match()` function directly instead of pre-compiling the pattern:
```python
# Alternative to the rule in the previous example that uses re.match() directly.
with model.rule():
    regex = Regex()
    person = Person()
    re.match(regex.pattern, person.name)
    regex.matched_people.add(person)
```
Note that you can't set properties of objects to a `Pattern` object.
Doing so raises an error:
```python
with model.rule():
    Regex.add(pattern=re.compile(r"J.*"))  # Raises an error.
```
Attributes
Name | Description | Type |
---|---|---|
.pattern | The pattern string from which the Pattern object was compiled. | Producer or Python str |
Methods
Name | Description | Returns |
---|---|---|
.search() | Searches for a match anywhere in a string. | Match |
.match() | Matches a string from the beginning. | Match |
.fullmatch() | Matches a string to a pattern exactly. | Match |
.findall() | Finds all matches in a string. | tuple[Expression] |
.sub() | Replaces all matches in a string. | Expression |
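
The three matching methods differ only in where a match may begin and end. The following is a minimal sketch of that difference using Python's built-in `re` module rather than `relationalai.std.re`, so it runs locally without a model. It assumes the RAI methods follow the same anchoring rules as their standard-library namesakes; the sample string and variable names are illustrative only.

```python
import re as pyre  # Python's standard library, not relationalai.std.re

pattern = pyre.compile(r"\d+")
text = "abc 123"

# search() finds a match anywhere in the string.
found_anywhere = pattern.search(text) is not None

# match() only succeeds if the match starts at the beginning of the string.
found_at_start = pattern.match(text) is not None

# fullmatch() requires the entire string to match the pattern.
found_whole = pattern.fullmatch(text) is not None
whole_digits = pattern.fullmatch("123") is not None

print(found_anywhere, found_at_start, found_whole, whole_digits)
# True False False True
```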
.findall()
```python
Pattern.findall(string: str|Producer, pos: int|Producer = 0) -> tuple[Expression]
```
Finds all non-overlapping matches of a compiled pattern in a string, starting from position `pos`.
Returns a tuple `(index, substring)` of `Expression` objects representing the 1-based index of the match and the matched substring.
If `string` is a `Producer`, then `Pattern.findall()` filters out non-string values from `string`.
Must be used in a rule or query context.
Parameters
Name | Type | Description |
---|---|---|
string | Producer or Python str | The string to search. |
pos | Producer or Python int | The starting position of the search. (Default: 0 ) |
Returns
A tuple of two `Expression` objects.
Example
Use `.findall()` to retrieve all non-overlapping matches of a compiled pattern in a string, along with their 1-based match index:
```python
import relationalai as rai
from relationalai.std import aggregates, re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(full_name="Alan Turing")
    Person.add(id=2).set(full_name="Gottfried Wilhelm Leibniz")
    Person.add(id=3).set(full_name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern to find all words in a full name.
pattern = re.compile(r"(\w+)")

with model.rule():
    person = Person()
    index, word = pattern.findall(person.full_name)

    # Count the number of words per person.
    num_words = aggregates.count(index, per=[person])

    with model.match():
        # Set the first_name property to the first word.
        with index == 1:
            person.set(first_name=word)
        # Set the last_name property to the last word.
        with index == num_words:
            person.set(last_name=word)
        # Set the middle_name property if there are more than 2 words.
        with model.case():
            person.set(middle_name=word)

with model.query() as select:
    person = Person()
    response = select(
        person.id,
        person.full_name,
        person.first_name,
        person.middle_name,
        person.last_name
    )

print(response.results)
#    id                  full_name first_name middle_name last_name
# 0   1                Alan Turing       Alan         NaN    Turing
# 1   2  Gottfried Wilhelm Leibniz  Gottfried     Wilhelm   Leibniz
# 2   3                         -1        NaN         NaN       NaN
```
In the preceding example, `with` statements handle conditional assignments based on the match index, setting `first_name`, `middle_name`, and `last_name` appropriately.
See Expressing `if`-`else` Using `model.match()` for more details on conditional logic in RAI Python.
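
To see the shape of the `(index, substring)` pairs without running a model, the sketch below emulates them with Python's built-in `re` module. It is an approximation under stated assumptions, not the RAI implementation: `indexed_matches()` and `split_name()` are hypothetical helpers, the 1-based index comes from `enumerate(..., start=1)`, and the `isinstance` check stands in for the filtering of non-string values.

```python
import re as pyre  # Python's standard library, not relationalai.std.re

pattern = pyre.compile(r"(\w+)")

def indexed_matches(value):
    """Return (index, substring) pairs with 1-based indexes; skip non-strings."""
    if not isinstance(value, str):
        return []
    return list(enumerate(pattern.findall(value), start=1))

def split_name(value):
    """Mirror the first/last/middle branches of the example; first match wins."""
    pairs = indexed_matches(value)
    parts = {}
    for index, word in pairs:
        if index == 1:
            parts["first_name"] = word
        elif index == len(pairs):
            parts["last_name"] = word
        else:
            parts["middle_name"] = word
    return parts

print(indexed_matches("Gottfried Wilhelm Leibniz"))
# [(1, 'Gottfried'), (2, 'Wilhelm'), (3, 'Leibniz')]
print(split_name("Alan Turing"))
# {'first_name': 'Alan', 'last_name': 'Turing'}
print(split_name(-1))
# {}
```

As with the `model.match()` block, the branches are tried in order, so a single-word name only receives a `first_name`.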
.fullmatch()
```python
Pattern.fullmatch(string: str|Producer, pos: int|Producer = 0) -> Match
```
Matches strings that fully match a compiled pattern starting from position `pos` and returns a `Match` object.
If `string` or `pos` is a `Producer`, then `Pattern.fullmatch()` filters out non-string values from `string` and non-integer and negative values from `pos`.
Must be used in a rule or query context.
Parameters
Name | Type | Description |
---|---|---|
string | Producer or Python str | The string to match against. |
pos | Producer or Python int | The starting position of the match. Must be non-negative. (Default: 0 ) |
Returns
A `Match` object.
Example
Use `.fullmatch()` to filter for strings that fully match a compiled pattern, starting from a specified position:
```python
import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(name="Alan Turing")
    Person.add(id=2).set(name="Gottfried Wilhelm Leibniz")
    Person.add(id=3).set(name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern for matching full names with groups for first and last names.
pattern = re.compile(r"(\w+) (\w+)")

with model.rule():
    person = Person()
    # Match the pattern against each person's full name.
    match = pattern.fullmatch(person.name)
    # Use match.group() to set first_name and last_name properties. Since
    # fullmatch() filters out non-matching strings and non-string values, the
    # following does not set properties for Person objects with ID 2 or 3.
    person.set(first_name=match.group(1), last_name=match.group(2))

with model.query() as select:
    person = Person()
    response = select(
        person.id,
        person.name,
        person.first_name,
        person.last_name
    )

print(response.results)
#    id                       name first_name last_name
# 0   1                Alan Turing       Alan    Turing
# 1   2  Gottfried Wilhelm Leibniz        NaN       NaN
# 2   3                         -1        NaN       NaN
```
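
The reason the three-word name is left out is that a full match must consume the whole string. The sketch below reproduces that filtering locally with Python's built-in `re` module, assuming the RAI method applies the same rule. The `names` dictionary and the `isinstance` check are stand-ins for the `Person` objects and for the filtering of non-string values.

```python
import re as pyre  # Python's standard library, not relationalai.std.re

pattern = pyre.compile(r"(\w+) (\w+)")
names = {1: "Alan Turing", 2: "Gottfried Wilhelm Leibniz", 3: -1}

parsed = {}
for person_id, name in names.items():
    # Skip non-string values, then keep only names that match in full.
    match = pattern.fullmatch(name) if isinstance(name, str) else None
    if match is not None:
        parsed[person_id] = (match.group(1), match.group(2))

print(parsed)
# {1: ('Alan', 'Turing')}

# For contrast, a prefix match on the three-word name succeeds,
# but it silently drops the last word.
prefix = pattern.match(names[2])
print(prefix.group(0))
# Gottfried Wilhelm
```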
.match()
```python
Pattern.match(string: str|Producer, pos: int|Producer = 0) -> Match
```
Matches a string that begins with a compiled pattern starting from position `pos` and returns a `Match` object.
If `string` or `pos` is a `Producer`, then `.match()` filters out non-string values from `string` and non-integer and negative values from `pos`.
Must be used in a rule or query context.
Parameters
Name | Type | Description |
---|---|---|
string | Producer or Python str | The string to match against. |
pos | Producer or Python int | The starting position of the match. Must be non-negative. (Default: 0 ) |
Returns
A `Match` object.
Example
Use `.match()` to match a compiled pattern at the beginning of a string, starting from a specified position:
```python
import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(name="Alan Turing")
    Person.add(id=2).set(name="Bob")
    Person.add(id=3).set(name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern for matching full names with groups for first and last names.
pattern = re.compile(r"(\w+) (\w+)")

with model.rule():
    person = Person()
    # Match the pattern at the beginning of each person's name.
    match = pattern.match(person.name)
    # Use match.group() to set first_name and last_name properties. Since match()
    # filters out non-matching strings and non-string values, the following does
    # not set properties for Person objects with IDs 2 and 3.
    person.set(first_name=match.group(1), last_name=match.group(2))

with model.query() as select:
    person = Person()
    response = select(
        person.id,
        person.name,
        person.first_name,
        person.last_name
    )

print(response.results)
#    id         name first_name last_name
# 0   1  Alan Turing       Alan    Turing
# 1   2          Bob        NaN       NaN
# 2   3           -1        NaN       NaN
```
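
The example above uses the default `pos` of `0`. As a rough illustration of what a non-zero `pos` does, the sketch below uses the `pos` argument of Python's built-in compiled patterns. It assumes that `pos` in RAI is also a 0-based offset at which the match is anchored; the sample string is illustrative only.

```python
import re as pyre  # Python's standard library, not relationalai.std.re

pattern = pyre.compile(r"(\w+) (\w+)")
text = "Dr. Alan Turing"

# From position 0 the match is anchored at "Dr", which is followed by "."
# rather than a space, so the pattern does not match.
from_start = pattern.match(text)

# From position 4 the match is anchored at "Alan".
from_pos = pattern.match(text, 4)

print(from_start)
# None
print(from_pos.group(1), from_pos.group(2))
# Alan Turing
```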
.search()
```python
Pattern.search(string: str|Producer, pos: int|Producer = 0) -> Match
```
Searches a string for a match to the compiled pattern starting from position `pos` and returns a `Match` object for the first match found.
If `string` or `pos` is a `Producer`, then `.search()` filters out non-string values from `string` and non-integer and negative values from `pos`.
Must be used in a rule or query context.
Parameters
Name | Type | Description |
---|---|---|
string | Producer or Python str | The string to search. |
pos | Producer or Python int | The starting position of the search. Must be non-negative. (Default: 0 ) |
Returns
A `Match` object.
Example
Use `.search()` to search for a substring that matches the compiled regular expression anywhere in a string, starting from a specified position:
```python
import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Message = model.Type("Message")

with model.rule():
    Message.add(id=1).set(text="The party starts at 8:00 PM.")
    Message.add(id=2).set(text="Bring tacos")
    Message.add(id=3).set(text=-1)  # Non-string text

# =======
# EXAMPLE
# =======

# Compile a pattern for times in the format "HH:MM AM/PM"
pattern = re.compile(r"\d{1,2}:\d{2} [AP]M")

with model.rule():
    message = Message()
    # Search for the time pattern within each message starting at position 0.
    match = pattern.search(message.text)
    # Since search() filters out non-matching and non-string values, the
    # following properties are not set for messages with IDs 2 and 3.
    message.set(
        match_text=match,
        match_start=match.start(),
        match_end=match.end()
    )

with model.query() as select:
    message = Message()
    response = select(
        message.id,
        message.text,
        message.match_text,
        message.match_start,
        message.match_end,
    )

print(response.results)
#    id                          text match_text  match_start  match_end
# 0   1  The party starts at 8:00 PM.    8:00 PM         20.0       27.0
# 1   2                   Bring tacos        NaN          NaN        NaN
# 2   3                            -1        NaN          NaN        NaN
```
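
The start and end values in the output above can be checked locally. The sketch below runs the same pattern through Python's built-in `re` module, which reports a 0-based start and an exclusive end; that these conventions carry over to the RAI `Match` object is an assumption, although the values agree with the results shown.

```python
import re as pyre  # Python's standard library, not relationalai.std.re

pattern = pyre.compile(r"\d{1,2}:\d{2} [AP]M")
text = "The party starts at 8:00 PM."

match = pattern.search(text)
match_text = match.group()
match_start = match.start()  # 0-based index of the first matched character
match_end = match.end()      # index just past the last matched character

print(match_text, match_start, match_end)
# 8:00 PM 20 27

# A string with no time in it produces no match at all.
no_match = pattern.search("Bring tacos")
print(no_match)
# None
```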
.sub()
```python
Pattern.sub(repl: str|Producer, string: str|Producer) -> Expression
```
Replaces all occurrences of a compiled pattern in `string` with `repl`.
If `string` or `repl` is a `Producer`, then `.sub()` filters out non-string values from `string` and `repl`.
Must be used in a rule or query context.
Parameters
Name | Type | Description |
---|---|---|
repl | Producer or Python str | The string to replace the matches with. |
string | Producer or Python str | The string to search for matches. |
pos | Producer or Python int | The starting position of the search. (Default: 0 ) |
Returns
An `Expression` object.
Example
Use `.sub()` to replace all occurrences of a compiled pattern in a string:
```python
import relationalai as rai
from relationalai.std import re

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(full_name="Alan Turing")
    Person.add(id=2).set(full_name="Gottfried Wilhelm Leibniz")
    Person.add(id=3).set(full_name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Compile a pattern that matches each word and captures its first letter,
# so that every name in a full name can be replaced with its initial.
pattern = re.compile(r"(\w)\w+")

with model.rule():
    person = Person()
    person.set(initials=pattern.sub(r"\1.", person.full_name))

with model.query() as select:
    person = Person()
    response = select(person.id, person.full_name, person.initials)

print(response.results)
#    id                  full_name  initials
# 0   1                Alan Turing     A. T.
# 1   2  Gottfried Wilhelm Leibniz  G. W. L.
# 2   3                         -1       NaN
```
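
The replacement string `r"\1."` refers back to the first capture group, which is what turns each word into an initial. The sketch below shows the same substitution with Python's built-in `re` module, assuming backreferences behave the same way in RAI. `initials()` is a hypothetical helper, and its `isinstance` check stands in for the filtering of non-string values.

```python
import re as pyre  # Python's standard library, not relationalai.std.re

# Group 1 captures the first letter; \w+ consumes the rest of the word.
pattern = pyre.compile(r"(\w)\w+")

def initials(value):
    """Replace every word of two or more characters with its initial and a dot."""
    if not isinstance(value, str):
        return None
    return pattern.sub(r"\1.", value)

print(initials("Alan Turing"))
# A. T.
print(initials("Gottfried Wilhelm Leibniz"))
# G. W. L.
print(initials(-1))
# None
```

Because the pattern requires at least two word characters, a word that is already a single letter is left unchanged: `initials("J R Tolkien")` gives `"J R T."`.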