relationalai.std.strings.levenshtein()
levenshtein(string1: str|Producer, string2: str|Producer) -> Expression
Calculates the Levenshtein distance between two strings: the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one string into the other. If `string1` or `string2` is a `Producer`, then `levenshtein()` filters out non-string values. Must be called in a rule or query context.
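To make the metric concrete, here is a minimal pure-Python sketch of the classic dynamic-programming computation of Levenshtein distance. It is an illustration only, not RelationalAI's implementation, and the helper name `levenshtein_distance` is hypothetical.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Return the minimum number of single-character edits that turn a into b."""
    # previous[j] is the distance between the prefix of a processed so far
    # (initially the empty string) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance("Alice", "Alicia"))    # 2
print(levenshtein_distance("kitten", "sitting"))  # 3
```

Keeping only two rows of the table at a time holds memory proportional to the length of the second string, while the running time stays proportional to the product of the two lengths.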
Parameters
| Name | Type | Description |
|---|---|---|
| `string1` | `Producer` or Python `str` object | The first string. |
| `string2` | `Producer` or Python `str` object | The second string. |
Returns
An `Expression` object.
Example
Use `levenshtein()` to calculate the distance between pairs of strings:
import relationalai as rai
from relationalai.std import aggregates, strings

# =====
# SETUP
# =====

model = rai.Model("MyModel")
Person = model.Type("Person")

with model.rule():
    Person.add(id=1).set(name="Alice")
    Person.add(id=2).set(name="Alicia")
    Person.add(id=3).set(name="Bob")
    Person.add(id=4).set(name=-1)  # Non-string name

# =======
# EXAMPLE
# =======

# Set a multi-valued most_similar_to property on each person to other people
# whose names have the smallest Levenshtein distance from their own.
with model.rule():
    person, other = Person(), Person()
    person != other
    # Calculate the Levenshtein distance between the names of each pair of people.
    dist = strings.levenshtein(person.name, other.name)
    # Filter to others with smallest distance per person.
    aggregates.bottom(1, dist, per=[person])
    # Set the most_similar_to property to the other people with the smallest distance.
    person.most_similar_to.extend([other])

# Since levenshtein() filters out non-string values, the most_similar_to property
# is not set for the person with id=4.
with model.query() as select:
    person = Person()
    response = select(
        person.id,
        person.name,
        person.most_similar_to.id,
        person.most_similar_to.name
    )

print(response.results)
#    id    name  id2   name2
# 0   1   Alice    2  Alicia
# 1   2  Alicia    1   Alice
# 2   3     Bob    1   Alice
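
The output above can be cross-checked outside the model with a plain-Python sketch. It assumes an ordinary dictionary in place of the model's `Person` entities, uses a hypothetical helper `edit_distance`, and skips non-string names to mimic the filtering described earlier. This data has no ties for the smallest distance, so tie handling is not modeled. It is not how RelationalAI evaluates the rule.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def edit_distance(a: str, b: str) -> int:
    """Recursive Levenshtein distance (hypothetical helper, not the library's code)."""
    if not a:
        return len(b)  # Insert every remaining character of b.
    if not b:
        return len(a)  # Delete every remaining character of a.
    if a[0] == b[0]:
        return edit_distance(a[1:], b[1:])  # First characters match: no edit needed.
    return 1 + min(
        edit_distance(a[1:], b),      # deletion
        edit_distance(a, b[1:]),      # insertion
        edit_distance(a[1:], b[1:]),  # substitution
    )


# Same ids and names as in the example; id 4 has a non-string name.
names = {1: "Alice", 2: "Alicia", 3: "Bob", 4: -1}

# Keep only string names, mirroring how levenshtein() filters out non-strings.
string_names = {pid: name for pid, name in names.items() if isinstance(name, str)}

most_similar = {}
for pid, name in string_names.items():
    # Distance from this person's name to every other person's name.
    others = {
        oid: edit_distance(name, other)
        for oid, other in string_names.items()
        if oid != pid
    }
    # Keep the id of the person at the smallest distance.
    most_similar[pid] = min(others, key=others.get)

print(most_similar)  # {1: 2, 2: 1, 3: 1}
```

"Alice" and "Alicia" are 2 edits apart, while "Bob" shares no characters with either name, so its distances are 5 and 6 and its nearest match is "Alice". The person with id 4 never appears in the result because its name is not a string.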