Core Concepts

RelationalAI (RAI) adds a layer of intelligence to your Snowflake data cloud. You use Python-based models to define the relationships, rules, and reasoning that drive your organization’s decision-making, and RAI provides tools for querying those models and libraries for applying various AI reasoning techniques.

This guide covers the core concepts of building, testing, and deploying an application using RAI.

Models represent the entities important to your organization as objects. Each object represents a real-world entity, such as a customer, account, or transaction, and it has properties that describe its attributes and relationships to other objects.

Models describe objects by defining:

  • Types, which represent categories of objects, such as Customer or Account.
  • Rules, which capture knowledge by setting object properties and assigning objects to types. For instance, a rule might assign all Account objects that have no transactions within the last year to an Inactive type.

Rules may leverage advanced features, like graph reasoning, to infer relationships and properties of objects and drive more sophisticated decision-making. A rule in a financial fraud model, for example, might use a graph algorithm to identify suspicious accounts that need to be investigated.

You can query a model to answer questions about the objects it describes. Queries are evaluated by the RAI Native App installed in your Snowflake account, where objects are created and assigned types and properties based on the model’s rules, using data from Snowflake tables shared with the app.

The following diagram illustrates the relationship between a model, the RAI Native App, and Snowflake data for a hypothetical financial fraud model:

Diagram illustrating a model's interaction with Snowflake data. The model has four types: Criminal (data source: criminals), Account (data source: accounts), Transaction (data source: transactions), and Suspicious. It also has one rule: Accounts with transaction patterns similar to accounts owned by criminals are suspicious. The query is: 'Which accounts are suspicious?' The RAI Native App processes the query using the Snowflake data sources (financedb.fraud.criminals, financedb.fraud.accounts, financedb.fraud.transactions). The results are returned to the model's Python process.

Rules and queries are written in Python using RAI’s declarative query-builder syntax. In the following sections, you’ll learn the basics of creating a model, defining types and rules, and executing queries.

Instantiate the Model class to create a model:

import relationalai as rai
model = rai.Model("MyModel")

You’ll use the model object to declare types, define rules, and execute queries.

Use the Model.Type() method to declare a type:

import relationalai as rai
model = rai.Model("MyModel")
Person = model.Type("Person")

model.Type("Person") returns an instance of the Type class, which is used to define objects in rules and queries. By convention, variable names for types are capitalized.

When you query a model, objects are created and assigned types and properties according to the model’s rules. Data for objects may come from Snowflake tables or be defined in rules.

Defining Objects From Rows In Snowflake Tables

To populate a type with objects from a Snowflake table or view, create a data stream from the table to your model and pass the fully-qualified table name to the Type constructor’s source parameter:

import relationalai as rai
model = rai.Model("MyModel")
# Get a Provider instance.
app = rai.Provider()
# Create a stream from a Snowflake table named "people" to your model. Note that
# <db> and <schema> are placeholders for a Snowflake database and schema.
app.create_streams(["<db>.<schema>.people"], model="MyModel")
# Declare a Person type that is populated with objects from the "people" table.
Person = model.Type("Person", source="<db>.<schema>.people")

Setting source="<db>.<schema>.people" ensures that, whenever the model is queried, objects that correspond to rows in the <db>.<schema>.people table are created and assigned to the Person type. Columns in the source table are used to set properties on each object, with lowercased column names used as property names.
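As a rough illustration of the column-to-property mapping described above, the following plain-Python sketch converts table rows into property dictionaries with lowercased names. It does not use the relationalai package; the sample rows and the row_to_properties helper are hypothetical and only mimic the naming behavior.

```python
# Hypothetical rows from a "people" table. Unquoted Snowflake column names
# are stored in uppercase, which is why lowercasing matters.
rows = [
    {"ID": 1, "NAME": "Alice", "AGE": 16},
    {"ID": 2, "NAME": "Bob", "AGE": 18},
]


def row_to_properties(row):
    """Map one table row to a dict of properties with lowercased names."""
    return {column.lower(): value for column, value in row.items()}


people = [row_to_properties(row) for row in rows]
print(people[0])
```

Under this assumption, a NAME column would be read in rules and queries as person.name.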

Only columns with the following Snowflake data types are supported in source tables:

You may specify objects directly in your model’s rules using the Type.add() method. Rules are defined using model.rule(), which returns a context manager and is used in a with statement.

The following creates a model with a Person type and defines two Person objects in a rule:

import relationalai as rai
# Create a model named "MyModel".
model = rai.Model("MyModel")
# Declare a Person type.
Person = model.Type("Person")
# Use Person.add() to define two Person objects.
with model.rule():
    Person.add(name="Alice", age=10, favorite_color="blue")
    Person.add(name="Bob", age=20)

Person.add() declares an object of type Person with properties specified as keyword arguments. Objects in the same type may have different properties. For instance, the above rule defines one Person object with a favorite_color property and another without.

Properties passed to .add() are hashed to create a unique internal identifier for each object. Calling .add() twice with the same properties and values does not generate duplicate objects. For example, the following rule defines two distinct objects, not three:

with model.rule():
    # Define a Person object with three properties: name, age, and favorite_color.
    Person.add(name="Bob", age=20, favorite_color="green")
    # Define a second Person object with two properties: name and age. This object
    # is distinct from the first object because its hash doesn't include the
    # favorite_color property.
    Person.add(name="Bob", age=20)
    # The following does not define a third object because the properties hash
    # to the same value as above.
    Person.add(name="Bob", age=20)

The name of the type that .add() is called on is included in the object’s hash. This means that adding an object with the same properties to two different types results in two distinct objects being created:

Person = model.Type("Person")
Student = model.Type("Student")
with model.rule():
    # Define a Person object with name and age properties.
    Person.add(name="Alice", age=10)
    # Define a Student object with name and age properties. Even though the name
    # and age properties are the same as the Person object above, the objects are
    # distinct because they are created in different types.
    Student.add(name="Alice", age=10)

You may, however, assign the same object to multiple types by passing additional types as positional arguments to .add(). For instance, the following rule defines one Person object that is also a Student:

with model.rule():
    Person.add(Student, name="Alice", age=10)

As a general rule, call .add() from the most general type an object belongs to, and pass any subtypes it belongs to as positional arguments. However, you are free to model things as you see fit.
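The identity behavior of .add() described above can be pictured with a small plain-Python sketch. It is only an analogy: the object_id function, the ToyType class, and the use of SHA-256 are assumptions made for illustration, and RAI's actual hashing scheme is not documented here.

```python
import hashlib


def object_id(type_name, **props):
    """Toy identifier: hash the type name plus the sorted property items."""
    payload = repr((type_name, sorted(props.items())))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


class ToyType:
    """Hypothetical stand-in for a type; it only tracks member identifiers."""

    def __init__(self, name):
        self.name = name
        self.members = set()

    def add(self, *extra_types, **props):
        # The identifier depends on the type .add() is called on, not on
        # the extra types passed as positional arguments.
        oid = object_id(self.name, **props)
        self.members.add(oid)
        for extra in extra_types:
            extra.members.add(oid)
        return oid


Person = ToyType("Person")
Student = ToyType("Student")

bob_with_color = Person.add(name="Bob", age=20, favorite_color="green")
bob = Person.add(name="Bob", age=20)
bob_again = Person.add(name="Bob", age=20)   # same hash, no new object
person_alice = Person.add(name="Alice", age=10)
student_alice = Student.add(name="Alice", age=10)  # different type, new object
shared = Person.add(Student, name="Carol", age=12)  # one object, two types

print(bob == bob_again, bob == bob_with_color, person_alice == student_alice)
```

In this sketch, repeating a call with identical properties yields the same identifier, while changing either the property set or the type that .add() is called on yields a different one.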

In the Capturing Knowledge in Rules section, you’ll learn how to assign types to objects based on their properties and relationships to other objects. But first, let’s take a closer look at properties.

Objects may have two types of properties:

  • Single-valued properties are assigned a single value.
  • Multi-valued properties may be assigned multiple values.

Think of properties as arrows that connect objects to a value. They may point to values with the following types:

  • Strings
  • Numbers, such as integers and floats
  • Booleans
  • Dates and datetimes
  • Other objects

Multi-valued properties do not point to a single list or set of values. Instead, they point to multiple values simultaneously.

The following diagram illustrates three objects and their properties. Single-valued properties are displayed as solid arrows and multi-valued properties as dashed arrows:

A hierarchical diagram of objects and properties. A single-valued 'name' property points from a Person object to the string 'Bob'. The Person object has a multi-valued 'pets' property (dashed arrows) pointing to two objects: a Dog named 'Fido' and a Cat named 'Whiskers.' Each pet object has a single-valued 'name' property pointing to its name.

The status of a property as single- or multi-valued is fixed across the entire model.

There is only one name property in the above diagram, and it is single-valued. You may set the name property on any object, but it must always point to a single value. Similarly, there is only one pets property, which may also be set on any object but is always multi-valued.

By convention, we use plural names for multi-valued properties to distinguish them from single-valued properties.

You can declare single-valued properties using the Property.declare() method:

import relationalai as rai
# Create a model named "MyModel".
model = rai.Model("MyModel")
# Declare Cat, Dog, and Person types.
Cat = model.Type("Cat")
Dog = model.Type("Dog")
Person = model.Type("Person")
# Declare the properties used by each type. This is optional.
Cat.name.declare()
Dog.name.declare()
Person.name.declare()
Person.age.declare()
# Define Cat, Dog, and Person objects.
with model.rule():
    # Note that calling .add() automatically creates single-valued properties
    # even if their declarations are missing.
    Cat.add(name="Whiskers")
    Dog.add(name="Fido")
    Person.add(name="Bob")

In this example:

  • Cat.name returns an instance of the Property class. This might look a bit strange, since the .name attribute doesn’t exist. Type instances allow dynamic attribute access.

  • .declare() is called from Cat.name, Dog.name, and Person.name to declare that Cat, Dog, and Person objects all use the single-valued name property.
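If dynamic attribute access is unfamiliar, the following plain-Python sketch shows one way such behavior can be built with __getattr__. The ToyModel, ToyType, and ToyProperty classes are hypothetical and are not RAI's implementation; the sketch only assumes, as stated earlier, that a property such as name is shared across the whole model.

```python
class ToyProperty:
    """Hypothetical property record shared by every type in a model."""

    def __init__(self, name):
        self.name = name
        self.multi_valued = None  # Unknown until declared.

    def declare(self):
        self.multi_valued = False
        return self

    def has_many(self):
        self.multi_valued = True
        return self


class ToyModel:
    def __init__(self):
        self.properties = {}  # One registry for the whole model.

    def Type(self, name):
        return ToyType(self, name)


class ToyType:
    def __init__(self, model, type_name):
        self._model = model
        self._type_name = type_name

    def __getattr__(self, attr):
        # __getattr__ runs only when normal attribute lookup fails, so any
        # unknown public name is treated as a property and created on demand.
        if attr.startswith("_"):
            raise AttributeError(attr)
        registry = self._model.properties
        if attr not in registry:
            registry[attr] = ToyProperty(attr)
        return registry[attr]


model = ToyModel()
Cat = model.Type("Cat")
Person = model.Type("Person")

Cat.name.declare()
Person.pets.has_many()
print(Cat.name is Person.name, Person.pets.multi_valued)
```

Because the registry lives on the model in this sketch, Cat.name and Person.name refer to the same property record, which mirrors the rule that a property is single- or multi-valued everywhere.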

You aren’t required to set values for every declared property. For instance, the Person object defined in the rule above doesn’t have an age value.

Properties set with .add() serve as the object’s primary key. When you call .add(), it returns an Instance that references the object. Use the Instance.set() method to define additional single-valued properties that aren’t part of the object’s primary key:

with model.rule():
    # Define a Cat object with primary key property id set to 1. Note that if a
    # Cat object with id=1 already exists, .add() returns a reference to the
    # existing object.
    cat = Cat.add(id=1)
    # Set the name property for the object. name is single-valued and is not
    # part of the object's primary key.
    cat.set(name="Whiskers")
    # .set() also returns an Instance object, so you can chain calls. The
    # following is equivalent to the above:
    Cat.add(id=1).set(name="Whiskers")

You can only set a single-valued property once per object:

# The following pair of rules is invalid.
with model.rule():
    Cat.add(id=1).set(name="Whiskers")
with model.rule():
    # Cat(id=1) returns an Instance that references the Cat object with id=1.
    # This is the same object defined in the previous rule. Setting two values
    # for the same single-valued property is invalid.
    Cat(id=1).set(name="Fluffy")

It’s tempting to interpret the rules in the preceding example as first creating a Cat object with a name property set to "Whiskers" and then updating the name property to "Fluffy". But that’s not what happens.

Rules are not executed sequentially. Each rule describes a fact about the objects that are created when you query the model. The two rules above describe contradictory facts for the same object, which is invalid.

You may, however, set different properties for the same object in different rules:

# Define a Cat object with id=1 and set its name property to "Whiskers".
with model.rule():
    Cat.add(id=1).set(name="Whiskers")
# Set the breed and color properties for the Cat object with id=1.
with model.rule():
    Cat(id=1).set(breed="Siamese", color="white")
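To see why rule order is irrelevant and why two values for one single-valued property conflict, it can help to treat each rule as a source of facts. The following plain-Python sketch is an analogy under that assumption; the evaluate and is_consistent helpers and the rule functions are hypothetical and do not reflect how the RAI Native App evaluates rules.

```python
def evaluate(rules):
    """Collect single-valued facts from every rule and reject contradictions."""
    facts = {}
    for rule in rules:
        for obj, prop, value in rule():
            existing = facts.setdefault((obj, prop), value)
            if existing != value:
                raise ValueError(
                    f"conflicting values for {prop} on {obj}: "
                    f"{existing!r} vs {value!r}"
                )
    return facts


def is_consistent(rules):
    """Return True if the rules can all hold at once."""
    try:
        evaluate(rules)
    except ValueError:
        return False
    return True


def rule_name():
    yield ("Cat", 1), "name", "Whiskers"


def rule_details():
    yield ("Cat", 1), "breed", "Siamese"
    yield ("Cat", 1), "color", "white"


def rule_rename():
    yield ("Cat", 1), "name", "Fluffy"


# Different properties for the same object: valid in either order.
facts_a = evaluate([rule_name, rule_details])
facts_b = evaluate([rule_details, rule_name])
print(facts_a == facts_b)

# Two names for the same object: invalid in either order.
print(is_consistent([rule_name, rule_rename]))
print(is_consistent([rule_rename, rule_name]))
```

The resulting facts are the same whichever rule is listed first, and the contradictory pair fails regardless of order, which is the sense in which rules describe facts rather than steps.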

Declare multi-valued properties using the Property.has_many() method:

import relationalai as rai
# Create a model named "MyModel".
model = rai.Model("MyModel")
# Declare Cat, Dog, and Person types.
Cat = model.Type("Cat")
Dog = model.Type("Dog")
Person = model.Type("Person")
# Declare the properties used by each type. This is optional.
Cat.name.declare()
Dog.name.declare()
Person.name.declare()
Person.pets.has_many() # pets is multi-valued.
# Define Cat, Dog, and Person objects.
with model.rule():
    whiskers = Cat.add(name="Whiskers")
    fido = Dog.add(name="Fido")
    bob = Person.add(name="Bob")
    # Extend Bob's multi-valued pets property with Whiskers and Fido. Note that
    # calling bob.pets.extend() automatically creates the multi-valued pets property
    # even if its declaration is missing.
    bob.pets.extend([fido, whiskers])
    # Alternatively, you may add values to a multi-valued property one at a time
    # using bob.pets.add(). The following two lines are equivalent to the one above.
    # Like .extend(), .add() automatically creates the multi-valued pets property
    # if its declaration is missing.
    bob.pets.add(fido)
    bob.pets.add(whiskers)

In this example:

  • Person.pets.has_many() declares that Person objects use a multi-valued pets property.

  • bob.pets returns an InstanceProperty that references the bob object’s pets property. Use bob.pets.extend() or bob.pets.add() to set values for the pets property.

Just like single-valued properties, you aren’t required to set values for every multi-valued property that you declare.

Unlike single-valued properties, you may set multi-valued properties multiple times. Each rule that sets a multi-valued property adds to the property’s values:

# Define a Person object with two pets.
with model.rule():
    bob = Person.add(name="Bob")
    bob.pets.extend([
        Cat.add(name="Whiskers"),
        Dog.add(name="Fido"),
    ])
# Define another Dog object and set it as Bob's pet.
with model.rule():
    # Person(name="Bob") returns an Instance that references the Person object
    # with name="Bob".
    bob = Person(name="Bob")
    # Add another Dog object to Bob's pets. Here, bob.pets.add() is used instead
    # of bob.pets.extend() since only one pet is being added.
    bob.pets.add(Dog.add(name="Buddy"))

The second rule doesn’t replace the pets that are set in the first rule. Rather, Bob’s pets property points to three objects: Whiskers, Fido, and Buddy.

Multi-valued properties have set-like semantics. Setting the same value for a multi-valued property multiple times doesn’t create duplicate property values.
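One way to picture these set-like semantics is to store each property value as an (object, property, value) fact in a Python set. This is a minimal sketch of the idea, not RAI's storage model; the facts set and the add_value and values helpers are hypothetical.

```python
# Each fact is an (object, property, value) triple. Storing facts in a set
# means that repeating a fact has no effect.
facts = set()


def add_value(obj, prop, value):
    """Record that obj's multi-valued property prop points to value."""
    facts.add((obj, prop, value))


def values(obj, prop):
    """Return all values that obj's property prop points to, sorted."""
    return sorted(v for o, p, v in facts if o == obj and p == prop)


# First rule: Bob has two pets.
add_value("Bob", "pets", "Whiskers")
add_value("Bob", "pets", "Fido")
# Second rule: adds a pet without replacing the earlier ones.
add_value("Bob", "pets", "Buddy")
# Setting the same value again does not create a duplicate.
add_value("Bob", "pets", "Fido")

print(values("Bob", "pets"))
```

The property ends up pointing to three separate values rather than to one list, and the repeated Fido fact is absorbed.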

Queries are written using the model.query() method, which, like model.rule(), returns a context manager that is used in a with statement.

This section introduces the basics of querying a model. For a more in-depth look at RAI’s query-builder syntax, see the Basic Functionality guide.

The following example creates a model with a Person type, defines some Person objects, and then queries the model for their IDs, names, and favorite colors:

import relationalai as rai
# =====
# SETUP
# =====
# Create a model named "MyModel".
model = rai.Model("MyModel")
# Declare a Person type.
Person = model.Type("Person")
with model.rule():
    # Define Person objects.
    alice = Person.add(id=1).set(name="Alice", age=16, favorite_color="blue")
    bob = Person.add(id=2).set(name="Bob", age=18, favorite_color="green")
    carol = Person.add(id=3).set(name="Carol", age=18)  # Carol has no favorite_color.
    # Connect people to their friends with a multi-valued friends property.
    # Visually, the friends property looks like: Alice <-> Bob <-> Carol.
    alice.friends.add(bob)
    bob.friends.extend([alice, carol])
    carol.friends.add(bob)
# =======
# EXAMPLE
# =======
# Query the model for names of people and their favorite color.
with model.query() as select:
    person = Person()
    response = select(person.id, person.name, person.favorite_color)
# Print the query results.
print(response.results)
#    id   name favorite_color
# 0   1  Alice           blue
# 1   2    Bob          green
# 2   3  Carol            NaN

Let’s break down the model.query() block:

  • Calling Person() returns an Instance that references a Person object, which is assigned here to the person variable.
  • select(person.id, person.name, person.favorite_color) selects the id, name, and favorite_color properties of person objects and returns a Context object used to access the query results.
  • When the Python interpreter reaches the end of the with block, the model compiles the query together with all of its types and rules. The compiled query is sent to the RAI Native App for evaluation and execution is blocked until a response is received. Results are assigned to the Context object’s .results attribute and, by default, are returned as a pandas DataFrame with three columns labeled id, name, and favorite_color and a row for each person.
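The deferred execution described above, where nothing is evaluated until the with block ends, can be imitated with an ordinary Python context manager. The ToyQuery class below is a hypothetical sketch that evaluates over an in-memory list; it is not the relationalai implementation and skips compilation and the round trip to the RAI Native App.

```python
class ToyQuery:
    """Records a selection inside the with block and evaluates it on exit."""

    def __init__(self, data):
        self._data = data
        self._selected = None
        self.results = None

    def __enter__(self):
        # The value bound by "as select" is the select method itself.
        return self.select

    def select(self, *columns):
        # Only record what was requested; nothing is evaluated yet.
        self._selected = columns
        return self

    def __exit__(self, exc_type, exc, tb):
        # Evaluate once the block finishes without an error. Missing
        # properties come back as None, loosely mirroring null values.
        if exc_type is None and self._selected is not None:
            self.results = [
                tuple(row.get(column) for column in self._selected)
                for row in self._data
            ]
        return False


people = [
    {"id": 1, "name": "Alice", "favorite_color": "blue"},
    {"id": 2, "name": "Bob", "favorite_color": "green"},
    {"id": 3, "name": "Carol"},
]

with ToyQuery(people) as select:
    response = select("id", "name", "favorite_color")
    results_inside_block = response.results  # Still None at this point.

print(results_inside_block)
print(response.results)
```

This is why the examples in this guide read response.results only after the with block has closed.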

If an object lacks a value for a selected property, null values are returned. For example, Carol has no favorite color, so it’s displayed as NaN in the results. Refer to Dealing With Null Values for more information on handling null values in queries.

Queries work a bit like a SQL SELECT statement, but with a different order:

| RAI Python | SQL Interpretation |
| --- | --- |
| person = Person() | FROM Person person |
| select(person.name, person.favorite_color) | SELECT person.name, person.favorite_color |

When you select a single-valued property, like person.name, one row is returned for each person object filtered by the query. This can lead to duplicate rows in the results if multiple objects have the same property value. For instance, querying the model for people’s ages returns two rows where the age is 18:

with model.query() as select:
    person = Person()
    response = select(person.age)
print(response.results)
#    age
# 0   16
# 1   18
# 2   18

One of the duplicate rows is for Bob, and the other is for Carol. Use select.distinct() to remove duplicate rows:

with model.query() as select:
    person = Person()
    response = select.distinct(person.age)
print(response.results)
#    age
# 0   16
# 1   18

Selecting a multi-valued property, like person.friends, returns a row for each person and each of their friends, which can again lead to duplicate rows in the results:

with model.query() as select:
    person = Person()
    response = select(person.friends)
print(response.results)
#                   friends
# 0  XXolLCtOngI6pFJ128Xktg
# 1  d1SmRsWF5TLVmYhCCPhD9g
# 2  g4rDjPY1HHWkEikWQXw+3Q
# 3  g4rDjPY1HHWkEikWQXw+3Q

person.friends points to objects, so the friends column in the results displays the internal identifier of each friend object. There are two rows with the same identifier. This makes sense because both Alice and Carol have Bob as a friend, so we should expect two rows corresponding to the Bob object.

Property access may be chained. For example, person.friends.name accesses the name property of each person.friends object. Selecting person.friends.name does not return a row per person per friend. Instead, it returns a row per unique friend object in the set of all person.friends objects:

with model.query() as select:
    person = Person()
    response = select(person.friends.name)
print(response.results)
#     name
# 0  Alice
# 1    Bob  <-- Only one row for Bob, even though he's friends with two people.
# 2  Carol

In general, whatever is to the left of the last dot (.) in a property chain determines the property’s key. Since person.friends is keyed by person, selecting it returns rows for each person object. person.friends.name is keyed by friends, so selecting it returns rows for each unique object assigned to some person’s friends property.

When you select multiple properties, multiple keys may be used to determine the rows in the results:

with model.query() as select:
    person = Person()
    friend = person.friends
    response = select(person.name, friend.name)
print(response.results)
#     name  name2
# 0  Alice    Bob
# 1    Bob  Alice
# 2    Bob  Carol
# 3  Carol    Bob

Each row in the results is keyed by both person and friends, so there is one row for each person-friend pair.
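The effect of keys on row counts can be reproduced with plain Python collections. This sketch only models the three selections discussed above using the same friendships as the example; the friends dictionary and the variable names are illustrative, and names stand in for the internal object identifiers.

```python
# The friendships from the example: Alice <-> Bob <-> Carol.
friends = {
    "Alice": ["Bob"],
    "Bob": ["Alice", "Carol"],
    "Carol": ["Bob"],
}

# select(person.friends): keyed by person, so there is one row per person
# and friend, but only the friend column is shown. Bob appears twice.
rows_friends = [
    friend for person in sorted(friends) for friend in friends[person]
]

# select(person.friends.name): keyed by the friend object, so there is one
# row per unique friend.
rows_friend_names = sorted(
    {friend for friend_list in friends.values() for friend in friend_list}
)

# select(person.name, friend.name): keyed by both person and friend, so
# there is one row per person-friend pair.
rows_pairs = [
    (person, friend) for person in sorted(friends) for friend in friends[person]
]

print(rows_friends)
print(rows_friend_names)
print(rows_pairs)
```

The first selection has four rows with Bob repeated, the second collapses to three unique friends, and the third shows the four person-friend pairs.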

Declare conditions in a query to filter objects:

with model.query() as select:
    person = Person()
    person.age >= 18
    response = select(person.name, person.age)
print(response.results)
#     name  age
# 0    Bob   18
# 1  Carol   18

The preceding query selects only people who are at least 18 years old. person.age >= 18 is similar to a SQL WHERE clause:

| RAI Python | SQL Interpretation |
| --- | --- |
| person = Person() | FROM Person person |
| person.age >= 18 | WHERE person.age >= 18 |
| select(person.name, person.age) | SELECT person.name, person.age |

Queries can perform joins and declare multiple conditions:

with model.query() as select:
    person1, person2 = Person(), Person()
    person1.age >= 16
    person2.favorite_color.in_(["red", "blue"])
    response = select(person1.name, person2.name)
print(response.results)
#     name  name2
# 0  Alice  Alice
# 1    Bob  Alice
# 2  Carol  Alice

Each line in the query body is combined with AND, so this query selects pairs of people where the first person is at least 16 and the second’s favorite color is red or blue. See Filtering Objects by Type and Filtering Objects by Property Value in the Basic Functionality guide for more information on filtering objects.
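For readers who think in terms of loops, the join above behaves like filtering the cross product of Person with itself. The following plain-Python sketch reproduces the result under that reading; the people list is a hand-written copy of the example data, not something fetched from a model.

```python
from itertools import product

# Hand-written copy of the example data. Carol has no favorite_color.
people = [
    {"name": "Alice", "age": 16, "favorite_color": "blue"},
    {"name": "Bob", "age": 18, "favorite_color": "green"},
    {"name": "Carol", "age": 18},
]

# Two independent Person() variables range over every pair of people. Both
# conditions must hold for a pair to be kept (logical AND).
pairs = [
    (person1["name"], person2["name"])
    for person1, person2 in product(people, repeat=2)
    if person1["age"] >= 16
    and person2.get("favorite_color") in ("red", "blue")
]

print(pairs)
```

Everyone satisfies the first condition, but only Alice satisfies the second, so each person is paired with Alice. Carol never appears in the second position because she has no favorite color to test.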

Queries return a pandas DataFrame by default. This downloads all of the results and, for large result sets, may be slow and consume a lot of memory. To avoid this, you can set the query’s format parameter to "snowpark" to return a Snowpark DataFrame instead:

with model.query(format="snowpark") as select:
    person = Person()
    response = select(person.name, person.favorite_color)
response.results.show()
# -----------------------------
# |"NAME"  |"FAVORITE_COLOR"  |
# -----------------------------
# |Alice   |blue              |
# |Bob     |green             |
# |Carol   |null              |
# -----------------------------

Alternatively, set format="snowpark" when you instantiate the model to change the default result format:

model = rai.Model("MyModel", format="snowpark")

Data for Snowpark DataFrames are stored in Snowflake. Only a small portion of the data is downloaded for display. You can use DataFrame methods to manipulate the data and even save the results to a table in a Snowflake database. See Writing Results to Snowflake for more information.

The first time a query is executed against a model, a check is made to ensure that the data streams for the model’s Snowflake source data are up-to-date. If the data streams are out-of-date, the data is prepared before the query is executed and a spinner is displayed in your console or notebook environment with details about the data preparation.

If you want to skip this check and ensure that queries always execute immediately, set the wait_for_stream_sync configuration key to False in your raiconfig.toml file. See the Configuration guide for more information.

Rules let you express knowledge about objects and their relationships. For example, a rule can capture the fact “All people who are 18 years or older are adults” by setting Person objects to an Adult type if they meet the age requirement:

import relationalai as rai
# =====
# SETUP
# =====
# Create a model named "MyModel".
model = rai.Model("MyModel")
# Declare Person and Adult types.
Person = model.Type("Person")
Adult = model.Type("Adult")
# Define Person objects.
with model.rule():
    alice = Person.add(id=1).set(name="Alice", age=16, favorite_color="blue")
    bob = Person.add(id=2).set(name="Bob", age=18, favorite_color="green")
    carol = Person.add(id=3).set(name="Carol", age=18)
    alice.friends.add(bob)
    bob.friends.extend([alice, carol])
    carol.friends.add(bob)
# =======
# EXAMPLE
# =======
# Define a rule that sets Person objects to the Adult type if they are 18 or older.
with model.rule():
    person = Person()
    person.age >= 18
    person.set(Adult)
# Query the model for the names and ages of adults.
with model.query() as select:
    adult = Adult()
    response = select(adult.name, adult.age)
print(response.results)
#     name  age
# 0    Bob   18
# 1  Carol   18

Rules work like queries, except that instead of selecting things, they add new objects or set types and properties of existing objects.

Using the SQL analogy, you can interpret the above rule as something like the following:

| RAI Python | SQL Interpretation |
| --- | --- |
| person = Person() | FROM Person person |
| person.age >= 18 | WHERE person.age >= 18 |
| person.set(Adult) | INSERT INTO Adult VALUES (person) |
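To make the SQL analogy concrete, the following sketch runs a comparable INSERT ... SELECT against an in-memory SQLite database using Python's built-in sqlite3 module. The person and adult tables and their columns are assumptions chosen to match the example data. RAI does not create such tables, so treat this purely as an analogy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A person table holding the example data.
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Alice", 16), (2, "Bob", 18), (3, "Carol", 18)],
)

# Membership in the Adult type is modeled as a table of person ids.
conn.execute("CREATE TABLE adult (person_id INTEGER PRIMARY KEY REFERENCES person(id))")

# The rule: every person who is 18 or older is also an adult.
conn.execute("INSERT INTO adult SELECT id FROM person WHERE age >= 18")

# The query: names and ages of adults.
adults = conn.execute(
    "SELECT p.name, p.age FROM adult a "
    "JOIN person p ON p.id = a.person_id ORDER BY p.id"
).fetchall()
conn.close()

print(adults)
```

One difference worth keeping in mind is that the SQL statement runs once at a fixed point in time, whereas a rule states a fact that holds whenever the model is queried.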

Methods like Type.add() and Instance.set() act on the objects filtered by the rule. But rules don’t have to stop after just one action. They may continue to filter objects and take additional actions:

with model.rule():
    person = Person()
    person.age >= 18
    person.set(Adult)
    person.friends.favorite_color == "blue"
    person.set(has_blue_friend=True)
with model.query() as select:
    adult = Adult()
    response = select(adult.name, adult.age, adult.has_blue_friend)
print(response.results)
#     name  age has_blue_friend
# 0    Bob   18            True
# 1  Carol   18             NaN

In this version of the rule:

  • First, person.age >= 18 filters people who are 18 or older. The people who pass this filter are set to the Adult type.
  • Next, person.friends.favorite_color == "blue" filters the remaining people who have a friend whose favorite color is blue. The has_blue_friend property is set to True for these people.

Filters and actions are applied in the declared order, with actions affecting only objects that pass all preceding filters. For example, both Bob and Carol are adults, but only Bob’s has_blue_friend property is set to True.
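The narrowing effect of successive filters can be traced by hand in plain Python. This sketch assumes the example data shown earlier and uses hypothetical variable names; it imitates the order of filters and actions within one rule, not how the rule is actually evaluated.

```python
# Hand-written copy of the example data. Carol has no favorite_color.
people = {
    "Alice": {"age": 16, "favorite_color": "blue", "friends": ["Bob"]},
    "Bob": {"age": 18, "favorite_color": "green", "friends": ["Alice", "Carol"]},
    "Carol": {"age": 18, "friends": ["Bob"]},
}

adults = set()
has_blue_friend = {}

# Filter 1: person.age >= 18.
matching = [name for name, person in people.items() if person["age"] >= 18]
# Action 1: person.set(Adult) applies to everyone who passed filter 1.
adults.update(matching)

# Filter 2: person.friends.favorite_color == "blue" narrows the same group.
matching = [
    name
    for name in matching
    if any(
        people[friend].get("favorite_color") == "blue"
        for friend in people[name]["friends"]
    )
]
# Action 2: person.set(has_blue_friend=True) applies only to the survivors.
for name in matching:
    has_blue_friend[name] = True

print(sorted(adults))
print(has_blue_friend)
```

Carol passes the first filter and becomes an adult, but her only friend is Bob, whose favorite color is green, so the second action never reaches her.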

In this guide, you learned about the core concepts of the RelationalAI Python package:

  • Models describe the entities, concepts, and logic important to your organization. Entities are represented by objects with properties that may be single- or multi-valued.
  • Types represent categories of objects in a model and may be populated with objects that come from Snowflake tables or are defined in rules.
  • Rules represent facts about objects, such as which types they belong to and what properties they have.
  • Queries ask questions about the objects described in a model.

The examples in this guide only scratch the surface of what you can express in a model. To learn more: