Skip to content

Managing Graphs

Goal

This guide demonstrates how to create, view, update, and delete graphs using RelationalAI’s SQL Graph Library for Snowflake.

Introduction

Graphs have two components:

  1. A data stream, which sends data for the graph’s edges to RAI.
  2. A graph object, which is built from the data stream and can be used to perform graph analyses.

Although edge data streams and graphs are stored in a RAI database, you can create and manage graphs without leaving Snowflake. In this guide, you’ll learn how to:

  1. Create graphs.
  2. View graph information.
  3. Update graphs.
  4. Delete graphs.

Creating Graphs

There are three steps to creating a graph in RAI:

  1. Prepare your Snowflake data.
  2. Create an edge data stream.
  3. Create a graph.

Prepare Snowflake Data

Before you create a graph, you must have a Snowflake table or view containing the edge data. The table must have two columns: each row represents an edge and each column contains an integer value representing a node in the graph.

Typically, edge views are created using RAI’s Schema Mapping Library to map entities in your Snowflake data to nodes that can be used to describe the graph’s edges:

USE DATABASE my_snowflake_db;
USE SCHEMA my_snowflake_schema;
 
-- Create a table representing products purchased by clients.
CREATE TABLE Purchase(Client_ID INT, Product_ID INT)
    AS SELECT * FROM VALUES
    (1, 1), (1, 3), (2, 1);
 
-- Create Client and Product entities.
CALL RAI.create_entity('Client', 'id');
CALL RAI.create_entity('Product', 'id');
 
-- Create the entity lookup table.
CALL RAI.create_lookup('Client', 'Purchase', 'Client_ID');
CALL RAI.create_lookup('Product', 'Purchase', 'Product_ID');
 
-- Create a view with the edge set for the graph.
CREATE VIEW my_edges(source, target) AS (
    SELECT RAI.node('Client', Client_ID), RAI.node('Product', Product_ID)
    FROM Purchase
);

The advantage of transforming your data this way is that nodes are automatically represented as integers. In the preceding example, the Purchase table has two integer columns and looks like it could represent edges in a graph. However, the domains of the nodes overlap, since there is both a Product and a Client with ID 1. This overlap is automatically handled by the RAI.node function.

See Schema Mapping for details on transforming Snowflake data into entities and working with lookup tables.

Create an Edge Data Stream

The next step is to create a data stream for the graph’s edges using the create_data_stream procedure. Data streams require a RAI database, which can be set using the use_rai_database procedure.

-- Choose the RAI database in which to store the edge data stream.
CALL RAI.use_rai_database('my_rai_db');
/*+-------------------+
  | USE_RAI_DATABASE  |
  +-------------------+
  |  my_rai_database  |
  +-------------------+ */
 
-- Create a RAI data stream from the 'my_edges' view.
-- By default, the data stream is named 'my_edges'.
CALL RAI.create_data_stream('my_edges');
/*+------------------------------------------------------------------------------+
  | {"account": ***,  "id": ***,                                                 |
  |  "name": "my_snowflake_db.rai-my_snowflake_db.my_snowflake_schema.my_edges", |
  |  ...                                                                         |
  | }                                                                            |
  +------------------------------------------------------------------------------+ */
🔎

By default, the functions and procedures in the SQL Graph Library for Snowflake are contained in the RAI schema in your Snowflake database. Note that this schema name may vary depending on how your administrator set up the RAI integration.

The my_edges data stream sends data from the my_edges Snowflake table to the RAI database my_rai_db. Any graph created from the data stream is stored in the same RAI database. Note that it may take several minutes for the data stream to become available. See Create a Data Stream for more information on creating data streams and checking their state.

Currently, the my_edges data stream describes a graph with two nodes — for the clients with IDs 1 and 2 — and three edges representing the two products purchased by client 1 and the product purchased by client 2. If the data contained in the Snowflake source view my_edges changes, then the my_edges data stream, and any graph created from it, is automatically updated.

Create a Graph

Once the data stream is established, you can create a graph from the stream using the create_graph procedure. A RAI engine is required to create a graph and may be set with the use_rai_engine procedure:

-- Choose the RAI engine that will be used to create the graph.
CALL RAI.use_rai_engine('my_rai_engine');
/*+-----------------+
  | USE_RAI_ENGINE  |
  +-----------------+
  |  my_rai_engine  |
  +-----------------+ */
 
-- Create a graph in RAI called 'my_graph'
-- with edges from the 'my_edges' data stream.
CALL RAI.create_graph('my_graph', 'my_edges');
/*+--------------------------------------------------------------------------------+
  | {"message":"Graph 'my_graph' created in database 'my_rai_db'.","success":true} |
  +--------------------------------------------------------------------------------+  */

By default, my_graph is undirected. To create a directed graph, pass {'directed': true} to the create_graph procedure’s optional third parameter:

-- Create a directed graph in RAI called 'my_directed_graph' from the 'my_edges' data stream.
CALL RAI.create_graph('my_directed_graph', 'my_edges', {'directed': true});
/*+-----------------------------------------------------------------------------------------+
  | {"message":"Graph 'my_directed_graph' created in database 'my_rai_db'.","success":true} |
  +-----------------------------------------------------------------------------------------+  */
🔎

It’s possible to create multiple graphs from the same data stream. All graphs created with the same schema, such as the RAI schema in the preceding examples, must have unique names, even if they are created from different data streams. Graph names are case sensitive.

How Graphs Are Stored

When you create a graph, the actual graph object on which algorithms are run is stored in a RAI database. A catalog of available graphs is maintained in a Snowflake database in the Snowflake schema from which the SQL Graph Library functions are called, such as the RAI schema in the preceding examples. Graphs stored in this catalog are identified by their name, so graph names must be unique to the Snowflake schema in which they are created.

Viewing Graph Information

The SQL Graph Library provides three functions for checking whether or not graphs exist and viewing information about graphs:

Check If a Graph Exists

To check whether or not a graph exists, pass the graph name as a string to the is_graph_created function:

SELECT RAI.is_graph_created('my_graph');
/*+------+
  | TRUE |
  +------+ */

Calling the RAI.is_graph_created function returns the BOOLEAN value TRUE if a graph with the given name has been created using the RAI schema, and FALSE otherwise.

View Info About a Graph

You can view info about a specific graph by passing the graph’s name to the get_graph function:

SELECT RAI.get_graph('my_graph');
/*+------------------------------------------------------------------+
  | {                                                                |
  |   "DIRECTED": false,                                             |
  |   "EDGE_STREAM": "MY_SNOWFLAKE_DB.MY_SNOWFLAKE_SCHEMA.my_edges", |
  |   "NAME": "my_graph",                                            |
  |   "RAI_DATABASE": "my_rai_db"                                    |
  | }                                                                |
  +------------------------------------------------------------------+ */

The output is a JSON object that tells you:

  • Whether or not the graph is directed.
  • The fully qualified name of the Snowflake table or view linked to the graph’s edge data stream.
  • The name of the graph.
  • The RAI database that the graph is stored in.

If no graph with the given name exists, get_graph returns NULL.

List All Available Graphs

To view all available graphs, use the list_graphs function:

SELECT RAI.list_graphs();
/*+--------------------------------------------------------------------+
  | [                                                                  |
  |   {                                                                |
  |     "DIRECTED": false,                                             |
  |     "EDGE_STREAM": "MY_SNOWFLAKE_DB.MY_SNOWFLAKE_SCHEMA.my_edges", |
  |     "NAME": "my_graph",                                            |
  |     "RAI_DATABASE": "my_rai_db"                                    |
  |   },                                                               |
  |   {                                                                |
  |     "DIRECTED": true,                                              |
  |     "EDGE_STREAM": "MY_SNOWFLAKE_DB.MY_SNOWFLAKE_SCHEMA.my_edges", |
  |     "NAME": "my_directed_graph",                                   |
  |     "RAI_DATABASE": "my_rai_db"                                    |
  |   }                                                                |
  | ]                                                                  |
  +--------------------------------------------------------------------+ */

The output is an array of JSON objects.

Updating Graphs

After a graph has been created, you can alter some of its properties. For example, you can:

In each case, you will use the update_graph procedure. This procedure removes the existing graph and replaces it with a new graph with the same properties as the old graph except for the requested changes.

It’s not possible to change a graph’s name.

The update_graph procedure is not meant to be used to add or remove edges from a graph. This is achieved by adding or removing rows in the Snowflake source table of the graph’s edge data stream. These changes are automatically synced between Snowflake and RAI and do not require a manual update.

Updating a graph requires a RAI engine. You can either pass the engine name to the update_graph procedure’s rai_engine argument or set the engine in a RAI context.

Change the Edge Data Stream

You may change the data stream that is associated with a graph by passing the name of the new data stream to the edge_stream_name argument.

-- Update the graph 'my_graph' to use the data stream 'my_data_stream'.
-- Note that update_graph requires a RAI engine.
CALL RAI.update_graph(
  'my_graph', {'rai_engine': 'my_engine', 'edge_stream_name': 'my_data_stream'}
);
/*+-------------------------------------------------------------------------+
  | {                                                                       |
  |     "message": "Graph 'my_graph' created in RAI database 'my_rai_db'.", |
  |     "success": true                                                     |
  | }                                                                       |
  +-------------------------------------------------------------------------+ */

The output message indicates that the new, updated graph has been created in the RAI database my_rai_db. This is the same RAI database that the original graph was stored in and may not be changed.

Switch a Graph From Undirected to Directed

To change an undirected graph to a directed graph, set the the update_graph prodecure’s directed argument to true:

-- Update the graph 'my_graph' to be a directed graph.
-- Note that update_graph requires a RAI engine.
CALL RAI.update_graph(
  'my_graph', {'rai_engine': 'my_engine', 'directed': 'true'}
);
/*+-------------------------------------------------------------------------+
  | {                                                                       |
  |     "message": "Graph 'my_graph' created in RAI database 'my_rai_db'.", |
  |     "success": true                                                     |
  | }                                                                       |
  +-------------------------------------------------------------------------+ */

You can change multiple properties of the graph simultaneously by providing values to both the directed and edge_stream_name arguments.

Deleting Graphs

You can delete a graph by passing the graph’s name to the delete_graph procedure. Deleting a graph requires a RAI engine. You can either pass the engine name to the delete_graph procedure’s rai_engine argument or set the engine in a RAI context:

-- Delete 'my_graph'. Note that delete_graph requires a RAI engine.
CALL RAI.delete_graph('my_graph', {'rai_engine': 'my_rai_engine'});
/*+---------------------------------------------------------------------------+
  | {                                                                         |
  |     "message": "Graph 'my_graph' removed from RAI database 'my_rai_db'.", |
  |     "success": true                                                       |
  | }                                                                         |
  +---------------------------------------------------------------------------+ */

You should delete every graph associated with a data stream before deleting the stream.

See Also

See Working With Algorithms to learn more about running graph analytics workloads. See the Overview of Graph Algorithms to explore every algorithm implemented in RelationalAI’s SQL Graph Library for Snowflake.

Was this doc helpful?