Skip to content

SQL Library Reference for Snowflake: Functions

Reference for the SQL functions that come with the RAI Integration Services for Snowflake.

adamic_adar

adamic_adar(graph_name, arguments)

Compute the Adamic-Adar index of pairs of nodes in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
tuplesArray (opens in a new tab)Yes, but not together with node1.An array of arrays representing pairs of nodes.
node1Int (opens in a new tab) / Array (opens in a new tab)Yes, but not together with tuples.A node, or an array of nodes.
node2Int (opens in a new tab) / Array (opens in a new tab)No.A node, or an array of nodes.
rai_engineVarchar (opens in a new tab)Yes, except if the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing pairs of nodes and their Adamic-Adar index.TABLE(node1 INT, node2 INT, score FLOAT)

Explanation

The Adamic-Adar index measures the similarity of two nodes based on the number of shared edges between them. The adamic_adar function returns a table with three columns — node1, node2, and score — whose rows contain pairs of nodes and their Adamic-Adar index value. If no graph with the provided name exists, an error is returned.

Higher scores indicate greater similarity, and rows where the score is zero are omitted from the results. A score of zero indicates that two nodes are neither similar nor dissimilar. Excluding zeros from the results improves performance and prevents the need to remove those rows in a post-processing step.

A RAI engine is required to execute the adamic_adar function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. One of the node1 and tuples arguments are required, and they may not be used together.

There are several ways to use the adamic_adar function:

  • Compute the Adamic-Adar index of two nodes, or sets of nodes, by passing node IDs, or arrays of node IDs, to the node1 and node2 arguments. You may compute the Adamic-Adar index for multiple pairs of nodes by passing an array of pairs of node IDs to the tuples argument.
  • Compute the Adamic-Adar index of a node, or set of nodes, and every other node in the graph, by passing a node ID, or array of node IDs, to the node1 argument.

Nodes that do not exist in the graph are ignored. The next section contains concrete examples.

Examples

Compute the Adamic-Adar index of two nodes in a graph using the node1 and node2 arguments:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the Adamic-Adar index of nodes 1 and 2
-- using the `'node1'` and `'node2'` arguments.
SELECT * FROM TABLE(RAI.adamic_adar('my_graph', {'node1': 1, 'node2': 2}));
/*+------------------------------------+
  | NODE1 | NODE2 | SCORE              |
  +-------+-------+--------------------+
  | 1     | 2     | 1.4426950408889634 |
  +-------+-------+--------------------+  */
 
-- Compute the Adamic-Adar index of nodes 1 and 2 and of nodes 1 and 3
-- using the `'tuples'` argument.
SELECT * FROM TABLE(RAI.adamic_adar('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+------------------------------------+
  | NODE1 | NODE2 | SCORE              |
  +-------+-------+--------------------+
  | 1     | 2     | 1.4426950408889634 |
  | 1     | 3     | 1.4426950408889634 |
  +-------+-------+--------------------+  */

Compute the Adamic-Adar index of a given node and every node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the Adamic-Adar index of node 1 and every node in `'my_graph'`.
SELECT * FROM TABLE(RAI.adamic_adar('my_graph', {'node1': 1}));
/*+------------------------------------+
  | NODE1 | NODE2 | SCORE              |
  +-------+-------+--------------------+
  | 1     | 1     | 2.8853900817779268 |
  | 1     | 2     | 1.4426950408889634 |
  | 1     | 3     | 1.4426950408889634 |
  +-------+-------+--------------------+  */

Compute the Adamic-Adar index of a single node and a set of nodes using the node1 and node2 arguments:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the Adamic-Adar index of node 1 and nodes 2 and 3.
SELECT * FROM TABLE(RAI.adamic_adar('my_graph', {'node1': 1, 'node2': [2, 3]}));
/*+------------------------------------+
  | NODE1 | NODE2 | SCORE              |
  +-------+-------+--------------------+
  | 1     | 2     | 1.4426950408889634 |
  | 1     | 3     | 1.4426950408889634 |
  +-------+-------+--------------------+  */

Compute the Adamic-Adar index of every pair of nodes from two sets of nodes:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the Adamic-Adar index between two sets of nodes.
-- The result contains rows for each pair in the Cartesian product of the sets of nodes.
-- Note that the row with 3 in the NODE1 column and 2 in the NODE2 column is missing
-- because the Adamic-Adar index is zero.
SELECT * FROM TABLE(RAI.adamic_adar('my_graph', {'node1': [1, 3], 'node2': [2, 3]}));
/*+-----------------------------+
  | NODE1 | NODE2 | SCORE       |
  +-------+-------+-------------+
  | 1     | 2     | 1.442695041 |
  | 1     | 3     | 1.442695041 |
  | 3     | 3     | 1.442695041 |
  +-------+-------+-------------+  */

Compute the Adamic-Adar index of a given node and each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the Adamic-Adar index of node 1 and every other node in `'my_graph'`
-- using the RAI engine `'my_other_rai_engine'` and store the results
-- in the Snowflake table `my_result_table`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.adamic_adar(
    'my_graph',
    {
        'node1': 1,
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+----------------------------------+
  | COL1 | COL2 | COL3               |
  +------+------+--------------------+
  | 1    | 1    | 2.8853900817779268 |
  | 1    | 2    | 1.4426950408889634 |
  | 1    | 3    | 1.4426950408889634 |
  +------+------+--------------------+  */

See Also

jaccard_similarity, cosine_similarity, preferential_attachment, and common_neighbor.

average_degree

average_degree(graph_name)
average_degree(graph_name, arguments)

Return the average degree of a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescription
Float (opens in a new tab)The average degree of the graph.

Explanation

The function average_degree('my_graph') returns the average degree over all degrees of nodes in my_graph. If no graph with the provided name exists, an error is returned.

Note that in directed graphs, the degree of a node is the sum of its indegree and outdegree.

The average_degree function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Get the average degree of an undirected graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the average degree of `'my_graph'`.
SELECT RAI.average_degree('my_graph');
/*+-----+
  | 1.5 |
  +-----+  */

Get the average degree of a graph using a different RAI engine than the engine set in the RAI context, and store the result in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the average degree of `'my_graph'` using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `'my_result_table'`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT RAI.max_degree(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
);
 
-- Query the results.
SELECT * FROM my_result_table;
/*+------+
  | COL1 |
  +------+
  | 1.5  |
  +------+  */

See Also

degree, max_degree, min_degree, and degree_histogram.

betweenness_centrality

betweenness_centrality(graph_name)
betweenness_centrality(graph_name, arguments)

Measure a node’s importance in a graph based on how many shortest paths go through it.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

You can specify the following arguments in the arguments object:

Argument NameTypeRequiredDescription
nodeInt (opens in a new tab) / Array (opens in a new tab)NoA node ID, or array of node IDs.
rai_engineVarchar (opens in a new tab)Yes, unless set in a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing pairs of nodes and their betweenness centrality value.TABLE(node INT, value FLOAT)

Explanation

Betweenness centrality measures how important a node is based on how many times that node appears in a shortest path between any two nodes in the graph. Nodes with high betweenness centrality represent bridges between different parts of the graph. For example, in a network representing airports and the flights between them, nodes with high betweenness centrality may identify “hub” airports that connect flights to different regions.

Calculating betweenness centrality involves computing all of the shortest paths between every pair of nodes in a graph and can be expensive to calculate exactly. The betweenness_centrality function gives an approximation using the Brandes algorithm (opens in a new tab) with source nodes drawn from a sample of 100 nodes with the highest degrees. Values are non-negative and are not normalized.

The betweenness_centrality function returns a table with two columns — node and value — whose rows represent pairs of nodes in my_graph and their approximate betweenness centrality value. If no graph with the provided name exists, an error is returned.

You may get the betweenness centrality of a single node by providing the node ID to the node argument. If the specified node does not exist in the graph, an empty table is returned.

The betweenness_centrality function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Compute the betweenness centrality of every node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 2), (2, 3), (3, 3), (2, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the betweenness centrality of each node in 'my_graph'.
SELECT * FROM TABLE(RAI.betweenness_centrality('my_graph'));
/*+--------------+
  | NODE | VALUE |
  +------+-------+
  | 1    | 0.0   |
  | 2    | 1.0   |
  | 3    | 0.0   |
  | 4    | 0.0   |
  +------+-------+  */
 
-- Compute the betweenness centrality of each node in 'my_graph'
-- using the RAI engine 'my_other_rai_engine' and store the results
-- in the Snowflake table `my_result_table`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.betweenness_centrality(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+-------------+
  | COL1 | COL2 |
  +------+------+
  | 1    | 0.5  |
  | 2    | 1    |
  | 3    | 0.5  |
  +------+------+  */

Compute the betweenness centrality of specific nodes:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 2), (2, 3), (3, 3), (2, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the betweenness centrality of node 2.
SELECT * FROM TABLE(RAI.betweenness_centrality('my_graph', {'node': 2}));
/*+--------------+
  | NODE | VALUE |
  +------+-------+
  | 2    | 1.0   |
  +------+-------+  */
 
-- Compute the betweenness centrality of nodes 2 and 3.
SELECT * FROM TABLE(RAI.betweenness_centrality('my_graph', {'node': [2, 3]}));
/*+--------------+
  | NODE | VALUE |
  +------+-------+
  | 2    | 1.0   |
  | 3    | 0.0   |
  +------+-------+  */

See Also

degree_centrality, eigenvector_centrality, and pagerank.

common_neighbor

common_neighbor(graph_name, arguments)

Find common neighbors of nodes in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
tuplesArray (opens in a new tab)Yes, but not together with node1.An array of arrays representing pairs of nodes.
node1Int (opens in a new tab) / Array (opens in a new tab)Yes, but not together with tuples.A node, or an array of nodes.
node2Int (opens in a new tab) / Array (opens in a new tab)No.A node, or an array of nodes.
node3Int (opens in a new tab) / Array (opens in a new tab)No.A node, or an array of nodes.
rai_engineVarchar (opens in a new tab)Yes, except if the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table whose rows represent triples of nodes in which the third node is a common neighbor of the first two nodes.TABLE(node1 INT, node2 INT, node3 INT)

Explanation

The common_neighbor function returns a table with three columns — node1, node2, and node3 — whose rows indicate that node3 is a common neighbor of node1 and node2. If no graph with the provided name exists, an error is returned.

A RAI engine is required to execute the common_neighbor function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. One of the node1 and tuples arguments are required, and they may not be used together.

There are several ways to use the common_neighbor function:

  • Find the common neighbors of two nodes, or sets of nodes, by passing node IDs, or arrays of node IDs, to the node1 and node2 arguments. You may find the common neighbors of multiple pairs of nodes by passing an array of pairs of node IDs to the tuples argument.
  • Find the common neighbors of a node, or set of nodes, and every other node in the graph, by passing a single node, or an array of nodes, to the node1 argument.
  • Check if a node is a common neighbor of two other nodes by passing each node ID to the node1, node2, and node3 arguments. If node3 is a common neighbor of node1 and node2, then the function returns a row in the result table containing node1, node2, and node3. Otherwise, an empty table is returned. You may check multiple triples simultaneously by passing an array of triples of node IDs to the tuples argument.

Nodes that do not exist in the graph are ignored. The next section contains concrete examples.

Examples

Find the common neighbors of two nodes:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 3), (1, 4), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find the common neighbors of nodes 1 and 2 using the `'node1'` and `'node2'` arguments.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'node1': 1, 'node2': 2}));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1    | 2      | 3     |
  +------+--------+-------+  */
 
-- Find the common neighbors of nodes 1 and 2 and nodes 1 and 3 using the `'tuples'` argument.
-- No row is returned with 1 in the NODE1 column and 3 in the NODE2 column
-- since node 1 and node 3 have no common neighbors.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1    | 2      | 3     |
  +------+--------+-------+  */

Find the common neighbor of a given node and every node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 3), (1, 4), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find the common neighbors of node 1 and every node in `'my_graph'`.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'node1': 1}));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1    | 1      | 3     |
  | 1    | 1      | 4     |
  | 1    | 2      | 3     |
  +------+--------+-------+  */

Check whether or not a node is a common neighbor of two other nodes:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 3), (1, 4), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Check whether or not node 3 is a common neighbor of nodes 1 and 2.
-- The output is non-empty because 3 is a common neighbor.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'node1': 1, 'node2': 2, 'node3': 3}));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1    | 2      | 3     |
  +------+--------+-------+  */
 
-- Check whether or not node 4 is a common neighbor of nodes 1 and 2.
-- The output is empty because node 4 is not a common neighbor.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'node1': 1, 'node2': 2, 'node3': 4}));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | Empty table.          |
  +------+--------+-------+  */
 
-- Check multiple triples simultaneously using the `'tuples'` argument.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'tuples': [[1, 2, 3], [1, 2, 4]]}))
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1    | 2      | 3     |
  +------+--------+-------+  */

Find the common neighbor of a given node and every other node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find the common neighbors of node 1 and every other node in `'my_graph'`
-- using the RAI engine `'my_other_rai_engine'` and store the results
-- in the Snowflake table `'my_result_table'`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.common_neighbor(
    'my_graph',
    {
        'node1': 1,
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1    | 1      | 3     |
  | 1    | 1      | 4     |
  | 1    | 2      | 3     |
  +------+--------+-------+  */

See Also

jaccard_similarity, cosine_similarity, preferential_attachment, and cosine_similarity.

cosine_similarity

cosine_similarity(graph_name, arguments)

Compute the cosine similarity of pairs of nodes in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
tuplesArray (opens in a new tab)Yes, but not together with node1.An array of arrays representing pairs of nodes.
node1Int (opens in a new tab) / Array (opens in a new tab)Yes, but not together with tuples.A node, or an array of nodes.
node2Int (opens in a new tab) / Array (opens in a new tab)No.A node, or an array of nodes.
rai_engineVarchar (opens in a new tab)Yes, except if the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing pairs of nodes and their cosine similarity value.TABLE(node1 INT, node2 INT, score FLOAT)

Explanation

Cosine similarity measures the similarity of two nodes as the inner product of vector representations of the nodes’ neighborhoods. The cosine_similarity function returns a table with three columns — node1, node2, and score — whose rows contain pairs of nodes and their cosine similarity value. If no graph with the provided name exists, an error is returned.

Cosine similarity values range from -1 to 1, inclusive, but rows where score is zero are omitted from the results. A score of zero indicates that two nodes are neither similar nor dissimilar. Excluding zeros improves performance and spares you from having to remove those rows in a post-processing step.

A RAI engine is required to execute the cosine_similarity function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. One of the node1 and tuples arguments are required, and they may not be used together.

There are several ways to use the cosine_similarity function:

  • Compute the cosine similarity of two nodes, or sets of nodes, by passing node IDs, or arrays of node IDs, to the node1 and node2 arguments. You may compute the cosine similarity for multiple pairs of nodes by passing an array of pairs of node IDs to the tuples argument.
  • Compute the cosine similarity of a node, or set of nodes, and every other node in the graph, by passing a node ID, or array of node IDs, to the node1 argument.

Nodes that do not exist in the graph are ignored. The next section contains concrete examples.

Examples

Compute the cosine similarity of two nodes:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the cosine similarity of nodes 1 and 2
-- using the `'node1'` and `'node2'` arguments.
SELECT * FROM TABLE(RAI.cosine_similarity('my_graph', {'node1': 1, 'node2': 2}));
/*+-----------------------+
  | NODE1 | NODE2 | SCORE |
  +-------+-------+-------+
  | 1     | 2     | 0.5   |
  +-------+-------+-------+  */
 
-- Compute the cosine similarity of nodes 1 and 2 and nodes 1 and 3
-- using the `'tuples'` argument.
SELECT * FROM TABLE(RAI.cosine_similarity('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+------------------------------------+
  | NODE1 | NODE2 | SCORE              |
  +-------+-------+--------------------+
  | 1     | 2     | 0.5                |
  | 1     | 3     | 0.7071067811865475 |
  +-------+-------+--------------------+  */

Compute the cosine similarity of a node and every node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the cosine similarity of node 1 and every node in `'my_graph'`.
SELECT * FROM TABLE(RAI.cosine_similarity('my_graph', {'node1': 1}));
/*+------------------------------------+
  | NODE1 | NODE2 | SCORE              |
  +-------+-------+--------------------+
  | 1     | 1     | 1.0                |
  | 1     | 2     | 0.5                |
  | 1     | 3     | 0.7071067811865475 |
  +-------+-------+--------------------+  */

Compute the cosine similarity of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the cosine similarity of node 1 and every other node in `'my_graph'`
-- using the RAI engine `'my_other_rai_engine'` and store the results
-- in the Snowflake table `'my_result_table'`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.cosine_similarity(
    'my_graph',
    {
        'node1': 1,
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+----------------------------------+
  | COL1 | COL2 | COL3               |
  +------+------+--------------------+
  | 1    | 1    | 1.0                |
  | 1    | 2    | 0.5                |
  | 1    | 3    | 0.7071067811865475 |
  +------+------+--------------------+  */

See Also

jaccard_similarity, preferential_attachment, adamic_adar, and common_neighbor.

create_rai_database

create_rai_database(rai_db)

Create a database in RAI.

Parameters

ParameterTypeDescription
rai_dbVarchar (opens in a new tab) RAI database name.

Output

TypeDescription
Variant (opens in a new tab) Status of the creation request. Example:“ok”.

Explanation

The function create_rai_database creates a database in RAI.

Examples

SELECT RAI.create_rai_database('rai_db');

See Also

get_rai_database and delete_rai_database.

create_rai_engine

create_rai_engine(rai_engine, size)

Create an engine in RAI.

Parameters

ParameterTypeDescription
rai_engine Varchar (opens in a new tab) RAI engine name.
sizeVarchar (opens in a new tab) RAI engine size. Possible values are 'XS', 'S', 'M', 'L', 'XL'.

Output

TypeDescription
Variant (opens in a new tab) Status of the create engine request. Example:“ok”.

Explanation

The function create_rai_engine creates an engine in RAI.

Examples

SELECT RAI.create_rai_engine('rai_engine', 'S');

See Also

get_rai_engine and delete_rai_engine.

current_rai_database

current_rai_database()

Return the RAI database that is currently selected.

Output

TypeDescription
Varchar (opens in a new tab) The name of the currently selected RAI database.

Explanation

The function current_rai_database returns the currently selected RAI database. The database to use is typically set by calling the use_rai_database procedure. If no database has been previously selected, the function returns NULL.

Examples

SELECT RAI.current_rai_database();
 
/*+----------------------------+
  | RAI.CURRENT_RAI_DATABASE() |
  +----------------------------+
  | rai_db                     |
  +----------------------------+  */

See Also

use_rai_database.

current_rai_engine

current_rai_engine()

Return the RAI engine that is currently selected.

Output

TypeDescription
Varchar (opens in a new tab) The name of the currently selected RAI engine.

Explanation

The function current_rai_engine returns the currently selected RAI engine. The engine to use is typically set by calling the use_rai_engine procedure. If no engine has been previously selected, the function returns NULL.

Examples

SELECT RAI.current_rai_engine();
 
/*+--------------------------+
  | RAI.CURRENT_RAI_ENGINE() |
  +--------------------------+
  | rai_engine               |
  +--------------------------+  */

See Also

use_rai_engine.

degree

degree(graph_name)
degree(graph_name, arguments)

Return the degree of each node in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
nodeInt (opens in a new tab) / Array (opens in a new tab)NoInstruct degree to return only the rows in the output table whose node column contains the provided value(s).
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table representing pairs of neighbors in the graph.TABLE(node1 INT, node2 INT)

Explanation

The function degree('my_graph') returns a table with two columns, node and degree, whose rows represent the degrees of nodes in my_graph. If no graph with the provided name exists, an error is returned.

Note that in directed graphs, the degree of a node is the sum of its indegree and outdegree.

The degree function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Get the degree of each node in an undirected graph with the RAI engine set in a RAI context:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the degrees of each node in `'my_graph'`.
SELECT * FROM TABLE(RAI.degree('my_graph'));
/*+---------------+
  | NODE | DEGREE |
  +------+--------+
  | 1    | 2      |
  | 2    | 1      |
  +------+--------+  */

Get the degree of each node in a directed graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create a directed graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges', {'directed': true});
 
-- Get the degrees of each node in `'my_graph'`.
SELECT * FROM TABLE(RAI.degree('my_graph'));
/*+---------------+
  | NODE | DEGREE |
  +------+--------+
  | 1    | 3      |
  | 2    | 1      |
  +------+--------+  */

Get the degree of a specific node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the degree of node 2 in `'my_graph'`.
SELECT * FROM TABLE(RAI.degree('my_graph', {'node': 2}));
/*+---------------+
  | NODE | DEGREE |
  +------+--------+
  | 2    | 2      |
  +------+--------+  */

Get the degree of multiple nodes simultaneously:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the degree of nodes 1 and 2 in `'my_graph'`.
SELECT * FROM TABLE(RAI.degree('my_graph', {'node': [1, 2]}));
/*+---------------+
  | NODE | DEGREE |
  +------+--------+
  | 1    | 1      |
  | 2    | 2      |
  +------+--------+  */

Get the degree of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the number of edges in `'my_graph'` using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT TABLE(RAI.degree(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+-------------+
  | COL1 | COL2 |
  +------+------+
  | 1    | 2    |
  | 2    | 1    |
  +------+------+  */

See Also

min_degree, max_degree, average_degree, degree_histogram, and neighbor.

degree_centrality

degree_centrality(graph_name)
degree_centrality(graph_name, arguments)

Compute the degree centrality of nodes in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
nodeInt (opens in a new tab) / Array (opens in a new tab)NoInstruct degree_centrality to return only the rows in the output table whose node column contains the provided value.
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing pairs of nodes and their degree centrality value.TABLE(node INT, value FLOAT)

Explanation

Degree centrality is a measure of the centrality, or importance, of a node in a graph based on its degree. Degree centrality is computed as the degree of a node divided by the number of nodes in the graph minus one. For simple graphs without loops this value is at most one. Graphs with loops may have nodes with degree centrality greater than one.

The degree_centrality('my_graph') function returns a table with two columns — node and value — whose rows represent pairs of nodes in my_graph and their degree centrality value. If no graph with the provided name exists, an error is returned.

You may get the degree centrality of a single node by providing the node ID to the node argument. If the specified node does not exist in the graph, an empty table is returned.

The degree_centrality function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Compute the degree centrality of each node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the degree centrality of each node in `'my_graph'`.
SELECT * FROM TABLE(RAI.degree_centrality('my_graph'));
/*+--------------+
  | NODE | VALUE |
  +------+-------+
  | 1    | 0.5   |
  | 2    | 1     |
  | 3    | 0.5   |
  +------+-------+  */

Compute the degree centrality of a single node:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the degree centrality of node 2.
SELECT * FROM TABLE(RAI.degree_centrality('my_graph', {'node': 2}));
/*+--------------+
  | NODE | VALUE |
  +------+-------+
  | 2    | 1     |
  +------+-------+  */

Compute the degree centrality of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the degree centrality of each node in `'my_graph'`
-- using the RAI engine `'my_other_rai_engine'` and store the results
-- in the Snowflake table `my_result_table`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.degree_centrality(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+-------------+
  | COL1 | COL2 |
  +------+------+
  | 1    | 0.5  |
  | 2    | 1    |
  | 3    | 0.5  |
  +------+------+  */

See Also

betweenness_centrality, eigenvector_centrality, and pagerank.

degree_histogram

degree_histogram(graph_name)
degree_histogram(graph_name, arguments)

Count the number of nodes with each degree in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
degreeInt (opens in a new tab) / Array (opens in a new tab)NoInstruct degree_histogram to return only the rows in the output table whose degree column contains the provided value.
countInt (opens in a new tab) / Array (opens in a new tab)NoInstruct degree_histogram to return only the rows in the output table whose count column contains the provided value.
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table representing the number of nodes with each degree.TABLE(node1 INT, node2 INT)

Explanation

The function degree_histogram('my_graph') returns a table with two columns, degree and count, whose rows represent the number of nodes with each degree in the graph. If no graph with the provided name exists, an error is returned.

Note that in directed graphs, the degree of a node is the sum of its indegree and outdegree.

The degree_histogram function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Count the number of nodes with each degree in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Count the number of nodes with each degree in `'my_graph'`.
SELECT * FROM TABLE(RAI.degree_histogram('my_graph'));
/*+----------------+
  | DEGREE | COUNT |
  +--------+-------+
  | 1      | 2     |
  | 2      | 1     |
  +--------+-------+  */

Count the number of nodes in a graph with a specific degree:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Count the number of nodes with degree 2 in `'my_graph'`.
SELECT * FROM TABLE(RAI.degree_histogram('my_graph', {'degree': 2}));
/*+----------------+
  | DEGREE | COUNT |
  +--------+-------+
  | 2      | 1     |
  +--------+-------+  */

Find all degrees for which there is a specific number of nodes:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Count the number of nodes with degree 2 in `'my_graph'`.
SELECT * FROM TABLE(RAI.degree_histogram('my_graph', {'count': 2}));
/*+----------------+
  | DEGREE | COUNT |
  +--------+-------+
  | 1      | 2     |
  +--------+-------+  */

Count the number of nodes in a graph with each degree using a different RAI engine than the one set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Count the number of nodes with each degree in `'my_graph'`
-- using the RAI engine `'my_other_rai_engine'` and store the results
-- in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT TABLE(RAI.degree_histogram(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+-------------+
  | COL1 | COL2 |
  +------+------+
  | 1    | 2    |
  | 2    | 1    |
  +------+------+  */

See Also

min_degree, max_degree, average_degree, and degree.

delete_rai_database

delete_rai_database(rai_db)

Delete a database in RAI.

Parameters

ParameterTypeDescription
rai_dbVarchar (opens in a new tab) RAI database name.

Output

TypeDescription
Variant (opens in a new tab) Status of the deletion request. Example:“ok”.

Explanation

The function delete_rai_database deletes a database in RAI.

Examples

SELECT RAI.delete_rai_database('rai_db');

See Also

create_rai_database.

delete_rai_engine

delete_rai_engine(rai_engine)

Delete an engine in RAI.

Parameters

ParameterTypeDescription
rai_engine Varchar (opens in a new tab) RAI engine name.

Output

TypeDescription
Variant (opens in a new tab) Status of the deletion request. Example:“ok”.

Explanation

The function delete_rai_engine deletes an engine in RAI.

Examples

SELECT RAI.delete_rai_engine('rai_engine');

See Also

create_rai_engine.

diameter_range

diameter_range(graph_name)
diameter_range(graph_name, arguments)

Estimate the diameter of a graph with lower and upper bounds.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescription
Object (opens in a new tab)A JSON object containing the lower and upper bounds for the diameter of the graph.
{"min": 1, "max": 10}

Explanation

The diameter_range function is used to determine the range of possible diameter values for a graph. If no graph with the provided name exists, an error is returned.

The graph’s diameter is estimated by selecting a number of random nodes in the graph and taking the maximum of all shortest path lengths from each selected node to the other nodes in the graph. This gives a range per node. Then, the intersection of the ranges is taken and the final range is returned.

The diameter_range function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Estimate the diameter of a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Estimate the diameter of `'my_graph'`.
SELECT RAI.diameter_range('my_graph');
/*+-----------------------+
  | { "max": 2, "min": 2} |
  +-----------------------+  */

Estimate the diameter of a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Estimate the diameter of `'my_graph'` using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT RAI.diameter_range(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
);
-- Query the results.
SELECT * FROM my_result_table;
/*+-----------------------+
  | COL1                  |
  +-----------------------+
  | { "max": 2, "min": 2} |
  +-----------------------+  */

See Also

shortest_path_length.

eigenvector_centrality

eigenvector_centrality(graph_name)
eigenvector_centrality(graph_name, arguments)

Compute the eigenvector centrality of nodes in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
nodeInt (opens in a new tab) / Array (opens in a new tab)NoInstruct eigenvector_centrality to return only the rows in the output table whose node column contains the provided value.
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing pairs of nodes and their eigenvector centrality value.TABLE(node INT, value FLOAT)

Explanation

Eigenvector centrality measures a node’s importance in such a way that connections to more important nodes contribute more to a node’s score than connections to less important nodes. It is computed based on the eigenvector associated with the top eigenvalue of the graph’s adjacency matrix.

We use the power method (opens in a new tab) to compute the eigenvector in our implementation. Note that the power method requires the adjacency matrix to be diagonalizable, and will only converge if the largest positive eigenvalue has multiplicity one. If your graph does not meet either of these requirements, the eigenvector_centrality function will not converge.

The eigenvector_centrality('my_graph') function returns a table with two columns — node and value — whose rows represent pairs of nodes in my_graph and their eigenvector centrality value. If no graph with the provided name exists, an error is returned.

You may get the eigenvector centrality of a single node by providing the node ID to the node argument. If the specified node does not exist in the graph, an empty table is returned.

The eigenvector_centrality function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Compute the eigenvector centrality of each node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the eigenvector centrality of each node in `'my_graph'`.
SELECT * FROM TABLE(RAI.eigenvector_centrality('my_graph'));
/*+---------------------+
  | NODE | VALUE        |
  +------+--------------+
  | 1    | 0.4082482905 |
  | 2    | 0.8164965809 |
  | 3    | 0.4082482905 |
  +------+--------------+  */

Compute the eigenvector centrality of a single node:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the eigenvector centrality of node 2.
SELECT * FROM TABLE(RAI.eigenvector_centrality('my_graph', {'node': 2}));
/*+---------------------+
  | NODE | VALUE        |
  +------+--------------+
  | 2    | 0.8164965809 |
  +------+--------------+  */

Compute the eigenvector centrality of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the eigenvector centrality of each node in `'my_graph'`
-- using the RAI engine `'my_other_rai_engine'` and store the results
-- in the Snowflake table `my_result_table`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.eigenvector_centrality(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+---------------------+
  | COL1 | COL2         |
  +------+--------------+
  | 1    | 0.4082482905 |
  | 2    | 0.8164965809 |
  | 3    | 0.4082482905 |
  +------+--------------+  */

See Also

betweenness_centrality, degree_centrality, and pagerank.

exec

exec(rai_query)
exec(rai_db, rai_engine, rai_query, data, readonly)

Execute a query against the RAI database.

Parameters

ParameterTypeDescription
rai_dbVarchar (opens in a new tab) RAI database name.
rai_engine Varchar (opens in a new tab) RAI engine name.
rai_queryVarchar (opens in a new tab) The query that will be executed in RAI.
dataVariant (opens in a new tab) Data input for the rai_query.
readonlyBoolean (opens in a new tab) Whether or not the query is read-only.

Output

TypeDescription
Variant (opens in a new tab) The query output information.
/*+-----------------------------------------------+
  | EXEC('DEF OUTPUT = {1; 2; 3}')                |
  +-----------------------------------------------+
  | [ [     1   ],   [     2   ],   [     3   ] ] |
  +-----------------------------------------------+ */

Explanation

The function exec executes in RAI the rai_query passed as a parameter. It uses the rai_db database and the rai_engine engine. The data parameter is used to provide input for the query. The readonly parameter specifies whether or not the rai_query is read-only. By default, readonly is set to TRUE.

Examples

SELECT RAI.exec('def output = {1; 2; 3}');
 
SELECT RAI.exec('rai_db', 'rai_engine', 'def output=foo', {'foo' : 'hello'}, TRUE);

See Also

exec_into and create_rai_database.

exec_into

exec_into(rai_query, sf_target)
exec_into(rai_db, rai_engine, sf_warehouse, sf_target, rai_query)
exec_into(rai_db, rai_engine, rai_query, data, readonly, sf_warehouse, sf_target)

Execute a query against the RAI database and insert the output into a target Snowflake table.

Parameters

ParameterTypeDescription
rai_dbVarchar (opens in a new tab) RAI database name.
rai_engine Varchar (opens in a new tab) RAI engine name.
sf_warehouse Varchar (opens in a new tab) The selected Snowflake warehouse.
sf_targetVarchar (opens in a new tab) Target Snowflake table that will contain the query output. Note that the table specified by sf_target needs to be fully qualified, for example, database.schema.table.
rai_queryVarchar (opens in a new tab) The query that will be executed in RAI.
dataVariant (opens in a new tab) Data input for the rai_query.
readonlyBoolean (opens in a new tab) Whether or not the query is read-only.

Output

TypeDescription
Variant (opens in a new tab) Status of the request. Example: “1 row(s) written”.

Explanation

The function exec_into executes the rai_query passed as a parameter in RAI. It uses the rai_db database and the rai_engine engine. The output of the query is stored in the Snowflake table sf_target. The Snowflake warehouse is specified through sf_warehouse.

Examples

CREATE OR REPLACE TABLE my_data(x INT);
 
SELECT RAI.exec_into('def output = {1; 2; 3}', 'my_db.rai.my_data');
 
CREATE OR REPLACE TABLE my_output(s VARCHAR);
 
SELECT RAI.exec_into(
    'rai_db',
    'rai_engine',
    'def output=foo',
    {'foo' : 'hello'},
    TRUE,
    'my_warehouse',
    'my_db.rai.my_output'
);

See Also

exec and create_rai_database.

get_rai_database

get_rai_database(rai_db)

Get information about a RAI database.

Parameters

ParameterTypeDescription
rai_dbVarchar (opens in a new tab) RAI database name.

Output

TypeDescription
Variant (opens in a new tab) RAI database information as a JSON object.
{
    "account_name": "******",
    "created_by": "******",
    "created_on": "2023-05-31T16:05:05.131Z",
    "id": "******",
    "name": "rai_db",
    "region": "us-east",
    "state": "CREATED"
}

Explanation

The function get_rai_database returns information about a RAI database.

Examples

SELECT RAI.get_rai_database('rai_db');

See Also

create_rai_database.

get_data_stream

get_data_stream(datasource)

Get the static information of a RAI data stream. A RAI data stream is identified by the fully qualified object name (opens in a new tab) of a SQL object, such as a table or view, in the form <database>.<schema>.<object>.

Parameters

ParameterTypeDescription
datasource Varchar (opens in a new tab) The fully qualified name of the Snowflake data source that identifies the data stream.

Output

TypeDescription
Variant (opens in a new tab) RAI data stream information as a JSON object.
{
    "account": "******",
    "createdBy": "******",
    "createdOn": "2023-05-31T11:24:09.710Z",
    "dbLink": "sf_db.rai",
    "id": "******",
    "integration": "****",
    "name": "sf_db.rai-sf_db.sf_schema.datasource",
    "rai": {
        "database": "rai_db",
        "relation": "rai_baserelation"
    },
    "snowflake": {
        "database": "sf_db",
        "object": "sf_db.sf_schema.datasource",
        "schema": "sf_schema"
    },
    "state": "CREATED"
}

Explanation

The function get_data_stream returns the static information of a RAI data stream. It includes the following fields:

JSON FieldDescription
accountRAI account name.
createdByClient who created the RAI data stream.
createdOnCreation time of the RAI data stream.
dbLinkRAI database link that manages this RAI data stream.
idInternal identifier of the RAI data stream.
integrationRAI integration that the RAI data stream is using.
nameIdentifying name of the RAI data stream.
rai:databaseRAI database that contains the target object of the RAI data stream.
rai:relationName of the target object, which is a base relation.
snowflake:databaseSnowflake database that contains the source object of the RAI data stream.
snowflake:objectFully qualified name of the Snowflake object.
snowflake:schemaSnowflake schema that holds the Snowflake object.
stateCreation state of the RAI data stream.

This information will not change over time, regardless of the RAI data stream’s state.

If the RAI data stream doesn’t exist, the function get_data_stream returns NULL.

To retrieve the current status of a RAI data stream and verify whether the data are synchronized, use get_data_stream_status.

Examples

This is how to retrieve the static information of the RAI data stream that originates from sf_db.sf_schema.sf_table:

SELECT RAI.get_data_stream('sf_db.sf_schema.sf_table');

The query output will be a Snowflake JSON object as described above.

See Also

get_data_stream_status and create_data_stream.

get_rai_engine

get_rai_engine(rai_engine)

Get information about a RAI engine.

Parameters

ParameterTypeDescription
rai_engine Varchar (opens in a new tab) RAI engine name.

Output

TypeDescription
Variant (opens in a new tab) RAI engine information as a JSON object.
{
    "account_name": "******",
    "created_by": "******",
    "created_on": "2023-05-25T23:04:29.000Z",
    "id": "******",
    "name": "rai_engine",
    "region": "us-east",
    "size": "S",
    "state": "PROVISIONED"
}

Explanation

The function get_rai_engine returns information about a RAI engine.

Examples

SELECT RAI.get_rai_engine('rai_engine');

See Also

create_rai_engine.

get_graph

get_graph(graph_name)

Get information about a graph, if it exists.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab) The name of the graph.

Output

TypeDescription
Variant (opens in a new tab) Graph information as a JSON object.
{
    "DIRECTED": false,
    "EDGE_STREAM": "my_db.my_schema.my_edge_stream",
    "NAME": "my_graph",
    "RAI_DATABASE": "my_rai_database"
}

Explanation

The function get_graph returns a JSON object with information about the graph, including the name of the graph, whether or not the graph is directed, the fully qualified name of the graph’s edge stream, and the name of the RAI database in which the graph is stored.

Examples

SELECT RAI.get_graph('my_graph');

See Also

is_graph_created and list_graphs.

is_connected

is_connected(graph_name)
is_connected(graph_name, arguments)

Check whether or not a graph is connected.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab) The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescription
Boolean (opens in a new tab) Whether or not the graph is connected.

Explanation

Returns true if the graph is connected and false otherwise. If no graph with the provided name exists, an error is returned.

The is_connected function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Determine whether or not `'my_graph'` is connected.
SELECT RAI.is_connected('my_graph');
/*+------+
  | TRUE |
  +------+  */

See Also

weakly_connected_component

is_graph_created

is_graph_created(graph_name)

Check whether a graph called graph_name exists in the RAI Integration.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab) The name of the graph.

Output

TypeDescription
Boolean (opens in a new tab) Whether or not the graph exists.

Explanation

Returns true if a graph called graph_name exists and false if it does not.

Examples

SELECT RAI.is_graph_created('my_graph');

See Also

create_graph, get_graph, and list_graphs.

jaccard_similarity

jaccard_similarity(graph_name, arguments)

Compute the Jaccard similarity of pairs of nodes in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
tuplesArray (opens in a new tab)Yes, but not together with node1.An array of arrays representing pairs of nodes.
node1Int (opens in a new tab) / Array (opens in a new tab)Yes, but not together with tuples.A node, or an array of nodes.
node2Int (opens in a new tab) / Array (opens in a new tab)No.A node, or an array of nodes.
rai_engineVarchar (opens in a new tab)Yes, except if the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing pairs of nodes and their Jaccard similarity value.TABLE(node1 INT, node2 INT, score FLOAT)

Explanation

Jaccard similarity measures the similarity of two nodes based on the number of neighbors common to both nodes. The jaccard_similarity function returns a table with three columns — node1, node2, and score — whose rows contain pairs of nodes and their Jaccard similarity value. If no graph with the provided name exists, an error is returned.

Jaccard similarity values range from 0 to 1, inclusive, but rows where score is zero are omitted from the results. A score of zero indicates that two nodes are incomparable. Excluding zeros improves performance and spares you from having to remove those rows in a post-processing step.

A RAI engine is required to execute the jaccard_similarity function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. One of the node1 and tuples arguments are required, and they may not be used together.

There are several ways to use the jaccard_similarity function:

  • Compute the Jaccard similarity of two nodes, or sets of nodes, by passing node IDs, or arrays of node IDs, to the node1 and node2 arguments. You may compute the Jaccard similarity for multiple pairs of nodes by passing an array of pairs of node IDs to the tuples argument.
  • Compute the Jaccard similarity of a node, or set of nodes, and every other node in the graph, by passing a node ID, or array of node IDs, to the node1 argument.

Nodes that do not exist in the graph are ignored. The next section contains concrete examples.

Examples

Compute the Jaccard similarity of two nodes in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the Jaccard similarity of nodes 1 and 2
-- using the `'node1'` and `'node2'` arguments.
SELECT * FROM TABLE(RAI.jaccard_similarity('my_graph', {'node1': 1, 'node2': 2}));
/*+------------------------------------+
  | NODE1 | NODE2 | SCORE              |
  +-------+-------+--------------------+
  | 1     | 2     | 0.3333333333333333 |
  +-------+-------+--------------------+  */
 
-- Compute the Jaccard similarity of nodes 1 and 2 and nodes 1 and 3
-- using the `'tuples'` argument.
SELECT * FROM TABLE(RAI.jaccard_similarity('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+------------------------------------+
  | NODE1 | NODE2 | SCORE              |
  +-------+-------+--------------------+
  | 1     | 2     | 0.3333333333333333 |
  | 1     | 3     | 0.5                |
  +-------+-------+--------------------+  */

Compute the Jaccard similarity between a given node and every other node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the Jaccard similarity of node 1 and every other node in `'my_graph'`.
SELECT * FROM TABLE(RAI.jaccard_similarity('my_graph', {'node1': 1}));
/*+------------------------------------+
  | NODE1 | NODE2 | SCORE              |
  +-------+-------+--------------------+
  | 1     | 1     | 1.0                |
  | 1     | 2     | 0.3333333333333333 |
  | 1     | 3     | 0.5                |
  +-------+-------+--------------------+  */

Compute the Jaccard similarity of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the Jaccard similarity of node 1 and every other node in `'my_graph'`
-- using the RAI engine `'my_other_rai_engine'` and store the results
-- in the Snowflake table `'my_result_table'`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.jaccard_similarity(
    'my_graph',
    {
        'node1': 1,
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+----------------------------------+
  | COL1 | COL2 | COL3               |
  +------+------+--------------------+
  | 1    | 1    | 1.0                |
  | 1    | 2    | 0.3333333333333333 |
  | 1    | 3    | 0.5                |
  +------+------+--------------------+  */

See Also

cosine_similarity, preferential_attachment, adamic_adar, and common_neighbor.

label_propagation

label_propagation(graph_name)
label_propagation(graph_name, arguments)

Find communities using the label propagation algorithm (opens in a new tab).

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

You can specify the following arguments in the arguments object:

Argument NameTypeRequiredDescription
nodeInt (opens in a new tab) / Array (opens in a new tab)No.A node ID, or array of node IDs.
communityInt (opens in a new tab) / Array (opens in a new tab)No.A community ID, or array of community IDs.
rai_engineVarchar (opens in a new tab)Yes, unless the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing node IDs and their community IDs.TABLE(node INT, community INT)

Explanation

The label propagation algorithm identifies communities in a graph through iterative steps. Nodes are initialized with unique lables and, at each iteration of the algorithm, nodes adopt the most frequently occuring label among their neighbors. In directed graphs, only outneighbors are considered. Ties in label frequency are broken deterministically. The process concludes when labels of the nodes no longer change or a preset maximum number of iterations is reached.

The label_propagation function returns a table with two columns — node and community — whose rows pair a node ID with its community ID. Community IDs are the maximum node ID contained in the community. If no graph with the provided name exists, an error is returned.

A RAI engine is required to execute the label_propagation function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument.

Examples

Compute community labels for each node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with six nodes and seven edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES
    (1, 2), (2, 3), (3, 1), (3, 4),
    (4, 5), (5, 6), (6, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute community labels using the label propagation algorithm.
SELECT * FROM TABLE(RAI.label_propagation('my_graph'));
/*+------------------+
  | NODE | COMMUNITY |
  +------+-----------+
  | 1    | 3         |
  | 2    | 3         |
  | 3    | 3         |
  | 4    | 6         |
  | 5    | 6         |
  | 6    | 6         |
  +------+-----------+  */
 
-- Compute community labels using the RAI engine 'my_other_rai_engine'
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT TABLE(RAI.label_propagation(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+------------------+
  | NODE | COMMUNITY |
  +------+-----------+
  | 1    | 3         |
  | 2    | 3         |
  | 3    | 3         |
  | 4    | 6         |
  | 5    | 6         |
  | 6    | 6         |
  +------+-----------+  */

Find the community label of specific nodes:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with six nodes and seven edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES
    (1, 2), (2, 3), (3, 1), (3, 4),
    (4, 5), (5, 6), (6, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the community label of node 2.
SELECT * FROM TABLE(RAI.label_propagation('my_graph', {'node': 2}));
/*+------------------+
  | NODE | COMMUNITY |
  +------+-----------+
  | 2    | 3         |
  +------+-----------+  */
 
-- Compute the community label of nodes 2 and 4.
SELECT * FROM TABLE(RAI.label_propagation('my_graph', {'node': [2, 4]}));
/*+------------------+
  | NODE | COMMUNITY |
  +------+-----------+
  | 2    | 3         |
  | 4    | 6         |
  +------+-----------+  */

Find all nodes with a given community ID:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with six nodes and seven edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES
    (1, 2), (2, 3), (3, 1), (3, 4),
    (4, 5), (5, 6), (6, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute all nodes with community label 3.
SELECT * FROM TABLE(RAI.label_propagation('my_graph', {'community': 3}));
/*+------------------+
  | NODE | COMMUNITY |
  +------+-----------+
  | 1    | 3         |
  | 2    | 3         |
  | 3    | 3         |
  +------+-----------+  */

See Also

triangle_community.

list_data_streams

list_data_streams()

List all created RAI data streams of the RAI database link. A RAI data stream is identified by the fully qualified name (opens in a new tab) of a SQL object, such as a table or view, in the form <database>.<schema>.<object>.

Output

TypeDescription
Variant (opens in a new tab) List of all RAI data streams as a JSON object.
[
    {
        "account": "******",
        "createdBy": "******",
        "createdOn": "2023-05-31T11:24:09.710Z",
        "dbLink": "sf_db.my_schema",
        "id": "******",
        "integration": "******",
        "name": "datasource",
        "rai": {
            "database": "rai_db",
            "relation": "rai_baserelation"
        },
        "snowflake": {
            "database": "sf_db",
            "object": "sf_db.my_schema.datasource",
            "schema": "my_schema"
        },
        "state": "CREATED"
    }
]

Explanation

The function list_data_streams belongs to a specific RAI database link located in a Snowflake schema, for example, snowflake_database.RAI, and returns a list of all active RAI data streams within this RAI database link. The list of RAI data streams is returned in JSON format.

Examples

SELECT RAI.list_data_streams();

See Also

get_data_stream and create_data_stream.

list_entities

list_entities()

Return a list of all available entities.

Output Table

TypeDescriptionSchema
Table (opens in a new tab)A table of entities represented by their name and key.TABLE(entity_name VARCHAR, entity_key ARRAY)

Examples

   SELECT * FROM TABLE(SF_INT.LIST_ENTITIES());
 
/* +-------------+-----------------+
   | ENTITY_NAME | ENTITY_KEY      |
   |-------------+-----------------|
   | Client      | [               |
   |             |   "first_name", |
   |             |   "last_name"   |
   |             | ]               |
   | Product     | [               |
   |             |   "name"        |
   |             | ]               |
   +-------------+-----------------+ */

See Also

create_entity.

list_graphs

list_graphs()

List all available graphs in the RAI Integration.

Output

TypeDescription
Array (opens in a new tab) List of all available graphs as a JSON object.
[
    {
        "DIRECTED": false,
        "EDGE_STREAM": "my_db.my_schema.my_edge_stream",
        "NAME": "my_graph",
        "RAI_DATABASE": "my_rai_db"
    }
]

Explanation

The function list_graphs returns a list in JSON format including all available graphs.

Examples

SELECT RAI.list_graphs();

See Also

is_graph_created and get_graph.

list_graph_algorithms

list_graph_algorithms()

List all available graph algorithms within the SQL Library for Snowflake.

Output

TypeDescription
Object (opens in a new tab) List of all available graph algorithms as a JSON object.
{
    "_documentation": "https://docs.relational.ai/preview/snowflake/graph-analytics",
    "basics": {
        "average_degree": {
        "description": "Returns the average degree of the graph."
        },
        "degree": {
        "description": "Finds the degrees of each node in the graph."
        },
        "degree_histogram": {
        "description": "Counts the number of nodes with each degree in the graph."
        },
        "max_degree": {
        "description": "Returns the maximum degree of the graph."
        },
        "min_degree": {
        "description": "Returns the minimum degree of the graph. In directed graphs, the degree of a node is the sum of its indegree and outdegree."
        },
        "neighbor": {
        "description": "Finds the neighbors of each node in the graph."
        },
        "num_edges": {
        "description": "Returns the number of edges in the graph."
        },
        ...
    }
}

Explanation

The function list_graph_algorithms returns a list in JSON format including all available graph algorithms within the SQL Library for Snowflake.

Examples

SELECT RAI.list_graph_algorithms();

See Also

list_graphs.

list_lookups

list_entities()

Return a list of all tables and entities currently covered by the lookup table.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table of entities represented by their name and key attributes.TABLE(entity_name VARCHAR, database_name VARCHAR, schema_name VARCHAR, table_name VARCHAR, entity_key_attrs ARRAY)

Examples

   SELECT * FROM TABLE(RAI.list_lookups());
 
/* +-------------+----------------+-------------+------------+------------------------+
   | ENTITY_NAME | DATABASE_NAME | SCHEMA_NAME | TABLE_NAME | ENTITY_KEY_ATTRS       |
   |-------------+----------------+-------------+------------+------------------------|
   | Client      | COMMERCE_DB    | ONLINE_SHOP | PURCHASE   | [                      |
   |             |                |             |            |   "CLIENT_FIRST_NAME", |
   |             |                |             |            |   "CLIENT_LAST_NAME"   |
   |             |                |             |            | ]                      |
   | Product     | COMMERCE_DB    | ONLINE_SHOP | PURCHASE   | [                      |
   |             |                |             |            |   "PRODUCT_NAME"       |
   |             |                |             |            | ]                      |
   +-------------+----------------+-------------+------------+------------------------+ */

See Also

create_lookup.

load_model

load_model(rai_db, rai_engine, model_name, model_path)

Create a model (aka Rel code) in the RAI database.

Parameters

ParameterTypeDescription
rai_db Varchar (opens in a new tab) RAI database name.
rai_engine Varchar (opens in a new tab) RAI engine name.
model_name Varchar (opens in a new tab) RAI model name.
model_path Varchar (opens in a new tab) The path to a file with Rel code. The path is typically a URI. For example: azure://account_name.blob.core.windows.net/container/model.rel. The model (i.e., Rel code) will be downloaded and imported into RAI.

Output

TypeDescription
Variant (opens in a new tab) Status of the loading model request. Example:“ok”.

Explanation

The function load_model imports the model from the specified Azure URI provided through the model_path parameter and stores it in the RAI database. The function load_model will download the Rel code (that is also called a model) from the specified path and import it into RAI.

Examples

SELECT RAI.load_model(
    'rai_db',
    'rai_engine',
    'my_model',
    'azure://<account_name>.blob.core.windows.net/container/model.rel'
);

See Also

load_model_code and load_model_query.

load_model_code

load_model(rai_db, rai_engine, model_name, model_code)

Create a model given as Rel code in the RAI database.

Parameters

ParameterTypeDescription
rai_db Varchar (opens in a new tab)  RAI database name.
rai_engine Varchar (opens in a new tab) RAI engine name.
model_name Varchar (opens in a new tab) RAI model name.
model_code Varchar (opens in a new tab) The Rel code (also known as model) that will be imported into RAI.

Output

TypeDescription
Variant (opens in a new tab) Status of the load-model request. Example:“ok”.

Explanation

The function load_model_code creates a model inside RAI with the given name and code.

Examples

SELECT RAI.load_model_code('rai_db', 'rai_engine', 'my_model', 'def myrange(x) = range(1, 10, 1, x)');

See Also

load_model and load_model_query.

load_model_query

load_model_query(model_name, model_path)

Create a model in the RAI database.

A model is a collection of Rel declarations that is persisted in the database. Relations defined in a model are available to any query executed against the database.

Parameters

ParameterTypeDescription
model_name Varchar (opens in a new tab) RAI model name.
model_path Varchar (opens in a new tab) The path to a file with Rel code. The path is typically a URI. For example: azure://account_name.blob.core.windows.net/container/model.rel. The model (i.e., Rel code) will be downloaded and imported into RAI.

Output

TypeDescription
Variant (opens in a new tab) Status of the load-model request. Example:“ok”.

Explanation

The function load_model_query imports the model from the specified Azure URI provided through the model_path parameter and stores it in RAI. The function load_model_query will download the Rel code (that is also called a model) from the specified path and import it into RAI.

Examples

SELECT RAI.load_model_query(
    'my_model',
    'azure://<account_name>.blob.core.windows.net/container/model.rel'
);

See Also

load_model and load_model_code.

local_clustering_coefficient

local_clustering_coefficient(graph_name)
local_clustering_coefficient(graph_name, arguments)

Compute the local clustering coefficient of nodes in a graph.

Parameters

ParameterTypeDescription
graph_nameVarchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
nodeInt (opens in a new tab) / Array (opens in a new tab)NoThe node or array of nodes for which the local clustering coefficient is computed.
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing nodes and their local clustering coefficient value.TABLE(node INT, value FLOAT)

Explanation

The local clustering coefficient of a node is a measure of how interconnected the neighbors of a node are. In particular, the local clustering coefficient quantifies how close the node’s neighbors are to being a clique as the probability that an edge exists between any two of the node’s neighbors. Formally, the local clustering coefficient C for a node v is calculated as the following:

C(v) = (2 * num_edges) / (degree(v) * (degree(v) - 1))

Here, num_edges is the number of edges between the neighbors of node v and degree(v) is the degree of node v.

The local_clustering_coefficient function returns a table with two columns — node and value — whose rows represent pairs of nodes and their local clustering coefficient. Nodes whose local clustering coefficient is zero are excluded from the results. If no graph with the provided name exists, an error is returned.

You may get the local clustering coefficient of a single node, or an array of nodes, by providing the node ID, or an array of node IDs, to the node argument. Nodes that do not exist in the graph are ignored. In particular, if none of the nodes passed to the node argument exist, an empty table is returned.

The local_clustering_coefficient function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Compute the local clustering coefficient of every node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with five nodes and seven edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT *
    FROM VALUES (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (2, 4), (3, 5)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the local clustering coefficient of each node in `'my_graph'`.
SELECT * FROM TABLE(RAI.local_clustering_coefficient('my_graph'));
/*+---------------------+
  | NODE | VALUE        |
  +------+--------------+
  | 1    | 1.0          |
  | 2    | 0.6666666666 |
  | 3    | 0.5          |
  | 4    | 0.6666666666 |
  | 5    | 1.0          |
  +------+--------------+  */

Compute the local clustering coefficient of a single node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with five nodes and seven edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT *
    FROM VALUES (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (2, 4), (3, 5)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the local clustering coefficient of node 2 in `'my_graph'`.
SELECT * FROM TABLE(RAI.local_clustering_coefficient('my_graph'), {'node': 2});
/*+---------------------+
  | NODE | VALUE        |
  +------+--------------+
  | 2    | 0.6666666666 |
  +------+--------------+  */

Compute the local clustering coefficient of multiple nodes in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with five nodes and seven edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT *
    FROM VALUES (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (2, 4), (3, 5)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the local clustering coefficient of nodes 2 and 3 in `'my_graph'`.
SELECT * FROM TABLE(RAI.local_clustering_coefficient('my_graph', {'node': [2, 3]}));
/*+---------------------+
  | NODE | VALUE        |
  +------+--------------+
  | 2    | 0.6666666666 |
  | 3    | 0.5          |
  +------+--------------+  */

lookup

lookup(node_id)

Retrieve the object description corresponding to the given node ID.

Parameters

ParameterTypeDescription
node_idInteger (opens in a new tab) A node’s integer ID.

Output

TypeDescription
Object (opens in a new tab) A dictionary containing values of key attributes that identify the object of the entity whose name is stored under the key type.

Explanation

The entity name of the object is stored in the dictionary under the key type.

Before using this function, you must populate the lookup table with create_lookup.

If this function raises the Failure during expansion or Materialized View ... is invalid errors, it is likely due to schema changes to one of the source tables used to populate and maintain the lookup table. Use rebuild_lookup_table to repair the lookup table and list_lookups to verify that the lookup table has been populated by scanning the necessary tables.

Examples

   SELECT LOOKUP(-5527614564291079873);
/* +------------------------------+
   | LOOKUP(-5527614564291079873) |
   |------------------------------|
   | {                            |
   |   "first_name": "John",      |
   |   "last_name": "Smith",      |
   |   "type": "Client"           |
   | }                            |
   +------------------------------+ */
   SELECT LOOKUP(-2071496898560134469);
/* +------------------------------+
   | LOOKUP(-2071496898560134469) |
   |------------------------------|
   | {                            |
   |   "name": "Scissors",        |
   |   "type": "Product"          |
   | }                            |
   +------------------------------+ */

See Also

node, list_entities, create_lookup, and list_lookups.

max_degree

max_degree(graph_name)
max_degree(graph_name, arguments)

Return the maximum degree of a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescription
Int (opens in a new tab)The maximum degree of the graph.

Explanation

The function max_degree('my_graph') returns the maximum degree over all degrees of nodes in my_graph. If no graph with the provided name exists, an error is returned.

Note that in directed graphs, the degree of a node is the sum of its indegree and outdegree.

The max_degree function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Get the maximum degree a directed graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the maximum degree of `'my_graph'`.
SELECT RAI.max_degree('my_graph');
/*+---+
  | 2 |
  +---+  */

Get the maximum degree in a directed graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create a directed graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges', {'directed': true});
 
-- Get the maximum degree of `'my_graph'`.
SELECT RAI.max_degree('my_graph');
/*+---+
  | 3 |
  +---+  */

Get the maximum degree of a graph using a different RAI engine than the engine set in the RAI context, and store the result in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the maximum degree of `'my_graph'` using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT RAI.max_degree(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
);
-- Query the results.
SELECT * FROM my_result_table;
/*+------+
  | COL1 |
  +------+
  | 2    |
  +------+  */

See Also

degree, max_degree, average_degree, and degree_histogram.

min_degree

min_degree(graph_name)
min_degree(graph_name, arguments)

Return the minimum degree of a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescription
Int (opens in a new tab)The minimum degree of the graph.

Explanation

The function min_degree('my_graph') returns the minimum degree over all degrees of nodes in my_graph. If no graph with the provided name exists, an error is returned.

Note that in directed graphs, the degree of a node is the sum of its indegree and outdegree.

The min_degree function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Get the minimum degree of a directed graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the minimum degree of `'my_graph'`.
SELECT RAI.min_degree('my_graph');
/*+---+
  | 1 |
  +---+  */

Get the minimum degree of a graph using a different RAI engine than the engine set in the RAI context, and store the result in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the minimum degree of `'my_graph'` using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT RAI.min_degree(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
);
-- Query the results.
SELECT * FROM my_result_table;
/*+------+
  | COL1 |
  +------+
  | 1    |
  +------+  */

See Also

degree, min_degree, average_degree, and degree_histogram.

neighbor

neighbor(graph_name)
neighbor(graph_name, arguments)

Return the number of edges in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
tuplesArray (opens in a new tab)No.An array of arrays representing pairs of nodes. May not be combined with the node1 and node2 arguments.
node1Int (opens in a new tab) / Array (opens in a new tab)No.A node, or an array of nodes. May not be combined with the tuples argument.
node2Int (opens in a new tab) / Array (opens in a new tab)No.A node, or an array of nodes.
rai_engineVarchar (opens in a new tab)Yes, except if the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table representing pairs of neighbors.TABLE(node1 INT, node2 INT)

Explanation

Two nodes are neighbors if there is any edge in the graph that links the nodes together. The neighbor function returns a table with two columns — node1 and node2 — whose rows indicate that node1 and node2 are neighbors. If no graph with the provided name exists, an error is returned.

A RAI engine is required to execute the neighbor function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. One of the node1 and tuples arguments are required, and they may not be used together.

There are several ways to use the neighbor function:

  • Find the neighbors of nodes, or set of nodes, by passing a node ID, or arrays of node IDs, to the node1 argument.
  • Find the neighbors for every node in the graph by calling the function without an arguments object.
  • Check if two nodes are neighbors by passing each node ID to the node1 and node2 arguments. If the nodes are neighbors, the function returns a row in the result table containing those nodes. Otherwise, an empty table is returned. You may check multiple pairs of nodes simultaneously by passing an array of pairs of node IDs to the tuples argument.

Nodes that do not exist in the graph are ignored. The next section contains concrete examples.

Examples

Find the neighbors of a node:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find the neighbors of node 2 in `'my_graph'`.
SELECT * FROM TABLE(RAI.neighbor('my_graph', {'node1': 2}));
/*+---------------+
  | NODE1 | NODE2 |
  +-------+-------+
  | 2     | 1     |
  | 2     | 3     |
  +-------+-------+  */
 
-- Find the neighbors of nodes 1 and 2 by passing an array to the `'node1'` argument.
SELECT * FROM TABLE(RAI.neighbor('my_graph', {'node1': [1, 2]}))
/*+---------------+
  | NODE1 | NODE2 |
  +-------+-------+
  | 1     | 2     |
  | 2     | 1     |
  | 2     | 3     |
  +-------+-------+  */

Find the neighbors of each node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find the neighbors of each node in `'my_graph'`.
SELECT * FROM TABLE(RAI.neighbor('my_graph'));
/*+---------------+
  | NODE1 | NODE2 |
  +-------+-------+
  | 1     | 2     |
  | 2     | 1     |
  | 2     | 3     |
  | 3     | 2     |
  +-------+-------+  */

Check if two nodes are neighbors:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Check if nodes 1 and 2 are neighbors.
-- Since nodes 1 and 2 are neighbors, the result contains the row
-- with those nodes in the NODE1 and NODE2 columns.
SELECT * FROM TABLE(RAI.neighbor('my_graph', {'node1': 1, 'node2': 2}));
/*+---------------+
  | NODE1 | NODE2 |
  +-------+-------+
  | 1     | 2     |
  +-------+-------+  */
 
-- Check if nodes 1 and 3 are neighbors.
-- Since nodes 1 and 3 are not neighbors, the result is empty.
SELECT * FROM TABLE(RAI.neighbor('my_graph', {'node1': 1, 'node2': 3}));
/*+---------------+
  | NODE1 | NODE2 |
  +-------+-------+
  | Empty table.  |
  +-------+-------+  */
 
-- Check multiple pairs simultaneously by passing an array of pairs to
-- the `'tuples'` argument. The result table only contains rows corresponding
-- to pairs of neighbors.
SELECT * FROM TABLE(RAI.neighbor('my_graph', {'tuples': [[1, 2], [1, 3], [2, 3]]}))
/*+---------------+
  | NODE1 | NODE2 |
  +-------+-------+
  | 1     | 2     |
  | 2     | 3     |
  +-------+-------+  */

Get the neighbors of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get pairs of neighbors in `'my_graph'` using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT TABLE(RAI.neighbor(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+--------------+
  | COL1 | COL2 |
  +-------+------+
  | 1     | 2    |
  | 2     | 1    |
  | 2     | 3    |
  | 3     | 2    |
  +-------+------+  */

See Also

degree.

node

node(entity_name, entity_key_values)

Compute the node ID corresponding to an object of a given entity and identified by specific key values.

Parameters

ParameterTypeDescription
entity_nameVarchar (opens in a new tab) The name of the entity.
entity_key_values Variant (opens in a new tab) or Array (opens in a new tab) The key value(s) identifying an object in the scope of the entity.

Output

TypeDescription
Number (opens in a new tab) The integer ID of the corresponding node.

Explanation

The function is polymorphic: It accepts either a singular value if the entity key is a singleton, or an array of values if the entity key is composite (has multiple attributes). Snowflake coerces almost all basic values to the Variant data type, except for character data, which needs to be cast explicitly. For example, node('Product', 'Crayons'::VARIANT).

You should use entities that have been previously declared with create_entity, and the number of entity key values should be the same as the number of key attributes used in the entity declaration.

Examples

   SELECT RAI.node('Client', ['John', 'Smith']);
 
/* +------------------------------------------+
   | SF_INT.NODE('CLIENT', ['JOHN', 'SMITH']) |
   |------------------------------------------|
   |                     -5527614564291079873 |
   +------------------------------------------+ */

The following example shows an alternative use case where the entity has a single key attribute:

   SELECT RAI.node('Order', 2095029);
/* +-------------------------------+
   | SF_INT.NODE('ORDER', 2095029) |
   |-------------------------------|
   |          -5670855959970978372 |
   +-------------------------------+ */

Because the Varchar (opens in a new tab) is not coerced (opens in a new tab) by Snowflake into Variant (opens in a new tab), creating the nodes of an entity with a single Varchar requires either explicitly casting Variant into Variant, or using a single element list.

   SELECT RAI.node('Product', 'Scissors'::VARIANT);
/* +---------------------------------------------+
   | SF_INT.NODE('PRODUCT', 'SCISSORS'::VARIANT) |
   |---------------------------------------------|
   |                        -2071496898560134469 |
   +---------------------------------------------+ */
 
   SELECT RAI.node('Product', ['Scissors']);
/* +--------------------------------------+
   | SF_INT.NODE('PRODUCT', ['SCISSORS']) |
   |--------------------------------------|
   |                 -2071496898560134469 |
   +--------------------------------------+ */

Typically, the node function is used to define sets of edges. For instance, if you work with a table Purchase(Client_First_Name, Client_Last_Name, Product_Name) and want to create a graph connecting clients to their purchased products, you can define a view with the edges as follows:

 
   CREATE VIEW Purchase_Edge(src, dst) AS (
     SELECT NODE('Client', [Client_First_Name, Client_Last_Name])
            NODE('Product', [Product_Name])
     FROM Purchases
   );

See Also

lookup.

num_edges

num_edges(graph_name)
num_edges(graph_name, arguments)

Return the number of edges in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescription
Int (opens in a new tab)The number of edges in the graph.

Explanation

The function num_edges('my_graph') returns the number of edges in my_graph. If no graph with the provided name exists, an error is returned.

The num_edges function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Get the number of edges in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the number of edges in `'my_graph'`.
SELECT RAI.num_edges('my_graph');
/*+---+
  | 2 |
  +---+  */

Get the number of edges in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the number of edges in `'my_graph'` using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT RAI.num_edges(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
);
-- Query the results.
SELECT * FROM my_result_table;
/*+------+
  | COL1 |
  +------+
  | 2    |
  +------+  */

See Also

num_nodes.

num_nodes

num_nodes(graph_name)
num_nodes(graph_name, arguments)

Return the number of nodes in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescription
Int (opens in a new tab)The number of nodes in the graph.

Explanation

The function num_nodes('my_graph') returns the number of nodes in my_graph. If no graph with the provided name exists, an error is returned.

The num_nodes function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Get the number of nodes in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the number of nodes in `'my_graph'`.
SELECT RAI.num_nodes('my_graph');
/*+---+
  | 3 |
  +---+  */

Get the number of nodes in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the number of nodes in `'my_graph'` using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT RAI.num_nodes(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
);
 
-- Query the results.
SELECT * FROM my_result_table;
/*+------+
  | COL1 |
  +------+
  | 3    |
  +------+  */

See Also

num_edges.

num_triangles

num_triangles(graph_name)
num_triangles(graph_name, arguments)

Compute the number of unique triangles in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescription
Int (opens in a new tab)The number of unique triangles in the graph.

Explanation

The function num_triangles('my_graph') returns the number of unique triangles in my_graph. If no graph with the provided name exists, an error is returned.

The num_triangles function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Compute the number of unique triangles in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create a directed graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 3), (1, 4), (2, 1), (3, 2), (3, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges', {'directed': true});
 
-- Get the number of unique triangles in `'my_graph'`.
SELECT RAI.num_triangles('my_graph');
/*+---+
  | 1 |
  +---+  */

If a graph has no triangles, the result of num_triangles is zero:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the number of unique triangles in `'my_graph'`.
SELECT RAI.num_triangles('my_graph');
/*+---+
  | 0 |
  +---+  */

Get the number of unique triangles in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create a directed graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 3), (1, 4), (2, 1), (3, 2), (3, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges', {'directed': true});
 
-- Get the number of nodes in `'my_graph'` using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT RAI.num_triangles(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
);
 
-- Query the results.
SELECT * FROM my_result_table;
/*+------+
  | COL1 |
  +------+
  | 1    |
  +------+  */

See Also

triangle_count, triangle_community, and unique_triangle.

pagerank

pagerank(graph_name)
pagerank(graph_name, arguments)

Compute the PageRank of nodes in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
nodeInt (opens in a new tab) / Array (opens in a new tab)NoInstruct pagerank to return only the rows in the output table whose node column contains the provided value.
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing pairs of nodes and their PageRank value.TABLE(node INT, value FLOAT)

Explanation

PageRank is a measure of the centrality, or importance, of a node in a graph. It is similar to eigenvector_centrality, but with an additional scaling factor.

The pagerank('my_graph') function returns a table with two columns — node and value — whose rows represent pairs of nodes and their PageRank value. If no graph with the provided name exists, an error is returned.

You may get the PageRank of a single node by providing the node ID to the node argument. If the specified node does not exist in the graph, an empty table is returned.

The pagerank function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Compute the PageRank of each node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the PageRank of each node in `'my_graph'`.
SELECT * FROM TABLE(RAI.pagerank('my_graph'));
/*+---------------------+
  | NODE | VALUE        |
  +------+--------------+
  | 1    | 0.2567563672 |
  | 2    | 0.4864872655 |
  | 3    | 0.2567563672 |
  +------+--------------+  */

Compute the PageRank of a single node:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the PageRank of node 2.
SELECT * FROM TABLE(RAI.pagerank('my_graph', {'node': 2}));
/*+---------------------+
  | NODE | VALUE        |
  +------+--------------+
  | 2    | 0.4864872655 |
  +------+--------------+  */

Compute the PageRank of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the PageRank of each node in `'my_graph'`
-- using the RAI engine `'my_other_rai_engine'` and store the results
-- in the Snowflake table `my_result_table`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.pagerank(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+---------------------+
  | COL1 | COL2         |
  +------+--------------+
  | 1    | 0.2567563672 |
  | 2    | 0.4864872655 |
  | 3    | 0.2567563672 |
  +------+--------------+  */

See Also

betweenness_centrality, degree_centrality, and eigenvector_centrality.

ping

ping()

Ping RAI and return the ping timestamp.

Output

TypeDescription
Variant (opens in a new tab) A timestamp of the ping time from Snowflake to RAI.

Explanation

The function ping returns a timestamp of the ping time from Snowflake to RAI. This is used to check whether RAI is accessible and running as expected.

Examples

SELECT RAI.ping();

See Also

list_data_streams

preferential_attachment

preferential_attachment(graph_name, arguments)

Compute the preferential attachment score of pairs of nodes in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
tuplesArray (opens in a new tab)Yes, but not together with node1.An array of arrays representing pairs of nodes.
node1Int (opens in a new tab) / Array (opens in a new tab)Yes, but not together with tuples.A node, or an array of nodes.
node2Int (opens in a new tab) / Array (opens in a new tab)No.A node, or an array of nodes.
rai_engineVarchar (opens in a new tab)Yes, except if the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing pairs of nodes and their preferential attachment scores.TABLE(node1 INT, node2 INT, score INT)

Explanation

Preferential attachment measures the similarity of two nodes as the product of their degrees. The preferential_attachment function returns a table with three columns — node1, node2, and score — whose rows contain pairs of nodes and their preferential attachment scores. If no graph with the provided name exists, an error is returned.

Higher scores indicate greater similarity, and rows where score is zero are omitted from the results. A score of zero indicates that two nodes are neither similar nor dissimilar. Excluding zeros from the results improves performance and spares you from having to remove those rows in a post-processing step.

A RAI engine is required to execute the preferential_attachment function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. One of the node1 and tuples arguments are required, and they may not be used together.

There are several ways to use the preferential_attachment function:

  • Compute the preferential attachment score of two nodes, or sets of nodes, by passing node IDs, or arrays of node IDs, to the node1 and node2 arguments. You may compute the preferential attachment score for multiple pairs of nodes by passing an array of pairs of node IDs to the tuples argument.
  • Compute the preferential attachment score of a node, or set of nodes, and every other node in the graph, by passing a node ID, or array of node IDs, to the node1 argument.

Nodes that do not exist in the graph are ignored. The next section contains concrete examples.

Examples

Compute the preferential attachment score of two nodes:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the preferential attachment scores of nodes 1 and 2
-- using the `'node1'` and `'node2'` arguments.
SELECT * FROM TABLE(RAI.preferential_attachment('my_graph', {'node1': 1, 'node2': 2}));
/*+-----------------------+
  | NODE1 | NODE2 | SCORE |
  +-------+-------+-------+
  | 1     | 2     | 4     |
  +-------+-------+-------+  */
 
-- Compute the preferential attachment scores of nodes 1 and 2 and nodes 1 and 3
-- using the `'tuples'` argument.
SELECT * FROM TABLE(RAI.preferential_attachment('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+-----------------------+
  | NODE1 | NODE2 | SCORE |
  +-------+-------+-------+
  | 1     | 2     | 4     |
  | 1     | 3     | 2     |
  +-------+-------+-------+  */

Compute the preferential attachment score of a node and every other node:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the preferential attachment score of node 1 and every other node in `'my_graph'`.
SELECT * FROM TABLE(RAI.preferential_attachment('my_graph', {'node1': 1}));
/*+-----------------------+
  | NODE1 | NODE2 | SCORE |
  +-------+-------+-------+
  | 1     | 1     | 4     |
  | 1     | 2     | 4     |
  | 1     | 3     | 2     |
  +-------+-------+-------+  */

Compute the preferential attachment score of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the preferential attachment score of node 1 and every other node in `'my_graph'`
-- using the RAI engine `'my_other_rai_engine'` and store the results
-- in the Snowflake table `'my_result_table'`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.preferential_attachment(
    'my_graph',
    {
        'node1': 1,
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+--------------------+
  | COL1 | COL2 | COL3 |
  +------+------+------+
  | 1    | 1    | 4    |
  | 1    | 2    | 4    |
  | 1    | 3    | 2    |
  +------+------+------+  */

See Also

jaccard_similarity, cosine_similarity, adamic_adar, and common_neighbor.

shortest_path_length

shortest_path_length(graph_name, arguments)

Return the length of a shortest path from a source node to one or more target nodes in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
tuplesArray (opens in a new tab)Yes, but may not be combined with the source and target arguments.An array of arrays representing pairs of nodes.
sourceInt (opens in a new tab) / Array (opens in a new tab)Yes, but not together with tuples.A node, or an array of nodes.
targetInt (opens in a new tab) / Array (opens in a new tab)NoA node, or an array of nodes.
rai_engineVarchar (opens in a new tab)Yes, except if the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing the shortest path length between pairs of nodes.TABLE(source INT, target INT, length INT)

Explanation

The function shortest_path_length returns a table with three columns — source, target, and length — whose rows represent the length of a shortest path from a source node to a target node. If no graph with the provided name exists, an error is returned. Rows for which no path from the source node to the target node exists are excluded from the results.

A RAI engine is required to execute the shortest_path_length function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. One of the node1 and tuples arguments are required, and they may not be used together.

There are several ways to use the shortest_path_length function:

  • Compute the shortest path length between a two nodes, or two sets of nodes, by passing node IDs, or arrays of node IDS, to the source and target arguments. You may compute the shortest path length between multiple pairs of nodes by passing an array of pars of node IDs to the tuples argument.
  • Compute the shortest path length between a node, or a set of nodes, and every node in the graph, by passing a node ID, or an array of node IDs to the source argument.

Nodes that do not exist in the graph are ignored. The next section contains concrete examples.

Examples

Compute the shortest path length between two nodes:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the shortest path length between node 1 and node 3.
SELECT * FROM TABLE(RAI.shortest_path_length('my_graph', {'source': 1, 'target': 3}));
/*+--------------------------+
  | SOURCE | TARGET | LENGTH |
  +--------+--------+--------+
  | 1      | 3      | 2      |
  +--------+--------+--------+  */
 
-- Compute the shortest path length between nodes 1 and 2 and nodes 1 and 3
-- using the `'tuples'` argument.
SELECT * FROM TABLE(RAI.shortest_path_length('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+--------------------------+
  | SOURCE | TARGET | LENGTH |
  +--------+--------+--------+
  | 1      | 2      | 1      |
  | 1      | 3      | 2      |
  +--------+--------+--------+  */

Find the shortest path length from one node to every other node:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create a directed graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges', {'directed': true});
 
-- Find the shortest path length from node 1 to every node in `'my_graph'`.
-- Rows where no path between nodes exists are excluded from the results.
SELECT * FROM TABLE(RAI.shortest_path_length('my_graph', {'source': 1}));
/*+--------------------------+
  | SOURCE | TARGET | LENGTH |
  +--------+--------+--------+
  | 1      | 1      | 0      |
  | 1      | 2      | 1      |
  | 1      | 3      | 2      |
  +--------+--------+--------+  */

Get the length of a shortest path from a source node to all reachable nodes using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find the shortest path length from node 1 in `'my_graph'`
-- to all nodes reachable from node 1 using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `'my_result_table'`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT TABLE(RAI.shortest_path_length(
    'my_graph',
    {
        'source': 1,
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+--------------------+
  | COL1 | COL2 | COL3 |
  +------+------+------+
  | 1    | 1    | 0    |
  | 1    | 2    | 1    |
  | 1    | 3    | 2    |
  +------+------+------+  */

See Also

transitive_closure

transitive_closure

transitive_closure(graph_name, arguments)

Computes the transitive closure of the edges in a graph and may be used to determine which nodes are reachable from each node in the graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
tuplesArray (opens in a new tab)Yes, but may not be combined with the source and target arguments.An array of arrays representing pairs of nodes.
sourceInt (opens in a new tab) / Array (opens in a new tab)Yes, but not together with tuples.A node, or an array of nodes.
targetInt (opens in a new tab) / Array (opens in a new tab)No.A node, or an array of nodes.
rai_engineVarchar (opens in a new tab)Yes, except if the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing pairs of reachable nodes.TABLE(source INT, target INT)

Explanation

The transitive closure of a graph is the set of all pairs of source and target nodes for which there exists a path beginning at the source node and ending at the target node. If such a path exists, the target node is said to be reachable from the source node.

The transitive_closure function returns a table with two columns — source and target — whose rows represent pairs of nodes for which the target node is reachable from the source node. If no graph with the provided name exists, an error is returned. Rows for which no path from the source node to the target node exists are excluded from the results.

A RAI engine is required to execute the transitive_closure function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. One of the node1 and tuples arguments are required, and they may not be used together.

There are several ways to use the transitive_closure function:

  • Check if a target node is reachable from a source node by passing node IDs to the source and target arguments. If it is reachable, the result contains the source and target pair. Otherwise, the result is empty. Multiple pairs may be checked simultaneously by passing an array of pairs of node IDs to the tuples argument.
  • Find every node reachable from a source node, or a set of source nodes, by passing a node ID, or an array of node IDs, to the source argument.

Nodes that do not exist in the graph are ignored. The next section contains concrete examples.

Examples

Determine if a node is reachable from another node in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Check whether or not node 2 is reachable from node 1.
-- Since node 2 is reachable from node 1, the result table is non-empty.
SELECT * FROM TABLE(RAI.transitive_closure('my_graph', {'source': 1, 'target': 2}));
/*+-----------------+
  | SOURCE | TARGET |
  +--------+--------+
  | 1      | 2      |
  +--------+--------+  */
 
-- Check whether or not node 3 is reachable from node 1.
-- Since node 3 is not reachable from node 1, the result table is empty.
SELECT * FROM TABLE(RAI.transitive_closure('my_graph', {'source': 1, 'target': 3}));
/*+-----------------+
  | SOURCE | TARGET |
  +--------+--------+
  | Empty result    |
  +--------+--------+  */
 
-- Check multiple pairs of nodes simultaneously using the `'tuples'` argument.
-- Only the rows corresponding to reachable pairs are included in the result table.
SELECT * FROM TABLE(RAI.transitive_closure('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+-----------------+
  | SOURCE | TARGET |
  +--------+--------+
  | 1      | 2      |
  +--------+--------+  */

Find every node reachable from a given source node:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get all nodes in `'my_graph'` reachable from node 1.
SELECT * FROM TABLE(RAI.transitive_closure('my_graph', {'source': 1}));
/*+-----------------+
  | SOURCE | TARGET |
  +--------+--------+
  | 1      | 2      |
  | 1      | 3      |
  +--------+--------+  */

Get every node reachable from a given node using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find all nodes reachable from node 1 using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `'my_result_table'`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT TABLE(RAI.transitive_closure(
    'my_graph',
    {
        'source': 1,
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+-------------+
  | COL1 | COL2 |
  +------+------+
  | 1    | 2    |
  | 1    | 3    |
  +------+------+  */

See Also

shortest_path_length

triangle_community

triangle_community(graph_name)
triangle_community(graph_name, arguments)

Find K-clique communities (with K=3) using the percolation method (opens in a new tab).

This algorithm does not support directed graphs.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

You can specify the following arguments in the arguments object:

Argument NameTypeRequiredDescription
nodeInt (opens in a new tab) / Array (opens in a new tab)No.A node ID, or array of node IDs.
communityInt (opens in a new tab) / Array (opens in a new tab)No.A community ID, or array of community IDs.
rai_engineVarchar (opens in a new tab)Yes, unless the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing node IDs and their community IDs.TABLE(node INT, community INT)

Explanation

A triangle community is the union of all triangles (3-cliques) that can be reached through adjacent triangles, where two triangles are adjacent if they share two nodes.

The triangle_community function returns a table with two columns — node and community — whose rows pair a node ID with its triangle community label. Community labels are integers and are computed deterministically. Nodes that are not contained in a triangle are excluded from the results. If no graph with the provided name exists, an error is returned.

A RAI engine is required to execute the triangle_community function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument.

Examples

Compute triangle community labels for each node in an undirected graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with six nodes and seven edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES
    (1, 2), (2, 3), (3, 1), (3, 4),
    (4, 5), (5, 6), (6, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute triangle community labels.
SELECT * FROM TABLE(RAI.triangle_community('my_graph'));
/*+------------------+
  | NODE | COMMUNITY |
  +------+-----------+
  | 1    | 1         |
  | 2    | 1         |
  | 3    | 1         |
  | 4    | 2         |
  | 5    | 2         |
  | 6    | 2         |
  +------+-----------+  */

Compute the triangle community label for a single node:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with six nodes and seven edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES
    (1, 2), (2, 3), (3, 1), (3, 4),
    (4, 5), (5, 6), (6, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the triangle community label of node 2.
SELECT * FROM TABLE(RAI.triangle_community('my_graph', {'node': 2}));
/*+------------------+
  | NODE | COMMUNITY |
  +------+-----------+
  | 2    | 1         |
  +------+-----------+  */

Find all nodes with a given community label:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with six nodes and seven edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES
    (1, 2), (2, 3), (3, 1), (3, 4),
    (4, 5), (5, 6), (6, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute triangle community labels.
SELECT * FROM TABLE(RAI.triangle_community('my_graph', {'community': 2}));
/*+------------------+
  | NODE | COMMUNITY |
  +------+-----------+
  | 4    | 2         |
  | 5    | 2         |
  | 6    | 2         |
  +------+-----------+  */

See Also

label_propagation, num_triangles, triangle_count, and unique_triangle.

triangle_count

triangle_count(graph_name)
triangle_count(graph_name, arguments)

Count the number of unique triangles to which each node in a graph belongs.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
nodeInt (opens in a new tab) / Array (opens in a new tab)NoInstruct triangle_count to return only the rows in the output table whose node column contains the provided value.
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table whose rows contain the number of unique triangles to which a node belongs.TABLE(node INT, count INT)

Explanation

The function triangle_count('my_graph') returns a table with two columns, node and count, whose rows represent the number of unique triangles in the graph to which each node belongs. If no graph with the provided name exists, an error is returned.

The triangle_count function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Count the unique triangles to which each node in a graph belongs:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create a directed graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 2), (2, 3), (2, 4), (3, 1), (3, 4), (5, 1)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges', {'directed': true});
 
-- Count the number of unique triangles in `'my_graph'` to which each node belongs.
SELECT * FROM TABLE(RAI.triangle_count('my_graph'));
/*+--------------+
  | NODE | COUNT |
  +------+-------+
  | 1    | 1     |
  | 2    | 1     |
  | 3    | 1     |
  | 4    | 0     |
  | 5    | 0     |
  +------+-------+  */

Compute the number of unique triangles to which a given node belongs:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create a directed graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 2), (2, 3), (2, 4), (3, 1), (3, 4), (5, 1)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges', {'directed': true});
 
-- Count the number of unique triangles in `'my_graph'` to which each node belongs.
SELECT * FROM TABLE(RAI.triangle_count('my_graph', {'node': 2}));
/*+--------------+
  | NODE | COUNT |
  +------+-------+
  | 2    | 1     |
  +------+-------+  */

Count the unique triangles to which each node in a graph belongs using a different RAI engine than the one set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create a directed graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 2), (2, 3), (2, 4), (3, 1), (3, 4), (5, 1)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges', {'directed': true});
 
-- Count the number of unique triangles in `'my_graph'` to which
-- each node belongs using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT TABLE(RAI.triangle_count(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+-------------+
  | COL1 | COL2 |
  +------+------+
  | 1    | 1    |
  | 2    | 1    |
  | 3    | 1    |
  | 4    | 0    |
  | 5    | 0    |
  +------+------+  */

See Also

num_triangles, triangle_community, and unique_triangle.

unique_triangle

unique_triangle(graph_name)
unique_triangle(graph_name, arguments)

Compute triples of nodes, unique up to order, that form a triangle in a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
tuplesArray (opens in a new tab)Yes, but may not be combined with the node1, node2, and node3 arguments.An array of arrays representing pairs of nodes.
node1Int (opens in a new tab) / Array (opens in a new tab)Yes, but not together with tuples.A node, or an array of nodes.
node2Int (opens in a new tab) / Array (opens in a new tab)No.A node, or an array of nodes.
node3Int (opens in a new tab) / Array (opens in a new tab)NoA node, or an array of nodes.
rai_engineVarchar (opens in a new tab)Yes, except if the engine is set via a RAI context.The name of the RAI engine to use to execute the algorithm.
result_table Varchar (opens in a new tab)No.The fully qualified name of the Snowflake table in which to store results.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table whose rows contain triples of nodes forming unique triangles in a graph.TABLE(node1 INT, node2 INT, node3 INT)

Explanation

The unique_triangle function returns a table with three columns — node1, node2, and node3 — whose rows represent triples of nodes that form a triangle. The node ids within the triples are ordered from smallest to largest to avoid combinatorially redundant triples that correspond to the same triangle. If no graph with the provided name exists, an error is returned.

For undirected graphs, the uniqueness of each triple is guaranteed because the nodes are ordered so that node1 < node2 < node3. For directed graphs, triples are ordered so that node1 < node2, node1 < node3, and node2 != node3. This admits triangles with the same nodes but oppositely directed edges. For example, the triples 1, 2, 3 and 1, 3, 2 represent two unique directed triangles.

A RAI engine is required to execute the unique_triangle function. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. One of the node1 and tuples arguments are required, but they may not be used together.

There are several ways to use the unique_triangle function:

  • Find all unique triangles in a graph by calling unique_triangle('my_graph').
  • Find all unique triangles containing a given node, or a set of nodes, by passing a node ID, or array of node IDs, to the node1 argument. Only triangles where the given node ID is the smallest node ID in the triangle will be returned, due to how triples are ordered.
  • Find all unique triangles containing a pair of nodes by passing node IDs to the node1 and node2 arguments. You may find triangles for multiple pairs of nodes by passing an array of pairs of node IDs to the tuples argument. The smallest node ID must be passed to node1 due to how triples are ordered.
  • Check if three nodes form a triangle by passing the nodes IDs to the node1, node2, and node3 arguments. You may check multiple triples simultaneously by passing an array of triples of node IDs to the tuples argument. To avoid false negatives, you must ensure that the node IDs are passed in the correct order.

Nodes that do not exist in the graph are ignored. The next section contains concrete examples.

Examples

Find all unique triangles in a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an edge table representing a graph with four nodes and five edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 3), (2, 1), (2, 4), (3, 2), (3, 4)
);
 
-- Create a RAI data stream.
CALL RAI.create_data_stream('my_edges');
 
-- Create an undirected graph and a directed graph from the RAI data stream.
CALL RAI.create_graph('my_undirected_graph', 'my_edges');
CALL RAI.create_graph('my_directed_graph', 'my_edges', {'directed': true});
 
-- Compute all of the unique triangles in `'my_undirected_graph'`.
SELECT * FROM TABLE(RAI.unique_triangle('my_undirected_graph'));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1     | 2     | 3     |
  | 2     | 3     | 4     |
  +-------+-------+-------+  */
 
-- Compute all of the unique triangles in `'my_directed_graph'`.
SELECT * FROM TABLE(RAI.unique_triangle('my_directed_graph'));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1     | 3     | 2     |
  +-------+-------+-------+  */

Find all unique triangles containing a given node:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with four nodes and five edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 3), (2, 1), (2, 4), (3, 2), (3, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find all unique triangles in `'my_graph'` starting with node 1
-- by passing the node ID to the `'node1'` argument.
SELECT * FROM TABLE(RAI.unique_triangle('my_graph', {'node1': 1}));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1     | 2     | 3     |
  +-------+-------+-------+  */
 
-- Find all unique triangles in `'my_graph'` starting with node 2
-- by passing the node ID to the `'node1'` argument. Even though
-- node 2 is contained in both of the two triangles in the graph,
-- there is only one triangle that "starts" at node 2, meaning
-- 2 is the smallest node ID in the triangle.
SELECT * FROM TABLE(RAI.unique_triangle('my_graph', {'node1': 2}))
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 2     | 3     | 4     |
  +-------+-------+-------+  */
 
-- Find all unique triangles in `'my_graph'` starting with any of
-- nodes 1, 2, or 3 by passing an array of node IDs to the `'node1'` argument.
-- There is no triangle represented with node 3 in the first column,
-- so only the rows for the triangles with nodes 1 and 2 in the first column
-- are returned.
SELECT * FROM TABLE(RAI.unique_triangle('my_graph', {'node1': [1, 2, 3]}))
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1     | 2     | 3     |
  | 2     | 3     | 4     |
  +-------+-------+-------+  */

Find all unique triangles containing two given nodes:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with four nodes and five edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 3), (2, 1), (2, 4), (3, 2), (3, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find all unique triangles in `'my_graph'` starting with node 1 and node 3.
SELECT * FROM TABLE(RAI.unique_triangle('my_graph', {'node1': 1, 'node2': 3}));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1     | 3     | 2     |
  +-------+-------+-------+  */
 
-- Find all unique triangles in `'my_graph'` starting with node 2 and node 3.
-- Even though both triangles in the graph contain nodes 2 and 3,
-- there is only one triangle represented with node 2 in the first column
-- and node 3 in the second column.
SELECT * FROM TABLE(RAI.unique_triangle('my_graph', {'node1': 1, 'node2': 3}));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 2     | 3     | 4     |
  +-------+-------+-------+  */
 
-- Find all unique triangles in `'my_graph'` starting with nodes 1 and 2,
-- nodes 2 and 3, and nodes 3 and 4 by passing an array of node IDs to
-- the `'tuples'` argument. There is no triangle represented with node 3
-- in the first column, so only the rows with nodes 1 and 3 and nodes 2 and 3
-- in the first two columns are returned.
SELECT * FROM TABLE(RAI.unique_triangle('my_graph', {'tuples': [[1, 2], [2, 3], [3, 4]]}));
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1     | 2     | 3     |
  | 2     | 3     | 4     |
  +-------+-------+-------+  */

Determine whether or not three nodes form a unique triangle:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with four nodes and five edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 3), (2, 1), (2, 4), (3, 2), (3, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Determine whether or not nodes 1, 2, and 3 form a unique triangle in `'my_graph'`.
-- Note that the result table is non-empty, since there is a triangle
-- containing nodes 1, 3, and 2 and is represented with the nodes in that order.
SELECT * FROM TABLE(
    RAI.unique_triangle('my_graph', {'node1': 1, 'node2': 2, 'node3': 3})
);
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1     | 2     | 3     |
  +-------+-------+-------+  */
 
-- Determine whether or not nodes 1, 3, and 2 form a unique triangle in `'my_graph'`.
-- The result table is empty. Even though there is a triangle containing nodes
-- 1, 3, and 2, there is no triangle represented by the nodes in that order.
SELECT * FROM TABLE(
    RAI.unique_triangle('my_graph', {'node1': 1, 'node2': 3, 'node3': 2})
);
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  |  Empty table          |
  +-------+-------+-------+  */
 
-- You can check multiple triples of nodes simultaneously by passing
-- an array of triples of node IDs to the `'tuples'` argument.
SELECT * FROM TABLE(
    RAI.unique_triangle('my_graph', {'tuples': [[1, 2, 3], [1, 3, 2], [2, 3, 4]]})
)
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1     | 2     | 3     |
  | 2     | 3     | 4     |
  +-------+-------+-------+  */

Compute all of the unique triangles in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS (
    SELECT * FROM VALUES (1, 3), (2, 1), (2, 4), (3, 2), (3, 4)
);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find the common neighbors of node 1 and every other node in `'my_graph'`
-- using the RAI engine `'my_other_rai_engine'` and store the results
-- in the Snowflake table `my_result_table`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.unique_triangle(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+-----------------------+
  | NODE1 | NODE2 | NODE3 |
  +-------+-------+-------+
  | 1     | 2     | 3     |
  | 2     | 3     | 4     |
  +-------+-------+-------+  */

See Also

num_triangles, triangle_community, and triangle_count.

weakly_connected_component

weakly_connected_component(graph_name)
weakly_connected_component(graph_name, arguments)

Computes the weakly connected components of a graph.

Parameters

ParameterTypeDescription
graph_name Varchar (opens in a new tab)The name of the graph.
argumentsObject (opens in a new tab)A JSON object containing additional arguments.

Supported Arguments

The following arguments may be specified in the arguments object:

Argument NameTypeRequiredDescription
nodeInt (opens in a new tab) / Array (opens in a new tab)NoWhen provided, the function returns the index of the component to which node belongs.
componentInt (opens in a new tab) / Array (opens in a new tab)NoWhen provided without node, the function returns all nodes in the given component. Use node and component together to check whether or not a given node belongs to a given component.
rai_engineVarchar (opens in a new tab)NoThe name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context.
result_table Varchar (opens in a new tab)NoThe fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table.

Output

TypeDescriptionSchema
Table (opens in a new tab)A table containing nodes and the index of the weakly connected component to which they belong.TABLE(node INT, component INT)

Explanation

The weakly_connected_component('my_graph') function computes the weak component (opens in a new tab) to which each node in the graph belongs. The component ID is the minimum ID of any node belonging to the component. If no graph with the provided name exists, an error is returned.

Both directed and undirected graphs are supported by the weakly_connected_component function. In undirected graphs, weakly connected components are the same as connected components.

The weakly_connected_component function requires a RAI engine. You can either specify the engine in a RAI context or provide an engine name to the rai_engine argument. The next section contains concrete examples.

Examples

Compute the weakly connected components of a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Compute the weakly connected components of `'my_graph'`.
SELECT * FROM TABLE(RAI.weakly_connected_component('my_graph'));
/*+------------------+
  | NODE | COMPONENT |
  +------+-----------+
  | 1    | 1         |
  | 2    | 1         |
  | 3    | 3         |
  | 4    | 3         |
  +------+-----------+  */

Find the weakly connected component to which a given node belongs:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create a directed graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges', {'directed': true});
 
-- Find the weakly connected component of `'my_graph'` containing node 2.
SELECT * FROM TABLE(RAI.weakly_connected_component('my_graph', {'node': 2}));
/*+------------------+
  | NODE | COMPONENT |
  +------+-----------+
  | 2    | 1         |
  +------+-----------+  */

Find all nodes in a given component of a graph:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Find all nodes in `'my_graph'` belonging to component 3.
SELECT * FROM TABLE(RAI.weakly_connected_component('my_graph', {'component': 3}));
/*+------------------+
  | NODE | COMPONENT |
  +------+-----------+
  | 3    | 3         |
  | 4    | 3         |
  +------+-----------+  */

Determine whether or not a given node belongs to a given component:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Check whether or not node 2 belongs to component 1.
-- Note that the output table is non-empty, since node 2 belongs to component 1.
SELECT * FROM TABLE(
    RAI.weakly_connected_component('my_graph', {'node': 2, 'component': 1})
);
/*+------------------+
  | NODE | COMPONENT |
  +------+-----------+
  | 2    | 1         |
  | 4    | 3         |
  +------+-----------+  */
 
-- Check whether or not node 2 belongs to component 3.
-- Note that the output table is empty, since node 2 does not belong to component 3.
SELECT * FROM TABLE(
    RAI.weakly_connected_component('my_graph', {'node': 2, 'component': 1})
);
/*+------------------+
  | NODE | COMPONENT |
  +------+-----------+
  | Empty table      |
  +------+-----------+  */

Compute the weakly connected components of a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:

-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
 
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
 
-- Get the number of edges in `'my_graph'` using the RAI engine `'my_other_rai_engine'`
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT TABLE(RAI.weakly_connected_component(
    'my_graph',
    {
        'rai_engine': 'my_other_rai_engine',
        'result_table': '<sf_database>.<sf_schema>.my_result_table'
    }
));
 
-- Query the results.
SELECT * FROM my_result_table;
/*+-------------+
  | COL1 | COL2 |
  +------+------+
  | 1    | 1    |
  | 2    | 1    |
  | 3    | 3    |
  | 4    | 3    |
  +------+------+  */

See Also

is_connected.

Was this doc helpful?