SQL Library Reference for Snowflake: Functions
Reference for the SQL functions that come with the RAI Integration Services for Snowflake.
adamic_adar
adamic_adar(graph_name, arguments)
Compute the Adamic-Adar index of pairs of nodes in a graph.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments
may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
tuples | Array (opens in a new tab) | Yes, but not together with node1 . | An array of arrays representing pairs of nodes. |
node1 | Int (opens in a new tab) / Array (opens in a new tab) | Yes, but not together with tuples . | A node, or an array of nodes. |
node2 | Int (opens in a new tab) / Array (opens in a new tab) | No. | A node, or an array of nodes. |
rai_engine | Varchar (opens in a new tab) | Yes, except if the engine is set via a RAI context. | The name of the RAI engine to use to execute the algorithm. |
result_table | Varchar (opens in a new tab) | No. | The fully qualified name of the Snowflake table in which to store results. |
Output
Type | Description | Schema |
---|---|---|
Table (opens in a new tab) | A table containing pairs of nodes and their Adamic-Adar index. | TABLE(node1 INT, node2 INT, score FLOAT) |
Explanation
The Adamic-Adar index measures the similarity of two nodes
based on the number of shared edges between them.
The adamic_adar
function returns a table with three columns
— node1
, node2
, and score
—
whose rows contain pairs of nodes and their Adamic-Adar index value.
If no graph with the provided name exists, an error is returned.
Higher scores indicate greater similarity,
and rows where the score
is zero are omitted from the results.
A score of zero indicates that two nodes are neither similar nor dissimilar.
Excluding zeros from the results improves performance and prevents the need
to remove those rows in a post-processing step.
A RAI engine is required to execute the adamic_adar
function.
You can either specify the engine in a
RAI context
or provide an engine name to the rai_engine
argument.
One of the node1
and tuples
arguments are required,
and they may not be used together.
There are several ways to use the adamic_adar
function:
- Compute the Adamic-Adar index of two nodes, or sets of nodes,
by passing node IDs, or arrays of node IDs, to the
node1
andnode2
arguments. You may compute the Adamic-Adar index for multiple pairs of nodes by passing an array of pairs of node IDs to thetuples
argument. - Compute the Adamic-Adar index of a node, or set of nodes,
and every other node in the graph, by passing a node ID, or array of node IDs,
to the
node1
argument.
Nodes that do not exist in the graph are ignored. The next section contains concrete examples.
Examples
Compute the Adamic-Adar index of two nodes in a graph
using the node1
and node2
arguments:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the Adamic-Adar index of nodes 1 and 2
-- using the 'node1' and 'node2' arguments.
SELECT * FROM TABLE(RAI.adamic_adar('my_graph', {'node1': 1, 'node2': 2}));
/*+------------------------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+--------------------+
| 1 | 2 | 1.4426950408889634 |
+-------+-------+--------------------+ */
-- Compute the Adamic-Adar index of nodes 1 and 2 and of nodes 1 and 3
-- using the 'tuples' argument.
SELECT * FROM TABLE(RAI.adamic_adar('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+------------------------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+--------------------+
| 1 | 2 | 1.4426950408889634 |
| 1 | 3 | 1.4426950408889634 |
+-------+-------+--------------------+ */
Compute the Adamic-Adar index of a given node and every node in a graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the Adamic-Adar index of node 1 and every node in 'my_graph'.
SELECT * FROM TABLE(RAI.adamic_adar('my_graph', {'node1': 1}));
/*+------------------------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+--------------------+
| 1 | 1 | 2.8853900817779268 |
| 1 | 2 | 1.4426950408889634 |
| 1 | 3 | 1.4426950408889634 |
+-------+-------+--------------------+ */
Compute the Adamic-Adar index of a single node and a set of nodes
using the node1
and node2
arguments:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the Adamic-Adar index of node 1 and nodes 2 and 3.
SELECT * FROM TABLE(RAI.adamic_adar('my_graph', {'node1': 1, 'node2': [2, 3]}));
/*+------------------------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+--------------------+
| 1 | 2 | 1.4426950408889634 |
| 1 | 3 | 1.4426950408889634 |
+-------+-------+--------------------+ */
Compute the Adamic-Adar index of every pair of nodes from two sets of nodes:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the Adamic-Adar index between two sets of nodes.
-- The result contains rows for each pair in the Cartesian product of the sets of nodes.
-- Note that the row with 3 in the NODE1 column and 2 in the NODE2 column is missing
-- because the Adamic-Adar index is zero.
SELECT * FROM TABLE(RAI.adamic_adar('my_graph', {'node1': [1, 3], 'node2': [2, 3]}));
/*+-----------------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+-------------+
| 1 | 2 | 1.442695041 |
| 1 | 3 | 1.442695041 |
| 3 | 3 | 1.442695041 |
+-------+-------+-------------+ */
Compute the Adamic-Adar index of a given node and each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the Adamic-Adar index of node 1 and every other node in 'my_graph'
-- using the RAI engine 'my_other_rai_engine' and store the results
-- in the Snowflake table `my_result_table`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.adamic_adar(
'my_graph',
{
'node1': 1,
'rai_engine': 'my_other_rai_engine',
'result_table': '<sf_database>.<sf_schema>.my_result_table'
}
));
-- Query the results.
SELECT * FROM my_result_table;
/*+----------------------------------+
| COL1 | COL2 | COL3 |
+------+------+--------------------+
| 1 | 1 | 2.8853900817779268 |
| 1 | 2 | 1.4426950408889634 |
| 1 | 3 | 1.4426950408889634 |
+------+------+--------------------+ */
See Also
jaccard_similarity
,
cosine_similarity
,
preferential_attachment
, and
common_neighbor
.
average_degree
average_degree(graph_name)
average_degree(graph_name, arguments)
Return the average degree of a graph.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
rai_engine | Varchar (opens in a new tab) | No | The name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context. |
result_table | Varchar (opens in a new tab) | No | The fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table. |
Output
Type | Description |
---|---|
Float (opens in a new tab) | The average degree of the graph. |
Explanation
The function average_degree('my_graph')
returns the average degree
over all degrees of nodes in my_graph
.
If no graph with the provided name exists, an error is returned.
Note that in directed graphs, the degree of a node is the sum of its indegree and outdegree.
The average_degree
function requires a RAI engine.
You can either specify the engine in a
RAI context
or
provide an engine name to the rai_engine
argument.
The next section contains concrete examples.
Examples
Get the average degree of an undirected graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Get the average degree of 'my_graph'.
SELECT RAI.average_degree('my_graph');
/*+-----+
| 1.5 |
+-----+ */
Get the average degree of a graph using a different RAI engine than the engine set in the RAI context, and store the result in a Snowflake table:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Get the average degree of 'my_graph' using the RAI engine 'my_other_rai_engine'
-- and store the results in the Snowflake table 'my_result_table'.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT RAI.max_degree(
'my_graph',
{
'rai_engine': 'my_other_rai_engine',
'result_table': '<sf_database>.<sf_schema>.my_result_table'
}
);
-- Query the results.
SELECT * FROM my_result_table;
/*+------+
| COL1 |
+------+
| 1.5 |
+------+ */
See Also
degree
,
max_degree
,
min_degree
, and
degree_histogram
.
common_neighbor
common_neighbor(graph_name, arguments)
Find common neighbors of nodes in a graph.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments
may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
tuples | Array (opens in a new tab) | Yes, but not together with node1 . | An array of arrays representing pairs of nodes. |
node1 | Int (opens in a new tab) / Array (opens in a new tab) | Yes, but not together with tuples . | A node, or an array of nodes. |
node2 | Int (opens in a new tab) / Array (opens in a new tab) | No. | A node, or an array of nodes. |
node3 | Int (opens in a new tab) / Array (opens in a new tab) | No. | A node, or an array of nodes. |
rai_engine | Varchar (opens in a new tab) | Yes, except if the engine is set via a RAI context. | The name of the RAI engine to use to execute the algorithm. |
result_table | Varchar (opens in a new tab) | No. | The fully qualified name of the Snowflake table in which to store results. |
Output
Type | Description | Schema |
---|---|---|
Table (opens in a new tab) | A table whose rows represent triples of nodes in which the third node is a common neighbor of the first two nodes. | TABLE(node1 INT, node2 INT, node3 INT) |
Explanation
The common_neighbor
function returns a table with three columns
— node1
, node2
, and node3
—
whose rows indicate that node3
is a common neighbor of node1
and node2
.
If no graph with the provided name exists, an error is returned.
A RAI engine is required to execute the common_neighbor
function.
You can either specify the engine in a
RAI context
or provide an engine name to the rai_engine
argument.
One of the node1
and tuples
arguments are required,
and they may not be used together.
There are several ways to use the common_neighbor
function:
- Find the common neighbors of two nodes, or sets of nodes,
by passing node IDs, or arrays of node IDs, to the
node1
andnode2
arguments. You may find the common neighbors of multiple pairs of nodes by passing an array of pairs of node IDs to thetuples
argument. - Find the common neighbors of a node, or set of nodes,
and every other node in the graph, by passing a single node,
or an array of nodes, to the
node1
argument. - Check if a node is a common neighbor of two other nodes
by passing each node ID to the
node1
,node2
, andnode3
arguments. Ifnode3
is a common neighbor ofnode1
andnode2
, then the function returns a row in the result table containingnode1
,node2
, andnode3
. Otherwise, an empty table is returned. You may check multiple triples simultaneously by passing an array of triples of node IDs to thetuples
argument.
Nodes that do not exist in the graph are ignored. The next section contains concrete examples.
Examples
Find the common neighbors of two nodes:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 3), (1, 4), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Find the common neighbors of nodes 1 and 2 using the 'node1' and 'node2' arguments.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'node1': 1, 'node2': 2}));
/*+-----------------------+
| NODE1 | NODE2 | NODE3 |
+-------+-------+-------+
| 1 | 2 | 3 |
+------+--------+-------+ */
-- Find the common neighbors of nodes 1 and 2 and nodes 1 and 3 using the 'tuples' argument.
-- No row is returned with 1 in the NODE1 column and 3 in the NODE2 column
-- since node 1 and node 3 have no common neighbors.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+-----------------------+
| NODE1 | NODE2 | NODE3 |
+-------+-------+-------+
| 1 | 2 | 3 |
+------+--------+-------+ */
Find the common neighbor of a given node and every node in a graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 3), (1, 4), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Find the common neighbors of node 1 and every node in 'my_graph'.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'node1': 1}));
/*+-----------------------+
| NODE1 | NODE2 | NODE3 |
+-------+-------+-------+
| 1 | 1 | 3 |
| 1 | 1 | 4 |
| 1 | 2 | 3 |
+------+--------+-------+ */
Check whether or not a node is a common neighbor of two other nodes:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 3), (1, 4), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Check whether or not node 3 is a common neighbor of nodes 1 and 2.
-- The output is non-empty because 3 is a common neighbor.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'node1': 1, 'node2': 2, 'node3': 3}));
/*+-----------------------+
| NODE1 | NODE2 | NODE3 |
+-------+-------+-------+
| 1 | 2 | 3 |
+------+--------+-------+ */
-- Check whether or not node 4 is a common neighbor of nodes 1 and 2.
-- The output is empty becase node 4 is not a common neighbor.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'node1': 1, 'node2': 2, 'node3': 4}));
/*+-----------------------+
| NODE1 | NODE2 | NODE3 |
+-------+-------+-------+
| Empty table. |
+------+--------+-------+ */
-- Check multiple triples simultaneously using the 'tuples' argument.
SELECT * FROM TABLE(RAI.common_neighbor('my_graph', {'tuples': [[1, 2, 3], [1, 2, 4]]}))
/*+-----------------------+
| NODE1 | NODE2 | NODE3 |
+-------+-------+-------+
| 1 | 2 | 3 |
+------+--------+-------+ */
Find the common neighbor of a given node and every other node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Find the common neighbors of node 1 and every other node in 'my_graph'
-- using the RAI engine 'my_other_rai_engine' and store the results
-- in the Snowflake table 'my_result_table'. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.common_neighbor(
'my_graph',
{
'node1': 1,
'rai_engine': 'my_other_rai_engine',
'result_table': '<sf_database>.<sf_schema>.my_result_table'
}
));
-- Query the results.
SELECT * FROM my_result_table;
/*+-----------------------+
| NODE1 | NODE2 | NODE3 |
+-------+-------+-------+
| 1 | 1 | 3 |
| 1 | 1 | 4 |
| 1 | 2 | 3 |
+------+--------+-------+ */
See Also
jaccard_similarity
,
cosine_similarity
,
preferential_attachment
, and
cosine_similarity
.
cosine_similarity
cosine_similarity(graph_name, arguments)
Compute the cosine similarity of pairs of nodes in a graph.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments
may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
tuples | Array (opens in a new tab) | Yes, but not together with node1 . | An array of arrays representing pairs of nodes. |
node1 | Int (opens in a new tab) / Array (opens in a new tab) | Yes, but not together with tuples . | A node, or an array of nodes. |
node2 | Int (opens in a new tab) / Array (opens in a new tab) | No. | A node, or an array of nodes. |
rai_engine | Varchar (opens in a new tab) | Yes, except if the engine is set via a RAI context. | The name of the RAI engine to use to execute the algorithm. |
result_table | Varchar (opens in a new tab) | No. | The fully qualified name of the Snowflake table in which to store results. |
Output
Type | Description | Schema |
---|---|---|
Table (opens in a new tab) | A table containing pairs of nodes and their cosine similarity value. | TABLE(node1 INT, node2 INT, score FLOAT) |
Explanation
Cosine similarity measures the similarity of two nodes
as the inner product of vector representations of the nodes’ neighborhoods.
The cosine_similarity
function returns a table with three columns
— node1
, node2
, and score
—
whose rows contain pairs of nodes and their cosine similarity value.
If no graph with the provided name exists, an error is returned.
Cosine similarity values range from -1 to 1, inclusive,
but rows where score
is zero are omitted from the results.
A score of zero indicates that two nodes are neither similar nor dissimilar.
Excluding zeros improves performance
and spares you from having to remove those rows in a post-processing step.
A RAI engine is required to execute the cosine_similarity
function.
You can either specify the engine in a
RAI context
or provide an engine name to the rai_engine
argument.
One of the node1
and tuples
arguments are required,
and they may not be used together.
There are several ways to use the cosine_similarity
function:
- Compute the cosine similarity of two nodes, or sets of nodes,
by passing node IDs, or arrays of node IDs, to the
node1
andnode2
arguments. You may compute the cosine similarity for multiple pairs of nodes by passing an array of pairs of node IDs to thetuples
argument. - Compute the cosine similarity of a node, or set of nodes,
and every other node in the graph, by passing a node ID, or array of node IDs,
to the
node1
argument.
Nodes that do not exist in the graph are ignored. The next section contains concrete examples.
Examples
Compute the cosine similarity of two nodes:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the cosine similarity of nodes 1 and 2
-- using the 'node1' and 'node2' arguments.
SELECT * FROM TABLE(RAI.cosine_similarity('my_graph', {'node1': 1, 'node2': 2}));
/*+-----------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+-------+
| 1 | 2 | 0.5 |
+-------+-------+-------+ */
-- Compute the cosine similarity of nodes 1 and 2 and nodes 1 and 3
-- using the 'tuples' argument.
SELECT * FROM TABLE(RAI.cosine_similarity('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+------------------------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+--------------------+
| 1 | 2 | 0.5 |
| 1 | 3 | 0.7071067811865475 |
+-------+-------+--------------------+ */
Compute the cosine similarity of a node and every node in a graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the cosine similarity of node 1 and every node in 'my_graph'.
SELECT * FROM TABLE(RAI.cosine_similarity('my_graph', {'node1': 1}));
/*+------------------------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+--------------------+
| 1 | 1 | 1.0 |
| 1 | 2 | 0.5 |
| 1 | 3 | 0.7071067811865475 |
+-------+-------+--------------------+ */
Compute the cosine similarity of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the cosine similarity of node 1 and every other node in 'my_graph'
-- using the RAI engine 'my_other_rai_engine' and store the results
-- in the Snowflake table 'my_result_table'. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.cosine_similarity(
'my_graph',
{
'node1': 1,
'rai_engine': 'my_other_rai_engine',
'result_table': '<sf_database>.<sf_schema>.my_result_table'
}
));
-- Query the results.
SELECT * FROM my_result_table;
/*+----------------------------------+
| COL1 | COL2 | COL3 |
+------+------+--------------------+
| 1 | 1 | 1.0 |
| 1 | 2 | 0.5 |
| 1 | 3 | 0.7071067811865475 |
+------+------+--------------------+ */
See Also
jaccard_similarity
,
preferential_attachment
,
adamic_adar
, and
common_neighbor
.
create_rai_database
create_rai_database(rai_db)
Create a database in RAI.
Parameters
Parameter | Type | Description |
---|---|---|
rai_db | Varchar (opens in a new tab) | RAI database name. |
Output
Type | Description |
---|---|
Variant (opens in a new tab) | Status of the creation request. Example:“ok”. |
Explanation
The function create_rai_database
creates a database in RAI.
Examples
SELECT RAI.create_rai_database('rai_db');
See Also
get_rai_database
and
delete_rai_database
.
create_rai_engine
create_rai_engine(rai_engine, size)
Create an engine in RAI.
Parameters
Parameter | Type | Description |
---|---|---|
rai_engine | Varchar (opens in a new tab) | RAI engine name. |
size | Varchar (opens in a new tab) | RAI engine size. Possible values are 'XS' , 'S' , 'M' , 'L' , 'XL' . |
Output
Type | Description |
---|---|
Variant (opens in a new tab) | Status of the create engine request. Example:“ok”. |
Explanation
The function create_rai_engine
creates an engine in RAI.
Examples
SELECT RAI.create_rai_engine('rai_engine', 'S');
See Also
get_rai_engine
and
delete_rai_engine
.
current_rai_database
current_rai_database()
Return the RAI database that is currently selected.
Output
Type | Description |
---|---|
Varchar (opens in a new tab) | The name of the currently selected RAI database. |
Explanation
The function current_rai_database
returns the currently selected RAI database.
The database to use is typically set by calling the use_rai_database
procedure.
If no database has been previously selected, the function returns NULL
.
Examples
SELECT RAI.current_rai_database();
/*+----------------------------+
| RAI.CURRENT_RAI_DATABASE() |
+----------------------------+
| rai_db |
+----------------------------+ */
See Also
current_rai_engine
current_rai_engine()
Return the RAI engine that is currently selected.
Output
Type | Description |
---|---|
Varchar (opens in a new tab) | The name of the currently selected RAI engine. |
Explanation
The function current_rai_engine
returns the currently selected RAI engine.
The engine to use is typically set by calling the use_rai_engine
procedure.
If no engine has been previously selected, the function returns NULL
.
Examples
SELECT RAI.current_rai_engine();
/*+--------------------------+
| RAI.CURRENT_RAI_ENGINE() |
+--------------------------+
| rai_engine |
+--------------------------+ */
See Also
degree
degree(graph_name)
degree(graph_name, arguments)
Return the degree of each node in a graph.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
node | Int (opens in a new tab) / Array (opens in a new tab) | No | Instruct degree to return only the rows in the output table whose node column contains the provided value(s). |
rai_engine | Varchar (opens in a new tab) | No | The name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context. |
result_table | Varchar (opens in a new tab) | No | The fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table. |
Output
Type | Description | Schema |
---|---|---|
Table (opens in a new tab) | A table representing pairs of neighbors in the graph. | TABLE(node1 INT, node2 INT) |
Explanation
The function degree('my_graph')
returns a table with two columns, node
and degree
,
whose rows represent the degrees of nodes in my_graph
.
If no graph with the provided name exists, an error is returned.
Note that in directed graphs, the degree of a node is the sum of its indegree and outdegree.
The degree
function requires a RAI engine.
You can either specify the engine in a
RAI context
or provide an engine name to the rai_engine
argument.
The next section contains concrete examples.
Examples
Get the degree of each node in an undirected graph with the RAI engine set in a RAI context:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Get the degrees of each node in 'my_graph'.
SELECT * FROM TABLE(RAI.degree('my_graph'));
/*+---------------+
| NODE | DEGREE |
+------+--------+
| 1 | 2 |
| 2 | 1 |
+------+--------+ */
Get the degree of each node in a directed graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create a directed graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges', {'directed': true});
-- Get the degrees of each node in 'my_graph'.
SELECT * FROM TABLE(RAI.degree('my_graph'));
/*+---------------+
| NODE | DEGREE |
+------+--------+
| 1 | 3 |
| 2 | 1 |
+------+--------+ */
Get the degree of a specific node in a graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Get the degree of node 2 in 'my_graph'.
SELECT * FROM TABLE(RAI.degree('my_graph', {'node': 2}));
/*+---------------+
| NODE | DEGREE |
+------+--------+
| 2 | 2 |
+------+--------+ */
Get the degree of multiple nodes simultaneously:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Get the degree of nodes 1 and 2 in 'my_graph'.
SELECT * FROM TABLE(RAI.degree('my_graph', {'node': [1, 2]}));
/*+---------------+
| NODE | DEGREE |
+------+--------+
| 1 | 1 |
| 2 | 2 |
+------+--------+ */
Get the degree of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Get the number of edges in 'my_graph' using the RAI engine 'my_other_rai_engine'
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT TABLE(RAI.degree(
'my_graph',
{
'rai_engine': 'my_other_rai_engine',
'result_table': '<sf_database>.<sf_schema>.my_result_table'
}
));
-- Query the results.
SELECT * FROM my_result_table;
/*+-------------+
| COL1 | COL2 |
+------+------+
| 1 | 2 |
| 2 | 1 |
+------+------+ */
See Also
min_degree
,
max_degree
,
average_degree
,
degree_histogram
, and
neighbor
.
degree_centrality
degree_centrality(graph_name)
degree_centrality(graph_name, arguments)
Compute the degree centrality of nodes in a graph.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments
may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
node | Int (opens in a new tab) / Array (opens in a new tab) | No | Instruct degree_centrality to return only the rows in the output table whose node column contains the provided value. |
rai_engine | Varchar (opens in a new tab) | No | The name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context. |
result_table | Varchar (opens in a new tab) | No | The fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table. |
Output
Type | Description | Schema |
---|---|---|
Table (opens in a new tab) | A table containing pairs of nodes and their degree centrality value. | TABLE(node INT, value FLOAT) |
Explanation
Degree centrality is a measure of the centrality, or importance, of a node in a graph based on its degree. Degree centrality is computed as the degree of a node divided by the number of nodes in the graph minus one. For simple graphs without loops this value is at most one. Graphs with loops may have nodes with degree centrality greater than one.
The degree_centrality('my_graph')
function returns a table with two columns
— node
and value
—
whose rows represent pairs of nodes in my_graph
and their degree centrality value.
If no graph with the provided name exists, an error is returned.
You may get the degree centrality of a single node
by providing the node ID to the node
argument.
If the specified node does not exist in the graph,
an empty table is returned.
The degree_centrality
function requires a RAI engine.
You can either specify the engine in a
RAI context
or provide an engine name to the rai_engine
argument.
The next section contains concrete examples.
Examples
Compute the degree centrality of each node in a graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the degree centrality of each node in 'my_graph'.
SELECT * FROM TABLE(RAI.degree_centrality('my_graph'));
/*+--------------+
| NODE | VALUE |
+------+-------+
| 1 | 0.5 |
| 2 | 1 |
| 3 | 0.5 |
+------+-------+ */
Compute the degree centrality of a single node:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the degree centrality of node 2.
SELECT * FROM TABLE(RAI.degree_centrality('my_graph', {'node': 2}));
/*+--------------+
| NODE | VALUE |
+------+-------+
| 2 | 1 |
+------+-------+ */
Compute the degree centrality of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the degree centrality of each node in 'my_graph'
-- using the RAI engine 'my_other_rai_engine' and store the results
-- in the Snowflake table `my_result_table`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.degree_centrality(
'my_graph',
{
'rai_engine': 'my_other_rai_engine',
'result_table': '<sf_database>.<sf_schema>.my_result_table'
}
));
-- Query the results.
SELECT * FROM my_result_table;
/*+-------------+
| COL1 | COL2 |
+------+------+
| 1 | 0.5 |
| 2 | 1 |
| 3 | 0.5 |
+------+------+ */
See Also
eigenvector_centrality
and
pagerank
.
degree_histogram
degree_histogram(graph_name)
degree_histogram(graph_name, arguments)
Count the number of nodes with each degree in a graph.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
degree | Int (opens in a new tab) / Array (opens in a new tab) | No | Instruct degree_histogram to return only the rows in the output table whose degree column contains the provided value. |
count | Int (opens in a new tab) / Array (opens in a new tab) | No | Instruct degree_histogram to return only the rows in the output table whose count column contains the provided value. |
rai_engine | Varchar (opens in a new tab) | No | The name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context. |
result_table | Varchar (opens in a new tab) | No | The fully qualified name of the Snowflake table to store results. If this argument is provided, the function returns an empty table. |
Output
Type | Description | Schema |
---|---|---|
Table (opens in a new tab) | A table representing the number of nodes with each degree. | TABLE(node1 INT, node2 INT) |
Explanation
The function degree_histogram('my_graph')
returns a table
with two columns, degree
and count
,
whose rows represent the number of nodes with each degree in the graph.
If no graph with the provided name exists, an error is returned.
Note that in directed graphs, the degree of a node is the sum of its indegree and outdegree.
The degree_histogram
function requires a RAI engine.
You can either specify the engine in a
RAI context
or
provide an engine name to the rai_engine
argument.
The next section contains concrete examples.
Examples
Count the number of nodes with each degree in a graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Count the number of nodes with each degree in 'my_graph'.
SELECT * FROM TABLE(RAI.degree_histogram('my_graph'));
/*+----------------+
| DEGREE | COUNT |
+--------+-------+
| 1 | 2 |
| 2 | 1 |
+--------+-------+ */
Count the number of nodes in a graph with a specific degree:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Count the number of nodes with degree 2 in 'my_graph'.
SELECT * FROM TABLE(RAI.degree_histogram('my_graph', {'degree': 2}));
/*+----------------+
| DEGREE | COUNT |
+--------+-------+
| 2 | 1 |
+--------+-------+ */
Find all degrees for which there is a specific number of nodes:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Count the number of nodes with degree 2 in 'my_graph'.
SELECT * FROM TABLE(RAI.degree_histogram('my_graph', {'count': 2}));
/*+----------------+
| DEGREE | COUNT |
+--------+-------+
| 1 | 2 |
+--------+-------+ */
Count the number of nodes in a graph with each degree using a different RAI engine than the one set in the RAI context, and store the results in a Snowflake table:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Count the number of nodes with each degree in 'my_graph'
-- using the RAI engine 'my_other_rai_engine' and store the results
-- in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT TABLE(RAI.degree_histogram(
'my_graph',
{
'rai_engine': 'my_other_rai_engine',
'result_table': '<sf_database>.<sf_schema>.my_result_table'
}
));
-- Query the results.
SELECT * FROM my_result_table;
/*+-------------+
| COL1 | COL2 |
+------+------+
| 1 | 2 |
| 2 | 1 |
+------+------+ */
See Also
min_degree
,
max_degree
,
average_degree
, and
degree
.
delete_rai_database
delete_rai_database(rai_db)
Delete a database in RAI.
Parameters
Parameter | Type | Description |
---|---|---|
rai_db | Varchar (opens in a new tab) | RAI database name. |
Output
Type | Description |
---|---|
Variant (opens in a new tab) | Status of the deletion request. Example:“ok”. |
Explanation
The function delete_rai_database
deletes a database in RAI.
Examples
SELECT RAI.delete_rai_database('rai_db');
See Also
delete_rai_engine
delete_rai_engine(rai_engine)
Delete an engine in RAI.
Parameters
Parameter | Type | Description |
---|---|---|
rai_engine | Varchar (opens in a new tab) | RAI engine name. |
Output
Type | Description |
---|---|
Variant (opens in a new tab) | Status of the deletion request. Example:“ok”. |
Explanation
The function delete_rai_engine
deletes an engine in RAI.
Examples
SELECT RAI.delete_rai_engine('rai_engine');
See Also
diameter_range
diameter_range(graph_name)
diameter_range(graph_name, arguments)
Estimate the diameter of a graph with lower and upper bounds.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
rai_engine | Varchar (opens in a new tab) | No | The name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context. |
result_table | Varchar (opens in a new tab) | No | The fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table. |
Output
Type | Description |
---|---|
Object (opens in a new tab) | A JSON object containing the lower and upper bounds for the diameter of the graph. |
{"min": 1, "max": 10}
Explanation
The diameter_range
function is used to determine
the range of possible diameter values for a graph.
If no graph with the provided name exists, an error is returned.
The graph’s diameter is estimated by selecting a number of random nodes in the graph and taking the maximum of all shortest path lengths from each selected node to the other nodes in the graph. This gives a range per node. Then, the intersection of the ranges is taken and the final range is returned.
The diameter_range
function requires a RAI engine.
You can either specify the engine in a
RAI context
or
provide an engine name to the rai_engine
argument.
The next section contains concrete examples.
Examples
Estimate the diameter of a graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Estimate the diameter of 'my_graph'.
SELECT RAI.diameter_range('my_graph');
/*+-----------------------+
| { "max": 2, "min": 2} |
+-----------------------+ */
Estimate the diameter of a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 2), (2, 3), (3, 4);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Estimate the diameter of 'my_graph' using the RAI engine 'my_other_rai_engine'
-- and store the results in the Snowflake table `my_result_table`.
-- Note that the name of the Snowflake table must be fully qualified.
SELECT RAI.diameter_range(
'my_graph',
{
'rai_engine': 'my_other_rai_engine',
'result_table': '<sf_database>.<sf_schema>.my_result_table'
}
);
-- Query the results.
SELECT * FROM my_result_table;
/*+-----------------------+
| COL1 |
+-----------------------+
| { "max": 2, "min": 2} |
+-----------------------+ */
See Also
eigenvector_centrality
eigenvector_centrality(graph_name)
eigenvector_centrality(graph_name, arguments)
Compute the eigenvector centrality of nodes in a graph.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments
may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
node | Int (opens in a new tab) / Array (opens in a new tab) | No | Instruct eigenvector_centrality to return only the rows in the output table whose node column contains the provided value. |
rai_engine | Varchar (opens in a new tab) | No | The name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context. |
result_table | Varchar (opens in a new tab) | No | The fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table. |
Output
Type | Description | Schema |
---|---|---|
Table (opens in a new tab) | A table containing pairs of nodes and their eigenvector centrality value. | TABLE(node INT, value FLOAT) |
Explanation
Eigenvector centrality measures a node’s importance in such a way that connections to more important nodes contribute more to a node’s score than connections to less important nodes. It is computed based on the eigenvector associated with the top eigenvalue of the graph’s adjacency matrix.
We use the
power method (opens in a new tab)
to compute the eigenvector in our implementation.
Note that the power method requires the adjacency matrix to be diagonalizable,
and will only converge if the largest positive eignevalue has multiplicity one.
If your graph does not meet either of these requirements,
the eigenvector_centrality
function will not converge.
The eigenvector_centrality('my_graph')
function returns a table with two columns
— node
and value
—
whose rows represent pairs of nodes in my_graph
and their eigenvector centrality value.
If no graph with the provided name exists, an error is returned.
You may get the eigenvector centrality of a single node
by providing the node ID to the node
argument.
If the specified node does not exist in the graph,
an empty table is returned.
The eigenvector_centrality
function requires a RAI engine.
You can either specify the engine in a
RAI context
or provide an engine name to the rai_engine
argument.
The next section contains concrete examples.
Examples
Compute the eigenvector centrality of each node in a graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the eigenvector centrality of each node in 'my_graph'.
SELECT * FROM TABLE(RAI.eigenvector_centrality('my_graph'));
/*+---------------------+
| NODE | VALUE |
+------+--------------+
| 1 | 0.4082482905 |
| 2 | 0.8164965809 |
| 3 | 0.4082482905 |
+------+--------------+ */
Compute the eigenvector centrality of a single node:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the eigenvector centrality of node 2.
SELECT * FROM TABLE(RAI.eigenvector_centrality('my_graph', {'node': 2}));
/*+---------------------+
| NODE | VALUE |
+------+--------------+
| 2 | 0.8164965809 |
+------+--------------+ */
Compute the eigenvector centrality of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the eigenvector centrality of each node in 'my_graph'
-- using the RAI engine 'my_other_rai_engine' and store the results
-- in the Snowflake table `my_result_table`. Note that the name of the
-- Snowflake table must be fully qualified.
SELECT TABLE(RAI.eigenvector_centrality(
'my_graph',
{
'rai_engine': 'my_other_rai_engine',
'result_table': '<sf_database>.<sf_schema>.my_result_table'
}
));
-- Query the results.
SELECT * FROM my_result_table;
/*+---------------------+
| COL1 | COL2 |
+------+--------------+
| 1 | 0.4082482905 |
| 2 | 0.8164965809 |
| 3 | 0.4082482905 |
+------+--------------+ */
See Also
degree_centrality
and
pagerank
.
exec
exec(rai_query)
exec(rai_db, rai_engine, rai_query, data, readonly)
Execute a query against the RAI database.
Parameters
Parameter | Type | Description |
---|---|---|
rai_db | Varchar (opens in a new tab) | RAI database name. |
rai_engine | Varchar (opens in a new tab) | RAI engine name. |
rai_query | Varchar (opens in a new tab) | The query that will be executed in RAI. |
data | Variant (opens in a new tab) | Data input for the rai_query . |
readonly | Boolean (opens in a new tab) | Whether or not the query is read-only. |
Output
Type | Description |
---|---|
Variant (opens in a new tab) | The query output information. |
/*+-----------------------------------------------+
| EXEC('DEF OUTPUT = {1; 2; 3}') |
+-----------------------------------------------+
| [ [ 1 ], [ 2 ], [ 3 ] ] |
+-----------------------------------------------+ */
Explanation
The function exec
executes in RAI the rai_query
passed as a parameter.
It uses the rai_db
database and the rai_engine
engine.
The data
parameter is used to provide input for the query.
The readonly
parameter specifies whether or not the rai_query
is read-only.
By default, readonly
is set to TRUE
.
Examples
SELECT RAI.exec('def output = {1; 2; 3}');
SELECT RAI.exec('rai_db', 'rai_engine', 'def output=foo', {'foo' : 'hello'}, TRUE);
See Also
exec_into
and create_rai_database
.
exec_into
exec_into(rai_query, sf_target)
exec_into(rai_db, rai_engine, sf_warehouse, sf_target, rai_query)
exec_into(rai_db, rai_engine, rai_query, data, readonly, sf_warehouse, sf_target)
Execute a query against the RAI database and insert the output into a target Snowflake table.
Parameters
Parameter | Type | Description |
---|---|---|
rai_db | Varchar (opens in a new tab) | RAI database name. |
rai_engine | Varchar (opens in a new tab) | RAI engine name. |
sf_warehouse | Varchar (opens in a new tab) | The selected Snowflake warehouse. |
sf_target | Varchar (opens in a new tab) | Target Snowflake table that will contain the query output. Note that the table specified by sf_target needs to be fully qualified, for example, database.schema.table . |
rai_query | Varchar (opens in a new tab) | The query that will be executed in RAI. |
data | Variant (opens in a new tab) | Data input for the rai_query . |
readonly | Boolean (opens in a new tab) | Whether or not the query is read-only. |
Output
Type | Description |
---|---|
Variant (opens in a new tab) | Status of the request. Example: “1 row(s) written”. |
Explanation
The function exec_into
executes the rai_query
passed as a parameter in RAI.
It uses the rai_db
database and the rai_engine
engine.
The output of the query is stored in the Snowflake table sf_target
.
The Snowflake warehouse is specified through sf_warehouse
.
Examples
CREATE OR REPLACE TABLE my_data(x INT);
SELECT RAI.exec_into('def output = {1; 2; 3}', 'my_db.rai.my_data');
CREATE OR REPLACE TABLE my_output(s VARCHAR);
SELECT RAI.exec_into(
'rai_db',
'rai_engine',
'def output=foo',
{'foo' : 'hello'},
TRUE,
'my_warehouse',
'my_db.rai.my_output'
);
See Also
exec
and create_rai_database
.
get_rai_database
get_rai_database(rai_db)
Get information about a RAI database.
Parameters
Parameter | Type | Description |
---|---|---|
rai_db | Varchar (opens in a new tab) | RAI database name. |
Output
Type | Description |
---|---|
Variant (opens in a new tab) | RAI database information as a JSON object. |
{
"account_name": "******",
"created_by": "******",
"created_on": "2023-05-31T16:05:05.131Z",
"id": "******",
"name": "rai_db",
"region": "us-east",
"state": "CREATED"
}
Explanation
The function get_rai_database
returns information about a RAI database.
Examples
SELECT RAI.get_rai_database('rai_db');
See Also
get_data_stream
get_data_stream(datasource)
Get the static information of a RAI data stream.
A RAI data stream is identified by the fully qualified object name (opens in a new tab)
of a SQL object, such as a table or view, in the form <database>.<schema>.<object>
.
Parameters
Parameter | Type | Description |
---|---|---|
datasource | Varchar (opens in a new tab) | The fully qualified name of the Snowflake data source that identifies the data stream. |
Output
Type | Description |
---|---|
Variant (opens in a new tab) | RAI data stream information as a JSON object. |
{
"account": "******",
"createdBy": "******",
"createdOn": "2023-05-31T11:24:09.710Z",
"dbLink": "sf_db.rai",
"id": "******",
"integration": "****",
"name": "sf_db.rai-sf_db.sf_schema.datasource",
"rai": {
"database": "rai_db",
"relation": "rai_baserelation"
},
"snowflake": {
"database": "sf_db",
"object": "sf_db.sf_schema.datasource",
"schema": "sf_schema"
},
"state": "CREATED"
}
Explanation
The function get_data_stream
returns the static information of a RAI data stream.
It includes the following fields:
JSON Field | Description |
---|---|
account | RAI account name. |
createdBy | Client who created the RAI data stream. |
createdOn | Creation time of the RAI data stream. |
dbLink | RAI database link that manages this RAI data stream. |
id | Internal identifier of the RAI data stream. |
integration | RAI integration that the RAI data stream is using. |
name | Identifying name of the RAI data stream. |
rai:database | RAI database that contains the target object of the RAI data stream. |
rai:relation | Name of the target object, which is a base relation. |
snowflake:database | Snowflake database that contains the source object of the RAI data stream. |
snowflake:object | Fully qualified name of the Snowflake object. |
snowflake:schema | Snowflake schema that holds the Snowflake object. |
state | Creation state of the RAI data stream. |
This information will not change over time, regardless of the RAI data stream’s state.
If the RAI data stream doesn’t exist, the function get_data_stream
returns NULL
.
To retrieve the current status of a RAI data stream and verify whether the data are synchronized, use get_data_stream_status
.
Examples
This is how to retrieve the static information of the RAI data stream that originates fromsf_db.sf_schema.sf_table
:
SELECT RAI.get_data_stream('sf_db.sf_schema.sf_table');
The query output will be a Snwoflake JSON object as described above.
See Also
get_data_stream_status
and
create_data_stream
.
get_rai_engine
get_rai_engine(rai_engine)
Get information about a RAI engine.
Parameters
Parameter | Type | Description |
---|---|---|
rai_engine | Varchar (opens in a new tab) | RAI engine name. |
Output
Type | Description |
---|---|
Variant (opens in a new tab) | RAI engine information as a JSON object. |
{
"account_name": "******",
"created_by": "******",
"created_on": "2023-05-25T23:04:29.000Z",
"id": "******",
"name": "rai_engine",
"region": "us-east",
"size": "S",
"state": "PROVISIONED"
}
Explanation
The function get_rai_engine
returns information about a RAI engine.
Examples
SELECT RAI.get_rai_engine('rai_engine');
See Also
get_graph
get_graph(graph_name)
Get information about a graph, if it exists.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
Output
Type | Description |
---|---|
Variant (opens in a new tab) | Graph information as a JSON object. |
{
"DIRECTED": false,
"EDGE_STREAM": "my_db.my_schema.my_edge_stream",
"NAME": "my_graph",
"RAI_DATABASE": "my_rai_database"
}
Explanation
The function get_graph
returns a JSON object with information about the graph,
including the name of the graph, whether or not the graph is directed,
the fully qualified name of the graph’s edge stream,
and the name of the RAI database in which the graph is stored.
Examples
SELECT RAI.get_graph('my_graph');
See Also
is_graph_created
and
list_graphs
.
is_connected
is_connected(graph_name)
is_connected(graph_name, arguments)
Check whether or not a graph is connected.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
rai_engine | Varchar (opens in a new tab) | No | The name of the RAI engine to use to execute the algorithm. Required if no engine has been set in a RAI context. |
result_table | Varchar (opens in a new tab) | No | The fully qualified name of the Snowflake table in which to store results. If this argument is provided, the function returns an empty table. |
Output
Type | Description |
---|---|
Boolean (opens in a new tab) | Whether or not the graph is connected. |
Explanation
Returns true
if the graph is connected and false
otherwise.
If no graph with the provided name exists, an error is returned.
The is_connected
function requires a RAI engine.
You can either specify the engine in a
RAI context
or
provide an engine name to the rai_engine
argument.
The next section contains concrete examples.
Examples
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with three nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Determine whether or not 'my_graph' is connected.
SELECT RAI.is_connected('my_graph');
/*+------+
| TRUE |
+------+ */
See Also
is_graph_created
is_graph_created(graph_name)
Check whether a graph called graph_name
exists in the RAI Integration.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
Output
Type | Description |
---|---|
Boolean (opens in a new tab) | Whether or not the graph exists. |
Explanation
Returns true
if a graph called graph_name
exists and false
if it does not.
Examples
SELECT RAI.is_graph_created('my_graph');
See Also
create_graph
,
get_graph
, and
list_graphs
.
jaccard_similarity
jaccard_similarity(graph_name, arguments)
Compute the Jaccard similarity of pairs of nodes in a graph.
Parameters
Parameter | Type | Description |
---|---|---|
graph_name | Varchar (opens in a new tab) | The name of the graph. |
arguments | Object (opens in a new tab) | A JSON object containing additional arguments. |
Supported Arguments
The following arguments
may be specified in the arguments
object:
Argument Name | Type | Required | Description |
---|---|---|---|
tuples | Array (opens in a new tab) | Yes, but not together with node1 . | An array of arrays representing pairs of nodes. |
node1 | Int (opens in a new tab) / Array (opens in a new tab) | Yes, but not together with tuples . | A node, or an array of nodes. |
node2 | Int (opens in a new tab) / Array (opens in a new tab) | No. | A node, or an array of nodes. |
rai_engine | Varchar (opens in a new tab) | Yes, except if the engine is set via a RAI context. | The name of the RAI engine to use to execute the algorithm. |
result_table | Varchar (opens in a new tab) | No. | The fully qualified name of the Snowflake table in which to store results. |
Output
Type | Description | Schema |
---|---|---|
Table (opens in a new tab) | A table containing pairs of nodes and their Jaccard similarity value. | TABLE(node1 INT, node2 INT, score FLOAT) |
Explanation
Jaccard similarity measures the similarity of two nodes
based on the number of neighbors common to both nodes.
The jaccard_similarity
function returns a table with three columns
— node1
, node2
, and score
—
whose rows contain pairs of nodes and their Jaccard similarity value.
If no graph with the provided name exists, an error is returned.
Jaccard similarity values range from 0 to 1, inclusive,
but rows where score
is zero are omitted from the results.
A score of zero indicates that two nodes are incomparable.
Excluding zeros improves performance and
spares you from having to remove those rows in a post-processing step.
A RAI engine is required to execute the jaccard_similarity
function.
You can either specify the engine in a
RAI context
or provide an engine name to the rai_engine
argument.
One of the node1
and tuples
arguments are required,
and they may not be used together.
There are several ways to use the jaccard_similarity
function:
- Compute the Jaccard similarity of two nodes, or sets of nodes,
by passing node IDs, or arrays of node IDs, to the
node1
andnode2
arguments. You may compute the Jaccard similarity for multiple pairs of nodes by passing an array of pairs of node IDs to thetuples
argument. - Compute the Jaccard similarity of a node, or set of nodes,
and every other node in the graph, by passing a node ID, or array of node IDs,
to the
node1
argument.
Nodes that do not exist in the graph are ignored. The next section contains concrete examples.
Examples
Compute the Jaccard similarity of two nodes in a graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the Jaccard similarity of nodes 1 and 2
-- using the 'node1' and 'node2' arguments.
SELECT * FROM TABLE(RAI.jaccard_similarity('my_graph', {'node1': 1, 'node2': 2}));
/*+------------------------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+--------------------+
| 1 | 2 | 0.3333333333333333 |
+-------+-------+--------------------+ */
-- Compute the Jaccard similarity of nodes 1 and 2 and nodes 1 and 3
-- using the 'tuples' argument.
SELECT * FROM TABLE(RAI.jaccard_similarity('my_graph', {'tuples': [[1, 2], [1, 3]]}));
/*+------------------------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+--------------------+
| 1 | 2 | 0.3333333333333333 |
| 1 | 3 | 0.5 |
+-------+-------+--------------------+ */
Compute the Jaccard similarity between a given node and every other node in a graph:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');
-- Create an undirected graph with two nodes and two edges.
CREATE TABLE my_edges(x INT, y INT) AS SELECT * FROM VALUES (1, 1), (1, 2), (2, 3);
CALL RAI.create_data_stream('my_edges');
CALL RAI.create_graph('my_graph', 'my_edges');
-- Compute the Jaccard similarity of node 1 and every other node in 'my_graph'.
SELECT * FROM TABLE(RAI.jaccard_similarity('my_graph', {'node1': 1}));
/*+------------------------------------+
| NODE1 | NODE2 | SCORE |
+-------+-------+--------------------+
| 1 | 1 | 1.0 |
| 1 | 2 | 0.3333333333333333 |
| 1 | 3 | 0.5 |
+-------+-------+--------------------+ */
Compute the Jaccard similarity of each node in a graph using a different RAI engine than the engine set in the RAI context, and store the results in a Snowflake table:
-- Set the RAI context.
CALL RAI.use_rai_database('my_rai_db');
CALL RAI.use_rai_engine('my_rai_engine');