Schema Mapping Tutorial
This tutorial provides an introduction to schema mapping with the RAI Integration for Snowflake.
Goal
This tutorial introduces schema mapping with a few simple steps. You will learn how to create and list entities and how to create and list lookup tables.
Data
Consider an example where clients buy products, as represented in this Purchase
table:
Client_First_Name | Client_Last_Name | Product_Name | Product_Price |
---|---|---|---|
John | Smith | Scissors | 12.75 |
John | Smith | Crayons | 10.50 |
Lisa | Brown | Scissors | 12.75 |
Based on this table, you can construct a graph whose nodes represent clients and products, with edges indicating that a particular client has purchased a specific product.
The first two columns, Client_First_Name
and Client_Last_Name
, are strings that identify clients, while the third column is a string identifying a product.
The last column, Product_Price
, is identified by a float.
The last column is not used in the example.
Not all of the data in your SQL table will be part of the schema mapping.
Here is a graph representation of the data you are interested in:
The identifying attributes of the objects (Clients and Products) cannot be used to create a graph as they are.
They are not identified by integers, which is required by the create_graph
procedure.
They also differ in arity.
Clients are identified by pairs of strings, while Products are identified by single strings.
This is solved by defining a bidirectional mapping between objects (using their identifying information) and nodes (identified by unique integer values).
Schema Mapping
Create and List Entities
Once your data is set up in a Snowflake table, use CREATE_ENTITY
to declare entities.
In the example, the entities are Client
and Product
.
The key representing the entity Client
is the combination of first_name
and last_name
.
The entity Product
has a single key: name
.
-- Create the entity `'Client'`.
CALL RAI.CREATE_ENTITY('Client', ['first_name', 'last_name']);
/*+-----------------------------------------------------------------------+
| "messages": ["Added entity Client with key [first_name, last_name].]" |
+-----------------------------------------------------------------------+ */
-- Create the entity `'Product'`.
CALL RAI.CREATE_ENTITY('Product', 'name');
/*+-------------------------------------------------------+
| "messages" :["Added entity Product with key [name].]" |
+-------------------------------------------------------+ */
By default, the functions and procedures in the SQL Graph Library for Snowflake are contained in the RAI
schema in your Snowflake database.
Note that this schema name may vary depending on how your administrator set up the RAI integration.
Use the tabular function LIST_ENTITIES()
to view a table listing the declared entities:
-- View a list of all the entities that have been declared and their keys.
SELECT * FROM TABLE(RAI.LIST_ENTITIES());
/*+-------------+-------------------------------+
| ENTITY_NAME | ENTITY_KEY |
|-------------+-------------------------------+
| Client | [ "first_name", "last_name" ] |
| Product | [ "name" ] |
+-------------+-------------------------------+ */
Create and View Lookup Tables
Use CREATE_LOOKUP
to create a lookup table.
The lookup table holds the association between integer values assigned to particular objects and their identifying information.
In the example, it consists of two parts: one with clients and one with products.
Each part is loaded with a separate CREATE_LOOKUP
call:
-- Create a lookup table for the entity `'Client'`.
CALL RAI.CREATE_LOOKUP('Client', 'Purchase', ['Client_First_Name', 'Client_Last_Name']);
/*+---------------------------------------------------------------------------------------------------------------------------------------------+
| "Loaded lookup table part for entity Client on table COMMERCE_DB.ONLINE_SHOP.PURCHASE with columns [CLIENT_FIRST_NAME, CLIENT_LAST_NAME].", |
| "Added lookup table scan for entity Client on table COMMERCE_DB.ONLINE_SHOP.PURCHASE with columns [CLIENT_FIRST_NAME, CLIENT_LAST_NAME]." , | |
+---------------------------------------------------------------------------------------------------------------------------------------------+ */
-- Create a lookup table for the entity `'Product'`.
CALL RAI.CREATE_LOOKUP('Product', 'Purchase', 'Product_Name');
/*+-----------------------------------------------------------------------------------------------------------------------+
| "Loaded lookup table part for entity Product on table COMMERCE_DB.ONLINE_SHOP.PURCHASE with columns [PRODUCT_NAME].", |
| "Added lookup table scan for entity Product on table COMMERCE_DB.ONLINE_SHOP.PURCHASE with columns [PRODUCT_NAME].", | |
+-----------------------------------------------------------------------------------------------------------------------+ */
Use the tabular function LIST_LOOKUPS()
to view a table listing the lookup tables:
-- List the lookup tables.
SELECT * FROM TABLE(RAI.LIST_LOOKUPS());
/*+----------------+-------------+------------+-------------+-------------------------------------------------+
| DATABASE_NAME | SCHEMA_NAME | TABLE_NAME | ENTITY_NAME | ENTITY_KEY_ATTRS |
|----------------+-------------+------------+-------------+-------------------------------------------------+
| MY_DATABASE | MY_SCHEMA | PURCHASE | Client | [ "CLIENT_FIRST_NAME", "CLIENT_LAST_NAME" ] |
| MY_DATABASE | MY_SCHEMA | PURCHASE | Product | [ "PRODUCT_NAME" ] |
+----------------+-------------+------------+-------------+-------------------------------------------------+ */
Mapping Objects to Node Identifiers
Use the scalar function NODE
to define an edge set to be used for graph creation:
-- Define an edge set.
CREATE VIEW Edge(src, dst) AS (
SELECT RAI.NODE('Client', [Client_First_Name, Client_Last_Name]) as src, RAI.NODE('Product', Product_Name:VARIANT) as dst
FROM Purchases
);
/*+--------------------------------+
| View EDGE successfully created |
+--------------------------------+ */
The version of NODE
that accepts a single key attribute value expects this value to be of type VARIANT
.
Most basic types are automatically coerced to VARIANT
, but others may need explicit casting.
For more information, see the Snowflake documentation (opens in a new tab).
Mapping Node Identifiers to Objects
Use the scalar function LOOKUP
to get the the object’s key information for the integer of a node that has been previously loaded into the lookup table.
-- Get the object key information for the result.
SELECT RAI.LOOKUP(-5527614564291079873);
/* +------------------------------+
| LOOKUP(-5527614564291079873) |
|------------------------------|
| { |
| "first_name": "John", |
| "last_name": "Smith", |
| "type": "Client" |
| } |
+------------------------------+ */
Alternatively you could use RAI.LOOKUP_TABLE
:
SELECT obj FROM RAI.LOOKUP_TABLE WHERE node_id = -5527614564291079873
Clean Up (Optional)
When you no longer need the graph, remove it from RAI and clean up the lookup table by running DROP_ALL_ENTITIES
.
If you intend to run multiple algorithms on the same graph at different moments in time, it is not necessary to clean up and recreate the lookup table. Both the RAI graph and the lookup table are maintained automatically. Updates to the source tables are automatically propagated to the graph and the lookup table.
Summary
In this tutorial, you learned how to create and list entities and how to create and list lookup tables. This allows you to map an object to a node and vice versa.