Overview
Learn how to use and manage the schema mapping tools accessible with the RAI Integration for Snowflake.
Introduction
In order to create a graph from a Snowflake SQL table with the RAI Integration Services, you will need to create a mapping that maps the schema of your SQL table to a graph schema.
The schema mapping tools help you do this. They help you construct graphs representing relationships across tables and databases.
The schema mapping tools create a mapping between primary keys in your SQL table to unique graph node IDs. These node IDs are unique even when you have overlapping keys between different entities in your dataset. Composite keys are also supported.
You can think of schema mapping as a data preparation step for graph creation.
To create the graph, use the procedure create_graph
.
Schema mapping also ensures that the node IDs are integers.
This is important because many graph algorithms require integer node IDs, or yield significantly better performance when integer node IDs are used.
More details on graph workloads can be found in the Graph Analytics section.
Concepts
The concepts of entities and keys, objects and lookup tables are illustrated by the following example table:
Client_First_Name | Client_Last_Name | Product_Name | Product_Price |
---|---|---|---|
John | Smith | Scissors | 12.75 |
John | Smith | Crayons | 10.50 |
Lisa | Brown | Scissors | 12.75 |
The Schema Mapping tutorial uses the same example.
Entities and Keys
An entity describes something that exists in the real world that can be uniquely identified independently of its properties or other relationships. For example, Clients and Products are entities.
A key is a set of attributes whose values are unique for every object in the entity’s domain.
For example, a Client entity may have a key consisting of two attributes: First_Name
and Last_Name
.
The Product key consists of a single attribute: Name
.
The definition of entities and their keys forms an implicit contract that no two objects from the same entity share the same key value(s). Entity resolution (opens in a new tab) is beyond the scope of the schema mapping tools.
The RAI Integration Services only support situations where each entity has a unique key, which may consist of multiple attributes.
Objects
The RAI Integration Services represent entities using key-value dictionaries that are supported in Snowflake as elements of type OBJECT
(opens in a new tab), which is analogous to a JSON object.
The advantage of using OBJECT
is that entities can be stored in a SQL table with a fixed schema even though the number and type of entity keys that identify an entity may vary.
Here is a table of JSON objects, where each entry represents an entity.
OBJ |
---|
{‘type’: ‘Client’, ‘first_name’: ‘John’, ‘last_name’: ‘Smith’} |
{‘type’: ‘Client’, ‘first_name’: ‘Lisa’, ‘last_name’: ‘Brown’} |
{‘type’: ‘Product’, ‘name’: ‘Scissors’} |
{‘type’: ‘Product’, ‘name’: ‘Crayons’} |
Note that every dictionary contains both the identifying information and the type of entity to which it belongs.
The use of type
in the dictionaries is therefore reserved and no entity may use an attribute type as part of its key.
Lookup Tables
The lookup table holds the association between an object’s, or entity’s, integer ID and its identifying information, which includes the entity type and the entity keys.
You do not need access to this table directly.
Use the node
identifier and lookup
functions.
Initially, the lookup table is empty.
To use it, you need to explicitly populate it with create_lookup
.
Functions and Procedures
Entities
Function | Description |
---|---|
create_entity | Declare an entity. |
list_entities | Return a list of all available entities. |
drop_entity | Remove the declaration and corresponding data of a given entity from the system. |
drop_all_entities | Remove information about all entities from the system, resetting the schema map. |
Lookup Tables
Function | Description |
---|---|
create_lookup | Add an entity to the lookup table. |
list_lookups | Return a list of all lookups and entities currently covered by the lookup table. |
node | Compute the node ID corresponding to an object of a specific entity and identified by specific key values. |
lookup | Retrieve the object description corresponding to the given node ID. |
rebuild_lookup_table | Rebuild the lookup table by dropping and recreating all lookups. |
drop_lookup | Remove a specific lookup and all its corresponding entries from the lookup table. |
drop_all_lookups | Remove all lookups and entries from the lookup table. |