Skip to content

Prepare your data sources

Use this guide to make sure a Snowflake table or view is ready to stream into the RAI Native App. It covers supported object and column types, how to reshape unsupported VARIANT data into supported columns, how to enable change tracking, and what access-control constraints to check before you create a stream.

  • The RAI Native App is installed in your Snowflake account.
  • You have a Snowflake table or view that you want to use as a data source.

What Snowflake objects can be used as data sources

Section titled “What Snowflake objects can be used as data sources”

Use this table to check whether your Snowflake object type can be used as a stream source:

Snowflake object typeSupported?
Standard table
Standard view
Snowflake-managed Iceberg table Preview
Temporary table
Transient table
Dynamic table
External table
External view

Data streams support the following Snowflake column types:

If your table or view uses only the column types listed above, you can use it as a stream source after you enable change tracking. If it includes unsupported column types, reshape the data into supported columns before you create the stream.

Reshape VARIANT data into supported columns

Section titled “Reshape VARIANT data into supported columns”

Snowflake tables or views that contain VARIANT columns can’t be streamed directly into the RAI Native App. To use JSON data in your models, expand it into supported column types before you create a data stream.

Use this table to choose the right reshaping pattern:

What to useWhen to use it
Flatten into a viewUse when each source row should stay as one output row and JSON fields can become scalar columns.
Normalize into a child tableUse when a JSON array must become multiple rows that you want to query independently.

Use this pattern when you want to keep a one-to-one mapping between rows in the raw table and rows in the streamable source. This works well when array values can be safely collapsed into a scalar representation such as a comma-separated string.

For example, consider the following purchase-events table with a VARIANT column:

-- Source table with a VARIANT column
CREATE OR REPLACE TABLE PURCHASE_EVENTS_RAW (
event_id NUMBER,
order_id NUMBER,
customer_id NUMBER,
occurred_at TIMESTAMP_NTZ,
EVENT_DATA VARIANT
);
-- Sample rows with nested JSON data
INSERT INTO PURCHASE_EVENTS_RAW
SELECT 1, 501, 10001, '2025-07-01 10:15:00'::TIMESTAMP_NTZ,
PARSE_JSON('{"channel":"web","status":"submitted","region":"NA","items":["SKU-123","SKU-456"]}')
UNION ALL
SELECT 2, 502, 10002, '2025-07-01 10:20:00'::TIMESTAMP_NTZ,
PARSE_JSON('{"channel":"mobile","status":"shipped","region":"EMEA","items":["SKU-789"]}');

Follow these steps to create a view with supported columns:

  1. Create a flat view with supported columns

    Extract the JSON fields into scalar columns that the stream can use:

    CREATE OR REPLACE VIEW PURCHASE_EVENTS_CLEAN AS
    SELECT
    event_id,
    order_id,
    customer_id,
    occurred_at,
    EVENT_DATA:channel::STRING AS channel,
    EVENT_DATA:status::STRING AS status,
    EVENT_DATA:region::STRING AS region,
    ARRAY_TO_STRING(EVENT_DATA:items, ',') AS items
    FROM PURCHASE_EVENTS_RAW;
    • EVENT_DATA:channel::STRING, EVENT_DATA:status::STRING, and EVENT_DATA:region::STRING extract scalar values from the VARIANT payload and cast them to supported STRING columns.
    • ARRAY_TO_STRING(EVENT_DATA:items, ',') converts the JSON array in items into a single supported scalar value.
    • The resulting PURCHASE_EVENTS_CLEAN view keeps one row per source record, but replaces the unsupported VARIANT column with streamable columns.
  2. Enable change tracking on the table and view

    The underlying table for the view must also have change tracking enabled:

    ALTER TABLE PURCHASE_EVENTS_RAW SET CHANGE_TRACKING = TRUE;
    ALTER VIEW PURCHASE_EVENTS_CLEAN SET CHANGE_TRACKING = TRUE;
  3. Verify the reshaped view

    Query the view and confirm that the JSON fields now appear as supported scalar columns:

    SELECT * FROM PURCHASE_EVENTS_CLEAN;
    Output
    +--------+-----------+--------+---------------------+------------------------------------+--------------------+---------+---------------+
    | EVENT_ID| ORDER_ID | CUSTOMER_ID| OCCURRED_AT | CHANNEL | STATUS | REGION | ITEMS |
    |--------+-----------+--------+---------------------+------------------------------------+--------------------+---------+---------------|
    | 1 | 501 | 10001 | 2025-07-01 10:15:00 | web | submitted | NA | SKU-123,SKU-456 |
    | 2 | 502 | 10002 | 2025-07-01 10:20:00 | mobile | shipped | EMEA | SKU-789 |
    +--------+-----------+--------+---------------------+------------------------------------+--------------------+---------+---------------+

    PURCHASE_EVENTS_CLEAN uses supported column types and can be used by the RAI Native App.

Use this pattern when your JSON data includes arrays that need to be represented as multiple rows. The child table becomes the streamable source object.

For example, consider the following purchase-events table with a VARIANT column:

-- Source table with a VARIANT column
CREATE OR REPLACE TABLE PURCHASE_EVENTS_RAW (
event_id NUMBER,
order_id NUMBER,
customer_id NUMBER,
occurred_at TIMESTAMP_NTZ,
EVENT_DATA VARIANT
);
-- Sample rows with nested JSON data
INSERT INTO PURCHASE_EVENTS_RAW
SELECT 1, 501, 10001, '2025-07-01 10:15:00'::TIMESTAMP_NTZ,
PARSE_JSON('{"channel":"web","status":"submitted","region":"NA","items":["SKU-123","SKU-456"]}')
UNION ALL
SELECT 2, 502, 10002, '2025-07-01 10:20:00'::TIMESTAMP_NTZ,
PARSE_JSON('{"channel":"mobile","status":"shipped","region":"EMEA","items":["SKU-789"]}');

Follow these steps to create a child table with supported columns:

  1. Create a child table with one row per array value

    Use LATERAL FLATTEN to expand the JSON array into streamable rows:

    CREATE OR REPLACE TABLE PURCHASE_EVENT_ITEMS AS
    SELECT
    p.event_id,
    p.order_id,
    p.customer_id,
    p.occurred_at,
    item.value::STRING AS item
    FROM PURCHASE_EVENTS_RAW AS p,
    LATERAL FLATTEN(INPUT => p.EVENT_DATA:items) AS item;
    • LATERAL FLATTEN(INPUT => p.EVENT_DATA:items) expands the unsupported JSON array into one result row per array element.
    • item.value::STRING AS item casts each expanded JSON value to a supported STRING column.
    • The parent row metadata stays attached through event_id, order_id, customer_id, and occurred_at, so each purchased item can be streamed as its own row.
  2. Enable change tracking on the child table

    Turn on change tracking for the reshaped source:

    ALTER TABLE PURCHASE_EVENT_ITEMS SET CHANGE_TRACKING = TRUE;
  3. Verify the child table

    Query the child table and confirm that each array value is now stored in its own row:

    SELECT * FROM PURCHASE_EVENT_ITEMS;
    Output
    +--------+-----------+--------+---------------------+---------+
    | EVENT_ID| ORDER_ID | CUSTOMER_ID| OCCURRED_AT | ITEM |
    |--------+-----------+--------+---------------------+---------|
    | 1 | 501 | 10001 | 2025-07-01 10:15:00 | SKU-123 |
    | 1 | 501 | 10001 | 2025-07-01 10:15:00 | SKU-456 |
    | 2 | 502 | 10002 | 2025-07-01 10:20:00 | SKU-789 |
    +--------+-----------+--------+---------------------+---------+

    PURCHASE_EVENT_ITEMS uses supported column types and can be used by the RAI Native App.

The RAI Native App uses Snowflake’s native change tracking feature to capture changes to tables and views and synchronize them with PyRel semantic models built on top of those sources. You must enable change tracking on each source object before PyRel can use it as a source.

Follow these steps to enable change tracking on a Snowflake table:

  1. Check whether change tracking is already enabled

    Use SHOW TABLES and inspect the CHANGE_TRACKING column:

    SHOW TABLES LIKE 'MyTable';

    If change tracking is enabled, the CHANGE_TRACKING column is set to ON.

  2. Enable change tracking if needed

    Turn on change tracking for the table:

    ALTER TABLE MyTable SET CHANGE_TRACKING = TRUE;
  3. Verify the setting

    Run the same SHOW TABLES check again and confirm that CHANGE_TRACKING is now ON:

    SHOW TABLES LIKE 'MyTable';