Configure data sync behavior

Use these settings to control how PyRel handles Snowflake-backed source data at query time. This guide explains when to change the defaults to balance fresher data, faster query startup, automatic change tracking, schema validation, and query timeouts.

  • You have access to a Snowflake account with the RelationalAI Native App installed. If you are unsure, contact your Snowflake administrator.
  • You have a working PyRel installation. See Set Up Your Environment for instructions.

These settings apply during the Submit a job step in the PyRel workflow:

  1. Load and validate configuration
  2. Build model and query
  3. Submit RelationalAI job
  4. Run job with reasoners
  5. Materialize results

When your model reads Snowflake tables or views, the RAI Native App accesses that data through data streams maintained by the CDC service. These settings control query-time sync behavior, but they do not enable or repair CDC for you.

The settings on this page control whether PyRel waits for synchronized data, how fresh that data must be, whether PyRel validates the synchronized schema, whether it tries to enable change tracking on sources for you, and how long Snowflake queries can run.

Use this table to find the setting to change when you need to adjust one of those behaviors:

| Setting | Default | What it controls | Change it when |
| --- | --- | --- | --- |
| Stream synchronization | true | Whether PyRel waits for the data stream to catch up to the latest source data before it runs your query. This assumes the CDC service is already healthy and processing changes. | You prefer lower latency over waiting for the latest synchronized data. |
| Freshness tolerance | None | Lets PyRel evaluate a query when the source data is recent enough, even if synchronization is not fully caught up. This applies only when data.wait_for_stream_sync is true. | You want recent enough data without requiring full catch-up every time. |
| Column type checking | true | Whether PyRel validates that synchronized source columns still match the types your model expects. | You need to iterate faster on your model without waiting for full type validation. |
| Change tracking | false | Whether PyRel tries to enable change tracking on source tables or views that data stream synchronization depends on. This helps with source setup, but it does not enable or repair the CDC service. | You want PyRel to enable change tracking for eligible source tables or views automatically. |
| Query timeout | None | Lets PyRel set an explicit timeout for Snowflake queries. | You want a shorter configured timeout than Snowflake’s default timeout of 24 hours. |
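
To connect the setting names in the table with the config keys used in the rest of this guide, here is a minimal sketch that records the documented defaults in a plain dataclass. `DataSyncSettings` is a hypothetical stand-in written for illustration; it is not a class that PyRel exposes, and only the field names and default values come from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSyncSettings:
    """Hypothetical stand-in for the data.* settings; not PyRel's own class."""

    wait_for_stream_sync: bool = True           # Stream synchronization
    data_freshness_mins: Optional[int] = None   # Freshness tolerance
    check_column_types: bool = True             # Column type checking
    ensure_change_tracking: bool = False        # Change tracking
    query_timeout_mins: Optional[int] = None    # Query timeout


# The documented defaults.
defaults = DataSyncSettings()

# Example override: prefer lower latency over waiting for the latest data.
low_latency = DataSyncSettings(wait_for_stream_sync=False)
```

Writing the defaults down this way makes it easier to see which values you are actually overriding in raiconfig.yaml.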

Each section shows how to configure the setting in raiconfig.yaml or programmatically in Python.

Enable or disable stream synchronization before query execution

Set data.wait_for_stream_sync to control whether PyRel waits for streams to synchronize before running queries. It is enabled by default. Disable it when you prefer lower latency and can tolerate stale results.

  1. Set data.wait_for_stream_sync in raiconfig.yaml:

    connections:
      # ...
    data:
      wait_for_stream_sync: false
  2. Inspect the parsed config:

    from relationalai.semantics import Model
    m = Model("MyModel")
    print(m.config.data.wait_for_stream_sync)

    If this prints False, PyRel loaded the setting you configured. If it still prints True, check that you updated the config source this model is actually using.

Set the data freshness threshold for stream synchronization

Set an optional freshness threshold when you want PyRel to accept data that is fresh enough without requiring full stream synchronization. It is unset by default. This threshold applies only when PyRel still waits for stream synchronization. If you set data.data_freshness_mins above 30240, PyRel clamps it to 30240 minutes (3 weeks) when it loads the config and emits a warning.

  1. Set data.data_freshness_mins in raiconfig.yaml:

    connections:
      # ...
    data:
      data_freshness_mins: 5
  2. Inspect the parsed config:

    from relationalai.semantics import Model
    m = Model("MyModel")
    print(m.config.data.data_freshness_mins)

    If this prints 5, PyRel loaded the freshness threshold you configured.

  • data.data_freshness_mins works with data.wait_for_stream_sync, not instead of it.
  • If you disable data.wait_for_stream_sync, PyRel does not wait for synchronization at all, so it never enforces the freshness threshold before running queries.
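
The clamping rule and the interaction between the two settings can be sketched in plain Python. This is an illustration of the behavior described above, not PyRel's implementation: the function names are hypothetical, and `stream_lag_mins` is an assumed measure of how far the data stream is behind the source.

```python
import warnings
from typing import Optional

MAX_FRESHNESS_MINS = 30240  # 3 weeks, the cap described in this section


def clamp_freshness_mins(value: Optional[int]) -> Optional[int]:
    """Illustrative clamp: cap the threshold at 30240 minutes and warn."""
    if value is not None and value > MAX_FRESHNESS_MINS:
        warnings.warn(
            f"data_freshness_mins={value} exceeds {MAX_FRESHNESS_MINS}; clamping"
        )
        return MAX_FRESHNESS_MINS
    return value


def can_run_query(
    wait_for_stream_sync: bool,
    data_freshness_mins: Optional[int],
    stream_lag_mins: float,
) -> bool:
    """Hypothetical decision model for when a query may start."""
    if not wait_for_stream_sync:
        # No waiting at all, so the freshness threshold is never enforced.
        return True
    if data_freshness_mins is None:
        # No threshold set: require the stream to be fully caught up.
        return stream_lag_mins == 0
    # Threshold set: data that is recent enough is acceptable.
    return stream_lag_mins <= data_freshness_mins
```

In this model, a stream that is 3 minutes behind blocks the query when no threshold is set, but passes with a threshold of 5.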

Enable or disable column type checking during stream synchronization

Set data.check_column_types to control whether PyRel validates Snowflake source column types for synchronized data before it runs your query. It is enabled by default. Disable it only when you deliberately accept the risk of type mismatches reaching later stages.

  1. Set data.check_column_types in raiconfig.yaml:

    connections:
      # ...
    data:
      check_column_types: false
  2. Inspect the parsed config:

    from relationalai.semantics import Model
    m = Model("MyModel")
    print(m.config.data.check_column_types)

    If this prints False, PyRel loaded the relaxed schema-validation setting you configured.
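
To make the risk concrete, the following sketch shows the kind of mismatch that column type checking is meant to catch. It is a simplified illustration, not PyRel's validation logic; the function name, column names, and type names are all hypothetical.

```python
from typing import Dict, List, Tuple


def find_type_mismatches(
    expected: Dict[str, str], actual: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (column, expected_type, actual_type) for each differing column."""
    mismatches = []
    for column, expected_type in expected.items():
        # A column that no longer exists is reported as "<missing>".
        actual_type = actual.get(column, "<missing>")
        if actual_type.upper() != expected_type.upper():
            mismatches.append((column, expected_type, actual_type))
    return mismatches


# Hypothetical example: the model expects a timestamp, the source now has text.
expected = {"ORDER_ID": "NUMBER", "PLACED_AT": "TIMESTAMP_NTZ"}
actual = {"ORDER_ID": "NUMBER", "PLACED_AT": "VARCHAR"}
mismatches = find_type_mismatches(expected, actual)
```

With check_column_types disabled, a difference like the one on PLACED_AT would not be reported before the query runs.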

Enable or disable automatic change tracking

Change tracking must be enabled on Snowflake tables and views before PyRel can create and synchronize data streams. Set data.ensure_change_tracking to control whether PyRel tries to enable Snowflake change tracking automatically. This setting is disabled by default.

Enable this when you want PyRel to try to turn on Snowflake change tracking for eligible source tables or views automatically. Leave it off when your team manages change tracking outside PyRel, or when your Snowflake role does not have permission to alter those tables or views. For most users, the safe default is to leave this off unless missing change tracking is a recurring setup problem in your Snowflake environment.

  1. Set data.ensure_change_tracking in raiconfig.yaml:

    connections:
      # ...
    data:
      ensure_change_tracking: true
  2. Inspect the parsed config:

    from relationalai.semantics import Model
    m = Model("MyModel")
    print(m.config.data.ensure_change_tracking)

    If this prints True, PyRel loaded ensure_change_tracking: true from the raiconfig.yaml file you expected.
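
If your team manages change tracking outside PyRel, the Snowflake statements involved are `ALTER TABLE ... SET CHANGE_TRACKING = TRUE` and `ALTER VIEW ... SET CHANGE_TRACKING = TRUE`. The sketch below only builds those statements as strings so that someone with the right privileges can review and run them. The helper name and the object names are hypothetical, and this is not necessarily what PyRel runs on your behalf.

```python
def change_tracking_statement(object_name: str, kind: str = "TABLE") -> str:
    """Build the Snowflake statement that enables change tracking on a source."""
    kind = kind.upper()
    if kind not in ("TABLE", "VIEW"):
        raise ValueError(f"kind must be TABLE or VIEW, got {kind!r}")
    return f"ALTER {kind} {object_name} SET CHANGE_TRACKING = TRUE;"


# Hypothetical fully qualified source names.
statements = [
    change_tracking_statement("SALES_DB.PUBLIC.ORDERS"),
    change_tracking_statement("SALES_DB.PUBLIC.ACTIVE_ORDERS", kind="view"),
]
```

The helper does not quote or validate identifiers, so pass only names you trust.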

Set the query timeout for Snowflake queries

Set data.query_timeout_mins when you want to set an explicit timeout for Snowflake queries from your model. If you do not set it, Model.config.data.query_timeout_mins stays None after PyRel loads your config.

  1. Set data.query_timeout_mins in raiconfig.yaml:

    connections:
      # ...
    data:
      query_timeout_mins: 10
  2. Inspect the parsed config:

    from relationalai.semantics import Model
    m = Model("MyModel")
    print(m.config.data.query_timeout_mins)

    If this prints 10, PyRel loaded the timeout you configured. If it prints None, PyRel did not load that field from the config source you expected.
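
The "Inspect the parsed config" steps on this page all follow the same pattern, so they can be combined into one check. The sketch below compares several fields at once and reports the ones that did not load as expected. The helper name is hypothetical, and a `SimpleNamespace` stands in for `m.config.data` so the example runs without a Snowflake connection.

```python
from types import SimpleNamespace
from typing import Any, Dict


def find_unexpected_values(data_config: Any, expected: Dict[str, Any]) -> Dict[str, Any]:
    """Return {field: loaded_value} for each field that differs from expected."""
    return {
        field: getattr(data_config, field, None)
        for field, want in expected.items()
        if getattr(data_config, field, None) != want
    }


# Stand-in for m.config.data; in practice you would pass m.config.data itself.
loaded = SimpleNamespace(
    wait_for_stream_sync=True,
    data_freshness_mins=5,
    query_timeout_mins=None,
)
expected = {
    "wait_for_stream_sync": False,
    "data_freshness_mins": 5,
    "query_timeout_mins": 10,
}
unexpected = find_unexpected_values(loaded, expected)
```

A non-empty result usually means the model is reading a different config source than the one you edited.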