Configure data sync behavior
Use these settings to control how PyRel handles Snowflake-backed source data at query time. This guide explains when to change the defaults to balance fresher data, faster query startup, automatic change tracking, schema validation, and query timeouts.
- You have access to a Snowflake account with the RelationalAI Native App installed. If you are unsure, contact your Snowflake administrator.
- You have a working PyRel installation. See Set Up Your Environment for instructions.
What data sync configuration covers
These settings apply during the Submit a job step in the PyRel workflow.
When your model reads Snowflake tables or views, the RAI Native App accesses that data through data streams maintained by the CDC service. These settings control query-time sync behavior, but they do not enable or repair CDC for you.
The settings on this page control whether PyRel waits for synchronized data, how fresh that data must be, whether PyRel validates the synchronized schema, whether it tries to enable change tracking on sources for you, and how long Snowflake queries can run.
Use this table to find the setting to change when you need to adjust one of those behaviors:
| Setting | Default | What it controls | Change it when |
|---|---|---|---|
| Stream synchronization | true | Whether PyRel waits for the data stream to catch up to the latest source data before it runs your query. This assumes the CDC service is already healthy and processing changes. | You prefer lower latency over waiting for the latest synchronized data. |
| Freshness tolerance | None | Lets PyRel evaluate a query when the source data is recent enough, even if synchronization is not fully caught up. This applies only when data.wait_for_stream_sync is true. | You want recent enough data without requiring full catch-up every time. |
| Column type checking | true | Whether PyRel validates that synchronized source columns still match the types your model expects. | You need to iterate faster on your model without waiting for full type validation. |
| Change tracking | false | Whether PyRel tries to enable change tracking on source tables or views that data stream synchronization depends on. This helps with source setup, but it does not enable or repair the CDC service. | You want PyRel to enable change tracking for eligible source tables or views automatically. |
| Query timeout | None | Lets PyRel set an explicit timeout for Snowflake queries. | You want a shorter configured timeout than Snowflake’s default timeout of 24 hours. |
Each section shows how to configure the setting in raiconfig.yaml or programmatically in Python.
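As a quick reference, the defaults in the table can be mirrored in a small Python sketch. The DataSyncDefaults dataclass and the overrides helper below are hypothetical names written for this illustration and are not part of PyRel. Only the field names and default values come from this page.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DataSyncDefaults:
    """Hypothetical mirror of the defaults in the table (not PyRel's DataConfig)."""

    wait_for_stream_sync: bool = True
    data_freshness_mins: Optional[int] = None  # examples on this page use whole minutes
    check_column_types: bool = True
    ensure_change_tracking: bool = False
    query_timeout_mins: Optional[int] = None


def overrides(**changes):
    """Return only the settings whose values differ from the documented defaults."""
    defaults = asdict(DataSyncDefaults())
    unknown = set(changes) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown data settings: {sorted(unknown)}")
    return {name: value for name, value in changes.items() if value != defaults[name]}


# check_column_types=True matches the default, so it is dropped from the result.
print(overrides(wait_for_stream_sync=False, check_column_types=True))
```

The resulting dict uses the same keys as the dict-based examples on this page, so it can serve as a checklist of what you actually changed.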
Enable or disable stream synchronization before query execution
Set data.wait_for_stream_sync to control whether PyRel waits for streams to synchronize before running queries.
It is enabled by default.
Disable it when you prefer lower latency and can tolerate stale results.
- Set data.wait_for_stream_sync in raiconfig.yaml:

      connections:
        # ...
      data:
        wait_for_stream_sync: false

- Inspect the parsed config:

      from relationalai.semantics import Model

      m = Model("MyModel")
      print(m.config.data.wait_for_stream_sync)

  If this prints False, PyRel loaded the setting you configured. If it still prints True, check that you updated the config source this model is actually using.
Set data.wait_for_stream_sync using DataConfig or a plain Python dict.
These examples assume you already have a valid connection configured through file discovery.
- Set the value with a typed config class:

      from relationalai.config import DataConfig, create_config
      from relationalai.semantics import Model

      cfg = create_config(data=DataConfig(wait_for_stream_sync=False))
      m = Model("MyModel", config=cfg)
      print(m.config.data.wait_for_stream_sync)

  If this prints False, PyRel loaded the typed config object as expected.

- Alternatively, set the value with a dict:

      from relationalai.config import create_config
      from relationalai.semantics import Model

      cfg = create_config(data={"wait_for_stream_sync": False})
      m = Model("MyModel", config=cfg)
      print(m.config.data.wait_for_stream_sync)

  If this prints False, PyRel loaded the dict-backed config object as expected.
Set the data freshness threshold for stream synchronization
Set an optional freshness threshold when you want PyRel to accept data that is fresh enough without requiring full stream synchronization.
It is unset by default.
This threshold applies only when PyRel still waits for stream synchronization.
If you set data.data_freshness_mins above 30240, PyRel clamps it to 30240 minutes (3 weeks) when it loads the config and emits a warning.
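The clamping rule can be pictured with a minimal sketch. The function below is a hypothetical illustration of the behavior described above, not PyRel's actual loader code, and the warning text is invented for the example. It also confirms that 30240 minutes is exactly 3 weeks.

```python
import warnings

# 3 weeks x 7 days x 24 hours x 60 minutes = 30240 minutes
MAX_FRESHNESS_MINS = 3 * 7 * 24 * 60


def clamp_freshness_mins(value):
    """Illustrative only: cap a freshness threshold at 30240 minutes and warn."""
    if value is None:
        return None  # an unset threshold stays unset
    if value > MAX_FRESHNESS_MINS:
        warnings.warn(
            f"data_freshness_mins={value} exceeds {MAX_FRESHNESS_MINS}; "
            f"using {MAX_FRESHNESS_MINS} instead."
        )
        return MAX_FRESHNESS_MINS
    return value


print(clamp_freshness_mins(5))      # within the limit, returned unchanged
print(clamp_freshness_mins(50000))  # above the limit, clamped with a warning
```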
- Set data.data_freshness_mins in raiconfig.yaml:

      connections:
        # ...
      data:
        data_freshness_mins: 5

- Inspect the parsed config:

      from relationalai.semantics import Model

      m = Model("MyModel")
      print(m.config.data.data_freshness_mins)

  If this prints 5, PyRel loaded the freshness threshold you configured.
Set data.data_freshness_mins using DataConfig or a plain Python dict.
These examples assume you already have a valid connection configured through file discovery.
- Set the freshness threshold with a typed config class:

      from relationalai.config import DataConfig, create_config
      from relationalai.semantics import Model

      cfg = create_config(data=DataConfig(data_freshness_mins=5))
      m = Model("MyModel", config=cfg)
      print(m.config.data.data_freshness_mins)

  If this prints 5, PyRel loaded the typed config object as expected.

- Alternatively, set the freshness threshold with a dict:

      from relationalai.config import create_config
      from relationalai.semantics import Model

      cfg = create_config(data={"data_freshness_mins": 5})
      m = Model("MyModel", config=cfg)
      print(m.config.data.data_freshness_mins)

  If this prints 5, PyRel loaded the dict-backed config object as expected.
- data.data_freshness_mins works with data.wait_for_stream_sync, not instead of it.
- If you disable wait_for_stream_sync, PyRel does not wait long enough to enforce the freshness threshold before running queries.
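The way the two settings combine can be summarized in a short sketch. The describe_sync_wait helper is hypothetical and only restates the rules on this page in code. It does not call PyRel or reflect its internal logic.

```python
from typing import Optional


def describe_sync_wait(wait_for_stream_sync: bool, data_freshness_mins: Optional[int]) -> str:
    """Hypothetical helper: summarize how the two settings combine, per this page."""
    if not wait_for_stream_sync:
        # Without waiting for stream sync, the freshness threshold is not enforced.
        return "run without waiting; freshness threshold not enforced"
    if data_freshness_mins is None:
        return "wait for full stream synchronization"
    return f"wait, but accept data within the {data_freshness_mins}-minute freshness threshold"


print(describe_sync_wait(True, None))
print(describe_sync_wait(True, 5))
print(describe_sync_wait(False, 5))
```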
Enable or disable column type checking during stream synchronization
Set data.check_column_types to control whether PyRel validates Snowflake source column types for synchronized data before it runs your query.
It is enabled by default.
Disable it only when you deliberately accept the risk of type mismatches reaching later stages.
- Set data.check_column_types in raiconfig.yaml:

      connections:
        # ...
      data:
        check_column_types: false

- Inspect the parsed config:

      from relationalai.semantics import Model

      m = Model("MyModel")
      print(m.config.data.check_column_types)

  If this prints False, PyRel loaded the relaxed schema-validation setting you configured.
Set data.check_column_types using DataConfig or a plain Python dict.
These examples assume you already have a valid connection configured through file discovery.
- Set the value with a typed config class:

      from relationalai.config import DataConfig, create_config
      from relationalai.semantics import Model

      cfg = create_config(data=DataConfig(check_column_types=False))
      m = Model("MyModel", config=cfg)
      print(m.config.data.check_column_types)

  If this prints False, PyRel loaded the typed config object as expected.

- Alternatively, set the value with a dict:

      from relationalai.config import create_config
      from relationalai.semantics import Model

      cfg = create_config(data={"check_column_types": False})
      m = Model("MyModel", config=cfg)
      print(m.config.data.check_column_types)

  If this prints False, PyRel loaded the dict-backed config object as expected.
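To make the risk of disabling this check concrete, here is a conceptual sketch of the kind of mismatch that column type checking is meant to catch. It is not PyRel's validation logic. The find_type_mismatches helper, the column names, and the expected and actual type mappings are all made up for illustration.

```python
def find_type_mismatches(expected, actual):
    """Return {column: (expected_type, actual_type)} for columns whose types differ."""
    mismatches = {}
    for column, expected_type in expected.items():
        actual_type = actual.get(column)  # None when the column is missing entirely
        if actual_type != expected_type:
            mismatches[column] = (expected_type, actual_type)
    return mismatches


# Hypothetical example: the model expects STATUS to be text, but the source now holds numbers.
expected = {"ORDER_ID": "NUMBER", "STATUS": "VARCHAR"}
actual = {"ORDER_ID": "NUMBER", "STATUS": "NUMBER"}
print(find_type_mismatches(expected, actual))
```

With checking disabled, a drift like the STATUS column above would only surface later, for example as unexpected query results.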
Enable or disable automatic change tracking
Change tracking must be enabled on Snowflake tables and views before PyRel can create and synchronize data streams.
Set data.ensure_change_tracking to control whether PyRel tries to enable Snowflake change tracking automatically.
This setting is disabled by default.
Enable this when you want PyRel to try to turn on Snowflake change tracking for eligible source tables or views automatically. Leave it off when your team manages change tracking outside PyRel, or when your Snowflake role does not have permission to alter those tables or views. For most users, the safe default is to leave this off unless missing change tracking is a recurring setup problem in your Snowflake environment.
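If your team manages change tracking outside PyRel, the statements involved look roughly like the ones this sketch generates. It uses Snowflake's ALTER TABLE and ALTER VIEW syntax with SET CHANGE_TRACKING = TRUE. The change_tracking_statement helper and the object names are hypothetical, and this page does not say whether PyRel issues these exact statements itself.

```python
def change_tracking_statement(object_name: str, kind: str = "TABLE") -> str:
    """Build a Snowflake statement that enables change tracking on a table or view.

    The name is interpolated as-is, so pass only trusted, fully qualified names.
    """
    kind = kind.upper()
    if kind not in {"TABLE", "VIEW"}:
        raise ValueError(f"kind must be 'TABLE' or 'VIEW', got {kind!r}")
    return f"ALTER {kind} {object_name} SET CHANGE_TRACKING = TRUE;"


# Hypothetical source objects; replace with your own names.
print(change_tracking_statement("MY_DB.MY_SCHEMA.ORDERS"))
print(change_tracking_statement("MY_DB.MY_SCHEMA.ORDERS_V", kind="view"))
```

Running statements like these requires a Snowflake role with permission to alter the objects, which is the same constraint that applies when PyRel tries to enable change tracking for you.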
- Set data.ensure_change_tracking in raiconfig.yaml:

      connections:
        # ...
      data:
        ensure_change_tracking: true

- Inspect the parsed config:

      from relationalai.semantics import Model

      m = Model("MyModel")
      print(m.config.data.ensure_change_tracking)

  If this prints True, PyRel loaded ensure_change_tracking: true from the raiconfig.yaml file you expected.
Set data.ensure_change_tracking using DataConfig or a plain Python dict.
These examples assume you already have a valid connection configured through file discovery.
- Set the value with a typed config class:

      from relationalai.config import DataConfig, create_config
      from relationalai.semantics import Model

      cfg = create_config(data=DataConfig(ensure_change_tracking=True))
      m = Model("MyModel", config=cfg)
      print(m.config.data.ensure_change_tracking)

  If this prints True, the model is using the config object you passed into Model.

- Alternatively, set the value with a dict:

      from relationalai.config import create_config
      from relationalai.semantics import Model

      cfg = create_config(data={"ensure_change_tracking": True})
      m = Model("MyModel", config=cfg)
      print(m.config.data.ensure_change_tracking)

  If this prints True, PyRel loaded the dict-backed config object as expected.
Set the query timeout
Set data.query_timeout_mins when you want to set an explicit timeout for Snowflake queries from your model.
If you do not set it, Model.config.data.query_timeout_mins stays None after PyRel loads your config.
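A small sketch can show what an unset value means in practice. The effective_timeout helper is hypothetical. It assumes, based only on the table on this page, that an unset timeout leaves Snowflake's 24-hour default in effect.

```python
from datetime import timedelta
from typing import Optional

# 24 hours, the Snowflake default cited in the table on this page.
SNOWFLAKE_DEFAULT_TIMEOUT_MINS = 24 * 60


def effective_timeout(query_timeout_mins: Optional[int]) -> timedelta:
    """Hypothetical helper: the timeout that applies for a given config value."""
    if query_timeout_mins is None:
        # Assumption: an unset value falls back to the 24-hour default.
        return timedelta(minutes=SNOWFLAKE_DEFAULT_TIMEOUT_MINS)
    return timedelta(minutes=query_timeout_mins)


print(effective_timeout(None))
print(effective_timeout(10))
```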
- Set data.query_timeout_mins in raiconfig.yaml:

      connections:
        # ...
      data:
        query_timeout_mins: 10

- Inspect the parsed config:

      from relationalai.semantics import Model

      m = Model("MyModel")
      print(m.config.data.query_timeout_mins)

  If this prints 10, PyRel loaded the timeout you configured. If it prints None, PyRel did not load that field from the config source you expected.
Set data.query_timeout_mins using DataConfig or a plain Python dict.
These examples assume you already have a valid connection configured through file discovery.
- Set the query timeout with a typed config class:

      from relationalai.config import DataConfig, create_config
      from relationalai.semantics import Model

      cfg = create_config(data=DataConfig(query_timeout_mins=10))
      m = Model("MyModel", config=cfg)
      print(m.config.data.query_timeout_mins)

  If this prints 10, PyRel loaded the typed config object as expected.

- Alternatively, set the query timeout with a dict:

      from relationalai.config import create_config
      from relationalai.semantics import Model

      cfg = create_config(data={"query_timeout_mins": 10})
      m = Model("MyModel", config=cfg)
      print(m.config.data.query_timeout_mins)

  If this prints 10, PyRel loaded the dict-backed config object as expected.
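The individual print checks on this page can be folded into one comparison. The find_unexpected_settings helper below is a hypothetical sketch. It uses a SimpleNamespace as a stand-in for m.config.data so that it runs without a Snowflake connection, and it assumes only that the loaded settings are readable as attributes, as the examples on this page show.

```python
from types import SimpleNamespace


def find_unexpected_settings(data_config, expected):
    """Return {name: (expected, actual)} for settings that did not load as expected."""
    unexpected = {}
    for name, want in expected.items():
        got = getattr(data_config, name, None)  # None if the attribute is absent
        if got != want:
            unexpected[name] = (want, got)
    return unexpected


# Stand-in for m.config.data; the values here are invented for the example.
loaded = SimpleNamespace(
    wait_for_stream_sync=True,
    data_freshness_mins=5,
    query_timeout_mins=None,
)
expected = {
    "wait_for_stream_sync": False,
    "data_freshness_mins": 5,
    "query_timeout_mins": 10,
}
print(find_unexpected_settings(loaded, expected))
```

With a real model, pass m.config.data in place of the stand-in. Any entry in the result points to a setting that PyRel did not load from the config source you expected.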