Data I/O: Accessing the Cloud
This guide demonstrates how to interact with data using the supported cloud providers.
Goal
By following this guide, you will be able to load data from the cloud into the RKGS database, and export data from the database back to the cloud, using the different available options.
This guide complements the CSV Import/CSV Export and JSON Import/Export guides, where you can find all the relevant information for importing and exporting specific data types.
Cloud Storage Providers
Currently, Azure Blob Storage and Amazon S3 are supported.
S3 data must be in a bucket with public read access. For private data, use Azure Blob Storage.
Provider | URI prefix | Public/Private Data | Read Access | Write Access | Supported Regions |
---|---|---|---|---|---|
Azure Blob | `azure://` | public & private | ✅ | ✅ | us-east-1 (N. Virginia) |
AWS S3 | `s3://` | public only | ✅ | ❌ | us-east-1 (N. Virginia) |
Cloud Parameters
To interact with a cloud storage service, you need to define a module that specifies the data configuration. Two options describe how to access the cloud:

Option | Description |
---|---|
`path` | A string that specifies the location and name of the file you want to import or export. Currently, this can point to `azure://...` (Microsoft Azure) or `s3://...` (Amazon S3) URLs. |
`integration` | The credentials needed to access the data. |
Public Data
If the cloud storage container provides public read access (for example, this file), you only need to provide the file location in the configuration option `path`:
```rel
module my_config
    def path = "s3://relationalai-documentation-public/csv-import/simple-import-4cols.csv"
end
```
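A configuration module like this can then be handed to an import relation. As a minimal sketch, assuming the file is CSV and using the standard library's `load_csv` (covered in detail in the CSV Import guide):

```rel
// Load the public CSV file described by my_config.
def csv = load_csv[my_config]
```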
For security reasons, `http://` and `https://` URL addresses are not supported.
Private Data
To access private data, you need to specify cloud credentials in the `integration` option.

Currently, only private Azure Blob Storage is supported. To access it, you must provide a valid SAS token and URL address. The URL is provided via the `path` option.
Within the `integration` submodule, the following information needs to be provided:

- the cloud storage provider in the `provider` field, and
- the access token as a string in the `credentials` field, along with the token name (`:azure_sas_token`).
```rel
module config
    def path = "azure://myaccount.blob.core.windows.net/sascontainer/myfile.csv"
    module integration
        def provider = "azure"
        def credentials = (:azure_sas_token, raw"example%of%credentials")
    end
end
```
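Since Azure Blob Storage also supports write access, the same style of configuration can be used for export. Here is a hedged sketch, assuming the standard library's `export_csv` and a hypothetical output file name; see the CSV Export guide for how to attach the data to be exported:

```rel
module export_config
    // Hypothetical destination file in the same private container.
    def path = "azure://myaccount.blob.core.windows.net/sascontainer/output.csv"
    module integration
        def provider = "azure"
        def credentials = (:azure_sas_token, raw"example%of%credentials")
    end
end
```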
Note that the unescaped percent sign, `%`, is used for string interpolation within a Rel string. This means you need to store URLs and cloud credentials that contain `%` as raw strings, as in the example above.
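To illustrate the difference, here is a small sketch (the token value is made up): in an ordinary string literal, `%` starts an interpolation, so a SAS token containing percent signs must be written as a raw string:

```rel
// raw"..." keeps the percent signs literal instead of
// treating them as string interpolation (token is hypothetical).
def token = raw"sig=abc%2Fdef%3D"
```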
See Also
Now that you know how to access a cloud storage provider, you can check the I/O how-to guides to learn how to interact with data. See CSV Import and CSV Export for CSV data. If you are interested in JSON, see JSON Import/Export.