Grant RudderStack the required permissions on your data warehouse.
RudderStack supports Snowflake, Redshift, Databricks, and BigQuery for creating unified user profiles.
To read and write data to the warehouse, RudderStack requires specific warehouse permissions as explained in the following sections.
It is recommended to keep separate schemas for projects run via the CLI and the web app. This way, projects run from the CLI never risk overwriting your production data.
Snowflake
Snowflake uses a combination of DAC and RBAC models for access control. However, RudderStack uses an RBAC-based access control mechanism, as multiple users can launch the Profile Builder CLI and it is not ideal to tie the result of an individual run to that user. Hence, it is recommended to create a generic role (for example, PROFILES_ROLE) with the following privileges:
Read access to all the inputs to the model (can be shared in case of multiple schemas/tables).
Write access to the schemas and common tables as the PB project creates material (output) tables.
If you want to access any material created from the project run, the role (PROFILES_ROLE) must also have read access to all of those schemas.
Below are some sample commands that grant the required privileges to the role (PROFILES_ROLE) in a Snowflake warehouse:
-- Create role
CREATE ROLE PROFILES_ROLE;
SHOW ROLES; -- To validate
-- Create user
CREATE USER PROFILES_TEST_USER PASSWORD='<StrongPassword>' DEFAULT_ROLE='PROFILES_ROLE';
SHOW USERS; -- To validate
-- Grant role to user and database
GRANT ROLE PROFILES_ROLE TO USER PROFILES_TEST_USER;
GRANT USAGE ON DATABASE YOUR_RUDDERSTACK_DB TO ROLE PROFILES_ROLE;
-- Create separate schema for Profiles and grant privileges to role
CREATE SCHEMA YOUR_RUDDERSTACK_DB.RS_PROFILES;
GRANT ALL PRIVILEGES ON SCHEMA YOUR_RUDDERSTACK_DB.RS_PROFILES TO ROLE PROFILES_ROLE;
GRANT USAGE ON WAREHOUSE RUDDER_WAREHOUSE TO ROLE PROFILES_ROLE;
GRANT USAGE ON SCHEMA YOUR_RUDDERSTACK_DB.EVENTSSCHEMA TO ROLE PROFILES_ROLE;
To give access to the input sources, you can either grant SELECT on individual tables/views or grant blanket access to everything in a schema.
-- Assuming we want read access to all tables/views in schema EVENTSSCHEMA
GRANT SELECT ON ALL TABLES IN SCHEMA YOUR_RUDDERSTACK_DB.EVENTSSCHEMA TO ROLE PROFILES_ROLE;
GRANT SELECT ON FUTURE TABLES IN SCHEMA YOUR_RUDDERSTACK_DB.EVENTSSCHEMA TO ROLE PROFILES_ROLE;
GRANT SELECT ON ALL VIEWS IN SCHEMA YOUR_RUDDERSTACK_DB.EVENTSSCHEMA TO ROLE PROFILES_ROLE;
GRANT SELECT ON FUTURE VIEWS IN SCHEMA YOUR_RUDDERSTACK_DB.EVENTSSCHEMA TO ROLE PROFILES_ROLE;
-- Assuming we want read access to only the tracks and identifies tables in schema EVENTSSCHEMA
GRANT SELECT ON TABLE YOUR_RUDDERSTACK_DB.EVENTSSCHEMA.TRACKS TO ROLE PROFILES_ROLE;
GRANT SELECT ON TABLE YOUR_RUDDERSTACK_DB.EVENTSSCHEMA.IDENTIFIES TO ROLE PROFILES_ROLE;
Redshift
Suppose the inputs/edge sources are in a single schema website_eventstream and the name of the newly created Profiles user is rudderstack_admin. In this case, the requirements are as follows:
A separate schema rs_profiles (to store all the common and output tables).
The rudderstack_admin user should have all the privileges on the above schema and the associated tables.
The rudderstack_admin user should have the USAGE privilege on schemas containing the edge sources and input tables (website_eventstream), as well as read (SELECT) privileges on the specific tables. These privileges may also extend to the migration schema and any other schemas that receive warehouse data.
The rudderstack_admin user should have privileges to use the plpythonu language, which RudderStack uses to create some UDFs.
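The requirements above can be sketched in SQL. This is a minimal, illustrative set of grants assuming the rs_profiles schema, the website_eventstream input schema, and the rudderstack_admin user named earlier; adapt the names to your warehouse.

```sql
-- Dedicated schema for Profiles outputs, owned by the Profiles user
CREATE SCHEMA rs_profiles;
GRANT ALL ON SCHEMA rs_profiles TO rudderstack_admin;
GRANT ALL ON ALL TABLES IN SCHEMA rs_profiles TO rudderstack_admin;

-- Read access to the input/edge-source schema
GRANT USAGE ON SCHEMA website_eventstream TO rudderstack_admin;
GRANT SELECT ON ALL TABLES IN SCHEMA website_eventstream TO rudderstack_admin;

-- Allow the user to create UDFs with plpythonu
GRANT USAGE ON LANGUAGE plpythonu TO rudderstack_admin;
```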
RudderStack supports the following input types for Redshift warehouse/serverless:
Redshift cluster with DC2 type nodes, with the following types as inputs:
Redshift internal tables
External schema and tables only for inputs (not supported as output)
CSV files stored on S3 as inputs
Redshift cluster with RA3 type nodes, with the following types as inputs:
Redshift internal tables
External schema and tables only for inputs (not supported as output)
CSV files stored on S3 as inputs
Cross DB input tables
Redshift serverless, with the following types as inputs:
Redshift internal tables
External schema and tables (not supported as output)
CSV files stored on S3 as inputs
Cross DB input tables
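For the S3-backed and external-table inputs listed above, the external schema must exist before the Profiles user can read from it. A sketch using Redshift Spectrum, where the Glue database name, schema name, and IAM role ARN are placeholders rather than values from the source:

```sql
-- Map an external schema onto an AWS Glue catalog database (hypothetical names)
CREATE EXTERNAL SCHEMA website_events_ext
FROM DATA CATALOG
DATABASE 'spectrum_db'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftSpectrumRole';

-- External tables are read-only; usable as Profiles inputs but not outputs
GRANT USAGE ON SCHEMA website_events_ext TO rudderstack_admin;
```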
RudderStack also supports various mechanisms to authenticate the user running the Profiles project. Refer to the Redshift warehouse connection documentation for more information.
Databricks
Open the Databricks UI.
Create a new user.
Reuse an existing catalog or create a new one by clicking Create Catalog.
Grant USE SCHEMA privilege on the catalog.
Create a separate schema to write objects created by RudderStack Profiles.
Grant all privileges on this schema.
Grant privileges to access the relevant schemas containing the input tables. For example, if your inputs reside in a schema named website_eventstream, you can grant blanket access to the whole schema or only to the specific tables/views referred to in your Profiles project.
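The Databricks steps above can be sketched in Unity Catalog SQL. The catalog, schema, and user names here are placeholders, not values from your workspace:

```sql
-- Let the Profiles user see the catalog and read the input schema
GRANT USE CATALOG ON CATALOG your_catalog TO `profiles_user@example.com`;
GRANT USE SCHEMA ON SCHEMA your_catalog.website_eventstream TO `profiles_user@example.com`;
GRANT SELECT ON SCHEMA your_catalog.website_eventstream TO `profiles_user@example.com`;

-- Dedicated output schema with full privileges for Profiles objects
CREATE SCHEMA IF NOT EXISTS your_catalog.rs_profiles;
GRANT ALL PRIVILEGES ON SCHEMA your_catalog.rs_profiles TO `profiles_user@example.com`;
```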
BigQuery
Select the dataset from your schema to which you want to grant access.
In the Dataset info window, click SHARING > Permissions.
Click Add Principal.
Enter the service account email in the New principals field.
In the Assign roles section, select the following roles from the role list:
BigQuery Data Viewer: Allows read access to the dataset.
BigQuery Data Editor: Allows read and write access to the dataset.
Alternatively, you can add the following IAM policy binding:
{
  "bindings": [
    {
      "role": "roles/bigquery.dataViewer",
      "members": [
        "serviceAccount:your-service-account@your-project.iam.gserviceaccount.com"
      ],
      "condition": {
        "title": "Access to dataset1",
        "description": "Allow data viewing access to dataset1",
        "expression": "resource.name.startsWith('projects/your-project-id/datasets/dataset1')"
      }
    },
    {
      "role": "roles/bigquery.dataEditor",
      "members": [
        "serviceAccount:your-service-account@your-project.iam.gserviceaccount.com"
      ],
      "condition": {
        "title": "Access to dataset2",
        "description": "Allow data editing access to dataset2",
        "expression": "resource.name.startsWith('projects/your-project-id/datasets/dataset2')"
      }
    }
  ]
}
If your input relations span many datasets in a single project, you can assign the required privileges at the Google Cloud project level. This allows your service account to access relations from all the datasets of the project.
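Dataset-level access can also be granted with BigQuery's SQL DCL instead of the console or an IAM policy binding. A sketch using the same hypothetical service account and dataset names as the policy above:

```sql
-- Read access to the input dataset
GRANT `roles/bigquery.dataViewer`
ON SCHEMA `your-project-id.dataset1`
TO "serviceAccount:your-service-account@your-project.iam.gserviceaccount.com";

-- Read/write access to the output dataset
GRANT `roles/bigquery.dataEditor`
ON SCHEMA `your-project-id.dataset2`
TO "serviceAccount:your-service-account@your-project.iam.gserviceaccount.com";
```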