Snowflake: Snowplow Schema

Prerequisites:

The Snowplow Snowflake integration with Indicative is available for Enterprise customers only. To integrate with Snowplow Snowflake, you will need to access your Snowplow console.

 

Instructions:

  1. In Indicative, click on Settings and select Data Sources

    mceclip0.png

  2. Click on New Data Source

    mceclip1.png

  3. Select Connect via Data Warehouse or Lake
    mceclip0.png
  4. Select Snowflake as your data connection Snowplow as the connection schema and click Connect.
    mceclip0.png
  5. You should see this Snowflake + Snowplow Overview screen. Click Next.
    mceclip1.png


Connection Information

mceclip2.png

  1. Log in to your Snowflake account. 
  2. Enter the Account ID information into Indicative.
    mceclip2.png
    Account ID is everything to the left of .gcp.fnowflakecomputing.com/...
  3. Enter the Warehouse name.
    mceclip3.png
  4. Enter the Database name.
    mceclip4.png
  5. Click into Warehouses and copy the Schema.
    mceclip5.png
  6. Enter the Table name.mceclip6.png
  7. For Auto-Generated Password, we randomly generate a password for you to use. If you would like to create your own password, please replace the autofilled value in that field.

Grant Permissions 

mceclip3.png

  1. You will need to copy and paste these code snippets into your Snowflake console. The last snippet is optional.
  2. Navigate to the Worksheets tab and paste the snippets into the SQL runner, check the All Queries checkbox, and hit Run.
  3. Click Next to test your connection

Event Modeling

  1. In the Structured Event Name section, select the field that should be used to derive Indicative event names.
  2. For Timestamp, select the field that represents the time that the event was performed. If unsure, leave as derived_tstamp
  3. For Vendor Name, input the Snowplow vendor names used so we can simplify your event property names

User Identification (Aliasing)

For more information on User Identification (Aliasing), please refer to this article.

*NoteIf aliasing is not preferred, please set the Authenticated ID Type to None and press Next

  1. Select the Type for the Unauthenticated ID
    1. Atomic - This will allow you to choose between the domain_userid and network_userid fields that are part of the standard Snowplow event structure.
    2. Context - If the unauthenticated ID is part of a Snowplow context, choose this option. Enter the values for Vendor,Name,Version, and Field.
    3. Other - If the unauthenticated field is not either of the options, please specify where we can find the unauthenticated ID in the data.

  2. Select the Type for Authenticated ID
    1. Atomic - Enter the field name that should be used for known users. Typically, it is the user_id field in the raw enriched event archive data.
    2. Context - If the authenticated ID is part of a Snowplow context, choose this option. Enter the values for Vendor,Name,Version, and Field.
    3. Other - If the authenticated field is not either of the options, please specify where we can find the authenticated ID in the data.
    4. None - choose this option to skip aliasing.

Scheduling

  1. Select the Schedule Interval to adjust the frequency at which new data is available in Indicative.
  2. Set the Schedule Time for when the data should be extracted from your BigQuery environment. It is critical that 100% of the data is available by this time to avoid loading partial data.
  3. Select Next

Waiting for Data

mceclip3.png

Once you see this screen, you're all set! A Solutions Engineer will begin preparing your data, and you should be notified via email when ready.

Advanced Settings

For additional advanced settings such as excluding certain events and properties, please refer to this page

0 users found this helpful