Important: As of August 13, 2024, this page will no longer be actively maintained. Please refer to the current version of this content here.
Prerequisites:
The Snowplow Unified Log is stored in an S3 bucket, and you are required to write an IAM policy that grants Analytics programmatic access to that bucket.
If there are additional enrichments required, such as joining with user property tables or deriving custom user_ids, please contact us.
Instructions:
Adding a Data Source In Analytics
1. In Analytics, click on the gear icon and select Project Settings.
2. Select the Data Sources tab.
3. Select New Data Source.
4. Select Connect via Data Warehouse or Lake
5. Select S3 as your data connection and Snowplow as the connection schema and click Connect
6. You should see this S3 + Snowplow Overview screen. Click Next
Connection Information
- Sign in to the AWS Management Console and open your IAM console.
- Under the Services dropdown, select S3 under Storage.
- Click on the bucket that contains your Snowplow data.
- Enter the Bucket Name into the Analytics UI.
- Click on your bucket and refer to the bucket structure. Enter the path to your enriched data into the File Path field in the Analytics UI. In this example, the File Path to enter is /main/enriched/good
- Click Next
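If you already have the location of your enriched data as a full S3 URI, the two values the Analytics UI asks for can be read off it directly. The helper below is only an illustrative sketch: the function name `split_s3_uri` and the bucket name `example-snowplow-bucket` are hypothetical and are not part of Analytics or Snowplow.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an S3 URI into (bucket name, file path)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Drop any trailing slash so the path matches the form used in the example above.
    return parsed.netloc, parsed.path.rstrip("/")


# Hypothetical bucket name, used only for illustration.
bucket, file_path = split_s3_uri("s3://example-snowplow-bucket/main/enriched/good/")
print(bucket)     # value for the Bucket Name field
print(file_path)  # value for the File Path field
```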
Grant Permissions
- In this section, click the box that contains the policy to copy it to your clipboard. You will need it in step 4 of this section.
- Go back to the AWS Console. Select the bucket and click on the Permissions tab.
- Click on Bucket Policy
- Enter the copied policy from step 1 into the editor and click Save
- Click Next in Analytics.
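The policy you paste must be the one shown in the Analytics UI. For orientation only, the sketch below builds the general shape of a read-only S3 bucket policy so you know what to expect in the editor. Everything specific in it is an assumption: the function name, the bucket name, the principal ARN, and the choice of `s3:ListBucket` plus `s3:GetObject` are placeholders and may differ from what Analytics actually provides.

```python
import json


def build_read_only_bucket_policy(bucket_name: str, principal_arn: str) -> dict:
    """Return a generic read-only bucket policy (illustrative, not the Analytics policy)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "AllowListBucket",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "AllowGetObjects",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# Placeholder values; use the policy from the Analytics UI in practice.
policy = build_read_only_bucket_policy(
    "example-snowplow-bucket",
    "arn:aws:iam::123456789012:role/example-analytics-reader",
)
print(json.dumps(policy, indent=2))
```

Comparing the copied policy against this shape can help confirm that it names the bucket you selected in the previous section before you click Save.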
Event Modeling
- In the Structured Event Name section, select the field that should be used to derive Analytics event names. Our logic will first look at this field, and if this value is null, it will try to use the event_name field. If that value is also null, then we will look at the event field.
- se_action
- se_category
- se_label
- None - Select this option if you're not using Snowplow's structured events
- For Timestamp, select the field that represents the time that the event was performed. If unsure, leave as derived_tstamp
- For Vendor Name, input the Snowplow vendor names used so we can simplify your event property names
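The fallback order described for Structured Event Name can be sketched as follows. This is a minimal illustration of the documented order (selected field, then event_name, then event), not the actual Analytics implementation. The function name `derive_event_name` is hypothetical, and the sketch assumes each event is a dictionary in which a null value is represented as `None`.

```python
from typing import Optional


def derive_event_name(event: dict, structured_field: Optional[str]) -> Optional[str]:
    """Pick the event name using the documented fallback order."""
    candidates = []
    # structured_field is se_action, se_category, or se_label;
    # None models the "None" option for projects without structured events.
    if structured_field is not None:
        candidates.append(structured_field)
    candidates += ["event_name", "event"]

    for field in candidates:
        value = event.get(field)
        if value is not None:
            return value
    return None


# The selected field is populated, so it wins.
print(derive_event_name(
    {"se_action": "add_to_cart", "event_name": "event", "event": "struct"},
    "se_action",
))

# The selected field is null, so event_name is used instead.
print(derive_event_name(
    {"se_action": None, "event_name": "page_view", "event": "page_view"},
    "se_action",
))
```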
User Identification (Aliasing)
For more information on User Identification (Aliasing), please refer to this article.
Note: If you do not want to use aliasing, set the Authenticated ID Type to None and click Next.
- Select the Type for the Unauthenticated ID
- Atomic - This will allow you to choose between the domain_userid and network_userid fields that are part of the standard Snowplow event structure. We typically recommend domain_userid since this uses a 1st party cookie. Click here for more information.
- Context - If the unauthenticated ID is part of a Snowplow context, choose this option. Enter the values for Vendor, Name, Version, and Field.
- Other - If the unauthenticated ID is not covered by either of the options above, please specify where we can find it in the data.
- Select the Type for Authenticated ID
- Atomic - Enter the field name that should be used for known users. Typically, it is the user_id field in the raw enriched event archive data.
- Context - If the authenticated ID is part of a Snowplow context, choose this option. Enter the values for Vendor, Name, Version, and Field.
- Other - If the authenticated ID is not covered by either of the options above, please specify where we can find it in the data.
- None - choose this option to skip aliasing.
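To show how the Vendor, Name, Version, and Field values of the Context option relate to the data, the sketch below looks up an ID inside a list of Snowplow contexts. It assumes the contexts have already been parsed into self-describing JSON objects with `schema` and `data` keys, and that the schema is an Iglu URI of the form `iglu:vendor/name/jsonschema/version`. The function name `find_context_field`, the `com.acme` vendor, and the exact-version match are illustrative assumptions; how Analytics performs this lookup internally is not documented here.

```python
from typing import Optional


def find_context_field(
    contexts: list, vendor: str, name: str, version: str, field: str
) -> Optional[str]:
    """Return `field` from the first context whose schema matches, else None."""
    wanted = f"iglu:{vendor}/{name}/jsonschema/{version}"
    for context in contexts:
        if context.get("schema") == wanted:
            return context.get("data", {}).get(field)
    return None


# Hypothetical contexts attached to one event.
contexts = [
    {"schema": "iglu:com.acme/page/jsonschema/1-0-0", "data": {"title": "Home"}},
    {"schema": "iglu:com.acme/user/jsonschema/1-0-0", "data": {"customer_id": "u-42"}},
]

print(find_context_field(contexts, "com.acme", "user", "1-0-0", "customer_id"))
```

With the Atomic option no lookup of this kind is needed, because fields such as domain_userid, network_userid, and user_id sit at the top level of the event.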
Scheduling
- Select the Schedule Interval to adjust the frequency at which new data is available in Analytics.
- Set the Schedule Time for when the data should be extracted from your S3 bucket. It is critical that 100% of the data is available by this time to avoid loading partial data.
- Select Next
Waiting for Data
Advanced Settings
For additional advanced settings, such as excluding certain events and properties, please refer to this page.