How to Prepare Your Data for Product Analytics

Determine Goals and Use Cases

Setting out goals from the start aligns internal stakeholders—including engineers and business users—which ensures a more cooperative process later when you integrate your data into Indicative. 

The first step is to identify your use cases for a customer analytics platform. What are you trying to learn? What are you trying to achieve? Some example use cases may include identifying:

  • Where people drop off in the signup flow.
  • Feature usage before and after upgrades.
  • Engagement patterns and customer behavior leading up to churn.
  • The customer experience and journey between various marketing initiatives and revenue-focused conversion goals.
  • Drop-off points that lead to churn or lost revenue.
  • Performance of A/B testing variants.
  • Sign-up flow performance and drop-offs.
  • Marketing initiatives that lead to upgrades.
  • Engagement patterns throughout the product and marketing experience.

Identify Data Requirements

The next step is to identify the data points you need. What information do you need to achieve the goals outlined in step one? First, identify your metrics. Then determine what data you’ll need to calculate them, including both the user actions and their associated metadata. For example, based on the sample goals in step one, you would need to track:

  • Step-by-step signup flows, along with device and UTM information.
  • Feature usage (such as clicks and error messages).
  • Upgrade campaign events (for example, email click-throughs and CTAs, mobile push notifications, or in-app notifications).
  • User flows for upgrades and downgrades.
  • User segments based on behaviors (this can be done through segmentation).
  • User data such as subscription type and location.
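
One way to keep goals and data requirements connected is a small tracking plan that lists, for each goal, the events and properties it depends on. The sketch below is only an illustration: the goal keys, event names, and property names are hypothetical and are not part of any Indicative schema.

```python
# Hypothetical tracking plan: each goal maps to the events and
# properties it needs. All names here are illustrative assumptions.
TRACKING_PLAN = {
    "signup_drop_off": {
        "events": ["signup_step_viewed", "signup_completed"],
        "properties": ["step_name", "device_type", "utm_source"],
    },
    "feature_usage_around_upgrades": {
        "events": ["feature_used", "error_shown", "plan_changed"],
        "properties": ["feature_name", "subscription_type", "location"],
    },
}


def required_events(plan):
    """Return the sorted, de-duplicated event names needed across all goals."""
    return sorted({name for goal in plan.values() for name in goal["events"]})


def required_properties(plan):
    """Return the sorted, de-duplicated property names needed across all goals."""
    return sorted({name for goal in plan.values() for name in goal["properties"]})


events_to_track = required_events(TRACKING_PLAN)
properties_to_track = required_properties(TRACKING_PLAN)
```

A list like this gives engineers a concrete checklist of what to instrument and lets business users confirm that each goal is covered.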

Create a Clean and Organized Data Structure

To run effective analyses in your customer analytics platform, you’ll need log-level event data that allows you to perform analyses at aggregate levels. To truly democratize your data, you will also need to structure it in an organized and intuitive fashion.

Event data should include:

  • Event name -- so you can identify the action.
  • Timestamp -- so you can see when the action occurred.
  • User ID -- so you can see who completed the action. This can be an authenticated ID or unauthenticated ID, or both.
    • Authenticated ID -- User has completed a login step and their information has been verified.
    • Unauthenticated ID -- User has not completed the login step and their information has not been verified.
  • Additional metadata -- such as attribution and device information, subscription status, demographic information, location, and more.
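
As a minimal sketch, the fields above could be modeled as a single record type. The class name, field names, and the rule that at least one ID must be present are assumptions made for illustration, not a required Indicative format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass
class Event:
    """One log-level event row (hypothetical field names)."""
    event_name: str                            # identifies the action
    timestamp: datetime                        # when the action occurred
    authenticated_id: Optional[str] = None     # set once the user has logged in
    unauthenticated_id: Optional[str] = None   # e.g. an anonymous visitor ID
    properties: Dict[str, Any] = field(default_factory=dict)  # additional metadata

    def __post_init__(self):
        # Assumption: an event without any user ID cannot be attributed.
        if not (self.authenticated_id or self.unauthenticated_id):
            raise ValueError("an event needs at least one user ID")

    @property
    def user_id(self) -> str:
        """Prefer the authenticated ID when both are present."""
        return self.authenticated_id or self.unauthenticated_id


event = Event(
    event_name="Signup Step Viewed",
    timestamp=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    unauthenticated_id="anon-1",
    properties={"device_type": "mobile", "utm_source": "newsletter"},
)
```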

Generalize Your Event Data 

Keep your list of events short, but create a robust set of event properties to allow for deeper analysis.

For example, rather than creating an event for one specific button being clicked in your platform, create a single event for all button clicks, then add event properties that specify each button you wish to track.
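
The button example might look like the sketch below, where one generic event carries the specific button as a property. The event name "Button Clicked" and the property names are hypothetical.

```python
from collections import Counter


def button_clicked(button_name, page):
    """Build one generic event; the specific button lives in a property."""
    return {
        "event_name": "Button Clicked",
        "properties": {"button_name": button_name, "page": page},
    }


events = [
    button_clicked("Upgrade", "/pricing"),
    button_clicked("Confirm Subscription", "/checkout"),
    button_clicked("Upgrade", "/settings"),
]

# The event list stays short (one event name), while the property
# still allows a per-button breakdown.
event_names = {e["event_name"] for e in events}
clicks_per_button = Counter(e["properties"]["button_name"] for e in events)
```

With this shape, tracking a new button means sending a new property value rather than defining a new event.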

To keep the overall data structure clean and organized:

  • Establish a unified identification system -- Your business intelligence and customer analytics platforms need to deliver a unified view of your customers. For example, you’ll need logic to map the Zendesk ID in your Zendesk data to a customer or visitor ID in your website data.
  • Synthesize different sources -- You need to be able to combine server-side data (like order confirmations) with client-side data (like app clicks) and third-party data (like Zendesk live chat interactions) to contextualize user behavior at a higher level. Ideally, these various sources can be unified into one schema. For example, if you’re using BigQuery, it is best to have a single unified events table as opposed to one table per event.
  • Ensure consistency -- When you synthesize different data sources, field names often vary even when they represent the same thing. For example, different datasets may represent a user’s device type as device_type, deviceType, or dvce_type. Unify the schema and field names so that each concept has exactly one name.
  • Create lookup and reference tables -- This allows you to enrich the event data with additional context. For example, a B2B SaaS platform may have event data that includes a company ID. A lookup table may have additional context based on company IDs, like company name or subscription type, that can be used to enrich the event data.
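
The points above can be sketched as a small unification step that renames variant fields, resolves a third-party ID to a customer ID, and enriches each record from a lookup table. Every mapping, ID, and company value below is made up for illustration; in practice these would come from your own sources.

```python
# Hypothetical alias map: variants on the left are renamed to one canonical field.
FIELD_ALIASES = {"deviceType": "device_type", "dvce_type": "device_type"}

# Hypothetical identity map from a Zendesk ID to the website's customer ID.
ZENDESK_TO_CUSTOMER = {"zd-501": "cust-1"}

# Hypothetical lookup table keyed by company ID.
COMPANIES = {"co-9": {"company_name": "Acme", "subscription_type": "enterprise"}}


def normalize(record):
    """Rename known field-name variants to their canonical name."""
    return {FIELD_ALIASES.get(key, key): value for key, value in record.items()}


def resolve_customer(record):
    """Fill in customer_id from the Zendesk ID when it is missing."""
    out = dict(record)
    zendesk_id = out.get("zendesk_id")
    if "customer_id" not in out and zendesk_id in ZENDESK_TO_CUSTOMER:
        out["customer_id"] = ZENDESK_TO_CUSTOMER[zendesk_id]
    return out


def enrich(record):
    """Add company context from the lookup table, if the company is known."""
    return {**record, **COMPANIES.get(record.get("company_id"), {})}


def unify(record):
    """Apply all three steps so every source lands in one schema."""
    return enrich(resolve_customer(normalize(record)))


raw = [
    {"event_name": "Order Confirmed", "customer_id": "cust-1",
     "deviceType": "desktop", "company_id": "co-9"},       # server-side
    {"event_name": "Chat Started", "zendesk_id": "zd-501",
     "dvce_type": "mobile"},                                # third-party
]
unified_events = [unify(r) for r in raw]
```

After this step both records share the same customer ID and the same device_type field, so they can live in a single events table.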

Develop Documentation 

Documentation serves as both a historical guide and an educational resource that makes your data easier to access, interpret, and ultimately use.

Your documentation should cover:

  • How to access data -- If it’s in S3, for example, where in the S3 bucket does certain data live? What format (Parquet, JSON, etc.) are the files in?
  • How to interpret data -- Given all the considerations above, what is the right way to query and analyze the data?
  • Who is the right resource for questions? -- You need an owner who is the expert on the data model and can field questions from the rest of the team. This may be a different person from the owner for data access.
  • A data dictionary -- This helps end users understand what the data represents. For example, does Confirm Subscription represent when the user clicks the confirm button or when the subscription is successfully processed and confirmed by the server? Write it so that users who are new to data analytics can easily understand what each event and event property represents.
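
A data dictionary can start as a simple structure kept alongside the data. In the sketch below, the definition given for Confirm Subscription (server-side confirmation) is only an assumed answer to the question above, and the property names are hypothetical.

```python
# Hypothetical data dictionary: one entry per event, written in plain language.
DATA_DICTIONARY = {
    "Confirm Subscription": {
        # Assumed definition, chosen only for illustration.
        "description": "Fired when the server confirms the subscription "
                       "was processed, not when the button is clicked.",
        "source": "server",
        "properties": {
            "subscription_type": "Plan the user subscribed to.",
        },
    },
}


def describe(event_name):
    """Return a readable description of an event, or flag it as undocumented."""
    entry = DATA_DICTIONARY.get(event_name)
    if entry is None:
        return f"{event_name}: not documented"
    lines = [f"{event_name} ({entry['source']}): {entry['description']}"]
    for prop, meaning in entry["properties"].items():
        lines.append(f"  {prop}: {meaning}")
    return "\n".join(lines)


def undocumented(event_names):
    """List events that appear in the data but have no dictionary entry."""
    return sorted(set(event_names) - set(DATA_DICTIONARY))


missing = undocumented(["Confirm Subscription", "Button Clicked"])
```

Checking incoming event names against the dictionary is one way to notice when documentation has fallen behind the data.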