1 of 39

What’s the deal with Data Contracts?

And why should I care?

2 of 39

A little about me...

I’ve worked in digital marketing and analytics for the last 10 years
Most of my experience is in agencies up’Norf
I’ve spent most of my time using Google Analytics and Google Tag Manager, but in the last few years I’ve focused on data management, big data, data warehousing and data science
I’m now Principal Technical Account Manager at Snowplow

3 of 39

What I’ll talk about today

Why data quality is important
Ways businesses try to address data quality issues
What are Data Contracts?
Some examples
Why digital analytics practitioners should care

4 of 39

Why is data quality important?

High-quality data can roughly be defined as data that is:

Accurate
Complete
Timely
Governed
Understandable
Clean
Compliant

5 of 39

Why is data quality important?

Poor data leaves gaps in your reports and analyses
It leads you to draw incorrect inferences
It causes you stress that you could do without

6 of 39

How do companies try to tackle data quality issues?

Dev processes

Sandbox/QA envs

GTM Preview

7 of 39

What tools do companies use to try to tackle data quality issues?

Tag scanners

Data quality & expectation testing tools

8 of 39

Data Contracts

9 of 39

What are Data Contracts?

10 of 39

"A way of defining specific data requirements between data producers and data consumers to ensure that the necessary data required by a downstream consumer or application is always available in the agreed, expected format"

J. Peck, 2023

11 of 39

What actually is a Data Contract?

12 of 39

13 of 39

What actually is a Data Contract?

Technically, a Data Contract is usually a file that defines a number of properties of the data in question: where it was sourced from, where it is being sent, who owns it, and what the data looks like.

It can contain, but is not limited to…

The source system
The destination system
The schema of the expected data
The owner from the producing side
The consumer
Any SLAs
The product the data is part of/feeds into
Whether the data contains any PII
Any encryption/masking rules that have been applied
The version of the contract
Loads of other stuff…
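
As a rough illustration of the properties listed above, a contract could be written as a plain object with a small check that a record matches its schema. This is only a sketch: the "orders" dataset, every field name and the validateRecord helper are assumptions made up for this example, not a standard contract format.

```javascript
// Hypothetical contract for an illustrative "orders" dataset.
// Every name and value here is an assumption, not a standard format.
const ordersContract = {
  name: "orders",
  version: "1.0.0",
  source: "shop_backend",
  destination: "warehouse.analytics.orders",
  owner: "backend-team",
  consumer: "analytics-team",
  sla: { maxDelayHours: 24 },
  containsPii: false,
  schema: {
    order_id: { type: "string", required: true },
    status: { type: "string", required: true, allowed: ["placed", "shipped", "refunded"] },
    total: { type: "number", required: false },
  },
};

// Return a list of readable violations; an empty list means the record conforms.
function validateRecord(contract, record) {
  const violations = [];
  for (const [field, rule] of Object.entries(contract.schema)) {
    const value = record[field];
    if (value === undefined || value === null) {
      if (rule.required) violations.push(`${field}: missing required field`);
      continue;
    }
    if (typeof value !== rule.type) {
      violations.push(`${field}: expected ${rule.type}, got ${typeof value}`);
      continue;
    }
    if (rule.allowed && !rule.allowed.includes(value)) {
      violations.push(`${field}: "${value}" is not an allowed value`);
    }
  }
  return violations;
}
```

In practice the contract would more likely live in a versioned file (JSON or YAML, for instance) that both producer and consumer can review, with a check like this running wherever the data is produced or loaded.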

14 of 39

15 of 39

By the way, Snowplow has used “Data Contracts” for nearly 10 years…

16 of 39

Example - Data extracted from Salesforce for a report

Darren is the data engineer responsible for extracting data from a business’ Salesforce account, with data on customer contracts, tiers, contract value etc
Lucy is a data analyst who is building a report for her stakeholders on the status of their customers
The current report has “Customer Tier” as a dimension: “Free” and “Premium”
The product is changing, so Lucy needs to add “Freemium” as a new tier
She updates the data contract as a draft, and attaches it to the request for Darren to change the transformation
Darren makes the changes to the ETL process
Lucy updates her reports accordingly
The Data Contract is finalised as a new version that includes this change, and shared around the business
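
One way to picture Lucy's draft is as a second version of the contract that differs only in the allowed tiers. The sketch below compares the two versions so the change is easy to review. The version numbers, the object layout and the diffAllowedValues helper are all assumptions for illustration.

```javascript
// Hypothetical current contract and Lucy's draft; only the tier field is shown.
const contractV1 = {
  version: "1.0.0",
  schema: { customer_tier: { type: "string", allowed: ["Free", "Premium"] } },
};

const contractV2Draft = {
  version: "1.1.0",
  schema: { customer_tier: { type: "string", allowed: ["Free", "Freemium", "Premium"] } },
};

// List the allowed values added and removed per field between two versions.
function diffAllowedValues(oldContract, newContract) {
  const changes = [];
  for (const [field, newRule] of Object.entries(newContract.schema)) {
    const oldValues = oldContract.schema[field]?.allowed ?? [];
    const newValues = newRule.allowed ?? [];
    const added = newValues.filter((value) => !oldValues.includes(value));
    const removed = oldValues.filter((value) => !newValues.includes(value));
    if (added.length || removed.length) changes.push({ field, added, removed });
  }
  return changes;
}

const tierChanges = diffAllowedValues(contractV1, contractV2Draft);
```

A diff that only adds values is usually easier for consumers to absorb than one that removes them, which is one reason to review the change before Darren touches the ETL.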

17 of 39

So what does this mean for digital analytics?

18 of 39

Digital analysts have been ahead of the curve on this

From the day GTM was launched, we quickly found the value of “data contracts” - we just didn’t call them that
We used to scrape the DOM for CSS classes, and quickly realized that that's a really bad idea as data quality is so difficult to guarantee
So, we created specs for developers to push our requirements to the dataLayer - so that when any functional or visual/UI changes were made, tracking requirements weren't affected
THESE WERE DATA CONTRACTS, WE JUST DIDN’T KNOW IT
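
To make the comparison concrete, here is a minimal sketch of a tracking spec written down as data, with a check that a dataLayer push follows it. The "add_to_cart" event, its keys and the checkPush helper are hypothetical; a plain array stands in for window.dataLayer so the snippet also runs outside a browser.

```javascript
// Hypothetical tracking spec agreed with developers for one event.
const addToCartSpec = {
  event: "add_to_cart",
  requiredKeys: { product_id: "string", price: "number", currency: "string" },
};

// Compare a dataLayer push with the spec and return the problems found.
function checkPush(spec, push) {
  const problems = [];
  if (push.event !== spec.event) problems.push(`unexpected event "${push.event}"`);
  for (const [key, type] of Object.entries(spec.requiredKeys)) {
    if (typeof push[key] !== type) problems.push(`${key} should be a ${type}`);
  }
  return problems;
}

// In a page this array would be window.dataLayer.
const dataLayer = [];
dataLayer.push({ event: "add_to_cart", product_id: "sku-123", price: 19.99, currency: "GBP" });
```

Because the spec depends on keys rather than CSS classes, a redesign of the page leaves the check untouched as long as developers keep pushing the same keys.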

19 of 39

Is it time for digital analytics to go all-in on Data Contracts?

20 of 39

Data quality for web/mobile tracking is an obvious starting point, but there are further opportunities here…

Report builders -> report consumers
Data/Analytics engineers -> data analysts
Data scientists -> data engineers
Data/Analytics engineers -> web or application developers
Data analysts -> digital marketers

21 of 39

Jordan Peck

Snowplow

Catch me on LinkedIn, Twitter or the MeasureSlack

22 of 39

Client vs Server-Side Tracking

January 6, 2025

23 of 39

AGENDA

Introduction to Tracking Methods
Client-Side Tracking Explained
Server-Side Tracking Explained
Rise of Server-Side Tracking
Applications of Server-Side Tracking
Tracking: When to Use Which
Benefits of Client-Side Tracking
Benefits of Server-Side Tracking
Challenges of Client-Side Tracking
Challenges of Server-Side Tracking
Server-Side Use Cases
Integration and Synchronization Tips
Practical Application Examples
Balancing Both Approaches
Final Recommendations

24 of 39

Introduction to Tracking Methods

Client-Side Tracking

Occurs on the user's device, typically within the browser or mobile app.
Involves direct interaction with the webpage or application, capturing user behavior like clicks and page views.
Utilizes tools like Google Tag Manager to embed tracking code directly into the website.

Server-Side Tracking

Takes place on the server, independently of the user's device.
Captures data from various applications and services, reducing reliance on client-side interactions.
Ideal for tracking events that don’t have a direct user interface, such as API calls or backend transactions.

25 of 39

Client-Side Tracking Explained

Understanding Client-Side Tracking

Client-side tracking refers to the practice of collecting data directly from a user's device or browser when they interact with a website or application.
This type of tracking typically involves using JavaScript code embedded within web pages, which makes requests to a tracking server to send data such as page views, clicks, and events.
An analogy to understand client-side tracking is asking someone how tall they are; their response may be accurate, but it relies on their own ability to measure and report their height accurately.
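
As a minimal sketch of what such embedded JavaScript does, the snippet below builds a page-view payload from values the browser exposes and sends it with navigator.sendBeacon. The collector URL, the payload field names and both helper functions are assumptions for illustration, not any vendor's actual tag.

```javascript
// Build a page-view payload from values the browser exposes.
function buildPageView(pageUrl, referrer, clientTimestamp) {
  const url = new URL(pageUrl);
  return {
    event: "page_view",
    path: url.pathname,
    referrer,
    utm_source: url.searchParams.get("utm_source"),
    utm_campaign: url.searchParams.get("utm_campaign"),
    client_timestamp: clientTimestamp,
  };
}

// Send the payload to a hypothetical collector; returns false outside a browser.
function sendEvent(payload, endpoint = "https://collector.example.com/events") {
  if (typeof navigator === "undefined" || typeof navigator.sendBeacon !== "function") {
    return false;
  }
  return navigator.sendBeacon(endpoint, JSON.stringify(payload));
}

// In a page this would be called as:
// sendEvent(buildPageView(location.href, document.referrer, Date.now()));
```

Everything in the payload is self-reported by the device, which is the point of the height analogy: the server has to trust what the browser says.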

26 of 39

Server-Side Tracking Explained

What is Server-Side Tracking?

Server-side tracking collects data directly from the server instead of the user's device.
It captures events and interactions that occur server-side, like transaction completions and API requests.
This method is used for applications not directly connected to a web interface.

Server-Side Tracking vs. Server-Side Tagging

Server-side tracking monitors the actual actions and processes occurring on the server.
Server-side tagging involves using a server-side application to manage and relay tracking tags to different destinations.
While both methods utilize server infrastructure, tracking focuses on data collection, and tagging is about data transmission.
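
The distinction can be sketched in a few lines. In the snippet below, the first function stands for tracking (the server records something that happened on it) and the second for tagging (one event is reshaped and relayed to several destinations). The event shape and the destination objects, each with a name, a transform and a send function, are hypothetical.

```javascript
// Server-side tracking: the server itself records something that happened on it.
function trackOrderCompleted(order, receivedAt) {
  return {
    event: "order_completed",
    order_id: order.id,
    value: order.total,
    server_timestamp: receivedAt,
  };
}

// Server-side tagging: one incoming event is reshaped and relayed to each destination.
// A failure at one destination is recorded and does not stop the others.
async function relayEvent(event, destinations) {
  const results = [];
  for (const destination of destinations) {
    try {
      await destination.send(destination.transform(event));
      results.push({ destination: destination.name, ok: true });
    } catch (error) {
      results.push({ destination: destination.name, ok: false, error: error.message });
    }
  }
  return results;
}
```

In a real setup each send function would wrap an HTTP call to a vendor endpoint; keeping it injectable also makes the relay easy to test.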

27 of 39

Rise of Server-Side Tracking

The rise of server-side tracking has been driven by a confluence of regulatory changes, heightened user privacy concerns, and advancements in browser privacy features. With regulations like GDPR and CCPA imposing strict guidelines on data collection, organizations are seeking more secure methods to track user behavior without compromising privacy. Additionally, users are increasingly aware of their data rights, leading to demands for transparency and control over personal information. As browsers implement features to block third-party cookies and restrict tracking scripts, server-side tracking has emerged as a robust alternative, allowing businesses to gather insights while ensuring compliance and user trust.

28 of 39

Applications of Server-Side Tracking

Content Management Systems (CMS)

Server-side tracking can be implemented in popular CMS platforms like WordPress, Drupal, and Shopify to monitor user interactions and content performance.

Web Servers

Frameworks like Flask, Next.js, and Express allow tracking of server-side events, providing insights into application performance and user behavior.

Server-Side APIs

Tracking can also be done through server-side APIs to analyze data exchanges between servers and clients, enhancing data consistency and security.

29 of 39

Tracking: When to Use Which

When to Use Server-Side Tracking

Use server-side tracking when events occur on the server, such as payment processing, or when handling sensitive information like PII. It is ideal for operations like sign-ups and transaction completions.

When to Use Client-Side Tracking

Utilize client-side tracking for capturing user interactions on the webpage, such as clicks, pageviews, and video plays. It's essential for tracking client-specific data like UTM parameters.

Hybrid Approach Benefits

Adopting a hybrid approach allows for the collection of both server-side and client-side data, ensuring comprehensive insights while maintaining data accuracy and user privacy.

Key Considerations

Always assess the type of data needed and the environment. For rich user interaction data, prefer client-side tracking; for secure and essential backend operations, lean towards server-side tracking.

30 of 39

Benefits of Client-Side Tracking

Client-side tracking is often easier to implement, requiring less technical knowledge and resources compared to server-side solutions.
It provides granular information about user interactions, such as clicks, scroll depth, and pageviews, enabling detailed behavioral analysis.
Real-time data collection is possible, allowing marketers to respond quickly to user actions and optimize experiences immediately.
Client-side tracking can capture contextual browser information, such as UTM parameters and IP addresses, which are crucial for understanding traffic sources.
It allows for immediate insights into user engagement, helping businesses make data-driven decisions more efficiently.

31 of 39

Benefits of Server-Side Tracking

Enhanced Data Protection: Server-side tracking minimizes the exposure of personally identifiable information (PII) and sensitive data to the client, ensuring better compliance with privacy regulations.
Increased Processing Power: The server can handle more complex data processing and analytics tasks than a client device, allowing for richer data insights and less strain on client resources.
Improved Performance: Server-side tracking reduces the load on the client-side, resulting in faster website performance and a smoother user experience, especially for data-heavy applications.
Reduced Impact of Ad-Blockers: Since server-side tracking operates independently of the user's browser, it is less affected by ad-blockers or cookie restrictions, ensuring more reliable data collection.
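
The data protection point can be illustrated with a small sketch: because the server sees the full event before anything is forwarded, it can drop PII fields first. The field list and the stripPii helper are assumptions for this example; which fields count as PII depends on your own data and legal advice.

```javascript
// Fields treated as PII in this sketch; a real list would come from your own contract.
const PII_FIELDS = ["email", "phone", "full_name"];

// Return a copy of the event without PII fields, leaving the original untouched.
function stripPii(event, piiFields = PII_FIELDS) {
  const safeEvent = {};
  for (const [key, value] of Object.entries(event)) {
    if (!piiFields.includes(key)) safeEvent[key] = value;
  }
  return safeEvent;
}

const signUp = { event: "sign_up", plan: "Premium", email: "lucy@example.com" };
const forwardable = stripPii(signUp);
```

The full event can still be stored in a system the business controls, while only the reduced copy goes to third-party destinations.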

32 of 39

Challenges of Client-Side Tracking

Highlights of Client-Side Tracking

Client-side tracking setup is generally simpler and more accessible for non-technical users.
Granular data collection capabilities, such as cookies and user agent information, provide valuable insights.
Immediate feedback on user interactions, such as clicks and scrolls, enhances user experience analysis.

Lowlights of Client-Side Tracking

Ad-blockers can prevent tracking scripts from loading, resulting in significant data loss.
Client-side tracking can be unreliable due to varying user environments, such as unstable internet connections or browser extensions.
Security vulnerabilities may arise, as client-side data can be manipulated or intercepted by malicious actors.
User privacy regulations and consent requirements can restrict data collection, leading to incomplete data sets.

33 of 39

Challenges of Server-Side Tracking

Pros

Enhanced data security as sensitive information is processed on the server side.
Reduced reliance on client-side resources, allowing for more robust tracking capabilities.
Improved performance by minimizing client-side tracking load, potentially speeding up page load times.

Cons

Complex integration process that often requires collaboration with development teams and technical expertise.
Higher costs associated with server infrastructure and maintenance compared to client-side solutions.
Limited visibility into real-time client-side interactions, such as scrolls and clicks, which may lead to gaps in data.

34 of 39

Server-Side Use Cases

Tracking Sign-Ups

Server-side tracking captures user sign-up events directly on the server, ensuring accurate data collection and reducing reliance on client-side interactions.

Monitoring Purchases

Purchases can be tracked server-side to provide a more secure and reliable record of transactions, protecting sensitive data from client-side vulnerabilities.

Server-to-Server Interactions

This approach allows for tracking interactions between different servers, such as webhooks, ensuring seamless data exchange and event logging.

35 of 39

Integration and Synchronization Tips

Establish a Common Identifier

Use a unique user ID that can be tracked across both client-side and server-side environments to ensure data consistency.

Implement Session Stitching

Utilize session IDs stored in cookies or headers to connect client-side interactions with corresponding server-side events.

Monitor Timestamp Drift

Record more than one timestamp for each event, such as when the device sent it and when the server received it, so that discrepancies between client-side and server-side clocks can be detected.

Maintain Data Privacy Compliance

Ensure that both client-side and server-side tracking methods adhere to data protection regulations, and communicate tracking practices clearly to users.
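
The stitching and timestamp tips can be sketched together. The snippet below joins client-side and server-side events that share a session ID and measures the gap between the two clocks. The field names (session_id, client_sent_at, server_received_at) and both helpers are hypothetical.

```javascript
// Attach the names of client-side events to each server-side event
// that shares the same session ID.
function stitchBySession(clientEvents, serverEvents) {
  const clientBySession = new Map();
  for (const event of clientEvents) {
    if (!clientBySession.has(event.session_id)) clientBySession.set(event.session_id, []);
    clientBySession.get(event.session_id).push(event);
  }
  return serverEvents.map((serverEvent) => {
    const related = clientBySession.get(serverEvent.session_id) ?? [];
    return { ...serverEvent, client_events: related.map((e) => e.event) };
  });
}

// Gap in milliseconds between the device's send time and the server's receive time.
// It mixes clock skew with network latency, so treat it as a signal, not a measurement.
function timestampDriftMs(event) {
  return event.server_received_at - event.client_sent_at;
}
```

A large or negative gap that persists for one device suggests its clock is off, in which case the server timestamp is usually the safer one to order events by.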

36 of 39

Practical Application Examples

Node.js Implementation

Using Express, you can set up a route to track events. For example, capturing a purchase event could look like this:

    const express = require('express');
    const app = express();
    app.use(express.json()); // parse JSON bodies into req.body

    app.post('/track/purchase', (req, res) => {
      const purchaseData = req.body;
      // process data
      res.sendStatus(204); // acknowledge the event
    });

PHP Implementation

In a PHP environment like WordPress, you could track a user registration event with:

    add_action('user_register', 'track_user_registration');

    function track_user_registration($user_id) {
        // Send data to tracking server
    }

AWS Lambda Function

Create a Lambda function to handle tracking events. For example, tracking page views could be implemented as:

    exports.handler = async (event) => {
      const pageData = JSON.parse(event.body);
      // log data to a database or analytics service
      return { statusCode: 204 };
    };

37 of 39

Balancing Both Approaches

Utilizing both client-side and server-side tracking is essential for obtaining comprehensive data insights. Client-side tracking excels in capturing user interactions and contextual information, while server-side tracking provides enhanced data security and accuracy, and is more resilient to ad-blockers and browser restrictions. By integrating both methods, organizations can enrich their data collection strategy, ensuring they gather a complete and nuanced understanding of user behavior and system performance. This balanced approach not only maximizes data quality but also fosters a more robust analytics framework that can adapt to evolving digital landscapes.

38 of 39

Final Recommendations

A hybrid tracking strategy combines the strengths of both client-side and server-side tracking for comprehensive data collection.
Client-side tracking is ideal for capturing user interactions and contextual data, while server-side tracking excels in handling sensitive information and reducing reliance on third-party cookies.
Implementing both methods allows for enriched data quality and accuracy, ensuring a more complete view of user behavior.
Regularly evaluate the performance and data accuracy of both tracking methods to optimize your digital analytics strategy.
Stay updated with privacy regulations and technology trends to adapt your tracking methods accordingly.

39 of 39

Thank you.