May 10, 2022
While increased spending on marketing analytics has become synonymous with modern digital marketing, most companies still struggle with the sheer size and variety of possible data inputs. In this series, we aim to demystify the setup of a robust marketing analytics platform and show how companies can not only own their data but also democratize it so the business can drive meaningful, insights-based change.
In the first three parts of this series, we introduced four steps to harness the value of customer data, shared how modern cloud data platforms enable marketers to accelerate value and deliver differentiated precision marketing, and discussed incorporating a decisioning engine with a marketing analytics platform (MAP). In this final installment, we’re sharing two practical steps to get started on this journey.
How to Get Started With MAP
Step 1: Unify Marketing Data in One Place
What Is a Data Lake?
A data lake is a storage layer that holds raw data, both structured and unstructured, in its native form so it can be used for a multitude of purposes. All the major cloud platforms (AWS, Azure, and Google Cloud) have a storage option that works well for data lakes. AWS's Simple Storage Service (S3), Azure's Data Lake Storage or Blob Storage, and Google Cloud's Cloud Storage are well suited to a data lake because they offer virtually unlimited scalability at a relatively low cost.
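As a rough illustration of keeping raw files organized in object storage, the sketch below builds an object key that groups files by source and ingestion date. The `raw/<source>/ingest_date=<date>/` layout and the `raw_object_key` helper are assumptions made for this example, not a convention prescribed by any of the cloud platforms.

```python
from datetime import date


def raw_object_key(source, ingest_date, filename):
    """Build an object key that groups raw files by source and ingestion date."""
    # Hypothetical layout: raw/<source>/ingest_date=<YYYY-MM-DD>/<filename>
    return f"raw/{source}/ingest_date={ingest_date.isoformat()}/{filename}"


key = raw_object_key("google_ads", date(2022, 5, 10), "campaigns.csv")
print(key)  # raw/google_ads/ingest_date=2022-05-10/campaigns.csv
```

A predictable key layout like this makes it easier to find the original file behind any downstream number, whichever storage service is used.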
All the major cloud platforms have services that can automate data pulls into the data lake, as do platforms such as Fivetran, Stitch, and Singer. Our recommendation, however, is to focus first on identifying and validating the data sources and only then on automating the data extracts.
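One way to validate a source before automating it is to pull a sample extract by hand and check it against what the team expects to receive. The sketch below is a minimal example of that idea; the expected column names and the `validate_extract` helper are hypothetical and would differ for each source.

```python
import csv
import io

# Hypothetical columns expected from an ad-platform export (illustration only).
EXPECTED_COLUMNS = {"date", "campaign_id", "impressions", "clicks"}


def validate_extract(csv_text, expected_columns=EXPECTED_COLUMNS):
    """Report missing columns, unexpected columns, and the number of data rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    found = set(reader.fieldnames or [])
    row_count = sum(1 for _ in reader)
    return {
        "missing": sorted(expected_columns - found),
        "unexpected": sorted(found - expected_columns),
        "rows": row_count,
    }


# A small made-up sample that lacks the "clicks" column.
sample = "date,campaign_id,impressions\n2022-05-01,c1,1000\n2022-05-02,c1,1200\n"
report = validate_extract(sample)
print(report)
```

Here the report flags `clicks` as missing, which is the kind of gap worth resolving with the data owner before any pipeline is scheduled.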
There are two advantages to storing data in a data lake:
- Provides the ability to perform exploratory data analysis (EDA) using self-service tools such as R or Python.
- Establishes a governance foundation around data quality in downstream systems such as a data warehouse or data mart.
Let's dive into both of these concepts to describe why configuring a data lake is a key first step in the process of setting up an analytics environment.
Advantages of a Data Lake: Defining EDA
EDA is the process of investigating a dataset to discover patterns and anomalies (outliers), identify quality issues, and test existing hypotheses or form new ones based on observations. EDA can help us answer questions like:
- What is the average age of our customers?
- What is the income range for our customers?
- Do our customers prefer shopping online versus in-store?
- Does click-through rate differ by ad platforms? Does it vary by ad dimensions or phone OS?
- Does the data support the current customer segments?
For each of these questions, we can add variables to see how the answer changes across segments of the data. For example, we could also ask, “Does shopping preference between online versus in-store vary by income group or age?”
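A question like this amounts to a simple cross-tabulation. The sketch below uses only the Python standard library and a handful of made-up records; the field names (`income_group`, `channel`) and the `channel_share_by_group` helper are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Made-up records for illustration only; real data would come from the data lake.
customers = [
    {"income_group": "low", "channel": "in-store"},
    {"income_group": "low", "channel": "online"},
    {"income_group": "low", "channel": "in-store"},
    {"income_group": "high", "channel": "online"},
    {"income_group": "high", "channel": "online"},
    {"income_group": "high", "channel": "in-store"},
    {"income_group": "high", "channel": "online"},
]


def channel_share_by_group(records, group_key, channel_key="channel"):
    """Return, per group, the fraction of customers preferring each channel."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record[group_key]][record[channel_key]] += 1
    return {
        group: {channel: n / sum(c.values()) for channel, n in c.items()}
        for group, c in counts.items()
    }


shares = channel_share_by_group(customers, "income_group")
for group, share in shares.items():
    print(group, {channel: round(value, 2) for channel, value in share.items()})
```

Swapping `"income_group"` for another field, such as an age band, answers the same question along a different dimension.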
EDA is also a critical first step in building machine learning (ML) models. It includes identifying column types and columns with missing or null values, calculating skewness and kurtosis, running bivariate and multivariate analysis, and exploring the range and distribution of variables.
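To make a few of these checks concrete, the sketch below profiles a single numeric column using only the Python standard library. The `profile_numeric` helper and the sample ages are made up for illustration, and the skewness and kurtosis use the simple population formulas, so the values may differ slightly from the bias-corrected figures that some statistics packages report.

```python
import statistics


def profile_numeric(values):
    """Summarize one numeric column; None marks a missing value.

    Assumes at least two distinct non-missing values so the
    standard deviation is greater than zero.
    """
    present = [v for v in values if v is not None]
    n = len(present)
    mean = statistics.fmean(present)
    std = statistics.pstdev(present)  # population standard deviation
    # Population skewness and excess kurtosis from standardized moments.
    skewness = sum(((v - mean) / std) ** 3 for v in present) / n
    excess_kurtosis = sum(((v - mean) / std) ** 4 for v in present) / n - 3
    return {
        "missing": len(values) - n,
        "min": min(present),
        "max": max(present),
        "mean": mean,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }


# Made-up customer ages with two missing values.
ages = [23, 31, 35, None, 42, 29, 67, None, 38]
profile = profile_numeric(ages)
for name, value in profile.items():
    print(name, value)
```

A positive skewness here reflects the single much older customer pulling the distribution to the right, which is the kind of pattern worth knowing about before a model is trained.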
Step 2: Establish a Data Governance Foundation
Governance on a data lake has several key facets:
- Documentation of the data flow: Helps identify data source, data owner, frequency of ingestion, method of ingestion (API, SFTP, etc.), and sensitivity of data.
- Data retention standards.
- Data ingestion standards: Defines how new data sources will be incorporated, including how schemas and data dictionaries will be managed.
- Data ownership: Establishes a clear chain of custody and is instrumental in enforcing access controls as well as managing data security.
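The data flow documentation described in the first facet can start as a simple structured record per source. The sketch below is one possible shape, assuming a Python dataclass; the `DataSourceRecord` type, its field names, and the sample values are hypothetical and not part of any specific governance tool.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DataSourceRecord:
    """One catalog entry describing how a source flows into the data lake."""
    source: str
    owner: str
    ingestion_frequency: str
    ingestion_method: str  # for example "API" or "SFTP"
    sensitivity: str


def missing_fields(record):
    """List fields left blank so incomplete documentation is caught early."""
    return [name for name, value in asdict(record).items() if not value.strip()]


# Hypothetical entries; names and values are illustrative only.
crm_export = DataSourceRecord(
    source="crm_customer_export",
    owner="marketing-operations",
    ingestion_frequency="daily",
    ingestion_method="SFTP",
    sensitivity="contains PII",
)
undocumented = DataSourceRecord(
    source="ad_platform_clicks",
    owner="",
    ingestion_frequency="hourly",
    ingestion_method="API",
    sensitivity="",
)

print(missing_fields(crm_export))
print(missing_fields(undocumented))
```

Keeping records like these next to the data makes it straightforward to spot sources that have no named owner or sensitivity classification.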
While setting up an analytics platform can seem like a formidable task, each step in the process will unlock business gains along the way. For example, the governance facets we have discussed above will help ensure we can meet both compliance goals (identifying the source and flow of the data through the organization) and quality goals (ability to tie summarized results to source attributes in the raw/native format). Getting data sources ingested into the data lake will also allow for democratization of that data across the enterprise and enable teams to do their own EDA.
Credera’s Marketing Analytics Platform Series
This is part four of a multi-part series exploring how a single customer view powered by Tealium and AWS can help provide personalized customer experiences and increase the ROI of your marketing budget.
Find the rest of the series insights here:
- Marketing Analytics Platform Powered by OPMG Part 1: Activating Insights From Your Customer Data
- Marketing Analytics Platform Powered by OPMG Part 2: Unlocking Customer Insights With a Modern Marketing Data Platform
- Marketing Analytics Platform Powered by OPMG Part 3: Ignite Customer Insights With Enhanced Decisioning Capabilities
Marketing Analytics Platform Powered by OPMG
OPMG’s Marketing Analytics Platform (MAP) is a technology toolkit for marketing teams to unlock value from their customer data. The toolkit blends enterprise patterns, cloud technology, and off-the-shelf software to unify customer data, segment customers, orchestrate real-time experiences, and measure the customer journey.
This content was created in partnership between RAPP and Credera, sister agencies and part of Omnicom Precision Marketing Group (OPMG). Omnicom Precision Marketing Group aligns Omnicom's global digital, data, and CRM capabilities to deliver precisely targeted and meaningful customer experiences.