Analyzing the relationship between online and offline channels
While it's common to analyze all digital cross-channel activity, the link between offline activity and your SEO efforts is often forgotten. But why is it so important to include all channels when building a marketing strategy?
We know that SEO does not live in a vacuum—the relationship with Paid Search and other digital channels is well documented. However, by speaking with departments outside of the digital realm and combining data, we can determine proper attribution and uncover actual ROI. A true omni-channel view also allows us to think about tradeoffs and make predictions.
One of our clients is a large eCommerce company in the fashion industry with multiple brands, each with a separate website. They invest in both online and offline channels, including magazine mailings, but have no brick and mortar locations.
Looking at week-over-week numbers, we noticed a drop in their SEO performance. After checking the usual suspects, we found:
- No ranking changes or Google updates
- No site changes
- No tracking issues
After turning up no solid leads, we decided to ask about changes to their other channels. They mentioned that their catalog department had recently reduced circulation.
The previous week had seen circulation fall 38% Y/Y for one brand, which in absolute numbers meant dropping from over 1M magazines last year to around 700K this year. Additionally, they planned to keep reducing circulation throughout the rest of the year as they transitioned more towards digital.
As a digital agency, we consider the catalog department a channel that is way outside of our usual orbit.
Hypothesis and Assumptions
Could a drop in the circulation of magazine mailings be the reason why SEO traffic had also dropped?
Our best theory was that the number of magazines mailed would affect “branded” traffic but not “non-branded” traffic. As a reminder, branded SEO traffic is defined as clicks from Google search results for queries that include the brand name (for example, people who search for “verizon” and click through to verizonwireless.com). Non-branded traffic is the opposite: clicks from generic search terms (to continue the Verizon example, “mobile phones” for verizonwireless.com).
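To make the branded/non-branded split concrete, here is a minimal Python sketch of how Search Console queries could be bucketed. It assumes a hand-maintained list of brand terms; `BRAND_TERMS`, the function names, and the sample queries are hypothetical and not taken from our tooling.

```python
import re

# Hypothetical brand terms, including a common misspelling; adjust per brand.
BRAND_TERMS = ["verizon", "verzion"]

# One case-insensitive pattern that matches any brand term anywhere in the query.
_BRAND_PATTERN = re.compile(
    "|".join(re.escape(term) for term in BRAND_TERMS), re.IGNORECASE
)


def is_branded(query: str) -> bool:
    """Return True if the search query contains any brand term."""
    return bool(_BRAND_PATTERN.search(query))


def split_clicks(rows):
    """Sum clicks into branded and non-branded buckets.

    `rows` is an iterable of (query, clicks) pairs.
    """
    totals = {"branded": 0, "non_branded": 0}
    for query, clicks in rows:
        key = "branded" if is_branded(query) else "non_branded"
        totals[key] += clicks
    return totals


# Made-up sample rows for illustration only.
sample = [("verizon", 120), ("Verizon wireless plans", 40), ("mobile phones", 75)]
print(split_clicks(sample))  # {'branded': 160, 'non_branded': 75}
```

A simple substring match like this is usually enough to start with, although brand names that double as ordinary words would need stricter rules.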
Where do we find branded SEO traffic data? Unfortunately, Google Analytics does not show traffic by keyword because keyword data is “not provided,” so we must instead use Google Search Console (formerly known as Webmaster Tools) to see which keywords are driving clicks.
Ayima is a very data-driven organization, and we’ve created proprietary tools that warehouse Search Console data on a daily basis. This comes in handy if we need to add custom filters to query information and easily trend branded and non-branded traffic over time.
However, it is relatively easy to download Search Console data via the web or the API. You can download daily data for the past 90 days in Search Console via the browser interface.
It's also possible to use Python or R to pull daily data for the past 16 months via the Google API.
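As a rough illustration of the API route, the sketch below builds a Search Analytics request body and pages through the results. To keep it runnable offline, the actual API call is passed in as `run_query`, a hypothetical callable. With the google-api-python-client library, it would typically wrap something like `service.searchanalytics().query(siteUrl=site_url, body=body).execute()`, where `service` and `site_url` are assumed to be set up elsewhere. The function names and the fake data are ours.

```python
from typing import Callable, Dict, Iterator

PAGE_SIZE = 25000  # maximum rowLimit a Search Analytics query accepts


def build_body(start_date: str, end_date: str, start_row: int = 0) -> Dict:
    """Request body for one page of daily, query-level data (dates as YYYY-MM-DD)."""
    return {
        "startDate": start_date,
        "endDate": end_date,
        "dimensions": ["date", "query"],
        "rowLimit": PAGE_SIZE,
        "startRow": start_row,
    }


def fetch_all_rows(
    run_query: Callable[[Dict], Dict], start_date: str, end_date: str
) -> Iterator[Dict]:
    """Yield every row, requesting further pages until one comes back short."""
    start_row = 0
    while True:
        response = run_query(build_body(start_date, end_date, start_row))
        rows = response.get("rows", [])  # "rows" is absent when there is no data
        for row in rows:
            day, query = row["keys"]  # keys follow the order of "dimensions"
            yield {"date": day, "query": query, "clicks": row["clicks"]}
        if len(rows) < PAGE_SIZE:
            break
        start_row += len(rows)


def fake_run_query(body: Dict) -> Dict:
    """Stand-in for the real API call so the sketch runs without credentials."""
    if body["startRow"] > 0:
        return {}
    return {"rows": [{"keys": ["2019-03-04", "verizon"], "clicks": 12}]}


rows = list(fetch_all_rows(fake_run_query, "2019-03-04", "2019-03-10"))
print(rows)
```

Injecting the call keeps authentication separate from the paging logic, which also makes the paging easy to test.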
Even though we have daily data available for this case, we chose to aggregate by week. This decision was based on the assumption that not everyone who receives a magazine looks at it immediately, so there is always a lag of a few days.
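The weekly roll-up could look something like the following sketch. It assumes weeks start on Monday, which is our choice for illustration; the function names and the daily figures are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day: date) -> date:
    """Return the Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())


def weekly_clicks(daily):
    """Sum daily (date, clicks) pairs into a {week_start: clicks} dict."""
    totals = defaultdict(int)
    for day, clicks in daily:
        totals[week_start(day)] += clicks
    return dict(totals)


# Hypothetical daily branded clicks spanning two weeks.
daily = [
    (date(2019, 3, 4), 100),   # Monday
    (date(2019, 3, 6), 150),   # Wednesday
    (date(2019, 3, 10), 50),   # Sunday of the same week
    (date(2019, 3, 11), 80),   # Monday of the next week
]
print(weekly_clicks(daily))
# {datetime.date(2019, 3, 4): 300, datetime.date(2019, 3, 11): 80}
```

If the mailing schedule uses a different week boundary, `week_start` is the only piece that would need to change.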
Once we have both data sets, we combine them by plotting branded clicks against circulation numbers in the same chart.
In this case, the best tool for the job is trusty old Excel. Simply put the data sets together in adjacent columns so the dates line up by row, and then create a pivot table that includes everything.
We can even add a pivot chart with data trended by week, along with a slicer that lets us choose the brand, and highlight Y/Y changes for each week with conditional formatting.
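For readers who prefer a script to a spreadsheet, lining up the two data sets by week might be sketched in Python as follows. This is only an illustration of the alignment step, not the pivot table itself; the function name and all figures are hypothetical.

```python
def align_by_week(circulation, branded_clicks):
    """Line up two {week: value} mappings on the weeks they share.

    Returns a list of (week, circulation, clicks) tuples sorted by week.
    Weeks that appear in only one data set are left out.
    """
    shared_weeks = sorted(set(circulation) & set(branded_clicks))
    return [(week, circulation[week], branded_clicks[week]) for week in shared_weeks]


# Hypothetical weekly figures keyed by ISO week-start date.
circulation = {"2019-03-04": 700_000, "2019-03-11": 0, "2019-03-18": 450_000}
branded_clicks = {"2019-03-04": 5200, "2019-03-11": 3100, "2019-03-25": 4000}

for week, mailed, clicks in align_by_week(circulation, branded_clicks):
    print(f"{week}  mailed={mailed:>7}  clicks={clicks}")
```

Dropping unmatched weeks keeps the later correlation honest, since a week with no circulation record is not the same as a week with zero magazines mailed.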
In order to see how strongly two variables are related, we also add a correlation calculation in Excel using the =CORREL function. There are several other ways to calculate correlation, but this method is easiest since we are already working in Excel.
This correlation metric has a value between -1 and +1 and shows how strong the linear relationship is between circulation and branded clicks. A positive correlation means the two move together: weeks with higher circulation tend to have more branded clicks. A negative correlation means the opposite: weeks with higher circulation tend to have fewer branded clicks. Correlation alone does not prove that circulation causes the change in clicks, but a strong positive value is consistent with our hypothesis.
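Excel's CORREL returns the Pearson correlation coefficient, and the same calculation can be written out in a few lines of Python to show what it measures. The function name and the weekly figures below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-movement of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Spread of each series around its own mean.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly circulation (thousands of magazines) and branded clicks.
weekly_circulation = [700, 0, 450, 0, 900]
weekly_branded_clicks = [5200, 3100, 4300, 3000, 6100]

r = pearson(weekly_circulation, weekly_branded_clicks)
print(f"correlation: {r:.3f}")
```

The normalization by the two spreads is what keeps the result between -1 and +1 regardless of the units used for circulation or clicks.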
We can use the correlation to decide how much a brand relies on circulation. For example:
Branded Clicks and Circulation are highly correlated for some brands (0.799), meaning whenever there is a big magazine mailing (orange bars), branded SEO clicks also spike (blue line).
Other brands show slightly lower correlation (0.441). Perhaps magazine mailings cause branded SEO clicks to go up a bit but there are other factors involved.
As expected, there is no real correlation between circulation and non-brand clicks. Someone receiving a brand magazine would not necessarily be influenced to search for a generic term like “sweaters.”
The correlation metric on its own is limited in usefulness, but it suggests whether there is a relationship we can use in predictive models. Correlation only measures the strength of the relationship—to quantify the impact circulation has on branded clicks we can use regression models.
Models are simple representations of reality that can predict things, like expected branded SEO clicks based on future circulation numbers. Luckily, magazine mailings are scheduled far ahead of time because of the production that’s necessary to assemble a catalog. In this case, we had data through the rest of the year.
Similar to how we use data science to optimize PPC ads, we can apply data science techniques like regression modeling in Python or R to create these predictions. By doing so, we estimate how many branded clicks each additional unit of circulation is associated with. That estimate can then be used to predict future branded clicks based on planned circulation and other variables.
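As a simplified illustration of the idea, the sketch below fits an ordinary least squares line with circulation as the only predictor and applies it to planned mailings. A real model would likely include other variables such as seasonality; the function names and all numbers here are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of ys = intercept + slope * xs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a slope when circulation never varies")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # extra clicks per extra unit of circulation
    intercept = mean_y - slope * mean_x  # expected clicks with no mailing at all
    return intercept, slope


def predict(intercept, slope, circulation):
    """Expected branded clicks for a given circulation level."""
    return intercept + slope * circulation


# Hypothetical history: weekly circulation (thousands) and branded clicks.
past_circulation = [700, 0, 450, 0, 900, 300]
past_clicks = [5200, 3100, 4300, 3000, 6100, 3900]

intercept, slope = fit_line(past_circulation, past_clicks)

# Hypothetical planned circulation for upcoming weeks (thousands).
for planned in [600, 0, 400]:
    print(planned, round(predict(intercept, slope, planned)))
```

The slope is the number to read first: it is the estimated change in weekly branded clicks for each additional thousand magazines mailed, under the assumptions above.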
We then add these results to our tool so we can easily visualize the impact of lower circulation (orange bars) on branded clicks for the rest of the year (gray line).
Why is this useful? Well, knowing how branded SEO clicks are expected to perform allows us to explain to a client any drops or spikes in SEO traffic that may occur as a result of lower circulation. It also allows the client to choose a level of circulation to target certain SEO traffic numbers in order to hit desired financial goals.
Hopefully this case study has shown the importance of embracing a true omni-channel view, and that seemingly distant marketing channels are actually related in interesting and complex ways.
Don’t be afraid to venture beyond your typical orbit! Be sure to speak with other departments because SEO can be affected by many factors.