This post is about the value of consumer data to corporations, seen partly through the lens of systems theory. I try to explore exactly how companies draw you into their product web - and why they really can’t afford for you to leave. I’ll also show how these same principles explain why most recommendation systems will never truly enable discovery on the part of the user.
First, some background on the value of data and data analytics. According to IDC, the worldwide big data analytics market will reach $125 billion in 2015. More than ever, companies across industries see analytics as a critical priority. Analytics is now a must-have across the board - a baseline, not a competitive advantage.
For those newer to analytics, a few definitions are helpful. There is descriptive (or exploratory) analytics, where underlying data is visualized or filtered to highlight areas for attention and make insights easier to discover. There is predictive analytics, where data is used to build models that can project key dependent variables - future performance, demand, cash flow, and so on - with some confidence. Then there is prescriptive analytics, where models and data are used to generate a recommended business action that optimizes some key metric for the company.
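To make the distinction concrete, here is a toy sketch in Python. Every number and the unit economics are invented for illustration: descriptive analytics summarizes past demand, predictive analytics projects the next data point, and prescriptive analytics turns that projection into a recommended action.

```python
# Toy illustration of the three analytics tiers on hypothetical demand data.
# Every number here is invented for illustration.

monthly_demand = [100, 110, 125, 130, 145, 150]  # units sold, months 1-6

# Descriptive: summarize the data to surface patterns.
n = len(monthly_demand)
average = sum(monthly_demand) / n
total_growth = monthly_demand[-1] - monthly_demand[0]

# Predictive: fit a linear trend by least squares and project month 7.
xs = range(1, n + 1)
x_mean = sum(xs) / n
slope = (sum((x - x_mean) * (y - average) for x, y in zip(xs, monthly_demand))
         / sum((x - x_mean) ** 2 for x in xs))
intercept = average - slope * x_mean
next_month = intercept + slope * (n + 1)

# Prescriptive: recommend the order quantity that maximizes expected profit
# under the forecast and some assumed unit economics.
unit_cost, unit_price = 4.0, 10.0

def expected_profit(order_qty):
    sold = min(order_qty, next_month)  # can't sell more than demand
    return sold * unit_price - order_qty * unit_cost

best_order = max(range(0, 300, 10), key=expected_profit)
```

On this toy data the prescriptive step recommends ordering just under the forecast demand, because overstocked units cost more than they earn.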
When discussed inside corporations or between shareholders, the value of data is often stated in terms of the data itself. How comprehensive is the data? How many users or days do we have data for? How reliable is the data? How clean is it? How relational is it? How easy is it for us to work with? Who else might value it?
However, the real value of internal data, especially for B2C companies, lies in how predictive and prescriptive that data can be. Is the data complete enough to form a reliable model of consumer behavior? Is it relational enough to identify the key drivers of important user performance metrics?
Facebook, Google, and other large tech companies with a variety of product offerings already have enough user data to perform a variety of seemingly impossible tasks, such as tracking flu outbreaks or influencing our mood. Thus the key for them in leveraging our data (yes, OUR data) is not collecting more of it - it’s creating environments in which the data they have can serve as a complete model for our behavior. And that’s where it gets scary.
The Race to Close The System
I’ll digress for a moment to discuss some definitions in systems theory. Bertalanffy describes an open system as one where there is “an exchange of matter with [the] environment, presenting import and export, building-up and breaking-down of its material components.” In other words, there are external forces acting upon the system and driving the behavior of the system’s participants. In contrast, a closed system is one characterized by isolation of its elements from external forces.
When you attempt to model the behavior of an actor in a system, you rely upon two things chiefly: data and relationships. Data can describe the amount of something, such as the number of users, or the rate of something, such as the churn rate or subscription rate of users. Relationships can show how these amounts or rates change over time based on their relationships to other amounts or rates. For example, churn rate may be linked to the availability of competing products, or subscription rate may be linked to marketing spending in certain channels.
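A minimal sketch of this amounts-and-rates framing, with entirely invented coefficients (the `simulate` helper is hypothetical, not any company’s actual model): the user base is an amount, signups are an inflow tied to marketing spend, and churn is an outflow rate tied to competitor availability.

```python
# Sketch of the amounts-and-rates idea: a user base (an amount) evolves
# under a signup inflow tied to marketing spend and a churn rate tied to
# competitor availability. All coefficients are invented.

def simulate(months, marketing_spend, competitors, users=10_000.0):
    """Step the user count forward month by month and return the result."""
    for _ in range(months):
        signups = 1.0 * marketing_spend         # hypothetical: $1 of spend -> 1 signup
        churn_rate = 0.02 + 0.01 * competitors  # more rivals -> faster outflow
        users = users + signups - churn_rate * users
    return users

calm = simulate(12, marketing_spend=200, competitors=1)
contested = simulate(12, marketing_spend=200, competitors=5)
# Same product, same spend - but the contested market bleeds users faster.
```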
In open systems, much of the data or relational knowledge required to construct an effective model is unknown or unreliable. For example, it is difficult to build a model that takes a competitor’s marketing spend as an input if the competitor closely guards that information. Thus consumers can appear to be behaving randomly even when there is a clear but unmeasurable external stimulus. Tracking down these external forces is costly, often prohibitively so. Generally they are only uncovered through extensive A/B tests, which can still be inconclusive, or focus group testing, which is inefficient. And even then, those insights would still need to be translated into business action (prescriptive analytics).
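This omitted-variable problem can be simulated directly. In the hedged sketch below (data-generating numbers invented), churn is driven by both our own price and a rival’s promo spend we never observe; fitting on price alone leaves most of the variance unexplained, so the consumer looks random to the model.

```python
import random

random.seed(0)

# Hypothetical generative process: churn is driven by our price AND a
# rival's promo spend, which we never observe. All coefficients invented.
n = 200
our_price = [random.uniform(8, 12) for _ in range(n)]
rival_promo = [random.uniform(0, 100) for _ in range(n)]  # hidden from us
churn = [0.5 * p + 0.05 * r + random.gauss(0, 0.2)
         for p, r in zip(our_price, rival_promo)]

def r_squared(xs, ys):
    """Fit y = a*x + b by least squares; return the variance explained."""
    m = len(xs)
    xm, ym = sum(xs) / m, sum(ys) / m
    a = (sum((x - xm) * (y - ym) for x, y in zip(xs, ys))
         / sum((x - xm) ** 2 for x in xs))
    b = ym - a * xm
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - ym) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Modeling churn from price alone explains only a small fraction of the
# variance: the consumer looks "random" because the real driver is hidden.
fit_open = r_squared(our_price, churn)
```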
Additionally, when a customer leaves a particular service, their behavior is no longer measured by that service. There is no way to close the feedback loop with a clear understanding of the consumer’s next behavior. All you know is that they abandoned the service - not why, or for whom. That makes forming a prescriptive model even more challenging, because it cannot always learn from the results of the actions it takes.
This is why large tech companies with a trove of data will often go out of their way to create closed systems for their users. The more data that is held internally and the more relational the data, the better the predictive models that are generated. But additionally, in closed systems, tech companies can artificially restrict stimuli so as to remove variability in user behavior. This makes it easier to predict purchasing activity or project the lifetime value of a customer - key business metrics.
The Only Winning Recommendation Is Not To Play
See, Amazon doesn’t care about recommending you the product you’re going to enjoy. It cares about showing you the product that you’re the most likely to buy, because that’s what it can predict. And then Amazon will spend all its time convincing you that the product you’re the most likely to buy is the one you really wanted all along.
Netflix doesn’t really care about allowing users to truly “discover” content. It will recommend you the movie you are the most likely to watch next, because that is what it can predict. And it will try to convince you that your taste ought to be more narrow than it is (“Teen Paranormal Dramas”, anyone?) because that will make its recommendation system look more powerful.
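The gap between “most likely to consume” and “most enjoyable” is easy to state as an objective function. A toy sketch, with entirely invented scores for hypothetical catalogue items:

```python
# Hypothetical catalogue: each title carries a predicted probability the
# user watches it next, and a (much harder to estimate) enjoyment score.
# The numbers are invented purely to show the two objectives can disagree.

catalogue = {
    "sequel_to_last_watch": {"p_watch": 0.60, "enjoyment": 6.0},
    "safe_genre_pick":      {"p_watch": 0.45, "enjoyment": 5.5},
    "adventurous_pick":     {"p_watch": 0.15, "enjoyment": 9.0},
}

# A system optimizing engagement ranks by watch probability...
most_likely = max(catalogue, key=lambda t: catalogue[t]["p_watch"])

# ...while a system optimizing discovery would rank by expected enjoyment.
most_enjoyed = max(catalogue, key=lambda t: catalogue[t]["enjoyment"])
```

When the two objectives disagree, the engagement-optimizing system always surfaces the safe sequel, never the adventurous pick.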
And most music services don’t really care about providing a good “discovery” mechanism for new music, because that’s not what they want either. They want to deliver a certain number of listens to the labels with which they have relationships, so they can continue those relationships. They want to program content such that the artists that are supposed to get listened to actually do. How can music platforms negotiate the value of content with content providers if they don’t understand the behavior of their own audience and their relationship with that content? And that question quickly becomes: how can they understand and predict their own audience unless they control their users’ behavior?
All of these companies create habit loops for consumers, and they’ve gotten really goddamn good at it. They can identify and test our triggers because their databases are full of correlations, if not causations. They do this not to keep us satisfied - they do it to keep us predictable. They do it to keep us monetizable. And there is very little business incentive for most companies to structure products or ecosystems any other way.
So the next time you use Apple Maps - and it still sucks - just remember why it exists. Remember that companies don’t want you to discover anything new, to find surprising content, to explore. Those are human needs - not corporate needs. So please, fight the habit loops and the seamless systems, because you know they won’t do it for you.