Big data, big problems

Lisa R. Lifshitz
On May 27, the U.S. Federal Trade Commission released its study of data broker practices, entitled “Data Brokers: A Call for Transparency and Accountability.” In its 18-month review of the data collection and use practices of nine significant data brokers (Acxiom, CoreLogic, Datalogix, eBureau, ID Analytics, Intelius, PeekYou, Rapleaf, and Recorded Future), the FTC obtained detailed information regarding the data brokers’ practices, including the nature and sources of consumer data they collect; how they use, maintain, and disseminate the data; and the extent to which data brokers currently allow U.S. consumers to access and correct data about them or to opt out of having their personal information sold or shared.

Not surprisingly, the report indicates that “big data” heralds some big problems for U.S. consumers (and virtually everyone else). A few (creepy) highlights are shared below.

Interestingly, the FTC found none of the nine data brokers sampled collected data directly from consumers. Rather, they collect data from:
•    government sources (even though some of this information is technically protected by law);
•    other publicly available sources (e.g., social media, blogs, the Internet generally); and
•    commercial sources (retailers and catalogue companies, as well as other data brokers, including one another).

While each data broker source may provide only a few data elements about a consumer’s activities, data brokers typically stitch all of these together to form a more detailed composite of the consumer’s life. It is virtually impossible for consumers to determine the originator of a particular data element, given the multiple layers of data brokers — this means there is very little accountability on the part of individual brokers.
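The stitching described above can be pictured with a short, purely illustrative sketch. The record layout, the `consumer_key` field, and the `stitch_profiles` function are assumptions made for this example and do not come from the FTC report. The sketch shows how a few elements from each source add up to one composite profile, and how the originating source disappears unless the broker chooses to keep it.

```python
from collections import defaultdict

# Hypothetical records: each source contributes only a few data elements.
SOURCE_RECORDS = [
    {"source": "public_record", "consumer_key": "jane-doe-1970",
     "elements": {"address": "12 Elm St"}},
    {"source": "retailer", "consumer_key": "jane-doe-1970",
     "elements": {"purchase_category": "outdoor gear"}},
    {"source": "other_broker", "consumer_key": "jane-doe-1970",
     "elements": {"income_range": "50-75k"}},
]


def stitch_profiles(records, keep_provenance=False):
    """Merge per-source data elements into one composite profile per consumer."""
    profiles = defaultdict(dict)
    for record in records:
        for name, value in record["elements"].items():
            if keep_provenance:
                # Retain the source so the element can be traced back later.
                profiles[record["consumer_key"]][name] = {
                    "value": value,
                    "source": record["source"],
                }
            else:
                # Only the value survives; the originator is lost.
                profiles[record["consumer_key"]][name] = value
    return dict(profiles)


composite = stitch_profiles(SOURCE_RECORDS)
traceable = stitch_profiles(SOURCE_RECORDS, keep_provenance=True)
print(composite["jane-doe-1970"])
```

In the default mode, nothing in the composite profile says which source supplied the income range, which is the accountability gap in miniature.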

Data brokers sell both the actual and derived data elements to their clients, creating lists of consumers with similar characteristics and using the data to predict consumer behaviour. Some of these data segments are relatively innocuous, like “outdoor/hunting and shooting.” Other segments are clearly more sensitive because they rely on ethnicity, income level, education level, or specific interests mined from the information provided. Examples include “financially challenged,” “African-American professional,” “diabetes interest,” “cholesterol focus,” and “rural everlasting,” which includes single men and women over the age of 66 with “low educational attainment and low net worths.”

The data brokers offer products in three broad categories:
•    marketing (the biggest category by far);
•    risk mitigation; and
•    people search.

These products generated a combined total of approximately $426-million in annual revenue in 2012 for the nine data brokers. While “behavioural advertising” and targeting consumers with ads based on their specific interests are old hat, this report revealed the practice of “onboarding.”

“Onboarding” is the process whereby a data broker adds offline data into a cookie to enable advertisers to target consumers virtually anywhere on the Internet.

Onboarding clients either provide data about their customers to a data broker, which then finds those consumers on the Internet to deliver targeted advertisements, or use a data broker to identify an audience of consumers who are likely to share particular characteristics and find those consumers on the Internet to deliver advertisements.

After the data broker finds the relevant target consumer, it places a cookie on the browser of each consumer who has logged on to the relevant web site; the cookie includes information the data broker has appended to the consumer’s profile. The data broker can advertise to the consumer across the Internet for as long as the cookie stays on the consumer’s browser. The data broker either acts as an advertising network itself by buying advertising space on various web sites or contracts with advertising networks that have secured advertising space on these web sites. All this (largely) without actual consumer knowledge.
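For readers who want to see the mechanics, here is a minimal sketch of the matching step in onboarding. It assumes the match is made on a hashed e-mail address at login, which is an assumption for illustration; the customer list, the cookie name `broker_segment`, and every function name are hypothetical and are not taken from the FTC report.

```python
import hashlib

# Hypothetical offline customer list supplied by an onboarding client.
OFFLINE_CUSTOMERS = [
    {"email": "Jane.Doe@example.com", "segment": "outdoor/hunting and shooting"},
    {"email": "sam@example.com", "segment": "cholesterol focus"},
]


def normalize_and_hash(email: str) -> str:
    """Lower-case and trim the address, then hash it (assumed matching key)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def build_offline_index(customers):
    """Index the offline data by hashed e-mail address."""
    return {normalize_and_hash(c["email"]): c["segment"] for c in customers}


def cookie_for_login(login_email, offline_index):
    """Return the cookie to set when a consumer logs in, or None if no match."""
    segment = offline_index.get(normalize_and_hash(login_email))
    if segment is None:
        return None
    # The offline segment now travels with the browser via this cookie.
    return {"name": "broker_segment", "value": segment}


index = build_offline_index(OFFLINE_CUSTOMERS)
cookie = cookie_for_login(" jane.doe@example.com ", index)
no_cookie = cookie_for_login("nobody@example.com", index)
print(cookie)
```

The consumer only logs on to a web site; the link between that login and the offline profile is made entirely behind the scenes.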

A staggering amount of data is being collected on consumers: one data broker’s database has information on 1.4-billion consumer transactions and more than 700-billion aggregated data elements; another’s database covers one trillion dollars in consumer transactions. This massive collection of “big data” (and the use of such data) poses considerable privacy risks, not the least being secondary impacts.

Consumers cannot easily mitigate the impact of poor financial scoring or the possible misuse of health-related information that would cause an insurance company to score the “diabetes interest” consumer as high risk.

Privacy policies remain vague; only a few of the data brokers offer any kind of data opt out or redress. Even where an opt out is available, it typically does not take effect immediately; it can take a data broker several weeks to suppress a consumer’s personal information from its database. Furthermore, even if a consumer tries to opt out, information about that consumer might still appear in another consumer’s records, such as those of a spouse. Or, if a consumer submits identifying information in an opt-out request that varies from the identifying information in the data broker’s records, the opt out may not capture all of those records. Not to mention all the data that was previously sold by the broker to other brokers downstream.
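A small sketch shows why an opt out can miss records. It assumes, purely for illustration, that the broker suppresses records by exact match on name and address; the records, field names, and `suppress_exact` function are hypothetical and do not describe any particular broker’s system.

```python
# Hypothetical broker records, including a variant spelling and a spouse's file.
BROKER_RECORDS = [
    {"id": 1, "name": "Jane Doe", "address": "12 Elm St"},
    {"id": 2, "name": "Jane Q. Doe", "address": "12 Elm Street"},
    {"id": 3, "name": "John Doe", "address": "12 Elm St",
     "household": ["Jane Doe"]},
]


def suppress_exact(records, name, address):
    """Return ids of records an exact-match opt-out request would suppress."""
    return [
        r["id"] for r in records
        if r["name"] == name and r["address"] == address
    ]


def still_mentions(records, suppressed_ids, first_name, full_name):
    """Return ids of unsuppressed records that still refer to the consumer."""
    return [
        r["id"] for r in records
        if r["id"] not in suppressed_ids
        and (first_name in r["name"] or full_name in r.get("household", []))
    ]


suppressed = suppress_exact(BROKER_RECORDS, "Jane Doe", "12 Elm St")
leftover = still_mentions(BROKER_RECORDS, suppressed, "Jane", "Jane Doe")
print(suppressed, leftover)
```

Under these assumptions, only one of the three records is suppressed: the variant spelling and the spouse’s household entry both survive the request.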

The FTC also noted some of the data brokers profiled store all data collected indefinitely, even if it is later updated, unless otherwise prohibited by contract. Although stored data may be useful for future business purposes, it offers a tempting treat for identity thieves.

Concerned by these findings, the FTC called for the U.S. Congress to enact legislation that would enable consumers to learn of the existence and activities of data brokers and provide consumers with reasonable access to information (including sensitive data) about them held by these entities (and allow for opt out to avoid sharing such data for marketing purposes).

The FTC also proposed the creation of a centralized mechanism, such as an Internet portal, where data brokers can identify themselves, describe their information collection and use practices, and provide links to access tools and opt outs.

Data brokers should also clearly disclose to consumers (e.g., on their web sites) that they not only use the raw data they obtain from their sources (such as a person’s name, address, age, and income range), but they also derive from the data certain data elements, and these categories should be clearly identified. Data brokers should also be required to disclose the names and/or categories of their sources of data, so consumers are better able to determine if, for example, they need to correct their data with an original public record source.

Finally, the FTC advocated that Congress should require consumer-facing companies to provide a prominent notice to consumers that they share consumer data with data brokers and allow consumers to opt out of sharing their information with data brokers. In the meantime, the FTC urged data brokers to practise “privacy by design” and take reasonable precautions to ensure downstream users of their data do not use it for unlawful discriminatory purposes.

So where do we stand in Canada? At a meet and greet with the new federal Privacy Commissioner, Daniel Therrien, in Toronto earlier this month, the assembly of in-house lawyers, privacy officers, and outside privacy practitioners unanimously requested additional guidance from the OPC regarding big data practices, from both sides of the table. This request was duly noted, but given all of the pressing matters on the OPC’s plate (Bills C-13 and S-4, to name just two), I fear we will have a long wait on this urgent matter.


Bonus CASL update

Last week I called the CRTC at the behest of a client, on a no-names basis, to seek additional CASL guidance regarding the “InMail” feature of LinkedIn. After waiting patiently for the CRTC to call me back, I put my questions to our regulator and received the following response: “What is LinkedIn?” After I explained the concept to the representative, the young lady pointed me to an (utterly generic) section of the CRTC’s CASL sub site and told me to consult a lawyer if I had further questions.

True story. I couldn’t make this one up if I tried.