Is Anonymous Credit Card Data Really Anonymous?

Every business needs to be concerned about data security and cyber liability, which is why many are taking steps to protect their data and reduce the likelihood of a data security breach. Unfortunately, not every business believes it is at risk. Perhaps a recent study of credit card data will help these businesses realize that the risk of cyber liability is very real and can be very costly.

Researchers at the Massachusetts Institute of Technology found that anonymous credit card data isn’t so anonymous after all. According to the MIT study, removing personally identifiable information from credit card transaction data does not make it anonymous and does not make it safe for release to the public or to third parties.

The study analyzed credit card data of 1.1 million users in 10,000 stores over a 3-month period. Each credit card transaction was time-stamped and associated with a specific store. However, the data was ‘anonymized’ by stripping names, account numbers and obvious identifiers, such as addresses, phone numbers or other personally identifiable information.

Although the data had been ‘anonymized’, MIT researchers found that knowing four random spatiotemporal points (an individual’s location at a specific time) is enough to uniquely identify 90% of individuals and uncover their records. Consider the following example included in the study:

Let’s say that we are searching for Scott in a simply anonymized credit card data set. We know two points about Scott: he went to the bakery on 23 September and to the restaurant on 24 September. Searching through the data set reveals that there is one and only one person in the entire data set who went to these two places on these two days. Scott is re-identified, and we now know all of his other transactions, such as the fact that he went shopping for shoes and groceries on 23 September, and how much he spent.
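To make the mechanics of that example concrete, here is a minimal Python sketch of the same kind of search. It is an illustration only, not the researchers' code: the records, the pseudonymous IDs (`u1`, `u2`, `u3`), the amounts and the `matching_users` helper are all made up for this post.

```python
from collections import defaultdict

# Hypothetical 'anonymized' records: (pseudonymous_id, shop, date, amount).
# Names and account numbers are gone, but each row still carries a place and a day.
transactions = [
    ("u1", "bakery", "09-23", 4.50),
    ("u1", "restaurant", "09-24", 32.00),
    ("u1", "shoe store", "09-23", 79.99),
    ("u1", "grocery", "09-23", 41.20),
    ("u2", "bakery", "09-23", 6.10),
    ("u2", "cinema", "09-24", 12.00),
    ("u3", "restaurant", "09-24", 55.00),
    ("u3", "grocery", "09-25", 18.75),
]


def matching_users(records, known_points):
    """Return the pseudonymous IDs whose history contains every known (shop, date) point."""
    visits = defaultdict(set)
    for user, shop, date, _amount in records:
        visits[user].add((shop, date))
    return sorted(user for user, points in visits.items() if set(known_points) <= points)


# The two things we know about Scott from outside the data set.
known_about_scott = [("bakery", "09-23"), ("restaurant", "09-24")]
candidates = matching_users(transactions, known_about_scott)
print(candidates)

# If exactly one ID matches, every other transaction under that ID is exposed.
scott_history = []
if len(candidates) == 1:
    scott_history = [row for row in transactions if row[0] == candidates[0]]
    print(scott_history)
```

In this toy data set only one ID matches both points, so the shoe and grocery purchases attached to that ID are revealed as well, which is the outcome the study describes.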

Researchers also discovered that one additional piece of information, the approximate price of a transaction, significantly increases the chances of being identified.

Although knowing the location of my local coffee shop and the approximate time I was there this morning helps to re-identify me, knowing the approximate price of my coffee significantly increases the chances of re-identifying me. In fact, adding the approximate price of the transaction increases, on average, the [risk of being identified] by 22%.
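A small Python sketch shows why even a rough price helps. Everything in it is assumed for illustration: the records, the three price ranges in `price_bucket`, and the `candidates` helper are hypothetical and are not taken from the study.

```python
# Hypothetical 'anonymized' records: (pseudonymous_id, shop, date, amount).
records = [
    ("u1", "coffee shop", "03-02", 3.25),
    ("u2", "coffee shop", "03-02", 3.40),
    ("u3", "coffee shop", "03-02", 11.80),
    ("u4", "coffee shop", "03-02", 24.10),
    ("u5", "bookstore", "03-02", 3.30),
]


def price_bucket(amount):
    """Coarsen an amount into an assumed range: under 5, 5 to 20, or over 20."""
    if amount < 5:
        return "low"
    if amount <= 20:
        return "mid"
    return "high"


def candidates(rows, shop, date, bucket=None):
    """IDs seen at this shop on this date, optionally also matching a price range."""
    return sorted({
        user
        for user, row_shop, row_date, amount in rows
        if row_shop == shop
        and row_date == date
        and (bucket is None or price_bucket(amount) == bucket)
    })


# Knowing only the place and the day leaves several possible people.
by_place_and_day = candidates(records, "coffee shop", "03-02")
print(by_place_and_day)

# Adding an approximate price narrows the list.
with_price = candidates(records, "coffee shop", "03-02", bucket="mid")
print(with_price)
```

Here the place and day alone leave four possible IDs, while adding the approximate price leaves one. Real data will not always narrow this sharply, but each extra detail can only shrink the list of candidates.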

MIT researchers also studied the effects of gender and income on the likelihood of being identified. According to their results:

  • The odds of women being identified are 1.2 times greater than for men.
  • The odds of high-income people being identified are 1.7 times greater than for low-income people.
  • The odds of medium-income people being identified are 1.1 times greater than for low-income people.
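Note that odds are not the same as probability, so "1.2 times greater odds" does not mean a 20% higher chance of being identified. The short Python sketch below converts each figure into a probability, treating the figures above as odds ratios and assuming a purely hypothetical 50% baseline chance for the comparison group. The baseline is an assumption made for illustration and does not come from the study.

```python
def probability_after_odds_ratio(baseline_probability, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    baseline_odds = baseline_probability / (1 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


# Assumed baseline chance of identification for the comparison group (hypothetical).
baseline = 0.50

comparisons = [
    ("women vs. men", 1.2),
    ("high income vs. low income", 1.7),
    ("medium income vs. low income", 1.1),
]

results = {}
for label, ratio in comparisons:
    results[label] = probability_after_odds_ratio(baseline, ratio)
    print(f"{label}: {baseline:.0%} -> {results[label]:.1%}")
```

With a different baseline the resulting probabilities change, which is why the odds figures are best read as a relative comparison between groups rather than as an absolute risk.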

Since individuals may be identified relatively easily from anonymous credit card data, MIT researchers suggest that simply removing names, home addresses, phone numbers, or other personally identifiable information may not be sufficient to protect individuals’ privacy. From a policy perspective, MIT researchers believe data protection mechanisms must move beyond the concepts of personally identifiable information and anonymity toward a more quantitative assessment of the likelihood that data can be used to identify individuals.

Until then, businesses should do everything in their power to protect their data and obtain insurance to manage their cyber risks. There are a number of cyber liability products that protect against privacy injuries, such as identity theft, and that cover the cost of complying with various data breach notice laws.

If you would like to learn more about insuring against cyber risks, contact us.
