Consumer Analytics - Privacy


The purpose of data mining is to discriminate …

  • who gets the loan
  • who gets the special offer

Certain kinds of discrimination are both unethical and illegal

  • racial, sexual, religious, …

But it depends on the context

  • sex discrimination is usually illegal (except in medicine, where doctors are expected to take gender into account)
  • information that appears innocuous may not be: ZIP code correlates with race, and membership of certain organizations correlates with gender

Information privacy laws


  • A purpose must be stated for any personal information collected
  • Such information must not be disclosed to others without consent
  • Records kept on individuals must be accurate and up to date
  • To ensure accuracy, individuals should be able to review data about themselves
  • Data must be deleted when it is no longer needed for the stated purpose
  • Personal information must not be transmitted to locations where equivalent data protection cannot be assured
  • Some data is too sensitive to be collected, except in extreme circumstances (e.g., sexual orientation, religion)


Anonymization is harder than you think

Medical records

When Massachusetts released medical records summarizing every state employee's hospital record in the mid-1990s, the governor gave a public assurance that the data had been anonymized by removing all identifying information, such as name, address, and social security number.

He was surprised to receive his own health records (which included diagnoses and prescriptions) in the mail.


Using publicly available records:

  • 50% of Americans can be identified from city, birth date, and sex
  • 85% can be identified if you include the 5-digit ZIP code as well
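
As a rough illustration of why adding one more attribute raises the re-identification rate, the sketch below counts how many records are unique on a given combination of columns. The records, the column order, and the helper name `unique_fraction` are hypothetical and exist only for this example; they are not the data behind the figures above.

```python
from collections import Counter

# Hypothetical toy records: (city, birth_date, sex, zip_code).
records = [
    ("Boston", "1961-03-14", "F", "02115"),
    ("Boston", "1961-03-14", "F", "02130"),
    ("Boston", "1975-07-02", "M", "02115"),
    ("Cambridge", "1961-03-14", "F", "02138"),
    ("Cambridge", "1988-11-30", "M", "02139"),
    ("Cambridge", "1988-11-30", "M", "02139"),
]


def unique_fraction(rows, columns):
    """Fraction of rows whose values in `columns` occur exactly once."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


# Columns 0-2 are city, birth date, and sex; column 3 adds the ZIP code.
without_zip = unique_fraction(records, columns=(0, 1, 2))
with_zip = unique_fraction(records, columns=(0, 1, 2, 3))
print(f"unique without ZIP: {without_zip:.2f}")
print(f"unique with ZIP:    {with_zip:.2f}")
```

In this toy table, 2 of 6 records are unique on city, birth date, and sex, and 4 of 6 are unique once the ZIP code is added. A record that is unique on such a combination can be linked to any public list that carries the same attributes next to a name.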


Netflix movie database: 100 million records of movie ratings (1–5)

  • Can identify 99% of people in the database if you know their ratings for 6 movies and approximately when they saw the movies (± one week)
  • Can identify 70% if you know their ratings for 2 movies and roughly when they saw them
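
The matching idea behind these figures can be sketched as a filter: keep only the pseudonymous users whose ratings agree with everything an observer already knows. The data, the pseudonyms, and the function names below are hypothetical, and the exact-rating rule with a one-week window is a simplification, not the method used in the original study.

```python
from datetime import date, timedelta

# Hypothetical "anonymized" ratings: pseudonym -> {movie: (rating, date)}.
ratings = {
    "u1": {"A": (5, date(2005, 3, 1)), "B": (2, date(2005, 3, 9)), "C": (4, date(2005, 4, 2))},
    "u2": {"A": (5, date(2005, 3, 3)), "B": (4, date(2005, 6, 1)), "D": (1, date(2005, 4, 2))},
    "u3": {"A": (3, date(2005, 3, 2)), "C": (4, date(2005, 4, 5))},
}


def is_consistent(rated, known, window):
    """True if every known (movie, rating, approximate date) fits this user."""
    for movie, rating, approx_date in known:
        if movie not in rated:
            return False
        actual_rating, actual_date = rated[movie]
        if actual_rating != rating or abs(actual_date - approx_date) > window:
            return False
    return True


def candidates(ratings, known, window=timedelta(days=7)):
    """Pseudonyms that remain possible given the observer's knowledge."""
    return sorted(user for user, rated in ratings.items()
                  if is_consistent(rated, known, window))


one_clue = candidates(ratings, [("A", 5, date(2005, 3, 2))])
two_clues = candidates(ratings, [("A", 5, date(2005, 3, 2)),
                                 ("B", 2, date(2005, 3, 10))])
print(one_clue, two_clues)
```

With one known rating, two pseudonyms remain possible; a second known rating narrows the set to a single pseudonym. Each additional clue can only shrink the candidate set, which is why a handful of ratings is often enough.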

AOL search engine queries

In 2006, AOL released on the web a text file containing 20 million search engine queries made by 650,000 users over a 3-month period, intended for research purposes. The file had been anonymized by replacing user names with random numbers, one per user. However, some of the queries contained clues to the user's identity. The New York Times was able to locate an individual from these supposedly anonymized search records by cross-referencing them with phonebook listings. Look up this renowned example of re-identification and read about it. What is the name of the user identified by the New York Times?
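
The anonymization step described above (one random number per user) can be sketched as follows. The log entries and function names are hypothetical, and this is not the procedure that was actually used to produce the released file. The point is that every query from one person still carries the same pseudonym, so the queries can be read together as a profile.

```python
import secrets
from collections import defaultdict

# Hypothetical query log: (user name, query text).
log = [
    ("alice", "landscapers in smalltown"),
    ("bob", "cheap flights"),
    ("alice", "homes sold in smalltown"),
    ("alice", "people named jones"),
]


def pseudonymize(log):
    """Replace each user name with a random number, one per user."""
    ids, used, released = {}, set(), []
    for user, query in log:
        if user not in ids:
            new_id = secrets.randbelow(10**9)
            while new_id in used:  # avoid giving two users the same number
                new_id = secrets.randbelow(10**9)
            used.add(new_id)
            ids[user] = new_id
        released.append((ids[user], query))
    return released


def profiles(released):
    """Group the released queries by pseudonym."""
    grouped = defaultdict(list)
    for pseudonym, query in released:
        grouped[pseudonym].append(query)
    return dict(grouped)


released = pseudonymize(log)
by_pseudonym = profiles(released)
for pseudonym, queries in by_pseudonym.items():
    print(pseudonym, queries)
```

The names are gone, but the grouped queries (a town, a surname, a local service) are exactly the kind of clues that can be cross-referenced with a public directory.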

NSA - Metadata Match (Stanford)

Stanford Researchers: It Is Trivially Easy to Match Metadata to Real People

Open Person Directory

  • Yelp
  • Google Places
  • Facebook directories

Data Type

Any device-scoped data, such as a MAC address or a browser fingerprint, is not privacy-safe because it enables tracking.
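
To see why device-scoped attributes enable tracking, here is a minimal sketch that hashes a set of device attributes into a stable identifier. The attribute names and values are hypothetical, and real fingerprinting combines many more signals; the sketch only shows that the same device yields the same identifier on every visit without any cookie or login.

```python
import hashlib
import json


def fingerprint(attributes):
    """Hash a dict of device attributes into a stable hex identifier."""
    # Sorting the keys makes the result independent of attribute order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attributes reported by the same device on two visits.
visit_monday = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "Europe/Amsterdam",
    "fonts": ["Arial", "Courier"],
}
visit_friday = {
    "fonts": ["Arial", "Courier"],
    "timezone": "Europe/Amsterdam",
    "screen": "1920x1080",
    "user_agent": "ExampleBrowser/1.0",
}
# A different device differs in at least one attribute.
other_device = dict(visit_monday, screen="1366x768")

print(fingerprint(visit_monday) == fingerprint(visit_friday))
print(fingerprint(visit_monday) == fingerprint(other_device))
```

The two visits map to the same identifier while the other device does not, so the identifier can link visits to one device for as long as its attributes stay the same.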


You can manage your Google privacy settings at:

PGP (Pretty Good Privacy)

PGP (Pretty Good Privacy) is a cryptographic protocol that aims to increase privacy in the digital world.
