How Google’s thirst for unique content leads to massive personal privacy violations

How do your address, phone number, income level, and even your credit score get on Google? Our CEO, Dimitri Shelest, uncovers the hidden mechanics that make Google and information brokers partners in massive privacy violations.
Let me begin with a confession. Long before I founded the privacy protection web service Onerep, I was an SEO consultant and then an entrepreneur with a focus on large database-driven websites. Rather than building static pages filled with content in advance, such websites create thousands or even millions of dynamic pages by pulling data from large datasets in real time.
It’s not a big deal to create such a site; there are tons of free and paid datasets on any topic available out there. The hard part — and this is where search engine optimization steps into action — is to convince Google that it’s your website, not a similar one created from exactly the same dataset, that deserves to be ranked high in its search results. If you succeed, a golden rain of free Google traffic falls on your business.
Giant, below-the-radar, hardly regulated industry
Through my consulting work with big sites, I eventually came across a niche of destinations that can be classified as people-search sites, such as Whitepages, BeenVerified, MyLife, and Spokeo. These sites compile and publish profiles on hundreds of millions of people. The profiles are filled with information pulled from public records and consumer databases. Besides your social media accounts, your profiles on such sites are what people see when they Google your name. These profiles promise to reveal, and do reveal, a ton of your private information: your phone number, your home address, your family ties, your political affiliation, your financial records, your criminal records, and, more recently, even your hobbies.
The size and scale of this niche are truly mind-blowing. Few people realize this, but we look one another up on Google all the time. A typical use case: two people meet on Tinder and agree to have dinner. Before going out with a complete stranger, a woman Googles her date’s name. Usually she will Google something like “John Snow, Albuquerque, NM.” An average John would be Googled a few dozen times per year. But think about this. There are 200+ million adults in the US. Multiply this number by a conservative twenty and you’ll get an idea of how big this market is: at least four billion name lookups a year.
Don’t forget to add the millions and millions of searches people do on celebrities of all levels, from A-listers and rising TikTok stars to regular people who hit the news due to some happy (or unhappy) incident. According to SimilarWeb, giants of the people-search industry such as Whitepages, Spokeo, and BeenVerified get 20–50 million visitors per month each. The entire ecosystem comprises over 100 websites and gets close to a billion free Google clicks monthly. With an average cost per click in Google Ads between $1 and $2, this market is a lucrative pie worth competing for.
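The figures above can be turned into a quick back-of-envelope calculation. The sketch below uses only the numbers quoted in this article; reading the value of the free traffic as "clicks multiplied by the Google Ads cost per click" is an assumption made for illustration, and the function and constant names are hypothetical.

```python
from typing import Tuple

# Figures quoted in the article (rounded; treat them as rough estimates).
US_ADULTS = 200_000_000             # "200+ million adults in the US"
SEARCHES_PER_PERSON_PER_YEAR = 20   # the multiplier used in the article
MONTHLY_FREE_CLICKS = 1_000_000_000 # "close to a billion" Google clicks
CPC_RANGE_USD = (1.0, 2.0)          # average Google Ads cost per click


def yearly_name_lookups(adults: int, searches_per_person: int) -> int:
    """Estimated number of name lookups per year."""
    return adults * searches_per_person


def monthly_traffic_value(clicks: int, cpc_range: Tuple[float, float]) -> Tuple[float, float]:
    """What the free clicks would cost if bought as ads (assumed valuation)."""
    low_cpc, high_cpc = cpc_range
    return clicks * low_cpc, clicks * high_cpc


lookups = yearly_name_lookups(US_ADULTS, SEARCHES_PER_PERSON_PER_YEAR)
low, high = monthly_traffic_value(MONTHLY_FREE_CLICKS, CPC_RANGE_USD)

print(f"{lookups:,} name lookups per year")
print(f"${low:,.0f} to ${high:,.0f} of equivalent ad spend per month")
```

Under these assumptions, the free traffic is worth the equivalent of one to two billion dollars of paid clicks every month, which goes some way toward explaining how fierce the competition for rankings is.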
Taking into account the amount of sensitive private data people-search sites collect and reveal, one would expect there to be at least some moderate regulation of their activity in place. Sadly, this is not the case. The only real regulation that does impact this industry is the Fair Credit Reporting Act (FCRA). This legislation was enacted to promote the accuracy, fairness, and privacy of consumer information contained in the files of consumer reporting agencies. However, people-search sites use an easy workaround to place themselves outside the FCRA’s reach: they add a note in the footer of their sites stating that they are not consumer reporting agencies. Smart.
How Google prompts people-search sites to reveal more and more information about you
How is all this related to Google and privacy violations? The connection is very direct. As any SEO or website owner knows, content quality is one of the major factors built into the ranking algorithms Google uses to decide which site deserves to be ranked higher in search results. Google judges the quality of content by how rich and unique it is. The problem with all people-search sites, however, is that the information they accumulate about each individual mainly comes from one source: public records.
In other words, people-search sites have to compete with each other on Google using practically identical content. Still, Google’s algorithms make their choice and give preference (and tons of free traffic) to 5–10 sites for each particular name lookup. How do these winners succeed in convincing Google that they have the most up-to-date and unique content about John Snow? I’ve been watching people-search sites for over five years now and I know the answer. Each year they keep releasing more and more private information about John for free.
Phase 1: Infancy era of people-search sites
Five years ago, when we created Onerep (the service for automatic removal of unauthorized profiles from people-search sites), almost all private information contained in profiles on people-search sites was paywalled. Back then, John Snow would search his own name and Google would show him a number of “his” pages on sites such as Whitepages or Spokeo. These pages contained very limited information: mainly John’s full name, age, location (city, state), and sometimes his relatives.
All other data pulled from public records was packed into so-called “background reports” and sold as subscriptions. Google didn’t have access to this extended private information, either. People-search sites wouldn’t allow Google bots to index this content. As a result, you couldn’t Google somebody’s name and see his or her phone number and address, let alone criminal records, without buying a pricey subscription. Similarly, you could not do a reverse address lookup by Googling an address to find out who lives there.
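The article does not say how these sites kept Google's bots away from the paid reports. One common mechanism is a robots.txt file, so the sketch below assumes that approach purely for illustration. The rules, the domain, and the URL paths are hypothetical; only Python's standard `urllib.robotparser` module is real. Note that robots.txt governs crawling, and sites may also rely on paywalls or login walls.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: teaser pages stay crawlable,
# while paid "background reports" are off limits to crawlers.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /report/
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given crawler is allowed to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs on a made-up people-search domain.
teaser_page = "https://people-search.example/name/john-snow"
paid_report = "https://people-search.example/report/john-snow"

print(crawler_may_fetch(HYPOTHETICAL_ROBOTS_TXT, "Googlebot", teaser_page))  # True
print(crawler_may_fetch(HYPOTHETICAL_ROBOTS_TXT, "Googlebot", paid_report))  # False
```

With rules like these, only the limited teaser page can show up in search results, which matches the situation described for this early phase.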
Phase 2: Dark horses take the lead by revealing “unique” content
Two to three years later, new players emerged in the people-search niche. They allowed Google to index the exact home addresses and phone numbers of millions of Americans, with the information becoming accessible to anyone interested… without a paywall. Google was happy to get this new unique content in its index and rewarded the newcomers with top rankings.
That’s how sites such as TruePeopleSearch, FastPeopleSearch, ClustrMaps, and many others climbed to the top of Google search results and got their generous share of free Google search traffic. The competition was becoming far more intense while Google’s AI-powered ranking algorithms were becoming more and more demanding about the “quality” of content.
Phase 3: Breaking bad
The year 2020 gave rise to a new phenomenon: in order to outperform the competition and let Google know they have more unique, richer content than their rivals, a number of people-search sites started publishing data that they acquire from consumer and marketing databases. This data contains information about the purchasing behavior of a person (or of the household that person is associated with). Releasing this information to the public is an entirely new level of privacy violation happening right now in the US.
Today, when you Google someone, you see “profiles” that reveal their occupation, education level, income, credit score, and a ton of other factors related to their purchasing behavior. Some profiles reveal your hobbies or even what pet you prefer: a cat, a dog, or both. Many profiles reveal your political views; this information is pulled from voter records and from consumer records that register the party to which you send donations. I wonder when we will start seeing family members’ clothing sizes; consumer databases store this data as well.
Can we do anything to stop these privacy violations?
As a technology entrepreneur immersed in the data and privacy fields, I see strong indications that the volume of personal information being collected and exposed by governments and businesses doubles each year. The symbiotic relationship between Google and people-search sites is just one example of this process. I also see that the public is becoming more and more concerned about privacy protection. The emergence and success of new startups in the privacy field, such as Mine and Spartacus, are a sign that people are awakening and investors are starting to recognize this growing consumer demand.
Speaking of people-search sites and the personal information they deliver to Google, I believe it is important to remember that, to a great extent, privacy is a conscious choice, and as such, it requires some effort. Violation of your privacy opens the door to a whole lot of very unpleasant situations in life, from intrusive unwanted communication or stalking to identity theft. Above all, privacy is safety: your own and that of your family.
I challenge everyone to complete a very simple exercise. Google yourself using this format: first name + last name + city + state. I bet you’ll be surprised at how much Google knows about you. Luckily, this kind of exposure can be fixed: remove yourself from sites that publish your data. Use DIY guides, or use a tool that removes public records from people-search sites automatically.
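For readers who want to repeat the exercise for several family members, here is a minimal sketch that builds the search link in the format described above. The function name and the sample person are hypothetical, and the code only constructs a URL to open in a browser; it does not query Google itself.

```python
from urllib.parse import urlencode


def build_self_search_url(first_name: str, last_name: str, city: str, state: str) -> str:
    """Build a Google search URL for: first name + last name + city + state."""
    query = " ".join([first_name, last_name, city, state])
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical example person; replace with your own details.
url = build_self_search_url("John", "Snow", "Albuquerque", "NM")
print(url)
```

Opening the printed link in a browser shows the same results you would get by typing the query by hand.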
And next time you buy something online or in-store, spend a minute reviewing the retailer’s Terms of Service, which will tell you if and how the company plans to share your purchase information with third parties. Our privacy is in our hands.
Dimitri is a tech entrepreneur and the founder of Onerep. He led the team that developed the industry’s first fully automated data removal tool, transforming what was once a manual process into an effortless solution for protecting personal information from data brokers.