Beware of your username selection -- online marketers can use it to track you and online criminals can use it to phish you.
That's the warning from researchers at INRIA -- France's National Institute for Research in Computer Science and Control -- in a new paper, "How Unique And Traceable Are Usernames?" The answer: many usernames are unique, and thus highly traceable, especially when they are reused across multiple Web sites, online forums, and social networks.
Tying a person's pseudonym to his or her real identity creates security and privacy risks. For example, said the researchers, such information could "be exploited to perform efficient social phishing or targeted spam, and might [also be] used by advertisers or future employers seeking information."
From a security standpoint, at least, such phishing and scam attacks remain theoretical -- so far. "We haven't seen attacks in the wild yet," said Daniele Perito, a doctoral candidate at INRIA and one of the research paper's authors, in an email interview. "I would not be surprised if, in the near future, we will see some of these attacks emerging."
Amassing data about individuals based on their usernames isn't an abstract concept, however. Last year, news reports surfaced that media-research firms and data brokers, including Nielsen, were scraping Web sites and forums -- that is, copying every message or page present -- to amass more detailed data on consumers.
In their research, the INRIA team explored only whether usernames alone could be linked to real identities. But in the real world, market researchers and online criminals have more data points available than just usernames, and thus an improved ability to correlate real identities with pseudonyms.
For example, the researchers point to a recent patent application made by PeekYou.com. The application proposes "aggregating personal information available from public sources over a network" to correlate online pseudonyms with unique individuals. To do this, the patent application says it will look at such factors as username uniqueness, geographic location, gender, e-mail address, and even zodiac sign.
Based on the INRIA research, usernames built using a person's complete name seem especially vulnerable. For example, the username "zorro1982" is relatively generic and non-identifying. But "dan.perito" has a high likelihood of serving as a unique identifier.
Expect the threat of username profiling to grow more acute in the future. "There is a growing online footprint we are leaving behind, and this will be more and more exploitable by advertisers and spammers in the future," said Perito.
How, then, can people build usernames that are less traceable? For individuals, the research team has released a free tool that assesses the relative entropy -- that is, uniqueness -- of a username.
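To give a rough sense of what such a uniqueness score might involve, here is a minimal Python sketch. It is not the INRIA tool, and the article does not describe how that tool works. The sketch assumes a simple character-bigram model with add-one smoothing, trained on a hypothetical sample of usernames; the function names and the sample list are invented for illustration. It reports a username's "surprisal" in bits: the less a username resembles the training sample, the higher the score.

```python
import math
from collections import Counter


def train_bigram_model(usernames):
    """Count character bigrams; '^' marks the start and '$' the end of a name."""
    pair_counts = Counter()
    context_counts = Counter()
    alphabet = {"$"}
    for name in usernames:
        padded = "^" + name.lower() + "$"
        alphabet.update(padded[1:])
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return pair_counts, context_counts, alphabet


def surprisal_bits(username, model):
    """Sum of -log2 P(char | previous char) under the smoothed bigram model."""
    pair_counts, context_counts, alphabet = model
    vocab = len(alphabet) + 1  # one extra slot reserved for unseen characters
    padded = "^" + username.lower() + "$"
    bits = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # Add-one smoothing keeps unseen bigrams from having zero probability.
        p = (pair_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab)
        bits -= math.log2(p)
    return bits


if __name__ == "__main__":
    # Hypothetical training sample; a real estimate would need a large corpus.
    sample = ["mike", "mike22", "john", "johnny", "sara"]
    model = train_bigram_model(sample)
    for candidate in ["mike", "zorro1982", "dan.perito"]:
        print(candidate, round(surprisal_bits(candidate, model), 1))
```

With only five training names the scores mean little in absolute terms; the point is the comparison between candidates scored against the same sample.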
Meanwhile, the researchers recommend that Web site operators throttle, or require a CAPTCHA from, any crawler other than a search engine that attempts to crawl (and potentially scrape) their site for usernames. Some Web sites, such as eBay, already appear to do this, to discourage dictionary attacks seeking to amass a list of the site's usernames. Other sites, such as Twitter, do not.
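As an illustration of the throttling idea, the following Python sketch limits how many username or profile lookups a single client can make within a time window. It is an assumption about how a site might implement such a limit, not a description of what eBay or any other site actually does. The class name, the default limits, and the choice to key on a client identifier (such as an IP address) are all hypothetical, and the sketch omits the exemption for known search engines.

```python
import time
from collections import defaultdict, deque


class ProfileLookupThrottle:
    """Sliding-window limiter: at most max_requests per client per window."""

    def __init__(self, max_requests=30, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client id -> timestamps of recent lookups

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Discard lookups that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            # Over the limit: the caller would show a CAPTCHA or refuse the request.
            return False
        hits.append(now)
        return True
```

A request handler would call `allow()` with the client's identifier before serving a profile page, and fall back to a CAPTCHA challenge whenever it returns `False`.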