(Update Jan 23: Apparently, the DOJ has proposed changes so that publishing weak password lists will become a felony.)
At either the start or end of the year, a security company somewhere in the intertubes can be found publishing a list of the top “most popular” passwords (usually the top ten). These lists, compiled by culling the passwords from multiple data breaches, tend to also be a list of the weakest passwords. Why the weakest? Because the top ten passwords are the most often used ones; this in turn means that hackers looking to break into accounts will try these passwords first before any others.
It’s the law of large numbers in action: if the top ten passwords comprise, say, 5% of all passwords used by a (statistical) population, chances are that 1 in 20 of the accounts you try to break into will use one of those ten passwords.
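To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 5% share is the hypothetical figure from above, not a measured one, and the function name is made up for illustration. It also assumes that each account's password is independent of every other account's, which is a simplification.

```python
def chance_of_at_least_one_hit(share: float, accounts: int) -> float:
    """Probability that at least one of `accounts` uses a top-ten password.

    Assumes each account independently uses a top-ten password with
    probability `share` (a simplifying assumption, not a measured fact).
    """
    return 1.0 - (1.0 - share) ** accounts


share = 0.05  # hypothetical: the top ten passwords cover 5% of accounts

# On average, an attacker would expect one hit per (1 / share) accounts tried.
expected_tries = 1 / share

for n in (1, 20, 100):
    print(n, round(chance_of_at_least_one_hit(share, n), 3))
```

The point of the sketch is that "1 in 20" is an average: trying twenty accounts does not guarantee a hit, but the odds of at least one climb quickly as the number of accounts grows.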
There’s a slight problem with these lists, though. They aren’t the top ten passwords for this year (that is, 2014 per this post) even if the article claims that they are. Heck, chances are that they’re not the top ten passwords for 2013. The passwords that were used to compile these lists are, for lack of a better word, dated.
Digital Storage is Cheap
Storage for digital data is cheap: the price decrease for data storage, on a per byte basis, has paralleled Moore’s Law. The beneficiaries of this downward price pressure, among others, have been the many Silicon Valley startups-turned-behemoths: they collect, store, and process data (with most looking to serve you personalized ads). Their billions in market capitalization and revenue are built on cheap storage.
And while storage is cheap, going through data and deleting what’s not needed anymore? When factoring in the human element, not so cheap. Consequently (and legal reasons factor into this as well), most companies have an unofficial policy where nothing ever really gets deleted, including passwords.
Cheap Storage Leads to Hoarding and Old Data
Of course, this is not to imply that companies keep your passwords for the sake of keeping your passwords. Rather, it’s a collateral effect. For example, if a company like Google makes it a point to not delete email accounts that haven’t been accessed in more than 5 years (which is a pretty strong indication that these accounts have been abandoned), the passwords will remain in place. Passwords that are 5 years old, at least. This is true for lesser-known companies, too.
What this means is that, if a company were to experience a data breach and a massive cache of passwords is stolen, chances are that it’s chock full of old passwords. So, the oft-quoted passwords like “password123” and “trustno1” may not actually be among the top passwords in use today. Indeed, trustno1, which has been showing up on password lists since around the late 1990s thanks to the popularity of the TV series “The X-Files”, seems particularly like an anachronism, and emblematic of the “hacked database contains a lot of abandoned accounts and their passwords” situation I’m describing. The show has been off the air since 2002, after all.
Getting a Real List of Passwords Used in 2014
The problem of excluding old, unused, and invalid passwords from top ten lists cannot be easily resolved: most researchers compiling such data work with leaked data, which tends to consist of (1) an email address and (2) a password (hashed or plaintext). I don’t think I’ve ever run across a case where “created” and “password last changed” dates are also offered. And you can’t really ask people for their passwords.
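To make the gap concrete, here is a minimal Python sketch of what a researcher could do if leaks did carry a "password last changed" date. Everything in it is an assumption for illustration: the records are made up, the third field is exactly the one that real leaks tend to lack, and `top_passwords` is a hypothetical helper rather than anyone's actual method.

```python
from collections import Counter
from datetime import date

# Made-up records for illustration only. Real leaks usually carry just
# (email, password); the third field is the hypothetical "last changed" date.
records = [
    ("a@example.com", "trustno1", date(2001, 5, 1)),
    ("b@example.com", "trustno1", date(2002, 3, 9)),
    ("c@example.com", "password123", date(2014, 2, 11)),
    ("d@example.com", "correcthorse", date(2014, 6, 30)),
    ("e@example.com", "password123", date(2013, 12, 25)),
]


def top_passwords(rows, n=10, since=None):
    """Tally the n most common passwords, optionally skipping stale ones."""
    counts = Counter(
        password
        for _email, password, changed in rows
        if since is None or changed >= since
    )
    return counts.most_common(n)


# Without dates, every password counts, however old it is.
all_time = top_passwords(records)

# With dates, passwords last changed before 2014 can be dropped.
recent = top_passwords(records, since=date(2014, 1, 1))

print(all_time)
print(recent)
```

In this toy data, “trustno1” ranks among the most common passwords overall but disappears entirely once anything last changed before 2014 is filtered out. Without that date column, the two tallies cannot be told apart.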
It’s Not as Bad as the Lists Make It Out to Be
There is also reason to believe that the situation is not as dire as these lists make it out to be. Sure, “password123” is a terrible password. But that’s not really the point. The real question people should be asking is “where are these passwords used?”
If “password123” is a top ten password used by people to access their work emails, that’s probably problematic; if it’s used to leave snarky comments on a celebrity gossip site, not so much.