Cargo cult security – how not to use hashing
Sometimes, I still get surprised by the lack of thought behind some software systems that have made it to production. This time, I stumbled over a 2 year old reddit thread about the Swedish city Västerås using WiFi to track the movements of people, storing hashes of their smartphone MAC addresses, in the rather naive misbelief that hashing would render people anonymity. That leads me to a subset of the subject of cargo cult programming: cargo cult security.
In short, cargo cult programming means that you use a chunk of code because you more or less know what is does, but have no idea of how it works. That means you can’t modify the code, and you probably also unknowingly include redundant code.
In my example of cargo cult security, the developers in question used hashing as a means to anonymize users. From a cargo cult perspective, hashing works well for storing passwords, so if it’s used for a completely different application it should work there too, right? No, actually quite wrong.
It’s the search space size, stupid
First, let me translate a few quotes from the developers, from an article on the Swedish IDG site:
“-What we do, is take the MAC address and signal strength of the signal transmitted by the phone. […] We apply one-way encryption on the MAC address and store it as a hash.”
“According to Bumbee Labs, the one-way encrypted MAC address can’t be linked to any specific phone afterwards.”
But of course it can. The way to do is it by simple brute force cracking. While this isn’t feasible for an huge search space, like good passwords, the MAC addresses of Västerås, or even the entire country of Sweden, is a rather small search space. If you want to track one single phone, it’s of course super trivial, since you only hash that single phone’s MAC address and search for that hash in the database. Which, by necessity, must be exactly what Bumbee Labs do when the tracked phones move from one WiFi access point to the next.
Time required for brute force hashing
Picking the slowest hash computation from this hashcat benchmark, which is 75 hashes/s when calculated on an Nvidia GTX 1080 graphics card, it would take 37 hours to compute the hashes of 10,000,000 phones. Assuming that every Swede, from newborn to 100+ years of age, has a smartphone, 37 hours would be enough to deanonymize the collected data for the entire population of Sweden.
I don’t know what hash algorithm Bumbee Labs use, nor how many iterations they run it, however it’s a reasonable assumption that they use something less CPU intense than the algorithm in the above example. If they use a hash that computes with the same speed as WPA2 (used in WiFi routers), 26 seconds would suffice.
To put things in perspective, brute forcing the hashes of MAC addresses of one phone per inhabitant in Sweden, is as easy as cracking a 4 character password.
That’s the sort of security you get when you hash points from a small search space.