A few weeks ago LastPass had a serious breach of data. I don’t like to talk about the competitors, but this unlucky event generated a lot of articles, concern and fear. It also caused an increase in emails from Passpack users who are seriously concerned about their security and want to understand if what has happened to LastPass can happen also to Passpack.
I think not, but I can’t be sure because I don’t know the architectural details of LastPass, and I don’t know what happened to their data. I can say however, that compared with what I understood from their post, we use a few different approaches. For example, it seems that they had all the services on the same network. We have always had totally separate sandboxes for the different Passpack properties. This eliminates the risk that a bug in the blogging or support ticket systems could open up the core application to attack.
But speaking more specifically to the data loss and the risk of a dictionary attack issue, we use a non-standard approach to work around this risk. I’ll explain that here, and at the bottom of the post is also some code so that anyone, LastPass included, can employ it if they’d like.
Let me tell you a story
When I was a child, I was fragile, thin, short and delicate. I was the typical victim of children bigger than me. I remember that during a Christmas vacation, after having been bullied yet again, my grandfather said to me: “if you can not defend yourself, you need a big friend that can defend both of you”. I took his advise, got a big friend, and I never had problem for the rest of my childhood.
Still now that I am a grown man, I am more at ease knowing that I can count on a big, trustworthy friend if necessary. We’ve used this “big friend” approach in Passpack as well. It’s one of the components we use in solving the hash problem that is the main focus of this post. The full details are very technical, so please be patient.
About openness and reactions
I don’t always agree with LastPass’ choices (see my post on how sharing “masked passwords” is a serious security hole), however they are a good company, and their choice to inform their users about what was happening in a quick and open way was the right one. They knew the risks of that choice, but they chose to do what was best for their users over what was best for their public image. I applaud them and would surely do the same.
The reaction of the press to their announcement was less impressive. I was surprised by the superficial knee-jerk reactions from some blogs, including the prestigious LifeHacker. Instead of investigating to understand if there could have been ways for an online password manager to have avoided such an incident, they simply assumed there was not and suggested users go back to offline tools.
This is not a solution. It is a step back of at least five years and doesn’t help people that need something that only the cloud can offer – collaboration and access. Every day we all connect to Facebook, Twitter, PayPal and many other services. If we refuse to persevere and solve important issues in the cloud, then we might as well you turn off our computers now. Problems need solutions, not avoidance. Maybe the current solutions are not ideal, but if we stop the progress we will never have a definitive solution.
Premise: your security is your choice
We can call the best engineers in the world to manage our server security, but if you join the name of your sons and use it like your Packing Key you are very lucky if nobody entered your account yet. So, the first thing to do, in any case, is to choose a really strong Packing Key. Believe me, it is not difficult. A good pass phrase could be something like: my grandma had a strange green bbq or when I jump I can not stop laughing. You simply need a sentence that is easy to remember for you. You will discover that it will be also easy to write. And, most importantly, I can tell you that it will be very resistant to every attack.
About passwords and hashes
Here are some basics. Every web application that doesn’t use a federated authentication system (like Facebook connect) needs something form of password to identify the user. Old systems used to save the passwords directly in a database. Recently (fortunately) most will save a derivation of the password generated using an hash function instead. What is a hash?
A cryptographic hash function is a deterministic procedure that takes an arbitrary block of data and returns a fixed-size bit string (via Wikipedia).
There are certain algorithms that can take text, and create a different text that identifies it – it doesn’t transform it, it just identifies it, like a fingerprint. The result is called a hash string, and that string can’t be reversed. For example, if you take the Wikipedia definition above and run it through a SHA1 hash function, you obtain the following result:
Now, if you change just one thing, anything in that Wikipedia definition, then you will get a completely different result. For example, if you remove the period at the end of the paragraph, the new hash would be:
As you can see the two hashes are the same length, but they are totally different.
Here’s another example. If you hash the word “password” you get another different result, but again, it’s the same length, 40 characters:
So every string that you hash with a SHA1 function has always the same size, being (almost) always different. It doesn’t matter how long or short the original text is, it will always produce the same size hash. Both your ATM PIN number and “War and Peace” would both have a hexadecimal 40 character hash. This makes hashes perfect for identifying something without knowing what that something actually is.
In fact, hash functions are used in your ATM and credit cards to protect the pin.
The problem with hashes
Unfortunately there are methods to discover the original string, starting from the hashed one, even though it’s not technically a reversal. The most used hacking method is the so-called Dictionary Attack. A dictionary attack simply means that the attacker takes an entire dictionary of words, and tries to guess your password. Alas, people tend to use always the same passwords or very common words. Some time ago Twitter banned 370 common passwords from use in its accounts because it was really dangerous to allow users to choose a password like password, testing, naked, stupid, 123456, etc.
Since hash algorithms are known, an attacker can simply run all the dictionary words through the algorithm and compare the result with an existing hash (perhaps that he’s somehow stolen) and see if they match. When the two hashes match, he then knows what the original password was.
Of course, to make hacking more efficient, there are dictionaries of the hashes of the dictionary words (so called Rainbow Tables), available for the outputs of all the major algorithms (SHA256, MD5, etc.). Therefore, the very first step in protecting users is to salt their hashes. This means that before running a string through a hash algorithm, some additional text is “mixed in” with it (like adding a dash of salt to your hash browns :) so that the result is at least not so easily looked up in the dictionary of hashes.
At Passpack we use several different combinations with iterations of SHA256 adding in the user’s clientcode as salt. Since that is unique for every user, even if two users use the same Packing Key the hashes will be different. Does that completely solve the problem? No, not yet. It’s a first step to raising the bar a bit since it wards off mass attacks, forcing a potential hacker to try and match hashes one by one.
So at this point, if you’ve been paying attention, you should realize that if you chose a strong Packing Key, then you would not be susceptible to these dictionary attacks (and if you don’t have a strong Packing Key, now is as good a time as any to change it).
We can urge you all the time to create a good Packing Key, but it’s still our job to build the system to be as a secure as possible, regardess of your choices. That means that for all the work we do to build a secure server infrastructure, and make it as close to impossible as we can to penetrate it, we still need to plan for the rare case in which someone does, indeed, get in. Everyone works hard to make sure that this doesn’t happen, but sometimes it still does. This is what just happened to LastPass.
So what are the possible solutions to protect users?
A common solution: iterations
It works like this: instead simply saving the hash to the database once you receive it on the server, you first run it through another algorithm… many, many, many times. For example 50,000 times. Then you save that to the database. By doing so you slow down the potential hacker by the same number of times. This makes cracking the hash wildly inefficient for the hacker, if not impossible.
The problem with this method is that the server has to work intensively to derive the original hash and you’ll wind up with some serious performance issues. The result is that you can not use this method if you have a lot of users because you risk continuous server downtimes. PBKDF2 is therefore not a real solution for web applications.
The Passpack solution: distribution
Passpack is a small company. We do our absolute best to build the safest infrastructure possible, but we can not spend a lot of money to build our own bomb-proof servers. So I decided to find a big friend, someone that can share with us the work and, in defending himself, defends us as well.
I wrote a super simple application on Google App Engine called Spino. Spino runs on a Google account that I use only for this app and for nothing else. Spino collaborates with Passpack to create a distributed solution for the hash problem.
How does it work? It’s quite complex, but here’s a simplified description:
- The User types his Packing Key
- The Browser applies an hash function to the Packing Key using the clientcode as salt
- The Browser sends the hash of the Packing Key to Passpack
- Passpack sends the temporary hashes to Spino
- Spino verifies it is receiving the hashes from the real Passpack
- Spino applies a secret hash function to the original hashes and sends it back to Passpack
- Passpack saves this result in the database
Until some time ago, Spino was on another server and probably in the future we could move it on some other one. Also, to further strengthen the pattern we could distribute the hash manipulation across multiple, external apps on multiple servers using different technologies.
The risk in a solution like this is that if the external platform — in our case Google App Engine — goes down, it would be impossible to login to Passpack. But since we’ve implemented this solution, we’ve had no such complaints. On the other hand, the big advantage is that even in the rare event that an attacker were to grab the Passpack database, he would not be able to do anything with it unless he is also able to penetrate Google. And Google is a really big friend.
To maintain good performance, once again, we use an encrypted cookie as a distributed cache. In fact, only the server can decrypt the cookie, but the server needs the browser to access the cookie itself.
Do you like Spino?
If you like Spino’s approach, feel free to use it. It is open to all.
When I decided to move it from the previous server to Google App Engine, in a week-end I learned Python and I ported the script, so I am sure that you can create and deploy your app from scratch in 20 minutes using the following code.
# # Spino, a remote hasher # version 1.2 # Copyright (c) 2009+ Francesco Sullo, Passpack SRL # Licensed as Open Source under MIT License # import hashlib import os from google.appengine.ext import webapp from google.appengine.ext.webapp.util import run_wsgi_app def hashx(val): salt = 'Wow! I am a super salt. Please change me!' # just a few interactions iterations = 20000 cont = 0 while cont < iterations: m = hashlib.sha256() m.update(salt + val) val = m.hexdigest() cont = cont+1 return val class MainPage(webapp.RequestHandler): def get(self): self.response.headers['Content-Type'] = 'text/plain' try: a = self.request.get('a') e = os.environ['REMOTE_ADDR'] # set your authorized referer IP or IPs: if e == '100.100.100.100': ret = hashx(a) self.response.out.write(ret) return except: pass self.response.out.write('Hello, World!') application = webapp.WSGIApplication( [('/', MainPage)], debug=True) def main(): run_wsgi_app(application) if __name__ == "__main__": main()
This is a simplified version. But it works :)