As a company, if you're facing a suspected security breach, then forcing your customers and clients to change passwords is not a bad thing to do. However, when you do this, don't end up DDoS'ing your own system, as Atlassian have apparently done.
If you don't know, Atlassian make the JIRA project management and tracking tool as well as the popular HipChat chat room software, and they've recently acquired Trello.
The sorry tale kicks off on Sunday:
The blog article is open and well written, and it talks about the password storage mechanism: passwords are hashed using bcrypt with a salt. All good there then.
The last line is an interesting one though: "If you are a user of HipChat.com and have not received an email from our Security Team with these instructions, we have found no evidence that you are affected by this incident." I'll come back to this.
The blog post does admit that customers may have lost data, which is something to be aware of with cloud-based chat services like HipChat.
Fair enough. I'm a HipChat and JIRA user, and I got an email late on Sunday night, which I ignored. This morning, around 6:30am, I logged in to JIRA and was forced to change my password. Again, this is fine; it's a decent response to a security issue. But when I attempted to change the password, the ID server (id.atlassian.com) fell over. Thinking this would be a minor issue, I waited an hour: same problem, and no updates from Atlassian.
A few hours later Atlassian admitted that yes, there is a problem with the password changing process:
Some six hours later, Atlassian admitted that yes, they've DDoS'ed themselves:
This brings me back to the comment I mentioned earlier: "If you are a user of HipChat.com and have not received an email from our Security Team with these instructions, we have found no evidence that you are affected by this incident."
Are Atlassian seriously saying that a subset of users being forced to change passwords is enough to take down the system that changes passwords? That's worrying if true. It's also worrying that they took six hours to spot that it's a capacity issue, and that they can't rapidly add capacity to the system. This leads to an important question: is the password change server THAT much of a custom build?
I do hope that they do a full root cause analysis of the security issue, of how the data was extracted, and of the subsequent issues with the password change server.
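For what it's worth, one way to avoid this kind of self-inflicted load is to stagger the forced resets rather than hitting every affected user at once. I have no idea how Atlassian's systems actually work, so the following is only a minimal sketch: the function name `schedule_reset_waves`, the wave size and the gap between waves are all hypothetical and would need tuning against the real capacity of the password change service.

```python
from datetime import datetime, timedelta


def schedule_reset_waves(user_ids, per_wave, wave_gap_minutes, start):
    """Split affected users into fixed-size waves (hypothetical helper).

    Each wave gets a send time that is wave_gap_minutes after the
    previous one, so reset traffic arrives spread out over time.
    """
    if per_wave <= 0:
        raise ValueError("per_wave must be positive")
    waves = []
    for i in range(0, len(user_ids), per_wave):
        wave_number = i // per_wave
        send_at = start + timedelta(minutes=wave_gap_minutes * wave_number)
        waves.append((send_at, user_ids[i:i + per_wave]))
    return waves


if __name__ == "__main__":
    # Illustrative numbers only: 10 users, 4 per wave, 30 minutes apart.
    affected = list(range(10))
    for send_at, batch in schedule_reset_waves(affected, 4, 30, datetime.now()):
        print(send_at.isoformat(timespec="minutes"), batch)
```

The obvious trade-off is that a later wave leaves those accounts exposed for longer, so in practice you would probably expire the old passwords or sessions immediately and only stagger the notifications that drive people to the reset page.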