I was on vacation when the news hit that customers of T-Mobile’s cloud-serviced Sidekick phones had likely lost their data due to a failure of the storage service provided by Danger, the ironically named company recently acquired by Microsoft. So, rather than following the story in real time, I found myself reading the historical account of the incident on Thursday. My reading was interrupted by another calamitous story, also related to clouds: a small boy whisked up 10,000 feet in the air by a helium balloon and carried in horrifying swoops across the Colorado skies. When the balloon landed gently some two hours later, the boy was not there and was feared lost. Both stories, as it turns out, were much less horrifying than originally imagined.
“Balloon boy” was hiding in his family’s attic. The Sidekick data was not lost either; it was hiding somewhere as well. Yesterday Microsoft announced that it had recovered most, if not all, of the customer data.
Despite the good news, we’re now seeing evidence of storm clouds when it comes to cloud storage. The Sidekick data failure was attributed to a storage area network (SAN) upgrade gone awry. In this case, there appears to have been a difference of opinion about whether a backup was necessary in order to go forward with the SAN upgrade. According to sources cited by blogger Daniel Eran Dilger, instead of completing a backup that would have taken six days, Microsoft management is said to have cut the process short two days in. What then ensued is not yet clear, but the implication is that an Oracle system responded to some abnormality in the SAN upgrade in a way that caused the data’s “disappearance.”
Whatever the cause, this scenario highlights the enormous complexity of cloud storage and the inherent risks involved with such a new data-handling approach. Indeed, another recent cloud mishap, in which a hacker was sending spam through an Amazon email server, elicited another calamitous response: Amazon EC2 subscribers had their email put on a spam blacklist by Spamhaus because of this one bad apple.
It’s not surprising that these technology glitches are happening, given the newness and complexity of cloud computing. But I think what all three cases show (the overblown police and media reaction to the image of an airborne balloon, the software response to a SAN upgrade gone wrong, and Spamhaus’ draconian solution for dealing with a single hacker) is that we’re inexperienced. We don’t yet have enough understanding to deal with these unusual events in a calibrated, rather than exaggerated, way. The heart of the matter is that, in each case, there may have been an overreaction to a problem that was unexpected but, as it turned out, not particularly serious.
With all new things, there is a learning curve. Single hackers will be dealt with in a different way in the future. Microsoft will never do an upgrade without backing up first. As for “balloon boy,” it’s likely some adults will look in the attic before calling out the National Guard. And if the balloon incident was staged, a family conspiracy? Well, there are those who believe the Sidekick data wipeout was insider sabotage.