I am in the midst of defining a cloud strategy. We need a framework, at least for the next few years, that will help us decide which services we support on-premises and which services, naturally and logically, belong in the cloud. In the early stage of defining our strategy, cost is certainly a factor. So is what I call "capacity." We have so many projects in our pipeline that if I can free up my internal staff and infrastructure resources for new projects by off-loading maintenance activities to someone else, I create capacity that puts me way ahead. We still have some work to do before we can use this framework to make every decision, but there are some cloud decisions that are easy to make … and cloud data storage is one of those.
Our organizations are addicted to data retention. Just take a look at your own or others' email inboxes. Still got that invitation to the 2007 company Christmas party? What about that email from the CEO asking a question that you answered months ago? Why do we keep such data? Because of our deep-seated fear that, at some point in the future, we might need that email, file, document or record. This makes it nearly impossible for us to delete it. After all, what if it turns out we really do need something that was in that invitation to the 2007 company Christmas party and the invitation is no longer there?
For years, I have used some simple sorting rules to define my approach to data storage and retention. I segregate the data into a few broad categories:
- Always used
- Sometimes used
- Rarely used
- Never used
In the pre-cloud days, I would put the Always used data on the fast drives (now including solid state). I would put the Sometimes used and Rarely used data on the slower drives, and I would try to convince the owners of the Never used data to get rid of it. But, in practice, I was never able to get rid of the Never used data and ended up putting it on the slower drives as well. Over time, I kept buying more relatively expensive slower drives as the amount of Sometimes used, Rarely used and Never used data grew.
Independent of any other decisions I make about cloud services such as SaaS, IaaS, PaaS, et cetera, cloud data storage makes my data retention sorting much cheaper and simpler. I still sort into Always used, Sometimes used, Rarely used and Never used. And, I still put the Always used on my fastest storage and my Sometimes used on the slower storage. But, I move the Rarely and Never used to the cloud. Do I care about retrieval performance of the Rarely and Never used? Not at all. Do I want to allocate my storage capacity to something that is rarely, if ever, used? Not on your life; I have too many other demands on that capacity. Do I want to allocate my storage dollars to something that is rarely, if ever, used? Not when I can get gigabytes of slow cloud storage for pennies. In effect, cloud storage is my data archive.
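As a rough illustration, the sorting rule described above could be sketched in Python along these lines. The age thresholds, the `Item` type, and the function names are all assumptions made up for this example; the right cut-offs depend on your own data and your own customers.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical cut-offs, chosen only for illustration.
ALWAYS_MAX_AGE_DAYS = 30
SOMETIMES_MAX_AGE_DAYS = 365
RARELY_MAX_AGE_DAYS = 3 * 365


@dataclass(frozen=True)
class Item:
    """A file, mailbox, or record, with the date it was last accessed."""
    name: str
    last_accessed: date


def category(item: Item, today: date) -> str:
    """Sort an item into one of the four usage categories by access age."""
    age_days = (today - item.last_accessed).days
    if age_days <= ALWAYS_MAX_AGE_DAYS:
        return "Always used"
    if age_days <= SOMETIMES_MAX_AGE_DAYS:
        return "Sometimes used"
    if age_days <= RARELY_MAX_AGE_DAYS:
        return "Rarely used"
    return "Never used"


# Placement follows the approach above: the two cold categories go to the cloud.
PLACEMENT = {
    "Always used": "fastest on-premises storage",
    "Sometimes used": "slower on-premises storage",
    "Rarely used": "cloud archive",
    "Never used": "cloud archive",
}


def placement(item: Item, today: date) -> str:
    """Return the storage tier where an item belongs."""
    return PLACEMENT[category(item, today)]


if __name__ == "__main__":
    today = date(2024, 1, 1)
    items = [
        Item("current-budget.xlsx", date(2023, 12, 20)),
        Item("party-invitation-2007.eml", date(2007, 12, 1)),
    ]
    for item in items:
        print(item.name, "->", placement(item, today))
```

Using last-access age as the only signal is a simplification; in practice, ownership, legal retention requirements, and file size may also feed into the decision.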
Some people might question this decision. Don't I worry about the security and protection of my data? I do worry about that in general, but not with a reputable, proven cloud provider. After all, they have to be at least as good as I am at data security and protection. Otherwise, their business model collapses. If I am honest with myself, I suspect that they are better at data protection and security than I am -- they have to be.
I use cloud data storage to create internal capacity that I allocate to the services that my customers want the most: high-performance, on-demand access to the data they use the most. For everything else, I find someone who can do it cheaper and at least as well in the cloud.