Last Update: Sep 04, 2024 | Published: Sep 11, 2017
The first thing you should know is what a blob is. I first heard the term “blob” in database theory back in college. We were taught that a blob was a file stored in a database. You can think of an Azure storage stamp as a massive, resilient database cluster. When we store data in a storage account, Azure figures out how we are going to use that data, then stores it (and charges for it) appropriately. One of those kinds of storage is a blob, which is a file.
Azure has two kinds of storage account:
When the hot and cool tiers were announced, the public responded with, “That is great, but have you got something like Amazon Glacier?” Cool storage costs $0.01 per GB per month in the East US 2 Azure region. If you have petabytes of archive data that you need to keep but rarely access, even that tiny per-gigabyte cost can build up to be significant.
A new form of blob storage was announced, offering a third tier below hot and cool blob storage. The idea behind archive storage is that it is ultra-cheap for huge amounts of data that you very rarely need. Microsoft made this possible by using some form of offline storage. When you try to access cool, hot, or general storage blobs, the latency is imperceptible. You access the blob and the file is immediately available to you. In the case of archive storage, there will be a latency that is “on the order of hours”.
Note: I have no idea what storage system is being used for Archive Storage. That latency makes me think that it is some kind of tape storage, probably kept in triplicate across tapes or libraries.
That sort of latency is okay. Realistically, any requirement for this old data is not immediate. It might be something such as a court-issued subpoena to retrieve data from several years ago, and such requests can be satisfied in days or weeks.
Archive storage has some interesting traits:
The most important trait of archive storage is the price. The preview price of archive blob storage in East US 2 is $0.0018 per GB per month, versus $0.01 for cool blob storage in the same region. At that rate, 1 TiB (1,024 GB) will cost just $1.84 per month, and 1 PiB (1,048,576 GB) will cost $1,887.44 per month!
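If you want to check that arithmetic or plug in your own capacity, here is a minimal Python sketch. It uses only the two East US 2 prices quoted above; the names `PRICE_PER_GB_MONTH` and `monthly_cost` are mine for illustration, and the sketch covers storage cost only, ignoring any transaction or retrieval charges.

```python
# Prices per GB per month as quoted in this article for East US 2 (preview era).
PRICE_PER_GB_MONTH = {"cool": 0.01, "archive": 0.0018}

GB_PER_TIB = 1024
GB_PER_PIB = 1024 * 1024


def monthly_cost(size_gb, tier):
    """Return the monthly storage cost in dollars, rounded to cents."""
    return round(size_gb * PRICE_PER_GB_MONTH[tier], 2)


if __name__ == "__main__":
    for label, size_gb in (("1 TiB", GB_PER_TIB), ("1 PiB", GB_PER_PIB)):
        for tier in ("cool", "archive"):
            print(f"{label} in {tier}: ${monthly_cost(size_gb, tier):,.2f} per month")
```

The same petabyte that costs $1,887.44 per month in archive would cost $10,485.76 per month in cool storage, which is where the savings come from.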
Those of you working with on-premises tiered storage might wonder if Microsoft is going to work on tiering blobs. In other words, can a blob be moved to an appropriate tier? The answer is yes, but what that “yes” means will change over time.
Today in the limited preview, you can move a blob from one tier to another. You can open a blob and select a tier for that blob: hot, cool, or archive.
Today, that means that either you or some software that you use/write must track the usage/age of a blob and move it to the appropriate tier.
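To give a feel for what that tracking software might decide, here is a rough Python sketch of an age-based rule. Everything in it is an assumption for illustration: the 30-day and 180-day thresholds, the `choose_tier` name, and the idea of keying off a single "last used" timestamp. It only picks a tier name; actually moving the blob would be a separate call through whatever tool or SDK you use.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds; tune these to your own access patterns.
COOL_AFTER = timedelta(days=30)
ARCHIVE_AFTER = timedelta(days=180)


def choose_tier(last_used, now):
    """Pick 'hot', 'cool', or 'archive' from how long a blob has sat unused."""
    age = now - last_used
    if age >= ARCHIVE_AFTER:
        return "archive"
    if age >= COOL_AFTER:
        return "cool"
    return "hot"


if __name__ == "__main__":
    now = datetime(2017, 9, 11, tzinfo=timezone.utc)
    for days in (3, 45, 400):
        print(days, "days unused ->", choose_tier(now - timedelta(days=days), now))
```

Bear in mind the hours-long retrieval latency of the archive tier when choosing the archive threshold: anything you might need back quickly should stay in cool.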
What about auto-tiering? That would definitely be popular and Microsoft knows that. Back at Build, the Azure storage team announced that auto-tiering would come after general availability of Archive Storage and Blob-Level Tiering.
At the moment, availability of the program is limited to approved applicants. Regional and storage availability is also limited: