Microsoft Azure Cloud offers several types of scalable, high-availability storage: for tables, queues, files, blobs and Azure virtual machine disks. But what does it all really mean?
Even for an IT specialist, it is not easy to determine the best solution for a company's requirements and environment. In this article, we define each of the Microsoft Azure storage types and give examples.
Microsoft Azure Storage
Each of the “big 3” storage providers (Amazon, Google and Microsoft) has developed its own structure and naming for storage, and Microsoft is no exception. To start working with Microsoft Azure Storage, you need to:
- Define your storage needs
- Get an account in Azure
- Choose your storage account type
- Choose your storage type, depending on the storage account type
- Choose your redundancy level, depending on the storage type
In Microsoft Azure, there are two types of storage accounts, five types of storage, four levels of redundancy, and three tiers for storing your files.
In this article, we give an overview of each storage type that exists on the Microsoft Azure platform and discuss possible uses for it. Our main goal is to give a clear view of Microsoft Azure Storage, which looks over-complicated at first sight.
The basis of Microsoft Azure Storage is, of course, the storage type you choose. The type defines what you store, how you store it, and which options and features you can use. There are five storage types in Microsoft Azure, and they can be divided into two groups by their design. One group is designed with file storage, scalability and communication in mind and is accessible via REST API. The other is designed to enhance the features of the Microsoft Azure Virtual Machine environment and to be accessed from VMs exclusively.
However, things are not so simple.
Blob is a file
It all began with Blob Storage in Microsoft Azure. BLOB is an acronym that stands for Binary Large OBject. To put it in plain English, a blob is a file. Thus, Microsoft Azure Blob Storage is storage for your files.
Blobs can be stored in Azure in three different ways:
Block Blobs
Good for file storage, capable of holding up to 4.77 TB per file
When you store a file as a block blob, it arrives at the storage in small parts, and only after you complete the upload are the parts put together into one piece. With that architecture, a file cannot be modified without a complete re-upload. This is the most basic and the cheapest way to store your files in Azure.
You may wonder why it is called block blob storage. The name comes from the way files are allocated inside Microsoft’s datacenters. When you upload a file to the storage, it is shown to you as a whole, but in fact each file is divided into separate parts called blocks. That architecture was proposed a long time ago, and it ensures fast and lossless file upload.
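The upload-in-parts behaviour described above can be modelled in a few lines. This is an in-memory illustration only, not the Azure SDK: the class `BlockBlobModel`, its methods `stage_block` and `commit`, and the helper `split_into_blocks` are hypothetical names chosen for this sketch.

```python
class BlockBlobModel:
    """Toy in-memory model of a block blob (hypothetical, not the Azure SDK)."""

    def __init__(self):
        self._staged = {}    # block id -> bytes, uploaded but not yet visible
        self._content = b""  # what a reader sees after the last commit

    def stage_block(self, block_id, data):
        # Staged blocks may arrive in any order and do not change the blob yet.
        self._staged[block_id] = bytes(data)

    def commit(self, block_ids):
        # Committing an ordered list of ids assembles the blocks into one blob.
        self._content = b"".join(self._staged[b] for b in block_ids)
        self._staged.clear()

    def read(self):
        return self._content


def split_into_blocks(data, block_size):
    """Cut data into consecutive chunks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


payload = b"backup-archive-contents"
blob = BlockBlobModel()
block_ids = []
for n, chunk in enumerate(split_into_blocks(payload, 8)):
    block_id = f"block-{n:04d}"
    blob.stage_block(block_id, chunk)
    block_ids.append(block_id)

visible_before_commit = blob.read()  # still empty: nothing committed yet
blob.commit(block_ids)
visible_after_commit = blob.read()   # the whole file, assembled from its blocks
```

The point of the model is the two-phase shape: parts are staged independently, and the file only appears as a whole once the ordered list of parts is committed.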
You can access the files but cannot change them inside the storage. Thus, Block Blobs are ideal for backups and for storing user files. Previously, one blob (one file) could not be bigger than 200 GB. If 200 GB per file was acceptable in 2007, it is considered too small in 2017, so Azure has raised the limit to 4.77 TB per blob.
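The two size limits can be reproduced with simple arithmetic. The sketch below rests on assumptions that are not stated in this article: a cap of 50,000 blocks per blob, and a maximum block size that grew from 4 MiB to 100 MiB. Under those assumptions the results line up with the roughly 200 GB and 4.77 TB figures quoted above.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_BLOCKS_PER_BLOB = 50_000  # assumed cap on the number of blocks in one blob


def max_blob_bytes(max_block_mib):
    """Largest possible blob when every block has the maximum size."""
    return MAX_BLOCKS_PER_BLOB * max_block_mib * MIB


old_limit = max_blob_bytes(4)    # assumed older block size: 4 MiB
new_limit = max_blob_bytes(100)  # assumed newer block size: 100 MiB

print(f"old limit: {old_limit / GIB:.1f} GiB")  # old limit: 195.3 GiB
print(f"new limit: {new_limit / TIB:.2f} TiB")  # new limit: 4.77 TiB
```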
Append Blobs
Good for storing logs or metadata that you need to update constantly
You cannot change a Block Blob without re-uploading it. However, there are situations when you need to update a file on a regular basis. Append Blobs were created for just that purpose: they are structured so that the user can add new data to the end of the file.
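A log file is the typical case. As a rough illustration, here is an in-memory model of append-only behaviour; `AppendBlobModel` and `append_block` are hypothetical names for this sketch, not Azure SDK calls.

```python
class AppendBlobModel:
    """Toy model of an append blob: data can only be added at the end."""

    def __init__(self):
        self._data = bytearray()

    def append_block(self, data):
        """Add data at the end and return the offset where it was written."""
        offset = len(self._data)
        self._data.extend(data)
        return offset

    def read(self):
        return bytes(self._data)


log = AppendBlobModel()
first_offset = log.append_block(b"09:00 service started\n")
second_offset = log.append_block(b"09:05 backup finished\n")
```

Note that the model offers no way to overwrite earlier bytes, which is exactly the trade-off: existing content stays put, and new content goes to the end.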
Page Blobs
Designed for storing disks
Page Blobs are the basis for the Microsoft Azure virtual machine environment. They were specifically designed to meet the restrictions for disks: the size of each Page Blob must be a multiple of 512 bytes. The architecture of Page Blobs allows writing data to any part of the blob.
The purpose of that is simple. When a VM or an on-prem machine works with local storage (an HDD or SSD with your disk partitions inside), it reads and writes various parts of the device, not a linear structure. Thus, there was a need for a blob that could be changed in any part.
The general application for Page Blobs is simple: you can store the images of your disks there. As soon as you need to run a disk on a VM, you can easily set it up by taking the image from the Page Blob.
In fact, if you are running any sort of disk on a VM in Microsoft Azure - it uses Page Blobs.
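To make the 512-byte rule and the write-anywhere behaviour concrete, here is a small in-memory model. `PageBlobModel` and `write_pages` are hypothetical names, and the model additionally assumes that every write must start on a 512-byte boundary and cover whole pages, which goes beyond what is stated above.

```python
PAGE_SIZE = 512


class PageBlobModel:
    """Toy model of a page blob: fixed size, writable at any aligned offset."""

    def __init__(self, size):
        if size % PAGE_SIZE != 0:
            raise ValueError("page blob size must be a multiple of 512 bytes")
        self._data = bytearray(size)  # starts out zero-filled

    def write_pages(self, offset, data):
        # Assumption of this sketch: writes are aligned to whole pages.
        if offset % PAGE_SIZE != 0 or len(data) % PAGE_SIZE != 0:
            raise ValueError("offset and length must be multiples of 512 bytes")
        if offset + len(data) > len(self._data):
            raise ValueError("write goes past the end of the blob")
        self._data[offset:offset + len(data)] = data

    def read(self, offset, length):
        return bytes(self._data[offset:offset + length])


disk = PageBlobModel(4 * PAGE_SIZE)
# Overwrite only the third page; the rest of the "disk" is untouched.
disk.write_pages(2 * PAGE_SIZE, b"\xff" * PAGE_SIZE)
```

Unlike the block and append cases, nothing has to be re-uploaded or appended: any page can be changed in place, which is what a disk needs.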
Blobs Access Tiers
However, that is not all there is to blobs. They also have so-called access tiers. You may have heard about hot, warm and cold storage types.
- Hot files are the ones you store in the cloud to access a lot. Expensive to store, but cheap to access.
- Warm files are the ones you store in the cloud to access less frequently. Less expensive to store than hot files, but more expensive to access.
- Cold files are the ones you store in the cloud to access once in months or years. Dirt cheap to store, but the most expensive to access.
Microsoft has all three access tiers for Blob Storage: Hot Tier, Cool Tier and Archive (the last one in preview/beta as of September 2017).
Keep in mind that you cannot change the access tier for Page Blobs. Access tiers apply only to Append Blobs and Block Blobs. The limitation is connected with the purpose and architecture of these storage types. You can change access tiers for files, but not for disks.
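The trade-off between storage price and access price can be sketched as a small cost comparison. The prices in `EXAMPLE_TIERS` are placeholders invented for this illustration and are not Azure prices; the function names are hypothetical as well. Only the shape matters: cheaper storage goes with more expensive access.

```python
def monthly_cost(stored_gb, read_gb, storage_price_per_gb, read_price_per_gb):
    """Cost of keeping stored_gb for a month and reading read_gb from it."""
    return stored_gb * storage_price_per_gb + read_gb * read_price_per_gb


def cheapest_tier(stored_gb, read_gb, tiers):
    """Return the name of the tier with the lowest monthly cost."""
    return min(
        tiers,
        key=lambda name: monthly_cost(stored_gb, read_gb, *tiers[name]),
    )


# Placeholder prices: (storage price per GB, read price per GB).
# These numbers are made up for the example, NOT real Azure pricing.
EXAMPLE_TIERS = {
    "hot": (0.020, 0.000),
    "cool": (0.010, 0.010),
    "archive": (0.002, 0.050),
}

rarely_read = cheapest_tier(1000, 0, EXAMPLE_TIERS)       # "archive"
sometimes_read = cheapest_tier(1000, 500, EXAMPLE_TIERS)  # "cool"
heavily_read = cheapest_tier(1000, 5000, EXAMPLE_TIERS)   # "hot"
```

With real prices plugged in, the same comparison shows at what read volume a cooler tier stops paying off.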
Table Storage
Cheaper, more scalable storage for your structured data and big data analysis
Table storage can store, you guessed it, tables. Microsoft Azure Table Storage was designed to store structured NoSQL data. The storage is hugely scalable and, at the same time, cheap to keep data in. However, it becomes more expensive when you access the data frequently.
That storage is quite handy if you find Microsoft Azure SQL too expensive and can go without SQL structure and architecture.
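To show what "structured NoSQL data" looks like in practice, here is a minimal in-memory sketch. It relies on a detail of the service not covered above: each entity is identified by a `PartitionKey` and a `RowKey`, and entities in the same table do not have to share the same properties. `TableModel` and its methods are hypothetical names, not SDK calls.

```python
class TableModel:
    """Toy model of a schemaless table keyed by (PartitionKey, RowKey)."""

    def __init__(self):
        self._rows = {}

    def upsert(self, entity):
        # Insert the entity, or replace the one with the same key pair.
        key = (entity["PartitionKey"], entity["RowKey"])
        self._rows[key] = dict(entity)

    def get(self, partition_key, row_key):
        return self._rows.get((partition_key, row_key))

    def query_partition(self, partition_key):
        # Entities of one partition, ordered by RowKey.
        return [
            self._rows[key]
            for key in sorted(self._rows)
            if key[0] == partition_key
        ]


table = TableModel()
table.upsert({"PartitionKey": "customers", "RowKey": "002", "name": "Bob", "vip": True})
table.upsert({"PartitionKey": "customers", "RowKey": "001", "name": "Ann"})
table.upsert({"PartitionKey": "orders", "RowKey": "001", "total": 42})

customers = table.query_partition("customers")
```

There are no joins, foreign keys or fixed columns here, which is the "going without SQL structure" part of the bargain.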
Queue Storage
When your services need to communicate with each other
Queue Storage is a type of storage designed to connect the components of your application. It allows you to build flexible applications with decoupled and independent components that rely on asynchronous message queuing. Suppose you have on-premises software interacting with a server somewhere in the cloud. Sometimes the server is down, meaning that you can no longer send messages to it. If you try, that would normally result in an error. Here are some other issues with direct communication between components that you have to deal with:
- The necessity to have both the receiver and the sender available simultaneously. Needless to say, if one of them goes down, the communication terminates.
- Mandatory implementation of try/retry logic to provide for a possible outage.
- Lack of proper scalability.
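The retry logic from the list above is the kind of code every sender has to carry when it talks to the receiver directly. A minimal sketch, assuming the sender is a plain callable that raises `ConnectionError` while the server is down; `send_with_retry` and its parameters are hypothetical names.

```python
import time


def send_with_retry(send, message, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send(message), retrying with exponential backoff on ConnectionError."""
    for attempt in range(attempts):
        try:
            return send(message)
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: give up and surface the error
            sleep(base_delay * 2 ** attempt)  # wait 0.5 s, 1 s, 2 s, ...
```

Even with this in place, a message is lost if the receiver stays down longer than the retries last, and the sender is blocked while it waits.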
However, all of that can be avoided simply by employing a mediator that will collect the messages while one of the communication partners is down. With Azure Queues, you have a third player that connects the two components and acts as both a buffer and a mediator. So for example, if the consumer partner is down, the producer can still insert messages in the queue while it’s waiting for the other component to come back online.
Azure Queue storage is essentially a service for storing large numbers of messages that can be accessed from anywhere in the world via authenticated HTTP or HTTPS requests. A single message can be up to 64 KB in size.
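The buffering role of the queue can be shown with a small in-memory model. `QueueModel` is a hypothetical name, the model assumes the 64 KB limit applies to each individual message, and it simplifies retrieval: `get` removes a message outright, whereas a real queue service would typically expect the consumer to delete a message explicitly after processing it.

```python
from collections import deque

MAX_MESSAGE_BYTES = 64 * 1024  # assumed per-message size limit


class QueueModel:
    """Toy first-in, first-out message buffer between two components."""

    def __init__(self):
        self._messages = deque()

    def put(self, text):
        if len(text.encode("utf-8")) > MAX_MESSAGE_BYTES:
            raise ValueError("message is larger than 64 KB")
        self._messages.append(text)

    def peek(self):
        # Look at the next message without removing it.
        return self._messages[0] if self._messages else None

    def get(self):
        # Remove and return the next message (simplified, see note above).
        return self._messages.popleft() if self._messages else None

    def __len__(self):
        return len(self._messages)


queue = QueueModel()

# The consumer is down, but the producer keeps working.
for order_id in (101, 102, 103):
    queue.put(f"process order {order_id}")
backlog = len(queue)

# The consumer comes back online and drains the backlog in order.
processed = []
while len(queue):
    processed.append(queue.get())
```

Neither side ever calls the other directly, so neither needs the other to be online at the same moment.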
Azure supports two types of queue mechanisms:
- Storage queues. Being part of the Azure storage infrastructure, they feature a simple REST-based GET/PUT/PEEK interface with reliable and persistent messaging within and between services.
- Service Bus queues are part of a broader Azure messaging infrastructure that supports queueing as well as more advanced integration patterns.
Azure Queue storage is unquestionably a service for advanced users, and it requires certain know-how. We therefore suggest you familiarize yourself with Microsoft’s documentation to get a clearer picture of how it functions in real life.
Disk Storage
Microsoft Azure Disk Storage works on the basis of Page Blobs. It is a service that allows you to create disks for your virtual machines. A disk created in Disk Storage can be accessed from only one virtual machine. In other words, it is your local drive.
Yes, it’s that simple.
Here you have two options for the speed of your disks:
- Standard disks (HDDs). HDDs are cheap but slow.
- Premium disks (SSDs). SSDs are fast but expensive.
And two options for the disk management:
- Unmanaged disk - you should manage the disk storage and corresponding account yourself
- Managed disk - Azure does everything for you. You need to select only the size of the disk and the desired type - standard or premium
File Storage
Microsoft Azure File storage is the second storage type that was designed to support the needs of the Azure VM environment. That storage is, in essence, a network share. You can store files there that can be accessed from different Virtual Machines. It is similar to Amazon EFS and is its direct competitor.
Again, quite simple.
Okay, now you are done with choosing the type of storage for your needs. However, in the IT world safety of your data is amongst the most essential and basic things. When you store data in Microsoft Azure, regardless of the type of storage, it is stored somewhere inside the datacenters of Microsoft. However, what if one day you wake up and read the news about the devastating accident that has completely destroyed the data center in question? Kind of a nightmare really.
To prevent that from happening, Microsoft has four options for data replication among its datacenters.
Disk Storage redundancy is described separately, as it does not fit neatly into the following structure.
Locally-Redundant Storage
3 copies of each file in one building, but in 3 different places
That is the most basic and the cheapest way to replicate your data. When you choose Locally-Redundant Storage (LRS), you can be sure that in one datacenter of your choice there will be 3 copies of each file you store, kept on different nodes (basically, on different hard drives).
LRS is available for all 5 types of Microsoft Azure Storage.
Geo-Redundant Storage
3 copies in one building, 3 copies in another
However, if a meteorite strikes the datacenter, it won’t matter how many different nodes your data is on; it will still be lost!
Microsoft, however, has thought it through and created GRS, Geo-Redundant Storage. You will have 6 copies of your data in 2 different regions, hundreds of miles apart.
It is LRS multiplied by 2, basically.
Read-Access Geo-Redundant Storage
Read-access geo-redundant storage (RA-GRS) gives you read-only access to your data in the secondary location, in addition to the replication across two regions provided by GRS. The secondary endpoint is similar to the primary endpoint but adds a suffix to the account name. That is to say, myaccount.blob.core.windows.net turns into myaccount-secondary.blob.core.windows.net. Note, however, that asynchronous replication involves a delay, so in the event of a regional disaster, changes that have not yet been replicated to the secondary region may be lost if the data cannot be recovered from the primary region.
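The naming rule for the secondary endpoint is simple enough to express as a helper. `secondary_endpoint` is a hypothetical function written for this example; it only applies the "-secondary" suffix to the account name as described above.

```python
def secondary_endpoint(primary_host):
    """Derive the read-only secondary host name from the primary one."""
    account, dot, rest = primary_host.partition(".")
    if not account or not dot or not rest:
        raise ValueError(f"not a storage endpoint host name: {primary_host!r}")
    return f"{account}-secondary.{rest}"


print(secondary_endpoint("myaccount.blob.core.windows.net"))
# myaccount-secondary.blob.core.windows.net
```

An application could use such a helper to fall back to reading from the secondary location when the primary one does not respond, keeping in mind that the secondary copy may lag behind.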
Zone Redundant Storage
Zone-redundant storage (ZRS) replicates your data asynchronously across data centers within one or two regions in addition to storing three replicas similar to LRS. Data stored in ZRS is durable even if the primary data center is unavailable. ZRS is only available for block blobs and cannot be converted to LRS or GRS (or vice versa).
Disk Storage Redundancy
Disk storage serves as a storage for Azure VMs and hence requires replication to ensure data integrity. The first thing you need to do is to enable Azure VM replication in the Azure portal.
The next step is to configure continuous replication for your VMs. That way data writes on the VM disks are continuously transferred to the cache storage account in the target location. Site Recovery processes the data and sends it to the target storage account. After the data is processed, recovery points are generated in the target storage account every few minutes. If your main setup fails somehow, new virtual machines will be initiated in the target region and the cache storage account will ensure that your data is intact.
Microsoft Azure Storage Accounts
Now that you are aware of storage types, access tiers and redundancy, it will be quite easy to understand how it all works inside your Microsoft Azure account. Within Microsoft Azure, you can choose between two storage account types: a General-Purpose account and a Blob Storage account. You may think that a Blob Storage account works only with blobs and a General-Purpose account works with the rest.
And you would be completely wrong.
General-Purpose Storage Account
A General-Purpose account is designed to operate with all types of Microsoft Azure Storage except Disk Storage. To create disks inside your Azure Storage, you should first create a Microsoft Azure Virtual Machine.
Blob Storage Account
This might be confusing at the beginning, but bear with us. Blob Storage accounts are designed to work with Block Blobs and Append Blobs. Page Blobs, however, can only be created when you are using a General-Purpose account.
The reason is that one of the distinctive features of Block and Append Blobs is their ability to be Hot, Cool or Archive, as they are files. Page Blobs are designed to be disks. They were designed not to store files but to enhance the functionality of VMs inside Azure, so they need to be accessed frequently. That makes access tiering useless in the case of Page Blobs.
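The rules from the last two sections fit into a small lookup table. The sketch below only encodes what this article states; the account and blob type labels, and the function names `can_create` and `can_set_tier`, are made up for the example.

```python
# Which blob types each storage account type can hold, per the text above.
SUPPORTED_BLOB_TYPES = {
    "general-purpose": {"block", "append", "page"},
    "blob-storage": {"block", "append"},
}

# Blob types that can be assigned an access tier (Hot, Cool or Archive).
TIERABLE_BLOB_TYPES = {"block", "append"}


def can_create(account_kind, blob_type):
    """True if a blob of this type can be created in this account type."""
    return blob_type in SUPPORTED_BLOB_TYPES[account_kind]


def can_set_tier(blob_type):
    """True if an access tier makes sense for this blob type."""
    return blob_type in TIERABLE_BLOB_TYPES


for kind in SUPPORTED_BLOB_TYPES:
    for blob_type in ("block", "append", "page"):
        print(kind, blob_type, can_create(kind, blob_type), can_set_tier(blob_type))
```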
Change Access Tier Inside Container
You can now change the access tier for your blob right from the container. You may ask - and how is that different from the previous model?
A container is a basic structural element of your storage account. You first create a storage account and then create a container, which is where your files go. Previously, the only way to choose an access tier in Microsoft Azure was to select it for a Blob Storage account when the account was created.
But now you can change the access tier after you have created an account. And more importantly, this feature works for both General-Purpose accounts and Blob Storage accounts. Now, if you think that you do not need files to be accessed frequently, you can simply go to the container, select a file and change its access tier.
That feature is in preview mode and works currently (as of September 2017) only in LRS for Append and Block Blobs.
Microsoft Azure Storage and CloudBerry Products
We at CloudBerry are working to bring support for each and every cloud storage on the market, giving you more versatility for backup and storage of your files.
Backup to Microsoft Azure Blob
When you create a backup plan in CloudBerry Backup, the first thing you have to do is to indicate the cloud storage of your choice. Microsoft Azure Blob is fully supported and ready for action.
Restore from Block Blob to Azure VM
Aside from backups, CloudBerry Backup enables you to restore your image-based backups to Azure VM. The data can be fetched from any cloud storage service, including Azure Block Blob.
Support for Access Tiers
CloudBerry Backup also supports different Azure Block Blob storage tiers, enabling you to upload either to Azure Cool Blob Storage or Azure Archive Storage.
Microsoft Azure and Explorer PRO Capabilities
We also have CloudBerry Explorer for Azure — the cloud explorer that lets you list, download, upload, and change the access tier of files in Azure. It is available free of charge for the standard version, so be sure to check it out.
Microsoft Azure Storage is not the easiest choice on the market to understand, but it is definitely one of the most flexible. In fact, it offers more options than Amazon AWS, while substantially outperforming it in the pricing department. If you’re thinking of moving your IT infrastructure to the cloud, definitely take a look at Microsoft Azure.