Want to take advantage of geo-redundant storage but are unsure where to start? This article is for you. Below, we compare AWS replication across regions, Azure storage geo-replication and the data replication features available on Google Cloud Storage.
Geographical redundancy, or geo-redundancy for short, is a valuable data storage strategy that can help to improve data reliability and availability. When you replicate data across multiple regions, your data is more resistant to disruptions that could cause damage to a particular data center or set of servers.
AWS S3 Replication
S3, the object storage service on AWS, provides two geo-redundancy options.
S3 Availability Zones
The first way to achieve geo-redundancy on AWS is to use what AWS calls Availability Zones. Each AWS region includes multiple Availability Zones, which are physically separate from each other.
By default, the Standard, Infrequent Access and Glacier storage classes on S3 replicate data automatically across at least three Availability Zones. This means that as long as you are using one of these S3 storage classes, your data will be geo-redundant, without any extra effort required on your part. The cost of this AWS geo-redundancy is built into the standard S3 pricing.
The major downside to achieving AWS geo-redundancy via this approach is that the Availability Zones within the same AWS region are still somewhat close together. Amazon is not specific about how close they are to each other, but says only that they are separated by “miles.” As such, they will suffice to protect against disruptions that are very localized, such as a fire or cooling system failure that affects just one server room. However, a large-scale disaster, such as major flooding or an earthquake, could impact multiple Availability Zones within the same region; in that case, your data may cease to be available despite the geo-redundancy that you have established via Availability Zones.
AWS Cross-Region Replication (CRR)
If you are concerned about the large-scale disasters described above, you can gain a greater level of S3 data redundancy by taking advantage of AWS’s Cross-Region Replication (CRR) feature.
With CRR, your S3 data is automatically replicated across multiple distinct AWS regions. As long as the infrastructure in at least one of those regions remains intact during a disaster, your data will remain available.
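To make the setup more concrete, here is a minimal sketch in Python of a replication configuration in the shape accepted by boto3's `put_bucket_replication` call. The rule ID, role ARN and bucket names are hypothetical placeholders, and the sketch assumes that versioning is already enabled on both the source and destination buckets, which S3 requires for replication.

```python
def build_crr_config(role_arn, destination_bucket_arn):
    """Build a replication configuration that copies every object in a bucket."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes when replicating objects
        "Rules": [
            {
                "ID": "replicate-everything",  # hypothetical rule name
                "Priority": 1,
                "Filter": {"Prefix": ""},  # empty prefix: rule applies to all objects
                "Status": "Enabled",
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }


def enable_crr(source_bucket, role_arn, destination_bucket_arn):
    """Apply the configuration; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_replication(
        Bucket=source_bucket,
        ReplicationConfiguration=build_crr_config(role_arn, destination_bucket_arn),
    )
```

The destination bucket would live in a different region from the source bucket; keeping the configuration in a separate builder function makes it easy to inspect before anything is sent to AWS.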
Microsoft Azure Storage Geo-Replication
The Azure cloud offers four replication options for Blob storage, ranging from local redundancy within one data center to full cross-region replication.
Locally Redundant Storage (LRS)
This is the most basic and lowest-cost data replication solution for Azure. LRS replicates data across a collection of storage racks in the same data center. It provides no real protection against data loss resulting from physical damage to the data center, but it is effective for protecting against problems resulting from data read/write errors and similar inconsistencies in an application.
Zone Redundant Storage (ZRS)
ZRS is very similar to Availability Zones-based geo-redundancy in AWS. ZRS replicates data across Azure’s version of Availability Zones, which, like those of AWS, are physically separate storage locations within the same cloud region.
Geo-Redundant Storage (GRS)
GRS is the Azure equivalent of AWS CRR. GRS replicates data across multiple regions in the Azure cloud.
Under the standard Azure GRS service, the secondary copy of geo-redundant data becomes available to users only in the event that Microsoft executes a failover from the primary region to the secondary region. In other words, your data is backed up in a secondary region, but you can’t access it from there unless the primary region officially fails.
Read-Access Geo-Redundant Storage (RA-GRS)
RA-GRS provides the same level of replication as standard GRS, but with the difference that users can read (but not write) data from the secondary region at all times. You don’t have to wait for Microsoft to initiate a failover in order to access your data from the secondary region.
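As a rough illustration of what reading from the secondary region looks like, the sketch below builds the two Blob endpoints for a storage account and falls back to the read-only secondary when the primary cannot be reached. The account name and the `fetch` callable are hypothetical; `fetch` stands in for whatever HTTP client you use and is assumed to raise `OSError` on a connection failure.

```python
def blob_endpoints(account_name):
    """Return the primary and secondary Blob endpoints for a storage account."""
    primary = f"https://{account_name}.blob.core.windows.net"
    # With RA-GRS, the read-only secondary endpoint adds a "-secondary" suffix.
    secondary = f"https://{account_name}-secondary.blob.core.windows.net"
    return primary, secondary


def read_with_fallback(fetch, account_name, path):
    """Try the primary endpoint first, then the read-only secondary."""
    primary, secondary = blob_endpoints(account_name)
    try:
        return fetch(f"{primary}/{path}")
    except OSError:
        # Primary unreachable: read the replicated copy instead.
        return fetch(f"{secondary}/{path}")
```

Because replication to the secondary region is asynchronous, a read served from the secondary endpoint may lag slightly behind the most recent writes to the primary.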
Cross-Region Replication Pricing: AWS vs. Azure
Whether you use AWS or Azure, achieving the geo-redundancy of cross-region data replication is much more expensive than other geo-redundancy options.
AWS CRR Costs
The cost of AWS CRR for S3 is more than double the cost of Standard S3 storage because you have to pay for the storage costs associated with each copy of your data. That means that, if you chose to use the AWS U.S. East (Ohio) region as your primary region, and U.S. West (Northern California) for the secondary, you’d have to pay storage costs of $0.023 per gigabyte for the first region, plus $0.026 per gigabyte for the second, assuming that you used S3 Standard storage.
Infrequent-Access and Glacier storage costs would be much lower, at $0.0125 and $0.004 per gigabyte, respectively. However, your total storage bill would still be double because again, you’d be paying for storage in two regions.
On top of this, you also have to pay for the inter-region data transfers required to replicate data across regions using AWS CRR. Inter-region data transfers are currently priced at $0.04 per gigabyte for the U.S. East (Ohio) region. That would mean that, if you had 10 terabytes of data to replicate across regions, you’d pay $400 for the data transfer costs, in addition to storage costs.
Keep in mind that it is possible to configure CRR for only certain S3 buckets, so if price is a concern, you can choose to use this geo-redundancy feature only for critical data, and not for other data.
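The figures in this section can be combined into a quick back-of-the-envelope estimate. The sketch below uses the prices quoted above as defaults and assumes decimal terabytes (1 TB = 1,000 GB) and that the storage prices are per gigabyte per month; actual bills depend on current pricing, request charges and how much data changes.

```python
GB_PER_TB = 1000  # decimal terabytes, matching the 10 TB -> $400 figure above


def crr_costs(data_gb, primary_price=0.023, secondary_price=0.026, transfer_price=0.04):
    """Return (monthly storage cost, transfer cost to replicate the data once)."""
    monthly_storage = data_gb * (primary_price + secondary_price)
    replication_transfer = data_gb * transfer_price
    return monthly_storage, replication_transfer


storage, transfer = crr_costs(10 * GB_PER_TB)
print(f"Monthly storage in both regions: ${storage:,.2f}")
print(f"Transfer to replicate once:      ${transfer:,.2f}")
```

For 10 terabytes, this works out to roughly $490 per month for storage in both regions, plus about $400 to transfer the data to the secondary region.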
Azure GRS and RA-GRS Pricing
As with AWS CRR, Azure’s cross-regional replication services come with a hefty price tag.
RA-GRS storage costs $0.0589 per gigabyte when you use the Azure U.S. East region as your primary region. That is nearly three times the price of simple LRS storage (which is $0.0208 per gigabyte), a steeper relative increase than the move from basic AWS S3 storage to CRR, which roughly doubles the storage bill.
Basic GRS storage is somewhat less expensive, at $0.0458 per gigabyte when using the U.S. East region. Even with this lower-priced option, however, you still pay more than double the cost of Azure LRS storage.
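The relative premiums are easy to check with a few lines of arithmetic. This sketch uses only the per-gigabyte U.S. East prices quoted above; the dictionary and function names are illustrative.

```python
AZURE_PRICES = {"LRS": 0.0208, "GRS": 0.0458, "RA-GRS": 0.0589}  # $ per GB


def premium_over_lrs(tier, prices=AZURE_PRICES):
    """Return how many times more expensive a tier is than LRS."""
    return prices[tier] / prices["LRS"]


for tier in ("GRS", "RA-GRS"):
    print(f"{tier}: {premium_over_lrs(tier):.2f}x the LRS price")
```

At these prices, GRS comes out at about 2.2 times the LRS price and RA-GRS at about 2.8 times.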
Geo-Redundancy on Google Cloud Platform
On Google Cloud Platform (GCP), geo-redundant storage options are more limited. The only major solution is the Multi-Regional Storage class, which replicates data across “at least two geographic places separated by at least 100 miles,” according to Google.
Google is vaguer than AWS and Azure about exactly how its cross-region storage service works. It says that data replication is not synchronous, so it’s unclear how long you’d have to wait for replication to the secondary location to complete. Users also cannot select exactly which regions are used when they set up multi-region storage; instead, they select a general location (such as “Asia” or the “European Union”) in which to store their data, and Google automatically selects which data centers to use for geo-redundant storage within that location.
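In practice, choosing a general location comes down to passing a multi-region code when the bucket is created. The following is a minimal sketch that assumes the third-party `google-cloud-storage` Python package and configured credentials; the function name and the default location are illustrative choices, not part of Google's API.

```python
MULTI_REGIONS = {"US", "EU", "ASIA"}  # multi-region location codes


def create_multi_region_bucket(name, location="EU"):
    """Create a bucket in a multi-region; needs google-cloud-storage and credentials."""
    if location.upper() not in MULTI_REGIONS:
        raise ValueError(f"{location!r} is not a multi-region location")
    from google.cloud import storage  # third-party package, imported lazily

    client = storage.Client()
    # Google chooses the specific data centers within the multi-region.
    return client.create_bucket(name, location=location.upper())
```

Note that nothing in the call names a specific data center: the caller supplies only the broad location, which matches the limited control described above.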
Geo-Redundancy Service Types: AWS vs Azure vs Google Cloud
| | AWS | Azure | Google Cloud Platform |
|---|---|---|---|
| Localized storage replication service | Availability Zones on S3 | LRS, ZRS | No explicit option |
| Cross-regional data replication | CRR | GRS, RA-GRS | Multi-region storage class |
When it comes to geo-redundancy, there are clear differences between the major cloud providers:
- AWS offers automatic geo-redundant storage within a single region for its most popular S3 storage classes, and users who are willing to pay significantly higher costs can opt for AWS replication across regions.
- Azure is similar to AWS in that it offers several tiers of geo-redundancy, although its cross-region storage options come with some caveats (such as the inability to access data from the secondary region under standard Azure GRS). Like AWS, Azure charges significantly higher rates for multi-region replication.
- On Google Cloud Platform, geo-redundant storage options are limited, and their operational details are the least clear from the user’s perspective.