Microsoft has added two new kinds of resiliency to general purpose v2 (GPv2) storage accounts called Geo-Zone Redundant Storage (GZRS) and Read-Access Geo-Zone Redundant Storage (RA-GZRS).
The Old ZRS
ZRS, when it originally appeared several years ago in Azure, was a form of general purpose v1 (GPv1) storage account replication that had a complex definition. It kept 3 copies of your data: 2 in the region of choice, and a third either in the same region or in a nearby region. But this was before Azure regions had zones as we know them today.
The concept of ZRS was to get over the availability limitations of LRS and GRS:
- LRS keeps 3 synchronous copies of the storage account on a single storage cluster, in a single room (co-lo), in a single data centre, in a single region. If that one cluster, co-lo, or data centre goes down, then you lose the storage account until, or if, it returns.
- GRS is an extension of LRS, keeping an additional 3 asynchronous copies of the storage account in the paired region (secondary region) of the primary region (the region you deployed the storage account into). However, you cannot use the secondary copies until Microsoft declares a failover, which requires a non-recoverable failure of the primary; this event has never occurred, but there have been plenty of local outages that made the accessible (LRS) copies unavailable for periods of time.
- RA-GRS extends GRS by making the additional copies in the paired region available for read access, useful if you have a custom app that only needs to read the data.
However, the old ZRS still didn’t understand how to divide up its copies into independent zones in the same region. Even if it spread the data around 2 to 3 data centres in the same region, those data centres could have had shared dependencies.
Availability Zones
Microsoft is slowly adding availability zones to their Azure regions. When a region (a cluster of closely located data centres that you deploy resources into) is broken up into availability zones, Microsoft creates 4 zones that have completely independent power, networking, etc. The idea is that if one zone goes down because of an internal infrastructure failure, it should have no effect on production systems in the other zones in the same region. As a result, we can get higher SLAs by using zone-redundant deployments.
However, there is a cost. Some resources require higher SKUs, there is a micro inter-zone communications cost, and latency between tiers of a service or services can be increased by using more than one zone.
Note that a region divides the data centres of that region into 4 zones. At any one time, you will see 3 zones, “round robin” (or some other algorithm) selected for you, labelled as 1, 2, and 3.
The New ZRS
When Microsoft launched GPv2, they did two things:
- They shared an end-of-life date for ZRS in GPv1
- They introduced a new form of ZRS in GPv2
The new ZRS uses the availability zones of an enabled region to place 3 copies of your storage account data across 3 different storage clusters, in 3 different data centres that do not have shared dependencies. Now if one of those data centres, co-los, or storage clusters goes down, the storage account remains available.
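The difference between LRS and the new ZRS when a single zone fails can be sketched in a few lines of Python. This is a toy model, not how Azure places data: the zone labels, the function names, and the rule that "one surviving copy means the account is available" are all simplifying assumptions made for illustration.

```python
# Toy model only: zone labels and the "any surviving copy" rule are assumptions.
ZONES = ("1", "2", "3")


def place_copies(redundancy, home_zone="1"):
    """Return the zone that holds each of the 3 in-region copies."""
    if redundancy == "LRS":
        # All 3 copies sit together, so they share a single zone.
        return [home_zone] * 3
    if redundancy == "ZRS":
        # One copy in each availability zone.
        return list(ZONES)
    raise ValueError(f"unsupported redundancy: {redundancy}")


def available_after_failure(placement, failed_zone):
    """True if at least one copy lives outside the failed zone."""
    return any(zone != failed_zone for zone in placement)


for failed in ZONES:
    lrs_ok = available_after_failure(place_copies("LRS"), failed)
    zrs_ok = available_after_failure(place_copies("ZRS"), failed)
    print(f"zone {failed} down: LRS available={lrs_ok}, ZRS available={zrs_ok}")
```

In this model the LRS account is lost whenever its home zone fails, while the ZRS account survives the loss of any single zone.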
Adding Geo-Redundancy To ZRS
It would make sense for ZRS to be used, but it does not have geo-redundancy. So just like with LRS, Microsoft is adding (in preview today in East US) two geo-redundant options:
- GZRS or Geo-Zone Redundant Storage: ZRS plus 3 asynchronous copies in the paired region.
- RA-GZRS or Read-Access Geo-Zone Redundant Storage: GZRS where the asynchronous copies can be used for read operations only.
Note that:
- The replicas in the paired region are stored in LRS, not ZRS. And that means that …
- The paired region does not need to be in the preview for GZRS or RA-GZRS and it does not need to support availability zones – only the primary region does.
Which means that more people will be able to use GZRS and RA-GZRS.
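To keep the six options straight, here is a small Python summary of what this post describes. It is only a sketch: the `Redundancy` class, its field names, and the `usable_options` helper are made up for illustration and are not part of any Azure SDK. The copy counts come from the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    """Hypothetical summary record; not an Azure SDK type."""
    name: str
    primary_zone_redundant: bool  # primary copies spread across zones?
    secondary_copies: int         # asynchronous copies in the paired region
    secondary_readable: bool      # can the paired-region copies be read?

    @property
    def total_copies(self):
        # Every option keeps 3 copies in the primary region.
        return 3 + self.secondary_copies


OPTIONS = [
    Redundancy("LRS", False, 0, False),
    Redundancy("GRS", False, 3, False),
    Redundancy("RA-GRS", False, 3, True),
    Redundancy("ZRS", True, 0, False),
    Redundancy("GZRS", True, 3, False),
    Redundancy("RA-GZRS", True, 3, True),
]


def usable_options(primary_has_zones):
    """Names of the options a primary region can offer.

    The paired region is deliberately not a parameter: its copies are
    stored as LRS, so it does not need availability zones.
    """
    return [
        option.name
        for option in OPTIONS
        if primary_has_zones or not option.primary_zone_redundant
    ]


print(usable_options(primary_has_zones=False))
print(usable_options(primary_has_zones=True))
```

Only the primary region's zone support decides whether ZRS, GZRS, and RA-GZRS appear in the list, which is the point made in the notes above.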
Is ZRS the New LRS?
For those regions where ZRS is supported, and GZRS/RA-GZRS will be added, would it make sense to use ZRS as your starting point? I would like to say that the answer is yes. My default answer is “yes” but you need to check that your services will support it. For example, I use ZRS for certain things, but for other things, such as virtual machine diagnostics, I cannot because the IaaS diagnostics agent will not support ZRS! I guess the team responsible for that is more focused on driving revenue into Azure Monitor Logs (Log Analytics) by adding support for Workspace (preview today) in addition to LRS/GRS storage.
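That "default to ZRS, but check first" rule of thumb could be written down as a small Python helper. This is a sketch under stated assumptions: the function name, the workload labels, and the exclusion set are hypothetical, and the set is seeded only with the one example from this post. You would need to maintain your own list from testing and documentation.

```python
# Hypothetical exclusion list: workload labels are made up, and the only
# entry is the VM diagnostics example mentioned in this post.
ZRS_UNSUPPORTED_WORKLOADS = {"vm-diagnostics"}


def default_redundancy(workload, region_has_zones, need_geo=False):
    """Pick a starting redundancy option for a new storage account."""
    zrs_ok = region_has_zones and workload not in ZRS_UNSUPPORTED_WORKLOADS
    if zrs_ok:
        # Zone-redundant by default, with geo-replication on request.
        return "GZRS" if need_geo else "ZRS"
    # Fall back to the options that do not depend on availability zones.
    return "GRS" if need_geo else "LRS"


print(default_redundancy("app-data", region_has_zones=True))
print(default_redundancy("vm-diagnostics", region_has_zones=True))
print(default_redundancy("app-data", region_has_zones=False, need_geo=True))
```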