Amazon Web Services offers cloud storage classes for every need and budget. The cheapest option for long-term backup storage is Amazon Glacier. There are two main ways to get files into that storage: uploading them directly or moving them from S3 with a lifecycle policy. In this article we demonstrate why the lifecycle policy method is more practical.
Differences Between Amazon Glacier and S3
Amazon Glacier offers a few data upload options:
- The AWS web management console.
- AWS SDKs or the Amazon Glacier API.
- Amazon S3 lifecycle management rules, which archive unused data from Amazon S3 to Glacier.
- Third-party solutions from AWS Partner Network (APN) partners that have Glacier support built in.
Let’s look in detail at direct archiving to Glacier and at archiving through a lifecycle policy in S3.
The cost of data storage in S3 or Glacier is only part of the price of these services. S3 also charges for requests to stored data and for download traffic. Glacier additionally charges for data retrieval, and that price depends on the retrieval speed.
The point is that all information in Glacier is stored on reasonably priced tape drives and is copied to more expensive media before it is transferred to the user. So, when data is archived directly to Glacier, it takes about half a day to index it and another 3-5 hours to make it available for retrieval (if standard retrieval is used). If data is uploaded to S3, all files are available instantly.
Direct Upload to Glacier
Let’s walk through the basic way of working with Glacier, using CloudBerry Backup as an example.
When creating a new backup plan in CloudBerry Backup, select Glacier as the cloud storage.
The backup mode is an important choice:
- In Regular Mode, each file is uploaded separately.
- In Archive Mode, data is compressed into a single file, which saves on requests.
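To see why Archive Mode saves on requests, here is a rough sketch. It assumes exactly one UPLOAD request per file (large files sent in several parts would need more) and uses the Glacier request price from the cost list in this article, $0.055 per 1,000 requests. The function name and the file count are made up for illustration.

```python
# Illustrative sketch: request cost of Regular Mode vs. Archive Mode.
# Assumption: each uploaded file costs exactly one UPLOAD request.
UPLOAD_PRICE_PER_1000 = 0.055  # USD per 1,000 Glacier UPLOAD requests (from this article)


def upload_request_cost(request_count: int) -> float:
    """Return the cost in USD of the given number of UPLOAD requests."""
    return request_count * UPLOAD_PRICE_PER_1000 / 1000


file_count = 100_000  # hypothetical number of files in the backup set

regular_mode = upload_request_cost(file_count)  # one request per file
archive_mode = upload_request_cost(1)           # one request for the single archive

print(f"Regular Mode: ${regular_mode:.2f}")
print(f"Archive Mode: ${archive_mode:.6f}")
```

Under these assumptions the request cost scales with the number of files in Regular Mode, while in Archive Mode it stays at the price of a single request.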
You then specify the files to archive and set up encryption, compression, e-mail notifications and other parameters.
As an example of calculating upload and storage costs in the Canada (Central) region, we will use a backup archive in the form of a single 200 GB file that grows by 10% every month.
The list of costs is as follows:
- Data transfer to Amazon Glacier: Free.
- UPLOAD and RETRIEVAL requests: $0.055 per 1,000 requests.
- LISTVAULTS, GETJOBOUTPUT, DELETE and other requests: Free.
- Storage: $0.0045 per GB per month, which is $0.90 for 200 GB in the first month (the size grows by 10% every month).
Assuming up to 1,000 upload requests per month, total costs are $0.90 + $0.055 = $0.955 for the first month, $0.99 + $0.055 = $1.045 for the second month, and so on.
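The same estimate can be reproduced for any number of months with a short script. This is only a sketch of the arithmetic above: it assumes the 10% growth compounds monthly and that each month is billed for one block of 1,000 UPLOAD requests. The constant and function names are illustrative.

```python
# Illustrative sketch of the direct-upload cost estimate from this article.
STORAGE_PRICE_PER_GB = 0.0045  # USD per GB per month (Glacier storage)
REQUEST_COST = 0.055           # USD, assumed: one block of 1,000 UPLOAD requests per month
INITIAL_SIZE_GB = 200
MONTHLY_GROWTH = 0.10          # assumed to compound: the archive grows by 10% each month


def monthly_costs(months: int) -> list[float]:
    """Return the estimated total cost (USD) for each month, starting at month 1."""
    costs = []
    size_gb = INITIAL_SIZE_GB
    for _ in range(months):
        costs.append(round(size_gb * STORAGE_PRICE_PER_GB + REQUEST_COST, 4))
        size_gb *= 1 + MONTHLY_GROWTH
    return costs


for month, cost in enumerate(monthly_costs(6), start=1):
    print(f"Month {month}: ${cost:.3f}")
```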
Upload to Glacier through S3 Lifecycle Policies
S3 lets you store several versions of your data and manage their lifecycle. When the period you set expires, data is either deleted or transferred to Glacier. If you set the transition period to 0 days, data is sent to Glacier immediately. This is useful when data is rarely accessed day to day but has to be kept for a limited time.
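For readers who manage buckets from scripts rather than from a GUI, a 0-day transition rule might look like the sketch below. It only builds and prints the lifecycle configuration in the structure that the S3 API expects. The rule ID, the `backups/` prefix and the bucket name are hypothetical, and the boto3 call is left as a comment because it needs boto3 and valid AWS credentials.

```python
import json


def glacier_transition_rule(prefix: str = "", days: int = 0) -> dict:
    """Build an S3 lifecycle configuration that moves objects to Glacier."""
    return {
        "Rules": [
            {
                "ID": f"archive-to-glacier-after-{days}-days",  # hypothetical rule name
                "Filter": {"Prefix": prefix},  # empty prefix matches every object
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# "backups/" is a hypothetical prefix; 0 days means no waiting period in S3.
config = glacier_transition_rule(prefix="backups/", days=0)
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, the rule could be applied like this:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-backup-bucket",  # hypothetical bucket name
#       LifecycleConfiguration=config,
#   )
```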
Although it might seem that uploading data to S3 first and moving it to Glacier afterward would be more expensive, AWS has ensured that this scenario costs no more than a direct Glacier upload.
You can create a lifecycle policy in the AWS console or set it up in a couple of clicks within CloudBerry Backup.
After setting up the backup plan and the S3 cloud storage, enable the lifecycle policy in CloudBerry Backup: go to Tools and click Lifecycle Policy, or use the same option in the left pane. For the step-by-step guide please check out this article.
In the dialog box, select the source of the data that will be transferred to the storage.
With a setting of 60 days, for example, the files first go to S3 and are moved to Glacier 60 days later. If you set 0 days, they are archived immediately.
Keep in mind that with any setting other than 0, storage and transfer costs will be higher, because AWS charges S3 storage rates for the time the files spend in S3.
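As a rough illustration of that extra charge, the sketch below compares keeping data in S3 for a while against sending it to Glacier right away. It uses the storage prices from the comparison table in this article and assumes a 30-day month, a fixed archive size and no request or retrieval charges. The function name is illustrative.

```python
# Illustrative sketch: extra storage cost of delaying the move to Glacier.
S3_PRICE_PER_GB_MONTH = 0.025        # USD, S3 storage (comparison table in this article)
GLACIER_PRICE_PER_GB_MONTH = 0.0045  # USD, Glacier storage (same table)
DAYS_PER_MONTH = 30                  # simplifying assumption


def extra_cost_of_delay(size_gb: float, days_in_s3: int) -> float:
    """Extra USD paid for keeping data in S3 instead of Glacier for days_in_s3 days."""
    months = days_in_s3 / DAYS_PER_MONTH
    return size_gb * months * (S3_PRICE_PER_GB_MONTH - GLACIER_PRICE_PER_GB_MONTH)


for days in (0, 30, 60):
    print(f"{days:>2} days in S3: +${extra_cost_of_delay(200, days):.2f}")
```

With these numbers, the longer the transition period, the more of the storage bill is charged at the S3 rate rather than the Glacier rate.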
This approach has an advantage over a direct Glacier upload. With a direct upload, you have to wait 3-5 hours for the initial inventory to complete. With the S3-to-Glacier lifecycle policy set to 0 days, the inventory happens as soon as the files are in storage.
Whether to use Amazon S3 and Glacier separately or in combination depends on the objectives of each project. We have prepared a comparison table to help you evaluate the costs of both archiving methods:
| | Amazon S3 | Amazon Glacier |
|---|---|---|
| Data transfer to Amazon (per GB) | $0.000 | $0.000 |
| Data transfer from S3 to Glacier (per 1,000 requests) | $0.055 | N/A |
| PUT (per 1,000 requests) | $0.0055 | Free |
| COPY (per 1,000 requests) | $0.0055 | Free |
| POST (per 1,000 requests) | $0.0055 | Free |
| LIST (per 1,000 requests) | $0.0055 | Free |
| GET (per 10,000 requests) | $0.0044 | Free |
| UPLOAD (per 1,000 requests) | $0.0044 | $0.055 |
| RETRIEVAL (per 1,000 requests) | $0.0044 | $0.055 |
| Data storage (first 50 TB / month) | $0.025 per GB | $0.0045 per GB |
| Data access speed | Instant | Depends on the retrieval option (see AWS pricing) |
| Data transfer from Amazon | See AWS pricing | See AWS pricing |
Amazon Web Services lets you use 5 GB of S3 storage (unfortunately, Glacier is not included) for free for 12 months, which is enough time to get acquainted with the system's features. We recommend trying CloudBerry Backup, which is available for free for 15 days, to assess its archiving features in full.
You are welcome to share your experience and ask questions in the comments section below.