AWS S3 offers several storage classes that let you optimize cost, among other things.
For instance, some classes are intended for archiving: S3 Glacier and S3 Glacier Deep Archive. Their storage cost is the lowest available in S3, but your data is not immediately accessible and retrieving it costs more.

With the S3 archive classes, retrieving data is not cost-effective because that is not what they are designed for. They are meant for data you need to keep for some reason (legal, insurance…) but rarely need to access.
Database backups are a typical scenario these storage classes are designed for.

But what happens if I need this data? How do I proceed? We will answer these questions using the AWSPowerShell module.
Of course, the AWS CLI is another possible, well-documented approach. In my opinion, though, it is less reusable (integrating it into a custom module is less convenient, for instance) and fits the PowerShell “philosophy” less well.

1- Select your S3 object

First of all, you need to find the object you want to retrieve. To do so, several pieces of information are needed:

  • the bucket name where the object resides (mandatory)
  • the key (optional): returns the object matching the exact key
  • the key prefix (optional): returns all objects with a key starting with this prefix

The cmdlet you need for this is called Get-S3Object. Here are some examples of usage:

# Retrieve object from a specific key
Get-S3Object -BucketName $BucketName -Key $Key

# Retrieve objects from a key prefix
Get-S3Object -BucketName $BucketName -KeyPrefix $KeyPrefix

This cmdlet cannot retrieve an object from its name alone: you need to know the key, or the beginning of the key (key prefix).
Of course, you can search within PowerShell, but you will need to retrieve ALL objects in the bucket before filtering… so the cost of the search depends on the number of objects in your bucket.

Moreover, to retrieve information about the restore status, you need to look into the object metadata with the cmdlet Get-S3ObjectMetadata.
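As a quick illustration, here is a minimal sketch of reading the restore status of a single object from its metadata. The function name Get-S3RestoreStatus is hypothetical (it is not part of the AWS module); the properties RestoreInProgress and RestoreExpiration are the ones returned by Get-S3ObjectMetadata and used later in this post.

```powershell
# Hypothetical helper (not part of the AWS module): summarises the restore status of one S3 object.
function Get-S3RestoreStatus {
    param(
        [Parameter(Mandatory = $true)][string] $BucketName,
        [Parameter(Mandatory = $true)][string] $Key
    )

    # The restore information is only exposed through the object metadata
    $Metadata = Get-S3ObjectMetadata -BucketName $BucketName -Key $Key

    [PSCustomObject]@{
        Key                  = $Key
        RestoreInProgress    = [bool]$Metadata.RestoreInProgress
        RestoreExpirationUtc = $Metadata.RestoreExpiration
    }
}
```

Calling Get-S3RestoreStatus with a bucket name and a key returns one object per call; an empty RestoreExpirationUtc means no restored copy is currently available.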

To make the search simple and return the desired information, I created a custom function that accepts the partial name of an S3 object as input and customizes the output:

Function Get-dbiS3Object {
    param(
        [Parameter(Mandatory=$true)]
        [String]
        $BucketName,
        [String]
        $Key,
        [String]
        $KeyPrefix = '',
        [String]
        $Name = ''
    )

    # Build the parameters for Get-S3Object (splatting avoids Invoke-Expression and quoting issues)
    $Params = @{ BucketName = $BucketName };

    If ($KeyPrefix){
        $Params.KeyPrefix = $KeyPrefix;
    }
    If ($Key){
        $Params.Key = $Key;
    }

    $Objects = Get-S3Object @Params;

    # Optional filter on a partial name (regular expression match on the key)
    If ($Name){
        $Objects = $Objects | Where-Object Key -Match $Name;
    }

    [System.Collections.ArrayList] $S3CustomObjects = @();

    ForEach ($Object in $Objects) {
        # Restore status and lifecycle information are only available in the metadata
        $Metadata = Get-S3ObjectMetadata -BucketName $Object.BucketName -Key $Object.Key;
        $S3CustomObj = [PSCustomObject]@{
            BucketName = $Object.BucketName;
            Key = $Object.Key;                      # needed later to request the restore
            StorageClass = "$($Object.StorageClass)";
            LastModified = $Object.LastModified;
            SizeInB = $Object.Size;
            RestoreExpirationUtc = $Metadata.RestoreExpiration;
            RestoreInProgress = $Metadata.RestoreInProgress;
            ExpirationRule = $Metadata.Expiration.RuleId;
            ExpiryDateUtc = $Metadata.Expiration.ExpiryDateUtc;
        };
        $Null = $S3CustomObjects.Add($S3CustomObj);
    }

    return $S3CustomObjects;
}

2- Restore the S3 Object

Once you have selected your objects, you have to create a request to make them accessible. Indeed, in Glacier, objects are not accessible until a restore request has completed: they are archived (as if “frozen”).
For Glacier, there are 3 archive retrieval options:

  • Expedited: 1-5 minutes for the highest cost
  • Standard: 3-5 hours for a lower cost
  • Bulk: 5-12 hours for the lowest cost

So after this request you will have to wait, depending on the archive retrieval option you chose.
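To plan that wait in a script, the durations listed above can be kept in a small lookup. This is only a sketch: the function name Get-GlacierMaxRetrievalTime is hypothetical, and the values are simply the upper bounds of the ranges given above for Glacier.

```powershell
# Hypothetical helper: upper bound of the retrieval window for each tier,
# using the durations listed above for S3 Glacier.
function Get-GlacierMaxRetrievalTime {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateSet('Expedited', 'Standard', 'Bulk')]
        [string] $Tier
    )

    switch ($Tier) {
        'Expedited' { [TimeSpan]::FromMinutes(5) }
        'Standard'  { [TimeSpan]::FromHours(5) }
        'Bulk'      { [TimeSpan]::FromHours(12) }
    }
}
```

For example, (Get-Date) + (Get-GlacierMaxRetrievalTime -Tier Bulk) gives a rough time after which the object should be available.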

This request is made with the cmdlet Restore-S3Object.
Here is an example of usage:

# CopyLifetimeInDays is the number of days the object remains accessible before it is frozen again
Restore-S3Object -BucketName $element.BucketName -Key $element.Key -CopyLifetimeInDays $CopyLifetimeInDays -Tier $TierType

Building on our previous custom function, Get-dbiS3Object, we can write a second custom function to simplify the process:

Function Restore-dbiS3Object {
    param(
        $CustomS3Objects,
        [String]
        $Key,
        [String]
        $KeyPrefix,
        [String]
        $BucketName,
        [Amazon.S3.GlacierJobTier]
        $Tier = 'Bulk',            # Default archive retrieval option if nothing specified
        [int]
        $CopyLifetimeInDays = 5    # Default number of days if nothing specified
    )

    If ($CustomS3Objects){
        @($CustomS3Objects) | ForEach-Object -Process {
            # Request a restore only for non-empty Glacier objects that are neither restored nor being restored.
            # The comparisons also work when the properties were stored as strings.
            $NotRestored   = -not $_.RestoreExpirationUtc;
            $NotInProgress = "$($_.RestoreInProgress)" -ne 'True';
            If ($NotRestored -and $NotInProgress -and ($_.StorageClass -eq 'Glacier') -and ([long]$_.SizeInB -gt 0)) {
                Restore-S3Object -BucketName $_.BucketName -Key $_.Key -CopyLifetimeInDays $CopyLifetimeInDays -Tier $Tier;
            }
        }
    }
    elseif ($Key -and $BucketName){
        $Objects = Get-dbiS3Object -BucketName $BucketName -Key $Key;
        Restore-dbiS3Object -CustomS3Objects $Objects -Tier $Tier -CopyLifetimeInDays $CopyLifetimeInDays;
    }
    elseif ($KeyPrefix -and $BucketName){
        $Objects = Get-dbiS3Object -BucketName $BucketName -KeyPrefix $KeyPrefix;
        Restore-dbiS3Object -CustomS3Objects $Objects -Tier $Tier -CopyLifetimeInDays $CopyLifetimeInDays;
    }
}

To check whether the retrieval is finished and the object is available for download, use the custom function Get-dbiS3Object: the restore is complete once RestoreInProgress is no longer true and RestoreExpirationUtc is set.
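If you prefer to wait in a script rather than check by hand, the idea can be sketched as a simple polling loop. The function name Wait-S3ObjectRestore and its defaults (polling interval, number of attempts) are assumptions chosen for illustration; it calls Get-S3ObjectMetadata directly so that it does not depend on the custom functions above.

```powershell
# Hypothetical helper: polls the object metadata until the restored copy is available.
# Returns $true when the restore is complete, $false if it gave up.
function Wait-S3ObjectRestore {
    param(
        [Parameter(Mandatory = $true)][string] $BucketName,
        [Parameter(Mandatory = $true)][string] $Key,
        [int] $PollIntervalSeconds = 300,   # assumption: check every 5 minutes
        [int] $MaxAttempts = 200            # assumption: give up after this many checks
    )

    for ($Attempt = 1; $Attempt -le $MaxAttempts; $Attempt++) {
        $Metadata = Get-S3ObjectMetadata -BucketName $BucketName -Key $Key

        # Restore finished: no request in progress and an expiration date set on the restored copy
        if ("$($Metadata.RestoreInProgress)" -ne 'True' -and $Metadata.RestoreExpiration) {
            return $true
        }

        # No need to sleep after the last check
        if ($Attempt -lt $MaxAttempts) {
            Start-Sleep -Seconds $PollIntervalSeconds
        }
    }

    return $false
}
```

Once it returns $true, the object can be downloaded, for example with Read-S3Object -BucketName $BucketName -Key $Key -File $LocalPath (where $LocalPath is a placeholder). Note that if no restore was ever requested, the loop simply runs until it gives up.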

Of course, these 2 custom functions can be improved and customized differently. The goal of this blog is mostly to introduce the potential of this PowerShell module, and to give examples of integrating it into a custom PowerShell module to make daily life easier 🙂