Veeam Backup and Replication Direct to S3 mode Design

Introduction

The purpose of this KB article is to explain how backup jobs work when using “Direct to S3” mode.

DataCore Swarm does not recommend Direct to S3 mode.

Backups that go direct to object storage always use the "forever forward incremental" method, as long as you don't schedule active full backups.

The Forever Forward Incremental backup policy uses API calls to extend immutability for objects used in a weekly GFS restore point.

The blocks of a restore point are split into unique objects (default block size 1 MB; objects are typically ~50% smaller after compression).

Using a larger storage block size provides roughly 3x better write performance but increases the size of the incremental backups by about 2x. Changing the block size on an existing job requires a full backup, and active full backups won't repurpose already offloaded objects, so the change results in a full offload.

Choosing a smaller block size increases the number of objects stored and impacts backup job runtime, since the job must execute more API calls to extend immutability and more delete API calls when objects expire.
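To make the trade-off concrete, here is a back-of-the-envelope Python sketch (an illustration only, not Veeam tooling) that estimates object counts and the per-extension PutObjectRetention call volume for a few block sizes, assuming the ~50% average compression mentioned above:

# Rough estimate of object count and API-call volume per block size.
# Hypothetical illustration; real numbers depend on dedupe/compression.

def estimate(source_gib, block_size_mib, compression=0.5):
    source_mib = source_gib * 1024
    objects = int(source_mib / block_size_mib)       # one object per block
    stored_mib = objects * block_size_mib * compression  # ~50% avg compression
    return objects, stored_mib

for block in (1, 4, 8):                               # block sizes in MiB
    objs, stored = estimate(source_gib=500, block_size_mib=block)
    print(f"{block} MiB blocks: ~{objs:,} objects, "
          f"~{stored / 1024:.0f} GiB stored, "
          f"~{objs:,} PutObjectRetention calls per extension")

At the 1 MB default, a 500 GB source already produces roughly half a million objects, each of which needs its own retention call when immutability is extended.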

Veeam uses the object storage as a database, keeping state in special objects called "owner" and "repository" as well as metadata. See https://helpcenter.veeam.com/docs/backup/vsphere/object_storage_structure.html?ver=120
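As a quick way to see this structure for yourself, the boto3 sketch below lists a sample of the keys Veeam keeps in the bucket; the endpoint, bucket name, and "Veeam/" prefix are hypothetical placeholders, and the authoritative layout is described at the link above:

import boto3

# List a small sample of the objects Veeam stores in the bucket.
# Endpoint, bucket, and prefix are hypothetical placeholders.
s3 = boto3.client("s3", endpoint_url="https://swarm-gateway.example.com")

paginator = s3.get_paginator("list_objects_v2")
pages = paginator.paginate(Bucket="veeam-repo", Prefix="Veeam/",
                           PaginationConfig={"MaxItems": 50})
for page in pages:
    for obj in page.get("Contents", []):
        print(obj["Key"], obj["Size"])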

Some customers add multiple VMs to a single job to benefit from deduplication. This has consequences for job runtime once the merging and block generation rotation phases start to emerge: these phases happen per VM/disk inside a Veeam job, therefore one VM doing a merge or block generation rotation can hold up the entire backup job when VMs are lumped together.

Veeam Phases

[Image: image-20240807-075735.png]

Phase 1: The Honeymoon Phase

During the first week, VBR only writes data to object storage.

The size of the objects depends on the configured storage optimization block size; the default is 1 MB.

The default storage optimization block size is a desired maximum: the final stored size depends on the deduplication and compression ratio, with a typical average compression of 50%.

TL;DR: the first week is a full backup plus incremental backups at the scheduled backup job interval.

 

Phase 2: Incremental backups

The example here is a 50 GB hourly incremental backup; it takes on average 18 minutes to complete. Pay close attention to the total operations per HTTP code.

You can monitor this using our DataCore Swarm Gateway Grafana dashboard.
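If you don't have the dashboard handy, a rough sketch like the following can tally operations per HTTP status code from a gateway access log; the log path and common-log-style format are assumptions, so adapt the regex to your environment:

import collections
import re

# Tally S3 operations per HTTP method and status code from an access
# log. Log path and line format are assumptions; adjust as needed.
pattern = re.compile(r'"(?P<method>[A-Z]+) [^"]*" (?P<status>\d{3})')

counts = collections.Counter()
with open("gateway_access.log") as log:
    for line in log:
        m = pattern.search(line)
        if m:
            counts[(m["method"], m["status"])] += 1

for (method, status), n in counts.most_common():
    print(f"{method} {status}: {n}")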

[Image: image-20240807-080011.png]

Phase 3: Merging oldest incremental backup into full backup

After 7 days, as the backup chain grows beyond the retention window, Veeam must merge the oldest incremental backup into the full backup image before it can delete expired backups.

Merging runs every 24 hours; it happens inside your backup job as well as in a background task called the "Retention Job".

The merge enumerates every object and uses a combination of the following S3 calls (a hedged sketch of this call pattern follows the list):

  • ListObjects

  • GetObject

  • PutObject (Veeam metadata)

  • PutObjectRetention

  • DeleteObject
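
The boto3 sketch below mimics that call sequence in miniature. It illustrates the API pattern only, not Veeam's actual merge logic; the endpoint, bucket, prefix, object keys, and retention date are all hypothetical:

import boto3
from datetime import datetime, timezone, timedelta

# Illustration of the S3 call pattern above, not Veeam's real merge
# logic. Endpoint, bucket, keys, and dates are hypothetical values.
s3 = boto3.client("s3", endpoint_url="https://swarm-gateway.example.com")
bucket = "veeam-repo"

# ListObjects: enumerate the blocks belonging to the backup chain
paginator = s3.get_paginator("list_objects_v2")
keys = [o["Key"]
        for page in paginator.paginate(Bucket=bucket, Prefix="Veeam/Backup/")
        for o in page.get("Contents", [])]

# GetObject: read blocks that the merged full backup must keep
for key in keys:
    s3.get_object(Bucket=bucket, Key=key)["Body"].read()

# PutObject: write updated Veeam metadata describing the merged full
s3.put_object(Bucket=bucket, Key="Veeam/Backup/metadata.blk",
              Body=b"<metadata>")

# PutObjectRetention: extend immutability on the surviving blocks
until = datetime.now(timezone.utc) + timedelta(days=10)
for key in keys:
    s3.put_object_retention(
        Bucket=bucket, Key=key,
        Retention={"Mode": "COMPLIANCE", "RetainUntilDate": until})

# DeleteObject: remove blocks that are no longer referenced
s3.delete_object(Bucket=bucket, Key="Veeam/Backup/expired.blk")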

In my example, merging takes about 1 hour to complete. The console view here shows only 1 VM; my backup job contained 10 VMs, which ran concurrently (maxActiveSnapshots = 10).

Phase 4: Extending Immutability

After 10 days (the block generation period) plus the configured job immutability period, you will see large numbers of PutObjectRetention calls (one for every stored object in the bucket) as Veeam reassigns all blocks to the next block generation.

For more information, see https://www.veeam.com/kb4470

Veeam does this per VM, so if you have a backup job with 100 VMs in it and you allow maxActiveSnapshots of 10, you will see 10 parallel VM backups, each of which exhibits the behavior shown in the chart below.

  • The determining factor in this phase is the cost of the Delete and PutObjectRetention S3 API calls.
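
As a minimal sketch of that per-object cost (endpoint, bucket, and retention window are hypothetical placeholders), the loop below issues one PutObjectRetention call for every object in the bucket, which is exactly why this phase scales with object count:

import boto3
from datetime import datetime, timezone, timedelta

# One PutObjectRetention call per stored object -- the per-object
# pattern that dominates this phase. Endpoint, bucket, and the
# retention window are hypothetical placeholder values.
s3 = boto3.client("s3", endpoint_url="https://swarm-gateway.example.com")
retain_until = datetime.now(timezone.utc) + timedelta(days=17)

calls = 0
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="veeam-repo"):
    for obj in page.get("Contents", []):
        s3.put_object_retention(
            Bucket="veeam-repo",
            Key=obj["Key"],
            Retention={"Mode": "COMPLIANCE", "RetainUntilDate": retain_until},
        )
        calls += 1
print(f"Extended retention on {calls:,} objects")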

 

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.