Veeam Elasticsearch Sizing Guidance

Introduction

This article provides sizing guidance on how much data a Veeam Backup and Replication (VBR) server or a Veeam M365 server can store in a Swarm domain.

Swarm 16.1.4 and Gateway 8.1.0 introduced a new feature called “Index per Domain”: the Elasticsearch database now stores the metadata for each domain in a separate index. Enable it by setting search.perDomainIndex=True before creating a Search Feed.

Data in Elasticsearch is organized into indices. Each index is made up of one or more shards. Each shard is an instance of a Lucene index, which you can think of as a self-contained search engine that indexes and handles queries for a subset of the data in an Elasticsearch cluster.

There are some sizing considerations an administrator needs to review first.

Elastic recommends the following regarding shards in their official documentation:

  • Keep the size of every shard between 20GB and 50GB to achieve optimal performance.

  • The number of shards should be a multiple of the number of nodes (with the data role) in the cluster, so that data spreads equally across nodes.

  • When changing the number of shards of an existing index, the new count should be a multiple (when increasing) or a factor (when decreasing) of the number of shards the index had before.

  • 20 shards per GB of available heap gives optimal usage of heap and RAM.

  • 30.5GB of heap (i.e. 61GB RAM) is the optimal resource allocation per data node.

A node can therefore hold a maximum of 30.5GB heap * 20 shards per GB of heap, which gives us about 600 shards.
Note: Elastic sets the default limit of max shards per node to 1000, but it is recommended to stay under 600 for optimal performance.
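The per-node shard budget above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function and constant names are hypothetical and not part of Swarm or Elasticsearch.

```python
HEAP_GB_PER_NODE = 30.5      # heap per data node, from the guidance above
SHARDS_PER_GB_HEAP = 20      # rule of thumb quoted above
DEFAULT_SHARD_LIMIT = 1000   # Elastic's default max shards per node, quoted above


def shard_budget(heap_gb: float = HEAP_GB_PER_NODE) -> int:
    """Return the recommended maximum number of shards for one data node."""
    return int(heap_gb * SHARDS_PER_GB_HEAP)


def spreads_evenly(shard_count: int, data_nodes: int) -> bool:
    """True when the shards can be distributed equally across the data nodes."""
    return shard_count % data_nodes == 0


print(shard_budget())         # 610, which this article rounds down to about 600
print(spreads_evenly(10, 5))  # True: 10 shards spread evenly over 5 data nodes
```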

Example capacity planning

Let’s assume we have an Elasticsearch cluster with 5 data nodes and 5 shards per index.

Veeam Backup and Replication uses long path names for its objects:

Example: Veeam/Backup/MyCustomFolderName/Clients/{cfc1ab5f-6188-428c-9af8-7eac88d4d38d}/dfa8c857-1336-4aa1-8a91-20281bc5048f/CloudStg/Data/{45b88683-1c15-4149-ad43-b3d47331834d}/{57b53906-43b8-46d2-be9d-ff9c2bb8306e}/449485_77f4217c4a064129e8f55d708d48027d_0a2448d71034f0e9b931ed5fadb0e51c

This is roughly 275 characters on average (depending on the custom folder name you chose when creating the backup repository).

On average we have measured that the metadata for an object written by VBR v12 consumes about 1.1KB of disk space on the Elasticsearch datastore.

One shard can therefore hold roughly 50GB / 1.1KB ≈ 46.5 million documents.

The total number of shards planned at the start will be 10 (5 primaries + 5 replicas).

So one index (per domain) can hold about 10 shards * 46.5 million documents per shard = 465 million documents (including the replicas).

Because every document is stored twice (once on a primary shard and once on its replica), a maximum of about 232 million objects per domain is recommended for optimal use, after which performance will start to degrade.
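The calculation above can be reproduced with a short Python sketch. The figure of 46.5 million documents per shard is taken from this article as an input, and one replica per primary is read from the "5 primaries + 5 replicas" layout; the function name is hypothetical.

```python
DOCS_PER_SHARD = 46.5e6    # this article's estimate for a 50GB shard
PRIMARY_SHARDS = 5         # shards per index in the example cluster
REPLICAS_PER_PRIMARY = 1   # 5 primaries + 5 replicas


def index_capacity(docs_per_shard, primaries, replicas_per_primary):
    """Return (total shards, documents incl. replicas, unique objects)."""
    total_shards = primaries * (1 + replicas_per_primary)
    total_docs = total_shards * docs_per_shard
    # Replicas are copies, so only the primaries count as unique objects.
    unique_objects = primaries * docs_per_shard
    return total_shards, total_docs, unique_objects


shards, docs, objects = index_capacity(
    DOCS_PER_SHARD, PRIMARY_SHARDS, REPLICAS_PER_PRIMARY
)
print(shards)          # 10
print(docs / 1e6)      # 465.0 million documents including replicas
print(objects / 1e6)   # 232.5 million unique objects
```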

We recommend not exceeding 200 million objects per domain.

If you are using the default storage optimization (block size) of 1MB, then in the best-case scenario, where Veeam writes data into perfectly sized 1MB objects, you will be able to write 200TB to the domain (200 million objects * 1MB). This includes the storage footprint of immutability, which depends on your chosen backup retention settings.
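As a rough illustration, the best-case capacity per domain can be computed for different block sizes. The sketch below assumes every object is exactly one full block and uses decimal units (1TB = 1,000,000MB); the function name is hypothetical, and the 4MB and 8MB values are the larger block sizes recommended later in this article.

```python
MAX_OBJECTS_PER_DOMAIN = 200_000_000   # recommended ceiling from this article


def domain_capacity_tb(block_size_mb: float,
                       max_objects: int = MAX_OBJECTS_PER_DOMAIN) -> float:
    """Best-case capacity in TB, assuming each object is one full block."""
    return max_objects * block_size_mb / 1_000_000   # 1TB = 1,000,000MB


for block_mb in (1, 4, 8):
    print(f"{block_mb}MB blocks: up to {domain_capacity_tb(block_mb):,.0f}TB")
```

Real backups rarely fill every block completely, so treat these figures as upper bounds.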

How can you monitor the size of your shards?

Using curl via a shell:

curl -s "http://ESIP:9200/_cat/shards/index_*?h=index,shard,d,prirep,sto,ip&v"

Example output:

index                      shard d       prirep sto   ip
index_swarm.sollab.local1  6     3833537 r      3gb   172.29.10.21
index_swarm.sollab.local1  6     3833537 p      3gb   172.29.10.23
index_swarm.sollab.local1  5     3830333 r      3gb   172.29.10.22
index_swarm.sollab.local1  5     3830333 p      3gb   172.29.10.23
index_swarm.sollab.local1  2     3828835 r      3.1gb 172.29.10.21
index_swarm.sollab.local1  2     3828835 p      3.1gb 172.29.10.22
index_swarm.sollab.local1  1     3832547 r      3.1gb 172.29.10.21
index_swarm.sollab.local1  1     3832547 p      3gb   172.29.10.20
etc...

The column “sto” shows the datastore disk space usage for the shard.
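If you prefer to automate this check, the following Python sketch flags shards above the 50GB limit. It assumes the same _cat/shards endpoint is queried with format=json and bytes=b, so that the "store" column arrives as a plain byte count; the function names and the sample rows are hypothetical, and ESIP is the same placeholder as in the curl command.

```python
import json
from urllib.request import urlopen

SHARD_LIMIT_BYTES = 50 * 1024**3   # 50GB upper bound from the guidance above


def fetch_shards(es_url):
    """Fetch shard rows as JSON, e.g. fetch_shards("http://ESIP:9200")."""
    url = (f"{es_url}/_cat/shards/index_*"
           "?format=json&bytes=b&h=index,shard,prirep,docs,store,ip")
    with urlopen(url) as response:
        return json.load(response)


def oversized_shards(rows, limit_bytes=SHARD_LIMIT_BYTES):
    """Return (index, shard, prirep, size in bytes) for shards over the limit."""
    flagged = []
    for row in rows:
        store = row.get("store")
        if store is None:   # unassigned shards report no store size
            continue
        size = int(store)
        if size > limit_bytes:
            flagged.append((row["index"], row["shard"], row["prirep"], size))
    return flagged


# Offline demonstration with made-up rows in the shape returned by fetch_shards().
sample_rows = [
    {"index": "index_example1", "shard": "0", "prirep": "p",
     "store": str(3 * 1024**3)},
    {"index": "index_example2", "shard": "1", "prirep": "r",
     "store": str(60 * 1024**3)},
]
for index, shard, prirep, size in oversized_shards(sample_rows):
    print(f"{index} shard {shard} ({prirep}): {size / 1024**3:.1f}GB")
```

Against a live cluster, pass the result of fetch_shards() to oversized_shards() instead of the sample rows.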

Using Grafana:

Our dashboard “Swarm Search v8.1” has a panel showing this metric.

What can I do if my shard exceeds 50GB?

You have 3 options:

Conclusion

In the example above we had 5 data nodes. At 600 shards per node the cluster can hold 5 * 600 = 3,000 shards, which is 300 indices of 10 shards each. This means Swarm can support up to 250 domains (some indices are needed for csmetrics, kibana, etc.).
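The same estimate can be sketched for other cluster sizes. The reserve of 50 indices below is an assumption inferred from the figure of 250 domains in this article, not a documented Swarm value, and the function name is hypothetical.

```python
SHARDS_PER_NODE = 600    # recommended per-node ceiling from this article
SHARDS_PER_INDEX = 10    # 5 primaries + 5 replicas per domain index
RESERVED_INDICES = 50    # assumption: headroom for csmetrics, kibana, etc.


def max_indices(data_nodes: int,
                shards_per_node: int = SHARDS_PER_NODE,
                shards_per_index: int = SHARDS_PER_INDEX) -> int:
    """Number of indices the cluster can hold within the shard budget."""
    return data_nodes * shards_per_node // shards_per_index


total = max_indices(5)
print(total)                      # 300 indices in the 5-node example
print(total - RESERVED_INDICES)   # 250 left for domains under this assumption
```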

If you require more domains than your existing Elasticsearch cluster can support, you will need to grow the cluster, specifically by adding nodes with the “data” role.

If you are projecting to exceed 200TB per domain, you need to either increase the storage optimization (block size) to our recommended size of 4MB to 8MB, or think about how to distribute this workload over multiple domains for optimal performance.

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.