Configuring Subclusters in Elasticsearch

Two settings are required on each node to assign it to an availability zone:

node.attr.availabilityzone: <zone name>
cluster.routing.allocation.awareness.attributes: availabilityzone

With these settings, Elasticsearch tries to place each replica shard in a different zone from its associated primary shard, so that losing a single zone does not take out every copy of a shard.
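
As a minimal sketch, the elasticsearch.yml fragments for two nodes in different zones might look like the following. The zone names zone-a and zone-b are hypothetical placeholders; substitute whatever identifies your own racks or hosts. The --- line only separates the two files in this example.

```yaml
# elasticsearch.yml on a node in the first zone (zone-a is a placeholder name)
node.attr.availabilityzone: zone-a
cluster.routing.allocation.awareness.attributes: availabilityzone
---
# elasticsearch.yml on a node in the second zone (zone-b is a placeholder name)
node.attr.availabilityzone: zone-b
cluster.routing.allocation.awareness.attributes: availabilityzone
```

Nodes that share a rack or host would carry the same zone value, while the awareness setting is identical on every node.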

The Elasticsearch documentation covers shard allocation and high availability for large clusters in detail. See the following pages:

https://www.elastic.co/guide/en/elasticsearch/reference/7.17/high-availability-cluster-design-large-clusters.html

https://www.elastic.co/guide/en/elasticsearch/reference/7.17/modules-cluster.html#shard-allocation-awareness

The same configuration is described using the node.attr.rack setting in the Configuring Elasticsearch document.
The attribute name itself (rack, rack_id, or availabilityzone) does not matter, as long as cluster.routing.allocation.awareness.attributes: <name> is set to the same name used in node.attr.<name>.
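
For illustration, the same configuration using rack_id as the attribute name could be written as below. The value rack-1 is a hypothetical placeholder; the only point shown is that the name after node.attr. matches the name listed in the awareness setting.

```yaml
# The attribute name is arbitrary, but both settings must use the same one.
node.attr.rack_id: rack-1   # placeholder value
cluster.routing.allocation.awareness.attributes: rack_id
```
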
The zone name uniquely identifies a rack or ESXi host; all Elasticsearch nodes in the same zone must use the same value. After updating elasticsearch.yml, you must perform a rolling restart of all your Elasticsearch nodes (see Rolling Restart of Elasticsearch in the Swarm documentation), waiting for the shard reallocations to complete.
