Using the Elasticsearch API to split large shards

An Elasticsearch index is divided into a set of shards: primary shards and replica (backup) shards. The number of primary shards is configured when the index is created. Elastic recommends keeping shards no larger than about 50GB, so that they are faster to update and to move between nodes when necessary. Elastic also recommends that a node with a 32GB heap store no more than about 600 shards. Although there are typically hundreds of metrics- and csmeter- shards, they should have little effect on performance because they are small and belong to time-based indices.
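As a rough illustration of the 50GB guideline, the sketch below estimates how many primary shards an index of a given size would need. The function name shards_needed and the sizes passed to it are made up for this example; only the 50GB ceiling comes from the recommendation above, and the estimate assumes data is spread evenly across shards.

```shell
# Hypothetical helper: smallest primary shard count that keeps every shard
# at or below max_gb, assuming data is spread evenly across shards.
shards_needed() {
  local total_gb=$1 max_gb=${2:-50}
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (total_gb + max_gb - 1) / max_gb ))
}

shards_needed 900      # 900GB index at the default 50GB ceiling, prints 18
shards_needed 120 50   # prints 3
```

The result is only a lower bound. The count you actually pick must also be one that the _split API accepts for your current number of shards.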

Swarm 12+ allows you to set search.numberOfShards (default is 5) to a larger value like 20 if you know you will have a very large search feed index (e.g. you will be storing a billion objects or a large amount of metadata).

Before Elasticsearch 6 you could not increase the number of shards after an index was created. Now you can use the _split API: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/indices-split-index.html. This is faster than creating a new search feed with the correct number of shards and waiting for it to populate. Note, however, that it requires downtime while you complete these steps and until the new split index is ready. It also requires enough free Elasticsearch disk space to hold a copy of the current index.

Please let us know if you think you need to split your index. Also note that Elasticsearch 6 (EOL since 2022) has an extra requirement: index.number_of_routing_shards must have been set when the index was created before _split can be used (https://www.elastic.co/guide/en/elasticsearch/reference/6.8/indices-split-index.html). Instead, upgrade first to Elasticsearch 7, which has been supported since Swarm 12.

Instructions

Find the search feed index name; this will be the OLD_INDEX used in the commands below:
curl 'http://elasticsearch:9200/_cat/indices?index=index_*'
yellow open index_caringo71-cluster0 _27akdSrQyK0_uo76y3Ofw 5 1 12 0 126.1kb 126.1kb
The 0 suffix is the Swarm search feed id; it might be 1 or 2 in your environment!
OLD_INDEX=index_caringo71-cluster0
Find the name of the alias that Swarm and Gateway use to refer to the index; you'll use it later:
curl "http://elasticsearch:9200/_aliases?index=${OLD_INDEX}&pretty"
{
"index_caringo71-cluster0" : {
"aliases" : {
"caringo71-cluster0" : { }
}
}
}
The alias name should be the index name without "index_".
ALIAS_NAME=caringo71-cluster0
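If you are scripting these steps, the alias name can be derived from the index name instead of being typed by hand. This is a minimal sketch that assumes the "index name without index_" convention described above holds in your environment; the function name alias_for_index is hypothetical, and the result should still be checked against the _aliases output.

```shell
# Hypothetical helper: derive the alias from the index name by stripping
# a leading "index_" prefix. Assumes the naming convention described above.
alias_for_index() {
  local index_name=$1
  echo "${index_name#index_}"
}

OLD_INDEX=index_caringo71-cluster0
ALIAS_NAME=$(alias_for_index "${OLD_INDEX}")
echo "${ALIAS_NAME}"   # prints caringo71-cluster0
```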
Pause the Swarm search feed in the Storage UI, the Swarm console, or the API.
Make the current index read-only:
curl -XPUT -H 'Content-type: application/json' "http://elasticsearch:9200/${OLD_INDEX}/_settings?pretty" --data-binary '
{
"settings": {
"index.blocks.write" : true
}
}
'
Split the index, e.g. from the default 5 shards to 20. The request returns an error if the new shard count is not a multiple of the current number of shards.
curl -XPUT -H 'Content-type: application/json' "http://elasticsearch:9200/${OLD_INDEX}/_split/${OLD_INDEX}_split20?pretty" --data-binary '
{
"settings": {
"index.number_of_shards": 20
}
}
'
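A script could check the "multiple of the current number of shards" rule before sending the split request. The sketch below covers only that rule; is_valid_split_target is a hypothetical name. Elasticsearch applies its own validation as well (for example, limits tied to index.number_of_routing_shards), so passing this check does not guarantee that _split accepts the request.

```shell
# Hypothetical pre-flight check: succeeds only if target is a larger
# multiple of current. Elasticsearch still has the final say on whether
# the split request is accepted.
is_valid_split_target() {
  local current=$1 target=$2
  [ "$target" -gt "$current" ] && [ $(( target % current )) -eq 0 ]
}

if is_valid_split_target 5 20; then echo "5 -> 20: ok"; fi
if ! is_valid_split_target 5 12; then echo "5 -> 12: not a multiple of 5"; fi
```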
Elasticsearch should quickly become yellow again; verify with:
time while ! curl -fsS 'http://elasticsearch:9200/_cluster/health?wait_for_status=yellow&pretty' ; do sleep 5 ; done
Also make sure that all the primary shards are STARTED (they should be, since the cluster is yellow):
curl -fsS "http://elasticsearch:9200/_cat/shards?index=${OLD_INDEX}_split20" | grep -w p
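Reading the grep output by eye is easy to get wrong, so a script might turn this check into a pass/fail test. The sketch below assumes the default _cat/shards column order (index, shard, prirep, state, ...). The function name all_primaries_started and the sample lines are made up for illustration and are not output from a real cluster.

```shell
# Reads _cat/shards output on stdin and fails if any primary shard is not
# STARTED, or if no primary shard lines are seen at all.
all_primaries_started() {
  awk '$3 == "p" { seen = 1; if ($4 != "STARTED") bad = 1 }
       END { exit (seen && !bad) ? 0 : 1 }'
}

# Made-up sample lines in the _cat/shards layout, for illustration only.
sample='index_caringo71-cluster0_split20 0 p STARTED 3 25kb 10.0.0.1 node1
index_caringo71-cluster0_split20 0 r UNASSIGNED
index_caringo71-cluster0_split20 1 p STARTED 2 18kb 10.0.0.1 node1'

if printf '%s\n' "$sample" | all_primaries_started; then
  echo "all primary shards STARTED"
fi

# Against a live cluster, pipe the curl command from this step into it:
#   curl -fsS "http://elasticsearch:9200/_cat/shards?index=${OLD_INDEX}_split20" | all_primaries_started
```

Unassigned replica lines are ignored on purpose, since a yellow cluster is expected to have them.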
Change the alias to point from the old index to the new split index
curl -XPOST -H 'Content-type: application/json' 'http://elasticsearch:9200/_aliases?pretty' --data-binary '
{
"actions": [
{ "remove" : { "index" : "'"${OLD_INDEX}"'", "alias" : "'"${ALIAS_NAME}"'" } },
{ "add" : { "index" : "'"${OLD_INDEX}_split20"'", "alias" : "'"${ALIAS_NAME}"'" } }
]
}
'
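The nested quoting in the request body above is the most error-prone part of this step. One way to sidestep it in a script is to build the body with printf, as in this sketch. The function name alias_swap_body is hypothetical, and it assumes index and alias names contain no characters that need JSON escaping.

```shell
# Hypothetical helper: prints the _aliases request body that removes the
# alias from the old index and adds it to the new one in a single request.
# Assumes the names need no JSON escaping.
alias_swap_body() {
  local alias_name=$1 old_index=$2 new_index=$3
  printf '{"actions":[{"remove":{"index":"%s","alias":"%s"}},{"add":{"index":"%s","alias":"%s"}}]}\n' \
    "$old_index" "$alias_name" "$new_index" "$alias_name"
}

alias_swap_body caringo71-cluster0 index_caringo71-cluster0 index_caringo71-cluster0_split20

# To send it, pass the generated body to the same curl command:
#   curl -XPOST -H 'Content-type: application/json' 'http://elasticsearch:9200/_aliases?pretty' \
#     --data-binary "$(alias_swap_body "${ALIAS_NAME}" "${OLD_INDEX}" "${OLD_INDEX}_split20")"
```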
Verify that the alias now points to the new “split” index. The response should list only the new index:
curl -fsS "http://elasticsearch:9200/_alias/${ALIAS_NAME}?pretty"
Verify again that Elasticsearch is at least yellow and all primary shards are STARTED:
time while ! curl -fsS 'http://elasticsearch:9200/_cluster/health?wait_for_status=yellow&pretty' ; do sleep 5 ; done

curl -fsS "http://elasticsearch:9200/_cat/shards?index=${OLD_INDEX}_split20" | grep -w p
It's now safe to delete the old index
curl -fsS -XDELETE "http://elasticsearch:9200/${OLD_INDEX}"
Now undo the read-only setting (the setting was copied to the new index)
curl -XPUT -H 'Content-type: application/json' "http://elasticsearch:9200/${OLD_INDEX}_split20/_settings?pretty" --data-binary '
{
"settings": {
"index.blocks.write" : null
}
}
'
You can now Resume (Storage UI) the search feed or uncheck Pause (legacy Swarm console). Do not Restart/Refresh!
Verify Swarm indexing and listings are again working:
curl -i 'http://swarm/?domains&format=json'
HTTP/1.1 200 OK
Castor-System-Object-Count: 2
...
curl -i -XPOST --post301 --location-trusted 'http://swarm/?domain=temptestsplit&createDomain'
HTTP/1.1 201 Created
curl -i 'http://swarm/?domains&format=json&name=temptestsplit'
HTTP/1.1 200 OK
...
{"last_modified": "2020-10-01T00:00:21.200000Z", "bytes": 0, "name": "temptestsplit", "hash": "4a49ac2fe229ca9b7a9e6b042e159f04", "written": "2020-10-01T00:00:21.200000Z", "accessed": "2020-10-01T00:00:21.200000Z"}

Yes, all this really needs to be scripted – too many error-prone quotes and curls.
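As a starting point for such a script, here is a minimal sketch of a dry-run wrapper covering only the read-only and split steps. ES_URL, DRY_RUN, es_json, and split_index are all names invented for this example. It does not pause the search feed, wait for yellow status, move the alias, or delete the old index, so treat it as an outline and not as a finished tool.

```shell
# Sketch only: ES_URL, DRY_RUN, and the function names are assumptions.
ES_URL=${ES_URL:-http://elasticsearch:9200}
DRY_RUN=${DRY_RUN:-1}   # default: print the requests instead of sending them

# Send a JSON body to Elasticsearch, or just print the request if DRY_RUN=1.
es_json() {
  local method=$1 path=$2 body=$3
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s %s%s %s\n' "$method" "$ES_URL" "$path" "$body"
  else
    curl -fsS -X"$method" -H 'Content-type: application/json' \
      "$ES_URL$path" --data-binary "$body"
  fi
}

# Make the old index read-only, then split it into the given shard count.
# The search feed must already be paused before this runs.
split_index() {
  local old=$1 shards=$2
  local new="${old}_split${shards}"
  es_json PUT "/$old/_settings" '{"settings":{"index.blocks.write":true}}'
  es_json PUT "/$old/_split/$new" "{\"settings\":{\"index.number_of_shards\":$shards}}"
}

split_index index_caringo71-cluster0 20
```

Defaulting to dry-run means a typo in an index name shows up in the printed request before anything is sent to the cluster.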
