Overview
This guide explains how to set up regular backups of your Elasticsearch index without causing downtime. It uses Elasticsearch's snapshot and restore functionality, and the example stores snapshots on a shared file system. Backing up Elasticsearch indices is most beneficial where the data on Swarm is static, such as Write Once Read Many (WORM) use cases. Even then, a “Refresh Search Index” operation is still required after recovery, so that the restored Elasticsearch index catches up with the current state of the data.
For use cases with frequent updates, such as backup software (Commvault, NetBackup, Veeam, and others), restoring an Elasticsearch index from backup is not well suited. After the index is restored to a specific point in time, the backup client software may not be able to read data written after that point. A “Refresh Search Index” is therefore required to synchronize the Elasticsearch index with the current state of data in the storage cluster. This refresh can take almost as long as creating a new search feed, which is why restoring Elasticsearch from backup offers less benefit for these use cases.
Prerequisites
Elasticsearch cluster running.
Shared file system accessible by all Elasticsearch nodes.
Access to the cluster via curl or a similar HTTP client.
(Optional) Elasticsearch Curator for automating snapshots.
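The first three prerequisites can be checked with a short script before any snapshots are configured. The following is a minimal sketch rather than a definitive procedure: it assumes the cluster is reachable without authentication at http://localhost:9200, and it uses a hypothetical repository name (my_backup) and a hypothetical shared mount point (/mnt/es_backups). It relies on the cluster health API and on registering a repository of type "fs"; registration is expected to fail unless the location is also listed under path.repo in elasticsearch.yml on every node.

```python
import json
import urllib.request

# Assumption: unauthenticated cluster on the default port; adjust as needed.
ES_URL = "http://localhost:9200"


def build_health_request(base_url=ES_URL):
    """Build a GET request for the cluster health API."""
    return urllib.request.Request(f"{base_url}/_cluster/health", method="GET")


def build_repo_request(repo_name, location, base_url=ES_URL):
    """Build a PUT request that registers a shared file system ("fs") repository."""
    body = json.dumps({"type": "fs", "settings": {"location": location}})
    return urllib.request.Request(
        f"{base_url}/_snapshot/{repo_name}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request):
    """Send a request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def main():
    # Prerequisite 1: the cluster is running and reports a health status.
    health = send(build_health_request())
    print("cluster status:", health["status"])

    # Prerequisite 2: the shared file system is usable as a snapshot repository.
    # "my_backup" and "/mnt/es_backups" are hypothetical example values.
    result = send(build_repo_request("my_backup", "/mnt/es_backups"))
    print("repository registered:", result.get("acknowledged"))
```

Calling main() against a live cluster prints the health status and whether the repository registration was acknowledged. The request builders are kept separate from send() so the URLs and payloads can be inspected without contacting the cluster.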
...