Configuring Elasticsearch
Elasticsearch requires configuration and settings file changes to be made consistently across the Elasticsearch cluster.
Scripted Configuration
Using the provided configuration script automates in-place Elasticsearch upgrades as well as the essential configuration that Elasticsearch requires for use with Swarm.
The script handles the following:
- Upgrading Elasticsearch in place (using the same index) if it detects a supported version (6.8.6+) is already installed and configured
- Editing /etc/elasticsearch/elasticsearch.yml (except for changing the path.data variable to use a different data directory)
- Editing /etc/elasticsearch/log4j2.properties
- Editing /usr/lib/systemd/system/elasticsearch.service
- Editing /etc/sysconfig/elasticsearch
- Creating the override file for systemd: /etc/systemd/system/elasticsearch.service.d/override.conf
| Method | Description |
|---|---|
| Bulk | This method is most efficient for a large number of nodes and/or when there are manual configurations to apply. |
| Non-Bulk | This still requires running the configure script on each node, but do not copy the generated elasticsearch.yml files between the nodes. Note: In step 4, the prompt for the cluster name and the list of nodes must be answered identically on every node. |
Customization
The paths given are relative to the Elasticsearch installation directory, which is assumed to be the working directory.
Caution
Errors in adding and completing these settings can prevent the Elasticsearch service from working properly. If the path.data location is customized from the default, adjust all references to Elasticsearch's path.data location below to reflect the new location.
Elasticsearch Config File
Edit the Elasticsearch config file: /etc/elasticsearch/elasticsearch.yml
| Setting | Description |
|---|---|
| action.auto_create_index: "+csmeter*,+*_nfsconnector,.watches, | Needed to disable automatic index creation while still allowing the csmeter indices and the Swarm NFS connector indices. (v10.1) |
| cluster.name: <ES_cluster_name> | Give the Elasticsearch cluster a unique name, which is unrelated to the Swarm cluster name. Do not use periods in the name. |
| node.name: <ES_node_name> | Optional: Elasticsearch supplies a node name if one is not set. Do not use periods in the name. |
| network.host: _site_ | Assign a specific hostname or IP address, which requires clients to access the ES server using that address. |
| cluster.initial_master_nodes | (ES 7+) For first-time bootstrapping of a production ES cluster. Set to an array or comma-delimited list of the hostnames of the master-eligible ES nodes whose votes should be counted in the very first election. |
| discovery.zen.minimum_master_nodes | (ES 6 only) Set to (number of master-eligible nodes / 2, rounded down) + 1. Prevents split-brain scenarios by setting the minimum number of master-eligible ES nodes that must be online before electing a new master. |
| discovery.seed_hosts | (ES 7+) Enables auto-clustering of ES nodes across hosts. Set to an array or comma-delimited list of the addresses of all master-eligible nodes in the cluster. |
| discovery.zen.ping.unicast.hosts: ["es0", "es1"] | (ES 6 only) Set to the list of node names/IPs in the cluster, verifying all ES servers are included. Multicast is disabled by default. |
| gateway.expected_nodes: 4 | Add and set to the number of nodes in the ES cluster. Recovery of local shards starts as soon as this number of nodes has joined the cluster. If fewer nodes join, recovery falls back to the gateway.recover_after_nodes setting. |
| gateway.recover_after_nodes: 2 | Set to the minimum number of ES nodes that must be started before going into operation status. |
| bootstrap.memory_lock: true | Set to lock the memory on startup to verify Elasticsearch does not swap (swapping leads to poor performance). Verify enough system memory resources are available for all processes running on the server. The RPM installer makes these edits automatically. |
| path.data: <path_to_data_directory> | By default, path.data is /var/lib/elasticsearch. To use a different data directory, set path.data to that directory, or create a symlink to it from the default location. |
| thread_pool.write.queue_size | The size of the queue used for bulk indexing. This variable was called thread_pool.bulk.queue_size in earlier Elasticsearch versions. |
| node.attr.rack | Optional: Tells Elasticsearch not to assign a replica shard to a node running in the same "rack" where the primary shard lives. This allows, for example, a 6-node cluster running with 2 nodes on each of 3 ESXi hosts to survive one of the ESXi hosts being down; the cluster state is then yellow, not red. Set to a rack name or ESXi host identifier. Ideally, set this right after initial configuration, when first starting Elasticsearch. To add it to an existing deployment, all nodes must be restarted before shards are reallocated. To do this without downtime, first turn off shard allocation, restart each node one by one (waiting for it to rejoin the cluster before restarting the next), and then re-enable shard allocation. |
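Putting the settings above together, a minimal elasticsearch.yml for a hypothetical 4-node ES 7 cluster might look like the following sketch. The cluster name and the hostnames es0–es3 are illustrative, and the optional settings (node.name, node.attr.rack, action.auto_create_index) are omitted; adapt the values to the actual deployment.

```yaml
cluster.name: swarm-search            # unique; unrelated to the Swarm cluster name; no periods
network.host: _site_
cluster.initial_master_nodes: ["es0", "es1", "es2", "es3"]   # first bootstrap only (ES 7+)
discovery.seed_hosts: ["es0", "es1", "es2", "es3"]           # all master-eligible nodes
gateway.expected_nodes: 4             # total nodes in the ES cluster
gateway.recover_after_nodes: 2        # minimum nodes before going operational
bootstrap.memory_lock: true           # prevent swapping
path.data: /var/lib/elasticsearch     # default; change only if using a custom data directory
```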
Systemd (RHEL/CentOS)
Create a systemd override file for the Elasticsearch service to set the LimitMEMLOCK property to be unlimited.
1. Create the override file: /etc/systemd/system/elasticsearch.service.d/override.conf
2. Add the content that sets LimitMEMLOCK.
3. Reload the systemd configuration; otherwise, the setting does not take effect until the next reboot.
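The steps above can be sketched as follows, using the standard systemd drop-in syntax for an unlimited memlock limit (the drop-in path matches the one listed earlier):

```
# /etc/systemd/system/elasticsearch.service.d/override.conf
[Service]
LimitMEMLOCK=infinity
```

After creating the file, run `systemctl daemon-reload` (and restart the Elasticsearch service) so the new limit is applied.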
Environment Settings
Edit the environmental settings: /etc/sysconfig/elasticsearch
| Setting | Value |
|---|---|
| | Set to |
| | Set to |
JVM Options
Edit the JVM settings to manage memory and space usage: /etc/elasticsearch/jvm.options
| Setting | Value |
|---|---|
| -Xms | Set to half the available memory, but not more than 31 GB. |
| -Xmx | Set to half the available memory, but not more than 31 GB. |
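For example, assuming the guidance above refers to the JVM heap settings (-Xms and -Xmx, as in a stock Elasticsearch jvm.options), a server with 32 GB of RAM would use values like these; the sizes are illustrative and should match the actual hardware:

```
-Xms16g
-Xmx16g
```

Keep both values identical so the heap is fully allocated at startup.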
GC logs (optional) - Elasticsearch enables GC logs by default. They are configured in jvm.options and written to the same default location as the Elasticsearch logs. The default configuration rotates the logs every 64 MB and can consume up to 2 GB of disk space. Disable these logs by commenting out the GC logging lines in jvm.options until they are needed to troubleshoot memory leaks.
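As a sketch, a stock ES 7 jvm.options expresses GC logging roughly like this (the exact lines vary by Elasticsearch and JDK version; 32 files of 64 MB each account for the ~2 GB figure above). Comment the line out, as shown, to disable it:

```
## JDK 9+ GC logging - comment out to disable (32 files x 64 MB ~= 2 GB)
#-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m
```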
Log Setup
Adjust the configuration file /etc/elasticsearch/log4j2.properties to customize the logging format and behavior.
In the default location, the log directory already has the needed ownership. To move the log directory, choose a separate, dedicated partition of ample size, and make the elasticsearch user the owner of that directory.
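A minimal sketch of the relocation, assuming a hypothetical dedicated partition mounted at /mnt/eslogs (run as root, and point path.logs in elasticsearch.yml at the new directory afterwards):

```shell
NEW_LOG_DIR=/mnt/eslogs    # assumption: dedicated partition mounted here

mkdir -p "$NEW_LOG_DIR"
# Give the elasticsearch service user ownership of the new log directory
chown -R elasticsearch:elasticsearch "$NEW_LOG_DIR"
```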
Deprecation log
This is the log of deprecated actions, recorded to inform future migrations. Adjust the log size and the log file count for the deprecation log in log4j2.properties.
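A sketch of the relevant properties, assuming the stock deprecation appender name from a default Elasticsearch log4j2.properties; the size and count values here are illustrative, not prescribed:

```
appender.deprecation_rolling.policies.size.size = 100MB
appender.deprecation_rolling.strategy.max = 2
```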
Deprecation logging is enabled at the WARN level by default, the level at which all deprecation log messages are emitted. Change the log level to ERROR to avoid accumulating large warning logs.
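A minimal sketch, assuming the stock deprecation logger name from a default Elasticsearch log4j2.properties:

```
logger.deprecation.level = error
```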
© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.