
Introduction and Prerequisites

This document describes how to deploy a Swarm cluster using the OVF VM bundle package – Rocky Linux 8 edition.

Both the OVF bundle package and the standalone software are available on the DataCore downloads website.

Note

The process described in this document covers a standard and generic deployment of Swarm, focused on small installations and test environments for Proof-of-Concept/Proof-of-Value purposes.

As every single use case may be different, we recommend working with DataCore partners and DataCore Solutions Architects to address any specific configuration requirements or customization needed.

There are two main sections in this document:

  • Swarm deployment using the OVF VM bundle package.

  • Deploying Swarm from scratch.

This document is based on a traditional deployment of Swarm, where the management and access layer runs virtualized on one or more VMware ESXi hosts, while the storage nodes are physical x86-64 machines that hold the data. See the diagram below.

[Diagram: traditional Swarm deployment – virtualized management and access layer on ESXi, physical Swarm storage nodes on the Backend network]

Swarm Components

The Swarm stack utilizes several components grouped in two different layers:

  • Storage Layer: Comprises the Swarm storage nodes, which hold the information and take care of data protection.

  • Management and Access Layer: As the name implies, this layer provides both the administration of the Swarm cluster as well as access to the storage for users and client applications. No data storage or caching is happening in this layer.  

Below are the software components of the entire Swarm stack, their functions, and count recommendations for durability and availability purposes:

Swarm Storage Nodes

  • Swarm is a purpose-built, on-premises object storage solution. It runs on standard physical x86-64 servers, providing a single pool of resources, supporting billions of objects/files in the same cluster, and extending its capabilities to multiple sites (data replication).

  • Swarm will leverage all hardware resources the node (server where it runs) provides: CPU, RAM, network, and any direct-attached disk drives.

  • Minimum recommended storage nodes count: Four (4).

Platform Server - Swarm Cluster Services (SCS)

  • The SCS software provides Swarm cluster configuration and boot services as well as log aggregation and Swarm version management.

  • The SCS is not in the data path, but it does require access to the same layer 2 network as the Swarm storage nodes.

  • Minimum recommended SCS count is one.

Best Practice

Create a snapshot or clone the VM once its configuration is completed. Only one SCS instance can be online.

Elasticsearch

  • Provides listing and search capabilities based on object name and object metadata.

  • Minimum recommended Elasticsearch VM count for production environments is three.

  • For functional Proof-of-Concepts, one instance should suffice.

Content Gateway

  • The Content Gateway provides S3 and HTTP access as well as a Content Portal (web interface) that users and administrators can leverage to create buckets, upload data, use collections to perform searches (based on metadata), and more. Hence, the Content Gateway is in the data path.

  • Content Gateway also enforces multitenancy features such as user authentication against LDAP, Active Directory, or Single Sign-On (SAML), permissions, quotas, and so on.

  • Minimum recommended Content Gateway count for production environments is two.

Important

As Content Gateway is in the data path, at least two instances should be up and running at all times. A load-balancing mechanism such as an HTTP load balancer is recommended to distribute requests across all Content Gateway instances. Alternatively, DNS round-robin (DNS-RR) can be used.

  • For functional Proof-of-Concepts, one instance should suffice.

Telemetry (Optional)

  • Prometheus integration and Grafana dashboards.

  • The minimum recommended Telemetry count is one, but there can be as many instances as needed.

Load Balancers (Optional)

  • To balance the client load across all the Content Gateway instances, an HTTP load balancer in front of the Content Gateways can be leveraged. This load balancer can be a software solution such as HAProxy, NGINX, or others, or a hardware-based appliance (see the example configuration after the note below).

Note

The DMZ network, load balancers, and public network items are outside the DataCore offering.
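For illustration only, a minimal HAProxy configuration that distributes S3/HTTP requests across two Content Gateway instances could look like the sketch below. <GW1_FRONTEND_IP> and <GW2_FRONTEND_IP> are hypothetical placeholders for the frontend IP addresses of two gateways; adapt ports and health checks to the actual environment.

# Illustrative HAProxy snippet - not part of the DataCore bundle
frontend swarm_s3
    bind *:80
    mode http
    default_backend swarm_gateways

backend swarm_gateways
    mode http
    balance roundrobin
    option httpchk GET /
    server gw1 <GW1_FRONTEND_IP>:80 check
    server gw2 <GW2_FRONTEND_IP>:80 check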

Networking Requirements and Recommendations

Swarm utilizes a dual networking configuration with a Storage (Backend) network and a Service (Frontend) network. As per the diagram above, the Swarm storage nodes are connected only to the Backend network, while the management and access layer components are present on both (dual-homed). Hence, this Backend/storage network must also be configured in VMware ESXi.

The Backend network can simply be a VLAN in the existing switching environment. However, this VLAN/network must be dedicated exclusively to Swarm, and it is usually isolated from the rest of the network environment. In any case, no system outside the Swarm stack should be connected to it.

The switch ports used by the Swarm storage nodes must be in access mode, as the Swarm nodes cannot tag VLAN traffic. Also, ‘port fast’ should be enabled to facilitate the PXE boot process (see below).
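The exact switch commands depend on the vendor. As an illustration only, on a Cisco IOS-style switch the per-port settings for a Swarm storage node could look like the following; the interface name and <BACKEND_VLAN_ID> are hypothetical placeholders.

! Illustrative switch-port settings - adapt to the switch vendor in use
interface GigabitEthernet1/0/10
 switchport mode access
 switchport access vlan <BACKEND_VLAN_ID>
 spanning-tree portfast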

Best Practice

If multicast traffic is allowed on this Backend network, IGMP snooping must be disabled. Multicast is no longer required as of Swarm 15, but enabling it remains a best practice.

The Swarm storage nodes will PXE boot (boot over the network) from the SCS virtual machine, which holds the image of the operating system the nodes will use as well as the cluster configuration. As part of the PXE boot process, the nodes request an IP address via DHCP. The SCS VM acts as the DHCP server on the storage/backend network; no other DHCP server must be present in that network segment.

To maximize availability, network failover (active-backup) configurations are encouraged, for both the Swarm storage and the virtualized management and access layer.

Open Ports Overview 

VM Name              | Network   | Port - Protocol | Service
ALL VMs              | 0.0.0.0   | 22 - TCP        | SSH
SwarmClusterServices | Backend   | 514 - TCP/UDP   | Rsyslog
SwarmClusterServices | Backend   | 69 - UDP        | TFTP
SwarmClusterServices | Backend   | 8095 - TCP      | Platform API
SwarmClusterServices | Backend   | 9000 - TCP      | Netboot
SwarmCloudgateway    | Frontend  | 80 - TCP        | S3
SwarmCloudgateway    | Frontend  | 8090 - TCP      | SCSP
SwarmCloudgateway    | Frontend  | 91 - TCP        | Swarm UI
SwarmCloudgateway    | Backend   | 9100 - TCP      | Prometheus metrics
SwarmCloudgateway    | Backend   | 9095 - TCP      | Node_exporter
SwarmSearch          | Backend   | 9200 - TCP      | Elasticsearch
SwarmSearch          | Backend   | 9300 - TCP      | Elasticsearch VIP
SwarmTelemetry       | 0.0.0.0   | 80 - TCP        | Grafana
SwarmTelemetry       | 127.0.0.1 | 9090 - TCP      | Prometheus
SwarmTelemetry       | 127.0.0.1 | 9093 - TCP      | Alertmanager
SwarmTelemetry       | 127.0.0.1 | 9114 - TCP      | Elasticsearch exporter

Environment Prerequisites

The following table illustrates the requirements for a typical Swarm deployment.

VM              | vCPU | RAM   | System Disk | Data Disk
SCS             | 2    | 4 GB  | 50 GB       | 100 GB
Content Gateway | 4    | 8 GB  | 50 GB       | N/A
Swarm Search    | 4    | 24 GB | 30 GB       | 450 GB
Swarm Telemetry | 1    | 2 GB  | 40 GB       | 50 GB

Note

As each use case may vary, working with DataCore Partners and/or DataCore Solutions Architects to review these requirements is encouraged.

Required

A Swarm license key is required to finish the setup. Contact the DataCore Sales team.

Optionally, the end-user organization can provide a valid SSL certificate to enable HTTPS access.

Site Survey

To configure the Swarm cluster, the following information is required:

Swarm Cluster Name (FQDN)                | <CLUSTER_NAME>
DNS Server(s)                            | <DNS_SERVER_1> <DNS_SERVER_2>
DNS Domain                               | <DNS_DOMAIN>
NTP Server(s)                            | <NTP_SERVER_1> <NTP_SERVER_2>
Storage/Backend Network (VLAN) – CIDR    | <BACKEND_NETWORK>
Service/Frontend Network (VLAN) – CIDR   | <FRONTEND_NETWORK>
Storage/Backend Network (VLAN) IP Range  | <BACKEND_NETMASK>
Service/Frontend Network (VLAN) IP Range | <FRONTEND_NETMASK>
Service/Frontend Network (VLAN) Gateway  | <FRONTEND_GATEWAY>

IP Addresses

Component Name  | Frontend net. IP Address | Backend net. IP Address
SCS             | <SCS_FRONTEND_IP>        | <SCS_BACKEND_IP>
Content Gateway | <GW_FRONTEND_IP>         | <GW_BACKEND_IP>
Elasticsearch   | Optional                 | <ES_BACKEND_IP>
Swarm Telemetry | <TM_FRONTEND_IP>         | <TM_BACKEND_IP>
Swarm Nodes     | N/A                      | Auto-assigned by the SCS VM

Swarm Deployment Using VMware Bundle

The VM bundle comprises OVF packages to be deployed on VMware ESXi 7.0U2 and above. The operating system (Rocky Linux 8.9) and the Swarm software are both pre-installed.

The pre-configured Backend network/VLAN range is 172.29.0.0/16, but it can be changed as desired. Make sure the selected range is not in use by another environment.

The default credentials are: 

  • SSH and console access: root - datacore 

  • Web UIs: admin - datacore  

These are the templates included in the VM bundle Swarm-16.1-ESX-8.0-RL8:

  • SCS - PXE-boot the Swarm storage nodes, support tools 

    • Template: SwarmClusterServices.ovf  

    • Associated disks: datacore-swarm-16.1.0-ESX-RL8-disk1.vmdk, datacore-swarm-16.1.0-ESX-RL8-disk2.vmdk 

  • Swarmsearch (Elasticsearch) - Indexer and search engine 

    • Template: SwarmSearch1.ovf  

    • Associated disks: datacore-swarm-16.1.0-ESX-RL8-disk3.vmdk, datacore-swarm-16.1.0-ESX-RL8-disk4.vmdk 

  • Content Gateway - S3 access, Content Portal 

    • Template: SwarmContentGateway.ovf  

    • Associated disks: datacore-swarm-16.1.0-ESX-RL8-disk5.vmdk 

  • Telemetry (optional component) - Grafana dashboards 

    • Template: SwarmTelemetry.ovf 

    • Associated disks: datacore-swarm-16.1.0-ESX-RL8-disk6.vmdk, datacore-swarm-16.1.0-ESX-RL8-disk7.vmdk

The bundle also includes an OVF template that will deploy all VMs as a vApp:  

datacore-swarm-16.1.0-ESX-RL8.ovf  

Important

As per VMware requirements, vCenter 7 with DRS enabled must be in place to deploy this vApp.

Platform Server – Swarm Cluster Services (SCS)

Preparation Steps

  1. Deploy SCS VM (SwarmClusterServices.ovf) and its associated virtual disks (vmdk).

Note

The operating system (Rocky Linux 8.9) and the Swarm software are pre-installed. The VM has two virtual interfaces: one for the backend network and another for the frontend network. 

  2. Change the IP configuration and verify the connection information for the frontend network (a fully substituted example is shown after this procedure).

    nmcli con mod ens192 ipv4.addresses <SCS_FRONTEND_IP>/<FRONTEND_NETWORK> 
    nmcli con mod ens192 ipv4.gateway <FRONTEND_GATEWAY> 
    nmcli con mod ens192 ipv4.dns <DNS_SERVER_1>,<DNS_SERVER_2> 
    nmcli con mod ens192 ipv4.dns-search <DNS_DOMAIN> 
     
    nmcli con mod ens192 ipv4.method manual 
    nmcli con mod ens192 connection.autoconnect yes 
    
    nmcli con reload 
    nmcli con down ens192 
    nmcli con up ens192 
    
    nmcli device show ens192
  3. Change the IP configuration and verify the connection information for the backend network. 

    nmcli con mod ens224 ipv4.addresses <SCS_BACKEND_IP>/<BACKEND_NETWORK> 
    nmcli con mod ens224 ipv4.method manual 
    nmcli con mod ens224 connection.autoconnect yes 
    
    nmcli con reload 
    nmcli con down ens224 
    nmcli con up ens224 
     
    nmcli device show ens224 
  4. The network configuration can be verified with the command: ip a or nmcli con show.
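For illustration only, the frontend commands from step 2 could look like the following once real values are substituted. The addresses below (192.168.10.0/24 network, gateway 192.168.10.1, DNS servers 192.168.10.53 and 192.168.10.54, domain example.local) are hypothetical examples, not defaults of the bundle.

nmcli con mod ens192 ipv4.addresses 192.168.10.50/24
nmcli con mod ens192 ipv4.gateway 192.168.10.1
nmcli con mod ens192 ipv4.dns 192.168.10.53,192.168.10.54
nmcli con mod ens192 ipv4.dns-search example.local
nmcli con mod ens192 ipv4.method manual
nmcli con mod ens192 connection.autoconnect yes

nmcli con reload
nmcli con down ens192
nmcli con up ens192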

Offline Installation

For offline installation (i.e., when no Internet access is available), follow these steps.  

  1. Edit /etc/hosts and comment out the k8s.gcr.io entry and the docker-repo.tx.caringo.com one.
    The first line should read: 

    <SCS_FRONTEND_IP> www.datacore.com
  2. Set the time zone according to your local clock. 
    timedatectl set-timezone <timezone> 
    hwclock --systohc 

Note

All available time zones can be listed with the command: timedatectl list-timezones

  3. Configure chrony (NTP daemon) to connect to a valid NTP server. 

  4. Edit the file /etc/chrony.conf and add the IP addresses or names of the NTP servers. Remove the default server lines if they are not reachable. 

    server <NTP_SERVER_1> iburst 
    server <NTP_SERVER_2> iburst 
    
    allow <SCS_BACKEND_NETWORK/BACKEND_NETWORK>  

    The following is an example of the allow line: allow 192.168.90.0/24 

  5. Restart the chrony daemon: systemctl restart chronyd  

  6. Verify the clock is in sync with: chronyc tracking

SCS Configuration

Once the auxiliary services of SCS are configured, the SCS setup can take place.

  1. Run the configurator wizard:
    scsctl init wizard -a 

  2. Running step [1/37]: Set site name.
    Type the <CLUSTER_NAME> 

  3. Running step [2/37]: Set the administrative password.
    Type the admin password for the cluster
    Re-enter to confirm 

  4. Running step [4/37]: Choose the Swarm-internal interface.
    Specify the network interface that will be used for internal Swarm operations:
       lo
       ens192
    > ens224

  5. Select ens224, press Enter.

  6. Running step [5/37]: Choose external (client-facing) interface.
    Specify the network interface that will be used for operations OUTSIDE of swarm:
       lo
     > ens192
    ens224

  7. Select ens192, press Enter.

  8. Running step [7/37]: Define Swarm-internal network.
    The internal interface requires a *static* IP address to be defined on it!
    It looks like your internal interface is already configured with an IP address: x.x.x.x/yz
    Do you wish to continue to use this address and netmask? [Y/n]:

  9. Reply Y and hit Enter 

    The provisioning process will commence, and it takes a few minutes to complete. 

    Continue the configuration process running:
    scsctl diagnostics config scan_missing

  10. Missing setting: network_boot/network.dnsServers
    Update this setting as a default at group level.

  11. Press Enter and type the IP addresses of the DNS servers below separated with a blank space:
    network.dnsServers [type: array[str]] (Required: DNS servers to be used):
    <DNS_SERVER_1> <DNS_SERVER_2> 

  12. Missing setting: platform/network.dnsDomain
    Update this setting as a default at group level

  13. Hit Enter and type the DNS domain used
    network.dnsDomain [type: str] (Required: The DNS domain name that will be used.): 
    <DNS_DOMAIN> 

  14. Fix the rsyslog template
    Run: cp -f /root/swarm-platform-fix.conf /etc/rsyslog.d/swarm-platform.conf

  15. In the next step, the Swarm image will be added and configured. 

    Run: scsctl repo component add -f /root/swarm-scs-storage-16.1.0.tgz 

    During this process, the feature “encryption at rest” (EAR) can be configured.

Optional 

This optional functionality encrypts data when it is written to the disks. It typically comes at the cost of a 15-20% performance penalty, as the nodes need processing power to encrypt/decrypt data. 

This guide assumes EAR will be configured. If it is not a requirement, select False on the next step when the wizard asks about disk.encryptNewVolumes configuration. 

Also, the configuration steps will ask whether multicast traffic will be allowed. As it is a best practice to keep multicast enabled, this guide follows that recommendation.

  1. Missing setting: storage/disk.encryptNewVolumes
    Update this setting as a component-wide default

  2. Hit Enter
    disk.encryptNewVolumes [type: bool] (Whether to encrypt new Swarm volumes. Enabling encryptNewVolumes means that any newly-formatted Swarm volume will be encrypted)
    True
    False 

  3. Missing setting: storage/cip.multicastEnabled
    Update this setting as a component-wide default

  4. Press Enter.
    cip.multicastEnabled [type: bool] (Whether multicast should be used for communication within Swarm.)
    > True 
    Press Enter
    Finally, the configuration wizard asks which drives will be used to store data. This guide assumes all drives will be used, as the server should be dedicated exclusively to Swarm. 

  5. Missing setting: storage/disk.volumes
    Update this setting as a component-wide default

  6. Press Enter
    disk.volumes [type: str] (Required: Specifies the volume storage devices for Swarm to use)
    all 

  7. At this stage, the Swarm image is added. The configuration wizard will ask about the cluster name and a description.
    added: storage - 16.1.0 (16.1.0)
    Enter a name for the group (FQDN format encouraged):
    <CLUSTER_NAME>
    Enter a description for the group (purpose, etc.). [OPTIONAL]:

Info 

The above key name and value are just examples. A proper encryption key should be generated. 

SCS needs to know what IP range can be used to PXE boot the Swarm storage nodes in the backend network. To avoid collisions with other Swarm services, reserve several IP addresses at the beginning and/or end of the range so that SCS does not assign those addresses to the nodes. To do this:

  1. Run the below command:

    scsctl init dhcp --dhcp-reserve-lower=50 --dhcp-reserve-upper=10

Adjust the values to whatever makes sense for the backend network. For example, in a /24 network, the above command will use .51 through .244 to PXE boot and assign IP addresses to the Swarm storage nodes. 

  2. If the physical servers have SSD/NVMe or other smaller drives that are not required for Swarm, they can be excluded by running:

    scsctl storage config set -d "disk.minGB=4096"

As an example, the above command will exclude any drive that is smaller than 4TB.

  3. Unzip and add the license key. The key should be a plain-text file:

    scsctl license add -f license.txt 
  4. Override the swarm-platform.conf config file:

cp /root/swarm-platform-fix.conf /etc/rsyslog.d/swarm-platform.conf
  5. It is recommended to enable Swarm node stats for the Telemetry VM (Prometheus/Grafana). To do this, run:

scsctl storage config set -d "metrics.enableNodeExporter=true" 
scsctl storage config set -d "metrics.nodeExporterFrequency=120" 
  6. If the Swarm storage nodes use an Intel Skylake-based CPU or similar, run the following:

scsctl network_boot config set kernel.extraArgs=clocksource.max_cswd_read_retries=50 -d  
systemctl restart swarm-platform 

For more information, see Intel Skylake/Cascade Lake CPU Performance Issue 

  7. Finally, create a backup of the SCS configuration. Run:

scsctl backup -o backup-config-<date> 

At this point, SCS has been configured and it is ready to PXE boot Swarm storage nodes.

Elasticsearch

Preparation Steps

Before Deployment

Be aware that there are optional steps involved in the deployment of Elasticsearch. First, it is not mandatory to connect the virtual network card assigned to the frontend network, as Elasticsearch only communicates with the Swarm storage nodes sitting on the backend network. Make sure the Elasticsearch VM(s) can reach the NTP server(s); if only the backend interface is configured, a network gateway may be needed to reach the NTP servers, and the SCS VM can act as that gateway.

The other consideration is the number of Elasticsearch VMs to deploy. As stated in the first chapter of this guide, a single Elasticsearch VM is usually sufficient for testing and Proof-of-Concept scenarios. However, for production environments, at least three Elasticsearch VMs must be deployed.

Important

Before doing this, deploy the Swarm Search VM template (SwarmSearch1.ovf).

  1. Deploy the SwarmSearch VM (SwarmSearch1.ovf) and its associated virtual disks (vmdk). Note that the virtual network interfaces are inverted: the first one corresponds to the backend network, while the second interface is connected to the frontend network. 
    The preconfigured IP address for the backend network is 172.29.1.20/16.

    Below are the steps to change it, if required:

Note 

The operating system (Rocky Linux 8.9) and the Swarm software are pre-installed. The VM has two virtual interfaces: one for the backend network and another for the frontend network. The latter is disconnected by default, as it is not strictly required. 

Verify that the first virtual network card of the VM is connected to the backend network. 

  2. Change the IP configuration and verify the connection information for the backend network. 

    nmcli con mod ens192 ipv4.addresses <ES_BACKEND_IP>/<BACKEND_NETWORK> 
    nmcli con mod ens192 ipv4.gateway <SCS_BACKEND_IP>
    nmcli con reload 
    nmcli con down ens192 
    nmcli con up ens192 
    nmcli device show ens192

The network configuration can be verified with the command: ip a or nmcli con show.

  3. Set the time zone according to your local clock:

    timedatectl set-timezone <timezone>
    hwclock --systohc 

Info 

All available time zones can be listed with the command: timedatectl list-timezones 

  4. Disable swapping by executing the following commands:

    echo "vm.swappiness=1" > /etc/sysctl.d/51-swappiness.conf
    sysctl -w vm.swappiness=1
  5. Edit the file /etc/chrony.conf and add the IP addresses or names of the NTP servers. Remove the default server lines if they are not reachable.

    server <NTP_SERVER_1> iburst 
    server <NTP_SERVER_2> iburst 

    Ensure that the NTP servers are reachable.

  6. Restart the chrony daemon.

    systemctl restart chronyd 
  7. Verify the clock is in sync.

    chronyc tracking
  8. Edit /etc/elasticsearch/elasticsearch.yml and replace 172.29.1.20 with the IP address configured earlier for this VM in the following settings: 

    network.host: <ES_BACKEND_IP> 
    discovery.seed_hosts: ["<ES_BACKEND_IP>"] 
    cluster.initial_master_nodes: ["<ES_BACKEND_IP>"]
  9. Restart the service.

    systemctl restart elasticsearch 
  10. Verify it is up and running.

    curl -XGET "http://<ES_BACKEND_IP>:9200/_cat/health?v"

    The response should be "green" or "yellow". 

Optional 

To assign an IP address on the Frontend network/VLAN, change the IP configuration and verify the connection information for the frontend network. 

nmcli con modify ens224 ipv4.addresses <ES_FRONTEND_IP>/<FRONTEND_NETWORK> 
nmcli con modify ens224 ipv4.gateway <FRONTEND_GATEWAY> 
nmcli con modify ens224 ipv4.dns <DNS_SERVER_1>,<DNS_SERVER_2> 
nmcli con modify ens224 ipv4.dns-search <DNS_DOMAIN> 
nmcli con reload 
nmcli con down ens224 
nmcli con up ens224 
nmcli device show ens224 

Edit the properties of the VM and verify that there is a check mark on “connect” for the virtual interface assigned to the frontend network. 

With the above steps, only one Elasticsearch VM is provisioned. The status will appear as "yellow" as soon as there is any data in Elasticsearch, since there is no redundancy.

This configuration is enough for Proof-of-Concept or Proof-of-Value scenarios. However, for production environments, the recommendation is to have at least three Elasticsearch VMs up and running forming a cluster by themselves. 

The steps to deploy a full Elasticsearch cluster are explained below:

Important

This is not mandatory and depends on infrastructure, desired architecture, and objectives.

  1. Deploy the SwarmSearch1.ovf template two more times.

  2. Update the static IP address for the backend adapter ens192.

  3. Update /etc/hostname of the two new VMs, e.g., “swarmsearch2”, “swarmsearch3”.

  4. Stop the elasticsearch service and delete the pre-generated data:

    systemctl stop elasticsearch 
    cd /var/lib/elasticsearch 
    rm -rf nodes 
  5. Delete the predefined elasticsearch.yml config file and run the configurator wizard:  

    rm /etc/elasticsearch/elasticsearch.yml
    /usr/share/caringo-elasticsearch-search/bin/configure_elasticsearch_with_swarm_search.py

Some details are required to complete the configuration. Note that <ES_NODE_NAME> will be different for each of the Elasticsearch nodes; for example, swarmsearch1 for the first one, swarmsearch2 for the second one, and so on.

  1. Enter Elasticsearch cluster name [A string]: swarmsearch

  2. Enter List of all the Elasticsearch server names in the cluster [Comma-separated list of DNS-resolvable names or IP addresses]: <ES_NODE1_BACKEND_IP>,<ES_NODE2_BACKEND_IP>,<ES_NODE3_BACKEND_IP> 

  3. Enter this Elasticsearch node's name [A string name from the list entered above]: <ES_NODE_NAME> 

Repeat these steps for every Elasticsearch VM, including the original (first deployed) one. For more information, see Configuring Elasticsearch.  

Once all Elasticsearch VMs are configured, restart the elasticsearch service on all of them:

systemctl restart elasticsearch 

Finally, to check the health of the Elasticsearch cluster, run:

curl -XGET "http://<ES_BACKEND_IP>:9200/_cat/health?v" 

Three nodes should appear under the “node.total” column, and the status should be “green”.

Swarm Storage Nodes

  1. Before starting the PXE boot process, enter the BIOS of each server that will be a Swarm storage node and check: 

    1. The HBA/disk controller is configured in passthrough mode. Essentially, this is a non-RAID configuration where all the disk drives are presented to the operating system individually. It is also called IT mode, HBA mode, pass-through, or non-RAID.

    2. The network card port connected to the Backend VLAN/network must be enabled for PXE booting; no other port should be PXE-boot enabled. Moreover, no other port should be connected to any other network, with the exception of the dedicated port for out-of-band management (OOB, IPMI, BMC…). 

  2. Once these are verified, the PXE boot process can begin.

  3. Start with a single node, making sure it boots properly. 

  4. Continue with the rest. A successful Swarm storage node boot looks like this on the screen / IPMI console of the server: 

    [Screenshot: Swarm storage node console]
  5. Swarm version, IP address of the node, and “Storage Processes: RUNNING” should appear on the screen.

Content Gateway

The final step to obtain a functional Swarm cluster is to get the Content Gateway up and running. 

  1. Deploy SwarmContentGateway.ovf. The IP addresses must be configured next.

  2. Change the IP configuration and verify the connection information for the frontend network. 

    nmcli con mod ens192 ipv4.addresses <GW_FRONTEND_IP>/<FRONTEND_NETWORK> 
    nmcli con mod ens192 ipv4.gateway <FRONTEND_GATEWAY> 
    nmcli con mod ens192 ipv4.dns <DNS_SERVER_1>,<DNS_SERVER_2> 
    nmcli con mod ens192 ipv4.dns-search <DNS_DOMAIN> 
    nmcli con mod ens192 ipv4.method manual 
    nmcli con mod ens192 connection.autoconnect yes 
    
    nmcli con reload 
    nmcli con down ens192 
    nmcli con up ens192
    
    nmcli device show ens192
  3. Change the IP configuration and verify the connection information for the backend network. 

    nmcli con mod ens224 ipv4.addresses <GW_BACKEND_IP>/<BACKEND_NETWORK> 
    nmcli con mod ens224 ipv4.method manual 
    nmcli con mod ens224 connection.autoconnect yes 
    nmcli con reload 
    nmcli con down ens224 
    nmcli con up ens224 
     
    nmcli device show ens224
  4. The network configuration can be verified with the command: ip a or nmcli con show 

  5. Set the time zone according to your local clock. 

    timedatectl set-timezone <timezone> 
    hwclock --systohc

Note

All available time zones can be listed with the command: timedatectl list-timezones

  6. Configure chrony (NTP daemon) to connect to a valid NTP server. 

  7. Edit the file /etc/chrony.conf and add the IP addresses or names of the NTP servers. Remove the default server lines. 

    server <NTP_SERVER_1> iburst 
    server <NTP_SERVER_2> iburst
  8. Restart the chrony daemon.

    systemctl restart chronyd  
  9. Verify the clock is in sync.

    chronyc tracking 

The Content Gateway configuration comes next. 

Important

This guide assumes local users on the Content Gateway will be used; by default there is only one "admin" user that has access to the entire system. It is also possible to integrate Content Gateway with an LDAP server or Active Directory. For further information, we recommend working with one of DataCore's Solutions Architects or referring to the examples available in the Content Gateway configuration directory: /etc/caringo/cloudgateway/examples

To proceed with the Content Gateway configuration:

  1. Edit /etc/caringo/cloudgateway/gateway.cfg

    adminDomain = admin.<CLUSTER_NAME>
    hosts = <SWARM_NODE1_IP> <SWARM_NODE2_IP> <SWARM_NODE3_IP> <SWARM_NODE4_IP>
    indexerHosts = <ES_BACKEND_IP> 
    
    managementPassword = <CLUSTER_MGMT_PASSWORD>

Note

If more than one Elasticsearch VM is configured, include the IP addresses of all of those VMs in indexerHosts, separated by a blank space.
The management password was defined during the SCS setup.

  2. Metering and quotas can be enabled (optional).
    For more information, see Content Metering and Setting Quotas.

    [metering]
    enabled = true

    By default, metering is false.

    [quota]
    enabled = true

    By default, the quota is false.

  3. Run the below commands.

    /opt/caringo/cloudgateway/bin/initgateway 
    systemctl enable cloudgateway
    systemctl start cloudgateway
    systemctl status cloudgateway

Tip

Verify that the cloudgateway service is active (running).

Content Gateway should be up and running now.
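As an optional, illustrative sanity check, the Gateway ports listed in the Open Ports Overview should now respond to HTTP requests from a machine on the frontend network (exact responses vary by configuration):

curl -I http://<GW_FRONTEND_IP>        # S3 endpoint (port 80)
curl -I http://<GW_FRONTEND_IP>:91     # Swarm UI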

As the final step, let’s configure the desired default protection scheme and connect Swarm to Elasticsearch.

  1. Open a web browser and go to: http://<GW_FRONTEND_IP>:91/_admin/storage

  2. Click Storage Management.

  3. Click Cluster and then Feeds.

  4. On the top right corner, click +Add and select Search Metadata feed

Note

The feed name is just a description; something like "indexer" or "elasticsearch" should work.

  5. In "Server host(s) or IP(s)", type the IP addresses of all the Elasticsearch VMs that are up and running, separated by a blank space: <ES_BACKEND_IP>

  6. Click Save. The Swarm nodes are now connected to Elasticsearch; every time a new object/file is uploaded to the cluster, its metadata will also be copied to Elasticsearch for search and listing purposes. 

  7. Due to a bug in Gateway 8.0.4, restart cloudgateway so it sees the new search feed.

    systemctl restart cloudgateway
  8. To finalize the setup, set the default protection scheme. Features like lifecycle policies and versioning can also be enabled, if desired.
    For more information about these features, see Object Versioning and Bucket Lifecycle Policy.

  9. Versioning is required to enable "S3 object locking" (immutability).
    For more information, see SCSP Object Locking.

  10. Click Settings and select Cluster.
    In the "Policy" section, change the protection scheme as desired; for example, with 4 Swarm storage nodes:

    policy.eCEncoding 4:2 
    policy.eCMinStreamSize 1Mb 
    policy.lifecycle enabled 
    policy.replicas min:3 max:16 default:3 
    policy.versioning allowed 
  11. Click Save at the top right corner.

  12. Finally, test uploads and downloads using the provided Content Portal.

  13. Open a web browser and go to http://<GW_FRONTEND_IP>/_admin/portal  

  14. Click System Tenant at the upper right corner and click +Add.

Best Practice

Create a storage domain (endpoint) that matches the name of the cluster.

To create a bucket:

  1. Click the domain that you just created.

  2. Click +Add this time selecting “Bucket”. Provide a name such as “bucket1” or “test1”.

  3. Click the bucket you just created and click +Add or drop files.

  4. Select some files of various sizes (from KBs to MBs) on the client machine and upload them.

  5. Click the bucket name at the top. 

Note

The files uploaded should be displayed. If a video or image file has been uploaded, a preview should appear on the right panel.

Swarm utilizes FQDNs to identify which storage domain (endpoint) the client is connecting to. Hence, create DNS entries according to the Storage Domains used in the environment. 

At this point Swarm is up and running and its basic functionality has been verified.

Create an S3 Key Pair (Optional)

To access the storage layer using the S3 protocol, an S3 key pair must be created.  

It comprises the S3 access key and the S3 secret key.

  1. Open a web browser and go to http://<GW_FRONTEND_IP>/_admin/portal

  2. Click the desired domain (endpoint), not the admin one.

  3. Click the cog/wheel in the top right corner and select Tokens. There will be a +Add button, again in the top right corner.

  4. Provide a description, an expiration date, and click the checkmark by “S3 secret key”.

Important

The system allows customizing the secret key if desired. If that is not required, use the randomly generated string.

  5. Upon clicking Add, a green message appears with all the information needed.

Note

The Token ID is the “S3 access key”.

  6. With this information and the name of the domain used, it is possible to create a connection to the Swarm repository over the S3 protocol.

Note

There must be a DNS entry (or hosts entry) that points the FQDN of the storage domain to the Content Gateway IP address.
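As an illustration only (the AWS CLI is not part of the DataCore bundle), any standard S3 client can then be pointed at the storage domain. <S3_ACCESS_KEY>, <S3_SECRET_KEY>, and <DOMAIN_FQDN> are placeholders for the token created above and the storage domain FQDN.

aws configure set aws_access_key_id <S3_ACCESS_KEY>
aws configure set aws_secret_access_key <S3_SECRET_KEY>
aws --endpoint-url http://<DOMAIN_FQDN> s3 ls
aws --endpoint-url http://<DOMAIN_FQDN> s3 cp ./test.txt s3://bucket1/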

Configuring an SSL Certificate (Optional)

  1. By default, the Content Gateway VM template comes with HAProxy unconfigured and no self-signed certificate.

  2. If you wish to configure HAProxy as an SSL offloader, follow the steps outlined in Configuring haproxy SSL offloading with a Self Signed Certificate on CentOS7/8
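As a rough sketch of that approach (the referenced article is authoritative; the certificate path below is a hypothetical example), an SSL-offloading frontend terminates TLS and forwards plain HTTP to the local Gateway:

# Illustrative HAProxy SSL-offloading snippet
frontend https_in
    bind *:443 ssl crt /etc/haproxy/selfsigned.pem
    mode http
    default_backend local_gateway

backend local_gateway
    mode http
    server gw 127.0.0.1:80 check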

Central Logging (Optional)

It is recommended that the Content Gateway logs all actions and their status to the central Syslog server. The SCS can act as the central repository for logs.

To configure this, edit /etc/caringo/cloudgateway/logging.yaml and modify the following lines:

Syslog: 
   - name: audit_syslog 
     host: <SCS_BACKEND_IP> 

It is localhost by default.

   - name: server_syslog 
     host: <SCS_BACKEND_IP> 
Loggers: 

# Global logging configuration 

Root: 
  level: "${logLevel}" 
  AppenderRef: 
    - ref: file 
    - ref: server_syslog    

Note

Commented by default, remove the # at the beginning.

Logger: 
   # Audit logger 
   - name: audit 
     level: info 
     additivity: false 
     AppenderRef: 
       - ref: audit 
       - ref: audit_syslog  

There is no need to restart the Content Gateway service; the new logging configuration will be applied automatically after a few seconds.

Telemetry (Optional)

The Telemetry VM provides an all-in-one reference implementation of Prometheus, Alertmanager, and Grafana.

Preparation Steps

  1. Deploy SwarmTelemetry.ovf. IP addresses must be configured next.

  2. Change the IP configuration and verify the connection information for the frontend network. 

    nmcli con mod ens192 ipv4.addresses <TM_FRONTEND_IP>/<FRONTEND_NETWORK> 
    nmcli con mod ens192 ipv4.gateway <FRONTEND_GATEWAY> 
    nmcli con mod ens192 ipv4.dns <DNS_SERVER_1>,<DNS_SERVER_2> 
    nmcli con mod ens192 ipv4.dns-search <DNS_DOMAIN> 
    nmcli con mod ens192 ipv4.method manual 
    nmcli con mod ens192 connection.autoconnect yes 
    nmcli con reload 
    nmcli con down ens192 
    nmcli con up ens192 
     
    nmcli device show ens192
  3. Change the IP configuration and verify the connection information for the backend network. 

    nmcli con mod ens224 ipv4.addresses <TM_BACKEND_IP>/<BACKEND_NETWORK> 
    nmcli con mod ens224 ipv4.gateway <SCS_BACKEND_IP>
    nmcli con reload 
    nmcli con down ens224 
    nmcli con up ens224 
     
    nmcli device show ens224
  4. The network configuration can be verified with the command: ip a or nmcli con show

  5. Set the time zone according to the local clock.

    timedatectl set-timezone <timezone>
    hwclock --systohc

Note 

All available time zones can be listed with the command: timedatectl list-timezones 

  6. Configure chrony (NTP daemon) to connect to a valid NTP server. Edit the file /etc/chrony.conf and add the IP addresses or names of the NTP servers. Remove the default server lines. 

    server <NTP_SERVER_1> iburst 
    server <NTP_SERVER_2> iburst
  7. Restart the chrony daemon.

    systemctl restart chronyd
  8. Verify the clock is in sync.

    chronyc tracking

Prometheus Master Configuration

The next step is to configure Prometheus.

  1. Edit /etc/prometheus/prometheus.yml to include the IP addresses of all the Swarm components to be monitored; uncomment lines as needed:

    # THIS IS THE ELASTICSEARCH EXPORTER DEFINITION 
      # IP ADDRESS SHOULD BE Telemetry loopback 
    
      - job_name: 'elasticsearch' 
        scrape_interval: 30s 
        static_configs: 
        - targets: ['127.0.0.1:9114'] 
        relabel_configs: 
        - source_labels: [__address__] 
          regex: "([^:]+):\\d+" 
          target_label: instance 
    
      # THIS IS THE CLOUD CONTENT GATEWAY JOB DEFINITION 
      # IP ADDRESS SHOULD BE CLOUD GATEWAY STORAGE VLAN IP 
    
      - job_name: 'swarmcontentgateway' 
        static_configs: 
        - targets: ['<GW_BACKEND_IP>:9100'] 
        relabel_configs: 
        - source_labels: [__address__] 
          regex: "([^:]+):\\d+" 
          target_label: instance 
    
      # THIS IS THE CLOUD GATEWAY NODE_EXPORTER JOB DEFINITION 
      # IP ADDRESS SHOULD BE CLOUD GATEWAY STORAGE VLAN IP 
     
      - job_name: 'gateway-nodeexporter' 
        scrape_interval: 30s 
        static_configs: 
        - targets: ['<GW_BACKEND_IP>:9095'] 
        relabel_configs: 
        - source_labels: [__address__] 
          regex: "([^:]+):\\d+" 
          target_label: instance 
    
      # THIS IS THE SWARM JOB DEFINITION 
      # IP ADDRESS SHOULD BE STORAGE VLAN IP 
    
      - job_name: 'swarm' 
        scrape_interval: 30s 
        static_configs: 
        - targets: ['<SWARM_NODE1_IP>:9100','<SWARM_NODE2_IP>:9100','<SWARM_NODE3_IP>:9100','<SWARM_NODE4_IP>:9100'] 
        relabel_configs: 
        - source_labels: [__address__] 
          regex: "([^:]+):\\d+" 
          target_label: instance 

Note

If there are multiple gateways, add them to the targets list like: 

targets: ['<GW1_BACKEND_IP>:9100','<GW2_BACKEND_IP>:9100'] 

YAML (.yml) files are quite sensitive to spaces and indentation. The following command checks that there are no errors.  

promtool check config /etc/prometheus/prometheus.yml

Elasticsearch Node Exporter

  1. To gather statistics and status about Elasticsearch, edit /usr/lib/systemd/system/elasticsearch_exporter.service, updating the IP address of the (first) Elasticsearch VM (instead of the pre-configured 172.29.1.20). 

    ExecStart = /usr/local/bin/elasticsearch_exporter --es.all --es.cluster_settings --es.indices --es.indices_settings --es.indices_mappings --es.shards --es.snapshots --es.uri http://<ES_BACKEND_IP>:9200 --es.timeout 20s --web.listen-address :9114
  2. Run the below commands to enable and start the service.

    systemctl daemon-reload 
    systemctl enable elasticsearch_exporter 
    systemctl start elasticsearch_exporter 
  3. Once the Prometheus master config changes are applied, the service can be enabled and started.

    systemctl enable prometheus 
    systemctl restart prometheus 
  4. To verify that Prometheus is up and running, open a web browser and go to: http://<TM_FRONTEND_IP>:9090/targets  

This page shows which targets Prometheus is currently collecting metrics for and whether they are reachable. Click Status and select "Targets". It may take a few minutes to update; all states should appear as "UP".
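Optionally, the same target health information can be queried from the command line on the Telemetry VM (illustrative only):

curl -s http://127.0.0.1:9090/api/v1/targets | grep -o '"health":"[^"]*"'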

Alertmanager Configuration

There are four (4) alerts defined in /etc/prometheus/alert.rules.yml

  • Service_down: Triggered if any swarm storage node is down for more than 30 minutes.

  • Gateway_down: Triggered if the cloudgateway service is down for more than 2 minutes.

  • Elasticsearch_cluster_state: Triggered if the cluster state changed to "red" after 5 minutes.

  • Swarm_volume_missing: Triggered if the reported drive count decreases over a 10-minute window. This usually indicates a failed disk drive that needs to be replaced.
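For orientation, the rules in that file follow the standard Prometheus alerting-rule syntax. The snippet below is an illustrative sketch of the gateway_down alert only; the group name, labels, and annotations are examples, and the shipped /etc/prometheus/alert.rules.yml remains authoritative.

groups:
  - name: swarm-alerts
    rules:
      - alert: gateway_down
        expr: up{job="swarmcontentgateway"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Content Gateway {{ $labels.instance }} is down"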

The /etc/prometheus/prometheus.yml contains a section that points to the alertmanager service on port 9093, as well as which alert.rules.yml file to use.
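That wiring typically looks like the following in prometheus.yml (shown for orientation only; the shipped file already contains an equivalent section):

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['127.0.0.1:9093']

rule_files:
  - "/etc/prometheus/alert.rules.yml"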

To customize the alerts:

  1. Modify the swarmUI template in /etc/prometheus/alertmanager/template/basic-email.tmpl. This will be used for the email HTML template showing a button to the chosen URL.

    {{ define "__swarmuiURL" }}http://<GW_FRONTEND_IP>:91/_admin/storage/{{ end }} 
  2. The configuration for where to send alerts is defined in the file: /etc/prometheus/alertmanager/alertmanager.yml  

  3. By default, the route is disabled, as it requires manual input specific to each environment: SMTP server, username, password (if applicable), etc.

Important

Prometheus alertmanager does not support SMTP NTLM authentication, as such it cannot be used to send authenticated emails directly to Microsoft Exchange. Instead, smarthost should be configured to connect to localhost:25 without authentication. This is where the default Rocky Linux Postfix server is running. It will know how to send the email to your corporate relay as it is auto-discovered via DNS. Add require_tls: false to the email definition config section in alertmanager.yml.

Example configuration for a local SMTP relay:

- name: 'emailchannel'  
  email_configs: 
  - to: admin@acme.com  
    from: swarmtelemetry@acme.com  
    smarthost: smtp.acme.com:25  
    require_tls: false  
    send_resolved: true  
  4. Once the configuration is complete, restart the alertmanager.

    systemctl restart alertmanager  
  5. To verify the alertmanager.yml has the correct syntax, run:  

    amtool check-config /etc/prometheus/alertmanager/alertmanager.yml  

It should give the following output: 

Checking '/etc/prometheus/alertmanager/alertmanager.yml' SUCCESS  

Found:  

  • global config  

  • route  

  • 1 inhibit rules  

  • 2 receivers  

  • 1 templates SUCCESS 

Tip

The easiest way to trigger an alert for testing purposes is to shut down one Content Gateway.

Grafana Configuration

The password for the "admin" user can be changed in the configuration file /etc/grafana/grafana.ini; look for admin_password.

Note

Grafana has several authentication options, including Google auth, OAuth, LDAP, and (by default) basic HTTP auth.

For more information, see Documentation | Grafana Labs.

To enable the service on boot and start it, type:

systemctl enable grafana-server 
systemctl restart grafana-server

Grafana has all the Swarm dashboards pre-installed. Open a web browser and go to http://<TM_FRONTEND_IP>

The default period is 7 days; change it to 5 minutes to see some stats appearing on the charts.

The latest Swarm dashboards are available on the Grafana website.

Dashboard ID | Dashboard Name
16545        | DataCore Swarm AlertManager v15
16546        | DataCore Swarm Gateway v7
16547        | DataCore Swarm Node View
16548        | DataCore Swarm System Monitoring v15
17057        | DataCore Swarm Search v7
19456        | DataCore Swarm Health Processor v1

Job Name (Optional)

In /etc/prometheus/prometheus.yml the job_name of the Content Gateway can be defined. This job_name will be displayed on the Content Gateway Grafana dashboard.

Best Practice

It is recommended to make it human-friendly, using the fully qualified hostname (FQDN).

If the Content Gateway job_name is changed, a couple of additional changes are required:

Modify the gateway job name in /etc/prometheus/alertmanager/alertmanager.yml; it must match what appears in prometheus.yml.

routes:  
- match:  
      job: <new_job_name>

Info

It is swarmcontentgateway by default.

Modify the gateway job name in /etc/prometheus/alert.rules.yml.

alert: gateway_down  
expr: up{job="<new_job_name>"} == 0  

DNS names can be used. In the absence of a DNS server, first modify the /etc/hosts file with the desired names for each Swarm storage node, and then use those names in the configuration file. This is recommended in scenarios where the dashboards are publicly accessible.
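For illustration, hypothetical /etc/hosts entries for four storage nodes could look like this (the host names are examples only):

<SWARM_NODE1_IP>   swarm-node1.<DNS_DOMAIN>   swarm-node1
<SWARM_NODE2_IP>   swarm-node2.<DNS_DOMAIN>   swarm-node2
<SWARM_NODE3_IP>   swarm-node3.<DNS_DOMAIN>   swarm-node3
<SWARM_NODE4_IP>   swarm-node4.<DNS_DOMAIN>   swarm-node4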

Prometheus Retention Time (Optional)

By default, the Prometheus configuration in Telemetry keeps metrics for 30 days. If there is a need to increase or decrease this retention, follow the next steps: 

  1. Edit the /root/prometheus.service file. 

  2. Decide on the desired retention time for the collected metrics. 

  3. Modify the --storage.tsdb.retention.time=30d flag to the new desired retention time.

Tip

The rule of thumb is 600 MB of disk space per Swarm node for 30 days of metrics; for example, a four-node cluster kept for 90 days needs roughly 4 × 600 MB × 3 ≈ 7.2 GB. This VM template comes with a 50 GB dedicated vmdk partition for Prometheus.

  4. Finally, commit the change:

cp /root/prometheus.service /usr/lib/systemd/system  
systemctl daemon-reload  
promtool check config /etc/prometheus/prometheus.yml  
systemctl restart prometheus

Prometheus Security (Optional)

It may be desirable to restrict the Prometheus server to only allow queries from the local host, since the Grafana server runs on the same VM. This can be done by editing the /root/prometheus.service file and adding the flag --web.listen-address=127.0.0.1:9090.  

If Prometheus is bound only to localhost, the built-in Prometheus UI on port 9090 will not be accessible remotely. 
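As a sketch of what the resulting unit file line might look like (the binary path and other flags are assumptions; keep whatever the shipped /root/prometheus.service already defines and only add the new flag):

# Illustrative only - preserve the existing ExecStart and append the flag
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.retention.time=30d \
    --web.listen-address=127.0.0.1:9090

Commit the change the same way as for the retention time: copy the file to /usr/lib/systemd/system, run systemctl daemon-reload, and restart the prometheus service.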

Planning and Storage Nodes Prerequisites

Deployment Planning

Hardware Requirements for Storage

SCS

Swarm Cluster Services (SCS) Implementation

Elasticsearch

Hardware Requirements for Elasticsearch

Preparing the Search Cluster

Installing Elasticsearch

Configuring Elasticsearch

Setup Elasticsearch Cluster

Managing Feeds

Content Gateway

Gateway Requirements

Gateway Installation

Gateway Configuration

Gateway Verification

Telemetry (Prometheus and Grafana)

Prometheus Node Exporter and Grafana
