Swarm integrates Elasticsearch and extends the Swarm API with commands for querying Swarm objects by their metadata. With this feature, Swarm indexes object metadata in near real time and lets you run ad hoc searches (via query commands) on the attributes and metadata of your stored objects.
Swarm uses Elasticsearch servers for its metadata searching operations. You can deploy these servers for high availability and horizontal scaling. Although the storage cluster does not depend on the search cluster for its own high availability, you may need a highly available search cluster to service third-party analytics applications.
Important
You can return the results as JSON or XML, which you can import into your third-party analytics applications.
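As a rough sketch of the import step, the snippet below flattens a JSON result into CSV text that an analytics tool could load. The sample payload and its field names (`name`, `bytes`, `content_type`) are hypothetical placeholders, not the actual Swarm response schema; check the query documentation for the fields that your version returns.

```python
import csv
import io
import json

# Hypothetical sample of a JSON query result; real field names will differ.
SAMPLE_RESULT = (
    '[{"name": "a.jpg", "bytes": 1024, "content_type": "image/jpeg"},'
    ' {"name": "b.txt", "bytes": 12}]'
)


def results_to_csv(json_text, columns):
    """Convert a JSON array of metadata records into CSV text.

    Fields missing from a record are written as empty cells, and
    fields not listed in `columns` are ignored.
    """
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        restval="",
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


if __name__ == "__main__":
    print(results_to_csv(SAMPLE_RESULT, ["name", "bytes", "content_type"]))
```

Passing the column list explicitly keeps the output stable even when individual objects carry different sets of metadata fields.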
See also these sections:
- Swarm Storage Release Notes
- Elasticsearch for Swarm (configuration and administration)
- /wiki/spaces/DOCS/pages/2443813497
- Storage SCSP Development
Search components
The search infrastructure includes these components:
- Swarm Storage cluster, which is connected to the Elasticsearch servers through a Search Feed.
- Search feed(s), which transmit the metadata from the storage cluster. Feeds iterate over data on storage nodes and use intermittent channel connections to distribute data to one or more configured destinations, including metadata search servers. See Managing Feeds.
Tip
Because Swarm uniquely names each search feed index, you can configure additional feeds that use the same Elasticsearch cluster. If you do, plan for double or triple the space demands on that server.
- Elasticsearch servers, which index the metadata and service search requests. This metadata can be reconstructed from the storage cluster, if needed.
- Metrics curator service, which can be installed on one of the Elasticsearch servers or on another system running RHEL/CentOS 7.
- Client applications, which access the Swarm cluster through SCSP commands.
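To illustrate how a client application might address a metadata query to the cluster over HTTP, here is a minimal sketch that only assembles the request URL and makes no network call. The host name and the `format` and `domain` query arguments are assumptions for illustration; consult Storage SCSP Development for the query arguments that Swarm actually supports.

```python
from urllib.parse import urlencode, urlunsplit


def build_search_url(host, path="/", **query_args):
    """Assemble an HTTP URL for a search request (no network call is made).

    Query arguments are sorted by name so the resulting URL is deterministic.
    """
    query = urlencode(sorted(query_args.items()))
    return urlunsplit(("http", host, path, query, ""))


if __name__ == "__main__":
    # "swarm.example.com", "format", and "domain" are placeholder values.
    print(build_search_url("swarm.example.com", format="json", domain="example.com"))
```

Keeping URL construction separate from the request itself makes it easy to log or inspect a query before sending it with whichever HTTP client you use.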
Best practice