...

  • Shard Size: Recommended shard size is between 20 GB and 50 GB.

  • Shard Alignment: Align the number of shards with the number of nodes (shards should be a multiple of nodes).

  • Heap Size: Maximum heap size of 30.5 GB per node.

  • Shard Capacity: At roughly 20 shards per GB of heap, a node with 30.5 GB of heap can handle 30.5 × 20 = 610 shards, or approximately 600 shards per node.

  • Elastic Limit: Elasticsearch has a default limit of 1,000 shards per node (the cluster.max_shards_per_node setting). It is advisable to stay below 600 shards per node for optimal performance.
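
The capacity arithmetic in the guidelines above can be checked with a short calculation. The following is a minimal sketch only: the function and constant names are invented for this example, and the figures (30.5 GB heap, 20 shards per GB, the 600-shard guideline, and the 1,000-shard default) are the ones quoted in the list.

```python
# Illustrative sketch; all names here are hypothetical helpers, not product APIs.
MAX_HEAP_GB = 30.5              # maximum heap per node (from the guidelines)
SHARDS_PER_GB_HEAP = 20         # rule-of-thumb shards per GB of heap
RECOMMENDED_MAX_SHARDS = 600    # rounded guideline: stay below this per node
ES_DEFAULT_SHARD_LIMIT = 1000   # Elasticsearch default per-node shard limit


def max_shards_for_heap(heap_gb: float, shards_per_gb: int = SHARDS_PER_GB_HEAP) -> int:
    """Return the heap-based shard ceiling for a single node."""
    return int(heap_gb * shards_per_gb)


def classify_node(shards_on_node: int) -> str:
    """Compare a node's shard count with the guideline and the default limit."""
    if shards_on_node > ES_DEFAULT_SHARD_LIMIT:
        return "over Elasticsearch default limit"
    if shards_on_node >= RECOMMENDED_MAX_SHARDS:
        return "above guideline"
    return "ok"


if __name__ == "__main__":
    print(max_shards_for_heap(MAX_HEAP_GB))  # 610
    print(classify_node(550))                # ok
    print(classify_node(800))                # above guideline
```

A node reported as "above guideline" still works under the default limit, but it has more shards than the heap-based rule of thumb suggests.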

...

  1. Get the new search feed to 100% after PDI is enabled: When PDI is enabled, a new set of indices is created, each corresponding to a specific domain. With the new search feed, data is indexed to particular ES nodes. The new search feed needs to be fully populated (100%) before you switch to those indices, which means that all documents are indexed without any gaps or missing data.
    Implementation Steps:

      1. Enable PDI in the cluster settings.

      2. Create a new search feed based on predefined templates or configurations.

      3. Reindex all existing documents from scratch. The non-PDI index remains available as long as it is not deleted.

      4. Monitor the progress of the reindexing process until all indices reflect 100% of the expected documents.

  2. Once completed, set the new search feed as the default: When the search feed is at 100%, make it the default. The cluster is then ready to serve listings through PDI. This switch should be seamless so that users experience no disruption.

    Implementation Steps:

      1. Confirm that the search feed is at 100%.

      2. Update the default search feed to the new PDI-enabled search feed.

      3. Run a test to verify that the per-domain indices are visible on the Elasticsearch nodes.

  3. Clean up the previous index (created without PDI) as needed: The old index, which stored data without per-domain segregation, is now redundant. Cleaning it up can free up resources and reduce storage costs.

    Implementation Steps:

      1. Verify that the new indices are performing correctly and that no issues exist.

      2. Delete the old index to remove unnecessary data.
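
The "100% before switching" check above can be sketched as a comparison of expected and indexed document counts per domain. This is a hedged illustration, not the product's own mechanism: the function names, the domain names, and the idea of an `expected` count per domain are assumptions made for the example. In practice the indexed counts might come from the Elasticsearch count API for each per-domain index.

```python
from typing import Dict, List


def feed_completion(expected: Dict[str, int], indexed: Dict[str, int]) -> float:
    """Return overall completion of the new search feed as a percentage.

    expected: documents that should exist per domain (hypothetical source of truth).
    indexed:  documents currently present in each per-domain index.
    """
    total_expected = sum(expected.values())
    if total_expected == 0:
        return 100.0
    # Cap each domain at its expected count so surplus documents in one
    # domain cannot hide gaps in another.
    total_indexed = sum(min(indexed.get(d, 0), n) for d, n in expected.items())
    return 100.0 * total_indexed / total_expected


def incomplete_domains(expected: Dict[str, int], indexed: Dict[str, int]) -> List[str]:
    """List the domains whose index is still missing documents."""
    return sorted(d for d, n in expected.items() if indexed.get(d, 0) < n)


def can_switch_default(expected: Dict[str, int], indexed: Dict[str, int]) -> bool:
    """Allow the default switch only when every domain is fully indexed."""
    return not incomplete_domains(expected, indexed)


if __name__ == "__main__":
    expected = {"domainA": 1000, "domainB": 500}   # hypothetical domains
    indexed = {"domainA": 1000, "domainB": 450}
    print(round(feed_completion(expected, indexed), 2))
    print(incomplete_domains(expected, indexed))
    print(can_switch_default(expected, indexed))
```

Checking each domain separately, rather than only the overall percentage, makes it clear which per-domain index is holding up the switch.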

Implementing Per Domain Index (PDI) With Downtime

  1. Set the new search feed as the default after enabling PDI: If downtime is acceptable, the process is simpler. You can switch to the new search feed immediately after enabling PDI, even if its indices are not fully populated. Users will experience downtime or incomplete search results until reindexing is complete.
    Implementation Steps:

      1. Enable PDI and create a new domain-specific search feed.

      2. Immediately update the search configuration to use the new search feed.

      3. Reindex documents into the new indices. Search may be temporarily affected while this runs.

  2. The new index will be temporarily unavailable for listing: During reindexing, the new domain indices may not be fully operational. Users may see missing or incomplete data in search results.
    Implementation Steps:

      1. Inform users about the planned downtime and potential service disruption.

      2. Monitor the reindexing process to track progress and identify any issues.

  3. Listing will resume after the new search feed reaches 100% completion: Once reindexing is complete, search returns to full capacity with complete and accurate listings.
    Implementation Steps:

      1. Monitor reindexing progress and verify that all expected documents are indexed.

      2. Conduct validation checks to ensure data integrity and accuracy.

      3. Notify users when the service is fully restored.
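
The monitor-then-notify sequence above can be sketched as a simple polling loop. This is only an illustration under stated assumptions: `get_progress` and `notify` are hypothetical callables that you would supply (for example, one that reads the search feed percentage and one that posts a status message), and the messages, polling interval, and poll limit are arbitrary.

```python
import time
from typing import Callable


def wait_for_feed(
    get_progress: Callable[[], float],
    notify: Callable[[str], None],
    poll_seconds: float = 30.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll reindexing progress and notify users at the start and at the end.

    Returns True once progress reaches 100%, or False if max_polls is
    exhausted first.
    """
    notify("Search is degraded while the new search feed is reindexing.")
    for _ in range(max_polls):
        if get_progress() >= 100.0:
            notify("Reindexing complete; search is fully restored.")
            return True
        sleep(poll_seconds)
    notify("Reindexing has not finished; investigate before retrying.")
    return False


if __name__ == "__main__":
    # Simulated progress readings stand in for a real progress source.
    readings = iter([40.0, 85.0, 100.0])
    messages = []
    done = wait_for_feed(lambda: next(readings), messages.append, sleep=lambda s: None)
    print(done, messages)
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real waiting; a real run would leave the default in place.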

Disabling Per Domain Index

  1. Set search.perDomainIndex = False and delete the associated search feed: To disable PDI, set search.perDomainIndex to False. This setting controls whether the system uses per-domain indices. After disabling it, delete all per-domain search feeds to consolidate data back into a single index.
    Implementation Steps:

      1. Set search.perDomainIndex to False in your application or search engine settings.

      2. Delete the per-domain search feeds, as they are no longer needed. This deletes all the per-domain indices.

  2. All per-domain indices will be removed, leaving a single index operational: After PDI is disabled, the system reverts to using a single index for all domains. This simplifies data management but loses the benefits of domain-specific indexing.

  3. The number of shards will remain unchanged, so there won’t be any issues with single indexing: The shard count is typically configured at the cluster level and does not change when switching from PDI to a single index. This keeps the system’s capacity for handling search queries stable.
    Implementation Steps:

      1. Review the shard configuration of your search engine to confirm that it is optimal for a single index.

      2. Adjust the shard count if necessary to balance performance and resource usage.
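
The shard review in the last step can be sketched against the sizing guidelines given earlier on this page (20 GB to 50 GB per shard, shard count a multiple of the node count). The helper names and the 35 GB target below are assumptions for illustration, not values taken from the product.

```python
import math

MIN_SHARD_GB = 20   # lower bound of the recommended shard size
MAX_SHARD_GB = 50   # upper bound of the recommended shard size


def review_shards(index_size_gb: float, shard_count: int) -> str:
    """Judge the average shard size of a single index against the 20-50 GB range."""
    avg_gb = index_size_gb / shard_count
    if avg_gb > MAX_SHARD_GB:
        return "too few shards"
    if avg_gb < MIN_SHARD_GB:
        return "too many shards"
    return "ok"


def suggest_shard_count(index_size_gb: float, node_count: int,
                        target_shard_gb: float = 35.0) -> int:
    """Return the smallest multiple of node_count that keeps the average
    shard size at or below target_shard_gb (35 GB is an assumed midpoint)."""
    needed = max(1, math.ceil(index_size_gb / target_shard_gb))
    return math.ceil(needed / node_count) * node_count


if __name__ == "__main__":
    print(review_shards(600, 6))        # average 100 GB per shard
    print(suggest_shard_count(600, 3))  # multiple of 3 nodes
    print(review_shards(600, suggest_shard_count(600, 3)))
```

For a small index the two guidelines can conflict: a 10 GB index on three nodes gets three shards from `suggest_shard_count`, each well under 20 GB, so the result should be treated as a starting point for review rather than a rule.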

Limitations

  • Increased Complexity: Managing multiple indexes can complicate the architecture of the search engine or database system. It requires more sophisticated algorithms and infrastructure to handle indexing and searching across domains.

  • Resource Intensive: Each index requires storage space and processing power. Maintaining multiple indexes can lead to higher resource consumption, increasing costs for hardware, maintenance, and energy.

  • Index Synchronization: Keeping indexes up-to-date across multiple domains can be challenging. Changes in the data must be reflected in all relevant indexes, which can introduce delays or errors.

  • Search Performance: While Per Domain Index can improve search performance for specific queries, it may degrade overall performance when a query spans multiple domains. This can result in longer query times as the system needs to aggregate results from various indexes.

  • Scalability Issues: As the number of domains increases, scaling the infrastructure to support numerous indexes can become difficult. Performance may suffer if not properly managed.

...