
...

The Listing Cache prototype can be installed as a regular gateway. No extra configuration is required. Increasing the Java heap size is recommended, and the disk that holds /var/spool/caringo/cloudgateway should be an SSD, preferably with at least 100 GB of free capacity.

Memory, CPU and Disk Requirements

  • Minimum: 4 vCPU, 8 GB RAM for the Java heap (12 GB for the VM), 100 GB dedicated partition on SSD (XFS filesystem)

  • Recommended: 8 vCPU, 12 GB RAM for the Java heap (16 GB for the VM), 200 GB dedicated partition on SSD (XFS filesystem)
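
The disk requirement above can be checked before installation. The following is a minimal sketch, assuming Python 3 is available on the gateway host; the function name has_enough_free_space is hypothetical, and the check covers free capacity only (it does not verify that the partition is SSD or XFS).

```python
import shutil

GB = 10 ** 9  # decimal gigabyte, matching the "100 GB" figure in the text


def has_enough_free_space(path: str, required_gb: int = 100) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    usage = shutil.disk_usage(path)
    return usage.free >= required_gb * GB


if __name__ == "__main__":
    spool = "/var/spool/caringo/cloudgateway"  # spool path named in this document
    try:
        verdict = "OK" if has_enough_free_space(spool) else "less than 100 GB free"
        print(f"{spool}: {verdict}")
    except FileNotFoundError:
        print(f"{spool} does not exist on this host")
```

Pass required_gb=200 to check against the recommended size instead of the minimum.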

CPU

  • The system must have adequate CPU resources, as caching can lead to higher CPU utilization due to cache management (e.g., cache eviction and invalidation).

  • Caches with complex data structures or serialization/deserialization can also increase CPU usage.

Network

  • If any.

...

  • If any.

Limitations

  • Client-Specific Binding: Bound to a dedicated client, with no cross-gateway sharing allowed. The gateway must be able to intercept every write and delete that happens in Swarm.

  • Non-Persistent Cache: The disk/memory cache is discarded by default on restart.

  • Limited Lifecycle and Recursive Deletion Support: No support for bucket lifecycle policies, delete lifepoints, or recursive deletes. All writes and deletes must originate from the gateway.

  • Stale Data: Cache invalidation is handled internally by Listing Cache and needs no user action. Stale listings are a concern mainly when writes or deletes bypass the gateway.

  • Memory Constraints: Caching large volumes of data can quickly consume system memory. Misconfiguring cache sizes can lead to memory exhaustion or excessive eviction, reducing cache effectiveness.

  • Cache Invalidation Complexity: Cache invalidation (i.e., refreshing cached listings when the source data changes) is managed by Listing Cache itself. It is an implementation detail and requires no user action.

  • Overhead on Writes: Cache updates and invalidation add some work to each write. However, because Listing Cache allows synchronous indexing to be switched off, writes actually became faster.

  • Cache Miss Penalties: When a listing is not found in the cache, the request falls back to Elasticsearch. The worst case is no worse than running without Listing Cache.

  • Delimiter Support: Custom delimiters are not yet supported; only the forward slash "/" is supported.

How Does Listing Cache Work

  • Ensure Sufficient Disk Space: Listing Cache stores each folder in a separate SQLite database, which consumes disk space. Provide ample disk space to avoid frequent evictions of folder databases, as this impacts performance.

  • Automatic Folder Detection: Listing Cache automatically learns about folders through ongoing list, write, and delete requests. No manual intervention is required to create or manage databases for each folder.

  • Monitor Cache Population: Initially, for any new folder, the cache starts with an "infinite gap," meaning it has no data cached and queries Elasticsearch. Over time, as more listings are cached, the gap reduces until the folder is fully cached and can be served without querying Elasticsearch.

  • Real-Time Cache Updates: Ongoing write and delete requests are intercepted and used to keep the folder databases updated, ensuring the cache remains consistent with the actual data.

  • LRU-Based Eviction: The system automatically evicts the least recently used (LRU) databases when disk space is full. If a folder's database is evicted and later requested, the cache process restarts for that folder.

  • Disk Space Directly Impacts Performance: The more disk space available, the fewer evictions occur, allowing more folders to remain fully cached and reducing the need for frequent Elasticsearch queries.

  • Prepare for Elasticsearch Querying: In case of cache misses or folder database evictions, Elasticsearch will be queried. Ensure that Elasticsearch is properly configured to handle such requests, especially during periods of high cache turnover.
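
The behavior described above can be illustrated with a toy model. This is a simplified sketch, not the gateway's implementation: the class FolderCacheSketch and its method names are hypothetical, the backend callable stands in for Elasticsearch, databases are in-memory rather than on disk, eviction is triggered by a folder count rather than by disk usage, and a folder is loaded in one step instead of closing its gap gradually.

```python
import sqlite3
from collections import OrderedDict


class FolderCacheSketch:
    """Toy model: one SQLite database per folder, with LRU eviction."""

    def __init__(self, backend, max_folders):
        self.backend = backend          # callable: folder -> names (Elasticsearch stand-in)
        self.max_folders = max_folders  # stand-in for "disk space is full"
        self.dbs = OrderedDict()        # folder -> connection, least recently used first
        self.backend_queries = 0

    def _load(self, folder):
        # Evict the least recently used folder database when the limit is reached.
        if len(self.dbs) >= self.max_folders:
            _, evicted = self.dbs.popitem(last=False)
            evicted.close()
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE objects (name TEXT PRIMARY KEY)")
        self.backend_queries += 1
        db.executemany("INSERT INTO objects VALUES (?)",
                       [(name,) for name in self.backend(folder)])
        self.dbs[folder] = db
        return db

    def list(self, folder):
        db = self.dbs.get(folder)
        if db is None:
            db = self._load(folder)        # cache miss: query the backend
        else:
            self.dbs.move_to_end(folder)   # cache hit: mark as most recently used
        return [row[0] for row in db.execute("SELECT name FROM objects ORDER BY name")]

    def on_write(self, folder, name):
        # Intercepted write: keep an already cached folder up to date.
        db = self.dbs.get(folder)
        if db is not None:
            db.execute("INSERT OR REPLACE INTO objects VALUES (?)", (name,))

    def on_delete(self, folder, name):
        # Intercepted delete: remove the entry from an already cached folder.
        db = self.dbs.get(folder)
        if db is not None:
            db.execute("DELETE FROM objects WHERE name = ?", (name,))
```

In this model, a larger max_folders means fewer evictions and fewer backend queries, which mirrors the relationship between disk space and Elasticsearch load described in the list above.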

...

Info

Validate that your system supports Listing Cache. Listing Cache is built into the gateway and does not use an external cache engine such as Redis or Memcached.

  1. In gateway.cfg, set disableListingCache=false in the [storage_cluster] section.

  2. No cache engine needs to be selected or configured. Listing Cache does not use Redis or Memcached, and eviction is LRU-based and automatic.

  3. No client code changes are needed. Listing Cache is transparent to gateway S3/SCSP clients, and cache invalidation is handled by the gateway.

  4. After testing in a staging environment, roll out the Listing Cache to production by deploying the configuration change.

  5. Monitor performance impact closely during the rollout phase.

  6. Optional. Pre-warm the cache with commonly accessed listings before enabling it in production, so the initial requests are served from the cache.
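
The gateway.cfg setting from step 1 can be verified with a short script. This is a minimal sketch under two assumptions that are not confirmed by this document: that gateway.cfg can be parsed as INI text by Python's configparser, and that a missing key should be treated as "cache disabled". The function name listing_cache_enabled is hypothetical.

```python
import configparser


def listing_cache_enabled(cfg_text: str) -> bool:
    """Return True when [storage_cluster] sets disableListingCache=false."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(cfg_text)
    # Assumption: a missing section or key counts as "disabled" (fallback=True).
    disabled = parser.getboolean("storage_cluster", "disableListingCache",
                                 fallback=True)
    return not disabled


# Sample content; only the setting named in step 1 is shown.
sample = """
[storage_cluster]
disableListingCache=false
"""

if __name__ == "__main__":
    print("Listing Cache enabled:", listing_cache_enabled(sample))
```

To check a real file, read its contents and pass the text to listing_cache_enabled.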

...

How to Determine Listing Cache is Working Correctly

  1. Monitor Cache Hit Rate

    • Monitor cache hit and miss rates. If you have telemetry and Grafana available, check the Listing Cache dashboard. A high cache hit rate (e.g., above 90%) indicates that the cache is serving requests effectively.

  2. Check Response Time

    • Compare the response time before and after enabling the Listing Cache. Reduced response times, particularly for frequently requested data, indicate the cache functions correctly.

  3. Resource Utilization

    • Monitor memory usage and CPU utilization. Increased memory usage and steady CPU activity are normal in a caching system, but excessively high CPU or memory usage may indicate misconfiguration or over-reliance on the cache.

  4. Log Analysis

    • Review the gateway logs for cache access events such as hits, misses, and evictions. This helps verify that requests are being served from the cache. This requires switching on DEBUG logging.

  5. Data Consistency Checks

    • Ensure that cached listings remain consistent with the data source. The only way to check this is to set up a second gateway without Listing Cache and compare the listing results from both gateways.
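
The comparison in step 5 can be sketched as follows. This is an illustration only: the function compare_listings is hypothetical, and it assumes the object names have already been fetched from both gateways (with Listing Cache and without) using the same bucket, prefix, and delimiter. How the listings are fetched is left to the S3 or SCSP client in use.

```python
def compare_listings(cached, uncached):
    """Compare object names listed via the Listing Cache gateway and a plain gateway."""
    cached_set, uncached_set = set(cached), set(uncached)
    return {
        # Names the plain gateway returns but the Listing Cache gateway does not.
        "missing_from_cache": sorted(uncached_set - cached_set),
        # Names only the Listing Cache gateway returns (possibly stale entries).
        "only_in_cache": sorted(cached_set - uncached_set),
        # True when both listings contain the same names in the same order.
        "order_matches": list(cached) == list(uncached),
    }


if __name__ == "__main__":
    report = compare_listings(["a.txt", "b.txt"], ["a.txt", "b.txt", "c.txt"])
    print(report)
```

Both difference lists being empty and order_matches being True indicates that the two gateways returned identical listings at the time of the check.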