Rate limiting of S3 listings
An experimental Content Gateway patch rate-limits S3 listing requests to prevent them from overloading the Elasticsearch cluster.
Installation
The latest rpm is provided in a support ticket, with a name like the one below. Install it on your current Content Gateway 7.10.7 server. There is no official release of Gateway 7.10.8; the patch uses that version number only to make yum install and downgrade easier. Expect updates to this patch, and note that all configuration, APIs and behavior will change before the feature is available in an official release on an 8.0-based branch.
yum install caringo-gateway-7.10.8-0.CLOUD.3331.ratelimitlistings.7108.0003.noarch.rpm
Verify that this version is now installed and running with:
yum list installed | grep caringo
curl http://GATEWAY/_admin/manage/version
Rollback can be done with:
yum downgrade caringo-gateway-7.10.7-1.noarch.rpm
That should not be necessary, however, because no rate limiting is enabled by default.
Configuration
The maximum number of concurrent delimiter listings and non-delimiter listings can be configured separately, since delimiter listings are usually the problem queries for Elasticsearch. Add the following to /etc/caringo/cloudgateway/gateway.cfg and then run systemctl restart cloudgateway.
[debug]
auditLogVersion = 4
# Requires a special pre-release rpm from DataCore Support:
# caringo-gateway-7.10.7-0.CLOUD.3331.ratelimitlistings.0005.noarch.rpm
# This Gateway will allow a maximum of 50 delimiter listings. Additional requests will
# wait 5 seconds before returning an S3 503 SlowDown response to the client.
# Up to 100 non-delimiter listing requests are allowed with additional requests waiting
# the default 10 seconds before returning a 503 SlowDown response to the client.
rateLimitListings = delimiter:50,5 nondelimiter:100
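To make the value format concrete, here is a minimal shell sketch that splits a rateLimitListings value into its parts. It assumes only the "type:limit[,wait]" shape shown in the example above and the 10 second default wait mentioned in its comments. The function name parse_rate_limit is hypothetical and is not part of Gateway.

```shell
# Sketch only: splits a rateLimitListings value into type, limit and wait.
# Assumes the "type:limit[,wait]" shape from the example above and a
# 10 second default wait. parse_rate_limit is a hypothetical helper name.
parse_rate_limit() (
    for entry in $1; do
        kind="${entry%%:*}"       # text before the colon, e.g. "delimiter"
        spec="${entry#*:}"        # text after the colon, e.g. "50,5"
        limit="${spec%%,*}"       # maximum concurrent listings
        case "$spec" in
            *,*) wait_s="${spec#*,}" ;;   # explicit wait in seconds
            *)   wait_s=10 ;;             # default wait
        esac
        echo "$kind limit=$limit wait=${wait_s}s"
    done
)

parse_rate_limit "delimiter:50,5 nondelimiter:100"
# prints:
# delimiter limit=50 wait=5s
# nondelimiter limit=100 wait=10s
```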
Dynamic configuration
The rate limiting configuration can also be changed dynamically with the PUT APIs below, without a Gateway restart.
Because all Gateways are independent, these requests must be issued to each Gateway server. The Gateway reverts to the rateLimitListings configuration in gateway.cfg on restart. For better control, consider having your load balancer direct all listing requests (a GET with query args like delimiter, prefix, max-keys, marker, or continuation-token) to one or two Gateways.
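As an illustration of that routing rule, the following shell sketch classifies a request as a listing when it is a GET whose query string contains one of the arguments named above. The function name is_listing_request is hypothetical, and the argument list is only the set of examples from this article, not necessarily every argument Gateway treats as a listing. A real load balancer would express the same test in its own rule syntax.

```shell
# Sketch only: returns 0 when the request looks like an S3 listing.
# is_listing_request is a hypothetical helper; the argument names are the
# examples given in this article and may not be exhaustive.
is_listing_request() {
    [ "$1" = "GET" ] || return 1          # listings are GET requests
    case "$2" in
        *\?*) ;;                          # URL has a query string
        *) return 1 ;;
    esac
    query="&${2#*\?}"                     # leading & so every name is preceded by one
    for name in delimiter prefix max-keys marker continuation-token; do
        case "$query" in
            *"&$name="*|*"&$name&"*|*"&$name") return 0 ;;
        esac
    done
    return 1
}

if is_listing_request GET "http://GATEWAY/bucket?prefix=photos/&delimiter=/"; then
    echo "route to the listing Gateways"
fi
```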
For example, if the Elasticsearch cluster is having severe problems, you can block all further delimiter listings, with no wait time, and allow only 10 concurrent non-delimiter listings with:
curl -u dcadmin -X PUT http://GATEWAY/_admin/manage/_ratelimit/listings/delimiter/0,0
curl -u dcadmin -X PUT http://GATEWAY/_admin/manage/_ratelimit/listings/nondelimiter/10
This can be done on a running Gateway without affecting other requests such as object GET and PUT. Keep in mind that backup software and tools like rclone copy will likely stop if they are unable to list. Consider having your load balancer send requests for a critical domain to a separate Gateway configured with a higher limit.
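Because each Gateway has to be updated separately, a small loop can save time during an incident. The sketch below repeats the PUT shown above for every server in a list. The host names and the function names ratelimit_url and apply_ratelimit are hypothetical; only the URL path and the dcadmin user come from this article.

```shell
# Hypothetical host names; replace them with your own Gateway servers.
GATEWAYS="gateway1.example.com gateway2.example.com"

# Build the management URL for one listing type and value,
# for example "delimiter" and "0,0".
ratelimit_url() {
    printf 'http://%s/_admin/manage/_ratelimit/listings/%s/%s\n' "$1" "$2" "$3"
}

# Send the same limit to every Gateway in GATEWAYS and report failures.
apply_ratelimit() {
    for gw in $GATEWAYS; do
        curl -sS -f -u dcadmin -X PUT "$(ratelimit_url "$gw" "$1" "$2")" \
            || echo "failed to update $gw" >&2
    done
}

# Example (curl prompts for the dcadmin password once per Gateway):
# apply_ratelimit delimiter 0,0
# apply_ratelimit nondelimiter 10
```

The settings applied this way are lost when a Gateway restarts, so any limit that should persist still belongs in gateway.cfg.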
You can DELETE any rate limiting configuration dynamically, so that listings are no longer rate limited, with:
curl -u dcadmin -X DELETE http://GATEWAY/_admin/manage/_ratelimit/listings/all
Monitoring
It is useful to tail cloudgateway_audit.log on the individual Gateways, or on the centralized SCS/CSN syslog server, to monitor the listing times and responses:
You can see the current rate limit configurations and see a list of active listing requests with:
The formatting of the list of listings needs to be improved.
© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.