Gateway Configuration
These configuration files reside on the system after installing the Content Gateway service:
/etc/caringo/cloudgateway/gateway.cfg
/etc/caringo/cloudgateway/logging.yaml
Logging: See Gateway Logging after completing the Gateway configuration. The configuration file for logging changed from logging.cfg to logging.yaml as of Gateway 6.0 to support newer versions of Elasticsearch and to add customizations to the YAML file. See the Apache documentation for logging.
Password Security
Plain-text passwords in both the Gateway configuration and IDSYS are replaced by encrypted versions on startup. To change a management password, enter the new password in plain text and restart Gateway; startup replaces the string with its encrypted version. (v7.1)
If the adminDomain is deleted or changed, these config items must be changed back to plain text so they can be encrypted with the new key.
Configuring the Content Gateway
Minimum Configuration
While cluster administrators must understand the details of configuring Content Gateway, this section summarizes the minimum steps required to configure and run Gateway. To deploy Gateway into production, additional customization is needed.
Check either that IPTABLES are off or that inbound access for the front-end protocols is allowed. These commands turn off and disable the firewall daemon:
systemctl disable firewalld
systemctl stop firewalld
Edit the /etc/caringo/cloudgateway/gateway.cfg file:
Set adminDomain to the name of the administrative domain to create.
Set hosts for the storage cluster nodes. Including 4 to 5 nodes is sufficient for most deployments.
Set indexerHosts to the Elasticsearch servers (required for S3 and Content Metering).
Enable at least one of the front-end protocols: SCSP or S3.
Alternatively, for Service Proxy use (to host the Swarm UI), set both to disabled and complete the [cluster_admin] section.
Create the administrative domain by running the following on the first Gateway server:
/opt/caringo/cloudgateway/bin/initgateway
Password Security: This one-time step initializes password encryption for the Gateway configuration and IDSYS files. If upgrading from a version prior to 7.1, this initialization must be run again on one Gateway server to enable the feature. (v7.1)
See https://perifery.atlassian.net/wiki/spaces/public/pages/2443810269
Start the Gateway service:
Enable automatic startup of the Gateway service.
Production deployments require customizations of the configuration parameters, below.
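The minimum steps above can be checked mechanically before starting the service. The following is a minimal sketch, not part of Gateway: the function name check_minimum_config and every host name and domain in the sample are placeholders, and treating a missing "enabled" option as disabled is an assumption of the sketch.

```python
import configparser

# Hypothetical gateway.cfg content; every host name and domain is a placeholder.
SAMPLE_CFG = """
[gateway]
adminDomain = admin.example.com

[storage_cluster]
hosts = swarm1.example.com swarm2.example.com swarm3.example.com swarm4.example.com
indexerHosts = es1.example.com es2.example.com

[scsp]
enabled = true

[s3]
enabled = false
"""


def check_minimum_config(text):
    """Return a list of problems measured against the minimum steps above."""
    # interpolation=None keeps literal % characters in values from raising errors
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(text)

    def value(section, option):
        # The fallback covers a missing section as well as a missing option
        return cfg.get(section, option, fallback="").strip()

    def enabled(section):
        # Assumption of this sketch: a missing "enabled" option counts as disabled
        return cfg.getboolean(section, "enabled", fallback=False)

    problems = []
    if not value("gateway", "adminDomain"):
        problems.append("[gateway] adminDomain is not set")
    if not value("storage_cluster", "hosts"):
        problems.append("[storage_cluster] hosts is not set")
    if enabled("s3") and not value("storage_cluster", "indexerHosts"):
        problems.append("[storage_cluster] indexerHosts is required for S3")
    if not (enabled("scsp") or enabled("s3") or enabled("cluster_admin")):
        problems.append("enable SCSP or S3, or complete [cluster_admin]")
    return problems


print(check_minimum_config(SAMPLE_CFG))
```

An empty list means the sample satisfies the minimum steps; it says nothing about the production customizations described in the sections that follow.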
Configuration Sections of gateway.cfg
The gateway.cfg file controls the core operations of the Content Gateway. It is a plain text, INI-formatted file read when the Gateway is first started. The parameters within the file are organized into the following sections, and colored rows are generally essential entries.
[gateway]
This section configures client communications.
adminDomain | gatewayAdminDomain | Required. The administrative domain where meta information about tenants and storage domains is kept. Important: This parameter must be set to the same value for all Gateway servers. Changing the adminDomain invalidates encrypted passwords in idsys.json and gateway.cfg and all tokens. Setting it to match the Swarm default domain (cluster.name) is not recommended: doing so leads to “Invalid token” errors if cluster.enforceTenancy=False, which is also not recommended. |
---|---|---|
threads | 200 | The number of threads allocated to handling client requests. Set to 100 times the number of CPU cores. Minimum is 200. For CPUs with hyperthreading enabled, this calculation is based on the number of virtual cores, not physical. |
tokenTTLHours | 24 | The default number of hours an authentication token is valid if no time is defined when it is created. |
multipartSpoolDir | | The location of the spool directory for HTTP multipart MIME upload temporary space. Note: Uploads through the Content UI use SCSP multipart uploads rather than multipart MIME uploads. (Gateway v6.2) |
multipartUsageAllowed | 50 | The percentage of the file system that can be used for multipart MIME upload temporary space. |
recursiveDeleteMaxThreads | 50 | The maximum number of parallel delete operations to dispatch when processing recursive delete requests. |
sanitizeErrors | false | Set to true to hide identity management configuration details from authentication errors. |
cookieDomains | One or more base domains for the Example:
| |
veeamKbBlockSize | 8192 | Gateway implements the Veeam SOSAPI extension (v7.10.3). This config allows block size configuration. The default and recommended value is 8192. Set to 0 to disable SOSAPI handling. The capacity and availability returned in a GET of |
recursiveDeleteMaxItems | 10000 | The max multidelete request items, SCSP only. S3 has a fixed limit of 1000 which is defined by AWS. |
recursiveDeleteMaxSize | 2560000 | The max multidelete request body size in bytes (~2.5 MB). |
recursiveDeleteMaxRetries | 3 | Number of retries when hitting 503 on delete. |
recursiveDeleteRetryDelay | 500 | Number of milliseconds to wait before retrying. |
recursiveDeleteSynchronousIndexing | true | Whether to request synchronous ES index update during each delete. |
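The sizing rule for threads (100 per virtual core, with a floor of 200) can be computed as follows. This is a minimal sketch; the function name recommended_threads is hypothetical, and os.cpu_count() is used on the assumption that the script runs on the Gateway server itself.

```python
import os


def recommended_threads(virtual_cores=None):
    """100 threads per virtual core, never below the documented minimum of 200."""
    if virtual_cores is None:
        # os.cpu_count() reports logical (hyperthreaded) CPUs and may return None
        virtual_cores = os.cpu_count() or 1
    return max(200, 100 * virtual_cores)


print(recommended_threads(8))  # 8 virtual cores -> 800
```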
[storage_cluster]
This section configures the back-end storage cluster.
locatorType | "static" | Zeroconf is not supported. |
---|---|---|
hosts | server1 server2 server3 | Space or comma delimited list of IP addresses or host names of the storage cluster nodes. |
port | 80 | Integer socket port number for SCSP on the storage nodes. |
clusterName | | The name of the storage cluster. |
indexerHosts | indexer1 indexer2 indexer3 | Space or comma delimited list of the Elasticsearch metadata index servers used by the storage cluster. Must be from the same ES cluster: do not mix old and new clusters. Required for the S3 protocol and for Content Metering |
indexerPort | 9200 | The socket port on which the Elasticsearch servers listen. |
managementPort | 91 | Provide these credentials for the storage cluster to enable Gateway version and component information to be included in the cluster health report that provides proactive support from DataCore. (v6.0) Required when using [cluster_admin]. |
clientBindAddress | 0.0.0.0 | Set to the IP address of the network interface connected to the storage cluster subnet when using a multi-homed Gateway. The value must be defined as a non-default value when using a multi-homed Gateway server such as one connected to a front-end client network and a back-end storage network. |
maxConnectionsPerRoute | 100 | The maximum number of open connections to a specific storage node. |
maxConnections | 250 | The maximum number of open connections to allow. This includes both active and idle connections. |
connectTimeout | 60 | The time in seconds allowed to connect to a node. |
socketTimeout | 10 | The time in seconds allowed for an active connection to deliver data. |
idleTimeout | 120 | The time in seconds an idle socket is allowed to remain in the connection pool. |
indexerSocketTimeout | 120 | The time in seconds an indexer socket is allowed to remain in the connection pool. This affects the ability to list larger buckets. (v7.1) |
continueWaitTimeout | 30 | The time in seconds to wait for client response after a 100 continue reply. |
dataProtection | "immediate" | Controls whether synchronous (immediate, using replicate on write) or asynchronous (delayed) data protection is requested when writing to the storage cluster. Values:
|
blockUndeletableWrites | true | When enabled, the Gateway rejects any SCSP write (PUT, POST, COPY, APPEND) that includes a |
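Both hosts and indexerHosts accept a space or comma delimited list. The sketch below shows one way to split such a value when checking a configuration by script; the function name parse_host_list is hypothetical and is not how Gateway itself is known to parse the setting.

```python
import re


def parse_host_list(value):
    """Split a space or comma delimited host list into individual entries."""
    # Any run of commas and/or whitespace counts as a single delimiter
    return [host for host in re.split(r"[,\s]+", value.strip()) if host]


print(parse_host_list("server1 server2,server3"))
```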
[scsp]
This section configures the front-end SCSP protocol. This protocol must be enabled for any Gateway that services Content UI requests.
enabled | true | Activates this protocol. Values are: "true", "false". |
---|---|---|
bindAddress | 0.0.0.0 | The IP address of the network interface to which the listening socket binds. Defaults to all interfaces. |
bindPort | 80 | Integer socket port number for protocol. |
externalHTTPPort | 80 443 | Optional, one or both. Allows Gateway to be used either behind a proxy or within a Docker environment, taking effect when X-Forwarded-Proto is found on the request. Gateway uses X-Forwarded-Proto to determine which port to use. (v5.4) |
allowSwarmAdminIP | undefined | Allows the use of internal Swarm requests for content replication to pass through the Gateway. This is useful if using replication feeds between clusters that use Gateway as the front-end. Values are "all", full IP addresses, IP address prefixes, a list of IPs/prefixes, or CIDR format such as 172.30.15.0/24. |
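The value formats accepted by allowSwarmAdminIP ("all", full IP addresses, IP address prefixes, lists, and CIDR blocks) can be illustrated with a small matcher. This is a sketch under stated assumptions, not Gateway's implementation: the function name is hypothetical, list entries are assumed to be separated by spaces or commas, and a prefix is assumed to mean leading dotted octets such as 172.30.

```python
import ipaddress
import re


def is_swarm_admin_ip_allowed(setting, client_ip):
    """Return True if client_ip matches any entry in an allowSwarmAdminIP-style value."""
    for entry in re.split(r"[,\s]+", setting.strip()):
        if not entry:
            continue
        if entry == "all":
            return True
        if "/" in entry:
            # CIDR block such as 172.30.15.0/24
            network = ipaddress.ip_network(entry, strict=False)
            if ipaddress.ip_address(client_ip) in network:
                return True
        elif client_ip == entry or client_ip.startswith(entry.rstrip(".") + "."):
            # Full address, or a dotted prefix matched on an octet boundary
            return True
    return False


print(is_swarm_admin_ip_allowed("172.30.15.0/24", "172.30.15.9"))
```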
[s3]
This section configures the front-end S3 protocol, which is optional.
enabled | false | The protocol must be explicitly enabled. Values are: "true", "false". |
---|---|---|
bindAddress | 0.0.0.0 | The IP address of the network interface to which the listening socket binds. Defaults to all interfaces. |
bindPort | 80 | Integer socket port number for protocol. |
externalHTTPPort | 80 | Optional, one or both. Allows Gateway to be used either behind a proxy or within a Docker environment, taking effect when X-Forwarded-Proto is found on the request. Gateway uses X-Forwarded-Proto to determine which port to use. (v5.4) |
enhancedListingConsistency | true | Improves compatibility with S3 clients and software libraries that expect consistent listings (despite the documented nature of listings to be eventually consistent). Can be disabled to boost write throughput (especially for small objects), if listing consistency is not critical. (v5.2.1) Exceptions to synchronous indexing:
|
region | | The Amazon S3 GET Bucket Location request returns the AWS region in which the bucket is located. If you require the behavior prior to Content Gateway 7.10.2 of returning the cluster name, set |
forcedDomain | | Set |
[metering]
This section configures usage metering, which is optional. See Content Metering.
enabled | false | The feature must be explicitly enabled. |
---|---|---|
flushIntervalSeconds | 300 (5 minutes) | How frequently to send usage reports to Elasticsearch. Minimum is 10 seconds. The default value is optimized for the resolution of the queries. |
retentionDays | 100 (days) | How long to retain usage records. Minimum is 2 days. Allow for additional storage space if significantly increasing the retention period. |
storageSampleIntervalSeconds | 3600 (1 hour) | How frequently to sample the disk usage. Minimum is 900 (15 minutes). Larger values reduce the query workload on Elasticsearch. |
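The documented minimums for the metering settings can be checked before a configuration change is rolled out. The following is a minimal sketch; the names METERING_MINIMUMS and metering_violations are hypothetical, and the settings are passed in as a plain dictionary of integers.

```python
# Minimums taken from the table above
METERING_MINIMUMS = {
    "flushIntervalSeconds": 10,
    "retentionDays": 2,
    "storageSampleIntervalSeconds": 900,
}


def metering_violations(settings):
    """Map each setting that is below its minimum to the minimum it must reach."""
    return {
        name: minimum
        for name, minimum in METERING_MINIMUMS.items()
        if name in settings and settings[name] < minimum
    }


print(metering_violations({"flushIntervalSeconds": 5, "retentionDays": 100}))
```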
[caching]
This section configures cache expiration. Times are in seconds. To disable a cache, set its value to 0.
authRefresh | 300 | Time before authorization is revalidated with a request to the identity management system. |
---|---|---|
tokenRefresh | 300 | Time before an authentication token is revalidated with a request to the administration domain. |
idsysRefresh | 300 | Time an IDSYS document, or its nonexistence, is cached in memory. |
policyRefresh | 300 | Time a tenant, domain, or bucket Policy document, or its nonexistence, is cached in memory. |
xformRefresh | 300 | Time an XFORM document, or its nonexistence, is cached in memory. |
metadataRefresh | 300 | Time that metadata for a tenant, domain, or bucket, or its nonexistence, is cached in memory. |
domainExistenceRefresh | 300 | Time that the knowledge of a domain's existence or nonexistence is cached. |
socketTimeout | 10 (seconds) | Default timeout starting from v8.1.0. Set to -1 to disable this configuration. |
[quota]
This section configures storage and network usage quotas. See Setting Quotas.
When enabled, the Gateway regularly refreshes the cache of quota information using an Elasticsearch query against usage metrics; if any quota limit is reached, it changes the quota state and performs the action specified by policy.
enabled | false | The feature must be explicitly enabled. |
---|---|---|
minRefreshDeadline | 60 | The global limits on the speed of quota data refreshing. To increase the precision of the usage data, lower these values. To reduce the load on Elasticsearch, increase these values. To optimize the load on Elasticsearch, Gateway refreshes with a dynamic algorithm: slower when metrics are still far from the limit and faster when the limit approaches, slower when approaching a limit and faster as the overage nears an end. The minimum and maximum deadlines refer to the caps to apply to this refresh rate (no faster and no slower than these values). |
maxRefreshDeadline | 3600 | |
numRefreshThreads | 4 | The number of threads in the pool that continuously look at the most urgent deadlines in the queue and perform the refreshes (Elasticsearch queries) as needed. |
maxRefreshRetries | 3 | The number of times a refresh can fail due to a failing Elasticsearch query before an error is logged and the refresh is dropped. |
maxQueueSize | 10000 | Maximum queue size for scope quota evaluations. The internal implementation uses a deadline queue and, if the queue overflows, the least urgent items are pushed out of the queue. |
queryTTL | maxRefreshDeadline | This avoids unnecessary load on Elasticsearch by allowing the results of a quota check performed when a scope (tenant, domain, bucket) is accessed to be cached for this period of time. If the time since last access is less than this value, the scope is not scanned in the background. Setting this parameter to 0 disables the access caching function. |
refreshRetryDelay | 10 | Number of seconds to wait before retrying a refresh after the previous failed due to a failing Elasticsearch query. |
refreshIdleSleep | 3 | Seconds to wait after finishing the work in a queue and before starting again. |
smtpHost | localhost | Required. The hostname or IP address of the SMTP server that sends the email notifications. |
smtpPort | 25 | Optional. The port where the SMTP server listens. |
smtpUser | | Optional. The user and password to authenticate with the SMTP server. |
mailFrom | | Email address for the sender of the notification. |
mailSubjectTemplate | Quota state change notification | Email templates for subject line and body. These variables can be used in both the subject line and message body templates.
The %xxx% strings render current values when the message is generated. |
mailTemplate | Metric %metric% changed to %state% state in %contextType% %contextName%. |
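The %xxx% substitution used by the mail templates can be previewed with a short script. This is a sketch only: the function name render_template and the sample values are placeholders, the variable names are the four that appear in the default mailTemplate, and leaving unknown variables untouched is an assumption rather than documented Gateway behavior.

```python
import re

DEFAULT_TEMPLATE = "Metric %metric% changed to %state% state in %contextType% %contextName%."


def render_template(template, values):
    """Replace each %name% with its value; unknown names are left as they are."""
    return re.sub(
        r"%(\w+)%",
        lambda match: str(values.get(match.group(1), match.group(0))),
        template,
    )


# Placeholder values chosen for illustration only
sample = {"metric": "storage", "state": "over", "contextType": "bucket", "contextName": "photos"}
print(render_template(DEFAULT_TEMPLATE, sample))
```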
[dynamic_features]
If optional dynamic features such as Video Clipping for Partial File Restore are installed, their configuration settings appear in this Dynamic Features section. (v11.0)
resultObjectLifetime | 5 | In days. Sets a lifepoint to trigger cleanup of any JSON result objects for video clips that are created asynchronously. |
---|---|---|
[folder_listings]
This section configures options related to object listings.
usePaths | A setting introduced in Gateway v7.10.0 to improve S3 delimiter listing performance. The default is true as of Gateway v8.0.3. Also see It requires Swarm 14.1 or later with
|
usePathsMaxDirs | A setting introduced in Gateway v8.0.3 that determines which delimiter listings are affected by |
[cluster_admin]
This section configures options related to the Service Proxy.
enabled | false | Enables the Service Proxy functionality. |
bindAddress | <IP or hostname> | Specifies the IP address or host name where Service Proxy listens for incoming storage cluster management API and Metering Query requests. |
bindPort | 91 | Specifies the port where Service Proxy listens. By convention, this is port 91. |
externalHTTPPort | <port> | Optional, one or both. Allows Gateway to be used either behind a proxy or within a Docker environment, taking effect when X-Forwarded-Proto is found on the request. Gateway uses X-Forwarded-Proto to determine which port to use. (v5.4) |
platformHost | <IP or hostname> | Required for Platform Server if running Service Proxy/Swarm UI on a standalone Gateway. |
testMode | true or false | Enables testMode when troubleshooting, which stops obfuscation of the backend Swarm Storage and Elasticsearch node IPs. |
[metrics]
This section configures the metrics server that Gateway exposes for Prometheus. Prometheus is configured to poll /metrics on this address and port. Metrics are prefixed with caringo_gateway.
metricsEnabled | true | Metrics are enabled by default. |
metricsPort | 9100 | Port for Prometheus to poll |
metricsHost | 0.0.0.0 | Address the metrics server binds to. 0.0.0.0, by default, refers to all IP addresses. This can be configured to a private address if Prometheus can connect to it. (v7.10.6) |
[debug]
This section contains configuration that Support might ask to be temporarily enabled for diagnosis:
debugConnLeaks | true | Set this to true as directed by DataCore Support to diagnose connection pool or stuck thread issues. |
Setting Ports for Docker or Proxies
Gateway manages communications through assigned ports. As of release 5.4, Gateway can be configured to run either within a Docker environment or behind a proxy. The configuration has two settings (externalHTTPPort, externalHTTPSPort) per protocol: [scsp] and [cluster_admin], the Service Proxy. These settings take effect when X-Forwarded-Proto appears on the request.
SCSP, S3, and Service Proxy requests each route to the correct port. Browser requests must use the correct port:
Content UI | | SCSP port | |
---|---|---|---|
Swarm UI | | Service Proxy port | |
Gateway can redirect users if they attempt to access a UI on the wrong port. To accomplish this:
The load balancer must set X-Forwarded- headers, which Gateway uses to determine which port to use.
Configure externalHTTP[S]Port correctly in gateway.cfg.
Example Load Balancer Setup | Example Settings in gateway.cfg |
---|---|
If an HAProxy load balancer at haproxy.example.com is proxying requests for SCSP and S3 (on a shared port) and for Service Proxy: | ...then expose both HTTP and HTTPS |
Redirection: This is how redirection is achieved given the example above. A user incorrectly attempts to access /_admin/storage on the SCSP/S3 port exposed by HAProxy.
HAProxy proxies this request to Gateway's SCSP port as:
Gateway SCSP knows that it does not handle /_admin/storage requests and that /_admin/storage is handled by the [cluster_admin] port, so it responds with a redirect to the [cluster_admin] externalHTTPSPort (because X-Forwarded-Proto specifies HTTPS; otherwise, it uses externalHTTPPort).
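The port choice in this redirect can be sketched as a small function. This illustrates the selection logic only and is not Gateway code: the function name redirect_url and the port numbers 8091 and 8443 are hypothetical, while haproxy.example.com is the load balancer from the example above.

```python
def redirect_url(host, path, forwarded_proto, external_http_port, external_https_port):
    """Build the redirect target from the forwarded protocol and the external ports."""
    if (forwarded_proto or "").lower() == "https":
        # The client reached the load balancer over HTTPS: use externalHTTPSPort
        scheme, port = "https", external_https_port
    else:
        # Otherwise fall back to externalHTTPPort
        scheme, port = "http", external_http_port
    return f"{scheme}://{host}:{port}{path}"


# Hypothetical [cluster_admin] external ports
print(redirect_url("haproxy.example.com", "/_admin/storage", "https", 8091, 8443))
```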
Enabling the Service Proxy
For most implementations, one Gateway is dedicated to running as Service Proxy to support cluster administration (using Swarm UI and Management API), and a pool of additional Gateways handles all content management at scale. For test or lightly used clusters, enable both cluster administration and content management on a single Gateway instance.
On the Gateway instance that runs as Service Proxy, make the following changes to the configuration (gateway.cfg file):
[cluster_admin] | | Enables the Service Proxy functionality. |
---|---|---|
| | Specifies the IP address or host name where Service Proxy listens for incoming storage cluster management API and Metering Query requests. |
| | Specifies the port where Service Proxy listens. By convention, this is port 91. |
| | Optional, one or both. Allows Gateway to be used either behind a proxy or within a Docker environment, taking effect when |
| | Required for Platform Server if running Service Proxy/Swarm UI on a standalone Gateway. |
| | Enables testMode when troubleshooting, which stops obfuscation of the backend Swarm Storage and Elasticsearch node IPs. |
[storage_cluster] | | Specifies the port where Swarm listens for storage cluster management API requests. By convention, this is port 91. |
| | Specifies the user known to Swarm allowed to perform management API requests against the storage cluster. |
| | Specifies the password of the |
[s3] | | |
[scsp] | | |
Authentication and authorization for the Service Proxy use Content Gateway's root IDSYS and root Policy. The root Policy must grant all actions to the storage administrator users and/or groups:
See https://perifery.atlassian.net/wiki/spaces/public/pages/2443816826 and https://perifery.atlassian.net/wiki/spaces/public/pages/2443816981
© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.