1 Configuring the Content Gateway
- 1.1 Minimum Configuration
- 1.2 Configuration Sections of gateway.cfg
2 Setting Ports for Docker or Proxies
3 Enabling the Service Proxy

These configuration files reside on the system after installing the Content Gateway service:

/etc/caringo/cloudgateway/gateway.cfg 
/etc/caringo/cloudgateway/logging.yaml

Logging: See Gateway Logging after completing the Gateway configuration. The configuration file for logging changed from logging.cfg to logging.yaml as of Gateway 6.0 to support newer versions of Elasticsearch and to add customizations to the YAML file. See the Apache documentation for logging.

Password Security

Plain-text passwords in both Gateway Configuration and IDSYS are replaced by encrypted versions on startup. When management passwords need to be changed, enter the new passwords in plain text and restart Gateway; startup replaces those strings with encrypted versions. (v7.1)

If the adminDomain is deleted or changed, these configuration items must be changed back to plain text so they can be encrypted with the new key.

Configuring the Content Gateway

Minimum Configuration

While cluster administrators must understand the details of configuring Content Gateway, this section summarizes the minimum steps required to configure and run Gateway. To deploy Gateway into production, additional customization is needed.

Check either that IPTABLES are off or that inbound access for the front-end protocols is allowed. These commands turn off and disable the firewall daemon.
systemctl disable firewalld
systemctl stop firewalld
Edit the /etc/caringo/cloudgateway/gateway.cfg file:
1. Set adminDomain to the name of the administrative domain that will be created.
2. Set hosts for the storage cluster nodes. Including 4 to 5 nodes is sufficient for most deployments.
3. Set indexerHosts to the Elasticsearch servers (required for S3 and Content Metering).
4. Enable at least one of the front-end protocols: SCSP or S3.
  Alternatively, for Service Proxy use (to host the Swarm UI), set both to disabled and complete the [cluster_admin] section.
Create the administrative domain by running the following on the first Gateway server:
/opt/caringo/cloudgateway/bin/initgateway
Password Security: This one-time step initializes password encryption for the Gateway configuration and IDSYS files. If upgrading from a version prior to 7.1, this initialization must be run again on one Gateway server to enable the feature. (v7.1)
See Gateway Administrative Domain
Start the Gateway service:
Enable automatic startup of the Gateway service.
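
The gateway.cfg edits in the steps above can be sketched as follows. This is a minimal illustration, not a production configuration: the domain name and all addresses are assumed placeholder values, and only parameters described on this page are used.

```ini
# Minimal gateway.cfg sketch; every value below is a hypothetical placeholder.
[gateway]
# Administrative domain that initgateway creates afterwards
adminDomain = admin.example.com

[storage_cluster]
locatorType = static
# Four storage nodes (4 to 5 are sufficient for most deployments)
hosts = 192.168.1.11 192.168.1.12 192.168.1.13 192.168.1.14
# Elasticsearch servers, required for S3 and Content Metering
indexerHosts = 192.168.1.21 192.168.1.22 192.168.1.23

[scsp]
# At least one front-end protocol must be enabled
enabled = true

[s3]
enabled = false
```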

Production deployments require customizing the configuration parameters described below.

Configuration Sections of gateway.cfg

The gateway.cfg file controls the core operations of the Content Gateway. It is a plain text, INI-formatted file read when the Gateway is first started. The parameters within the file are organized into the following sections, and colored rows are generally essential entries.

1 [gateway]
2 [storage_cluster]
3 [scsp]
4 [s3]
5 [metering]
6 [caching]
7 [quota]
8 [dynamic_features]
9 [folder_listings]
10 [cluster_admin]
11 [metrics]
12 [debug]

[gateway]

This section configures client communications.

adminDomain	gatewayAdminDomain	Required. The administrative domain where meta information about tenants and storage domains is kept. Important: This parameter must be set to the same value for all Gateway servers. Changing the adminDomain invalidates encrypted passwords in idsys.json and gateway.cfg and all tokens. Setting adminDomain to match the Swarm default domain (cluster.name) is not recommended; doing so leads to “Invalid token” errors if cluster.enforceTenancy=False, which is also not recommended.
threads	100 * (number of CPUs present in the Gateway)	The number of threads allocated to handling client requests. Set to 100 times the number of CPU cores. The minimum is 200. For CPUs with hyperthreading enabled, this calculation is based on the number of virtual cores, not physical cores.
tokenTTLHours	24	The default number of hours an authentication token is valid if no time is defined when it is created.
multipartSpoolDir	`/var/spool/cloudgateway`	The location of the spool directory for HTTP multipart MIME upload temporary space. Note: Uploads through the Content UI use SCSP multipart uploads rather than multipart MIME uploads. (Gateway v6.2)
multipartUsageAllowed	50	The percentage of the file system that can be used for multipart MIME upload temporary space.
recursiveDeleteMaxThreads	50	The maximum number of parallel delete operations to dispatch when processing recursive delete requests.
sanitizeErrors	false	Set to true to hide identity management configuration details from authentication errors.
cookieDomains		One or more base domains for the `Set-Cookie` response header to scope (instead of the FQDN from the request) if an authentication token is created within a child domain of one of these base domains. This can be useful when using the Content UI to access multiple storage domains that share a common base domain when wanting to use the same authentication token across domains. (v5.2.2) Example: `cookieDomains = cloud.example.com cloud.example.net`
veeamKbBlockSize	8192	Gateway implements the Veeam SOSAPI extension (v7.10.3). This setting configures the block size. The default and recommended value is 8192. Set to 0 to disable SOSAPI handling. The capacity and availability returned in a GET of the pseudo-object `.system-d26a9498-cb7c-4a87-a44a-8ae204f5ba6c/capacity.xml` are estimated based on the bucket's evaluated EC setting, which is cached for 5 minutes. The values are based on cluster capacity; bucket quotas are not currently used.
recursiveDeleteMaxItems	10000	The maximum number of items in a multidelete request (SCSP only). S3 has a fixed limit of 1000, which is defined by AWS.
recursiveDeleteMaxSize	2560000	The maximum body size of a multidelete request (~2.5 MB).
recursiveDeleteMaxRetries	3	Number of retries when hitting 503 on delete.
recursiveDeleteRetryDelay	500	Number of milliseconds to wait before retrying.
recursiveDeleteSynchronousIndexing	true	Whether to request synchronous ES index update during each delete.
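
As an illustration of the table above, a [gateway] section for a server with 8 virtual cores might look like the following sketch. The core count and the domain name are assumptions made for the example; the cookieDomains value repeats the example from the table.

```ini
[gateway]
# Hypothetical administrative domain; must be identical on all Gateway servers
adminDomain = admin.example.com
# 100 x 8 virtual cores (assumed hardware) = 800; the minimum is 200
threads = 800
tokenTTLHours = 24
# Hide identity management details in authentication errors
sanitizeErrors = true
cookieDomains = cloud.example.com cloud.example.net
```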

[storage_cluster]

This section configures the back-end storage cluster.

locatorType	"static"	Zeroconf is not supported.
hosts	server1 server2 server3	Space or comma delimited list of IP addresses or host names of the storage cluster nodes.
port	80	Integer socket port number for SCSP on the storage nodes.
clusterName		The name of the storage cluster.
indexerHosts	indexer1 indexer2 indexer3	Space or comma delimited list of the Elasticsearch metadata index servers used by the storage cluster. Must be from the same ES cluster: do not mix old and new clusters. Required for the S3 protocol and for Content Metering
indexerPort	9200	The socket port on which the Elasticsearch servers listen.
managementPort managementUser managementPassword	91	Provide these credentials for the storage cluster to enable Gateway version and component information to be included in the cluster health report that provides proactive support from DataCore. (v6.0) Required when using [cluster_admin].
clientBindAddress	0.0.0.0	Set to the IP address of the network interface connected to the storage cluster subnet when using a multi-homed Gateway. The value must be defined as a non-default value when using a multi-homed Gateway server such as one connected to a front-end client network and a back-end storage network.
maxConnectionsPerRoute	threads / 2	The maximum number of open connections to a specific storage node.
maxConnections	(threads / 2) * (number of nodes)	The maximum number of open connections to allow. This includes both active and idle connections.
connectTimeout	60	The time in seconds allowed to connect to a node.
socketTimeout	10	The time in seconds allowed for an active connection to deliver data. The default of 10 seconds applies starting with v8.1.0. Set to -1 to disable.
idleTimeout	120	The time in seconds an idle socket is allowed to remain in the connection pool.
indexerSocketTimeout	120	The time in seconds an indexer socket is allowed to remain in the connection pool. This affects the ability to list larger buckets. (v7.1)
continueWaitTimeout	30	The time in seconds to wait for client response after a 100 continue reply.
dataProtection	"immediate"	Controls whether synchronous (immediate, using replicate on write) or asynchronous (delayed) data protection is requested when writing to the storage cluster. Values: "immediate" (replicate on write), which requires the storage cluster setting `scsp.replicateOnWrite=true`; "delayed" (disables replicate on write), which requires the storage cluster setting `scsp.replicateOnWrite=false`. See Configuring ROW Replicate On Write.
blockUndeletableWrites	true	When enabled, the Gateway rejects any SCSP write (PUT, POST, COPY, APPEND) that includes a `deletable=no/false` lifepoint. This restriction applies to both named and unnamed (alias and immutable) objects. The request is refused with a 400 error message, "Unable to write undeletable object".
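
A [storage_cluster] section for a multi-homed Gateway could be sketched as below. The addresses and cluster name are hypothetical, and the connection limits simply spell out the default formulas from the table under the assumption that threads is 800 and four storage nodes are listed.

```ini
[storage_cluster]
locatorType = static
# Hypothetical storage node and Elasticsearch addresses
hosts = 192.168.1.11 192.168.1.12 192.168.1.13 192.168.1.14
port = 80
clusterName = swarm.example.com
indexerHosts = 192.168.1.21 192.168.1.22 192.168.1.23
indexerPort = 9200
# Hypothetical address of the interface on the storage subnet
clientBindAddress = 192.168.1.5
# threads / 2 = 800 / 2
maxConnectionsPerRoute = 400
# (threads / 2) * (number of nodes) = 400 * 4
maxConnections = 1600
# Requires scsp.replicateOnWrite=true on the storage cluster
dataProtection = immediate
```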

[scsp]

This section configures the front-end SCSP protocol. This protocol must be enabled for any Gateway that services Content UI requests.

enabled	true	Activates this protocol. Values are: "true", "false".
bindAddress	0.0.0.0	The IP address of the network interface to which the listening socket binds. Defaults to all interfaces.
bindPort	80	Integer socket port number for protocol.
externalHTTPPort externalHTTPSPort	80 443	Optional, one or both. Allows Gateway to be used either behind a proxy or within a Docker environment, taking effect when X-Forwarded-Proto is found on the request. Gateway uses X-Forwarded-Proto to determine which port to use. (v5.4)
allowSwarmAdminIP	undefined	Allows the use of internal Swarm requests for content replication to pass through the Gateway. This is useful if using replication feeds between clusters that use Gateway as the front-end. Values are "all", full IP addresses, IP address prefixes, a list of IPs/prefixes, or CIDR format such as 172.30.15.0/24. When undefined, no clients are allowed to send Swarm admin requests through the Gateway.
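
The following sketch shows an [scsp] section for a Gateway that sits behind a proxy and accepts replication feed traffic from one subnet. The external port numbers are assumed values for illustration; the CIDR range is the one used as an example in the table.

```ini
[scsp]
enabled = true
bindAddress = 0.0.0.0
bindPort = 80
# Assumed ports published by the proxy; used when X-Forwarded-Proto is present
externalHTTPPort = 80
externalHTTPSPort = 443
# Only this subnet may send Swarm admin requests through the Gateway
allowSwarmAdminIP = 172.30.15.0/24
```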

[s3]

This section configures the front-end S3 protocol, which is optional.

enabled	false	The protocol must be explicitly enabled. Values are: "true", "false".
bindAddress	0.0.0.0	The IP address of the network interface to which the listening socket binds. Defaults to all interfaces.
bindPort	80	Integer socket port number for protocol.
externalHTTPPort externalHTTPSPort	80 443	Optional, one or both. Allows Gateway to be used either behind a proxy or within a Docker environment, taking effect when X-Forwarded-Proto is found on the request. Gateway uses X-Forwarded-Proto to determine which port to use. (v5.4)
enhancedListingConsistency	true	Improves compatibility with S3 clients and software libraries that expect consistent listings (despite the documented nature of listings to be eventually consistent). Can be disabled to boost write throughput (especially for small objects), if listing consistency is not critical. (v5.2.1) Exceptions to synchronous indexing: (1) Deletes of manifests for canceled multipart uploads are done asynchronously. (2) On a delete, when there is not enough space on the local node to write a delete marker for a named object, Swarm writes to another node and indexes asynchronously. (3) On a rename, Swarm indexes the new name synchronously, but the old name is deleted asynchronously. (4) On a parallel write complete, the init stream is deleted asynchronously.
region		The Amazon S3 GET Bucket Location request returns the AWS region in which the bucket is located. By default, Gateway returns an empty value for the location, which S3 clients interpret as us-east-1. If another region is required, there are two options: (1) Supply the location in the bucket creation operation using LocationConstraint. (2) Set the `region` option in the Gateway configuration file to the preferred region; this applies to all buckets unless the location is specified during creation. If you require the behavior prior to Content Gateway 7.10.2 of returning the cluster name, set `region` to that cluster name.
forcedDomain		Set `forcedDomain` to the name of an existing domain to force Content Gateway to use that domain for S3 requests regardless of the incoming Host or X-Forwarded-Host header. This allows S3 clients to use gateway hostnames or IP addresses as the endpoint instead of requiring the endpoint to be a domain name. The S3 clients must use the "bucket in path" style of access for all requests, not the “bucket in Host” style. This feature is supported since v7.10.7.
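
A possible [s3] section using the `region` and `forcedDomain` options is sketched below. The region string and the domain name are assumptions for the example; the forced domain must already exist.

```ini
[s3]
enabled = true
bindAddress = 0.0.0.0
bindPort = 80
enhancedListingConsistency = true
# Hypothetical region returned for buckets created without LocationConstraint
region = eu-west-1
# Hypothetical existing domain; clients must use "bucket in path" access (v7.10.7)
forcedDomain = data.example.com
```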

[metering]

This section configures usage metering, which is optional. See Content Metering

enabled	false	The feature must be explicitly enabled.
flushIntervalSeconds	300 (5 minutes)	How frequently to send usage reports to Elasticsearch. Minimum is 10 seconds. The default value is optimized for the resolution of the queries.
retentionDays	100 (days)	How long to retain usage records. Minimum is 2 days. Allow for additional storage space if significantly increasing the retention period.
storageSampleIntervalSeconds	3600 (1 hour)	How frequently to sample the disk usage. Minimum is 900 (15 minutes). Larger values reduce the query workload on Elasticsearch.
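
Enabling metering with the documented defaults written out explicitly would look like this sketch; only the `enabled` value differs from the defaults in the table.

```ini
[metering]
enabled = true
# Defaults from the table, shown for clarity
flushIntervalSeconds = 300
retentionDays = 100
storageSampleIntervalSeconds = 3600
```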

[caching]

This section configures cache expiration. Times are in seconds. To disable a cache, set its value to 0.

authRefresh	300	Time before authorization is revalidated with a request to the identity management system.
tokenRefresh	300	Time before an authentication token is revalidated with a request to the administration domain.
idsysRefresh	300	Time an IDSYS document, or its nonexistence, is cached in memory.
policyRefresh	300	Time a tenant, domain, or bucket Policy document, or its nonexistence, is cached in memory.
xformRefresh	300	Time an XFORM document, or its nonexistence, is cached in memory.
metadataRefresh	300	Time that metadata for a tenant, domain, or bucket, or its nonexistence, is cached in memory. This includes the owner for a tenant/domain/bucket and whether a bucket exists.
domainExistenceRefresh	300	Time that the knowledge of a domain's existence or nonexistence is cached.
socketTimeout	10 (seconds)	Default timeout starting from v8.1.0. Set to -1 to disable this configuration.
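
As a sketch, a deployment that wants authorization changes to take effect sooner could shorten one refresh time and leave the rest at their defaults. The value of 60 is an assumed example, not a recommendation from this page.

```ini
[caching]
# Assumed example: revalidate authorization every 60 seconds instead of 300
authRefresh = 60
# Remaining entries shown at their documented defaults
tokenRefresh = 300
policyRefresh = 300
```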

[quota]

This section configures storage and network usage quotas. See Setting Quotas

The Gateway regularly refreshes the cache of quota information using an Elasticsearch query against usage metrics when enabled; it changes the quota state and performs the action specified by policy if any quota limit is reached.

enabled	false	The feature must be explicitly enabled.
minRefreshDeadline	60	The global limits on the speed of quota data refreshing. To increase the precision of the usage data, lower these values. To reduce the load on Elasticsearch, increase these values. To optimize the load on Elasticsearch, Gateway refreshes with a dynamic algorithm: slower when metrics are still far from the limit and faster when the limit approaches, slower when approaching a limit and faster as the overage nears an end. The minimum and maximum deadlines refer to the caps to apply to this refresh rate (no faster and no slower than these values).
maxRefreshDeadline	3600
numRefreshThreads	4	The number of threads in the pool that continuously look at the most urgent deadlines in the queue and perform the refreshes (Elasticsearch queries) as needed.
maxRefreshRetries	3	The number of times a refresh can fail due to a failing Elasticsearch query before an error is logged and the refresh is dropped.
maxQueueSize	10000	Maximum queue size for scope quota evaluations. The internal implementation uses a deadline queue and, if the queue overflows, the least urgent items are pushed out of the queue.
queryTTL	maxRefreshDeadline	This avoids unnecessary load on Elasticsearch by allowing the results of a quota check performed when a scope (tenant, domain, bucket) is accessed to be cached for this period of time. If the time since last access is less than this value, the scope is not scanned in the background. Setting this parameter to 0 disables the access caching function.
refreshRetryDelay	10	Number of seconds to wait before retrying a refresh after the previous failed due to a failing Elasticsearch query.
refreshIdleSleep	3	Seconds to wait after finishing the work in a queue and before starting again.
smtpHost	localhost	Required. The hostname or IP address of the SMTP server that sends the email notifications.
smtpPort	25	Optional. The port where the SMTP server listens.
smtpUser smtpPassword		Optional. The user and password to authenticate with the SMTP server.
mailFrom	`donotreply@localhost`	Email address for the sender of the notification.
mailSubjectTemplate	Quota state change notification	Email templates for subject line and body. These variables can be used in both the subject line and message body templates. %metric% %state% %contextType% %contextName% The %xxx% strings render current values when the message is generated.
mailTemplate	Metric %metric% changed to %state% state in %contextType% %contextName%.
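
A [quota] section with email notification might be sketched as follows. The SMTP host, sender address, and subject wording are hypothetical; the refresh values are the documented defaults, and the body template is the default from the table.

```ini
[quota]
enabled = true
minRefreshDeadline = 60
maxRefreshDeadline = 3600
# Hypothetical mail server and sender
smtpHost = smtp.example.com
smtpPort = 25
mailFrom = donotreply@example.com
# Hypothetical subject line built from the documented variables
mailSubjectTemplate = Quota %state% for %contextType% %contextName%
mailTemplate = Metric %metric% changed to %state% state in %contextType% %contextName%.
```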

[dynamic_features]

If optional dynamic features such as Video Clipping for Partial File Restore are installed, their configuration settings appear in this Dynamic Features section. (v11.0)

resultObjectLifetime	5	In days. Sets a lifepoint to trigger cleanup of any JSON result objects for video clips, which are created asynchronously.

[folder_listings]

This section configures options related to object listings.

usePaths

A setting introduced in Gateway v7.10.0 to improve S3 delimiter listing performance. The default is true as of Gateway v8.0.3. Also see usePathsMaxDirs.

It requires Swarm 14.1 or later with search.enableDelimiterPaths set to True. This is the default for new Swarm 15.0+ clusters; see Settings Reference.
Set this setting explicitly if upgrading from Swarm 14.1.

Run swarmctl -C search.enableDelimiterPaths -V True and create a new search feed. See also Add Search Feed.
The new feed can use the same Elasticsearch cluster if it has at least half of its disk space free.

After the new feed completes, make it the default feed and restart the Gateways with [folder_listings] usePaths=true in gateway.cfg.
This makes the top-level delimiter listings (used by Veeam and S3 Browser) faster.

usePathsMaxDirs

A setting introduced in Gateway v8.0.3 that determines which delimiter listings are affected by usePaths=true. It defaults to 5, which means that it affects delimiter listings without a prefix or with the prefix having up to five subdirectories.
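
Written out in gateway.cfg, the two settings described above take this form, shown here with their documented default values:

```ini
[folder_listings]
# Requires Swarm 14.1+ with search.enableDelimiterPaths set to True
usePaths = true
# Applies to listings with no prefix or a prefix of up to five subdirectories
usePathsMaxDirs = 5
```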

[cluster_admin]

This section configures options related to the Service Proxy.

enabled	false	Enables the Service Proxy functionality.
bindAddress	<IP \| hostname>	Specifies the IP address or host name where Service Proxy listens for incoming storage cluster management API and Metering Query requests.
bindPort	91	Specifies the port where Service Proxy listens. By convention, this is port 91.
externalHTTPPort externalHTTPSPort	<port> <port>	Optional, one or both. Allows Gateway to be used either behind a proxy or within a Docker environment, taking effect when X-Forwarded-Proto is found on the request. Gateway uses X-Forwarded-Proto to determine which port to use. (v5.4)
platformHost	<IP \| hostname>	Required for Platform Server if running Service Proxy/Swarm UI on a standalone Gateway. See Configuring Swarm for Platform Server
testMode	true \| false	Enables testMode when troubleshooting, which stops obfuscation of the backend Swarm Storage and Elasticsearch node IPs.

[metrics]

This section configures the metrics server that Gateway exposes for Prometheus. Prometheus must be configured to poll /metrics on this address and port. Metrics are prefixed with caringo_gateway.

metricsEnabled	true	Metrics are enabled by default.
metricsPort	9100	Port for Prometheus to poll.
metricsHost	0.0.0.0	Address the metrics server binds to. The default, 0.0.0.0, refers to all IP addresses. This can be set to a private address if Prometheus can connect to it. (v7.10.6)
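
For example, restricting the metrics server to a private interface could be sketched as below. The address is a hypothetical private IP that Prometheus is assumed to be able to reach.

```ini
[metrics]
metricsEnabled = true
metricsPort = 9100
# Hypothetical private address instead of all interfaces (0.0.0.0)
metricsHost = 10.0.0.5
```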

[debug]

This section contains configuration that Support might ask to be temporarily enabled for diagnosis:

debugConnLeaks	true	Set this to true as directed by DataCore Support to diagnose connection pool or stuck thread issues.

Setting Ports for Docker or Proxies

Gateway manages communications through assigned ports. As of release 5.4, Gateway can be configured to run either within a Docker environment or behind a proxy. Each protocol section ([scsp], [s3], and [cluster_admin], the Service Proxy) has two settings for this purpose: externalHTTPPort and externalHTTPSPort. These settings take effect when X-Forwarded-Proto appears on the request.

SCSP, S3, and Service Proxy requests each route to the correct port. Browser requests must use the correct port:

Content UI	`/_admin/portal`	SCSP port	`[scsp]`
Swarm UI	`/_admin/storage`	Service Proxy port	`[cluster_admin]`

Gateway can redirect users if they attempt to access a UI on the wrong port. To accomplish this:

- The load balancer must set X-Forwarded- headers, which Gateway uses to determine which port to use.
- Configure externalHTTP[S]Port correctly in gateway.cfg.

Example Load Balancer Setup	Example Settings in gateway.cfg

If an HAProxy load balancer at haproxy.example.com is proxying requests for SCSP and S3 (on a shared port) and for Service Proxy:

...then expose both HTTP and HTTPS in these sections:
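
As a hedged illustration of such settings, the fragment below exposes both ports in the [scsp] and [cluster_admin] sections. All four port numbers are assumptions chosen for the example and must match whatever the load balancer actually publishes.

```ini
# Hypothetical external ports published by the load balancer
[scsp]
externalHTTPPort = 80
externalHTTPSPort = 443

[cluster_admin]
externalHTTPPort = 91
externalHTTPSPort = 9443
```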

Redirection: This is how redirection is achieved given the example above. A user incorrectly attempts to access /_admin/storage on the SCSP/S3 port exposed by HAProxy.

HAProxy proxies this request to Gateway's SCSP port as:

Gateway SCSP knows that it does not handle /_admin/storage requests and that /_admin/storage is handled by the [cluster_admin] port, so it responds with a redirect to the [cluster_admin] externalHTTPSPort (because X-Forwarded-Proto specifies HTTPS; otherwise, it uses externalHTTPPort).

Enabling the Service Proxy

For most implementations, one Gateway is dedicated to running as Service Proxy to support cluster administration (using Swarm UI and Management API), and a pool of additional Gateways handles all content management at scale. For test or lightly used clusters, enable both cluster administration and content management on a single Gateway instance.

On the Gateway instance that runs as Service Proxy, make the following changes to the configuration (gateway.cfg file):

[cluster_admin]	`enabled=true`	Enables the Service Proxy functionality.
	`bindAddress=<IP\|hostname>`	Specifies the IP address or host name where Service Proxy listens for incoming storage cluster management API and Metering Query requests.
	`bindPort=91`	Specifies the port where Service Proxy listens. By convention, this is port 91.
	`externalHTTPPort=<port>` `externalHTTPSPort=<port>`	Optional, one or both. Allows Gateway to be used either behind a proxy or within a Docker environment, taking effect when `X-Forwarded-Proto` is found on the request. Gateway uses `X-Forwarded-Proto` to determine which port to use. (v5.4)
	`platformHost=<IP\|hostname>` `platformPort=<port>`	Required for Platform Server if running Service Proxy/Swarm UI on a standalone Gateway. See Configuring Swarm for Platform Server
	`testMode=<true\|false>`	Enables testMode when troubleshooting, which stops obfuscation of the backend Swarm Storage and Elasticsearch node IPs.
[storage_cluster]	`managementPort=91`	Specifies the port where Swarm listens for storage cluster management API requests. By convention, this is port 91.
	`managementUser=<Swarm admin user>`	Specifies the user known to Swarm allowed to perform management API requests against the storage cluster.
	`managementPassword=<Swarm admin password>`	Specifies the password of the `managementUser`.
[s3]	`enabled=false`
[scsp]	`enabled=false`
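
Taken together, the changes in the table above could be sketched as the following gateway.cfg fragment for a dedicated Service Proxy. The bind address and the credentials are hypothetical placeholders; a password entered in plain text is replaced by an encrypted version on startup, as described under Password Security.

```ini
[cluster_admin]
enabled = true
# Hypothetical address where Service Proxy listens
bindAddress = 10.0.0.5
bindPort = 91

[storage_cluster]
managementPort = 91
# Hypothetical Swarm admin credentials
managementUser = admin
managementPassword = changeme

# Front-end protocols disabled on a dedicated Service Proxy
[s3]
enabled = false

[scsp]
enabled = false
```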

Authentication and authorization for the Service Proxy use Content Gateway's root IDSYS and root Policy. The root Policy must grant all actions to the storage administrator users and/or groups:

See https://perifery.atlassian.net/wiki/spaces/public/pages/2443816826 and https://perifery.atlassian.net/wiki/spaces/public/pages/2443816981
