Filtering Headers
If Swarm is used to deliver content directly over the Internet, a business requirement may exist to filter the optional HTTP response headers that it returns for GET and HEAD requests. (v9.5)
Caution
Indiscriminate filtering of response headers, which is cluster-wide in scope, can break client applications. Do not filter headers if the client applications are object storage aware and are using SCSP or S3 (Content Gateway) to interact with Storage.
Filtering metadata headers on objects can cause problems for other applications that know how to work with object metadata, such as Content UI, SwarmFS, and FileFly.
Because header filtering adds processing to each of Swarm's responses, best practice is to enable it only for a specific content delivery need:
Bandwidth needs to be conserved and as many bytes as possible need to be eliminated when serving content.
Enhanced security is needed and as little as possible about content and context needs to be revealed.
The target clients are web browsers instead of object storage-aware applications.
Important
Regardless of filtering, do not expose Swarm Storage directly on the Internet. Do not allow arbitrary requests, especially by unauthorized users. Some kind of HTTP request restrictions should always be present to prevent abuse by untrusted clients.
Header filtering is a Storage feature that takes effect dynamically, without a cluster restart. Choose between two filtering approaches:
Whitelist - list which non-required headers to retain, if any
Blacklist - list which non-required headers to remove, preserving all others
The lists are case-insensitive, and they can include system headers (such as "Castor-System-Owner").
Essential Headers
The following essential metadata headers are unaffected by blacklisting and are always included when they are present on an object:
Allow, Authentication-Info, Authorization, Cache-Control, Connection, Content-Length, Content-MD5, Content-Range, Content-Type, Date, Expires, Keep-Alive, Location, Server, Trailer, Transfer-Encoding
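The whitelist and blacklist rules above can be modeled in a few lines of Python. This is a rough sketch for reasoning about which headers a client would still see, not Swarm's implementation. The function name `filter_headers` and the dictionary representation are hypothetical. It assumes that essential headers also survive whitelisting, which the text states explicitly only for blacklisting, and that the Castor-System-Headers-Filtered marker is added whenever either method is active.

```python
# Illustrative model only; NOT Swarm's actual implementation.
ESSENTIAL = frozenset(h.lower() for h in (
    "Allow", "Authentication-Info", "Authorization", "Cache-Control",
    "Connection", "Content-Length", "Content-MD5", "Content-Range",
    "Content-Type", "Date", "Expires", "Keep-Alive", "Location",
    "Server", "Trailer", "Transfer-Encoding",
))
MARKER = "Castor-System-Headers-Filtered"


def filter_headers(headers, mode="none", names=()):
    """Return a filtered copy of `headers` (a dict of name -> value)."""
    if mode == "none":
        return dict(headers)
    if mode not in ("whitelist", "blacklist"):
        raise ValueError(f"unknown mode: {mode!r}")
    listed = {n.lower() for n in names}  # lists are case-insensitive
    kept = {}
    for name, value in headers.items():
        key = name.lower()
        if key in ESSENTIAL:
            keep = True  # assumption: essential headers always survive
        elif mode == "whitelist":
            keep = key in listed
        else:
            keep = key not in listed
        if keep:
            kept[name] = value
    kept[MARKER] = "True"  # marker added to every filtered response
    return kept


# Header values taken from the "Immutable" sample response in this page.
unfiltered = {
    "Castor-System-Cluster": "CAStorCluster",
    "Castor-System-Created": "Fri, 30 Nov 2018 16:31:04 GMT",
    "Content-Length": "0",
    "Last-Modified": "Fri, 30 Nov 2018 16:31:04 GMT",
    "Etag": '"7b9a25bcd48afac3156a89212859c62c"',
    "Volume": "b9ec90023e27941147b3ce6fb2ed54bd",
    "Date": "Fri, 30 Nov 2018 16:32:13 GMT",
    "Server": "CAStor Cluster/9.6.a",
    "Keep-Alive": "timeout=14400",
}
empty_whitelist = filter_headers(unfiltered, "whitelist", [])
```

With an empty whitelist, the model keeps only the essential headers plus the marker. A blacklist of `["etag"]` would instead remove only the Etag header and leave the other optional headers in place.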
Settings for Filtering
Filtering is disabled by default. These SCSP settings control which of the optional response headers the cluster returns:
Setting | Default | Description |
---|---|---|
scsp.filterResponseHeaders | none | Which method to use to filter HTTP response headers. Whitelist or blacklist setting must be defined before implementing that method. Valid values: none, whitelist, blacklist. SNMP: filterResponseHeaders |
scsp.filterResponseBlacklist | [] | Which headers to remove from HTTP GET and HEAD responses. List is comma-separated and case-insensitive. SNMP: filterResponseBlacklist |
scsp.filterResponseWhitelist | [] | Which headers to retain in HTTP GET and HEAD responses, removing all others. List is comma-separated and case-insensitive. Leave the brackets empty to have Swarm strip out all non-essential headers. SNMP: filterResponseWhitelist |
Set these values using the Storage UI, or use SNMP or cURL:
curl -i "http://$SCSP_HOST:91/api/storage/clusters/<cluster-name>/settings/scsp.filterResponseWhitelist" \
  -XPUT -d '{"value": ["key1","key2"]}'
curl -i "http://$SCSP_HOST:91/api/storage/clusters/<cluster-name>/settings/scsp.filterResponseHeaders" \
  -XPUT -d '{"value": "whitelist"}'
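The same two calls can be scripted. The following Python sketch builds the PUT requests with the standard library, using the URL layout and JSON body shown in the cURL commands above. The helper names, the host `swarm.example.com`, and the cluster name `cluster1` are hypothetical placeholders. It orders the requests so that the whitelist is defined before the filtering method is switched on, as the settings table requires.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_setting_request(host, cluster, setting, value, port=91):
    """Build (but do not send) a PUT request for one cluster setting."""
    url = (f"http://{host}:{port}/api/storage/clusters/"
           f"{quote(cluster, safe='')}/settings/{setting}")
    body = json.dumps({"value": value}).encode("utf-8")
    return Request(url, data=body, method="PUT")


def build_whitelist_requests(host, cluster, headers):
    """Define the list first, then enable the whitelist method."""
    return [
        build_setting_request(host, cluster,
                              "scsp.filterResponseWhitelist", list(headers)),
        build_setting_request(host, cluster,
                              "scsp.filterResponseHeaders", "whitelist"),
    ]


def apply(requests, timeout=10):
    """Send each request in order; urlopen raises on an HTTP error status."""
    for req in requests:
        with urlopen(req, timeout=timeout) as resp:
            resp.read()


# Hypothetical host and cluster name, for illustration only.
reqs = build_whitelist_requests("swarm.example.com", "cluster1",
                                ["Etag", "Last-Modified"])
# apply(reqs)  # uncomment to send against a real cluster
```

Building the requests separately from sending them makes it possible to inspect the URLs and bodies before anything reaches the cluster. Like `curl -d`, `urllib` sends the body with its default form-encoded content type unless a header is set explicitly.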
Sample Output
Following are examples of how responses can appear with and without filtering applied. Swarm includes the Castor-System-Headers-Filtered: True header with every response that has been filtered by a whitelist or blacklist.
Target of GET | Headers not Filtered | Headers Filtered |
---|---|---|
Missing | $ curl -i "172.16.15.180/11111111111111111111111111111111" HTTP/1.1 404 Not Found Castor-System-Error-Token: NotFound3 Castor-System-Error-Text: Existing object not found in cluster. Castor-System-Error-Code: 404 Castor-System-Cluster: CAStorCluster Content-Length: 83 Content-Type: text/html Date: Fri, 30 Nov 2018 16:27:36 GMT Server: CAStor Cluster/9.6.a Allow: HEAD, COPY, GET, SEND, PATCH, PUT, RELEASE, POST, HOLD, GEN, APPEND, DELETE Keep-Alive: timeout=14400 <html><body><h2>Swarm Storage Error</h2><br> Requested stream was not found</body></html> | $ curl -i "172.16.15.179/11111111111111111111111111111111" HTTP/1.1 404 Not Found Castor-System-Headers-Filtered: True Content-Length: 83 Content-Type: text/html Date: Fri, 30 Nov 2018 16:29:22 GMT Server: CAStor Cluster/9.6.singleip Allow: HEAD, COPY, GET, SEND, PATCH, PUT, RELEASE, POST, HOLD, GEN, APPEND, DELETE Keep-Alive: timeout=14400 <html><body><h2>Swarm Storage Error</h2><br> Requested stream was not found</body></html> |
Immutable | $ curl -i "172.16.15.178/7b9a25bcd48afac3156a89212859c62c" HTTP/1.1 200 OK Castor-System-Cluster: CAStorCluster Castor-System-Created: Fri, 30 Nov 2018 16:31:04 GMT Content-Length: 0 Last-Modified: Fri, 30 Nov 2018 16:31:04 GMT Etag: "7b9a25bcd48afac3156a89212859c62c" Volume: b9ec90023e27941147b3ce6fb2ed54bd Date: Fri, 30 Nov 2018 16:32:13 GMT Server: CAStor Cluster/9.6.a Keep-Alive: timeout=14400 | $ curl -i "172.16.15.179/7b9a25bcd48afac3156a89212859c62c" HTTP/1.1 200 OK Content-Length: 0 Castor-System-Headers-Filtered: True Date: Fri, 30 Nov 2018 16:31:25 GMT Server: CAStor Cluster/9.6.singleip Keep-Alive: timeout=14400 |
Named | $ curl -i "172.16.15.180/bucket/stream?domain=domain" -I HTTP/1.1 200 OK Castor-System-CID: 84c1cbf7d33aec1feec4d4dd11225b87 Castor-System-Cluster: CAStorCluster Castor-System-Created: Fri, 30 Nov 2018 16:33:44 GMT Castor-System-Name: stream Castor-System-Version: 1543595624.202 Content-Length: 0 Last-Modified: Fri, 30 Nov 2018 16:33:44 GMT Etag: "46ce386cdc13828d7d8d68ee20aac58d" Castor-System-Path: /domain/bucket/stream Castor-System-Domain: domain Volume: 0a9a7ed07b5f86520b096fb0ef824846 Date: Fri, 30 Nov 2018 16:34:21 GMT Server: CAStor Cluster/9.6.a Keep-Alive: timeout=14400 | $ curl -i "172.16.15.179/bucket/s?domain=x" HTTP/1.1 200 OK Content-Length: 0 Castor-System-Headers-Filtered: True Date: Fri, 30 Nov 2018 16:33:48 GMT Server: CAStor Cluster/9.6.singleip Keep-Alive: timeout=14400 |
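A quick way to confirm that filtering is active is to look for the Castor-System-Headers-Filtered marker in a captured response. The Python sketch below parses `curl -i` output for the "Immutable" row of the table and reports which headers were removed. The function names are hypothetical, and the parsing assumes a single status line followed by a standard header block.

```python
from email.parser import Parser

MARKER = "Castor-System-Headers-Filtered"


def parse_response_headers(raw):
    """Parse the header block of `curl -i` output into a message object."""
    head = raw.replace("\r\n", "\n").split("\n\n", 1)[0]
    _status_line, _, header_text = head.partition("\n")
    return Parser().parsestr(header_text, headersonly=True)


def was_filtered(raw):
    """True if the response carries the filtered marker set to True."""
    value = parse_response_headers(raw).get(MARKER, "")
    return value.strip().lower() == "true"


def removed_headers(unfiltered_raw, filtered_raw):
    """Lowercased header names present before filtering but not after."""
    before = {k.lower() for k in parse_response_headers(unfiltered_raw).keys()}
    after = {k.lower() for k in parse_response_headers(filtered_raw).keys()}
    return sorted(before - after)


# Responses copied from the "Immutable" row of the table above.
UNFILTERED = """HTTP/1.1 200 OK
Castor-System-Cluster: CAStorCluster
Castor-System-Created: Fri, 30 Nov 2018 16:31:04 GMT
Content-Length: 0
Last-Modified: Fri, 30 Nov 2018 16:31:04 GMT
Etag: "7b9a25bcd48afac3156a89212859c62c"
Volume: b9ec90023e27941147b3ce6fb2ed54bd
Date: Fri, 30 Nov 2018 16:32:13 GMT
Server: CAStor Cluster/9.6.a
Keep-Alive: timeout=14400
"""

FILTERED = """HTTP/1.1 200 OK
Content-Length: 0
Castor-System-Headers-Filtered: True
Date: Fri, 30 Nov 2018 16:31:25 GMT
Server: CAStor Cluster/9.6.singleip
Keep-Alive: timeout=14400
"""

removed = removed_headers(UNFILTERED, FILTERED)
```

For this pair of responses, `removed` lists the two Castor-System headers plus Etag, Last-Modified, and Volume, while the essential headers remain in both.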
© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.