Content-MD5 checksums provide an end-to-end message integrity check of the content (excluding metadata) as it is sent to and returned from Swarm. A proxy or client can check the Content-MD5 header to detect modifications to the entity-body while in transit. A client can provide this header to have Swarm compute and check the digest as it stores or returns the object data.

See SCSP Headers.

Client-Provided Content-MD5

During a POST or PUT, the client can provide the following Content-MD5 header as specified in section 14.15 of the HTTP/1.1 RFC:

...

Where md5-digest is the base64 encoding of the 128-bit MD5 digest (see RFC 1864 for more information).
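As a minimal sketch of how a client might derive that value, the Python snippet below base64-encodes the raw 16-byte MD5 digest (not its hexadecimal form). The function name content_md5 is hypothetical and is not part of any Swarm API.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the base64 encoding of the 128-bit MD5 digest of body."""
    digest = hashlib.md5(body).digest()  # 16 raw bytes, not the hex string
    return base64.b64encode(digest).decode("ascii")


print(content_md5(b"hello"))  # XUFAKrxLKna5cZ2REBfFkg==
```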

...

  • If this header is present, Swarm computes an MD5 digest during data transfer and then compares the computed digest to the digest provided in the header.

  • If the digests match, the Content-MD5 value is stored with the object and returned on subsequent GET or HEAD requests.

  • If the hashes do not match, Swarm returns a 400 Bad Request error response, abandons the object, and closes the client connection.
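The client side of this exchange can be sketched in Python as follows. The snippet only prepares a request and does not send it. The URL, bucket, and object name are hypothetical placeholders, and build_write_request is an illustrative helper, not a Swarm client API.

```python
import base64
import hashlib
import urllib.request

# Hypothetical endpoint; substitute a real Swarm node or Gateway address.
URL = "http://swarm.example.com/mybucket/report.txt"


def build_write_request(url: str, body: bytes) -> urllib.request.Request:
    """Prepare (but do not send) a POST that asks Swarm to verify the body."""
    digest = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-MD5": digest},
    )


request = build_write_request(URL, b"hello")
# urllib normalizes the case of header names; HTTP header names are case-insensitive.
print(request.get_header("Content-md5"))
```

If such a request were sent with urllib.request.urlopen, a digest mismatch would be expected to surface as urllib.error.HTTPError with status 400, in line with the behavior described above.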

Swarm-Provided Content-MD5

Another way to associate a Content-MD5 value with an object is to have Swarm compute the Content-MD5 for the body data of the request. To do this, include the gencontentmd5 query argument in the request. Swarm returns the Content-MD5 as a header in the 201 Created response. Once computed, the Content-MD5 value is stored with the object and returned as a response header for any subsequent GET or HEAD requests. Note: the gencontentmd5 query argument replaces use of the "Expect: Content-MD5" request header, which is deprecated per RFC 2731. (v9.2)
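A small sketch of adding the query argument to a request URL in Python is shown below. It assumes gencontentmd5 is passed as a valueless flag and keeps any query arguments already present. The host, bucket, and domain values are hypothetical, and with_gencontentmd5 is an illustrative helper.

```python
from urllib.parse import urlsplit, urlunsplit


def with_gencontentmd5(url: str) -> str:
    """Append the valueless gencontentmd5 query argument to url."""
    parts = urlsplit(url)
    query = f"{parts.query}&gencontentmd5" if parts.query else "gencontentmd5"
    return urlunsplit(parts._replace(query=query))


# Hypothetical host, bucket, and domain names.
print(with_gencontentmd5("http://swarm.example.com/mybucket/report.txt"))
print(with_gencontentmd5("http://swarm.example.com/mybucket/report.txt?domain=example.com"))
```

A client can then compare the Content-MD5 header in the 201 Created response with a digest computed locally over the same body.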

...

For details about Range headers, see section 14.35 (Range) in the HTTP/1.1 RFC.

Info

Validation Failures

When an SCSP read operation requests Content-MD5 hash validation and the hashes do not match, a storage node is temporarily removed from the Gateway's connection pool because of how Swarm reports a hash validation failure.

Storing Content-MD5 Headers

Content-MD5 headers are stored with the object metadata and returned on all subsequent GET or HEAD requests.

  • If a Content-MD5 header is included with a GET request, Swarm computes the hash as the bytes are read, regardless of whether the header was originally stored with the object.

  • If the computed and provided hashes do not match, the connection is closed before the last bytes are transmitted, which is the standard way to indicate something went wrong with the transfer.
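On the client, a connection closed before the final bytes arrive usually shows up as a short read. The sketch below assumes Python's http.client, whose HTTPResponse.read() raises IncompleteRead when the connection closes before the full body is received. read_verified_body is a hypothetical helper name.

```python
import http.client


def read_verified_body(response) -> bytes:
    """Read the whole body, treating an early close as a failed transfer."""
    try:
        return response.read()
    except http.client.IncompleteRead as exc:
        # The connection closed before the complete body arrived.
        raise IOError(
            f"transfer aborted after {len(exc.partial)} bytes; "
            "possible Content-MD5 mismatch"
        ) from exc
```

A short read can have other causes, such as a network failure, so a caller may want to retry before treating the object as corrupt.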

Content-MD5 and Replication

When providing the gencontentmd5 query argument in a request on a replicated object, the following applies:

  • On a write request (POST, PUT, COPY, or APPEND), the Content-MD5 is calculated, stored with the object, and returned as a response header for that write operation.

  • The Content-MD5 is always returned for any GET or HEAD request on an object written with the gencontentmd5 query argument.

  • When ?gencontentmd5 is included on a range read (a GET request with the Range header), Swarm suppresses any stored Content-MD5 from the response headers and instead returns a Content-MD5 for the requested range as a trailing header.

Content-MD5 and Erasure-Coding

When providing the gencontentmd5 query argument in a request on an erasure-coded object, the following applies:

  • The APPEND operation is no longer supported. An APPEND request that includes the gencontentmd5 query argument returns a 400 Bad Request error response.

  • The COPY operation is supported only if the gencontentmd5 query argument was provided on the existing object's write; otherwise, the COPY operation fails.

  • For a range read (a GET request with the Range header), Swarm suppresses any stored Content-MD5 from the response headers and instead returns a Content-MD5 for the requested range as a trailing header.

Example Download Verification

You can verify the integrity of a download from Swarm by comparing the Content-MD5 published in an object's metadata with the base64-encoded MD5 digest of the downloaded object. The example below does this with the openssl utility:

Code Block
$ curl -sI https://support.cloud.datacore.com/tools/swarm-support-tools.tgz
HTTP/1.1 200 OK
Date: Tue, 10 Jan 2023 19:12:40 GMT
Gateway-Request-Id: A0A1788FF937057D
Server: CAStor Cluster/15.0.1
Via: 1.1 support.cloud.datacore.com (Cloud Gateway SCSP/7.10.2)
Gateway-Protocol: scsp
CAStor-application: CaringoTechSupport
Castor-System-CID: 664727e752ca7a48092c73699e909578
Castor-System-Cluster: gem.tx.caringo.com
Castor-System-Created: Mon, 09 Jan 2023 18:25:17 GMT
Castor-System-Name: swarm-support-tools.tgz
Castor-System-Version: 1673288717.693
Content-Type: application/x-www-form-urlencoded
Last-Modified: Mon, 09 Jan 2023 18:25:17 GMT
X-Last-Modified-By-Meta: tools+swarm
X-Owner-Meta: tools+swarm
Manifest: ec
ETag: "b5dea5b4048f21a0f99880873fa64865"
Castor-System-Path: /support.cloud.datacore.com/tools/swarm-support-tools.tgz
Castor-System-Domain: support.cloud.datacore.com
Volume: 1dc47666d09cdb27bd59cbb731d046ca
Content-MD5: EF8xHMmzt3xNjpksfRLo+A==
Content-Length: 28398358

$ curl -O https://support.cloud.datacore.com/tools/swarm-support-tools.tgz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 27.0M  100 27.0M    0     0  2928k      0  0:00:09  0:00:09 --:--:-- 2826k

$ cat swarm-support-tools.tgz | openssl dgst -md5 -binary | openssl enc -base64
EF8xHMmzt3xNjpksfRLo+A==
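
The same check can be sketched in Python for environments where openssl is not available. The file is hashed in chunks so that large downloads need not fit in memory. The function names are hypothetical, and the header value must be supplied by the caller, for example from the HEAD response shown above.

```python
import base64
import hashlib


def file_content_md5(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the base64-encoded MD5 digest of the file at path."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return base64.b64encode(md5.digest()).decode("ascii")


def matches_header(path: str, header_value: str) -> bool:
    """Compare a local file's digest with a Content-MD5 header value."""
    return file_content_md5(path) == header_value.strip()
```

For the download above, matches_header("swarm-support-tools.tgz", "EF8xHMmzt3xNjpksfRLo+A==") would be expected to return True, given that the openssl pipeline printed the same value.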