Created 1/20/2016 pat.ray · Updated 3/21/2016 pat.ray
Problem
I'm implementing an interface that does not provide the content length of streams at the start of a write, or I'm writing code to write Swarm streams from a streaming input of unknown length.
We've seen two specific cases:
- The HDFS FileSystem interface, which uses a create/write block/write block/.../close protocol for writing streams. The create call doesn't include a length parameter.
- Streaming data input for audio or video.
Solution
We can use the following four features of Swarm to solve the problem:
- Chunked transfers - Swarm accepts streams without a Content-length header if sent with Transfer-encoding: chunked.
- ECP - Swarm writes all streams sent with chunked encoding as erasure coded, so Elastic Content Protection must be enabled and configured in the target Swarm cluster.
- Lifepoints - These can include reps specs, so we can use those to convert from erasure coding to straight replicas.
- Swarm Management API - This lets us query Swarm for the ec.minStreamSize.
Naive Approaches
Spooling
We could write all streams to non-Swarm storage and then, on stream completion, read from that storage and POST the stream of known length to Swarm.
If we're using a synchronous spooler, we'll add latency to the transfer. If we're using an asynchronous spooler, we'll have to implement something to track the result of writing to Swarm after the write to the spool completes, and we'll have to manage data loss within the spooler.
Both implementations suffer from the limits of the spool's available storage, both for a single stream and in aggregate.
Writing streams Transfer-Encoding:chunked without cleanup
We want to use chunked transfers and therefore store streams erasure-coded. This means, however, that small streams will be erasure-coded, which wastes disk and index memory space.
Better Answer
Write streams Transfer-Encoding: chunked and then, on completion of the write, use COPY with a lifepoint to convert small streams to straight reps. COPY using a terminal reps= lifepoint will strip off the implicit EC and recode the stream as reps.
(Note that the recoding occurs asynchronously, either in the background immediately after the response or at the next HP examination.)
How large is a small stream?
Ideally, we want to use the default cluster setting ec.minStreamSize (in Swarm 8.0 and prior) as our threshold. Swarm 8.0 has a nifty hidden feature that allows you to read that: the Swarm Management API, available on port 91 of a Swarm node by default, can retrieve various configuration settings. Specifically, we want the configuration setting at /api/storage/nodes/<node-ip>/settings/ec.minStreamSize.
EXAMPLE
The following curl command retrieves the value of ec.minStreamSize:

```
curl -v "192.168.3.84:91/api/storage/nodes/192.168.3.84/settings/ec.minStreamSize" -L
* About to connect() to 192.168.3.84 port 91 (#0)
*   Trying 192.168.3.84... connected
* Connected to 192.168.3.84 (192.168.3.84) port 91 (#0)
> GET /api/storage/nodes/192.168.3.84/settings/ec.minStreamSize HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: 192.168.3.84:91
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Length: 286
< Server: TwistedWeb/12.2.0
< Allow: GET,HEAD,OPTIONS,PUT
< Date: Wed, 20 Jan 2016 23:18:17 GMT
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept, Authorization
< Content-Type: application/json
<
* Connection #0 to host 192.168.3.84 left intact
* Closing connection #0
{
  "_links": {
    "self": {
      "href": "http://192.168.3.84:91/api/storage/nodes/88b64d590c48d0de/settings/ec.minStreamSize"
    },
    "curies": [
      {
        "href": "/api-docs/rels/{rel}",
        "name": "waggle"
      }
    ],
    "schema": {
      "href": "http://192.168.3.84:91/api-docs/schema/api.storage.setting.json"
    }
  },
  "value": 1000000
}
```

[The JSON above was reformatted by hand to make it readable.]
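The same lookup can be done from code. The sketch below (hypothetical function names) fetches the setting with stdlib urllib and pulls out the top-level "value" key shown in the response above.

```python
import json
import urllib.request

def parse_value(body):
    """Extract the setting from a Swarm Management API settings response,
    which carries it under a top-level "value" key."""
    return int(json.loads(body)["value"])

def get_ec_min_stream_size(node_ip, port=91):
    """Fetch ec.minStreamSize from the Swarm Management API (port 91 by default)."""
    url = ("http://%s:%d/api/storage/nodes/%s/settings/ec.minStreamSize"
           % (node_ip, port, node_ip))
    with urllib.request.urlopen(url) as resp:
        return parse_value(resp.read())
```

For the cluster above, get_ec_min_stream_size("192.168.3.84") would return 1000000.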
Complete Example in curl
The example below uses the following curl options:
- --upload-file (alias -T) Path|- : Transfers a file to the destination URL. The example uses -, signifying standard input. Using - also tells curl to use chunked transfer and to write to the named resource in the destination URL without appending the input Path.
- -X POST : Tells curl to use POST rather than PUT.
- -X COPY : Tells curl to use COPY.
- -H "<header name>:<header value>" : Tells curl to send a header in the request.
- -L and --post301 : Tells curl to follow redirects and do so with a POST.
- -v : Turns on curl verbose mode.
Original write

```
cat file.txt | curl -v "192.168.3.84/bucket1/file.txt?domain=d1" -T - -L --post301 -X POST
* About to connect() to 192.168.3.84 port 80 (#0)
*   Trying 192.168.3.84... connected
* Connected to 192.168.3.84 (192.168.3.84) port 80 (#0)
> POST /bucket1/file.txt?domain=d1 HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: 192.168.3.84
> Accept: */*
> Transfer-Encoding: chunked
> Expect: 100-continue
>
[...]
< HTTP/1.1 100 Continue
< Date: Fri, 22 Jan 2016 20:27:18 GMT
< Server: CAStor Cluster/8.0.0
< Content-Length: 0
< HTTP/1.1 201 Created
< Location: http://192.168.3.87:80/bucket1/file.txt?domain=d1
< Location: http://192.168.3.85:80/bucket1/file.txt?domain=d1
< Volume: 3ba2966990b0073f7af0c6120f2e8ae9
< Volume: 15e2cc5efd61e36b07312e94bf169852
< Manifest: ec
< Last-Modified: Fri, 22 Jan 2016 20:27:18 GMT
< Entity-MD5: REMbUcTevWQCOwX6PHvR5w==
< Stored-Digest: 44431b51c4debd64023b05fa3c7bd1e7
< Castor-System-Encoding: zfec 1.4(2, 1, 524288, 200000000)
< Castor-System-Version: 1453494438.442
< Etag: "64ba617fb08821b02cb0264438f37058"
< Replica-Count: 2
< Date: Fri, 22 Jan 2016 20:27:18 GMT
< Server: CAStor Cluster/8.0.0
< Content-Length: 46
< Content-Type: text/html
< Keep-Alive: timeout=14400
```
In the above, note that
- The POST was done with a chunked transfer: Transfer-Encoding: chunked is in the request;
- The POST didn't include a Content-Length;
- The object was stored erasure-coded: Manifest: ec is in the response.
Retrieving the file size
We'd be able to track the file length if we were doing this in code, but for illustration we'll get the object length by making a HEAD request to Swarm:

```
curl -I "192.168.3.84/bucket1/file.txt?domain=d1" -L --post301
[...]
HTTP/1.1 200 OK
Castor-System-CID: 4f4a83b8a9c9114ae55fd5dd76931b50
Castor-System-Cluster: pats.cluster
Castor-System-Created: Fri, 22 Jan 2016 20:27:18 GMT
Castor-System-Name: file.txt
Castor-System-Version: 1453494438.442
Last-Modified: Fri, 22 Jan 2016 20:27:18 GMT
Manifest: ec
Content-Length: 8835
Etag: "64ba617fb08821b02cb0264438f37058"
Castor-System-Path: /d1/bucket1/file.txt
Castor-System-Domain: d1
Volume: 15e2cc5efd61e36b07312e94bf169852
Date: Fri, 22 Jan 2016 20:42:09 GMT
Server: CAStor Cluster/8.0.0
Keep-Alive: timeout=14400
```

Again, note that the stream is stored erasure-coded (Manifest: ec). The stream length is 8,835 bytes (Content-Length: 8835).
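In code, the HEAD and the size check against ec.minStreamSize might look like the sketch below (hypothetical function names; it assumes the HEAD response carries a Content-Length header, as in the transcript above).

```python
import http.client

def stream_length(host, path):
    """HEAD the object and return its Content-Length in bytes."""
    conn = http.client.HTTPConnection(host)
    conn.request("HEAD", path)
    resp = conn.getresponse()
    resp.read()
    return int(resp.getheader("Content-Length"))

def needs_conversion(length, ec_min_stream_size):
    """A stream below ec.minStreamSize is 'small' and should be
    recoded from erasure coding to straight replicas."""
    return length < ec_min_stream_size
```

For the 8,835-byte stream above against the 1,000,000-byte default, needs_conversion returns True.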
Converting to straight replicas
From our previous call to retrieve ec.minStreamSize from the Swarm Management API, we know the stream we just wrote is much smaller than the default minimum for EC. We therefore want to convert the stream to use straight replicas.
Note that we can use the Replica-Count value (2) returned on the original POST as our replicas value in our conversion to straight reps.
We'll COPY the original stream, adding a terminal lifepoint with a policy of reps=2 (Lifepoint: [] reps=2) to effect the conversion.
```
curl -v "192.168.3.84/bucket1/file.txt?domain=d1" -L --post301 -X COPY -L -H "Lifepoint: [] reps=2"
* About to connect() to 192.168.3.84 port 80 (#0)
*   Trying 192.168.3.84... connected
* Connected to 192.168.3.84 (192.168.3.84) port 80 (#0)
> COPY /bucket1/file.txt?domain=d1 HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: 192.168.3.84
> Accept: */*
> Lifepoint: [] reps=2
[...]
< HTTP/1.1 201 Created
< Location: http://192.168.3.85:80/bucket1/file.txt?domain=d1
< Location: http://192.168.3.86:80/bucket1/file.txt?domain=d1
< Volume: 15e2cc5efd61e36b07312e94bf169852
< Volume: 7e965614b6c9c4dc04a29a411b068348
< Manifest: ec
< Last-Modified: Fri, 22 Jan 2016 20:51:40 GMT
< Entity-MD5: I+JK/Xlw5ogEV4/2oFbWSg==
< Stored-Digest: 23e24afd7970e68804578ff6a056d64a
< Castor-System-Encoding: zfec 1.4(2, 1, 524288, 200000000)
< Castor-System-Version: 1453495900.898
< Etag: "d3d98eea36a66c23c9d89a8f5391e4f1"
< Replica-Count: 2
< Date: Fri, 22 Jan 2016 20:51:41 GMT
< Server: CAStor Cluster/8.0.0
< Content-Length: 51
< Content-Type: text/html
< Keep-Alive: timeout=14400
<
<html><body>Stream metadata updated</body></html>
```
The response still indicates the stream is erasure-coded, since it still contains a Manifest: ec header. If we HEAD the stream, however, we can see that the stream is no longer erasure-coded.
```
curl -I "192.168.3.84/bucket1/file.txt?domain=d1" -L --post301
HTTP/1.1 200 OK
Castor-System-CID: 4f4a83b8a9c9114ae55fd5dd76931b50
Castor-System-Cluster: pats.cluster
Castor-System-Created: Fri, 22 Jan 2016 20:51:40 GMT
Castor-System-Name: file.txt
Castor-System-Version: 1453495900.903
Content-Length: 8835
Last-Modified: Fri, 22 Jan 2016 20:51:40 GMT
Lifepoint: [] reps=2
Etag: "84997c119a18a55cb904cf009f9fda85"
Castor-System-Path: /d1/bucket1/file.txt
Castor-System-Domain: d1
Volume: 15e2cc5efd61e36b07312e94bf169852
Date: Fri, 22 Jan 2016 20:54:58 GMT
Server: CAStor Cluster/8.0.0
Keep-Alive: timeout=14400
```
Note that the response no longer contains the Manifest header.
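The conversion step can likewise be issued from code. A minimal sketch (hypothetical function names; redirect handling, which curl does with -L --post301, is omitted):

```python
import http.client

def lifepoint_header(reps):
    """Build a terminal lifepoint policy that recodes the stream as replicas."""
    return "[] reps=%d" % reps

def convert_to_reps(host, path, reps=2):
    """COPY the object onto itself with a terminal reps= lifepoint.

    Swarm strips the implicit EC encoding and recodes the stream as
    straight replicas; the recoding itself happens asynchronously.
    """
    conn = http.client.HTTPConnection(host)
    conn.request("COPY", path, headers={"Lifepoint": lifepoint_header(reps)})
    resp = conn.getresponse()
    resp.read()
    return resp.status
```

The reps value could come from the Replica-Count header returned on the original POST, as described above.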