TechNote 2015002: Using cURL with Swarm
Document Identifier: TechNote 2015002 (replaces TechNote 2011004)
Document Date: July 24, 2015
Software Package: SWARM
Version: Swarm 7 or later
Abstract
This technical note provides examples for using cURL with Swarm version 7.0 or later. cURL can be useful for performing specific tasks, such as executing all SCSP methods, including those that require authentication, and can assist you with verifying that your client and Swarm are working properly. See the SDK Overview for detailed information about programming for Swarm.
cURL is a free, open source command-line utility that runs in a variety of operating systems, including Windows and UNIX. The functionality of cURL is also made available as a programming library called libcurl. For more information about cURL, see one of the following resources:
http://curl.haxx.se/
http://curl.haxx.se/docs/manpage.html
Assumptions
This Tech Note assumes the following:
You are familiar with using cURL.
You are using cURL version 7.20.1 or later.
You are familiar with the Simple Content Storage Protocol (SCSP), the protocol used by Swarm clients.
Your cluster runs Swarm version 7.0 or later.
You can communicate with at least one node in the cluster using a computer that runs either Windows or Linux.
Getting Started
To make sure you can communicate with your Swarm cluster, open a command prompt window and ping a node in the cluster:
ping cluster-node-ip
For example:
ping 172.16.0.32
If the node does not respond, try to ping another node IP address. Do not continue until you can successfully ping a node in your cluster. A successful ping verifies ICMP reachability, not that the node accepts HTTP requests on port 80; conversely, a network can block ICMP while the node is still available on port 80. Even so, ping is a good, simple first test of reachability. To test port 80 directly, you could run the following:
telnet 172.16.0.32 80
How to Correctly Use cURL
Even if you are accustomed to using cURL to communicate with a web server, you might not know some of the differences necessary to communicate with Swarm. Although Swarm uses the HTTP protocol, Swarm uses more of the HTTP specification than does a typical web server.
Based on our experience, DataCore provides the following recommendations for avoiding common cURL mistakes:
Use the correct cURL version
SCSP and HTTP method names
Use only plain ASCII text
Send binary data to Swarm in binary format using --data-binary
Enclose the location (that is, URL) in single quotes
Use --anyauth and --location-trusted with authentication (Swarm 5.0 and later)
Understand how to interpret cURL output
Use a recent cURL version
Older versions of cURL do not support --post301, the option that makes cURL follow redirects in the cluster using the original HTTP method (without --post301, an HTTP GET is substituted for the original SCSP method). As a result, with an older version any request that is redirected from the first node of contact fails.
DataCore recommends you use the most recent available version of cURL. Some operating systems, such as CentOS 5.5, have older versions of cURL that do not work with Swarm. Before continuing, verify the cURL version you are using and upgrade it if necessary.
To verify your cURL version, enter curl --version. Follow the documentation provided with your operating system, or the cURL man page, for more information about upgrading cURL.
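The version check can also be scripted. The following is a minimal sketch, not an official tool: the helper name curl_is_recent is hypothetical, and 7.20.1 is the minimum version this Tech Note assumes.

```shell
# Hypothetical helper: succeeds if the installed cURL is version 7.20.1 or later.
curl_is_recent() {
  min="7.20.1"
  # The first line of `curl --version` looks like: curl 7.29.0 (x86_64-...) libcurl/...
  ver=$(curl --version | head -n 1 | awk '{print $2}')
  # Sort both versions numerically, field by field; the minimum must sort first.
  lowest=$(printf '%s\n%s\n' "$min" "$ver" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
  [ "$lowest" = "$min" ]
}

# Usage:
#   curl_is_recent && echo "cURL is recent enough" || echo "Upgrade cURL"
```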
SCSP and HTTP method names
When you execute methods using cURL, you must use HTTP method names; however, this Tech Note uses SCSP method names except when showing cURL command examples. For more information about SCSP and HTTP method names, see SCSP Essentials and SCSP Methods.
Plain text only
cURL accepts commands in plain ASCII text only; for example, you must use straight quotes (') and not "curly" quotes (´). Some characters in this Tech Note might paste into a command prompt as non-ASCII text characters. Check your command carefully for non-ASCII characters before you execute it.
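One way to check a pasted command for stray characters is sketched below. The helper name has_non_ascii is hypothetical; the check assumes a grep that honors the C locale, where only ASCII characters count as printable.

```shell
# Hypothetical helper: succeeds if the given text contains any character
# that is neither printable ASCII nor ASCII white space.
has_non_ascii() {
  printf '%s' "$1" | LC_ALL=C grep -q '[^[:print:][:space:]]'
}

# Usage:
#   has_non_ascii "curl -I 'http://172.16.0.32/'" || echo "command is plain ASCII"
```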
Use --data-binary and not -d
To send data to Swarm, always use the --data-binary option because it sends binary data to Swarm in binary format. If you are used to using cURL to send data to a web server, you might make the mistake of using the -d option, which translates binary data to text. -d is intended to send HTML form-posted data of Content-Type application/x-www-form-urlencoded, which works for a web server but not always for Swarm.
Always enclose the location in quotes
The location, or the URL to which the command is being sent, must be enclosed in single quotes. This prevents special characters in the URL, such as ? and & used in query arguments, from being interpreted as commands by the command shell.
When authentication is required, use --anyauth and --location-trusted
Some of the examples later in this Tech Note require authentication. Whenever you use authentication, you must specify --anyauth before providing an authorized user name and password. You must also use --location-trusted instead of -L so that cURL resends your credentials to the node to which the request is redirected.
The examples in this Tech Note always use --location-trusted for consistency.
Interpreting cURL -v (verbose) Output
Most of the examples in this Tech Note use cURL with the -v option, which displays “verbose” output. cURL’s verbose output can indicate communication problems with the cluster and other important diagnostic information.
An example follows; the significant lines of output are explained after the example.
* About to connect() to 172.16.0.32 port 80 (#0)
* Trying 172.16.0.32... connected
* Connected to 172.16.0.32 (172.16.0.32) port 80 (#0)
> POST / HTTP/1.1
> User-Agent: ...
> Host: 172.16.0.32
> Accept: */*
> Content-type: text/xml
> Content-Length: 2
< HTTP/1.1 301 Moved Permanently
HTTP/1.1 301 Moved Permanently
< Date: Wed, 16 Mar 2015 23:52:52 GMT
Date: Wed, 16 Mar 2015 23:52:52 GMT
< Server: CAStor Cluster/7.5.1
Server: CAStor Cluster/7.5.1
< Location: http://172.16.0.34:80/?auth=f2615d9233b2632c7ffa1295498885fb
Location: http://172.16.0.34:80/?auth=f2615d9233b2632c7ffa1295498885fb
< Content-Length: 2
Content-Length: 2
>
< HTTP/1.1 201 Created
HTTP/1.1 201 Created
< Content-UUID: b64d2b22cd2ca0c8300ffecb337181e5
Content-UUID: b64d2b22cd2ca0c8300ffecb337181e5
< Location: http://172.16.0.32:80/b64d2b22cd2ca0c8300ffecb337181e5
Location: http://172.16.0.32:80/b64d2b22cd2ca0c8300ffecb337181e5
< Etag: "b64d2b22cd2ca0c8300ffecb337181e5"
Etag: "b64d2b22cd2ca0c8300ffecb337181e5"
< Last-Modified: Wed, 16 Mar 2015 23:55:41 GMT
Last-Modified: Wed, 16 Mar 2015 23:55:41 GMT
< Entity-MD5: VpehP5grtghJpxy0db29Ig==
Entity-MD5: VpehP5grtghJpxy0db29Ig==
< Stored-Digest: 5697a13f982bb60849a71cb475bdbd22
Stored-Digest: 5697a13f982bb60849a71cb475bdbd22
< Date: Wed, 16 Mar 2015 23:55:42 GMT
Date: Wed, 16 Mar 2015 23:55:42 GMT
< Server: CAStor Cluster/7.5.1
Server: CAStor Cluster/7.5.1
< Content-Length: 44
Content-Length: 44
< Content-Type: text/html
Content-Type: text/html
<
* Connection #0 to host 172.16.0.32 left intact
* Closing connection #0
<html><body>New stream created</body></html>
The following table explains the significant lines in the cURL output:
Line | Meaning |
---|---|
* Connected to 172.16.0.32 (172.16.0.32) port 80 (#0) | Confirms the IP address and port of the node with which you are communicating. |
> POST / HTTP/1.1 | Displays the HTTP method name of the request (in this case, POST for the SCSP WRITE method). |
> User-Agent: ... | Displays the name and version information about your cURL client. This value is intentionally omitted in the examples in this Tech Note. |
< HTTP/1.1 301 Moved Permanently | Does not always display, but is normal if it does. This message indicates that the primary access node (PAN) redirected your request to another node in the cluster. |
< HTTP/1.1 201 Created | Indicates the request was successful. |
< Location: http://172.16.0.32:80/b64d2b22cd2ca0c8300ffecb337181e5 | Displays the URL to the object. |
< Server: CAStor Cluster/7.5.1 | Displays the Swarm version running on the responding node. |
<html><body>New stream created</body></html> | Swarm’s response, which indicates the method completed successfully. |
Summary of Selected Commands
Following are selected sample commands discussed in this Tech Note. The commands show how to execute all seven SCSP methods on mutable unnamed objects (also called anchor streams) and on named objects.
Notice the similarities between the commands for unnamed mutable objects and named objects; the differences are:
How you specify the location of the object itself.
All SCSP methods on unnamed mutable objects require the ?alias[=yes] query argument. This applies only to Swarm versions earlier than 7.x; Swarm 7.x and later does not require the ?alias query argument to address existing unnamed alias streams.
You can use the same commands for unnamed immutable objects as well (minus the ?alias query argument); however, the only SCSP methods supported by immutable objects are WRITE, DELETE, READ, and INFO.
SCSP WRITE
Unnamed mutable:
Named (requires a previously created domain and bucket; see the examples later in this Tech Note):
SCSP APPEND
Unnamed mutable:
Named:
SCSP COPY
Unnamed mutable:
Named:
SCSP UPDATE
Unnamed mutable:
Named:
SCSP READ
Unnamed mutable:
Named:
SCSP INFO
Unnamed mutable:
Named:
SCSP DELETE
Unnamed mutable:
Named:
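As a compact sketch of the seven methods listed above, the following hypothetical helper (scsp) wraps cURL with the options this Tech Note recommends. The node IP address, UUID, bucket, and object names in the usage comments are placeholders, not values from a real cluster.

```shell
# Hypothetical helper: send one request with the recommended options.
# $1 = HTTP method name, $2 = object URL, remaining arguments = extra cURL options.
scsp() {
  method=$1; url=$2; shift 2
  case "$method" in
    HEAD) set -- -I "$@" ;;               # INFO: -I sends HEAD and prints only headers
    *)    set -- -i -X "$method" "$@" ;;  # other methods: name the method with -X
  esac
  curl --post301 --location-trusted "$@" "$url"
}

# Usage sketch for an unnamed mutable object:
#   WRITE   scsp POST   'http://172.16.0.32/?alias' -H 'Content-Type: text/html' --data-binary '<h1>Hello World</h1>'
#   APPEND  scsp APPEND 'http://172.16.0.32/uuid' --data-binary '<p>More</p>'
#   COPY    scsp COPY   'http://172.16.0.32/uuid' -H 'x-ExampleCorp-meta-color: blue'
#   UPDATE  scsp PUT    'http://172.16.0.32/uuid' --data-binary '<h1>New</h1>'
#   READ    scsp GET    'http://172.16.0.32/uuid'
#   INFO    scsp HEAD   'http://172.16.0.32/uuid'
#   DELETE  scsp DELETE 'http://172.16.0.32/uuid'
# For a named object, use a URL such as 'http://172.16.0.32/bucket/object' instead.
```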
Working With Unnamed Objects
This section provides examples of using cURL to work with unnamed objects. The term unnamed object includes both immutable objects and mutable objects (which are also referred to as anchor streams).
An immutable object can be created, but can’t be changed. It can be deleted.
You can execute all SCSP methods on a mutable object: WRITE, UPDATE, COPY, APPEND, READ, INFO, and DELETE.
Assumption: You use the default port 80 for SCSP. If you use a different port, you must include it in the URL.
Write a New Immutable Object
WRITE a string:
WRITE a file:
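The two WRITE variants might look like the following sketch. The helper names are hypothetical, and the URL is built in double quotes only so that the shell expands the node argument; when you type a literal URL, enclose it in single quotes as recommended earlier.

```shell
# Hypothetical helpers. $1 = node IP address or host name.
# --data-binary implies an HTTP POST, so -X POST is not needed.
# -i prints the response headers, including Location and Content-UUID.

# WRITE a string ($2) as a new immutable object.
write_string() {
  curl -i --post301 --location-trusted \
       -H 'Content-Type: text/html' \
       --data-binary "$2" "http://$1/"
}

# WRITE the contents of a file ($2); the @ prefix tells cURL to read the file.
write_file() {
  curl -i --post301 --location-trusted \
       -H 'Content-Type: text/html' \
       --data-binary "@$2" "http://$1/"
}

# Usage:
#   write_string 172.16.0.32 '<h1>Hello World</h1>'
#   write_file   172.16.0.32 Hello_World.html
```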
For the preceding command to work, Hello_World.html must be in a location accessible by cURL and it must have HTML contents, such as <h1>Hello World</h1>. (Include the path to the file if necessary as @path-to-file/Hello_World.html.)
To view the object (that is, execute the READ method), paste the URL displayed in the Location response into a web browser’s address or location field.
INFO an Unnamed Immutable Object
An INFO differs from a READ in that only the metadata stored with the object is returned. The object’s contents are not returned if you INFO the object.
For example,
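A minimal sketch of an INFO request follows. The helper name info_object is hypothetical; the UUID in the usage comment is the one from the verbose output example earlier in this Tech Note.

```shell
# Hypothetical helper: INFO (HTTP HEAD) an object. $1 = node, $2 = object UUID.
# -I makes cURL send HEAD and print only the response headers.
info_object() {
  curl -I --location-trusted "http://$1/$2"
}

# Usage:
#   info_object 172.16.0.32 b64d2b22cd2ca0c8300ffecb337181e5
```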
Sample response:
HTTP/1.1 200 OK
Castor-System-Created: Sat, 19 Mar 2015 19:18:42 GMT
Content-Length: 33
Content-type: text/html
Last-Modified: Sat, 19 Mar 2015 19:18:42 GMT
Etag: "3c0723bd9f555df9645c266227ee5fa2"
Date: Sat, 19 Mar 2015 19:22:41 GMT
Server: CAStor Cluster/7.5.1
DELETE an Unnamed Immutable Object
To delete an unnamed immutable object, you must know the value of the object’s Location header.
For example,
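A DELETE might be issued as in this sketch. The helper name delete_object is hypothetical; the UUID is the last path element of the object’s Location header.

```shell
# Hypothetical helper: DELETE an object. $1 = node, $2 = object UUID.
delete_object() {
  curl -i --location-trusted -X DELETE "http://$1/$2"
}

# Usage:
#   delete_object 172.16.0.32 b64d2b22cd2ca0c8300ffecb337181e5
```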
If you now try to READ or INFO the object, Swarm responds with 404 (Not Found) or with a Swarm Error.
Working With An Unnamed Mutable Object
An unnamed mutable object is one that can be changed using UPDATE, APPEND, COPY, or DELETE. You can also READ and INFO the object. This example shows how to create a new mutable object, how to modify it using UPDATE, add data to it using APPEND, COPY metadata, then READ and INFO the object before you DELETE it.
The ?alias (optionally ?alias=yes) query argument is required only on the POST that creates a mutable unnamed object in Swarm 7.x and later. It is not shown in subsequent examples where it is unnecessary, because the examples were created with Swarm 7.x or later.
Create the object using POST
WRITE a string:
WRITE a file:
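Creating the mutable object differs from the immutable case only in the ?alias query argument. The following sketch uses one hypothetical helper (create_mutable) for both variants: pass a string, or a file name prefixed with @.

```shell
# Hypothetical helper: WRITE (HTTP POST) a new mutable unnamed object.
# $1 = node, $2 = data string, or @file-name to send a file's contents.
create_mutable() {
  curl -i --post301 --location-trusted \
       -H 'Content-Type: text/html' \
       --data-binary "$2" "http://$1/?alias"
}

# Usage:
#   create_mutable 172.16.0.32 '<h1>Hello World</h1>'
#   create_mutable 172.16.0.32 @Hello_World.html
```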
For the preceding command to work, Hello_World.html must be in a location accessible by cURL and it must have HTML contents, such as <h1>Hello World</h1>. (Include the path to the file if necessary as @path-to-file/Hello_World.html.)
READ the Object
Verify the object was created successfully by pasting the value of the Location header in a browser.
Add data to the object using APPEND
For example,
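An APPEND might be sketched as follows. The helper name append_data is hypothetical, and the UUID in the usage comment stands for the alias UUID returned when the object was created.

```shell
# Hypothetical helper: APPEND data to a mutable object.
# $1 = node, $2 = alias UUID, $3 = data to append (or @file-name).
append_data() {
  curl -i --location-trusted -X APPEND \
       --data-binary "$3" "http://$1/$2"
}

# Usage:
#   append_data 172.16.0.32 90eee0a3170bf8fca313532e70b37eec '<p>Appended text</p>'
```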
Refresh your web browser to see how the object changed.
Note: Although this example shows how to append data to an object, APPEND can also be used to append metadata to the object.
Add custom metadata to the object using COPY
Swarm enables you to add custom metadata in the format x-*-meta-*. Custom metadata added in this way is case-insensitive (to be consistent with section 4.2 of RFC 2616) and can contain ASCII characters only. The total length of all persisted metadata, keys and values, is limited to 32KB.
You can add metadata headers to an object using the UPDATE method (which replaces the object's data and metadata) or using COPY (which replaces the metadata only).
For example,
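The following sketch sends a COPY that sets one custom metadata header. The helper name copy_metadata is hypothetical; the header value matches the sample response shown in this section. Because COPY replaces the object's metadata, include every header you want to keep.

```shell
# Hypothetical helper: COPY (replace metadata only) on a mutable object.
# $1 = node, $2 = alias UUID, $3 = complete metadata header ("name: value").
copy_metadata() {
  curl -i --location-trusted -X COPY \
       -H "$3" "http://$1/$2"
}

# Usage:
#   copy_metadata 172.16.0.32 90eee0a3170bf8fca313532e70b37eec 'x-ExampleCorp-meta-color: blue'
```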
INFO the object to verify the header:
A sample response follows (the custom metadata header is x-ExampleCorp-meta-color):
HTTP/1.1 200 OK
Castor-System-Alias: 90eee0a3170bf8fca313532e70b37eec
Castor-System-Created: Sat, 19 Mar 2015 21:47:05 GMT
Castor-System-Version: 1438101372.519
Content-Length: 0
Last-Modified: Sat, 19 Mar 2015 21:47:05 GMT
x-ExampleCorp-meta-color: blue
Etag: "b4fb59c0f49e5be9ba09c8996784c73a"
Date: Sat, 19 Mar 2015 21:47:08 GMT
Server: CAStor Cluster/7.5.1
Note: If you send a Content-Length header, you must set it to 0 because a COPY has no data.
Replace data and metadata using UPDATE
UPDATE is very similar to COPY except that UPDATE replaces the object's contents as well as its metadata.
Try UPDATE on the object from the preceding example.
For example,
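An UPDATE that replaces both the contents and the custom header might look like this sketch. The helper name update_object is hypothetical; the header value matches the sample response that follows.

```shell
# Hypothetical helper: UPDATE (HTTP PUT) a mutable object.
# $1 = node, $2 = alias UUID, $3 = new data (or @file-name), $4 = metadata header.
update_object() {
  curl -i --location-trusted -X PUT \
       -H 'Content-Type: text/html' \
       -H "$4" \
       --data-binary "$3" "http://$1/$2"
}

# Usage:
#   update_object 172.16.0.32 90eee0a3170bf8fca313532e70b37eec \
#       '<h1>Hello again</h1>' 'x-ExampleCorp-meta-color: orange'
```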
INFO the object to verify the header:
A sample follows:
HTTP/1.1 200 OK
Castor-System-Alias: 90eee0a3170bf8fca313532e70b37eec
Castor-System-Created: Sat, 19 Mar 2015 22:11:33 GMT
Castor-System-Version: 1438101372.519
Content-Length: 0
Last-Modified: Sat, 19 Mar 2015 22:11:33 GMT
x-ExampleCorp-meta-color: orange
Etag: "a6d10b1786ddfc40e9f06e5e385295a0"
Date: Sat, 19 Mar 2015 22:11:38 GMT
Server: CAStor Cluster/7.5.1
Refresh your browser window to see the change in the object's contents.
DELETE the mutable object
Paste the value of the Location header in a web browser, or INFO the object, to verify you get a 404 (Not Found) or a Swarm Error.
Beyond the Basics With Unnamed Objects
This section discusses more advanced actions, such as using cURL for performance, immediate replication (also referred to as replicate-on-write), lifepoint headers, and range reads.
Performance tip
cURL's output to the command prompt can drastically skew or slow response times. If you use cURL to measure performance, use -s (silent) and -o /dev/null (discard the output). For example,
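A quiet READ could be sketched as follows. The helper name timed_read is hypothetical, and the -w option, which prints cURL's total transfer time, is an addition for illustration rather than something this Tech Note requires.

```shell
# Hypothetical helper: READ an object without printing it.
# $1 = node, $2 = object UUID.
# -s hides the progress meter, -o /dev/null discards the body,
# and -w prints the total time in seconds (added here for illustration).
timed_read() {
  curl -s -o /dev/null -w '%{time_total}\n' \
       --location-trusted "http://$1/$2"
}

# Usage:
#   timed_read 172.16.0.32 b64d2b22cd2ca0c8300ffecb337181e5
```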
Immediate replication (replicate-on-write)
To create a mutable unnamed object and replicate it immediately:
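The sketch below shows one possible form of such a WRITE. It assumes that replicate-on-write is requested with a replicate=immediate query argument; that argument is an assumption not confirmed by this Tech Note, so verify it against the Swarm Guide for your version. The helper name write_replicated is hypothetical.

```shell
# Hypothetical helper: WRITE a mutable object and ask for immediate replication.
# ASSUMPTION: the replicate=immediate query argument; check the Swarm Guide.
# $1 = node, $2 = data string (or @file-name).
write_replicated() {
  curl -i --post301 --location-trusted \
       -H 'Content-Type: text/html' \
       --data-binary "$2" "http://$1/?alias&replicate=immediate"
}

# Usage:
#   write_replicated 172.16.0.32 '<h1>Hello World</h1>'
```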
INFO the object to get its replica count:
For example:
Sample response (note the Replica-Count header):
HTTP/1.1 200 OK
Castor-System-Alias: cd0e8d605372a4d21b660000fc807c1b
Castor-System-Created: Thu, 17 Mar 2015 21:45:47 GMT
Castor-System-Version: 1438101372.519
Content-Length: 0
Content-type: text/html
Last-Modified: Thu, 17 Mar 2015 21:45:47 GMT
lifepoint: [Fri, 18 March 2015 10:15:00 GMT] reps=2, deletable=True, [] delete
Replica-Count: 2
Etag: "7a1f8719a80df7065d1b5ce4ef30403b"
Date: Thu, 17 Mar 2015 22:01:20 GMT
Server: CAStor Cluster/7.5.1
Range READ
For example:
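A range READ can be requested with cURL's -r (--range) option, which sends a Range: bytes=... header. The helper name range_read is hypothetical; the UUID and byte range in the usage comment are the ones from the sample output in this section.

```shell
# Hypothetical helper: READ part of an object.
# $1 = node, $2 = object UUID, $3 = byte range such as 399-600.
range_read() {
  curl -v --location-trusted -r "$3" "http://$1/$2"
}

# Usage:
#   range_read 172.16.0.32 44bc48cd3e040c7fc5d76c606eafff46 399-600
```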
The response depends on the type of file you are reading. For example, following is a partial response if you request bytes 399-600 from the Henry Ford biography from Wikipedia (assuming you had previously stored it in your cluster):
* About to connect() to 172.16.0.32 port 80 (#0)
* Trying 172.16.0.32... connected
* Connected to 172.16.0.32 (172.16.0.32) port 80 (#0)
> GET /44bc48cd3e040c7fc5d76c606eafff46?alias HTTP/1.1
> Range: bytes=399-600
..[this portion intentionally omitted]
>
< HTTP/1.1 206 Partial Content
HTTP/1.1 206 Partial Content
< Castor-System-Alias: 44bc48cd3e040c7fc5d76c606eafff46
Castor-System-Alias: 44bc48cd3e040c7fc5d76c606eafff46
..[this portion intentionally omitted]
< Content-Range: bytes 399-600/219136
Content-Range: bytes 399-600/219136
< Content-Length: 202
Content-Length: 202
< Etag: "25d456df70ba3813d5490bceaa71cb79"
Etag: "25d456df70ba3813d5490bceaa71cb79"
< Age: 97
Age: 97
< Date: Thu, 17 Mar 2015 23:11:09 GMT
Date: Thu, 17 Mar 2015 23:11:09 GMT
< Server: CAStor Cluster/7.5.1
Server: CAStor Cluster/7.5.1
<
Henry Ford - Wikipedia, the free encyclopedia</title>
<meta http-equiv="Content-Style-Type" content="text/css">
<meta name="generator" content="MediaWiki 1.17wmf1">
* Connection #0 to host 172.16.0.32 left intact
* Closing connection #0
WRITE a New Mutable Object With a Lifepoint Header
Set http-date to an HTTP 1.1 specification-formatted date that is at least an hour from now. An example follows:
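The following sketch sends two lifepoint headers that mirror the sample response in this section: the first keeps two replicas until http-date, and the second deletes the object afterward. The helper name write_with_lifepoint is hypothetical, and the header values are copied from the sample response rather than from a definitive syntax reference.

```shell
# Hypothetical helper: WRITE a mutable object with lifepoint headers.
# $1 = node, $2 = http-date (for example 'Wed, 18 Mar 2015 10:15:00 GMT'),
# $3 = data string (or @file-name).
write_with_lifepoint() {
  curl -i --post301 --location-trusted \
       -H 'Content-Type: text/html' \
       -H "lifepoint: [$2] reps=2, deletable=True" \
       -H 'lifepoint: [] delete' \
       --data-binary "$3" "http://$1/?alias"
}

# Usage:
#   write_with_lifepoint 172.16.0.32 'Wed, 18 Mar 2015 10:15:00 GMT' '<h1>Hello World</h1>'
```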
The preceding command creates an object that initially has two replicas, both of which are deleted at http-date.
To verify the lifepoint headers, INFO the object using the following command:
A response similar to the following displays to indicate the object was found, and it displays the object’s lifepoint header:
HTTP/1.1 200 OK
Castor-System-Created: Thu, 17 Mar 2015 01:18:32 GMT
Content-Length: 33
Content-type: text/html
Last-Modified: Thu, 17 Mar 2015 01:18:32 GMT
lifepoint: [Friday, 18 March 2015 10:15:00 GMT] reps=2, deletable=True, [] delete
Etag: "236a921048fa3a4fb31cdda452f2c0a0"
Date: Thu, 17 Mar 2015 01:19:00 GMT
Server: CAStor Cluster/7.5.1
Add a lifepoint header to a mutable object using UPDATE
Set http-date to an HTTP 1.1 specification-formatted date that is at least an hour from now. An example follows:
Verify the headers on the object using the following command:
A sample follows (note the lifepoint header):
HTTP/1.1 200 OK
Castor-System-Alias: cd0e8d605372a4d21b660000fc807c1b
Castor-System-Created: Thu, 17 Mar 2015 21:45:47 GMT
Castor-System-Version: 1438101372.519
Content-Length: 0
Content-type: text/html
Last-Modified: Thu, 17 Mar 2015 21:45:47 GMT
lifepoint: [Fri, 18 March 2015 10:15:00 GMT] reps=2, deletable=True, [] delete
Etag: "7a1f8719a80df7065d1b5ce4ef30403b"
Date: Thu, 17 Mar 2015 21:47:21 GMT
Server: CAStor Cluster/7.5.1
This response confirms you updated the lifepoint header successfully.
Working With Named Objects
This section discusses how to verify you can perform SCSP operations on named objects.
Brief Introduction to Named Objects
Before you can create named objects, your cluster administrator must create at least one tenant (that is, domain). DataCore recommends you use the default cluster domain for these examples but that is not required.
The default cluster domain is defined as a domain whose name exactly matches the value of the cluster configuration parameter in the node or cluster configuration file. Swarm assumes that the domain is set to the default cluster domain if you omit the ?domain= query argument from your cURL commands.
In other words, if cluster.example.com is the default cluster domain, the following cURL commands are equivalent:
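As an illustration of that equivalence, the following sketch builds the URL of a named object with and without the ?domain= query argument. The helper name named_url and the bucket and object names are hypothetical.

```shell
# Hypothetical helper: build the URL of a named object.
# $1 = node, $2 = bucket/object path, $3 = domain name (optional).
named_url() {
  if [ -n "$3" ]; then
    printf 'http://%s/%s?domain=%s\n' "$1" "$2" "$3"
  else
    printf 'http://%s/%s\n' "$1" "$2"
  fi
}

# If cluster.example.com is the default cluster domain, these address the same object:
#   curl -I --location-trusted "$(named_url 172.16.0.32 bucket/technote.html)"
#   curl -I --location-trusted "$(named_url 172.16.0.32 bucket/technote.html cluster.example.com)"
```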
For more information about creating a tenant and a domain, see the Swarm Guide, under Configuration and Administration.
For more information about named objects, see the Swarm Guide, under Understanding Swarm Objects.
Before you continue, you must know all of the following:
The IP address of one node in the cluster.
The name of the domain and whether or not it is the default cluster domain. If the domain is not the default cluster domain, you must add the ?domain=domain-name query argument to every cURL command.
The domain’s protection setting, which defines the users who can WRITE in the domain. This document will not address authentication.
If you do not use the default port 80 for SCSP, you must include the port in the URL.
Creating the Domain
This section describes how to create the mycluster.example.com domain. All tasks discussed in this section require you to authenticate as a user in the Swarm administrators user list. If possible, create the domain using the Swarm user interface instead.
To create the domain:
1. Create the mycluster.example.com domain:
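A sketch of such a command follows. It sends an empty POST with the two headers this section names (lifepoint: [] reps=16 and Castor-Stream-Type: admin) and authenticates with --anyauth and -u. The helper name create_domain and the credentials are hypothetical, and your Swarm version may require additional headers or query arguments, so treat this as a starting point rather than the definitive command.

```shell
# Hypothetical helper: create a domain as a Swarm administrator.
# $1 = node, $2 = domain name, $3 = user:password (placeholder credentials).
create_domain() {
  curl -i --anyauth -u "$3" --post301 --location-trusted \
       -H 'lifepoint: [] reps=16' \
       -H 'Castor-Stream-Type: admin' \
       --data-binary '' "http://$1/?domain=$2"
}

# Usage:
#   create_domain 172.16.0.35 mycluster.example.com 'admin:password'
```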
Caution: Be sure to enter these headers exactly as shown so that they match the headers used when domains are created by the Console. lifepoint: [] reps=16 enables the domain to be replicated as many times as possible. Use Castor-Stream-Type: admin for all root objects that are accessed frequently and all objects that use a Castor-Authorization header.
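2. Locate the 201 Created response to confirm that the command was successful. Because --anyauth makes cURL send an unauthenticated request first, the output shows a 401 Unauthorized challenge before the 201 Created response.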
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Digest realm="CAStor administrator",
nonce="e9d618c06bbeca15e7568aca2f
opaque="f9bcffd26d1bede5419a2b30cbcff976",
stale=false, qop="auth", algorithm=MD5
WWW-Authenticate: Basic realm="CAStor administrator"
Content-Length: 51
Content-Type: text/html
Date: Fri, 04 Nov 2011 16:22:53 GMT
Server: CAStor Cluster/6.0.0
Allow: HEAD, GET, PUT, POST, COPY, APPEND, DELETE
HTTP/1.1 201 Created
Content-UUID: 68aeb6c78e4ca0101f3936c03fa26e72
Location: http://172.16.0.35:80/?domain=mycluster.example.com
Volume: 064d2cc2392810dcc2d04f7b1023ecc7
Last-Modified: Fri, 04 Nov 2011 16:22:53 GMT
Entity-MD5: ZoxbtOo9PhHyZzAbaqa/Pw==
Stored-Digest: 668c5bb4ea3d3e11f267301b6aa6bf3f
Castor-System-Owner: admin@CAStor administrator
Castor-System-Version: 1320
Creating the Bucket
Once you have a domain, you need one or more buckets to write your data streams into. If you do not create a bucket and you write named streams into your domain, the streams are assumed to be buckets. For example, if you write a stream called http://172.16.0.35/stream1?domain=mycluster.example.com, stream1 is considered a bucket, not a regular stream. This is not desirable because different replication policies apply to context streams (domains and buckets) than to regular streams.
This section discusses how to WRITE a bucket in a domain that does not require authentication.
WRITE a bucket without authentication
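A bucket WRITE might be sketched as an empty POST to the bucket name. The helper name create_bucket is hypothetical; the domain argument is optional and is left off the URL when omitted.

```shell
# Hypothetical helper: WRITE a bucket without authentication.
# $1 = node, $2 = bucket name, $3 = domain name (optional).
create_bucket() {
  # ${3:+...} adds the ?domain= query argument only when a domain is given.
  curl -i --post301 --location-trusted \
       --data-binary '' "http://$1/$2${3:+?domain=$3}"
}

# Usage:
#   create_bucket 172.16.0.35 mybucket mycluster.example.com
#   create_bucket 172.16.0.35 mybucket        # default cluster domain
```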
(The ?domain=domain-name query argument is not required if you are creating a bucket in the default cluster domain.)
Verify the bucket.
A sample response follows:
HTTP/1.1 200 OK
Castor-System-Alias: 759db90c1b81d695837d54261df90c53
Castor-System-CID: 4a89010e358d48c3040778f35282238e
Castor-System-Cluster: mycluster.example.com
Castor-System-Created: Mon, 21 Mar 2015 23:29:28 GMT
Castor-System-Name: mybucket
Castor-System-Version: 1438101372.519
Content-Length: 0
Content-Type: application/x-www-form-urlencoded
Last-Modified: Mon, 21 Mar 2015 23:29:28 GMT
Etag: "f21a6e2e847539049d3f83c508fa0986"
Age: 114
Date: Mon, 21 Mar 2015 23:32:49 GMT
Server: CAStor Cluster/7.5.1
Named Objects
WRITE an object in a bucket
WRITE a string:
WRITE a file:
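Both variants can be sketched with one hypothetical helper (write_named): pass a string, or a file name prefixed with @. The bucket and object names in the usage comments follow the browser example in this section and assume the default cluster domain.

```shell
# Hypothetical helper: WRITE (HTTP POST) a named object in a bucket.
# $1 = node, $2 = bucket/object path, $3 = data string or @file-name.
write_named() {
  curl -i --post301 --location-trusted \
       -H 'Content-Type: text/html' \
       --data-binary "$3" "http://$1/$2"
}

# Usage:
#   write_named 172.16.0.32 bucket/technote.html '<b>Hello World</b>'
#   write_named 172.16.0.32 bucket/technote.html @technote.html
```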
Verify the object in a web browser
Enter the following URL in your web browser’s location or address field:
http://node-ip-address/bucket-name/technote.html[?domain=domain-name]
For example, if the bucket is named bucket and it is in the default cluster domain, and the node IP address is 172.16.0.32:
http://172.16.0.32/bucket/technote.html
Hello World displays in bold text.
Add data to the object using APPEND
Verify the append succeeded by refreshing your web browser window.
Add custom metadata using COPY
INFO the object to verify the metadata.
A sample response follows:
HTTP/1.1 200 OK
Castor-System-CID: 74888c3f62ebd05e53bf442eb6a7d98c
Castor-System-Cluster: mycluster.example.com
Castor-System-Created: Sat, 19 Mar 2015 22:58:48 GMT
Castor-System-Name: technote.html
Castor-System-Version: 1438101372.519
Content-Length: 0
Last-Modified: Sat, 19 Mar 2015 22:58:48 GMT
x-NamedObject-meta-data: NameOne
Etag: "258de599945736b9bb26247ac8229e5f"
Date: Sat, 19 Mar 2015 22:59:13 GMT
Server: CAStor Cluster/7.5.1
Replace data and metadata using PUT
INFO the object to verify the metadata.
A sample response follows:
HTTP/1.1 200 OK
Castor-System-CID: 74888c3f62ebd05e53bf442eb6a7d98c
Castor-System-Cluster: mycluster.example.com
Castor-System-Created: Sat, 19 Mar 2015 23:14:16 GMT
Castor-System-Name: technote.html
Castor-System-Version: 1438101372.519
Content-Length: 0
Last-Modified: Sat, 19 Mar 2015 23:14:16 GMT
x-NamedObject-meta-data: NameTwo
Etag: "f03a429994243f27013b4f1fceb31fff"
Date: Sat, 19 Mar 2015 23:14:34 GMT
Server: CAStor Cluster/7.5.1
Refresh your browser window to see the object's data.
Delete the named object
Verify the delete succeeded by refreshing your web browser window or using an INFO. You should get a 404 (Not Found) or a Swarm Error.
HTTP and SCSP Methods
Swarm clients use the Simple Content Storage Protocol (SCSP) to communicate with Swarm. SCSP is a subset of HTTP 1.1 as defined in RFC 2616. Some SCSP methods have different names than their HTTP counterparts.
When you execute methods using cURL, you must use HTTP method names; however, this Tech Note uses SCSP method names except when showing cURL command examples.
The following tables map SCSP method names to their HTTP method counterparts. A link is provided in the HTTP method column to the relevant section in RFC 2616. For more information about SCSP methods, see the Swarm Guide.
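The mapping used throughout this Tech Note can be summarized in a small sketch. The function name scsp_to_http is hypothetical; the pairs reflect the methods used in the examples (for instance, WRITE is sent as POST and UPDATE as PUT).

```shell
# Hypothetical helper: print the HTTP method name for an SCSP method name.
scsp_to_http() {
  case "$1" in
    WRITE)  echo POST ;;
    READ)   echo GET ;;
    INFO)   echo HEAD ;;
    UPDATE) echo PUT ;;
    DELETE) echo DELETE ;;
    COPY)   echo COPY ;;
    APPEND) echo APPEND ;;
    *)      echo "unknown SCSP method: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   curl -i --location-trusted -X "$(scsp_to_http UPDATE)" --data-binary @file 'http://172.16.0.32/uuid'
```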
© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.