Because the PAN can call on any node to service a particular request, all nodes must be accessible to the application client.

To locate a PAN to use for transactions, use one of these methods (listed from most to least preferred):

Use the Swarm SDK

(Recommended) Integrate your applications with Swarm using the Software Development Kit (SDK). Besides the convenience it provides, the SDK includes the ProxyLocator and StaticLocator objects, which your application can use to locate and communicate with a node.

The SDK's ProxyLocator subclass performs two functions:

Performs a GET / to the SCSP Proxy to pre-populate its local list of Swarm node IP addresses.
Dynamically maintains this list as redirects and other responses are received directly from Swarm nodes.

See the SDK Overview.

Use Multicast-DNS (mDNS)

Another way for your applications to locate an initial PAN is to use mDNS. mDNS is often referred to as Zeroconf, a collective name for the technologies (including Multicast DNS and DNS Service Discovery) that enable zero-configuration networking.

mDNS is supported for all deployments. It provides the most flexibility because it presents applications with a list of storage nodes to choose from when selecting a PAN without requiring the application to maintain a static list of IP addresses.

Every Swarm node implements an mDNS service that allows applications to perform service discovery. Even if DHCP is used to assign and change node IP addresses, mDNS allows an application to "discover" an active node in any storage cluster and use it as the PAN. Several free mDNS client implementations in various languages are available online for implementing mDNS node location.

Important

When using mDNS, ensure that the cluster.name parameter value is unique for each cluster. The parameter is located in the node.cfg file or in the Platform Server's cluster configuration.

Swarm mDNS support allows an application to discover all nodes on a network, all nodes in a specific cluster, or to look up a node. To support this, each node "publishes" several different records, including an A (host) record for the node and an SRV (service) record under the _scsp._tcp service type.

Although an in-depth description of mDNS deployment is beyond the scope of this section, a typical usage example is provided below. This example uses the Avahi command line tools to pass in the name of the cluster and return all nodes discovered in that cluster. Here, two nodes were found and their IP addresses were returned in the address field for each record.

% avahi-browse -tr _clustername._sub._scsp._tcp local
+ eth0 IPv4 D2024267FF8F1DD056EEA15E40EE52C9 _scsp._tcp local
= eth0 IPv4 CD35B28FD2E70CD1E47095C774F8050F _scsp._tcp local
   hostname = [CD35B28FD2E70CD1E47095C774F8050F.local]
   address = [192.168.1.123]
   port = [80]
   txt = []
= eth0 IPv4 D2024267FF8F1DD056EEA15E40EE52C9 _scsp._tcp local
   hostname = [D2024267FF8F1DD056EEA15E40EE52C9.local]
   address = [192.168.1.125]
   port = [80]
   txt = []
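
An application that shells out to the Avahi tools needs to turn this text into a list of candidate PAN addresses. The following is a minimal Python sketch of that step, not part of Swarm or its SDK. It assumes the resolved records use the "key = [value]" layout shown in the example, and the function name parse_resolved_nodes is hypothetical.

```python
import re

# Matches the "key = [value]" fields printed for each resolved record.
FIELD = re.compile(r"(hostname|address|port)\s*=\s*\[([^\]]*)\]")


def parse_resolved_nodes(output: str) -> list[tuple[str, int]]:
    """Return (address, port) pairs found in avahi-browse resolver text.

    Hypothetical helper: it only understands the field layout shown in
    the example above and ignores everything else.
    """
    nodes = []
    current = {}
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("="):
            # A line starting with "=" opens a new resolved record.
            current = {}
            continue
        for key, value in FIELD.findall(stripped):
            current[key] = value
        if "address" in current and "port" in current:
            # Both fields seen: record the node and start over.
            nodes.append((current["address"], int(current["port"])))
            current = {}
    return nodes
```

Any address returned this way can serve as the initial PAN. A dedicated mDNS client library would avoid parsing command output altogether and is likely the better choice in production code.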

Use DNS round robin

For large and/or dynamic storage clusters where nodes are often added and removed (even for temporary maintenance), you can address the cluster using a DNS host name instead of an IP address.

This method is recommended for all deployments. It is particularly helpful for multi-tenancy, as you can use the DNS name to pass in a domain.

When using DNS with multi-tenancy, each domain name must resolve to at least one IP address (for example, through an "A" record) so that client applications can connect by that name and include a recognized Swarm domain name in the Host header of the HTTP/1.1 request.
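
To illustrate how the domain name reaches the cluster, the sketch below builds an HTTP request addressed by DNS name using Python's standard urllib. The host name and path in the usage note are placeholders, and build_domain_request is a hypothetical helper rather than an SDK function.

```python
import urllib.request


def build_domain_request(dns_name: str, path: str, port: int = 80) -> urllib.request.Request:
    """Build a GET request whose Host header carries the domain name.

    Hypothetical helper for illustration; dns_name is assumed to be a
    domain that Swarm recognizes and that DNS resolves to a node.
    """
    url = f"http://{dns_name}:{port}{path}"
    request = urllib.request.Request(url, method="GET")
    # urllib would derive Host from the URL anyway; setting it
    # explicitly makes the domain name sent to the node visible.
    request.add_header("Host", dns_name)
    return request
```

For example, build_domain_request("archive.example.com", "/bucket/object") produces a request that connects to whichever address the name resolves to and sends "archive.example.com" in the Host header.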

Tip

With some DNS servers, you can move the maintenance of the PAN addresses out of the applications and into the DNS server itself. The Berkeley Internet Name Domain (BIND), the most commonly used DNS server, lets you enter multiple "A" records that map a single DNS name to more than one IP address.

This process also requires static IP addresses, but it enables the application to use a single DNS name (or multiple DNS names if you are using multiple domains) for the entire cluster. The DNS server selects one of the defined IP addresses on a round-robin basis. If one of the nodes does not respond, the application must resolve the host name again.
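
The re-resolve step can be sketched as follows in Python. This is an illustration under stated assumptions, not a Swarm API: send is a hypothetical callable supplied by the application that performs one request against an IP address and raises OSError when the node does not respond, and the resolver parameter exists only so the lookup can be swapped out in tests.

```python
import socket


def resolve_node_addresses(hostname, port=80, resolver=socket.getaddrinfo):
    """Return the distinct IPv4 addresses currently published for hostname."""
    infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def request_with_reresolve(hostname, send, port=80, attempts=3,
                           resolver=socket.getaddrinfo):
    """Call send(ip, port); if the node fails, resolve the name again and retry."""
    last_error = None
    for _ in range(attempts):
        addresses = resolve_node_addresses(hostname, port, resolver)
        if not addresses:
            break
        try:
            # Use the first address the DNS server handed back this time.
            return send(addresses[0], port)
        except OSError as error:
            last_error = error
    raise ConnectionError(f"no node behind {hostname} responded") from last_error
```

Whether a second lookup returns a different first address depends on the DNS server's rotation and on any resolver caching between the application and that server, so a real client may also want to fall through to the remaining addresses in the list.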

Use a pool of static IP addresses

A less desirable approach for an application to address a storage cluster is to use a stored list of several (perhaps all) of the static IP addresses for the nodes in the cluster. This method is not recommended for a production environment.

The application's stored list of IP addresses must be accessible programmatically from the application. If one of the nodes fails to respond to a request, the application can simply try another IP address.

If a redirect response reveals a storage node that is not in the original list, the application should be able to add the new IP address to the list. If your cluster is relatively stable with respect to static node IP addresses, this may be a good approach. However, if nodes are frequently added and removed from the cluster, do not use this method.
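
The failover and list-update behavior described above might look like the following Python sketch. The class name StaticNodePool and the send callable are hypothetical: send is assumed to perform one request against an IP address and return the HTTP status code together with the Location header value (or None), raising OSError when the node does not respond.

```python
from urllib.parse import urlsplit


class StaticNodePool:
    """Stored list of node IP addresses with simple failover (illustrative only)."""

    def __init__(self, addresses):
        # Keep the configured order but drop duplicate entries.
        self.addresses = list(dict.fromkeys(addresses))

    def learn_from_redirect(self, location):
        """Add the host named in a redirect Location header if it is new."""
        host = urlsplit(location).hostname
        if host and host not in self.addresses:
            self.addresses.append(host)
        return host

    def request(self, send):
        """Try each known address in turn until send(ip) succeeds."""
        last_error = None
        for ip in list(self.addresses):
            try:
                status, location = send(ip)
            except OSError as error:
                # Node did not respond: move on to the next address.
                last_error = error
                continue
            if 300 <= status < 400 and location:
                self.learn_from_redirect(location)
            return status, location
        raise ConnectionError("no node in the pool responded") from last_error
```

The sketch only grows the list. It never removes addresses, which is one reason this approach suits clusters whose node addresses rarely change.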

Use a single static IP address

The simplest but least recommended (and least supported) way for an application to address a storage cluster is to assign a static IP address to at least one of the cluster nodes and then use that IP address in every request. This method should be used only in a development environment. It can be set up quickly, but is not maintainable in a larger system.

The simplicity of this approach is offset by a significant disadvantage. If the sole PAN is taken out of service or fails for any reason, the application cannot send requests to the cluster, even though other nodes might still be functioning and all desired content is still available.
