Because the PAN can call on any node to service a particular request, all nodes must be accessible to the application client. An application can locate a PAN to use for transactions using one of the following methods, listed from most preferred to least:
Use the Swarm SDK
(recommended) Integrate applications with Swarm using the Software Development Kit (SDK). Along with the convenience it provides, applications can use the ProxyLocator or StaticLocator object included with the SDK to locate and communicate with a node.
The SDK's ProxyLocator subclass performs two functions:
Performs a GET / to the SCSP Proxy to pre-populate the local list of Swarm node IP addresses.
Dynamically maintains this list as redirects and other responses are received directly from Swarm nodes.
See the SDK Overview.
Use Multicast-DNS (mDNS)
Another way for applications to locate an initial PAN is to use mDNS. mDNS is often referred to as Zeroconf, the collective name for Multicast DNS and DNS Service Discovery, which together enable zero-configuration networking.
mDNS is supported for all deployments. It provides the most flexibility because it presents applications with a list of storage nodes to choose from when selecting a PAN without requiring the application to maintain a static list of IP addresses.
Every Swarm node implements an mDNS service that allows applications to perform service discovery. Even if DHCP is used to assign and change node IP addresses, mDNS allows an application to "discover" an active node in any storage cluster and use it as the PAN. Several free mDNS client implementations in various languages are available online for implementing mDNS node location.
Important
Verify the cluster.name parameter value is unique for each cluster when using mDNS. The parameter is located in the node.cfg file or in the Platform Server's cluster configuration.
Swarm mDNS support allows an application to discover all nodes on a network, discover all nodes in a specific cluster, or look up a single node. To support this, Swarm "publishes" several different records, including an A (host) record for the node and an SRV (service) record under the _scsp._tcp service type.
Although an in-depth description of mDNS deployment is beyond the scope of this document, a typical usage example is provided below. This example uses the Avahi command line tools to pass in the name of the cluster and return all nodes discovered in the cluster. Here, two nodes were found, and the IP address of each is returned in the address field of its record.
% avahi-browse -tr _clustername._sub._scsp._tcp local
+ eth0 IPv4 D2024267FF8F1DD056EEA15E40EE52C9 _scsp._tcp local
= eth0 IPv4 CD35B28FD2E70CD1E47095C774F8050F _scsp._tcp local
   hostname = [CD35B28FD2E70CD1E47095C774F8050F.local]
   address = [192.168.1.123]
   port = [80]
   txt = []
= eth0 IPv4 D2024267FF8F1DD056EEA15E40EE52C9 _scsp._tcp local
   hostname = [D2024267FF8F1DD056EEA15E40EE52C9.local]
   address = [192.168.1.125]
   port = [80]
   txt = []
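An application that shells out to the Avahi tools still needs to turn the resolved records into a list of candidate PAN addresses. The following is a minimal sketch of that step in Python, assuming the avahi-browse output has already been captured as text with one field per line. The function name parse_resolved_nodes is hypothetical and is not part of Swarm or Avahi.

```python
import re

# Matches resolved-record fields such as "address = [192.168.1.123]".
_FIELD = re.compile(r"^\s*(hostname|address|port|txt)\s*=\s*\[(.*)\]\s*$")


def parse_resolved_nodes(output):
    """Return (address, port) pairs found in `avahi-browse -tr` style output.

    Hypothetical helper for illustration only.
    """
    nodes = []
    current = {}
    for line in output.splitlines():
        if line.startswith("="):
            # A line starting with "=" opens a new resolved record.
            current = {}
            continue
        match = _FIELD.match(line)
        if not match:
            continue  # "+" browse events and unrelated lines are ignored
        key, value = match.groups()
        current[key] = value
        if "address" in current and "port" in current:
            nodes.append((current["address"], int(current["port"])))
            current = {}
    return nodes
```

Any address in the returned list can serve as the initial PAN. Capturing the command output (for example with the subprocess module) is left to the application.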
Use DNS round robin
For large and/or dynamic storage clusters where nodes are often added and removed (even for temporary maintenance), address the cluster using a DNS host name instead of an IP address.
This method is recommended for all deployments. It is particularly helpful for multi-tenancy, as the DNS name can be used to pass in a domain.
When using DNS with multi-tenancy, each domain name must resolve to at least one IP address (for example, through an "A" record) so that client application software can include a recognized Swarm domain name in the Host header of the HTTP/1.1 request.
Tip
With some DNS servers, you can move the maintenance of the PAN addresses out of the applications and into the DNS server itself. The Berkeley Internet Name Domain (BIND), the most commonly used DNS server, allows entry of multiple "A" records that map a single DNS name to more than one IP address.
This process also requires static IP addresses, but it enables the application to use a single DNS name (or multiple DNS names if using multiple domains) for the entire cluster. The DNS server selects one of the defined IP addresses on a round-robin basis. The application must resolve the host name again if one of the nodes does not respond.
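The re-resolve behavior described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a Swarm API: the function names are hypothetical, the retry count and timeout are arbitrary, and the resolver and connect parameters exist only so the logic can be exercised without a live DNS server.

```python
import socket


def resolve_nodes(hostname, port=80, resolver=socket.getaddrinfo):
    """Return the distinct IPv4 addresses the DNS name currently maps to."""
    infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def connect_with_reresolve(hostname, port=80, attempts=3, timeout=5.0,
                           resolver=socket.getaddrinfo,
                           connect=socket.create_connection):
    """Connect to a node behind a round-robin DNS name.

    The name is resolved again on every attempt, so a node that stops
    responding is skipped once DNS hands out a different address.
    """
    last_error = None
    for _ in range(attempts):
        for ip in resolve_nodes(hostname, port, resolver):
            try:
                return connect((ip, port), timeout)
            except OSError as exc:
                last_error = exc  # node did not respond; try the next one
    raise ConnectionError("no node responded for " + hostname) from last_error
```

This sketch tries every address returned by a lookup before resolving again. An application that only ever uses the first address returned would instead rely entirely on the DNS server's rotation.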
Use a pool of static IP addresses
A less desirable approach for an application to address a storage cluster is to use a stored list of several (perhaps all) of the static IP addresses for the nodes in the cluster. This method is not recommended for a production environment.
The application's stored list of IP addresses must be accessible programmatically from the application. The application can attempt another IP address if one of the nodes fails to respond to a request.
The application can add the new IP address to its list if a redirect response reveals a storage node that is not in the original list. This may be a good approach if the cluster is relatively stable with respect to static node IP addresses. Do not use this method if nodes are frequently added to and removed from the cluster.
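The failover and list-growing behavior described above might look like the following Python sketch. The StaticNodePool class and its send callback are hypothetical names introduced for illustration. The sketch assumes the callback performs one HTTP request against the given IP address, returns a status code and a header dictionary, and raises OSError when the node does not respond.

```python
from urllib.parse import urlsplit


class StaticNodePool:
    """Hypothetical holder for an application's stored list of node IPs."""

    def __init__(self, addresses):
        # Keep the configured order and drop duplicates.
        self._nodes = list(dict.fromkeys(addresses))

    def nodes(self):
        return list(self._nodes)

    def learn_from_redirect(self, location):
        """Add the host named in a redirect Location header if it is new."""
        host = urlsplit(location).hostname
        if host and host not in self._nodes:
            self._nodes.append(host)
            return True
        return False

    def request(self, send):
        """Try each stored node in turn until one responds."""
        last_error = None
        for ip in list(self._nodes):
            try:
                status, headers = send(ip)
            except OSError as exc:
                last_error = exc  # node failed to respond; try another
                continue
            if 300 <= status < 400 and "Location" in headers:
                self.learn_from_redirect(headers["Location"])
            return ip, status, headers
        raise ConnectionError("no node in the pool responded") from last_error
```

Following the redirect itself is left to the caller. The sketch only records the newly revealed node so that later requests can use it.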
Use a single static IP address
The simplest but least recommended (and least supported) way for an application to address a storage cluster is to assign a static IP address to at least one of the cluster nodes and then use that IP address in every request. Restrict this method to development environments: it can be set up quickly, but it is not maintainable in a larger system.
The simplicity of this approach is offset by a significant disadvantage. The application cannot send requests to the cluster if the sole PAN is taken out of service or fails for any reason, even though other nodes may still be functioning and all desired content is still available.