Feeds are a mechanism for a cluster to do some processing for every stream (replica, EC manifest, EC segment, etc.) of every object in the cluster. Replication feeds “process” those streams by replicating logical objects to another cluster. There are many considerations and possibilities with feeds: how the clusters are connected, which data will be processed by the feed, the size and characteristics of the objects, understanding the catch-up phase vs. the steady-state phase, etc.

...

Let’s consider connectivity between clusters. The source cluster may have a direct network path to the target cluster. This configuration is the most efficient, as it allows every Linux process in the source cluster to have a direct network path to all the nodes in the target cluster. Bandwidth is usually not an issue in this situation. More commonly, though, clusters are separated by some complex network path, possibly even going over a wide area network. Replication may optionally go through a target cluster Gateway, a target cluster reverse proxy, a source cluster forward proxy or NAT device, and/or an HTTPS offload for secure communications. Adding hops doesn’t generally impact throughput except that throughput will necessarily be limited by the slowest link. Often, this is the network link itself, especially if ISPs are involved. It’s also important to understand that with large object use cases, some remote replication requests will take hours. Any component on the network path can have timeouts, which can effectively clog the pipe: these large object transfers will be tried over and over again without success while limiting the available bandwidth for object transfers that would succeed. If there are a large number of “retrying” objects on a replication feed, this might be the issue.
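A back-of-the-envelope calculation can show whether an intermediary timeout will doom a large transfer. The sketch below is illustrative arithmetic only (not a Swarm tool); the function name and parameters are hypothetical:

```python
def transfer_risk(object_bytes, link_bytes_per_sec, timeout_sec):
    """Estimate single-object transfer time on the slowest link and flag
    whether it would outlive an intermediary's timeout (illustrative only)."""
    transfer_sec = object_bytes / link_bytes_per_sec
    return transfer_sec, transfer_sec > timeout_sec

# Example: a 1 TB object over a 100 Mb/s (12.5 MB/s) WAN link takes
# roughly 80,000 seconds (~22 hours). Any proxy, offload, or NAT timeout
# shorter than that will repeatedly kill and restart the transfer.
seconds, at_risk = transfer_risk(10**12, 12.5 * 10**6, timeout_sec=3600)
```

If the flagged objects line up with the feed’s “retrying” count, the timeout on the path, not the feed itself, is the likely culprit.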

It’s common for customers to replicate all of the data in the source cluster to the target cluster. It is possible to limit which domain(s) are transferred by a feed. One would hope that if domain D comprised 1/100th of the space used by a cluster, it would be replicated that much more quickly to the target cluster. Unfortunately, each feed must consider every stream on every volume in the cluster, regardless of the feed definition restriction. The saving from the restriction is that relatively few of those streams will do the actual work of replication. In the feed telemetry, a new feed will have mostly “unqualified” streams. These are streams known to be on disk but not yet checked against the feed definition. Streams that are found not to match are simply removed from “unqualified”. Matching streams are converted to “processing”, and the replication work is queued. Eventually, the stream is marked as “success” or “retrying”. So, with our domain D, the feed processing time may be limited by the time it takes Swarm to scan all of the volumes for matching content.
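The telemetry states above form a small state machine per stream. A minimal sketch, assuming the state names from the telemetry (the function and transition logic here are illustrative, not Swarm internals):

```python
from enum import Enum

class FeedStreamState(Enum):
    UNQUALIFIED = "unqualified"  # seen on disk; not yet checked against the feed definition
    PROCESSING = "processing"    # matched the feed definition; replication work queued
    SUCCESS = "success"          # replicated to the target cluster
    RETRYING = "retrying"        # replication attempt failed; will be retried

def next_state(state, matches_feed=None, replicated=None):
    """Illustrative transition function for one stream's lifecycle."""
    if state is FeedStreamState.UNQUALIFIED:
        # Non-matching streams are simply dropped from the telemetry counts
        return FeedStreamState.PROCESSING if matches_feed else None
    if state in (FeedStreamState.PROCESSING, FeedStreamState.RETRYING):
        return FeedStreamState.SUCCESS if replicated else FeedStreamState.RETRYING
    return state
```

This also explains why a restricted feed still takes time to start: every stream must pass through the “unqualified” check, even if most are then discarded.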

In the case of replication feeds, only a subset of the streams on disk is processed. Each wholly replicated object will have a number of replicas in the cluster controlled by the cluster’s replication policy; typically, there are three. In the case of EC objects, the p+1 manifests of a k:p encoded EC object are processed by the feed, causing the logical object to be replicated to the target cluster. Segments are ignored by the feed. Remember that we replicate logical objects because the target cluster may have different replication policies in play.
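The per-object stream counts can be made concrete. This is illustrative arithmetic under the rules just described (the function name and defaults are assumptions, not a Swarm API):

```python
def feed_work_items(protection, reps=3, k=5, p=2):
    """Streams a replication feed processes vs. ignores per logical object.
    Whole replicas: one stream per replica, all processed.
    EC (k:p): the p+1 manifests are processed; the k+p segments are ignored."""
    if protection == "replicated":
        return {"processed": reps, "ignored": 0}
    if protection == "ec":
        return {"processed": p + 1, "ignored": k + p}
    raise ValueError(f"unknown protection type: {protection}")

# A 5:2 EC object yields 3 processed manifests and 7 ignored segments;
# a 3-replica object yields 3 processed replicas.
```

Note that in both cases more than one stream per logical object triggers feed work, which matters for the dedup behavior discussed next.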

Replicas and manifests are scattered throughout the cluster and will be processed by a feed at different times. When each replica/manifest is processed, the feed will attempt to perform the replication to the target cluster. The first replica/manifest to do so wins. The other replicas will find that the object is already in the target cluster (based on a HEAD request) and just check off the work as done. So if you imagine that all the streams in the source cluster are processed at an even rate, a new feed will do a lot of replication work early in the feed lifetime because everything needs to be transferred. Over time, more and more content will be found to be replicated already. So the feed may be making linear progress, but the rate of objects actually being replicated may look more like an S-curve: high at the start, a fairly rapid drop-off, then a long tail. The shape of the curve depends on object size, EC and replication policies, and many other factors. Statistics keep track of the remote transfer successes and the duplicated successes. The sum of these gives you an idea of feed progress, while the slopes of these statistics can indicate where on this S-curve the replication process is.
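The first-winner behavior and the resulting curve can be seen in a toy simulation. This is a sketch, not Swarm code: it assumes each object’s replicas are processed in a random order interleaved across the whole feed, with the first processing doing the remote transfer and later ones counting as duplicated (HEAD-hit) successes:

```python
import random

def simulate_feed(num_objects=1000, replicas=3, seed=42):
    """Toy model of a new replication feed. Returns a timeline of
    (remote_successes, duplicated_successes) after each work item."""
    rng = random.Random(seed)
    # One work item per replica of each object, shuffled across the feed run
    work = [(obj, r) for obj in range(num_objects) for r in range(replicas)]
    rng.shuffle(work)

    transferred = set()
    remote, duplicated = 0, 0
    timeline = []
    for obj, _ in work:
        if obj in transferred:
            duplicated += 1   # HEAD finds the object already on the target
        else:
            transferred.add(obj)
            remote += 1       # this replica wins and does the transfer
        timeline.append((remote, duplicated))
    return timeline
```

Early in the run nearly every work item is a remote transfer; late in the run nearly every item is a duplicated success, which is the S-curve described above. Comparing the slopes of the two real statistics gives the same read on a live feed.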

...