Created 5/23/2011 tw.cook · Updated 1/26/2016 aaron.enfield
Swarm's Health Processor (HP) visits every stream on every cycle and checks replication for each one. However, it doesn't always do that by multicast; typically it does that by unicasting to the stored hints*. If HP finds a stream with no hints, so that it can't verify replication by unicast, then it ALWAYS multicasts: it validates that stream's replicas regardless of repMulticastFrequency. repMulticastFrequency tells HP to do a multicast for a certain fraction of streams, REGARDLESS of whether they could be validated with hints. Increasing repMulticastFrequency will thus help find overreplication (HP won't trim excess replicas EXCEPT when it does a multicast), but it makes no difference for underreplication.
This parameter can be set under either name, repMulticastFrequency or health.replicationMulticastFrequency. The default value is 1, meaning that HP will use multicast at least 1% of the time, whether it needs to or not.
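The decision described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this article, not Swarm's actual implementation: the function name choose_verification, the hints list, and the rng argument are all hypothetical, and the sketch assumes the parameter is a percentage, as the default of 1 meaning 1% suggests.

```python
import random


def choose_verification(hints, rep_multicast_frequency=1.0, rng=random.random):
    """Pick how one stream is verified on one HP cycle (illustrative only).

    hints: last known replica locations for the stream (hypothetical list).
    rep_multicast_frequency: assumed to be a percentage from 0 to 100.
    rng: callable returning a float in [0, 1); injectable for testing.
    """
    # No hints: unicast verification is impossible, so always multicast,
    # regardless of rep_multicast_frequency.
    if not hints:
        return "multicast"
    # Hints exist: multicast anyway for the configured fraction of streams.
    if rng() * 100.0 < rep_multicast_frequency:
        return "multicast"
    # Otherwise verify by unicasting to the hinted locations.
    return "unicast"


if __name__ == "__main__":
    random.seed(0)
    picks = [choose_verification(["node-a", "node-b"]) for _ in range(100_000)]
    # With the default of 1, the multicast share should come out near 1%.
    print(picks.count("multicast") / len(picks))
```

In this model, raising the value only changes how often streams that already have hints get a multicast check, which is the case where excess replicas can be trimmed. Streams without hints take the first branch whatever the setting.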
* Swarm stores "hints" for each object: these are simply the last known locations of the other replicas of that object. Remember that HP could move an object to better balance performance or storage usage, or because of a volume recovery, so these hints are not guaranteed to be accurate, but they are reasonably likely to be. One of HP's jobs is to periodically update those hints.