The motivation behind this feature is to ensure that, when using distribution, backups are not picked to reside on the same physical server, rack or data centre. For obvious reasons, it doesn't work with total replication.
Server hinting was added in Infinispan 4.2.0 "Ursus". You'll need this release or later in order to use it.
The hints are configured at transport level:
<transport clusterName="MyCluster" machineId="LinuxServer01" rackId="Rack01" siteId="US-WestCoast" />
machineId - this is probably the most useful of the three, disambiguating between multiple JVM instances running on the same node, or even multiple virtual hosts on the same physical host.
rackId - in larger clusters with nodes occupying more than a single rack, this setting helps prevent backups from being stored on the same rack.
siteId - to differentiate between nodes in different data centres replicating to each other.
All of the above are optional. If they are not provided, the distribution algorithms make no guarantee that backups will not be stored on instances on the same host, rack or site.
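For context, here is a sketch of where the transport element sits in a full configuration file. The cluster, machine, rack and site names are placeholders, and the surrounding elements are an illustrative minimal setup for distribution, not a complete configuration:

```xml
<infinispan>
   <global>
      <!-- Topology hints are attributes of the transport element -->
      <transport clusterName="MyCluster"
                 machineId="LinuxServer01"
                 rackId="Rack01"
                 siteId="US-WestCoast" />
   </global>
   <default>
      <!-- Hints only matter in distribution mode -->
      <clustering mode="distribution" />
   </default>
</infinispan>
```

Each node in the cluster would declare its own machineId, rackId and siteId values reflecting where it physically runs.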
This is an advanced topic, useful, for example, if you need to change distribution behaviour.
The consistent hash behind this implementation is wheel based. Conceptually, it works as follows: each node is placed on a wheel, ordered by the hash code of its address. When an entry is added, its owners are chosen using this algorithm:
the key's hash code is calculated
the first node on the wheel with a value greater than the key's hash code is the first owner
for subsequent owners, walk the wheel clockwise and pick nodes that have a different site id
if not enough nodes are found, repeat the walk and also pick nodes that have a different site id and rack id
if not enough nodes are found, repeat the walk and also pick nodes that have a different site id, rack id and machine id
ultimately, cycle back to the first node selected and don't discard any nodes, regardless of machine id, rack id or site id
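The steps above can be sketched in Java. This is an illustrative implementation only, not Infinispan's actual code: the class and method names (TopologyAwareWheel, Node, locate) are hypothetical, and the relaxation passes interpret the walk as accepting progressively less diverse nodes until enough owners are found:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical sketch of wheel-based, topology-aware owner selection.
public class TopologyAwareWheel {

    // A cluster member together with its (optional) topology hints.
    public static class Node {
        public final String name, siteId, rackId, machineId;
        public Node(String name, String siteId, String rackId, String machineId) {
            this.name = name;
            this.siteId = siteId;
            this.rackId = rackId;
            this.machineId = machineId;
        }
    }

    // The "wheel": nodes ordered by the hash code of their address.
    private final TreeMap<Integer, Node> wheel = new TreeMap<Integer, Node>();

    public void add(Node n) {
        wheel.put(n.name.hashCode() & Integer.MAX_VALUE, n);
    }

    // Picks numOwners owners for the given key.
    public List<Node> locate(Object key, int numOwners) {
        int h = key.hashCode() & Integer.MAX_VALUE;
        List<Node> owners = new ArrayList<Node>();
        if (wheel.isEmpty()) return owners;
        // First owner: the first node on the wheel past the key's position
        // (wrapping around to the start of the wheel if necessary).
        Node first = wheel.ceilingEntry(h) != null
                ? wheel.ceilingEntry(h).getValue()
                : wheel.firstEntry().getValue();
        owners.add(first);
        // Walk clockwise, relaxing the topology constraint on each pass:
        // pass 0 requires a different site, pass 1 also accepts a different
        // rack, pass 2 also a different machine, the final pass any node.
        for (int pass = 0; pass <= 3 && owners.size() < numOwners; pass++) {
            for (Node cand : clockwiseFrom(h)) {
                if (owners.size() >= numOwners) break;
                if (owners.contains(cand)) continue;
                if (acceptable(first, cand, pass)) owners.add(cand);
            }
        }
        return owners;
    }

    // Nodes in clockwise order, starting at the key's position on the wheel.
    private List<Node> clockwiseFrom(int h) {
        List<Node> order = new ArrayList<Node>(wheel.tailMap(h).values());
        order.addAll(wheel.headMap(h).values());
        return order;
    }

    private static boolean acceptable(Node first, Node cand, int pass) {
        boolean diffSite = !Objects.equals(first.siteId, cand.siteId);
        boolean diffRack = !Objects.equals(first.rackId, cand.rackId);
        boolean diffMachine = !Objects.equals(first.machineId, cand.machineId);
        switch (pass) {
            case 0: return diffSite;                        // other site only
            case 1: return diffSite || diffRack;            // or other rack
            case 2: return diffSite || diffRack || diffMachine;
            default: return true;                           // accept anyone
        }
    }
}
```

With four nodes spread across two sites, asking for two owners yields a first owner plus a backup in the other site; asking for more owners than diverse nodes exist eventually accepts nodes from the same machine, rack and site, as the last step describes.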