Class JoinerDelay

java.lang.Object
org.apache.sling.discovery.oak.JoinerDelay
All Implemented Interfaces:
ClusterSyncService

public class JoinerDelay extends Object implements ClusterSyncService
The JoinerDelay is a ClusterSyncService used in the sync chain; it was introduced as part of SLING-10489.

With SLING-10489, newly joining instances are ignored/suppressed by the existing instances in the cluster for as long as they may be only partially started. An instance counts as fully started once it has written everything that the consistencyService (sync) needs to succeed, namely: lease update / idMap / leaderElectionId / syncToken. How long a startup takes is undefined, so to avoid blocking the other instances from operating under a well-defined topology, the notion of ignoring/suppressing partially started instances was introduced.

Generally speaking, the following cases of changes in the local cluster can be distinguished:

  • only properties change -> that is handled separately already and not interesting here
  • only leaving instances -> not interesting for this problem
  • only joining instances: these joining instances are ignored/suppressed until they are fully started and have written all needed discovery data. Until then, the existing instances don't do anything discovery-related; in particular, they don't store a syncToken either. Thus, once the newly joined instances have finished starting up, they have to wait for the existing instances to take note of their full startup and write their own sync tokens; only then can the new-joiners see those sync tokens and finish. So this case is handled perfectly well by ignoring/suppressing alone.
  • some leaving, some joining instances: this is the trickier case. With SLING-10489 the joining instances are ignored/suppressed until they are fully started, so on their own they don't trigger any discovery activity on the existing instances. However, because some instances are also leaving, the existing instances do take note of a cluster change and *therefore* update the syncToken etc. This creates a new situation: the cluster change has been announced in the existing instances and they have written their new sync token, but the new-joiners are still only partially started. Suppose the new-joiners now finish their startup and write their sync token. At that very moment the following happens concurrently: (a) the new-joiners check the topology, notice that everybody else has already written the new sync token, and can therefore immediately go ahead and do a TOPOLOGY_INIT; (b) the existing instances only now stop ignoring/suppressing the new-joiners and go through the consistencyService/syncing, but before they can do that, they have to inform their existing listeners with a TOPOLOGY_CHANGING. Since that might take a while, it is realistic that a new-joiner already considers itself part of the new topology *while* the existing instances haven't received a TOPOLOGY_CHANGING event yet. The result is a short-lived split-brain. Usually it should be very short-lived indeed, as the only things holding it back are the TopologyEventListeners reacting to TOPOLOGY_CHANGING plus some repository writes. None of that should take long, but it could be a few seconds, and the aim of discovery is to guarantee that instances never see different topologies at the same time. Fixing this properly would probably require another round of syncing, which would be unfeasibly complicated. But there is a rather simple way out: artificially delay the new-joiners from sending their very first TOPOLOGY_INIT.
That way, as long as the delay is longer than the race condition described above, things are fine. And that is what this JoinerDelay is about: delaying the new-joiner's TOPOLOGY_INIT.
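The delaying idea described above can be illustrated with a minimal sketch. It is not the Sling implementation: the class name JoinerDelaySketch is hypothetical, a plain java.util.concurrent.ScheduledExecutorService stands in for the Scheduler passed to the real constructor, and the BaseTopologyView parameter is omitted. The sketch also assumes that only the very first sync is delayed and that later syncs invoke the callback immediately.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for JoinerDelay; illustrates the idea only. */
public class JoinerDelaySketch {

    private final long timeoutMs;
    private final ScheduledExecutorService scheduler; // assumption: replaces the Sling Scheduler
    private boolean firstSyncDone = false;
    private int generation = 0;          // bumped on every cancel, invalidates older invocations
    private ScheduledFuture<?> pending;  // the currently scheduled delayed callback, if any

    public JoinerDelaySketch(long timeoutMs, ScheduledExecutorService scheduler) {
        this.timeoutMs = timeoutMs;
        this.scheduler = scheduler;
    }

    /** Delays the callback of the very first sync; later syncs call it right away. */
    public void sync(Runnable callback) {
        synchronized (this) {
            cancelSync(); // a new sync discards any previous, unfinished invocation
            if (!firstSyncDone) {
                final int myGeneration = generation;
                pending = scheduler.schedule(
                        () -> fire(myGeneration, callback),
                        timeoutMs, TimeUnit.MILLISECONDS);
                return;
            }
        }
        // The initial delay has already elapsed once: nothing left to wait for.
        callback.run();
    }

    private void fire(int scheduledGeneration, Runnable callback) {
        synchronized (this) {
            if (scheduledGeneration != generation) {
                return; // cancelled or superseded in the meantime
            }
            firstSyncDone = true;
            pending = null;
        }
        callback.run();
    }

    /** Cancels a pending delayed callback, if there is one. */
    public synchronized void cancelSync() {
        generation++;
        if (pending != null) {
            pending.cancel(false);
            pending = null;
        }
    }
}
```

With a timeout of, say, 2000 ms, the first call to sync(callback) on a freshly started instance runs the callback roughly two seconds later, which gives the existing instances time to deliver TOPOLOGY_CHANGING before the new-joiner proceeds towards TOPOLOGY_INIT.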
  • Constructor Details

    • JoinerDelay

      public JoinerDelay(long timeoutMs, Scheduler scheduler)
  • Method Details

    • sync

      public void sync(BaseTopologyView view, Runnable callback)
      Description copied from interface: ClusterSyncService
      Starts the synchronization process and calls the provided callback upon completion.

      sync() is not thread-safe and should not be invoked concurrently.

      If sync() gets called before a previous invocation has finished, that previous invocation is discarded, i.e. the callback of the previous invocation will no longer be called.

      The synchronization process consists of making sure that the repository has processed any potential backlog of instances that are no longer part of the provided, new view. Plus it writes a 'sync-token' to a well-defined location, with all peers doing the same, and upon seeing all other sync-tokens declares successful completion - at which point it calls the callback.run().

      Specified by:
      sync in interface ClusterSyncService
      Parameters:
      view - the view which all instances in the local cluster should agree on having seen
      callback - the runnable which should be called after successful syncing
    • cancelSync

      public void cancelSync()
      Specified by:
      cancelSync in interface ClusterSyncService
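
The sync-token handshake described for ClusterSyncService.sync() can be sketched in a few lines. This is only an illustration under stated assumptions: SyncTokenSketch and its methods are hypothetical names, an in-memory map stands in for the well-defined repository location, and a numeric view sequence number stands in for the token value. A real implementation would re-check the peers' tokens periodically instead of once.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration of the sync-token idea; not part of the Sling API. */
public class SyncTokenSketch {

    // Assumption: stands in for the repository location where all peers write their tokens.
    private final Map<String, Long> tokens = new ConcurrentHashMap<>();

    /** An instance announces that it has seen the view with the given sequence number. */
    public void writeToken(String instanceId, long viewSeqNum) {
        tokens.put(instanceId, viewSeqNum);
    }

    /** True once every member of the view has written a token for exactly this view. */
    public boolean allTokensSeen(Set<String> members, long viewSeqNum) {
        for (String member : members) {
            Long token = tokens.get(member);
            if (token == null || token != viewSeqNum) {
                return false;
            }
        }
        return true;
    }

    /**
     * One step of the handshake: write the local token, then run the callback
     * only if all peers have written theirs. Returns whether the sync completed.
     */
    public boolean syncStep(String localId, Set<String> members, long viewSeqNum, Runnable callback) {
        writeToken(localId, viewSeqNum);
        if (allTokensSeen(members, viewSeqNum)) {
            callback.run();
            return true;
        }
        return false; // a caller would retry later, once more peers have written
    }
}
```

In this sketch, the instance that writes its token last is the first to see the handshake complete; the others see it on their next check. That matches the situation described in the class comment, where a new-joiner finds all other tokens already written.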