Distributor

The distributor keeps track of which search nodes should hold which copy of each bucket, based on the redundancy setting and information from the cluster controller. Distributors are responsible for keeping track of metadata for a non-overlapping subset of the data in the cluster. The distributors each keep a bucket database containing metadata of buckets it is responsible for. This metadata indicates what content nodes store copies of the buckets, the checksum of the bucket content and the number of documents and meta entries within the bucket. Each incoming document is assigned to a bucket and forwarded to the right search nodes. All requests related to these buckets are sent through the distributors.

A distribution algorithm calculates what distributors are responsible for given buckets based on information of what nodes are available in the cluster. Clients can calculate what distributor to talk to using information in the cluster state.

When a distributor goes down or becomes available, all distributors reshuffle what buckets they are responsible for. During this reshuffling, clients may get some requests temporarily failing as distributors taking over bucket ownership doesn't yet know where copies of its buckets are located. These requests will automatically be resent, providing timeouts allow for it. Changes in what distributors are up and available will currently cause windows of availability loss. Other distributors take over bucket ownership. To do this they fetch bucket metadata for new buckets from all storage nodes, in memory operations for speed.

The distribution algorithm also calculates what content nodes copies of buckets should be stored on. As moving bucket copies take time, the distributors track current state in a bucket database, allowing requests to be processed independent of where buckets are currently located. Distributors use the distribution algorithm to detect buckets that are stored on wrong nodes, and move copies to correct nodes when there are free resources to do so.

In addition the distributor can track some historic knowledge. For instance it may know that of the currently existing bucket copies, a given copy has been available all the time and may be trusted to have all the information, while another copy may be on a storage node that recently restarted, so that copy may lack some documents. Such historic state is not persisted, and is thus lost on distributor restarts.

The distributors use the bucket metadata to ensure the buckets are in a good state.

  • If buckets have too few copies, new copies are generated
  • If buckets have too many copies, superfluous copies are deleted once the distributor knows the copies to delete don't contain data other copies do not have
  • If all the copies do not contain the same data, merges are issued to get bucket copies consistent
  • If two buckets exist, such that both may contain the same document, the buckets are split or joined to remove such overlapping buckets
  • If buckets contain too little or too much data, they should be joined or split
  • If not exactly one copy is marked active, and the backend wants a copy to be marked active, activate or deactivate copies to get in a good state
The various maintenance operations have various priorities depending on why they are needed. The distributor tries to prioritize to fix the most critical issues first. If no maintenance operations are needed, the cluster is said to be in the ideal state. The distributors synchronize maintenance load with user load, e.g. to remap requests to other buckets after bucket splitting and joining.