QPS Scaling in an Indexed Content Cluster

Document distribution is used to scale QPS in an indexed content cluster. With flat distribution, documents are uniformly distributed across all content/search nodes. This distribution is used to scale QPS when the dynamic query cost is high. With hierarchical distribution, documents are distributed among a set of groups, each containing content/search nodes, such that the entire document collection is contained in each group. This distribution is used to scale QPS in cases where the static query cost is high. This article describes:

  • The differences between flat and hierarchical distribution
  • How flat and hierarchical distribution can be used for QPS scaling
  • How to set up hierarchical distribution in a system
  • How to grow a live system using hierarchical distribution to handle more QPS
For more detailed information on capacity planning and performance tuning, see the Vespa search sizing guide. For more information on the general use of hierarchical distribution, see Document distribution.

Flat vs Hierarchical Distribution

With flat distribution, documents are distributed equally across all content/search nodes in the cluster. The redundancy of the cluster specifies how many copies of each document exist among the nodes. For example, with redundancy 2, each document is located on two nodes.

Hierarchical distribution defines a number of groups, each containing a set of content/search nodes. The documents are uniformly distributed across all nodes in each group, such that the entire document collection is contained in each group. The total redundancy of the cluster specifies how many copies of each document exist in total. For example, with a total redundancy of 4 split evenly over two groups, each group holds 2 copies of every document.

Search Query Handling

In addition to an indexed content cluster, there are Container nodes running the Search Container. These scatter queries to content nodes and gather the partial results. When using flat distribution, each query is sent (in parallel) to all the content nodes. With hierarchical distribution, a single group is instead selected (round robin) for each query, and the query is sent only to the nodes of that group.
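
The difference in query fan-out can be illustrated with a minimal Python sketch. This is a toy model, not Vespa's dispatcher: the names `flat_targets` and `RoundRobinDispatcher` are hypothetical, and the group and node names are only borrowed from the configuration examples in this article.

```python
from itertools import cycle


def flat_targets(nodes):
    """Flat distribution: every query is sent to every content node."""
    return list(nodes)


class RoundRobinDispatcher:
    """Hierarchical distribution: each query goes to the nodes of one group.

    Hypothetical helper for illustration only; groups are visited in
    sorted name order, one group per query.
    """

    def __init__(self, groups):
        # groups maps a group name to the list of nodes in that group
        self._groups = groups
        self._order = cycle(sorted(groups))

    def targets(self):
        group = next(self._order)
        return group, list(self._groups[group])


groups = {
    "mygroup0": ["node0", "node1", "node2"],
    "mygroup1": ["node3", "node4", "node5"],
}
dispatcher = RoundRobinDispatcher(groups)

# Four consecutive queries alternate between the two groups.
picks = [dispatcher.targets()[0] for _ in range(4)]
```

In this model a query under flat distribution touches all six nodes, while under hierarchical distribution it touches only the three nodes of the selected group.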

Scaling

An important aspect when sizing an indexed content cluster is the QPS (queries per second) the system can handle. The cost of running a query is the sum of two parts: the static query cost (SQC) and the dynamic query cost (DQC). SQC is an administrative overhead, and is typically the cost of building a large query with many terms and/or the cost of doing an expensive second-phase rank computation on the content/search nodes. DQC depends on, and scales with, the number of documents on the content/search nodes. We can scale the QPS in two ways (or a combination of the two), depending on the ratio SQC / (SQC + DQC). In both cases, the number of search container nodes must typically be increased to fully utilize the content/search nodes.

Add nodes (low static query cost): As long as SQC / (SQC + DQC) is small, QPS can be scaled by just adding more content/search nodes in a system using flat distribution. This reduces the number of documents per node, and thus reduces the DQC and the total query cost.
Add groups (high static query cost): When SQC / (SQC + DQC) is high, one can no longer just add more content/search nodes. With flat distribution the query is sent to all content/search nodes in the cluster, so every node pays the SQC for every query and the total SQC increases with the number of nodes. Address this problem by using hierarchical distribution. Each group can handle a particular QPS, so by adding more groups, one can effectively scale QPS.
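
The two scaling strategies can be compared with a small back-of-the-envelope model in Python. It is a sketch under simplifying assumptions that are not part of Vespa: each node processes one query at a time, DQC is linear in the number of documents on a node, and the function names and all numbers are hypothetical.

```python
def query_cost(sqc, dqc_per_doc, docs_per_node):
    """Per-node cost of one query in seconds: static part plus dynamic part."""
    return sqc + dqc_per_doc * docs_per_node


def cluster_qps(sqc, dqc_per_doc, total_docs, nodes_per_group, groups=1):
    """Approximate QPS when each query is handled by one group.

    With groups=1 this models flat distribution, where every node takes
    part in every query.
    """
    cost = query_cost(sqc, dqc_per_doc, total_docs / nodes_per_group)
    return groups / cost


# Assumed workload: SQC of 8 ms, DQC of 2 ms per million documents,
# 3 million documents. On 3 nodes, SQC / (SQC + DQC) is 0.8, which is high.
SQC = 0.008
DQC_PER_DOC = 2e-9
DOCS = 3_000_000

baseline = cluster_qps(SQC, DQC_PER_DOC, DOCS, nodes_per_group=3)
more_nodes = cluster_qps(SQC, DQC_PER_DOC, DOCS, nodes_per_group=6)
more_groups = cluster_qps(SQC, DQC_PER_DOC, DOCS, nodes_per_group=3, groups=2)

print(f"3 nodes, flat:       {baseline:.0f} QPS")
print(f"6 nodes, flat:       {more_nodes:.0f} QPS")
print(f"2 groups of 3 nodes: {more_groups:.0f} QPS")
```

Under these assumptions, doubling the node count in a flat cluster raises QPS from roughly 100 to roughly 111, because only the DQC share shrinks, while two groups of three nodes reach roughly 200.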

Migrate to using groups - procedure

When migrating from flat distribution to a hierarchy, consider increasing redundancy first and then reconfiguring to use groups. This ensures proper search coverage during the transition. When adding new groups, it takes some time to populate the new nodes, and queries hitting these nodes return partial results during this transition period. See the reference documentation for details.

Example - migrate to using groups. Starting point: a cluster with 4 nodes, flat distribution and redundancy 2. End state: 4 groups of one node each, with all documents on all 4 nodes. The starting configuration:

<content id="music" version="1.0">
  <redundancy>2</redundancy>
  <documents>
    <document mode="index" type="music" />
  </documents>
  <nodes>
    <node hostalias="node0"/>
    <node hostalias="node1"/>
    <node hostalias="node2"/>
    <node hostalias="node3"/>
  </nodes>
  <engine>
    <proton>
      <searchable-copies>2</searchable-copies>
    </proton>
  </engine>
</content>
Increase redundancy and searchable-copies to 4. After deploying this, wait for all search nodes to finish redistribution (i.e. all nodes have all documents):
<content id="music" version="1.0">
  <redundancy>4</redundancy>
  <documents>
    <document mode="index" type="music" />
  </documents>
  <nodes>
    <node hostalias="node0"/>
    <node hostalias="node1"/>
    <node hostalias="node2"/>
    <node hostalias="node3"/>
  </nodes>
  <engine>
    <proton>
      <searchable-copies>4</searchable-copies>
    </proton>
  </engine>
</content>
Reconfigure to 4 groups, keeping redundancy and searchable-copies at 4 so that each group holds one copy of every document:
<content id="music" version="1.0">
  <redundancy>4</redundancy>
  <documents>
    <document mode="index" type="music" />
  </documents>
  <group distribution-key="0" name="mytopgroup">
    <distribution partitions="1|1|1|*"/>
    <group distribution-key="0" name="mygroup0">
      <node distribution-key="0" hostalias="node0"/>
    </group>
    <group distribution-key="1" name="mygroup1">
      <node distribution-key="1" hostalias="node1"/>
    </group>
    <group distribution-key="2" name="mygroup2">
      <node distribution-key="2" hostalias="node2"/>
    </group>
    <group distribution-key="3" name="mygroup3">
      <node distribution-key="3" hostalias="node3"/>
    </group>
  </group>
  <engine>
    <proton>
      <searchable-copies>4</searchable-copies>
    </proton>
  </engine>
</content>

Hierarchical distribution

You can exercise full control over group layout by specifying a hierarchy of groups in the content cluster definition in services.xml. A flat distribution will have all content/search nodes defined under a single top group, while hierarchical distribution introduces an extra level of groups under the top group.

Content cluster using flat distribution

The following example has a simple content cluster with flat distribution, redundancy 2 and 3 content/search nodes:

<content version="1.0" id="mycluster">
  <redundancy>2</redundancy>
  <documents>
    <document mode="index" type="mydoctype"/>
  </documents>
  <group distribution-key="0" name="mytopgroup">
    <node distribution-key="0" hostalias="node0"/>
    <node distribution-key="1" hostalias="node1"/>
    <node distribution-key="2" hostalias="node2"/>
  </group>
  <engine>
    <proton>
      <searchable-copies>2</searchable-copies>
    </proton>
  </engine>
</content>
Let's say we need to increase the QPS in our system by a factor of 2 and that the static query cost (SQC) is high. We cannot just add 3 more content/search nodes to reach the desired QPS. Instead we use hierarchical distribution to get extra groups that also contain the entire document collection. The next example shows how to set up such a system.

Content cluster using hierarchical distribution

In the following example we have a similar content cluster that uses hierarchical distribution. We have defined 2 leaf groups (mygroup0, mygroup1) that are located under the top group. Each leaf group has 3 content/search nodes. The total redundancy of the system is now 4, and the distribution partitions are specified such that each leaf group has 2 copies of the documents in the system. Each query is sent to the nodes of a single group instead of all nodes in the system, so the QPS requirement can be met by serving different queries in different groups in parallel. With redundancy 2 per leaf group we can also lose a node from a group and still have enough coverage.

<content version="1.0" id="mycluster">
  <redundancy>4</redundancy>
  <documents>
    <document mode="index" type="mydoctype"/>
  </documents>
  <group distribution-key="0" name="mytopgroup">
    <distribution partitions="2|*"/>
    <group distribution-key="0" name="mygroup0">
      <node distribution-key="0" hostalias="node0"/>
      <node distribution-key="1" hostalias="node1"/>
      <node distribution-key="2" hostalias="node2"/>
    </group>
    <group distribution-key="1" name="mygroup1">
      <node distribution-key="3" hostalias="node3"/>
      <node distribution-key="4" hostalias="node4"/>
      <node distribution-key="5" hostalias="node5"/>
    </group>
  </group>
  <engine>
    <proton>
      <searchable-copies>4</searchable-copies>
    </proton>
  </engine>
</content>

Heterogeneous groups

Vespa supports 2 dispatch policies for dispatching queries to the groups. The default is round-robin, which requires the groups to be fairly equal with respect to serving capacity. The alternative is the random policy, which favours fast groups and thus tries to balance the load between them. It selects a random group with a probability that is proportional to the inverse of the group's latency.
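
The inverse-latency weighting can be sketched in a few lines of Python. This illustrates the selection rule only and is not Vespa's implementation: the function names are hypothetical, and how latency is measured and averaged per group is an assumption left outside the sketch.

```python
import random


def group_probabilities(avg_latency_ms):
    """Selection probability per group, proportional to 1 / latency."""
    inverse = {group: 1.0 / latency for group, latency in avg_latency_ms.items()}
    total = sum(inverse.values())
    return {group: weight / total for group, weight in inverse.items()}


def pick_group(avg_latency_ms, rng=random):
    """Draw one group at random using the inverse-latency weights."""
    probabilities = group_probabilities(avg_latency_ms)
    names = list(probabilities)
    weights = [probabilities[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Assumed measurements: mygroup1 is twice as slow as mygroup0, so it
# should receive about one third of the queries.
latencies = {"mygroup0": 10.0, "mygroup1": 20.0}
probabilities = group_probabilities(latencies)
```

With these assumed latencies, mygroup0 is selected with probability 2/3 and mygroup1 with probability 1/3, so the slower group receives proportionally less traffic.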

Restrictions

When using hierarchical distribution in an indexed content cluster, the following restrictions apply:

  • There can only be a single level of leaf groups under the top group
  • Each leaf group must have the same number of nodes unless you specify random dispatch policy
  • The number of leaf groups must be a factor of the redundancy
  • The distribution partitions must be specified such that the redundancy per group is equal. In the example above, the distribution partitions 2|* ensure that 2 document copies are stored in each group. With redundancy 6 and 3 groups, the distribution partitions would be 2|2|* instead
  • The number of searchable copies must be less than or equal to redundancy and the number of leaf groups must be a factor of searchable copies
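
The arithmetic behind these restrictions can be checked with a short Python sketch. The helpers `partitions_spec` and `check_restrictions` are hypothetical and only mirror the rules listed above; they do not call or replace Vespa's own configuration validation.

```python
def partitions_spec(redundancy, leaf_groups):
    """Build a distribution partitions string with equal copies per group."""
    if redundancy % leaf_groups != 0:
        raise ValueError("number of leaf groups must be a factor of redundancy")
    per_group = redundancy // leaf_groups
    # The last group takes the remaining copies, written as '*'.
    return "|".join([str(per_group)] * (leaf_groups - 1) + ["*"])


def check_restrictions(redundancy, searchable_copies, group_sizes,
                       dispatch_policy="round-robin"):
    """Return a list of violated restrictions (empty if none are violated).

    group_sizes holds the number of nodes in each leaf group.
    """
    problems = []
    groups = len(group_sizes)
    if dispatch_policy != "random" and len(set(group_sizes)) > 1:
        problems.append("leaf groups differ in size without random dispatch")
    if redundancy % groups != 0:
        problems.append("number of leaf groups is not a factor of redundancy")
    if searchable_copies > redundancy:
        problems.append("searchable copies exceeds redundancy")
    if searchable_copies % groups != 0:
        problems.append("number of leaf groups is not a factor of searchable copies")
    return problems


# The two cases mentioned in the restrictions above.
two_groups = partitions_spec(redundancy=4, leaf_groups=2)
three_groups = partitions_spec(redundancy=6, leaf_groups=3)
```

Here `two_groups` is `2|*` and `three_groups` is `2|2|*`, matching the partitions given in the restrictions. Calling `check_restrictions(4, 4, [3, 3])` returns an empty list for the hierarchical example in this article.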