vespa-feeder

vespa-feeder is a feeding client that parses JSON input as Vespa document operations and sends them to a Vespa application. The file format is auto-detected. Run vespa-feeder --help for usage. Typical use is:

$ vespa-feeder mydocs.json
The binary parses the content of the input sequentially and feeds each operation in order. However, since many operations are pending at any one time, and because the processing time of an operation varies, there is no guarantee about the order in which operations reach the backend. Because ordering can be very important for operations that apply to the same document id, vespa-feeder never sends an operation for a document id that already has a pending operation.
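The per-document-id ordering rule described above can be illustrated with a minimal sketch. This is not vespa-feeder's actual implementation; the class name PerIdSequencer, its methods, and the send callback are hypothetical names chosen for illustration. The idea is to hold back any operation whose document id already has an operation in flight, and release it once the reply for the earlier operation arrives.

```python
from collections import defaultdict, deque


class PerIdSequencer:
    """Illustrative only: holds back operations whose document id
    already has an operation in flight (hypothetical helper, not Vespa code)."""

    def __init__(self, send):
        self._send = send                   # callable(doc_id, op) that dispatches an operation
        self._pending = set()               # document ids with an operation in flight
        self._queued = defaultdict(deque)   # operations waiting, per document id

    def submit(self, doc_id, op):
        """Send op now, or queue it if doc_id already has a pending operation."""
        if doc_id in self._pending:
            self._queued[doc_id].append(op)
        else:
            self._pending.add(doc_id)
            self._send(doc_id, op)

    def on_reply(self, doc_id):
        """Call when the backend has answered the in-flight operation for doc_id."""
        queue = self._queued.get(doc_id)
        if queue:
            op = queue.popleft()
            if not queue:
                del self._queued[doc_id]
            # doc_id stays pending: the next queued operation is now in flight.
            self._send(doc_id, op)
        else:
            self._pending.discard(doc_id)
```

With this sketch, two operations submitted for the same id are dispatched strictly one after the other, while operations for different ids can be in flight concurrently and may complete in any order.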

An option to note is --abortondataerror, which can be set to no when the input contains errors (e.g. invalid characters). With this setting, vespa-feeder reports parsing errors at the end of the feed, but it does not abort.

Troubleshooting

At the end of the feed, vespa-feeder prints a report. To print this report once a minute, use --verbose. Sample output:

Messages sent to vespa (route default) :
----------------------------------------
PutDocument: ok: 999997 msgs/sec: 411.38 failed: 0 ignored: 0 latency(min, max, avg): 2, 4360, 99
ignored reports the number of documents that could not be routed to any content cluster because they did not match any of the configured document types or selections. Examples are:
  • A document type is removed from the application and the feed file contains documents with this type
  • One or more selection expressions restrict the documents the cluster accepts, and the feed file contains documents that are excluded. An example is feeding expired documents: the selection accepts only documents that are less than 30 days old, and the feed file contains documents that are 30 or more days old.
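When feeding from a script, it can be useful to check the failed and ignored counters automatically. The following is a minimal sketch that parses one report line. It assumes the line layout matches the single sample shown above; that layout is inferred from the sample and is not a documented, stable format. The names REPORT_LINE and parse_report_line are hypothetical.

```python
import re

# Pattern inferred from the sample report line; an assumption, not a documented format.
REPORT_LINE = re.compile(
    r"(?P<operation>\w+):\s+ok:\s+(?P<ok>\d+)\s+msgs/sec:\s+(?P<rate>[\d.]+)\s+"
    r"failed:\s+(?P<failed>\d+)\s+ignored:\s+(?P<ignored>\d+)\s+"
    r"latency\(min, max, avg\):\s+(?P<lat_min>\d+),\s+(?P<lat_max>\d+),\s+(?P<lat_avg>\d+)"
)


def parse_report_line(line):
    """Return the counters of one report line as a dict, or None if it does not match."""
    match = REPORT_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "operation": fields["operation"],
        "ok": int(fields["ok"]),
        "msgs_per_sec": float(fields["rate"]),
        "failed": int(fields["failed"]),
        "ignored": int(fields["ignored"]),
        "latency_min": int(fields["lat_min"]),
        "latency_max": int(fields["lat_max"]),
        "latency_avg": int(fields["lat_avg"]),
    }


sample = ("PutDocument: ok: 999997 msgs/sec: 411.38 failed: 0 ignored: 0 "
          "latency(min, max, avg): 2, 4360, 99")
report = parse_report_line(sample)
```

A non-zero ignored value in the parsed result is a hint to check the document types and selection expressions described above.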