Search API

This is the Vespa Search API guide - refer to the search API reference for details.

HTTP

  • The host:port endpoint is the Search Container. The general form of a search request is:
      http://host:port/search/?param1=value1&param2=value2&...
    
  • The only mandatory parameter is yql
  • Use GET or POST - these are equivalent:
    $ curl "http://localhost:8080/search/?yql=select+%2A+from+sources+%2A+where+default+contains+%22bad%22%3B"
    $ curl --data "yql=select+%2A+from+sources+%2A+where+default+contains+%22bad%22%3B" http://localhost:8080/search/
    
  • The Search Container uses Jetty for HTTP. The HTTP server is configurable - e.g. set requestHeaderSize to change the maximum size of the request line plus headers, which bounds the URL length:
    <container version="1.0">
      <http>
        <server port="8080" id="myserver">
          <config name="jdisc.http.connector">
            <requestHeaderSize>32768</requestHeaderSize>
          </config>
        </server>
      </http>
    </container>
    
  • HTTP keepalive is supported
  • Parameter values must be URL encoded. Thus, a space is encoded as +, a literal + as %2b and so on - see RFC 2396
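
The encoding rule above can be reproduced with Python's standard library. This is a minimal sketch, not part of the Vespa API: the endpoint is the one used in the curl examples, and build_get_url is a hypothetical helper name chosen for illustration.

```python
from urllib.parse import urlencode

# Same host:port as the curl examples above; adjust for a real deployment.
ENDPOINT = "http://localhost:8080/search/"


def build_get_url(params, endpoint=ENDPOINT):
    """Return a GET URL with every parameter value URL encoded (hypothetical helper)."""
    # urlencode uses quote_plus: space -> "+", "+" -> "%2B", '"' -> "%22", ...
    return endpoint + "?" + urlencode(params)


yql = 'select * from sources * where default contains "bad";'

get_url = build_get_url({"yql": yql})
# The same encoded string can be sent as a POST body (compare curl --data).
post_body = urlencode({"yql": yql})

print(get_url)
print(post_body)
```

Letting a library do the encoding avoids hand-escaping characters such as * and ; in the YQL string.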

Features

The query string specifies which results the search should return, typically words that must be present in matching documents. Queries are formulated in YQL. Note: a reference for the legacy simple query language is also available.

If Vespa cannot generate a valid search expression from the query string, it will issue the error message Null query. To troubleshoot, add &tracelevel=2 to the request. A missing yql parameter will also lead to this error message.

Examples - refer to YQL, grouping and the sorting reference for details:

Ordering
$ curl "http://localhost:8080/search/?yql=select+%2A+from+sources+%2A+where+default+contains+%22bad%22\
+order+by+year+desc%3B"
Grouping
$ curl "http://localhost:8080/search/?yql=select+%2A+from+sources+%2A+where+default+contains+%22bad%22\
+%7Call%28group%28year%29+each%28output%28sum%28duration%29%29%29%29%3B"
Pagination
$ curl "http://localhost:8080/search/?yql=select+%2A+from+sources+%2A+where+default+contains+%22bad%22\
+limit+2+offset+1%3B"
Numeric
$ curl "http://localhost:8080/search/?yql=select+%2A+from+sources+%2A+where+year+%3E+2000%3B"
Timeout
$ curl "http://localhost:8080/search/?yql=select+%2A+from+sources+%2A+where+default+contains+%22bad%22\
+timeout+100%3B"
Regexp
$ curl "http://localhost:8080/search/?yql=select+%2A+from+sources+%2A+where+title+matches+%22mado%5Bn%5D%2Ba%22%3B"
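
The encoded examples above are hard to read. As a reading aid, the sketch below decodes the yql value of each example back to plain YQL using Python's standard library. The dictionary keys are labels chosen here for illustration; the encoded strings are copied from the examples, with the line continuations joined.

```python
from urllib.parse import unquote_plus

PREFIX = "select+%2A+from+sources+%2A+where+default+contains+%22bad%22"

# yql values from the examples above, with continuation lines joined.
encoded_examples = {
    "ordering": PREFIX + "+order+by+year+desc%3B",
    "grouping": PREFIX + "+%7Call%28group%28year%29+each%28output%28sum%28duration%29%29%29%29%3B",
    "pagination": PREFIX + "+limit+2+offset+1%3B",
    "numeric": "select+%2A+from+sources+%2A+where+year+%3E+2000%3B",
    "timeout": PREFIX + "+timeout+100%3B",
    "regexp": "select+%2A+from+sources+%2A+where+title+matches+%22mado%5Bn%5D%2Ba%22%3B",
}

# unquote_plus turns "+" into a space and "%XX" escapes into characters.
decoded = {name: unquote_plus(value) for name, value in encoded_examples.items()}

for name, yql in decoded.items():
    print(f"{name}: {yql}")
```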

Query parameters

Below is a list of query parameters to control aspects of queries - refer to the search API reference for the full list.
  • ranking - Unless ordering is specified, results are ranked using the default rank profile nativerank. Vespa has a rich ranking framework; read more in ranking. Control result ranking using rank profiles.
  • searchChain - Use search chains to implement query processing. Set &tracelevel=2 to inspect the search chain components. Refer to chained components.
  • sources - An application can have multiple content clusters - Vespa searches in all by default. Federation controls how to query the clusters; sources names the clusters to query.
  • pos.ll - Specify a position using latitude and longitude to implement geo search.
  • queryProfile - Use query profiles to store query parameters in configuration. This makes query strings shorter, and makes it easy to modify queries by modifying configuration only. Use cases are setting query properties for different markets, parameters that do not change, and so on. Query profiles can be nested, versioned and use inheritance.
  • tracelevel - Set to a positive integer to see query tracing. Higher numbers produce more tracing output.
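
Query parameters are passed alongside yql in the same request. The sketch below combines a query with two of the parameters listed above. It is an illustration under assumptions: the cluster name music is hypothetical, and the endpoint is the one from the curl examples.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

params = {
    "yql": 'select * from sources * where default contains "bad";',
    "tracelevel": 2,      # the value the text above suggests for troubleshooting
    "sources": "music",   # hypothetical content cluster name
}

url = "http://localhost:8080/search/?" + urlencode(params)
print(url)

# Round trip: parsing the query string gives back the original values as strings.
parsed = parse_qs(urlsplit(url).query)
print(parsed["tracelevel"])
```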

Default Result Format

The default output format is JSON. The basic structure is:

{
    "root": {
        "children": [
            objects with same structure as root itself...
        ],
        "fields": {
            "document field name": "document field contents",
            …
        }
    }
}
A complete example of the structure:
{
    "root": {
        "children": [
            {
                "children": [
                    {
                        "fields": {
                            "c": "d",
                            "uri": "http://localhost/1"
                        },
                        "id": "http://localhost/1",
                        "relevance": 0.9,
                        "types": [
                            "summary"
                        ]
                    }
                ],
                "id": "usual",
                "relevance": 1.0
            },
            {
                "fields": {
                    "e": "f"
                },
                "id": "type grouphit",
                "relevance": 1.0,
                "types": [
                    "grouphit"
                ]
            },
            {
                "fields": {
                    "description": "foo",
                    "uri": "http://localhost/"
                },
                "id": "http://localhost/",
                "relevance": 0.95,
                "types": [
                    "summary"
                ]
            }
        ],
        "coverage": {
            "coverage": 100,
            "documents": 500,
            "full": true,
            "nodes": 1,
            "results": 1,
            "resultsFull": 1
        },
        "errors": [
            {
                "code": 18,
                "message": "boom",
                "summary": "Internal server error."
            }
        ],
        "fields": {
            "totalCount": 130
        },
        "id": "toplevel",
        "relevance": 1.0
    }
}
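
Because every node can hold children with the same structure as the root, a client typically walks the result recursively. The following is a minimal sketch of such a walk using a trimmed copy of the example above; iter_hits is a hypothetical helper name, and treating every node with a "fields" object as a hit is an assumption made for illustration.

```python
import json

# Trimmed copy of the complete example above.
response_text = """
{
  "root": {
    "id": "toplevel",
    "relevance": 1.0,
    "fields": {"totalCount": 130},
    "errors": [
      {"code": 18, "message": "boom", "summary": "Internal server error."}
    ],
    "children": [
      {
        "id": "usual",
        "relevance": 1.0,
        "children": [
          {"id": "http://localhost/1", "relevance": 0.9,
           "fields": {"c": "d", "uri": "http://localhost/1"}}
        ]
      },
      {"id": "type grouphit", "relevance": 1.0, "fields": {"e": "f"}},
      {"id": "http://localhost/", "relevance": 0.95,
       "fields": {"description": "foo", "uri": "http://localhost/"}}
    ]
  }
}
"""


def iter_hits(node):
    """Yield, depth first, every descendant of node that has a "fields" object."""
    for child in node.get("children", []):
        if "fields" in child:
            yield child
        yield from iter_hits(child)


root = json.loads(response_text)["root"]

total_count = root.get("fields", {}).get("totalCount")
errors = root.get("errors", [])
hits = [(hit["id"], hit["relevance"]) for hit in iter_hits(root)]

print("total:", total_count)
for error in errors:
    print("error", error["code"], error["summary"], error["message"])
for hit_id, relevance in hits:
    print(relevance, hit_id)
```

Checking the errors array before using the hits is a sensible habit, since a response can carry both, as the example shows.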