Get started with the Python API to create and modify Vespa applications
This self-contained tutorial creates a basic text search application from scratch based on the MS MARCO dataset, similar to Vespa’s text search tutorials. See the pyvespa API reference for more detail on the API presented here.
!pip install pyvespa
Create a Document instance containing the Fields to store in the app. To keep the application simple, include only the id, title, and body of the MS MARCO documents.
from vespa.package import Document, Field

# id is stored as an attribute; title and body are indexed for text search with BM25 enabled.
document = Document(
    fields=[
        Field(name="id", type="string", indexing=["attribute", "summary"]),
        Field(name="title", type="string", indexing=["index", "summary"], index="enable-bm25"),
        Field(name="body", type="string", indexing=["index", "summary"], index="enable-bm25"),
    ]
)
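The three fields above also fix the shape of every record the application will later store. The following is a minimal sketch in plain Python, independent of pyvespa, that checks a record against those field names. The helper name check_record and the sample record are hypothetical and are not taken from MS MARCO or from the pyvespa API.

```python
# Field names taken from the Document definition above; all three are strings.
EXPECTED_FIELDS = {"id", "title", "body"}


def check_record(record):
    """Return a list of problems found in a record; an empty list means it matches."""
    problems = []
    present = set(record)
    missing = EXPECTED_FIELDS - present
    extra = present - EXPECTED_FIELDS
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected fields: {sorted(extra)}")
    # Every declared field has type "string" in the Document definition.
    for name in sorted(EXPECTED_FIELDS & present):
        if not isinstance(record[name], str):
            problems.append(f"field {name!r} should be a string")
    return problems


# Made-up record used only to illustrate the expected shape.
sample = {"id": "D1", "title": "An example title", "body": "An example body."}
print(check_record(sample))
```

A check like this is only a convenience before feeding data; the deployed application validates documents against its own schema.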
The Schema will be named msmarco and contain the Document instance defined above. The default FieldSet indicates that queries will look for matches by searching both the titles and the bodies of the documents. The default RankProfile indicates that all matched documents will be ranked by the nativeRank expression involving the title and the body of the matched documents.
from vespa.package import Schema, FieldSet, RankProfile

msmarco_schema = Schema(
    name="msmarco",
    document=document,
    # Queries search both title and body by default.
    fieldsets=[FieldSet(name="default", fields=["title", "body"])],
    # Matched documents are ranked by nativeRank over title and body.
    rank_profiles=[RankProfile(name="default", first_phase="nativeRank(title, body)")],
)
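To make the mapping concrete, the sketch below renders roughly the Vespa schema definition that the objects above describe. It is an illustration only: pyvespa generates the real schema file itself, and the render_schema helper and the plain dicts used here are hypothetical stand-ins, not part of the pyvespa API. The exact text pyvespa emits may differ in layout.

```python
def render_schema(name, fields, fieldset, rank_profile):
    """Render an approximate Vespa schema definition from plain dicts."""
    lines = [f"schema {name} {{", f"    document {name} {{"]
    for field in fields:
        lines.append(f"        field {field['name']} type {field['type']} {{")
        # Indexing steps are joined with the pipe operator in the schema language.
        lines.append(f"            indexing: {' | '.join(field['indexing'])}")
        if "index" in field:
            lines.append(f"            index: {field['index']}")
        lines.append("        }")
    lines.append("    }")
    lines.append(f"    fieldset {fieldset['name']} {{")
    lines.append(f"        fields: {', '.join(fieldset['fields'])}")
    lines.append("    }")
    lines.append(f"    rank-profile {rank_profile['name']} {{")
    lines.append("        first-phase {")
    lines.append(f"            expression: {rank_profile['first_phase']}")
    lines.append("        }")
    lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


# Same values as the Document, FieldSet and RankProfile defined above.
schema_text = render_schema(
    name="msmarco",
    fields=[
        {"name": "id", "type": "string", "indexing": ["attribute", "summary"]},
        {"name": "title", "type": "string", "indexing": ["index", "summary"],
         "index": "enable-bm25"},
        {"name": "body", "type": "string", "indexing": ["index", "summary"],
         "index": "enable-bm25"},
    ],
    fieldset={"name": "default", "fields": ["title", "body"]},
    rank_profile={"name": "default", "first_phase": "nativeRank(title, body)"},
)
print(schema_text)
```

Reading the printed text next to the Python definition shows how each argument lands in the schema: the indexing list becomes a piped indexing statement, and the first_phase string becomes the first-phase expression of the rank profile.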
Once the Schema is defined, create the msmarco ApplicationPackage:
from vespa.package import ApplicationPackage

app_package = ApplicationPackage(name="msmarco", schema=[msmarco_schema])
At this point, app_package contains all the relevant information required to create an MS MARCO text search app and is ready for deployment.