This is the sixth part of the tutorial series for setting up a Vespa application for personalized news recommendations.
In the previous part of this series, we set up a recommendation system that,
given a user id, needed two requests to generate a recommendation: the first
to retrieve the user embedding, and the second to find the nearest-neighbor
news articles. In this part, we’ll introduce searchers, which are
processors that can modify queries before passing them along to search. These
will allow us to pull the logic from the Python scripts into Vespa.
For reference, the final state of this tutorial can be found in the
app-6-recommendation-with-searchers directory of the news sample application.
First, let’s revisit Vespa’s overall architecture:
Recall that the application package contains everything necessary to run the
application. When this is deployed, the config cluster takes care of
distributing the services to the various nodes. In particular, the two main
types of nodes are the stateless container nodes and the stateful content
nodes.
All requests pass through the container cluster before being passed along to
the content cluster, where the actual retrieval and ranking occurs. The queries
actually pass through a chain of searchers, each one possibly doing a small
amount of processing. This can be seen by adding &tracelevel=5 to a query:
...
{ "message": "Invoke searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'" },
{ "message": "Invoke searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'" },
{ "message": "Invoke searcher 'com.yahoo.prelude.querytransform.PhrasingSearcher in vespa'" },
{ "message": "Invoke searcher 'com.yahoo.prelude.searcher.FieldCollapsingSearcher in vespa'" },
{ "message": "Invoke searcher 'com.yahoo.search.yql.MinimalQueryInserter in vespa'" },
...
{ "message": "Federating to [mind]" },
...
{ "message": "Got 10 hits from source:mind" },
{ "message": "Return searcher 'federation in native'" },
...
{ "message": "Return searcher 'com.yahoo.search.yql.MinimalQueryInserter in vespa'" },
{ "message": "Return searcher 'com.yahoo.prelude.searcher.FieldCollapsingSearcher in vespa'" },
{ "message": "Return searcher 'com.yahoo.prelude.querytransform.PhrasingSearcher in vespa'" },
{ "message": "Return searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'" },
{ "message": "Return searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'" },
...
This shows a small sample of the additional output when using tracelevel. Note the
invocations of the searchers. Each searcher gets invoked along a chain, and the last
searcher in the chain sends the post-processed query to the search backend. When the
results come back, the processing passes back up the chain. The searchers can then
process the results before passing them to the previous searcher, and ultimately back as
a response to the query.
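The invoke/return ordering in the trace can be reproduced with a small, self-contained sketch. Everything in it (MiniSearcher, MiniExecution, TracingSearcher) is a hypothetical stand-in written for this illustration, not part of the Vespa API; it only models how a call travels down the chain to the backend and back up again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not Vespa classes.
final class SearcherChainSketch {

    interface MiniSearcher {
        List<String> search(String query, MiniExecution execution);
    }

    // Walks the chain: each call to search() hands the query to the next searcher.
    static final class MiniExecution {
        private final List<MiniSearcher> chain;
        private final List<String> trace;
        private int next = 0;

        MiniExecution(List<MiniSearcher> chain, List<String> trace) {
            this.chain = chain;
            this.trace = trace;
        }

        List<String> search(String query) {
            if (next == chain.size()) {
                // End of the chain: this stands in for the search backend.
                trace.add("Backend got: " + query);
                return new ArrayList<>(List.of("hit for " + query));
            }
            int current = next;
            next = current + 1;
            try {
                return chain.get(current).search(query, this);
            } finally {
                // Restore the position so a searcher may issue more than one search.
                next = current;
            }
        }
    }

    // Records when it is invoked and when it returns, like the trace messages.
    static final class TracingSearcher implements MiniSearcher {
        private final String name;
        private final List<String> trace;

        TracingSearcher(String name, List<String> trace) {
            this.name = name;
            this.trace = trace;
        }

        @Override
        public List<String> search(String query, MiniExecution execution) {
            trace.add("Invoke " + name);                   // before: may modify the query
            List<String> hits = execution.search(query);   // pass along the chain
            trace.add("Return " + name);                   // after: may modify the results
            return hits;
        }
    }

    // Runs a two-searcher chain and returns the recorded trace.
    static List<String> run() {
        List<String> trace = new ArrayList<>();
        List<MiniSearcher> chain = List.of(
                new TracingSearcher("A", trace),
                new TracingSearcher("B", trace));
        new MiniExecution(chain, trace).search("q");
        return trace;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The searcher invoked first is the last one to return, which mirrors the order of the "Invoke searcher" and "Return searcher" messages in the trace.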
So, searchers are Java components that perform some kind of processing along the query chain; either modifying the query before the actual search, modifying the results after the search, or some combination of both.
Developers can provide their own searchers and inject them into the query chain.
We’ll capitalize on this and create a searcher that performs essentially
the same task that our src/python/user_search.py script does: retrieve
a user embedding and do a news article search based on that. In the process,
we’ll only pass a user_id to Vespa instead of a full YQL query:
/search/?user_id=U33527&searchChain=user
Our searcher will take care of creating the actual query for us. So, let’s get started.
While the content layer in Vespa is written in C++ for maximum performance,
the container layer is in Java for flexibility. So, all searchers, and thus
custom searchers, are written in Java. Please refer to the guide on searcher
development for more information.
We want to create a searcher that takes a user_id, issues a query to
find the corresponding embedding, then issues a second query to retrieve
the news articles.
To do this, we create a UserProfileSearcher that extends the base searcher
class com.yahoo.search.Searcher. This searcher must implement a single
method, search, and has the responsibility of passing the query to
the next searcher in the chain. A minimal example is:
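import com.yahoo.search.Query;
import com.yahoo.search.Result;
import com.yahoo.search.Searcher;
import com.yahoo.search.searchchain.Execution;

public class UserProfileSearcher extends Searcher {
    @Override
    public Result search(Query query, Execution execution) {
        // ... process query
        Result results = execution.search(query);  // pass the query to the next searcher
        // ... process results
        return results;
    }
}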
So, what we do before we pass the query along (in execution.search(query)) and
before we return the results is completely up to us. So, we implement our
UserProfileSearcher like this:
import com.yahoo.prelude.query.NearestNeighborItem;
import com.yahoo.prelude.query.WordItem;
import com.yahoo.search.Query;
import com.yahoo.search.Result;
import com.yahoo.search.Searcher;
import com.yahoo.search.searchchain.Execution;
import com.yahoo.tensor.Tensor;

public class UserProfileSearcher extends Searcher {

    @Override
    public Result search(Query query, Execution execution) {

        // Get tensor and read items from user profile
        Object userIdProperty = query.properties().get("user_id");
        if (userIdProperty != null) {

            // Retrieve user embedding by doing a search for the user_id and extract the tensor
            Tensor userEmbedding = retrieveUserEmbedding(userIdProperty.toString(), execution);

            // Create a new search using the user's embedding tensor
            NearestNeighborItem nn = new NearestNeighborItem("embedding", "user_embedding");
            nn.setTargetNumHits(query.getHits());
            nn.setAllowApproximate(true);

            query.getModel().getQueryTree().setRoot(nn);
            query.getRanking().getFeatures().put("query(user_embedding)", userEmbedding);
            query.getModel().setRestrict("news");

            // Override default ranking profile
            if (query.getRanking().getProfile().equals("default")) {
                query.getRanking().setProfile("recommendation");
            }
        }
        return execution.search(query);
    }

    private Tensor retrieveUserEmbedding(String userId, Execution execution) {
        Query query = new Query();
        query.getModel().setRestrict("user");
        query.getModel().getQueryTree().setRoot(new WordItem(userId, "user_id"));
        query.setHits(1);

        Result result = execution.search(query);
        execution.fill(result); // This is needed to get the actual summary data

        if (result.getTotalHitCount() == 0)
            throw new RuntimeException("User id " + userId + " not found...");
        return (Tensor) result.hits().get(0).getField("embedding");
    }
}
First, we retrieve the user_id from the query. If this is given in the
query, we first call the retrieveUserEmbedding method, which creates a new
Query to find the user’s embedding. This is a straightforward search which
is restricted to the user document type. Since the user_id is
unique, we only expect a single hit. We then extract the embedding tensor
from the user document.
Now that we’ve retrieved the user embedding, we programmatically set up a
nearest-neighbor search, and add the user embedding to the query as the
ranking feature query(user_embedding). The search is then passed along to
the next searcher in the chain. We do not need to explicitly fill the
result here, as that is guaranteed to happen before ultimately rendering
the results.
Again, note that all this is pretty much the same as what we did in
src/python/user_search.py, just in Java.
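Stripped of the Vespa API, the two steps amount to a lookup followed by a similarity ranking. The sketch below is only a conceptual model under stated assumptions: the class and method names are hypothetical, the embeddings are plain arrays held in maps, the similarity is assumed to be a dot product, and the ranking is an exact brute-force scan rather than the approximate nearest-neighbor search Vespa performs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Conceptual model only; none of these names exist in Vespa.
final class TwoStepRecommendationSketch {

    // Similarity between a user and a news embedding (dot product is an assumption).
    static double dot(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    static List<String> recommend(String userId,
                                  Map<String, double[]> userEmbeddings,
                                  Map<String, double[]> newsEmbeddings,
                                  int hits) {
        // Step 1: look up the user's embedding (the query restricted to "user").
        double[] user = userEmbeddings.get(userId);
        if (user == null) {
            throw new IllegalArgumentException("User id " + userId + " not found");
        }

        // Step 2: rank all news articles by similarity and keep the top hits
        // (exact brute force here, in place of the nearest-neighbor operator).
        List<String> ids = new ArrayList<>(newsEmbeddings.keySet());
        ids.sort(Comparator.comparingDouble(
                (String id) -> dot(user, newsEmbeddings.get(id))).reversed());
        return new ArrayList<>(ids.subList(0, Math.min(hits, ids.size())));
    }
}
```

In the searcher, step 1 corresponds to retrieveUserEmbedding and step 2 to the NearestNeighborItem query that is passed down the chain.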
To add this searcher to Vespa, we need to modify services.xml:
<container id="default" version="1.0">
    <search>
        <chain id="user" inherits="vespa">
            <searcher bundle="news-recommendation-searcher" id="ai.vespa.example.UserProfileSearcher" />
        </chain>
    </search>
    ...
</container>
Here, we instruct Vespa to add a new search chain called user, which
inherits the default vespa search chain and includes our
UserProfileSearcher. Note that Vespa expects this searcher to be in a
bundle called news-recommendation-searcher, so we need to compile and package this code.
In Vespa, we use Apache Maven for this,
which requires a project object model, or pom.xml, to specify how to build this artifact.
We won’t go through that here; please refer to the
app-6-recommendation-with-searchers sub-directory in the news sample
application for details. Note that this application’s directory structure has
changed compared to the previous parts in the tutorial. The structure is now:
.
├── pom.xml
├── src
│   └── main
│       ├── application
│       │   ├── schemas
│       │   │   ├── news.sd
│       │   │   └── user.sd
│       │   ├── search
│       │   │   └── query-profiles
│       │   │       ├── default.xml
│       │   │       └── types
│       │   │           └── root.xml
│       │   └── services.xml
│       └── java
│           └── ai
│               └── vespa
│                   └── example
│                       └── UserProfileSearcher.java
The Vespa application now lies under src/main/application, and all
custom Java components are under src/main/java, as is standard in
a Java project. We can now compile and package this application:
$ (cd app-6-recommendation-with-searchers && mvn package)
pom.xml is set up to create an artifact called news-recommendation-searcher,
which is what we referred to in services.xml. When the command
finishes, we can see this artifact in target/application.zip.
This contains the full Vespa application, with Java components.
The vespa-cli detects that this app has custom Java components:
$ vespa deploy --wait 300 app-6-recommendation-with-searchers
After the application has been deployed, we are ready to test. Please refer to the searcher development guide for much more on custom searchers and the Java API.
Now we can search for a user’s recommended news articles directly from the user_id:
$ vespa query -v 'user_id=U33527' 'searchChain=user'
This should now return the top 10 recommended news articles for this user. Indeed,
if we now add tracelevel=5, we see the searcher being invoked:
$ vespa query -v 'user_id=U33527' 'searchChain=user' 'tracelevel=5'
...
{ "message": "Invoke searcher 'ai.vespa.example.UserProfileSearcher in user'" },
{ "message": "Invoke searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'" },
{ "message": "Invoke searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'" },
...
{ "message": "Return searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'" },
{ "message": "Return searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'" },
{ "message": "Return searcher 'ai.vespa.example.UserProfileSearcher in user'" },
...
Note that the searchChain query parameter can be set as default so this does not have to
be passed with the query request. This is done by adding it to the default query profile in
src/main/application/search/query-profiles/default.xml:
<query-profile id="default" type="root">
    <field name="searchChain">user</field>
</query-profile>
The src/python/evaluate.py script can now be modified to also use this searcher.
However, to properly calculate the metrics, the searcher needs to be modified to
accept a list of news article ids and only recall those.
We’ll leave this as an exercise to the reader.
As can be seen in the architecture overview above, there are other component types as well. One is document processors, which are conceptually similar to searchers. When a document is fed to Vespa, it goes through a chain of document processors before being passed to the content node for storage and indexing.
Vespa also supports custom document processors. Please refer to the guide for document processing for more information.
If we take a closer look at the query above, and search for the top 100 hits:
$ vespa query 'user_id=U33527' 'searchChain=user' 'hits=100' | grep "category\": \"sports" | wc -l
We see that all the hits are of category sports for this user. Actually,
they are all from the football_nfl sub-category. Indeed, from inspection of
the impressions file, this user has only clicked on sports articles. So,
while this can seem a success, we generally would like to give users
some form of diversity to keep them interested. This is also to
combat the negative effects of filter bubbles.
One way to do this is to create searchers that perform multiple queries
to the backend with various rank profiles. In the above, we were only
retrieving results from the recommendation rank profile. Still, we can
have any number of rank profiles. By searching with multiple rank profiles,
we can blend the results from these sources before returning to the
user, and thus introduce diversity.
This is often called federation. Vespa supports federation both from internal and external sources. Please see the guide on federation for more information.
A common way of performing blending from multiple sources is to implement a specialized blending searcher. This searcher can, for instance, use an approach such as reciprocal rank fusion, which gives decent results. However, when it comes to diversity, there are usually some goals or restrictions that need to be controlled. In this case, the business rules can be hand-written in the blending searcher. Searchers are flexible enough to perform any type of processing.
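As a rough illustration of what such a blending step could compute, here is a minimal sketch of reciprocal rank fusion followed by one hand-written business rule. It works on plain lists of document ids rather than Vespa results; the class name, the method names, the constant k (60 is a commonly used value, assumed here) and the per-category cap are all assumptions made for this example, not something prescribed by the tutorial.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Vespa.
final class BlendingSketch {

    // Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank),
    // where rank starts at 1 within each ranked list.
    static Map<String, Double> rrfScores(List<List<String>> rankedLists, int k) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                scores.merge(ranked.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        return scores;
    }

    // Orders ids by descending fused score; ties keep first-seen order (stable sort).
    static List<String> fuse(List<List<String>> rankedLists, int k) {
        Map<String, Double> scores = rrfScores(rankedLists, k);
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
        return ids;
    }

    // Example business rule: keep at most maxPerCategory hits from each category.
    static List<String> capPerCategory(List<String> ranked,
                                       Map<String, String> categoryOf,
                                       int maxPerCategory) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> kept = new ArrayList<>();
        for (String id : ranked) {
            String category = categoryOf.getOrDefault(id, "unknown");
            int count = seen.merge(category, 1, Integer::sum);
            if (count <= maxPerCategory) {
                kept.add(id);
            }
        }
        return kept;
    }
}
```

In a real searcher, the ranked lists would come from separate calls to execution.search with different rank profiles, and the fused order would be used to rebuild the result's hit list.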
We now have a Vespa application up and running that takes a single user_id
and returns recommendations for that user.
In the next part of the tutorial,
we’ll address what to do when new users without any history visit our recommendation system.