The RAG Blueprint

Vespa is the platform of choice for large-scale RAG applications like Perplexity. It gives you all the features you need, but putting them together can be a challenge.

This open-source sample application contains all the elements you need to create a RAG application that

  • delivers state-of-the-art quality, and
  • scales to any amount of data, query load, and complexity.

This README provides the steps to create and run your own application based on the blueprint. Refer to the RAG Blueprint tutorial for more in-depth explanations, or try out the Python notebook.

Setup:

  1. Create a tenant on Vespa Cloud:

    Go to console.vespa-cloud.com and create your tenant (unless you already have one).

  2. Install the Vespa CLI using Homebrew:

    $ brew install vespa-cli
    

    Windows/No Homebrew? See the Vespa CLI page to download directly.

  3. Configure the Vespa client:

    $ export VESPA_CLI_HOME=$PWD/.vespa
    
    $ vespa config set target cloud
    $ vespa config set application vespa-team.autotest
    

    Replace "vespa-team" with the tenant name from step 1, here and in the remaining steps of this guide.

  4. Get Vespa Cloud control plane access:

    $ vespa auth login
    

    Follow the instructions from the command to authenticate.

  5. Clone a sample application:

    $ vespa clone rag-blueprint myapp && cd myapp
    

    See sample-apps for other sample apps you can clone.

  6. Add a certificate for data plane access to the application:

    $ vespa auth cert app
    

    It is a good idea to take note of the path to the .pem files written here.
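To help with noting where the .pem files ended up, here is a minimal sketch that lists every .pem file under the Vespa CLI home directory. It assumes the CLI stores the files somewhere beneath `VESPA_CLI_HOME` (set in step 3) and falls back to `~/.vespa` if the variable is unset; the helper name `find_pem_files` is hypothetical and not part of the Vespa CLI.

```python
import os
from pathlib import Path


def find_pem_files(root):
    """Return the sorted paths of all .pem files below root (empty if root is missing)."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(str(p) for p in root.rglob("*.pem"))


# Assumption: certificates live below VESPA_CLI_HOME, or ~/.vespa when it is unset.
cli_home = os.environ.get("VESPA_CLI_HOME", str(Path.home() / ".vespa"))
for pem_path in find_pem_files(cli_home):
    print(pem_path)
```

Run it from the directory where you exported `VESPA_CLI_HOME`, so that the relative `$PWD/.vespa` path resolves to the same place.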

Test the application

Deploy the application, waiting up to 900 seconds for it to become ready:

$ vespa deploy --wait 900 ./app

Feed some documents. This also chunks and embeds them, so it takes about 3 minutes:

$ vespa feed dataset/docs.jsonl
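Because feeding takes a few minutes, it can be worth checking the feed file first. The sketch below assumes only that `dataset/docs.jsonl` is in JSON Lines form (one JSON object per line); it does not check the fields against the application's schema. The helper name `check_jsonl_lines` is hypothetical.

```python
import json
from pathlib import Path


def check_jsonl_lines(lines):
    """Return (number of JSON objects, 1-based numbers of lines that are not JSON objects)."""
    count, bad = 0, []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            bad.append(lineno)
            continue
        if isinstance(obj, dict):
            count += 1
        else:
            bad.append(lineno)  # valid JSON, but not an object
    return count, bad


feed_file = Path("dataset/docs.jsonl")  # path taken from the feed command above
if feed_file.exists():
    with feed_file.open(encoding="utf-8") as f:
        docs, bad_lines = check_jsonl_lines(f)
    print(f"{docs} documents, {len(bad_lines)} problem lines: {bad_lines}")
```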

Now you can issue queries:

$ vespa query 'query=yc b2b sales'

Tip: Add "-v" to the query command to see the HTTP request it becomes.

When you are done experimenting, you can remove the deployment and its data:

$ vespa destroy --force
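To give a rough idea of the HTTP request behind the query command, the sketch below builds a GET URL for Vespa's query API, which is served under `/search/`. The endpoint is a placeholder, the helper name `build_query_url` is hypothetical, and the exact method and headers the CLI sends may differ, so treat the "-v" output as authoritative.

```python
from urllib.parse import urlencode


def build_query_url(endpoint: str, params: dict[str, str]) -> str:
    """Return a GET URL for the /search/ query API with URL-encoded parameters."""
    return f"{endpoint.rstrip('/')}/search/?{urlencode(params)}"


# Placeholder endpoint: substitute the endpoint of your own deployment.
url = build_query_url("https://myapp.example.invalid", {"query": "yc b2b sales"})
print(url)  # https://myapp.example.invalid/search/?query=yc+b2b+sales
```

Sending such a request to a Vespa Cloud endpoint also requires the data plane certificate and key from step 6, which this sketch leaves out.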

Congratulations! You have now created a RAG application that can scale to billions of documents and thousands of queries per second, while delivering state-of-the-art quality.

Explore more

What do you want to do next?