author tmartins <thigm85@gmail.com> 2020-09-02 22:51:26 +0200
committer tmartins <thigm85@gmail.com> 2020-09-02 22:51:26 +0200
commit ded8372d7171a3a8c046e90795598dea659ab43a
tree abdd7085556a2445a5948e54c883728882cc5018 /python
parent b8547db5f9fcfe82b2012cb7c107ea1cba97b2f6
update doc
Diffstat (limited to 'python')
-rw-r--r-- python/vespa/docs/sphinx/source/connect-to-vespa-instance.ipynb | 705
-rw-r--r-- python/vespa/docs/sphinx/source/index.rst | 15
-rw-r--r-- python/vespa/docs/sphinx/source/quickstart.rst | 4
3 files changed, 677 insertions, 47 deletions
diff --git a/python/vespa/docs/sphinx/source/connect-to-vespa-instance.ipynb b/python/vespa/docs/sphinx/source/connect-to-vespa-instance.ipynb
index 210ab549383..c7bd98370fe 100644
--- a/python/vespa/docs/sphinx/source/connect-to-vespa-instance.ipynb
+++ b/python/vespa/docs/sphinx/source/connect-to-vespa-instance.ipynb
@@ -4,25 +4,18 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "# Vespa library for data analysis\n",
+ "# How to connect with running Vespa instances\n",
"\n",
- "> Provide data analysis support for Vespa applications \n",
+ "> Connect and interact with CORD-19 search app.\n",
"\n",
- "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/vespa/blob/tgm/pyvespa-tutorial/python/vespa/notebooks/connect-to-vespa-instance.ipynb)"
+ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/vespa/blob/tgm/reference-doc/python/vespa/docs/sphinx/source/connect-to-vespa-instance.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "`pyvespa` provides a python API to [vespa.ai](vespa.ai). It allow us to create, modify, deploy and interact with running Vespa instances. The main goal of the library is to allow for faster prototyping and ML experimentation. "
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "This tutorial will show you how to connect to a pre-existing Vespa instance. We will use the https://cord19.vespa.ai/ app as an example. You can run this tutorial yourself in Google Colab by clicking on the badge located at the top of the tutorial."
+ "This self-contained tutorial will show you how to connect to a pre-existing Vespa instance. We will use the https://cord19.vespa.ai/ app as an example. You can run this tutorial yourself in Google Colab by clicking on the badge located at the top of the tutorial."
]
},
{
@@ -52,21 +45,19 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Connect to a Vespa app\n",
- "\n",
- "> Connect to a running Vespa application"
+ "## Connect to a running Vespa application"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "We can connect to a running Vespa instance by created an instance of `Vespa` with the appropriate url. The resulting `app` will then be used to communicate with the application."
+ "We can connect to a running Vespa application by creating an instance of [Vespa](reference-api.rst#vespa.application.Vespa) with the appropriate url. The resulting `app` will then be used to communicate with the application."
]
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
@@ -88,19 +79,19 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "When building a search application, we usually want to expirement with different query models. A `Query` model consists of a match phase and a ranking phase. The matching phase will define how to match documents based on the query sent and the ranking phase will define how to rank the matched documents. Both phases can get quite complex and being able to easily express and experiment with them is very valuable."
+ "When building a search application, we usually want to experiment with different query models. A [Query](reference-api.rst#vespa.query.Query) model consists of a match phase and a ranking phase. The matching phase will define how to match documents based on the query sent and the ranking phase will define how to rank the matched documents. Both phases can get quite complex and being able to easily express and experiment with them is very valuable."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "In the example below we define the match phase to be the `Union` of the `WeakAnd` and the `ANN` operators. The `WeakAnd` will match documents based on query terms while the Approximate Nearest Neighbor (`ANN`) operator will match documents based on the distance between the query and document embeddings. This is an illustration of how easy it is to combine term and semantic matching in Vespa. "
+ "In the example below we define the match phase to be the [Union](reference-api.rst#vespa.query.Union) of the [WeakAnd](reference-api.rst#vespa.query.WeakAnd) and the [ANN](reference-api.rst#vespa.query.ANN) operators. The `WeakAnd` will match documents based on query terms while the Approximate Nearest Neighbor (`ANN`) operator will match documents based on the distance between the query and document embeddings. This is an illustration of how easy it is to combine term and semantic matching in Vespa. "
]
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
@@ -123,12 +114,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "We then define the the ranking to be done by the `bm25` rank-profile that is already defined in the application package. We set `list_features=True` to be able to collect ranking-features later in this tutorial. After defining the `match_phase` and the `rank_profile` we can instantiate the `Query` model."
+ "We then define the ranking to be done by the `bm25` rank-profile that is already defined in the application package. We set `list_features=True` to be able to collect ranking-features later in this tutorial. After defining the `match_phase` and the `rank_profile` we can instantiate the `Query` model."
]
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
@@ -145,7 +136,7 @@
"source": [
"## Query the vespa app\n",
"\n",
- "> Send queries via the query API. See the [query page](/vespa/query) for more examples."
+ "> Send queries via the query API. See the [query page](query.ipynb) for more examples."
]
},
{
@@ -157,7 +148,7 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
@@ -176,9 +167,20 @@
},
{
"cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "1046"
+ ]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
"source": [
"query_result.number_documents_retrieved"
]
@@ -192,9 +194,20 @@
},
{
"cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
"source": [
"len(query_result.hits)"
]
@@ -224,7 +237,7 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
@@ -255,21 +268,564 @@
"source": [
"## Collect training data\n",
"\n",
- "> Collect training data to analyse and/or improve ranking functions. See the [collect training data page](/vespa/collect_training_data) for more examples."
+ "> Collect training data to analyse and/or improve ranking functions. See the [collect training data page](collect-training-data.ipynb) for more examples."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "We can colect training data with the `collect_training_data` method according to a specific `query_model`. Below we will collect two documents for each query in addition to the relevant ones."
+ "We can collect training data with the [collect_training_data](reference-api.rst#vespa.application.Vespa.collect_training_data) method according to a specific [Query](reference-api.rst#vespa.query.Query) model. Below we will collect two documents for each query in addition to the relevant ones."
]
},
{
"cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>attributeMatch(authors.first)</th>\n",
+ " <th>attributeMatch(authors.first).averageWeight</th>\n",
+ " <th>attributeMatch(authors.first).completeness</th>\n",
+ " <th>attributeMatch(authors.first).fieldCompleteness</th>\n",
+ " <th>attributeMatch(authors.first).importance</th>\n",
+ " <th>attributeMatch(authors.first).matches</th>\n",
+ " <th>attributeMatch(authors.first).maxWeight</th>\n",
+ " <th>attributeMatch(authors.first).normalizedWeight</th>\n",
+ " <th>attributeMatch(authors.first).normalizedWeightedWeight</th>\n",
+ " <th>attributeMatch(authors.first).queryCompleteness</th>\n",
+ " <th>...</th>\n",
+ " <th>textSimilarity(results).queryCoverage</th>\n",
+ " <th>textSimilarity(results).score</th>\n",
+ " <th>textSimilarity(title).fieldCoverage</th>\n",
+ " <th>textSimilarity(title).order</th>\n",
+ " <th>textSimilarity(title).proximity</th>\n",
+ " <th>textSimilarity(title).queryCoverage</th>\n",
+ " <th>textSimilarity(title).score</th>\n",
+ " <th>document_id</th>\n",
+ " <th>query_id</th>\n",
+ " <th>relevant</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.062500</td>\n",
+ " <td>0.000000</td>\n",
+ " <td>0.000000</td>\n",
+ " <td>0.142857</td>\n",
+ " <td>0.055357</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>1</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>213690</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>2</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.285714</td>\n",
+ " <td>0.666667</td>\n",
+ " <td>0.739583</td>\n",
+ " <td>0.571429</td>\n",
+ " <td>0.587426</td>\n",
+ " <td>225739</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>3</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.142857</td>\n",
+ " <td>0.000000</td>\n",
+ " <td>0.437500</td>\n",
+ " <td>0.142857</td>\n",
+ " <td>0.224554</td>\n",
+ " <td>3</td>\n",
+ " <td>0</td>\n",
+ " <td>1</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>4</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>213690</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>5</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.285714</td>\n",
+ " <td>0.666667</td>\n",
+ " <td>0.739583</td>\n",
+ " <td>0.571429</td>\n",
+ " <td>0.587426</td>\n",
+ " <td>225739</td>\n",
+ " <td>0</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>6</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.111111</td>\n",
+ " <td>0.000000</td>\n",
+ " <td>0.000000</td>\n",
+ " <td>0.083333</td>\n",
+ " <td>0.047222</td>\n",
+ " <td>1</td>\n",
+ " <td>1</td>\n",
+ " <td>1</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>7</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>176163</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>8</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.187500</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>0.250000</td>\n",
+ " <td>0.612500</td>\n",
+ " <td>13597</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>9</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.083333</td>\n",
+ " <td>0.000000</td>\n",
+ " <td>0.000000</td>\n",
+ " <td>0.083333</td>\n",
+ " <td>0.041667</td>\n",
+ " <td>5</td>\n",
+ " <td>1</td>\n",
+ " <td>1</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>10</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>176163</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>11</th>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>...</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0.187500</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>1.000000</td>\n",
+ " <td>0.250000</td>\n",
+ " <td>0.612500</td>\n",
+ " <td>13597</td>\n",
+ " <td>1</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "<p>12 rows × 984 columns</p>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " attributeMatch(authors.first) \\\n",
+ "0 0.0 \n",
+ "1 0.0 \n",
+ "2 0.0 \n",
+ "3 0.0 \n",
+ "4 0.0 \n",
+ "5 0.0 \n",
+ "6 0.0 \n",
+ "7 0.0 \n",
+ "8 0.0 \n",
+ "9 0.0 \n",
+ "10 0.0 \n",
+ "11 0.0 \n",
+ "\n",
+ " attributeMatch(authors.first).averageWeight \\\n",
+ "0 0.0 \n",
+ "1 0.0 \n",
+ "2 0.0 \n",
+ "3 0.0 \n",
+ "4 0.0 \n",
+ "5 0.0 \n",
+ "6 0.0 \n",
+ "7 0.0 \n",
+ "8 0.0 \n",
+ "9 0.0 \n",
+ "10 0.0 \n",
+ "11 0.0 \n",
+ "\n",
+ " attributeMatch(authors.first).completeness \\\n",
+ "0 0.0 \n",
+ "1 0.0 \n",
+ "2 0.0 \n",
+ "3 0.0 \n",
+ "4 0.0 \n",
+ "5 0.0 \n",
+ "6 0.0 \n",
+ "7 0.0 \n",
+ "8 0.0 \n",
+ "9 0.0 \n",
+ "10 0.0 \n",
+ "11 0.0 \n",
+ "\n",
+ " attributeMatch(authors.first).fieldCompleteness \\\n",
+ "0 0.0 \n",
+ "1 0.0 \n",
+ "2 0.0 \n",
+ "3 0.0 \n",
+ "4 0.0 \n",
+ "5 0.0 \n",
+ "6 0.0 \n",
+ "7 0.0 \n",
+ "8 0.0 \n",
+ "9 0.0 \n",
+ "10 0.0 \n",
+ "11 0.0 \n",
+ "\n",
+ " attributeMatch(authors.first).importance \\\n",
+ "0 0.0 \n",
+ "1 0.0 \n",
+ "2 0.0 \n",
+ "3 0.0 \n",
+ "4 0.0 \n",
+ "5 0.0 \n",
+ "6 0.0 \n",
+ "7 0.0 \n",
+ "8 0.0 \n",
+ "9 0.0 \n",
+ "10 0.0 \n",
+ "11 0.0 \n",
+ "\n",
+ " attributeMatch(authors.first).matches \\\n",
+ "0 0.0 \n",
+ "1 0.0 \n",
+ "2 0.0 \n",
+ "3 0.0 \n",
+ "4 0.0 \n",
+ "5 0.0 \n",
+ "6 0.0 \n",
+ "7 0.0 \n",
+ "8 0.0 \n",
+ "9 0.0 \n",
+ "10 0.0 \n",
+ "11 0.0 \n",
+ "\n",
+ " attributeMatch(authors.first).maxWeight \\\n",
+ "0 0.0 \n",
+ "1 0.0 \n",
+ "2 0.0 \n",
+ "3 0.0 \n",
+ "4 0.0 \n",
+ "5 0.0 \n",
+ "6 0.0 \n",
+ "7 0.0 \n",
+ "8 0.0 \n",
+ "9 0.0 \n",
+ "10 0.0 \n",
+ "11 0.0 \n",
+ "\n",
+ " attributeMatch(authors.first).normalizedWeight \\\n",
+ "0 0.0 \n",
+ "1 0.0 \n",
+ "2 0.0 \n",
+ "3 0.0 \n",
+ "4 0.0 \n",
+ "5 0.0 \n",
+ "6 0.0 \n",
+ "7 0.0 \n",
+ "8 0.0 \n",
+ "9 0.0 \n",
+ "10 0.0 \n",
+ "11 0.0 \n",
+ "\n",
+ " attributeMatch(authors.first).normalizedWeightedWeight \\\n",
+ "0 0.0 \n",
+ "1 0.0 \n",
+ "2 0.0 \n",
+ "3 0.0 \n",
+ "4 0.0 \n",
+ "5 0.0 \n",
+ "6 0.0 \n",
+ "7 0.0 \n",
+ "8 0.0 \n",
+ "9 0.0 \n",
+ "10 0.0 \n",
+ "11 0.0 \n",
+ "\n",
+ " attributeMatch(authors.first).queryCompleteness ... \\\n",
+ "0 0.0 ... \n",
+ "1 0.0 ... \n",
+ "2 0.0 ... \n",
+ "3 0.0 ... \n",
+ "4 0.0 ... \n",
+ "5 0.0 ... \n",
+ "6 0.0 ... \n",
+ "7 0.0 ... \n",
+ "8 0.0 ... \n",
+ "9 0.0 ... \n",
+ "10 0.0 ... \n",
+ "11 0.0 ... \n",
+ "\n",
+ " textSimilarity(results).queryCoverage textSimilarity(results).score \\\n",
+ "0 0.0 0.0 \n",
+ "1 0.0 0.0 \n",
+ "2 0.0 0.0 \n",
+ "3 0.0 0.0 \n",
+ "4 0.0 0.0 \n",
+ "5 0.0 0.0 \n",
+ "6 0.0 0.0 \n",
+ "7 0.0 0.0 \n",
+ "8 0.0 0.0 \n",
+ "9 0.0 0.0 \n",
+ "10 0.0 0.0 \n",
+ "11 0.0 0.0 \n",
+ "\n",
+ " textSimilarity(title).fieldCoverage textSimilarity(title).order \\\n",
+ "0 0.062500 0.000000 \n",
+ "1 1.000000 1.000000 \n",
+ "2 0.285714 0.666667 \n",
+ "3 0.142857 0.000000 \n",
+ "4 1.000000 1.000000 \n",
+ "5 0.285714 0.666667 \n",
+ "6 0.111111 0.000000 \n",
+ "7 1.000000 1.000000 \n",
+ "8 0.187500 1.000000 \n",
+ "9 0.083333 0.000000 \n",
+ "10 1.000000 1.000000 \n",
+ "11 0.187500 1.000000 \n",
+ "\n",
+ " textSimilarity(title).proximity textSimilarity(title).queryCoverage \\\n",
+ "0 0.000000 0.142857 \n",
+ "1 1.000000 1.000000 \n",
+ "2 0.739583 0.571429 \n",
+ "3 0.437500 0.142857 \n",
+ "4 1.000000 1.000000 \n",
+ "5 0.739583 0.571429 \n",
+ "6 0.000000 0.083333 \n",
+ "7 1.000000 1.000000 \n",
+ "8 1.000000 0.250000 \n",
+ "9 0.000000 0.083333 \n",
+ "10 1.000000 1.000000 \n",
+ "11 1.000000 0.250000 \n",
+ "\n",
+ " textSimilarity(title).score document_id query_id relevant \n",
+ "0 0.055357 0 0 1 \n",
+ "1 1.000000 213690 0 0 \n",
+ "2 0.587426 225739 0 0 \n",
+ "3 0.224554 3 0 1 \n",
+ "4 1.000000 213690 0 0 \n",
+ "5 0.587426 225739 0 0 \n",
+ "6 0.047222 1 1 1 \n",
+ "7 1.000000 176163 1 0 \n",
+ "8 0.612500 13597 1 0 \n",
+ "9 0.041667 5 1 1 \n",
+ "10 1.000000 176163 1 0 \n",
+ "11 0.612500 13597 1 0 \n",
+ "\n",
+ "[12 rows x 984 columns]"
+ ]
+ },
+ "execution_count": 9,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
"source": [
"training_data_batch = app.collect_training_data(\n",
" labelled_data = labelled_data,\n",
@@ -286,7 +842,7 @@
"source": [
"## Evaluating a query model\n",
"\n",
- "> Define metrics and evaluate query models. See the [evaluation page](/vespa/evaluation) for more examples."
+ "> Define metrics and evaluate query models. See the [evaluation page](evaluation.ipynb) for more examples."
]
},
{
@@ -301,7 +857,7 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
@@ -319,9 +875,76 @@
},
{
"cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
+ "execution_count": 11,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ " .dataframe tbody tr th:only-of-type {\n",
+ " vertical-align: middle;\n",
+ " }\n",
+ "\n",
+ " .dataframe tbody tr th {\n",
+ " vertical-align: top;\n",
+ " }\n",
+ "\n",
+ " .dataframe thead th {\n",
+ " text-align: right;\n",
+ " }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: right;\">\n",
+ " <th></th>\n",
+ " <th>query_id</th>\n",
+ " <th>match_ratio_retrieved_docs</th>\n",
+ " <th>match_ratio_docs_available</th>\n",
+ " <th>match_ratio_value</th>\n",
+ " <th>recall_10_value</th>\n",
+ " <th>reciprocal_rank_10_value</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <th>0</th>\n",
+ " <td>0</td>\n",
+ " <td>1254</td>\n",
+ " <td>233281</td>\n",
+ " <td>0.005375</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <th>1</th>\n",
+ " <td>1</td>\n",
+ " <td>1003</td>\n",
+ " <td>233281</td>\n",
+ " <td>0.004300</td>\n",
+ " <td>0.0</td>\n",
+ " <td>0</td>\n",
+ " </tr>\n",
+ " </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " query_id match_ratio_retrieved_docs match_ratio_docs_available \\\n",
+ "0 0 1254 233281 \n",
+ "1 1 1003 233281 \n",
+ "\n",
+ " match_ratio_value recall_10_value reciprocal_rank_10_value \n",
+ "0 0.005375 0.0 0 \n",
+ "1 0.004300 0.0 0 "
+ ]
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
"source": [
"evaluation = app.evaluate(\n",
" labelled_data = labelled_data,\n",
@@ -349,7 +972,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.7.3"
+ "version": "3.8.5"
}
},
"nbformat": 4,
diff --git a/python/vespa/docs/sphinx/source/index.rst b/python/vespa/docs/sphinx/source/index.rst
index 9901e0cca80..82f384fbc09 100644
--- a/python/vespa/docs/sphinx/source/index.rst
+++ b/python/vespa/docs/sphinx/source/index.rst
@@ -14,11 +14,12 @@ Vespa python API
howto
reference-api
-``pyvespa`` provides a python API to vespa.ai_. It allow us to create, modify, deploy and interact with
+Vespa_ is the scalable open-sourced serving engine that enables us to store, compute and rank big data at user
+serving time. ``pyvespa`` provides a python API to Vespa. It allows us to create, modify, deploy and interact with
running Vespa instances. The main goal of the library is to allow for faster prototyping and to facilitate
-Machine Learning experiments around Vespa applications.
+Machine Learning experiments for Vespa applications.
-.. _vespa.ai: https://vespa.ai/
+.. _Vespa: https://vespa.ai/
Install
@@ -36,8 +37,12 @@ Quick-start
The best way to get started is by following the tutorials below. You can easily run them yourself on Google Colab
by clicking on the badge at the top of the tutorial.
-- :doc:`Connect and interact with CORD-19 search app <connect-to-vespa-instance>`.
-- :doc:`Create and deploy a MS MARCO search app from scratch <create-and-deploy-vespa-cloud>`.
+.. toctree::
+ :maxdepth: 1
+
+ connect-to-vespa-instance
+ create-and-deploy-vespa-cloud
+
How-to guides
+++++++++++++
diff --git a/python/vespa/docs/sphinx/source/quickstart.rst b/python/vespa/docs/sphinx/source/quickstart.rst
index 623b97842ea..4f7194bab18 100644
--- a/python/vespa/docs/sphinx/source/quickstart.rst
+++ b/python/vespa/docs/sphinx/source/quickstart.rst
@@ -1,8 +1,10 @@
Quick-start
===========
+The best way to get started is by following the tutorials below. You can easily run them yourself on Google Colab
+by clicking on the badge at the top of the tutorial.
+
.. toctree::
- :hidden:
connect-to-vespa-instance
create-and-deploy-vespa-cloud
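
Note on the evaluation output recorded in the notebook hunk above: the `match_ratio_value` column appears to be `match_ratio_retrieved_docs` divided by `match_ratio_docs_available`. That definition is inferred from the numbers in the output table, not confirmed from the pyvespa source. A minimal sketch under that assumption, using a hypothetical helper `match_ratio` that is not part of the library:

```python
def match_ratio(retrieved_docs: int, docs_available: int) -> float:
    """Fraction of available documents matched by a query (assumed definition)."""
    if docs_available <= 0:
        raise ValueError("docs_available must be positive")
    return retrieved_docs / docs_available


# Values copied from the two rows of the evaluation table in the diff.
rows = [
    {"query_id": 0, "retrieved": 1254, "available": 233281},
    {"query_id": 1, "retrieved": 1003, "available": 233281},
]

# Round to six decimals, the precision shown in the notebook output.
ratios = {
    row["query_id"]: round(match_ratio(row["retrieved"], row["available"]), 6)
    for row in rows
}
print(ratios)
```

Rounded to six decimals, both results agree with the 0.005375 and 0.004300 shown in the recorded output, which is consistent with the assumed definition but does not prove it.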