summaryrefslogtreecommitdiffstats
path: root/TODO.md
diff options
context:
space:
mode:
authorJon Marius Venstad <venstad@gmail.com>2021-06-25 17:12:13 +0200
committerJon Marius Venstad <venstad@gmail.com>2021-06-25 17:12:13 +0200
commitbf3053f7d316f8baf470ed7cce951a41f6cb59da (patch)
tree6f2cde4f0daf3b1429dc7a0d281784bdcc9bae7a /TODO.md
parentf2001ec54146d5f325cbf248e000dbc8eacf41f6 (diff)
Update TODO.md with feed client task
Diffstat (limited to 'TODO.md')
-rw-r--r--TODO.md56
1 files changed, 56 insertions, 0 deletions
diff --git a/TODO.md b/TODO.md
index c633a1bf38a..efedf55c7f1 100644
--- a/TODO.md
+++ b/TODO.md
@@ -99,3 +99,59 @@ model updates.
**Code pointers:**
- Tensor modify operation (for document tensors): [Java](https://github.com/vespa-engine/vespa/blob/master/document/src/main/java/com/yahoo/document/update/TensorModifyUpdate.java), [C++](https://github.com/vespa-engine/vespa/blob/master/document/src/vespa/document/update/tensor_modify_update.h)
+
+
+## Feed clients in different languages
+
+**Effort:** Low<br/>
+**Difficulty:** Low<br/>
+**Skills:** Knowledge of a decent HTTP/2 library in some language
+
+/document/v1 is a RESTified HTTP API which exposes the Vespa Document API to the
+outside of the application's Java containers. The design of this API is simple,
+with each operation modelled as a single HTTP request, and its result as
+a single HTTP response. While it was previously not possible to achieve comparable
+throughput using this API to what the undocumented, custom-protocol /feedapi offered,
+this changed with HTTP/2 support in Vespa. The clean design of /document/v1 makes it
+easy to interface with from any language and runtime that support HTTP/2.
+An implementation currently only exists for Java, and requires a JDK8+ runtime,
+and implementations in other languages are very welcome. The below psuedo-code could
+be a starting point for an asynchronous implementation with futures and promises.
+
+Let `http` be an asynchronous HTTP/2 client, which returns a `future` for each request.
+A `future` will complete some time in the future, at which point dependent computations
+will trigger, depending on the result of the operation. A `future` is obtained from a
+`promise`, and completes when the `promise` is completed. An efficient feed client is then:
+
+```
+inflight = map<document_id, promise>()
+
+func dispatch(operation: request, result: promise, attempt: int): void
+ http.send(operation).when_complete(response => handle(operation, response, result, attempt))
+
+func handle(operation: request, response: response, result: promise, attempt: int): void
+ if retry(response, attempt):
+ dispatch(operation, result, attempt + 1)
+ else:
+ result.complete(response)
+
+func enqueue(operation): future
+ result_promise = promise()
+ result = result_promise.get_future()
+ previous = inflight.put(document.id, result) # store `result` under `id` and obtain previous mapping
+ if previous == NIL:
+ while inflight.size >= max_inflight(): wait()
+ dispatch(operation, result, 1)
+ else:
+ previous.when_complete(ignored => dispatch(operation, result, 1))
+ result.when_complete(ignored => inflight.remove_value(result)) # remove mapping unless it has been replaced
+ return result
+```
+
+Apply synchronization as necessary. The `inflight` map is used to serialise multiple operations
+to the same document id: the mapped entry for each id is the tail of a linked queue where new
+dependents may be added, while the queue is emptied from the head one entry at a time, whenever
+a dependency (`previous`) completes computation. `enqueue` blocks until there is room in the client.
+
+**Code pointers:**
+- [Java feed client](https://github.com/vespa-engine/vespa/blob/master/vespa-feed-client/src/main/java/ai/vespa/feed/client/FeedClient.java)