Update TODO.md with feed client task

author: Jon Marius Venstad <venstad@gmail.com> 2021-06-25 17:12:13 +0200
committer: Jon Marius Venstad <venstad@gmail.com> 2021-06-25 17:12:13 +0200
commit: bf3053f7d316f8baf470ed7cce951a41f6cb59da (patch)
tree: 6f2cde4f0daf3b1429dc7a0d281784bdcc9bae7a /TODO.md
parent: f2001ec54146d5f325cbf248e000dbc8eacf41f6 (diff)
1 files changed, 56 insertions, 0 deletions
diff --git a/TODO.md b/TODO.md
index c633a1bf38a..efedf55c7f1 100644
--- a/TODO.md
+++ b/TODO.md
@@ -99,3 +99,59 @@ model updates.
 
 **Code pointers:**
 - Tensor modify operation (for document tensors): [Java](https://github.com/vespa-engine/vespa/blob/master/document/src/main/java/com/yahoo/document/update/TensorModifyUpdate.java), [C++](https://github.com/vespa-engine/vespa/blob/master/document/src/vespa/document/update/tensor_modify_update.h)
+
+
+## Feed clients in different languages
+
+**Effort:** Low<br/>
+**Difficulty:** Low<br/>
+**Skills:** Knowledge of a decent HTTP/2 library in some language
+
+/document/v1 is a RESTified HTTP API which exposes the Vespa Document API to the
+outside of the application's Java containers. The design of this API is simple,
+with each operation modelled as a single HTTP request, and its result as
+a single HTTP response. This API previously could not match the throughput of the
+undocumented, custom-protocol /feedapi, but this changed with HTTP/2 support in
+Vespa. The clean design of /document/v1 makes it easy to interface with from any
+language and runtime that supports HTTP/2. An implementation currently exists
+only for Java, and requires a JDK8+ runtime; implementations in other languages
+are very welcome. The pseudo-code below could be a starting point for an
+asynchronous implementation with futures and promises.
+
+Let `http` be an asynchronous HTTP/2 client, which returns a `future` for each request.
+A `future` will complete some time in the future, at which point dependent computations
+will trigger, depending on the result of the operation. A `future` is obtained from a
+`promise`, and completes when the `promise` is completed. An efficient feed client is then:
+
+```
+inflight = map<document_id, future>()
+
+func dispatch(operation: request, result: promise, attempt: int): void
+    http.send(operation).when_complete(response => handle(operation, response, result, attempt))
+
+func handle(operation: request, response: response, result: promise, attempt: int): void
+    if retry(response, attempt):
+        dispatch(operation, result, attempt + 1)
+    else:
+        result.complete(response)
+
+func enqueue(operation: request): future
+    result_promise = promise()
+    result = result_promise.get_future()
+    previous = inflight.put(operation.document_id, result)  # store `result` under the id and obtain previous mapping
+    if previous == NIL:
+        while inflight.size >= max_inflight(): wait()
+        dispatch(operation, result_promise, 1)
+    else:
+        previous.when_complete(ignored => dispatch(operation, result_promise, 1))
+    result.when_complete(ignored => inflight.remove_value(result)) # remove mapping unless it has been replaced
+    return result
+```
+
+Apply synchronization as necessary. The `inflight` map is used to serialise multiple operations
+to the same document id: the mapped entry for each id is the tail of a linked queue where new
+dependents may be added, while the queue is emptied from the head one entry at a time, whenever
+a dependency (`previous`) completes computation. `enqueue` blocks until there is room in the client.
+
+**Code pointers:**
+- [Java feed client](https://github.com/vespa-engine/vespa/blob/master/vespa-feed-client/src/main/java/ai/vespa/feed/client/FeedClient.java)
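
Not part of the patch: the pseudo-code in the TODO entry above could be rendered roughly as follows with Python's `asyncio`. This is a minimal sketch under stated assumptions, not the Java client's implementation. The `send` callable, the `FeedClient` class name, the retryable status set, and the `max_attempts` limit are all hypothetical. Unlike the pseudo-code, `enqueue` here never blocks; a semaphore bounds the number of concurrent sends instead.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

# Assumption: these status codes are treated as retryable; the source does not say which are.
RETRYABLE = {429, 503}


class FeedClient:
    """Serialises operations per document id and bounds concurrent sends.

    Create instances inside a running event loop.
    """

    def __init__(self, send: Callable[[str, str], Awaitable[int]],
                 max_inflight: int = 2, max_attempts: int = 3) -> None:
        self._send = send                      # hypothetical async transport returning a status code
        self._max_attempts = max_attempts
        self._slots = asyncio.Semaphore(max_inflight)
        self._tails: Dict[str, asyncio.Future] = {}  # tail of the queue for each document id

    def enqueue(self, doc_id: str, operation: str) -> "asyncio.Future[int]":
        previous = self._tails.get(doc_id)
        result = asyncio.get_running_loop().create_task(
            self._run(doc_id, operation, previous))
        self._tails[doc_id] = result           # this operation is now the tail for doc_id
        result.add_done_callback(lambda done: self._forget(doc_id, done))
        return result

    async def _run(self, doc_id: str, operation: str,
                   previous: Optional[asyncio.Future]) -> int:
        if previous is not None:
            await asyncio.wait([previous])     # wait for the predecessor, whatever its outcome
        async with self._slots:
            return await self._dispatch(doc_id, operation)

    async def _dispatch(self, doc_id: str, operation: str) -> int:
        attempt = 1
        while True:
            status = await self._send(doc_id, operation)
            if status in RETRYABLE and attempt < self._max_attempts:
                attempt += 1                   # retry the same operation
            else:
                return status

    def _forget(self, doc_id: str, done: asyncio.Future) -> None:
        # Remove the mapping unless a later operation has replaced it.
        if self._tails.get(doc_id) is done:
            del self._tails[doc_id]
```

A caller would await the futures returned by `enqueue`, for example with `asyncio.gather`. Operations on the same document id are sent in enqueue order, while operations on different ids may interleave.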