author Jon Bratseth <bratseth@verizonmedia.com> 2019-02-15 15:09:59 +0100
committer Jon Bratseth <bratseth@verizonmedia.com> 2019-02-15 15:09:59 +0100
commit a37504d5750b77df68e4c263d7a448e9eaeb5a36
tree 3d3057ce27056291224a40af27aa5f1cc293835a /Code-map.md
parent 7097b927e3a54b38c57006a2525774f6816bb53d
WIP
Diffstat (limited to 'Code-map.md')
-rw-r--r-- Code-map.md 74
1 files changed, 65 insertions, 9 deletions
diff --git a/Code-map.md b/Code-map.md
index 7488b913fa6..6b93f99c552 100644
--- a/Code-map.md
+++ b/Code-map.md
@@ -6,20 +6,76 @@ Vespa consists of about 1.7 million lines of code, about equal parts Java and C+
Since it's mostly written by a team of developers selected for their ability
to do this kind of thing unusually well, who have been given time to dedicate
themselves to it for a long time, it is mostly easy to work with. However, one
-thing we haven't done is to create a module structure friendly to newcomers.
+thing we haven't done is to create a module structure friendly to newcomers - the code
+is simply organized in a flat structure of about 150 modules.
-This document aims to alleviate that somewhat by providing a map from the [[https://docs.vespa.ai/documentation/overview.html][functional elements]
-of Vespa to the top-level module structure
-[[https://github.com/vespa-engine/vespa][on Github]].
+This document aims to provide a map from the
+[functional elements](https://docs.vespa.ai/documentation/overview.html)
+of Vespa to the most important modules in the flat module structure in the
+[code base on GitHub](https://github.com/vespa-engine/vespa).
-We'll start with the query and write paths, outside in.
## The stateless container
-When a request is made to Vespa it first enters some stateless container cluster.
-These containers consists of a core layer which provides general request-response
-handling (using Jetty for HTTP), component management, configuration and similar
-basic functionality.
+When a request is made to Vespa it first enters some stateless container cluster,
+called jDisc. This consists of:
+
+- a __jDisc core__ layer which provides a model of a running application and general protocol-independent request-response handling, with various protocol implementations,
+- a __jDisc container__ layer providing component management, configuration and similar,
+- a __search middleware__ layer containing query/result APIs, query execution logic etc.,
+- APIs and modules for writing and processing document operations.
+
+The stateless container is implemented in Java.
+
+jDisc core modules:
+
+- [jdisc_core](https://github.com/vespa-engine/vespa/tree/master/jdisc_core) - The core jDisc functionality
+- [jdisc_http_service](https://github.com/vespa-engine/vespa/tree/master/jdisc_http_service) - HTTP connector for jDisc, implemented using Jetty.
+
+jDisc container modules, layered on jDisc core:
+
+- [container-disc](https://github.com/vespa-engine/vespa/tree/master/container-disc) - Integration between the jDisc container and jDisc core layers.
+- [container-core](https://github.com/vespa-engine/vespa/tree/master/container-core) - Core jDisc container functionality: Metrics, OSGi integration for component bundles, etc.
+- [component](https://github.com/vespa-engine/vespa/tree/master/component) - the component model. Components (in Java) will implement or subclass a type from this module.
+- [chain](https://github.com/vespa-engine/vespa/tree/master/chain) - generic support for chaining components in a Chain of Responsibility structure, which is a pattern used repeatedly in higher level modules.
+- [container-di](https://github.com/vespa-engine/vespa/tree/master/container-di) - component dependency injection framework for the container, compatible with Guice annotations but a separate implementation which handles injection of config and injection of component collections.
+- [processing](https://github.com/vespa-engine/vespa/tree/master/processing) - generic, chainable request-response processing framework (Processors).
+
+Search container, layered on jDisc container:
+
+- [container-search](https://github.com/vespa-engine/vespa/tree/master/container-search) - Query-Result processing framework (Searchers) layered over the generic processing framework, query profiles, and the global query execution logic: Dispatch (scatter-gather), grouping, etc.
+
+Document operation modules:
+
+- [document](https://github.com/vespa-engine/vespa/tree/master/document) - The document model - documents, fields and document types, and operations on documents.
+- [messagebus](https://github.com/vespa-engine/vespa/tree/master/messagebus) - Generic async, multi-hop message passing implemented in both Java and C++.
+- [jdisc_messagebus_service](https://github.com/vespa-engine/vespa/tree/master/jdisc_messagebus_service) - MessageBus connector for jDisc.
+- [documentapi](https://github.com/vespa-engine/vespa/tree/master/documentapi) - API for issuing document operations to Vespa over messagebus.
+- [docproc](https://github.com/vespa-engine/vespa/tree/master/docproc) - Chainable document (operation) processors: Document operations issued over messagebus to Vespa will usually be routed through a container running a document processor chain.
+- [indexinglanguage](https://github.com/vespa-engine/vespa/tree/master/indexinglanguage) - Implementation of the "indexing" language which is used to express the statements prefixed by "indexing:" in the search definition.
+- [docprocs](https://github.com/vespa-engine/vespa/tree/master/docprocs) - Document processor components bundled with Vespa. Notably the Indexingprocessor - a document processor invoking the indexing language statements configured for the document type in question on document operations.
+- [vespaclient-container-plugin](https://github.com/vespa-engine/vespa/tree/master/vespaclient-container-plugin) - Implements the document/v1 API and internal API used by the Java HTTP client on top of the jDisc container, forwarding to the Document API.
+- [vespa-http-client](https://github.com/vespa-engine/vespa/tree/master/vespa-http-client) - client for fast writing to the internal API implemented by vespaclient-container-plugin.
+
+## Content nodes
+
+Content nodes store all data in Vespa, maintain reverse indexes and perform the distributed parts of query execution - matching, ranking and grouping/aggregation.
+They are written in C++.
+
+- [searchcore](https://github.com/vespa-engine/vespa/tree/master/searchcore) - core functionality for maintaining indexes, matching, data storage, grouping, as well as the content node server itself (called proton).
+- [searchlib](https://github.com/vespa-engine/vespa/tree/master/searchlib) - libraries invoked by searchcore: Ranking (feature execution framework (fef), rank feature implementations, ranking expressions), index and btree implementations etc. This also contains the Java libraries for ranking.
+- [storage](https://github.com/vespa-engine/vespa/tree/master/storage/src/vespa/storage) - system for elastic and auto-recovering data storage over clusters of nodes.
+- [eval](https://github.com/vespa-engine/vespa/tree/master/eval) - library for efficient evaluation of ranking expressions.
+- [storageapi](https://github.com/vespa-engine/vespa/tree/master/storageapi/src/vespa/storageapi) - message bus messages and implementation for the document API.
+- [clustercontroller-core](https://github.com/vespa-engine/vespa/tree/master/clustercontroller-core) - cluster controller for storage, implemented in Java. This provides singular node-level decision making for storage, based on ZooKeeper.
+
+
+## Configuration and administration
+
+The third major subsystem in Vespa is responsible for managing configuration, clusters, application deployment and similar.
+It is implemented in Java.
+
+