author | Jon Bratseth <bratseth@verizonmedia.com> | 2019-02-15 15:09:59 +0100 |
---|---|---|
committer | Jon Bratseth <bratseth@verizonmedia.com> | 2019-02-15 15:09:59 +0100 |
commit | a37504d5750b77df68e4c263d7a448e9eaeb5a36 | |
tree | 3d3057ce27056291224a40af27aa5f1cc293835a /Code-map.md | |
parent | 7097b927e3a54b38c57006a2525774f6816bb53d |
WIP
Diffstat (limited to 'Code-map.md')
-rw-r--r-- | Code-map.md | 74 |
1 files changed, 65 insertions, 9 deletions
diff --git a/Code-map.md b/Code-map.md
index 7488b913fa6..6b93f99c552 100644
--- a/Code-map.md
+++ b/Code-map.md
@@ -6,20 +6,76 @@ Vespa consists of about 1.7 million lines of code, about equal parts Java and C+
 Since it it's mostly written by a team of developers selected for their ability
 to do this kind of thing unusually well, who have been given time to dedicate
 themselves to it for a long time, it is mostly easily to work with. However, one
-thing we haven't done is to create a module structure friendly to newcomers.
+thing we haven't done is to create a module structure friendly to newcomers - the code
+simply organized in a flat structure of about 150 modules.
 
-This document aims to alleviate that somewhat by providing a map from the [[https://docs.vespa.ai/documentation/overview.html][functional elements]
-of Vespa to the top-level module structure
-[[https://github.com/vespa-engine/vespa][on Github]].
+This document aims to provide a map from the
+[functional elements](https://docs.vespa.ai/documentation/overview.html)
+of Vespa to the most implrtant modules in the flat module structure in the
+[code base on GitHub]()[https://github.com/vespa-engine/vespa).
 
-We'll start with the query and write paths, outside in.
 
 ## The stateless container
 
-When a request is made to Vespa it first enters some stateless container cluster.
-These containers consists of a core layer which provides general request-response
-handling (using Jetty for HTTP), component management, configuration and similar
-basic functionality.
+When a request is made to Vespa it first enters some stateless container cluster,
+called jDisc. This consists of:
+
+- a __jDisc core__ layer which provides a model of a running application, general protocol-independent request-response handling, with various protocol implementations,
+- a __jDisc container__ layer providing component management, configuration and similar.
+- a __search middleware__ layer containing query/result API's, query execution logic etc.
+- API's and modules for writing and processing document operations.
+
+The stateless container is implemented in Java.
+
+jDisc core modules:
+
+- [jdisc_core](https://github.com/vespa-engine/vespa/tree/master/jdisc_core) - The core jDisc functionality
+- [jdisc_http_service](https://github.com/vespa-engine/vespa/tree/master/jdisc_http_service) - HTTP connector for jDisc, implemented using Jetty.
+
+jDisc container modules, layered on jDisc core:
+
+- [container-disc](https://github.com/vespa-engine/vespa/tree/master/container-disc) - Integration between the jDisc container and jDisc core layers.
+- [container-core](https://github.com/vespa-engine/vespa/tree/master/container-core) - Core jDisc container functionality: Metrics, OSGi integration for component bundles, etc.
+- [component](https://github.com/vespa-engine/vespa/tree/master/component) - the component model. Components (in Java) will implement or subclass a type for this module.
+- [chain](https://github.com/vespa-engine/vespa/tree/master/chain) - generic support for chaining components in a Chain of Responsibility structure, which is a pattern used repeatedly in higher level modules.
+- [container-di](https://github.com/vespa-engine/vespa/tree/master/container-di) - component dependency injection framework for the container, compatible with Guice annotations but an separate implementation which handles injection of config and injection of component collections.
+- [processing](https://github.com/vespa-engine/vespa/tree/master/processing) - generic, chainable request-response processing framework (Processors).
+
+Search container, layered on jDisc container:
+
+- [container-search](https://github.com/vespa-engine/vespa/tree/master/container-search) - Query-Result processing framework (Searchers) layered over the generic processing framework, query profiles, and the global query execution logic: Dispatch (scatter-gather), grouping, etc.
+
+Document operation modules:
+
+- [document](https://github.com/vespa-engine/vespa/tree/master/document) - The document model - documents, fields and document types, and operations on documents.
+- [messagebus](https://github.com/vespa-engine/vespa/tree/master/messagebus) - Generic async, multi-hop message passing implemented in both Java and C++.
+- [jdics_messagebus_service](https://github.com/vespa-engine/vespa/tree/master/jdisc_messagebus_service) - MessageBus connector for jDisc.
+- [documentapi](https://github.com/vespa-engine/vespa/tree/master/documentapi) - API for issuing document operations to Vespa over messagebus.
+- [docproc](https://github.com/vespa-engine/vespa/tree/master/docproc) - Chainable document (operation) processors: Document operations issued over messagebus to Vespa will usually be routed through a container running a document processor chain.
+- [indexinglanguage](https://github.com/vespa-engine/vespa/tree/master/indexinglanguage) - Implementation of the "indexing" language which is used to express the statements prefixed by "indexing:" in the search definition.
+- [docprocs](https://github.com/vespa-engine/vespa/tree/master/docprocs) - Document processor components bundled with Vespa. Notably the Indexingprocessor - a document processor invoking the indexing language statements configured for the document type in question on document operations.
+- [vespaclient-container-plugin](https://github.com/vespa-engine/vespa/tree/master/vespaclient-container-plugin) - Implements the document/v1 API and internal API used by the Java HTTP client on top of the jDisc container, forwarding to the Document API.
+- [vespa-http-client](https://github.com/vespa-engine/vespa/tree/master/vespa-http-client) - client for fast writing to the internal API implemented by vespaclient-container-plugin.
+
+## Content nodes
+
+Content nodes store all data in Vespa, maintains reverse indexes and performs the distributed parts of query execution - matching, ranking and grouping/aggregation.
+This is written in C++.
+
+- [searchcore](https://github.com/vespa-engine/vespa/tree/master/searchcore) - core functionality for maintaining indexes, matching, data storage, grouping, as well as the content node server itselv (called proton).
+- [searchlib](https://github.com/vespa-engine/vespa/tree/master/searchlib) - libraries invoked by searchcore: Ranking (feature execution framework (fef), rank feature implementations, ranking expressions), index and btree implementations etc. This also contains the Java libraries for ranking.
+- [storage](https://github.com/vespa-engine/vespa/tree/master/storage/src/vespa/storage) - system for elastic and auto-recovering data storage over clusters of nodes.
+- [eval](https://github.com/vespa-engine/vespa/tree/master/eval) - library for efficient evaluation of ranking expressions.
+- [storageapi](https://github.com/vespa-engine/vespa/tree/master/storageapi/src/vespa/storageapi) - message bus messages and implementation for the document API.
+- [clustercontroller-core](https://github.com/vespa-engine/vespa/tree/master/clustercontroller-core) - cluster controller for storage, implemented in Java. This provides singular node-level decision making for storage, based on ZooKeeper.
+
+
+## Configuration and administration
+
+The third major subsystem in Vespa is responsible for managing configuration, clusters, application deployment and similar.
+It is implemented in Java.
+
+