Diffstat (limited to 'vespaclient-java/src/main/sh/vespa-visit.1')
-rw-r--r-- | vespaclient-java/src/main/sh/vespa-visit.1 | 160 |
1 files changed, 160 insertions, 0 deletions
diff --git a/vespaclient-java/src/main/sh/vespa-visit.1 b/vespaclient-java/src/main/sh/vespa-visit.1
new file mode 100644
index 00000000000..7b8f7521865
--- /dev/null
+++ b/vespaclient-java/src/main/sh/vespa-visit.1
@@ -0,0 +1,160 @@
+.\" Copyright 2017 Yahoo Inc. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root.
+.TH VESPAVISIT 1 2008-03-07 "Vespa" "Vespa Documentation"
+.SH NAME
+vespa-visit \- Visit documents from a Vespa installation
+.SH SYNOPSIS
+.B vespa-visit
+[\fIOPTION\fR]...
+.SH DESCRIPTION
+.PP
+In the regular case, retrieve documents stored in Vespa, and either print
+them to STDOUT or send them to a given MessageBus route.
+.PP
+A Vespa visit operation processes a set of stored documents, in undefined
+order, locally on the storage nodes where they are stored. A visitor library
+available on all storage nodes receives the documents stored locally, and
+can process these and send messages to the visitor data handler. The regular
+case is to use the DumpVisitor library, which merely sends the documents
+themselves in blocks back to the data handler. By default the data handler is
+this client, which writes the documents to STDOUT.
+.PP
+Mandatory arguments to long options are mandatory for short options too.
+Short options cannot currently be combined.
+.TP
+\fB\-s\fR, \fB\-\-selection\fR \fISELECTION\fR
+A document selection string, specifying which documents to visit. The selection
+language is described in the Vespa documentation. Note that this argument
+should normally be quoted to prevent your shell from interpreting parts of the
+selection.
+.TP
+\fB\-f\fR, \fB\-\-from\fR \fITIME\fR
+If this option is given, only documents with the given timestamp or newer will
+be visited. The time is given in microseconds since 1970.
+.TP
+\fB\-t\fR, \fB\-\-to\fR \fITIME\fR
+If this option is given, only documents up to and including the given timestamp
+will be visited. The time is given in microseconds since 1970.
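The \-\-from and \-\-to options above take microseconds since 1970, which is easy to get wrong by a factor of 1000. The following is a minimal sketch of the conversion, assuming that "since 1970" means the Unix epoch in UTC; the helper names to_micros and visit_args are hypothetical and not part of Vespa.

```python
from datetime import datetime, timezone


def to_micros(moment):
    """Return microseconds since 1970-01-01 00:00:00 UTC (assumed epoch)."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = moment - epoch
    # Integer arithmetic avoids float rounding on large timestamps.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def visit_args(start, end):
    """Build an argument list using the --from and --to options from the man page."""
    return ["vespa-visit", "--from", str(to_micros(start)), "--to", str(to_micros(end))]


if __name__ == "__main__":
    day_one = datetime(1970, 1, 1, tzinfo=timezone.utc)
    day_two = datetime(1970, 1, 2, tzinfo=timezone.utc)
    print(" ".join(visit_args(day_one, day_two)))
```

The sketch only builds the argument list; it does not run vespa-visit. The list could be passed to subprocess.run on a host where the tool is installed.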
+.TP
+\fB\-e\fR, \fB\-\-headersonly\fR
+By default, whole documents are processed. If this option is given,
+only the header parts of documents will be processed. By defining the big
+document fields as body fields, you can efficiently visit all the header fields
+using this option.
+.TP
+\fB\-i\fR, \fB\-\-printids\fR
+With this option, only the document identifiers are printed to STDOUT.
+In addition, if visiting removes, an extra tag is added so you can
+see whether a document has been removed. This option implies headers-only
+visiting, and can only be used if no data handler is specified.
+.TP
+\fB\-d\fR, \fB\-\-datahandler\fR \fIVISITTARGET\fR
+The data handler is the destination of messages sent from the visitor library.
+By default, the data handler is the vespa-visit process you start, which
+merely prints all returned data to STDOUT. A visit target can be specified
+instead. See the VISIT TARGET section below.
+.TP
+\fB\-p\fR, \fB\-\-progress\fR \fIFILE\fR
+If a progress file is set, the current visitor progress is saved to this
+file at regular intervals. If the file exists on startup, the visitor
+continues from the saved point.
+.TP
+\fB\-o\fR, \fB\-\-timeout\fR \fITIMEOUT\fR
+Time out the visitor after the given number of milliseconds.
+.TP
+\fB\-r\fR, \fB\-\-visitremoves\fR
+By default, only documents existing in Vespa are processed. With
+this option, entries identifying previously existing documents are also
+returned. This is useful for systems keeping secondary copies of the data that
+need to know whether documents they have stored have been removed. Note that
+documents deleted a long time ago are no longer tracked. Vespa keeps remove
+entries for a configurable amount of time.
+.TP
+\fB\-m\fR, \fB\-\-maxpending\fR \fINUM\fR
+Maximum pending docblock messages to data handlers. This may be used to
+increase or reduce visiting speed, but should not be set so high that data
+handlers run out of memory.
+To get an estimate of memory consumption on each data handler, multiply
+maxpending by defaultdocblocksize in the stor-visitor config and divide by the
+number of data handlers. The default value for maxpending is 16.
+.TP
+\fB\-c\fR, \fB\-\-cluster\fR \fICLUSTER\fR
+Visit the given VDS cluster.
+.TP
+\fB\-v\fR, \fB\-\-verbose\fR
+More verbose output. Indent XML and add progress and info to STDERR.
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+Shows a short syntax reminder.
+.PP
+Advanced options:
+.PP
+The options below are for advanced usage or for testing.
+.TP
+\fB\-\-visitlibrary\fR \fILIBRARY\fR
+By default, the DumpVisitor library, sending documents back to the data handler,
+is used when visiting. Another library can be specified using this option. The
+library filename should be the name given here, with lib prepended and .so
+appended.
+.TP
+\fB\-\-libraryparam\fR \fIKEY\fR \fIVALUE\fR
+The default DumpVisitor library has no options to set, but custom libraries
+may need user-specifiable options, which can be given here. See the
+visitor library's documentation for legal parameters.
+.TP
+\fB\-\-polling\fR \fIarg\fR
+The document API implements both a polling and a callback visitor API. The
+callback API is the most efficient and is used by default. The polling API might
+be simpler for users used to such APIs. Some Vespa system tests use this option
+to test that the polling API works.
+.TP
+\fB\-\-visitinconsistentbuckets\fR
+In some cases Vespa may temporarily be in an inconsistent state, that is,
+different nodes contain different copies of the data. Collections of documents
+are grouped into so-called buckets. The normal behavior of visiting is to wait
+for the inconsistencies to resolve before actually visiting the data. This
+might be a problem for time-critical applications.
+Setting this option causes the bucket copy with the most documents to be
+visited in case of inconsistencies, which means that the data returned by the
+visitor is not guaranteed to be correct.
+.SH VISIT TARGET
+Results from visiting can be sent to several different kinds of targets.
+.TP
+\fBMessage bus routes\fR
+You can specify a message bus route name directly, and this route will be used
+to send the results. This is typically used when doing reprocessing within
+Vespa. Message bus routes are set up in the application package. In addition,
+some routes may be autogenerated in simple setups. For instance, a
+route called \fIdefault\fR is generated if your setup is so simple that Vespa
+can guess where you want to send your data.
+.TP
+\fBSlobrok address\fR
+You can also specify a slobrok address for data to be sent to. A slobrok address
+is a slash-separated path in which an asterisk matches any element of
+the path. For instance, if you have a docproc cluster called \fImydpcluster\fR,
+it will have registered its nodes with slobrok names like
+\fIdocproc/cluster.mydpcluster/docproc/0/feed_processor\fR, where the 0
+indicates the first node in the cluster. You can thus send visit data
+to this docproc cluster by giving a slobrok address of
+\fIdocproc/cluster.mydpcluster/docproc/*/feed_processor\fR. Note that this sends
+neither all the data to a single node nor a copy to every node. The data sent
+from the visitor is distributed among the matching nodes, and each message is
+sent to only one node.
+
+Slobrok names may also be used if you use the \fBvespa-visit-target\fR tool to
+retrieve the data at some location. If you start vespa-visit-target on two nodes,
+listening to slobrok names \fImynode/0/visit-destination\fR and
+\fImynode/1/visit-destination\fR, you can send the results to these nodes by
+specifying \fImynode/*/visit-destination\fR as the data handler.
+See \fBman vespa-visit-target\fR for the naming conventions used for such targets.
+.TP
+\fBTCP socket\fR
+TCP sockets can also be specified directly. This requires that the endpoint
+speaks FNET RPC, though. This is typically achieved either by using the
+\fBvespa-visit-target\fR tool, or by setting up a visitor destination
+programmatically using a utility class in the document API. A socket address
+has the form tcp/\fIhostname\fR:\fIport\fR/\fIservicename\fR. For instance, an
+address generated by the \fBvespa-visit-target\fR tool might look like
+\fItcp/myhost.com:12345/visit-destination\fR.
+
+.SH AUTHOR
+Written by Haakon Humberset.
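The slobrok address rule in the VISIT TARGET section (a slash-separated path where an asterisk stands for any element) can be illustrated with a small matcher. This is only a sketch, not Vespa's implementation: it assumes that an asterisk matches exactly one whole path element, and the function name slobrok_matches is hypothetical.

```python
def slobrok_matches(pattern, name):
    """Return True if `name` matches `pattern`, element by element.

    Assumption: "*" stands for exactly one whole path element.
    """
    pattern_parts = pattern.split("/")
    name_parts = name.split("/")
    if len(pattern_parts) != len(name_parts):
        return False
    return all(p == "*" or p == n for p, n in zip(pattern_parts, name_parts))


# Registered names taken from the examples in the man page text.
names = [
    "docproc/cluster.mydpcluster/docproc/0/feed_processor",
    "docproc/cluster.mydpcluster/docproc/1/feed_processor",
    "mynode/0/visit-destination",
]

pattern = "docproc/cluster.mydpcluster/docproc/*/feed_processor"
matching = [n for n in names if slobrok_matches(pattern, n)]

if __name__ == "__main__":
    for name in matching:
        print(name)
```

With the names above, the pattern selects the two docproc nodes and leaves out the visit-destination entry. According to the man page, each visitor message then goes to only one of the matching nodes.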