Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Remove duplicate headers | Jon Bratseth | 2021-03-18 | 1 | -1/+1 |
| | |||||
* | Add copyright headers | Jon Bratseth | 2021-03-18 | 1 | -1/+2 |
| | |||||
* | Simplify, testing server is not used | Harald Musum | 2021-03-03 | 1 | -59/+7 |
| | |||||
* | Use CuratorConfig | Martin Polden | 2020-11-20 | 1 | -13/+17 |
| | |||||
* | Extract ConnectionSpec | Martin Polden | 2020-11-20 | 2 | -16/+74 |
| | |||||
* | Also test cumulative deadlock counters | Håkon Hallingstad | 2020-10-20 | 1 | -0/+4 |
| | |||||
* | Replace deadlock avoidance with metrics | Håkon Hallingstad | 2020-10-19 | 1 | -9/+16 |
| | |||||
* | Deadlock detection | Håkon Hallingstad | 2020-10-11 | 1 | -5/+70 |
| | Just before Lock.acquire() is invoked, the locks within the process are queried to see if a "deadlock" would occur: the current thread is waiting to acquire lock path P1, which is held by thread T1, which is waiting to acquire a lock at path P2, and so on, until some thread is waiting for a lock held by the current thread. Even without this PR the deadlock would resolve itself automatically, because all locks are acquired with timeouts. However, this PR (1) resolves the deadlock immediately, and (2) leaves a log trace (hopefully from the exception) that allows us to refactor code to avoid such deadlocks. | ||||
* | Avoid metrics on reentry of lock | Håkon Hallingstad | 2020-10-08 | 2 | -32/+60 |
| | |||||
* | Make richer latency stats | Håkon Hallingstad | 2020-10-05 | 4 | -139/+144 |
| | Makes a LatencyStats which provides some useful metrics, best explained there and in LatencyMetrics. This includes latency metrics, the "QPS" (i.e. the number of acquire() calls per second), and load metrics. Unfortunately I had to move from atomics to synchronized to accomplish this, but I see no other way. | ||||
* | Move lock metrics to MetricsReporter | Håkon Hallingstad | 2020-10-03 | 5 | -29/+196 |
| | Adds two new metrics: (1) The load of acquiring each lock path: the average number of threads waiting to acquire the lock within the last minute (or unit of time), a.k.a. the lock queue (depth). (2) The load of the lock for each lock path: the average number of threads holding the lock within the last minute (or unit of time). This is always <= 1, a.k.a. the lock utilization. Changes LockCounters to LockMetrics, and exports those once every minute through MetricsReporter, which is designed for this. | ||||
* | Add metrics to lock attempts | Håkon Hallingstad | 2020-10-01 | 3 | -4/+6 |
| | |||||
* | Record locks taken for external deploys | Håkon Hallingstad | 2020-09-30 | 1 | -11/+11 |
| | - Information about a lock attempt now includes a list of lock attempts done while holding the lock, forming a tree (forest) structure. - Records the duration and locking attempts done as part of an external deploy, forming a tree of locks with timing info. The currently active external deploys are shown in an "ongoing-recording" field of /nodes/v2/locks. - The 3 longest external deploys are kept in "recordings" in /nodes/v2/locks. - Extracts the global process-wide parts of ThreadLockStats into a separate class for clarity. | ||||
* | More info -> attempt renames | Håkon Hallingstad | 2020-09-28 | 2 | -11/+11 |
| | |||||
* | LockInfo -> LockAttempt, ThreadLockInfo -> ThreadLockStats, and more | Håkon Hallingstad | 2020-09-28 | 2 | -36/+44 |
| | |||||
* | Use deque as stack | Håkon Hallingstad | 2020-09-28 | 1 | -0/+23 |
| | |||||
* | Mock lock path from thread to per-lock (bug) | Håkon Hallingstad | 2020-09-26 | 2 | -1/+59 |
| | |||||
* | Adds method name to stack trace and adds timeout count and test | Håkon Hallingstad | 2020-09-25 | 1 | -0/+113 |
| | |||||
* | Stick to junit for simple test. | Henning Baldersheim | 2020-08-11 | 1 | -6/+5 |
| | |||||
* | Wait longer for servers to reach barrier | Harald Musum | 2020-04-29 | 1 | -3/+3 |
| | 1. Wait up to 2 seconds for all servers to reach the barrier. 2. If not, wait up to 4 seconds for the server that waits for the barrier to be one of the respondents AND for a majority of servers to have reached the barrier. 3. If not, wait for a majority of servers to have reached the barrier. | ||||
* | Create zookeeper client config file only when necessary | Harald Musum | 2020-01-09 | 1 | -5/+4 |
| | |||||
* | Set TLS config for client based on VESPA_USE_TLS_FOR_ZOOKEEPER_CLIENT | Harald Musum | 2019-12-05 | 2 | -5/+6 |
| | |||||
* | Revert "Revert "Reapply "Move ZooKeeperServer to another module""" | Harald Musum | 2019-10-23 | 1 | -140/+0 |
| | |||||
* | Revert "Reapply "Move ZooKeeperServer to another module"" | Harald Musum | 2019-10-22 | 1 | -0/+140 |
| | |||||
* | Revert "Revert "Reapply "move ZooKeeperServer to another module""" | Harald Musum | 2019-10-22 | 1 | -140/+0 |
| | |||||
* | Revert "Reapply "move ZooKeeperServer to another module"" | Håkon Hallingstad | 2019-10-22 | 1 | -0/+140 |
| | |||||
* | Revert "Revert "Reapply "Move ZooKeeperServer to another module"""" | Harald Musum | 2019-10-21 | 1 | -140/+0 |
| | |||||
* | Revert "Reapply "Move ZooKeeperServer to another module""" | Harald Musum | 2019-10-21 | 1 | -0/+140 |
| | |||||
* | Revert "Revert "Move ZooKeeperServer to another module"" | Harald Musum | 2019-10-20 | 1 | -140/+0 |
| | |||||
* | Revert "Move ZooKeeperServer to another module" | Harald Musum | 2019-10-18 | 1 | -0/+140 |
| | |||||
* | Move ZooKeeperServer to another module | Harald Musum | 2019-10-17 | 1 | -140/+0 |
| | The zookeeper-server jar is not a preinstalled bundle, as zkfacade is, so we need to add the bundle explicitly for clustercontroller and add a symlink from the components dir for the config server | ||||
* | Add constructor without ZooKeeperServer argument, for testing | Harald Musum | 2019-10-16 | 1 | -1/+1 |
| | | | | Will be used by code in internal repo, so needs to be public | ||||
* | Revert "Reapply "upgrade to zookeeper 3.5"" | Harald Musum | 2019-09-27 | 1 | -6/+2 |
| | |||||
* | Revert "Revert "Hmusum/upgrade to zookeeper 3.5"" | Harald Musum | 2019-09-24 | 1 | -2/+6 |
| | |||||
* | Revert "Hmusum/upgrade to zookeeper 3.5" | Harald Musum | 2019-09-17 | 1 | -6/+2 |
| | |||||
* | Fix error from rebasing onto master: server numbering starts at 0. | gjoranv | 2019-09-10 | 1 | -4/+4 |
| | |||||
* | Fix unit tests | Harald Musum | 2019-09-10 | 1 | -6/+10 |
| | |||||
* | Use server id in config for singlenode zookeeper setups | Harald Musum | 2019-09-06 | 1 | -13/+14 |
| | Update tests accordingly and start numbering at 0, as will be done by the code that creates zookeeper-server config | ||||
* | Actually test the waiter in its test class >_< | Jon Marius Venstad | 2019-04-12 | 1 | -0/+1 |
| | |||||
* | Revert "Revert "Revert "Jvenstad/fix config model inconsitency""" | Jon Marius Venstad | 2019-03-01 | 1 | -1/+0 |
| | |||||
* | Revert "Revert "Jvenstad/fix config model inconsitency"" | Jon Marius Venstad | 2019-03-01 | 1 | -0/+1 |
| | |||||
* | Revert "Jvenstad/fix config model inconsitency" | Harald Musum | 2019-03-01 | 1 | -1/+0 |
| | |||||
* | Actually test the waiter in its test class >_< | Jon Marius Venstad | 2019-03-01 | 1 | -0/+1 |
| | |||||
* | Replace CuratorLock with Lock | Jon Marius Venstad | 2019-02-26 | 1 | -51/+0 |
| | |||||
* | Revert "Revert "No need for restricting access to zookeeper in hosted vespa"" | Harald Musum | 2018-10-24 | 1 | -2/+1 |
| | |||||
* | Revert "No need for restricting access to zookeeper in hosted vespa" | Harald Musum | 2018-10-24 | 1 | -1/+2 |
| | |||||
* | Merge pull request #7423 from vespa-engine/hmusum/remove-check-for-allowed-zk-clients-in-hosted | Jon Bratseth | 2018-10-24 | 1 | -2/+1 |
|\ | No need for restricting access to zookeeper in hosted vespa | ||||
| * | No need for restricting access to zookeeper in hosted vespa | Harald Musum | 2018-10-23 | 1 | -2/+1 |
| | | | | | | | | Access restrictions handled by other means | ||||
* | | Whitelist ZooKeeper four letter commands | Harald Musum | 2018-10-24 | 1 | -1/+3 |
|/ | |||||
* | Use full name | Harald Musum | 2018-10-03 | 4 | -4/+4 |
| |
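
The check described in the "Deadlock detection" commit (2020-10-11) can be illustrated with a minimal Java sketch. It is not the code from that commit: the class `DeadlockSketch`, its methods, and the use of plain strings for threads and lock paths are all hypothetical, chosen only to show the idea of following the chain "path held by a thread, which waits for another path" until it either ends or leads back to the current thread.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch only; names and structure are assumptions, not Vespa's classes. */
class DeadlockSketch {
    // lock path -> name of the thread currently holding it
    private final Map<String, String> holderOfPath = new HashMap<>();
    // thread name -> lock path that thread is currently waiting to acquire
    private final Map<String, String> pathAwaitedBy = new HashMap<>();

    synchronized void held(String thread, String path) {
        holderOfPath.put(path, thread);
        pathAwaitedBy.remove(thread); // the thread is no longer waiting
    }

    synchronized void waiting(String thread, String path) {
        pathAwaitedBy.put(thread, path);
    }

    synchronized void released(String path) {
        holderOfPath.remove(path);
    }

    /** True if currentThread acquiring path would close a cycle of waiting threads. */
    synchronized boolean wouldDeadlock(String currentThread, String path) {
        String holder = holderOfPath.get(path);
        if (currentThread.equals(holder)) return false; // re-entry, not a deadlock
        Set<String> visited = new HashSet<>(); // guards against cycles not involving us
        while (holder != null && visited.add(holder)) {
            String awaited = pathAwaitedBy.get(holder);
            if (awaited == null) return false; // holder is not blocked, chain ends
            holder = holderOfPath.get(awaited);
            if (currentThread.equals(holder)) return true; // chain leads back to us
        }
        return false;
    }

    public static void main(String[] args) {
        DeadlockSketch sketch = new DeadlockSketch();
        sketch.held("T0", "/p2");    // current thread holds P2
        sketch.held("T1", "/p1");    // T1 holds P1 ...
        sketch.waiting("T1", "/p2"); // ... and waits for P2
        System.out.println(sketch.wouldDeadlock("T0", "/p1"));
    }
}
```

In this sketch the caller would run `wouldDeadlock` just before blocking on the lock and, on a positive result, fail fast (for instance by throwing) instead of waiting for the acquire timeout, which matches the motivation given in the commit message.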