Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Upgrade Curator framework to 5.1.0 | Bjørn Christian Seime | 2021-03-08 | 12 | -12/+16 |
| | |||||
* | Simplify, testing server is not used | Harald Musum | 2021-03-03 | 1 | -59/+7 |
| | |||||
* | Add TODO for Provider<Curator> && gp | Jon Marius Venstad | 2021-02-14 | 1 | -0/+1 |
| | |||||
* | Avoid returning null where possible | Harald Musum | 2021-01-05 | 1 | -7/+6 |
| | |||||
* | Revert "Revert "Reapply "Upgrade to Curator 4""" | Harald Musum | 2021-01-05 | 13 | -69/+267 |
| | |||||
* | Revert "Reapply "Upgrade to Curator 4"" | Harald Musum | 2021-01-05 | 13 | -267/+69 |
| | |||||
* | Fix arguments used when constructing MockCuratorFramework | Harald Musum | 2021-01-02 | 1 | -5/+3 |
| | |||||
* | Fix newWatcherRemoveCuratorFramework() | Harald Musum | 2021-01-01 | 1 | -2/+15 |
| | |||||
* | Merge branch 'master' into revert-14062-revert-14057-hmusum/upgrade-to-curator-4 | Harald Musum | 2021-01-01 | 1 | -1/+1 |
|\ | |||||
| * | Revert "Revert "Simplify symlink"" | Håkon Hallingstad | 2020-12-08 | 1 | -1/+1 |
| | | |||||
| * | Revert "Simplify symlink" | Harald Musum | 2020-12-07 | 1 | -1/+1 |
| | | |||||
| * | Simplify symlink | Håkon Hallingstad | 2020-12-02 | 1 | -1/+1 |
| | | | | | | | | | | | | | | The symlink points to a file in the same directory. Therefore, instead of pointing to the absolute path, it should point to the filename. This also makes the symlink work on the host, if the symlink was made in a container (and vice versa). | ||||
* | | Merge branch 'master' into revert-14062-revert-14057-hmusum/upgrade-to-curator-4 | Harald Musum | 2020-11-26 | 10 | -1670/+1644 |
|\| | |||||
| * | Log (at level FINE) participants in barrier and how many have responded | Harald Musum | 2020-11-26 | 1 | -0/+2 |
| | | |||||
| * | Add VespaCurator interface | Martin Polden | 2020-11-23 | 4 | -2/+32 |
| | | |||||
| * | Merge pull request #15402 from vespa-engine/mpolden/use-curator-config | Harald Musum | 2020-11-20 | 6 | -1261/+1418 |
| |\ | | | | | | | Use CuratorConfig in Curator | ||||
| | * | Refactor mock to simplify Curator constructors | Martin Polden | 2020-11-20 | 3 | -1163/+1195 |
| | | | |||||
| | * | Use CuratorConfig | Martin Polden | 2020-11-20 | 2 | -20/+29 |
| | | | |||||
| | * | Extract ConnectionSpec | Martin Polden | 2020-11-20 | 5 | -91/+207 |
| | | | |||||
| * | | Create path first if necessary in set() | Harald Musum | 2020-11-20 | 1 | -4/+5 |
| |/ | | | | | | | | | | Checking for existence, then creating the node and setting data, might fail if the node was created after the check. Use the internal create method, which handles the node being created after the existence check. | ||||
* / | Revert "Revert "Upgrade to Curator 4"" | Harald Musum | 2020-11-20 | 13 | -56/+459 |
|/ | |||||
* | Improve logging when removing an application | Harald Musum | 2020-11-15 | 1 | -2/+3 |
| | | | | Log only when it is removed, add some more validation | ||||
* | return earlier if possible | Håkon Hallingstad | 2020-11-11 | 1 | -5/+6 |
| | |||||
* | Fix thread lock detection bug | Håkon Hallingstad | 2020-11-11 | 1 | -2/+2 |
| | | | | | | | | | | | | The effect of the bug was that a deadlock would be reported as long as the current thread T0 that tries to acquire the ZK path P0 is in the following situation: 1. Thread T0 tries to acquire ZK path P0, held by T1. 2. Thread T1 tries to acquire ZK path P1, held by T2. Instead, T2 would need to equal T0. Or, 3. Thread T2 tries to acquire ZK path P2, held by T3 = one of (T0, T1). etc. | ||||
* | Also test cumulative deadlock counters | Håkon Hallingstad | 2020-10-20 | 1 | -0/+4 |
| | |||||
* | Replace deadlock avoidance with metrics | Håkon Hallingstad | 2020-10-19 | 4 | -29/+91 |
| | |||||
* | Deadlock detection | Håkon Hallingstad | 2020-10-11 | 4 | -17/+199 |
| | | | | | | | | | | | | Just before Lock.acquire() is invoked, the locks within the process are queried to see if a "deadlock" will occur: the current thread is waiting to acquire lock path P1, which is held by thread T1, which is waiting to acquire a lock at path P2, etc., until a thread is waiting for a lock held by the current thread. Even without this PR the deadlock would resolve itself automatically because all locks are acquired with timeouts. However, this PR 1. resolves the deadlock immediately, and 2. leaves a log trace (hopefully from the exception) to allow us to refactor code to avoid such deadlocks. | ||||
* | Avoid metrics on reentry of lock | Håkon Hallingstad | 2020-10-08 | 5 | -61/+114 |
| | |||||
* | Avoid even small double-counting of locked time | Håkon Hallingstad | 2020-10-07 | 4 | -23/+39 |
| | |||||
* | Make richer latency stats | Håkon Hallingstad | 2020-10-05 | 11 | -384/+404 |
| | | | | | | | | | Makes a LatencyStats which provides some useful metrics, best explained there and in LatencyMetrics. This includes latency metrics, the "QPS" (e.g. the number of acquire() per second), and load metrics. Unfortunately I had to move from atomics to synchronized to accomplish this, but I see no other way. | ||||
* | Move lock metrics to MetricsReporter | Håkon Hallingstad | 2020-10-03 | 17 | -215/+619 |
| | | | | | | | | | | | | | | | Adds two new metrics: - The load of acquiring each lock path: The average number of threads waiting to acquire the lock within the last minute (or unit of time). Aka the lock queue (depth). - The load of the lock for each lock path: The average number of threads holding the lock within the last minute (or unit of time). This is always <= 1. Aka the lock utilization. Changes LockCounters to LockMetrics, and exports those once every minute through MetricReporter, which is designed for this. | ||||
* | Merge pull request #14657 from vespa-engine/hakonhall/add-metrics-to-lock-attempts | Håkon Hallingstad | 2020-10-01 | 8 | -36/+108 |
|\ | | | | | | | | | Add metrics to lock attempts | ||||
| * | Add metrics to lock attempts | Håkon Hallingstad | 2020-10-01 | 8 | -36/+108 |
| | | |||||
* | | Merge pull request #14655 from vespa-engine/mpolden/fix-agent | Valerij Fredriksen | 2020-10-01 | 1 | -1/+1 |
|\ \ | |/ |/| | Store correct agent when adding nodes | ||||
| * | Store correct agent when adding nodes | Martin Polden | 2020-10-01 | 1 | -1/+1 |
| | | |||||
* | | Record locks taken for external deploys | Håkon Hallingstad | 2020-09-30 | 7 | -66/+202 |
|/ | | | | | | | | | | | | - Information about a lock attempt now includes a list of lock attempts done while holding the lock, forming a tree (forest) structure. - Records the duration and locking attempts done as part of an external deploy, forming a tree of locks with timing info. The currently active external deploys are shown in an "ongoing-recording" field of /nodes/v2/locks. - The 3 longest external deploys are kept in "recordings" in /nodes/v2/locks. - Extracts the global process-wide parts of ThreadLockStats into separate class for clarity. | ||||
* | Add count of failed releases | Håkon Hallingstad | 2020-09-28 | 4 | -10/+18 |
| | |||||
* | More info -> attempt renames | Håkon Hallingstad | 2020-09-28 | 4 | -21/+21 |
| | |||||
* | LockInfo -> LockAttempt, ThreadLockInfo -> ThreadLockStats, and more | Håkon Hallingstad | 2020-09-28 | 6 | -103/+111 |
| | |||||
* | Use deque as stack | Håkon Hallingstad | 2020-09-28 | 3 | -14/+32 |
| | |||||
* | Mock lock path from thread to per-lock (bug) | Håkon Hallingstad | 2020-09-26 | 6 | -80/+223 |
| | |||||
* | Dump stack trace once per thread | Håkon Hallingstad | 2020-09-26 | 2 | -27/+29 |
| | |||||
* | Adds method name to stack trace and adds timeout count and test | Håkon Hallingstad | 2020-09-25 | 5 | -4/+163 |
| | |||||
* | Remove reentrant lock no longer needed | Håkon Hallingstad | 2020-09-25 | 4 | -50/+30 |
| | |||||
* | Add duration of acquire, in locked, and total | Håkon Hallingstad | 2020-09-25 | 2 | -3/+18 |
| | |||||
* | Avoid double iteration | Håkon Hallingstad | 2020-09-25 | 1 | -9/+8 |
| | |||||
* | Make stacktraces for active locks during request handling | Håkon Hallingstad | 2020-09-24 | 2 | -26/+30 |
| | |||||
* | Also show the longest-living historical locks, with stack trace | Håkon Hallingstad | 2020-09-24 | 2 | -11/+59 |
| | |||||
* | Count events per zk path and move to separate package | Håkon Hallingstad | 2020-09-24 | 5 | -37/+72 |
| | |||||
* | Expose locks info in REST API | Håkon Hallingstad | 2020-09-24 | 3 | -2/+203 |
| |
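
The "Deadlock detection" and "Fix thread lock detection bug" commits above describe following the chain "thread waits for path, path is held by thread" before a lock is acquired. The check can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from those commits: the class `WaitForGraph` and all of its method names are hypothetical, threads and lock paths are plain strings, and re-entry (a thread asking for a path it already holds) is assumed not to count as a deadlock. Following one reading of the bug-fix message, a deadlock is reported only when the chain comes back to a thread already on it, which includes the current thread.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of in-process deadlock detection; not the Vespa implementation. */
class WaitForGraph {
    // Lock path -> name of the thread currently holding it.
    private final Map<String, String> holderOf = new HashMap<>();
    // Thread name -> lock path that thread is currently waiting to acquire.
    private final Map<String, String> waitingFor = new HashMap<>();

    synchronized void held(String path, String thread) { holderOf.put(path, thread); }
    synchronized void released(String path) { holderOf.remove(path); }
    synchronized void waiting(String thread, String path) { waitingFor.put(thread, path); }
    synchronized void doneWaiting(String thread) { waitingFor.remove(thread); }

    /** Returns true if 'thread' waiting for 'path' would leave a cycle in the wait chain. */
    synchronized boolean wouldDeadlock(String thread, String path) {
        // Assumption: asking for a path the thread already holds is re-entry, not a deadlock.
        if (thread.equals(holderOf.get(path))) return false;

        Set<String> onChain = new HashSet<>();
        onChain.add(thread);
        String currentPath = path;
        while (currentPath != null) {
            String holder = holderOf.get(currentPath);
            if (holder == null) return false;        // the lock is free, so the chain ends
            if (!onChain.add(holder)) return true;   // chain returned to a thread already on it
            currentPath = waitingFor.get(holder);    // null if the holder is not waiting
        }
        return false;                                // the last holder is not blocked
    }

    public static void main(String[] args) {
        WaitForGraph graph = new WaitForGraph();
        graph.held("P1", "T0");     // T0 holds P1
        graph.held("P0", "T1");     // T1 holds P0 ...
        graph.waiting("T1", "P1");  // ... and waits for P1, held by T0
        System.out.println(graph.wouldDeadlock("T0", "P0")); // prints "true"

        graph.doneWaiting("T1");    // T1 gave up waiting, so the chain no longer closes
        System.out.println(graph.wouldDeadlock("T0", "P0")); // prints "false"
    }
}
```

Because the commit log notes that all locks are acquired with timeouts, a check like this would only shorten the wait and leave a trace; it would not be needed for correctness.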