path: root/zkfacade
Commit message | Author | Age | Files | Lines
* Revert "Revert "Simplify symlink"" | Håkon Hallingstad | 2020-12-08 | 1 | -1/+1
|
* Revert "Simplify symlink" | Harald Musum | 2020-12-07 | 1 | -1/+1
|
* Simplify symlink | Håkon Hallingstad | 2020-12-02 | 1 | -1/+1
|   The symlink points to a file in the same directory. Therefore, instead of pointing to the absolute path, it should point to the filename. This also makes the symlink work on the host, if the symlink was made in a container (and vice versa).
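The reasoning in the commit message above can be illustrated with a small sketch. This is not the zkfacade change itself: the file and directory names are made up, and renaming a directory stands in for the same directory being visible under different paths on the host and in a container. A link that stores only the filename keeps resolving, while a link that stores the absolute path does not.

```python
import os
import shutil
import tempfile


def make_links(directory):
    """Create a data file plus a relative and an absolute symlink to it."""
    target = os.path.join(directory, "data.txt")  # hypothetical file name
    with open(target, "w") as f:
        f.write("payload")
    # Relative: the link stores only the filename, resolved against its own directory.
    os.symlink("data.txt", os.path.join(directory, "relative-link"))
    # Absolute: the link stores the full path as it looked at creation time.
    os.symlink(target, os.path.join(directory, "absolute-link"))


def demo():
    """Return (relative_ok, absolute_ok) after the directory changes path."""
    root = tempfile.mkdtemp()
    try:
        old_dir = os.path.join(root, "container-view")  # hypothetical names
        new_dir = os.path.join(root, "host-view")
        os.mkdir(old_dir)
        make_links(old_dir)
        # Simulate the same directory being reached through another path.
        os.rename(old_dir, new_dir)
        # os.path.exists follows symlinks, so a dangling link yields False.
        relative_ok = os.path.exists(os.path.join(new_dir, "relative-link"))
        absolute_ok = os.path.exists(os.path.join(new_dir, "absolute-link"))
        return relative_ok, absolute_ok
    finally:
        shutil.rmtree(root)
```

Calling `demo()` on a POSIX file system shows the relative link surviving the path change and the absolute link dangling.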
* Log (at level FINE) participants in barrier and how many have responded | Harald Musum | 2020-11-26 | 1 | -0/+2
|
* Add VespaCurator interface | Martin Polden | 2020-11-23 | 5 | -3/+48
|
* Merge pull request #15402 from vespa-engine/mpolden/use-curator-config | Harald Musum | 2020-11-20 | 7 | -1261/+1419
|\
| |   Use CuratorConfig in Curator
| * Refactor mock to simplify Curator constructors | Martin Polden | 2020-11-20 | 3 | -1163/+1195
| |
| * Use CuratorConfig | Martin Polden | 2020-11-20 | 3 | -20/+30
| |
| * Extract ConnectionSpec | Martin Polden | 2020-11-20 | 5 | -91/+207
| |
* | Create path first if necessary in set() | Harald Musum | 2020-11-20 | 1 | -4/+5
|/
|   Checking for existence and creating and setting data might fail if node was created after check. Use internal method to create, which handles node being created after checking for existence.
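The race described in the commit message above can be sketched as follows. This is only an illustration under stated assumptions: `FakeStore`, `RacyStore`, `naive_set` and `set_value` are hypothetical names, and the in-memory store merely imitates a ZooKeeper-like API in which creating an existing node raises an error. It is not the Curator or zkfacade code.

```python
class NodeExistsError(Exception):
    """Raised when creating a node that already exists."""


class FakeStore:
    """In-memory stand-in for a ZooKeeper-like store (hypothetical)."""

    def __init__(self):
        self.nodes = {}

    def exists(self, path):
        return path in self.nodes

    def create(self, path):
        if path in self.nodes:
            raise NodeExistsError(path)
        self.nodes[path] = b""

    def set_data(self, path, data):
        if path not in self.nodes:
            raise KeyError(path)
        self.nodes[path] = data


class RacyStore(FakeStore):
    """Simulates another client creating the node right after our check."""

    def exists(self, path):
        found = super().exists(path)
        self.nodes.setdefault(path, b"")  # the concurrent create
        return found


def naive_set(store, path, data):
    # Check-then-create: fails if the node appears between the two calls.
    if not store.exists(path):
        store.create(path)
    store.set_data(path, data)


def create_if_absent(store, path):
    # Tolerates the node being created after the existence check.
    if store.exists(path):
        return
    try:
        store.create(path)
    except NodeExistsError:
        pass  # someone else created it first, which is fine


def set_value(store, path, data):
    create_if_absent(store, path)
    store.set_data(path, data)
```

With `RacyStore`, `naive_set` raises `NodeExistsError`, while `set_value` ends with the data stored.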
* Improve logging when removing an application | Harald Musum | 2020-11-15 | 1 | -2/+3
|   Log only when it is removed, add some more validation
* return earlier if possible | Håkon Hallingstad | 2020-11-11 | 1 | -5/+6
|
* Fix thread lock detection bug | Håkon Hallingstad | 2020-11-11 | 1 | -2/+2
|   The effect of the bug was that a deadlock would be reported as long as the current thread T0 that tries to acquire the ZK path P0 is in the following situation:
|
|   1. Thread T0 tries to acquire ZK path P0, held by T1.
|   2. Thread T1 tries to acquire ZK path P1, held by T2.
|
|   Instead, T2 would need to equal T0. Or,
|
|   3. Thread T2 tries to acquire ZK path P2, held by T3 = one of (T0, T1).
|
|   etc.
* Also test cumulative deadlock counters | Håkon Hallingstad | 2020-10-20 | 1 | -0/+4
|
* Replace deadlock avoidance with metrics | Håkon Hallingstad | 2020-10-19 | 4 | -29/+91
|
* Deadlock detection | Håkon Hallingstad | 2020-10-11 | 4 | -17/+199
|   Just before Lock.acquire() is invoked, the locks within the process is queried to see if a "deadlock" will occur: The current thread waiting to acquire lock path P1, which is held by thread T1 waiting on acquiring a lock at path P2, etc, until a thread is waiting for a lock held by the current thread.
|
|   Even without this PR the deadlock would resolve itself automatically because all locks are acquired with timeouts. However, this PR
|
|   1. resolves the deadlock immediately, and
|   2. leaves a log trace (hopefully from the exception) to allow us to refactor code to avoid such deadlocks.
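The check described in the commit message above amounts to walking a chain of "waits for" relations until it either ends or returns to the current thread. A minimal sketch, assuming two hypothetical maps (`holder`: lock path to the thread holding it, `waiting`: thread to the lock path it is waiting for); the actual data structures in zkfacade are not shown on this page and may differ:

```python
def would_deadlock(current, wanted_path, holder, waiting):
    """Return True if `current` acquiring `wanted_path` would close a cycle.

    holder:  dict mapping lock path -> thread currently holding it
    waiting: dict mapping thread -> lock path it is waiting to acquire
    """
    seen = set()
    thread = holder.get(wanted_path)
    while thread is not None:
        if thread == current:
            return True  # the chain leads back to the current thread
        if thread in seen:
            return False  # a loop not involving the current thread; stop walking
        seen.add(thread)
        next_path = waiting.get(thread)
        if next_path is None:
            return False  # this holder is not waiting on anything
        thread = holder.get(next_path)
    return False  # the wanted lock, or one further down the chain, is free
```

For example, with `holder = {"P1": "T1", "P2": "T0"}` and `waiting = {"T1": "P2"}`, thread T0 asking for P1 closes the cycle T0 -> T1 -> T0. Stopping on a loop that does not include the current thread is a choice made for this sketch only.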
* Avoid metrics on reentry of lock | Håkon Hallingstad | 2020-10-08 | 5 | -61/+114
|
* Avoid even small double-counting of locked time | Håkon Hallingstad | 2020-10-07 | 4 | -23/+39
|
* Make richer latency stats | Håkon Hallingstad | 2020-10-05 | 11 | -384/+404
|   Makes a LatencyStats which provides some useful metrics, best explained there and in LatencyMetrics. This includes latency metrics, the "QPS" (e.g. the number of acquire() per second), and load metrics. Unfortunately I had to move from atomics to synchronized to accomplish this, but I see no other way.
* Move lock metrics to MetricsReporter | Håkon Hallingstad | 2020-10-03 | 19 | -219/+629
|   Adds two new metrics:
|
|   - The load of acquiring each lock path: The average number of threads waiting to acquire the lock within the last minute (or unit of time). Aka the lock queue (depth).
|   - The load of the lock for each lock path: The average number of threads holding the lock within the last minute (or unit of time). This is always <= 1. Aka the lock utilization.
|
|   Changes the LockCounters to LockMetrics, and exporting those once every minute through MetricReporter which is designed for this.
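The two load metrics described above are both time-weighted averages of a thread count over a reporting period. A minimal sketch of that computation, assuming explicit timestamps in seconds; `LoadTracker` and its method names are hypothetical and this is not the LockMetrics implementation:

```python
class LoadTracker:
    """Time-weighted average of a count (e.g. threads waiting for a lock)."""

    def __init__(self, start_time):
        self._count = 0
        self._last_change = start_time
        self._period_start = start_time
        self._weighted_sum = 0.0  # integral of count over time in this period

    def _advance(self, now):
        # Account for the time spent at the current count since the last event.
        self._weighted_sum += self._count * (now - self._last_change)
        self._last_change = now

    def enter(self, now):
        """A thread starts waiting for (or holding) the lock."""
        self._advance(now)
        self._count += 1

    def leave(self, now):
        """A thread stops waiting for (or holding) the lock."""
        self._advance(now)
        self._count -= 1

    def average_and_reset(self, now):
        """Return the average count since the last report and start a new period."""
        self._advance(now)
        elapsed = now - self._period_start
        average = self._weighted_sum / elapsed if elapsed > 0 else 0.0
        self._weighted_sum = 0.0
        self._period_start = now
        return average
```

A single thread that holds the lock for 30 of 60 seconds gives a utilization of 0.5. Used for the waiting count instead, two threads queued for the whole minute give a queue depth of 2.0.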
* Merge pull request #14657 from vespa-engine/hakonhall/add-metrics-to-lock-attempts | Håkon Hallingstad | 2020-10-01 | 9 | -40/+112
|\
| |   Add metrics to lock attempts
| * Add metrics to lock attempts | Håkon Hallingstad | 2020-10-01 | 9 | -40/+112
| |
* | Merge pull request #14655 from vespa-engine/mpolden/fix-agent | Valerij Fredriksen | 2020-10-01 | 1 | -1/+1
|\ \
| |/
|/|   Store correct agent when adding nodes
| * Store correct agent when adding nodes | Martin Polden | 2020-10-01 | 1 | -1/+1
| |
* | Record locks taken for external deploys | Håkon Hallingstad | 2020-09-30 | 7 | -66/+202
|/
|   - Information about a lock attempt now includes a list of lock attempts done while holding the lock, forming a tree (forest) structure.
|   - Records the duration and locking attempts done as part of an external deploy, forming a tree of locks with timing info. The currently active external deploys are shown in an "ongoing-recording" field of /nodes/v2/locks.
|   - The 3 longest external deploys are kept in "recordings" in /nodes/v2/locks.
|   - Extracts the global process-wide parts of ThreadLockStats into separate class for clarity.
* Add count of failed releases | Håkon Hallingstad | 2020-09-28 | 4 | -10/+18
|
* More info -> attempt renames | Håkon Hallingstad | 2020-09-28 | 4 | -21/+21
|
* LockInfo -> LockAttempt, ThreadLockInfo -> ThreadLockStats, and more | Håkon Hallingstad | 2020-09-28 | 6 | -103/+111
|
* Use deque as stack | Håkon Hallingstad | 2020-09-28 | 3 | -14/+32
|
* Mock lock path from thread to per-lock (bug) | Håkon Hallingstad | 2020-09-26 | 6 | -80/+223
|
* Dump stack trace once per thread | Håkon Hallingstad | 2020-09-26 | 2 | -27/+29
|
* Adds method name to stack trace and adds timeout count and test | Håkon Hallingstad | 2020-09-25 | 7 | -4/+169
|
* Remove reentrant lock no longer needed | Håkon Hallingstad | 2020-09-25 | 4 | -50/+30
|
* Add duration of acquire, in locked, and total | Håkon Hallingstad | 2020-09-25 | 2 | -3/+18
|
* Avoid double iteration | Håkon Hallingstad | 2020-09-25 | 1 | -9/+8
|
* Make stacktraces for active locks during request handling | Håkon Hallingstad | 2020-09-24 | 2 | -26/+30
|
* Also show the longest-living historical locks, with stack trace | Håkon Hallingstad | 2020-09-24 | 2 | -11/+59
|
* Count events per zk path and move to separate package | Håkon Hallingstad | 2020-09-24 | 6 | -97/+72
|
* Expose locks info in REST API | Håkon Hallingstad | 2020-09-24 | 4 | -2/+263
|
* Avoid unnecesary logging: Reduce log level or remove | Harald Musum | 2020-08-31 | 1 | -1/+1
|
* Revert "Upgrade to Curator 4" | Harald Musum | 2020-08-17 | 14 | -460/+56
|
* Upgrade to 4.3.0 | Harald Musum | 2020-08-16 | 11 | -11/+11
|
* Cleanup exclusions and remove stray file | Harald Musum | 2020-08-16 | 3 | -133/+0
|
* Upgrade to Curator 4 | Harald Musum | 2020-08-16 | 16 | -56/+593
|
* Stick to junit for simple test. | Henning Baldersheim | 2020-08-11 | 1 | -6/+5
|
* Actually don't create parents | Jon Marius Venstad | 2020-08-11 | 1 | -1/+1
|
* Avoid creating session path when creating waiters | Jon Marius Venstad | 2020-08-10 | 1 | -1/+1
|
* Revert "Revert "Reapply "Upgrade to Curator 2.13.0""" | Harald Musum | 2020-08-03 | 12 | -49/+87
|
* Revert "Reapply "Upgrade to Curator 2.13.0"" | Harald Musum | 2020-07-30 | 12 | -87/+49
|
* Revert "Revert "Upgrade to Curator 2.13.0"" | Harald Musum | 2020-07-30 | 12 | -49/+87
|