Commit message | Author | Age | Files | Lines | ||
---|---|---|---|---|---|---|
... | ||||||
* | Move lock metrics to MetricsReporter | Håkon Hallingstad | 2020-10-03 | 12 | -186/+423 | |
| | Adds two new metrics: - The load of acquiring each lock path: the average number of threads waiting to acquire the lock within the last minute (or unit of time), aka the lock queue (depth). - The load of the lock for each lock path: the average number of threads holding the lock within the last minute (or unit of time), aka the lock utilization. This is always <= 1. Changes LockCounters to LockMetrics and exports those once every minute through MetricReporter, which is designed for this. | |||||
* | Merge pull request #14657 from vespa-engine/hakonhall/add-metrics-to-lock-attempts | Håkon Hallingstad | 2020-10-01 | 5 | -32/+102 | |
|\ | Add metrics to lock attempts | |||||
| * | Add metrics to lock attempts | Håkon Hallingstad | 2020-10-01 | 5 | -32/+102 | |
| | | ||||||
* | | Merge pull request #14655 from vespa-engine/mpolden/fix-agent | Valerij Fredriksen | 2020-10-01 | 1 | -1/+1 | |
|\ \ | |/ |/| | Store correct agent when adding nodes | |||||
| * | Store correct agent when adding nodes | Martin Polden | 2020-10-01 | 1 | -1/+1 | |
| | | ||||||
* | | Record locks taken for external deploys | Håkon Hallingstad | 2020-09-30 | 6 | -55/+191 | |
|/ | - Information about a lock attempt now includes a list of lock attempts done while holding the lock, forming a tree (forest) structure. - Records the duration and locking attempts done as part of an external deploy, forming a tree of locks with timing info. The currently active external deploys are shown in an "ongoing-recording" field of /nodes/v2/locks. - The 3 longest external deploys are kept in "recordings" in /nodes/v2/locks. - Extracts the global process-wide parts of ThreadLockStats into separate class for clarity. | |||||
* | Add count of failed releases | Håkon Hallingstad | 2020-09-28 | 4 | -10/+18 | |
| | ||||||
* | More info -> attempt renames | Håkon Hallingstad | 2020-09-28 | 2 | -10/+10 | |
| | ||||||
* | LockInfo -> LockAttempt, ThreadLockInfo -> ThreadLockStats, and more | Håkon Hallingstad | 2020-09-28 | 4 | -67/+67 | |
| | ||||||
* | Use deque as stack | Håkon Hallingstad | 2020-09-28 | 2 | -14/+9 | |
| | ||||||
* | Mock lock path from thread to per-lock (bug) | Håkon Hallingstad | 2020-09-26 | 4 | -79/+164 | |
| | ||||||
* | Dump stack trace once per thread | Håkon Hallingstad | 2020-09-26 | 2 | -27/+29 | |
| | ||||||
* | Adds method name to stack trace and adds timeout count and test | Håkon Hallingstad | 2020-09-25 | 4 | -4/+50 | |
| | ||||||
* | Remove reentrant lock no longer needed | Håkon Hallingstad | 2020-09-25 | 4 | -50/+30 | |
| | ||||||
* | Add duration of acquire, in locked, and total | Håkon Hallingstad | 2020-09-25 | 2 | -3/+18 | |
| | ||||||
* | Avoid double iteration | Håkon Hallingstad | 2020-09-25 | 1 | -9/+8 | |
| | ||||||
* | Make stacktraces for active locks during request handling | Håkon Hallingstad | 2020-09-24 | 2 | -26/+30 | |
| | ||||||
* | Also show the longest-living historical locks, with stack trace | Håkon Hallingstad | 2020-09-24 | 2 | -11/+59 | |
| | ||||||
* | Count events per zk path and move to separate package | Håkon Hallingstad | 2020-09-24 | 5 | -37/+72 | |
| | ||||||
* | Expose locks info in REST API | Håkon Hallingstad | 2020-09-24 | 3 | -2/+203 | |
| | ||||||
* | Avoid unnecessary logging: Reduce log level or remove | Harald Musum | 2020-08-31 | 1 | -1/+1 | |
| | ||||||
* | Revert "Upgrade to Curator 4" | Harald Musum | 2020-08-17 | 2 | -448/+45 | |
| | ||||||
* | Upgrade to Curator 4 | Harald Musum | 2020-08-16 | 2 | -45/+448 | |
| | ||||||
* | Actually don't create parents | Jon Marius Venstad | 2020-08-11 | 1 | -1/+1 | |
| | ||||||
* | Avoid creating session path when creating waiters | Jon Marius Venstad | 2020-08-10 | 1 | -1/+1 | |
| | ||||||
* | Revert "Revert "Reapply "Upgrade to Curator 2.13.0""" | Harald Musum | 2020-08-03 | 1 | -38/+76 | |
| | ||||||
* | Revert "Reapply "Upgrade to Curator 2.13.0"" | Harald Musum | 2020-07-30 | 1 | -76/+38 | |
| | ||||||
* | Revert "Revert "Upgrade to Curator 2.13.0"" | Harald Musum | 2020-07-30 | 1 | -38/+76 | |
| | ||||||
* | Revert "Upgrade to Curator 2.13.0" | Harald Musum | 2020-07-30 | 1 | -76/+38 | |
| | ||||||
* | Remove stray file | Harald Musum | 2020-07-29 | 1 | -1199/+0 | |
| | ||||||
* | Upgrade to Curator 2.13.0 | Harald Musum | 2020-07-29 | 2 | -38/+1275 | |
| | ||||||
* | Do not wait longer for more participants in barrier | Harald Musum | 2020-05-25 | 1 | -11/+2 | |
| | ||||||
* | Wait longer for servers to reach barrier | Harald Musum | 2020-04-29 | 3 | -20/+49 | |
| | 1. Wait up to 2 seconds for all servers to reach the barrier. 2. If not, wait up to 4 seconds for the server that waits for the barrier to be one of the respondents and for a majority of servers to have reached the barrier. 3. If not, wait for a majority of servers to have reached the barrier. | |||||
* | LogLevel.DEBUG -> Level.FINE | gjoranv | 2020-04-25 | 1 | -2/+2 | |
| | ||||||
* | Import java.util.logging.Level instead of com.yahoo.log.LogLevel | gjoranv | 2020-04-25 | 1 | -1/+1 | |
| | ||||||
* | Let Curator own re-entrant locks | Martin Polden | 2020-04-14 | 2 | -1/+17 | |
| | ||||||
* | Use Duration for timeouts | Martin Polden | 2020-04-14 | 1 | -18/+14 | |
| | ||||||
* | Merge pull request #11815 from vespa-engine/jvenstad/wrap-curator-mutex-with-reentarnt-lock | Jon Marius Venstad | 2020-03-20 | 1 | -6/+22 | |
|\ | Hold a JVM-wide reentrant lock to grab mutex — helps ZK stale reads? | |||||
| * | Be less stupoid | Jon Marius Venstad | 2020-01-16 | 1 | -1/+4 | |
| | | ||||||
| * | Differentiate between failing to acquire the two locks | Jon Marius Venstad | 2020-01-16 | 1 | -10/+7 | |
| | | ||||||
| * | Swap order of locks, to avoid doubling timeout duration | Jon Marius Venstad | 2020-01-16 | 1 | -6/+14 | |
| | | ||||||
| * | Hold a JVM-wide reentrant lock to grab mutex — helps ZK stale reads? | Jon Marius Venstad | 2020-01-16 | 1 | -1/+9 | |
| | | ||||||
* | | Update Curator.java | Jon Marius Venstad | 2020-01-20 | 1 | -3/+2 | |
| | | ||||||
* | | Log curator state changes SUSPENDED, RECONNECTED and LOST | Jon Marius Venstad | 2020-01-20 | 1 | -1/+8 | |
|/ | ||||||
* | Create zookeeper client config file only when necessary | Harald Musum | 2020-01-09 | 1 | -16/+18 | |
| | ||||||
* | Revert "Reapply "Upgrade to Curator 2.13.0"" | Harald Musum | 2020-01-09 | 1 | -76/+38 | |
| | ||||||
* | Revert "Revert "Upgrade to Curator 2.13.0"" | Harald Musum | 2020-01-07 | 1 | -38/+76 | |
| | ||||||
* | Cache historic runs (with mock support for ZK node versions) | Jon Marius Venstad | 2019-12-20 | 3 | -5/+18 | |
| | ||||||
* | Revert "Upgrade to Curator 2.13.0" | Harald Musum | 2019-12-20 | 1 | -76/+38 | |
| | ||||||
* | Upgrade to Curator 2.13.0 | Harald Musum | 2019-12-19 | 1 | -38/+76 | |
| |
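
The 2020-10-03 commit message describes two per-lock-path metrics as the average number of threads waiting for, or holding, a lock over a reporting interval. One plausible way to compute such an average is to weight each thread count by how long it lasted. The sketch below illustrates that idea only; the class `LockLoadSketch` and its methods are hypothetical names and are not taken from the Vespa source. Time is passed in explicitly as abstract ticks, and the class is not thread-safe.

```java
/** Hypothetical helper, not from the Vespa source: time-weighted average of a thread count. */
class LockLoadSketch {
    private long count = 0;        // threads currently waiting for (or holding) the lock
    private long weightedSum = 0;  // sum of count * ticks accumulated in the current interval
    private long intervalStart;    // tick at which the current reporting interval began
    private long lastChange;       // tick up to which weightedSum has been accumulated

    LockLoadSketch(long now) {
        intervalStart = now;
        lastChange = now;
    }

    /** A thread starts waiting for (or holding) the lock. */
    void enter(long now) {
        accumulate(now);
        count++;
    }

    /** A thread stops waiting for (or holding) the lock. */
    void leave(long now) {
        accumulate(now);
        count--;
    }

    /** Returns the average count over the interval ending at now, then starts a new interval. */
    double averageAndReset(long now) {
        accumulate(now);
        long elapsed = now - intervalStart;
        double average = elapsed == 0 ? count : (double) weightedSum / elapsed;
        weightedSum = 0;
        intervalStart = now;
        return average;
    }

    /** Adds the contribution of the current count since the last change. */
    private void accumulate(long now) {
        weightedSum += count * (now - lastChange);
        lastChange = now;
    }
}
```

With one instance tracking waiters and another tracking holders, the first would yield the queue depth and the second the utilization. For example, a lock held from tick 10 to tick 40 of a 60-tick interval gives a utilization of 0.5, and since at most one thread holds the lock at a time, that value cannot exceed 1.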