* Fix Powershell warnings [fix-ps] (Ronny Berndt, 2022-12-19, 1 file, -18/+18)
* Ensure we use the chttpd vs httpd section in fix_uri (Nick Vatamaniuc, 2022-12-17, 1 file, -1/+1)
  It's a deprecated feature but we want to make sure it still works.
* Bump version to 3.3.0 (Nick Vatamaniuc, 2022-12-16, 2 files, -3/+3)
* Bump Fauxton to v1.2.9 (Nick Vatamaniuc, 2022-12-16, 2 files, -1/+5)
* Update 3.3 release notes (Nick Vatamaniuc, 2022-12-16, 2 files, -1/+17)
  * Since the replicator optimization issues were a bit scattered, create a single parent one as a highlight, and also run somewhat more realistic benchmark tests.
  * Add an image macro to keep the tradition going.
  * Update docs section with a 5.3.0 PR which just merged.
  * Add a reference to the newly merged `couchjs -v` enhancement.
* Show version of spidermonkey runtime in couchjs (#4262) (Ronny Berndt, 2022-12-15, 3 files, -2/+28)
* Update Sphinx to 5.3.0 (Nick Vatamaniuc, 2022-12-15, 1 file, -2/+1)
* docs: add 3.3.0 release notes (Jan Lehnardt, 2022-12-15, 3 files, -2/+374)
* fix(3517): super-simplistic fix to avoid costly AST transforms when t… (#4292) (Jan Lehnardt, 2022-12-13, 8 files, -8/+92)
  * fix(3517): super-simplistic fix to avoid costly AST transforms when they are not needed
  Co-authored-by: Ronny Berndt <ronny@apache.org>
* Fix rendering of inline literal with whitespaces (#4301) (Ronny Berndt, 2022-12-12, 1 file, -1/+1)
  Fix wrong rendering of an inline literal containing whitespace. Use `:literal:` as a workaround, with escaped whitespace `\ ` and added non-breaking-space characters (U+00A0).
* Add note on how to configure replication backoff (#4299) (mikhailantoshkin, 2022-12-12, 1 file, -1/+5)
  The documentation mentioned the exponential backoff for the `Crashing` and `Error` states but did not explain how to configure it. This commit adds a note on the relation between `max_history` and the maximum backoff, with a link to the appropriate configuration section.
* Fix a few more flaky smoosh tests (Nick Vatamaniuc, 2022-12-08, 1 file, -46/+75)
  Give up trying to wait on compaction pids: even if those have finished, the compaction itself isn't done and swapping hasn't happened yet. Instead, just poll db and view sizes until they show a lowered file size. To avoid other channels possibly getting in the way, by default unpause only the "ratio_dbs" channel, and unpause others as needed depending on the test.
* Fix flaky checkpointing smoosh test (Nick Vatamaniuc, 2022-12-06, 1 file, -8/+10)
  This may not be a 100% fix. Add extra asserts that file operations have succeeded, wait for the compactor to exit, and match the exit reason exactly, to make sure we killed it and it behaves as expected.
* Ensure prevent_overlapping_partitions stays false in Erlang 25+ (Nick Vatamaniuc, 2022-12-05, 1 file, -0/+9)
  It's already false in 23 and 24 but will start to be enabled in 25+. We don't rely on global for process registration any more, and have our own auto-connection module. So we don't want to be caught by surprise in Erlang 25+, since there is some additional coordination and resource usage needed when this option is true. See https://github.com/erlang/otp/issues/6470 for an example.
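  For reference, this is the standard OTP kernel application parameter; a minimal sketch of pinning it in a release sys.config (the file and placement are an assumption for illustration, not how CouchDB itself applies it):
  ```
  %% sys.config sketch: keep the pre-OTP-25 behaviour for global's netsplit handling.
  %% CouchDB handles this in its own startup code; this only shows the flag itself.
  [
      {kernel, [{prevent_overlapping_partitions, false}]}
  ].
  ```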
* Fix unbound variable warning (#4289) (Ronny Berndt, 2022-12-05, 1 file, -4/+5)
  While the variable `RelativeFileName` is not really unbound in the case-statement, use a more up-to-date pattern for binding the value.
* Allow = in config key names (Nick Vatamaniuc, 2022-12-04, 4 files, -82/+185)
  They are allowed only for the "k = v" format and not the "k=v" format. The idea is to split on " = " first, and if that fails to produce a valid kv pair, we split on "=" as before.
  To implement it, simplify the parsing logic and remove the undocumented multi-line config value feature. The continuation syntax is not documented anywhere and not used by our default.ini or in the documentation.
  Fix: https://github.com/apache/couchdb/issues/3319
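  A minimal sketch of the split order described above (illustration only, with a made-up module and function name, not the actual couch config parser):
  ```
  -module(ini_split_sketch).
  -export([parse_kv/1]).

  %% Try the " = " form first so '=' may appear inside the key,
  %% then fall back to splitting on the first "=" as before.
  parse_kv(Line) ->
      case string:split(Line, " = ") of
          [K, V] ->
              {string:trim(K), string:trim(V)};
          [_] ->
              case string:split(Line, "=") of
                  [K, V] -> {string:trim(K), string:trim(V)};
                  [_] -> ignore
              end
      end.
  ```
  With this order, `parse_kv("my=key = value")` yields `{"my=key", "value"}`, while a plain `parse_kv("a=b")` still yields `{"a", "b"}` as before.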
* Increase ddoc_cache test timeouts (Nick Vatamaniuc, 2022-12-02, 1 file, -14/+16)
  To hopefully fix a flaky test I noticed:
  ```
  module 'ddoc_cache_lru_test'
  ...
  ddoc_cache_lru_test:97: with (check_cache_refill)...*failed*
  ...
  in function meck_proc:wait/6 (src/meck_proc.erl, line 171)
  in call from ddoc_cache_lru_test:check_cache_refill/1 (test/eunit/ddoc_cache_lru_test.erl, line 204)
  in call from eunit_test:run_testfun/1 (eunit_test.erl, line 71)
  **error:timeout
  ```
* Add debug logs to smoosh test (Nick Vatamaniuc, 2022-12-02, 1 file, -1/+10)
  Saw this fail twice in the last week or so but couldn't reproduce it locally. Instead of deleting the test, this is an attempt to emit debug lines to see how far it gets before it times out in the CI.
* Remove all usage of global (Nick Vatamaniuc, 2022-12-02, 6 files, -32/+238)
  Global has a tendency to create deadlocks [1], and when that happens replication jobs can't start on any of the nodes in the cluster.
  We don't really need strictly consistent locking for replication jobs. It's mostly to avoid replication jobs thrashing the same checkpoint docs back and forth between different session IDs. So, remove global to avoid any issues around it, and replace it with `pg` [2], the new (Erlang 23+) process group implementation. (Technically `global` is still running in the runtime system as it's started by the `kernel` app. We just avoid interacting with it and registering any names, to avoid deadlocks.)
  In `pg` we start a replication `scope`, and then in that scope make every RepId a process `group`. When replication processes spawn, their Pids become `members` of that group:
  ```
  couch_replicator_pg (scope):
      "a12c+create_target" (RepId group): [Pid1, Pid2] (members)
      ...
  ```
  As per the `pg` implementation, groups are created and removed automatically as members are added and removed, so we don't have to do anything there. If there are already any running Pids in the same group, we avoid starting the job and fail like we did before when we used global. In the even more rare case of a race condition, when 2+ jobs do manage to start, we do a membership check before each checkpoint. One of the jobs then stops to yield to another. For simplicity, pick the one running on the lowest lexicographically sorted node name to survive.
  [1] https://github.com/erlang/otp/issues/6524
  [2] https://www.erlang.org/doc/man/pg.html
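  A minimal sketch of the `pg` pattern described above, using the stock OTP `pg` API (the scope and group come from the commit message; the module and function names, and the surrounding logic, are illustrative, not CouchDB's actual code):
  ```
  -module(rep_pg_sketch).
  -export([maybe_start_job/1]).

  %% Assumes the scope was started elsewhere, e.g. pg:start_link(couch_replicator_pg).
  maybe_start_job(RepId) ->
      case pg:get_members(couch_replicator_pg, RepId) of
          [] ->
              %% No other replication process holds this RepId anywhere in the
              %% cluster: join the group and let the job start.
              ok = pg:join(couch_replicator_pg, RepId, self()),
              start;
          [_ | _] = Pids ->
              %% A job with the same RepId is already running; don't start another one.
              {ignore, Pids}
      end.
  ```
  As the commit notes, such a check is not atomic, which is why membership is re-checked before each checkpoint and the job on the lexicographically lowest node name is the one that survives.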
* Improve validation of replicator job parameters (Nick Vatamaniuc, 2022-11-30, 18 files, -861/+1046)
  There are two main improvements:
  * Replace the auto-inserted replicator VDU with a BDU. Replicator already had a BDU to update the `"owner"` field, so plug right into it and validate everything we need there. This way, the validation and parsing logic is all in one module. The previously inserted VDU design doc will be deleted.
  * Allow constraining endpoint protocol types and socket options. Previously, users could create replications with any low-level socket options. Some of those are dangerous and are possible "foot-guns". Restrict those options to a more usable set.
  In addition to those improvements, increase test coverage a bit by explicitly checking a few more parsing corner cases.
  Fixes #4273
* Format config files (Ronny Berndt, 2022-11-28, 2 files, -126/+176)
  Adjust configuration files to improve readability:
  * Adjust all config keys to use the same pattern (semicolon without whitespace): ";key = value"
  * Adjust comments (semicolon with whitespace): "; My comment description"
  * Add a newline between the description and the config key, if the config key has a description.
* chore: configurable ICU locations (Johannes Jörg Schmidt, 2022-11-28, 1 file, -1/+3)
  Add `LDFLAGS` and `CFLAGS` environment variables to `IcuEnv` in the rebar config, to be able to configure ICU include and object paths via environment variables without having to patch `rebar.config.script`.
* Merge pull request #4275 from apache/node-local-warning (Robert Newson, 2022-11-24, 1 file, -0/+5)
  add warning about misapprehending the node-local interface
  * add warning about misapprehending the node-local interface (Robert Newson, 2022-11-24, 1 file, -0/+5)
* Update smoosh documentation (Nick Vatamaniuc, 2022-11-18, 4 files, -76/+74)
  * Remove the state chart. With the activated/not-activated state gone, we don't need it any longer.
  * Describe the cleanup channels.
  * Add upgrade db and view channel references in a few places.
  * Remove references to `external` or `data_size` and other previous compaction size metrics used for triggering compactions. Replace references with `active` size.
  * Use double back-ticks in a few places instead of single back-ticks due to differences in RST vs MD. In RST, code literals need double back-ticks.
* Optimize smoosh (Nick Vatamaniuc, 2022-11-18, 4 files, -713/+1100)
  Clean up, optimize and increase test coverage for smoosh.
  * Use the new, simpler persistence module. Remove the extra `activated vs non-activated` states. Use the `handle_continue(...)` gen_server callback to avoid blocking smoosh application initialization (sketched after this entry), and let channels unpersist in their init function, which happens outside general application initialization.
  * Add an index cleanup channel and enqueue index cleanup jobs by default.
  * Remove gen_server bottlenecks for status and last update calls. Instead, rely on ets table lookups and gen_casts only.
  * Add a few more `try ... catch`'s to avoid crashing the channels.
  * Use maps to keep track of starting and active jobs in the channel.
  * Re-check the priority again before starting jobs. This is needed when jobs are un-persisted after a restart and the database may have been compacted or deleted already.
  * Update periodic channel scheduled checks to have a common scheduling mechanism.
  * Quantize ratio priorities to keep about a single decimal worth of precision. This should help with churn in the priority queue, where the same db is constantly added and removed for small, insignificant ratio changes like 2.0001 -> 2.0002. This works along with a recent optimization in the smoosh priority queue module, where if the priority matches, the item is not removed and re-added; it just stays where it is, reducing CPU and heap memory churn.
  * Remove testing-only API functions to avoid cluttering the API.
  * Store messages off-heap for channels and smoosh_server.
  * Instead of a per-channel `last_updated` gen_server:call(...), use a single access ets table which is periodically cleaned of stale entries. As an optimization, check the last access first before enqueueing the shard. This avoids sending an extra message to smoosh_server.
  * As a protection mechanism against overload, cap the access table size at 250k entries. Don't allow enqueueing more than that many entries during the configured `[smoosh] staleness = Minutes` period.
  * Increase test coverage from 60% to 90%.
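  The `handle_continue` pattern mentioned in the first bullet, as a generic sketch (standard OTP 21+ gen_server behaviour; module name and state shape are made up, not the actual smoosh channel code):
  ```
  -module(channel_sketch).
  -behaviour(gen_server).
  -export([start_link/1]).
  -export([init/1, handle_continue/2, handle_call/3, handle_cast/2]).

  start_link(Args) ->
      gen_server:start_link(?MODULE, Args, []).

  %% init/1 returns right away so the supervisor (and application startup)
  %% isn't blocked; the potentially slow un-persist step runs immediately
  %% afterwards in handle_continue/2.
  init(Args) ->
      {ok, #{args => Args, queue => empty}, {continue, unpersist}}.

  handle_continue(unpersist, State) ->
      %% Placeholder for reading the persisted queue back from disk.
      {noreply, State#{queue := restored}}.

  handle_call(_Msg, _From, State) ->
      {reply, ok, State}.

  handle_cast(_Msg, State) ->
      {noreply, State}.
  ```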
* Improve smoosh_priority_queue (Nick Vatamaniuc, 2022-11-18, 2 files, -294/+154)
  * Remove the `Value` parameter as it was never used.
  * Use `make_ref()` instead of `unique_integer([monotonic])` for performance [1].
  * Rewrite tests to get 100% coverage. Previous tests didn't actually run due to a setup error.
  * Switch from `size/1` to `qsize/1`, as `size/1` is a built-in and we have to write `?MODULE:size/1` everywhere.
  * Do not remove elements from the gb_tree just for inspecting min and max elements.
  * Remove the `last_updated/2` logic, as that will be published in an ETS table.
  * Replace low-level file serialization operations with `to_map/1`, `from_map/3`.
  [1]
  ```
  4> timer:tc(fun() -> [make_ref() || _ <- lists:seq(1, 1000000)], ok end).
  {488923,ok}
  6> timer:tc(fun() -> [{erlang:monotonic_time(), erlang:unique_integer([monotonic])} || _ <- lists:seq(1, 1000000)], ok end).
  {1178409,ok}
  ```
* Improve smoosh_utils (Nick Vatamaniuc, 2022-11-18, 3 files, -144/+217)
  Add functions related to dealing with the new index cleanup channel. Add functions which fetch the channels list from the config.
  Fix an unfortunate corner case where operators, when creating a custom smoosh channel, may by mistake omit some of the built-in channels, like view upgrades. To improve the behavior, make the built-in channels always available. Channels configured by the user are always appended to the list of built-in ones.
  The new validate_arg/1 function is in charge of validating smoosh's input args. It centralizes ?b2l, ?l2b and other transforms and checks in one place so they are not sprinkled throughout the internal functions. After the validation steps, all the db and view names are guaranteed to be binaries.
  The most significant change is probably the addition of extra tests to get to 100% test coverage. Since we cover all of the cases in time window checking, remove the more heavyweight Elixir test, since we now get a nice coverage report in Erlang and the test is overall smaller as well.
* Add a new smoosh persistence module (Nick Vatamaniuc, 2022-11-18, 1 file, -0/+300)
  * persist/3: persists the queue, active and starting jobs
  * unpersist/1: unpersists the queue
  * check_setup/0: quick configuration smoke test
  The design is a bit simplified from the previous implementation:
  * There is a single waiting queue, persisted as a simple map. Both starting and active jobs are added to the same map structure. To maintain their position in the queue, use `infinity` as their priority value. This is an established Erlang pattern: the atom sorts higher than both float and integer values (see the shell example after this entry).
  * Detailed, unique gb_tree key parts are not persisted; only the priority values and the queue objects. This should make the structure a bit smaller and faster.
  * Remove versioning. Smoosh queue data is not critical and, if need be, we can just use a new suffix.
  * Anticipating setup errors with the persistence state directory, add a smoke test in check_setup/0 to check writes and reads to/from the state directory and log a warning on failure. The idea is to call it once during startup in the smoosh application to inform users their setup is problematic. After that, treat smoosh persistence opportunistically and, if the state directory is not set up, keep the rest of the CouchDB node up.
  There should be 100% test coverage.
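  A quick Erlang shell illustration (mine, not part of the commit) of why the `infinity` atom works as a top priority: in Erlang's term order, any number sorts before any atom, so `infinity` sorts above every numeric priority:
  ```
  1> lists:sort([2.0001, 10, infinity, 0.5]).
  [0.5,2.0001,10,infinity]
  ```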
* Improve flaky dbs info test (Jay Doane, 2022-11-18, 1 file, -6/+10)
  This test can time out with the following stack trace because ibrowse sends {error, retry_later} under certain conditions [1], which causes test_request:request/6 to sleep and retry [2], which can result in this failure:
  chttpd_dbs_info_test:79: -dbs_info_test_/0-fun-20- (should_return_500_time_out_when_time_is_not_enough_for_get_dbs_info)...*timed out*
  in function timer:sleep/1 (timer.erl, line 219)
  in call from test_request:request/6 (src/test_request.erl, line 106)
  in call from chttpd_dbs_info_test:should_return_500_time_out_when_time_is_not_enough_for_get_dbs_info/1 (test/eunit/chttpd_dbs_info_test.erl, line 157)
  in call from eunit_test:run_testfun/1 (eunit_test.erl, line 71)
  in call from eunit_proc:run_test/1 (eunit_proc.erl, line 531)
  in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 356)
  in call from eunit_proc:handle_test/2 (eunit_proc.erl, line 514)
  in call from eunit_proc:tests_inorder/3 (eunit_proc.erl, line 456)
  undefined
  [1] https://github.com/cmullaparthi/ibrowse/blob/22d6fd6baa6e83633aa2923f41589945c1d2dc2f/src/ibrowse.erl#L409
  [2] https://github.com/apache/couchdb/blob/62d92766e8b8042c2f3627c3ac3e2365410c7912/src/couch/src/test_request.erl#L104-L106
* Merge pull request #4272 from apache/kill_all_couch_servers (Robert Newson, 2022-11-18, 1 file, -2/+2)
  kill all couch_servers if database_dir changes
  * kill all couch_servers if database_dir changes (Robert Newson, 2022-11-18, 1 file, -2/+2)
* Improve purge client cleanup logging (Nick Vatamaniuc, 2022-11-18, 1 file, -35/+120)
  * Refactor lag logging to a separate function.
  * When the purge client validity check throws an error, log the error as well.
  * Use the newer `erlang:system_time(second)` call to get the time.
  * When warning about update lag, log both the lag and the limit.
  * Add a specific log message if the update timestamp is invalid.
  * Clarify that an invalid timestamp is not always because of a malformed checkpoint document. The most likely case is a stale view client checkpoint which hasn't been cleaned up properly.
  Fix: #4181
* Add a proper reshard jobs ioq class (Nick Vatamaniuc, 2022-11-17, 4 files, -3/+12)
  This lets users bypass it just like any other ioq class. Improving on https://github.com/apache/couchdb/pull/4267
* Use compaction ioq priority for shard splitting (Nick Vatamaniuc, 2022-11-16, 2 files, -1/+4)
  Db shard copy uses `db_compact` priority and view building uses `view_compact` priority, to avoid inventing new priority levels.
* Update active db size calculation to use only leaf nodes (Nick Vatamaniuc, 2022-11-13, 1 file, -8/+7)
  Previously it used both leaf and intermediate nodes. With this PR, active db size should decrease when users delete documents. This, in turn, should make smoosh enqueue those dbs for compaction to recover the disk space used by the now-deleted document bodies.
  Previously, it was possible for users to delete gigabytes worth of document bodies with smoosh never noticing and never triggering the compaction.
  Original idea and patch provided by Robert Newson in the CouchDB dev Slack discussion channel.
  Co-authored-by: Robert Newson <rnewson@apache.org>
  Fixes: https://github.com/apache/couchdb/issues/4263
* Update couch_mrview_debug with a few new functions (Nick Vatamaniuc, 2022-11-10, 1 file, -2/+126)
  Add functions to inspect view signatures, index files and purge checkpoints.
* Improve fabric index cleanup (Nick Vatamaniuc, 2022-11-10, 7 files, -246/+500)
  * Clean up stale view purge checkpoints. Previously we didn't, and purge progress could have stalled by keeping around inactive (lagging) purge checkpoints.
  * couch_mrview_cleanup attempted to clean purge checkpoints, but that didn't work for clustered databases, only for local ones. Nowadays most dbs are clustered, so make sure those work as well.
  * DRY out code from both the fabric inactive index cleanup and couch_mrview_cleanup modules. Move some of the common code to the couch_mrview_util module. couch_mrview_cleanup is the only place in charge of the cleanup logic now.
  * Consolidate and improve tests. Utility functions to get all index files, purge checkpoints and signatures are now tested with couch_mrview_util tests, and end-to-end fabric cleanup tests are in fabric_tests. Since fabric_tests covers all the test scenarios from fabric_test.exs, remove fabric_test.exs so we don't have duplicated tests and still get the same coverage.
* Fix whitespace of config option admin_only_all_dbs (#4256) (Ronny Berndt, 2022-11-04, 1 file, -3/+3)
  Correct indentation of the config option admin_only_all_dbs.
* Make sure admin_only_all_dbs applies to _dbs_info as well (Nick Vatamaniuc, 2022-11-02, 4 files, -7/+90)
  We missed that in the 3.x versions up until now. Also add some documentation around it, with version history.
  Fixes #4253
* Integrate b64url, ets_lru and khash into the main repo (Nick Vatamaniuc, 2022-10-28, 30 files, -6/+4883)
  As discussed on the mailing list: https://lists.apache.org/thread/opvsmz1pwlnv96wozy5kp7ss896l9lfp
* Optimize smoosh enqueuing (Nick Vatamaniuc, 2022-10-27, 1 file, -22/+49)
  Make sure we don't do an O(n) filtering pass when enqueuing processes terminate. Also, clean up an old code_change clause and add a tiny optimization to avoid unnecessary external calls.
* Fix smoosh get_priority/2 case clause (Nick Vatamaniuc, 2022-10-27, 1 file, -85/+96)
  Fixes the case when get_index/3 returns a `database_not_found` error.
* Bump mochiweb to v3.1.1 (Nick Vatamaniuc, 2022-10-27, 1 file, -1/+1)
* Implement global password hasher process (#4240) (Ronny, 2022-10-27, 3 files, -21/+83)
  Implement a global password hasher process. The new behavior reduces the hashing calls from 2 * N (N equals the number of `couch_server` processes) down to 2 calls. The first call is triggered by the change in the config file, and the second call happens through `config:set`, which writes the hashed result back into the config file. If we want to reduce this to one call only, we need to implement some more intelligence in the config part to prevent the triggers for such calls.
  The password hasher is implemented as a `gen_server` and started with the `couch_primary_services` supervisor.
  Fixes #4236.
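  As an illustration of "started with the couch_primary_services supervisor", a singleton worker of this kind would have a child spec along these lines (the module name and options here are assumptions for the sketch, not taken from the patch):
  ```
  %% Hypothetical child spec for a single, cluster-node-wide hasher gen_server.
  #{
      id => couch_password_hasher,
      start => {couch_password_hasher, start_link, []},
      restart => permanent,
      shutdown => 5000,
      type => worker,
      modules => [couch_password_hasher]
  }.
  ```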
* Integrate config app into main repo (Nick Vatamaniuc, 2022-10-26, 18 files, -3/+2287)
  As per consensus in the ML discussion: https://lists.apache.org/thread/9dphqb6mjh1v234v15rcft7mfpjx9223
* Optimize _bulk_get endpoint (Nick Vatamaniuc, 2022-10-25, 6 files, -941/+1095)
  Use the new `fabric:open_revs/3` API implemented in #4201 to optimize the _bulk_get HTTP API. Since `open_revs/3` itself is new, allow reverting to individual doc fetches using the previous `open_revs/4` API via a config setting, mostly as a precautionary measure.
  The implementation consists of three main parts:
  * Parse and validate args
  * Fetch the docs using `open_revs/3` or `open_revs/4`
  * Emit results as json or multipart, based on the `Accept` header value
  Parsing and validation check for various errors and then return a map of `#{Ref => {DocId, RevOrError, DocOptions}}` and a list of Refs in the original argument order. The middle tuple element, `RevOrError`, is notable in that it may hold either the revision ID (`[Rev]` or `all`) or `{error, {Rev, ErrorTag, ErrorReason}}`.
  Fetching the docs is fairly straightforward. The slightly interesting aspect is that when an error is returned from `open_revs/3`, we have to pretend that all the batched docs failed with that error. That is done to preserve the "zip" property, where all the input arguments have their matching result at the same position in the results list.
  Another notable thing here is that we fixed a bug where the error returned from `fabric:open_revs/3,4` was not formatted in a way that could be emitted as json, resulting in a function clause error. That is why we call `couch_util:to_binary/1` on it. This was detected by the integration testing outlined below and was missed by the previous mocked unit test.
  The last part is emitting the results as either json or multipart. Here most changes are cleanups and grouping into separate handler functions. The `Accept` header can be either `multipart/related` or `multipart/mixed`, and we try to emit the same content type as was passed in the `Accept` header.
  One notable thing here is that by DRY-ing the filtering of attachments in `non_stubbed_attachments/1` we fixed another bug where the multipart result was returning nonsense in cases when all attachments were stubs. The doc was returned as a multipart chunk with content type `multipart/...` instead of application/json. This was also detected in the integration tests described below.
  The largest changes are in the testing area. Previous multipart tests were using mocks heavily, were quite fragile, and didn't have good coverage. Those tests were removed and replaced by new end-to-end tests in `chttpd_bulk_get_test.erl`. To make that happen, add a simple multipart parser utility function which knows how to parse multipart responses into maps. Those maps preserve chunk headers, and we can match those with `?assertMatch(...)` fairly easily. The tests try to get decent coverage of the `chttpd_db.erl` bulk_get implementation and its utility functions, but they are also end-to-end tests, so they test everything below as well, including the fabric and couch layers.
  Quick one-node testing using couchdyno, replicating 1 million docs, shows at least a 2x speedup to complete the replication using this PR.
  On main:
  ```
  r=rep.Rep(); r.replicate_1_to_n_and_compare(1, num=1000000, normal=True)
  330 sec
  ```
  With this PR:
  ```
  r=rep.Rep(); r.replicate_1_to_n_and_compare(1, num=1000000, normal=True)
  160 sec
  ```
  Individual `_bulk_get` response times show an even higher improvement: an 8x speedup.
  On main:
  ```
  [notice] ... POST /cdyno-0000001/_bulk_get?latest=true&revs=true&attachments=false 200 ok 468
  [notice] ... POST /cdyno-0000001/_bulk_get?latest=true&revs=true&attachments=false 200 ok 479
  ```
  With this PR:
  ```
  [notice] ... POST /cdyno-0000001/_bulk_get?latest=true&revs=true&attachments=false 200 ok 54
  [notice] ... POST /cdyno-0000001/_bulk_get?latest=true&revs=true&attachments=false 200 ok 61
  ```
  Fixes: https://github.com/apache/couchdb/issues/4183
* A few more Erlang <23 cleanups (Nick Vatamaniuc, 2022-10-25, 2 files, -11/+2)
  Thanks to Ilya for noticing and catching these!
* Introduce roles_claim_path (#4232) (Ronny, 2022-10-25, 2 files, -2/+161)
  Introduce new roles_claim_path parameter and add missing roles_claim_name description.
* Implement fabric:open_revs/3 (Nick Vatamaniuc, 2022-10-19, 3 files, -1/+716)
  `fabric:open_revs/3` is the batched counterpart to `fabric:open_revs/4`. Instead of fetching revisions for only one document, like `fabric:open_revs/4`, it can fetch batches of document revisions. The main optimization is achieved by grouping documents by shard ranges and then spawning shard workers only once for each range (essentially doing what fabric:update_docs/2,3 does). The idea is to use this implementation for a more efficient version of _bulk_get. So far this is just the fabric API, not plugged into the HTTP API yet. The backend API for this is already available, as implemented in https://github.com/apache/couchdb/pull/4185.
  While the implementation is intended to be as close as possible to `fabric:open_revs/4`, there is a key difference -- since we're now dealing with potentially large batches, we choose not to perform immediate read repair. Instead, we let the internal replicator handle that. Read repair can spawn an unbounded number of processes, and now with batches those could be large fabric:update_docs/2,3 requests. Issuing lots of those, especially when a cluster has a hard time keeping up, might not be a good idea.
  For now we leave the `fabric:open_revs/4` implementation intact, as opposed to re-wiring it to call `fabric:open_revs/3` with a one-element list. This helps keep a test reference point (see fabric_open_revs_test for an example), allows for a configurable fallback for the _bulk_get implementation, and lets individual doc GET fetches work exactly as they have been.
  A few details about the implementation itself:
  * The coordinator state is kept in the `#st{}` record. Each requested doc ID and revisions list argument keeps track of its quorum logic and gathers responses in its individual `#req{}` record. There is exactly one of those for each requested `{{Id, Revs}, DocOpts}` argument.
  * Since the `{{Id, Revs}, DocOpts}` arguments are not guaranteed to be unique, we have to keep track of them so we can return the results in the same order. To help with that, we tag each with a unique reference and keep track of it. With `{atts_since, ...}` revisions, it helps to stash away the arguments separately and just operate with the reference tags. Those references are usually called `ArgRef` and `ArgsRefs` in the code.
  * The `#st.workers` map keeps track of the spawned workers and which args they are responsible for. As each worker returns, or is knocked out by an error, it gets removed from the `#st.workers` map, and at the same time all the individual requests (`#req{}` records) are updated. After the update we check if read quorum has been met and if it's still possible to succeed with the remaining workers.
  * There is 100% test coverage. The tests in the `fabric_open_revs_test` module are more end-to-end and exercise various options. The tests in `fabric_open_revs` check how the quorum logic and error handling work, by emulating various workers returning, and how revisions are merged together into the final result.
  Issue: https://github.com/apache/couchdb/issues/4183
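  Based on the description above, the call shape is roughly as sketched below (the argument and result shapes here are assumptions for illustration; consult the module itself for the real spec):
  ```
  %% Sketch only: one {{Id, Revs}, DocOpts} tuple per requested document;
  %% results come back in the same order as the arguments, even with duplicates.
  Args = [
      {{<<"doc1">>, all}, []},
      {{<<"doc2">>, [{1, <<"967a00dff5e02add41819138abb3284d">>}]}, []}
  ],
  {ok, Results} = fabric:open_revs(<<"mydb">>, Args, []).
  ```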