Commit message (Author, Age, Files, Lines)
* Improve retryable FDB error handling [improve-resilience-to-retriable-errors] (Nick Vatamaniuc, 2021-03-25, 10 files, -28/+44)
| After running the Elixir suite under the "buggify" mode, we noticed retryable errors were not handled well. In some cases we handled some errors (1009) but not others (1007, 1031). Some other retryable errors were not considered at all. So here we use the newly defined constants and `ERLFDB_IS_RETRYABLE/2` guard from erlfdb v1.3.1 to make handling of these errors a bit more consistent.
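The idea of routing every retryable code through one shared predicate, instead of matching individual codes at each call site, can be sketched as follows. This is a Python illustration only, not the Erlang change itself: `RETRYABLE_CODES`, `FdbError`, `is_retryable` and `run_with_retries` are hypothetical names, and the set holds only the three codes named in the commit message (the real erlfdb guard may well cover more).

```python
# Hypothetical sketch: a single predicate decides which error codes get retried.
# Only the codes named in the commit message are listed; this is an assumption,
# not erlfdb's actual list.
RETRYABLE_CODES = frozenset({1007, 1009, 1031})


class FdbError(Exception):
    """Stand-in for an error raised by a FoundationDB client binding."""

    def __init__(self, code):
        super().__init__(f"fdb error {code}")
        self.code = code


def is_retryable(code):
    """Return True when the error code belongs to the retryable set."""
    return code in RETRYABLE_CODES


def run_with_retries(fun, max_attempts=5):
    """Call fun() until it succeeds.

    Non-retryable errors are re-raised immediately; retryable ones are
    re-raised only once max_attempts has been reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fun()
        except FdbError as err:
            if not is_retryable(err.code) or attempt == max_attempts:
                raise
```

Keeping the decision in one place means a newly recognized retryable code only has to be added once, which is the consistency the commit message is after.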
* Decrease the view indexer transaction time (Nick Vatamaniuc, 2021-03-23, 2 files, -8/+24)
| The indexer transaction time is decreased in order to allow enough time for the client to re-use the same GRV to emit doc bodies.
| This PR goes along with [1], where emitted doc bodies in view responses now come from the same database read version as the one used by the indexer. Since the batcher previously used 4.5 seconds as the maximum, that left little time to read any doc bodies.
| [1]: https://github.com/apache/couchdb/pull/3391
| Issue: https://github.com/apache/couchdb/issues/3381
* Clean up indexes after each test case in couch_views_active_tasks_test (Nick Vatamaniuc, 2021-03-23, 1 file, -0/+3)
| Since with the recent changes there is an extra `do_finalize` transaction, the test which relied on `do_update` being called last and inspecting a mocked active_tasks util call started failing.
| The test mocks and checks active_tasks reporting for changes done. The first test case completes during the last `do_update` transaction, but the indexing process continues on with `do_finalize`. In the meantime, the second test starts and sees 3 active tasks instead of 2, the extra one being from the first test.
| To fix this, make sure to clean up indexing data and jobs after each test case.
* Require subscribers to wait until indexer finishes (Adam Kocoloski, 2021-03-23, 1 file, -4/+0)
| This clause allowed a subscriber to start reading a view as soon as the indexer made it past the sequence of interest. The trouble with that approach is the resulting view is not directly related to any snapshot of the underlying DB. Waiting until the indexer finishes allows us to provide better semantics, where the view observes a consistent snapshot of the database at some point in time >= the requested sequence.
| In order to see this work through, the view reader should explicitly set the read version of FDB to match the commit version introduced by the indexer, to avoid seeing partial results from a follow-on indexing job.
* Consistent view emits using indexer's GRVs and committed versionstamps (Nick Vatamaniuc, 2021-03-23, 7 files, -33/+262)
| View indexer saves both the GRV it used during the view update, and the committed versionstamp in the couch_jobs job data section. Then, the view reader uses those versionstamps to emit a consistent snapshot of the view.
|  * The committed versionstamp ensures that the view results can be emitted even if the view gets updated between the time the view finished and the reader gets notified.
|  * The indexer GRV ensures that, when the include_docs=true option is used, the view will emit the same doc revisions as the ones it read while indexing the data.
| The view reader uses those two versionstamps only if it initiates the view build itself and then waits for it to build, to ensure that it doesn't operate on stale GRVs.
| Because the committed version is only available after the main transaction commits, view indexing finalization now runs a separate transaction at the end, which reads the committed version and then marks the view as `finished`.
| Since included docs have to be read at the indexer's GRV version, and that version is different from the committed version, those documents are loaded in a separate process.
| There are a few complications introduced by this commit:
|  * The versionstamps, especially the indexer GRV ones, may become stale (older than 5 seconds) quickly and start throwing 1007 (transaction_too_old) errors. This could be mitigated by forcing the indexer to commit after a shorter interval (1-2 seconds).
|  * There is some fragility introduced with respect to how included docs are loaded in a separate process. When that process crashes or times out, it may throw a new type of unexpected error that we don't catch properly.
* Increase timeout for continuous filtered changes elixir test (#3453) (Bessenyei Balázs Donát, 2021-03-23, 1 file, -1/+1)
|
* Make "make leaves" replication test less flaky (Nick Vatamaniuc, 2021-03-23, 1 file, -3/+3)
| It's a CPU-intensive test and so it often times out in CI. So try to increase its timeout and also decrease the size of attachments and the number of design docs. We cannot decrease the number of leaves as that's the thing that's being tested.
* Increase timeout for process_response in ChangesAsyncTest (#3450) (Bessenyei Balázs Donát, 2021-03-22, 1 file, -1/+1)
|
* Remove CentOS 6 from CI (#3439) (Bessenyei Balázs Donát, 2021-03-19, 1 file, -46/+0)
|
* Fix error_logger reports for OTP >= 21 (Nick Vatamaniuc, 2021-03-18, 1 file, -0/+13)
| Starting with OTP 21 there is a new logging system, and we forgot to add the legacy error logger handler for it. Without it `couch_log` cannot emit gen_server, supervisor and other such system events.
| Luckily, there is OTP support to enable legacy error_logger behavior and that's what we're doing here. The `add_report_handler/1` call will auto-start the `error_logger` app if needed, and it will also add an `error_logger` handler to the global `logger` system.
| We also keep the `gen_event:add_sup_handler/3` call, as that will ensure we'll find out when `error_logger` dies so that `couch_log_monitor` can restart everything.
| Someday(TM) we'll write a proper log event handler for the new logger and have nicely formatted structured logs, but it's better to do that once we don't have to support OTP versions =< 20.
| Issue: https://github.com/apache/couchdb/pull/3422
* Add more concurrent write tests (Nick Vatamaniuc, 2021-03-18, 1 file, -0/+76)
|  * `Secondary data tests with updates and queries`: Like the `Secondary data tests with updates` but adds intermittent queries while updates are taking place.
|  * `Secondary data tests with deletes and queries`: Deletes and queries intermittently. This one was specifically crafted to trigger the `ebtree:lookup_multi/3` error and hopefully other similar ones.
| The retry_until section at the end was added to differentiate between the case where we return a partial view and the case where the view is actually broken and will never "catch up". Once we start returning only completed views, we can remove that section.
* Merge pull request #3441 from apache/concurrent-write-test-with-updates (Robert Newson, 2021-03-18, 2 files, -1/+26)
|\
| | Add Secondary data tests with updates test
| * Add Secondary data tests with updates test [concurrent-write-test-with-updates] (Robert Newson, 2021-03-18, 2 files, -1/+26)
| |
* | Merge pull request #3440 from cloudant/fix-typo (iilyak, 2021-03-18, 1 file, -1/+1)
|\ \
| |/
|/| Fix typo causing not saving of configuration changes from chttpd_node
| * Fix typo causing not saving of configuration changes from chttpd_node (ILYA Khlopotov, 2021-03-18, 1 file, -1/+1)
|/
* feat(couchjs): add support for SpiderMonkey 86 (Jan Lehnardt, 2021-03-17, 6 files, -4/+827)
|
* Fix _changes?filter=_design (#3430) (Bessenyei Balázs Donát, 2021-03-16, 2 files, -2/+8)
|
* Set wait_for_built_index=True for 17-multi-type-value-test.py (Bessenyei Balázs Donát, 2021-03-16, 1 file, -1/+1)
|
* Ignore unchecked JWT claims (Jay Doane, 2021-03-15, 1 file, -10/+26)
| Previously, if a JWT claim was present, it was validated regardless of whether it was required. However, according to the spec [1]:
|   "all claims that are not understood by implementations MUST be ignored"
| which we interpret to mean that we should not attempt to validate claims we don't require. With this change, only claims listed in required checks are validated.
| [1] https://tools.ietf.org/html/rfc7519#section-4
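The rule described above (validate a claim only when it is listed as a required check) could look roughly like this. It is a Python sketch rather than the jwtf Erlang code: `validate_claims` and `required_checks` are hypothetical names, and treating a listed claim that is absent from the token as an error is an assumption of this sketch.

```python
import time


def validate_claims(claims, required_checks, now=None):
    """Validate only the claims named in required_checks; ignore all others.

    claims is the decoded JWT payload (a dict). Claims that are present
    but not listed in required_checks are never inspected, even if their
    values are malformed.
    """
    now = time.time() if now is None else now
    for name in required_checks:
        if name not in claims:
            # Assumption of this sketch: a required claim must be present.
            raise ValueError(f"missing required claim: {name}")
        value = claims[name]
        if name in ("exp", "nbf") and not isinstance(value, (int, float)):
            raise ValueError(f"{name} must be a number")
        if name == "exp" and value <= now:
            raise ValueError("token has expired")
        if name == "nbf" and value > now:
            raise ValueError("token is not valid yet")
    return True
```

With this shape, a token carrying a malformed `exp` passes when `exp` is not required and is rejected only when `exp` appears in `required_checks`.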
* Include necessary dependency in jwtf keystore test setup & teardown (Jay Doane, 2021-03-15, 1 file, -2/+2)
| The config application depends on couch_log, so include it when setting up and tearing down tests.
* Remove error message on mix test (Bessenyei Balázs Donát, 2021-03-14, 1 file, -3/+0)
|
* Remove _membership call from set_config_raw in integration tests (Nick Vatamaniuc, 2021-03-13, 1 file, -9/+3)
| Since we run `elixir` tests with `-n 1` we can just use `_local`
* Fix more couch_jobs flakiness (Nick Vatamaniuc, 2021-03-12, 1 file, -0/+11)
| The errors seen in #3417 seem to indicate the expiration jobs are interfering with the couch_jobs tests. To prevent that, prevent the expiration_db job gen_server from starting at all.
| Fixes #3417
* Fix couch_jobs to be less flaky (Nick Vatamaniuc, 2021-03-12, 1 file, -532/+454)
| It turns out fabric is dependent on couch_jobs because of the db expiration module. So when couch_jobs was restarted multiple times per test case, it could have brought down fabric. However, since couch_jobs needs fabric for transactional stuff, it ended up bringing the couch_jobs app down as well.
| To fix it:
|  * Switch to explicitly starting/stopping fabric and couch_jobs together
|  * Break apart bad_messages* tests to individually test each type of message, as app restarts in the middle of the tests kept killing fabric and intermittently killing couch_jobs as well.
|  * Also make the tests look nicer by re-using ?TDEF_FE macros from `fabric2_test`; this way we can avoid the `?_test(begin... end).` pattern.
|  * Remove meck:unload since we don't really meck anything in the module
|  * Don't need to spend time cleaning out the database as we don't really create that many dbs (just one) and that one gets cleaned out in its own test.
* Fix and re-enable ChangesAsyncTest (Bessenyei Balázs Donát, 2021-03-11, 4 files, -52/+43)
|
* Merge pull request #3413 from apache/concurrent_write_tests (Robert Newson, 2021-03-10, 3 files, -0/+68)
|\
| | Verify correctness with concurrent updates
| * Verify correctness with concurrent updates (Robert Newson, 2021-03-10, 3 files, -0/+68)
| |
* | Bump erlfdb to v1.3.0 (Nick Vatamaniuc, 2021-03-09, 1 file, -1/+1)
| | https://github.com/apache/couchdb-erlfdb/releases/tag/v1.3.0
* | Bump erlfdb to v1.2.9 (Nick Vatamaniuc, 2021-03-09, 1 file, -1/+1)
| | https://github.com/apache/couchdb-erlfdb/releases/tag/v1.2.9
| | For:
| |  * ea54b1a Fix erlfdb_database_set_option else case
* | Bump erlfdb to v1.2.8 (Nick Vatamaniuc, 2021-03-05, 1 file, -1/+1)
|/
| The main feature is a fix to use buggify settings on the client (`enc` update was reverted from an intermediate release)
| https://github.com/apache/couchdb-erlfdb/releases/tag/v1.2.7
| https://github.com/apache/couchdb-erlfdb/releases/tag/v1.2.8
* Lower the view indexer transaction retry limit (Nick Vatamaniuc, 2021-03-04, 2 files, -2/+25)
| Previously, the view indexer used the default retry limit (100) from the `fdb_tx_options` config section. However, since the batching algorithm relies on sensing errors and reacting to them, retrying the batch 100 times before erroring out was not optimal. So the value is lowered to 5 and it's also made configurable.
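To illustrate why a batcher that reacts to errors benefits from seeing them quickly, here is a small Python sketch of adaptive batch sizing. It is not the couch_views batching algorithm: `index_in_batches`, `process_batch`, the halving policy and the use of `RuntimeError` as the failure signal are all assumptions made for the example.

```python
def index_in_batches(items, process_batch, start_size=8, retry_limit=5):
    """Process items in batches, halving the batch size after each failure.

    process_batch(batch) is expected to raise RuntimeError when a batch
    cannot be committed. The error is re-raised once more than retry_limit
    consecutive failures have occurred, or when a single-item batch fails.
    Returns the batch size in use when processing finished.
    """
    size = start_size
    pos = 0
    failures = 0
    while pos < len(items):
        batch = items[pos:pos + size]
        try:
            process_batch(batch)
        except RuntimeError:
            failures += 1
            if failures > retry_limit or size == 1:
                raise
            # React to the error right away by trying a smaller batch.
            size = max(1, size // 2)
            continue
        pos += len(batch)
        failures = 0
    return size
```

If the layer underneath silently retried each failing batch 100 times before reporting it, the size adjustment here would be delayed by that much; a small limit lets the batcher adapt sooner.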
* Re-add transaction size exceeded test (#3395) (Bessenyei Balázs Donát, 2021-03-03, 2 files, -1/+19)
|
* Allow applying per-transaction options with fabric2_fdb:transactional/3 (Nick Vatamaniuc, 2021-03-03, 3 files, -24/+71)
| 1) First, as a cleanup, remove DB `Options` from the `init_db/3` call. We always follow `init_db/3` (sometimes called through `fabric2_fdb:transactional(DbName, ...)`) with a `create(TxDb, Options)` or `open(TxDb, Options)` call, where we overrode `Options` anyway. The only time we didn't follow it up with a `create/2` or `open/2` is when dbs are deleted, where `Options` wouldn't matter.
| 2) Add a new `fabric2_fdb:transactional(DbName|Db, TxOptions, Fun)` call which allows specifying per-transaction TX options in the `TxOptions` arg. The format of `TxOptions` is `#{option_name_as_atom => integer | binary}`
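A per-transaction options map of that shape (option name mapped to an integer or binary value) can be pictured as an overlay on top of the configured defaults. The Python sketch below shows only that idea; `RecordingTx`, `apply_tx_options` and the option names used in the usage note are hypothetical and do not mirror the `fabric2_fdb` implementation.

```python
class RecordingTx:
    """Stand-in transaction object that just records the options set on it."""

    def __init__(self):
        self.options = {}

    def set_option(self, name, value):
        self.options[name] = value


def apply_tx_options(tx, defaults, tx_options):
    """Overlay per-transaction options on the defaults and apply them to tx.

    Values must be integers or bytes, matching the integer | binary format
    described in the commit message. Returns the merged option map.
    """
    merged = {**defaults, **tx_options}
    for name, value in merged.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, bytes)):
            raise TypeError(f"option {name!r} must be an integer or bytes")
    for name, value in merged.items():
        tx.set_option(name, value)
    return merged
```

For example, calling `apply_tx_options(RecordingTx(), {"retry_limit": 100, "timeout": 60000}, {"retry_limit": 5})` leaves the default timeout in place and overrides only the retry limit for that one transaction.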
* Bump erlfdb to 1.2.6 (#3400) (Joan Touzet, 2021-03-03, 1 file, -1/+1)
|
* Update Makefile stripping remaining direct make refs (Alessio Biancalana, 2021-03-03, 1 file, -4/+4)
|
* Relax isolation level when indexer reads from DB (#3393) (Adam Kocoloski, 2021-03-02, 2 files, -7/+9)
| * Relax isolation level when indexer reads from DB
|   This patch causes the indexing subsystem to use snapshot isolation when reading from the database. This reduces commit conflicts and ensures the index can make progress even in the case of frequently updated docs.
|   In the pathological case, a document updated in a fast loop can cause the indexer to stall out entirely when using serializable reads. Each successful update of the doc will cause the indexer to fail to commit. The indexer will retry with a new GRV but the same target DbSeq. In the meantime, our frequently updated document will have advanced beyond DbSeq and so the indexer will finish without indexing it in that pass. This process can be repeated ad infinitum and the document will never actually show up in a view response.
|   Snapshot reads are safe for this use case precisely because we do have the _changes feed, and we can always be assured that a concurrent doc update will show up again later in the feed.
| * Bump erlfdb version
|   Needed to pull in fix for snapshot range reads.
* Fix badmatch in couch_views_indexer (Nick Vatamaniuc, 2021-03-02, 1 file, -4/+6)
| Previously, when an erlfdb error occurred and a recursive call to `update/3` was made, the result of that call was always matched against `{Mrst, State}`. However, in the case when the call had finalized and returned the `couch_eval:release_map_context/1` response, the result would be `ok`, which would blow up with a badmatch error against `{Mrst, State}`.
* Make session elixir test more robust (Bessenyei Balázs Donát, 2021-03-01, 1 file, -1/+1)
|
* Set default nodes in dev/run to 1 (Bessenyei Balázs Donát, 2021-03-01, 1 file, -2/+2)
|
* Merge pull request #3386 from apache/ebtree-lookup-opt (Robert Newson, 2021-02-28, 1 file, -4/+4)
|\
| | Optimize lookup/3
| * Optimize lookup/3 [ebtree-lookup-opt] (Robert Newson, 2021-02-27, 1 file, -4/+4)
|/
| A tidier version of https://github.com/apache/couchdb/pull/3384 that saves an unnecessary call to collate.
* Merge pull request #3384 from apache/ebtree-lookup-collate-eq (Robert Newson, 2021-02-26, 1 file, -6/+6)
|\
| | use collate in lookup
| * use collate in lookup (Robert Newson, 2021-02-26, 1 file, -6/+6)
| |
* | Fix ebtree:lookup_multi/3 (Paul J. Davis, 2021-02-26, 1 file, -6/+5)
| | If one of the provided lookup keys doesn't exist in the ebtree, it can inadvertently prevent a second lookup key from being found if the first key greater than the missing lookup key is equal to the second lookup key.
* | Add failing cases for ebtree:lookup_multi/3 bug (Paul J. Davis, 2021-02-26, 2 files, -1/+28)
| | These two test cases expose the subtle bug in ebtree:lookup_multi/3 where a key that doesn't exist in the tree can prevent a subsequent lookup key from matching in the same KV node.
* | Fix typo (Paul J. Davis, 2021-02-26, 1 file, -8/+8)
| |
* | Merge pull request #3365 from apache/active-tasks-process-status-main (Robert Newson, 2021-02-09, 1 file, -1/+10)
|\ \
| |/
| | Show process status in active_tasks
| * Show process status in active_tasks (Robert Newson, 2021-02-09, 1 file, -1/+10)
|/
| This allows users to verify that compaction processes are suspended outside of any configured strict_window.
* Handle all erlfdb error codes (#3355) (Robert Newson, 2021-02-08, 2 files, -28/+6)
|
* Fix PUT of multipart/related attachments support for Transfer-Encoding: chunked (#3360) (Bessenyei Balázs Donát, 2021-02-03, 3 files, -2/+22)
| Transfer-Encoding: chunked causes the server to wait indefinitely, then issue a 500 error when the client finally hangs up, when PUTing a multipart/related document + attachments.
| This commit fixes that issue by adding proper handling for chunked multipart/related requests.
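For context on what handling a chunked body involves: with `Transfer-Encoding: chunked` there is no `Content-Length`, so the receiver has to read size-prefixed chunks until it sees the zero-length terminator instead of waiting for a fixed byte count. The Python sketch below decodes such a body from a binary stream; `read_chunked` is a hypothetical helper written for illustration and is unrelated to the chttpd code touched by this commit.

```python
import io


def read_chunked(stream):
    """Decode an HTTP/1.1 chunked body from a binary stream.

    Each chunk is "<hex size>[;extensions]\r\n<data>\r\n"; a chunk of size 0
    ends the body and may be followed by trailer lines and a blank line.
    """
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line.endswith(b"\r\n"):
            raise ValueError("truncated chunk header")
        # Drop any chunk extensions after ';' before parsing the hex size.
        size_text = size_line.split(b";", 1)[0].strip().decode("ascii")
        size = int(size_text, 16)
        if size == 0:
            break
        data = stream.read(size)
        if len(data) != size or stream.read(2) != b"\r\n":
            raise ValueError("truncated chunk")
        body += data
    # Skip optional trailer fields up to the terminating blank line.
    while True:
        line = stream.readline()
        if line in (b"\r\n", b""):
            break
    return bytes(body)


payload = read_chunked(io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"))
```

A multipart/related parser can then work on the decoded payload, or on the chunks as they arrive, without depending on the connection being closed to find the end of the request.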