| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
After running the Elixir suite under the "buggify" mode, we noticed
retryable errors were not handled well. In some cases we handled some
errors (1009) but not others (1007, 1031). Some other retryable errors
were not considered at all.
So here we use the newly defined constants and the `ERLFDB_IS_RETRYABLE/2`
guard from erlfdb v1.3.1 to make the handling of these errors more
consistent.
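A minimal sketch of the pattern, not the actual CouchDB code; the guard's argument order and the header path are assumptions for illustration:

```erlang
%% Sketch only: retry a transaction body, treating FDB errors as
%% retryable via the ?ERLFDB_IS_RETRYABLE/2 guard from erlfdb >= 1.3.1.
%% The (Tag, Code) argument order and include path are assumed here.
-include_lib("erlfdb/include/erlfdb.hrl").

with_retries(_Fun, 0) ->
    error(retry_limit_exceeded);
with_retries(Fun, Retries) when Retries > 0 ->
    try
        Fun()
    catch
        error:{Tag, Code} when ?ERLFDB_IS_RETRYABLE(Tag, Code) ->
            %% Retryable FDB error (e.g. 1007, 1009, 1031): try again
            with_retries(Fun, Retries - 1)
    end.
```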
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The indexer transaction time is decreased in order to allow enough
time for the client to re-use the same GRV to emit doc bodies.
This PR goes along with [1], where emitted doc bodies in view
responses now come from the same database read version as the one used
by the indexer. Since the batcher previously used 4.5 seconds as the
maximum, that left little time to read any doc bodies.
[1]: https://github.com/apache/couchdb/pull/3391
Issue: https://github.com/apache/couchdb/issues/3381
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since with the recent changes there is an extra `do_finalize`
transaction, the test which relied on `do_update` being called last
and inspected a mocked active_tasks util call started failing.
The test mocks and checks active_tasks reporting for changes made. The
first test case completes during the last `do_update` transaction, but
the indexing process continues on with `do_finalize`. In the meantime,
the second test starts and sees 3 active tasks instead of 2, the extra
one being from the first test.
To fix this, make sure to clean indexing data and jobs after each test
case.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This clause allowed a subscriber to start reading a view as soon as the
indexer made it past the sequence of interest. The trouble with that
approach is the resulting view is not directly related to any snapshot
of the underlying DB. Waiting until the indexer finishes allows us to
provide better semantics, where the view observes a consistent snapshot
of the database at some point in time >= the requested sequence.
For this to work, the view reader should explicitly set
the read version of FDB to match the commit version introduced by the
indexer, to avoid seeing partial results from a follow-on indexing job.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
View indexer saves both the GRV it used during the view update, and
the committed versionstamp in the couch_jobs job data section. Then,
the view reader uses those versionstamps to emit a consistent snapshot
of the view.
* The committed versionstamp ensures that the view results can be
emitted even if the view gets updated between the time the view
finished and the reader gets notified.
* The indexer GRV ensures that, when the include_docs=true option is
used, the view will emit the same doc revisions as the ones it read
at the time it indexed the data.
The view reader uses those two versionstamps only if it initiates the
view build itself and then waits for it to build, to ensure that it
doesn't operate on stale GRVs.
Because the committed version is only available after the main
transaction commits, a separate transaction now runs at the end of view
indexing finalization, which reads the committed version and then marks
the view as `finished`.
Since included docs have to be read at the indexer's GRV version, and,
that version is different than the committed version, those documents
are loaded in from a separate process.
There are a few complications introduced by this commit:
* The versionstamps, especially the indexer GRV ones, may become
stale (older than 5 seconds) quickly and start throwing 1007
(transaction_too_old) errors. This could be mitigated by forcing
the indexer to commit after a shorter interval (1-2 seconds).
* There is some fragility introduced with respect to how included
docs are loaded in a separate process. When that process crashes or
times out, it may throw a new type of unexpected error that we
don't catch properly.
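As a rough sketch of the reader side (the helper function name is hypothetical; pinning the read version is the real erlfdb mechanism being described):

```erlang
%% Hypothetical helper: pin the transaction's read version to the
%% committed versionstamp saved by the indexer, so the reader observes
%% exactly the snapshot the indexer committed, not a later one.
read_view_at(Tx, IndexerCommittedVsn, ReadFun) ->
    ok = erlfdb:set_read_version(Tx, IndexerCommittedVsn),
    ReadFun(Tx).
```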
|
| |
|
|
|
|
|
|
|
| |
It's a CPU-intensive test and so it often times out in CI. So try to increase
its timeout and also decrease the size of attachments and the number of design
docs. We cannot decrease the number of leaves as that's the thing that's being
tested.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Starting with OTP 21 there is a new logging system, and we forgot to
add the legacy error logger handler for it. Without it `couch_log`
cannot emit gen_server, supervisor and other such system events.
Luckily, there is OTP support to enable legacy error_logger behavior and
that's what we're doing here. The `add_report_handler/1` call will
auto-start the `error_logger` app if needed, and it will also add an
`error_logger` handler to the global `logger` system.
We also keep the `gen_event:add_sup_handler/3` call, as that will
ensure we'll find out when `error_logger` dies so that
`couch_log_monitor` can restart everything.
Someday(TM) we'll write a proper log event handler for the new logger
and have nicely formatted structured logs, but it's better to do that
once we don't have to support OTP versions =< 20.
Issue: https://github.com/apache/couchdb/pull/3422
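The mechanism described above can be sketched roughly as follows; the handler module name is illustrative, not the actual couch_log module:

```erlang
%% Sketch: on OTP >= 21, re-enable legacy error_logger behavior and
%% install a supervised handler so couch_log_monitor is notified if
%% error_logger dies. couch_log_event_handler is an assumed module name.
install() ->
    %% auto-starts error_logger if needed and registers it with the
    %% new 'logger' system
    error_logger:add_report_handler(couch_log_event_handler),
    %% supervised registration: the calling process gets a
    %% gen_event_EXIT message if the handler or error_logger dies
    gen_event:add_sup_handler(error_logger, couch_log_event_handler, []).
```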
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* `Secondary data tests with updates and queries`: Like the `Secondary data
tests with updates` but adds intermittent queries while updates are taking
place.
* `Secondary data tests with deletes and queries`: Deletes and queries
intermittently. This one was specifically crafted to trigger the
`ebtree:lookup_multi/3` error and hopefully other similar ones. The
retry_until section at the end was added to differentiate between the case
where we return a partial view and the case where the view is actually
broken and will never "catch up". Once we start returning only completed
views, we remove that section.
|
|\
| |
| | |
Add Secondary data tests with updates test
|
| | |
|
|\ \
| |/
|/| |
Fix typo that prevented saving of configuration changes from chttpd_node
|
|/ |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if a JWT claim was present, it was validated regardless of
whether it was required.
However, according to the spec [1]:
"all claims that are not understood by implementations MUST be ignored"
which we interpret to mean that we should not attempt to validate
claims we don't require.
With this change, only claims listed in required checks are validated.
[1] https://tools.ietf.org/html/rfc7519#section-4
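For example, with a configuration along these lines (the section and key names follow CouchDB's JWT auth settings, shown here as an illustration), only the listed claims would be validated and any other claim in the token would be ignored:

```ini
[jwt_auth]
; only claims listed here are validated; others are ignored per RFC 7519
required_claims = exp
```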
|
|
|
|
|
| |
The config application depends on couch_log, so include it when
setting up and tearing down tests.
|
| |
|
|
|
|
| |
Since we run `elixir` tests with `-n 1` we can just use `_local`
|
|
|
|
|
|
|
|
| |
The errors seen in #3417 seem to indicate the expiration jobs are interfering
with the couch_jobs tests. To prevent that, stop the expiration_db job
gen_server from starting at all.
Fixes #3417
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It turns out fabric depends on couch_jobs because of the db expiration
module. So when couch_jobs was restarted multiple times per test case it
could have brought down fabric. However, since couch_jobs needs fabric for
transactional stuff, that ended up bringing the couch_jobs app down as well.
To fix it:
* Switch to explicitly starting/stopping fabric and couch_jobs together.
* Break apart the bad_messages* tests to individually test each type of
message, as app restarts in the middle of the tests kept killing fabric and
intermittently killing couch_jobs as well.
* Also make the tests look nicer by re-using the ?TDEF_FE macros from
`fabric2_test`; this way we can avoid the `?_test(begin ... end).` pattern.
* Remove meck:unload since we don't really mock anything in the module.
* No need to spend time cleaning out databases, as we don't really create
that many dbs (just one) and that one gets cleaned out in its own test.
|
| |
|
|\
| |
| | |
Verify correctness with concurrent updates
|
| | |
|
| |
| |
| |
| | |
https://github.com/apache/couchdb-erlfdb/releases/tag/v1.3.0
|
| |
| |
| |
| |
| |
| |
| |
| | |
https://github.com/apache/couchdb-erlfdb/releases/tag/v1.2.9
For:
* ea54b1a Fix erlfdb_database_set_option else case
|
|/
|
|
|
|
|
|
|
| |
The main feature is a fix to use buggify settings on the client
(`enc` update was reverted from an intermediate release)
https://github.com/apache/couchdb-erlfdb/releases/tag/v1.2.7
https://github.com/apache/couchdb-erlfdb/releases/tag/v1.2.8
|
|
|
|
|
|
|
|
| |
Previously, the view indexer used the default retry limit (100) from the
`fdb_tx_options` config section. However, since the batching algorithm relies
on sensing errors and reacting to them, retrying the batch 100 times before
erroring out was not optimal. So the value is lowered to 5 and it's also made
configurable.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) First, as a cleanup, remove DB `Options` from the `init_db/3` call. We always
follow `init_db/3` (sometimes called through
`fabric2_fdb:transactional(DbName, ...)`) with a `create(TxDb, Options)` or
`open(TxDb, Options)` call, where we overrode `Options` anyway. The only time
we don't follow it up with a `create/2` or `open/2` call is when dbs are
deleted, where `Options` wouldn't matter.
2) Add a new `fabric2_fdb:transactional(DbName|Db, TxOptions, Fun)` call which
allows specifying per-transaction TX options in the `TxOptions` arg. The format
of `TxOptions` is `#{option_name_as_atom => integer | binary}`.
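A hedged usage sketch of the new call; the option names and the body of the fun are illustrative only, not necessarily options CouchDB accepts:

```erlang
%% Illustrative only: pass per-transaction options as a map of
%% atom keys to integer or binary values via the new TxOptions arg.
update_with_low_retry_limit(DbName) ->
    TxOptions = #{retry_limit => 5, timeout => 4500},
    fabric2_fdb:transactional(DbName, TxOptions, fun(TxDb) ->
        %% any transactional work; get_info/1 stands in as an example
        fabric2_fdb:get_info(TxDb)
    end).
```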
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Relax isolation level when indexer reads from DB
This patch causes the indexing subsystem to use snapshot isolation when
reading from the database. This reduces commit conflicts and ensures
the index can make progress even in the case of frequently updated docs.
In the pathological case, a document updated in a fast loop can cause
the indexer to stall out entirely when using serializable reads. Each
successful update of the doc will cause the indexer to fail to commit.
The indexer will retry with a new GRV but the same target DbSeq. In the
meantime, our frequently updated document will have advanced beyond
DbSeq and so the indexer will finish without indexing it in that pass.
This process can be repeated ad infinitum and the document will never
actually show up in a view response.
Snapshot reads are safe for this use case precisely because we do have
the _changes feed, and we can always be assured that a concurrent doc
update will show up again later in the feed.
* Bump erlfdb version
Needed to pull in fix for snapshot range reads.
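In erlfdb terms, the relaxed isolation amounts to reading through a snapshot handle instead of the transaction itself; a sketch, with key names illustrative:

```erlang
%% Sketch: snapshot reads don't add read-conflict ranges, so a
%% concurrent writer can't repeatedly invalidate the indexer's commit.
read_changes_snapshot(Tx, StartKey, EndKey) ->
    Ss = erlfdb:snapshot(Tx),
    erlfdb:get_range(Ss, StartKey, EndKey).
```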
|
|
|
|
|
|
|
|
| |
Previously, when an erlfdb error occurred and a recursive call to `update/3` was
made, the result of that call was always matched against `{Mrst, State}`.
However, when the call had finalized and returned the
`couch_eval:release_map_context/1` response, the result would be `ok`, which
would blow up with a badmatch error against `{Mrst, State}`.
|
| |
|
| |
|
|\
| |
| | |
Optimize lookup/3
|
|/
|
|
|
| |
A tidier version of https://github.com/apache/couchdb/pull/3384 that
saves an unnecessary call to collate.
|
|\
| |
| | |
use collate in lookup
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
If one of the provided lookup keys doesn't exist in the ebtree, it can
inadvertently prevent a second lookup key from being found, if the
first key greater than the missing lookup key is equal to the second
lookup key.
|
| |
| |
| |
| |
| |
| | |
These two test cases expose the subtle bug in ebtree:lookup_multi/3
where a key that doesn't exist in the tree can prevent a subsequent
lookup key from matching in the same KV node.
|
| | |
|
|\ \
| |/
| | |
Show process status in active_tasks
|
|/
|
|
|
| |
This allows users to verify that compaction processes are suspended
outside of any configured strict_window.
|
| |
|
|
|
|
|
|
|
| |
chunked (#3360)
Transfer-Encoding: chunked causes the server to wait indefinitely, then issue a
500 error when the client finally hangs up, when PUTing a multipart/related
document + attachments.
This commit fixes that issue by adding proper handling for chunked multipart/related requests.
|