| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
couch_raft.erl is a complete implementation of the raft algorithm, but it
currently only manages an in-memory state machine and log.
Preliminary work is also here to add a new btree inside the `.couch`
files, which will be the real raft log. The intent is that log entries
can be removed from this log and applied to the by_id and by_seq trees
atomically.
The raft log is preserved over compaction in the same manner as local
docs: all entries are slurped into memory and written in one
pass. This should be fine, as the log should stay short; committed
entries can be promptly removed. It's probably not fine for local
docs, though...
Anyway, it's progress and hopefully we're going somewhere cool.
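A purely illustrative sketch of the intended atomic apply step; none of these
names come from the patch, and the entry shape is assumed.
couch_btree:add_remove/3 batches inserts and removals into a single btree
update, so one subsequent header write can commit all three new btree roots
together:
```
%% Illustrative only: move committed raft entries out of the log btree and
%% into the by_id/by_seq btrees in one pass. The caller then writes the db
%% header once, making all three updates durable atomically.
apply_committed(LogBt, IdBt, SeqBt, Committed) ->
    %% Committed :: [{Idx, {DocId, Seq, Body}}] -- an assumed entry shape
    {ok, LogBt1} = couch_btree:add_remove(LogBt, [], [Idx || {Idx, _} <- Committed]),
    IdKVs = [{Id, {Seq, Body}} || {_, {Id, Seq, Body}} <- Committed],
    SeqKVs = [{Seq, {Id, Body}} || {_, {Id, Seq, Body}} <- Committed],
    {ok, IdBt1} = couch_btree:add_remove(IdBt, IdKVs, []),
    {ok, SeqBt1} = couch_btree:add_remove(SeqBt, SeqKVs, []),
    {LogBt1, IdBt1, SeqBt1}.
```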
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"Possible ancestors" are all the leaf revisions with a position less than any
of the missing revisions. Previously, for every leaf revision we potentially
checked every missing revision. And with missing revisions list being sorted
from smallest to largest, that meant often traversing most of the missing
revisions list. In general, it meant performing O(R * M) operations, where R is
the number of leaf revision, and M is the number of missing revisions.
The optimization is instead of comparing against every single missing revision
we can compare only against the highest one. If a leaf revision is less than
highest missing revison, then it is already a possible ancestor, so we can stop
checking. Thus, we can go from a quadratic O(R * M) to an O(R) + O(M)
complexity, which, when merging large conflicting documents could give us a
nice performance boost.
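A minimal sketch of the idea, with assumed shapes (revisions as {Pos, RevId}
tuples, Missing sorted ascending); not the actual couch_key_tree code:
```
%% Sketch only: a leaf is a possible ancestor iff its position is below the
%% highest missing revision, so one comparison per leaf suffices.
possible_ancestors(_Leafs, []) ->
    [];
possible_ancestors(Leafs, Missing) ->
    {MaxPos, _} = lists:last(Missing),  % Missing is sorted ascending
    [{Pos, RevId} || {Pos, RevId} <- Leafs, Pos < MaxPos].
```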
|
|
|
|
| |
Owner indicated the instance has gone away
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These were used to generate the random rev trees used for benchmarking. But
they also assert useful properties for the function, so let's add them to
test_util.
Since we added a shuffle/1 function to test_util, DRY up the extra shuffle/1
functions, as we seem to have a bunch of them sprinkled here and there (a
typical version is sketched below).
Also, since the rev tree generation code was changed, we need to adjust the
`excessive_memory_when_stemming` test as well. To do that, reverted the commit
indicated in the test, adjusted parameters until the test failed, then reset
the revert and confirmed the test passed.
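A common way to write such a shuffle (a sketch, not necessarily the exact
test_util version):
```
%% Decorate-sort-undecorate: tag each element with a random float, sort by
%% the tag, then strip the tags off.
shuffle(List) ->
    Tagged = [{rand:uniform(), Item} || Item <- List],
    [Item || {_Tag, Item} <- lists:sort(Tagged)].
```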
|
|
|
|
|
|
|
|
|
|
| |
Previously we always fully traversed the search keys list when we partitioned
it into possible and impossible sublists. The optimization is that, since the
search keys are sorted, we can stop traversing as soon as we hit a key that is
>= the current position in the tree. To keep things simple we can use the
lists:splitwith/2 function, which seems to do exactly what we want [1].
[1] https://github.com/erlang/otp/blob/40922798411c2d23ee8a99456f96d6637c62b762/lib/stdlib/src/lists.erl#L1426-L1435
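A minimal sketch of the early-exit split, assuming sorted Keys and a Pos bound
taken from the current tree node (names are illustrative):
```
%% lists:splitwith/2 scans only until the predicate first fails, so with
%% sorted Keys nothing past the first Key >= Pos is ever visited.
split_at_pos(Keys, Pos) ->
    lists:splitwith(fun(Key) -> Key < Pos end, Keys).
```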
|
|
|
|
| |
Issue originally discovered by @iilyak using https://comby.dev/
|
|
|
|
|
|
|
|
| |
See dev list discussion:
https://lists.apache.org/thread/x4lc6vhthj1vkt2xpd0ox5osh959qsc4
Previous PR to disable protection on main so we can replace it:
https://github.com/apache/couchdb/pull/4053
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function is used in the hot path of the _revs_diff and _bulk_docs API
calls. Those could always use a bit more optimization:
* In `_revs_diff` it's used when fetching all the FDIs to see which docs are
missing in `couch_btree:lookup/2`.
* In `_bulk_docs` it's used in `fabric_doc_update` when finalizing the
response.
Using erlperf in #4051, we noticed up to a 5x speedup from using a map instead
of a dict. Since a small map is already stored as a compact, proplist-like
flat structure, skip the length guard.
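A minimal sketch of the map-based version (assuming Res is a list of
{Key, Value} pairs; the real couch_util function may differ in details):
```
%% Build a map once, then emit results in the caller's original key order.
reorder_results(Keys, Res) ->
    KeyMap = maps:from_list(Res),
    [maps:get(Key, KeyMap) || Key <- Keys].
```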
Some erlperf examples from #4051:
500 Keys
```
> f(Keys), f(Res), {Keys, Res} = Gen(500), ok.
> erlperf:run(#{runner => {couch_util, reorder_results2, [Keys, Res, 100, dict]}}).
2407
> erlperf:run(#{runner => {couch_util, reorder_results2, [Keys, Res, 100, map]}}).
11639
```
Using a map without the guard, which is the change in this PR:
```
> f(Keys), f(Res), {Keys, Res} = Gen(500), ok.
ok
> erlperf:run(#{runner => {couch_util, reorder_results, [Keys, Res]}}).
12395
> erlperf:run(#{runner => {couch_util, reorder_results, [Keys, Res]}}).
12508
```
As a bonus, this cleans up the code a bit, too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The implementation was duplicated because fabric needed to return a response
([]) for existing revs. It doesn't seem worth it to keep a whole separate copy
just to handle the [] case. Instead, make the primary couchdb implementation
always return [] and just filter it out in the local httpd endpoint handlers.
It seems we didn't have any _revs_diff tests, so add a few basic tests along
the way.
Also, it turns out we still support the _missing_revs API. That's an endpoint
that hasn't been used by the replicator since before the 1.2 days. At some
point it might be worth removing it, but for now ensure we also test it
alongside _revs_diff.
|
|
|
|
|
|
|
|
| |
Previously the fabric_doc_update message handler crashed with a
function_clause error the first time it encountered an all_dbs_active error.
Opt to handle it as a `rexi_EXIT`, with the hope that some workers would still
return a valid result.
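A hedged sketch of the shape of the fix (clause style follows fabric's
conventions, but this is not the verbatim patch):
```
%% Sketch: treat all_dbs_active like any other worker exit instead of
%% crashing on an unmatched function clause.
handle_message({error, all_dbs_active} = Error, Worker, Acc) ->
    handle_message({rexi_EXIT, Error}, Worker, Acc);
handle_message({rexi_EXIT, _Reason}, _Worker, Acc) ->
    %% hypothetical stand-in: the real code drops the worker and keeps
    %% waiting for a quorum of replies from the remaining workers
    {ok, Acc}.
```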
|
| |
|
|
|
|
| |
Comment them out for the time being
|
|
|
|
|
|
|
|
|
|
|
| |
We handle mochiweb's `{shutdown, Error}` exit. However, chttpd itself exits
with a plain `exit(shutdown)`, and in that case, our nested catch statements
[1] will emit an "unknown_error" log line. So, make sure to handle that as well
in order to keep the log noise down.
[1] `handle_req_after_auth/2` is nested inside `process_request/1`
https://github.com/apache/couchdb/blob/3.x/src/chttpd/src/chttpd.erl#L386 and
both call `catch_error/4`
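A hedged sketch of the intent, not chttpd's actual clause layout: bare
shutdown exits pass through quietly instead of being logged as unknown errors.
```
%% Sketch: classify exit reasons before logging; shutdown variants are
%% normal connection teardown, anything else is a genuine unknown error.
log_exit(shutdown) -> ok;
log_exit({shutdown, _Err}) -> ok;
log_exit(Reason) -> couch_log:error("unknown_error: ~p", [Reason]).
```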
|
|\ |
|
|/
|
|
|
|
|
|
| |
We forgot to transform the JOSE signature format to DER when verifying
ES signatures (and the reverse when signing).
Co-authored-by: Nick Vatamaniuc <vatamane@apache.org>
Co-authored-by: Jay Doane <jaydoane@apache.org>
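For context, a purely illustrative conversion of a raw JOSE ECDSA signature
(fixed-width R || S) into the DER form OTP's crypto expects; the actual patch
may differ:
```
-include_lib("public_key/include/public_key.hrl").

%% Illustrative only: split the raw JOSE signature into its R and S halves
%% and DER-encode them as an ECDSA-Sig-Value SEQUENCE.
jose_sig_to_der(RawSig) ->
    Len = byte_size(RawSig) div 2,
    <<R:Len/binary, S:Len/binary>> = RawSig,
    public_key:der_encode('ECDSA-Sig-Value', #'ECDSA-Sig-Value'{
        r = crypto:bytes_to_integer(R),
        s = crypto:bytes_to_integer(S)
    }).
```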
|
| |
|
|
|
|
| |
As reported by the Erlang LSP server
|
|
|
|
|
|
|
|
|
|
|
| |
Removed:
* append_binary_md5/2
* assemble_file_chunk/1
* append_term_md5/2
* append_term_md5/3
* pread_binaries/2
* pread_iolists/2
* append_binaries/2
|
| |
|
|
|
|
|
| |
Emacs + erlang_ls noticed those. There may be more, but I didn't know how to
systematically check all of the files.
|
|\
| |
| | |
Implement resource_hoggers
|
|/ |
|
|
|
|
|
|
|
|
|
| |
It should save some dist bandwidth when workers are canceled at the end of
fabric requests. The feature has been available since 3.0.x (3 years ago), so
opt to enable it by default.
Users who do a rolling upgrade from 2.x could still set it to `false` and
then, after the upgrade completes, delete the setting to return to the
default (true).
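For illustration, assuming the flag in question is rexi's kill_all switch
(the setting name is an assumption, not stated in the message), the runtime
equivalent would be something like:
```
%% Assumed setting name, for illustration only: keep the old behaviour
%% during a rolling upgrade from 2.x...
config:set("rexi", "use_kill_all", "false").
%% ...then delete it after the upgrade to fall back to the new default.
config:delete("rexi", "use_kill_all").
```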
|
|\
| |
| | |
Fix typo
|
| |\
| |/
|/| |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If workers drop out when new_edits=false and the revisions are already
present, the attachment parser will go into the `mp_abort_parse_atts` state
but still keep consuming the request data, draining the uploaded attachment
data. This helps ensure the connection process will be in a consistent state
and ready to process the next HTTP request.
Add a test which PUTs the same attachment multiple times with new_edits=false
and checks that those requests all succeed.
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Assuming a document may have multiple attachments, or even the possibility of
having multiple documents each with a different parser, send
`hello_from_writer` once per worker process as soon as possible in fabric_rpc.
For additional safety (belt and suspenders), assert that the closure is called
from the same process which sent hello and started monitoring the parser, and
if mp_parser_died is caught and the data function is called again, re-throw
the same mp_parser_died error.
|
| |
| |
| |
| |
| |
| | |
The `hello_from_writer` message is only used in fabric, while we still
have unit tests that go through the couch_httpd side of things. We can
afford to be defensive here.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This adds an extra `hello_from_writer` message into the handshake
between the process that reads the multipart attachment data from the
socket and the writer processes (potentially on remote nodes) that
persist the data into each shard file. This ensures that even in the
case where a writer does not end up asking for the data (e.g. because
the revision already exists in the tree), the parser will monitor the
writer and therefore know when the writer has exited.
The patch assumes that the attachment flush function is executed in the same
process as the one initially spawned to handle the fabric_rpc work request.
That's true today, but if it changed in the future it would be a non-obvious
breakage to debug.
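Purely illustrative (not fabric's actual code), the handshake shape looks
roughly like this:
```
%% Writer side: announce ourselves to the parser before we decide whether
%% we will ever request the attachment data.
writer_handshake(Parser, Ref) ->
    Parser ! {hello_from_writer, Ref, self()}.

%% Parser side: monitor every writer that says hello, so a writer exiting
%% without ever asking for data is still noticed via 'DOWN'.
parser_loop(Writers) ->
    receive
        {hello_from_writer, _Ref, WriterPid} ->
            erlang:monitor(process, WriterPid),
            parser_loop([WriterPid | Writers]);
        {'DOWN', _MRef, process, WriterPid, _Reason} ->
            parser_loop(lists:delete(WriterPid, Writers))
    end.
```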
|
| |
| |
| |
| |
| |
| | |
This doesn't actually work as you'd expect. The response for the
new_edits=false request does return quickly, but the request body
is never consumed and so the _next_ request on the wire will hang.
|
|/ |
|
|\
| |
| | |
Fix busy functions to work with pids
|
|/ |
|
|\
| |
| | |
Implement memory_info functions
|
|/ |
|
|
|
|
|
|
| |
With a few channels around, these add up and make local development less
ergonomic. Let's turn them to `debug`; users can always toggle them to info
or error if needed.
|
|
|
|
|
|
|
|
|
| |
`{shutdown, Err}` should come before `{Err, Reason}`, otherwise `{Err,
Reason}` will always match first.
Also, the plugin `handle_error/1` response was forced to match the initial
`Error`, which doesn't always have to be the case, based on:
https://github.com/apache/couchdb/blob/42f2c1c534ed5c210b45ffcd9a621a31b781b5ae/src/chttpd/src/chttpd_plugin.erl#L39-L41
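A sketch of why the order matters (illustrative names): any two-element tuple
matches {Err, Reason}, so the more specific shutdown clause has to come first.
```
%% Correct order: the specific {shutdown, Err} pattern is tried before the
%% catch-all two-tuple clause, which would otherwise swallow it.
classify_error({shutdown, Err}) -> {shutdown, Err};
classify_error({Err, Reason}) -> {Err, Reason};
classify_error(Err) -> {Err, Err}.
```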
|
|
|
|
|
| |
jiffy changelog: https://github.com/davisp/jiffy/compare/1.0.9...1.1.1
b64url changelog: https://github.com/apache/couchdb-b64url/compare/1.0.2...1.0.3
|
| |
|
|\
| |
| | |
Implement print_table/2
|
|/ |
|
|
|
|
|
|
|
|
|
| |
Previously, when handling mismatched responses for replicated changes, where
some responses were `noreply` and some were `forbidden`, the `{DocId, Rev}`
tuple was dropped from the response, which resulted in
`update_doc_result_to_json` failing, as it expects an `{{DocId, Rev}, Error}`
argument instead of just `Error`.
Make sure to keep and return the reply as is in `check_forbidden_msg/1`.
|
|
|
|
| |
https://github.com/apache/couchdb-mochiweb/commit/077b4f801ba8f853a9649a9c0f055b8e4f33dcca
|
|
|
|
| |
Waiting on https://github.com/mochi/mochiweb/pull/242
|
|
|
|
|
|
|
| |
Previously, on error, they exited with reason `normal`; now they exit with
the `{shutdown, Error}` reason.
See: https://github.com/mochi/mochiweb/commit/e56a4dce6b360c5c5d037e8de33dd267790092e4
|
|
|
|
|
|
|
|
|
|
|
| |
* fix badargs for timed out responses
Under heavy load, fabric_update_doc workers will return a timeout via rexi,
so no responses will populate the response dictionary. This leads to badargs
when we do dict:fetch for keys that do not exist. This fix corrects the
behavior by ensuring that each document update must receive a response. If
one document does not have a response, then the entire update returns a 500.
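A hedged illustration of the fix shape (names assumed): dict:fetch/2 raises
badarg on a missing key, while dict:find/2 lets a missing per-document
response be mapped to an explicit error.
```
%% Sketch: look up each doc's reply defensively instead of crashing when a
%% timed-out worker never populated the dict.
doc_reply(Doc, ResponseDict) ->
    case dict:find(Doc, ResponseDict) of
        {ok, Reply} -> Reply;
        error -> {error, internal_server_error}  % surfaces as a 500
    end.
```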
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we didn't check responses from the get_state/2 or await/2
functions when building indices. If an index updater crashed and the index
never finished building, the get_state/2 call would simply return an error
and the process would exit normally. Then, the shard splitting job would
count that as a success and continue to make progress.
To fix that, make sure to check the responses for all the supported indexing
types and wait until they return an `ok` result.
Additionally, increase the index building resilience to allow for more
retries on failure, and for configurable retries for individual index
builders.
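A minimal sketch of the check-and-retry shape (names assumed; not the actual
resharding code):
```
%% Sketch: only an ok result counts as a built index; anything else is
%% retried a configurable number of times before failing the split job.
wait_for_index(BuildFun, Retries) ->
    case catch BuildFun() of
        {ok, _} -> ok;
        _Error when Retries > 0 -> wait_for_index(BuildFun, Retries - 1);
        Error -> exit({index_build_failed, Error})
    end.
```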
|
| |
|
|
|
|
|
| |
If the cookie is undefined then we should not set it, so that it can pick up
the ~/.erlang.cookie if it is there.
|