Some rexi and reshard parameters
Issue: https://github.com/apache/couchdb/issues/2457
|
Includes configure changes and Jenkins setting change.
|
Noticed mem3_sync_event_listner tests still fail intermittently. Add a debug
log to it to hopefully find the cause of the failure.
|
Previously any failed node or rexi worker error resulted in requests failing
immediately even though there were available workers to keep handling the
request. This was because the progress check function didn't account for the
fact that partition requests only use a handful of shards which, by design, do
not complete the full ring.
Here we fix both partition info queries and dreyfus search functionality. We
follow the pattern from fabric and pass through a set of "ring options" that
let the progress function know it is dealing with partitions instead of a full
ring.
|
|\
| |
| | |
Adjust way to detect presence of hastings for Ken
|
|/
|
After ken moved from https://github.com/apache/couchdb-ken to
https://github.com/apache/couchdb/tree/master/src/ken, the directory
structure related to ken changed for downstream users, including Cloudant.
Add more ways to detect the presence of hastings so that
Ken can work correctly with geospatial indexes.
|
Adds a new configuration field, `couch_httpd_auth.same_site` which
sets the `SameSite` attribute of the CouchDB auth cookie. If no
value is set (the default), no `SameSite` attribute is added.
Refs #2221
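The behaviour described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Erlang implementation: the function name and the `Path`/`HttpOnly` attributes are assumptions made for the example. Only the "omit `SameSite` when unset" rule comes from the message above.

```python
def auth_cookie_header(name, value, same_site=None):
    """Build a Set-Cookie value; `same_site` mirrors the config field.

    Hypothetical helper: the attribute list other than SameSite is assumed.
    """
    parts = [f"{name}={value}", "Path=/", "HttpOnly"]
    # When no value is configured (the default), add no SameSite attribute.
    if same_site:
        parts.append(f"SameSite={same_site}")
    return "; ".join(parts)
```

With `same_site` left unset the header is unchanged from the previous behaviour, so existing clients should see no difference.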
|
Previously we made sure replication job statistics were preserved when
the jobs were started and stopped by the scheduler. However, if a db
node restarted or user re-created the job, replication stats would be
reset to 0.
Some statistics like `docs_read` and `docs_written` are perhaps not as
critical. However `doc_write_failures` is. That is the indicator that
some replication docs have not replicated to the target. Not
preserving that statistic meant users could perceive there was a data
loss during replication -- data was replicated successfully according
to the replication job with no write failures, user deletes source
database, then some time later notices some of their data is missing.
These statistics were already logged in the checkpoint history and we
just had to initialize a stats object from them when a replication job
starts. In that initialization code we pick the highest values from
either the running scheduler or the checkpointed log. The reason is
that the running stats could be higher if, say, the job was stopped suddenly
and failed to checkpoint but the scheduler retained the data.
Fixes: #2414
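A minimal sketch of the "pick the highest value" initialization, written in Python for illustration only. The real stats object is an Erlang structure; `init_stats` and the dict layout are hypothetical, and the three keys are the statistics named in the message above.

```python
# Statistic names taken from the message above; the dict representation
# is an assumption made for this sketch.
STAT_KEYS = ("docs_read", "docs_written", "doc_write_failures")


def init_stats(scheduler_stats, checkpoint_stats):
    """Return, per statistic, the larger of the two sources (missing = 0)."""
    return {
        key: max(scheduler_stats.get(key, 0), checkpoint_stats.get(key, 0))
        for key in STAT_KEYS
    }
```

Taking the maximum rather than preferring one source covers both cases: a restarted node has only the checkpoint, while a job that failed to checkpoint has fresher numbers in the scheduler.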
|
Previously, if a batch of bulk docs had to be bisected in order to fit a lower max
request size limit on the target, we only counted stats for the second batch.
So it was possible to miss some `doc_write_failures` updates, which
can be perceived as data loss by the customer.
So we use the handy-dandy `sum_stats/2` function to sum the return stats from
both batches and return that.
Issue: https://github.com/apache/couchdb/issues/2414
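The idea can be sketched in Python as below. This is not the replicator code: `flush_batch`, the `send` callback, and measuring the limit as a document count are all assumptions for the example. Only the name `sum_stats/2` and the "sum both halves" behaviour come from the message above.

```python
def sum_stats(a, b):
    """Add two stats dicts key by key, treating missing keys as 0."""
    return {k: a.get(k, 0) + b.get(k, 0) for k in set(a) | set(b)}


def flush_batch(docs, max_docs, send):
    """Send docs, bisecting when the batch exceeds the (assumed) limit.

    `send(docs)` is a hypothetical callback returning a stats dict.
    """
    if len(docs) <= max_docs or len(docs) == 1:
        return send(docs)
    mid = len(docs) // 2
    first = flush_batch(docs[:mid], max_docs, send)
    second = flush_batch(docs[mid:], max_docs, send)
    # Return the sum of both halves, not just the second one.
    return sum_stats(first, second)
```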
|
We've had a number of segfaults in the `make javascript` test suite. The
few times we've been able to get core dumps all appear to indicate
something wrong in the JIT compiler. Disabling the JIT compilers appears
to prevent these segfaults.
|
We now only support OTP 20+, with 19 at a stretch. erlang:now/0
was deprecated in OTP 18, so we can now suppress these warnings:
```
/home/joant/couchdb/src/dreyfus/src/dreyfus_index_updater.erl:62: Warning: erlang:now/0: Deprecated BIF. See the "Time and Time Correction in Erlang" chapter of the ERTS User's Guide for more information.
/home/joant/couchdb/src/dreyfus/src/dreyfus_index_updater.erl:83: Warning: erlang:now/0: Deprecated BIF. See the "Time and Time Correction in Erlang" chapter of the ERTS User's Guide for more information.
```
Also, some unused variables were removed:
```
/home/joant/couchdb/src/couch/src/couch_bt_engine.erl:997: Warning: variable 'NewSeq' is unused
/home/joant/couchdb/src/mem3/src/mem3_rep.erl:752: Warning: variable 'TMap' is unused
/home/joant/couchdb/src/dreyfus/src/dreyfus_httpd.erl:76: Warning: variable 'LimitValue' is unused
/home/joant/couchdb/src/dreyfus/src/dreyfus_util.erl:345: Warning: variable 'Db' is unused
```
PRs to follow in ets_lru, hyper, ibrowse to track the rest of `erlang:now/0`
deprecations.
|
Previously many HTTP requests failed noisily with `function_clause` errors.
Expect some of those failures and handle them better. There are mainly 3 types
of improvements:
1) Error messages are shorter. Instead of `function_clause` with cryptic
internal fun names, return a simple marker like `bulk_docs_failed`.
2) Include the error body if it was returned. HTTP failures besides the error
code may contain useful information in the body to help debug the failure.
3) Do not log or include the stack trace in the message. The error names are
enough to identify the place where they are generated so avoid spamming the
user and the logs with them. This is done by using `{shutdown, Error}` tuples
to bubble up the error to the replication scheduler.
There is a small but related cleanup of removing source and target monitors.
We'd want to handle those errors better, but they are never
triggered since we removed local replication endpoints recently.
Fixes: https://github.com/apache/couchdb/issues/2413
|
| |
|
|\
| |
| | |
Reset a view shard if the signature is wrong
|
| | |
|
|/
|
We encountered a case_clause error when reading the header for a .view
file as the response was {ok, {Sig, nil}} where Sig is neither the
expected sig nor the pre-upgrade sig (though surely the pre-1.2 goop is
not firing anymore).
We now log this specific issue and then proceed as if we found no
valid header.
|
Previously the target was reset only when the whole job started, but not when
the initial copy phase restarted on its own. If that happened, we left the
target around so the retry always failed with the `eexist` error.
Target reset has a check to make sure the shards are not in the global shard
map, in case someone manually added them, for example. If they are found there
the job panics and exits.
|
Since we switched from Travis to Jenkins, let's see how tests run without
retries in the new environment.
For reference, retries were introduced in: https://github.com/apache/couchdb/commit/220462a1dd2d921fc4ecba3488f5fedefb75217f
|
Design doc writes could fail on the target when replicating with non-admin
credentials. Typically the replicator will skip over them and bump the
`doc_write_failures` counter. However, that relies on the POST request
returning a `200 OK` response. If the authentication scheme is implemented such
that the whole request fails if some docs don't have enough permission to be
written, then the replication job ends up crashing with an ugly exception and
gets stuck retrying forever. In order to accommodate that scenario, write _design
docs in their own separate requests, just like we write attachments.
Fixes: #2415
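A rough Python sketch of the batching rule, for illustration only. `plan_requests` and the list-of-lists request plan are hypothetical; the real replicator is written in Erlang. The sketch assumes each doc is a dict with an `_id` and that design docs are recognized by the `_design/` prefix.

```python
def plan_requests(docs):
    """Group regular docs into one bulk request; one request per design doc."""
    regular = [d for d in docs if not d["_id"].startswith("_design/")]
    design = [d for d in docs if d["_id"].startswith("_design/")]
    requests = []
    if regular:
        requests.append(regular)
    # Each design doc goes out alone, so a permission failure on it
    # cannot fail the request carrying the regular docs.
    requests.extend([d] for d in design)
    return requests
```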
|
8e89688 added a syntax error to couchdb.in. This fixes it.
|
- Start couch_log to make sure that couch_log_server proc exists
and write log instead of getting noproc error during test
|
Fixes #2424
|
* fix(#2143): allow env var overrides for js query server config
* Remove incorrect quotation marks from couchdb.cmd.in
Co-authored-by: Joan Touzet <wohali@apache.org>
|
This test has been failing randomly on Jenkins across multiple PRs. This
adds more context to the error that causes the test to fail.
|
Recently we've been seeing the `couchjs` test runner exiting without
displaying a traceback of an error. This logs the exit code of the OS
process to see if that gives any insight into why it's exiting.
|
The previous implementation of Mango execution stats relied on
passing the docs_examined count from each shard to the coordinator
in the view_row record. This failed to collect the count of
documents read which weren't followed by a match (in a given shard).
For example, if an index was scanned but no documents were matched,
the docs_examined would be 0, when it should be equal to the number
of documents in the index.
This commit changes the implementation so that docs examined is passed
only when each shard has completed its index scan.
The work is split into 2 commits to support mixed-version cluster
upgrades - the previous commit adds the message handlers only
so it can be safely rolled out without breaking in-flight requests.
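The change in accounting can be sketched as follows. This is a simplified Python model, not the fabric/mango code: `Coordinator`, `scan_shard`, and the dict-shaped message are hypothetical stand-ins for the `execution_stats` message handling.

```python
class Coordinator:
    """Collects rows and per-shard execution stats (hypothetical model)."""

    def __init__(self):
        self.rows = []
        self.docs_examined = 0

    def handle_row(self, row):
        self.rows.append(row)

    def handle_execution_stats(self, stats):
        self.docs_examined += stats["docs_examined"]


def scan_shard(index_docs, matches, coordinator):
    """Scan one shard, reporting docs examined once the scan completes."""
    examined = 0
    for doc in index_docs:
        examined += 1
        if matches(doc):
            coordinator.handle_row(doc)
    # Sent once per shard, so docs read without a match are still counted.
    coordinator.handle_execution_stats({"docs_examined": examined})
```

A shard that scans its whole index without a single match still reports its full count, which is the case the previous row-based approach missed.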
|
Adds message handlers to mango / all_docs / mrview fabric
to receive an execution_stats message.
|
This PR drops Debian jessie, adds Debian buster, and adds CentOS 8 to the binary platform build matrix on master.
|
Fixes #2404
|
Add config variable chttpd.require_valid_user_except_for_up defaulting
to false.
This will allow various automated health check systems to hit /_up
without having to provide a username/password pair when the
chttpd.require_valid_user config setting is true. Apparently, many
of these health check providers do not even allow supplying creds
for such a purpose...
Closes #823
Co-authored-by: Joan Touzet <wohali@users.noreply.github.com>
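The intended decision logic, sketched in Python for illustration. The function name and argument names are hypothetical; the two booleans stand in for the `chttpd.require_valid_user` and `chttpd.require_valid_user_except_for_up` settings described above.

```python
def requires_auth(path, require_valid_user, except_for_up=False):
    """Return True when a request to `path` must carry valid credentials."""
    if not require_valid_user:
        return False
    # The new setting exempts only the /_up health check endpoint.
    if except_for_up and path == "/_up":
        return False
    return True
```

Because the new setting defaults to false, deployments that already set `require_valid_user` keep their current behaviour until they opt in.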
|
Also lower the default stream_limit to 5 based on the results of
performance testing.
Co-authored-by: Adam Kocoloski <kocolosk@apache.org>
Co-authored-by: Kyle Snavely <kjsnavely@gmail.com>
|
The `batch_doc(Doc)` code was previously used for local endpoints when flushing
docs with attachments. After that code was removed, the `remote_doc_handler/2`
filters out all docs with attachments before they even get to the doc flusher
so batch_doc(Doc) effectively always returns `true`.
|
Co-authored-by: Joan Touzet <wohali@users.noreply.github.com>
|
Adam K noticed that we aren't setting the session cookie correctly which
appears to have made this test fail randomly. Why it's random and not
consistent is currently unknown.
|
This commit:
* Removes Travis CI from the build (no more .travis.yml)
* Moves Jenkinsfile to `build-aux/Jenkinsfile.full` and updates the version
of Erlang used in all steps to `20.3.8.24-1` (Erlang Solutions
packages).
* Introduces a new `build-aux/Jenkinsfile.pr` that just builds
CouchDB against the 3 most important versions of Erlang,
intended for use when building PRs, using a special
`debian-buster-erlang-all` image that has `kerl` builds of those 3
versions of Erlang:
* the oldest supported
* the version we release our binary convenience builds with
* the latest supported
* Builds against SpiderMonkey 60 on the platforms where it is available
(currently, only Debian buster)
* Updated README file with new, dynamic Jenkins embeddable build status
badge
|
Apparently SpiderMonkey 60 changed the behavior of OOM errors to not
exit the VM. This updates the SpiderMonkey 60 implementation to match
that behavior.
|
This test is actually checking the behavior of an OOM in `couchjs` now
since we lifted the OS process timeout limit.
|
This avoids the 1.2s pause between tests to save time during the test
suite. All ported tests are also logged to measure our progress porting
the JS test suite.
|
|\
| |
| | |
Bring IOQ in tree
|