Implement couch_file:format_status to log filepath
Couch server improvements
The couchdb.update_lru_on_read setting controls whether couch_server
uses read requests as LRU update triggers. Unfortunately, the
update_lru messages for reads are sent regardless of whether the
setting is enabled or disabled. While in principle this is harmless, an
overloaded couch_server pid can accumulate a considerable volume of
these messages, even when the setting is disabled. This patch prevents
the caller from sending an update_lru message when the setting is
disabled.
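A minimal sketch of the caller-side guard (the config helper, default
value and message shape here are illustrative assumptions, not the
exact patch):

    %% Only notify couch_server about a read when update_lru_on_read is
    %% enabled; otherwise skip the cast so an overloaded couch_server
    %% never receives these messages.
    maybe_update_lru(DbName) ->
        case config:get_boolean("couchdb", "update_lru_on_read", true) of
            true -> gen_server:cast(couch_server, {update_lru, DbName});
            false -> ok
        end.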
This adds the read_concurrency option to couch_server's ETS table for
couch_dbs, which contains the references to open database handles. This
is an obvious improvement as all callers opening database pids interact
with this ETS table concurrently. Conversely, the couch_server pid is
the only writer, so there is no need for write_concurrency.
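A minimal sketch of such a table setup (options other than
read_concurrency are illustrative):

    %% Many concurrent readers (any process opening a database) but a
    %% single writer (the owning couch_server process): read_concurrency
    %% helps, write_concurrency would add nothing.
    ets:new(couch_dbs, [set, protected, named_table,
                        {read_concurrency, true}]).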
Off-heap messages are an Erlang 19 feature:
http://erlang.org/doc/man/erlang.html#process_flag_message_queue_data
It is advisable to use that setting for processes which expect to
receive a lot of messages. CouchDB sets it for couch_server,
couch_log_server and a bunch of others as well.
In some cases the off-heap behavior could alter the timing of message
receives and expose subtle bugs that have been lurking in the code for
years, or could slightly reduce performance, so as a safety measure
allow disabling it.
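A minimal sketch of opting a long-lived server process into off-heap
message storage behind a config switch (the config section, key and
default here are assumptions):

    init(_Args) ->
        %% message_queue_data is a standard Erlang/OTP 19+ process flag.
        case config:get_boolean("couchdb", "off_heap_mqd", true) of
            true -> process_flag(message_queue_data, off_heap);
            false -> ok
        end,
        {ok, #{}}.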
It's possible that a busy couch_server and a specific ordering and
timing of events can end up with an open_async message in the mailbox
while a new and unrelated open_async process is spawned. This change
just ensures that if we encounter any old messages in the mailbox we
ignore them.
The underlying issue here is that a delete request clears out the state
in our couch_dbs ets table while not clearing out state in the message
queue. In some fairly specific circumstances this leads to a message in
the mailbox satisfying an ets entry for a newer open_async process.
This change just includes a match on the opener process. Anything
unmatched came before the current open_async request, which means it
should be ignored.
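A sketch of the matching idea (the record fields, message shape and
finish_open helper are assumptions, not the actual couch_server code):

    handle_info({open_result, DbName, Res, Opener}, State) ->
        case ets:lookup(couch_dbs, DbName) of
            [#entry{pid = Opener}] ->
                %% Result came from the opener we currently track.
                {noreply, finish_open(DbName, Res, State)};
            _ ->
                %% Stale result from an earlier opener; ignore it.
                {noreply, State}
        end.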
A rather uncommon bug found in production. Will write more as this is
just for show and tell.
For now this test case just demonstrates the issue that was discovered.
A fix is still being pondered.
If couch_server terminates while there is an active open_async process,
it will throw a function_clause exception because `couch_db:get_pid/1`
will fail due to the `#entry.db` member being undefined. The simple fix
is to just filter those entries out.
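A minimal sketch of that filtering (the record shape and the shutdown
call are assumptions):

    terminate(_Reason, _State) ->
        lists:foreach(
            fun(#entry{db = Db}) when Db =/= undefined ->
                    %% Only fully opened entries have a db handle.
                    exit(couch_db:get_pid(Db), kill);
               (_StillOpening) ->
                    ok
            end,
            ets:tab2list(couch_dbs)).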
Log error when changes forced to rewind to beginning
Create shard files if missing
If, when a database is created, it was not possible to create any of
the shard files, the database cannot be used. All requests return a
"No DB shards could be opened." error.
This commit changes fabric_util:get_db/2 to create the shard file if it
is missing. This is correct because that function has already called
mem3:shards(DbName), which only returns shards if the database exists.
We removed a security call in `do_db_req` to avoid a duplicate
authorization check, and as a result there is now no db validation in
the noop call `/db/_ensure_full_commit`. This makes it always return a
success code, even for missing databases. This fix places the security
check back, directly in the _ensure_full_commit call, and adds eunit
tests for good measure.
Implement convenience `mem3:ping/2` function
Sometimes in operations it is helpful to re-establish the connection
between Erlang nodes. Usually this is achieved by calling
`net_adm:ping/1`. However, the `ping` function provided by OTP uses an
`infinity` timeout, which causes an indefinite hang in some cases. This
PR adds a convenience function to be used instead of `net_adm:ping/1`.
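One plausible shape for such a bounded ping (a sketch, not necessarily
the actual mem3 implementation): ask the remote net_kernel the same
question `net_adm:ping/1` asks, but with a finite timeout.

    ping(Node, Timeout) when is_atom(Node) ->
        %% net_adm:ping/1 makes this call with an infinity timeout;
        %% bounding it lets callers give up instead of hanging.
        try gen_server:call({net_kernel, Node}, {is_auth, node()}, Timeout) of
            yes -> pong;
            _ -> pang
        catch
            _:_ -> pang
        end.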
Improve cleanup_index_files
The previous implementation was based on a search using
{view_index_dir}/.shards/*/{db_name}.[0-9]*_design/mrview/*
This wildcard includes all shards for all indexes of all databases.
This PR changes the search to look at the index directory of a
database.
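A rough sketch of the two lookups using plain filelib:wildcard (paths
and argument types are illustrative; the real code uses the couch_index
directory helpers):

    %% Old approach: a wildcard over the whole .shards tree.
    old_index_files(ViewIndexDir, DbName) when is_list(DbName) ->
        filelib:wildcard(
            filename:join([ViewIndexDir, ".shards", "*",
                           DbName ++ ".[0-9]*_design", "mrview", "*"])).

    %% New approach (sketch): list only the given database's own index
    %% directory.
    new_index_files(DbIndexDir) ->
        filelib:wildcard(filename:join(DbIndexDir, "*")).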
Fix dialyzer warning of shard record construction
- Fix dialyzer warning that the #shard record construction violates
  the declared type in fabric_doc_open_revs.erl,
  cpse_test_purge_replication.erl and other files
Fixes #1580
Improve validation of database creation parameters
Fix make warning from cpse_test_purge_seqs.erl
Fixes #1572
The builtin _sum reduce function has no protection against overflowing
reduce values. Users can emit objects with enough unique keys to cause
the builtin _sum to create objects that are exceedingly large in the
inner nodes of the view B+Tree.
This change adds the same logic that applies to JavaScript reduce
functions to check if a reduce function is properly reducing its input.
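An illustrative sketch of such an overflow check (threshold, names and
error shape are assumptions, not the actual CouchDB logic):

    %% Flag a reduce output that is both large in absolute terms and not
    %% meaningfully smaller than the input it was reduced from.
    check_reduce_overflow(InputKVs, ReducedValue) ->
        InputSize = erlang:external_size(InputKVs),
        OutputSize = erlang:external_size(ReducedValue),
        case OutputSize > 4096 andalso OutputSize * 2 > InputSize of
            true -> {error, reduce_overflow_error};
            false -> ok
        end.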
Previously, as described in issue #1571, the `rexi_server_sup`
supervisor could die and restart. After it restarted, `rexi_server_mon`
would not respawn rexi servers, as it wouldn't notice that
`rexi_server_sup` went away and came back. That would leave the cluster
in a disabled state.
To fix the issue, switch the restart strategy to `rest_for_one`. In
this case, if a child at the top dies it will restart all the children
below it in the list. For example, if `rexi_server` dies, it will
restart all the children. If `rexi_server_sup` dies, it will restart
`rexi_server_mon`, and then on restart `rexi_server_mon` will properly
spawn all the rexi servers.
The same applies to the buffers: if `rexi_buffer_sup` dies, it will
restart `rexi_buffer_mon`, and on restart it will spawn buffers as
expected.
Fixes: #1571
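A minimal sketch of a rest_for_one supervisor with this child ordering
(child specs and start arguments are simplified assumptions, not the
actual rexi supervisor code):

    init([]) ->
        %% With rest_for_one, when a child dies, that child and every
        %% child started after it are restarted; earlier children are
        %% left untouched.
        SupFlags = #{strategy => rest_for_one, intensity => 5, period => 10},
        Children = [
            #{id => rexi_server,     start => {rexi_server, start_link, []}},
            #{id => rexi_server_sup, start => {rexi_server_sup, start_link, []},
              type => supervisor},
            #{id => rexi_server_mon, start => {rexi_server_mon, start_link, []}},
            #{id => rexi_buffer_sup, start => {rexi_buffer_sup, start_link, []},
              type => supervisor},
            #{id => rexi_buffer_mon, start => {rexi_buffer_mon, start_link, []}}
        ],
        {ok, {SupFlags, Children}}.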
apache/COUCHDB-3326-clustered-purge-pr5-implementation
[5/5] Clustered Purge Implementation
The HTTP API for clustered purge is fairly straightforward. It is
designed to match the general shape of the single node API. The only
major caveat here is that the purge sequence is now hardcoded as null,
since the purge sequence would otherwise be an opaque blob similar to
the update_seq blobs.
It's important to note that there is as yet no API for traversing the
history of purge requests in any shape or form, as that would mostly
invalidate the entire purpose of using purge to remove any trace of a
document from a database at the HTTP level. There will still be traces
in individual shard files until all database components have processed
the purge and compaction has run (while allowing for up to
purge_infos_limit requests to remain available in perpetuity).
COUCHDB-3326
Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
This commit implements the clustered API for performing purge requests.
This should be a fairly straightforward change for anyone already
familiar with the general implementation of a fabric coordinator, given
that the purge API is fairly simple.
COUCHDB-3326
Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
Read-repair needs to know which nodes have requested an update to a
local doc so that it can determine whether the update should be
applied. The basic idea here is that we may have gotten an update from
a remote node that has yet to apply a purge request. If the local node
were to apply this update it would effectively undo a successful purge
request.
COUCHDB-3326
Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
This commit implements the internal replication of purge requests. This
part of the anti-entropy process is important for ensuring that shard
copies continue to be eventually consistent even if updates happen to
shards independently due to a network split or other event that
prevents a purge request from successfully reaching a given copy.
The main addition to internal replication is that we both pull and push
purge requests between the source and target shards. The push direction
is obvious given that internal replication is in the push direction
already. Pull isn't quite as obvious but is required so that we don't
push an update that was already purged on the target.
Of note is that internal replication also has to maintain _local doc
checkpoints to prevent compaction from removing old purge requests, or
else shard copies could end up missing purge requests, which would
prevent the shard copies from ever reaching a consistent state.
COUCHDB-3326
Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
This commit updates the mrview secondary index to properly process the
new history of purge requests as well as to store the _local purge
checkpoint doc.
The importance of the _local checkpoint doc is to ensure that compaction
of a database does not remove any purge requests that have not yet been
processed by this secondary index.
COUCHDB-3326
Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
These tests verify that we can successfully upgrade old databases that
have various configurations of purge requests in the legacy format.
COUCHDB-3326
Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
This updates the couch_pse_tests to account for the new purge APIs, as
well as introducing a number of new tests covering the new APIs.
COUCHDB-3326
Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
This commit updates the couch_bt_engine storage engine implementation to
satisfy the newly defined single-node purge APIs. This is accomplished
by storing two new database btrees.
The purge_seq_tree orders purge requests by their purge_seq. This tree
is used to satisfy the fold_purge_infos API for database components to
enumerate the list of purge requests in a defined order.
The second index is the purge_tree which orders purge requests by their
UUID to make for an efficient lookup when filtering replicated purge
requests.
COUCHDB-3326
Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
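A purely illustrative sketch of the dual-indexing idea using plain
Erlang data structures (not the couch_btree-based implementation):

    %% One structure ordered by purge_seq for in-order folds, one keyed
    %% by UUID for cheap membership checks on replicated purge requests.
    add_purge_info(Seq, UUID, PurgeInfo, {BySeq, ByUUID}) ->
        {gb_trees:insert(Seq, PurgeInfo, BySeq),
         maps:put(UUID, PurgeInfo, ByUUID)}.

    seen_purge_request(UUID, {_BySeq, ByUUID}) ->
        maps:is_key(UUID, ByUUID).

    fold_purge_infos(Fun, Acc0, {BySeq, _ByUUID}) ->
        lists:foldl(Fun, Acc0, gb_trees:values(BySeq)).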
This patch updates the single node API implementations for use with the
new clustered purge API. At the single node level the major change is
to store a history of purge requests that can then be consumed by
various other parts of the database system.
The simpler of the major areas to use this new functionality will be
any secondary indices. Rather than checking that only a single purge
request has occurred, each secondary index will store a _local document
referencing its oldest processed purge request. During index updates
each secondary index implementation will process any new purge requests
and update its local doc checkpoint. In this way secondary indexes will
no longer be at risk of being reset when multiple purge requests are
issued against the database.
The two other major areas that will make use of the newly stored purge
request history are both of the anti-entropy mechanisms: read-repair
and internal replication.
Read-repair will use the purge request history to know when a node
should discard updates that have come from a node that has not yet
processed a purge request during internal replication. Otherwise
read-repair would effectively undo any purge replication that happened
"recently".
Internal replication will use the purge request history to be able to
mend any differences between shards. For instance, if a shard is down
when a purge request is issued against a cluster, this process will
pull the purge request and apply it during internal replication.
Similarly, any local purge requests will be applied on the target
before normal internal replication.
COUCHDB-3326
Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
This is the first of a series of commits to implement the new clustered
purge API. Currently purge is a single-node-only API that allows for
removing document revisions (and by extension entire documents)
completely from a database. However, given our anti-entropy measures
this API is extremely difficult to use in a cluster and requires
significant operator intervention.
Along with the operator intervention, this API is inherently unsafe
with regards to accidentally triggering the rebuild of secondary
indices. As such this patch set is aimed at creating a cluster-aware
API that is both easier to use and less likely to cause application
downtime while secondary indices are rebuilt.
There are four major areas that will be covered by this patch set:
1. Single node APIs and behavior changes
2. Cluster aware APIs
3. Anti-entropy updates
4. Cluster HTTP implementation
This patch set is split up into a series of commits to aid review by
other committers and will hopefully allow for a logical and intuitive
progression of implementation rather than landing as a single opaque
commit covering a huge swath of the code base.
COUCHDB-3326
Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
apache/COUCHDB-3326-clustered-purge-pr4-on-compact-plugin
[4/5] Clustered Purge: Add on_compact EPI hook
This trigger allows any storage engine that makes use of compaction to
signal that compaction is starting. This is preparatory work for
clustered indexes so that existing indexes can ensure they have a
clustered purge local doc before compaction runs.
COUCHDB-3326
Co-Authored-By: jiangphcn <jiangph@cn.ibm.com>
apache/COUCHDB-3326-clustered-purge-pr3-refactor-pse-tests
[3/5] Clustered Purge - Rewrite pluggable storage engine tests
It turns out that if any storage engine has to open itself during a
callback, it would end up violating the single-writer guarantee. This
change updates the test suite to use couch_server so that storage
engines are free to reopen themselves however they want.
apache/COUCHDB-3326-clustered-purge-pr2-simplify-mem3-rep
Simplify logic in mem3_rep
Previously there were two separate database references and it was not
clear which was used where. This change reduces them to a single
instance so that the logic is simpler.
apache/COUCHDB-3326-clustered-purge-pr1-misc-cleanup
[1/5] Clustered Purge - Misc Cleanup