Commit message | Author | Age | Files | Lines
* Add a deadline for open messages with a timeout [couch-server-improvements-with-deadline] (Paul J. Davis, 2018-09-10; 1 file, -1/+24)

  There's no point in performing the work for messages from workers that have abandoned waiting for a reply from couch_server. This change adds a deadline value to each timeout open message. If couch_server sees one of these after the deadline has passed it returns an error (although the calling process may no longer be alive) and then continues handling messages.
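The deadline mechanism described above can be sketched roughly as follows; the message shape and function names here are illustrative, not the actual couch_server code:

```erlang
%% Sketch: the caller attaches a deadline to its open request, and
%% couch_server skips the work for requests whose caller has timed out.
open_db(DbName, TimeoutMs) ->
    Deadline = erlang:monotonic_time(millisecond) + TimeoutMs,
    gen_server:call(couch_server, {open, DbName, Deadline}, TimeoutMs).

handle_call({open, DbName, Deadline}, _From, St) ->
    case erlang:monotonic_time(millisecond) > Deadline of
        true ->
            %% The caller's gen_server:call has already timed out, so the
            %% reply goes nowhere; return an error and move on.
            {reply, {error, timeout}, St};
        false ->
            {reply, do_open(DbName), St}
    end.
```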
* Don't send update_lru messages when disabled (Russell Branca, 2018-09-10; 1 file, -3/+8)

  The couchdb.update_lru_on_read setting controls whether couch_server uses read requests as LRU update triggers. Unfortunately, the update_lru messages for reads were sent regardless of whether the setting was enabled. While in principle this is harmless, an overloaded couch_server pid can accumulate a considerable volume of these messages, even when disabled. This patch prevents the caller from sending an update_lru message when the setting is disabled.
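A minimal sketch of the caller-side gate, assuming CouchDB's `config:get_boolean/3` helper; the surrounding function name is hypothetical:

```erlang
%% Sketch: consult the setting before sending, so a disabled
%% update_lru_on_read produces no message traffic at all.
maybe_update_lru(DbName) ->
    case config:get_boolean("couchdb", "update_lru_on_read", true) of
        true -> gen_server:cast(couch_server, {update_lru, DbName});
        false -> ok
    end.
```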
* Add read_concurrency option to couch_server ETS (Russell Branca, 2018-09-10; 1 file, -1/+7)

  This adds the read_concurrency option to couch_server's couch_dbs ETS table, which contains the references to open database handles. This is an obvious improvement as all callers opening database pids interact with this ETS table concurrently. Conversely, the couch_server pid is the only writer, so there is no need for write_concurrency.
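The option itself is a one-line addition to the `ets:new/2` call; the exact option list and record field below are illustrative:

```erlang
%% Sketch: many concurrent readers, but couch_server is the sole writer,
%% so read_concurrency is enabled and write_concurrency is left off.
ets:new(couch_dbs, [set, protected, named_table,
                    {keypos, #entry.name},      % hypothetical key position
                    {read_concurrency, true}])
```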
* Allow disabling off-heap messages (Nick Vatamaniuc, 2018-09-06; 6 files, -12/+17)

  Off-heap messages are an Erlang 19 feature: http://erlang.org/doc/man/erlang.html#process_flag_message_queue_data It is advisable to use that setting for processes which expect to receive a lot of messages. CouchDB sets it for couch_server, couch_log_server and a bunch of others as well. In some cases the off-heap behavior could alter the timing of message receives and expose subtle bugs that have been lurking in the code for years, or it could slightly reduce performance, so as a safety measure allow disabling it.
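A sketch of how such a toggle can be wired into a process's init; the config section and key names here are assumptions, not the actual ones used:

```erlang
%% Sketch: opt into off-heap message storage unless disabled in config.
init(_Args) ->
    case config:get_boolean("couchdb", "off_heap_mqd", true) of
        true  -> process_flag(message_queue_data, off_heap);
        false -> ok
    end,
    {ok, #st{}}.
```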
* Fix couch_server concurrency error (Paul J. Davis, 2018-09-06; 2 files, -11/+21)

  It's possible that a busy couch_server, with a specific ordering and timing of events, can end up with an open_async message in its mailbox while a new and unrelated open_async process is spawned. This change ensures that if we encounter any old messages in the mailbox we ignore them. The underlying issue is that a delete request clears out the state in our couch_dbs ets table while not clearing out state in the message queue. In some fairly specific circumstances this leads to a message in the mailbox satisfying an ets entry for a newer open_async process. The change includes a match on the opener process: anything unmatched came before the current open_async request, which means it should be ignored.
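The match on the opener can be sketched like this; the record and message names are illustrative:

```erlang
%% Sketch: results are tagged with the opener pid; a result whose pid no
%% longer matches the current ets entry is from an old, superseded opener.
handle_info({open_result, Opener, DbName, Res}, St) ->
    case ets:lookup(couch_dbs, DbName) of
        [#entry{pid = Opener}] ->
            %% Current opener: process the result normally.
            handle_open_result(DbName, Res, St);
        _ ->
            %% Stale message from before a delete/reopen cycle; ignore it.
            {noreply, St}
    end.
```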
* Reproduce race condition in couch_server (Paul J. Davis, 2018-09-06; 1 file, -0/+173)

  A rather uncommon bug found in production. Will write more, as this is just for show and tell. For now this test case just demonstrates the issue that was discovered. A fix is still being pondered.
* Fix couch_server:terminate/2 (Paul J. Davis, 2018-09-06; 1 file, -1/+5)

  If couch_server terminates while there is an active open_async process, it will throw a function_clause exception because `couch_db:get_pid/1` will fail due to the `#entry.db` member being undefined. The simple fix is to just filter those out.
* Merge pull request #1568 from cloudant/log-changes-rewind-reasons (Peng Hui Jiang, 2018-09-05; 3 files, -4/+24)

  Log error when changes forced to rewind to beginning
| * Log warning when changes seq rewinds to 0 (Jay Doane, 2018-09-04; 3 files, -4/+24)
* Merge pull request #1591 from apache/create-shards-if-missing (Robert Newson, 2018-08-30; 1 file, -1/+1)

  Create shard files if missing
| * Create shard files if missing [create-shards-if-missing] (Robert Newson, 2018-08-30; 1 file, -1/+1)

  If, when a database is created, it was not possible to create any of the shard files, the database cannot be used. All requests return a "No DB shards could be opened." error. This commit changes fabric_util:get_db/2 to create the shard file if missing. This is correct as that function has already called mem3:shards(DbName), which only returns shards if the database exists.
* Check if db exists in /db/_ensure_full_commit call (#1588) (Eric Avdey, 2018-08-30; 2 files, -1/+28)

  We removed a security call in `do_db_req` to avoid a duplicate authorization check, and as a result there is now no db validation in the noop call `/db/_ensure_full_commit`. This makes it always return a success code, even for missing databases. This fix places the security check back, directly in the _ensure_full_commit call, and adds eunit tests for good measure.
* Merge pull request #1590 from cloudant/add-mem3-ping (iilyak, 2018-08-30; 1 file, -0/+24)

  Implement convenience `mem3:ping/2` function
| * Implement convenience `mem3:ping/2` function (ILYA Khlopotov, 2018-08-30; 1 file, -0/+24)

  Sometimes in operations it is helpful to re-establish the connection between erlang nodes. Usually this is achieved by calling `net_adm:ping/1`. However, the `ping` function provided by OTP uses an `infinity` timeout, which causes an indefinite hang in some cases. This PR adds a convenience function to be used instead of `net_adm:ping/1`.
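A bounded ping can be written in terms of the same `net_kernel` call that `net_adm:ping/1` makes, but with a finite timeout; this is a sketch, not necessarily the exact mem3 implementation:

```erlang
%% Sketch: like net_adm:ping/1, but gives up after Timeout instead of
%% hanging forever on an unresponsive distribution connection.
ping(Node, Timeout) when is_atom(Node) ->
    case catch gen:call({net_kernel, Node}, '$gen_call',
                        {is_auth, node()}, Timeout) of
        {ok, yes} -> pong;
        _ -> pang
    end.
```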
* Merge pull request #1586 from cloudant/improve-cleanup_index_files (iilyak, 2018-08-30; 2 files, -2/+57)

  Improve cleanup_index_files
| * Improve cleanup_index_files (ILYA Khlopotov, 2018-08-30; 2 files, -2/+57)

  The previous implementation was based on a search using {view_index_dir}/.shards/*/{db_name}.[0-9]*_design/mrview/* This wildcard includes all shards for all indexes of all databases. This PR changes the search to look at the index directory of a database.
* Merge pull request #1581 from apache/1580-shard-record-violation (Peng Hui Jiang, 2018-08-28; 1 file, -6/+6)

  Fix dialyzer warning of shard record construction
| * Fix dialyzer warning of shard record construction (jiangph, 2018-08-28; 1 file, -6/+6)

  Fix the dialyzer warning that record construction #shard violates the declared type in fabric_doc_open_revs.erl, cpse_test_purge_replication.erl and other files. Fixes #1580
* Merge pull request #1582 from apache/improve-dbcreate-validation (Robert Newson, 2018-08-27; 2 files, -5/+12)

  Improve validation of database creation parameters
| * Improve validation of database creation parameters [improve-dbcreate-validation] (Robert Newson, 2018-08-27; 2 files, -5/+12)
* Merge pull request #1576 from apache/1573-export-all-for-pse-test (Peng Hui Jiang, 2018-08-24; 1 file, -1/+3)

  Fix make warning from cpse_test_purge_seqs.erl
| * Fix make warning from cpse_test_purge_seqs.erl (jiangph, 2018-08-24; 1 file, -1/+3)

  Fixes #1572
* Fix builtin _sum reduce function (Paul J. Davis, 2018-08-23; 2 files, -1/+121)

  The builtin _sum reduce function has no protection against overflowing reduce values. Users can emit objects with enough unique keys to cause the builtin _sum to create objects that are exceedingly large in the inner nodes of the view B+Tree. This change adds the same logic that applies to JavaScript reduce functions: check whether a reduce function is properly reducing its input.
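The guard is presumably analogous to the reduce_limit heuristic used for JavaScript reduce functions; the exact threshold and error term below are assumptions, for illustration only:

```erlang
%% Sketch: flag a reduce whose output is large in absolute terms and
%% not meaningfully smaller than its input.
check_reduce_size(InputSize, ReducedSize)
        when ReducedSize > 200, ReducedSize * 2 > InputSize ->
    throw({reduce_overflow_error, ReducedSize});
check_reduce_size(_InputSize, _ReducedSize) ->
    ok.
```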
* Switch rexi_sup restart strategy to rest_for_one (Nick Vatamaniuc, 2018-08-22; 1 file, -1/+1)

  Previously, as described in issue #1571, the `rexi_server_sup` supervisor could die and restart. After it restarted, `rexi_server_mon` would not respawn rexi servers, as it wouldn't notice that `rexi_server_sup` went away and came back. That would leave the cluster in a disabled state. To fix the issue, switch the restart strategy to `rest_for_one`. In this case, if a child at the top dies it will restart all the children below it in the list. For example, if `rexi_server` dies, it will restart all the children. If `rexi_server_sup` dies, it will restart `rexi_server_mon`, and on restart `rexi_server_mon` will properly spawn all the rexi servers. The same goes for the buffers: if `rexi_buffer_sup` dies, it will restart `rexi_buffer_mon`, and on restart it will spawn buffers as expected. Fixes: #1571
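The behavior described above corresponds to a supervisor spec along these lines; the child specs are abbreviated and illustrative:

```erlang
%% Sketch: with rest_for_one, a crash of rexi_server_sup also restarts
%% rexi_server_mon (listed after it), which then respawns the rexi
%% servers; likewise for the buffer pair.
init([]) ->
    Children = [
        child(rexi_server),
        child(rexi_server_sup),
        child(rexi_server_mon),
        child(rexi_buffer_sup),
        child(rexi_buffer_mon)
    ],
    {ok, {{rest_for_one, 5, 10}, Children}}.

child(Mod) ->
    {Mod, {Mod, start_link, []}, permanent, 5000, worker, [Mod]}.
```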
* Merge pull request #1370 from apache/COUCHDB-3326-clustered-purge-pr5-implementation (Peng Hui Jiang, 2018-08-22; 49 files, -470/+5284)

  [5/5] Clustered Purge Implementation
| * [10/10] Clustered Purge: Clustered HTTP API [COUCHDB-3326-clustered-purge-pr5-implementation] (Paul J. Davis, 2018-08-22; 4 files, -32/+378)

  The HTTP API for clustered purge is fairly straightforward. It is designed to match the general shape of the single node API. The only major caveat here is that the purge sequence is now hardcoded as null, since the purge sequence would otherwise be an opaque blob similar to the update_seq blobs. It's important to note that there is as yet no API for traversing the history of purge requests in any shape or form, as that would mostly invalidate the entire purpose of using purge to remove any trace of a document from a database at the HTTP level. There will still be traces in individual shard files until all database components have processed the purge and compaction has run (while allowing for up to purge_infos_limit requests to remain available in perpetuity).

  COUCHDB-3326

  Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
  Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [09/10] Clustered Purge: Fabric API (Paul J. Davis, 2018-08-22; 4 files, -16/+638)

  This commit implements the clustered API for performing purge requests. It should be a fairly straightforward change for anyone already familiar with the general implementation of a fabric coordinator, given that the purge API is fairly simple.

  COUCHDB-3326

  Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
  Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [08/10] Clustered Purge: Update read-repair (Paul J. Davis, 2018-08-22; 4 files, -56/+692)

  Read-repair needs to know which nodes have requested an update to a local doc so that it can determine whether the update should be applied. The basic idea here is that we may have gotten an update from a remote node that has yet to apply a purge request. If the local node were to apply this update it would effectively undo a successful purge request.

  COUCHDB-3326

  Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
  Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [07/10] Clustered Purge: Internal replication (Paul J. Davis, 2018-08-22; 6 files, -20/+515)

  This commit implements the internal replication of purge requests. This part of the anti-entropy process is important for ensuring that shard copies continue to be eventually consistent, even if updates happen to shards independently due to a network split or another event that prevents a successful purge request to a given copy. The main addition to internal replication is that we both pull and push purge requests between the source and target shards. The push direction is obvious given that internal replication is in the push direction already. Pull isn't quite as obvious, but is required so that we don't push an update that was already purged on the target. Of note is that internal replication also has to maintain _local doc checkpoints to prevent compaction from removing old purge requests, or else shard copies could end up missing purge requests, which would prevent the shard copies from ever reaching a consistent state.

  COUCHDB-3326

  Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
  Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [06/10] Clustered Purge: Update mrview indexes (Paul J. Davis, 2018-08-22; 10 files, -30/+1022)

  This commit updates the mrview secondary index to properly process the new history of purge requests as well as to store the _local purge checkpoint doc. The importance of the _local checkpoint doc is to ensure that compaction of a database does not remove any purge requests that have not yet been processed by this secondary index.

  COUCHDB-3326

  Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
  Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [05/10] Clustered Purge: Add upgrade tests (jiangph, 2018-08-22; 5 files, -0/+220)

  These test that we can successfully upgrade old databases that have various configurations of purge requests in the legacy format.

  COUCHDB-3326

  Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
  Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [04/10] Clustered Purge: Update couch_pse_tests (Paul J. Davis, 2018-08-22; 7 files, -116/+1057)

  This updates the couch_pse_tests to account for the new purge APIs, and introduces a bunch of new tests covering the new APIs.

  COUCHDB-3326

  Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
  Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [03/10] Clustered Purge: Update couch_bt_engine (Paul J. Davis, 2018-08-22; 4 files, -68/+344)

  This commit updates the couch_bt_engine storage engine implementation to satisfy the newly defined single-node purge APIs. This is accomplished by storing two new database btrees. The purge_seq_tree orders purge requests by their purge_seq. This tree is used to satisfy the fold_purge_infos API for database components to enumerate the list of purge requests in a defined order. The second index is the purge_tree, which orders purge requests by their UUID to make for an efficient lookup when filtering replicated purge requests.

  COUCHDB-3326

  Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
  Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [02/10] Clustered Purge: Update single node APIs (Paul J. Davis, 2018-08-22; 5 files, -99/+284)

  This patch updates the single node API implementations for use with the new clustered purge API. At the single node level the major change is to store a history of purge requests that can then be consumed by various other parts of the database system.

  The simpler of the major areas to use this new functionality will be any secondary indices. Rather than checking that only a single purge request has occurred, each secondary index will store a _local document referencing its oldest processed purge request. During index updates each secondary index implementation will process any new purge requests and update its _local doc checkpoint. In this way secondary indexes will no longer be sensitive to resets when multiple purge requests are issued against the database.

  The two other major areas that will make use of the newly stored purge request history are both of the anti-entropy mechanisms: read-repair and internal replication. Read-repair will use the purge request history to know when a node should discard updates that have come from a node that has not yet processed a purge request during internal replication. Otherwise read-repair would effectively undo any purge replication that happened "recently". Internal replication will use the purge request history to be able to mend any differences between shards. For instance, if a shard is down when a purge request is issued against a cluster, this process will pull the purge request and apply it during internal replication. And similarly, any local purge requests will be applied on the target before normal internal replication.

  COUCHDB-3326

  Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
  Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [01/10] Clustered Purge: Define new purge API (Paul J. Davis, 2018-08-22; 1 file, -33/+134)

  This is the first of a series of commits to implement the new clustered purge API. Currently purge is a single-node-only API that allows for removing document revisions (and by extension entire documents) completely from a database. However, given our anti-entropy measures, this API is extremely difficult to use in a cluster and requires significant operator intervention. Beyond the operator intervention, this API is inherently unsafe with regards to accidentally triggering the rebuild of secondary indices. As such this patch set is aimed at creating a cluster-aware API that is both easier to use and less likely to cause application downtime while secondary indices are rebuilt.

  There are four major areas that will be covered by this patch set:

  1. Single node APIs and behavior changes
  2. Cluster aware APIs
  3. Anti-entropy updates
  4. Cluster HTTP implementation

  This patch set is split up into a series of commits to aid review by other committers, hopefully allowing for a logical and intuitive progression of implementation rather than landing as a single opaque commit covering a huge swath of the code base.

  COUCHDB-3326

  Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
  Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
* Merge pull request #1369 from apache/COUCHDB-3326-clustered-purge-pr4-on-compact-plugin (Peng Hui Jiang, 2018-08-22; 3 files, -1/+37)

  [4/5] Clustered Purge: Add on_compact EPI hook
| * Create new on_compact trigger [COUCHDB-3326-clustered-purge-pr4-on-compact-plugin] (Paul J. Davis, 2018-08-22; 3 files, -1/+37)

  This trigger allows any storage engine that makes use of compaction to notify that compaction is starting. This is preparatory work for clustered indexes, so that existing indexes are allowed to ensure they have a clustered purge local doc before compaction runs.

  COUCHDB-3326

  Co-Authored-By: jiangphcn <jiangph@cn.ibm.com>
* Merge pull request #1368 from apache/COUCHDB-3326-clustered-purge-pr3-refactor-pse-tests (Peng Hui Jiang, 2018-08-22; 21 files, -1735/+1817)

  [3/5] Clustered Purge - Rewrite pluggable storage engine tests
| * Enhance PSE tests with setup/teardown functions [COUCHDB-3326-clustered-purge-pr3-refactor-pse-tests] (Paul J. Davis, 2018-08-21; 11 files, -195/+290)
| * Update to use new couch_pse_tests app (Paul J. Davis, 2018-08-21; 2 files, -1/+2)
| * Update PSE test definitions for new util module (Paul J. Davis, 2018-08-21; 9 files, -140/+140)
| * Rename PSE test modules (Paul J. Davis, 2018-08-21; 10 files, -47/+22)
| * Move PSE tests to their own app (Paul J. Davis, 2018-08-21; 11 files, -0/+20)
| * Rewrite the PSE test suite to use couch_server (Paul J. Davis, 2018-08-21; 11 files, -722/+713)

  It turns out that if any storage engine has to open itself during a callback it would end up violating the guarantee of a single writer. This change to the test suite uses couch_server so that storage engines are now free to reopen themselves as they wish.
* Merge pull request #1570 from apache/COUCHDB-3326-clustered-purge-pr2-simplify-mem3-rep (Peng Hui Jiang, 2018-08-21; 1 file, -10/+9)

  Simplify logic in mem3_rep
| * Simplify logic in mem3_rep [COUCHDB-3326-clustered-purge-pr2-simplify-mem3-rep] (Paul J. Davis, 2018-08-21; 1 file, -10/+9)

  Previously there were two separate database references and it was not clear which was used where. This simplifies things by reducing them to a single instance so that the logic is simpler.
* Merge pull request #1366 from apache/COUCHDB-3326-clustered-purge-pr1-misc-cleanup (Peng Hui Jiang, 2018-08-21; 5 files, -330/+322)

  [1/5] Clustered Purge - Misc Cleanup
| * Update fabric_doc_open eunit tests (Paul J. Davis, 2018-08-21; 1 file, -321/+310)

  Modernize and fix the eunit tests in fabric_doc_open.erl
| * Fix race on couch_db:reopen/1 (Paul J. Davis, 2018-08-21; 1 file, -2/+5)

  This fixes a minor race by opening the database before closing it. This was never found to be an issue in production and was just caught while contemplating the PSE test suite.
| * Fix default security object handling (Paul J. Davis, 2018-08-21; 1 file, -1/+1)

  There's a race where, if a database is opened with a default_security set and it crashes before its first compaction, and it is then reopened after the default_security option has changed, it will pick up the second security option. This change fixes the relatively obscure bug that was only found during testing.