Commit message (author, date, files changed, lines -/+)
* Adjust order of ref for shard record [1580-shard-record-violation] (jiangph, 2018-08-27, 1 file, -1/+1)
|
|       Fixes #1580
* Fix dialyzer warning of shard record construction (jiangph, 2018-08-27, 1 file, -5/+5)
|
|       - Fix dialyzer warning that record construction #shard violates
|         the declared type in fabric_doc_open_revs.erl,
|         cpse_test_purge_replication.erl and other files
|
|       Fixes #1580
* Merge pull request #1576 from apache/1573-export-all-for-pse-test (Peng Hui Jiang, 2018-08-24, 1 file, -1/+3)
|\
| |     Fix make warning from cpse_test_purge_seqs.erl
| * Fix make warning from cpse_test_purge_seqs.erl (jiangph, 2018-08-24, 1 file, -1/+3)
|/
|       Fixes #1572
* Fix builtin _sum reduce function (Paul J. Davis, 2018-08-23, 2 files, -1/+121)
|
|       The builtin _sum reduce function has no protection against
|       overflowing reduce values. Users can emit objects with enough
|       unique keys to cause the builtin _sum to create objects that are
|       exceedingly large in the inner nodes of the view B+Tree.
|
|       This change adds the same logic that applies to JavaScript reduce
|       functions to check whether a reduce function is properly reducing
|       its input.
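
A rough sketch of the kind of overflow guard described above; the module name, thresholds, and error shape here are illustrative assumptions, not the actual couch_query_servers code (which is governed by the query server reduce_limit setting):

    %% Reject a reduction whose output is not meaningfully smaller than
    %% its input, mirroring the JS query server's reduce_limit check.
    -module(reduce_guard).
    -export([check_reduce/2]).

    -define(MIN_SIZE, 4096).  % ignore reductions smaller than this
    -define(MAX_RATIO, 0.5).  % output may be at most half the input size

    check_reduce(KVs, Reduction) ->
        InSize = erlang:external_size(KVs),
        OutSize = erlang:external_size(Reduction),
        case OutSize > ?MIN_SIZE andalso OutSize / InSize > ?MAX_RATIO of
            true ->
                throw({reduce_overflow_error,
                       <<"reduce output must shrink more rapidly">>});
            false ->
                Reduction
        end.
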
* Switch rexi_sup restart strategy to rest_for_one (Nick Vatamaniuc, 2018-08-22, 1 file, -1/+1)
|
|       Previously, as described in issue #1571, the `rexi_server_sup`
|       supervisor could die and restart. After it restarted,
|       `rexi_server_mon` would not respawn rexi servers, as it wouldn't
|       notice that `rexi_server_sup` had gone away and come back. That
|       would leave the cluster in a disabled state.
|
|       To fix the issue, switch the restart strategy to `rest_for_one`.
|       In this case, if a child at the top dies it will restart all the
|       children below it in the list. For example, if `rexi_server`
|       dies, it will restart all the children. If `rexi_server_sup`
|       dies, it will restart `rexi_server_mon`, and on restart
|       `rexi_server_mon` will properly spawn all the rexi servers. The
|       same holds for the buffers: if `rexi_buffer_sup` dies, it will
|       restart `rexi_buffer_mon`, and on restart it will spawn buffers
|       as expected.
|
|       Fixes #1571
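
A minimal sketch of a rest_for_one supervisor with this child ordering; the child specs are simplified for illustration and the actual rexi_sup start arguments differ:

    %% rest_for_one: when a child dies, it and every child started after
    %% it are restarted, so the list order encodes the dependencies the
    %% commit message describes.
    -module(rexi_sup_sketch).
    -behaviour(supervisor).
    -export([start_link/0, init/1]).

    start_link() ->
        supervisor:start_link({local, ?MODULE}, ?MODULE, []).

    init([]) ->
        Children = [
            #{id => rexi_server,
              start => {rexi_server, start_link, []}},
            #{id => rexi_server_sup,
              start => {rexi_server_sup, start_link, []}, type => supervisor},
            #{id => rexi_server_mon,
              start => {rexi_server_mon, start_link, []}},
            #{id => rexi_buffer_sup,
              start => {rexi_buffer_sup, start_link, []}, type => supervisor},
            #{id => rexi_buffer_mon,
              start => {rexi_buffer_mon, start_link, []}}
        ],
        {ok, {#{strategy => rest_for_one, intensity => 5, period => 10},
              Children}}.
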
* Merge pull request #1370 from apache/COUCHDB-3326-clustered-purge-pr5-implementation (Peng Hui Jiang, 2018-08-22, 49 files, -470/+5284)
|\
| |     [5/5] Clustered Purge Implementation
| * [10/10] Clustered Purge: Clustered HTTP API [COUCHDB-3326-clustered-purge-pr5-implementation] (Paul J. Davis, 2018-08-22, 4 files, -32/+378)
| |
| |     The HTTP API for clustered purge is fairly straightforward. It is
| |     designed to match the general shape of the single node API. The
| |     only major caveat is that the purge sequence is now hardcoded as
| |     null, since the purge sequence would otherwise be an opaque blob
| |     similar to the update_seq blobs.
| |
| |     It's important to note that there is as yet no API for traversing
| |     the history of purge requests in any shape or form, as that would
| |     mostly invalidate the entire purpose of using purge to remove any
| |     trace of a document from a database at the HTTP level. There will
| |     still be traces in individual shard files until all database
| |     components have processed the purge and compaction has run (and up
| |     to purge_infos_limit requests remain available in perpetuity).
| |
| |     COUCHDB-3326
| |
| |     Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
| |     Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
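
To illustrate the request shape this describes (the database name, doc id, and revision are made up; the null purge_seq is per the commit message):

    POST /db/_purge HTTP/1.1
    Content-Type: application/json

    {"mydoc": ["3-c50a32451890a3f1c3e423334cc92745"]}

    HTTP/1.1 201 Created

    {"purge_seq": null,
     "purged": {"mydoc": ["3-c50a32451890a3f1c3e423334cc92745"]}}
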
| * [09/10] Clustered Purge: Fabric API (Paul J. Davis, 2018-08-22, 4 files, -16/+638)
| |
| |     This commit implements the clustered API for performing purge
| |     requests. It should be fairly straightforward for anyone already
| |     familiar with the general implementation of a fabric coordinator,
| |     given that the purge API is fairly simple.
| |
| |     COUCHDB-3326
| |
| |     Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
| |     Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [08/10] Clustered Purge: Update read-repair (Paul J. Davis, 2018-08-22, 4 files, -56/+692)
| |
| |     Read-repair needs to know which nodes have requested an update to
| |     a local doc so that it can determine if the update is applied. The
| |     basic idea here is that we may have gotten an update from a remote
| |     node that has yet to apply a purge request. If the local node were
| |     to apply this update it would effectively undo a successful purge
| |     request.
| |
| |     COUCHDB-3326
| |
| |     Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
| |     Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
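
A hedged sketch of that decision; every name here is illustrative, and the real logic lives in fabric's read-repair path:

    %% Accept a remote revision tree only if the sending node has already
    %% processed every purge request this shard knows about; otherwise
    %% applying its "update" could resurrect purged revisions.
    -module(rr_purge_check).
    -export([should_apply_update/3]).

    %% NodePurgeSeqs maps a node to the purge_seq it has processed
    %% (an assumed shape, for illustration).
    should_apply_update(RemoteNode, LocalPurgeSeq, NodePurgeSeqs) ->
        maps:get(RemoteNode, NodePurgeSeqs, 0) >= LocalPurgeSeq.
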
| * [07/10] Clustered Purge: Internal replication (Paul J. Davis, 2018-08-22, 6 files, -20/+515)
| |
| |     This commit implements the internal replication of purge requests.
| |     This part of the anti-entropy process is important for ensuring
| |     that shard copies continue to be eventually consistent, even if
| |     updates happen to shards independently due to a network split or
| |     another event that prevents a successful purge request to a given
| |     copy.
| |
| |     The main addition to internal replication is that we both pull and
| |     push purge requests between the source and target shards. The push
| |     direction is obvious, given that internal replication is in the
| |     push direction already. Pull isn't quite as obvious, but is
| |     required so that we don't push an update that was already purged
| |     on the target.
| |
| |     Of note is that internal replication also has to maintain _local
| |     doc checkpoints to prevent compaction from removing old purge
| |     requests, or else shard copies could end up missing purge requests,
| |     which would prevent the shard copies from ever reaching a
| |     consistent state.
| |
| |     COUCHDB-3326
| |
| |     Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
| |     Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
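
The pull-then-push ordering might look roughly like this; every helper function here is hypothetical, and mem3's actual implementation differs:

    %% Pull purges first so we never push a document update the target
    %% has already purged; then push our own purges; then replicate doc
    %% updates and checkpoint so compaction can drop old purge infos.
    replicate_shard(Source, Target) ->
        TargetPurges = fetch_purge_infos(Target, checkpoint(Source, Target)),
        ok = apply_purge_infos(Source, TargetPurges),
        SourcePurges = fetch_purge_infos(Source, checkpoint(Target, Source)),
        ok = apply_purge_infos(Target, SourcePurges),
        ok = push_doc_updates(Source, Target),
        ok = write_purge_checkpoints(Source, Target).
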
| * [06/10] Clustered Purge: Update mrview indexes (Paul J. Davis, 2018-08-22, 10 files, -30/+1022)
| |
| |     This commit updates the mrview secondary index to properly process
| |     the new history of purge requests as well as to store the _local
| |     purge checkpoint doc.
| |
| |     The importance of the _local checkpoint doc is to ensure that
| |     compaction of a database does not remove any purge requests that
| |     have not yet been processed by this secondary index.
| |
| |     COUCHDB-3326
| |
| |     Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
| |     Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
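
For illustration, such a checkpoint document might look like the following; the _local doc id scheme and field names are assumptions based on later CouchDB releases, and the values are made up:

    _local/purge-mrview-<index signature>:

    {
        "type": "mrview",
        "purge_seq": 42,
        "updated_on": 1534953051,
        "ddoc_id": "_design/myview",
        "signature": "5d10247925f826ae3e00966ec24b7bf6"
    }
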
| * [05/10] Clustered Purge: Add upgrade tests (jiangph, 2018-08-22, 5 files, -0/+220)
| |
| |     These test that we can successfully upgrade old databases that
| |     have various configurations of purge requests in the legacy
| |     format.
| |
| |     COUCHDB-3326
| |
| |     Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
| |     Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [04/10] Clustered Purge: Update couch_pse_tests (Paul J. Davis, 2018-08-22, 7 files, -116/+1057)
| |
| |     This updates the couch_pse_tests to account for the new purge APIs
| |     as well as introduces a bunch of new tests covering the new APIs.
| |
| |     COUCHDB-3326
| |
| |     Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
| |     Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [03/10] Clustered Purge: Update couch_bt_engine (Paul J. Davis, 2018-08-22, 4 files, -68/+344)
| |
| |     This commit updates the couch_bt_engine storage engine
| |     implementation to satisfy the newly defined single-node purge
| |     APIs. This is accomplished by storing two new database btrees.
| |
| |     The purge_seq_tree orders purge requests by their purge_seq. This
| |     tree is used to satisfy the fold_purge_infos API for database
| |     components to enumerate the list of purge requests in a defined
| |     order.
| |
| |     The second index is the purge_tree, which orders purge requests by
| |     their UUID to make for an efficient lookup when filtering
| |     replicated purge requests.
| |
| |     COUCHDB-3326
| |
| |     Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
| |     Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
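
Schematically, with an entry layout that is illustrative rather than the exact on-disk format:

    %% purge_seq_tree : PurgeSeq -> {UUID, DocId, Revs}   (history order)
    %% purge_tree     : UUID     -> PurgeSeq              (dedup lookups)
    %%
    %% Filtering a replicated purge request then reduces to a UUID lookup:
    is_replicated_purge(PurgeTree, UUID) ->
        case couch_btree:lookup(PurgeTree, [UUID]) of
            [{ok, _}] -> true;      % already applied on this shard copy
            [not_found] -> false
        end.
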
| * [02/10] Clustered Purge: Update single node APIs (Paul J. Davis, 2018-08-22, 5 files, -99/+284)
| |
| |     This patch updates the single node API implementations for use
| |     with the new clustered purge API. At the single node level the
| |     major change is to store a history of purge requests that can
| |     then be consumed by various other parts of the database system.
| |
| |     The simpler of the major areas to use this new functionality will
| |     be any secondary indices. Rather than checking that only a single
| |     purge request has occurred, each secondary index will store a
| |     _local document referencing its oldest processed purge request.
| |     During index updates, each secondary index implementation will
| |     process any new purge requests and update its local doc
| |     checkpoint. In this way secondary indexes will no longer be
| |     sensitive to being reset when multiple purge requests are issued
| |     against the database.
| |
| |     The two other major areas that will make use of the newly stored
| |     purge request history are both of the anti-entropy mechanisms:
| |     read-repair and internal replication.
| |
| |     Read-repair will use the purge request history to know when a node
| |     should discard updates that have come from a node that has not yet
| |     processed a purge request during internal replication. Otherwise
| |     read-repair would effectively undo any purge replication that
| |     happened "recently".
| |
| |     Internal replication will use the purge request history to be able
| |     to mend any differences between shards. For instance, if a shard
| |     is down when a purge request is issued against a cluster, this
| |     process will pull the purge request and apply it during internal
| |     replication. Similarly, any local purge requests will be applied
| |     on the target before normal internal replication.
| |
| |     COUCHDB-3326
| |
| |     Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
| |     Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
| * [01/10] Clustered Purge: Define new purge API (Paul J. Davis, 2018-08-22, 1 file, -33/+134)
|/
|       This is the first of a series of commits to implement the new
|       clustered purge API. Currently purge is a single-node only API
|       that allows for removing document revisions (and by extension
|       entire documents) completely from a database. However, given our
|       anti-entropy measures, this API is extremely difficult to use in
|       a cluster and requires significant operator intervention.
|
|       Along with the operator intervention, this API is inherently
|       unsafe with regards to accidentally triggering the rebuild of
|       secondary indices. As such, this patch set is aimed at creating
|       a cluster-aware API that is both easier to use and less likely
|       to cause application downtime while secondary indices are
|       rebuilt.
|
|       There are four major areas that will be covered by this patch
|       set:
|
|       1. Single node APIs and behavior changes
|       2. Cluster aware APIs
|       3. Anti-entropy updates
|       4. Cluster HTTP implementation
|
|       This patch set is split up into a series of commits to aid
|       review by other committers, which will hopefully allow for a
|       logical and intuitive progression of implementation rather than
|       landing as a single opaque commit covering a huge swath of the
|       code base.
|
|       COUCHDB-3326
|
|       Co-authored-by: Mayya Sharipova <mayyas@ca.ibm.com>
|       Co-authored-by: jiangphcn <jiangph@cn.ibm.com>
* Merge pull request #1369 from apache/COUCHDB-3326-clustered-purge-pr4-on-compact-plugin (Peng Hui Jiang, 2018-08-22, 3 files, -1/+37)
|\
| |     [4/5] Clustered Purge: Add on_compact EPI hook
| * Create new on_compact trigger [COUCHDB-3326-clustered-purge-pr4-on-compact-plugin] (Paul J. Davis, 2018-08-22, 3 files, -1/+37)
|/
|       This trigger allows any storage engine that makes use of
|       compaction to notify that compaction is starting. This is
|       preparatory work for clustered indexes, so that existing indexes
|       can ensure they have a clustered purge local doc before
|       compaction runs.
|
|       COUCHDB-3326
|
|       Co-Authored-By: jiangphcn <jiangph@cn.ibm.com>
* Merge pull request #1368 from apache/COUCHDB-3326-clustered-purge-pr3-refactor-pse-tests (Peng Hui Jiang, 2018-08-22, 21 files, -1735/+1817)
|\
| |     [3/5] Clustered Purge - Rewrite pluggable storage engine tests
| * Enhance PSE tests with setup/teardown functions [COUCHDB-3326-clustered-purge-pr3-refactor-pse-tests] (Paul J. Davis, 2018-08-21, 11 files, -195/+290)
| |
| * Update to use new couch_pse_tests app (Paul J. Davis, 2018-08-21, 2 files, -1/+2)
| |
| * Update PSE test definitions for new util module (Paul J. Davis, 2018-08-21, 9 files, -140/+140)
| |
| * Rename PSE test modules (Paul J. Davis, 2018-08-21, 10 files, -47/+22)
| |
| * Move PSE tests to their own app (Paul J. Davis, 2018-08-21, 11 files, -0/+20)
| |
| * Rewrite the PSE test suite to use couch_server (Paul J. Davis, 2018-08-21, 11 files, -722/+713)
|/
|       It turns out that if any storage engine has to open itself
|       during a callback, it would end up violating the guarantee of a
|       single writer. This change updates the test suite to use
|       couch_server so that storage engines are now free to reopen
|       themselves as they see fit.
* Merge pull request #1570 from apache/COUCHDB-3326-clustered-purge-pr2-simplify-mem3-rep (Peng Hui Jiang, 2018-08-21, 1 file, -10/+9)
|\
| |     Simplify logic in mem3_rep
| * Simplify logic in mem3_rep [COUCHDB-3326-clustered-purge-pr2-simplify-mem3-rep] (Paul J. Davis, 2018-08-21, 1 file, -10/+9)
|/
|       Previously there were two separate database references and it
|       was not clear which was used where. This simplifies things by
|       reducing them to a single instance so that the logic is simpler.
* Merge pull request #1366 from apache/COUCHDB-3326-clustered-purge-pr1-misc-cleanup (Peng Hui Jiang, 2018-08-21, 5 files, -330/+322)
|\
| |     [1/5] Clustered Purge - Misc Cleanup
| * Update fabric_doc_open eunit tests (Paul J. Davis, 2018-08-21, 1 file, -321/+310)
| |
| |     Modernize and fix the eunit tests in fabric_doc_open.erl
| * Fix race on couch_db:reopen/1 (Paul J. Davis, 2018-08-21, 1 file, -2/+5)
| |
| |     This fixes a minor race by opening the new database handle before
| |     closing the old one. This was never found to be an issue in
| |     production and was just caught while contemplating the PSE test
| |     suite.
| * Fix default security object handling (Paul J. Davis, 2018-08-21, 1 file, -1/+1)
| |
| |     There's a race where, if a database is opened with a
| |     default_security set and it crashes before its first compaction,
| |     and is then reopened after the default_security option has
| |     changed, it will pick up the second security option. This change
| |     fixes that relatively obscure bug, which was only found during
| |     testing.
| * Fix bug during purge (Paul J. Davis, 2018-08-21, 1 file, -1/+1)
| |
| * Fix typos in couch_db_engine.erl (Paul J. Davis, 2018-08-21, 1 file, -5/+5)
|/
* Merge pull request #1543 from cloudant/implement-node-restart (iilyak, 2018-08-16, 6 files, -31/+29)
|\
| |     Add `POST /_node/$node/_restart` endpoint
| * Remove no longer needed handle_restart_req handler (ILYA Khlopotov, 2018-08-15, 1 file, -22/+1)
| |
| * Remove _restart endpoint from non-clustered interface (ILYA Khlopotov, 2018-08-15, 1 file, -1/+0)
| |
| * Remove special handling of 'restart' from 'test/javascript/run' (ILYA Khlopotov, 2018-08-15, 1 file, -5/+1)
| |
| * Use "/_node/<node>/_restart" from JavaScript testsILYA Khlopotov2018-08-151-1/+10
| |
| * Calculate uptime since application start instead of a beam start (ILYA Khlopotov, 2018-08-15, 2 files, -2/+11)
| |
| * Add `POST /_node/$node/_restart` endpoint (ILYA Khlopotov, 2018-08-15, 1 file, -0/+6)
|/
|       We need to be able to restart CouchDB from the integration test
|       suite. We used to have this feature, but it was removed. This PR
|       brings the functionality back under `/_node/$node/_restart`.
* Reduce size of #leaf.atts keys (Nick Vatamaniuc, 2018-08-15, 1 file, -2/+8)
|
|       The `#leaf.atts` data structure is a
|       `[{Position, AttachmentLength}, ...]` proplist which keeps track
|       of attachment lengths and is used when calculating the external
|       data size of documents.
|
|       `Position` is supposed to uniquely identify an attachment in a
|       file stream. Initially it was just an integer file offset. Then,
|       after some refactoring work, it became a list of
|       `{Position, Size}` tuples. During the PSE work, streams were
|       abstracted such that each engine can supply its own stream
|       implementation. The position in the stream then became a tuple
|       that looks like
|       `{couch_bt_engine_stream,{<0.1922.0>,[{4267,21}]}}`, and this was
|       written to the file in the `#leaf.atts` data structure. While
|       still correct, it is unnecessarily verbose, wasting around 100
|       bytes per attachment, per leaf.
|
|       To fix it, use the disk-serialized version of the stream position
|       as returned from `couch_stream:to_disk_term`. In the case of the
|       default CouchDB engine implementation, this should avoid writing
|       the module name and the pid value for each attachment entry.
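
A hedged sketch of the change; couch_stream:to_disk_term/1 is the real call, but the {ok, Term} return shape and the example values shown are assumptions:

    %% Store the compact disk term rather than the full stream tuple:
    %%   before: {couch_bt_engine_stream, {<0.1922.0>, [{4267, 21}]}}
    %%   after:  [{4267, 21}]   (no module name, no pid)
    att_disk_position(Stream) ->
        {ok, DiskTerm} = couch_stream:to_disk_term(Stream),
        DiskTerm.
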
* Merge pull request #1486 from cloudant/fix-doc-update-case-clause (iilyak, 2018-08-15, 2 files, -2/+59)
|\
| |     Expose document update errors to client
| * Expose document update errors to client (Jay Doane, 2018-08-13, 2 files, -2/+59)
|/
|       Currently, errors resulting from race conditions during document
|       updates don't get handled correctly, resulting in a case clause
|       error and a 500 being sent to the client. This change instead
|       allows errors which occur in the race to sync a doc update
|       across the cluster to be exposed to the client.
* Fix session based replicator auth when endpoints have require_valid_user set (Nick Vatamaniuc, 2018-08-13, 1 file, -12/+78)
|
|       If the _session response is a 401 and the WWW-Authenticate header
|       is set, assume the endpoint has require_valid_user set. Remember
|       that in the state and retry initialization again. If it succeeds,
|       keep sending basic auth credentials with every subsequent
|       _session request.
|
|       Since session auth uses the replicator worker pool, it needs to
|       handle worker cleanup properly just like the
|       couch_replicator_httpc module does. If response headers indicate
|       the connection will be closed, don't recycle it back to the
|       pool; otherwise, during an immediate retry there would be a
|       connection_closing error. Instead, follow what the server
|       indicated: stop the worker, then release it to the pool. The
|       pool already knows how to handle dead worker processes. This is
|       needed with this commit because we now have a pattern of an
|       immediate retry after an auth failure.
|
|       Fixes #1550
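
The 401 handling could be pictured like this; names are simplified from couch_replicator_auth_session and the record shape is assumed:

    -record(state, {require_valid_user = false}).

    %% If a _session request comes back 401 with WWW-Authenticate set,
    %% assume require_valid_user and retry, sending basic auth creds on
    %% every subsequent _session request.
    handle_session_response(Code, Headers, #state{} = State) ->
        case {Code, lists:keyfind("WWW-Authenticate", 1, Headers)} of
            {401, {_, _}} when not State#state.require_valid_user ->
                {retry, State#state{require_valid_user = true}};
            {200, _} ->
                {ok, State};
            {_, _} ->
                {error, {session_request_failed, Code}}
        end.
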
* Merge pull request #1553 from apache/ref-match-1544 (Robert Newson, 2018-08-10, 1 file, -2/+2)
|\
| |     Ensure we only receive the correct DOWN message
| * Ensure we only receive the correct DOWN message (Robert Newson, 2018-08-10, 1 file, -2/+2)
|/
|       Relates to issue #1544.
* Move mango selector matching to the shard level (Garren Smith, 2018-08-08, 4 files, -20/+156)
|
|       This moves the Mango selector matching down to the shard level.
|       This means that a document is retrieved from the index and
|       matched against the selector before being sent to the
|       coordinator node, which reduces the network traffic for a Mango
|       query.
|
|       Co-authored-by: Paul J. Davis <paul.joseph.davis@gmail.com>
|       Co-authored-by: Garren Smith <garren.smith@gmail.com>
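
In sketch form; mango_selector:match/2 is the real matcher, while the surrounding shape and function name are illustrative:

    %% Run the selector against each candidate doc on the shard, so only
    %% matching rows ever cross the network to the coordinator.
    shard_level_filter(Doc, Selector) ->
        case mango_selector:match(Selector, Doc) of
            true -> rexi:reply({row, Doc});
            false -> ok   % non-matching docs are dropped at the shard
        end.
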
* Add rexi ping message (Garren Smith, 2018-08-08, 2 files, -0/+11)
|
|       Add a ping message to rexi to keep long running operations from
|       timing out. Long running operations at the node level can exceed
|       the fabric timeout and be cancelled. Sending a ping message back
|       stops that from happening.
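
Usage might look like this; rexi:ping/0 is the call this commit adds, while the fold itself is illustrative (and pings once per row for brevity, where real code would throttle):

    %% A long-running shard-local fold that periodically tells the
    %% coordinator it is still alive, so fabric's timeout doesn't fire.
    slow_fold(Db, UserFun, Acc0) ->
        FoldFun = fun(Row, Acc) ->
            rexi:ping(),  % resets the coordinator's inactivity timer
            UserFun(Row, Acc)
        end,
        couch_db:fold_docs(Db, FoldFun, Acc0, []).
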
* Merge pull request #1432 from cloudant/support-callback-module-data-provider (iilyak, 2018-08-08, 4 files, -14/+104)
|\
| |     Support callback module data provider