Commit message (Author, Age, Files, Lines)
* Fix initial view compaction task status [fix-view-compactor-task-status] (Paul J. Davis, 2018-10-22, 1 file, -1/+3)
|   This is a minor consistency issue. Currently when a view compaction starts it doesn't include the `total_changes` and `changes_done` fields until the first task status update. This is easy to miss as every other task type includes those fields from the initial task definition. The obvious trivial fix is both obvious and trivial.
* couchjs: show default runtime SIZE limit on help message (Iblis Lin, 2018-10-20, 1 file, -0/+1)
|
* Update snappy dependency to CouchDB-1.0.2 (Nick Vatamaniuc, 2018-10-18, 1 file, -1/+1)
|   This fixes a memory bug: https://github.com/apache/couchdb-snappy/commit/2038ad13b1d6926468f25adea110028e3c0b4b0c
* Merge pull request #1655 from cloudant/fix-lru_opts (iilyak, 2018-10-18, 1 file, -18/+15)
|\
| |   Fix ets_lru configuration in chttpd application
| * Fix ets_lru configuration in chttpd application (ILYA Khlopotov, 2018-10-18, 1 file, -18/+15)
|/
|   The code was incorrect in that it used an `is_integer` guard, while `config:get` cannot return an integer.
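The guard problem described above can be modelled outside Erlang. The following is a minimal Python sketch, assuming only what the commit message states: a configuration lookup hands back text (or nothing), so a type check for an integer never succeeds and the value has to be parsed instead. The helper name `parse_int_setting` and the sample values are hypothetical and are not taken from the chttpd code.

```python
def parse_int_setting(raw, default):
    """Turn a string-valued setting into an int, falling back to `default`.

    `raw` models what a config lookup returns: a string, or None when the
    key is not set. (Hypothetical helper, for illustration only.)
    """
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        # Unparseable text is treated like a missing setting.
        return default


# A type check on the raw value can never succeed, because it is a string.
raw_value = "1024"
assert not isinstance(raw_value, int)

print(parse_int_setting(raw_value, 500))
print(parse_int_setting(None, 500))
```

Parsing with a fallback keeps a misconfigured value from silently selecting the wrong branch, which is the failure mode the commit message points at.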
* Avoid crashing if a mango query is reduced (Nick Vatamaniuc, 2018-10-18, 1 file, -4/+4)
|   Previously returning null from mango native proc led to a case clause error in couch_query_servers. Instead return a proper shape but with null results for each reduction.
* Improve restart resilience of couch_log application (Nick Vatamaniuc, 2018-10-18, 1 file, -1/+1)
|   Previously it was too easy to crash the whole node when any of couch_log's children restarted. To improve resiliency, let couch_log application restart a few more times before taking down the whole node with it.
* Do not crash couch_log application when gen_* servers send extra args (Nick Vatamaniuc, 2018-10-18, 2 files, -23/+112)
|   gen_server, gen_fsm and gen_statem might send extra args when terminating. This is a recent behavior and not handling these extra args could lead to couch_log application crashing and taking down the whole VM with it.
|
|   There are two improvements to fix the issue:
|
|   1) Handle the extra args. Format them and log as they might have useful information included.
|
|   2) Wrap the whole `format` function in a `try ... catch` statement. This will avoid any other cases where the logger itself is crashing when attempting to format error events.
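The two improvements listed above (tolerate extra arguments, and wrap the formatter so it cannot crash) can be sketched as follows. This is a Python model under stated assumptions, not the couch_log code: the event shape `(level, message, *extra)` and the names `format_event` and `safe_format` are hypothetical.

```python
def format_event(event):
    """Hypothetical formatter: expects (level, message, *extra) tuples."""
    level, message, *extra = event
    text = f"[{level}] {message}"
    if extra:
        # Improvement 1: keep the extra args, they may carry useful detail.
        text += f" extra={extra!r}"
    return text


def safe_format(event):
    """Improvement 2: never let a formatting failure escape the logger."""
    try:
        return format_event(event)
    except Exception as exc:
        return f"[error] could not format event {event!r}: {exc!r}"


print(safe_format(("info", "started")))
print(safe_format(("error", "terminated", "state", 42)))
print(safe_format(None))
```

The fallback branch still produces a log line, so an unexpected event shape costs some detail rather than taking the logger down.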
* Merge pull request #1660 from apache/fix-upgrade_v5_test (Peng Hui Jiang, 2018-10-17, 1 file, -2/+2)
|\
| |   Fix test failure on upgrade_v5_test
| * Fix test failure on upgrade_v5_test (jiangph, 2018-10-17, 1 file, -2/+2)
|/
|   COUCHDB-3326
* Merge pull request #1657 from apache/COUCHDB-3326-upgrade-users-db (Peng Hui Jiang, 2018-10-17, 2 files, -3/+20)
|\
| |   Upgrade disk version to 7/latest for databases generated prior to clustered purge builds
| * Upgrade disk version to 7 for databases [COUCHDB-3326-upgrade-users-db] (jiangph, 2018-10-16, 2 files, -3/+20)
|/
|   - for databases generated before this code base, the disk version needs to be upgraded to 7 or higher so that it can match the db_header with purge_tree and purge_seq_tree
|
|   COUCHDB-3326
* Merge pull request #1654 from apache/fix_exceed_limit (Robert Newson, 2018-10-15, 1 file, -12/+12)
|\
| |   Test correct condition for exceed_limit error
| * Test correct condition for exceed_limit error [fix_exceed_limit] (Robert Newson, 2018-10-15, 1 file, -12/+12)
|/
|   Previously we were testing if Pos + TotalBytes exceeded the pread limit. This is the wrong logic entirely. We are trying to prevent an attempted call to file:pread/3 where the third parameter, the number of bytes to read, is a very large number (due to a corruption elsewhere, say). Instead, with the old check, we throw exceed_limit as soon as a file gets above a certain size.
|
|   I switched this to an if statement to make it clear that the "read past EOF" and "try to read too many bytes" checks are quite distinct from each other.
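The difference between the old and the corrected condition can be shown with a small model. This is a Python sketch, assuming the fix compares only the requested byte count against the limit, as the message describes; the function names, the order of the two checks, and the numbers are hypothetical and do not come from couch_file.

```python
class ExceedLimit(Exception):
    """Models the exceed_limit error named in the commit message."""


def old_check_exceeds(pos, num_bytes, pread_limit):
    # Old (wrong) condition: depends on the offset, so any read near the
    # end of a large enough file is rejected.
    return pos + num_bytes > pread_limit


def check_read(pos, num_bytes, eof, pread_limit):
    """Return num_bytes if the read is allowed, otherwise raise."""
    if pos + num_bytes > eof:
        # "Read past EOF" check.
        raise EOFError("read past EOF")
    if num_bytes > pread_limit:
        # "Try to read too many bytes" check: only the size matters.
        raise ExceedLimit(num_bytes)
    return num_bytes


limit = 10**6
# A small read far into a large file: rejected by the old test, fine now.
print(old_check_exceeds(10**9, 100, limit))
print(check_read(10**9, 100, eof=10**9 + 200, pread_limit=limit))
```

Keeping the two conditions as separate branches mirrors the intent stated above: one guards the file boundary, the other guards against an implausibly large read request.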
* Merge pull request #1649 from apache/COUCHDB-3326-metrics-docs-purges (Peng Hui Jiang, 2018-10-12, 2 files, -2/+14)
|\
| |   Add document_purges counter for stats
| * Add document_purges counter for stats [COUCHDB-3326-metrics-docs-purges] (jiangph, 2018-10-12, 2 files, -2/+14)
|/
|   COUCHDB-3326
* Merge pull request #1652 from cloudant/restrict-active_tasks-to-server-admin (Eric Avdey, 2018-10-11, 1 file, -0/+1)
|\
| |   Restrict access to `_active_tasks` to server admin
| * Restrict access to _active_tasks to server admin (Eric Avdey, 2018-10-11, 1 file, -0/+1)
|/
* Merge pull request #1650 from apache/bulk_get_users_db (Robert Newson, 2018-10-11, 2 files, -2/+4)
|\
| |   Pass user_ctx in _bulk_get
| * Pass user_ctx in _bulk_get (Robert Newson, 2018-10-11, 2 files, -2/+4)
|/
|   This fixes _bulk_get for _users db and probably others I don't know
* Merge pull request #1647 from cloudant/validate-prefix-for-systemdbs (iilyak, 2018-10-10, 1 file, -3/+12)
|\
| |   Validate database prefix against DBNAME_REGEX for system dbs
| * Validate database prefix against DBNAME_REGEX for system dbs (ILYA Khlopotov, 2018-10-10, 1 file, -3/+12)
|/
|   Previously we only checked that the suffix of the database name matched one of the predefined system databases. We really should check the prefix against DBNAME_REGEX to prevent creation of illegally named databases.
|
|   This fixes #1644
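The prefix validation described above can be sketched as follows. This is a Python model under explicit assumptions: the pattern bound to `DBNAME_RE` is an illustrative stand-in and not the DBNAME_REGEX macro itself, `SYSTEM_DBS` is an illustrative subset, and `is_valid_system_db` is a hypothetical name.

```python
import re

# Assumed, illustrative pattern: lowercase start, then a restricted set.
DBNAME_RE = re.compile(r"^[a-z][a-z0-9_$()+\-/]*$")

# Illustrative subset of system database names.
SYSTEM_DBS = {"_users", "_replicator"}


def is_valid_system_db(name):
    """Accept `prefix/suffix` only if both halves are acceptable."""
    prefix, sep, suffix = name.rpartition("/")
    if suffix not in SYSTEM_DBS:
        return False
    if not sep:
        # Bare system database name such as "_users": no prefix to check.
        return True
    # The step the commit adds: the prefix must itself be a legal name.
    return DBNAME_RE.match(prefix) is not None


print(is_valid_system_db("team/_users"))
print(is_valid_system_db("Bad Name!/_users"))
```

Checking only the suffix would accept the second name; validating the prefix as well rejects it.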
* Update rebar.config.script and travis CI (Paul J. Davis, 2018-10-04, 2 files, -2/+3)
|   Fixes #1396
* Fix use of process_info(Pid, monitored_by) (Paul J. Davis, 2018-10-04, 3 files, -8/+12)
|   This can now return references that are from NIFs monitoring the process. This is important for the new file IO NIFs that monitor the controlling process. For now we'll just take the easy way out by filtering the references from our returned monitor lists.
|
|   Fixes #1396
* Fix couch_log eunit tests (Paul J. Davis, 2018-10-04, 3 files, -6/+11)
|   Fixes #1396
* Enable parameterized module calls (Paul J. Davis, 2018-10-04, 18 files, -0/+43)
|   This is a temporary bandaid to allow us to continue using parameterized modules with Erlang 21. We'll have to go back and modify every one of these files to avoid that as well as figuring out how to upgrade mochiweb to something that doesn't use parameterized modules by the time they are fully removed from Erlang.
|
|   Fixes #1396
* Do not crash replicator on VDU function failure (Nick Vatamaniuc, 2018-10-02, 1 file, -1/+57)
|   Previously a user could insert a VDU function into one of the _replicator databases such that it prevents the replicator application from updating documents in that db. Replicator application would then crash and prevent replications from running on the whole cluster.
|
|   To avoid crashing the replicator when saving documents, log the error and return `{ok, forbidden}`. The return might seem odd but we are asserting that forbidden is an OK value in this context and explicitly handling it. This shape of the return also conforms to the expected `{ok, _Rev}` result, noticing that `_Rev` is never actually used.
* add test for 1612 (ermouth, 2018-10-02, 1 file, -4/+12)
|
* js rewrite send body (ermouth, 2018-10-02, 1 file, -1/+8)
|   Fixes #1612
* Make sure to start per-node rexi servers right away (Nick Vatamaniuc, 2018-10-02, 1 file, -0/+2)
|   This ensures they will be ready to process requests as soon as the application starts up. This should make the service available sooner and should help tests which set up and tear down the services repeatedly, where it would avoid an annoying retry-until-ready loop.
|
|   Per-node servers/buffers are started in the init method of the monitors. There is no chance of deadlock there because per-node supervisors are started before the monitors.
|
|   Issue #1625
* Switch rexi server_per_node to true by default (Nick Vatamaniuc, 2018-10-02, 2 files, -3/+3)
|   This has been solid for years and when not enabled can be a performance bottleneck.
|
|   Fixes #1625
* Avoid restarting /_replicate jobs if parameters haven't changed (Nick Vatamaniuc, 2018-10-02, 3 files, -55/+98)
|   This used to be the case before the scheduling replicator:
|
|   https://github.com/apache/couchdb-couch-replicator/blob/master/src/couch_replicator.erl#L166
|
|   This is also how replications backed by a document in a _replicator db behave:
|
|   https://github.com/apache/couchdb/blob/master/src/couch_replicator/src/couch_replicator_doc_processor.erl#L283
* Improve connection cleanup in replication connection pool (Nick Vatamaniuc, 2018-10-02, 1 file, -1/+5)
|   Previously when an owner process crashed before it had a chance to release the worker to the pool, the worker entry was simply deleted. In some cases that was ok because ibrowse's inactivity timeout would kick in and connection would stop itself. In other cases, as observed in practice with _changes feed connection over TLS protocol, inactivity timeout would never fire, so these deleted connections would slowly accumulate leaking memory and filling the process table. TLS connection would keep an associated session open as well making things even worse.
|
|   To prevent the connection leak, explicitly unlink and kill the worker.
* Explicit Python version in scripts (Adrien Vergé, 2018-09-27, 4 files, -4/+4)
|   Recent Linux distributions start defaulting to Python 3, and require ambiguous scripts to be more explicit. For example building for Fedora 30 (not released yet) fails with:
|
|       ERROR: ambiguous python shebang in /opt/couchdb/bin/couchup:
|       #!/usr/bin/env python. Change it to python3 (or python2) explicitly.
|
|   So this commit changes the four Python scripts to use `python2`.
|
|   Note: They seem to be Python-3-compatible, but I couldn't be sure. If you know they are, please tell me, I'll change it to `python3`.
* Mango match doc on co-ordinating node (#1609) (garren smith, 2018-09-21, 1 file, -10/+72)
|   * Mango match doc on co-ordinating node
|
|   This fixes an issue seen during a rolling upgrade of a CouchDB cluster that adds commit a6bc72e: the nodes that were not upgraded yet would send through all the docs in the index, and those would be passed through to the user because the co-ordinator would assume they had been matched at the node level.
|
|   This adds a check to see whether a doc has been matched at the node level, and then performs a match if required.
* Ignore local nodes in read repair filtering (Nick Vatamaniuc, 2018-09-19, 2 files, -3/+26)
|   Previously local node revisions were causing `badmatch` failures in read repair filter. Node sequences already filtered out local nodes while NodeRevs didn't, so during matching `{Node, NodeSeq} = lists:keyfind(Node, 1, NodeSeqs)` Node would not be found in the list and crash.
|
|   Example of crash:
|
|   ```
|   fabric_rpc:update_docs/3 error:{badmatch,false}
|   [{fabric_rpc,'-read_repair_filter/3-fun-1-',4,[{file,"src/fabric_rpc.erl"},{line,360}]},
|   ```
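The mismatch described above (one list already filtered, the other not) can be modelled with a lookup that fails for entries missing from the filtered list. This is a Python sketch, not the fabric_rpc code: the data shapes, node names, and the function `pair_revs_with_seqs` are hypothetical, and a `KeyError` stands in for the Erlang `badmatch`.

```python
def pair_revs_with_seqs(node_revs, node_seqs):
    """Pair each node's revisions with its sequence.

    Nodes that have no entry in node_seqs are skipped rather than looked
    up, which is the Python analogue of ignoring them before the match.
    """
    seqs = dict(node_seqs)
    return [
        (node, revs, seqs[node])
        for node, revs in node_revs
        if node in seqs
    ]


node_seqs = [("node1", 10), ("node2", 12)]  # "local" already filtered out
node_revs = [("node1", ["1-a"]), ("local", ["1-b"]), ("node2", ["2-c"])]

print(pair_revs_with_seqs(node_revs, node_seqs))
```

Without the `if node in seqs` filter, the lookup for `"local"` would raise, just as the unfiltered match crashed in the report above.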
* Merge pull request #1601 from cloudant/log_file_path_on_crash (iilyak, 2018-09-12, 1 file, -1/+4)
|\
| |   Implement couch_file:format_status to log filepath
| * Implement couch_file:format_status to log filepath (ILYA Khlopotov, 2018-09-12, 1 file, -1/+4)
|/
* Merge pull request #1593 from apache/couch-server-improvements (Russell Branca, 2018-09-12, 1 file, -4/+15)
|\
| |   Couch server improvements
| * Don't send update_lru messages when disabled [couch-server-improvements] (Russell Branca, 2018-09-11, 1 file, -3/+8)
| |   The couchdb.update_lru_on_read setting controls whether couch_server uses read requests as LRU update triggers. Unfortunately, the messages for update_lru on reads are sent regardless of whether this is enabled or disabled. While in principle this is harmless, an overloaded couch_server pid can accumulate a considerable volume of these messages, even when disabled. This patch prevents the caller from sending an update_lru message when the setting is disabled.
| * Add read_concurrency options to couch_server ETS (Russell Branca, 2018-09-11, 1 file, -1/+7)
|/
|   This adds the read_concurrency option to couch_server's ETS table for couch_dbs which contains the references to open database handles. This is an obvious improvement as all callers opening database pids interact with this ETS table concurrently. Conversely, the couch_server pid is the only writer, so no need for write_concurrency.
* Allow disabling off-heap messages (Nick Vatamaniuc, 2018-09-06, 6 files, -12/+17)
|   Off-heap messages is an Erlang 19 feature:
|
|   http://erlang.org/doc/man/erlang.html#process_flag_message_queue_data
|
|   It is advisable to use that setting for processes which expect to receive a lot of messages. CouchDB sets it for couch_server, couch_log_server and a bunch of others as well.
|
|   In some cases the off-heap behavior could alter the timing of message receives and expose subtle bugs that have been lurking in the code for years. Or it could slightly reduce performance, so as a safety measure allow disabling it.
* Fix couch_server concurrency error (Paul J. Davis, 2018-09-06, 2 files, -11/+21)
|   It's possible that a busy couch_server and a specific ordering and timing of events can end up with an open_async message in the mailbox while a new and unrelated open_async process is spawned. This change just ensures that if we encounter any old messages in the mailbox, we ignore them.
|
|   The underlying issue here is that a delete request clears out the state in our couch_dbs ets table while not clearing out state in the message queue. In some fairly specific circumstances this leads to the message in the mailbox satisfying an ets entry for a newer open_async process. This change just includes a match on the opener process. Anything unmatched came before the current open_async request which means it should be ignored.
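The "match on the opener process" idea above can be modelled with a table keyed by database name that records which opener is currently expected. This is a Python sketch under stated assumptions, not the couch_server code: the table layout, the string identifiers standing in for pids, and the function `handle_open_result` are hypothetical.

```python
def handle_open_result(table, db_name, opener, result):
    """Apply an open result only if it comes from the current opener.

    Returns True when the result was applied, False when the message was
    stale (no entry, or an entry owned by a different opener) and ignored.
    """
    entry = table.get(db_name)
    if entry is None or entry["opener"] != opener:
        return False
    entry["db"] = result
    entry["opener"] = None
    return True


# The entry was recreated after a delete, so "opener-2" is now expected,
# while a message from the earlier "opener-1" is still queued.
table = {"db1": {"opener": "opener-2", "db": None}}

print(handle_open_result(table, "db1", "opener-1", "old-handle"))
print(handle_open_result(table, "db1", "opener-2", "new-handle"))
```

Tagging each pending entry with its opener lets a leftover message be recognised and dropped instead of satisfying an entry that belongs to a newer request.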
* Reproduce race condition in couch_server (Paul J. Davis, 2018-09-06, 1 file, -0/+173)
|   A rather uncommon bug found in production. Will write more as this is just for show and tell.
|
|   For now this test case just demonstrates the issue that was discovered. A fix is still being pondered.
* Fix couch_server:terminate/2 (Paul J. Davis, 2018-09-06, 1 file, -1/+5)
|   If couch_server terminates while there is an active open_async process it will throw a function_clause exception because `couch_db:get_pid/1` will fail due to the `#entry.db` member being undefined. Simple fix is to just filter those out.
* Merge pull request #1568 from cloudant/log-changes-rewind-reasons (Peng Hui Jiang, 2018-09-05, 3 files, -4/+24)
|\
| |   Log error when changes forced to rewind to beginning
| * Log warning when changes seq rewinds to 0 (Jay Doane, 2018-09-04, 3 files, -4/+24)
|/
* Merge pull request #1591 from apache/create-shards-if-missing (Robert Newson, 2018-08-30, 1 file, -1/+1)
|\
| |   Create shard files if missing
| * Create shard files if missing [create-shards-if-missing] (Robert Newson, 2018-08-30, 1 file, -1/+1)
|/
|   If, when a database is created, it was not possible to create any of the shard files, the database cannot be used. All requests return a "No DB shards could be opened." error.
|
|   This commit changes fabric_util:get_db/2 to create the shard file if missing. This is correct as that function has already called mem3:shards(DbName) which only returns shards if the database exists.
* Check if db exists in /db/_ensure_full_commit call (#1588) (Eric Avdey, 2018-08-30, 2 files, -1/+28)
|   We removed a security call in `do_db_req` to avoid a duplicate authorization check and as a result there is now no db validation in the noop call `/db/_ensure_full_commit`. This makes it always return a success code, even for missing databases.
|
|   This fix places the security check back, directly in the _ensure_full_commit call, and adds eunit tests for good measure.