| Commit message (Collapse) | Author | Age | Files | Lines |
|
| |
We can drop the compat nouveau_maps module. Later we can audit the code and
see if we can replace any maps:map/2 with maps:foreach/2.
In smoosh_persist there is no need to check for file:delete/2. Later we should
probably make the delete in couch_file do the same thing to avoid going through
the file server.
`sha_256_512_supported/0` has returned true for a while, but the check had been
broken. The latest crypto API is `crypto:mac/3,4`, so we can re-enable these
tests.
ML discussion: https://lists.apache.org/thread/7nxm16os8dl331034v126kb73jmb7j3x
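As a loose analogue of re-enabling a capability check (the actual code is Erlang, using `crypto:mac/3,4`), probing the runtime instead of trusting a stale flag might look like this minimal Python sketch; the function name mirrors the Erlang one but is otherwise hypothetical:

```python
import hashlib

def sha_256_512_supported():
    # Probe the runtime for the hash algorithms instead of relying on
    # a hardcoded or broken feature check.
    try:
        hashlib.sha256(b"probe")
        hashlib.sha512(b"probe")
        return True
    except (AttributeError, ValueError):
        return False
```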
|
| |
|
| |
|
|
|
|
|
|
|
| |
After the previous fix, the flakiness moved on to the next line. Remove the
extra assertion so it stops generating flaky tests; the main assertion, that
we get a crash, is already checked above.
|
|
|
|
|
|
| |
It's designed to crash and exit, but depending on exactly when it does, it may
generate different errors. Add a few more clauses; hopefully we don't have to
remove it or comment it out completely.
|
| |
|
| |
|
| |
|
|
|
| |
Nouveau - a new (experimental) full-text indexing feature for Apache CouchDB, using Lucene 9. Requires Java 11 or higher (19 is preferred).
|
|
|
|
|
|
|
| |
Sending GET requests targeting paths under the `/{db}/_index`
endpoint, e.g. `/{db}/_index/something`, causes an internal error.
Change the endpoint's behavior to gracefully return HTTP 405
"Method Not Allowed" instead, to be consistent with other endpoints.
|
| |
|
|
|
|
|
|
|
| |
Covering indexes must provide all the fields that the selector
may reference, otherwise the derived documents would be dropped in
the "match and extract" phase even if they match. Extend
the integration tests to check this case as well.
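A minimal Python sketch of the covering rule described above; `selector_fields` and `can_cover` are illustrative names, and real Mango selectors are richer than this simplified shape:

```python
def selector_fields(selector):
    """Collect field names referenced by a (simplified) Mango selector."""
    fields = set()
    for key, value in selector.items():
        if key.startswith("$"):            # combinator such as $and / $or
            for sub in value:
                fields |= selector_fields(sub)
        else:
            fields.add(key)
    return fields

def can_cover(index_fields, selector):
    # An index covers the query only if every field the selector may
    # test is stored in the index; otherwise "match and extract" would
    # drop rows that actually match.
    return selector_fields(selector) <= set(index_fields)
```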
|
|
|
|
|
|
|
| |
Ideally, the effect of this function should be applied in a single
spot in the code. When building the base options, covering-index
information should be left blank to keep it consistent with the
rest of the parameters.
|
|
|
|
|
|
|
|
|
| |
This is required to make index selection work better with covering
indexes. The `$exists` operator prescribes the presence of the
given field, so if an index has the field it should be considered,
because the predicate is implied. Without this change that does
not happen, although covering indexes still work if the index is
picked manually.
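The `$exists` rule can be sketched as follows, assuming (as the commit describes) that every row stored under an index has all of the indexed fields, so membership in the index implies the predicate. Names are illustrative:

```python
def exists_is_implied(index_fields, field):
    # For a {"$exists": true} predicate, it is enough that the index
    # stores the field: every row under the index has the field by
    # construction, so the index stays a selection candidate.
    return field in index_fields
```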
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a performance improvement, shorten the gap between Mango
queries and the underlying map-reduce views: try to serve
requests without pulling documents from the primary data set, i.e.
run the query with `include_docs` set to `false` when there is a
chance that it can be "covered" by the chosen index. The rows in
the results are then built from the information stored there.
Extend the response of the `_explain` endpoint with a Boolean
`covered` attribute that shows whether the query would be covered
by the index or not.
Remarks:
- This should be a transparent optimization, without any semantic
effect on the queries.
- Because the main purpose of indexes is to store keys and the
document identifiers, the change will only work in cases when
the selected fields overlap with those. The chance of being
covered could be increased by adding more non-key fields to the
index, but that is not in scope here.
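A hedged Python sketch of the planning decision described above, with hypothetical names; the real logic lives in Mango's Erlang code:

```python
def plan_query(index_fields, selected_fields):
    """Decide whether the query can be served without fetching documents:
    if every selected field is stored in the index (its keys plus the
    doc id), run the view with include_docs=false and rebuild the rows
    from the index entries."""
    stored = set(index_fields) | {"_id"}
    covered = set(selected_fields) <= stored
    return {"include_docs": not covered, "covered": covered}
```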
|
| |
|
|
|
|
|
|
|
| |
It's not used anymore.
In a test where it was used to test config persistence, replace it with
`set_delay`.
|
| |
The main improvement is speeding up process lookup. This should result in
improved latency for concurrent requests which quickly acquire and
release couchjs processes. Testing with concurrent VDU and map/reduce calls
showed a 1.6x to 6x performance speedup [1].
Previously, couch_proc_manager linearly searched through all the processes and
executed a custom callback function for each to match design doc IDs. Instead,
use a separate ets table index for idle processes to avoid scanning assigned
processes.
Use a db tag in addition to a ddoc id to quickly find idle processes. This could
improve performance, but if that's not the case, allow configuring the tagging
scheme to use a db prefix only, or disable the scheme altogether.
Use the new `map_get` ets select guard [2] to perform ddoc id lookups during
the ets select traversal without a custom matcher callback.
In ordered ets tables use the partially bound key trick [3]. This helps skip
scanning processes using a different query language altogether.
Waiting clients used `os:timestamp/0` as a unique client identifier. It turns
out `os:timestamp/0` is not guaranteed to be unique and could result in some
clients never getting a response. This bug was most likely the reason the
"fifo client order" test had to be commented out. Fix the issue by using a
newer monotonic timestamp function, and for uniqueness append the client's
gen_server return tag at the end. Uncomment the previously commented-out test
so it can hopefully run again.
When clients tag a previously untagged process, asynchronously replace the
untagged process with a new process. This happens in the background and the
client doesn't have to wait for it.
When a ddoc-tagged process cannot be found, before giving up, stop the oldest
unused ddoc processes to allow spawning fresh ones. To avoid doing a linear
scan here, keep a separate `?IDLE_ACCESS` index with an ordered list of idle
ddoc processes sorted by their last usage time.
When processes are returned to the pool, quickly respond to the client with an
early return, instead of forcing them to wait until we re-insert the process
back into the idle ets table. This should improve client latency.
If the waiting client list gets long enough that a client waits longer than
the gen_server get_proc timeout, do not waste time assigning or spawning a new
process for that client, since it has already timed out.
When gathering stats, avoid making gen_server calls, at least for the total
number of processes spawned metric. Table sizes can be easily computed with
`ets:info(Table, size)` from outside the main process.
In addition to performance improvements, clean up the couch_proc_manager API
by forcing all the calls to go through properly exported functions instead of
making direct gen_server calls.
Remove `#proc_int{}` and use only `#proc{}`. Casting to a list/tuple between
`#proc_int{}` and `#proc{}` was dangerous and bypassed the compiler's check
that we're using the proper fields; adding an extra field to the record
resulted in mismatched fields being assigned.
To simplify the code a bit, keep the per-language count in an ets table. This
avoids having to thread the old and updated state everywhere. Everything
else was mostly kept in ets tables anyway, so we're staying consistent with
that general pattern.
Improve test coverage and convert the tests to use the `?TDEF_FE` macro so
there is no need for the awkward `?_test(begin ... end)` construct.
[1] https://gist.github.com/nickva/f088accc958f993235e465b9591e5fac
[2] https://www.erlang.org/doc/apps/erts/match_spec.html
[3] https://www.erlang.org/doc/man/ets.html#table-traversal
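The waiter-identifier fix above can be illustrated with a small Python analogue: a monotonic timestamp preserves FIFO order, and a unique tie-breaker (standing in for the gen_server return tag) guarantees no two waiters collide:

```python
import itertools
import time

_tie_breaker = itertools.count()

def waiter_key():
    """Key for the waiting-clients table. A raw timestamp alone can
    collide, starving one of the waiters; pairing a monotonic timestamp
    with a strictly increasing tag keeps keys both ordered and unique."""
    return (time.monotonic_ns(), next(_tie_breaker))
```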
|
|
|
|
|
| |
In cases where metrics are optional, prevent `# HELP` and `# TYPE`
lines from being emitted if there is no corresponding metric series.
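A minimal sketch of the guard, in Python for illustration (the names and series shape are hypothetical):

```python
def render_metric(name, mtype, help_text, series):
    """Emit `# HELP` / `# TYPE` headers only when the metric actually
    has series; an optional metric with no data produces no output."""
    if not series:
        return []
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {mtype}"]
    lines += [f"{name}{labels} {value}" for labels, value in series]
    return lines
```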
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
# Why
The _prometheus endpoint was missing the Erlang distribution stats
returned by the _system endpoint. These are useful when diagnosing
networking issues between CouchDB nodes.
# How
Adds a new function `couch_prometheus_server:get_distribution_stats/0`.
This gathers the distribution stats in a similar fashion to
`chttpd_node:get_distribution_stats/0` but formats them in a more
prometheus-friendly way. Naming convention follows prometheus standards,
so the type of the value is appended to the metric name and, where
counter types are used, a "_total" suffix is added.
For example:
```
couchdb_erlang_distribution_recv_oct_bytes_total{node="node2@127.0.0.1"} 30609
couchdb_erlang_distribution_recv_oct_bytes_total{node="node3@127.0.0.1"} 28392
```
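The naming convention can be sketched as a small helper (illustrative Python, not the actual Erlang formatting code):

```python
def prom_metric_name(base, unit=None, counter=False):
    """Apply the Prometheus naming conventions described above: append
    the unit of the value to the name, and add a `_total` suffix for
    counter types."""
    name = base
    if unit:
        name += f"_{unit}"
    if counter:
        name += "_total"
    return name
```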
|
|
| |
# What
Adds summary metrics for couch_db_updater and couch_file, the same as
returned by the `_system` endpoint.
Unlike the other message queue stats, these are returned as a Prometheus
summary type across the following metrics, using `couch_db_updater` as
an example:
* couchdb_erlang_message_queue_couch_db_updater{quantile="0.5"}
* couchdb_erlang_message_queue_couch_db_updater{quantile="0.9"}
* couchdb_erlang_message_queue_couch_db_updater{quantile="0.99"}
* couchdb_erlang_message_queue_couch_db_updater_sum
* couchdb_erlang_message_queue_couch_db_updater_count
The count metric represents the number of processes and the sum is the
total size of all message queues for those processes.
In addition, min and max message queue sizes are returned, matching
the _system endpoint response:
* couchdb_erlang_message_queue_couch_db_updater_min
* couchdb_erlang_message_queue_couch_db_updater_max
# How
This represents a new type of metric in the prometheus endpoint - the
existing `summary` types have all been for latency histograms - so
a new utility function `pid_to_prom_summary` is added to format the
message queue stats into prometheus metrics series.
In `chttpd_node` I've extracted the formatting step from the `db_pid_stats`
function to allow for re-use between `chttpd_node` and
`couch_prometheus_server`, where the result is formatted differently.
`chttpd_node` doesn't seem like the best place for shared code like
this, but nor does there seem to be an obvious place to extract it to,
so I've left it there for now.
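A rough Python sketch of the summary rendering described above; it uses a simple nearest-rank quantile and assumes a non-empty sample list, so it is an illustration rather than the endpoint's actual implementation:

```python
def queue_summary(metric, sizes, quantiles=(0.5, 0.9, 0.99)):
    """Render message-queue sizes for one process class as a Prometheus
    summary: quantile series plus _sum/_count, and the _min/_max series
    matching the _system endpoint response."""
    ordered = sorted(sizes)
    if not ordered:
        return []
    lines = []
    for q in quantiles:
        idx = min(int(q * len(ordered)), len(ordered) - 1)
        lines.append(f'{metric}{{quantile="{q}"}} {ordered[idx]}')
    lines.append(f"{metric}_sum {sum(ordered)}")     # total queued messages
    lines.append(f"{metric}_count {len(ordered)}")   # number of processes
    lines.append(f"{metric}_min {ordered[0]}")
    lines.append(f"{metric}_max {ordered[-1]}")
    return lines
```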
|
|
| |
In #3860 and #3366 we added sharding to `couch_index_server` and
`couch_server`.
The `_system` endpoint surfaces a "fake" message queue for each of these
containing the aggregated queue size across all shards. This commit
adds the same for the `_prometheus` endpoint.
Originally I had thought to just filter out the per-shard queue lengths
as we've not found these to be useful in Cloudant, but I'll leave them
in for now for consistency with the `_system` endpoint. Arguably, we
should filter in both places if there's agreement that the per-shard
queue lengths are just noise.
|
|
|
|
|
| |
Querying `_all_docs` with a non-string `key` should return an empty list,
but currently it returns all documents. This PR fixes that.
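The intended behavior can be sketched with a toy Python model of `_all_docs` (hypothetical shape; the real fix is in the Erlang view code):

```python
def all_docs_rows(docs_by_id, key=None):
    """_all_docs keys are doc ids, which are strings; a non-string
    `key` can match nothing, so return an empty list instead of
    falling through to the unfiltered full scan."""
    if key is None:
        return sorted(docs_by_id)
    if not isinstance(key, str):
        return []
    return [key] if key in docs_by_id else []
```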
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the simpler `?TDEF_FE` macro and fix a timeout usage bug: EUnit timeouts
are in seconds, while test_util:wait/2 takes milliseconds. This was revealed
by slow disk IO on the new s390x worker.
To reduce flakiness and to ensure db / view size before compaction is greater
than the size after compaction, even with `none` compression option, make db
update batches smaller and use more frequent index commits.
Issue: https://github.com/apache/couchdb/issues/4521
|
|
|
|
|
|
|
|
| |
- Modified the example and added a `manual view compaction` link to the
`/{db}/_compact/{ddoc}` API to help users find `_compact`-related
documents.
- Fixed the example request for GET /{db}/_shards/{docid}
|
|
|
|
|
| |
Accidentally, in `467e14ef`, the `use_index` field got doubled in
one of the sample responses.
|
| |
|
|
|
|
|
|
|
|
| |
- When requesting `_view` with a reduce function, treat single-element `keys`
as `key`.
- Add a treat_single_keys_as_key/2 function for POST `_view`.
If we query `_all_docs` for deleted or nonexistent docs using single-element
keys, we can get {deleted:true} and {error:not_found}.
- Add documentation on using key, keys, start_key, and end_key.
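The `keys`-to-`key` rewrite can be sketched like this (illustrative Python; the real treat_single_keys_as_key/2 is Erlang):

```python
def treat_single_keys_as_key(params):
    """If `keys` has exactly one element, rewrite it as `key` so the
    reduce-view code path accepts the request."""
    keys = params.get("keys")
    if isinstance(keys, list) and len(keys) == 1:
        params = dict(params)      # don't mutate the caller's dict
        del params["keys"]
        params["key"] = keys[0]
    return params
```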
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prometheus assumes that metrics with `counter` types are cumulative.
This isn't the case in CouchDB / Folsom, which allows counters to
be decremented.
This changes the type of metrics where we decrement the counter values
to `gauge`:
- couchdb_open_databases
- couchdb_couchdb_open_os_files
- couchdb_httpd_clients_requesting_changes
|
|
|
|
|
|
|
|
|
|
|
| |
Add a gauge metric `membership` to the `_prometheus` endpoint. The metric
has labels:
- `nodes=all_nodes`
- `nodes=cluster_nodes`
matching the fields in the `_membership` endpoint (I think consistency
here is more useful than renaming the labels to e.g. expected/actual).
|
|
| |
Adds an internal replication backlog metric. In the `_system` endpoint
this is called `internal_replication_jobs`, so I've preserved the name,
though it appears to represent the backlog of changes.
Adding a dependency on mem3 to `couch_prometheus` requires some changes
to the tests and dependency tree:
- `couchdb.app.src` no longer lists a dependency on `couch_prometheus`.
I don't know why this was needed previously - it doesn't appear to be
required.
- `couch_prometheus` now has dependencies on `couch` and `mem3`.
This both ensures that `couch_prometheus` doesn't crash if mem3 isn't
running and also resolves a race condition on startup where the
`_prometheus` endpoint returns incomplete stats.
- `couch_prometheus:system_stats_test/0` is moved to
`couch_prometheus_e2e_tests:t_starts_with_couchdb/0`. It is really
an integration test, since it depends on the `_prometheus` endpoint
being able to collect data for all the metrics, and it tests only
that the metrics names begin with `couchdb_`.
|
|
|
|
|
|
|
|
|
|
|
| |
The `_prometheus` endpoint today includes size/min/max metrics
across all message queues. This adds a new metric -
`erlang_message_queue_size{queue_name="<name>"}` which tracks the
size of individual message queues.
This could replace the previous metrics since those can be derived from
the new metric by prometheus, but I've left them in place for
compatibility.
|
| |
|
| |
Spidermonkey sometimes throws an `InternalError` when exceeding memory limits,
when normally we'd expect it to crash or exit with a non-zero exit code.
Because we trap exceptions and continue emitting rows, it is possible for
users' views to randomly miss indexed rows, depending on whether GC had run
or on other internal runtime state that may have been consuming more or less
memory at the time.
To prevent the view from continuing to process documents and randomly
dropping emitted rows depending on memory pressure in the JS runtime, treat
InternalError as fatal.
After an InternalError is raised, we expect the process to exit just as it
would during OOM.
Add a test to assert this happens.
Fix https://github.com/apache/couchdb/issues/4504
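A Python sketch of the new error policy; `InternalError` here is a stand-in class, and the real handling lives in the couchjs/Erlang layers:

```python
class InternalError(Exception):
    """Stand-in for Spidermonkey's InternalError."""

def map_doc(doc, map_fun):
    """Trap ordinary exceptions per document (a bad doc simply emits no
    rows), but let InternalError propagate so the process dies as it
    would on OOM, instead of silently dropping rows."""
    try:
        return list(map_fun(doc))
    except InternalError:
        raise                      # fatal: crash the indexer
    except Exception:
        return []                  # non-fatal: skip this document
```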
|
| |
|
|
|
|
|
|
|
|
|
| |
- Unify the style of synopsis lines.
- Mention the `partitioned` parameter where applicable.
- Fix formatting of `warning` in one of the example responses.
- Trade the possibly retired `range` attribute for `mrargs` and
expand the attributes within `opts` in the response of
`_explain`.
|
|
|
|
|
|
|
| |
Fail index opens in a few different ways and assert async_error is called.
Also crash an index process after it's open to test it doesn't take down any
index servers.
|
|
|
|
|
| |
We pass in a shard name that doesn't exist, causing couch_util:with_db to
throw. We assert that we get back {ok, St} and don't crash.
|
| |
|
| |
|
| |
|
|
|
|
| |
This reverts commit 937ccb6ef84b773882c967a6fa6f4d71df42e4cf.
|
| |
|
| |
|
| |
|