path: root/src/mango
Commit message (Author, Age, Files, Lines)
* declare dependency on nouveau (Robert Newson, 2023-04-26, 1 file, -1/+2)
|
* Import nouveau (#4291) (Robert Newson, 2023-04-22, 7 files, -12/+895)
| | | Nouveau - a new (experimental) full-text indexing feature for Apache CouchDB, using Lucene 9. Requires Java 11 or higher (19 is preferred).
* fix(mango): GET invalid path under `_index` should not cause 500 (Gabor Pali, 2023-04-19, 2 files, -3/+7)
| Sending a GET request to a path under the `/{db}/_index` endpoint, e.g. `/{db}/_index/something`, causes an internal error. Change the endpoint's behavior to gracefully return HTTP 405 "Method Not Allowed" instead, to be consistent with the others.
* mango: refactor (Gabor Pali, 2023-04-18, 1 file, -20/+23)
|
* mango: fix definition of index coverage (Gabor Pali, 2023-04-18, 3 files, -5/+107)
| | | | | | | Covering indexes shall provide all the fields that the selector may contain, otherwise the derived documents would get dropped on the "match and extract" phase even if they were matching. Extend the integration tests to check this case as well.
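The coverage rule described in this commit can be sketched in Python (Mango itself is written in Erlang). This is a minimal illustration, not the actual implementation: the name `is_covered` and the flat field-name lists are hypothetical, and it assumes that an index row provides its key fields plus `_id`, and that a query without an explicit projection cannot be covered.

```python
def is_covered(index_fields, selector_fields, projected_fields):
    """Return True if the index alone could answer the query (illustrative only)."""
    # Assumption: an index row carries its key fields and the document id.
    available = set(index_fields) | {"_id"}
    # Both the returned fields and the fields the selector inspects must be
    # available, otherwise matching rows would be dropped while filtering.
    needed = set(projected_fields) | set(selector_fields)
    # Assumption: an empty projection means "all fields", which needs the document.
    return bool(projected_fields) and needed <= available
```

For example, an index on `age` does not cover a query whose selector also mentions `city`, even if only `age` is projected.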
* mango: enhance compositionality of `consider_index_coverage/3` (Gabor Pali, 2023-04-18, 1 file, -33/+42)
| | | | | | | Ideally, the effect of this function should be applied at a single spot of the code. When building the base options, covering index information should be left blank to make it consistent with the rest of the parameters.
* mango: mark fields with the `$exists` operator indexable (Gabor Pali, 2023-04-18, 1 file, -0/+94)
| | | | | | | | | This is required to make index selection work better with covering indexes. The `$exists` operator prescribes the presence of the given field so that if an index has the field, it shall be considered because it implies true. Without this change, it will not happen, but covering indexes can work if the index is manually picked.
* mango: add integration tests for keys-only covering indexes (Gabor Pali, 2023-04-18, 1 file, -0/+115)
|
* mango: add eunit tests (Gabor Pali, 2023-04-18, 2 files, -1/+820)
|
* mango: increase coverage of the `choose_best_index/1` test (Gabor Pali, 2023-04-18, 1 file, -2/+9)
|
* mango: add type information for better self-documentation (Gabor Pali, 2023-04-18, 3 files, -8/+88)
|
* mango: introduce support for covering indexes (Gabor Pali, 2023-04-18, 2 files, -27/+77)
| As a performance improvement, shorten the gap between Mango queries and the underlying map-reduce views: try to serve requests without pulling documents from the primary data set, i.e. run the query with `include_docs` set to `false` when there is a chance that it can be "covered" by the chosen index. The rows in the results are then built from the information stored there.
|
| Extend the response on the `_explain` endpoint to show in the `covered` Boolean attribute whether the query would be covered by the index or not.
|
| Remarks:
|
| - This should be a transparent optimization, without any semantic effect on the queries.
| - Because the main purpose of indexes is to store keys and the document identifiers, the change will only work in cases when the selected fields overlap with those. The chance of being covered could be increased by adding more non-key fields to the index, but that is not in scope here.
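To illustrate how a result row might be built from index data alone, here is a small Python sketch. The function `row_from_index` and its arguments are hypothetical and only assume that an index row consists of a key (one value per indexed field) and the document id.

```python
def row_from_index(index_fields, key, doc_id, projected_fields):
    """Build a result row without reading the document (illustrative only)."""
    # Pair each indexed field name with the value stored in the index key.
    values = dict(zip(index_fields, key))
    values["_id"] = doc_id
    # Only covered queries get here, so every projected field is present.
    return {field: values[field] for field in projected_fields}
```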
* Remove explicit import (Gabor Pali, 2023-04-18, 2 files, -143/+120)
|
* mango: correct text index selection for queries with `$regex` (#4458) (PÁLI Gábor János, 2023-03-10, 5 files, -22/+660)
| * mango: Remove unused `op_insert`
|
|   The `op_insert` elements in the abstract representation of the translated Lucene queries do not seem to be produced anywhere in the code. This seems to be a leftover from a while ago, so retire it.
|
| * mango: Remove unused directory include
|
| * mango: Equip text index selection with tests, specs, and docs
|
|   - Add specifications for the important functions that play some role in the text index selection. This would help to understand the implicit contracts around them and the associated data flow.
|   - Introduce `test_utils:as_selector/1` to make it easier to build valid Mango selectors for testing. On the top level, it uses Erlang maps to ensure the structural consistency of the input (selectors are JSON objects that can be considered maps). Maps are then validated and normalized by `jiffy` and Mango's internal normalization rules for selectors for additional correctness; they eventually become embedded JSON objects. This facilitates writing better unit tests that are closer to real-world use. At the same time, it comes with a dependency on these tools, and their misbehavior can cause test failures.
|   - Add unit tests for the major functions that contribute to the index selection logic and boost the test coverage of the `mango_idx_text` and `mango_selector_text` modules. That is important because running integration tests on a higher level requires a working Clouseau instance, which may not always be available. With these unit tests in place, changes in the code can be tracked easily. Also, the test cases can help the reader get a better understanding of the assumed behavior.
|   - Explain the purpose of `mango_idx_text:is_usable/3` as this is not trivial to catch at first sight. Thanks @mikerhodes for providing the input.
|
| * mango: Refactor index selection tests
|
| * mango: Correct text index selection for `$regex`
|
|   For the `$regex` operator, text indexes can be overly permissive, which can cause them to be selected even if they could not serve the corresponding query. Rework the interpretation of `$regex` to avoid such problems.
* Consolidate Mango integration tests (Gabor Pali, 2023-02-21, 2 files, -52/+21)
| This commit groups a couple of major changes that aim to amend many pain points in making the Mango integration test suites more accessible.
|
| - The test framework behind the Mango integration test suite provides a lot of flags that are not currently exposed on the level of the main `Makefile`. Change this for greater flexibility.
| - Mango's test suite documentation is buried in the source tree, which is not common for other kinds of tests. To increase its visibility and unify the style, move the contents of this file over to the general developer documentation.
| - Promote the use of the `mango-test` target instead of setting up the related machinery manually. The commands recorded in the original documentation are out of date and only minor implementation details anyway.
| - Retire the explicit control over the activation of Mango integration tests that require support for text indexes. Instead, learn the availability of this feature from the current CouchDB instance and run tests based on that. This effectively makes the activation automated, which could be controlled implicitly by either hooking up a Clouseau instance or not.
| - Running the Mango integration tests does not remove the databases on completion, which can inadvertently pollute the local data store. To avoid this, enforce removal of test databases but allow it to be disabled on demand.
* mango: switch to UTF-8 encoding for every test (Gabor Pali, 2023-02-21, 4 files, -4/+0)
| | | | | | | | | | Python 3 uses UTF-8 encoding on reading the source files by default and UTF-8 itself has become more widely adopted in the recent years therefore it makes sense to remove the associated annotations. At the same time, it helps to unbreak the Unicode key tests where the Apple logo ('', as Unicode character) is featured and then got butchered by forcing the ISO-8859-1 encoding on it.
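The encoding problem mentioned here can be reproduced in a few lines of Python. This sketch assumes the key in question is the private-use code point U+F8FF, which Apple fonts render as the logo; the variable names are illustrative only.

```python
# Assumption: the Unicode key test uses U+F8FF (private use area).
apple_logo = "\uf8ff"

# A UTF-8 source file stores the character as three bytes.
raw = apple_logo.encode("utf-8")

# Read back as UTF-8, the character survives.
as_utf8 = raw.decode("utf-8")

# Read back as ISO-8859-1, every byte becomes a separate character.
as_latin1 = raw.decode("iso-8859-1")
```

Forcing ISO-8859-1 on a UTF-8 file therefore turns one key character into three unrelated ones, which is the kind of breakage the commit describes.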
* mango: skip the `$keymapMatch` test for text indexes (Gabor Pali, 2023-02-21, 1 file, -0/+4)
| | | | | | Text indexes do not support the `$keymapMatch` operator thus let the test suite know about this limitation to avoid the related error.
* mango: Fix specification of `choose_best_index/1` (Gabor Pali, 2023-02-15, 1 file, -1/+2)
| | | | | | | | | Comparators are not represented by binary strings in the selection ranges, captured by the `range/0` type. Although that is how they are coming from the corresponding parsed JSON object, they are being translated to specific atoms on the fly. Noticed by: nickva
* mango: Match comments with implementation for JSON index selection (Gabor Pali, 2023-02-14, 1 file, -4/+7)
|
* mango: Cover JSON index selection with unit tests (Gabor Pali, 2023-02-14, 1 file, -0/+79)
|
* mango: Add type specification for the JSON index selection (Gabor Pali, 2023-02-14, 1 file, -0/+6)
|
* mango: Remove unused parameter from the JSON index selection (Gabor Pali, 2023-02-14, 2 files, -4/+4)
|
* mango: Remove unused imports (Gabor Pali, 2023-02-14, 1 file, -2/+0)
|
* Employ `make python-black-update` (Gabor Pali, 2023-02-02, 3 files, -6/+1)
|
* Push down field projection in mango to shard (Mike Rhodes, 2023-01-20, 1 file, -23/+146)
| This commit aims to improve Mango by reducing the data transferred to the coordinator during query execution. It may reduce memory or CPU use at the coordinator but that isn't the primary goal.
|
| Currently, when documents are read at the shard level, they are compared locally at the shard with the selector to ensure they match before they are sent to the coordinator. This ensures we're not sending documents across the network that the coordinator immediately discards, saving bandwidth and coordinator processing. This commit further executes field projection (`fields` in the query) at the shard level. This should further save bandwidth, particularly for queries that project few fields from large documents.
|
| One item of complexity is that a query may request a quorum read of documents, meaning that we need to do the document read at the coordinator and not the shard, then perform the `selector` and `fields` processing there rather than at the shard. To ensure that documents are processed consistently whether at the shard or coordinator, match_and_extract_doc/3 is added. There is still one orphan call outside match_and_extract_doc/2 to extract/2 which supports cluster upgrade and should later be removed.
|
| Shard level processing is already performed in a callback, view_cb/2, that's passed to fabric's view processing to run for each row in the view result set. It's used for the shard local selector and fields processing. To make it clear what arguments are destined for this callback, the commit encapsulates the arguments, using viewcbargs_new/2 and viewcbargs_get/2. As we push down more functionality to the shard, the context this function needs to carry with it will increase, so having a record for it will be valuable.
|
| Supporting cluster upgrades:
|
| The commit supports shard pushdown for Mango `fields` processing for situations during rolling cluster upgrades. In the state where the coordinator is speaking to an upgraded node, the view_cb/2 needs to support being passed just the `selector` outside of the new viewcbargs record. In this case, the shard will not process fields, but the coordinator will.
|
| In the situation where the coordinator is upgraded but the shard is not, we need to send the selector to the shard via `selector` and also execute the fields projection at the coordinator. Therefore we pass arguments to view_cb/2 via both `selector` and `callback_args` and have an apparently spurious field projection (mango_fields:extract/2) in the code that receives back values from the shard (factored out into doc_member_and_extract).
|
| Both of these affordances should only need to exist through one minor version change and be removed thereafter -- if people are jumping several minor versions of CouchDB in one go, hopefully they are prepared for a bit of trouble.
|
| Testing upgrade states:
|
| As view_cb is completely separate from the rest of the cursor code, we can first try out the branch's code using view_cb from `main`, and then the other way -- the branch's view_cb with the rest of the file from main. I did both of these tests successfully.
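The idea of matching and projecting at the shard can be sketched in Python as follows. This is a simplified illustration, not the Erlang code from the commit: `extract`, `match_and_extract` and `shard_rows` are hypothetical names, the selector is modelled as a plain predicate, and only top-level fields are projected.

```python
def extract(doc, fields):
    """Keep only the requested top-level fields; None means all fields."""
    if fields is None:
        return doc
    return {name: doc[name] for name in fields if name in doc}


def match_and_extract(doc, matches, fields):
    """Apply the selector, then the projection, in one place."""
    if not matches(doc):
        return None
    return extract(doc, fields)


def shard_rows(docs, matches, fields):
    """Runs where the documents live, so only small, matching rows travel on."""
    for doc in docs:
        row = match_and_extract(doc, matches, fields)
        if row is not None:
            yield row
```

Keeping the match and the projection behind one function means the same code path could be used at the coordinator when a quorum read forces the document to be read there.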
* Show mango_selector:match/2 call using test (Mike Rhodes, 2023-01-20, 1 file, -0/+25)
| | | | | | | I needed to understand the format of arguments to `match/2` when writing the code to support projecting fields on the shard, so I wrote some code to figure it out as a test. I figure this may be useful for future work in this area, so push as commit.
* docs(mango): match description of `$mod` with reality (Gabor Pali, 2023-01-18, 1 file, -1/+1)
| The remainder argument for the `$mod` operator can be zero, while its documentation suggests otherwise. It actually covers a very realistic use case where divisibility is expressed. No related restriction could be identified in the sources [1], nor does MongoDB forbid this [2]. Tests also seem to exercise this specific case [3].
|
| Thanks @iilyak for checking on these.
|
| [1] https://github.com/apache/couchdb/blob/adf17140e81d0b74f2b2ecdea48fc4f702832eaf/src/mango/src/mango_selector.erl#L512:L513
| [2] https://www.mongodb.com/docs/manual/reference/operator/query/mod/
| [3] https://github.com/apache/couchdb/blob/0059b8f90e58e10b199a4b768a06a762d12a30d3/src/mango/test/03-operator-test.py#L58
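A short Python sketch of the divisibility use case follows. The function `mod_matches` is hypothetical and simplified: it assumes only integer values can match and that a zero divisor is rejected. Note that Python's `%` and Erlang's `rem` differ for negative operands, so the sketch is only meant for non-negative values.

```python
def mod_matches(value, divisor, remainder):
    """Illustrative `$mod`-style check: value modulo divisor equals remainder."""
    # Assumption: non-integer values (including booleans) never match.
    if not isinstance(value, int) or isinstance(value, bool):
        return False
    if divisor == 0:
        raise ValueError("divisor must be non-zero")
    # A zero remainder is valid and expresses plain divisibility.
    return value % divisor == remainder
```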
* Add editors magic lines (Noah Shaw, 2022-07-28, 1 file, -0/+2)
|
* Backport commits from fdbmain into main (old 3.x) (Ronny Berndt, 2022-06-23, 3 files, -4/+4)
| | | | | Cherry-picked commits from 0156a55012b76adb652c11032596d9801c71665e Thx @kianmeng
* Remove Erlang < 23 ifdefs and other macros (Nick Vatamaniuc, 2022-06-17, 2 files, -8/+8)
|
* Fix index creation with empty ddoc should return 400 (#3990) (Jiahui Li, 2022-04-26, 3 files, -10/+16)
| | | | | Creating an index with "ddoc":"" or "name":"" should return a 400 Bad Request. This fixes: https://github.com/apache/couchdb/issues/1472
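The validation described here could look roughly like the following Python sketch. The function `index_request_status` is hypothetical, and the 200 status for an accepted request is an assumption made for the illustration.

```python
def index_request_status(body):
    """Return an HTTP status for an index-creation request body (illustrative only)."""
    for key in ("ddoc", "name"):
        # An explicitly empty string is rejected; an absent key is fine.
        if key in body and body[key] == "":
            return 400
    # Assumption: a valid request is answered with 200.
    return 200
```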
* mango_tests: revert hypothesis back to python3.6 compat (Will Young, 2022-04-23, 1 file, -1/+1)
|
* Merge branch '3.x' into nose2 (Will Young, 2022-04-22, 2 files, -5/+5)
|\
| * Search is available if it was ever available since start (Robert Newson, 2022-04-13, 2 files, -5/+5)
| | | | | | | | | | calling connected() every time causes spurious 503's when clouseau is temporarily unavailable, which is usually masked by retry logic.
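The "available if it was ever available" behavior can be sketched in Python like this. The class `SearchAvailability` and its `probe` callable are hypothetical stand-ins for the connectivity check mentioned in the commit message.

```python
class SearchAvailability:
    """Remember a positive availability check instead of re-probing every time."""

    def __init__(self, probe):
        self._probe = probe  # callable returning True when search is reachable
        self._seen = False

    def enabled(self):
        # Probe only until the first success; afterwards a temporary outage
        # no longer flips the answer back to "unavailable".
        if not self._seen:
            self._seen = bool(self._probe())
        return self._seen
```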
* | nose->nose2 deps upgrade for mango tests (Will, 2022-03-30, 3 files, -6/+9)
|/
* Apply erlfmt formatting to source tree (Nick Vatamaniuc, 2021-11-22, 25 files, -1717/+1506)
| These exceptions from main were ported over to 3.x
|
| ```
| --- a/src/chttpd/src/chttpd.erl
| +++ b/src/chttpd/src/chttpd.erl
| @@ -491,6 +491,7 @@ extract_cookie(#httpd{mochi_req = MochiReq}) ->
|  end.
|  %%% end hack
| +%% erlfmt-ignore
|  set_auth_handlers() ->
|  AuthenticationDefault = "{chttpd_auth, cookie_authentication_handler},
| ```
|
| ```
| --- a/src/couch/src/couch_debug.erl
| +++ b/src/couch/src/couch_debug.erl
| @@ -49,6 +49,7 @@ help() ->
|  ].
|  -spec help(Function :: atom()) -> ok.
| +%% erlfmt-ignore
|  help(opened_files) ->
| ```
* chore: simplify version detection h/t @vatamane (Jan Lehnardt, 2021-03-17, 1 file, -2/+1)
|
* feat: somewhat hacky version detection (Jan Lehnardt, 2021-03-17, 1 file, -1/+4)
|
* feat: work around get_stacktrace deprecation/removal (Jan Lehnardt, 2021-03-17, 2 files, -8/+4)
| This patch introduces a macro and inserts it everywhere we catch errors and then generate a stacktrace. So far the only thing that is a little bit ugly is that in two places, I had to add a header include dependency on couch_db.erl where those modules didn’t have any ties to couchdb/* before, alas. I’d be willing to duplicate the macros in those modules, if we don’t want the include dependency.
* added $keyMapMatch Mango operator (Michal Borkowski, 2020-11-10, 2 files, -0/+41)
|
* bypass partition query limit for mango (#3114) (Tony Sun, 2020-08-28, 1 file, -1/+5)
| | | | | | | | | When partition_query_limit is set for couch_mrview, it limits how many docs can be scanned when executing partitioned queries. But this limits mango's doc scans internally. This leads to documents not being scanned to fulfill a query. This fixes: https://github.com/apache/couchdb/issues/2795 Co-authored-by: Joan Touzet <wohali@users.noreply.github.com>
* python black cleanup (Joan Touzet, 2020-04-27, 1 file, -7/+3)
|
* safer binary_to_term in mango_json_bookmark [mango-bookmark-3.x] (Robert Newson, 2020-04-23, 1 file, -1/+1)
|
* fix operator issue with empty arrays (#2805) (#2808) (Tony Sun, 2020-04-22, 2 files, -3/+25)
| | | | | | | | | | | | | Previously, in https://github.com/apache/couchdb/pull/1783, the logic was wrong in relation to how certain operators interacted with empty arrays. We modify this logic to make it such that: {"foo":"bar", "bar":{"$in":[]}} and {"foo":"bar", "bar":{"$all":[]}} should return 0 results. Co-authored-by: Joan Touzet <wohali@users.noreply.github.com>
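The intended behavior for empty arrays can be illustrated with a simplified Python sketch. The helpers `in_matches` and `all_matches` are hypothetical and ignore the finer points of Mango's operator semantics; they only show that an empty argument list matches nothing.

```python
def in_matches(value, candidates):
    """`$in`-style check: an empty candidate list can never contain the value."""
    return value in candidates


def all_matches(value, required):
    """`$all`-style check (assumption: applies to arrays; empty list matches nothing)."""
    if not isinstance(value, list) or not required:
        return False
    return all(item in value for item in required)


docs = [{"foo": "bar", "bar": ["x"]}, {"foo": "bar", "bar": "x"}]
in_results = [d for d in docs if in_matches(d["bar"], [])]
all_results = [d for d in docs if all_matches(d["bar"], [])]
```

Both result lists are empty, matching the "should return 0 results" statement in the commit message.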
* Fix Windows build (#2534) [3.0.0-RC1] (Joan Touzet, 2020-02-08, 3 files, -6/+6)
| * Allows `configure.ps1` to correctly pull and build `rebar` on Windows
| * Removes the static declarations in `rebar.config.script` on specific, pre-determined paths to various includes/libraries necessary for NIFs and external binaries (expectation is these are passed in env vars INCLUDE, LIB and LIBPATH)
| * fixes the SM60 `couchjs` build by telling `windows.h` not to redefine min and max as macros through a `#define`
| * fixes the `make eunit` target on Windows
| * Adds the missing `EXE_LINK_CXX_TEMPLATE` that our rebar doesn't have, but `enc` has today, which is also causing a failed `couchjs` (C++) build on Windows
| * Causes `make python-black` to correctly cause failure in `make check` if it finds problems
| * fixes Mango tests on Python 3.8 by bumping the hypothesis dependency
| * fixes one Elixir test on Windows (incorrect calculation of `now(:ms)` due to Erlang clock precision difference)
| * a little bit of python black cleanup (mango tests)
* Return mango warnings as a delimited string (Will Holley, 2020-02-03, 3 files, -6/+6)
| | | | | | | The CouchDB API defines the warning field returned by _find to be a string (and this is what Fauxton expects). 5d55e289 was missing a string conversion and returned the warning(s) as an array. This restores the intended behaviour.
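A minimal Python sketch of turning a list of warnings into a single string field is shown below. The helper `format_warnings` is hypothetical, and the newline delimiter is an assumption; the commit message only says that the value is a delimited string.

```python
def format_warnings(warnings):
    """Join warnings into one string so the `warning` field stays a string."""
    if not warnings:
        return None  # no warning field at all
    # Assumption: a newline separates the individual warnings.
    return "\n".join(str(warning) for warning in warnings)
```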
* python-black style cleanup (#2505) (Joan Touzet, 2020-01-31, 1 file, -1/+1)
|
* quote strings for all text values (#2486) (Tony Sun, 2020-01-27, 3 files, -1/+21)
| When a user issues a range query $lt, $lte, $gt, $gte for text indexes, the query is translated into a MIN, MAX range query against clouseau. If not quoted, an error occurs:
|
| {"error":"text_search_error","reason":"Cannot parse '(a_3astring:[\"\" TO string\\ containing\\ space})'..}
|
| This is because the string is broken up into 3 tokens which the parser cannot parse. If we add quotes to the string, the range query works correctly.
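The effect of quoting can be sketched in Python. The helpers `quote` and `lt_range` are hypothetical and only mimic the shape of the range expression visible in the error message above (inclusive empty lower bound, exclusive upper bound); they are not the actual translation code.

```python
def quote(value):
    """Wrap a string in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def lt_range(field, value):
    """Build a `$lt`-style range: from the empty string up to, but excluding, value."""
    return '{}:["" TO {}}}'.format(field, quote(value))


query = lt_range("a_3astring", "string containing space")
```

With the quotes in place the upper bound stays a single term instead of being split at the spaces.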
* Python black cleanups (#2477) (Joan Touzet, 2020-01-21, 2 files, -1/+5)
|
* Handle not_found docs in mango text indexes [archive/mango_index_consistency_error] [mango_index_consistency_error] (Will Holley, 2020-01-17, 1 file, -0/+4)
| | | | | | mango_cursor_text:get_json_docs may return a not_found atom instead of a Doc. In this case, we should just ignore the hit instead of attempting to evaluate it against a mango selector.
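Skipping hits whose document is gone can be sketched in Python as follows. `NOT_FOUND`, `matching_docs`, `fetch_doc` and `matches` are hypothetical names standing in for the `not_found` atom, the document lookup and the selector evaluation.

```python
NOT_FOUND = object()  # stand-in for the `not_found` atom


def matching_docs(hits, fetch_doc, matches):
    """Yield documents for index hits, ignoring hits whose document is missing."""
    for hit in hits:
        doc = fetch_doc(hit)
        if doc is NOT_FOUND:
            # The index still lists the hit but the document is gone:
            # skip it instead of evaluating the selector against it.
            continue
        if matches(doc):
            yield doc
```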