summaryrefslogtreecommitdiff
path: root/src/docs/src/ddocs/nouveau.rst
diff options
context:
space:
mode:
Diffstat (limited to 'src/docs/src/ddocs/nouveau.rst')
-rw-r--r--src/docs/src/ddocs/nouveau.rst692
1 files changed, 692 insertions, 0 deletions
diff --git a/src/docs/src/ddocs/nouveau.rst b/src/docs/src/ddocs/nouveau.rst
new file mode 100644
index 000000000..7a9e6d217
--- /dev/null
+++ b/src/docs/src/ddocs/nouveau.rst
@@ -0,0 +1,692 @@
+.. Licensed under the Apache License, Version 2.0 (the "License"); you may not
+.. use this file except in compliance with the License. You may obtain a copy of
+.. the License at
+..
+.. http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+.. WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+.. License for the specific language governing permissions and limitations under
+.. the License.
+
+.. _ddoc/nouveau:
+
+=======
+Nouveau
+=======
+
+.. warning::
+ Nouveau is an experimental feature. Future releases might change how the endpoints
+ work and might invalidate existing indexes.
+
+Nouveau indexes enable you to query a database by using the
+`Lucene Query Parser Syntax. <https://lucene.apache.org/core/9_5_0/queryparser/
+org/apache/lucene/queryparser/classic/package-summary.html#Overview>`_
+A nouveau index uses one, or multiple, fields from your documents. You can use a nouveau
+index to run queries to find documents based on the content they contain.
+
+.. warning::
+ Nouveau cannot function unless it has a functioning Nouveau server.
+ See :ref:`Nouveau Server Installation <install/nouveau>` for details.
+
+To create a nouveau index, you add a JavaScript function to a design document in the
+database. An index builds after processing one search request or after the server detects
+a document update. The ``index`` function takes the following parameters:
+
+#. Field type - The type of the field, can be ``string``, ``text``, ``double``
+ or ``stored``.
+#. Field name - The name of the field you want to use when you query the index.
+ If you set this parameter to ``default``, then this field is queried if no field is
+ specified in the query syntax.
+#. Data that you want to index, for example, ``doc.address.country``.
+#. (Optional) The third parameter includes the following field: ``store``.
+
+By default, a nouveau index response returns 25 rows. The number of hits that are returned
+can be changed by using the ``limit`` parameter. Each response includes a ``bookmark``
+field. You can include the value of the ``bookmark`` field in subsequent queries to fetch
+results from deeper in the result set.
+
+*Example design document that defines a nouveau index:*
+
+.. code-block:: javascript
+
+ {
+ "_id": "_design/nouveau_example",
+ "nouveau": {
+ "animals": {
+ "index": "function(doc){ ... }"
+ }
+ }
+ }
+
+A nouveau index will inherit the partitioning type from the ``options.partitioned`` field
+of the design document that contains it.
+
+Index functions
+===============
+
+Attempting to index by using a data field that does not exist fails. To avoid
+this problem, use the appropriate
+:ref:`guard clause <ddoc/nouveau/index_guard_clauses>`.
+
+.. note::
+ Your indexing functions operate in a memory-constrained environment
+ where the document itself forms a part of the memory that is used
+ in that environment. Your code's stack and document must fit inside this
+ memory. In other words, a document must be loaded in order to be indexed.
+ Documents are limited to a maximum size of 64 MB.
+
+The function that is contained in the index field is a JavaScript function
+that is called for each document in the database.
+The function takes the document as a parameter,
+extracts some data from it, and then calls the function that is defined
+in the ``index`` field to index that data.
+
+The ``index`` function takes four parameters, where the third parameter is optional.
+
+#. The first parameter is the type of the field.
+
+#. The second parameter is the name of the field you intend to use
+ when querying the index, and which is specified in the Lucene
+ syntax portion of subsequent queries. An example appears in the
+ following query:
+
+ .. code-block:: javascript
+
+ q=color:red
+
+ The Lucene field name ``color`` is the first parameter of the ``index`` function.
+
+ If the special value ``"default"`` is used when you define the name,
+ you do not have to specify a field name at query time.
+ The effect is that the query can be simplified:
+
+ .. code-block:: javascript
+
+ q=red
+
+#. The third parameter is the data to be indexed. Keep the following information
+ in mind when you index your data:
+
+ - This data must be only a string, number, or boolean. Other types will cause
+ an error to be thrown by the index function call.
+
+ - If an error is thrown when running your function, for this reason or others,
+ the document will not be added to that search index.
+
+#. The fourth, optional, parameter is a JavaScript object with the following fields:
+
+ *Index function (optional parameter)*
+
+ * **store** - If ``true``, the value is returned in the search result; otherwise,
+ the value is not returned. Values are ``true`` or ``false``. Default is ``false``.
+
+ .. note::
+
+ If you do not set the ``store`` parameter,
+ the index data results for the document are not returned in response to a query.
+
+*Example search index function:*
+
+.. code-block:: javascript
+
+ function(doc) {
+ if (typeof(doc.min_length) == 'number') {
+ index("double", "min_length", doc.min_length, {"store": true});
+ }
+ if (typeof(doc.diet) == 'string') {
+ index("string", "diet", doc.diet, {"store": true});
+ }
+ if (typeof(doc.latin_name) == 'string') {
+ index("string", "latin_name", doc.latin_name, {"store": true});
+ }
+ if (typeof(doc.class) == 'string') {
+ index("string", "class", doc.class, {"store": true});
+ }
+ }
+
+.. _ddoc/nouveau/index_guard_clauses:
+
+Index guard clauses
+-------------------
+
+Runtime errors in the index function cause the document not to be indexed at all. The
+most common runtime errors are described below;
+
+*Example of failing to check whether the indexed value exists:*
+
+.. warning:: example of bad code
+.. code-block:: javascript
+
+ index("min_length", doc.min_length, {"store": true});
+
+For documents without a `min_length` value, this index call will
+pass ``undefined`` as the value. This will be rejected by nouveau's
+validation function and the document will not be indexed.
+
+*Example of failing to check whether the nested indexed value exists:*
+
+.. warning:: example of bad code
+.. code-block:: javascript
+
+ if (doc.foo.bar) {
+ index("bar", doc.foo.bar, {"store": true});
+ }
+
+This bad example fails in a different way if ``doc.foo`` doesn't
+exist; the evaluation of ``doc.foo.bar`` throws an exception.
+
+.. code-block:: javascript
+
+ if (doc.foo && typeof(doc.foo) == 'object' && typeof(doc.foo.bar == 'string')) {
+ index("bar", doc.foo.bar, {"store": true});
+ }
+
+This example correctly checks that ``doc.foo`` is an object and its
+``bar`` entry is a string.
+
+*Example of checking the index value exists but disallowing valid false values:*
+
+.. warning:: example of bad code
+.. code-block:: javascript
+
+ if (doc.min_length) {
+ index("min_length", doc.min_length, {"store": true});
+ }
+
+We correct the previous mistake so documents without min_length are
+indexed (assuming there are other index calls for values that `do`
+exist) but we've acccidentally prevented the indexing of the
+``min_length`` field if the ``doc.min_length`` happens to be ``0``.
+
+.. code-block:: javascript
+
+ if (typeof(doc.min_length == 'number')) {
+ index("min_length", doc.min_length, {"store": true});
+ }
+
+This good example ensures we index any document where ``min_length`` is a number.
+
+.. _ddoc/nouveau/analyzers:
+
+Analyzers
+=========
+
+Analyzers convert textual input into ``tokens`` which can be searched
+on. Analyzers typically have different rules for how they break up
+input into tokens, they might convert all text to lower case, they
+might omit whole words (typically words so common they are unlikely to
+be useful for searching), they might omit parts of words (removing
+``ing`` suffixes in English, for example):
+
+We expose a large number of Lucene's analyzers. We invent one
+ourselves (``simple_asciifolding``);
+
+* arabic
+* armenian
+* basque
+* bulgarian
+* catalan
+* chinese
+* cjk
+* classic
+* czech
+* danish
+* dutch
+* email
+* english
+* finnish
+* french
+* galician
+* german
+* hindi
+* hungarian
+* indonesian
+* irish
+* italian
+* japanese
+* keyword
+* latvian
+* norwegian
+* persian
+* polish
+* portugese
+* romanian
+* russian
+* simple
+* simple_asciifolding
+* spanish
+* standard
+* swedish
+* thai
+* turkish
+* whitespace
+
+*Example analyzer document:*
+
+.. code-block:: javascript
+
+ {
+ "_id": "_design/analyzer_example",
+ "nouveau": {
+ "INDEX_NAME": {
+ "index": "function (doc) { ... }",
+ "default_analyzer": "$ANALYZER_NAME"
+ }
+ }
+ }
+
+.. _ddoc/nouveau/field-analyzers:
+
+Field analyzers
+----------------
+
+You may optionally specify a different analyzer for a specific field.
+
+*Example of defining different analyzers for different fields:*
+
+.. code-block:: javascript
+
+ {
+ "_id": "_design/analyzer_example",
+ "nouveau": {
+ "INDEX_NAME": {
+ "default_analyzer": "english",
+ "field_analyzers": {
+ "spanish": "spanish",
+ "german": "german"
+ },
+ "index": "function (doc) { ... }"
+ }
+ }
+ }
+
+Testing analyzer tokenization
+-----------------------------
+
+You can test the results of analyzer tokenization by posting sample data to the
+``_nouveau_analyze`` endpoint.
+
+*Example of using HTTP to test the keyword analyzer:*
+
+.. code-block:: http
+
+ POST /_nouveau_analyze HTTP/1.1
+ Content-Type: application/json
+ {"analyzer":"keyword", "text":"ablanks@renovations.com"}
+
+*Example of using the command line to test the keyword analyzer:*
+
+.. code-block:: sh
+
+ curl 'https://$HOST:5984/_nouveau_analyze' -H 'Content-Type: application/json'
+ -d '{"analyzer":"keyword", "text":"ablanks@renovations.com"}'
+
+*Result of testing the keyword analyzer:*
+
+.. code-block:: javascript
+
+ {
+ "tokens": [
+ "ablanks@renovations.com"
+ ]
+ }
+
+*Example of using HTTP to test the standard analyzer:*
+
+.. code-block:: http
+
+ POST /_nouveau_analyze HTTP/1.1
+ Content-Type: application/json
+ {"analyzer":"standard", "text":"ablanks@renovations.com"}
+
+*Example of using the command line to test the standard analyzer:*
+
+.. code-block:: sh
+
+ curl 'https://$HOST:5984/_nouveau_analyze' -H 'Content-Type: application/json'
+ -d '{"analyzer":"standard", "text":"ablanks@renovations.com"}'
+
+*Result of testing the standard analyzer:*
+
+.. code-block:: javascript
+
+ {
+ "tokens": [
+ "ablanks",
+ "renovations.com"
+ ]
+ }
+
+Queries
+=======
+
+After you create a search index, you can query it.
+
+- Issue a partition query using:
+ ``GET /$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_nouveau/$INDEX_NAME``
+- Issue a global query using:
+ ``GET /$DATABASE/_design/$DDOC/_nouveau/$INDEX_NAME``
+
+Specify your search by using the ``q`` parameter.
+
+*Example of using HTTP to query a partitioned index:*
+
+.. code-block:: http
+
+ GET /$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_nouveau/$INDEX_NAME?include_docs=true&query="*:*"&limit=1 HTTP/1.1
+ Content-Type: application/json
+
+*Example of using HTTP to query a global index:*
+
+.. code-block:: http
+
+ GET /$DATABASE/_design/$DDOC/_nouveau/$INDEX_NAME?include_docs=true&query="*:*"&limit=1 HTTP/1.1
+ Content-Type: application/json
+
+*Example of using the command line to query a partitioned index:*
+
+.. code-block:: sh
+
+ curl https://$HOST:5984/$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/
+ _nouveau/$INDEX_NAME?include_docs=true\&query="*:*"\&limit=1 \
+
+*Example of using the command line to query a global index:*
+
+.. code-block:: sh
+
+ curl https://$HOST:5984/$DATABASE/_design/$DDOC/_nouveau/$INDEX_NAME?
+ include_docs=true\&query="*:*"\&limit=1 \
+
+.. _ddoc/nouveau/query_parameters:
+
+Query Parameters
+----------------
+
+A full list of query parameters can be found in the
+:ref:`API Reference <api/ddoc/nouveau>`.
+
+.. note::
+ Do not combine the ``bookmark`` and ``update`` options. These options
+ constrain the choice of shard replicas to use for the response. When used
+ together, the options might cause problems when contact is attempted
+ with replicas that are slow or not available.
+
+Relevance
+---------
+
+When more than one result might be returned, it is possible for them to be sorted. By
+default, the sorting order is determined by 'relevance'.
+
+Relevance is measured according to `Apache Lucene Scoring
+<https://lucene.apache.org/core/9_5_0/core/org/apache/
+lucene/search/package-summary.html>`_.
+As an example, if you search a simple database for the word
+``example``, two documents might contain the word. If one document
+mentions the word ``example`` 10 times, but the second document
+mentions it only twice, then the first document is considered to be
+more 'relevant'.
+
+If you do not provide a ``sort`` parameter, relevance is used by default. The highest
+scoring matches are returned first.
+
+If you provide a ``sort`` parameter, then matches are returned in that order, ignoring
+relevance.
+
+If you want to use a ``sort`` parameter, and also include ordering by relevance in your
+search results, use the special fields ``-<score>`` or ``<score>`` within the ``sort``
+parameter.
+
+POSTing search queries
+----------------------
+
+Instead of using the ``GET`` HTTP method, you can also use ``POST``. The main advantage of
+``POST`` queries is that they can have a request body, so you can specify the request as a
+JSON object. Each parameter in the query string of a ``GET`` request corresponds to a
+field in the JSON object in the request body.
+
+*Example of using HTTP to POST a search request:*
+
+.. code-block:: http
+
+ POST /db/_design/ddoc/_nouveau/searchname HTTP/1.1
+ Content-Type: application/json
+
+*Example of using the command line to POST a search request:*
+
+.. code-block:: sh
+
+ curl 'https://$HOST:5984/db/_design/ddoc/_nouveau/searchname' -X POST -H 'Content-Type: application/json' -d @search.json
+
+*Example JSON document that contains a search request:*
+
+.. code-block:: javascript
+
+ {
+ "q": "index:my query",
+ "sort": "foo",
+ "limit": 3
+ }
+
+Query syntax
+============
+
+The CouchDB search query syntax is based on the
+`Lucene syntax. <https://lucene.apache.org/core/9_5_0/queryparser/org/apache/
+lucene/queryparser/classic/package-summary.html>`_
+Search queries take the form of ``name:value`` unless the name is omitted, in which case
+they use the default field, as demonstrated in the following examples:
+
+*Example search query expressions:*
+
+.. code-block:: javascript
+
+ // Birds
+ class:bird
+
+.. code-block:: text
+
+ // Animals that begin with the letter "l"
+ l*
+
+.. code-block:: text
+
+ // Carnivorous birds
+ class:bird AND diet:carnivore
+
+.. code-block:: text
+
+ // Herbivores that start with letter "l"
+ l* AND diet:herbivore
+
+.. code-block:: text
+
+ // Medium-sized herbivores
+ min_length:[1 TO 3] AND diet:herbivore
+
+.. code-block:: text
+
+ // Herbivores that are 2m long or less
+ diet:herbivore AND min_length:[-Infinity TO 2]
+
+.. code-block:: text
+
+ // Mammals that are at least 1.5m long
+ class:mammal AND min_length:[1.5 TO Infinity]
+
+.. code-block:: text
+
+ // Find "Meles meles"
+ latin_name:"Meles meles"
+
+.. code-block:: text
+
+ // Mammals who are herbivore or carnivore
+ diet:(herbivore OR omnivore) AND class:mammal
+
+.. code-block:: text
+
+ // Return all results
+ *:*
+
+Queries over multiple fields can be logically combined, and groups and fields can be
+further grouped. The available logical operators are case-sensitive and are ``AND``,
+``+``, ``OR``, ``NOT`` and ``-``. Range queries can run over strings or numbers.
+
+If you want a fuzzy search, you can run a query with ``~`` to find terms like the search
+term. For instance, ``look~`` finds the terms ``book`` and ``took``.
+
+.. note::
+ If the lower and upper bounds of a range query are both strings that
+ contain only numeric digits, the bounds are treated as numbers not as
+ strings. For example, if you search by using the query
+ ``mod_date:["20170101" TO "20171231"]``, the results include documents
+ for which ``mod_date`` is between the numeric values 20170101 and
+ 20171231, not between the strings "20170101" and "20171231".
+
+You can alter the importance of a search term by adding ``^`` and a positive number. This
+alteration makes matches containing the term more or less relevant, proportional to the
+power of the boost value. The default value is 1, which means no increase or decrease in
+the strength of the match. A decimal value of 0 - 1 reduces importance. making the match
+strength weaker. A value greater than one increases importance, making the match strength
+stronger.
+
+Wildcard searches are supported, for both single (``?``) and multiple (``*``) character
+searches. For example, ``dat?`` would match ``date`` and ``data``, whereas ``dat*`` would
+match ``date``, ``data``, ``database``, and ``dates``. Wildcards must come after the
+search term.
+
+Use ``*:*`` to return all results.
+
+The following characters require escaping if you want to search on them:
+
+.. code-block:: sh
+
+ + - && || ! ( ) { } [ ] ^ " ~ * ? : \ /
+
+To escape one of these characters, use a preceding backslash character (``\``).
+
+The response to a search query contains an ``order`` field for each of the results. The
+``order`` field is an array where the first element is the field or fields that are
+specified in the ``sort`` parameter. See the
+:ref:`sort parameter <api/ddoc/search>`. If no ``sort`` parameter is included
+in the query, then the ``order`` field contains the `Lucene relevance score
+<https://lucene.apache.org/core/9_5_0/core/org/apache/
+lucene/search/package-summary.html>`_.
+
+.. _ddoc/nouveau/faceting:
+
+Faceting
+--------
+
+Nouveau Search also supports faceted searching, enabling discovery of aggregate
+information about matches quickly and easily. You can match all documents by using the
+special ``?q=*:*`` query syntax, and use the returned facets to refine your query.
+
+*Example of search query:*
+
+.. code-block:: javascript
+
+ function(doc) {
+ index("string", "type", doc.type);
+ index("double", "price", doc.price);
+ }
+
+To use facets, all the documents in the index must include all the fields that have
+faceting enabled. If your documents do not include all the fields, you receive a
+``bad_request`` error with the following reason, "The ``field_name`` does not exist." If
+each document does not contain all the fields for facets, create separate indexes for each
+field. If you do not create separate indexes for each field, you must include only
+documents that contain all the fields. Verify that the fields exist in each document by
+using a single ``if`` statement.
+
+*Example if statement to verify that the required fields exist in each document:*
+
+.. code-block:: javascript
+
+ if (typeof doc.town == "string" && typeof doc.name == "string") {
+ index("string", "town", doc.town);
+ index("string", "name", doc.name);
+ }
+
+Counts
+------
+
+.. note::
+ The ``counts`` option is only available when making global queries.
+
+The ``counts`` facet syntax takes a list of fields, and returns the number of query
+results for each unique value of each named field.
+
+.. note::
+ The ``count`` operation works only if the indexed values are strings.
+ The indexed values cannot be mixed types. For example,
+ if 100 strings are indexed, and one number,
+ then the index cannot be used for ``count`` operations.
+ You can check the type by using the ``typeof`` operator, and convert it
+ by using the ``parseInt``,
+ ``parseFloat``, or ``.toString()`` functions.
+
+*Example of a query using the counts facet syntax:*
+
+.. code-block:: text
+
+ ?q=*:*&counts=["type"]
+
+*Example response after using of the counts facet syntax:*
+
+.. code-block:: javascript
+
+ {
+ "total_rows":100000,
+ "bookmark":"g...",
+ "rows":[...],
+ "counts":{
+ "type":{
+ "sofa": 10,
+ "chair": 100,
+ "lamp": 97
+ }
+ }
+ }
+
+Ranges
+------
+
+.. note::
+ The ``ranges`` option is only available when making global queries.
+
+The ``range`` facet syntax reuses the standard Lucene syntax for ranges to return counts
+of results that fit into each specified category. Inclusive range queries are denoted by
+brackets (``[``, ``]``). Exclusive range queries are denoted by curly brackets (``{``,
+``}``).
+
+.. note::
+ The ``range`` operation works only if the indexed values are numbers. The indexed
+ values cannot be mixed types. For example, if 100 strings are indexed, and one number,
+ then the index cannot be used for ``range`` operations. You can check the type by
+ using the ``typeof`` operator, and convert it by using the ``parseInt``,
+ ``parseFloat``, or ``.toString()`` functions.
+
+*Example of a request that uses faceted search for matching ranges:*
+
+.. code-block:: text
+
+ ?q=*:*&ranges={"price":{"cheap":"[0 TO 100]","expensive":"{100 TO Infinity}"}}
+
+*Example results after a ranges check on a faceted search:*
+
+.. code-block:: javascript
+
+ {
+ "total_rows":100000,
+ "bookmark":"g...",
+ "rows":[...],
+ "ranges": {
+ "price": {
+ "expensive": 278682,
+ "cheap": 257023
+ }
+ }
+ }