author Nick Vatamaniuc <vatamane@gmail.com> 2023-02-07 02:01:55 -0500
committer Nick Vatamaniuc <nickva@users.noreply.github.com> 2023-04-15 13:32:00 -0400
commit 4b8b7ec9f39d1e5910f301283fd6b9051b057d0c
tree f3121f8da34ed6f9802d908ba74fc705b1b7f747 /src/couch_mrview
parent 98a356c3233f27e62d6b7ef7eb8478434198b888
Improve couch_proc_manager
The main improvement is speeding up process lookup. This should improve latency for concurrent requests which quickly acquire and release couchjs processes. Testing with concurrent vdu and map/reduce calls showed a 1.6x to 6x speedup [1].

Previously, couch_proc_manager linearly searched through all the processes and executed a custom callback function for each to match design doc IDs. Instead, use a separate ets table index for idle processes to avoid scanning assigned processes.

Use a db tag in addition to a ddoc id to quickly find idle processes. This could improve performance, but if that's not the case, allow configuring the tagging scheme to use a db prefix only, or disable the scheme altogether.

Use the new `map_get` ets select guard [2] to perform ddoc id lookups during the ets select traversal without a custom matcher callback.

In ordered ets tables use the partially bound key trick [3]. This helps skip scanning processes using a different query language altogether.

Waiting clients used `os:timestamp/0` as a unique client identifier. It turns out `os:timestamp/0` is not guaranteed to be unique, which could result in some clients never getting a response. This bug was most likely the reason the "fifo client order" test had to be commented out. Fix the issue by using a newer monotonic timestamp function and, for uniqueness, adding the client's gen_server return tag at the end. Uncomment the previously commented out test so it can hopefully run again.

When clients tag a previously untagged process, asynchronously replace the untagged process with a new process. This happens in the background and the client doesn't have to wait for it.

When a ddoc tagged process cannot be found, before giving up, stop the oldest unused ddoc processes to allow spawning fresh ones. To avoid doing a linear scan here, keep a separate `?IDLE_ACCESS` index with an ordered list of idle ddoc processes sorted by their last usage time.
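
The lookup described above (a partially bound key on an ordered table combined with a `map_get` guard) can be sketched roughly as follows. This is only an illustration: the module name, the table name, the `{Lang, DbTag, Id}` key layout, the map-of-ddoc-keys value and the `p1`..`p3` placeholder atoms are all assumptions made for the example, not the layout couch_proc_manager actually uses.

```erlang
-module(idle_index_sketch).
-export([demo/0, find_idle/3]).

%% Hypothetical layout: objects are {{Lang, DbTag, Id}, DDocKeys}, where
%% DDocKeys is a map of DDocKey => true. Binding only Lang in the key lets
%% an ordered_set table restrict the traversal to keys that start with
%% that language instead of scanning the whole table.
find_idle(Tab, Lang, DDocKey) ->
    Head = {{Lang, '_', '_'}, '$1'},
    %% The map_get guard fails (so the object is skipped) when the key is
    %% absent from the map; no custom matcher callback is involved.
    Guard = {map_get, {const, DDocKey}, '$1'},
    ets:select(Tab, [{Head, [Guard], ['$_']}]).

demo() ->
    Tab = ets:new(idle_sketch, [ordered_set]),
    DDocKey = {<<"_design/app">>, <<"1-abc">>},
    true = ets:insert(Tab, [
        {{<<"javascript">>, <<"db1">>, p1}, #{DDocKey => true}},
        {{<<"javascript">>, <<"db2">>, p2}, #{}},
        {{<<"erlang">>, <<"db1">>, p3}, #{DDocKey => true}}
    ]),
    Found = find_idle(Tab, <<"javascript">>, DDocKey),
    true = ets:delete(Tab),
    Found.
```

In this sketch, `idle_index_sketch:demo()` should return only the `p1` entry: `p2` has the right language but no matching ddoc key, and `p3` carries the ddoc key under a different language prefix.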
When processes are returned to the pool, quickly respond to the client with an early return, instead of forcing them to wait until we re-insert the process back into the idle ets table. This should improve client latency.

If the waiting client list gets long enough that a client waits longer than the gen_server get_proc timeout, do not waste time assigning or spawning a new process for that client, since it has already timed out.

When gathering stats, avoid making gen_server calls, at least for the total number of processes spawned metric. Table sizes can be easily computed with `ets:info(Table, size)` from outside the main process.

In addition to performance improvements, clean up the couch_proc_manager API by forcing all the calls to go through properly exported functions instead of doing direct gen_server calls.

Remove `#proc_int{}` and use only `#proc{}`. The cast to a list/tuple between `#proc_int{}` and `#proc{}` was dangerous, and it prevented the compiler from checking that we're using the proper fields. Adding an extra field to the record resulted in mismatched fields being assigned.

To simplify the code a bit, keep the per-language count in an ets table. This avoids having to thread the old and updated state everywhere. Everything else was mostly kept in ets tables anyway, so we're staying consistent with that general pattern.

Improve test coverage and convert the tests to use the `?TDEF_FE` macro so there is no need for the awkward `?_test(begin ... end)` construct.

[1] https://gist.github.com/nickva/f088accc958f993235e465b9591e5fac
[2] https://www.erlang.org/doc/apps/erts/match_spec.html
[3] https://www.erlang.org/doc/man/ets.html#table-traversal
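
As a rough illustration of the stats point above, the sketch below reads a table's size with `ets:info(Table, size)` from a process that does not own the table, so no gen_server call to the owner is needed. The module name, the `procs_sketch` table and its contents are hypothetical and exist only for this example.

```erlang
-module(table_size_sketch).
-export([demo/0]).

demo() ->
    Parent = self(),
    %% A stand-in for the main process that owns the table.
    {Owner, Ref} = spawn_monitor(fun() ->
        procs_sketch = ets:new(procs_sketch, [named_table, protected, set]),
        true = ets:insert(procs_sketch, [{p1, idle}, {p2, busy}]),
        Parent ! {ready, self()},
        receive stop -> ok end
    end),
    receive {ready, Owner} -> ok end,
    %% Read the size from the calling process, not from the owner.
    Size = ets:info(procs_sketch, size),
    Owner ! stop,
    %% Wait for the owner to exit so the named table is gone before returning.
    receive {'DOWN', Ref, process, Owner, _} -> ok end,
    Size.
```

`ets:info/2` does not require read access to the table's contents, which is why the sketch works even with a `protected` table.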
Diffstat (limited to 'src/couch_mrview')
-rw-r--r-- src/couch_mrview/src/couch_mrview_show.erl 5
1 files changed, 3 insertions, 2 deletions
diff --git a/src/couch_mrview/src/couch_mrview_show.erl b/src/couch_mrview/src/couch_mrview_show.erl
index 3e95be9cc..7fec0c5cd 100644
--- a/src/couch_mrview/src/couch_mrview_show.erl
+++ b/src/couch_mrview/src/couch_mrview_show.erl
@@ -85,6 +85,7 @@ handle_doc_show(Req, Db, DDoc, ShowName, Doc, DocId) ->
     JsonDoc = couch_query_servers:json_doc(Doc),
     [<<"resp">>, ExternalResp] =
         couch_query_servers:ddoc_prompt(
+            Db,
             DDoc,
             [<<"shows">>, ShowName],
             [JsonDoc, JsonReq]
@@ -142,7 +143,7 @@ send_doc_update_response(Req, Db, DDoc, UpdateName, Doc, DocId) ->
     JsonReq = chttpd_external:json_req_obj(Req, Db, DocId),
     JsonDoc = couch_query_servers:json_doc(Doc),
     Cmd = [<<"updates">>, UpdateName],
-    UpdateResp = couch_query_servers:ddoc_prompt(DDoc, Cmd, [JsonDoc, JsonReq]),
+    UpdateResp = couch_query_servers:ddoc_prompt(Db, DDoc, Cmd, [JsonDoc, JsonReq]),
     JsonResp =
         case UpdateResp of
             [<<"up">>, {NewJsonDoc}, {JsonResp0}] ->
@@ -219,7 +220,7 @@ handle_view_list(Req, Db, DDoc, LName, VDDoc, VName, Keys) ->
     end,
     Args = Args0#mrargs{preflight_fun = ETagFun},
     couch_httpd:etag_maybe(Req, fun() ->
-        couch_query_servers:with_ddoc_proc(DDoc, fun(QServer) ->
+        couch_query_servers:with_ddoc_proc(Db, DDoc, fun(QServer) ->
             Acc = #lacc{db = Db, req = Req, qserver = QServer, lname = LName},
             case VName of
                 <<"_all_docs">> ->