author Andrew Shuvalov <andrew.shuvalov@mongodb.com> 2021-12-30 14:34:52 +0000
committer Evergreen Agent <no-reply@evergreen.mongodb.com> 2022-01-12 15:14:02 +0000
commit 7deae2c1f359a9bbae8d92064aab4fb6cd677d14
tree 3ff068e4b30d01202e5a0d0a9ee5d899c35ac3d8
parent a11e822c1e05be2fa06d86a107ce151c55b16740
SERVER-62312 health monitoring documentation
(cherry picked from commit 1c02e51ffdc45678cc42b21ab1f27388649505a4) SERVER-62546 health monitoring documentation
-rw-r--r-- src/mongo/db/process_health/README.md 73
1 files changed, 73 insertions, 0 deletions
diff --git a/src/mongo/db/process_health/README.md b/src/mongo/db/process_health/README.md
new file mode 100644
index 00000000000..c5e6f06e1ca
--- /dev/null
+++ b/src/mongo/db/process_health/README.md
@@ -0,0 +1,73 @@
+# Process Health Checking Library
+
+This module runs server health checks and can crash an unhealthy server.
+
+*Note:* in the 4.4 release, only the mongos proxy server is supported.
+
+## Health Observers
+
+*Health Observers* each run one particular check. Each observer can be turned on or off, and can be marked critical or non-critical, which determines whether a detected failure can crash the server. Each observer has a configurable interval that controls how often it runs its check.
+
+## Health Observers Parameters
+
+- healthMonitoringIntensities: main configuration for each observer. Can be set at startup and changed at runtime. Valid values:
+ - off: this observer is off
+ - critical: if the observer detects a failure, the process will crash
+ - non-critical: if the observer detects a failure, the error will be logged and the process will not crash
+
+
+ Example as startup parameter:
+ ```
+ mongos --setParameter "healthMonitoringIntensities={ \"values\" : [{ \"type\" : \"ldap\", \"intensity\" : \"critical\" } ]}"
+ ```
+
+ Example as runtime change command:
+ ```
+ db.adminCommand({ "setParameter": 1,
+ healthMonitoringIntensities: {values:
+ [{type: "ldap", intensity: "critical"}] } });
+ ```
+
+- healthMonitoringIntervals: how often this health observer will run, in milliseconds.
+
+ Example as startup parameter:
+ ```
+ mongos --setParameter "healthMonitoringIntervals={ \"values\" : [ { \"type\" : \"ldap\", \"interval\" : 30000 } ] }"
+ ```
+ Here the LDAP health observer is configured to run every 30 seconds.
+
+ Example as runtime change command:
+ ```
+ db.adminCommand({"setParameter": 1, "healthMonitoringIntervals":{"values": [{"type":"ldap", "interval": 30000}]} });
+ ```
+
+## LDAP Health Observer
+
+The LDAP Health Observer checks that at least one of the configured LDAP servers is up and running. On every run, it creates a new connection to every configured LDAP server and runs a simple query. The LDAP health observer uses the same parameters as described in the **LDAP Authorization** section of the manual.
+
+To enable this observer, use the *healthMonitoringIntensities* and *healthMonitoringIntervals* parameters as described above. The recommended value for the LDAP monitoring interval is 30 seconds.
+
+
+## Active Fault
+
+When a failure is detected and the observer is configured as *critical*, the server waits for a configured duration before crashing. The time between failure detection and the crash is set with the *activeFaultDurationSecs* parameter:
+
+- activeFaultDurationSecs: how long to wait between failure detection and the crash, in seconds. This can be configured at startup and changed at runtime.
+
+ Example:
+ ```
+ db.adminCommand({"setParameter": 1, activeFaultDurationSecs: 300});
+ ```
+
+## Progress Monitor
+
+*Progress Monitor* detects health checks that are stuck, meaning they have started but returned neither success nor failure. If a health check starts and does not complete, the server will crash. This behavior can be configured with:
+
+- progressMonitor: configure the progress monitor. Values:
+ - *interval*: how often to run the liveness check, in milliseconds
+ - *deadline*: timeout before crashing the server if a health check is not making progress, in seconds
+
+ Example:
+ ```
+ mongos --setParameter "progressMonitor={ \"interval\" : 1000, \"deadline\" : 300 }"
+ ```
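
As a sketch outside the patch itself, the `setParameter` documents used in the runtime examples above could be built with small helpers like the ones below. The function names `buildIntensityCommand` and `buildIntervalCommand` are hypothetical and not part of the server code base. The positive-integer check on the interval is an assumption made for illustration. Only the parameter names and the three intensity values come from the README.

```javascript
// Hypothetical helpers (not part of the server code base) that build the
// setParameter command documents shown in the README's runtime examples.

// Intensity values listed in the README.
const VALID_INTENSITIES = ["off", "critical", "non-critical"];

// Build the document that sets one observer's intensity.
function buildIntensityCommand(type, intensity) {
  if (!VALID_INTENSITIES.includes(intensity)) {
    throw new Error(`invalid intensity: ${intensity}`);
  }
  return {
    setParameter: 1,
    healthMonitoringIntensities: { values: [{ type, intensity }] },
  };
}

// Build the document that sets one observer's interval, in milliseconds.
// Assumption: the interval should be a positive integer.
function buildIntervalCommand(type, intervalMillis) {
  if (!Number.isInteger(intervalMillis) || intervalMillis <= 0) {
    throw new Error(`invalid interval (ms): ${intervalMillis}`);
  }
  return {
    setParameter: 1,
    healthMonitoringIntervals: { values: [{ type, interval: intervalMillis }] },
  };
}
```

In a shell session connected to mongos, the result of `buildIntensityCommand("ldap", "critical")` could be passed to `db.adminCommand`, which matches the runtime example in the README.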