author Andrew Shuvalov <andrew.shuvalov@mongodb.com> 2021-12-30 20:58:02 +0000
committer Evergreen Agent <no-reply@evergreen.mongodb.com> 2021-12-30 21:11:06 +0000
commit 540ba4b378cb59de776196fe4c7bc8265a63f2e2
tree 22000a46041541f3584bc31a53aa368ada508239
parent e725e95c534574a4f4f3c3b2c8fa2e4b53fa16ee
SERVER-62312 more health checking documentation
-rw-r--r-- jstests/sharding/health_monitor/parameters.js 4
-rw-r--r-- src/mongo/db/process_health/README.md 60
2 files changed, 60 insertions, 4 deletions
diff --git a/jstests/sharding/health_monitor/parameters.js b/jstests/sharding/health_monitor/parameters.js
index ee2c8c5d519..a375ee89b7e 100644
--- a/jstests/sharding/health_monitor/parameters.js
+++ b/jstests/sharding/health_monitor/parameters.js
@@ -8,10 +8,6 @@
let CUSTOM_INTERVAL = 1337;
let CUSTOM_DEADLINE = 5;
-// TODO(SERVER-59368):re-enable
-if (CUSTOM_INTERVAL > 0)
- return;
-
var st = new ShardingTest({
mongos: [
{
diff --git a/src/mongo/db/process_health/README.md b/src/mongo/db/process_health/README.md
index db7367b8ae1..c5e6f06e1ca 100644
--- a/src/mongo/db/process_health/README.md
+++ b/src/mongo/db/process_health/README.md
@@ -8,6 +8,66 @@ This module is capable to run server health checks and crash an unhealthy server
*Health Observers* are designed for every particular check to run. Each observer can be switched on or off and marked as critical or non-critical; a critical observer is able to crash the server on error. Each observer has a configurable interval of how often it will run the checks.
+## Health Observers Parameters
+- healthMonitoringIntensities: main configuration for each observer. Can be set at startup and changed at runtime. Valid values:
+ - off: this observer is off
+ - critical: if the observer detects a failure, the process will crash
+ - non-critical: if the observer detects a failure, the error will be logged and the process will not crash
+ Example as startup parameter:
+ ```
+ mongos --setParameter "healthMonitoringIntensities={ \"values\" : [{ \"type\" : \"ldap\", \"intensity\" : \"critical\" } ]}"
+ ```
+
+ Example as runtime change command:
+ ```
+ db.adminCommand({ "setParameter": 1,
+ healthMonitoringIntensities: {values:
+ [{type: "ldap", intensity: "critical"}] } });
+ ```
+
+- healthMonitoringIntervals: how often this health observer will run, in milliseconds.
+
+ Example as startup parameter:
+ ```
+ mongos --setParameter "healthMonitoringIntervals={ \"values\" : [ { \"type\" : \"ldap\", \"interval\" : 30000 } ] }"
+ ```
+ Here the LDAP health observer is configured to run every 30 seconds.
+
+ Example as runtime change command:
+ ```
+ db.adminCommand({"setParameter": 1, "healthMonitoringIntervals":{"values": [{"type":"ldap", "interval": 30000}]} });
+ ```
+
+## LDAP Health Observer
+
+The LDAP Health Observer checks that at least one of the configured LDAP servers is up and running. On every run, it creates a new connection to every configured LDAP server and runs a simple query. The LDAP health observer uses the same parameters as described in the **LDAP Authorization** section of the manual.
+
+To enable this observer, use the *healthMonitoringIntensities* and *healthMonitoringIntervals* parameters as described above. The recommended value for the LDAP monitoring interval is 30 seconds.
+
+
+## Active Fault
+
+When a failure is detected and the observer is configured as *critical*, the server waits for a configured interval before crashing. The interval between failure detection and the crash is set with the *activeFaultDurationSecs* parameter:
+
+- activeFaultDurationSecs: how long to wait from the failure detection to crash, in seconds. This can be configured at startup and changed at runtime.
+
+ Example:
+ ```
+ db.adminCommand({"setParameter": 1, activeFaultDurationSecs: 300});
+ ```
+
+## Progress Monitor
+
+*Progress Monitor* detects health checks that are stuck, that is, checks that return neither success nor failure. If a health check starts and does not complete, the server will crash. This behavior can be configured with:
+
+- progressMonitor: configure the progress monitor. Values:
+ - *interval*: how often to run the liveness check, in milliseconds
+ - *deadline*: timeout before crashing the server if a health check is not making progress, in seconds
+
+ Example:
+ ```
+ mongos --setParameter "progressMonitor={ \"interval\" : 1000, \"deadline\" : 300 }"
+ ```
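
The following is not part of the commit. It is a minimal sketch of how the parameter documents described in the README hunk above could be assembled programmatically, for example from a test script. The helper names `intensitiesCommand`, `intervalsCommand`, and `startupFlag` are hypothetical and do not exist in the MongoDB source tree; only the parameter names, field names, and intensity values are taken from the README text.

```javascript
// Hypothetical helpers, for illustration only; not part of MongoDB.

// Intensity values listed in the README.
const VALID_INTENSITIES = ["off", "critical", "non-critical"];

// Build a runtime setParameter command for healthMonitoringIntensities.
// `settings` maps an observer type (e.g. "ldap") to an intensity.
function intensitiesCommand(settings) {
    const values = Object.entries(settings).map(([type, intensity]) => {
        if (!VALID_INTENSITIES.includes(intensity)) {
            throw new Error(`invalid intensity for ${type}: ${intensity}`);
        }
        return {type, intensity};
    });
    // The command name must be the first field of the command document.
    return {setParameter: 1, healthMonitoringIntensities: {values}};
}

// Build a runtime setParameter command for healthMonitoringIntervals.
// Intervals are in milliseconds, as stated in the README.
function intervalsCommand(settings) {
    const values = Object.entries(settings).map(([type, interval]) => {
        if (!Number.isInteger(interval) || interval <= 0) {
            throw new Error(`invalid interval for ${type}: ${interval}`);
        }
        return {type, interval};
    });
    return {setParameter: 1, healthMonitoringIntervals: {values}};
}

// Render a startup flag with the inner double quotes escaped,
// matching the quoting style of the README examples.
function startupFlag(name, value) {
    const json = JSON.stringify(value).replace(/"/g, '\\"');
    return `--setParameter "${name}=${json}"`;
}
```

Under these assumptions, a shell session could call `db.adminCommand(intensitiesCommand({ldap: "critical"}))`, and `startupFlag("progressMonitor", {interval: 1000, deadline: 300})` would produce a string suitable for appending to a `mongos` command line. Validating the intensity in the helper is a design choice of this sketch; the server performs its own validation.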