author: Sanika Phanse <sanika.phanse@mongodb.com> 2022-06-21 15:51:36 +0000
committer: Evergreen Agent <no-reply@evergreen.mongodb.com> 2022-06-21 16:56:43 +0000
commit: a45fbd288c1c0b2f6076ac1e2ae1fc8396788a15
tree: c2c554891b7cccca849fec76f94bf6e030f91fda
parent: afc52bea57c8edb8769efa67c01ebbde5a87f50d
SERVER-66358 Add the "Internal Sessions" section to the sharding architecture guide
-rw-r--r-- src/mongo/db/s/README.md | 12
1 files changed, 10 insertions, 2 deletions
diff --git a/src/mongo/db/s/README.md b/src/mongo/db/s/README.md
index b7d8bdff562..f3e67bce8b8 100644
--- a/src/mongo/db/s/README.md
+++ b/src/mongo/db/s/README.md
@@ -752,10 +752,14 @@ operations. The metadata is reaped if the cluster does not receive a new operati
session for a reasonably long time (the default is 30 minutes).
A logical session is identified by its "logical session id," or `lsid`. An `lsid` is a combination
-of two pieces of information:
+of up to four pieces of information:
1. `id` - A globally unique id (UUID) generated by the mongo shell, driver, or the `startSession` server command
1. `uid` (user id) - The identification information for the logged-in user (if authentication is enabled)
+1. `txnNumber` - An optional parameter set only for internal transactions spawned from retryable writes. It is a strictly increasing counter that the transaction API sets to match the `txnNumber` of the corresponding retryable write.
+1. `txnUUID` - An optional parameter set only for internal transactions spawned inside client sessions. The txnUUID is a globally unique id generated by the transaction API.
+
+A logical session with a `txnNumber` and `txnUUID` is considered a child of the session with matching `id` and `uid` values. There may be multiple child sessions per parent session. Checking out a child session or its parent session also checks out the other and updates the `lastUsedTime` of both. Killing a parent session also kills all of its child sessions.
The order of operations in the logical session that need to durably store metadata is defined by an
integer counter, called the `txnNumber`. When the cluster receives a retryable write or transaction
@@ -848,8 +852,12 @@ and to check the session back in upon completion. When a session is checked out,
until it is checked back in, forcing other operations to wait for the ongoing operation to complete
or yield the session.
+Checking out an internal/child session additionally checks out its parent session (the session with the same `id` and `uid` values in the `lsid`, but without a `txnNumber` or `txnUUID` value), and vice versa.
+
The runtime state for a session consists of the last checkout time and operation, the number of operations
-waiting to check out the session, and the number of kills requested. The last checkout time is used by
+waiting to check out the session, and the number of kills requested. Retryable internal sessions are reaped from the logical session catalog [eagerly](https://github.com/mongodb/mongo/blob/67e37f8e806a6a5d402e20eee4b3097e2b11f820/src/mongo/db/session_catalog.cpp#L342): once a transaction session with a higher transaction number has successfully started, sessions with lower `txnNumber` values are removed from the session catalog and inserted into an in-memory buffer by the [InternalTransactionsReapService](https://github.com/mongodb/mongo/blob/67e37f8e806a6a5d402e20eee4b3097e2b11f820/src/mongo/db/internal_transactions_reap_service.h#L42). They stay buffered until a configurable threshold is met (1000 by default), after which they are deleted from the transactions table (`config.transactions`) and `config.image_collection` all at once. Eager reaping is best-effort, in that the in-memory buffer is cleared on stepdown or restart. Any missed sessions will be reaped once the session expires or their `config.transactions` entries have not been written to for `TransactionRecordMinimumLifetimeMinutes` minutes.
+
+The last checkout time is used by
the [periodic job inside the logical session cache](#periodic-cleanup-of-the-session-catalog-and-transactions-table)
to determine when a session should be reaped from the session catalog, whereas the number of
operations waiting to check out a session is used to block reaping of sessions that are still in
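
As a reading aid for the README text added by this patch, the parent/child session relationship and the best-effort eager reaping buffer can be sketched in Python. This is a minimal model under stated assumptions, not MongoDB code: the names `Lsid`, `EagerReapBuffer`, `deleted_batches`, and `on_stepdown` are hypothetical, and treating the threshold as a count of buffered sessions is an assumption drawn from the wording "until a configurable threshold is met (1000 by default)".

```python
import uuid
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Lsid:
    """Hypothetical model of an lsid: two base fields plus two optional ones."""
    id: uuid.UUID
    uid: Optional[bytes] = None          # present only when authentication is enabled
    txn_number: Optional[int] = None     # optional, internal transactions only
    txn_uuid: Optional[uuid.UUID] = None  # optional, internal transactions only

    def is_child(self) -> bool:
        # Assumption: any lsid carrying txn_number or txn_uuid is a child session.
        return self.txn_number is not None or self.txn_uuid is not None

    def parent(self) -> "Lsid":
        # The parent has the same id and uid, without txn_number or txn_uuid.
        return replace(self, txn_number=None, txn_uuid=None)


class EagerReapBuffer:
    """Hypothetical in-memory buffer: batch deletes once a threshold is met."""

    def __init__(self, threshold: int = 1000) -> None:
        self.threshold = threshold
        self.pending: List[Lsid] = []
        # Stand-in for the deletes against config.transactions and
        # config.image_collection; each entry is one batch deleted at once.
        self.deleted_batches: List[List[Lsid]] = []

    def add(self, lsid: Lsid) -> None:
        self.pending.append(lsid)
        if len(self.pending) >= self.threshold:
            self.deleted_batches.append(list(self.pending))
            self.pending.clear()

    def on_stepdown(self) -> None:
        # Best-effort: buffered sessions are dropped without being deleted here;
        # per the text, they are reaped later by the expiry-based path.
        self.pending.clear()
```

In this model, a child session maps back to its parent by dropping the two optional fields, which mirrors the description of the parent as the session "with the same `id` and `uid`" but without `txnNumber` or `txnUUID`.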