author Gregory Wlodarek <gregory.wlodarek@mongodb.com> 2021-11-01 21:30:09 +0000
committer Evergreen Agent <no-reply@evergreen.mongodb.com> 2021-11-01 22:22:53 +0000
commit 0835e6665f231556f931f1059b926ddf9434a60e
tree a7693efc6a7578f82f681cbfa84acbdb975eab86
parent 8d6668dd678ca61668d41ceb3d0082290b429562
SERVER-54590 Architecture Guide updates for PM-2189
-rw-r--r-- src/mongo/db/catalog/README.md    |  6
-rw-r--r-- src/mongo/db/timeseries/README.md | 47
2 files changed, 39 insertions, 14 deletions
diff --git a/src/mongo/db/catalog/README.md b/src/mongo/db/catalog/README.md
index 921d793c446..94251eaa073 100644
--- a/src/mongo/db/catalog/README.md
+++ b/src/mongo/db/catalog/README.md
@@ -44,6 +44,12 @@ in this table is indexed with a 64-bit `RecordId`, referred to as the catalog ID
BSON document that describes the properties of a collection and its indexes. The `DurableCatalog`
class allows read and write access to the durable data.
+Starting in v5.2, catalog entries for time-series collections have a new flag called
+`timeseriesBucketsMayHaveMixedSchemaData` in the `md` field. Time-series collections upgraded from
+versions earlier than v5.2 may have mixed-schema data in buckets. This flag gets set to `true` as
+part of the upgrade process and is removed as part of the downgrade process through the
+[collMod command](https://github.com/mongodb/mongo/blob/cf80c11bc5308d9b889ed61c1a3eeb821839df56/src/mongo/db/catalog/coll_mod.cpp#L644-L663).
+
**Example**: an entry in the durable catalog for a collection `test.employees` with an in-progress
index build on `{lastName: 1}`:
diff --git a/src/mongo/db/timeseries/README.md b/src/mongo/db/timeseries/README.md
index dc257655225..15c6b822c11 100644
--- a/src/mongo/db/timeseries/README.md
+++ b/src/mongo/db/timeseries/README.md
@@ -76,11 +76,16 @@ certain properties:
In order to support queries on the time-series collection that could benefit from indexed access
rather than collection scans, indexes may be created on the time, meta-data, and meta-data subfields
-of a time-series collection. The index key specification provided by the user via `createIndex` will
-be converted to the underlying buckets collection's schema.
-* The details for mapping the index specificiation between the time-series collection and the
+of a time-series collection. Starting in v5.2, indexes on time-series collection measurement fields
+are permitted. The index key specification provided by the user via `createIndex` will be converted
+to the underlying buckets collection's schema.
+* The details for mapping the index specification between the time-series collection and the
underlying buckets collection may be found in
[timeseries_index_schema_conversion_functions.h](timeseries_index_schema_conversion_functions.h).
+* Newly supported index types in v5.2 and up
+ [store the original user index definition](https://github.com/mongodb/mongo/blob/cf80c11bc5308d9b889ed61c1a3eeb821839df56/src/mongo/db/timeseries/timeseries_commands_conversion_helper.cpp#L140-L147)
+ on the transformed index definition. When mapping the bucket collection index to the time-series
+ collection index, the original user index definition is returned.
Once the indexes have been created, they can be inspected through the `listIndexes` command or the
`$indexStats` aggregation stage. `listIndexes` and `$indexStats` against a time-series collection
@@ -92,16 +97,28 @@ field.
`dropIndex` and `collMod` (`hidden: <bool>`, `expireAfterSeconds: <num>`) are also supported on
time-series collections.
-Most index types are supported on time-series collections, including
-[hashed](https://docs.mongodb.com/manual/core/index-hashed/),
-[wildcard](https://docs.mongodb.com/manual/core/index-wildcard/),
-[sparse](https://docs.mongodb.com/manual/core/index-sparse/),
-[multikey](https://docs.mongodb.com/manual/core/index-multikey/), and
-[indexes with collations](https://docs.mongodb.com/manual/indexes/#indexes-and-collation).
+Supported index types on the time field:
+* [Single](https://docs.mongodb.com/manual/core/index-single/).
+* [Compound](https://docs.mongodb.com/manual/core/index-compound/).
+* [Hashed](https://docs.mongodb.com/manual/core/index-hashed/).
+* [Wildcard](https://docs.mongodb.com/manual/core/index-wildcard/).
+* [Sparse](https://docs.mongodb.com/manual/core/index-sparse/).
+* [Multikey](https://docs.mongodb.com/manual/core/index-multikey/).
+* [Indexes with collations](https://docs.mongodb.com/manual/indexes/#indexes-and-collation).
+
+Supported index types on the meta-data field and meta-data subfields:
+* All of the supported index types on the time field.
+* [2d](https://docs.mongodb.com/manual/core/2d/) in v5.2 and up.
+* [2dsphere](https://docs.mongodb.com/manual/core/2dsphere/) in v5.2 and up.
+* [Partial](https://docs.mongodb.com/manual/core/index-partial/) in v5.2 and up.
+
+Supported index types on measurement fields in v5.2 and up only:
+* [Single](https://docs.mongodb.com/manual/core/index-single/).
+* [Compound](https://docs.mongodb.com/manual/core/index-compound/).
+* [2dsphere](https://docs.mongodb.com/manual/core/2dsphere/).
+* [Partial](https://docs.mongodb.com/manual/core/index-partial/).
Index types that are not supported on time-series collections include
-[geo](https://docs.mongodb.com/manual/core/2dsphere/),
-[partial](https://docs.mongodb.com/manual/core/index-partial/),
[unique](https://docs.mongodb.com/manual/core/index-unique/), and
[text](https://docs.mongodb.com/manual/core/index-text/).
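The support lists added in this hunk can be read as a small lookup table. The sketch below is illustrative only and is not server code: the names `SUPPORT` and `is_supported`, the short string labels for index types, and the `(major, minor)` version tuples are all assumptions made for this example. It only encodes what the hunk states; index types the hunk does not list for a field kind (including `unique` and `text`) are treated as unsupported.

```python
from typing import Dict, Tuple

Version = Tuple[int, int]

# (0, 0) means "the diff states no minimum version"; V5_2 means "v5.2 and up".
ANY: Version = (0, 0)
V5_2: Version = (5, 2)

# Index types the diff lists for the time field (labels are hypothetical).
TIME_TYPES: Dict[str, Version] = {
    name: ANY
    for name in ("single", "compound", "hashed", "wildcard",
                 "sparse", "multikey", "collation")
}

# Field kind -> index type -> first version in which the diff says it is supported.
SUPPORT: Dict[str, Dict[str, Version]] = {
    "time": dict(TIME_TYPES),
    # Meta-data fields: everything the time field supports, plus three v5.2 additions.
    "meta": {**TIME_TYPES, "2d": V5_2, "2dsphere": V5_2, "partial": V5_2},
    # Measurement fields: indexable only from v5.2.
    "measurement": {"single": V5_2, "compound": V5_2,
                    "2dsphere": V5_2, "partial": V5_2},
}


def is_supported(field_kind: str, index_type: str, version: Version) -> bool:
    """Return True if the lists above allow this index type on this field kind."""
    if field_kind not in SUPPORT:
        raise ValueError(f"unknown field kind: {field_kind!r}")
    minimum = SUPPORT[field_kind].get(index_type)
    # Types absent from the table (for example unique and text) are unsupported.
    return minimum is not None and version >= minimum
```

For example, `is_supported("measurement", "single", (5, 1))` is false under this table, while the same call with `(5, 2)` is true.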
@@ -128,14 +145,16 @@ handled by an op observer, but may be necessary to call from other places.
A bucket is closed either manually, by setting the optional `control.closed` flag, or automatically
by the `BucketCatalog` in a number of situations. If the `BucketCatalog` is using more memory than
-it's given threshold (controlled by the server paramter
+it's given threshold (controlled by the server parameter
`timeseriesIdleBucketExpiryMemoryUsageThreshold`), it will start to close idle buckets. A bucket is
considered idle if it is open and it does not have any uncommitted measurements pending. The
-`BucketCatalog` will also close a bucket if it contains more than the maximum number of measurments
+`BucketCatalog` will also close a bucket if it contains more than the maximum number of measurements
(`timeseriesBucketMaxCount`), if it contains more than the maximum amount of data
(`timeseriesBucketMaxSize`), or if a new measurement would cause the bucket to span a greater
amount of time between it's oldest and newest time stamp than is allowed (currently hard-coded to
-one hour).
+one hour). If an incoming measurement is schematically incompatible relative to the measurements
+which have already landed in a given bucket, that bucket will be closed and is tracked with the
+`numBucketsClosedDueToSchemaChange` metric.
The first time a write batch is committed for a given bucket, the newly-formed document is
inserted. On subsequent batch commits, we perform an update operation. Instead of generating the
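The automatic bucket-closing conditions described in the last hunk can be sketched as a single decision function. This is a simplified model, not the `BucketCatalog` implementation: the `Bucket` class, `close_reason`, and the numeric limits are hypothetical stand-ins, and "schematically incompatible" is approximated here as a field arriving with a different Python type than the one already seen in the bucket. The sketch also checks whether adding the incoming measurement would exceed a limit, which is one possible reading of the README wording. Idle-bucket expiry under memory pressure is not modeled.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Any, Dict, Optional

# Illustrative limits only; the real values come from server parameters
# (timeseriesBucketMaxCount, timeseriesBucketMaxSize) and a hard-coded span.
MAX_COUNT = 1000
MAX_SIZE_BYTES = 128_000
MAX_SPAN = timedelta(hours=1)


@dataclass
class Bucket:
    """Hypothetical in-memory summary of an open bucket."""
    count: int = 0
    size_bytes: int = 0
    oldest: Optional[datetime] = None
    newest: Optional[datetime] = None
    field_types: Dict[str, type] = field(default_factory=dict)


def close_reason(bucket: Bucket, time: datetime,
                 doc: Dict[str, Any], doc_size: int) -> Optional[str]:
    """Return why the bucket should close before accepting this measurement,
    or None if the measurement fits."""
    if bucket.count + 1 > MAX_COUNT:
        return "count"
    if bucket.size_bytes + doc_size > MAX_SIZE_BYTES:
        return "size"
    if bucket.oldest is not None and bucket.newest is not None:
        # Span the bucket would cover if this measurement were added.
        oldest = min(bucket.oldest, time)
        newest = max(bucket.newest, time)
        if newest - oldest > MAX_SPAN:
            return "time_span"
    for name, value in doc.items():
        known = bucket.field_types.get(name)
        if known is not None and known is not type(value):
            # The case the README tracks as numBucketsClosedDueToSchemaChange.
            return "schema_change"
    return None
```

A caller would close the current bucket and open a new one whenever `close_reason` returns a non-`None` value, incrementing the matching metric for the `"schema_change"` case.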