27 files changed, 509 insertions, 203 deletions
diff --git a/buildscripts/antithesis/README.md b/buildscripts/antithesis/README.md index 8b0a5de5998..8cff329010c 100644 --- a/buildscripts/antithesis/README.md +++ b/buildscripts/antithesis/README.md @@ -13,8 +13,8 @@ use Antithesis today. ## Base Images The `base_images` directory consists of the building blocks for creating a MongoDB test topology. These images are uploaded to the Antithesis Docker registry weekly during the -`antithesis_image_build` task. For more visibility into how these images are built and uploaded to -the Antithesis Docker registry, please see `evergreen/antithesis_image_build.sh`. +`antithesis_image_push` task. For more visibility into how these images are built and uploaded to +the Antithesis Docker registry, please see that task. ### mongo_binaries This image contains the latest `mongo`, `mongos` and `mongod` binaries. It can be used to @@ -27,7 +27,7 @@ container is not part of the actual topology. The purpose of a `workload` contain `mongo` commands to complete the topology setup, and to run a test suite on an existing topology like so: ```shell -buildscript/resmoke.py run --suite antithesis_concurrency_sharded_with_stepdowns_and_balancer --shellConnString "mongodb://mongos:27017" +buildscript/resmoke.py run --suite antithesis_concurrency_sharded_with_stepdowns_and_balancer ``` **Every topology must have 1 workload container.** @@ -46,18 +46,19 @@ consists of a `docker-compose.yml`, a `logs` directory, a `scripts` directory an directory. If this is structured properly, you should be able to copy the files & directories from this image and run `docker-compose up` to set up the desired topology. 
-Example from `buildscripts/antithesis/topologies/replica_set/Dockerfile`: +Example from `buildscripts/antithesis/topologies/sharded_cluster/Dockerfile`: ```Dockerfile FROM scratch COPY docker-compose.yml / ADD scripts /scripts ADD logs /logs ADD data /data +ADD debug /debug ``` All topology images are built and uploaded to the Antithesis Docker registry during the -`antithesis_image_build` task in the `evergreen/antithesis_image_build.sh` script. Some of these -directories are created in `evergreen/antithesis_image_build.sh` such as `/data` and `/logs`. +`antithesis_image_push` task. Some of these directories are created during the +`evergreen/antithesis_image_build.sh` script such as `/data` and `/logs`. Note: These images serve solely as a filesystem containing all necessary files for a topology, therefore use `FROM scratch`. @@ -66,20 +67,38 @@ therefore use `FROM scratch`. This describes how to construct the corresponding topology using the `mongo-binaries` and `workload` images. -Example from `buildscripts/antithesis/topologies/replica_set/docker-compose.yml`: +Example from `buildscripts/antithesis/topologies/sharded_cluster/docker-compose.yml`: ```yml version: '3.0' services: - database1: + configsvr1: + container_name: configsvr1 + hostname: configsvr1 + image: mongo-binaries:evergreen-latest-master + volumes: + - ./logs/configsvr1:/var/log/mongodb/ + - ./scripts:/scripts/ + - ./data/configsvr1:/data/configdb/ + command: /bin/bash /scripts/configsvr_init.sh + networks: + antithesis-net: + ipv4_address: 10.20.20.6 + # Set an IPv4 address of 10.20.20.130 or higher + # to be ignored by the fault injector + # + + configsvr2: ... + configsvr3: ... + database1: ... 
container_name: database1 hostname: database1 image: mongo-binaries:evergreen-latest-master - command: /bin/bash /scripts/database_init.sh volumes: - ./logs/database1:/var/log/mongodb/ - ./scripts:/scripts/ - ./data/database1:/data/db/ + command: /bin/bash /scripts/database_init.sh Shard1 networks: antithesis-net: ipv4_address: 10.20.20.3 @@ -88,37 +107,59 @@ services: # database2: ... database3: ... + database4: ... + database5: ... + database6: ... + mongos: + container_name: mongos + hostname: mongos + image: mongo-binaries:evergreen-latest-master + volumes: + - ./logs/mongos:/var/log/mongodb/ + - ./scripts:/scripts/ + command: python3 /scripts/mongos_init.py + depends_on: + - "database1" + - "database2" + - "database3" + - "database4" + - "database5" + - "database6" + - "configsvr1" + - "configsvr2" + - "configsvr3" + networks: + antithesis-net: + ipv4_address: 10.20.20.9 + # The subnet provided here is an example + # An alternative subnet can be used workload: container_name: workload hostname: workload image: workload:evergreen-latest-master - command: /bin/bash /scripts/workload_init.sh volumes: - ./logs/workload:/var/log/resmoke/ - ./scripts:/scripts/ + command: python3 /scripts/workload_init.py depends_on: - - "database1" - - "database2" - - "database3" + - "mongos" networks: antithesis-net: ipv4_address: 10.20.20.130 # The subnet provided here is an example # An alternative subnet can be used - networks: antithesis-net: driver: bridge ipam: config: - subnet: 10.20.20.0/24 - ``` -Each container must have a `command`in `docker-compose.yml` that runs an init script. The init +Each container must have a `command` in `docker-compose.yml` that runs an init script. The init script belongs in the `scripts` directory, which is included as a volume. The `command` should be -set like so: `/bin/bash /scripts/[script_name].sh`. This is a requirement for the topology to start -up properly in Antithesis. 
+set like so: `/bin/bash /scripts/[script_name].sh` or `python3 /scripts/[script_name].py`. This is +a requirement for the topology to start up properly in Antithesis. When creating `mongod` or `mongos` instances, route the logs like so: `--logpath /var/log/mongodb/mongodb.log` and utilize `volumes` -- as in `database1`. @@ -133,28 +174,24 @@ Use the `evergreen-latest-master` tag for all images. This is updated automatica ### scripts -Example from `buildscripts/antithesis/topologies/replica_set/scripts/workload_init.sh`: -```shell -sleep 5s -mongo --host database1 --port 27017 --eval "config={\"_id\" : \"RollbackFuzzerTest\",\"protocolVersion\" : 1,\"members\" : [{\"_id\" : 0,\"host\" : \"database1:27017\"}, {\"_id\" : 1,\"host\" : \"database2:27017\"}, {\"_id\" : 2,\"host\" : \"database3:27017\"} ],\"settings\" : {\"chainingAllowed\" : false,\"electionTimeoutMillis\" : 500, \"heartbeatTimeoutSecs\" : 1, \"catchUpTimeoutMillis\": 700}}; rs.initiate(config)" - -# this cryptic statement keeps the workload container running. -tail -f /dev/null -``` -The `sleep` command can be useful to ensure that other containers have had a chance to start. In -this example, the `workload` container waits 5 seconds while the database containers start. -After that, it initiates the replica set. The `tail -f /dev/null` is required for `workload` -containers otherwise the container shuts down. +Take a look at `buildscripts/antithesis/topologies/sharded_cluster/scripts/mongos_init.py` to see +how to use util methods from `buildscripts/antithesis/topologies/sharded_cluster/scripts/utils.py` +to set up the desired topology. You can also use simple shell scripts as in the case of +`buildscripts/antithesis/topologies/sharded_cluster/scripts/database_init.sh`. These init scripts +must not end in order to keep the underlying container alive. You can use an infinite while +loop for `python` scripts or you can use `tail -f /dev/null` for shell scripts. 
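The "init scripts must not end" rule above is the one behavior every topology script shares. As a minimal sketch (a hypothetical helper, not a file from this patch), the Python keep-alive pattern can be written as follows; the `max_iterations` parameter is added purely so the loop is testable:

```python
import time


def keep_alive(poll_seconds=1.0, max_iterations=None):
    """Sleep in a loop so the container's init process never exits.

    A real init script would run its setup commands first, then call
    keep_alive() with no arguments and block forever. Sleeping avoids
    the CPU cost of a bare `while True: continue` busy-wait.
    """
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        time.sleep(poll_seconds)
        iterations += 1
    return iterations
```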
## How do I create a new topology for Antithesis testing? Creating a new topology for Antithesis testing is easy & requires a few simple steps. 1. Add a new directory in `buildscripts/antithesis/topologies` to represent your desired topology. You can use existing topologies as an example. -2. Update the `evergreen/antithesis_image_build.sh` file so that your new topology image is +2. Make sure that your workload test suite runs against your topology without any failures. This +may require tagging some tests as `antithesis_incompatible`. +3. Update the `antithesis_image_push` task so that your new topology image is uploaded to the Antithesis Docker registry. -3. Reach out to #server-testing on Slack & provide the new topology image name as well as the +4. Reach out to #server-testing on Slack & provide the new topology image name as well as the desired test suite to run. -4. Include a member of the STM team on the code review. +5. Include the SDP team on the code review. These are the required updates to `evergreen/antithesis_image_build.sh`: - Add the following command for each of your `mongos` and `mongod` containers in your topology to @@ -169,6 +206,7 @@ cd [your_topology_dir] sed -i s/evergreen-latest-master/$tag/ docker-compose.yml sudo docker build . 
-t [your-topology-name]-config:$tag ``` +These are the required updates to `evergreen/antithesis_image_push.sh`: - Push your new image to the Antithesis Docker registry ```shell sudo docker tag "[your-topology-name]-config:$tag" "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/[your-topology-name]-config:$tag" diff --git a/buildscripts/antithesis/base_images/mongo_binaries/Dockerfile b/buildscripts/antithesis/base_images/mongo_binaries/Dockerfile index fe93253eef4..5390f8b9ddb 100644 --- a/buildscripts/antithesis/base_images/mongo_binaries/Dockerfile +++ b/buildscripts/antithesis/base_images/mongo_binaries/Dockerfile @@ -12,7 +12,7 @@ RUN mkdir -p /scripts # Install dependencies of MongoDB Server RUN apt-get update -RUN apt-get install -qy libcurl4 libgssapi-krb5-2 libldap-2.4-2 libwrap0 libsasl2-2 libsasl2-modules libsasl2-modules-gssapi-mit snmp openssl liblzma5 +RUN apt-get install -qy libcurl4 libgssapi-krb5-2 libldap-2.4-2 libwrap0 libsasl2-2 libsasl2-modules libsasl2-modules-gssapi-mit snmp openssl liblzma5 python3 # ------------------- # Everything above this line should be common image setup diff --git a/buildscripts/antithesis/base_images/workload/Dockerfile b/buildscripts/antithesis/base_images/workload/Dockerfile index 58f52bda489..01017d51c61 100644 --- a/buildscripts/antithesis/base_images/workload/Dockerfile +++ b/buildscripts/antithesis/base_images/workload/Dockerfile @@ -30,8 +30,12 @@ COPY src /resmoke RUN bash -c "cd /resmoke && python3.9 -m venv python3-venv && . 
python3-venv/bin/activate && pip install --upgrade pip wheel && pip install -r ./buildscripts/requirements.txt && ./buildscripts/antithesis_suite.py generate-all" +# copy the run_suite.py script & mongo binary -- make sure they are executable +COPY run_suite.py /resmoke + COPY mongo /usr/bin RUN chmod +x /usr/bin/mongo + COPY libvoidstar.so /usr/lib/libvoidstar.so RUN /usr/bin/mongo --version diff --git a/buildscripts/antithesis/base_images/workload/run_suite.py b/buildscripts/antithesis/base_images/workload/run_suite.py new file mode 100644 index 00000000000..ca06f3423ad --- /dev/null +++ b/buildscripts/antithesis/base_images/workload/run_suite.py @@ -0,0 +1,23 @@ +"""Script to run suite in Antithesis from the workload container.""" +import subprocess +from time import sleep +import pymongo + +client = pymongo.MongoClient(host="mongos", port=27017, serverSelectionTimeoutMS=30000) + +while True: + payload = client.admin.command({"listShards": 1}) + if len(payload["shards"]) == 2: + print("Sharded Cluster available.") + break + if len(payload["shards"]) < 2: + print("Waiting for shards to be added to cluster.") + sleep(5) + continue + if len(payload["shards"]) > 2: + raise RuntimeError('More shards in cluster than expected.') + +subprocess.run([ + "./buildscripts/resmoke.py", "run", "--suite", + "antithesis_concurrency_sharded_with_stepdowns_and_balancer" +], check=True) diff --git a/buildscripts/antithesis/topologies/sharded_cluster/docker-compose.yml b/buildscripts/antithesis/topologies/sharded_cluster/docker-compose.yml index 767a4d19eb1..8f4a5ce9a62 100644 --- a/buildscripts/antithesis/topologies/sharded_cluster/docker-compose.yml +++ b/buildscripts/antithesis/topologies/sharded_cluster/docker-compose.yml @@ -146,8 +146,14 @@ services: volumes: - ./logs/mongos:/var/log/mongodb/ - ./scripts:/scripts/ - command: /bin/bash /scripts/mongos_init.sh + command: python3 /scripts/mongos_init.py depends_on: + - "database1" + - "database2" + - "database3" + - "database4" + 
- "database5" + - "database6" - "configsvr1" - "configsvr2" - "configsvr3" @@ -163,17 +169,8 @@ services: volumes: - ./logs/workload:/var/log/resmoke/ - ./scripts:/scripts/ - command: /bin/bash /scripts/workload_init.sh + command: python3 /scripts/workload_init.py depends_on: - - "database1" - - "database2" - - "database3" - - "database4" - - "database5" - - "database6" - - "configsvr1" - - "configsvr2" - - "configsvr3" - "mongos" networks: antithesis-net: diff --git a/buildscripts/antithesis/topologies/sharded_cluster/scripts/mongos_init.py b/buildscripts/antithesis/topologies/sharded_cluster/scripts/mongos_init.py new file mode 100644 index 00000000000..0d3aa7cfc8d --- /dev/null +++ b/buildscripts/antithesis/topologies/sharded_cluster/scripts/mongos_init.py @@ -0,0 +1,158 @@ +"""Script to configure a sharded cluster in Antithesis from the mongos container.""" +import json +import subprocess +from utils import mongo_process_running, retry_until_success + +CONFIGSVR_CONFIG = { + "_id": "ConfigServerReplSet", + "configsvr": True, + "protocolVersion": 1, + "members": [ + {"_id": 0, "host": "configsvr1:27019"}, + {"_id": 1, "host": "configsvr2:27019"}, + {"_id": 2, "host": "configsvr3:27019"}, + ], + "settings": { + "chainingAllowed": False, + "electionTimeoutMillis": 2000, + "heartbeatTimeoutSecs": 1, + "catchUpTimeoutMillis": 0, + }, +} + +SHARD1_CONFIG = { + "_id": "Shard1", + "protocolVersion": 1, + "members": [ + {"_id": 0, "host": "database1:27018"}, + {"_id": 1, "host": "database2:27018"}, + {"_id": 2, "host": "database3:27018"}, + ], + "settings": { + "chainingAllowed": False, + "electionTimeoutMillis": 2000, + "heartbeatTimeoutSecs": 1, + "catchUpTimeoutMillis": 0, + }, +} + +SHARD2_CONFIG = { + "_id": "Shard2", + "protocolVersion": 1, + "members": [ + {"_id": 0, "host": "database4:27018"}, + {"_id": 1, "host": "database5:27018"}, + {"_id": 2, "host": "database6:27018"}, + ], + "settings": { + "chainingAllowed": False, + "electionTimeoutMillis": 2000, + 
"heartbeatTimeoutSecs": 1, + "catchUpTimeoutMillis": 0, + }, +} + +# Create ConfigServerReplSet once all nodes are up +retry_until_success(mongo_process_running, {"host": "configsvr1", "port": 27019}) +retry_until_success(mongo_process_running, {"host": "configsvr2", "port": 27019}) +retry_until_success(mongo_process_running, {"host": "configsvr3", "port": 27019}) +retry_until_success( + subprocess.run, { + "args": [ + "mongo", + "--host", + "configsvr1", + "--port", + "27019", + "--eval", + f"rs.initiate({json.dumps(CONFIGSVR_CONFIG)})", + ], + "check": True, + }) + +# Create Shard1 once all nodes are up +retry_until_success(mongo_process_running, {"host": "database1", "port": 27018}) +retry_until_success(mongo_process_running, {"host": "database2", "port": 27018}) +retry_until_success(mongo_process_running, {"host": "database3", "port": 27018}) +retry_until_success( + subprocess.run, { + "args": [ + "mongo", + "--host", + "database1", + "--port", + "27018", + "--eval", + f"rs.initiate({json.dumps(SHARD1_CONFIG)})", + ], + "check": True, + }) + +# Create Shard2 once all nodes are up +retry_until_success(mongo_process_running, {"host": "database4", "port": 27018}) +retry_until_success(mongo_process_running, {"host": "database5", "port": 27018}) +retry_until_success(mongo_process_running, {"host": "database6", "port": 27018}) +retry_until_success( + subprocess.run, { + "args": [ + "mongo", + "--host", + "database4", + "--port", + "27018", + "--eval", + f"rs.initiate({json.dumps(SHARD2_CONFIG)})", + ], + "check": True, + }) + +# Start mongos +retry_until_success( + subprocess.run, { + "args": [ + "mongos", + "--bind_ip", + "0.0.0.0", + "--configdb", + "ConfigServerReplSet/configsvr1:27019,configsvr2:27019,configsvr3:27019", + "--logpath", + "/var/log/mongodb/mongodb.log", + "--setParameter", + "enableTestCommands=1", + "--setParameter", + "fassertOnLockTimeoutForStepUpDown=0", + "--fork", + ], + "check": True, + }) + +# Add shards to cluster +retry_until_success( + 
subprocess.run, { + "args": [ + "mongo", + "--host", + "mongos", + "--port", + "27017", + "--eval", + 'sh.addShard("Shard1/database1:27018,database2:27018,database3:27018")', + ], + "check": True, + }) +retry_until_success( + subprocess.run, { + "args": [ + "mongo", + "--host", + "mongos", + "--port", + "27017", + "--eval", + 'sh.addShard("Shard2/database4:27018,database5:27018,database6:27018")', + ], + "check": True, + }) + +while True: + continue diff --git a/buildscripts/antithesis/topologies/sharded_cluster/scripts/mongos_init.sh b/buildscripts/antithesis/topologies/sharded_cluster/scripts/mongos_init.sh deleted file mode 100755 index 49df9e67490..00000000000 --- a/buildscripts/antithesis/topologies/sharded_cluster/scripts/mongos_init.sh +++ /dev/null @@ -1,7 +0,0 @@ -sleep 5s -mongo --host configsvr1 --port 27019 --eval "config={\"_id\" : \"ConfigServerReplSet\",\"configsvr\" : true,\"protocolVersion\" : 1,\"members\" : [{\"_id\" : 0,\"host\" : \"configsvr1:27019\"}, {\"_id\" : 1,\"host\" : \"configsvr2:27019\"}, {\"_id\" : 2,\"host\" : \"configsvr3:27019\"} ],\"settings\" : {\"chainingAllowed\" : false,\"electionTimeoutMillis\" : 2000, \"heartbeatTimeoutSecs\" : 1, \"catchUpTimeoutMillis\": 0}}; rs.initiate(config)" - -mongos --bind_ip 0.0.0.0 --configdb ConfigServerReplSet/configsvr1:27019,configsvr2:27019,configsvr3:27019 --logpath /var/log/mongodb/mongodb.log --setParameter enableTestCommands=1 --setParameter fassertOnLockTimeoutForStepUpDown=0 - -# this cryptic statement keeps the container running. 
-tail -f /dev/null diff --git a/buildscripts/antithesis/topologies/sharded_cluster/scripts/utils.py b/buildscripts/antithesis/topologies/sharded_cluster/scripts/utils.py new file mode 100644 index 00000000000..3338c68e7e0 --- /dev/null +++ b/buildscripts/antithesis/topologies/sharded_cluster/scripts/utils.py @@ -0,0 +1,25 @@ +"""Util functions to assist in setting up a sharded cluster topology in Antithesis.""" +import subprocess +import time + + +def mongo_process_running(host, port): + """Check to see if the process at the given host & port is running.""" + return subprocess.run(['mongo', '--host', host, '--port', + str(port), '--eval', '"db.stats()"'], check=True) + + +def retry_until_success(func, kwargs=None, wait_time=1, timeout_period=30): + """Retry the function periodically until timeout.""" + kwargs = {} if kwargs is None else kwargs + timeout = time.time() + timeout_period + while True: + if time.time() > timeout: + raise TimeoutError( + f"{func.__name__} called with {kwargs} timed out after {timeout_period} second(s).") + try: + func(**kwargs) + break + except: # pylint: disable=bare-except + print(f"Retrying {func.__name__} called with {kwargs} after {wait_time} second(s).") + time.sleep(wait_time) diff --git a/buildscripts/antithesis/topologies/sharded_cluster/scripts/workload_init.py b/buildscripts/antithesis/topologies/sharded_cluster/scripts/workload_init.py new file mode 100644 index 00000000000..3784825ba54 --- /dev/null +++ b/buildscripts/antithesis/topologies/sharded_cluster/scripts/workload_init.py @@ -0,0 +1,3 @@ +"""Script to initialize a workload container in Antithesis.""" +while True: + continue diff --git a/buildscripts/antithesis/topologies/sharded_cluster/scripts/workload_init.sh b/buildscripts/antithesis/topologies/sharded_cluster/scripts/workload_init.sh deleted file mode 100755 index acaf30ea036..00000000000 --- a/buildscripts/antithesis/topologies/sharded_cluster/scripts/workload_init.sh +++ /dev/null @@ -1,18 +0,0 @@ -sleep 10s - 
-mongo --host database1 --port 27018 --eval "config={\"_id\" : \"Shard1\",\"protocolVersion\" : 1,\"members\" : [{\"_id\" : 0,\"host\" : \"database1:27018\"}, {\"_id\" : 1,\"host\" : \"database2:27018\"}, {\"_id\" : 2,\"host\" : \"database3:27018\"} ],\"settings\" : {\"chainingAllowed\" : false,\"electionTimeoutMillis\" : 2000, \"heartbeatTimeoutSecs\" : 1, \"catchUpTimeoutMillis\": 0}}; rs.initiate(config)" - -sleep 5s - -mongo --host database4 --port 27018 --eval "config={\"_id\" : \"Shard2\",\"protocolVersion\" : 1,\"members\" : [{\"_id\" : 0,\"host\" : \"database4:27018\"}, {\"_id\" : 1,\"host\" : \"database5:27018\"}, {\"_id\" : 2,\"host\" : \"database6:27018\"} ],\"settings\" : {\"chainingAllowed\" : false,\"electionTimeoutMillis\" : 2000, \"heartbeatTimeoutSecs\" : 1, \"catchUpTimeoutMillis\": 0}}; rs.initiate(config)" - -sleep 5s - -mongo --host mongos --port 27017 --eval 'sh.addShard("Shard1/database1:27018,database2:27018,database3:27018")' - -sleep 5s - -mongo --host mongos --port 27017 --eval 'sh.addShard("Shard2/database4:27018,database5:27018,database6:27018")' - -# this cryptic statement keeps the workload container running. 
-tail -f /dev/null diff --git a/buildscripts/antithesis_suite.py b/buildscripts/antithesis_suite.py index 8397a992ee7..b8722ada418 100755 --- a/buildscripts/antithesis_suite.py +++ b/buildscripts/antithesis_suite.py @@ -3,7 +3,6 @@ import os.path import sys -import pathlib import click import yaml @@ -12,35 +11,74 @@ import yaml if __name__ == "__main__" and __package__ is None: sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) -SUITE_BLACKLIST = [ - "CheckReplDBHash", - "CheckReplOplogs", +HOOKS_BLACKLIST = [ "CleanEveryN", "ContinuousStepdown", - "ValidateCollections", "CheckOrphansDeleted", ] +_SUITES_PATH = os.path.join("buildscripts", "resmokeconfig", "suites") -def _sanitize_hooks(hooks): - if len(hooks) == 0: - return hooks - # it's either a list of strings, or a list of dicts, each with key 'class' - if isinstance(hooks[0], str): - return list(filter(lambda x: x not in SUITE_BLACKLIST, hooks)) - elif isinstance(hooks[0], dict): - return list(filter(lambda x: x['class'] not in SUITE_BLACKLIST, hooks)) - else: - raise RuntimeError('Unknown structure in hook. 
File a TIG ticket.') +def delete_archival(suite): + """Remove archival for Antithesis environment.""" + suite.pop("archive", None) + suite.get("executor", {}).pop("archive", None) -def _sanitize_test_data(test_data): - if test_data.get("useStepdownPermittedFile", None): - test_data["useStepdownPermittedFile"] = False - return test_data +def make_hooks_compatible(suite): + """Make hooks compatible in Antithesis environment.""" + if suite.get("executor", {}).get("hooks", None): + # it's either a list of strings, or a list of dicts, each with key 'class' + if isinstance(suite["executor"]["hooks"][0], str): + suite["executor"]["hooks"] = ["AntithesisLogging"] + [ + hook for hook in suite["executor"]["hooks"] if hook not in HOOKS_BLACKLIST + ] + elif isinstance(suite["executor"]["hooks"][0], dict): + suite["executor"]["hooks"] = [{"class": "AntithesisLogging"}] + [ + hook for hook in suite["executor"]["hooks"] if hook["class"] not in HOOKS_BLACKLIST + ] + else: + raise RuntimeError('Unknown structure in hook. 
File a TIG ticket.') -_SUITES_PATH = os.path.join("buildscripts", "resmokeconfig", "suites") + +def use_external_fixture(suite): + """Use external version of this fixture.""" + if suite.get("executor", {}).get("fixture", None): + suite["executor"]["fixture"] = { + "class": f"External{suite['executor']['fixture']['class']}", + "shell_conn_string": "mongodb://mongos:27017" + } + + +def update_test_data(suite): + """Update TestData to be compatible with antithesis.""" + suite.setdefault("executor", {}).setdefault( + "config", {}).setdefault("shell_options", {}).setdefault("global_vars", {}).setdefault( + "TestData", {}).update({"useStepdownPermittedFile": False}) + + +def update_shell(suite): + """Update shell for when running in Antithesis.""" + suite.setdefault("executor", {}).setdefault("config", {}).setdefault("shell_options", + {}).setdefault("eval", "") + suite["executor"]["config"]["shell_options"]["eval"] += "jsTestLog = Function.prototype;" + + +def update_exclude_tags(suite): + """Update the exclude tags to exclude antithesis incompatible tests.""" + suite.setdefault('selector', {}).setdefault('exclude_with_any_tags', + []).append("antithesis_incompatible") + + +def make_suite_antithesis_compatible(suite): + """Modify suite in-place to be antithesis compatible.""" + delete_archival(suite) + make_hooks_compatible(suite) + use_external_fixture(suite) + update_test_data(suite) + update_shell(suite) + update_exclude_tags(suite) @click.group() @@ -50,54 +88,13 @@ def cli(): def _generate(suite_name: str) -> None: - with open(os.path.join(_SUITES_PATH, "{}.yml".format(suite_name))) as fstream: + with open(os.path.join(_SUITES_PATH, f"{suite_name}.yml")) as fstream: suite = yaml.safe_load(fstream) - try: - suite["archive"]["hooks"] = _sanitize_hooks(suite["archive"]["hooks"]) - except KeyError: - # pass, don't care - pass - except TypeError: - pass - - try: - suite["executor"]["archive"]["hooks"] = _sanitize_hooks( - suite["executor"]["archive"]["hooks"]) - except 
KeyError: - # pass, don't care - pass - except TypeError: - pass - - try: - suite["executor"]["hooks"] = _sanitize_hooks(suite["executor"]["hooks"]) - except KeyError: - # pass, don't care - pass - except TypeError: - pass - - try: - suite["executor"]["config"]["shell_options"]["global_vars"][ - "TestData"] = _sanitize_test_data( - suite["executor"]["config"]["shell_options"]["global_vars"]["TestData"]) - except KeyError: - # pass, don't care - pass - except TypeError: - pass - - try: - suite["executor"]["config"]["shell_options"]["eval"] += "jsTestLog = Function.prototype;" - except KeyError: - # pass, don't care - pass - except TypeError: - pass + make_suite_antithesis_compatible(suite) out = yaml.dump(suite) - with open(os.path.join(_SUITES_PATH, "antithesis_{}.yml".format(suite_name)), "w") as fstream: + with open(os.path.join(_SUITES_PATH, f"antithesis_{suite_name}.yml"), "w") as fstream: fstream.write( "# this file was generated by buildscripts/antithesis_suite.py generate {}\n".format( suite_name)) diff --git a/buildscripts/resmokelib/testing/fixtures/shardedcluster.py b/buildscripts/resmokelib/testing/fixtures/shardedcluster.py index 9bc83f4f062..84574900e34 100644 --- a/buildscripts/resmokelib/testing/fixtures/shardedcluster.py +++ b/buildscripts/resmokelib/testing/fixtures/shardedcluster.py @@ -8,6 +8,7 @@ import pymongo import pymongo.errors import buildscripts.resmokelib.testing.fixtures.interface as interface +import buildscripts.resmokelib.testing.fixtures.external as external class ShardedClusterFixture(interface.Fixture): # pylint: disable=too-many-instance-attributes @@ -378,6 +379,53 @@ class ShardedClusterFixture(interface.Fixture): # pylint: disable=too-many-inst client.admin.command({"addShard": connection_string}) +class ExternalShardedClusterFixture(external.ExternalFixture, ShardedClusterFixture): + """Fixture to interact with external sharded cluster fixture.""" + + REGISTERED_NAME = "ExternalShardedClusterFixture" + + def __init__(self, 
logger, job_num, fixturelib, shell_conn_string): + """Initialize ExternalShardedClusterFixture.""" + external.ExternalFixture.__init__(self, logger, job_num, fixturelib, shell_conn_string) + ShardedClusterFixture.__init__(self, logger, job_num, fixturelib, mongod_options={}) + + def setup(self): + """Use ExternalFixture method.""" + return external.ExternalFixture.setup(self) + + def pids(self): + """Use ExternalFixture method.""" + return external.ExternalFixture.pids(self) + + def await_ready(self): + """Use ExternalFixture method.""" + return external.ExternalFixture.await_ready(self) + + def _do_teardown(self, mode=None): + """Use ExternalFixture method.""" + return external.ExternalFixture._do_teardown(self) + + def _is_process_running(self): + """Use ExternalFixture method.""" + return external.ExternalFixture._is_process_running(self) + + def is_running(self): + """Use ExternalFixture method.""" + return external.ExternalFixture.is_running(self) + + def get_internal_connection_string(self): + """Use ExternalFixture method.""" + return external.ExternalFixture.get_internal_connection_string(self) + + def get_driver_connection_url(self): + """Use ExternalFixture method.""" + return external.ExternalFixture.get_driver_connection_url(self) + + def get_node_info(self): + """Use ExternalFixture method.""" + return external.ExternalFixture.get_node_info(self) + + class _MongoSFixture(interface.Fixture): """Fixture which provides JSTests with a mongos to connect to.""" diff --git a/buildscripts/resmokelib/testing/hooks/antithesis_logging.py b/buildscripts/resmokelib/testing/hooks/antithesis_logging.py new file mode 100644 index 00000000000..49b1a357cc5 --- /dev/null +++ b/buildscripts/resmokelib/testing/hooks/antithesis_logging.py @@ -0,0 +1,26 @@ +"""Hook that prints Antithesis commands to be executed in the Antithesis environment.""" + +from time import sleep +from buildscripts.resmokelib.testing.hooks import interface + + +class AntithesisLogging(interface.Hook): 
# pylint: disable=too-many-instance-attributes + """Prints antithesis commands before & after test run.""" + + DESCRIPTION = "Prints antithesis commands before & after test run." + + IS_BACKGROUND = False + + def __init__(self, hook_logger, fixture): + """Initialize the AntithesisLogging hook.""" + interface.Hook.__init__(self, hook_logger, fixture, AntithesisLogging.DESCRIPTION) + + def before_test(self, test, test_report): + """Ensure the fault injector is running before a test.""" + print("ANTITHESIS-COMMAND: Start Fault Injector") + sleep(5) + + def after_test(self, test, test_report): + """Ensure the fault injector is stopped after a test.""" + print("ANTITHESIS-COMMAND: Stop Fault Injector") + sleep(5) diff --git a/buildscripts/tests/resmokelib/testing/fixtures/test_api_adherence.py b/buildscripts/tests/resmokelib/testing/fixtures/test_api_adherence.py index 7f0e1110a32..81b717bb0fe 100644 --- a/buildscripts/tests/resmokelib/testing/fixtures/test_api_adherence.py +++ b/buildscripts/tests/resmokelib/testing/fixtures/test_api_adherence.py @@ -8,6 +8,7 @@ import os DISALLOWED_ROOT = "buildscripts" ALLOWED_IMPORTS = [ + "buildscripts.resmokelib.testing.fixtures.external", "buildscripts.resmokelib.testing.fixtures.interface", "buildscripts.resmokelib.testing.fixtures.fixturelib", "buildscripts.resmokelib.multiversionconstants", diff --git a/etc/evergreen.yml b/etc/evergreen.yml index a4b1589bfc9..f22c1027993 100644 --- a/etc/evergreen.yml +++ b/etc/evergreen.yml @@ -3053,7 +3053,6 @@ buildvariants: - name: enterprise-ubuntu1804-64-libvoidstar display_name: ~ Enterprise Ubuntu 18.04 w/ libvoidstar - cron: "0 4 * * FRI" # Every week at 0400 UTC Friday. This has to be a Friday since we run Antithesis on Fridays. 
modules: - enterprise run_on: @@ -3067,11 +3066,10 @@ buildvariants: multiversion_edition: enterprise repo_edition: enterprise large_distro_name: ubuntu1804-build - use_scons_cache: false - scons_cache_scope: "none" + scons_cache_scope: shared tasks: - name: compile_and_archive_dist_test_TG - - name: .antithesis + - name: antithesis_image_push - name: generate_buildid_to_debug_symbols_mapping - <<: *enterprise-windows-nopush-template diff --git a/etc/evergreen_yml_components/definitions.yml b/etc/evergreen_yml_components/definitions.yml index 2f0ebf2b2ce..f042cefd9c2 100644 --- a/etc/evergreen_yml_components/definitions.yml +++ b/etc/evergreen_yml_components/definitions.yml @@ -2072,6 +2072,26 @@ functions: content_type: text/plain display_name: Resmoke.py Invocation for Local Usage + "antithesis image build": + - command: subprocess.exec + params: + binary: bash + args: + - "./src/evergreen/antithesis_image_build.sh" + + "antithesis image push": + - command: subprocess.exec + params: + binary: bash + args: + - "./src/evergreen/antithesis_image_push.sh" + + "antithesis dry run": + - command: subprocess.exec + params: + binary: bash + args: + - "./src/evergreen/antithesis_dry_run.sh" # Pre task steps pre: @@ -7332,16 +7352,16 @@ tasks: args: - "./src/evergreen/feature_flag_tags_check.sh" -- name: antithesis_image_build +- name: antithesis_image_push tags: ["antithesis"] # this is not patchable to avoid hitting the docker registry excessively. # When iterating on this task, feel free to make this patchable for # testing purposes. Your image changes will be pushed with the # evergreen-patch tag, so as to not clobber the waterfall. Use the # antithesis_image_tag build parameter to override this if required. 
-  patchable: false
   depends_on:
   - name: archive_dist_test_debug
+  exec_timeout_secs: 7200
   commands:
   - *f_expansions_write
   - func: "git get project no modules"
@@ -7349,13 +7369,6 @@
   - func: "kill processes"
   - func: "cleanup environment"
   - func: "set up venv"
-  - command: s3.get
-    params:
-      aws_key: ${aws_key}
-      aws_secret: ${aws_secret}
-      remote_file: ${project}/${build_variant}/antithesis_last_push.txt
-      local_file: antithesis_last_push.txt
-      bucket: mciuploads
   - func: "do setup"
   - command: s3.get
     params:
       aws_key: ${aws_key}
       aws_secret: ${aws_secret}
       remote_file: ${mongo_debugsymbols}
       bucket: mciuploads
       local_file: src/mongo-debugsymbols.tgz
-  - command: subprocess.exec
-    params:
-      binary: bash
-      args:
-        - "./src/evergreen/antithesis_image_build.sh"
-  - command: s3.put
-    params:
-      optional: true
-      aws_key: ${aws_key}
-      aws_secret: ${aws_secret}
-      local_file: antithesis_next_push.txt
-      remote_file: ${project}/${build_variant}/antithesis_last_push.txt
-      bucket: mciuploads
-      permissions: private
-      content_type: text/plain
-      display_name: Last Push Date (seconds since epoch)
+  - func: "antithesis image build"
+  - func: "antithesis dry run"
+  - func: "antithesis image push"
 
 - name: generate_buildid_to_debug_symbols_mapping
   tags: ["symbolizer"]
diff --git a/evergreen/antithesis_dry_run.sh b/evergreen/antithesis_dry_run.sh
new file mode 100644
index 00000000000..740e21bd2d7
--- /dev/null
+++ b/evergreen/antithesis_dry_run.sh
@@ -0,0 +1,6 @@
+set -o errexit
+set -o verbose
+
+cd antithesis/topologies/sharded_cluster
+sudo docker-compose up -d
+sudo docker exec workload /bin/bash -c 'cd resmoke && . python3-venv/bin/activate && python3 run_suite.py'
diff --git a/evergreen/antithesis_image_build.sh b/evergreen/antithesis_image_build.sh
index 0b986770102..2be5630740d 100644
--- a/evergreen/antithesis_image_build.sh
+++ b/evergreen/antithesis_image_build.sh
@@ -3,15 +3,6 @@ DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" > /dev/null 2>&1 && pwd)"
 set -euo pipefail
 
-cd src
-commit_date=$(date -d "$(git log -1 -s --format=%ci)" "+%s")
-last_run_date=$(cat ../antithesis_last_push.txt || echo 0)
-if [ "${is_patch}" != "true" ] && [ "${last_run_date}" -gt "${commit_date}" ]; then
-  echo -e "Refusing to push new antithesis images because this commit is older\nthan the last pushed commit"
-  exit 0
-fi
-cd ..
-
 # check that the binaries in dist-test are linked to libvoidstar
 ldd src/dist-test/bin/mongod | grep libvoidstar
 ldd src/dist-test/bin/mongos | grep libvoidstar
@@ -75,27 +66,3 @@ sudo docker build . -t repl-set-config:$tag
 cd ../sharded_cluster
 sed -i s/evergreen-latest-master/$tag/ docker-compose.yml
 sudo docker build . -t sharded-cluster-config:$tag
-
-# login, push, and logout
-echo "${antithesis_repo_key}" > mongodb.key.json
-cat mongodb.key.json | sudo docker login -u _json_key https://us-central1-docker.pkg.dev --password-stdin
-rm mongodb.key.json
-
-# tag and push to the registry
-sudo docker tag "mongo-binaries:$tag" "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/mongo-binaries:$tag"
-sudo docker push "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/mongo-binaries:$tag"
-
-sudo docker tag "workload:$tag" "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/workload:$tag"
-sudo docker push "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/workload:$tag"
-
-sudo docker tag "repl-set-config:$tag" "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/repl-set-config:$tag"
-sudo docker push "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/repl-set-config:$tag"
-
-sudo docker tag "sharded-cluster-config:$tag" "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/sharded-cluster-config:$tag"
-sudo docker push "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/sharded-cluster-config:$tag"
-
-sudo docker logout https://us-central1-docker.pkg.dev
-
-if [ "${is_patch}" != "true" ]; then
-  echo "$commit_date" > antithesis_next_push.txt
-fi
diff --git a/evergreen/antithesis_image_push.sh b/evergreen/antithesis_image_push.sh
new file mode 100644
index 00000000000..94e1a2bf0b7
--- /dev/null
+++ b/evergreen/antithesis_image_push.sh
@@ -0,0 +1,35 @@
+DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" > /dev/null 2>&1 && pwd)"
+. "$DIR/prelude.sh"
+
+set -euo pipefail
+
+# push images as evergreen-latest-${branch_name}, unless it's a patch
+tag="evergreen-latest-${branch_name}"
+if [ "${is_patch}" = "true" ]; then
+  tag="evergreen-patch"
+fi
+
+if [ -n "${antithesis_image_tag:-}" ]; then
+  echo "Using provided tag: '$antithesis_image_tag' for docker pushes"
+  tag=$antithesis_image_tag
+fi
+
+# login, push, and logout
+echo "${antithesis_repo_key}" > mongodb.key.json
+cat mongodb.key.json | sudo docker login -u _json_key https://us-central1-docker.pkg.dev --password-stdin
+rm mongodb.key.json
+
+# tag and push to the registry
+sudo docker tag "mongo-binaries:$tag" "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/mongo-binaries:$tag"
+sudo docker push "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/mongo-binaries:$tag"
+
+sudo docker tag "workload:$tag" "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/workload:$tag"
+sudo docker push "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/workload:$tag"
+
+sudo docker tag "repl-set-config:$tag" "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/repl-set-config:$tag"
+sudo docker push "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/repl-set-config:$tag"
+
+sudo docker tag "sharded-cluster-config:$tag" "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/sharded-cluster-config:$tag"
+sudo docker push "us-central1-docker.pkg.dev/molten-verve-216720/mongodb-repository/sharded-cluster-config:$tag"
+
+sudo docker logout https://us-central1-docker.pkg.dev
diff --git a/jstests/concurrency/fsm_workloads/cleanupOrphanedWhileMigrating.js b/jstests/concurrency/fsm_workloads/cleanupOrphanedWhileMigrating.js
index b34c3d60e1f..ec673ba935d 100644
--- a/jstests/concurrency/fsm_workloads/cleanupOrphanedWhileMigrating.js
+++ b/jstests/concurrency/fsm_workloads/cleanupOrphanedWhileMigrating.js
@@ -3,7 +3,7 @@
 /**
  * Performs range deletions while chunks are being moved.
  *
- * @tags: [requires_sharding, assumes_balancer_on]
+ * @tags: [requires_sharding, assumes_balancer_on, antithesis_incompatible]
  */
 
 load('jstests/concurrency/fsm_libs/extend_workload.js');
diff --git a/jstests/concurrency/fsm_workloads/collection_defragmentation.js b/jstests/concurrency/fsm_workloads/collection_defragmentation.js
index 7ec9bb8b071..c84729399e5 100644
--- a/jstests/concurrency/fsm_workloads/collection_defragmentation.js
+++ b/jstests/concurrency/fsm_workloads/collection_defragmentation.js
@@ -5,7 +5,7 @@
  *
  * Runs defragmentation on collections with concurrent operations.
  *
- * @tags: [requires_sharding, assumes_balancer_on]
+ * @tags: [requires_sharding, assumes_balancer_on, antithesis_incompatible]
  */
 
 const dbPrefix = jsTestName() + '_DB_';
diff --git a/jstests/concurrency/fsm_workloads/create_collection_and_view.js b/jstests/concurrency/fsm_workloads/create_collection_and_view.js
index 5a17b8b3b7c..29bcddee65f 100644
--- a/jstests/concurrency/fsm_workloads/create_collection_and_view.js
+++ b/jstests/concurrency/fsm_workloads/create_collection_and_view.js
@@ -4,7 +4,7 @@
  * Repeatedly creates a collection and a view with the same namespace. Validates that we never
  * manage to have both a Collection and View created on the same namespace at the same time.
  *
- * @tags: [catches_command_failures]
+ * @tags: [catches_command_failures, antithesis_incompatible]
  */
 
 var $config = (function() {
diff --git a/jstests/concurrency/fsm_workloads/internal_transactions_move_chunk.js b/jstests/concurrency/fsm_workloads/internal_transactions_move_chunk.js
index 122352d1d37..16062a9f36a 100644
--- a/jstests/concurrency/fsm_workloads/internal_transactions_move_chunk.js
+++ b/jstests/concurrency/fsm_workloads/internal_transactions_move_chunk.js
@@ -8,7 +8,8 @@
  * @tags: [
  *  requires_fcv_60,
  *  requires_sharding,
- *  uses_transactions
+ *  uses_transactions,
+ *  antithesis_incompatible
  * ]
  */
 
 load('jstests/concurrency/fsm_libs/extend_workload.js');
diff --git a/jstests/concurrency/fsm_workloads/internal_transactions_resharding.js b/jstests/concurrency/fsm_workloads/internal_transactions_resharding.js
index 31c6e3df3b8..d8443a2ef7b 100644
--- a/jstests/concurrency/fsm_workloads/internal_transactions_resharding.js
+++ b/jstests/concurrency/fsm_workloads/internal_transactions_resharding.js
@@ -8,7 +8,8 @@
  * @tags: [
  *  requires_fcv_60,
  *  requires_sharding,
- *  uses_transactions
+ *  uses_transactions,
+ *  antithesis_incompatible
  * ]
  */
 
 load('jstests/concurrency/fsm_libs/extend_workload.js');
diff --git a/jstests/concurrency/fsm_workloads/internal_transactions_sharded.js b/jstests/concurrency/fsm_workloads/internal_transactions_sharded.js
index 9ab4f093e00..018df932d56 100644
--- a/jstests/concurrency/fsm_workloads/internal_transactions_sharded.js
+++ b/jstests/concurrency/fsm_workloads/internal_transactions_sharded.js
@@ -8,7 +8,8 @@
  * @tags: [
  *  requires_fcv_60,
  *  requires_sharding,
- *  uses_transactions
+ *  uses_transactions,
+ *  antithesis_incompatible
  * ]
  */
 
 load('jstests/concurrency/fsm_libs/extend_workload.js');
diff --git a/jstests/concurrency/fsm_workloads/internal_transactions_sharded_from_mongod.js b/jstests/concurrency/fsm_workloads/internal_transactions_sharded_from_mongod.js
index 404aa99ad13..f2ae41e3ac5 100644
--- a/jstests/concurrency/fsm_workloads/internal_transactions_sharded_from_mongod.js
+++ b/jstests/concurrency/fsm_workloads/internal_transactions_sharded_from_mongod.js
@@ -8,7 +8,8 @@
  * @tags: [
  *  requires_fcv_60,
  *  requires_sharding,
- *  uses_transactions
+ *  uses_transactions,
+ *  antithesis_incompatible
  * ]
  */
 
 load('jstests/concurrency/fsm_libs/extend_workload.js');
diff --git a/jstests/concurrency/fsm_workloads/internal_transactions_sharded_from_mongod_kill_sessions.js b/jstests/concurrency/fsm_workloads/internal_transactions_sharded_from_mongod_kill_sessions.js
index 290f363ab96..013850a86d6 100644
--- a/jstests/concurrency/fsm_workloads/internal_transactions_sharded_from_mongod_kill_sessions.js
+++ b/jstests/concurrency/fsm_workloads/internal_transactions_sharded_from_mongod_kill_sessions.js
@@ -10,6 +10,7 @@
  *  requires_fcv_60,
  *  requires_sharding,
  *  uses_transactions,
+ *  antithesis_incompatible
  * ]
  */
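
The `AntithesisLogging` hook added in this diff does nothing but bracket each test with sentinel lines on stdout, which the Antithesis fault injector watches for, then sleeps briefly so the injector has time to react. A minimal standalone sketch of that pattern follows; the `Hook` base class here is a hypothetical stand-in for resmoke's `buildscripts.resmokelib.testing.hooks.interface.Hook`, and the sleep is shortened for illustration:

```python
from time import sleep


class Hook:
    """Hypothetical stand-in for resmoke's hook interface."""

    def __init__(self, hook_logger, fixture, description):
        self.hook_logger = hook_logger
        self.fixture = fixture
        self.description = description


class AntithesisLogging(Hook):
    """Prints antithesis commands before & after test run."""

    DESCRIPTION = "Prints antithesis commands before & after test run."
    IS_BACKGROUND = False

    def __init__(self, hook_logger, fixture):
        Hook.__init__(self, hook_logger, fixture, AntithesisLogging.DESCRIPTION)

    def before_test(self, test, test_report):
        # The fault injector greps the workload's stdout for this sentinel.
        print("ANTITHESIS-COMMAND: Start Fault Injector")
        sleep(1)  # the real hook sleeps 5s to let the injector engage

    def after_test(self, test, test_report):
        print("ANTITHESIS-COMMAND: Stop Fault Injector")
        sleep(1)
```

Because the commands travel over stdout rather than an API, the hook stays decoupled from Antithesis itself: outside the Antithesis environment the sentinel lines are inert log noise, which is why the suite can also run as an ordinary resmoke invocation.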