author    Anna Wawrzyniak <anna.wawrzyniak@mongodb.com>  2022-03-12 19:20:40 +0000
committer Evergreen Agent <no-reply@evergreen.mongodb.com>  2022-03-12 19:49:32 +0000
commit    7d29d1d90932faed698fcdcb0c7ca7aa082886c5 (patch)
tree      be4ee78ae0d75eb7c8db228510fbd1c525aa9da4 /docs
parent    1ca10441ee35fcea2a88a0c522fa0890017f5fbd (diff)
download  mongo-7d29d1d90932faed698fcdcb0c7ca7aa082886c5.tar.gz
SERVER-63734 Add cli update/diff tools for golden data test management
Diffstat (limited to 'docs')
-rw-r--r--  docs/golden_data_test_framework.md  289
1 file changed, 289 insertions, 0 deletions
diff --git a/docs/golden_data_test_framework.md b/docs/golden_data_test_framework.md
new file mode 100644
index 00000000000..719939993ee
--- /dev/null
+++ b/docs/golden_data_test_framework.md
@@ -0,0 +1,289 @@
+# Overview
+The Golden Data test framework provides the ability to run and manage tests that produce an output
+which is verified by comparing it to a checked-in, known-valid output. Any differences result in test
+failure, and either the code or the expected output has to be updated.
+
+Golden Data tests excel at bulk diffing of failed test outputs and bulk accepting of new test
+outputs.
+
+# When to use Golden Data tests?
+* Code under test produces a deterministic output: That way tests can consistently succeed or fail.
+* Incremental changes to code under test or test fixture result in incremental changes to the
+output.
+* As an alternative to ASSERT for large output comparison: Serves the same purpose, but provides
+ tools for diffing/updating.
+* The outputs can't be objectively verified (e.g. by checking well-known properties). Examples (see
+the sketch after this list):
+  * Verifying that sorting works can be done by checking that the output is sorted, so it SHOULD NOT
+    use Golden Data tests.
+  * Verifying that pretty printing works MAY use Golden Data tests to verify the output, as there
+    might not be well-known properties, or those properties can easily change.
+* As stability/versioning/regression testing. Because they store recorded outputs, Golden Data tests
+are a good candidate for preserving the behavior of legacy versions or for detecting undesired
+changes in behavior, even when the new behavior meets other correctness criteria.
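+
+For example, a minimal sketch of the contrast above. Only `GoldenTestContext` and the ASSERT macros
+come from the framework; `mySort`, `myPrettyPrint`, `makeSampleDocument` and `myConfig` are
+hypothetical names, and includes are omitted as in the full example later in this document:
+```c++
+// Sorting has a well-known property (the output is ordered), so a plain
+// assertion is sufficient and a Golden Data test is not needed:
+TEST(SortSuite, OutputIsSorted) {
+    std::vector<int> v = mySort({3, 1, 2});
+    ASSERT_TRUE(std::is_sorted(v.begin(), v.end()));
+}
+
+// Pretty printing has no such property, so recording the output as golden
+// data is a better fit:
+TEST(PrettyPrintSuite, BasicOutput) {
+    GoldenTestContext ctx(myConfig);
+    ctx.outStream() << myPrettyPrint(makeSampleDocument()) << std::endl;
+}
+```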
+
+# Best practices for working with Golden Data tests
+* Tests MUST produce text output that is diffable and can be inspected in the pull request.
+
+* Tests MUST produce an output that is deterministic and repeatable, including when running on
+different platforms. This is the same requirement as with ASSERT_EQ.
+* Tests SHOULD produce an output that changes incrementally in response to the incremental test or
+code changes.
+
+* Multiple test variations MAY be bundled into a single test. This is recommended when testing the
+same feature with different inputs. It helps with reviewing the outputs by grouping similar tests
+together, and also reduces the number of output files.
+
+* Changes to a test fixture or test code that affect a non-trivial number of test outputs MUST be
+done in a separate pull request from production code changes:
+  * Pull requests with test-code-only changes can be easily reviewed, even if a large number of test
+    outputs are modified. While such changes can still introduce merge conflicts, they don't introduce
+    a risk of regression (assuming the outputs were valid before the change).
+  * Pull requests that mix production code changes with a large number of expected output changes are
+    much harder to review.
+
+* Tests in the same suite SHOULD share fixtures when appropriate. This reduces the cost of adding
+new tests to the suite. Changes to the fixture affect only the expected outputs that use that
+fixture, and those outputs can be updated in bulk.
+
+* Tests in different suites SHOULD NOT reuse/share fixtures, as changes to the fixture can affect a
+ large number of expected outputs.
+ There are exceptions to that rule, and tests in different suites MAY reuse/share fixtures if:
+  * The test fixture is considered stable and changes rarely.
+  * The test suites are related, either by sharing tests or by testing similar components.
+  * Setup/teardown costs are excessive, and sharing the same fixture instance for performance
+    reasons can't be avoided.
+
+* Tests SHOULD print both the inputs and the outputs of the tested code. This makes it easy for
+reviewers to verify that the expected outputs are indeed correct, since both input and output appear
+next to each other.
+Otherwise, finding the input used to produce a new output may not be practical, and the input might
+not even be included in the diff.
+
+* When resolving merge conflicts on the expected output files, one of the approaches below SHOULD be
+used (a rough command sketch follows this list):
+  * "Accept theirs", then rerun the tests and verify the new outputs. This doesn't require knowledge
+    of production/test code changes in the "theirs" branch, but requires re-review and re-acceptance
+    of the changes done by the local branch.
+  * "Accept yours", then rerun the tests and verify the new outputs. This approach requires knowledge
+    of production/test code changes in the "theirs" branch. However, if such changes resulted in
+    straightforward and repetitive output changes, e.g. due to a printing code change or a fixture
+    change, it may be easier to verify than re-inspecting the local changes.
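+
+A rough sketch of the "accept theirs" flow, assuming the expected outputs live under
+`src/mongo/my_expected_output` (the directory used in the C++ example later in this document) and
+that buildscripts/golden_test.py is already set up as described below:
+```bash
+# Take "their" version of the conflicted expected output files
+git checkout --theirs src/mongo/my_expected_output/
+git add src/mongo/my_expected_output/
+# Re-run the affected Golden Data tests, then re-review and re-accept the
+# changes coming from the local branch:
+buildscripts/golden_test.py diff
+buildscripts/golden_test.py accept
+```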
+
+* Expected test outputs SHOULD be reused across tightly-coupled test suites. The suites are
+tightly-coupled if they:
+  * Share the same tests, inputs and fixtures.
+  * Test similar scenarios.
+  * Test different code paths, but changes to one code path are expected to be accompanied by
+    changes to the other code paths as well.
+
+  Tests SHOULD use different output files for legitimate and expected output differences between
+  those suites.
+
+  Examples:
+  * Functional tests, integration tests and unit tests that test the same behavior in different
+    environments.
+  * Versioned tests, where the expected behavior is the same for the majority of test inputs/scenarios.
+
+* AVOID manually modifying expected output files. Those files are considered to be auto-generated.
+Instead, run the tests and then copy the generated output as the new expected output file. See the
+"How to diff and accept new test outputs on a workstation" section for instructions.
+
+
+# How to write Golden Data tests?
+Each Golden Data test should produce a text output that will be verified later. The output must be
+text, but otherwise the test author can choose the most appropriate format (plain text, JSON, BSON,
+YAML or mixed). If a test consists of multiple variations, each variation should be clearly
+separated from the others.
+
+Note: Test output is usually write-only. It is OK to focus on just writing serialization/printing
+code, without needing to provide deserialization/parsing code.
+
+When the actual test output differs from the expected output, the test framework will fail the test,
+log both outputs and also create the following files, which can be inspected later (see the example
+below):
+* `<output_path>/actual/<test_path>` - the actual test output
+* `<output_path>/expected/<test_path>` - the expected test output
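+
+For example, if the output root for a test run is `/var/tmp/test_output/out-1234-5678-9abc-def0`
+(a hypothetical expansion of the pattern from the setup section below), a failed test leaves its two
+copies under:
+```
+/var/tmp/test_output/out-1234-5678-9abc-def0/actual/<test_path>
+/var/tmp/test_output/out-1234-5678-9abc-def0/expected/<test_path>
+```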
+
+## CPP tests
+`::mongo::unittest::GoldenTestConfig` - Provides a way to configure test suite(s). Defines where the
+ expected output files are located in the source repo.
+
+`::mongo::unittest::GoldenTestContext` - Provides an output stream where tests should write their
+outputs. Verifies the output against the expected output that is in the source repo.
+
+See: [golden_test.h](../src/mongo/unittest/golden_test.h)
+
+**Example:**
+```c++
+#include "mongo/unittest/golden_test.h"
+
+GoldenTestConfig myConfig("src/mongo/my_expected_output");
+TEST(MySuite, MyTest) {
+ GoldenTestContext ctx(myConfig);
+ ctx.outStream() << "print something here" << std::endl;
+ ctx.outStream() << "print something else" << std::endl;
+}
+
+template <typename T>
+void runVariation(GoldenTestContext& ctx, const std::string& variationName, const T& input) {
+    ctx.outStream() << "VARIATION " << variationName << std::endl;
+    ctx.outStream() << "input: " << input << std::endl;
+    ctx.outStream() << "output: " << runCodeUnderTest(input) << std::endl;
+    ctx.outStream() << std::endl;
+}
+
+TEST_F(MySuiteFixture, MyFeatureATest) {
+    GoldenTestContext ctx(myConfig);
+    runVariation(ctx, "variation 1", "some input testing A #1");
+    runVariation(ctx, "variation 2", "some input testing A #2");
+    runVariation(ctx, "variation 3", "some input testing A #3");
+}
+
+TEST_F(MySuiteFixture, MyFeatureBTest) {
+    GoldenTestContext ctx(myConfig);
+    runVariation(ctx, "variation 1", "some input testing B #1");
+    runVariation(ctx, "variation 2", "some input testing B #2");
+    runVariation(ctx, "variation 3", "some input testing B #3");
+    runVariation(ctx, "variation 4", "some input testing B #4");
+}
+```
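+
+With the `runVariation` helper above, the checked-in expected output for `MyFeatureATest` would look
+roughly like the sketch below (the `output:` lines depend on the hypothetical `runCodeUnderTest`):
+```
+VARIATION variation 1
+input: some input testing A #1
+output: ...
+
+VARIATION variation 2
+input: some input testing A #2
+output: ...
+
+VARIATION variation 3
+input: some input testing A #3
+output: ...
+```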
+
+Also see self-test:
+[golden_test_test.cpp](../src/mongo/unittest/golden_test_test.cpp)
+
+# How to diff and accept new test outputs on a workstation
+
+Use the buildscripts/golden_test.py command line tool to manage the test outputs. This includes:
+* diffing the output differences of all tests in a given test run.
+* accepting the new outputs of all tests in a given test run.
+
+## Setup
+buildscripts/golden_test.py requires a one-time workstation setup.
+
+Note: this setup is only required to use buildscripts/golden_test.py itself; it is NOT required to
+just run the Golden Data tests when not using buildscripts/golden_test.py.
+
+1. Create a YAML config file, as described by golden_test_options.idl (link TBD).
+2. Set the GOLDEN_TEST_CONFIG_PATH environment variable to the config file location, so that it is
+available both when running tests and when running the buildscripts/golden_test.py tool.
+
+**Example (linux/macOS):**
+
+Create ~/.golden_test_config.yml as one of the variants below, depending on personal preference.
+For full documentation of the config file format, see
+[Appendix - Config file reference](#appendix---config-file-reference).
+
+a. A config that uses a unique subfolder for each test run.
+ * Allows diffing each test run separately.
+ * Works with multiple source repos.
+```yaml
+outputRootPattern: /var/tmp/test_output/out-%%%%-%%%%-%%%%-%%%%
+diffCmd: diff -ruN --unidirectional-new-file --color=always "{{expected}}" "{{actual}}"
+```
+
+b. A config that uses the same folder for each test run.
+ * Allows executing multiple independent test runs and then diffing them together.
+ * Requires running "buildscripts/golden_test.py list" to remove output files from previous runs.
+ * May not work as well with multiple source repos.
+```yaml
+outputRootPattern: /var/tmp/test_output
+diffCmd: diff -ruN --unidirectional-new-file --color=always "{{expected}}" "{{actual}}"
+```
+
+Update .bashrc or .zshrc:
+```bash
+export GOLDEN_TEST_CONFIG_PATH=~/.golden_test_config.yml
+```
+Alternatively, modify /etc/environment or other configuration files if needed by a debugger/IDE, etc.
+
+
+## Usage
+
+### List all available test outputs
+```bash
+mongo > buildscripts/golden_test.py list
+```
+
+### Diff test results from the most recent test run
+
+```bash
+mongo > buildscripts/golden_test.py diff
+```
+This will run the diffCmd that was specified in the config file.
+
+### Accept test results from the most recent test run
+```bash
+mongo > buildscripts/golden_test.py accept
+```
+This will copy all actual test outputs from that test run to the source repo as the new expected
+outputs.
+
+
+### Get paths from most recent test run (to be used by custom tools)
+Get the expected and actual output paths for the most recent test run:
+```bash
+mongo > buildscripts/golden_test.py get
+```
+
+Get the root output path for the most recent test run:
+```bash
+mongo > buildscripts/golden_test.py get_root
+```
+
+Get all available commands and options:
+```bash
+mongo > buildscripts/golden_test.py --help
+```
+
+# How to diff test results from a non-workstation test run
+
+## Bulk folder diff the results
+1. Parse the test log to find the root output locations where the expected and actual output files
+were written.
+2. Compare the folders to see the differences for the tests that failed.
+
+**Example (linux/macOS):**
+
+```bash
+# Show the test run expected and actual folders:
+$> cat test.log | grep "^{" | jq -s -c -r '.[] | select(.id == 6273501 ) | .attr.expectedOutputRoot + " " +.attr.actualOutputRoot ' | sort | uniq
+# Run the recursive diff
+$> diff -ruN --unidirectional-new-file --color=always <expected_root> <actual_root>
+```
+
+## Find the outputs of tests that failed
+Parse the logs and find the expected and actual outputs for each failed test.
+
+**Example (linux/macOS):**
+```bash
+# Find all expected and actual outputs of tests that have failed
+$> cat test.log | grep "^{" | jq -s '.[] | select(.id == 6273501 ) | .attr.testPath,.attr.expectedOutput,.attr.actualOutput'
+```
+
+# Appendix - Config file reference
+
+The Golden Data test config file is a YAML file specified as:
+```yaml
+outputRootPattern:
+  type: String
+  optional: true
+  description:
+    Root path pattern that will be used to write expected and actual test outputs for all tests
+    in the test run.
+    If not specified, a temporary folder location will be used.
+    The path pattern string may use '%' characters in the last part of the path. '%' characters in
+    the last part of the path will be replaced with random lowercase hexadecimal digits.
+  examples:
+    /var/tmp/test_output/out-%%%%-%%%%-%%%%-%%%%
+    /var/tmp/test_output
+
+diffCmd:
+  type: String
+  optional: true
+  description:
+    Shell command to diff a single golden test run output.
+    The {{expected}} and {{actual}} variables should be used and will be replaced with the expected
+    and actual output folder paths respectively.
+    This property is not used during test execution; it is only used by post-processing tools,
+    like golden_test.py.
+  examples:
+    diff -ruN --unidirectional-new-file --color=always "{{expected}}" "{{actual}}"
+```
+