author | David Eichmann <davide@Well-Typed.com> | 2018-11-07 12:02:47 -0500 |
---|---|---|
committer | Ben Gamari <ben@smart-cactus.org> | 2018-11-07 12:07:11 -0500 |
commit | 932cd41d8c7984c767c1b3b58e05146f69cc5c15 (patch) | |
tree | 41e77f048036a19100c5bee508c77b2ab8ec55d4 /.circleci | |
parent | 82a5c2410a47b16df09039b9786c2c0e34ba130e (diff) | |
testsuite: Save performance metrics in git notes.
This patch makes the following improvements:
- Automatically records test metrics (per test environment) so that
the programmer need not supply nor update expected values in *.T
files.
- On expected metric changes, the programmer need only indicate the
direction of change in the git commit message.
- Provides a simple python tool "perf_notes.py" to compare metrics
over time.
Issues:
- Using just the previous commit allows performance to drift with each
  commit.
  - Currently we allow drift as we have a preference for minimizing
    false positives.
  - Some possible alternatives include:
    - Use metrics from a fixed commit per test: the last commit that
      allowed a change in performance (else the oldest metric)
    - Or use some sort of aggregate since the last commit that allowed
      a change in performance (else all available metrics)
  - These alternatives may cause a performance issue in the test
    driver, which would have to search heavily through git
    commits/notes.
- When run locally, performance tests will trivially pass unless the
  tests were also run locally on the previous commit. This is often not
  the case, e.g. after pulling recent changes.
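The drift issue can be illustrated with a small sketch. The numbers and the `passes` helper are hypothetical and are not taken from the test driver. The sketch only shows how a series of changes that each stay within tolerance against the previous commit can add up to a change that a fixed baseline would reject.

```python
def passes(baseline: float, current: float, tolerance_pct: float) -> bool:
    """True if current is within tolerance_pct percent of baseline (assumed rule)."""
    return abs(current - baseline) <= baseline * tolerance_pct / 100.0


# Hypothetical metric values for five consecutive commits,
# each about 4% above the one before it.
history = [1000.0]
for _ in range(4):
    history.append(history[-1] * 1.04)

TOLERANCE = 5.0

# Baseline = previous commit (what this patch does): every step passes.
vs_previous = [passes(prev, cur, TOLERANCE)
               for prev, cur in zip(history, history[1:])]

# Baseline = fixed first commit (one of the alternatives): later steps fail.
vs_fixed = [passes(history[0], cur, TOLERANCE) for cur in history[1:]]

total_drift_pct = (history[-1] / history[0] - 1.0) * 100.0
print(vs_previous, vs_fixed, round(total_drift_pct, 1))
```

With a previous-commit baseline no single commit is flagged, although the metric ends up roughly 17% above where it started.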
Previously, *.T files contained statements such as:
```
stats_num_field('peak_megabytes_allocated', (2, 1))
compiler_stats_num_field('bytes allocated',
[(wordsize(64), 165890392, 10)])
```
This required the programmer to give the expected values and a tolerance
deviation (percentage). With this patch, the above statements are
replaced with:
```
collect_stats('peak_megabytes_allocated', 5)
collect_compiler_stats('bytes allocated', 10)
```
The programmer now only needs to specify which metrics to test and a
tolerance deviation; no expected value is required. CircleCI will then
run the tests per test environment, record the metrics in a git note for
that commit, and push them to the git.haskell.org ghc repo. Metrics are
compared to those of the previous commit. If they differ by more than
the tolerance deviation from the *.T file, the corresponding test fails.
Expected changes can be declared in the git commit message, e.g.:
```
# Metric (In|De)crease <metric(s)> <options>: <tests>
Metric Increase ['bytes allocated', 'peak_megabytes_allocated'] \
(test_env='linux_x86', way='default'):
Test012, Test345
Metric Decrease 'bytes allocated':
Test678
Metric Increase:
Test711
```
This allows the noted changes, letting the affected tests pass. Note
that when metrics or options are omitted, the change applies to all
possible metrics/options (i.e. in the above, an increase for all metrics
in all test environments is allowed for Test711).
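The pass/fail rule described above can be sketched as follows. This is one plausible reading of the commit message, not the test driver's actual code: the function name `check_metric`, the inclusive tolerance boundary, and the way an allowed direction is represented are all assumptions made for illustration.

```python
from typing import Optional


def check_metric(baseline: float, actual: float, tolerance_pct: float,
                 allowed: Optional[str] = None) -> bool:
    """Return True if the test should pass.

    baseline      -- metric recorded for the previous commit
    actual        -- metric measured for this commit
    tolerance_pct -- tolerance deviation from the *.T file, in percent
    allowed       -- None, "increase" or "decrease", as declared in the
                     commit message (hypothetical representation)
    """
    deviation_pct = (actual - baseline) / baseline * 100.0
    if abs(deviation_pct) <= tolerance_pct:
        return True  # within tolerance: nothing needs to be declared
    if deviation_pct > 0:
        return allowed == "increase"
    return allowed == "decrease"


# A 20% increase against a 10% tolerance fails unless it was declared.
print(check_metric(100, 120, 10))
print(check_metric(100, 120, 10, allowed="increase"))
```

Under this reading, declaring the wrong direction (a decrease when the metric increased) still fails the test.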
phabricator will use the message in the description
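To make the `Metric (In|De)crease` commit message syntax concrete, here is a minimal parsing sketch. It handles only the forms shown in the example above, and the regular expression, the function name `parse_metric_changes`, and the result layout are assumptions, not the parser used by the test driver.

```python
import ast
import re

# Header line: direction, optional metric(s), optional options, then ':'.
HEADER = re.compile(
    r"^Metric\s+(?P<direction>Increase|Decrease)\s*"
    r"(?P<metrics>\[[^\]]*\]|'[^']*')?\s*"
    r"(?P<options>\([^)]*\))?\s*:\s*(?P<tests>.*)$"
)


def parse_metric_changes(message):
    """Return a list of dicts describing the allowed metric changes."""
    # Join backslash-continued lines before splitting into lines.
    text = re.sub(r"\\\s*\n\s*", " ", message)
    changes = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # comment line
        m = HEADER.match(line)
        if m:
            metrics = m.group("metrics")
            if metrics is None:
                metric_list = None  # None means "all metrics"
            else:
                value = ast.literal_eval(metrics)
                metric_list = [value] if isinstance(value, str) else list(value)
            options = dict(re.findall(r"(\w+)\s*=\s*'([^']*)'",
                                      m.group("options") or ""))
            current = {"direction": m.group("direction"),
                       "metrics": metric_list,
                       "options": options,
                       "tests": []}
            changes.append(current)
            line = m.group("tests")
        elif not line:
            current = None  # a blank line ends the current block
            continue
        if current is not None:
            current["tests"].extend(t for t in re.split(r"[,\s]+", line) if t)
    return changes


EXAMPLE = r"""# Metric (In|De)crease <metric(s)> <options>: <tests>
Metric Increase ['bytes allocated', 'peak_megabytes_allocated'] \
(test_env='linux_x86', way='default'):
Test012, Test345
Metric Decrease 'bytes allocated':
Test678
Metric Increase:
Test711
"""

changes = parse_metric_changes(EXAMPLE)
for change in changes:
    print(change)
```

As a simplification, any non-blank line that follows a header is treated as a list of test names, so ordinary prose must be separated from a `Metric` block by a blank line.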
Reviewers: bgamari, hvr
Reviewed By: bgamari
Subscribers: rwbarton, carter
GHC Trac Issues: #12758
Differential Revision: https://phabricator.haskell.org/D5059
Diffstat (limited to '.circleci')
-rw-r--r-- | .circleci/config.yml | 43 |
-rwxr-xr-x | .circleci/push-test-metrics.sh | 46 |
2 files changed, 87 insertions, 2 deletions
diff --git a/.circleci/config.yml b/.circleci/config.yml
index f35690124b..f80b2b321b 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -18,7 +18,7 @@ aliases:
     # ideally we would simply set THREADS here instead of re-detecting it every
     # time we need it below. Unfortunately, there is no way to set an environment
     # variable with the result of a shell script.
-    SKIP_PERF_TESTS: YES
+    SKIP_PERF_TESTS: NO
     VERBOSE: 2
   - &boot
     run:
@@ -32,6 +32,12 @@ aliases:
         include mk/flavours/\$(BuildFlavour).mk
         endif
         EOF
+  - &set_git_identity
+    run:
+      name: Set Git Identity
+      command: |
+        git config user.email "ghc-circleci@haskell.org"
+        git config user.name "GHC CircleCI"
   - &configure_unix
     run:
       name: Configure
@@ -64,10 +70,16 @@ aliases:
       name: Test
       command: |
         mkdir -p test-results
-        make test THREADS=`mk/detect-cpu-count.sh` SKIP_PERF_TESTS=YES JUNIT_FILE=../../test-results/junit.xml
+        METRICS_FILE=$(mktemp)
+        echo "export METRICS_FILE=$METRICS_FILE" >> $BASH_ENV
+        make test THREADS=`mk/detect-cpu-count.sh` SKIP_PERF_TESTS=$SKIP_PERF_TESTS TEST_ENV=$TEST_ENV JUNIT_FILE=../../test-results/junit.xml METRICS_FILE=$METRICS_FILE
   - &store_test_results
     store_test_results:
       path: test-results
+  - &push_perf_note
+    run:
+      name: Push Performance Git Notes
+      command: .circleci/push-test-metrics.sh
   - &slowtest
     run:
       name: Full Test
@@ -102,8 +114,10 @@ jobs:
     environment:
       <<: *buildenv
       GHC_COLLECTOR_FLAVOR: x86_64-linux
+      TEST_ENV: x86_64-linux
     steps:
       - checkout
+      - *set_git_identity
       - *prepare
       - *submodules
       - *boot
@@ -113,6 +127,7 @@ jobs:
       - *storeartifacts
       - *test
       - *store_test_results
+      - *push_perf_note
 
   "validate-x86_64-freebsd":
     resource_class: xlarge
@@ -122,8 +137,10 @@ jobs:
       TARGET: FreeBSD
       <<: *buildenv
       GHC_COLLECTOR_FLAVOR: x86_64-freebsd
+      TEST_ENV: x86_64-freebsd
     steps:
       - checkout
+      - *set_git_identity
       - *prepare
       - *submodules
       - *boot
@@ -133,6 +150,7 @@ jobs:
       - *storeartifacts
       - *test
       - *store_test_results
+      - *push_perf_note
 
   "validate-x86_64-darwin":
     macos:
@@ -147,8 +165,10 @@ jobs:
       # Build with in-tree GMP since this isn't available on OS X by default.
       CONFIGURE_OPTS: --with-intree-gmp
       <<: *buildenv
+      TEST_ENV: x86_64-darwin
     steps:
       - checkout
+      - *set_git_identity
       - *prepare
       - *submodules
       - *boot
@@ -158,6 +178,7 @@ jobs:
       - *storeartifacts
       - *test
       - *store_test_results
+      - *push_perf_note
 
   "validate-hadrian-x86_64-linux":
     resource_class: xlarge
@@ -167,6 +188,7 @@ jobs:
       <<: *buildenv
     steps:
       - checkout
+      - *set_git_identity
       - *prepare
       - *submodules
       - *boot
@@ -179,8 +201,10 @@ jobs:
       - image: ghcci/x86_64-linux:0.0.4
     environment:
       <<: *buildenv
+      TEST_ENV: x86_64-linux-unreg
     steps:
       - checkout
+      - *set_git_identity
       - *prepare
       - *submodules
       - *boot
@@ -188,6 +212,7 @@ jobs:
       - *make
       - *test
       - *store_test_results
+      - *push_perf_note
 
   "validate-x86_64-linux-llvm":
     resource_class: xlarge
@@ -196,6 +221,7 @@ jobs:
     environment:
       <<: *buildenv
       BUILD_FLAVOUR: perf-llvm
+      TEST_ENV: x86_64-linux-llvm
     steps:
       - run:
           name: Install LLVM
@@ -206,12 +232,14 @@ jobs:
           name: Verify that llc works
           command: llc
       - checkout
+      - *set_git_identity
       - *prepare
       - *submodules
       - *boot
       - *configure_unix
       - *make
       - *test
+      - *push_perf_note
 
   # Nightly build with -DDEBUG using devel2 flavour
   "validate-x86_64-linux-debug":
@@ -221,8 +249,11 @@ jobs:
     environment:
       BUILD_FLAVOUR: devel2
       <<: *buildenv
+      TEST_ENV: x86_64-linux-debug
+      SKIP_PERF_TESTS: YES
     steps:
       - checkout
+      - *set_git_identity
       - *prepare
       - *submodules
       - *boot
@@ -230,6 +261,7 @@ jobs:
       - *make
       - *test
       - *store_test_results
+      - *push_perf_note
 
   "validate-i386-linux":
     resource_class: xlarge
@@ -238,8 +270,10 @@ jobs:
     environment:
       <<: *buildenv
       GHC_COLLECTOR_FLAVOR: i386-linux
+      TEST_ENV: i386-linux
     steps:
       - checkout
+      - *set_git_identity
       - *prepare
       - *submodules
       - *boot
@@ -249,6 +283,7 @@ jobs:
       - *storeartifacts
       - *test
       - *store_test_results
+      - *push_perf_note
 
   "validate-x86_64-fedora":
     resource_class: xlarge
@@ -257,8 +292,10 @@ jobs:
     environment:
       <<: *buildenv
       GHC_COLLECTOR_FLAVOR: x86_64-fedora
+      TEST_ENV: x86_64-fedora
     steps:
       - checkout
+      - *set_git_identity
       - *prepare
       - *submodules
       - *boot
@@ -268,6 +305,7 @@ jobs:
       - *storeartifacts
       - *test
       - *store_test_results
+      - *push_perf_note
 
   "slow-validate-x86_64-linux":
     resource_class: xlarge
@@ -285,6 +323,7 @@ jobs:
       - *make
       - *slowtest
       - *store_test_results
+      - *push_perf_note
 
 workflows:
   version: 2
diff --git a/.circleci/push-test-metrics.sh b/.circleci/push-test-metrics.sh
new file mode 100755
index 0000000000..4ea6958d99
--- /dev/null
+++ b/.circleci/push-test-metrics.sh
@@ -0,0 +1,46 @@
+#!/usr/bin/env bash
+# vim: sw=2 et
+set -euo pipefail
+
+fail() {
+  echo "ERROR: $*" >&2
+  exit 1
+}
+
+GHC_ORIGIN=git@git.haskell.org:ghc
+
+# Add git.haskell.org as a known host.
+echo "|1|F3mPVCE55+KfApNIMYQ3Dv39sGE=|1bRkvJEJhAN2R0LE/lAjFCEJGl0= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBBUZS9jGBkE5UzpSo6irnIgcQcfzvbuIOsFc8+N61FwtZncRntbaKPuUimOFPgeaUZLl6Iajz6IIs7aduU0/v+I=" >> ~/.ssh/known_hosts
+echo "|1|2VUMjYSRVpT2qJPA0rA9ap9xILY=|5OThkI4ED9V0J+Es7D5FOD55Klk= ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC+3TLluLAO4lkW60W+N2DFkS+WoRFGqLwHzgd1ifxG9TIm31wChPY3E/hgMnJmgGqWCF4UDUemmyCycEaL7FtKfzjTAclg9EfpQnozyE3T5hIo2WL7SN5O8ttG/bYGuDnn14jLnWwJyN4oz/znWFiDG9e2Oc9YFNlQ+PK8ae5xR4gqBB7EOoj9J1EiPqG2OXRr5Mei3TLsRDU6fnz/e4oFJpKWWeN6M63oePv0qoaGjxcrATZUWsuWrxVMmYo9kP1xRuFJbAUw2m4uVP+793SW1zxySi1HBMtJG+gCDdZZSwYbkV1hassLWBHv1qPttncfX8Zek3Z3VolaTmfWJTo9" >> ~/.ssh/known_hosts
+
+# Check that a git notes dont already exist.
+# This is a percausion as we reset refs/notes/perf and we want to avoid data loss.
+if [ $(git notes --ref=perf list | wc -l) -ne 0 ]
+then
+  fail "Found an existing git note on HEAD. Expected no git note."
+fi
+
+# Assert that the METRICS_FILE exists and can be read.
+if [ "$METRICS_FILE" = "" ] || ! [ -r $METRICS_FILE ]
+then
+  fail "Metrics file not found: $METRICS_FILE"
+fi
+
+# Reset the git notes and append the metrics file to the notes, then push and return the result.
+# This is favoured over a git notes merge as it avoids potential data loss/duplication from the merge strategy.
+function reset_append_note_push {
+  git fetch -f $GHC_ORIGIN refs/notes/perf:refs/notes/perf || true
+  echo "git notes --ref=perf append -F $METRICS_FILE HEAD"
+  git notes --ref=perf append -F $METRICS_FILE HEAD
+  git push $GHC_ORIGIN refs/notes/perf
+}
+
+# Push the metrics file as a git note. This may fail if another task pushes a note first. In that case
+# the latest note is fetched and appended.
+MAX_RETRY=20
+until reset_append_note_push || [ MAX_RETRY = 0 ]
+do
+  ((MAX_RETRY--))
+  echo ""
+  echo "Failed to push git notes. Fetching, appending, and retrying..."
+done
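The fetch, append, push, and retry pattern used by push-test-metrics.sh can be sketched in Python as follows. This is an illustrative rendering, not part of the patch: `push_with_retry` and the Python `reset_append_note_push` are hypothetical helpers, and only the git commands themselves are taken from the script. Note that the script's `[ MAX_RETRY = 0 ]` test compares the literal string `MAX_RETRY` rather than the variable's value, so the sketch makes the retry bound explicit instead.

```python
import subprocess
from typing import Callable


def push_with_retry(attempt: Callable[[], bool], max_retry: int = 20) -> bool:
    """Call attempt() until it returns True or max_retry attempts were made."""
    for n in range(1, max_retry + 1):
        if attempt():
            return True
        print(f"Failed to push git notes (attempt {n}/{max_retry}). "
              "Fetching, appending, and retrying...")
    return False


def reset_append_note_push(metrics_file: str,
                           origin: str = "git@git.haskell.org:ghc") -> bool:
    """Reset local perf notes to the remote's, append our metrics, and push."""
    # A failed fetch is tolerated, mirroring `|| true` in the script.
    subprocess.run(["git", "fetch", "-f", origin,
                    "refs/notes/perf:refs/notes/perf"])
    append = subprocess.run(["git", "notes", "--ref=perf", "append",
                             "-F", metrics_file, "HEAD"])
    if append.returncode != 0:
        return False  # unlike the shell function, stop early if append fails
    push = subprocess.run(["git", "push", origin, "refs/notes/perf"])
    return push.returncode == 0
```

A caller would combine the two as `push_with_retry(lambda: reset_append_note_push(path))`, where `path` is the metrics file written by the test run. Because each attempt force-fetches the remote notes before appending, a push that loses a race with another CI job is retried on top of that job's notes.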