GHC Driver Readme
=================

Greetings and well met. If you are reading this, I can only assume that you are interested in working on the testsuite in some capacity. For more detailed documentation, please see [here][1].

[1]: https://gitlab.haskell.org/ghc/ghc/wikis/building/running-tests

## ToC

1. Entry points of the testsuite performance tests
2. Quick overview of program parts
3. How to use the comparison tool
4. Important Types
5. Quick answers for "how do I do X?"

## Entry Points of the testsuite performance tests

The testsuite has two main entry points, depending on the perspective from which you approach it.

From the perspective of the test writer, the entry point is the `collect_stats` function called in `*.T` files. This function is declared in `perf_notes.py`, along with its associated infrastructure. Its purpose is to tell the test driver which metrics to compare when processing the test.

From the perspective of running the testsuite, e.g. via make, the entry point is the `runtests.py` file. That file contains the main logic for running the individual tests, collecting information, handling failures, and outputting the final results.

## Overview of how the performance test bits work

During a Haskell Summer of Code project, an intern went through and revamped most of the performance test code, so there have been a few changes that might be unusual to anyone previously familiar with the testsuite.

One of the biggest immediate benefits is that platform differences, compiler differences, and the like no longer need to be considered by the test writer. This is because the test comparison relies entirely on metrics collected locally on the testing machine. As such, it is perfectly sufficient to write `collect_stats('all', 20)` in the `.T` file to measure the three potential stats that can be collected for that test and automatically check them for regressions, failing if there is more than a 20% change in any direction. In fact, even that is not necessary, as `collect_stats()` defaults to 'all' and a 20% allowed deviation.

The function `collect_compiler_stats()` is equivalent to `collect_stats()` in every way, except that it measures the performance of the compiler itself rather than the performance of the code generated by the compiler. See the implementation of `collect_stats` in `/driver/testlib.py` for more information.

If the performance of a test improves so much that the test fails, the value will still be recorded. The warning that is emitted is merely a precaution so that the programmer can double-check that they didn't introduce a bug; something that might be suspicious if the test suddenly improves by 70%, for example.
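To make this concrete, here is a sketch of what performance-test entries in an `all.T` file look like. The test names and compiler flags below are invented for illustration; `test`, `collect_stats`, `collect_compiler_stats`, and `compile_and_run` are provided by the test driver when it loads the `.T` file, so no imports are needed.

```python
# Hypothetical all.T entries (test names and flags are made up).

# Measure all three stats with the default 20% allowed deviation.
test('T12345', [collect_stats()], compile_and_run, [''])

# Measure only runtime allocations with a tighter 5% window, and also watch
# the compiler's own allocations with a 10% window.
test('T67890',
     [collect_stats('bytes allocated', 5),
      collect_compiler_stats('bytes allocated', 10)],
     compile_and_run,
     ['-O2'])
```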
Performance metrics for performance tests are now stored in git notes under the namespace 'perf'. Each line of the git note file represents a single metric for a particular test: `$test_env $test_name $test_way $metric_measured $value_collected` (delimited by tabs).

One can view the maximum deviation a test allows by looking inside its respective `all.T` file. Additionally, if the verbosity level of the testsuite is set to a value >= 4, a good amount of output is printed per test, detailing all the information about the collected values. This information is also printed if a test falls outside of the allowed bounds (see the `test_cmp` function in `/driver/perf_notes.py` for the exact formatting of the message).

The git notes are only appended to by the testsuite, in a single atomic Python subprocess at the end of the test run; if the run is canceled at any time, the notes will not be written. The note-appending command will be retried up to 4 times if it fails (for example, due to a lock on the repository), although this is never anticipated to happen. If, for some reason, the 5 attempts are not enough, an error message will be printed. Further, there is currently no process or method for stripping duplicates, updating values, etc., so if the testsuite is run multiple times per commit, there will be multiple values in the git notes corresponding to the tests run. In this case, the average value is used.

## Quick overview of program parts

The relevant bits of the directory tree are as follows:

```
├── driver                   -- Testsuite driver directory
│   ├── junit.py             -- Contains code implementing JUnit features.
│   ├── kill_extra_files.py  -- Some of the uglier implementation details.
│   ├── perf_notes.py        -- Comparison tool and performance tests.
│   ├── runtests.py          -- Main entry point for the program; runs tests.
│   ├── testglobals.py       -- Global data structures and objects.
│   ├── testlib.py           -- Bulk of the implementation is in here.
│   └── testutil.py          -- Misc helper functions.
├── mk
│   └── test.mk              -- Master makefile for running tests.
└── tests                    -- Main tests directory.
```

## How to Use the Comparison Tool

The comparison tool lives in `/driver/perf_notes.py`. When the testsuite is run, the performance metrics of the performance tests are automatically saved in a local git note attached to the commit. The comparison tool is designed to help analyze performance metrics across commits using this information. Currently, it can only be run by executing the file directly, like so:

```
$ python3 perf_notes.py (arguments go here)
```

If you run `perf_notes.py -h`, you will see a description of all of the arguments and how to use them. The optional arguments exist to filter the output to include only the commits that you're interested in. The most typical usage of this tool will likely be running `perf_notes.py HEAD 'HEAD~1' '(commit hash)' ...`.

The performance metrics stored in git notes remain strictly local to the machine; as such, performance metrics will not exist for a commit until you check out that commit and run the testsuite (or the relevant test).

## Quick Answers for "How do I do X?"

* Q: How do I add a flag to `make test` to extend the testsuite functionality?
  1. Add the flag to the appropriate global object in `testglobals.py`.
  2. Add an argument to the parser in `runtests.py` that sets the flag.
  3. Go to the `testsuite/mk/test.mk` file and add a new `ifeq` (or `ifneq`) block. I suggest adding the block around line 200.

  (A sketch of steps 1 and 2 appears at the end of this README.)

* Q: How do I modify how performance tests work?
  * That functionality resides in `perf_notes.py`, which has pretty good in-code documentation.
  * Additionally, you will want to look at `compile_and_run`, `simple_run`, and `simple_build` in `testlib.py`.
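Here is the sketch promised above for the first question. Everything in it is a hypothetical illustration (the flag name, the attribute, the standalone parser); in the real driver the config object is defined in `testglobals.py` and the argument parser lives in `runtests.py`, and step 3 additionally needs an `ifeq`/`ifneq` block in `testsuite/mk/test.mk` (make, not shown here).

```python
import argparse

# Step 1 (testglobals.py): give the global config object a field for the flag.
# This class is only a stand-in for the real one.
class TestConfig:
    def __init__(self):
        self.my_new_flag = False  # hypothetical flag, off by default

config = TestConfig()

# Step 2 (runtests.py): add a parser argument that sets the flag.
parser = argparse.ArgumentParser()
parser.add_argument('--my-new-flag', action='store_true',
                    help='enable some hypothetical new behaviour')

args = parser.parse_args(['--my-new-flag'])  # example invocation
config.my_new_flag = args.my_new_flag
print(config.my_new_flag)  # True
```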