| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was suggested in #18417.
Change the print format of the values.
* Shorten commit hash
* Reduce precision of the "Value" field
* Shorten metrics name
* e.g. runtime/bytes allocated -> run/alloc
* Shorten "MetricsChange"
* e.g. unchanged -> unch, increased -> incr
And, print the baseline environment if there are baselines that were
measured in a different environment than the current environment.
If all "Baseline commit" values are the same, print the commit once.
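The shortening rules listed above could be sketched roughly as follows. The function names, the lookup tables, and the 10-character hash length are assumptions made for illustration; only the example pairs (`runtime/bytes allocated -> run/alloc`, `unchanged -> unch`, `increased -> incr`) come from this message.

```python
# Hypothetical abbreviation tables. Only the pairs named in the commit
# message are taken from the source; anything else passes through unchanged.
METRIC_ABBREV = {"runtime": "run", "bytes allocated": "alloc"}
CHANGE_ABBREV = {"unchanged": "unch", "increased": "incr"}


def shorten_hash(commit, length=10):
    """Keep only a prefix of the commit hash (the length is an assumption)."""
    return commit[:length]


def shorten_metric(name):
    """Abbreviate each '/'-separated component of a metric name."""
    return "/".join(METRIC_ABBREV.get(part, part) for part in name.split("/"))


def shorten_change(change):
    """Abbreviate a metric-change label, leaving unknown labels as they are."""
    return CHANGE_ABBREV.get(change, change)
```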
|
| |
|
|
|
|
| |
Tamar reported that he saw crashes due to unhandled exceptions.
|
|
|
|
|
| |
After all, it's possible we were unable to create it due to lack of
symlink permission.
|
|
|
|
| |
Needs to `write` bytes, not str.
|
|
|
|
| |
Closes #17706.
|
|
|
|
|
|
|
|
| |
Previously we used platform.system() and while this worked fine (e.g.
returned `Windows`, as expected) locally under both msys and MingW64
Python distributions, it inexplicably returned `MINGW64_NT-10.0`
under MingW64 Python on CI. It seems os.name is more reliable so we now
use that instead.
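A minimal sketch of a check based on `os.name`. The helper name `is_windows` and its parameter are hypothetical, and the driver's actual test may differ.

```python
import os


def is_windows(os_name=os.name):
    # os.name is a short fixed identifier ('nt', 'posix', ...), whereas
    # platform.system() reflects uname-style information, which is what
    # produced 'MINGW64_NT-10.0' on CI according to this message.
    return os_name == "nt"
```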
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This tries to put the testsuite driver into a slightly more maintainable
condition:
* Add type annotations where easily done
* Use pathlib.Path instead of str paths
* Make it pass the mypy typechecker
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This uses the Chart.js javascript library.
Everything is put into a standalone .html file and opened with the
default browser.
I also simplified the text output to use the same data as the chart.
You can now use a commit range with git's ".." syntax.
The --ci option will use results from CI (you'll need to fetch them
first):
$ git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/ci/perf
$ python3 testsuite/driver/perf_notes.py --ci --chart --test-env x86_64-darwin --test-name T9630 master~500..master
|
| |
|
|
|
|
| |
and CI results."
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the JUnit output more useful as now we also report the
stdout/stderr in the message which can be used to quickly identify why a
test is failing without downloading the log.
This also introduces TestResult; previously we were simply passing
around tuples, making the implementation rather difficult to follow
and harder to extend.
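To illustrate why a named record is easier to follow than a bare tuple, here is a sketch of what such a type might look like. Only the name `TestResult` comes from this message; every field name and the `junit_message` helper are assumptions.

```python
from typing import NamedTuple


class TestResult(NamedTuple):
    # All field names below are hypothetical.
    directory: str
    testname: str
    reason: str
    way: str
    stdout: str = ""
    stderr: str = ""


def junit_message(result):
    """Build a failure message that includes the captured output."""
    return "{} ({}): {}\nstdout:\n{}\nstderr:\n{}".format(
        result.testname, result.way, result.reason,
        result.stdout, result.stderr)
```

Fields are read by name (`result.way`) rather than by position, so adding a field later does not silently shift the meaning of existing indices.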
|
|
|
|
|
|
|
|
|
| |
results."
Unfortunately this has broken all future commits due to spurious(?)
performance changes which I have been unable to work around.
This reverts commit cc2261d42f6a954d88e355aaad41f001f65c95da.
|
|
|
|
| |
gitlab-ci: push performance metrics as git notes to the "GHC Performance Notes" repository.
|
|
|
|
| |
This reverts commit 76c8fd674435a652c75a96c85abbf26f1f221876.
|
| |
|
| |
This patch makes the following improvements:
- Automatically records test metrics (per test environment) so that
the programmer need not supply nor update expected values in *.T
files.
- On expected metric changes, the programmer need only indicate the
direction of change in the git commit message.
- Provides a simple python tool "perf_notes.py" to compare metrics
over time.
Issues:
- Using just the previous commit allows performance to drift with each
commit.
- Currently we allow drift as we have a preference for minimizing
false positives.
- Some possible alternatives include:
- Use metrics from a fixed commit per test: the last commit that
allowed a change in performance (else the oldest metric)
- Or use some sort of aggregate since the last commit that allowed
a change in performance (else all available metrics)
- These alternatives may result in a performance issue (with the
test driver) having to heavily search git commits/notes.
- Run locally, performance tests will trivially pass unless the tests
were run locally on the previous commit. This is often not the case
e.g. after pulling recent changes.
Previously, *.T files contained statements such as:
```
stats_num_field('peak_megabytes_allocated', (2, 1))
compiler_stats_num_field('bytes allocated',
[(wordsize(64), 165890392, 10)])
```
This required the programmer to give the expected values and a tolerance
deviation (percentage). With this patch, the above statements are
replaced with:
```
collect_stats('peak_megabytes_allocated', 5)
collect_compiler_stats('bytes allocated', 10)
```
The programmer now only needs to specify which metrics to test and a
tolerance deviation. No expected value is required. CircleCI will then run the
tests per test environment and record the metrics to a git note for that
commit and push them to the git.haskell.org ghc repo. Metrics will be
compared to the previous commit. If they are different by the tolerance
deviation from the *.T file, then the corresponding test will fail. By
adding to the git commit message e.g.
```
# Metric (In|De)crease <metric(s)> <options>: <tests>
Metric Increase ['bytes allocated', 'peak_megabytes_allocated'] \
(test_env='linux_x86', way='default'):
Test012, Test345
Metric Decrease 'bytes allocated':
Test678
Metric Increase:
Test711
```
This will allow the noted changes (letting the test pass). Note that by
omitting metrics or options, the change will apply to all possible
metrics/options (i.e. in the above, an increase for all metrics in all
test environments is allowed for Test711)
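The comparison against the previous commit's metric, described earlier in this message, might be sketched as follows. The function name and the returned labels are assumptions; only the idea of a percentage tolerance around the baseline comes from the text.

```python
def metric_change(baseline, current, tolerance_pct):
    """Classify `current` relative to `baseline` within a percentage tolerance.

    Returns 'increased', 'decreased' or 'unchanged' (hypothetical labels).
    """
    lower = baseline * (1 - tolerance_pct / 100)
    upper = baseline * (1 + tolerance_pct / 100)
    if current > upper:
        return "increased"
    if current < lower:
        return "decreased"
    return "unchanged"
```

In this sketch a test would fail on anything other than `'unchanged'`, unless the commit message declares the corresponding `Metric Increase` or `Metric Decrease`.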
phabricator will use the message in the description
Reviewers: bgamari, hvr
Reviewed By: bgamari
Subscribers: rwbarton, carter
GHC Trac Issues: #12758
Differential Revision: https://phabricator.haskell.org/D5059
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In Python 3, subprocess.communicate() returns a pair of bytes, which
need to be decoded. In runtests.py, we were just calling str() instead,
which converts b'x' to "b'x'". As a result, the loop that was checking
pkginfo for lines starting with 'library-dirs' couldn't work.
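The difference between `str()` and `decode()` on bytes can be shown in isolation. The sample `raw` value and the helper name below are made up for illustration; they stand in for the output returned by `communicate()`.

```python
def find_library_dirs(text):
    """Return the lines of `text` that start with 'library-dirs'."""
    return [line for line in text.splitlines()
            if line.startswith("library-dirs")]


# Stand-in for the bytes that communicate() returns under Python 3.
raw = b"name: base\nlibrary-dirs: /usr/lib\n"

# str() yields the repr of the bytes object, so the text begins with "b'"
# and the newlines become literal backslash-n sequences: nothing matches.
broken = find_library_dirs(str(raw))

# decode() yields the actual text, so the line is found.
fixed = find_library_dirs(raw.decode("utf-8"))
```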
Reviewers: bgamari, thomie, Phyx
Reviewed By: thomie
Subscribers: Phyx, rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5046
|
|
|
|
|
|
|
|
|
|
|
|
| |
Test Plan: validate
Reviewers: bgamari, O7 GHC - Testsuite
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4972
|
|
|
|
|
|
|
|
| |
Reviewers: austin
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3716
|
| |
In a land far far away, a project called Cygwin was born.
Cygwin used newlib as its standard C library implementation.
But Cygwin wanted to emulate POSIX systems as closely as possible.
So it implemented `execv` using the Windows function `spawnve`.
Specifically
```
spawnve (_P_OVERLAY, path, argv, cur_environ ())
```
`_P_OVERLAY` is crucial, as it makes the function behave *sort of*
like execv on Linux: the child process replaces the original process.
There is one major difference, due to the different process model
on Windows: the original process signals the caller that it's done.
This is why the file is still locked: control was returned because
the parent process was destroyed, but the child is still running.
I think it's just pure dumb luck that the older runtimes are slow
enough to give the process time to terminate before we tried deleting
the file. That explains why you do have sporadic failures even on
older runtimes like 2.5.0, of a test or two (like T7307).
So this patch fixes a couple of things. I leverage the existing
`timeout.exe` to implement a workaround for this issue.
a) The old timeout used to start the process then assign it to the job.
This is slightly faulty since child processes are only assigned to a
job if their parent was assigned at the time they started. So this
was a race condition. I now create the process suspended, assign it
to the job and then resume it, which means all child processes are
now running under the same job.
b) First, to prevent dangling child processes, I mark the job
with `JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE` so when the last process in
the job is done, it ensures all processes under the job are killed.
c) Secondly, I change the way we wait for results. Instead of waiting
for the parent process to terminate, I wait for the job itself to
terminate.
There's a slight subtlety there: we can't wait on the job itself.
Instead we have to create an I/O Completion port and wait for signals
on it. See
https://blogs.msdn.microsoft.com/oldnewthing/20130405-00/?p=4743
This fixes the issues on all runtimes for me and makes T7307 pass
consistently.
The threading was also simplified by hiding all the locking in a single
semaphore and a completion class. Furthermore, some additional error
reporting was added.
For encoding, the testsuite no longer passes a file handle to the
subprocess, since on Windows sh.exe seems to acquire a lock on the file
that is not released in a timely fashion.
I suspect this is because Cygwin seems to emulate console handles by
creating file handles and using those for std handles. So when we give
it an existing file handle it just locks the file. I think what's
happening is that it's not releasing the handle until all shared Cygwin
processes are dead, which explains why it worked in single threaded mode.
So now instead we pass a pipe and do not interpret the resulting data.
Any bytes written to stdin or read out of stdout/stderr are done so in
binary mode and we do not interpret the data. The reason for this is
that we have encoding tests in GHC which pass invalid UTF-8. If we try
to handle the data as text then Python will throw an exception instead
of a test comparison failing.
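A minimal sketch of the pipe-based, binary-mode approach described above. The helper name `run_binary` is hypothetical, and the driver's real code also handles timeouts and locking, which are omitted here.

```python
import subprocess
import sys


def run_binary(cmd, stdin_bytes=b""):
    """Run `cmd` with pipes for all std handles and return raw bytes.

    No text mode and no decoding: invalid UTF-8 from the child is passed
    through untouched, so it can be compared byte-for-byte.
    """
    proc = subprocess.Popen(cmd,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate(stdin_bytes)
    return proc.returncode, out, err


# Example child process that writes bytes which are not valid UTF-8.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'\\xff\\xfe')"]
code, out, err = run_binary(child)
```

Calling `out.decode("utf-8")` on this output would raise `UnicodeDecodeError`, which is the exception the message refers to; comparing `out` as bytes avoids it.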
Also I have fixed the ability to override `PYTHON` when calling `make
tests`. This now works the same as with `.\validate`.
Finally, after cleaning up the locks I was able to make the abort
behavior work correctly as I believe it was intended: when you press
Ctrl+C and send an interrupt signal, the testsuite finishes the active
tests and then gracefully exits showing you a report of the progress it
did make. So using Ctrl+C will no longer just *die* as it did before.
These changes lift the restrictions on which Python distribution
(msys/mingw), which runtime, and which Python version (2 or 3) you use.
All combinations should now be supported.
Test Plan:
PATH=/usr/local/bin:/mingw64/bin:$APPDATA/cabal/bin:$PATH &&
PYTHON=/usr/bin/python THREADS=9 make test
THREADS=9 make test
PATH=/usr/local/bin:/mingw64/bin:$APPDATA/cabal/bin:$PATH &&
PYTHON=/usr/bin/python ./validate --quiet --testsuite-only
Reviewers: erikd, RyanGlScott, bgamari, austin
Subscribers: jrtc27, mpickering, thomie, #ghc_windows_task_force
Differential Revision: https://phabricator.haskell.org/D2684
GHC Trac Issues: #12725, #12554, #12661, #12004
|
|
|
|
|
|
|
|
|
|
|
|
| |
While msys' mingw Python 3 does indeed export `os.symlink`, it is
unusable since creating symbolic links on Windows requires permissions
that essentially no one has.
Test Plan: Validate on Windows
Reviewers: austin, Phyx, thomie
Differential Revision: https://phabricator.haskell.org/D2604
|
|
|
|
|
|
| |
* Set config settings directly in mk/test.mk, instead of indirectly in
config/ghc
* passing --hpcdir for WAY=hpc is unnecessary
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Major change to the testsuite driver.
For each TEST:
* create a directory `<testdir>` inside `/tmp`.
* link/copy all source files that the test needs into `<testdir>`.
* run the test inside `<testdir>`.
* delete `<testdir>`
Extra files are (temporarily) tracked in
`testsuite/driver/extra_files.py`, but can also be specified using the
`extra_files` setup function.
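The per-test steps listed above can be sketched as follows. The helper name, the `ghctest-` prefix, and the use of plain copying (rather than linking) are assumptions; the system temporary directory stands in for `/tmp`.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_testdir(sources, run):
    """Create a fresh directory, copy `sources` into it, run, then clean up.

    `run` is a callable that receives the test directory as a Path.
    """
    testdir = Path(tempfile.mkdtemp(prefix="ghctest-"))
    try:
        for src in sources:
            shutil.copy(str(src), str(testdir / Path(src).name))
        return run(testdir)
    finally:
        # Delete the directory even if the test raised an exception.
        shutil.rmtree(str(testdir), ignore_errors=True)
```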
Differential Revision: https://phabricator.haskell.org/D1187
Reviewed by: Rufflewind, bgamari
Trac: #11980
|
|
|
|
|
| |
Python 3 support seems to have mildly bitrotten since #9184 was closed.
Luckily, only some minor tweaks seem necessary.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rewrite config/ghc to use getStdout (which uses subprocess.Popen) instead
of os.popen, which is deprecated; this also avoids the use of a shell.
Also:
* Move getStdout to driver/testutil.py so both config/ghc and
driver/runtests.py can use it
* Remove support for Python below 2.4, which doesn't have subprocess
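A helper of this kind might look roughly like the sketch below. The name `getStdout` comes from this message, but the body, the error handling, and the decoding step are assumptions rather than the driver's actual code.

```python
import subprocess
import sys


def getStdout(cmd_and_args):
    """Run a command given as an argument list and return its stdout.

    Passing a list (and leaving shell=False) means no shell is involved.
    """
    proc = subprocess.Popen(cmd_and_args,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError("Command failed: {!r}".format(cmd_and_args))
    return stdout.decode("utf-8")
```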
Reviewed By: thomie
Differential Revision: https://phabricator.haskell.org/D908
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If one runs the testsuite with a profiling compiler, during the import
of `testlib.py`, `testlib.py` sets the global variable `gs_working`. To
do so, it executes a few statements which require the function
`strip_quotes` to be in scope. But that function only gets defined at
the very end of testlib.py.
This patch moves the definition of `strip_quotes` to testutil.py, which
is imported at the very top of testlib.py. This unbreaks the nightly
builders.
Reviewed By: austin
Differential Revision: https://phabricator.haskell.org/D728
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a fixup of https://phabricator.haskell.org/D233
The only difference is in findTFiles (first commit), which
previously broke the Windows runner; now I translated it literally
instead of attempting to improve it, and checked that it works.
Test Plan:
I used validate under 2,3 on Linux and under 2 on msys2.
On Windows I've seen a large number of failures, but they don't
seem to be connected with the patch.
Reviewers: hvr, simonmar, thomie, austin
Reviewed By: austin
Subscribers: thomie, carter, ezyang, simonmar
Differential Revision: https://phabricator.haskell.org/D310
GHC Trac Issues: #9184
|
|
|
|
|
|
| |
This reverts commit 084d241b316bfa12e41fc34cae993ca276bf0730.
This is a possible culprit of Windows breakage reported at ghc-devs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Most of the changes are adaptations of old Python 2-only code.
My priority was not breaking Python 2, and so I avoided bigger
changes to the driver. In particular, under Python 3 the output
is a str and buffering cannot be disabled.
To test, define PYTHON=python3 in testsuite/mk/boilerplate.mk.
Thanks to aspidites <emarshall85@gmail.com> who provided the initial patch.
Test Plan: validate under 2 and 3
Reviewers: hvr, simonmar, thomie, austin
Reviewed By: thomie, austin
Subscribers: aspidites, thomie, simonmar, ezyang, carter
Differential Revision: https://phabricator.haskell.org/D233
GHC Trac Issues: #9184
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before:
tc056(normal)
tc056(opt)
tc056(optasm)
tc056(prof)
tc056(profasm)
tc056(unreg)
After:
tc056(normal,opt,optasm,prof,profasm,unreg)
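The grouping shown in the "After" output could be produced along these lines. The function name and the `(test name, way)` input shape are assumptions.

```python
from collections import OrderedDict


def group_by_test(results):
    """Collapse (test name, way) pairs into one summary line per test."""
    grouped = OrderedDict()
    for name, way in results:
        grouped.setdefault(name, []).append(way)
    return ["{}({})".format(name, ",".join(ways))
            for name, ways in grouped.items()]
```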
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added support for testing generation and compilation of External Core
code. There are two new ways, which are not automatically enabled but can be
invoked from the command line: extcore and optextcore. Invoking either way will
test that ghc is able to generate External Core code for a given test, read the
code back in, and compile it to an executable that produces the expected output
for the test.
The External Core facility has a few limitations which result in certain tests
failing for the "extcore" way.
- External Core can't represent foreign calls other than static C calls
- External Core can't correctly represent literals resulting from a
"foreign label" declaration
- External Core can't represent declarations of datatypes with no
constructors
The first of these was already known, and GHC panics if you try to
generate External Core for a program containing such a call. The second two
cases were not handled properly before now; in another commit, I've changed the
code that emits External Core to panic if either of them arises. Previously,
GHC would happily generate External Core in either case, but would not be able
to compile the resulting code.
There are several tests that exhibit these limitations of External Core, so
they've had to be made "expected failures" when compiling in the extcore or
optextcore ways.
|
|
|
|
|
|
|
|
|
|
| |
- Move some of the way-selection logic into the configuration file;
the build system now just passes in variables saying whether the
compiler supports profiling and native code generation, and the
configuration file adds the appropriate ways.
- Add a new option to the test driver, --way=<way> to select just a
single way.
|
|
Revamp the testsuite framework. The previous framework was an
experiment that got a little out of control - a whole new language
with an interpreter written in Haskell was rather heavyweight and left
us with a maintenance problem.
So the new test driver is written in Python. The downside is that you
need Python to run the testsuite, but we don't think that's too big a
problem since it only affects developers and Python installs pretty
easily onto everything these days.
Highlights:
- 790 lines of Python, vs. 5300 lines of Haskell + 720 lines of
<strange made-up language>.
- the framework supports running tests in various "ways", which should
catch more bugs. By default, each test is run in three ways:
normal, -O, and -O -fasm. Additionally, if profiling libraries
have been built, another way (-O -prof -auto-all) is added. I plan
to also add a 'GHCi' way.
Running tests multiple ways has already shown up some new bugs!
- documentation is in the README file and is somewhat improved.
- the framework is rather less GHC-specific, and could without much
difficulty be coaxed into using other compilers. Most of the
GHC-specificness is in a separate configuration file (config/ghc).
Things may need a while to settle down. Expect some unexpected
failures.
|