author    Tamar Christina <tamar@zhox.com>    2017-01-28 04:19:02 +0000
committer Tamar Christina <tamar@zhox.com>    2017-01-28 04:22:52 +0000
commit    1f366b8d15feaa05931bd2d81d8b0c5bae92f3b8 (patch)
tree      1a3addf4562d5ae3147bbef2b6dee15f96cf70f5
parent    2af38b065b506cd86e9be20d9592423730f0a5e2 (diff)
download  haskell-1f366b8d15feaa05931bd2d81d8b0c5bae92f3b8.tar.gz
Add delete retry loop. [ci skip]
Summary: On Windows we have to retry the delete a couple of times. The reason is that a `DeleteFile` call just marks a file for deletion; the file is really removed only when the last handle to it is closed. Unfortunately there are a lot of system services that can have a file temporarily opened using a shared readonly lock, such as the built-in AV and search indexer. We can't really guarantee that these are all off, so if the folder still exists after an `rmtree`, we wait a bit and try again.

Based on what I've seen from the tests on the CI server, this is relatively rare, so overall we won't be retrying a lot. If after a reasonable amount of time the folder is still locked, abort the current test by throwing an exception, so it won't fail with an even more cryptic error.

The issue is that these services often open a file using `FILE_SHARE_DELETE` permissions. So the file can seemingly be removed, and for most intended purposes it is, but recreating a file with the same name will fail because the FS prevents data loss. The MSDN docs for `DeleteFile` say:

```
The DeleteFile function marks a file for deletion on close. Therefore, the
file deletion does not occur until the last handle to the file is closed.
Subsequent calls to CreateFile to open the file fail with
ERROR_ACCESS_DENIED.
```

Retrying seems to be a common pattern; SQLite has it in their driver: http://www.sqlite.org/src/info/89f1848d7f

The only way to avoid this entirely would be to run each way of a test in its own folder, which would also have the added bonus of increased parallelism.

Reviewers: austin, bgamari

Reviewed By: bgamari

Subscribers: thomie, #ghc_windows_task_force

Differential Revision: https://phabricator.haskell.org/D2936

GHC Trac Issues: #12661, #13162
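The retry-with-backoff strategy described above can be sketched as a standalone helper. This is a hypothetical, simplified version for illustration, not the testsuite driver code itself; the name `rmtree_with_retries` and the shortened delays are assumptions made for the example.

```python
import os
import shutil
import tempfile
import time

def rmtree_with_retries(path, max_attempts=5):
    """Remove a directory tree, retrying with a growing delay.

    The patch waits (attempts so far) * 6 seconds between tries; the
    delay here is shortened so the example runs quickly.
    """
    for attempt in range(max_attempts):
        if not os.path.exists(path):
            return
        # Wait a little longer on each pass before trying the delete again.
        time.sleep(attempt * 0.1)
        shutil.rmtree(path, ignore_errors=True)
    if os.path.exists(path):
        # Give up with a clear error instead of a cryptic downstream failure.
        raise Exception("Unable to remove folder '%s'." % path)

# Usage: the first attempt normally succeeds; the extra passes only run
# when the folder is still present (e.g. held open by an AV scanner).
workdir = tempfile.mkdtemp()
rmtree_with_retries(workdir)
```

On POSIX the first `rmtree` almost always wins, so the loop is effectively a no-op there; the backoff only matters on Windows, where delete-on-close semantics can keep the directory alive briefly.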
-rw-r--r--   testsuite/driver/testlib.py   43
1 file changed, 29 insertions, 14 deletions
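The companion pattern in this patch, clearing the read-only bit in an `onerror` handler and retrying the failed delete, can be illustrated with a minimal standalone sketch. The driver code in the diff is the real version; the names `workdir2` and `readonly.txt` here are illustrative only.

```python
import os
import shutil
import stat
import tempfile

def on_error(function, path, excinfo):
    # Windows refuses to delete read-only files with a permission error;
    # set the write bit on the offending path and retry the failed call.
    os.chmod(path, stat.S_IWRITE)
    function(path)

# Usage: a read-only file inside the tree. On Windows the plain rmtree
# would fail and invoke on_error; on POSIX deletion is governed by the
# directory's permissions, so the handler may never fire, and rmtree
# removes the tree directly.
workdir2 = tempfile.mkdtemp()
victim = os.path.join(workdir2, "readonly.txt")
open(victim, "w").close()
os.chmod(victim, stat.S_IREAD)  # drop the write bit, as test T11489 does
shutil.rmtree(workdir2, ignore_errors=False, onerror=on_error)
```

Note that `onerror` is only invoked for failures during the walk; per https://bugs.python.org/issue8523 some failures are reported after the walk, which is why the patch still re-checks `os.path.exists` afterwards.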
diff --git a/testsuite/driver/testlib.py b/testsuite/driver/testlib.py
index c0135f0864..78e2c6f20d 100644
--- a/testsuite/driver/testlib.py
+++ b/testsuite/driver/testlib.py
@@ -1893,26 +1893,41 @@ def find_expected_file(name, suff):
if config.msys:
import stat
+ import time
def cleanup():
testdir = getTestOpts().testdir
-
+ max_attempts = 5
+ retries = max_attempts
def on_error(function, path, excinfo):
# At least one test (T11489) removes the write bit from a file it
# produces. Windows refuses to delete read-only files with a
# permission error. Try setting the write bit and try again.
- if excinfo[1].errno == 13:
- os.chmod(path, stat.S_IWRITE)
- function(path)
-
- shutil.rmtree(testdir, ignore_errors=False, onerror=on_error)
-
- if os.path.exists(testdir):
- # And now we try to cleanup the folder again, since the above
- # Would have removed the problematic file(s), but not the folder.
- # The onerror doesn't seem to be raised during the tree walk, only
- # afterwards to report the failures.
- # See https://bugs.python.org/issue8523 and https://bugs.python.org/issue19643
- shutil.rmtree(testdir, ignore_errors=False)
+ os.chmod(path, stat.S_IWRITE)
+ function(path)
+
+ # On Windows we have to retry the delete a couple of times. The reason
+ # is that a DeleteFile call just marks a file for deletion; the file is
+ # really removed only when the last handle to it is closed.
+ # Unfortunately there are a lot of system services that can have a file
+ # temporarily opened using a shared readonly lock, such as the built-in
+ # AV and search indexer.
+ #
+ # We can't really guarantee that these are all off, so if the folder
+ # still exists after an rmtree, we wait a bit and try again.
+ #
+ # Based on what I've seen from the tests on the CI server, this is
+ # relatively rare, so overall we won't be retrying a lot. If after a
+ # reasonable amount of time the folder is still locked, abort the
+ # current test by throwing an exception, so it won't fail with an even
+ # more cryptic error.
+ #
+ # See Trac #13162
+ while retries > 0 and os.path.exists(testdir):
+ time.sleep((max_attempts - retries) * 6)
+ shutil.rmtree(testdir, onerror=on_error, ignore_errors=False)
+ retries -= 1
+
+ if retries == 0 and os.path.exists(testdir):
+ raise Exception("Unable to remove folder '" + testdir + "'. Unable to start current test.")
else:
def cleanup():
testdir = getTestOpts().testdir