Commit message | Author | Age | Files | Lines
* Refactor test to keep setup/teardown out of timer [jenkins-elixir-retry-until-flakes] | Adam Kocoloski | 2019-08-18 | 1 | -8/+11
|
* Extend timeout on shard splitting test | Adam Kocoloski | 2019-08-18 | 1 | -1/+1
|
* Bump default timeout for retry_until | Adam Kocoloski | 2019-08-18 | 1 | -1/+1
|
* Give resharding more time to complete | Adam Kocoloski | 2019-08-17 | 1 | -2/+2
|
* Extend timeouts for chttpd_view_test suite | Adam Kocoloski | 2019-08-16 | 1 | -6/+7
| | | | More occasional flakiness on Jenkins.
* Extend timeouts for chttpd_db_test suite | Adam Kocoloski | 2019-08-16 | 1 | -42/+43
| | | | | | The last 9 tests take a few hundred milliseconds locally and flaked a bit on Jenkins. For consistency's sake we bump the timeout from 5 to 60 seconds across the board.
* Extend timeouts for couch_bt_engine_upgrade_tests | Adam Kocoloski | 2019-08-16 | 1 | -8/+9
| | | | Jenkins flaked out on one of these today.
* Don't try to publish trusty packages | Adam Kocoloski | 2019-08-16 | 1 | -2/+0
| | | | We aren't building them anymore.
* Ensure EUnit inherits appropriate env vars | Adam Kocoloski | 2019-08-16 | 2 | -4/+4
| | | | | | | | | | | Omitting COUCHDB_VERSION caused the EUnit build of the replicator to have a corrupted User-Agent header. It tried to construct a version using git, but when building from a release tarball there is no git repo so the UA had a git error message in it. This error message contained a newline, which plausibly confused some part of the HTTP stack and caused replicator HTTP requests to hang. Related to #2098.
* Merge pull request #2122 from cloudant/cleanup-after-meck | iilyak | 2019-08-16 | 1 | -0/+4
|\ | | | | Call :meck.unload() automatically after every test
| * Call :meck.unload() automatically after every test | ILYA Khlopotov | 2019-08-16 | 1 | -0/+4
|/
* Merge pull request #2105 from cloudant/improve-admin-part-setup | iilyak | 2019-08-16 | 1 | -12/+29
|\ | | | | Do not fail 'dev/run' on connection close
| * Do not fail 'dev/run' on connection close | ILYA Khlopotov | 2019-08-15 | 1 | -12/+29
|/
|   Sometimes admin party mode causes 'dev/run' to fail with
|
|   ```
|   http.client.RemoteDisconnected: Remote end closed connection without response
|   ```
|
|   This PR makes this use case more robust.
* Merge pull request #2101 from cloudant/refactor-clean_index_files | iilyak | 2019-08-15 | 3 | -10/+121
|\ | | | | Refactor fabric:cleanup_index_files
| * Refactor fabric:cleanup_index_files | ILYA Khlopotov | 2019-08-15 | 3 | -10/+121
|/ | | | | The previous implementation assembled a regexp by concatenating the active signatures. With a very large number of signatures, the regexp exceeded the system limit.
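A minimal sketch of one way to avoid the oversized regexp, written in Python rather than the project's Erlang. It is not fabric's actual code: the helper name `stale_index_files` and the `<signature>.view` file layout are assumptions made for illustration. Instead of building a single alternation pattern out of every active signature, it extracts each file's signature and tests membership in a set, so the cost no longer depends on a pattern-size limit.

```python
import os


def stale_index_files(file_names, active_signatures):
    """Return the index files whose signature is not in the active set.

    Assumption (hypothetical layout): a file is named "<signature>.view".
    """
    active = set(active_signatures)
    stale = []
    for name in file_names:
        # Split "dir/abc.view" into signature "abc" and extension ".view".
        signature, ext = os.path.splitext(os.path.basename(name))
        if ext == ".view" and signature not in active:
            stale.append(name)
    return stale
```

Files with other extensions are ignored in this sketch, so only recognised index files are ever reported as stale.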
* Merge pull request #2118 from apache/epi-support-map-childspec | Eric Avdey | 2019-08-15 | 2 | -12/+27
|\ | | | | Support map childspecs in couch_epi supervisor's children replacement
| * Support map childspecs in couch_epi supervisor's children replacement [epi-support-map-childspec] | Eric Avdey | 2019-08-15 | 2 | -12/+27
|/
* Fix replication rescheduling Running < MaxJobs corner case | Nick Vatamaniuc | 2019-08-14 | 1 | -33/+49
| | | | | | | | | | | | | | | Previously, when the total number of replication jobs exceeded `MaxJobs`, if some jobs crashed, additional jobs didn't start immediately to bring the running total up to the `MaxJobs` limit. Then, during rescheduling, the `Running == MaxJobs, Pending > 0` guard would fail and jobs would not rotate. In other words, if at least one job crashed, rotation didn't happen. The fix is to simplify the rotation logic to handle the `Running < MaxJobs` case. First, up to `Churn` number of jobs are stopped, then enough jobs are started to reach the `MaxJobs` limit. The rotation logic handles the `start_pending_jobs/3` case, so there is no need to call that separately before rotation happens.
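The rotation rule described above can be sketched as plain arithmetic. This is an illustration in Python, not the scheduler's Erlang code: the function name `rotate` and its count-based interface are hypothetical, and the choice to stop only as many jobs as pending work needs beyond the free slots is an assumption layered on top of the commit message.

```python
def rotate(running, pending, max_jobs, churn):
    """Return (to_stop, to_start) job counts for one rescheduling pass."""
    slots_available = max_jobs - running
    # Assumption: stop only the jobs needed to make room for pending work,
    # never more than `churn`, and never a negative number.
    to_stop = max(0, min(pending - slots_available, running, churn))
    running_after_stop = running - to_stop
    # Start enough pending jobs to bring the total back up to max_jobs.
    to_start = max(0, min(pending, max_jobs - running_after_stop))
    return to_stop, to_start
```

With `running=3, pending=5, max_jobs=5, churn=2`, which is the corner case where some jobs crashed, the sketch stops 2 jobs and starts 4, so rotation still happens and the running total returns to `max_jobs`.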
* Merge pull request #2109 from apache/fixup-cleanup-go | Peng Hui Jiang | 2019-08-14 | 2 | -5/+8
|\ | | | | fixup for dreyfus_fabric_cleanup:go/1
| * fixup for dreyfus_fabric_cleanup:go/1 [fixup-cleanup-go] | jiangph | 2019-08-13 | 2 | -5/+8
|/
* Extend timeout for mrview_purge_docs_fabric | Adam Kocoloski | 2019-08-09 | 1 | -5/+5
|
* Increase default HTTP timeouts | Adam Kocoloski | 2019-08-09 | 1 | -2/+10
| | | | | | | | | These are needed to avoid timeouts on ASF Jenkins build farm. The httpotion client uses ibrowse underneath, and ibrowse has three separate timeouts. We are configuring two of them here: the overall request timeout, and one that detects inactivity on the connection. We set them slightly differently just to be able to differentiate which one fired from the logs.
* Move couch startup to a fixture | Adam Kocoloski | 2019-08-09 | 1 | -39/+37
| | | | | This improves reliability because that time isn't charged to the test, and also speeds up the test.
* Add timeout for couch_db_split_tests | Adam Kocoloski | 2019-08-09 | 1 | -1/+2
| | | | | The "Should copy local docs after split in four" test was occasionally timing out in CI.
* Configure environment for Elixir on ARM | Adam Kocoloski | 2019-08-08 | 1 | -1/+3
| | | | | These settings are required to prevent Mix & Hex from trying to install packages into / on the ARM host.
* Avoid shebang length limits on jenkins | Adam Kocoloski | 2019-08-08 | 1 | -2/+2
| | | | | | | The `pip3` and `nosetest` executables are scripts, and on jenkins the specified interpreter can exceed the 128 character length limit because of the deeply-nested workspace. Invoking these as modules seems the preferred workaround per pypa/pip#1773
* Capture EUnit and ExUnit test results for Jenkins | Adam Kocoloski | 2019-08-08 | 3 | -1/+40
|
* Refactor using sequential stages, in workspace | Adam Kocoloski | 2019-08-08 | 1 | -72/+177
| | | | | | | | | | This work moves the builds back into the workspace, using a separate sub-directory per platform to avoid clashes between builds caused by JENKINS-57454. It also breaks out the steps into a pair of sequential stages within each parallel stage of the build, which gives us better visibility into the progress of the build, and also sets us up to capture test results and expose them directly via Jenkins UI for faster problem determination.
* Fix copy/paste errors in platform naming | Adam Kocoloski | 2019-08-08 | 1 | -3/+3
|
* Fix cpse_test_purge_replication eunit test | Nick Vatamaniuc | 2019-08-07 | 1 | -2/+9
| | | | | | | | | It doesn't work on Jenkins but worked locally. Noticed that we started chttpd even though the clustered port was never used. Add a wait function in `db_url/1` to make sure to wait until the db is available via the HTTP interface before continuing.
* Fix bash-ism in EUnit retry logic | Nick Vatamaniuc | 2019-08-06 | 1 | -1/+1
| | | | Bash has `let` but other shells might not have it.
* Switch to only using elixir replication integration test | Nick Vatamaniuc | 2019-08-01 | 2 | -1922/+0
| | | | | | | And remove the js version. Elixir test has been running decently on Travis from what I observed. However, it was disabled on jenkins runs. With a recent hardware upgrade, perhaps there is a chance this test will start passing there too.
* Remove local replication endpoints in CouchDB 3.x | Nick Vatamaniuc | 2019-07-31 | 22 | -456/+136
| | | | | | | | | | | | | | | `local` replication endpoints do something completely unexpected from a user's point of view -- they replicate to and from node local databases on a random node. The only way this worked correctly was if someone used the backend port (:5986) with a single node database. However, that port is getting closed for 3.x release as well, so it makes even less sense to keep this functionality around. For more discussion and voting results see the mailing list: https://lists.apache.org/thread.html/ddcd9db93cee363db7da571f5cbc7f2bd24b881a34e1ef734d6a0a1c@%3Cdev.couchdb.apache.org%3E The `_replicate` HTTP "hack" was left as is, since it does work more or less. However, it is inconsistent with what _replicator docs do, so we should probably deprecate it and remove it in 4.x.
* Fix mem3_sync_event_listener EUnit test | Nick Vatamaniuc | 2019-07-30 | 1 | -20/+8
| | | | | Fix a race condition in state matching, also parameterize the state field in wait_state.
* Retry EUnit tests on failure | Nick Vatamaniuc | 2019-07-29 | 1 | -2/+12
| | | | | | | Whole app is retried 2 extra times if it fails. Added to *nix Makefile only for now. May not be needed for Windows as this is for CI flakiness mostly.
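The retry policy above (one run plus up to 2 extra attempts for the whole app) can be sketched as follows. The real change lives in the *nix Makefile; this Python version is only an illustration, and `run_with_retries` and its `run_app_tests` callable are hypothetical names.

```python
def run_with_retries(run_app_tests, extra_attempts=2):
    """Run a test callable, retrying the whole run on failure.

    `run_app_tests` is a hypothetical callable returning True on success.
    Returns the attempt number that passed, or None if all attempts failed.
    """
    for attempt in range(1, extra_attempts + 2):
        if run_app_tests():
            return attempt
    return None
```

Returning the attempt number rather than a plain boolean makes it easy to log which suites only passed on a retry, which is useful when tracking CI flakiness.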
* Merge pull request #2039 from cloudant/exunit-simplified | iilyak | 2019-07-29 | 205 | -79/+887
|\ | | | | Exunit simplified
| * Update .travis.yml | ILYA Khlopotov | 2019-07-29 | 1 | -5/+0
| |
| * Unify runners for unit and integration tests | ILYA Khlopotov | 2019-07-29 | 12 | -93/+67
| |
| * Add chained setups | ILYA Khlopotov | 2019-07-29 | 10 | -0/+685
| |
| * Move eunit tests into test/eunit directory | ILYA Khlopotov | 2019-07-29 | 175 | -3/+3
| |
| * Minimal ExUnit setup | ILYA Khlopotov | 2019-07-29 | 12 | -3/+157
| |
| * Fix credo complains for dreyfus | ILYA Khlopotov | 2019-07-29 | 1 | -2/+2
|/
* Fix EUnit timeouts (#2087) | Adam Kocoloski | 2019-07-28 | 6 | -104/+130
|
|   * Proactively increase timeout for PBKDF2 test
|
|     This test was taking 134s in a recent run, which is uncomfortably close to the threshold.
|
|   * Extend timeouts for all reshard API tests
|
|     We're observing timeouts on various tests in this suite so let's keep it consistent and increase timeouts across the board.
|
|   * Bump default timeout for all mem3_reshard tests
|
|     A couple of these tests were exceeding the default timeout under normal circumstances, but many of them do a significant amount of work, so for simplicity we set a module-wide timeout and apply it consistently throughout.
|
|   * Modernize the sync_security test setup/teardown
|
|     This test actually doesn't do much real work, but I think what was happening is that the setup and teardown time was being charged to the test itself. I've refactored it to use a more modern scaffolding following some of our more recent additions to the test suite, but have left the timeout at the default to test this hypothesis.
|
|   * Increase timeouts on more heavyweight mem3 tests
|
|   * Extend timeouts for replication tests
* Fix flaky mem3_sync_event_listener EUnit test | Nick Vatamaniuc | 2019-07-28 | 1 | -8/+31
| | | | | | | | Config setting was asynchronous, and the waiting function did not wait for the actual state value to change, only for the state function to return. The fix is to wait for the config value to propagate to the state.
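The difference between waiting for a call to return and waiting for a value to change can be shown with a small polling helper. This is a generic Python sketch, not the Erlang test code: `wait_until`, its parameters, and the `get_state` accessor mentioned below are all hypothetical.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Returns True if the predicate became true in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # One last check so a change right at the deadline is not missed.
    return bool(predicate())
```

A test would then pass a predicate that compares the observed state with the expected value, for example `wait_until(lambda: get_state()["delay"] == 7)`, instead of only checking that the state call returns.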
* Increase timeouts on two slow btree tests | Adam Kocoloski | 2019-07-28 | 1 | -2/+7
| | | | | | These two tests are reliably timing out on ARM hardware in Jenkins. They do a lot of individual btree operations so this is not entirely surprising. Appropriate course of action here is to raise the timeout.
* Make sure that fsync errors are raised | Paul J. Davis | 2019-07-22 | 2 | -3/+59
| | | | | | | | | This changes `couch_file` to ensure that errors are raised when a call to `fsync` fails. It will also stop the couch_file process to ensure that anything handling a failed `fsync` won't attempt to retry the operation and experience issues discovered by Postgres [1]. [1] http://danluu.com/fsyncgate/
* Add missing purge settings to default.ini | Nick Vatamaniuc | 2019-07-11 | 1 | -0/+11
|
* Fix max_document_id_length value in default.ini | Nick Vatamaniuc | 2019-07-11 | 1 | -1/+1
| | | | | | The code has "infinity" as the default value and not 0. See src/couch_replicator/src/couch_replicator_changes_reader.erl
* Add erlang 22 support | Nick Vatamaniuc | 2019-07-10 | 2 | -3/+6
| | | | | | | | | | | Bumped elixir version to 1.7.4 as 1.6.6 wasn't built with Erlang 22 support. Also moving straight to 22.0.5 since 22.0 in travis crashed with a segmentation fault. Some of the release comments in the point release mention VM crashes, so it seems to check out. Fixes https://github.com/apache/couchdb/issues/2069
* Merge pull request #2062 from cloudant/update-ioq-2.1.2 | iilyak | 2019-07-03 | 1 | -1/+1
|\ | | | | Update ioq to 2.1.2