Commit message  Author  Age  Files  Lines
* 2.7.1 changelog updates [mitaka-eol] [2.7.1] [stable/mitaka]  John Dickinson  2016-12-12  1  -0/+27
| Change-Id: I540eb2fff8a9f67815fda26263350ecaa217f217
* Merge "Avoid infinite loop while placing parts" into stable/mitaka  Jenkins  2016-12-07  2  -9/+31
|\
| * Avoid infinite loop while placing parts  Tim Burke  2016-11-24  2  -9/+31
| | Previously, we could over-assign how many parts should be in a tier.
| | This would cause the local `parts` variable to go negative, which
| | meant that our `while parts` loop would never terminate.
| |
| | Change-Id: Id7e7889742ca37cf1a9c0d55fba78d967e90e8d0
| | Closes-Bug: 1642538
| | (cherry picked from commit 2e7a7347fc58676fbaabce3d87a15866796d32e4)
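The failure mode described in this commit can be illustrated with a small sketch. It is not the ring builder code: `assign_parts` and `wanted_by_tier` are hypothetical names. The point is only that the amount taken is capped by what remains, and that the loop tests `parts > 0` rather than truthiness, since a negative integer is truthy in Python.

```python
def assign_parts(parts, wanted_by_tier):
    """Assign `parts` across tiers without ever over-assigning.

    `wanted_by_tier` (hypothetical) maps a tier name to how many parts
    that tier would like to hold.
    """
    assigned = {tier: 0 for tier in wanted_by_tier}
    while parts > 0:  # `while parts:` would never end once parts < 0
        progressed = False
        for tier, wanted in wanted_by_tier.items():
            if parts <= 0:
                break
            # Cap the amount taken by what is still unassigned.
            take = min(parts, wanted - assigned[tier])
            if take > 0:
                assigned[tier] += take
                parts -= take
                progressed = True
        if not progressed:
            break  # every tier is full; stop instead of spinning
    return assigned, parts
```

The second return value is the number of parts that could not be placed, which stays at zero or above by construction.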
* | Merge "Fixed regression in consolidate_hashes" into stable/mitaka  Jenkins  2016-12-07  2  -7/+154
|\ \
| * | Fixed regression in consolidate_hashes  Pavel Kvasnička  2016-11-25  2  -7/+154
| |/
| | The regression occurs when a new file is stored in a new suffix of a
| | non-empty partition. The suffix is then added to the invalidations
| | file but not to the hashes pickle file. When this partition is
| | replicated, replication of the suffix completes on the first and every
| | 10th run of the replicator. Rsync runs on each new suffix because the
| | destination does not return a hash for the new suffix, even though the
| | suffix content is in the same state.
| |
| | This bug was introduced in 2.7.0.
| |
| | Co-Authored-By: Alistair Coles <alistair.coles@hpe.com>
| | Change-Id: Ie2700f6e6171f2ecfa7d07b0f18b79e90cbf1c8a
| | Closes-Bug: #1634967
| | (cherry picked from commit 8ac432fff3e01a07f4bff918bb9cc38d93532b43)
* | Fix non-deterministic suffix updates in hashes.pkl  Pavel Kvasnička  2016-12-06  3  -6/+64
|/
| Previously, the do_listdir option was set on every 10th replication
| run. Because the job listing is randomized, a given partition might be
| updated much less often than expected; for example, with 1000
| partitions per replicator, only on about every 70th run.
|
| Co-Authored-By: Alistair Coles <alistair.coles@hpe.com>
| Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
| Co-Authored-By: Christian Schwede <cschwede@redhat.com>
| Related-Bug: #1634967
| Closes-Bug: 1644807
| Change-Id: Ib5c9dd17e40150450ec57a728ae8652fbc730af6
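One way to make such a schedule deterministic is to derive the decision from the partition number and a cycle counter instead of a global run counter. The sketch below assumes integer partition numbers; the function and parameter names are illustrative and are not claimed to match the patch.

```python
def do_listdir(partition, replication_cycle, interval=10):
    """Return True when this partition is due for a directory listing.

    Offsetting the cycle counter by the partition number spreads the
    listings across cycles, and every partition is hit exactly once in
    any `interval` consecutive cycles, regardless of job ordering.
    """
    return (partition + replication_cycle) % interval == 0
```

With this scheme the order in which jobs are processed no longer affects how often a given partition is listed.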
* Removed "in-process-" from func env tox name and added bindep.txt  Ondřej Nový  2016-11-20  2  -1/+18
| This shortens the shebang in infra, because we are hitting the
| 128-byte limit.
|
| Also adds bindep.txt, which is needed for infra.
|
| Change-Id: I02477d81b836df71780942189d37d616944c4dce
| (cherry picked from commit 5d7a3a4 and aab2cee)
* Make ECDiskFileReader check fragment metadata  Alistair Coles  2016-11-03  8  -176/+659
| This patch makes the ECDiskFileReader check the validity of EC fragment
| metadata as it reads chunks from disk and quarantine a diskfile with
| bad metadata. This in turn means that both the object auditor and a
| proxy GET request will cause bad EC fragments to be quarantined.
|
| This change is motivated by bug 1631144 which may result in corrupt EC
| fragments being written to disk but appear valid to the object auditor
| md5 hash and content-length checks.
|
| NotImplemented:
|
|  * perform metadata check when a read starts on any frag_size
|    boundary, not just at zero
|
| Related-Bug: #1631144
| Closes-Bug: #1633647
|
| This is a backport of commit 2a75091c58948fb664016c0e91e72acd313e4610
|
| Change-Id: Ifa6a7f8aaca94c7d39f4aeb9d4fa3f59c4f6ee13
| Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
| Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
* Prevent ssync writing bad fragment data to diskfile  Alistair Coles  2016-10-17  7  -36/+395
| Previously, if a reconstructor sync type job failed to provide
| sufficient bytes from a reconstructed fragment body iterator to match
| the content-length that the ssync sender had already sent to the ssync
| receiver, the sender would still proceed to send the next subrequest.
| The ssync receiver might then write the start of the next subrequest to
| the partially complete diskfile for the previous subrequest (including
| writing subrequest headers to that diskfile) until it has received
| content-length bytes.
|
| Since a reconstructor ssync job does not send an ETag header (it cannot
| because it does not know the ETag of a reconstructed fragment until it
| has been sent), the receiving object server does not detect the "bad"
| data written to the fragment diskfile, and worse, will label it with an
| ETag that matches the md5 sum of the bad data. The bad fragment file
| will therefore appear good to the auditor.
|
| There is no easy way for the ssync sender to communicate a lack of
| source data to the receiver other than by disconnecting the session.
| So this patch adds a check in the ssync sender that the sent byte count
| is equal to the sent Content-Length header value for each subrequest,
| and disconnects if a mismatch is detected.
|
| The disconnect prevents the receiver finalizing the bad diskfile, but
| also prevents subsequent fragments in the ssync job being sync'd until
| the next cycle.
|
| N.B. Though this is a backport patch to the 2.7.0 release, it differs
| from the original commit 3218f8b064e462d901466b04a4813e15ec96da85 on
| the master branch in the number of eventlet trampoline sleeps to make
| before asserting the written log in test/unit/obj/test_ssync.py. That
| is because in 2.7.0 there is an extra eventlet coro for writing chunks
| into the real diskfile, which was subsequently removed with commit
| 4c11833a9cbff499725365e535e217f3eae3c442 during the 2.7.0-2.8.0
| development cycle.
|
| Closes-Bug: #1631144
| Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
| Change-Id: I54068906efdb9cd58fcdc6eae7c2163ea92afb9d
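The sender-side check described above amounts to counting bytes while streaming and refusing to continue on a mismatch. A minimal sketch follows, with hypothetical names (`send_subrequest_body`, `BodyLengthMismatch`) and a plain callable standing in for the connection; it does not reproduce the ssync sender.

```python
class BodyLengthMismatch(Exception):
    """The body did not match the Content-Length already announced."""


def send_subrequest_body(chunks, content_length, send):
    """Send every chunk, then verify the total byte count.

    `send` is any callable accepting bytes, for example a socket's
    sendall method.
    """
    bytes_sent = 0
    for chunk in chunks:
        send(chunk)
        bytes_sent += len(chunk)
    if bytes_sent != content_length:
        # The caller is expected to drop the connection on this error,
        # so the receiver never finalizes the partially written file.
        raise BodyLengthMismatch(
            'sent %d bytes, expected %d' % (bytes_sent, content_length))
    return bytes_sent
```

Raising instead of returning a flag makes it hard for a caller to carry on to the next subrequest by accident.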
* Merge "Stop complaining about auditor_status files" into stable/mitaka  Jenkins  2016-10-06  2  -2/+40
|\
| * Stop complaining about auditor_status files  Tim Burke  2016-08-31  2  -2/+40
| | Following fd86d5a, the object-auditor would leave status files so it
| | could resume where it left off if restarted. However, this would also
| | cause the object-reconstructor to print warnings like:
| |
| |   Unexpected entity in data dir:
| |   u'/srv/node4/sdb8/objects/auditor_status_ZBF.json'
| |
| | ...which isn't actually terribly useful or actionable. The auditor
| | will clean it up (eventually); the operator doesn't have to do
| | anything.
| |
| | Now, the reconstructor will specifically ignore those status files.
| |
| | Partial-Bug: 1583305
| | Change-Id: I2f3d0bd2f1e242db6eb263c7755f1363d1430048
| | (cherry picked from commit ad16e2c77bb61bdf51a7d3b2c258daf69bfc74da)
* | Ignore auditor status files to prevent replicator reports errors  Charles Hsu  2016-08-31  2  -4/+42
|/
| Ignore `auditor_status_*.json` files while collecting jobs, so that the
| replicator does not use these wrong paths to find objects, which causes
| an exception and increases the failure count in the replicator report.
|
| Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
| Co-Authored-By: Mark Kirkwood <mark.kirkwood@catalyst.net.nz>
| Change-Id: Ib15a0987288d9ee32432c1998aefe638ca3b223b
| Closes-Bug: #1583305
| (cherry picked from commit 65b1820407ea40bd7d65a5356a58a689befe3cb5)
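Filtering those status files out of a directory listing can be sketched with the standard `fnmatch` module. The helper name `partition_entries` is hypothetical; only the `auditor_status_*.json` pattern comes from the commit message.

```python
import fnmatch

AUDITOR_STATUS_PATTERN = 'auditor_status_*.json'


def partition_entries(entries):
    """Drop auditor status files from a listing of an objects directory."""
    return [name for name in entries
            if not fnmatch.fnmatch(name, AUDITOR_STATUS_PATTERN)]
```

For example, a listing of `['1023', 'auditor_status_ZBF.json', '17']` would be reduced to the two partition names.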
* Imported Translations from Zanata  OpenStack Proposal Bot  2016-05-07  12  -1655/+289
| For more information about this automatic import see:
| https://wiki.openstack.org/wiki/Translations/Infrastructure
|
| Change-Id: I6dc5872fe3005df1d98a7d914f4488a9d3b2f39f
* Imported Translations from Zanata  OpenStack Proposal Bot  2016-04-27  3  -174/+168
| For more information about this automatic import see:
| https://wiki.openstack.org/wiki/Translations/Infrastructure
|
| Change-Id: Idffa1f644ef1363a043a0f7dd9f10e801b0dc374
* Imported Translations from Zanata  OpenStack Proposal Bot  2016-04-18  2  -146/+147
| For more information about this automatic import see:
| https://wiki.openstack.org/wiki/Translations/Infrastructure
|
| Change-Id: I995ccd6d0494dd3535c6b93565c5ad9860d6f79e
* Fix upgrade bug in versioned_writes  Tim Burke  2016-04-06  2  -15/+439
| Previously, versioned_writes assumed that all container servers would
| always have the latest Swift code, allowing them to return reversed
| listings. This could cause the wrong version of a file to be restored
| during rolling upgrades.
|
| Now, versioned_writes will check that the listing returned is actually
| reversed. If it isn't, we will revert to getting the full (in-order)
| listing of versions and reversing it on the proxy.
|
| Change-Id: Ib53574ff71961592426cb386ef00a75eb5824def
| Closes-Bug: 1562083
| (cherry picked from commit ebf0b220127b14bec7c05f1bc0286728f27f39d1)
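The fallback described here can be sketched as "trust the listing only if it is visibly in descending order". The names below (`versions_newest_first`, `get_full_listing`) are hypothetical, and listing entries are assumed to be dicts with a `name` key; the middleware itself may perform the check differently.

```python
def versions_newest_first(listing, get_full_listing):
    """Return version entries newest first.

    `listing` is what the container server returned for a reversed
    request. If it is not in descending name order, the server ignored
    the request, so fetch the full in-order listing and reverse it here.
    """
    names = [item['name'] for item in listing]
    # A page with zero or one entries cannot be told apart and is
    # accepted as-is in this sketch.
    if all(a >= b for a, b in zip(names, names[1:])):
        return list(listing)
    return list(reversed(get_full_listing()))
```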
* Imported Translations from Zanata  OpenStack Proposal Bot  2016-03-30  1  -4/+174
| For more information about this automatic import see:
| https://wiki.openstack.org/wiki/Translations/Infrastructure
|
| Change-Id: Ib7e30b55fd1795bf63cb0c22a97580be5e6f9f23
* Imported Translations from Zanata  OpenStack Proposal Bot  2016-03-29  2  -13/+179
| For more information about this automatic import see:
| https://wiki.openstack.org/wiki/Translations/Infrastructure
|
| Change-Id: Ia78689c73464b565fc6a09a633a1e9a84d92a97d
* Imported Translations from Zanata  OpenStack Proposal Bot  2016-03-28  5  -17/+234
| For more information about this automatic import see:
| https://wiki.openstack.org/wiki/Translations/Infrastructure
|
| Change-Id: I68439736f0d0cfc981d4eacbdd1908f57e82b4c9
* Update .gitreview for stable/mitaka  Thierry Carrez  2016-03-25  1  -0/+1
| Change-Id: If8945bca78f90e95f6d05baefc078b8905f6bdb2
* Merge "Check marker params in SimpleClient full listing requests" [2.7.0]  Jenkins  2016-03-24  2  -36/+32
|\
| * Check marker params in SimpleClient full listing requests  Alistair Coles  2016-03-23  2  -36/+32
| | Follow up for change [1] to add some assertions to check that the
| | marker param is included in sequential GET requests sent during a
| | full listing.
| |
| | Extract multiple FakeConn class definitions to a single class at
| | module level and share it between all classes.
| |
| | Also, explicitly unpack the return values from base request calls
| | made in the full listing section of base_request, and explicitly
| | return a list to make it more consistent with the rest of the method.
| |
| | [1] Change-Id: I6892390d72f70f1bc519b482d4f72603e1570163
| |
| | Change-Id: Iad038709f46364b8324d25ac79be4317add79df5
* | Merge "Fix full_listing in internal_client"  Jenkins  2016-03-24  2  -1/+41
|\ \
| |/
| * Fix full_listing in internal_client  Christian Schwede  2016-03-23  2  -1/+41
| | The internal_client is used in swift-dispersion-report, and if there
| | are more than 10000 containers or objects, the ones beyond that limit
| | are not queried.
| |
| | This patch adds support to the internal_client to iterate over all
| | containers/objects if the listing exceeds the default of 10000 entries
| | and the argument full_listing=True is used.
| |
| | Closes-Bug: 1314817
| | Closes-Bug: 1525995
| | Change-Id: I6892390d72f70f1bc519b482d4f72603e1570163
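Iterating past the per-request limit is usually done with marker-based paging: request a page, then ask again starting after the last name seen. The sketch below assumes a hypothetical `get_page(marker, limit)` callable returning a name-ordered list of dicts; it is not the internal_client API.

```python
def full_listing(get_page, limit=10000):
    """Collect every entry by following markers until a page is empty."""
    items = []
    marker = ''
    while True:
        page = get_page(marker=marker, limit=limit)
        if not page:
            return items
        items.extend(page)
        # The next request starts after the last name already seen.
        marker = page[-1]['name']
```

The loop always issues one final request that returns an empty page, which is the simplest reliable end condition when a server may return fewer entries than `limit`.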
* | 2.7.0 authors and changelog updates  John Dickinson  2016-03-23  3  -2/+174
|/
| Change-Id: I16ad0c61b048921ca01fa96862ae7eea0eec6017
* Merge "Auditor will clean up stale rsync tempfiles"  Jenkins  2016-03-23  9  -37/+344
|\
| * Auditor will clean up stale rsync tempfiles  Clay Gerrard  2016-03-23  9  -37/+344
| | DiskFile already fills in the _ondisk_info attribute when it tries to
| | open a diskfile, even if the DiskFile's fileset is not valid or
| | deleted. During this process the rsync tempfiles would be discovered
| | and logged, but no one would attempt to clean them up, even if they
| | were really old.
| |
| | Instead of logging and ignoring unexpected files when validating a
| | DiskFile fileset, we'll add unexpected files to the unexpected key in
| | the _ondisk_info attribute. With a little bit of re-organization in
| | the auditor's object_audit method to get things into a single return
| | path, we can add an unconditional check for unexpected files and
| | remove those that are "old enough".
| |
| | Since the replicator will kill any rsync processes that are running
| | longer than the configured rsync_timeout, we know that any rsync
| | tempfiles older than this can be deleted.
| |
| | Split unlink_older_than in common.utils into two functions to allow an
| | explicit list of previously discovered paths to be passed in to avoid
| | an extra listdir. Since the getmtime handling already ignores OSError,
| | there's less concern about a race condition where a previously
| | discovered unexpected file is reaped by rsync while we're attempting
| | to clean it up.
| |
| | Update some doc on the new config option.
| |
| | Closes-Bug: #1554005
| | Change-Id: Id67681cb77f605e3491b8afcb9c69d769e154283
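The "old enough" cleanup over an explicit list of paths can be sketched as follows. `remove_stale_files` and its parameters are hypothetical names chosen for illustration, not the functions in common.utils; the sketch only shows the mtime comparison and why ignoring OSError makes the race with rsync harmless.

```python
import os
import time


def remove_stale_files(paths, max_age, now=None):
    """Unlink each path whose mtime is more than `max_age` seconds old.

    Returns the list of paths that were actually removed.
    """
    now = time.time() if now is None else now
    removed = []
    for path in paths:
        try:
            if os.path.getmtime(path) < now - max_age:
                os.unlink(path)
                removed.append(path)
        except OSError:
            # The file vanished in the meantime (for example, rsync
            # finished with it); nothing is left to clean up.
            continue
    return removed
```

Passing the paths in, rather than a directory, avoids a second listdir when the caller has already discovered the unexpected files.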
* | Merge "Container-Sync to perform HEAD before PUT object on remote"  Jenkins  2016-03-23  5  -91/+327
|\ \
| * | Container-Sync to perform HEAD before PUT object on remote  OSHRITF  2016-03-23  5  -91/+327
| |/
| | This change adds a remote HEAD object request before each call to
| | sync_row. Currently, container-sync-row attempts to replicate the
| | object (using PUT) regardless of the existence of the object on the
| | remote side, thus causing each object to be transferred on the wire
| | several times (depending on the replication factor).
| |
| | An alternative to HEAD is to do a conditional PUT (using
| | 100-continue). However, this change is more involved and requires
| | upgrade of both the client and server side clusters to work.
| |
| | At the Tokyo design summit it was decided to start with the HEAD
| | approach.
| |
| | Change-Id: I60d982dd2cc79a0f13b0924507cd03d7f9c9d70b
| | Closes-Bug: #1277223
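The HEAD-before-PUT idea can be sketched with injected callables so that no client library is assumed. All names here are hypothetical: `head_remote(name)` is taken to return the remote object's timestamp or None when it is absent, and `put_remote(name)` to upload the object; the comparison rule is an assumption for illustration.

```python
def sync_row(row, head_remote, put_remote):
    """Upload the object for `row` only if the remote copy is not current.

    Returns True if a PUT was issued, False if it was skipped.
    """
    remote_timestamp = head_remote(row['name'])
    if remote_timestamp is not None and remote_timestamp >= row['created_at']:
        return False  # remote already has this version; skip the transfer
    put_remote(row['name'])
    return True
```

The HEAD costs one small round trip per row, but saves a full object transfer whenever another replica's sync has already delivered the object.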
* | Merge "Docs: Container sync does not require POST-as-COPY"  Jenkins  2016-03-23  5  -30/+3
|\ \
| * | Docs: Container sync does not require POST-as-COPY  Alistair Coles  2016-03-22  5  -30/+3
| | | Updates docs to remove warnings that container sync only works with
| | | object_post_as_copy=True. Since commit e91de49 container sync will
| | | also sync POST updates when using object_post_as_copy=False.
| | |
| | | Change-Id: I5cc3cc6e8f9ba2fef6f896f2b11d2a4e06825f7f
* | | Imported Translations from Zanata  OpenStack Proposal Bot  2016-03-23  4  -20/+210
| |/
|/|
| | For more information about this automatic import see:
| | https://wiki.openstack.org/wiki/Translations/Infrastructure
| |
| | Change-Id: Ia2c2819db372da46538d71a80888a4e27538bdcd
* | Merge "Make the object auditor's run-once mode run once."  Jenkins  2016-03-22  2  -6/+128
|\ \
| * | Make the object auditor's run-once mode run once.  Samuel Merritt  2015-08-25  2  -6/+128
| | | If you invoked the object auditor with --once, it would run the
| | | full-audit checker(s) once, but it would run the ZBF checker over
| | | and over until the full-audit checkers were done.
| | |
| | | Now it runs the ZBF and full-audit checkers once each.
| | |
| | | Change-Id: Ieeaa6fba4184a069756ee150727f24df7833697a
* | | Merge "Add .eggs/* to .gitignore"  Jenkins  2016-03-22  1  -0/+1
|\ \ \
| * | | Add .eggs/* to .gitignore  Alistair Coles  2016-03-22  1  -0/+1
| | |/
| |/|
| | | After running:
| | |
| | |   python setup.py build_sphinx
| | |
| | | there is a .eggs directory left in the repo root directory which is
| | | not currently ignored by git.
| | |
| | | Change-Id: Id15811f94046fd8bb22153425bf5cafe6c045453
* | | Merge "Remove unused code from container sync"  Jenkins  2016-03-22  1  -12/+0
|\ \ \
| * | | Remove unused code from container sync  Alistair Coles  2016-03-22  1  -12/+0
| | | | Change-Id: Ia44138aadcd30c474f744a9c552220e18302ecc6
* | | | Merge "Container sync nodes shuffle cleanup"  Jenkins  2016-03-22  2  -9/+1
|\ \ \ \
| |/ / /
| * | | Container sync nodes shuffle cleanup  oshritf  2016-03-20  2  -9/+1
| | | | Since commit "Update container sync to use internal client",
| | | | get_object is done using internal_client and not directly on
| | | | nodes, which makes the block of code that shuffles the nodes
| | | | redundant.
| | | |
| | | | Change-Id: I45a6dab05f6f87510cf73102b1ed191238209efe
* | | | Merge "Set backend content length for fallocate - EC Policy"  Jenkins  2016-03-22  2  -6/+44
|\ \ \ \
| * | | | Set backend content length for fallocate - EC Policy  Janie Richling  2016-02-22  2  -6/+44
| | | | | Currently, the ECObjectController removes the 'content-length'
| | | | | header. This part is ok, except that the value is being used to
| | | | | set 'X-Backend-Obj-Content-Length', so it is always 0. This leads
| | | | | to not calling fallocate (details on bug) on a PUT since the size
| | | | | is 0.
| | | | |
| | | | | This change makes use of some numbers returned from the EC Driver
| | | | | get_segment_info method in order to calculate the expected
| | | | | on-disk size that should be allocated.
| | | | |
| | | | | The EC controller will now set the 'X-Backend-Obj-Content-Length'
| | | | | value appropriately.
| | | | |
| | | | | Co-Authored-By: Kota Tsuyuzaki
| | | | | Co-Authored-By: John Dickinson
| | | | | Co-Authored-By: Tim Burke
| | | | |
| | | | | Change-Id: Ifd16c1438539e6fd9bb2dbcd053d11bea2e09fee
| | | | | Fixes: bug 1532008
* | | | | Imported Translations from Zanata  OpenStack Proposal Bot  2016-03-22  3  -31/+51
| | | | | For more information about this automatic import see:
| | | | | https://wiki.openstack.org/wiki/Translations/Infrastructure
| | | | |
| | | | | Change-Id: I70db7d29a9859cb47144ac49df8c289d1c2ec3e6
* | | | | Merge "Skip already checked partitions when auditing objects after a restart"  Jenkins  2016-03-22  4  -9/+117
|\ \ \ \ \
| * | | | | Skip already checked partitions when auditing objects after a restart  Christian Schwede  2016-03-21  4  -9/+117
| | | | | | The object auditor will save a short status file on each
| | | | | | device, containing a list of remaining partitions for
| | | | | | auditing. If the auditor is restarted, it will only audit
| | | | | | partitions not yet checked. If all partitions on the current
| | | | | | device have been checked, it will simply skip this device.
| | | | | | Once all partitions on all disks are successfully audited, all
| | | | | | status files are removed.
| | | | | |
| | | | | | Closes-Bug: #1183656
| | | | | | Change-Id: Icf1d920d0942ce48f1d3d374ea4d63dbc29ea464
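A resumable audit of this kind needs three small operations on the status file: load what is left, save progress, and clear the file when done. The sketch below assumes a JSON layout of `{"partitions": [...]}` and hypothetical function names; the on-disk format the auditor actually uses is not stated in the commit message.

```python
import json
import os


def load_remaining_partitions(status_path, all_partitions):
    """Return the partitions still to audit, or all of them on a fresh run."""
    try:
        with open(status_path) as f:
            return json.load(f)['partitions']
    except (OSError, ValueError, KeyError, TypeError):
        # Missing or unreadable status file: start from the beginning.
        return list(all_partitions)


def save_remaining_partitions(status_path, remaining):
    """Record the partitions that have not been audited yet."""
    with open(status_path, 'w') as f:
        json.dump({'partitions': list(remaining)}, f)


def clear_status(status_path):
    """Remove the status file once the audit pass is complete."""
    try:
        os.unlink(status_path)
    except OSError:
        pass
```

Treating a corrupt status file the same as a missing one means the worst case after a crash is a repeated audit, never a skipped one.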
* | | | | | Merge "Don't ssync data when only a durable is missing"  Jenkins  2016-03-21  4  -26/+277
|\ \ \ \ \ \
| * | | | | | Don't ssync data when only a durable is missing  Alistair Coles  2016-03-04  4  -26/+277
| | | | | | | If an EC diskfile is missing its .durable file (for example
| | | | | | | due to a partial PUT failure) then the ssync missing check
| | | | | | | will fail to open the file and will consider it missing. This
| | | | | | | can result in possible reconstruction of the fragment archive
| | | | | | | (for a sync job) and definite transmission of the fragment
| | | | | | | archive (for sync and revert jobs), which is wasteful.
| | | | | | |
| | | | | | | This patch makes the ssync receiver inspect the diskfile
| | | | | | | state after attempting to open it, and if fragments exist at
| | | | | | | the timestamp of the sender's diskfile, but a .durable file
| | | | | | | is missing, then the receiver will commit the diskfile at the
| | | | | | | sender's timestamp. As a result, there is no longer any need
| | | | | | | to send a fragment archive.
| | | | | | |
| | | | | | | Change-Id: I4766864fcc0a3553976e8fd85bbb2fc782f04abd
* | | | | | | Merge "Don't report recon mount/usage status on files"  Jenkins  2016-03-21  2  -8/+66
|\ \ \ \ \ \ \
| * | | | | | | Don't report recon mount/usage status on files  Brian Cline  2016-03-14  2  -8/+66
| | | | | | | | Today recon will include normal files in the payload it
| | | | | | | | returns for /recon/unmounted and /recon/diskusage. As a
| | | | | | | | result it can trigger bogus alarms on any operations-side
| | | | | | | | monitoring checking for unmounted disks or disks that show
| | | | | | | | up in diskusage with weird looking stats.
| | | | | | | |
| | | | | | | | This change adds an isdir check for the entries it finds in
| | | | | | | | /srv/node.
| | | | | | | |
| | | | | | | | Change-Id: Iad72e03fdda11ff600b81b4c5d58020cc4b9048e
| | | | | | | | Closes-bug: #1556747
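The isdir check can be sketched with the standard library alone. `device_dirs` is a hypothetical helper; it only shows how plain files are excluded before any mount or usage check is attempted.

```python
import os


def device_dirs(devices_root):
    """Return the names under `devices_root` that are directories.

    Regular files are skipped, so they can no longer be reported as
    unmounted devices or appear with meaningless usage figures.
    """
    return sorted(
        name for name in os.listdir(devices_root)
        if os.path.isdir(os.path.join(devices_root, name)))
```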
* | | | | | | | Merge "Fix ringbuilder tests"  Jenkins  2016-03-21  2  -331/+191
|\ \ \ \ \ \ \ \