path: root/swift/proxy/controllers/container.py
Commit message | Author | Age | Files | Lines
* Proxy: restructure cached listing shard ranges | Jianjian Huo | 2023-04-17 | 1 | -41/+99

Updating of the shard range cache has already been restructured and
upgraded to v2, which persists only the essential attributes in
memcache (see Related-Change). This follow-up patch restructures the
listing shard range cache used for object listings in the same way.

UpgradeImpact
=============
The cache key for listing shard ranges in memcached is renamed from
'shard-listing/<account>/<container>' to
'shard-listing-v2/<account>/<container>', and the cached data becomes a
list of [lower bound, name] pairs. As a result, all existing listing
shard ranges stored in the memcache cluster will be invalidated.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Related-Change: If98af569f99aa1ac79b9485ce9028fdd8d22576b
Change-Id: I54a32fd16e3d02b00c18b769c6f675bae3ba8e01
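
A rough sketch of the v2 key and value shape described above, assuming
a memcache client with a set(key, value, time=...) method and shard
ranges represented as dicts; the helper names are illustrative:

    import json

    def listing_cache_key(account, container):
        # v2 key name, as described in the UpgradeImpact note above
        return 'shard-listing-v2/%s/%s' % (account, container)

    def cache_listing_shard_ranges(memcache, account, container,
                                   shard_ranges, time=600):
        # The cached value is a list of [lower bound, name] pairs: just
        # enough to route listing requests to the right shard containers.
        data = [[sr['lower'], sr['name']] for sr in shard_ranges]
        memcache.set(listing_cache_key(account, container),
                     json.dumps(data), time=time)
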
* Fix docstring regarding private method | Tim Burke | 2023-02-22 | 1 | -1/+1

X-Backend-Allow-Method was used in some iteration, but not the version
of the patch that finally landed.

Change-Id: Id637253bb68bc839f5444a74c91588d753ef4379
Related-Change: Ia13ee5da3d1b5c536eccaadc7a6fdcd997374443
* proxy-server exception logging shows replication_ip/port | indianwhocodes | 2023-02-10 | 1 | -1/+1

Add a "use_replication" field to the node dict, and a helper function
that sets the use_replication value on a copy of a node by looking up
the x-backend-use-replication-network header value.

Change-Id: Ie05af464765dc10cf585be851f462033fc6bdec7
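
A minimal sketch of the idea, with a hypothetical helper name rather
than the actual Swift code:

    def node_with_replication_flag(node, headers):
        # Return a copy of the ring node, annotated with whether the
        # replication network should be used for this request.
        val = str(headers.get('X-Backend-Use-Replication-Network', 'false'))
        use_repl = val.lower() in ('true', 't', '1', 'yes', 'on')
        return dict(node, use_replication=use_repl)
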
* swift_proxy: add memcache skip success/error stats for shard range. | Jianjian Huo | 2023-01-20 | 1 | -43/+70

This patch adds more granularity to the shard operation cache and
backend metrics, and removes some existing, duplicated metrics.

Before this patch, the related metrics are:
  1. shard_<op>.cache.[hit|miss|skip]
  2. shard_<op>.backend.<status_int>
where op is 'listing' or 'updating'.

With this patch, they become:
  1. shard_<op>.infocache.hit
     cache hits in the infocache.
  2. shard_<op>.cache.hit
     cache hits in memcache.
  3. shard_<op>.cache.[miss|bypass|skip|force_skip|disabled|error].<status_int>
     operations that went to the backend, for the following reasons:
       miss: cache miss.
       bypass: metadata didn't support a cache lookup.
       skip: selective skip per the skip percentage config.
       force_skip: the request had an 'x-newest' header.
       disabled: memcache is disabled.
       error: memcache connection error.

For each kind of operation metric, the <status_int> suffix counts
operations by response status; the sum of all status sub-metrics gives
the total for that operation.

UpgradeImpact
=============
Metrics dashboards will need updates to display the changed metrics
correctly; the infocache metrics are newly added. See the message above
for all changes needed.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Ib8be30d3969b4b4808664c43e94db53d10e6ef4c
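
A rough sketch of how metric names of the shape listed above might be
emitted, assuming a statsd-style client with an increment method
(illustrative, not the actual proxy code):

    def record_shard_cache_metric(statsd, op, cache_state, status=None):
        # op is 'listing' or 'updating'; cache_state is e.g.
        # 'infocache.hit', 'cache.hit', 'cache.miss', 'cache.skip', ...
        name = 'shard_%s.%s' % (op, cache_state)
        if status is not None:
            # backend requests also carry the response status as a suffix
            name += '.%s' % status
        statsd.increment(name)
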
* proxy: refactor ContainerController._GET_using_cache | Alistair Coles | 2022-11-25 | 1 | -87/+112

Refactor ContainerController._GET_using_cache. No behavioral changes.

  * Make the top-level method shorter.
  * Make the flow easier to follow.
  * Make various return points more obvious.
  * Change variable names to distinguish those that are lists of
    ShardRange objects from those that are lists of dicts representing
    shard ranges.

Change-Id: Ibb7cd761be4a5b1ec53dd16b7c5d256ed7666a88
* Merge "sharding: Skip shards that can't include any new subdir entries"Zuul2022-09-211-4/+17
* sharding: Skip shards that can't include any new subdir entries | Tim Burke | 2022-07-20 | 1 | -4/+17

Change-Id: I08cc2c0bfe803e3cec1e6ada10af4d725359e5e8
* | Merge "memcached: Give callers the option to accept errors"Zuul2022-05-131-3/+9
* memcached: Give callers the option to accept errors | Tim Burke | 2022-04-28 | 1 | -3/+9

Auth middlewares in particular may want to *know* when there's a
communication breakdown as opposed to a cache miss. Update our
shard-range cache stats to acknowledge the distinction.

Drive-by: Log an error if all memcached servers are error-limited.

Change-Id: Ic8d0915235d11124d06ec940c5be9a2edbe85c83
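
A minimal sketch of the caller-side distinction, using a hypothetical
cache client that raises a connection error instead of silently
returning None (not the actual MemcacheRing API):

    class CacheConnectionError(Exception):
        pass

    def lookup_shard_ranges(cache, key, statsd):
        try:
            value = cache.get(key)       # may raise CacheConnectionError
        except CacheConnectionError:
            statsd.increment('shard_listing.cache.error')
            return None                  # fall back to the backend
        if value is None:
            statsd.increment('shard_listing.cache.miss')
        else:
            statsd.increment('shard_listing.cache.hit')
        return value
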
* Add ceil method to utils.Timestamp | Alistair Coles | 2022-05-06 | 1 | -3/+2

There are a few places where a last-modified value is calculated by
rounding a timestamp *up* to the nearest second. This patch refactors
them to use a new Timestamp.ceil() method to do this rounding, along
with a clarifying docstring.

Change-Id: I9ef73e5183bdf21b22f5f19b8440ffef6988aec7
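
As a small illustration of the rounding being described (the real
method lives on swift.common.utils.Timestamp; this standalone function
is just a sketch):

    import math

    def ceil_timestamp(ts):
        # Round a floating-point timestamp up to the nearest whole
        # second, as when deriving a Last-Modified value.
        return math.ceil(float(ts))

    assert ceil_timestamp(1234567890.0) == 1234567890
    assert ceil_timestamp(1234567890.1) == 1234567891
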
* proxy-server: add stats for backend shard_listing requests | Alistair Coles | 2022-01-27 | 1 | -14/+42

Add statsd metrics 'container.shard_listing.backend.<status_int>'.

Change-Id: Ibd98ad3bdedc6c80d275a37697de0943e3e8fb4f
* Fix statsd prefix mutation in proxy controllers | Alistair Coles | 2022-01-27 | 1 | -15/+16

Swift loggers encapsulate a StatsdClient that is typically initialised
with a prefix, equal to the logger name (e.g. 'proxy_server'), that is
prepended to metric names. The proxy server would previously mutate its
logger's prefix, using its set_statsd_prefix method, each time a
controller was instantiated, extending it with the controller type
(e.g. changing the prefix to 'proxy_server.object'). As a result, when
an object request spawned container subrequests, for example, the
statsd client would be left with a 'proxy_server.container' prefix part
for subsequent object request related metrics.

The proxy server logger is now wrapped with a new
MetricsPrefixLoggerAdapter each time a controller is instantiated, and
the adapter applies the correct prefix for the controller type for the
lifetime of the controller.

Change-Id: I0522b1953722ca96021a0002cf93432b973ce626
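
A rough sketch of the adapter idea, assuming a logger that exposes an
increment(metric) method; this is not the actual
MetricsPrefixLoggerAdapter implementation:

    class MetricsPrefixAdapter(object):
        """Wrap a logger so metric names get a fixed extra prefix,
        without mutating the shared underlying statsd client."""

        def __init__(self, logger, metric_prefix):
            self.logger = logger
            self.metric_prefix = metric_prefix

        def increment(self, metric):
            self.logger.increment('%s.%s' % (self.metric_prefix, metric))

        def __getattr__(self, name):
            # delegate everything else (info, error, ...) to the logger
            return getattr(self.logger, name)

    # e.g. each ContainerController instance could wrap the proxy logger:
    # self.logger = MetricsPrefixAdapter(app.logger, 'container')
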
* proxy: Add a chance to skip memcache when looking for shard ranges | Tim Burke | 2022-01-26 | 1 | -1/+10

By having some small portion of calls skip cache and go straight to
disk, we can ensure the cache is always kept fresh and never expires
(at least, for active containers). Previously, when shard ranges fell
out of cache there would frequently be a thundering herd that could
overwhelm the container server, leading to 503s served to clients or an
increase in async pendings.

Include metrics for hit/miss/skip rates.

Change-Id: I6d74719fb41665f787375a08184c1969c86ce2cf
Related-Bug: #1883324
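
A minimal sketch of the selective skip, where skip_chance stands in
for whatever the configured skip fraction is (illustrative only):

    import random

    def should_skip_cache(skip_chance):
        # skip_chance is a small fraction in [0, 1]; e.g. 0.001 sends
        # roughly one in a thousand listings straight to the container
        # server, refreshing the cached shard ranges before they expire.
        return bool(skip_chance) and random.random() < skip_chance
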
* proxy: Remove a bunch of logging translations | Tim Burke | 2021-10-22 | 1 | -2/+1

Change-Id: I2ffacdbd70c72e091825164da24cc87ea67721d7
Partial-Bug: #1674543
* container GET: return 503 if policy index mismatches | Alistair Coles | 2021-08-16 | 1 | -1/+10

Return a 503 to the original container listing request if a GET from a
shard returns objects from a different policy than requested (e.g. due
to the shard container server not being upgraded).

Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Change-Id: Ieab6238030e8c264ee90186012be6e9da937b42e
* container-server: return objects of a given policy | Matthew Oliver | 2021-08-16 | 1 | -0/+11

There is a tight coupling between a root container and its shards: the
shards hold the object metadata for the root container, so are really
an extension of the root. When we PUT objects into a root container, it
redirects them, with the root's policy, to the shards. And the shards
are happy to take them, even if the shard's policy is different from
the root's. But when it comes to GETs, the root redirects the GET onto
its shards, which currently won't respond with objects (which they
probably took) because they are of a different policy.

Currently, when getting objects from the container server, the policy
used is always the broker's policy. This patch corrects this behaviour
by allowing the policy index to be overridden. If the request to the
container server contains an 'X-Backend-Storage-Policy-Index' header,
it will be used instead of the policy index stored in the broker. This
patch adds the root container's policy as this header in the proxy
container controller's `_get_from_shards` method, which is used by the
proxy to redirect a GET on a root to its shards.

Further, a new backend response header has been added. If the container
response contains an `X-Backend-Record-Type: object` header, then the
response contains objects. In this case this patch also adds an
`X-Backend-Record-Storage-Policy-Index` header so the policy index of
the returned objects is known, because X-Backend-Storage-Policy-Index
in the response _always_ represents the policy index of the container
itself.

On the plus side, this new container policy API gives us a way to check
containers for object listings in other policies, so it might come in
handy for ops/SREs.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I026b699fc5f0fba619cf524093632d67ca38d32f
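
A rough sketch of the proxy side of the headers described above, with
hypothetical function boundaries (not the actual _get_from_shards
code):

    def shard_listing_headers(root_policy_index):
        # Ask each shard's container server for objects of the root's
        # policy rather than the shard broker's own policy.
        return {'X-Backend-Storage-Policy-Index': str(root_policy_index),
                'X-Backend-Record-Type': 'object'}

    def listing_policy_ok(resp_headers, root_policy_index):
        # A shard on an un-upgraded container server may answer with
        # objects of its own policy; the proxy treats that as a 503.
        if resp_headers.get('X-Backend-Record-Type') != 'object':
            return True
        found = resp_headers.get('X-Backend-Record-Storage-Policy-Index')
        return found is None or int(found) == int(root_policy_index)
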
* Return 503 for container listings when shards are deleted | Clay Gerrard | 2021-06-18 | 1 | -5/+7

Previously, when building a listing from shard containers, the proxy
would return 200 to the client even if one or more of the component
shards failed to return a successful listing response. With this patch
the proxy will now return a 503 in those circumstances, since the
listing would otherwise be incomplete due to an internal error.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I983d665584471c9d689506592f48ddd00c0887ef
* Fix logging in proxy container GET path | Alistair Coles | 2021-01-22 | 1 | -8/+4

The proxy tried to map the x-backend-sharding-state header value to a
ShardRange state, but the value is the container *db* state. The
attempted mapping would commonly fail because db state names do not
always correspond to ShardRange state names, in which case the fallback
was to log the header value, i.e. the correct outcome.

Sometimes the attempted mapping would succeed because, for example,
'sharded' is both a db state name and a ShardRange state name. In that
case the log message would look something like:

    "Found 1024 objects in shard (state=(70, 'sharded')), total = 1024"

i.e. the tuple of ShardRange state name and number was logged, which
was inappropriate.

Change-Id: Ic08e6e7df7162a4c1283a3ef6e67c3b21a4ce494
* Use cached shard ranges for container GETs | Alistair Coles | 2021-01-06 | 1 | -34/+211

This patch makes four significant changes to the handling of GET
requests for sharding or sharded containers:

- container server GET requests may now result in the entire list of
  shard ranges being returned for the 'listing' state regardless of any
  request parameter constraints.
- the proxy server may cache that list of shard ranges in memcache and
  the request environ's infocache dict, and subsequently use the cached
  shard ranges when handling GET requests for the same container.
- the proxy now caches more container metadata so that it can
  synthesize a complete set of container GET response headers from
  cache.
- the proxy server now enforces more container GET request validity
  checks that were previously only enforced by the backend server,
  e.g. checks for valid request parameter values.

With this change, when the proxy learns from container metadata that
the container is sharded, it will cache shard ranges fetched from the
backend during a container GET in memcache. On subsequent container
GETs the proxy will use the cached shard ranges to gather object
listings from shard containers, avoiding further GET requests to the
root container until the cached shard ranges expire from cache.

Cached shard ranges are most useful if they cover the entire object
name space in the container. The proxy therefore uses a new
X-Backend-Override-Shard-Name-Filter header to instruct the container
server to ignore any request parameters that would constrain the
returned shard range listing, i.e. the 'marker', 'end_marker',
'includes' and 'reverse' parameters. Having obtained the entire shard
range listing (either from the server or from cache), the proxy now
applies those request parameter constraints itself when constructing
the client response.

When using cached shard ranges the proxy will synthesize response
headers from the container metadata that is also in cache. To enable
the full set of container GET response headers to be synthesized in
this way, the set of metadata that the proxy caches when handling a
backend container GET response is expanded to include various
timestamps.

The X-Newest header may be used to disable looking up shard ranges in
cache.

Change-Id: I5fc696625d69d1ee9218ee2a508a1b9be6cf9685
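
A rough sketch of the cache-then-filter flow described above, with
hypothetical helper names and an illustrative header value (the real
logic lives in ContainerController._GET_using_cache):

    def get_listing_shard_ranges(account, container, infocache, memcache,
                                 fetch_from_backend, marker='',
                                 end_marker=''):
        key = 'shard-listing/%s/%s' % (account, container)
        shard_ranges = infocache.get(key)
        if shard_ranges is None:
            shard_ranges = memcache.get(key)
        if shard_ranges is None:
            # Ask the backend for the *complete* set of listing shard
            # ranges, ignoring marker/end_marker/includes/reverse, so the
            # cached entry is useful for any later request.
            headers = {'X-Backend-Override-Shard-Name-Filter': 'true'}
            shard_ranges = fetch_from_backend(headers)
            memcache.set(key, shard_ranges, time=600)
        infocache[key] = shard_ranges
        # Apply the request's own constraints locally ('' marks an
        # unbounded shard range edge).
        return [sr for sr in shard_ranges
                if (not marker or sr['upper'] == '' or sr['upper'] > marker)
                and (not end_marker or sr['lower'] < end_marker)]
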
* Avoid loops when gathering container listings from shards | Alistair Coles | 2021-01-06 | 1 | -3/+15

Previously the proxy container controller could, in corner cases, get
into a loop while building a listing for a sharded container. For
example, if a root has a single shard then the proxy will be redirected
to that shard, but if that shard has shrunk into the root then it will
redirect the proxy back to the root, and so on until the root is
updated with the shard's shrunken status.

There is already a guard to prevent the proxy fetching shard ranges
again from the same container that it is *currently* querying for
listing parts. That deals with the case when a container fills in gaps
in its listing shard ranges with a reference to itself. This patch
extends that guard to prevent the proxy fetching shard ranges again
from any container that has previously been queried for listing parts.

Change-Id: I7dc793f0ec65236c1278fd93d6b1f17c2db98d7b
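
A minimal sketch of the "previously queried" guard described in the
last paragraph, with hypothetical names:

    def gather_listing(root_path, get_listing_parts):
        # get_listing_parts(path) returns (objects, more_paths); a shard
        # that has shrunk back into the root may point us at the root
        # again, which is what the guard protects against.
        listing = []
        queried = set()
        pending = [root_path]
        while pending:
            path = pending.pop(0)
            if path in queried:
                continue            # never query the same container twice
            queried.add(path)
            objects, more_paths = get_listing_parts(path)
            listing.extend(objects)
            pending.extend(more_paths)
        return listing
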
* Make all concurrent_get options per-policy | Clay Gerrard | 2020-09-02 | 1 | -1/+1

Change-Id: Ib81f77cc343c3435d7e6258d4631563fa022d449
* sharding: filter shards based on prefix param when listing | Tim Burke | 2020-02-05 | 1 | -0/+13

Otherwise, we make a bunch of backend requests where we have no real
expectation of finding data.

Change-Id: I7eaa012ba938eaa7fc22837c32007d1b7ae99709
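
A minimal sketch of the kind of prefix filter described above, assuming
shard ranges as dicts with 'lower'/'upper' bounds where '' means
unbounded (illustrative only):

    def shards_for_prefix(shard_ranges, prefix):
        # Keep only shards that could contain names starting with prefix.
        if not prefix:
            return shard_ranges
        # smallest string that sorts after every name with this prefix
        end = prefix[:-1] + chr(ord(prefix[-1]) + 1)
        return [sr for sr in shard_ranges
                if (sr['upper'] == '' or sr['upper'] >= prefix)
                and (sr['lower'] == '' or sr['lower'] < end)]
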
* sharding: Tolerate blank limits when listing | Tim Burke | 2019-12-19 | 1 | -1/+1

Otherwise, we can 500 with

    ValueError: invalid literal for int() with base 10: ''

Change-Id: I35614aa4b42e61d97929579dcb16f7dfc9fef96f
* py3: fix up listings on sharded containers | Tim Burke | 2019-08-15 | 1 | -12/+14

We were playing a little fast & loose with types before; as a result,
marker/end_marker weren't quite working right. In particular, we were
checking whether a WSGI string was contained in a shard range, while
ShardRange assumes all comparisons are against native strings.

Now, get everything to native strings before making comparisons, and
get them back to wsgi when we shove them in the params dict.

Change-Id: Iddf9e089ef95dc709ab76dc58952a776246991fd
* Rework private-request-method interface | Tim Burke | 2019-05-22 | 1 | -1/+2

Instead of taking an X-Backend-Allow-Method that *must match* the
REQUEST_METHOD, take a truish X-Backend-Allow-Private-Methods and
expand the set of allowed methods. This allows us to also expose the
full list of available private methods when returning a 405.

Drive-by: make async-delete tests a little more robust:

  * check that end_marker and prefix are preserved on subsequent
    listings
  * check that objects with a leading slash are correctly handled

Change-Id: I5542623f16e0b5a0d728a6706343809e50743f73
* Add operator tool to async-delete some or all objects in a container | Tim Burke | 2019-05-22 | 1 | -0/+20

Adds a tool, swift-container-deleter, that takes an account/container
and optional prefix, marker, and/or end-marker; spins up an internal
client; makes listing requests against the container; and pushes the
found objects into the object-expirer queue with a special
application/async-deleted content-type.

In order to do this enqueuing efficiently, a new
internal-to-the-cluster container method is introduced: UPDATE. It
takes a JSON list of object entries and runs them through merge_items.

The object-expirer is updated to look for work items with this
content-type and skip the X-If-Deleted-At check that it would normally
do.

Note that the target container's listing will continue to show the
objects until data is actually deleted, bypassing some of the concerns
raised in the related change about clearing out a container entirely
and then deleting it.

Change-Id: Ia13ee5da3d1b5c536eccaadc7a6fdcd997374443
Related-Change: I50e403dee75585fc1ff2bb385d6b2d2f13653cf8
* py3: start porting for unit/proxy/test_server.py | Tim Burke | 2019-05-04 | 1 | -6/+6

Mostly this amounts to

  Exception.message -> Exception.args[0]
  '...' -> b'...'
  StringIO -> BytesIO
  makefile() -> makefile('rwb')
  iter.next() -> next(iter)
  bytes[n] -> bytes[n:n + 1]
  integer division

Note that the versioning tests are mostly untouched; they seemed to get
a little hairy.

Change-Id: I167b5375e7ed39d4abecf0653f84834ea7dac635
* py3: port proxy container controller | Pete Zaitcev | 2019-02-20 | 1 | -7/+10

Change-Id: Id74a93f10bc5c641d62141af33bef68e503f7e04
* Return 503 when account auto-create fails | Tim Burke | 2019-02-05 | 1 | -2/+2

...and save 500 for things that would actually leave tracebacks in
logs.

Change-Id: I02b062ccabba0dcc1542d063e0538f0b1bbbbca9
* py3: get proxy-server willing and able to respond to some API requests | Tim Burke | 2018-09-17 | 1 | -1/+1

I saw GET account/container/replicated object all work, which is not
too shabby.

Change-Id: I63408274fb76a4e9920c00a2ce2829ca6d9982ca
* Improve building listings from shards | Alistair Coles | 2018-05-22 | 1 | -15/+13

When building a listing from shard containers, objects fetched from
each shard range are appended to the existing listing provided their
name is greater than the last entry in the current listing and less
than or equal to the fetched shard range. This allows misplaced objects
below the shard range to possibly be included in the listing in correct
name order.

Previously that behaviour only occurred if the existing listing had
entries, but now it occurs even if no objects have yet been found.

Change-Id: I25cab53b9aa2252c98ebcf70aafb9d39887a11f1
* Add sharder daemon, manage_shard_ranges tool and probe tests | Matthew Oliver | 2018-05-18 | 1 | -0/+14

The sharder daemon visits container dbs and when necessary executes the
sharding workflow on the db. The workflow is, in overview:

- perform an audit of the container for sharding purposes.
- move any misplaced objects that do not belong in the container to
  their correct shard.
- move shard ranges from FOUND state to CREATED state by creating shard
  containers.
- move shard ranges from CREATED to CLEAVED state by cleaving objects
  to shard dbs and replicating those dbs. By default this is done in
  batches of 2 shard ranges per visit.

Additionally, when the auto_shard option is True (NOT yet recommended
in production), the sharder will identify shard ranges for containers
that have exceeded the threshold for sharding, and will also manage the
sharding and shrinking of shard containers.

The manage_shard_ranges tool provides a means to manually identify
shard ranges and merge them to a container in order to trigger
sharding. This is currently the recommended way to shard a container.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I7f192209d4d5580f5a0aa6838f9f04e436cf6b1f
* Enable proxy to build listings from shards | Alistair Coles | 2018-05-18 | 1 | -2/+107

When a container is sharding or sharded, the proxy container controller
now builds container listings by concatenating components from shard
ranges.

Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Change-Id: Ia4cfebbe50338a761b8b6e9903b1869cb1f5b47e
* Add shard range support to container server | Alistair Coles | 2018-05-18 | 1 | -1/+3

Support PUTs to the container server with json-serialized ShardRanges
in the body. Shard range PUTs may autocreate containers.

Support GET of shard ranges from the container server. Shard range GETs
support X-Backend-Include-Deleted to include deleted shard ranges in
the list, and X-Backend-Override-Delete to get shard ranges when the
container has been marked as deleted.

The X-Backend-Record-Type = ['object'|'shard'|'auto'] header is
introduced to differentiate container server requests for objects
versus shard ranges. When 'auto' is used with a GET request the
container server will return whichever record type is appropriate for
fetching object listings, depending on whether the container is sharded
or not.

Support container PUTs with body in direct_client.py

Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I029782ae348f38c5fb76d2759609f67a06c883ef
* Merge "Increase name-length limits for internal accounts"Zuul2018-01-311-4/+3
* Increase name-length limits for internal accounts | Tim Burke | 2017-10-13 | 1 | -4/+3

Double account and container name-length limits for accounts starting
with auto_create_account_prefix (default: '.') -- these are used
internally by Swift, and may need to have some prefix followed by a
user-settable value.

Related-Change: Ice703dc6d98108ad251c43f824426d026e1f1d97
Change-Id: Ie1ce5ea49b06ab3002c0bd0fad7cea16cea2598e
* Return HTTPServerError instead of HTTPNotFound | cheng | 2018-01-15 | 1 | -2/+3

Swift allows accounts to be auto-created. A failure to create the
account should be treated as a server error rather than a 404.

Change-Id: I726271bc06e3c1b07a4af504c3fd7ddb789bd512
Closes-Bug: 1718810
* | Merge "Delay cache invalidation during container creation"Jenkins2017-09-201-2/+2
* Delay cache invalidation during container creation | Thomas Herve | 2017-09-05 | 1 | -2/+2

Clearing the cache before the PUT request creates a fairly big window
during which the cache can be inconsistent if a concurrent GET happens.
Move the cache clear to after the requests to reduce that window.

Change-Id: I45130cc32ba3a23272c2a67c86b4063000379426
Closes-Bug: #1715177
* Move listing formatting out to proxy middleware | Tim Burke | 2017-09-15 | 1 | -0/+3

Make some json -> (text, xml) stuff in a common module, reference that
in account/container servers so we don't break existing clients
(including out-of-date proxies), but have the proxy controllers always
force a json listing.

This simplifies operations on listings (such as the ones already
happening in decrypter, or the ones planned for symlink and sharding)
by only needing to consider a single response type.

There is a downside of larger backend requests for text/plain listings,
but it seems like a net win?

Change-Id: Id3ce37aa0402e2d8dd5784ce329d7cb4fbaf700d
* Add Timestamp.now() helper | Tim Burke | 2017-04-27 | 1 | -2/+1

Often, we want the current timestamp. May as well improve the
ergonomics a bit and provide a class method for it.

Change-Id: I3581c635c094a8c4339e9b770331a03eab704074
* Update calling super class constructor style in proxy controllers | Kazuhiro MIYAHARA | 2017-03-13 | 1 | -1/+1

"swift.proxy.controllers.base.Controller" inherits from "object", so
the Controller class and its subclasses (AccountController,
ContainerController, BaseObjectController, InfoController) are
new-style classes. In a new-style class, when calling the superclass
constructor, "super(SubClass, self).__init__(foo, bar)" is recommended.
But AccountController, ContainerController, BaseObjectController and
InfoController use "Controller.__init__(self, app)", which is
deprecated.

This patch fixes the calls to the superclass constructor.

Change-Id: I4b94ec3131c7c7be4609716867a36490a70d5009
Closes-Bug: #1672285
* Make comparison simpler | zheng yin | 2016-07-18 | 1 | -2/+1

For example: "a > b and a <= c" is equivalent to "b < a <= c".

Change-Id: Iae1532f0946c6d4aa7321f3957820b486869c59f
* Fix up get_account_info and get_container_info | Samuel Merritt | 2016-05-13 | 1 | -2/+15

get_account_info used to work like this:

  * make an account HEAD request
  * ignore the response
  * get the account info by digging around in the request environment,
    where it had been deposited by elves or something

Not actually elves, but the proxy's GETorHEAD_base method would take
the HEAD response and cache it in the response environment, which was
the same object as the request environment, thus enabling
get_account_info to find it.

This was extraordinarily brittle. If a WSGI middleware were to
shallow-copy the request environment, then any middlewares to its left
could not use get_account_info, as the left middleware's request
environment would no longer be identical to the response environment
down in GETorHEAD_base.

Now, get_account_info works like this:

  * make an account HEAD request.
  * if the account info is in the request environment, return it. This
    is an optimization to avoid a double-set in memcached.
  * else, compute the account info from the response headers, store it
    in caches, and return it.

This is much easier to think about; get_account_info can get and cache
account info all on its own; the cache check and cache set are right
next to each other.

All the above is true for get_container_info as well.

get_info() is still around, but it's just a shim. It was trying to
unify get_account_info and get_container_info to exploit the
commonalities, but the number of times that "if container:" showed up
in get_info and its helpers really indicated that something was wrong.
I'd rather have two functions with some duplication than one function
with no duplication but a bunch of "if container:" branches.

Other things of note:

  * a HEAD request to a deleted account returns 410, but
    get_account_info would return 404 since the 410 came from the
    account controller *after* GETorHEAD_base ran. Now get_account_info
    returns 410 as well.

  * cache validity period (recheck_account_existence and
    recheck_container_existence) is now communicated to
    get_account_info via an X-Backend header. This way,
    get_account_info doesn't need a reference to the
    swift.proxy.server.Application object.

  * both logged swift_source values are now correct for
    get_container_info calls; before, on a cold cache,
    get_container_info would call get_account_info but not pass along
    swift_source, resulting in get_account_info logging "GET_INFO" as
    the source. Amusingly, there was a unit test asserting this bogus
    behavior.

  * callers that modify the return value of get_account_info or of
    get_container_info don't modify what's stored in swift.infocache.

  * get_account_info on an account that *can* be autocreated but has
    not been will return a 200, same as a HEAD request. The old
    behavior was a 404 from get_account_info but a 200 from HEAD.
    Callers can tell the difference by looking at
    info['account_really_exists'] if they need to know the difference
    (there is one call site that needs to know, in container PUT).
    Note: this is for all accounts when the proxy's
    "account_autocreate" setting is on.

Change-Id: I5167714025ec7237f7e6dd4759c2c6eb959b3fca
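
A rough sketch of the new flow described above, with hypothetical
helper boundaries (the HEAD request and memcache client are passed in;
this is not the actual swift.proxy.controllers.base implementation):

    def get_account_info(env, account, head_account, memcache, time=60):
        # Check the per-request infocache first; otherwise make the HEAD
        # request, reduce its headers to an info dict, and cache that.
        cache_key = 'account/%s' % account
        infocache = env.setdefault('swift.infocache', {})
        if cache_key in infocache:
            return infocache[cache_key]
        status, headers = head_account(account)   # backend HEAD request
        info = {
            'status': status,
            'container_count': headers.get('X-Account-Container-Count', '0'),
            'bytes': headers.get('X-Account-Bytes-Used', '0'),
        }
        infocache[cache_key] = info
        memcache.set(cache_key, info, time=time)
        return info
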
* Add concurrent reads option to proxy | Matthew Oliver | 2016-03-16 | 1 | -1/+3

This change adds 2 new parameters to enable and control concurrent GETs
in Swift: 'concurrent_gets' and 'concurrency_timeout'.

'concurrent_gets' allows you to turn concurrent GETs on or off. When
on, it sets the GET/HEAD concurrency to the replica count, and in the
case of EC HEADs it sets it to ndata. The proxy will then serve only
the first valid source to respond. This applies to all account,
container and object GETs except for EC; for EC, only HEAD requests are
affected.

It achieves this by changing the request sending mechanism to use
GreenAsyncPile and green threads, with a timeout between each request.

'concurrency_timeout' is related to concurrent_gets and is the amount
of time to wait before firing the next thread. A value of 0 fires all
requests at the same time (fully concurrent); setting another value
staggers the firing, giving a node a shorter chance to respond before
the next request is fired. This value is a float and should be
somewhere between 0 and node_timeout. The default is conn_timeout,
meaning that by default the firing is staggered.

DocImpact
Implements: blueprint concurrent-reads
Change-Id: I789d39472ec48b22415ff9d9821b1eefab7da867
* py3: Replace urllib imports with six.moves.urllib | Victor Stinner | 2015-10-08 | 1 | -1/+1

The urllib, urllib2 and urlparse modules of Python 2 were reorganized
into a new urllib namespace in Python 3. Replace urllib, urllib2 and
urlparse imports with six.moves.urllib to make the modified code
compatible with Python 2 and Python 3.

The initial patch was generated by the urllib operation of the sixer
tool on: bin/* swift/ test/.

Change-Id: I61a8c7fb7972eabc7da8dad3b3d34bceee5c5d93
* Foundational support for PUT and GET of erasure-coded objects | Samuel Merritt | 2015-04-14 | 1 | -1/+2

This commit makes it possible to PUT an object into Swift and have it
stored using erasure coding instead of replication, and also to GET the
object back from Swift at a later time.

This works by splitting the incoming object into a number of segments,
erasure-coding each segment in turn to get fragments, then
concatenating the fragments into fragment archives. Segments are 1 MiB
in size, except the last, which is between 1 B and 1 MiB.

[diagram: the object data is split into segments 1 .. N; each segment
is fed through pyeclib, which produces fragments A-i, B-i, ... for each
segment i]

Then, object server A gets the concatenation of fragment A-1, A-2, ...,
A-N, so its .data file looks like this (called a "fragment archive"):

    | fragment A-1 | fragment A-2 | ... | fragment A-N |

Since this means that the object server never sees the object data as
the client sent it, we have to do a few things to ensure data
integrity.

First, the proxy has to check the Etag if the client provided it; the
object server can't do it since the object server doesn't see the raw
data.

Second, if the client does not provide an Etag, the proxy computes it
and uses the MIME-PUT mechanism to provide it to the object servers
after the object body. Otherwise, the object would not have an Etag at
all.

Third, the proxy computes the MD5 of each fragment archive and sends it
to the object server using the MIME-PUT mechanism. With replicated
objects, the proxy checks that the Etags from all the object servers
match, and if they don't, returns a 500 to the client. This mitigates
the risk of data corruption in one of the proxy --> object connections,
and signals to the client when it happens. With EC objects, we can't
use that same mechanism, so we must send the checksum with each
fragment archive to get comparable protection.

On the GET path, the inverse happens: the proxy connects to a bunch of
object servers (M of them, for an M+K scheme), reads one fragment at a
time from each fragment archive, decodes those fragments into a
segment, and serves the segment to the client.

When an object server dies partway through a GET response, any
partially-fetched fragment is discarded, the resumption point is wound
back to the nearest fragment boundary, and the GET is retried with the
next object server.

GET requests for a single byterange work; GET requests for multiple
byteranges do not.

There are a number of things _not_ included in this commit. Some of
them are listed here:

  * multi-range GET
  * deferred cleanup of old .data files
  * durability (daemon to reconstruct missing archives)

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Paul Luse <paul.e.luse@intel.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
Change-Id: I9c13c03616489f8eab7dcd7c5f21237ed4cb6fd2
* Add support of x-remove- headers for container-sync | Arnaud JOST | 2015-02-19 | 1 | -1/+3

If the tool used to send headers doesn't support empty headers (older
versions of curl), x-remove headers can be used to remove metadata.

The sync-key and sync-to metadata, used by container-sync, can now be
removed using x-remove headers.

Change-Id: I0edb4d5425a99d20a973aa4fceaf9af6c2ddecc0
* Make container GET call authorize when account not found | Alistair Coles | 2015-02-09 | 1 | -0/+4

When an account was not found, ContainerController would return 404
unconditionally for a container GET or HEAD request, without checking
that the request was authorized.

This patch modifies the GETorHEAD method to first call any callback
method registered under 'swift.authorize' in the request environ and
prefer any response from that over the 404.

Closes-Bug: 1415957
Change-Id: I4f41fd9e445238e14af74b6208885d83698cc08d
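
A minimal sketch of the check described above, assuming the
'swift.authorize' callback takes the request and returns a response to
deny it or None to allow it (illustrative, not the actual GETorHEAD
code):

    def account_missing_response(req, not_found_resp):
        # Give any registered auth callback the first chance to reject
        # the request; only fall back to the 404 if it allows access.
        authorize = req.environ.get('swift.authorize')
        if authorize:
            denial = authorize(req)
            if denial:
                return denial       # prefer a 401/403 over the 404
        return not_found_resp
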
* Restrict keystone cross-tenant ACLs to IDs | anc | 2014-08-08 | 1 | -1/+1

The keystoneauth middleware supports cross-tenant access control using
the syntax <tenant>:<user> in container ACLs, where <tenant> and <user>
may currently be either a unique id or a name. As a result of the
keystone v3 API introducing domains, names are no longer globally
unique and are only unique within a domain. The use of unqualified
tenant and user names in this ACL syntax is therefore not 'safe' in a
keystone v3 environment.

This patch modifies keystoneauth to restrict cross-tenant ACL matching
to use only ids for accounts that are not in the default domain. For
backwards compatibility, names will still be matched in ACLs when both
the requesting user and tenant are known to be in the default domain
AND the account's tenant is also in the default domain (the default
domain being the domain to which existing tenants are migrated).

Accounts existing prior to this patch are assumed to be for tenants in
the default domain. New accounts created using a v2 token scoped on the
tenant are also assumed to be in the default domain. New accounts
created using a v3 token scoped on the tenant will learn their domain
membership from the token info. New accounts created using any unscoped
token (i.e. with a reselleradmin role) will have unknown domain
membership and therefore be assumed to NOT be in the default domain.

Despite this provision for backwards compatibility, names must no
longer be used when setting new ACLs in any account, including new
accounts in the default domain. This change obviously impacts users
accustomed to specifying cross-tenant ACLs in terms of names, and
further work will be necessary to restore those use cases. Some ideas
are discussed under the bug report. With that caveat, this patch
removes the reported vulnerability when using swift/keystoneauth with a
keystone v3 API.

Note: to observe the new 'restricted' behaviour you will need to set up
keystone user(s) and tenant(s) in a non-default domain and set
auth_version = v3.0 in the auth_token middleware config section of
proxy-server.conf. You may also benefit from the keystone v3 enabled
swiftclient patch under review here:

    https://review.openstack.org/#/c/91788/

DocImpact
blueprint keystone-v3-support
Closes-Bug: #1299146
Change-Id: Ib32df093f7450f704127da77ff06b595f57615cb