| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The updating shard range cache has been restructured and upgraded to
v2, which only persists the essential attributes in memcache (see
Related-Change). This follow-up patch restructures the listing shard
range cache for object listing in the same way.
UpgradeImpact
=============
The cache key for listing shard ranges in memcached is renamed
from 'shard-listing/<account>/<container>' to
'shard-listing-v2/<account>/<container>', and cache data is
changed to be a list of [lower bound, name]. As a result, this
will invalidate all existing listing shard ranges stored in the
memcache cluster.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Related-Change: If98af569f99aa1ac79b9485ce9028fdd8d22576b
Change-Id: I54a32fd16e3d02b00c18b769c6f675bae3ba8e01
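As a rough illustration (the key format and value shape come from the
message above; the account/container names and the memcache call are
invented for this example), a v2 listing cache entry looks like:

    # hypothetical names; only the key format and value shape are from this patch
    cache_key = 'shard-listing-v2/%s/%s' % ('AUTH_test', 'mycontainer')
    # v2 persists only [lower bound, name] per shard range:
    cached_ranges = [
        ['', '.shards_AUTH_test/mycontainer-0'],
        ['m', '.shards_AUTH_test/mycontainer-1'],
    ]
    # e.g. memcache_client.set(cache_key, cached_ranges)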
|
|
|
|
|
|
|
|
| |
X-Backend-Allow-Method was used in an earlier iteration, but not in
the version of the patch that finally landed.
Change-Id: Id637253bb68bc839f5444a74c91588d753ef4379
Related-Change: Ia13ee5da3d1b5c536eccaadc7a6fdcd997374443
|
|
|
|
|
|
|
|
| |
Adding a "use_replication" field to the node dict, a helper function to
set use_replication dict value for a node copy by looking up the header
value for x-backend-use-replication-network
Change-Id: Ie05af464765dc10cf585be851f462033fc6bdec7
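A minimal sketch of the kind of helper described, with invented names
(the real function and signature in Swift may differ):

    from swift.common.utils import config_true_value

    def node_with_use_replication(node, req):
        # return a copy of the node dict with use_replication set from the
        # x-backend-use-replication-network request header
        use_replication = config_true_value(
            req.headers.get('x-backend-use-replication-network', 'false'))
        return dict(node, use_replication=use_replication)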
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds more granularity to the shard operation cache and
backend metrics, and removes some existing, duplicated metrics.
Before this patch, the related metrics are:
1. shard_<op>.cache.[hit|miss|skip]
2. shard_<op>.backend.<status_int>
where <op> is 'listing' or 'updating'.
With this patch, they become:
1. shard_<op>.infocache.hit
   cache hits in the per-request infocache.
2. shard_<op>.cache.hit
   cache hits in memcache.
3. shard_<op>.cache.[miss|bypass|skip|force_skip|disabled|error]
   .<status_int>
   requests made to the backend, for the following reasons:
   miss: cache miss.
   bypass: the request metadata did not support a cache lookup.
   skip: a selective skip per the skip-percentage config.
   force_skip: the request carried an 'x-newest' header.
   disabled: memcache is disabled.
   error: memcache connection error.
For each kind of backend operation the <status_int> suffix counts
operations by response status; the sum of all status sub-metrics
gives the total for that operation.
UpgradeImpact
=============
Metrics dashboards will need updates to display the changed metrics
correctly; the infocache metrics are newly added. See the message
above for all changes needed.
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Ib8be30d3969b4b4808664c43e94db53d10e6ef4c
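Illustration only of how a proxy controller might emit the metric
names listed above via its logger's statsd client; the variable names
here are assumptions:

    cache_state = 'miss'    # or 'bypass', 'skip', 'force_skip', 'disabled', 'error'
    status_int = 200        # status of the resulting backend request
    logger.increment('shard_listing.cache.%s.%s' % (cache_state, status_int))
    # ...and on an infocache hit, no backend request is made:
    logger.increment('shard_listing.infocache.hit')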
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactor the ContainerController._GET_using_cache. No behavioral
changes.
* Make the top-level method shorter.
* Make the flow easier to follow.
* Make various return points more obvious.
* Change variable names to distinguish those that are lists of
ShardRange objects from those that are lists of dicts representing
shard ranges.
Change-Id: Ibb7cd761be4a5b1ec53dd16b7c5d256ed7666a88
|
|\ |
|
| |
| |
| |
| | |
Change-Id: I08cc2c0bfe803e3cec1e6ada10af4d725359e5e8
|
|\ \ |
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Auth middlewares in particular may want to *know* when there's a
communication breakdown as opposed to a cache miss.
Update our shard-range cache stats to acknowledge the distinction.
Drive-by: Log an error if all memcached servers are error-limited.
Change-Id: Ic8d0915235d11124d06ec940c5be9a2edbe85c83
|
|/
|
|
|
|
|
|
|
| |
There are a few places where a last-modified value is calculated by
rounding a timestamp *up* to the nearest second. This patch refactors
to use a new Timestamp.ceil() method to do this rounding, along with a
clarifying docstring.
Change-Id: I9ef73e5183bdf21b22f5f19b8440ffef6988aec7
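A brief usage sketch (Timestamp and the new ceil() method are real
Swift utilities; the value is invented):

    from swift.common.utils import Timestamp

    ts = Timestamp(1609459200.123456)
    last_modified = ts.ceil()   # rounds *up* to the next whole second -> 1609459201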
|
|
|
|
|
|
| |
Add statsd metrics 'container.shard_listing.backend.<status_int>'.
Change-Id: Ibd98ad3bdedc6c80d275a37697de0943e3e8fb4f
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Swift loggers encapsulate a StatsdClient that is typically initialised
with a prefix, equal to the logger name (e.g. 'proxy_server'), that is
prepended to metrics names. The proxy server would previously mutate
its logger's prefix, using its set_statsd_prefix method, each time a
controller was instantiated, extending it with the controller type
(e.g. changing the prefix to 'proxy_server.object'). As a result, when an
object request spawned container subrequests, for example, the statsd
client would be left with a 'proxy_server.container' prefix part for
subsequent object request related metrics.
The proxy server logger is now wrapped with a new
MetricsPrefixLoggerAdapter each time a controller is instantiated, and
the adapter applies the correct prefix for the controller type for the
lifetime of the controller.
Change-Id: I0522b1953722ca96021a0002cf93432b973ce626
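An illustrative sketch (not Swift's actual implementation) of a
per-controller metrics-prefix adapter of the kind described:

    import logging

    class MetricsPrefixLoggerAdapter(logging.LoggerAdapter):
        # wraps a swift logger and prepends a fixed metric prefix for the
        # lifetime of one controller instance, instead of mutating the
        # shared logger's statsd prefix
        def __init__(self, logger, extra, metric_prefix):
            super(MetricsPrefixLoggerAdapter, self).__init__(logger, extra)
            self.metric_prefix = metric_prefix

        def increment(self, metric):
            self.logger.increment('%s.%s' % (self.metric_prefix, metric))

    # e.g. a container controller could wrap the proxy logger as:
    #   self.logger = MetricsPrefixLoggerAdapter(app.logger, {}, 'container')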
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By having some small portion of calls skip cache and go straight to
disk, we can ensure the cache is always kept fresh and never expires (at
least, for active containers). Previously, when shard ranges fell out of
cache there would frequently be a thundering herd that could overwhelm
the container server, leading to 503s served to clients or an increase
in async pendings.
Include metrics for hit/miss/skip rates.
Change-Id: I6d74719fb41665f787375a08184c1969c86ce2cf
Related-Bug: #1883324
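A hedged sketch of the selective skip; the option name and percentage
here are assumptions, not necessarily the real Swift config:

    import random

    shard_listing_cache_skip_pct = 0.1   # fraction of a percent of requests

    def should_skip_cache():
        # a small random sample of requests bypasses memcache and goes to
        # the container server, keeping the cached shard ranges fresh
        return random.random() * 100 < shard_listing_cache_skip_pct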
|
|
|
|
|
| |
Change-Id: I2ffacdbd70c72e091825164da24cc87ea67721d7
Partial-Bug: #1674543
|
|
|
|
|
|
|
|
|
| |
Return a 503 to the original container listing request if a GET from a
shard returns objects from a different policy than requested (e.g. due
to the shard container server not being upgraded).
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Change-Id: Ieab6238030e8c264ee90186012be6e9da937b42e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a tight coupling between a root container and its shards: the
shards hold the object metadata for the root container, so they are
really an extension of the root. When we PUT objects into a root
container, it redirects them, with the root's policy, to the shards,
and the shards are happy to take them even if the shard's policy is
different from the root's. But when it comes to GETs, the root
redirects the GET to its shards, which currently won't respond with
objects (which they probably took) because they are of a different
policy. Currently, when getting objects from the container server, the
policy used is always the broker's policy.
This patch corrects that behaviour by allowing the policy index to be
overridden: if the request to the container server contains an
'X-Backend-Storage-Policy-Index' header, it is used instead of the
policy index stored in the broker.
This patch adds the root container's policy as this header in the
proxy container controller's `_get_from_shards` method, which the
proxy uses to redirect a GET on a root to its shards.
Further, a new backend response header has been added. If the
container response contains an `X-Backend-Record-Type: object` header,
the response contains object listings. In this case the patch also
adds an `X-Backend-Record-Storage-Policy-Index` header so the policy
index of the listed objects is known, since
X-Backend-Storage-Policy-Index in the response _always_ represents the
policy index of the container itself.
As a bonus, this new container policy API gives us a way to check
containers for object listings in other policies, which might come in
handy for ops/SREs.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I026b699fc5f0fba619cf524093632d67ca38d32f
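Illustration only; the header names come from the message above, the
helper itself is invented:

    def shard_listing_headers(root_policy_index):
        # ask the shard's container server to list objects of the *root's*
        # policy rather than the shard broker's own policy
        return {'X-Backend-Storage-Policy-Index': str(root_policy_index)}

    # A response carrying 'X-Backend-Record-Type: object' may then also carry
    # 'X-Backend-Record-Storage-Policy-Index' naming the policy of the listed
    # objects, while 'X-Backend-Storage-Policy-Index' in the response still
    # describes the container itself.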
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, when building a listing from shard containers, the proxy
would return 200 to the client even if one or more of the component
shards failed to return a successful listing response. With this patch
the proxy will now return a 503 in those circumstances, since the
listing would otherwise be incomplete due to an internal error.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I983d665584471c9d689506592f48ddd00c0887ef
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The proxy tried to map the x-backend-sharding-state header value to
a ShardRange state, but the value is the container *db* state. The
attempted mapping would commonly fail because db state names do not
always correspond to ShardRange state names, in which case the
fallback was to log the header value, i.e. the correct outcome.
Sometimes the attempted mapping would succeed because, for example,
'sharded' is both a db state name and a ShardRange state name. In that
case the log message would look something like:
"Found 1024 objects in shard (state=(70, 'sharded')), total = 1024"
i.e. the tuple of ShardRange state number and name was logged, which
was inappropriate.
Change-Id: Ic08e6e7df7162a4c1283a3ef6e67c3b21a4ce494
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes four significant changes to the handling of GET
requests for sharding or sharded containers:
- container server GET requests may now result in the entire list of
shard ranges being returned for the 'listing' state regardless of
any request parameter constraints.
- the proxy server may cache that list of shard ranges in memcache
and in the request's environ infocache dict, and subsequently use the
cached shard ranges when handling GET requests for the same
container.
- the proxy now caches more container metadata so that it can
synthesize a complete set of container GET response headers from
cache.
- the proxy server now enforces more container GET request validity
checks that were previously only enforced by the backend server,
e.g. checks for valid request parameter values
With this change, when the proxy learns from container metadata
that the container is sharded then it will cache shard
ranges fetched from the backend during a container GET in memcache.
On subsequent container GETs the proxy will use the cached shard
ranges to gather object listings from shard containers, avoiding
further GET requests to the root container until the cached shard
ranges expire from cache.
Cached shard ranges are most useful if they cover the entire object
name space in the container. The proxy therefore uses a new
X-Backend-Override-Shard-Name-Filter header to instruct the container
server to ignore any request parameters that would constrain the
returned shard range listing i.e. 'marker', 'end_marker', 'includes'
and 'reverse' parameters. Having obtained the entire shard range
listing (either from the server or from cache) the proxy now applies
those request parameter constraints itself when constructing the
client response.
When using cached shard ranges the proxy will synthesize response
headers from the container metadata that is also in cache. To enable
the full set of container GET response headers to be synthesized in
this way, the set of metadata that the proxy caches when handling a
backend container GET response is expanded to include various
timestamps.
The X-Newest header may be used to disable looking up shard ranges
in cache.
Change-Id: I5fc696625d69d1ee9218ee2a508a1b9be6cf9685
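A hedged sketch of the new backend header; the header names follow the
message above, but the values and surrounding code are assumptions for
illustration:

    backend_headers = {
        'X-Backend-Record-Type': 'shard',
        # ask the container server to ignore marker/end_marker/includes/reverse
        # so the complete shard range listing can be cached:
        'X-Backend-Override-Shard-Name-Filter': 'true',
    }
    # the proxy then applies those listing constraints itself, from cache,
    # when building the client response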
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the proxy container controller could, in corner cases, get
into a loop while building a listing for a sharded container. For
example, if a root has a single shard then the proxy will be
redirected to that shard, but if that shard has shrunk into the root
then it will redirect the proxy back to the root, and so on until the
root is updated with the shard's shrunken status.
There is already a guard to prevent the proxy fetching shard ranges
again from the same container that it is *currently* querying for
listing parts. That deals with the case when a container fills in gaps
in its listing shard ranges with a reference to itself. This patch
extends that guard to prevent the proxy fetching shard ranges again
from any container that has previously been queried for listing parts.
Change-Id: I7dc793f0ec65236c1278fd93d6b1f17c2db98d7b
|
|
|
|
| |
Change-Id: Ib81f77cc343c3435d7e6258d4631563fa022d449
|
|
|
|
|
|
|
| |
Otherwise, we make a bunch of backend requests where we have no
real expectation of finding data.
Change-Id: I7eaa012ba938eaa7fc22837c32007d1b7ae99709
|
|
|
|
|
|
|
|
| |
Otherwise, we can 500 with
ValueError: invalid literal for int() with base 10: ''
Change-Id: I35614aa4b42e61d97929579dcb16f7dfc9fef96f
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were playing a little fast & loose with types before; as a result,
marker/end_marker weren't quite working right. In particular, we were
checking whether a WSGI string was contained in a shard range, while
ShardRange assumes all comparisons are against native strings.
Now, get everything to native strings before making comparisons, and
get them back to wsgi when we shove them in the params dict.
Change-Id: Iddf9e089ef95dc709ab76dc58952a776246991fd
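A sketch of the native-string handling described, using Swift's swob
helpers (the surrounding variables are invented):

    from swift.common.swob import str_to_wsgi, wsgi_to_str

    marker = wsgi_to_str(wsgi_marker)        # compare as native strings
    if marker in shard_range:                # ShardRange comparisons expect native str
        params['marker'] = str_to_wsgi(shard_range.lower_str)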
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of taking a X-Backend-Allow-Method that *must match* the
REQUEST_METHOD, take a truish X-Backend-Allow-Private-Methods and
expand the set of allowed methods. This allows us to also expose
the full list of available private methods when returning a 405.
Drive-By: make async-delete tests a little more robust:
* check that end_marker and prefix are preserved on subsequent
listings
* check that objects with a leading slash are correctly handled
Change-Id: I5542623f16e0b5a0d728a6706343809e50743f73
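An illustrative sketch of the check described above; public_methods,
private_methods and the surrounding code are assumptions:

    from swift.common.swob import HTTPMethodNotAllowed
    from swift.common.utils import config_true_value

    allowed = set(public_methods)
    if config_true_value(
            req.headers.get('X-Backend-Allow-Private-Methods', 'false')):
        allowed |= set(private_methods)
    if req.method not in allowed:
        # the 405 can now advertise the full list of available methods
        return HTTPMethodNotAllowed(
            request=req, headers={'Allow': ', '.join(sorted(allowed))})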
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a tool, swift-container-deleter, that takes an account/container
and optional prefix, marker, and/or end-marker; spins up an internal
client; makes listing requests against the container; and pushes the
found objects into the object-expirer queue with a special
application/async-deleted content-type.
In order to do this enqueuing efficiently, a new internal-to-the-cluster
container method is introduced: UPDATE. It takes a JSON list of object
entries and runs them through merge_items.
The object-expirer is updated to look for work items with this
content-type and skip the X-If-Deleted-At check that it would normally
do.
Note that the target-container's listing will continue to show the
objects until data is actually deleted, bypassing some of the concerns
raised in the related change about clearing out a container entirely and
then deleting it.
Change-Id: Ia13ee5da3d1b5c536eccaadc7a6fdcd997374443
Related-Change: I50e403dee75585fc1ff2bb385d6b2d2f13653cf8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mostly this amounts to
Exception.message -> Exception.args[0]
'...' -> b'...'
StringIO -> BytesIO
makefile() -> makefile('rwb')
iter.next() -> next(iter)
bytes[n] -> bytes[n:n + 1]
integer division
Note that the versioning tests are mostly untouched; they seemed to get
a little hairy.
Change-Id: I167b5375e7ed39d4abecf0653f84834ea7dac635
|
|
|
|
| |
Change-Id: Id74a93f10bc5c641d62141af33bef68e503f7e04
|
|
|
|
|
|
| |
...and save 500 for things that would actually leave tracebacks in logs.
Change-Id: I02b062ccabba0dcc1542d063e0538f0b1bbbbca9
|
|
|
|
|
|
|
| |
I saw GET account/container/replicated object all work,
which is not too shabby.
Change-Id: I63408274fb76a4e9920c00a2ce2829ca6d9982ca
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When building a listing from shard containers, objects fetched from
each shard range are appended to the existing listing provided their
name is greater than the last entry in the current listing and less
than or equal to the fetched shard range. This allows misplaced
objects below the shard range to possibly be included in the listing
in correct name order. Previously that behaviour only occurred if the
existing listing had entries, but now it occurs even if no objects
have yet been found.
Change-Id: I25cab53b9aa2252c98ebcf70aafb9d39887a11f1
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The sharder daemon visits container dbs and when necessary executes
the sharding workflow on the db.
The workflow is, in overview:
- perform an audit of the container for sharding purposes.
- move any misplaced objects that do not belong in the container
to their correct shard.
- move shard ranges from FOUND state to CREATED state by creating
shard containers.
- move shard ranges from CREATED to CLEAVED state by cleaving objects
to shard dbs and replicating those dbs. By default this is done in
batches of 2 shard ranges per visit.
Additionally, when the auto_shard option is True (NOT yet recommended
in production), the sharder will identify shard ranges for containers
that have exceeded the threshold for sharding, and will also manage
the sharding and shrinking of shard containers.
The manage_shard_ranges tool provides a means to manually identify
shard ranges and merge them to a container in order to trigger
sharding. This is currently the recommended way to shard a container.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I7f192209d4d5580f5a0aa6838f9f04e436cf6b1f
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a container is sharding or sharded the proxy container controller
now builds container listings by concatenating components from shard
ranges.
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Change-Id: Ia4cfebbe50338a761b8b6e9903b1869cb1f5b47e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support PUTs to the container server with json-serialized ShardRanges
in the body. Shard range PUTs may autocreate containers.
Support GET of shard ranges from the container server. Shard range
GETs support X-Backend-Include-Deleted to include deleted shard ranges
in the list, and X-Backend-Override-Delete to get shard ranges when
the container has been marked as deleted.
The X-Backend-Record-Type = ['object'|'shard'|'auto'] header is
introduced to differentiate container server requests for objects
versus shard ranges. When 'auto' is used with a GET request the
container server will return whichever record type is appropriate for
fetching object listings, depending on whether the container is
sharded or not.
Support container PUTs with a body in direct_client.py.
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I029782ae348f38c5fb76d2759609f67a06c883ef
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Double account and container name-length limits for accounts starting with
auto_create_account_prefix (default: '.') -- these are used internally by
Swift, and may need to have some prefix followed by a user-settable value.
Related-Change: Ice703dc6d98108ad251c43f824426d026e1f1d97
Change-Id: Ie1ce5ea49b06ab3002c0bd0fad7cea16cea2598e
|
| |
| |
| |
| |
| |
| |
| |
| | |
Swift allows accounts to be autocreated. When creating the account
fails, it should be treated as a server error rather than a 404.
Change-Id: I726271bc06e3c1b07a4af504c3fd7ddb789bd512
Closes-bug: 1718810
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Clearing the cache before the PUT request creates a fairly big window
in which the cache can be inconsistent if a concurrent GET happens.
Let's move the cache clear to after the requests to reduce that window.
Change-Id: I45130cc32ba3a23272c2a67c86b4063000379426
Closes-Bug: #1715177
|
|/ /
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Move the json -> (text, xml) listing translation into a common module and
reference it in the account/container servers so we don't break existing
clients (including out-of-date proxies), but have the proxy controllers
always force a json listing.
This simplifies operations on listings (such as the ones already happening in
decrypter, or the ones planned for symlink and sharding) by only needing to
consider a single response type.
There is a downside of larger backend requests for text/plain listings, but
it seems like a net win?
Change-Id: Id3ce37aa0402e2d8dd5784ce329d7cb4fbaf700d
|
|/
|
|
|
|
|
| |
Often, we want the current timestamp. May as well improve the ergonomics
a bit and provide a class method for it.
Change-Id: I3581c635c094a8c4339e9b770331a03eab704074
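The convenience in question (Timestamp.now() is the class method this
change adds; the comparison is just for illustration):

    import time
    from swift.common.utils import Timestamp

    ts = Timestamp.now()        # instead of Timestamp(time.time())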
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"swift.proxy.controllers.base.Controller" inherits "object", so the
Controller class and its sub classes (AccountController,
ContainerController, BaseObjectController, InfoController) are
"new style class". In new style class, if a class call super class's
constructor, "super(SubClass, self).__init__(foo, bar)" is recommended.
But, AccountController, ContainerController, BaseObjectController,
and InfoController use "Controller.__init__(self, app)", and it is
deprecated.
This patch fixes the calling super class constructor codes.
Change-Id: I4b94ec3131c7c7be4609716867a36490a70d5009
Closes-Bug: #1672285
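The style change, illustratively (the constructor signature shown is a
plausible example, not necessarily the exact one):

    class ContainerController(Controller):
        def __init__(self, app, account_name, container_name, **kwargs):
            # deprecated style:
            #   Controller.__init__(self, app)
            super(ContainerController, self).__init__(app)
            self.account_name = account_name
            self.container_name = container_name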
|
|
|
|
|
|
| |
For example: a>b and a<=c is equivalent to b<a<=c
Change-Id: Iae1532f0946c6d4aa7321f3957820b486869c59f
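In Python's chained-comparison form:

    if b < a <= c:    # same as: a > b and a <= c
        ...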
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
get_account_info used to work like this:
* make an account HEAD request
* ignore the response
* get the account info by digging around in the request environment,
where it had been deposited by elves or something
Not actually elves, but the proxy's GETorHEAD_base method would take
the HEAD response and cache it in the response environment, which was
the same object as the request environment, thus enabling
get_account_info to find it.
This was extraordinarily brittle. If a WSGI middleware were to
shallow-copy the request environment, then any middlewares to its left
could not use get_account_info, as the left middleware's request
environment would no longer be identical to the response environment
down in GETorHEAD_base.
Now, get_account_info works like this:
* make an account HEAD request.
* if the account info is in the request environment, return it. This
is an optimization to avoid a double-set in memcached.
* else, compute the account info from the response headers, store it
in caches, and return it.
This is much easier to think about; get_account_info can get and cache
account info all on its own; the cache check and cache set are right
next to each other.
All the above is true for get_container_info as well.
get_info() is still around, but it's just a shim. It was trying to
unify get_account_info and get_container_info to exploit the
commonalities, but the number of times that "if container:" showed up
in get_info and its helpers really indicated that something was
wrong. I'd rather have two functions with some duplication than one
function with no duplication but a bunch of "if container:" branches.
Other things of note:
* a HEAD request to a deleted account returns 410, but
get_account_info would return 404 since the 410 came from the
account controller *after* GETorHEAD_base ran. Now
get_account_info returns 410 as well.
* cache validity period (recheck_account_existence and
recheck_container_existence) is now communicated to
get_account_info via an X-Backend header. This way,
get_account_info doesn't need a reference to the
swift.proxy.server.Application object.
* both logged swift_source values are now correct for
get_container_info calls; before, on a cold cache,
get_container_info would call get_account_info but not pass along
swift_source, resulting in get_account_info logging "GET_INFO" as
the source. Amusingly, there was a unit test asserting this bogus
behavior.
* callers that modify the return value of get_account_info or of
get_container_info don't modify what's stored in swift.infocache.
* get_account_info on an account that *can* be autocreated but has
not been will return a 200, same as a HEAD request. The old
behavior was a 404 from get_account_info but a 200 from
HEAD. Callers can tell the difference by looking at
info['account_really_exists'] if they need to know the difference
(there is one call site that needs to know, in container
PUT). Note: this is for all accounts when the proxy's
"account_autocreate" setting is on.
Change-Id: I5167714025ec7237f7e6dd4759c2c6eb959b3fca
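A usage sketch of the refactored helpers; get_account_info and
get_container_info are real Swift functions, but this middleware
context and the swift_source value are invented:

    from swift.proxy.controllers.base import get_account_info, get_container_info

    def handle(env, app):
        account_info = get_account_info(env, app, swift_source='MYMW')
        if not account_info.get('account_really_exists', True):
            # account is autocreatable but has not actually been created yet
            pass
        container_info = get_container_info(env, app, swift_source='MYMW')
        return account_info, container_info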
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds 2 new parameters to enable and control concurrent
GETs in swift: 'concurrent_gets' and 'concurrency_timeout'.
'concurrent_gets' allows you to turn concurrent GETs on or off; when
on, it sets the GET/HEAD concurrency to the replica count, and in the
case of EC HEADs it sets it to ndata.
The proxy will then serve only the first valid source to respond.
This applies to all account, container and object GETs except for EC;
for EC only HEAD requests are affected.
It achieves this by changing the request-sending mechanism to use
GreenAsyncPile and green threads, with a timeout between each
request.
'concurrency_timeout' is related to concurrent_gets and is the amount
of time to wait before firing the next thread. A value of 0 will fire
all requests at the same time (fully concurrent); a larger value will
stagger the firing, giving a node a shorter chance to respond before
firing the next. This value is a float and should be somewhere
between 0 and node_timeout. The default is conn_timeout, meaning that
by default the firing is staggered.
DocImpact
Implements: blueprint concurrent-reads
Change-Id: I789d39472ec48b22415ff9d9821b1eefab7da867
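An example proxy-server.conf snippet for the options described above
(the values are illustrative only):

    [app:proxy-server]
    use = egg:swift#proxy
    concurrent_gets = true
    concurrency_timeout = 0.5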
|
|
|
|
|
|
|
|
|
|
|
|
| |
The urllib, urllib2 and urlparse modules of Python 2 were reorganized
into a new urllib namespace on Python 3. Replace urllib, urllib2 and
urlparse imports with six.moves.urllib to make the modified code
compatible with Python 2 and Python 3.
The initial patch was generated by the urllib operation of the sixer
tool on: bin/* swift/ test/.
Change-Id: I61a8c7fb7972eabc7da8dad3b3d34bceee5c5d93
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit makes it possible to PUT an object into Swift and have it
stored using erasure coding instead of replication, and also to GET
the object back from Swift at a later time.
This works by splitting the incoming object into a number of segments,
erasure-coding each segment in turn to get fragments, then
concatenating the fragments into fragment archives. Segments are 1 MiB
in size, except the last, which is between 1 B and 1 MiB.
+====================================================================+
| object data |
+====================================================================+
|
+------------------------+----------------------+
| | |
v v v
+===================+ +===================+ +==============+
| segment 1 | | segment 2 | ... | segment N |
+===================+ +===================+ +==============+
| |
| |
v v
/=========\ /=========\
| pyeclib | | pyeclib | ...
\=========/ \=========/
| |
| |
+--> fragment A-1 +--> fragment A-2
| |
| |
| |
| |
| |
+--> fragment B-1 +--> fragment B-2
| |
| |
... ...
Then, object server A gets the concatenation of fragment A-1, A-2,
..., A-N, so its .data file looks like this (called a "fragment archive"):
+=====================================================================+
| fragment A-1 | fragment A-2 | ... | fragment A-N |
+=====================================================================+
Since this means that the object server never sees the object data as
the client sent it, we have to do a few things to ensure data
integrity.
First, the proxy has to check the Etag if the client provided it; the
object server can't do it since the object server doesn't see the raw
data.
Second, if the client does not provide an Etag, the proxy computes it
and uses the MIME-PUT mechanism to provide it to the object servers
after the object body. Otherwise, the object would not have an Etag at
all.
Third, the proxy computes the MD5 of each fragment archive and sends
it to the object server using the MIME-PUT mechanism. With replicated
objects, the proxy checks that the Etags from all the object servers
match, and if they don't, returns a 500 to the client. This mitigates
the risk of data corruption in one of the proxy --> object connections,
and signals to the client when it happens. With EC objects, we can't
use that same mechanism, so we must send the checksum with each
fragment archive to get comparable protection.
On the GET path, the inverse happens: the proxy connects to a bunch of
object servers (M of them, for an M+K scheme), reads one fragment at a
time from each fragment archive, decodes those fragments into a
segment, and serves the segment to the client.
When an object server dies partway through a GET response, any
partially-fetched fragment is discarded, the resumption point is wound
back to the nearest fragment boundary, and the GET is retried with the
next object server.
GET requests for a single byterange work; GET requests for multiple
byteranges do not.
There are a number of things _not_ included in this commit. Some of
them are listed here:
* multi-range GET
* deferred cleanup of old .data files
* durability (daemon to reconstruct missing archives)
Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Paul Luse <paul.e.luse@intel.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
Change-Id: I9c13c03616489f8eab7dcd7c5f21237ed4cb6fd2
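A minimal sketch of the segment -> fragment flow, using pyeclib
directly; Swift's real EC path is considerably more involved and these
parameters are only examples:

    from pyeclib.ec_iface import ECDriver

    SEGMENT_SIZE = 1048576   # 1 MiB segments, as described above
    ec = ECDriver(k=4, m=2, ec_type='liberasurecode_rs_vand')

    def fragments_for(object_data):
        for off in range(0, len(object_data), SEGMENT_SIZE):
            segment = object_data[off:off + SEGMENT_SIZE]
            yield ec.encode(segment)   # k + m fragments for this segment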
|
|
|
|
|
|
|
|
|
| |
If the tool used to send headers doesn't support empty headers (older
versions of curl), X-Remove headers can be used to remove metadata.
The sync-key and sync-to metadata used by container-sync can now be
removed using X-Remove headers.
Change-Id: I0edb4d5425a99d20a973aa4fceaf9af6c2ddecc0
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an account was not found, ContainerController would
return 404 unconditionally for a container GET or HEAD request,
without checking that the request was authorized.
This patch modifies the GETorHEAD method to first call any
callback method registered under 'swift.authorize' in the
request environ and prefer any response from that over the 404.
Closes-Bug: 1415957
Change-Id: I4f41fd9e445238e14af74b6208885d83698cc08d
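A sketch of the GETorHEAD change described (illustrative, not the
literal Swift code; account_found and req are assumed names):

    from swift.common.swob import HTTPNotFound

    if not account_found:
        if 'swift.authorize' in req.environ:
            aresp = req.environ['swift.authorize'](req)
            if aresp:
                # prefer the auth middleware's response (e.g. 401/403) to a 404
                return aresp
        return HTTPNotFound(request=req)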
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The keystoneauth middleware supports cross-tenant access
control using the syntax <tenant>:<user> in container ACLs,
where <tenant> and <user> may currently be either a unique
id or a name. As a result of the keystone v3 API introducing
domains, names are no longer globally unique and are only
unique within a domain. The use of unqualified tenant and
user names in this ACL syntax is therefore not 'safe' in a
keystone v3 environment.
This patch modifies keystoneauth to restrict cross-tenant
ACL matching to use only ids for accounts that are not in
the default domain. For backwards compatibility,
names will still be matched in ACLs when both the requesting
user and tenant are known to be in the default domain AND the
account's tenant is also in the default domain (the default
domain being the domain to which existing tenants are
migrated).
Accounts existing prior to this patch are assumed to be for
tenants in the default domain. New accounts created using a
v2 token scoped on the tenant are also assumed to be in the
default domain. New accounts created using a v3 token scoped
on the tenant will learn their domain membership from the
token info. New accounts created using any unscoped token,
(i.e. with a reselleradmin role) will have unknown domain
membership and therefore be assumed to NOT be in the default
domain.
Despite this provision for backwards compatibility, names
must no longer be used when setting new ACLs in any account,
including new accounts in the default domain.
This change obviously impacts users accustomed to specifying
cross-tenant ACLs in terms of names, and further work will be
necessary to restore those use cases. Some ideas are
discussed under the bug report. With that caveat, this patch
removes the reported vulnerability when using
swift/keystoneauth with a keystone v3 API.
Note: to observe the new 'restricted' behaviour you will need
to setup keystone user(s) and tenant(s) in a non-default domain
and set auth_version = v3.0 in the auth_token middleware config
section of proxy-server.conf. You may also benefit from the
keystone v3 enabled swiftclient patch under review here:
https://review.openstack.org/#/c/91788/
DocImpact
blueprint keystone-v3-support
Closes-Bug: #1299146
Change-Id: Ib32df093f7450f704127da77ff06b595f57615cb
|