author | Jordan Cook <jordan.cook@pioneer.com> | 2022-06-14 00:10:22 -0500 |
---|---|---|
committer | Jordan Cook <jordan.cook@pioneer.com> | 2022-06-16 16:48:34 -0500 |
commit | bcccae4d7fd9c1a317b24e52a21ad0802548709b (patch) | |
tree | 5f5138db4a88c64d0ffe70704bcb080663475656 /docs | |
parent | 82a9bd50cded0ef92e0e882ee4b63d1c4e9e590d (diff) | |
download | requests-cache-bcccae4d7fd9c1a317b24e52a21ad0802548709b.tar.gz |
Update docs
Diffstat (limited to 'docs')
-rw-r--r-- | docs/user_guide.md | 6 |
-rw-r--r-- | docs/user_guide/headers.md | 1 |
-rw-r--r-- | docs/user_guide/matching.md | 116 |
-rw-r--r-- | docs/user_guide/security.md | 1 |
-rw-r--r-- | docs/user_guide/serializers.md | 9 |
-rw-r--r-- | docs/user_guide/troubleshooting.md | 14 |
6 files changed, 103 insertions, 44 deletions
diff --git a/docs/user_guide.md b/docs/user_guide.md index 4a38ea2..240b496 100644 --- a/docs/user_guide.md +++ b/docs/user_guide.md @@ -2,21 +2,21 @@ # {fa}`book` User Guide This section covers the main features of requests-cache. -## The Basics +## Basics ```{toctree} :maxdepth: 2 user_guide/installation user_guide/general +user_guide/backends user_guide/files user_guide/troubleshooting ``` -## Features & Options +## Advanced Features & Options ```{toctree} :maxdepth: 2 -user_guide/backends user_guide/filtering user_guide/headers user_guide/inspection diff --git a/docs/user_guide/headers.md b/docs/user_guide/headers.md index 7944352..a7e9441 100644 --- a/docs/user_guide/headers.md +++ b/docs/user_guide/headers.md @@ -76,3 +76,4 @@ The following headers are currently supported: - `Expires`: Used as an absolute expiration datetime - `ETag`: Validator used for conditional requests - `Last-Modified`: Validator used for conditional requests +- `Vary`: Used to indicate which request headers to match. See {ref}`matching-headers` for details. diff --git a/docs/user_guide/matching.md b/docs/user_guide/matching.md index c368b7d..055ad28 100644 --- a/docs/user_guide/matching.md +++ b/docs/user_guide/matching.md @@ -5,26 +5,18 @@ are normalized to account for any variations that do not modify response content There are some additional options to configure how you want requests to be matched. -## Matching Request Headers -In some cases, different headers may result in different response data, so you may want to cache -them separately. 
To enable this, use `match_headers`: -```python ->>> session = CachedSession(match_headers=True) ->>> # Both of these requests will be sent and cached separately ->>> session.get('https://httpbin.org/headers', {'Accept': 'text/plain'}) ->>> session.get('https://httpbin.org/headers', {'Accept': 'application/json'}) -``` - -If you only want to match specific headers and not others, you can provide them as a list: -```python ->>> session = CachedSession(match_headers=['Accept', 'Accept-Language']) -``` - (filter-params)= ## Selective Parameter Matching By default, all normalized request parameters are matched. In some cases, there may be request -parameters that don't affect the response data, for example authentication tokens or credentials. -If you want to ignore specific parameters, specify them with the `ignored_parameters` option. +parameters that you don't want to match. For example, an authentication token will change frequently +but not change response content. + +Use the `ignored_parameters` option if you want to ignore specific parameters. + +```{note} +Many common authentication parameters are already ignored by default. +See {ref}`default-filter-params` for details. +``` **Request Parameters:** @@ -49,7 +41,7 @@ This also applies to parameters in a JSON-formatted request body: **Request Headers:** -As well as headers, if `match_headers` is also used: +As well as headers, if `match_headers=True` is used: ```python >>> session = CachedSession(ignored_parameters=['auth-token'], match_headers=True) >>> session.get('https://httpbin.org/get', headers={'auth-token': '2F63E5DF4F44'}) @@ -60,10 +52,34 @@ As well as headers, if `match_headers` is also used: Since `ignored_parameters` is most often used for sensitive info like credentials, these values will also be removed from the cached request parameters, body, and headers. ``` +(matching-headers)= +## Matching Request Headers +```{note} +In some cases, request header values can affect response content. 
For example, sites that support +i18n and [content negotiation](https://developer.mozilla.org/en-US/docs/Web/HTTP/Content_negotiation) may use the `Accept-Language` header to determine which language to serve content in. + +The server will ideally also send a `Vary` header in the response, which informs caches about +which request headers to match. By default, requests-cache respects this, so in many cases it +will already do what you want without extra configuration. Not all servers send `Vary`, however. +``` + +Use the `match_headers` option if you want to specify which headers you want to match when `Vary` +isn't available: +```python +>>> session = CachedSession(match_headers=['Accept']) +>>> # These two requests will be sent and cached separately +>>> session.get('https://httpbin.org/headers', headers={'Accept': 'text/plain'}) +>>> session.get('https://httpbin.org/headers', headers={'Accept': 'application/json'}) +``` + +If you want to match _all_ request headers, you can use `match_headers=True`. + + (custom-matching)= ## Custom Request Matching If you need more advanced behavior, you can implement your own custom request matching. +### Cache Keys Request matching is accomplished using a **cache key**, which uniquely identifies a response in the cache based on request info. 
For example, the option `ignored_parameters=['foo']` works by excluding the `foo` request parameter from the cache key, meaning these three requests will all use the same @@ -76,6 +92,7 @@ cached response: >>> assert response_1.cache_key == response_2.cache_key == response_3.cache_key ``` +### Cache Key Functions If you want to implement your own request matching, you can provide a cache key function which will take a {py:class}`~requests.PreparedRequest` plus optional keyword args for {py:func}`~requests.request`, and return a string: @@ -84,28 +101,67 @@ def create_key(request: requests.PreparedRequest, **kwargs) -> str: """Generate a custom cache key for the given request""" ``` -`**kwargs` includes relevant {py:class}`.BaseCache` settings and any other keyword args passed to -{py:meth}`.CachedSession.send()`. See {py:func}`.create_key` for the reference implementation, and -see the rest of the {py:mod}`.cache_keys` module for some potentially useful helper functions. - You can then pass this function via the `key_fn` param: ```python session = CachedSession(key_fn=create_key) ``` -```{note} -`key_fn()` will be used **instead of** any other {ref}`matching` options and default matching behavior. +`**kwargs` includes relevant {py:class}`.BaseCache` settings and any other keyword args passed to +{py:meth}`.CachedSession.send()`. If you want to use a custom matching function _and_ the existing +options `ignored_parameters` and `match_headers`, you can implement them in `key_fn`: +```python +def create_key( + request: requests.PreparedRequest, + ignored_parameters: List[str] = None, + match_headers: List[str] = None, + **kwargs, +) -> str: + """Generate a custom cache key for the given request""" ``` + +See {py:func}`.create_key` for the reference implementation, and see the rest of the +{py:mod}`.cache_keys` module for some potentially useful helper functions. + + ```{tip} -See {ref}`Examples<custom_keys>` page for a complete example for custom request matching. 
+See {ref}`Examples<custom_keys>` for a complete example for custom request matching. ``` ```{tip} -As a general rule, if you include less info in your cache keys, you will have more cache hits and -use less storage space, but risk getting incorrect response data back. For example, if you exclude -all request parameters, you will get the same cached response back for any combination of request -parameters. +As a general rule, if you include less information in your cache keys, you will have more cache hits +and use less storage space, but risk getting incorrect response data back. ``` ```{warning} If you provide a custom key function for a non-empty cache, any responses previously cached with a -different key function will likely be unused. +different key function will be unused, so it's recommended to clear the cache first. +``` + +### Custom Header Normalization +When matching request headers (using `match_headers` or `Vary`), requests-cache will normalize minor +header variations like order, casing, whitespace, etc. In some cases, you may be able to further +optimize your requests with some additional header normalization. + +For example, let's say you're working with a site that supports content negotiation using the +`Accept-Encoding` header, and the only variation you care about is whether you requested gzip +encoding. 
This example will increase cache hits by ignoring variations you don't care about: +```python +from requests import PreparedRequest +from requests_cache import CachedSession, create_key + + +def create_custom_key(request: PreparedRequest, **kwargs) -> str: + # Don't modify the original request that's about to be sent + request = request.copy() + + # Simplify values like `Accept-Encoding: gzip, compress, br` to just `Accept-Encoding: gzip` + if 'gzip' in request.headers.get('Accept-Encoding', ''): + request.headers['Accept-Encoding'] = 'gzip' + else: + request.headers['Accept-Encoding'] = None + + # Use the default key function to do the rest of the work + return create_key(request, **kwargs) + + +# Provide your custom request matcher when creating the session +session = CachedSession(key_fn=create_custom_key) ``` diff --git a/docs/user_guide/security.md b/docs/user_guide/security.md index 17cf380..adc09a5 100644 --- a/docs/user_guide/security.md +++ b/docs/user_guide/security.md @@ -66,6 +66,7 @@ Then, if you try to get that cached response again (*with* your key), you will g BadSignature: Signature b'iFNmzdUOSw5vqrR9Cb_wfI1EoZ8' does not match ``` +(default-filter-params)= ## Removing Sensitive Info The {ref}`ignored_parameters <filter-params>` option can be used to prevent credentials and other sensitive info from being saved to the cache. It applies to request parameters, body, and headers. diff --git a/docs/user_guide/serializers.md b/docs/user_guide/serializers.md index efeec10..865aa4b 100644 --- a/docs/user_guide/serializers.md +++ b/docs/user_guide/serializers.md @@ -15,7 +15,9 @@ Some of these serializers require additional dependencies, listed in the section Similar to {ref}`backends`, you can specify which serializer to use with the `serializer` parameter for either {py:class}`.CachedSession` or {py:func}`.install_cache`. 
-## JSON Serializer +## Built-in Serializers + +### JSON Serializer Storing responses as JSON gives you the benefit of making them human-readable and editable, in exchange for a minor reduction in read and write speeds. @@ -43,7 +45,7 @@ This will use [ultrajson](https://github.com/ultrajson/ultrajson) if installed, pip install requests-cache[json] ``` -## YAML Serializer +### YAML Serializer YAML is another option if you need a human-readable/editable format, with the same tradeoffs as JSON. Usage: @@ -69,7 +71,7 @@ You can install the extra dependencies for this serializer with: pip install requests-cache[yaml] ``` -## BSON Serializer +### BSON Serializer [BSON](https://www.mongodb.com/json-and-bson) is a serialization format originally created for MongoDB, but it can also be used independently. Compared to JSON, it has better performance (although still not as fast as `pickle`), and adds support for additional data types. It is not @@ -100,7 +102,6 @@ human-readable/editable. Other content types will be saved as binary data. To sa >>> session = CachedSession('http_cache', backend=backend) ``` - ## Serializer Security See {ref}`security` for recommended setup steps for more secure cache serialization, particularly when using {py:mod}`pickle`. 
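Reviewer note on the `matching.md` changes above: the cache-key idea they describe (drop ignored parameters, include only the headers selected for matching, then hash what remains) can be illustrated with a small standard-library sketch. This is not the requests-cache implementation; `sketch_cache_key` and its arguments are hypothetical names chosen for illustration, and the normalization rules shown (sorted query parameters, case-insensitive header names) are assumptions.

```python
import hashlib
from typing import Iterable, Mapping, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def sketch_cache_key(
    method: str,
    url: str,
    headers: Optional[Mapping[str, str]] = None,
    ignored_parameters: Iterable[str] = (),
    match_headers: Iterable[str] = (),
) -> str:
    """Hypothetical helper: hash only the request details that should affect matching."""
    ignored = set(ignored_parameters)
    ignored_lower = {name.lower() for name in ignored}
    parts = urlsplit(url)

    # Drop ignored query parameters and sort the rest so that ordering doesn't matter
    params = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored
    )
    normalized_url = urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(params), '')
    )

    # Include only the headers selected for matching; header names are case-insensitive
    lowered = {name.lower(): value.strip() for name, value in (headers or {}).items()}
    matched = sorted(
        (name.lower(), lowered.get(name.lower(), ''))
        for name in match_headers
        if name.lower() not in ignored_lower
    )

    raw = repr((method.upper(), normalized_url, matched))
    return hashlib.sha256(raw.encode('utf-8')).hexdigest()


# Requests that differ only in an ignored parameter share one key
key_1 = sketch_cache_key('GET', 'https://example.com/get?foo=1&bar=2', ignored_parameters=['foo'])
key_2 = sketch_cache_key('GET', 'https://example.com/get?bar=2&foo=99', ignored_parameters=['foo'])
print(key_1 == key_2)
```

Including less information in the key gives more cache hits but a higher risk of returning the wrong response, which is the trade-off the tip in `matching.md` describes.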
diff --git a/docs/user_guide/troubleshooting.md b/docs/user_guide/troubleshooting.md index fc5154f..d4e412b 100644 --- a/docs/user_guide/troubleshooting.md +++ b/docs/user_guide/troubleshooting.md @@ -31,8 +31,8 @@ logging.basicConfig( ) ``` -If you have other libraries installed with verbose debug logging, you can configure only the loggers -you want with `logger.setLevel()`: +If you have other libraries installed that have verbose debug logging, you can configure only the +loggers you want with `logger.setLevel()`: ```python import logging @@ -65,17 +65,17 @@ Here are some error messages you may see either in the logs or (more rarely) in * **`Unable to deserialize response with key {cache key}`:** This usually means that a response was previously cached in a format that isn't compatible with the - current version of requests-cache or one of its dependencies. It could also be the result of switching {ref}`serializers`. + current version of requests-cache or one of its dependencies. * This message is to help with debugging and can generally be ignored. If you prefer, you can - either {py:meth}`~.BaseCache.clear` the cache or {py:meth}`~.BaseCache.remove_expired_responses` - to get rid of the invalid responses. + either {py:meth}`~.BaseCache.remove` the invalid responses or {py:meth}`~.BaseCache.clear` the + entire cache. * **`Request for URL {url} failed; using cached response`:** This is just a notification that the - {ref}`stale_if_error <request-errors>` option is working as intended + {ref}`stale_if_error <request-errors>` option is working as intended. * **{py:exc}`~requests.RequestException`:** These are general request errors not specific to requests-cache. See `requests` documentation on [Errors and Exceptions](https://2.python-requests.org/en/master/user/quickstart/#errors-and-exceptions) for more details. -* **{py:exc}`ModuleNotFoundError`**: `No module named 'requests_cache.core'`: This module was deprecated in `v0.6` and removed in `v0.8`. 
Just import from `requests_cache` instead of `requests_cache.core`. +* **{py:exc}`ModuleNotFoundError`**: `No module named 'requests_cache.core'`: This module was deprecated in `v0.6` and removed in `v0.8`. Please import from `requests_cache` instead of `requests_cache.core`. * **{py:exc}`ImportError`:** Indicates a missing required or optional dependency. * If you see this at **import time**, it means that one or more **required** dependencies are not installed