author: Jordan Cook <jordan.cook@pioneer.com> 2021-04-03 13:33:22 -0500
committer: Jordan Cook <jordan.cook@pioneer.com> 2021-04-03 15:30:01 -0500
commit: 177e65644253667c2e0827ded6e3e16aa89a317d
tree: 70bae7d8aa892f6e7b375dabcf677be43119bee5
parent: 8854ae6982aeca12349536bcecf16eb0a8973c45
Make Readme more concise again, and split main usage docs into 'Quickstart' (Readme), 'User Guide', and 'Advanced Usage' sections

* Add more details and formatting to changelog
* Add some more reference links to classes, methods, and functions mentioned in docs
-rw-r--r-- CONTRIBUTING.md             2
-rw-r--r-- HISTORY.md                 95
-rw-r--r-- README.md                 180
-rw-r--r-- docs/advanced_usage.rst   233
-rw-r--r-- docs/api.rst                6
-rw-r--r-- docs/contributing.rst       2
-rw-r--r-- docs/index.rst             11
-rw-r--r-- docs/related_projects.rst  17
-rw-r--r-- docs/user_guide.rst       281
9 files changed, 480 insertions, 347 deletions
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 8417d97..d1698af 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -11,7 +11,7 @@ If there is a new feature you would like to see, the best way to make that happe
for it!
## Bug Reports & Feedback
-If you discover a bug, want to propose new feature, or have other feedback about requests-cache, please
+If you discover a bug, want to propose a new feature, or have other feedback about requests-cache, please
[create an issue](https://github.com/reclosedev/requests-cache/issues/new/choose)!
## Project Discussion
diff --git a/HISTORY.md b/HISTORY.md
index 230eec0..4c9bc77 100644
--- a/HISTORY.md
+++ b/HISTORY.md
@@ -3,23 +3,32 @@
## 0.6.0 (2021-04-TBD)
[See all included issues and PRs](https://github.com/reclosedev/requests-cache/milestone/1?closed=1)
-### General
-* Drop support for python <= 3.5
-* Add `CacheMixin` class to make the features of `CachedSession` usable as a mixin class,
- for compatibility with other `requests`-based libraries
-* Add `CachedResponse` class to wrapped cached `requests.Response` objects,
- which makes additional cache information available to client code
+### Serialization
+**Note:** Due to the following changes, responses cached with previous versions of requests-cache
+will be invalid. These **old responses will be treated as expired**, and will be refreshed the
+next time they are requested. They can also be manually converted or removed, if needed (see notes below).
+
+* Add [example script](https://github.com/reclosedev/requests-cache/blob/master/examples/convert_cache.py)
+ to convert an existing cache from the previous serialization format to the new one
+* When running `remove_expired_responses()`, also remove responses that are invalid due to updated
+ serialization format
+* Add `CachedResponse` class to wrap cached `requests.Response` objects, which makes additional
+ cache information available to client code
+* Add `CachedHTTPResponse` class to wrap `urllib3.response.HTTPResponse` objects, available via `CachedResponse.raw`
+ * Re-construct the raw response on demand to avoid storing extra data in the cache
+ * Improve emulation of raw request behavior used for iteration, streaming requests, etc.
* Add `BaseCache.urls` property to get all URLs persisted in the cache
* Add optional support for `itsdangerous` for more secure serialization
-* Add `HEAD` to default `allowable_methods`
-* Remove invalid responses when running `remove_expired_responses()` (in case an update in
- requests-cache or one of its dependencies breaks backwards-compatibility with old cache data)
-* Handle additional edge cases with request normalization for cache keys (to avoid duplicate cached responses)
### Cache Expiration
+* Cached responses are now stored with an absolute expiration time, so `CachedSession.expire_after`
+ no longer applies retroactively. To revalidate previously cached items with a new expiration time,
+ see below:
+* Add support for overriding original expiration (i.e., revalidating) in `CachedSession.remove_expired_responses()`
* Add support for setting expiration for individual requests
* Add support for setting expiration based on URL glob patterns
-* Add support for overriding original expiration (i.e., revalidating) in `CachedSession.remove_expired_responses()`
+* Add support for setting expiration as a `datetime`
+* Add support for explicitly disabling expiration with `-1` (since `None` may be ambiguous in some cases)
### Backends
* SQLite: Allow passing user paths (`~/path-to-cache`) to database file with `db_path` param
@@ -29,6 +38,8 @@
### Bugfixes
* Fix caching requests with data specified in `json` parameter
* Fix caching requests with `verify` parameter
+* Fix duplicate cached responses due to some unhandled variations in URL format
+ * To support this, the `url-normalize` library has been added to dependencies
* Fix usage of backend-specific params when used in place of `cache_name`
* Fix potential TypeError with `DbPickleDict` initialization
* Fix usage of `CachedSession.cache_disabled` if used within another contextmanager
@@ -37,19 +48,28 @@
requests-cache is not installed
* Update usage of deprecated MongoClient `save()` method
+### General
+* Drop support for python <= 3.5
+* Add `CacheMixin` class to make the features of `CachedSession` usable as a mixin class,
+ for [compatibility with other requests-based libraries](https://requests-cache.readthedocs.io/en/latest/advanced_usage.html#library-compatibility).
+* Add `HEAD` to default `allowable_methods`
+
### Docs & Tests
* Add type annotations to main functions/methods in public API, and include in documentation on
[readthedocs](https://requests-cache.readthedocs.io/en/latest/)
* Add [Contributing Guide](https://requests-cache.readthedocs.io/en/latest/contributing.html),
[Security](https://requests-cache.readthedocs.io/en/latest/security.html) info,
- and more examples & detailed usage info in an
- [Advanced Usage](https://requests-cache.readthedocs.io/en/latest/advanced_usage.html#) section.
-* Increased test coverage, and added containerized backends for both local and CI integration testing
-
-## 0.5.2 (2019-08-14)
+ and more examples & detailed usage info in
+ [User Guide](https://requests-cache.readthedocs.io/en/latest/user_guide.html) and
+ [Advanced Usage](https://requests-cache.readthedocs.io/en/latest/advanced_usage.html) sections.
+* Increase test coverage and rewrite most tests using pytest
+* Add containerized backends for both local and CI integration testing
+
+-----
+### 0.5.2 (2019-08-14)
* Fix DeprecationWarning from collections #140
-## 0.5.1 (2019-08-13)
+### 0.5.1 (2019-08-13)
* Remove Python 2.6 Testing from travis #133
* Fix DeprecationWarning from collections #131
* vacuum the sqlite database after clearing a table #134
@@ -65,52 +85,51 @@ Project is now added to [Code Shelter](https://www.codeshelter.co)
* Fix remove_expired_responses missed in __init__.py #93
* Fix deprecation warnings #122, thanks to mbarkhau
-## 0.4.13 (2016-12-23)
+-----
+### 0.4.13 (2016-12-23)
* Support PyMongo3, thanks to @craigls #72
* Fix streaming related issue #68
-## 0.4.12 (2016-03-19)
+### 0.4.12 (2016-03-19)
* Fix ability to pass backend instance in `install_cache` #61
-
-## 0.4.11 (2016-03-07)
+### 0.4.11 (2016-03-07)
* `ignore_parameters` feature, thanks to @themiurgo and @YetAnotherNerd (#52, #55)
* More informative message for missing backend dependencies, thanks to @Garrett-R (#60)
-## 0.4.10 (2015-04-28)
+### 0.4.10 (2015-04-28)
* Better transactional handling in sqlite #50, thanks to @rgant
* Compatibility with streaming in requests >= 2.6.x
-## 0.4.9 (2015-01-17)
+### 0.4.9 (2015-01-17)
* `expire_after` now also accepts `timedelta`, thanks to @femtotrader
* Added Ability to include headers to cache key (`include_get_headers` option)
* Added string representation for `CachedSession`
-## 0.4.8 (2014-12-13)
+### 0.4.8 (2014-12-13)
* Fix bug in reading cached streaming response
-## 0.4.7 (2014-12-06)
+### 0.4.7 (2014-12-06)
* Fix compatibility with Requests > 2.4.1 (json arg, response history)
-## 0.4.6 (2014-10-13)
+### 0.4.6 (2014-10-13)
* Monkey patch now uses class instead lambda (compatibility with rauth)
* Normalize (sort) parameters passed as builtin dict
-## 0.4.5 (2014-08-22)
+### 0.4.5 (2014-08-22)
* Requests==2.3.0 compatibility, thanks to @gwillem
-## 0.4.4 (2013-10-31)
+### 0.4.4 (2013-10-31)
* Check for backend availability in install_cache(), not at the first request
* Default storage fallbacks to memory if `sqlite` is not available
-## 0.4.3 (2013-09-12)
+### 0.4.3 (2013-09-12)
* Fix `response.from_cache` not set in hooks
-## 0.4.2 (2013-08-25)
+### 0.4.2 (2013-08-25)
* Fix `UnpickleableError` for gzip responses
-
-## 0.4.1 (2013-08-19)
+### 0.4.1 (2013-08-19)
* `requests_cache.enabled()` context manager
* Compatibility with Requests 1.2.3 cookies handling
@@ -118,28 +137,30 @@ Project is now added to [Code Shelter](https://www.codeshelter.co)
* Redis backend. Thanks to @michaelbeaumont
* Fix for changes in Requests 1.2.0 hooks dispatching
-
+-----
## 0.3.0 (2013-02-24)
* Support for `Requests` 1.x.x
* `CachedSession`
* Many backward incompatible changes
-## 0.2.1 (2013-01-13)
+-----
+### 0.2.1 (2013-01-13)
* Fix broken PyPi package
## 0.2.0 (2013-01-12)
* Last backward compatible version for `Requests` 0.14.2
-## 0.1.3 (2012-05-04)
+-----
+### 0.1.3 (2012-05-04)
* Thread safety for default `sqlite` backend
* Take into account the POST parameters when cache is configured
with 'POST' in `allowable_methods`
-## 0.1.2 (2012-05-02)
+### 0.1.2 (2012-05-02)
* Reduce number of `sqlite` database write operations
* `fast_save` option for `sqlite` backend
-## 0.1.1 (2012-04-11)
+### 0.1.1 (2012-04-11)
* Fix: restore responses from response.history
* Internal refactoring (`MemoryCache` -> `BaseCache`, `reduce_response`
and `restore_response` moved to `BaseCache`)
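
The "Cache Expiration" entries in the HISTORY.md hunks above mention absolute expiration times, per-request and per-URL settings, `datetime` values, and `-1` for "never expire". The following is a minimal standard-library sketch of how those rules could fit together. It is not the library's implementation: `to_absolute_expiration` and `resolve_expire_after` are hypothetical helper names, and the precedence order (per-request, then per-URL pattern, then per-session) is taken from the documentation text that this commit moves out of `docs/advanced_usage.rst`.

```python
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatch
from urllib.parse import urlsplit


def to_absolute_expiration(expire_after, now=None):
    """Convert an expire_after value to an absolute datetime (None means 'never expires').

    Hypothetical helper for illustration; accepts seconds, -1, a timedelta, or a datetime.
    """
    now = now or datetime.now(timezone.utc)
    if expire_after is None or expire_after == -1:
        return None
    if isinstance(expire_after, datetime):
        return expire_after
    if isinstance(expire_after, timedelta):
        return now + expire_after
    return now + timedelta(seconds=expire_after)


def resolve_expire_after(url, request_expire_after=None, urls_expire_after=None,
                         session_expire_after=-1):
    """Pick the expiration that applies to one request (hypothetical helper)."""
    # 1. A per-request value overrides everything else
    if request_expire_after is not None:
        return request_expire_after
    # 2. The first matching URL glob pattern wins, in definition order
    parts = urlsplit(url)
    base_url = parts.netloc + parts.path  # patterns match the URL without its scheme
    for pattern, value in (urls_expire_after or {}).items():
        if fnmatch(base_url, pattern.rstrip('*') + '*'):
            return value
    # 3. Otherwise fall back to the per-session value
    return session_expire_after


patterns = {'*.site_1.com': 30, 'site_2.com/resource_1': 60 * 2, 'site_2.com/static': -1}
now = datetime(2021, 4, 3, tzinfo=timezone.utc)
value = resolve_expire_after('https://api.site_1.com/data', urls_expire_after=patterns)
expires = to_absolute_expiration(value, now)  # 30 seconds after `now`
```

Because the expiration is resolved to an absolute time when the response is stored, changing the session-level value later does not affect items that are already cached, which is the behavior change the changelog calls out.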
diff --git a/README.md b/README.md
index ae81fac..6a50e97 100644
--- a/README.md
+++ b/README.md
@@ -8,164 +8,70 @@
[![Code Shelter](https://www.codeshelter.co/static/badges/badge-flat.svg)](https://www.codeshelter.co/)
## Summary
-**requests-cache** is a transparent persistent HTTP cache for the python [requests](http://python-requests.org)
-library. It is especially useful for web scraping, consuming REST APIs, slow or rate-limited
-sites, or any other scenario in which you're making lots of requests that are likely to be sent
-more than once.
-
-Several storage backends are included: **SQLite**, **Redis**, **MongoDB**, and **DynamoDB**.
+**requests-cache** is a transparent, persistent HTTP cache for the python [requests](http://python-requests.org)
+library. It's a convenient tool to use with web scraping, consuming REST APIs, slow or rate-limited
+sites, or any other scenario in which you're making lots of requests that are expensive and/or
+likely to be sent more than once.
See full project documentation at: https://requests-cache.readthedocs.io
-## Installation
-Install with pip:
+## Features
+* **Ease of use:** Use as a [drop-in replacement](https://requests-cache.readthedocs.io/en/latest/api.html#sessions)
+ for `requests.Session`, or [install globally](https://requests-cache.readthedocs.io/en/latest/user_guide.html#patching)
+ to add caching to all `requests` functions
+* **Customization:** Works out of the box with zero config, but with plenty of options available
+ for customizing cache
+ [expiration](https://requests-cache.readthedocs.io/en/latest/user_guide.html#cache-expiration)
+ and other [behavior](https://requests-cache.readthedocs.io/en/latest/user_guide.html#cache-options)
+* **Persistence:** Includes several [storage backends](https://requests-cache.readthedocs.io/en/latest/user_guide.html#cache-backends):
+ SQLite, Redis, MongoDB, and DynamoDB.
+* **Compatibility:** Can be used alongside
+ [other popular libraries based on requests](https://requests-cache.readthedocs.io/en/latest/advanced_usage.html#library-compatibility)
+
+# Quickstart
+First, install with pip:
```bash
pip install requests-cache
```
-**Requirements:**
-* Requires python 3.6+.
-* You may need additional dependencies depending on which backend you want to use. To install with
- extra dependencies for all supported backends:
-
- ```bash
- pip install requests-cache[backends]
- ```
-
-**Optional Setup Steps:**
-* See [Security](https://requests-cache.readthedocs.io/en/latest/security.html) for recommended
- setup steps for more secure cache serialization.
-* See [Contributing Guide](https://requests-cache.readthedocs.io/en/latest/contributing.html)
- for setup info for local development.
-
-## General Usage
-There are two main ways of using requests-cache:
-* [Sessions](https://requests-cache.readthedocs.io/en/latest/api.html#sessions):
- Use `requests_cache.CachedSession` in place of
- [requests.Session](https://requests.readthedocs.io/en/master/user/advanced/#session-objects) (recommended)
-* [Patching](https://requests-cache.readthedocs.io/en/latest/api.html#patching):
- Globally patch `requests` using `requests_cache.install_cache()`.
-
-### Sessions
-The `CachedSession` class is a drop-in replacement for `requests.Session` that adds caching features.
-
-Basic example:
-```python
-from requests_cache import CachedSession
-
-session = CachedSession('demo_cache', backend='sqlite')
-for i in range(100):
- session.get('http://httpbin.org/delay/1')
-```
-The URL in this example adds a delay of 1 second, but all 100 requests will complete in just over 1
-second. The response will be fetched once, saved to `demo_cache.sqlite`, and subsequent requests
-will return the cached response near-instantly.
+Next, use [requests_cache.CachedSession](https://requests-cache.readthedocs.io/en/latest/api.html#sessions)
+to send and cache requests. To quickly demonstrate how to use it:
-### Patching
-Using `requests_cache.install_cache()` will add caching to all `requests` functions:
+**This takes ~1 minute:**
```python
import requests
-import requests_cache
-requests_cache.install_cache()
-requests.get('http://httpbin.org/get')
session = requests.Session()
-session.get('http://httpbin.org/get')
-```
-
-`install_cache()` takes all the same parameters as `CachedSession`. It can be temporarily disabled
-with `disabled()`, and completely removed with `uninstall_cache()`:
-```python
-# Neither of these requests will use the cache
-with requests_cache.disabled():
- requests.get('http://httpbin.org/get')
-
-requests_cache.uninstall_cache()
-requests.get('http://httpbin.org/get')
-```
-
-**Limitations:**
-
-Like any other utility that uses global patching, there are some scenarios where you won't want to
-use this:
-* In a multi-threaded or multiprocess application
-* In an application that uses other packages that extend or modify `requests.Session`
-* In a package that will be used by other packages or applications
-
-### Cache Backends
-Several [cache backends](https://requests-cache.readthedocs.io/en/latest/modules/requests_cache.backends.html)
-are included, which can be selected with the `backend` parameter to `CachedSession` or `install_cache()`:
-
-* `'memory'` : Not persistent, just stores responses with an in-memory dict
-* `'sqlite'` : [SQLite](https://www.sqlite.org) database (**default**)
-* `'redis'` : [Redis](https://redis.io/) cache (requires `redis`)
-* `'mongodb'` : [MongoDB](https://www.mongodb.com/) database (requires `pymongo`)
-* `'dynamodb'` : [Amazon DynamoDB](https://aws.amazon.com/dynamodb/) database (requires `boto3`)
-
-### Cache Expiration
-By default, cached responses will be stored indefinitely. There are a number of ways you can handle
-cache expiration. The simplest is using the `expire_after` param with a value in seconds:
-```python
-# Expire after 30 seconds
-session = CachedSession(expire_after=30)
+for i in range(60):
+ session.get('http://httpbin.org/delay/1')
```
-Or a `timedelta`:
+**This takes ~1 second:**
```python
-from datetime import timedelta
-
-# Expire after 30 days
-session = CachedSession(expire_after=timedelta(days=30))
-```
+import requests_cache
-You can also set expiration on a per-request basis, which will override any session settings:
-```python
-# Expire after 6 minutes
-session.get('http://httpbin.org/get', expire_after=360)
+session = requests_cache.CachedSession('demo_cache')
+for i in range(60):
+ session.get('http://httpbin.org/delay/1')
```
-If a per-session expiration is set but you want to temporarily disable it, use `-1`:
-```python
-# Never expire
-session.get('http://httpbin.org/get', expire_after=-1)
-```
+The URL in this example adds a delay of 1 second, simulating a slow or rate-limited website.
+With caching, the response will be fetched once, saved to `demo_cache.sqlite`, and subsequent
+requests will return the cached response near-instantly.
-For better performance, expired responses won't be removed immediately, but will be removed
-(or replaced) the next time they are accessed. To manually clear all expired responses:
+If you don't want to manage a session object, requests-cache can also be installed globally:
```python
-session.remove_expired_responses()
-```
-Or, when using patching:
-```python
-requests_cache.remove_expired_responses()
+requests_cache.install_cache('demo_cache')
+requests.get('http://httpbin.org/delay/1')
```
-Or, to revalidate the cache with a new expiration:
-```python
-session.remove_expired_responses(expire_after=360)
-```
+## Next Steps
+To find out more about what you can do with requests-cache, see:
-## More Features & Examples
-* You can find a working example at Real Python:
+* The
+ [User Guide](https://requests-cache.readthedocs.io/en/latest/user_guide.html) and
+ [Advanced Usage](https://requests-cache.readthedocs.io/en/latest/advanced_usage.html) sections
+* A working example at Real Python:
[Caching External API Requests](https://realpython.com/blog/python/caching-external-api-requests)
-* There are some additional examples in the [examples/](https://github.com/reclosedev/requests-cache/tree/master/examples) folder
-* See [Advanced Usage](https://requests-cache.readthedocs.io/en/latest/advanced_usage.html) for
- details on customizing cache behavior and other features beyond the basics.
-
-## Related Projects
-If `requests-cache` isn't quite what you need, you can help make it better! See the
-[Contributing Guide](https://requests-cache.readthedocs.io/en/latest/contributing.html)
-for details.
-
-You can also check out these other python cache projects:
-
-* [CacheControl](https://github.com/ionrock/cachecontrol): An HTTP cache for `requests` that caches
- according to HTTP headers
-* [diskcache](https://github.com/grantjenks/python-diskcache): A general-purpose (not HTTP-specific)
- file-based cache built on SQLite
-* [aiohttp-client-cache](https://github.com/JWCook/aiohttp-client-cache): An async HTTP cache for
- `aiohttp`, based on `requests-cache`
-* [aiohttp-cache](https://github.com/cr0hn/aiohttp-cache): A server-side async HTTP cache for the
- `aiohttp` web server
-* [aiocache](https://github.com/aio-libs/aiocache): General-purpose (not HTTP-specific) async cache
- backends
+* More examples in the
+ [examples/](https://github.com/reclosedev/requests-cache/tree/master/examples) folder
diff --git a/docs/advanced_usage.rst b/docs/advanced_usage.rst
index ba25c67..310d7e3 100644
--- a/docs/advanced_usage.rst
+++ b/docs/advanced_usage.rst
@@ -2,52 +2,17 @@
Advanced Usage
==============
+This section covers some more advanced and use-case-specific features.
+
.. contents::
:local:
-CachedSession Options
----------------------
-See :py:class:`.CachedSession` for a full list of parameters.
-
-Cache Name
-~~~~~~~~~~
-The ``cache_name`` parameter will be used as follows depending on the backend:
-
-* ``sqlite``: Cache filename, e.g ``my_cache.sqlite``
-* ``dynamodb``: Table name
-* ``mongodb`` and ``gridfs``: Database name
-* ``redis``: Namespace, meaning all keys will be prefixed with ``'cache_name:'``
-
-Cache Keys
-~~~~~~~~~~
-The cache key is a hash created from request information, and is used as an index for cached
-responses. There are a couple ways you can customize what information is used to create this key:
-
-* Use ``include_get_headers`` if you want headers to be included in the cache key. In other
- words, this will create separate cache items for responses with different headers.
-* Use ``ignored_parameters`` to exclude specific request params from the cache key. This is
- useful, for example, if you request the same resource with different credentials or access
- tokens.
-
-HTTP methods and status codes
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-You can choose which request HTTP methods and response status codes you want to cache using the
-parameters ``allowable_methods`` and ``allowable_codes``, respectively. By default, only GET and HEAD
-requests and 200 responses are cached. Example:
-
- >>> from requests_cache import CachedSession
- >>>
- >>> session = CachedSession(
- >>> allowable_methods=('GET', 'POST'),
- >>> allowable_codes=(200, 418),
- >>> )
-
-Custom response filter
-~~~~~~~~~~~~~~~~~~~~~~
-If you need more advanced behaviour for determining what to cache, you can provide a custom filtering
-function via the ``filter_fn`` param. This function that takes a :py:class:`requests.Response` object
-and returns a boolean indicating whether or not that response should be cached. It will be applied to
-both new responses (on write) and previously cached responses (on read). Example:
+Custom Response Filtering
+-------------------------
+If you need more advanced behavior for determining what to cache, you can provide a custom filtering
+function via the ``filter_fn`` param. This can be any function that takes a :py:class:`requests.Response`
+object and returns a boolean indicating whether or not that response should be cached. It will be applied
+to both new responses (on write) and previously cached responses (on read). Example:
>>> from sys import getsizeof
>>> from requests_cache import CachedSession
@@ -58,48 +23,6 @@ both new responses (on write) and previously cached responses (on read). Example
>>>
>>> session = CachedSession(filter_fn=filter_by_size)
-Cache Expiration
-~~~~~~~~~~~~~~~~
-Use ``expire_after`` to specify how long responses will be cached. This can be:
-
-* A positive number (in seconds)
-* ``-1`` (to never expire)
-* A :py:class:`~datetime.timedelta`
-* A :py:class:`~datetime.datetime`
-
-This will only apply to responses cached in the current session; to apply a different expiration
-to previously cached responses, see :py:meth:`remove_expired_responses`.
-
-Expiration can also be set on a per-URL or per request basis. The following order of precedence
-is used:
-
-1. Per-request expiration (``expire_after`` argument for :py:meth:`.CachedSession.request`)
-2. Per-URL expiration (``urls_expire_after`` argument for ``CachedSession``)
-3. Per-session expiration (``expire_after`` argument for ``CachedSession``)
-
-URL Patterns
-~~~~~~~~~~~~
-You can use ``urls_expire_after`` to set different expiration times for different requests, based on
-URL glob patterns. This allows you to customize caching based on what you know about the resources
-you're requesting. For example, you might request one resource that gets updated frequently, another
-that changes infrequently, and another that never changes. Example:
-
- >>> urls_expire_after = {
- >>> '*.site_1.com': 30,
- >>> 'site_2.com/resource_1': 60 * 2,
- >>> 'site_2.com/resource_2': 60 * 60 * 24,
- >>> 'site_2.com/static': -1,
- >>> }
-
-**Notes:**
-
-* ``urls_expire_after`` should be a dict in the format ``{'pattern': expire_after}``
-* ``expire_after`` accepts the same types as ``CachedSession.expire_after``
-* Patterns will match request **base URLs**, so the pattern ``site.com/resource/`` is equivalent to
- ``http*://site.com/resource/**``
-* If there is more than one match, the first match will be used in the order they are defined
-* If no patterns match a request, ``expire_after`` will be used as a default.
-
Cache Inspection
----------------
Here are some ways to get additional information out of the cache session, backend, and responses:
@@ -108,8 +31,8 @@ Response Attributes
~~~~~~~~~~~~~~~~~~~
The following attributes are available on responses:
* ``from_cache``: indicates if the response came from the cache
-* ``created_at``: ``datetime`` of when the cached response was created or last updated
-* ``expires``: ``datetime`` after which the cached response will expire
+* ``created_at``: :py:class:`~datetime.datetime` of when the cached response was created or last updated
+* ``expires``: :py:class:`~datetime.datetime` after which the cached response will expire
* ``is_expired``: indicates if the cached response is expired (if an old response was returned due to a request error)
Examples:
@@ -151,8 +74,8 @@ responses they redirect to.
Custom Backends
---------------
-If the built-in :py:mod:`Cache Backends <requests_cache.backends>` don't suit your needs and you want to create your own, you can create
-subclasses of :py:class:`.BaseCache` and :py:class:`.BaseStorage`:
+If the built-in :py:mod:`Cache Backends <requests_cache.backends>` don't suit your needs, you can create your own by
+making subclasses of :py:class:`.BaseCache` and :py:class:`.BaseStorage`:
>>> from requests_cache import CachedSession
>>> from requests_cache.backends import BaseCache, BaseStorage
@@ -167,7 +90,7 @@ subclasses of :py:class:`.BaseCache` and :py:class:`.BaseStorage`:
>>> class MyStorage(BaseStorage):
>>> """Lower-level backend storage operations"""
-You can then use your custom backend in a ``CachedSession`` with the ``backend`` parameter:
+You can then use your custom backend in a :py:class:`.CachedSession` with the ``backend`` parameter:
>>> session = CachedSession(backend=MyCache())
@@ -204,62 +127,62 @@ Streaming Requests
If you use `streaming requests <https://2.python-requests.org/en/master/user/advanced/#id9>`_, you
can use the same code to iterate over both cached and non-cached requests. A cached request will,
of course, have already been read, but will use a file-like object containing the content.
-Example::
-
- from requests_cache import CachedSession
+Example:
- session = CachedSession()
- for i in range(2):
- r = session.get('https://httpbin.org/stream/20', stream=True)
- for chunk in r.iter_lines():
- print(chunk.decode('utf-8'))
+ >>> from requests_cache import CachedSession
+ >>>
+ >>> session = CachedSession()
+ >>> for i in range(2):
+ ... r = session.get('https://httpbin.org/stream/20', stream=True)
+ ... for chunk in r.iter_lines():
+ ... print(chunk.decode('utf-8'))
.. _library_compatibility:
Usage with other requests-based libraries
-----------------------------------------
-This library works by patching and/or extending ``requests.Session``. Many other libraries out there
+This library works by patching and/or extending :py:class:`requests.Session`. Many other libraries out there
do the same thing, making it potentially difficult to combine them. For that scenario, a mixin class
-is provided, so you can create a custom class with behavior from multiple Session-modifying libraries::
+is provided, so you can create a custom class with behavior from multiple Session-modifying libraries:
- from requests import Session
- from requests_cache import CacheMixin
- from some_other_lib import SomeOtherMixin
-
- class CustomSession(CacheMixin, SomeOtherMixin ClientSession):
- """Session class with features from both requests-html and requests-cache"""
+ >>> from requests import Session
+ >>> from requests_cache import CacheMixin
+ >>> from some_other_lib import SomeOtherMixin
+ >>>
+ >>> class CustomSession(CacheMixin, SomeOtherMixin, Session):
+ ... """Session class with features from both requests-html and requests-cache"""
Requests-HTML
~~~~~~~~~~~~~
-Example with `requests-html <https://github.com/psf/requests-html>`_::
-
- import requests
- from requests_cache import CacheMixin, install_cache
- from requests_html import HTMLSession
-
- class CachedHTMLSession(CacheMixin, HTMLSession):
- """Session with features from both CachedSession and HTMLSession"""
+Example with `requests-html <https://github.com/psf/requests-html>`_:
- session = CachedHTMLSession()
- r = session.get("https://github.com/")
- print(r.from_cache, r.html.links)
+ >>> import requests
+ >>> from requests_cache import CacheMixin, install_cache
+ >>> from requests_html import HTMLSession
+ >>>
+ >>> class CachedHTMLSession(CacheMixin, HTMLSession):
+ ... """Session with features from both CachedSession and HTMLSession"""
+ >>>
+ >>> session = CachedHTMLSession()
+ >>> r = session.get('https://github.com/')
+ >>> print(r.from_cache, r.html.links)
-Or, using the monkey-patch method::
+Or, using the monkey-patch method:
- install_cache(session_factory=CachedHTMLSession)
- r = requests.get("https://github.com/")
- print(r.from_cache, r.html.links)
+ >>> install_cache(session_factory=CachedHTMLSession)
+ >>> r = requests.get('https://github.com/')
+ >>> print(r.from_cache, r.html.links)
-The same approach can be used with other libraries that subclass ``requests.Session``.
+The same approach can be used with other libraries that subclass :py:class:`requests.Session`.
Requests-futures
~~~~~~~~~~~~~~~~
Example with `requests-futures <https://github.com/ross/requests-futures>`_:
-Some libraries, including `requests-futures`, support wrapping an existing session object.
+Some libraries, including ``requests-futures``, support wrapping an existing session object:
- session = FutureSession(session=CachedSession())
+ >>> session = FuturesSession(session=CachedSession())
In this case, ``FuturesSession`` must wrap ``CachedSession`` rather than the other way around, since
``FuturesSession`` returns (as you might expect) futures rather than response objects.
@@ -271,44 +194,36 @@ Example with `requests-mock <https://github.com/jamielennox/requests-mock>`_:
Requests-mock works a bit differently. It has multiple methods of mocking requests, and the
method most compatible with requests-cache is attaching its
-`adapter <https://requests-mock.readthedocs.io/en/latest/adapter.html>`_ to a CachedSession::
-
- import requests
- from requests_mock import Adapter
- from requests_cache import CachedSession
-
- # Set up a CachedSession that will make mock requests where it would normally make real requests
- adapter = Adapter()
- adapter.register_uri(
- 'GET',
- 'mock://some_test_url',
- headers={'Content-Type': 'text/plain'},
- text='mock response',
- status_code=200,
- )
- session = CachedSession()
- session.mount('mock://', adapter)
-
- session.get('mock://some_test_url', text='mock_response')
- response = session.get('mock://some_test_url')
- print(response.text)
+`adapter <https://requests-mock.readthedocs.io/en/latest/adapter.html>`_ to a CachedSession:
+
+ >>> import requests
+ >>> from requests_mock import Adapter
+ >>> from requests_cache import CachedSession
+ >>>
+ >>> # Set up a CachedSession that will make mock requests where it would normally make real requests
+ >>> adapter = Adapter()
+ >>> adapter.register_uri(
+ ... 'GET',
+ ... 'mock://some_test_url',
+ ... headers={'Content-Type': 'text/plain'},
+ ... text='mock response',
+ ... status_code=200,
+ ... )
+ >>> session = CachedSession()
+ >>> session.mount('mock://', adapter)
+ >>>
+ >>> session.get('mock://some_test_url')
+ >>> response = session.get('mock://some_test_url')
+ >>> print(response.text)
Internet Archive
~~~~~~~~~~~~~~~~
Example with `internetarchive <https://github.com/jjjake/internetarchive>`_:
-Usage is the same as other libraries that subclass `requests.Session`::
-
- from requests_cache import CacheMixin
- from internetarchive.session import ArchiveSession
+Usage is the same as other libraries that subclass :py:class:`requests.Session`:
- class CachedArchiveSession(CacheMixin, ArchiveSession):
- """Session with features from both CachedSession and ArchiveSession"""
-
-Potential Issues
-----------------
-* Version updates of ``requests``, ``urllib3`` or ``requests-cache`` itself may not be compatible with
- previously cached data (see issues `#56 <https://github.com/reclosedev/requests-cache/issues/56>`_
- and `#102 <https://github.com/reclosedev/requests-cache/issues/102>`_).
- The best way to prevent this is to use a virtualenv and pin your dependency versions.
-* See :ref:`security` for notes on serialization security
+ >>> from requests_cache import CacheMixin
+ >>> from internetarchive.session import ArchiveSession
+ >>>
+ >>> class CachedArchiveSession(CacheMixin, ArchiveSession):
+ ... """Session with features from both CachedSession and ArchiveSession"""
diff --git a/docs/api.rst b/docs/api.rst
index 2602292..b45c73f 100644
--- a/docs/api.rst
+++ b/docs/api.rst
@@ -1,9 +1,9 @@
.. Note: backend module docs are auto-generated with apidoc; the remaining modules are manually
added here for more custom formatting.
-API
-===
-This section covers all the public interfaces of ``requests-cache``
+API Reference
+=============
+This section covers all the public interfaces of requests-cache.
.. contents:: Table of Contents
:depth: 2
diff --git a/docs/contributing.rst b/docs/contributing.rst
index 4fc5016..a0ad0a8 100644
--- a/docs/contributing.rst
+++ b/docs/contributing.rst
@@ -1 +1,3 @@
+.. _contributing:
+
.. mdinclude:: ../CONTRIBUTING.md
diff --git a/docs/index.rst b/docs/index.rst
index 2b49f96..892ff8a 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -2,22 +2,23 @@
.. Include Readme contents (except for the link to readthedocs, since we're already on readthedocs!)
.. mdinclude:: ../README.md
- :end-line: 17
-
-.. _general-usage:
+ :end-line: 15
.. mdinclude:: ../README.md
- :start-line: 19
+ :start-line: 17
+Contents
+========
.. toctree::
- :caption: Contents
:maxdepth: 2
+ user_guide
advanced_usage
security
api
contributing
contributors
+ related_projects
history
Indices and tables
diff --git a/docs/related_projects.rst b/docs/related_projects.rst
new file mode 100644
index 0000000..5ceb81b
--- /dev/null
+++ b/docs/related_projects.rst
@@ -0,0 +1,17 @@
+Related Projects
+================
+If requests-cache isn't quite what you need, you can help make it better! See the
+:ref:`Contributing Guide <contributing>` for details.
+
+You can also check out these other Python cache projects:
+
+* `CacheControl <https://github.com/ionrock/cachecontrol>`_: An HTTP cache for ``requests`` that caches
+ according to HTTP headers
+* `diskcache <https://github.com/grantjenks/python-diskcache>`_: A general-purpose (not HTTP-specific)
+ file-based cache built on SQLite
+* `aiohttp-client-cache <https://github.com/JWCook/aiohttp-client-cache>`_: An async HTTP cache for
+ ``aiohttp``, based on ``requests-cache``
+* `aiohttp-cache <https://github.com/cr0hn/aiohttp-cache>`_: A server-side async HTTP cache for the
+ ``aiohttp`` web server
+* `aiocache <https://github.com/aio-libs/aiocache>`_: General-purpose (not HTTP-specific) async cache
+ backends
diff --git a/docs/user_guide.rst b/docs/user_guide.rst
index f556a63..9f539cd 100644
--- a/docs/user_guide.rst
+++ b/docs/user_guide.rst
@@ -1,7 +1,278 @@
-:orphan:
-
-.. This is just a placeholder in case anyone had this page bookmarked
-
User Guide
==========
-This page has moved. Please see :ref:`index:general usage` and :ref:`advanced_usage` sections.
+This section covers the main features of requests-cache.
+
+.. contents::
+ :local:
+ :depth: 2
+
+Installation
+------------
+Install with pip:
+
+ $ pip install requests-cache
+
+Requirements
+~~~~~~~~~~~~
+* Requires Python 3.6+.
+* You may need additional dependencies depending on which backend you want to use. To install with
+ extra dependencies for all supported backends:
+
+ $ pip install requests-cache[backends]
+
+Optional Setup Steps
+~~~~~~~~~~~~~~~~~~~~
+* See :ref:`security` for recommended setup steps for more secure cache serialization.
+* See :ref:`Contributing Guide <contributing:dev installation>` for setup steps for local development.
+
+General Usage
+-------------
+There are two main ways of using requests-cache:
+
+* **Sessions:** (recommended) Use :py:class:`.CachedSession` to send your requests
+* **Patching:** Globally patch ``requests`` using :py:func:`.install_cache()`
+
+Sessions
+~~~~~~~~
+:py:class:`.CachedSession` can be used as a drop-in replacement for :py:class:`requests.Session`.
+Basic usage looks like this:
+
+ >>> from requests_cache import CachedSession
+ >>>
+ >>> session = CachedSession()
+ >>> for i in range(60):
+ ... session.get('http://httpbin.org/delay/1')
+
+Any :py:class:`requests.Session` method can be used (but see :ref:`user_guide:http methods` section
+below for config details):
+
+ >>> session.request('GET', 'http://httpbin.org/get')
+ >>> session.head('http://httpbin.org/get')
+
+Caching can be temporarily disabled with :py:meth:`.CachedSession.cache_disabled`:
+
+ >>> with session.cache_disabled():
+ ... session.get('http://httpbin.org/get')
+
+The best way to clean up your cache is through :ref:`user_guide:cache expiration`, but you can also
+clear out everything with :py:meth:`.BaseCache.clear`:
+
+ >>> session.cache.clear()
+
+Patching
+~~~~~~~~
+In some situations, it may not be possible or convenient to manage your own session object. In those
+cases, you can use :py:func:`.install_cache` to add caching to all ``requests`` functions:
+
+ >>> import requests
+ >>> import requests_cache
+ >>>
+ >>> requests_cache.install_cache()
+ >>> requests.get('http://httpbin.org/get')
+
+As well as session methods:
+
+ >>> session = requests.Session()
+ >>> session.get('http://httpbin.org/get')
+
+:py:func:`.install_cache` accepts all the same parameters as :py:class:`.CachedSession`:
+
+ >>> requests_cache.install_cache(expire_after=360, allowable_methods=('GET', 'POST'))
+
+It can be temporarily :py:func:`.enabled`:
+
+ >>> with requests_cache.enabled():
+ ... requests.get('http://httpbin.org/get') # Will be cached
+
+Or temporarily :py:func:`.disabled`:
+
+ >>> requests_cache.install_cache()
+ >>> with requests_cache.disabled():
+ ... requests.get('http://httpbin.org/get') # Will not be cached
+
+Or completely removed with :py:func:`.uninstall_cache`:
+
+ >>> requests_cache.uninstall_cache()
+ >>> requests.get('http://httpbin.org/get')
+
+You can also clear out all responses in the cache with :py:func:`.clear`, and check if
+requests-cache is currently installed with :py:func:`.is_installed`.
+
+Limitations
+^^^^^^^^^^^
+Like any other utility that uses global patching, there are some scenarios where you won't want to
+use :py:func:`.install_cache`:
+
+* In a multi-threaded or multiprocess application
+* In an application that uses other packages that extend or modify :py:class:`requests.Session`
+* In a package that will be used by other packages or applications
+
+Cache Backends
+--------------
+Several cache backends are included, which can be selected with
+the ``backend`` parameter for either :py:class:`.CachedSession` or :py:func:`.install_cache`:
+
+* ``'sqlite'``: `SQLite <https://www.sqlite.org>`_ database (**default**)
+* ``'redis'``: `Redis <https://redis.io>`_ cache (requires ``redis``)
+* ``'mongodb'``: `MongoDB <https://www.mongodb.com>`_ database (requires ``pymongo``)
+* ``'gridfs'``: `GridFS <https://docs.mongodb.com/manual/core/gridfs/>`_ collections on a MongoDB database (requires ``pymongo``)
+* ``'dynamodb'``: `Amazon DynamoDB <https://aws.amazon.com/dynamodb>`_ database (requires ``boto3``)
+* ``'memory'``: A non-persistent cache that just stores responses in memory
+
+A backend can be specified either by name, class or instance:
+
+ >>> from requests_cache.backends import RedisCache
+ >>> from requests_cache import CachedSession
+ >>>
+ >>> # Backend name
+ >>> session = CachedSession(backend='redis', namespace='my-cache')
+
+ >>> # Backend class
+ >>> session = CachedSession(backend=RedisCache, namespace='my-cache')
+
+ >>> # Backend instance
+ >>> session = CachedSession(backend=RedisCache(namespace='my-cache'))
+
+See :py:mod:`requests_cache.backends` for more backend-specific usage details, and see
+:ref:`advanced_usage:custom backends` for details on creating your own implementation.
+
+Cache Name
+~~~~~~~~~~
+The ``cache_name`` parameter will be used as follows depending on the backend:
+
+* ``sqlite``: Database path, e.g. ``~/.cache/my_cache.sqlite``
+* ``dynamodb``: Table name
+* ``mongodb`` and ``gridfs``: Database name
+* ``redis``: Namespace, meaning all keys will be prefixed with ``'<cache_name>:'``
+
+Cache Options
+-------------
+A number of options are available to modify which responses are cached and how they are cached.
+
+HTTP Methods
+~~~~~~~~~~~~
+By default, only GET and HEAD requests are cached. To cache additional HTTP methods, specify them
+with ``allowable_methods``. For example, caching POST requests can be used to ensure you don't send
+the same data multiple times:
+
+ >>> session = CachedSession(allowable_methods=('GET', 'POST'))
+ >>> session.post('http://httpbin.org/post', json={'param': 'value'})
+
+Status Codes
+~~~~~~~~~~~~
+By default, only responses with a 200 status code are cached. To cache additional status codes,
+specify them with ``allowable_codes``:
+
+ >>> session = CachedSession(allowable_codes=(200, 418))
+ >>> session.get('http://httpbin.org/teapot')
+
+Request Parameters
+~~~~~~~~~~~~~~~~~~
+By default, all request parameters are taken into account when caching responses. In some cases,
+there may be request parameters that don't affect the response data, for example authentication tokens
+or credentials. If you want to ignore specific parameters, specify them with ``ignored_parameters``:
+
+ >>> session = CachedSession(ignored_parameters=['auth-token'])
+ >>> # Only the first request will be sent
+ >>> session.get('http://httpbin.org/get', params={'auth-token': '2F63E5DF4F44'})
+ >>> session.get('http://httpbin.org/get', params={'auth-token': 'D9FAEB3449D3'})
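To see why the second request above is served from the cache, it may help to picture the cache key being built from the request with the ignored parameters removed. The following is only a rough sketch of that idea using the standard library; ``make_cache_key`` is a hypothetical helper written for illustration, not the function requests-cache uses internally.

```python
from hashlib import sha256
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def make_cache_key(method, url, params=None, ignored_parameters=()):
    """Hypothetical helper: build a cache key that skips any ignored parameters."""
    parts = urlsplit(url)
    # Combine parameters from the URL's query string with those passed separately
    all_params = parse_qsl(parts.query) + list((params or {}).items())
    # Drop ignored parameters, and sort the rest so their order doesn't affect the key
    kept = sorted((k, v) for k, v in all_params if k not in ignored_parameters)
    normalized_url = urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ''))
    return sha256(f'{method.upper()} {normalized_url}'.encode()).hexdigest()


key_1 = make_cache_key(
    'GET', 'http://httpbin.org/get', {'auth-token': '2F63E5DF4F44'}, ignored_parameters=['auth-token']
)
key_2 = make_cache_key(
    'GET', 'http://httpbin.org/get', {'auth-token': 'D9FAEB3449D3'}, ignored_parameters=['auth-token']
)
```

Under these assumptions both requests produce the same key, so only the first one would need to be sent.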
+
+Request Headers
+~~~~~~~~~~~~~~~
+By default, request headers are not taken into account when caching responses. In some cases,
+different headers may result in different response data, so you may want to cache them separately.
+To enable this, use ``include_get_headers``:
+
+ >>> session = CachedSession(include_get_headers=True)
+ >>> # Both of these requests will be sent and cached separately
+ >>> session.get('http://httpbin.org/headers', headers={'Accept': 'text/plain'})
+ >>> session.get('http://httpbin.org/headers', headers={'Accept': 'application/json'})
+
+Cache Expiration
+----------------
+By default, cached responses will be stored indefinitely. You can initialize the cache with an
+``expire_after`` value to specify how long responses will be cached.
+
+Expiration Types
+~~~~~~~~~~~~~~~~
+``expire_after`` can be any of the following:
+
+* ``-1`` (to never expire)
+* A positive number (in seconds)
+* A :py:class:`~datetime.timedelta`
+* A :py:class:`~datetime.datetime`
+
+Examples:
+
+ >>> # Set expiration for the session using a value in seconds
+ >>> session = CachedSession(expire_after=360)
+
+ >>> # To specify a different unit of time, use a timedelta
+ >>> from datetime import timedelta
+ >>> session = CachedSession(expire_after=timedelta(days=30))
+
+ >>> # Update an existing session to disable expiration (i.e., store indefinitely)
+ >>> session.expire_after = -1
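One way to think about the four accepted types is that each can be reduced to either "never expires" or an absolute expiration time. The sketch below shows that reduction with the standard library only; ``to_expiration_datetime`` is a hypothetical name used for illustration, and the actual conversion inside requests-cache may differ.

```python
from datetime import datetime, timedelta


def to_expiration_datetime(expire_after, now=None):
    """Hypothetical helper: convert an expire_after value to an absolute datetime.

    Returns None for values that mean "never expire".
    """
    if now is None:
        now = datetime.now()
    if expire_after is None or expire_after == -1:
        return None
    if isinstance(expire_after, datetime):
        return expire_after  # Already an absolute time
    if isinstance(expire_after, timedelta):
        return now + expire_after
    return now + timedelta(seconds=expire_after)  # A plain number of seconds


now = datetime(2021, 4, 3, 12, 0, 0)
never = to_expiration_datetime(-1, now)
in_six_minutes = to_expiration_datetime(360, now)
in_thirty_days = to_expiration_datetime(timedelta(days=30), now)
```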
+
+Expiration Scopes
+~~~~~~~~~~~~~~~~~
+Passing ``expire_after`` to :py:class:`.CachedSession` will set the expiration for the duration of that session.
+Expiration can also be set on a per-URL or per-request basis. The following order of precedence
+is used:
+
+1. Per-request expiration (``expire_after`` argument for :py:meth:`.CachedSession.request`)
+2. Per-URL expiration (``urls_expire_after`` argument for :py:class:`.CachedSession`)
+3. Per-session expiration (``expire_after`` argument for :py:class:`.CachedSession`)
+
+To set expiration for a single request:
+
+ >>> session.get('http://httpbin.org/get', expire_after=360)
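The order of precedence listed above amounts to "use the most specific value that was set". A minimal sketch, assuming ``None`` means "not set" at the per-request and per-URL levels (``resolve_expiration`` is a hypothetical helper, not part of the requests-cache API):

```python
def resolve_expiration(per_request=None, per_url=None, per_session=-1):
    """Hypothetical helper: return the most specific expiration value that was set."""
    for value in (per_request, per_url):
        if value is not None:
            return value
    # Fall back to the session-level setting (-1 means never expire)
    return per_session


# Per-request wins over per-URL, which wins over per-session
request_level = resolve_expiration(per_request=360, per_url=60, per_session=-1)
url_level = resolve_expiration(per_url=60, per_session=-1)
session_level = resolve_expiration(per_session=-1)
```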
+
+URL Patterns
+~~~~~~~~~~~~
+You can use ``urls_expire_after`` to set different expiration values for different requests, based on
+URL glob patterns. This allows you to customize caching based on what you know about the resources
+you're requesting. For example, you might request one resource that gets updated frequently, another
+that changes infrequently, and another that never changes. Example:
+
+ >>> urls_expire_after = {
+ ... '*.site_1.com': 30,
+ ... 'site_2.com/resource_1': 60 * 2,
+ ... 'site_2.com/resource_2': 60 * 60 * 24,
+ ... 'site_2.com/static': -1,
+ ... }
+ >>> session = CachedSession(urls_expire_after=urls_expire_after)
+
+**Notes:**
+
+* ``urls_expire_after`` should be a dict in the format ``{'pattern': expire_after}``
+* ``expire_after`` accepts the same types as ``CachedSession.expire_after``
+* Patterns will match request **base URLs**, so the pattern ``site.com/resource/`` is equivalent to
+ ``http*://site.com/resource/**``
+* If more than one pattern matches, the first match is used, in the order the patterns are defined
+* If no patterns match a request, ``CachedSession.expire_after`` will be used as a default.
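The matching rules in these notes can be approximated with glob matching from the standard library. This is a simplified sketch rather than the library's own implementation: ``match_url_expiration`` is a hypothetical helper, it relies on dicts preserving insertion order (Python 3.7+), and it treats each pattern as a prefix of the base URL.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def match_url_expiration(url, urls_expire_after, default=-1):
    """Hypothetical helper: return the expiration for the first pattern matching a URL."""
    parts = urlsplit(url)
    base_url = parts.netloc + parts.path  # Drop the scheme, query string, and fragment
    for pattern, expire_after in urls_expire_after.items():
        # Ignore any scheme in the pattern, and match everything below the given path
        pattern = pattern.split('://')[-1].rstrip('*') + '*'
        if fnmatchcase(base_url, pattern):
            return expire_after
    # No pattern matched, so fall back to the session-level default
    return default


urls_expire_after = {
    '*.site_1.com': 30,
    'site_2.com/resource_1': 60 * 2,
    'site_2.com/resource_2': 60 * 60 * 24,
    'site_2.com/static': -1,
}
frequent = match_url_expiration('https://api.site_1.com/data', urls_expire_after)
static = match_url_expiration('https://site_2.com/static/logo.png', urls_expire_after)
unmatched = match_url_expiration('https://example.com/', urls_expire_after, default=360)
```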
+
+Removing Expired Responses
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+For better performance, expired responses won't be removed immediately, but will be removed
+(or replaced) the next time they are requested. To manually clear all expired responses, use
+:py:meth:`.CachedSession.remove_expired_responses`:
+
+ >>> session.remove_expired_responses()
+
+Or, when using patching:
+
+ >>> requests_cache.remove_expired_responses()
+
+You can also apply a different ``expire_after`` to previously cached responses, which will
+revalidate the cache with the new expiration time:
+
+ >>> session.remove_expired_responses(expire_after=timedelta(days=30))
+
+Potential Issues
+----------------
+* Version updates of ``requests``, ``urllib3`` or ``requests-cache`` itself may not be compatible with
+ previously cached data (see issues `#56 <https://github.com/reclosedev/requests-cache/issues/56>`_
+ and `#102 <https://github.com/reclosedev/requests-cache/issues/102>`_).
+ The best way to prevent this is to use a virtualenv and pin your dependency versions.
+* See :ref:`security` for notes on serialization security