summaryrefslogtreecommitdiff
path: root/doc/source/reference/random
diff options
context:
space:
mode:
Diffstat (limited to 'doc/source/reference/random')
-rw-r--r--doc/source/reference/random/bit_generators/index.rst50
-rw-r--r--doc/source/reference/random/bit_generators/mt19937.rst3
-rw-r--r--doc/source/reference/random/bit_generators/pcg64.rst3
-rw-r--r--doc/source/reference/random/bit_generators/pcg64dxsm.rst32
-rw-r--r--doc/source/reference/random/bit_generators/philox.rst3
-rw-r--r--doc/source/reference/random/bit_generators/sfc64.rst3
-rw-r--r--doc/source/reference/random/c-api.rst12
-rw-r--r--doc/source/reference/random/generator.rst17
-rw-r--r--doc/source/reference/random/index.rst30
-rw-r--r--doc/source/reference/random/legacy.rst3
-rw-r--r--doc/source/reference/random/new-or-different.rst22
-rw-r--r--doc/source/reference/random/parallel.rst13
-rw-r--r--doc/source/reference/random/performance.py15
-rw-r--r--doc/source/reference/random/performance.rst138
-rw-r--r--doc/source/reference/random/upgrading-pcg64.rst152
15 files changed, 371 insertions, 125 deletions
diff --git a/doc/source/reference/random/bit_generators/index.rst b/doc/source/reference/random/bit_generators/index.rst
index 315657172..c5c349806 100644
--- a/doc/source/reference/random/bit_generators/index.rst
+++ b/doc/source/reference/random/bit_generators/index.rst
@@ -15,10 +15,13 @@ Supported BitGenerators
The included BitGenerators are:
-* PCG-64 - The default. A fast generator that supports many parallel streams
- and can be advanced by an arbitrary amount. See the documentation for
- :meth:`~.PCG64.advance`. PCG-64 has a period of :math:`2^{128}`. See the `PCG
- author's page`_ for more details about this class of PRNG.
+* PCG-64 - The default. A fast generator that can be advanced by an arbitrary
+ amount. See the documentation for :meth:`~.PCG64.advance`. PCG-64 has
+ a period of :math:`2^{128}`. See the `PCG author's page`_ for more details
+ about this class of PRNG.
+* PCG-64 DXSM - An upgraded version of PCG-64 with better statistical
+ properties in parallel contexts. See :ref:`upgrading-pcg64` for more
+ information on these improvements.
* MT19937 - The standard Python BitGenerator. Adds a `MT19937.jumped`
function that returns a new generator with state as-if :math:`2^{128}` draws have
been made.
@@ -43,6 +46,7 @@ The included BitGenerators are:
MT19937 <mt19937>
PCG64 <pcg64>
+ PCG64DXSM <pcg64dxsm>
Philox <philox>
SFC64 <sfc64>
@@ -105,6 +109,44 @@ If you need to generate a good seed "offline", then ``SeedSequence().entropy``
or using ``secrets.randbits(128)`` from the standard library are both
convenient ways.
+If you need to run several stochastic simulations in parallel, best practice
+is to construct a random generator instance for each simulation.
+To make sure that the random streams have distinct initial states, you can use
+the `spawn` method of `~SeedSequence`. For instance, here we construct a list
+of 12 instances:
+
+.. code-block:: python
+
+ from numpy.random import PCG64, SeedSequence
+
+ # High quality initial entropy
+ entropy = 0x87351080e25cb0fad77a44a3be03b491
+ base_seq = SeedSequence(entropy)
+ child_seqs = base_seq.spawn(12) # a list of 12 SeedSequences
+ generators = [PCG64(seq) for seq in child_seqs]
+
+.. end_block
+
+
+An alternative way is to use the fact that a `~SeedSequence` can be initialized
+by a tuple of elements. Here we use a base entropy value and an integer
+``worker_id``
+
+.. code-block:: python
+
+ from numpy.random import PCG64, SeedSequence
+
+ # High quality initial entropy
+ entropy = 0x87351080e25cb0fad77a44a3be03b491
+ sequences = [SeedSequence((entropy, worker_id)) for worker_id in range(12)]
+ generators = [PCG64(seq) for seq in sequences]
+
+.. end_block
+
+Note that the sequences produced by the latter method will be distinct from
+those constructed via `~SeedSequence.spawn`.
+
+
.. autosummary::
:toctree: generated/
diff --git a/doc/source/reference/random/bit_generators/mt19937.rst b/doc/source/reference/random/bit_generators/mt19937.rst
index 71875db4e..d05ea7c6f 100644
--- a/doc/source/reference/random/bit_generators/mt19937.rst
+++ b/doc/source/reference/random/bit_generators/mt19937.rst
@@ -4,7 +4,8 @@ Mersenne Twister (MT19937)
.. currentmodule:: numpy.random
.. autoclass:: MT19937
- :exclude-members:
+ :members: __init__
+ :exclude-members: __init__
State
=====
diff --git a/doc/source/reference/random/bit_generators/pcg64.rst b/doc/source/reference/random/bit_generators/pcg64.rst
index edac4620b..889965f77 100644
--- a/doc/source/reference/random/bit_generators/pcg64.rst
+++ b/doc/source/reference/random/bit_generators/pcg64.rst
@@ -4,7 +4,8 @@ Permuted Congruential Generator (64-bit, PCG64)
.. currentmodule:: numpy.random
.. autoclass:: PCG64
- :exclude-members:
+ :members: __init__
+ :exclude-members: __init__
State
=====
diff --git a/doc/source/reference/random/bit_generators/pcg64dxsm.rst b/doc/source/reference/random/bit_generators/pcg64dxsm.rst
new file mode 100644
index 000000000..e37efa5d3
--- /dev/null
+++ b/doc/source/reference/random/bit_generators/pcg64dxsm.rst
@@ -0,0 +1,32 @@
+Permuted Congruential Generator (64-bit, PCG64 DXSM)
+----------------------------------------------------
+
+.. currentmodule:: numpy.random
+
+.. autoclass:: PCG64DXSM
+ :members: __init__
+ :exclude-members: __init__
+
+State
+=====
+
+.. autosummary::
+ :toctree: generated/
+
+ ~PCG64DXSM.state
+
+Parallel generation
+===================
+.. autosummary::
+ :toctree: generated/
+
+ ~PCG64DXSM.advance
+ ~PCG64DXSM.jumped
+
+Extending
+=========
+.. autosummary::
+ :toctree: generated/
+
+ ~PCG64DXSM.cffi
+ ~PCG64DXSM.ctypes
diff --git a/doc/source/reference/random/bit_generators/philox.rst b/doc/source/reference/random/bit_generators/philox.rst
index 8eba2d351..3c2fa4cc5 100644
--- a/doc/source/reference/random/bit_generators/philox.rst
+++ b/doc/source/reference/random/bit_generators/philox.rst
@@ -4,7 +4,8 @@ Philox Counter-based RNG
.. currentmodule:: numpy.random
.. autoclass:: Philox
- :exclude-members:
+ :members: __init__
+ :exclude-members: __init__
State
=====
diff --git a/doc/source/reference/random/bit_generators/sfc64.rst b/doc/source/reference/random/bit_generators/sfc64.rst
index d34124a33..8cb255bc1 100644
--- a/doc/source/reference/random/bit_generators/sfc64.rst
+++ b/doc/source/reference/random/bit_generators/sfc64.rst
@@ -4,7 +4,8 @@ SFC64 Small Fast Chaotic PRNG
.. currentmodule:: numpy.random
.. autoclass:: SFC64
- :exclude-members:
+ :members: __init__
+ :exclude-members: __init__
State
=====
diff --git a/doc/source/reference/random/c-api.rst b/doc/source/reference/random/c-api.rst
index a79da7a49..de403ce98 100644
--- a/doc/source/reference/random/c-api.rst
+++ b/doc/source/reference/random/c-api.rst
@@ -3,6 +3,8 @@ C API for random
.. currentmodule:: numpy.random
+.. versionadded:: 1.19.0
+
Access to various distributions below is available via Cython or C-wrapper
libraries like CFFI. All the functions accept a :c:type:`bitgen_t` as their
first argument. To access these from Cython or C, you must link with the
@@ -40,9 +42,9 @@ The functions are named with the following conventions:
- The functions without "standard" in their name require additional parameters
to describe the distributions.
-- ``zig`` in the name are based on a ziggurat lookup algorithm is used instead
- of calculating the ``log``, which is significantly faster. The non-ziggurat
- variants are used in corner cases and for legacy compatibility.
+- Functions with ``inv`` in their name are based on the slower inverse method
+ instead of a ziggurat lookup algorithm, which is significantly faster. The
+ non-ziggurat variants are used in corner cases and for legacy compatibility.
.. c:function:: double random_standard_uniform(bitgen_t *bitgen_state)
@@ -53,6 +55,8 @@ The functions are named with the following conventions:
.. c:function:: void random_standard_exponential_fill(bitgen_t *bitgen_state, npy_intp cnt, double *out)
+.. c:function:: void random_standard_exponential_inv_fill(bitgen_t *bitgen_state, npy_intp cnt, double *out)
+
.. c:function:: double random_standard_normal(bitgen_t* bitgen_state)
.. c:function:: void random_standard_normal_fill(bitgen_t *bitgen_state, npy_intp count, double *out)
@@ -69,6 +73,8 @@ The functions are named with the following conventions:
.. c:function:: void random_standard_exponential_fill_f(bitgen_t *bitgen_state, npy_intp cnt, float *out)
+.. c:function:: void random_standard_exponential_inv_fill_f(bitgen_t *bitgen_state, npy_intp cnt, float *out)
+
.. c:function:: float random_standard_normal_f(bitgen_t* bitgen_state)
.. c:function:: float random_standard_gamma_f(bitgen_t *bitgen_state, float shape)
diff --git a/doc/source/reference/random/generator.rst b/doc/source/reference/random/generator.rst
index 8706e1de2..7934be98a 100644
--- a/doc/source/reference/random/generator.rst
+++ b/doc/source/reference/random/generator.rst
@@ -15,7 +15,8 @@ can be changed by passing an instantized BitGenerator to ``Generator``.
.. autofunction:: default_rng
.. autoclass:: Generator
- :exclude-members:
+ :members: __init__
+ :exclude-members: __init__
Accessing the BitGenerator
==========================
@@ -70,13 +71,13 @@ By default, `Generator.permuted` returns a copy. To operate in-place with
`Generator.permuted`, pass the same array as the first argument *and* as
the value of the ``out`` parameter. For example,
- >>> rg = np.random.default_rng()
+ >>> rng = np.random.default_rng()
>>> x = np.arange(0, 15).reshape(3, 5)
>>> x
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
- >>> y = rg.permuted(x, axis=1, out=x)
+ >>> y = rng.permuted(x, axis=1, out=x)
>>> x
array([[ 1, 0, 2, 4, 3], # random
[ 6, 7, 8, 9, 5],
@@ -96,13 +97,13 @@ which dimension of the input array to use as the sequence. In the case of a
two-dimensional array, ``axis=0`` will, in effect, rearrange the rows of the
array, and ``axis=1`` will rearrange the columns. For example
- >>> rg = np.random.default_rng()
+ >>> rng = np.random.default_rng()
>>> x = np.arange(0, 15).reshape(3, 5)
>>> x
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
- >>> rg.permutation(x, axis=1)
+ >>> rng.permutation(x, axis=1)
array([[ 1, 3, 2, 0, 4], # random
[ 6, 8, 7, 5, 9],
[11, 13, 12, 10, 14]])
@@ -115,7 +116,7 @@ how `numpy.sort` treats it. Each slice along the given axis is shuffled
independently of the others. Compare the following example of the use of
`Generator.permuted` to the above example of `Generator.permutation`:
- >>> rg.permuted(x, axis=1)
+ >>> rng.permuted(x, axis=1)
array([[ 1, 0, 2, 4, 3], # random
[ 5, 7, 6, 9, 8],
[10, 14, 12, 13, 11]])
@@ -130,9 +131,9 @@ Shuffling non-NumPy sequences
a sequence that is not a NumPy array, it shuffles that sequence in-place.
For example,
- >>> rg = np.random.default_rng()
+ >>> rng = np.random.default_rng()
>>> a = ['A', 'B', 'C', 'D', 'E']
- >>> rg.shuffle(a) # shuffle the list in-place
+ >>> rng.shuffle(a) # shuffle the list in-place
>>> a
['B', 'D', 'A', 'E', 'C'] # random
diff --git a/doc/source/reference/random/index.rst b/doc/source/reference/random/index.rst
index 13ce7c40c..96cd47017 100644
--- a/doc/source/reference/random/index.rst
+++ b/doc/source/reference/random/index.rst
@@ -25,7 +25,7 @@ nep-0019-rng-policy.html>`_ for context on the updated random Numpy number
routines. The legacy `RandomState` random number routines are still
available, but limited to a single BitGenerator. See :ref:`new-or-different`
for a complete list of improvements and differences from the legacy
-``Randomstate``.
+``RandomState``.
For convenience and backward compatibility, a single `RandomState`
instance's methods are imported into the numpy.random namespace, see
@@ -84,10 +84,10 @@ different
.. code-block:: python
try:
- rg_integers = rg.integers
+ rng_integers = rng.integers
except AttributeError:
- rg_integers = rg.randint
- a = rg_integers(1000)
+ rng_integers = rng.randint
+ a = rng_integers(1000)
Seeds can be passed to any of the BitGenerators. The provided value is mixed
via `SeedSequence` to spread a possible sequence of seeds across a wider
@@ -97,8 +97,8 @@ is wrapped with a `Generator`.
.. code-block:: python
from numpy.random import Generator, PCG64
- rg = Generator(PCG64(12345))
- rg.standard_normal()
+ rng = Generator(PCG64(12345))
+ rng.standard_normal()
Here we use `default_rng` to create an instance of `Generator` to generate a
random float:
@@ -146,10 +146,10 @@ As a convenience NumPy provides the `default_rng` function to hide these
details:
>>> from numpy.random import default_rng
->>> rg = default_rng(12345)
->>> print(rg)
+>>> rng = default_rng(12345)
+>>> print(rng)
Generator(PCG64)
->>> print(rg.random())
+>>> print(rng.random())
0.22733602246716966
One can also instantiate `Generator` directly with a `BitGenerator` instance.
@@ -158,16 +158,16 @@ To use the default `PCG64` bit generator, one can instantiate it directly and
pass it to `Generator`:
>>> from numpy.random import Generator, PCG64
->>> rg = Generator(PCG64(12345))
->>> print(rg)
+>>> rng = Generator(PCG64(12345))
+>>> print(rng)
Generator(PCG64)
Similarly to use the older `MT19937` bit generator (not recommended), one can
instantiate it directly and pass it to `Generator`:
>>> from numpy.random import Generator, MT19937
->>> rg = Generator(MT19937(12345))
->>> print(rg)
+>>> rng = Generator(MT19937(12345))
+>>> print(rng)
Generator(MT19937)
What's New or Different
@@ -222,6 +222,9 @@ one of three ways:
* :ref:`independent-streams`
* :ref:`parallel-jumped`
+Users with a very large amount of parallelism will want to consult
+:ref:`upgrading-pcg64`.
+
Concepts
--------
.. toctree::
@@ -230,6 +233,7 @@ Concepts
generator
Legacy Generator (RandomState) <legacy>
BitGenerators, SeedSequences <bit_generators/index>
+ Upgrading PCG64 with PCG64DXSM <upgrading-pcg64>
Features
--------
diff --git a/doc/source/reference/random/legacy.rst b/doc/source/reference/random/legacy.rst
index 6cf4775b8..42437dbb6 100644
--- a/doc/source/reference/random/legacy.rst
+++ b/doc/source/reference/random/legacy.rst
@@ -48,7 +48,8 @@ using the state of the `RandomState`:
.. autoclass:: RandomState
- :exclude-members:
+ :members: __init__
+ :exclude-members: __init__
Seeding and State
=================
diff --git a/doc/source/reference/random/new-or-different.rst b/doc/source/reference/random/new-or-different.rst
index 6cab0f729..a81543926 100644
--- a/doc/source/reference/random/new-or-different.rst
+++ b/doc/source/reference/random/new-or-different.rst
@@ -58,18 +58,18 @@ And in more detail:
from numpy.random import Generator, PCG64
import numpy.random
- rg = Generator(PCG64())
- %timeit -n 1 rg.standard_normal(100000)
+ rng = Generator(PCG64())
+ %timeit -n 1 rng.standard_normal(100000)
%timeit -n 1 numpy.random.standard_normal(100000)
.. ipython:: python
- %timeit -n 1 rg.standard_exponential(100000)
+ %timeit -n 1 rng.standard_exponential(100000)
%timeit -n 1 numpy.random.standard_exponential(100000)
.. ipython:: python
- %timeit -n 1 rg.standard_gamma(3.0, 100000)
+ %timeit -n 1 rng.standard_gamma(3.0, 100000)
%timeit -n 1 numpy.random.standard_gamma(3.0, 100000)
@@ -94,9 +94,9 @@ And in more detail:
.. ipython:: python
- rg = Generator(PCG64(0))
- rg.random(3, dtype='d')
- rg.random(3, dtype='f')
+ rng = Generator(PCG64(0))
+ rng.random(3, dtype='d')
+ rng.random(3, dtype='f')
* Optional ``out`` argument that allows existing arrays to be filled for
select distributions
@@ -112,7 +112,7 @@ And in more detail:
.. ipython:: python
existing = np.zeros(4)
- rg.random(out=existing[:2])
+ rng.random(out=existing[:2])
print(existing)
* Optional ``axis`` argument for methods like `~.Generator.choice`,
@@ -121,9 +121,9 @@ And in more detail:
.. ipython:: python
- rg = Generator(PCG64(123456789))
+ rng = Generator(PCG64(123456789))
a = np.arange(12).reshape((3, 4))
a
- rg.choice(a, axis=1, size=5)
- rg.shuffle(a, axis=1) # Shuffle in-place
+ rng.choice(a, axis=1, size=5)
+ rng.shuffle(a, axis=1) # Shuffle in-place
a
diff --git a/doc/source/reference/random/parallel.rst b/doc/source/reference/random/parallel.rst
index 721584014..7f0207bde 100644
--- a/doc/source/reference/random/parallel.rst
+++ b/doc/source/reference/random/parallel.rst
@@ -88,10 +88,11 @@ territory ([2]_).
estimate the naive upper bound on a napkin and take comfort knowing
that the probability is actually lower.
-.. [2] In this calculation, we can ignore the amount of numbers drawn from each
- stream. Each of the PRNGs we provide has some extra protection built in
+.. [2] In this calculation, we can mostly ignore the amount of numbers drawn from each
+ stream. See :ref:`upgrading-pcg64` for the technical details about
+ `PCG64`. The other PRNGs we provide have some extra protection built in
that avoids overlaps if the `~SeedSequence` pools differ in the
- slightest bit. `PCG64` has :math:`2^{127}` separate cycles
+ slightest bit. `PCG64DXSM` has :math:`2^{127}` separate cycles
determined by the seed in addition to the position in the
:math:`2^{128}` long period for each cycle, so one has to both get on or
near the same cycle *and* seed a nearby position in the cycle.
@@ -150,12 +151,14 @@ BitGenerator, the size of the jump and the bits in the default unsigned random
are listed below.
+-----------------+-------------------------+-------------------------+-------------------------+
-| BitGenerator | Period | Jump Size | Bits |
+| BitGenerator | Period | Jump Size | Bits per Draw |
+=================+=========================+=========================+=========================+
-| MT19937 | :math:`2^{19937}` | :math:`2^{128}` | 32 |
+| MT19937 | :math:`2^{19937}-1` | :math:`2^{128}` | 32 |
+-----------------+-------------------------+-------------------------+-------------------------+
| PCG64 | :math:`2^{128}` | :math:`~2^{127}` ([3]_) | 64 |
+-----------------+-------------------------+-------------------------+-------------------------+
+| PCG64DXSM | :math:`2^{128}` | :math:`~2^{127}` ([3]_) | 64 |
++-----------------+-------------------------+-------------------------+-------------------------+
| Philox | :math:`2^{256}` | :math:`2^{128}` | 64 |
+-----------------+-------------------------+-------------------------+-------------------------+
diff --git a/doc/source/reference/random/performance.py b/doc/source/reference/random/performance.py
index 28a42eb0d..794142836 100644
--- a/doc/source/reference/random/performance.py
+++ b/doc/source/reference/random/performance.py
@@ -1,14 +1,13 @@
-from collections import OrderedDict
from timeit import repeat
import pandas as pd
import numpy as np
-from numpy.random import MT19937, PCG64, Philox, SFC64
+from numpy.random import MT19937, PCG64, PCG64DXSM, Philox, SFC64
-PRNGS = [MT19937, PCG64, Philox, SFC64]
+PRNGS = [MT19937, PCG64, PCG64DXSM, Philox, SFC64]
-funcs = OrderedDict()
+funcs = {}
integers = 'integers(0, 2**{bits},size=1000000, dtype="uint{bits}")'
funcs['32-bit Unsigned Ints'] = integers.format(bits=32)
funcs['64-bit Unsigned Ints'] = integers.format(bits=64)
@@ -26,10 +25,10 @@ rg = Generator({prng}())
"""
test = "rg.{func}"
-table = OrderedDict()
+table = {}
for prng in PRNGS:
print(prng)
- col = OrderedDict()
+ col = {}
for key in funcs:
t = repeat(test.format(func=funcs[key]),
setup.format(prng=prng().__class__.__name__),
@@ -38,7 +37,7 @@ for prng in PRNGS:
col = pd.Series(col)
table[prng().__class__.__name__] = col
-npfuncs = OrderedDict()
+npfuncs = {}
npfuncs.update(funcs)
npfuncs['32-bit Unsigned Ints'] = 'randint(2**32,dtype="uint32",size=1000000)'
npfuncs['64-bit Unsigned Ints'] = 'randint(2**64,dtype="uint64",size=1000000)'
@@ -54,7 +53,7 @@ for key in npfuncs:
col[key] = 1000 * min(t)
table['RandomState'] = pd.Series(col)
-columns = ['MT19937','PCG64','Philox','SFC64', 'RandomState']
+columns = ['MT19937', 'PCG64', 'PCG64DXSM', 'Philox', 'SFC64', 'RandomState']
table = pd.DataFrame(table)
order = np.log(table).mean().sort_values().index
table = table.T
diff --git a/doc/source/reference/random/performance.rst b/doc/source/reference/random/performance.rst
index 74dad4cc3..85855be59 100644
--- a/doc/source/reference/random/performance.rst
+++ b/doc/source/reference/random/performance.rst
@@ -5,9 +5,12 @@ Performance
Recommendation
**************
-The recommended generator for general use is `PCG64`. It is
-statistically high quality, full-featured, and fast on most platforms, but
-somewhat slow when compiled for 32-bit processes.
+
+The recommended generator for general use is `PCG64` or its upgraded variant
+`PCG64DXSM` for heavily-parallel use cases. They are statistically high quality,
+full-featured, and fast on most platforms, but somewhat slow when compiled for
+32-bit processes. See :ref:`upgrading-pcg64` for details on when heavy
+parallelism would indicate using `PCG64DXSM`.
`Philox` is fairly slow, but its statistical properties have
very high quality, and it is easy to get assuredly-independent stream by using
@@ -39,49 +42,48 @@ Integer performance has a similar ordering.
The pattern is similar for other, more complex generators. The normal
performance of the legacy `RandomState` generator is much
-lower than the other since it uses the Box-Muller transformation rather
-than the Ziggurat generator. The performance gap for Exponentials is also
+lower than the other since it uses the Box-Muller transform rather
+than the Ziggurat method. The performance gap for Exponentials is also
large due to the cost of computing the log function to invert the CDF.
-The column labeled MT19973 is used the same 32-bit generator as
-`RandomState` but produces random values using
-`Generator`.
+The column labeled MT19973 uses the same 32-bit generator as
+`RandomState` but produces random variates using `Generator`.
.. csv-table::
- :header: ,MT19937,PCG64,Philox,SFC64,RandomState
- :widths: 14,14,14,14,14,14
-
- 32-bit Unsigned Ints,3.2,2.7,4.9,2.7,3.2
- 64-bit Unsigned Ints,5.6,3.7,6.3,2.9,5.7
- Uniforms,7.3,4.1,8.1,3.1,7.3
- Normals,13.1,10.2,13.5,7.8,34.6
- Exponentials,7.9,5.4,8.5,4.1,40.3
- Gammas,34.8,28.0,34.7,25.1,58.1
- Binomials,25.0,21.4,26.1,19.5,25.2
- Laplaces,45.1,40.7,45.5,38.1,45.6
- Poissons,67.6,52.4,69.2,46.4,78.1
+ :header: ,MT19937,PCG64,PCG64DXSM,Philox,SFC64,RandomState
+ :widths: 14,14,14,14,14,14,14
+
+ 32-bit Unsigned Ints,3.3,1.9,2.0,3.3,1.8,3.1
+ 64-bit Unsigned Ints,5.6,3.2,2.9,4.9,2.5,5.5
+ Uniforms,5.9,3.1,2.9,5.0,2.6,6.0
+ Normals,13.9,10.8,10.5,12.0,8.3,56.8
+ Exponentials,9.1,6.0,5.8,8.1,5.4,63.9
+ Gammas,37.2,30.8,28.9,34.0,27.5,77.0
+ Binomials,21.3,17.4,17.6,19.3,15.6,21.4
+ Laplaces,73.2,72.3,76.1,73.0,72.3,82.5
+ Poissons,111.7,103.4,100.5,109.4,90.7,115.2
The next table presents the performance in percentage relative to values
generated by the legacy generator, ``RandomState(MT19937())``. The overall
performance was computed using a geometric mean.
.. csv-table::
- :header: ,MT19937,PCG64,Philox,SFC64
- :widths: 14,14,14,14,14
-
- 32-bit Unsigned Ints,101,121,67,121
- 64-bit Unsigned Ints,102,156,91,199
- Uniforms,100,179,90,235
- Normals,263,338,257,443
- Exponentials,507,752,474,985
- Gammas,167,207,167,231
- Binomials,101,118,96,129
- Laplaces,101,112,100,120
- Poissons,116,149,113,168
- Overall,144,192,132,225
+ :header: ,MT19937,PCG64,PCG64DXSM,Philox,SFC64
+ :widths: 14,14,14,14,14,14
+
+ 32-bit Unsigned Ints,96,162,160,96,175
+ 64-bit Unsigned Ints,97,171,188,113,218
+ Uniforms,102,192,206,121,233
+ Normals,409,526,541,471,684
+ Exponentials,701,1071,1101,784,1179
+ Gammas,207,250,266,227,281
+ Binomials,100,123,122,111,138
+ Laplaces,113,114,108,113,114
+ Poissons,103,111,115,105,127
+ Overall,159,219,225,174,251
.. note::
- All timings were taken using Linux on an i5-3570 processor.
+ All timings were taken using Linux on an AMD Ryzen 9 3900X processor.
Performance on different Operating Systems
******************************************
@@ -98,33 +100,33 @@ across tables.
64-bit Linux
~~~~~~~~~~~~
-=================== ========= ======= ======== =======
-Distribution MT19937 PCG64 Philox SFC64
-=================== ========= ======= ======== =======
-32-bit Unsigned Int 100 119.8 67.7 120.2
-64-bit Unsigned Int 100 152.9 90.8 213.3
-Uniforms 100 179.0 87.0 232.0
-Normals 100 128.5 99.2 167.8
-Exponentials 100 148.3 93.0 189.3
-**Overall** 100 144.3 86.8 180.0
-=================== ========= ======= ======== =======
+===================== ========= ======= =========== ======== =======
+Distribution MT19937 PCG64 PCG64DXSM Philox SFC64
+===================== ========= ======= =========== ======== =======
+32-bit Unsigned Ints 100 168 166 100 182
+64-bit Unsigned Ints 100 176 193 116 224
+Uniforms 100 188 202 118 228
+Normals 100 128 132 115 167
+Exponentials 100 152 157 111 168
+Overall 100 161 168 112 192
+===================== ========= ======= =========== ======== =======
64-bit Windows
~~~~~~~~~~~~~~
-The relative performance on 64-bit Linux and 64-bit Windows is broadly similar.
-
+The relative performance on 64-bit Linux and 64-bit Windows is broadly similar
+with the notable exception of the Philox generator.
-=================== ========= ======= ======== =======
-Distribution MT19937 PCG64 Philox SFC64
-=================== ========= ======= ======== =======
-32-bit Unsigned Int 100 129.1 35.0 135.0
-64-bit Unsigned Int 100 146.9 35.7 176.5
-Uniforms 100 165.0 37.0 192.0
-Normals 100 128.5 48.5 158.0
-Exponentials 100 151.6 39.0 172.8
-**Overall** 100 143.6 38.7 165.7
-=================== ========= ======= ======== =======
+===================== ========= ======= =========== ======== =======
+Distribution MT19937 PCG64 PCG64DXSM Philox SFC64
+===================== ========= ======= =========== ======== =======
+32-bit Unsigned Ints 100 155 131 29 150
+64-bit Unsigned Ints 100 157 143 25 154
+Uniforms 100 151 144 24 155
+Normals 100 129 128 37 150
+Exponentials 100 150 145 28 159
+**Overall** 100 148 138 28 154
+===================== ========= ======= =========== ======== =======
32-bit Windows
@@ -134,20 +136,20 @@ The performance of 64-bit generators on 32-bit Windows is much lower than on 64-
operating systems due to register width. MT19937, the generator that has been
in NumPy since 2005, operates on 32-bit integers.
-=================== ========= ======= ======== =======
-Distribution MT19937 PCG64 Philox SFC64
-=================== ========= ======= ======== =======
-32-bit Unsigned Int 100 30.5 21.1 77.9
-64-bit Unsigned Int 100 26.3 19.2 97.0
-Uniforms 100 28.0 23.0 106.0
-Normals 100 40.1 31.3 112.6
-Exponentials 100 33.7 26.3 109.8
-**Overall** 100 31.4 23.8 99.8
-=================== ========= ======= ======== =======
+===================== ========= ======= =========== ======== =======
+Distribution MT19937 PCG64 PCG64DXSM Philox SFC64
+===================== ========= ======= =========== ======== =======
+32-bit Unsigned Ints 100 24 34 14 57
+64-bit Unsigned Ints 100 21 32 14 74
+Uniforms 100 21 34 16 73
+Normals 100 36 57 28 101
+Exponentials 100 28 44 20 88
+**Overall** 100 25 39 18 77
+===================== ========= ======= =========== ======== =======
.. note::
- Linux timings used Ubuntu 18.04 and GCC 7.4. Windows timings were made on
+ Linux timings used Ubuntu 20.04 and GCC 9.3.0. Windows timings were made on
Windows 10 using Microsoft C/C++ Optimizing Compiler Version 19 (Visual
- Studio 2015). All timings were produced on an i5-3570 processor.
+ Studio 2019). All timings were produced on an AMD Ryzen 9 3900X processor.
diff --git a/doc/source/reference/random/upgrading-pcg64.rst b/doc/source/reference/random/upgrading-pcg64.rst
new file mode 100644
index 000000000..9e540ace9
--- /dev/null
+++ b/doc/source/reference/random/upgrading-pcg64.rst
@@ -0,0 +1,152 @@
+.. _upgrading-pcg64:
+
+.. currentmodule:: numpy.random
+
+Upgrading ``PCG64`` with ``PCG64DXSM``
+--------------------------------------
+
+Uses of the `PCG64` `BitGenerator` in a massively-parallel context have been
+shown to have statistical weaknesses that were not apparent at the first
+release in numpy 1.17. Most users will never observe this weakness and are
+safe to continue to use `PCG64`. We have introduced a new `PCG64DXSM`
+`BitGenerator` that will eventually become the new default `BitGenerator`
+implementation used by `default_rng` in future releases. `PCG64DXSM` solves
+the statistical weakness while preserving the performance and the features of
+`PCG64`.
+
+Does this affect me?
+====================
+
+If you
+
+ 1. only use a single `Generator` instance,
+ 2. only use `RandomState` or the functions in `numpy.random`,
+ 3. only use the `PCG64.jumped` method to generate parallel streams,
+ 4. explicitly use a `BitGenerator` other than `PCG64`,
+
+then this weakness does not affect you at all. Carry on.
+
+If you use moderate numbers of parallel streams created with `default_rng` or
+`SeedSequence.spawn`, in the 1000s, then the chance of observing this weakness
+is negligibly small. You can continue to use `PCG64` comfortably.
+
+If you use very large numbers of parallel streams, in the millions, and draw
+large amounts of numbers from each, then the chance of observing this weakness
+can become non-negligible, if still small. An example of such a use case would
+be a very large distributed reinforcement learning problem with millions of
+long Monte Carlo playouts each generating billions of random number draws. Such
+use cases should consider using `PCG64DXSM` explicitly or another
+modern `BitGenerator` like `SFC64` or `Philox`, but it is unlikely that any
+old results you may have calculated are invalid. In any case, the weakness is
+a kind of `Birthday Paradox <https://en.wikipedia.org/wiki/Birthday_problem>`_
+collision. That is, a single pair of parallel streams out of the millions,
+considered together, might fail a stringent set of statistical tests of
+randomness. The remaining millions of streams would all be perfectly fine, and
+the effect of the bad pair in the whole calculation is very likely to be
+swamped by the remaining streams in most applications.
+
+.. _upgrading-pcg64-details:
+
+Technical Details
+=================
+
+Like many PRNG algorithms, `PCG64` is constructed from a transition function,
+which advances a 128-bit state, and an output function, that mixes the 128-bit
+state into a 64-bit integer to be output. One of the guiding design principles
+of the PCG family of PRNGs is to balance the computational cost (and
+pseudorandomness strength) between the transition function and the output
+function. The transition function is a 128-bit linear congruential generator
+(LCG), which consists of multiplying the 128-bit state with a fixed
+multiplication constant and then adding a user-chosen increment, in 128-bit
+modular arithmetic. LCGs are well-analyzed PRNGs with known weaknesses, though
+128-bit LCGs are large enough to pass stringent statistical tests on their own,
+with only the trivial output function. The output function of `PCG64` is
+intended to patch up some of those known weaknesses by doing "just enough"
+scrambling of the bits to assist in the statistical properties without adding
+too much computational cost.
+
+One of these known weaknesses is that advancing the state of the LCG by steps
+numbering a power of two (``bg.advance(2**N)``) will leave the lower ``N`` bits
+identical to the state that was just left. For a single stream drawn from
+sequentially, this is of little consequence. The remaining :math:`128-N` bits provide
+plenty of pseudorandomness that will be mixed in for any practical ``N`` that can
+be observed in a single stream, which is why one does not need to worry about
+this if you only use a single stream in your application. Similarly, the
+`PCG64.jumped` method uses a carefully chosen number of steps to avoid creating
+these collisions. However, once you start creating "randomly-initialized"
+parallel streams, either using OS entropy by calling `default_rng` repeatedly
+or using `SeedSequence.spawn`, then we need to consider how many lower bits
+need to "collide" in order to create a bad pair of streams, and then evaluate
+the probability of creating such a collision.
+`Empirically <https://github.com/numpy/numpy/issues/16313>`_, it has been
+determined that if one shares the lower 58 bits of state and shares an
+increment, then the pair of streams, when interleaved, will fail
+`PractRand <http://pracrand.sourceforge.net/>`_ in
+a reasonable amount of time, after drawing a few gigabytes of data. Following
+the standard Birthday Paradox calculations for a collision of 58 bits, we can
+see that we can create :math:`2^{29}`, or about half a billion, streams which is when
+the probability of such a collision becomes high. Half a billion streams is
+quite high, and the amount of data each stream needs to draw before the
+statistical correlations become apparent to even the strict ``PractRand`` tests
+is in the gigabytes. But this is on the horizon for very large applications
+like distributed reinforcement learning. There are reasons to expect that even
+in these applications a collision probably will not have a practical effect in
+the total result, since the statistical problem is constrained to just the
+colliding pair.
+
+Now, let us consider the case when the increment is not constrained to be the
+same. Our implementation of `PCG64` seeds both the state and the increment;
+that is, two calls to `default_rng` (almost certainly) have different states
+and increments. Upon our first release, we believed that having the seeded
+increment would provide a certain amount of extra protection, that one would
+have to be "close" in both the state space and increment space in order to
+observe correlations (``PractRand`` failures) in a pair of streams. If that were
+true, then the "bottleneck" for collisions would be the 128-bit entropy pool
+size inside of `SeedSequence` (and 128-bit collisions are in the
+"preposterously unlikely" category). Unfortunately, this is not true.
+
+One of the known properties of an LCG is that different increments create
+*distinct* streams, but with a known relationship. Each LCG has an orbit that
+traverses all :math:`2^{128}` different 128-bit states. Two LCGs with different
+increments are related in that one can "rotate" the orbit of the first LCG
+(advance it by a number of steps that we can compute from the two increments)
+such that then both LCGs will always then have the same state, up to an
+additive constant and maybe an inversion of the bits. If you then iterate both
+streams in lockstep, then the states will *always* remain related by that same
+additive constant (and the inversion, if present). Recall that `PCG64` is
+constructed from both a transition function (the LCG) and an output function.
+It was expected that the scrambling effect of the output function would have
+been strong enough to make the distinct streams practically independent (i.e.
+"passing the ``PractRand`` tests") unless the two increments were
+pathologically related to each other (e.g. 1 and 3). The output function XSL-RR
+of the then-standard PCG algorithm that we implemented in `PCG64` turns out to
+be too weak to cover up for the 58-bit collision of the underlying LCG that we
+described above. For any given pair of increments, the size of the "colliding"
+space of states is the same, so for this weakness, the extra distinctness
+provided by the increments does not translate into extra protection from
+statistical correlations that ``PractRand`` can detect.
+
+Fortunately, strengthening the output function is able to correct this weakness
+and *does* turn the extra distinctness provided by differing increments into
+additional protection from these low-bit collisions. To the `PCG author's
+credit <https://github.com/numpy/numpy/issues/13635#issuecomment-506088698>`_,
+she had developed a stronger output function in response to related discussions
+during the long birth of the new `BitGenerator` system. We NumPy developers
+chose to be "conservative" and use the XSL-RR variant that had undergone
+a longer period of testing at that time. The DXSM output function adopts
+a "xorshift-multiply" construction used in strong integer hashes that has much
+better avalanche properties than the XSL-RR output function. While there are
+"pathological" pairs of increments that induce "bad" additive constants that
+relate the two streams, the vast majority of pairs induce "good" additive
+constants that make the merely-distinct streams of LCG states into
+practically-independent output streams. Indeed, now the claim we once made
+about `PCG64` is actually true of `PCG64DXSM`: collisions are possible, but
+both streams have to simultaneously be both "close" in the 128 bit state space
+*and* "close" in the 127-bit increment space, so that would be less likely than
+the negligible chance of colliding in the 128-bit internal `SeedSequence` pool.
+The DXSM output function is more computationally intensive than XSL-RR, but
+some optimizations in the LCG more than make up for the performance hit on most
+machines, so `PCG64DXSM` is a good, safe upgrade. There are, of course, an
+infinite number of stronger output functions that one could consider, but most
+will have a greater computational cost, and the DXSM output function has now
+received many CPU cycles of testing via ``PractRand`` at this time.