diff options
Diffstat (limited to 'doc/source/reference/random')
-rw-r--r-- | doc/source/reference/random/bit_generators/index.rst | 50 | ||||
-rw-r--r-- | doc/source/reference/random/bit_generators/mt19937.rst | 3 | ||||
-rw-r--r-- | doc/source/reference/random/bit_generators/pcg64.rst | 3 | ||||
-rw-r--r-- | doc/source/reference/random/bit_generators/pcg64dxsm.rst | 32 | ||||
-rw-r--r-- | doc/source/reference/random/bit_generators/philox.rst | 3 | ||||
-rw-r--r-- | doc/source/reference/random/bit_generators/sfc64.rst | 3 | ||||
-rw-r--r-- | doc/source/reference/random/c-api.rst | 12 | ||||
-rw-r--r-- | doc/source/reference/random/generator.rst | 17 | ||||
-rw-r--r-- | doc/source/reference/random/index.rst | 30 | ||||
-rw-r--r-- | doc/source/reference/random/legacy.rst | 3 | ||||
-rw-r--r-- | doc/source/reference/random/new-or-different.rst | 22 | ||||
-rw-r--r-- | doc/source/reference/random/parallel.rst | 13 | ||||
-rw-r--r-- | doc/source/reference/random/performance.py | 15 | ||||
-rw-r--r-- | doc/source/reference/random/performance.rst | 138 | ||||
-rw-r--r-- | doc/source/reference/random/upgrading-pcg64.rst | 152 |
15 files changed, 371 insertions, 125 deletions
diff --git a/doc/source/reference/random/bit_generators/index.rst b/doc/source/reference/random/bit_generators/index.rst index 315657172..c5c349806 100644 --- a/doc/source/reference/random/bit_generators/index.rst +++ b/doc/source/reference/random/bit_generators/index.rst @@ -15,10 +15,13 @@ Supported BitGenerators The included BitGenerators are: -* PCG-64 - The default. A fast generator that supports many parallel streams - and can be advanced by an arbitrary amount. See the documentation for - :meth:`~.PCG64.advance`. PCG-64 has a period of :math:`2^{128}`. See the `PCG - author's page`_ for more details about this class of PRNG. +* PCG-64 - The default. A fast generator that can be advanced by an arbitrary + amount. See the documentation for :meth:`~.PCG64.advance`. PCG-64 has + a period of :math:`2^{128}`. See the `PCG author's page`_ for more details + about this class of PRNG. +* PCG-64 DXSM - An upgraded version of PCG-64 with better statistical + properties in parallel contexts. See :ref:`upgrading-pcg64` for more + information on these improvements. * MT19937 - The standard Python BitGenerator. Adds a `MT19937.jumped` function that returns a new generator with state as-if :math:`2^{128}` draws have been made. @@ -43,6 +46,7 @@ The included BitGenerators are: MT19937 <mt19937> PCG64 <pcg64> + PCG64DXSM <pcg64dxsm> Philox <philox> SFC64 <sfc64> @@ -105,6 +109,44 @@ If you need to generate a good seed "offline", then ``SeedSequence().entropy`` or using ``secrets.randbits(128)`` from the standard library are both convenient ways. +If you need to run several stochastic simulations in parallel, best practice +is to construct a random generator instance for each simulation. +To make sure that the random streams have distinct initial states, you can use +the `spawn` method of `~SeedSequence`. For instance, here we construct a list +of 12 instances: + +.. code-block:: python + + from numpy.random import PCG64, SeedSequence + + # High quality initial entropy + entropy = 0x87351080e25cb0fad77a44a3be03b491 + base_seq = SeedSequence(entropy) + child_seqs = base_seq.spawn(12) # a list of 12 SeedSequences + generators = [PCG64(seq) for seq in child_seqs] + +.. end_block + + +An alternative way is to use the fact that a `~SeedSequence` can be initialized +by a tuple of elements. Here we use a base entropy value and an integer +``worker_id`` + +.. code-block:: python + + from numpy.random import PCG64, SeedSequence + + # High quality initial entropy + entropy = 0x87351080e25cb0fad77a44a3be03b491 + sequences = [SeedSequence((entropy, worker_id)) for worker_id in range(12)] + generators = [PCG64(seq) for seq in sequences] + +.. end_block + +Note that the sequences produced by the latter method will be distinct from +those constructed via `~SeedSequence.spawn`. + + .. autosummary:: :toctree: generated/ diff --git a/doc/source/reference/random/bit_generators/mt19937.rst b/doc/source/reference/random/bit_generators/mt19937.rst index 71875db4e..d05ea7c6f 100644 --- a/doc/source/reference/random/bit_generators/mt19937.rst +++ b/doc/source/reference/random/bit_generators/mt19937.rst @@ -4,7 +4,8 @@ Mersenne Twister (MT19937) .. currentmodule:: numpy.random .. autoclass:: MT19937 - :exclude-members: + :members: __init__ + :exclude-members: __init__ State ===== diff --git a/doc/source/reference/random/bit_generators/pcg64.rst b/doc/source/reference/random/bit_generators/pcg64.rst index edac4620b..889965f77 100644 --- a/doc/source/reference/random/bit_generators/pcg64.rst +++ b/doc/source/reference/random/bit_generators/pcg64.rst @@ -4,7 +4,8 @@ Permuted Congruential Generator (64-bit, PCG64) .. currentmodule:: numpy.random .. autoclass:: PCG64 - :exclude-members: + :members: __init__ + :exclude-members: __init__ State ===== diff --git a/doc/source/reference/random/bit_generators/pcg64dxsm.rst b/doc/source/reference/random/bit_generators/pcg64dxsm.rst new file mode 100644 index 000000000..e37efa5d3 --- /dev/null +++ b/doc/source/reference/random/bit_generators/pcg64dxsm.rst @@ -0,0 +1,32 @@ +Permuted Congruential Generator (64-bit, PCG64 DXSM) +---------------------------------------------------- + +.. currentmodule:: numpy.random + +.. autoclass:: PCG64DXSM + :members: __init__ + :exclude-members: __init__ + +State +===== + +.. autosummary:: + :toctree: generated/ + + ~PCG64DXSM.state + +Parallel generation +=================== +.. autosummary:: + :toctree: generated/ + + ~PCG64DXSM.advance + ~PCG64DXSM.jumped + +Extending +========= +.. autosummary:: + :toctree: generated/ + + ~PCG64DXSM.cffi + ~PCG64DXSM.ctypes diff --git a/doc/source/reference/random/bit_generators/philox.rst b/doc/source/reference/random/bit_generators/philox.rst index 8eba2d351..3c2fa4cc5 100644 --- a/doc/source/reference/random/bit_generators/philox.rst +++ b/doc/source/reference/random/bit_generators/philox.rst @@ -4,7 +4,8 @@ Philox Counter-based RNG .. currentmodule:: numpy.random .. autoclass:: Philox - :exclude-members: + :members: __init__ + :exclude-members: __init__ State ===== diff --git a/doc/source/reference/random/bit_generators/sfc64.rst b/doc/source/reference/random/bit_generators/sfc64.rst index d34124a33..8cb255bc1 100644 --- a/doc/source/reference/random/bit_generators/sfc64.rst +++ b/doc/source/reference/random/bit_generators/sfc64.rst @@ -4,7 +4,8 @@ SFC64 Small Fast Chaotic PRNG .. currentmodule:: numpy.random .. autoclass:: SFC64 - :exclude-members: + :members: __init__ + :exclude-members: __init__ State ===== diff --git a/doc/source/reference/random/c-api.rst b/doc/source/reference/random/c-api.rst index a79da7a49..de403ce98 100644 --- a/doc/source/reference/random/c-api.rst +++ b/doc/source/reference/random/c-api.rst @@ -3,6 +3,8 @@ C API for random .. currentmodule:: numpy.random +.. versionadded:: 1.19.0 + Access to various distributions below is available via Cython or C-wrapper libraries like CFFI. All the functions accept a :c:type:`bitgen_t` as their first argument. To access these from Cython or C, you must link with the @@ -40,9 +42,9 @@ The functions are named with the following conventions: - The functions without "standard" in their name require additional parameters to describe the distributions. -- ``zig`` in the name are based on a ziggurat lookup algorithm is used instead - of calculating the ``log``, which is significantly faster. The non-ziggurat - variants are used in corner cases and for legacy compatibility. +- Functions with ``inv`` in their name are based on the slower inverse method + instead of a ziggurat lookup algorithm, which is significantly faster. The + non-ziggurat variants are used in corner cases and for legacy compatibility. .. c:function:: double random_standard_uniform(bitgen_t *bitgen_state) @@ -53,6 +55,8 @@ The functions are named with the following conventions: .. c:function:: void random_standard_exponential_fill(bitgen_t *bitgen_state, npy_intp cnt, double *out) +.. c:function:: void random_standard_exponential_inv_fill(bitgen_t *bitgen_state, npy_intp cnt, double *out) + .. c:function:: double random_standard_normal(bitgen_t* bitgen_state) .. c:function:: void random_standard_normal_fill(bitgen_t *bitgen_state, npy_intp count, double *out) @@ -69,6 +73,8 @@ The functions are named with the following conventions: .. c:function:: void random_standard_exponential_fill_f(bitgen_t *bitgen_state, npy_intp cnt, float *out) +.. c:function:: void random_standard_exponential_inv_fill_f(bitgen_t *bitgen_state, npy_intp cnt, float *out) + .. c:function:: float random_standard_normal_f(bitgen_t* bitgen_state) .. c:function:: float random_standard_gamma_f(bitgen_t *bitgen_state, float shape) diff --git a/doc/source/reference/random/generator.rst b/doc/source/reference/random/generator.rst index 8706e1de2..7934be98a 100644 --- a/doc/source/reference/random/generator.rst +++ b/doc/source/reference/random/generator.rst @@ -15,7 +15,8 @@ can be changed by passing an instantized BitGenerator to ``Generator``. .. autofunction:: default_rng .. autoclass:: Generator - :exclude-members: + :members: __init__ + :exclude-members: __init__ Accessing the BitGenerator ========================== @@ -70,13 +71,13 @@ By default, `Generator.permuted` returns a copy. To operate in-place with `Generator.permuted`, pass the same array as the first argument *and* as the value of the ``out`` parameter. For example, - >>> rg = np.random.default_rng() + >>> rng = np.random.default_rng() >>> x = np.arange(0, 15).reshape(3, 5) >>> x array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14]]) - >>> y = rg.permuted(x, axis=1, out=x) + >>> y = rng.permuted(x, axis=1, out=x) >>> x array([[ 1, 0, 2, 4, 3], # random [ 6, 7, 8, 9, 5], @@ -96,13 +97,13 @@ which dimension of the input array to use as the sequence. In the case of a two-dimensional array, ``axis=0`` will, in effect, rearrange the rows of the array, and ``axis=1`` will rearrange the columns. For example - >>> rg = np.random.default_rng() + >>> rng = np.random.default_rng() >>> x = np.arange(0, 15).reshape(3, 5) >>> x array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14]]) - >>> rg.permutation(x, axis=1) + >>> rng.permutation(x, axis=1) array([[ 1, 3, 2, 0, 4], # random [ 6, 8, 7, 5, 9], [11, 13, 12, 10, 14]]) @@ -115,7 +116,7 @@ how `numpy.sort` treats it. Each slice along the given axis is shuffled independently of the others. Compare the following example of the use of `Generator.permuted` to the above example of `Generator.permutation`: - >>> rg.permuted(x, axis=1) + >>> rng.permuted(x, axis=1) array([[ 1, 0, 2, 4, 3], # random [ 5, 7, 6, 9, 8], [10, 14, 12, 13, 11]]) @@ -130,9 +131,9 @@ Shuffling non-NumPy sequences a sequence that is not a NumPy array, it shuffles that sequence in-place. For example, - >>> rg = np.random.default_rng() + >>> rng = np.random.default_rng() >>> a = ['A', 'B', 'C', 'D', 'E'] - >>> rg.shuffle(a) # shuffle the list in-place + >>> rng.shuffle(a) # shuffle the list in-place >>> a ['B', 'D', 'A', 'E', 'C'] # random diff --git a/doc/source/reference/random/index.rst b/doc/source/reference/random/index.rst index 13ce7c40c..96cd47017 100644 --- a/doc/source/reference/random/index.rst +++ b/doc/source/reference/random/index.rst @@ -25,7 +25,7 @@ nep-0019-rng-policy.html>`_ for context on the updated random Numpy number routines. The legacy `RandomState` random number routines are still available, but limited to a single BitGenerator. See :ref:`new-or-different` for a complete list of improvements and differences from the legacy -``Randomstate``. +``RandomState``. For convenience and backward compatibility, a single `RandomState` instance's methods are imported into the numpy.random namespace, see @@ -84,10 +84,10 @@ different .. code-block:: python try: - rg_integers = rg.integers + rng_integers = rng.integers except AttributeError: - rg_integers = rg.randint - a = rg_integers(1000) + rng_integers = rng.randint + a = rng_integers(1000) Seeds can be passed to any of the BitGenerators. The provided value is mixed via `SeedSequence` to spread a possible sequence of seeds across a wider @@ -97,8 +97,8 @@ is wrapped with a `Generator`. .. code-block:: python from numpy.random import Generator, PCG64 - rg = Generator(PCG64(12345)) - rg.standard_normal() + rng = Generator(PCG64(12345)) + rng.standard_normal() Here we use `default_rng` to create an instance of `Generator` to generate a random float: @@ -146,10 +146,10 @@ As a convenience NumPy provides the `default_rng` function to hide these details: >>> from numpy.random import default_rng ->>> rg = default_rng(12345) ->>> print(rg) +>>> rng = default_rng(12345) +>>> print(rng) Generator(PCG64) ->>> print(rg.random()) +>>> print(rng.random()) 0.22733602246716966 One can also instantiate `Generator` directly with a `BitGenerator` instance. @@ -158,16 +158,16 @@ To use the default `PCG64` bit generator, one can instantiate it directly and pass it to `Generator`: >>> from numpy.random import Generator, PCG64 ->>> rg = Generator(PCG64(12345)) ->>> print(rg) +>>> rng = Generator(PCG64(12345)) +>>> print(rng) Generator(PCG64) Similarly to use the older `MT19937` bit generator (not recommended), one can instantiate it directly and pass it to `Generator`: >>> from numpy.random import Generator, MT19937 ->>> rg = Generator(MT19937(12345)) ->>> print(rg) +>>> rng = Generator(MT19937(12345)) +>>> print(rng) Generator(MT19937) What's New or Different @@ -222,6 +222,9 @@ one of three ways: * :ref:`independent-streams` * :ref:`parallel-jumped` +Users with a very large amount of parallelism will want to consult +:ref:`upgrading-pcg64`. + Concepts -------- .. toctree:: @@ -230,6 +233,7 @@ Concepts generator Legacy Generator (RandomState) <legacy> BitGenerators, SeedSequences <bit_generators/index> + Upgrading PCG64 with PCG64DXSM <upgrading-pcg64> Features -------- diff --git a/doc/source/reference/random/legacy.rst b/doc/source/reference/random/legacy.rst index 6cf4775b8..42437dbb6 100644 --- a/doc/source/reference/random/legacy.rst +++ b/doc/source/reference/random/legacy.rst @@ -48,7 +48,8 @@ using the state of the `RandomState`: .. autoclass:: RandomState - :exclude-members: + :members: __init__ + :exclude-members: __init__ Seeding and State ================= diff --git a/doc/source/reference/random/new-or-different.rst b/doc/source/reference/random/new-or-different.rst index 6cab0f729..a81543926 100644 --- a/doc/source/reference/random/new-or-different.rst +++ b/doc/source/reference/random/new-or-different.rst @@ -58,18 +58,18 @@ And in more detail: from numpy.random import Generator, PCG64 import numpy.random - rg = Generator(PCG64()) - %timeit -n 1 rg.standard_normal(100000) + rng = Generator(PCG64()) + %timeit -n 1 rng.standard_normal(100000) %timeit -n 1 numpy.random.standard_normal(100000) .. ipython:: python - %timeit -n 1 rg.standard_exponential(100000) + %timeit -n 1 rng.standard_exponential(100000) %timeit -n 1 numpy.random.standard_exponential(100000) .. ipython:: python - %timeit -n 1 rg.standard_gamma(3.0, 100000) + %timeit -n 1 rng.standard_gamma(3.0, 100000) %timeit -n 1 numpy.random.standard_gamma(3.0, 100000) @@ -94,9 +94,9 @@ And in more detail: .. ipython:: python - rg = Generator(PCG64(0)) - rg.random(3, dtype='d') - rg.random(3, dtype='f') + rng = Generator(PCG64(0)) + rng.random(3, dtype='d') + rng.random(3, dtype='f') * Optional ``out`` argument that allows existing arrays to be filled for select distributions @@ -112,7 +112,7 @@ And in more detail: .. ipython:: python existing = np.zeros(4) - rg.random(out=existing[:2]) + rng.random(out=existing[:2]) print(existing) * Optional ``axis`` argument for methods like `~.Generator.choice`, @@ -121,9 +121,9 @@ And in more detail: .. ipython:: python - rg = Generator(PCG64(123456789)) + rng = Generator(PCG64(123456789)) a = np.arange(12).reshape((3, 4)) a - rg.choice(a, axis=1, size=5) - rg.shuffle(a, axis=1) # Shuffle in-place + rng.choice(a, axis=1, size=5) + rng.shuffle(a, axis=1) # Shuffle in-place a diff --git a/doc/source/reference/random/parallel.rst b/doc/source/reference/random/parallel.rst index 721584014..7f0207bde 100644 --- a/doc/source/reference/random/parallel.rst +++ b/doc/source/reference/random/parallel.rst @@ -88,10 +88,11 @@ territory ([2]_). estimate the naive upper bound on a napkin and take comfort knowing that the probability is actually lower. -.. [2] In this calculation, we can ignore the amount of numbers drawn from each - stream. Each of the PRNGs we provide has some extra protection built in +.. [2] In this calculation, we can mostly ignore the amount of numbers drawn from each + stream. See :ref:`upgrading-pcg64` for the technical details about + `PCG64`. The other PRNGs we provide have some extra protection built in that avoids overlaps if the `~SeedSequence` pools differ in the - slightest bit. `PCG64` has :math:`2^{127}` separate cycles + slightest bit. `PCG64DXSM` has :math:`2^{127}` separate cycles determined by the seed in addition to the position in the :math:`2^{128}` long period for each cycle, so one has to both get on or near the same cycle *and* seed a nearby position in the cycle. @@ -150,12 +151,14 @@ BitGenerator, the size of the jump and the bits in the default unsigned random are listed below. +-----------------+-------------------------+-------------------------+-------------------------+ -| BitGenerator | Period | Jump Size | Bits | +| BitGenerator | Period | Jump Size | Bits per Draw | +=================+=========================+=========================+=========================+ -| MT19937 | :math:`2^{19937}` | :math:`2^{128}` | 32 | +| MT19937 | :math:`2^{19937}-1` | :math:`2^{128}` | 32 | +-----------------+-------------------------+-------------------------+-------------------------+ | PCG64 | :math:`2^{128}` | :math:`~2^{127}` ([3]_) | 64 | +-----------------+-------------------------+-------------------------+-------------------------+ +| PCG64DXSM | :math:`2^{128}` | :math:`~2^{127}` ([3]_) | 64 | ++-----------------+-------------------------+-------------------------+-------------------------+ | Philox | :math:`2^{256}` | :math:`2^{128}` | 64 | +-----------------+-------------------------+-------------------------+-------------------------+ diff --git a/doc/source/reference/random/performance.py b/doc/source/reference/random/performance.py index 28a42eb0d..794142836 100644 --- a/doc/source/reference/random/performance.py +++ b/doc/source/reference/random/performance.py @@ -1,14 +1,13 @@ -from collections import OrderedDict from timeit import repeat import pandas as pd import numpy as np -from numpy.random import MT19937, PCG64, Philox, SFC64 +from numpy.random import MT19937, PCG64, PCG64DXSM, Philox, SFC64 -PRNGS = [MT19937, PCG64, Philox, SFC64] +PRNGS = [MT19937, PCG64, PCG64DXSM, Philox, SFC64] -funcs = OrderedDict() +funcs = {} integers = 'integers(0, 2**{bits},size=1000000, dtype="uint{bits}")' funcs['32-bit Unsigned Ints'] = integers.format(bits=32) funcs['64-bit Unsigned Ints'] = integers.format(bits=64) @@ -26,10 +25,10 @@ rg = Generator({prng}()) """ test = "rg.{func}" -table = OrderedDict() +table = {} for prng in PRNGS: print(prng) - col = OrderedDict() + col = {} for key in funcs: t = repeat(test.format(func=funcs[key]), setup.format(prng=prng().__class__.__name__), @@ -38,7 +37,7 @@ for prng in PRNGS: col = pd.Series(col) table[prng().__class__.__name__] = col -npfuncs = OrderedDict() +npfuncs = {} npfuncs.update(funcs) npfuncs['32-bit Unsigned Ints'] = 'randint(2**32,dtype="uint32",size=1000000)' npfuncs['64-bit Unsigned Ints'] = 'randint(2**64,dtype="uint64",size=1000000)' @@ -54,7 +53,7 @@ for key in npfuncs: col[key] = 1000 * min(t) table['RandomState'] = pd.Series(col) -columns = ['MT19937','PCG64','Philox','SFC64', 'RandomState'] +columns = ['MT19937', 'PCG64', 'PCG64DXSM', 'Philox', 'SFC64', 'RandomState'] table = pd.DataFrame(table) order = np.log(table).mean().sort_values().index table = table.T diff --git a/doc/source/reference/random/performance.rst b/doc/source/reference/random/performance.rst index 74dad4cc3..85855be59 100644 --- a/doc/source/reference/random/performance.rst +++ b/doc/source/reference/random/performance.rst @@ -5,9 +5,12 @@ Performance Recommendation ************** -The recommended generator for general use is `PCG64`. It is -statistically high quality, full-featured, and fast on most platforms, but -somewhat slow when compiled for 32-bit processes. + +The recommended generator for general use is `PCG64` or its upgraded variant +`PCG64DXSM` for heavily-parallel use cases. They are statistically high quality, +full-featured, and fast on most platforms, but somewhat slow when compiled for +32-bit processes. See :ref:`upgrading-pcg64` for details on when heavy +parallelism would indicate using `PCG64DXSM`. `Philox` is fairly slow, but its statistical properties have very high quality, and it is easy to get assuredly-independent stream by using @@ -39,49 +42,48 @@ Integer performance has a similar ordering. The pattern is similar for other, more complex generators. The normal performance of the legacy `RandomState` generator is much -lower than the other since it uses the Box-Muller transformation rather -than the Ziggurat generator. The performance gap for Exponentials is also +lower than the other since it uses the Box-Muller transform rather +than the Ziggurat method. The performance gap for Exponentials is also large due to the cost of computing the log function to invert the CDF. -The column labeled MT19973 is used the same 32-bit generator as -`RandomState` but produces random values using -`Generator`. +The column labeled MT19973 uses the same 32-bit generator as +`RandomState` but produces random variates using `Generator`. .. csv-table:: - :header: ,MT19937,PCG64,Philox,SFC64,RandomState - :widths: 14,14,14,14,14,14 - - 32-bit Unsigned Ints,3.2,2.7,4.9,2.7,3.2 - 64-bit Unsigned Ints,5.6,3.7,6.3,2.9,5.7 - Uniforms,7.3,4.1,8.1,3.1,7.3 - Normals,13.1,10.2,13.5,7.8,34.6 - Exponentials,7.9,5.4,8.5,4.1,40.3 - Gammas,34.8,28.0,34.7,25.1,58.1 - Binomials,25.0,21.4,26.1,19.5,25.2 - Laplaces,45.1,40.7,45.5,38.1,45.6 - Poissons,67.6,52.4,69.2,46.4,78.1 + :header: ,MT19937,PCG64,PCG64DXSM,Philox,SFC64,RandomState + :widths: 14,14,14,14,14,14,14 + + 32-bit Unsigned Ints,3.3,1.9,2.0,3.3,1.8,3.1 + 64-bit Unsigned Ints,5.6,3.2,2.9,4.9,2.5,5.5 + Uniforms,5.9,3.1,2.9,5.0,2.6,6.0 + Normals,13.9,10.8,10.5,12.0,8.3,56.8 + Exponentials,9.1,6.0,5.8,8.1,5.4,63.9 + Gammas,37.2,30.8,28.9,34.0,27.5,77.0 + Binomials,21.3,17.4,17.6,19.3,15.6,21.4 + Laplaces,73.2,72.3,76.1,73.0,72.3,82.5 + Poissons,111.7,103.4,100.5,109.4,90.7,115.2 The next table presents the performance in percentage relative to values generated by the legacy generator, ``RandomState(MT19937())``. The overall performance was computed using a geometric mean. .. csv-table:: - :header: ,MT19937,PCG64,Philox,SFC64 - :widths: 14,14,14,14,14 - - 32-bit Unsigned Ints,101,121,67,121 - 64-bit Unsigned Ints,102,156,91,199 - Uniforms,100,179,90,235 - Normals,263,338,257,443 - Exponentials,507,752,474,985 - Gammas,167,207,167,231 - Binomials,101,118,96,129 - Laplaces,101,112,100,120 - Poissons,116,149,113,168 - Overall,144,192,132,225 + :header: ,MT19937,PCG64,PCG64DXSM,Philox,SFC64 + :widths: 14,14,14,14,14,14 + + 32-bit Unsigned Ints,96,162,160,96,175 + 64-bit Unsigned Ints,97,171,188,113,218 + Uniforms,102,192,206,121,233 + Normals,409,526,541,471,684 + Exponentials,701,1071,1101,784,1179 + Gammas,207,250,266,227,281 + Binomials,100,123,122,111,138 + Laplaces,113,114,108,113,114 + Poissons,103,111,115,105,127 + Overall,159,219,225,174,251 .. note:: - All timings were taken using Linux on an i5-3570 processor. + All timings were taken using Linux on an AMD Ryzen 9 3900X processor. Performance on different Operating Systems ****************************************** @@ -98,33 +100,33 @@ across tables. 64-bit Linux ~~~~~~~~~~~~ -=================== ========= ======= ======== ======= -Distribution MT19937 PCG64 Philox SFC64 -=================== ========= ======= ======== ======= -32-bit Unsigned Int 100 119.8 67.7 120.2 -64-bit Unsigned Int 100 152.9 90.8 213.3 -Uniforms 100 179.0 87.0 232.0 -Normals 100 128.5 99.2 167.8 -Exponentials 100 148.3 93.0 189.3 -**Overall** 100 144.3 86.8 180.0 -=================== ========= ======= ======== ======= +===================== ========= ======= =========== ======== ======= +Distribution MT19937 PCG64 PCG64DXSM Philox SFC64 +===================== ========= ======= =========== ======== ======= +32-bit Unsigned Ints 100 168 166 100 182 +64-bit Unsigned Ints 100 176 193 116 224 +Uniforms 100 188 202 118 228 +Normals 100 128 132 115 167 +Exponentials 100 152 157 111 168 +Overall 100 161 168 112 192 +===================== ========= ======= =========== ======== ======= 64-bit Windows ~~~~~~~~~~~~~~ -The relative performance on 64-bit Linux and 64-bit Windows is broadly similar. - +The relative performance on 64-bit Linux and 64-bit Windows is broadly similar +with the notable exception of the Philox generator. -=================== ========= ======= ======== ======= -Distribution MT19937 PCG64 Philox SFC64 -=================== ========= ======= ======== ======= -32-bit Unsigned Int 100 129.1 35.0 135.0 -64-bit Unsigned Int 100 146.9 35.7 176.5 -Uniforms 100 165.0 37.0 192.0 -Normals 100 128.5 48.5 158.0 -Exponentials 100 151.6 39.0 172.8 -**Overall** 100 143.6 38.7 165.7 -=================== ========= ======= ======== ======= +===================== ========= ======= =========== ======== ======= +Distribution MT19937 PCG64 PCG64DXSM Philox SFC64 +===================== ========= ======= =========== ======== ======= +32-bit Unsigned Ints 100 155 131 29 150 +64-bit Unsigned Ints 100 157 143 25 154 +Uniforms 100 151 144 24 155 +Normals 100 129 128 37 150 +Exponentials 100 150 145 28 159 +**Overall** 100 148 138 28 154 +===================== ========= ======= =========== ======== ======= 32-bit Windows @@ -134,20 +136,20 @@ The performance of 64-bit generators on 32-bit Windows is much lower than on 64- operating systems due to register width. MT19937, the generator that has been in NumPy since 2005, operates on 32-bit integers. -=================== ========= ======= ======== ======= -Distribution MT19937 PCG64 Philox SFC64 -=================== ========= ======= ======== ======= -32-bit Unsigned Int 100 30.5 21.1 77.9 -64-bit Unsigned Int 100 26.3 19.2 97.0 -Uniforms 100 28.0 23.0 106.0 -Normals 100 40.1 31.3 112.6 -Exponentials 100 33.7 26.3 109.8 -**Overall** 100 31.4 23.8 99.8 -=================== ========= ======= ======== ======= +===================== ========= ======= =========== ======== ======= +Distribution MT19937 PCG64 PCG64DXSM Philox SFC64 +===================== ========= ======= =========== ======== ======= +32-bit Unsigned Ints 100 24 34 14 57 +64-bit Unsigned Ints 100 21 32 14 74 +Uniforms 100 21 34 16 73 +Normals 100 36 57 28 101 +Exponentials 100 28 44 20 88 +**Overall** 100 25 39 18 77 +===================== ========= ======= =========== ======== ======= .. note:: - Linux timings used Ubuntu 18.04 and GCC 7.4. Windows timings were made on + Linux timings used Ubuntu 20.04 and GCC 9.3.0. Windows timings were made on Windows 10 using Microsoft C/C++ Optimizing Compiler Version 19 (Visual - Studio 2015). All timings were produced on an i5-3570 processor. + Studio 2019). All timings were produced on an AMD Ryzen 9 3900X processor. diff --git a/doc/source/reference/random/upgrading-pcg64.rst b/doc/source/reference/random/upgrading-pcg64.rst new file mode 100644 index 000000000..9e540ace9 --- /dev/null +++ b/doc/source/reference/random/upgrading-pcg64.rst @@ -0,0 +1,152 @@ +.. _upgrading-pcg64: + +.. currentmodule:: numpy.random + +Upgrading ``PCG64`` with ``PCG64DXSM`` +-------------------------------------- + +Uses of the `PCG64` `BitGenerator` in a massively-parallel context have been +shown to have statistical weaknesses that were not apparent at the first +release in numpy 1.17. Most users will never observe this weakness and are +safe to continue to use `PCG64`. We have introduced a new `PCG64DXSM` +`BitGenerator` that will eventually become the new default `BitGenerator` +implementation used by `default_rng` in future releases. `PCG64DXSM` solves +the statistical weakness while preserving the performance and the features of +`PCG64`. + +Does this affect me? +==================== + +If you + + 1. only use a single `Generator` instance, + 2. only use `RandomState` or the functions in `numpy.random`, + 3. only use the `PCG64.jumped` method to generate parallel streams, + 4. explicitly use a `BitGenerator` other than `PCG64`, + +then this weakness does not affect you at all. Carry on. + +If you use moderate numbers of parallel streams created with `default_rng` or +`SeedSequence.spawn`, in the 1000s, then the chance of observing this weakness +is negligibly small. You can continue to use `PCG64` comfortably. + +If you use very large numbers of parallel streams, in the millions, and draw +large amounts of numbers from each, then the chance of observing this weakness +can become non-negligible, if still small. An example of such a use case would +be a very large distributed reinforcement learning problem with millions of +long Monte Carlo playouts each generating billions of random number draws. Such +use cases should consider using `PCG64DXSM` explicitly or another +modern `BitGenerator` like `SFC64` or `Philox`, but it is unlikely that any +old results you may have calculated are invalid. In any case, the weakness is +a kind of `Birthday Paradox <https://en.wikipedia.org/wiki/Birthday_problem>`_ +collision. That is, a single pair of parallel streams out of the millions, +considered together, might fail a stringent set of statistical tests of +randomness. The remaining millions of streams would all be perfectly fine, and +the effect of the bad pair in the whole calculation is very likely to be +swamped by the remaining streams in most applications. + +.. _upgrading-pcg64-details: + +Technical Details +================= + +Like many PRNG algorithms, `PCG64` is constructed from a transition function, +which advances a 128-bit state, and an output function, that mixes the 128-bit +state into a 64-bit integer to be output. One of the guiding design principles +of the PCG family of PRNGs is to balance the computational cost (and +pseudorandomness strength) between the transition function and the output +function. The transition function is a 128-bit linear congruential generator +(LCG), which consists of multiplying the 128-bit state with a fixed +multiplication constant and then adding a user-chosen increment, in 128-bit +modular arithmetic. LCGs are well-analyzed PRNGs with known weaknesses, though +128-bit LCGs are large enough to pass stringent statistical tests on their own, +with only the trivial output function. The output function of `PCG64` is +intended to patch up some of those known weaknesses by doing "just enough" +scrambling of the bits to assist in the statistical properties without adding +too much computational cost. + +One of these known weaknesses is that advancing the state of the LCG by steps +numbering a power of two (``bg.advance(2**N)``) will leave the lower ``N`` bits +identical to the state that was just left. For a single stream drawn from +sequentially, this is of little consequence. The remaining :math:`128-N` bits provide +plenty of pseudorandomness that will be mixed in for any practical ``N`` that can +be observed in a single stream, which is why one does not need to worry about +this if you only use a single stream in your application. Similarly, the +`PCG64.jumped` method uses a carefully chosen number of steps to avoid creating +these collisions. However, once you start creating "randomly-initialized" +parallel streams, either using OS entropy by calling `default_rng` repeatedly +or using `SeedSequence.spawn`, then we need to consider how many lower bits +need to "collide" in order to create a bad pair of streams, and then evaluate +the probability of creating such a collision. +`Empirically <https://github.com/numpy/numpy/issues/16313>`_, it has been +determined that if one shares the lower 58 bits of state and shares an +increment, then the pair of streams, when interleaved, will fail +`PractRand <http://pracrand.sourceforge.net/>`_ in +a reasonable amount of time, after drawing a few gigabytes of data. Following +the standard Birthday Paradox calculations for a collision of 58 bits, we can +see that we can create :math:`2^{29}`, or about half a billion, streams which is when +the probability of such a collision becomes high. Half a billion streams is +quite high, and the amount of data each stream needs to draw before the +statistical correlations become apparent to even the strict ``PractRand`` tests +is in the gigabytes. But this is on the horizon for very large applications +like distributed reinforcement learning. There are reasons to expect that even +in these applications a collision probably will not have a practical effect in +the total result, since the statistical problem is constrained to just the +colliding pair. + +Now, let us consider the case when the increment is not constrained to be the +same. Our implementation of `PCG64` seeds both the state and the increment; +that is, two calls to `default_rng` (almost certainly) have different states +and increments. Upon our first release, we believed that having the seeded +increment would provide a certain amount of extra protection, that one would +have to be "close" in both the state space and increment space in order to +observe correlations (``PractRand`` failures) in a pair of streams. If that were +true, then the "bottleneck" for collisions would be the 128-bit entropy pool +size inside of `SeedSequence` (and 128-bit collisions are in the +"preposterously unlikely" category). Unfortunately, this is not true. + +One of the known properties of an LCG is that different increments create +*distinct* streams, but with a known relationship. Each LCG has an orbit that +traverses all :math:`2^{128}` different 128-bit states. Two LCGs with different +increments are related in that one can "rotate" the orbit of the first LCG +(advance it by a number of steps that we can compute from the two increments) +such that then both LCGs will always then have the same state, up to an +additive constant and maybe an inversion of the bits. If you then iterate both +streams in lockstep, then the states will *always* remain related by that same +additive constant (and the inversion, if present). Recall that `PCG64` is +constructed from both a transition function (the LCG) and an output function. +It was expected that the scrambling effect of the output function would have +been strong enough to make the distinct streams practically independent (i.e. +"passing the ``PractRand`` tests") unless the two increments were +pathologically related to each other (e.g. 1 and 3). The output function XSL-RR +of the then-standard PCG algorithm that we implemented in `PCG64` turns out to +be too weak to cover up for the 58-bit collision of the underlying LCG that we +described above. For any given pair of increments, the size of the "colliding" +space of states is the same, so for this weakness, the extra distinctness +provided by the increments does not translate into extra protection from +statistical correlations that ``PractRand`` can detect. + +Fortunately, strengthening the output function is able to correct this weakness +and *does* turn the extra distinctness provided by differing increments into +additional protection from these low-bit collisions. To the `PCG author's +credit <https://github.com/numpy/numpy/issues/13635#issuecomment-506088698>`_, +she had developed a stronger output function in response to related discussions +during the long birth of the new `BitGenerator` system. We NumPy developers +chose to be "conservative" and use the XSL-RR variant that had undergone +a longer period of testing at that time. The DXSM output function adopts +a "xorshift-multiply" construction used in strong integer hashes that has much +better avalanche properties than the XSL-RR output function. While there are +"pathological" pairs of increments that induce "bad" additive constants that +relate the two streams, the vast majority of pairs induce "good" additive +constants that make the merely-distinct streams of LCG states into +practically-independent output streams. Indeed, now the claim we once made +about `PCG64` is actually true of `PCG64DXSM`: collisions are possible, but +both streams have to simultaneously be both "close" in the 128 bit state space +*and* "close" in the 127-bit increment space, so that would be less likely than +the negligible chance of colliding in the 128-bit internal `SeedSequence` pool. +The DXSM output function is more computationally intensive than XSL-RR, but +some optimizations in the LCG more than make up for the performance hit on most +machines, so `PCG64DXSM` is a good, safe upgrade. There are, of course, an +infinite number of stronger output functions that one could consider, but most +will have a greater computational cost, and the DXSM output function has now +received many CPU cycles of testing via ``PractRand`` at this time. |