author | Seth M Morton <seth.m.morton@gmail.com> | 2014-07-19 02:29:10 -0700 |
---|---|---|
committer | Seth M Morton <seth.m.morton@gmail.com> | 2014-07-19 02:31:45 -0700 |
commit | 5e3c69fc471c4b2c7bd31cbf4060740ace3d501e (patch) | |
tree | 672ce991194a715d9fd61d23f00adb504c8b5cf7 | |
parent | 2ab0c989a1d3f8c4fd80880d49e19b198f5db314 (diff) | |
parent | 7f6c833885c8ae2ef8d28ddb018502961ece3f87 (diff) | |
download | natsort-3.4.0.tar.gz |
Natsort version 3.4.0 release.
This release provides the following updates:
- Fixed a bug that prevented the user's options to 'natsort_key' from
being passed on to recursive calls of 'natsort_key'.
- Added a 'natsort_keygen' function that will generate a wrapped
version of 'natsort_key' that is easier to call. 'natsort_key' is
now set to be deprecated at natsort version 4.0.0.
- Added an 'as_path' option to 'natsorted' & co. that will try to
treat input strings as filepaths. This will help yield correct
results for OS-generated inputs like
['/p/q/o.x', '/p/q (1)/o.x', '/p/q (10)/o.x', '/p/q/o (1).x'].
- Massive performance enhancements for string input (1.8x-2.0x faster),
at the expense of a slowdown for numeric input (~2.0x slower).
- This is a good compromise because the most common input will be
strings, not numbers, and sorting numbers still only takes 0.6x
the time of sorting strings.
- Added the 'order_by_index' function to help in using the output of
'index_natsorted' and 'index_versorted'.
- Added the 'reverse' option to 'natsorted' & co. to make their API
more similar to the builtin 'sorted'.
- Added more unit tests.
- Added auxiliary test code that helps in profiling and
stress-testing.
- Reworked the documentation, moving most of it to PyPI's hosting
platform.
- Added support for coveralls.io.
- Entire codebase is now PyFlakes and PEP8 compliant.
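The key-generator, 'as_path', and 'reverse' items in this message can be illustrated with a small standard-library sketch. This is not natsort's implementation: `digit_key` and `make_key` are hypothetical stand-ins that only recognise unsigned integers, and the path handling (split on '/', extension compared last) is an assumption about what "treat input strings as filepaths" means.

```python
import posixpath
import re


def digit_key(text):
    """Split text into alternating string/int chunks so 'a10' sorts after 'a2'."""
    # A capturing group makes re.split keep the digit runs in the result;
    # even positions are plain text, odd positions are digit runs.
    parts = re.split(r'(\d+)', text)
    return tuple(int(p) if i % 2 else p for i, p in enumerate(parts))


def make_key(as_path=False):
    """Return a one-argument key function with the options baked in.

    Hypothetical stand-in for the idea behind a key generator: configure
    once, then hand the result to list.sort or sorted.
    """
    def key(value):
        if not as_path:
            return digit_key(value)
        # Compare directory by directory, with the file extension last.
        head, ext = posixpath.splitext(value)
        components = head.split('/') + [ext]
        return tuple(digit_key(c) for c in components)
    return key


# Path-aware ordering of the example input quoted in the changelog.
paths = ['/p/q/o.x', '/p/q (1)/o.x', '/p/q (10)/o.x', '/p/q/o (1).x']
paths.sort(key=make_key(as_path=True))
# paths is now ['/p/q/o.x', '/p/q/o (1).x', '/p/q (1)/o.x', '/p/q (10)/o.x']

# 'reverse' behaves as it does for the builtin sorted.
descending = sorted(['a2', 'a9', 'a1', 'a4', 'a10'], key=make_key(), reverse=True)
# descending is ['a10', 'a9', 'a4', 'a2', 'a1']
```

Returning a configured closure is what makes a generated key convenient with `list.sort`: the options are fixed once instead of being repeated in a `lambda` at every call site.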
36 files changed, 2867 insertions, 945 deletions
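The 'order_by_index' item in the commit message pairs with the index-returning sort functions: compute the order once, then apply it to several parallel sequences. A minimal sketch of that pattern follows, using only the standard library. `index_sorted` and `reorder` are hypothetical stand-ins, not the natsort functions, and the key handles unsigned integers only.

```python
import re


def digit_key(text):
    """Split text into alternating string/int chunks for natural comparison."""
    parts = re.split(r'(\d+)', text)
    return tuple(int(p) if i % 2 else p for i, p in enumerate(parts))


def index_sorted(seq):
    """Return the indexes that would put seq in natural order."""
    return sorted(range(len(seq)), key=lambda i: digit_key(seq[i]))


def reorder(seq, index):
    """Apply a precomputed index order to any sequence of the same length."""
    return [seq[i] for i in index]


a = ['num3', 'num5', 'num2']
b = ['foo', 'bar', 'baz']

index = index_sorted(a)       # [2, 0, 1]
sorted_a = reorder(a, index)  # ['num2', 'num3', 'num5']
sorted_b = reorder(b, index)  # ['baz', 'foo', 'bar']
```

The values match the `index_natsorted` example that this commit removes from the README, where the same lists were reordered with explicit list comprehensions.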
diff --git a/.coveragerc b/.coveragerc new file mode 100644 index 0000000..1bbfe9d --- /dev/null +++ b/.coveragerc @@ -0,0 +1,22 @@ +[report] +# Regexes for lines to exclude from consideration +exclude_lines = + # Have to re-enable the standard pragma + pragma: no cover + + # Don't complain if tests don't hit defensive assertion code: + raise AssertionError + raise NotImplementedError + raise$ + + # Don't complain if non-runnable code isn't run: + if 0: + if __name__ == .__main__.: + +ignore_errors = True + +# Files to not perform coverage on +omit = + natsort/__init__.* + natsort/py23compat.* + natsort/_version.* diff --git a/.travis.yml b/.travis.yml index 68055ab..72df9c9 100644 --- a/.travis.yml +++ b/.travis.yml @@ -5,15 +5,20 @@ python: - 3.2 - 3.3 - 3.4 -- pypy install: -- pip install pytest-cov -- pip install wheel +- pip install pytest-cov pytest-flakes pytest-pep8 +- pip install coveralls - if [[ $TRAVIS_PYTHON_VERSION == '2.6' ]]; then pip install argparse; fi script: -- python -m pytest --cov-report term-missing --cov natsort +- python -m pytest --cov natsort --flakes --pep8 - python -m pytest --doctest-modules natsort -- python -m pytest README.rst +- python -m pytest README.rst docs/source/intro.rst docs/source/examples.rst +- python -m pytest test_natsort/stress_natsort.py +after_success: + coveralls +before_deploy: +- pip install Sphinx numpydoc +- python setup.py build_sphinx deploy: provider: pypi user: SethMMorton @@ -21,5 +26,8 @@ deploy: secure: OaYQtVh4mGT0ozN7Ar2lSm2IEVMKIyvOESGPGLwVyVxPqp6oC101MovJ7041bZdjMzirMs54EJwtEGQpKFmDBGcKgbjPiYId5Nqb/yDhLC/ojgarbLoFJvUKV6dWJePyY7EOycrqcMdiDabdG80Bw4zziQExbmIOdUiscsAVVmA= on: tags: true + all_branches: true repo: SethMMorton/natsort + python: 3.3 distributions: "sdist bdist_wheel" + docs_dir: build/sphinx/html @@ -4,14 +4,13 @@ natsort .. image:: https://travis-ci.org/SethMMorton/natsort.svg?branch=master :target: https://travis-ci.org/SethMMorton/natsort -Natural sorting for python. 
``natsort`` requires python version 2.6 or greater -(this includes python 3.x). To run version 2.6, 3.0, or 3.1 the -`argparse <https://pypi.python.org/pypi/argparse>`_ module is required. +.. image:: https://coveralls.io/repos/SethMMorton/natsort/badge.png?branch=master + :target: https://coveralls.io/r/SethMMorton/natsort?branch=master -``natsort`` comes with a shell script that is described below. You can -also execute ``natsort`` from the command line with ``python -m natsort``. +Natural sorting for python. Check out the source code at +https://github.com/SethMMorton/natsort. -Problem Statement +Quick Description ----------------- When you try to sort a list of strings that contain numbers, the normal python @@ -24,28 +23,30 @@ expect:: Notice that it has the order ('1', '10', '2') - this is because the list is being sorted in lexicographical order, which sorts numbers like you would -letters (i.e. 'a', 'at', 'b'). It would be better if you had a sorting -algorithm that recognized numbers as numbers and treated them like numbers, -not letters. - -This is where ``natsort`` comes in: it provides a key that helps sort lists -"naturally". It provides support for ints and floats (including negatives and -exponential notation), and also a function specifically for sorting version -numbers. +letters (i.e. 'b', 'ba', 'c'). -Synopsis --------- - -Using ``natsort`` is simple:: +``natsort`` provides a function ``natsorted`` that helps sort lists "naturally", +either as real numbers (i.e. signed/unsigned floats or ints), or as versions. +Using ``natsorted`` is simple:: >>> from natsort import natsorted >>> a = ['a2', 'a9', 'a1', 'a4', 'a10'] >>> natsorted(a) ['a1', 'a2', 'a4', 'a9', 'a10'] -``natsort`` identifies the numbers and sorts them separately from strings. +``natsorted`` identifies real numbers anywhere in a string and sorts them +naturally. 
+ +Sorting version numbers is just as easy:: -You can also mix and match ``int``, ``float``, and ``str`` (or ``unicode``) types + >>> from natsort import versorted + >>> a = ['version-1.9', 'version-2.0', 'version-1.11', 'version-1.10'] + >>> versorted(a) + ['version-1.9', 'version-1.10', 'version-1.11', 'version-2.0'] + >>> natsorted(a) # natsorted tries to sort as signed floats, so it won't work + ['version-2.0', 'version-1.9', 'version-1.11', 'version-1.10'] + +You can mix and match ``int``, ``float``, and ``str`` (or ``unicode``) types when you sort:: >>> a = ['4.5', 6, 2.0, '5', 'a'] @@ -54,352 +55,44 @@ when you sort:: >>> # On Python 2, sorted(a) would return [2.0, 6, '4.5', '5', 'a'] >>> # On Python 3, sorted(a) would raise an "unorderable types" TypeError -The natsort algorithm will recursively descend into lists of lists so you can sort by -the sublist contents:: - - >>> data = [['a1', 'a5'], ['a1', 'a40'], ['a10', 'a1'], ['a2', 'a5']] - >>> sorted(data) - [['a1', 'a40'], ['a1', 'a5'], ['a10', 'a1'], ['a2', 'a5']] - >>> natsorted(data) - [['a1', 'a5'], ['a1', 'a40'], ['a2', 'a5'], ['a10', 'a1']] - -There is also a special convenience function provided that is best for sorting -version numbers:: - - >>> from natsort import versorted - >>> a = ['ver-2.9.9a', 'ver-1.11', 'ver-2.9.9b', 'ver-1.11.4', 'ver-1.10.1'] - >>> versorted(a) - ['ver-1.10.1', 'ver-1.11', 'ver-1.11.4', 'ver-2.9.9a', 'ver-2.9.9b'] - -The Sorting Algorithms -'''''''''''''''''''''' - -Sometimes you want to sort by floats, sometimes by ints, and sometimes simply -by digits. ``natsort`` supports all three number types. They can be chosen -with the ``number_type`` argument to ``natsorted``. - -Sort by floats -++++++++++++++ - -By default, ``natsort`` searches for floats (even in exponential -notation!). 
This means that it will look for things like negative -signs and decimal points when determining a number:: - - >>> a = ['a50', 'a51.', 'a50.4', 'a5.034e1', 'a50.300'] - >>> sorted(a) - ['a5.034e1', 'a50', 'a50.300', 'a50.4', 'a51.'] - >>> natsorted(a, number_type=float) - ['a50', 'a50.300', 'a5.034e1', 'a50.4', 'a51.'] - >>> natsorted(a) # Float is the default behavior - ['a50', 'a50.300', 'a5.034e1', 'a50.4', 'a51.'] - -Sort by ints -++++++++++++ - -In some cases you don't want ``natsort`` to identify your numbers as floats, -particularly if you are sorting version numbers. This is because you want the -version '1.10' to come after '1.2', not before. In that case, it is advantageous -to sort by ints, not floats:: - - >>> a = ['ver1.9.9a', 'ver1.11', 'ver1.9.9b', 'ver1.11.4', 'ver1.10.1'] - >>> sorted(a) - ['ver1.10.1', 'ver1.11', 'ver1.11.4', 'ver1.9.9a', 'ver1.9.9b'] - >>> natsorted(a) - ['ver1.10.1', 'ver1.11', 'ver1.11.4', 'ver1.9.9a', 'ver1.9.9b'] - >>> natsorted(a, number_type=int) - ['ver1.9.9a', 'ver1.9.9b', 'ver1.10.1', 'ver1.11', 'ver1.11.4'] - -Sort by digits (best for version numbers) -+++++++++++++++++++++++++++++++++++++++++ - -The only difference between sorting by ints and sorting by digits is that -sorting by ints may take into account a negative sign, and sorting by digits -will not. This may be an issue if you used a '-' as your separator before the -version numbers. 
Essentially this is a shortcut for a number type of ``int`` -and the ``signed`` option of ``False``:: - - >>> a = ['ver-2.9.9a', 'ver-1.11', 'ver-2.9.9b', 'ver-1.11.4', 'ver-1.10.1'] - >>> natsorted(a, number_type=int) - ['ver-2.9.9a', 'ver-2.9.9b', 'ver-1.10.1', 'ver-1.11', 'ver-1.11.4'] - >>> natsorted(a, number_type=None) - ['ver-1.10.1', 'ver-1.11', 'ver-1.11.4', 'ver-2.9.9a', 'ver-2.9.9b'] - -The ``versorted`` function is simply a wrapper for ``number_type=None``, -and if you need to sort just version numbers it is best to use the -``versorted`` function for clarity:: - - >>> natsorted(a, number_type=None) == versorted(a) - True - -Using a sorting key -''''''''''''''''''' - -Like the built-in ``sorted`` function, ``natsorted`` can accept a key so that -you can sort based on a particular item of a list or by an attribute of a class:: - - >>> from operator import attrgetter, itemgetter - >>> a = [['num4', 'b'], ['num8', 'c'], ['num2', 'a']] - >>> natsorted(a, key=itemgetter(0)) - [['num2', 'a'], ['num4', 'b'], ['num8', 'c']] - >>> class Foo: - ... def __init__(self, bar): - ... self.bar = bar - ... def __repr__(self): - ... return "Foo('{0}')".format(self.bar) - >>> b = [Foo('num3'), Foo('num5'), Foo('num2')] - >>> natsorted(b, key=attrgetter('bar')) - [Foo('num2'), Foo('num3'), Foo('num5')] - -API ---- - -The ``natsort`` package provides five functions: ``natsort_key``, -``natsorted``, ``versorted``, ``index_natsorted``, and ``index_versorted``. -You can look at the unit tests to see more thorough examples of how -``natsort`` can be used. - -natsorted -''''''''' - -``natsort.natsorted`` (*sequence*, *key* = ``lambda x: x``, *number_type* = ``float``, *signed* = ``True``, *exp* = ``True``) - - sequence (*iterable*) - The sequence to sort. - - key (*function*) - A key used to determine how to sort each element of the sequence. 
- - number_type (``None``, ``float``, ``int``) - The types of number to sort by: ``float`` searches for floating point numbers, - ``int`` searches for integers, and ``None`` searches for digits (like integers - but does not take into account negative sign). ``None`` is a shortcut for - ``number_type = int`` and ``signed = False``. - - signed (``True``, ``False``) - By default a '+' or '-' before a number is taken to be the sign of the number. - If ``signed`` is ``False``, any '+' or '-' will not be considered to be part - of the number, but as part of the string. - - exp (``True``, ``False``) - This option only applies to ``number_type = float``. If ``exp = True``, a string - like ``"3.5e5"`` will be interpreted as ``350000``, i.e. the exponential part - is considered to be part of the number. If ``exp = False``, ``"3.5e5"`` is - interpreted as ``(3.5, "e", 5)``. The default behavior is ``exp = True``. - - returns - The sorted sequence. - -Use ``natsorted`` just like the builtin ``sorted``:: - - >>> from natsort import natsorted - >>> a = ['num3', 'num5', 'num2'] - >>> natsorted(a) - ['num2', 'num3', 'num5'] - -versorted -''''''''' - -``natsort.versorted`` (*sequence*, *key* = ``lambda x: x``) - - sequence (*iterable*) - The sequence to sort. - - key (*function*) - A key used to determine how to sort each element of the sequence. - - returns - The sorted sequence. - -Use ``versorted`` just like the builtin ``sorted``:: - - >>> from natsort import versorted - >>> a = ['num4.0.2', 'num3.4.1', 'num3.4.2'] - >>> versorted(a) - ['num3.4.1', 'num3.4.2', 'num4.0.2'] - -This is a wrapper around ``natsorted(seq, number_type=None)``, and is used -to easily sort version numbers. - -index_natsorted -''''''''''''''' - -``natsort.index_natsorted`` (*sequence*, *key* = ``lambda x: x``, *number_type* = ``float``, *signed* = ``True``, *exp* = ``True``) - - sequence (*iterable*) - The sequence to sort. 
- - key (*function*) - A key used to determine how to sort each element of the sequence. - - number_type (``None``, ``float``, ``int``) - The types of number to sort on: ``float`` searches for floating point numbers, - ``int`` searches for integers, and ``None`` searches for digits (like integers - but does not take into account negative sign). ``None`` is a shortcut for - ``number_type = int`` and ``signed = False``. - - signed (``True``, ``False``) - By default a '+' or '-' before a number is taken to be the sign of the number. - If ``signed`` is ``False``, any '+' or '-' will not be considered to be part - of the number, but as part part of the string. - - exp (``True``, ``False``) - This option only applies to ``number_type = float``. If ``exp = True``, a string - like ``"3.5e5"`` will be interpreted as ``350000``, i.e. the exponential part - is considered to be part of the number. If ``exp = False``, ``"3.5e5"`` is - interpreted as ``(3.5, "e", 5)``. The default behavior is ``exp = True``. - - returns - The ordered indexes of the sequence. +The natsort algorithm does other fancy things like -Use ``index_natsorted`` if you want to sort multiple lists by the sort order of -one list:: + - recursively descend into lists of lists + - sort file paths correctly + - allow custom sorting keys + - allow exposed a natsort_key generator to pass to list.sort - >>> from natsort import index_natsorted - >>> a = ['num3', 'num5', 'num2'] - >>> b = ['foo', 'bar', 'baz'] - >>> index = index_natsorted(a) - >>> index - [2, 0, 1] - >>> # Sort both lists by the sort order of a - >>> [a[i] for i in index] - ['num2', 'num3', 'num5'] - >>> [b[i] for i in index] - ['baz', 'foo', 'bar'] +Please see the package documentation for more details, including additional examples +and recipes. -index_versorted -''''''''''''''' - -``natsort.index_versorted`` (*sequence*, *key* = ``lambda x: x``) - - sequence (*iterable*) - The sequence to sort. 
- - key (*function*) - A key used to determine how to sort each element of the sequence. - - returns - The ordered indexes of the sequence. - -Use ``index_versorted`` just like the builtin sorted:: - - >>> from natsort import index_versorted - >>> a = ['num4.0.2', 'num3.4.1', 'num3.4.2'] - >>> index_versorted(a) - [1, 2, 0] - -This is a wrapper around ``index_natsorted(seq, number_type=None)``, and is used -to easily sort version numbers by their indexes. - -natsort_key -''''''''''' - -``natsort.natsort_key`` (value, *number_type* = ``float``, *signed* = ``True``, *exp* = ``True``, *py3_safe* = ``False``) - - value - The value used by the sorting algorithm - - number_type (``None``, ``float``, ``int``) - The types of number to sort on: ``float`` searches for floating point numbers, - ``int`` searches for integers, and ``None`` searches for digits (like integers - but does not take into account negative sign). ``None`` is a shortcut for - ``number_type = int`` and ``signed = False``. - - signed (``True``, ``False``) - By default a '+' or '-' before a number is taken to be the sign of the number. - If ``signed`` is ``False``, any '+' or '-' will not be considered to be part - of the number, but as part part of the string. - - exp (``True``, ``False``) - This option only applies to ``number_type = float``. If ``exp = True``, a string - like ``"3.5e5"`` will be interpreted as ``350000``, i.e. the exponential part - is considered to be part of the number. If ``exp = False``, ``"3.5e5"`` is - interpreted as ``(3.5, "e", 5)``. The default behavior is ``exp = True``. - - py3_safe (``True``, ``False``) - This will make the string parsing algorithm be more careful by placing - an empty string between two adjacent numbers after the parsing algorithm. - This will prevent the "unorderable types" error. - - returns - The modified value with numbers extracted. 
- -Using ``natsort_key`` is just like any other sorting key in python:: - - >>> from natsort import natsort_key - >>> a = ['num3', 'num5', 'num2'] - >>> a.sort(key=natsort_key) - >>> a - ['num2', 'num3', 'num5'] - -It works by separating out the numbers from the strings:: - - >>> natsort_key('num2') - ('num', 2.0) - -If you need to call ``natsort_key`` with the ``number_type`` argument, or get a special -attribute or item of each element of the sequence, the easiest way is to make a -``lambda`` expression that calls ``natsort_key``:: - - >>> from operator import itemgetter - >>> a = [['num4', 'b'], ['num8', 'c'], ['num2', 'a']] - >>> f = itemgetter(0) - >>> a.sort(key=lambda x: natsort_key(f(x), number_type=int)) - >>> a - [['num2', 'a'], ['num4', 'b'], ['num8', 'c']] - -Shell Script +Shell script ------------ -For your convenience, there is a ``natsort`` shell script supplied to you that -allows you to call ``natsort`` from the command-line. ``natsort`` was written to -aid in computational chemistry research so that it would be easy to analyze -large sets of output files named after the parameter used:: - - $ ls *.out - mode1000.35.out mode1243.34.out mode744.43.out mode943.54.out - -(Obviously, in reality there would be more files, but you get the idea.) Notice -that the shell sorts in lexicographical order. This is the behavior of programs like -``find`` as well as ``ls``. The problem is in passing these files to an -analysis program that causes them not to appear in numerical order, which can lead -to bad analysis. To remedy this, use ``natsort``:: - - # This won't get you what you want - $ foo *.out - # This will sort naturally - $ natsort *.out - mode744.43.out - mode943.54.out - mode1000.35.out - mode1243.34.out - $ natsort *.out | xargs foo - -You can also filter out numbers using the ``natsort`` command-line script:: +``natsort`` comes with a shell script called ``natsort``, or can also be called +from the command line with ``python -m natsort``. 
The command line script is +only installed onto your ``PATH`` if you don't install via a wheel. There is +apparently a known bug with the wheel installation process that will not create +entry points. - $ natsort *.out -f 900 1100 # Select only numbers between 900-1100 - mode943.54.out - mode1000.35.out - -If needed, you can exclude specific numbers:: - - $ natsort *.out -e 1000.35 # Exclude 1000.35 from search - mode744.43.out - mode943.54.out - mode1243.34.out - -For other options, use ``natsort --help``. In general, the other options mirror -the ``natsorted`` API. +Requirements +------------ -It is also helpful to note that ``natsort`` accepts pipes. +``natsort`` requires python version 2.6 or greater +(this includes python 3.x). To run version 2.6, 3.0, or 3.1 the +`argparse <https://pypi.python.org/pypi/argparse>`_ module is required. -Note to users of the ``natsort`` shell script from < v. 3.1.0 -''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' +Depreciation Notices +-------------------- -The ``natsort`` shell script options and implementation for version 3.1.0 has -changed slightly. Options relating to interpreting input as file or directory -paths have been removed, and internally the input is no longer treated as file -paths. In most situations, this should not give different results, but in -some unique cases it may. Feel free to contact me if this ruins your work flow. + - In ``natsort`` version 4.0.0, the ``natsort_key`` function will be removed + from the public API. All future development should use ``natsort_keygen`` + in preparation for this. + - In ``natsort`` version 3.1.0, the shell script changed how it interpreted + input; previously, all input was assumed to be a filepath, but as of 3.1.0 + input is just treated as a string. For most cases the results are the same. + + - As of ``natsort`` version 3.4.0, a ``--path`` option has been added to + force the shell script to interpret the input as filepaths. 
Author ------ @@ -409,6 +102,38 @@ Seth M. Morton History ------- +These are the last three entries of the changelog. See the package documentation +for the complete changelog. + +07-19-2014 v. 3.4.0 +''''''''''''''''''' + + - Fixed a bug that caused user's options to the 'natsort_key' to not be + passed on to recursive calls of 'natsort_key'. + - Added a 'natsort_keygen' function that will generate a wrapped version + of 'natsort_key' that is easier to call. 'natsort_key' is now set to + depreciate at natsort version 4.0.0. + - Added an 'as_path' option to 'natsorted' & co. that will try to treat + input strings as filepaths. This will help yield correct results for + OS-generated inputs like + ``['/p/q/o.x', '/p/q (1)/o.x', '/p/q (10)/o.x', '/p/q/o (1).x']``. + - Massive performance enhancements for string input (1.8x-2.0x), at the expense + of reduction in speed for numeric input (~2.0x). + + - This is a good compromise because the most common input will be strings, + not numbers, and sorting numbers still only takes 0.6x the time of sorting + strings. If you are sorting only numbers, you would use 'sorted' anyway. + + - Added the 'order_by_index' function to help in using the output of + 'index_natsorted' and 'index_versorted'. + - Added the 'reverse' option to 'natsorted' & co. to make it's API more + similar to the builtin 'sorted'. + - Added more unit tests. + - Added auxiliary test code that helps in profiling and stress-testing. + - Reworked the documentation, moving most of it to PyPI's hosting platform. + - Added support for coveralls.io. + - Entire codebase is now PyFlakes and PEP8 compliant. + 06-28-2014 v. 3.3.0 ''''''''''''''''''' @@ -429,103 +154,3 @@ History - Re-"Fixed" unorderable types issue on Python 3.x - this workaround is for when the problem occurs in the middle of the string. - -05-07-2014 v. 
3.2.0 -''''''''''''''''''' - - - "Fixed" unorderable types issue on Python 3.x with a workaround that - attempts to replicate the Python 2.x behavior by putting all the numbers - (or strings that begin with numbers) first. - - Now explicitly excluding __pycache__ from releases by adding a prune statement - to MANIFEST.in. - -05-05-2014 v. 3.1.2 -''''''''''''''''''' - - - Added setup.cfg to support universal wheels. - - Added Python 3.0 and Python 3.1 as requiring the argparse module. - -03-01-2014 v. 3.1.1 -''''''''''''''''''' - - - Added ability to sort lists of lists. - - Cleaned up import statements. - -01-20-2014 v. 3.1.0 -''''''''''''''''''' - - - Added the ``signed`` and ``exp`` options to allow finer tuning of the sorting - - Entire codebase now works for both Python 2 and Python 3 without needing to run - ``2to3``. - - Updated all doctests. - - Further simplified the ``natsort`` base code by removing unneeded functions. - - Simplified documentation where possible. - - Improved the shell script code - - - Made the documentation less "path"-centric to make it clear it is not just - for sorting file paths. - - Removed the filesystem-based options because these can be achieved better - though a pipeline. - - Added doctests. - - Added new options that correspond to ``signed`` and ``exp``. - - The user can now specify multiple numbers to exclude or multiple ranges - to filter by. - -10-01-2013 v. 3.0.2 -''''''''''''''''''' - - - Made float, int, and digit searching algorithms all share the same base function. - - Fixed some outdated comments. - - Made the ``__version__`` variable available when importing the module. - -8-15-2013 v. 3.0.1 -'''''''''''''''''' - - - Added support for unicode strings. - - Removed extraneous ``string2int`` function. - - Fixed empty string removal function. - -7-13-2013 v. 3.0.0 -'''''''''''''''''' - - - Added a ``number_type`` argument to the sorting functions to specify how - liberal to be when deciding what a number is. 
- - Reworked the documentation. - -6-25-2013 v. 2.2.0 -'''''''''''''''''' - - - Added ``key`` attribute to ``natsorted`` and ``index_natsorted`` so that - it mimics the functionality of the built-in ``sorted`` - - Added tests to reflect the new functionality, as well as tests demonstrating - how to get similar functionality using ``natsort_key``. - -12-5-2012 v. 2.1.0 -'''''''''''''''''' - - - Reorganized package. - - Now using a platform independent shell script generator (entry_points - from distribute). - - Can now execute natsort from command line with ``python -m natsort`` - as well. - -11-30-2012 v. 2.0.2 -''''''''''''''''''' - - - Added the use_2to3 option to setup.py. - - Added distribute_setup.py to the distribution. - - Added dependency to the argparse module (for python2.6). - -11-21-2012 v. 2.0.1 -''''''''''''''''''' - - - Reorganized directory structure. - - Added tests into the natsort.py file iteself. - -11-16-2012, v. 2.0.0 -'''''''''''''''''''' - - - Updated sorting algorithm to support floats (including exponentials) and - basic version number support. - - Added better README documentation. - - Added doctests. diff --git a/docs/source/api.rst b/docs/source/api.rst new file mode 100644 index 0000000..7546de6 --- /dev/null +++ b/docs/source/api.rst @@ -0,0 +1,18 @@ +.. default-domain:: py +.. currentmodule:: natsort + +.. _api: + +natsort API +=========== + +.. toctree:: + :maxdepth: 2 + + natsort_keygen.rst + natsort_key.rst + natsorted.rst + versorted.rst + index_natsorted.rst + index_versorted.rst + order_by_index.rst diff --git a/docs/source/changelog.rst b/docs/source/changelog.rst new file mode 100644 index 0000000..807bfe5 --- /dev/null +++ b/docs/source/changelog.rst @@ -0,0 +1,154 @@ +.. _changelog: + +Changelog +--------- + +07-19-2014 v. 3.4.0 +''''''''''''''''''' + + - Fixed a bug that caused user's options to the 'natsort_key' to not be + passed on to recursive calls of 'natsort_key'. 
+ - Added a 'natsort_keygen' function that will generate a wrapped version + of 'natsort_key' that is easier to call. 'natsort_key' is now set to + depreciate at natsort version 4.0.0. + - Added an 'as_path' option to 'natsorted' & co. that will try to treat + input strings as filepaths. This will help yield correct results for + OS-generated inputs like + ``['/p/q/o.x', '/p/q (1)/o.x', '/p/q (10)/o.x', '/p/q/o (1).x']``. + - Massive performance enhancements for string input (1.8x-2.0x), at the expense + of reduction in speed for numeric input (~2.0x). + + - This is a good compromise because the most common input will be strings, + not numbers, and sorting numbers still only takes 0.6x the time of sorting + strings. If you are sorting only numbers, you would use 'sorted' anyway. + + - Added the 'order_by_index' function to help in using the output of + 'index_natsorted' and 'index_versorted'. + - Added the 'reverse' option to 'natsorted' & co. to make it's API more + similar to the builtin 'sorted'. + - Added more unit tests. + - Added auxillary test code that helps in profiling and stress-testing. + - Reworked the documentation, moving most of it to PyPI's hosting platform. + - Added support for coveralls.io. + - Entire codebase is now PyFlakes and PEP8 compliant. + +06-28-2014 v. 3.3.0 +''''''''''''''''''' + + - Added a 'versorted' method for more convenient sorting of versions. + - Updated command-line tool --number_type option with 'version' and 'ver' + to make it more clear how to sort version numbers. + - Moved unit-testing mechanism from being docstring-based to actual unit tests + in actual functions. + + - This has provided the ability determine the coverage of the unit tests (99%). + - This also makes the pydoc documentation a bit more clear. + + - Made docstrings for public functions mirror the README API. + - Connected natsort development to Travis-CI to help ensure quality releases. + +06-20-2014 v. 
3.2.1 +''''''''''''''''''' + + - Re-"Fixed" unorderable types issue on Python 3.x - this workaround + is for when the problem occurs in the middle of the string. + +05-07-2014 v. 3.2.0 +''''''''''''''''''' + + - "Fixed" unorderable types issue on Python 3.x with a workaround that + attempts to replicate the Python 2.x behavior by putting all the numbers + (or strings that begin with numbers) first. + - Now explicitly excluding __pycache__ from releases by adding a prune statement + to MANIFEST.in. + +05-05-2014 v. 3.1.2 +''''''''''''''''''' + + - Added setup.cfg to support universal wheels. + - Added Python 3.0 and Python 3.1 as requiring the argparse module. + +03-01-2014 v. 3.1.1 +''''''''''''''''''' + + - Added ability to sort lists of lists. + - Cleaned up import statements. + +01-20-2014 v. 3.1.0 +''''''''''''''''''' + + - Added the ``signed`` and ``exp`` options to allow finer tuning of the sorting + - Entire codebase now works for both Python 2 and Python 3 without needing to run + ``2to3``. + - Updated all doctests. + - Further simplified the ``natsort`` base code by removing unneeded functions. + - Simplified documentation where possible. + - Improved the shell script code + + - Made the documentation less "path"-centric to make it clear it is not just + for sorting file paths. + - Removed the filesystem-based options because these can be achieved better + though a pipeline. + - Added doctests. + - Added new options that correspond to ``signed`` and ``exp``. + - The user can now specify multiple numbers to exclude or multiple ranges + to filter by. + +10-01-2013 v. 3.0.2 +''''''''''''''''''' + + - Made float, int, and digit searching algorithms all share the same base function. + - Fixed some outdated comments. + - Made the ``__version__`` variable available when importing the module. + +8-15-2013 v. 3.0.1 +'''''''''''''''''' + + - Added support for unicode strings. + - Removed extraneous ``string2int`` function. + - Fixed empty string removal function. 
+ +7-13-2013 v. 3.0.0 +'''''''''''''''''' + + - Added a ``number_type`` argument to the sorting functions to specify how + liberal to be when deciding what a number is. + - Reworked the documentation. + +6-25-2013 v. 2.2.0 +'''''''''''''''''' + + - Added ``key`` attribute to ``natsorted`` and ``index_natsorted`` so that + it mimics the functionality of the built-in ``sorted`` + - Added tests to reflect the new functionality, as well as tests demonstrating + how to get similar functionality using ``natsort_key``. + +12-5-2012 v. 2.1.0 +'''''''''''''''''' + + - Reorganized package. + - Now using a platform independent shell script generator (entry_points + from distribute). + - Can now execute natsort from command line with ``python -m natsort`` + as well. + +11-30-2012 v. 2.0.2 +''''''''''''''''''' + + - Added the use_2to3 option to setup.py. + - Added distribute_setup.py to the distribution. + - Added dependency to the argparse module (for python2.6). + +11-21-2012 v. 2.0.1 +''''''''''''''''''' + + - Reorganized directory structure. + - Added tests into the natsort.py file iteself. + +11-16-2012, v. 2.0.0 +'''''''''''''''''''' + + - Updated sorting algorithm to support floats (including exponentials) and + basic version number support. + - Added better README documentation. + - Added doctests. diff --git a/docs/source/conf.py b/docs/source/conf.py new file mode 100644 index 0000000..ee8ea53 --- /dev/null +++ b/docs/source/conf.py @@ -0,0 +1,280 @@ +# -*- coding: utf-8 -*- +# +# natsort documentation build configuration file, created by +# sphinx-quickstart on Thu Jul 17 21:01:29 2014. +# +# This file is execfile()d with the current directory set to its +# containing dir. +# +# Note that not all possible configuration values are present in this +# autogenerated file. +# +# All configuration values have a default; values that are commented out +# serve to show the default. 
+ +import os +import re + +def current_version(): + # Read the _version.py file for the module version number + VERSIONFILE = os.path.join('..', '..', 'natsort', '_version.py') + versionsearch = re.compile(r"^__version__ = ['\"]([^'\"]*)['\"]") + with open(VERSIONFILE, "rt") as fl: + for line in fl: + m = versionsearch.search(line) + if m: + return m.group(1) + else: + s = "Unable to locate version string in {0}" + raise RuntimeError(s.format(VERSIONFILE)) + +# If extensions (or modules to document with autodoc) are in another directory, +# add these directories to sys.path here. If the directory is relative to the +# documentation root, use os.path.abspath to make it absolute, like shown here. +#sys.path.insert(0, os.path.abspath('.')) + +# -- General configuration ------------------------------------------------ + +# If your documentation needs a minimal Sphinx version, state it here. +#needs_sphinx = '1.0' + +# Add any Sphinx extension module names here, as strings. They can be +# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom +# ones. +extensions = [ + 'sphinx.ext.autodoc', + 'sphinx.ext.intersphinx', + 'numpydoc', +] + +# Add any paths that contain templates here, relative to this directory. +templates_path = ['_templates'] + +# The suffix of source filenames. +source_suffix = '.rst' + +# The encoding of source files. +#source_encoding = 'utf-8-sig' + +# The master toctree document. +master_doc = 'index' + +# General information about the project. +project = u'natsort' +copyright = u'2014, Seth M. Morton' + +# The version info for the project you're documenting, acts as replacement for +# |version| and |release|, also used in various other places throughout the +# built documents. +# +# The full version, including alpha/beta/rc tags. +release = current_version() +# The short X.Y version. +version = '.'.join(release.split('.')[0:2]) + +# The language for content autogenerated by Sphinx. 
Refer to documentation +# for a list of supported languages. +#language = None + +# There are two options for replacing |today|: either, you set today to some +# non-false value, then it is used: +#today = '' +# Else, today_fmt is used as the format for a strftime call. +#today_fmt = '%B %d, %Y' + +# List of patterns, relative to source directory, that match files and +# directories to ignore when looking for source files. +exclude_patterns = ['solar/*'] + +# The reST default role (used for this markup: `text`) to use for all +# documents. +#default_role = None + +# If true, '()' will be appended to :func: etc. cross-reference text. +#add_function_parentheses = True + +# If true, the current module name will be prepended to all description +# unit titles (such as .. function::). +#add_module_names = True + +# If true, sectionauthor and moduleauthor directives will be shown in the +# output. They are ignored by default. +#show_authors = False + +# The name of the Pygments (syntax highlighting) style to use. +pygments_style = 'sphinx' +highlight_language = 'python' + +# A list of ignored prefixes for module index sorting. +#modindex_common_prefix = [] + +# If true, keep warnings as "system message" paragraphs in the built documents. +#keep_warnings = False + + +# -- Options for HTML output ---------------------------------------------- + +# The theme to use for HTML and HTML Help pages. See the documentation for +# a list of builtin themes. +html_theme = 'solar' + +# Theme options are theme-specific and customize the look and feel of a theme +# further. For a list of options available for each theme, see the +# documentation. +#html_theme_options = {} + +# Add any paths that contain custom themes here, relative to this directory. +html_theme_path = ['.'] + +# The name for this set of Sphinx documents. If None, it defaults to +# "<project> v<release> documentation". +#html_title = None + +# A shorter title for the navigation bar. Default is the same as html_title. 
+#html_short_title = None + +# The name of an image file (relative to this directory) to place at the top +# of the sidebar. +#html_logo = None + +# The name of an image file (within the static path) to use as favicon of the +# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 +# pixels large. +#html_favicon = None + +# Add any paths that contain custom static files (such as style sheets) here, +# relative to this directory. They are copied after the builtin static files, +# so a file named "default.css" will overwrite the builtin "default.css". +html_static_path = ['_static'] + +# Add any extra paths that contain custom files (such as robots.txt or +# .htaccess) here, relative to this directory. These files are copied +# directly to the root of the documentation. +#html_extra_path = [] + +# If not '', a 'Last updated on:' timestamp is inserted at every page bottom, +# using the given strftime format. +#html_last_updated_fmt = '%b %d, %Y' + +# If true, SmartyPants will be used to convert quotes and dashes to +# typographically correct entities. +#html_use_smartypants = True + +# Custom sidebar templates, maps document names to template names. +#html_sidebars = {} + +# Additional templates that should be rendered to pages, maps page names to +# template names. +#html_additional_pages = {} + +# If false, no module index is generated. +#html_domain_indices = True + +# If false, no index is generated. +#html_use_index = True + +# If true, the index is split into individual pages for each letter. +#html_split_index = False + +# If true, links to the reST sources are added to the pages. +#html_show_sourcelink = True + +# If true, "Created using Sphinx" is shown in the HTML footer. Default is True. +#html_show_sphinx = True + +# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. 
+#html_show_copyright = True + +# If true, an OpenSearch description file will be output, and all pages will +# contain a <link> tag referring to it. The value of this option must be the +# base URL from which the finished HTML is served. +#html_use_opensearch = '' + +# This is the file name suffix for HTML files (e.g. ".xhtml"). +#html_file_suffix = None + +# Output file base name for HTML help builder. +htmlhelp_basename = 'natsortdoc' + + +# -- Options for LaTeX output --------------------------------------------- + +latex_elements = { +# The paper size ('letterpaper' or 'a4paper'). +#'papersize': 'letterpaper', + +# The font size ('10pt', '11pt' or '12pt'). +#'pointsize': '10pt', + +# Additional stuff for the LaTeX preamble. +#'preamble': '', +} + +# Grouping the document tree into LaTeX files. List of tuples +# (source start file, target name, title, +# author, documentclass [howto, manual, or own class]). +latex_documents = [ + ('index', 'natsort.tex', u'natsort Documentation', + u'Seth M. Morton', 'manual'), +] + +# The name of an image file (relative to this directory) to place at the top of +# the title page. +#latex_logo = None + +# For "manual" documents, if this is true, then toplevel headings are parts, +# not chapters. +#latex_use_parts = False + +# If true, show page references after internal links. +#latex_show_pagerefs = False + +# If true, show URL addresses after external links. +#latex_show_urls = False + +# Documents to append as an appendix to all manuals. +#latex_appendices = [] + +# If false, no module index is generated. +#latex_domain_indices = True + + +# -- Options for manual page output --------------------------------------- + +# One entry per manual page. List of tuples +# (source start file, name, description, authors, manual section). +man_pages = [ + ('index', 'natsort', u'natsort Documentation', + [u'Seth M. Morton'], 1) +] + +# If true, show URL addresses after external links. 
+#man_show_urls = False + + +# -- Options for Texinfo output ------------------------------------------- + +# Grouping the document tree into Texinfo files. List of tuples +# (source start file, target name, title, author, +# dir menu entry, description, category) +texinfo_documents = [ + ('index', 'natsort', u'natsort Documentation', + u'Seth M. Morton', 'natsort', 'One line description of project.', + 'Miscellaneous'), +] + +# Documents to append as an appendix to all manuals. +#texinfo_appendices = [] + +# If false, no module index is generated. +#texinfo_domain_indices = True + +# How to display URL addresses: 'footnote', 'no', or 'inline'. +#texinfo_show_urls = 'footnote' + +# If true, do not generate a @detailmenu in the "Top" node's menu. +#texinfo_no_detailmenu = False + + +# Example configuration for intersphinx: refer to the Python standard library. +intersphinx_mapping = {'http://docs.python.org/': None} diff --git a/docs/source/examples.rst b/docs/source/examples.rst new file mode 100644 index 0000000..9704495 --- /dev/null +++ b/docs/source/examples.rst @@ -0,0 +1,150 @@ +.. default-domain:: py +.. currentmodule:: natsort + +.. _examples: + +Examples and Recipes +==================== + +If you want more detailed examples than given on this page, please see +https://github.com/SethMMorton/natsort/tree/master/test_natsort. + +Basic Usage +----------- + +In the most basic use case, simply import :func:`~natsorted` and use +it as you would :func:`sorted`:: + + >>> a = ['a50', 'a51.', 'a50.4', 'a5.034e1', 'a50.300'] + >>> sorted(a) + ['a5.034e1', 'a50', 'a50.300', 'a50.4', 'a51.'] + >>> from natsort import natsorted + >>> natsorted(a) + ['a50', 'a50.300', 'a5.034e1', 'a50.4', 'a51.'] + +Customizing Float Definition +---------------------------- + +By default :func:`~natsorted` searches for any float that would be +a valid Python float literal, such as 5, 0.4, -4.78, +4.2E-34, etc. 
+Perhaps you don't want to search for signed numbers, or you don't +want to search for exponential notation, and the ``signed`` and +``exp`` options allow you to do this:: + + >>> a = ['a50', 'a51.', 'a+50.4', 'a5.034e1', 'a+50.300'] + >>> natsorted(a) + ['a50', 'a+50.300', 'a5.034e1', 'a+50.4', 'a51.'] + >>> natsorted(a, signed=False) + ['a50', 'a5.034e1', 'a51.', 'a+50.300', 'a+50.4'] + >>> natsorted(a, exp=False) + ['a5.034e1', 'a50', 'a+50.300', 'a+50.4', 'a51.'] + +Sort Version Numbers +-------------------- + +With default options, :func:`~natsorted` will not sort version numbers +well. Version numbers are best sorted by searching for valid unsigned int +literals, not floats. This can be achieved in three ways, as shown below:: + + >>> a = ['ver-2.9.9a', 'ver-1.11', 'ver-2.9.9b', 'ver-1.11.4', 'ver-1.10.1'] + >>> natsorted(a) # This gives incorrect results + ['ver-2.9.9a', 'ver-2.9.9b', 'ver-1.11', 'ver-1.11.4', 'ver-1.10.1'] + >>> natsorted(a, number_type=int, signed=False) + ['ver-1.10.1', 'ver-1.11', 'ver-1.11.4', 'ver-2.9.9a', 'ver-2.9.9b'] + >>> natsorted(a, number_type=None) + ['ver-1.10.1', 'ver-1.11', 'ver-1.11.4', 'ver-2.9.9a', 'ver-2.9.9b'] + >>> from natsort import versorted + >>> versorted(a) + ['ver-1.10.1', 'ver-1.11', 'ver-1.11.4', 'ver-2.9.9a', 'ver-2.9.9b'] + +You can see that ``number_type=None`` is a shortcut for ``number_type=int`` +and ``signed=False``, and the :func:`~versorted` is a shortcut for +``natsorted(number_type=None)``. The recommend manner to sort version +numbers is to use :func:`~versorted`. + +Sort OS-Generated Paths +----------------------- + +In some cases when sorting file paths with OS-Generated names, the default +:mod:`~natsorted` algorithm may not be sufficient. In cases like these, +you may need to use the ``as_path`` option:: + + >>> a = ['./folder/file (1).txt', + ... './folder/file.txt', + ... './folder (1)/file.txt', + ... 
'./folder (10)/file.txt'] + >>> natsorted(a) + ['./folder (1)/file.txt', './folder (10)/file.txt', './folder/file (1).txt', './folder/file.txt'] + >>> natsorted(a, as_path=True) + ['./folder/file.txt', './folder/file (1).txt', './folder (1)/file.txt', './folder (10)/file.txt'] + +Using a Custom Sorting Key +-------------------------- + +Like the built-in ``sorted`` function, ``natsorted`` can accept a custom +sort key so that:: + + >>> from operator import attrgetter, itemgetter + >>> a = [['a', 'num4'], ['b', 'num8'], ['c', 'num2']] + >>> natsorted(a, key=itemgetter(1)) + [['c', 'num2'], ['a', 'num4'], ['b', 'num8']] + >>> class Foo: + ... def __init__(self, bar): + ... self.bar = bar + ... def __repr__(self): + ... return "Foo('{0}')".format(self.bar) + >>> b = [Foo('num3'), Foo('num5'), Foo('num2')] + >>> natsorted(b, key=attrgetter('bar')) + [Foo('num2'), Foo('num3'), Foo('num5')] + +Generating a Natsort Key +------------------------ + +If you need to sort a list in-place, you cannot use :func:`~natsorted`; you +need to pass a key to the :meth:`list.sort` method. The function +:func:`~natsort_keygen` is a convenient way to generate these keys for you:: + + >>> from natsort import natsort_keygen + >>> a = ['a50', 'a51.', 'a50.4', 'a5.034e1', 'a50.300'] + >>> natsort_key = natsort_keygen() + >>> a.sort(key=natsort_key) + >>> a + ['a50', 'a50.300', 'a5.034e1', 'a50.4', 'a51.'] + >>> versort_key = natsort_keygen(number_type=None) + >>> a = ['ver-2.9.9a', 'ver-1.11', 'ver-2.9.9b', 'ver-1.11.4', 'ver-1.10.1'] + >>> a.sort(key=versort_key) + >>> a + ['ver-1.10.1', 'ver-1.11', 'ver-1.11.4', 'ver-2.9.9a', 'ver-2.9.9b'] + +:func:`~natsort_keygen` has the same API as :func:`~natsorted`. + +Sorting Multiple Lists According to a Single List +------------------------------------------------- + +Sometimes you have multiple lists, and you want to sort one of those +lists and reorder the other lists according to how the first was sorted. 
+To achieve this you would use the :func:`~index_natsorted` or +:func:`~index_versorted` in combination with the convenience function +:func:`~order_by_index`:: + + >>> from natsort import index_natsorted, order_by_index + >>> a = ['a2', 'a9', 'a1', 'a4', 'a10'] + >>> b = [4, 5, 6, 7, 8] + >>> c = ['hi', 'lo', 'ah', 'do', 'up'] + >>> index = index_natsorted(a) + >>> order_by_index(a, index) + ['a1', 'a2', 'a4', 'a9', 'a10'] + >>> order_by_index(b, index) + [6, 4, 7, 5, 8] + >>> order_by_index(c, index) + ['ah', 'hi', 'do', 'lo', 'up'] + +Returning Results in Reverse Order +---------------------------------- + +Just like the :func:`sorted` built-in function, you can supply the +``reverse`` option to return the results in reverse order:: + + >>> a = ['a2', 'a9', 'a1', 'a4', 'a10'] + >>> natsorted(a, reverse=True) + ['a10', 'a9', 'a4', 'a2', 'a1'] diff --git a/docs/source/index.rst b/docs/source/index.rst new file mode 100644 index 0000000..a6fd97c --- /dev/null +++ b/docs/source/index.rst @@ -0,0 +1,27 @@ +.. natsort documentation master file, created by + sphinx-quickstart on Thu Jul 17 21:01:29 2014. + You can adapt this file completely to your liking, but it should at least + contain the root `toctree` directive. + +natsort: Natural Sorting for Python +=================================== + +Contents: + +.. toctree:: + :maxdepth: 2 + :numbered: + + intro.rst + examples.rst + api.rst + shell.rst + changelog.rst + +Indices and tables +================== + +* :ref:`genindex` +* :ref:`modindex` +* :ref:`search` + diff --git a/docs/source/index_natsorted.rst b/docs/source/index_natsorted.rst new file mode 100644 index 0000000..ea48f25 --- /dev/null +++ b/docs/source/index_natsorted.rst @@ -0,0 +1,8 @@ +.. default-domain:: py +.. currentmodule:: natsort + +:func:`~natsort.index_natsorted` +================================ + +.. 
autofunction:: index_natsorted + diff --git a/docs/source/index_versorted.rst b/docs/source/index_versorted.rst new file mode 100644 index 0000000..07e266f --- /dev/null +++ b/docs/source/index_versorted.rst @@ -0,0 +1,8 @@ +.. default-domain:: py +.. currentmodule:: natsort + +:func:`~natsort.index_versorted` +================================ + +.. autofunction:: index_versorted + diff --git a/docs/source/intro.rst b/docs/source/intro.rst new file mode 100644 index 0000000..d4977e8 --- /dev/null +++ b/docs/source/intro.rst @@ -0,0 +1,116 @@ +.. default-domain:: py +.. module:: natsort + +The :mod:`natsort` module +========================= + +Natural sorting for python. Check out the source code at +https://github.com/SethMMorton/natsort. + +:mod:`natsort` was initially created for sorting scientific output filenames that +contained floating point numbers in the names. There was a serious lack of +algorithms out there that could perform a natural sort on `floats` but +plenty for ints; check out +`this StackOverflow question <http://stackoverflow.com/q/4836710/1399279>`_ +and its answers and links therein, +`this ActiveState forum <http://code.activestate.com/recipes/285264-natural-string-sorting/>`_, +and of course `this great article on natural sorting <http://blog.codinghorror.com/sorting-for-humans-natural-sort-order/>`_ +from CodingHorror.com for examples of what I mean. +:mod:`natsort` was created to fill in this gap. It has since grown +and can now sort version numbers (which seems to be the +most common use case based on user feedback) as well as some other nice features. 
+ +Quick Description +----------------- + +When you try to sort a list of strings that contain numbers, the normal python +sort algorithm sorts lexicographically, so you might not get the results that you +expect:: + + >>> a = ['a2', 'a9', 'a1', 'a4', 'a10'] + >>> sorted(a) + ['a1', 'a10', 'a2', 'a4', 'a9'] + +Notice that it has the order ('1', '10', '2') - this is because the list is +being sorted in lexicographical order, which sorts numbers like you would +letters (i.e. 'b', 'ba', 'c'). + +:mod:`natsort` provides a function :func:`~natsorted` that helps sort lists +"naturally", either as real numbers (i.e. signed/unsigned floats or ints), +or as versions. Using :func:`~natsorted` is simple:: + + >>> from natsort import natsorted + >>> a = ['a2', 'a9', 'a1', 'a4', 'a10'] + >>> natsorted(a) + ['a1', 'a2', 'a4', 'a9', 'a10'] + +:func:`~natsorted` identifies real numbers anywhere in a string and sorts them +naturally. + +Sorting version numbers is just as easy with :func:`~versorted`:: + + >>> from natsort import versorted + >>> a = ['version-1.9', 'version-2.0', 'version-1.11', 'version-1.10'] + >>> versorted(a) + ['version-1.9', 'version-1.10', 'version-1.11', 'version-2.0'] + >>> natsorted(a) # natsorted tries to sort as signed floats, so it won't work + ['version-2.0', 'version-1.9', 'version-1.11', 'version-1.10'] + +You can mix and match ``int``, ``float``, and ``str`` (or ``unicode``) types +when you sort:: + + >>> a = ['4.5', 6, 2.0, '5', 'a'] + >>> natsorted(a) + [2.0, '4.5', '5', 6, 'a'] + >>> # On Python 2, sorted(a) would return [2.0, 6, '4.5', '5', 'a'] + >>> # On Python 3, sorted(a) would raise an "unorderable types" TypeError + +The natsort algorithm does other fancy things like + + - recursively descend into lists of lists + - sort file paths correctly + - allow custom sorting keys + - allow exposed a natsort_key generator to pass to list.sort + +Please see the :ref:`examples` for a quick start guide, or the :ref:`api` +for more details. 
+ +Installation +------------ + +Installation of :mod:`natsort` is ultra-easy. Simply execute from the +command line:: + + easy_install natsort + +or, if you have ``pip`` (preferred over ``easy_install``):: + + pip install natsort + +Both of the above commands will download the source for you. + +You can also download the source from http://pypi.python.org/pypi/natsort, +or browse the git repository at https://github.com/SethMMorton/natsort. + +If you choose to install from source, you can unzip the source archive and +enter the directory, and type:: + + python setup.py install + +If you wish to run the unit tests, enter:: + + python setup.py test + +If you want to build this documentation, enter:: + + python setup.py build_sphinx + +:mod:`natsort` requires python version 2.6 or greater +(this includes python 3.x). To run version 2.6, 3.0, or 3.1 the +`argparse <https://pypi.python.org/pypi/argparse>`_ module is required. + +:mod:`natsort` comes with a shell script called :mod:`natsort`, or can also be called +from the command line with ``python -m natsort``. The command line script is +only installed onto your ``PATH`` if you don't install via a wheel. There is +apparently a known bug with the wheel installation process that will not create +entry points. diff --git a/docs/source/natsort_key.rst b/docs/source/natsort_key.rst new file mode 100644 index 0000000..351b351 --- /dev/null +++ b/docs/source/natsort_key.rst @@ -0,0 +1,8 @@ +.. default-domain:: py +.. currentmodule:: natsort + +:func:`~natsort.natsort_key` +============================ + +.. autofunction:: natsort_key + diff --git a/docs/source/natsort_keygen.rst b/docs/source/natsort_keygen.rst new file mode 100644 index 0000000..b0d5988 --- /dev/null +++ b/docs/source/natsort_keygen.rst @@ -0,0 +1,8 @@ +.. default-domain:: py +.. currentmodule:: natsort + +:func:`~natsort.natsort_keygen` +=============================== + +.. 
autofunction:: natsort_keygen + diff --git a/docs/source/natsorted.rst b/docs/source/natsorted.rst new file mode 100644 index 0000000..30b5692 --- /dev/null +++ b/docs/source/natsorted.rst @@ -0,0 +1,8 @@ +.. default-domain:: py +.. currentmodule:: natsort + +:func:`~natsort.natsorted` +========================== + +.. autofunction:: natsorted + diff --git a/docs/source/order_by_index.rst b/docs/source/order_by_index.rst new file mode 100644 index 0000000..b1d7681 --- /dev/null +++ b/docs/source/order_by_index.rst @@ -0,0 +1,8 @@ +.. default-domain:: py +.. currentmodule:: natsort + +:func:`~natsort.order_by_index` +=============================== + +.. autofunction:: order_by_index + diff --git a/docs/source/shell.rst b/docs/source/shell.rst new file mode 100644 index 0000000..e29a6fe --- /dev/null +++ b/docs/source/shell.rst @@ -0,0 +1,137 @@ +.. default-domain:: py +.. currentmodule:: natsort + +.. _shell: + +Shell Script +============ + +The ``natsort`` shell script is automatically installed when you install +:mod:`natsort` from "zip" or "tar.gz" via ``pip`` or ``easy_install`` +(there is a known bug with wheels that will not install the shell script). + +Below is the usage and some usage examples for the ``natsort`` shell script. + +Usage +----- + +:: + + usage: natsort [-h] [--version] [-p] [-f LOW HIGH] [-F LOW HIGH] + [-e EXCLUDE] [-r] [-t {digit,int,float,version,ver}] + [--nosign] [--noexp] + [entries [entries ...]] + + Performs a natural sort on entries given on the command-line. + A natural sort sorts numerically then alphabetically, and will sort + by numbers in the middle of an entry. + + positional arguments: + entries The entries to sort. Taken from stdin if nothing is + given on the command line. + + optional arguments: + -h, --help show this help message and exit + --version show program's version number and exit + -p, --paths Interpret the input as file paths. 
This is not + strictly necessary to sort all file paths, but in + cases where there are OS-generated file paths like + "Folder/" and "Folder (1)/", this option is needed to + make the paths sorted in the order you expect + ("Folder/" before "Folder (1)/"). + -f LOW HIGH, --filter LOW HIGH + Used for keeping only the entries that have a number + falling in the given range. + -F LOW HIGH, --reverse-filter LOW HIGH + Used for excluding the entries that have a number + falling in the given range. + -e EXCLUDE, --exclude EXCLUDE + Used to exclude an entry that contains a specific + number. + -r, --reverse Returns in reversed order. + -t {digit,int,float,version,ver}, --number-type {digit,int,float,version,ver} + Choose the type of number to search for. "float" will + search for floating-point numbers. "int" will only + search for integers. "digit", "version", and "ver" are + shortcuts for "int" with --nosign. + --nosign Do not consider "+" or "-" as part of a number, i.e. + do not take sign into consideration. + --noexp Do not consider an exponential as part of a number, + i.e. 1e4, would be considered as 1, "e", and 4, not as + 10000. This only effects the --number-type=float. + +Description +----------- + +``natsort`` was originally written to aid in computational chemistry +research so that it would be easy to analyze large sets of output files +named after the parameter used:: + + $ ls *.out + mode1000.35.out mode1243.34.out mode744.43.out mode943.54.out + +(Obviously, in reality there would be more files, but you get the idea.) Notice +that the shell sorts in lexicographical order. This is the behavior of programs like +``find`` as well as ``ls``. The problem is passing these files to an +analysis program causes them not to appear in numerical order, which can lead +to bad analysis. 
To remedy this, use ``natsort``::
+
+    $ natsort *.out
+    mode744.43.out
+    mode943.54.out
+    mode1000.35.out
+    mode1243.34.out
+    $ natsort *.out | xargs your_program
+
+You can also place natsort in the middle of a pipe::
+
+    $ find . -name "*.out" | natsort | xargs your_program
+
+To sort version numbers, use the ``--number-type version`` option
+(or ``-t ver`` for short)::
+
+    $ ls *
+    prog-1.10.zip prog-1.9.zip prog-2.0.zip
+    $ natsort -t ver *
+    prog-1.9.zip
+    prog-1.10.zip
+    prog-2.0.zip
+
+In general, all ``natsort`` shell script options mirror the :func:`~natsorted` API,
+with notable exception of the ``--filter``, ``--reverse-filter``, and ``--exclude``
+options. These three options are used as follows::
+
+    $ ls *.out
+    mode1000.35.out mode1243.34.out mode744.43.out mode943.54.out
+    $ natsort *.out -f 900 1100 # Select only numbers between 900-1100
+    mode943.54.out
+    mode1000.35.out
+    $ natsort *.out -F 900 1100 # Select only numbers NOT between 900-1100
+    mode744.43.out
+    mode1243.34.out
+    $ natsort *.out -e 1000.35 # Exclude 1000.35 from search
+    mode744.43.out
+    mode943.54.out
+    mode1243.34.out
+
+If you are sorting paths with OS-generated filenames, you may require the
+``--paths``/``-p`` option::
+
+    $ find . ! -path . -type f
+    ./folder/file (1).txt
+    ./folder/file.txt
+    ./folder (1)/file.txt
+    ./folder (10)/file.txt
+    ./folder (2)/file.txt
+    $ find . ! -path . -type f | natsort
+    ./folder (1)/file.txt
+    ./folder (2)/file.txt
+    ./folder (10)/file.txt
+    ./folder/file (1).txt
+    ./folder/file.txt
+    $ find . ! -path . -type f | natsort -p
+    ./folder/file.txt
+    ./folder/file (1).txt
+    ./folder (1)/file.txt
+    ./folder (2)/file.txt
+    ./folder (10)/file.txt
diff --git a/docs/source/solar/NEWS.txt b/docs/source/solar/NEWS.txt
new file mode 100644
index 0000000..d9743ee
--- /dev/null
+++ b/docs/source/solar/NEWS.txt
@@ -0,0 +1,32 @@
+News
+====
+
+1.3
+---
+* Release date: 2012-11-01.
+* Source Code Pro is now used for code samples.
+* Reduced font size of pre elements.
+* Horizontal rule for header elements.
+* HTML pre contents are now wrapped (no scrollbars).
+* Changed permalink color from black to a lighter one.
+
+1.2
+---
+* Release date: 2012-10-03.
+* Style additional admonition levels.
+* Increase padding for navigation links (minor).
+* Add shadow for admonition items (minor).
+
+1.1
+---
+* Release date: 2012-09-05.
+* Add a new background.
+* Revert font of headings to Open Sans Light.
+* Darker color for h3 - h6.
+* Removed dependency on solarized dark pygments style.
+* Nice looking scrollbars for pre element.
+
+1.0
+---
+* Release date: 2012-08-24.
+* Initial release.
diff --git a/docs/source/solar/README.rst b/docs/source/solar/README.rst
new file mode 100644
index 0000000..caeedbd
--- /dev/null
+++ b/docs/source/solar/README.rst
@@ -0,0 +1,28 @@
+Solar theme for Python Sphinx
+=============================
+Solar is an attempt to create a theme for Sphinx based on the `Solarized <http://ethanschoonover.com/solarized>`_ color scheme.
+
+Preview
+-------
+http://vimalkumar.in/sphinx-themes/solar
+
+Download
+--------
+Released versions are available from http://github.com/vkvn/sphinx-themes/downloads
+
+Installation
+------------
+#. Extract the archive.
+#. Modify ``conf.py`` of an existing Sphinx project or create new project using ``sphinx-quickstart``.
+#. Change the ``html_theme`` parameter to ``solar``.
+#. Change the ``html_theme_path`` to the location containing the extracted archive.
+
+License
+-------
+`GNU General Public License <http://www.gnu.org/licenses/gpl.html>`_.
+
+Credits
+-------
+Modified from the default Sphinx theme -- Sphinxdoc
+
+Background pattern from http://subtlepatterns.com.
diff --git a/docs/source/solar/layout.html b/docs/source/solar/layout.html
new file mode 100644
index 0000000..6c57110
--- /dev/null
+++ b/docs/source/solar/layout.html
@@ -0,0 +1,32 @@
+{% extends "basic/layout.html" %}
+
+{%- block doctype -%}
+<!DOCTYPE html>
+{%- endblock -%}
+
+{%- block extrahead -%}
+<link href='http://fonts.googleapis.com/css?family=Source+Code+Pro|Open+Sans:300italic,400italic,700italic,400,300,700' rel='stylesheet' type='text/css'>
+<link href="{{ pathto("_static/solarized-dark.css", 1) }}" rel="stylesheet">
+{%- endblock -%}
+
+{# put the sidebar before the body #}
+{% block sidebar1 %}{{ sidebar() }}{% endblock %}
+{% block sidebar2 %}{% endblock %}
+
+{%- block footer %}
+    <div class="footer">
+    {%- if show_copyright %}
+      {%- if hasdoc('copyright') %}
+        {% trans path=pathto('copyright'), copyright=copyright|e %}© <a href="{{ path }}">Copyright</a> {{ copyright }}.{% endtrans %}
+      {%- else %}
+        {% trans copyright=copyright|e %}© Copyright {{ copyright }}.{% endtrans %}
+      {%- endif %}
+    {%- endif %}
+    {%- if last_updated %}
+      {% trans last_updated=last_updated|e %}Last updated on {{ last_updated }}.{% endtrans %}
+    {%- endif %}
+    {%- if show_sphinx %}
+      {% trans sphinx_version=sphinx_version|e %}Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> {{ sphinx_version }}.Theme by <a href="http://github.com/vkvn">vkvn</a>{% endtrans %}
+    {%- endif %}
+    </div>
+{%- endblock %}
diff --git a/docs/source/solar/static/solar.css b/docs/source/solar/static/solar.css
new file mode 100644
index 0000000..15b5ade
--- /dev/null
+++ b/docs/source/solar/static/solar.css
@@ -0,0 +1,344 @@
+/* solar.css
+ * Modified from sphinxdoc.css of the sphinxdoc theme.
+*/
+
+@import url("basic.css");
+
+/* -- page layout ----------------------------------------------------------- */
+
+body {
+    font-family: 'Open Sans', sans-serif;
+    font-size: 14px;
+    line-height: 150%;
+    text-align: center;
+    color: #002b36;
+    padding: 0;
+    margin: 0px 80px 0px 80px;
+    min-width: 740px;
+    -moz-box-shadow: 0px 0px 10px #93a1a1;
+    -webkit-box-shadow: 0px 0px 10px #93a1a1;
+    box-shadow: 0px 0px 10px #93a1a1;
+    background: url("subtle_dots.png") repeat;
+
+}
+
+div.document {
+    background-color: #fcfcfc;
+    text-align: left;
+    background-repeat: repeat-x;
+}
+
+div.bodywrapper {
+    margin: 0 240px 0 0;
+    border-right: 1px dotted #eee8d5;
+}
+
+div.body {
+    background-color: white;
+    margin: 0;
+    padding: 0.5em 20px 20px 20px;
+}
+
+div.related {
+    font-size: 1em;
+    background: #002b36;
+    color: #839496;
+    padding: 5px 0px;
+}
+
+div.related ul {
+    height: 2em;
+    margin: 2px;
+}
+
+div.related ul li {
+    margin: 0;
+    padding: 0;
+    height: 2em;
+    float: left;
+}
+
+div.related ul li.right {
+    float: right;
+    margin-right: 5px;
+}
+
+div.related ul li a {
+    margin: 0;
+    padding: 2px 5px;
+    line-height: 2em;
+    text-decoration: none;
+    color: #839496;
+}
+
+div.related ul li a:hover {
+    background-color: #073642;
+    -webkit-border-radius: 2px;
+    -moz-border-radius: 2px;
+    border-radius: 2px;
+}
+
+div.sphinxsidebarwrapper {
+    padding: 0;
+}
+
+div.sphinxsidebar {
+    margin: 0;
+    padding: 0.5em 15px 15px 0;
+    width: 210px;
+    float: right;
+    font-size: 0.9em;
+    text-align: left;
+}
+
+div.sphinxsidebar h3, div.sphinxsidebar h4 {
+    margin: 1em 0 0.5em 0;
+    font-size: 1em;
+    padding: 0.7em;
+    background-color: #eeeff1;
+}
+
+div.sphinxsidebar h3 a {
+    color: #2E3436;
+}
+
+div.sphinxsidebar ul {
+    padding-left: 1.5em;
+    margin-top: 7px;
+    padding: 0;
+    line-height: 150%;
+    color: #586e75;
+}
+
+div.sphinxsidebar ul ul {
+    margin-left: 20px;
+}
+
+div.sphinxsidebar input {
+    border: 1px solid #eee8d5;
+}
+
+div.footer {
+    background-color: #93a1a1;
+    color: #eee;
+    padding: 3px 8px 3px 0;
+    clear: both;
+    font-size: 0.8em;
+    text-align: right;
+}
+
+div.footer a {
+    color: #eee;
+    text-decoration: none;
+}
+
+/* -- body styles ----------------------------------------------------------- */
+
+p {
+    margin: 0.8em 0 0.5em 0;
+}
+
+div.body a, div.sphinxsidebarwrapper a {
+    color: #268bd2;
+    text-decoration: none;
+}
+
+div.body a:hover, div.sphinxsidebarwrapper a:hover {
+    border-bottom: 1px solid #268bd2;
+}
+
+h1, h2, h3, h4, h5, h6 {
+    font-family: "Open Sans", sans-serif;
+    font-weight: 300;
+}
+
+h1 {
+    margin: 0;
+    padding: 0.7em 0 0.3em 0;
+    line-height: 1.2em;
+    color: #002b36;
+    text-shadow: #eee 0.1em 0.1em 0.1em;
+}
+
+h2 {
+    margin: 1.3em 0 0.2em 0;
+    padding: 0 0 10px 0;
+    color: #073642;
+    border-bottom: 1px solid #eee;
+}
+
+h3 {
+    margin: 1em 0 -0.3em 0;
+    padding-bottom: 5px;
+}
+
+h3, h4, h5, h6 {
+    color: #073642;
+    border-bottom: 1px dotted #eee;
+} + +div.body h1 a, div.body h2 a, div.body h3 a, div.body h4 a, div.body h5 a, div.body h6 a { + color: #657B83!important;
+} + +h1 a.anchor, h2 a.anchor, h3 a.anchor, h4 a.anchor, h5 a.anchor, h6 a.anchor { + display: none; + margin: 0 0 0 0.3em; + padding: 0 0.2em 0 0.2em; + color: #aaa!important; +} + +h1:hover a.anchor, h2:hover a.anchor, h3:hover a.anchor, h4:hover a.anchor, +h5:hover a.anchor, h6:hover a.anchor { + display: inline; +} + +h1 a.anchor:hover, h2 a.anchor:hover, h3 a.anchor:hover, h4 a.anchor:hover, +h5 a.anchor:hover, h6 a.anchor:hover { + color: #777; + background-color: #eee; +} + +a.headerlink { + color: #c60f0f!important; + font-size: 1em; + margin-left: 6px; + padding: 0 4px 0 4px; + text-decoration: none!important; +} + +a.headerlink:hover { + background-color: #ccc; + color: white!important; +} + + +cite, code, tt { + font-family: 'Source Code Pro', monospace;
+ font-size: 0.9em;
+ letter-spacing: 0.01em; + background-color: #eeeff2; + font-style: normal; +} + +hr { + border: 1px solid #eee; + margin: 2em; +} + +.highlight { + -webkit-border-radius: 2px; + -moz-border-radius: 2px; + border-radius: 2px; +} + +pre { + font-family: 'Source Code Pro', monospace;
+ font-style: normal; + font-size: 0.9em;
+ letter-spacing: 0.015em; + line-height: 120%; + padding: 0.7em; + white-space: pre-wrap; /* css-3 */
+ white-space: -moz-pre-wrap; /* Mozilla, since 1999 */
+ white-space: -pre-wrap; /* Opera 4-6 */
+ white-space: -o-pre-wrap; /* Opera 7 */
+ word-wrap: break-word; /* Internet Explorer 5.5+ */
+} + +pre a { + color: inherit; + text-decoration: underline; +} + +td.linenos pre { + padding: 0.5em 0; +} + +div.quotebar { + background-color: #f8f8f8; + max-width: 250px; + float: right; + padding: 2px 7px; + border: 1px solid #ccc; +} + +div.topic { + background-color: #f8f8f8; +} + +table { + border-collapse: collapse; + margin: 0 -0.5em 0 -0.5em; +} + +table td, table th { + padding: 0.2em 0.5em 0.2em 0.5em; +} + +div.admonition { + font-size: 0.9em; + margin: 1em 0 1em 0; + border: 1px solid #eee; + background-color: #f7f7f7; + padding: 0; + -moz-box-shadow: 0px 8px 6px -8px #93a1a1; + -webkit-box-shadow: 0px 8px 6px -8px #93a1a1; + box-shadow: 0px 8px 6px -8px #93a1a1; +} + +div.admonition p { + margin: 0.5em 1em 0.5em 1em; + padding: 0.2em; +} + +div.admonition pre { + margin: 0.4em 1em 0.4em 1em; +} + +div.admonition p.admonition-title +{ + margin: 0; + padding: 0.2em 0 0.2em 0.6em; + color: white; + border-bottom: 1px solid #eee8d5; + font-weight: bold; + background-color: #268bd2; +} + +div.warning p.admonition-title, +div.important p.admonition-title { + background-color: #cb4b16; +} + +div.hint p.admonition-title, +div.tip p.admonition-title { + background-color: #859900; +} + +div.caution p.admonition-title, +div.attention p.admonition-title, +div.danger p.admonition-title, +div.error p.admonition-title { + background-color: #dc322f; +} + +div.admonition ul, div.admonition ol { + margin: 0.1em 0.5em 0.5em 3em; + padding: 0; +} + +div.versioninfo { + margin: 1em 0 0 0; + border: 1px solid #eee; + background-color: #DDEAF0; + padding: 8px; + line-height: 1.3em; + font-size: 0.9em; +} + +div.viewcode-block:target { + background-color: #f4debf; + border-top: 1px solid #eee; + border-bottom: 1px solid #eee; +} diff --git a/docs/source/solar/static/solarized-dark.css b/docs/source/solar/static/solarized-dark.css new file mode 100644 index 0000000..6ebb945 --- /dev/null +++ b/docs/source/solar/static/solarized-dark.css @@ -0,0 +1,84 @@ +/* solarized dark 
style for solar theme */ + +/*style pre scrollbar*/ +pre::-webkit-scrollbar, .highlight::-webkit-scrollbar { + height: 0.5em; + background: #073642; +} + +pre::-webkit-scrollbar-thumb { + border-radius: 1em; + background: #93a1a1; +} + +/* pygments style */ +.highlight .hll { background-color: #ffffcc } +.highlight { background: #002B36!important; color: #93A1A1 } +.highlight .c { color: #586E75 } /* Comment */ +.highlight .err { color: #93A1A1 } /* Error */ +.highlight .g { color: #93A1A1 } /* Generic */ +.highlight .k { color: #859900 } /* Keyword */ +.highlight .l { color: #93A1A1 } /* Literal */ +.highlight .n { color: #93A1A1 } /* Name */ +.highlight .o { color: #859900 } /* Operator */ +.highlight .x { color: #CB4B16 } /* Other */ +.highlight .p { color: #93A1A1 } /* Punctuation */ +.highlight .cm { color: #586E75 } /* Comment.Multiline */ +.highlight .cp { color: #859900 } /* Comment.Preproc */ +.highlight .c1 { color: #586E75 } /* Comment.Single */ +.highlight .cs { color: #859900 } /* Comment.Special */ +.highlight .gd { color: #2AA198 } /* Generic.Deleted */ +.highlight .ge { color: #93A1A1; font-style: italic } /* Generic.Emph */ +.highlight .gr { color: #DC322F } /* Generic.Error */ +.highlight .gh { color: #CB4B16 } /* Generic.Heading */ +.highlight .gi { color: #859900 } /* Generic.Inserted */ +.highlight .go { color: #93A1A1 } /* Generic.Output */ +.highlight .gp { color: #93A1A1 } /* Generic.Prompt */ +.highlight .gs { color: #93A1A1; font-weight: bold } /* Generic.Strong */ +.highlight .gu { color: #CB4B16 } /* Generic.Subheading */ +.highlight .gt { color: #93A1A1 } /* Generic.Traceback */ +.highlight .kc { color: #CB4B16 } /* Keyword.Constant */ +.highlight .kd { color: #268BD2 } /* Keyword.Declaration */ +.highlight .kn { color: #859900 } /* Keyword.Namespace */ +.highlight .kp { color: #859900 } /* Keyword.Pseudo */ +.highlight .kr { color: #268BD2 } /* Keyword.Reserved */ +.highlight .kt { color: #DC322F } /* Keyword.Type */ +.highlight .ld { 
color: #93A1A1 } /* Literal.Date */ +.highlight .m { color: #2AA198 } /* Literal.Number */ +.highlight .s { color: #2AA198 } /* Literal.String */ +.highlight .na { color: #93A1A1 } /* Name.Attribute */ +.highlight .nb { color: #B58900 } /* Name.Builtin */ +.highlight .nc { color: #268BD2 } /* Name.Class */ +.highlight .no { color: #CB4B16 } /* Name.Constant */ +.highlight .nd { color: #268BD2 } /* Name.Decorator */ +.highlight .ni { color: #CB4B16 } /* Name.Entity */ +.highlight .ne { color: #CB4B16 } /* Name.Exception */ +.highlight .nf { color: #268BD2 } /* Name.Function */ +.highlight .nl { color: #93A1A1 } /* Name.Label */ +.highlight .nn { color: #93A1A1 } /* Name.Namespace */ +.highlight .nx { color: #93A1A1 } /* Name.Other */ +.highlight .py { color: #93A1A1 } /* Name.Property */ +.highlight .nt { color: #268BD2 } /* Name.Tag */ +.highlight .nv { color: #268BD2 } /* Name.Variable */ +.highlight .ow { color: #859900 } /* Operator.Word */ +.highlight .w { color: #93A1A1 } /* Text.Whitespace */ +.highlight .mf { color: #2AA198 } /* Literal.Number.Float */ +.highlight .mh { color: #2AA198 } /* Literal.Number.Hex */ +.highlight .mi { color: #2AA198 } /* Literal.Number.Integer */ +.highlight .mo { color: #2AA198 } /* Literal.Number.Oct */ +.highlight .sb { color: #586E75 } /* Literal.String.Backtick */ +.highlight .sc { color: #2AA198 } /* Literal.String.Char */ +.highlight .sd { color: #93A1A1 } /* Literal.String.Doc */ +.highlight .s2 { color: #2AA198 } /* Literal.String.Double */ +.highlight .se { color: #CB4B16 } /* Literal.String.Escape */ +.highlight .sh { color: #93A1A1 } /* Literal.String.Heredoc */ +.highlight .si { color: #2AA198 } /* Literal.String.Interpol */ +.highlight .sx { color: #2AA198 } /* Literal.String.Other */ +.highlight .sr { color: #DC322F } /* Literal.String.Regex */ +.highlight .s1 { color: #2AA198 } /* Literal.String.Single */ +.highlight .ss { color: #2AA198 } /* Literal.String.Symbol */ +.highlight .bp { color: #268BD2 } /* 
Name.Builtin.Pseudo */ +.highlight .vc { color: #268BD2 } /* Name.Variable.Class */ +.highlight .vg { color: #268BD2 } /* Name.Variable.Global */ +.highlight .vi { color: #268BD2 } /* Name.Variable.Instance */ +.highlight .il { color: #2AA198 } /* Literal.Number.Integer.Long */ diff --git a/docs/source/solar/static/subtle_dots.png b/docs/source/solar/static/subtle_dots.png Binary files differ new file mode 100644 index 0000000..bb2d611 --- /dev/null +++ b/docs/source/solar/static/subtle_dots.png diff --git a/docs/source/solar/theme.conf b/docs/source/solar/theme.conf new file mode 100644 index 0000000..d8fc2f3 --- /dev/null +++ b/docs/source/solar/theme.conf @@ -0,0 +1,4 @@ +[theme] +inherit = basic +stylesheet = solar.css +pygments_style = none diff --git a/docs/source/versorted.rst b/docs/source/versorted.rst new file mode 100644 index 0000000..6f88597 --- /dev/null +++ b/docs/source/versorted.rst @@ -0,0 +1,8 @@ +.. default-domain:: py +.. currentmodule:: natsort + +:func:`~natsort.versorted` +========================== + +.. 
autofunction:: versorted + diff --git a/natsort/__init__.py b/natsort/__init__.py index 7c474e7..ac8171d 100644 --- a/natsort/__init__.py +++ b/natsort/__init__.py @@ -1,14 +1,18 @@ # -*- coding: utf-8 -*- -from __future__ import print_function, division, unicode_literals, absolute_import +from __future__ import (print_function, division, + unicode_literals, absolute_import) -from .natsort import natsort_key, natsorted, index_natsorted, versorted, index_versorted +from .natsort import (natsort_key, natsort_keygen, natsorted, + index_natsorted, versorted, index_versorted, + order_by_index) from ._version import __version__ __all__ = [ - 'natsort_key', - 'natsorted', - 'versorted' - 'index_natsorted', - 'index_versorted', - ] - + 'natsort_key', + 'natsort_keygen', + 'natsorted', + 'versorted' + 'index_natsorted', + 'index_versorted', + 'order_by_index', +] diff --git a/natsort/__main__.py b/natsort/__main__.py index 7930d1a..af8ef63 100644 --- a/natsort/__main__.py +++ b/natsort/__main__.py @@ -1,14 +1,10 @@ # -*- coding: utf-8 -*- -from __future__ import print_function, division, unicode_literals, absolute_import +from __future__ import (print_function, division, + unicode_literals, absolute_import) import sys -import os -import re -from .natsort import natsort_key, natsorted, int_nosign_re, int_sign_re -from .natsort import float_sign_exp_re, float_nosign_exp_re -from .natsort import float_sign_noexp_re, float_nosign_noexp_re -from .natsort import regex_and_num_function_chooser +from .natsort import natsorted, regex_and_num_function_chooser from ._version import __version__ from .py23compat import py23_str @@ -26,38 +22,50 @@ def main(): formatter_class=RawDescriptionHelpFormatter) parser.add_argument('--version', action='version', version='%(prog)s {0}'.format(__version__)) - parser.add_argument('-f', '--filter', help='Used for ' - 'keeping only the entries that have a number ' - 'falling in the given range.', nargs=2, type=float, - metavar=('LOW', 'HIGH'), 
action='append') - parser.add_argument('-F', '--reverse-filter', help='Used for ' - 'excluding the entries that have a number ' - 'falling in the given range.', nargs=2, type=float, - metavar=('LOW', 'HIGH'), action='append', - dest='reverse_filter') - parser.add_argument('-e', '--exclude', type=float, action='append', - help='Used to exclude an entry ' - 'that contains a specific number.') - parser.add_argument('-r', '--reverse', help='Returns in reversed order.', - action='store_true', default=False) - parser.add_argument('-t', '--number-type', '--number_type', dest='number_type', - choices=('digit', 'int', 'float', 'version', 'ver'), - default='float', help='Choose the type of number ' - 'to search for. "float" will search for floating-point ' - 'numbers. "int" will only search for integers. ' - '"digit", "version", and "ver" are shortcuts for "int" ' - 'with --nosign.') - parser.add_argument('--nosign', default=True, action='store_false', - dest='signed', help='Do not consider "+" or "-" as part ' - 'of a number, i.e. do not take sign into consideration.') - parser.add_argument('--noexp', default=True, action='store_false', - dest='exp', help='Do not consider an exponential as part ' - 'of a number, i.e. 1e4, would be considered as 1, "e", ' - 'and 4, not as 10000. This only effects the ' - '--number_type=float.') - parser.add_argument('entries', help='The entries to sort. Taken from stdin ' - 'if nothing is given on the command line.', nargs='*', - default=sys.stdin) + parser.add_argument( + '-p', '--paths', default=False, action='store_true', + help='Interpret the input as file paths. 
This is not ' + 'strictly necessary to sort all file paths, but in cases ' + 'where there are OS-generated file paths like "Folder/" ' + 'and "Folder (1)/", this option is needed to make the ' + 'paths sorted in the order you expect ("Folder/" before ' + '"Folder (1)/").') + parser.add_argument( + '-f', '--filter', nargs=2, type=float, metavar=('LOW', 'HIGH'), + action='append', + help='Used for keeping only the entries that have a number ' + 'falling in the given range.') + parser.add_argument( + '-F', '--reverse-filter', nargs=2, type=float, + metavar=('LOW', 'HIGH'), action='append', dest='reverse_filter', + help='Used for excluding the entries that have a number ' + 'falling in the given range.') + parser.add_argument( + '-e', '--exclude', type=float, action='append', + help='Used to exclude an entry that contains a specific number.') + parser.add_argument( + '-r', '--reverse', action='store_true', default=False, + help='Returns in reversed order.') + parser.add_argument( + '-t', '--number-type', '--number_type', dest='number_type', + choices=('digit', 'int', 'float', 'version', 'ver'), default='float', + help='Choose the type of number to search for. "float" will search ' + 'for floating-point numbers. "int" will only search for ' + 'integers. "digit", "version", and "ver" are shortcuts for "int" ' + 'with --nosign.') + parser.add_argument( + '--nosign', default=True, action='store_false', dest='signed', + help='Do not consider "+" or "-" as part of a number, i.e. do not ' + 'take sign into consideration.') + parser.add_argument( + '--noexp', default=True, action='store_false', dest='exp', + help='Do not consider an exponential as part of a number, i.e. 1e4, ' + 'would be considered as 1, "e", and 4, not as 10000. This only ' + 'effects the --number-type=float.') + parser.add_argument( + 'entries', nargs='*', default=sys.stdin, + help='The entries to sort. 
Taken from stdin if nothing is given on ' + 'the command line.', ) args = parser.parse_args() # Make sure the filter range is given properly. Does nothing if no filter @@ -108,8 +116,8 @@ def keep_entry_range(entry, lows, highs, converter, regex): and False if it is not in the range and should not be kept. """ return any(low <= converter(num) <= high - for num in regex.findall(entry) - for low, high in zip(lows, highs)) + for num in regex.findall(entry) + for low, high in zip(lows, highs)) def exclude_entry(entry, values, converter, regex): @@ -133,29 +141,37 @@ def sort_and_print_entries(entries, args): 'int': int, 'float': float}[args.number_type], 'signed': args.signed, - 'exp': args.exp} + 'exp': args.exp, + 'as_path': args.paths, + 'reverse': args.reverse, } # Pre-remove entries that don't pass the filtering criteria - # Make sure we use the same searching algorithm for filtering as for sorting. - if args.filter is not None or args.reverse_filter is not None or args.exclude: + # Make sure we use the same searching algorithm for filtering + # as for sorting. 
+ do_filter = args.filter is not None or args.reverse_filter is not None + if do_filter or args.exclude: inp_options = (kwargs['number_type'], args.signed, args.exp) regex, num_function = regex_and_num_function_chooser[inp_options] if args.filter is not None: - lows, highs = [f[0] for f in args.filter], [f[1] for f in args.filter] + lows, highs = ([f[0] for f in args.filter], + [f[1] for f in args.filter]) entries = [entry for entry in entries - if keep_entry_range(entry, lows, highs, num_function, regex)] + if keep_entry_range(entry, lows, highs, + num_function, regex)] if args.reverse_filter is not None: - lows, highs = [f[0] for f in args.reverse_filter], [f[1] for f in args.reverse_filter] + lows, highs = ([f[0] for f in args.reverse_filter], + [f[1] for f in args.reverse_filter]) entries = [entry for entry in entries - if not keep_entry_range(entry, lows, highs, num_function, regex)] + if not keep_entry_range(entry, lows, highs, + num_function, regex)] if args.exclude: exclude = set(args.exclude) entries = [entry for entry in entries - if exclude_entry(entry, exclude, num_function, regex)] + if exclude_entry(entry, exclude, + num_function, regex)] # Print off the sorted results - entries.sort(key=lambda x: natsort_key(x, **kwargs), reverse=args.reverse) - for entry in entries: + for entry in natsorted(entries, **kwargs): print(entry) diff --git a/natsort/_version.py b/natsort/_version.py index de2a514..d220a20 100644 --- a/natsort/_version.py +++ b/natsort/_version.py @@ -1,4 +1,5 @@ # -*- coding: utf-8 -*- -from __future__ import print_function, division, unicode_literals, absolute_import +from __future__ import (print_function, division, + unicode_literals, absolute_import) -__version__ = '3.3.0' +__version__ = '3.4.0' diff --git a/natsort/natsort.py b/natsort/natsort.py index e7c3e04..c67ec3f 100644 --- a/natsort/natsort.py +++ b/natsort/natsort.py @@ -12,17 +12,21 @@ See the README or the natsort homepage for more details. 
""" -from __future__ import print_function, division, unicode_literals, absolute_import +from __future__ import (print_function, division, + unicode_literals, absolute_import) import re -import sys +from os import curdir, pardir +from os.path import split, splitext from operator import itemgetter -from numbers import Number +from functools import partial from itertools import islice +from warnings import warn -from .py23compat import u_format, py23_basestring, py23_range, py23_str, py23_zip +from .py23compat import u_format, py23_str, py23_zip -__doc__ = u_format(__doc__) # Make sure the doctest works for either python2 or python3 +# Make sure the doctest works for either python2 or python3 +__doc__ = u_format(__doc__) # The regex that locates floats float_sign_exp_re = re.compile(r'([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)') @@ -34,219 +38,493 @@ int_nosign_re = re.compile(r'(\d+)') int_sign_re = re.compile(r'([-+]?\d+)') # This dict will help select the correct regex and number conversion function. 
regex_and_num_function_chooser = { - (float, True, True) : (float_sign_exp_re, float), - (float, True, False) : (float_sign_noexp_re, float), - (float, False, True) : (float_nosign_exp_re, float), - (float, False, False) : (float_nosign_noexp_re, float), - (int, True, True) : (int_sign_re, int), - (int, True, False) : (int_sign_re, int), - (int, False, True) : (int_nosign_re, int), - (int, False, False) : (int_nosign_re, int), - (None, True, True) : (int_nosign_re, int), - (None, True, False) : (int_nosign_re, int), - (None, False, True) : (int_nosign_re, int), - (None, False, False) : (int_nosign_re, int), + (float, True, True): (float_sign_exp_re, float), + (float, True, False): (float_sign_noexp_re, float), + (float, False, True): (float_nosign_exp_re, float), + (float, False, False): (float_nosign_noexp_re, float), + (int, True, True): (int_sign_re, int), + (int, True, False): (int_sign_re, int), + (int, False, True): (int_nosign_re, int), + (int, False, False): (int_nosign_re, int), + (None, True, True): (int_nosign_re, int), + (None, True, False): (int_nosign_re, int), + (None, False, True): (int_nosign_re, int), + (None, False, False): (int_nosign_re, int), } +# Number types. I have to use set([...]) and not {...} +# because I am supporting Python 2.6. +number_types = set([float, int]) -def _remove_empty(s): - """Remove empty strings from a list.""" - while True: - try: - s.remove('') - except ValueError: - break - return s +# This regex is to make sure we don't mistake a number for a file extension +decimal = re.compile(r'\.\d') def _number_finder(s, regex, numconv, py3_safe): """Helper to split numbers""" - # Split. If there are no splits, return now + # Split the input string by numbers. + # If there are no splits, return now. + # If the input is not a string, ValueError is raised. 
s = regex.split(s) if len(s) == 1: return tuple(s) - # Now convert the numbers to numbers, and leave strings as strings - s = _remove_empty(s) - for i in py23_range(len(s)): - try: - s[i] = numconv(s[i]) - except ValueError: - pass + # Now convert the numbers to numbers, and leave strings as strings. + # Remove empty strings from the list. + # Profiling showed that using regex here is much faster than + # try/except with the numconv function. + r = regex.match + s = [numconv(x) if r(x) else x for x in s if x] # If the list begins with a number, lead with an empty string. # This is used to get around the "unorderable types" issue. + # The most common case will be a string at the front of the + # list, and in that case the try/except method is faster than + # using isinstance. This was chosen at the expense of the less + # common case of a number being at the front of the list. + try: + s[0][0] # str supports indexing, but not numbers + except TypeError: + s = [''] + s + # The _py3_safe function inserts "" between numbers in the list, # and is used to get around "unorderable types" in complex cases. # It is a separate function that needs to be requested specifically # because it is expensive to call. - if not isinstance(s[0], py23_basestring): - return _py3_safe([''] + s) if py3_safe else [''] + s - else: - return _py3_safe(s) if py3_safe else s + return _py3_safe(s) if py3_safe else s + + +def _path_splitter(s): + """Split a string into its path components. Assumes a string is a path.""" + path_parts = [] + p_append = path_parts.append + path_location = s + # Continue splitting the path from the back until we have reached + # '..' or '.', or until there is nothing left to split. + while path_location != curdir and path_location != pardir: + parent_path = path_location + path_location, child_path = split(parent_path) + if path_location == parent_path: + break + p_append(child_path) + # This last append is the base path. + # Only append if the string is non-empty. 
+ if path_location: + p_append(path_location) + # We created this list in reversed order, so we now correct the order. + path_parts.reverse() + # Now, split off the file extensions using a similar method to above. + # Continue splitting off file extensions until we reach a decimal number + # or there are no more extensions. + base = path_parts.pop() + base_parts = [] + b_append = base_parts.append + d_match = decimal.match + while True: + front = base + base, ext = splitext(front) + if d_match(ext) or not ext: + # Reset base to before the split if the split is invalid. + base = front + break + b_append(ext) + b_append(base) + base_parts.reverse() + # Return the split parent paths and then the split basename. + return path_parts + base_parts def _py3_safe(parsed_list): """Insert '' between two numbers.""" - if len(parsed_list) < 2: + length = len(parsed_list) + if length < 2: return parsed_list else: new_list = [parsed_list[0]] nl_append = new_list.append - for before, after in py23_zip(islice(parsed_list, 0, len(parsed_list)-1), + ntypes = number_types + for before, after in py23_zip(islice(parsed_list, 0, length-1), islice(parsed_list, 1, None)): - if isinstance(before, Number) and isinstance(after, Number): + # I realize that isinstance is favored over type, but + # in this case type is SO MUCH FASTER than isinstance!! + if type(before) in ntypes and type(after) in ntypes: nl_append("") nl_append(after) return new_list -@u_format -def natsort_key(s, number_type=float, signed=True, exp=True, py3_safe=False): +def _natsort_key(val, key=None, number_type=float, signed=True, exp=True, + as_path=False, py3_safe=False): """\ - Key to sort strings and numbers naturally, not lexicographically. - It is designed for use in passing to the 'sorted' builtin or - 'sort' attribute of lists. + Key to sort strings and numbers naturally. + + It works by separating out the numbers from the strings. This function for + internal use only. 
See the natsort_keygen documentation for details of each + parameter. + + Parameters + ---------- + val : {str, unicode} + key : callable, optional + number_type : {None, float, int}, optional + signed : {True, False}, optional + exp : {True, False}, optional + as_path : {True, False}, optional + py3_safe : {True, False}, optional + + Returns + ------- + out : tuple + The modified value with numbers extracted. - s - The value used by the sorting algorithm - - number_type (None, float, int) - The types of number to sort on: float searches for floating point - numbers, int searches for integers, and None searches for digits - (like integers but does not take into account negative sign). - None is a shortcut for number_type = int and signed = False. + """ - signed (True, False) - By default a '+' or '-' before a number is taken to be the sign - of the number. If signed is False, any '+' or '-' will not be - considered to be part of the number, but as part part of the string. + # Convert the arguments to the proper input tuple + inp_options = (number_type, signed, exp) + try: + regex, num_function = regex_and_num_function_chooser[inp_options] + except KeyError: + # Report errors properly + if number_type not in (float, int) and number_type is not None: + raise ValueError("_natsort_key: 'number_type' parameter " + "'{0}' invalid".format(py23_str(number_type))) + elif signed not in (True, False): + raise ValueError("_natsort_key: 'signed' parameter " + "'{0}' invalid".format(py23_str(signed))) + elif exp not in (True, False): + raise ValueError("_natsort_key: 'exp' parameter " + "'{0}' invalid".format(py23_str(exp))) + else: + # Apply key if needed. + if key is not None: + val = key(val) + + # If this is a path, convert it. + # An AttrubuteError is raised if not a string. 
+ split_as_path = False + if as_path: + try: + val = _path_splitter(val) + except AttributeError: + pass + else: + # Record that this string was split as a path so that + # we can set as_path to False in the recursive call. + split_as_path = True + + # Assume the input are strings, which is the most common case. + try: + return tuple(_number_finder(val, regex, num_function, py3_safe)) + except TypeError: + # If not strings, assume it is an iterable that must + # be parsed recursively. Do not apply the key recursively. + # If this string was split as a path, set as_path to False. + try: + return tuple([_natsort_key(x, None, number_type, signed, + exp, as_path and not split_as_path, + py3_safe) for x in val]) + # If there is still an error, it must be a number. + # Return as-is, with a leading empty string. + # Waiting for two raised errors instead of calling + # isinstance at the opening of the function is slower + # for numbers but much faster for strings, and since + # numbers are not a common input to natsort this is + # an acceptable sacrifice. + except TypeError: + return (('', val,),) if as_path else ('', val,) - exp (True, False) - This option only applies to number_type = float. If exp = True, - a string like "3.5e5" will be interpreted as 350000, i.e. the - exponential part is considered to be part of the number. - If exp = False, "3.5e5" is interpreted as (3.5, "e", 5). - The default behavior is exp = True. - py3_safe (True, False) - This will make the string parsing algorithm be more careful by - placing an empty string between two adjacent numbers after the - parsing algorithm. This will prevent the "unorderable types" error. +@u_format +def natsort_key(val, key=None, number_type=float, signed=True, exp=True, + as_path=False, py3_safe=False): + """\ + Key to sort strings and numbers naturally. - returns - The modified value with numbers extracted. + Key to sort strings and numbers naturally, not lexicographically. 
+ It is designed for use in passing to the 'sorted' builtin or + 'sort' attribute of lists. - Using natsort_key is just like any other sorting key in python + .. note:: Depreciated since version 3.4.0. + This function remains in the publicly exposed API for + backwards-compatibility reasons, but future development + should use the newer `natsort_keygen` function. It is + planned to remove this from the public API in natsort + version 4.0.0. A DeprecationWarning will be raised + via the warnings module; set warnings.simplefilter("always") + to raise them to see if your code will work in version + 4.0.0. + + Parameters + ---------- + val : {{str, unicode}} + The value used by the sorting algorithm + + key : callable, optional + A key used to manipulate the input value before parsing for + numbers. It is **not** applied recursively. + It should accept a single argument and return a single value. + + number_type : {{None, float, int}}, optional + The types of number to sort on: `float` searches for floating + point numbers, `int` searches for integers, and `None` searches + for digits (like integers but does not take into account + negative sign). `None` is a shortcut for `number_type = int` + and `signed = False`. + + signed : {{True, False}}, optional + By default a '+' or '-' before a number is taken to be the sign + of the number. If `signed` is `False`, any '+' or '-' will not + be considered to be part of the number, but as part part of the + string. + + exp : {{True, False}}, optional + This option only applies to `number_type = float`. If + `exp = True`, a string like "3.5e5" will be interpreted as + 350000, i.e. the exponential part is considered to be part of + the number. If `exp = False`, "3.5e5" is interpreted as + ``(3.5, "e", 5)``. The default behavior is `exp = True`. + + as_path : {{True, False}}, optional + This option will force strings to be interpreted as filesystem + paths, so they will be split according to the filesystem separator + (i.e. 
'/' on UNIX, '\\\\' on Windows), as well as splitting on the + file extension, if any. Without this, lists of file paths like + ``['Folder', 'Folder (1)', 'Folder (10)']`` will not be sorted + properly; ``'Folder'`` will be placed at the end, not at the front. + The default behavior is `as_path = False`. + + py3_safe : {{True, False}}, optional + This will make the string parsing algorithm be more careful by + placing an empty string between two adjacent numbers after the + parsing algorithm. This will prevent the "unorderable types" + error. + + Returns + ------- + out : tuple + The modified value with numbers extracted. + + See Also + -------- + natsort_keygen : Generates a properly wrapped `natsort_key`. + + Examples + -------- + Using natsort_key is just like any other sorting key in python:: >>> a = ['num3', 'num5', 'num2'] >>> a.sort(key=natsort_key) >>> a [{u}'num2', {u}'num3', {u}'num5'] - It works by separating out the numbers from the strings + It works by separating out the numbers from the strings:: >>> natsort_key('num2') ({u}'num', 2.0) - If you need to call natsort_key with the number_type argument, or get a special - attribute or item of each element of the sequence, the easiest way is to make a - lambda expression that calls natsort_key:: - - >>> from operator import itemgetter - >>> a = [['num4', 'b'], ['num8', 'c'], ['num2', 'a']] - >>> f = itemgetter(0) - >>> a.sort(key=lambda x: natsort_key(f(x), number_type=int)) - >>> a - [[{u}'num2', {u}'a'], [{u}'num4', {u}'b'], [{u}'num8', {u}'c']] + If you need to call natsort_key with the number_type argument, or get a + special attribute or item of each element of the sequence, please use + the `natsort_keygen` function. Actually, please just use the + `natsort_keygen` function. - Iterables are parsed recursively so you can sort lists of lists. 
+ Notes + ----- + Iterables are parsed recursively so you can sort lists of lists:: >>> natsort_key(('a1', 'a10')) (({u}'a', 1.0), ({u}'a', 10.0)) - Strings that lead with a number get an empty string at the front of the tuple. - This is designed to get around the "unorderable types" issue of Python3. + Strings that lead with a number get an empty string at the front of the + tuple. This is designed to get around the "unorderable types" issue of + Python3:: >>> natsort_key('15a') ({u}'', 15.0, {u}'a') - You can give bare numbers, too. + You can give bare numbers, too:: >>> natsort_key(10) ({u}'', 10) - If you have a case where one of your string has two numbers in a row - (only possible with "5+5" or "5-5" and signed=True to my knowledge), you - can turn on the "py3_safe" option to try to add a "" between sets of two - numbers. + If you have a case where one of your string has two numbers in a row, + you can turn on the "py3_safe" option to try to add a "" between sets + of two numbers:: >>> natsort_key('43h7+3', py3_safe=True) ({u}'', 43.0, {u}'h', 7.0, {u}'', 3.0) """ - - # If we are dealing with non-strings, return now - if not isinstance(s, py23_basestring): - if hasattr(s, '__getitem__'): - return tuple(natsort_key(x) for x in s) - else: - return ('', s,) - - # Convert to the proper tuple and return - inp_options = (number_type, signed, exp) - try: - args = (s,) + regex_and_num_function_chooser[inp_options] + (py3_safe,) - except KeyError: - # Report errors properly - if number_type not in (float, int) and number_type is not None: - raise ValueError("natsort_key: 'number_type' " - "parameter '{0}' invalid".format(py23_str(number_type))) - elif signed not in (True, False): - raise ValueError("natsort_key: 'signed' " - "parameter '{0}' invalid".format(py23_str(signed))) - elif exp not in (True, False): - raise ValueError("natsort_key: 'exp' " - "parameter '{0}' invalid".format(py23_str(exp))) - else: - return tuple(_number_finder(*args)) + msg = "natsort_key is 
depreciated as of 3.4.0, please use natsort_keygen" + warn(msg, DeprecationWarning) + return _natsort_key(val, key, number_type, signed, exp, as_path, py3_safe) @u_format -def natsorted(seq, key=lambda x: x, number_type=float, signed=True, exp=True): +def natsort_keygen(key=None, number_type=float, signed=True, exp=True, + as_path=False, py3_safe=False): """\ - Sorts a sequence naturally (alphabetically and numerically), - not lexicographically. - - seq (iterable) - The sequence to sort. - - key (function) - A key used to determine how to sort each element of the sequence. - - number_type (None, float, int) - The types of number to sort on: float searches for floating point - numbers, int searches for integers, and None searches for digits - (like integers but does not take into account negative sign). - None is a shortcut for number_type = int and signed = False. + Generate a key to sort strings and numbers naturally. + + Generate a key to sort strings and numbers naturally, + not lexicographically. This key is designed for use as the + `key` argument to functions such as the `sorted` builtin. + + The user may customize the generated function with the + arguments to `natsort_keygen`, including an optional + `key` function which will be called before the `natsort_key`. + + Parameters + ---------- + key : callable, optional + A key used to manipulate the input value before parsing for + numbers. It is **not** applied recursively. + It should accept a single argument and return a single value. + + number_type : {{None, float, int}}, optional + The types of number to sort on: `float` searches for floating + point numbers, `int` searches for integers, and `None` searches + for digits (like integers but does not take into account + negative sign). `None` is a shortcut for `number_type = int` + and `signed = False`. + + signed : {{True, False}}, optional + By default a '+' or '-' before a number is taken to be the sign + of the number. 
If `signed` is `False`, any '+' or '-' will not + be considered to be part of the number, but as part part of the + string. + + exp : {{True, False}}, optional + This option only applies to `number_type = float`. If + `exp = True`, a string like "3.5e5" will be interpreted as + 350000, i.e. the exponential part is considered to be part of + the number. If `exp = False`, "3.5e5" is interpreted as + ``(3.5, "e", 5)``. The default behavior is `exp = True`. + + as_path : {{True, False}}, optional + This option will force strings to be interpreted as filesystem + paths, so they will be split according to the filesystem separator + (i.e. `/` on UNIX, `\\\\` on Windows), as well as splitting on the + file extension, if any. Without this, lists with file paths like + ``['Folder/', 'Folder (1)/', 'Folder (10)/']`` will not be sorted + properly; ``'Folder'`` will be placed at the end, not at the front. + The default behavior is `as_path = False`. + + py3_safe : {{True, False}}, optional + This will make the string parsing algorithm be more careful by + placing an empty string between two adjacent numbers after the + parsing algorithm. This will prevent the "unorderable types" + error. + + Returns + ------- + out : function + A wrapped version of the `natsort_key` function that is + suitable for passing as the `key` argument to functions + such as `sorted`. + + Examples + -------- + `natsort_keygen` is a convenient waynto create a custom key + to sort lists in-place (for example). Calling with no objects + will return a plain `natsort_key` instance:: + + >>> a = ['num5.10', 'num-3', 'num5.3', 'num2'] + >>> b = a[:] + >>> a.sort(key=natsort_key) + >>> b.sort(key=natsort_keygen()) + >>> a == b + True + + The power of `natsort_keygen` is when you want to want to pass + arguments to the `natsort_key`. Consider the following + equivalent examples; which is more clear? 
:: + + >>> a = ['num5.10', 'num-3', 'num5.3', 'num2'] + >>> b = a[:] + >>> a.sort(key=lambda x: natsort_key(x, key=lambda y: y.upper(), + ... signed=False)) + >>> b.sort(key=natsort_keygen(key=lambda x: x.upper(), signed=False)) + >>> a == b + True - signed (True, False) - By default a '+' or '-' before a number is taken to be the sign - of the number. If signed is False, any '+' or '-' will not be - considered to be part of the number, but as part part of the string. + """ + return partial(_natsort_key, + key=key, + number_type=number_type, + signed=signed, + exp=exp, + as_path=as_path, + py3_safe=py3_safe) - exp (True, False) - This option only applies to number_type = float. If exp = True, - a string like "3.5e5" will be interpreted as 350000, i.e. the - exponential part is considered to be part of the number. - If exp = False, "3.5e5" is interpreted as (3.5, "e", 5). - The default behavior is exp = True. - returns - The sorted sequence. +@u_format +def natsorted(seq, key=None, number_type=float, signed=True, exp=True, + reverse=False, as_path=False): + """\ + Sorts a sequence naturally. - Use natsorted just like the builtin sorted + Sorts a sequence naturally (alphabetically and numerically), + not lexicographically. Returns a new copy of the sorted + sequence as a list. + + Parameters + ---------- + seq : iterable + The sequence to sort. + + key : callable, optional + A key used to determine how to sort each element of the sequence. + It is **not** applied recursively. + It should accept a single argument and return a single value. + + number_type : {{None, float, int}}, optional + The types of number to sort on: `float` searches for floating + point numbers, `int` searches for integers, and `None` searches + for digits (like integers but does not take into account + negative sign). `None` is a shortcut for `number_type = int` + and `signed = False`. 
+ + signed : {{True, False}}, optional + By default a '+' or '-' before a number is taken to be the sign + of the number. If `signed` is `False`, any '+' or '-' will not + be considered to be part of the number, but as part part of the + string. + + exp : {{True, False}}, optional + This option only applies to `number_type = float`. If + `exp = True`, a string like "3.5e5" will be interpreted as + 350000, i.e. the exponential part is considered to be part of + the number. If `exp = False`, "3.5e5" is interpreted as + ``(3.5, "e", 5)``. The default behavior is `exp = True`. + + reverse : {{True, False}}, optional + Return the list in reversed sorted order. The default is + `False`. + + as_path : {{True, False}}, optional + This option will force strings to be interpreted as filesystem + paths, so they will be split according to the filesystem separator + (i.e. '/' on UNIX, '\\\\' on Windows), as well as splitting on the + file extension, if any. Without this, lists of file paths like + ``['Folder', 'Folder (1)', 'Folder (10)']`` will not be sorted + properly; ``'Folder'`` will be placed at the end, not at the front. + The default behavior is `as_path = False`. + + Returns + ------- + out: list + The sorted sequence. + + See Also + -------- + natsort_keygen : Generates the key that makes natural sorting possible. + versorted : A wrapper for ``natsorted(seq, number_type=None)``. + index_natsorted : Returns the sorted indexes from `natsorted`. 
+ + Examples + -------- + Use `natsorted` just like the builtin `sorted`:: >>> a = ['num3', 'num5', 'num2'] >>> natsorted(a) @@ -254,139 +532,292 @@ def natsorted(seq, key=lambda x: x, number_type=float, signed=True, exp=True): """ try: - return sorted(seq, key=lambda x: natsort_key(key(x), - number_type=number_type, - signed=signed, exp=exp)) + return sorted(seq, reverse=reverse, + key=natsort_keygen(key, number_type, + signed, exp, as_path)) except TypeError as e: # In the event of an unresolved "unorderable types" error # attempt to sort again, being careful to prevent this error. if 'unorderable types' in str(e): - return sorted(seq, key=lambda x: natsort_key(key(x), - number_type=number_type, - signed=signed, exp=exp, - py3_safe=True)) + return sorted(seq, reverse=reverse, + key=natsort_keygen(key, number_type, + signed, exp, as_path, + True)) else: # Re-raise if the problem was not "unorderable types" raise @u_format -def versorted(seq, key=lambda x: x): +def versorted(seq, key=None, reverse=False, as_path=False): """\ - Convenience function to sort version numbers. This is a wrapper - around natsorted(seq, number_type=None). + Convenience function to sort version numbers. - seq (iterable) - The sequence to sort. - - key (function) - A key used to determine how to sort each element of the sequence. - - returns - The sorted sequence. - - Use versorted just like the builtin sorted + Convenience function to sort version numbers. This is a wrapper + around ``natsorted(seq, number_type=None)``. + + Parameters + ---------- + seq : iterable + The sequence to sort. + + key : callable, optional + A key used to determine how to sort each element of the sequence. + It is **not** applied recursively. + It should accept a single argument and return a single value. + + reverse : {{True, False}}, optional + Return the list in reversed sorted order. The default is + `False`. 
+ + as_path : {{True, False}}, optional + This option will force strings to be interpreted as filesystem + paths, so they will be split according to the filesystem separator + (i.e. '/' on UNIX, '\\\\' on Windows), as well as splitting on the + file extension, if any. Without this, lists of file paths like + ``['Folder', 'Folder (1)', 'Folder (10)']`` will not be sorted + properly; ``'Folder'`` will be placed at the end, not at the front. + The default behavior is `as_path = False`. + + Returns + ------- + out : list + The sorted sequence. + + See Also + -------- + index_versorted : Returns the sorted indexes from `versorted`. + + Examples + -------- + Use `versorted` just like the builtin `sorted`:: >>> a = ['num4.0.2', 'num3.4.1', 'num3.4.2'] >>> versorted(a) [{u}'num3.4.1', {u}'num3.4.2', {u}'num4.0.2'] """ - return natsorted(seq, key=key, number_type=None) + return natsorted(seq, key, None, reverse=reverse, as_path=as_path) @u_format -def index_natsorted(seq, key=lambda x: x, number_type=float, signed=True, exp=True): +def index_natsorted(seq, key=None, number_type=float, signed=True, exp=True, + reverse=False, as_path=False): """\ - Sorts a sequence naturally, but returns a list of sorted the - indexes and not the sorted list. - - seq (iterable) - The sequence to sort. - - key (function) - A key used to determine how to sort each element of the sequence. - - number_type (None, float, int) - The types of number to sort on: float searches for floating point - numbers, int searches for integers, and None searches for digits - (like integers but does not take into account negative sign). - None is a shortcut for number_type = int and signed = False. - - signed (True, False) - By default a '+' or '-' before a number is taken to be the sign - of the number. If signed is False, any '+' or '-' will not be - considered to be part of the number, but as part part of the string. - - exp (True, False) - This option only applies to number_type = float. 
If exp = True, - a string like "3.5e5" will be interpreted as 350000, i.e. the - exponential part is considered to be part of the number. - If exp = False, "3.5e5" is interpreted as (3.5, "e", 5). - The default behavior is exp = True. + Return the list of the indexes used to sort the input sequence. - returns - The ordered indexes of the sequence. - - Use index_natsorted if you want to sort multiple lists by the sort order of - one list: + Sorts a sequence naturally, but returns a list of sorted the + indexes and not the sorted list. This list of indexes can be + used to sort multiple lists by the sorted order of the given + sequence. + + Parameters + ---------- + seq : iterable + The sequence to sort. + + key : callable, optional + A key used to determine how to sort each element of the sequence. + It is **not** applied recursively. + It should accept a single argument and return a single value. + + number_type : {{None, float, int}}, optional + The types of number to sort on: `float` searches for floating + point numbers, `int` searches for integers, and `None` searches + for digits (like integers but does not take into account + negative sign). `None` is a shortcut for `number_type = int` + and `signed = False`. + + signed : {{True, False}}, optional + By default a '+' or '-' before a number is taken to be the sign + of the number. If `signed` is `False`, any '+' or '-' will not + be considered to be part of the number, but as part part of the + string. + + exp : {{True, False}}, optional + This option only applies to `number_type = float`. If + `exp = True`, a string like "3.5e5" will be interpreted as + 350000, i.e. the exponential part is considered to be part of + the number. If `exp = False`, "3.5e5" is interpreted as + ``(3.5, "e", 5)``. The default behavior is `exp = True`. + + reverse : {{True, False}}, optional + Return the list in reversed sorted order. The default is + `False`. 
+ + as_path : {{True, False}}, optional + This option will force strings to be interpreted as filesystem + paths, so they will be split according to the filesystem separator + (i.e. '/' on UNIX, '\\\\' on Windows), as well as splitting on the + file extension, if any. Without this, lists of file paths like + ``['Folder', 'Folder (1)', 'Folder (10)']`` will not be sorted + properly; ``'Folder'`` will be placed at the end, not at the front. + The default behavior is `as_path = False`. + + Returns + ------- + out : tuple + The ordered indexes of the sequence. + + See Also + -------- + natsorted + order_by_index + + Examples + -------- + + Use index_natsorted if you want to sort multiple lists by the + sorted order of one list:: - >>> from natsort import index_natsorted >>> a = ['num3', 'num5', 'num2'] >>> b = ['foo', 'bar', 'baz'] >>> index = index_natsorted(a) >>> index [2, 0, 1] >>> # Sort both lists by the sort order of a - >>> [a[i] for i in index] + >>> order_by_index(a, index) [{u}'num2', {u}'num3', {u}'num5'] - >>> [b[i] for i in index] + >>> order_by_index(b, index) [{u}'baz', {u}'foo', {u}'bar'] """ - item1 = itemgetter(1) + if key is None: + newkey = itemgetter(1) + else: + newkey = lambda x: key(itemgetter(1)(x)) # Pair the index and sequence together, then sort by element - index_seq_pair = [[x, key(y)] for x, y in py23_zip(py23_range(len(seq)), seq)] + index_seq_pair = [[x, y] for x, y in enumerate(seq)] try: - index_seq_pair.sort(key=lambda x: natsort_key(item1(x), - number_type=number_type, - signed=signed, exp=exp)) + index_seq_pair.sort(reverse=reverse, + key=natsort_keygen(newkey, number_type, + signed, exp, as_path)) except TypeError as e: # In the event of an unresolved "unorderable types" error # attempt to sort again, being careful to prevent this error. 
if 'unorderable types' in str(e): - index_seq_pair.sort(key=lambda x: natsort_key(item1(x), - number_type=number_type, - signed=signed, exp=exp, - py3_safe=True)) + index_seq_pair.sort(reverse=reverse, + key=natsort_keygen(newkey, number_type, + signed, exp, as_path, + True)) else: # Re-raise if the problem was not "unorderable types" raise - return [x[0] for x in index_seq_pair] + return [x for x, _ in index_seq_pair] @u_format -def index_versorted(seq, key=lambda x: x): +def index_versorted(seq, key=None, reverse=False, as_path=False): """\ - Convenience function to sort version numbers but return the - indexes of how the sequence would be sorted. - This is a wrapper around index_natsorted(seq, number_type=None). - - seq (iterable) - The sequence to sort. - - key (function) - A key used to determine how to sort each element of the sequence. - - returns - The ordered indexes of the sequence. + Return the list of the indexes used to sort the input sequence + of version numbers. - Use index_versorted just like the builtin sorted + Sorts a sequence naturally, but returns a list of sorted the + indexes and not the sorted list. This list of indexes can be + used to sort multiple lists by the sorted order of the given + sequence. + + This is a wrapper around ``index_natsorted(seq, number_type=None)``. + + Parameters + ---------- + seq: iterable + The sequence to sort. + + key: callable, optional + A key used to determine how to sort each element of the sequence. + It is **not** applied recursively. + It should accept a single argument and return a single value. + + reverse : {{True, False}}, optional + Return the list in reversed sorted order. The default is + `False`. + + as_path : {{True, False}}, optional + This option will force strings to be interpreted as filesystem + paths, so they will be split according to the filesystem separator + (i.e. '/' on UNIX, '\\\\' on Windows), as well as splitting on the + file extension, if any. 
Without this, lists of file paths like + ``['Folder', 'Folder (1)', 'Folder (10)']`` will not be sorted + properly; ``'Folder'`` will be placed at the end, not at the front. + The default behavior is `as_path = False`. + + Returns + ------- + out : tuple + The ordered indexes of the sequence. + + See Also + -------- + versorted + order_by_index + + Examples + -------- + Use `index_versorted` just like the builtin `sorted`:: >>> a = ['num4.0.2', 'num3.4.1', 'num3.4.2'] >>> index_versorted(a) [1, 2, 0] """ - return index_natsorted(seq, key=key, number_type=None) + return index_natsorted(seq, key, None, reverse=reverse, as_path=as_path) + + +@u_format +def order_by_index(seq, index, iter=False): + """\ + Order a given sequence by an index sequence. + + The output of `index_natsorted` and `index_versorted` is a + sequence of integers (index) that correspond to how its input + sequence **would** be sorted. The idea is that this index can + be used to reorder multiple sequences by the sorted order of the + first sequence. This function is a convenient wrapper to + apply this ordering to a sequence. + + Parameters + ---------- + seq : iterable + The sequence to order. + + index : iterable + The sequence that indicates how to order `seq`. + It should be the same length as `seq` and consist + of integers only. + + iter : {{True, False}}, optional + If `True`, the ordered sequence is returned as a + generator expression; otherwise it is returned as a + list. The default is `False`. + + Returns + ------- + out : {{list, generator}} + The sequence ordered by `index`, as a `list` or as a + generator expression (depending on the value of `iter`). 
+ + See Also + -------- + index_natsorted + index_versorted + + Examples + -------- + + `order_by_index` is a comvenience function that helps you apply + the result of `index_natsorted` or `index_versorted`:: + >>> a = ['num3', 'num5', 'num2'] + >>> b = ['foo', 'bar', 'baz'] + >>> index = index_natsorted(a) + >>> index + [2, 0, 1] + >>> # Sort both lists by the sort order of a + >>> order_by_index(a, index) + [{u}'num2', {u}'num3', {u}'num5'] + >>> order_by_index(b, index) + [{u}'baz', {u}'foo', {u}'bar'] + + """ + return (seq[i] for i in index) if iter else [seq[i] for i in index] diff --git a/natsort/py23compat.py b/natsort/py23compat.py index 85c06e1..3f3fb92 100644 --- a/natsort/py23compat.py +++ b/natsort/py23compat.py @@ -1,5 +1,6 @@ # -*- coding: utf-8 -*- -from __future__ import print_function, division, unicode_literals, absolute_import +from __future__ import (print_function, division, + unicode_literals, absolute_import) import functools import sys @@ -36,9 +37,9 @@ def _modify_str_or_docstring(str_change_func): else: func = func_or_str doc = func.__doc__ - + doc = str_change_func(doc) - + if func: func.__doc__ = doc return func @@ -52,7 +53,7 @@ if sys.version[0] == '3': @_modify_str_or_docstring def u_format(s): """"{u}'abc'" --> "'abc'" (Python 3) - + Accepts a string or a function, so it can be used as a decorator.""" return s.format(u='') else: @@ -60,7 +61,6 @@ else: @_modify_str_or_docstring def u_format(s): """"{u}'abc'" --> "u'abc'" (Python 2) - + Accepts a string or a function, so it can be used as a decorator.""" return s.format(u='u') - @@ -3,3 +3,13 @@ universal = 1 [sdist] formats = zip,gztar + +[pytest] +flakes-ignore = + natsort/py23compat.py UndefinedName + natsort/__init__.py UnusedImport + docs/source/conf.py ALL + +pep8ignore = + test_natsort/test_natsort.py E501 E241 E221 + docs/source/conf.py ALL @@ -19,11 +19,13 @@ class PyTest(TestCommand): self.test_suite = True def run_tests(self): - #import here, cause outside the eggs aren't 
loaded + # import here, cause outside the eggs aren't loaded import pytest - err1 = pytest.main([]) + err1 = pytest.main(['--cov', 'natsort', '--flakes', '--pep8']) err2 = pytest.main(['--doctest-modules', 'natsort']) - err3 = pytest.main(['README.rst']) + err3 = pytest.main(['README.rst', + 'docs/source/intro.rst', + 'docs/source/examples.rst']) return err1 | err2 | err3 @@ -55,21 +57,21 @@ REQUIRES = 'argparse' if sys.version[:3] in ('2.6', '3.0', '3.1') else '' # The setup parameters -setup(name='natsort', - version=VERSION, - author='Seth M. Morton', - author_email='drtuba78@gmail.com', - url='https://github.com/SethMMorton/natsort', - license='MIT', - install_requires=REQUIRES, - packages=['natsort'], - entry_points={'console_scripts': ['natsort = natsort.__main__:main']}, - tests_require=['pytest'], - cmdclass = {'test': PyTest}, - description=DESCRIPTION, - long_description=LONG_DESCRIPTION, - classifiers=( - #'Development Status :: 4 - Beta', +setup( + name='natsort', + version=VERSION, + author='Seth M. 
Morton', + author_email='drtuba78@gmail.com', + url='https://github.com/SethMMorton/natsort', + license='MIT', + install_requires=REQUIRES, + packages=['natsort'], + entry_points={'console_scripts': ['natsort = natsort.__main__:main']}, + tests_require=['pytest', 'pytest-pep8', 'pytest-flakes', 'pytest-cov'], + cmdclass={'test': PyTest}, + description=DESCRIPTION, + long_description=LONG_DESCRIPTION, + classifiers=( 'Development Status :: 5 - Production/Stable', 'Intended Audience :: Developers', 'Intended Audience :: Science/Research', @@ -83,5 +85,5 @@ setup(name='natsort', 'Programming Language :: Python :: 3', 'Topic :: Scientific/Engineering :: Information Analysis', 'Topic :: Utilities', - ) - ) + ) +) diff --git a/test_natsort/profile_natsorted.py b/test_natsort/profile_natsorted.py new file mode 100644 index 0000000..802fe5f --- /dev/null +++ b/test_natsort/profile_natsorted.py @@ -0,0 +1,113 @@ +# -*- coding: utf-8 -*- +"""\ +This file contains functions to profile natsorted with different +inputs and different settings. 
+""" +from __future__ import print_function +import cProfile +import random +import sys + +sys.path.insert(0, '.') +from natsort import natsorted, index_natsorted +from natsort.py23compat import py23_range + + +# Sample lists to sort +nums = random.sample(py23_range(10000), 1000) +nstr = list(map(str, random.sample(py23_range(10000), 1000))) +astr = ['a'+x+'num' for x in map(str, random.sample(py23_range(10000), 1000))] +tstr = [['a'+x, 'a-'+x] + for x in map(str, random.sample(py23_range(10000), 1000))] +cstr = ['a'+x+'-'+x for x in map(str, random.sample(py23_range(10000), 1000))] + + +def prof_nums(a): + print('*** Basic Call, Numbers ***') + for _ in py23_range(1000): + natsorted(a) +cProfile.run('prof_nums(nums)', sort='time') + + +def prof_num_str(a): + print('*** Basic Call, Numbers as Strings ***') + for _ in py23_range(1000): + natsorted(a) +cProfile.run('prof_num_str(nstr)', sort='time') + + +def prof_str(a): + print('*** Basic Call, Strings ***') + for _ in py23_range(1000): + natsorted(a) +cProfile.run('prof_str(astr)', sort='time') + + +def prof_str_index(a): + print('*** Basic Index Call ***') + for _ in py23_range(1000): + index_natsorted(a) +cProfile.run('prof_str_index(astr)', sort='time') + + +def prof_nested(a): + print('*** Basic Call, Nested Strings ***') + for _ in py23_range(1000): + natsorted(a) +cProfile.run('prof_nested(tstr)', sort='time') + + +def prof_str_noexp(a): + print('*** No-Exp Call ***') + for _ in py23_range(1000): + natsorted(a, exp=False) +cProfile.run('prof_str_noexp(astr)', sort='time') + + +def prof_str_unsigned(a): + print('*** Unsigned Call ***') + for _ in py23_range(1000): + natsorted(a, signed=False) +cProfile.run('prof_str_unsigned(astr)', sort='time') + + +def prof_str_unsigned_noexp(a): + print('*** Unsigned No-Exp Call ***') + for _ in py23_range(1000): + natsorted(a, signed=False, exp=False) +cProfile.run('prof_str_unsigned_noexp(astr)', sort='time') + + +def prof_str_asint(a): + print('*** Int Call ***') + for _ 
in py23_range(1000): + natsorted(a, number_type=int) +cProfile.run('prof_str_asint(astr)', sort='time') + + +def prof_str_asint_unsigned(a): + print('*** Unsigned Int (Versions) Call ***') + for _ in py23_range(1000): + natsorted(a, number_type=int, signed=False) +cProfile.run('prof_str_asint_unsigned(astr)', sort='time') + + +def prof_str_key(a): + print('*** Basic Call With Key ***') + for _ in py23_range(1000): + natsorted(a, key=lambda x: x.upper()) +cProfile.run('prof_str_key(astr)', sort='time') + + +def prof_str_index_key(a): + print('*** Basic Index Call With Key ***') + for _ in py23_range(1000): + index_natsorted(a, key=lambda x: x.upper()) +cProfile.run('prof_str_index_key(astr)', sort='time') + + +def prof_str_unorderable(a): + print('*** Basic Index Call, "Unorderable" ***') + for _ in py23_range(1000): + natsorted(a) +cProfile.run('prof_str_unorderable(cstr)', sort='time') diff --git a/test_natsort/stress_natsort.py b/test_natsort/stress_natsort.py new file mode 100644 index 0000000..7237db3 --- /dev/null +++ b/test_natsort/stress_natsort.py @@ -0,0 +1,53 @@ +# -*- coding: utf-8 -*- +"""\ +This file contains functions to stress-test natsort. +""" +from random import randint, sample, choice +from string import printable +from copy import copy +from pytest import fail +from natsort import natsorted +from natsort.py23compat import py23_range + + +def test_random(): + """Try to sort 100,000 randomly generated strings without exception.""" + + # Repeat test 100,000 times + for _ in py23_range(100000): + # Made a list of five randomly generated strings + lst = [''.join(sample(printable, randint(7, 30))) + for __ in py23_range(5)] + # Try to sort. If there is an exception, give some detailed info. + try: + natsorted(lst) + except Exception as e: + msg = "Ended with exception type '{exc}: {msg}'.\n" + msg += "Failed on the input {lst}." 
+ fail(msg.format(exc=type(e).__name__, msg=str(e), lst=str(lst))) + + +def test_similar(): + """Try to sort 100,000 randomly generated + similar strings without exception. + """ + + # Repeat test 100,000 times + for _ in py23_range(100000): + # Create a randomly generated string + base = sample(printable, randint(7, 30)) + # Make a list of strings based on this string, + # with some randomly generated modifications + lst = [] + for __ in py23_range(5): + new_str = copy(base) + for ___ in py23_range(randint(1, 5)): + new_str[randint(0, len(base)-1)] = choice(printable) + lst.append(''.join(new_str)) + # Try to sort. If there is an exception, give some detailed info. + try: + natsorted(lst) + except Exception as e: + msg = "Ended with exception type '{exc}: {msg}'.\n" + msg += "Failed on the input {lst}." + fail(msg.format(exc=type(e).__name__, msg=str(e), lst=str(lst))) diff --git a/test_natsort/test_main.py b/test_natsort/test_main.py index c8a5825..8157c3e 100644 --- a/test_natsort/test_main.py +++ b/test_natsort/test_main.py @@ -72,7 +72,8 @@ num-6 num-2 """ - # Exclude the number 1 and 6. Both are present because we use digits/versions. + # Exclude the number 1 and 6. + # Both are present because we use digits/versions. sys.argv[1:] = ['-t', 'ver', '-e', '1', '-e', '6', 'num-2', 'num-6', 'num-1'] main() @@ -109,7 +110,8 @@ a1.0e3 """ # Include two ranges. 
- sys.argv[1:] = ['-f', '1', '10', '-f', '400', '500', 'a1.0e3', 'a5.3', 'a453.6'] + sys.argv[1:] = ['-f', '1', '10', '-f', '400', '500', + 'a1.0e3', 'a5.3', 'a453.6'] main() out, __ = capsys.readouterr() assert out == """\ @@ -127,16 +129,34 @@ a5.3 a453.6 """ + # To sort complicated filenames you need --paths + sys.argv[1:] = ['/Folder (1)/', '/Folder/', '/Folder (10)/'] + main() + out, __ = capsys.readouterr() + assert out == """\ +/Folder (1)/ +/Folder (10)/ +/Folder/ +""" + sys.argv[1:] = ['--paths', '/Folder (1)/', '/Folder/', '/Folder (10)/'] + main() + out, __ = capsys.readouterr() + assert out == """\ +/Folder/ +/Folder (1)/ +/Folder (10)/ +""" + def test_range_check(): - + # Floats are always returned assert range_check(10, 11) == (10.0, 11.0) assert range_check(6.4, 30) == (6.4, 30.0) # Invalid ranges give a ValueErro with raises(ValueError) as err: - range_check(7, 2) + range_check(7, 2) assert str(err.value) == 'low >= high' @@ -174,10 +194,10 @@ def test_exclude_entry(): def test_sort_and_print_entries(capsys): - + class Args: """A dummy class to simulate the argparse Namespace object""" - def __init__(self, filter, reverse_filter, exclude, reverse): + def __init__(self, filter, reverse_filter, exclude, as_path, reverse): self.filter = filter self.reverse_filter = reverse_filter self.exclude = exclude @@ -185,19 +205,36 @@ def test_sort_and_print_entries(capsys): self.number_type = 'float' self.signed = True self.exp = True + self.paths = as_path entries = ['tmp/a57/path2', 'tmp/a23/path1', - 'tmp/a1/path1', + 'tmp/a1/path1', + 'tmp/a1 (1)/path1', 'tmp/a130/path1', 'tmp/a64/path1', 'tmp/a64/path2'] # Just sort the paths - sort_and_print_entries(entries, Args(None, None, False, False)) + sort_and_print_entries(entries, Args(None, None, False, False, False)) + out, __ = capsys.readouterr() + assert out == """\ +tmp/a1 (1)/path1 +tmp/a1/path1 +tmp/a23/path1 +tmp/a57/path2 +tmp/a64/path1 +tmp/a64/path2 +tmp/a130/path1 +""" + + # You would use --paths to 
make them sort + # as paths when the OS makes duplicates + sort_and_print_entries(entries, Args(None, None, False, True, False)) out, __ = capsys.readouterr() assert out == """\ tmp/a1/path1 +tmp/a1 (1)/path1 tmp/a23/path1 tmp/a57/path2 tmp/a64/path1 @@ -206,7 +243,8 @@ tmp/a130/path1 """ # Sort the paths with numbers between 20-100 - sort_and_print_entries(entries, Args([(20, 100)], None, False, False)) + sort_and_print_entries(entries, Args([(20, 100)], None, False, + False, False)) out, __ = capsys.readouterr() assert out == """\ tmp/a23/path1 @@ -216,27 +254,31 @@ tmp/a64/path2 """ # Sort the paths without numbers between 20-100 - sort_and_print_entries(entries, Args(None, [(20, 100)], False, False)) + sort_and_print_entries(entries, Args(None, [(20, 100)], False, + True, False)) out, __ = capsys.readouterr() assert out == """\ tmp/a1/path1 +tmp/a1 (1)/path1 tmp/a130/path1 """ # Sort the paths, excluding 23 and 130 - sort_and_print_entries(entries, Args(None, None, [23, 130], False)) + sort_and_print_entries(entries, Args(None, None, [23, 130], True, False)) out, __ = capsys.readouterr() assert out == """\ tmp/a1/path1 +tmp/a1 (1)/path1 tmp/a57/path2 tmp/a64/path1 tmp/a64/path2 """ # Sort the paths, excluding 2 - sort_and_print_entries(entries, Args(None, None, [2], False)) + sort_and_print_entries(entries, Args(None, None, [2], False, False)) out, __ = capsys.readouterr() assert out == """\ +tmp/a1 (1)/path1 tmp/a1/path1 tmp/a23/path1 tmp/a64/path1 @@ -244,7 +286,7 @@ tmp/a130/path1 """ # Sort in reverse order - sort_and_print_entries(entries, Args(None, None, False, True)) + sort_and_print_entries(entries, Args(None, None, False, True, True)) out, __ = capsys.readouterr() assert out == """\ tmp/a130/path1 @@ -252,5 +294,6 @@ tmp/a64/path2 tmp/a64/path1 tmp/a57/path2 tmp/a23/path1 +tmp/a1 (1)/path1 tmp/a1/path1 """ diff --git a/test_natsort/test_natsort.py b/test_natsort/test_natsort.py new file mode 100644 index 0000000..0eeed12 --- /dev/null +++ 
b/test_natsort/test_natsort.py
@@ -0,0 +1,291 @@
+# -*- coding: utf-8 -*-
+"""\
+Here are a collection of examples of how this module can be used.
+See the README or the natsort homepage for more details.
+"""
+import warnings
+from operator import itemgetter
+from pytest import raises
+from natsort import natsorted, index_natsorted, natsort_key, versorted, index_versorted, natsort_keygen, order_by_index
+from natsort.natsort import _number_finder, _py3_safe, _natsort_key
+from natsort.natsort import float_sign_exp_re, float_nosign_exp_re, float_sign_noexp_re
+from natsort.natsort import float_nosign_noexp_re, int_nosign_re, int_sign_re
+
+
+def test_number_finder():
+
+    assert _number_finder('a5+5.034e-1', float_sign_exp_re, float, False) == ['a', 5.0, 0.5034]
+    assert _number_finder('a5+5.034e-1', float_nosign_exp_re, float, False) == ['a', 5.0, '+', 0.5034]
+    assert _number_finder('a5+5.034e-1', float_sign_noexp_re, float, False) == ['a', 5.0, 5.034, 'e', -1.0]
+    assert _number_finder('a5+5.034e-1', float_nosign_noexp_re, float, False) == ['a', 5.0, '+', 5.034, 'e-', 1.0]
+    assert _number_finder('a5+5.034e-1', int_nosign_re, int, False) == ['a', 5, '+', 5, '.', 34, 'e-', 1]
+    assert _number_finder('a5+5.034e-1', int_sign_re, int, False) == ['a', 5, 5, '.', 34, 'e', -1]
+
+    assert _number_finder('a5+5.034e-1', float_sign_exp_re, float, True) == ['a', 5.0, '', 0.5034]
+    assert _number_finder('a5+5.034e-1', float_nosign_exp_re, float, True) == ['a', 5.0, '+', 0.5034]
+    assert _number_finder('a5+5.034e-1', float_sign_noexp_re, float, True) == ['a', 5.0, '', 5.034, 'e', -1.0]
+    assert _number_finder('a5+5.034e-1', float_nosign_noexp_re, float, True) == ['a', 5.0, '+', 5.034, 'e-', 1.0]
+    assert _number_finder('a5+5.034e-1', int_nosign_re, int, True) == ['a', 5, '+', 5, '.', 34, 'e-', 1]
+    assert _number_finder('a5+5.034e-1', int_sign_re, int, True) == ['a', 5, '', 5, '.', 34, 'e', -1]
+
+    assert _number_finder('6a5+5.034e-1', float_sign_exp_re, float, False) == ['', 
6.0, 'a', 5.0, 0.5034]
+    assert _number_finder('6a5+5.034e-1', float_sign_exp_re, float, True) == ['', 6.0, 'a', 5.0, '', 0.5034]
+
+
+def test_py3_safe():
+
+    assert _py3_safe(['a', 'b', 'c']) == ['a', 'b', 'c']
+    assert _py3_safe(['a']) == ['a']
+    assert _py3_safe(['a', 5]) == ['a', 5]
+    assert _py3_safe([5, 9]) == [5, '', 9]
+
+
+def test_natsort_key_private():
+
+    a = ['num3', 'num5', 'num2']
+    a.sort(key=_natsort_key)
+    assert a == ['num2', 'num3', 'num5']
+
+    # The below illustrates how the key works, and how the different options affect sorting.
+    assert _natsort_key('a-5.034e1') == ('a', -50.34)
+    assert _natsort_key('a-5.034e1', number_type=float, signed=True, exp=True) == ('a', -50.34)
+    assert _natsort_key('a-5.034e1', number_type=float, signed=True, exp=False) == ('a', -5.034, 'e', 1.0)
+    assert _natsort_key('a-5.034e1', number_type=float, signed=False, exp=True) == ('a-', 50.34)
+    assert _natsort_key('a-5.034e1', number_type=float, signed=False, exp=False) == ('a-', 5.034, 'e', 1.0)
+    assert _natsort_key('a-5.034e1', number_type=int) == ('a', -5, '.', 34, 'e', 1)
+    assert _natsort_key('a-5.034e1', number_type=int, signed=False) == ('a-', 5, '.', 34, 'e', 1)
+    assert _natsort_key('a-5.034e1', number_type=None) == _natsort_key('a-5.034e1', number_type=int, signed=False)
+    assert _natsort_key('a-5.034e1', key=lambda x: x.upper()) == ('A', -50.34)
+
+    # Iterables are parsed recursively so you can sort lists of lists.
+    assert _natsort_key(('a1', 'a-5.034e1')) == (('a', 1.0), ('a', -50.34))
+    assert _natsort_key(('a1', 'a-5.034e1'), number_type=None) == (('a', 1), ('a-', 5, '.', 34, 'e', 1))
+    # A key is applied before recursion, but not in the recursive calls.
+    assert _natsort_key(('a1', 'a-5.034e1'), key=itemgetter(1)) == ('a', -50.34)
+
+    # Strings that lead with a number get an empty string at the front of the tuple.
+    # This is designed to get around the "unorderable types" issue. 
+    assert _natsort_key(('15a', '6')) == (('', 15.0, 'a'), ('', 6.0))
+    assert _natsort_key(10) == ('', 10)
+
+    # Turn on as_path to split a file path into components
+    assert _natsort_key('/p/Folder (10)/file34.5nm (2).tar.gz', as_path=True) == (('/',), ('p', ), ('Folder (', 10.0, ')',), ('file', 34.5, 'nm (', 2.0, ')'), ('.tar',), ('.gz',))
+    assert _natsort_key('../Folder (10)/file (2).tar.gz', as_path=True) == (('..', ), ('Folder (', 10.0, ')',), ('file (', 2.0, ')'), ('.tar',), ('.gz',))
+    assert _natsort_key('Folder (10)/file.f34.5nm (2).tar.gz', as_path=True) == (('Folder (', 10.0, ')',), ('file.f', 34.5, 'nm (', 2.0, ')'), ('.tar',), ('.gz',))
+
+    # It gracefully handles as_path for numeric input by putting an extra tuple around it
+    # so it will sort against the other as_path results.
+    assert _natsort_key(10, as_path=True) == (('', 10),)
+    # as_path also handles recursion well.
+    assert _natsort_key(('/Folder', '/Folder (1)'), as_path=True) == ((('/',), ('Folder',)), (('/',), ('Folder (', 1.0, ')')))
+
+    # Turn on py3_safe to put a '' between adjacent numbers
+    assert _natsort_key('43h7+3', py3_safe=True) == ('', 43.0, 'h', 7.0, '', 3.0)
+
+    # Invalid arguments give the correct response
+    with raises(ValueError) as err:
+        _natsort_key('a', number_type='float')
+    assert str(err.value) == "_natsort_key: 'number_type' parameter 'float' invalid"
+    with raises(ValueError) as err:
+        _natsort_key('a', signed='True')
+    assert str(err.value) == "_natsort_key: 'signed' parameter 'True' invalid"
+    with raises(ValueError) as err:
+        _natsort_key('a', exp='False')
+    assert str(err.value) == "_natsort_key: 'exp' parameter 'False' invalid"
+
+
+def test_natsort_key_public():
+
+    # Identical to _natsort_key
+    # But it raises a depreciation warning
+    with warnings.catch_warnings(record=True) as w:
+        warnings.simplefilter("always")
+        assert natsort_key('a-5.034e1') == _natsort_key('a-5.034e1')
+        assert len(w) == 1
+        assert "natsort_key is depreciated as of 3.4.0, please use 
natsort_keygen" in str(w[-1].message)
+        assert natsort_key('a-5.034e1', number_type=float, signed=False, exp=False) == _natsort_key('a-5.034e1', number_type=float, signed=False, exp=False)
+
+    # It is called for each element in a list when sorting
+    with warnings.catch_warnings(record=True) as w:
+        warnings.simplefilter("always")
+        a = ['a2', 'a5', 'a9', 'a1', 'a4', 'a10', 'a6']
+        a.sort(key=natsort_key)
+        assert len(w) == 7
+
+
+def test_natsort_keygen():
+
+    # Creates equivalent natsort keys
+    a = 'a-5.034e1'
+    assert natsort_keygen()(a) == _natsort_key(a)
+    assert natsort_keygen(signed=False)(a) == _natsort_key(a, signed=False)
+    assert natsort_keygen(exp=False)(a) == _natsort_key(a, exp=False)
+    assert natsort_keygen(signed=False, exp=False)(a) == _natsort_key(a, signed=False, exp=False)
+    assert natsort_keygen(number_type=int)(a) == _natsort_key(a, number_type=int)
+    assert natsort_keygen(number_type=int, signed=False)(a) == _natsort_key(a, number_type=int, signed=False)
+    assert natsort_keygen(number_type=None)(a) == _natsort_key(a, number_type=None)
+    assert natsort_keygen(as_path=True)(a) == _natsort_key(a, as_path=True)
+
+    # Custom keys are more straightforward with keygen
+    f1 = natsort_keygen(key=lambda x: x.upper())
+    f2 = lambda x: _natsort_key(x, key=lambda y: y.upper())
+    assert f1(a) == f2(a)
+
+    # It also makes sorting lists in-place easier (no lambdas!) 
+    a = ['a50', 'a51.', 'a50.31', 'a50.4', 'a5.034e1', 'a50.300']
+    b = a[:]
+    a.sort(key=natsort_keygen(number_type=int))
+    assert a == natsorted(b, number_type=int)
+
+
+def test_natsorted():
+
+    # Basic usage
+    a = ['a2', 'a5', 'a9', 'a1', 'a4', 'a10', 'a6']
+    assert natsorted(a) == ['a1', 'a2', 'a4', 'a5', 'a6', 'a9', 'a10']
+
+    # Number types
+    a = ['a50', 'a51.', 'a50.31', 'a50.4', 'a5.034e1', 'a50.300']
+    assert natsorted(a) == ['a50', 'a50.300', 'a50.31', 'a5.034e1', 'a50.4', 'a51.']
+    assert natsorted(a, number_type=float, exp=False) == ['a5.034e1', 'a50', 'a50.300', 'a50.31', 'a50.4', 'a51.']
+    assert natsorted(a, number_type=int) == ['a5.034e1', 'a50', 'a50.4', 'a50.31', 'a50.300', 'a51.']
+    assert natsorted(a, number_type=None) == ['a5.034e1', 'a50', 'a50.4', 'a50.31', 'a50.300', 'a51.']
+
+    # Signed option
+    a = ['a-5', 'a7', 'a+2']
+    assert natsorted(a) == ['a-5', 'a+2', 'a7']
+    assert natsorted(a, signed=False) == ['a7', 'a+2', 'a-5']
+
+    # Number type == None
+    a = ['1.9.9a', '1.11', '1.9.9b', '1.11.4', '1.10.1']
+    assert natsorted(a) == ['1.10.1', '1.11', '1.11.4', '1.9.9a', '1.9.9b']
+    assert natsorted(a, number_type=None) == ['1.9.9a', '1.9.9b', '1.10.1', '1.11', '1.11.4']
+
+    # You can mix types with natsorted. This can get around the new
+    # 'unorderable types' issue with Python 3.
+    a = [6, 4.5, '7', '2.5', 'a']
+    assert natsorted(a) == ['2.5', 4.5, 6, '7', 'a']
+    a = [46, '5a5b2', 'af5', '5a5-4']
+    assert natsorted(a) == ['5a5-4', '5a5b2', 46, 'af5']
+
+    # You still can't sort non-iterables
+    with raises(TypeError) as err:
+        natsorted(100)
+    assert str(err.value) == "'int' object is not iterable"
+
+    # natsort will recursively descend into lists of lists so you can
+    # sort by the sublist contents. 
+    data = [['a1', 'a5'], ['a1', 'a40'], ['a10', 'a1'], ['a2', 'a5']]
+    assert natsorted(data) == [['a1', 'a5'], ['a1', 'a40'],
+                               ['a2', 'a5'], ['a10', 'a1']]
+
+    # You can pass a key to do non-standard sorting rules
+    b = [('a', 'num3'), ('b', 'num5'), ('c', 'num2')]
+    c = [('c', 'num2'), ('a', 'num3'), ('b', 'num5')]
+    assert natsorted(b, key=itemgetter(1)) == c
+
+    # Reversing the order is allowed
+    a = ['a50', 'a51.', 'a50.31', 'a50.4', 'a5.034e1', 'a50.300']
+    b = ['a50', 'a50.300', 'a50.31', 'a5.034e1', 'a50.4', 'a51.']
+    assert natsorted(a, reverse=True) == b[::-1]
+
+    # Sorting paths just got easier
+    a = ['/p/Folder (10)/file.tar.gz',
+         '/p/Folder/file.tar.gz',
+         '/p/Folder (1)/file (1).tar.gz',
+         '/p/Folder (1)/file.tar.gz']
+    assert natsorted(a) == ['/p/Folder (1)/file (1).tar.gz',
+                            '/p/Folder (1)/file.tar.gz',
+                            '/p/Folder (10)/file.tar.gz',
+                            '/p/Folder/file.tar.gz']
+    assert natsorted(a, as_path=True) == ['/p/Folder/file.tar.gz',
+                                          '/p/Folder (1)/file.tar.gz',
+                                          '/p/Folder (1)/file (1).tar.gz',
+                                          '/p/Folder (10)/file.tar.gz']
+
+    # You can sort paths and numbers, not that you'd want to
+    a = ['/Folder (9)/file.exe', 43]
+    assert natsorted(a, as_path=True) == [43, '/Folder (9)/file.exe']
+
+
+def test_versorted():
+
+    a = ['1.9.9a', '1.11', '1.9.9b', '1.11.4', '1.10.1']
+    assert versorted(a) == natsorted(a, number_type=None)
+    assert versorted(a, reverse=True) == versorted(a)[::-1]
+    a = [('a', '1.9.9a'), ('a', '1.11'), ('a', '1.9.9b'),
+         ('a', '1.11.4'), ('a', '1.10.1')]
+    assert versorted(a) == [('a', '1.9.9a'), ('a', '1.9.9b'), ('a', '1.10.1'),
+                            ('a', '1.11'), ('a', '1.11.4')]
+
+    # Sorting paths just got easier
+    a = ['/p/Folder (10)/file1.1.0.tar.gz',
+         '/p/Folder/file1.1.0.tar.gz',
+         '/p/Folder (1)/file1.1.0 (1).tar.gz',
+         '/p/Folder (1)/file1.1.0.tar.gz']
+    assert versorted(a) == ['/p/Folder (1)/file1.1.0 (1).tar.gz',
+                            '/p/Folder (1)/file1.1.0.tar.gz',
+                            '/p/Folder (10)/file1.1.0.tar.gz',
+                            '/p/Folder/file1.1.0.tar.gz']
+    assert versorted(a, 
as_path=True) == ['/p/Folder/file1.1.0.tar.gz',
+                                          '/p/Folder (1)/file1.1.0.tar.gz',
+                                          '/p/Folder (1)/file1.1.0 (1).tar.gz',
+                                          '/p/Folder (10)/file1.1.0.tar.gz']
+
+
+def test_index_natsorted():
+
+    # Return the indexes of how the iterable would be sorted.
+    a = ['num3', 'num5', 'num2']
+    b = ['foo', 'bar', 'baz']
+    index = index_natsorted(a)
+    assert index == [2, 0, 1]
+    assert [a[i] for i in index] == ['num2', 'num3', 'num5']
+    assert [b[i] for i in index] == ['baz', 'foo', 'bar']
+    assert index_natsorted(a, reverse=True) == [1, 0, 2]
+
+    # It accepts a key argument.
+    c = [('a', 'num3'), ('b', 'num5'), ('c', 'num2')]
+    assert index_natsorted(c, key=itemgetter(1)) == [2, 0, 1]
+
+    # It can avoid "unorderable types" on Python 3
+    a = [46, '5a5b2', 'af5', '5a5-4']
+    assert index_natsorted(a) == [3, 1, 0, 2]
+
+    # It can sort lists of lists.
+    data = [['a1', 'a5'], ['a1', 'a40'], ['a10', 'a1'], ['a2', 'a5']]
+    assert index_natsorted(data) == [0, 1, 3, 2]
+
+    # It can sort paths too
+    a = ['/p/Folder (10)/',
+         '/p/Folder/',
+         '/p/Folder (1)/']
+    assert index_natsorted(a, as_path=True) == [1, 2, 0]
+
+
+def test_index_versorted():
+
+    a = ['1.9.9a', '1.11', '1.9.9b', '1.11.4', '1.10.1']
+    assert index_versorted(a) == index_natsorted(a, number_type=None)
+    assert index_versorted(a, reverse=True) == index_versorted(a)[::-1]
+    a = [('a', '1.9.9a'), ('a', '1.11'), ('a', '1.9.9b'),
+         ('a', '1.11.4'), ('a', '1.10.1')]
+    assert index_versorted(a) == [0, 2, 4, 1, 3]
+
+    # It can sort paths too
+    a = ['/p/Folder (10)/file1.1.0.tar.gz',
+         '/p/Folder/file1.1.0.tar.gz',
+         '/p/Folder (1)/file1.1.0 (1).tar.gz',
+         '/p/Folder (1)/file1.1.0.tar.gz']
+    assert index_versorted(a, as_path=True) == [1, 3, 2, 0]
+
+
+def test_order_by_index():
+
+    # Return the indexes of how the iterable would be sorted. 
+    a = ['num3', 'num5', 'num2']
+    index = [2, 0, 1]
+    assert order_by_index(a, index) == ['num2', 'num3', 'num5']
+    assert order_by_index(a, index) == [a[i] for i in index]
+    assert order_by_index(a, index, True) != [a[i] for i in index]
+    assert list(order_by_index(a, index, True)) == [a[i] for i in index]
diff --git a/test_natsort/test_natsorted.py b/test_natsort/test_natsorted.py
deleted file mode 100644
index bfb071d..0000000
--- a/test_natsort/test_natsorted.py
+++ /dev/null
@@ -1,159 +0,0 @@
-# -*- coding: utf-8 -*-
-"""\
-Here are a collection of examples of how this module can be used.
-See the README or the natsort homepage for more details.
-"""
-from operator import itemgetter
-from pytest import raises
-from natsort import natsorted, index_natsorted, natsort_key, versorted, index_versorted
-from natsort.natsort import _remove_empty, _number_finder, _py3_safe
-from natsort.natsort import float_sign_exp_re, float_nosign_exp_re, float_sign_noexp_re
-from natsort.natsort import float_nosign_noexp_re, int_nosign_re, int_sign_re
-
-
-def test__remove_empty():
-
-    assert _remove_empty(['a', 2, '', 'b', '']) == ['a', 2, 'b']
-    assert _remove_empty(['a', 2, 'b', '']) == ['a', 2, 'b']
-    assert _remove_empty(['a', 2, 'b']) == ['a', 2, 'b']
-
-
-def test_number_finder():
-
-    assert _number_finder('a5+5.034e-1', float_sign_exp_re, float, False) == ['a', 5.0, 0.5034]
-    assert _number_finder('a5+5.034e-1', float_nosign_exp_re, float, False) == ['a', 5.0, '+', 0.5034]
-    assert _number_finder('a5+5.034e-1', float_sign_noexp_re, float, False) == ['a', 5.0, 5.034, 'e', -1.0]
-    assert _number_finder('a5+5.034e-1', float_nosign_noexp_re, float, False) == ['a', 5.0, '+', 5.034, 'e-', 1.0]
-    assert _number_finder('a5+5.034e-1', int_nosign_re, int, False) == ['a', 5, '+', 5, '.', 34, 'e-', 1]
-    assert _number_finder('a5+5.034e-1', int_sign_re, int, False) == ['a', 5, 5, '.', 34, 'e', -1]
-
-    assert _number_finder('a5+5.034e-1', float_sign_exp_re, float, True) == ['a', 5.0, '', 
0.5034]
-    assert _number_finder('a5+5.034e-1', float_nosign_exp_re, float, True) == ['a', 5.0, '+', 0.5034]
-    assert _number_finder('a5+5.034e-1', float_sign_noexp_re, float, True) == ['a', 5.0, '', 5.034, 'e', -1.0]
-    assert _number_finder('a5+5.034e-1', float_nosign_noexp_re, float, True) == ['a', 5.0, '+', 5.034, 'e-', 1.0]
-    assert _number_finder('a5+5.034e-1', int_nosign_re, int, True) == ['a', 5, '+', 5, '.', 34, 'e-', 1]
-    assert _number_finder('a5+5.034e-1', int_sign_re, int, True) == ['a', 5, '', 5, '.', 34, 'e', -1]
-
-    assert _number_finder('6a5+5.034e-1', float_sign_exp_re, float, False) == ['', 6.0, 'a', 5.0, 0.5034]
-    assert _number_finder('6a5+5.034e-1', float_sign_exp_re, float, True) == ['', 6.0, 'a', 5.0, '', 0.5034]
-
-
-def test_py3_safe():
-
-    assert _py3_safe(['a', 'b', 'c']) == ['a', 'b', 'c']
-    assert _py3_safe(['a']) == ['a']
-    assert _py3_safe(['a', 5]) == ['a', 5]
-    assert _py3_safe([5, 9]) == [5, '', 9]
-
-
-def test_natsort_key():
-
-    a = ['num3', 'num5', 'num2']
-    a.sort(key=natsort_key)
-    assert a == ['num2', 'num3', 'num5']
-
-    # The below illustrates how the key works, and how the different options affect sorting.
-    assert natsort_key('a-5.034e1') == ('a', -50.34)
-    assert natsort_key('a-5.034e1', number_type=float, signed=True, exp=True) == ('a', -50.34)
-    assert natsort_key('a-5.034e1', number_type=float, signed=True, exp=False) == ('a', -5.034, 'e', 1.0)
-    assert natsort_key('a-5.034e1', number_type=float, signed=False, exp=True) == ('a-', 50.34)
-    assert natsort_key('a-5.034e1', number_type=float, signed=False, exp=False) == ('a-', 5.034, 'e', 1.0)
-    assert natsort_key('a-5.034e1', number_type=int) == ('a', -5, '.', 34, 'e', 1)
-    assert natsort_key('a-5.034e1', number_type=int, signed=False) == ('a-', 5, '.', 34, 'e', 1)
-    assert natsort_key('a-5.034e1', number_type=None) == natsort_key('a-5.034e1', number_type=int, signed=False)
-
-    # Iterables are parsed recursively so you can sort lists of lists. 
-    assert natsort_key(('a1', 'a10')) == (('a', 1.0), ('a', 10.0))
-
-    # Strings that lead with a number get an empty string at the front of the tuple.
-    # This is designed to get around the "unorderable types" issue.
-    assert natsort_key(('15a', '6')) == (('', 15.0, 'a'), ('', 6.0))
-    assert natsort_key(10) == ('', 10)
-
-    # Turn on py3_safe to put a '' between adjacent numbers
-    assert natsort_key('43h7+3', py3_safe=True) == ('', 43.0, 'h', 7.0, '', 3.0)
-
-    # Invalid arguments give the correct response
-    with raises(ValueError) as err:
-        natsort_key('a', number_type='float')
-    assert str(err.value) == "natsort_key: 'number_type' parameter 'float' invalid"
-    with raises(ValueError) as err:
-        natsort_key('a', signed='True')
-    assert str(err.value) == "natsort_key: 'signed' parameter 'True' invalid"
-    with raises(ValueError) as err:
-        natsort_key('a', exp='False')
-    assert str(err.value) == "natsort_key: 'exp' parameter 'False' invalid"
-
-
-def test_natsorted():
-
-    # Basic usage
-    a = ['a2', 'a5', 'a9', 'a1', 'a4', 'a10', 'a6']
-    assert natsorted(a) == ['a1', 'a2', 'a4', 'a5', 'a6', 'a9', 'a10']
-
-    # Number types
-    a = ['a50', 'a51.', 'a50.31', 'a50.4', 'a5.034e1', 'a50.300']
-    assert natsorted(a) == ['a50', 'a50.300', 'a50.31', 'a5.034e1', 'a50.4', 'a51.']
-    assert natsorted(a, number_type=float, exp=False) == ['a5.034e1', 'a50', 'a50.300', 'a50.31', 'a50.4', 'a51.']
-    assert natsorted(a, number_type=int) == ['a5.034e1', 'a50', 'a50.4', 'a50.31', 'a50.300', 'a51.']
-    assert natsorted(a, number_type=None) == ['a5.034e1', 'a50', 'a50.4', 'a50.31', 'a50.300', 'a51.']
-
-    # Signed option
-    a = ['a-5', 'a7', 'a+2']
-    assert natsorted(a) == ['a-5', 'a+2', 'a7']
-    assert natsorted(a, signed=False) == ['a7', 'a+2', 'a-5']
-
-    # Number type == None
-    a = ['1.9.9a', '1.11', '1.9.9b', '1.11.4', '1.10.1']
-    assert natsorted(a) == ['1.10.1', '1.11', '1.11.4', '1.9.9a', '1.9.9b']
-    assert natsorted(a, number_type=None) == ['1.9.9a', '1.9.9b', '1.10.1', '1.11', '1.11.4']
-
-    # 
You can mix types with natsorted. This can get around the new
-    # 'unorderable types' issue with Python 3.
-    a = [6, 4.5, '7', '2.5', 'a']
-    assert natsorted(a) == ['2.5', 4.5, 6, '7', 'a']
-    a = [46, '5a5b2', 'af5', '5a5-4']
-    assert natsorted(a) == ['5a5-4', '5a5b2', 46, 'af5']
-
-    # You still can't sort non-iterables
-    with raises(TypeError) as err:
-        natsorted(100)
-    assert str(err.value) == "'int' object is not iterable"
-
-    # natsort will recursively descend into lists of lists so you can sort by the sublist contents.
-    data = [['a1', 'a5'], ['a1', 'a40'], ['a10', 'a1'], ['a2', 'a5']]
-    assert natsorted(data) == [['a1', 'a5'], ['a1', 'a40'], ['a2', 'a5'], ['a10', 'a1']]
-
-    # You can pass a key to do non-standard sorting rules
-    b = [('a', 'num3'), ('b', 'num5'), ('c', 'num2')]
-    assert natsorted(b, key=itemgetter(1)) == [('c', 'num2'), ('a', 'num3'), ('b', 'num5')]
-
-
-def test_versorted():
-
-    a = ['1.9.9a', '1.11', '1.9.9b', '1.11.4', '1.10.1']
-    assert versorted(a) == natsorted(a, number_type=None)
-
-def test_index_natsorted():
-
-    # Return the indexes of how the iterable would be sorted.
-    a = ['num3', 'num5', 'num2']
-    b = ['foo', 'bar', 'baz']
-    index = index_natsorted(a)
-    assert index == [2, 0, 1]
-    assert [a[i] for i in index] == ['num2', 'num3', 'num5']
-    assert [b[i] for i in index] == ['baz', 'foo', 'bar']
-
-    # It accepts a key argument.
-    c = [('a', 'num3'), ('b', 'num5'), ('c', 'num2')]
-    assert index_natsorted(c, key=itemgetter(1)) == [2, 0, 1]
-
-    # It can avoid "unorderable types" on Python 3
-    a = [46, '5a5b2', 'af5', '5a5-4']
-    assert index_natsorted(a) == [3, 1, 0, 2]
-
-
-def test_index_versorted():
-
-    a = ['1.9.9a', '1.11', '1.9.9b', '1.11.4', '1.10.1']
-    assert index_versorted(a) == index_natsorted(a, number_type=None)