Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Really only use PyUnicode_FromUnicode() when needed (GH-3697) | scoder | 2020-06-30 | 1 | -0/+28 |
| | Really only use PyUnicode_FromUnicode() for strings that contain lone surrogates, not for normal non-BMP strings and not for surrogate pairs on 16-bit Unicode platforms. See https://github.com/cython/cython/issues/3678 * Extend buildenv test to debug a macOS problem. * Add a test for surrogate pairs in Unicode strings. * Limit PyUnicode_FromUnicode() usage to strings containing lone surrogates. * Accept ambiguity of surrogate pairs in Unicode string literals when generated on 16-bit Py2 systems. | ||||
* | Fix many indentation and whitespace issues throughout the code base (GH-3673) | scoder | 2020-06-10 | 1 | -2/+2 |
| | | | … and enforce them with pycodestyle. | ||||
* | unicode imports (#3119) | da-woods | 2019-09-30 | 1 | -0/+21 |
| | | | | | * Handle normalization of unicode identifiers * Support unicode characters in module names (Only valid under Python 3) | ||||
* | Unicode identifiers (PEP 3131) (GH-3081) | da-woods | 2019-08-24 | 1 | -0/+8 |
| | | | Closes #2601 | ||||
* | Evaluate multiplication of string literals at compile time if the result is short (<= 256 characters). | Stefan Behnel | 2018-01-13 | 1 | -0/+8 |
| | |||||
* | whitespace | Stefan Behnel | 2017-02-11 | 1 | -0/+1 |
| | |||||
* | Merge branch '0.23.x' | Stefan Behnel | 2015-09-02 | 1 | -4/+9 |
|\ | | | | | | | | | | | Conflicts: Cython/Compiler/Optimize.py Cython/Compiler/StringEncoding.py | ||||
| * | fix bytes literal creation from compile-time DEF expressions (used to become Unicode strings due to missing encoding) | Stefan Behnel | 2015-09-02 | 1 | -3/+11 |
| | | |||||
* | | clean up and fix docstring serialisation (some are const, some are not) | Stefan Behnel | 2015-08-08 | 1 | -0/+7 |
|/ | |||||
* | adapt 'unicode' usage to Py2/Py3 | Stefan Behnel | 2015-07-26 | 1 | -8/+8 |
| | |||||
* | adapt usages of map() to Py2/Py3 | Stefan Behnel | 2015-07-25 | 1 | -1/+1 |
| | |||||
* | use explicit relative imports everywhere and enable absolute imports by default | Stefan Behnel | 2014-06-17 | 1 | -0/+3 |
| | |||||
* | support surrogates in unicode string literals in Py3.3 | Stefan Behnel | 2013-03-15 | 1 | -1/+21 |
| | |||||
* | Pass-through single surrogates in Py_UNICODE[] literal encoding routine. | Nikita Nemkin | 2013-03-07 | 1 | -3/+3 |
| | |||||
* | Compatibility fix: no UTF-32 codec in Python 2.4/2.5. | Nikita Nemkin | 2013-03-07 | 1 | -14/+21 |
| | |||||
* | Renamed Py_UNICODE* entities to use "pyunicode_ptr" prefix; fixed small issues in Py_UNICODE* support. | Nikita Nemkin | 2013-03-05 | 1 | -1/+1 |
| | |||||
* | Full support for Py_UNICODE[] literals with non-BMP characters. | Nikita Nemkin | 2013-03-03 | 1 | -6/+14 |
| | |||||
* | Basic support for Py_UNICODE* strings. | Nikita Nemkin | 2013-03-03 | 1 | -0/+12 |
| | |||||
* | preprocess byte string literal escaping instead of doing repeated replacements at runtime | Stefan Behnel | 2013-01-10 | 1 | -11/+15 |
| | |||||
* | undo Py3.3 surrogates support fixes - breaks too many special cases with strings | Stefan Behnel | 2013-01-10 | 1 | -26/+12 |
| | |||||
* | fix surrogates in Unicode literals in Python 3.3 (the UTF-8 codec rejects them explicitly) | Stefan Behnel | 2013-01-06 | 1 | -12/+26 |
| | |||||
* | Fix Python 3 deepcopy & sorting compatibility | Mark Florisson | 2011-10-03 | 1 | -0/+6 |
| | |||||
* | Fix trac #640, long string literals with escapes. | Robert Bradshaw | 2011-01-12 | 1 | -3/+17 |
| | |||||
* | support redundant parsing of string literals as unicode *and* bytes string, fix 'str' literal assignments to char* targets when using Future.unicode_literals | Stefan Behnel | 2010-09-04 | 1 | -0/+35 |
| | |||||
* | prevent control characters in unicode literals (ord<32) from sneaking into the C source | Stefan Behnel | 2010-08-09 | 1 | -2/+3 |
| | |||||
* | fix order of surrogate pair in wide unicode strings | Stefan Behnel | 2010-07-03 | 1 | -1/+1 |
| | |||||
* | fix parsing of wide unicode escapes on narrow Unicode platforms | Stefan Behnel | 2010-07-03 | 1 | -2/+13 |
| | |||||
* | Don't split long literals at backslash. | Robert Bradshaw | 2010-02-12 | 1 | -2/+2 |
| | |||||
* | Split long string literals at 2000 chars. | Robert Bradshaw | 2010-02-05 | 1 | -1/+3 |
| | | | | (There may not be enough line breaks...) | ||||
* | split BytesNode, UnicodeNode and StringNode | Stefan Behnel | 2009-10-10 | 1 | -7/+7 |
| | |||||
* | Py2 bytes handling fix | Stefan Behnel | 2009-08-21 | 1 | -3/+10 |
| | |||||
* | Py2.x fix after Py3 char fix ;) | Stefan Behnel | 2009-08-21 | 1 | -3/+3 |
| | |||||
* | properly handle char values (bytes with length 1) in Py3 | Stefan Behnel | 2009-08-21 | 1 | -2/+3 |
| | |||||
* | fix byte string escaping of '\' in Py2.x (broken by latest Py3 fixes) | Stefan Behnel | 2009-07-08 | 1 | -1/+1 |
| | |||||
* | enable % formatting of byte strings by providing a __str__() special method that encodes to unicode | Stefan Behnel | 2009-07-06 | 1 | -3/+3 |
| | |||||
* | Py3 fix: make sure byte strings end up in the code as expected (not like >>b'...'<<) | Stefan Behnel | 2009-07-06 | 1 | -4/+8 |
| | |||||
* | make sure header filenames pass literally into the C code | Stefan Behnel | 2009-07-06 | 1 | -0/+6 |
| | |||||
* | Py3 fixes | Stefan Behnel | 2009-07-05 | 1 | -1/+1 |
| | |||||
* | Py3 fix | Stefan Behnel | 2009-07-05 | 1 | -21/+45 |
| | |||||
* | Optimization for shorter docstrings. | Robert Bradshaw | 2008-08-16 | 1 | -0/+2 |
| | |||||
* | Split docstring around \n for compilers who barf at long string literals (VS 2003). | david@evans-2.local | 2008-08-15 | 1 | -0/+3 |
| | |||||
* | Rewrite of the string literal handling code | Stefan Behnel | 2008-08-15 | 1 | -0/+144 |
String literals pass through the compiler as follows:

- unicode string literals are stored as unicode strings and encoded to UTF-8 on the way out
- byte string literals are stored as correctly encoded byte strings by unescaping the source string literal into the corresponding byte sequence. No further encoding is done later on!
- char literals are stored as byte strings of length 1. This can be verified by the parser now, e.g. a non-ASCII char literal in UTF-8 source code will result in an error, as it would end up as two or more bytes in the C code, which can no longer be represented as a C char.

Storing byte strings is necessary as we otherwise lose the ability to encode byte string literals on the way out. They do not necessarily contain only bytes that fit into the source code encoding, as the source can use escape sequences to represent them. Previously, ASCII encoded source code could not contain byte string literals with properly escaped non-ASCII bytes.

Another bug that was fixed: in Python, escape sequences behave differently in unicode strings (where they represent the character code) and byte strings (where they represent a byte value). Previously, they resulted in the same byte value in Cython code. This is only a problem for non-ASCII escapes, since the character code and the byte value of ASCII characters are identical.
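
The three storage rules in this commit message can be illustrated with a small, self-contained Python sketch. It is not the code from StringEncoding.py: the function names (`encode_unicode_literal`, `unescape_byte_literal`, `check_char_literal`) are hypothetical, only `\xNN` escapes are handled, and the source file is assumed to be UTF-8 encoded.

```python
def encode_unicode_literal(value: str) -> bytes:
    """Unicode literals stay text internally and become UTF-8 only on output."""
    return value.encode("utf-8")


def unescape_byte_literal(source: str) -> bytes:
    """Turn the source text of a byte string literal into its byte sequence.

    Assumption: only well-formed \\xNN escapes are recognised here, and
    unescaped characters are encoded with the (assumed) UTF-8 source encoding.
    """
    out = bytearray()
    i = 0
    while i < len(source):
        if source.startswith("\\x", i):
            # An escape names a byte value directly, independent of encoding.
            out.append(int(source[i + 2:i + 4], 16))
            i += 4
        else:
            out.extend(source[i].encode("utf-8"))
            i += 1
    return bytes(out)


def check_char_literal(source: str) -> bytes:
    """A char literal must end up as exactly one byte to fit a C char."""
    value = unescape_byte_literal(source)
    if len(value) != 1:
        raise ValueError("char literal must be exactly one byte: %r" % source)
    return value


# The same escape means a character code in a unicode literal ...
assert encode_unicode_literal("\xe9") == b"\xc3\xa9"
# ... but a single byte value in a byte string literal.
assert unescape_byte_literal(r"\xe9") == b"\xe9"
```

With these helpers, `check_char_literal(r"\xe9")` is accepted because the escape yields one byte, while `check_char_literal("é")` raises `ValueError`, since the character occupies two bytes in UTF-8 source. This mirrors the parser check described in the message.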