Update docs w.r.t. encodings.

author: Georg Brandl <georg@python.org> 2014-11-06 13:18:32 +0100
committer: Georg Brandl <georg@python.org> 2014-11-06 13:18:32 +0100
commit: 8c0814068d229cfbf67f9e3a070bcdaa089c7ffa (patch)
tree: 3297ab209f67532ff71c9e8b82b6edd1f8b984a6 /doc/docs/unicode.rst
parent: 69e83eb0856666d2594c96b1e8fae42dbeb92318 (diff)
download: pygments-8c0814068d229cfbf67f9e3a070bcdaa089c7ffa.tar.gz
1 files changed, 14 insertions, 6 deletions
diff --git a/doc/docs/unicode.rst b/doc/docs/unicode.rst
index e79b4bec..7291a3b2 100644
--- a/doc/docs/unicode.rst
+++ b/doc/docs/unicode.rst
@@ -6,12 +6,20 @@ Since Pygments 0.6, all lexers use unicode strings internally. Because of that
 you might encounter the occasional :exc:`UnicodeDecodeError` if you pass strings
 with the wrong encoding.
 
-Per default all lexers have their input encoding set to `latin1`.
-If you pass a lexer a string object (not unicode), it tries to decode the data
-using this encoding.
-You can override the encoding using the `encoding` lexer option. If you have the
-`chardet`_ library installed and set the encoding to ``chardet`` if will ananlyse
-the text and use the encoding it thinks is the right one automatically:
+Per default all lexers have their input encoding set to `guess`.  This means
+that the following encodings are tried:
+
+* UTF-8 (including BOM handling)
+* The locale encoding (i.e. the result of `locale.getpreferredencoding()`)
+* As a last resort, `latin1`
+
+If you pass a lexer a byte string object (not unicode), it tries to decode the
+data using this encoding.
+
+You can override the encoding using the `encoding` or `inencoding` lexer
+options.  If you have the `chardet`_ library installed and set the encoding to
+``chardet`` it will analyse the text and use the encoding it thinks is the
+right one automatically:
 
 .. sourcecode:: python
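
The fallback order described in the added text (UTF-8 with BOM handling, then the locale encoding, then `latin1`) can be sketched roughly as follows. This is only an illustration of the idea, not the code Pygments itself uses; the function name `guess_decode_sketch` is hypothetical and chosen for this example.

```python
import locale


def guess_decode_sketch(data):
    """Decode a byte string by trying UTF-8, the locale encoding, then latin1.

    Returns a (text, encoding_used) pair. Hypothetical helper for
    illustration only; it is not part of the Pygments API.
    """
    try:
        # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
        return data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        pass
    try:
        # Second attempt: whatever the current locale prefers.
        enc = locale.getpreferredencoding()
        return data.decode(enc), enc
    except (UnicodeDecodeError, LookupError):
        pass
    # Last resort: latin1 maps every byte to a code point, so it cannot fail.
    return data.decode("latin1"), "latin1"


if __name__ == "__main__":
    print(guess_decode_sketch(b"\xef\xbb\xbfprint('hi')"))
    print(guess_decode_sketch("caf\u00e9".encode("utf-8")))
```

Because `latin1` accepts any byte sequence, the last step always succeeds, which is why it only makes sense as the final fallback: placed earlier, it would mask every other encoding.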