author blackbird <devnull@localhost> 2007-01-13 13:09:14 +0100
committer blackbird <devnull@localhost> 2007-01-13 13:09:14 +0100
commit 420f485a95f7aa71a40dc2aad356e927d2a5237c
tree db935a65503272310df3d9c50e23dae694636adc /docs/src
parent 319a9f06a787b23d77117973c60737ab7baf5a55
[svn] added unicode information for pygments
Diffstat (limited to 'docs/src')
-rw-r--r-- docs/src/index.txt | 2
-rw-r--r-- docs/src/unicode.txt | 31
2 files changed, 33 insertions, 0 deletions
diff --git a/docs/src/index.txt b/docs/src/index.txt
index 1b80240a..6c7ca276 100644
--- a/docs/src/index.txt
+++ b/docs/src/index.txt
@@ -24,6 +24,8 @@ Welcome to the Pygments documentation.
- `Styles <styles.txt>`_
+ - `Unicode <unicode.txt>`_
+
- API and more
- `API documentation <api.txt>`_
diff --git a/docs/src/unicode.txt b/docs/src/unicode.txt
new file mode 100644
index 00000000..3e1c0b6b
--- /dev/null
+++ b/docs/src/unicode.txt
@@ -0,0 +1,31 @@
+===============
+Unicode Support
+===============
+
+Since Pygments 0.6, all lexers use unicode strings internally. Because of that,
+you may run into a `UnicodeDecodeError` if you pass a lexer strings in the
+wrong encoding.
+
+By default, all lexers have `encoding` set to `latin1`. If you pass a lexer a
+string object (not unicode), it tries to decode the data using this encoding.
+You can override the encoding using the `encoding` parameter. If you have the
+`chardet`_ library installed and set the encoding to ``guess``, it will analyse
+the text and pick the most likely encoding automatically:
+
+.. sourcecode:: python
+
+ from pygments.lexers import PythonLexer
+ lexer = PythonLexer(encoding='guess')
+
+The best approach is to pass Pygments unicode objects. In that case no decoding
+is needed and you won't get unexpected output.
+
+The formatters now write unicode objects to the stream unless you set an
+encoding, which you can do by passing them an `encoding` parameter:
+
+.. sourcecode:: python
+
+ from pygments.formatters import HtmlFormatter
+ f = HtmlFormatter(encoding='utf-8')
+
+.. _chardet: http://chardet.feedparser.org/
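
As a side note to the advice in ``unicode.txt`` about passing unicode objects: the sketch below shows, using only the Python standard library, why a wrong encoding produces either a `UnicodeDecodeError` or silently garbled text. It is not part of the commit. It is written for Python 3, where `str` plays the role of the unicode object and `bytes` the role of the old string object. The helper name `to_unicode` and its `fallback` parameter are hypothetical and are not part of the Pygments API.

```python
def to_unicode(data, encoding="utf-8", fallback="latin1"):
    """Hypothetical helper: return text, decoding bytes only if necessary."""
    if isinstance(data, str):
        # Already unicode: nothing to decode, nothing can go wrong.
        return data
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        # latin1 maps every byte to a code point, so this never raises,
        # but the result may be garbled if the guess is wrong.
        return data.decode(fallback)


raw = "caf\u00e9".encode("utf-8")   # the bytes b'caf\xc3\xa9'

text = to_unicode(raw)              # decoded with the correct encoding
mojibake = raw.decode("latin1")     # decodes without error, but wrongly
```

The value in `text` is what one would then hand to a lexer. `mojibake` illustrates the kind of unexpected output that results when UTF-8 data is decoded as `latin1`.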