author Karsten Blees <karsten.blees@gmail.com> 2015-07-01 21:10:47 +0200
committer Junio C Hamano <gitster@pobox.com> 2015-07-01 14:55:53 -0700
commit 3a59e5954ef19ac94522219c2f29d49a187d31d8
tree b79952098c087d313e92af57fb8fe0f87bb7a696
parent a5fe66802f8c4036badd54ff36ff327d43236e7e
Documentation/i18n.txt: clarify character encoding support [kb/i18n-doc]
As a "distributed" VCS, git should better define the encodings of its core textual data structures, in particular those that are part of the network protocol.

That git is encoding agnostic is only really true for blob objects. E.g. the 'non-NUL bytes' requirement of tree and commit objects excludes UTF-16/32, and the special meaning of '/' in the index file as well as space and linefeed in commit objects eliminates EBCDIC and other non-ASCII encodings.

Git expects bytes < 0x80 to be pure ASCII, thus CJK encodings that partly overlap with the ASCII range are problematic as well. E.g. fmt_ident() removes trailing 0x5C from user names on the assumption that it is ASCII '\'. However, there are over 200 GBK double byte codes that end in 0x5C.

UTF-8 as default encoding on Linux and respective path translations in the Mac and Windows versions have established UTF-8 NFC as de-facto standard for path names.

Update the documentation in i18n.txt to reflect the current status-quo.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
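
The encoding constraints described in the message above can be checked with a short Python sketch. It is illustrative only and is not git's code: the helper names are hypothetical, `strip_trailing_backslash` is a simplified stand-in for the byte-level cleanup attributed to fmt_ident(), and the set of GBK codes found depends on the mapping table of Python's `gbk` codec, so the count it reports may differ from the figure quoted in the message.

```python
def contains_nul(text, encoding):
    """Return True if encoding `text` produces any 0x00 byte."""
    return b"\x00" in text.encode(encoding)


def ascii_compatible(ch, encoding):
    """Return True if `ch` encodes to the same bytes as in ASCII."""
    return ch.encode(encoding) == ch.encode("ascii")


def gbk_codes_ending_in_backslash():
    """Collect two-byte sequences (lead, 0x5C) that Python's 'gbk'
    codec decodes to exactly one character."""
    found = []
    for lead in range(0x81, 0xFF):  # GBK lead bytes 0x81..0xFE
        raw = bytes([lead, 0x5C])
        try:
            ch = raw.decode("gbk")
        except UnicodeDecodeError:
            continue  # no mapping for this pair in the codec's table
        if len(ch) == 1:
            found.append((raw, ch))
    return found


def strip_trailing_backslash(raw):
    """Hypothetical stand-in for a cleanup step that assumes a
    trailing 0x5C byte is always an ASCII backslash."""
    return raw.rstrip(b"\x5c")


if __name__ == "__main__":
    # UTF-16 puts NUL bytes into ASCII-range text; UTF-8 does not.
    print(contains_nul("A", "utf-16-le"), contains_nul("A", "utf-8"))
    # EBCDIC (cp037) does not place '/' at its ASCII position.
    print(ascii_compatible("/", "cp037"))
    codes = gbk_codes_ending_in_backslash()
    print(len(codes), "GBK codes end in 0x5C in this codec")
    raw, ch = codes[0]
    # Stripping the trail byte leaves a lone lead byte: the name is corrupted.
    print(raw, "->", strip_trailing_backslash(raw))
```

Running the same checks with UTF-8 shows why it is safe here: every byte of a multi-byte UTF-8 sequence is 0x80 or above, so a trailing 0x5C can only be a real backslash.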
Diffstat (limited to 'git-send-email.perl')
0 files changed, 0 insertions, 0 deletions