| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Surrogates are invalid in both UTF-32 and UTF-8.
See https://www.unicode.org/versions/Unicode15.0.0/ch03.pdf#G28875
and https://www.unicode.org/versions/Unicode15.0.0/ch03.pdf#G31703
|
|
|
|
|
|
|
|
|
| |
Previously we included it with an `-include` compiler directive. But
that's not portable. And it's better to be explicit anyway.
Every .c file should have `include "config.h"` first thing.
Signed-off-by: Ran Benita <ran@unusedvar.com>
|
|
|
|
|
|
|
|
|
|
| |
It used to be UTF-8 was defined for inputs > 0x10FFFF, but nowadays
that's the maximum and a codepoint is encoded up to 4 bytes, not 6.
Fixes: https://github.com/xkbcommon/libxkbcommon/issues/58
Fixes: https://github.com/xkbcommon/libxkbcommon/issues/59
Reported-by: @andrecbarros
Signed-off-by: Ran Benita <ran234@gmail.com>
|
|
|
|
| |
Signed-off-by: Ran Benita <ran234@gmail.com>
|
|
We need to validate some UTF-8, so this adds an is_valid_utf8()
function, which is probably pretty slow but should work correctly.
Signed-off-by: Ran Benita <ran234@gmail.com>
|