summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--UPGRADING.INTERNALS47
1 files changed, 47 insertions, 0 deletions
diff --git a/UPGRADING.INTERNALS b/UPGRADING.INTERNALS
index 6e30f7cad8..476e7c255e 100644
--- a/UPGRADING.INTERNALS
+++ b/UPGRADING.INTERNALS
@@ -2,6 +2,8 @@ PHP 7.1 INTERNALS UPGRADE NOTES
0. Wiki Examples
1. Internal API changes
+ e. Codepage handling on Windows
+ f. Path handling on Windows
2. Build system changes
a. Unix build system changes
@@ -21,6 +23,51 @@ changes. See: https://wiki.php.net/phpng-upgrading
1. Internal API changes
========================
+ e. Codepage handling on Windows
+
+ A set of new APIs was introduced, which allows to handle codepage
+ conversions. The corresponding prototypes and macros are contained
+ in win32/codepage.h.
+
+ Functions with php_win32_cp_* signatures provide handling for various
+ codepage aspects. Primarily they are in use at various places in the
+ I/O utils code and directly in the core where necessary, providing
+ conversions to/from UTF-16. Arbitrary conversions between codepages
+ are possible as well, whereby UTF-16 will be always an intermediate
+ state in this case.
+
+ For input length arguments, the macro PHP_WIN32_CP_IGNORE_LEN can be
+ passed, then the API will calculate the length. For output length
+ arguments, the macro PHP_WIN32_CP_IGNORE_LEN_P can be passed, then
+ the API won't set the output length.
+
+ The mapping between encodings and codepages is provided by the predefined
+ array of data contained in win32/cp_enc_map.c. To change the data,
+ a generator win32/cp_enc_map_gen.c needs to be edited and run.
+
+ f. Path handling on Windows
+
+ A set of new APIs was introduced, which allows to handle UTF-8 paths. The
+ corresponding prototypes and macros are contained in win32/ioutil.h.
+
+ Functions with php_win32_ioutil_* signatures provide POSIX I/O analogues.
+ These functions are integrated in various places across the code base to
+ support Unicode filenames. While accepting char * arguments, internally
+ the conversion to wchar_t * happens. Internally almost no ANSI APIs are
+ used, but directly their wide equivalents. The string conversion rules
+ correspond to those already present in the core and depend on the current
+ encoding settings. Doing so allows to move away from the ANSI Windows API
+ with its dependency on the system OEM/ANSI codepage.
+
+ Thanks to the wide API usage, the long paths are now supported as well. The
+ PHP_WIN32_IOUTIL_MAXPATHLEN macro is defined to 2048 bytes and will override
+ the MAXPATHLEN in files where the header is included.
+
+ The most optimal use case for scripts is utilizing UTF-8 for any I/O
+ related functions. UTF-8 filenames are supported on any system disregarding
+ the system OEM/ANSI codepage.
+
+
========================
2. Build system changes
========================