summaryrefslogtreecommitdiff
path: root/UPGRADING.INTERNALS
blob: 632c87e1c462f69222a16769ef0aa493609e0497 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
$Id$

PHP 7.0 INTERNALS UPGRADE NOTES

1. Internal API changes
  e. New data types
  f. zend_parse_parameters() specs
  g. sprintf() formats
  h. HashTable API
  i. New portable macros for large file support
  j. New portable macros for integers
  k. get_class_entry object handler info
  l. get_class_name object handler info
  m. Other portable macros info
  n. ZEND_ENGINE_2 removal
  o. Updated final class modifier
  p. TSRM changes
  q. gc_collect_cycles() is now hookable
  r. PDO uses size_t for lengths

2. Build system changes
  a. Unix build system changes
  b. Windows build system changes


========================
1. Internal API changes
========================

  e. New data types

     String

        Besides the old way of accepting the strings with 's', the new 'S' ZPP spec
	was introduced. It expects an argument of the type zend_string *. String lengths
	in it are bound to the size_t datatype.

     Integer types

	Integers do no more depend on the firm 'long' type. Instead a platform
	dependent integer type is used, it is called zend_long. That datatype is
	defined dynamically to guarantee the consistent 64 bit support. The zval 
	field representing user land integer it bound to zend_long.

	Signed integer is defined as zend_long, unsigned integer as zend_ulong
	inside Zend. 

     Other datatypes

	zend_off_t  - portable off_t analogue
	zend_stat_t - portable 'struct stat' analogue

	These datatypes are declared to be portable across platforms. Thus, direct
	usage of the functions like fseek, stat, etc. as well as direct usage of
	off_t and struct stat is strongly not recommended. Instead the portable
	macros should be used.

	zend_fseek - portable fseek equivalent
	zend_ftell - portable ftell equivalent
	zend_lseek - portable lseek equivalent
	zend_fstat - portable fstat equivalent
	zend_stat  - portable stat equivalent

  f. zend_parse_parameters() specs

       The new spec 'S' introduced, which expects an argument of type zend_string *.
       The 'l' spec expects a parameter of the type zend_long, not long anymore.
       The 's' spec expects parameters of the type char * and size_t, no int anymore.

  g. sprintf() formats

       New printf modifier 'p' was implemented to platform independently output zend_long,
       zend_ulong and php_size_t datatypes. That modifier can be used with'd', 'u', 'x' and 'o'
       printf format specs with spprintf, snprintf and the wrapping printf implementations.
       %pu is sufficient for both zend_ulong and php_size_t. the code using %p spec to output
       pointer address might need to be enclosed into #ifdef when it unlickily followed by 'd',
       'u', 'x' or 'o'.

       The only exceptions are the snprintf and zend_sprintf functions yet, because in some cases
       they can use the implemenations available on the system, not the PHP one. With snprintf the
       macros ZEND_INT_FMT and ZEND_UINT_FMT should be used.

  h. HashTable API

       Datatype for array indexes was changed to zend_ulong, for string keys to zend_string *.

  i. New portable macros for large file support

      Function(s)       Alias           Comment
      stat, _stat64     zend_stat   for use with zend_stat_t
      fstat, _fstat64   zend_fstat  for use with zend_stat_t
      lseek, _lseeki64  zend_lseek  for use with zend_off_t
      ftell, _ftelli64  zend_ftell  for use with zend_off_t
      fseek, _fseeki64  zend_fseek  for use with zend_off_t

  j. New portable macros for integers

      Function(s)                                         Alias             Comment
      snprintf with "%ld" or "%lld", _ltoa_s, _i64toa_s   ZEND_LTOA         for use with zend_long
      atol, atoll, _atoi64                                ZEND_ATOL         for use with zend_long
      strtol, strtoll, _strtoi64                          ZEND_STRTOL       for use with zend_long
      strtoul, strtoull, _strtoui64                       ZEND_STRTOUL      for use with zend_long
      abs, llabs, _abs64                                  ZEND_ABS          for use with zend_long
      -                                                   ZEND_LONG_MAX     replaces LONG_MAX where appropriate
      -                                                   ZEND_LONG_MIN     replaces LONG_MIN where appropriate
      -                                                   ZEND_ULONG_MAX    replaces ULONG_MAX where appropriate
      -                                                   SIZEOF_ZEND_LONG  reworked SIZEOF_ZEND_LONG representing the size of zend_long datatype
      -                                                   ZEND_SIZE_MAX     Max value of size_t
      -                                                   Z_L               casts an integral constant to zend_long
      -                                                   Z_UL              casts an integral constant to zend_ulong

      The macro ZEND_ENABLE_ZVAL_LONG64 reveals whether zval operates on 64 or 32 bit integer.

  k. The get_class_entry object handler is no longer available. Instead
     zend_object.ce is always used.

  l. The get_class_name object handler is now only used for displaying class
     names in debugging functions like var_dump(). It is no longer used in
     get_class(), get_parent_class() or similar.

     The handler is now obligatory, no longer accepts a `parent` argument and
     must return a non-NULL zend_string*, which will be released by the caller.

  m. Other portable macros info

     ZEND_SECURE_ZERO     - zeroes chunk of memory
     ZEND_VALID_SOCKET    - validates a php_socket_t variable 

     ZEND_FASTCALL is defined to use __vectorcall convention on VS2013 and above
     ZEND_NORETURN is defined as __declspec(noreturn) on VS

  n. The ZEND_ENGINE_2 macro has been removed. A ZEND_ENGINE_3 macro has been added.
  
  o. Removed ZEND_ACC_FINAL_CLASS in favour of ZEND_ACC_FINAL, turning final class
     modifier now a different class entry flag. Update your extensions.

  p. TSRM changes

     The TSRM layer undergone significant changes. It is not necessary anymore to 
     pass the tsrm_ls pointer explicitly. The TSRMLS_* macros should not be used
     in any new developments.

     The recommended way accessing globals in TS builds is by implementing it
     to use a local thread unique pointer to the corresponding data pool. These
     are the new macros that serve to achieve this goal:

     - ZEND_TSRMG - based on TSRMG and if static pointer for data pool access is
       implemented, will use it
     - ZEND_TSRMLS_CACHE_EXTERN - to be used in a header shared across multiple
       C/C++ files
     - ZEND_TSRMLS_CACHE_DEFINE - to be used in the main extension C/C++ file
     - ZEND_TSRMLS_CACHE_UPDATE - to be integrated at the top of the globals
       ctor or RINIT function
     - ZEND_ENABLE_STATIC_TSRMLS_CACHE - to be used in the config.[m4|w32] for
       enabling the globals access per thread unique pointer
     - TSRMLS_CACHE - expands to the variable name which links to the thread
       unique data pool

     Note that usually it will work without implementing the globals access per
     a thread unique pointer, however it might cause a slowdown. The reason is
     that the access model to the extension/SAPI own globals as well as the
     access to the core globals is determined in the compilation phase depending
     on whether ZEND_ENABLE_STATIC_TSRMLS_CACHE macro is defined. 

     Due to the current compiler evolution state, the TSRMLS_CACHE pointer
     (when implemented) has to be present and updated in every binary unit which
     is not compiled statically into PHP. So any DLL, DSO, EXE, etc. has to take
     care about defining and updating it's TSRMLS cache. An update should happen
     before any globals was accessed.

     Porting an extension or SAPI is usually as easy as removing all the TSRMLS_*
     ocurrences and integrating the macros mentioned above. However if tsrm_ls
     is used explicitly, its usage can be considered obsolete in most cases.
     Additionally, if an extension triggers its own threads, TSRMLS_CACHE shouldn't
     be passed to that threads directly.

     When working with CTOR/DTOR for the extension globals, only the data passed
     as parameters should be touched. For example, don't use code like
     free(MYEXT_G(vars)); inside globals destructor, as it possibly can destroy
     the data from a wrong thread. Instead use the pointer to the data delivered
     into the destructor. So the previous example could look like
     free(my_struct->vars); given the income was casted to (struct my_struct *).

  q. gc_collect_cycles() is now a function pointer, and can be replaced in the
     same manner as zend_execute_ex() if needed (for example, to include the
     time spent in the garbage collector in a profiler). The default
     implementation has been renamed to zend_gc_collect_cycles(), and is
     exported with ZEND_API.

  r. In accordance with general use of size_t as string length, all PDO API
     functions now use size_t for string length.

  s. Removed ZEND_REGISTER/FETCH_RESOURCE, use zend_fetch_resource and
     and zend_register_resource instead, zend_fetch_resource accepts a zend_resource *
     as first argument, if you still need to fetch a resource from zval, use 
     zend_fetch_resource_ex. More details can be found in Zend/zend_list.c.

  t. Optimized strings concatenation.
     ZEND_ADD_STRING/VAR/CHAR are replaced with ZEND_ROPE_INTI, ZEND_ROPE_ADD,
     ZEND_ROPE_END.
     Instead of reallocation and copying string on each ZEND_ADD_STRING/VAR/CAHR,
     collect all the strings and then alloca te and construct the resulting string once.


========================
2. Build system changes
========================

  a. Unix build system changes

  b. Windows build system changes

     - Besides Visual Studio, building with Clang or Intel Composer is now
       possible. To enable an alternative toolset, add the option
       --with-toolset=[vs,clang,icc] to the configure line. The default
       toolset is vs. Still clang or icc need the correct environment
       which involves many tools from the vs toolset.

       The toolset option is supported by phpize as well.

       AWARENESS The only recommended and supported toolset to produce production
       ready binaries is Visual Studio. Still other compilers can be used now for
       testing and analyzing purposes.

     - configure.js now produces response files which are passed to the linker
       and library manager. This solves the issues with the long command lines
       which can exceed the OS limit.

     - with clang toolset, an option --with-uncritical-warn-choke is available,
       which suppresses the most frequent false positive warnings

     - the --with-mp option will by default utilize all the available cores. It's
       enabled by default and can be disabled with the special "disable" keyword.

========================
3. Module changes
========================

  Session:

    - PS_MOD_UPDATE_TIMESTAMP() session save handler definition is added. New
      save handler should use PS_MOD_UPDATE_TIMESTAMP() definition. Save handler
      must not refer/change PS() variables directly from save handler. Use
      parameters.
    - PS_MOD()/PS_MOD_SID() macro exist only for transition. Beware these
      save handlers are less secure and slower than PS_MOD_UPDATE_TIMESTAMP().
    - PS(invalid_session_id) was removed. It was never worked as it supposed.
      To report invalid session ID, use PS_VALIDATE_SID() handler.
    - PS_VALIDATE_SID() defines session ID validation handler. This handler
      must validate if requested session ID exists in session data storage or not.
      Do not access PS(id) directly, but use this handler and it's parameter.
    - PS_UPDATE_TIMESTAMP() defines timestamp updating handler. This handler
      must update session data timestamp for GC if it is needed. e.g. Memcache
      updates timestap on read, so it does not need to update timestamp. Return
      SUCCESS simply for this case.
    - PS_CREATE_SID() should check session ID collision. Return NULL for failure.
    - More documentation can be found in ext/session/mod_files.c as comments.
      mod_files.c may be used as reference implementation.