# PHP coding standards

This file lists several standards that any programmer adding or changing code in
PHP should follow. Since this file was added at a very late stage of the
development of PHP v3.0, the code base does not fully follow it, but new
features are going in that general direction. Many sections have been recoded to
use these rules.

## Code implementation

1. Document your code in source files and the manual. (tm)

2. Functions that are given pointers to resources should not free them.

    For instance, `int mail(char *to, char *from)` should NOT free `to` or
    `from`.

    Exceptions:

    * The function's designated behavior is freeing that resource. E.g.
      `efree()`

    * The function is given a boolean argument that controls whether or not the
      function may free its arguments (if true, the function must free its
      arguments; if false, it must not).

    * Low-level parser routines that are tightly integrated with the token
      cache and the bison code for minimum memory copying overhead.

3. Functions that are tightly integrated with other functions within the same
    module, and rely on each other's non-trivial behavior, should be documented
    as such and declared `static`. They should be avoided if possible.

4. Use definitions and macros whenever possible, so that constants have
    meaningful names and can be easily manipulated. The only exceptions to this
    rule are 0 and 1, when used as `false` and `true` (respectively). Any other
    use of a numeric constant to specify different behavior or actions should be
    done through a `#define`.

5. When writing functions that deal with strings, be sure to remember that PHP
    holds the length property of each string, and that it shouldn't be
    calculated with `strlen()`. Write your functions in such a way so that
    they'll take advantage of the length property, both for efficiency and in
    order for them to be binary-safe. Functions that change strings and obtain
    their new lengths while doing so, should return that new length, so it
    doesn't have to be recalculated with `strlen()` (e.g. `php_addslashes()`).

6. NEVER USE `strncat()`. If you're absolutely sure you know what you're doing,
    check its man page again, and only then, consider using it, and even then,
    try avoiding it.

7. Use `PHP_*` macros in the PHP source, and `ZEND_*` macros in the Zend part of
    the source. Although the `PHP_*` macros are mostly aliased to the `ZEND_*`
    macros, using the matching prefix makes it clearer what kind of macro you're
    calling.

8. When commenting out code using a `#if` statement, do NOT use `0` only.
    Instead use `"<git username here>_0"`. For example, `#if FOO_0`, where `FOO`
    is your git user `foo`. This allows easier tracking of why code was
    commented out, especially in bundled libraries.

9. Do not define functions that are not available. For instance, if a library is
    missing a function, do not define the PHP version of the function, and do
    not raise a run-time error about the function not existing. End users should
    use `function_exists()` to test for the existence of a function.

10. Prefer `emalloc()`, `efree()`, `estrdup()`, etc. to their standard C library
    counterparts. These functions implement an internal "safety-net" mechanism
    that ensures the deallocation of any unfreed memory at the end of a request.
    They also provide useful allocation and overflow information while running
    in debug mode.

    In almost all cases, memory returned to the engine must be allocated using
    `emalloc()`.

    The use of `malloc()` should be limited to cases where a third-party library
    may need to control or free the memory, or when the memory in question needs
    to survive between multiple requests.
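
As a rough illustration of rules 2, 4 and 5 above, the following sketch is plain C and deliberately avoids the PHP headers so that it stands on its own. The names `demo_escape_quotes`, `DEMO_ESCAPE_CHAR` and `DEMO_QUOTE_CHAR` are hypothetical and do not exist in the PHP source. The function takes the string length as an argument instead of calling `strlen()`, returns the new length, uses named constants, and does not free the buffers it is given.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
#define DEMO_ESCAPE_CHAR '\\'
#define DEMO_QUOTE_CHAR  '\''

/*
 * Copies src_len bytes from src into dst, writing DEMO_ESCAPE_CHAR before
 * every DEMO_QUOTE_CHAR. The input may contain NUL bytes, because the length
 * is passed in and never computed with strlen().
 *
 * dst must have room for 2 * src_len bytes. The caller keeps ownership of
 * both buffers; nothing is freed here. Returns the number of bytes written.
 */
static size_t demo_escape_quotes(const char *src, size_t src_len, char *dst)
{
    size_t i;
    size_t dst_len = 0;

    for (i = 0; i < src_len; i++) {
        if (src[i] == DEMO_QUOTE_CHAR) {
            dst[dst_len++] = DEMO_ESCAPE_CHAR;
        }
        dst[dst_len++] = src[i];
    }

    return dst_len;
}
```

Because the new length is returned, a caller can pass it straight on to the next length-aware function without rescanning the buffer.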

## User functions/methods naming conventions

1. Function names for user-level functions should be enclosed within the
    `PHP_FUNCTION()` macro. They should be in lowercase, with words underscore
    delimited, with care taken to minimize the letter count. Abbreviations
    should not be used when they greatly decrease the readability of the
    function name itself:

    Good:

    ```php
    str_word_count
    array_key_exists
    ```

    Ok:

    ```php
    date_interval_create_from_date_string
    // Could be 'date_intvl_create_from_date_str'?
    get_html_translation_table
    // Could be 'html_get_trans_table'?
    ```

    Bad:

    ```php
    hw_GetObjectByQueryCollObj
    pg_setclientencoding
    jf_n_s_i
    ```

2. If they are part of a "parent set" of functions, that parent should be
    included in the user function name, and should be clearly related to the
    parent program or function family. This should be in the form of `parent_*`:

    A family of `foo` functions, for example:

    Good:

    ```php
    foo_select_bar
    foo_insert_baz
    foo_delete_baz
    ```

    Bad:

    ```php
    fooselect_bar
    fooinsertbaz
    delete_foo_baz
    ```

3. Names of internal functions used by user functions should be prefixed with
    `_php_`, and followed by a word or an underscore-delimited list of words, in
    lowercase letters, that describes the function. If applicable, they should
    be declared `static`.

4. Variable names must be meaningful. One letter variable names must be avoided,
    except for places where the variable has no real meaning or a trivial
    meaning (e.g. `for (i = 0; i < 100; i++) ...`).

5. Variable names should be in lowercase. Use underscores to separate between
    words.

6. Method names follow the *studlyCaps* (also referred to as *bumpy case* or
    *camel caps*) naming convention, with care taken to minimize the letter
    count. The initial letter of the name is lowercase, and each letter that
    starts a new `word` is capitalized:

    Good:

    ```php
    connect()
    getData()
    buildSomeWidget()
    ```

    Bad:

    ```php
    get_Data()
    buildsomewidget()
    getI()
    ```

7. Class names should be descriptive nouns in *PascalCase* and as short as
    possible. Each word in the class name should start with a capital letter,
    without underscore delimiters. The class name should be prefixed with the
    name of the "parent set" (e.g. the name of the extension) if no namespaces
    are used. Abbreviations and acronyms as well as initialisms should be
    avoided wherever possible, unless they are much more widely used than the
    long form (e.g. HTTP or URL). Abbreviations start with a capital letter
    followed by lowercase letters, whereas acronyms and initialisms are written
    according to their standard notation. Usage of acronyms and initialisms is
    not allowed if they are not widely adopted and recognized as such.

    Good:

    ```php
    Curl
    CurlResponse
    HTTPStatusCode
    URL
    BTreeMap // B-tree Map
    Id // Identifier
    ID // Identity Document
    Char // Character
    Intl // Internationalization
    Radar // Radio Detecting and Ranging
    ```

    Bad:

    ```php
    curl
    curl_response
    HttpStatusCode
    Url
    BtreeMap
    ID // Identifier
    CHAR
    INTL
    RADAR // Radio Detecting and Ranging
    ```
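
The mechanical part of the user function naming rule (lowercase, underscore-delimited words) could be checked with something like the sketch below. This is only an illustration: `demo_is_user_function_name` is a hypothetical helper, not something that exists in PHP, and it cannot judge readability, so a name such as `pg_setclientencoding` would still pass.

```c
#include <stddef.h>

/*
 * Returns 1 if name consists of lowercase words (letters and digits)
 * separated by single underscores and starts with a letter, 0 otherwise.
 * Hypothetical helper, for illustration only.
 */
static int demo_is_user_function_name(const char *name, size_t name_len)
{
    size_t i;

    if (name_len == 0 || name[name_len - 1] == '_') {
        return 0;
    }
    if (name[0] < 'a' || name[0] > 'z') {
        return 0;
    }

    for (i = 1; i < name_len; i++) {
        char c = name[i];

        if (c == '_') {
            /* The last byte is not '_', so name[i + 1] is in bounds. */
            if (name[i + 1] == '_') {
                return 0;
            }
            continue;
        }
        if (!((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))) {
            return 0;
        }
    }

    return 1;
}
```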

## Internal function naming conventions

1. Functions that are part of the external API should be named
    `php_modulename_function()` to avoid symbol collision. They should be in
    lowercase, with words underscore delimited. Exposed API must be defined in
    `php_modulename.h`.

    ```c
    PHPAPI char *php_session_create_id(PS_CREATE_SID_ARGS);
    ```

    Unexposed module functions should be static and should not be defined in
    `php_modulename.h`.

    ```c
    static int php_session_destroy()
    ```

2. The main module source file must be named `modulename.c`.

3. A header file that is used by other source files must be named
    `php_modulename.h`.
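
To make the split between exposed and unexposed functions concrete, here is a minimal sketch for a made-up module called `counter`. Everything in it is an assumption for the sake of the example: the module, its functions and `PHP_COUNTER_MAX` do not exist in PHP, and `PHPAPI` is defined as an empty stand-in so the code compiles without the PHP headers. The two files are shown in one block and separated by comments.

```c
/* ---- php_counter.h (hypothetical): the exposed API ---- */
#ifndef PHP_COUNTER_H
#define PHP_COUNTER_H

/* Stand-in so that this sketch compiles without the PHP headers. */
#ifndef PHPAPI
# define PHPAPI
#endif

#define PHP_COUNTER_MAX 100

PHPAPI int php_counter_add(int current, int delta);

#endif /* PHP_COUNTER_H */

/* ---- counter.c (hypothetical): the main module source file ---- */

/* Unexposed helper: declared static and absent from php_counter.h. */
static int php_counter_clamp(int value)
{
    if (value < 0) {
        return 0;
    }
    if (value > PHP_COUNTER_MAX) {
        return PHP_COUNTER_MAX;
    }

    return value;
}

/* Exposed function: named php_modulename_function() and declared above. */
PHPAPI int php_counter_add(int current, int delta)
{
    return php_counter_clamp(current + delta);
}
```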

## Syntax and indentation

1. Never use C++ style comments (i.e. `//` comment). Always use C-style comments
    instead. PHP is written in C, and is aimed at compiling under any ANSI-C
    compliant compiler. Even though many compilers accept C++-style comments in
    C code, you have to ensure that your code would compile with other compilers
    as well. The only exception to this rule is code that is Win32-specific,
    because the Win32 port is MS-Visual C++ specific, and this compiler is known
    to accept C++-style comments in C code.

2. Use K&R-style. Of course, we can't and don't want to force anybody to use a
    style he or she is not used to, but, at the very least, when you write code
    that goes into the core of PHP or one of its standard modules, please
    maintain the K&R style. This applies to just about everything, starting with
    indentation and comment styles and up to function declaration syntax. Also
    see [Indent style](http://www.catb.org/~esr/jargon/html/I/indent-style.html).

3. Be generous with whitespace and braces. Keep one empty line between the
    variable declaration section and the statements in a block, as well as
    between logical statement groups in a block. Maintain at least one empty
    line between two functions, preferably two. Always prefer:

    ```c
    if (foo) {
        bar;
    }
    ```

    to:

    ```c
    if(foo)bar;
    ```

4. When indenting, use the tab character. A tab is expected to represent four
    spaces. It is important to maintain consistency in indentation so that
    definitions, comments, and control structures line up correctly.

5. Preprocessor statements (`#if` and such) MUST start at column one. To indent
    preprocessor directives you should put the `#` at the beginning of a line,
    followed by any amount of whitespace.
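
Taken together, the rules in this section might look like the sketch below. The names `DEMO_DEBUG`, `DEMO_TRACE_ENABLED` and `demo_count_byte` are hypothetical and serve only as an example. Indentation is shown as four spaces here, as in the other examples in this file; real source files would use tabs.

```c
#include <stddef.h>

/* The '#' stays in column one; nesting is shown after it. */
#ifdef DEMO_DEBUG
# define DEMO_TRACE_ENABLED 1
#else
# define DEMO_TRACE_ENABLED 0
#endif

/* Returns how many of the buf_len bytes in buf are equal to needle. */
static size_t demo_count_byte(const char *buf, size_t buf_len, char needle)
{
    size_t i;
    size_t count = 0;

    for (i = 0; i < buf_len; i++) {
        if (buf[i] == needle) {
            count++;
        }
    }

    return count;
}
```

Note the C-style comments, the K&R brace placement, the empty line between the declarations and the statements, and the braces around the single-statement `if` body.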

## Testing

1. Extensions should be well tested using `*.phpt` tests. For details, see the
    [qa.php.net](https://qa.php.net/write-test.php) documentation.

## Documentation and folding hooks

In order to make sure that the online documentation stays in line with the code,
each user-level function should have its user-level function prototype before it
along with a brief one-line description of what the function does. It would look
like this:

```c
/* {{{ proto int abs(int number)
   Returns the absolute value of the number */
PHP_FUNCTION(abs)
{
    ...
}
/* }}} */
```

The `{{{` symbols are the default folding symbols for the folding mode in Emacs
and vim (`set fdm=marker`). Folding is very useful when dealing with large files
because you can scroll through the file quickly and just unfold the function you
wish to work on. The `}}}` at the end of each function marks the end of the
fold, and should be on a separate line.

The `proto` keyword there is just a helper for the `doc/genfuncsummary` script
which generates a full function summary. Having this keyword in front of the
function prototypes allows us to put folds elsewhere in the code without
messing up the function summary.

Optional arguments are written like this:

```c
/* {{{ proto object imap_header(int stream_id, int msg_no [, int from_length [, int subject_length [, string default_host]]])
   Returns a header object with the defined parameters */
```

And yes, please keep the prototype on a single line, even if that line is
massive.

## New and experimental functions

To reduce the problems normally associated with the first public implementation
of a new set of functions, it has been suggested that the first implementation
include a file labeled `EXPERIMENTAL` in the function directory, and that the
functions follow the standard prefixing conventions during their initial
implementation.

The file labeled `EXPERIMENTAL` should include the following information:

* Any authoring information (known bugs, future directions of the module).
* Ongoing status notes which may not be appropriate for Git comments.

In general, new features should go to PECL or experimental branches until there
are specific reasons for directly adding them to the core distribution.

## Aliases & legacy documentation

You may also have some deprecated aliases with close to duplicate names, for
example, `somedb_select_result` and `somedb_selectresult`. For documentation
purposes, these will only be documented by the most current name, with the
aliases listed in the documentation for the parent function. For ease of
reference, user-functions with completely different names, that alias to the
same function (such as `highlight_file` and `show_source`), will be separately
documented. The proto should still be included, describing which function is
aliased.

Backwards compatible functions and names should be maintained as long as the
code can reasonably be kept as part of the codebase. See the `README` in the
PHP documentation repository for more information on documentation.