@c Gnulib README

@c Copyright 2001, 2003--2023 Free Software Foundation, Inc.

@c Permission is granted to copy, distribute and/or modify this document
@c under the terms of the GNU Free Documentation License, Version 1.3 or
@c any later version published by the Free Software Foundation; with no
@c Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.  A
@c copy of the license is at <https://www.gnu.org/licenses/fdl-1.3.en.html>.

@menu
* Gnulib Basics::
* Git Checkout::
* Keeping Up-to-date::
* Contributing to Gnulib::
* Portability guidelines::
* High Quality::
* Joining GNU::
@end menu

@node Gnulib Basics
@section Gnulib Basics

While portability across operating systems is not one of GNU's primary
goals, it has helped introduce many people to the GNU system, and is
worthwhile when it can be achieved at a low cost.  This collection helps
lower that cost.

Gnulib is intended to be the canonical source for most of the important
``portability'' and/or common files for GNU projects.  These are files
intended to be shared at the source level; Gnulib is not a typical
library meant to be installed and linked against.  Thus, unlike most
projects, Gnulib does not normally generate a source tarball
distribution; instead, developers grab modules directly from the
source repository.

The easiest, and recommended, way to do this is to use the
@command{gnulib-tool} script.  Since there is no installation
procedure for Gnulib, @command{gnulib-tool} needs to be run directly
in the directory that contains the Gnulib source code.  You can do
this either by specifying the absolute filename of
@command{gnulib-tool}, or by using a symbolic link from a place inside
your @env{PATH} to the @command{gnulib-tool} file of your preferred
Gnulib checkout.  For example:

@example
$ ln -s $HOME/gnu/src/gnulib.git/gnulib-tool $HOME/bin/gnulib-tool
@end example

@node Git Checkout
@section Git Checkout

Gnulib is available for anonymous checkout.  In any Bourne-compatible
shell, the following should work:

@example
$ git clone https://git.savannah.gnu.org/git/gnulib.git
@end example

For a read-write checkout you need to have a login on
@samp{savannah.gnu.org} and be a member of the Gnulib project at
@url{https://savannah.gnu.org/projects/gnulib}.  Then, instead of the
URL @url{https://git.savannah.gnu.org/git/gnulib.git}, use the URL
@samp{ssh://@var{user}@@git.savannah.gnu.org/srv/git/gnulib} where
@var{user} is your login name on savannah.gnu.org.

Git resources:

@table @asis
@item Overview:
@url{https://en.wikipedia.org/wiki/Git_(software)}
@item Homepage:
@url{https://git-scm.com/}
@end table

When you use @code{git annotate} or @code{git blame} with Gnulib, it's
recommended that you use the @option{-w} option, in order to ignore
massive whitespace changes that happened in 2009.

@node Keeping Up-to-date
@section Keeping Up-to-date

The best way to work with Gnulib is to check it out of git.
To synchronize, you can use @code{git pull}.

Subscribing to the @email{bug-gnulib@@gnu.org} mailing list will help
you to plan when to update your local copy of Gnulib (which you use to
maintain your software) from git.  You can review the archives,
subscribe, etc., via
@url{https://lists.gnu.org/mailman/listinfo/bug-gnulib}.

Sometimes, using an updated version of Gnulib will require you to use
newer versions of GNU Automake or Autoconf.  You may find it helpful
to join the autotools-announce mailing list to be advised of such
changes.

@node Contributing to Gnulib
@section Contributing to Gnulib

All software here is copyrighted by the Free Software Foundation.  For
a contribution to be accepted here, you need to have filled out an
assignment form for a project that uses the module.

If you have a piece of code that you would like to contribute, please
email @email{bug-gnulib@@gnu.org}.

Generally we are looking for files that fulfill at least one of the
following requirements:

@itemize
@item
If your @file{.c} and @file{.h} files define functions that are broken or
missing on some other system, we should be able to include them.

@item
If your functions remove arbitrary limits from existing
functions (either under the same name, or under a slightly different
name), we should be able to include them.
@end itemize

If your functions define completely new but rarely used functionality,
you should probably consider packaging it as a separate library.

@menu
* Gnulib licensing::
* Indent with spaces not TABs::
* How to add a new module::
@end menu

@node Gnulib licensing
@subsection Gnulib licensing

Gnulib contains code both under GPL and LGPL@.  Because several packages
that use Gnulib are GPL, the files state they are licensed under GPL@.
However, to support LGPL projects as well, you may use some of the
files under LGPL@.  The ``License:'' field in the files under
@file{modules/} states the actual license that applies to the module source.

Keep in mind that if you submit patches to files in Gnulib, you should
license them under a compatible license, which means that sometimes
the contribution will have to be LGPL, if the original file is
available under LGPL via a ``License: LGPL'' field in the
corresponding @file{modules/} file.

@node Indent with spaces not TABs
@subsection Indent with spaces not TABs

We use space-only indentation in nearly all files. This includes all
@file{*.h}, @file{*.c}, @file{*.y} files, except for the @code{regex}
module. Makefile and ChangeLog files are excluded, since TAB
characters are part of their format.

In order to tell your editor to produce space-only indentation, you
can use these instructions.

@itemize
@item
For Emacs: Add these lines to your Emacs initialization file
(@file{$HOME/.emacs} or similar):

@example
;; In Gnulib, indent with spaces everywhere (not TABs).
;; Exceptions: Makefile and ChangeLog modes.
(add-hook 'find-file-hook (lambda ()
  (if (and buffer-file-name
           (string-match "/gnulib\\>" (buffer-file-name))
           (not (string-equal mode-name "Change Log"))
           (not (string-equal mode-name "Makefile")))
      (setq indent-tabs-mode nil))))
@end example

@item
For vi (vim): Add these lines to your @file{$HOME/.vimrc} file:

@example
" Don't use tabs for indentation. Spaces are nicer to work with.
set expandtab
@end example

For Makefile and ChangeLog files, compensate for this by adding this
to your @file{$HOME/.vim/after/indent/make.vim} file, and similarly
for your @file{$HOME/.vim/after/indent/changelog.vim} file:

@example
" Use tabs for indentation, regardless of the global setting.
set noexpandtab
@end example

@item
For Eclipse: In the ``Window|Preferences'' dialog (or ``Eclipse|Preferences''
dialog on macOS),

@enumerate
@item
Under ``General|Editors|Text Editors'', select the ``Insert spaces for tabs''
checkbox.

@item
Under ``C/C++|Code Style'', select a code style profile that has the
``Indentation|Tab policy'' combobox set to ``Spaces only'', such as the
``GNU [built-in]'' policy.
@end enumerate

@item
For the GNU @command{indent} program: Pass it the option @option{--no-tabs}.
@end itemize

@node How to add a new module
@subsection How to add a new module

@itemize
@item
Add the header files and source files to @file{lib/}.

@item
If the module needs configure-time checks, write an Autoconf
macro for it in @file{m4/@var{module}.m4}. See @file{m4/README} for details.

@item
Write a module description @file{modules/@var{module}}, based on
@file{modules/TEMPLATE}.

@item
If the module contributes a section to the end-user documentation,
put this documentation in @file{doc/@var{module}.texi} and add it to the ``Files''
section of @file{modules/@var{module}}.  Most modules don't do this; they have only
documentation for the programmer (= Gnulib user).  Such documentation
usually goes into the @file{lib/} source files.  It may also go into @file{doc/};
but don't add it to the module description in this case.

@item
Add the module to the list in @file{MODULES.html.sh}.
@end itemize

@noindent
You can test that a module builds correctly with:

@example
$ ./gnulib-tool --create-testdir --dir=/tmp/testdir module1 ... moduleN
$ cd /tmp/testdir
$ ./configure && make
@end example

@noindent
Other things:

@itemize
@item
Check the license and copyright year of headers.

@item
Check that the source code follows the GNU coding standards;
see @url{https://www.gnu.org/prep/standards}.

@item
Add source files to @file{config/srclist*} if they are identical to upstream
and should be upgraded in Gnulib whenever the upstream source changes.

@item
Include header files in source files to verify the function prototypes.

@item
Make sure a replacement function doesn't cause warnings or clashes on
systems that have the function.

@item
Autoconf macros can use the @samp{gl_*} prefix.  The @samp{AC_*} prefix is
reserved for macros defined by Autoconf itself.

@item
Build files only if they are needed on a platform.  Look at the
@code{alloca} and @code{fnmatch} modules for how to achieve this.  If
for some reason you cannot do this, and you have a @file{.c} file that
leads to an empty @file{.o} file on some platforms (through some big
@code{#if} around all the code), then ensure that the compilation unit
is not empty after preprocessing.  One way to do this is to
@code{#include <stddef.h>} or @code{<stdio.h>} before the big
@code{#if}.
@end itemize

@node Portability guidelines
@section Portability guidelines

Gnulib code is intended to be portable to a wide variety of platforms,
not just GNU platforms.  Gnulib typically attempts to support a
platform as long as it is still supported by its provider, even if the
platform is not the latest version.  @xref{Target Platforms}.

Many Gnulib modules exist so that applications need not worry about
undesirable variability in implementations.  For example, an
application that uses the @code{malloc} module need not worry about
@code{malloc@ (0)} returning @code{NULL} on some Standard C
platforms; and @code{glob} users need not worry about @code{glob}
silently omitting symbolic links to nonexistent files on some
platforms that do not conform to POSIX.

Gnulib code is intended to port without problem to new hosts, e.g.,
hosts conforming to recent C and POSIX standards.  Hence Gnulib code
should avoid using constructs that these newer standards no longer
require, without first testing for the presence of these constructs.
For example, because C11 made variable length arrays optional, Gnulib
code should avoid them unless it first uses the @code{vararrays}
module to check whether they are supported.

The following subsections discuss some exceptions and caveats to the
general Gnulib portability guidelines.

@menu
* C language versions::
* C99 features assumed::
* C99 features avoided::
* Other portability assumptions::
@end menu

@node C language versions
@subsection C language versions

Currently Gnulib assumes at least a freestanding C99 compiler,
possibly operating with a C library that predates C99; with time this
assumption will likely be strengthened to later versions of the C
standard.  Old platforms currently supported include AIX 6.1, HP-UX
11i v1 and Solaris 10, though these platforms are rarely tested.
Gnulib itself is so old that it contains many fixes for obsolete
platforms, fixes that may be removed in the future.

Because of the freestanding C99 assumption, Gnulib code can include
@code{<float.h>}, @code{<limits.h>}, @code{<stdarg.h>},
@code{<stddef.h>}, and @code{<stdint.h>}
unconditionally; @code{<stdbool.h>} is also in the C99 freestanding
list but is obsolescent as of C23.  Gnulib code can also assume the existence
of @code{<ctype.h>}, @code{<errno.h>}, @code{<fcntl.h>},
@code{<locale.h>}, @code{<signal.h>}, @code{<stdio.h>},
@code{<stdlib.h>}, @code{<string.h>}, and @code{<time.h>}.  Similarly,
many modules include @code{<sys/types.h>} even though it's not even in
C11; that's OK since @code{<sys/types.h>} has been around nearly
forever.

Even if the include files exist, they may not conform to the C standard.
However, GCC has a @command{fixincludes} script that attempts to fix most
C89-conformance problems.  Gnulib currently assumes include files
largely conform to C89 or better.  People still using ancient hosts
should use fixincludes or fix their include files manually.

Even if the include files conform, the library itself may not.
For example, @code{strtod} and @code{mktime} have some bugs on some platforms.
You can work around some of these problems by requiring the relevant
modules, e.g., the Gnulib @code{mktime} module supplies a working and
conforming @code{mktime}.

@node C99 features assumed
@subsection C99 features assumed by Gnulib

Although the C99 standard specifies many features, Gnulib code
is conservative about using them, partly because Gnulib predates
the widespread adoption of C99, and partly because many C99
features are not well-supported in practice.  C99 features that
are reasonably portable nowadays include:

@itemize
@item
A declaration after a statement, or as the first clause in a
@code{for} statement.

@item
@code{long long int}.

@item
@code{<stdbool.h>}, although Gnulib code no longer uses
it directly, preferring plain @code{bool} via the
@code{stdbool} module instead.
@xref{stdbool.h}.

@item
@code{<stdint.h>}, assuming the @code{stdint} module is used.
@xref{stdint.h}.

@item
Compound literals and designated initializers.

@item
Variadic macros.

@item
@code{static inline} functions.

@item
@code{__func__}, assuming the @code{func} module is used.  @xref{func}.

@item
The @code{restrict} qualifier, assuming
@code{AC_REQUIRE([AC_C_RESTRICT])} is used.
This qualifier is sometimes implemented via a macro, so C++ code that
uses Gnulib should avoid using @code{restrict} as an identifier.

@item
Flexible array members (however, see the @code{flexmember} module).
@end itemize

@node C99 features avoided
@subsection C99 features avoided by Gnulib

Gnulib avoids some features even though they are standardized by C99,
as they have portability problems in practice.  Here is a partial list
of avoided C99 features.  Many other C99 features are portable only if
their corresponding modules are used; Gnulib code that uses such a
feature should require the corresponding module.

@itemize
@item
Variable length arrays (VLAs) or variably modified types,
without checking whether @code{__STDC_NO_VLA__} is defined.
See the @code{vararrays} and @code{vla} modules.

@item
Block-scope variable length arrays, without checking whether either
@code{GNULIB_NO_VLA} or @code{__STDC_NO_VLA__} is defined.
This lets you define @code{GNULIB_NO_VLA} to pacify GCC when
using its @option{-Wvla-larger-than} warnings option,
and to avoid large stack usage that may have security implications.
@code{GNULIB_NO_VLA} does not affect Gnulib's other uses of VLAs and
variably modified types, such as array declarations in function
prototype scope.

@item
@code{extern inline} functions, without checking whether they are
supported.  @xref{extern inline}.

@item
Type-generic math functions.

@item
Universal character names in source code.

@item
@code{<iso646.h>}, since GNU programs need not worry about deficient
source-code encodings.

@item
Comments beginning with @samp{//}.  This is mostly for style reasons.
@end itemize

@node Other portability assumptions
@subsection Other portability assumptions made by Gnulib

The GNU coding standards allow one departure from strict C: Gnulib
code can assume that standard internal types like
@code{ptrdiff_t} and @code{size_t} are no
wider than @code{long}.  POSIX requires implementations to support at
least one programming environment where this is true, and such
environments are recommended for Gnulib-using applications.  When it
is easy to port to non-POSIX platforms like MinGW where these types
are wider than @code{long}, new Gnulib code should do so, e.g., by
using @code{ptrdiff_t} instead of @code{long}.  However, it is not
always that easy, and no effort has been made to check that all Gnulib
modules work on MinGW-like environments.

Gnulib code makes the following additional assumptions:

@itemize
@item
@code{int} and @code{unsigned int} are at least 32 bits wide.  POSIX
and the GNU coding standards both require this.

@item
Signed integer arithmetic is two's complement.

Previously, Gnulib code sometimes also assumed that signed integer
arithmetic wraps around, but modern compiler optimizations
sometimes do not guarantee this, and Gnulib code with this
assumption is now considered to be questionable.
@xref{Integer Properties}.

Although some Gnulib modules contain explicit support for the other signed
integer representations allowed by the C standard (ones' complement and signed
magnitude), these modules are the exception rather than the rule.
All practical Gnulib targets use two's complement.

@item
There are no ``holes'' in integer values: all the bits of an integer
contribute to its value in the usual way.
In particular, an unsigned type and its signed counterpart have the
same number of bits when you count the latter's sign bit.

@item
Objects with all bits zero are treated as 0 or NULL@.  For example,
@code{memset@ (A, 0, sizeof@ A)} initializes an array @code{A} of
pointers to NULL.

@item
The types @code{intptr_t} and @code{uintptr_t} exist, and pointers
can be converted to and from these types without loss of information.

@item
Addresses and sizes behave as if objects reside in a flat address space.
In particular:

@itemize
@item
If two nonoverlapping objects have sizes @var{S} and @var{T} represented as
@code{ptrdiff_t} or @code{size_t} values, then @code{@var{S} + @var{T}}
cannot overflow.

@item
A pointer @var{P} points within an object @var{O} if and only if
@code{(char *) &@var{O} <= (char *) @var{P} && (char *) @var{P} <
(char *) (&@var{O} + 1)}.

@item
Arithmetic on a valid pointer is equivalent to the same arithmetic on
the pointer converted to @code{uintptr_t}, except that offsets are
multiplied by the size of the pointed-to objects.
For example, if @code{P + I} is a valid expression involving a pointer
@var{P} and an integer @var{I}, then @code{(uintptr_t) (P + I) ==
(uintptr_t) ((uintptr_t) P + I * sizeof *P)}.
Similar arithmetic can be done with @code{intptr_t}, although more
care must be taken in case of integer overflow or negative integers.

@item
A pointer @code{P} has alignment @code{A} if and only if
@code{(uintptr_t) P % A} is zero, and similarly for @code{intptr_t}.

@item
If an existing object has size @var{S}, and if @var{T} is sufficiently
small (e.g., 8 KiB), then @code{@var{S} + @var{T}} cannot overflow.
Overflow in this case would mean that the rest of your program fits
into @var{T} bytes, which can't happen in realistic flat-address-space
hosts.

@item
Adding zero to a null pointer does not change the pointer.
For example, @code{0 + (char *) NULL == (char *) NULL}.
@end itemize
@end itemize

Some platforms violate these assumptions and are therefore not
Gnulib porting targets.  @xref{Unsupported Platforms}.

@node High Quality
@section High Quality

We develop and maintain a test suite for Gnulib.  The goal is to have a
100% firm interface so that maintainers can feel free to update to the
code in git at @emph{any} time and know that their application will not
break.  This means that before any change can be committed to the
repository, a test program must be produced that exposes the bug being
fixed, for regression testing.

@menu
* Stable Branches::
* Writing reliable code::
@end menu

@node Stable Branches
@subsection Stable Branches

In Gnulib, we don't use topic branches for experimental work.
Therefore, occasionally a broken commit may be pushed to Gnulib.
It does not happen often, but it does happen.

To compensate for this, Gnulib offers ``stable branches''.  These
are branches of the Gnulib code that are maintained over some
longer period (a year, for example) and include:
@itemize
@item
bug fixes,
@item
portability enhancements (to existing as well as to new platforms),
@item
updates to @code{config.guess} and @code{config.sub}.
@end itemize

@noindent
Not included in the stable branches are:
@itemize
@item
new features, such as new modules,
@item
optimizations,
@item
refactorings,
@item
complex or risky changes in general,
@item
updates to @code{texinfo.tex},
@item
documentation updates.
@end itemize

So far, we have three stable branches:
@table @code
@item stable-202301
A stable branch that starts at the beginning of January 2023.
@item stable-202207
A stable branch that starts at the beginning of July 2022.
@item stable-202201
A stable branch that starts at the beginning of January 2022.
It is no longer updated.
@end table

The two use-cases of stable branches are thus:
@itemize
@item
You want to protect yourself from occasional breakage in Gnulib.
@item
When making a bug-fix release of your code, you can incorporate
bug fixes in Gnulib, by pulling in the newest commits from the
same stable branch that you were already using for the previous
release.
@end itemize

@node Writing reliable code
@subsection Writing reliable code

When compiling and testing Gnulib and Gnulib-using programs, certain
compiler options can help improve reliability.  First of all, make it
a habit to use @samp{-Wall} in all compilation commands. Beyond that,
the @code{manywarnings} module enables several forms of static checking in
GCC and related compilers (@pxref{manywarnings}).

For dynamic checking, you can run @code{configure} with @code{CFLAGS}
options appropriate for your compiler.  For example:

@example
./configure \
 CPPFLAGS='-Wall'\
 CFLAGS='-g3 -O2'\
' -D_FORTIFY_SOURCE=2'\
' -fsanitize=undefined'\
' -fsanitize-undefined-trap-on-error'
@end example

@noindent
Here:

@itemize @bullet
@item
@code{-D_FORTIFY_SOURCE=2} enables extra security hardening checks in
the GNU C library.
@item
@code{-fsanitize=undefined} enables GCC's undefined behavior sanitizer
(@code{ubsan}), and
@item
@code{-fsanitize-undefined-trap-on-error} causes @code{ubsan} to
abort the program (through an ``illegal instruction'' signal).  This
measure stops exploit attempts and also allows you to debug the issue.
@end itemize

Without the @code{-fsanitize-undefined-trap-on-error} option,
@code{-fsanitize=undefined} causes messages to be printed, and
execution continues after an undefined behavior situation.
The message printing causes GCC-like compilers to arrange for the
program to dynamically link to libraries it might not otherwise need.
With GCC, instead of @code{-fsanitize-undefined-trap-on-error} you can
use the @code{-static-libubsan} option to arrange for two of the extra
libraries (@code{libstdc++} and @code{libubsan}) to be linked
statically rather than dynamically, though this typically bloats the
executable and the remaining extra libraries are still linked
dynamically.

It is also good to occasionally run the programs under @code{valgrind}
(@pxref{Running self-tests under valgrind}).

@include join-gnu.texi