These coding style guidelines are mainly intended for use in rts.
NB These are just suggestions. They're not set in stone. Some of them are probably misguided. If you disagree with them, feel free to modify this document (and make your commit message reasonably informative) or mail someone (eg. The GHC mailing list)
In addition we use ANSI-C-style function declarations and prototypes exclusively. Every function should have a prototype; static function prototypes may be placed near the top of the file in which they are declared, and external prototypes are usually placed in a header file with the same basename as the source file (although there are exceptions to this rule, particularly when several source files together implement a subsystem which is described by a single external header file).
no_return
and
unused
)
POSIX_SOURCE
by default, but found that this caused more
problems than it solved, so now we require any code that is
POSIX-compliant to explicitly say so by having #include
"rts/PosixSource.h"
at the top. Try to do this whenever possible.
/* minimum alignment of unsigned int */ #define ALIGNMENT_UNSIGNED_INT 4 /* minimum alignment of long */ #define ALIGNMENT_LONG 4 /* minimum alignment of float */ #define ALIGNMENT_FLOAT 4 /* minimum alignment of double */ #define ALIGNMENT_DOUBLE 4
sizeof(int) != sizeof(ptr)
on that platform.
Use PK_FLT(addr)
, PK_DBL(addr)
to read
StgFloat and StgDouble values from the stack/heap,
and ASSIGN_FLT(val,addr)
/
ASSIGN_DBL(val,addr)
to assign StgFloat/StgDouble values
to heap/stack locations. These macros take care of alignment
restrictions.
Heap/Stack locations are always StgWord aligned; the
alignment requirements of an StgDouble may be more than that
of StgWord, but we don't pad misaligned StgDoubles
because doing so would be too much hassle (see PK_DBL
&
co above).
#ifdef solaris_HOST_OS // do something solaris specific #endifInstead, add an appropriate test to the configure.ac script and use the result of that test instead.
#if defined(HAVE_BSD_H) // use a BSD library #endif
The problem is that things change from one version of an OS to another - things get added, things get deleted, things get broken, some things are optional extras. Using "feature tests" instead of "system tests" makes things a lot less brittle. Things also tend to get documented better.
We put all our debugging code inside #if defined(DEBUG). The general policy is we don't ship code with debugging checks and assertions in it, but we do run with those checks in place when developing and testing. Anything inside #if defined(DEBUG) should not slow down the code by more than a factor of 2.
We also have more expensive "sanity checking" code for hardcore debugging - this can slow down the code by a large factor, but is only enabled on demand by a command-line flag. General sanity checking in the RTS is currently enabled with the -DS RTS flag.
There are a number of RTS flags which control debugging output and sanity checking in various parts of the system when DEBUG is defined. For example, to get the scheduler to be verbose about what it is doing, you would say +RTS -Ds -RTS. See rts/include/RtsFlags.h and rts/RtsFlags.c for the full set of debugging flags. To check one of these flags in the code, write:
IF_DEBUG(gc, fprintf(stderr, "..."));would check the gc flag before generating the output (and the code is removed altogether if DEBUG is not defined).
All debugging output should go to stderr.
Particular guidelines for writing robust code:
typedef enum { i_INTERNAL_ERROR /* Instruction 0 raises an internal error */ , i_PANIC /* irrefutable pattern match failed! */ , i_ERROR /* user level error */ ...
In particular:
#define add(x,y) ((x)+(y))instead of
#define add(x,y) x+y
// you can use this as an l-value or an l-value #define PROF_INFO(cl) (((StgClosure*)(cl))->header.profInfo) // polymorphic case // but note that min(min(1,2),3) does 3 comparisons instead of 2!! #define min(x,y) (((x)<=(y)) ? (x) : (y))
When a function is both inline and `static', if all calls to the function are integrated into the caller, and the function's address is never used, then the function's own assembler code is never referenced. In this case, GNU CC does not actually output assembler code for the function, unless you specify the option `-fkeep-inline-functions'. Some calls cannot be integrated for various reasons (in particular, calls that precede the function's definition cannot be integrated, and neither can recursive calls within the definition). If there is a nonintegrated call, then the function is compiled to assembler code as usual. The function must also be compiled as usual if the program refers to its address, because that can't be inlined. When an inline function is not `static', then the compiler must assume that there may be calls from other source files; since a global symbol can be defined only once in any program, the function must not be defined in the other source files, so the calls therein cannot be integrated. Therefore, a non-`static' inline function is always compiled on its own in the usual fashion. If you specify both `inline' and `extern' in the function definition, then the definition is used only for inlining. In no case is the function compiled on its own, not even if you refer to its address explicitly. Such an address becomes an external reference, as if you had only declared the function, and had not defined it. This combination of `inline' and `extern' has almost the effect of a macro. The way to use it is to put a function definition in a header file with these keywords, and put another copy of the definition (lacking `inline' and `extern') in a library file. The definition in the header file will cause most calls to the function to be inlined. If any uses of the function remain, they will refer to the single copy in the library.
#define ASSIGN_CC_ID(ccID) \ { \ ccID = CC_ID; \ CC_ID++; \ }(but it's usually better to use an inline function instead - see above).
#define doNothing() do { } while (0)
int* p, q;looks like it declares two pointers but, in fact, only p is a pointer. It's safer to write this:
int* p; int* q;You could also write this:
int *p, *q;but it is preferrable to split the declarations.
Examples:
typedef enum { /* N.B. Used as indexes into arrays */ NO_HEAP_PROFILING, HEAP_BY_CC, HEAP_BY_MOD, HEAP_BY_GRP, HEAP_BY_DESCR, HEAP_BY_TYPE, HEAP_BY_TIME } ProfilingFlags;instead of
# define NO_HEAP_PROFILING 0 /* N.B. Used as indexes into arrays */ # define HEAP_BY_CC 1 # define HEAP_BY_MOD 2 # define HEAP_BY_GRP 3 # define HEAP_BY_DESCR 4 # define HEAP_BY_TYPE 5 # define HEAP_BY_TIME 6and
typedef enum { CCchar = 'C', MODchar = 'M', GRPchar = 'G', DESCRchar = 'D', TYPEchar = 'Y', TIMEchar = 'T' } ProfilingTag;instead of
# define CCchar 'C' # define MODchar 'M' # define GRPchar 'G' # define DESCRchar 'D' # define TYPEchar 'Y' # define TIMEchar 'T'
#if defined(0)
... #endif
rather than /* ... */
because C doesn't
have nested comments.
typedef struct _Foo { ... } Foo;
If you must reindent or reorganise, don't include any functional changes that commit and give advance warning that you're about to do it in case anyone else is changing that file.
-v
and --verbose
options should be
used to generate verbose output (intended for the user).
-d
and --debug
options should be
used to generate debugging output (intended for the developer).
-?
and --help
options should be used
to display usage information on stdout. The program should exit
successfully afterwards.
-V
and --version
options should be
used to output version information on stdout, which includes one line
of the form 'Program version
Major.Minor[.Patchlevel] ...
'. The program
should exit successfully afterwards.