Diffstat (limited to 'gdb/doc/stabs.info-1')
-rw-r--r-- | gdb/doc/stabs.info-1 | 1166 |
1 files changed, 1166 insertions, 0 deletions
diff --git a/gdb/doc/stabs.info-1 b/gdb/doc/stabs.info-1 new file mode 100644 index 00000000000..4c36dd8cb94 --- /dev/null +++ b/gdb/doc/stabs.info-1 @@ -0,0 +1,1166 @@ +This is Info file stabs.info, produced by Makeinfo version 1.68 from +the input file ./stabs.texinfo. + +START-INFO-DIR-ENTRY +* Stabs: (stabs). The "stabs" debugging information format. +END-INFO-DIR-ENTRY + + This document describes the stabs debugging symbol tables. + + Copyright 1992, 93, 94, 95, 97, 1998 Free Software Foundation, Inc. +Contributed by Cygnus Support. Written by Julia Menapace, Jim Kingdon, +and David MacKenzie. + + Permission is granted to make and distribute verbatim copies of this +manual provided the copyright notice and this permission notice are +preserved on all copies. + + Permission is granted to copy or distribute modified versions of this +manual under the terms of the GPL (for which purpose this text may be +regarded as a program in the language TeX). + + +File: stabs.info, Node: Top, Next: Overview, Up: (dir) + +The "stabs" representation of debugging information +*************************************************** + + This document describes the stabs debugging format. + +* Menu: + +* Overview:: Overview of stabs +* Program Structure:: Encoding of the structure of the program +* Constants:: Constants +* Variables:: +* Types:: Type definitions +* Symbol Tables:: Symbol information in symbol tables +* Cplusplus:: Stabs specific to C++ +* Stab Types:: Symbol types in a.out files +* Symbol Descriptors:: Table of symbol descriptors +* Type Descriptors:: Table of type descriptors +* Expanded Reference:: Reference information by stab type +* Questions:: Questions and anomalies +* Stab Sections:: In some object file formats, stabs are + in sections. +* Symbol Types Index:: Index of symbolic stab symbol type names. 
+ + +File: stabs.info, Node: Overview, Next: Program Structure, Prev: Top, Up: Top + +Overview of Stabs +***************** + + "Stabs" refers to a format for information that describes a program +to a debugger. This format was apparently invented by Peter Kessler at +the University of California at Berkeley, for the `pdx' Pascal +debugger; the format has spread widely since then. + + This document is one of the few published sources of documentation on +stabs. It is believed to be comprehensive for stabs used by C. The +lists of symbol descriptors (*note Symbol Descriptors::.) and type +descriptors (*note Type Descriptors::.) are believed to be completely +comprehensive. Stabs for COBOL-specific features and for variant +records (used by Pascal and Modula-2) are poorly documented here. + + Other sources of information on stabs are `Dbx and Dbxtool +Interfaces', 2nd edition, by Sun, 1988, and `AIX Version 3.2 Files +Reference', Fourth Edition, September 1992, "dbx Stabstring Grammar" in +the a.out section, page 2-31. This document is believed to incorporate +the information from those two sources except where it explicitly +directs you to them for more information. + +* Menu: + +* Flow:: Overview of debugging information flow +* Stabs Format:: Overview of stab format +* String Field:: The string field +* C Example:: A simple example in C source +* Assembly Code:: The simple example at the assembly level + + +File: stabs.info, Node: Flow, Next: Stabs Format, Up: Overview + +Overview of Debugging Information Flow +====================================== + + The GNU C compiler compiles C source in a `.c' file into assembly +language in a `.s' file, which the assembler translates into a `.o' +file, which the linker combines with other `.o' files and libraries to +produce an executable file. 
+ + With the `-g' option, GCC puts in the `.s' file additional debugging +information, which is slightly transformed by the assembler and linker, +and carried through into the final executable. This debugging +information describes features of the source file like line numbers, +the types and scopes of variables, and function names, parameters, and +scopes. + + For some object file formats, the debugging information is +encapsulated in assembler directives known collectively as "stab" +(symbol table) directives, which are interspersed with the generated +code. Stabs are the native format for debugging information in the +a.out and XCOFF object file formats. The GNU tools can also emit stabs +in the COFF and ECOFF object file formats. + + The assembler adds the information from stabs to the symbol +information it places by default in the symbol table and the string +table of the `.o' file it is building. The linker consolidates the `.o' +files into one executable file, with one symbol table and one string +table. Debuggers use the symbol and string tables in the executable as +a source of debugging information about the program. + + +File: stabs.info, Node: Stabs Format, Next: String Field, Prev: Flow, Up: Overview + +Overview of Stab Format +======================= + + There are three overall formats for stab assembler directives, +differentiated by the first word of the stab. The name of the directive +describes which combination of four possible data fields follows. It is +either `.stabs' (string), `.stabn' (number), or `.stabd' (dot). IBM's +XCOFF assembler uses `.stabx' (and some other directives such as +`.file' and `.bi') instead of `.stabs', `.stabn' or `.stabd'. + + The overall format of each class of stab is: + + .stabs "STRING",TYPE,OTHER,DESC,VALUE + .stabn TYPE,OTHER,DESC,VALUE + .stabd TYPE,OTHER,DESC + .stabx "STRING",VALUE,TYPE,SDB-TYPE + + For `.stabn' and `.stabd', there is no STRING (the `n_strx' field is +zero; see *Note Symbol Tables::). 
For `.stabd', the VALUE field is +implicit and has the value of the current file location. For `.stabx', +the SDB-TYPE field is unused for stabs and can always be set to zero. +The OTHER field is almost always unused and can be set to zero. + + The number in the TYPE field gives some basic information about +which type of stab this is (or whether it *is* a stab, as opposed to an +ordinary symbol). Each valid type number defines a different stab +type; further, the stab type defines the exact interpretation of, and +possible values for, any remaining STRING, DESC, or VALUE fields +present in the stab. *Note Stab Types::, for a list in numeric order +of the valid TYPE field values for stab directives. + + +File: stabs.info, Node: String Field, Next: C Example, Prev: Stabs Format, Up: Overview + +The String Field +================ + + For most stabs the string field holds the meat of the debugging +information. The flexible nature of this field is what makes stabs +extensible. For some stab types the string field contains only a name. +For other stab types the contents can be a great deal more complex. + + The overall format of the string field for most stab types is: + + "NAME:SYMBOL-DESCRIPTOR TYPE-INFORMATION" + + NAME is the name of the symbol represented by the stab; it can +contain a pair of colons (*note Nested Symbols::.). NAME can be +omitted, which means the stab represents an unnamed object. For +example, `:t10=*2' defines type 10 as a pointer to type 2, but does not +give the type a name. Omitting the NAME field is supported by AIX dbx +and GDB after about version 4.8, but not other debuggers. GCC +sometimes uses a single space as the name instead of omitting the name +altogether; apparently that is supported by most debuggers. + + The SYMBOL-DESCRIPTOR following the `:' is an alphabetic character +that tells more specifically what kind of symbol the stab represents. 
+If the SYMBOL-DESCRIPTOR is omitted, but type information follows, then +the stab represents a local variable. For a list of symbol +descriptors, see *Note Symbol Descriptors::. The `c' symbol descriptor +is an exception in that it is not followed by type information. *Note +Constants::. + + TYPE-INFORMATION is either a TYPE-NUMBER, or `TYPE-NUMBER='. A +TYPE-NUMBER alone is a type reference, referring directly to a type +that has already been defined. + + The `TYPE-NUMBER=' form is a type definition, where the number +represents a new type which is about to be defined. The type +definition may refer to other types by number, and those type numbers +may be followed by `=' and nested definitions. Also, the Lucid +compiler will repeat `TYPE-NUMBER=' more than once if it wants to +define several type numbers at once. + + In a type definition, if the character that follows the equals sign +is non-numeric then it is a TYPE-DESCRIPTOR, and tells what kind of +type is about to be defined. Any other values following the +TYPE-DESCRIPTOR vary, depending on the TYPE-DESCRIPTOR. *Note Type +Descriptors::, for a list of TYPE-DESCRIPTOR values. If a number +follows the `=' then the number is a TYPE-REFERENCE. For a full +description of types, *Note Types::. + + A TYPE-NUMBER is often a single number. The GNU and Sun tools +additionally permit a TYPE-NUMBER to be a pair +(FILE-NUMBER,FILETYPE-NUMBER) (the parentheses appear in the string, +and serve to distinguish the two cases). The FILE-NUMBER is a number +starting with 1 which is incremented for each separate source file in +the compilation (e.g., in C, each header file gets a different number). +The FILETYPE-NUMBER is a number starting with 1 which is incremented +for each new type defined in the file. (Separating the file number and +the type number permits the `N_BINCL' optimization to succeed more +often; see *Note Include Files::). + + There is an AIX extension for type attributes. 
Following the `=' +are any number of type attributes. Each one starts with `@' and ends +with `;'. Debuggers, including AIX's dbx and GDB 4.10, skip any type +attributes they do not recognize. GDB 4.9 and other versions of dbx +may not do this. Because of a conflict with C++ (*note Cplusplus::.), +new attributes should not be defined which begin with a digit, `(', or +`-'; GDB may be unable to distinguish those from the C++ type +descriptor `@'. The attributes are: + +`aBOUNDARY' + BOUNDARY is an integer specifying the alignment. I assume it + applies to all variables of this type. + +`pINTEGER' + Pointer class (for checking). Not sure what this means, or how + INTEGER is interpreted. + +`P' + Indicate this is a packed type, meaning that structure fields or + array elements are placed more closely in memory, to save memory + at the expense of speed. + +`sSIZE' + Size in bits of a variable of this type. This is fully supported + by GDB 4.11 and later. + +`S' + Indicate that this type is a string instead of an array of + characters, or a bitstring instead of a set. It doesn't change + the layout of the data being represented, but does enable the + debugger to know which type it is. + + All of this can make the string field quite long. All versions of +GDB, and some versions of dbx, can handle arbitrarily long strings. +But many versions of dbx (or assemblers or linkers, I'm not sure which) +cretinously limit the strings to about 80 characters, so compilers which +must work with such systems need to split the `.stabs' directive into +several `.stabs' directives. Each stab duplicates every field except +the string field. The string field of every stab except the last is +marked as continued with a backslash at the end (in the assembly code +this may be written as a double backslash, depending on the assembler). +Removing the backslashes and concatenating the string fields of each +stab produces the original, long string. 
Just to be incompatible (or so +they don't have to worry about what the assembler does with +backslashes), AIX can use `?' instead of backslash. + + +File: stabs.info, Node: C Example, Next: Assembly Code, Prev: String Field, Up: Overview + +A Simple Example in C Source +============================ + + To get the flavor of how stabs describe source information for a C +program, let's look at the simple program: + + main() + { + printf("Hello world"); + } + + When compiled with `-g', the program above yields the following `.s' +file. Line numbers have been added to make it easier to refer to parts +of the `.s' file in the description of the stabs that follows. + + +File: stabs.info, Node: Assembly Code, Prev: C Example, Up: Overview + +The Simple Example at the Assembly Level +======================================== + + This simple "hello world" example demonstrates several of the stab +types used to describe C language source files. + + 1 gcc2_compiled.: + 2 .stabs "/cygint/s1/users/jcm/play/",100,0,0,Ltext0 + 3 .stabs "hello.c",100,0,0,Ltext0 + 4 .text + 5 Ltext0: + 6 .stabs "int:t1=r1;-2147483648;2147483647;",128,0,0,0 + 7 .stabs "char:t2=r2;0;127;",128,0,0,0 + 8 .stabs "long int:t3=r1;-2147483648;2147483647;",128,0,0,0 + 9 .stabs "unsigned int:t4=r1;0;-1;",128,0,0,0 + 10 .stabs "long unsigned int:t5=r1;0;-1;",128,0,0,0 + 11 .stabs "short int:t6=r1;-32768;32767;",128,0,0,0 + 12 .stabs "long long int:t7=r1;0;-1;",128,0,0,0 + 13 .stabs "short unsigned int:t8=r1;0;65535;",128,0,0,0 + 14 .stabs "long long unsigned int:t9=r1;0;-1;",128,0,0,0 + 15 .stabs "signed char:t10=r1;-128;127;",128,0,0,0 + 16 .stabs "unsigned char:t11=r1;0;255;",128,0,0,0 + 17 .stabs "float:t12=r1;4;0;",128,0,0,0 + 18 .stabs "double:t13=r1;8;0;",128,0,0,0 + 19 .stabs "long double:t14=r1;8;0;",128,0,0,0 + 20 .stabs "void:t15=15",128,0,0,0 + 21 .align 4 + 22 LC0: + 23 .ascii "Hello, world!\12\0" + 24 .align 4 + 25 .global _main + 26 .proc 1 + 27 _main: + 28 .stabn 68,0,4,LM1 + 29 LM1: + 30 
!#PROLOGUE# 0 + 31 save %sp,-136,%sp + 32 !#PROLOGUE# 1 + 33 call ___main,0 + 34 nop + 35 .stabn 68,0,5,LM2 + 36 LM2: + 37 LBB2: + 38 sethi %hi(LC0),%o1 + 39 or %o1,%lo(LC0),%o0 + 40 call _printf,0 + 41 nop + 42 .stabn 68,0,6,LM3 + 43 LM3: + 44 LBE2: + 45 .stabn 68,0,6,LM4 + 46 LM4: + 47 L1: + 48 ret + 49 restore + 50 .stabs "main:F1",36,0,0,_main + 51 .stabn 192,0,0,LBB2 + 52 .stabn 224,0,0,LBE2 + + +File: stabs.info, Node: Program Structure, Next: Constants, Prev: Overview, Up: Top + +Encoding the Structure of the Program +************************************* + + The elements of the program structure that stabs encode include the +name of the main function, the names of the source and include files, +the line numbers, procedure names and types, and the beginnings and +ends of blocks of code. + +* Menu: + +* Main Program:: Indicate what the main program is +* Source Files:: The path and name of the source file +* Include Files:: Names of include files +* Line Numbers:: +* Procedures:: +* Nested Procedures:: +* Block Structure:: +* Alternate Entry Points:: Entering procedures except at the beginning. + + +File: stabs.info, Node: Main Program, Next: Source Files, Up: Program Structure + +Main Program +============ + + Most languages allow the main program to have any name. The +`N_MAIN' stab type tells the debugger the name that is used in this +program. Only the string field is significant; it is the name of a +function which is the main program. Most C compilers do not use this +stab (they expect the debugger to assume that the name is `main'), but +some C compilers emit an `N_MAIN' stab for the `main' function. I'm +not sure how XCOFF handles this. + + +File: stabs.info, Node: Source Files, Next: Include Files, Prev: Main Program, Up: Program Structure + +Paths and Names of the Source Files +=================================== + + Before any other stabs occur, there must be a stab specifying the +source file. 
This information is contained in a symbol of stab type +`N_SO'; the string field contains the name of the file. The value of +the symbol is the start address of the portion of the text section +corresponding to that file. + + With the Sun Solaris2 compiler, the desc field contains a +source-language code. + + Some compilers (for example, GCC2 and SunOS4 `/bin/cc') also include +the directory in which the source was compiled, in a second `N_SO' +symbol preceding the one containing the file name. This symbol can be +distinguished by the fact that it ends in a slash. Code from the +`cfront' C++ compiler can have additional `N_SO' symbols for +nonexistent source files after the `N_SO' for the real source file; +these are believed to contain no useful information. + + For example: + + .stabs "/cygint/s1/users/jcm/play/",100,0,0,Ltext0 # 100 is N_SO + .stabs "hello.c",100,0,0,Ltext0 + .text + Ltext0: + + Instead of `N_SO' symbols, XCOFF uses a `.file' assembler directive +which assembles to a `C_FILE' symbol; explaining this in detail is +outside the scope of this document. + + If it is useful to indicate the end of a source file, this is done +with an `N_SO' symbol with an empty string for the name. The value is +the address of the end of the text section for the file. For some +systems, there is no indication of the end of a source file, and you +just need to figure it ended when you see an `N_SO' for a different +source file, or a symbol ending in `.o' (which at least some linkers +insert to mark the start of a new `.o' file). + + +File: stabs.info, Node: Include Files, Next: Line Numbers, Prev: Source Files, Up: Program Structure + +Names of Include Files +====================== + + There are several schemes for dealing with include files: the +traditional `N_SOL' approach, Sun's `N_BINCL' approach, and the XCOFF +`C_BINCL' approach (which despite the similar name has little in common +with `N_BINCL'). 
+ + An `N_SOL' symbol specifies which include file subsequent symbols +refer to. The string field is the name of the file and the value is the +text address corresponding to the end of the previous include file and +the start of this one. To specify the main source file again, use an +`N_SOL' symbol with the name of the main source file. + + The `N_BINCL' approach works as follows. An `N_BINCL' symbol +specifies the start of an include file. In an object file, only the +string is significant; the linker puts data into some of the other +fields. The end of the include file is marked by an `N_EINCL' symbol +(which has no string field). In an object file, there is no +significant data in the `N_EINCL' symbol. `N_BINCL' and `N_EINCL' can +be nested. + + If the linker detects that two source files have identical stabs +between an `N_BINCL' and `N_EINCL' pair (as will generally be the case +for a header file), then it only puts out the stabs once. Each +additional occurrence is replaced by an `N_EXCL' symbol. I believe the +GNU linker and the Sun (both SunOS4 and Solaris) linker are the only +ones which support this feature. + + A linker which supports this feature will set the value of an +`N_BINCL' symbol to the total of all the characters in the stabs +strings included in the header file, omitting any file numbers. The +value of an `N_EXCL' symbol is the same as the value of the `N_BINCL' +symbol it replaces. This information can be used to match up `N_EXCL' +and `N_BINCL' symbols which have the same filename. The `N_EINCL' +value, and the values of the other and description fields for all +three, appear to always be zero. + + For the start of an include file in XCOFF, use the `.bi' assembler +directive, which generates a `C_BINCL' symbol. A `.ei' directive, +which generates a `C_EINCL' symbol, denotes the end of the include +file. Both directives are followed by the name of the source file in +quotes, which becomes the string for the symbol. 
The value of each +symbol, produced automatically by the assembler and linker, is the +offset into the executable of the beginning (inclusive, as you'd +expect) or end (inclusive, as you would not expect) of the portion of +the COFF line table that corresponds to this include file. `C_BINCL' +and `C_EINCL' do not nest. + + +File: stabs.info, Node: Line Numbers, Next: Procedures, Prev: Include Files, Up: Program Structure + +Line Numbers +============ + + An `N_SLINE' symbol represents the start of a source line. The desc +field contains the line number and the value contains the code address +for the start of that source line. On most machines the address is +absolute; for stabs in sections (*note Stab Sections::.), it is +relative to the function in which the `N_SLINE' symbol occurs. + + GNU documents `N_DSLINE' and `N_BSLINE' symbols for line numbers in +the data or bss segments, respectively. They are identical to +`N_SLINE' but are relocated differently by the linker. They were +intended to be used to describe the source location of a variable +declaration, but I believe that GCC2 actually puts the line number in +the desc field of the stab for the variable itself. GDB has been +ignoring these symbols (unless they contain a string field) since at +least GDB 3.5. + + For single source lines that generate discontiguous code, such as +flow of control statements, there may be more than one line number +entry for the same source line. In this case there is a line number +entry at the start of each code range, each with the same line number. + + XCOFF does not use stabs for line numbers. Instead, it uses COFF +line numbers (which are outside the scope of this document). Standard +COFF line numbers cannot deal with include files, but in XCOFF this is +fixed with the `C_BINCL' method of marking include files (*note Include +Files::.). 
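As an illustration only (it is not part of the manual), here is a minimal Python sketch of how a tool might collect the a.out-style `N_SLINE` entries described in this section from a `.s` file. The names `line_table` and `STABN_RE` are hypothetical. The sketch assumes line stabs are written as `.stabn 68,OTHER,DESC,VALUE`, as in the hello-world listing, and it keeps VALUE as text because it is normally an assembler label rather than a resolved address.

```python
import re
from collections import defaultdict

N_SLINE = 68  # type number seen in the `.stabn 68,...` lines of the hello-world listing

# Matches `.stabn TYPE,OTHER,DESC,VALUE`. VALUE stays a string because it is
# usually a label that is only resolved at assembly or link time.
STABN_RE = re.compile(r"^\s*\.stabn\s+(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\S+)")


def line_table(asm_lines):
    """Map each source line number to the VALUE fields of its N_SLINE stabs."""
    table = defaultdict(list)
    for text in asm_lines:
        match = STABN_RE.match(text)
        if match is None:
            continue  # not a .stabn directive
        stab_type, _other, desc, value = match.groups()
        if int(stab_type) == N_SLINE:
            # desc holds the line number; one line may own several code ranges
            table[int(desc)].append(value)
    return dict(table)


listing = [
    "\t.stabn 68,0,4,LM1",
    "LM1:",
    "\t.stabn 68,0,5,LM2",
    "\t.stabn 68,0,6,LM3",
    "\t.stabn 68,0,6,LM4",
    "\t.stabn 192,0,0,LBB2",  # N_LBRAC, ignored here
]
table = line_table(listing)
print(table)
```

Source line 6 ends up with two entries, which is the case the text describes for a line that generates more than one code range.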
+ + +File: stabs.info, Node: Procedures, Next: Nested Procedures, Prev: Line Numbers, Up: Program Structure + +Procedures +========== + + All of the following stabs normally use the `N_FUN' symbol type. +However, Sun's `acc' compiler on SunOS4 uses `N_GSYM' and `N_STSYM', +which means that the value of the stab for the function is useless and +the debugger must get the address of the function from the non-stab +symbols instead. On systems where non-stab symbols have leading +underscores, the stabs will lack underscores and the debugger needs to +know about the leading underscore to match up the stab and the non-stab +symbol. BSD Fortran is said to use `N_FNAME' with the same +restriction; the value of the symbol is not useful (I'm not sure it +really does use this, because GDB doesn't handle this and no one has +complained). + + A function is represented by an `F' symbol descriptor for a global +(extern) function, and `f' for a static (local) function. For a.out, +the value of the symbol is the address of the start of the function; it +is already relocated. For stabs in ELF, the SunPRO compiler version +2.0.1 and GCC put out an address which gets relocated by the linker. +In a future release SunPRO is planning to put out zero, in which case +the address can be found from the ELF (non-stab) symbol. Because +looking things up in the ELF symbols would probably be slow, I'm not +sure how to find which symbol of that name is the right one, and this +doesn't provide any way to deal with nested functions, it would +probably be better to make the value of the stab an address relative to +the start of the file, or just absolute. See *Note ELF Linker +Relocation:: for more information on linker relocation of stabs in ELF +files. For XCOFF, the stab uses the `C_FUN' storage class and the +value of the stab is meaningless; the address of the function can be +found from the csect symbol (XTY_LD/XMC_PR). 
+ + The type information of the stab represents the return type of the +function; thus `foo:f5' means that foo is a function returning type 5. +There is no need to try to get the line number of the start of the +function from the stab for the function; it is in the next `N_SLINE' +symbol. + + Some compilers (such as Sun's Solaris compiler) support an extension +for specifying the types of the arguments. I suspect this extension is +not used for old (non-prototyped) function definitions in C. If the +extension is in use, the type information of the stab for the function +is followed by type information for each argument, with each argument +preceded by `;'. An argument type of 0 means that additional arguments +are being passed, whose types and number may vary (`...' in ANSI C). +GDB has tolerated this extension (parsed the syntax, if not necessarily +used the information) since at least version 4.8; I don't know whether +all versions of dbx tolerate it. The argument types given here are not +redundant with the symbols for the formal parameters (*note +Parameters::.); they are the types of the arguments as they are passed, +before any conversions might take place. For example, if a C function +which is declared without a prototype takes a `float' argument, the +value is passed as a `double' but then converted to a `float'. +Debuggers need to use the types given in the arguments when printing +values, but when calling the function they need to use the types given +in the symbol defining the function. + + If the return type and types of arguments of a function which is +defined in another source file are specified (i.e., a function +prototype in ANSI C), traditionally compilers emit no stab; the only +way for the debugger to find the information is if the source file +where the function is defined was also compiled with debugging symbols. 
+As an extension the Solaris compiler uses symbol descriptor `P' +followed by the return type of the function, followed by the arguments, +each preceded by `;', as in a stab with symbol descriptor `f' or `F'. +This use of symbol descriptor `P' can be distinguished from its use for +register parameters (*note Register Parameters::.) by the fact that it +has symbol type `N_FUN'. + + The AIX documentation also defines symbol descriptor `J' as an +internal function. I assume this means a function nested within another +function. It also says symbol descriptor `m' is a module in Modula-2 +or extended Pascal. + + Procedures (functions which do not return values) are represented as +functions returning the `void' type in C. I don't see why this couldn't +be used for all languages (inventing a `void' type for this purpose if +necessary), but the AIX documentation defines `I', `P', and `Q' for +internal, global, and static procedures, respectively. These symbol +descriptors are unusual in that they are not followed by type +information. + + The following example shows a stab for a function `main' which +returns type number `1'. The `_main' specified for the value is a +reference to an assembler label which is used to fill in the start +address of the function. + + .stabs "main:F1",36,0,0,_main # 36 is N_FUN + + The stab representing a procedure is located immediately following +the code of the procedure. This stab is in turn directly followed by a +group of other stabs describing elements of the procedure. These other +stabs describe the procedure's parameters, its block local variables, +and its block structure. + + If functions can appear in different sections, then the debugger may +not be able to find the end of a function. Recent versions of GCC will +mark the end of a function with an `N_FUN' symbol with an empty string +for the name. The value is the address of the end of the current +function. 
Without such a symbol, there is no indication of the address +of the end of a function, and you must assume that it ended at the +starting address of the next function or at the end of the text section +for the program. + + +File: stabs.info, Node: Nested Procedures, Next: Block Structure, Prev: Procedures, Up: Program Structure + +Nested Procedures +================= + + For any of the symbol descriptors representing procedures, after the +symbol descriptor and the type information is optionally a scope +specifier. This consists of a comma, the name of the procedure, another +comma, and the name of the enclosing procedure. The first name is local +to the scope specified, and seems to be redundant with the name of the +symbol (before the `:'). This feature is used by GCC, and presumably +Pascal, Modula-2, etc., compilers, for nested functions. + + If procedures are nested more than one level deep, only the +immediately containing scope is specified. For example, this code: + + int + foo (int x) + { + int bar (int y) + { + int baz (int z) + { + return x + y + z; + } + return baz (x + 2 * y); + } + return x + bar (3 * x); + } + +produces the stabs: + + .stabs "baz:f1,baz,bar",36,0,0,_baz.15 # 36 is N_FUN + .stabs "bar:f1,bar,foo",36,0,0,_bar.12 + .stabs "foo:F1",36,0,0,_foo + + +File: stabs.info, Node: Block Structure, Next: Alternate Entry Points, Prev: Nested Procedures, Up: Program Structure + +Block Structure +=============== + + The program's block structure is represented by the `N_LBRAC' (left +brace) and the `N_RBRAC' (right brace) stab types. The variables +defined inside a block precede the `N_LBRAC' symbol for most compilers, +including GCC. Other compilers, such as the Convex, Acorn RISC +machine, and Sun `acc' compilers, put the variables after the `N_LBRAC' +symbol. The values of the `N_LBRAC' and `N_RBRAC' symbols are the +start and end addresses of the code of the block, respectively. 
For +most machines, they are relative to the starting address of this source +file. For the Gould NP1, they are absolute. For stabs in sections +(*note Stab Sections::.), they are relative to the function in which +they occur. + + The `N_LBRAC' and `N_RBRAC' stabs that describe the block scope of a +procedure are located after the `N_FUN' stab that represents the +procedure itself. + + Sun documents the desc field of `N_LBRAC' and `N_RBRAC' symbols as +containing the nesting level of the block. However, dbx seems to not +care, and GCC always sets desc to zero. + + For XCOFF, block scope is indicated with `C_BLOCK' symbols. If the +name of the symbol is `.bb', then it is the beginning of the block; if +the name of the symbol is `.be', it is the end of the block. + + +File: stabs.info, Node: Alternate Entry Points, Prev: Block Structure, Up: Program Structure + +Alternate Entry Points +====================== + + Some languages, like Fortran, have the ability to enter procedures at +some place other than the beginning. One can declare an alternate entry +point. The `N_ENTRY' stab is for this; however, the Sun FORTRAN +compiler doesn't use it. According to AIX documentation, only the name +of a `C_ENTRY' stab is significant; the address of the alternate entry +point comes from the corresponding external symbol. A previous +revision of this document said that the value of an `N_ENTRY' stab was +the address of the alternate entry point, but I don't know the source +for that information. + + +File: stabs.info, Node: Constants, Next: Variables, Prev: Program Structure, Up: Top + +Constants +********* + + The `c' symbol descriptor indicates that this stab represents a +constant. This symbol descriptor is an exception to the general rule +that symbol descriptors are followed by type information. Instead, it +is followed by `=' and one of the following: + +`b VALUE' + Boolean constant. VALUE is a numeric value; I assume it is 0 for + false or 1 for true. 
+ +`c VALUE' + Character constant. VALUE is the numeric value of the constant. + +`e TYPE-INFORMATION , VALUE' + Constant whose value can be represented as integral. + TYPE-INFORMATION is the type of the constant, as it would appear + after a symbol descriptor (*note String Field::.). VALUE is the + numeric value of the constant. GDB 4.9 does not actually get the + right value if VALUE does not fit in a host `int', but it does not + do anything violent, and future debuggers could be extended to + accept integers of any size (whether unsigned or not). This + constant type is usually documented as being only for enumeration + constants, but GDB has never imposed that restriction; I don't + know about other debuggers. + +`i VALUE' + Integer constant. VALUE is the numeric value. The type is some + sort of generic integer type (for GDB, a host `int'); to specify + the type explicitly, use `e' instead. + +`r VALUE' + Real constant. VALUE is the real value, which can be `INF' + (optionally preceded by a sign) for infinity, `QNAN' for a quiet + NaN (not-a-number), or `SNAN' for a signalling NaN. If it is a + normal number the format is that accepted by the C library function + `atof'. + +`s STRING' + String constant. STRING is a string enclosed in either `'' (in + which case `'' characters within the string are represented as + `\'') or `"' (in which case `"' characters within the string are + represented as `\"'). + +`S TYPE-INFORMATION , ELEMENTS , BITS , PATTERN' + Set constant. TYPE-INFORMATION is the type of the constant, as it + would appear after a symbol descriptor (*note String Field::.). + ELEMENTS is the number of elements in the set (does this mean how + many bits of PATTERN are actually used, which would be redundant + with the type, or perhaps the number of bits set in PATTERN? I + don't get it), BITS is the number of bits in the constant (meaning + it specifies the length of PATTERN, I think), and PATTERN is a + hexadecimal representation of the set. 
AIX documentation refers + to a limit of 32 bytes, but I see no reason why this limit should + exist. This form could probably be used for arbitrary constants, + not just sets; the only catch is that PATTERN should be understood + to be target, not host, byte order and format. + + The boolean, character, string, and set constants are not supported +by GDB 4.9, but it ignores them. GDB 4.8 and earlier gave an error +message and refused to read symbols from the file containing the +constants. + + The above information is followed by `;'. + + +File: stabs.info, Node: Variables, Next: Types, Prev: Constants, Up: Top + +Variables +********* + + Different types of stabs describe the various ways that variables +can be allocated: on the stack, globally, in registers, in common +blocks, statically, or as arguments to a function. + +* Menu: + +* Stack Variables:: Variables allocated on the stack. +* Global Variables:: Variables used by more than one source file. +* Register Variables:: Variables in registers. +* Common Blocks:: Variables statically allocated together. +* Statics:: Variables local to one source file. +* Based Variables:: Fortran pointer based variables. +* Parameters:: Variables for arguments to functions. + + +File: stabs.info, Node: Stack Variables, Next: Global Variables, Up: Variables + +Automatic Variables Allocated on the Stack +========================================== + + If a variable's scope is local to a function and its lifetime is +only as long as that function executes (C calls such variables +"automatic"), it can be allocated in a register (*note Register +Variables::.) or on the stack. + + Each variable allocated on the stack has a stab with the symbol +descriptor omitted. Since type information should begin with a digit, +`-', or `(', only those characters are precluded from being used for symbol +descriptors.
However, the Acorn RISC machine (ARM) is said to get this +wrong: it puts out a mere type definition here, without the preceding +`TYPE-NUMBER='. This is a bad idea; there is no guarantee that type +descriptors are distinct from symbol descriptors. Stabs for stack +variables use the `N_LSYM' stab type, or `C_LSYM' for XCOFF. + + The value of the stab is the offset of the variable within the local +variables. On most machines this is an offset from the frame pointer +and is negative. The location of the stab specifies which block it is +defined in; see *Note Block Structure::. + + For example, the following C code: + + int + main () + { + int x; + } + + produces the following stabs: + + .stabs "main:F1",36,0,0,_main # 36 is N_FUN + .stabs "x:1",128,0,0,-12 # 128 is N_LSYM + .stabn 192,0,0,LBB2 # 192 is N_LBRAC + .stabn 224,0,0,LBE2 # 224 is N_RBRAC + + *Note Procedures:: for more information on the `N_FUN' stab, and +*Note Block Structure:: for more information on the `N_LBRAC' and +`N_RBRAC' stabs. + + +File: stabs.info, Node: Global Variables, Next: Register Variables, Prev: Stack Variables, Up: Variables + +Global Variables +================ + + A variable whose scope is not specific to just one source file is +represented by the `G' symbol descriptor. These stabs use the `N_GSYM' +stab type (C_GSYM for XCOFF). The type information for the stab (*note +String Field::.) gives the type of the variable. + + For example, the following source code: + + char g_foo = 'c'; + +yields the following assembly code: + + .stabs "g_foo:G2",32,0,0,0 # 32 is N_GSYM + .global _g_foo + .data + _g_foo: + .byte 99 + + The address of the variable represented by the `N_GSYM' is not +contained in the `N_GSYM' stab. The debugger gets this information +from the external symbol for the global variable. In the example above, +the `.global _g_foo' and `_g_foo:' lines tell the assembler to produce +an external symbol. 
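As a rough illustration of how a reader of these directives might split the string field, here is a minimal Python sketch. It is not how GDB or any assembler does it; `parse_stabs', `STAB_TYPES' and the regular expression are hypothetical names made up for this example. It assumes a single-character symbol descriptor and a name containing no colon, so two-character descriptors such as `pP' and C++ names are not handled. The rule for an omitted descriptor (type information begins with a digit, `-', or `(') is the one stated under Stack Variables.

```python
import re

# Stab type numbers as quoted in the examples of this document.
STAB_TYPES = {
    32: "N_GSYM", 36: "N_FUN", 38: "N_STSYM", 40: "N_LCSYM",
    64: "N_RSYM", 128: "N_LSYM", 160: "N_PSYM",
    192: "N_LBRAC", 224: "N_RBRAC",
}

# Hypothetical pattern for: .stabs "STRING",TYPE,OTHER,DESC,VALUE
_STABS_RE = re.compile(
    r'\.stabs\s+"([^"]*)"\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*([^\s#]+)'
)


def parse_stabs(line):
    """Split one .stabs directive into its main parts (illustrative only)."""
    m = _STABS_RE.search(line)
    if m is None:
        raise ValueError("not a .stabs directive: %r" % line)
    string, type_code, _other, _desc, value = m.groups()

    # NAME:REST, where REST is an optional descriptor plus type information.
    name, _, rest = string.partition(":")

    # Type information begins with a digit, '-' or '('; anything else
    # in that position is taken to be a one-character symbol descriptor.
    first = rest[:1]
    if first.isdigit() or first in ("-", "("):
        descriptor, type_info = "", rest
    else:
        descriptor, type_info = first, rest[1:]

    code = int(type_code)
    return {
        "name": name,
        "descriptor": descriptor,
        "type_info": type_info,
        "stab_type": STAB_TYPES.get(code, str(code)),
        "value": value,  # kept as text: may be a number or a symbol name
    }
```

For the `g_foo' example above this yields descriptor `G', type information `2' and stab type `N_GSYM'; for the stack variable `x:1' the descriptor comes back empty. The value field of an `N_GSYM' stab is not an address, so a real reader would still have to look up the external symbol `_g_foo'.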
+ + Some compilers, like GCC, output `N_GSYM' stabs only once, where the +variable is defined. Other compilers, like SunOS4 /bin/cc, output a +`N_GSYM' stab for each compilation unit which references the variable. + + +File: stabs.info, Node: Register Variables, Next: Common Blocks, Prev: Global Variables, Up: Variables + +Register Variables +================== + + Register variables have their own stab type, `N_RSYM' (`C_RSYM' for +XCOFF), and their own symbol descriptor, `r'. The stab's value is the +number of the register where the variable data will be stored. + + AIX defines a separate symbol descriptor `d' for floating point +registers. This seems unnecessary; why not just give floating +point registers different register numbers? I have not verified whether +the compiler actually uses `d'. + + If the register is explicitly allocated to a global variable, but not +initialized, as in: + + register int g_bar asm ("%g5"); + +then the stab may be emitted at the end of the object file, with the +other bss symbols. + + +File: stabs.info, Node: Common Blocks, Next: Statics, Prev: Register Variables, Up: Variables + +Common Blocks +============= + + A common block is a statically allocated section of memory which can +be referred to by several source files. It may contain several +variables. I believe Fortran is the only language with this feature. + + A `N_BCOMM' stab begins a common block and an `N_ECOMM' stab ends +it. The only field that is significant in these two stabs is the +string, which names a normal (non-debugging) symbol that gives the +address of the common block. According to IBM documentation, only the +`N_BCOMM' has the name of the common block (even though their compiler +actually puts it both places). + + The stabs for the members of the common block are between the +`N_BCOMM' and the `N_ECOMM'; the value of each stab is the offset +within the common block of that variable.
IBM uses the `C_ECOML' stab +type, and there is a corresponding `N_ECOML' stab type, but Sun's +Fortran compiler uses `N_GSYM' instead. The variables within a common +block use the `V' symbol descriptor (I believe this is true of all +Fortran variables). Other stabs (at least type declarations using +`C_DECL') can also be between the `N_BCOMM' and the `N_ECOMM'. + + +File: stabs.info, Node: Statics, Next: Based Variables, Prev: Common Blocks, Up: Variables + +Static Variables +================ + + Initialized static variables are represented by the `S' and `V' +symbol descriptors. `S' means file scope static, and `V' means +procedure scope static. One exception: in XCOFF, IBM's xlc compiler +always uses `V', and whether it is file scope or not is distinguished +by whether the stab is located within a function. + + In a.out files, `N_STSYM' means the data section, `N_FUN' means the +text section, and `N_LCSYM' means the bss section. For those systems +with a read-only data section separate from the text section (Solaris), +`N_ROSYM' means the read-only data section. + + For example, the source lines: + + static const int var_const = 5; + static int var_init = 2; + static int var_noinit; + +yield the following stabs: + + .stabs "var_const:S1",36,0,0,_var_const # 36 is N_FUN + ... + .stabs "var_init:S1",38,0,0,_var_init # 38 is N_STSYM + ... + .stabs "var_noinit:S1",40,0,0,_var_noinit # 40 is N_LCSYM + + In XCOFF files, the stab type need not indicate the section; +`C_STSYM' can be used for all statics. Also, each static variable is +enclosed in a static block. A `C_BSTAT' (emitted with a `.bs' +assembler directive) symbol begins the static block; its value is the +symbol number of the csect symbol whose value is the address of the +static block, its section is the section of the variables in that +static block, and its name is `.bs'. 
A `C_ESTAT' (emitted with a `.es' +assembler directive) symbol ends the static block; its name is `.es' +and its value and section are ignored. + + In ECOFF files, the storage class is used to specify the section, so +the stab type need not indicate the section. + + In ELF files, for the SunPRO compiler version 2.0.1, symbol +descriptor `S' means that the address is absolute (the linker relocates +it) and symbol descriptor `V' means that the address is relative to the +start of the relevant section for that compilation unit. SunPRO has +plans to have the linker stop relocating stabs; I suspect that their +debugger gets the address from the corresponding ELF (not stab) symbol. +I'm not sure how to find which symbol of that name is the right one. +The clean way to do all this would be to have the value of a symbol +descriptor `S' symbol be an offset relative to the start of the file, +just like everything else, but that introduces obvious compatibility +problems. For more information on linker stab relocation, *Note ELF +Linker Relocation::. + + +File: stabs.info, Node: Based Variables, Next: Parameters, Prev: Statics, Up: Variables + +Fortran Based Variables +======================= + + Fortran (at least, the Sun and SGI dialects of FORTRAN-77) has a +feature which allows allocating arrays with `malloc', but which avoids +blurring the line between arrays and pointers the way that C does. In +stabs such a variable uses the `b' symbol descriptor. + + For example, the Fortran declarations + + real foo, foo10(10), foo10_5(10,5) + pointer (foop, foo) + pointer (foo10p, foo10) + pointer (foo105p, foo10_5) + + produce the stabs + + foo:b6 + foo10:bar3;1;10;6 + foo10_5:bar3;1;5;ar3;1;10;6 + + In this example, `real' is type 6 and type 3 is an integral type +which is the type of the subscripts of the array (probably `integer').
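The nested array type strings in the stabs above can be unpacked with a few lines of Python. This is a sketch only: `parse_array_type' is a hypothetical helper, and it assumes the restricted form seen in this example, where each array level is written `ar' followed by INDEX-TYPE `;' LOWER `;' UPPER `;' and then the element type. General stabs type information is richer than this and is not handled.

```python
def parse_array_type(type_info):
    """Return (dimensions, element_type) for strings like 'ar3;1;10;6'.

    Each dimension is (index_type, lower_bound, upper_bound), listed from
    the outermost array inward.  Anything that does not start with 'ar'
    is returned unchanged as the element type.
    """
    dims = []
    rest = type_info
    while rest.startswith("ar"):
        # Peel off one array level: index type, bounds, then the remainder.
        index_type, low, high, rest = rest[2:].split(";", 3)
        dims.append((int(index_type), int(low), int(high)))
    return dims, rest
```

Applied to the type information of `foo10_5' (the text after the `b' descriptor), this gives an outer dimension running from 1 to 5 and an inner one from 1 to 10, with element type `6', that is, a 5-element array of 10-element arrays of reals.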
+ + The `b' symbol descriptor is like `V' in that it denotes a +statically allocated symbol whose scope is local to a function; see +*Note Statics::. The value of the symbol, instead of being the address +of the variable itself, is the address of a pointer to that variable. +So in the above example, the value of the `foo' stab is the address of +a pointer to a real, the value of the `foo10' stab is the address of a +pointer to a 10-element array of reals, and the value of the `foo10_5' +stab is the address of a pointer to a 5-element array of 10-element +arrays of reals. + + +File: stabs.info, Node: Parameters, Prev: Based Variables, Up: Variables + +Parameters +========== + + Formal parameters to a function are represented by a stab (or +sometimes two; see below) for each parameter. The stabs are in the +order in which the debugger should print the parameters (i.e., the +order in which the parameters are declared in the source file). The +exact form of the stab depends on how the parameter is being passed. + + Parameters passed on the stack use the symbol descriptor `p' and the +`N_PSYM' symbol type (or `C_PSYM' for XCOFF). The value of the symbol +is an offset used to locate the parameter on the stack; its exact +meaning is machine-dependent, but on most machines it is an offset from +the frame pointer. + + As a simple example, the code: + + main (argc, argv) + int argc; + char **argv; + + produces the stabs: + + .stabs "main:F1",36,0,0,_main # 36 is N_FUN + .stabs "argc:p1",160,0,0,68 # 160 is N_PSYM + .stabs "argv:p20=*21=*2",160,0,0,72 + + The type definition of `argv' is interesting because it contains +several type definitions. Type 21 is pointer to type 2 (char) and +`argv' (type 20) is pointer to type 21. + + The following symbol descriptors are also said to go with `N_PSYM'. +The value of the symbol is said to be an offset from the argument +pointer (I'm not sure whether this is true or not). 
+ + pP (<<??>>) + pF Fortran function parameter + X (function result variable) + +* Menu: + +* Register Parameters:: +* Local Variable Parameters:: +* Reference Parameters:: +* Conformant Arrays:: + + +File: stabs.info, Node: Register Parameters, Next: Local Variable Parameters, Up: Parameters + +Passing Parameters in Registers +------------------------------- + + If the parameter is passed in a register, then traditionally there +are two symbols for each argument: + + .stabs "arg:p1" . . . ; N_PSYM + .stabs "arg:r1" . . . ; N_RSYM + + Debuggers use the second one to find the value, and the first one to +know that it is an argument. + + Because that approach is kind of ugly, some compilers use symbol +descriptor `P' or `R' to indicate an argument which is in a register. +Symbol type `C_RPSYM' is used in XCOFF and `N_RSYM' is used otherwise. +The symbol's value is the register number. `P' and `R' mean the same +thing; the difference is that `P' is a GNU invention and `R' is an IBM +(XCOFF) invention. As of version 4.9, GDB should handle either one. + + There is at least one case where GCC uses a `p' and `r' pair rather +than `P'; this is where the argument is passed in the argument list and +then loaded into a register. + + According to the AIX documentation, symbol descriptor `D' is for a +parameter passed in a floating point register. This seems +unnecessary--why not just use `R' with a register number which +indicates that it's a floating point register? I haven't verified +whether the system actually does what the documentation indicates. + + On the sparc and hppa, for a `P' symbol whose type is a structure or +union, the register contains the address of the structure. On the +sparc, this is also true of a `p' and `r' pair (using Sun `cc') or a +`p' symbol. However, if a (small) structure is really in a register, +`r' is used. 
And, to top it all off, on the hppa it might be a +structure which was passed on the stack and loaded into a register and +for which there is a `p' and `r' pair! I believe that symbol +descriptor `i' is supposed to deal with this case (it is said to mean +"value parameter by reference, indirect access"; I don't know the +source for this information), but I don't know details or what +compilers or debuggers use it, if any (not GDB or GCC). It is not +clear to me whether this case needs to be dealt with differently than +parameters passed by reference (*note Reference Parameters::.). + + +File: stabs.info, Node: Local Variable Parameters, Next: Reference Parameters, Prev: Register Parameters, Up: Parameters + +Storing Parameters as Local Variables +------------------------------------- + + There is a case similar to an argument in a register, which is an +argument that is actually stored as a local variable. Sometimes this +happens when the argument was passed in a register and then the compiler +stores it as a local variable. If possible, the compiler should claim +that it's in a register, but this isn't always done. + + If a parameter is passed as one type and converted to a smaller type +by the prologue (for example, the parameter is declared as a `float', +but the calling conventions specify that it is passed as a `double'), +then GCC2 (sometimes) uses a pair of symbols. The first symbol uses +symbol descriptor `p' and the type which is passed. The second symbol +has the type and location which the parameter actually has after the +prologue. 
For example, suppose the following C code appears with no +prototypes involved: + + void + subr (f) + float f; + { + + if `f' is passed as a double at stack offset 8, and the prologue +converts it to a float in register number 0, then the stabs look like: + + .stabs "f:p13",160,0,3,8 # 160 is `N_PSYM', here 13 is `double' + .stabs "f:r12",64,0,3,0 # 64 is `N_RSYM', here 12 is `float' + + In both stabs 3 is the line number where `f' is declared (*note Line +Numbers::.). + + GCC, at least on the 960, has another solution to the same problem. +It uses a single `p' symbol descriptor for an argument which is stored +as a local variable but uses `N_LSYM' instead of `N_PSYM'. In this +case, the value of the symbol is an offset relative to the local +variables for that function, not relative to the arguments; on some +machines those are the same thing, but not on all. + + On the VAX or on other machines in which the calling convention +includes the number of words of arguments actually passed, the debugger +(GDB at least) uses the parameter symbols to keep track of whether it +needs to print nameless arguments in addition to the formal parameters +which it has printed because each one has a stab. For example, in + + extern int fprintf (FILE *stream, char *format, ...); + ... + fprintf (stdout, "%d\n", x); + + there are stabs for `stream' and `format'. On most machines, the +debugger can only print those two arguments (because it has no way of +knowing that additional arguments were passed), but on the VAX or other +machines with a calling convention which indicates the number of words +of arguments, the debugger can print all three arguments. To do so, +the parameter symbol (symbol descriptor `p') (not necessarily `r' or +symbol descriptor omitted symbols) needs to contain the actual type as +passed (for example, `double' not `float' if it is passed as a double +and converted to a float). 
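The `p'/`r' pair described earlier in this node can be folded into a single parameter record along the following lines. This is only a sketch in Python, under the assumption that the stabs have already been split into (name, descriptor, type, value) tuples in the order the compiler emitted them. `merge_parameter_pairs' and the dictionary keys are made up for the example, it is not GDB's algorithm, and it ignores the `N_LSYM' variant that GCC uses on the 960.

```python
def merge_parameter_pairs(stabs):
    """Combine a `p' stab with a following `r' stab of the same name.

    stabs: iterable of (name, descriptor, type_info, value) tuples.
    The `p' stab supplies the type as passed; a later `r' stab for the
    same name supplies the type and location after the prologue.
    """
    params = []
    for name, descriptor, type_info, value in stabs:
        if descriptor == "p":
            params.append({
                "name": name,
                "passed_type": type_info,      # type in the argument list
                "type": type_info,             # may be overridden below
                "location": ("stack", value),
            })
        elif descriptor == "r" and params and params[-1]["name"] == name:
            # Second half of the pair: where the parameter really lives.
            params[-1]["type"] = type_info
            params[-1]["location"] = ("register", value)
    return params
```

For the `subr' example, the input `[("f", "p", "13", 8), ("f", "r", "12", 0)]' produces one record for `f' with passed type 13 (`double'), actual type 12 (`float') and location register 0. Keeping the passed type separately matters on machines like the VAX, where the debugger needs it to account for the argument words.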
+ +File: stabs.info, Node: Reference Parameters, Next: Conformant Arrays, Prev: Local Variable Parameters, Up: Parameters + +Passing Parameters by Reference +------------------------------- + + If the parameter is passed by reference (e.g., Pascal `VAR' +parameters), then the symbol descriptor is `v' if it is in the argument +list, or `a' if it is in a register. Other than the fact that these +contain the address of the parameter rather than the parameter itself, +they are identical to `p' and `R', respectively. I believe `a' is an +AIX invention; `v' is supported by all stabs-using systems as far as I +know. +