This is Info file stabs.info, produced by Makeinfo version 1.68 from
the input file ./stabs.texinfo.
START-INFO-DIR-ENTRY
* Stabs: (stabs). The "stabs" debugging information format.
END-INFO-DIR-ENTRY
This document describes the stabs debugging symbol tables.
Copyright 1992, 93, 94, 95, 97, 1998 Free Software Foundation, Inc.
Contributed by Cygnus Support. Written by Julia Menapace, Jim Kingdon,
and David MacKenzie.
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
Permission is granted to copy or distribute modified versions of this
manual under the terms of the GPL (for which purpose this text may be
regarded as a program in the language TeX).
File: stabs.info, Node: Top, Next: Overview, Up: (dir)
The "stabs" representation of debugging information
***************************************************
This document describes the stabs debugging format.
* Menu:
* Overview:: Overview of stabs
* Program Structure:: Encoding of the structure of the program
* Constants:: Constants
* Variables::
* Types:: Type definitions
* Symbol Tables:: Symbol information in symbol tables
* Cplusplus:: Stabs specific to C++
* Stab Types:: Symbol types in a.out files
* Symbol Descriptors:: Table of symbol descriptors
* Type Descriptors:: Table of type descriptors
* Expanded Reference:: Reference information by stab type
* Questions:: Questions and anomalies
* Stab Sections:: In some object file formats, stabs are
in sections.
* Symbol Types Index:: Index of symbolic stab symbol type names.
File: stabs.info, Node: Overview, Next: Program Structure, Prev: Top, Up: Top
Overview of Stabs
*****************
"Stabs" refers to a format for information that describes a program
to a debugger. This format was apparently invented by Peter Kessler at
the University of California at Berkeley, for the `pdx' Pascal
debugger; the format has spread widely since then.
This document is one of the few published sources of documentation on
stabs. It is believed to be comprehensive for stabs used by C. The
lists of symbol descriptors (*note Symbol Descriptors::.) and type
descriptors (*note Type Descriptors::.) are believed to be completely
comprehensive. Stabs for COBOL-specific features and for variant
records (used by Pascal and Modula-2) are poorly documented here.
Other sources of information on stabs are `Dbx and Dbxtool
Interfaces', 2nd edition, by Sun, 1988, and `AIX Version 3.2 Files
Reference', Fourth Edition, September 1992, "dbx Stabstring Grammar" in
the a.out section, page 2-31. This document is believed to incorporate
the information from those two sources except where it explicitly
directs you to them for more information.
* Menu:
* Flow:: Overview of debugging information flow
* Stabs Format:: Overview of stab format
* String Field:: The string field
* C Example:: A simple example in C source
* Assembly Code:: The simple example at the assembly level
File: stabs.info, Node: Flow, Next: Stabs Format, Up: Overview
Overview of Debugging Information Flow
======================================
The GNU C compiler compiles C source in a `.c' file into assembly
language in a `.s' file, which the assembler translates into a `.o'
file, which the linker combines with other `.o' files and libraries to
produce an executable file.
With the `-g' option, GCC puts in the `.s' file additional debugging
information, which is slightly transformed by the assembler and linker,
and carried through into the final executable. This debugging
information describes features of the source file like line numbers,
the types and scopes of variables, and function names, parameters, and
scopes.
For some object file formats, the debugging information is
encapsulated in assembler directives known collectively as "stab"
(symbol table) directives, which are interspersed with the generated
code. Stabs are the native format for debugging information in the
a.out and XCOFF object file formats. The GNU tools can also emit stabs
in the COFF and ECOFF object file formats.
The assembler adds the information from stabs to the symbol
information it places by default in the symbol table and the string
table of the `.o' file it is building. The linker consolidates the `.o'
files into one executable file, with one symbol table and one string
table. Debuggers use the symbol and string tables in the executable as
a source of debugging information about the program.
File: stabs.info, Node: Stabs Format, Next: String Field, Prev: Flow, Up: Overview
Overview of Stab Format
=======================
There are three overall formats for stab assembler directives,
differentiated by the first word of the stab. The name of the directive
describes which combination of four possible data fields follows. It is
either `.stabs' (string), `.stabn' (number), or `.stabd' (dot). IBM's
XCOFF assembler uses `.stabx' (and some other directives such as
`.file' and `.bi') instead of `.stabs', `.stabn' or `.stabd'.
The overall format of each class of stab is:
     .stabs "STRING",TYPE,OTHER,DESC,VALUE
     .stabn TYPE,OTHER,DESC,VALUE
     .stabd TYPE,OTHER,DESC
     .stabx "STRING",VALUE,TYPE,SDB-TYPE
For `.stabn' and `.stabd', there is no STRING (the `n_strx' field is
zero; see *Note Symbol Tables::). For `.stabd', the VALUE field is
implicit and has the value of the current file location. For `.stabx',
the SDB-TYPE field is unused for stabs and can always be set to zero.
The OTHER field is almost always unused and can be set to zero.
The number in the TYPE field gives some basic information about
which type of stab this is (or whether it *is* a stab, as opposed to an
ordinary symbol). Each valid type number defines a different stab
type; further, the stab type defines the exact interpretation of, and
possible values for, any remaining STRING, DESC, or VALUE fields
present in the stab. *Note Stab Types::, for a list in numeric order
of the valid TYPE field values for stab directives.
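The three directive layouts above can be told apart mechanically. The following is a minimal sketch in Python, not part of any assembler or debugger; the names `Stab' and `parse_stab' are hypothetical, and it assumes that `#' starts a trailing comment, as in the examples in this document (the comment character really depends on the assembler).

```python
import re
from typing import NamedTuple, Optional


class Stab(NamedTuple):
    directive: str          # "stabs", "stabn" or "stabd"
    string: Optional[str]   # None for .stabn and .stabd
    type: int
    other: int
    desc: int
    value: Optional[str]    # label or number as written; None for .stabd


def parse_stab(line: str) -> Stab:
    """Split one .stabs/.stabn/.stabd directive into its fields."""
    m = re.match(r'\s*\.(stab[snd])\s+(.*)$', line)
    if m is None:
        raise ValueError("not a stab directive: %r" % line)
    directive, rest = m.group(1), m.group(2)
    string = None
    if directive == "stabs":
        # The string field is quoted and may itself contain commas.
        sm = re.match(r'"((?:[^"\\]|\\.)*)"\s*,\s*(.*)$', rest)
        if sm is None:
            raise ValueError("missing string field: %r" % line)
        string, rest = sm.group(1), sm.group(2)
    # Assumption: '#' introduces a comment after the last field.
    rest = rest.split("#", 1)[0]
    fields = [f.strip() for f in rest.split(",")]
    expected = 3 if directive == "stabd" else 4
    if len(fields) != expected:
        raise ValueError("wrong number of fields: %r" % line)
    # For .stabd the value is implicit (the current location).
    value = fields[3] if expected == 4 else None
    return Stab(directive, string,
                int(fields[0], 0), int(fields[1], 0), int(fields[2], 0),
                value)
```

The value is kept as text because it is usually an assembler label (such as `_main') rather than a number.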
File: stabs.info, Node: String Field, Next: C Example, Prev: Stabs Format, Up: Overview
The String Field
================
For most stabs the string field holds the meat of the debugging
information. The flexible nature of this field is what makes stabs
extensible. For some stab types the string field contains only a name.
For other stab types the contents can be a great deal more complex.
The overall format of the string field for most stab types is:
     "NAME:SYMBOL-DESCRIPTOR TYPE-INFORMATION"
NAME is the name of the symbol represented by the stab; it can
contain a pair of colons (*note Nested Symbols::.). NAME can be
omitted, which means the stab represents an unnamed object. For
example, `:t10=*2' defines type 10 as a pointer to type 2, but does not
give the type a name. Omitting the NAME field is supported by AIX dbx
and GDB after about version 4.8, but not other debuggers. GCC
sometimes uses a single space as the name instead of omitting the name
altogether; apparently that is supported by most debuggers.
The SYMBOL-DESCRIPTOR following the `:' is an alphabetic character
that tells more specifically what kind of symbol the stab represents.
If the SYMBOL-DESCRIPTOR is omitted, but type information follows, then
the stab represents a local variable. For a list of symbol
descriptors, see *Note Symbol Descriptors::. The `c' symbol descriptor
is an exception in that it is not followed by type information. *Note
Constants::.
TYPE-INFORMATION is either a TYPE-NUMBER, or `TYPE-NUMBER='. A
TYPE-NUMBER alone is a type reference, referring directly to a type
that has already been defined.
The `TYPE-NUMBER=' form is a type definition, where the number
represents a new type which is about to be defined. The type
definition may refer to other types by number, and those type numbers
may be followed by `=' and nested definitions. Also, the Lucid
compiler will repeat `TYPE-NUMBER=' more than once if it wants to
define several type numbers at once.
In a type definition, if the character that follows the equals sign
is non-numeric then it is a TYPE-DESCRIPTOR, and tells what kind of
type is about to be defined. Any other values following the
TYPE-DESCRIPTOR vary, depending on the TYPE-DESCRIPTOR. *Note Type
Descriptors::, for a list of TYPE-DESCRIPTOR values. If a number
follows the `=' then the number is a TYPE-REFERENCE. For a full
description of types, *Note Types::.
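The split of the string field into NAME, SYMBOL-DESCRIPTOR and TYPE-INFORMATION can be sketched as follows. This is an illustration only; `split_string_field' is a hypothetical name. It assumes that the separator is the first `:' that is not part of a `::' pair, and that a letter directly after the separator is the symbol descriptor. It does not interpret the type information itself.

```python
def split_string_field(s):
    """Return (name, symbol_descriptor, type_information).

    symbol_descriptor is "" when it is omitted, which the text above
    says denotes a local variable.
    """
    i = 0
    while i < len(s):
        if s[i] == ":":
            if s[i + 1:i + 2] == ":":
                # A pair of colons belongs to the name; skip both.
                i += 2
                continue
            break
        i += 1
    else:
        raise ValueError("no ':' separator in %r" % s)
    name, rest = s[:i], s[i + 1:]
    # Assumption: the descriptor is a single alphabetic character.
    if rest and rest[0].isalpha():
        return name, rest[0], rest[1:]
    return name, "", rest
```

For the unnamed pointer type mentioned above, `split_string_field(":t10=*2")' yields an empty name, the descriptor `t', and the type information `10=*2'.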
A TYPE-NUMBER is often a single number. The GNU and Sun tools
additionally permit a TYPE-NUMBER to be a pair
(FILE-NUMBER,FILETYPE-NUMBER) (the parentheses appear in the string,
and serve to distinguish the two cases). The FILE-NUMBER is a number
starting with 1 which is incremented for each separate source file in
the compilation (e.g., in C, each header file gets a different number).
The FILETYPE-NUMBER is a number starting with 1 which is incremented
for each new type defined in the file. (Separating the file number and
the type number permits the `N_BINCL' optimization to succeed more
often; see *Note Include Files::).
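The two spellings of a TYPE-NUMBER, and the difference between a type reference and a type definition, can be recognized with a short routine. The sketch below is hypothetical (`parse_type_information' is not a real API); it returns the text after `=' unparsed and ignores anything that follows a plain reference.

```python
import re

# Either "(FILE-NUMBER,FILETYPE-NUMBER)" or a single (possibly negative) number.
_TYPE_NUMBER = re.compile(r'\((\d+),(\d+)\)|(-?\d+)')


def parse_type_information(text):
    """Return (type_number, definition).

    type_number is an int, or a (file_number, filetype_number) tuple.
    definition is the text after '=' for a type definition, or None
    for a type reference.
    """
    m = _TYPE_NUMBER.match(text)
    if m is None:
        raise ValueError("no type number at start of %r" % text)
    if m.group(3) is not None:
        number = int(m.group(3))
    else:
        number = (int(m.group(1)), int(m.group(2)))
    rest = text[m.end():]
    if rest.startswith("="):
        return number, rest[1:]
    return number, None
```

Keeping the pair form as a tuple preserves the parentheses' role of distinguishing the two cases.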
There is an AIX extension for type attributes. Following the `='
are any number of type attributes. Each one starts with `@' and ends
with `;'. Debuggers, including AIX's dbx and GDB 4.10, skip any type
attributes they do not recognize. GDB 4.9 and other versions of dbx
may not do this. Because of a conflict with C++ (*note Cplusplus::.),
new attributes should not be defined which begin with a digit, `(', or
`-'; GDB may be unable to distinguish those from the C++ type
descriptor `@'. The attributes are:
`aBOUNDARY'
     BOUNDARY is an integer specifying the alignment. I assume it
     applies to all variables of this type.
`pINTEGER'
     Pointer class (for checking). Not sure what this means, or how
     INTEGER is interpreted.
`P'
     Indicate this is a packed type, meaning that structure fields or
     array elements are placed more closely in memory, to save memory
     at the expense of speed.
`sSIZE'
     Size in bits of a variable of this type. This is fully supported
     by GDB 4.11 and later.
`S'
     Indicate that this type is a string instead of an array of
     characters, or a bitstring instead of a set. It doesn't change
     the layout of the data being represented, but does enable the
     debugger to know which type it is.
All of this can make the string field quite long. All versions of
GDB, and some versions of dbx, can handle arbitrarily long strings.
But many versions of dbx (or assemblers or linkers, I'm not sure which)
limit the strings to about 80 characters, so compilers which
must work with such systems need to split the `.stabs' directive into
several `.stabs' directives. Each stab duplicates every field except
the string field. The string field of every stab except the last is
marked as continued with a backslash at the end (in the assembly code
this may be written as a double backslash, depending on the assembler).
Removing the backslashes and concatenating the string fields of each
stab produces the original, long string. Just to be incompatible (or so
they don't have to worry about what the assembler does with
backslashes), AIX can use `?' instead of backslash.
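Reassembling a continued string is a matter of stripping the marker from every piece except the last and concatenating. A minimal sketch, with the hypothetical name `join_continued'; the marker defaults to a single backslash and can be set to `?' for the AIX variant:

```python
def join_continued(strings, marker="\\"):
    """Join the string fields of a run of continued stabs.

    Every string except the last must end with the continuation marker.
    """
    if not strings:
        raise ValueError("no strings to join")
    parts = []
    for s in strings[:-1]:
        if not s.endswith(marker):
            raise ValueError("expected continuation marker at end of %r" % s)
        parts.append(s[:-len(marker)])
    parts.append(strings[-1])
    return "".join(parts)
```

The input here is the string field as the debugger sees it, that is, after the assembler has reduced any doubled backslash to a single one.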
File: stabs.info, Node: C Example, Next: Assembly Code, Prev: String Field, Up: Overview
A Simple Example in C Source
============================
To get the flavor of how stabs describe source information for a C
program, let's look at the simple program:
     main()
     {
         printf("Hello, world!\n");
     }
When compiled with `-g', the program above yields the following `.s'
file. Line numbers have been added to make it easier to refer to parts
of the `.s' file in the description of the stabs that follows.
File: stabs.info, Node: Assembly Code, Prev: C Example, Up: Overview
The Simple Example at the Assembly Level
========================================
This simple "hello world" example demonstrates several of the stab
types used to describe C language source files.
     1  gcc2_compiled.:
     2  .stabs "/cygint/s1/users/jcm/play/",100,0,0,Ltext0
     3  .stabs "hello.c",100,0,0,Ltext0
     4  .text
     5  Ltext0:
     6  .stabs "int:t1=r1;-2147483648;2147483647;",128,0,0,0
     7  .stabs "char:t2=r2;0;127;",128,0,0,0
     8  .stabs "long int:t3=r1;-2147483648;2147483647;",128,0,0,0
     9  .stabs "unsigned int:t4=r1;0;-1;",128,0,0,0
     10 .stabs "long unsigned int:t5=r1;0;-1;",128,0,0,0
     11 .stabs "short int:t6=r1;-32768;32767;",128,0,0,0
     12 .stabs "long long int:t7=r1;0;-1;",128,0,0,0
     13 .stabs "short unsigned int:t8=r1;0;65535;",128,0,0,0
     14 .stabs "long long unsigned int:t9=r1;0;-1;",128,0,0,0
     15 .stabs "signed char:t10=r1;-128;127;",128,0,0,0
     16 .stabs "unsigned char:t11=r1;0;255;",128,0,0,0
     17 .stabs "float:t12=r1;4;0;",128,0,0,0
     18 .stabs "double:t13=r1;8;0;",128,0,0,0
     19 .stabs "long double:t14=r1;8;0;",128,0,0,0
     20 .stabs "void:t15=15",128,0,0,0
     21 .align 4
     22 LC0:
     23 .ascii "Hello, world!\12\0"
     24 .align 4
     25 .global _main
     26 .proc 1
     27 _main:
     28 .stabn 68,0,4,LM1
     29 LM1:
     30 !#PROLOGUE# 0
     31 save %sp,-136,%sp
     32 !#PROLOGUE# 1
     33 call ___main,0
     34 nop
     35 .stabn 68,0,5,LM2
     36 LM2:
     37 LBB2:
     38 sethi %hi(LC0),%o1
     39 or %o1,%lo(LC0),%o0
     40 call _printf,0
     41 nop
     42 .stabn 68,0,6,LM3
     43 LM3:
     44 LBE2:
     45 .stabn 68,0,6,LM4
     46 LM4:
     47 L1:
     48 ret
     49 restore
     50 .stabs "main:F1",36,0,0,_main
     51 .stabn 192,0,0,LBB2
     52 .stabn 224,0,0,LBE2
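The listing gives stab types as bare numbers. As a reading aid, the sketch below maps the numbers that occur in it to symbolic names and counts them. This document states that 100 is `N_SO' and 36 is `N_FUN'; the other four values are the conventional a.out numbers, given here as an assumption to be checked against *Note Stab Types::. The names `STAB_TYPE_NAMES' and `stab_type_name' are hypothetical, and only a few lines of the listing are reproduced.

```python
import re
from collections import Counter

# Decimal stab type numbers seen in the listing above.
STAB_TYPE_NAMES = {
    36: "N_FUN",
    68: "N_SLINE",
    100: "N_SO",
    128: "N_LSYM",
    192: "N_LBRAC",
    224: "N_RBRAC",
}

# The TYPE field is the first number after the optional quoted string.
_TYPE_FIELD = re.compile(r'\.stab[snd]\s+(?:"[^"]*",)?(\d+)')


def stab_type_name(line):
    """Return the symbolic stab type for a listing line, or None."""
    m = _TYPE_FIELD.search(line)
    if m is None:
        return None            # not a stab directive
    return STAB_TYPE_NAMES.get(int(m.group(1)), "unknown")


LISTING = [
    '2 .stabs "/cygint/s1/users/jcm/play/",100,0,0,Ltext0',
    '3 .stabs "hello.c",100,0,0,Ltext0',
    '6 .stabs "int:t1=r1;-2147483648;2147483647;",128,0,0,0',
    '28 .stabn 68,0,4,LM1',
    '31 save %sp,-136,%sp',
    '50 .stabs "main:F1",36,0,0,_main',
    '51 .stabn 192,0,0,LBB2',
    '52 .stabn 224,0,0,LBE2',
]

summary = Counter(n for n in map(stab_type_name, LISTING) if n is not None)
```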
File: stabs.info, Node: Program Structure, Next: Constants, Prev: Overview, Up: Top
Encoding the Structure of the Program
*************************************
The elements of the program structure that stabs encode include the
name of the main function, the names of the source and include files,
the line numbers, procedure names and types, and the beginnings and
ends of blocks of code.
* Menu:
* Main Program:: Indicate what the main program is
* Source Files:: The path and name of the source file
* Include Files:: Names of include files
* Line Numbers::
* Procedures::
* Nested Procedures::
* Block Structure::
* Alternate Entry Points:: Entering procedures except at the beginning.
File: stabs.info, Node: Main Program, Next: Source Files, Up: Program Structure
Main Program
============
Most languages allow the main program to have any name. The
`N_MAIN' stab type tells the debugger the name that is used in this
program. Only the string field is significant; it is the name of a
function which is the main program. Most C compilers do not use this
stab (they expect the debugger to assume that the name is `main'), but
some C compilers emit an `N_MAIN' stab for the `main' function. I'm
not sure how XCOFF handles this.
File: stabs.info, Node: Source Files, Next: Include Files, Prev: Main Program, Up: Program Structure
Paths and Names of the Source Files
===================================
Before any other stabs occur, there must be a stab specifying the
source file. This information is contained in a symbol of stab type
`N_SO'; the string field contains the name of the file. The value of
the symbol is the start address of the portion of the text section
corresponding to that file.
With the Sun Solaris2 compiler, the desc field contains a
source-language code.
Some compilers (for example, GCC2 and SunOS4 `/bin/cc') also include
the directory in which the source was compiled, in a second `N_SO'
symbol preceding the one containing the file name. This symbol can be
distinguished by the fact that it ends in a slash. Code from the
`cfront' C++ compiler can have additional `N_SO' symbols for
nonexistent source files after the `N_SO' for the real source file;
these are believed to contain no useful information.
For example:
     .stabs "/cygint/s1/users/jcm/play/",100,0,0,Ltext0 # 100 is N_SO
     .stabs "hello.c",100,0,0,Ltext0
     .text
     Ltext0:
Instead of `N_SO' symbols, XCOFF uses a `.file' assembler directive
which assembles to a `C_FILE' symbol; explaining this in detail is
outside the scope of this document.
If it is useful to indicate the end of a source file, this is done
with an `N_SO' symbol with an empty string for the name. The value is
the address of the end of the text section for the file. For some
systems, there is no indication of the end of a source file, and you
must assume that it ended when you see an `N_SO' for a different
source file, or a symbol ending in `.o' (which at least some linkers
insert to mark the start of a new `.o' file).
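The `N_SO' conventions described above (a directory entry ending in a slash, then the file name, and optionally an empty string marking the end) can be folded into a list of source files with their text ranges. This is a sketch under those assumptions only; `source_files' is a hypothetical name, the addresses in any call are made up, and the `.o' symbols some linkers insert are not handled.

```python
import posixpath


def source_files(n_so_entries):
    """Fold N_SO stabs into (path, start, end) tuples.

    n_so_entries is a list of (string, address) pairs in symbol table
    order. end is None when nothing marks the end of the last file.
    """
    files = []
    directory = ""
    current = None                     # (path, start) of the open file

    def close(end):
        nonlocal current
        if current is not None:
            files.append((current[0], current[1], end))
            current = None

    for string, address in n_so_entries:
        if string == "":
            close(address)             # explicit end-of-file marker
            directory = ""
        elif string.endswith("/"):
            close(address)             # a new file is about to start
            directory = string
        else:
            close(address)             # previous file ends where this begins
            current = (posixpath.join(directory, string), address)
            directory = ""
    close(None)
    return files
```

`posixpath.join' leaves an absolute file name untouched, so a compiler that emits a full path in the second `N_SO' is handled as well.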
File: stabs.info, Node: Include Files, Next: Line Numbers, Prev: Source Files, Up: Program Structure
Names of Include Files
======================
There are several schemes for dealing with include files: the
traditional `N_SOL' approach, Sun's `N_BINCL' approach, and the XCOFF
`C_BINCL' approach (which despite the similar name has little in common
with `N_BINCL').
An `N_SOL' symbol specifies which include file subsequent symbols
refer to. The string field is the name of the file and the value is the
text address corresponding to the end of the previous include file and
the start of this one. To specify the main source file again, use an
`N_SOL' symbol with the name of the main source file.
The `N_BINCL' approach works as follows. An `N_BINCL' symbol
specifies the start of an include file. In an object file, only the
string is significant; the linker puts data into some of the other
fields. The end of the include file is marked by an `N_EINCL' symbol
(which has no string field). In an object file, there is no
significant data in the `N_EINCL' symbol. `N_BINCL' and `N_EINCL' can
be nested.
If the linker detects that two source files have identical stabs
between an `N_BINCL' and `N_EINCL' pair (as will generally be the case
for a header file), then it only puts out the stabs once. Each
additional occurrence is replaced by an `N_EXCL' symbol. I believe the
GNU linker and the Sun (both SunOS4 and Solaris) linker are the only
ones which support this feature.
A linker which supports this feature will set the value of a
`N_BINCL' symbol to the total of all the characters in the stabs
strings included in the header file, omitting any file numbers. The
value of an `N_EXCL' symbol is the same as the value of the `N_BINCL'
symbol it replaces. This information can be used to match up `N_EXCL'
and `N_BINCL' symbols which have the same filename. The `N_EINCL'
value, and the values of the other and description fields for all
three, appear to always be zero.
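The value computed for `N_BINCL' can be pictured as a simple character sum. The sketch below is one possible reading of "omitting any file numbers": it adds up the character codes of every stabs string in the header, and after each `(' it skips the digits of the FILE-NUMBER that follows. This is an assumption for illustration, not a statement of what any particular linker does, and `bincl_value' is a hypothetical name.

```python
def bincl_value(strings):
    """Sum the characters of the stabs strings between N_BINCL and N_EINCL.

    Assumption: the file number is the run of digits directly after '('.
    It is skipped, so the same header gets the same value no matter
    which file number it was assigned in a given compilation.
    """
    total = 0
    for s in strings:
        i = 0
        while i < len(s):
            total += ord(s[i])
            i += 1
            if s[i - 1] == "(":
                # Skip the digits of the file number.
                while i < len(s) and s[i].isdigit():
                    i += 1
    return total
```

Under this reading, two inclusions of one header produce equal values even if the header was file 1 in one compilation and file 3 in another, which is what lets `N_EXCL' and `N_BINCL' symbols with the same file name be matched.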
For the start of an include file in XCOFF, use the `.bi' assembler
directive, which generates a `C_BINCL' symbol. A `.ei' directive,
which generates a `C_EINCL' symbol, denotes the end of the include
file. Both directives are followed by the name of the source file in
quotes, which becomes the string for the symbol. The value of each
symbol, produced automatically by the assembler and linker, is the
offset into the executable of the beginning (inclusive, as you'd
expect) or end (inclusive, as you would not expect) of the portion of
the COFF line table that corresponds to this include file. `C_BINCL'
and `C_EINCL' do not nest.
File: stabs.info, Node: Line Numbers, Next: Procedures, Prev: Include Files, Up: Program Structure
Line Numbers
============
An `N_SLINE' symbol represents the start of a source line. The desc
field contains the line number and the value contains the code address
for the start of that source line. On most machines the address is
absolute; for stabs in sections (*note Stab Sections::.), it is
relative to the function in which the `N_SLINE' symbol occurs.
GNU documents `N_DSLINE' and `N_BSLINE' symbols for line numbers in
the data or bss segments, respectively. They are identical to
`N_SLINE' but are relocated differently by the linker. They were
intended to be used to describe the source location of a variable
declaration, but I believe that GCC2 actually puts the line number in
the desc field of the stab for the variable itself. GDB has been
ignoring these symbols (unless they contain a string field) since at
least GDB 3.5.
For single source lines that generate discontiguous code, such as
flow of control statements, there may be more than one line number
entry for the same source line. In this case there is a line number
entry at the start of each code range, each with the same line number.
XCOFF does not use stabs for line numbers. Instead, it uses COFF
line numbers (which are outside the scope of this document). Standard
COFF line numbers cannot deal with include files, but in XCOFF this is
fixed with the `C_BINCL' method of marking include files (*note Include
Files::.).
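A debugger typically turns the `N_SLINE' entries into a table sorted by address, so that a program counter can be mapped back to a source line. The following sketch shows the idea; `build_line_table' and `line_for_address' are hypothetical names, and addresses are assumed to have already been made absolute.

```python
import bisect


def build_line_table(slines):
    """slines: iterable of (line_number, address) taken from N_SLINE stabs.

    Returns a list of (address, line_number) sorted by address. The same
    line number may appear several times when a line generates
    discontiguous code.
    """
    return sorted((address, line) for line, address in slines)


def line_for_address(table, pc):
    """Return the source line whose code range contains pc, or None."""
    addresses = [address for address, _ in table]
    # Find the last entry that starts at or before pc.
    i = bisect.bisect_right(addresses, pc) - 1
    if i < 0:
        return None
    return table[i][1]
```

For example, with entries for line 5 at two different addresses, any address at or after the second entry and before the next entry maps to line 5 again.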
File: stabs.info, Node: Procedures, Next: Nested Procedures, Prev: Line Numbers, Up: Program Structure
Procedures
==========
All of the following stabs normally use the `N_FUN' symbol type.
However, Sun's `acc' compiler on SunOS4 uses `N_GSYM' and `N_STSYM',
which means that the value of the stab for the function is useless and
the debugger must get the address of the function from the non-stab
symbols instead. On systems where non-stab symbols have leading
underscores, the stabs will lack underscores and the debugger needs to
know about the leading underscore to match up the stab and the non-stab
symbol. BSD Fortran is said to use `N_FNAME' with the same
restriction; the value of the symbol is not useful (I'm not sure it
really does use this, because GDB doesn't handle this and no one has
complained).
A function is represented by an `F' symbol descriptor for a global
(extern) function, and `f' for a static (local) function. For a.out,
the value of the symbol is the address of the start of the function; it
is already relocated. For stabs in ELF, the SunPRO compiler version
2.0.1 and GCC put out an address which gets relocated by the linker.
In a future release SunPRO is planning to put out zero, in which case
the address can be found from the ELF (non-stab) symbol. That approach
has drawbacks: looking things up in the ELF symbols would probably be
slow, I'm not sure how to find which symbol of that name is the right
one, and it doesn't provide any way to deal with nested functions. It
would probably be better to make the value of the stab an address
relative to the start of the file, or just absolute. See *Note ELF Linker
Relocation:: for more information on linker relocation of stabs in ELF
files. For XCOFF, the stab uses the `C_FUN' storage class and the
value of the stab is meaningless; the address of the function can be
found from the csect symbol (XTY_LD/XMC_PR).
The type information of the stab represents the return type of the
function; thus `foo:f5' means that foo is a function returning type 5.
There is no need to try to get the line number of the start of the
function from the stab for the function; it is in the next `N_SLINE'
symbol.
Some compilers (such as Sun's Solaris compiler) support an extension
for specifying the types of the arguments. I suspect this extension is
not used for old (non-prototyped) function definitions in C. If the
extension is in use, the type information of the stab for the function
is followed by type information for each argument, with each argument
preceded by `;'. An argument type of 0 means that additional arguments
are being passed, whose types and number may vary (`...' in ANSI C).
GDB has tolerated this extension (parsed the syntax, if not necessarily
used the information) since at least version 4.8; I don't know whether
all versions of dbx tolerate it. The argument types given here are not
redundant with the symbols for the formal parameters (*note
Parameters::.); they are the types of the arguments as they are passed,
before any conversions might take place. For example, if a C function
which is declared without a prototype takes a `float' argument, the
value is passed as a `double' but then converted to a `float'.
Debuggers need to use the types given in the arguments when printing
values, but when calling the function they need to use the types given
in the symbol defining the function.
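A function stab with the argument-type extension can be taken apart as follows. This is a sketch, with the hypothetical name `parse_function_stab'; it assumes a name without colons, plain type references, no nested-function scope specifier, and that the 0 marking variable arguments comes last. Type numbers are returned as text.

```python
def parse_function_stab(string):
    """Parse 'NAME:F<ret>[;<arg>...]' or 'NAME:f<ret>[;<arg>...]'."""
    name, _, rest = string.partition(":")
    descriptor, info = rest[:1], rest[1:]
    if descriptor not in ("F", "f"):
        raise ValueError("not a function stab: %r" % string)
    fields = info.split(";")
    return_type = fields[0]
    # Each argument type is preceded by ';'. Empty fields are ignored.
    arg_types = [f for f in fields[1:] if f != ""]
    # An argument type of 0 stands for '...' in ANSI C.
    varargs = bool(arg_types) and arg_types[-1] == "0"
    if varargs:
        arg_types = arg_types[:-1]
    return {
        "name": name,
        "global": descriptor == "F",   # 'F' is extern, 'f' is static
        "return_type": return_type,
        "arg_types": arg_types,
        "varargs": varargs,
    }
```

With the example from the text, `parse_function_stab("foo:f5")' reports a static function `foo' returning type 5 with no argument information.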
If the return type and types of arguments of a function which is
defined in another source file are specified (i.e., a function
prototype in ANSI C), traditionally compilers emit no stab; the only
way for the debugger to find the information is if the source file
where the function is defined was also compiled with debugging symbols.
As an extension the Solaris compiler uses symbol descriptor `P'
followed by the return type of the function, followed by the arguments,
each preceded by `;', as in a stab with symbol descriptor `f' or `F'.
This use of symbol descriptor `P' can be distinguished from its use for
register parameters (*note Register Parameters::.) by the fact that it
has symbol type `N_FUN'.
The AIX documentation also defines symbol descriptor `J' as an
internal function. I assume this means a function nested within another
function. It also says symbol descriptor `m' is a module in Modula-2
or extended Pascal.
Procedures (functions which do not return values) are represented as
functions returning the `void' type in C. I don't see why this couldn't
be used for all languages (inventing a `void' type for this purpose if
necessary), but the AIX documentation defines `I', `P', and `Q' for
internal, global, and static procedures, respectively. These symbol
descriptors are unusual in that they are not followed by type
information.
The following example shows a stab for a function `main' which
returns type number `1'. The `_main' specified for the value is a
reference to an assembler label which is used to fill in the start
address of the function.
     .stabs "main:F1",36,0,0,_main # 36 is N_FUN
The stab representing a procedure is located immediately following
the code of the procedure. This stab is in turn directly followed by a
group of other stabs describing elements of the procedure. These other
stabs describe the procedure's parameters, its block local variables,
and its block structure.
If functions can appear in different sections, then the debugger may
not be able to find the end of a function. Recent versions of GCC will
mark the end of a function with an `N_FUN' symbol with an empty string
for the name. The value is the address of the end of the current
function. Without such a symbol, there is no indication of the address
of the end of a function, and you must assume that it ended at the
starting address of the next function or at the end of the text section
for the program.
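The rule for finding the end of a function can be written down directly. In this sketch (`function_extents' is a hypothetical name) the input is the sequence of `N_FUN' stabs in address order as (string, value) pairs, an empty string is the end marker, and `text_end' is the end of the text section. Values are assumed to be absolute addresses, as the text above describes.

```python
def function_extents(n_fun_entries, text_end):
    """Return (name, start, end) for each function.

    A function ends at an N_FUN with an empty name if there is one,
    otherwise at the start of the next function, otherwise at text_end.
    """
    result = []
    current = None                      # (name, start) of the open function
    for string, value in n_fun_entries:
        if current is not None:
            # Either an explicit end marker or the next function's start.
            result.append((current[0], current[1], value))
            current = None
        if string != "":
            name = string.split(":", 1)[0]
            current = (name, value)
    if current is not None:
        result.append((current[0], current[1], text_end))
    return result
```

When functions live in different sections, the fallback to the next function's start is unreliable, which is why the explicit end marker matters.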
File: stabs.info, Node: Nested Procedures, Next: Block Structure, Prev: Procedures, Up: Program Structure
Nested Procedures
=================
For any of the symbol descriptors representing procedures, after the
symbol descriptor and the type information is optionally a scope
specifier. This consists of a comma, the name of the procedure, another
comma, and the name of the enclosing procedure. The first name is local
to the scope specified, and seems to be redundant with the name of the
symbol (before the `:'). This feature is used by GCC, and presumably
Pascal, Modula-2, etc., compilers, for nested functions.
If procedures are nested more than one level deep, only the
immediately containing scope is specified. For example, this code:
     int
     foo (int x)
     {
       int bar (int y)
         {
           int baz (int z)
             {
               return x + y + z;
             }
           return baz (x + 2 * y);
         }
       return x + bar (3 * x);
     }
produces the stabs:
     .stabs "baz:f1,baz,bar",36,0,0,_baz.15 # 36 is N_FUN
     .stabs "bar:f1,bar,foo",36,0,0,_bar.12
     .stabs "foo:F1",36,0,0,_foo
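Since each stab names only its immediately enclosing procedure, the full nesting has to be rebuilt by following the chain. A minimal sketch, with hypothetical names `parse_scope' and `nesting_path'; it keys procedures by name alone, which is a simplification because names need not be unique across scopes.

```python
def parse_scope(string):
    """Return (name, enclosing_name or None) from a procedure stab string."""
    name, _, rest = string.partition(":")
    parts = rest.split(",")
    # With a scope specifier the parts are: type info, local name, enclosing name.
    if len(parts) == 3:
        return name, parts[2]
    return name, None


def nesting_path(name, parents):
    """Follow the enclosing-procedure links outward; outermost comes first."""
    path = [name]
    while parents.get(path[-1]) is not None:
        path.append(parents[path[-1]])
    return list(reversed(path))


stabs = ["baz:f1,baz,bar", "bar:f1,bar,foo", "foo:F1"]
parents = dict(parse_scope(s) for s in stabs)
```

For the three stabs above, following the links from `baz' gives `foo', `bar', `baz', matching the source nesting.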
File: stabs.info, Node: Block Structure, Next: Alternate Entry Points, Prev: Nested Procedures, Up: Program Structure
Block Structure
===============
The program's block structure is represented by the `N_LBRAC' (left
brace) and the `N_RBRAC' (right brace) stab types. The variables
defined inside a block precede the `N_LBRAC' symbol for most compilers,
including GCC. Other compilers, such as the Convex, Acorn RISC
machine, and Sun `acc' compilers, put the variables after the `N_LBRAC'
symbol. The values of the `N_LBRAC' and `N_RBRAC' symbols are the
start and end addresses of the code of the block, respectively. For
most machines, they are relative to the starting address of this source
file. For the Gould NP1, they are absolute. For stabs in sections
(*note Stab Sections::.), they are relative to the function in which
they occur.
The `N_LBRAC' and `N_RBRAC' stabs that describe the block scope of a
procedure are located after the `N_FUN' stab that represents the
procedure itself.
Sun documents the desc field of `N_LBRAC' and `N_RBRAC' symbols as
containing the nesting level of the block. However, dbx seems not to
care, and GCC always sets desc to zero.
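Because the desc field cannot be relied on, a reader can recover the nesting level by counting brackets. The following C sketch does that over a sequence of stab types; the function name `max_block_depth' is hypothetical, and the numeric values 192 and 224 are the `N_LBRAC' and `N_RBRAC' values used in the examples in this document.

```c
#include <assert.h>
#include <stddef.h>

#define N_LBRAC 192
#define N_RBRAC 224

/* Return the deepest block nesting level found in TYPES, or -1 if the
   N_LBRAC and N_RBRAC stabs are not properly balanced. */
int
max_block_depth (const int *types, size_t n)
{
    int depth = 0, max = 0;
    size_t i;

    assert (types != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (types[i] == N_LBRAC) {
            if (++depth > max)
                max = depth;
        } else if (types[i] == N_RBRAC) {
            if (--depth < 0)
                return -1;            /* right brace with no left brace */
        }
    }
    return depth == 0 ? max : -1;     /* leftover left braces are an error */
}
```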
For XCOFF, block scope is indicated with `C_BLOCK' symbols. If the
name of the symbol is `.bb', then it is the beginning of the block; if
the name of the symbol is `.be', it is the end of the block.
File: stabs.info, Node: Alternate Entry Points, Prev: Block Structure, Up: Program Structure
Alternate Entry Points
======================
Some languages, like Fortran, have the ability to enter procedures at
some place other than the beginning. One can declare an alternate entry
point. The `N_ENTRY' stab is for this; however, the Sun FORTRAN
compiler doesn't use it. According to AIX documentation, only the name
of a `C_ENTRY' stab is significant; the address of the alternate entry
point comes from the corresponding external symbol. A previous
revision of this document said that the value of an `N_ENTRY' stab was
the address of the alternate entry point, but I don't know the source
for that information.
File: stabs.info, Node: Constants, Next: Variables, Prev: Program Structure, Up: Top
Constants
*********
The `c' symbol descriptor indicates that this stab represents a
constant. This symbol descriptor is an exception to the general rule
that symbol descriptors are followed by type information. Instead, it
is followed by `=' and one of the following:
`b VALUE'
Boolean constant. VALUE is a numeric value; I assume it is 0 for
false or 1 for true.
`c VALUE'
Character constant. VALUE is the numeric value of the constant.
`e TYPE-INFORMATION , VALUE'
Constant whose value can be represented as integral.
TYPE-INFORMATION is the type of the constant, as it would appear
after a symbol descriptor (*note String Field::.). VALUE is the
numeric value of the constant. GDB 4.9 does not actually get the
right value if VALUE does not fit in a host `int', but it does not
do anything violent, and future debuggers could be extended to
accept integers of any size (whether unsigned or not). This
constant type is usually documented as being only for enumeration
constants, but GDB has never imposed that restriction; I don't
know about other debuggers.
`i VALUE'
Integer constant. VALUE is the numeric value. The type is some
sort of generic integer type (for GDB, a host `int'); to specify
the type explicitly, use `e' instead.
`r VALUE'
Real constant. VALUE is the real value, which can be `INF'
(optionally preceded by a sign) for infinity, `QNAN' for a quiet
NaN (not-a-number), or `SNAN' for a signalling NaN. If it is a
normal number the format is that accepted by the C library function
`atof'.
`s STRING'
String constant. STRING is a string enclosed in either `'' (in
which case `'' characters within the string are represented as
`\'') or `"' (in which case `"' characters within the string are
represented as `\"').
`S TYPE-INFORMATION , ELEMENTS , BITS , PATTERN'
Set constant. TYPE-INFORMATION is the type of the constant, as it
would appear after a symbol descriptor (*note String Field::.).
ELEMENTS is the number of elements in the set (does this mean how
many bits of PATTERN are actually used, which would be redundant
with the type, or perhaps the number of bits set in PATTERN? I
don't get it), BITS is the number of bits in the constant (meaning
it specifies the length of PATTERN, I think), and PATTERN is a
hexadecimal representation of the set. AIX documentation refers
to a limit of 32 bytes, but I see no reason why this limit should
exist. This form could probably be used for arbitrary constants,
not just sets; the only catch is that PATTERN should be understood
to be target, not host, byte order and format.
The boolean, character, string, and set constants are not supported
by GDB 4.9, but it ignores them. GDB 4.8 and earlier gave an error
message and refused to read symbols from the file containing the
constants.
The above information is followed by `;'.
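As an illustration of the `i' form, here is a minimal C sketch that reads an integer constant out of a string field. The function name `parse_int_constant' and the sample names in the usage note are hypothetical. The sketch assumes that the letter is followed directly by VALUE with no intervening space, and it does not insist on the trailing `;'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse a string field of the form "NAME:c=iVALUE;".  On success store
   VALUE in *OUT and return 1; return 0 (leaving *OUT alone) if the string
   is not an integer constant stab. */
int
parse_int_constant (const char *stab, long *out)
{
    const char *p;
    char *end;
    long value;

    assert (stab != NULL && out != NULL);
    p = strchr (stab, ':');
    if (p == NULL || strncmp (p + 1, "c=i", 3) != 0)
        return 0;

    value = strtol (p + 4, &end, 10);  /* p + 4 is the start of VALUE */
    if (end == p + 4)
        return 0;                      /* no digits after `i' */
    *out = value;
    return 1;
}
```

For example, a string such as `MAX:c=i42;' would yield 42, while a real constant such as `pi:c=r3.14;' is rejected by this sketch.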
File: stabs.info, Node: Variables, Next: Types, Prev: Constants, Up: Top
Variables
*********
Different types of stabs describe the various ways that variables
can be allocated: on the stack, globally, in registers, in common
blocks, statically, or as arguments to a function.
* Menu:
* Stack Variables:: Variables allocated on the stack.
* Global Variables:: Variables used by more than one source file.
* Register Variables:: Variables in registers.
* Common Blocks:: Variables statically allocated together.
* Statics:: Variables local to one source file.
* Based Variables:: Fortran pointer based variables.
* Parameters:: Variables for arguments to functions.
File: stabs.info, Node: Stack Variables, Next: Global Variables, Up: Variables
Automatic Variables Allocated on the Stack
==========================================
If a variable's scope is local to a function and its lifetime is
only as long as that function executes (C calls such variables
"automatic"), it can be allocated in a register (*note Register
Variables::.) or on the stack.
Each variable allocated on the stack has a stab with the symbol
descriptor omitted. Since type information should begin with a digit,
`-', or `(', only those characters are precluded from being used for symbol
descriptors. However, the Acorn RISC machine (ARM) is said to get this
wrong: it puts out a mere type definition here, without the preceding
`TYPE-NUMBER='. This is a bad idea; there is no guarantee that type
descriptors are distinct from symbol descriptors. Stabs for stack
variables use the `N_LSYM' stab type, or `C_LSYM' for XCOFF.
The value of the stab is the offset of the variable within the local
variables. On most machines this is an offset from the frame pointer
and is negative. The location of the stab specifies which block it is
defined in; see *Note Block Structure::.
For example, the following C code:
int
main ()
{
int x;
}
produces the following stabs:
.stabs "main:F1",36,0,0,_main # 36 is N_FUN
.stabs "x:1",128,0,0,-12 # 128 is N_LSYM
.stabn 192,0,0,LBB2 # 192 is N_LBRAC
.stabn 224,0,0,LBE2 # 224 is N_RBRAC
*Note Procedures:: for more information on the `N_FUN' stab, and
*Note Block Structure:: for more information on the `N_LBRAC' and
`N_RBRAC' stabs.
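The rule that an omitted symbol descriptor is recognized by the first character of the type information can be sketched as follows. The function name `symbol_descriptor' is hypothetical, and the sketch assumes that the symbol name itself contains no `:'.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return the symbol descriptor character of a string field, 0 if the
   descriptor is omitted (the type information starts right after the
   colon), or -1 if the string has no colon or nothing after it. */
int
symbol_descriptor (const char *stab)
{
    const char *colon;
    unsigned char c;

    assert (stab != NULL);
    colon = strchr (stab, ':');
    if (colon == NULL || colon[1] == '\0')
        return -1;

    c = (unsigned char) colon[1];
    /* Type information begins with a digit, `-', or `('. */
    if (isdigit (c) || c == '-' || c == '(')
        return 0;
    return c;
}
```

For the stabs in the example above, `x:1' has the descriptor omitted, while `main:F1' has the descriptor `F'.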
File: stabs.info, Node: Global Variables, Next: Register Variables, Prev: Stack Variables, Up: Variables
Global Variables
================
A variable whose scope is not specific to just one source file is
represented by the `G' symbol descriptor. These stabs use the `N_GSYM'
stab type (`C_GSYM' for XCOFF). The type information for the stab (*note
String Field::.) gives the type of the variable.
For example, the following source code:
char g_foo = 'c';
yields the following assembly code:
.stabs "g_foo:G2",32,0,0,0 # 32 is N_GSYM
.global _g_foo
.data
_g_foo:
.byte 99
The address of the variable represented by the `N_GSYM' is not
contained in the `N_GSYM' stab. The debugger gets this information
from the external symbol for the global variable. In the example above,
the `.global _g_foo' and `_g_foo:' lines tell the assembler to produce
an external symbol.
Some compilers, like GCC, output `N_GSYM' stabs only once, where the
variable is defined. Other compilers, like SunOS4 /bin/cc, output an
`N_GSYM' stab for each compilation unit which references the variable.
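The lookup a debugger has to perform can be sketched in C as below. The `struct ext_symbol' table and the function name `global_address' are hypothetical stand-ins for whatever external symbol table the object format provides, and the leading underscore on external names is an assumption taken from the `_g_foo' example; not every target prepends one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry in the (non-debugging) external symbol table. */
struct ext_symbol {
    const char *name;
    unsigned long address;
};

/* Find the address of the global variable described by an N_GSYM string
   field such as "g_foo:G2".  Return 1 and set *ADDR on success, or 0 if
   the string is not a `G' stab or no external symbol matches. */
int
global_address (const char *stab, const struct ext_symbol *syms, size_t n,
                unsigned long *addr)
{
    const char *colon;
    size_t len, i;

    assert (stab != NULL && addr != NULL && (syms != NULL || n == 0));
    colon = strchr (stab, ':');
    if (colon == NULL || colon[1] != 'G')
        return 0;

    len = (size_t) (colon - stab);    /* length of the variable name */
    for (i = 0; i < n; i++) {
        const char *s = syms[i].name;
        /* Assumed convention: external name is `_' followed by the name. */
        if (s[0] == '_' && strlen (s + 1) == len
            && strncmp (s + 1, stab, len) == 0) {
            *addr = syms[i].address;
            return 1;
        }
    }
    return 0;
}
```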
File: stabs.info, Node: Register Variables, Next: Common Blocks, Prev: Global Variables, Up: Variables
Register Variables
==================
Register variables have their own stab type, `N_RSYM' (`C_RSYM' for
XCOFF), and their own symbol descriptor, `r'. The stab's value is the
number of the register where the variable data will be stored.
AIX defines a separate symbol descriptor `d' for floating point
registers. This seems unnecessary; why not just give floating
point registers different register numbers? I have not verified whether
the compiler actually uses `d'.
If the register is explicitly allocated to a global variable, but not
initialized, as in:
register int g_bar asm ("%g5");
then the stab may be emitted at the end of the object file, with the
other bss symbols.
File: stabs.info, Node: Common Blocks, Next: Statics, Prev: Register Variables, Up: Variables
Common Blocks
=============
A common block is a statically allocated section of memory which can
be referred to by several source files. It may contain several
variables. I believe Fortran is the only language with this feature.
An `N_BCOMM' stab begins a common block and an `N_ECOMM' stab ends
it. The only field that is significant in these two stabs is the
string, which names a normal (non-debugging) symbol that gives the
address of the common block. According to IBM documentation, only the
`N_BCOMM' has the name of the common block (even though their compiler
actually puts it both places).
The stabs for the members of the common block are between the
`N_BCOMM' and the `N_ECOMM'; the value of each stab is the offset
within the common block of that variable. IBM uses the `C_ECOML' stab
type, and there is a corresponding `N_ECOML' stab type, but Sun's
Fortran compiler uses `N_GSYM' instead. The variables within a common
block use the `V' symbol descriptor (I believe this is true of all
Fortran variables). Other stabs (at least type declarations using
`C_DECL') can also be between the `N_BCOMM' and the `N_ECOMM'.
File: stabs.info, Node: Statics, Next: Based Variables, Prev: Common Blocks, Up: Variables
Static Variables
================
Initialized static variables are represented by the `S' and `V'
symbol descriptors. `S' means file scope static, and `V' means
procedure scope static. One exception: in XCOFF, IBM's xlc compiler
always uses `V', and whether it is file scope or not is distinguished
by whether the stab is located within a function.
In a.out files, `N_STSYM' means the data section, `N_FUN' means the
text section, and `N_LCSYM' means the bss section. For those systems
with a read-only data section separate from the text section (Solaris),
`N_ROSYM' means the read-only data section.
For example, the source lines:
static const int var_const = 5;
static int var_init = 2;
static int var_noinit;
yield the following stabs:
.stabs "var_const:S1",36,0,0,_var_const # 36 is N_FUN
...
.stabs "var_init:S1",38,0,0,_var_init # 38 is N_STSYM
...
.stabs "var_noinit:S1",40,0,0,_var_noinit # 40 is N_LCSYM
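For a.out files, the mapping from stab type to section can be written as a small C function. The names `static_section' and `enum section' are hypothetical. The values 36, 38, and 40 are those shown in the example above; 44 (0x2c) is the value that GNU's `stab.def' assigns to `N_ROSYM'.

```c
#include <assert.h>

#define N_FUN   36  /* text section */
#define N_STSYM 38  /* data section */
#define N_LCSYM 40  /* bss section */
#define N_ROSYM 44  /* read-only data section (e.g. Solaris) */

enum section { SEC_UNKNOWN, SEC_TEXT, SEC_DATA, SEC_BSS, SEC_RODATA };

/* Map the stab type of an `S' or `V' stab in an a.out file to the section
   in which the static variable lives. */
enum section
static_section (int type)
{
    assert (type >= 0 && type <= 255);  /* stab types fit in one byte */
    switch (type) {
    case N_FUN:   return SEC_TEXT;
    case N_STSYM: return SEC_DATA;
    case N_LCSYM: return SEC_BSS;
    case N_ROSYM: return SEC_RODATA;
    default:      return SEC_UNKNOWN;
    }
}
```

In the example above, `var_const' maps to the text section, `var_init' to the data section, and `var_noinit' to the bss section.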
In XCOFF files, the stab type need not indicate the section;
`C_STSYM' can be used for all statics. Also, each static variable is
enclosed in a static block. A `C_BSTAT' (emitted with a `.bs'
assembler directive) symbol begins the static block; its value is the
symbol number of the csect symbol whose value is the address of the
static block, its section is the section of the variables in that
static block, and its name is `.bs'. A `C_ESTAT' (emitted with a `.es'
assembler directive) symbol ends the static block; its name is `.es'
and its value and section are ignored.
In ECOFF files, the storage class is used to specify the section, so
the stab type need not indicate the section.
In ELF files, for the SunPRO compiler version 2.0.1, symbol
descriptor `S' means that the address is absolute (the linker relocates
it) and symbol descriptor `V' means that the address is relative to the
start of the relevant section for that compilation unit. SunPRO has
plans to have the linker stop relocating stabs; I suspect that their
debugger gets the address from the corresponding ELF (not stab) symbol.
I'm not sure how to find which symbol of that name is the right one.
The clean way to do all this would be to have the value of a symbol
descriptor `S' symbol be an offset relative to the start of the file,
just like everything else, but that introduces obvious compatibility
problems. For more information on linker stab relocation, *Note ELF
Linker Relocation::.
File: stabs.info, Node: Based Variables, Next: Parameters, Prev: Statics, Up: Variables
Fortran Based Variables
=======================
Fortran (at least, the Sun and SGI dialects of FORTRAN-77) has a
feature which allows allocating arrays with `malloc', but which avoids
blurring the line between arrays and pointers the way that C does. In
stabs such a variable uses the `b' symbol descriptor.
For example, the Fortran declarations
real foo, foo10(10), foo10_5(10,5)
pointer (foop, foo)
pointer (foo10p, foo10)
pointer (foo105p, foo10_5)
produce the stabs
foo:b6
foo10:bar3;1;10;6
foo10_5:bar3;1;5;ar3;1;10;6
In this example, `real' is type 6 and type 3 is an integral type
which is the type of the subscripts of the array (probably `integer').
The `b' symbol descriptor is like `V' in that it denotes a
statically allocated symbol whose scope is local to a function; see
*Note Statics::. The value of the symbol, instead of being the address
of the variable itself, is the address of a pointer to that variable.
So in the above example, the value of the `foo' stab is the address of
a pointer to a real, the value of the `foo10' stab is the address of a
pointer to a 10-element array of reals, and the value of the `foo10_5'
stab is the address of a pointer to a 5-element array of 10-element
arrays of reals.
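The double indirection can be pictured with a C analogue. This is only an analogy under stated assumptions: `real' is taken to correspond to C `float', the names `foo10_5_type' and `based_element' are hypothetical, and a real debugger would read the target's memory rather than dereference pointers in its own address space.

```c
#include <assert.h>
#include <stddef.h>

/* C analogue of the type pointed to for foo10_5: a 5-element array of
   10-element arrays of reals. */
typedef float foo10_5_type[5][10];

/* SYMBOL_VALUE plays the role of the stab's value: the address of a
   pointer to the array.  Fetch the Fortran element foo10_5(i, j), with
   1-based subscripts and the first subscript varying fastest. */
float
based_element (foo10_5_type *const *symbol_value, int i, int j)
{
    foo10_5_type *target;

    assert (symbol_value != NULL && *symbol_value != NULL);
    assert (i >= 1 && i <= 10 && j >= 1 && j <= 5);

    target = *symbol_value;           /* first load: the pointer itself */
    return (*target)[j - 1][i - 1];   /* second load: the array element */
}
```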
File: stabs.info, Node: Parameters, Prev: Based Variables, Up: Variables
Parameters
==========
Formal parameters to a function are represented by a stab (or
sometimes two; see below) for each parameter. The stabs are in the
order in which the debugger should print the parameters (i.e., the
order in which the parameters are declared in the source file). The
exact form of the stab depends on how the parameter is being passed.
Parameters passed on the stack use the symbol descriptor `p' and the
`N_PSYM' symbol type (or `C_PSYM' for XCOFF). The value of the symbol
is an offset used to locate the parameter on the stack; its exact
meaning is machine-dependent, but on most machines it is an offset from
the frame pointer.
As a simple example, the code:
main (argc, argv)
int argc;
char **argv;
produces the stabs:
.stabs "main:F1",36,0,0,_main # 36 is N_FUN
.stabs "argc:p1",160,0,0,68 # 160 is N_PSYM
.stabs "argv:p20=*21=*2",160,0,0,72
The type definition of `argv' is interesting because it contains
several type definitions. Type 21 is pointer to type 2 (char) and
`argv' (type 20) is pointer to type 21.
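To make the chain of definitions concrete, the following C sketch walks type information such as `20=*21=*2' and reports how many pointer levels it contains and which type number it finally refers to. The function name `pointer_depth' is hypothetical, and the sketch assumes plain integer type numbers and handles only the `*' (pointer) type descriptor.

```c
#include <assert.h>
#include <stdlib.h>

/* Walk type information consisting of nested pointer definitions.
   Return the number of pointer levels and store the final type number in
   *BASE, or return -1 if TYPE does not start with a type number.  The
   walk stops at the first definition that is not a pointer. */
int
pointer_depth (const char *type, long *base)
{
    int depth = 0;
    char *end;

    assert (type != NULL && base != NULL);
    for (;;) {
        long num = strtol (type, &end, 10);
        if (end == type)
            return -1;                /* expected a type number */
        if (end[0] == '=' && end[1] == '*') {
            depth++;                  /* NUM is defined as pointer to ... */
            type = end + 2;
            continue;
        }
        *base = num;
        return depth;
    }
}
```

For `20=*21=*2' this yields two pointer levels ending at type 2, which matches `char **argv'.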
The following symbol descriptors are also said to go with `N_PSYM'.
The value of the symbol is said to be an offset from the argument
pointer (I'm not sure whether this is true or not).
pP (<<??>>)
pF Fortran function parameter
X (function result variable)
* Menu:
* Register Parameters::
* Local Variable Parameters::
* Reference Parameters::
* Conformant Arrays::
File: stabs.info, Node: Register Parameters, Next: Local Variable Parameters, Up: Parameters
Passing Parameters in Registers
-------------------------------
If the parameter is passed in a register, then traditionally there
are two symbols for each argument:
.stabs "arg:p1" . . . ; N_PSYM
.stabs "arg:r1" . . . ; N_RSYM
Debuggers use the second one to find the value, and the first one to
know that it is an argument.
Because that approach is kind of ugly, some compilers use symbol
descriptor `P' or `R' to indicate an argument which is in a register.
Symbol type `C_RPSYM' is used in XCOFF and `N_RSYM' is used otherwise.
The symbol's value is the register number. `P' and `R' mean the same
thing; the difference is that `P' is a GNU invention and `R' is an IBM
(XCOFF) invention. As of version 4.9, GDB should handle either one.
There is at least one case where GCC uses a `p' and `r' pair rather
than `P'; this is where the argument is passed in the argument list and
then loaded into a register.
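A reader that has to cope with both conventions might look roughly like the C sketch below. The `struct stab' layout and the function name `register_parameter' are hypothetical. The values 160 and 64 are the `N_PSYM' and `N_RSYM' values used in the examples in this document; the XCOFF `C_RPSYM' case is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_RSYM 64
#define N_PSYM 160

/* Hypothetical, simplified view of one stab. */
struct stab {
    int type;
    const char *string;
    long value;
};

/* Nonzero if S describes NAME with symbol descriptor DESC. */
static int
matches (const struct stab *s, const char *name, char desc)
{
    size_t len = strlen (name);
    return strncmp (s->string, name, len) == 0
           && s->string[len] == ':' && s->string[len + 1] == desc;
}

/* If NAME is a parameter that lives in a register, store the register
   number in *REGNO and return 1.  Accepts a single `P' or `R' stab, or a
   `p' stab followed later by an `r' stab of the same name. */
int
register_parameter (const struct stab *stabs, size_t n, const char *name,
                    long *regno)
{
    int is_param = 0;
    size_t i;

    assert (stabs != NULL && name != NULL && regno != NULL);
    for (i = 0; i < n; i++) {
        const struct stab *s = &stabs[i];
        if (s->type == N_PSYM && matches (s, name, 'p'))
            is_param = 1;             /* first of a pair: "it is an argument" */
        else if (s->type == N_RSYM
                 && (matches (s, name, 'P') || matches (s, name, 'R')
                     || (is_param && matches (s, name, 'r')))) {
            *regno = s->value;        /* the value is the register number */
            return 1;
        }
    }
    return 0;
}
```

An `r' stab with no preceding `p' stab of the same name is treated as an ordinary register variable, not a parameter.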
According to the AIX documentation, symbol descriptor `D' is for a
parameter passed in a floating point register. This seems
unnecessary--why not just use `R' with a register number which
indicates that it's a floating point register? I haven't verified
whether the system actually does what the documentation indicates.
On the sparc and hppa, for a `P' symbol whose type is a structure or
union, the register contains the address of the structure. On the
sparc, this is also true of a `p' and `r' pair (using Sun `cc') or a
`p' symbol. However, if a (small) structure is really in a register,
`r' is used. And, to top it all off, on the hppa it might be a
structure which was passed on the stack and loaded into a register and
for which there is a `p' and `r' pair! I believe that symbol
descriptor `i' is supposed to deal with this case (it is said to mean
"value parameter by reference, indirect access"; I don't know the
source for this information), but I don't know details or what
compilers or debuggers use it, if any (not GDB or GCC). It is not
clear to me whether this case needs to be dealt with differently than
parameters passed by reference (*note Reference Parameters::.).
File: stabs.info, Node: Local Variable Parameters, Next: Reference Parameters, Prev: Register Parameters, Up: Parameters
Storing Parameters as Local Variables
-------------------------------------
There is a case similar to an argument in a register, which is an
argument that is actually stored as a local variable. Sometimes this
happens when the argument was passed in a register and then the compiler
stores it as a local variable. If possible, the compiler should claim
that it's in a register, but this isn't always done.
If a parameter is passed as one type and converted to a smaller type
by the prologue (for example, the parameter is declared as a `float',
but the calling conventions specify that it is passed as a `double'),
then GCC2 (sometimes) uses a pair of symbols. The first symbol uses
symbol descriptor `p' and the type which is passed. The second symbol
has the type and location which the parameter actually has after the
prologue. For example, suppose the following C code appears with no
prototypes involved:
void
subr (f)
float f;
{
if `f' is passed as a double at stack offset 8, and the prologue
converts it to a float in register number 0, then the stabs look like:
.stabs "f:p13",160,0,3,8 # 160 is `N_PSYM', here 13 is `double'
.stabs "f:r12",64,0,3,0 # 64 is `N_RSYM', here 12 is `float'
In both stabs 3 is the line number where `f' is declared (*note Line
Numbers::.).
GCC, at least on the 960, has another solution to the same problem.
It uses a single `p' symbol descriptor for an argument which is stored
as a local variable but uses `N_LSYM' instead of `N_PSYM'. In this
case, the value of the symbol is an offset relative to the local
variables for that function, not relative to the arguments; on some
machines those are the same thing, but not on all.
On the VAX or on other machines in which the calling convention
includes the number of words of arguments actually passed, the debugger
(GDB at least) uses the parameter symbols to keep track of whether it
needs to print nameless arguments in addition to the formal parameters
which it has printed because each one has a stab. For example, in
extern int fprintf (FILE *stream, char *format, ...);
...
fprintf (stdout, "%d\n", x);
there are stabs for `stream' and `format'. On most machines, the
debugger can only print those two arguments (because it has no way of
knowing that additional arguments were passed), but on the VAX or other
machines with a calling convention which indicates the number of words
of arguments, the debugger can print all three arguments. To do so,
the parameter symbol (symbol descriptor `p') (not necessarily `r' or
symbol descriptor omitted symbols) needs to contain the actual type as
passed (for example, `double' not `float' if it is passed as a double
and converted to a float).
File: stabs.info, Node: Reference Parameters, Next: Conformant Arrays, Prev: Local Variable Parameters, Up: Parameters
Passing Parameters by Reference
-------------------------------
If the parameter is passed by reference (e.g., Pascal `VAR'
parameters), then the symbol descriptor is `v' if it is in the argument
list, or `a' if it is in a register. Other than the fact that these
contain the address of the parameter rather than the parameter itself,
they are identical to `p' and `R', respectively. I believe `a' is an
AIX invention; `v' is supported by all stabs-using systems as far as I
know.
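The relationship between the four descriptors `p', `P' (or `R'), `v', and `a' can be summarized in a small C sketch. The names `struct param_kind' and `classify_parameter' are hypothetical, and other parameter descriptors (such as the AIX `D') are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* How a parameter is passed, as far as its symbol descriptor tells us. */
struct param_kind {
    int by_reference;   /* the value is the address of the parameter */
    int in_register;    /* passed in a register, not the argument list */
};

/* Classify a parameter symbol descriptor.  Return 1 and fill in *KIND for
   `p', `P', `R', `v', and `a'; return 0 for anything else. */
int
classify_parameter (char desc, struct param_kind *kind)
{
    assert (kind != NULL);
    switch (desc) {
    case 'p':
        kind->by_reference = 0; kind->in_register = 0; return 1;
    case 'P':
    case 'R':
        kind->by_reference = 0; kind->in_register = 1; return 1;
    case 'v':
        kind->by_reference = 1; kind->in_register = 0; return 1;
    case 'a':
        kind->by_reference = 1; kind->in_register = 1; return 1;
    default:
        return 0;
    }
}
```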