Diffstat (limited to 'gcc/ada/exp_dbug.ads')
-rw-r--r-- | gcc/ada/exp_dbug.ads | 1428 |
1 files changed, 1428 insertions, 0 deletions
diff --git a/gcc/ada/exp_dbug.ads b/gcc/ada/exp_dbug.ads new file mode 100644 index 00000000000..5351ea71b87 --- /dev/null +++ b/gcc/ada/exp_dbug.ads @@ -0,0 +1,1428 @@ +------------------------------------------------------------------------------ +-- -- +-- GNAT COMPILER COMPONENTS -- +-- -- +-- E X P _ D B U G -- +-- -- +-- S p e c -- +-- -- +-- $Revision: 1.74 $ +-- -- +-- Copyright (C) 1996-2001 Free Software Foundation, Inc. -- +-- -- +-- GNAT is free software; you can redistribute it and/or modify it under -- +-- terms of the GNU General Public License as published by the Free Soft- -- +-- ware Foundation; either version 2, or (at your option) any later ver- -- +-- sion. GNAT is distributed in the hope that it will be useful, but WITH- -- +-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY -- +-- or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License -- +-- for more details. You should have received a copy of the GNU General -- +-- Public License distributed with GNAT; see file COPYING. If not, write -- +-- to the Free Software Foundation, 59 Temple Place - Suite 330, Boston, -- +-- MA 02111-1307, USA. -- +-- -- +-- GNAT was originally developed by the GNAT team at New York University. -- +-- It is now maintained by Ada Core Technologies Inc (http://www.gnat.com). -- +-- -- +------------------------------------------------------------------------------ + +-- Expand routines for generation of special declarations used by the +-- debugger. In accordance with the Dwarf 2.2 specification, certain +-- type names are encoded to provide information to the debugger. 
+ +with Sinfo; use Sinfo; +with Types; use Types; +with Uintp; use Uintp; +with Get_Targ; use Get_Targ; + +package Exp_Dbug is + + ----------------------------------------------------- + -- Encoding and Qualification of Names of Entities -- + ----------------------------------------------------- + + -- This section describes how the names of entities are encoded in + -- the generated debugging information. + + -- An entity in Ada has a name of the form X.Y.Z ... E where X,Y,Z + -- are the enclosing scopes (not including Standard at the start). + + -- The encoding of the name follows this basic qualified naming scheme, + -- where the encoding of individual entity names is as described in + -- Namet (i.e. in particular names present in the original source are + -- folded to all lower case, with upper half and wide characters encoded + -- as described in Namet). Upper case letters are used only for entities + -- generated by the compiler. + + -- There are two cases: global entities and local entities. In more + -- formal terms, local entities are those which have a dynamic enclosing + -- scope, and global entities are at the library level, except that we + -- always consider procedures to be global entities, even if they are + -- nested (that's because at the debugger level a procedure name refers + -- to the code, and the code is indeed a global entity, including the + -- case of nested procedures.) In addition, we also consider all types + -- to be global entities, even if they are defined within a procedure. + + -- The reason for treating all type names as global entities is + -- that a number of our type encodings work by having related type + -- names, and we need the full qualification to keep these unique. + + -- For global entities, the encoded name includes all components of the + -- fully expanded name (but omitting Standard at the start).
For example, + -- if a library level child package P.Q has an embedded package R, and + -- there is an entity in this embedded package whose name is S, the encoded + -- name will include the components p.q.r.s. + + -- For local entities, the encoded name only includes the components + -- up to the enclosing dynamic scope (other than a block). At run time, + -- such a dynamic scope is a subprogram, and the debugging formats know + -- about local variables of procedures, so it is not necessary to have + -- full qualification for such entities. In particular this means that + -- direct local variables of a procedure are not qualified. + + -- As an example of the local name convention, consider a procedure V.W + -- with a local variable X, and a nested block Y containing an entity + -- Z. The fully qualified names of the entities X and Z are: + + -- V.W.X + -- V.W.Y.Z + + -- but since V.W is a subprogram, the encoded names will end up + -- encoding only + + -- x + -- y.z + + -- The separating dots are translated into double underscores. + + -- Note: there is one exception, which is that on IRIX, for workshop + -- back compatibility, dots are retained as dots. In the rest of this + -- document we assume the double underscore encoding. + + ----------------------------- + -- Handling of Overloading -- + ----------------------------- + + -- The above scheme is incomplete with respect to overloaded + -- subprograms, since overloading can legitimately result in a + -- case of two entities with exactly the same fully qualified names. + -- To distinguish between entries in a set of overloaded subprograms, + -- the encoded names are serialized by adding one of the two suffixes: + + -- $nn (dollar sign) + -- __nn (two underscores) + + -- where nn is a serial number (1 for the first overloaded function, + -- 2 for the second, etc.). The former suffix is used when a dollar + -- sign is a valid symbol on the target machine and the latter is + -- used when it is not.
No suffix need appear on the encoding of + -- the first overloading of a subprogram. + + -- These names are prefixed by the normal full qualification. So + -- for example, the third instance of the subprogram qrs in package + -- yz would have one of the two names: + + -- yz__qrs$3 + -- yz__qrs__3 + + -- The serial number always appears at the end as shown, even in the + -- case of subprograms nested inside overloaded subprograms, and only + -- when the named subprogram is overloaded. For example, consider + -- the following situation: + + -- package body Yz is + -- procedure Qrs is -- Encoded name is yz__qrs + -- procedure Tuv is ... end; -- Encoded name is yz__qrs__tuv + -- begin ... end Qrs; + + -- procedure Qrs (X: Integer) is -- Encoded name is yz__qrs__2 + -- procedure Tuv is ... end; -- Encoded name is yz__qrs__tuv + -- -- (not yz__qrs__2__tuv). + -- procedure Tuv (X: INTEGER) is -- Encoded name is yz__qrs__tuv__2 + -- begin ... end Tuv; + + -- procedure Tuv (X: INTEGER) is -- Encoded name is yz__qrs__tuv__3 + -- begin ... end Tuv; + -- begin ... end Qrs; + -- end Yz; + + -- This example also serves to illustrate a case in which the + -- debugging data are currently ambiguous. The two parameterless + -- versions of Yz.Qrs.Tuv have the same encoded names in the + -- debugging data. However, the actual external symbols (which + -- linkers use to resolve references) will be modified with + -- an additional suffix so that they do not clash. Thus, there will + -- be cases in which the name of a function shown in the debugging + -- data differs from that function's "official" external name, and + -- in which several different functions have exactly the same name + -- as far as the debugger is concerned.
We don't consider this too + -- much of a problem, since the only way the user has of referring + -- to these functions by name is, in fact, Yz.Qrs.Tuv, so that the + -- reference is inherently ambiguous from the user's perspective, + -- regardless of internal encodings (in these cases, the debugger + -- can provide a menu of options to allow the user to disambiguate). + + -------------------- + -- Operator Names -- + -------------------- + + -- The above rules applied to operator names would result in names + -- with quotation marks, which are not typically allowed by assemblers + -- and linkers, and even if allowed would be odd and hard to deal with. + -- To avoid this problem, operator names are encoded as follows: + + -- Oabs abs + -- Oand and + -- Omod mod + -- Onot not + -- Oor or + -- Orem rem + -- Oxor xor + -- Oeq = + -- One /= + -- Olt < + -- Ole <= + -- Ogt > + -- Oge >= + -- Oadd + + -- Osubtract - + -- Oconcat & + -- Omultiply * + -- Odivide / + -- Oexpon ** + + -- These names are prefixed by the normal full qualification, and + -- suffixed by the overloading identification. So for example, the + -- second operator "=" defined in package Extra.Messages would + -- have the name: + + -- extra__messages__Oeq__2 + + ---------------------------------- + -- Resolving Other Name Clashes -- + ---------------------------------- + + -- It might be thought that the above scheme is complete, but in Ada 95, + -- full qualification is insufficient to uniquely identify an entity + -- in the program, even if it is not an overloaded subprogram. There + -- are two possible confusions: + + -- a.b + + -- interpretation 1: entity b in body of package a + -- interpretation 2: child procedure b of package a + + -- a.b.c + + -- interpretation 1: entity c in child package a.b + -- interpretation 2: entity c in nested package b in body of a + + -- It is perfectly valid in both cases for both interpretations to + -- be valid within a single program. 
This is a bit of a surprise since + -- certainly in Ada 83, full qualification was sufficient, but not in + -- Ada 95. The result is that the above scheme can result in duplicate + -- names. This would not be so bad if the effect were just restricted + -- to debugging information, but in fact in both the above cases, it + -- is possible for both symbols to be external names, and so we have + -- a real problem of name clashes. + + -- To deal with this situation, we provide two additional encoding + -- rules for names: + + -- First: all library subprogram names are preceded by the string + -- _ada_ (which causes no duplications, since normal Ada names can + -- never start with an underscore). This not only solves the first + -- case of duplication, but also solves another pragmatic problem + -- which is that otherwise Ada procedures can generate names that + -- clash with existing system function names. Most notably, we can + -- have clashes in the case of procedure Main with the C main that + -- in some systems is always present. + + -- Second, for the case where nested packages declared in package + -- bodies can cause trouble, we add a suffix which shows which + -- entities in the list are body-nested packages, i.e. packages + -- whose spec is within a package body. The rules are as follows, + -- given a list of names in a qualified name name1.name2.... + + -- If none are body-nested package entities, then there is no suffix. + + -- If at least one is a body-nested package entity, then the suffix + -- is X followed by a string of b's and n's (b = body-nested package + -- entity, n = not a body-nested package). + + -- There is one element in this string for each entity in the encoded + -- expanded name except the first (the rules are such that the first + -- entity of the encoded expanded name can never be a body-nested + -- package). Trailing n's are omitted, as is the last b (there must + -- be at least one b, or we would not be generating a suffix at all).
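The suffix rule just stated can be sketched in a few lines of Ada. This is only an illustration of the rule as written, not code from Exp_Dbug: the type Flag_List and the function Suffix are hypothetical names, and the caller is assumed to supply one Boolean per entity of the qualified name after the first (True for a body-nested package entity).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure BNPE_Suffix_Demo is

   --  Hypothetical helper type (not part of Exp_Dbug): one flag for each
   --  entity of the qualified name except the first. True marks a
   --  body-nested package entity ("b"), False anything else ("n").
   type Flag_List is array (Positive range <>) of Boolean;

   function Suffix (Flags : Flag_List) return String is
      Last_B : Natural := 0;
   begin
      --  Locate the last "b". Everything after it is a trailing "n"
      --  and is omitted, and the last "b" itself is omitted too.
      for J in Flags'Range loop
         if Flags (J) then
            Last_B := J;
         end if;
      end loop;

      --  No body-nested package entity at all: no suffix
      if Last_B = 0 then
         return "";
      end if;

      declare
         Result : String (1 .. Last_B - Flags'First);
      begin
         for J in Flags'First .. Last_B - 1 loop
            if Flags (J) then
               Result (J - Flags'First + 1) := 'b';
            else
               Result (J - Flags'First + 1) := 'n';
            end if;
         end loop;

         return "X" & Result;
      end;
   end Suffix;

begin
   --  x.y.m2 where y is body-nested: flags for y, m2
   Put_Line ("x__y__m2" & Suffix ((True, False)));          --  x__y__m2X
   --  x.y.z.r where y and z are body-nested: flags for y, z, r
   Put_Line ("x__y__z__r" & Suffix ((True, True, False)));  --  x__y__z__rXb
   --  x.m1 with no body-nested package: flag for m1
   Put_Line ("x__m1" & Suffix ((1 => False)));              --  x__m1
end BNPE_Suffix_Demo;
```

Dropping the last b is what makes the common case (a single body-nested package) produce a bare X.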
+ + -- For example, suppose we have + + -- package x is + -- pragma Elaborate_Body; + -- m1 : integer; -- #1 + -- end x; + + -- package body x is + -- package y is m2 : integer; end y; -- #2 + -- package body y is + -- package z is r : integer; end z; -- #3 + -- end; + -- m3 : integer; -- #4 + -- end x; + + -- package x.y is + -- pragma Elaborate_Body; + -- m2 : integer; -- #5 + -- end x.y; + + -- package body x.y is + -- m3 : integer; -- #6 + -- procedure j is -- #7 + -- package k is + -- z : integer; -- #8 + -- end k; + -- begin + -- null; + -- end j; + -- end x.y; + + -- procedure x.m3 is begin null; end; -- #9 + + -- Then the encodings would be: + + -- #1. x__m1 (no BNPE's in sight) + -- #2. x__y__m2X (y is a BNPE) + -- #3. x__y__z__rXb (y is a BNPE, so is z) + -- #4. x__m3 (no BNPE's in sight) + -- #5. x__y__m2 (no BNPE's in sight) + -- #6. x__y__m3 (no BNPE's in sight) + -- #7. x__y__j (no BNPE's in sight) + -- #8. k__z (no BNPE's, only up to procedure) + -- #9. _ada_x__m3 (library level subprogram) + + -- Note that we have instances here of both kinds of potential name + -- clashes, and the above examples show how the encodings avoid the + -- clash as follows: + + -- Lines #4 and #9 both refer to the entity x.m3, but #9 is a library + -- level subprogram, so it is preceded by the string _ada_ which acts + -- to distinguish it from the package body entity. + + -- Lines #2 and #5 both refer to the entity x.y.m2, but the first + -- instance is inside the body-nested package y, so there is an X + -- suffix to distinguish it from the child library entity. + + -- Note that enumeration literals never need Xb type suffixes, since + -- they are never referenced using global external names. + + --------------------- + -- Interface Names -- + --------------------- + + -- Note: if an interface name is present, then the external name + -- is taken from the specified interface name.
Given the current + -- limitations of the gcc backend, this means that the debugging + -- name is also set to the interface name, but conceptually, it + -- would be possible (and indeed desirable) to have the debugging + -- information still use the Ada name as qualified above, so we + -- still fully qualify the name in the front end. + + ------------------------------------- + -- Encodings Related to Task Types -- + ------------------------------------- + + -- Each task object defined by a single task declaration is associated + -- with a prefix that is used to qualify procedures defined in that + -- task. Given + -- + -- package body P is + -- task body TaskObj is + -- procedure F1 is ... end; + -- begin + -- B; + -- end TaskObj; + -- end P; + -- + -- The name of subprogram TaskObj.F1 is encoded as p__taskobjTK__f1. + -- The body, B, is contained in a subprogram whose name is + -- p__taskobjTKB. + + ------------------------------------------ + -- Encodings Related to Protected Types -- + ------------------------------------------ + + -- Each protected type has an associated record type, that describes + -- the actual layout of the private data. In addition to the private + -- components of the type, the Corresponding_Record_Type includes one + -- component of type Protection, which is the actual lock structure. + -- The run-time size of the protected type is the size of the corres- + -- ponding record. + + -- For a protected type prot, the Corresponding_Record_Type is encoded + -- as protV. + + -- The operations of a protected type are encoded as follows: each + -- operation results in two subprograms, a locking one that is called + -- from outside of the object, and a non-locking one that is used for + -- calls from other operations on the same object. The locking operation + -- simply acquires the lock, and then calls the non-locking version.
+ -- The names of all of these have a prefix constructed from the name + -- of the type, the string "PT", and a suffix which is P + -- or N, depending on whether this is the protected (locking) or non-locking + -- version of the operation. + + -- Given the declaration: + + -- protected type lock is + -- function get return integer; + -- procedure set (x: integer); + -- private + -- value : integer := 0; + -- end lock; + + -- the following operations are created: + + -- lockPT_getN + -- lockPT_getP + -- lockPT_setN + -- lockPT_setP + + ---------------------------------------------------- + -- Conversion between Entities and External Names -- + ---------------------------------------------------- + + No_Dollar_In_Label : constant Boolean := Get_No_Dollar_In_Label; + -- True iff the target does not allow dollar signs ("$") in external names + + procedure Get_External_Name + (Entity : Entity_Id; + Has_Suffix : Boolean); + -- Set Name_Buffer and Name_Len to the external name of Entity. + -- The external name is the Interface_Name, if specified, unless + -- the entity has an address clause or a suffix. + -- + -- If the Interface_Name is not present, or not used, the external name + -- is the concatenation of: + -- + -- - the string "_ada_", if the entity is a library subprogram, + -- - the names of any enclosing scopes, each followed by "__" + -- (or "X_" if the next entity is a subunit) + -- - the name of the entity + -- - the string "$" (or "__" if target does not allow "$"), followed + -- by homonym number, if the entity is an overloaded subprogram + + procedure Get_External_Name_With_Suffix + (Entity : Entity_Id; + Suffix : String); + -- Set Name_Buffer and Name_Len to the external name of Entity.
+ -- If Suffix is the empty string the external name is as above, + -- otherwise the external name is the concatenation of: + -- + -- - the string "_ada_", if the entity is a library subprogram, + -- - the names of any enclosing scopes, each followed by "__" + -- (or "X_" if the next entity is a subunit) + -- - the name of the entity + -- - the string "$" (or "__" if target does not allow "$"), followed + -- by homonym number, if the entity is an overloaded subprogram + -- - the string "___" followed by Suffix + + function Get_Entity_Id (External_Name : String) return Entity_Id; + -- Find entity in current compilation unit, which has the given + -- External_Name. + + ---------------------------- + -- Debug Name Compression -- + ---------------------------- + + -- The full qualification of names can lead to long names, and this + -- section describes the method used to compress these names. Such + -- compression is attempted if one of the following holds: + + -- The length exceeds a maximum set in hostparm, currently set + -- to 128, but can be changed as needed. + + -- The compiler switch -gnatC is set, setting the Compress_Debug_Names + -- switch in Opt to True. + + -- If either of these conditions holds, name compression is attempted + -- by replacing the qualifying section as follows. + + -- Given a name of the form + + -- a__b__c__d + + -- where a,b,c,d are arbitrary strings not containing a sequence + -- of exactly two underscores, the name is rewritten as: + + -- XC????????_d + + -- where ???????? are 8 hex digits representing a 32-bit checksum + -- value that identifies the sequence of compressed names. In + -- addition a dummy type declaration is generated as shown by + -- the following example.
Suppose we have three compression + -- sequences + + -- XC1234abcd corresponding to a__b__c__ prefix + -- XCabcd1234 corresponding to a__b__ prefix + -- XCab1234cd corresponding to a__ prefix + + -- then an enumeration type declaration is generated: + + -- type XC is + -- (XC1234abcdXnn, aXnn, bXnn, cXnn, + -- XCabcd1234Xnn, aXnn, bXnn, + -- XCab1234cdXnn, aXnn); + + -- showing the meaning of each compressed prefix, so the debugger + -- can interpret the exact sequence of names that correspond to the + -- compressed sequence. The Xnn suffixes in the above are simply + -- serial numbers that are guaranteed to be different to ensure + -- that all names are unique, and are otherwise ignored. + + -------------------------------------------- + -- Subprograms for Handling Qualification -- + -------------------------------------------- + + procedure Qualify_Entity_Names (N : Node_Id); + -- Given a node N, that represents a block, subprogram body, or package + -- body or spec, or protected or task type, sets a fully qualified name + -- for the defining entity of given construct, and also sets fully + -- qualified names for all enclosed entities of the construct (using + -- First_Entity/Next_Entity). Note that the actual modification of the + -- names is postponed till a subsequent call to Qualify_All_Entity_Names. + -- Note: this routine does not deal with prepending _ada_ to library + -- subprogram names. The reason for this is that we only prepend _ada_ + -- to the library entity itself, and not to names built from this name. + + procedure Qualify_All_Entity_Names; + -- When Qualify_Entity_Names is called, no actual name changes are made, + -- i.e. the actual calls to Qualify_Entity_Name are deferred until a call + -- is made to this procedure. The reason for this deferral is that when + -- names are changed semantic processing may be affected. By deferring + -- the changes till just before gigi is called, we avoid any concerns + -- about such effects.
Gigi itself does not use the names except for + -- output of names for debugging purposes (which is why we are doing + -- the name changes in the first place). + + -- Note: the routines Get_Unqualified_[Decoded]_Name_String in Namet + -- are useful to remove qualification from a name qualified by the + -- call to Qualify_All_Entity_Names. + + procedure Generate_Auxiliary_Types; + -- The process of qualifying names may result in name compression which + -- requires dummy enumeration types to be generated. This subprogram + -- ensures that these types are appropriately included in the tree. + + -------------------------------- + -- Handling of Numeric Values -- + -------------------------------- + + -- All numeric values here are encoded as strings of decimal digits. + -- Only integer values need to be encoded. A negative value is encoded + -- as the corresponding positive value followed by a lower case m for + -- minus to indicate that the value is negative (e.g. 2m for -2). + + ------------------------- + -- Type Name Encodings -- + ------------------------- + + -- In the following typ is the name of the type as normally encoded by + -- the debugger rules, i.e. a non-qualified name, all in lower case, + -- with standard encoding of upper half and wide characters. + + ------------------------ + -- Encapsulated Types -- + ------------------------ + + -- In some cases, the compiler encapsulates a type by wrapping it in + -- a structure. For example, this is used when a size or alignment + -- specification requires a larger type. Consider: + + -- type y is mod 2 ** 64; + -- for y'size use 256; + + -- In this case the compiler generates a structure type y___PAD, which + -- has a single field whose name is F. This single field is 64 bits + -- long and contains the actual value. + + -- A similar encapsulation is done for some packed array types, + -- in which case the structure type is y___LJM and the field name + -- is OBJECT.
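Returning briefly to the numeric encoding described under Handling of Numeric Values above (decimal digits, with a trailing lower case m for negative values), the following is a minimal sketch of an encoder and decoder. The names Encode and Decode are hypothetical and do not appear in Exp_Dbug, and the decoder assumes well-formed input.

```ada
with Ada.Text_IO;       use Ada.Text_IO;
with Ada.Strings;       use Ada.Strings;
with Ada.Strings.Fixed; use Ada.Strings.Fixed;

procedure Numeric_Encoding_Demo is

   --  Hypothetical helper: decimal digits, followed by "m" if negative
   function Encode (Value : Integer) return String is
      --  Widen before taking "abs" so that Integer'First does not overflow.
      --  'Image yields a leading blank for non-negative values, hence Trim.
      Digits_Only : constant String :=
        Trim (Long_Long_Integer'Image (abs Long_Long_Integer (Value)), Left);
   begin
      if Value < 0 then
         return Digits_Only & "m";
      else
         return Digits_Only;
      end if;
   end Encode;

   --  Hypothetical inverse of Encode. Assumes a well-formed image; the
   --  single value Integer'First is not handled by this sketch.
   function Decode (Image : String) return Integer is
   begin
      if Image'Length > 1 and then Image (Image'Last) = 'm' then
         return -Integer'Value (Image (Image'First .. Image'Last - 1));
      else
         return Integer'Value (Image);
      end if;
   end Decode;

begin
   Put_Line (Encode (-2));                    --  2m
   Put_Line (Encode (15));                    --  15
   Put_Line (Integer'Image (Decode ("10m")));  --  -10
end Numeric_Encoding_Demo;
```

The same postfix convention is what later sections rely on wherever a bound is embedded in a type name.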
+ + -- When the debugger sees an object of a type whose name has a + -- suffix not otherwise mentioned in this specification, the type + -- is a record containing a single field, and the name of that field + -- is all upper-case letters, it should look inside to get the value + -- of the field, and neither the outer structure name, nor the + -- field name should appear when the value is printed. + + ----------------------- + -- Fixed-Point Types -- + ----------------------- + + -- Fixed-point types are encoded using a suffix that indicates the + -- delta and small values. The actual type itself is a normal + -- integer type. + + -- typ___XF_nn_dd + -- typ___XF_nn_dd_nn_dd + + -- The first form is used when small = delta. The value of delta (and + -- small) is given by the rational nn/dd, where nn and dd are decimal + -- integers. + -- + -- The second form is used if the small value is different from the + -- delta. In this case, the first nn/dd rational value is for delta, + -- and the second value is for small. + + ------------------------------ + -- VAX Floating-Point Types -- + ------------------------------ + + -- Vax floating-point types are represented at run time as integer + -- types, which are treated specially by the code generator. Their + -- type names are encoded with the following suffix: + + -- typ___XFF + -- typ___XFD + -- typ___XFG + + -- representing the Vax F Float, D Float, and G Float types. The + -- debugger must treat these specially. In particular, printing + -- these values can be achieved using the debug procedures that + -- are provided in package System.Vax_Float_Operations: + + -- procedure Debug_Output_D (Arg : D); + -- procedure Debug_Output_F (Arg : F); + -- procedure Debug_Output_G (Arg : G); + + -- These three procedures take a Vax floating-point argument, and + -- output a corresponding decimal representation to standard output + -- with no terminating line return. 
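For the fixed-point encoding described earlier (typ___XF_nn_dd and typ___XF_nn_dd_nn_dd), a consumer has to recover the delta and small rationals from the name. The sketch below shows one way this might be done. The types Rational and Fixed_Info and the function Decode are hypothetical, the input is assumed to be well formed, and the type names used in the demonstration are invented.

```ada
with Ada.Text_IO;       use Ada.Text_IO;
with Ada.Strings.Fixed; use Ada.Strings.Fixed;

procedure Fixed_Point_Suffix_Demo is

   --  Hypothetical result types, not part of Exp_Dbug
   type Rational is record
      Num, Den : Positive;
   end record;

   type Fixed_Info is record
      Delta_Value : Rational;
      Small_Value : Rational;
   end record;

   --  The trailing underscore keeps this from matching the Vax
   --  suffixes ___XFF, ___XFD and ___XFG.
   Marker : constant String := "___XF_";

   function Decode (Encoded : String) return Fixed_Info is
      Start  : constant Natural := Index (Encoded, Marker);
      Pos    : Positive := 1;  --  next character to scan
      Result : Fixed_Info;

      --  Read the decimal number starting at Pos and advance Pos past
      --  the underscore that follows it (or past the end of the name).
      function Next_Number return Positive is
         First : constant Positive := Pos;
      begin
         while Pos <= Encoded'Last and then Encoded (Pos) /= '_' loop
            Pos := Pos + 1;
         end loop;

         Pos := Pos + 1;
         return Positive'Value (Encoded (First .. Pos - 2));
      end Next_Number;

   begin
      if Start = 0 then
         raise Constraint_Error with "no ___XF_ suffix in " & Encoded;
      end if;

      Pos := Start + Marker'Length;

      --  Separate statements, since the evaluation order of aggregate
      --  components is not defined and Next_Number has a side effect.
      Result.Delta_Value.Num := Next_Number;
      Result.Delta_Value.Den := Next_Number;

      if Pos <= Encoded'Last then
         --  Second form: small differs from delta
         Result.Small_Value.Num := Next_Number;
         Result.Small_Value.Den := Next_Number;
      else
         --  First form: small = delta
         Result.Small_Value := Result.Delta_Value;
      end if;

      return Result;
   end Decode;

   procedure Show (Name : String) is
      Info : constant Fixed_Info := Decode (Name);
   begin
      Put_Line (Name);
      Put_Line ("  delta =" & Positive'Image (Info.Delta_Value.Num)
                & " /" & Positive'Image (Info.Delta_Value.Den));
      Put_Line ("  small =" & Positive'Image (Info.Small_Value.Num)
                & " /" & Positive'Image (Info.Small_Value.Den));
   end Show;

begin
   Show ("money___XF_1_100");        --  delta = 1 / 100, small = 1 / 100
   Show ("volts___XF_1_100_1_128");  --  delta = 1 / 100, small = 1 / 128
end Fixed_Point_Suffix_Demo;
```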
+ + -------------------- + -- Discrete Types -- + -------------------- + + -- Discrete types are coded with a suffix indicating the range in + -- the case where one or both of the bounds are discriminants or + -- variable. + + -- Note: at the current time, we also encode static bounds if they + -- do not match the natural machine type bounds, but this may be + -- removed in the future, since it is redundant for most debugging + -- formats. However, we do not ever need XD encoding for enumeration + -- base types, since here it is always clear what the bounds are + -- from the number of enumeration literals, and of course we do + -- not need to encode the dummy XR types generated for renamings. + + -- typ___XD + -- typ___XDL_lowerbound + -- typ___XDU_upperbound + -- typ___XDLU_lowerbound__upperbound + + -- If a discrete type is a natural machine type (i.e. its bounds + -- correspond in a natural manner to its size), then it is left + -- unencoded. The above encoding forms are used when there is a + -- constrained range that does not correspond to the size or that + -- has discriminant references or other non-static bounds. + + -- The first form is used if both bounds are dynamic, in which case + -- two constant objects are present whose names are typ___L and + -- typ___U in the same scope as typ, and the values of these constants + -- indicate the bounds. As far as the debugger is concerned, these + -- are simply variables that can be accessed like any other variables. + -- In the enumeration case, these values correspond to the Enum_Rep + -- values for the lower and upper bounds. + + -- The second form is used if the upper bound is dynamic, but the + -- lower bound is either constant or depends on a discriminant of + -- the record with which the type is associated. 
The upper bound + -- is stored in a constant object of name typ___U as previously + -- described, but the lower bound is encoded directly into the + -- name as either a decimal integer, or as the discriminant name. + + -- The third form is similarly used if the lower bound is dynamic, + -- but the upper bound is static or a discriminant reference, in + -- which case the lower bound is stored in a constant object of + -- name typ___L, and the upper bound is encoded directly into the + -- name as either a decimal integer, or as the discriminant name. + + -- The fourth form is used if both bounds are discriminant references + -- or static values, with the encoding first for the lower bound, + -- then for the upper bound, as previously described. + + ------------------ + -- Biased Types -- + ------------------ + + -- Only discrete types can be biased, and the fact that they are + -- biased is indicated by a suffix of the form: + + -- typ___XB_lowerbound__upperbound + + -- Here lowerbound and upperbound are decimal integers, with the + -- usual (postfix "m") encoding for negative numbers. Biased + -- types are only possible where the bounds are static, and the + -- values are represented as unsigned offsets from the lower + -- bound given. For example: + + -- type Q is range 10 .. 15; + -- for Q'size use 3; + + -- The size clause will force values of type Q in memory to be + -- stored in biased form (e.g. 11 will be represented by the + -- bit pattern 001). + + ---------------------------------------------- + -- Record Types with Variable-Length Fields -- + ---------------------------------------------- + + -- The debugging formats do not fully support these types, and indeed + -- some formats simply generate no useful information at all for such + -- types. 
In order to provide information for the debugger, gigi creates + -- a parallel type in the same scope with one of the names + + -- type___XVE + -- type___XVU + + -- The former name is used for a record and the latter for the union + -- that is made for a variant record (see below) if that union has + -- variable size. These encodings suffix any other encodings that + -- might be suffixed to the type name. + + -- The idea here is to provide all the needed information to interpret + -- objects of the original type in the form of a "fixed up" type, which + -- is representable using the normal debugging information. + + -- There are three cases to be dealt with. First, some fields may have + -- variable positions because they appear after variable-length fields. + -- To deal with this, we encode *all* the field bit positions of the + -- special ___XV type in a non-standard manner. + + -- The idea is to encode not the position, but rather information + -- that allows computing the position of a field from the position + -- of the previous field. The algorithm for computing the actual + -- positions of all fields and the length of the record is as + -- follows. In this description, let P represent the current + -- bit position in the record. + + -- 1. Initialize P to 0. + + -- 2. For each field in the record, + + -- 2a. If an alignment is given (see below), then round P + -- up, if needed, to the next multiple of that alignment. + + -- 2b. If a bit position is given, then increment P by that + -- amount (that is, treat it as an offset from the end of the + -- preceding record). + + -- 2c. Assign P as the actual position of the field. + + -- 2d. Compute the length, L, of the represented field (see below) + -- and compute P'=P+L. Unless the field represents a variant part + -- (see below and also Variant Record Encoding), set P to P'. 
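Steps 1 and 2a to 2d above can be sketched as follows. This is an illustration of the algorithm as described, not the debugger's actual code. The record type Field and the function Positions are hypothetical, storage units are assumed to be 8 bits, and the field lengths in the demonstration are made-up inputs (in practice the length of a variable-length field comes from its run-time bounds).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Field_Position_Demo is

   Storage_Unit : constant := 8;  --  assumption: 8-bit storage units

   --  Hypothetical description of one field of an ___XVE record
   type Field is record
      Alignment  : Natural;  --  in storage units; 0 if none is encoded
      Stored_Pos : Natural;  --  encoded bit position, used as an offset
      Length     : Natural;  --  length of the represented field, in bits
      Is_Variant : Boolean;  --  True if the field is a variant part
   end record;

   type Field_List is array (Positive range <>) of Field;
   type Position_List is array (Positive range <>) of Natural;

   function Positions (Fields : Field_List) return Position_List is
      Result : Position_List (Fields'Range);
      P      : Natural := 0;                              --  step 1
      Align  : Positive;
   begin
      for J in Fields'Range loop
         if Fields (J).Alignment > 0 then
            Align := Fields (J).Alignment * Storage_Unit;
            P := ((P + Align - 1) / Align) * Align;       --  step 2a
         end if;

         P := P + Fields (J).Stored_Pos;                  --  step 2b
         Result (J) := P;                                 --  step 2c

         if not Fields (J).Is_Variant then
            P := P + Fields (J).Length;                   --  step 2d
         end if;
      end loop;

      return Result;
   end Positions;

   --  Seven fields in sequence, with assumed lengths in bits
   Demo_Fields : constant Field_List :=
     ((0,  0,   8, False),   --  8-bit field at the start
      (0, 24,  32, False),   --  stored offset of 24 bits
      (0,  0,  40, False),   --  variable-length, assumed 40 bits today
      (4,  0,  32, False),   --  alignment 4
      (4,  0,  96, False),   --  alignment 4, assumed 96 bits
      (8,  0, 128, False),   --  alignment 8, assumed 128 bits
      (0,  0,  32, False));

   Result : constant Position_List := Positions (Demo_Fields);

begin
   --  Prints 0, 32, 64, 128, 160, 256, 384 (one value per line)
   for J in Result'Range loop
      Put_Line (Natural'Image (Result (J)));
   end loop;
end Field_Position_Demo;
```

Note how the fourth field lands at bit 128 rather than 104: the preceding field ends at bit 104, and step 2a rounds that up to the next multiple of 32 bits.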
+ + -- The alignment, if present, is encoded in the field name of the + -- record, which has a suffix: + + -- fieldname___XVAnn + + -- where the nn after the XVA indicates the alignment value in storage + -- units. This encoding is present only if an alignment is present. + + -- The size of the record described by an XVE-encoded type (in bits) + -- is generally the maximum value attained by P' in step 2d above, + -- rounded up according to the record's alignment. + + -- Second, the variable-length fields themselves are represented by + -- replacing the type by a special access type. The designated type + -- of this access type is the original variable-length type, and the + -- fact that this field has been transformed in this way is signalled + -- by encoding the field name as: + + -- field___XVL + + -- where field is the original field name. If a field is both + -- variable-length and also needs an alignment encoding, then the + -- encodings are combined using: + + -- field___XVLnn + + -- Note: the reason that we change the type is so that the resulting + -- type has no variable-length fields. At least some of the formats + -- used for debugging information simply cannot tolerate variable- + -- length fields, so the encoded information would get lost. + + -- Third, in the case of a variant record, the special union + -- that contains the variants is replaced by a normal C union. + -- In this case, the positions are all zero. + + -- As an example of this encoding, consider the declarations: + + -- type Q is array (1 .. V1) of Float; -- alignment 4 + -- type R is array (1 .. V2) of Long_Float; -- alignment 8 + + -- type X is record + -- A : Character; + -- B : Float; + -- C : String (1 .. 
V3); + -- D : Float; + -- E : Q; + -- F : R; + -- G : Float; + -- end record; + + -- The encoded type looks like: + + -- type anonymousQ is access Q; + -- type anonymousR is access R; + + -- type X___XVE is record + -- A : Character; -- position contains 0 + -- B : Float; -- position contains 24 + -- C___XVL : access String (1 .. V3); -- position contains 0 + -- D___XVA4 : Float; -- position contains 0 + -- E___XVL4 : anonymousQ; -- position contains 0 + -- F___XVL8 : anonymousR; -- position contains 0 + -- G : Float; -- position contains 0 + -- end record; + + -- Any bit sizes recorded for fields other than dynamic fields and + -- variants are honored as for ordinary records. + + -- Notes: + + -- 1) The B field could also have been encoded by using a position + -- of zero, and an alignment of 4, but in such a case, the coding by + -- position is preferred (since it takes up less space). We have used + -- the (illegal) notation access xxx as field types in the example + -- above. + + -- 2) The E field does not actually need the alignment indication + -- but this may not be detected in this case by the conversion + -- routines. + + -- All discriminants always appear before any variable-length + -- fields that depend on them. So they can be located independent + -- of the variable-length field, using the standard procedure for + -- computing positions described above. + + -- The size of the ___XVE or ___XVU record or union is set to the + -- alignment (in bytes) of the original object so that the debugger + -- can calculate the size of the original type. + + -- 3) Our conventions do not cover all XVE-encoded records in which + -- some, but not all, fields have representation clauses. Such + -- records may, therefore, be displayed incorrectly by debuggers. + -- This situation is not common. 
+ + ----------------------- + -- Base Record Types -- + ----------------------- + + -- Under certain circumstances, debuggers need two descriptions + -- of a record type, one that gives the actual details of the + -- base type's structure (as described elsewhere in these + -- comments) and one that may be used to obtain information + -- about the particular subtype and the size of the objects + -- being typed. In such cases the compiler will substitute a + -- type whose name is typically compiler-generated and + -- irrelevant except as a key for obtaining the actual type. + -- Specifically, if this name is x, then we produce a record + -- type named x___XVS consisting of one field. The name of + -- this field is that of the actual type being encoded, which + -- we'll call y (the type of this single field is arbitrary). + -- Both x and y may have corresponding ___XVE types. + + -- The size of the objects typed as x should be obtained from + -- the structure of x (and x___XVE, if applicable) as for + -- ordinary types unless there is a variable named x___XVZ, which, + -- if present, will hold the size (in bits) of x. + + -- The type x will either be a subtype of y (see also Subtypes + -- of Variant Records, below) or will contain no fields at + -- all. The layout, types, and positions of these fields will + -- be accurate, if present. (Currently, however, the GDB + -- debugger makes no use of x except to determine its size). + + -- Among other uses, XVS types are sometimes used to encode + -- unconstrained types. For example, given + -- + -- subtype Int is INTEGER range 0..10; + -- type T1 (N: Int := 0) is record + -- F1: String (1 .. 
N); + -- end record; + -- type AT1 is array (INTEGER range <>) of T1; + -- + -- the element type for AT1 might have a type defined as if it had + -- been written: + -- + -- type at1___C_PAD is record null; end record; + -- for at1___C_PAD'Size use 16 * 8; + -- + -- and there would also be + -- + -- type at1___C_PAD___XVS is record t1: Integer; end record; + -- type t1 is ... + -- + -- Had the subtype Int been dynamic: + -- + -- subtype Int is INTEGER range 0 .. M; -- M a variable + -- + -- Then the compiler would also generate a declaration whose effect + -- would be + -- + -- at1___C_PAD___XVZ: constant Integer := 32 + M * 8 + padding term; + -- + -- Not all unconstrained types are so encoded; the XVS + -- convention may be unnecessary for unconstrained types of + -- fixed size. However, this encoding is always necessary when + -- a subcomponent type (array element's type or record field's + -- type) is an unconstrained record type some of whose + -- components depend on discriminant values. + + ----------------- + -- Array Types -- + ----------------- + + -- Since there is no way for the debugger to obtain the index subtypes + -- for an array type, we produce a type that has the name of the + -- array type followed by "___XA" and is a record whose field names + -- are the names of the types for the bounds. The type of each of + -- these fields is an integer type which is meaningless. + + -- To conserve space, we do not produce this type unless one of + -- the index types is either an enumeration type, has a variable + -- upper bound, has a lower bound different from the constant 1, + -- is a biased type, or is wider than "sizetype". + + -- Given the full encoding of these types (see above description for + -- the encoding of discrete types), this means that all necessary + -- information for addressing arrays is available. 
In some + -- debugging formats, some or all of the bounds information may + -- be available redundantly, particularly in the fixed-point case, + -- but this information can in any case be ignored by the debugger. + + ---------------------------- + -- Note on Implicit Types -- + ---------------------------- + + -- The compiler creates implicit type names in many situations where + -- a type is present semantically, but no specific name is present. + -- For example: + + -- S : Integer range M .. N; + + -- Here the subtype of S is not integer, but rather an anonymous + -- subtype of Integer. Where possible, the compiler generates names + -- for such anonymous types that are related to the type from which + -- the subtype is obtained as follows: + + -- T name suffix + + -- where name is the name from which the subtype is obtained, using + -- lower case letters and underscores, and suffix starts with an upper + -- case letter. For example, the name for the above declaration of S + -- might be: + + -- TintegerS4b + + -- If the debugger is asked to give the type of an entity and the type + -- has the form T name suffix, it is probably appropriate to just use + -- "name" in the response since this is what is meaningful to the + -- programmer. + + ------------------------------------------------- + -- Subprograms for Handling Encoded Type Names -- + ------------------------------------------------- + + procedure Get_Encoded_Name (E : Entity_Id); + -- If the entity is a typename, store the external name of + -- the entity as in Get_External_Name, followed by three underscores + -- plus the type encoding in Name_Buffer with the length in Name_Len, + -- and an ASCII.NUL character stored following the name. + -- Otherwise set Name_Buffer and Name_Len to hold the entity name. 
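As a sketch of the suggestion above that a debugger report just "name" for an implicit type of the form T name suffix, the following standalone Ada program strips the leading T and the suffix. Display_Name is a hypothetical helper written for this illustration; it assumes, as the text states, that name uses only lower case letters and underscores and that the suffix starts with an upper case letter. It is not how any particular debugger implements this.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Implicit_Name_Demo is

   --  Hypothetical helper: returns the "name" part of a string of the
   --  form T name suffix, or the input unchanged if it does not have
   --  that form.
   function Display_Name (Encoded : String) return String is
      First : constant Integer := Encoded'First;
   begin
      if Encoded'Length < 3 or else Encoded (First) /= 'T' then
         return Encoded;
      end if;

      --  The suffix starts at the first upper case letter after the T
      for J in First + 1 .. Encoded'Last loop
         if Encoded (J) in 'A' .. 'Z' then
            if J = First + 1 then
               return Encoded;  --  empty name part: not this form
            else
               return Encoded (First + 1 .. J - 1);
            end if;
         end if;
      end loop;

      return Encoded;  --  no suffix found: not this form
   end Display_Name;

begin
   Put_Line (Display_Name ("TintegerS4b"));  --  integer
end Implicit_Name_Demo;
```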
+ + -------------- + -- Renaming -- + -------------- + + -- Debugging information is generated for exception, object, package, + -- and subprogram renaming (generic renamings are not significant, since + -- generic templates are not relevant at debugging time). + + -- Consider a renaming declaration of the form + + -- x typ renames y; + + -- There is one case in which no special debugging information is required, + -- namely the case of an object renaming where the backend allocates a + -- reference for the renamed variable, and the entity x is this reference. + -- The debugger can handle this case without any special processing or + -- encoding (it won't know it was a renaming, but that does not matter). + + -- All other cases of renaming generate a dummy type definition for + -- an entity whose name is: + + -- x___XR for an object renaming + -- x___XRE for an exception renaming + -- x___XRP for a package renaming + + -- The name is fully qualified in the usual manner, i.e. qualified in + -- the same manner as the entity x would be. + + -- Note: subprogram renamings are not encoded at the present time. + + -- The type is an enumeration type with a single enumeration literal + -- that is an identifier which describes the renamed variable. + + -- For the simple entity case, where y is an entity name, + -- the enumeration is of the form: + + -- (y___XE) + + -- i.e. the enumeration type has a single field, whose name + -- matches the name y, with the XE suffix. The entity for this + -- enumeration literal is fully qualified in the usual manner. + -- All subprogram, exception, and package renamings fall into + -- this category, as well as simple object renamings. + + -- For the object renaming case where y is a selected component or an + -- indexed component, the literal name is suffixed by additional fields + -- that give details of the components. The name starts as above with + -- a y___XE entity indicating the outer level variable. 
Then a series + -- of selections and indexing operations can be specified as follows: + + -- Indexed component + + -- A series of subscript values appears in sequence; the number + -- corresponds to the number of dimensions of the array. The + -- subscripts have one of the following two forms: + + -- XSnnn + + -- Here nnn is a constant value, encoded as a decimal + -- integer (pos value for enumeration type case). Negative + -- values have a trailing 'm' as usual. + + -- XSe + + -- Here e is the (unqualified) name of a constant entity in + -- the same scope as the renaming which contains the subscript + -- value. + + -- Slice + + -- For the slice case, we have two entries. The first is for + -- the lower bound of the slice, and has the form + + -- XLnnn + -- XLe + + -- Specifies the lower bound, using exactly the same encoding + -- as for an XS subscript as described above. + + -- Then the upper bound appears in the usual XSnnn/XSe form + + -- Selected component + + -- For a selected component, we have a single entry + + -- XRf + + -- Here f is the field name for the selection + + -- For an explicit dereference (.all), we have a single entry + + -- XA + + -- As an example, consider the declarations: + + -- package p is + -- type q is record + -- m : string (2 .. 5); + -- end record; + -- + -- type r is array (1 .. 10, 1 .. 20) of q; + -- + -- g : r; + -- + -- z : string renames g (1,5).m(2 ..3); + -- end p; + + -- The generated type definition would appear as + + -- type p__z___XR is + -- (p__g___XEXS1XS5XRmXL2XS3); + -- p__g___XE--------------------outer entity is g + -- XS1-----------------first subscript for g + -- XS5--------------second subscript for g + -- XRm-----------select field m + -- XL2--------lower bound of slice + -- XS3-----upper bound of slice + + function Debug_Renaming_Declaration (N : Node_Id) return Node_Id; + -- The argument N is a renaming declaration. The result is a type + -- declaration as described in the above paragraphs. 
If no special + -- debug declaration is required, then Empty is returned. + + --------------------------- + -- Packed Array Encoding -- + --------------------------- + + -- For every packed array, two types are created, and both appear in + -- the debugging output. + + -- The original declared array type is a perfectly normal array type, + -- and its index bounds indicate the original bounds of the array. + + -- The corresponding packed array type, which may be a modular type, or + -- may be an array of bytes type (see Exp_Pakd for full details). This + -- is the type that is actually used in the generated code and for + -- debugging information for all objects of the packed type. + + -- The name of the corresponding packed array type is: + + -- ttt___XPnnn + + -- where + -- ttt is the name of the original declared array + -- nnn is the component size in bits (1-31) + + -- When the debugger sees that an object is of a type that is encoded + -- in this manner, it can use the original type to determine the bounds, + -- and the component size to determine the packing details. + + -- Packed arrays are represented in tightly packed form, with no extra + -- bits between components. This is true even when the component size + -- is not a factor of the storage unit size, so that as a result it is + -- possible for components to cross storage unit boundaries. + + -- The layout in storage is identical, regardless of whether the + -- implementation type is a modular type or an array-of-bytes type. + -- See Exp_Pakd for details of how these implementation types are used, + -- but for the purpose of the debugger, only the starting address of + -- the object in memory is significant. + + -- The following example should show clearly how the packing works in + -- the little-endian and big-endian cases: + + -- type B is range 0 .. 7; + -- for B'Size use 3; + + -- type BA is array (0 .. 
5) of B; + -- pragma Pack (BA); + + -- BV : constant BA := (1,2,3,4,5,6); + + -- Little endian case + + -- BV'Address + 2 BV'Address + 1 BV'Address + 0 + -- +-----------------+-----------------+-----------------+ + -- | 0 0 0 0 0 0 1 1 | 0 1 0 1 1 0 0 0 | 1 1 0 1 0 0 0 1 | + -- +-----------------+-----------------+-----------------+ + -- <---------> <-----> <---> <---> <-----> <---> <---> + -- unused bits BV(5) BV(4) BV(3) BV(2) BV(1) BV(0) + -- + -- Big endian case + -- + -- BV'Address + 0 BV'Address + 1 BV'Address + 2 + -- +-----------------+-----------------+-----------------+ + -- | 0 0 1 0 1 0 0 1 | 1 1 0 0 1 0 1 1 | 1 0 0 0 0 0 0 0 | + -- +-----------------+-----------------+-----------------+ + -- <---> <---> <-----> <---> <---> <-----> <---------> + -- BV(0) BV(1) BV(2) BV(3) BV(4) BV(5) unused bits + + ------------------------------------------------------ + -- Subprograms for Handling Packed Array Type Names -- + ------------------------------------------------------ + + function Make_Packed_Array_Type_Name + (Typ : Entity_Id; + Csize : Uint) + return Name_Id; + -- This function is used in Exp_Pakd to create the name that is encoded + -- as described above. The entity Typ provides the name ttt, and the + -- value Csize is the component size that provides the nnn value. + + -------------------------------------- + -- Pointers to Unconstrained Arrays -- + -------------------------------------- + + -- There are two kinds of pointers to arrays. The debugger can tell + -- which format is in use by the form of the type of the pointer. + + -- Fat Pointers + + -- Fat pointers are represented as a struct with two fields. This + -- struct has two distinguished field names: + + -- P_ARRAY is a pointer to the array type. The name of this + -- type is the unconstrained type followed by "___XUA". This + -- array will have bounds which are the discriminants, and + -- hence are unparsable, but will give the number of + -- subscripts and the component type. 
+ + -- P_BOUNDS is a pointer to a struct, the name of whose type is the + -- unconstrained array name followed by "___XUB" and which has + -- fields of the form + + -- LBn (n a decimal integer) lower bound of n'th dimension + -- UBn (n a decimal integer) upper bound of n'th dimension + + -- The bounds may be any integral type. In the case of an + -- enumeration type, Enum_Rep values are used. + + -- The debugging information will sometimes reference an anonymous + -- fat pointer type. Such types are given the name xxx___XUP, where + -- xxx is the name of the designated type. If the debugger is asked + -- to output such a type name, the appropriate form is "access xxx". + + -- Thin Pointers + + -- Thin pointers are represented as a pointer to the ARRAY field of + -- a structure with two fields. The name of the structure type is + -- that of the unconstrained array followed by "___XUT". + + -- The field ARRAY contains the array value. This array field is + -- typically a variable-length array, and consequently the entire + -- record structure will be encoded as previously described, + -- resulting in a type with suffix "___XUT___XVE". + + -- The field BOUNDS is a struct containing the bounds as above. + + -------------------------------------- + -- Tagged Types and Type Extensions -- + -------------------------------------- + + -- A type C derived from a tagged type P has a field named "_parent" + -- of type P that contains its inherited fields. The type of this + -- field is usually P (encoded as usual if it has a dynamic size), + -- but may be a more distant ancestor, if P is a null extension of + -- that type. + + -- The type tag of a tagged type is a field named _tag, of type void*. + -- If the type is derived from another tagged type, its _tag field is + -- found in its _parent field. 
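Returning to the anonymous fat pointer convention described above (a type named xxx___XUP should be shown as "access xxx"), a minimal sketch of that display rule in standalone Ada might look as follows. Display_Name and the sample name p__r are hypothetical and chosen only for illustration; nothing here is taken from an actual debugger.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Fat_Pointer_Name_Demo is

   Suffix : constant String := "___XUP";

   --  Hypothetical helper: maps xxx___XUP to "access xxx" and leaves
   --  any other name unchanged.
   function Display_Name (Type_Name : String) return String is
      Last : constant Integer := Type_Name'Last;
   begin
      if Type_Name'Length > Suffix'Length
        and then Type_Name (Last - Suffix'Length + 1 .. Last) = Suffix
      then
         return "access "
           & Type_Name (Type_Name'First .. Last - Suffix'Length);
      else
         return Type_Name;
      end if;
   end Display_Name;

begin
   Put_Line (Display_Name ("p__r___XUP"));  --  access p__r
end Fat_Pointer_Name_Demo;
```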
+ + ----------------------------- + -- Variant Record Encoding -- + ----------------------------- + + -- The variant part of a variant record is encoded as a single field + -- in the enclosing record, whose name is: + + -- discrim___XVN + + -- where discrim is the unqualified name of the variant. This field name + -- is built by gigi (not by code in this unit). In the case of an + -- Unchecked_Union record, this discriminant will not appear in the + -- record, and the debugger must proceed accordingly (basically it + -- can treat this case as it would a C union). + + -- The type corresponding to this field has a name that is obtained + -- by concatenating the type name with the above string and is similar + -- to a C union, in which each member of the union corresponds to one + -- variant. However, unlike a C union, the size of the type may be + -- variable even if each of the components are fixed size, since it + -- includes a computation of which variant is present. In that case, + -- it will be encoded as above and a type with the suffix "___XVN___XVU" + -- will be present. + + -- The name of the union member is encoded to indicate the choices, and + -- is a string given by the following grammar: + + -- union_name ::= {choice} | others_choice + -- choice ::= simple_choice | range_choice + -- simple_choice ::= S number + -- range_choice ::= R number T number + -- number ::= {decimal_digit} [m] + -- others_choice ::= O (upper case letter O) + + -- The m in a number indicates a negative value. As an example of this + -- encoding scheme, the choice 1 .. 4 | 7 | -10 would be represented by + + -- R1T4S7S10m + + -- In the case of enumeration values, the values used are the + -- actual representation values in the case where an enumeration type + -- has an enumeration representation spec (i.e. they are values that + -- correspond to the use of the Enum_Rep attribute). 
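The choice grammar above can be illustrated with a short standalone Ada program that encodes a list of choices into a union member name. This is a sketch under stated assumptions: the Choice and Choice_List types and the Number and Encode functions are hypothetical names invented for the example, a simple choice is modelled as a range with equal bounds, and the others case (which the grammar encodes as the single letter O) is not handled. It is not the implementation of Get_Variant_Encoding.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Variant_Choice_Demo is

   --  Hypothetical model of one choice: Lo = Hi is a simple choice
   type Choice is record
      Lo, Hi : Integer;
   end record;

   type Choice_List is array (Positive range <>) of Choice;

   --  number ::= {decimal_digit} [m]
   --  (Integer'First is not handled, since abs would overflow)
   function Number (V : Integer) return String is
      Img  : constant String := Integer'Image (abs V);
      --  'Image of a non-negative value starts with one blank
      Digs : constant String := Img (Img'First + 1 .. Img'Last);
   begin
      if V < 0 then
         return Digs & "m";
      else
         return Digs;
      end if;
   end Number;

   --  union_name ::= {choice}, with S for simple and R..T for ranges
   function Encode (Choices : Choice_List) return String is
   begin
      if Choices'Length = 0 then
         return "";
      end if;

      declare
         C    : constant Choice := Choices (Choices'First);
         Rest : constant String :=
           Encode (Choices (Choices'First + 1 .. Choices'Last));
      begin
         if C.Lo = C.Hi then
            return "S" & Number (C.Lo) & Rest;
         else
            return "R" & Number (C.Lo) & "T" & Number (C.Hi) & Rest;
         end if;
      end;
   end Encode;

begin
   --  The choice list 1 .. 4 | 7 | -10 from the text
   Put_Line (Encode (((1, 4), (7, 7), (-10, -10))));  --  R1T4S7S10m
end Variant_Choice_Demo;
```

For enumeration choices, the integers passed in would be the Enum_Rep values, as noted above.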
+ + -- The type of the inner record is given by the name of the union + -- type (as above) concatenated with the above string. Since that + -- type may itself be variable-sized, it may also be encoded as above + -- with a new type with a further suffix of "___XVU". + + -- As an example, consider: + + -- type Var (Disc : Boolean := True) is record + -- M : Integer; + + -- case Disc is + -- when True => + -- R : Integer; + -- S : Integer; + + -- when False => + -- T : Integer; + -- end case; + -- end record; + + -- V1 : Var; + + -- In this case, the type var is represented as a struct with three + -- fields. The first two are "disc" and "m", representing the values + -- of these record components. + + -- The third field is a union of two types, with field names S1 and O. + -- S1 is a struct with fields "r" and "s", and O is a struct with + -- the single field "t". + + ------------------------------------------------ + -- Subprograms for Handling Variant Encodings -- + ------------------------------------------------ + + procedure Get_Variant_Encoding (V : Node_Id); + -- This procedure is called by Gigi with V being the variant node. + -- The corresponding encoding string is returned in Name_Buffer with + -- the length of the string in Name_Len, and an ASCII.NUL character + -- stored following the name. + + --------------------------------- + -- Subtypes of Variant Records -- + --------------------------------- + + -- A subtype of a variant record is represented by a type in which the + -- union field from the base type is replaced by one of the possible + -- values. 
For example, if we have: + + -- type Var (Disc : Boolean := True) is record + -- M : Integer; + + -- case Disc is + -- when True => + -- R : Integer; + -- S : Integer; + + -- when False => + -- T : Integer; + -- end case; + + -- end record; + -- V1 : Var; + -- V2 : Var (True); + -- V3 : Var (False); + + -- Here V2 for example is represented with a subtype whose name is + -- something like TvarS3b, which is a struct with three fields. The + -- first two fields are "disc" and "m" as for the base type, and + -- the third field is S1, which contains the fields "r" and "s". + + -- The debugger should simply ignore structs with names of the form + -- corresponding to variants, and consider the fields inside as + -- belonging to the containing record. + + ------------------------------------------- + -- Character literals in Character Types -- + ------------------------------------------- + + -- Character types are enumeration types at least one of whose + -- enumeration literals is a character literal. Enumeration literals + -- are usually simply represented using their identifier names. In + -- the case where an enumeration literal is a character literal, the + -- name is encoded as described in the following paragraph. + + -- A name QUhh, where each 'h' is a lower-case hexadecimal digit, + -- stands for a character whose Unicode encoding is hh, and + -- QWhhhh likewise stands for a wide character whose encoding + -- is hhhh. The representation values are encoded as for ordinary + -- enumeration literals (and have no necessary relationship to the + -- values encoded in the names). + + -- For example, given the type declaration + + -- type x is (A, 'C', B); + + -- the second enumeration literal would be named QU43 and the + -- value assigned to it would be 1. + + ------------------- + -- Modular Types -- + ------------------- + + -- A type declared + + -- type x is mod N; + + -- is encoded as a subrange of an unsigned base type with lower bound + -- 0 and upper bound N. 
That is, there is no name encoding; we only use + -- the standard encodings provided by the debugging format. Thus, + -- we give these types a non-standard interpretation: the standard + -- interpretation of our encoding would not, in general, imply that + -- arithmetic on type x was to be performed modulo N (especially not + -- when N is not a power of 2). + + --------------------- + -- Context Clauses -- + --------------------- + + -- The SGI Workshop debugger requires a very peculiar and nonstandard + -- symbol name containing $ signs to be generated that records the + -- use clauses that are used in a unit. GDB does not use this name, + -- since it takes a different philosophy of universal use visibility, + -- with manual resolution of any ambiguities. + + -- The routines and data in this section are used to prepare this + -- specialized name, whose exact contents are described below. Gigi + -- will output this encoded name only in the SGI case (indeed, not + -- only is it useless on other targets, but hazardous, given the use + -- of the non-standard character $ rejected by many assemblers.) + + -- "Use" clauses are encoded as follows: + + -- _LSS__ prefix for clauses in a subprogram spec + -- _LSB__ prefix for clauses in a subprogram body + -- _LPS__ prefix for clauses in a package spec + -- _LPB__ prefix for clauses in a package body + + -- Following the prefix is the fully qualified filename, followed by + -- '$' separated names of fully qualified units in the "use" clause. + -- If a unit appears in both the spec and the body "use" clause, it + -- will appear once in the _L[SP]S__ encoding and twice in the _L[SP]B__ + -- encoding. The encoding appears as a global symbol in the object file. 
+ + ------------------------------------------------------------------------ + -- Subprograms and Declarations for Handling Context Clause Encodings -- + ------------------------------------------------------------------------ + + procedure Save_Unitname_And_Use_List + (Main_Unit_Node : Node_Id; + Main_Kind : Node_Kind); + -- Creates a string containing the current compilation unit name + -- and a dollar sign delimited list of packages named in a Use_Package + -- clause for the compilation unit. Needed for the SGI debugger. The + -- procedure is called unconditionally to set the variables declared + -- below, then gigi decides whether or not to use the values. + + -- The following variables are used for communication between the front + -- end and the debugging output routines in Gigi. + + type Char_Ptr is access all Character; + pragma Convention (C, Char_Ptr); + -- Character pointers accessed from C + + Spec_Context_List, Body_Context_List : Char_Ptr; + -- List of use package clauses for spec and body, respectively, as + -- built by the call to Save_Unitname_And_Use_List. Used by gigi if + -- these strings are to be output. + + Spec_Filename, Body_Filename : Char_Ptr; + -- Filenames for the spec and body, respectively, as built by the + -- call to Save_Unitname_And_Use_List. Used by gigi if these strings + -- are to be output. + +end Exp_Dbug;