------------------------------------------------------------------------------
--                                                                          --
--                         GNAT COMPILER COMPONENTS                         --
--                                                                          --
--                             E X P _ D B U G                              --
--                                                                          --
--                                 S p e c                                  --
--                                                                          --
--          Copyright (C) 1996-2005 Free Software Foundation, Inc.          --
--                                                                          --
-- GNAT is free software;  you can  redistribute it  and/or modify it under --
-- terms of the  GNU General Public License as published  by the Free Soft- --
-- ware  Foundation;  either version 2,  or (at your option) any later ver- --
-- sion.  GNAT is distributed in the hope that it will be useful, but WITH- --
-- OUT ANY WARRANTY;  without even the  implied warranty of MERCHANTABILITY --
-- or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License --
-- for  more details.  You should have  received  a copy of the GNU General --
-- Public License  distributed with GNAT;  see file COPYING.  If not, write --
-- to  the  Free Software Foundation,  51  Franklin  Street,  Fifth  Floor, --
-- Boston, MA 02110-1301, USA.                                              --
--                                                                          --
-- GNAT was originally developed  by the GNAT team at  New York University. --
-- Extensive contributions were provided by Ada Core Technologies Inc.      --
--                                                                          --
------------------------------------------------------------------------------

--  Expand routines for generation of special declarations used by the
--  debugger. In accordance with the Dwarf 2.2 specification, certain
--  type names are encoded to provide information to the debugger.

with Types; use Types;
with Uintp; use Uintp;

package Exp_Dbug is

   -----------------------------------------------------
   -- Encoding and Qualification of Names of Entities --
   -----------------------------------------------------

   --  This section describes how the names of entities are encoded in
   --  the generated debugging information.

   --  An entity in Ada has a name of the form X.Y.Z ... E where X,Y,Z
   --  are the enclosing scopes (not including Standard at the start).
   --  The encoding of the name follows this basic qualified naming scheme,
   --  where the encoding of individual entity names is as described in
   --  Namet (i.e. in particular names present in the original source are
   --  folded to all lower case, with upper half and wide characters encoded
   --  as described in Namet). Upper case letters are used only for entities
   --  generated by the compiler.

   --  There are two cases, global entities, and local entities. In more
   --  formal terms, local entities are those which have a dynamic enclosing
   --  scope, and global entities are at the library level, except that we
   --  always consider procedures to be global entities, even if they are
   --  nested (that's because at the debugger level a procedure name refers
   --  to the code, and the code is indeed a global entity, including the
   --  case of nested procedures.) In addition, we also consider all types
   --  to be global entities, even if they are defined within a procedure.

   --  The reason for treating all type names as global entities is that
   --  a number of our type encodings work by having related type names,
   --  and we need the full qualification to keep this unique.

   --  For global entities, the encoded name includes all components of the
   --  fully expanded name (but omitting Standard at the start). For example,
   --  if a library level child package P.Q has an embedded package R, and
   --  there is an entity in this embedded package whose name is S, the
   --  encoded name will include the components p.q.r.s.

   --  For local entities, the encoded name only includes the components
   --  up to the enclosing dynamic scope (other than a block). At run time,
   --  such a dynamic scope is a subprogram, and the debugging formats know
   --  about local variables of procedures, so it is not necessary to have
   --  full qualification for such entities. In particular this means that
   --  direct local variables of a procedure are not qualified.
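   --  The global and local rules above can be sketched from the point of
   --  view of a tool that consumes these names. This is an illustrative
   --  Python sketch, not part of the compiler. The function name and the
   --  dynamic_scope_depth parameter are hypothetical, the double underscore
   --  separator is the one this spec uses for the dots of a qualified name,
   --  and the Namet encoding of upper half and wide characters is ignored.

```python
def encode_qualified_name(components, dynamic_scope_depth=0):
    """Encode a dotted Ada name (hypothetical helper, illustration only).

    components: scope names from the outermost scope down to the entity,
        with Standard already omitted.
    dynamic_scope_depth: how many leading components to drop because they
        lead up to and include the enclosing subprogram. Use 0 for global
        entities, which keep their full qualification.
    """
    kept = components[dynamic_scope_depth:]
    # Source names are folded to lower case; dots become double underscores.
    return "__".join(part.lower() for part in kept)


global_name = encode_qualified_name(["P", "Q", "R", "S"])
local_var = encode_qualified_name(["V", "W", "X"], dynamic_scope_depth=2)
block_var = encode_qualified_name(["V", "W", "Y", "Z"], dynamic_scope_depth=2)
```

   --  Here global_name keeps every component, while the two local names
   --  drop the components of the enclosing subprogram V.W.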
   --  As an example of the local name convention, consider a procedure V.W
   --  with a local variable X, and a nested block Y containing an entity
   --  Z. The fully qualified names of the entities X and Z are:

   --    V.W.X
   --    V.W.Y.Z

   --  but since V.W is a subprogram, the encoded names will end up
   --  encoding only

   --    x
   --    y.z

   --  The separating dots are translated into double underscores.

   -----------------------------
   -- Handling of Overloading --
   -----------------------------

   --  The above scheme is incomplete with respect to overloaded
   --  subprograms, since overloading can legitimately result in a
   --  case of two entities with exactly the same fully qualified names.
   --  To distinguish between entries in a set of overloaded subprograms,
   --  the encoded names are serialized by adding the suffix:

   --    __nn  (two underscores)

   --  where nn is a serial number (2 for the second overloaded function,
   --  3 for the third, etc.). A suffix of __1 is always omitted (i.e. no
   --  suffix implies the first instance).

   --  These names are prefixed by the normal full qualification. So
   --  for example, the third instance of the subprogram qrs in package
   --  yz would have the name:

   --    yz__qrs__3

   --  A more subtle case arises with entities declared within overloaded
   --  subprograms. If we have two overloaded subprograms, and both declare
   --  an entity xyz, then the fully expanded name of the two xyz's is the
   --  same. To distinguish these, we add the same __nn suffix at the end of
   --  the inner entity names.

   --  In more complex cases, we can have multiple levels of overloading,
   --  and we must make sure to distinguish which final declarative region
   --  we are talking about. For this purpose, we use a more complex suffix
   --  which has the form:

   --    __nn_nn_nn ...

   --  where the nn values are the homonym numbers as needed for any of
   --  the qualifying entities, separated by a single underscore. If all
   --  the nn values are 1, the suffix is omitted. Otherwise the suffix
   --  is present (including any values of 1).
   --  The following example shows how this suffixing works.

   --    package body Yz is
   --      procedure Qrs is               -- Name is yz__qrs
   --        procedure Tuv is ... end;    -- Name is yz__qrs__tuv
   --      begin ... end Qrs;

   --      procedure Qrs (X: Int) is      -- Name is yz__qrs__2
   --        procedure Tuv is ... end;    -- Name is yz__qrs__tuv__2_1
   --        procedure Tuv (X: Int) is    -- Name is yz__qrs__tuv__2_2
   --        begin ... end Tuv;

   --        procedure Tuv (X: Float) is  -- Name is yz__qrs__tuv__2_3
   --          type m is new float;       -- Name is yz__qrs__tuv__m__2_3
   --        begin ... end Tuv;
   --      begin ... end Qrs;
   --    end Yz;

   --------------------
   -- Operator Names --
   --------------------

   --  The above rules applied to operator names would result in names
   --  with quotation marks, which are not typically allowed by assemblers
   --  and linkers, and even if allowed would be odd and hard to deal with.
   --  To avoid this problem, operator names are encoded as follows:

   --    Oabs         abs
   --    Oand         and
   --    Omod         mod
   --    Onot         not
   --    Oor          or
   --    Orem         rem
   --    Oxor         xor
   --    Oeq          =
   --    One          /=
   --    Olt          <
   --    Ole          <=
   --    Ogt          >
   --    Oge          >=
   --    Oadd         +
   --    Osubtract    -
   --    Oconcat      &
   --    Omultiply    *
   --    Odivide      /
   --    Oexpon       **

   --  These names are prefixed by the normal full qualification, and
   --  suffixed by the overloading identification. So for example, the
   --  second operator "=" defined in package Extra.Messages would
   --  have the name:

   --    extra__messages__Oeq__2

   ----------------------------------
   -- Resolving Other Name Clashes --
   ----------------------------------

   --  It might be thought that the above scheme is complete, but in Ada 95,
   --  full qualification is insufficient to uniquely identify an entity
   --  in the program, even if it is not an overloaded subprogram.
   --  There are two possible confusions:

   --     a.b

   --       interpretation 1: entity b in body of package a
   --       interpretation 2: child procedure b of package a

   --     a.b.c

   --       interpretation 1: entity c in child package a.b
   --       interpretation 2: entity c in nested package b in body of a

   --  It is perfectly legal in both cases for both interpretations to
   --  be valid within a single program. This is a bit of a surprise since
   --  certainly in Ada 83, full qualification was sufficient, but not in
   --  Ada 95. The result is that the above scheme can result in duplicate
   --  names. This would not be so bad if the effect were just restricted
   --  to debugging information, but in fact in both the above cases, it
   --  is possible for both symbols to be external names, and so we have
   --  a real problem of name clashes.

   --  To deal with this situation, we provide two additional encoding
   --  rules for names:

   --    First: all library subprogram names are preceded by the string
   --    _ada_ (which causes no duplications, since normal Ada names can
   --    never start with an underscore). This not only solves the first
   --    case of duplication, but also solves another pragmatic problem
   --    which is that otherwise Ada procedures can generate names that
   --    clash with existing system function names. Most notably, we can
   --    have clashes in the case of procedure Main with the C main that
   --    in some systems is always present.

   --    Second, for the case where nested packages declared in package
   --    bodies can cause trouble, we add a suffix which shows which
   --    entities in the list are body-nested packages, i.e. packages
   --    whose spec is within a package body. The rules are as follows,
   --    given a list of names in a qualified name name1.name2....

   --    If none are body-nested package entities, then there is no suffix

   --    If at least one is a body-nested package entity, then the suffix
   --    is X followed by a string of b's and n's (b = body-nested package
   --    entity, n = not a body-nested package).
   --    There is one element in this string for each entity in the encoded
   --    expanded name except the first (the rules are such that the first
   --    entity of the encoded expanded name can never be a body-nested
   --    package). Trailing n's are omitted, as is the last b (there must
   --    be at least one b, or we would not be generating a suffix at all).

   --  For example, suppose we have

   --    package x is
   --       pragma Elaborate_Body;
   --       m1 : integer;                                    -- #1
   --    end x;

   --    package body x is
   --      package y is m2 : integer; end y;                 -- #2
   --      package body y is
   --         package z is r : integer; end z;               -- #3
   --      end;
   --      m3 : integer;                                     -- #4
   --    end x;

   --    package x.y is
   --       pragma Elaborate_Body;
   --       m2 : integer;                                    -- #5
   --    end x.y;

   --    package body x.y is
   --       m3 : integer;                                    -- #6
   --       procedure j is                                   -- #7
   --         package k is
   --            z : integer;                                -- #8
   --         end k;
   --       begin
   --          null;
   --       end j;
   --    end x.y;

   --    procedure x.m3 is begin null; end;                  -- #9

   --  Then the encodings would be:

   --    #1.  x__m1             (no BNPE's in sight)
   --    #2.  x__y__m2X         (y is a BNPE)
   --    #3.  x__y__z__rXb      (y is a BNPE, so is z)
   --    #4.  x__m3             (no BNPE's in sight)
   --    #5.  x__y__m2          (no BNPE's in sight)
   --    #6.  x__y__m3          (no BNPE's in sight)
   --    #7.  x__y__j           (no BNPE's in sight)
   --    #8.  k__z              (no BNPE's, only up to procedure)
   --    #9.  _ada_x__m3        (library level subprogram)

   --  Note that we have instances here of both kinds of potential name
   --  clashes, and the above examples show how the encodings avoid the
   --  clash as follows:

   --    Lines #4 and #9 both refer to the entity x.m3, but #9 is a library
   --    level subprogram, so it is preceded by the string _ada_ which acts
   --    to distinguish it from the package body entity.

   --    Lines #2 and #5 both refer to the entity x.y.m2, but the first
   --    instance is inside the body-nested package y, so there is an X
   --    suffix to distinguish it from the child library entity.

   --  Note that enumeration literals never need Xb type suffixes, since
   --  they are never referenced using global external names.
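   --  The construction of the body-nested package suffix can be sketched
   --  as follows. This is an illustrative Python sketch of the rule as
   --  stated above, not the compiler's implementation. The function name
   --  and its argument are hypothetical: the argument holds one flag per
   --  entity of the encoded expanded name except the first.

```python
def body_nested_suffix(is_body_nested):
    """Return the X... suffix for a qualified name (illustration only).

    is_body_nested: list of booleans, one per entity of the encoded
        expanded name except the first; True marks a body-nested
        package entity (BNPE).
    """
    if not any(is_body_nested):
        return ""  # no BNPE in the name, so no suffix at all
    letters = "".join("b" if flag else "n" for flag in is_body_nested)
    letters = letters.rstrip("n")  # trailing n's are omitted
    # At least one b remains and it is now last; that last b is omitted.
    return "X" + letters[:-1]


suffix_2 = body_nested_suffix([True, False])        # x.y.m2, y is a BNPE
suffix_3 = body_nested_suffix([True, True, False])  # x.y.z.r, y and z BNPEs
suffix_5 = body_nested_suffix([False, False])       # child package x.y, m2
```

   --  Appending these results to x__y__m2, x__y__z__r and x__y__m2 gives
   --  the encodings listed for cases #2, #3 and #5 above.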
   ---------------------
   -- Interface Names --
   ---------------------

   --  Note: if an interface name is present, then the external name
   --  is taken from the specified interface name. Given the current
   --  limitations of the gcc backend, this means that the debugging
   --  name is also set to the interface name, but conceptually, it
   --  would be possible (and indeed desirable) to have the debugging
   --  information still use the Ada name as qualified above, so we
   --  still fully qualify the name in the front end.

   -------------------------------------
   -- Encodings Related to Task Types --
   -------------------------------------

   --  Each task object defined by a single task declaration is associated
   --  with a prefix that is used to qualify procedures defined in that
   --  task. Given
   --
   --    package body P is
   --       task body TaskObj is
   --          procedure F1 is ... end;
   --       begin
   --          B;
   --       end TaskObj;
   --    end P;
   --
   --  The name of subprogram TaskObj.F1 is encoded as p__taskobjTK__f1.
   --  The body, B, is contained in a subprogram whose name is
   --  p__taskobjTKB.

   ------------------------------------------
   -- Encodings Related to Protected Types --
   ------------------------------------------

   --  Each protected type has an associated record type, that describes
   --  the actual layout of the private data. In addition to the private
   --  components of the type, the Corresponding_Record_Type includes one
   --  component of type Protection, which is the actual lock structure.
   --  The run-time size of the protected type is the size of the
   --  corresponding record.

   --  For a protected type prot, the Corresponding_Record_Type is encoded
   --  as protV.

   --  The operations of a protected type are encoded as follows: each
   --  operation results in two subprograms, a locking one that is called
   --  from outside of the object, and a non-locking one that is used for
   --  calls from other operations on the same object. The locking operation
   --  simply acquires the lock, and then calls the non-locking version.
   --  The names of all of these have a prefix constructed from the name of
   --  the type, and a suffix which is P or N, depending on whether this is
   --  the protected/non-locking version of the operation.

   --  Operations generated for protected entries follow the same encoding.
   --  Each entry results in two subprograms: a procedure that holds the
   --  entry body, and a function that holds the evaluation of the barrier.
   --  The names of these subprograms include the prefix 'E' or 'B'
   --  respectively. The names also include a numeric suffix to render them
   --  unique in the presence of overloaded entries.

   --  Given the declaration:

   --    protected type Lock is
   --       function  Get return Integer;
   --       procedure Set (X: Integer);
   --       entry Update  (Val : Integer);
   --    private
   --       Value : Integer := 0;
   --    end Lock;

   --  the following operations are created:

   --    lock_getN
   --    lock_getP
   --    lock_setN
   --    lock_setP
   --    lock_update1sE
   --    lock_update2sB

   ----------------------------------------------------
   -- Conversion between Entities and External Names --
   ----------------------------------------------------

   No_Dollar_In_Label : constant Boolean := True;
   --  True iff the target does not allow dollar signs ("$") in external names
   --  ??? We want to migrate all platforms to use the same convention.
   --  As a first step, we force this constant to always be True. This
   --  constant will eventually be deleted after we have verified that
   --  the migration does not cause any unforeseen adverse impact.
   --  We chose "__" because it is supported on all platforms, which is
   --  not the case of "$".

   procedure Get_External_Name
     (Entity     : Entity_Id;
      Has_Suffix : Boolean);
   --  Set Name_Buffer and Name_Len to the external name of entity E.
   --  The external name is the Interface_Name, if specified, unless
   --  the entity has an address clause or a suffix.
   --
   --  If the Interface is not present, or not used, the external name
   --  is the concatenation of:
   --
   --    - the string "_ada_", if the entity is a library subprogram,
   --    - the names of any enclosing scopes, each followed by "__"
   --        (or "X_" if the next entity is a subunit)
   --    - the name of the entity
   --    - the string "$" (or "__" if target does not allow "$"), followed
   --        by homonym suffix, if the entity is an overloaded subprogram
   --        or is defined within an overloaded subprogram.

   procedure Get_External_Name_With_Suffix
     (Entity : Entity_Id;
      Suffix : String);
   --  Set Name_Buffer and Name_Len to the external name of entity E.
   --  If Suffix is the empty string the external name is as above,
   --  otherwise the external name is the concatenation of:
   --
   --    - the string "_ada_", if the entity is a library subprogram,
   --    - the names of any enclosing scopes, each followed by "__"
   --        (or "X_" if the next entity is a subunit)
   --    - the name of the entity
   --    - the string "$" (or "__" if target does not allow "$"), followed
   --        by homonym suffix, if the entity is an overloaded subprogram
   --        or is defined within an overloaded subprogram.
   --    - the string "___" followed by Suffix
   --
   --  Note that a call to this procedure has no effect if we are not
   --  generating code, since the necessary information for computing the
   --  proper encoded name is not available in this case.

   --------------------------------------------
   -- Subprograms for Handling Qualification --
   --------------------------------------------

   procedure Qualify_Entity_Names (N : Node_Id);
   --  Given a node N, that represents a block, subprogram body, or package
   --  body or spec, or protected or task type, sets a fully qualified name
   --  for the defining entity of given construct, and also sets fully
   --  qualified names for all enclosed entities of the construct (using
   --  First_Entity/Next_Entity). Note that the actual modifications of the
   --  names is postponed till a subsequent call to Qualify_All_Entity_Names.
   --  Note: this routine does not deal with prepending _ada_ to library
   --  subprogram names. The reason for this is that we only prepend _ada_
   --  to the library entity itself, and not to names built from this name.

   procedure Qualify_All_Entity_Names;
   --  When Qualify_Entity_Names is called, no actual name changes are made,
   --  i.e. the actual calls to Qualify_Entity_Name are deferred until a call
   --  is made to this procedure. The reason for this deferral is that when
   --  names are changed semantic processing may be affected. By deferring
   --  the changes till just before gigi is called, we avoid any concerns
   --  about such effects. Gigi itself does not use the names except for
   --  output of names for debugging purposes (which is why we are doing
   --  the name changes in the first place).

   --  Note: the routines Get_Unqualified_[Decoded]_Name_String in Namet
   --  are useful to remove qualification from a name qualified by the
   --  call to Qualify_All_Entity_Names.

   --------------------------------
   -- Handling of Numeric Values --
   --------------------------------

   --  All numeric values here are encoded as strings of decimal digits.
   --  Only integer values need to be encoded. A negative value is encoded
   --  as the corresponding positive value followed by a lower case m for
   --  minus to indicate that the value is negative (e.g. 2m for -2).

   -------------------------
   -- Type Name Encodings --
   -------------------------

   --  In the following typ is the name of the type as normally encoded by
   --  the debugger rules, i.e. a non-qualified name, all in lower case,
   --  with standard encoding of upper half and wide characters.

   ------------------------
   -- Encapsulated Types --
   ------------------------

   --  In some cases, the compiler encapsulates a type by wrapping it in
   --  a structure. For example, this is used when a size or alignment
   --  specification requires a larger type.
   --  Consider:

   --    type y is mod 2 ** 64;
   --    for y'size use 256;

   --  In this case the compiler generates a structure type y___PAD, which
   --  has a single field whose name is F. This single field is 64 bits
   --  long and contains the actual value. This kind of padding is used
   --  when the logical value to be stored is shorter than the object in
   --  which it is allocated. For example if a size clause is used to set
   --  a size of 256 for a signed integer value, then a typical choice is
   --  to wrap a 64-bit integer in a 256 bit PAD structure.

   --  A similar encapsulation is done for some packed array types,
   --  in which case the structure type is y___JM and the field name
   --  is OBJECT. This is used in the case of a packed array stored
   --  in modular representation (see section on representation of
   --  packed array objects). In this case the JM wrapping is used to
   --  achieve correct positioning of the packed array value (left or
   --  right justified in its field depending on endianness).

   --  When the debugger sees an object of a type whose name has a
   --  suffix of ___PAD or ___JM, the type will be a record containing
   --  a single field, and the name of that field will be all upper case.
   --  In this case, it should look inside to get the value of the inner
   --  field, and neither the outer structure name, nor the field name
   --  should appear when the value is printed.

   -----------------------
   -- Fixed-Point Types --
   -----------------------

   --  Fixed-point types are encoded using a suffix that indicates the
   --  delta and small values. The actual type itself is a normal
   --  integer type.

   --    typ___XF_nn_dd
   --    typ___XF_nn_dd_nn_dd

   --  The first form is used when small = delta. The value of delta (and
   --  small) is given by the rational nn/dd, where nn and dd are decimal
   --  integers.
   --
   --  The second form is used if the small value is different from the
   --  delta. In this case, the first nn/dd rational value is for delta,
   --  and the second value is for small.
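   --  A debugger-side reading of the two fixed-point forms can be sketched
   --  as follows. This is an illustrative Python sketch, not code taken
   --  from any debugger. The function name and the sample type names are
   --  hypothetical.

```python
import re
from fractions import Fraction

# typ___XF_nn_dd or typ___XF_nn_dd_nn_dd
_XF = re.compile(r"^(?P<typ>.+)___XF_(\d+)_(\d+)(?:_(\d+)_(\d+))?$")


def decode_fixed_point(encoded):
    """Return (typ, delta, small), or None if there is no XF suffix."""
    m = _XF.match(encoded)
    if m is None:
        return None
    delta = Fraction(int(m.group(2)), int(m.group(3)))
    if m.group(4) is None:
        small = delta  # first form: small = delta
    else:
        small = Fraction(int(m.group(4)), int(m.group(5)))
    return m.group("typ"), delta, small


same = decode_fixed_point("money___XF_1_100")
different = decode_fixed_point("volts___XF_1_10_1_16")
```

   --  The stored integer value of an object multiplied by small gives the
   --  value of the fixed-point object, which is why a debugger needs small
   --  even when it differs from delta.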
   ------------------------------
   -- VAX Floating-Point Types --
   ------------------------------

   --  Vax floating-point types are represented at run time as integer
   --  types, which are treated specially by the code generator. Their
   --  type names are encoded with the following suffix:

   --    typ___XFF
   --    typ___XFD
   --    typ___XFG

   --  representing the Vax F Float, D Float, and G Float types. The
   --  debugger must treat these specially. In particular, printing
   --  these values can be achieved using the debug procedures that
   --  are provided in package System.Vax_Float_Operations:

   --    procedure Debug_Output_D (Arg : D);
   --    procedure Debug_Output_F (Arg : F);
   --    procedure Debug_Output_G (Arg : G);

   --  These three procedures take a Vax floating-point argument, and
   --  output a corresponding decimal representation to standard output
   --  with no terminating line return.

   --------------------
   -- Discrete Types --
   --------------------

   --  Discrete types are coded with a suffix indicating the range in
   --  the case where one or both of the bounds are discriminants or
   --  variable.

   --  Note: at the current time, we also encode compile time known
   --  bounds if they do not match the natural machine type bounds,
   --  but this may be removed in the future, since it is redundant
   --  for most debugging formats. However, we do not ever need XD
   --  encoding for enumeration base types, since here it is always
   --  clear what the bounds are from the total number of enumeration
   --  literals, and of course we do not need to encode the dummy XR
   --  types generated for renamings.

   --    typ___XD
   --    typ___XDL_lowerbound
   --    typ___XDU_upperbound
   --    typ___XDLU_lowerbound__upperbound

   --  If a discrete type is a natural machine type (i.e. its bounds
   --  correspond in a natural manner to its size), then it is left
   --  unencoded. The above encoding forms are used when there is a
   --  constrained range that does not correspond to the size or that
   --  has discriminant references or other compile time known bounds.
   --  The first form is used if both bounds are dynamic, in which case
   --  two constant objects are present whose names are typ___L and
   --  typ___U in the same scope as typ, and the values of these constants
   --  indicate the bounds. As far as the debugger is concerned, these
   --  are simply variables that can be accessed like any other variables.
   --  In the enumeration case, these values correspond to the Enum_Rep
   --  values for the lower and upper bounds.

   --  The second form is used if the upper bound is dynamic, but the
   --  lower bound is either constant or depends on a discriminant of
   --  the record with which the type is associated. The upper bound
   --  is stored in a constant object of name typ___U as previously
   --  described, but the lower bound is encoded directly into the
   --  name as either a decimal integer, or as the discriminant name.

   --  The third form is similarly used if the lower bound is dynamic,
   --  but the upper bound is compile time known or a discriminant
   --  reference, in which case the lower bound is stored in a constant
   --  object of name typ___L, and the upper bound is encoded directly
   --  into the name as either a decimal integer, or as the discriminant
   --  name.

   --  The fourth form is used if both bounds are discriminant references
   --  or compile time known values, with the encoding first for the lower
   --  bound, then for the upper bound, as previously described.

   -------------------
   -- Modular Types --
   -------------------

   --  A type declared

   --    type x is mod N;

   --  is encoded as a subrange of an unsigned base type with lower bound
   --  0 and upper bound N. That is, there is no name encoding. We use
   --  the standard encodings provided by the debugging format. Thus
   --  we give these types a non-standard interpretation: the standard
   --  interpretation of our encoding would not, in general, imply that
   --  arithmetic on type x was to be performed modulo N (especially not
   --  when N is not a power of 2).
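   --  The four XD forms of the Discrete Types section, together with the
   --  postfix "m" convention for negative values, can be read back as
   --  sketched below. This is an illustrative Python sketch, not code from
   --  any debugger. The function names are hypothetical, and a result of
   --  None for a bound means that the bound is dynamic and has to be
   --  fetched from the typ___L or typ___U object.

```python
import re


def decode_bound(text):
    """Decode one bound: a decimal integer (with m for minus) or a name."""
    if re.fullmatch(r"\d+m?", text):
        value = int(text.rstrip("m"))
        return -value if text.endswith("m") else value
    return text  # otherwise the text is a discriminant name


def decode_discrete(encoded):
    """Return (typ, lower, upper), or None if there is no valid XD suffix."""
    typ, sep, tail = encoded.rpartition("___XD")
    if not sep:
        return None
    lower = upper = None
    if tail.startswith("LU_"):
        lo_text, found, hi_text = tail[3:].partition("__")
        if not found:
            return None
        lower, upper = decode_bound(lo_text), decode_bound(hi_text)
    elif tail.startswith("L_"):
        lower = decode_bound(tail[2:])
    elif tail.startswith("U_"):
        upper = decode_bound(tail[2:])
    elif tail:
        return None  # unknown trailing text after ___XD
    return typ, lower, upper


both_static = decode_discrete("idx___XDLU_5m__10")
upper_is_discriminant = decode_discrete("idx___XDU_n")
```

   --  The "LU_" prefix is tested before "L_" so that the fourth form is
   --  not mistaken for the second.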
   ------------------
   -- Biased Types --
   ------------------

   --  Only discrete types can be biased, and the fact that they are
   --  biased is indicated by a suffix of the form:

   --    typ___XB_lowerbound__upperbound

   --  Here lowerbound and upperbound are decimal integers, with the
   --  usual (postfix "m") encoding for negative numbers. Biased
   --  types are only possible where the bounds are compile time
   --  known, and the values are represented as unsigned offsets
   --  from the lower bound given. For example:

   --    type Q is range 10 .. 15;
   --    for Q'size use 3;

   --  The size clause will force values of type Q in memory to be
   --  stored in biased form (e.g. 11 will be represented by the
   --  bit pattern 001).

   ----------------------------------------------
   -- Record Types with Variable-Length Fields --
   ----------------------------------------------

   --  The debugging formats do not fully support these types, and indeed
   --  some formats simply generate no useful information at all for such
   --  types. In order to provide information for the debugger, gigi creates
   --  a parallel type in the same scope with one of the names

   --    type___XVE
   --    type___XVU

   --  The former name is used for a record and the latter for the union
   --  that is made for a variant record (see below) if that record or
   --  union has a field of variable size or if the record or union itself
   --  has a variable size. These encodings suffix any other encodings
   --  that might be suffixed to the type name.

   --  The idea here is to provide all the needed information to interpret
   --  objects of the original type in the form of a "fixed up" type, which
   --  is representable using the normal debugging information.

   --  There are three cases to be dealt with. First, some fields may have
   --  variable positions because they appear after variable-length fields.
   --  To deal with this, we encode *all* the field bit positions of the
   --  special ___XV type in a non-standard manner.
   --  The idea is to encode not the position, but rather information
   --  that allows computing the position of a field from the position
   --  of the previous field. The algorithm for computing the actual
   --  positions of all fields and the length of the record is as
   --  follows. In this description, let P represent the current
   --  bit position in the record.

   --    1. Initialize P to 0.

   --    2. For each field in the record,

   --       2a. If an alignment is given (see below), then round P
   --       up, if needed, to the next multiple of that alignment.

   --       2b. If a bit position is given, then increment P by that
   --       amount (that is, treat it as an offset from the end of the
   --       preceding record).

   --       2c. Assign P as the actual position of the field.

   --       2d. Compute the length, L, of the represented field (see below)
   --       and compute P'=P+L. Unless the field represents a variant part
   --       (see below and also Variant Record Encoding), set P to P'.

   --  The alignment, if present, is encoded in the field name of the
   --  record, which has a suffix:

   --    fieldname___XVAnn

   --  where the nn after the XVA indicates the alignment value in storage
   --  units. This encoding is present only if an alignment is present.

   --  The size of the record described by an XVE-encoded type (in bits)
   --  is generally the maximum value attained by P' in step 2d above,
   --  rounded up according to the record's alignment.

   --  Second, the variable-length fields themselves are represented by
   --  replacing the type by a special access type. The designated type
   --  of this access type is the original variable-length type, and the
   --  fact that this field has been transformed in this way is signalled
   --  by encoding the field name as:

   --    field___XVL

   --  where field is the original field name. If a field is both
   --  variable-length and also needs an alignment encoding, then the
   --  encodings are combined using:

   --    field___XVLnn

   --  Note: the reason that we change the type is so that the resulting
   --  type has no variable-length fields.
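   --  Steps 1 and 2a to 2d above, and the record size rule, can be
   --  sketched as follows. This is an illustrative Python sketch, not the
   --  code of any debugger. The function names, the tuple layout of a
   --  field and the sample lengths are hypothetical. It assumes 8-bit
   --  storage units, and the run-time lengths of the variable-length
   --  fields are simply supplied by the caller.

```python
import re

STORAGE_UNIT_BITS = 8  # assumption: one storage unit is 8 bits

_SUFFIX = re.compile(r"___XV([AL])(\d*)$")  # ___XVAnn, ___XVL or ___XVLnn


def round_up(value, multiple):
    """Round value up to the next multiple (value itself if aligned)."""
    return -(-value // multiple) * multiple


def layout(fields, record_alignment_bits):
    """Compute field positions and the record size, all in bits.

    fields: list of (encoded_name, encoded_position, length_bits,
        is_variant_part) tuples, in the order they appear in the XVE type.
    """
    positions = {}
    p = 0          # step 1
    max_end = 0    # largest P' seen so far
    for name, encoded_position, length_bits, is_variant_part in fields:
        m = _SUFFIX.search(name)
        if m and m.group(2):                     # step 2a
            p = round_up(p, int(m.group(2)) * STORAGE_UNIT_BITS)
        p += encoded_position                    # step 2b
        positions[name] = p                      # step 2c
        end = p + length_bits                    # step 2d: P' = P + L
        max_end = max(max_end, end)
        if not is_variant_part:
            p = end
    return positions, round_up(max_end, record_alignment_bits)


# Sample record: lengths of the three ___XVL fields are made-up run-time
# values (40, 64 and 64 bits); 8-bit Character and 32-bit Float assumed.
sample = [
    ("A", 0, 8, False),
    ("B", 24, 32, False),
    ("C___XVL", 0, 40, False),
    ("D___XVA4", 0, 32, False),
    ("E___XVL4", 0, 64, False),
    ("F___XVL8", 0, 64, False),
    ("G", 0, 32, False),
]
positions, size_bits = layout(sample, record_alignment_bits=64)
```

   --  With these sample lengths, field D___XVA4 is moved from bit 104 up
   --  to bit 128 by its alignment of 4 storage units, and the record size
   --  comes out as 384 bits.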
   --  At least some of the formats used for debugging information simply
   --  cannot tolerate variable-length fields, so the encoded information
   --  would get lost.

   --  Third, in the case of a variant record, the special union
   --  that contains the variants is replaced by a normal C union.
   --  In this case, the positions are all zero.

   --  Discriminants appear before any variable-length fields that depend
   --  on them, with one exception. In some cases, a discriminant
   --  governing the choice of a variant clause may appear in the list
   --  of fields of an XVE type after the entry for the variant clause
   --  itself (this can happen in the presence of a representation clause
   --  for the record type in the source program). However, when this
   --  happens, the discriminant's position may be determined by first
   --  applying the rules described in this section, ignoring the variant
   --  clause. As a result, discriminants can always be located
   --  independently of the variable-length fields that depend on them.

   --  The size of the ___XVE or ___XVU record or union is set to the
   --  alignment (in bytes) of the original object so that the debugger
   --  can calculate the size of the original type.

   --  As an example of this encoding, consider the declarations:

   --    type Q is array (1 .. V1) of Float;       -- alignment 4
   --    type R is array (1 .. V2) of Long_Float;  -- alignment 8

   --    type X is record
   --       A : Character;
   --       B : Float;
   --       C : String (1 .. V3);
   --       D : Float;
   --       E : Q;
   --       F : R;
   --       G : Float;
   --    end record;

   --  The encoded type looks like:

   --    type anonymousQ is access Q;
   --    type anonymousR is access R;

   --    type X___XVE is record
   --       A        : Character;               -- position contains 0
   --       B        : Float;                   -- position contains 24
   --       C___XVL  : access String (1 ..
   --                                  V3);     -- position contains 0
   --       D___XVA4 : Float;                   -- position contains 0
   --       E___XVL4 : anonymousQ;              -- position contains 0
   --       F___XVL8 : anonymousR;              -- position contains 0
   --       G        : Float;                   -- position contains 0
   --    end record;

   --  Any bit sizes recorded for fields other than dynamic fields and
   --  variants are honored as for ordinary records.

   --  Notes:

   --  1) The B field could also have been encoded by using a position
   --  of zero, and an alignment of 4, but in such a case, the coding by
   --  position is preferred (since it takes up less space). We have used
   --  the (illegal) notation access xxx as field types in the example
   --  above.

   --  2) The E field does not actually need the alignment indication
   --  but this may not be detected in this case by the conversion
   --  routines.

   --  3) Our conventions do not cover all XVE-encoded records in which
   --  some, but not all, fields have representation clauses. Such
   --  records may, therefore, be displayed incorrectly by debuggers.
   --  This situation is not common.

   -----------------------
   -- Base Record Types --
   -----------------------

   --  Under certain circumstances, debuggers need two descriptions
   --  of a record type, one that gives the actual details of the
   --  base type's structure (as described elsewhere in these
   --  comments) and one that may be used to obtain information
   --  about the particular subtype and the size of the objects
   --  being typed. In such cases the compiler will substitute a
   --  type whose name is typically compiler-generated and
   --  irrelevant except as a key for obtaining the actual type.

   --  Specifically, if this name is x, then we produce a record
   --  type named x___XVS consisting of one field. The name of
   --  this field is that of the actual type being encoded, which
   --  we'll call y (the type of this single field is arbitrary).
   --  Both x and y may have corresponding ___XVE types.

   --  The size of the objects typed as x should be obtained from
   --  the structure of x (and x___XVE, if applicable) as for
   --  ordinary types unless there is a variable named x___XVZ, which,
   --  if present, will hold the size (in bits) of x.

   --  The type x will either be a subtype of y (see also Subtypes
   --  of Variant Records, below) or will contain no fields at
   --  all. The layout, types, and positions of these fields will
   --  be accurate, if present. (Currently, however, the GDB
   --  debugger makes no use of x except to determine its size).

   --  Among other uses, XVS types are sometimes used to encode
   --  unconstrained types. For example, given
   --
   --     subtype Int is INTEGER range 0..10;
   --     type T1 (N: Int := 0) is record
   --        F1: String (1 .. N);
   --     end record;
   --     type AT1 is array (INTEGER range <>) of T1;
   --
   --  the element type for AT1 might have a type defined as if it had
   --  been written:
   --
   --     type at1___C_PAD is record null; end record;
   --     for at1___C_PAD'Size use 16 * 8;
   --
   --  and there would also be
   --
   --     type at1___C_PAD___XVS is record t1: Integer; end record;
   --     type t1 is ...
   --
   --  Had the subtype Int been dynamic:
   --
   --     subtype Int is INTEGER range 0 .. M;  -- M a variable
   --
   --  Then the compiler would also generate a declaration whose effect
   --  would be
   --
   --     at1___C_PAD___XVZ: constant Integer := 32 + M * 8 + padding term;
   --
   --  Not all unconstrained types are so encoded; the XVS
   --  convention may be unnecessary for unconstrained types of
   --  fixed size. However, this encoding is always necessary when
   --  a subcomponent type (array element's type or record field's
   --  type) is an unconstrained record type some of whose
   --  components depend on discriminant values.

   -----------------
   -- Array Types --
   -----------------

   --  Since there is no way for the debugger to obtain the index subtypes
   --  for an array type, we produce a type that has the name of the
   --  array type followed by "___XA" and is a record whose field names
   --  are the names of the types for the bounds.
   --  Each of these fields has an integer type, which is meaningless.

   --  To conserve space, we do not produce this type unless one of
   --  the index types is either an enumeration type, has a variable
   --  upper bound, has a lower bound different from the constant 1,
   --  is a biased type, or is wider than "sizetype".

   --  Given the full encoding of these types (see above description for
   --  the encoding of discrete types), this means that all necessary
   --  information for addressing arrays is available. In some
   --  debugging formats, some or all of the bounds information may
   --  be available redundantly, particularly in the fixed-point case,
   --  but this information can in any case be ignored by the debugger.

   ----------------------------
   -- Note on Implicit Types --
   ----------------------------

   --  The compiler creates implicit type names in many situations where
   --  a type is present semantically, but no specific name is present.
   --  For example:

   --     S : Integer range M .. N;

   --  Here the subtype of S is not Integer, but rather an anonymous
   --  subtype of Integer. Where possible, the compiler generates names
   --  for such anonymous types that are related to the type from which
   --  the subtype is obtained as follows:

   --     T name suffix

   --  where name is the name from which the subtype is obtained, using
   --  lower case letters and underscores, and suffix starts with an upper
   --  case letter. For example, the name for the above declaration of S
   --  might be:

   --     TintegerS4b

   --  If the debugger is asked to give the type of an entity and the type
   --  has the form T name suffix, it is probably appropriate to just use
   --  "name" in the response since this is what is meaningful to the
   --  programmer.
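
   --  As a worked illustration of the T name suffix form (a sketch only;
   --  how a particular debugger splits such names is not specified here,
   --  and the split shown assumes that name contains no upper case
   --  letters, which holds for names folded from the source but may not
   --  hold for compiler-generated names), TintegerS4b decomposes as:

   --     TintegerS4b
   --     T             prefix marking an implicit type name
   --      integer      name (lower case letters and underscores)
   --             S4b   suffix (starts with an upper case letter)

   --  so a debugger asked for the type of S could answer "integer".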

   -------------------------------------------------
   -- Subprograms for Handling Encoded Type Names --
   -------------------------------------------------

   procedure Get_Encoded_Name (E : Entity_Id);
   --  If the entity is a type name, store the external name of the entity
   --  as in Get_External_Name, followed by three underscores plus the type
   --  encoding in Name_Buffer with the length in Name_Len, and an ASCII.NUL
   --  character stored following the name. Otherwise set Name_Buffer and
   --  Name_Len to hold the entity name. Note that a call to this procedure
   --  has no effect if we are not generating code, since the necessary
   --  information for computing the proper encoded name is not available
   --  in this case.

   --------------
   -- Renaming --
   --------------

   --  Debugging information is generated for exception, object, package,
   --  and subprogram renaming (generic renamings are not significant, since
   --  generic templates are not relevant at debugging time).

   --  Consider a renaming declaration of the form

   --    x typ renames y;

   --  There is one case in which no special debugging information is
   --  required, namely the case of an object renaming where the backend
   --  allocates a reference for the renamed variable, and the entity x is
   --  this reference. The debugger can handle this case without any special
   --  processing or encoding (it won't know it was a renaming, but that
   --  does not matter).

   --  All other cases of renaming generate a dummy type definition for
   --  an entity whose name is:

   --    x___XR    for an object renaming
   --    x___XRE   for an exception renaming
   --    x___XRP   for a package renaming

   --  The name is fully qualified in the usual manner, i.e. qualified in
   --  the same manner as the entity x would be. In the case of a package
   --  renaming where x is a child unit, the qualification includes the
   --  name of the parent unit, to disambiguate child units with the same
   --  simple name and (of necessity) different parents.

   --  Note: subprogram renamings are not encoded at the present time.

   --  The type is an enumeration type with a single enumeration literal
   --  that is an identifier which describes the renamed variable.

   --    For the simple entity case, where y is an entity name,
   --    the enumeration is of the form:

   --       (y___XE)

   --    i.e. the enumeration type has a single literal, whose name
   --    matches the name y, with the XE suffix. The entity for this
   --    enumeration literal is fully qualified in the usual manner.
   --    All subprogram, exception, and package renamings fall into
   --    this category, as well as simple object renamings.

   --    For the object renaming case where y is a selected component or an
   --    indexed component, the literal name is suffixed by additional
   --    fields that give details of the components. The name starts as
   --    above with a y___XE entity indicating the outer level variable.
   --    Then a series of selections and indexing operations can be
   --    specified as follows:

   --      Indexed component

   --        A series of subscript values appear in sequence, the number
   --        corresponds to the number of dimensions of the array. The
   --        subscripts have one of the following two forms:

   --          XSnnn

   --            Here nnn is a constant value, encoded as a decimal
   --            integer (pos value for enumeration type case). Negative
   --            values have a trailing 'm' as usual.

   --          XSe

   --            Here e is the (unqualified) name of a constant entity in
   --            the same scope as the renaming which contains the subscript
   --            value.

   --      Slice

   --        For the slice case, we have two entries. The first is for
   --        the lower bound of the slice, and has the form

   --          XLnnn
   --          XLe

   --            Specifies the lower bound, using exactly the same encoding
   --            as for an XS subscript as described above.

   --        Then the upper bound appears in the usual XSnnn/XSe form

   --      Selected component

   --        For a selected component, we have a single entry

   --          XRf

   --            Here f is the field name for the selection

   --        For an explicit dereference (.all), we have a single entry

   --          XA

   --      As an example, consider the declarations:

   --        package p is
   --           type q is record
   --              m : string (2 ..
   --                          5);
   --           end record;
   --
   --           type r is array (1 .. 10, 1 .. 20) of q;
   --
   --           g : r;
   --
   --           z : string renames g (1,5).m (2 .. 3);
   --        end p;

   --     The generated type definition would appear as

   --       type p__z___XR is
   --         (p__g___XEXS1XS5XRmXL2XS3);
   --          p__g___XE--------------------outer entity is g
   --                   XS1-----------------first subscript for g
   --                      XS5--------------second subscript for g
   --                         XRm-----------select field m
   --                            XL2--------lower bound of slice
   --                               XS3-----upper bound of slice

   function Debug_Renaming_Declaration (N : Node_Id) return Node_Id;
   --  The argument N is a renaming declaration. The result is a type
   --  declaration as described in the above paragraphs. If no special
   --  debug declaration is needed, then Empty is returned.

   ---------------------------
   -- Packed Array Encoding --
   ---------------------------

   --  For every packed array, two types are created, and both appear in
   --  the debugging output.

   --    The original declared array type is a perfectly normal array type,
   --    and its index bounds indicate the original bounds of the array.

   --    The corresponding packed array type, which may be a modular type, or
   --    may be an array of bytes type (see Exp_Pakd for full details). This
   --    is the type that is actually used in the generated code and for
   --    debugging information for all objects of the packed type.

   --  The name of the corresponding packed array type is:

   --    ttt___XPnnn

   --  where
   --    ttt is the name of the original declared array
   --    nnn is the component size in bits (1-31)

   --  When the debugger sees that an object is of a type that is encoded
   --  in this manner, it can use the original type to determine the bounds,
   --  and the component size to determine the packing details.

   -------------------------------------------
   -- Packed Array Representation in Memory --
   -------------------------------------------

   --  Packed arrays are represented in tightly packed form, with no extra
   --  bits between components.
   --  This is true even when the component size
   --  is not a factor of the storage unit size, so that as a result it is
   --  possible for components to cross storage unit boundaries.

   --  The layout in storage is identical, regardless of whether the
   --  implementation type is a modular type or an array-of-bytes type.
   --  See Exp_Pakd for details of how these implementation types are used,
   --  but for the purpose of the debugger, only the starting address of
   --  the object in memory is significant.

   --  The following example should show clearly how the packing works in
   --  the little-endian and big-endian cases:

   --     type B is range 0 .. 7;
   --     for B'Size use 3;

   --     type BA is array (0 .. 5) of B;
   --     pragma Pack (BA);

   --     BV : constant BA := (1,2,3,4,5,6);

   --  Little endian case

   --        BV'Address + 2   BV'Address + 1    BV'Address + 0
   --     +-----------------+-----------------+-----------------+
   --     | ? ? ? ? ? ? 1 1 | 0 1 0 1 1 0 0 0 | 1 1 0 1 0 0 0 1 |
   --     +-----------------+-----------------+-----------------+
   --       <---------> <-----> <---> <---> <-----> <---> <--->
   --       unused bits  BV(5)  BV(4) BV(3)  BV(2)  BV(1) BV(0)

   --  Big endian case

   --        BV'Address + 0   BV'Address + 1    BV'Address + 2
   --     +-----------------+-----------------+-----------------+
   --     | 0 0 1 0 1 0 0 1 | 1 1 0 0 1 0 1 1 | 1 0 ? ? ? ? ? ? |
   --     +-----------------+-----------------+-----------------+
   --       <---> <---> <-----> <---> <---> <-----> <--------->
   --       BV(0) BV(1)  BV(2)  BV(3) BV(4)  BV(5)  unused bits

   --  Note that if a modular type is used to represent the array, the
   --  allocation in memory is not the same as a normal modular type.
   --  The difference occurs when the allocated object is larger than
   --  the size of the array. For a normal modular type, we extend the
   --  value on the left with zeroes.

   --  For example, in the normal modular case, if we have a 6-bit
   --  modular type, declared as mod 2**6, and we allocate an 8-bit
   --  object for this type, then we extend the value with two bits
   --  on the most significant end, and in either the little-endian
   --  or big-endian case, the value 63 is represented as 00111111
   --  in binary in memory.

   --  For a modular type used to represent a packed array, the rule is
   --  different. In this case, if we have to extend the value, then we
   --  do it with undefined bits (which are not initialized and whose value
   --  is irrelevant to any generated code). Furthermore these bits are on
   --  the right (least significant bits) in the big-endian case, and on the
   --  left (most significant bits) in the little-endian case.

   --  For example, if we have a packed boolean array of 6 bits, all set
   --  to True, stored in an 8-bit object, then the value in memory in
   --  binary is ??111111 in the little-endian case, and 111111?? in the
   --  big-endian case.

   --  This is done so that the representation of packed arrays does not
   --  depend on whether we use a modular representation or array of bytes
   --  as previously described. This ensures that we can pass such values
   --  by reference in the case where a subprogram has to be able to handle
   --  values stored in either form.

   --  Note that when we extract the value of such a modular packed array,
   --  we expect to retrieve only the relevant bits, so in this same example,
   --  when we extract the value, we get 111111 in both cases, and the code
   --  generated by the front end assumes this, although it does not assume
   --  that any high order bits are defined.

   --  There are opportunities for optimization based on the knowledge that
   --  the unused bits are irrelevant for this type of packed array.
   --  For example, if we have two such 6-bit-in-8-bit values and we do an
   --  assignment:

   --     a := b;

   --  Then logically, we extract the 6 bits and store only 6 bits in the
   --  result, but the back end is free to simply assign the entire 8 bits
   --  in this case, since we don't actually care about the undefined bits.
   --  However, in the equality case, it is important to ensure that the
   --  undefined bits do not participate in an equality test.

   --  If a modular packed array value is assigned to a register, then
   --  logically it could always be held right justified, to avoid any
   --  need to shift, e.g. when doing comparisons. But probably this is
   --  a bad choice, as it would mean that an assignment such as a := b
   --  above would require shifts when one value is in a register and the
   --  other value is in memory.

   ------------------------------------------------------
   -- Subprograms for Handling Packed Array Type Names --
   ------------------------------------------------------

   function Make_Packed_Array_Type_Name
     (Typ   : Entity_Id;
      Csize : Uint)
      return  Name_Id;
   --  This function is used in Exp_Pakd to create the name that is encoded
   --  as described above. The entity Typ provides the name ttt, and the
   --  value Csize is the component size that provides the nnn value.

   --------------------------------------
   -- Pointers to Unconstrained Arrays --
   --------------------------------------

   --  There are two kinds of pointers to arrays. The debugger can tell
   --  which format is in use by the form of the type of the pointer.

   --    Fat Pointers

   --      Fat pointers are represented as a struct with two fields. This
   --      struct has two distinguished field names:

   --        P_ARRAY is a pointer to the array type. The name of this
   --        type is the unconstrained type followed by "___XUA". This
   --        array will have bounds which are the discriminants, and
   --        hence are unparsable, but will give the number of
   --        subscripts and the component type.

   --        P_BOUNDS is a pointer to a struct, the name of whose type is the
   --        unconstrained array name followed by "___XUB" and which has
   --        fields of the form

   --           LBn (n a decimal integer) lower bound of n'th dimension
   --           UBn (n a decimal integer) upper bound of n'th dimension

   --        The bounds may be any integral type. In the case of an
   --        enumeration type, Enum_Rep values are used.

   --      The debugging information will sometimes reference an anonymous
   --      fat pointer type. Such types are given the name xxx___XUP, where
   --      xxx is the name of the designated type. If the debugger is asked
   --      to output such a type name, the appropriate form is "access xxx".

   --    Thin Pointers

   --      The value of a thin pointer is a pointer to the second field
   --      of a structure with two fields. The name of this structure's
   --      type is "arr___XUT", where "arr" is the name of the
   --      unconstrained array type. Even though it actually points into
   --      the middle of this structure, the thin pointer's type in
   --      debugging information is pointer-to-arr___XUT.

   --      The first field of arr___XUT is named BOUNDS, and has a type
   --      named arr___XUB, with the structure described for such types
   --      in fat pointers, as described above.

   --      The second field of arr___XUT is named ARRAY, and contains
   --      the actual array. Because this array has a dynamic size,
   --      determined by the BOUNDS field that precedes it, all of the
   --      information about arr___XUT is encoded in a parallel type named
   --      arr___XUT___XVE, with fields BOUNDS and ARRAY___XVL. As for
   --      previously described ___XVE types, ARRAY___XVL has
   --      a pointer-to-array type. However, the array type in this case
   --      is named arr___XUA and only its element type is meaningful,
   --      just as described for fat pointers.

   --------------------------------------
   -- Tagged Types and Type Extensions --
   --------------------------------------

   --  A type C derived from a tagged type P has a field named "_parent"
   --  of type P that contains its inherited fields.
   --  The type of this
   --  field is usually P (encoded as usual if it has a dynamic size),
   --  but may be a more distant ancestor, if P is a null extension of
   --  that type.

   --  The type tag of a tagged type is a field named _tag, of type void*.
   --  If the type is derived from another tagged type, its _tag field is
   --  found in its _parent field.

   -----------------------------
   -- Variant Record Encoding --
   -----------------------------

   --  The variant part of a variant record is encoded as a single field
   --  in the enclosing record, whose name is:

   --     discrim___XVN

   --  where discrim is the unqualified name of the variant. This field name
   --  is built by gigi (not by code in this unit). In the case of an
   --  Unchecked_Union record, this discriminant will not appear in the
   --  record, and the debugger must proceed accordingly (basically it
   --  can treat this case as it would a C union).

   --  The type corresponding to this field has a name that is obtained
   --  by concatenating the type name with the above string and is similar
   --  to a C union, in which each member of the union corresponds to one
   --  variant. However, unlike a C union, the size of the type may be
   --  variable even if each of the components is of fixed size, since it
   --  includes a computation of which variant is present. In that case,
   --  it will be encoded as above and a type with the suffix "___XVN___XVU"
   --  will be present.

   --  The name of the union member is encoded to indicate the choices, and
   --  is a string given by the following grammar:

   --    union_name    ::= {choice} | others_choice
   --    choice        ::= simple_choice | range_choice
   --    simple_choice ::= S number
   --    range_choice  ::= R number T number
   --    number        ::= {decimal_digit} [m]
   --    others_choice ::= O (upper case letter O)

   --  The m in a number indicates a negative value. As an example of this
   --  encoding scheme, the choice 1 ..
   --  4 | 7 | -10 would be represented by

   --    R1T4S7S10m

   --  In the case of enumeration values, the values used are the
   --  actual representation values in the case where an enumeration type
   --  has an enumeration representation clause (i.e. they are values that
   --  correspond to the use of the Enum_Rep attribute).

   --  The type of the inner record is given by the name of the union
   --  type (as above) concatenated with the above string. Since that
   --  type may itself be variable-sized, it may also be encoded as above
   --  with a new type with a further suffix of "___XVU".

   --  As an example, consider:

   --    type Var (Disc : Boolean := True) is record
   --       M : Integer;

   --       case Disc is
   --         when True =>
   --           R : Integer;
   --           S : Integer;

   --         when False =>
   --           T : Integer;
   --       end case;
   --    end record;

   --    V1 : Var;

   --  In this case, the type var is represented as a struct with three
   --  fields; the first two are "disc" and "m", representing the values
   --  of these record components.

   --  The third field is a union of two types, with field names S1 and O.
   --  S1 is a struct with fields "r" and "s", and O is a struct with
   --  fields "t".

   ------------------------------------------------
   -- Subprograms for Handling Variant Encodings --
   ------------------------------------------------

   procedure Get_Variant_Encoding (V : Node_Id);
   --  This procedure is called by Gigi with V being the variant node.
   --  The corresponding encoding string is returned in Name_Buffer with
   --  the length of the string in Name_Len, and an ASCII.NUL character
   --  stored following the name.

   ---------------------------------
   -- Subtypes of Variant Records --
   ---------------------------------

   --  A subtype of a variant record is represented by a type in which the
   --  union field from the base type is replaced by one of the possible
   --  values.
   --  For example, if we have:

   --    type Var (Disc : Boolean := True) is record
   --       M : Integer;

   --       case Disc is
   --         when True =>
   --           R : Integer;
   --           S : Integer;

   --         when False =>
   --           T : Integer;
   --       end case;

   --    end record;
   --    V1 : Var;
   --    V2 : Var (True);
   --    V3 : Var (False);

   --  Here V2 for example is represented with a subtype whose name is
   --  something like TvarS3b, which is a struct with three fields. The
   --  first two fields are "disc" and "m" as for the base type, and
   --  the third field is S1, which contains the fields "r" and "s".

   --  The debugger should simply ignore structs with names of the form
   --  corresponding to variants, and consider the fields inside as
   --  belonging to the containing record.

   -------------------------------------------
   -- Character literals in Character Types --
   -------------------------------------------

   --  Character types are enumeration types at least one of whose
   --  enumeration literals is a character literal. Enumeration literals
   --  are usually simply represented using their identifier names. In
   --  the case where an enumeration literal is a character literal, the
   --  name is encoded as described in the following paragraph.

   --  A name QUhh, where each 'h' is a lower-case hexadecimal digit,
   --  stands for a character whose Unicode encoding is hh, and
   --  QWhhhh likewise stands for a wide character whose encoding
   --  is hhhh. The representation values are encoded as for ordinary
   --  enumeration literals (and have no necessary relationship to the
   --  values encoded in the names).

   --  For example, given the type declaration

   --    type x is (A, 'C', B);

   --  the second enumeration literal would be named QU43 and the
   --  value assigned to it would be 1.

   ----------------------------
   -- Effect of Optimization --
   ----------------------------

   --  If the program is compiled with optimization on (e.g. -O1 switch
   --  specified), then there may be variations in the output from the
   --  above specification. In particular, objects may disappear from
   --  the output.
   --  This includes not only constants and variables that
   --  the program declares at the source level, but also the x___L and
   --  x___U constants created to describe the lower and upper bounds of
   --  subtypes with dynamic bounds. This means, for example, that array
   --  bounds may disappear if optimization is turned on. The debugger
   --  is expected to recognize that these constants are missing and
   --  cope as best it can with the limited information available.

end Exp_Dbug;