Diffstat (limited to 'TAO/TAO_IDL/docs/WRITING_A_BE')
-rw-r--r-- | TAO/TAO_IDL/docs/WRITING_A_BE | 1350 |
1 files changed, 1350 insertions, 0 deletions
diff --git a/TAO/TAO_IDL/docs/WRITING_A_BE b/TAO/TAO_IDL/docs/WRITING_A_BE new file mode 100644 index 00000000000..5c3c069f7a1 --- /dev/null +++ b/TAO/TAO_IDL/docs/WRITING_A_BE @@ -0,0 +1,1350 @@ +OMG INTERFACE DEFINITION LANGUAGE COMPILER FRONT END PROTOCOLS +============================================================== + +INTRODUCTION +------------ + +Welcome to the publicly available source release of SunSoft's +implementation of the compiler front end (CFE) for OMG Interface Definition +Language! + +This document explains how to use the release to create a fully functional +OMG Interface Definition Language to target language compiler for your +selected target system configuration. The section OVERVIEW explains this +document's structure. + +CONTEXT +------- + +The implementation has three parts: + +1. A main program driving the compilation process +2. A parser and attendant utilities for converting the IDL input into + an internal form +3. One or more back ends which take as input the internal form representing + the IDL input, and which produce output in a target language and target + format + +The release contains components 1 and 2, and a demonstration implementation +of component 3. To use this release, you + +- write a back end which takes the internal representation of the parsed input + and translates it to the target language and format. You may replace or + modify the demonstration back end provided. +- link the back end with the provided main program and parser sources + to produce a complete compiler. + +OVERVIEW +-------- + +This document does not explain IDL nor does it introduce IDL features. +For this information, refer to the OMG CORBA specification, available by +anonymous FTP from omg.org. + +This document does not explain C++, except to demonstrate how it is +used to construct the CFE. The ARM by Stroustrup and Ellis provides a +thorough explanation of C++. + +This document consists of two independent parts. 
The first part +describes all CFE supported protocols and the required +application programmer's interface entry points that a conformant +BE must provide. The second part steps through the process of +constructing a working BE. + +The first part describes: + +- The compilation process +- The Abstract Syntax Tree (AST) internal representation of parsed IDL + input +- How access to member data fields is managed +- How the AST is generated from the IDL input (Generator protocol) +- How definition scopes are nested and how name lookup works +- The narrowing mechanism +- How definition scopes are managed and how nodes are added to scopes +- How BEs get control during the AST construction process (Add protocol) +- The inheritance scheme used by the AST and how it affects BEs +- How errors are handled and reported +- How the CFE is initialized +- How the command line arguments are parsed +- What global variables and functions are provided +- What API is required to be supported by a BE in order to link + with the CFE +- What files must be included in each BE file + +The second part describes: + +- The API to be supplied by each BE +- How to subclass from the AST to add BE specific functionality +- How to subclass from the Generator protocol to create BE specific + extended AST nodes +- How to write constructors for the derived BE classes +- How to use the Add protocol to store BE specific information +- How to maintain BE specific information which applies to the entire + AST generated from the IDL input +- How to use data members in your BE +- How to build a complete compiler + +PART I. FEATURES OF THE CFE +-=========================- + +THE COMPILATION PROCESS +----------------------- + +The OMG IDL compiler operates as follows: + +- Parses command line arguments. If an option is directed at a + BE, an appropriate operation provided by the BE is invoked to process + the option. +- Performs global initialization. +- Forks a copy of the compiler for each file specified as input. 
+- An ANSI-compatible preprocessor preprocesses the IDL input. +- Parses the file using the CFE parser, and constructs an AST describing the + IDL input. +- Prints the AST for verification, if requested. +- Invokes the BE to process the AST and produce the output + characteristic of that BE. + +ABSTRACT SYNTAX TREE +-------------------- + +The AST (Abstract Syntax Tree) is the primary mechanism for communication +between a BE and the CFE. It consists of a tree of instances of classes +defined in the CFE or refinements of those classes as defined in a BE. +The class hierarchy of the AST closely resembles the structure of the IDL +syntax. Most AST classes have direct equivalents in IDL constructs. + +The UTL_Scope class defines common functionality for definition scope +management and name lookup. This is explained in a following section. +UTL_Scope is defined in include/utl_scope.hh and implemented in +util/utl_scope.cc. + +The AST provides the following classes: + +AST_Decl Base of the AST class hierarchy. Each class in the AST + inherits from AST_Decl. Defined in include/ast_decl.hh + and implemented in ast/ast_decl.cc + +AST_Type Common base class for all classes which represent IDL + type constructs. Defined in include/ast_type.hh and + implemented in ast/ast_type.cc. Inherits from AST_Decl. + +AST_ConcreteType Common base class for all classes which represent IDL + types other than interfaces. Defined in the file + include/ast_concrete_type.hh and implemented in + ast/ast_concrete_type.cc. Inherits from AST_Type. + +AST_PredefinedType Instances of this class represent all predefined types + such as long, char and so forth. Defined in the file + include/ast_predefined_type.hh and implemented in + ast/ast_predefined_type.cc. Inherits from + AST_ConcreteType. + +AST_Module Represents the IDL module construct. Defined in the + file include/ast_module.hh and implemented in + ast/ast_module.cc. Inherits from AST_Decl and + UTL_Scope. 
+ +AST_Root Represents the root of the abstract syntax tree being + constructed. Is a subclass of AST_Module. Can be + subclassed in BEs to store information associated with + the entire AST. Defined in the file include/ast_root.hh + and implemented in ast/ast_root.cc. Inherits from + AST_Module. + +AST_Interface Represents the IDL interface construct. Defined in + include/ast_interface.hh and implemented in the file + ast/ast_interface.cc. Inherits from AST_Type and + UTL_Scope. + +AST_InterfaceFwd Represents a forward declaration of an IDL interface. + Defined in include/ast_interface_fwd.hh and implemented + in ast/ast_interface_fwd.cc. Inherits from AST_Decl. + +AST_Attribute Represents an IDL attribute construct. Defined in + include/ast_attribute.hh and implemented in the file + ast/ast_attribute.cc. Inherits from AST_Decl. + +AST_Exception Represents an IDL exception construct. Defined in + include/ast_exception.hh and implemented in the file + ast/ast_exception.cc. Inherits from AST_Decl. + +AST_Structure Represents an IDL struct construct. Defined in the file + include/ast_structure.hh and implemented in the file + ast/ast_structure.cc. Inherits from AST_ConcreteType + and UTL_Scope. + +AST_Field Represents a field in an IDL struct or exception + construct. Defined in include/ast_field.hh and + implemented in ast/ast_field.cc. Inherits from + AST_Decl. + +AST_Operation Represents an IDL operation construct. Defined in the + file include/ast_operation.hh and implemented in + ast/ast_operation.cc. Inherits from AST_Decl and + UTL_Scope. + +AST_Argument Represents an argument to an IDL operation construct. + Defined in include/ast_argument.hh and implemented in + ast/ast_argument.cc. Inherits from AST_Field. + +AST_Union Represents an IDL union construct. Defined in + include/ast_union.hh and implemented in + ast/ast_union.cc. Inherits from AST_ConcreteType and + from UTL_Scope. + +AST_UnionBranch Represents an individual branch in an IDL union + construct. 
Defined in include/ast_union_branch.hh and + implemented in ast/ast_union_branch.cc. Inherits from + AST_Field. + +AST_UnionLabel Represents the label of an individual branch in an IDL + union construct. Defined in include/ast_union_label.hh + and implemented in ast/ast_union_label.cc. + +AST_Constant Represents an IDL constant construct. Defined in + include/ast_constant.hh and implemented in the file + ast/ast_constant.cc. Inherits from AST_Decl. + +AST_Enum Represents an IDL enum construct. Defined in the file + include/ast_enum.hh and implemented in ast/ast_enum.cc. + Inherits from AST_ConcreteType and UTL_Scope. + +AST_EnumVal Represents an enumerator in an IDL enum construct. + Defined in include/ast_enum_val.hh and implemented in + ast/ast_enum_val.cc. Inherits from AST_Constant. + +AST_Sequence Represents an IDL sequence construct. Defined in + include/ast_sequence.hh and implemented in + ast/ast_sequence.cc. Inherits from AST_Decl. + +AST_String Represents an IDL string construct. Defined in the file + include/ast_string.hh and implemented in + ast/ast_string.cc. Inherits from AST_Decl. + +AST_Array Represents an array modifier to the type of an IDL + field or typedef declaration. Defined in the file + include/ast_array.hh and implemented in + ast/ast_array.cc. Inherits from AST_Decl. + +AST_Typedef Represents an IDL typedef construct. Defined in the file + include/ast_typedef.hh and implemented in + ast/ast_typedef.cc. Inherits from AST_Decl. + +AST_Expression Represents an IDL expression. Defined in the file + include/ast_expression.hh and implemented in + ast/ast_expression.cc. + +AST_Root A subclass of AST_Module, an instance of this class + is used to represent the distinguished root node of + the AST. Defined in include/ast_root.hh and implemented + in ast/ast_root.cc. Inherits from AST_Module. + + +USING INSTANCE DATA +------------------- + +The AST classes define member data fields in addition to defining +operations on instances. 
These member data fields are all private, to allow +only the instance in which they are stored direct access. Other objects +(including other instances of the same class) can obtain access to the +member data fields of an instance through accessor functions. These +accessor functions allow retrieval of the data, and in some cases update +functions are also provided to store new values. + +There are several reasons why this approach is taken. First, it hides the +actual implementation of the member data fields from outside the class. For +example, a Thermometer class would not expose whether its temperature +reading is stored in Fahrenheit or Celsius units, and it could allow access +through either unit method. + +Second, protecting access to member data in this manner restricts the +ability to update it to the instance itself, save where update functions +are explicitly provided. This makes for more reliable implementations, +since the manipulation of the data is isolated in the class implementation +itself. + +Third, wrapping a function call around access to member data allows such +access and update operations to be protected in a multithreaded +environment. While the CFE itself is not multithreaded and the access +operations as currently defined do no special work to protect against +multiple conflicting access operations, this may be changed in a future +version. Moving the CFE to a multithreaded environment without protecting +access to member data in this manner would be extremely difficult. + +The protocol defined in the CFE is that member data fields are all private +and have names which start with the prefix "pd_" (denoting Private Data). +The access functions have names which are the same as the name of the field +sans the prefix. For example, AST_Decl has a field pd_defined_in and an +access function defined_in(). + +The update functions have names starting with "set_" followed by the name +of the corresponding access function. 
Thus, AST_Decl defines a function +set_in_main_file(boolean) which sets the pd_in_main_file data member's +value to the boolean provided. + +GENERATION OF THE AST +--------------------- + +The CFE generates the abstract syntax tree after parsing IDL +input. The nodes of the AST are defined by classes introduced in the +previous section, or by subclasses thereof as defined by each BE. In +writing the CFE, we were faced with the following problem: how to generate +the AST containing nodes of the derived classes as defined in each BE +without knowledge of the types and conventions of these BE classes. + +One alternative was to define a naming scheme which predetermines the names +of each subclass a BE can define. The AST would then be generated by +calling an appropriate constructor on the BE derived class. This scheme +suffers from some shortcomings: + +- It breaks the modularity of the compiler and imports knowledge about + types defined in a BE into the CFE, where this information does not belong. +- It restricts a compiler to having only one BE loaded at a time because the + names of these classes can be in use in only one BE at a time. +- It requires a BE to provide derived classes for all AST classes, even for + those classes where the BE adds no functionality. + +The mechanism we chose is different. We define the AST_Generator class +which has an operation for each constructor defined on each AST class. The +operation takes arguments appropriate to the constructor, invokes it and +returns the created AST node, using the type known to the CFE. All such +operations on the generator are declared virtual. The names of all +operations start with "create_" and contain the name of the construct. +Thus, an operation which invokes a constructor of an AST_Module is named +create_module. AST_Generator is defined in include/ast_generator.hh and +implemented in ast/ast_generator.cc. 
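The generator mechanism can be sketched with simplified stand-ins. Everything in the block below is an assumption made for illustration: Module, BackEndModule, Generator, BackEndGenerator and parse_module are hypothetical names, not the CFE's real classes, and the real constructors take more arguments. The sketch shows only the central idea, that the caller sees the base generator type while a virtual create_ operation decides which concrete node class is constructed.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for an AST node class such as AST_Module.
class Module {
public:
    explicit Module(const std::string &name) : pd_name(name) {}
    virtual ~Module() {}
    const std::string &name() const { return pd_name; }
    // Reports which layer supplied the node class.
    virtual std::string kind() const { return "AST"; }
private:
    std::string pd_name;
};

// Hypothetical stand-in for a back end refinement of the node class.
class BackEndModule : public Module {
public:
    explicit BackEndModule(const std::string &name) : Module(name) {}
    std::string kind() const override { return "BE"; }
};

// Stand-in for the generator: one virtual create_ operation per node class.
class Generator {
public:
    virtual ~Generator() {}
    virtual std::unique_ptr<Module> create_module(const std::string &n) {
        return std::unique_ptr<Module>(new Module(n));
    }
};

// A back end generator overrides only the operations for classes it refines.
class BackEndGenerator : public Generator {
public:
    std::unique_ptr<Module> create_module(const std::string &n) override {
        return std::unique_ptr<Module>(new BackEndModule(n));
    }
};

// Plays the role of a grammar action: it knows only the base generator type.
std::unique_ptr<Module> parse_module(Generator &gen, const std::string &n) {
    return gen.create_module(n);
}
```

Passing a BackEndGenerator to parse_module yields BackEndModule nodes without parse_module ever naming that class, which is the property the CFE relies on to stay ignorant of BE types.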
+ +If a BE derives from any AST class, it must also derive from the +AST_Generator class and redefine the relevant operations to invoke +constructors of the BE provided class instead of the AST provided class. +For example, if BE_Module is a subclass of AST_Module in a BE, the BE would +also define BE_Generator and redefine create_module to call the constructor +of BE_Module instead of that provided by AST_Module. + +During initialization, the CFE causes an instance of the BE derived +generator to be created and saved. This is explained in the section on +REQUIRED ENTRY POINTS SUPPLIED BY A BE. During parsing, actions in the Yacc +grammar invoke operations on the saved instance to create new nodes for the +AST as it is being built. These operations invoke constructors for BE +derived classes or for AST provided classes if they were not overridden. + +DEFINITION SCOPES +----------------- + +IDL is a nested scoped language. The scoping rules are defined by the CORBA +spec and closely follow those of C++. + +Scope management is implemented in two classes provided in the utilities +library, UTL_Scope and UTL_Stack. UTL_Scope manages associations between +names and AST nodes, and UTL_Stack manages scope nesting and entry and exit +from definition scopes as the parse is proceeding. UTL_Scope is defined in +include/utl_scope.hh and implemented in util/utl_scope.cc. UTL_Stack is +defined in include/utl_stack.hh and implemented in util/utl_stack.cc. + +During initialization, the CFE creates an instance of UTL_Stack and saves +it. During parsing, as definition scopes are entered and exited, AST nodes +are pushed onto, or popped from, the stack represented by the saved +instances. Nodes on the stack are stored as instances of UTL_Scope. Section +THE NARROWING MECHANISM explains how to obtain the real type of a node +retrieved from the stack. + +All definition scopes are linked in a tree rooted in the distinguished AST +root node. 
This linkage is implemented by UTL_Scope and AST_Decl. The +linkage is a permanent record of the scope nesting while the stack is a +dynamic record which at each instant represents the current state of the +parse. + +The nesting information is used to do name lookup. IDL uses scoped names +which are concatenations of definition scope names ending with individual +construct names. For example, in + + interface a { + struct b { + long c; + }; + const long k = 23; + struct s { + long ar[k]; + }; + }; + +the name a::b::c represents the long field in the struct b inside the +interface a. + +Lookup is performed by searching down the linkage chain for the first component +of the name, then, when found, recursively resolving the remaining +components in the scope defined by the first component. Lookup is relative +to the scope of use; in the above example, k could also have been referred to +as a::k within the struct s. + +Nodes are stored in a definition scope as instances of AST_Decl. Thus, name +lookup returns instances of AST_Decl. The next section, THE NARROWING +MECHANISM, explains how to obtain the real type of a node retrieved from a +definition scope. + +THE NARROWING MECHANISM +----------------------- + +Here we give only a cursory explanation of how narrowing works. We +concentrate on defining the problem and showing how to use our narrowing +mechanism. The narrowing mechanism is defined in include/idl_narrow.hh. + +As explained above, nodes are stored on the scope stack as instances of +UTL_Scope, and inside definition scopes as instances of AST_Decl. Also, +nodes are linked in a nesting tree as instances of AST_Decl. Given a node +retrieved from the stack or a definition scope, one is faced with the task +of obtaining its real class. C++ does not currently provide an implicit +mechanism for narrowing to a derived class, so the CFE defines its own +mechanism. 
This mechanism requires some work on your part as BE implementor +and requires some explicit code to be written when it is to be used. + +The class AST_Decl defines an enum whose members encode specific AST node +classes. AST_Decl provides an accessor function, node_type(), which +retrieves a member of the enum representing the AST type of the node. Thus, +if an instance of AST_Decl really is an instance of AST_Module, the +node_type() accessor returns AST_Decl::NT_module. + +The class UTL_Scope also provides an accessor function, scope_node_type(), +which returns a member of the enum encoding the actual type of the node. +Thus, given a UTL_Scope instance which is really an instance of +AST_Operation, scope_node_type() would return AST_Decl::NT_op. + +Perusing the header files for classes provided by the AST, you will note +the use of some macros defined in include/idl_narrow.hh. These macros +define the explicit narrowing mechanism: + +DEF_NARROW_METHODSx(<class name>,<parent_x>) for x equal to 0,1,2 or 3, +defines a narrowing method for the specified class which has 0,1,2 or 3 +immediate base classes from which it inherits. For example, ast_module.hh +which defines AST_Module contains the following line: + + DEF_NARROW_METHODS2(AST_Module, AST_Decl, UTL_Scope) + +This is because AST_Module inherits directly from AST_Decl and UTL_Scope. + +DEF_NARROW_FROM_DECL(<class name>) appears in class definitions for classes +which are derived from AST_Decl and which can be stored in a definition +scope. This macro declares a static operation narrow_from_decl(AST_Decl *) +on the class in which it appears. The operation returns the provided +instance as an instance of <class name> if it can be narrowed, or NULL. + +DEF_NARROW_FROM_SCOPE(<class name>) appears in class definitions of classes +which are derived from UTL_Scope and which can be stored on the scope +stack. This macro declares a static operation narrow_from_scope(UTL_Scope *) +on the class in which it appears. 
The operation returns the provided +instance as an instance of <class name> if it can be narrowed, or NULL. + +Now look in the files implementing these classes. You will note occurrences +of the following macros: + +IMPL_NARROW_METHODSx(<class name>,<parent_x>) for x equal to 0,1,2 or 3, +implements a narrowing method for the specified class which has 0,1,2 or 3 +immediate base classes from which it inherits. For example, ast_module.cc +which implements AST_Module contains the following line: + + IMPL_NARROW_METHODS2(AST_Module, AST_Decl, UTL_Scope) + +IMPL_NARROW_FROM_DECL(<class name>) implements a method to narrow from an +instance of AST_Decl to an instance of <class name> as defined above. + +IMPL_NARROW_FROM_SCOPE(<class name>) implements a method to narrow from an +instance of UTL_Scope to an instance of <class name> as defined above. + +To put it all together: In the file ast_module.hh, you will find: + + // Narrowing + DEF_NARROW_METHODS2(AST_Module, AST_Decl, UTL_Scope); + DEF_NARROW_FROM_DECL(AST_Module); + DEF_NARROW_FROM_SCOPE(AST_Module); + +In the file ast_module.cc, you will see: + +/* + * Narrowing methods + */ +IMPL_NARROW_METHODS2(AST_Module, AST_Decl, UTL_Scope) +IMPL_NARROW_FROM_DECL(AST_Module) +IMPL_NARROW_FROM_SCOPE(AST_Module) + +The CFE uses narrowing internally to obtain the correct type of nodes in +the AST. The CFE contains many code fragments such as the following: + + AST_Decl *d = get_an_AST_Decl_from_somewhere(); + AST_Module *m; + ... + if (d->node_type() == AST_Decl::NT_module) { + m = AST_Module::narrow_from_decl(d); + if (m == NULL) { // Narrow failed + ... + } else { // Success, do normal processing + ... + } + } + ... + +Similar code implements narrowing instances of UTL_Scope to their actual +types. + +In your BE classes which derive from UTL_Scope you must include a line +defining how to narrow from a scope, so: + + DEF_NARROW_FROM_SCOPE(<your BE class>) + +and similarly for your BE classes which derive from AST_Decl. 
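To make the contract of narrow_from_decl concrete, here is a hand-written sketch of what such an operation promises: return the instance as the derived type when the node really is of that type, and NULL otherwise. This is not the code the macros in include/idl_narrow.hh expand to, which this document does not show. The classes Decl, Union and Module are hypothetical stand-ins using plain single inheritance, and the check against a stored enum tag is an assumption chosen for brevity.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for AST_Decl: stores a tag naming the real class.
class Decl {
public:
    enum NodeType { NT_module, NT_union };
    explicit Decl(NodeType nt) : pd_node_type(nt) {}
    virtual ~Decl() {}
    NodeType node_type() const { return pd_node_type; }
private:
    NodeType pd_node_type;
};

// Hypothetical stand-in for a union node class.
class Union : public Decl {
public:
    Union() : Decl(NT_union) {}
    // Same contract as the documented operation: the narrowed instance or NULL.
    static Union *narrow_from_decl(Decl *d) {
        if (d == NULL || d->node_type() != NT_union)
            return NULL;
        return static_cast<Union *>(d);  // safe: the tag confirms the type
    }
};

// Hypothetical stand-in for a module node class.
class Module : public Decl {
public:
    Module() : Decl(NT_module) {}
    static Module *narrow_from_decl(Decl *d) {
        if (d == NULL || d->node_type() != NT_module)
            return NULL;
        return static_cast<Module *>(d);
    }
};
```

Callers test the result against NULL before using it, exactly as in the fragments above, so a mismatched node is rejected rather than misinterpreted.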
+ +The narrowing mechanism is defined only for narrowing from AST_Decl and +UTL_Scope. If your BE class inherits directly from one or more classes +which themselves are derived from AST_Decl and/or UTL_Scope, you must +include a line + + DEF_NARROW_METHODSx(<your class name>,<parent 1>,<parent 2>) + +To make this concrete, here is what you'd write in a definition of BE_Union +which inherits from AST_Union: + + DEF_NARROW_METHODS1(BE_Union, AST_Union); + DEF_NARROW_FROM_DECL(BE_Union); + DEF_NARROW_FROM_SCOPE(BE_Union); + +and in the implementation file of BE_Union: + +/* + * Narrowing methods: + */ +IMPL_NARROW_METHODS1(BE_Union, AST_Union) +IMPL_NARROW_FROM_DECL(BE_Union) +IMPL_NARROW_FROM_SCOPE(BE_Union) + +Then, in BE code which expects to see an instance of your derived BE_Union +class, you will write: + + AST_Decl *d = get_an_AST_Decl_from_somewhere(); + BE_Union *u; + ... + if (d->node_type() == AST_Decl::NT_union) { + u = BE_Union::narrow_from_decl(d); + if (u == NULL) { // Narrow failed + ... + } else { // Success, do normal processing + ... + } + } + ... + + +SCOPE MANAGEMENT +---------------- + +Instances of classes which are derived from UTL_Scope implement definition +scopes. A definition scope can contain any kind of AST node as long as it +is derived from AST_Decl. However, specific kinds of definition scopes such +as interfaces and unions can contain only a restricted subset of all AST +node types. + +UTL_Scope provides operations to add instances of each AST provided class +to a definition scope. The names of these operations are constructed by +prepending the string "add_" to the name of the IDL construct. So, to add +an interface to a definition scope, invoke the operation add_interface. +The operations are all defined virtual and are intended to be overridden in +classes derived from UTL_Scope. + +If the node was successfully added to the definition scope, the node is +returned as the result. 
Otherwise the node is not added to the definition +scope and NULL is returned. + +All add operation implementations in UTL_Scope return NULL. Thus, +only the operations which implement legal additions to a specific kind of +definition scope must be overridden in the implementation of that +definition scope. For example, in AST_Module the add_interface operation is +overridden to add the provided instance of AST_Interface to the scope and +to return the provided instance if the addition was successful. Operations +which were not overridden return NULL to indicate that the addition is +illegal in this context. For example, in AST_Operation the definition of +add_interface is not overridden since it is illegal to store an interface +inside an operation definition scope. + +The add operations are invoked in the actions in the Yacc grammar. The +following fragment is a representative example of code using the add +operations: + + AST_Constant *d = construct_a_new_constant(); + ... + if (current_scope->add_constant(d) == NULL) { // Failed + ... + } else { // Succeeded + ... + } + +BE INTERACTION DURING THE PARSING PROCESS +----------------------------------------- + +The add operations can be overridden in BE derived classes to let the BE +perform additional house-keeping work during the process of constructing +the AST. For example, a BE could keep separate lists of interfaces as they +are being added to a module. + +If you override an add operation in your BE, you must invoke the overridden +operation in the superclass of your derived class to allow the CFE to +perform its own house-keeping tasks. A good rule is to invoke the operation +on the superclass before you do your own processing; then, if the +superclass operation returns NULL, this indicates that the addition failed +and your own code should immediately return NULL. 
An example explains this: + +AST_Interface * +BE_Module::add_interface(AST_Interface *i) +{ + if (AST_Module::add_interface(i) == NULL) // Failed, bail out! + return NULL; + ... // Do your own work here + return i; // Return success indication +} + +We strongly advise you to only define add operations that override add +operations provided by the AST classes. Add operations which +do not override equivalent operations in the AST in effect +extend the semantics of the language accepted by the compiler. For +example, the CFE does not have an add_interface operation on +AST_Operation. If you were to define one in your BE_Operation class, +the resulting compiler would allow an interface to be +stored in an operation definition scope. The current CORBA specification +does not allow this. + +AST INHERITANCE SCHEME +---------------------- + +The AST classes all use public virtual inheritance to construct the +inheritance tree. This ensures that a class may appear several times in the +inheritance tree through different paths and the derived class's instances +will have only one copy of the inherited class's data. + +The use of public virtual inheritance has several important effects on how +a BE is constructed. We explain those effects below. + +First, you must define a default constructor for your BE class, since +your class may be used as a virtual base class of some other class. In this +case the compiler may want to call a default constructor for your class. It +is a good idea to have a default constructor anyway, even if you do not +plan to subclass your BE class, since for most C++ compilers this causes +the code to be smaller. Your default constructor should initialize all +constant data members. Additionally, it may initialize any non-constant +data member whose value must be set before the first time the instance is +used. + +Second, the constructor of your BE derived class must explicitly call all +constructors of virtual base classes which perform useful work. 
For +example, if a class in the AST from which your BE class inherits has an +initializer for a data member, you must call that constructor. This rule is +discussed in detail in the C++ ARM. An example may help here. + +Suppose you define a class BE_Attribute which inherits from AST_Attribute. +Its constructor should be as follows: + + BE_Attribute::BE_Attribute(boolean ro, + AST_Type *ft, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Attribute(ro, ft, n, p), + AST_Field(ft, n, p), + AST_Decl(AST_Decl::NT_attr, n, p) + { + } + +The calls to the constructors of AST_Attribute, AST_Field and AST_Decl are +needed because these constructors do useful initializations on their +classes. + +Note that there is some redundancy in the data passed to these +constructors. We chose to preserve this redundancy since it should be +possible to create BEs which subclass only some of the classes supplied by +the AST. This means that the constructors on each class provided by the AST +should take arguments which are sufficient to construct the instance if +the AST class is the most derived one. + +The code supplied with this release contains a demonstration BE which +subclasses all the AST provided classes. The constructors for each class +provided by the BE are found in the file be/be_classes.cc. + +INITIALIZATION +-------------- + +The following steps take place at initialization: + +- The global data instance is created, stored in idl_global and filled with + default values (in driver/drv_init.cc). +- The command line arguments are parsed (in driver/drv_args.cc). +- For each IDL input file, a copy of the compiler process is forked (in + driver/drv_fork.cc). +- The IDL input is preprocessed (in driver/drv_preproc.cc). +- FE initialization stage 1 is done: the scopes stack is created and stored + in the global data variable idl_global->scopes() field (in fe/fe_init.cc). 
+- BE_init is called to create the generator instance and the returned + instance is stored in the global data variable idl_global->gen() field. +- FE initialization stage 2 is done: the global scope is created, pushed on + the scopes stack and populated with predefined types (in fe/fe_init.cc). + +GLOBAL STATE AND ENTRY POINTS +----------------------------- + +The CFE has one global variable named idl_global, which stores an instance +of a class IDL_GlobalData as explained below: + +The CFE defines a class IDL_GlobalData which defines the global +information used in a specific run of the compiler. IDL_GlobalData is +defined in include/idl_global.hh and implemented in the file +util/utl_global.cc. + +Initialization creates an instance of this class and stores it in the value +of the global variable idl_global. Thus, the individual pieces of +information stored in the instance are accessible everywhere. + +ERROR HANDLING +-------------- + +All error handling is defined by a class provided by the CFE, UTL_Error. +This class is defined in include/utl_error.hh and implemented in the file +util/utl_error.cc. The class provides several methods for reporting +specific errors as well as generic error reporting methods taking zero to +three arguments. + +The CFE instantiates the class and stores the instance as part of the +global state, accessible as idl_global->err(). Thus, to cause an error +report, you would write code similar to the following: + + if (error condition found) + idl_global->err()->specific_error_message(arg1, ..); + +or + + if (error condition found) + idl_global->err()->generic_error_message(flag, arg1, ..); + +The flag argument is one of the predefined error conditions found in the +enum at the head of the UTL_Error class definition. The arguments to the +specific error message routine are defined by the signature of that +routine. The arguments to a generic error message routine are always +instances of AST_Decl. 
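The reporting pattern can be sketched as follows, under loudly stated assumptions: ErrorReporter, GlobalState, report, the EC_ flags and should_invoke_back_end are hypothetical names invented for this illustration, and they are not the actual members of UTL_Error or IDL_GlobalData. The sketch shows only the shape of the protocol: a reporter reachable through the global state, a generic routine that takes a flag, and an error count that decides whether the BE runs.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical stand-in for the error class; real method names differ.
class ErrorReporter {
public:
    enum ErrorCode { EC_redefinition, EC_lookup_failed };  // illustrative flags

    ErrorReporter() : pd_count(0) {}

    // Generic routine: a flag plus the name of the offending declaration.
    void report(ErrorCode code, const std::string &decl_name) {
        std::cerr << "error: "
                  << (code == EC_redefinition ? "redefinition of "
                                              : "cannot find ")
                  << decl_name << '\n';
        ++pd_count;  // every report increases the running count
    }

    long count() const { return pd_count; }

private:
    long pd_count;
};

// Hypothetical stand-in for the global state object.
class GlobalState {
public:
    ErrorReporter *err() { return &pd_err; }
    long err_count() const { return pd_err.count(); }
private:
    ErrorReporter pd_err;
};

// The back end runs only when parsing produced no errors.
bool should_invoke_back_end(const GlobalState &g) {
    return g.err_count() == 0;
}
```

Keeping the count next to the reporter means no caller has to remember to update it, so the decision to skip the BE cannot drift out of step with the messages printed.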
+ +The running count of errors is accessible as idl_global->err_count(). If +the value returned by this operation is non-zero after the IDL input has +been parsed, the BE is not invoked. + +HANDLING OF COMMAND LINE ARGUMENTS +---------------------------------- + +Defined command line arguments are specified in the document CLI, in this +directory. The CFE calls the required BE API entry point BE_prep_arg to +process arguments passed within a -Wb flag. + +REQUIRED ENTRY POINTS SUPPLIED BY A BE +-------------------------------------- + +The following API entry points must be supplied by a BE in order to +successfully link with the CFE: + +extern "C" AST_Generator *BE_init(); + + Creates an instance of the generator object and returns it. Note + that the global scope is not yet set up and the scopes stack is + empty when this routine is called. + +extern "C" void BE_produce(); + + Called by the compiler main program after the IDL input has been + successfully parsed and processed. The job of this routine is to + carry out the specific function of the BE. The AST is accessible as + the value of idl_global->root(). + +extern "C" void BE_prep_arg(char *, idl_bool); + + Called to process an argument passed in with a -Wb flag. The boolean + will always be FALSE. + +extern "C" void BE_abort(); + + Called when the CFE decides to abort the compilation. Can be used in + a BE to clean up after itself, e.g. remove temporary files or + directories it created while the parse was in progress. + +extern "C" void BE_version(); + + Called when a -V argument is processed. This should produce a + message for the user identifying the BE that is loaded and its + version information. + +PART II. WRITING A BACK END +-=========================- + +REQUIRED API THAT EACH BE MUST SUPPORT +-------------------------------------- + +Below are the API entry points that each BE must supply in order to use the +CFE framework. 
This is a repeat of the BE API section: + +extern "C" AST_Generator *BE_init(); + + Creates an instance of the generator object and returns it. Note + that the scopes stack is still not set up at the time this routine + is called. + +extern "C" void BE_produce(); + + Called by the compiler main program after the IDL input has been + successfully parsed and processed. The job of this routine is to + carry out the specific function of the BE. The AST is accessible as + the value of idl_global->root(). + +extern "C" void BE_prep_arg(char *, boolean); + + Called to process an argument passed in with a -Wb flag. The boolean + will always be FALSE. + +extern "C" void BE_abort(); + + Called when the CFE decides to abort the compilation. Can be used in + a BE to clean up after itself, e.g. remove temporary files or + directories it created while the parse was in progress. + +extern "C" void BE_version(); + + Called when a -V argument is processed. This should produce a + message for the user identifying the BE that is loaded and its + version information. + +WHAT FILES TO INCLUDE +--------------------- + +To use the CFE, each implementation file of your BE must include the +following two header files: + +#include <idl.hh> +#include <idl_extern.hh> + +Following this, you can include any header files needed by your BE. + +HOW TO SUBCLASS THE AST +----------------------- + +Your BE may subclass from any of the classes provided by the AST. Your +class should use public virtual inheritance to ensure that only one copy of +the class's data members is present in each instance. Read the section on +HOW TO WRITE CONSTRUCTORS to learn about additional considerations that you +must take into account when writing constructors for your BE classes. + +HOW TO SUBCLASS THE GENERATOR TO CREATE BE ENHANCED AST NODES +------------------------------------------------------------- + +Your BE subclasses from classes provided by the AST. 
To ensure that +instances of these classes are constructed when the AST is built, you must +also subclass AST_Generator and return an instance of your subclass from +the call to BE_init. + +The AST_Generator class provides operations to create instances of all +classes defined in the AST. For example, the operation to create an +AST_Attribute node is as follows: + + AST_Attribute * + AST_Generator::create_attribute(...) + { + return new AST_Attribute(...); + } + +In your BE_Generator subclass of AST_Generator, you will override methods +for creation of nodes of all AST classes which you have subclassed. Thus, +if your BE has a class BE_Attribute which is a subclass of AST_Attribute, +your BE_Generator class definition has to override the create_attribute +method to ensure that instances of BE_Attribute are created. + +The definition of each overridden operation should call the constructor of +the derived class and return the new node as an instance of the inherited +class. Thus, the implementation of create_attribute is as follows: + + AST_Attribute * + BE_Generator::create_attribute(...) + { + return new BE_Attribute(...); + } + +The Yacc grammar actions call create_xxx operations on the generator +instance stored in the global state and accessible as idl_global->gen(). By +storing an instance of your derived generator class BE_Generator you ensure +that instances of the BE classes you defined will be created. + +HOW TO WRITE CONSTRUCTORS FOR BE CLASSES +---------------------------------------- + +As mentioned above, the AST uses public virtual inheritance to derive the +AST class hierarchy. This has two important effects on how you write a BE, +specifically how you write constructors for derived BE classes. + +First, you must define a default constructor for your BE class, since +your class may be used as a virtual base class of some other class. In that +case the compiler may want to call a default constructor for your class.
It +is a good idea to have a default constructor anyway, even if you do not +plan to subclass your BE class, since for most C++ compilers this causes +the code to be smaller. Your default constructor should initialize all +constant data members. Additionally, it may initialize any non-constant +data member whose value must be set before the first time the instance is +used. + +Second, the constructor for your BE class must explicitly call all +constructors of virtual base classes which do some useful work. For +example, if a class in the AST from which your BE class inherits, directly +or indirectly, has an initializer for a data member, your BE class's +constructor must call the AST class's constructor. This is discussed +extensively in the C++ ARM. + +Below is a list showing how to write constructors for subclasses of each +class provided by the AST. For each AST class we show a definition of a +constructor for a derived class which calls all necessary constructors on +AST classes: + +AST_Argument: + + BE_Argument::BE_Argument(AST_Argument::Direction d, + AST_Type *ft, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Argument(d, ft, n, p), + AST_Field(AST_Decl::NT_argument, ft, n, p), + AST_Decl(AST_Decl::NT_argument, n, p) + { + } + +AST_Array: + + BE_Array::BE_Array(UTL_ScopedName *n, + unsigned long nd, + UTL_ExprList *ds) + : AST_Array(n, nd, ds), + AST_Decl(AST_Decl::NT_array, n, NULL) + + { + } + +AST_Attribute: + + BE_Attribute::BE_Attribute(boolean ro, + AST_Type *ft, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Attribute(ro, ft, n, p), + AST_Field(AST_Decl::NT_attr, ft, n, p), + AST_Decl(AST_Decl::NT_attr, n, p) + { + } + +AST_ConcreteType: + + BE_ConcreteType::BE_ConcreteType(AST_Decl::NodeType nt, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Decl(nt, n, p) + { + } + +AST_Constant: + + BE_Constant::BE_Constant(AST_Expression::ExprType t, + AST_Expression *v, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Constant(t, v, n, p), +
AST_Decl(AST_Decl::NT_const, n, p) + { + } + +AST_Decl: + + BE_Decl::BE_Decl(AST_Decl::NodeType nt, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Decl(nt, n, p) + { + } + +AST_Enum: + + BE_Enum::BE_Enum(UTL_ScopedName *n, + UTL_StrList *p) + : AST_Enum(n, p), + AST_Decl(AST_Decl::NT_enum, n, p), + UTL_Scope(AST_Decl::NT_enum) + { + } + +AST_EnumVal: + + BE_EnumVal::BE_EnumVal(unsigned long v, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_EnumVal(v, n, p), + AST_Constant(AST_Expression::EV_ulong, + AST_Decl::NT_enum_val, + new AST_Expression(v), + n, + p), + AST_Decl(AST_Decl::NT_enum_val, n, p) + { + } + +AST_Exception: + + BE_Exception::BE_Exception(UTL_ScopedName *n, + UTL_StrList *p) + : AST_Decl(AST_Decl::NT_except, n, p), + AST_Structure(AST_Decl::NT_except, n, p), + UTL_Scope(AST_Decl::NT_except) + { + } + +AST_Field: + + BE_Field::BE_Field(AST_Type *ft, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Field(ft, n, p), + AST_Decl(AST_Decl::NT_field, n, p) + { + } + +AST_Interface: + + BE_Interface::BE_Interface(UTL_ScopedName *n, + AST_Interface **ih, + long nih, + UTL_StrList *p) + : AST_Interface(n, ih, nih, p), + AST_Decl(AST_Decl::NT_interface, n, p), + UTL_Scope(AST_Decl::NT_interface) + { + } + +AST_InterfaceFwd: + + BE_InterfaceFwd::BE_InterfaceFwd(UTL_ScopedName *n, + UTL_StrList *p) + : AST_InterfaceFwd(n, p), + AST_Decl(AST_Decl::NT_interface_fwd, n, p) + { + } + +AST_Module: + + BE_Module::BE_Module(UTL_ScopedName *n, + UTL_StrList *p) + : AST_Decl(AST_Decl::NT_module, n, p), + UTL_Scope(AST_Decl::NT_module) + { + } + +AST_Operation: + + BE_Operation::BE_Operation(AST_Type *rt, + AST_Operation::Flags fl, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Operation(rt, fl, n, p), + AST_Decl(AST_Decl::NT_op, n, p), + UTL_Scope(AST_Decl::NT_op) + { + } + +AST_PredefinedType: + + BE_PredefinedType::BE_PredefinedType( + AST_PredefinedType::PredefinedType *pt, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_PredefinedType(pt, n, p), + 
AST_Decl(AST_Decl::NT_pre_defined, n, p) + { + } + +AST_Root: + + BE_Root::BE_Root(UTL_ScopedName *n, UTL_StrList *p) + : AST_Module(n, p), + AST_Decl(AST_Decl::NT_module, n, p), + UTL_Scope(AST_Decl::NT_module) + { + } + + +AST_Sequence: + + BE_Sequence::BE_Sequence(AST_Expression *ms, AST_Type *bt) + : AST_Sequence(ms, bt), + AST_Decl(AST_Decl::NT_sequence, + new UTL_ScopedName(new String("sequence"), NULL), + NULL) + { + } + +AST_String: + + BE_String::BE_String(AST_Expression *ms) + : AST_String(ms), + AST_Decl(AST_Decl::NT_string, + new UTL_ScopedName(new String("string"), NULL), + NULL) + { + } + +AST_Structure: + + BE_Structure::BE_Structure(UTL_ScopedName *n, + UTL_StrList *p) + : AST_Decl(AST_Decl::NT_struct, n, p), + UTL_Scope(AST_Decl::NT_struct) + { + } + +AST_Type: + + BE_Type::BE_Type(AST_Decl::NodeType nt, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Decl(nt, n, p) + { + } + +AST_Typedef: + + BE_Typedef::BE_Typedef(AST_Type *bt, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Typedef(bt, n, p), + AST_Decl(AST_Decl::NT_typedef, n, p) + { + } + +AST_Union: + + BE_Union::BE_Union(AST_ConcreteType *dt, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_Union(dt, n, p), + AST_Structure(AST_Decl::NT_union, n, p), + AST_Decl(AST_Decl::NT_union, n, p), + UTL_Scope(AST_Decl::NT_union) + { + } + +AST_UnionBranch: + + BE_UnionBranch::BE_UnionBranch(AST_UnionLabel *fl, + AST_Type *ft, + UTL_ScopedName *n, + UTL_StrList *p) + : AST_UnionBranch(fl, ft, n, p), + AST_Field(ft, n, p), + AST_Decl(AST_Decl::NT_union_branch, n, p) + { + } + +AST_UnionLabel: + + BE_UnionLabel::BE_UnionLabel(AST_UnionLabel::UnionLabel lk, + AST_Expression *lv) + : AST_UnionLabel(lk, lv) + { + } + +HOW TO USE THE ADD PROTOCOL +--------------------------- + +As explained in the section SCOPE MANAGEMENT, the CFE manages scopes by +calling type-specific functions to add new nodes to the scope to be +augmented.
These functions can be overridden in your BE classes to do work +specific to your BE class. For example, in a BE_Module class, you might +override add_interface to do additional work. + +The protocol defined by the "add_" functions is that they return NULL to +indicate failure. They return the node that was added (and which was given +as an argument) if the operation succeeded. Your functions in your BE class +should follow the same protocol. + +The "add_" functions defined in the BE must call the overridden function in +the base class defined in the CFE in order for the CFE scope management +mechanism to work. Otherwise, the CFE does not get an opportunity to +augment its scopes with the new node to be added. It is good practice to +call the overridden "add_" function as the first action in your BE +function, because the success or failure of the CFE operation indicates +whether your function should complete its task or abort early. + +Here is an example. Suppose you have defined a class BE_Module which +inherits from AST_Module. You may wish to override the add_interface +function as follows: + + class BE_Module : public virtual AST_Module + { + .... + /* + * ADD protocol + */ + virtual AST_Interface *add_interface(AST_Interface *); + ... + }; + +The implementation of this function would look something like the following: + + AST_Interface * + BE_Module::add_interface(AST_Interface *new_in) + { + /* + * Check that the CFE operation succeeds. If it returns + * NULL, stop any further work + */ + if (AST_Module::add_interface(new_in) == NULL) + return NULL; + /* + * OK, non-NULL, this means the BE can do its own work here + */ + ... + /* + * Finally, don't forget to return the argument to indicate + * success + */ + return new_in; + } + +HOW TO MAINTAIN BE SPECIFIC INFORMATION +--------------------------------------- + +The CFE provides a special class AST_Root, a subclass of AST_Module.
An +instance of the AST_Root class is used as the distinguished root of the +abstract syntax tree built during a parse. + +Your BE can subclass BE_Root from AST_Root and override the create_root +operation in your BE_Generator class derived from AST_Generator. This will +cause the CFE to create an instance of your BE_Root class as the root of +the tree being constructed. + +You can use the instance of the BE_Root class as a convenient place to +store information specific to an individual tree. For example, you could +add operations on the BE_Root class to count how many nodes of each class +are created. + +HOW TO USE MEMBER DATA +---------------------- + +As explained above, the AST classes provide access and update functions for +manipulating data members. Your BE classes must use these functions when +they require access to data members defined in the AST classes, since the +data members themselves are private. + +It is good practice to follow the same scheme in your BE classes. Make all +data members private. Prepend the names of all such fields with "pd_". +Define access functions with names equal to the name of the field without the +prefix. Define update functions according to need by prepending the name of +the access function with the prefix "set_". + +Using these techniques will allow your BE to enjoy the same benefits that +are imparted onto the CFE. Your BE will be easier to move to a +multithreaded environment and its data members will be better protected and +hidden. + +HOW TO BUILD A COMPLETE COMPILER +-------------------------------- + +We now have all information needed to write a BE and to link it in with the +CFE, to produce a complete IDL compiler. + +The following assumes that your BE will be stored in the "be" directory +under the "release" directory. See the document ROADMAP for an explanation +of the directory structure of the source release. 
If you decide to use a +different directory to store your BE, you may have to modify the CPP_FLAGS in +"idl_make_vars" in the top-level directory to allow your BE to find the +include files it needs. You will also need to modify several targets in +the Makefile in the top-level directory to correctly compile your BE into a +library and to correctly link it in with the CFE to produce a complete +compiler. + +You can get started quickly on writing your BE by modifying the sources +found in the "demo_be" directory. The Makefile supports all the targets +that are needed to build a complete system and the maintenance target +"clean" which assists in keeping the files and directories tidy. The files +provided in the "demo_be" directory also provide all the API entry points +that are mandated by this document. + +To build a complete compiler, invoke "make" or "make all" in the top-level +directory. This will compile your BE and all the CFE sources, if this is +the first invocation. On subsequent invocations this will recompile only +the modified files. You will rarely, if ever, modify the CFE sources, so +the overhead of compiling the CFE is incurred only the first time. To build +just your BE, you can invoke "make all" or "make" in the "demo_be" +directory. You can also, from the top-level directory, invoke "make +demo_be/libbe.a". + +HOW TO OBTAIN ASSISTANCE +------------------------ + +First, read all the documents provided. If you have unanswered questions, +mail them to + + idl-cfe@sun.com + +Sun does not promise to support the IDL CFE source release in any manner. +However, we will attempt to answer questions and correct problems as time +allows. + +NOTE: + +SunOS, SunSoft, Sun, Solaris, Sun Microsystems or the Sun logo are +trademarks or registered trademarks of Sun Microsystems, Inc. + +COPYRIGHT NOTICE +---------------- + +Copyright 1992, 1993, 1994 Sun Microsystems, Inc. Printed in the United +States of America. All Rights Reserved.
+ +This product is protected by copyright and distributed under the following +license restricting its use. + +The Interface Definition Language Compiler Front End (CFE) is made +available for your use provided that you include this license and copyright +notice on all media and documentation and the software program in which +this product is incorporated in whole or part. You may copy and extend +functionality (but may not remove functionality) of the Interface +Definition Language CFE without charge, but you are not authorized to +license or distribute it to anyone else except as part of a product or +program developed by you or with the express written consent of Sun +Microsystems, Inc. ("Sun"). + +The names of Sun Microsystems, Inc. and any of its subsidiaries or +affiliates may not be used in advertising or publicity pertaining to +distribution of Interface Definition Language CFE as permitted herein. + +This license is effective until terminated by Sun for failure to comply +with this license. Upon termination, you shall destroy or return all code +and documentation for the Interface Definition Language CFE. + +INTERFACE DEFINITION LANGUAGE CFE IS PROVIDED AS IS WITH NO WARRANTIES OF +ANY KIND INCLUDING THE WARRANTIES OF DESIGN, MERCHANTIBILITY AND FITNESS +FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR ARISING FROM A COURSE OF +DEALING, USAGE OR TRADE PRACTICE. + +INTERFACE DEFINITION LANGUAGE CFE IS PROVIDED WITH NO SUPPORT AND WITHOUT +ANY OBLIGATION ON THE PART OF Sun OR ANY OF ITS SUBSIDIARIES OR AFFILIATES +TO ASSIST IN ITS USE, CORRECTION, MODIFICATION OR ENHANCEMENT. + +SUN OR ANY OF ITS SUBSIDIARIES OR AFFILIATES SHALL HAVE NO LIABILITY WITH +RESPECT TO THE INFRINGEMENT OF COPYRIGHTS, TRADE SECRETS OR ANY PATENTS BY +INTERFACE DEFINITION LANGUAGE CFE OR ANY PART THEREOF. 
+ +IN NO EVENT WILL SUN OR ANY OF ITS SUBSIDIARIES OR AFFILIATES BE LIABLE FOR +ANY LOST REVENUE OR PROFITS OR OTHER SPECIAL, INDIRECT AND CONSEQUENTIAL +DAMAGES, EVEN IF SUN HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. + +Use, duplication, or disclosure by the government is subject to +restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in +Technical Data and Computer Software clause at DFARS 252.227-7013 and FAR +52.227-19. + +Sun, Sun Microsystems and the Sun logo are trademarks or registered +trademarks of Sun Microsystems, Inc. + +SunSoft, Inc. +2550 Garcia Avenue +Mountain View, California 94043 |