The binary identifier tree module
(bintree)

Introduction

This module contains all the data structures used for storing the information read from the input file and also information derived from the input file. It contains procedures to add new identifiers to the tree and to retrieve information about these identifiers. Besides this, the module also provides procedures which are used for suppressing error messages and a procedure which returns statistical information about the binary identifier tree.

The first section describes all the types that are used for the data structures and in the second section the variables that are used are described. In the third section the storing and retrieval procedures for the binary tree are described. In the fourth section we describe the error status of identifiers and how it can be used to suppress error messages. The fifth section describes a procedure that returns statistics about the tree. In the last two sections the listing of the module is given.

1 Typing for the data structures

This module contains the typing of all the data structures used in the program. We first give an extensive description of this typing and explain its meaning in terms of the formal description of Attributed Abstract Program Trees [2].

We make the following comments about the nomenclature of the type identifiers. We use the prefix "t" for constructed types. The prefix "p" is used for pointer types which point to types that begin with the prefix "t". After these prefixes the letter "l" stands for a list of a certain type, the letters "alt" stand for alternative and the letters "tr" for transformed. For example, plalt ass stands for a pointer to a list of alternative assignments, which is used in the case assignment. For all enumeration types which end with "kind", the names have a prefix which is equal to the first letter of the type name (without prefix) for which they are used. For example, n class in the enumeration type tnode kind, is used for a node in the binary tree of an identifier that is defined as a class.

(In the text we do not use the underscore symbol, as in the listing, but set the identifier in a different font. This is done for the sake of readability.)

1.1 A node of the binary tree

Each node of the binary identifier tree contains information of one identifier. At this moment we only give a pointer to this type, because we can only give a full declaration of its type when we have given all other declarations.

  pnode       = ^tnode;

1.2 Elements

The elements, together with attributes, are the most significant objects of the grammar. Each element has a unique element number, which makes it possible to implement sets of elements using a set type, as is done here. For elements the following types are declared:

  elem_range    = 0..max_elem_nr;
  elements      = SET OF elem_range;
  elem_array    = ARRAY[0..max_elem_nr] OF pnode;

The type elem array represents a mapping from the element numbers to the nodes where the elements with these numbers are described.

1.3 Attributes

Each attribute has a unique attribute number, which makes it possible to implement sets of attributes using a set type. For attributes the following types are declared:

  attr_range    = 0..max_attr_nr;
  attributes    = SET OF attr_range;
  attr_array    = ARRAY[0..max_attr_nr] OF pnode;
  tattr_at_elem = ARRAY[0..max_elem_nr] OF attributes;

The type attr array represents a mapping from the attribute numbers to the nodes where the attributes with this number are described. The type tattr at elem represents a mapping from the set of elements to the set of attributes.

1.4 Expressions

In this section we describe all the typing for the expressions in the attribute assignments. As will be described in [1.7.1] an expression can have three forms. It can either be a simple expression, a semantic function or a case expression. Because of this we declare the enumeration type texpr kind for these three different types of expression in the following way (where the letters "atoc" are short for attribute occurrence):

  texpr_kind    = (e_atoc, e_func, e_case);

Using this we can now declare the type for an expression in the following way:

  pexpr         = ^texpr;
  texpr         = RECORD
                    line_nr     : integer;
                  CASE kind : texpr_kind OF
                    e_atoc : ( attr       : pnode;
                               partnamenr : integer    );
                    e_func : ( func       : pnode;
                               args       : plexpr     );
                    e_case : ( headpnnr   : integer;
                               alter      : plalt_expr )
                  END;

The field line nr contains the input line number on which the expression starts. When the expression is a simple expression (attribute occurrence) of the form "A OF P" then the field kind is equal to e atoc. The field attr contains the pointer to the node that represents the identifier "A". If "P" is equal to "#" the field partnamenr contains the value of the constant main part nr, otherwise it contains the number of the part name according to the tree rule, with which the expression was found. If the part name "P" is not a part name of the current tree rule, than partnamenr will be equal the value of the constant error part nr.

When the expression is a semantic function of the form "F(EXPR₁,..,EXPR_n)" then the field kind is equal to e func. The field func will contain a pointer to the node with the identifier "F". The field args will contain the representation of the expressions "EXPR₁,..,EXPR_n". The type declaration of plexpr (pointer to list of expressions) is:

  plexpr        = ^tlexpr;
  tlexpr        = RECORD
                    first       : pexpr;
                    type_of_arg ,
                    loc_rep     : alfa;
                    rest        : plexpr
                  END;

When a pointer of type plexpr points to the list of expressions "EXPR₁,..,EXPR_n" then the field first will point to the expression "EXPR₁", the field type of arg will contain the type of this expression and the field loc rep will be used in the generation phase to store the representation of the local variable containing the result. The field rest will represent "EXPR₂,..,EXPR_n".

When the expression represents a case expression of the form "CASE P OF SEL₁:EXPR₁;..;SEL_n:EXPR_n ESAC" then the field kind is equal to e_case. The field headpnnr will contain the number of the part name in the tree rule with this expression. It is equal to the value of the constant main part nr when "P" is equal to "#" and equal to the value of the constant error part nr when "P" is not a part name with the current tree rule. The field alter points to the structure that describes the alternatives "SEL₁:EXPR₁;..;SEL_n:EXPR_n". The type declaration of plalt expr (pointer to a list of alternatives) is:

  plalt_expr    = ^tlalt_expr;
  tlalt_expr    = RECORD
                    line_nr     : integer;
                    other_sel   : boolean;
                    selectors   : elements;
                    expr        : pexpr;
                    rest        : plalt_expr
                  END;

When a pointer of type plalt expr points to the list of alternatives "SEL₁:EXPR₁;..;SEL_n:EXPR_n" then the field rest will point to the representation of "SEL₂:EXPR₂;..;SEL_n:EXPR_n". The field line nr will contain the line number on which this alternative starts. When "SEL₁" is not equal to "OTHERS" the field other sel contains the value false and the field selectors contains the set of elements represented by "SEL₁". Otherwise the field other sel will contain the value true and the field selectors is initialized to empty during the first pass. During the second pass it is assigned with the remaining set of elements of the class with this case expression, which are not used in one of the previous selectors. And because the grammar only allows the selector of the last alternative to be equal to "OTHERS", the field rest will always be equal to NIL. The field expr will contain the pointer to the representation of "EXPR₁".

1.5 Attribute assignments

In this section we describe all the typing for the attribute assignments. As will be described in [1.7] an attribute assignment can have two forms. It can either be a simple assignment of an expression to an attribute occurrence or a selective assignment. Because of this we declare the enumeration type tass kind for these two different types of attribute assignments in the following way:

  tass_kind     = (a_simp, a_sele);

Using this we can now declare the type for a list of attribute assignments in the following way:

  plattr_ass    = ^tlattr_ass;
  tlattr_ass    = RECORD
                    line_nr     : integer;
                    rest        : plattr_ass;
                  CASE kind : tass_kind OF
                    a_simp : ( attr       : pnode;
                               partnamenr : integer;
                               expr       : pexpr    );
                    a_sele : ( headpnnr   : integer;
                               alter      : plalt_ass )
                  END;

There is no type for a single attribute assignment because we allow lists of attribute assignments everywhere that they may occur. The field line nr contains the input line number on which the first attribute assignment of the list starts and the field rest points to the rest of the attribute assignments in the list.

When the first attribute assignment is a simple attribute assignment of the form "A OF P = EXPR" then the field kind will contain the value a simp. The field attr contains the pointer to the node that represents the identifier "A". If "P" is equal to "#" the field partnamenr contains the value of the constant main part nr, otherwise the number of the part name with the tree rule with which the attribute assignment was found. If the part name "P" is not a part name of the current tree rule, then partnamenr will be equal to the value of the constant error part nr. The field expr will point to the representation of "EXPR".

When the first attribute assignment represents a selective attribute assignment of the form "CASE P OF SEL₁:AAS₁;..;SEL_n:AAS_n ESAC" then the field kind will contain the value a sele. The field headpnnr will contain the number of the part name in the tree rule with this attribute assignment. It is equal to the value of the constant main part nr when "P" is equal to "#" and equal to the value of the constant error part nr when "P" is not a part name in the current tree rule. The field alter points to the structure that describes the alternatives "SEL₁:AAS₁;..;SEL_n:AAS_n". The type declaration of plalt ass (pointer to a list of alternatives) is:

  plalt_ass     = ^tlalt_ass;
  tlalt_ass     = RECORD
                    line_nr     : integer;
                    other_sel   : boolean;
                    selectors   : elements;
                    attr_ass    : plattr_ass;
                    rest        : plalt_ass
                  END;

When a pointer of type plalt ass points to the list of alternatives "SEL₁:AAS₁;..;SEL_n:AAS_n" then the field rest will point to the representation of "SEL₂:AAS₂;..;SEL_n:AAS_n". The field line nr will contain the line number on which this alternative starts. When "SEL₁" is not equal to "OTHERS" the field other sel contains the value false and the field selectors contains the set of elements represented by "SEL₁". Otherwise the field other sel will contain the value true and the field selectors is initialized to empty during the first pass. During the second pass it is assigned with the remaining set of elements of the class with this selective attribute assignment, which are not used in one of the previous selectors. And because the grammar only allows the selector of the last alternative to be equal to "OTHERS", the field rest will always be equal to NIL. The field attr ass will contain the pointer to the representation of "AAS₁".

1.6 Transformed attribute assignments

In the second pass the attribute assignments are transformed (rearranged) into a more orthogonal structure. In this section we describe the types for this structure. The basic idea is that for each part name of a tree rule (including the main part name) we can make a tree in which all the assignments can be stored. Each node of the tree represents a class or a node type. At each node we attach the assignments to attributes that belong to this class or node type. Under each class node of this tree there are the nodes that represent the elements of the class. The type ttr ass represents a node of this tree, and is declared by:

  ttr_ass       = RECORD
                    ass_attr    : attributes;
                    top         : plexpr;
                    deeper      : pltr_ass
                  END;

For each variable of type ttr ass with the element "E" (which is determined by the context) the field ass attr represents the ordered list of attributes (ordered on the attribute numbers) that have to be assigned with this element. If ass attr represents the list of attributes "A₁,..,A_n", then the field top points to the list of expressions "EXPR₁,..,EXPR_n", where "EXPR_i" represents the expression that has to be assigned to the attribute "A_i" with the element "E" in the part name that belongs to this tree. When the element "E" is a node type the field deeper is equal to NIL, otherwise it will point to the list of nodes that belong to the set of elements (ordered on the element numbers) defined by class(E). The type pltr ass that represents this list is declared by:

  pltr_ass      = ^tltr_ass;
  tltr_ass      = RECORD
                    first       : ttr_ass;
                    rest        : pltr_ass
                  END;

The field first contains the first node and the field rest will contain the possible empty list of the rest of the list.

1.7 Tree rules

There are two ways the tree rules are stored in the data structure. This is mainly due to historical reasons. Originally, the program was divided in a number of smaller programs that used two different methods of storing information. When these smaller programs were combined in one program with different passes it was easier to let the two methods of storage co-exist. One way (which is used in the early passes) of storing the tree rule is as a linked list, where each element of the linked list contains one part of the tree rule. The typing for a list of parts of a tree rule is described by:

  ptree_rule    = ^ttree_rule;
  ttree_rule    = RECORD
                    partname    : alfa;
                    element     : pnode;
                    trans_ass   : ttr_ass;
                    rest        : ptree_rule
                  END;

The field partname contains the alphabetic representation of the part name of the first part and the field element is the pointer to the node with the identifier for the element of this part. When the second pass is completed, the field trans ass will contain the transformed assignments for this part with the element represented by element for the part name represented by partname for this tree rule. The field rest points to the rest of the list of parts.

The transformed assignments of the main part name are stored with the element of the tree rule, as will be described later.

The other method (which is used during the last code generation pass) of storing the tree rule is in an array, using the following type declarations:

  p_tree_list   = ^t_tree_list;
  t_tree_list   = ARRAY[0..max_part_nr] OF ptree_rule;

The first position (with index 0) will be used for storing the element of the tree rule and the transformed assignments with the main part name.

During the last pass we also use the following array type declaration to store sets of attributes with each part name:

  t_imported    = ARRAY[0..max_part_nr] OF attributes;

1.8 Attribute sets with elements

With each element several sets of attributes are defined within the formal description. These will be stored with each element as an array. We first declare the enumeration type tattr kind representing all these different sets:

  tattr_kind    = (nor_gen, nor_in, nor_out,
                   all_gen, all_in, all_out);

The function "attr of" [2.4] is represented by the kind nor gen. The function "input attr of" and "output attr of" [2.5] are represented by respectively nor in and nor out, respectively. The kinds all gen, all in and all out are used for "all attr of", "all input attr" and "all output attr" [2.10], respectively. The declaration of the array type that contains all these sets is:

  tattr         = ARRAY[nor_gen..all_out] OF attributes;

1.9 Formal types

Formal types are used to represent the types that are used in the input description to type the semantic expressions. A formal type is not the name of a type, but it is used to describe the type of a part of an expression or a type definition of an attribute or a semantic function. A formal type is defined by the following type declaration:

  ttype         = RECORD
                    type_name   : pnode;
                    conc        : boolean
                  END;

The field type name is the pointer to the node with the name of the type. The field conc says whether this type is defined by the user, or deduced by the program. It has the value TRUE if the type is not directly defined by the input description, but is somehow deduced while reading the input. In the description of the error-handler module a more detailed description of formal types and there usage is presented [errors 4].

The following declaration is used to describe lists of formal types:

  pltype        = ^tltype;
  tltype        = RECORD
                    first       : ttype;
                    rest        : pltype
                  END;

The field first represents the first formal type of the list and rest the remaining part of the list.

1.10 The nodes

Each node of the binary identifier tree contains the information of one identifier. There are a number of different kinds of identifiers. Each identifier can only be used as one kind of identifiers. To identify the kind we use the following enumerated type declaration:

  tnode_kind    = (n_class, n_node, n_elem, n_type,
                   n_func, n_attr,
                   n_system, n_pascal, n_undef
                  );

The kinds n class and n node are used for class and node type identifiers, respectively. The kind n elem is used for identifiers that are used as an element, but have not been defined as such. It is possible that identifiers of this kind change in a class or node type identifier if they are used in such a way. The kind n type is used for type identifiers. The kind n func is used for semantic function identifiers. The kind n attr is used for attribute identifiers. The kind n pascal is used for identifiers that are pascal preserved words. These identifiers are stored in the tree during initialization. The two remaining kinds n system and n undef are currently not used as identifier kinds.

1.10.1 Typing of a node

We first present that part of the node type declaration that is the same for all kinds of identifiers. The incomplete declaration of the type tnode is:

  tnode    = RECORD
               name        : alfa;
               left,right  : pnode;
               status      : terror_sts;
             CASE kind : tnode_kind OF
               (  variant parts for different kinds of identifiers  )
             END;

The field name contains the alphabetic representation of the identifier. The fields left and right are used for the left and right subtrees of this node in the binary identifier tree. The field status is used to record erroneous usage of this identifier. The type declaration of terror sts is:

  terror_sts    = RECORD
                    defined     : SET OF tnode_kind;
                    conc        : boolean
                  END;

The field defined can be used to record all other kinds of identifiers that this identifier is used for and the field conc indicates whether any property of this identifier has been deduced from its context. The error status can be used for suppressing error messages as will be explained later.

1.10.2 Element identifiers

The element identifiers use the following extra declaration:

               n_class  ,
               n_node   ,
               n_elem   : ( elem_nr        : elem_range;
                            tree_rule      : ptree_rule;
                            kind_of_rule   : trule_kind;
                            attr_ass       : plattr_ass;
                            attr           : tattr;
                            parent         ,
                            class_rule     ,
                            clos_class     : elements;
                            trans_ass      : ttr_ass;
                            tree_list      : p_tree_list;
                            nr_of_parts    : integer);

The field elem nr contains the element number assigned to the element represented by this node. The fields tree rule and kind of rule represent the tree rule with this element. The sets ("elements with tree rule" [2.6.2], "elements with indirect tree rule", "elements with indirect tree rule" [2.6.3] and "left hand side elements" [2.9]) and the consistent definition of the function "tree production of" [2.6.3] are constructed during the first pass when the rules are read. This is implemented by keeping track of the status of each element that has been read so far. This status is defined by the enumeration type trule kind with the following declaration:

  trule_kind    = (r_empt, r_dire, r_indi, r_undf);

The different kinds stand for : an empty tree rule, a direct tree rule, an indirect tree rule and an undefined tree rule. The status is initialized on r empt for node types and r undf for classes. While new rules are read the status of elements involved are changed and error messages are generated as soon as conflicts are detected.

The set "elements with tree rule" is defined as all those elements with status r dire. The set "elements with indirect tree rule" is defined as all those elements with status r indi. The set "elements with tree production" is defined as all those elements with status r dire or r indi. And the set "left hand side elements" is defined as all those elements with status r empt or r dire.

The field attr ass points to the attribute assignments declared with this element. The field attr contains all the attribute relations as described in section 1.8 The fields parent, class rule and clos class represent the "parent(E)", "class(E)" and "clos class(E)" relation, respectively, where "E" represents this element. The field trans ass contains the transformed attribute assignments that belong to the main part name; these are the assignments to the synthesized attributes with the main element of the tree rule. The fields tree list and nr of parts describe the alternative representation of the tree rule as described in section 1.7

1.10.3 Attribute identifiers

The attribute identifiers use the following extra declarations:

               n_attr   : ( attr_nr        : attr_range;
                            type_of_attr   : ttype     );

The field attr nr contains the attribute number of the attribute that is represented by this identifier. The field type of attr is used to represent the formal type of this attribute, and it is the representation of the function "type of(A)".

1.10.4 Semantic function identifiers

The semantic function identifiers have the following extra declarations:

               n_func   : ( nr_of_arg      : integer;
                            type_of_func   : ttype;
                            args           : pltype    );

When this function identifier is equal to "F", then the field type of func represents "type of(F)" and the field nr of args represents "number of args of(F)". The field args contains a pointer to the list of types of the arguments of this semantic function. The ith element of this list is equal to the expression "type arg(i,F)"

1.10.5 Other identifiers

The rest of the identifier kinds have no extra declarations:

               n_type   : ();
               n_system ,
               n_pascal ,
               n_undef  : ()

This completes the definition of all the type declarations that are used for the nodes of the binary identifier tree.

2 Variables used for the data structure

In this section we describe the variables that contain all the data that is stored by the binary identifier tree module. In each subsection we will describe a part of the data structure.

2.1 The binary tree

The variable nametree of type pnode is used to store the entire binary identifier tree. This variable should not be manipulated outside this module. The variable correct status of type terror sts (see subsection 1.10.1) is initialized with the value of a correct error status of an identifier. This means that the field defined contains the empty set and the field conc is equal to FALSE.

2.2 Elements

The identifiers for the elements are stored in the binary tree. We introduced the type elements because we wanted to store sets of elements in an efficient way. But these sets contain only numbers. The variable element of type elem array contains the mapping of element numbers to nodes of the binary tree that represent these elements. The variable nr of elem contains the highest element number that is used. For all element numbers "nr" in the range "[0..nr of elem]" the expression "element[nr]^.elem nr" is equal to "nr" and for each node that represents an element identifier the expression "element[elem nr]" points to this node.

The variable classes is the representation of the set "classes" [2.1] and the variable node types" of the set "nodes" [2.1] (not to be confused with the nodes of the binary tree). The variables left side elements and right side elements represent the sets "left hand side elements" and "right hand side elements" [2.9], respectively. These variables are initialized by the procedure make right left side elements [depend 1].

The variable root contains "root" [2.7].

2.3 Attributes

The identifiers for the attributes are stored in the binary tree. We introduced the type attributes because we wanted to store sets of attributes in an efficient way. But these sets contain only attribute numbers. The variable attribute of type attr array contains the mapping of the numbers to nodes in the binary tree that represent these elements. The variable nr of attr contains the highest attribute number that is used. For all attribute numbers "nr" in the range "[0..nr of attr]" the expression "attribute[nr]^.attr nr" is equal to "nr" and for each node that represents an attribute identifier the expression "attribute[attr nr]" points to this node.

The sets "synthesized attributes" and "inherited attributes" are represented by the variables g syn attr and g inh attr of type attributes, respectively. The variables g input attr and g output attr of type attributes are used in the construction of the relations "input attr of", "output attr of" and "interface attr of". They contain those attributes that are defined as input/output attributes for all elements.

2.4 Formal types

The two variables und type and err type are used as the initial values of an undefined formal type and an erroneous formal type, respectively.

3 The binary identifier tree

In this section we describe how the binary tree is used. The binary identifier tree is used to store information associated with predefined identifiers, and identifiers read from the input file. We could have used another storage mechanism, for example a hash-table. The storage medium that is implemented with a binary tree does not need to be visible to the user. In this module four procedures are given that manipulate or inspect the binary identifier tree. They are described in the following sections.

3.1 Initialization

Before the binary tree can be used it must be initialized. This is done by the following procedure:

PROCEDURE init_tree;

This procedure first initializes the binary tree as the empty tree, and then fills in all the predefined identifiers. These are the PASCAL reserved words and the standard type identifiers. The procedure also initializes the two counters nr of elem and nr of attr, which are used to assign unique numbers to element and attribute identifiers, respectively, as described previously. The variable correct status is also initialized.

Two other variables und type and err type used to represent certain formal types are also initialized. The ways in which these two variables are used is described in the error handler module.

3.2 Storing and retrieving

There are two functions for manipulating or inspecting the values associated with an identifier in the binary tree. The first one is used to add a new identifier to the tree. The header of this function is:

FUNCTION insert_name(VAR root : pnode; in_name : alfa;
                     VAR p : pnode; tag : tnode_kind
                    ) : boolean;

This function must always be called with the variable nametree as its first argument. The function yields TRUE when the identifier represented by the parameter in name, of the identifier kind represented by the parameter tag, was added to the tree. Otherwise, the result is FALSE, indicating that the identifier was already present. In both cases the parameter p will point to the node with the identifier name. The reason for this is twofold. First, the pointer will always be unequal NIL, and second, recording erroneous uses of the identifier can be stored in its node.

The second function is used to get information associated with a certain identifier. The header of this function is:

FUNCTION get_name(root : pnode; in_name : alfa;
                  VAR p : pnode) : boolean;

This function also has to be called with the variable nametree as its first argument. This function yields TRUE when the identifier represented by the parameter in name is stored in the tree. In this case, the parameter p will contain a pointer to the node with this name. Otherwise, the function will return FALSE and the parameter p will be NIL.

4 Error status and error suppressing

There is a mechanism to administer erroneous uses of identifier names. This can be very helpful in reducing the number of error messages. When, for example, an identifier has not been defined this should be reported only once. There are two kinds of errors that could occur. Variables may not be defined when they are used as an identifier of a certain kind, or they can be used as a variable of the incorrect kind. In the first case, we can deduce the declaration of the identifier from its usage.

At each identifier an error status is attached, which consists of two different parts. The first part is a boolean (initially FALSE), indicating whether information of this identifier has been deduced. The other is a set (initially empty) of all identifier kinds indicating for which kinds this identifier has been used, different from its own identifier kind. For each part there are two procedures.

4.1. Deduced error status

With the procedure conclude the concluded error status can be set. The header of the procedure is:

PROCEDURE conclude(name_ptr : pnode);

The parameter name ptr points to the node of the identifier for which the concluded status has to be set. The function concluded can be used to test whether the status has been set. The header of this function is:

FUNCTION concluded(name_ptr : pnode) : boolean;

This function returns TRUE when the status has been set by calling the procedure conclude with the same arguments, otherwise FALSE.

4.2. Incorrectly used error status

With the procedure error define it is possible to administer that an identifier has been used as a incorrect kind of identifier. The header of the procedure is:

PROCEDURE error_define(name_ptr : pnode;
                       wanted : tnode_kind);

The identifier kind wanted is added to the error status of the identifier with the node name ptr. The function error defined has the header:

FUNCTION error_defined(name_ptr : pnode;
                       wanted : tnode_kind) : boolean;

This function will return TRUE when the identifier with the node name ptr was used as an identifier of the kind represented by the parameter wanted (only if this was reported by calling the procedure error define with the same arguments). Otherwise, the function will return FALSE.

5 Tree statistics

This section describes the procedure tree statistics that returns statistical information about the binary identifier tree. The header of the procedure is:

PROCEDURE tree_statistics
           (VAR nr_of_nodes,
                min_dept, max_depth : integer;
            VAR average_depth : real;
            VAR count : levels);

The parameter nr of nodes will contain the total number of nodes in the tree. The parameters min depth and max depth will contain the minimum depth that is completely filled and the maximum depth that does contain a node, respectively. The parameter average depth will contain the average depth of all the nodes, which is representative of the average search time for each identifier. The last parameter count of type levels will contain the number of nodes on the first 20 levels of the binary tree.

6 Interface

This module uses declarations from the following modules:

definitions, perform.

6.1 Exported types

Because of the size of type definitions declared by this module we refer to the listing given in section 7.1

6.2 Exported variables

The variables are described in section 2 The following variables are exported:

nametree     : pnode;        { root of the name tree }

correct_status : terror_sts; { a correct status }
und_type       ,             { undefined formal type }
err_type       : ttype;      { error formal type }

g_inh_attr      ,             { inherited attributes }
g_syn_attr      ,             { synthesized attributes }
g_input_attr    ,             { global input attributes }
g_output_attr   : attributes; { global output attributes }
element         : elem_array; { all the information about the elements }
nr_of_elem      : integer;    { highest element number }
attribute       : attr_array; { all the information about the attributes }
nr_of_attr      : integer;    { highest attribute number }
root            : pnode;      { element number of the root }
classes         ,             { the set of classes }
node_types      ,             { the set of node types }
left_side_elements  ,           { left_hand_side_elements }
right_side_elements : elements; { right_hand_side_elements }

6.3 Exported functions and procedures

We list the exported functions and procedures, followed by a short description of their usage taken from the listing file:

PROCEDURE next_elem(VAR elem_nr : integer;
                    VAR elem_set : elements);

This procedure seeks the next element number in the set of elements, represented by elem set, and deletes this number, which is also returned in the (reference) parameter elem nr.

PROCEDURE next_attr(VAR attr_nr : integer;
                    VAR attr_set : attributes);

This procedure seeks the next attribute number in the set of attributes, represented by attr set and deletes this number, which is also returned in the (reference) parameter attr nr.

FUNCTION insert_name(VAR root : pnode; in_name : alfa;
                     VAR p : pnode; tag : tnode_kind
                    ) : boolean;

This function tries to insert a new node for the identifier with the name represented by in name, in the tree with the root pointed to by root. In the case this is successful it returns the value TRUE. Otherwise the value FALSE is returned, indicating that there existed already a node with the name represented by in name. In both cases the (reference) parameter p will point to the node which represents the identifier with the given name. See subsection 3.2

FUNCTION get_name(root : pnode; in_name : alfa;
                  VAR p : pnode) : boolean;

This function returns the value TRUE, when there is a node with a name equal to in name, in the tree pointed to by root. Otherwise, the value FALSE is returned. The (reference) parameter p will point to the node in the tree, with the name equal to in name, when it exists. See subsection 3.2

PROCEDURE conclude(name_ptr : pnode);

This procedure sets the concluded property of the identifier pointed to by name ptr. See subsection 4.1

FUNCTION concluded(name_ptr : pnode) : boolean;

This function returns the value TRUE, when the concluded property of the identifier pointed to by name ptr, has been set. Otherwise it returns the value FALSE. See subsection 4.1

PROCEDURE error_define(name_ptr : pnode; wanted : tnode_kind);

This procedure records that the identifier pointed to by name ptr, is used as identifier of the kind represented by wanted. See subsection 4.2

FUNCTION error_defined(name_ptr : pnode; wanted : tnode_kind
                      ) : boolean;

This function returns the value TRUE, when it was recorded that the identifier pointed to by the parameter name ptr, has been used as an identifier of the kind represented by wanted. Otherwise it returns the value FALSE. See subsection 4.2

PROCEDURE init_tree;

This procedure initializes the binary identifier tree. See subsection 3.1

[GLOBAL,HIDDEN]
PROCEDURE tree_statistics(VAR nr_of_nodes, min_depth,
                               max_depth : integer;
                          VAR average_depth : real;
                          VAR count : levels);

This procedure produces statistical information on the binary identifier tree, by counting the number of non-empty nodes on every depth, with a maximum of 20. See section 5

7 The listing

In this section the complete listing of the module, with subsections for each page of the original listing, is given:

[ENVIRONMENT ('bintree.pen'),
 INHERIT     ('definitions.pen'
             ,'perform.pen')]

MODULE bintree(output);

(*  This module contains all the procedures to manipulate the binary tree,    *)
(*  such as storing and retrieving the information attached to an identifier. *)
(*  It also contains procedures for error suppressing, depending on           *)
(*  information stored with each identifier about erroneous usages.           *)
(*  It also contains an initialization routine that has to be called before   *)
(*  any operation on the binary tree is done.                                 *)
(*  Furthermore it contains an external procedure that returns statistical    *)
(*  information about the tree.                                               *)
(*                                                                            *)
(*  ADAPTIONS:                                                                *)
(*                                                                            *)
(*  If no error suppressing is used one has to delete the definition of       *)
(*  terror_sts, and the status field in the definition of tnode and the       *)
(*  entire page on error suppressing. The concluded field in ttype is used in *)
(*  the procedures of the ERROR module.                                       *)
(*  If no performance analysis is required, one has to remove all the         *)
(*  occurrences of perf_nr_calls and the page on tree statistics.             *)

(*      BINARY TREE REPRESENTATION                                            *)

CONST
  max_elem_nr     = 255;
  max_attr_nr     = 255;
  max_part_nr     = 10;
  error_nr        = -1;
  start_nr        = -1;
  error_part_nr   = -1;
  main_part_nr    =  0;
  main_part_name  = '#';

7.1 Type declarations

TYPE
(* forward declaration of the pointer to the nodes : *)
  pnode         = ^tnode;

(* Begin of the declaration for the elements                                  *)
  elem_range    = 0..max_elem_nr;
  elements      = SET OF elem_range;
  elem_array    = ARRAY[0..max_elem_nr] OF pnode;

(* Begin of the declaration for the attributes                                *)
  attr_range    = 0..max_attr_nr;
  attributes    = SET OF attr_range;
  attr_array    = ARRAY[0..max_attr_nr] OF pnode;
  tattr_at_elem = ARRAY[0..max_elem_nr] OF attributes;

(* Begin of the declaration for expressions                                   *)
  pexpr         = ^texpr;

  plexpr        = ^tlexpr;
  tlexpr        = RECORD
                    first       : pexpr;
                    type_of_arg ,
                    loc_rep     : alfa;
                    rest        : plexpr
                  END;

  plalt_expr    = ^tlalt_expr;
  tlalt_expr    = RECORD
                    line_nr     : integer;
                    other_sel   : boolean;
                    selectors   : elements;
                    expr        : pexpr;
                    rest        : plalt_expr
                  END;

  texpr_kind    = (e_atoc, e_func, e_case);
  texpr         = RECORD
                    line_nr     : integer;
                  CASE kind : texpr_kind OF
                    e_atoc : ( attr       : pnode;
                               partnamenr : integer    );
                    e_func : ( func       : pnode;
                               args       : plexpr     );
                    e_case : ( headpnnr   : integer;
                               alter      : plalt_expr )
                  END;

(* Begin of the declaration for normal attribute assignments                  *)
  plattr_ass    = ^tlattr_ass;

  plalt_ass     = ^tlalt_ass;
  tlalt_ass     = RECORD
                    line_nr     : integer;
                    other_sel   : boolean;
                    selectors   : elements;
                    attr_ass    : plattr_ass;
                    rest        : plalt_ass
                  END;

  tass_kind     = (a_simp, a_sele);
  tlattr_ass    = RECORD
                    line_nr     : integer;
                    rest        : plattr_ass;
                  CASE kind : tass_kind OF
                    a_simp : ( attr       : pnode;
                               partnamenr : integer;
                               expr       : pexpr    );
                    a_sele : ( headpnnr   : integer;
                               alter      : plalt_ass )
                  END;

(* Begin of the declaration for transformed attribute assignments             *)
  pltr_ass      = ^tltr_ass;

  ttr_ass       = RECORD
                    ass_attr    : attributes;
                    top         : plexpr;
                    deeper      : pltr_ass
                  END;

  tltr_ass      = RECORD
                    first       : ttr_ass;
                    rest        : pltr_ass
                  END;

(* Begin of the declaration for the tree rules                                *)
  ptree_rule    = ^ttree_rule;
  ttree_rule    = RECORD
                    partname    : alfa;
                    element     : pnode;
                    trans_ass   : ttr_ass;
                    rest        : ptree_rule
                  END;
  p_tree_list   = ^t_tree_list;
  t_tree_list   = ARRAY[0..max_part_nr] OF ptree_rule;
  t_imported    = ARRAY[0..max_part_nr] OF attributes;

(* Begin of the declaration of sets of attributes                             *)
  tattr_kind    = (nor_gen, nor_in, nor_out, all_gen, all_in, all_out);
  tattr         = ARRAY[nor_gen..all_out] OF attributes;

(* Begin of the declaration for an element                                    *)
  trule_kind    = (r_empt, r_dire, r_indi, r_undf);

(* representation of a formal type : *)
  ttype         = RECORD
                    type_name   : pnode;
                    conc        : boolean
                  END;

(* representation of a list of formal types : *)
  pltype        = ^tltype;
  tltype        = RECORD
                    first       : ttype;
                    rest        : pltype
                  END;

  tnode_kind    = (n_class, n_node, n_elem, n_type, n_func, n_attr,
                   n_system, n_pascal, n_undef
                  );
  tinout_kind   = (n_input, n_output);

(* representation of the error status : *)
  terror_sts    = RECORD
                    defined     : SET OF tnode_kind;
                    conc        : boolean
                  END;

(* A node of the binary tree : *)
  tnode         = RECORD
                    name        : alfa;
                    left,right  : pnode;
                    status      : terror_sts;
                  CASE kind : tnode_kind OF
                    n_class  ,
                    n_node   ,
                    n_elem   : ( elem_nr        : elem_range;
                                 tree_rule      : ptree_rule;
                                 kind_of_rule   : trule_kind;
                                 attr_ass       : plattr_ass;
                                 attr           : tattr;
                                 parent         ,
                                 class_rule     ,
                                 clos_class     : elements;
                                 trans_ass      : ttr_ass;
                                 tree_list      : p_tree_list;
                                 nr_of_parts    : integer);
                    n_type   : ();
                    n_func   : ( nr_of_arg      : integer;
                                 type_of_func   : ttype;
                                 args           : pltype     );
                    n_attr   : ( attr_nr        : attr_range;
                                 type_of_attr   : ttype      );
                    n_system ,
                    n_pascal ,
                    n_undef  : ()
                  END;

7.2 Variable declaration

VAR
  nametree     : pnode;      (* root of the name tree                         *)

  correct_status : terror_sts; (* a correct status *)
  und_type       ,        (* undefined formal type *)
  err_type       : ttype; (* error formal type *)


(* from this program, shared with LOAD, TRANS and DEPEND                      *)
  g_inh_attr      ,             (* inherited attributes                       *)
  g_syn_attr      ,             (* synthesized attributes                     *)
  g_input_attr    ,             (* global input attributes                    *)
  g_output_attr   : attributes; (* global output attributes                   *)
  element         : elem_array; (* all the information about the elements     *)
  nr_of_elem      : integer;    (* highest element number                     *)
  attribute       : attr_array; (* all the information about the attributes   *)
  nr_of_attr      : integer;    (* highest attribute number                   *)
  root            : pnode;      (* element number of the root                 *)
  classes         ,             (* the set of classes                         *)
  node_types      ,             (* the set of node types                      *)
  left_side_elements  ,           (* left_hand_side_elements                  *)
  right_side_elements : elements; (* right_hand_side_elements                 *)

7.3 Set procedures

  PROCEDURE next_elem(VAR elem_nr : integer; VAR elem_set : elements);
  (* This procedure seeks the next element number in the set of elements      *)
  (* elem_set and deletes the number, which it returns in elem_nr.            *)
  BEGIN
    REPEAT elem_nr := succ(elem_nr)
    UNTIL elem_nr IN elem_set;
    elem_set := elem_set - [elem_nr]
  END;

  PROCEDURE next_attr(VAR attr_nr : integer; VAR attr_set : attributes);
  (* This procedure seeks the next attribute number in the set of attributes  *)
  (* attr_set and deletes the number, which it returns in attr_nr.            *)
  BEGIN
    REPEAT attr_nr := succ(attr_nr)
    UNTIL attr_nr IN attr_set;
    attr_set := attr_set - [attr_nr]
  END;

7.4 Storing and retrieving procedures

This page contains the procedures to store (add/put) new identifiers in the binary tree, and to retrieve (get) identifiers.

[HIDDEN, EXTERNAL]
PROCEDURE impl_error(nr : integer); EXTERN;

[HIDDEN]
PROCEDURE create_node(VAR root:pnode; in_name:alfa; tag:tnode_kind);
(* This procedure creates a new node and initializes the variant part which   *)
(* is different for each kind of identifier.                                  *)
VAR attr_kind : tattr_kind;

  FUNCTION new_element_number : integer;
  (* This function makes a new unique number for this element, pointed to by  *)
  (* root. If there are no numbers available anymore, an error message is     *)
  (* generated. As a side effect, the pointer root is also stored in          *)
  (* element with its number.                                                 *)
  BEGIN
    nr_of_elem := nr_of_elem + 1;
    IF nr_of_elem > max_elem_nr
    THEN (* no more numbers available *)
         BEGIN
           IF   nr_of_elem = max_elem_nr + 1
           THEN impl_error(1); (* error(t_err, i_tme) *)
           new_element_number := nr_of_elem
         END
    ELSE BEGIN
           element[nr_of_elem] := root;
           new_element_number := nr_of_elem
         END
  END;

  FUNCTION new_attribute_number : integer;
  (* This function makes a new unique number for this attribute, pointed to   *)
  (* by root. If there are no numbers available anymore, an error message is  *)
  (* generated. As a side effect, the pointer root is also stored in          *)
  (* attribute with its number.                                               *)
  BEGIN
    nr_of_attr := nr_of_attr + 1;
    IF nr_of_attr > max_attr_nr
    THEN (* no more numbers available *)
         BEGIN
           IF   nr_of_attr = max_attr_nr + 1
           THEN impl_error(2); (* error(t_err, i_tma) *)
           new_attribute_number := nr_of_attr
         END
    ELSE BEGIN
           attribute[nr_of_attr] := root;
           new_attribute_number := nr_of_attr
         END
  END;

BEGIN (* of create_node *)
  new(root);
  WITH root^
  DO BEGIN
       name   := in_name;
       left   := NIL;
       right  := NIL;
       kind   := tag;
       status := correct_status;

(* initialize variant part of node record *)
       CASE kind OF
         n_class  ,
         n_node   ,
         n_elem   : BEGIN
                      elem_nr      := new_element_number;
                      tree_rule    := NIL;
                      IF   kind = n_class
                      THEN kind_of_rule := r_undf
                      ELSE kind_of_rule := r_empt;
                      attr_ass     := NIL;
                      FOR attr_kind := nor_gen TO all_out
                      DO attr[attr_kind] := [];
                      parent       := [];
                      class_rule   := [];
                      clos_class   := [];
                      (* trans_ass not initialized, set in pass2 *)
                    END;
         n_func   : BEGIN
                      nr_of_arg    := 0;
                      type_of_func := und_type;
                      args         := NIL
                    END;
         n_attr   : BEGIN
                      attr_nr      := new_attribute_number;
                      type_of_attr := und_type
                    END;
       OTHERWISE
       END
     END
END;

[HIDDEN]
FUNCTION compare (name1, name2:alfa):integer;
(* This function compares name1 with name2. It returns 0 if they are equal,   *)
(*  -1 if name1 is lexicographical before name2 and otherwise 1.              *)
BEGIN
  perf_nr_calls(perf_compare);      (* for performance analysis *)
  IF name1 = name2
  THEN compare := 0
  ELSE IF name1 < name2
       THEN compare := -1
       ELSE compare := 1
END;

FUNCTION insert_name(VAR root:pnode; in_name:alfa;
                     VAR p:pnode; tag : tnode_kind):boolean;
(* This function tries to insert a new node in the tree for identifier        *)
(* in_name in the tree with root root. If it succeeds it will return TRUE     *)
(* otherwise FALSE, indicating that there already exists a node for the       *)
(* identifier. In both cases the variable pointer p will always point to the  *)
(* node for the identifier.                                                   *)
BEGIN
  perf_nr_calls(perf_insert_name);       (* for performance analysis *)
  IF root = NIL
  THEN BEGIN
         create_node(root,in_name,tag);
         p := root;
         insert_name := TRUE
       END
  ELSE WITH root^
       DO CASE compare(in_name, name) OF
(* less  *) -1 : insert_name := insert_name(left, in_name, p, tag);
(* equal *)  0 : BEGIN
                   p := root;
                   insert_name := FALSE
                 END;
(*greater*)  1 : insert_name := insert_name(right, in_name, p, tag)
          END
END;

FUNCTION get_name(root:pnode; in_name:alfa; VAR p:pnode):boolean;
(* This function returns TRUE if in_name is in the tree pointed to by root;   *)
(* the pointer variable p will point to the node in the tree, if it exists.   *)
BEGIN
  perf_nr_calls(perf_get_name);      (* for performance analysis *)
  IF root = NIL
  THEN BEGIN
         p := NIL;
         get_name := FALSE
       END
  ELSE WITH root^
       DO CASE compare(in_name, name) OF
            -1 : get_name := get_name(left, in_name, p);  (* less *)
             0 : BEGIN
                   p := root;
                   get_name := TRUE                       (* equal *)
                 END;
             1 : get_name := get_name(right, in_name, p); (* greater *)
          END
END;

7.5 Error suppressing procedures

(*  The following procedures are for the administration of errors that        *)
(*  occurred with a certain name, so that error suppressing is possible, by   *)
(*  converting normal error into warnings.                                    *)
(*                                                                            *)
(*  EXPLANATION                                                               *)
(*                                                                            *)
(*  When administrating identifiers and the information attached certain      *)
(*  errors will occur that can cause a great number of error messages.        *)
(*  For example, if an identifier is not declared this should be              *)
(*  reported only once.                                                       *)
(*  There are two kinds of error status' incorperated in the binary tree :   *)
(*  - incorrect usage of identifiers (for example: a type name as an          *)
(*    element identifier).                                                    *)
(*  - concluded information (for example the type of an undeclared variable   *)
(*    can be concluded from its usage)                                        *)
(*  The error procedure (described in the ERROR module) as an optional third  *)
(*  boolean parameter that can be used for certain error messages to reduce   *)
(*  from an error into a warning. With the right option set, these warnings   *)
(*  can be suppressed.                                                        *)

  PROCEDURE conclude(name_ptr:pnode);
  (* This procedure sets the concluded property of name, pointed to by        *)
  (* name_ptr. This means that some of the properties of this name are        *)
  (* concluded, and error message have to be suppressed.                      *)
  BEGIN
    name_ptr^.status.conc := TRUE
  END;

  FUNCTION concluded(name_ptr:pnode) : boolean;
  (* This function returns TRUE, if the concluded property of the name,        *)
  (* pointed to by name_ptr, has been set, otherwise it returns FALSE.         *)
  BEGIN
    concluded := name_ptr^.status.conc
  END;

  PROCEDURE error_define(name_ptr:pnode; wanted:tnode_kind);
  (* This procedure registrated that the name, pointed to by name_ptr, has    *)
  (* been used also as name of kind wanted. This is registrated so that at    *)
  (* further uses this will be seen.                                          *)
  BEGIN
    WITH name_ptr^.status
    DO defined := defined + [wanted]
  END;

  FUNCTION error_defined(name_ptr:pnode; wanted:tnode_kind) : boolean;
  (* This function returns TRUE, if it was registrated that name, pointed by  *)
  (* name_ptr, has been used before as a name of kind wanted. Otherwise it    *)
  (* will return FALSE.                                                       *)
  BEGIN
    error_defined := wanted IN name_ptr^.status.defined
  END;

7.6 Initializing the binary tree

PROCEDURE init_tree;
(* This procedure initiates the name tree with system names and Pascal words  *)
(* (including standard types). The created tree is almost AVL balanced,       *)
(* because the order of insertions is carefully chosen.                       *)
(* This procedure should be called before any operation on the binary tree    *)
(* by calling any of the other exported procedures.                           *)
(*                                                                            *)
(* WARNING: The messages generated by the different init_ procedures should   *)
(*          not occur on the terminal. They indicate programming errors !!!!  *)

  PROCEDURE init_pascal(name:alfa);
  (* This procedure inserts name as a Pascal word *)
  VAR
    p : pnode;
  BEGIN
    IF NOT insert_name(nametree, name, p, n_pascal)
    THEN  write('ERROR IN INIT_TREE FOR : ', name, ' , DEFINED AS N_PASCAL')
  END;

  PROCEDURE init_system(name:alfa);
  (* This procedure inserts name as a system name *)
  VAR
    p : pnode;
  BEGIN
    IF NOT insert_name(nametree, name, p, n_system)
    THEN  write('ERROR IN INIT_TREE FOR : ', name, ' , DEFINED AS N_SYSTEM')
  END;

  PROCEDURE init_type(name:alfa);
  (* This procedure inserts name as a type name *)
  VAR
    p : pnode;
  BEGIN
    IF NOT insert_name(nametree, name, p, n_type)
    THEN write('ERROR IN INIT_TREE FOR : ', name, ' , DEFINED AS N_TYPE')
  END;

BEGIN
  nr_of_elem := start_nr;
  nr_of_attr := start_nr;
  correct_status.defined := [];
  correct_status.conc    := FALSE;
  und_type.conc      := FALSE;
  und_type.type_name := NIL;
  err_type.conc      := TRUE;
  err_type.type_name := NIL;
  nametree := NIL;
  init_pascal('if');
  init_pascal('do');
  init_pascal('case');
  init_pascal('array');
  init_pascal('and');
  init_pascal('begin');
  init_type  ('boolean');
  init_pascal('const');
  init_type  ('char');
  init_pascal('div');
  init_pascal('file');
  init_pascal('end');
  init_pascal('else');
  init_pascal('downto');
  init_pascal('function');
  init_pascal('for');
  init_pascal('goto');
  init_pascal('packed');
  init_pascal('mod');
  init_pascal('module');
  init_pascal('in');
  init_pascal('label');
  init_type  ('integer');
  init_pascal('of');
  init_pascal('nil');
  init_pascal('not');
  init_pascal('otherwise');
  init_pascal('or');
  init_pascal('to');
  init_pascal('record');
  init_pascal('rem');
  init_pascal('program');
  init_pascal('procedure');
  init_type  ('real');
  init_pascal('set');
  init_pascal('repeat');
  init_pascal('then');
  init_pascal('type');
  init_pascal('until');
  init_pascal('var');
  init_pascal('varying');
  init_pascal('with');
  init_pascal('while');
  init_pascal('external');
  init_pascal('fortran');
  init_pascal('forward');
END;

7.7 Tree statistics procedure

  [GLOBAL,HIDDEN]
  PROCEDURE tree_statistics( VAR nr_of_nodes, min_depth, max_depth : integer
                           ; VAR average_depth : real
                           ; VAR count:levels);
  (* This procedure produces stastical information on the binary identifier   *)
  (* tree, by counting the number of non-empty nodes on every depth, with a   *)
  (* maximum of 20, and all the nodes with a greater depth are counted.       *)
  VAR
    i : integer;

    PROCEDURE walk(root:pnode; depth:integer; part_of_depth : real);
    (* This procedure will walk through the tree with root root and on the    *)
    (* depth depth, and do all the work for this tree.                        *)
    BEGIN
      IF   root <> NIL
      THEN (* a non-empty node found *)
           WITH root^
           DO BEGIN
                nr_of_nodes := succ(nr_of_nodes);
                IF   depth < max_level
                THEN count[depth]     := succ(count[depth])
                ELSE count[max_level] := succ(count[max_level]);
                depth := succ(depth);
                average_depth := average_depth + part_of_depth;
                part_of_depth := part_of_depth/2;
                walk(left,  depth, part_of_depth);
                walk(right, depth, part_of_depth);
                IF   depth > max_depth
                THEN max_depth := depth
              END
      ELSE IF   depth < min_depth
           THEN min_depth := depth
    END;

  BEGIN  (* tree_statistics *)
    FOR i := 0 TO max_level DO count[i] := 0;
    nr_of_nodes := 0;
    max_depth     := 0;
    average_depth := -1;
    min_depth     := max_level;
    walk(nametree, 0, 1)
  END;

END.

My life as a hacker | My home page

The binary identifier tree module (bintree)