@q file: prog.w@>
@q%   Copyright Dave Bone 1998 - 2015@>
@q% /*@>
@q%    This Source Code Form is subject to the terms of the Mozilla Public@>
@q%    License, v. 2.0. If a copy of the MPL was not distributed with this@>
@q%    file, You can obtain one at http://mozilla.org/MPL/2.0/.@>
@q% */@>
@** External routines and globals.\fbreak
General routines to get things going:\fbreak
\ptindent{1) get control file and put into \O2's holding file}
\ptindent{2) parse the command line}
\ptindent{3) format errors}
\ptindent{4) \O2's parse phrases --- pieces of syntactic structures}
These are defined by including |o2_externs.h|. 
Item 4 is driven out of the |pass3.lex| grammar. 
It demonstrates a procedural approach similar to the recursive descent
parsing technique.

The globals are:\fbreak
\ptindent{a) |Error_queue| --- global container of errors passed across all parsings}
\ptindent{b) Switches from command line parse}
\ptindent{c) Token containers for the parsing phases}

@<External rtns and variables@>=
extern int RECURSION_INDEX__;
extern void COMMONIZE_LA_SETS();
extern int NO_LR1_STATES;
extern STATES_SET_type VISITED_MERGE_STATES_IN_LA_CALC;
extern LR1_STATES_type LR1_COMMON_STATES;
extern CYCLIC_USE_TBL_type CYCLIC_USE_TABLE;
extern void Print_dump_state(state* State);
@** Main line of \O2.
@<accrue \o2 code@>+=

YACCO2_define_trace_variables();
//|Recursion_count();|
int RECURSION_INDEX__(0);
yacco2::CHAR T_SW('n');
yacco2::CHAR ERR_SW('n');
yacco2::CHAR PRT_SW('n');

yacco2::TOKEN_GAGGLE JUNK_tokens;
yacco2::TOKEN_GAGGLE P3_tokens;
yacco2::TOKEN_GAGGLE Error_queue;
char Big_buf[BIG_BUFFER_32K];
T_sym_tbl_report_card report_card;
std::string o2_file_to_compile;
std::string o2_fq_fn_noext;
STBL_T_ITEMS_type STBL_T_ITEMS;
STATES_type LR1_STATES;
LR1_STATES_type LR1_COMMON_STATES;
bool LR1_HEALTH(LR1_COMPATIBLE);
int NO_LR1_STATES(0);
STATES_SET_type VISITED_MERGE_STATES_IN_LA_CALC;
CYCLIC_USE_TBL_type CYCLIC_USE_TABLE;


int main(int argc, char* argv[])
{
  cout << yacco2::Lr1_VERSION << std::endl;
 
@<setup \O2 for parsing@>;
@<fetch command line info...@>;
  lrclog << yacco2::Lr1_VERSION << std::endl;

@<are all phases parsed?@>;
@<epsilon and pathological assessment of Rules@>;
@<dump aid: enumerate grammar's components@>;
@<determine if la expression...@>;
@<get total number of subrules for |elem_space| size check@>;
@<calculate rules first sets@>;
@<calculate Start rule called threads first sets@>;
@<generate grammar's LR1 states@>;
@<is the grammar unhealthy? yes report the details and exit@>;
@<determine each rule use count@>;
@<emit FSA, FSC, and Documents of grammar@>;
//|@<print tree@>;|
//|@<shutdown@>;|
exit:
lrclog << "Exiting O2" << std::endl;
  return 0;
}
@** Some Programming sections.
@*2 Shutdown.\fbreak
Prints out the thread table with each thread's runtime activity, and calls
each one to quietly remove itself as a thread.
Within Unix this is not needed, as the wind-down duties of the
process remove launched threads:
that is why it is commented out.
Uncommenting it provides the
run statistics for the compiler writer to view reality in terms of performance stats.
@<shutdown@>=
lrclog << "Before thread shutdown" << std::endl;
yacco2::Parallel_threads_shutdown(pass3);
lrclog << "After thread shutdown" << std::endl;

@*2 Setup \O2 for parsing.\fbreak
@<setup \O2 for parsing@>=
  @<load \o2's keywords into symbol table@>;
  
@*2 Load \O2's keywords into symbol table.\fbreak
Basic housekeeping. Originally a separate grammar recognized keywords
by competing with the Identifier thread:
the keyword thread only ran if its first set matched the starting character
common to identifiers and keywords.
Now keyword recognition is blended into Identifier using a symbol table lookup that
returns not only the identifier terminal but also any keyword entries put into
the symbol table.

For now, only the keywords are cloned off as unique entities whilst all other
entries are
passed back from the symbol table with their source co-ordinates overridden.
@<load \o2's keywords into symbol table@>=
   LOAD_YACCO2_KEYWORDS_INTO_STBL();

@*2 Fetch command line info and parse the 3 languages.\fbreak
The 3 separate languages to parse are:\fbreak
\ptindent{1) fetching of the command line to place into a holding file}
\ptindent{2) the command line in the holding file --- grammar file name and options}
\ptindent{3) the grammar file's contents}
Items 1 and 2 are handled by external routines:
fetching of the command line is crude but all-purpose, whilst
the command line language is specific to \O2.
@<fetch command line info and parse the 3 languages@>=
  @<get command line...@>;  
  @<parse command line data placed in holding file@>;
  @<parse the grammar@>;

@*2 Get command line, parse it, and place contents into a holding file.
It uses a generic external routine to do this. The parse is very rudimentary.
The command data is placed into a holding file provided by
|Yacco2_holding_file| defined in the external library |o2_externs.h|.
See |cweb| documents mentioned in the introduction regarding other support libraries.
If the result is okay, set up \O2's library files for tracing.
 @<get command line, parse it, and place contents into a holding file@>=
  GET_CMD_LINE(argc,argv,Yacco2_holding_file,Error_queue);
  @<if error queue not empty then deal with posted errors@>;


@*2 Do we have errors?.
Check the error queue for posted errors.
Note, |DUMP_ERROR_QUEUE| will also flush out
any launched threads: good housekeeping, or is it the
housetrained seal
award? Trying to do my best in the realm of short-lived wind-downs.
@<if error queue not empty then deal with posted errors@>=
if(Error_queue.empty()!=true){
	DUMP_ERROR_QUEUE(Error_queue);
    return 1;
}

@*2 Parse command line data placed in holding file.
@<parse command line data placed in holding file@>=
  YACCO2_PARSE_CMD_LINE
	(T_SW,ERR_SW,PRT_SW,o2_file_to_compile,Error_queue);
  @<if error queue not empty then deal with posted errors@>;
  @<display to user options selected@>;
  @<extract fq name without extension@>;
  @<set up logging files@>;

@*3 Extract fully qualified file name to compile without its extension.\fbreak
Used to access the generated first set control file
for |cweb| documentation and \O2's tracings.
Simple check: if the grammar file name does not contain
a ``.extension'', the complete file name is used.
@<extract fq name without extension@>=
std::string::size_type pp = o2_file_to_compile.rfind('.');
if(pp == std::string::npos){
 o2_fq_fn_noext += o2_file_to_compile;
}else{
 o2_fq_fn_noext += o2_file_to_compile.substr(0,pp);
}

@*2 Set up \O2's logging files local to the parsed grammar.\fbreak
There are 2 stages.
Stage 1 logs to ``1lrerrors.log'' and ``1lrtracings''
as the command line is being parsed --- |o2_lcl_opts| and |o2_lcl_opt| grammars.
It has no knowledge of the grammar file to parse.
Stage 2 follows the command line parsing:
the inputted grammar file name
can now be used to build the grammar's local \O2 tracing files.
These log files are ``xxx\_tracings.log'' and ``xxx\_errors.log''
where the ``xxx'' is the grammar's base file name.
@<set up logging files@>=
std::string normal_tracing(o2_fq_fn_noext.c_str());
normal_tracing += "_tracings.log";
std::string error_logging(o2_fq_fn_noext.c_str());
error_logging += "_errors.log";
yacco2::lrclog.close();
yacco2::lrerrors.close();
yacco2::lrclog.open(normal_tracing.c_str());
yacco2::lrerrors.open(error_logging.c_str());
  

@*3 Display to user options selected.
@<display to user options selected@>=
 lrclog << "Parse options selected:" << std::endl;
 lrclog << "  Gen T: " << T_SW;
 lrclog << "  Gen Err: " << ERR_SW;
 lrclog << "  Gen RC: " << PRT_SW;

@*2 Parse the grammar.\fbreak
Due to the syntax directed code not having legitimate grammars to parse it,
a character-at-a-time parsing approach is used.
This is a mix of lexical and syntactic parsing instead of the usual separate
lexical and syntax parse stages.
Why?
I'll use a question as an answer: how do you recognize the
`***' directive ending a c++ syntax directed code portion that is an
unstructured sequence of characters?
Well, crawl at a character's pace per prefix assessment.
Hence the blurring between lexical and syntactic boundaries.
So walk-the-walk-and-talk of a lexical parser using recursive descent
(for its single call of fame containing a bottom-up parse)
tripped off by bottom-up syntax directed code.
What a mouthful!
Should mother use soap and a tooth brush to punish the child?
Who is this mother anyway?

Within the |pass3.lex| grammar are procedure calls
containing the parse phases.
Each phase is called from
within the syntax-directed-code of the recognized keyword:
``fsm'', ``rules'', etc.
This demonstrates a bottom-up / top-down approach to parsing.
Options are what it's all about. What's your choice? 
@<parse the grammar@>=
//|yacco2::YACCO2_TH__ = 1;|
//|yacco2::YACCO2_MSG__ = 1;|
  using namespace NS_pass3;
  tok_can<std::ifstream> cmd_line(o2_file_to_compile.c_str());
  Cpass3 p3_fsm;
  Parser pass3(p3_fsm,&cmd_line,&P3_tokens,0,&Error_queue,&JUNK_tokens,0);
  pass3.parse();
  @<if error queue not empty then deal with posted errors@>;
  @<dump lexical and syntactic's outputted tokens@>;


@*2 Dump lexical and syntactic's outputted tokens.
@<dump lexical and syntactic's outputted tokens@>=
 yacco2::TOKEN_GAGGLE_ITER i = P3_tokens.begin();
  yacco2::TOKEN_GAGGLE_ITER ie = P3_tokens.end();
  lrclog << "Dump of P3 tokens" << endl;
  for(int yyy = 1;i != ie;++i){
      CAbs_lr1_sym* sym = *i;
	if(sym == yacco2::PTR_LR1_eog__) continue;
	lrclog << yyy << ":: " << sym->id__
		<< " file no: " << sym->tok_co_ords__.external_file_id__ 
		<< " line no: " << sym->tok_co_ords__.line_no__
		<< " pos: " << sym->tok_co_ords__.pos_in_line__
		<< endl;
    ++yyy;
}

@*2 Dump aid --- Enumerate grammar's components.\fbreak
As a reference aid to a grammar's components,
each component has an enumerate value of ``x.y.z'' where
x stands for the rule number, y is its subrule number, and z is the component
number.
The grammar's enumerated elements are ``rule-def'', ``subrule-def'', and components
``refered-rule'', ``refered-T'', and ``eosubrule''.
The ``rules-phrase'' is not enumerated as it just ties all the forests together.
An enumerate example is ``1'' standing for the Start rule.
``1.2.2'' refers to the Start rule's 2nd subrule's 2nd component.

The grammar is read whereby all its forests are enumerated relative to one another.   
@<dump aid: enumerate grammar's components@>=
@=set<int> enumerate_filter;@>@/
        enumerate_filter.insert(T_Enum::T_rule_def_);
        enumerate_filter.insert(T_Enum::T_T_subrule_def_);
        enumerate_filter.insert(T_Enum::T_refered_T_);
        enumerate_filter.insert(T_Enum::T_T_eosubrule_);
        enumerate_filter.insert(T_Enum::T_refered_rule_);
        enumerate_filter.insert(T_Enum::T_T_called_thread_eosubrule_);
        enumerate_filter.insert(T_Enum::T_T_null_call_thread_eosubrule_);
using namespace NS_enumerate_grammar;@/
  
tok_can_ast_functor walk_the_plank_mate;
ast_prefix enumerate_grammar_walk
(*rules_tree,&walk_the_plank_mate,&enumerate_filter,ACCEPT_FILTER);@/
tok_can<AST*> enumerate_grammar_can(enumerate_grammar_walk);
Cenumerate_grammar enumerate_grammar_fsm;
Parser enumerate_grammar(enumerate_grammar_fsm,&enumerate_grammar_can,0,0,&Error_queue);
enumerate_grammar.parse();
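To make the enumeration scheme concrete, here is a minimal standalone sketch of the ``x.y.z'' reference described above. It is illustrative only: the |Enumerate| struct and |format_enumerate| function are hypothetical, not \O2's internals.

```cpp
#include <sstream>
#include <string>

// Hypothetical illustration of the "x.y.z" enumerate scheme:
// rule number, subrule number, component number.
struct Enumerate {
    int rule;      // x: rule number (1 == Start rule)
    int subrule;   // y: subrule number within the rule
    int component; // z: component position within the subrule
};

// Render the enumerate as the dotted reference used in the dump aid.
// A bare rule reference prints only "x".
std::string format_enumerate(const Enumerate& e) {
    std::ostringstream os;
    os << e.rule;
    if (e.subrule > 0) os << '.' << e.subrule << '.' << e.component;
    return os.str();
}
```

For example, the Start rule alone formats as ``1'' while its 2nd subrule's 2nd component formats as ``1.2.2''.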

@*2 Epsilon and Pathological assessment of Rules.\fbreak
Epsilon condition:\fbreak
Rule contains an empty symbol string in a subrule.
The only subtlety is when a rule has subrule(s) containing only rules.
If all the rules within such a subrule are epsiloned, then the
subrule is an epsilon, and so its rule is turned on as epsilonable.\fbreak
\fbreak
Pathological Rule assessment:\fbreak
Does a rule derive a terminal string? 
The empty string is included in this assessment.
|epsilon_rules| grammar tells the whole story.\fbreak
\fbreak
Note:\fbreak
The tree is walked using discrete levels: Rules and Subrules.
The subrule's elements are filtered out (not included)
from the discrete rule traversal but are visited
within the rule's syntax directed code logic
by a subrule's element advancement.
Element advancement bypasses the thread component expression.
These are neat facilities provided by \O2 using the
|tok_can| tree traversal containers.
@<epsilon and pathological assessment of Rules@>=
using namespace NS_epsilon_rules;@/
@=set<AST*> yes_pile;@>@/
@=set<AST*> no_pile;@>@/
@=list< pair<AST*,AST*> > maybe_list;@>@/
T_rules_phrase* rules_ph = O2_RULES_PHASE;
AST* rules_tree = rules_ph->phrase_tree();

@=set<int> filter;@>@/
filter.insert(T_Enum::T_T_subrule_def_);  
filter.insert(T_Enum::T_rule_def_);
  
tok_can_ast_functor just_walk_functr;
ast_prefix rule_walk(*rules_tree,&just_walk_functr,&filter,ACCEPT_FILTER);@/
tok_can<AST*> rules_can(rule_walk);
Cepsilon_rules epsilon_fsm;
Parser epsilon_rules(epsilon_fsm,&rules_can,0,0,&Error_queue);
epsilon_rules.parse();
@<Print pathological symptoms but continue@>;
//|@<print tree@>;|
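The epsilon condition described above is a fixed-point computation. The following standalone sketch is illustrative only: plain C++, not the |epsilon_rules| grammar itself, and the |Grammar| map shape is an assumption made for the illustration. It iterates until no new rule can be marked epsilonable.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Names present in the grammar map are rules; anything else is a terminal.
using Subrule = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Subrule>>;

// Fixed-point epsilon assessment: a rule is epsilonable if some subrule is
// empty, or consists entirely of already-epsilonable rules.
std::set<std::string> epsilon_rules(const Grammar& g) {
    std::set<std::string> eps;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto& [rule, subrules] : g) {
            if (eps.count(rule)) continue;      // already known epsilonable
            for (const Subrule& sr : subrules) {
                bool all_eps = true;
                for (const std::string& sym : sr)
                    if (!eps.count(sym)) { all_eps = false; break; }
                if (all_eps) { eps.insert(rule); changed = true; break; }
            }
        }
    }
    return eps;
}
```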

@ Print pathological symptoms but continue.
@<Print pathological symptoms but continue@>=
if(Error_queue.empty()!=true){
	DUMP_ERROR_QUEUE(Error_queue);
    Error_queue.clear();
    return 1;
}

@*2 Get the total number of subrules.\fbreak
I'm lazy and don't want to distribute the count as
the individual rules are being parsed, so
it is done via a tree walk on subrules.
Why do it anyway?
I've hardwired the |elem_space| table size against
a constant |Max_no_subrules|.
Why not allocate the table size dynamically?
Glad u asked: the malloc approach burped.
Maybe there are mixed metaphors between
malloc and how the C++ new / delete allocation is done.
Anyway this works and is reasonable.
@<get total number of subrules for |elem_space| size check@>=
@=set<int> sr_filter;@>@/
sr_filter.insert(T_Enum::T_T_subrule_def_); 
ast_prefix sr_walk(*rules_tree,&just_walk_functr,&sr_filter,ACCEPT_FILTER);@/
tok_can<AST*> sr_can(sr_walk);
for(int xx(0);sr_can[xx] != yacco2::PTR_LR1_eog__;++xx);
O2_T_ENUM_PHASE->total_no_subrules(sr_can.size());
if(O2_T_ENUM_PHASE->total_no_subrules() > Max_no_subrules){
  lrclog << "Grammar's number of subrules: "
        << O2_T_ENUM_PHASE->total_no_subrules()
        << " exceeds the allocated space for table elem_space: "
        << Max_no_subrules << endl;
  lrclog << "This is a big grammar so please correct the grammar." << std::endl;
  clog << "Grammar's number of subrules: "
        << O2_T_ENUM_PHASE->total_no_subrules()
        << " exceeds the allocated space for table elem_space: "
        << Max_no_subrules << endl;
  clog << "This is a big grammar so please correct the grammar." << std::endl;
  return 1;
}

@*2 Calculate each rule's first set.\fbreak
Love the discrete logic of a grammar to code algorithms.
See the |first_set_rules| grammar, as it really is simple in its logic:
i'm getting there from all corners of the coding world.
Not any more, as i'm pruning the
overhead: out go my drafty thoughts and the |first_set_rules| grammar.
Just iterate over the grammar tree for filtered |rule_def| nodes only.
@<calculate rules first sets@>= 
@=set<int> fs_filter;@>@/
fs_filter.insert(T_Enum::T_rule_def_);
ast_prefix fs_rule_walk(*rules_tree,&just_walk_functr,&fs_filter,ACCEPT_FILTER);@/
tok_can<AST*> fs_rules_can(fs_rule_walk);
for(int xx(0);fs_rules_can[xx] != yacco2::PTR_LR1_eog__;++xx){
  @=rule_def* rd = (rule_def*)fs_rules_can[xx];@>@/
  GEN_FS_OF_RULE(rd);
}
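The per-rule first set calculation behind |GEN_FS_OF_RULE| is, in classic terms, a fixed-point over the subrule expressions. A hedged standalone sketch, using a hypothetical |Grammar| map shape rather than \O2's tree structures:

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Names present in the grammar map are rules; anything else is a terminal.
using Subrule = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Subrule>>;

// Classic fixed-point FIRST computation: scan each subrule left to right,
// folding in a rule's FIRST set and stopping at the first symbol that is a
// terminal or a non-epsilonable rule. |eps| is the epsilonable-rule set.
std::map<std::string, std::set<std::string>>
first_sets(const Grammar& g, const std::set<std::string>& eps) {
    std::map<std::string, std::set<std::string>> first;
    for (bool changed = true; changed; ) {
        changed = false;
        for (const auto& [rule, subrules] : g) {
            std::set<std::string>& fs = first[rule];
            const std::size_t before = fs.size();
            for (const Subrule& sr : subrules) {
                for (const std::string& sym : sr) {
                    if (!g.count(sym)) { fs.insert(sym); break; } // terminal
                    const auto& sub = first[sym];     // nonterminal's FIRST
                    fs.insert(sub.begin(), sub.end());
                    if (!eps.count(sym)) break;       // cannot vanish: stop
                }
            }
            if (fs.size() != before) changed = true;
        }
    }
    return first;
}
```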

@*2 Calculate Start rule's called threads first set list.\fbreak
It calculates the ``called threads'' first set 
for the ``to be emitted xxx.fsc''
 file. The neat wrinkle is the epsilonable rule, which requires
the same transient left-to-right moves thru the subrule expressions.
This is fodder to
\olinker that builds each thread's first set
 from the ``list-of-native-first-set-terminal'' 
and ``list-of-transitive-threads'' constructs.
The final outcome of \olinker is an optimized list of threads per terminal.
The calculation goes across the Start rule and its closured rules
to determine the list of called threads. This list can be \emptyrule.
The ``Start rule'' holds the contents for ``list-of-transitive-threads''.
@<calculate Start rule called threads first sets@>=
rule_def* start_rule_def = (rule_def*)fs_rules_can.operator[](0);
GEN_CALLED_THREADS_FS_OF_RULE(start_rule_def);
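The transient left-to-right scan past epsilonable rules can be sketched as follows. This is illustrative only: the bracketed ``[thread]'' naming convention and the |Grammar| map are assumptions for the sketch, not \O2's representation.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Symbols starting with '[' are thread calls in this sketch (hypothetical
// notation); names in the grammar map are rules, the rest are terminals.
using Subrule = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Subrule>>;

// Collect the thread calls reachable at the left edge of the Start rule and
// its closured rules, scanning past epsilonable rules.
std::set<std::string> called_threads(const Grammar& g,
                                     const std::set<std::string>& eps,
                                     const std::string& start) {
    std::set<std::string> threads, visited;
    std::vector<std::string> work{start};
    while (!work.empty()) {
        std::string r = work.back(); work.pop_back();
        if (!visited.insert(r).second) continue;     // already closured
        for (const Subrule& sr : g.at(r)) {
            for (const std::string& sym : sr) {
                if (sym[0] == '[') { threads.insert(sym); break; } // thread
                if (g.count(sym)) {                  // closured rule
                    work.push_back(sym);
                    if (eps.count(sym)) continue;    // epsilonable: move on
                }
                break;                // terminal or non-epsilonable rule
            }
        }
    }
    return threads;                   // can be empty, mirroring \emptyrule
}
```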

@*2 Are all Grammar phases parsed?.\fbreak
As i parse the individual phrases by their keyword presence,
without using a grammar to sequence each phase,
now is the time to see if all the parts are present in the grammar.
This is a simple iteration on the posted |O2_PHRASE_TBL|
to fetch their phrase terminals and to put them
thru a post grammar sequencer.

I changed how the tokens are fetched: from filling the container by
iterating the |O2_xxx| phases to reading the grammar's tree.
Why?
Cuz i implicitly changed to on-the-fly enumeration of their values
while they were being parsed.
If their order was changed, their enumerates would be out-of-alignment.
For example, if the raw character classification came before the ``lrk'' definitions,
this would be catastrophic due to the downstream semantics' dependency on the
correct enumerates. \fbreak
A bird's view of \O2's phases: indent shows node's dependency\fbreak
\ptindent{::1 grammar-phrase grammar-phrase file 2:0: line 24:4: sym*: 0122B598} 
\ptindent{ ::2 fsm-phrase fsm-phrase file 2:766: line 24:4: sym*: 01220BA0} 
\ptindent{ ::3 T-enum-phrase T-enum-phrase file 4:1069: line 32:14: sym*: 01272500} 
\ptindent{ ::4 lr1-k-phrase lr1-k-phrase file 5:1727: line 44:21: sym*: 011F0360} 
\ptindent{ ::5 rc-phrase rc-phrase file 6:303: line 13:15: sym*: 01270C98} 
\ptindent{ ::6 error-symbols-phrase error-symbols-phrase file 7:1026: line 34:14: sym*: 0257F388} 
\ptindent{ ::7 terminals-phrase terminals-phrase file 8:474: line 15:10: sym*: 011F1458} 
\ptindent{ ::8 rules-phrase rules-phrase file 2:1708: line 60:6: sym*: 02FB3AA8} 
Notice i walk the tree by |ast_prefix_wbreadth_only|. This visits the start node
``grammar-phrase'' and only
its immediate children by the ``breadth-only'' qualifier. 
@<are all phases parsed?@>=
@=set<int> phase_order_filter;@>@/
phase_order_filter.insert(T_Enum::T_T_fsm_phrase_);
phase_order_filter.insert(T_Enum::T_T_enum_phrase_);
phase_order_filter.insert(T_Enum::T_T_lr1_k_phrase_);
phase_order_filter.insert(T_Enum::T_T_rc_phrase_);
phase_order_filter.insert(T_Enum::T_T_error_symbols_phrase_);
phase_order_filter.insert(T_Enum::T_T_terminals_phrase_);
phase_order_filter.insert(T_Enum::T_T_rules_phrase_);
  
tok_can_ast_functor orderly_walk;
ast_prefix_wbreadth_only 
evaluate_phase_order(*GRAMMAR_TREE,&orderly_walk,&phase_order_filter,ACCEPT_FILTER);@/
tok_can<AST*> phrases_can(evaluate_phase_order);

using namespace NS_eval_phrases;
Ceval_phrases eval_fsm;
Parser eval_phrases(eval_fsm,&phrases_can,0,0,&Error_queue,0,0);
eval_phrases.parse();
@<if error queue not...@>;

@*2 Thread's end-of-token stream: Lookahead expression post evaluation.\fbreak
If the grammar contains the `parallel-parser' construct, then it is considered 
a thread.
As a refinement, this construct allows one to 
fine-tune the lookahead boundaries of the grammar in its own contextual way.
As this construct is declared before the grammar's vocabulary definitions --- rules and terminals,
the expression must be kept in raw character token
format with some lexemes, like comments, removed.
Only after all the grammar has been recognized can the lookahead expression be
parsed properly: the terms in the expression must
relate to T-in-stbl, rule-in-stbl, and the $+$ or $-$ expression operators.

Squirrelled away in the `parallel-parser' terminal is the raw token stream
of the lookahead expression.
The strategy used is to fetch the appropriate parsed phase token from
the \O2 phase table and then deal with its locally defined pieces of information.
Originally these parse phases were kept in the global symbol table but
now they are contained in their own table. Why? Well,
how do u guard against a grammar writer defining a terminal whose key
could be a synonym
of one of my internal parse phases?
Regardless of how clever one is at naming keys, separation between my
internal tables and the global symbol table gives 100\% assurance of no conflict.

First set Criteria:\fbreak
\ptindent{1) Element is a Terminal, use its calculated enumeration value} 
\ptindent{2) If the element is eolr, then use all calculated enumeration values} 
\ptindent{3) Element is a Rule, use its calculated First set terminals}

Before the Lookahead first set can be calculated, the terminal vocabulary must
be traversed and assigned an enumeration value per terminal. 
The grammar's rules 
must also have their first sets calculated 
before the lookahead expression can be calculated.

The lookahead logic within its grammar(s) is twofold:\fbreak
	\ptindent{a) parse the lookahead expression for kosher syntax}
	\ptindent{b) calculate the lookahead's first set from the expression}
The error checks are for ill-formed expressions, for
an empty first set calculation (for example, `a' - `a', or `b' - `eolr'),
and for epsilon Rules used in the lookahead expression.
This calculated first set is then
used down stream in the finite state automata (FSA) generation of the grammar. 
@<determine if la expression present. Yes parse it @>=
  if(O2_PP_PHASE != 0){
    @<parse la expression and calculate its first set@>;
  }

@*3 Parse the la expression and calculate its first set.
@<parse la expression and calculate its first set@>=
T_parallel_parser_phrase* pp_ph = O2_PP_PHASE;
   
if(pp_ph->la_bndry() == 0){
  CAbs_lr1_sym* sym = new Err_pp_la_boundary_attribute_not_fnd;
  sym->set_rc(*pp_ph);
  Error_queue.push_back(*sym);
  @<if error queue not empty then deal with posted errors@>;
}
T_parallel_la_boundary* la_bndry = pp_ph->la_bndry();
yacco2::TOKEN_GAGGLE* la_srce_tok_can = la_bndry->la_supplier();
yacco2::TOKEN_GAGGLE la_tok_can_lex;
yacco2::TOKEN_GAGGLE la_expr_tok_can;
using namespace NS_la_expr_lexical;
Cla_expr_lexical la_expr_lex_fsm;
  Parser la_expr_lex_parse(la_expr_lex_fsm,la_srce_tok_can
  ,&la_tok_can_lex,0,&Error_queue,&JUNK_tokens,0);
  la_expr_lex_parse.parse();
  @<if error queue not empty then deal with posted errors@>;
using namespace NS_la_expr;
Cla_expr la_expr_fsm;
  Parser la_expr_parse(la_expr_fsm,&la_tok_can_lex,&la_expr_tok_can
  ,0,&Error_queue,&JUNK_tokens,0);
  la_expr_parse.parse();
  @<if error queue not empty then deal with posted errors@>;
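The first set criteria and the $+$ / $-$ operators above amount to left-to-right set union and difference with an emptiness check. A minimal sketch with hypothetical types, not the |la_expr| grammars' actual structures:

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

using TSet = std::set<std::string>;

// One term of the la expression with its leading operator: '+' unions the
// term's first set, '-' subtracts it. A rule term would contribute its
// calculated first set; eolr would contribute all terminals.
struct Term { char op; TSet first; };

// Left-to-right evaluation; an empty result is an error, mirroring the
// "`a' - `a'" and "`b' - `eolr'" checks described above.
TSet eval_la(const std::vector<Term>& terms) {
    TSet la;
    for (const Term& t : terms) {
        if (t.op == '+') la.insert(t.first.begin(), t.first.end());
        else for (const std::string& s : t.first) la.erase(s);
    }
    if (la.empty()) throw std::runtime_error("empty lookahead set");
    return la;
}
```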

@*2 Determine rule use count: Optimization.\fbreak
To improve performance, the rule (Production) symbols are
newed once and recycled when needed.
To ensure that there are enough recycled rules available,
the grammar is traversed and the uses counted.
If recursion is present within the rule, this adds one more use.
The grammar tree is traversed looking only for
``rule-def'', ``subrule-def'', and ``refered-rule'' tokens.
@<determine each rule use count@>=
lrclog << "Evaluate rules count" << endl;
using namespace NS_rules_use_cnt;

@=set<int> rules_use_cnt_filter;@>@/
rules_use_cnt_filter.insert(T_Enum::T_T_subrule_def_);  
rules_use_cnt_filter.insert(T_Enum::T_rule_def_);
rules_use_cnt_filter.insert(T_Enum::T_refered_rule_);
  
tok_can_ast_functor rules_use_walk_functr;
ast_prefix rules_use_walk(*GRAMMAR_TREE,&rules_use_walk_functr
,&rules_use_cnt_filter,ACCEPT_FILTER);
tok_can<AST*> rules_use_can(rules_use_walk);
Crules_use_cnt rules_use_cnt_fsm;
Parser rules_use_cnt(rules_use_cnt_fsm,&rules_use_can,0,0,&Error_queue);
rules_use_cnt.parse();
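The use counting described above can be sketched independently of the tree walk. A hedged illustration using a hypothetical |Grammar| map: each reference to a rule counts as one use, and recursion within a rule adds one more.

```cpp
#include <map>
#include <string>
#include <vector>

// Names present in the grammar map are rules; anything else is a terminal.
using Subrule = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Subrule>>;

// Count how many times each rule is referred to across all subrules; a
// self-reference (recursion) contributes one extra use so that enough
// recycled rule symbols are available at runtime.
std::map<std::string, int> rule_use_counts(const Grammar& g) {
    std::map<std::string, int> uses;
    for (const auto& [rule, subrules] : g) {
        bool recursive = false;
        for (const Subrule& sr : subrules)
            for (const std::string& sym : sr)
                if (g.count(sym)) {          // a refered-rule occurrence
                    ++uses[sym];
                    if (sym == rule) recursive = true;
                }
        if (recursive) ++uses[rule];         // recursion adds one more use
    }
    return uses;
}
```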


@*1 Generate grammar's LR1 states.\fbreak
The global lr states list |LR1_STATES| is added to
dynamically as each closure state/vector gens its states.
|LR1_HEALTH| is the diagnostic of the parsed grammar.

@*2 {\bf {Driver generating lr1 states}}.\fbreak
Goes thru the lr state list looking for closure states to gen.
Note: a closure state gens its transitive states.
Apart from the ``closure only'' state (start state), all other states contain
2 contexts: transitive core items, and possibly added closured items.
As the list is read, it evaluates the possible state for gening by
seeing if there are {\bf {closured}} items needing to be gened.
There are 3 possible outcomes to this evaluation:\fbreak
\ptindent{1) items not gened: goto of item is nil.}
\ptindent{2) items completed due to right boundedness from a previous gen closure state / vector context.}
\ptindent{3) partially gened items due to common prefix of a previous closure state/vector context.}
Points 1 and 3 need gening. Point 1 is the regular generation context.
Point 3 requires walking thru its right side symbols
to where its goto state needs gening (nil).
From there gening proceeds as normal within its own closure state/vector context.
\fbreak

During each state closure part/vectors pass, lr kosherness is tested within
each closure state/vector gening context.
A non lr(1) verdict is returned immediately within
the gening closure state/vector context.
The balance of the closure state/vectors to gen is not completed.
@<generate grammar's LR1 states@>=
AST* start_rule_def_t = AST::get_1st_son(*rules_tree);
state* gening_state = new state(start_rule_def_t);
gen_context gening_context(0,-1);
STATES_ITER_type si = LR1_STATES.begin();
STATES_ITER_type sie = LR1_STATES.end();

// list added to dynamicly as each gening context created
for(;si!=sie;++si){
  gening_state = *si;
  gening_context.for_closure_state_ = gening_state;
  gening_context.gen_vector_ = -1;
lrclog << "lr state driver considered state: " 
<< gening_context.for_closure_state_->state_no_
<< " for vector: "
<<  gening_context.gen_vector_
<< endl;

  LR1_HEALTH = gening_state->gen_transitive_states_for_closure_context(gening_context,*gening_state,*gening_state);
  if(LR1_HEALTH==NOT_LR1_COMPATIBLE){
   @<is the grammar unhealthy? yes report the details and exit@>;
  }
}
//|@<print dump state@>;|
@<commonize la sets@>;
/* please put back at sign if u want to trace*/
//|@<print dump state@>;|
//|@<print dump common states@>;|
@*3 Is the grammar unhealthy? yes report the details and exit.\fbreak
@<is the grammar unhealthy? yes report the details and exit@>=
  if(LR1_HEALTH == NOT_LR1_COMPATIBLE){
    yacco2::lrclog << "===>Please check Grammar dump file: " 
      << normal_tracing.c_str() << " for Not LR1 details" << endl;
    std::cout << "===>Please check Grammar dump file: " 
      << normal_tracing.c_str() << " for Not LR1 details" << endl;
    yacco2::lrclog << "Not LR1 --- check state conflict list of state: " 
    << gening_state->state_no_ << " for details" << endl;
	@<print dump state@>;
       @<print dump common states@>;
   return 1;
  }

@*3 Commonize LA Sets --- Combine common sets as a space saver.\fbreak
Go thru the lr states looking for reduced subrules.
Their lookahead sets have already been calculated, so common la sets
are determined by set equality: the registry is read thru for
a set's soul mate. This common reference to the same sets minimizes space in the emitted lr state tables.
The index number per set in the |COMMON_LA_SETS| registry will be used
as part of each generated la set's name.
This is why the found index number is deposited per reduced subrule.
When the state tables get emitted, this index number + 1 is used in the gened lookahead's
name, as i prefer the name to be relative to 1.
 @<commonize la sets@>=
  COMMONIZE_LA_SETS();
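The commonizing step reduces to a registry lookup by set equality. A standalone sketch with a hypothetical |LaSet| of terminal enumerates, not the actual |COMMON_LA_SETS| registry:

```cpp
#include <set>
#include <vector>

using LaSet = std::set<int>;   // a reduced subrule's lookahead terminals

// Registry of distinct la sets. Returns the index of the soul mate if an
// equal set is already registered, else registers the new set. Index + 1
// becomes part of the emitted la set's name, keeping it relative to 1.
std::size_t commonize(std::vector<LaSet>& registry, const LaSet& la) {
    for (std::size_t i = 0; i < registry.size(); ++i)
        if (registry[i] == la) return i;   // set equality finds the common set
    registry.push_back(la);
    return registry.size() - 1;
}
```

Each reduced subrule then records the returned index instead of a private copy of the set.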


@** Overview of \O2's state generated components.\fbreak
\O2 generates the components making up the  automaton and the first set language for \O2{}Linker
to compile.
These files are the header definition of the grammar, the ``first set'' file for \O2{}Linker,
and the implementations of the automaton (fsm), its symbols, and the fsm's states.

Depending on the switches inputted, \O2 can generate the 
Terminal vocubulary defined for the grammar environment: 
the individual terminal classifications of 
errors, lr constants, raw characters, and Terminals.
As a global reference to all defined terminals, 
an enumeration scheme is emitted.

  
