@q file: prog.w@>
@q%   Copyright Dave Bone 1998 - 2015@>
@q% /*@>
@q%    This Source Code Form is subject to the terms of the Mozilla Public@>
@q%    License, v. 2.0. If a copy of the MPL was not distributed with this@>
@q%    file, You can obtain one at http://mozilla.org/MPL/2.0/.@>
@q% */@>
@** External routines and globals.\fbreak
General routines to get things going:\fbreak
\ptindent{1) get control file and put into \O2's holding file}
\ptindent{2) parse the command line}
\ptindent{3) format errors}
\ptindent{4) \O2's parse phrases --- pieces of syntactic structures}
These are defined by including |o2_externs.h|. 
Item 4 is driven out of the |pass3.lex| grammar. 
It demonstrates a procedural approach similar to the recursive descent
parsing technique.

The globals are:\fbreak
\ptindent{a) |Error_queue| --- global container of errors passed across all parsings}
\ptindent{b) Switches from command line parse}
\ptindent{c) Token containers for the parsing phases}

@<External rtns and variables@>=
extern int RECURSION_INDEX__;
extern void COMMONIZE_LA_SETS();
extern int NO_LR1_STATES;
extern STATES_SET_type VISITED_MERGE_STATES_IN_LA_CALC;
extern LR1_STATES_type LR1_COMMON_STATES;
extern CYCLIC_USE_TBL_type CYCLIC_USE_TABLE;
extern void Print_dump_state(state* State);
@** Main line of \O2.
@<accrue \o2 code@>+=

YACCO2_define_trace_variables();
//|Recursion_count();|
int RECURSION_INDEX__(0);
yacco2::CHAR T_SW('n');
yacco2::CHAR ERR_SW('n');
yacco2::CHAR PRT_SW('n');

yacco2::TOKEN_GAGGLE JUNK_tokens;
yacco2::TOKEN_GAGGLE P3_tokens;
yacco2::TOKEN_GAGGLE Error_queue;
char Big_buf[BIG_BUFFER_32K];
T_sym_tbl_report_card report_card;
std::string o2_file_to_compile;
std::string o2_fq_fn_noext;
STBL_T_ITEMS_type STBL_T_ITEMS;
STATES_type LR1_STATES;
LR1_STATES_type LR1_COMMON_STATES;
bool LR1_HEALTH(LR1_COMPATIBLE);
int NO_LR1_STATES(0);
STATES_SET_type VISITED_MERGE_STATES_IN_LA_CALC;
CYCLIC_USE_TBL_type CYCLIC_USE_TABLE;


int main(int argc, char* argv[])
{
  cout << yacco2::Lr1_VERSION << std::endl;
 
@<setup \O2 for parsing@>;
@<fetch command line info...@>;
  lrclog << yacco2::Lr1_VERSION << std::endl;

@<are all phases parsed?@>;
@<epsilon and pathological assessment of Rules@>;
@<dump aid: enumerate grammar's components@>;
@<determine if la expression...@>;
@<get total number of subrules for |elem_space| size check@>;
@<calculate rules first sets@>;
@<calculate Start rule called threads first sets@>;
@<generate grammar's LR1 states@>;
@<is the grammar unhealthy? yes report the details and exit@>;
@<determine each rule use count@>;
@<emit FSA, FSC, and Documents of grammar@>;
//|@<print tree@>;|
//|@<shutdown@>;|
exit:
lrclog << "Exiting O2" << std::endl;
  return 0;
}
@** Some Programming sections.
@*2 Shutdown.\fbreak
Prints out the thread table with each thread's runtime activity, and calls
each one to quietly remove itself as a thread.
Within Unix this is not needed, as the wind-down duties of the
process remove launched threads:
that is why it is commented out.
Uncommenting it provides the
run statistics for the compiler writer to view reality in terms of performance stats.
@<shutdown@>=
lrclog << "Before thread shutdown" << std::endl;
yacco2::Parallel_threads_shutdown(pass3);
lrclog << "After thread shutdown" << std::endl;

@*2 Setup \O2 for parsing.\fbreak
@<setup \O2 for parsing@>=
  @<load \o2's keywords into symbol table@>;
  
@*2 Load \O2's keywords into symbol table.\fbreak
Basic housekeeping. Originally a separate grammar recognized keywords
by competing with the Identifier thread:
the keyword thread only ran if its first set matched the starting character
common to identifiers and keywords.
Now keyword recognition is blended into Identifier using a symbol table lookup that
returns not only the identifier terminal but also any keyword entries put into
the symbol table.

For now, only the keywords are cloned off as unique entities whilst all other
entries are
passed back from the symbol table with their source co-ordinates overridden.
@<load \o2's keywords into symbol table@>=
   LOAD_YACCO2_KEYWORDS_INTO_STBL();

@*2 Fetch command line info and parse the 3 languages.\fbreak
The 3 separate languages to parse are:\fbreak
\ptindent{1) fetching of the command line to place into a holding file}
\ptindent{2) the command line in the holding file --- grammar file name and options}
\ptindent{3) the grammar file's contents}
Items 1 and 2 are handled by external routines:
fetching of the command line is crude but all-purpose, whilst
the command line language is specific to \O2.
@<fetch command line info and parse the 3 languages@>=
  @<get command line...@>;  
  @<parse command line data placed in holding file@>;
  @<parse the grammar@>;

@*2 Get command line, parse it, and place contents into a holding file.
It uses a generic external routine to do this. The parse is very rudimentary.
The command data is placed into a holding file provided by
|Yacco2_holding_file| defined in the external library |o2_externs.h|.
See |cweb| documents mentioned in the introduction regarding other support libraries.
If the result is okay, set up \O2's library files for tracing.
 @<get command line, parse it, and place contents into a holding file@>=
  GET_CMD_LINE(argc,argv,Yacco2_holding_file,Error_queue);
  @<if error queue not empty then deal with posted errors@>;


@*2 Do we have errors?.
Check the error queue for posted errors.
Note, |DUMP_ERROR_QUEUE| will also flush out
any launched threads: good housekeeping, or is it the
housetrained seal
award? Trying to do my best in the realm of short-lived wind-downs.
@<if error queue not empty then deal with posted errors@>=
if(Error_queue.empty()!=true){
	DUMP_ERROR_QUEUE(Error_queue);
    return 1;
}

@*2 Parse command line data placed in holding file.
@<parse command line data placed in holding file@>=
  YACCO2_PARSE_CMD_LINE
	(T_SW,ERR_SW,PRT_SW,o2_file_to_compile,Error_queue);
  @<if error queue not empty then deal with posted errors@>;
  @<display to user options selected@>;
  @<extract fq name without extension@>;
  @<set up logging files@>;

@*3 Extract fully qualified file name to compile without its extension.\fbreak
Used to access the generated first set control file
for |cweb| documentation and \O2's tracings.
Simple check: if the grammar file name does not contain
a ``.extension'', the complete file name is used.
@<extract fq name without extension@>=
std::string::size_type pp = o2_file_to_compile.rfind('.');
if(pp == std::string::npos){
 o2_fq_fn_noext += o2_file_to_compile;
}else{
 o2_fq_fn_noext += o2_file_to_compile.substr(0,pp);
}

@*2 Set up \O2's logging files local to the parsed grammar.\fbreak
There are 2 stages.
Stage 1 logs to ``1lrerrors.log'' and ``1lrtracings''
as the command line is being parsed --- |o2_lcl_opts| and |o2_lcl_opt| grammars.
It has no knowledge of the grammar file to parse.
Stage 2 follows the command line parsing:
the inputted grammar file name
can now be used to build the grammar's local \O2 tracing files.
These log files are ``xxx\_tracings.log'' and ``xxx\_errors.log''
where the ``xxx'' is the grammar's base file name.
@<set up logging files@>=
std::string normal_tracing(o2_fq_fn_noext.c_str());
normal_tracing += "_tracings.log";
std::string error_logging(o2_fq_fn_noext.c_str());
error_logging += "_errors.log";
yacco2::lrclog.close();
yacco2::lrerrors.close();
yacco2::lrclog.open(normal_tracing.c_str());
yacco2::lrerrors.open(error_logging.c_str());
  

@*3 Display to user options selected.
@<display to user options selected@>=
 lrclog << "Parse options selected:" << std::endl;
 lrclog << "  Gen T: " << T_SW;
 lrclog << "  Gen Err: " << ERR_SW;
 lrclog << "  Gen RC: " << PRT_SW;

@*2 Parse the grammar.\fbreak
Due to the syntax directed code not having legitimate grammars to parse it,
a character-at-a-time parsing approach is used.
This is a mix of lexical and syntactic parsing instead of the usual separate
lexical and syntax parse stages.
Why?
I'll use a question as an answer: how do you recognize the
`***' directive ending a c++ syntax directed code portion that is an
unstructured sequence of characters?
Well, crawl at a character's pace per prefix assessment.
Hence the blurring between lexical and syntactic boundaries.
So walk-the-walk-and-talk of a lexical parser using recursive descent
(for its single call of fame containing a bottom-up parse)
tripped off by bottom-up syntax directed code.
What a mouthful!
Should mother use soap and a tooth brush to punish the child?
Who is this mother anyway?

Within the |pass3.lex| grammar are procedure calls
containing the parse phases.
Each phase is called from
within the syntax-directed-code of the recognized keyword:
``fsm'', ``rules'', etc.
This demonstrates a bottom-up / top-down approach to parsing.
Options are what it's all about. What's your choice? 
@<parse the grammar@>=
//|yacco2::YACCO2_TH__ = 1;|
//|yacco2::YACCO2_MSG__ = 1;|
  using namespace NS_pass3;
  tok_can<std::ifstream> cmd_line(o2_file_to_compile.c_str());
  Cpass3 p3_fsm;
  Parser pass3(p3_fsm,&cmd_line,&P3_tokens,0,&Error_queue,&JUNK_tokens,0);
  pass3.parse();
  @<if error queue not empty then deal with posted errors@>;
  @<dump lexical and syntactic's outputted tokens@>;


@*2 Dump lexical and syntactic's outputted tokens.
@<dump lexical and syntactic's outputted tokens@>=
 yacco2::TOKEN_GAGGLE_ITER i = P3_tokens.begin();
  yacco2::TOKEN_GAGGLE_ITER ie = P3_tokens.end();
  lrclog << "Dump of P3 tokens" << endl;
  for(int yyy = 1;i != ie;++i){
      CAbs_lr1_sym* sym = *i;
	if(sym == yacco2::PTR_LR1_eog__) continue;
	lrclog << yyy << ":: " << sym->id__
		<< " file no: " << sym->tok_co_ords__.external_file_id__ 
		<< " line no: " << sym->tok_co_ords__.line_no__
		<< " pos: " << sym->tok_co_ords__.pos_in_line__
		<< endl;
    ++yyy;
}

@*2 Dump aid --- Enumerate grammar's components.\fbreak
As a reference aid to a grammar's components,
each component has an enumerate value of ``x.y.z'' where
x stands for the rule number, y is its subrule number, and z is the component
number.
The grammar's enumerated elements are ``rule-def'', ``subrule-def'', and components
``refered-rule'', ``refered-T'', and ``eosubrule''.
The ``rules-phrase'' is not enumerated as it just ties all the forests together.
An enumerate example is ``1'' standing for the Start rule.
``1.2.2'' refers to the Start rule's 2nd subrule's 2nd component.

The grammar is read whereby all its forests are enumerated relative to one another.   
@<dump aid: enumerate grammar's components@>=
@=set<int> enumerate_filter;@>@/
        enumerate_filter.insert(T_Enum::T_rule_def_);
        enumerate_filter.insert(T_Enum::T_T_subrule_def_);
        enumerate_filter.insert(T_Enum::T_refered_T_);
        enumerate_filter.insert(T_Enum::T_T_eosubrule_);
        enumerate_filter.insert(T_Enum::T_refered_rule_);
        enumerate_filter.insert(T_Enum::T_T_called_thread_eosubrule_);
        enumerate_filter.insert(T_Enum::T_T_null_call_thread_eosubrule_);
using namespace NS_enumerate_grammar;@/
  
tok_can_ast_functor walk_the_plank_mate;
ast_prefix enumerate_grammar_walk
(*rules_tree,&walk_the_plank_mate,&enumerate_filter,ACCEPT_FILTER);@/
tok_can<AST*> enumerate_grammar_can(enumerate_grammar_walk);
Cenumerate_grammar enumerate_grammar_fsm;
Parser enumerate_grammar(enumerate_grammar_fsm,&enumerate_grammar_can,0,0,&Error_queue);
enumerate_grammar.parse();
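To make the enumeration scheme concrete, here is a minimal standalone sketch of the ``x.y.z'' reference described above. It is illustrative only: the |Enumerate| struct and |format_enumerate| function are hypothetical, not \O2's internals.

```cpp
#include <sstream>
#include <string>

// Hypothetical illustration of the "x.y.z" enumerate scheme:
// rule number, subrule number, component number.
struct Enumerate {
    int rule;      // x: rule number (1 == Start rule)
    int subrule;   // y: subrule number within the rule
    int component; // z: component position within the subrule
};

// Render the enumerate as the dotted reference used in the dump aid.
// A bare rule reference prints only "x".
std::string format_enumerate(const Enumerate& e) {
    std::ostringstream os;
    os << e.rule;
    if (e.subrule > 0) os << '.' << e.subrule << '.' << e.component;
    return os.str();
}
```

For example, the Start rule alone formats as ``1'' while its 2nd subrule's 2nd component formats as ``1.2.2''.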

@*2 Epsilon and Pathological assessment of Rules.\fbreak
Epsilon condition:\fbreak
Rule contains an empty symbol string in a subrule.
The only subtlety is when a rule has subrule(s) containing only rules.
If all the rules within such a subrule are epsiloned, then the
subrule is an epsilon, and so its rule is turned on as epsilonable.\fbreak
\fbreak
Pathological Rule assessment:\fbreak
Does a rule derive a terminal string? 
The empty string is included in this assessment.
|epsilon_rules| grammar tells the whole story.\fbreak
\fbreak
Note:\fbreak
The tree is walked using discrete levels: Rules and Subrules.
The subrule's elements are filtered out (not included)
from the discrete rule traversal but are visited
within the rule's syntax directed code logic
by a subrule's element advancement.
Element advancement bypasses the thread component expression.
These are neat facilities provided by \O2 using the
|tok_can| tree traversal containers.
@<epsilon and pathological assessment of Rules@>=
using namespace NS_epsilon_rules;@/
@=set<AST*> yes_pile;@>@/
@=set<AST*> no_pile;@>@/
@=list< pair<AST*,AST*> > maybe_list;@>@/
T_rules_phrase* rules_ph = O2_RULES_PHASE;
AST* rules_tree = rules_ph->phrase_tree();

@=set<int> filter;@>@/
filter.insert(T_Enum::T_T_subrule_def_);  
filter.insert(T_Enum::T_rule_def_);
  
tok_can_ast_functor just_walk_functr;
ast_prefix rule_walk(*rules_tree,&just_walk_functr,&filter,ACCEPT_FILTER);@/
tok_can<AST*> rules_can(rule_walk);
Cepsilon_rules epsilon_fsm;
Parser epsilon_rules(epsilon_fsm,&rules_can,0,0,&Error_queue);
epsilon_rules.parse();
@<Print pathological symptoms but continue@>;
//|@<print tree@>;|
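The epsilon condition described above is a fixed-point computation. The following standalone sketch is illustrative only: plain C++, not the |epsilon_rules| grammar itself, and the |Grammar| map shape is an assumption made for the illustration. It iterates until no new rule can be marked epsilonable.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Names present in the grammar map are rules; anything else is a terminal.
using Subrule = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Subrule>>;

// Fixed-point epsilon assessment: a rule is epsilonable if some subrule is
// empty, or consists entirely of already-epsilonable rules.
std::set<std::string> epsilon_rules(const Grammar& g) {
    std::set<std::string> eps;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto& [rule, subrules] : g) {
            if (eps.count(rule)) continue;      // already known epsilonable
            for (const Subrule& sr : subrules) {
                bool all_eps = true;
                for (const std::string& sym : sr)
                    if (!eps.count(sym)) { all_eps = false; break; }
                if (all_eps) { eps.insert(rule); changed = true; break; }
            }
        }
    }
    return eps;
}
```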

@ Print pathological symptoms but continue.
@<Print pathological symptoms but continue@>=
if(Error_queue.empty()!=true){
	DUMP_ERROR_QUEUE(Error_queue);
    Error_queue.clear();
    return 1;
}

@*2 Get the total number of subrules.\fbreak
I'm lazy and don't want to distribute the count as
the individual rules are being parsed, so
it is done via a tree walk on subrules.
Why do it anyway?
I've hardwired the |elem_space| table size against
a constant |Max_no_subrules|.
Why not allocate the table size dynamically?
Glad u asked: the malloc approach burped.
Maybe there are mixed metaphors between
malloc and how the C++ new / delete allocation is done.
Anyway this works and is reasonable.
@<get total number of subrules for |elem_space| size check@>=
@=set<int> sr_filter;@>@/
sr_filter.insert(T_Enum::T_T_subrule_def_); 
ast_prefix sr_walk(*rules_tree,&just_walk_functr,&sr_filter,ACCEPT_FILTER);@/
tok_can<AST*> sr_can(sr_walk);
for(int xx(0);sr_can[xx] != yacco2::PTR_LR1_eog__;++xx);
O2_T_ENUM_PHASE->total_no_subrules(sr_can.size());
if(O2_T_ENUM_PHASE->total_no_subrules() > Max_no_subrules){
  lrclog << "Grammar's number of subrules: "
        << O2_T_ENUM_PHASE->total_no_subrules()
        << " exceeds the allocated space for table elem_space: "
        << Max_no_subrules << endl;
  lrclog << "This is a big grammar so please correct the grammar." << std::endl;
  clog << "Grammar's number of subrules: "
        << O2_T_ENUM_PHASE->total_no_subrules()
        << " exceeds the allocated space for table elem_space: "
        << Max_no_subrules << endl;
  clog << "This is a big grammar so please correct the grammar." << std::endl;
  return 1;
}

@*2 Calculate each rule's first set.\fbreak
Love the discrete logic of a grammar to code algorithms.
See the |first_set_rules| grammar, as it really is simple in its logic:
i'm getting there from all corners of the coding world.
Not any more, as i'm pruning the
overhead: out go my drafty thoughts and the |first_set_rules| grammar.
Just iterate over the grammar tree for filtered |rule_def| nodes only.
@<calculate rules first sets@>= 
@=set<int> fs_filter;@>@/
fs_filter.insert(T_Enum::T_rule_def_);
ast_prefix fs_rule_walk(*rules_tree,&just_walk_functr,&fs_filter,ACCEPT_FILTER);@/
tok_can<AST*> fs_rules_can(fs_rule_walk);
for(int xx(0);fs_rules_can[xx] != yacco2::PTR_LR1_eog__;++xx){
  @=rule_def* rd = (rule_def*)fs_rules_can[xx];@>@/
  GEN_FS_OF_RULE(rd);
}
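The per-rule first set calculation behind |GEN_FS_OF_RULE| is, in classic terms, a fixed-point over the subrule expressions. A hedged standalone sketch, using a hypothetical |Grammar| map shape rather than \O2's tree structures:

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Names present in the grammar map are rules; anything else is a terminal.
using Subrule = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Subrule>>;

// Classic fixed-point FIRST computation: scan each subrule left to right,
// folding in a rule's FIRST set and stopping at the first symbol that is a
// terminal or a non-epsilonable rule. |eps| is the epsilonable-rule set.
std::map<std::string, std::set<std::string>>
first_sets(const Grammar& g, const std::set<std::string>& eps) {
    std::map<std::string, std::set<std::string>> first;
    for (bool changed = true; changed; ) {
        changed = false;
        for (const auto& [rule, subrules] : g) {
            std::set<std::string>& fs = first[rule];
            const std::size_t before = fs.size();
            for (const Subrule& sr : subrules) {
                for (const std::string& sym : sr) {
                    if (!g.count(sym)) { fs.insert(sym); break; } // terminal
                    const auto& sub = first[sym];     // nonterminal's FIRST
                    fs.insert(sub.begin(), sub.end());
                    if (!eps.count(sym)) break;       // cannot vanish: stop
                }
            }
            if (fs.size() != before) changed = true;
        }
    }
    return first;
}
```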

@*2 Calculate Start rule's called threads first set list.\fbreak
It calculates the ``called threads'' first set 
for the ``to be emitted xxx.fsc''
 file. The neat wrinkle is the epsilonable rule, which requires
the same transient left-to-right moves thru the subrule expressions.
This is fodder to
\olinker that builds each thread's first set
 from the ``list-of-native-first-set-terminal'' 
and ``list-of-transitive-threads'' constructs.
The final outcome of \olinker is an optimized list of threads per terminal.
The calculation goes across the Start rule and its closured rules
to determine the list of called threads. This list can be \emptyrule.
The ``Start rule'' holds the contents for ``list-of-transitive-threads''.
@<calculate Start rule called threads first sets@>=
rule_def* start_rule_def = (rule_def*)fs_rules_can.operator[](0);
GEN_CALLED_THREADS_FS_OF_RULE(start_rule_def);
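The transient left-to-right scan past epsilonable rules can be sketched as follows. This is illustrative only: the bracketed ``[thread]'' naming convention and the |Grammar| map are assumptions for the sketch, not \O2's representation.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Symbols starting with '[' are thread calls in this sketch (hypothetical
// notation); names in the grammar map are rules, the rest are terminals.
using Subrule = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Subrule>>;

// Collect the thread calls reachable at the left edge of the Start rule and
// its closured rules, scanning past epsilonable rules.
std::set<std::string> called_threads(const Grammar& g,
                                     const std::set<std::string>& eps,
                                     const std::string& start) {
    std::set<std::string> threads, visited;
    std::vector<std::string> work{start};
    while (!work.empty()) {
        std::string r = work.back(); work.pop_back();
        if (!visited.insert(r).second) continue;     // already closured
        for (const Subrule& sr : g.at(r)) {
            for (const std::string& sym : sr) {
                if (sym[0] == '[') { threads.insert(sym); break; } // thread
                if (g.count(sym)) {                  // closured rule
                    work.push_back(sym);
                    if (eps.count(sym)) continue;    // epsilonable: move on
                }
                break;                // terminal or non-epsilonable rule
            }
        }
    }
    return threads;                   // can be empty, mirroring \emptyrule
}
```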

@*2 Are all Grammar phases parsed?.\fbreak
As i parse the individual phrases by their keyword presence,
without using a grammar to sequence each phase,
now is the time to see if all the parts are present in the grammar.
This is a simple iteration on the posted |O2_PHRASE_TBL|
to fetch their phrase terminals and to put them
thru a post grammar sequencer.

I changed how the tokens are fetched: from filling the container by
iterating the |O2_xxx| phases to reading the grammar's tree.
Why?
Cuz i implicitly changed to on-the-fly enumeration of their values
while they were being parsed.
If their order was changed, their enumerates would be out-of-alignment.
For example, if the raw character classification came before the ``lrk'' definitions,
this would be catastrophic due to the downstream semantics' dependency on the
correct enumerates. \fbreak
A bird's view of \O2's phases: indent shows node's dependency\fbreak
\ptindent{::1 grammar-phrase grammar-phrase file 2:0: line 24:4: sym*: 0122B598} 
\ptindent{ ::2 fsm-phrase fsm-phrase file 2:766: line 24:4: sym*: 01220BA0} 
\ptindent{ ::3 T-enum-phrase T-enum-phrase file 4:1069: line 32:14: sym*: 01272500} 
\ptindent{ ::4 lr1-k-phrase lr1-k-phrase file 5:1727: line 44:21: sym*: 011F0360} 
\ptindent{ ::5 rc-phrase rc-phrase file 6:303: line 13:15: sym*: 01270C98} 
\ptindent{ ::6 error-symbols-phrase error-symbols-phrase file 7:1026: line 34:14: sym*: 0257F388} 
\ptindent{ ::7 terminals-phrase terminals-phrase file 8:474: line 15:10: sym*: 011F1458} 
\ptindent{ ::8 rules-phrase rules-phrase file 2:1708: line 60:6: sym*: 02FB3AA8} 
Notice i walk the tree by |ast_prefix_wbreadth_only|. This visits the start node
``grammar-phrase'' and only
its immediate children by the ``breadth-only'' qualifier. 
@<are all phases parsed?@>=
@=set<int> phase_order_filter;@>@/
phase_order_filter.insert(T_Enum::T_T_fsm_phrase_);
phase_order_filter.insert(T_Enum::T_T_enum_phrase_);
phase_order_filter.insert(T_Enum::T_T_lr1_k_phrase_);
phase_order_filter.insert(T_Enum::T_T_rc_phrase_);
phase_order_filter.insert(T_Enum::T_T_error_symbols_phrase_);
phase_order_filter.insert(T_Enum::T_T_terminals_phrase_);
phase_order_filter.insert(T_Enum::T_T_rules_phrase_);
  
tok_can_ast_functor orderly_walk;
ast_prefix_wbreadth_only 
evaluate_phase_order(*GRAMMAR_TREE,&orderly_walk,&phase_order_filter,ACCEPT_FILTER);@/
tok_can<AST*> phrases_can(evaluate_phase_order);

using namespace NS_eval_phrases;
Ceval_phrases eval_fsm;
Parser eval_phrases(eval_fsm,&phrases_can,0,0,&Error_queue,0,0);
eval_phrases.parse();
@<if error queue not...@>;

@*2 Thread's end-of-token stream: Lookahead expression post evaluation.\fbreak
If the grammar contains the `parallel-parser' construct, then it is considered 
a thread.
As a refinement, this construct allows one to 
fine-tune the lookahead boundaries of the grammar in its own contextual way.
As this construct is declared before the grammar's vocabulary definitions --- rules and terminals,
the expression must be kept in raw character token
format with some lexemes, like comments, removed.
Only after all the grammar has been recognized can the lookahead expression be
parsed properly: the terms in the expression must
relate to T-in-stbl, rule-in-stbl, and the $+$ or $-$ expression operators.

Squirrelled away in the `parallel-parser' terminal is the raw token stream
of the lookahead expression.
The strategy used is to fetch the appropriate parsed phase token from
the \O2 phase table and then deal with its locally defined pieces of information.
Originally these parse phases were kept in the global symbol table but
now they are contained in their own table. Why? Well,
how do u guard against a grammar writer defining a terminal whose key
could be a synonym
of one of my internal parse phases?
Regardless of how clever one is at naming keys, separation between my
internal tables and the global symbol table gives 100\% assurance of no conflict.

First set Criteria:\fbreak
\ptindent{1) Element is a Terminal, use its calculated enumeration value} 
\ptindent{2) If the element is eolr, then use all calculated enumeration values} 
\ptindent{3) Element is a Rule, use its calculated First set terminals}

Before the Lookahead first set can be calculated, the terminal vocabulary must
be traversed and assigned an enumeration value per terminal. 
The grammar's rules 
must also have their first sets calculated 
before the lookahead expression can be calculated.

The lookahead logic within its grammar(s) is twofold:\fbreak
	\ptindent{a) parse the lookahead expression for kosher syntax}
	\ptindent{b) calculate the lookahead's first set from the expression}
The error checks are for ill-formed expressions, for
an empty first set calculation (for example, `a' - `a', or `b' - `eolr'),
and for epsilon Rules used in the lookahead expression.
This calculated first set is then
used down stream in the finite state automata (FSA) generation of the grammar. 
@<determine if la expression present. Yes parse it @>=
  if(O2_PP_PHASE != 0){
    @<parse la expression and calculate its first set@>;
  }

@*3 Parse the la expression and calculate its first set.
@<parse la expression and calculate its first set@>=
T_parallel_parser_phrase* pp_ph = O2_PP_PHASE;
   
if(pp_ph->la_bndry() == 0){
  CAbs_lr1_sym* sym = new Err_pp_la_boundary_attribute_not_fnd;
  sym->set_rc(*pp_ph);
  Error_queue.push_back(*sym);
  @<if error queue not empty then deal with posted errors@>;
}
T_parallel_la_boundary* la_bndry = pp_ph->la_bndry();
yacco2::TOKEN_GAGGLE* la_srce_tok_can = la_bndry->la_supplier();
yacco2::TOKEN_GAGGLE la_tok_can_lex;
yacco2::TOKEN_GAGGLE la_expr_tok_can;
using namespace NS_la_expr_lexical;
Cla_expr_lexical la_expr_lex_fsm;
  Parser la_expr_lex_parse(la_expr_lex_fsm,la_srce_tok_can
  ,&la_tok_can_lex,0,&Error_queue,&JUNK_tokens,0);
  la_expr_lex_parse.parse();
  @<if error queue not empty then deal with posted errors@>;
using namespace NS_la_expr;
Cla_expr la_expr_fsm;
  Parser la_expr_parse(la_expr_fsm,&la_tok_can_lex,&la_expr_tok_can
  ,0,&Error_queue,&JUNK_tokens,0);
  la_expr_parse.parse();
  @<if error queue not empty then deal with posted errors@>;
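The first set criteria and the $+$ / $-$ operators above amount to left-to-right set union and difference with an emptiness check. A minimal sketch with hypothetical types, not the |la_expr| grammars' actual structures:

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

using TSet = std::set<std::string>;

// One term of the la expression with its leading operator: '+' unions the
// term's first set, '-' subtracts it. A rule term would contribute its
// calculated first set; eolr would contribute all terminals.
struct Term { char op; TSet first; };

// Left-to-right evaluation; an empty result is an error, mirroring the
// "`a' - `a'" and "`b' - `eolr'" checks described above.
TSet eval_la(const std::vector<Term>& terms) {
    TSet la;
    for (const Term& t : terms) {
        if (t.op == '+') la.insert(t.first.begin(), t.first.end());
        else for (const std::string& s : t.first) la.erase(s);
    }
    if (la.empty()) throw std::runtime_error("empty lookahead set");
    return la;
}
```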

@*2 Determine rule use count: Optimization.\fbreak
To improve performance, the rule (Production) symbols are
newed once and recycled when needed.
To ensure that there are enough recycled rules available,
the grammar is traversed and the uses counted.
If recursion is present within the rule, this adds one more use.
The grammar tree is traversed looking only for
``rule-def'', ``subrule-def'', and ``refered-rule'' tokens.
@<determine each rule use count@>=
lrclog << "Evaluate rules count" << endl;
using namespace NS_rules_use_cnt;

@=set<int> rules_use_cnt_filter;@>@/
rules_use_cnt_filter.insert(T_Enum::T_T_subrule_def_);  
rules_use_cnt_filter.insert(T_Enum::T_rule_def_);
rules_use_cnt_filter.insert(T_Enum::T_refered_rule_);
  
tok_can_ast_functor rules_use_walk_functr;
ast_prefix rules_use_walk(*GRAMMAR_TREE,&rules_use_walk_functr
,&rules_use_cnt_filter,ACCEPT_FILTER);
tok_can<AST*> rules_use_can(rules_use_walk);
Crules_use_cnt rules_use_cnt_fsm;
Parser rules_use_cnt(rules_use_cnt_fsm,&rules_use_can,0,0,&Error_queue);
rules_use_cnt.parse();
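The use counting described above can be sketched independently of the tree walk. A hedged illustration using a hypothetical |Grammar| map: each reference to a rule counts as one use, and recursion within a rule adds one more.

```cpp
#include <map>
#include <string>
#include <vector>

// Names present in the grammar map are rules; anything else is a terminal.
using Subrule = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Subrule>>;

// Count how many times each rule is referred to across all subrules; a
// self-reference (recursion) contributes one extra use so that enough
// recycled rule symbols are available at runtime.
std::map<std::string, int> rule_use_counts(const Grammar& g) {
    std::map<std::string, int> uses;
    for (const auto& [rule, subrules] : g) {
        bool recursive = false;
        for (const Subrule& sr : subrules)
            for (const std::string& sym : sr)
                if (g.count(sym)) {          // a refered-rule occurrence
                    ++uses[sym];
                    if (sym == rule) recursive = true;
                }
        if (recursive) ++uses[rule];         // recursion adds one more use
    }
    return uses;
}
```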


@*1 Generate grammar's LR1 states.\fbreak
The global lr states list |LR1_STATES| is added to
dynamically as each closure state/vector gens its states.
|LR1_HEALTH| is the diagnostic of the parsed grammar.

@*2 {\bf {Driver generating lr1 states}}.\fbreak
Goes thru the lr state list looking for closure states to gen.
Note: a closure state gens its transitive states.
Apart from the ``closure only'' state (start state), all other states contain
2 contexts: transitive core items, and possibly added closured items.
As the list is read, it evaluates the possible state for gening by
seeing if there are {\bf {closured}} items needing to be gened.
There are 3 possible outcomes to this evaluation:\fbreak
\ptindent{1) items not gened: goto of item is nil.}
\ptindent{2) items completed due to right boundedness from a previous gen closure state / vector context.}
\ptindent{3) partially gened items due to common prefix of a previous closure state/vector context.}
Points 1 and 3 need gening. Point 1 is the regular generation context.
Point 3 requires walking thru its right side symbols
to where its goto state needs gening (nil).
From there gening proceeds as normal within its own closure state/vector context.
\fbreak

During each state closure part/vectors pass, lr kosherness is tested within
each closure state/vector gening context.
A non lr(1) verdict is returned immediately within
the gening closure state/vector context.
The balance of the closure state/vectors to gen is not completed.
@<generate grammar's LR1 states@>=
AST* start_rule_def_t = AST::get_1st_son(*rules_tree);
state* gening_state = new state(start_rule_def_t);
gen_context gening_context(0,-1);
STATES_ITER_type si = LR1_STATES.begin();
STATES_ITER_type sie = LR1_STATES.end();

// list added to dynamicly as each gening context created
for(;si!=sie;++si){
  gening_state = *si;
  gening_context.for_closure_state_ = gening_state;
  gening_context.gen_vector_ = -1;
lrclog << "lr state driver considered state: " 
<< gening_context.for_closure_state_->state_no_
<< " for vector: "
<<  gening_context.gen_vector_
<< endl;

  LR1_HEALTH = gening_state->gen_transitive_states_for_closure_context(gening_context,*gening_state,*gening_state);
  if(LR1_HEALTH==NOT_LR1_COMPATIBLE){
   @<is the grammar unhealthy? yes report the details and exit@>;
  }
}
//|@<print dump state@>;|
@<commonize la sets@>;
/* please put back at sign if u want to trace*/
//|@<print dump state@>;|
//|@<print dump common states@>;|
@*3 Is the grammar unhealthy? yes report the details and exit.\fbreak
@<is the grammar unhealthy? yes report the details and exit@>=
  if(LR1_HEALTH == NOT_LR1_COMPATIBLE){
    yacco2::lrclog << "===>Please check Grammar dump file: " 
      << normal_tracing.c_str() << " for Not LR1 details" << endl;
    std::cout << "===>Please check Grammar dump file: " 
      << normal_tracing.c_str() << " for Not LR1 details" << endl;
    yacco2::lrclog << "Not LR1 --- check state conflict list of state: " 
    << gening_state->state_no_ << " for details" << endl;
	@<print dump state@>;
       @<print dump common states@>;
   return 1;
  }

@*3 Commonize LA Sets --- Combine common sets as a space saver.\fbreak
Go thru the lr states looking for reduced subrules.
Their lookahead sets have already been calculated, so common la sets
are determined by set equality: the registry is read thru for
a set's soul mate. This common reference to the same sets minimizes space in the emitted lr state tables.
The index number per set in the |COMMON_LA_SETS| registry will be used
as part of each generated la set's name.
This is why the found index number is deposited per reduced subrule.
When the state tables get emitted, this index number + 1 is used in the gened lookahead's
name, as i prefer the name to be relative to 1.
 @<commonize la sets@>=
  COMMONIZE_LA_SETS();
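The commonizing step reduces to a registry lookup by set equality. A standalone sketch with a hypothetical |LaSet| of terminal enumerates, not the actual |COMMON_LA_SETS| registry:

```cpp
#include <set>
#include <vector>

using LaSet = std::set<int>;   // a reduced subrule's lookahead terminals

// Registry of distinct la sets. Returns the index of the soul mate if an
// equal set is already registered, else registers the new set. Index + 1
// becomes part of the emitted la set's name, keeping it relative to 1.
std::size_t commonize(std::vector<LaSet>& registry, const LaSet& la) {
    for (std::size_t i = 0; i < registry.size(); ++i)
        if (registry[i] == la) return i;   // set equality finds the common set
    registry.push_back(la);
    return registry.size() - 1;
}
```

Each reduced subrule then records the returned index instead of a private copy of the set.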


@** Overview of \O2's state generated components.\fbreak
\O2 generates the components making up the  automaton and the first set language for \O2{}Linker
to compile.
These files are the header definition of the grammar, the ``first set'' file for \O2{}Linker,
and the implementations of the automaton (fsm), its symbols, and the fsm's states.

Depending on the switches inputted, \O2 can generate the 
Terminal vocubulary defined for the grammar environment: 
the individual terminal classifications of 
errors, lr constants, raw characters, and Terminals.
As a global reference to all defined terminals, 
an enumeration scheme is emitted.

  
