Commit message | Author | Age | Files | Lines
* set the pubdate [colm-0.60.6] | Adrian Thurston | 2012-06-20 | 1 | -1/+1
|
* moved away closed issues | Adrian Thurston | 2012-06-16 | 48 | -0/+0
|
* converted issues to text and split by id | Adrian Thurston | 2012-06-09 | 79 | -1643/+824
|
* cleanup of region creation | Adrian Thurston | 2012-05-29 | 2 | -61/+35
|
* flattened the reg lang name tree down to a list for regions | Adrian Thurston | 2012-05-29 | 2 | -44/+14
|
* only need regions in the name tree. | Adrian Thurston | 2012-05-28 | 2 | -138/+0
|
* cleanup in token region code | Adrian Thurston | 2012-05-28 | 2 | -68/+27
| | | | | Use the same name for the RegionDef and TokenRegion. Eventually should be able to unify these two structs.
* don't need labels in the regular language tree | Adrian Thurston | 2012-05-28 | 2 | -30/+0
|
* code cleanup | Adrian Thurston | 2012-05-28 | 4 | -251/+3
| | | | | | | Eliminated the name resolution walk within the state machine. This is from ragel and is not needed. Also removed some top level code for constructing state machines not in a scanner. We don't have this in colm, all state machines are in a scanner.
* code cleanup | Adrian Thurston | 2012-05-28 | 5 | -59/+42
| | | | | The JoinOrLm structs are no longer needed. VarDef and RegionDef reference the Join and the TokenRegion, respectively.
* specializing graph dicts and lists for regions and regular language defs | Adrian Thurston | 2012-05-28 | 6 | -77/+171
| | | | | | Previously used a single graph dictionary for regions and regular language defs because we were derived from ragel. Splitting these. The split goes down to VarDef and JoinOrLm.
* separating graph dict for regular language defs and scanners | Adrian Thurston | 2012-05-27 | 2 | -30/+24
| | | | | | Renamed the def function for regions to reflect that it is for regions. Took the isInstance out of both functions. All lexical regions are instances (in the ragel sense) and all regular language definitions are not.
* cleanup of ragel-derived code | Adrian Thurston | 2012-05-27 | 4 | -15/+40
| | | | | | | The scanner code was derived from ragel, where the same map of names to graphs is used for regular language definitions and scanners. Some of the regular language definitions are instantiations, meaning they create states. Starting to retire this by creating a separate map for regular language defs (rlMap).
* some fixes for this test, but not functional yet | Adrian Thurston | 2012-05-27 | 1 | -22/+26
|
* added shell script test harness | Adrian Thurston | 2012-05-27 | 3 | -10/+166
| | | | | | | Added a shell script test harness that executes all the lm tests in the current directory. Doesn't require any makefile generation, TESTS file, etc. The shell script can be easily programmed to add extra steps during testing, such as pre/post execution, linking, etc. Just the better way to go.
* cleanup: file renaming | Adrian Thurston | 2012-05-26 | 4 | -3/+3
| | | | | | codegen.cc for writing the colm program; compiler.cc for the main compiler logic; synthesis.cc for the bytecode program generation.
* code movement | Adrian Thurston | 2012-05-26 | 6 | -22/+55
| | | | | The exports.cc file is for writing the C++ interface. Write for generic code writing. Currently has only the main file.
* cleanup: code movement | Adrian Thurston | 2012-05-26 | 3 | -124/+83
| | | | Merged parsedata.cc and analysis.cc and renamed it colm.cc
* class name change ParseData -> Compiler | Adrian Thurston | 2012-05-26 | 18 | -444/+444
|
* minor code cleanup | Adrian Thurston | 2012-05-26 | 1 | -2/+4
| | | | | Allocate scanners for included files on the heap. Consistent with the main line.
* cleanup of the mainline | Adrian Thurston | 2012-05-26 | 4 | -35/+37
| | | | | Allocating the primary processing objects in the mainline and calling them. Previously had scanners allocating parsers and parsers allocating parse data.
* removed the opt_collect_ignore productions, not using | Adrian Thurston | 2012-05-25 | 1 | -40/+10
|
* test the capture-ignore mechanism | Adrian Thurston | 2012-05-25 | 3 | -5/+16
|
* putting collect-ignores in the grammar as zero-length tokens | Adrian Thurston | 2012-05-25 | 10 | -72/+75
| | | | | | PDA construction and execution is complicated too much by the automatic insertion of collect ignore tokens when the collect-ignore property is set. Instead put the collect ignores into grammars, patterns and replacements.
* Bump to 0.6. Will start depending on this version. | Adrian Thurston | 2012-05-25 | 1 | -1/+1
|
* cleanup of collect-ignore | Adrian Thurston | 2012-05-25 | 6 | -10/+38
| | | | | Suppress code generation of types that are duplicates into the ignore/token/ci regions. Removed some print statements used for debugging.
* collect-ignore implementation | Adrian Thurston | 2012-05-24 | 6 | -21/+52
| | | | | | | Now possible to parse patterns that have collect-ignores. Sometimes you need them present in the input stream when you pass over the production. Other times you don't when you pass over the nonterminal. Built skipping of them into the backtracker.
* experimenting with use of a nonterm for collecting ignores. | Adrian Thurston | 2012-05-24 | 13 | -33/+233
| | | | | | | | | Can say that a production should collect ignores from a region. There is a collect ignore region created, but the states from the ignore-version of the region are used. When the scanner fails to produce a token from the collect-ignore region, the collect-ignore token is generated and accepted by the fsm. Need to take it out of the data tree on reductions and put it into an ignore list. Reverse this during unparsing.
* removed old print statement | Adrian Thurston | 2012-05-23 | 1 | -2/+2
|
* added a syntax for specifying no ignores | Adrian Thurston | 2012-05-23 | 12 | -49/+83
| | | | | | Added the keyword 'ni', which can go before or after a token pattern (literal or usual), which means no-ignore. Sets the noPreIgnore and noPostIgnore bits in the token, which affect the ignore scanning and attaching.
* fixed botched initialization of TokenDef::dupOf | Adrian Thurston | 2012-05-23 | 1 | -1/+1
|
* fix for right ignore attaching | Adrian Thurston | 2012-05-22 | 1 | -16/+12
| | | | | | The right ignore attaching failed to take into account that the accumulated ignore list is in reverse order. Need to take the tail that is for right ignore, not the head.
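The reverse-order pitfall described in this commit can be illustrated with a minimal Python sketch. It is not the colm implementation: the function name `split_accumulated_ignores`, the `right_count` parameter, and the use of plain lists are all assumptions made for illustration. It assumes ignores are accumulated newest-first and that the earliest ignores in source order are the ones attached to the right of the preceding token.

```python
def split_accumulated_ignores(accum_reversed, right_count):
    """Split a newest-first ignore list into (right_attached, left_attached).

    Hypothetical helper, not taken from colm. `accum_reversed` holds the
    ignores in reverse source order (most recently scanned first).
    `right_count` is how many of the earliest ignores (in source order)
    belong to the right side of the preceding token.
    """
    if not 0 <= right_count <= len(accum_reversed):
        raise ValueError("right_count out of range")

    # Because the list is reversed, the earliest ignores sit at the TAIL.
    split = len(accum_reversed) - right_count
    left_reversed = accum_reversed[:split]
    right_reversed = accum_reversed[split:]

    # Restore source order for both halves before returning them.
    return list(reversed(right_reversed)), list(reversed(left_reversed))


# Source order was: "a", "b", "c"; accumulation reversed it.
right, left = split_accumulated_ignores(["c", "b", "a"], 1)
print(right, left)
```

Taking the head of the reversed list instead would hand the most recently scanned ignore to the preceding token, which is the kind of mix-up the commit message describes.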
* updated tests for latest parser changes | Adrian Thurston | 2012-05-22 | 29 | -34/+44
|
* added another ignore test | Adrian Thurston | 2012-05-22 | 7 | -7/+64
| | | | | Exercises the attaching of tokens to the side that the ignore definitions came from.
* improvements to ignore handling in the parser | Adrian Thurston | 2012-05-22 | 14 | -60/+214
| | | | | | | | | | | | Every region now also has a duplicate scanning region that is only for tokens. The duplicate ignores and tokens generate the original tokens through a TokenDef ignore mechanism. Can turn off post-ignore parsing and pre-ignore parsing on a token-by-token basis. Probably want to move it into the productions and specify it there. Currently don't have a specification mechanism. If an ignore is a post-token ignore it is not right-attached.
* added text_notrim() to the C++ interface. | Adrian Thurston | 2012-05-22 | 1 | -3/+4
| | | | | The text_notrim() functions retrieve the text of a token without automatically trimming off whitespace.
* added trim control flag to print code, auto-trimming all colm print calls | Adrian Thurston | 2012-05-22 | 17 | -43/+54
| | | | | | | | The print implementation now takes a trim flag. The colm print function now sets this flag by default. This is a change to the colm language back to 0.5 semantics. The $ conversion uses this flag too (also 0.5 semantics); in the previous commit it issued a tree trim operation. The % operation gives a string conversion without trimming.
* force DEF_PAT names to be unique. | Adrian Thurston | 2012-05-21 | 1 | -1/+2
|
* took out the trim before str conversion | Adrian Thurston | 2012-05-21 | 1 | -1/+1
| | | | | | No longer need to trim trees before converting to strings because the string conversion does it automatically. To convert to a string without trimming, the % operator is used. This may change.
* moved repeat -> repeat1, added repeat2 | Adrian Thurston | 2012-05-21 | 9 | -12/+7438
| | | | | The repeat2 test is the current doc gen program from ragel. There is a custom language with a traversal that calls prints (no real transformation).
* removed empty fsmrun.c | Adrian Thurston | 2012-05-21 | 2 | -21/+1
|
* auto trim before $ string conversion | Adrian Thurston | 2012-05-21 | 3 | -3/+15
| | | | | The $ operation automatically adds a TRIM. The '%' operation was added, which is the original $ conversion without the trim.
* clone elimination/refactoring of ignore functions | Adrian Thurston | 2012-05-21 | 1 | -39/+2
| | | | Eliminated final clone of the push ignores, this one was in the trim operation.
* clone elimination. | Adrian Thurston | 2012-05-21 | 1 | -36/+2
| | | | Clone elimination on calls to pushIgnore.
* ongoing refactoring cleanup | Adrian Thurston | 2012-05-21 | 3 | -16/+18
| | | | | | Removed sp from the pushIgnore functions. Need to use it in contexts where it is not currently available. Actually not needed because we can directly access refs when trees are moved around during the push.
* more clone removal surrounding ignore handling | Adrian Thurston | 2012-05-21 | 3 | -70/+88
|
* clone removal | Adrian Thurston | 2012-05-21 | 3 | -130/+141
| | | | The ignore node handling code has been frequently cloned. Cleaning that up.
* eliminated the IgnoreTree struct. | Adrian Thurston | 2012-05-21 | 8 | -93/+45
| | | | | Eliminated the IgnoreTree struct because it no longer contains any extensions of Tree. Just using Tree.
* eliminated generation from IgnoreList | Adrian Thurston | 2012-05-21 | 5 | -12/+0
| | | | | This field was the only field that extends the basic tree. Can now eliminate the structure and just use Tree.
* test cases updated for no-kid-flags and no-dup-ignores | Adrian Thurston | 2012-05-21 | 21 | -43/+51
| | | | | Whitespace is shifting. Most of the updates involve trimming whitespace where it was previously trimmed automatically.