author: Jan Max Meyer <jmm@phorward.de> 2021-12-04 16:56:15 +0100
committer: Adrian Thurston <thurston@colm.net> 2021-12-04 11:10:49 -0800
commit: 428715b40931abf6c422c57154da82ce8621d74e
tree: 9b87dd12a9df414298fd34dd1c3170e9dd7e58f5
parent: f9708db3960b95457c1efaf092efc65d3b486c9f
Providing a more helpful README.md for Colm
-rw-r--r--  README                           51
-rw-r--r--  README.md                       161
-rw-r--r--  doc/colm/0_00_welcome.adoc       27
-rw-r--r--  doc/colm/0_03_commandline.adoc   27
-rw-r--r--  doc/colm/0_04_hello_world.adoc    2
-rw-r--r--  doc/colm/code/assign.lm           4
6 files changed, 192 insertions, 80 deletions
diff --git a/README b/README
deleted file mode 100644
index f7e8f64e..00000000
--- a/README
+++ /dev/null
@@ -1,51 +0,0 @@
- Colm.Net Suite of Programs
- ==========================
-
-This package contains the Colm Programming Language, Ragel State Machine
-Compiler 7.0+, and supporting libraries.
-
-DEPENDENCIES
-============
-
-This package has no external dependencies, other than usual autotools and C/C++
-compiler programs.
-
-For the program:
-
- make libtool gcc g++ autoconf automake
-
-For the docucumentation:
-
- * asciidoc
- * fig2dev
-
-
-BUILDING
-========
-
-Colm is built in the usual autotool way:
-
-$ ./autogen
-$ ./configure
-$ make
-$ make install
-
-RUN-TIME DEPENDENCIES
-=====================
-
-The colm program depends on GCC at runtime. It produces a C program as output,
-then compiles and links it with a runtime library. The compiled program depends
-on the colm library.
-
-Notes on RUNNING
-================
-
-To find the includes and the runtime library to pass to GCC, colm looks at
-argv[0] to decide if it is running out of the source tree. If it is, then the
-compile and link flags are derived from argv[0]. Otherwise, it uses the install
-location (prefix) to construct the flags.
-
-SYNTAX HIGHLIGHTING
-===================
-
-There are vim syntax definition files colm.vim and ragel.vim
diff --git a/README.md b/README.md
new file mode 100644
index 00000000..e2a9c536
--- /dev/null
+++ b/README.md
@@ -0,0 +1,161 @@
+# Colm = COmputer Language Machinery
+
+Colm is a programming language designed for the analysis and [transformation of computer languages](https://www.program-transformation.org/Transform/TransformationSystems).<br>
+Colm is influenced primarily by [TXL](https://www.txl.ca/).
+
+
+## What is a transformation language?
+
+A transformation language has a type system based on formal languages.<br>
+Rather than defining classes or data structures, one defines grammars.
+
+A parser is constructed automatically from the grammar, and the parser is used for two purposes:
+
+- to parse the input language,
+- and to parse the structural patterns in the program that performs the analysis.
+
+In this setting, grammar-based parsing is critical because it guarantees that both the input and the structural patterns are parsed into trees from the same set of types, allowing comparison.
+
+
+## Colm's features
+
+Colm is not-your-typical-scripting-language™:
+
+- Colm's main contribution lies in the parsing method.<br>Colm's parsing engine is generalized, but it also allows for the construction of arbitrary global data structures that can be queried during parsing. In other generalized methods, construction of global data requires some very careful consideration because of inherent concurrency in the parsing method. It is such a tricky task that it is often avoided altogether and the problem is deferred to a post-parse disambiguation of the parse forest.
+- By default, Colm creates an ELF executable that can be used standalone for the actual transformations.
+- Colm is a statically and strongly typed scripting language.
+- Colm is very small and fast, and can easily be embedded in or linked with C/C++ programs.
+- Colm's runtime is a stack-based VM that starts with the bare minimum of the language and bootstraps itself.
+
+
+## Examples
+
+This is how Colm is greeting the world ([`hello_world.lm`](doc/colm/code/hello_world.lm)):
+```colm
+print "hello world\n"
+```
+
+Here's a Colm program implementing a little assignment language ([`assign.lm`](doc/colm/code/assign.lm)) and its parse tree synthesis afterwards.
+```colm
+lex
+ token id / ('a' .. 'z' | 'A' .. 'Z' ) + /
+ token number / ( '0' .. '9' )+ /
+ literal `= `;
+ ignore / [ \t\n]+ /
+end
+
+def value
+ [id] | [number]
+
+def assignment
+ [id `= value `;]
+
+def assignment_list
+ [assignment assignment_list]
+| [assignment]
+| []
+
+parse Simple: assignment_list[ stdin ]
+
+if ( ! Simple ) {
+ print( "[error]\n" )
+ exit( 1 )
+}
+else {
+ for I:assignment in Simple {
+ print( $I.id, "->", $I.value, "\n" )
+ }
+}
+```
+
+More real-world programs parsing several languages implemented in Colm can be found in the [`grammar/`-folder](https://github.com/adrian-thurston/colm/tree/master/grammar).
+
+
+## Usage
+
+To compile and immediately run a program, e.g. the `hello_world.lm` example from above, call
+
+```
+$ colm -r hello_world.lm
+hello world
+```
+
+Run `colm --help` for help on further options.
+
+```
+$ colm --help
+usage: colm [options] file
+general:
+ -h, -H, -?, --help print this usage and exit
+ -v --version print version information and exit
+ -b <ident> use <ident> as name of C object encapulaing the program
+ -o <file> if -c given, write C parse object to <file>,
+ otherwise write binary to <file>
+ -p <file> write C parse object to <file>
+ -e <file> write C++ export header to <file>
+ -x <file> write C++ export code to <file>
+ -m <file> write C++ commit code to <file>
+ -a <file> additional code file to include in output program
+ -E N=V set a string value available in the program
+ -I <path> additional include path for the compiler
+ -i activate branchpoint information
+ -L <path> additional library path for the linker
+ -l activate logging
+ -r run output program and replace process
+ -c compile only (don't produce binary)
+ -V print dot format (graphiz)
+ -d print verbose debug information
+
+```
+
+
+## Building
+
+To build Colm on your own, see the following dependencies and build instructions.
+
+### Dependencies
+
+This package has no external dependencies, other than usual autotools and C/C++ compiler programs.
+
+For the program:
+- make
+- libtool
+- gcc
+- g++
+- autoconf
+- automake
+
+For the documentation, install [`asciidoc`](https://asciidoctor.org/) and [`fig2dev`](https://github.com/getlarky/fig2dev) as well.
+
+### Building
+
+Colm is built in the usual autotool way:
+
+```
+$ ./autogen.sh
+$ ./configure
+$ make
+$ make install
+```
+
+### Run-time dependencies
+
+The colm program depends on GCC at runtime. It produces a C program as output,
+then compiles and links it with a runtime library. The compiled program depends
+on the colm library.
+
+To find the includes and the runtime library to pass to GCC, colm looks at
+`argv[0]` to decide if it is running out of the source tree. If it is, then the
+compile and link flags are derived from `argv[0]`. Otherwise, it uses the install
+location (prefix) to construct the flags.
+
+
+## Syntax highlighting
+
+There is a vim syntax definition file [colm.vim](/colm.vim).
+
+
+## License
+
+Colm is free software under the MIT license.<br>
+Please see the COPYING file for more details.
diff --git a/doc/colm/0_00_welcome.adoc b/doc/colm/0_00_welcome.adoc
index 93b446d2..93a5dd95 100644
--- a/doc/colm/0_00_welcome.adoc
+++ b/doc/colm/0_00_welcome.adoc
@@ -3,16 +3,16 @@ Welcome
== Colm = COmputer Language Machinery
-Colm is a programming language designed for the analysis and http://www.program-transformation.org/Transform/TransformationSystems[transformation of computer languages].
-Colm is influenced primarily by http://www.txl.ca/[TXL].
+Colm is a programming language designed for the analysis and https://www.program-transformation.org/Transform/TransformationSystems[transformation of computer languages].
+Colm is influenced primarily by https://www.txl.ca/[TXL].
=== What is a transformation language?
-A transformation language has a type system based on formal languages.
-Rather than define classes or data structures, one defines grammars.
-A parser is constructed automatically from the grammar, and the parser is used for two purposes:
+A transformation language has a type system based on formal languages.
+Rather than defining classes or data structures, one defines grammars.
+A parser is constructed automatically from the grammar, and the parser is used for two purposes:
* to parse the input language,
-* and to parse the structural patterns in the program that performs the analysis.
+* and to parse the structural patterns in the program that performs the analysis.
In this setting, grammar-based parsing is critical because it guarantees that both the input and the structural patterns are parsed into trees from the same set of types, allowing comparison.
@@ -23,23 +23,22 @@ Colm is not-your-typical-scripting-language (TM):
Colm's parsing engine is generalized, but it also allows for the construction of arbitrary global data structures that can be queried during parsing.
In other generalized methods, construction of global data requires some very careful consideration because of inherent concurrency in the parsing method.
It is such a tricky task that it is often avoided altogether and the problem is deferred to a post-parse disambiguation of the parse forest.
-* By default Colm will create an elf exectuable that be used standalone for that actual transformations.
+* By default Colm will create an elf executable that can be used standalone for that actual transformations.
* Colm is a static and strong typed scripting language.
* Colms' is very tiny and fast and can easily be embedded/linked with c/cpp programs.
-* Colm's runtime is a stackbased VM that starts with the bare minium of the language and bootstraps itself.
-* creates aVM is very tycan be embedded in C as it It runs in a embeddable vm, the language is bootstrapped.
+* Colm's runtime is a stackbased VM that starts with the bare minimum of the language and bootstraps itself.
+* Creates a VM which can be embedded in C. As it runs in an embeddable VM, the language is bootstrapped.
=== Where is colm used?
-Colm is developed and used intensively by http://www.colm.net/[Colm Networks] to develop fast network traffic automata and systems for traffic identification, decoding, pattern matching, and extraction of security events.
-But colm is also the driving force in http://www.colm.net/open-source/ragel/[the Ragel State Machine Compiler]
+Colm is developed and used intensively by https://www.colm.net/[Colm Networks] to develop fast network traffic automata and systems for traffic identification, decoding, pattern matching, and extraction of security events.
+But colm is also the driving force in https://www.colm.net/open-source/ragel/[the Ragel State Machine Compiler]
=== What is colm's history?
-Colm's development started by https://twitter.com/ehdtee[Adrian Thurston] during his http://www.colm.net/files/colm/thurston-phdthesis.pdf[Ph.D. thesis] period after intensive study of http://research.cs.queensu.ca/~cordy/Papers/TC_SCAM06_ETXL.pdf[TXL].
+Colm's development started by https://twitter.com/ehdtee[Adrian Thurston] during his https://www.colm.net/files/colm/thurston-phdthesis.pdf[Ph.D. thesis] period after intensive study of http://research.cs.queensu.ca/~cordy/Papers/TC_SCAM06_ETXL.pdf[TXL].
=== When not to use Colm
-Colm is meant to create executables or object files that can be linked in other programs.
+Colm is meant to create executables or object files that can be linked in other programs.
This make is ideal for tasks like high performance transformations, but not very convenient for throwaway-oneliners that are common with tools like 'sed' or 'awk'.
-
diff --git a/doc/colm/0_03_commandline.adoc b/doc/colm/0_03_commandline.adoc
index 980003b0..005a769c 100644
--- a/doc/colm/0_03_commandline.adoc
+++ b/doc/colm/0_03_commandline.adoc
@@ -3,25 +3,28 @@ Commandline
Let's start colm with the '--help' command line argument.
-----
-colm --help
-----
-
-NOTE: This reflects the development version 0.13.0.4;
-
-
-```usage: colm [options] file
+```$ ./colm --help
+usage: colm [options] file
general:
-h, -H, -?, --help print this usage and exit
-v --version print version information and exit
- -o <file> write output to <file>
- -c compile only (don't produce binary)
+ -b <ident> use <ident> as name of C object encapulaing the program
+ -o <file> if -c given, write C parse object to <file>,
+ otherwise write binary to <file>
+ -p <file> write C parse object to <file>
-e <file> write C++ export header to <file>
-x <file> write C++ export code to <file>
-m <file> write C++ commit code to <file>
-a <file> additional code file to include in output program
+ -E N=V set a string value available in the program
+ -I <path> additional include path for the compiler
+ -i activate branchpoint information
+ -L <path> additional library path for the linker
+ -l activate logging
+ -r run output program and replace process
+ -c compile only (don't produce binary)
+ -V print dot format (graphiz)
+ -d print verbose debug information
```
This reveals us some more insights: it reads a 'colm' file and creates a object file with eventually cpp/h/x code.
-
-
diff --git a/doc/colm/0_04_hello_world.adoc b/doc/colm/0_04_hello_world.adoc
index 8cbfab53..291961fa 100644
--- a/doc/colm/0_04_hello_world.adoc
+++ b/doc/colm/0_04_hello_world.adoc
@@ -44,7 +44,7 @@ We can strip the file to check if we can reduce the executable.
[source,bash]
----
strip ./hello_world
-ls -l hello_words
+ls -l hello_world
----
----
diff --git a/doc/colm/code/assign.lm b/doc/colm/code/assign.lm
index 6add7cce..cc17b13a 100644
--- a/doc/colm/code/assign.lm
+++ b/doc/colm/code/assign.lm
@@ -10,7 +10,7 @@ def value
def assignment
[id `= value `;]
-
+
def assignment_list
[assignment assignment_list]
| [assignment]
@@ -20,7 +20,7 @@ parse Simple: assignment_list[ stdin ]
if ( ! Simple ) {
print( "[error]\n" )
- exit( 1 )
+ exit( 1 )
}
else {
for I:assignment in Simple {