Diffstat (limited to 'keama/doc.txt')
-rw-r--r-- keama/doc.txt 516
1 files changed, 516 insertions, 0 deletions
diff --git a/keama/doc.txt b/keama/doc.txt
new file mode 100644
index 00000000..a1664a22
--- /dev/null
+++ b/keama/doc.txt
@@ -0,0 +1,516 @@
+Part 1: Kea Migration Assistant support
+=======================================
+
+Files:
+------
+ - data.h (tailq list and element type declarations)
+ - data.c (element type code)
+ - keama.h (DHCP declarations)
+ - keama.c (main() code)
+ - json.c (JSON parser)
+ - option.c (option tables and code)
+ - keama.8 (man page)
+
+The code heavily uses tailq lists, i.e. doubly linked lists with
+a pointer to the last (tail) element.
+
+The element structure mimics the Kea Element class with a few differences:
+ - no smart pointers
+ - extra fields to handle declaration kind, skip and comments
+ - maps are implemented as lists with an extra key field so the order
+ of insertion is kept and duplicates are possible
+ - strings are length + content (vs C strings)
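The map-as-list idea above can be sketched with the TAILQ macros from
<sys/queue.h>. This is only an illustration: the sketch_* names and the
struct layout are hypothetical and are not the declarations in data.h.
Allocations are never freed, in line with the note on memory leaks below.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical length + content string (no NUL terminator needed). */
struct sketch_string {
    size_t length;
    char *content;
};

/* A map entry is a list node carrying an extra key field. */
struct sketch_entry {
    char *key;
    struct sketch_string value;
    TAILQ_ENTRY(sketch_entry) next;
};

TAILQ_HEAD(sketch_map, sketch_entry);

/* Append at the tail: insertion order is kept, duplicate keys are allowed. */
static void
sketch_map_set(struct sketch_map *map, const char *key,
               const char *data, size_t len)
{
    size_t klen = strlen(key) + 1;
    struct sketch_entry *e = calloc(1, sizeof(*e));

    assert(e != NULL);
    e->key = malloc(klen);
    e->value.content = malloc(len ? len : 1);
    assert(e->key != NULL && e->value.content != NULL);
    memcpy(e->key, key, klen);
    memcpy(e->value.content, data, len);
    e->value.length = len;
    TAILQ_INSERT_TAIL(map, e, next);
}

/* Return the first entry with the given key, or NULL. */
static struct sketch_entry *
sketch_map_get(struct sketch_map *map, const char *key)
{
    struct sketch_entry *e;

    TAILQ_FOREACH(e, map, next) {
        if (strcmp(e->key, key) == 0)
            return e;
    }
    return NULL;
}

/* Count the entries sharing a key (more than 1 means duplicates). */
static unsigned
sketch_map_count(struct sketch_map *map, const char *key)
{
    struct sketch_entry *e;
    unsigned count = 0;

    TAILQ_FOREACH(e, map, next) {
        if (strcmp(e->key, key) == 0)
            count++;
    }
    return count;
}
```

A caller initializes the head with TAILQ_INIT() and then walks it with
TAILQ_FOREACH() to print entries in declaration order.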
+
+There is no attempt to avoid memory leaks.
+
+The skip flag is printed as '//' at the beginning of lines. It is set
+when something cannot be converted, and the issue counter (returned
+by the keama command) is incremented.
+
+Part 2: ISC DHCP lexer organization
+===================================
+
+Files:
+-----
+ - dhctoken.h (from includes, enum dhcp_token definition)
+ - conflex.c (from common, lexical analyzer code)
+
+Tokens (dhcp_token enum): characters are set to their ASCII value,
+ others are >= 256 without real organization (e.g. END_OF_FILE is 607).
+
+The state is in a parse structure named "cfile". There is one per file
+and a few routines save it in order to backtrack over a larger
+set than the usual lookahead.
+The largest function is intern() which recognizes keywords with
+a switch on the first character and a tree of if strcasecmp's.
+
+Standard routines:
+-----------------
+enum dhcp_token
+next_token(const char **rval, unsigned *rlen, struct parse *cfile);
+
+and
+
+enum dhcp_token
+peek_token(const char **rval, unsigned *rlen, struct parse *cfile);
+
+rval: if not null the content of the token is put in it
+rlen: if not null the length of the token is put in it
+cfile: lexer context
+return: the integer value of the token
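The difference between the two routines is that peek_token leaves the token
in place while next_token consumes it. A minimal sketch of that contract,
using a hypothetical toy lexer (the toy_* names, the two token values and the
struct fields are invented for illustration and are not those of conflex.c):

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical tokens: characters keep their ASCII value, others >= 256. */
enum { TOY_END = 256, TOY_NAME = 257 };

struct toy_parse {
    const char *pos;        /* current read position */
    int have_peek;          /* nonzero when a token is buffered */
    int peeked;             /* buffered token */
    const char *peek_val;   /* buffered token content */
    unsigned peek_len;      /* buffered token length */
};

/* Scan one token from the input. */
static int
toy_scan(const char **rval, unsigned *rlen, struct toy_parse *cfile)
{
    const char *start;

    assert(cfile != NULL && cfile->pos != NULL);
    while (isspace((unsigned char)*cfile->pos))
        cfile->pos++;
    start = cfile->pos;
    *rval = start;
    if (*start == '\0') {
        *rlen = 0;
        return TOY_END;
    }
    if (isalpha((unsigned char)*start)) {
        while (isalnum((unsigned char)*cfile->pos) || *cfile->pos == '-')
            cfile->pos++;
        *rlen = (unsigned)(cfile->pos - start);
        return TOY_NAME;
    }
    cfile->pos++;
    *rlen = 1;
    return (unsigned char)*start;
}

/* Return the next token without consuming it. */
static int
toy_peek_token(const char **rval, unsigned *rlen, struct toy_parse *cfile)
{
    if (!cfile->have_peek) {
        cfile->peeked = toy_scan(&cfile->peek_val, &cfile->peek_len, cfile);
        cfile->have_peek = 1;
    }
    if (rval != NULL)       /* rval and rlen are optional, as above */
        *rval = cfile->peek_val;
    if (rlen != NULL)
        *rlen = cfile->peek_len;
    return cfile->peeked;
}

/* Return the next token and consume it. */
static int
toy_next_token(const char **rval, unsigned *rlen, struct toy_parse *cfile)
{
    int token = toy_peek_token(rval, rlen, cfile);

    cfile->have_peek = 0;
    return token;
}
```

The returned content points into the input and is not NUL-terminated in this
sketch, so callers must use the length.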
+
+Changes:
+-------
+
+Added LBRACKET '[' and RBRACKET ']' tokens for JSON parser
+(switch on dhcp_token type).
+
+Added comments to collect ISC DHCP # comments, element stack to follow
+declaration hierarchy, and issue counter to struct parse.
+
+Moved the parse_warn routine (renamed to parse_error and made fatal)
+from conflex.c to keama.c.
+
+Part 3: ISC DHCP parser organization
+====================================
+
+Files:
+-----
+ - confparse.c (from server)
+ for the server in parse_statement())
+ - parse.c (from common)
+
+4 classes: parameters, declarations, executable statements and expressions.
+
+The original code parses both config and lease files; I kept only the
+config part, with the exception of parse_binding_value().
+
+entry point
+ |
+ V
+conf_file_parse
+ |
+ V
+conf_file_subparse <- read_conf_file (for include)
+ until END_OF_FILE call
+ |
+ V
+parse_statement
+ parse parameters and declarations
+ switch on token and call parse_xxx_declaration routines
+ on default or DHCPv6 token in DHCPv4 mode call parse_executable_statement
+ and put the result under the "statement" key
+ |
+ V
+parse_executable_statement
+
+According to comments the grammar is:
+
+ conf-file :== parameters declarations END_OF_FILE
+ parameters :== <nil> | parameter | parameters parameter
+ declarations :== <nil> | declaration | declarations declaration
+
+ statement :== parameter | declaration
+
+ parameter :== DEFAULT_LEASE_TIME lease_time
+ | MAX_LEASE_TIME lease_time
+ | DYNAMIC_BOOTP_LEASE_CUTOFF date
+ | DYNAMIC_BOOTP_LEASE_LENGTH lease_time
+ | BOOT_UNKNOWN_CLIENTS boolean
+ | ONE_LEASE_PER_CLIENT boolean
+ | GET_LEASE_HOSTNAMES boolean
+ | USE_HOST_DECL_NAME boolean
+ | NEXT_SERVER ip-addr-or-hostname SEMI
+ | option_parameter
+ | SERVER-IDENTIFIER ip-addr-or-hostname SEMI
+ | FILENAME string-parameter
+ | SERVER_NAME string-parameter
+ | hardware-parameter
+ | fixed-address-parameter
+ | ALLOW allow-deny-keyword
+ | DENY allow-deny-keyword
+ | USE_LEASE_ADDR_FOR_DEFAULT_ROUTE boolean
+ | AUTHORITATIVE
+ | NOT AUTHORITATIVE
+
+ declaration :== host-declaration
+ | group-declaration
+ | shared-network-declaration
+ | subnet-declaration
+ | VENDOR_CLASS class-declaration
+ | USER_CLASS class-declaration
+ | RANGE address-range-declaration
+
+Typically declarations use { } and are associated with a group
+(changed to a type) in ROOT_GROUP (global), HOST_DECL, SHARED_NET_DECL,
+SUBNET_DECL, CLASS_DECL, GROUP_DECL and POOL_DECL.
+
+ROOT: parent = TOPLEVEL, children = everything but POOL
+HOST: parent = ROOT, GROUP, warn on SHARED or SUBNET, children = none
+SHARED_NET: parent = ROOT, GROUP, children = HOST (warn), SUBNET, POOL4
+SUBNET: parent = ROOT, GROUP, SHARED, children = HOST (warn), POOL
+CLASS: parent = ROOT, GROUP, children = none
+GROUP: parent = ROOT, SHARED, children = anything but POOL
+POOL: parent = SHARED4, SUBNET, warn on others, children = none
+
+isc_boolean_t
+parse_statement(struct parse *cfile, int type, isc_boolean_t declaration);
+
+cfile: parser context
+type: declaration type
+declaration and return: declaration or parameter
+
+On the common side:
+
+ executable-statements :== executable-statement executable-statements |
+ executable-statement
+
+ executable-statement :==
+ IF if-statement |
+ ADD class-name SEMI |
+ BREAK SEMI |
+ OPTION option-parameter SEMI |
+ SUPERSEDE option-parameter SEMI |
+ PREPEND option-parameter SEMI |
+ APPEND option-parameter SEMI
+
+isc_boolean_t
+parse_executable_statement(struct element *result,
+ struct parse *cfile, isc_boolean_t *lose,
+ enum expression_context case_context,
+ isc_boolean_t direct);
+
+result: map element where to put the statement
+cfile: parser context
+lose: set to ISC_TRUE on failure
+case_context: expression context
+direct: called directly by parse_statement so can execute config statements
+return: success
+
+parse_executable_statement
+ switch on keywords (far more than in the comments)
+ on default with an identifier try a config option, on number or name
+ call parse_expression for a function call
+ |
+ V
+parse_expression
+
+Expressions are divided into boolean, data (string) and numeric expressions.
+
+ boolean_expression :== CHECK STRING |
+ NOT boolean-expression |
+ data-expression EQUAL data-expression |
+ data-expression BANG EQUAL data-expression |
+ data-expression REGEX_MATCH data-expression |
+ boolean-expression AND boolean-expression |
+ boolean-expression OR boolean-expression |
+ EXISTS OPTION-NAME
+
+ data_expression :== SUBSTRING LPAREN data-expression COMMA
+ numeric-expression COMMA
+ numeric-expression RPAREN |
+ CONCAT LPAREN data-expression COMMA
+ data-expression RPAREN |
+ SUFFIX LPAREN data_expression COMMA
+ numeric-expression RPAREN |
+ LCASE LPAREN data_expression RPAREN |
+ UCASE LPAREN data_expression RPAREN |
+ OPTION option_name |
+ HARDWARE |
+ PACKET LPAREN numeric-expression COMMA
+ numeric-expression RPAREN |
+ V6RELAY LPAREN numeric-expression COMMA
+ data-expression RPAREN |
+ STRING |
+ colon_separated_hex_list
+
+ numeric-expression :== EXTRACT_INT LPAREN data-expression
+ COMMA number RPAREN |
+ NUMBER
+
+parse_boolean_expression, parse_data_expression and parse_numeric_expression
+call parse_expression and check its result.
+
+parse_expression itself is divided into parse_non_binary and internal
+handling of binary operators
+
+isc_boolean_t
+parse_non_binary(struct element *expr, struct parse *cfile,
+ isc_boolean_t *lose, enum expression_context context)
+
+isc_boolean_t
+parse_expression(struct element *expr, struct parse *cfile,
+ isc_boolean_t *lose, enum expression_context context,
+ struct element *lhs, enum expr_op binop)
+
+expr: map element where to put the result
+cfile: parser context
+lose: set to ISC_TRUE on failure
+context: expression context
+lhs: NULL or left hand side
+binop: expr_none or binary operation
+return: success
+
+parse_non_binary
+ switch on unary and nullary operator keywords
+ on default try a variable reference or a function call
+
+parse_expression
+ call parse_non_binary to get the right hand side
+ switch on binary operator keywords to get the next operation
+ with only one side: if the operation is expr_none return, else get
+ the second side
+ handle operator precedence, can call itself
+ return a map entry with the operator name as the key, and
+ left and right expression branches
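The precedence handling can be illustrated with a classic precedence-climbing
sketch. It is not the control flow of parse_expression: the toy grammar
(one-letter operands, the operators '=', '&' and '|' and their relative
precedence) and the toy_* names are assumptions made for the example, and the
result is a parenthesized string rather than a map element.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical precedence table: 0 means "not a binary operator". */
static int
toy_prec(char op)
{
    switch (op) {
    case '=': return 3;
    case '&': return 2;
    case '|': return 1;
    default:  return 0;
    }
}

/*
 * Parse operands and binary operators of precedence >= min_prec from
 * *src and write a fully parenthesized form into out (size bytes).
 * Buffers are fixed size: overlong input is silently truncated.
 */
static void
toy_parse_expression(const char **src, int min_prec, char *out, size_t size)
{
    char lhs[128], rhs[128], tmp[260];

    if (**src == '\0') {
        snprintf(out, size, "%s", "");
        return;
    }
    /* Plays the role of parse_non_binary: read a single operand. */
    snprintf(lhs, sizeof(lhs), "%c", **src);
    (*src)++;

    for (;;) {
        char op = **src;
        int prec = toy_prec(op);

        /* No operator, or one that belongs to an outer call. */
        if (prec == 0 || prec < min_prec)
            break;
        (*src)++;
        /* Recurse for the other side; prec + 1 makes operators
         * of equal precedence group to the left. */
        toy_parse_expression(src, prec + 1, rhs, sizeof(rhs));
        snprintf(tmp, sizeof(tmp), "(%s%c%s)", lhs, op, rhs);
        snprintf(lhs, sizeof(lhs), "%s", tmp);
    }
    snprintf(out, size, "%s", lhs);
}
```

Called with min_prec set to 1, the input "a|b&c" groups the '&' first because
it has the higher precedence in this toy table.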
+
+Part 4: Expression processing
+=============================
+
+Files:
+------
+ - print.c (new)
+ - eval.c (new)
+ - reduce.c (new)
+
+Print:
+------
+
+const char *
+print_expression(struct element *expr, isc_boolean_t *lose);
+const char *
+print_boolean_expression(struct element *expr, isc_boolean_t *lose);
+const char *
+print_data_expression(struct element *expr, isc_boolean_t *lose);
+const char *
+print_numeric_expression(struct element *expr, isc_boolean_t *lose);
+
+expr: expression to print
+lose: failure (??? in output) flag
+return: the text representing the expression
+
+Eval:
+-----
+
+struct element *
+eval_expression(struct element *expr, isc_boolean_t *modifiedp);
+struct element *
+eval_boolean_expression(struct element *expr, isc_boolean_t *modifiedp);
+struct element *
+eval_data_expression(struct element *expr, isc_boolean_t *modifiedp);
+struct element *
+eval_numeric_expression(struct element *expr, isc_boolean_t *modifiedp);
+
+expr: expression to evaluate
+modifiedp: a different element was returned (still false for updates
+ inside a map)
+return: the evaluated element (can have been updated for a map or a list,
+ or can be a fully different element)
+
+Evaluation happens at parsing time, so it is mainly constant propagation
+(no beta reduction, for instance).
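A minimal sketch of this kind of constant propagation, including the modifiedp
convention described above (set only when a different element is returned,
not when a child is updated in place). The toy_expr type and toy_eval function
are hypothetical stand-ins for struct element and the eval_* routines; only
concat of two literals is folded.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum toy_kind { TOY_LITERAL, TOY_OPTION, TOY_CONCAT };

struct toy_expr {
    enum toy_kind kind;
    const char *text;               /* literal content or option name */
    struct toy_expr *left, *right;  /* operands of TOY_CONCAT */
};

/* Fold concat(literal, literal) into one literal, bottom up. */
static struct toy_expr *
toy_eval(struct toy_expr *expr, int *modifiedp)
{
    int lmod = 0, rmod = 0;
    size_t llen, rlen;
    char *joined;
    struct toy_expr *folded;

    *modifiedp = 0;
    if (expr->kind != TOY_CONCAT)
        return expr;

    /* Children may be replaced: this is an update inside expr. */
    expr->left = toy_eval(expr->left, &lmod);
    expr->right = toy_eval(expr->right, &rmod);

    /* An option value is only known at runtime: nothing to fold. */
    if (expr->left->kind != TOY_LITERAL || expr->right->kind != TOY_LITERAL)
        return expr;

    llen = strlen(expr->left->text);
    rlen = strlen(expr->right->text);
    joined = malloc(llen + rlen + 1);
    folded = calloc(1, sizeof(*folded));
    assert(joined != NULL && folded != NULL);
    memcpy(joined, expr->left->text, llen);
    memcpy(joined + llen, expr->right->text, rlen + 1);
    folded->kind = TOY_LITERAL;
    folded->text = joined;

    /* A different element is returned. */
    *modifiedp = 1;
    return folded;
}
```

With concat(concat("foo", "bar"), option) the inner concat is folded to
"foobar" in place while the outer one is returned unchanged, so the caller
sees modifiedp still false.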
+
+Reduce:
+-------
+
+struct element *
+reduce_boolean_expression(struct element *expr);
+struct element *
+reduce_data_expression(struct element *expr);
+struct element *
+reduce_numeric_expression(struct element *expr);
+
+expr: expression to reduce
+return: NULL or the reduced expression as a Kea eval string
+
+Reducing works for a limited (but interesting) set of expressions which
+can be converted to Kea evaluatebool, and for literals.
+
+Part 5: Specific issues
+=======================
+
+Reservations:
+-------------
+ ISC DHCP host declarations are global, Kea reservations were per subnet
+ only until 1.5.
+ It is possible to use the fixed address but:
+ - it is possible to end up with orphan reservations, i.e.
+ reservations with an address which matches no subnet
+ - a reservation can have no fixed address. In this case the MA puts
+ the reservation in the last declared subnet.
+ - a reservation can have more than one fixed address and these
+ addresses can belong to different subnets. Current code pushes
+ IPv4 extra addresses in a commented extra-ip-addresses but
+ it is a legal feature for IPv6.
+ - it is not easy to use prefix6
+ The use of groups in host declarations is unclear.
+ ISC DHCP UID is mapped to client-id, host-identifier to flex-id.
+ Host reservation identifiers are generated on first use.
+
+Groups:
+-------
+TODO: search missing parameters from the Kea syntax.
+ (will be done in the third pass)
+
+Shared-Networks:
+----------------
+ Waiting for the feature to be supported by Kea.
+ Currently at the end of a shared network declaration:
+ - if there are no subnets it is a fatal error
+ - if there is one subnet the shared-network is squeezed
+ - if there is more than one subnet the shared-network is commented
+TODO (useful only with Kea support for shared networks): combine permit /
+deny classes (e.g. create negation) and pop filters to subnets when
+there is one pool.
+
+Vendor-Classes and User-Classes:
+--------------------------------
+ ISC DHCP code is inconsistent: in particular before setting the
+ super-class "tname" to "implicit-vendor-class" / "implicit-user-class"
+ it allocates a buffer for data but does not copy the lexical value
+ "val" into it... So I removed support.
+
+Classes:
+--------
+ Only pure client-classes are supported by Kea.
+ Dynamic/deleted stuff is not supported but does it make sense?
+ Spawning classes is not supported.
+ Match class selector is converted to Kea eval test when the corresponding
+ expression can be reduced. Fortunately it seems to be the common case!
+ Lease limit is not supported.
+
+Subclasses:
+-----------
+ Understood how it works:
+ - (super) class defined with a MATCH <data-expression> (vs.
+ MATCH IF <boolean-expression>)
+ - subclasses defined by <superclass-name> <data-literal> which
+ are equivalent to
+ MATCH IF <superclass-data-expression> EQUAL <data-literal>
+ So subclasses are convertible when the data expression can be reduced.
+ Cf https://kb.isc.org/article/AA-01092/202/OMAPI-support-for-classes-and-subclasses.html
+ which BTW suggests the management API could manage classes...
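The subclass equivalence above can be sketched as a string rewrite. This
assumes the superclass data expression has already been reduced to a Kea
expression string (NULL when it cannot be reduced); toy_subclass_test is a
hypothetical helper, not a function of the MA. The output uses the Kea
expression syntax with '==' and a single-quoted string literal.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<super_expr> == '<literal>'" into out (size bytes).
 * Returns 0 on success and -1 when the superclass expression was not
 * reducible (NULL), when the literal would need quoting that this
 * sketch does not handle, or when out is too small.
 */
static int
toy_subclass_test(const char *super_expr, const char *literal,
                  char *out, size_t size)
{
    int n;

    if (super_expr == NULL || literal == NULL)
        return -1;
    if (strchr(literal, '\'') != NULL)
        return -1;
    n = snprintf(out, size, "%s == '%s'", super_expr, literal);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

For a superclass reduced to option[60].hex and a subclass literal docsis, the
resulting test is option[60].hex == 'docsis'.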
+
+Hardware Addresses:
+-------------------
+ Kea supports only Ethernet.
+
+Pools:
+------
+ No permissions are supported by Kea, with the exception of class members,
+ but in a very different way, so they are not convertible.
+ Mixed DHCPv6 address and prefix pools are not supported, perhaps in this
+ case the pool should be duplicated into pool and pd-pool instances?
+ The bootp stuff was ifdef'ed as bootp is obsolete.
+ Temporary (aka IA_TA) is commented by the MA.
+ ISC DHCP supports interval ranges for prefix6. Kea has a different
+ and IMHO more powerful model.
+ Pool6 permissions are not supported.
+
+Failover:
+---------
+ Display a warning on the first use.
+
+Interfaces:
+-----------
+ Referenced interface names are pushed to an interfaces-config but it is
+ very (too!) easy to end up with a Kea config without any interface.
+
+Hostnames:
+----------
+ ISC DHCP does dynamic resolution in parse_ip_addr_or_hostname.
+ Static (at conversion time) resolution to one address is done by
+ the MA for fixed-address. Resolution is considered painful:
+ there are better (and safer) ways to do this. The -r (resolve)
+ command line parameter controls the at-conversion-time resolution.
+ Note only the first address is returned.
+TODO: check the multiple address comment is correctly taken
+ (need a known host resolving to a stable set of addresses)
+
+Options:
+--------
+ Some options are known only in ISC DHCP (almost fixed), a few only by Kea.
+ Formats are supposed to be the same, the only known exception
+ (DHCPv4 domain-search) was fixed by #5087.
+ For option spaces DHCPv4 vendor-encapsulated-options (code 43, in general
+ associated to vendor-class-identifier code 60) uses a dedicated feature
+ which had no equivalent in Kea (fixed).
+ Option definitions are convertible with a few exceptions:
+ - no support in Kea for an array of records (mainly by the lack
+ of a corresponding syntax). BTW there is no known use too.
+ - no support in Kea for an array at the end of a record (fixed)
+ All unsupported option declarations are set to full binary (X).
+ - X format means ASCII or hexa:
+ * standard options are in general mapped to binary
+ * new options are mapped to string with format x (vs x)
+ * when a string got hexadecimal data a warning is added in comments
+ suggesting to switch to plain binary.
+ - ISC DHCP uses quotes for a domain-list but not for a domain-name,
+ this is not very coherent and makes a domain-list different from
+ a domain-name array.
+Each time option data has a format which is not convertible,
+CSV false binary data is produced.
+ We have no example in ISC DHCP, Kea or standard but it is possible
+ that an option defined as a fixed-size record followed by
+ (encapsulated) suboptions triggers bugs (it already breaks toElement).
+ For operations on options ISC DHCP has supersede, send, append,
+ prepend and default (set if not yet present); Kea puts them in code order
+ with a few built-in exceptions.
+ Finally, the way to force Kea to add an option to a response
+ is quite different and can't be automatically translated (cf Kea #250).
+
+Duplicates:
+-----------
+ Many things in ISC DHCP can be duplicated:
+ - options can be redefined
+ - same host identifier used twice
+ - same fixed address used in two different hosts
+ etc.
+ Kea is far more strict and IMHO it is a good thing. Now the MA does
+ no particular check and multiple definitions work only for classes
+ (because it is the way the ISC DHCP parser works).
+ If we have Docsis space options, they are standard in Kea so they
+ will conflict.
+
+Dynamic DNS:
+------------
+ Details are very different so the MA maps only basic parameters
+ at the global scope.
+
+Expressions:
+------------
+ ISC DHCP expressions are typed: boolean, numeric, and data aka string.
+ The default for a literal is to be a string so literal numbers are
+ interpreted in hexadecimal (for a strange consequence look at
+ https://kb.isc.org/article/AA-00334/56/Do-the-list-of-parameters-in-the-dhcp-parameter-request-list-need-to-be-in-hex.html ).
+ String literals are converted to string elements, hexadecimal literals
+ are converted to const-data maps.
+TODO reduce more hexa aka const-data
+ As booleans are not data there is no way to fix this:
+ /tmp/bool line 9: Expecting a data expression.
+ option ip-forwarding = foo = foo;
+ ^
+ Cf Kea #247
+ The tautology 'foo = foo' is not a data expression so is rejected by
+ both the MA and dhcpd (BTW the role of the MA is not to fix ISC DHCP
+ shortcomings so it does what it is expected to do here).
+ Note this does not work either:
+ option ip-forwarding = true;
+ because "true" is not a keyword and it is converted into a variable
+ reference... And I expect ISC DHCP makes this true a false at runtime
+ because the variable "true" is not defined by default.
+ Reduced expressions are pretty printed to allow an extra check.
+ Hardware for DHCPv4 is expanded into a concatenation of hw-type and
+ hw-address, this allows simplifying expressions where only one is used.
+
+Variables:
+----------
+ ISC DHCP has a notion of variables in a scope where the scope can be
+ a lexical scope in the config or a scope in a function body
+ (ISC DHCP has even an unused "let" statement).
+ There is a variant of bindings for lease files using types and able
+ to recognize booleans and numbers. Unfortunately this is very specific...
+
+TODO:
+ - global host reservations
+ - class like if statement
+ - add more tests for classes in pools and class generation