Part 1: Kea Migration Assistant support
=======================================

Files:
------
 - data.h (tailq list and element type declarations)
 - data.c (element type code)
 - keama.h (DHCP declarations)
 - keama.c (main() code)
 - json.c (JSON parser)
 - option.c (option tables and code)
 - keama.8 (man page)

The code heavily uses tailq lists, i.e. doubly linked lists with
a pointer to the last (tail) element.

The element structure mimics the Kea Element class with a few differences:
 - no smart pointers
 - extra fields to handle declaration kind, skip and comments
 - maps are implemented as lists with an extra key field so the order
  of insertion is kept and duplicates are possible
 - strings are length + content (vs C strings)
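
A minimal sketch of the "map as a list with a key field" idea follows.
All names (toy_entry, toy_map, toy_map_append, toy_map_find) are
hypothetical and much simpler than the real declarations in data.h;
the point is only that a tail pointer gives cheap appends, insertion
order is kept, duplicate keys are accepted, and content is stored as
length + bytes. Like the real code, it makes no attempt to free memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a map entry. */
struct toy_entry {
    struct toy_entry *next;  /* NULL at the tail */
    struct toy_entry *prev;  /* NULL at the head */
    char *key;               /* extra key field: makes the list a map */
    size_t length;           /* content is length + bytes ... */
    char *content;           /* ... so embedded NUL bytes are allowed */
};

struct toy_map {
    struct toy_entry *first;
    struct toy_entry *last;  /* tail pointer: O(1) append */
};

/* Append (key, content) at the tail; duplicate keys are kept. */
static struct toy_entry *
toy_map_append(struct toy_map *m, const char *key,
               const char *content, size_t length)
{
    struct toy_entry *e;

    assert(m != NULL && key != NULL && content != NULL);
    e = calloc(1, sizeof(*e));
    if (e == NULL)
        return NULL;
    e->key = malloc(strlen(key) + 1);
    e->content = malloc(length + 1);
    if (e->key == NULL || e->content == NULL) {
        free(e->key);
        free(e->content);
        free(e);
        return NULL;
    }
    strcpy(e->key, key);
    memcpy(e->content, content, length);
    e->length = length;
    e->prev = m->last;
    if (m->last != NULL)
        m->last->next = e;
    else
        m->first = e;
    m->last = e;
    return e;
}

/* Return the n-th (0-based) entry with this key, or NULL. */
static struct toy_entry *
toy_map_find(const struct toy_map *m, const char *key, unsigned n)
{
    struct toy_entry *e;

    for (e = m->first; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0 && n-- == 0)
            return e;
    return NULL;
}
```

Appending "option" twice leaves both entries reachable, in insertion
order, through toy_map_find(m, "option", 0) and (m, "option", 1).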

There is no attempt to avoid memory leaks.

The skip flag is printed as '//' at the beginning of lines. It is set
when something cannot be converted, and the issue counter (returned
by the keama command) is incremented.
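
As an illustration only, the skip mechanism can be pictured as below.
The toy_item structure and both functions are hypothetical; the real
element type carries many more fields and the real printer handles
nested elements.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, reduced view of an element. */
struct toy_item {
    const char *text;  /* one already formatted output line */
    int skip;          /* non-zero: could not be converted */
};

/* Mark an item as not convertible and count the issue. */
static void
toy_mark_skip(struct toy_item *it, unsigned *issues)
{
    assert(it != NULL && issues != NULL);
    it->skip = 1;
    (*issues)++;
}

/* Format one line, prefixed by "//" when the skip flag is set.
 * Returns what snprintf returns (the untruncated length). */
static int
toy_format_item(char *buf, size_t size, const struct toy_item *it)
{
    assert(buf != NULL && it != NULL);
    return snprintf(buf, size, "%s%s\n", it->skip ? "//" : "", it->text);
}
```

A caller would report the final value of the issue counter once the
whole configuration has been processed.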

Part 2: ISC DHCP lexer organization
===================================

Files:
-----
 - dhctoken.h (from includes, enum dhcp_token definition)
 - conflex.c (from common, lexical analyzer code)

Tokens (dhcp_token enum): characters are set to their ASCII value,
 others are >= 256 without real organization (e.g. END_OF_FILE is 607).

The state is in a parse structure named "cfile". There is one per file
and a few routines save it in order to backtrack over a larger
set than the usual lookahead.
The largest function is intern() which recognizes keywords with
a switch on the first character and a tree of if strcasecmp's.

Standard routines:
-----------------
enum dhcp_token
next_token(const char **rval, unsigned *rlen, struct parse *cfile);

and

enum dhcp_token
peek_token(const char **rval, unsigned *rlen, struct parse *cfile);

rval: if not null the content of the token is put in it
rlen: if not null the length of the token is put in it
cfile: lexer context
return: the integer value of the token
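
The relation between the two routines (peek fills a one-token
lookahead, next consumes it) can be sketched with a toy lexer. The
toy_parse structure, the TOY_* token values and the tokenization rules
are assumptions made for the example. Unlike the real routines, the
sketch returns a pointer into the input buffer, so the token text is
only delimited by its length.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define TOY_END  256   /* hypothetical end-of-input token */
#define TOY_WORD 257   /* hypothetical "any other word" token */

struct toy_parse {
    const char *pos;       /* next unread character */
    int have_peek;         /* non-zero when the lookahead is valid */
    int peek_tok;
    const char *peek_val;
    unsigned peek_len;
};

/* Scan one token at cfile->pos, ignoring the lookahead. */
static int
toy_scan(const char **val, unsigned *len, struct toy_parse *cfile)
{
    const char *p = cfile->pos;

    while (isspace((unsigned char)*p))
        p++;
    *val = p;
    if (*p == '\0') {
        *len = 0;
        cfile->pos = p;
        return TOY_END;
    }
    if (strchr(";{}", *p) != NULL) {
        *len = 1;
        cfile->pos = p + 1;
        return (unsigned char)*p;  /* character token: ASCII value */
    }
    while (*p != '\0' && !isspace((unsigned char)*p) &&
           strchr(";{}", *p) == NULL)
        p++;
    *len = (unsigned)(p - *val);
    cfile->pos = p;
    return TOY_WORD;
}

/* Return the next token without consuming it. */
static int
toy_peek_token(const char **rval, unsigned *rlen, struct toy_parse *cfile)
{
    assert(cfile != NULL);
    if (!cfile->have_peek) {
        cfile->peek_tok = toy_scan(&cfile->peek_val, &cfile->peek_len,
                                   cfile);
        cfile->have_peek = 1;
    }
    if (rval != NULL)
        *rval = cfile->peek_val;
    if (rlen != NULL)
        *rlen = cfile->peek_len;
    return cfile->peek_tok;
}

/* Return the next token and consume it. */
static int
toy_next_token(const char **rval, unsigned *rlen, struct toy_parse *cfile)
{
    int tok = toy_peek_token(rval, rlen, cfile);

    cfile->have_peek = 0;
    return tok;
}
```

As in the prototypes above, rval and rlen may be NULL when the caller
only needs the token value.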

Changes:
-------

Added LBRACKET '[' and RBRACKET ']' tokens for JSON parser
(switch on dhcp_token type).

Added comments to collect ISC DHCP # comments, element stack to follow
declaration hierarchy, and issue counter to struct parse.

Moved the parse_warn (renamed into parse_error and made fatal) routine
from conflex.c to keama.c

Part 3: ISC DHCP parser organization
====================================

Files:
-----
 - confparse.c (from server)
  for the server in parse_statement())
 - parse.c (from common)

4 classes: parameters, declarations, executable statements and expressions.

The original code parses config and lease files; I kept only the former,
with the exception of parse_binding_value().

entry point
  |
  V
conf_file_parse
  |
  V
conf_file_subparse <- read_conf_file (for include)
 until END_OF_FILE call
  |
  V
parse_statement
 parse parameters and declarations
 switch on token and call parse_xxx_declaration routines
 on default or DHCPv6 token in DHCPv4 mode call parse_executable_statement
  and put the result under the "statement" key
    |
    V
parse_executable_statement

According to comments the grammar is:

   conf-file :== parameters declarations END_OF_FILE
   parameters :== <nil> | parameter | parameters parameter
   declarations :== <nil> | declaration | declarations declaration

   statement :== parameter | declaration

   parameter :== DEFAULT_LEASE_TIME lease_time
               | MAX_LEASE_TIME lease_time
               | DYNAMIC_BOOTP_LEASE_CUTOFF date
               | DYNAMIC_BOOTP_LEASE_LENGTH lease_time
               | BOOT_UNKNOWN_CLIENTS boolean
               | ONE_LEASE_PER_CLIENT boolean
               | GET_LEASE_HOSTNAMES boolean
               | USE_HOST_DECL_NAME boolean
               | NEXT_SERVER ip-addr-or-hostname SEMI
               | option_parameter
               | SERVER-IDENTIFIER ip-addr-or-hostname SEMI
               | FILENAME string-parameter
               | SERVER_NAME string-parameter
               | hardware-parameter
               | fixed-address-parameter
               | ALLOW allow-deny-keyword
               | DENY allow-deny-keyword
               | USE_LEASE_ADDR_FOR_DEFAULT_ROUTE boolean
               | AUTHORITATIVE
               | NOT AUTHORITATIVE

   declaration :== host-declaration
                 | group-declaration
                 | shared-network-declaration
                 | subnet-declaration
                 | VENDOR_CLASS class-declaration
                 | USER_CLASS class-declaration
                 | RANGE address-range-declaration

Typically declarations use { } and are associated with a group
(changed to a type) in ROOT_GROUP (global), HOST_DECL, SHARED_NET_DECL,
SUBNET_DECL, CLASS_DECL, GROUP_DECL and POOL_DECL.

ROOT: parent = TOPLEVEL, children = everything but POOL
HOST: parent = ROOT, GROUP, warn on SHARED or SUBNET, children = none
SHARED_NET: parent = ROOT, GROUP, children = HOST (warn), SUBNET, POOL4
SUBNET: parent = ROOT, GROUP, SHARED, children = HOST (warn), POOL
CLASS: parent = ROOT, GROUP, children = none
GROUP: parent = ROOT, SHARED, children = anything but POOL
POOL: parent = SHARED4, SUBNET, warn on others, children = none

isc_boolean_t
parse_statement(struct parse *cfile, int type, isc_boolean_t declaration);

cfile: parser context
type: declaration type
declaration and return: declaration or parameter

On the common side:

   executable-statements :== executable-statement executable-statements |
                             executable-statement
 
   executable-statement :==
        IF if-statement |
        ADD class-name SEMI |
        BREAK SEMI |
        OPTION option-parameter SEMI |
        SUPERSEDE option-parameter SEMI |
        PREPEND option-parameter SEMI |
        APPEND option-parameter SEMI

isc_boolean_t
parse_executable_statement(struct element *result,
                           struct parse *cfile, isc_boolean_t *lose,
                           enum expression_context case_context,
                           isc_boolean_t direct);

result: map element where to put the statement
cfile: parser context
lose: set to ISC_TRUE on failure
case_context: expression context
direct: called directly by parse_statement so can execute config statements
return: success

parse_executable_statement
 switch on keywords (far more than in the comments)
 on default with an identifier try a config option, on number or name
  call parse_expression for a function call
    |
    V
parse_expression

expressions are divided into boolean, data (string) and numeric expressions

   boolean_expression :== CHECK STRING |
                          NOT boolean-expression |
                          data-expression EQUAL data-expression |
                          data-expression BANG EQUAL data-expression |
                          data-expression REGEX_MATCH data-expression |
                          boolean-expression AND boolean-expression |
                          boolean-expression OR boolean-expression |
                          EXISTS OPTION-NAME

   data_expression :== SUBSTRING LPAREN data-expression COMMA
                                        numeric-expression COMMA
                                        numeric-expression RPAREN |
                       CONCAT LPAREN data-expression COMMA
                                        data-expression RPAREN |
                       SUFFIX LPAREN data_expression COMMA
                                     numeric-expression RPAREN |
                       LCASE LPAREN data_expression RPAREN |
                       UCASE LPAREN data_expression RPAREN |
                       OPTION option_name |
                       HARDWARE |
                       PACKET LPAREN numeric-expression COMMA
                                     numeric-expression RPAREN |
                       V6RELAY LPAREN numeric-expression COMMA
                                      data-expression RPAREN |
                       STRING |
                       colon_separated_hex_list

   numeric-expression :== EXTRACT_INT LPAREN data-expression
                                             COMMA number RPAREN |
                          NUMBER

parse_boolean_expression, parse_data_expression and parse_numeric_expression
call parse_expression and check its result

parse_expression itself is divided into parse_non_binary and internal
handling of binary operators

isc_boolean_t
parse_non_binary(struct element *expr, struct parse *cfile,
                 isc_boolean_t *lose, enum expression_context context)

isc_boolean_t
parse_expression(struct element *expr, struct parse *cfile,
                 isc_boolean_t *lose, enum expression_context context,
                 struct element *lhs, enum expr_op binop)

expr: map element where to put the result
cfile: parser context
lose: set to ISC_TRUE on failure
context: expression context
lhs: NULL or left hand side
binop: expr_none or binary operation
return: success

parse_non_binary
 switch on unary and nullary operator keywords
 on default try a variable reference or a function call

parse_expression
 call parse_non_binary to get the right hand side
 switch on binary operator keywords to get the next operation
 with only one side: if expr_none return, else get the second side
 handle operator precedence, can call itself
 return a map entry with the operator name as the key, and
 left and right expression branches
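
For readers unfamiliar with this kind of parser, here is a small
sketch of operator precedence handling by precedence climbing. It is
a swapped-in textbook technique, not the actual parse_expression
(which passes lhs and binop explicitly). The token stream, the three
operators and their precedences are assumptions. The result is a
prefix string, op(left,right), standing in for the map entry keyed by
the operator name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical token stream: an array of words and a cursor. */
struct toy_tokens {
    const char **words;
    int count;
    int next;
};

/* Assumed precedences (illustrative); 0 means "not an operator". */
static int
toy_precedence(const char *word)
{
    if (word == NULL)
        return 0;
    if (strcmp(word, "or") == 0)
        return 1;
    if (strcmp(word, "and") == 0)
        return 2;
    if (strcmp(word, "=") == 0)
        return 3;
    return 0;
}

static const char *
toy_peek(const struct toy_tokens *t)
{
    return t->next < t->count ? t->words[t->next] : NULL;
}

/* Parse an expression whose operators all have a precedence of at
 * least min_prec. Returns 1 on success, 0 on a missing operand. */
static int
toy_expression(struct toy_tokens *t, int min_prec, char *out, size_t size)
{
    char lhs[256], rhs[256];
    const char *op;

    assert(t != NULL && out != NULL && size > 0);
    if (toy_peek(t) == NULL || toy_precedence(toy_peek(t)) != 0)
        return 0;                   /* an operand was expected */
    snprintf(lhs, sizeof(lhs), "%s", t->words[t->next++]);
    while ((op = toy_peek(t)) != NULL &&
           toy_precedence(op) >= min_prec) {
        t->next++;
        /* The right side only takes tighter-binding operators. */
        if (!toy_expression(t, toy_precedence(op) + 1, rhs, sizeof(rhs)))
            return 0;
        snprintf(out, size, "%s(%s,%s)", op, lhs, rhs);
        snprintf(lhs, sizeof(lhs), "%s", out);
    }
    snprintf(out, size, "%s", lhs);
    return 1;
}
```

Called with min_prec 1 on the words a, or, b, and, c, the sketch groups
the "and" first because it binds more tightly than "or".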

Part 4: Expression processing
=============================

Files:
------
 - print.c (new)
 - eval.c (new)
 - reduce.c (new)

Print:
------

const char *
print_expression(struct element *expr, isc_boolean_t *lose);
const char *
print_boolean_expression(struct element *expr, isc_boolean_t *lose);
const char *
print_data_expression(struct element *expr, isc_boolean_t *lose);
const char *
print_numeric_expression(struct element *expr, isc_boolean_t *lose);

expr: expression to print
lose: failure (??? in output) flag
return: the text representing the expression

Eval:
-----

struct element *
eval_expression(struct element *expr, isc_boolean_t *modifiedp);
struct element *
eval_boolean_expression(struct element *expr, isc_boolean_t *modifiedp);
struct element *
eval_data_expression(struct element *expr, isc_boolean_t *modifiedp);
struct element *
eval_numeric_expression(struct element *expr, isc_boolean_t *modifiedp);

expr: expression to evaluate
modifiedp: a different element was returned (still false for updates
 inside a map)
return: the evaluated element (can have been updated for a map or a list,
 or can be a fully different element)

Evaluation is at parsing time so it is mainly a constant propagation.
(no beta reduction for instance)
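
A minimal sketch of what constant propagation with a "modified" flag
can look like, on a toy boolean tree. The node type and toy_eval are
hypothetical and far simpler than eval.c. The sketch follows the
convention described above: *modifiedp is set only when a different
node is returned, not when a child was replaced in place.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { K_BOOL, K_VAR, K_NOT, K_AND };

struct toy_node {
    enum toy_kind kind;
    int value;                      /* K_BOOL only */
    struct toy_node *left, *right;  /* operands (right: K_AND only) */
};

static struct toy_node toy_true = { K_BOOL, 1, NULL, NULL };
static struct toy_node toy_false = { K_BOOL, 0, NULL, NULL };

/* Fold constant sub-expressions; variables are left untouched. */
static struct toy_node *
toy_eval(struct toy_node *expr, int *modifiedp)
{
    int ignored;

    assert(expr != NULL && modifiedp != NULL);
    *modifiedp = 0;
    switch (expr->kind) {
    case K_NOT:
        expr->left = toy_eval(expr->left, &ignored);
        if (expr->left->kind != K_BOOL)
            return expr;            /* same node, child maybe updated */
        *modifiedp = 1;
        return expr->left->value ? &toy_false : &toy_true;
    case K_AND:
        expr->left = toy_eval(expr->left, &ignored);
        expr->right = toy_eval(expr->right, &ignored);
        if (expr->left->kind != K_BOOL || expr->right->kind != K_BOOL)
            return expr;
        *modifiedp = 1;
        return (expr->left->value && expr->right->value) ?
            &toy_true : &toy_false;
    default:
        return expr;                /* literals and variables */
    }
}
```

An expression mixing a variable and a constant sub-expression keeps
its top node, but the constant branch is replaced by a literal.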

Reduce:
-------

struct element *
reduce_boolean_expression(struct element *expr);
struct element *
reduce_data_expression(struct element *expr);
struct element *
reduce_numeric_expression(struct element *expr);

expr: expression to reduce
return: NULL or the reduced expression as a Kea eval string

Reducing works for a limited (but interesting) set of expressions which
can be converted to Kea evaluatebool, and for literals.

Part 5: Specific issues
=======================

Reservations:
-------------
 ISC DHCP host declarations are global, Kea reservations were per subnet
 only until 1.5.
 It is possible to use the fixed address but:
  - it is possible to end up with orphan reservations, i.e.
   reservations with an address which matches no subnet
  - a reservation can have no fixed address. In this case the MA puts
   the reservation in the last declared subnet.
  - a reservation can have more than one fixed address and these
   addresses can belong to different subnets. Current code pushes
   IPv4 extra addresses in a commented extra-ip-addresses but
   it is a legal feature for IPv6.
  - it is not easy to use prefix6
 The use of groups in host declarations is unclear.
 ISC DHCP UID is mapped to client-id, host-identifier to flex-id
 Host reservation identifiers are generated on first use.

Groups:
-------
TODO: search missing parameters from the Kea syntax.
 (will be done in the third pass)

Shared-Networks:
----------------
 Waiting for the feature to be supported by Kea.
 Currently at the end of a shared network declaration:
  - if there are no subnets it is a fatal error
  - if there is one subnet the shared-network is squeezed
  - if there is more than one subnet the shared-network is commented
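
The three cases above amount to a simple decision on the subnet count.
The enum and function names below are hypothetical; the sketch only
restates the rule.

```c
#include <assert.h>

enum toy_action {
    TOY_FATAL,    /* no subnet: fatal error */
    TOY_SQUEEZE,  /* one subnet: the shared-network is squeezed */
    TOY_COMMENT   /* several subnets: the shared-network is commented */
};

/* Decide what to do at the end of a shared network declaration. */
static enum toy_action
toy_shared_network_action(unsigned subnet_count)
{
    if (subnet_count == 0)
        return TOY_FATAL;
    if (subnet_count == 1)
        return TOY_SQUEEZE;
    return TOY_COMMENT;
}
```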
TODO (useful only with Kea support for shared networks): combine permit /
deny classes (e.g. create negation) and pop filters to subnets when
there is one pool.

Vendor-Classes and User-Classes:
--------------------------------
 ISC DHCP code is inconsistent: in particular before setting the
 super-class "tname" to "implicit-vendor-class" / "implicit-user-class"
 it allocates a buffer for data but does not copy the lexical value
 "val" into it... So I removed support.

Classes:
--------
 Only pure client-classes are supported by Kea.
 Dynamic/deleted stuff is not supported but does it make sense?
 Spawning classes is not supported.
 Match class selector is converted to a Kea eval test when the corresponding
 expression can be reduced. Fortunately it seems to be the common case!
 Lease limit is not supported.

Subclasses:
-----------
 Understood how it works:
  - (super) class defined with a MATCH <data-expression> (vs.
   MATCH IF <boolean-expression>)
  - subclasses defined by <superclass-name> <data-literal> which
   are equivalent to
   MATCH IF <superclass-data-expression> EQUAL <data-literal>
 So subclasses are convertible when the data expression can be reduced.
 Cf https://kb.isc.org/article/AA-01092/202/OMAPI-support-for-classes-and-subclasses.html
  which BTW suggests the management API could manage classes...
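
 Assuming the superclass data expression and the subclass literal have
 both already been reduced to Kea expression text, the equivalence
 above boils down to joining them with an equality operator. The
 function below is a hypothetical sketch of that step, not the MA
 code; the operand strings in the usage note are examples.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<super_expr> == <literal>" in buf.
 * Returns 1 on success, 0 when buf is too small. */
static int
toy_subclass_test(char *buf, size_t size,
                  const char *super_expr, const char *literal)
{
    size_t need;

    assert(buf != NULL && super_expr != NULL && literal != NULL);
    need = strlen(super_expr) + strlen(" == ") + strlen(literal) + 1;
    if (need > size)
        return 0;
    snprintf(buf, size, "%s == %s", super_expr, literal);
    return 1;
}
```

 For instance a superclass expression "option[60].hex" and a literal
 "'docsis'" would give the test "option[60].hex == 'docsis'".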

Hardware Addresses:
-------------------
 Kea supports only Ethernet.

Pools:
------
 Kea supports no permissions with the exception of class members,
 but in a very different way so they are not convertible.
 Mixed DHCPv6 address and prefix pools are not supported, perhaps in this
 case the pool should be duplicated into pool and pd-pool instances?
 The bootp stuff was ifdef'ed as bootp is obsolete.
 Temporary (aka IA_TA) is commented by the MA.
 ISC DHCP supports interval ranges for prefix6. Kea has a different
 and IMHO more powerful model.
 Pool6 permissions are not supported.

Failover:
---------
 Display a warning on the first use.

Interfaces:
-----------
 Referenced interface names are pushed to an interfaces-config but it is
 very (too!) easy to end up with a Kea config without any interface.

Hostnames:
----------
 ISC DHCP does dynamic resolution in parse_ip_addr_or_hostname.
 Static (at conversion time) resolution to one address is done by
 the MA for fixed-address. Resolution is considered painful:
 there are better (and safer) ways to do this. The -r (resolve)
 command line parameter controls the at-conversion-time resolution.
 Note only the first address is returned.
TODO: check the multiple address comment is correctly taken
 (need a known host resolving in a stable set of addresses)

Options:
--------
 Some options are known only in ISC DHCP (almost fixed), a few only by Kea.
 Formats are supposed to be the same, the only known exception
 (DHCPv4 domain-search) was fixed by #5087.
 For option spaces DHCPv4 vendor-encapsulated-options (code 43, in general
 associated to vendor-class-identifier code 60) uses a dedicated feature
 which had no equivalent in Kea (fixed).
 Option definitions are convertible with a few exceptions:
  - no support in Kea for an array of records (mainly for lack
   of a corresponding syntax). BTW there is no known use either.
  - no support in Kea for an array at the end of a record (fixed)
   All unsupported option declarations are set to full binary (X).
  - X format means ASCII or hexa:
    * standard options are in general mapped to binary
    * new options are mapped to string with format x (vs x)
    * when a string got hexadecimal data a warning is added in comments
     suggesting to switch to plain binary.
  - ISC DHCP uses quotes for a domain-list but not for a domain-name,
   this is not very coherent and makes domain-list different from
   domain-name array.
Each time option data has a format which is not convertible,
CSV false binary data is produced.
 We have no example in ISC DHCP, Kea or standard but it is possible
 that an option defined as a fixed sized record followed by
 (encapsulated) suboptions fails (it already breaks toElement).
 For operations on options ISC DHCP has supersede, send, append,
 prepend, default (set if not yet present), Kea puts them in code order
 with a few built-in exceptions.
 Finally, the way to force Kea to add an option in a response
 is pretty different and can't be automatically translated (cf Kea #250).

Duplicates:
-----------
 Many things in ISC DHCP can be duplicated:
  - options can be redefined
  - same host identifier used twice
  - same fixed address used in two different hosts
  etc.
 Kea is far more strict and IMHO it is a good thing. Currently the MA
 does no particular check and multiple definitions work only for classes
 (because it is the way the ISC DHCP parser works).
 If we have Docsis space options, they are standard in Kea so they
 will conflict.

Dynamic DNS:
------------
 Details are very different so the MA maps only basic parameters
 at the global scope.

Expressions:
------------
 ISC DHCP expressions are typed: boolean, numeric, and data aka string.
 The default for a literal is to be a string so literal numbers are
 interpreted in hexadecimal (for a strange consequence look at
 https://kb.isc.org/article/AA-00334/56/Do-the-list-of-parameters-in-the-dhcp-parameter-request-list-need-to-be-in-hex.html ).
 String literals are converted to string elements, hexadecimal literals
 are converted to const-data maps.
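
 To make the hexadecimal reading concrete, here is a sketch of how a
 colon separated hex list such as 1:2a:ff turns into bytes. The
 function is hypothetical (the real lexer and parser work on tokens,
 not on a whole string) and accepts one or two hex digits per byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1. */
static int
toy_hex_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Parse "1:2a:ff" into out[]. Returns the byte count, or -1 on a
 * syntax error or when out[] is too small. */
static int
toy_parse_hex_list(const char *text, unsigned char *out, size_t size)
{
    const char *p = text;
    size_t count = 0;

    assert(text != NULL && out != NULL);
    for (;;) {
        int byte = toy_hex_value((unsigned char)*p);
        int low;

        if (byte < 0)
            return -1;              /* a hex digit was expected */
        p++;
        low = toy_hex_value((unsigned char)*p);
        if (low >= 0) {
            byte = byte * 16 + low; /* second digit of the byte */
            p++;
        }
        if (count >= size)
            return -1;
        out[count++] = (unsigned char)byte;
        if (*p == '\0')
            return (int)count;
        if (*p != ':')
            return -1;
        p++;
    }
}
```

 Note that in this reading "10" is the single byte 0x10 (16), which is
 the kind of surprise the KB article above discusses.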
TODO reduce more hexa aka const-data
 As booleans are not data there is no way to fix this:
  /tmp/bool line 9: Expecting a data expression.
  option ip-forwarding = foo = foo;
                                ^
 Cf Kea #247
 The tautology 'foo = foo' is not a data expression so is rejected by
 both the MA and dhcpd (BTW the role of the MA is not to fix ISC DHCP
 shortcomings so it does what it is expected to do here).
 Note this does not work either:
  option ip-forwarding = true;
 because "true" is not a keyword and it is converted into a variable
 reference... And I expect ISC DHCP makes this true a false at runtime
 because the variable "true" is not defined by default.
 Reduced expressions are pretty printed to allow an extra check.
 Hardware for DHCPv4 is expanded into a concatenation of hw-type and
 hw-address, which allows simplifying expressions where only one is used.

Variables:
----------
 ISC DHCP has a notion of variables in a scope where the scope can be
 a lexical scope in the config or a scope in a function body
 (ISC DHCP has even an unused "let" statement).
 There is a variant of bindings for lease files using types and able
 to recognize booleans and numbers. Unfortunately this is very specific...

TODO:
 - global host reservations
 - class like if statement
 - add more tests for classes in pools and class generation