summaryrefslogtreecommitdiff
path: root/POSIX.STD
blob: d40433ff908b72a1b914af5d512f554cf3af69ac (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
  Copyright (C) 1992, 1995, 1998, 2001, 2006, 2007, 2010, 2011, 2015, 2019
  Free Software Foundation, Inc.
  
  Copying and distribution of this file, with or without modification,
  are permitted in any medium without royalty provided the copyright
  notice and this notice are preserved.
--------------------------------------------------------------------------
Sun Apr 21 14:27:19 IDT 2019
============================
This file documents several things related to the 2008 POSIX standard
that I noted after reviewing it.

1. POSIX leaves undefined what happens for something like

	awk '{ print ; exit }' if=42 /etc/passwd

   Mawk diagnoses this.  Gawk and BWK awk do not. This doesn't seem to be
   worth the effort to add the code, but at least I'm aware of it.

2. The 2001-2004 standards accidentally required support for hexadecimal
   floating point constants and for Infinity and Not-A-Number (NaN) values.

   The 2008 standard now explicitly allows, but does not require, such
   support.

   More discussion is provided in the node `POSIX Floating Point Problems'
   in gawk.texi.

3. String comparison with <, <= etc is supposed to take the locale's collating
   sequence into account.  By default gawk doesn't do this. Rather, gawk
   will do this only if --posix is in effect.

4. According to POSIX, the function parameters of one function may not have
   the same name as another function, making this invalid:

	function foo() { ... }
	function bar(foo) { ...}

   Or even:

	function bar(foo) { ...}
	function foo() { ... }

   Gawk enforces this only with --posix.

5. According to POSIX (following the A, K, & W book), when RS = "", then
   newline is also a separator, "no matter what the value of FS is", implying
   that this is true even if FS is a regexp.

   In fact, UNIX awk has never behaved that, way; it adds \n to the list
   for FS = " " or any other single character, and gawk does too. This is
   essentially a bug in POSIX.

The following things aren't described by POSIX but ought to be:

1. The value of $0 in an END rule

2. The return value of a function that either does return with no value
   or that falls off the end of the function body.

3. What happens with substr() if start is <= 0, or greater than the length
   of the string, or if length is <= 0.

4. Whether "next" can be invoked from a function body.