summaryrefslogtreecommitdiff
path: root/TODO
blob: 8e98c70c56bc66cdfce7523a97ee86dafd5e3dca (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
Thu May  9 15:31:55 IDT 2013
============================

There were too many files tracking different thoughts and ideas for
things to do, or consider doing.  This file merges them into one. As
tasks are completed, they should be moved to the DONE section, below,
or simply removed.

Upon creation of a release (major or patch release), items from the
previous release should be removed.

TODO
====

Minor Cleanups and Code Improvements
------------------------------------

	Rationalize use of min/max/MIN/MAX in gawk code.

	API:
		??? #if !defined(GAWK) && !defined(GAWK_OMIT_CONVENIENCE_MACROS)

	?? Add debugger commands to reference card

	Look at function order within files.

	regex.h - remove underscores in param names

	Consider removing use of and/or need for the protos.h file.

	Review the bash source script for working with shared libraries in
	order to nuke the use of libtool.

	In test/Makefile.am and generation scripts, quote $(srcdir) etc.
	for windows or other systems with spaces in path names. Fun.

	Enhance profiling to same comments in a byte-code that does nothing
	but that can be used when pretty printing the program.

Minor New Features
------------------

	Enhance extension/fork.c waitpid to allow the caller to specify
	the options.  And add an optional array argument to wait and
	waitpid in which to return exit status information.

	Add a readfile() function in awk from mail.

	Make writes to ENVIRON / deletions affect the real environment
	using setenv / unsetenv.

	Consider relaxing the strictness of --posix.

	? Add an optional base to strtonum, allowing 2-36.

	? Optional third argument for index indicating where to start the
	search.

Major New Features
------------------

	Think about how to generalize indirect access. Manuel Collado
	suggests things like

		foo = 5
		@"foo" += 4

	Also needed:

		indirect calls of built-ins
		indirect calls of extension functions
		indirect through array elements, not just scalar variables

	Fix the early chapters in the doc with more up-to-date examples.
	No-one uses Bulletin Board Systems anymore.

	Rework management of array index storage. (Partially DONE.)

	Consider using an atom table for all string array indices.

	DBM storage of awk arrays. Try to allow multiple dbm packages.

	?? Some way to make regexp constants first class citizens
		- Assign to variables
		- Pass to functions

	?? A RECLEN variable for fixed-length record input. PROCINFO["RS"]
	would be "RS" or "RECLEN" depending upon what's in use.
	*** Could be done as an extension?

	?? Use a new or improved dfa and/or regex library.

Things To Think About That May Never Happen
-------------------------------------------

	Consider making shadowed variables a warning and not
	a fatal warning when --lint=fatal.

	Similar for extra parameters in a function call.

	?? Scope IDs for IPv6 addresses ??

	??? Gnulib

	Look at code coverage tools, like S2E: https://s2e.epfl.ch/
	
	Try running with diehard. See http://www.diehard-software.org,
	https://github.com/emeryberger/DieHard

	Change from dlopen to using the libltdl library (i.e. lt_dlopen).
	This may support more platforms.

	FIX regular field splitting to use FPAT algorithm.
		Note: Looked at this. Not sure it's with the trouble:
		If it ain't broke...

	Implement namespaces.  Arnold suggested the following in an email:
	- Extend the definition of an 'identifier' to include "." as a valid character
	  although an identifier can't start with it.
	- Extension libraries install functions and global variables with names
	  that have a "." in them:  XML.parse(), XML.name, whatever.
	- Awk code can read/write such variables and call such functions, but they
	  cannot define such functions
	 function XML.foo() { .. }	# error
	  or create a variable with such a name if it doesn't exist. This would
	  be a run-time error, not a parse-time error.
	- This last rule may be too restrictive.
	I don't want to get into fancy rules a la perl and file-scope visibility
	etc, I'd like to keep things simple.  But how we design this is going
	to be very important.

	Include a sample rpm spec file in a new packaging subdirectory.

	Patch lexer for @include and @load to make quotes optional.

	Do an optimization pass over parse tree?

	Consider integrating Fred Fish's DBUG library into gawk.

	Make 	awk '/foo/' files...	run at egrep speeds (how?)

	? Have strftime() pay attention to the value of ENVIRON["TZ"]

	Add a lint check if the return value of a function is used but
	the function did not supply a value.

	Consider making gawk output +nan for NaN values so that it
	will accept its own output as input.
		NOTE: Investigated this.  GLIBC formats NaN as '-nan'
		and -NaN as 'nan'.  Dealing with this is not simple.

	Enhance FIELDWIDTHS with some way to indicate "the rest of the record".
	E.g., a length of 0 or -1 or something.  Maybe "n"?

	Make FIELDWIDTHS be an array?

Code Review
-----------
awkgram.y
debug.c
eval.c
ext.c
field.c
floatcomp.c
floatmagic.h
gawkmisc.c
profile.c
protos.h

DONE
====

Minor Cleanups and Code Improvements
------------------------------------
Done in 4.1:

	Review all FIXME and TODO comments

	Fix all *assoc_lookup() = xxx  calls.

	Make GAWKDEBUG pass the test suite.

	Make fflush() and fflush("") both flush all files, as in BWK awk.

	In gawkapi.c - review switch statements and use of default.

	API:
		awk_true and awk_false
		Update doc to use gcc -o filefuncs.so -shared filefuncs.o
			instead of ld ...
		Have check for name not rely on isalpha, isalnum since
			the locale could botch that up. Also make it not
			fatal.

Minor New Features
------------------
Done in 4.1:

	Merge the chapter and the appendix on floating-point math (for 4.1).

Major New Features
------------------
Done in 4.1:

	Integration of array_iface branch.

	Design and implement I/O plugin API.

	Implement designed API for loadable modules

	Redo the loadable modules interface from the awk level.

	xgawk features (@load, -l, others)

	Merge gawk/pgawk/dgawk into one executable

	Merge xmlgawk XML extensions (via source forge project that
	      works with new API)

	Integrate MPFR to provide high precision arithmetic.

	Consider really implementing BWK awk SYMTAB for seeing what
	global variables are defined.

Things That We Decided We Will Never Do
---------------------------------------

	Consider moving var_value info into Node_var itself to reduce
	memory usage. This would break all uses of get_lhs in the
	code. It's too sweeping a change.

	Add macros for working with flags instead of using & and |
	directly.

Code Review
-----------
array.c
awk.h
builtin.c
cmd.h
command.y
custom.h
hard-locale.h
io.c
main.c
mbsupport.h
msg.c
node.c
re.c
replace.c
version.c
xalloc.h