|
You now have the capability of linking C subroutines into a
special version of perl. See the files in usub/ for an example.
There is now an operator to include library modules with duplicate
suppression and error checking, called "require". (makelib has been
renamed to h2ph, and Tom Christiansen's h2pl stuff has been included
too. Perl .h files are now called .ph files to avoid confusion.)
It's now possible to truncate files if your machine supports any
of ftruncate(fd, size), chsize(fd, size) or fcntl(fd, F_FREESP, size).
Added -c switch to do compilation only, that is, to suppress
execution. Useful in combination with -D1024.
There's now a -x switch to extract a script from the input stream
so you can pipe articles containing Perl scripts directly into perl.
Previously, the only places you could use bare words in Perl were as
filehandles or labels. You can now put bare words (identifiers)
anywhere. If they have no interpretation as filehandles or labels,
they will be treated as if they had single quotes around them.
This works together nicely with the fact that you can use a
symbol name indirectly as a filehandle or to assign to *name.
It basically means you can write subroutines and pass filehandles
without quoting or *-ing them. (It also means the grammar is even
more ambiguous now--59 reduce/reduce conflicts!!! But it seems
to do the Right Thing.)
Added __LINE__ and __FILE__ tokens to let you interpolate the
current line number or filename, such as in a call to an error
routine, or to help you translate eval linenumbers to real
linenumbers.
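As a minimal sketch of the error-routine use mentioned above, written for a current perl 5 rather than the release described here: complain() is a hypothetical helper, not something shipped with perl. Note that the tokens are passed as ordinary values; they are not expanded inside double-quoted strings.

```perl
use strict;
use warnings;

# Hypothetical helper: build an error message from a caller-supplied location.
sub complain {
    my ($msg, $file, $line) = @_;
    return "$msg at $file line $line\n";
}

my $here = __LINE__;    # the number of this very line
my $text = complain('bad input', __FILE__, __LINE__);
print $text;
```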
Added __END__ token to let you mark the end of the program in
the input stream. (^D and ^Z are allowed synonyms.) Program text
and data can now both come from STDIN.
`command` in array context now returns array of lines. Previously
it would return a single element array holding all the lines.
An empty %array now returns 0 in scalar context so that you can
use it profitably in a conditional: &blurfl if %seen;
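A small illustration of the conditional use, assuming a current perl 5 (the %seen and @log names are only for the example):

```perl
use strict;
use warnings;

my %seen;
my @log;

push @log, 'empty is true' if %seen;        # not reached: %seen has no keys
$seen{apple} = 1;
push @log, 'non-empty is true' if %seen;    # reached: %seen has one key

print "$_\n" for @log;
```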
The include search path (@INC) now includes . explicitly at the
end, so you can change it if you wish. Library routines now
have precedence by default.
Several pattern matching optimizations: I sped up /x+y/ patterns
greatly by not retrying on every x, and disabled backoff on
patterns anchored to the end like /\s+$/. This made /\s+$/ run
100 times faster on a string containing 70 spaces followed by an X.
Actual improvements will generally be less than that. I also
sped up {m,n} on simple items by making it a variant of *.
And /.*whatever/ is now optimized to /^.*whatever/ to avoid
retrying at every position in the event of failure. I fixed
character classes to allow backslashing hyphen, by popular
request.
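To illustrate the backslashed hyphen, here is a sketch as it behaves in a current perl 5: inside a character class, \- is a literal hyphen and no longer forms a range.

```perl
use strict;
use warnings;

# [a\-z] holds exactly three characters: 'a', '-' and 'z'.
my $literal_hit  = ('a-z' =~ /^[a\-z]+$/) ? 1 : 0;
my $literal_miss = ('b'   =~ /^[a\-z]+$/) ? 1 : 0;

# [a-z] is the usual range, so 'b' is included.
my $range_hit    = ('b'   =~ /^[a-z]+$/)  ? 1 : 0;

print "$literal_hit $literal_miss $range_hit\n";
```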
In the past, $ in a pattern would sometimes match in the middle
of the string and sometimes not, if $* == 0. Now it will never
match except at the end of the string, or just before a terminating
newline. When $* == 1 behavior is as before.
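The rule can be sketched as follows. This assumes a current perl 5, where $* is no longer available and the /m modifier plays the role of $* == 1; the test string is made up for the example.

```perl
use strict;
use warnings;

my $s = "foo\nbar\n";

# Without /m, $ matches only at the end, or just before a final newline.
my $at_end = ($s =~ /bar$/)  ? 1 : 0;
my $in_mid = ($s =~ /foo$/)  ? 1 : 0;

# With /m, $ may also match before an embedded newline.
my $multi  = ($s =~ /foo$/m) ? 1 : 0;

print "$at_end $in_mid $multi\n";
```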
In the README file, I've expanded on just how I think the GNU
General Public License applies to Perl and to things you might
want to do with Perl.
The interpreter used to set the global variable "line" to be
the current line number. Instead, it now sets a global pointer
to the current Perl statement, which is no more overhead, but
now we will have access to the file name and package name associated
with that statement, so that the debugger can soon be upgraded to
allow debugging of evals and packages.
In the past, a conditional construct in an array context passed
the array context on to the conditional expression, causing
general consternation and confusion. Conditionals now always
supply a scalar context to the expression, and if that expression
turns out to be the one whose value is returned, the value is
coerced to an array value of one element.
The switch optimizer was confused by negative fractional values,
and truncated them in the wrong direction.
Configure now checks for chsize, select and truncate functions, and
now asks if you want to put scripts into some separate directory
from your binaries. More and more people are establishing a common
directory across architectures for scripts, so this is getting
important.
It used to be that a numeric literal ended up being stored both
as a string and as a double. This could make for lots of wasted
storage if you said things like "$seen{$key} = 1;". So
numeric literals are now stored only in floating point format,
which saves space, and generates at most one extra conversion per
literal over the life of the script.
The % operator had an off-by-one error if the left argument was
negative.
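The note above doesn't spell out the corrected results. As a point of reference only, this is what a current perl 5 gives (without "use integer") for a negative left argument: the result takes the sign of the right operand.

```perl
use strict;
use warnings;

my @r = (-7 % 3, 7 % 3, -6 % 3);
print "@r\n";
```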
The pack and unpack functions have been upgraded. You
can now convert native float and double fields using f and d.
You can specify negative relative positions with X<n>, and absolute
positions in the record with @<n>. You can have a length of *
on the final field to indicate that it is to gobble all the rest
of the available fields. In unpack, if you precede a field
spec with %<n>, it does an n-bit checksum on it instead of the
value itself. (Thus "%16C*" will checksum just like the Sys V sum
program.) One totally wacked out evening I hacked a u format
in to pack and unpack uudecode-style strings.
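A short tour of these template features, as a sketch run against a current perl 5. The sample data is made up, and the details in the release described here may differ.

```perl
use strict;
use warnings;

# f and d: native float and double; a double round trip is exact.
my ($dbl) = unpack('d', pack('d', 3.25));

# @<n>: absolute position; pack fills the gap with nulls.
my $rec = pack('A2 @4 A2', 'ab', 'cd');
my ($tail) = unpack('@4 A2', $rec);

# X<n>: back up n bytes (here one byte, so 'z' replaces 'c').
my $backed = pack('A3 X A1', 'abc', 'z');

# *: the final field gobbles whatever is left.
my ($head, $rest) = unpack('a2 a*', 'abcdef');

# %<n>: an n-bit checksum of the values instead of the values themselves.
my ($sum) = unpack('%16C*', 'abc');    # 97 + 98 + 99

# u: uuencode-style strings, round trip.
my ($plain) = unpack('u', pack('u', 'hello, world'));

print "$dbl $tail $backed $head $rest $sum $plain\n";
```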
A couple bugs were fixed in unpack--it couldn't unpack an A or a
format field in a scalar context, which is just supposed to
return the first field. The c and C formats were also calling
bcopy to copy each character. Yuck.
Machines without the setreuid() system call couldn't manipulate
$< and $> easily. Now, if you've got setuid(), you can say $< = $>
or $> = $< or even ($<, $>) = ($uid, $uid), as long as it's
something that can be done with setuid(). Similarly for setgid().
I've included various MSDOS and OS/2 patches that people have sent.
There's still more in the hopper...
An open on a pipe for output such as 'open(STDOUT,"|command")' left
STDOUT attached to the wrong file descriptor. This didn't matter
within Perl, but it made subprocesses expecting stdout to be on fd 1
rather irate.
The print command could fail to detect errors such as running out
of room on the disk. Now it checks a little better.
Saying "print @foo" might only print out some of the elements
if there were undefined elements in the middle of the array, due to
a reversed bit of logic in the print routine.
On machines with vfork the child process would allocate memory
in the parent without the parent knowing about it, or having any way
to free the memory so allocated. The parent now calls a cleanup
routine that knows whether that's what happened.
If the getsockname or getpeername functions returned a normal
Unix error, perl -w would report that you tried I/O on an
unopened socket, even though it was open.
MACH doesn't have seekdir or telldir. Who ever uses them anyway?
Under certain circumstances, an optimized pattern match could
pass a hint into the standard pattern matching routine which
the standard routine would then ignore. The next pattern match
after that would then get a "panic: hint in do_match" because the
hint didn't point into the current string of interest.
The $' variable returned a short string if it contained an
embedded null.
Two common split cases are now special-cased to avoid the regular
expression code. One is /\s+/ (and its cousin ' ', which also
trims leading whitespace). The other is /^/, which is very useful
for splitting a "here-is" quote into lines:
@lines = split(/^/, <<END);
Element 0
Element 1
Element 2
END
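The difference between /\s+/ and its cousin ' ' shows up when the string has leading whitespace. A sketch, assuming a current perl 5 and a made-up input string:

```perl
use strict;
use warnings;

my $input = "  a b  c";

my @by_pattern = split(/\s+/, $input);    # leading empty field is kept
my @by_space   = split(' ',   $input);    # leading whitespace is trimmed

# /^/ splits into lines, each keeping its newline.
my @lines = split(/^/, "one\ntwo\nthree\n");

print scalar(@by_pattern), " ", scalar(@by_space), " ", scalar(@lines), "\n";
```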
You couldn't split on a single case-insensitive letter because
the single character split optimization ignored the case folding
flag.
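For example (a sketch against a current perl 5), splitting on a single letter with the i flag should treat both cases as separators:

```perl
use strict;
use warnings;

my @parts = split(/x/i, 'aXbxc');
print "@parts\n";
```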
Sort now handles undefined strings right, and sorts lists
a little more efficiently because it weeds them out before
sorting so it doesn't have to check for them on every comparison.
The each() and keys() functions were returning garbage on null
keys in DBM files because the DBM iterator merely returns a pointer
into the buffer to a string that's not necessarily null terminated.
Internally, Perl keeps a null at the end of every string (though
allowing embedded nulls) and some routines make use of this
to avoid checking for the end of buffer on every comparison. So
this just needed to be treated as a special case.
The &, | and ^ operators will do bitwise operations on two strings,
but for some reason I hadn't implemented ~ to do a complement.
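A sketch of the string forms of these operators, assuming a current perl 5 run without the "bitwise" feature (under "use v5.28" or later these operators treat their operands as numbers instead). The ASCII case trick is only an illustration.

```perl
use strict;
use warnings;

my $lower = 'AB' | '  ';    # setting bit 0x20 lowercases ASCII letters
my $upper = 'ab' & '__';    # masking with 0x5F uppercases them
my $xor   = 'AB' ^ '  ';    # toggling bit 0x20 flips the case

my $comp  = ~'AB';          # bytewise complement of a string
my $back  = ~$comp;         # complementing twice restores the original

print "$lower $upper $xor $back\n";
```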
Using an associative array name with a % in dbmopen(%name...)
didn't work right, not because it didn't parse, but because the
dbm opening routine internally did the wrong thing with it.
You can now say dbmopen(name, 'filename', undef) to prevent it
from opening the dbm file if it doesn't exist.
The die operator simply exited if you didn't give an argument,
because that made sense before eval existed. But now it will be
equivalent to "die 'Died';".
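Inside an eval this can be observed directly. A minimal sketch, assuming a current perl 5, which appends the file and line to the message:

```perl
use strict;
use warnings;

eval { die; };      # no argument given
my $err = $@;       # begins with "Died"
print $err;
```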
Using the return function outside a subroutine returned a cryptic
message about not being able to pop a magical label off the stack.
It's now more informative.
On systems without the rename() system call, it's emulated with
unlink()/link()/unlink(), which could clobber a file if it
happened to unlink it before it linked it. Perl now checks to
make sure the source and destination filenames aren't in fact
the same directory entry.
The -s file test now returns size of file. Why not?
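A quick check of the size return, sketched for a current perl 5. File::Temp is used only to get a scratch file for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write five bytes to a scratch file, then ask -s for its size.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} 'hello';
close $fh or die "close failed: $!";

my $size = -s $path;
print "size is $size\n";
```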
If you tried to write a general subroutine to open files, passing
in the filehandle as *filehandle, it didn't work because nobody
took responsibility to allocate the filehandle structure internally.
Now, passing *name to a subroutine forces filehandle and array
creation on that symbol if they're not already created.
Reading input via <HANDLE> is now a little more efficient--it
does one less string copy.
The dumpvar.pl routine now fixes weird chars to be printable, and
allows you to specify a list of variables to display. The debugger
takes advantage of this. The debugger also now allows \ continuation
lines, and has an = command to let you make aliases easily. Line
numbers should now be correct even after lines containing only
a semicolon.
The action code for parsing split; with no arguments didn't
pass a correct value of bufend to the scanpat it was
using to establish the /\s+/ pattern.
The $] variable returned the rcsid string and patchlevel. It still
returns that in a string context, but in a numeric context it
returns the version number (as in 4.0) + patchlevel / 1000.
So these patches are being applied to 3.018.
The variables $0, %ENV, @ARGV were retaining incorrect information
from the previous incarnation in dumped/undumped scripts.
The %ENV array is supposed to be global even inside packages, but
an off-by-one error kept it from being so.
The $| variable couldn't be set on a filehandle before the file
was opened. Now you can.
If errno == 0, the $! variable returned "Error 0" in a string
context, which is, unfortunately, a true string. It now returns ""
in string context if errno == 0, so you can use it reasonably in
a conditional without comparing it to 0: &cleanup if $!;
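A sketch of the behavior, assuming a current perl 5. The path is a hypothetical one chosen so that the open fails.

```perl
use strict;
use warnings;

$! = 0;
my $text   = "$!";                     # empty string when errno is 0
my $before = $! ? 'error' : 'ok';

# Hypothetical path that should not exist, so open fails and sets $!.
my $opened = open(my $fh, '<', '/no/such/dir/no-such-file');
my $after  = (!$opened && $!) ? 'error' : 'ok';

print "$before $after\n";
```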
On some machines, conversion of a number to a string caused
a malloc string to be overrun by 1 character. More memory is
now allocated for such a string.
The tainting mechanism didn't work right on scripts that were setgid
but not setuid.
If you had a reference to an array such as @name in a program, but
didn't invoke any of the usual array operations, the array never
got initialized.
The FPS compiler doesn't do default in a switch very well if the
value can be interpreted as a signed character. There's now a
#ifdef BADSWITCH for such machines.
Certain combinations of backslashed backslashes weren't correctly
parsed inside double-quoted strings.
"Here" strings caused warnings about uninitialized variables because
the string used internally to accumulate the lines wasn't initialized
according to the standards of the -w switch.
The a2p translator couldn't parse {foo = (bar == 123)} due to
a hangover from the old awk syntax. It also needed to put a
chop into a program if the program referenced NF so that the
field count would come out right when the split was done.
There was a missing semicolon when local($_) was emitted.
I also didn't realize that an explicit awk split on ' ' trims
leading whitespace just like the implicit split at the beginning
of the loop. The awk for..in loop has to be translated in one
of two ways in a2p, depending on whether the array was produced
by a split or by subscripting. If the array was a normal array,
a2p put out code that iterated over the array values rather than
the numeric indexes, which was wrong.
The s2p didn't translate \n correctly, stripping the backslash.
|