Lua steps awkward
My first encounter with awk was John Kortink's port of
gawk 3.0.3 for RISC OS in 1998. I am grateful to John for
helping me get to grips with compiling it for myself. I went
on to compile later versions of gawk, and even managed to
extend them with some ARM assembly code to provide some
convenient RISC OS extensions. Later I came across Lucent's The One True Awk and Michael Brennan's mawk, which I also compiled. I did
not try extending The One True Awk because I felt that would run counter
to its spirit of sticking to the original, as represented in
Aho, Weinberger and Kernighan's classic The Awk Programming Language. Naturally, I fell in love with awk and found it a wonderful
tool.
I discovered that StrongED had a marvellous, and
largely unexploited, facility: the
Process command. This made it possible to apply awk scripts to the contents
of a StrongED window, thereby enhancing the powers of StrongED and
providing an easy desktop development environment for writing awk
scripts. I am grateful to Fred Graute for developing StrongED to
make this facility available for any kind of
scripting language.
Then I discovered Lua. Apart from the fact that I have
always preferred small languages, what really knocked me off my
feet was how marvellously easy it was to port. I have wasted
countless hours trying to compile packages that claim to be
portable, only to give up in frustration when it became clear that
I did not have the necessary libraries. I think in the old days
programming language implementors assumed that everybody used
Unix workstations; conceivably they had a Windows machine at
home. It is amazing how many variant interpretations of C
there are. To take just one tiny detail: do you allow macros to extend
over many lines by using a backslash to hide newlines?
Pages of compiler errors result if your version of C does not
allow this but the source code thinks it is OK. Even when you
have managed to get something to compile, that is no guarantee
that the result will run properly. As I get older and lazier,
and newer and possibly more incompatible versions of C
libraries are brought out, the prospect of sifting through
lists of warnings, errors and serious errors grows more
daunting. Lua, on the other hand, continues to be written
in strict ANSI C. It will compile even without UnixLib.
In short, compiling a new version of Lua is a lot less
hassle than compiling an old version of awk (there are
no new ones).
Awk was originally devised as a tool for text manipulation,
like sed. Its popularity led to its being developed into a
programming language. Its authors are at pains to point out
that awk was never intended for large programming jobs or
for general programming. Quick and easy is its motto, so long as it is being used in an
appropriate context. Lua, by contrast, grew into a programming
language from a data description format. It is more of a general
purpose language. Because it was devised two decades after
awk, it incorporates all sorts of good principles which were
hardly appreciated when awk was born. Look at awk's perfunctory
treatment of local variables, for example. It does not have any,
except by accident, as it were, as formal parameters. Lua's
auxiliary libraries, particularly the string library, can
give some very snappy code - but not quite as compact as awk.
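As a small illustration of both points, here is a sketch in plain Lua (not pseudo_awk). It assumes Lua 5.1 or later; in Lua 5.0 the iterator was called string.gfind and the # operator was not available. The function name split_fields and the sample line are made up for the example. The local declarations keep fields and w out of the global namespace, which awk can only imitate by adding spare formal parameters, and string.gmatch does the whitespace field splitting that awk gives for free.

```lua
-- Split a line into whitespace-separated fields, awk style.
-- split_fields is a hypothetical helper name, not part of any library.
function split_fields(line)
  local fields = {}                       -- local: invisible outside this function
  for w in string.gmatch(line, "%S+") do  -- %S+ matches a run of non-space characters
    fields[#fields + 1] = w
  end
  return fields
end

local f = split_fields("Henry 42 apples")
print(#f, f[1], f[2])
```

The loop is two lines longer than awk's implicit splitting, but nothing leaks: after the call, fields and w no longer exist.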
Because Lua is a bit more general purpose than awk, it makes
sense to try and mimic awk with Lua. So I have devised a pseudo_awk.
Its syntax is more that of Pascal than of C, but its scripts are
also written as a series of pattern-action statements. The patterns
must use Lua pattern syntax. Functions can only be defined inside
pattern-action statements, not outside as in awk. OK, it is
really just an exercise, and probably not worth pushing further,
but here is how it goes. It presumes Lua version 5, but with $
as an allowable character in variable names (as is the case with
RiscLua). The command
lua awkward pseudo_awk_script data_file
runs the pseudo_awk script on the data_file. Example scripts are:
/Henry/ { printf("%s %s",NR,$[0]) }
which prints out with its line number all lines containing the
substring 'Henry', and
BEGIN {count = 0}
$[1] == "Henry" {
  count = count + $[2]
  print(NR," ",$[2])
}
END { print("Total = ",count) }
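For comparison, the following is a rough sketch of what these scripts amount to in plain Lua 5.1 or later. It is not the awkward interpreter itself. It combines the pattern test of the first script with the running total of the second. The function name run and the sample records are invented for the illustration, and the records are passed in as a table of strings rather than read from a file.

```lua
-- A hand-written plain Lua counterpart of the example scripts.
-- run and the sample records are hypothetical names and data.
function run(lines)
  local out = {}                           -- collected output lines
  local count = 0                          -- BEGIN {count = 0}
  for NR, line in ipairs(lines) do         -- NR is the record (line) number
    local f = {}                           -- f[i] plays the role of $[i]
    for w in string.gmatch(line, "%S+") do
      f[#f + 1] = w
    end
    if string.find(line, "Henry") then     -- /Henry/ as a Lua pattern
      out[#out + 1] = string.format("%s %s", NR, line)
    end
    if f[1] == "Henry" then                -- $[1] == "Henry"
      -- a missing or non-numeric second field is treated as 0 here
      count = count + (tonumber(f[2]) or 0)
    end
  end
  out[#out + 1] = "Total = " .. count      -- END action
  return out
end

for _, s in ipairs(run{ "Henry 3", "Alice 5", "Henry 4" }) do
  print(s)
end
```

Everything awk does implicitly (the loop over records, the field splitting, the numeric conversion) has to be spelled out, which is the bookkeeping pseudo_awk is meant to hide.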
Note that where awk would have $i, pseudo_awk has $[i]. There is no
+= operator in pseudo_awk, and we have initialized the variable
count. We could have removed the first line and written instead
count = (count or 0) + $[2]
to cater for the fact that pseudo_awk has nil and variables are not
automatically initialized to 0 or an empty string. See here the
awkward pseudo_awk interpreter.
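The (count or 0) idiom is ordinary Lua and can be tried outside pseudo_awk. In the sketch below, which assumes nothing beyond standard Lua, the names totals, add_to and nothing_here are invented for the example. An unset table entry is nil, nil counts as false, and so the or operator substitutes 0 the first time round.

```lua
-- totals starts empty, so totals[name] is nil on first use.
totals = {}

-- add_to is a hypothetical helper: accumulate n under the given name.
function add_to(name, n)
  totals[name] = (totals[name] or 0) + n   -- nil or 0 evaluates to 0
  return totals[name]
end

add_to("Henry", 3)
add_to("Henry", 4)
print(totals.Henry)

-- Without the guard, arithmetic on nil raises an error instead of
-- quietly treating the unset variable as 0 the way awk does.
-- nothing_here is deliberately never assigned.
print(pcall(function() return nothing_here + 1 end))
```

The second print shows false followed by an error message, which is Lua's way of refusing the silent default that awk supplies.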