Doing more with text
GCW December 2011

Making scripts for StrongED
This page is a tutorial on writing RiscLua scripts for
StrongED. It presumes that you have a recent version of StrongED,
one that shows the apply icon on the smarticonbar along the top of
the StrongED window, and that you have !lua in !Boot.resources
(or elsewhere, provided that it has been filer_booted).

If you move the mouse pointer over parts of the example
programs below, you should get extra information in
tooltips, though unfortunately not yet with NetSurf.
Reserved words (which you must not use as names
for your own variables) are shown in green, and comments are also highlighted.

The idea is that you use StrongED to enter the text of the scripts
given below and save them as text files somewhere (or download them
directly by clicking on the icon). You apply them to the contents of
a StrongED window by either SHIFT-dragging or CTRL-dragging them
onto the apply icon. The latter is the safer, because it puts
the transformed text into a new window and leaves the old text unchanged,
whereas the former simply replaces the old text with the new.

Here is a very simple script. It does nothing: for each line of the
old text it simply prints that line out to the new text, which is not
very useful.

Let us add some line numbers to the text.
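The original listings are not reproduced here, so what follows is only a sketch of the two scripts just described. It assumes a numbering format of "%3d : %s" (chosen because it adds six characters per line), and it works on a sample string rather than on a real StrongED window; the names text, lines, copy and number are hypothetical.

```lua
-- Hypothetical stand-in for the contents of a StrongED window.
local text = "first line\nsecond line\nthird line"

-- Iterate over the lines of a string (a newline is appended so that
-- the last line is also seen).
local function lines(s) return (s .. "\n"):gmatch("(.-)\n") end

-- The do-nothing script: every line is passed through unchanged.
local function copy(s)
  local out = {}
  for line in lines(s) do
    out[#out + 1] = line
  end
  return table.concat(out, "\n")
end

-- Line numbering. The format string has two specifiers (%3d and %s),
-- so format is given two arguments (n and line).
local function number(s)
  local out, n = {}, nil
  for line in lines(s) do
    n = n and n + 1 or 1
    out[#out + 1] = ("%3d : %s"):format(n, line)   -- assumed format
  end
  return table.concat(out, "\n")
end

print(copy(text))
print(number(text))
```

In a real script the lines would come from the text StrongED supplies and each result would be printed as it is produced; collecting the lines in a table here just makes the result easy to inspect.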
format is a useful string method. It should have the same
number of arguments as there are specifiers in the format string it is
applied to.

The line

    n = n and n+1 or 1

is equivalent to

    if n then
      n = n+1
    else
      n = 1
    end
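As a quick check, the two forms can be wrapped in functions and compared. The sketch also shows that "and" and "or" skip their right-hand operand when the left-hand one already decides the result. The names bump_idiom, bump_if, calls and y are hypothetical.

```lua
-- The idiom and its if/else expansion, side by side.
local function bump_idiom(n)
  return n and n + 1 or 1
end

local function bump_if(n)
  if n then
    n = n + 1
  else
    n = 1
  end
  return n
end

print(bump_idiom(nil), bump_if(nil))   -- n undefined: both give 1
print(bump_idiom(1), bump_if(1))       -- both give 2
print(bump_idiom(0), bump_if(0))       -- 0 counts as true: both give 1

-- The right-hand operand is only evaluated when it is needed.
local calls = 0
local function y()
  calls = calls + 1
  return "y"
end

local a = nil and y()   -- left side is nil, so y is not called
local b = "x" or y()    -- left side counts as true, so y is not called
print(a, b, calls)
```

One caveat with the idiom in general: x and y or z only behaves like if/else when y itself cannot be nil or false. Here n+1 is always a number, so the idiom is safe.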
After the words if, while, until and not, and before the words "and" and "or",
values are treated as true if they are neither nil (i.e. undefined)
nor false, and otherwise as false. The operators "and" and "or" are lazy in their
right-hand arguments. That is to say, if x is false or nil then the y in x and y is never evaluated. Conversely, if x is neither false nor nil then the y in the expression x or y is never evaluated. Note that zero and the empty string, being neither nil nor false, count as true.

Ex2-1 : Modify script Ex2 so that line numbering starts at some
other value, say 20, and increases by some other increment, say 10.

Note that line numbering with Ex2 adds precisely six characters to
each line, presuming that the text has fewer than a thousand lines.
The expression line:sub(7)
yields the substring of line from the seventh character to the end.

Ex2-2 : With this information you should be able to write a script to undo
the line-numbering of Ex2.

The expression line:sub(m,n)
yields the substring of line starting at the m-th character and
finishing at the n-th. Negative positions count backwards from the
end, so that -1 refers to the last character of line. If the second argument
is omitted the value -1 is used for it.

Ex2-3 : Using the fact that equality is tested by the operator == and
inequality by ~= you should now be able to write a script that deletes
all lines starting with "Sir John" and ending with "Rice-Davies".

Suppose we want something a bit more specific. Say we want to delete
all the lines starting with "Mary" and ending with "grow?"
which contain the word "contrary". For that we need to know something
about pattern matching. Here is a script:
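The original listing is not reproduced here, so this is a sketch of one way such a script might look, not necessarily the author's. It uses a single anchored pattern and works on a hypothetical sample string; the names text, lines, pattern and prune are assumptions.

```lua
-- Hypothetical stand-in for the contents of a StrongED window.
local text = table.concat({
  "Mary, Mary, quite contrary, how does your garden grow?",
  "With silver bells and cockle shells,",
  "Mary asked: how does your garden grow?",
}, "\n")

local function lines(s) return (s .. "\n"):gmatch("(.-)\n") end

-- ^ anchors the match at the start of the line and $ at the end;
-- .* stands for any run of characters; %? is a literal question mark.
local pattern = "^Mary.*contrary.*grow%?$"

-- Keep every line that does not match the pattern.
local function prune(s)
  local out = {}
  for line in lines(s) do
    if not line:find(pattern) then
      out[#out + 1] = line
    end
  end
  return table.concat(out, "\n")
end

print(prune(text))
```

Only the first sample line is dropped. The third starts with "Mary" and ends with "grow?" but does not contain "contrary". Without patterns, the start and end tests alone could be written with sub, as line:sub(1, 4) == "Mary" and line:sub(-5) == "grow?"; the pattern adds the test for "contrary" in between.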
An irritatingly common misspelling, no doubt caused by the predominance
of the schwa in English speech, is "seperate" for "separate". This script
will do the correction.
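A minimal sketch of such a correction, applied to a hypothetical sample string and not to a real StrongED window. It only handles the lower-case spelling; the names text, lines, out and corrected are assumptions.

```lua
-- Hypothetical stand-in for the contents of a StrongED window.
local text = "Keep them seperate.\nTwo seperate lists, seperated by a comma."

local function lines(s) return (s .. "\n"):gmatch("(.-)\n") end

local out = {}
for line in lines(text) do
  -- gsub returns the new string and the number of substitutions made;
  -- the extra parentheses keep only the first of these.
  out[#out + 1] = (line:gsub("seperate", "separate"))
end
local corrected = table.concat(out, "\n")
print(corrected)

-- Counting the values returned, without and with the extra parentheses:
print(select("#", text:gsub("seperate", "separate")))     -- 2
print(select("#", (text:gsub("seperate", "separate"))))   -- 1
```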
An extra subtlety here is that line:gsub returns two values (the new
string and the number of substitutions made), and print can take
multiple arguments. By surrounding the expression in an extra set of
parentheses the extra values can be suppressed. Although this
precaution is not needed in this particular example it is wise to
establish good habits.

For example 5 we suppose that we have text referring to files
saved from a Unix system, so that in RISC OS they have names like
foo/c, foo/h or foo/o. We want to change these to the form c.foo, h.foo
and o.foo respectively. Here is a script.
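The original listing is not reproduced here. This sketch assumes a search pattern with three nested captures and a replacement function taking parameters x, y and z. The sample text, the table swap and the function name rename are hypothetical.

```lua
-- Hypothetical stand-in for the contents of a StrongED window.
local text = "Compile foo/c with foo/h to get foo/o; leave notes/txt alone."

-- Extensions to be moved to the front (an assumption for this sketch).
local swap = { c = true, h = true, o = true }

-- Three captures, numbered by their opening parentheses:
--   x = the whole match, y = the name before the slash, z = the extension.
local search = "(([%w_]+)/([%w_]+))"

local function rename(x, y, z)
  if swap[z] then
    return z .. "." .. y
  end
  return x   -- no change required: put back what was matched
end

local result = (text:gsub(search, rename))
print(result)
```

With this sample, foo/c, foo/h and foo/o become c.foo, h.foo and o.foo, while notes/txt is left alone because txt is not one of the listed extensions.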
In this example the search pattern has three captures. Captures are
ordered by the positions of the opening parentheses that carve out
the subexpression to be captured. The first (x parameter) encloses the
whole search pattern, so x is used as the return value when no change is
required. The second capture (y parameter) extends up to the forward slash,
and the third (z parameter) is the extension. In the search expression
[%w_] denotes either an alphanumeric character or an underscore. The +
suffix means "at least one".

The global substitution method is very powerful. Its last argument,
when a function, can be used to perform side-effects that accumulate
information from the text. For the next exercise let us suppose that
we want a script that will scan the text for doctors (e.g. Dr Smith)
and print out a list of them, in the order in which they were
first encountered, together with the numbers of the lines in which
they are mentioned.
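The original listing is not reproduced here. The sketch below reuses the names push, drlist, search and scrape that the text refers to, but its details are assumptions, and its line numbers need not match those of the original listing. The sample text, the variable lineno and the exact pattern are hypothetical.

```lua
-- Hypothetical stand-in for the contents of a StrongED window.
local text = table.concat({
  "Dr Smith saw the patient first.",
  "Later Dr Jones and Dr Smith conferred.",
  "Nobody was available on Tuesday.",
  "Dr Jones signed the form.",
}, "\n")

local lineno = 0   -- number of the line currently being scanned

-- Record that name occurs on the current line. New names are also
-- appended to the array part of the list, which keeps first-seen order.
local function push(list, name)
  local seen = list[name]
  if not seen then
    seen = {}
    list[name] = seen
    list[#list + 1] = name
  end
  if seen[#seen] ~= lineno then   -- avoid repeating a line number
    seen[#seen + 1] = lineno
  end
end

local drlist = { push = push }
local search = "Dr%s+(%u%a*)"   -- assumed form: Dr, spaces, capitalised name

-- Called by gsub for each match. It returns nothing, so the text is
-- left unchanged; only the side-effect on drlist matters.
local function scrape(name)
  drlist:push(name)
end

for line in (text .. "\n"):gmatch("(.-)\n") do
  lineno = lineno + 1
  line:gsub(search, scrape)
end

local report = {}
for i, name in ipairs(drlist) do
  report[i] = ("Dr %s: %s"):format(name, table.concat(drlist[name], ", "))
end
print(table.concat(report, "\n"))
```

For the sample text this lists Dr Smith (lines 1, 2) before Dr Jones (lines 2, 4), because Smith was encountered first.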
The notation { push = push } deserves some comment. It is syntactic
sugar for { ["push"] = push }. That is to say, the left-hand push is the name of a key; the right-hand push is the value of
the function defined in line 5 of the script. The notation drlist:push(name) is
syntactic sugar for drlist.push(drlist,name), which in turn is syntactic
sugar for drlist["push"](drlist,name). And that is just push(drlist,name).

In the same way, line:gsub(search,scrape) stands for
line.gsub(line,search,scrape). However, line is a string, not
a table, and strings have been set to look up their
methods in the string library. So this expression is
interpreted as string.gsub(line,search,scrape).
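These equivalences can be checked directly. In this small sketch push is a simplified, hypothetical version that just appends to the list, and the last line relies on standard Lua behaviour, which RiscLua is assumed to share.

```lua
-- A simplified push, for illustration only.
local function push(list, name)
  list[#list + 1] = name
end

local drlist = { push = push }

-- Four ways of writing the same call:
drlist:push("Smith")
drlist.push(drlist, "Jones")
drlist["push"](drlist, "Brown")
push(drlist, "Green")
print(table.concat(drlist, " "))   -- Smith Jones Brown Green

-- The method call on a string and the library call give the same result.
local line = "a seperate line"
local by_method = (line:gsub("seperate", "separate"))
local by_library = (string.gsub(line, "seperate", "separate"))
print(by_method == by_library)

-- In standard Lua, strings share a metatable whose __index field is the
-- string library, which is how line:gsub finds string.gsub.
print(getmetatable("").__index == string)
```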