Doing more with text

GCW December 2011

Making scripts for StrongED

This page is a tutorial for making RiscLua scripts for StrongED. It presumes that you have a recent version of StrongED, one that shows the apply icon on the smarticonbar along the top of the StrongED window, and that you have !lua in !Boot.resources (or elsewhere, provided that it has been filer_booted).

The example programs below carry explanatory annotations. In the original web page these appear as tooltips when you move the mouse pointer over parts of a program (but unfortunately not yet with NetSurf), with reserved words shown in green and comments in red. Reserved words must not be used as names for your own variables.

The idea is that you use StrongED to enter the text of the scripts given below and save them as text files somewhere. You apply them to the contents of a StrongED window by either SHIFT-dragging or CTRL-dragging them onto the apply icon. CTRL-dragging is the safer, because it puts the transformed text into a new window and leaves the old text unchanged, whereas SHIFT-dragging simply replaces the old text with the new.

Here is a very simple script. It does nothing.

#! lua -- Ex1
-- In the first line, # makes the line a comment and !lua tells StrongED to use RiscLua.
-- line is a variable - call it what you like.
-- io is a built-in library; io.lines iterates through the lines of its argument.
-- arg[1] indicates the text in the StrongED window.
for line in io.lines ( arg[1] ) do
    print ( line ) -- output the typical line of the old text to the new text
end -- for

For each line of the old text it simply prints it out to the new text, which is not very useful.

Let us add some line numbers to the text.

#! lua -- Ex2
local layout = "%3d : %s" -- %3d is a number specifier for a 3 digit field, %s is a string specifier
local n -- line count, undefined to begin with; declared outside the loop so that it survives from one line to the next
for line in io.lines ( arg[1] ) do
    n = n and n+1 or 1 -- n+1 is taken if n is defined, 1 if n is undefined
    print ( layout:format ( n, line ) ) -- substitute n and line into layout according to the specifiers
end -- for

format is a useful string method. It should have the same number of arguments as specifiers in the format string it is applied to.
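As a small illustration, here is format at work in plain Lua, outside StrongED. The layout string and the sample text are made up for the purpose; the point is only that two specifiers call for two arguments.

```lua
-- A layout with two specifiers needs two arguments.
local layout = "%3d : %s"
local text = "some text"
local numbered = layout:format(7, text)

print(numbered)            -- "  7 : some text" (the number is padded to 3 places)
print(#numbered - #text)   -- 6, the number of characters added in front
```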

The line

    n = n and n+1 or 1

is equivalent to

    if n then
      n = n+1
    else
      n = 1
    end -- if

After the words

      if while until not
and before the words
      and or
values are treated as true if they are neither nil (i.e. undefined) nor false, and otherwise as false. The operators and and or are lazy in their right hand arguments. That is to say, if x is false or nil then the y in x and y is never evaluated. Conversely, if x is neither false nor nil then the y in the expression x or y is never evaluated.

Note that zero and the empty string, being non-nil, are counted as true.
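These rules can be checked with a few lines of plain Lua. The helper names truthy and noisy are invented for this sketch; noisy counts how often it is called, so that the laziness of and and or becomes visible.

```lua
-- Only nil and false count as false; zero and the empty string count as true.
local function truthy(v)
  if v then return true else return false end
end

print(truthy(nil), truthy(false))   -- false and false
print(truthy(0), truthy(""))        -- true and true

-- noisy records each call, so we can see whether it was evaluated.
local calls = 0
local function noisy()
  calls = calls + 1
  return "evaluated"
end

local a = nil and noisy()     -- left side is nil: noisy is not called
local b = "left" or noisy()   -- left side counts as true: noisy is not called
local c = nil or noisy()      -- left side is nil: noisy is called
print(a, b, c, calls)         -- nil, left, evaluated and 1 (tab separated)
```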

Ex2-1 : Modify script Ex2 so that line numbering starts at some other value, say 20, and increases by some other increment, say 10.

Note that line numbering with Ex2 adds precisely six characters to each line, presuming that the text has fewer than a thousand lines. The expression

      line:sub(7)

yields the substring of line from the seventh character to the end.

Ex2-2 : With this information you should be able to write a script to undo the line-numbering of Ex2.

The expression

      line:sub(m,n)

yields the substring of line starting at the m-th character and finishing at the n-th. Negative positions count backwards from the end, so that -1 refers to the last character of line. If the second argument is omitted the value -1 is used for it.
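A few trial calls in plain Lua may make the indexing clearer. The sample line is made up; it has a six character prefix of the kind described above.

```lua
local line = "  7 : Mary had a little lamb"

print(line:sub(7))        -- Mary had a little lamb
print(line:sub(7, 10))    -- Mary
print(line:sub(-4))       -- lamb
print(line:sub(-4, -1))   -- lamb (the same, with the default written out)
print(line:sub(1, 6))     -- the six character prefix
```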

Ex2-3 : Using the fact that equality is tested by the operator == and inequality by ~= you should now be able to write a script that deletes all lines starting with "Sir John" and ending with "Rice-Davies".

Suppose we want something a bit more specific. Say we want to delete all the lines starting with "Mary" and ending with "grow?" which contain the word "contrary". For that we need to know something about pattern matching. Here is a script:

#! lua -- Ex3
for line in io.lines ( arg[1] ) do
    -- match is a string method. In the pattern, ^ matches the start of the line,
    -- .+ matches at least one character and $ matches the end of the line.
    -- A literal question mark must be written %? because a bare ? is a pattern operator.
    if not line:match "^Mary.+contrary.+grow%?$" then
      print ( line ) -- output
    end -- if
end -- for
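To see what such a pattern accepts, it can be tried on a few sample strings in plain Lua. The samples are invented for this sketch. Note that a literal question mark is written %? in a Lua pattern, since an unescaped ? means "at most one of the preceding character".

```lua
local pattern = "^Mary.+contrary.+grow%?$"
local rhyme = "Mary, Mary, quite contrary, how does your garden grow?"

print(rhyme:match(pattern) ~= nil)               -- true
print(("Oh " .. rhyme):match(pattern) ~= nil)    -- false: does not start with Mary
print((rhyme .. " Well?"):match(pattern) ~= nil) -- false: does not end with grow?

-- Unescaped, w? matches an optional w rather than a question mark.
print(("grow"):match("^grow?$"))    -- grow
print(("gro"):match("^grow?$"))     -- gro
print(("grow?"):match("^grow?$"))   -- nil
```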

An irritatingly common misspelling, no doubt caused by the predominance of the schwa in English speech, is "seperate" for "separate". This script will do the correction.

#! lua -- Ex4
for line in io.lines ( arg[1] ) do
    -- gsub is the global substitution method: "seperat" is the pattern to match
    -- and "separat" is the replacement string.
    -- The extra pair of parentheses is needed to suppress extra return values.
    print ( ( line:gsub ( "seperat", "separat" ) ) )
end -- for

An extra subtlety here is that line:gsub returns two values, the new string and the number of substitutions made, and print outputs all of its arguments. Surrounding the expression with an extra set of parentheses discards everything but the first value. Without them every output line would have the substitution count tacked on to the end, so it is wise to make the extra parentheses a habit.
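The effect of the extra parentheses can be observed directly in plain Lua. The sample line is made up; select("#", ...) is the standard way to count how many values an expression delivers.

```lua
local line = "keep them seperate, seperately"

-- gsub delivers the new string and the number of substitutions.
local fixed, count = line:gsub("seperat", "separat")
print(fixed)   -- keep them separate, separately
print(count)   -- 2

-- Counting the values with and without the extra parentheses.
local without = select("#", line:gsub("seperat", "separat"))
local with = select("#", (line:gsub("seperat", "separat")))
print(without, with)   -- 2 and 1
```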

For example 5 we suppose that we have text referring to files saved from a Unix system so that in RISC OS they have names like foo/c, foo/h or foo/o. We want to change these to the form c.foo, h.foo and o.foo respectively. Here is a script.

#! lua -- Ex5
local extns = { "c","h","o" } -- allowed extensions
local ok = {} -- empty table
-- _ is the conventional don't care variable, ipairs is the built-in list-part iterator, ok is indexed by strings
for _,v in ipairs ( extns ) do ok[v] = true end -- for
local search = "(([%w_]+)/([%w_]+))" -- pattern with 3 captures (); %w matches an alphanumeric
local out = "%s.%s" -- format string
local switch = \ (x,y,z) -- \ means function; the parameters correspond to the captures
    -- => means returns: the switched names if the extension is ok, the original expression if not
    => ok[z] and out:format (z,y) or x
  end -- function
for line in io.lines ( arg[1] ) do
    -- global substitution: search is the search pattern, switch the function of captures returning the replacement
    print ( ( line:gsub ( search, switch ) ) )
end -- for

In this example the search pattern has three captures. Captures are ordered by the positions of the opening parentheses that carve out the subexpression to be captured. The first (x parameter) encloses the whole search pattern, so x is used as the return value when no change is required. The second capture (y parameter) extends up to the forward-slash and the third (z parameter) is the extension. In the search expression [%w_] denotes either an alphanumeric character or an underscore. The + suffix means "at least one".
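The way the three captures arrive as parameters can be tried out in plain Lua, away from StrongED. This sketch uses the ordinary function and return keywords rather than the RiscLua shorthand, and the sample text is invented.

```lua
local ok = { c = true, h = true, o = true }   -- allowed extensions
local search = "(([%w_]+)/([%w_]+))"          -- three captures

-- x is the whole match, y the part before the slash, z the extension.
local function switch(x, y, z)
  if ok[z] then
    return z .. "." .. y   -- switched names
  end
  return x                 -- leave anything else as it was
end

local text = "compile foo/c with foo/h but leave docs/txt alone"
local result = text:gsub(search, switch)
print(result)   -- compile c.foo with h.foo but leave docs/txt alone
```

Assigning the result of gsub to a single local variable has the same effect as the extra parentheses: only the first return value is kept.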

The global substitution method is very powerful. Its last argument, when a function, can be used to perform side-effects that accumulate information from the text. For the next exercise let us suppose that we want a script that will scan the text for doctors (e.g. Dr Smith) and which will print out a list of them, in the order they were first encountered, together with the numbers of the lines in which they are mentioned.

#! lua -- Ex6
local search = "Dr%s+(%a+)" -- search pattern
local lineno = 0
local incline = \ () lineno = 1 + lineno end -- \ means function; no parameters
local push = \ (x,a) x[1 + #x] = a end -- push a onto stack x; # is the length operator
local drlist, refs = { push = push }, {} -- multi-assignment: a list with a push method, and an empty table
local scrape = \ ( name ) -- name is the capture
      if not refs[name] then -- have we met him before?
        drlist:push(name) -- if not, put him in the list
        refs[name] = { push = push } -- and create his line number list
      end -- if
      refs[name]:push (lineno) -- record the line
      end -- function
for line in io.lines ( arg[1] ) do
      incline() -- increase line number
      line:gsub(search,scrape) -- do the work
end -- for
local write in io -- write does output; io is the input/output library
for _,name in ipairs ( drlist ) do -- _ is a don't care variable; ipairs is the list iterator
    write(name," : ")
    for i,n in ipairs (refs[name]) do
      write ( n, (i%12 == 0) and " \n" or " ") -- after every 12-th item start a new line
    end -- for
    write "\n\n" -- blank line between
end -- for

The notation { push = push } deserves some comment. It is syntactic sugar for { ["push"] = push }. That is to say the left hand push is the name of a key; the right hand push is the value of the function defined in line 5. The notation drlist:push(name) is syntactic sugar for drlist.push(drlist,name) which in turn is syntactic sugar for drlist["push"](drlist,name). And that is just push(drlist,name).
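The chain of equivalences can be verified in plain Lua. This sketch defines push with the ordinary function keyword, and the names are just sample data; each of the four calls appends one item in exactly the same way.

```lua
local function push(x, a) x[1 + #x] = a end

local drlist = { push = push }    -- same as { ["push"] = push }

drlist:push("Smith")              -- method notation
drlist.push(drlist, "Jones")      -- what the colon stands for
drlist["push"](drlist, "Brown")   -- what the dot stands for
push(drlist, "Green")             -- the function itself

print(#drlist)                    -- 4
print(table.concat(drlist, ", ")) -- Smith, Jones, Brown, Green
```

The key push lives in the table alongside the list items, but the length operator and ipairs only look at the items numbered 1, 2, 3 and so on.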

In the same way, line:gsub(search,scrape) stands for line.gsub(line,search,scrape). However, line is a string, not a table, and strings have been set to look up their methods in the string library. So this expression is interpreted as string.gsub(line,search,scrape).
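Again this can be checked in plain Lua. The sample line is made up, and the last line of the sketch relies on the behaviour of standard Lua, where the metatable shared by all strings has the string library as its __index; it is an assumption here that RiscLua is set up the same way.

```lua
local line = "Dr Smith saw Dr Jones"

-- Three spellings of the same call; each local keeps only the first result.
local a = line:gsub("Dr", "Doctor")
local b = line.gsub(line, "Dr", "Doctor")
local c = string.gsub(line, "Dr", "Doctor")

print(a)                   -- Doctor Smith saw Doctor Jones
print(a == b and b == c)   -- true

-- Where strings look up their methods (standard Lua behaviour).
print(getmetatable("").__index == string)   -- true
```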