title logo

back contents forward

Chapter 6

iterators

To iterate over a numerical range we can write code in the form

       for  loopvar   =  start, finish, step     do  body     end 

The loop variable is local to the for-chunk. If the step value is omitted it is taken to be 1.

      addup = \ (...)
              local sum = 0
              for i = 1,select('#',...) do
                sum = sum + (select(i,...))
              end -- for
              => sum
              end -- function
      print(addup(1,2,3,4)) --> 10

There is a much more general form of for-chunk using an iterator

       for  varlist     in  exprlist     do  body     end 

The variables in varlist are local to the for-chunk.

We can iterate over the list part of a table t with

       for i,v in ipairs(t) do . . . end -- for
This is equivalent to, but more efficient than

       for i = 1,#t do
         local v = t[i]
            . . .
       end -- for 
To get a new list from a list, by applying a function to each member, we can use:

map = \(f) => \(t)
      local o = {}
      for i,v in ipairs(t) do o[i] = f(v) end -- for
      => o
      end end

We can iterate over all the key-value pairs of a table with

       for key,value in pairs(t) do . . . end -- for
but of course the ordering of the pairs is undefined. RiscLua provides a builtin iterator over directories

 for leafname,ftype in os.dir(directory) do
     . . .
 end -- for
The variable  ftype  is the numerical value of the object's filetype. The  io  library provides the iterator

 for line in io.lines(filename) do
    . . . .
 end -- for

which iterates over the lines of the file. In fact it is possible to write your own iterators. This is because the general for-loop

  for v_1, . . , v_n in iterator_expr do
    block
  end -- for

is equivalent to

   do
     local f, s, i = iterator_expr
     while true do
      local v_1, . . , v_n = f(s,i)
      i = v_1
      if not i then break end -- if
      block
     end -- while
   end -- do

We call v_1 the iteration variable. The loop continues until it takes the value nil. The expression between in and do must evaluate to three quantities: the iteration function,  f , the invariant state,  s  and the initial value of the iteration variable.

Here we cook up an example of an iterator. It iterates over the prime divisors of its argument. It relies on the obvious fact that the smallest divisor of a number must be a prime number.

primdiv = \(n)
          assert(n ~= 0)
          if n < 0 then n = -n end -- if
          local f = \(s,v)  -- s not used
                local p
                if n > 0 then
                 while n%v > 0 do
                  v = v + ( v == 2 and 1 or 2)
                  if v*v > n then v = n end -- if
                 end -- while
                 n = n/v; => v
                end -- if
                end -- function
           => f,nil,2
           end

for p in primdiv(84) do
  io.write(p," ")
end

--> 2 2 3 7
The while-chunk in the body of the iteration function increases the iteration variable, a candidate for a divisor of n , until either it does divide n or it is greater than the square root of n (from which we deduce that n itself is prime).

The notion of for-chunk is not strictly necessary. It just makes for more intelligible code.

sorting

The function table.sort can be used to sort lists. It takes a list for its first argument, and for its optional second argument a function of two parameters which returns the boolean value of the statement that the first parameter precedes the second. For example, to sort candidates in decreasing order of marks:

candidates = {
 [[Alfred Wang]];
 [[Efthymia Callionomasta]];
 [[Gorgo Goodwillie]];
 [[Morten Torsksind]];
 [[Josquin de Vilbrequin]];
              }
marks = {
  ["Alfred Wang"] = 78;
  ["Efthymia Callionomasta"] = 66;
  ["Gorgo Goodwillie"] = 74;
  ["Morten Torsksind"] = 28;
  ["Josquin de Vilbrequin"] = 98;
         }

table.sort(candidates,\(x,y) => marks[x] > marks[y] end)
mesg = "%d.%s%s  -- %d"
for i,x in ipairs(candidates) do
  print(mesg:format(i,x,(" "):rep(32-#x),marks[x]))
end

produces

1.Josquin de Vilbrequin           -- 98
2.Alfred Wang                     -- 78
3.Gorgo Goodwillie                -- 74
4.Efthymia Callionomasta          -- 66
5.Morten Torsksind                -- 28

Note the use of an unnamed function as second argument to table.sort . The function string.rep is used for repeating a string. Here we repeat a blank space to format the output. The candidates table has been rearranged in decreasing order of marks.
coroutines

Suppose we needed to search a directory containing lots of text files for the occurrence of a certain word. We want a program that will either report that the word does not occur, or report the name of the first file and the number of the first line in which it finds an occurrence. Of course, we have seen how to iterate over lines in a file and over files in a directory, so we can certainly put the two together and cobble up a program with a double for-loop. But suppose that we have reason to believe that the word we are seeking is actually close to the beginning of some file, and that the files are very big; too big, maybe, for storing in RAM. Under those circumstances it would make more sense to slice the cake differently: to search all the first lines first, then all the second lines, and so on. But how can we do this?

The answer is to use coroutines. Here is an example program.

--[[ Find  an occurrence of a word w in a text file
in a directory d, returning the file's name and the
line number of the first occurrence found or nil]]

makecounter = \()  -- factory for line counters
    local count = 1
    => \ ()
       local n = count
       count = count + 1
       => n
       end -- function
    end -- function

find = \(w,d)
  -- create a line counter and a
  -- coroutine for each text file
       local co,sep = {},'.'
       local create,yield,resume in coroutine
       local lines in io
       for leaf,ftype in os.dir(d) do
        if ftype == 0xfff then  -- text file
         local linecount = makecounter()
         co[leaf] = create( \()
           for line in lines(d..sep..leaf) do
             yield(line:match(w) and
                leaf,linecount())
           end -- for
             end)
        end -- if
       end -- for
       -- run the coroutines
       local status,some,result,st,line = {},true
       while some and not result do
          for leaf,x in pairs(co) do
            if x then
             st,result,line = resume(x)
             if st and result then
                => result,line
             end -- if
             status[leaf] = st
            end -- if
           end -- for
           -- has nothing been found?
           some = false
           for leaf,st in pairs(status) do
             co[leaf] = st and co[leaf]
             some = some or st
           end
       end -- while
        end -- function

d = "Mem::Memphis.$.Search_here"
w = "Rumpelstiltskin"
leaf,line = find(w,d)
if leaf then
 print("Found in file ",leaf," at line ",line)
else
 print "Not found"
end

We have used three functions:  coroutine.create creates a coroutine, coroutine.resume is used for entering or resuming a coroutine, and coroutine.yield for returning from a coroutine.

The function find iterates through the text files of the directory - we have no way of knowing in what order. It creates a line counter function and a coroutine for each one. The job of the coroutine is to scan a single line of the file. It returns three things: a boolean value to say if all went well (all coroutines return this), the name of the file if the word is found and nil otherwise, and lastly the current line count. Then these coroutines are all started - again we have no way of knowing in what order. Once a coroutine has reached the end of its file, it dies. The coroutines are resumed until either success has been reported by one or they have all died. Between one particular coroutine's reading of one line and its resumption to read the next, all the other coroutines still alive will have been resumed. The total effect is that the scanning of the files in the directory is the inner loop, the incrementing of the line number is the outer loop, with files dropping out if all their lines have been scanned.


back contents forward