If A and B are patterns the pattern A | B (also written A + B in Lpeg) first tries to match A and if that fails it tries B. If we suppose that the code for our patterns are subroutines then the code for A | B can be obtained by following the code for A with that of B, replacing the failure-exit of A with an instruction to reset the text-pointer back to its starting position and then dropping through to the code for B. A similar set of modifications apply to multiple alternatives
A | B | C ......with each one but the last dropping through, on failure, to the next, with resetting of the text-pointer.
In the same way, the code for !A may be got by swapping the succes and failure exits, and by adding an instruction to reset the text-pointer before the success exit.
In matching a pattern to a text one often wants to capture information. What sort of information will clearly depend strongly on the language in which one's pattern matching code is embedded. But in the simplest case one might consider patterns that always succeed, but which as a side effect push the current text pointer to a stack. From these pointer values we can then extract substrings from the text. For simplicity we would only use these capture patterns at the top level and outside repetitions.
If non-terminals correspond to subroutines, then a production in a grammar, of the form
X = U V W ....might be coded as
X_out LDR PC, [stack], #4 X_enter STR PC, [stack, #-4]! BL U_enter TST status BEQ X_out BL V_enter TST status BEQ X_out BL W_enter TST status BEQ X_out ...so that the tree-structure of the algebra would be reflected by the call-structure of the code. The subroutines would not save the register TXT_PTR used for the current text-pointer. The status register is here represented as being zero for failure and non-zero for success. It might be more efficient to minimize the branch-with-link instructions and instead use inline code
X_out LDR PC, [stack], #4 X_enter STR PC, [stack, #-4]! <U's code> TST status BEQ X_out <V's code> TST status BEQ X_out <W's code> TST status BEQ X_out ...The same considerations apply to productions of the form
X = U | V | W ....which might be coded as
X_out LDR PC, [stack], #4 X_enter STR PC, [stack, #-4]! MOV START, TXT_PTR BL U_enter TST status BNE X_out MOV TXT_PTR,START BL V_enter TST status BNE X_out MOV TXT_PTR,START BL W_enter TST status BNE X_out ...Here START would be a register for local use, not used by the subroutines, for restoring the text-pointer after failure.
Conversely there may also be situations when it pays for the compiler to introduce more non-terminals in order to simplify complex expressions.