Monday, February 20, 2012

4. Syntactic analysis of Wally


Parts: 1 2 3 4 5 6 7

In this part we'll be creating the parser rules for Wally. If you haven't already got the necessary files, download them here.

A Wally script optionally starts with one ore more use-statements followed by zero or more file_atoms and ending with an optional return statement. A use statement looks like this:

# use object Tuple from the file Tuple.wy
use Tuple from "Tuple.wy"
 
# use everything from the file Math.wy
use * from "Math.wy"

Resulting in the following parser rules:

parse
 : EOL* use_stats? file_atom* return_stat EOF
 ;
 
use_stats
 : use_stat+
 ;
 
use_stat
 : Use (Id | '*') From Str EOL+
 ;

A file_atom is one of three things:

  • an object;
  • a function;
  • or a block_atom (an assignment, function call, if-statement, ...).
file_atom
 : object
 | function
 | block_atom
 ;
 
object
 : Obj Id EOL INDENT object_atom+ DEDENT EOL* 
 ;
 
object_atom
 : vars
 | function
 ;
 
vars
 : Id (',' Id)* EOL
 ;
 
function
 : any_id id_list EOL block
 ;
 
any_id
 : ReservedId
 | In
 | Id
 ;
 
id_list
 : '(' (Id (',' Id)*)? ')'
 ;
 
block
 : INDENT block_atom* return_stat DEDENT
 ;

Since there is a bit of ambiguity w.r.t. an assignment and a function_call (both can start with a lookup), we'll add a syntactic predicate before the first to make sure we only parse an assignment if there really is an assignment ahead.

block_atom
 : (assignment)=> assignment EOL
 | function_call EOL
 | Break EOL
 | for_stat
 | if_stat
 | while_stat
 ;
 
for_stat
 : For Id '=' expr To expr EOL block
 | For Id In expr EOL block
 ;
 
if_stat
 : If expr EOL block elif_stat* else_stat?
 ;
 
elif_stat
 : Elif expr EOL block
 ;
 
else_stat
 : Else EOL block
 ;
 
while_stat
 : While expr EOL block
 ;
 
assignment
 : lookup ( '=' expr
          | '||=' expr
          | '&&=' expr
          | '+=' expr
          | '-=' expr
          | '*=' expr
          | '/=' expr
          | '%=' expr
          | '^=' expr
          )
 ;
 
function_call
 : lookup
 | Print '(' expr ')'
 | Println '(' expr? ')'
 | Assert '(' expr (',' expr)? ')'
 ;
 
return_stat
 : Return expr EOL
 | /* epsilon */
 ;

Below are the various expression rules:

expr
 : or ( '?' expr ':' expr
      | In expr
      )?
 ;
 
or
 : and (Or and)*
 ;
 
and
 : rel (And eq)*
 ;
 
rel
 : eq ((Lt | Gt | LtEq | GtEq) eq)?
 ;
 
eq
 : add ((Eq | NEq) add)*
 ;
 
add
 : mult ((Add | Sub) mult)*
 ;
 
mult
 : unary ((Mult | Div | Mod) unary)*
 ;
 
unary
 : Sub pow
 | Not unary
 | pow
 ;
 
pow
 : atom (Pow atom)*
 ;
 
atom
 : '(' expr ')'
 | New Id expr_params
 | lookup
 | list
 | Param
 | Str
 | Num
 | True
 | False
 | None
 ;
 
lookup
 : Id tail*
 | This tail*
 | any_id expr_params
 ;
 
tail
 : '.' Id
 | '[' expr ']'
 | '.' any_id expr_params
 ;
 
expr_params
 : '(' (expr (',' expr)*)? ')'
 ;
 
list
 : OBrace (expr (',' expr)*)? CBrace
 ;

That's the entire parser grammar. Now paste the following in the Test.wy file:

# import Tuple
use Tuple from "Tuple.wy"
 
# function twice
twice(n)
  temp = n + n
  return temp
 
# a rational object
obj Rational
 
  num, den
 
  Rational(n, d)
    num = n
    den = d
 
  str()
    return num + "/" + den
 
# script starts here
r = new Rational(1, 2)
t = twice(r)
println("t = " + t)

Recreate the JAR file and execute the Test.wy script:

ant jar
java -jar Wally.jar src/scripts/Test.wy

Again, you shouldn't see anything on the console, meaning the tokenization and parsing of the input went okay.

Download the project with the updates from this part here.

The next step is to let out parser construct a proper AST for us to walk (and evaluate) at a later time. Continue reading here.

No comments: