Writing your first grammar

First steps

Let's try creating a grammar which parses a signed integer:

from metasyntax import *

grammar = Grammar({
   "start": GrammarRule(
      Start(),
      Or(
         String("-"),
         String("+"),
         String("")
      ),
      Regex(r"\d+"),
      End()
   )
})
  1. We import * from metasyntax.
  2. We create a Grammar instance under the variable name grammar. We supply a dict as a parameter.
  3. In that dict, we define a rule named start. This is the rule the parser always uses as its starting point.
  4. We add an Or token, which matches if at least one of its contained tokens matches. Here it contains three tokens: the first two match - and +, and the third matches the empty string. This makes the sign optional.
  5. We add a Regex token containing a regular expression that matches one or more digits.
  6. We add an End token. This ensures that nothing follows the preceding tokens.
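
For comparison, the language this grammar describes (an optional sign, one or more digits, then the end of the input) can be sketched with Python's standard re module. This does not use metasyntax at all; the names SIGNED_INT and is_signed_int are made up for this illustration.

```python
import re

# Optional "-" or "+", then one or more digits.
# fullmatch() requires the whole string to match, which plays
# the role of the End token in the grammar above.
SIGNED_INT = re.compile(r"[-+]?\d+")

def is_signed_int(text):
    """Return True if text is an optionally signed integer literal."""
    return SIGNED_INT.fullmatch(text) is not None

for sample in ("-5", "+12", "7", "5a", "--3", ""):
    print(repr(sample), is_signed_int(sample))
```

The grammar version is longer, but each piece is a separate token, and that is what lets semantic actions be attached to it later.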

Adding semantics

You can add semantic actions by subclassing object and defining a method for each rule. Every token returns an object on a match: a String returns itself, a Regex returns a re match object, a Rule returns a list of the return values of the tokens in the named rule, and so on. The method defined for a rule takes self and that list of return values as its parameters. This becomes useful as you add more rules to your grammar and they call each other with Rule.

An example:

class Semantics(object):
   def start(self, obj):
      # obj[2] is the re match object from the Regex token
      num = int(obj[2].group(0))
      # obj[1] is the sign string matched by the Or token
      if obj[1] == "-":
         num = -num
      return num
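
The method above can be tried without the parser by passing it a hand-built list. This is only a sketch: fake_rule_result is a hypothetical helper, the None placeholder at index 0 is an assumption (the method never reads that position), and the real list would be produced by metasyntax, not by this code.

```python
import re

class Semantics(object):
    def start(self, obj):
        # obj[2] is a re match object holding the digits
        num = int(obj[2].group(0))
        # obj[1] is the sign string: "-", "+" or ""
        if obj[1] == "-":
            num = -num
        return num

def fake_rule_result(text):
    """Hypothetical stand-in for the list the parser would pass in.

    Only valid signed integers are expected here.
    """
    whole = re.fullmatch(r"([-+]?)(\d+)", text)
    sign = whole.group(1)
    digits = re.match(r"\d+", whole.group(2))
    # Index 0 is an assumed placeholder; start() ignores it.
    return [None, sign, digits]

semantics = Semantics()
results = [semantics.start(fake_rule_result(t)) for t in ("-5", "+7", "42")]
print(results)
```

Calling the method directly like this is a quick way to check the sign handling before wiring the class into a Grammar.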

Putting it together

from metasyntax import *

class Semantics(object):
   def start(self, obj):
      # obj[2] is the re match object from the Regex token
      num = int(obj[2].group(0))
      # obj[1] is the sign string matched by the Or token
      if obj[1] == "-":
         num = -num
      return num

grammar = Grammar({
   "start": GrammarRule(
      Or(
         String("-"),
         String("+"),
         String("")
      ),
      Regex(r"\d+"),
      End()
   )
}, Semantics())

print(grammar("-5"))

If you run this, it should print the integer -5 returned by grammar(). This is a trivial use case, but it serves as a good example.