pyPEG – a PEG Parser-Interpreter in Python
Requires Python 3.x or 2.7
Older versions: pyPEG 1.x
Offers parsing and composing capabilities. Implements an intrinsic Packrat parser.
pyPEG uses memoization to speed up parsing. Creating a new
Parser instance gives you an empty cache memory.
This is usually recommended when you parse another text: a stale cache
will not produce wrong results, but resetting it reduces memory
consumption. If you alter the grammar, however, you must clear the cache
memory for the affected things to get correct parsing
results. Use the
clear_memory() method in that
case.
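To illustrate why a changed grammar needs a cleared cache, here is a minimal sketch of memoized rule application in plain Python. It is a toy model, not pyPEG's internal implementation: the names ToyParser, apply, calls and digits are hypothetical and exist only for this illustration.

```python
# Illustrative only: a toy memoizing matcher, not pyPEG's internal code.
class ToyParser:
    def __init__(self):
        self.memory = {}   # (id of rule, text) -> cached result
        self.calls = 0     # counts real (uncached) rule evaluations

    def clear_memory(self, rule=None):
        """Forget cached results for one rule, or for all rules if None."""
        if rule is None:
            self.memory = {}
        else:
            self.memory = {k: v for k, v in self.memory.items()
                           if k[0] != id(rule)}

    def apply(self, rule, text):
        """Run rule on text, reusing a cached result when one exists."""
        key = (id(rule), text)
        if key not in self.memory:
            self.calls += 1
            self.memory[key] = rule(text)
        return self.memory[key]


def digits(text):
    """Return (matched, rest) for a leading run of digits, or None."""
    n = 0
    while n < len(text) and text[n].isdigit():
        n += 1
    return (text[:n], text[n:]) if n else None


p = ToyParser()
p.apply(digits, "123abc")
p.apply(digits, "123abc")   # served from the cache, rule is not re-run
print(p.calls)              # 1
p.clear_memory(digits)
p.apply(digits, "123abc")   # cache entry was dropped, rule runs again
print(p.calls)              # 2
```

Because cached results are keyed by the rule, a rule whose behaviour changes would keep returning old results until its entries are cleared.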
The instance variables represent the parser's state.
| Regular expression to scan whitespace; default: |
|
|
| after parsing, |
| string to use to indent while composing; default: four spaces |
| level to indent to; default: |
| original text to parse; set for decorated syntax errors |
| filename the text originates from |
| add blanks while composing if grammar would possibly be violated otherwise; default: True |
| keep otherwise cropped things like comments and whitespace; these
things are being put into the |
__init__(self)
Initialize instance variables to their defaults.
clear_memory(self, thing=None)
Clear cache memory for packrat parsing.
This method clears the cache memory for thing. If None is given
as thing, it clears the cache completely.
| thing | thing for which cache memory is cleared; default: None |
parse(self, text, thing, filename=None)
(Partially) parse text following thing as grammar and return the
resulting things.
This method parses as far as possible. It does not raise a
SyntaxError if the source text does not parse completely. If
the beginning of the source text does not comply with the grammar
thing, it returns a SyntaxError object as the result part of the
return value.
| text | text to parse |
| thing | grammar for things to parse |
| filename | name of the file the text originates from |
Returns (text, result) with:
| text | unparsed text |
| result | generated objects |
| if input does not match types |
| if output classes have wrong syntax for their respective |
| if grammar contains an object of unknown type |
| if grammar contains an illegal cardinality value |
Example:
>>> from pypeg2 import Parser, csl, word
>>> p = Parser()
>>> p.parse("hello, world!", csl(word))
('!', ['hello', 'world'])
compose(self, thing, grammar=None)
Compose text using thing with grammar. If thing.compose()
exists, execute it, otherwise use grammar to compose.
|
|
|
|
Composed text
| if |
| if |
| if |
Example:
>>> from pypeg2 import Parser, csl, word
>>> p = Parser()
>>> p.compose(['hello', 'world'], csl(word))
'hello, world'
generate_syntax_error(self, msg, pos)
Generate a syntax error construct.
| string with error message |
|
|
Instance of SyntaxError with error text
parse(text, thing, filename=None, whitespace=whitespace,
comment=None, keep_feeble_things=False)
Parse text following thing as grammar and return the resulting things or
raise an error.
|
|
|
|
|
|
| regular expression to skip |
|
|
| keep otherwise cropped things like comments and whitespace; these
things are being put into the |
generated things
| if |
| if input does not match types |
| if output classes have wrong syntax for |
| if |
| if |
Example:
>>> from pypeg2 import parse, csl, word
>>> parse("hello, world", csl(word))
['hello', 'world']
compose(thing, grammar=None, indent="    ", autoblank=True)
Compose text using thing with grammar.
|
|
|
|
| string to use to indent while composing; default: four spaces |
| add blanks if grammar would possibly be violated otherwise; default: True |
composed text
| if input does not match |
| if |
| if |
Example:
>>> from pypeg2 import compose, csl, word
>>> compose(['hello', 'world'], csl(word))
'hello, world'
attributes(grammar, invisible=False)
Iterates over all attributes of a grammar.
This function can be used to iterate through all attributes which
will be generated for the top level object of the grammar. If
invisible is False, attributes whose names start with
an underscore _ are omitted.
Example:
>>> from pypeg2 import attr, name, attributes, word, restline
>>> class Me:
...     grammar = name(), attr("typing", word), restline
...
>>> for a in attributes(Me.grammar): print(a.name)
...
name
typing
>>>
how_many(grammar)
Determines how many objects parsing with grammar can produce.
This function is meant to check whether the results of a grammar can be stored in a single object or whether a collection is needed.
| 0 | if there will be no objects |
| 1 | if there will be a maximum of one object |
| 2 | if there can be more than one object |
| if |
| if |
Example:
>>> from pypeg2 import how_many, word, csl
>>> how_many("some")
0
>>> how_many(word)
1
>>> how_many(csl(word))
2
Base class for all errors pyPEG delivers.
A grammar contains an object of a type which cannot be parsed,
for example an instance of an unknown class or of a basic type
like float. It can be caused by an int at the wrong place, too.
A grammar contains an object with an illegal value, for example an undefined cardinality.