Re: cl-yacc vs parsegen:defparser
On Wednesday 25 April 2007 9:47 am, Joel Reymont wrote:
> Is there any reason to prefer one over the other?
Joel,
From your series of questions, it's hard to tell what you are trying to
accomplish (e.g. maybe a pretty-printer, or maybe a full-blown compiler).
You mentioned C/C++/C#. These languages (C/C++ for sure, C# I don't know) are
ridiculously "hard" to parse because you can't simply do it using syntax
alone. You will end up needing to track typedefs and to deal with macros.
For example, the statements:
MACRO();
and
MACRO;
and
MACRO
are "legal", but are not part of the 'formal' syntax of C/C++ (they are
replaced by the macro processor before the code reaches the parser).
Furthermore, typedefs make the syntax of C/C++ ambiguous and you will need to
build symbol tables into the scanner to disambiguate.
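To make the typedef problem concrete: the text "T * x;" declares a pointer x
if T names a type, but is a multiplication if T is a variable. Below is a
minimal sketch in Python of a scanner that consults a symbol table to decide.
The names tokenize, classify_statement and typedef_names are made up for
illustration, and the sketch assumes the set of typedef names is already
known; a real front end would update it as typedef declarations are parsed,
and would also handle keywords and scopes.

```python
import re


def tokenize(source, typedef_names):
    """Split source into (kind, text) pairs.

    Identifiers listed in typedef_names are reported as TYPE_NAME instead
    of IDENT, so the parser can tell the two apart. (Hypothetical helper,
    not part of any real parser generator.)
    """
    tokens = []
    for text in re.findall(r"[A-Za-z_]\w*|\d+|\S", source):
        if text in typedef_names:
            kind = "TYPE_NAME"
        elif text[0].isalpha() or text[0] == "_":
            kind = "IDENT"
        elif text.isdigit():
            kind = "NUMBER"
        else:
            kind = "PUNCT"
        tokens.append((kind, text))
    return tokens


def classify_statement(source, typedef_names):
    """Very rough classification: a statement that starts with a type
    name is treated as a declaration, anything else as an expression."""
    tokens = tokenize(source, typedef_names)
    if tokens and tokens[0][0] == "TYPE_NAME":
        return "declaration"
    return "expression"


# Same text, two readings, depending only on the symbol table.
as_decl = classify_statement("T * x;", {"T"})
as_expr = classify_statement("T * x;", set())
```

The grammar itself never changes here; only the token kind handed to the
parser does, which is why the table has to live in (or be visible to) the
scanner.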
Using a bottom-up parser (e.g. yacc's) will force you to go all the way and
write a grammar for the whole language, even if you want something "simpler".
Whatever you do, you are going to find that you need a "state machine" of some
kind for parsing this class of languages.
If you are trying to accomplish something simpler (like a pretty printer), you
might also want to look at backtracking parsers - e.g. TXL (not Lisp) and the
fairly new parsing technology called packrat parsing (PEG). I think (but
don't know) that ANTLR may also have abilities of this kind (or different
abilities that attack the same problem).
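For a feel of what packrat parsing involves, here is a minimal sketch in
Python: a recursive-descent parser with backtracking in which the result of
every (rule, position) pair is cached, so no rule is ever re-run at the same
position. The toy grammar (sums of integers with parentheses, no whitespace)
and all the names are invented for illustration; this is not taken from the
packrat page linked below.

```python
class PackratParser:
    """Toy PEG:  Expr <- Term ('+' Term)*
                 Term <- Number / '(' Expr ')'
    Each rule returns (value, next_position) on success or None on failure."""

    def __init__(self, text):
        self.text = text
        self.memo = {}  # (rule name, position) -> cached result

    def apply(self, rule, pos):
        # The packrat step: evaluate each rule at each position at most once.
        key = (rule.__name__, pos)
        if key not in self.memo:
            self.memo[key] = rule(pos)
        return self.memo[key]

    def expr(self, pos):
        result = self.apply(self.term, pos)
        if result is None:
            return None
        value, pos = result
        while pos < len(self.text) and self.text[pos] == "+":
            nxt = self.apply(self.term, pos + 1)
            if nxt is None:
                break  # leave the '+' unconsumed
            rhs, pos = nxt
            value += rhs
        return value, pos

    def term(self, pos):
        # Ordered choice: try a number first, then a parenthesized Expr.
        end = pos
        while end < len(self.text) and self.text[end] in "0123456789":
            end += 1
        if end > pos:
            return int(self.text[pos:end]), end
        if pos < len(self.text) and self.text[pos] == "(":
            inner = self.apply(self.expr, pos + 1)
            if inner is not None:
                value, after = inner
                if after < len(self.text) and self.text[after] == ")":
                    return value, after + 1
        return None


def parse(text):
    """Return the value of the sum, or None if text is not a complete Expr."""
    parser = PackratParser(text)
    result = parser.apply(parser.expr, 0)
    if result is None or result[1] != len(text):
        return None
    return result[0]


total = parse("1+(2+3)")
```

The memo table is what separates this from a plain backtracking parser: it
trades memory for a guarantee that backtracking never repeats work.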
In fact, in the next compiler I build (esp. if it were for a C-like language),
I would look carefully at PEG parsing. But, then, I enjoy building compiler
tools :-). YMMV.
http://pdos.csail.mit.edu/~baford/packrat/
pt