Sunday, 15 November 2015

1. Introduction to PLY

PLY is a pure-Python implementation of the popular compiler construction tools lex and yacc. The main goal of PLY is to stay fairly faithful to the way in which traditional lex/yacc tools work. This includes supporting LALR(1) parsing as well as providing extensive input validation, error reporting, and diagnostics. Thus, if you've used yacc in another programming language, it should be relatively straightforward to use PLY.

In this article, we focus on prototyping interpreters using Python and PLY. However, this toolset could easily be used to write compilers as well. Simple syntax features such as string interpolation and Python's triple quotes for quoting multiple line strings are convenient when creating compilers. Furthermore, compilers do not have the performance constraints that interpreters do, so it is reasonable to write a production compiler in Python, instead of just a prototype. Nonetheless, we were still surprised at how easy Python made it thanks to tricks such as reusing Python's type system and libraries.

Early versions of PLY were developed to support an Introduction to Compilers Course taught in 2001 at the University of Chicago. In this course, students built a fully functional compiler for a simple Pascal-like language. Their compiler, implemented entirely in Python, had to include lexical analysis, parsing, type checking, type inference, nested scoping, and code generation for the SPARC processor.

Since PLY was primarily developed as an instructional tool, you will find it to be fairly picky about token and grammar rule specification. In part, this added formality is meant to catch common programming mistakes made by novice users. However, advanced users will also find such features to be useful when building complicated grammars for real programming languages. It should also be noted that PLY does not provide much in the way of bells and whistles (e.g., automatic construction of abstract syntax trees, tree traversal, etc.). Nor would we consider it to be a parsing framework. Instead, you will find a bare-bones, yet fully capable lex/yacc implementation written entirely in Python.

The rest of this document assumes that you are somewhat familiar with parsing theory, syntax directed translation, and the use of compiler construction tools such as lex and yacc in other programming languages. If you are unfamiliar with these topics, you will probably want to consult an introductory text such as "Compilers: Principles, Techniques, and Tools", by Aho, Sethi, and Ullman. O'Reilly's "Lex and Yacc" by John Levine may also be handy. In fact, the O'Reilly book can be used as a reference for PLY as the concepts are virtually identical.


No comments:

Post a Comment