Programming Assignment 4 (Due on Feb. 29)

In this assignment you will add a checking pass to your compiler: an algorithm that traverses the syntax tree and decides whether it represents legal code.

Files

For this assignment, you will be downloading a compressed "tar" (tape-archive) file which contains all the files you will need. Keep all the files in a single directory.

Once you've downloaded the compressed tar file, execute the following commands:

This will create a directory called pa4 and will put in it all the files you need. You can then remove the pa4.tar file.

You'll notice that we have now moved to a structure where each Java class is in its own source file. This is because of the growing interdependence between classes. For example, in this project, the classes SymTable and Entry refer to Node, and most of the Node classes will refer to those classes.


The Problem

There is only one part to this assignment; you don't have to do it using both the top-down and bottom-up parsers. We will continue with the top-down parser. Included in the tar file is a correct solution to Programming Assignment 3. It has some added features that will be discussed below.

There are two equally reasonable ways to proceed with the project: the procedural approach and the object-oriented approach. We will be using the OO approach. At the end of this document is a discussion of how these approaches differ.

The program you turn in should check an input Decaf program for the following properties:

That seems like a lot, but using a few basic ideas about symbol tables and stacks, it can be done relatively painlessly. Note that once parsing succeeds for an input program, your code should report all of the errors the program contains, not just the first.
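As a rough illustration of the symbol-table-and-stack idea, here is a minimal sketch of a scoped symbol table kept as a stack of maps. The class name ScopeStack and its methods are hypothetical; this is not the SymTable class supplied in the tar file, whose interface may differ. The sketch also assumes that a duplicate is only an error within the innermost scope, which may not match the rules you are asked to enforce.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical scoped symbol table: one map per open scope, innermost on top.
class ScopeStack {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void enterScope() {
        scopes.push(new HashMap<>());
    }

    void exitScope() {
        scopes.pop();
    }

    // Declares a name in the innermost scope. Returns false if the name is
    // already declared there. Assumes at least one scope is open.
    boolean declare(String name, String type) {
        Map<String, String> top = scopes.peek();
        if (top.containsKey(name)) {
            return false;
        }
        top.put(name, type);
        return true;
    }

    // Searches from the innermost scope outward; returns null if undeclared.
    String lookup(String name) {
        for (Map<String, String> scope : scopes) {
            String type = scope.get(name);
            if (type != null) {
                return type;
            }
        }
        return null;
    }
}
```

Entering a scope pushes a fresh map and leaving it pops that map, so names declared inside a block disappear automatically when the block ends.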

Getting Started

Download the tar file and un-tar it as described above. Now execute the following instructions:

Both commands should complete with no errors; in fact, they should print no messages of any kind.

OK, so what do you have to do? The actual code in Main.java looks like this:

So we're sending the check message to the top node of the syntax tree created by the parser. You will be adding to each node class a method that checks that node for semantic correctness. In the case of the ClassDecl node, that method has been completed for you, and looks like this:

First we tell the symbol table that we're entering a new scope. Then we tell it to register the name of the class itself. If any later variable reuses this name, the name will already be in the table, and you should report an error. Then we register the name of each method in the symbol table. This is necessary because a method can be called before its declaration appears in the file. Finally, we send the check message to each method and close the scope.
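The reason for registering every method name before checking any method body can be seen in a small sketch. Everything here is a simplified stand-in: ClassSketch and MethodSketch are hypothetical names, a plain Set plays the role of the symbol table, and a method body is reduced to the list of method names it calls. The provided ClassDecl and SymTable classes will look different.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a method declaration; the body is reduced to
// the names of the methods it calls.
class MethodSketch {
    final String name;
    final List<String> calls;

    MethodSketch(String name, List<String> calls) {
        this.name = name;
        this.calls = calls;
    }

    // Reports every call to a name that is not in the table.
    void check(Set<String> table, List<String> errors) {
        for (String callee : calls) {
            if (!table.contains(callee)) {
                errors.add(name + ": undeclared method " + callee);
            }
        }
    }
}

// Hypothetical stand-in for a class declaration.
class ClassSketch {
    final String name;
    final List<MethodSketch> methods;

    ClassSketch(String name, List<MethodSketch> methods) {
        this.name = name;
        this.methods = methods;
    }

    List<String> check() {
        List<String> errors = new ArrayList<>();
        Set<String> table = new HashSet<>();   // "enter" the class scope
        table.add(name);                       // register the class name
        for (MethodSketch m : methods) {       // pass 1: register every method name
            if (!table.add(m.name)) {
                errors.add("duplicate name " + m.name);
            }
        }
        for (MethodSketch m : methods) {       // pass 2: check each method
            m.check(table, errors);
        }
        return errors;                         // the table is dropped: scope closed
    }
}
```

With a class whose method f calls g, where g is declared after f, the second pass finds g in the table and reports nothing for that call. If the two loops were merged into one, the same call would be flagged as undeclared.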

This is the pattern: for each type of node in the syntax tree, you check its children and do some checking on the node itself. To start the assignment, look at the code for MethodDecl. In fact, that class's check() method is also already finished for you, but you need to read it to see which other check() methods it calls. Continuing this way, you will eventually visit each node class and fill in its check() method.

Some basic rules to follow when writing your checking code:

Helper classes

You are also provided with various helper classes. For convenience while browsing this HTML document, the source code for these classes is available in un-tarred form. Don't download the classes from the links, though; you already get them when you download the tar file.

Added Features

There are a few additions to the Parser we built in the previous assignment. The most significant is that Nodes now have a field that keeps their range (location in the source program). This makes the Checker's error messages a lot more useful. To fill these values in, the Parser has to do some more work too, saving the locations of Tokens as it parses.
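To see why a range makes error messages more useful, consider the following sketch. The names SourceRange, spanning, and Diagnostics are assumptions made for illustration; the range field in the provided Node class may be laid out and named differently.

```java
// Hypothetical source range: start and end positions as line/column pairs.
final class SourceRange {
    final int startLine;
    final int startCol;
    final int endLine;
    final int endCol;

    SourceRange(int startLine, int startCol, int endLine, int endCol) {
        this.startLine = startLine;
        this.startCol = startCol;
        this.endLine = endLine;
        this.endCol = endCol;
    }

    // Range running from the start of 'first' to the end of 'last', as a
    // parser might compute for a node from its first and last tokens.
    static SourceRange spanning(SourceRange first, SourceRange last) {
        return new SourceRange(first.startLine, first.startCol,
                               last.endLine, last.endCol);
    }

    @Override
    public String toString() {
        return startLine + ":" + startCol + "-" + endLine + ":" + endCol;
    }
}

// Hypothetical helper that prefixes a message with the offending location.
final class Diagnostics {
    static String error(SourceRange where, String message) {
        return "error at " + where + ": " + message;
    }
}
```

A checker that has the node's range at hand can point the user at the exact span of source text instead of just saying that something, somewhere, is wrong.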

There is also a new method in the NodeList class called length() that returns the number of elements in the list. This will be useful when you have to count parameters in parameter lists.
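A parameter-count check might be sketched as follows. This is only an illustration: java.util.List stands in for NodeList, size() plays the role of length(), and the ArityCheck class and its message format are hypothetical.

```java
import java.util.List;

// Hypothetical arity check. java.util.List stands in for NodeList, and
// size() plays the role of NodeList's length().
final class ArityCheck {
    // Returns an error message, or null if the two counts match.
    static String check(String method, List<String> paramTypes, List<String> argTypes) {
        if (paramTypes.size() != argTypes.size()) {
            return method + " expects " + paramTypes.size()
                    + " argument(s) but got " + argTypes.size();
        }
        return null;
    }
}
```

Comparing the counts first also means a later type-by-type comparison of parameters and arguments can safely walk both lists in step.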


Procedural vs. OO

The following discussion compares the procedural and OO programming styles. You do not have to read it to do the project, but it should give you good insight into different ways of implementing a type checker.

The procedural approach closely resembles things we've done before. (Recall that we won't be using this approach; it's just for illustration.) You would use a framework just like the one used in the top-down parser. That is, you would have a class called Checker, which has methods like checkClassDecl(), checkReturnStmt(), checkEqualityExpr(), and so forth. You could almost just replace every parseXXX() method in the parser with a checkXXX() method in the checker. It would look something like this:

At each step you just check each part of the node you're passed as a parameter. So what's the drawback? Consider the routine for checking statements:
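A routine of that shape might look like the sketch below. It is abbreviated to three statement types, and the class names are assumptions for illustration, not the node classes from the tar file. Each checkXXX() method just returns a string so that the dispatch is easy to observe.

```java
// Hypothetical statement node types, for illustration only.
abstract class Stmt {}
class IfStmt extends Stmt {}
class WhileStmt extends Stmt {}
class ReturnStmt extends Stmt {}

class Checker {
    // Procedural dispatch: the checker must test for every subclass by hand.
    String checkStatement(Stmt s) {
        if (s instanceof IfStmt) {
            return checkIfStmt((IfStmt) s);
        } else if (s instanceof WhileStmt) {
            return checkWhileStmt((WhileStmt) s);
        } else if (s instanceof ReturnStmt) {
            return checkReturnStmt((ReturnStmt) s);
        } else {
            throw new IllegalArgumentException("unknown statement type");
        }
    }

    String checkIfStmt(IfStmt s) { return "checked if"; }
    String checkWhileStmt(WhileStmt s) { return "checked while"; }
    String checkReturnStmt(ReturnStmt s) { return "checked return"; }
}
```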

Whenever you see code like this, alarm bells should go off in your head. This is the sort of thing that polymorphism is supposed to take care of in object-oriented programming. Using that approach, you define a check() method for each type of syntax node. In other words, each type of node knows how to check itself. Then when you have a statement s that you want to check, you just say

That takes the thirteen lines of code in checkStatement() and reduces them to one line. At runtime, when this call is made, the appropriate sort of checking will be done for each type of node, even though at compile time all you know is that s is some sort of Statement.
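To make the contrast concrete, here is a minimal sketch of the object-oriented version. The class names (Statement, IfStatement, and so on) and the log parameter are hypothetical; the real node classes and the signature of check() in the tar file may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node hierarchy: every statement type knows how to check itself.
abstract class Statement {
    abstract void check(List<String> log);
}

class IfStatement extends Statement {
    @Override
    void check(List<String> log) { log.add("checked if"); }
}

class WhileStatement extends Statement {
    @Override
    void check(List<String> log) { log.add("checked while"); }
}

class ReturnStatement extends Statement {
    @Override
    void check(List<String> log) { log.add("checked return"); }
}

class StatementChecker {
    // No instanceof tests: dynamic dispatch selects the right check() at runtime.
    static List<String> checkAll(List<Statement> statements) {
        List<String> log = new ArrayList<>();
        for (Statement s : statements) {
            s.check(log);
        }
        return log;
    }
}
```

Adding a new statement type under this design means writing one new class with its own check() method; the loop in checkAll() does not change.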

Comparing the tradeoffs between the two approaches:

The problem with node classes growing in size can be solved by using the Visitor design pattern. But that's for a more advanced class.