The goals of this project are:
- to implement the grammar for our "simple" programming language
- to get familiar with front-end generators such as Lex and Yacc
This is an individual project.
The project is due on Friday, October 28, 2016, 23:59:59 PST (extended).
We want you to build the first part of your compiler: the
scanner and parser. To do this, we will be using lex (flex) and
yacc (bison), common tools to build LR parsers. For the rest of
this class, we will be focusing on a new language, which we call
CSimple. You will be building a compiler for this new
programming language. The language manual can be
This manual is going to be used for the rest of this quarter
(and might be updated frequently). Always use this as the first
and last authority on what your grammar should be.
Tour of the Code
Again, we provide code that you should use as a starting point
for your project. You can find the
Makefile - Your make file. You don't need to edit this file.
main.cpp - The main C++ file. You don't need to edit this file.
parser.y - The yacc file that contains your
grammar rules/productions. At this point, it contains the
grammar from Project 1. You will need to remove that grammar
and replace it with your own grammar for the language that
we have defined for you in the manual.
lexer.l - The lex file that contains the regular
expressions for recognizing your tokens. It now contains the
tokens from Project 1. Once again, you will have to edit this
file and replace these expressions with your own.
test.good.calc - A test file just so you can compile
and run this project as it is (this is not a valid CSimple
program; add your own test cases as you go along).
Steps to Solve the Challenge
- READ the manual for the language. Understand this language
and its specifications.
- Go over the small example we have included. Make sure you
understand how Lex and Yacc work together in the example. To
familiarize yourself with Lex and Yacc, you should read up on
them. Here is a good
that might help you out. Also, use Google and read the man
- From the language specification, create a grammar which
accepts all valid programs for our language. This is the
crucial part. You must get the grammar correct here in this
first part of the core compiler project. You will be building
on top of this project, and your grammar must be correct. Test
- Implement the scanner (in Lex) and make sure you account for
all of the lexical patterns. Ensure that your scanner gives an
error for dangling comments (comments not terminated before
the EOF is reached), and make sure that it handles characters and strings correctly.
- Implement your grammar (in Yacc). Leave time for this part, since
you will likely have to correct errors iteratively.
- TEST: You can find some test
files here. You
will want to test your parser using these good and bad files
thoroughly. You must also create your own test files. Make
them as complete and complex as possible.
To test your Lexer, put a printf statement before each
return. The last token printed shows where your scanner
stopped working AND which token the parser failed on.
To test your Parser, put a printf statement in the action of each rule.
This will make it easier for you to trace what your parser is doing.
To run your program use:
./csimple < test.lang where test.lang is a test file.
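The dangling-comment check from the scanner step can be prototyped in plain C before it is wired into Lex. The sketch below assumes C-style slash-star comment delimiters purely for illustration (check the manual for CSimple's actual comment syntax), and it ignores string literals. The name unterminated_comment_line is a hypothetical helper, not part of the provided code. In Lex itself, one common approach is a start condition for comments plus an <<EOF>> rule in that condition that reports the error.

```c
#include <stddef.h>

/* Returns the 1-based line on which an unterminated comment starts,
 * or 0 if every comment in `src` is closed before the end of input.
 * ASSUMPTION: comments open with slash-star and close with star-slash;
 * string literals are not treated specially in this sketch. */
int unterminated_comment_line(const char *src)
{
    int line = 1;       /* current line number */
    int open_line = 0;  /* line where the open comment began, 0 if none */
    size_t i = 0;

    while (src[i] != '\0') {
        if (src[i] == '\n') {
            line++;
            i++;
        } else if (open_line == 0 && src[i] == '/' && src[i + 1] == '*') {
            open_line = line;  /* entered a comment */
            i += 2;
        } else if (open_line != 0 && src[i] == '*' && src[i + 1] == '/') {
            open_line = 0;     /* comment closed */
            i += 2;
        } else {
            i++;
        }
    }
    return open_line;  /* nonzero means EOF was reached inside a comment */
}
```

A nonzero result corresponds to the case where your scanner must print an error: the input ended while a comment was still open.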
If you run Yacc with the -v flag, it will write the file y.output (bison names this
file after the grammar file, e.g. parser.output, unless it is run with -y). It contains a
readable description of the parsing tables (more specifically, a description of the
LALR(1) parser states and the items they contain). In addition, it reports where the
conflicts or problems in the grammar appear.
Make sure you get the Lexer working perfectly first! Lex executes the C code
attached to a rule after the rule's pattern has matched. Simply print to stdout
like you did for the previous project. You should get a stream of tokens.
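One way to keep that token printing uniform is a small C helper called from each Lex action. This is only a sketch: format_token and trace_token are hypothetical names, and the output format is an arbitrary choice, not something the assignment prescribes.

```c
#include <stdio.h>

/* Formats a trace line such as: TOKEN T_ID 'foo' (line 3)
 * Returns the value of snprintf (length the full line would need). */
int format_token(char *buf, size_t size, const char *name,
                 const char *text, int line)
{
    return snprintf(buf, size, "TOKEN %s '%s' (line %d)", name, text, line);
}

/* Prints the trace line to stdout and returns the token code unchanged,
 * so a Lex action can wrap its return value with this call. */
int trace_token(int code, const char *name, const char *text, int line)
{
    char buf[256];
    format_token(buf, sizeof buf, name, text, line);
    puts(buf);
    return code;
}
```

Inside lexer.l an action could then read `return trace_token(T_ID, "T_ID", yytext, yylineno);`, where T_ID stands for whatever token name your grammar declares and yylineno assumes line counting is enabled (in flex, via `%option yylineno`).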
What Your Parser Has to Do!
- Your parser should be able to parse any valid input file from our language.
- You will need to catch ALL syntax errors.
- You will need to catch ALL program structure errors. By this I mean that
your parser has to know that the keyword "procedure" ALWAYS precedes a
procedure_id in a procedure declaration.
- You will NOT have to check that procedures and variables have been declared
before you use them.
- You will NOT have to check that there is one and only one Main(). Remember
that Main() is just a special procedure. At this point we don't care that
it is special.
- You will NOT have to check whether procedure_ids and variable_ids are declared
multiple times. So you could declare variable A multiple times and it would be okay
at this point.
- You will NOT have to check the return types of procedures.
- In a nutshell, your parser checks the syntax of each construct on its own. It does not
have global knowledge of variables or procedures... yet.
Deliverables / Turnin
Please follow the instructions below exactly!
- Your files must be in a directory named "parse".
- All files must be included (makefiles, everything!) in that folder.
- Your project must compile on a CSIL machine. If you worked
on a Windows machine or your laptop at home, then make sure
it still works on CSIL or modify it appropriately!
- Include a README with this project. Explain what you did in
the README. If you had problems, tell us what they were and why.
- All errors (ALL OF THEM) go to stderr.
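The stderr requirement is easiest to satisfy if every error message passes through a single function. The following is a minimal sketch under that assumption; format_error and report_error are hypothetical names, and the message format is not specified by the assignment.

```c
#include <stdio.h>

static int error_count = 0;  /* number of errors reported so far */

/* Formats a message such as: Error (line 7): syntax error */
int format_error(char *buf, size_t size, int line, const char *msg)
{
    return snprintf(buf, size, "Error (line %d): %s", line, msg);
}

/* Writes the message to stderr (never stdout) and returns the
 * running error count. */
int report_error(int line, const char *msg)
{
    char buf[256];
    format_error(buf, sizeof buf, line, msg);
    fprintf(stderr, "%s\n", buf);
    return ++error_count;
}
```

A Yacc-generated parser calls yyerror with a message string when it hits a syntax error, so the yyerror in your project could simply forward to a function like this one. The provided starter files may already define yyerror; if so, check that it writes to stderr.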
Use this command to submit your work: turnin proj2@cs160
We will run your program against a number of test files that
check for correct parsing of individual language features, in
increasing levels of complexity. Your grade derives from the
fraction of test files that you parse correctly. Remember that
parsing correctly also means rejecting invalid input
files and printing appropriate error messages.
Important Note: No README == No partial credit if the
project does not work 100%.