Assignment 5
GALA: GVM Assembly LAnguage

The purpose of this programming assignment is to give you programming experience with:

It also is intended to give you conceptual experience with building a computer science tool, namely, an assembler, whose output is GVM machine code.

This assignment is more challenging than those that have come before. There are several reasons why this is so: I am not giving you any of the OOD; you will design it completely. I am not giving you any skeletons, as doing so would implicitly be giving design information. The logic, at points, is somewhat involved.

You will need to think carefully about both the OOD & the stepwise refinement of your methods. As a starting point for the former, the nouns describing the project are candidate objects; the verbs are candidate methods; the adjectives are candidate attributes. This however is only a starting point. Assigning to each object those responsibilities that are most suited to it materially simplifies your implementation. Try to discover the "nature" of the system you are designing. If it feels unduly complicated, it probably is.

Methods similarly should be carefully refined. Each method should be a short sequence of actions that all are at approximately the same level of abstraction. If a method is starting to feel long, you probably are rendering some of it in too much detail. Factor lower level details into helper methods. Method design, in this sense, is recursive. It is unlikely that your 1st design is the best. Stay flexible.

When you successfully complete this assignment, you will have earned the pride you will feel.

From Wikipedia's Assembly language page:

GALA Specification

A GALA program is a sequence of GALA statements. GALA statements comprise sequences of strings from 3 syntactic categories:

Each statement is of the following general form, where items within square brackets may or may not occur:

Each assembler statement appears on its own line.

The define instruction

This is the only assembler instruction (as opposed to machine instruction), & the only instruction with 2 operands. The 1st operand is an identifier; the 2nd is a number. The meaning: associate that number with that identifier. This is analogous to a Java static final int constant. For example, the analog of the Java statement

is Their purpose is to enhance the readability & maintainability of GALA program.

An identifier, used as a symbolic constant, can be defined only once. If it occurs in another define statement, the duplicate use is reported by the assembler as an error. The statement that defines a symbolic constant must occur before any statement that refers to it.

is a valid define statement, unless INCREMENT already had been defined. The following statement is not valid: because the 2nd operand is not a number.

Opcodes with 1 operand

The following opcodes have 1 operand: set, load, store, add, goto, & zero. The operand for each can be a number or an identifier.

If it is a number, it has the same meaning as the number operand of its corresponding GVM instruction. For example, store 17 means store the contents of the accumulator (data memory location 0) into data memory location 17.

If the operand is an identifier, its meaning depends on the opcode. If the opcode is neither a goto nor a zero, then the identifier must be a previously defined symbolic constant; otherwise, it is an invalid statement. For example,

is a valid statement if & only if INCREMENT already had been defined via a define statement.

If the opcode is goto or zero, the identifier must refer to a statement label (i.e., it refers to the location of an instruction in the program memory). The referenced label may occur on a statement that comes after a statement that references it. Your GALA assembler is a 1-pass assembler: When it finishes reading & processing all the lines in the text file, it does not need to rescan them. The implication for forward statement label references is that, when a statement label is referenced (in a goto or zero statement), if it is not already defined, then the instruction that refers to it must be put in a set or list of such instructions. When the statement label is defined, the unresolved reference list is accessed. Every instruction on the list, if any, then can have its operand patched with the correct value: the program memory location of the instruction with the statement label.

Opcodes with 0 operands

The following opcodes have no operand: stop, setcolor, drawline, drawrect, drawoval, fillrect, & filloval. They simply translate into the corresponding machine instruction.

GALA Predefined symbolic names

Gala has the following predefined symbolic constants that reference standard data memory cells:

data memory cellsymbolic name
0ACC
1X
2Y
3WIDTH
4HEIGHT
5RED
6GREEN
7BLUE

Blank lines

Blank lines are ignored by the assembler.

The GALA Comment

Whenever the character "#" appears on a line, that character & all that follow are ignored by the assembler.

GUI controls

In addition to the controls on the GVM, now there is a Load JButton, a program JTextArea, & a console JTextArea. The program text area is where the file to be assembled is displayed. The console text area is used by the assembler to report its conclusions: success, failure, error messages, & assembly time.

The load button is how the user initiates the assembler. The procedure is as follows:

  1. Use JFileChooser to let the user select a text file to be assembled (see the course web page on Sample Code);
  2. The assembler processes each text line; it keeps track of the time to assemble the file, starting immediately after the user selects the file to be assembled.
  3. If the assembly is error-free:
    • indicate this on the console text area, including the time to assemble the file;
    • load the assembler's output, the program, into the GVM.
    As the assembler processes a line, it posts to the console text area any error messages associated with that line. An assembly is defined as unsuccessful when 1 or more statements contained any error. If the assembly is unsuccessful, indicate that, the time the assembler took to do its work, & the number of erroneous statements, all on the console text area.
You may lay out the GUI components any way you like.

The Map interface & the HashMap class

You probably will want to create what often is referred to as a symbol table: a function from a string to its value, which, in the case of GALA, always is an integer. The Java Map is ideal for this purpose. Please look in your textbook, the Java tutorial, & the Javadoc for this interface & the HashMap class that implements the Map interface.

The String & Pattern classes

The process of doing this assignment will give you considerable experience with string processing in Java. Please look at the String class's methods to see which may be of use. I found, for example, both the trim & replaceFirst methods useful when deleting non-meaningful white space & comments, as preparation for subsequent processing.

Sample GALA test files

Below, 2 test files are given. The 1st is of a correct GALA program for producing a 17 X 17 checkerboard (with alternating squares & circles); the 2nd is a sequence of erroneous GALA statements that you can use to test your assembler's error-checking ability. We will use these files to test your implementation.

GALA Checker board program


            # GVM data memory addresses
            define DECREMENT       20
            define ROW_COUNTER     21
            define COLUMN_COUNTER  22
            define DELTA           23

            # GVM program constants
            define N            17
            define DELTA_VALUE  40

            set -1
            store DECREMENT
            set DELTA_VALUE
            store DELTA
            store WIDTH
            store HEIGHT
            set N
            store ROW_COUNTER
            store COLUMN_COUNTER
            
 FillShape  load RED
            zero  IsRed

                # is black
                 set 255   # make it red
                 store RED
                 setcolor
                 filloval
                 goto IfAdvanceColumn
            
IsRed            set 0   # make it black
                 store RED
                 setcolor
                 fillrect
                
IfAdvanceColumn load COLUMN_COUNTER
                add DECREMENT
                store COLUMN_COUNTER
                zero AdvanceColumn
                
               # Finished columns. Advance row?
               load ROW_COUNTER
               add DECREMENT
               store ROW_COUNTER
               zero AdvanceRow
              # Finished rows.
              stop
            
AdvanceColumn load X
              add DELTA
              store X
              goto FillShape
            
AdvanceRow load Y
           add DELTA
           store Y
           # reset column
             set 0
             store X
             set N
             store COLUMN_COUNTER
             goto FillShape

Here is a screenshot of my [current] GALA assembler, in response to this file:

GALA screenshot for checkerboard program

GALA program to test error-handling


    define              # too few operands
    define oneOperand   # too few operands
    define 17           # not an identifier
    define SYMBOL 17 18 # too many operands
    define TYMBOL 17 xx # too many operands
    define 17 UYMBOL    # 17 not an identifier
17xh                    # invalid label
abs                     # no opcode
xyz yzx                 # invalid opcode
label   add 20          # valid 
label   store 16        # duplicate label
    17x                 # neither an identifier nor an opcode
    set NotDefined      # operand symbol is not defined
    goto UndefinedLabel # undefined label
    define xyzz 8g      # 8g is not a number

You do not need to report all errors in a statement, only 1. Here is what my [current] GALA assembler puts to the console, in response to this file:


Starting assembly ...
1: The number of operands for this opcode is incorrect.
2: The number of operands for this opcode is incorrect.
3: The number of operands for this opcode is incorrect.
4: The number of operands for this opcode is incorrect.
5: The number of operands for this opcode is incorrect.
6: define opcode's 1st operand must be an identifier.
7: 17xh is an invalid statement label.
8: labeled statement is missing an opcode.
9: yzx is an invalid opcode.
11: label has been defined previously. 
12: 17x is an invalid statement label.
13: NotDefined is undefined.
15: 8g is not interpretable as a number.
UndefinedLabel is undefined in statements with the following numbers:14

BUILD FAILED. Assembly time: 0.01 sec. Number of invalid statements: 13

Implementation notes

Discussion of possible enhancements

Please post any enhancements to the assembler that would you like to see on Discussion forum.

Rubric

ValueAspect
5Correctly assembles the "checkerboard" program (the operational definition of correct: when the program runs, it produces the correct out)
3Correctly assembles & diagnoses the errors of the error-handling test program
2Javadoc & Style
10Total

For style, we want:


 cappello@cs.ucsb.edu © Copyright 2014 Peter Cappello                                           2014.05.14