CS5nm, Fall 2008

ex15: recursion


Introduction

Goals for this exercise

This exercise provides practice in writing recursive functions on Python lists to:

This exercise also provides practice with writing functions that use Boolean logic, and with using functions you've already defined to make new functions.

What you'll be doing: a heuristic for syllable counting

Many word processing programs (e.g. Microsoft Word) have tools built in that determine the "reading level" of a piece of text. This is important for choosing reading assignments in K-12 education, and for things like determining whether a ballot initiative can be understood by the average California voter.

One factor in determining reading level is the average number of syllables per word. Because English is such a complex language, the only way to be 100% sure of the number of syllables in each word is to look up each word in a dictionary. However, even if we don't have a dictionary, we can still use a technique called a "heuristic" to get a good approximation of the number of syllables in a word.

A "heuristic" is an approach to solving a problem that might not produce an answer that is 100% correct, but it is an answer that is often "good enough". In Computer Science, when problems are difficult to solve with 100% certainty, we often take the approach of coming up with a heuristic, and then trying to improve on that heuristic.

For syllables, here's a first heuristic:

Taking just the words in the preceding sentence, and treating y as a vowel (in addition to a,e,i,o, and u) we see that this heuristic works for every word in that sentence, except for the words "good" and "indication", which is a 90% success rate for that sentence.

However, we can improve the heuristic if we remove "consecutive vowels" before we count—that is, if we have two or more vowels, in a row, we remove the extra ones. Thus "good" becomes "god", and "indication" becomes "indicatin" before we count the vowels. If we do that, then our heuristic of counting vowels is 100% effective on that sentence!

However, this heuristic is still not perfect. For example, if fails on words such as "like", "there", and "looked".

Thus we introduce two more heuristics: ones to try to remove silent e from words such as "like", "there", "plane", etc., and the "ed" that doesn't contribute an extra syllable to words such as "looked", "jumped", "kissed", and "frowned".

The comments in the sample code discuss these heuristics. Be sure to read the comments carefully. They are there to help you make sense of the functions, and how the heuristics were developed.

As you'll see the addition of the silent e and ed removal heuristics improve our syllable counting, but there are still words on which these techniques fail. (The word "techniques" is one example). If we had more time in the quarter, we'd extend this exercise to challenge you to come up with additional heuristics to make our syllable counting technique even more accurate.

As it is, there are still some challenges for you—applying the techniques of writing recursive functions for lists to this syllable counting problem. So, for this exercise, stick to the heuristics given.

If you want to experiment with additional heuristics, just for fun as a challenge to yourself, then finish the exercise as given first in a file called ex15YourName.py, and submit that first on GauchoSpace! Then make a separate copy called ex15YourName_v2.py, in which you experiment with your additional heuristics.

Note that working on additional heuristics for the spelling problem, while not required for this exercise, is an excellent way to study for the exam. Just like with swimming or playing the guitar, the more you do it, the better you will get at it. What we are learning in this class is not primarily a collect of facts, but a new way of thinking, and you need to practice that way of thinking to improve.

Step by Step Instructions

Step 0: Getting Started


Step 0a: Download the starting point code from the ex15 folder

You should find a folder ex14 in the usual place on the web site. Here is a direct link:

ex15/ex15PhillConrad.py

If it is not there, ask your TA or instructor for help.

 

Step 1: Read through the comments, code and test cases in the ENTIRE file first

Before you jump in, just read thorough the ENTIRE file from start to finish. Read all of the comments, all of the code, and all of the test cases. Having a sense of the entire file will really help you as you proceed.

Find the stubs. These are the places where you'll be adding code in Step 2. But don't just jump in and start coding yet.

As you find things you don't understand, ask the TA or the instructor, or make a note to do so during class or office hours.

Only after you've read the entire file through—and you understand the big picture—proceed to step 2.

Step 2: Starting from the top, replace the stubs with correct function bodies

Your task is to replace the stubs (things like return -1 and return "stub") with correct function bodies, so that all the tests pass.

It is best to start at the top, and do the functions in the order they appear in the file.

A Helpful Hint:

What NOT do to...

There can be a temptation to make your test cases pass by changing them to the answer your function returns, even if it is wrong, but that is not permitted—in fact, it is a kind of academic dishonesty.

So, don't change any of the test cases, unless you have permission of the TA/Instructor. (This should only be done if we determine that there is an error in the test cases provided.)

When all the test cases pass...

When your file contains all the functions and test cases from the original ex15PhillConrad.py, and all the tests pass, you are finished!

Step 3: Submitting on GauchoSpace

Before submitting on Gauchospace:


Evaluation and Grading (150 pts)

Grading Rubric:

Function Item Points  
all Naming and submitting your file correctly 10  
all Fixing the header comment in each of the three files 10  
all Code style (variable names, comments, etc,
removing all the @@@ lines)
10  
isVowel() correctness of code—code passes all tests from original file 20  
countVowels() correctness of code—code passes all tests from original file 20  
allVowelsA() correctness of code—code passes all tests from original file 20  
removeSilentE() correctness of code—code passes all tests from original file 20  
removeEdWhenNotASyllable() correctness of code—code passes all tests from original file 20  
countSyllables() correctness of code—code passes all tests from original file 20  
TOTAL   150  

 

Due Date: Fri 12/05 (6:30pm—no extensions!)

I will have special office hours on Friday from 2pm-4pm in the Mesa Lab, and will stay later for anyone that shows up by 4pm.

However, you must finish this by 6:30pm on Friday. The Mesa Lab is open until 6:50pm on Friday, but they might start kicking people out at 6:45, so you need to finish up by 6:30pm to be safe.

Submit whatever you have by then, finished or not. Partial credit is better than a zero.

Try to finish this before or during lab on Wednesday 12/03—that's a better approach!


Copyright 2008, Phillip T. Conrad, CS Dept, UC Santa Barbara. Permission to copy for non-commercial, non-profit, educational purposes granted, provided appropriate credit is given; all other rights reserved.