CS 3733 Operating Systems, Spring 2009 Assignment 4


Due Friday, March 27

In this assignment you will start on a parallel version of the wc program that can be found in /usr/bin.

Suppose we want to count the number of words in a few large files. We could use /usr/bin/wc, but we prefer to do the calculation in parallel, if possible. This version will count the words in all of the regular files in a directory. It will do this by creating a child process for each such file and having the child calculate the number of words and send the result back to the parent. The parent will be responsible for displaying the result.

Part 0
Write a function:
int getWordCount(char *filename);
that takes as its parameter the pathname of a file and returns the number of words in that file. Word delimiters are the characters for which the C library function isspace returns true. Return -1 if it cannot read the file. Do not make any assumptions about the size of the file, other than that it can be accessed. Do not assume that the entire file can be read into memory.

Note: The simplest implementation probably uses the state machine concept. Scan through the file, keeping track of whether the last character was a word character or a delimiter.
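
The state machine idea in the note can be sketched as follows. This is only one possible shape, not a required solution. The helper name countWordsInStream is hypothetical (it is not part of the assignment), and reading one character at a time with getc is an assumption made so that the file never has to fit in memory.

```c
#include <ctype.h>
#include <stdio.h>

/* Count words in an open stream with a two-state machine.
   Hypothetical helper: this name does not come from the assignment. */
int countWordsInStream(FILE *fp) {
    int count = 0;
    int inWord = 0;   /* state: 1 while inside a word, 0 while in delimiters */
    int c;

    while ((c = getc(fp)) != EOF) {
        if (isspace(c)) {
            inWord = 0;            /* any delimiter ends the current word */
        } else if (!inWord) {
            inWord = 1;            /* delimiter-to-word transition: new word */
            count++;
        }
    }
    return ferror(fp) ? -1 : count;   /* report read errors as -1 */
}

/* Signature from the assignment; returns -1 if the file cannot be read. */
int getWordCount(char *filename) {
    FILE *fp = fopen(filename, "r");
    if (fp == NULL)
        return -1;
    int count = countWordsInStream(fp);
    fclose(fp);
    return count;
}
```

Only the delimiter-to-word transition increments the counter, so runs of consecutive delimiters and leading or trailing white space do not produce extra words.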

Part 1
Write a main program called wordCount that takes a single string as a command line parameter. If the parameter is pathname, the program starts by printing a line of the form:
Serial word count for pathname written by ...
where it displays the command line parameter and your name. If the command line parameter is the name of a directory (either an absolute or relative path), it calls getWordCount for each regular file in the directory and displays the number of words and the name of the file (just the name, not the entire path). It then displays the total number of words in all of the files. The output should be a two-column list: the first column holds the number of words (right justified) and the second holds the file name (left justified). The total should appear in the first column, lined up with the other values, with the word total in the second column.
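
One way to get the two-column layout is a fixed-width printf conversion. The sketch below is illustrative only: the helper name formatRow and the column width of 10 are assumptions, since the assignment does not specify a width.

```c
#include <stdio.h>

/* Format one output row into buf: the count right justified in a
   10-character column (width is an assumption), one space, then the label.
   Returns the value of snprintf: the length the full row needs. */
int formatRow(char *buf, size_t size, int count, const char *label) {
    return snprintf(buf, size, "%10d %s", count, label);
}
```

Using the same helper for the file rows and for the final row labeled total keeps the numbers lined up, because every count goes through the same %10d conversion.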

Note: You will need to create a full path name to pass to getWordCount.
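
A minimal sketch of building the full path and checking for a regular file is shown below. The helper names buildPath and isRegularFile are hypothetical, and using stat with S_ISREG is one reasonable choice rather than a requirement of the assignment.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Join dir and name into buf as "dir/name".
   Returns 0 on success, -1 if the result does not fit in buf. */
int buildPath(char *buf, size_t size, const char *dir, const char *name) {
    int n = snprintf(buf, size, "%s/%s", dir, name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Return 1 if path names a regular file, 0 otherwise (including on error). */
int isRegularFile(const char *path) {
    struct stat sb;
    if (stat(path, &sb) == -1)
        return 0;
    return S_ISREG(sb.st_mode) ? 1 : 0;
}
```

The full path is needed because the names read from a directory are relative to that directory, not to the current working directory of the program.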

Create and run tests and compare the results to those of wc -w. Note any differences. When finished with your tests, test the program with the directory ~cs3733/s2009tests. Indicate approximately how long it took to complete this test.

Part 2
Write a main program called wordCountChild that is like the program in Part 1, but uses a child process to calculate the word count of each file. Replace the first line of output with:
Child concurrent word count for pathname written by ...

Implement the program as follows:

The output generated from Part 2 should be the same as for Part 1, except possibly for the order in which the lines are displayed.

Note: If you write this correctly, the files will be handled concurrently rather than serially.
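
As an illustration of one child per file with results sent back to the parent, here is a sketch that uses a single shared pipe. It is one possible arrangement and not necessarily the one the assignment prescribes. The names countInChildren, struct result, and nameLength are hypothetical; nameLength is only a stand-in so that the sketch compiles on its own, and getWordCount could be passed in its place.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Record a child writes to the pipe (hypothetical layout). */
struct result {
    int index;   /* which path this result belongs to */
    int count;   /* value computed by the child */
};

/* Stand-in for getWordCount so this sketch is self-contained. */
int nameLength(char *s) { return (int)strlen(s); }

/* Fork one child per path. Each child runs work(path) and writes one record
   to a shared pipe; the parent stores results in counts[] as they arrive.
   Returns 0 if every child reported, -1 otherwise. */
int countInChildren(char *paths[], int n, int (*work)(char *), int counts[]) {
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)
            break;
        if (pid == 0) {                       /* child */
            close(fd[0]);
            struct result r = { i, work(paths[i]) };
            ssize_t w = write(fd[1], &r, sizeof r);
            _exit(w == (ssize_t)sizeof r ? 0 : 1);
        }
        started++;                            /* parent: do not wait yet */
    }

    close(fd[1]);              /* so read sees end-of-file when children finish */
    int received = 0;
    struct result r;
    while (read(fd[0], &r, sizeof r) == (ssize_t)sizeof r) {
        counts[r.index] = r.count;            /* arrival order may vary */
        received++;
    }
    close(fd[0]);

    while (wait(NULL) > 0)                    /* reap every child */
        ;
    return (started == n && received == n) ? 0 : -1;
}
```

The parent forks all of the children before reading anything, which is what lets the files be processed concurrently. Each record is far smaller than PIPE_BUF, so each write is atomic and records from different children do not interleave.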

Test this as in Part 1. Make sure all of your test output is appropriately labeled.

Handing in your assignment
Use this cover sheet. Consecutively number all of the pages you turn in. Make sure you answer the questions that are on the cover sheet.