CS 2213 Advanced Programming
Functions and Program Structure


Read Chapter 4 of the text.
Previous Topic: Control Flow in C
Start of Class Tuesday, Sept. 19, 2000
Next Topic: Arrays and Pointers in C

Basics of Functions

We covered this already.

Functions Returning Non-integers

There are some subtle points here.

A C89 compiler assumes that a function returns an int unless told otherwise (C99 and later drop this rule and require a declaration).

This is why you can write
main()
instead of
int main()

Consider the following program:

#include <stdio.h>

int main() {
   double x;
   x = sqrt(2.0);
   printf("The square root of 2 is %f\n",x);
   return 0;
}
If this is in t1.c we can compile it with
cc -o t1 t1.c -lm
The last part tells the linker to look in the math library for the sqrt function.

This compiles without error and when we run it we get the following:

The square root of 2 is 1073741824.000000

What happened and why?

The C compiler thought that sqrt would return an int and included code to convert the int to a double.

In fact, sqrt returns a double.

We could fix the problem by putting a prototype above main:

double sqrt(double x);

or a better solution would be to include math.h which contains this and other prototypes.

Now the output looks like:

The square root of 2 is 1.414214

Lint would have saved us on the first program:

vip2% lint t1.c

implicitly declared to return int
    (5) sqrt        

name used but not defined
    sqrt             	t1.c(5)

function returns value which is always ignored
    printf          
Running lint on the correct program would give lots of errors of the form

name declared but never used or defined
which we could get rid of with
lint t1.c -lm
which when run on the incorrect program gives:

implicitly declared to return int
    (5) sqrt        

value type used inconsistently
    sqrt             	llib-lm(37) double () :: t1.c(5) int ()

function returns value which is always ignored
    printf      
and when run on the correct program gives

function returns value which is always ignored
    printf          
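
The same rule applies to your own functions that return non-integers. The following is a minimal sketch (half and sum_of_halves are hypothetical names, not part of the program above) in which the caller appears before the definition, so the prototype is what tells the compiler that a double comes back:

```c
/* Prototype: half() returns a double, even though it is defined further down. */
double half(int n);

/* Hypothetical caller, deliberately placed before the definition of half(). */
double sum_of_halves(int a, int b) {
   return half(a) + half(b);
}

/* The definition must agree with the prototype above. */
double half(int n) {
   return n / 2.0;
}
```

Without the prototype, a C89 compiler would assume that half returns an int at the point of the call, which is the same mistake as in the sqrt example.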

External Variables

An external variable is one defined outside of any function.

It can be accessed by any function inside the same file and by default can also be accessed by functions in other files.

The linkage class of a function or a variable determines whether it can be accessed from outside the file in which it is declared.

A variable or function with file scope has either internal linkage or external linkage.

Variables defined inside functions can only be accessed from inside the functions that they are defined in.
Strictly speaking these have no linkage at all, so they can never be reached from another function or file.

Variables defined outside of any function and functions themselves have external linkage by default.
They can be made to have internal linkage by declaring them to be static.

Note: The word static has a different meaning when applied to a variable defined inside a function.

Variables and functions with internal linkage are private to the file in which they are defined.

We can use external variables with internal linkage to design C modules that behave like Java objects.

Example: a stack implementation:

#include <stdio.h>
#define MAXVAL 100

static int sp = 0;
static double val[MAXVAL];

void push(double f) {
   if (sp < MAXVAL)
      val[sp++] = f;
   else
      printf("error: stack full, can't push %g\n",f);
}

double pop(void) {
   if (sp > 0)
      return val[--sp];
   else {
      printf("error: stack empty\n");
      return 0.0;
   }
}
void display_stack() {
   int i;
   printf("Number of items on the stack: %d\n",sp);
   for (i=0;i<sp;i++)
      printf("\t%.8g\n",val[i]);
}
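
The same pattern works for any module with private state. Here is a small sketch, assuming a hypothetical counter module (the names count, clamp, counter_reset and counter_next are made up for illustration): the static variable and the static helper are private to the file, and only the two non-static functions can be called from other files.

```c
/* counter.c: a hypothetical module with private state */

static int count = 0;            /* internal linkage: other files cannot see it */

/* Private helper: static gives the function internal linkage too. */
static int clamp(int n) {
   return n < 0 ? 0 : n;
}

/* Public interface: external linkage by default. */
void counter_reset(int start) {
   count = clamp(start);
}

int counter_next(void) {
   return count++;
}
```

Another file could define its own variable named count or its own function named clamp without any conflict at link time.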

A simple calculator

Some HP calculators and some languages (like Postscript) use reverse Polish notation for doing calculations.

The operator follows the operands.

This allows expressions to be evaluated without using parentheses.

Example:

(1 - 2) * (4 + 5) becomes
1 2 - 4 5 + *

This can be implemented using a stack.

When you get a number, push it on the stack.
When you get an operator, pop the operands and push the result
push(1) 1 is a number, push it
push(2) 2 is a number, push it
x = pop() - is an operator with two operands
push(pop()-x)
push(4) 4 is a number, push it
push(5) 5 is a number, push it
push(pop()+pop()) + is an operator with two operands
push(pop()*pop()) * is an operator with two operands

Note that - is treated differently because the order matters.
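
The trace above can be sketched as a single function. This is only an illustration: eval_rpn and DEPTH are hypothetical names, the stack is a local array rather than the module shown earlier, and only single-digit numbers and the operators + - * are handled.

```c
#define DEPTH 32   /* hypothetical limit on the stack depth */

/* Evaluate an RPN string of single-digit numbers and + - * operators.
   Returns 0.0 on malformed input; a sketch, not a robust parser. */
double eval_rpn(const char *expr) {
   double stack[DEPTH];
   int sp = 0;
   double x;

   for (; *expr != '\0'; expr++) {
      char c = *expr;
      if (c == ' ')
         continue;                       /* blanks only separate items */
      if (c >= '0' && c <= '9') {
         if (sp == DEPTH)
            return 0.0;                  /* stack full */
         stack[sp++] = c - '0';          /* a number: push it */
      }
      else if (c == '+' || c == '-' || c == '*') {
         if (sp < 2)
            return 0.0;                  /* not enough operands */
         x = stack[--sp];                /* pop the right operand first */
         if (c == '+')
            stack[sp-1] += x;
         else if (c == '-')
            stack[sp-1] -= x;            /* left operand minus right operand */
         else
            stack[sp-1] *= x;
      }
      else
         return 0.0;                     /* unknown character */
   }
   return sp == 1 ? stack[0] : 0.0;
}
```

Calling eval_rpn("1 2 - 4 5 + *") follows exactly the steps listed above and gives -9.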

Here is a calculator that takes single-digit numbers and the following operators: + - * / = '\n' where = displays the stack without popping anything and '\n' pops the result from the stack and displays it.

#include <stdio.h>

void push(double);
double pop(void);
void display_stack(void);

int main() {
   int type;
   double op2;

   while ((type = getchar()) != EOF) {
      switch(type) {
      case('+'):
         push(pop()+pop());
         break;
      case('-'):
         op2 = pop();
         push(pop()-op2);
         break;
      case('*'):
         push(pop()*pop());
         break;
      case('/'):
         op2 = pop();
         if (op2 != 0.0)
            push(pop()/op2);
         else
            printf("error: zero divisor\n");
         break;
      case('\n'):
         printf("\t%.8g\n",pop());
         break;
      case('='):
         display_stack();
         break;
      default:
         if ( (type >= '0') && (type <= '9') ) {
            push(type-'0');
         }
         else
            printf("error: invalid input: %c\n",(char)type);
      }
   }
   return 0;
}
What would happen if you misspelled case?
What would happen if you misspelled default?

What if we wanted to accept numbers that were more than one digit?

Instead of getting a character, we would get a token.

A token is just a string that does not contain delimiters.

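Suppose we have a function:
void getToken(char s[], int max)
which reads a token from standard input into s, storing at most max characters including the terminating '\0'.

Suppose also that there is a library function:
double atof(const char *s)
which converts a string to a double. (It is declared in stdlib.h.)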
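
Once input arrives as tokens, the program has to decide whether a token is an operator or a number. A possible sketch of that decision follows; is_operator and token_value are hypothetical helpers, while atof and strchr are the standard library functions.

```c
#include <stdlib.h>   /* atof */
#include <string.h>   /* strchr */

/* Hypothetical helper: 1 if the token is exactly one operator character. */
int is_operator(const char token[]) {
   if (token[0] == '\0' || token[1] != '\0')
      return 0;                          /* empty, or longer than one character */
   return strchr("+-*/=\n", token[0]) != NULL;
}

/* Hypothetical helper: the value of a numeric token.
   atof returns 0.0 when the string does not start with a number. */
double token_value(const char token[]) {
   return atof(token);
}
```

Note that "-" is an operator but "-3" is a number, which is why the length of the token has to be checked before looking at its first character.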

We could rewrite our calculator as follows:

#include <stdio.h>
#include <stdlib.h>

#define MAXTOKEN 100

void push(double);
double pop(void);
void display_stack(void);
void getToken(char s[], int max);

int main() {
   double op2;
   char token[MAXTOKEN];
 
   while (getToken(token,MAXTOKEN),token[0] != '\0') {
      if (token[1] == '\0') {
         switch(token[0]) {
         case('+'):
            push(pop()+pop());
            break;
         case('-'):
            op2 = pop();
            push(pop()-op2);
            break;
         case('*'):
            push(pop()*pop());
            break;
         case('/'):
            op2 = pop();
            if (op2 != 0.0)
               push(pop()/op2);
            else
               printf("error: zero divisor\n");
            break;
         case('\n'):
            printf("\t%.8g\n",pop());
            break;
         case('='):
            display_stack();
            break;
         default:
            push(atof(token));
         }
      }   
      else
         push(atof(token));
   }
   return 0;
}

Here is a version of getToken
#include <stdio.h>

static int delimiter(int c) {
   if (c == ' ') return 1;
   if (c == '\t') return 1;
   return 0;
}

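/* Reads the next token into s. Assumes max is at least 2.
   A newline is returned as a token by itself so that the
   calculator can see it. An empty string means end of file. */
void getToken(char s[], int max) {
   int i;
   int c;
   s[0] = '\0';
   while (delimiter(c=getchar())) ;      /* skip leading blanks and tabs */
   if (c == EOF)
      return;
   s[0]=c;
   i=1;
   if (c == '\n') {                      /* a newline is a token by itself */
      s[i] = '\0';
      return;
   }
   while ( i < max-1 && !delimiter(c=getchar()) ) {
      if (c == EOF)
         break;
      if (c == '\n') {                   /* leave the newline for the next call */
         ungetc(c,stdin);
         break;
      }
      s[i] = c;
      i++;
   }
   s[i] = '\0';
}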


Scope Rules

An identifier (name) is visible (i.e. can be used) only within a region of program text called its scope.

There are 4 kinds of scope: function, file, block, and function prototype.

function scope is only for labels (used with goto) and its scope is the entire function in which it appears.

Every other identifier has scope determined by its placement.

If it appears outside any block or list of parameters it has file scope.

If it appears inside a block or inside the list of parameter declarations of a function definition, it has block scope, which terminates at the brace that closes the block.

If an identifier is declared inside a function prototype, it has function prototype scope, which terminates at the end of the function declarator.

The scope of most identifiers begins just after the completion of the declarator.
We have not discussed any of the things that are exceptions to this.

Summary: Since we will never use gotos and the scope of parameters in function prototypes is usually not of interest, the two cases left are:

file scope is for those things declared outside any block; variables with file scope are also called external variables. Function names are always external.

block scope is for those declared inside a block, and these are also called internal variables. Function parameters are an example of these and the block is the body of the function.

Recall that an external variable or function must have one definition and may have several declarations.

For external variables, extern is used to distinguish between a declaration and a definition.

Initialization of an external variable can only go with the definition.

Array sizes must be specified in the definition, but this is optional in the declaration.

For functions, a prototype is used for the declaration, and the extern is optional.
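
A short sketch of these rules in a single file (counter, table and bump are hypothetical names). The declarations come first and reserve no storage; the definitions come last and carry the initializers.

```c
extern int counter;        /* declaration: no storage reserved, no initializer */
extern double table[];     /* declaration: the array size may be omitted */
int bump(void);            /* function declaration: extern is implied */

/* Uses counter before its definition; the declaration above makes this legal. */
int bump(void) {
   return ++counter;
}

int counter = 10;                      /* the one definition, with initialization */
double table[3] = {1.0, 2.0, 3.0};     /* the definition fixes the size */
```

In a real program the declarations would normally sit in a different file from the definitions.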


Header Files

Header files are used to contain variable and function declarations (not definitions) and definitions of constants.

This allows related declarations to be kept together and accessed from different files.
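
For the stack module shown earlier, a header might look like the following sketch. The file name stack.h and the STACK_H guard are assumptions made for illustration; the #ifndef wrapper is a common convention that keeps the file from being processed twice, and it is not something these notes have covered.

```c
/* stack.h: hypothetical header for the stack module */
#ifndef STACK_H
#define STACK_H

#define MAXVAL 100             /* definition of a constant */

void push(double f);           /* declarations only: no function bodies here */
double pop(void);
void display_stack(void);

#endif
```

Both the stack file and the calculator file would then use #include "stack.h" instead of repeating the prototypes.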


Static Variables

The word static has a different meaning when applied to external and internal variables.

When applied to an external variable (or a function), it prevents access to that variable or function from outside the file in which it is defined.

Internal variables (including function parameters) are by default automatic.
This means that they are created when the block in which they are defined is entered and destroyed when the block is left.

Storage used by internal variables (other than function parameters) may be made permanent by declaring them static.
This means that the location used by the variable is permanent (for the life of the program) and such variables will retain their value between calls to the function in which they are defined.
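
A minimal sketch of the difference, using two hypothetical functions: next_id keeps its count in a static internal variable, while fresh uses an automatic one.

```c
/* The static variable is initialized once and keeps its value between calls. */
int next_id(void) {
   static int id = 0;
   return ++id;
}

/* The automatic variable is created and initialized again on every call. */
int fresh(void) {
   int n = 0;
   return ++n;
}
```

Successive calls to next_id return 1, 2, 3 and so on, while every call to fresh returns 1.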


Register Variables

A register declaration is a suggestion to the compiler to attempt to keep the variable in fast storage.
It is only a suggestion and the compiler does not have to do it.
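
As an illustration, here is a hypothetical function whose loop counter is declared register. The result is the same whether or not the compiler follows the suggestion.

```c
/* Sum of the integers 1..n. */
long sum_to(int n) {
   register int i;        /* hint: keep the loop counter in fast storage */
   long total = 0;

   for (i = 1; i <= n; i++)
      total += i;
   return total;
}
```

One visible consequence of the declaration is that the address of a register variable may not be taken, so &i would be an error here.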


Block Structure

Blocks are delimited by braces.

Functions may not be defined inside blocks, but variables may be.

You are allowed to do something like the following:

int x;
int y;

int f(double x) {
   double y;
   ...
}
Inside f, x is of type double and is independent of the integer by the same name defined outside the function.
Inside the function, y is of type double and unrelated to the variable of the same name defined outside the function.
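
A complete version of this idea, as a sketch (the values and the function sum_globals are made up for illustration). It also shows that a block nested inside the function can hide a name again.

```c
int x = 1;                 /* external variables */
int y = 2;

double f(double x) {       /* the parameter hides the external x */
   double y = x * 2.0;     /* this y hides the external y */
   {
      int x = 7;           /* hides the parameter, but only inside this block */
      y += x;
   }
   return x + y;           /* x is the parameter again */
}

/* No local declarations here, so x and y are the external variables. */
int sum_globals(void) {
   return x + y;
}
```

With these definitions f(1.5) returns 11.5, and the external x and y are untouched, so sum_globals() still returns 3.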


Initialization

If you do not explicitly initialize a variable, external and static internal variables are initialized to zero.

For float and double variables this means the value 0.0 and for pointers it means the null pointer, whatever bit pattern the machine uses for them.

Automatic variables are not initialized by default and may start with any value.

Scalar variables may be initialized when they are defined:

int x = 1;
char squote = '\'';
long day = 1000L * 60L * 60L *24L;  /* milliseconds in a day */
For external or static internal variables, the initializer must be a constant expression.

Automatic variables may be initialized using an expression involving any previously defined values:

int binsearch(int x, int v[], int n) {
   int low = 0;
   int high = n - 1;
   ...
}
Arrays may be initialized using a list of values:
int days[] = {31,28,31,30,31,30,31,31,30,31,30,31};
and character arrays may be initialized using string constants:
char pattern[] = "ould";
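
When the size is left out, the compiler counts the initializers. The sketch below checks this with sizeof; the helper functions and the variables total and counts are hypothetical and exist only so the sizes and default values can be inspected.

```c
static double total;       /* no initializer: implicitly initialized to zero */
static int counts[4];      /* no initializer: every element is zero */

int days[] = {31,28,31,30,31,30,31,31,30,31,30,31};   /* 12 elements */
char pattern[] = "ould";   /* 5 elements: 'o' 'u' 'l' 'd' '\0' */

/* Number of elements in days, computed by the compiler. */
int month_count(void) {
   return (int)(sizeof days / sizeof days[0]);
}

/* Size of pattern in bytes, including the terminating '\0'. */
int pattern_size(void) {
   return (int)sizeof pattern;
}

/* 1 if the variables without initializers started out as zero. */
int all_zero(void) {
   return total == 0.0 && counts[0] == 0 && counts[3] == 0;
}
```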


Recursion

C functions may be called recursively.

We will look at recursion later.
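
For now, a minimal sketch of a function that calls itself (factorial is a standard textbook example, not something taken from these notes):

```c
/* n! for small non-negative n; each call gets its own copy of n. */
long factorial(int n) {
   if (n <= 1)
      return 1;                     /* base case stops the recursion */
   return n * factorial(n - 1);     /* recursive call on a smaller problem */
}
```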


The C Preprocessor

The two main uses we have for the preprocessor are file inclusion and constant definition.

#include "filename" and
#include <filename>
both attempt to include the named file.
The first starts looking in the current directory. The second starts in a system-dependent place.

#define name replacement can be used to define a constant. We will discuss more complex usage at a later time.
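
A small sketch that uses both features. MAXLINE and fits are hypothetical names; string.h is a standard header, so it is included with angle brackets.

```c
#include <string.h>      /* system header: found in the system-dependent place */

#define MAXLINE 80       /* every later MAXLINE is replaced by 80 before compiling */

/* 1 if the string s, plus its terminating '\0', fits in MAXLINE characters. */
int fits(const char *s) {
   return strlen(s) < MAXLINE;
}
```

Changing the limit later means editing one line instead of hunting for every 80 in the program.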