Functions

16 minute read

Adding New Functions

So far, we have only been using the functions that come with Python, but it is also possible to add new functions. A function definition specifies the name of a new function and the sequence of statements that run when the function is called.

Here is an example:

def print_lyrics():
    print("I'm a lumberjack, and I'm okay.")
    print("I sleep all night and I work all day.")

def is a keyword that indicates that this is a function definition. The name of the function is print_lyrics. The rules for function names are the same as for variable names: letters, numbers and underscore are legal, but the first character can’t be a number. You can’t use a keyword as the name of a function, and you should avoid having a variable and a function with the same name.

The empty parentheses after the name indicate that this function doesn’t take any arguments.

The first line of the function definition is called the header; the rest is called the body. The header has to end with a colon and the body has to be indented. By convention, indentation is always four spaces. The body can contain any number of statements.

If you type a function definition in the shell, the Python interpreter prints dots (…) to let you know that the definition isn’t complete:

>>> def print_lyrics():
...     print("I'm a lumberjack, and I'm okay.")
...     print("I sleep all night and I work all day.")
...

To end the function, you have to enter an empty line.

Defining a function creates a function object, which has type function:

>>> print_lyrics
<function print_lyrics at 0xb7e99e9c>
>>> type(print_lyrics)
<class 'function'>

The syntax for calling the new function is the same as for built-in functions:

>>> print_lyrics()
I'm a lumberjack, and I'm okay.
I sleep all night and I work all day.

Once you have defined a function, you can use it inside another function. For example, to repeat the previous refrain, we could write a function called repeat_lyrics:

def repeat_lyrics():
    print_lyrics()
    print_lyrics()

And then call repeat_lyrics:

>>> repeat_lyrics()
I'm a lumberjack, and I'm okay.
I sleep all night and I work all day.
I'm a lumberjack, and I'm okay.
I sleep all night and I work all day.

But that’s not really how the song goes.

Definitions and Uses

Pulling together the code fragments from the previous section, the whole program looks like this:

def print_lyrics():
    print("I'm a lumberjack, and I'm okay.")
    print("I sleep all night and I work all day.")

def repeat_lyrics():
    print_lyrics()
    print_lyrics()

repeat_lyrics()

This program contains two function definitions: print_lyrics and repeat_lyrics. Function definitions get executed just like other statements, but the effect is to create function objects. The statements inside the function do not run until the function is called, and the function definition generates no output.

As you might expect, you have to create a function before you can run it. In other words, the function definition has to run before the function gets called.

As an exercise, move the last line of this program to the top, so the function call appears before the definitions. Run the program and see what error message you get.

Now move the function call back to the bottom and move the definition of print_lyrics after the definition of repeat_lyrics. What happens when you run this program?

Flow of Execution

To ensure that a function is defined before its first use, you have to know the order statements run in, which is called the flow of execution.

Execution always begins at the first statement of the program. Statements are run one at a time, in order from top to bottom.

Function definitions do not alter the flow of execution of the program, but remember that statements inside the function don’t run until the function is called.

A function call is like a detour in the flow of execution. Instead of going to the next statement, the flow jumps to the body of the function, runs the statements there, and then comes back to pick up where it left off.

That sounds simple enough, until you remember that one function can call another. While in the middle of one function, the program might have to run the statements in another function. Then, while running that new function, the program might have to run yet another function!

Fortunately, Python is good at keeping track of where it is, so each time a function completes, the program picks up where it left off in the function that called it. When it gets to the end of the program, it terminates.

In summary, when you read a program, you don’t always want to read from top to bottom. Sometimes it makes more sense if you follow the flow of execution.

Parameters and Arguments

Some of the functions we have seen require arguments. For example, when you call math.sin you pass a number as an argument. Some functions take more than one argument: math.pow takes two, the base and the exponent.

Inside the function, the arguments are assigned to variables called parameters. Here is a definition for a function that takes an argument:

def print_twice(bruce):
    print(bruce)
    print(bruce)

This function assigns the argument to a parameter named bruce. When the function is called, it prints the value of the parameter (whatever it is) twice.

This function works with any value that can be printed.

>>> print_twice('Spam')
Spam
Spam
>>> print_twice(42)
42
42
>>> print_twice(math.pi)
3.14159265359
3.14159265359

The same rules of composition that apply to built-in functions also apply to programmer-defined functions, so we can use any kind of expression as an argument for print_twice:

>>> print_twice('Spam '*4)
Spam Spam Spam Spam
Spam Spam Spam Spam
>>> print_twice(math.cos(math.pi))
-1.0
-1.0

The argument is evaluated before the function is called, so in the examples the expressions 'Spam '*4 and math.cos(math.pi) are only evaluated once.

You can also use a variable as an argument:

>>> michael = 'Eric, the half a bee.'
>>> print_twice(michael)
Eric, the half a bee.
Eric, the half a bee.

The name of the variable we pass as an argument (michael) has nothing to do with the name of the parameter (bruce). It doesn’t matter what the value was called back home (in the caller); here in print_twice, we call everybody bruce.

Variables and Parameters are Local

When you create a variable inside a function, it is local, which means that it only exists inside the function. For example:

def cat_twice(part1, part2):
    cat = part1 + part2
    print_twice(cat)

This function takes two arguments, concatenates them, and prints the result twice. Here is an example that uses it:

>>> line1 = 'Bing tiddle '
>>> line2 = 'tiddle bang.'
>>> cat_twice(line1, line2)
Bing tiddle tiddle bang.
Bing tiddle tiddle bang.

When cat_twice terminates, the variable cat is destroyed. If we try to print it, we get an exception:

>>> print(cat)
NameError: name 'cat' is not defined

Parameters are also local. For example, outside print_twice, there is no such thing as bruce.

Stack Diagrams

To keep track of which variables can be used where, it is sometimes useful to draw a stack diagram. Like state diagrams, stack diagrams show the value of each variable, but they also show the function each variable belongs to.

Each function is represented by a frame. A frame is a box with the name of a function beside it and the parameters and variables of the function inside it. The stack diagram for the previous example is shown below.

Stack diagram.

The frames are arranged in a stack that indicates which function called which, and so on. In this example, print_twice was called by cat_twice, and cat_twice was called by __main__, which is a special name for the topmost frame. When you create a variable outside of any function, it belongs to __main__.

Each parameter refers to the same value as its corresponding argument. So, part1 has the same value as line1, part2 has the same value as line2, and bruce has the same value as cat.

If an error occurs during a function call, Python prints the name of the function, the name of the function that called it, and the name of the function that called that, all the way back to __main__.

For example, if you try to access cat from within print_twice, you get a NameError:

Traceback (innermost last):
  File "test.py", line 13, in __main__
    cat_twice(line1, line2)
  File "test.py", line 5, in cat_twice
    print_twice(cat)
  File "test.py", line 9, in print_twice
    print(cat)
NameError: name 'cat' is not defined

This list of functions is called a traceback. It tells you what program file the error occurred in, and what line, and what functions were executing at the time. It also shows the line of code that caused the error.

The order of the functions in the traceback is the same as the order of the frames in the stack diagram. The function that is currently running is at the bottom.

Return Values

Many of the Python functions we have used, such as the math functions, produce return values. But the functions we’ve written are all void: they have an effect, like printing a value, but they don’t have a return value.

When a function generates a return value, we usually assign to a variable or use as part of an expression.

e = math.exp(1.0)
height = radius * math.sin(radians)

The functions we have written so far are void. Speaking casually, they have no return value; more precisely, their return value is None.

How can we write a function that returns a value? Our first example is area, which returns the area of a circle with the given radius:

def area(radius):
    a = math.pi * radius**2
    return a

Notice that there is a return statement in the body of the function. This statement means: “Return immediately from this function and use the following expression as a return value.” The expression can be arbitrarily complicated, so we could have written this function more concisely:

def area(radius):
    return math.pi * radius**2

On the other hand, temporary variables like a can make debugging easier.

Once the function area is defined, we can use it in larger programs.

import math

def area(radius):
    return math.pi * radius**2

def main():
    r1 = float(input("Enter a radius: "))
    r2 = float(input("Enter another one: "))

    print("The area of the first circle is", area(r1))
    print("The area of the second one is", area(r2))

main()

Why functions?

It may not be clear why it is worth the trouble to divide a program into functions. There are several reasons:

Creating a new function gives you an opportunity to name a group of statements, which makes your program easier to read and debug.
Functions can make a program smaller by eliminating repetitive code. Later, if you make a change, you only have to make it in one place.
Dividing a long program into functions allows you to debug the parts one at a time and then assemble them into a working whole.
Well-designed functions are often useful for many programs. Once you write and debug one, you can reuse it.

Incremental Development

As you write larger functions, you might find yourself spending more time debugging.

To deal with increasingly complex programs, you might want to try a process called incremental development. The goal of incremental development is to avoid long debugging sessions by adding and testing only a small amount of code at a time.

As an example, suppose you want to find the distance between two points, given by the coordinates \((x_1, y_1)\) and \((x_2, y_2)\). By the Pythagorean theorem, the distance is:

\[d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}\]

The first step is to consider what a distance function should look like in Python. In other words, what are the inputs (parameters) and what is the output (return value)?

In this case, the inputs are two points, which you can represent using four numbers. The return value is the distance represented by a floating-point value.

Immediately you can write an outline of the function:

def distance(x1, y1, x2, y2):
    return 0.0

Obviously, this version doesn’t compute distances; it always returns zero. But it is syntactically correct, and it runs, which means that you can test it before you make it more complicated.

To test the new function, call it with sample arguments:

>>> distance(1, 2, 4, 6)
0.0

I chose these values so that the horizontal distance is 3 and the vertical distance is 4; that way, the result is 5, the hypotenuse of a 3-4-5 triangle. When testing a function, it is useful to know the right answer.

At this point we have confirmed that the function is syntactically correct, and we can start adding code to the body. A reasonable next step is to find the differences \(x_2 - x_1\) and \(y_2 - y_1\). The next version stores those values in temporary variables and prints them.

def distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    print('dx is', dx)
    print('dy is', dy)
    return 0.0

If the function is working, it should display dx is 3 and dy is 4. If so, we know that the function is getting the right arguments and performing the first computation correctly. If not, there are only a few lines to check.

Next we compute the sum of squares of dx and dy:

def distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    dsquared = dx**2 + dy**2
    print('dsquared is: ', dsquared)
    return 0.0

Again, you would run the program at this stage and check the output (which should be 25). Finally, you can use math.sqrt to compute and return the result:

def distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    dsquared = dx**2 + dy**2
    result = math.sqrt(dsquared)
    return result

If that works correctly, you are done. Otherwise, you might want to print the value of result before the return statement.

The final version of the function doesn’t display anything when it runs; it only returns a value. The print statements we wrote are useful for debugging, but once you get the function working, you should remove them. Code like that is called scaffolding because it is helpful for building the program but is not part of the final product.

When you start out, you should add only a line or two of code at a time. As you gain more experience, you might find yourself writing and debugging bigger chunks. Either way, incremental development can save you a lot of debugging time.

The key aspects of the process are:

Start with a working program and make small incremental changes. At any point, if there is an error, you should have a good idea where it is.
Use variables to hold intermediate values so you can display and check them.
Once the program is working, you might want to remove some of the scaffolding or consolidate multiple statements into compound expressions, but only if it does not make the program difficult to read.

Glossary

function definition: A statement that creates a new function, specifying its name, parameters, and the statements it contains.
function object: A value created by a function definition. The name of the function is a variable that refers to a function object.
header: The first line of a function definition.
body: The sequence of statements inside a function definition.
parameter: A name used inside a function to refer to the value passed as an argument.
argument: A value provided to a function when the function is called. This value is assigned to the corresponding parameter in the function.
local variable: A variable defined inside a function. A local variable can only be used inside its function.
flow of execution: The order statements run in.
stack diagram: A graphical representation of a stack of functions, their variables, and the values they refer to.
frame: A box in a stack diagram that represents a function call. It contains the local variables and parameters of the function.
traceback: A list of the functions that are executing, printed when an exception occurs.
temporary variable: A variable used to store an intermediate value in a complex calculation.
incremental development: A program development plan intended to avoid debugging by adding and testing only a small amount of code at a time.
scaffolding: Code that is used during program development but is not part of the final version.

Self Checks

Note: Try your hand at the following self checks, but do not be discouraged if you aren’t able to solve them. Writing functions is difficult at first, but coming to class having already thought about each of them will help.

Check 1

Use incremental development to write a function called hypotenuse that returns the length of the hypotenuse of a right triangle given the lengths of the other two legs as arguments. Record each stage of the development process as you go.

Check 2

A function object is a value you can assign to a variable or pass as an argument. For example, do_twice is a function that takes a function object as an argument and calls it twice:

def do_twice(f):
    f()
    f()

Here’s an example that uses do_twice to call a function named print_spam twice.

def print_spam():
    print('spam')

do_twice(print_spam)

Type this example into a script and test it.
Modify do_twice so that it takes two arguments, a function object and a value, and calls the function twice, passing the value as an argument.
Copy the definition of print_twice from earlier in this chapter to your script.
Use the modified version of do_twice to call print_twice twice, passing 'spam' as an argument.
Define a new function called do_four that takes a function object and a value and calls the function four times, passing the value as a parameter. There should be only two statements in the body of this function, not four.

Solution

Check 3

Write a function that draws a grid like the following:

+ - - - - + - - - - +
|         |         |
|         |         |
|         |         |
|         |         |
+ - - - - + - - - - +
|         |         |
|         |         |
|         |         |
|         |         |
+ - - - - + - - - - +

Note: This exercise should be done using only the statements and other features we have learned so far.

Hint: To print more than one value on a line, you can print a comma-separated sequence of values:

print('+', '-')

By default, print advances to the next line, but you can override that behavior and put a space at the end, like this:

print('+', end=' ')
print('-')

The output of these statements is '+ -' on the same line. The output from the next print statement would begin on the next line.

Write a function that draws a similar grid with four rows and four columns.