Abstract¶
Functions allow programmers to reuse code that is efficiently implemented and well-tested. This notebook not only demonstrates the basic syntax for using and writing functions, but also uses concrete examples to illustrate the importance of code reuse. It highlights the flexibility of functions by showing how they can be applied with different arguments to solve similar problems and how they can be customized to suit specific applications.
from __init__ import install_dependencies
await install_dependencies()
import math
%load_ext divewidgets
%load_ext jupyter_ai
%ai update chatgpt dive:chat
What is a Function?¶
A function is a callable object, e.g.:
callable(callable), callable(1)
The function callable is callable in the sense that
- it can be called/invoked with some input arguments/parameters such as 1 enclosed by parentheses (), and then
- returns some value computed from the input arguments, such as the boolean value False to indicate that the input argument 1 is not callable.
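As a quick check of the idea, the following sketch applies callable to a few more objects. The helper greet is a made-up example and not part of the notebook.

```python
def greet():  # hypothetical helper, for illustration only
    return "hi"

# Functions and classes can be called; plain numbers and strings cannot.
checks = {
    "len": callable(len),           # built-in function
    "greet": callable(greet),       # user-defined function
    "lambda": callable(lambda: 0),  # anonymous function
    "int": callable(int),           # classes are callable too
    "1": callable(1),
    "'text'": callable("text"),
}
print(checks)
```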
A function can be defined using the def keyword.
E.g., a simple function that prints “Hello, World!” can be defined as follows:
# Function definition
def say_hello():
    print("Hello, World!")
# Function invocation
say_hello()
To make a function more powerful and solve different problems,
- use a return statement to return a value that
- depends on some input arguments.
def increment(x):
    return x + 1
increment(3)
A function always returns a value. If no return statement is executed, None is returned by default.
print(f"The return value is {say_hello()}.")
We can also have multiple input arguments.
def length_of_hypotenuse(a, b):
    return (a ** 2 + b ** 2) ** 0.5
length_of_hypotenuse(1, 2), length_of_hypotenuse(3, 4)
The arguments are evaluated from left to right:
print("1st input:", input(), "\n2nd input:", input())
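The input() demo above needs a user at the keyboard. A non-interactive sketch of the same left-to-right rule is given below, assuming a hypothetical helper record that logs the moment it is evaluated; pair is likewise just an illustrative name.

```python
order = []  # log of the order in which the arguments are evaluated

def record(label):  # hypothetical helper, for illustration only
    order.append(label)
    return label

def pair(a, b):
    return (a, b)

# Both argument expressions are evaluated before pair is called.
result = pair(record("1st"), record("2nd"))
print(order)   # ['1st', '2nd']
print(result)  # ('1st', '2nd')
```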
Indeed, how arguments are passed into a function can be more complicated than you may think. To check if you have the correct understanding:
Does the code below increment x and print 4?
- Step 3: The function increment is invoked with an argument x.
- Step 3-4: A local frame is created for variables local to increment during its execution.
  - The formal parameter x in def increment(x): becomes a local variable and
  - it is assigned the value 3 of the actual parameter given by the global variable x.
- Step 5-6: The local (but not the global) variable x is incremented.
- Step 6-7: The function call completes and the local frame is removed.
%%optlite -l -h 400
def increment(x):
    x += 1
x = 3
increment(x)
print(x) # 4?
%%ai chatgpt -f text
Explain the differences in how Python, C, and Java pass arguments to functions.
In particular, explain
1. call by value,
2. call by reference, and
3. call by object reference.
Can we increment a variable instead of returning its increment?
In C++, it is possible to pass an argument by reference, allowing the local variable to point to the same memory location as the variable being passed in.
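Python has no direct counterpart of a C++ reference parameter, but a similar effect can be obtained when the argument is a mutable object. The following is a minimal sketch; the function names and the one-element list box are illustrative assumptions only.

```python
def increment_rebind(x):
    x += 1  # rebinds the local name x only; the caller's variable is untouched

def increment_in_place(box):
    box[0] += 1  # mutates the list object that the caller also refers to

x = 3
increment_rebind(x)

box = [3]
increment_in_place(box)

print(x, box[0])  # 3 4
```

Returning the incremented value, as in the earlier definition of increment, is usually the clearer choice.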
A fundamental property of functions in Python is that they are first-class citizens, which means that a function can be
- assigned to a variable,
- passed as an input argument, and
- returned by a function.
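The three properties can be sketched in a few lines. All the names below (square, apply_twice, make_adder) are hypothetical and chosen only for illustration.

```python
def square(x):
    return x * x

def apply_twice(f, x):
    # f is a function received as an input argument
    return f(f(x))

def make_adder(n):
    # a function that builds and returns another function
    def add(x):
        return x + n
    return add

sq = square              # assigned to a variable
add_two = make_adder(2)  # returned by a function

results = (sq(3), apply_twice(square, 3), add_two(5))
print(results)  # (9, 81, 7)
```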
%%ai chatgpt -f text
Are there programming languages that do not treat functions as first-class citizens? Why?
The following is a simple illustration using the def statement to define an identity function i that uses the return statement to return the input argument.
%%optlite -h 300
def i(x):
    return x
assert i(i) == i and i.__name__ == 'i'
A function can also be defined using the lambda expression, which creates an anonymous function:[1]
%%optlite -h 300
assert (i := lambda x: x)(i) == i \
and i.__name__ == "<lambda>"
A non-trivial example is the following implementation of the boolean values as functions:
%%optlite -h 450
def true(x, y): return x
def false(x, y): return y
def ifthenelse(b, x, y): return b(x, y)
assert ifthenelse(true, "A", "B") == "A"
assert ifthenelse(false, "A", "B") == "B"
Why the name <lambda> for an anonymous function?
The expression $\lambda x. M$, known as function abstraction, was defined by Alonzo Church, the Ph.D. supervisor of Alan Turing. This notation is used to define a function that can take (return) a string or another function passed in as $x$ (returned as $M$ but with $x$ substituted by the value passed in). What is surprising is that a machine capable of taking and applying all such functions is Turing complete. For more information, see the λ-calculus and watch the following video created with manim or another video here.
Perhaps you may also be interested in the following:
%%ai chatgpt -f text
Why Alonzo Church used lambda for lambda calculus?
Code Reuse¶
Previously, we learned about iteration, where the same piece of code can run multiple times. Function abstraction takes this even further: It allows the same piece of code to be executed with different parameters and at different locations. Code reuse is a good programming practice. If done properly, it makes the code readable and efficient. We will explore these benefits using a concrete example below.
Perfect Square¶
For instance, the first 10 perfect squares are:
for i in range(10):
print(i**2)
Instead of generating perfect squares, how about writing a function that checks if a number is a perfect square?
def is_perfect_square(n):
    ### BEGIN SOLUTION
    return n == math.isqrt(n) ** 2
    ### END SOLUTION
# test cases
assert is_perfect_square(10**2)
assert not is_perfect_square(10**2 + 1)
assert is_perfect_square(10**10)
assert not is_perfect_square(10**10 + 1)
assert is_perfect_square(10**100)
assert not is_perfect_square(10**100 + 1)
As another demonstration of code reuse, the following solution uses a for loop to implement Definition 1 exactly.
def is_perfect_square(n):
    # checks if n is the square of i for i in the range up to n (exclusive).
    for i in range(n):
        if i**2 == n:
            return True
    return False
If you try running the test on the above solution, it will take an unacceptably long time to run.[2] (Why?)
To properly test the function, we should modify it to fail if it takes too long to run. Implementing such a feature, called timeout, is difficult. Fortunately, we can reuse the code written by others. Run the following cell to
- install the package wrapt_timeout_decorator, and
- import the function timeout from the module wrapt_timeout_decorator.
%pip install wrapt_timeout_decorator >/dev/null 2>&1
from wrapt_timeout_decorator import timeout
You will learn how to import a function in a subsequent section (Importing External Modules). For now, let’s see how to use the timeout function:
# enhanced test with timeout
duration = 5
@timeout(duration)  # raise error if the test does not complete in 5 seconds.
def test():
    if input(f"Run the test with a timeout of {duration}s? [Y/n]").lower() != "n":
        assert is_perfect_square(10**2)
        assert not is_perfect_square(10**2 + 1)
        assert is_perfect_square(10**10)
        assert not is_perfect_square(10**10 + 1)
        assert is_perfect_square(10**100)
        assert not is_perfect_square(10**100 + 1)
test()  # run the test
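Does the for loop implementation pass all the test cases?
No. The test assert not is_perfect_square(10**10 + 1) times out because Python is not fast enough to go through about 10**10 numbers in 5 seconds. (The preceding test on 10**10 itself returns quickly because the loop stops as soon as i reaches 10**5.)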
To add timeout to the test, we simply
- wrapped the test inside a function test, and
- decorated it with @timeout(duration).
The function test will then be capable of raising a TimeoutError if it takes more than the specified time duration to run.
You will learn how to write a decorator later. It is a powerful way to reuse functions and other objects with additional customizations.
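As a preview only, the sketch below shows the general shape of a decorator that takes an argument, in the same way timeout takes duration. The names announce, demo and demo_plain are hypothetical, and the code says nothing about how wrapt_timeout_decorator is actually implemented.

```python
import functools

calls = []  # log filled in by the decorated functions

def announce(label):  # hypothetical decorator factory, for illustration only
    def decorator(func):
        @functools.wraps(func)  # keep the name and docstring of func
        def wrapper(*args, **kwargs):
            calls.append(label)  # extra behaviour added around the call
            return func(*args, **kwargs)
        return wrapper
    return decorator

@announce("decorated")
def demo():
    return "done"

def demo_plain():
    return "done"

# The @ syntax above is shorthand for this kind of reassignment:
demo_manual = announce("manual")(demo_plain)

print(demo(), demo_manual(), calls)
```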
Integer Square Root¶
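To improve the efficiency, consider the following sufficient and necessary condition for perfect squares: a non-negative integer $n$ is a perfect square if and only if $n = \lfloor \sqrt{n} \rfloor^2$, i.e., $n$ is the square of its integer square root.
A simple implementation is as follows: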
def is_perfect_square(n):
    # check if n is the square of its integer square root
    return n == int(n**0.5) ** 2
assert not is_perfect_square(10**10 + 1)
Note that it fixed the efficiency issue on the test case with n being 10**10 + 1. Let’s run all the test cases:
test()
Does it pass all the test cases?
No. The test assert is_perfect_square(10**100) fails because of the finite precision of floating point numbers.
Perhaps we should use math.isclose instead of ==, since int(n**0.5) ** 2 is computed from the float n**0.5.
def is_perfect_square(n):
    return math.isclose(n, int(n**0.5) ** 2)
assert is_perfect_square(10**100)
Note that it can correctly say 10**100 is a perfect square. Let’s run all the test cases:
test()
Does it pass all the test cases?
No. The test assert not is_perfect_square(10**10 + 1) fails because the tolerance is too high.
How to fix the issue? The culprit is that the computation for integer square root is not exact:
x = 10**100
int((x) ** 0.5)
There are better ways to compute integer square root. Binary search is a relatively easy one to try first, although it is not the best choice.
%%ai chatgpt -f text
Explain how integer square root can be implemented in python using binary search.
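One possible binary search is sketched below. The name isqrt_bisect is hypothetical, and the code is meant as an exercise in exact integer arithmetic rather than a replacement for a library routine.

```python
import math

def isqrt_bisect(n):  # hypothetical helper, for illustration only
    """Return the largest integer r with r**2 <= n, for an integer n >= 0."""
    if n < 0:
        raise ValueError("n must be non-negative")
    lo, hi = 0, n + 1  # invariant: lo**2 <= n < hi**2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= n:
            lo = mid  # mid is still a valid lower bound
        else:
            hi = mid  # mid is too large
    return lo

print(isqrt_bisect(10**100) == math.isqrt(10**100))
```

Since only integers are involved, there is no rounding error, and the number of iterations grows with the number of bits of n rather than with n itself.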
But there is a much easier way to have a better implementation: Code reuse! Try math.isqrt for Exercise 1 and check that you can pass all the test cases instantly.
x = 10**100
math.isqrt(x), int((x) ** 0.5)
How is isqrt implemented?
math.isqrt implements an adaptive-precision pure-integer version of Newton’s iteration.
The source code is written in C, but there is a Python implementation given in the source code, along with a sketch of proof.
%%ai chatgpt -f text
Explain briefly in two paragraph how isqrt is implemented as an adaptive-precision pure-integer version of Newton's iteration.
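For intuition only, here is a plain integer Newton iteration. It is a simplified sketch and not the adaptive-precision variant that CPython uses; the name isqrt_newton is hypothetical.

```python
import math

def isqrt_newton(n):  # hypothetical helper, for illustration only
    """Return floor(sqrt(n)) for an integer n >= 0 using integer arithmetic."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0
    # Start from a power of two that is larger than sqrt(n).
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        y = (x + n // x) // 2  # Newton step for x**2 - n, rounded down
        if y >= x:             # no further decrease: x is the answer
            return x
        x = y

print(isqrt_newton(10**100) == math.isqrt(10**100))
```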
While you may want to write self-contained code that does not rely on external libraries, code reuse advocates would recommend using standard libraries as much as possible. Why?
Indeed, the math library provides functions that it does not implement:
CPython implementation detail: The math module consists mostly of thin wrappers around the platform C math library functions. - pydoc last paragraph
E.g., see the source code wrapper for log.[3]
%%ai chatgpt -f text
When working on a programming assignment, should I write all the code myself, or is it acceptable to use standard libraries?
The ultimate dilemma: Should I reuse code from LLM for a programming assignment? See if LLM can resolve the dilemma below.[4]
%%ai chatgpt -f text
When working on a programming assignment, should I use LLM to write the code for me?
Modules¶
To facilitate code reuse, all Python code is organized into libraries called modules. E.g., you can list the installed packages, which provide additional modules, using pip list:
%pip list
Python searches for packages using the search path:
import sys
sys.path
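To see where a particular module would be loaded from without importing it, one option is importlib.util.find_spec from the standard library. The module name json below is only an example, and the printed location depends on your installation.

```python
import importlib.util

# Look up a module along the search path without importing it.
spec = importlib.util.find_spec("json")
print(spec.name)
print(spec.origin)  # file the module would be loaded from

# A top-level name that cannot be found gives None instead of an error.
missing = importlib.util.find_spec("no_such_module_for_this_demo")
print(missing)
```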
For instance, to show the location of a package, say divewidgets, run:
%pip show divewidgets
See Also
pip can also be used to install packages temporarily until your Jupyter server restarts. You are free to install any packages, and if things break, you can simply restart your Jupyter server to reset. For advanced users who would like to install a package that persists over restarts of the server, create and activate a virtual environment using mamba. Indeed, we used mamba to install Python and Jupyter in your Jupyter server as shown in the dockerfile.
%%ai chatgpt -f text
Explain in a paragraph or two what pip is, and compare it with other
alternatives like mamba.
%%ai chatgpt -f text
Explain how to use mamba to
1. create a virtual environment called test,
2. install the package wrapt_timeout_decorator, and
3. make the environment available as a jupyter kernel.
Note that I already have mamba and jupyter installed.
Builtins Module¶
In Python, every function must come from a module, including the built-in functions:
__builtins__.print(f"`{print.__name__}` is from the {print.__module__} module.")
The builtins are automatically imported as __builtins__ (and also __builtin__) along with all the functions and objects they provide because they are commonly used by programmers.
We can use the built-in function dir (directory) to list all built-in objects available.
dir(__builtins__)
For instance, there is a built-in function help for showing the docstring (documentation string) of functions or other objects.
help(help) # can also show the docstring of help itself
help(__builtins__) # can also show the docstring of a module
print(dir())
Solution to Exercise 2
As summarized by the first line of the docstring of dir:
If called without an argument, return the names in the current scope.
Importing External Modules¶
For other available modules, we can use the import statement to import multiple functions or objects into the program global frame.
%%optlite -h 300
from math import ceil, log10
x = 1234
print("Number of digits of x:", ceil(log10(x)))
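The above imports both the functions log10 and ceil from math to compute the number of digits of a strictly positive integer x. Note that ceil(log10(x)) is one short when x is an exact power of 10, e.g., it gives 3 for x = 1000.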
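A sketch comparing ways to count digits is given below. The helper num_digits is hypothetical. The variant floor(log10(x)) + 1 handles powers of 10, but because log10 works with floats, it should only be trusted for moderately sized integers.

```python
from math import ceil, floor, log10

def num_digits(x):  # hypothetical helper, for illustration only
    """Number of decimal digits of a strictly positive integer x."""
    return len(str(x))  # counts the characters of the decimal representation

# Columns: x, ceil(log10(x)), floor(log10(x)) + 1, num_digits(x)
for x in (1, 999, 1000, 1234):
    print(x, ceil(log10(x)), floor(log10(x)) + 1, num_digits(x))
```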
One can also import all functions from a library:
%%optlite -h 300
from math import * # import all except names starting with an underscore
print("{:.2f}, {:.2f}, {:.2f}".format(sin(pi / 6), cos(pi / 3), tan(pi / 4)))
The above uses the wildcard * to import (nearly) all the functions/variables provided in math.
What if different packages define the same function?
In the following code:
- The function pow imported from math overwrites the built-in function pow.
- Unlike the built-in function, pow from math returns only floats but not integers or complex numbers.
- We say that the import statement polluted the namespace of the global frame and caused a name collision.
%%optlite -h 500
print("{}".format(pow(-1, 2)))
print("{:.2f}".format(pow(-1, 1 / 2)))
from math import *
print("{}".format(pow(-1, 2)))
print("{:.2f}".format(pow(-1, 1 / 2)))
To avoid name collisions, it is a good practice to use the full name (fully-qualified name) such as math.pow prefixed with the module.
%%optlite -h 350
import math
print("{:.2f}, {:.2f}".format(math.pow(-1, 2), pow(-1, 1 / 2)))
Using the full name can be problematic if the name of a module is very long. There can even be a hierarchical structure.
E.g., to plot a sequence using the pyplot module from the matplotlib package:
%matplotlib widget
import matplotlib.pyplot
matplotlib.pyplot.stem([4, 3, 2, 1])
matplotlib.pyplot.ylabel(r"$x_n$")
matplotlib.pyplot.xlabel(r"$n$")
matplotlib.pyplot.title("A sequence of numbers")
matplotlib.pyplot.show()
In Python, modules can be structured into packages, which are themselves modules that can be imported. It is common to rename matplotlib.pyplot as plt:
import matplotlib.pyplot as plt
plt.stem([4, 3, 2, 1])
plt.ylabel(r"$x_n$")
plt.xlabel(r"$n$")
plt.title("A sequence of numbers")
plt.show()
We can also rename a function as we import it to avoid name collision:
from math import pow as fpow
fpow(2, 2), pow(2, 2)
%%optlite -h 500
import math as m
for m in range(5):
    m.pow(m, 2)
Solution to Exercise 3
There is a name collision: m is assigned to an integer in the for loop and so it is no longer the module math when calling m.pow.
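One way to remove the collision, sketched here with an arbitrary loop variable name k, is to keep the alias and the loop variable distinct:

```python
import math as m

squares = []
for k in range(5):  # k no longer shadows the module alias m
    squares.append(m.pow(k, 2))

print(squares)  # [0.0, 1.0, 4.0, 9.0, 16.0]
```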
Documentation¶
Understanding how to properly document a function is crucial for maintaining clear and efficient code. It also allows others to use the code properly to avoid bugs. How should one go about documenting a function effectively? As an example:
# Author: John Doe
# Last modified: 2020-09-14
def increment(x):
    """Increment by 1.

    A simple demo of
    - parameter passing,
    - return statement, and
    - function documentation."""
    return x + 1  # + operation is used and may fail for 'str'
The help command shows the docstring we write
- at the beginning of the function body
- delimited using triple single/double quotes.
help(increment)
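The docstring is also available programmatically. The sketch below uses a shortened copy of increment as an assumption for illustration, reads the raw string from the __doc__ attribute, and obtains a version with cleaned-up indentation from inspect.getdoc in the standard library.

```python
import inspect

def increment(x):
    """Increment by 1.

    A simple demo of function documentation."""
    return x + 1  # this comment is not part of the docstring

raw = increment.__doc__            # the string literal exactly as written
clean = inspect.getdoc(increment)  # same text with indentation normalized

print(clean)
```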
The docstring should contain the usage guide, i.e., information for new users to call the function properly. See Python style guide (PEP 257) for docstring conventions.
Why doesn’t help show the comments that start with #?
# Author: John Doe
# Last modified: 2020-09-14
def increment(x):
    ...
    return x + 1  # + operation is used and may fail for 'str'
Those comments are not usage guide. They are intended for programmers who need to maintain/extend the function definition:
- Information about the author and modification date facilitates communication among programmers.
- Comments within the code help explain important and not-so-obvious implementation details.
We can also annotate the function with type hints to indicate the types of the arguments and return value.
# Author: John Doe
# Last modified: 2020-09-14
def increment(x: float) -> float:
    """Increment by 1.

    A simple demo of
    - parameter passing,
    - return statement, and
    - function documentation."""
    return x + 1  # + operation is used and may fail for 'str'
help(increment)
Annotations, if done right, can make the code easier to understand. However, annotations are not enforced by the Python interpreter.[5]
def increment_user_input():
    return increment(input())  # no error when defined, even though input returns str
Does calling the function lead to any error?
increment_user_input()
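Because annotations are only stored and not checked, any enforcement has to be written explicitly. A minimal sketch follows, where increment_checked is a hypothetical variant and the isinstance test is just one possible policy:

```python
def increment(x: float) -> float:
    return x + 1

def increment_checked(x: float) -> float:  # hypothetical variant
    # Explicit runtime check; the annotation alone would not do this.
    if not isinstance(x, (int, float)):
        raise TypeError(f"expected a number, got {type(x).__name__}")
    return x + 1

print(increment.__annotations__)  # annotations are kept as plain metadata

for f in (increment, increment_checked):
    try:
        f("3")
    except TypeError as error:
        print(f.__name__, "raised TypeError:", error)
```

In both cases a TypeError is raised only when the function is called, but the checked version fails with a message that refers to the expected type.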
# Author: John Doe
# Last modified: 2020-09-14
def increment(x: float) -> float:
    """Increment by 1.

    A simple demo of
    - parameter passing,
    - return statement, and
    - function documentation.

    Parameters
    ----------
    x: float
        Value to be incremented.

    Returns
    -------
    float:
        Value of x incremented by 1.
    """
    return x + 1  # + operation is used and may fail for 'str'
help(increment)
How to turn the docstrings into a user reference?
To facilitate the generation of documentation, there are tools such as:
- sphinx, which can compile the docstrings automatically into an API reference.
- nbdev, which releases a package and compiles the documentation from Jupyter notebooks, hence providing a literate programming experience.
[1] Note that the colon in lambda ...: cannot be followed by a line break because it expects an expression rather than a suite.
[2] Use the keyboard interrupt (■) to stop the execution.
[3] An efficient implementation often uses the CORDIC algorithm.
[4] For an official answer, see the grading policy in Lab0/Course_Materials.ipynb.