Values and Variables

Abstract

In this notebook, readers will see how different expressions in a computer program get evaluated to values of different types. With variables, programmers can assign a meaningful name, known as an identifier, to a value of any type without having to worry about where the value is stored in physical memory or how it is moved around for computation. At this point, readers may regard identifiers simply as variables that can take different values, although this model will be refined later with the concepts of aliasing and mutation.

from __init__ import install_dependencies

await install_dependencies()

import sys  # to access system-specific parameters
from dis import dis  # for disassembly of bytecode
from ipywidgets import interact  # for interactive user interface controls

# use OPTLite to visualize code execution
%load_ext divewidgets
# Set LLM alias
%load_ext jupyter_ai
%ai update chatgpt dive:chat

Values

Programming is the art of instructing a computer to perform tasks by manipulating data, which is represented as values. In mathematical terms, a computer (programming language) is essentially a set of rules for how values are stored and manipulated^[1]. In other words, values are the building blocks of computation: each value represents a specific piece of data that can be manipulated in various ways.

Integers

While a machine language only uses binary numbers as values, a high-level programming language provides more flexible and expressive ways to represent and manipulate values.

In Python, for instance, we can enter an integer in different number systems as follows.

# in decimal (base 10)
15

# in 'b'inary (base 2)
0b1111

# in 'o'ctal (base 8)
0o17

# in he'x'adecimal (base 16)
0xF

All the above expressions are integer literals, namely, integers written out literally. They have the same numerical value, which gets printed in decimal by default.
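As a quick check, the following sketch compares the four literals and converts between number systems using the built-in functions bin, oct, hex, and int. The variable names are chosen only for illustration.

```python
# The four literals denote the same integer value.
same_value = 15 == 0b1111 == 0o17 == 0xF
print(same_value)

# Built-in functions convert an integer to a string in another base...
print(bin(15), oct(15), hex(15))

# ...and int() parses a string of digits written in a given base.
from_binary = int("1111", 2)
from_hex = int("F", 16)
print(from_binary, from_hex)
```

Note that bin, oct, and hex return strings rather than integers, which is why their results keep the 0b, 0o, and 0x prefixes.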

There are also expressions that have integer values but are not integer literals:

4 + 5 + 6

pow(2, 4) - 1

### BEGIN SOLUTION
# There is (in principle) no limit on how big an integer can be
10 ** sys.get_int_max_str_digits()  # this is okay as long as it is not printed!
10 ** sys.get_int_max_str_digits() - 1
### END SOLUTION
# SPOILER: Printing an integer actually involves converting it to a string!

%%ai chatgpt -f text
Use a simple analogy to explain in a short paragraph how a computer can represent all integers using a variable-length code.

Strings

A string value is a sequence of characters that can be written literally using quotes:

# single quote
print('\U0001f600:\n\tI\'m a string.')

# double quote
print("\N{grinning face}:\n\tI'm a string.")

# triple double quote
print(
    """😀:
	I'm a string."""
)

Note that all three literals represent the same value.

Escape sequence

\U0001f600 and \N{grinning face} are escape sequences representing the same grinning face emoji 😀 where

0001f600 is the Unicode code point in hexadecimal,
grinning face is the name, and
\ is called the escape symbol.

Control code

\n and \t are control codes that do not represent any printable symbol.

\n creates a new line when printing the string.
\t creates a tab to indent the line.
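The following sketch checks the claims above using the standard unicodedata module. The variable names are illustrative only.

```python
import unicodedata

by_code_point = "\U0001f600"
by_name = "\N{grinning face}"

# Both escape sequences produce the same one-character string.
same_character = by_code_point == by_name
code_point = hex(ord(by_code_point))
official_name = unicodedata.name(by_code_point)
print(same_character, code_point, official_name)

# An escape sequence such as \n is typed as two characters but stored as one.
newline_length = len("\n")
print(newline_length)

# repr() shows the escape sequences instead of acting on them.
print(repr("a\n\tb"))
```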

%%ai chatgpt -f text
Explain in one line why a string value is often quoted in a computer program?

%%ai chatgpt -f text
Are there programming languages that do not quote their strings? Why?

The following is yet another way to print the same string:

print("\N{grinning face}:", "\tI'm a string.", sep="\n")

It is an elegant one-liner (a single line of code) where

sep="\n" is a keyword argument that specifies the separator of the list of strings.
The default separator is a single space character, i.e., sep=" ".
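To see the effect of the separator more directly, the sketch below prints the same arguments with different keyword arguments. Writing to an io.StringIO buffer through the file keyword argument is only a convenience chosen here so that the printed text can be inspected afterwards.

```python
import io

buffer = io.StringIO()  # collects the printed text so that it can be inspected

print("x", "y", "z", file=buffer)                     # default: sep=" "
print("x", "y", "z", sep="-", file=buffer)            # custom separator
print("x", "y", "z", sep="", end="!\n", file=buffer)  # end replaces the final newline

lines = buffer.getvalue().splitlines()
print(lines)
```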

In a notebook, we can get the docstring (document string) of a function conveniently using the symbol ? such as ?print or

print?

We can also use the contextual help by placing the cursor over a function name and

click the menu item Help $\to$ Show Contextual Help or
press the shortcut key Shift + Tab.
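Outside a notebook, where the ? syntax is not available, the docstring can still be read with plain Python. A minimal sketch, assuming the standard CPython built-in print:

```python
import inspect

# The docstring is stored on the function object itself.
docstring = print.__doc__
print(docstring)

# inspect.getdoc returns the same text with its indentation cleaned up.
cleaned = inspect.getdoc(print)
mentions_sep = "sep" in cleaned
print(mentions_sep)
```

The built-in help function, as in help(print), displays similar documentation in an interactive session.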

### BEGIN SOLUTION
# install art module
%pip install art >/dev/null 2>&1
import art

yyyy, mm = "2024", "09"
art.tprint(
    f"""{yyyy}{mm}CS1302
Intro to Comp Progm'g"""
)
### END SOLUTION

Test your message below. Try switching to a more powerful LLM such as dive-azure:gpt4o if you have already installed your API key.

%%ai chatgpt -f text
Explain what you see in the following:
 (ง •̀_•́)ง 
 ╰(●’◡’●)╮ 
 (..•˘_˘•..)
 (づ￣ 3￣)づ

User Input

Instead of entering a value in a program, a programmer can get user input values at runtime, i.e., when a program executes:

print("Your name is", input("Please input your name: ") + ".")

The input function prints its argument, if any, as a prompt.
The function takes the user’s input and returns it as a string.
There is no need to delimit the input string by quotation marks. Simply press enter after typing a string.

print("My name is", print("Python"))

Solution to Exercise 3

Unlike input, the function print does not return the string it is trying to print. Printing a string is, therefore, different from returning a string.
print actually returns a None object that gets printed as None.
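The difference between printing and returning can be checked by storing the result of print in a variable. The name result is chosen only for illustration.

```python
result = print("Python")  # "Python" is printed as a side effect

# The value returned by print is None, not the string that was printed.
returned_none = result is None
print(returned_none)
```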

%%ai chatgpt -f text
Explain in one line whether the Python print function returns the value it prints.

Variables

A complicated computation often needs to be broken down into many basic computations, with intermediate values stored and transferred to different memory locations. Keeping track of where a value is written, and allocating free memory locations to write to, are not only burdensome but also error-prone.

This is where the concept of variables comes in—they serve as a logical (as opposed to physical) unit of storage that abstracts away the complexities of memory management.

Assignment

To define a variable, we can use the assignment operator = as follows:

x = 15

What does the above code mean?

$x$ is equal to 15?
$x$ is defined to be 15?

Let’s discover the truth by executing the following assignments step-by-step using OPTLite:

%%optlite -h 300
x = 15
x = x + 1

In the second assignment, does it mean:

$x$ is equal to $x$ +1?
$x$ is defined to be $x$ +1?

If we say yes to the above questions, the value of $x$ should be $\pm\infty$ . (Why?)

%%ai chatgpt -f math
What is the solution to $x+1=x$?

To see how the Python interpreter carries out the assignment operation, the following code compiles the Python code to bytecode (a lower-level instruction set, similar in spirit to machine code) and displays it in a human-readable form resembling assembly language.^[2]

dis(compile('x = x + 1', '_', 'exec'))
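The order of operations in the disassembly can also be checked programmatically. The sketch below lists the operation names with dis.get_instructions; the exact names vary between Python versions, so it only relies on LOAD_NAME and STORE_NAME, which are assumed to be present for this module-level assignment.

```python
from dis import get_instructions

code = compile("x = x + 1", "_", "exec")
operations = [instruction.opname for instruction in get_instructions(code)]
print(operations)

# The old value of x is loaded before the new value is stored back to x.
load_position = operations.index("LOAD_NAME")
store_position = operations.index("STORE_NAME")
print(load_position < store_position)
```

In other words, the right-hand side x + 1 is evaluated first, and only then is the name x bound to the result, which is why x = x + 1 is an instruction rather than an equation.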

It is possible to assign different values to multiple variables in one line using the so-called tuple assignment syntax:

%%optlite -l -h 400
x, y, z = "15", "30", 15

One can also assign the same value to different variables in one line using a chained assignment:

%%optlite -l -h 400
x = y = z = 0
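A common use of tuple assignment is swapping two variables without a temporary variable, because the right-hand side is evaluated completely before any name is assigned. The sketch below also checks that a chained assignment binds all the names to the same object. The variable names are illustrative only.

```python
x, y = "15", "30"
x, y = y, x  # the right-hand side is evaluated first, so the values are swapped
print(x, y)

a = b = c = 0
same_object = a is b is c  # all three names refer to one and the same object
print(same_object)
```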

Once defined, a variable can be deleted using the del keyword. Accessing a variable that has not been assigned any value raises an error.

%%optlite -h 350
x = y = 1+1j
del x
x
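The error raised by the last line above is a NameError. A minimal sketch that catches it, and shows that deleting x leaves y untouched:

```python
x = y = 1 + 1j
del x  # removes the name x, not the value it referred to

try:
    x
    outcome = "x is still defined"
except NameError as error:
    outcome = f"NameError: {error}"

print(outcome)
print(y)  # y still refers to the complex number
```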

%%ai chatgpt -f text
How would you define constants in Python? Are literals also constants?

Identifiers

One reason why Python is expressive is that it affords programmers a significant amount of flexibility when it comes to choosing variable names. For instance, identifiers, such as variable names, are case-sensitive and of unlimited length, unlike older languages such as Pascal and Fortran. This flexibility also makes the program more readable. For instance, consider the following program:

%%optlite -h 400
def name():
    return first.name + last.name

first.name = "John"
last.name = "Smith"
print(name())

Unfortunately, the program fails in the middle. Why? Let’s take a closer look at the operations involved:

dis(compile("first.name + last.name", "_", "eval"))

How about the following fix?

dis(compile("first-name + last-name", "_", "eval"))
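Another way to see how the interpreter reads the two expressions is to parse them with the standard ast module, as in the sketch below. Neither expression is evaluated here, so the undefined names cause no error.

```python
import ast

dotted = ast.parse("first.name", mode="eval").body
hyphenated = ast.parse("first-name", mode="eval").body

# first.name is read as the attribute `name` of a variable called `first`.
print(type(dotted).__name__, dotted.attr)

# first-name is read as a binary operation: `first` minus `name`.
print(type(hyphenated).__name__, type(hyphenated.op).__name__)
```

In both cases, the interpreter sees two separate names joined by an operator rather than a single identifier.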

Obviously, not all names are valid identifiers as some names may be misinterpreted by the Python interpreter. Instead of fixing the names by trial-and-error, let’s try to learn the exact syntax for identifiers

identifier   ::=  xid_start xid_continue*
...

which is specified in a notation called the Extended Backus-Naur Form (EBNF). Furthermore, some identifiers called keywords are reserved. There are also soft keywords that are reserved under specific contexts.

%%ai chatgpt -f text
Explain in one paragraph the notation of the following in pydoc:
identifier ::= xid_start xid_continue*

Tip

If you find EBNF too difficult to understand now, consider the following rules of thumb:

Rule 1: Start a variable name with a letter or _ (an underscore), followed by letters, digits, or _.

Rule 2: Do not use any of the following reserved words:

False      await      else       import     pass
None       break      except     in         raise
True       class      finally    is         return
and        continue   for        lambda     try
as         def        from       nonlocal   while
assert     del        global     not        with
async      elif       if         or         yield

That should work most of the time except for a few specific cases.^[3]

@interact
def identifier_syntax(
    assignment=[
        "a-number = 15",
        "a_number = 15",
        "15 = 15",
        "_15 = 15",
        "del = 15",
        "Del = 15",
        "type = print",
        "print = type",
        "input = print",
    ]
):
    exec(assignment)
    print("Ok.")

Solution to Exercise 4

a-number = 15 violates Rule 1 because - is not allowed. - is interpreted as an operator.
15 = 15 violates Rule 1 because 15 starts with a digit instead of letter or _.
del = 15 violates Rule 2 because del is a keyword.
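The two rules of thumb can be approximated in code with str.isidentifier and the standard keyword module. The helper is_usable_name below is a hypothetical function written for this illustration, not part of Python.

```python
import keyword


def is_usable_name(name):
    """Hypothetical helper: True if name can be used as a variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)


candidates = ["a-number", "a_number", "15", "_15", "del", "Del"]
results = {name: is_usable_name(name) for name in candidates}
print(results)
```

Names such as type, print, and input also pass this check because they are ordinary identifiers rather than keywords, although assigning to them hides the built-in functions of the same name.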

Type Conversion

The following program tries to compute the sum of two numbers from user inputs:

num1 = input("Please input an integer: ")
num2 = input("Please input another integer: ")
print(num1, "+", num2, "is equal to", num1 + num2)

Solution to Exercise 5

The two numbers are concatenated instead of added together.

input returns user input as a string. E.g., if the user enters 12, the input is

not treated as the integer twelve, but rather
treated as a string containing two characters, one followed by two.

To confirm this, we can use type to return the data type of an expression.

num1 = input("Please input an integer: ")
print("Your input is", num1, "with type", type(num1))

type(15), type(print), type(print()), type(input), type(type), type(type(type))

What happens when we add strings together?

"4" + "5" + "6"

How to fix the bug then?

We can convert a string to an integer using int.

int("4") + int("5") + int("6")

We can also convert an integer to a string using str.

str(4) + str(5) + str(6)

num1 = input("Please input an integer: ")
num2 = input("Please input another integer: ")
# print(num1, '+', num2, 'is equal to', num1 + num2)  # fix this line below
### BEGIN SOLUTION
print(num1, "+", num2, "is equal to", int(num1) + int(num2))
### END SOLUTION
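The conversion with int fails if the string does not look like an integer. The sketch below uses fixed strings in place of user input so that it runs without interaction; add_as_integers is a hypothetical helper introduced only for this example.

```python
def add_as_integers(text1, text2):
    """Hypothetical helper: convert two strings to integers and add them."""
    return int(text1) + int(text2)


total = add_as_integers("12", "30")
print(total)

# A string such as "3.5" is not an integer literal, so int() raises ValueError.
try:
    add_as_integers("12", "3.5")
    problem = None
except ValueError as error:
    problem = str(error)
print(problem)
```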

Error Types

In addition to writing code, a programmer spends significant time debugging code that contains errors. A natural question is:

Can an error be automatically detected by the computer?

You have just seen an example of a logical error, which is due to an error in the logic. The ability to debug or even detect such an error is, unfortunately, beyond the Python interpreter’s capability.

Let’s see if an LLM can debug a logical error:

%%ai chatgpt -f text
What is wrong with the following code?
--
num1 = input("Please input a number: ")
num2 = input("Please input another number: ")
print(num1, "+", num2, "is equal to", num1 + num2)

Refining questions based on the LLM’s responses can yield more meaningful answers. Essentially, we should learn to ask questions to gain knowledge:

"學問" == "學" "問"

Other kinds of errors may be detected automatically by Python. As an example, note that the equality operator == is not the same as the assignment operator =:

%%optlite -l -h 400
print("Assignment:")
"學問" = "學" "問"

As another example, juxtaposition is not the same as addition +:

%%optlite -l -h 400
print("Juxtaposition:")
3 == 2 1

The Python interpreter detects the bug and raises a syntax error.
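A syntax error is detected when the code is compiled, before any line of it runs. The sketch below compiles the same two lines with the built-in compile function without executing them; the file name "<example>" is an arbitrary label.

```python
source = 'print("Juxtaposition:")\n3 == 2 1'

try:
    compile(source, "<example>", "exec")  # compile only, nothing is executed
    outcome = "compiled"
except SyntaxError as error:
    outcome = f"SyntaxError on line {error.lineno}"

print(outcome)
```

This explains why the message Juxtaposition: is never printed: the error on the second line prevents the whole cell from running, including the valid first line.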

The following code raises a different kind of error.

%%optlite -l -h 500
print("Add integer to string:")
"4" + "5" + 6  # adding string to integer

Python is a strongly-and-dynamically-typed language:

Dynamically-typed: data types are checked only at runtime, when the code is actually executed.
Strongly-typed: the language does not force a type conversion to avoid a type error.
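Both properties can be observed in a short sketch. The function broken is a hypothetical example: defining it succeeds even though its body contains the faulty expression, and the error appears only when the function is called.

```python
def broken():
    # The operand types are not checked until this line actually runs.
    return "4" + "5" + 6


defined_without_error = callable(broken)
print(defined_without_error)

try:
    broken()
    message = None
except TypeError as error:
    # Python refuses to convert 6 to a string implicitly.
    message = str(error)
print(message)
```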

To understand what the above actually means, let’s consider how other languages work differently.

C/C++ and Java are statically-typed languages that check data types during compilation, so the type error above becomes a compile-time error instead of a runtime error.

try:
    !gcc 456.cpp
except Exception:
    print('Cannot run shell command.')

try:
    !javac 456.java
except Exception:
    print('Cannot run shell command.')

At the opposite extreme, the web programming language JavaScript does not raise an error at all:

%%javascript
let x = '4' + '5' + 6;
element.append(x + ' ' + typeof(x));
// no error because 6 is converted to a string implicitly

%%javascript
let x = '4' * '5' * 6;
element.append(x + ' ' + typeof(x));
// no error because 4 and 5 are converted to numbers implicitly

JavaScript forces a type conversion to make the code run. It is called a weakly-typed language because of this flexibility.
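For comparison, the same two results can be obtained in Python only by making the conversions explicit, as in the following sketch. The variable names are illustrative only.

```python
# Concatenation: the integer must be converted to a string explicitly.
as_text = "4" + "5" + str(6)
print(as_text, type(as_text))

# Multiplication: the strings must be converted to integers explicitly.
as_number = int("4") * int("5") * 6
print(as_number, type(as_number))

# Note that str * int is defined in Python, but it repeats the string
# rather than converting it to a number.
repeated = "4" * 3
print(repeated)
```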

%%javascript
element.append(
    ("b" + "a" +
     + "a" + "a").toUpperCase()
);

Solution to Exercise 8

See the explanation here.

Maybe an LLM knows why?

%%ai chatgpt -f text
Why does the following JavaScript return BANANA?
element.append(
    ("b" + "a" +
     + "a" + "a").toUpperCase()
);

Footnotes

[1] Indeed, a program that defines what values to compute is essentially a value itself. More precisely, one cannot distinguish between a value and a program in the context of a Turing machine.

[2] The conversion from machine code to assembly code is called disassembly, hence the name dis. By the way, why does an interpreted language like Python have a compiler? The interpreter actually compiles the source code to a more compact form called bytecode, which it can run faster. Indeed, even though Python was originally intended to be an interpreted language, Python 3.13 adds an experimental just-in-time (JIT) compiler that can translate bytecode to machine code while the program runs. Why? See this post.

[3] match and case are recognized as keywords in the context of a match statement, where _ is treated as a wildcard. This is regardless of whether match, case, or _ are used as variables outside this context.

[4] TypeScript is a superset of JavaScript that adds optional static typing and other features to the language. By enforcing stronger typing, TypeScript can detect potential errors at compile-time and improve the overall reliability of your code. Learn more about TypeScript at typescriptlang.org.
