PE100-08: Lists

All of the variables we’ve seen so far store exactly one value. If you set the variable “weight” to 74.5, then 74.5 is the only value there is in “weight”. Nice and simple. If we need to save several values then we can use several variables…

In [1]:

weight_1 = 74.5
weight_2 = 76.7
weight_3 = 77.1

As you can imagine, this gets tedious in a hurry. What if you had a thousand values to deal with? Even if you did all of that typing, doing any kind of non-trivial computation with them would be difficult, too. We need a way to store a bunch of values that makes it easy to work with the whole collection at once or with each individual value. For that, Python provides us with lists.

Python supports lists deep down in the language itself, rather than through an add-on library. Because of that, they’re easy to work with. Let’s take a look, shall we?

In [6]:

names = ['Alice', 'Bob', 'Candice', 'Dan']
names

['Alice', 'Bob', 'Candice', 'Dan']

Lists are represented with square brackets [ ] at the beginning and end, and with the values inside the brackets separated by commas.

In [7]:

odd_numbers = [1, 3, 5, 7, 9]
ingredients = ['flour', 'lard', 'baking powder', 'milk']

The values in a list don’t all have to be the same type.

In [8]:

playing_card = [9, 'Diamonds']

A list can have any number of values, limited only by the amount of memory in the computer that is hosting the Jupyter (or JupyterLab) server. Lists are even allowed to have no values in them.

In [9]:

empty_list = []

So far we’ve been creating lists using literal values, but we could use variables just as easily…

In [12]:

nimh = 16
lithiumPrimary = 2
carbonZinc = 6

battery_inventory = [nimh, lithiumPrimary, carbonZinc]
print(battery_inventory)

[16, 2, 6]

To find out how many elements are in a list, use the len() function:

In [13]:

number_of_ingredients = len(ingredients)
print(number_of_ingredients)

print(len(battery_inventory))

4
3

There are operators that act on lists. The * operator is used for repetition…

In [14]:

my_list = [1, 2, 3] * 2
print(my_list)
many_zeros = [0] * 25
print(many_zeros)

[1, 2, 3, 1, 2, 3]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

…and the + operator combines two lists:

In [15]:

big_list = my_list + many_zeros
print(big_list)

[1, 2, 3, 1, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Lists are iterables, just like the results of the range() function, so they can be iterated over using a for loop:

In [17]:

for name in ['David', 'Bill', 'Richard']:
    print(name)

David
Bill
Richard

In [18]:

for ingr in ingredients * 3:
    print(ingr)

flour
lard
baking powder
milk
flour
lard
baking powder
milk
flour
lard
baking powder
milk

It’s fairly common to iterate over a list for things like sums and averages.

In [19]:

total = 0
for item in [4, 3, 4, 5]:
    total += item
print(total)

The above code steps its way over all of the values in the list. Each time it goes to a new value, it adds that value to total. When it gets to the end, all of the values have been added up. If we want an average, we don’t have to count the values ourselves. We can just use the len() function.

In [20]:

total = 0
my_list = [4, 3, 4, 5]
for item in my_list:
    total += item
avg = total / len(my_list)
print(avg)

4.0
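For the common case of adding up numbers, Python’s built-in sum() function can do the looping for us. Here is a minimal sketch using the same four values; the variable name scores is only for illustration.

```python
# Same values as the loop above; "scores" is an illustrative name.
scores = [4, 3, 4, 5]

total = sum(scores)          # built-in sum() adds up every element
avg = total / len(scores)    # len() supplies the count

print(total)   # 16
print(avg)     # 4.0
```

Writing the loop by hand is still worth knowing, because the same pattern works for computations that have no ready-made function.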

Sometimes you need to use a particular value in a list and you don’t want to iterate over the whole thing. For this, Python gives us indexing, letting us directly access any element of a list. The first (as it appears on screen, “leftmost”) element is numbered zero and each one after that goes up by one. The highest numbered one is therefore the length of the list minus one.

In [23]:

print(ingredients)
print()
print("Of all the biscuit ingredients,", ingredients[0] , "is the most important one.")
print("The second most important one is", ingredients[2])

['flour', 'lard', 'baking powder', 'milk']

Of all the biscuit ingredients, flour is the most important one.
The second most important one is baking powder

Like the majority of programming languages, Python uses square brackets to indicate the index into the list. Unlike the vast majority of languages, Python allows indexes to be negative! A negative number for an index means “count backwards from the end”. my_list[-1] refers to the item at the end of the list. my_list[-3] refers to the third to last item.

In [24]:

print(ingredients[-1])
print(ingredients[-3])

milk
lard
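To make the counting rule concrete, the short sketch below checks that a negative index points at the same element as “length minus that number”, and shows what happens when an index falls off the end of the list. The variable names last, also_last, and out_of_range are just for illustration.

```python
ingredients = ['flour', 'lard', 'baking powder', 'milk']

# Index -1 refers to the same element as index len(ingredients) - 1.
last = ingredients[-1]
also_last = ingredients[len(ingredients) - 1]
print(last, also_last)

# Valid indexes here run from 0 to 3 (or -4 to -1).
# Anything outside that range raises an IndexError.
try:
    ingredients[4]
    out_of_range = False
except IndexError:
    out_of_range = True
    print("There is no element at index 4")
```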

We’ve seen how to iterate over lists and also how to access individual list elements by using indexing. Python has a special indexing scheme, though, that lets us deal with small lists made from our original list. This is called List Slicing and can save you a lot of work sometimes. The overall syntax for this looks like list_name[start:end]

An example is definitely called for here:

In [25]:

my_list = [2, 4, 6, 8, 10, 12]
print(my_list[1:3])

[4, 6]

Remember that list indexes count from zero, and remember also that a slice includes the starting index (here, it’s the 1) and continues up to, but does not include, the index on the right side of the colon (here, the 3).

Both the starting and the ending indexes are optional! If one of the two is missing, it will be interpreted as 0 or the list’s length, respectively.

In [27]:

print(my_list)
print()
print(my_list[:3])
print(my_list[1:])
print(my_list[:])

[2, 4, 6, 8, 10, 12]

[2, 4, 6]
[4, 6, 8, 10, 12]
[2, 4, 6, 8, 10, 12]
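One detail worth seeing in action: a slice is a new list, so changing the slice leaves the original alone. The sketch below also shows that negative indexes work inside slices. The names first_three and last_two are illustrative.

```python
my_list = [2, 4, 6, 8, 10, 12]

first_three = my_list[:3]   # a new list holding the first three values
first_three[0] = 99         # changes only the new list

print(first_three)          # [99, 4, 6]
print(my_list)              # [2, 4, 6, 8, 10, 12]

# Negative indexes count from the end here, too: the last two elements.
last_two = my_list[-2:]
print(last_two)             # [10, 12]
```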

And finally, the in operator is used to test list membership.

In [31]:

lucky_numbers = [2, 7, 17, 9]
player_number = int(input('Enter your favorite number'))
if player_number in lucky_numbers:
    print("Your favorite number is lucky!")
else:
    print("Sorry! Better luck next time!")

Enter your favorite number 8

Sorry! Better luck next time!

There’s Method to the Madness

There are two kinds of functions available for working with lists. Built-in functions are the ones that are part of Python itself. Methods, as you’ll recall from the unit on files, are special functions that are situated inside of objects and only usable with that kind of object. Python lists are objects. They’re iterable objects, in fact.

Let’s take a look at a few of the methods available for working with lists. First up is append().

In [32]:

print(lucky_numbers)
lucky_numbers.append(106)

lucky_numbers

[2, 7, 17, 9]

[2, 7, 17, 9, 106]

Just as the name implies, append() adds an element to the end of a list.

But what if we want to put a new element in a specific place? For that, there is insert().

In [33]:

print(lucky_numbers)
lucky_numbers.insert(2, 202)
lucky_numbers

[2, 7, 17, 9, 106]

[2, 7, 202, 17, 9, 106]

The insert() method takes two arguments. The first is the position in the list where the insertion should happen. In the example above, it was at position 2. Remember, list indexes start at zero! The second argument is the element to insert. And when we look at the resulting list, we see that 202 is in position 2 now (which is the third position!) and all the elements that came after it have been shifted to the right.

We’ve been fetching elements from the list by location number, so far. How do we find something by searching for it? The index() method does that.

In [35]:

where_found = lucky_numbers.index(202)
print(where_found)

We passed the argument 202 to the index() method. It searched the list and returned the index of the first occurrence. That index is 2. Makes sense because we just inserted it there a minute ago!
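What happens if the value isn’t in the list at all? In that case index() raises a ValueError instead of returning a position. One way to guard against that, sketched below, is to check with the in operator first. The helper find_position and its choice of -1 for “not found” are assumptions made for this example, not something built into Python.

```python
lucky_numbers = [2, 7, 202, 17, 9, 106]

def find_position(values, target):
    """Return the index of target in values, or -1 if it is absent.

    find_position is a hypothetical helper written for this example.
    """
    if target in values:
        return values.index(target)
    return -1

print(find_position(lucky_numbers, 202))   # 2
print(find_position(lucky_numbers, 13))    # -1
```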

If we can insert things into a list then surely we can remove them too, right? Indeed we can with the remove() method.

In [36]:

print(lucky_numbers)
lucky_numbers.remove(7)
print(lucky_numbers)

[2, 7, 202, 17, 9, 106]
[2, 202, 17, 9, 106]

Watch out! remove() looks up an item, like index() does, and then removes it. It doesn’t take a position number. In other words:

In [38]:

people = ['David', 'Bill', 'Richard']
people.remove('Bill')
print(people)

['David', 'Richard']
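Two more details are easy to miss. If the value appears more than once, remove() takes out only the first one. And if you really do want to remove by position, the pop() method (not covered above) takes an index and hands back the element it removed. A small sketch, with a made-up colors list:

```python
colors = ['red', 'blue', 'red', 'green']

colors.remove('red')      # removes only the first 'red'
print(colors)             # ['blue', 'red', 'green']

removed = colors.pop(0)   # pop() takes a position and returns the element
print(removed)            # blue
print(colors)             # ['red', 'green']
```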

You might find yourself needing to sort the items in a list, and for that the sort() method exists:

In [39]:

print(lucky_numbers)
lucky_numbers.sort()
print(lucky_numbers)

[2, 202, 17, 9, 106]
[2, 9, 17, 106, 202]
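Note that sort() rearranges the list itself and returns nothing useful. If you need to keep the original order around, the built-in sorted() function builds a new sorted list instead. The sketch below contrasts the two; in_order and result are illustrative names.

```python
lucky_numbers = [2, 202, 17, 9, 106]

in_order = sorted(lucky_numbers)   # new sorted list; original untouched
print(in_order)                    # [2, 9, 17, 106, 202]
print(lucky_numbers)               # [2, 202, 17, 9, 106]

result = lucky_numbers.sort()      # sorts in place and returns None
print(result)                      # None
print(lucky_numbers)               # [2, 9, 17, 106, 202]
```

A common slip is writing something like numbers = numbers.sort(), which replaces the list with None.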

Finally, there are ways to find the greatest and smallest values in a list. These are built-in functions rather than methods, so the list goes inside the parentheses.

In [40]:

print(min(lucky_numbers))
print(max(lucky_numbers))

2
202

Earlier we saw the use of len() to find out how many items are in a list. This is a built-in function and works on many types of variables, not just lists. The same is true of min() and max(). They work on lists of strings, too, where “smallest” and “greatest” mean first and last in alphabetical order.

In [43]:

siblings = ['David', 'Bill', 'Shirley', 'Richard', 'Laverne']
print(min(siblings))
print(max(siblings))

Bill
Shirley

Lists and Functions

Functions have no problem accepting lists as arguments and they can also return lists as the function’s value. There is a subtle “gotcha” when passing lists as an argument, though.

First, let’s look at a simple example:

In [45]:

original_list = [1, 2, 3, 9]

def sum_of_list(list_to_sum):
    total = 0    # named "total" so we don't hide Python's built-in sum()
    for i in list_to_sum:
        total = total + i
    return total

the_sum = sum_of_list(original_list)
the_sum

That worked as expected - there’s no problem passing lists into functions. What about returning lists from functions?

In [46]:

def pet_factory(how_many_pairs):
    pets = ['goldfish', 'catfish'] * how_many_pairs
    return pets

many_fish = pet_factory(5)
print(many_fish)

['goldfish', 'catfish', 'goldfish', 'catfish', 'goldfish', 'catfish', 'goldfish', 'catfish', 'goldfish', 'catfish']

Earlier, when we talked about functions in section 5, we said that if a function changes the value of one of its arguments then the effects of that change stay inside the function and aren’t visible to anything once the function exits. That statement was mostly true. If you pass a list as an argument to a function and that function changes the contents of the list, then the change will be visible outside. Strings, floats, and integers are protected, but lists are more exposed.

In [48]:

original_list = [1, 2, 3, 9]

def doubler(numbers):
    for i in range(len(numbers)):
        numbers[i]=numbers[i]*2

print(original_list)
doubler(original_list)
print(original_list)

[1, 2, 3, 9]
[2, 4, 6, 18]

Changing the value of an argument inside of a function usually isn’t a great idea, but in the case of lists it can be useful.
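When you don’t want the caller’s list to change, one option is to build a new list inside the function and return it. The sketch below is one way to do that; doubled_copy is an illustrative name, not something defined earlier in this lesson.

```python
def doubled_copy(numbers):
    """Return a new list with each value doubled; numbers is left unchanged.

    doubled_copy is a hypothetical helper written for this example.
    """
    result = []
    for value in numbers:
        result.append(value * 2)
    return result

original_list = [1, 2, 3, 9]
new_list = doubled_copy(original_list)

print(original_list)   # [1, 2, 3, 9]
print(new_list)        # [2, 4, 6, 18]
```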

No Funny Glasses Required

The lists we have worked with up to this point have all been one dimensional. Lists get a lot more interesting as the number of dimensions goes up.

Unlike most programming languages, Python does not have a multi-dimensional list or array construction, per se. What Python does have is a list that is versatile enough to contain anything - and that includes containing other lists! A two-dimensional list in Python is just a “list of lists”.

Take a look:

In [50]:

first_presidents = [['George', 'Washington'], ['John', 'Adams'], ['Thomas', 'Jefferson']]

Above, on that very long line, we’ve created a list with square brackets. Inside that list, we’ve put three more lists inside square brackets of their own. So we’ve made a list of lists.

That long line is hard to read, isn’t it? Python normally treats the end of a line as the end of a statement, but anything inside square brackets (or parentheses) can be split across multiple lines. A backslash at the very end of a line also tells Python that the statement continues, but it is only required outside of brackets. It looks like this:

In [51]:

first_presidents = [['George', 'Washington'],
                    ['John', 'Adams'],
                    ['Thomas', 'Jefferson']]

Jupyter even goes to the trouble to line up the columns for us.

Anyway, let’s see what we’ve created.

In [52]:

print(first_presidents)

[['George', 'Washington'], ['John', 'Adams'], ['Thomas', 'Jefferson']]

In [54]:

print(first_presidents[0])

['George', 'Washington']

In [55]:

print(first_presidents[2])

['Thomas', 'Jefferson']

We can index into the outer list, the one that contains the smaller lists, just like we normally would. We can also index into an inner list two different ways. The long way…

In [56]:

president_number_one = first_presidents[0]
first_name = president_number_one[0]
first_name

'George'

… or we can take the shortcut:

In [58]:

first_name = first_presidents[0][0]
first_name

'George'

The first zero got us to the “George”, “Washington” element, and the second zero indexed into that and gave us ‘George’. Let’s try some other combinations:

In [59]:

next_first_name = first_presidents[1][0]
next_first_name

'John'

In [60]:

another_name = first_presidents[1][1]
another_name

'Adams'

It’s easy to see how we’re indexing into this two-dimensional list. In fact, it works roughly the same way as a 2-D array in most programming languages.

It’s so similar, in fact, that you’re probably feeling the urge to do some Linear Algebra right now.

Don’t. Not yet.

Python’s multidimensional list support is exactly that: support for lists. It can be pressed into service for arrays (in the linear algebraic sense of the term) but performance is pretty bad. In Programming Elements 101 we’ll see a software library called “numpy”. It is superior for arrays where you want to do some math.

Now let’s look at how to traverse a multi-dimensional list. We’ll create a 2-D list that looks like this:

        Column 0  Column 1  Column 2  Column 3
Row 0      A         B         C         D
Row 1      E         F         G         H
Row 2      I         J         K         L
Row 3      M         N         O         P

In [61]:

letter_table = [['A', 'B', 'C', 'D'],\
                ['E', 'F', 'G', 'H'],\
                ['I', 'J', 'K', 'L'],\
                ['M', 'N', 'O', 'P']]

We can get a whole row:

In [62]:

print(letter_table[1])

['E', 'F', 'G', 'H']

or we can get a specific cell (the order is row, then column):

In [63]:

print(letter_table[2][1])

We can access the table by column, but it’s not as easy. We’ll have to write a loop that steps down a column and reads the values:

In [64]:

for i in range(len(letter_table)):
    print(letter_table[i][3])

D
H
L
P
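If you need the column as a list of its own rather than just printed, the same loop can collect the values. This is a minimal sketch; get_column is a hypothetical helper name, and it assumes every row is long enough to have the requested column.

```python
letter_table = [['A', 'B', 'C', 'D'],
                ['E', 'F', 'G', 'H'],
                ['I', 'J', 'K', 'L'],
                ['M', 'N', 'O', 'P']]

def get_column(table, col):
    """Collect the value at position col from every row of table."""
    column = []
    for row in table:
        column.append(row[col])
    return column

print(get_column(letter_table, 3))   # ['D', 'H', 'L', 'P']
```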

What if we want to access all of the cells in the array? For that, nested loops work.

In [65]:

for row in range(len(letter_table)):
    for col in range(len(letter_table[row])):
        print(letter_table[row][col])

A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
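Since lists are iterable, the same traversal can be written without any index numbers at all: the outer loop hands us one row at a time, and the inner loop hands us each cell in that row. The sketch below gathers the cells into a list named cells (an illustrative name) so the result is easy to inspect.

```python
letter_table = [['A', 'B', 'C', 'D'],
                ['E', 'F', 'G', 'H'],
                ['I', 'J', 'K', 'L'],
                ['M', 'N', 'O', 'P']]

cells = []
for row in letter_table:      # each row is itself a list
    for cell in row:          # each cell is one letter
        cells.append(cell)

print(len(cells))   # 16
print(cells[0])     # A
print(cells[-1])    # P
```

The index-based version is still the one to reach for when you need to know which row and column you are in.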

“But wait!”, I hear you say. “I need to store higher-dimensionality data!” No problem. Python will allow arbitrarily deep nesting. We can have lists of lists of lists (3 dimensions) or lists of lists of lists of lists for four dimensions. Accessing the cells is just a matter of adding more array indexes to the end of the name.

In [70]:

first_vice_presidents = [["John", "Adams"],\
                         ["Thomas", "Jefferson"],\
                         ["Aaron", "Burr"]]

early_us_leaders = [first_presidents, first_vice_presidents]
print(early_us_leaders[1][0][1])
print(early_us_leaders[1][2][0])

Adams
Aaron

It’s easy to get confused with deeply nested lists. Three dimensions isn’t bad, four is manageable, but as the structures get deeper and deeper I have to resort to drawing pictures and frequent testing every step of the way.

tl;dr: If you’re a string theorist working in 21 dimensions or whatever, Python lists probably aren’t the way to go. You should use numpy.

Tuples

A “double”, mathematically speaking, is two of something. A “triple” is three of them. If you don’t know how many, or you don’t want to specify, then it’s generically called a “tuple” (pronounced “Too pull”, according to The American Heritage Dictionary and, more importantly, everyone who has ever taught the database class).

Python graciously provides us with tuples. Their syntax is just like that of a list, only using parentheses instead of square brackets. For instance:

In [71]:

my_tuple = (2, 8, 256)
print(my_tuple[1])

Tuples have some restrictions when compared to lists.

* You can’t sort them.
* You can’t insert or delete from them.
* You can’t change the values in them.
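Here is a small sketch of what those restrictions look like in practice. Trying to change an element raises a TypeError, while the built-in sorted() function still works because it builds a new list and leaves the tuple alone. The names changed and descending are illustrative.

```python
my_tuple = (2, 8, 256)

try:
    my_tuple[0] = 4          # not allowed: tuples can't be changed
    changed = True
except TypeError:
    changed = False
    print("Tuples can't be changed in place")

# sorted() returns a new list; my_tuple itself stays as it was.
descending = sorted(my_tuple, reverse=True)
print(descending)            # [256, 8, 2]
print(my_tuple)              # (2, 8, 256)
```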

Why would we want tuples if they’re so similar to lists, only somewhat disabled? In a word, “speed”. Because they can never change, tuples are a little faster and lighter on memory than lists. That’s one reason some Python functions require them. The most likely time you’ll see tuples is when you’re accessing data from a database. The second most common use is when you need to return multiple values from a function.

Since tuples have the speed advantage but lists are more versatile, it’s not unusual to see programmers use the list() and tuple() functions to convert between the two types:

In [72]:

my_tuple = (2, 8, 256)
list_version = list(my_tuple)
list_version

[2, 8, 256]

In [73]:

my_list=[2, 4, 6, 8]
tuple_version = tuple(my_list)
tuple_version

(2, 4, 6, 8)

Returning multiple values from a function feels like cheating the first time you do it. After all, sin(x) returns exactly one number, right?

What if you wrote a function that returns a complex number, like 1.105+7.3i ? That’s one number (albeit one on the complex plane) but it’s written like two pieces of data being returned.

What if you got really fancy and wrote a function that returned a column vector? That would be like returning a lot of numbers all at once, wouldn’t it?

So returning multiple values at once isn’t that bad, is it? Especially if the values all have related meaning and “belong” together.

In [75]:

def get_extremes(number_list):
    min_val = min(number_list)
    max_val = max(number_list)
    return (min_val, max_val)

numbers = [5, 3, 2, 7, 2, 5]
low, high = get_extremes(numbers)
print(low)
print(high)

2
7

A couple of things to note. First, notice how the function creates a tuple and returns it. The parentheses indicate a tuple is being constructed and the min_val and max_val variables are put into the tuple as the first and second elements.

Second, look at how that tuple is returned to the caller, taken apart, and stored in a pair of variables. You’ll see the syntax first_variable, second_variable, third_variable = func() when a tuple is returned from a function. The first element of the tuple is placed in first_variable and so on.
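A few variations on the example may help. The parentheses around the returned values are optional, the caller can keep the tuple whole instead of taking it apart, and the number of variables on the left has to match the number of elements in the tuple. The sketch below restates get_extremes in shortened form; result and mismatch are illustrative names.

```python
def get_extremes(number_list):
    # Parentheses are optional here; the result is still a tuple.
    return min(number_list), max(number_list)

result = get_extremes([5, 3, 2, 7, 2, 5])
print(result)        # (2, 7)

low, high = result   # two variables for two elements
print(low, high)     # 2 7

# Asking for three variables from a two-element tuple raises ValueError.
try:
    a, b, c = result
    mismatch = False
except ValueError:
    mismatch = True
    print("The tuple has 2 elements, not 3")
```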

Coming Up Next

We’ve made it to the end of this section. Take a moment, breathe, and relax… this is the longest module in the “Python and Jupyter” series. Next up we have lots of information on strings. We’ve been using strings a lot already without really looking at what they are and what they can do. It’s time to remedy that.

(pssst. Want a hint? Strings are just tuples of letters!)