Python Lists in Depth

What are iterators and are they worth caring about? This article aims to remove some of the confusion around these questions and more.

We’ll start off by looking at lists in isolation: how we make them and how to interact with them.

After that, we’ll look at some examples of leveraging loops, comprehensions, and recursion for creating lists.

We’ll finish off by comparing lists to some other iterable types.

Version

The examples below are all written in Python 3.

If you are running them in Python 2, some things might turn out a little differently.

The id function

We'll be making use of Python's built-in id function quite a bit in the examples that follow, so we'll start off by making sure we understand it.

>>> id
<built-in function id>
>>> print(id.__doc__)
Return the identity of an object.

This is guaranteed to be unique among simultaneously existing objects.
(CPython uses the object's memory address.)
>>> # let's see it in action
>>> a = 1
>>> id(a)
10919424
>>> b = a
>>> id(b) == id(a)
True
>>> id(b) == id('spam')
False

So in plain English, the id function returns something that represents the unique identity of an object.

If two values that exist at the same time have the same id output, then they are the same object.

In CPython, that means they are in the same place in memory.
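To make the distinction between identity and equality concrete, here is a minimal sketch. The variable names are made up for illustration. The `is` operator compares identity (the same check as comparing id values), while `==` compares contents.

```python
# Two names bound to one list, plus a separate list with equal contents.
first = [1, 2, 3]
alias = first          # no copy is made; both names refer to one object
lookalike = [1, 2, 3]  # a distinct object that happens to be equal

same_object = first is alias                # identity check
same_id = id(first) == id(alias)            # equivalent to the line above
equal_contents = first == lookalike         # value comparison
different_objects = first is not lookalike  # separate objects

print(same_object, same_id, equal_contents, different_objects)
```

All four values come out as True: alias is the same object as first, while lookalike is merely equal to it.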

As an analogy, let’s say we have a human named Robert.

His mother calls him Robert, his siblings call him Rob, and his friends call him Bob.

Bob’s social security number is the same as Robert’s social security number.

Rob’s social security number is the same as Bob’s.

In Python this is like:

>>> id(bob) == id(robert)
True
>>> id(rob) == id(bob)
True

This also means that if you do something that changes Rob, it would affect Robert and Bob.

If Rob decides to wear a blue t-shirt, then that means that Robert and Bob are wearing a blue t-shirt; it's the same t-shirt.

So:

>>> id(bob.shirt) == id(robert.shirt)
True
>>> id(rob.shirt) == id(bob.shirt)
True

Just lists

In this section, we'll run through a bunch of examples.

We’ll start simple.

Open up a Python 3 shell if you want to follow along.

Creating lists

First, some basic syntax for creating lists.

Let's make our first list:

>>> l1 = [1,2,3]
>>> l1
[1, 2, 3]
>>> type(l1)
<class 'list'>

So l1 is an instance of the class called list.

Lists are objects.

l1 was a list of integers.

Let's make a list of strings.

>>> l2 = ['a','b','c']
>>> l2
['a', 'b', 'c']

We can also refer to other variables from within lists.

>>> foo = 'b'
>>> l2 = ['a',foo,'c']
>>> l2
['a', 'b', 'c']

A trailing comma and whitespace around the individual elements don't make a difference in how things are interpreted.

This means that the following statements are equivalent:

>>> l2 = ['a',foo,'c']
>>> l2 = ['a',foo,'c' , ]

Lists can also be spread out over multiple lines.

Sometimes it’s nice to do this for readability.

>>> l1 = [
... 1, # this can also be useful if you want to
... 2, # make comments about specific elements in
... 3  # your list
... ]
>>> l1
[1, 2, 3]

A single list can contain data of many types.

For example, this one contains integers as well as strings:

>>> l3 = [1, 2, 3, 'a', 'b', 'c']
>>> l3
[1, 2, 3, 'a', 'b', 'c']

Lists can even contain lists!

>>> l4 = [1,2,[3,4,[5]],'a',3.2,True]
>>> l4
[1, 2, [3, 4, [5]], 'a', 3.2, True]

So far so good.

Now you can recognize and create lists.
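Literal syntax is not the only way to build a list. As a brief sketch (the variable names are illustrative, not from the examples above), the list constructor accepts any iterable, and a list comprehension builds a list from a loop expression:

```python
# Build lists without writing out every element by hand.
from_range = list(range(5))                    # [0, 1, 2, 3, 4]
from_string = list("abc")                      # ['a', 'b', 'c']
squares = [n * n for n in range(5)]            # [0, 1, 4, 9, 16]
evens = [n for n in range(10) if n % 2 == 0]   # [0, 2, 4, 6, 8]

# Comprehensions can nest loops; this yields four [row, col] pairs.
grid = [[row, col] for row in range(2) for col in range(2)]
```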

Accessing individual elements

Now we'll be using indices to access individual elements in our lists:

>>> l4
[1, 2, [3, 4, [5]], 'a', 3.2, True]

Indices start from 0.

So the first element has an index of 0, the second element has an index of 1, etc.

>>> l4[0]
1
>>> l4[1]
2
>>> l4[2]
[3, 4, [5]]
>>> l4[3]
'a'
>>> l4[4]
3.2
>>> l4[5]
True

That was our last element.

If we try to access an element past the end of a list, then Python raises an IndexError:

>>> l4[6]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

And if you pass in an index that Python doesn't understand, you'll get a TypeError:

>>> l4['eggs']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not str
>>> l4[1.0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not float
>>> l4[1]
2

So, according to those TypeErrors, list indices must be integers or slices.

We have covered positive integers.

Now for some negative ones:

>>> l4
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[-1]
True
>>> l4[-2]
3.2
>>> l4[-3]
'a'
>>> l4[-4]
[3, 4, [5]]
>>> l4[-5]
2
>>> l4[-6]
1
>>> l4[-7]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

So our list has six elements, the highest integer index is 5, and the lowest integer index is -6.

It looks a little something like:

element          1    2    [3, 4, [5]]    'a'   3.2   True
index            0    1    2              3     4     5
negative index  -6   -5   -4             -3    -2    -1

Slicing and dicing

List indices must be integers or slices and, in this section, we'll cover the latter.

Slicing is a mechanism for creating new lists from existing lists, in which the new list is simply a subset of the original list.

To understand slices, you’ll need to understand integer indices.
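As a quick refresher on integer indices before moving on, the sketch below checks that, for a list of length n, index -k refers to the same element as index n - k. The helper valid_index is hypothetical and only illustrates the range of indices that will not raise an IndexError.

```python
l4 = [1, 2, [3, 4, [5]], 'a', 3.2, True]

# For a list of length n, index -k names the same slot as index n - k.
n = len(l4)
pairs_match = all(l4[-k] is l4[n - k] for k in range(1, n + 1))

def valid_index(some_list, i):
    """Return True if some_list[i] would not raise IndexError."""
    return -len(some_list) <= i < len(some_list)

print(pairs_match)
print(valid_index(l4, 5), valid_index(l4, 6))
print(valid_index(l4, -6), valid_index(l4, -7))
```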

Ready?

slice is a built-in class.

You don't need to import anything to use it.

>>> slice
<class 'slice'>
>>> print(slice.__doc__)
slice(stop)
slice(start, stop[, step])

Create a slice object.  This is used for extended slicing (e.g. a[0:10:2]).

Okay, that’s a little confusing.

Let's see what happens if we use a slice as an index.

>>> l4
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l5 = l4[slice(0,6,1)]
>>> l5
[1, 2, [3, 4, [5]], 'a', 3.2, True]

So l4 and l5 are the same... or are they?

Remember that id function I was droning on about earlier? Here is where it comes into play:

>>> id(l4)==id(l5)
False

So l5 looks like l4, but it's really just a copy.

Going back to the analogy we used before: if l4 was Rob, then l5 is like Rob's twin.

Let's call her... Roberta.

But wait, there's more:

>>> id(l4[0])==id(l5[0])
True
>>> id(l4[1])==id(l5[1])
True
>>> id(l4[2])==id(l5[2])
True

So l4 and l5 are different objects, but their contents are at the same location in memory.

To continue our analogy: let’s say Robert and Roberta have a shelf of books that they share.

If Roberta removes one of her books from the shelf, then she has removed one of Robert’s books from the shelf (because it’s the same book).

If Robert drops one of his books into the bathtub, then he dropped one of Roberta’s books into the bathtub (because it is the same book) (Damnit Robert!).

Just like Roberta and Robert share the same books, l4 and l5 share the same contents.

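A technical way of saying this is: l5 is a shallow copy of l4.

This might be a bit of a surprise, but:

>>> id(l4[slice(0,6,1)]) == id(l4[slice(0,6,1)])
True

This does not mean that slicing the same list twice returns the same copy. Each slice creates a new list. The first temporary copy is discarded as soon as its id has been taken, so CPython is free to reuse that memory address for the second copy. The id guarantee only holds for objects that exist at the same time. If you keep both copies alive by assigning them to variables, their ids differ.

We'll go more into the significance of memory later on.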

It might seem fairly straightforward now, but I’ve seen a few pretty weird bugs come out of this behavior.
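To illustrate the kind of bug a shallow copy can cause, here is a minimal sketch comparing a slice copy with copy.deepcopy from the standard library. The variable names are illustrative; deepcopy is not used elsewhere in this article and is shown only as one way to get fully independent nested lists.

```python
import copy

original = [1, 2, [3, 4, [5]]]
shallow = original[:]            # new outer list, shared inner objects
deep = copy.deepcopy(original)   # new outer list and new inner lists

original[2].append('changed')    # mutate the nested list in place

print(shallow[2])  # the shallow copy sees the change
print(deep[2])     # the deep copy does not
```

The outer lists are all distinct objects, but only the deep copy owns its own nested lists.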

Let's take a closer look at what kinds of slices we can make.

Earlier, we did this:

>>> l5 = l4[slice(0,6,1)]
>>> l5
[1, 2, [3, 4, [5]], 'a', 3.2, True]

Python has a nice shorthand that we'll use from now on.

This is equivalent to what we did before.

>>> l5 = l4[0:6:1]
>>> l5
[1, 2, [3, 4, [5]], 'a', 3.2, True]

The arguments of slice are called start, stop, and step.

We'll change each one in turn to see what it does.

Let's look at start:

>>> l4[0:6:1]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[1:6:1]
[2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[2:6:1]
[[3, 4, [5]], 'a', 3.2, True]
>>> l4[-1:6:1]
[True]
>>> l4[-2:6:1]
[3.2, True]

In general, as long as start is a valid index and the slice is not empty: some_list[start:stop:step][0] == some_list[start].

But if you refer to some index off the end of the list, then slice doesn't raise an exception.

>>> l4[5000:6:1]
[]
>>> l4[-5000:6:1]
[1, 2, [3, 4, [5]], 'a', 3.2, True]

Now, let's look at stop.

Keep in mind that the last element of l4 is True and has the index 5.

>>> l4[0:6:1]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[0:5:1]
[1, 2, [3, 4, [5]], 'a', 3.2]
>>> l4[0:4:1]
[1, 2, [3, 4, [5]], 'a']
>>> l4[0:-1:1]
[1, 2, [3, 4, [5]], 'a', 3.2]
>>> l4[0:-2:1]
[1, 2, [3, 4, [5]], 'a']
>>> l4[0:-3:1]
[1, 2, [3, 4, [5]]]

In general, with a step of 1 and a stop within the list: some_list[start:stop:step][-1] == some_list[stop-1]. The element at the stop index itself is excluded.

And again, it's alright to refer to indices off the end of the list:

>>> l4[0:5000:1]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[0:-5000:1]
[]

The final argument is step:

>>> l4[0:6:1]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[0:6:2]
[1, [3, 4, [5]], 3.2]
>>> l4[0:6:3]
[1, 'a']
>>> l4[0:6:4]
[1, 3.2]
>>> l4[0:6:-1]
[]
>>> l4[6:0:-1]
[True, 3.2, 'a', [3, 4, [5]], 2]
>>> l4[6:0:-2]
[True, 'a', 2]

So if step is 1, then we return every element in the list.

If step is 2, we return every second element.

Negative step values reverse the list order, which means the start and stop need appropriate values.

If the step is positive, then it would make sense that start < stop.

But if the step is negative, then stop < start would make more sense.
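One way to see how start, stop, and step are resolved against a particular list is the slice object's indices method, which returns the concrete values that would be used for a sequence of a given length. The sketch below is illustrative; the helper names resolved and slice_by_hand are hypothetical.

```python
l4 = [1, 2, [3, 4, [5]], 'a', 3.2, True]

def resolved(some_slice, some_list):
    """Return the concrete (start, stop, step) the slice uses on this list."""
    return some_slice.indices(len(some_list))

def slice_by_hand(some_slice, some_list):
    """Rebuild the result of some_list[some_slice] using range()."""
    return [some_list[i] for i in range(*resolved(some_slice, some_list))]

forward = resolved(slice(0, 5000, 1), l4)    # stop is clamped to len(l4)
backward = resolved(slice(6, 0, -1), l4)     # start is clamped to the last index
empty = slice_by_hand(slice(0, 6, -1), l4)   # wrong direction, nothing selected

print(forward, backward, empty)
```

With a positive step the selected indices count upwards from start, and with a negative step they count downwards, which is why start and stop need to be swapped when the step changes sign.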

Well done! Now you can slice like a pro.

There is one more thing worth knowing: not all slice parameters are required.

Below are some more shortcuts you can use.

First, the default step is 1, so if you leave that out, then all is well.

The following are all equivalent:

>>> l4[0:6:1]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[0:6:]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[0:6]
[1, 2, [3, 4, [5]], 'a', 3.2, True]

So long as there is a colon : then the index is considered a slice.

The start and stop slice parameters are also optional.

The following are equivalent:

>>> l4[0:6]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[:6]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[0:]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[:]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[::]
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[::1]
[1, 2, [3, 4, [5]], 'a', 3.2, True]

It also works if you specify a negative step.

>>> l4[5:-7:-1]
[True, 3.2, 'a', [3, 4, [5]], 2, 1]
>>> l4[:-7:-1]
[True, 3.2, 'a', [3, 4, [5]], 2, 1]
>>> l4[5::-1]
[True, 3.2, 'a', [3, 4, [5]], 2, 1]
>>> l4[::-1]
[True, 3.2, 'a', [3, 4, [5]], 2, 1]

Awesome work! That's all you need to know about slicing.
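As a compact summary, the shortcuts above combine into a handful of idioms that show up often in practice. The list and variable names in this sketch are illustrative.

```python
items = ['a', 'b', 'c', 'd', 'e']

first_two = items[:2]        # ['a', 'b']
last_two = items[-2:]        # ['d', 'e']
without_ends = items[1:-1]   # ['b', 'c', 'd']
every_other = items[::2]     # ['a', 'c', 'e']
reversed_copy = items[::-1]  # ['e', 'd', 'c', 'b', 'a']
shallow_copy = items[:]      # equal to items, but a separate list object
```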

Common list operations and functions

There is more to lists than slicing and dicing.

Here we’ll breeze through some common functions.
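As a compact reference for the operations this section walks through, here is a short sketch. The list contents are illustrative. One detail worth noting is that sort returns None, so its result should not be assigned to the variable you want sorted.

```python
menu = ['spam']
menu.append(['eggs', 'ham'])     # adds ONE element, which is itself a list
nested_length = len(menu)        # 2

menu = ['spam']
menu.extend(['eggs', 'ham'])     # adds each element of the argument
flat_length = len(menu)          # 3

numbers = [111, 4, 22, 6, 30]
ordered = sorted(numbers)        # new list; numbers is unchanged here
result_of_sort = numbers.sort()  # sorts in place and returns None
```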

append adds an element to the end of the list.

>>> l=[]
>>> l
[]
>>> l.append('a')
>>> l
['a']
>>> l.append([1,2])
>>> l
['a', [1, 2]]
>>> l.append(1,2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: append() takes exactly one argument (2 given)

extend appends every element of another list (or any other iterable) to the end of the list.

Take careful note of how this is different to append:

>>> l
['a', [1, 2]]
>>> l.extend()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: extend() takes exactly one argument (0 given)
>>> l.extend([])
>>> l
['a', [1, 2]]
>>> l.extend([1,2,3,4,5])
>>> l
['a', [1, 2], 1, 2, 3, 4, 5]

in is used to check if an element exists inside a list:

>>> l
['a', [1, 2], 1, 2, 3, 4, 5]
>>> 'a' in l
True
>>> 'b' in l
False

not in does the opposite:

>>> 'b' not in l
True
>>> 'a' not in l
False

sort sorts a list in place, while sorted returns a new sorted list and leaves the original untouched:

>>> l = [111,4,22,6,30]
>>> sorted(l)
[4, 6, 22, 30, 111]
>>> l
[111, 4, 22, 6, 30]
>>> l.sort()
>>> l
[4, 6, 22, 30, 111]

And lastly, you can change individual elements in a list using assignment:

>>> l = [1,2,3]
>>> l
[1, 2, 3]
>>> l[0] = 'new'
>>> l
['new', 2, 3]
>>> l[3] = 'new'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list assignment index out of range

Mutability and memory matters

In this section, we'll cover a few sticking points.

This kind of behavior can lead to some very confusing bugs.

As we have seen, lists are mutable.

That means that you can change parts of a list without creating a whole new list:

>>> l1 = [1,2,3]
>>> l1
[1, 2, 3]
>>> id(l1)
139786247365704
>>> original_id = id(l1)
>>> l1[1] = "updated"
>>> l1
[1, 'updated', 3]
>>> id(l1) == original_id
True

So if we have two variables pointing to the same list, then a change made through one variable is visible through the other:

>>> l1
[1, 'updated', 3]
>>> l2 = l1
>>> l2
[1, 'updated', 3]
>>> id(l1) == id(l2)
True
>>> l2.append('parrot')
>>> l2
[1, 'updated', 3, 'parrot']
>>> l1
[1, 'updated', 3, 'parrot']

This kind of behavior applies even with nested data structures:

>>> # l4 contains a list. We're going to point at it from another variable
>>> l4
[1, 2, [3, 4, [5]], 'a', 3.2, True]
>>> l4[2]
[3, 4, [5]]
>>> l5 = l4[2]
>>> l5
[3, 4, [5]]
>>> l4[2][0]
3
>>> l5[0]
3
>>> l4[2][1]
4
>>> l5[1]
4
>>> l4[2][2]
[5]
>>> l5[2]
[5]
>>> id(l5) == id(l4[2])
True

So l4[2] and l5 refer to the same area in memory (like Rob and Robert being the same person).

>>> l5.extend(['cheddar','gouda'])
>>> l5
[3, 4, [5], 'cheddar', 'gouda']
>>> l4[2]
[3, 4, [5], 'cheddar', 'gouda']
>>> l4
[1, 2, [3, 4, [5], 'cheddar', 'gouda'], 'a', 3.2, True]

So far so good… now for some potentially confusing bits:

>>> id(l4[2]) == id(l5)
True
>>> l5
[3, 4, [5], 'cheddar', 'gouda']
>>> l5 = ['ni']
>>> l5
['ni']
>>> l4[2]
[3, 4, [5], 'cheddar', 'gouda']
>>> id(l4[2]) == id(l5)
False

What happened here is: we created a new list and assigned it to l5.

l5 now points to a brand new memory location.
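The difference between rebinding a name and mutating the object it refers to can be sketched as follows. The names outer and inner are illustrative. Slice assignment (inner[:] = ...) is not used elsewhere in this article; it is included here as one assumed alternative for replacing a list's contents in place.

```python
outer = [1, 2, [3, 4]]

inner = outer[2]
inner = ['ni']        # rebinding: inner now names a brand new list
rebinding_changed_outer = (outer[2] == ['ni'])

inner = outer[2]
inner[:] = ['ni']     # slice assignment: replaces the contents in place
mutation_changed_outer = (outer[2] == ['ni'])

print(rebinding_changed_outer, mutation_changed_outer)
```

Only the in-place version is visible through outer, because only then do both names still refer to the same list object.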

To mutate the existing list instead of creating a new one, clear it and append to it:

>>> l5 = l4[2]
>>> id(l4[2]) == id(l5)
True
>>> l5
[3, 4, [5], 'cheddar', 'gouda']
>>> l5.clear()
>>> l5.append('ni')
>>> l5
['ni']
>>> l4[2]
['ni']
>>> id(l4[2]) == id(l5)
True

Lists as function arguments

I've seen confusion around this stuff cause a lot of bugs.

>>> def spam(some_list):
...     some_list.append(1)
...     return some_list
...
>>> l = []
>>> spam(l)
[1]
>>> spam(l)
[1, 1]
>>> spam(l)
[1, 1, 1]
>>> l
[1, 1, 1]

So the list that gets passed into our spam function gets mutated every time the function is called.

Pretty obvious, right?

This is how default list arguments behave:

>>> def eggs(some_list=[]):
...     some_list.append(1)
...     return some_list
...
>>> eggs()
[1]
>>> eggs()
[1, 1]
>>> eggs()
[1, 1, 1]
>>> id(eggs())
139786247382088
>>> id(eggs())
139786247382088

eggs keeps returning the same list object.

The default list was created once, when the function was defined, and every call that omits the argument reuses it.
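One way to observe this is through the function's __defaults__ attribute, which holds the default argument values as a tuple. The sketch below redefines eggs so that it is self-contained; the snapshot variable names are illustrative.

```python
def eggs(some_list=[]):
    some_list.append(1)
    return some_list

defaults_before = list(eggs.__defaults__[0])  # snapshot of the default: []
eggs()
eggs()
defaults_after = list(eggs.__defaults__[0])   # snapshot after two calls: [1, 1]

# The list returned by a call with no argument IS the stored default.
returned_is_default = eggs() is eggs.__defaults__[0]

print(defaults_before, defaults_after, returned_is_default)
```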

Now, let's create a new list and pass it in:

>>> l=['something_new']
>>> eggs(l)
['something_new', 1]
>>> eggs(l)
['something_new', 1, 1]
>>> eggs(l)
['something_new', 1, 1, 1]
>>>
>>> # So now it behaves like our spam function
>>>
>>> # how about this:
>>>
>>> eggs([])
[1]
>>> eggs([])
[1]
>>> eggs([])
[1]

The moral of the story: eggs should be burned as a witch.

Default mutable parameters are dangerous! Avoid them.

Below is a safer way of doing things.

Now the default behavior doesn't change every time the function is called.

>>> def better_eggs(some_list=None):
...     if some_list is None:
...         some_list = []
...     some_list.append(1)
...     return some_list
...
>>> better_eggs()
[1]
>>> better_eggs()
[1]
>>> better_eggs()
[1]

Recap

In this section, we managed to get our heads around all things listy.

We can create and mutate them, we can fetch different parts of them in different ways, and we can avoid certain weird errors with success.
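One more defensive pattern is sketched here, under the assumption that callers do not expect their list to change: copy the argument before modifying it. The function name spam_copy is hypothetical and not part of the examples above.

```python
def spam_copy(some_list=None):
    """Return a new list with 1 appended, leaving the argument untouched."""
    result = [] if some_list is None else list(some_list)  # shallow copy
    result.append(1)
    return result

mine = ['something']
returned = spam_copy(mine)

print(mine)      # the caller's list is unchanged
print(returned)  # the new list carries the extra element
```

The copy is shallow, so nested lists inside the argument would still be shared with the caller.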

List versus …

Lists are great and all, but they aren't the only built-in iterable Python has to offer.

In this section, we'll do a little roundup of a few other Python types:

Dictionaries

Dictionaries map keys to values.

Values can have any …erm… value.

Keys are a little more specific but fairly flexible: a key can be any hashable object, such as a number, a string, or a tuple of hashable items.
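The restriction on keys is that they must be hashable. As a minimal sketch (the dictionary name and values are illustrative), immutable built-ins such as ints, strings, and tuples of hashable items work as keys, while a list does not:

```python
lookup = {}
lookup[42] = 'an int key'
lookup['spam'] = 'a str key'
lookup[(1, 2)] = 'a tuple key'

try:
    lookup[[1, 2]] = 'a list key'  # lists are mutable and not hashable
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(list_key_allowed)
```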

>>> d = {}
>>> type(d)
<class 'dict'>
>>> dir(d)
['__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']
>>> d = {1:2,'a':3, 'c': 'ddddd'}
>>> d
{1: 2, 'a': 3, 'c': 'ddddd'}

You access individual values by key, not by index:

>>> d[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 0
>>> d[1]
2
>>> d['a']
3
>>> d['c']
'ddddd'

Dictionaries, like lists, are mutable.

You can make changes to them without having to make a whole new dict.

This means that dicts can fall victim to the same gotchas we went over before.

>>> d['c'] = 'new value'
>>> d
{1: 2, 'a': 3, 'c': 'new value'}
>>> d['new key'] = 'new value'
>>> d
{1: 2, 'a': 3, 'new key': 'new value', 'c': 'new value'}

The key order shown here comes from an older Python 3 interpreter. From Python 3.7 onwards, dicts preserve insertion order, so 'new key' would be printed last.

You can provide default values when trying to access values from a dict.

>>> x = d[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 0
>>> x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>> x = d.get(0)
>>> x
>>> print(x)
None
>>> x = d.get(0,"some default")
>>> x
'some default'

For loops work a little differently with dicts than with lists.

In lists, the for loop iterates over the list elements.

With dicts, it iterates over the keys (not the values!):

>>> for key in d:
...     print(key)
...
1
a
new key
c
>>> for key in d:
...     print(key,' : ',d[key])
...
1  :  2
a  :  3
new key  :  new value
c  :  new value

Sets

A set is an unordered collection of unique elements:

>>> s = {1,2,3}
>>> s
{1, 2, 3}
>>> type(s)
<class 'set'>
>>> dir(s)
['__and__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__iand__', '__init__', '__ior__', '__isub__', '__iter__', '__ixor__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__or__', '__rand__', '__reduce__', '__reduce_ex__', '__repr__', '__ror__', '__rsub__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__xor__', 'add', 'clear', 'copy', 'difference', 'difference_update', 'discard', 'intersection', 'intersection_update', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove', 'symmetric_difference', 'symmetric_difference_update', 'union', 'update']

Since it is unordered, you can't access elements by index.

>>> s[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object does not support indexing

To add things to a set, you use the add method.

This means that sets are mutable, just like lists and dicts:

>>> s
{1, 2, 3}
>>> s.add(1)
>>> s
{1, 2, 3}
>>> s.add(55)
>>> s
{1, 2, 3, 55}
>>> s.add("parrot")
>>> s
{1, 2, 3, 'parrot', 55}

Notice the position of 'parrot' above.

Remember that sets don't care about ordering.
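A common use of this uniqueness is removing duplicates from a list. The sketch below is illustrative: converting to a set drops duplicates but gives no ordering guarantee, while dict.fromkeys is one way to keep first-seen order on Python 3.7 and later, where dicts preserve insertion order.

```python
words = ['spam', 'eggs', 'spam', 'ham', 'eggs']

unique = set(words)                           # duplicates removed, no order guarantee
unique_in_order = list(dict.fromkeys(words))  # first-seen order kept (Python 3.7+)

print(len(unique))
print(unique_in_order)
```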

For loops and the in operator work the same for sets as for lists:

>>> for x in s:
...     print(x)
...
1
2
3
parrot
55
>>> s
{1, 2, 3, 'parrot', 55}
>>> 2 in s
True
>>> 22 in s
False
>>> 2 not in s
False
>>> 22 not in s
True

Tuples

Tuples are ordered collections of elements but are IMMUTABLE.

If you want to make a change to a tuple, you need to create a whole new tuple.
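As a minimal sketch of what "creating a whole new tuple" can look like, the example below builds a changed tuple from pieces of the old one, either by slicing and concatenating or by going through a temporary list. The variable names are illustrative.

```python
t = (1, 2, 3)

# Slice off the part to keep and concatenate a one-element tuple.
t_changed = t[:2] + ('boo',)

# Or convert to a list, modify it, and convert back.
as_list = list(t)
as_list[2] = 'boo'
via_list = tuple(as_list)

print(t)          # the original tuple is untouched
print(t_changed)
```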

>>> t = (1,2,3)
>>> type(t)
<class 'tuple'>
>>> dir(t)
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index']
>>> t
(1, 2, 3)
>>> t[0]
1
>>> t[1]
2
>>> t[3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: tuple index out of range

Index assignment doesn't work because tuples are immutable:

>>> t[3] = 'boo'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> t[2] = 'boo'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

For loops and the in operator work as expected.

>>> for x in t:
...     print(x)
...
1
2
3
>>> t
(1, 2, 3)
>>> 1 in t
True
>>> 111 in t
False
>>> 1 not in t
False
>>> 111 not in t
True

Conclusion

This article covered lists in depth.

We covered the basics of list creation, indexing and slicing.

We also spoke about list mutability and demonstrated a few not-terribly-intuitive list behaviors.

We then briefly compared lists to other data structures.

You should now have the tools needed to explore those data structures further on your own.
