Part 5

Dictionary

Lists can be handy in many situations, but they are limited by the fact that the items are accessed through indexes; 0, 1, 2, and so forth. If you want to find some item in a list, you will either have to know its index, or, at worst, traverse through the entire list.

Another central data structure in Python is the dictionary. In a dictionary, the items are indexed by keys. Each key maps to a value. The values stored in the dictionary can be accessed and changed using the key.

Using a dictionary

The following example shows you how the dictionary data structure works. Here is a simple dictionary from Finnish to English:

my_dictionary = {}

my_dictionary["apina"] = "monkey"
my_dictionary["banaani"] = "banana"
my_dictionary["cembalo"] = "harpsichord"

print(len(my_dictionary))
print(my_dictionary)
print(my_dictionary["apina"])
Sample output

3 {'apina': 'monkey', 'banaani': 'banana', 'cembalo': 'harpsichord'} monkey

The notation {} creates an empty dictionary, to which we can now add content. Three key-value pairs are added:"apina" maps to "monkey", "banaani" maps to "banana", and "cembalo" maps to "harpsichord". Finally, the number of key-value pairs in the dictionary is printed, along with the entire dictionary, and the value mapped to the key "apina".

After defining the dictionary we could also use it with user input:

word = input("Please type in a word: ")
if word in my_dictionary:
    print("Translation: ", my_dictionary[word])
else:
    print("Word not found")

Notice the use of the in operator above. When used on a variable of type dictionary, it checks whether the first operand is among the keys stored in the dictionary. Given different inputs, this program might print out the following:

Sample output

Please type in a word: apina Translation: monkey

Sample output

Please type in a word: pöllö Word not found

What can be stored in a dictionary?

The data type is called dictionary, but it does not have to contain only strings. For example, in the following dictionary the keys are strings, but the values are integers:

results = {}
results["Mary"] = 4
results["Alice"] = 5
results["Larry"] = 2

Here the keys are integers and the values are lists:

lists = {}
lists[5] = [1, 2, 3]
lists[42] = [5, 4, 5, 4, 5]
lists[100] = [5, 2, 3]

How keys and values work

Each key can appear only once in the dictionary. If you add an entry using a key that already exists in the dictionary, the original value mapped to that key is replaced with the new value:

my_dictionary["suuri"] = "big"
my_dictionary["suuri"] = "large"
print(my_dictionary["suuri"])
Sample output

large

All keys in a dictionary must be immutable. So, a list cannot be used as a key, because it can be changed. For example, executing the following code causes an error:

my_dictionary[[1, 2, 3]] = 5
Sample output

TypeError: unhashable type: 'list'

Unlike keys, the values stored in a dictionary can change, so any type of data is acceptable as a value. A value can also be mapped to more than one key in the same dictionary.

Loading
Loading

Traversing a dictionary

The familiar for item in collection loop can be used to traverse a dictionary, too. When used on the dictionary directly, the loop goes through the keys stored in the dictionary, one by one. In the following example, all keys and values stored in the dictionary are printed out:

my_dictionary = {}

my_dictionary["apina"] = "monkey"
my_dictionary["banaani"] = "banana"
my_dictionary["cembalo"] = "harpsichord"

for key in my_dictionary:
    print("key:", key)
    print("value:", my_dictionary[key])
Sample output

key: apina value: monkey key: banaani value: banana key: cembalo value: harpsichord

Sometimes you need to traverse the entire contents of a dictionary. The method items returns all the keys and values stored in the dictionary, one pair at a time:


for key, value in my_dictionary.items():
    print("key:", key)
    print("value:", value)

In the examples above, you may have noticed that the keys are processed in the same order as they were added to the dictionary. As the keys are processed based on a hash value, the order should not usually matter in applications. In fact, in many older versions of Python the order is not guaranteed to follow the time of insertion.

Some more advanced ways to use dictionaries

Let's have a look at a list of words:

word_list = [
  "banana", "milk", "beer", "cheese", "sourmilk", "juice", "sausage",
  "tomato", "cucumber", "butter", "margarine", "cheese", "sausage",
  "beer", "sourmilk", "sourmilk", "butter", "beer", "chocolate"
]

We would like to analyze this list of words in different ways. For instance, we would like to know how many times each word appears in the list.

A dictionary can be a useful tool in managing this kind of information. In the example below, we go through the items in the list one by one. Using the words in the list as keys in a new dictionary, the value mapped to each key is the number of times the word has appeared:

def counts(my_list):
    words = {}
    for word in my_list:
        # if the word is not yet in the dictionary, initialize the value to zero
        if word not in words:
            words[word] = 0
        # increment the value
        words[word] += 1
    return words

# call the function
print(counts(word_list))

The program prints out the following:

Sample output

{'banana': 1, 'milk': 1, 'beer': 3, 'cheese': 2, 'sourmilk': 3, 'juice': 1, 'sausage': 2, 'tomato': 1, 'cucumber': 1, 'butter': 2, 'margarine': 1, 'chocolate': 1}

What if we wanted to categorize the words based on the initial letter in each word? One way to accomplish this would be to use dictionaries:

def categorize_by_initial(my_list):
    groups = {}
    for word in my_list:
        initial = word[0]
        # initialize a new list when the letter is first encountered
        if initial not in groups:
            groups[initial] = []
        # add the word to the appropriate list
        groups[initial].append(word)
    return groups

groups = categorize_by_initial(word_list)

for key, value in groups.items():
    print(f"words beginning with {key}:")
    for word in value:
        print(word)

The structure of the function is very similar to the previous exercise but this time the values mapped to the keys are lists. The program prints out the following:

Sample output

words beginning with b: banana beer butter beer butter beer words beginning with m: milk margarine words beginning with c: cheese cucumber cheese chocolate words beginning with s: sourmilk sausage sausage sourmilk sourmilk words beginning with j: juice words beginning with t: tomato

Loading
Loading
Loading

Removing keys and values from a dictionary

It is naturally possible to also remove key-value paris from the dictionary. There are two ways to accomplish this. The first is the command del:

staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"}
del staff["David"]
print(staff)
Sample output

{'Alan': 'lecturer', 'Emily': 'professor'}

If you try to use the del command to delete a key which doesn't exist in the dictionary, there will be an error:

staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"}
del staff["Paul"]
Sample output
>>> del staff["Paul"]
Traceback (most recent call last):
  File "", line 1, in 
KeyError: 'Paul'

So, before deleting a key you should check if it is present in the dictionary:

staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"}
if "Paul" in staff:
  del staff["Paul"]
  print("Deleted")
else:
  print("This person is not a staff member")

The other way to delete entries in a dictionary is the method pop:

staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"}
deleted = staff.pop("David")
print(staff)
print(deleted, "deleted")
Sample output

{'Alan': 'lecturer', 'Emily': 'professor'} lecturer deleted

As you can see above, pop also returns the value from the deleted entry.

By default, pop will also cause an error if you try to delete a key which is not present in the dictionary. It is possible to avoid this by giving the method a second argument, which contains a default return value. This value is returned in case the key is not found in the dictionary. The special Python value None will work here:

staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"}
deleted = staff.pop("Paul", None)
if deleted == None:
  print("This person is not a staff member")
else:
  print(deleted, "deleted")
Sample output

This person is not a staff member

NB: if you need to delete the contents of the entire dictionary, and try to do it with a for loop, like so

staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"}
for key in staff:
  del staff[key]

you will receive an error message:

Sample output

RuntimeError: dictionary changed size during iteration

When traversing a collection with a for loop, the contents may not change while the loop is in progress.

Luckily, there is a dictionary method for just this purpose:

staff.clear()
Loading
Loading

Using dictionaries for structured data

Dictionaries are very useful for structuring data. The following code will create a dictionary which contains some personal data:

person = {"name": "Pippa Python", "height": 154, "weight": 61, "age": 44}

This means that we have here a person named Pippa Python, whose height is 154, weight 61, and age 44. The same information could just as well be stored in variables:

name = "Pippa Python"
height = 154
weight = 61
age = 44

The advantage of a dictionary is that it is a collection. It collects related data under one variable, so it is easy to access the different components. This same advantage is offered by a list:

person = ["Pippa Python", 153, 61, 44]

With lists, the programmer will have to remember what is stored at each index in the list. There is nothing to indicate that person[2] contains the weight and person[3] the age of the person. When using a dictionary this problem is avoided, as each bit of data is accessed through a named key.

Assuming we have defined multiple people using the same format, we can access their data in the following manner:

person1 = {"name": "Pippa Python", "height": 154, "weight": 61, "age": 44}
person2 = {"name": "Peter Pythons", "height": 174, "weight": 103, "age": 31}
person3 = {"name": "Pedro Python", "height": 191, "weight": 71, "age": 14}

people = [person1, person2, person3]

for person in people:
    print(person["name"])

combined_height = 0
for person in people:
    combined_height += person["height"]

print("The average height is", combined_height / len(people))
Sample output

Pippa Python Peter Pythons Pedro Python The average height is 173.0

Loading
Loading

At this point in the course, you can choose to participate in a research study related to learning programming. Participation is voluntary and individual participants cannot be identified from the data gathered in the study. You can freely quit the experiment at any point. Click here to begin the study!

You have reached the end of this section! Continue to the next section:

You can check your current points from the blue blob in the bottom-right corner of the page.