3 of 24

Logistics

Coding Activity 5 (due Friday November 1)

HW4 Part 1 (due Friday October 28)

No resubmission for HW4 Part 1
If you want to resubmit, wait until Part 2 is due; resubmissions will open then

HW4 Part 2 (due Friday November 8)

Submitting on Gradescope
Wait for autograder!

4 of 24

Lecture Review: Sets

5 of 24

Sets

Sets are a type of data structure which is unordered and unindexed
There can be no duplicates
Imagine it as a “bag of values”

You can imagine a set that contains the values 1 through 6 like this:

6 of 24

Sets

Each of the following creates the same set, containing the values 1 through 4:

s = set([1, 2, 3, 4])

s = {1, 1, 1, 1, 1, 1, 2, 3, 4}

s = {1, 2, 3, 4}

s = set()

s.add(1)

s.add(2)

s.add(3)

s.add(4)
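As a quick check, the sketch below builds the set each way and compares the results. The variable names are made up for illustration.

```python
# Four ways to build the same set; the names here are illustrative only.
from_list = set([1, 2, 3, 4])
with_duplicates = {1, 1, 1, 1, 1, 1, 2, 3, 4}  # duplicates are dropped
literal = {1, 2, 3, 4}

built_up = set()  # note: {} creates an empty dict, not an empty set
for value in [1, 2, 3, 4]:
    built_up.add(value)

print(from_list == with_duplicates == literal == built_up)  # True
print(len(with_duplicates))  # 4
```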

7 of 24

Add, Remove, and Discard

Say we have the set s that has elements 1, 2, 3, 4 inside

Add

adds an element to the set
s.add(5)

Remove

takes out an existing element from the set (it must exist in the set, otherwise Python raises a KeyError)
s.remove(5)

Discard

takes out an element from the set (doesn’t need to be in there already)
s.discard(5)

Pop

Removes and returns an arbitrary element (you cannot choose which one)
s.pop()
Could remove and return 1, 2, 3, or 4
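The difference between remove and discard is easiest to see by running them on an element that is not in the set. A minimal sketch, assuming the same set s as above:

```python
s = {1, 2, 3, 4}

s.add(5)      # s is now {1, 2, 3, 4, 5}
s.add(5)      # adding an element that is already there changes nothing
s.remove(5)   # fine: 5 is in the set

try:
    s.remove(5)   # 5 is gone, so this raises KeyError
except KeyError:
    print("remove() raised KeyError")

s.discard(5)  # no error even though 5 is not in the set

popped = s.pop()  # removes and returns some element; which one is not guaranteed
print(popped in {1, 2, 3, 4})  # True
print(len(s))                  # 3
```

Use remove when a missing element means something went wrong, and discard when you just want the element gone either way.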

8 of 24

Sets

Although you can convert other data structures (such as lists) into a set, every element of a set must be an immutable type (just like dict keys)
Data types that cannot go in a set (mutable types)

Dictionaries
Lists
Other sets

Data types that can go in a set (immutable types)

Integers
Floats
Booleans (but why would you do this?)
Strings
Tuples (as long as everything inside them is immutable too)
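The rule above can be tried out directly. This sketch adds a few allowed values to a set and shows the TypeError you get for a list; the values are arbitrary examples.

```python
s = set()
s.add(3)
s.add(2.5)
s.add("hello")
s.add((1, 2))        # a tuple of immutable values is fine

try:
    s.add([1, 2])    # lists are mutable, so this raises TypeError
except TypeError as error:
    print("Cannot add a list:", error)

# Converting a list to a set works when the list's elements are immutable.
print(set([1, 2, 2, 3]) == {1, 2, 3})  # True

# True == 1 in Python, so a set treats them as the same element.
print(len({1, True}))  # 1
```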

9 of 24

Looping Through a Set

To see all elements in a set, we can loop through it

for element in s:
    print(element)

Would print 1, 2, 3, 4 in some arbitrary order (the order is not guaranteed)
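If you need a predictable order, one option is to loop over sorted(s) instead of s. A small sketch, assuming s holds values that can be compared with each other:

```python
s = {1, 2, 3, 4}

# A plain loop visits every element once, but the order is not guaranteed.
seen = []
for element in s:
    seen.append(element)

# sorted() returns a list, so this loop always goes from smallest to largest.
for element in sorted(s):
    print(element)
```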

10 of 24

Checking If Something Is In A Set

We can use in to see if an element is in a set. With s = {1, 2, 3, 4}:

2 in s    # Returns True

6 in s    # Returns False

11 of 24

Set Operations

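Venn diagram of two sets A and B, with three regions: elements only in A, items in both, elements only in B

A | B    all three regions

A & B    items in both

A - B    elements only in A

A ^ B    elements only in A, plus elements only in B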

12 of 24

Set Operations

Set Operation	Code
Adding a value	my_set.add(val)
Removing a value (must already exist)	my_set.remove(val)
Removing a value (doesn't need to exist)	my_set.discard(val)
Remove and return an arbitrary element	my_set.pop()
Return all values found in either set	set_1 | set_2 or set_1.union(set_2)
Return values found in both sets	set_1 & set_2 or set_1.intersection(set_2)
Return values found only in set_1	set_1 - set_2 or set_1.difference(set_2)
Return values found in exactly one of the sets	set_1 ^ set_2 or set_1.symmetric_difference(set_2)
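To see the four operators side by side, here is a short sketch on two example sets. The contents of set_1 and set_2 are made up for illustration.

```python
set_1 = {1, 2, 3, 4}
set_2 = {3, 4, 5, 6}

union = set_1 | set_2           # in either set
intersection = set_1 & set_2    # in both sets
difference = set_1 - set_2      # only in set_1
symmetric = set_1 ^ set_2       # in exactly one of the sets

# sorted() is used only so the printed order is predictable.
print(sorted(union))         # [1, 2, 3, 4, 5, 6]
print(sorted(intersection))  # [3, 4]
print(sorted(difference))    # [1, 2]
print(sorted(symmetric))     # [1, 2, 5, 6]

# The method forms give the same results as the operators.
print(set_1.union(set_2) == union)                    # True
print(set_1.symmetric_difference(set_2) == symmetric)  # True
```

Note that difference depends on the order: set_2 - set_1 would give {5, 6} instead.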

13 of 24

Lecture Review: Nested Dictionaries

14 of 24

Lists of Dictionaries

List of dictionaries:

[{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'d': 1, 'e': 2, 'f': 3}]

They might be used to represent a table (e.g. an Excel file)

Each dictionary in the list is one row in the table
In this version, each dictionary should have the same keys

Example:

[{'County': 'King', 'Population': 2269675, 'Temperature': 57}, {'County': 'Pierce', 'Population': 921130, 'Temperature': 61}, {'County': 'Snohomish', 'Population': 827957, 'Temperature': 53}]
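Working with this table-like list might look like the sketch below. The variable name counties is an assumption for illustration; the data is the example above.

```python
counties = [
    {"County": "King", "Population": 2269675, "Temperature": 57},
    {"County": "Pierce", "Population": 921130, "Temperature": 61},
    {"County": "Snohomish", "Population": 827957, "Temperature": 53},
]

# Each dictionary is one row; the keys act like column names.
print(counties[0]["County"])  # King

# Adding up one "column" means visiting every row.
total_population = 0
for row in counties:
    total_population += row["Population"]
print(total_population)  # 4018762

# Finding one county also means scanning the list row by row.
pierce_temperature = None
for row in counties:
    if row["County"] == "Pierce":
        pierce_temperature = row["Temperature"]
print(pierce_temperature)  # 61
```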

15 of 24

Nested Dictionaries

Dictionaries themselves can hold mutable elements as values which means we can put a dictionary inside a dictionary

{"dict_1": {"a": 1, "b": 2, "c": 3}, "dict_2": {"a": 5, "b": 4, "c": 3}, "dict_3": {"a": 1, "b": 2, "c": 3}}
Can have duplicate values

This can be used to better categorize data, for example by turning the list of county dictionaries into a dictionary keyed by county:

{'King': {'Population': 2269675, 'Temperature': 57}, 'Pierce': {'Population': 921130, 'Temperature': 61}, 'Snohomish': {'Population': 827957, 'Temperature': 53}}

Can now easily find information based on county instead of traversing through a list
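A lookup in the nested version takes two keys, one after the other. A minimal sketch, where the name counties is assumed for illustration:

```python
counties = {
    "King": {"Population": 2269675, "Temperature": 57},
    "Pierce": {"Population": 921130, "Temperature": 61},
    "Snohomish": {"Population": 827957, "Temperature": 53},
}

# First key picks the county, second key picks the field. No loop needed.
print(counties["Pierce"]["Temperature"])  # 61

# Looping over the outer dictionary gives each county with its inner dictionary.
for county, info in counties.items():
    print(county, info["Population"])

# Inner dictionaries are mutable, so a single value can be updated in place.
counties["King"]["Temperature"] = 58
print(counties["King"])  # {'Population': 2269675, 'Temperature': 58}
```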

16 of 24

Section Handout Problems

17 of 24

Problem 1

Write a function called all_unique_words(file_name) that takes in a string file_name and returns the number of unique words in the file. You may use the split() method for this problem, which is called on a string and returns a list of the words in the string, separated by whitespace.

Example:

If colors.txt has the content "red green blue green"

Your output should be: 3

18 of 24

Problem 1

def all_unique_words(file_name):
    # Read the whole file as one string
    with open(file_name) as file:
        words = file.read()
    # split() breaks on whitespace; set() drops duplicate words
    unique = set(words.split())
    return len(unique)
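One way to try the function is to write the colors.txt example to a temporary folder and call it. This is a self-contained sketch: the function is repeated so the block runs on its own, and the temporary-file setup is an assumption for testing, not part of the problem.

```python
import os
import tempfile


def all_unique_words(file_name):
    # Read the file, split it into words, and count the distinct ones.
    with open(file_name) as file:
        words = file.read().split()
    return len(set(words))


# Create colors.txt in a temporary folder so nothing is left behind.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "colors.txt")
    with open(path, "w") as file:
        file.write("red green blue green")
    result = all_unique_words(path)

print(result)  # 3
```

Keep in mind that this comparison is case-sensitive, so "Red" and "red" would count as two different words.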

19 of 24

Problem 2

Write a function called sum_dict(nested_dict) that, given a dictionary of dictionaries, creates a single dictionary containing the sums of values with the same key in the given dictionaries.

For example: Given this dictionary of dictionaries:

{"dict_1": {"b": 10, "a": 5, "c": 90},

"dict_2": {"b": 78, "a": 45},

"dict_3": {"a": 90, "c": 10}}

Your code should create: {"b": 88, "a": 140, "c": 100}

Python Tutor

20 of 24

Problem 2

def sum_dict(nested_dict):
    new_dict = {}
    for inner_dict in nested_dict.values():
        for key in inner_dict:
            # Start each new key at 0 before adding to it
            if key not in new_dict:
                new_dict[key] = 0
            new_dict[key] += inner_dict[key]
    return new_dict
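The "start at 0 if the key is new" step can also be written with dict.get, which returns a default when the key is missing. The sketch below is an alternative version, not the handout's solution; the name sum_dict_with_get is made up.

```python
def sum_dict_with_get(nested_dict):
    totals = {}
    for inner_dict in nested_dict.values():
        for key, value in inner_dict.items():
            # get(key, 0) returns 0 the first time a key is seen.
            totals[key] = totals.get(key, 0) + value
    return totals


example = {
    "dict_1": {"b": 10, "a": 5, "c": 90},
    "dict_2": {"b": 78, "a": 45},
    "dict_3": {"a": 90, "c": 10},
}

print(sum_dict_with_get(example))  # {'b': 88, 'a': 140, 'c': 100}
```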

21 of 24

Problem 3

Write a function called reformat_dict(dict_list, new_key) that, given a list of dictionaries and a key, returns a dictionary of dictionaries. Each outer key is the value stored under new_key in one of the given dictionaries, and the matching value is a dictionary holding the rest of that dictionary's information.

For example, given: new_key = "County"

dict_list = [{"County": "King", "Population": 2269675, "Temperature": 57}, {"County": "Pierce", "Population": 921130, "Temperature": 61}, {"County": "Snohomish", "Population": 827957, "Temperature": 53}]

Your code should produce:

{'King': {'Population': 2269675, 'Temperature': 57}, 'Pierce': {'Population': 921130, 'Temperature': 61}, 'Snohomish': {'Population': 827957, 'Temperature': 53}}

Python Tutor

22 of 24

Problem 3

def reformat_dict(dict_list, new_key):
    new_dict = {}
    for inner_dict in dict_list:
        # The value stored under new_key becomes the outer key
        current_key = inner_dict[new_key]
        new_dict[current_key] = {}
        # Copy every other key/value pair into the inner dictionary
        for key in inner_dict:
            if key != new_key:
                new_dict[current_key][key] = inner_dict[key]
    return new_dict
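The inner loop that copies "everything except new_key" can also be written as a dictionary comprehension. This is a sketch of an alternative, with the made-up name reformat_dict_compact; it should behave the same as the solution above on the example input.

```python
def reformat_dict_compact(dict_list, new_key):
    new_dict = {}
    for inner_dict in dict_list:
        # Build a copy of the row without the new_key entry.
        rest = {key: value for key, value in inner_dict.items() if key != new_key}
        new_dict[inner_dict[new_key]] = rest
    return new_dict


dict_list = [
    {"County": "King", "Population": 2269675, "Temperature": 57},
    {"County": "Pierce", "Population": 921130, "Temperature": 61},
    {"County": "Snohomish", "Population": 827957, "Temperature": 53},
]

result = reformat_dict_compact(dict_list, "County")
print(result["Pierce"])  # {'Population': 921130, 'Temperature': 61}
```

Because each inner dictionary is a new copy, the original list is left unchanged.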

23 of 24

Problem 4

Given a file.csv that looks like the following, write code that reads the data into a nested dictionary, where the first entry is the outer key, the second entry is the inner key, and the third entry is the value of the inner dictionary. You can ignore the header when reading in the file.

# Sample CSV:

State, City, Population

Washington, Seattle, 733919

Oregon, Portland, 641162

California, San Francisco, 815201

Michigan, Detroit, 632464

Example Output:

{"Washington": {"Seattle": 733919}, "Oregon": {"Portland": 641162}, "California": {"San Francisco": 815201}, "Michigan": {"Detroit": 632464}}

Python Tutor

Python Tutor Snippet:

import io

file = io.StringIO("""Washington, Seattle, 733919
Oregon, Portland, 641162
California, San Francisco, 815201
Michigan, Detroit, 632464""")

24 of 24

Problem 4

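def read_data(file_name):
    nested_dict = {}
    with open(file_name) as file:
        file.readline()  # skip the header line
        for line in file:
            # Split on commas, then strip the spaces around each entry
            data = line.strip().split(",")
            state = data[0].strip()
            city = data[1].strip()
            population = int(data[2])
            # Reuse the inner dictionary if the state was already seen
            if state not in nested_dict:
                nested_dict[state] = {}
            nested_dict[state][city] = population
    return nested_dict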
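Python's built-in csv module can handle the comma splitting for you. The sketch below is an alternative, not the handout's solution: the name read_rows is made up, it takes an already opened file object instead of a file name, and it reads from an io.StringIO in the style of the Python Tutor snippet.

```python
import csv
import io


def read_rows(file):
    nested_dict = {}
    # skipinitialspace=True drops the space that follows each comma.
    reader = csv.reader(file, skipinitialspace=True)
    next(reader)  # skip the header row
    for row in reader:
        if not row:  # ignore blank lines
            continue
        state, city, population = row
        # setdefault creates the inner dictionary the first time a state appears.
        nested_dict.setdefault(state, {})[city] = int(population)
    return nested_dict


sample = io.StringIO("""State, City, Population
Washington, Seattle, 733919
Oregon, Portland, 641162
California, San Francisco, 815201
Michigan, Detroit, 632464""")

data = read_rows(sample)
print(data["California"])  # {'San Francisco': 815201}
```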