Joining and Row Methods
Data 6 Summer 2023
LECTURE 12
Combining multiple sources of data.
Developed by students and faculty at UC Berkeley and Tuskegee University
Icebreaker
What is your hottest take?
(most unpopular opinion)
Week 3
Announcements!
Today’s Roadmap
Joining Tables
1. Joining
2. Demo
3. The Row Data Type
Tables: phones, inventory
Question: If I sold all of the phones in my inventory, what would my revenue be?
.join()
The method
table_1.join('column 1', table_2, 'column 2')
combines table_1 and table_2 into a larger table, by looking for matches in 'column 1' of table_1 and 'column 2' of table_2 and combining matching rows.
phones
inventory
phones.join('Model', inventory, 'Handset')
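To make the matching concrete, here is a minimal plain-Python sketch of what a call like phones.join('Model', inventory, 'Handset') does, and how the result could answer the revenue question. It is a stand-in, not the datascience library's implementation: the helper join_rows, the Price and Units columns, and every value are hypothetical. Only the labels 'Model' and 'Handset' come from the slides.

```python
def join_rows(left, left_col, right, right_col):
    """Sketch of a join on lists of dicts (hypothetical helper)."""
    joined = []
    for l_row in left:
        for r_row in right:
            if l_row[left_col] == r_row[right_col]:
                # Keep the left join column; drop the right one so it is not duplicated.
                extra = {k: v for k, v in r_row.items() if k != right_col}
                joined.append({**l_row, **extra})
    # Order the result by the join column, as the lecture describes for join.
    return sorted(joined, key=lambda row: row[left_col])


# Hypothetical data for illustration only.
phones = [
    {"Model": "Pixel", "Price": 600},
    {"Model": "iPhone", "Price": 900},
    {"Model": "Galaxy", "Price": 800},
]
inventory = [
    {"Handset": "iPhone", "Units": 3},
    {"Handset": "Pixel", "Units": 2},
]

joined = join_rows(phones, "Model", inventory, "Handset")
# 'Galaxy' has no matching Handset, so it does not appear in the joined rows.
revenue = sum(row["Price"] * row["Units"] for row in joined)
print(joined)
print(revenue)
```

Once price and stock sit in the same row, revenue is just price times units, summed over the rows.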
No matches! A row whose join value does not appear in the other table is left out of the result.
Some Considerations
The order that you join in can change the order of your columns, but doesn’t change the content of your table.
The joined table is sorted by the join column by default. (When strings are sorted, uppercase characters come before lowercase characters!)
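The uppercase-before-lowercase rule is how Python compares strings by default. The short sketch below shows that ordering with plain Python's sorted; the model names are hypothetical, and the case-insensitive variant is an aside rather than something join does.

```python
# Hypothetical labels with mixed capitalization.
models = ["iPhone", "Pixel", "galaxy", "Galaxy"]

# Default ordering: every uppercase letter sorts before every lowercase letter.
ordered = sorted(models)
print(ordered)

# Ignoring case instead requires an explicit key.
case_insensitive = sorted(models, key=str.lower)
print(case_insensitive)
```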
Questions?
Quick Check 1
Consider the tables contacts and codes.
contacts.join(___, ___, ___)
Demo
Followup
Beware: joining won’t always give you the result you’re looking for.
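Here, it seems odd that we have multiple rows for Sandy with different regions. But that’s how join works! When a join value matches more than one row in the other table, the result contains one row for each matching pair.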
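A rough plain-Python sketch of why Sandy can show up more than once: if one code is listed with several regions, every pairing counts as a match. The column labels and all values below are hypothetical, and the list comprehension is a stand-in for join, not the library's code.

```python
# Hypothetical tables, written as lists of dicts.
contacts = [
    {"Name": "Sandy", "Area Code": 510},
    {"Name": "Alex", "Area Code": 415},
]
codes = [
    {"Code": 510, "Region": "Oakland"},
    {"Code": 510, "Region": "Berkeley"},
    {"Code": 415, "Region": "San Francisco"},
]

# Every (contact, code) pair with equal values becomes one output row.
matches = [
    {"Name": c["Name"], "Area Code": c["Area Code"], "Region": k["Region"]}
    for c in contacts
    for k in codes
    if c["Area Code"] == k["Code"]
]

sandy_rows = [m for m in matches if m["Name"] == "Sandy"]
print(sandy_rows)
```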
Disclaimer
Python doesn’t know what columns to join by – we need to tell it.
If you try to join on columns that have no values in common, the result will be an empty table.
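One way to sanity-check a pair of join columns before joining is to look at the values they share. This is a plain-Python sketch with hypothetical data (the Price and Units columns and all values are made up); an empty intersection means a join on those columns would have nothing to match.

```python
# Hypothetical tables, written as lists of dicts.
phones = [{"Model": "Pixel", "Price": 600}, {"Model": "iPhone", "Price": 900}]
inventory = [{"Handset": "Pixel", "Units": 2}, {"Handset": "iPhone", "Units": 3}]

# Wrong pairing: model names never equal unit counts, so nothing is shared.
wrong_shared = {p["Model"] for p in phones} & {i["Units"] for i in inventory}

# Sensible pairing: both columns hold model names.
right_shared = {p["Model"] for p in phones} & {i["Handset"] for i in inventory}

print(wrong_shared)
print(right_shared)
```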
The Row Data Type
Rows vs. Columns
We know that columns are stored as arrays.
So what are rows stored as, then?
Each column only contains one type (e.g. float).
Each row contains strings, ints, and floats.
t.row(index)
The method
t.row(index)
returns the Row at the provided index.
Rows are not arrays!
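To illustrate why a row is not an array, the sketch below uses collections.namedtuple as a stand-in for a row. It is not the datascience Row type, and the PhoneRow name, its fields, and its values are all hypothetical. The point is only that one row can hold a string, a float, and an int side by side.

```python
from collections import namedtuple

# Hypothetical row type standing in for what t.row(index) returns.
PhoneRow = namedtuple("PhoneRow", ["Model", "Price", "Units"])
rows = [PhoneRow("Pixel", 599.99, 2), PhoneRow("iPhone", 899.0, 3)]

first = rows[0]  # analogous in spirit to t.row(0)

# Unlike a column, a single row mixes several types.
types = [type(value).__name__ for value in first]
print(first.Model, first[1])
print(types)
```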
t.with_row(s)
The method t.with_row(lst) returns a new table with one additional row; t.with_rows does the same for several rows at once.
We add rows far less frequently than we add columns, but these methods are still good to know.
In Python, lists are very similar to arrays, but a single list can contain multiple data types.
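As a small illustration, the sketch below builds a list with mixed types and appends it as a new row without modifying the original rows. The with_row function here is a hypothetical plain-Python stand-in that only mimics the "returns a new table" behavior; it is not the datascience method, and the values are made up.

```python
def with_row(rows, new_row):
    """Return a new list of rows with new_row added; rows itself is unchanged."""
    return rows + [new_row]


# Hypothetical rows; each list mixes a str, an int, and a float.
phones = [["Pixel", 600, 2.5]]
new_row = ["iPhone", 900, 4.0]

bigger = with_row(phones, new_row)
print(len(phones), len(bigger))
print([type(value).__name__ for value in new_row])
```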
Questions?
In Conclusion…
Summary
Method | Behavior |
table_1.join('column 1', table_2, 'column 2') | Combines table_1 and table_2 by looking for matches in 'column 1' of table_1 and 'column 2' of table_2 and combining matching rows. Returns a new table. |
table_1.join('column', table_2) | Shortcut to the above if 'column 1' and 'column 2' are equal (i.e. the labels of the join columns in both tables are equal). |
t.row(index) | Returns the row of t at the specified index. |
t.with_row(lst) | Returns a copy of t with a single additional row. |
Recap
Next Time