1 of 26

Joining and

Row Methods

1

Data 6 Summer 2023

LECTURE 12

Combining multiple sources of data.

Developed by students and faculty at UC Berkeley and Tuskegee University

data6.org/su23/syllabus/#acknowledgements-

2 of 26

Icebreaker

What is your hottest take?

(most unpopular opinion)

2

3 of 26

Week 3

Announcements!

  • Homework 2 has been released and will be due on 7/20 @ 11 PM
  • Office Hours locations and times on the course website
  • Post on Ed if you have questions about particular concepts or assignment questions

3

4 of 26

Today’s Roadmap

Lecture 12, Data 6 Summer 2023

  1. Joining
  2. Demo
  3. The Row Data Type

4

5 of 26

Joining Tables

5

1. Joining

2. Demo

3. The Row Data Type

6 of 26

phones

inventory

Question: If I sold all of the phones in my inventory, what would my revenue be?

7 of 26

.join()

The method

table_1.join('column 1', table_2, 'column 2')

combines table_1 and table_2 into a larger table, by looking for matches in 'column 1' of table_1 and 'column 2' of table_2 and combining matching rows.

8 of 26

phones

inventory

phones.join('Model', inventory, 'Handset')

9 of 26

phones

inventory

phones.join('Model', inventory, 'Handset')

No matches!

10 of 26

phones

inventory

phones.join('Model', inventory, 'Handset')

11 of 26

phones

inventory

phones.join('Model', inventory, 'Handset')

12 of 26

Some Considerations

The order that you join in can change the order of your columns, but doesn’t change the content of your table.

The column that you join by is sorted by default. (Uppercase characters come before lowercase characters!)

13 of 26

Questions?

13

14 of 26

Quick Check 1

Consider the tables contacts and codes.

  1. Fill in the blanks to join the two tables in the way that feels most natural.

contacts.join(___, ___, ___)

  • How many rows and columns will be in the joined table?

14

Quick Check

15 of 26

Demo

15

1. Joining

2. Demo

3. The Row Data Type

16 of 26

Followup

Beware: joining won’t always give you the result you’re looking for.

Here, it seems odd that we have multiple rows for Sandy with different regions. But that’s how join works!

17 of 26

Disclaimer

Python doesn’t know what columns to join by – we need to tell it.

  • It doesn’t know that it makes sense to join two tables by area code if you don’t tell it to.
  • Your role is important!

If you try and join on columns that have no shared elements, the result will be an empty table.

18 of 26

The Row Data Type

18

1. Joining

2. Demo

3. The Row Data Type

19 of 26

Rows vs. Columns

We know that columns are stored as arrays.

  • Question: Are rows stored as arrays too?
  • Answer: No – elements in arrays must all be of the same type, but each row can have multiple types of values.

So what are rows stored as, then?

Each column only contains one type (e.g. float).

Each row contains strings, ints, and floats.

20 of 26

t.row(index)

The method

t.row(index)

returns the Row at the provided index.

Rows are not arrays!

  • They can have elements of�different types.
  • You can use still use the .item method to get an element by index.

21 of 26

t.with_row(s)

The methods t.with_row(lst) returns a new table with an additional row.

  • If we want to add a single row to t, we call t.with_row(lst) where lst is a series of elements in square brackets [ and ].

We add rows far less frequently than we add columns, but these methods are still good to know.

In Python, lists are very similar to arrays, but can contain multiple data types

22 of 26

Questions?

22

23 of 26

In Conclusion…

23

24 of 26

Summary

24

Method

Behavior

table_1.join('column 1', table_2, 'column 2')

Combines table_1 and table_2 by looking for matches in 'column 1' of table_1 and 'column 2' of table_2 and combining matching rows. Returns a new table.

table_1.join('column', table_2)

Shortcut to the above if 'column 1' and 'column 2' are equal (i.e. the labels of the join columns in both tables are equal).

t.row(index)

Returns the row of t at the specified index.

t.with_row(lst)

Returns a copy of t with a single additional row.

25 of 26

Recap

Next Time

  • Working with Rows
  • Joining Tables
  • Functions
  • Function Parameters
  • Return Values
  • String Methods

25

26 of 26

Week 3

Announcements!

  • Homework 2 has been released and will be due on 7/20 @ 11 PM
  • Office Hours locations and times on the course website
  • Post on Ed if you have questions about particular concepts or assignment questions

26