1 of 19

CSE 163

Groupby and Apply

Suh Young Choi

🎶 Listening to: Minecraft soundtrack

💬 Before Class: If you were a kitchen appliance, what would you be?

2 of 19

This Time

  • Keyword Arguments
  • Groupby
  • Apply

Last Time

  • Imports
  • Pandas
    • How to read a file
    • Accessing columns
    • Element-wise operators
    • Filtering
    • .loc

3 of 19

Announcements

  • THA 1 due tomorrow!

  • THA 0 feedback out tomorrow

  • Resubmission Cycles start this Friday (7/11)
    • You can resubmit or late-submit 1 homework per cycle
    • Resubmission cycle closes on Tuesday (7/15)
    • Be on the lookout for an Ed post with a Google form
    • MUST complete Google form so that we can track your resubmission!

4 of 19

Keyword Arguments

  • Can specify parameters “by position” or “by name”

def div(a, b):
    return a / b

# Same behavior
div(1, 2)
div(a=1, b=2)
div(b=2, a=1)

# Different behavior
div(b=1, a=2)
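A small runnable sketch of the calls above, with the results stored so they can be compared. The variable names (by_position, by_name, and so on) are illustrative only and not part of the slide.

```python
def div(a, b):
    """Return a divided by b."""
    return a / b


# By position: the first value goes to a, the second to b.
by_position = div(1, 2)

# By name: the order no longer matters, because each value is labeled.
by_name = div(a=1, b=2)
by_name_swapped = div(b=2, a=1)

# Different behavior: here a is 2 and b is 1.
different = div(b=1, a=2)

print(by_position, by_name, by_name_swapped)  # 0.5 0.5 0.5
print(different)  # 2.0
```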

5 of 19

DataFrame

  • One of the basic data types from pandas is a DataFrame
    • It’s essentially a table with columns and rows!

     id          year  month  day  latitude   longitude    name        magnitude
0    nc72666881  2016  7      27   37.672333  -121.619000  California  1.43
1    us20006i0y  2016  7      27   21.514600  94.572100    Burma       4.90
2    nc72666891  2016  7      27   37.576500  -118.859167  California  0.06

Columns: id through magnitude (across the top)

Index (row): 0, 1, 2 (down the left side)
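As a rough sketch, the table above can be rebuilt by hand to see where the columns and the index live. In class the data would normally come from a file; typing the three rows into a dictionary here is an assumption made only to keep the example self-contained.

```python
import pandas as pd

# The three earthquake rows from the slide, typed in by hand (assumption:
# normally this data would be read from a file instead).
data = pd.DataFrame({
    'id': ['nc72666881', 'us20006i0y', 'nc72666891'],
    'year': [2016, 2016, 2016],
    'month': [7, 7, 7],
    'day': [27, 27, 27],
    'latitude': [37.672333, 21.514600, 37.576500],
    'longitude': [-121.619000, 94.572100, -118.859167],
    'name': ['California', 'Burma', 'California'],
    'magnitude': [1.43, 4.90, 0.06],
})

print(data.columns.tolist())  # the column labels, 'id' through 'magnitude'
print(data.index.tolist())    # the row labels: [0, 1, 2]
print(data.shape)             # (3, 8): 3 rows, 8 columns
```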

6 of 19

Groupby Demo

 

     col1  col2
0    A     1
1    B     2
2    C     3
3    A     4
4    C     5

Rows (col1, col2): (A, 1), (B, 2), (C, 3), (A, 4), (C, 5)
7 of 19

Groupby Demo

 

     col1  col2
0    A     1
1    B     2
2    C     3
3    A     4
4    C     5

Rows (col1, col2): (A, 1), (B, 2), (C, 3), (A, 4), (C, 5)

result = data.groupby('col1')

11 of 19

Groupby Demo

 

     col1  col2
0    A     1
1    B     2
2    C     3
3    A     4
4    C     5

Rows (col1, col2): (A, 1), (B, 2), (C, 3), (A, 4), (C, 5)

result = data.groupby('col1')

result is a grouped DataFrame (pandas calls this a DataFrameGroupBy)
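To peek inside the grouped object, one option is to loop over it. The sketch below rebuilds the demo table by hand and collects each group's col2 values into a dictionary; the names grouped and groups are illustrative only.

```python
import pandas as pd

# The demo table from the slides.
data = pd.DataFrame({'col1': ['A', 'B', 'C', 'A', 'C'],
                     'col2': [1, 2, 3, 4, 5]})

grouped = data.groupby('col1')
print(type(grouped).__name__)  # DataFrameGroupBy

# Looping over the grouped object yields (key, sub-DataFrame) pairs,
# one pair per distinct value in col1.
groups = {key: group['col2'].tolist() for key, group in grouped}
print(groups)  # {'A': [1, 4], 'B': [2], 'C': [3, 5]}
```

Nothing has been computed yet at this point: the rows are only sorted into groups until an operation such as .sum() is called.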

12 of 19

Groupby Demo

 

     col1  col2
0    A     1
1    B     2
2    C     3
3    A     4
4    C     5

Rows (col1, col2): (A, 1), (B, 2), (C, 3), (A, 4), (C, 5)

result = data.groupby('col1')['col2']

Selected column: col2

13 of 19

Groupby Demo

 

     col1  col2
0    A     1
1    B     2
2    C     3
3    A     4
4    C     5

Rows (col1, col2): (A, 1), (B, 2), (C, 3), (A, 4), (C, 5)

result = data.groupby('col1')['col2'].sum()

Selected column: col2, with .sum() applied to each of the three groups

14 of 19

Groupby Demo

 

     col1  col2
0    A     1
1    B     2
2    C     3
3    A     4
4    C     5

Group sums (col1, col2): (B, 2), (C, 8), (A, 5)

result = data.groupby('col1')['col2'].sum()

Selected column: col2

15 of 19

Groupby Demo

 

     col1  col2
0    A     1
1    B     2
2    C     3
3    A     4
4    C     5

Group sums (col1, col2): (B, 2), (C, 8), (A, 5)

result = data.groupby('col1')['col2'].sum()

16 of 19

Group By

result = data.groupby('col1')['col2'].sum()

Data (DataFrame)

     col1  col2
0    A     1
1    B     2
2    C     3
3    A     4
4    C     5

Split (col2 values, one group per col1 value)

C: 3, 5
B: 2
A: 1, 4

Apply (.sum() on each group)

A: 5
B: 2
C: 8

Combine (Series)

A    5
B    2
C    8
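The split, apply, and combine steps can be sketched in code. The first half below is the pandas one-liner from the slide; the second half is a hand-written loop that is only meant to illustrate the idea, not how pandas implements it. The names split, applied, and manual are illustrative.

```python
import pandas as pd

# The demo table from the slides.
data = pd.DataFrame({'col1': ['A', 'B', 'C', 'A', 'C'],
                     'col2': [1, 2, 3, 4, 5]})

# pandas version: split by col1, sum col2 within each group, combine into a Series.
result = data.groupby('col1')['col2'].sum()
print(result.to_dict())  # {'A': 5, 'B': 2, 'C': 8}

# Hand-written illustration of the same three steps.
split = {}
for key, value in zip(data['col1'], data['col2']):
    split.setdefault(key, []).append(value)  # Split: collect values per key

applied = {key: sum(values) for key, values in split.items()}  # Apply: sum each group

manual = pd.Series(applied, name='col2').sort_index()  # Combine: one Series, sorted by key
print(manual.to_dict())  # {'A': 5, 'B': 2, 'C': 8}
```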

17 of 19

Apply

  • We have shown how to filter and group your data, but sometimes you want to transform it
  • Numerical data is easy to change with the operators and functions we learned last time (+, -, /, *, abs(), min(), max(), etc.)
  • With strings, it’s not so easy

data['name'].str.len()
data['name'].str.upper()
data['name'].apply(len)
data['name'].apply(my_function)

  • The last two pass a function as a parameter!
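A minimal sketch of the four calls above on a tiny name column. The slide does not define my_function, so the version here (first three letters, upper-cased) is a made-up stand-in, and the three names are borrowed from the earthquake table.

```python
import pandas as pd

# A small stand-in for the 'name' column of the earthquake data.
data = pd.DataFrame({'name': ['California', 'Burma', 'California']})


def my_function(s):
    """Hypothetical transformation: first three letters, upper-cased."""
    return s[:3].upper()


lengths_str = data['name'].str.len()     # string method via .str
shouted = data['name'].str.upper()       # string method via .str
lengths_apply = data['name'].apply(len)  # pass the function itself, no parentheses
short = data['name'].apply(my_function)  # same idea with our own function

print(lengths_str.tolist())    # [10, 5, 10]
print(lengths_apply.tolist())  # [10, 5, 10]
print(shouted.tolist())        # ['CALIFORNIA', 'BURMA', 'CALIFORNIA']
print(short.tolist())          # ['CAL', 'BUR', 'CAL']
```

Note that apply receives len and my_function without parentheses: the function is handed over as a value, and pandas calls it once per element.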

18 of 19

Group Work:

Best Practices

When you first work with this group:

  • Introduce yourself!
  • If possible, angle one of your screens so that everyone can discuss together

Tips:

  • Start by making sure everyone agrees to work on the same problem
  • Make sure everyone gets a chance to contribute!
  • Ask if everyone agrees and periodically ask each other questions!
  • Call TAs over for help if you need any!

19 of 19

Before Next Time

  • Complete Lesson 8
    • Remember: it is not for points, but it does go towards Checkpoint Tokens
  • Keep working on THA 1
  • Go to section tomorrow!

Next Time

  • Data visualization
