1 of 9

CSE 163

Pandas

Hunter Schafer

💬 Did you do anything fun this weekend?

🎵Music: Manatee Commune

2 of 9

This Time

  • Imports
  • Pandas!
    • How to read a file
    • Accessing columns
    • Element-wise operators
    • Filtering
    • loc

Last Time

  • Dictionary Methods
    • How to loop over a dictionary
  • CSVs
  • List of Dictionaries

3 of 9

Group Work Tips

Running out of time?

  • Focus on getting the high-level idea and understanding any errors
  • Try to write down questions to ask!

Have some time left over?

  • Discuss alternate solutions and if they work (and why!)
  • Come up with quiz questions to test understanding
  • Work on a set of shared notes for the day (Google Drive or OneNote)

4 of 9

Importing

Importing lets you use the contents defined in another Python file

  • We call a Python file a module
  • Generally there are three ways to import!

# Method 1: Import module

import module

module.function()

# Method 2: Import and rename module

import module as m

m.function()

# Method 3: Import specific function from module

from module import function

function()
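
As a concrete version of the three patterns above, here is a minimal sketch that uses the standard library's math module in place of the placeholder module (the variable names root_1, root_2, and root_3 are only for illustration):

```python
# Method 1: import the module, then prefix each name with the module name
import math
root_1 = math.sqrt(16)

# Method 2: import the module under a shorter alias
import math as m
root_2 = m.sqrt(16)

# Method 3: import one specific function and call it without a prefix
from math import sqrt
root_3 = sqrt(16)

print(root_1, root_2, root_3)  # 4.0 4.0 4.0
```

Method 2 is the style conventionally used for pandas: import pandas as pd.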

5 of 9

DataFrame

  • One of the basic data types from pandas is a DataFrame
    • It’s essentially a table with columns and rows!

           id  year  month  day   latitude    longitude        name  magnitude
0  nc72666881  2016      7   27  37.672333  -121.619000  California       1.43
1  us20006i0y  2016      7   27  21.514600    94.572100       Burma       4.90
2  nc72666891  2016      7   27  37.576500  -118.859167  California       0.06

Columns: the labels across the top (id through magnitude)

Index (row): the labels down the left side (0, 1, 2)
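
One way to get a DataFrame like this is to read it from a CSV file with pd.read_csv. The sketch below is self-contained, so it reads from an in-memory string instead of a real file; the three rows are copied from the table above, and the file name in the comment is hypothetical.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk; rows copied from the example table.
csv_text = """id,year,month,day,latitude,longitude,name,magnitude
nc72666881,2016,7,27,37.672333,-121.619000,California,1.43
us20006i0y,2016,7,27,21.514600,94.572100,Burma,4.90
nc72666891,2016,7,27,37.576500,-118.859167,California,0.06
"""

# pd.read_csv also accepts a file path, e.g. pd.read_csv('earthquakes.csv')
# ('earthquakes.csv' is a hypothetical file name).
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # the column names
print(list(df.index))    # the row labels
```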

6 of 9

Series

  • A Series is like a 1-dimensional DataFrame (no columns!)
    • Has an index
    • It’s basically like a fancy dictionary/list hybrid
  • For example

df['name']

0    California
1         Burma
2    California

df['name'][1] # 'Burma'
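
A small sketch of accessing a column and applying element-wise operators to it. The DataFrame here is built by hand from two columns of the example rows, and the variable names (names, is_over_one, doubled) are only for illustration.

```python
import pandas as pd

# Two columns from the example rows, built by hand for this sketch.
df = pd.DataFrame({
    'name': ['California', 'Burma', 'California'],
    'magnitude': [1.43, 4.90, 0.06],
})

names = df['name']            # selecting one column gives a Series
print(type(names).__name__)   # Series
print(names[1])               # Burma (looked up by index label 1)

# Element-wise operators apply to every value and return a new Series.
is_over_one = df['magnitude'] > 1   # Series of bools
doubled = df['magnitude'] * 2       # Series of floats

print(is_over_one.tolist())   # [True, True, False]
```

The original column is unchanged by these operators; each expression produces a new Series with the same index.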

7 of 9

Filtering

  • Can use a bool Series to select which rows to keep from the dataset

  • Can use multiple filters with: & (and), | (or), ~ (not)

mask = df['magnitude'] > 5

df[mask]

# Same as: df[df['magnitude'] > 5]

             id  year  month  day    latitude    longitude   name  magnitude
30   us20006i18  2016      7   27  -24.286000   -67.864700  Chile       5.60
114  us20006i35  2016      7   27   36.492200   140.756800  Japan       5.30
421  us1000683b  2016      7   28  -16.824200  -172.515800  Tonga       5.10

df[(df['magnitude'] > 5) & ~(df['day'] == 27)]
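
A runnable sketch of these filters on a small hand-built DataFrame. It uses five of the rows shown on these slides with only three columns, so its index is 0 to 4 rather than the row labels of the full dataset.

```python
import pandas as pd

# Five rows taken from the slides, reduced to three columns.
df = pd.DataFrame({
    'name': ['California', 'Burma', 'Chile', 'Japan', 'Tonga'],
    'day': [27, 27, 27, 27, 28],
    'magnitude': [1.43, 4.90, 5.60, 5.30, 5.10],
})

# A mask is a bool Series; indexing with it keeps the rows marked True.
mask = df['magnitude'] > 5
big = df[mask]
print(big['name'].tolist())         # ['Chile', 'Japan', 'Tonga']

# & (and) combined with ~ (not)
big_not_27 = df[(df['magnitude'] > 5) & ~(df['day'] == 27)]
print(big_not_27['name'].tolist())  # ['Tonga']

# | (or)
either = df[(df['name'] == 'Burma') | (df['magnitude'] > 5.5)]
print(either['name'].tolist())      # ['Burma', 'Chile']
```

Each comparison is wrapped in parentheses because &, |, and ~ bind more tightly than comparison operators such as > and ==.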

8 of 9

Location

How to access data in pandas

Series

series[<indexer>]

DataFrame

df[<indexer>]

df.loc[<row indexer>, <column indexer>]

Options for indexers:

  • A single value
  • A list of values or a slice
  • A mask
  • : to select all values

Remember: with loc, the end of a slice is inclusive, unlike standard Python slicing
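
A minimal sketch of each indexer option with loc, using a hand-built three-row DataFrame based on the example rows. The variable names are only for illustration.

```python
import pandas as pd

# Three columns from the example rows, built by hand for this sketch.
df = pd.DataFrame({
    'name': ['California', 'Burma', 'California'],
    'day': [27, 27, 27],
    'magnitude': [1.43, 4.90, 0.06],
})

# A single value for both row and column gives one element.
single = df.loc[1, 'name']                       # 'Burma'

# Lists of values give a smaller DataFrame.
subset = df.loc[[0, 2], ['name', 'magnitude']]

# Slices with loc include the end label: rows 0 and 1, columns name and day.
sliced = df.loc[0:1, 'name':'day']

# A mask for the rows, a single value for the column, gives a Series.
masked = df.loc[df['magnitude'] > 1, 'name']

# : selects all rows.
all_rows = df.loc[:, 'magnitude']

# Contrast with standard Python slicing, where the end is excluded.
python_slice = [10, 20, 30][0:1]                 # [10]

print(single)
print(sliced)
print(masked.tolist())
```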

9 of 9

Group Work: Best Practices

When you first start working with this group:

  • Introduce yourself!
  • If possible, angle one of your screens so that everyone can discuss together

Tips:

  • Start by making sure everyone agrees to work on the same problem
  • Make sure everyone gets a chance to contribute!
  • Ask if everyone agrees and periodically ask each other questions!
  • Call TAs over for help if you need any!
