Lecture 3
Data Tables, Indexes, pandas
DS 100
Fall 2017
Slides created by Sam Lau (samlau95@berkeley.edu)
Announcements
Last Time...
Can Big Data Compensate for Not Having an SRS?
Chance that 1st-born in DS100
Me
Where we are
Data Science Lifecycle
Today: pandas
How this lecture will work
What you will learn
You won’t remember everything, but...
Getting the data
(Demo)
Question 1:
What was the most popular name in CA last year?
Always have high-level steps
In pandas
(Demo)
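A minimal sketch of how the high-level steps for Question 1 might look in pandas. The table `baby`, its column names (`State`, `Year`, `Name`, `Count`), and every row in it are made up for illustration; they are assumptions, not the dataset used in the demo.

```python
import pandas as pd

# Toy stand-in for the baby names table. Rows and column names are
# hypothetical, chosen only to make the steps concrete.
baby = pd.DataFrame({
    "State": ["CA", "CA", "CA", "CA", "NY", "NY"],
    "Year":  [2015, 2016, 2016, 2016, 2016, 2016],
    "Name":  ["Ava", "Ava", "Noah", "Mia", "Ava", "Liam"],
    "Count": [10, 12, 30, 25, 40, 35],
})

# Step 1: keep only the CA rows from the most recent year in the table.
latest = baby["Year"].max()
ca_latest = baby[(baby["State"] == "CA") & (baby["Year"] == latest)]

# Step 2: sort those rows by count, largest first.
ranked = ca_latest.sort_values("Count", ascending=False)

# Step 3: read the name off the top row.
top_name = ranked.iloc[0]["Name"]
print(top_name)
```

Writing the steps as comments first, then filling in one pandas expression per step, keeps the code close to the plan.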
Recap
Question 2:
What were the most popular names in each state for each year?
Break it down
(Demo)
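One possible way to answer Question 2 is to group by state and year, then pick the row with the largest count in each group. This is a sketch on a made-up table; the column names and rows are assumptions, and the demo may take a different route.

```python
import pandas as pd

# Hypothetical toy table; not the real baby names data.
baby = pd.DataFrame({
    "State": ["CA", "CA", "CA", "CA", "NY", "NY", "NY"],
    "Year":  [2015, 2015, 2016, 2016, 2015, 2016, 2016],
    "Name":  ["Ava", "Noah", "Ava", "Noah", "Liam", "Ava", "Liam"],
    "Count": [10, 8, 12, 30, 20, 40, 35],
})

# For each (State, Year) group, find the row label of the largest Count.
top_idx = baby.groupby(["State", "Year"])["Count"].idxmax()

# Use those row labels to pull the full rows out of the original table.
top_names = baby.loc[top_idx].reset_index(drop=True)
print(top_names)
```

Grouping fits here because the same computation (find the maximum) is repeated once per state and year combination.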
Recap
When do I need to group?
Question 3:
Can I deduce gender from the last letter of a person’s name?
Survey Question
Which last letter is most indicative of a person’s gender?
Break it down
(Demo)
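A sketch of how a pivot could help with Question 3: put last letters on the rows, sexes on the columns, and summed counts in the cells. The table, the `Sex` and `Count` column names, and the helper columns `Last` and `PropF` are all hypothetical.

```python
import pandas as pd

# Hypothetical toy table; not the real baby names data.
baby = pd.DataFrame({
    "Name":  ["Ava", "Mia", "Noah", "Liam", "Emma", "Ezra"],
    "Sex":   ["F", "F", "M", "M", "F", "M"],
    "Count": [40, 25, 30, 35, 50, 5],
})

# Extract the last letter of each name into a new column.
baby["Last"] = baby["Name"].str[-1]

# Pivot: one row per last letter, one column per sex, cells hold total counts.
letters = baby.pivot_table(
    index="Last", columns="Sex", values="Count",
    aggfunc="sum", fill_value=0,
)

# Proportion of babies with each last letter who were recorded as female.
letters["PropF"] = letters["F"] / (letters["F"] + letters["M"])
print(letters)
```

A letter whose `PropF` is close to 0 or 1 would be a strong indicator in this toy table; values near 0.5 would tell us little.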
Recap
When do I need to pivot?
Seaborn
sns.pairplot(df, hue="species")
How to Seaborn
(Demo)
Recap
Use the docs!
And Google.