1 of 12

Lecture 9

Functions

DATA 8

Spring 2024

2 of 12

Announcements

  • HW 3 due Wednesday at 5pm
  • Lab 4 due Friday
  • Project 1 will be released on Friday
  • HW 1 regrade requests are due Wednesday
  • My office hours are also on Zoom today, 4-6pm

3 of 12

Histograms

4 of 12

Area and Height

Area of bar = % in bin = height × width of bin

  • “How many individuals in the bin?” Use area.

  • “How crowded is the bin?” Use height.

(Demo)
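
The area principle above can be checked with a little arithmetic. This is a minimal sketch in plain Python; the bin edges and percentages are made-up illustration values, not data from the lecture, and `bar_height` is a hypothetical helper name.

```python
# Hypothetical bins: (left edge, right edge, percent of individuals in the bin).
bins = [
    (0, 10, 20.0),
    (10, 20, 50.0),
    (20, 50, 30.0),
]

def bar_height(left, right, percent):
    """Height = area / width, measured in percent per unit."""
    return percent / (right - left)

heights = [bar_height(left, right, pct) for left, right, pct in bins]

for (left, right, pct), h in zip(bins, heights):
    print(f"[{left}, {right}): area = {pct}%, height = {h} percent per unit")
```

In this made-up data, the bin from 20 to 50 holds more individuals (30%) than the bin from 0 to 10 (20%), yet it is less crowded: its height is 1.0 percent per unit versus 2.0.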

5 of 12

Charts Wrap Up

6 of 12

Summary

  • Line graph: sequential data (over time, etc.)
  • Scatter plot: relation between two numerical variables

  • Bar chart: distribution of one categorical variable or relation between a categorical and a numerical variable

  • Histogram: distribution of one numerical variable

(Demo)

7 of 12

Discussion Question

You have data about daily temperatures as shown. Which type of chart would show the answer to each question?

  • Are there more cloudy than sunny days?
  • What percentage of days have a high of at least 72°?
  • Do days with hotter highs tend to have hotter lows?

8 of 12

Defining Functions

9 of 12

Def Statements

User-defined functions give names to blocks of code

def spread(values):
    return max(values) - min(values)

(Demo)

  • Name: spread
  • Argument names (parameters): values
  • Body: return max(values) - min(values)
  • Return expression: max(values) - min(values)
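
Once defined, `spread` is called like any built-in function. A short usage sketch follows; the list `highs` and its values are hypothetical, chosen only for illustration.

```python
def spread(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)

# Hypothetical daily high temperatures.
highs = [68, 72, 75, 61, 70]

temperature_spread = spread(highs)  # the argument `highs` is bound to the parameter `values`
print(temperature_spread)  # 14
```

The name `values` exists only inside the body; outside the function, the caller works with whatever name it passed in.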

10 of 12

Discussion Question

What does this function do? What kind of input does it take? What output will it give? What's a reasonable name?

import numpy as np

def f(s):
    return np.round(s / sum(s) * 100, 2)

(Demo)

11 of 12

Apply

12 of 12

Apply

apply

  1. Calls a function on every element in the input column(s)
  2. Produces an array containing the output of the function on each input column element.
  3. First argument: Function to apply
  4. Other arguments: Specified input column(s)

table_name.apply(function_name, 'column_label(s)')

(Demo)
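
The behavior listed above can be imitated in plain Python to see what happens row by row. This is only a sketch: the standalone `apply` function below is a hypothetical stand-in written for illustration, not the table method itself, and the dictionary `temps`, its values, and the two helper functions are assumptions. It returns a list, where the real method produces an array.

```python
def apply(table, function, *column_labels):
    """Stand-in for the table method: call `function` once per row on the
    values from the named column(s) and collect the results in order."""
    columns = [table[label] for label in column_labels]
    return [function(*row_values) for row_values in zip(*columns)]

# Hypothetical table stored as a dict mapping column label -> list of values.
temps = {
    'High': [72, 65, 80],
    'Low': [55, 50, 61],
}

def to_celsius(fahrenheit):
    """Convert one Fahrenheit temperature to Celsius, rounded to 1 decimal."""
    return round((fahrenheit - 32) * 5 / 9, 1)

def daily_range(high, low):
    """Difference between one day's high and low."""
    return high - low

print(apply(temps, to_celsius, 'High'))          # one input column
print(apply(temps, daily_range, 'High', 'Low'))  # two input columns
```

With an actual table, the corresponding calls would follow the pattern on the slide, for example `temps.apply(daily_range, 'High', 'Low')`: the function comes first and the column labels follow, in the same order as the function's parameters.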