Lecture 5

Building Tables


Fall 2019


Weekly Goals

  • Today:
    • Creating tables from scratch
    • Manipulating columns as arrays

  • Later this week:
    • Table review
    • Visualizing data
    • Working with Census data
    • Distributions



An array contains a sequence of values

  • All elements of an array should have the same type
  • Arithmetic is applied to each element individually
  • Adding arrays adds corresponding elements (if same length!)
  • A column of a table is an array
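A minimal sketch of the array behavior listed above, using NumPy. The variable names and values are made up for illustration.

```python
import numpy as np

# Hypothetical values, chosen only to illustrate element-wise arithmetic.
temps_f = np.array([68.0, 77.0, 86.0])

# Arithmetic is applied to each element individually.
temps_c = (temps_f - 32) * 5 / 9

# Adding two arrays of the same length adds corresponding elements.
a = np.array([1, 2, 3])
b = np.array([10, 20, 30])
total = a + b

print(temps_c)
print(total)
```

Adding arrays of different lengths (for example, length 3 and length 4) generally raises an error, which is why the slide stresses "same length".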



A range is an array of consecutive numbers

  • np.arange(end):
    An array of increasing integers from 0 up to end
  • np.arange(start, end):
    An array of increasing integers from start up to end
  • np.arange(start, end, step):
    A range with step between consecutive values

The range always includes start but excludes end
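The three forms of np.arange can be sketched as follows. The variable names are illustrative only.

```python
import numpy as np

# One argument: start defaults to 0, end is excluded.
first_five = np.arange(5)

# Two arguments: includes start (3), excludes end (8).
three_to_seven = np.arange(3, 8)

# Three arguments: step of 2 between consecutive values.
evens = np.arange(0, 10, 2)

print(first_five)
print(three_to_seven)
print(evens)
```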

Ways to create a table

  • Table.read_table(filename) - reads a table from a CSV file (for example, an exported spreadsheet)
  • Table() - an empty table

  • and… select, where, sort and so on all create new tables


Charles Joseph Minard, 1781-1870

  • French civil engineer who created one of the greatest graphs of all time
  • Visualized Napoleon's 1812 invasion of Russia, including
    • the number of soldiers
    • the direction of the march
    • latitude and longitude
    • temperature on the return journey
    • dates in November and December

Some of Minard’s Data


Discussion Question

Use the table functions we learned last week to find the southernmost city along the army’s retreat.

Table Methods

  • Creating and extending tables:
    • Table().with_column and Table.read_table
  • Finding the size: num_rows and num_columns
  • Referring to columns: labels, relabeling, and indices
    • labels and relabeled; column indices start at 0
  • Accessing data in a column
    • column takes a label or index and returns an array
  • Using array methods to work with data in columns
    • item, sum, min, max, and so on
  • Creating new tables containing some of the original columns:
    • select, drop


Manipulating Rows

  • t.sort(column) sorts the rows in increasing order of the values in column
  • t.take(row_numbers) keeps the numbered rows
    • Each row has an index, starting at 0
  • t.where(column, are.condition) keeps all rows for which a column's value satisfies a condition
  • t.where(column, value) keeps all rows containing a certain value in a column