CSE 163

Memory

Hunter Schafer

💬 Before class: What is your favorite breakfast?

🎵Music: Japanese Breakfast

Memory

Just a big array/list to store values

  • Names of these regions are not important for us

Objects

  • Data and objects are just chunks of memory
  • When you construct an object, the computer gives you an appropriately sized chunk of memory
    • The fields are just specific locations within a chunk to store data
  • This is why we need the difference between equality of value and equality of identity
    • Equal value: memory chunks store the same values
    • Equal identity: same chunk of memory
  • By creating objects, you are making your program use more memory
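
A minimal Python sketch of the two kinds of equality, using small lists chosen purely for illustration. `==` compares the stored values, while `is` checks whether two names refer to the same chunk of memory.

```python
import sys

# Two separately constructed lists: each gets its own chunk of memory.
a = [1, 2, 3]
b = [1, 2, 3]
c = a  # no new object; c refers to the same chunk as a

print(a == b)  # True: equal value
print(a is b)  # False: different identity
print(a is c)  # True: same identity

# Mutating through one reference is visible through the other.
c.append(4)
print(a)  # [1, 2, 3, 4]

# Bigger objects need bigger chunks (exact byte counts vary by Python version).
print(sys.getsizeof([]) < sys.getsizeof(list(range(1000))))  # True
```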

Memory Hierarchy

Disk Drive

Takeaways

  • Understanding how memory works can help you write more efficient programs (or understand some strange bugs)
  • Key takeaways
    • Do follow principles of locality
    • Avoid reading/writing to disk frequently
    • Watch out for large datasets! Paging can slow down your program a TON.
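
As a rough illustration of the "avoid reading/writing to disk frequently" takeaway, the sketch below writes the same lines to a file in two ways. The function names and the temporary files are hypothetical, and nothing is timed here; the point is only the access pattern. The first version reopens the file for every line, while the second builds the text in memory and touches the disk once.

```python
import os
import tempfile


def write_reopening(path, lines):
    """Opens and closes the file once per line (many separate disk operations)."""
    open(path, 'w').close()  # start from an empty file
    for line in lines:
        with open(path, 'a') as f:
            f.write(line + '\n')


def write_once(path, lines):
    """Builds the text in memory and writes it with a single open."""
    with open(path, 'w') as f:
        f.write(''.join(line + '\n' for line in lines))


lines = [f'row {i}' for i in range(1000)]
with tempfile.TemporaryDirectory() as folder:
    slow_path = os.path.join(folder, 'slow.txt')
    fast_path = os.path.join(folder, 'fast.txt')
    write_reopening(slow_path, lines)
    write_once(fast_path, lines)
    with open(slow_path) as f1, open(fast_path) as f2:
        same_contents = f1.read() == f2.read()

print(same_contents)  # True: same result, far fewer file opens in write_once
```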

Mid-Quarter Review

Overview

What is this class?

Competencies

  1. More advanced programming concepts than in CSE 142 or CSE 160 including how to write bigger programs with multiple classes and modules.
  2. How to work with different types of data: tabular, text, images, geo-spatial, etc.
  3. Ecosystem of data science tools including Jupyter Notebook and various data science libraries including scikit-image, scikit-learn, and pandas data frames.
  4. Basic concepts related to code complexity, efficiency of different types of data structures, and memory management.

Quarter So Far

Competencies: Program | Data | Data Sci. | Comp. Sci.

  • M0 - Introduction to Python (Program)
  • M1 - Advanced Python + Data Structures + CSVs (Program | Data)
  • M2 - Pandas (Data | Data Sci.)
  • M3 - Data Science Libraries (Data Viz + ML) (Data Sci.)
  • M4 - Classes, Objects, and Modules (Program | Data)
  • M5 - Efficiency (Comp. Sci.)

0 - Intro to Python

Summary

Learn the basics of how to write Python programs

Topics: Program

  • Hello world!
  • Variables and Data Types
  • Control Structures
  • Strings (slices)
  • Lists
  • Documentation
  • None

1 - Python cont. + Data Structures

Summary

Learn more advanced Python concepts including how to represent data in data structures. Process our first type of data: CSV files.

Topics: Program | Data

  • Files
  • Advanced Lists
  • Python Modes (REPL vs script)
  • Jupyter Notebooks (Data Sci.)
  • Tuple
  • Set
  • Dictionary
  • CSVs
  • List of Dictionaries format

[
    {'name': 'Seattle', 'magnitude': 4},
    {'name': 'Genovia', 'magnitude': 6},
    {'name': 'Seattle', 'magnitude': 3.5}
]
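
One way to get from a CSV to the list of dictionaries format is the standard library's `csv.DictReader`. The sketch below is an illustration rather than the course's own code: the CSV text is a stand-in for a file on disk, with column names that mirror the example above. Note that every magnitude is converted to a float, so 4 becomes 4.0.

```python
import csv
import io

# Stand-in for a CSV file on disk; the column names mirror the example above.
csv_text = """name,magnitude
Seattle,4
Genovia,6
Seattle,3.5
"""

# csv.DictReader yields one dictionary per row, keyed by the header line.
rows = []
for row in csv.DictReader(io.StringIO(csv_text)):
    rows.append({'name': row['name'], 'magnitude': float(row['magnitude'])})

# Largest magnitude recorded for each place name.
max_by_name = {}
for row in rows:
    name = row['name']
    if name not in max_by_name or row['magnitude'] > max_by_name[name]:
        max_by_name[name] = row['magnitude']

print(max_by_name)  # {'Seattle': 4.0, 'Genovia': 6.0}
```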

2 - Pandas

Summary

Learn our first data science library, pandas, to help process CSVs.

Topics: Data | Data Sci.

  • Pandas objects: DataFrame and Series
  • Access Data
    • Access column
    • Filter with a mask
    • .loc
  • Aggregates (mean, sum, count)
  • groupby
  • apply
  • Missing Data
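
The topics above can be sketched on a tiny DataFrame. The data is made up for illustration (one magnitude is deliberately missing), and the variable names are not from the course materials.

```python
import pandas as pd

# Small hypothetical dataset; the None marks a missing magnitude.
df = pd.DataFrame({
    'name': ['Seattle', 'Genovia', 'Seattle', 'Genovia'],
    'magnitude': [4.0, 6.0, 3.5, None],
})

# Access a column (a Series) and filter rows with a boolean mask.
magnitudes = df['magnitude']
strong = df[magnitudes >= 4]

# .loc selects rows by mask and columns by label in one step.
strong_names = df.loc[magnitudes >= 4, 'name']

# Aggregates skip missing values by default.
mean_magnitude = magnitudes.mean()

# groupby computes an aggregate per group.
max_by_name = df.groupby('name')['magnitude'].max()

# apply runs a function on every value of a Series.
name_lengths = df['name'].apply(len)

# Missing data: count the gaps, then drop the rows that contain them.
missing_count = int(magnitudes.isna().sum())
complete = df.dropna()

print(list(strong_names))            # ['Seattle', 'Genovia']
print(mean_magnitude)                # 4.5
print(max_by_name['Seattle'])        # 4.0
print(missing_count, len(complete))  # 1 3
```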

3 - Data Science Libraries

Summary

Learn other common libraries for data science, including the principles of data visualization (seaborn) and machine learning (scikit-learn).

Topics: Data Sci.

  • Make plots in seaborn
  • Point of visualization
  • Effectiveness of encodings
  • ML vocab
    • Features
    • Model
    • Learning Algorithm
    • Training vs test set
    • Classification vs Regression

  • Categorical data
  • Decision Trees
  • Overfitting
  • Hyperparameter tuning
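
A small scikit-learn sketch tying the ML vocabulary together. The dataset is invented for illustration (one feature, hours studied, and a pass/fail label), and `max_depth=2` is an arbitrary choice of hyperparameter, not a recommended value.

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Made-up data: one feature (hours studied); the label is pass (1) or fail (0).
features = [[h] for h in range(20)]
labels = [0 if h < 10 else 1 for h in range(20)]

# Hold out a test set so the model is judged on data it never saw.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0)

# max_depth is a hyperparameter: smaller values limit how closely
# the tree can fit (and potentially overfit) the training set.
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(X_train, y_train)

train_accuracy = model.score(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(train_accuracy, test_accuracy)
```

Because the labels are categories, this is classification; predicting a number (such as an exam score) would be regression. A large gap between training and test accuracy is one sign of overfitting.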

4 - Classes, Objects, and Modules

Summary

A new way of organizing our code into objects with state and behavior.

Topics: Program | Data

  • State (fields) and behavior (methods)
  • References
  • Define a class
  • Construct an instance/object
  • Value equality vs Identity equality
  • “Special methods” (__init__, __eq__, etc.)
  • Why the main-method pattern is necessary
  • Inverted index
  • TF-IDF
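
A compact sketch of several of these topics in one file. The `Earthquake` class is hypothetical and exists only to show fields, a method, `__init__`, `__eq__`, and the main-method pattern together.

```python
class Earthquake:
    """Hypothetical class: state lives in fields, behavior in methods."""

    def __init__(self, name, magnitude):
        # __init__ runs when an instance is constructed and sets up its fields.
        self._name = name
        self._magnitude = magnitude

    def is_stronger_than(self, other):
        """Behavior: compare this earthquake's magnitude with another's."""
        return self._magnitude > other._magnitude

    def __eq__(self, other):
        # Defines value equality (==); identity (is) is unaffected.
        return (isinstance(other, Earthquake)
                and self._name == other._name
                and self._magnitude == other._magnitude)


def main():
    first = Earthquake('Seattle', 4)
    second = Earthquake('Seattle', 4)
    alias = first  # a second reference to the same object
    print(first == second)  # True: same field values
    print(first is second)  # False: two separate objects
    print(first is alias)   # True: one object, two references


# Runs main() only when this file is executed directly, not when it is imported.
if __name__ == '__main__':
    main()
```

Without the `if __name__ == '__main__':` guard, importing this file from another module would also run `main()`, which is usually not what the importing code wants.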

5 - Efficiency

Summary

How to understand the efficiency of our programs.

Topics: Comp. Sci.

  • Limitations of Timing
  • Big-O notation
  • Complexity classes
  • Profiling
  • Memory management
    • How computer memory works
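
To make the complexity classes concrete, here is a sketch of one problem (does a list contain a duplicate?) solved two ways. Both function names are made up for this example. The answers are identical; only the number of steps grows differently as the input gets larger.

```python
def has_duplicates_quadratic(values):
    """Compares every pair of values: about n * n / 2 steps, so O(n^2)."""
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] == values[j]:
                return True
    return False


def has_duplicates_linear(values):
    """Tracks seen values in a set: one pass over the data, so O(n) on average."""
    seen = set()
    for value in values:
        if value in seen:
            return True
        seen.add(value)
    return False


unique = list(range(1000))
repeated = unique + [0]
print(has_duplicates_quadratic(unique), has_duplicates_linear(unique))      # False False
print(has_duplicates_quadratic(repeated), has_duplicates_linear(repeated))  # True True
```

The linear version trades memory for time: the `seen` set stores up to n extra values.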

Requests

Internet

Allows computers to talk to each other to communicate data

One way to use the internet is to use a browser (e.g. Google Chrome) to browse web-pages.

Everything is determined by a URL that identifies

  • Which computers to talk to
  • Which resource you want that computer to give you
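
The two parts of a URL can be pulled apart with the standard library's `urllib.parse`. The URL used here is the Open-Notify address that appears elsewhere in these slides; it serves only to illustrate URL structure.

```python
from urllib.parse import urlparse

# An example API address (used here only to illustrate URL structure).
url = 'http://api.open-notify.org/iss-now.json'
parts = urlparse(url)

print(parts.scheme)  # http
print(parts.netloc)  # api.open-notify.org  (which computer to talk to)
print(parts.path)    # /iss-now.json        (which resource to ask for)
```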

Resources

In most cases, a URL specifies either

  • A file that we want to access
  • A special command to the particular service to return some meaningful information. This is called an Application programming interface (API).
    • Example: http://api.open-notify.org/iss-now.json
    • There are many types of APIs that return different things depending on what you ask, but it is a useful way for one computer to get data from another

API Demo

  • requests module
  • GET request
    • Response codes
    • Response data
  • JSON data

For this example, we used the Open-Notify API
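
A sketch of what such a demo might look like, not a transcript of the code shown in class. The helper names are hypothetical, and the JSON field names (`iss_position`, `latitude`, `longitude`) are an assumption about the shape of the Open-Notify response. The network call is wrapped in a function and not executed here, so the example also runs offline.

```python
import json


def fetch_iss_position(url='http://api.open-notify.org/iss-now.json'):
    """Sends a GET request and returns (latitude, longitude) as floats."""
    import requests  # third-party: pip install requests

    response = requests.get(url, timeout=10)
    # Response code: 200 means success; raise_for_status() raises an
    # exception for 4xx and 5xx codes.
    response.raise_for_status()
    # Response data: response.json() parses the JSON body into Python objects.
    return parse_position(response.json())


def parse_position(data):
    """Pulls the coordinates out of the parsed JSON (field names are assumed)."""
    position = data['iss_position']
    return float(position['latitude']), float(position['longitude'])


# Offline example with a hand-written body in the assumed response shape.
sample_body = '{"message": "success", "iss_position": {"latitude": "47.6", "longitude": "-122.3"}}'
print(parse_position(json.loads(sample_body)))  # (47.6, -122.3)

# With network access: latitude, longitude = fetch_iss_position()
```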

HTML

A website is actually just text that’s in a very special format: HTML

Basic idea:

  • Use “tags” around parts of the text to indicate meaning
  • Example tags
    • Paragraphs
    • Images
    • Links
    • Etc.

HTML Tags

Generally, we use an open tag and a close tag to indicate that everything inside belongs to that type of data.

  • Tags can have attributes (src, href)
  • Some tags are “self-closing” like img

Wikipedia > View Source

<p>This is a paragraph.</p>

Here is an image: <img src="dog.jpg">

<p>
  This is another paragraph but it has a
  <a href="http://www.puppies.com">link</a>!
</p>
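
Because HTML is just tagged text, a program can walk through the tags and read their attributes. The sketch below uses Python's built-in `html.parser` on the snippet above; `TagCollector` is a hypothetical helper written for this example, and other parsing libraries would work as well.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper that records tag names and src/href attribute values."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        self.tags.append(tag)
        for name, value in attrs:
            if tag == 'a' and name == 'href':
                self.links.append(value)
            elif tag == 'img' and name == 'src':
                self.images.append(value)


page = '''
<p>This is a paragraph.</p>
Here is an image: <img src="dog.jpg">
<p>
This is another paragraph but it has a
<a href="http://www.puppies.com">link</a>!
</p>
'''

collector = TagCollector()
collector.feed(page)
print(collector.tags)    # ['p', 'img', 'p', 'a']
print(collector.links)   # ['http://www.puppies.com']
print(collector.images)  # ['dog.jpg']
```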