1 of 24

Representing Information as Data, Part I: Scalar, Itemization, and Partition Data

CS 5010 Program Design Paradigms “Bootcamp”

Lesson 1.3

© Mitchell Wand, 2012-2013

This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.

2 of 24

Learning Objectives for Lessons 1.3-1.4

  • By the time you finish this lesson, you should be able to:
    • name the 5 kinds of information.
    • given a piece of information, determine which of the 5 kinds it is.
    • for each of the 5 kinds of information, choose a data representation for it and write down your choice as a data definition, including the template.

3 of 24

Information Analysis and Data Design

  • Information is what lives in the real world
  • Need to decide what part of that information needs to be represented as data.
  • Need to decide how that information will be represented as data
  • Need to document how to interpret the data as information

Let’s recall what we learned about this step in the last lesson.

4 of 24

Information and Data

[Diagram: representation maps Information to Data; interpretation maps Data back to Information.]

5 of 24

Information Analysis

  • Of all the information in the world, which parts do we need to represent?
  • Choice depends on the application
  • Let’s take a look at some applications you might want to write to help you manage a classroom.

6 of 24

Some of the information about a classroom

  • Location
  • Capacity
  • Schedule
  • Arrangement of blackboards
  • Who’s in it now?
  • What color are their clothes?
  • Current temperature
  • Number of windows
  • Color of seats
  • Status of projector
  • Status of internet connection
  • Number of pieces of chalk in the tray…

7 of 24

Classroom Scheduling Application

  • Location
  • Capacity
  • Schedule
  • Arrangement of blackboards
  • Who’s in it now?
  • What color are their clothes?
  • Current temperature
  • Number of windows
  • Color of seats
  • Status of projector
  • Status of internet connection
  • Number of pieces of chalk in the tray…

If we were writing a classroom scheduling application, what information would we need to represent?

8 of 24

HVAC Application

  • Location
  • Capacity
  • Schedule
  • Arrangement of blackboards
  • Who’s in it now?
  • What color are their clothes?
  • Current temperature
  • Number of windows
  • Color of seats
  • Status of projector
  • Status of internet connection
  • Number of pieces of chalk in the tray…

(Heating, Ventilation, and Air-Conditioning)

If we were writing an application to control the heating, ventilation, and air-conditioning, what information would we need to represent?

9 of 24

Building Maintenance Application

  • Location
  • Capacity
  • Schedule
  • Arrangement of blackboards
  • Who’s in it now?
  • What color are their clothes?
  • Current temperature
  • Number of windows
  • Color of seats
  • Status of projector
  • Status of internet connection
  • Number of pieces of chalk in the tray…

If we were writing a building maintenance application, what information would we need to represent?

10 of 24

Data Design

  • For each kind of information we want to manipulate, we must choose a representation.
  • The data definition documents our choice of representation. It has 3 components:
    • It shows how to construct a value of the right kind.
    • It always shows how to interpret the data as information
    • For structured data, it includes a template that provides an outline for functions that manipulate that data
  • This serves as a reference as we design the rest of our program.

11 of 24

Kinds of Information

  • Before we can choose a representation for our information, we need to know what kind of information we are trying to represent.
  • We will consider 5 kinds of information:
    1. scalar information
    2. itemization information
    3. partition information
    4. compound information
    5. mixed information

We’ll do 1-3 in this lesson, and 4-5 in the next lesson

12 of 24

1. Scalar Information

  • Simple data, e.g., numbers, strings, etc.
  • These are already values in Racket.
  • Racket has lots more kinds of values, but these will be enough for now.
  • For scalar information, there’s no data definition or template.
  • You’ll get to these when you use scalar data as part of more complicated data.

13 of 24

2. Itemization Information

  • Itemization information is information that takes one of a few possible values.
  • The data definition lists the possible values and their interpretation, and the template shows how to write a function that looks at each of the possible values.

14 of 24

Sample Data Definition for Itemization Information

A TLState is one of
-- "red"
-- "yellow"
-- "green"

We use CamelCase for names of kinds of data

Itemization Data was called “Enumeration Data” in HtDP 1/e.

The interpretation is obvious, so it’s not necessary here. We'll get to more complicated examples later.

15 of 24

Video: How to write a template for itemization data

16 of 24

Itemization Data: Example

;; A TLState is one of
;; -- "red"    interp: the traffic light is red
;; -- "yellow" interp: the traffic light is yellow
;; -- "green"  interp: the traffic light is green

;; tls-fn : TLState -> ??
;(define (tls-fn tls)
;  (cond
;    [(string=? tls "red") ...]
;    [(string=? tls "yellow") ...]
;    [(string=? tls "green") ...]))

Write it without comments, then comment it

17 of 24

Templates give us the shape of the program

  • This template gives us the exact form for every function that manipulates values from the data definition.
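
As a sketch of how the template gets filled in, here is one possible function over TLState. The name next-state and the red, green, yellow cycle are assumptions made for illustration; they are not part of the lesson. Note that the body keeps the template's three cond clauses and only replaces each "..." with an answer.

```racket
#lang racket
(require rackunit)

;; A TLState is one of
;; -- "red"
;; -- "yellow"
;; -- "green"

;; next-state : TLState -> TLState
;; GIVEN: a traffic light state
;; RETURNS: the state that follows it, assuming the cycle
;;          red -> green -> yellow -> red (an assumption for this sketch)
(define (next-state tls)
  (cond
    [(string=? tls "red") "green"]
    [(string=? tls "yellow") "red"]
    [(string=? tls "green") "yellow"]))

;; one example per clause of the template
(check-equal? (next-state "red") "green")
(check-equal? (next-state "yellow") "red")
(check-equal? (next-state "green") "yellow")
```

Because the function has one clause per alternative in the data definition, one test per alternative exercises every clause.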

18 of 24

3. Partition Information

  • Sometimes you need to divide up an existing scalar data type into groups.
  • Examples:
    • partition the integers into NegInt, 0, and PosInt
    • partition the integers into NegInt and NonNegInt
    • partition NonNegInt into [0,10000), [10000,20000), and [20000,infinity)
    • partition the strings into strings of length 1 and all others.

19 of 24

Partition Data

  • The data definition must show
    • What kind of data is being partitioned into cases
    • Exactly which values are in each case
    • The interpretation of each case
  • The cases must be
    • Mutually exclusive
    • Exhaustive
    • So each value of the data falls into one and only one case.
  • The template will be similar to one for itemization data, but it will usually have an “else” clause for the “all others” case.
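
One way to see what "mutually exclusive" and "exhaustive" mean is to count how many cases a value falls into: the count should always be exactly 1. The sketch below does this for the partition of the integers into NegInt, 0, and PosInt. The helper names cases-matching and sign-name are hypothetical, chosen only for this illustration.

```racket
#lang racket
(require rackunit)

;; A sign partition of Integer into the following categories:
;; -- NegInt  interp: integers less than 0
;; -- 0       interp: zero
;; -- PosInt  interp: integers greater than 0

;; cases-matching : Integer -> NonNegInt
;; RETURNS: the number of cases of the partition that contain n.
;; For a correct partition this is always 1.
(define (cases-matching n)
  (count (lambda (in-case?) (in-case? n))
         (list negative? zero? positive?)))

;; sign-name : Integer -> String
;; RETURNS: the name of the case that n falls into.
;; The else clause handles the last remaining case (PosInt).
(define (sign-name n)
  (cond
    [(< n 0) "negative"]
    [(= n 0) "zero"]
    [else "positive"]))

(check-equal? (cases-matching -5) 1)
(check-equal? (cases-matching 0) 1)
(check-equal? (cases-matching 7) 1)
(check-equal? (sign-name -5) "negative")
```

If two cases overlapped, some value would give a count of 2; if the cases left a gap, some value would give 0.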

20 of 24

Data Definitions for Partition Data: Example 1

;; A TaxableIncome is a partition of NonNegInt into
;; the following categories:
;; [0,10000)      interp: low income
;; [10000,20000)  interp: medium income
;; all others     interp: high income

;; taxable-income-fn : TaxableIncome -> ??
;; (define (taxable-income-fn income)
;;   (cond
;;     [(< income 10000) ...]
;;     [(and (<= 10000 income)
;;           (< income 20000)) ...]
;;     [else ...]))

Every non-negative integer falls into exactly one of these categories.

The predicates in the cond clauses distinguish between them.
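
A minimal sketch of a function that follows this template is shown here. The name income-category and the returned strings are assumptions for illustration only. The tests sit on the boundaries 10000 and 20000, since those are the places where a partition is easiest to get wrong.

```racket
#lang racket
(require rackunit)

;; A TaxableIncome is a partition of NonNegInt into
;; the following categories:
;; [0,10000)      interp: low income
;; [10000,20000)  interp: medium income
;; all others     interp: high income

;; income-category : TaxableIncome -> String
;; RETURNS: the name of the category that the income falls into
;; (the strings are hypothetical labels for this sketch)
(define (income-category income)
  (cond
    [(< income 10000) "low"]
    [(and (<= 10000 income)
          (< income 20000)) "medium"]
    [else "high"]))

;; examples at and around each boundary
(check-equal? (income-category 0) "low")
(check-equal? (income-category 9999) "low")
(check-equal? (income-category 10000) "medium")
(check-equal? (income-category 19999) "medium")
(check-equal? (income-category 20000) "high")
```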

21 of 24

Data Definition for Partition Data: Example 2

;; A FallingCatKeyEvent is a partition of
;; KeyEvent into the following categories:
;; -- " "                 (interp: pause/unpause)
;; -- any other KeyEvent  (interp: ignore)

;; cat-kev-fn : FallingCatKeyEvent -> ??
;; (define (cat-kev-fn kev)
;;   (cond
;;     [(string=? kev " ") ...]
;;     [else ...]))
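
To illustrate, here is a sketch of a function built from this template. It assumes, purely for the example, that the only state we track is a Boolean saying whether the cat is paused; the name paused-after-key and that Boolean representation are hypothetical and not taken from the lesson. KeyEvents are treated as strings, as in the template.

```racket
#lang racket
(require rackunit)

;; A FallingCatKeyEvent is a partition of
;; KeyEvent into the following categories:
;; -- " "                 (interp: pause/unpause)
;; -- any other KeyEvent  (interp: ignore)

;; paused-after-key : Boolean FallingCatKeyEvent -> Boolean
;; GIVEN: whether the cat is currently paused, and a key event
;; RETURNS: whether the cat is paused after the key event
(define (paused-after-key paused? kev)
  (cond
    [(string=? kev " ") (not paused?)]  ; space toggles pause
    [else paused?]))                    ; every other key is ignored

(check-equal? (paused-after-key #false " ") #true)
(check-equal? (paused-after-key #true " ") #false)
(check-equal? (paused-after-key #true "q") #true)
```

The else clause is what makes the cases exhaustive: any KeyEvent other than " " lands there.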

22 of 24

Itemization Data vs. Partition Data: how to tell the difference

  • Partition data will always have an “all others” clause.
  • Itemization data never has an “all others” clause.

23 of 24

Summary

  • Information lives in the world, data lives in the computer
  • Choice of information to be represented depends on the application
  • Represent information as data; interpret data as information
  • Data definition includes representation, interpretation, and template.
  • Kinds of data
    • scalar, itemization, partition, compound, mixed
    • templates for each kind of data

Compound and mixed data are to be covered in the next lesson.

24 of 24

Next Steps

  • Ask Questions on the Discussion Board
  • Go on to Lesson 1.4. There we will learn about compound and mixed information.