1 of 80

2 of 80

Unit 2

Digital Information

Welcome!

3 of 80

Unit 2 Lesson 1 (U2L1)

For the Students

  • Bytes and File Sizes - Activity Guide (PDF | DOCX)

Post in Classroom:

4 of 80

Unit 2 Lesson 1 (U2L1)

Bytes and File Sizes

Please Complete the Quick Check-In before the bell rings.

5 of 80

Unit 2 Lesson 1 (U2L1)

Bytes and File Sizes

As we embark on a new unit about Data and Digital Information we need to get familiar with terminology about data and different types of data files.

Byte: 8 bits of data

Recall that a single character of ASCII text requires 8 bits.

A byte is the standard fundamental unit (or “chunk size”) underlying most computing systems today. You may have heard of the "kilobyte", "megabyte", "gigabyte", etc., which are all different amounts of bytes. We're going to learn more about them today.

6 of 80

Unit 2 Lesson 1 (U2L1)

Bytes and File Sizes

Compare sizes of plain text v. MS Word doc

Recall: in a previous lesson (Unit 1 - Sending Formatted Text) we learned that in addition to the actual text of a document, it is usually necessary to store the formatting information that allows the text to be displayed correctly.

We might wonder just how much extra information, i.e. how many extra bytes, we need to store when we include all of this formatting. Let's find out!

7 of 80

Unit 2 Lesson 1 (U2L1)

Bytes and File Sizes

If a single ASCII character is one byte, then storing the word “hello” in a plain ASCII text file on a computer should require 5 bytes (or 40 bits) of memory.
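The arithmetic above can be checked with a minimal Python sketch. The variable names are illustrative, and the sketch assumes one 8-bit byte per ASCII character, as stated on the earlier slide.

```python
# Count the bytes and bits needed to store a word as plain ASCII text.
# Assumption: each ASCII character is stored in one 8-bit byte.
BITS_PER_BYTE = 8

word = "hello"
ascii_bytes = word.encode("ascii")   # one byte per character
num_bytes = len(ascii_bytes)
num_bits = num_bytes * BITS_PER_BYTE

print(num_bytes, "bytes")  # 5 bytes
print(num_bits, "bits")    # 40 bits
```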

What about a Microsoft Word document that contains the single word "hello"?

In your INB, Predict: "How many more bytes will a Word document require to store the word “hello” than a plain text document?"

Teacher: Use Notepad and Word to demo.

8 of 80

Unit 2 Lesson 1 (U2L1)

Bytes and File Sizes

Text Document

Word Document

9 of 80

Unit 2 Lesson 1 (U2L1)

Bytes and File Sizes

Why is there such a difference?

Text Document

Word Document

Look back at predictions to see how close they were.

  • The big difference in file size between .txt and .docx is due to the extensive formatting information included along with the actual text in .docx.

10 of 80

Unit 2 Lesson 1 (U2L1)

Bytes and File Sizes

Rapid Research: Bytes and File Sizes

Work with your partner/group to complete the Activity.

Links can be found in Lesson 1 on the “Bytes and File Sizes” Activity Guide.

Timer on next slide.

11 of 80

Unit 2 Lesson 1 (U2L1)

Bytes and File Sizes

Rapid Research: Bytes and File Sizes

Work with your partner/group to complete the Activity.

Links can be found in Lesson 1 on the “Bytes and File Sizes” Activity Guide.

12 of 80

Unit 2 Lesson 1 (U2L1)

Bytes and File Sizes

Wrap-Up:

As you have seen, data files can grow very quickly in size. In the modern world there is a lot of data around us, and usually we want it transmitted over the Internet.

There is a problem though: if you want to transmit a lot of data, you are limited by the speed of your Internet connection. Even if you have a fast Internet connection, there is a physical limit to how fast you can transmit bits.

What if the data you want to send is big enough that it takes an unreasonable amount of time to transmit, even with a really fast Internet connection? Assuming you can't make the Internet connection any faster, could you still transmit the data faster somehow?

The answer is yes and it's probably something you've done, or do every day!

13 of 80

Unit 2 Lesson 1 (U2L1)

Bytes and File Sizes

Complete the Check Your Understanding.

14 of 80

Unit 2 Lesson 2 (U2L2a)

Day 1 of 2

For the Students

  • Decode this message - Activity Guide (PDF | DOCX)

15 of 80

Unit 2 Lesson 2 (U2L2a)

Text Compression

Warm up: Abbr In Ur Txt Msgs (5-7 mins)

When you send text messages to a friend, do you spell every word correctly?

Do you use abbreviations for common words?

In your INB, write some examples of things you might see in a text message that are not proper English.

Why do you use these abbreviations?

What is the benefit?

16 of 80

Unit 2 Lesson 2 (U2L2a)

Text Compression

What's this about? - Compression: Same Data, Fewer Bits

  • Today's class is about compression
  • When you abbreviate or use coded language to shorten the original text, you are “compressing text.” Computers do this too, in order to save time and space.
  • The art and science of compression is about figuring out how to represent the SAME DATA with FEWER BITS.
  • Why is this important?
    • One reason is that storage space is limited and you'd always prefer to use fewer bits if you could. A much more compelling reason is that there is an upper limit to how fast bits can be transmitted over the Internet.
  • What if we need to send a large amount of text faster over the Internet, but we’ve reached the physical limit of how fast we can send bits? Our only choice is to somehow capture the same information with fewer bits; we call this compression.
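To make the point about transmission speed concrete, here is a small Python sketch. The link speed and file sizes are made-up numbers chosen only for illustration, and `transmit_seconds` is a hypothetical helper name.

```python
def transmit_seconds(size_bytes, bits_per_second):
    """Time needed to send size_bytes over a link of the given speed."""
    return (size_bytes * 8) / bits_per_second


# Hypothetical numbers: a link that carries 1,000,000 bits per second,
# a 10,000-byte text, and a compressed version that is 6,000 bytes.
link_speed = 1_000_000
original_time = transmit_seconds(10_000, link_speed)
compressed_time = transmit_seconds(6_000, link_speed)

print(original_time)    # seconds for the original text
print(compressed_time)  # seconds for the compressed text
```

The connection is no faster in the second case; the data arrives sooner only because there are fewer bits to send.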

17 of 80

Unit 2 Lesson 2 (U2L2a)

Text Compression

Pass out Activity Guide:

Decode this message!

In pairs, decode the Encoded Message:

When you are finished, share your answer with your teacher.

Count the TOTAL number of characters needed for the Original Message (don’t forget to count the spaces).

18 of 80

Unit 2 Lesson 2 (U2L2a)

Text Compression

The Original Message is….

19 of 80

Unit 2 Lesson 2 (U2L2a)

Text Compression

Recap: How much was it compressed?

How many characters are in the Key?

How many characters are in the compressed poem?

The total number of characters would be: 31 + 25 = 56

If you only sent the compressed poem, would the receiver be able to decode the message? No

How many fewer characters is the compressed message? 93 - 56 = 37

Compression: 37/93 ≈ 40%
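The recap arithmetic can be reproduced with a short Python sketch. The numbers 93, 31, and 25 come from the slide; the variable names are illustrative.

```python
# Numbers taken from the recap slide.
original_chars = 93            # characters in the Original Message
key_plus_compressed = 31 + 25  # Key and compressed poem together

saved = original_chars - key_plus_compressed
compression = saved / original_chars

print(key_plus_compressed)   # 56
print(saved)                 # 37
print(f"{compression:.0%}")  # 40%
```

The Key has to be counted because the receiver cannot decode the compressed poem without it.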

20 of 80

Unit 2 Lesson 2 (U2L2a)

Text Compression

Text Compression Widget with Aloe Blacc

(5:51)

21 of 80

Unit 2 Lesson 2 (U2L2a)

Text Compression

Go to Lesson 2, and start the Widget: Text Compression

Your challenge is to have the highest level of Compression that you can for your assigned text.

Try to develop a general strategy that will lead to a good compression.

Compare your results and strategy with your group.

22 of 80

Unit 2 Lesson 2 (U2L2a)

Text Compression

  • What makes doing this compression hard?
    • You can start in lots of different ways. Early choices affect later ones. Once you find one set of patterns, others emerge.
    • There is a tipping point: you might be making progress compressing, but at some point the scale tips and the dictionary starts to get so big that you lose the benefit of having it. But then you might start re-thinking the dictionary to tweak some bits out.
  • Do we think that these compression amounts that we’ve found are the best? Is there a way to know what the best compression is?
    • We probably don’t know what’s best.
    • There are so many possibilities it’s hard to know. It turns out the only way to guarantee perfect compression is brute force. This means trying every possible set of substitutions. Even for small texts this will take far too long. The “best” is really just the best we’ve found so far.
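The "tipping point" described above can be seen in a minimal Python sketch of dictionary substitution. This is not the widget's actual implementation: the sample text, the symbols, and the cost rule (one character for the symbol plus the characters of the phrase) are assumptions made for illustration, and the symbols must not already appear in the text.

```python
def compress(text, dictionary):
    """Replace each phrase with its symbol. dictionary maps symbol -> phrase."""
    for symbol, phrase in dictionary.items():
        text = text.replace(phrase, symbol)
    return text


def decompress(text, dictionary):
    """Undo the substitutions in reverse order."""
    for symbol, phrase in reversed(list(dictionary.items())):
        text = text.replace(symbol, phrase)
    return text


def total_size(compressed, dictionary):
    """Compressed text plus the dictionary needed to decode it."""
    entries = sum(len(s) + len(p) for s, p in dictionary.items())
    return len(compressed) + entries


text = "row_row_row_your_boat_row_row_row_your_boat"  # made-up sample

good = {"*": "row_", "#": "your_boat"}
wasteful = {"*": "row_", "#": "your_boat", "@": "_"}  # "@" saves nothing

print(len(text))                                     # 43
print(total_size(compress(text, good), good))        # 24
print(total_size(compress(text, wasteful), wasteful))  # 26
```

The third entry makes the dictionary bigger without making the text any shorter, so the total goes up instead of down.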

23 of 80

Unit 2 Lesson 2 (U2L2b)

Day 2 of 2

For the Students

24 of 80

Unit 2 Lesson 2 (U2L2b)

Text Compression

25 of 80

Unit 2 Lesson 2 (U2L2b)

Text Compression

Vocabulary

Heuristic: a problem solving approach (typically an algorithm) to find a satisfactory solution where finding an optimal or exact solution is impractical or impossible. (rule of thumb)

Algorithm: A list of steps that you can follow to finish a task

Demo video (not great)

26 of 80

Unit 2 Lesson 2 (U2L2b)

Text Compression

Go to Lesson 2, and start the Widget: Text Compression, pick a different Poem/Song.

As you do so, develop a set of rules, or a “heuristic” that generally seems to provide good results.

Record your heuristic (rule of thumb) as a list of steps that someone else unfamiliar with the problem could follow and still end up with decent compression.

Trade your heuristics with another group. Are they clear and specific enough that you always know what to do? If not, provide feedback to one another and improve your heuristics to provide clearer instructions.

Using another group’s heuristic, attempt to compress one or more of the poems in the tool. Record the amount of compression you achieve in your INB.

27 of 80

Unit 2 Lesson 2 (U2L2b)

Text Compression

Do you think it’s possible to describe (or write) a specific set of instructions that a person could follow that would always result in better text compression than your heuristic? Why or why not?

Is there a way to know that a compressed piece of text is compressed the most possible? If yes, describe how you could determine it. If no, why not?

28 of 80

Unit 2 Lesson 2 (U2L2b)

Text Compression

Recap Questions

What did all groups’ processes for compression have in common?

  • Pattern Recognition
  • Abstraction (patterns referring to other patterns)

Will following this process always lead to the same compression? (i.e. two people following the process for the same poem, will result in the same compression?)

  • No. It’s imprecise, but still OK. The text still gets compressed, no matter what.
  • Since there is no way to know what’s best, all we need is a process that comes up with some solution, and a way to make progress.
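One possible heuristic, written as code, is sketched below in Python. It is only an example of a rule of thumb ("always substitute the repeated phrase that saves the most characters right now"), not the best strategy and not how the widget works. The function names, the symbol list, the phrase-length limits, and the sample text are all assumptions.

```python
def best_phrase(text, min_len=2, max_len=12):
    """Return the repeated phrase with the largest net saving, or None."""
    best, best_saving = None, 0
    for length in range(min_len, max_len + 1):
        for start in range(len(text) - length + 1):
            phrase = text[start:start + length]
            count = text.count(phrase)  # non-overlapping occurrences
            # Each occurrence shrinks to 1 symbol; the dictionary entry
            # costs the phrase itself plus its symbol.
            saving = count * (length - 1) - (length + 1)
            if saving > best_saving:
                best, best_saving = phrase, saving
    return best


def greedy_compress(text, symbols="*#@%&"):
    """Apply the rule of thumb once per available symbol."""
    dictionary = {}
    for symbol in symbols:
        phrase = best_phrase(text)
        if phrase is None:  # nothing left that saves characters
            break
        dictionary[symbol] = phrase
        text = text.replace(phrase, symbol)
    return text, dictionary


def decompress(text, dictionary):
    """Expand symbols in reverse order so nested patterns are restored."""
    for symbol, phrase in reversed(list(dictionary.items())):
        text = text.replace(symbol, phrase)
    return text


sample = "row_row_row_your_boat_row_row_row_your_boat"  # made-up sample
compressed, dictionary = greedy_compress(sample)
total = len(compressed) + sum(len(s) + len(p) for s, p in dictionary.items())
print(compressed, dictionary)
print(total, "characters instead of", len(sample))
```

Because later phrases may contain earlier symbols, this heuristic also produces patterns that refer to other patterns. A different rule of thumb would likely give a different result on the same text.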

29 of 80

Unit 2 Lesson 2 (U2L2b)

Text Compression

Valhalla vs. El Cajon Valley Compression Competition

30 of 80

Unit 2 Lesson 2 (U2L2b)

Text Compression

What is the difference between lossless compression and lossy compression?

Define heuristic in your own words.

Complete the Check Your Understanding

31 of 80

For the Students

32 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

Develop an Encoding Scheme (10 mins)

Look at the simple black-and-white image below.

With your partner, discuss how you might encode images like these in binary.

Record your ideas for your encoding in your INB.

As you develop your encoding consider:

  • What information will need to be included to reconstruct the image?
  • How will that information be represented in binary?
  • Try out your encoding with the sample “A”. Is there any aspect of your encoding that is ambiguous or could lead to improperly structured images? Could you include more information in your encoding to solve this problem?

33 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

Now consider these larger images.

Does your encoding scheme still work?

Why or why not?

What would you have to change to make it work?

34 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

How have you encoded the white and black portions of your image? What do 0 and 1 stand for in your encoding?

Are your encodings flexible enough to accommodate images of any size? How do they accomplish this?

Is your encoding intuitive and easy to use?

Is your encoding efficient?

35 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

Vocabulary

Pixel: short for "picture element", the fundamental unit of a digital image, typically a tiny square or dot that contains a single point of color of a larger image.

Where did the word pixel come from? It turns out that originally the dots were referred to as "picture elements"; that got shortened to "pict-el" and eventually "pixel".

What we've discovered is that the data for our image file must contain more than just a 0 or 1 for every pixel. It must contain other data that describes the pixel data. This is called metadata. In this case the metadata encodes the width and height of the image.

Metadata: data that describes other data.

We've seen metadata before: an Internet packet. The packet contains the data that needs to be sent, but also other data like the to and from addresses and the packet number.
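A minimal Python sketch of an image encoding with metadata follows. It assumes 8 bits for the width followed by 8 bits for the height, then one bit per pixel; the field order and the function names are assumptions made for illustration.

```python
def encode_image(rows):
    """rows: list of equal-length strings of '0'/'1', one string per pixel row."""
    height = len(rows)
    width = len(rows[0])
    # Metadata: 8 bits of width, then 8 bits of height (assumed order).
    header = format(width, "08b") + format(height, "08b")
    return header + "".join(rows)


def decode_image(bits):
    """Read the metadata first, then use it to split the pixel data into rows."""
    width = int(bits[0:8], 2)
    height = int(bits[8:16], 2)
    pixels = bits[16:16 + width * height]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


image = ["0110",
         "1001"]            # a tiny 4 x 2 example image
encoded = encode_image(image)
print(encoded)              # 000001000000001001101001
print(decode_image(encoded))  # ['0110', '1001']
```

Without the first 16 bits, the decoder would have no way to tell a 4 x 2 image from a 2 x 4 image.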

36 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

B&W Pixelation Widget

(3:18)

37 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

38 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

39 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

40 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

41 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

The Check Your Understanding is included on the next two slides.

42 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

Check Your Understanding (also on code.org)

  • What are the largest dimensions (width and height) of an image we can make with the pixelation widget?

  • How many total bits would there be in the largest possible image we could make with the pixelation widget?

255 x 255. Because there is only 1 byte each for width and height, the largest you can set the width to is 1111 1111 = 255.

Slightly tricky question. The answer is 65,041 bits.

The largest image is 255x255 = 65,025, but we also need to add in the 16 bits for the metadata.
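The counts above can be checked with a short Python sketch. It assumes one bit per pixel plus 16 bits of metadata (1 byte each for width and height), as described in the answer; `total_bits` is an illustrative name.

```python
METADATA_BITS = 16  # 1 byte for width + 1 byte for height


def total_bits(width, height):
    """Bits for a black-and-white image: metadata plus one bit per pixel."""
    return METADATA_BITS + width * height


max_side = 0b11111111                  # the largest value one byte can hold
print(max_side)                        # 255
print(max_side * max_side)             # 65025
print(total_bits(max_side, max_side))  # 65041
```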

43 of 80

Unit 2 Lesson 3 (U2L3)

Encoding B&W Images

  • How many bits would it take to represent the smallest possible image (i.e. an image with one pixel)?

  • What would happen if we didn’t include width and height bits in our protocol? Assume your friend just sent you 32 bits of pixel data (just the 0s and 1s for black and white pixels). Could you recover the original image? If so, how?

17 bits. 16 bits for metadata, plus 1 bit for the single pixel.

Without the metadata we do not know immediately what the dimensions of the image are supposed to be. With 32 bits, there are 6 possible pairs of dimensions for the image: 1x32, 2x16, 4x8 and their inverses. You could recover the original image by setting the metadata to each of those possibilities and seeing if it looked like anything recognizable.
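The trial-and-error recovery described above can be sketched in Python. The function names are illustrative, and the 32 bits of pixel data are made up.

```python
def possible_dimensions(num_pixels):
    """All (width, height) pairs whose product is num_pixels."""
    return [(w, num_pixels // w)
            for w in range(1, num_pixels + 1)
            if num_pixels % w == 0]


def layout(bits, width):
    """Split a string of pixel bits into rows of the given width."""
    return [bits[i:i + width] for i in range(0, len(bits), width)]


pixel_data = "01100110100110011001100101100110"  # 32 made-up pixel bits

candidates = possible_dimensions(len(pixel_data))
print(candidates)  # [(1, 32), (2, 16), (4, 8), (8, 4), (16, 2), (32, 1)]

# Try each candidate width and look at the result by eye.
for width, height in candidates:
    print(f"{width} x {height}")
    for row in layout(pixel_data, width):
        print("  " + row)
```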

44 of 80

Unit 2 Lesson 4 (U2L4a)

Day 1 of 2

For the Students

45 of 80

Unit 2 Lesson 4 (U2L4a)

Encoding Color Images

Number  Binary  Hexadecimal
0       0000    0
1       0001    1
2       0010    2
3       0011    3
4       0100    4
5       0101    5
6       0110    6
7       0111    7
8       1000    8
9       1001    9
10      1010    A
11      1011    B
12      1100    C
13      1101    D
14      1110    E
15      1111    F

Write the numbers 0 - 15 in your notes, then convert to Binary.

How many bits are needed for these 16 numbers?

Is there a better way?

Hexadecimal: a number system composed of the familiar ten Arabic numerals (0, 1, 2 … 9) as well as the first six English letters (A, B, C, D, E, F). Also referred to as a “base 16” number system, hexadecimal is used in the world of computing to help humans read and talk about large binary numbers.
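The same table can be generated with Python's built-in number formatting, as a quick sketch. The variable names are illustrative.

```python
# Each row: the number, its 4-bit binary form, and its hexadecimal digit.
table = [(n, format(n, "04b"), format(n, "X")) for n in range(16)]

for number, binary, hexadecimal in table:
    print(f"{number:2d}  {binary}  {hexadecimal}")

# One hexadecimal digit stands for 4 bits, so a byte needs only 2 digits.
byte = 0b11111111
print(format(byte, "08b"), "->", format(byte, "02X"))  # 11111111 -> FF
```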

46 of 80

Unit 2 Lesson 4 (U2L4a)

Encoding Color Images

Complete the activity

Encoding Hexadecimal Numbers

47 of 80

Unit 2 Lesson 4 (U2L4a)

Encoding Color Images

How might you encode colors?

In the previous lesson we came up with a simple encoding scheme for B&W images. What if we wanted to have color?

Devise an encoding scheme for color in an image file. How would you represent color for each pixel?

How many different colors could you represent? Do you have a particular order to the colors?

48 of 80

Unit 2 Lesson 4 (U2L4a)

Encoding Color Images

The way color is represented in a computer is different from the ways we represented text or numbers. With text, we just made a list of characters and assigned a number to each one. As you are about to see, with color, we actually use binary to encode the physical phenomenon of LIGHT. You saw this a little bit in the previous lesson, but today we will see how to make colors by mixing different amounts of colored light.

49 of 80

Unit 2 Lesson 4 (U2L4a)

Encoding Color Images

Images, Pixels, and RGB

(5:49)

50 of 80

Unit 2 Lesson 4 (U2L4a)

Encoding Color Images

Important ideas from this video include:

Image sharing services are a universal and powerful way of communicating all over the world.

Digital images are just data (lots of data) composed of layers of abstraction: pixels, RGB, binary.

The RGB color scheme is composed of red, green, and blue components that have a range of intensities from 0 to 255.

Screen resolution is the number of pixels and how they are arranged vertically and horizontally, and density is the number of pixels per a given area.

Digital photo filters are not magic! Math is applied to RGB values to create new ones.
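As a small illustration of "math applied to RGB values", here is a Python sketch. The two filters are simple examples chosen for this sketch, not the ones shown in the video, and the function names are assumptions.

```python
def rgb_to_hex(r, g, b):
    """Write a color as 6 hexadecimal digits: 2 each for red, green, blue."""
    return f"{r:02X}{g:02X}{b:02X}"


def invert(pixel):
    """Example filter: subtract each component from the maximum, 255."""
    r, g, b = pixel
    return (255 - r, 255 - g, 255 - b)


def grayscale(pixel):
    """Example filter: replace each component with the average of the three."""
    r, g, b = pixel
    average = (r + g + b) // 3
    return (average, average, average)


red = (255, 0, 0)
print(rgb_to_hex(*red))             # FF0000
print(rgb_to_hex(*invert(red)))     # 00FFFF
print(rgb_to_hex(*grayscale(red)))  # 555555
```

Each component ranges from 0 to 255, which fits in one byte, so a pixel stored this way takes 3 bytes, or 24 bits.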

51 of 80

Unit 2 Lesson 4 (U2L4a)

Encoding Color Images

You are now going to complete the Pixelation activities (3-7) on your own.

52 of 80

Unit 2 Lesson 4 (U2L4b)

Day 2 of 2

For the Students

53 of 80

Unit 2 Lesson 4 (U2L4b)

Encoding Color Images

Directions

  1. Create a personal 16 x 16 favicon and encode it using the Pixelation Widget on the final level of this lesson in Code Studio.
  2. The image you make should represent your personality in some distinctive way.

Requirements

  • The icon must be 16 x 16 pixels.
  • You must use the Pixelation Widget to encode the bits of color information.
  • The image must be encoded with at least 24 bits per pixel.
  • You must use Hexadecimal.
  • http://htmlcolorcodes.com/
  • Share your Favicon on the Google Slide.

54 of 80

Unit 2 Lesson 4 (U2L4b)

Encoding Color Images

55 of 80

Unit 2 Lesson 4 (U2L4b)

Encoding Color Images

Wrap-Up

Complete the Check Your Understanding

56 of 80

Unit 2 Lesson 5

For the Students

57 of 80

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

58 of 80

Lossy Text Compression App

Answer the following questions with your elbow partner:

What is happening in the app?

Should this “count” as text compression?

Why or why not?

What do you think “lossy” refers to?

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

59 of 80

The text compression we did a few lessons ago is known as “lossless” compression because, in doing the compression and in reconstructing the original text, nothing was lost; every character that was part of the original text could be recovered.

“Lossy” compression -- yes, that’s the official word -- does something else. Lossy compression schemes are ones in which “useless” or less-than-totally-necessary information is thrown out in order to reduce the size of the data.

The lossy text compression app did that, and for the most part, you could probably make out what the text was supposed to say.

But it’s not perfect. If you saw the word “fd” it could be “food”, “feed”, “feud”, or “fad”. By reading it in context, you might know what it was supposed to be, but there’s no real way to know what the original word was. The original word is lost.
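The “fd” example can be reproduced with a minimal Python sketch. It assumes the lossy scheme simply drops every vowel, which is a guess based on the example above rather than a description of how the app actually works.

```python
VOWELS = set("aeiouAEIOU")


def lossy_compress(text):
    """Throw away every vowel (assumed rule, for illustration only)."""
    return "".join(ch for ch in text if ch not in VOWELS)


words = ["food", "feed", "feud", "fad"]
compressed = [lossy_compress(word) for word in words]
print(compressed)  # ['fd', 'fd', 'fd', 'fd']
```

Four different words produce the same output, so no decompression step can tell which one was the original.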

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

60 of 80

Prompt: When you use lossy compression you lose the ability to decompress your information and get back a perfect copy. Even so, people use lossy compression all the time. Can you think of reasons or situations where someone would still use lossy compression?

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

61 of 80

We’ve been looking at image file formats. And we’ve also seen text compression. Both of those attempted to render perfectly every piece of information.

Both the image file format and the text compression scheme we used were lossless. Lossy compression schemes usually take advantage of the fact that a human is supposed to interpret the data at the other end, and human brains are good at filling the gaps when information is missing.

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

62 of 80

Text Compression

(0:35 - 2:28)

Yes, I know we have seen this before….

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

63 of 80

One computer per pair.

You and your partner have 6 minutes to complete the first column, JPG.

Keep track of your sources, create a new tab in your browser for each source, then put a link in your notes.

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

64 of 80

  • JPG is lossy compression. Many images on the web are converted to JPG. Odds are if you've seen a grainy or pixelated image on the Internet, it's a JPG.
  • Lossy compression like this results in a lower quality image with fewer details, but as humans we can still tell what's in the picture. JPG makes a lot of sense, especially when you don't need a high quality image and bandwidth is limited.
  • This is a free and open format, but even so there have been disputes about its ownership.

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

65 of 80

Prompt: Let's think for a minute about how you are doing your research. What kinds of sources are you finding? How are you actually reading these sources?

Conducting research online about CS topics is a skill. Often it means hunting through articles you don't completely understand for the key pieces of information that you do.

Other subjects like history or math may have classic, agreed-upon texts, but in the world of CS you'll often end up on Wikipedia or online forums. This is ok. We're going to keep working on this skill of reading technical articles. If you are sometimes confused, that's ok and entirely normal.

Given how large a subject CS is and how quickly it's changing, no one on earth understands everything about it. When you're doing research as a computer scientist it usually means sticking with it even if a lot of the content doesn't make sense at first.

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

66 of 80

One computer per pair.

You and your partner have 10 minutes to complete MP3 and PNG.

Keep track of your sources, create a new tab in your browser for each source, then put a link in your notes.

Compare your answers with your group.

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

67 of 80

Prompt: Lossy compression seems to be "worse" than lossless compression but obviously both are being used all the time. Write down three reasons or situations where someone would be willing to use lossy compression even though it means some loss of quality.

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

68 of 80

Today we accomplished a lot. We explored the difference between lossy and lossless compression, we practiced reading and researching a CS topic, and we learned a little bit about the complicated world of file formats! Next time we'll close out this unit by digging a little deeper on all of these topics.

Unit 2 Lesson 5 (U2L5)

Lossy Compression vs Lossless Compression

69 of 80

Unit 2 Lesson 6

2 Day Lesson

For the Students

70 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown

Prompt: Imagine I was comparing two different file formats for representing images.

I tell you that one of them is "better" than the other.

Based on everything that you've learned in this unit, what might "better" mean?

Try to come up with at least three ideas and incorporate vocabulary from the unit.

  • Better compression ratio or better quality image even when compressed
  • Can hold larger or more complex types of information
  • Different kinds of metadata
  • Includes features not found in other formats
  • Used by more people
  • Open, no licensing fees
  • Newer and optimized for modern technology

71 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown

Today we're going to dig a little deeper into this question by doing a Rapid Research project. Not only are you going to research your own file format, you're going to make the case why yours is the best!

72 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown

Let’s take a look at the Rapid Research - Format Showdown

73 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown

Project Overview: Format Showdown

Sharing text, images, video, and audio is increasingly how people communicate. For any of this to work we need shared file formats. Over the last few decades there have been many formats for each kind of file, and at any point in time there are usually many formats to choose from. Why so many choices, and which one is best?

For this project you will be researching a popular file format to understand what it does, how it works, and what makes it important. Then you will create a computational artifact sharing the results of your research and arguing why your format is the best!

74 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown

One Pager and Computational Artifact

You will choose a file format that you wish to research. Based on your research you will create both a one-pager and a computational artifact. Important questions will include:

  • The history of the format
  • Key information about how it works
  • Advantages and disadvantages of this format
  • How it compares to other similar formats

75 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown

General Process & Requirements

  • Review the one-pager template (separate document) and the rubric (below)
  • Choose a format using the guide
  • Conduct your research by following the research guide (below)
  • Complete the one-pager
  • Design a computational artifact (instructions on one-page template)

76 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown

Choose Your Format

Some Suggestions: Use the list below to help you begin your research. Just because you haven’t heard of a format before doesn’t mean that it isn’t interesting or important!

Image Formats: BMP, JPEG 2000, GIF, TIFF, PDF, RAW, WebP, HEIF

Audio Formats: WAV, AAC, OGG, FLAC, ALAC, WMA, AIFF

Video Formats: MP4, OGG, Quicktime, AVI, FLV

Other Formats: DOCX, ODT

77 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown

You need to include at least 3 sources of information!

78 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown

Note: All text in Italics, including this text, is intended to be replaced by your responses, and deleted once you’ve completed your one-pager.

Restate the questions in your answers!

79 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown

Day 1

  • Review Activity Guide and Rubric
  • Choose Your Format
  • Conduct Your Research
  • Begin One-Pager

Day 2

  • Complete One-Pager
  • Computational Artifact

Day 3

  • Peer Grading

80 of 80

Unit 2 Lesson 6 (U2L6)

Rapid Research - Format Showdown