U2L2a: Text Compression (Day 1)
U2L4a: Encoding Color Images (Day 1)
U2L2b: Text Compression (Day 2)
U2L4b: Encoding Color Images (Day 2)
Unit 2
Digital Information
Welcome!
Unit 2 Lesson 1 (U2L1)
Bytes and File Sizes
Please Complete the Quick Check-In before the bell rings.
Unit 2 Lesson 1 (U2L1)
Bytes and File Sizes
As we embark on a new unit about data and digital information, we need to get familiar with the terminology for data and for different types of data files.
Byte: 8 bits of data
Recall that a single character of ASCII text requires 8 bits.
A byte is the standard fundamental unit (or “chunk size”) underlying most computing systems today. You may have heard "kilobyte", "megabyte", "gigabyte", etc., which are all different amounts of bytes. We're going to learn more about them today.
Unit 2 Lesson 1 (U2L1)
Bytes and File Sizes
Compare sizes of plain text vs. MS Word doc
Recall: In a previous lesson (Unit 1 - Sending Formatted Text) we learned that, in addition to the actual text of a document, it is usually necessary to store the formatting information that allows the text to be displayed correctly.
We might wonder just how much extra information, i.e. how many extra bytes, we need to store when we include all of this formatting. Let's find out!
Unit 2 Lesson 1 (U2L1)
Bytes and File Sizes
If a single ASCII character is one byte, then storing the word “hello” in a plain ASCII text file should require 5 bytes (or 40 bits) of memory.
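The 5-byte estimate can be checked with a short Python sketch. It only counts the bytes of the text itself, assuming one byte per ASCII character as stated above; it says nothing about what a Word document adds.

```python
# Count the bytes needed to store "hello" as plain ASCII text.
text = "hello"
encoded = text.encode("ascii")  # one byte per ASCII character

num_bytes = len(encoded)
num_bits = num_bytes * 8        # 8 bits in every byte

print(num_bytes, "bytes")  # 5 bytes
print(num_bits, "bits")    # 40 bits
```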
What about a Microsoft Word document that contains the single word "hello"?
In your INB, Predict: "How many more bytes will a Word document require to store the word “hello” than a plain text document?"
Teacher: Use Notepad and Word to demo.
Unit 2 Lesson 1 (U2L1)
Bytes and File Sizes
Text Document
Word Document
Unit 2 Lesson 1 (U2L1)
Bytes and File Sizes
Why is there such a difference?
Text Document
Word Document
Look back at predictions to see how close they were.
Unit 2 Lesson 1 (U2L1)
Bytes and File Sizes
Rapid Research: Bytes and File Sizes
Work with your partner/group to complete the Activity.
Links can be found in Lesson 1 on the “Bytes and File Sizes” Activity Guide.
Unit 2 Lesson 1 (U2L1)
Bytes and File Sizes
Wrap-Up:
As you have seen, data files can grow in size very quickly. In the modern world there is a lot of data around us, and usually we want it transmitted over the Internet.
There is a problem though: if you want to transmit a lot of data, you are limited by the speed of your Internet connection. Even if you have a fast Internet connection, there is a physical limit to how fast you can transmit bits.
What if the data you want to send is big enough that it takes an unreasonable amount of time to transmit, even with a really fast Internet connection? Assuming you can't make the Internet connection any faster, could you still transmit the data faster somehow?
The answer is yes and it's probably something you've done, or do every day!
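One way to reason about the question is that transfer time depends on how many bits are sent as well as on the link speed. The sketch below uses made-up numbers (the file size and link speed are assumptions chosen only for illustration) to show that sending the same information in half as many bytes takes half as long on the same connection.

```python
def transfer_seconds(size_bytes: int, bits_per_second: float) -> float:
    """Time to send size_bytes over a link that carries bits_per_second."""
    return size_bytes * 8 / bits_per_second

# Hypothetical numbers, chosen only for illustration.
file_size = 100_000_000   # a 100-million-byte file
link_speed = 10_000_000   # a link that carries 10 million bits per second

full = transfer_seconds(file_size, link_speed)
half = transfer_seconds(file_size // 2, link_speed)  # same link, half the bytes

print(full)  # 80.0
print(half)  # 40.0
```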
Unit 2 Lesson 1 (U2L1)
Bytes and File Sizes
Complete the Check Your Understanding.
Unit 2 Lesson 2 (U2L2a)
Day 1 of 2
For the Students
Unit 2 Lesson 2 (U2L2a)
Text Compression
Warm up: Abbr In Ur Txt Msgs (5-7 mins)
When you send text messages to a friend, do you spell every word correctly?
Do you use abbreviations for common words?
In your INB, write some examples of things you might see in a text message that are not proper English.
Why do you use these abbreviations?
What is the benefit?
Unit 2 Lesson 2 (U2L2a)
Text Compression
What's this about? - Compression: Same Data, Fewer Bits
Unit 2 Lesson 2 (U2L2a)
Text Compression
Pass out Activity Guide:
Decode this message!
In pairs, decode the Encoded Message:
When you are finished, share your answer with your teacher.
Count the TOTAL number of characters needed for the Original Message (don’t forget to count the spaces).
Unit 2 Lesson 2 (U2L2a)
Text Compression
The Original Message is….
Unit 2 Lesson 2 (U2L2a)
Text Compression
Recap: How much was it compressed?
The total number of characters in the Original Message would be: 93
How many characters are in the Key? How many characters are in the compressed poem? Together: 31 + 25 = 56
If you only sent the compressed poem, would the receiver be able to decode the message? No
How many fewer characters is the compressed message? 93 - 56 = 37
Compression: 37/93 ≈ 40%
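The recap arithmetic can be reproduced in a few lines of Python. The numbers are the ones on this slide; the variable names are only for illustration.

```python
original_chars = 93    # characters in the Original Message
sent_chars = 31 + 25   # compressed poem plus Key, as added in the recap

saved = original_chars - sent_chars
compression = saved / original_chars

print(sent_chars)            # 56
print(saved)                 # 37
print(f"{compression:.0%}")  # 40%
```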
Unit 2 Lesson 2 (U2L2a)
Text Compression
Text Compression Widget with Aloe Blacc
(5:51)
Unit 2 Lesson 2 (U2L2a)
Text Compression
Go to Lesson 2, and start the Widget: Text Compression
Your challenge is to have the highest level of Compression that you can for your assigned text.
Try to develop a general strategy that will lead to a good compression.
Compare your results and strategy with your group.
Unit 2 Lesson 2 (U2L2b)
Day 2 of 2
For the Students
Unit 2 Lesson 2 (U2L2b)
Text Compression
Vocabulary
Heuristic: a problem solving approach (typically an algorithm) to find a satisfactory solution where finding an optimal or exact solution is impractical or impossible. (rule of thumb)
Algorithm: A list of steps that you can follow to finish a task
Demo video (not great)
Unit 2 Lesson 2 (U2L2b)
Text Compression
Go to Lesson 2, and start the Widget: Text Compression, pick a different Poem/Song.
As you do so, develop a set of rules, or a “heuristic” that generally seems to provide good results.
Record your heuristic (rule of thumb) as a list of steps that someone else unfamiliar with the problem could follow and still end up with decent compression.
Trade your heuristics with another group. Are they clear and specific enough that you always know what to do? If not, provide feedback to one another and improve your heuristics to provide clearer instructions.
Using another group’s heuristic, attempt to compress one or more of the poems in the tool. Record the amount of compression you achieve in your INB.
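To make "a list of steps that someone else could follow" concrete, here is one possible heuristic written as Python. It is a sketch of a greedy rule of thumb, not the widget's own method: at each step it replaces the repeated chunk that saves the most characters, counting the cost of storing that chunk in the key. The function names, the symbol set, and the chunk-length limits are all assumptions made for this example.

```python
def best_chunk(text, min_len=2, max_len=6):
    """Return the chunk whose replacement saves the most characters, or None."""
    best, best_saving = None, 0
    for length in range(min_len, max_len + 1):
        chunks = sorted({text[i:i + length] for i in range(len(text) - length + 1)})
        for chunk in chunks:
            uses = text.count(chunk)  # non-overlapping occurrences
            # Each use shrinks to 1 symbol; the key costs the chunk plus its symbol.
            saving = uses * (length - 1) - (length + 1)
            if saving > best_saving:
                best, best_saving = chunk, saving
    return best


def compress(text, symbols="@#$%&*"):
    """Greedily swap repeated chunks for symbols; return (compressed text, key)."""
    key = {}
    for symbol in symbols:
        if symbol in text:
            continue  # a symbol must not already appear in the message
        chunk = best_chunk(text)
        if chunk is None:
            break  # no replacement saves anything once the key is counted
        text = text.replace(chunk, symbol)
        key[symbol] = chunk
    return text, key


def decompress(text, key):
    """Undo the replacements in reverse order to recover the original text."""
    for symbol, chunk in reversed(list(key.items())):
        text = text.replace(symbol, chunk)
    return text


message = "pitter patter pitter patter listen to the rain pitter patter on the pane"
packed, key = compress(message)
print(packed, key)
print(len(message), "->", len(packed) + sum(len(c) + 1 for c in key.values()))
```

A greedy rule like this gives decent results quickly, but it does not guarantee the smallest possible output, which is what makes it a heuristic.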
Unit 2 Lesson 2 (U2L2b)
Text Compression
Do you think it’s possible to describe (or write) a specific set of instructions that a person could follow that would always result in better text compression than your heuristic? Why or why not?
Is there a way to know that a compressed piece of text is compressed the most possible? If yes, describe how you could determine it. If no, why not?
Unit 2 Lesson 2 (U2L2b)
Text Compression
Recap Questions
What did all groups’ processes for compression have in common?
Will following this process always lead to the same compression? (i.e., will two people following the process for the same poem end up with the same compression?)
Unit 2 Lesson 2 (U2L2b)
Text Compression
Valhalla vs. El Cajon Valley Compression Competition
Unit 2 Lesson 2 (U2L2b)
Text Compression
What is the difference between lossless compression and lossy compression?
Define heuristic in your own words.
Complete the Check Your Understanding
For the Students
Unit 2 Lesson 3 (U2L3)
Encoding B&W Images
Develop an Encoding Scheme (10 mins)
Look at the simple black-and-white image below.
With your partner, discuss how you might encode images like these in binary.
Record your ideas for your encoding in your INB.
As you develop your encoding consider:
Unit 2 Lesson 3 (U2L3)
Encoding B&W Images
Now consider these larger images.
Does your encoding scheme still work?
Why or why not?
What would you have to change to make it work?
Unit 2 Lesson 3 (U2L3)
Encoding B&W Images
How have you encoded white and black portions of your image? What do 0 and 1 stand for in your encoding?
Are your encodings flexible enough to accommodate images of any size? How do they accomplish this?
Is your encoding intuitive and easy to use?
Is your encoding efficient?
Unit 2 Lesson 3 (U2L3)
Encoding B&W Images
Vocabulary
Pixel: short for "picture element", the fundamental unit of a digital image, typically a tiny square or dot that contains a single point of color of a larger image.
Where did the word pixel come from? It turns out that originally the dots were referred to as "picture elements"; that got shortened to "pict-el" and eventually "pixel".
What we've discovered is that the data for our image file must contain more than just a 0 or 1 for every pixel. It must also contain data that describes the pixel data. This is called metadata. In this case, the metadata encodes the width and height of the image.
Metadata: data that describes other data.
We've seen metadata before, in an internet packet. The packet contains the data that needs to be sent, but also other data such as the to and from addresses and the packet number.
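As a rough illustration of how metadata lets a decoder make sense of pixel bits, the sketch below reads one byte of width and one byte of height, then splits the remaining bits into rows. Putting width before height, and the function name, are assumptions for this example; which color 0 and 1 stand for is left to your own encoding.

```python
def decode_image(bits: str) -> list[str]:
    """Split a bit string into rows using 16 bits of width/height metadata."""
    bits = bits.replace(" ", "")   # spaces are only there for readability
    width = int(bits[0:8], 2)      # first byte: width (assumed order)
    height = int(bits[8:16], 2)    # second byte: height
    pixels = bits[16:]             # everything else is 1 bit per pixel
    if len(pixels) != width * height:
        raise ValueError("pixel data does not match the width and height")
    return [pixels[row * width:(row + 1) * width] for row in range(height)]


# A 4-wide, 2-tall image: 16 bits of metadata followed by 8 pixel bits.
rows = decode_image("00000100 00000010 1001 0110")
for row in rows:
    print(row)
# 1001
# 0110
```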
Unit 2 Lesson 3 (U2L3)
Encoding B&W Images
B&W Pixelation Widget
(3:18)
Unit 2 Lesson 3 (U2L3)
Encoding B&W Images
The Check Your Understanding is included on the next two slides.
Unit 2 Lesson 3 (U2L3)
Encoding B&W Images
Check Your Understanding (also on code.org)
255 x 255. Because there is only 1 byte each for width and height, the largest you can set the width to is 1111 1111 = 255.
Slightly tricky question. The answer is 65,041 bits.
The largest image has 255 x 255 = 65,025 pixels at 1 bit each, and we also need to add in the 16 bits for the metadata.
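The arithmetic behind this answer, written out as a short Python check (variable names are illustrative only):

```python
max_side = 0b11111111               # largest value one byte can hold
pixel_bits = max_side * max_side    # 1 bit per pixel
metadata_bits = 8 + 8               # one byte each for width and height

total_bits = pixel_bits + metadata_bits
print(max_side, pixel_bits, total_bits)  # 255 65025 65041
```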
Unit 2 Lesson 3 (U2L3)
Encoding B&W Images
17 bits. 16 bits for metadata, plus 1 bit for the single pixel.
Without the metadata we do not know immediately what the dimensions of the image are supposed to be. With 32 bits of pixel data, there are 6 possible pairs of dimensions for the image: 1x32, 2x16, 4x8, and the same pairs with width and height swapped. You could recover the original image by setting the metadata to each of those possibilities and seeing if it looked like anything recognizable.
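The six candidate dimensions are simply the factor pairs of 32. A small sketch that lists them (the function name is made up for this example):

```python
def possible_dimensions(pixel_count: int) -> list[tuple[int, int]]:
    """List every (width, height) whose product is pixel_count."""
    return [(w, pixel_count // w)
            for w in range(1, pixel_count + 1)
            if pixel_count % w == 0]


print(possible_dimensions(32))
# [(1, 32), (2, 16), (4, 8), (8, 4), (16, 2), (32, 1)]
```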
For the Students
Unit 2 Lesson 4 (U2L4a)
Encoding Color Images
Number | Binary | Hexadecimal |
0 | 0000 | 0 |
1 | 0001 | 1 |
2 | 0010 | 2 |
3 | 0011 | 3 |
4 | 0100 | 4 |
5 | 0101 | 5 |
6 | 0110 | 6 |
7 | 0111 | 7 |
8 | 1000 | 8 |
9 | 1001 | 9 |
10 | 1010 | A |
11 | 1011 | B |
12 | 1100 | C |
13 | 1101 | D |
14 | 1110 | E |
15 | 1111 | F |
Write the numbers 0 - 15 in your notes,
then convert to Binary.
How many bits are needed for these 16 numbers?
Is there a better way?
Hexadecimal: a number system composed of the familiar ten Arabic numerals (0, 1, 2 … 9) as well as the first six letters of the English alphabet (A, B, C, D, E, F). Also referred to as a “base 16” number system, hexadecimal is used in computing to help humans read and talk about large binary numbers.
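The table above can be regenerated with Python's built-in `format`, and the same idea extends to longer binary numbers: every group of 4 bits becomes one hexadecimal digit. The helper name `binary_to_hex` is an assumption for this sketch.

```python
# Rebuild the 0-15 table: decimal, 4-bit binary, hexadecimal.
for n in range(16):
    print(n, format(n, "04b"), format(n, "X"))


def binary_to_hex(bits: str) -> str:
    """Convert a binary string to hexadecimal, 4 bits per hex digit."""
    bits = bits.replace(" ", "")
    bits = bits.zfill((len(bits) + 3) // 4 * 4)  # pad on the left to a multiple of 4
    return "".join(format(int(bits[i:i + 4], 2), "X")
                   for i in range(0, len(bits), 4))


print(binary_to_hex("1111 1111"))       # FF
print(binary_to_hex("1010 0011 1100"))  # A3C
```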
Unit 2 Lesson 4 (U2L4a)
Encoding Color Images
Complete the activity
Encoding Hexadecimal Numbers
Unit 2 Lesson 4 (U2L4a)
Encoding Color Images
How might you encode colors?
In the previous lesson we came up with a simple encoding scheme for B&W images. What if we wanted to have color?
Devise an encoding scheme for color in an image file. How would you represent color for each pixel?
How many different colors could you represent? Do you have a particular order to the colors?
Unit 2 Lesson 4 (U2L4a)
Encoding Color Images
The way color is represented in a computer is different from the ways we represented text or numbers. With text, we just made a list of characters and assigned a number to each one. As you are about to see, with color, we actually use binary to encode the physical phenomenon of LIGHT. You saw this a little bit in the previous lesson, but today we will see how to make colors by mixing different amounts of colored light.
Unit 2 Lesson 4 (U2L4a)
Encoding Color Images
Images, Pixels,
and RGB
(5:49)
Unit 2 Lesson 4 (U2L4a)
Encoding Color Images
Important ideas from this video include:
Image sharing services are a universal and powerful way of communicating all over the world.
Digital images are just data (lots of data) composed of layers of abstraction: pixels, RGB, binary.
The RGB color scheme is composed of red, green, and blue components that have a range of intensities from 0 to 255.
Screen resolution is the number of pixels and how they are arranged vertically and horizontally, and density is the number of pixels in a given area.
Digital photo filters are not magic! Math is applied to RGB values to create new ones.
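Two of these ideas can be sketched in a few lines of Python: writing an RGB color as hexadecimal (two hex digits per component, since each ranges from 0 to 255) and applying simple math to RGB values to get a new color. The "invert" filter here is just one example of such math, chosen for illustration; it is not taken from the video.

```python
def rgb_to_hex(r: int, g: int, b: int) -> str:
    """Write an RGB color as six hexadecimal digits."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("each component must be between 0 and 255")
    return f"{r:02X}{g:02X}{b:02X}"


def invert(pixel: tuple[int, int, int]) -> tuple[int, int, int]:
    """An example filter: subtract each component from 255."""
    r, g, b = pixel
    return (255 - r, 255 - g, 255 - b)


red = (255, 0, 0)
print(rgb_to_hex(*red))          # FF0000
print(invert(red))               # (0, 255, 255)
print(rgb_to_hex(*invert(red)))  # 00FFFF
```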
Unit 2 Lesson 4 (U2L4a)
Encoding Color Images
You are now going to complete the Pixelation activities (3-7) on your own.
For the Students
Unit 2 Lesson 4 (U2L4b)
Encoding Color Images
Directions
Requirements
Unit 2 Lesson 4 (U2L4b)
Encoding Color Images
Wrap-Up
Complete the Check Your Understanding
Unit 2 Lesson 5
For the Students
Unit 2 Lesson 5 (U2L5)
Lossy Compression vs Lossless Compression
Lossy Text Compression App
Answer the following questions with your elbow partner:
What is happening in the app?
Should this “count” as text compression?
Why or why not?
What do you think “lossy” refers to?
Unit 2 Lesson 5 (U2L5)
Lossy Compression vs Lossless Compression
The text compression we did a few lessons ago is known as “lossless” compression: in compressing and then reconstructing the original text, nothing was lost. Every character that was part of the original text could be recovered.
“Lossy” compression -- yes, that’s the official word -- does something else. Lossy compression schemes are ones in which “useless” or less-than-totally-necessary information is thrown out in order to reduce the size of the data.
The lossy text compression app did that, and for the most part, you could probably make out what the text was supposed to say.
But it’s not perfect. If you saw the word “fd” it could be “food”, “feed”, “feud”, or “fad”. By reading it in context, you might guess what it was supposed to be, but there’s no real way to know what the original word was. The original word is lost.
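The “fd” problem is easy to reproduce. The sketch below assumes the app works by dropping vowels, which is what the “fd” example suggests; the real app may do something different. Four different words collapse to the same compressed form, so there is no way to reverse the process with certainty.

```python
def drop_vowels(word: str) -> str:
    """A lossy 'compression': throw away every vowel."""
    return "".join(ch for ch in word if ch.lower() not in "aeiou")


words = ["food", "feed", "feud", "fad"]
compressed = {word: drop_vowels(word) for word in words}
print(compressed)
# {'food': 'fd', 'feed': 'fd', 'feud': 'fd', 'fad': 'fd'}

# Four different originals, one compressed form: the original is lost.
print(len(set(compressed.values())))  # 1
```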
Unit 2 Lesson 5 (U2L5)
Lossy Compression vs Lossless Compression
Prompt: When you use lossy compression you lose the ability to decompress your information and get back a perfect copy. Even so, people use lossy compression all the time. Can you think of reasons or situations where someone would still use lossy compression?
Unit 2 Lesson 5 (U2L5)
Lossy Compression vs Lossless Compression
We’ve been looking at image file formats. And we’ve also seen text compression. Both of those attempted to render perfectly every piece of information.
Both the image file format and the text compression scheme we used were lossless. Lossy compression schemes usually take advantage of the fact that a human is supposed to interpret the data at the other end, and human brains are good at filling the gaps when information is missing.
Unit 2 Lesson 5 (U2L5)
Lossy Compression vs Lossless Compression
Text Compression
(0:35 - 2:28)
Yes, I know we have seen this before….
Unit 2 Lesson 5 (U2L5)
Lossy Compression vs Lossless Compression
One computer per pair.
You and your partner have 6 minutes to complete the first column, JPG.
Keep track of your sources: open a new tab in your browser for each source, then put a link in your notes.
Unit 2 Lesson 5 (U2L5)
Lossy Compression vs Lossless Compression
Prompt: Let's think for a minute about how you are doing your research. What kinds of sources are you finding? How are you actually reading these sources?
Conducting research online about CS topics is a skill. Often it means hunting through articles you don't completely understand for the key pieces of information that you do.
Other subjects like history or math may have classic, agreed-upon texts, but in the world of CS you'll often end up on Wikipedia or online forums. This is OK. We're going to keep working on this skill of reading technical articles. If you are sometimes confused, that's OK and entirely normal.
CS is such a large subject, and it changes so quickly, that no one on earth understands everything about it. When you're doing research as a computer scientist, it usually means sticking with it even if a lot of the content doesn't make sense at first.
Unit 2 Lesson 5 (U2L5)
Lossy Compression vs Lossless Compression
One computer per pair.
You and your partner have 10 minutes to complete MP3 and PNG.
Keep track of your sources: open a new tab in your browser for each source, then put a link in your notes.
Compare your answers with your group.
Unit 2 Lesson 5 (U2L5)
Lossy Compression vs Lossless Compression
Prompt: Lossy compression seems to be "worse" than lossless compression, but obviously both are being used all the time. Write down three reasons or situations where someone would be willing to use lossy compression even though it means some loss of quality.
Unit 2 Lesson 5 (U2L5)
Lossy Compression vs Lossless Compression
Today we accomplished a lot. We explored the difference between lossy and lossless compression, we practiced reading and researching a CS topic, and we learned a little bit about the complicated world of file formats! Next time we'll close out this unit by digging a little deeper on all of these topics.
Unit 2 Lesson 6
2 Day Lesson
For the Students
Unit 2 Lesson 6 (U2L6)
Rapid Research - Format Showdown
Prompt: Imagine I was comparing two different file formats for representing images.
I tell you that one of them is "better" than the other.
Based on everything that you've learned in this unit, what might "better" mean?
Try to come up with at least three ideas and incorporate vocabulary from the unit.
Unit 2 Lesson 6 (U2L6)
Rapid Research - Format Showdown
Today we're going to dig a little deeper into this question by doing a Rapid Research project. Not only are you going to research your own file format, you're going to make the case for why yours is the best!
Unit 2 Lesson 6 (U2L6)
Rapid Research - Format Showdown
Let’s take a look at the Rapid Research - Format Showdown
Unit 2 Lesson 6 (U2L6)
Rapid Research - Format Showdown
Project Overview: Format Showdown
Sharing text, images, video, and audio is increasingly how people communicate. For any of this to work we need shared file formats. Over the last few decades there have been many formats for each kind of file, and at any point in time there are usually many formats to choose from. Why so many choices, and which one is best?
For this project you will be researching a popular file format to understand what it does, how it works, and what makes it important. Then you will create a computational artifact sharing the results of your research and arguing why your format is the best!
Unit 2 Lesson 6 (U2L6)
Rapid Research - Format Showdown
One Pager and Computational Artifact
You will choose a file format that you wish to research. Based on your research you will create both a one-pager and a computational artifact. Important questions will include:
Unit 2 Lesson 6 (U2L6)
Rapid Research - Format Showdown
General Process & Requirements
Unit 2 Lesson 6 (U2L6)
Rapid Research - Format Showdown
Choose Your Format
Some Suggestions: Use the list below to help you begin your research. Just because you haven’t heard of a format before doesn’t mean that it isn’t interesting or important!
Image Formats | Audio Formats | Video Formats | Other Formats |
BMP, JPEG 2000, GIF, TIFF, RAW, WebP, HEIF | WAV, AAC, OGG, FLAC, ALAC, WMA, AIFF | MP4, OGG, QuickTime, AVI, FLV | DOCX, ODT |
Unit 2 Lesson 6 (U2L6)
Rapid Research - Format Showdown
You need to include at least 3 sources of information!
Unit 2 Lesson 6 (U2L6)
Rapid Research - Format Showdown
Note: All text in Italics, including this text, is intended to be replaced by your responses, and deleted once you’ve completed your one-pager.
Restate the questions in your answers!
Unit 2 Lesson 6 (U2L6)
Rapid Research - Format Showdown
Day 1
Day 2
Day 3