1 of 56

Compression, Complexity, and P = NP

1

Lecture 39

CS61B, Spring 2023 @ UC Berkeley

Josh Hug and Justin Yokota

2 of 56

Announcements

My office hours today have moved to 3:30 - 4:30 (instead of 4:00 - 5:00).

  • Ceremony 5 - 8:00 in Sibley Auditorium if you want to come.

I’m exploring larger room options or even doing a second lecture on Friday.

  • More soon!

My usual goal is to structure lectures such that if you are paying close attention you can’t help but understand.

  • This is not the case for Lectures 38 and 39. They move quickly, omit details, and omit checkpoint exercises.
  • Happy to talk about more about these in office hours.

I will have office hours next week. Times TBD.

3 of 56

Compression Model 1 vs. Model 2 Review

Lecture 39, CS61B, Spring 2023

Compression

  • Model 1 vs. Model 2 Review
  • Seeking Optimal Compression of the HugPlant

Optimal Compression and Kolmogorov Complexity (172 preview)

Space/Time Bounded Compression

P = NP?

“Almost Like Gods”

Does Short = Comprehensible?

4 of 56

Last Time: Compression

We saw one technique for compression called Huffman Coding.

  • There are many other approaches to compression.
  • Interesting question: What is the best way to compress a given bitstream?

01010101000001010101110...

Compression

Algorithm C

1001010101...

01010101000001010101110...

Decompression

Algorithm C-1

1001010101...

Bitstream B

Compressed bits C(B)

C(B)

B

5 of 56

Comparing Compression Algorithms

Example: What is the best way to compress mobydick.txt?

  • One way to approach this problem: Try a bunch of different standard tools and see which yields the smallest size.

Algorithm

Uncompressed size (bits)

Compressed size (bits)

zip

5145656

2091000

huffman

5145656

3412896

bzip2

5145656

1805288

6 of 56

Comparing Compression Algorithms

Example: What is the best way to compress mobydick.txt?

  • One way to approach this problem: Try a bunch of different standard tools and see which yields the smallest size.

One problem: What if someone writes a custom compression algorithm?

  • If input is 0, return the entire text of Moby Dick.
  • If input is 1, return the entire text of Great Expectations.
  • For all other inputs, it just ignores the top bit and returns the rest.

Algorithm

Uncompressed size (bits)

Compressed size (bits)

zip

5145656

2091000

huffman

5145656

3412896

bzip2

5145656

1805288

7 of 56

Ultra Compression

public static String greatmobydecompress(Bitstream bs) {

if (bs.toInt() == 0) {

return "CALL me Ishmael. Some years ago never mind how...";

}

if (bs.toInt() == 1) {

return "It was the best of times, it was the worst of times...";

}

return bs.dropFirstBit().toString();

}

Algorithm

Uncompressed size (bits)

Compressed size (bits)

zip

5145656

2091000

huffman

5145656

3412896

bzip2

5145656

1805288

greatmobydecompress

5145656

1

8 of 56

A Flaw in Compression Model #1

Source code for decompression algorithm itself might be highly complex.

  • To avoid this issue, we can include the number of bits for the source code of the algorithm itself.

1

0100001101000001...

mobydick.txt

Compressed Bits

Decompression

Algorithm C-1

1 bit

5145656 bits

10561262 bits

9 of 56

Compression Models #1 and #2

0100001101000001...

Compression

Algorithm C

1

0100001101000001...

Decompression

Algorithm C-1

1

Bitstream B

Compressed bits C(B)

C(B)

B

01110000...1

GreatMobyDecompressWithHardCodedOneBit.java

Decompression Algorithm and C(B)

Compression

Decompression Model #1

Decompression Model #2: Include the code for the decompression algorithm as part of the compressed output.

Interpreter

5,145,656 bits

0100001101000001...

mobydick.txt

10 of 56

Compression Model 2

The goal of a compression algorithm is to find short sequences of bits that generate desired longer sequences of bits.

  • Given a sequence of bits B, find a shorter sequence DA+C(B) that produces B when fed into an interpreter.

Interpreter

5,145,656 bits

0100001101000001...

mobydick.txt

Great Moby Compression Algorithm

01110000...1

GreatMobyDecompressWithHardCodedOneBit.java

Decompression Algorithm and C(B)

10,561,262

+ 1 =

10,561,263 bits

11 of 56

Compression Model 2

The goal of a compression algorithm is to find short sequences of bits that generate desired longer sequences of bits.

  • Given a sequence of bits B, find a shorter sequence DA+C(B) that produces B when fed into an interpreter.

Interpreter

5,145,656 bits

0100001101000001...

mobydick.txt

Huffman Compression Algorithm

01110000...1

HuffmanWithHardCoded CompressedMobyDick.java

Decompression Algorithm and C(B)

45,960 + 2,091,000 =

2,136,960 bits

12 of 56

Model 1 vs. Model 2 Compression

Algorithm

Uncompressed size (bits)

Compressed size (bits)

Compressed size using model 2 (bits)

zip

5145656

2,091,000

2,091,000 + 67,160

huffman

5145656

3,412,896

3,412,896 + 45,960

bzip2

5145656

1,805,288

1,805,288 + 74,984

greatmobydecompress

5145656

1

1 + 10,561,262

Twice as big because this algorithm also has Great Expectations hard coded into it!

13 of 56

Seeking Optimal Compression of the HugPlant

Lecture 39, CS61B, Spring 2023

Compression

  • Model 1 vs. Model 2 Review
  • Seeking Optimal Compression of the HugPlant

Optimal Compression and Kolmogorov Complexity (172 preview)

Space/Time Bounded Compression

P = NP?

“Almost Like Gods”

Does Short = Comprehensible?

14 of 56

Compression Model 2

The goal of a compression algorithm is to find short sequences of bits that generate desired longer sequences of bits.

  • Given a sequence of bits B, find a shorter sequence DA+C(B) that produces B when fed into an interpreter.

Interpreter

Huffman Compression Algorithm

01110000...01110100...

HuffmanWithHardCoded CompressedHugPlant.java

Decompression Algorithm and C(B)

45,960 + 1,994,024 =

2,039,984 bits

8,389,584 bits

0100001001001101...

HugPlant.bmp

B

15 of 56

Even Better Compression

Compression ratio of 25% is certainly very impressive, but we can do much better. MysteryX achieves a 0.35% compression ratio!

  • Of the 28389584 possible bit streams of length 8389584, only one in 28360151 can be generated by our interpreter using an input of length 29,432 bits.

29,432 bits

0110100101101101...

MysteryX

Interpreter

01110000...01110100...

HuffmanWithHardCoded CompressedHugPlant.java

Interpreter

8,389,584 bits

0100001001001101...

HugPlant.bmp

2,037,424 bits

8,389,584 bits

0100001001001101...

HugPlant.bmp

16 of 56

Even Better Compression

Compression ratio of 25% is certainly very impressive, but we can do much better. MysteryX achieves a 0.35% compression ratio!

  • Of the 28389584 possible bit streams of length 8389584, only one in 28360151 can be generated by our interpreter using an input of length 29,432 bits.

29,432 bits

0110100101101101...

MysteryX

Interpreter

01110000...01110100...

HuffmanWithHardCoded CompressedHugPlant.java

Interpreter

8,389,584 bits

0100001001001101...

HugPlant.bmp

2,037,424 bits

8,389,584 bits

0100001001001101...

HugPlant.bmp

Question: What is mystery X?

17 of 56

MysteryX: HugPlant.java

MysteryX is just HugPlant.java, the piece of code that I used to generate the .bmp file originally.

29,432 bits

0110100101101101...

CB: HugPlant.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

69 6d 70 6f 72 74 20 6a 61 76 61 2e 61 77 74 2e

43 6f 6c 6f 72 3b 0a 0a 0a 70 75 62 6c 69 63 20

63 6c 61 73 73 20 48 75 67 50 6c 61 6e 74 20 7b

0a 09 2f 2f 73 65 6e 64 20 65 6d 61 69 6c 20 74

6f 20 70 72 69 7a 65 40 6a 6f 73 68 68 2e 75 67

20 74 6f 20 72 65 63 65 69 76 65 20 79 6f 75 72

20 70 72 69 7a 65 0a 0a 09 70 72 69 76 61 74 65

20 73 74 61 74 69 63 20 64 6f 75 62 6c 65 20 73

63 61 6c 65 46 61 63 74 6f 72 3d 32 30 2e 30 3b

0a 0a 09 70 72 69 76 61 74 65 20 73 74 61 74 69

63 20 69 6e 74 20 67 65 6e 43 6f 6c 6f 72 56 61

6c 75 65 28 69 6e 74 20 6f 6c 64 56 61 6c 2c

18 of 56

Question #1: Comprehensible Compression

Interesting question #1:

  • Can we create a “comprehensible” compression algorithm that takes as input a target bitstream B, and outputs useful, readable Java code?

29,432 bits

0110100101101101...

CB: HugPlant.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

“Comprehensible” Compression

19 of 56

Question #2: Optimal Compression

Interesting question #2:

  • Can we create an optimal compression takes as input a target bitstream B, and outputs the shortest possible Java program that outputs this bitstream?

Seems plausible that this optimal program would also have structure.

  • The answer turns out to be deep!

??? bits

0110100101101101...

CB: OptimalHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

Optimal Compression

20 of 56

Optimal Compression and Kolmogorov Complexity (172 preview)

Lecture 39, CS61B, Spring 2023

Compression

  • Model 1 vs. Model 2 Review
  • Seeking Optimal Compression of the HugPlant

Optimal Compression and Kolmogorov Complexity (172 preview)

Space/Time Bounded Compression

P = NP?

“Almost Like Gods”

Does Short = Comprehensible?

21 of 56

Kolmogorov Complexity

Given a target bitstream B, what is the shortest bitstream CB that outputs B.

  • Definition: The Java-Kolmogorov complexity KJ(B) is the length of the shortest Java program (in bytes) that generates B.
    • Example: KJ(HugPlant.bmp) would be the length of OptimalHP.java.
    • There IS an answer. It just might be very hard to find.

??? bits

0110100101101101...

CB: OptimalHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

Optimal Compression

22 of 56

Kolmogorov Complexity

Given a target bitstream B, what is the shortest bitstream CB that outputs B.

  • Definition: The Java-Kolmogorov complexity KJ(B) is the length of the shortest Java program (in bytes) that generates B.
    • There IS an answer. It just might be very hard to find.

Fact #1: Kolmogorov Complexity is effectively independent of language.

  • For any bit stream, the Java-Kolmogorov Complexity is no more than a constant factor larger than the Python-Kolmogorov Complexity.
    • Why?

Java Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

??? bits

0010000001110000...

CB: OptimalHP.java

23 of 56

Kolmogorov Complexity (Language Independence)

Fact #1: Kolmogorov Complexity is effectively independent of language.

  • For any bit stream, the Java-Kolmogorov Complexity is no more than a constant factor larger than the Python-Kolmogorov Complexity.
    • Why?

Paul Hilfinger writes a Python program that is very short and uses weird Python features and I am stumped and cannot think of a similar thing in Java.

  • I could just write a Python interpreter in Java and then run Paul’s program.
    • KJ(B) ≤ KP(B) + size(python interpreter)

Java Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

0010000001110000...

CB: OptimalPythonHPAnd

PythonInterpreter.java

24 of 56

Kolmogorov Complexity (Language Independence)

Fact #1: Kolmogorov Complexity is effectively independent of language.

This is a deep fact!

  • It means that most bitstreams are fundamentally incompressible no matter what programming language we use for our compression algorithm.
  • Example: For all possible compression algorithms in all possible programming languages, a completely random sequence of 1,000,000 bits has at best, a 1 in 2499999 chance of being compressed by 50%.

Java Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

0010000001110000...

CB: OptimalPythonHPAnd

PythonInterpreter.java

25 of 56

Kolmogorov Complexity (Uncomputability)

Fact #2: It is impossible to write a program that even calculates the Kolmogorov Complexity of any bitstream. Proof available here.

  • Corollary: If we can’t even compute the length of the shortest program, it is also impossible to write the “perfect” compression algorithm.

??? bits

0110100101101101...

CB: OptimalHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

Optimal Compression

Optimal compression algorithm that works for all inputs does not exist! Leads to a logical fallacy.

26 of 56

Space/Time Bounded Compression

Lecture 39, CS61B, Spring 2023

Compression

  • Model 1 vs. Model 2 Review
  • Seeking Optimal Compression of the HugPlant

Optimal Compression and Kolmogorov Complexity (172 preview)

Space/Time Bounded Compression

P = NP?

“Almost Like Gods”

Does Short = Comprehensible?

27 of 56

Question #2: Optimal Compression

Interesting question #2:

  • Can we create an optimal compression algorithm takes as input a target bitstream B, and outputs the shortest possible Java program that outputs this bitstream?

Unfortunately the answer is no. This is not possible, even theoretically.

  • No “optimal compression” algorithm exists.

??? bits

0110100101101101...

CB: OptimalHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

Optimal Compression

28 of 56

Question #2S: Space Bounded Compression

Interesting question #2S: Can we create a compression algorithm C(B, S) that:

  • Takes two inputs:
    • A target bitstream B.
    • A size S.
  • and outputs a Java program of length ≤ S that outputs B?

Let’s call this “space bounded compression”.

≤ S bits

0110100101101101...

CB: SpaceBoundedHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

Space Bounded

Compression

S

29 of 56

Question #2S: Space Bounded Compression

Interesting question #2S: Can we create a compression algorithm C(B, S) that:

  • Takes two inputs:
    • A target bitstream B.
    • A size S.
  • and outputs a Java program of length ≤ S that outputs B?

No! Could be used to find KJ(B).

  • How? Binary search on S.
    • C(B, 500), then C(B, 250), etc.

≤ S bits

0110100101101101...

CB: SpaceBoundedHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

Space Bounded

Compression

S

30 of 56

Question #2ST: Space/Time Bounded Compression

Interesting question #2ST: Can we create a space/time bounded compression algorithm C(B, S, T) that:

  • Takes three inputs:
    • A target bitstream B.
    • A size S.
    • A maximum number of lines of bytecode executed T.
  • and outputs a Java program of length ≤ S that outputs B in fewer than T executed lines of bytecode?

≤ S bits

0110100101101101...

CB: SpaceBoundedHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

Space/Time Bounded

Compression

S

T

executes ≤ T lines of bytecode

31 of 56

Question #2ST: Space/Time Bounded Compression

Interesting question #2ST: Can we create a space/time bounded compression algorithm C(B, S, T)? Yes! And here’s an algorithm:

  • For each possible program p of length S or less:
    • If p compiles, run program p until either:
      • p terminates.
      • We output B.
      • T lines of bytecode are executed.

Runtime: O(T x 2S)

≤ S bits

0110100101101101...

CB: SpaceBoundedHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

Space/Time Bounded

Compression

S

T

executes ≤ T lines of bytecode

Treating B as a constant.

32 of 56

Question #2ST-E: Space/Time Bounded Compression

Interesting question #2ST-E: Can we create an efficient space/time bounded compression algorithm?

  • Need to make a more precise definition of what we mean by “efficient”.
  • Turns out to be closely related to an important puzzle in computer science: Does P = NP?
    • Will study carefully in 170. But let’s take a quick look.

≤ S bits

0110100101101101...

CB: SpaceBoundedHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

Efficient Space/Time Bounded

Compression

S

T

executes ≤ T lines of bytecode

33 of 56

P = NP?

Lecture 39, CS61B, Spring 2023

Compression

  • Model 1 vs. Model 2 Review
  • Seeking Optimal Compression of the HugPlant

Optimal Compression and Kolmogorov Complexity (172 preview)

Space/Time Bounded Compression

P = NP?

“Almost Like Gods”

Does Short = Comprehensible?

34 of 56

Surprising Fact

An efficient solution to any of these three problems:

  • 3SAT
  • Independent Set, a.k.a. INDSET
  • Longest Path, a.k.a. LONGEST_PATH

Would also give an efficient space/time bounded compression algorithm.

Why?

  • Space/time bounded compression reduces to 3SAT, INDSET, and LONGEST_PATH*.

*: And also tens of thousands of other related problems.

35 of 56

3SAT and Space/Time Bounded Compression

Example:

  • Let X = “Does there exist a Java program that outputs , and:
    • is of of length 20,000 or less.
    • produces this output in fewer than 1,000,000 executed lines of bytecode.”
  • X can be transformed into a longest paths problem (or a 3SAT problem or an independent set problem).

36 of 56

Visual Reduction

LONGEST_PATH can be used to solve Bounded Space/Time Compression.

  • The actual graphs to represent our problem will be phenomenally complex. See 170 if you’re curious how this reduction works.

B:

S: 20,000

T: 10,000,000

Bounded Space/Time Compression

Preprocess

LONGEST_PATH

G

path of weight > W

Postprocess

Java code of length < 20,000 bits that outputs HugPlant in 10,000,000 executed bytecode lines.

37 of 56

3SAT and Space/Time Bounded Compression

Example:

  • Let X = “Does there exist a Java program that outputs , and:
    • is of of length 20,000 or less.
    • produces this output in fewer than 1,000,000 executed lines of bytecode.”
  • X can be transformed into a longest paths problem (or a 3SAT problem or an independent set problem).

How do we know X can be turned into a longest paths problem?

  • Short answer: “It’s a problem in the complexity class NP and therefore can be reduced to any NP complete problem, including longest paths”.
    • I haven’t introduced many of the terms in this statement. We will very briefly go over them, but too quickly to make complete sense.
  • Longer answer: See CS170.

38 of 56

P = NP?

Two important classes of yes/no problems:

  • P: Efficiently solvable problems.
  • NP: Problems with solutions that are efficiently verifiable.*

Examples of problems in P:

  • Is this array sorted?
  • Does this array have duplicates?

Examples of problems in NP:

  • Is there a solution to this 3SAT problem?
  • In graph G, does there exist a path from s to t of weight > k?

*: Technically it’s problems for a which a “yes” answer is efficiently verifiable.

39 of 56

P = NP?

Two important classes of yes/no problems:

  • P: Efficiently solvable problems.
  • NP: Problems with solutions that are efficiently verifiable.*

Examples of problems not in NP:

  • Is this the best chess move I can make next?
    • Hard to verify.
  • What is the longest path?
    • Not a yes/no question.�

*: Technically it’s problems for a which a “yes” answer is efficiently verifiable.

40 of 56

Totally Shocking Fact

Every single NP problem reduces to 3SAT.

  • This includes Bounded Space/Time Compression.

In other words, any decision problem for which a yes answer can be efficiently verified can be transformed into a 3SAT problem.

  • This transformation is also “efficient” (polynomial time).

This result is by Cook (1971) and Levin (1973). See Cook-Levin Theorem for more.

41 of 56

Open Question in Computer Science

Question posed Stephen Cook in 1971: Are all NP problems also P problems?

  • In other words, are all problems with efficiently verifiable solutions also efficiently solvable?
  • Often stated as “Does P = NP?”

One reason to think yes:

  • Easy to check any given answer.
    • Maybe with the right pruning rules you can zero in on the answer?

See CS170 for a much more thorough and formal treatment of this problem.

42 of 56

“Almost Like Gods”

Lecture 39, CS61B, Spring 2023

Compression

  • Model 1 vs. Model 2 Review
  • Seeking Optimal Compression of the HugPlant

Optimal Compression and Kolmogorov Complexity (172 preview)

Space/Time Bounded Compression

P = NP?

“Almost Like Gods”

Does Short = Comprehensible?

43 of 56

P = NP?

Consensus Opinion (Bill Gasarch Poll, 2012 poll)

  • 83%: P ≠ NP (126 respondents)
  • 9%: P = NP (12 respondents)
  • 9%: Other (13 respondents)

Why is opinion generally negative?

  • Someone would have proved it by now.
    • “The only supporting arguments I can offer are the failure of all efforts to place specific NP-complete problems in P by constructing polynomial-time algorithms.” - Dick Karp
  • Creation of solutions seems philosophically more difficult than verification.

What is NP-complete?

44 of 56

NP Complete Problems

It turns out that there are tons of NP problems that all reduce to each other.

  • Solving any of these problems means that you have solved all of them.
  • These problems are known as “NP Complete” problems.
    • There are tens of thousands of them, and none have been solved.

Independent Set

Hamiltonian Cycle

Decision version of TSP

Hamiltonian Path

3-SAT

Two-Prime Integer Factorization

Purple: NP-Complete

White: NP, not known to be NP complete.

v w means

a solver for v

can solve w.

45 of 56

Fun Fact: Mathematical Proofs Are in NP!

Example of NP problem: Is there a proof that the Riemann Hypothesis is true?

  • A yes answer can be easily verified (we just need to check the proof).

If P=NP, then mathematical proof can be automated!

  • P=NP means checking a proof is roughly as easy as creating the proof.
  • First observed informally by Kurt Gödel himself.

“[A linear or quadratic-time procedure for what we now call NP complete problems would have] consequences of the greatest magnitude. [For such a procedure] would clearly indicate that, despite the unsolvability of the Entscheidungsproblem, the mental effort of the mathematician in the case of yes-or-no questions could be completely replaced by machines.” - Kurt Gödel

46 of 56

One of These Things, Is Not Like The Others

In 2000, the Clay Mathematics Institute set up $1,000,000 prizes for the solution of each of seven problems.

Millenium Prize Problems.

  • Hodge conjecture
  • Poincare conjecture (solved!)
  • Riemann hypothesis
  • Yang-Mills existence and mass gap
  • Navier-Stokes existence and smoothness
  • Birch and Swinnerton-dyer conjecture
  • P=NP
    • If true, proof might allow you to trivially solve all of these problems.

47 of 56

Question #2ST-E: Space and Time Bounded Compression

Back to our earlier quest:

  • Since Space/Time Bounded Compression can be reduced to 3SAT (or LONGEST_PATH or INDSET or any other NP complete problem), an efficient solution to any NP complete problem would act as a compression algorithm.

≤ S bits

0110100101101101...

CB: SpaceBoundedHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

Efficient Space/Time Bounded

Compression

S

T

executes ≤ T lines of bytecode

48 of 56

Even More Impressive Consequences

Even More Impressive Consequences

“I have heard it said, with a straight face, that a proof of P = NP would

be important because it would let airlines schedule their flights better, or

shipping companies pack more boxes in their trucks!

If [P = NP], then we could quickly find the smallest Boolean circuits that output (say) a table of historical stock market data, or the human genome.... It seems entirely conceivable that, by analyzing these circuits, we could make an easy fortune on Wall Street, or retrace evolution... For broadly speaking, that which we can compress we can understand, and that which we can understand we can predict.

So if we could solve the general case—if knowing something was tantamount to knowing the shortest efficient description of it—then we would be almost like gods. [Assuming P ≠ NP] is the belief that such power will be forever beyond our reach.”

- Scott Aaronson http://www.scottaaronson.com/papers/npcomplete.pdf

49 of 56

Does Short = Comprehensible?

Lecture 39, CS61B, Spring 2023

Compression

  • Model 1 vs. Model 2 Review
  • Seeking Optimal Compression of the HugPlant

Optimal Compression and Kolmogorov Complexity (172 preview)

Space/Time Bounded Compression

P = NP?

“Almost Like Gods”

Does Short = Comprehensible?

50 of 56

Back to Question #1: Comprehensible Compression

Earlier, we’d hoped for “comprehensible compression”.

Scott Aaronson earlier made an implicit conjecture that a short program will also be comprehensible.

  • This seems feasible (but not obvious) to me.
  • A short program will probably exhibit hierarchy and structure.
    • Example might even look like HugPlant.java.

29,432 bits

0110100101101101...

CB: ReadableHP.java

Interpreter

8,389,584 bits

0100001001001101...

B: HugPlant.bmp

“Comprehensible” Compression

51 of 56

Comprehensible Compression of Other Data

Scott also conjectures that if we fed in useful data about the stock market, we’d effectively get back a useful economy simulator.

  • Could read the source code to understand how the world works.

0110100101101101...

CB: EconomySimulator.java

Interpreter

01001010010100...

B: marketData.csv

“Comprehensible” Compression

All business reports by every business ever along with stock market data.

Some sort of relatively short Java program that acts as a simulator for the world economy that we can read through and understand.

52 of 56

Complexity from Simple Rules

However, there are also reasons to suspect that simplicity might not indicate comprehensibility.

Earlier, we said that compressibility was a rare thing.

  • And yet, short programs can produce surprisingly complex output.

53 of 56

Fractals

In the 1970s, Mandelbrot built demonstrations that very short programs could generate highly complex visual patterns.

  • Biological processes exploit these same ideas. Short sequences of DNA give rise to interesting spatial (and other) patterns.

54 of 56

The Mandelbrot Set

Bounded Space/Time Compression

S

T

???

55 of 56

Short (But Simple)?

...

public class Mandelbrot extends JFrame {

...

for (int y = 0; y < getHeight(); y++) {

for (int x = 0; x < getWidth(); x++) {

zx = zy = 0;

cX = (x - 400) / ZOOM;

cY = (y - 300) / ZOOM;

int iter = MAX_ITER;

while (zx * zx + zy * zy < 4 && iter > 0) {

tmp = zx * zx - zy * zy + cX;

zy = 2.0 * zx * zy + cY;

zx = tmp;

iter--;

}

I.setRGB(x, y, iter | (iter << 8));

}

}

...

}

Hard to say what we learn by looking at this code.

  • Very basic rules yield such a highly complex object.
  • … but I’m not the complexity theorist!

Bounded Space/Time Compression

S

T

56 of 56

Music from Simple Rules

This notion of complexity from simple rules is arguably more interesting (and alarming) when applied towards sound generation.

Music from very short programs (3rd iteration): Youtube

  • See this link if you want to try writing your own fractal sound generator.
  • Or you can try playing around with this javascript version that I used in my freshman seminar on generative art.
    • Fun program: return 100 * (t & (t >> 3) & (t >> 8)) % 16384;
    • See these slides for examples of interesting inputs.