1 of 45

Announcements

Reminder: Project 1A is out.

  • Very strongly encouraged to work on this project in IntelliJ.
    • Having the ability to visually debug your code is incredibly useful.
    • Having your IDE yell at you about compilation errors while you are writing code is really nice to avoid issues with, for example, generics.
  • Autograder is up, but we still want you to write your own tests.
  • Tests not graded.
  • On part 1B there will be graded tests, so might be worthwhile to write tests just to save yourself some work next week.

2 of 45

CS61B, 2021

Lecture 7: Arrays and Lists

  • A Last Look at Linked Lists
  • Naive Array Lists
  • Resizing Arrays
  • Generic ALists
  • Obscurantism in Java

3 of 45

A Last Look at Linked Lists

4 of 45

Doubly Linked Lists

Behold. The state of the art as we arrived at in last week’s lecture. Through various improvements, we made all of the following operations fast:

  • addFirst, addLast
  • getFirst, getLast
  • removeFirst, removeLast
  • You will build this in project 1A.

??

3

5

size

17

next

value

prev

addLast()

getLast()

get(int i)

removeLast()

38

sentinel

5 of 45

Arbitrary Retrieval

Suppose we added get(int i), which returns the ith item from the list.

Why would get be slow for long lists compared to getLast()? For what inputs?

??

3

5

size

17

next

value

prev

addLast()

getLast()

get(int i)

removeLast()

38

sentinel

6 of 45

Arbitrary Retrieval

Suppose we added get(int i), which returns the ith item from the list.

Why would get be slow for long lists compared to getLast()? For what inputs?

  • Have to scan to desired position. Slow for any i not near the sentinel node.
  • How do we fix this?

??

3

5

size

17

next

value

prev

addLast()

getLast()

get(int i)

removeLast()

38

sentinel

7 of 45

Arbitrary Retrieval

Suppose we added get(int i), which returns the ith item from the list.

Why would get be slow for long lists compared to getLast()? For what inputs?

  • Have to scan to desired position. Slow for any i not near the sentinel node.
  • Will discuss (much later) sophisticated changes that can speed up lists.
  • For now: We’ll take a different tack: Using an array instead (no links!).

??

3

5

size

17

next

value

prev

addLast()

getLast()

get(int i)

removeLast()

38

sentinel

8 of 45

Naive Array Lists

9 of 45

Random Access in Arrays

Retrieval from any position of an array is very fast.

  • Independent* of array size.
  • 61C Preview: Ultra fast random access results from the fact that memory boxes are the same size (in bits).

10 of 45

Our Goal: AList.java

Want to figure out how to build an array version of a list:

  • In lecture we’ll only do back operations. Project 1A is the front operations.

addLast()

getLast()

get(int i)

removeLast()

???

Let’s try it out...

??

3

5

size

17

next

value

prev

addLast()

getLast()

get(int i)

removeLast()

38

sentinel

11 of 45

Naive AList Code

AList Invariants:

  • The position of the next item to be inserted is always size.
  • size is always the number of items in the AList.
  • The last item in the list is always in position size - 1.

Let’s now discuss delete operations.

public class AList {

private int[] items;

private int size;

public AList() {

items = new int[100]; size = 0;

}

public void addLast(int x) {

items[size] = x;

size += 1;

}

public int getLast() {

return items[size - 1];

}

public int get(int i) {

return items[i];

}

public int size() {

return size;

}

}

From last lecture, “things that must be true”.

12 of 45

The Abstract vs. the Concrete

When we removeLast(), which memory boxes need to change? To what?-

User’s mental model: {5, 3, 1, 7, 22, -1} → {5, 3, 1, 7, 22}

0

0

...

addLast()

getLast()

get(int i)

removeLast()

items

5

3

1

7

22

-1

0

0

6

size

0 1 2 3 4 5 6 7 97 98 99

Actual truth:

13 of 45

Deletion: yellkey.com/enter

When we removeLast(), which memory boxes need to change? To what?

addLast()

getLast()

get(int i)

removeLast()

items

5

3

1

7

22

-1

0

0

0

0

0

...

6

size

0 1 2 3 4 5 6 7 97 98 99

  1. size
  2. size and items
  3. size and items[i] for some i
  4. size, items, and items[i] for some i
  5. size, items, and items[i] for many different i
  • The position of the next item to be inserted is always size.
  • size is always the number of items in the AList.
  • The last item in the list is always in position size - 1.

AList invariants.

14 of 45

Naive AList Code

AList Invariants:

  • The position of the next item to be inserted is always size.
  • size is always the number of items in the AList.
  • The last item in the list is always in position size - 1.

public class AList {

private int[] items;

private int size;

public AList() {

items = new int[100]; size = 0;

}

public void addLast(int x) {

items[size] = x;

size += 1;

}

public int getLast() {

return items[size - 1];

}

public int get(int i) {

return items[i];

}

public int size() {

return size;

}

}

public int removeLast() {

int returnItem = items[size - 1];

items[size - 1] = 0;

size -= 1;

return returnItem;

}

Setting deleted item to zero is not necessary to preserve invariants, and thus not necessary for correctness.

15 of 45

The Mighty AList

Key Idea: Use some subset of the entries of an array.

addLast()

getLast()

get(int i)

removeLast()

items

5

3

1

7

22

-1

3

2

7

12

4

...

6

size

0 1 2 3 4 5 6 7 97 98 99

16 of 45

The Mighty (?) AList

Key Idea: Use some subset of the entries of an array.

What happens if we insert into the AList above? What should we do about it?

addLast()

getLast()

get(int i)

removeLast()

items

5

3

1

7

22

-1

5

7

-1

5

7

...

100

size

0 1 2 3 4 5 6 7 97 98 99

17 of 45

Resizing Arrays

18 of 45

Array Resizing

When the array gets too full, e.g. addLast(11), just make a new array:

addLast()

getLast()

get(int i)

removeLast()

items

5

3

1

7

22

-1

5

7

-1

5

7

...

100

size

0 1 2 3 4 5 6 7 97 98 99

size==items.length

19 of 45

Array Resizing

When the array gets too full, e.g. addLast(11), just make a new array:

  • int[] a = new int[size+1];

addLast()

getLast()

get(int i)

removeLast()

items

5

3

1

7

22

-1

5

7

-1

5

7

...

100

size

0 1 2 3 4 5 6 7 97 98 99

0

0

0

0

0

0

0

0

0

0

0

...

0 1 2 3 4 5 6 7 97 98 99 100

0

a

size==items.length

20 of 45

Array Resizing

When the array gets too full, e.g. addLast(11), just make a new array:

  • int[] a = new int[size+1];
  • System.arraycopy(...)

addLast()

getLast()

get(int i)

removeLast()

items

5

3

1

7

22

-1

5

7

-1

5

7

...

100

size

0 1 2 3 4 5 6 7 97 98 99

5

3

1

7

22

-1

5

7

-1

5

7

...

0 1 2 3 4 5 6 7 97 98 99 100

0

a

size==items.length

21 of 45

Array Resizing

When the array gets too full, e.g. addLast(11), just make a new array:

  • int[] a = new int[size+1];
  • System.arraycopy(...)
  • a[size] = 11;

addLast()

getLast()

get(int i)

removeLast()

items

5

3

1

7

22

-1

5

7

-1

5

7

...

100

size

0 1 2 3 4 5 6 7 97 98 99

5

3

1

7

22

-1

5

7

-1

5

7

...

0 1 2 3 4 5 6 7 97 98 99 100

11

a

size==items.length

22 of 45

Array Resizing

When the array gets too full, e.g. addLast(11), just make a new array:

  • int[] a = new int[size+1];
  • System.arraycopy(...)
  • a[size] = 11;
  • items = a; size +=1;

addLast()

getLast()

get(int i)

removeLast()

items

101

size

5

3

1

7

22

-1

5

7

-1

5

7

...

0 1 2 3 4 5 6 7 97 98 99

5

3

1

7

22

-1

5

7

-1

5

7

...

0 1 2 3 4 5 6 7 97 98 99 100

11

a

size==items.length

23 of 45

Array Resizing

When the array gets too full, e.g. addLast(11), just make a new array:

  • int[] a = new int[size+1];
  • System.arraycopy(...)
  • a[size] = 11;
  • items = a; size +=1;

addLast()

getLast()

get(int i)

removeLast()

items

101

size

5

3

1

7

22

-1

5

7

-1

5

7

...

0 1 2 3 4 5 6 7 97 98 99 100

11

a

size==items.length

We call this process “resizing”

24 of 45

Implementation

Let’s implement the resizing capability.

  • As usual, for those of you watching online, I recommend trying to implement this on your own before watching me do it.
  • Starter code is provided in the lists4 study guide if you want to try it out on a computer.

25 of 45

Resizing Array Code

public void addLast(int x) {

if (size == items.length) {

int[] a = new int[size + 1];

System.arraycopy(items, 0, a, 0, size);

items = a;

}

items[size] = x;

size += 1;

}

private void resize(int capacity) {

int[] a = new int[capacity];

System.arraycopy(items, 0, a, 0, size);

items = a;

}

public void addLast(int x) {

if (size == items.length) {

resize(size + 1);

}

items[size] = x;

size += 1;

}

Works

Much Better

26 of 45

Runtime and Space Usage Analysis: yellkey.com/camera

Suppose we have a full array of size 100. If we call addLast two times, how many total array memory boxes will we need to create and fill (for just these 2 calls)?

  1. 0
  2. 101
  3. 203
  4. 10,302

Bonus question: What is the maximum number of array boxes that Java will track at any given time? Assume that “garbage collection” happens immediately when all references to an object are lost.�

private void resize(int capacity) {

int[] a = new int[capacity];

System.arraycopy(items, 0, a, 0, size);

items = a;

}

public void addLast(int x) {

if (size == items.length) {

resize(size + 1);

}

items[size] = x;

size += 1;

}

27 of 45

Array Resizing

Resizing twice requires us to create and fill 203 total memory boxes.

  • Bonus answer: Most boxes at any one time is 203.
  • When the second addLast is done, we are left with 102 boxes.

5

3

1

7

22

-1

5

7

-1

5

7

...

0 1 2 3 4 5 6 7 97 98 99

5

3

1

7

22

-1

5

7

-1

5

7

...

0 1 2 3 4 5 6 7 97 98 99 100

11

5

3

1

7

22

-1

5

7

-1

5

7

...

0 1 2 3 4 5 6 7 97 98 99 100 101

11

3

101

102

100

28 of 45

Runtime and Space Usage Analysis: yellkey.com/focus

Suppose we have a full array of size 100. If we call addLast until size = 1000, roughly how many total array memory boxes will we need to create and fill?

  • 1,000
  • 500,000
  • 1,000,000
  • 500,000,000,000
  • 1,000,000,000,000

Bonus question: What is the maximum number of array boxes that Java will track at any given time? Assume that “garbage collection” happens immediately when all references to an object are lost.

private void resize(int capacity) {

int[] a = new int[capacity];

System.arraycopy(items, 0, a, 0, size);

items = a;

}

public void addLast(int x) {

if (size == items.length) {

resize(size + 1);

}

items[size] = x;

size += 1;

}

29 of 45

Runtime and Space Usage Analysis

Suppose we have a full array of size 100. If we call addLast until size = 1000, roughly how many total array memory boxes will we need to create and fill?

B. 500,000

Going from capacity 100 to 101: 101

From 101 to 102: 102

From: 999 to 1000: 1000

Total array boxes created/copied: 101 + 102 + … + 1000

Since sum of 1 + 2 + 3 + … + N = N(N+1)/2, sum(101, …, 1000) is close to 500,000.

See: http://mathandmultimedia.com/2010/09/15/sum-first-n-positive-integers

We’ll be doing a lot of this after the midterm.

private void resize(int capacity) {

int[] a = new int[capacity];

System.arraycopy(items, 0, a, 0, size);

items = a;

}

30 of 45

Resizing Slowness

Inserting 100,000 items requires roughly 5,000,000,000 new containers.

  • Computers operate at the speed of GHz (due billions of things per second).
  • No huge surprise that 100,000 items took seconds.

Note: Insert here is addFirst

31 of 45

Fixing the Resizing Performance Bug

How do we fix this?

private void resize(int capacity) {

int[] a = new int[capacity];

System.arraycopy(items, 0, a, 0, size);

items = a;

}

public void addLast(int x) {

if (size == items.length) {

resize(size + 1);

}

items[size] = x;

size += 1;

}

32 of 45

(Probably) Surprising Fact

Geometric resizing is much faster: Just how much better will have to wait.

Unusably bad.

Great performance.

This is how the Python list is implemented.

public void addLast(int x) {

if (size == items.length) {

resize(size + RFACTOR);

}

items[size] = x;

size += 1;

}

public void addLast(int x) {

if (size == items.length) {

resize(size * RFACTOR);

}

items[size] = x;

size += 1;

}

33 of 45

Performance Problem #2

Suppose we have a very rare situation occur which causes us to:

  • Insert 1,000,000,000 items.
  • Then remove 990,000,000 items.

Our data structure will execute these operations acceptably fast, but afterwards there is a problem.

  • What is the problem?

34 of 45

Memory Efficiency

An AList should not only be efficient in time, but also efficient in space.

  • Define the “usage ratio” R = size / items.length;
  • Typical solution: Half array size when R < 0.25.
  • More details in a few weeks.

Later we will consider tradeoffs between time and space efficiency for a variety of algorithms and data structures.�

5

3

1

7

0

0

0

0

0

0

0

...

0 1 2 3 4 5 6 7 97 98 99

Usage ratio: 4/100 = 0.04

addLast()

getLast()

get(int i)

removeLast()

items

4

size

35 of 45

Generic ALists

36 of 45

Generic ALists (similar to generic SLists)

bleepblrop

public class AList {

private int[] items;

private int size;

public AList() {

items = new int[8];

size = 0;

}

private void resize(int capacity) {

int[] a = new int[capacity];

System.arraycopy(items, 0,

a, 0, size);

items = a;

}

public int get(int i) {

return items[i];

}

...

public class AList<Glorp> {

private Glorp[] items;

private int size;

public AList() {

items = (Glorp []) new Object[8];

size = 0;

}

private void resize(int cap) {

Glorp[] a = (Glorp []) new Object[cap];

System.arraycopy(items, 0,

a, 0, size);

items = a;

}

public Glorp get(int i) {

return items[i];

}

...

37 of 45

Generic ALists (similar to generic SLists)

When creating an array of references to Glorps:

  • (Glorp []) new Object[cap];
  • Causes a compiler warning, which you should ignore.�

Why not just new Glorp[cap];

  • Will cause a “generic array creation” error.

public class AList<Glorp> {

private Glorp[] items;

private int size;

public AList() {

items = (Glorp []) new Object[8];

size = 0;

}

private void resize(int cap) {

Glorp[] a = (Glorp []) new Object[cap];

System.arraycopy(items, 0,

a, 0, size);

items = a;

}

public Glorp get(int i) {

return items[i];

}

...

38 of 45

Nulling Out Deleted Items

Unlike integer based ALists, we actually want to null out deleted items.

  • Java only destroys unwanted objects when the last reference has been lost.
  • Keeping references to unneeded objects is sometimes called loitering.
  • Save memory. Don’t loiter.

public Glorp deleteBack() {

Glorp returnItem = getBack();

items[size - 1] = null;

size -= 1;

return returnItem;

}

null

0 1 2 3

items

3

size

39 of 45

Loitering Example

Changing size to 2 yields a correct AList.

  • But memory is wasted storing a reference to the red sign image.

null

0 1 2 3

items

2

size

40 of 45

Loitering Example

Changing size to 2 yields a correct AList.

  • But memory is wasted storing a reference to the red sign image.

By nulling out items[2], Java is free to destroy the unneeded image from memory, which could be potentially megabytes in size.

null

null

0 1 2 3

items

2

size

41 of 45

Loitering Example

Changing size to 2 yields a correct AList.

  • But memory is wasted storing a reference to the red sign image.

By nulling out items[2], Java is free to destroy the unneeded image from memory, which could be potentially megabytes in size.

null

null

0 1 2 3

items

2

size

42 of 45

Obscurantism in Java

43 of 45

One last thought: Obscurantism in Java

We talk of “layers of abstraction” often in computer science.

  • Related concept: obscurantism. The user of a class does not and should not know how it works.

User’s mental model: {5, 3, 1, 7, 22, -1} → {5, 3, 1, 7, 22}

0

0

0

...

addLast()

getLast()

get(int i)

removeLast()

items

5

3

1

7

22

-1

0

0

5

size

0 1 2 3 4 5 6 7 97 98 99

Actual truth:

44 of 45

One last thought: Obscurantism in Java

We talk of “layers of abstraction” often in computer science.

  • Related concept: obscurantism. The user of a class does not and should not know how it works.
    • The Java language allows you to enforce this with ideas like private!
  • A good programmer obscures details from themselves, even within a class.
    • Example: addFirst and resize should be written totally independently. You should not be thinking about the details of one method while writing the other. Simply trust that the other works.
    • Breaking programming tasks down into small pieces (especially functions) helps with this greatly!
    • Through judicious use of testing, we can build confidence in these small pieces, as we’ll see in the next lecture.

45 of 45

Citations