1 of 48

Lecture 20: Graph Modeling

CSE 373: Data Structures and Algorithms

CSE 373 21 SP – CHAMPION

1

2 of 48

Announcements

  • Exercise 6 now due Tuesday (next Tuesday) at 11:59PM
  • P4 due next Wednesday at 11:59PM
  • final exam out next Wednesday, due Saturday at NOON (not midnight!!)
    • no questions can be asked on Ed / OH about final exam
      • only clarifying questions but nothing conceptual

that’s it.

3 of 48

Warm Up

CSE 373 SP 18 - KASEY CHAMPION

3

1

6

3

weight = 1

4

2

10

5

7

0

9

8

11

15

13

weight = 8

14

12

17

16

18

weight = 10

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

-1

-1

1

2

2

2

1

6

7

7

6

-1

11

12

12

11

15

15

17

Store (weight * -1)

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

-1

-10

1

2

2

2

1

6

7

7

6

-8

11

12

12

11

15

15

17

Each “node” now only takes 4 bytes of memory instead of 32

Fill in the array with the correct values representing this Disjoint Set forest

Use the indices that correspond with the values

tinyurl.com/Summer373L20

4 of 48

DP ClimbingStairs() Question - Recap

You are climbing a staircase. It takes n steps to reach the top.

Each time you can either climb 1 or 2 steps. In how many distinct ways can you climb to the top?

How can we solve this problem in the most optimized way?

4

5 of 48

Implement - Java Code

public int climbStairs(int n) {

int secondLast = 1, last = 1, current = 1;

if (n == 1)

return last;

for (int i = 2; i <= n; i++) {

current = secondLast + last;

secondLast = last;

last = current;

}

return last;

}

5

6 of 48

Using Arrays for Up-Trees

  • Since every node can have at most one parent, what if we use an array to store the parent relationships?
  • Proposal: each node corresponds to an index, where we store the index of the parent (or –1 for roots). Use the root index as the representative ID!
  • Just like with heaps, tree picture still conceptually correct, but exists in our minds!

Aileen (2)

Santino

Paul (4)

Joyce (0)

Ken

Sam

Alex

0

1

2

3

4

5

6

-1

0

-1

6

-1

2

0

Joyce

Sam

Aileen

Alex

Paul

Santino

Ken

7 of 48

Using Arrays: Find

  • Initial jump to element still done with extra Map
  • But traversing up the tree can be done purely within the array!

0

1

2

3

4

5

6

-1

0

-1

6

-1

2

0

Joyce

Sam

Aileen

Alex

Paul

Santino

Ken

Aileen (2)

Santino

Paul (4)

Joyce (0)

Ken

Sam

Alex

Alex

Aileen

Sam

find(A):

index = jump to A node’s index

while array[index] > 0:

index = array[index]

path compression

return index

1

2

find(Alex)

1

2

= 0

  • Can still do path compression by setting all indices along the way to the root index!

0

3

3

8 of 48

Using Arrays: Union

  • For WeightedQuickUnion, we need to store the number of nodes in each tree (the weight)
  • Instead of just storing -1 to indicate a root, we can store -1 * weight!

0

1

2

3

4

5

6

-4

0

-2

6

-1

2

0

Joyce

Sam

Aileen

Alex

Paul

Santino

Ken

Aileen (2)

Santino

Joyce (0)

Ken

Sam

Alex

union(A, B):

rootA = find(A)

rootB = find(B)

use -1 * array[rootA] and -1 *

array[rootB] to determine weights

put lighter root under heavier root

weight 4

weight 2

union(Ken, Santino)

Paul (4)

weight 1

9 of 48

Using Arrays: Union

  • For WeightedQuickUnion, we need to store the number of nodes in each tree (the weight)
  • Instead of just storing -1 to indicate a root, we can store -1 * weight!

0

1

2

3

4

5

6

-4

0

-2

6

-1

2

0

Joyce

Sam

Aileen

Alex

Paul

Santino

Ken

union(A, B):

rootA = find(A)

rootB = find(B)

use -1 * array[rootA] and -1 *

array[rootB] to determine weights

put lighter root under heavier root

-6

0

Aileen (2)

Santino

Joyce (0)

Ken

Sam

Alex

weight 6

Paul (4)

weight 1

Aileen

union(Ken, Santino)

10 of 48

Practice

CSE 373 SP 18 - KASEY CHAMPION

10

3

0

weight = 1

4

11

1

5

2

13

12

weight = 8

10

9

14

15

8

weight = 6

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

weight = 2

6

7

16

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

3

0

0

-6

3

-1

-2

6

12

13

13

0

13

-8

12

12

12

union(2, 16)

11 of 48

Practice

CSE 373 SP 18 - KASEY CHAMPION

11

3

0

weight = 1

4

11

1

5

2

13

12

weight = 8

10

9

14

15

8

weight = 6

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

weight = 2

6

7

16

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

3

0

3

13

3

-1

-2

6

12

13

13

0

13

-8

12

12

13

union(2, 16)

findSet(2) with path compression

findSet(16) with path compression

union(3,13) by weight

12 of 48

Using Arrays for WQU+PC

  • Same asymptotic runtime as using tree nodes, but check out all these other benefits:
    • More compact in memory
    • Better spatial locality, leading to better constant factors from cache usage
    • Simplify the implementation!

(Baseline)

QuickFind

QuickUnion

WeightedQuickUnion

WQU + Path Compression

ArrayWQU+PC

makeSet(value)

𝚯(1)

𝚯(1)

𝚯(1)

𝚯(1)

𝚯(1)

𝚯(1)

find(value)

𝚯(n)

𝚯(1)

𝚯(n)

𝚯(log n)

𝚯(log* n)

𝚯(log* n)

union(x, y)

assuming root args

𝚯(n)

𝚯(n)

𝚯(1)

𝚯(1)

𝚯(1)

𝚯(1)

union(x, y)

𝚯(n)

𝚯(n)

𝚯(n)

𝚯(log n)

𝚯(log* n)

𝚯(log* n)

13 of 48

Implementing Dijkstra’s

14 of 48

Review Dijkstra’s Algorithm: Key Properties

  • Once a vertex is marked known, its shortest path is known
    • Can reconstruct path by following back-pointers (in edgeTo map)

  • While a vertex is not known, another shorter path might be found
    • We call this update relaxing the distance because it only ever shortens the current best path

  • Going through closest vertices first lets us confidently say no shorter path will be found once known
    • Because not possible to find a shorter path that uses a farther vertex we’ll consider later�

dijkstraShortestPath(G graph, V start)

Set known; Map edgeTo, distTo;

initialize distTo with all nodes mapped to ∞, except start to 0

while (there are unknown vertices):

let u be the closest unknown vertex

known.add(u)

for each edge (u,v) to unknown v with weight w:

oldDist = distTo.get(v) // previous best path to v

newDist = distTo.get(u) + w // what if we went through u?

if (newDist < oldDist):

distTo.put(v, newDist)

edgeTo.put(v, u)�

15 of 48

Review Why Does Dijkstra’s Work?

  • Similar “First Try Phenomenon” to BFS

  • How can we be sure we won’t find a shorter path to X later?
    • Key Intuition: Dijkstra’s works because:
    • IF we always add the closest vertices to “known” first,
    • THEN by the time a vertex is added, any possible relaxing has happened and the path we know is always the shortest!

X

KNOWN

8??

3

1

A

1

5

6??

Example:

  • We’re about to add X to the known set
  • But how can we be sure we won’t later find a path through some node A that is shorter to X?
  • Because if we could, Dijkstra’s would explore A first

Dijkstra’s Algorithm Invariant

All vertices in the “known” set have the correct shortest path

INVARIANT

16 of 48

Review Why Does Dijkstra’s Work?

  • Similar “First Try Phenomenon” to BFS

  • How can we be sure we won’t find a shorter path to X later?
    • Key Intuition: Dijkstra’s works because:
    • IF we always add the closest vertices to “known” first,
    • THEN by the time a vertex is added, any possible relaxing has happened and the path we know is always the shortest!

Dijkstra’s Algorithm Invariant

All vertices in the “known” set have the correct shortest path

INVARIANT

X

KNOWN

7??

3

1

A

1

5

6

Example:

  • We’re about to add X to the known set
  • But how can we be sure we won’t later find a path through some node A that is shorter to X?
  • Because if we could, Dijkstra’s would explore A first

17 of 48

Implementing Dijkstra’s

  • How do we implement “let u be the closest unknown vertex”?�

dijkstraShortestPath(G graph, V start)

Set known; Map edgeTo, distTo;

initialize distTo with all nodes mapped to ∞, except start to 0

while (there are unknown vertices):

let u be the closest unknown vertex

known.add(u)

for each edge (u,v) to unknown v with weight w:

oldDist = distTo.get(v) // previous best path to v

newDist = distTo.get(u) + w // what if we went through u?

if (newDist < oldDist):

distTo.put(u, newDist)

edgeTo.put(u, v)�

  • Would sure be convenient to store vertices in a structure that…
    • Gives them each a distance “priority” value
    • Makes it fast to grab the one with the smallest distance
    • Lets us update that distance as we discover new, better paths�

MIN PRIORITY QUEUE ADT

18 of 48

Implementing Dijkstra’s: Pseudocode

  • Use a MinPriorityQueue to keep track of the perimeter
    • Don’t need to track entire graph
    • Don’t need separate “known” set – implicit in PQ (we’ll never try to update a “known” vertex)
  • This pseudocode is much closer to what you’ll implement in P4
    • However, still some details for you to figure out!
    • e.g. how to initialize distTo with all nodes mapped to
    • Spec will describe some optimizations for you to make

dijkstraShortestPath(G graph, V start)

Map edgeTo, distTo;

initialize distTo with all nodes mapped to ∞, except start to 0

PriorityQueue<V> perimeter; perimeter.add(start);

while (!perimeter.isEmpty()):

u = perimeter.removeMin()

for each edge (u,v) to v with weight w:

oldDist = distTo.get(v) // previous best path to v

newDist = distTo.get(u) + w // what if we went through u?

if (newDist < oldDist):

distTo.put(v, newDist)

edgeTo.put(v, u)

if (perimeter.contains(v)):

perimeter.changePriority(v, newDist)

else:

perimeter.add(v, newDist)

19 of 48

Dijkstra’s Runtime

dijkstraShortestPath(G graph, V start)

Map edgeTo, distTo;

initialize distTo with all nodes mapped to ∞, except start to 0

PriorityQueue<V> perimeter; perimeter.add(start);

while (!perimeter.isEmpty()):

u = perimeter.removeMin()

for each edge (u,v) to v with weight w:

oldDist = distTo.get(v) // previous best path to v

newDist = distTo.get(u) + w // what if we went through u?

if (newDist < oldDist):

distTo.put(v, newDist)

edgeTo.put(v, u)

if (perimeter.contains(v)):

perimeter.changePriority(v, newDist)

else:

perimeter.add(v, newDist)

 

 

 

 

 

 

 

 

 

20 of 48

Dijkstra’s Runtime

dijkstraShortestPath(G graph, V start)

Map edgeTo, distTo;

initialize distTo with all nodes mapped to ∞, except start to 0

PriorityQueue<V> perimeter; perimeter.add(start);

while (!perimeter.isEmpty()):

u = perimeter.removeMin()

for each edge (u,v) to v with weight w:

oldDist = distTo.get(v) // previous best path to v

newDist = distTo.get(u) + w // what if we went through u?

if (newDist < oldDist):

distTo.put(v, newDist)

edgeTo.put(v, u)

if (perimeter.contains(v)):

perimeter.changePriority(v, newDist)

else:

perimeter.add(v, newDist)

 

 

 

 

 

 

 

 

 

 

Final result:

Why can’t we simplify further?

  • We don’t know if |V| or |E| is going to be larger, so we don’t know which term will dominate.

  • Sometimes we assume |E| is larger than |V|, so |E|log|V| dominates. But not always true!

21 of 48

Topological Sort

22 of 48

Topological Sort

  • A topological sort of a directed graph G is�an ordering of the nodes, where for every edge in the graph, the origin appears before the destination in the ordering

  • Intuition: a “dependency graph”
    • An edge (u, v) means u must happen before v
    • A topological sort of a dependency graph gives an ordering that respects dependencies�
  • Applications:
    • Graduating
    • Compiling multiple Java files
    • Multi-job Workflows

A

B

C

A before C

B before C

A before B

A

B

C

Topological Sort:

With original edges for reference:

A

B

C

Input:

23 of 48

Can We Always Topo Sort a Graph?

  • Can you topologically sort this graph?�����
  • What’s the difference between this graph and our first graph?����
  • A graph has a topological ordering if it is a DAG
    • But a DAG can have multiple orderings

CSE 143

CSE 373

CSE 417

🤔 Where do I start?

Where do I end? 🤔

MATH 126

CSE 142

CSE 143

CSE 373

CSE 374

CSE 417

No 😭

DIRECTED ACYCLIC GRAPH

  • A directed graph without any cycles
  • Edges may or may not be weighted

24 of 48

Problem 1: Ordering Dependencies

  • Today’s (first) problem: Given a bunch of courses with prerequisites, find an order to take the courses in.

CSE 373 SP 18 - KASEY CHAMPION

24

Math 126

CSE 142

CSE 143

CSE 373

CSE 374

CSE 417

25 of 48

Problem 1: Ordering Dependencies

  • Given a directed graph G, where we have an edge from u to v if u must happen before v.
  • We can only do things one at a time, can we find an order that respects dependencies?

CSE 373 19 SP - KASEY CHAMPION

25

Given: a directed graph G

Find: an ordering of the vertices so all edges go from left to right (all the dependency arrows are satisfied and the vertices can be processed left to right with no problems) .

Topological Sort (aka Topological Ordering)

26 of 48

Ordering a DAG

  • Does this graph have a topological ordering? If so find one.

CSE 373 19 WI - KASEY CHAMPION

26

A

B

C

E

D

If a vertex doesn’t have any edges going into it, we can add it to the ordering.

More generally, if the only incoming edges are from vertices already in the ordering, it’s safe to add.

0

1

2

1

1

A

C

B

D

E

0

1

0

0

0

27 of 48

Topological Ordering

  • A course prerequisite chart and a possible topological ordering.

CSE 373 19 SP - KASEY CHAMPION

27

Math 126

CSE 142

CSE 143

CSE 373

CSE 374

CSE 417

Math 126

CSE 142

CSE 143

CSE 373

CSE 374

CSE 417

28 of 48

Reductions

29 of 48

Reductions

  • A reduction is a problem-solving strategy that involves using an algorithm for problem Q to solve a different problem P
    • Rather than modifying the algorithm for Q, we modify the inputs/outputs to make them compatible with Q!
    • “P reduces to Q”

1. Convert input for P into input for Q

2. Solve using algorithm for Q

3. Convert output from Q into output from P

Q INPUT

P INPUT

Q OUTPUT

P OUTPUT

PROBLEM Q

PROBLEM P

30 of 48

Reductions

  • Example: I want to get a note to my friend in Chicago, but walking all the way there is a difficult problem to solve ☹
    • Instead, reduce the “get a note to Chicago” problem to the “mail a letter” problem!

1. Place note inside of envelope

2. Mail using US Postal Service

3. Take note out of envelope

Q INPUT

Q OUTPUT

Mail a letter

Get a note to Chicago

Chicago

Chicago

Seattle

Seattle

31 of 48

How To Perform Topo Sort?

  • If we add a phantom “start” vertex pointing to other starts, we could use BFS!

0

1

3

4

7

2

5

6

👻

0

1

3

4

7

2

5

6

👻

0

1

3

4

7

2

5

6

1

2

3

4

5

6

7

0

1

2

3

4

5

6

7

0

👻

BFS

Sweet sweet victory 😎

Reduce topo sort to BFS by modifying graph, running BFS, then modifying output back

Performing Topo Sort

32 of 48

Checking for Duplicates

containsDuplicates(array) {

for (int i = 0; i < array.length; i++):

for (int j = i; j < array.length; j++):

if (array[i] == array[j]):

return true

return false

}

0

1

2

3

4

2

4

8

3

8

33 of 48

Goal of a Reduction

0

1

2

3

4

2

4

8

3

8

Goal: Reduce the problem of “Contains Duplicates?” to another problem we have an algorithm for.

Try to identify each of the following:

  1. How will you convert the “Contains Duplicates?” input?
  2. What algorithm will you apply?
  3. How will you convert the algorithm’s output?

Q INPUT

P INPUT

Q OUTPUT

P OUTPUT

PROBLEM Q

PROBLEM P

34 of 48

One Solution: Sorting!

Array

Array

Sorted Array

Boolean

Sorting

Contains Duplicates?

One Solution: Reduce “Contains Duplicates?” to the problem of sorting an array

  • We know several algorithms that solve this problem quickly!

 

 

35 of 48

Graph Modeling Review

36 of 48

Recap: Graph Modeling

SCENARIO

&

QUESTION TO ANSWER

ANSWER!

MODEL AS A GRAPH

RUN ALGORITHM

  • Choose vertices
  • Choose edges
  • Directed/Undirected
  • Weighted/Unweighted
  • Cyclic/Acyclic

  • Just visit every node?
    • BFS or DFS
  • s-t Connectivity?
    • BFS or DFS
  • Unweighted shortest path?
    • BFS
  • Weighted shortest path?
    • Dijkstra’s
  • Minimum Spanning Tree?
    • Prim’s or Kruskal’s

Often need to refine original model as you work through details of algorithm

Many ways to model any scenario with a graph, but question motivates which data is important

37 of 48

Graph Modeling Activity

Note Passing - Part I

Imagine you are an American High School student. You have a very important note to pass to your crush, but the two of you do not share a class so you need to rely on a chain of friends to pass the note along for you. A note can only be passed from one student to another when they share a class, meaning when two students have the same teacher during the same class period.

 

Unfortunately, the school administration is not as romantic as you, and passing notes is against the rules. If a teacher sees a note, they will take it and destroy it. Figure out if there is a sequence of handoffs to enable you to get your note to your crush.�

How could you model this situation as a graph?

 

Period 1

Period 2

Period 3

Period 4

You

Smith

Patel

Lee

Brown

Anika

Smith

Lee

Martinez

Brown

Bao

Brown

Patel

Martinez

Smith

Carla

Martinez

Jones

Brown

Smith

Dan

Lee

Lee

Brown

Patel

Crush

Martinez

Brown

Smith

Patel

38 of 48

Possible Design

  • Vertices
    • Students
    • Fields: Name, have note
  • Edges
    • Classes shared by students
    • Not directed
    • Could be left without weights
    • Fields: vertex 1, vertex 2, teacher, period

You

Anika

Carla

Bao

Dan

Crush

Smith, 1

Martinez, 1

Patel, 2

Lee, 2

Martinez, 3

Brown, 3

Smith, 4

Patel, 4

You

A

B

C

D

Crush

A

B

B

D

You

A

C

You

D

Crush

B

C

Crush

A

C

D

  • Adjacency List
  • Algorithm
  • BFS or DFS to see if you and your Crush are connected

39 of 48

More Design

Note Passing - Part II

Now that you know there exists a way to get your note to your crush, we can work on picking the best hand off path possible.

Thought Experiments:

  1. What if you want to optimize for time to get your crush the note as early in the day as possible?
    • How can we use our knowledge of which period students share to calculate for time knowing that period 1 is earliest in the day and period 4 is later in the day?
    • How can we account for the possibility that it might take more than a single school day to deliver the note?

2. What if you want to optimize for rick avoidance to make sure your note only gets passed in classes least likely for it to get intercepted?

- Some teachers are better at intercepting notes than others. The more notes a teacher has intercepted, the more likely it is they will take yours and it will never get to your crush. If we knew how many notes each teacher has intercepted how might we incorporate that into our graph to find the least risky route?

40 of 48

Optimize for Time

You

Anika

Carla

Bao

Dan

Crush

1

1

2

2

3

3

4

4

  1. Add the period number to each edge as its weight
  2. Run Dijkstra’s from You to Crush

Vertex

Distance

Predecessor

Process Order

You

0

--

0

Anika

1

You

1

Bao

2

You

5

Carla

6

Dan

3

Dan

3

Anika

2

Crush

7

Carla

4*

*The path found wraps around to a new school day because the path moves from a later period to an earlier one

- We can change our algorithm to check for wrap arounds and try other routes

“Distance” will represent the sum of which periods the note is passed in, because smaller period values are earlier in the day the smaller the sum the earlier the note gets there except in the case of a “wrap around”

41 of 48

Optimize for Risk

You

Anika

Carla

Bao

Dan

Crush

1

3

2

4

3

5

1

4

  1. Add the number of letters intercepted by the teacher to each edge as its weight

  1. Run Dijkstra’s from You to Crush

Vertex

Distance

Predecessor

Process Order

You

0

--

0

Anika

1

You

1

Bao

4

Anika

2

Carla

5

Bao

3

Dan

10

Carla

5

Crush

8

Carla

4

Teacher

Notes Intercepted

Smith

1

Martinez

3

Lee

4

Brown

5

Patel

2

“Distance” will represent the sum of notes intercepted across the teachers in your passing route. The smaller the sum of notes the “safer” the path.

42 of 48

Seam Carving

43 of 48

Content-Aware Image Resizing

Seam carving: A distortion-free technique for resizing an image by removing “unimportant seams”

Seam carving for content-aware image resizing (Avidan, Shamir/ACM); Broadway Tower (Newton2, Yummifruitbat/Wikimedia)

Original Photo

Horizontally-Scaled�(castle and person�are distorted)

Seam-Carved�(castle and person are undistorted; “unimportant” sky removed instead)

44 of 48

44

45 of 48

Seam Carving Reduces to Dijkstra’s!

  1. Transform the input so that it can be solved by the standard algorithm
    • Formulate the image as a graph
      • Vertices: pixel in the image
      • Edges: connects a pixel to its 3 downward neighbors
      • Edge Weights: the “energy” (visual difference) between adjacent pixels�
  2. Run the standard algorithm as-is on the transformed input
    • Run Dijkstra’s to find the shortest path (sum of weights) from top row to bottom row�
  3. Transform the output of the algorithm to solve the original problem
    • Interpret the path as a removable “seam” of unimportant pixels

Shortest Paths (Robert Sedgewick, Kevin Wayne/Princeton)

1.5

1.0

1.6

58.2

120.9

greater pixel difference = higher weight!

46 of 48

An Incomplete Reduction

  • Complication:
    • Dijkstra’s starts with a single vertex S and ends with a single vertex T
    • This problem specifies sets of vertices for the start and end�
  • Question to think about: how would you transform this graph into something Dijkstra’s knows how to operate on?

Shortest Paths (Robert Sedgewick, Kevin Wayne/Princeton)

S

T

47 of 48

In Conclusion

  • Topo Sort is a widely applicable “sorting” algorithm
  • Reductions are an essential tool in your CS toolbox -- you’re probably already doing them without putting a name to it

  • Many more reductions than we can cover!
    • Shortest Path in DAG with Negative Edges reduces to Topological Sort! (Link)
    • 2-Color Graph Coloring reduces to 2-SAT (Link)
    • Staying on top of the end of the quarter in this course reduces to starting early on P4 and EX4/5

Q INPUT

P INPUT

Q OUTPUT

P OUTPUT

PROBLEM Q

PROBLEM P

48 of 48

Appendix