1 of 54

Hanzhi Wang

The University of Melbourne

Sublinear-Time Algorithms for Random-Walk Probability Estimation on Large Graphs

11 December 2025

0

2 of 54

Outline

Background

2

Contributions and Techniques

4

Problem Definitions

3

Overview

1

Future Work

5

1

3 of 54

Outline

Background

2

Techniques

4

Problem and Contributions

3

Overview

1

Future Work

5

2

4 of 54

Background

V.S.

    • Acquiring more servers.
    • Cons:

1. Hardware Advances

2. Algorithm Improvements

    • Developing efficient algorithms
    • Pros:

How to Process Data at a Massive Scale?

Trend of Electricity Consumption

enormous costs,

huge energy consumption

cost-saving,

energy-efficient

3

5 of 54

Background

Performance gains due to the algorithmic improvement of Maximum Subarray problem

MIT Computer Science & Artificial Intelligence Laboratory:

  • Performance gains due to algorithmic improvements have exceeded those due to hardware advances.

Hardware

Advances

The impact of algorithmic improvement for Polynomial Factorization

4

6 of 54

Exponential Time:

Polynomial Time:

Linear Time:

Inefficient

Efficient

Inefficient

Efficient

Classical Notion.

The Massive Data Volume Challenges

the Classical Notion of Efficient Algorithms

Background

Big Data Notion

5

7 of 54

 

 

Background

In the Early Stage

Contemporary Internet

 

The Massive Data Volume Challenges

the Classical Notion of Efficient Algorithms

completed in 3 years using 100 servers!

completed in 10 s with a single server

6

8 of 54

Designing Scalable Algorithms for P Problems

Motivations

Class P

Class NP

Class S

Solvable

in Polynomial Time

Capable

to handle big inputs

(nearly linear or sub-linear)

Verifiable

in Polynomial Time

Focus: Sublinear-Time Algorithm for Random-Walk Probability

Estimation on Large Graphs

7

9 of 54

Outline

Background

2

Contributions and Techniques

4

Problem Definitions

3

Overview

1

Future Work

5

8

10 of 54

Graphs

Social Network

Protein Structure

Trade Network

Transportation Network

 

 

a node

an edge

9

11 of 54

Random Walks on Graphs

1

4

3

2

5

6

7

9

10

8

11

12

 

 

 

 

 

 

 

 

10

12 of 54

 

 

1

4

3

2

5

6

7

9

10

8

11

12

 

 

 

 

11

13 of 54

PageRank

 

 

  • Uniformly Sampling a Source Node …

1

4

3

2

5

6

7

9

10

8

11

12

 

 

 

1

4

3

2

5

6

7

9

10

11

12

8

Sampled!

12

14 of 54

Personalized PageRank (PPR)

1

4

3

2

5

6

7

9

10

8

11

12

 

 

 

 

 

Given a

Source Node

 

 

PageRank

PPR

13

15 of 54

Applications

Graph Theory and Network Analysis

  • Graph Centrality Evaluation [FOCS’18, SICOMP’23]
  • Friend Recommendation (Twiter Who-to-Follow) [WWW’13]
  • Local Graph Clustering [FOCS’06]
  • Graph Neural Networks [NeurIPS’20]

neural

network

14

16 of 54

Outline

Background

2

Contributions and Techniques

4

Problem Definitions

3

Overview

1

Future Work

5

15

17 of 54

Query Types

Single-Node Random-Walk Probability Computations

Single-Target Random-Walk Probability Computations

Single-Source Random-Walk Probability Computations

Random-Walk Meeting Probability Computations

16

18 of 54

Query Types

Given a

Target Node

 

Goal:

Single-Node Random-Walk Probability Computations

Single-Target Random-Walk Probability Computations

Single-Source Random-Walk Probability Computations

Random-Walk Meeting Probability Computations

 

Uniformly Sampled Source Node

 

17

19 of 54

Query Types

Given a

Target Node

 

Single-Node Random-Walk Probability Computations

Single-Target Random-Walk Probability Computations

Single-Source Random-Walk Probability Computations

Random-Walk Meeting Probability Computations

 

18

20 of 54

Query Types

 

 

Single-Node Random-Walk Probability Computations

Single-Target Random-Walk Probability Computations

Single-Source Random-Walk Probability Computations

Random-Walk Meeting Probability Computations

 

19

21 of 54

Query Types

 

Single-Node Random-Walk Probability Computations

Single-Target Random-Walk Probability Computations

Single-Source Random-Walk Probability Computations

Random-Walk Meeting Probability Computations

Meet!

20

22 of 54

Outline

Background

2

Contributions and Techniques

4

Problem Definitions

3

Overview

1

Future Work

5

21

23 of 54

Contributions Summary

 

[SIGMOD’20, VLDBJ]

Meeting Prob.

 

 

 

Single-Source

[VLDB’22]

All are first-authored papers or listed alphabetically following the TCS convention.

Single-Target

[STOC’24, KDD’21, KDD’20]

Single-Node

[SODA’26, STOC’24, KDD’24, VLDB’23]

22

24 of 54

Contributions Summary

 

[SIGMOD’20, VLDBJ]

Meeting Prob.

 

 

 

Single-Source

[VLDB’22]

All are first-authored papers or listed alphabetically following the TCS convention.

Single-Target

[STOC’24, KDD’21, KDD’20]

Single-Node

[SODA’26, STOC’24, KDD’24, VLDB’23]

23

25 of 54

Contributions Summary

 

 

Single-Target

[STOC’24, KDD’21, KDD’20]

All are first-authored papers or listed alphabetically following the TCS convention.

Single-Node

[SODA’26, STOC’24, KDD’24, VLDB’23]

Node Push

Edge Push

Power Method

Monte Carlo

24

26 of 54

The Power Method [WWW’98]

 

PageRank/PPR vector

Initial Distribution

Damping Factor

Transition Matrix

 

 

 

 

 

25

27 of 54

The Power Method [WWW’98]

 

 

 

 

 

 

26

28 of 54

Techniques

Node Push

Edge Push

Power Method

 

Single-Target

[STOC’24, KDD’21, KDD’20]

Single-Node

[SODA’26, STOC’24, KDD’24, VLDB’23]

 

Monte Carlo

27

29 of 54

Problem 1: Single-Node PageRank Estimation

Given a

Target Node

 

Goal:

 

Uniformly

Sampled!

 

 

28

30 of 54

The Monte-Carlo Method [InternetMath ’05]

1

4

3

2

5

6

7

9

10

8

11

12

 

 

 

 

 

 

 

 

 

 

 

 

1

4

3

2

5

6

7

9

10

11

12

8

Sampled!

Sampled!

Sampled!

  • Uniformly Sampling a Source Node …
  • Uniformly Sampling a Source Node …
  • Uniformly Sampling a Source Node …

 

 

 

 

29

31 of 54

Challenges: Single-Node PageRank Estimation

 

 

 

 

 

 

30

32 of 54

Our Contributions:

Single-Node PageRank on Undirected Graphs

Hanzhi Wang: Revisiting Local PageRank Estimation on Undirected Graphs: Simple and Optimal. [KDD 2024]

 

 

 

 

 

 

KDD’14, WAW’15

Ours, KDD’24

 

Monte Carlo

Optimal !

 

Symmetry of Random-Walk Prob. on Undirected Graphs:

 

 

 

31

33 of 54

Challenges: Single-Node PageRank on Directed Graphs

 

 

 

 

 

How about directed graphs?

 

32

34 of 54

Techniques

Node Push

Edge Push

Power Method

Monte Carlo

 

 

Single-Target

[STOC’24, KDD’21, KDD’20]

Single-Node

[SODA’26, STOC’24, KDD’24, VLDB’23]

 

33

35 of 54

Node Push [WAW’07]

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

34

36 of 54

Push + Monte Carlo [WSDM’16]

 

 

 

 

 

 

 

Pushing probability mass along in-edges

 

 

 

35

37 of 54

Push + Monte Carlo [WSDM’16]

 

 

Push

Balance

 

 

 

 

 

 

 

 

 

 

 

 

 

 

36

38 of 54

Our Contributions

Single-Node PageRank Estimation on Directed Graphs

 

 

WSDM 2016

SICOMP 2023

STOC 2024 (Ours)

 

 

 

 

 

Upper Bound

Upper Bound

Upper Bound

Upper Bound

Lower Bound

Lower Bound

Lower Bound

FOCS 2018

 

Lower Bound

 

 

 

Hanzhi Wang, Zhewei Wei, Ji-Rong Wen, Mingji Yang, Revisiting Local Computation of PageRank: Simple and Optimal. [STOC 2024]

37

39 of 54

 

 

 

Our Contributions

 

 

Upper Bound

Lower Bound

 

 

 

 

 

SODA 2026 (Ours)

STOC 2024

 

Mikkel Thorup, Hanzhi Wang, Zhewei Wei, Mingji Yang, PageRank Centrality in Directed Graphs with Bounded In-Degree. [SODA’26]

 

 

 

38

40 of 54

Techniques

Node Push

Edge Push

Power Method

Monte Carlo

 

Single-Target

[STOC’24, KDD’21, KDD’20]

Single-Node

[SODA’26, STOC’24, KDD’24, VLDB’23]

 

39

41 of 54

Problem 2: Single-Target PPR Estimation

Given a

Target Node

 

 

 

40

42 of 54

Open Problem

 

Upper Bound

Lower Bound

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

41

43 of 54

Our Contributions

Optimal Algorithm for Single-Target PPR Estimation

Negative Answer to the Open Problem!

Ours, STOC 2024

 

Optimal under Graph Access Model

Hanzhi Wang, Zhewei Wei, Ji-Rong Wen, Mingji Yang, Revisiting Local Computation of PageRank: Simple and Optimal. [STOC 2024]

WAW 2007

 

 

 

Upper

Bound

Lower

Bound

42

44 of 54

Hard Instances

 

 

 

 

 

 

 

 

43

45 of 54

Techniques

Node Push

Edge Push

Power Method

Monte Carlo

 

Single-Target

[STOC’24, KDD’21, KDD’20]

Single-Node

[SODA’26, STOC’24, KDD’24, VLDB’23]

 

44

46 of 54

Our Contributions

 

 

 

 

 

 

 

 

 

45

47 of 54

Our Contributions

Optimal Algorithm for Single-Target PPR Estimation

Hanzhi Wang, Zhewei Wei et al., Personalized PageRank to a Target Node, Revisited. [KDD 2020]

Positive Answer to the Open Problem!

WAW 2007

 

Ours, KDD 2020

 

Optimal under Degree-Sorted Model

 

Upper

Bound

Lower

Bound

 

46

48 of 54

Techniques

Node Push

Edge Push

Power Method

Monte Carlo

 

 

Single-Target

[STOC’24, KDD’21, KDD’20]

Single-Node

[SODA’26, STOC’24, KDD’24, VLDB’23]

 

47

49 of 54

Outline

Background

2

Contributions and Techniques

4

Problem Definitions

3

Overview

1

Future Work

5

48

50 of 54

Future Work

Single-Pair Random-Walk Probability Estimation

 

Upper Bound

 

Lower Bound

Given a Source Node

 

 

Goal:

Given a

Target Node

 

Solved

49

51 of 54

Future Work

Dynamic Random-Walk Probability Estimation

 

 

add an edge

  • recompute
  • update

Random-Walk Probability after Update:

no easy-to-apply error guarantees

50

52 of 54

Future Work

 

 

 

v.s.

51

53 of 54

Future Work

Scalable

Algorithms

Dynamic

Algorithms

Practical

Algorithms

52

54 of 54

Thanks!

53