1 of 37

lovingly prepared by Conrad Soon for NUS OrcaCode © 2024

2 of 37

Motivation

  • Answering repeated range queries, possibly with updates
    • Given an array, find minimum number in range multiple times.
    • Given an array, find the sum of numbers in a range multiple times.
    • What if we want to support point updates? Change a[i] to 69 or something?
    • What if we want to support range updates? Add 2 from a[2], a[3], to a[420]?
  • Naive solution
    • O(q) queries over ranges of up to O(n) elements each -> O(nq) operations!
    • a[1] + … + a[7], a[3] + … + a[8]
    • There’s repeated computation here (a[3] + … + a[7]), can we find a better way?
  • We want some sort of range query data structure, which supports these operations as fast as possible
    • Range query
    • Point update, range update

3 of 37

In this talk:

  • Some range query data structures that you won’t see in CS2040S
    • Prefix sum (Most common to appear in LeetCodes)
    • Segment trees (no Fenwick trees, I hate them)
    • Sparse table (very unlikely to see, but fun idea)
    • Square root decomposition (very unlikely to see, but fun idea)
  • Too lazy to find good implementations, but will give you the idea on how to implement.
  • Some problems solved using these data structures
    • Directly, or through some transformation that allows us to solve it.
  • Roughly CS2040S++ difficulty?
  • For the “I am from NUSH CS/RI/HCIRS with NOI Gold” chads, hopefully my math-y formalisation of some stuff will be kinda cool.

4 of 37

Prefix sum (apparently covered by Shawn)

  • Quite a common technique for Leetcodes, for answering repeated range queries
  • Basic idea is this:
    • For an array A, compute an auxiliary array B where B[i] = A[0] + A[1] + … + A[i] in O(n) time.
    • If we are asked to calculate A[15] + A[16] + … + A[25], observe that we can take B[25] - B[14], which gives us ( A[0] + … + A[25] ) - ( A[0] + … + A[14] ) = A[15] + … + A[25]!
  • O(n) precomputation of B, which allows us to do O(1) range queries!
  • Quick question: overkill for only 1 range query?

5 of 37

Variations on a theme

  • What other operations are possible?
    • XOR?
    • Matrix addition?
    • Multiplication? What if I have zeros?
      • A bit more complex.
    • General matrix multiplication? Matrix multiplication w/ non-zero det?
    • Minimum/maximum?
  • Key idea is this: “inverse” exists for all elements
    • Technically you need associativity too, but every operation we’ll see today is associative.
    • If you are a math lover (🤓🤓🤓), you’ll know that this is a group structure

6 of 37

Prefix sum problems

  • https://cses.fi/problemset/task/1646
    • Classical prefix sum
  • https://cses.fi/problemset/task/1650
    • Prefix sum, but with different operation (XOR)
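The XOR task works with the exact same prefix idea, because XOR is its own inverse (x ^ x = 0). A minimal sketch (function names are illustrative, not from any library):

```python
# Prefix XORs: prefix[i] = arr[0] ^ arr[1] ^ ... ^ arr[i-1], computed in O(n).
def prefix_xors(arr):
    n = len(arr)
    prefix = [0] * (n + 1)
    for i in range(1, n + 1):
        prefix[i] = prefix[i - 1] ^ arr[i - 1]
    return prefix

# XOR of arr over the 1-indexed inclusive range [a, b]: the shared prefix
# cancels itself out, exactly like subtraction does for sums.
def range_xor_query(prefix, a, b):
    return prefix[b] ^ prefix[a - 1]
```

Same O(n) precomputation, O(1) per query; only the operation (and its identity, 0) changed.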

7 of 37

Reference implementation

# function to preprocess the input array and compute prefix sums
def prefix_sums(arr):
    n = len(arr)
    prefix = [0] * (n + 1)
    for i in range(1, n + 1):
        prefix[i] = prefix[i - 1] + arr[i - 1]
    return prefix

# function to handle queries using the prefix sum array
# (a and b are 1-indexed, range is inclusive [a, b])
def range_sum_query(prefix, a, b):
    return prefix[b] - prefix[a - 1]

# read input values
n, q = map(int, input().split())  # number of elements and queries
arr = list(map(int, input().split()))  # array of elements

# compute prefix sums
prefix = prefix_sums(arr)

# process each query and print the sum of values in the given range
for _ in range(q):
    a, b = map(int, input().split())
    print(range_sum_query(prefix, a, b))

8 of 37

Recap:

  • We want some sort of range query data structure, which supports these operations as fast as possible
    • Range query
    • Point update, range update
  • We have a data structure that gives us range queries for certain types of operations (ones with inverse).
  • What about updates? Updating is still O(n) for each update, assume O(q) updates, O(nq) overall!
  • Can we support more types of operations?
    • What constraints can we relax? Can we do away with requirement for inverse?

9 of 37

Segment trees

  • Good-ish time complexity for range queries? Yes, O(log n) (vs O(1) for prefix sums).
  • Better time complexity for updates? Yes, O(log n) with segment trees.
  • More types of operations? Yes, segment trees only require associativity.
    • Minimum is now possible!
  • So what actually is a segment tree?
    • Key idea: break computation down into blocks we can reuse.
  • Let’s see how we can derive one.

10 of 37

Let’s think of how to break up the calculation

  • Use inclusive left, exclusive right semantics
    • Makes calculation easier! Minimises off-by-one index errors from calculating mid-point.
  • Given already computed sums of
    • [0,16) (length 16 blocks)
    • [0,8), [8,16) (length 8 blocks)
    • [0,4), [4,8), [8,12), [12,16) (length 4 blocks)
    • [0,2), [2,4), [4,6), [6,8), [8,10), [10,12), [12,14), [14,16) (length 2 blocks)
    • [0,1), [1,2), [2,3), [3,4), [4,5), [5,6), [6,7), [7,8), [8,9), [9,10), [10,11), [11,12), [12,13), [13,14), [14,15), [15,16) (length 1 block)
  • Suppose I want [3,12), can you find a way to minimise the amount of blocks you use to compute a range?
    • You could simply do [3,4) + [4,5) + … + [11,12), but then there’s no point in maintaining such a complicated structure. Just naively add.

11 of 37

TL;DR: Please draw a pretty picture

12 of 37

Let’s phrase it algorithmically

  • Node is a struct that holds 5 values
    • L, left end of range inclusive
    • R, right end of range exclusive
    • Sum of values within the range [L, R)
    • Pointer to left child node [L, M)
    • Pointer to right child node [M, R)
  • sum(i, j, Node*) is a function that given a range [i, j), returns you the sum of all the points in the Node’s range which intersect with it.
  • If I have node N representing range [0,7) with sum a[0] + … + a[6], sum(2, 10, N) should return me the sum over [0,7) intersect [2,10), or [2,7).
  • Calling sum(i, j, [0,16)) should give me the answer to the range query!

13 of 37

Let’s phrase it algorithmically

  • sum has recursive structure!
    • sum(i, j, Node* n) = sum(i, j, leftChild) + sum(i, j, rightChild)
  • If recursive: need to think about base cases.
    • If node is fully contained within [i,j), just return the sum
    • If node is fully disjoint from [i,j) return 0.
    • Otherwise, node has intersection with the range requested, and need to recurse to children.

14 of 37

Back to the drawing board.

  • sum(i, j, Node* n) = sum(i, j, leftChild) + sum(i, j, rightChild)
  • If node is fully contained within [i,j), just return the sum
  • If node is fully disjoint from [i,j) return 0.

15 of 37

Claim: range queries in O(log n), NOT O(n)!

  • Recursion looks like it doubles each level, maybe O(n) overall?
    • However, one child is guaranteed to hit base case and return immediately.
  • At each level, it only does at most 4 operations, O(log n) levels, 4 log n operations total.
  • Informally, proof by 👀 observation, formally, proof by 🤓 induction.
  • For no insight into how the algo works at all: proof by 🤢 contradiction
    • Suppose >= 5 operations at a level
    • This means at previous level, at least 3 operations had to break into children.
    • This means requested range intersected partially with 3 ranges.
    • But the ranges at a level are disjoint and ordered: if the query partially intersects three of them, the middle one must be wholly contained in the query, so it would have hit a base case instead.
    • Contradiction!

16 of 37

Wait, how do we calculate the blocks in the first place?

  • Naively computing every single block -> O(N log N)
  • Reusing previously calculated results -> O(N)

17 of 37

Claim: updates are in log(n) now too!

  • How can we update values?
  • Since now our computation is broken up into blocks, we only have to update the blocks!
  • There are only log(n) blocks which contain the relevant sums (imagine drawing a line from the updated leaf up to the root).
  • Furthermore, updates can be efficiently propagated.
    • Recomputing a block can be done in O(1) time (update the relevant child block, and then combine the two children’s sums).
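Putting the last few slides together, here is a minimal node-based sketch (class and function names are my own, not a standard API): the constructor reuses children’s sums for the O(n) build, sum queries implement the two base cases, and point updates recombine on the way back up.

```python
class Node:
    # Segment tree node over [L, R) (inclusive-exclusive), holding the five
    # values from the slides: L, R, sum, and pointers to the two children.
    def __init__(self, L, R, arr):
        self.L, self.R = L, R
        if R - L == 1:                       # leaf: a single element
            self.sum = arr[L]
            self.left = self.right = None
        else:                                # internal: reuse children's sums -> O(n) build
            M = (L + R) // 2
            self.left = Node(L, M, arr)
            self.right = Node(M, R, arr)
            self.sum = self.left.sum + self.right.sum

def query(node, i, j):
    # Sum over [i, j) intersected with the node's range [L, R).
    if j <= node.L or node.R <= i:           # fully disjoint: return the identity
        return 0
    if i <= node.L and node.R <= j:          # fully contained: reuse the whole block
        return node.sum
    return query(node.left, i, j) + query(node.right, i, j)

def point_update(node, i, value):
    # Set a[i] = value; only the O(log n) blocks containing index i change.
    if node.R - node.L == 1:
        node.sum = value
        return
    if i < node.left.R:
        point_update(node.left, i, value)
    else:
        point_update(node.right, i, value)
    node.sum = node.left.sum + node.right.sum   # recombine children in O(1)
```

Usage: `root = Node(0, 16, arr)`, then `query(root, 3, 12)` and `point_update(root, 5, 69)`.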

18 of 37

Claim: updates are in log(n) now too!

19 of 37

Reference implementation

  • This slide is left as an exercise to the reader.
  • Backing data structure can be array or node-based tree.
    • Arrays are faster but have to allocate all nodes.
    • Node-based trees are slower but can be allocated only when needed.
  • Suppose the range is [0, 1e9): an array-based tree is impractical (billions of entries), but a node-based tree allocated on demand is possible!

20 of 37

Recap

  • We want some sort of range query data structure, which supports these operations as fast as possible
    • ✅ Range query (log n)
    • ✅ Point update (log n), ❓range update
  • More variety of operations + efficient point updates, slightly worse queries.
  • Suppose we range update naively:
    • Run point update for each point in query.
    • If O(n) range update queries of length O(n) -> O(n^2 log n)
    • This is worse than the naive O(n^2) run time!
  • Our data structure doesn’t support efficient range updates yet!
    • But it can with lazy propagation

21 of 37

Lazy propagation

  • Similar to how we broke up computation into “blocks”, let’s break up updates into “blocks”.
  • We store a variable aggAdd, indicating the “aggregate add” that should be applied to it.
    • If we want to add 1 to range [0, 8), node.aggAdd = 1.
    • We also store 1 flag, indicating whether there is a pending aggAdd.
  • Can we derive how this should work?

22 of 37

Lazy propagation

  • update(0, 8, 2)
  • query(0, 8)
  • Need to define how aggAdd interacts with sum! (sum += aggAdd * length of range)
  • query(0, 4)
  • Need to define how aggAdd is pushed down to children!
  • update(0, 2, 4)
  • Need to define how aggAdd interacts with other aggAdds!
  • query(0, 4)
  • Need to recalculate sum after an update!

23 of 37

Let’s phrase it algorithmically

  • Think about a few things:
    • How does aggAdd get applied with sum? (sum += aggAdd * length of range)
    • How does aggAdd get composed with another aggAdd from above? (aggAdd += aggAddParent)
    • How does aggAdd get split up and pushed down to children? (aggAddChild += aggAddParent, composing with any pending add the child already has)
  • Whenever you call sum() on a node:
    • If pending aggAdd, apply aggAdd on the sum.
    • Push down the aggAdd to the 2 children node.
    • Recurse on sum() as per normal.
  • Whenever you call update() on a node:
    • Similar recursive structure as sum(), with same base cases.
    • If node is fully contained within range, compose aggAdd to current node’s aggAdd.
    • If node is fully disjoint from range, do nothing.
    • Call update(leftChild) and update(rightChild).
    • Recalculate value of sum() after calling sum(leftChild) and sum(rightChild)
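The bullets above can be sketched as a self-contained lazy segment tree for range add + range sum (names are mine; instead of a separate flag, `aggAdd != 0` serves as the pending-update flag, and all values start at 0 as in the update(0, 8, 2) example):

```python
class LazyNode:
    # Segment tree over [L, R) with range add + range sum via lazy propagation.
    # Invariant: self.sum is already correct for this node's range; aggAdd is
    # pending for the children only. All array values start at 0.
    def __init__(self, L, R):
        self.L, self.R = L, R
        self.sum = 0
        self.aggAdd = 0
        if R - L > 1:
            M = (L + R) // 2
            self.left, self.right = LazyNode(L, M), LazyNode(M, R)

    def _apply(self, v):
        # aggAdd interacts with sum (sum += v * length), and composes with
        # any aggAdd already pending here (aggAdd += v).
        self.sum += v * (self.R - self.L)
        self.aggAdd += v

    def _push_down(self):
        # Hand the pending add to the two children. (Leaves never reach here:
        # a length-1 range intersecting a query is always fully contained.)
        if self.aggAdd:
            self.left._apply(self.aggAdd)
            self.right._apply(self.aggAdd)
            self.aggAdd = 0

    def update(self, i, j, v):
        # Add v to every element of [i, j).
        if j <= self.L or self.R <= i:          # fully disjoint: do nothing
            return
        if i <= self.L and self.R <= j:         # fully contained: record lazily
            self._apply(v)
            return
        self._push_down()
        self.left.update(i, j, v)
        self.right.update(i, j, v)
        self.sum = self.left.sum + self.right.sum   # recalculate after update

    def query(self, i, j):
        if j <= self.L or self.R <= i:
            return 0
        if i <= self.L and self.R <= j:
            return self.sum
        self._push_down()
        return self.left.query(i, j) + self.right.query(i, j)
```

Running the slide’s sequence: `t = LazyNode(0, 8)`, `t.update(0, 8, 2)`, then `t.query(0, 4)` forces the pending aggAdd to be pushed down exactly once.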

24 of 37

Back to the drawing board.

  • Whenever you call sum() on a node:
    • If pending operation, apply aggAdd on the sum.
    • Push down the aggAdd to the 2 children node.
    • Call sum() as per normal.
  • Whenever you call update() on a node:
    • Similar to sum(): same recursive structure; base cases are when the node is fully contained or fully disjoint.
    • Call update(leftChild) and update(rightChild).
    • Recalculate sum from sum(left), sum(right).

  • update(0, 8, 2)
  • query(0, 8)
  • query(0, 4)
  • update(0, 2, 2)
  • query(0, 4)

25 of 37

Claim: range update is log(n)

  • Similar argument as for sum(), since similar recursive structures.
  • Each level you only do at most 4 operations, therefore O(log n).
  • Additionally, aggAdd only adds O(1) to each “block” operation in sum(), so it does not affect the time complexity of sum().

26 of 37

Recap

  • We want some sort of range query data structure, which supports these operations as fast as possible
    • ✅ Range query (log n)
    • ✅ Point update (log n), ✅ range update log(n)
  • What requirements do we need?
    • No longer need existence of an inverse!
    • To define range updates:
      • aggOpp.apply(sum): How to apply aggregate operation to the sum?
      • aggOpp.compose(aggOpp): How to compose aggregate operation?
      • aggOpp.pushDown(Node): How to push down an aggregate operation to another node?
  • Yay! We have found a fantastic range query data structure!

27 of 37

Variations on a theme

  • Given an array, support these operations:
    • Range minimum query, with range update
    • Range sum query, with range update
    • XOR query, with range update
    • So on and so forth!

28 of 37

Why is it O(log n) and not O(n)?

  • Intuitively, lazy propagation “pushes” down operations.
  • But aren’t there in total 2N nodes?
  • Why is it not O(2N)?
  • Claim: amortisation
  • You only “push down” when you need it.
  • To force N push downs, you need at least N operations.
  • Spread out, this means 1 extra operation per query/update.
  • In other words, only amortised O(1) work to push it down.

29 of 37

Read more for lazy propagation

  • Segment Tree : The general concept behind Lazy Propagation - Codeforces
    • Excellent comment that dissects lazy propagation mathematically (personally I felt this helped me understand it the best because I am a setpilled mathcel)

30 of 37

(Bonus 1) Square root decomposition

  • I have literally never used this before, but I think the idea itself is cool.
  • No special structure required, gives O(sqrt n) point updates, O(sqrt n) range queries, O(n + sqrt(n)) memory usage.
  • Break each range query up into this:
    • Dangling left + block_1 + … + block_k + Dangling right
    • O(sqrt(n)) blocks with O(1) time for sum, dangling left and right calculated in O(sqrt(n)) time.
    • Overall, O(sqrt(n))
  • Total precomputation is sqrt(N) blocks * sqrt(N) operations = O(N)
    • Chunk array up into sqrt(N) blocks, each block has sqrt(N) elements.
    • For each block, calculate the total sum of the block in O(sqrt(N)).
  • Updating just recompute affected block.
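A minimal sketch of the whole scheme (class name and method names are mine): blocks of length ~sqrt(n), point updates recompute only the affected block, and queries do dangling left + whole blocks + dangling right.

```python
import math

class SqrtDecomp:
    # Chunk the array into blocks of length ~sqrt(n) and precompute each
    # block's sum: O(n) precomputation, O(n + sqrt(n)) memory.
    def __init__(self, arr):
        self.arr = list(arr)
        self.b = max(1, math.isqrt(len(arr)))           # block length
        self.blocks = [sum(self.arr[i:i + self.b])
                       for i in range(0, len(self.arr), self.b)]

    def update(self, i, value):
        # Point update: just recompute the one affected block, O(sqrt n).
        self.arr[i] = value
        k = i // self.b
        self.blocks[k] = sum(self.arr[k * self.b:(k + 1) * self.b])

    def query(self, i, j):
        # Sum over [i, j): dangling left + whole blocks + dangling right.
        total = 0
        while i < j and i % self.b != 0:                # dangling left, O(sqrt n)
            total += self.arr[i]
            i += 1
        while j - i >= self.b:                          # O(sqrt n) blocks, O(1) each
            total += self.blocks[i // self.b]
            i += self.b
        while i < j:                                    # dangling right, O(sqrt n)
            total += self.arr[i]
            i += 1
        return total
```

(The point update could even be O(1) for sums by adjusting the block total with the delta; the recompute version above matches the slide and also works for operations like min.)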

31 of 37

(Bonus 2) Sparse table

  • I have seen this used literally only once
  • Requires idempotence and commutativity, and the table is static (a point update can touch up to O(n) entries), but gives O(1) range query with O(n log n) memory usage.
  • Requires precomputation of O(n log n): for each index i, calculate [i,i+1), [i,i+2), [i,i+4), etc.
  • Key idea is this:
    • Because operations are idempotent, it doesn’t matter if you apply them twice!
    • If I want to find [4,25), combine [4, 4+16 = 20) with [25 - 16 = 9, 25)!
  • Sparse table works for range minimum but not for range sum!
    • Minimum is idempotent (min(x,min(x,y)) = min(x,y))
    • Sum is not idempotent! (sum(x,sum(x,y)) != sum(x,y))
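A minimal sketch for range minimum with [i, j) semantics (function names are mine): each table row doubles the block length by combining two halves from the row below, and a query covers [i, j) with two overlapping power-of-two blocks, exactly as in the [4, 25) example.

```python
def build_sparse(arr):
    # table[k][i] = min of arr[i : i + 2^k]; O(n log n) time and memory.
    n = len(arr)
    table = [list(arr)]
    k = 1
    while (1 << k) <= n:
        prev = table[k - 1]
        half = 1 << (k - 1)
        # Combine two adjacent 2^(k-1) blocks into one 2^k block.
        table.append([min(prev[i], prev[i + half])
                      for i in range(n - (1 << k) + 1)])
        k += 1
    return table

def range_min(table, i, j):
    # Min over [i, j): cover it with two (possibly overlapping) 2^k blocks.
    # The overlap is harmless precisely because min is idempotent.
    k = (j - i).bit_length() - 1        # largest k with 2^k <= j - i
    return min(table[k][i], table[k][j - (1 << k)])
```

For [4, 25): k = 4, so it combines table[4][4] (covering [4, 20)) with table[4][9] (covering [9, 25)).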

32 of 37

LCA

  • LCA (Lowest Common Ancestor) is a “classical” algo problem.
  • Given a tree and two nodes, find their “lowest common ancestor”.
  • A lot of funni ways to tackle: some of the common ones you’ll see are
    • Naive (iterate upwards and find the first common ancestor)
    • Binary lifting (as above, but more optimised)
  • Since this is a range query data structures talk, is there a way to solve this with range query data structures?

33 of 37

LCAs: RMQs in Disguise

  • Lowest common ancestor is secretly a range minimum query problem!
  • Do a DFS Euler tour of the tree, recording each node (and its depth) every time it is visited.
  • The LCA of two nodes is the minimum-depth node in the tour range between their first occurrences!
    • Above statements make no sense as words, Conrad go draw on white board.
  • Later on there will be a tutorial question in CS2040S about LCA:
    • Common solution most people will iterate to is O(n log n) precomp, O(log n) queries
    • Then they will wonder “wait wtf the ideal time complexity for this is O(1) query, how??”
    • Enlightened NUS OrcaCode enjoyers will know that this time complexity is achievable by viewing LCA as an RMQ and using sparse table for O(1) query.
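A sketch of the reduction (names are mine; the whiteboard version is authoritative). `min()` over the tour slice stands in for a real RMQ structure here — substituting a sparse table over the same `(depth, node)` pairs is what gets the O(1) queries.

```python
def build_euler(adj, root):
    # Euler tour: append (depth, node) when we first enter a node AND again
    # after returning from each child; first[u] is u's first position.
    euler, first = [], {}
    def dfs(u, parent, d):
        first[u] = len(euler)
        euler.append((d, u))
        for v in adj[u]:
            if v != parent:
                dfs(v, u, d + 1)
                euler.append((d, u))    # revisit u after each child
    dfs(root, -1, 0)
    return euler, first

def lca(euler, first, u, v):
    i, j = sorted((first[u], first[v]))
    # The minimum-depth entry between the first occurrences is the LCA.
    # Replace this slice-min with any RMQ structure for fast queries.
    return min(euler[i:j + 1])[1]
```

(Recursive DFS keeps the sketch short; a deep tree would want an explicit stack.)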

34 of 37

LCAs: RMQs in Disguise

35 of 37

This is what I procrastinated on to prepare this talk

  • please send help

36 of 37

For the grouptheorypilled mathcels: a summary

37 of 37

Read more

  • https://cp-algorithms.com/data_structures/segment_tree.html
    • Excellent reference website w/ good implementations
  • https://cses.fi/problemset/
    • Excellent problem set, delineated by topic (check out the Range Query section in particular)