2 of 47

Example Problems

Kattis: Baloni
LeetCode: Subarray Sum Equals K
USACO 2016 December Silver 2: Cities and States
LeetCode: Maximum Frequency Stack

Homework Problems

Anagram Counting
Exam Seating
Affine Names (Similar to USACO Cities and States)

3 of 47

Additional Problems

LeetCode: Find Common Characters
LeetCode: Find Words That Can Be Formed by Characters
LeetCode: Display Table of Food Orders in a Restaurant
LeetCode: Sort Characters by Frequency
LeetCode: Top k Frequent Elements
Kattis: CD
CodeForces: Constructing the Array

The homework problem Exam Seating is inspired by and similar to this problem.

CodeForces: Distinct Characters Queries
CodeForces: Polycarp's phone book

4 of 47

Kattis: Baloni

5 of 47

Kattis: Baloni

There are N (1<=N<=10^6) balloons floating, lined up from left to right, each at height H_i, where 1<=H_i<=10^6. One wants to pop all of them by shooting arrows at chosen heights. Each arrow travels from left to right. When an arrow touches a balloon, the balloon pops and disappears and the arrow continues its way at a height decreased by 1. Output the minimal number of arrows that is needed.

Input: Output:

5 3

4 5 2 1 4

6 of 47

Baloni: Illustration

5
4
3
2
1

Arrow 1

Arrow 2

Arrow 3

7 of 47

Baloni: Analysis

One brute-force approach is to simulate effect of each arrow.

Scanning from left to right, a new balloon requires a new arrow, follow the trace of this arrow, which may destroy other balloons. Repeat while there are balloons left. What is the complexity of this approach?
Simulating one arrow takes O(N) time, and we may need O(N) arrows, for a complexity of O(N^2)

How to avoid scanning through all balloons many times?

Simulate multiple arrows at the same time! How to do that efficiently?

Use a histogram to maintain the state of multiple arrows (mapping height to number of arrows travelling at that height). For each new baloon, either it can be popped, or a new arrow is needed.

8 of 47

Java Code for Baloni

public static void main(String[] args) throws IOException {

Scanner in = new Scanner(System.in);

int N = in.nextInt(), ans = 0;

int[] h = new int[N], a = new int[1000001];

for (int i=0; i<N; i++) { h[i] = in.nextInt(); }

for (int i=0; i<N; i++) {

int hb = h[i];

if (a[hb] == 0) { ans++; a[hb-1]++; }

else { a[hb]--; a[hb-1]++; }

}

System.out.println(ans);

}

10 of 47

Set

A set is a collection containing no duplicate elements
A set supports the following operations efficiently:

Testing for membership, Adding elements, Removing elements

An ordered set can order the elements in it by some order, and thus additionally support efficient

first, last, ordered traversal, etc.

Set operations:

Union, Intersection, Difference, Subset

11 of 47

Methods in The Java Set Interface

boolean contains (E e) // member test

boolean add (E e) // adding, enforces no-dupicates

boolean remove (Object o)

boolean isEmpty ()

int size ()

Iterator<E> iterator ()

boolean containsAll (Collection<E> c) // subset test

boolean addAll (Collection<E> c) // set union

boolean removeAll (Collection<E> c) // set difference

boolean retainAll (Collection<E> c) // set intersection

12 of 47

Java Classes Implementing Set

HashSet

Implemented using hashing
Methods add/remove/contains/size takes O(1) time

TreeSet

Implemented using a balanced binary search tree
Implement SortedSet

first() : Returns the first (lowest) element
last() : Returns the last (highest) element
headSet/tailSet/subSet (returns a portion of the set)

Methods add/remove/contains/first/last takes O(log N) time

13 of 47

C++ STL Classes for set

unordered_set (hash-based)

Methods such insert, erase, find, count, … take O(1) time

set (ordered)

Methods such as insert, erase, find, count, lower_bound, upper_bound take O(log N) time

14 of 47

Map

Map is a set of ordered pairs, each (key, value)
In a given Map, there are no duplicate keys
Each key is mapped to one value
Different keys can be mapped to the same value
Can be used to implement multi-set, i.e., a collection of things where one thing can appear multiple times.

An array can be viewed as a Map from a set of integer indices to a set of values

15 of 47

Some Methods of java.util.Map<K, V>

// K is the key type

// V is the value type

V get (Object key) // may return null

V put (K key, V value) // returns previous value or null

V remove (Object key) // returns previous value or null

boolean isEmpty ()

int size ()

containsKey(Object key)

Set<K> keySet()

Set<Map.Entry<K,V>> entrySet()

16 of 47

Using Map:

A helpful method: p.merge(k, v, Function) does the following

e = p.get(k);

if (e==null) {

p.put(k,v);

return v;

} else {

e=Function(e,v);

p.put(k,e);

return e;

}

17 of 47

Java Classes Implementing Map

HashTable (HashMap supporting multi-threaded access)

Not needed in competitive programming

HashMap

(Amortized) Constant, i.e., O(1), time for get/put/remove

TreeMap

Implemented SortedMap
Have firstKey(), lastKey(), ...
O(log N) time for get/put/remove

18 of 47

C++ STL classes for Map

unordered_map

We will show usages of them in sample code

19 of 47

Using Map for Histograms

We want to figure out for each value, how many time it has been added. This is a histogram.
When the element is integer of a limited range, we can use an array.
When the element can be very large, using Array takes too much space.
We can use HashMap or TreeMap.

Important methods: put(k,v), get(k), size()
For HashMap, these operations take constant time on average.
For TreeMap, put/get takes time O(log N), where N is the number of keys in the map, which is close to constant .

20 of 47

LeetCode 560 Subarray Sum Equals K

21 of 47

Example: LeetCode 560 Subarray Sum Equals K

Given an array of integers and an integer k, find the total number of continuous subarrays whose sum equals to k.

The length of the array in range [1, 20,000]. Range of numbers in the array is [-1000, 1000] and the range of k is [-1e7, 1e7].

Input: nums = [1, -1, 4, 2, -2, 2, 1], k = 6 Output: ?

Answer: 5 [1, -1, 4, 2] [4, 2] [4, 2, -2, 2] [1, -1, 4, 2, -2, 2]

[-1, 4, 2, -2, 2, 1]

22 of 47

Subarray Sum Equals K: Take One (Bruteforce)

1 public int subarraySum(int[] nums, int k) {

2 int count=0, n=nums.length;

3 for (int i=0; i<n; i++) for (int j=i+1; j<=n; j++) {

4 int subsum = 0;

5 for (int p=i; p<j; p++) subsum += nums[p];

6 if (subsum == k) count++;

7 }

8 return count;

9 }

What is the time complexity of this approach?

A: O(N) B: O(N^2) C: O(N^3)

D: O(N^2 K) E: O(N K^2)

What computation are repeated?

How to avoid them?

Main computation is lines 5, which is executed O(N^2) times (about N^2/2 times). Lines 5 is another loop that executes O(N) times (perhaps around N/3 on average). Overall complexity is thus O(N^3).

23 of 47

Analysis: LeetCode 560 Subarray Sum Equal K

The Naive algorithm is to check all subarrays A[i..j], compute their sums, and check whether it works.
There are O(N^2) pairs to check, and O(N) to compute sum, so total complexity is O(N^3).
Observation: The sum of a subarray A[range(i,j)] (all elements from A[i] to A[j-1]), equals the difference of prefix sums: S_j-S_i.
If we store all prefix sums in an array, computing subarray sums takes constant time.

24 of 47

Subarray Sum Equal K: Take Two (Prefix-sum array)

public int subarraySum(int[] nums, int k) {
int count=0, n=nums.length;
int[] psums = new int[n+1]; // prefix sum array
for (int i=0; i<n; i++) psums[i+1]=psums[i]+nums[i];
for (int i=0; i<n; i++) for (int j=i+1; j<=n; j++) {
if (/* what goes in here? */) count++;
}
return count;
}

What is the time complexity of this approach?

A: O(N) B: O(N^2) C: O(N K) D: O(N^2 K)

Line 6: (psums[j]-psums[i]==k)

Main computation is on line 6, which is executed O(N^2) times, and takes constant time. Overall complexity is thus O(N^2).

Can we speed it up further?

25 of 47

Subarray Sum Equal K: Analysis

Take-two checks sums of each of O(N^2) subarrays. Can we avoid checking each one. Let us look at an example (k=6):

num		1	-1	4	2	-2	2	1
psums	0	1	0	4	6	4	6	7
# of subarrays with sum=k ending at this index	?	?	?	?	?	?	?	?

0 0 0 0 2 0 2 1

Approach in take 2 looks through all psums[i] and counting how many of them are psums[j]-k. Can we do that faster?

26 of 47

Subarray Sum Equal K: Take Three: Prefix Histograms

public int subarraySum(int[] nums, int k) {

int count=0, rsum=0, MAXS=2000000;

int[] hist=new int[2*MAXS+1];

hist[MAXS] = 1; // hist[x] record number of i’s psums[i]=x-MAXS;

for (int x : nums) {

rsum += x;

count += hist[rsum + MAXS - k];

hist[rsum + MAXS] ++;

}

return count;

// should this code pass?

}

This code should not pass because array length is up to 20000, and each element is in [-1000,1000]. Thus MAXS should be 2*10^7 instead of 2*10^6.

Using MAXS=2*10^7 fails in leetcode. Using MAXS=2*10^6 passes.

We should not use array.

27 of 47

Take Four: Using HashMap for Histogram

Debug Time: Find the Bugs in the Code.

public int subarraySum(int[] nums, int k) {
HashMap<Integer,Integer> s=new HashMap<Integer,Integer>();
int sum = 0, count = 0;
for (int x : nums) {
sum += x;
count += s.get(k-sum);
s.put(sum, 1);
}
return count;
}

Bugs:

Before line 4, should add s.put(0,1);

Line 6 should be something like

Integer x = s.get(sum-k);

If (x != null) count += x;

Line 7 should be

s.merge(sum, 1, Integer::sum);

Time complexity is linear, because line 5-7 are executed N times, and the map operations take constant time.

28 of 47

Lesson from LeetCode 560

Any subarray sum can be computed as difference of two prefix sums
The information of how many time a prefix sum has appeared can be maintained as a histogram
How to implement a histogram depends on K

When K is small, uses an array, where R[i] is the number of prefix sums with remaining mod K equalling i
When K is large, we use a Map.

Thinking about counting how many solutions ending at an index is an important generic technique.

29 of 47

USACO 2016 December

Silver Problem 2: Cities and States

30 of 47

USACO: Cities and States

Given N cities, each has a city name and a two-letter state code. We say that city A and city B are related if the first two letters of A’s name is the same as B’s state code and vice versa, and A and B are from different states. For example:

FLINT, MI MIAMI, FL

Output the number of pairs of cities that are related.

1 ≤ N ≤ 200,000

31 of 47

Cities and States Analysis

Take one: Consider each pair of cities, check whether they are related.
What is the time complexity?

O(N^2), not efficient enough.

We have to avoid checking every pairs.
For each city, can we quickly find which cities are related to it?

32 of 47

Cities and States Analysis (Continued)

Given “FLINT, MI” what features must a city name have to be related to it?

It must be “MI..., FL”. Can we quickly find these cities?
We can use an index, using “MIFL”, using a Map.
Thus “FLINT, MI” will be put under key “FLMI”, and any name under the key “MIFL” will be related to it.

How to deal with the case that the cities must be from different states?

Do not consider cities like “XY… XY”

33 of 47

A Key Idea

Each city has two keys, it is put into the map under one key, and to look for the other city, look using the other key

34 of 47

Solution Code in Java

public static void main(String[] args) throws IOException {

// initializing in and out

int N = in.nextInt(), ans = 0;

Map<String, Integer> ind = new HashMap();

for (int i = 0; i < N; i++) {

String ct = in.next().substring(0,2), st = in.next();

if (ct.equals(st)) continue; // skip XY… XY

String key1 = ct + st, key2 = st + ct;

ans += ind.getOrDefault(key1, 0); // looking for matches

ind.merge(key2, 1, Integer::sum); // adding it

}

out.println(ans);

out.close();

}

35 of 47

Solution Code in C++11

int n, ans=0;

cin >> n;

map<string, int> d;

for (int i=0; i<n; i++) {

string cname, st, cn;

cin >> cname >> st;

cn = cname.substr(0,2);

if (cn != st) ans += d[cn+st];

d[st+cn]++;

}

cout << ans;

36 of 47

LeetCode: Maximum Frequency Stack

(Optional)

37 of 47

What is a Stack?

A Stack supports the following two operations

push(x) stores the value x
pop() removes and returns the value that is pushed most recently.

For example, if we have

push(2) push(3) push(2) push(1) pop() returns 1 pop() returns 2 push(3) pop() returns 3 pop() returns ?

38 of 47

Maximum Frequency Stack

Implement a data structure that supports the following two operations
push(int x) Add x to the data
pop()

returns and removes the most recent addition of the value that has been added the most number of times. If there is a tie, remove the value that is added most recently

For example, after push(5), push(7), push(5), push(7), push(4), push(5).
The sequence of values that we can get from pop()’s is

5	7	5	7	4	5

5, 7, 5, 4, 7, 5

39 of 47

Maximum Frequency Stack (Take One: Inefficient)

Use an dynamically sized array, e.g., ArrayList in Java and vector in C++

For each push, add to the end

Efficient!

For each pop

Scan through the array, find the last appearance of an element that appears the most number of times, remove and return that

Takes time linear in the array size

40 of 47

Maximum Frequency Stack (Take Two)

For pop(), how to quickly find the element that appeared the most?
We can use an ordered set to store information about each element, e.g., an ordered set of

[number_of_occurrences, element]
We can select the last element in the ordered set

How to support tie-breaking?

Use a counter for push (called that time)
[number_of_occurences, time of most recent addition, element]
Selecting the last element gives the right element.

41 of 47

Maximum Frequency Stack (Take Two, continued)

How to update the ordered set after pop()?

We need to find the next most recent addition time of that element.
That means we need to store all addition times of that element in another data structure.

We need two data structures:

An ordered set of triples (count, last addition time, value)
For each element, a list of added times (what data structure to use?)

After pop() of one element, find the next to last addition of the value, add a new triple to the ordered set.

42 of 47

Maximum Frequency Stack (Take Two, continued)

How to update the two data structures when a push(int x) happens?

Ordered set of triples
Map of value to list of addition times

Updating the map is easy.

Find the list, append the time

Update the set is a bit more complicated

From the map, we have all information needed to determine the element in the set corresponding to x.
We can remove the element, then add a new one.

43 of 47

Java Code

class FreqStack {

// set of [Frequency, Last Index, Value]

private TreeSet<int[]> data;

// Maps v to list of indices

private HashMap<Integer, ArrayList<Integer>> ind;

private int count = 0; // Used to order push calls

public FreqStack() {

data = new TreeSet<int[]>(comp);

ind = new HashMap<Integer, ArrayList<Integer>>();

}

44 of 47

Java Code for the Comparator

static Comparator<int[]> comp = new Comparator<int[]>() {

public int compare(int[] a, int[] b) {

for (int i=0; i<a.length; i++) {

if (a[i] < b[i]) return -1;

else if (a[i] > b[i]) return 1;

}

return 0;

}

};

45 of 47

The pop() function

public int pop() {

int[] x = data.pollLast();

// x[0]==Frequency, x[1]==Last push, x[2] = value

int v = x[2];

ArrayList<Integer> oc = ind.get(v);

oc.remove(occs.size() - 1);

int sz = occs.size();

if (sz > 0)

data.add(new int[] {sz, oc.get(sz-1), v});

return v;

}

46 of 47

The push() function

public void push(int x) {

ArrayList<Integer> oc = ind.get(x);

if (oc == null) { // First time seeing the value

oc = new ArrayList<Integer>();

ind.put(x, oc);

}

int sz = oc.size();

if (sz > 0) { // Currently non-zero occurrences

data.remove(new int[]{sz, oc.get(sz-1), x});

}

oc.add(count);

data.add(new int[]{sz+1, count, x});

count++;

}

47 of 47

Maximum Frequency Stack (Take three)

Using multiple stacks, one for each frequency

For example, after push(5a), push(7a), push(5b), push(7b), push(4a), push(5c), pop(), pop(), push (4b).

3	5c
2	5b	7b, 4b
1	5a	7a	4a