Topic 2:�Histograms, Sets, and Maps
Competitive Programming I (Fall 2020)
Ninghui Li
Example Problems
Additional Problems
Kattis: Baloni
Kattis: Baloni
There are N (1<=N<=10^6) balloons floating, lined up from left to right, each at height H_i, where 1<=H_i<=10^6. One wants to pop all of them by shooting arrows at chosen heights. Each arrow travels from left to right. When an arrow touches a balloon, the balloon pops and disappears and the arrow continues its way at a height decreased by 1. Output the minimal number of arrows that is needed.
Input: Output:
5 3
4 5 2 1 4
Baloni: Illustration
5 | | | | | |
4 | | | | | |
3 | | | | | |
2 | | | | | |
1 | | | | | |
Arrow 1
Arrow 2
Arrow 3
Baloni: Analysis
One brute-force approach is to simulate effect of each arrow.
How to avoid scanning through all balloons many times?
Simulate multiple arrows at the same time! How to do that efficiently?
Use a histogram to maintain the state of multiple arrows (mapping height to number of arrows travelling at that height). For each new baloon, either it can be popped, or a new arrow is needed.
Java Code for Baloni
public static void main(String[] args) throws IOException {
Scanner in = new Scanner(System.in);
int N = in.nextInt(), ans = 0;
int[] h = new int[N], a = new int[1000001];
for (int i=0; i<N; i++) { h[i] = in.nextInt(); }
for (int i=0; i<N; i++) {
int hb = h[i];
if (a[hb] == 0) { ans++; a[hb-1]++; }
else { a[hb]--; a[hb-1]++; }
}
System.out.println(ans);
}
Set and Map
Set
Methods in The Java Set Interface
boolean contains (E e) // member test
boolean add (E e) // adding, enforces no-dupicates
boolean remove (Object o)
boolean isEmpty ()
int size ()
Iterator<E> iterator ()
boolean containsAll (Collection<E> c) // subset test
boolean addAll (Collection<E> c) // set union
boolean removeAll (Collection<E> c) // set difference
boolean retainAll (Collection<E> c) // set intersection
Java Classes Implementing Set
HashSet
TreeSet
C++ STL Classes for set
Map
Some Methods of java.util.Map<K, V>
// K is the key type
// V is the value type
V get (Object key) // may return null
V put (K key, V value) // returns previous value or null
V remove (Object key) // returns previous value or null
boolean isEmpty ()
int size ()
containsKey(Object key)
Set<K> keySet()
Set<Map.Entry<K,V>> entrySet()
Using Map:
e = p.get(k);
if (e==null) {
p.put(k,v);
return v;
} else {
e=Function(e,v);
p.put(k,e);
return e;
}
Java Classes Implementing Map
C++ STL classes for Map
Using Map for Histograms
Example: LeetCode 560 Subarray Sum Equals K
Given an array of integers and an integer k, find the total number of continuous subarrays whose sum equals to k.
The length of the array in range [1, 20,000]. Range of numbers in the array is [-1000, 1000] and the range of k is [-1e7, 1e7].
Input: nums = [1, -1, 4, 2, -2, 2, 1], k = 6 Output: ?
Answer: 5 [1, -1, 4, 2] [4, 2] [4, 2, -2, 2] [1, -1, 4, 2, -2, 2]
[-1, 4, 2, -2, 2, 1]
Subarray Sum Equals K: Take One (Bruteforce)
1 public int subarraySum(int[] nums, int k) {
2 int count=0, n=nums.length;
3 for (int i=0; i<n; i++) for (int j=i+1; j<=n; j++) {
4 int subsum = 0;
5 for (int p=i; p<j; p++) subsum += nums[p];
6 if (subsum == k) count++;
7 }
8 return count;
9 }
What is the time complexity of this approach?
A: O(N) B: O(N^2) C: O(N^3)
D: O(N^2 K) E: O(N K^2)
What computation are repeated?
How to avoid them?
Main computation is lines 5, which is executed O(N^2) times (about N^2/2 times). Lines 5 is another loop that executes O(N) times (perhaps around N/3 on average). Overall complexity is thus O(N^3).
Analysis: LeetCode 560 Subarray Sum Equal K
Subarray Sum Equal K: Take Two (Prefix-sum array)
What is the time complexity of this approach?
A: O(N) B: O(N^2) C: O(N K) D: O(N^2 K)
Line 6: (psums[j]-psums[i]==k)
Main computation is on line 6, which is executed O(N^2) times, and takes constant time. Overall complexity is thus O(N^2).
Can we speed it up further?
Subarray Sum Equal K: Analysis
num | | 1 | -1 | 4 | 2 | -2 | 2 | 1 |
psums | 0 | 1 | 0 | 4 | 6 | 4 | 6 | 7 |
# of subarrays with sum=k ending at this index | ? | ? | ? | ? | ? | ? | ? | ? |
0 0 0 0 2 0 2 1
Approach in take 2 looks through all psums[i] and counting how many of them are psums[j]-k. Can we do that faster?
Subarray Sum Equal K: Take Three: Prefix Histograms
public int subarraySum(int[] nums, int k) {
int count=0, rsum=0, MAXS=2000000;
int[] hist=new int[2*MAXS+1];
hist[MAXS] = 1; // hist[x] record number of i’s psums[i]=x-MAXS;
for (int x : nums) {
rsum += x;
count += hist[rsum + MAXS - k];
hist[rsum + MAXS] ++;
}
return count;
// should this code pass?
}
This code should not pass because array length is up to 20000, and each element is in [-1000,1000]. Thus MAXS should be 2*10^7 instead of 2*10^6.
Using MAXS=2*10^7 fails in leetcode. Using MAXS=2*10^6 passes.
We should not use array.
Take Four: Using HashMap for Histogram
Debug Time: Find the Bugs in the Code.
Bugs:
Before line 4, should add s.put(0,1);
Line 6 should be something like
Integer x = s.get(sum-k);
If (x != null) count += x;
Line 7 should be
s.merge(sum, 1, Integer::sum);
Time complexity is linear, because line 5-7 are executed N times, and the map operations take constant time.
Lesson from LeetCode 560
Silver Problem 2: Cities and States
USACO: Cities and States
Given N cities, each has a city name and a two-letter state code. We say that city A and city B are related if the first two letters of A’s name is the same as B’s state code and vice versa, and A and B are from different states. For example:
FLINT, MI MIAMI, FL
Output the number of pairs of cities that are related.
1 ≤ N ≤ 200,000
Cities and States Analysis
Cities and States Analysis (Continued)
A Key Idea
Solution Code in Java
public static void main(String[] args) throws IOException {
// initializing in and out
int N = in.nextInt(), ans = 0;
Map<String, Integer> ind = new HashMap();
for (int i = 0; i < N; i++) {
String ct = in.next().substring(0,2), st = in.next();
if (ct.equals(st)) continue; // skip XY… XY
String key1 = ct + st, key2 = st + ct;
ans += ind.getOrDefault(key1, 0); // looking for matches
ind.merge(key2, 1, Integer::sum); // adding it
}
out.println(ans);
out.close();
}
Solution Code in C++11
int n, ans=0;
cin >> n;
map<string, int> d;
for (int i=0; i<n; i++) {
string cname, st, cn;
cin >> cname >> st;
cn = cname.substr(0,2);
if (cn != st) ans += d[cn+st];
d[st+cn]++;
}
cout << ans;
LeetCode: Maximum Frequency Stack
(Optional)
What is a Stack?
Maximum Frequency Stack
5 | 7 | 5 | 7 | 4 | 5 | |
5, 7, 5, 4, 7, 5
Maximum Frequency Stack (Take One: Inefficient)
Use an dynamically sized array, e.g., ArrayList in Java and vector in C++
Maximum Frequency Stack (Take Two)
Maximum Frequency Stack (Take Two, continued)
Maximum Frequency Stack (Take Two, continued)
Java Code
class FreqStack {
// set of [Frequency, Last Index, Value]
private TreeSet<int[]> data;
// Maps v to list of indices
private HashMap<Integer, ArrayList<Integer>> ind;
private int count = 0; // Used to order push calls
public FreqStack() {
data = new TreeSet<int[]>(comp);
ind = new HashMap<Integer, ArrayList<Integer>>();
}
Java Code for the Comparator
static Comparator<int[]> comp = new Comparator<int[]>() {
public int compare(int[] a, int[] b) {
for (int i=0; i<a.length; i++) {
if (a[i] < b[i]) return -1;
else if (a[i] > b[i]) return 1;
}
return 0;
}
};
The pop() function
public int pop() {
int[] x = data.pollLast();
// x[0]==Frequency, x[1]==Last push, x[2] = value
int v = x[2];
ArrayList<Integer> oc = ind.get(v);
oc.remove(occs.size() - 1);
int sz = occs.size();
if (sz > 0)
data.add(new int[] {sz, oc.get(sz-1), v});
return v;
}
The push() function
public void push(int x) {
ArrayList<Integer> oc = ind.get(x);
if (oc == null) { // First time seeing the value
oc = new ArrayList<Integer>();
ind.put(x, oc);
}
int sz = oc.size();
if (sz > 0) { // Currently non-zero occurrences
data.remove(new int[]{sz, oc.get(sz-1), x});
}
oc.add(count);
data.add(new int[]{sz+1, count, x});
count++;
}
Maximum Frequency Stack (Take three)
Using multiple stacks, one for each frequency
For example, after push(5a), push(7a), push(5b), push(7b), push(4a), push(5c), pop(), pop(), push (4b).
3 | 5c | | | |
2 | 5b | 7b, 4b | | |
1 | 5a | 7a | 4a | |