
15295 Fall 2024 #3 Builtins -- Problem Discussion

September 11, 2024

A. Digital Multiples

It is clear that 142857 satisfies the problem's condition for every allowed n (2 <= n <= 6): its multiples 2×142857 = 285714, 3×142857 = 428571, …, 6×142857 = 857142 are all permutations of its digits. This sets an upper bound on the minimum value satisfying the condition.

We can check all numbers from 1 to 142857 by brute force. For each number x, we store its digit counts in a multiset (e.g., an unordered_map<int,int> in C++ or a dict in Python). Then we compute the digit counts for the multiples 2x, …, nx and compare for equality. Each comparison is O(1), and there are roughly 6e5 checks in the worst case, which passes easily.

The answer can also be precomputed for each n and returned in overall O(1), giving a one-line solution.
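A brute-force sketch of the search above (function names are ours; sorted digit strings stand in for the digit-count multiset, since two numbers have equal digit counts exactly when their sorted digit strings are equal):

```cpp
#include <algorithm>
#include <string>

// Sorted digit string: equal iff the digit-count multisets are equal.
std::string digits(long long v) {
    std::string s = std::to_string(v);
    std::sort(s.begin(), s.end());
    return s;
}

// Smallest x such that x, 2x, ..., n*x all use the same multiset of digits.
long long smallest_multiple_permutation(int n) {
    for (long long x = 1;; ++x) {
        std::string base = digits(x);
        bool ok = true;
        for (int m = 2; m <= n && ok; ++m)
            ok = (digits((long long)m * x) == base);
        if (ok) return x;
    }
}
```

For n = 6 this terminates at 142857, confirming the bound above.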

B. Sliding Window Median

Solution 1:

GNU PBDS contains an ordered set implementation that supports find_by_order(), which finds the k'th smallest element in O(log n). Since we need an ordered multiset, a standard workaround is to store pairs (value, index) so that duplicate values remain distinct.

From here the solution is straightforward. Just add the first window's elements to the set, then slide the window by inserting the entering element and erasing the leaving one. The median at any point is find_by_order(k/2). Over all n-k+1 windows, the total time complexity is O(n log k).
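A minimal sketch of this approach (function name ours; for odd k, index k/2 is the middle element, matching the text):

```cpp
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <functional>
#include <utility>
#include <vector>
using namespace __gnu_pbds;

// Ordered "multiset" of (value, index) pairs; the index keeps duplicates distinct.
typedef tree<std::pair<long long, int>, null_type,
             std::less<std::pair<long long, int>>,
             rb_tree_tag, tree_order_statistics_node_update> ordered_set;

// Median of every length-k window, using find_by_order(k/2).
std::vector<long long> sliding_medians(const std::vector<long long>& a, int k) {
    ordered_set s;
    std::vector<long long> res;
    for (int i = 0; i < (int)a.size(); ++i) {
        s.insert({a[i], i});                       // entering element
        if (i >= k) s.erase({a[i - k], i - k});    // leaving element
        if (i >= k - 1)
            res.push_back(s.find_by_order(k / 2)->first);
    }
    return res;
}
```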

–Anonymous

For my solution I maintained the elements of the current window i, i+1, …, i+k-1 in two sets: LEFT, containing the smallest ⌈k/2⌉ of them, and RIGHT, containing the largest ⌊k/2⌋.  (The elements are actually pairs (x,i) where x=a[i], to avoid collisions in the set.)  The median of these k elements is always the rightmost element of LEFT.

To initialize the two sets, run through positions 0 to k-1 of the array and insert the elements into LEFT.  Then sequentially move (⌊k/2⌋ times) the rightmost element of LEFT into RIGHT.

To move the window one step to the right, we have to remove (a[i],i) and add (a[i+k],i+k).  To do the remove, we see which of LEFT or RIGHT contains the pair, and erase it from that set.  To do the add, we compare the new pair with the rightmost element of LEFT to decide whether it belongs in LEFT or RIGHT.  Finally, we may have to rebalance the two sets so that their sizes satisfy the required invariant.  Then we can output the rightmost element of LEFT as the answer to the median request.  This gives an O(n log n) algorithm, assuming the sets are represented as balanced binary search trees.
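A sketch of this two-set scheme (helper name ours; it reports the rightmost element of LEFT, i.e. the lower median, for each window):

```cpp
#include <set>
#include <utility>
#include <vector>

// Two-set sliding-window median: LEFT holds the smallest ceil(k/2)
// pairs, RIGHT the largest floor(k/2). Pairs (value, index) keep
// duplicates distinct. Illustrative sketch, not the author's code.
std::vector<long long> window_medians(const std::vector<long long>& a, int k) {
    std::set<std::pair<long long, int>> left, right;
    auto rebalance = [&]() {
        while ((int)left.size() > (int)right.size() + 1) {   // LEFT too big
            right.insert(*left.rbegin());
            left.erase(std::prev(left.end()));
        }
        while (left.size() < right.size()) {                 // RIGHT too big
            left.insert(*right.begin());
            right.erase(right.begin());
        }
    };
    std::vector<long long> res;
    for (int i = 0; i < (int)a.size(); ++i) {
        std::pair<long long, int> p{a[i], i};
        if (left.empty() || p <= *left.rbegin()) left.insert(p);
        else right.insert(p);
        if (i >= k) {                                        // remove leaving pair
            std::pair<long long, int> q{a[i - k], i - k};
            if (left.count(q)) left.erase(q); else right.erase(q);
        }
        rebalance();
        if (i >= k - 1) res.push_back(left.rbegin()->first); // rightmost of LEFT
    }
    return res;
}
```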

–Danny

C. 2047

Every number that ever exists in this system is an odd number times a power of two.  And when you combine two such numbers, the odd number does not change.  This immediately implies that we can partition this into a totally separate problem for each odd number.

We preprocess the input as follows.  Suppose we have 2 copies of 10 and 3 copies of 5.  Measured in units of the odd part 5, the two 10s count as 4 copies, so altogether we have 4 + 3 = 7 copies of 5.  We can combine these together in pairs.  Note that 7 = 111 in binary.  So we end up with a 5, a 10 and a 20 after the process finishes.  The number of piles is the number of 1 bits in the binary representation of 7, namely 3.

So we use a hash table where the key is an odd number and the value is the number of copies of that odd number we have, weighted by powers of two: each input x = odd · 2^e contributes 2^e copies of its odd part.  For each odd number that occurs, we count the 1 bits in its total, and we sum these counts over all the odd numbers.  This is our answer.
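This popcount-over-odd-parts idea can be sketched as follows (function name ours; weighted counts are assumed to fit in 64 bits):

```cpp
#include <unordered_map>
#include <vector>

// Minimum number of piles after repeatedly merging equal pairs (x, x) -> 2x.
long long min_piles(const std::vector<long long>& a) {
    std::unordered_map<long long, unsigned long long> units; // odd part -> weighted count
    for (long long x : a) {
        long long odd = x;
        unsigned long long w = 1;              // x = odd * 2^e contributes 2^e units
        while (odd % 2 == 0) { odd /= 2; w *= 2; }
        units[odd] += w;
    }
    long long piles = 0;
    for (auto& [odd, cnt] : units)
        piles += __builtin_popcountll(cnt);    // one pile per set bit
    return piles;
}
```

On the example above, {10, 10, 5, 5, 5} gives 7 units of odd part 5, hence popcount(7) = 3 piles.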

–Danny

Solution 2:

For any value x that appears s times, we can greedily merge pairs into floor(s / 2) copies of 2x, leaving s mod 2 copies of x; this minimizes the number of remaining occurrences of x. This process repeats at most log(s) times per value, and the maximum value produced is bounded by x * 2^log(s) = x * s. We simulate this greedily, going from the smallest x to the largest and merging at each step, maintaining the counts in a data structure like a BST. The answer is the number of elements that remain after all iterations. This takes O(n log n log s_max).
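A sketch of this simulation (function name ours; std::map plays the role of the BST, visiting values in increasing order):

```cpp
#include <map>
#include <vector>

// Simulate the greedy merge: at each step, take the smallest remaining
// value x with count s, merge floor(s/2) pairs into 2x, and keep the
// s mod 2 leftover copy of x as a final pile.
long long simulate_merges(const std::vector<long long>& a) {
    std::map<long long, long long> cnt;
    for (long long x : a) ++cnt[x];
    long long piles = 0;
    while (!cnt.empty()) {
        auto [x, s] = *cnt.begin();          // smallest value first
        cnt.erase(cnt.begin());
        if (s / 2 > 0) cnt[2 * x] += s / 2;  // merged pairs become 2x
        piles += s % 2;                      // unmerged leftover stays a pile
    }
    return piles;
}
```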

–Dustin

D. Maximum Sum Subarrays

The problem can be solved using dynamic programming. Let dp[idx][state1][state2] represent the maximum possible sum when considering the first idx indices, where the current state of the first array is state1 and the current state of the second array is state2.

The states (0 <= state1, state2 <= 2) are defined as:

0: The contiguous subarray hasn't started yet.

1: The contiguous subarray has started, but isn't closed.

2: The contiguous subarray has started and is closed.

If both subarrays are open at a given index (both states are 1), it does not matter whether we swap the elements at this index, as we simply add both a[idx] and b[idx] to the answer.

If only one of the subarrays is open, we take the maximum between a[idx] and b[idx] for the current opened subarray.

You can transition from one state to another at the same index with no additional cost. After calculating the answers for all possible states (state1, state2) where 0 <= state1, state2 <= 2, you can use prefix maximums to propagate the best possible value. Specifically, you can update:

dp[idx][state1][state2] = max(dp[idx][state1][0..state2], dp[idx][0..state1][state2]).
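A compact sketch of this DP (function name and input format are our assumptions; the zero-cost state advances are folded into the cross-index step, and both subarrays are required to be nonempty):

```cpp
#include <algorithm>
#include <vector>

// 9-state DP: state 0 = subarray not started, 1 = open, 2 = closed,
// one state per array. Illustrative sketch of the scheme above.
long long two_subarray_dp(const std::vector<long long>& a,
                          const std::vector<long long>& b) {
    const long long NEG = -(1LL << 60);
    // Allowed previous-index states for each current state.
    static const std::vector<int> pred[3] = {{0}, {0, 1}, {1, 2}};
    long long dp[3][3], nxt[3][3];
    for (auto& row : dp) std::fill(row, row + 3, NEG);
    dp[0][0] = 0;
    for (size_t i = 0; i < a.size(); ++i) {
        for (int s1 = 0; s1 < 3; ++s1)
            for (int s2 = 0; s2 < 3; ++s2) {
                long long best = NEG;
                for (int p1 : pred[s1])
                    for (int p2 : pred[s2])
                        best = std::max(best, dp[p1][p2]);
                long long gain = 0;
                if (s1 == 1 && s2 == 1) gain = a[i] + b[i];               // both open
                else if (s1 == 1 || s2 == 1) gain = std::max(a[i], b[i]); // one open
                nxt[s1][s2] = (best == NEG ? NEG : best + gain);
            }
        for (int s1 = 0; s1 < 3; ++s1)
            for (int s2 = 0; s2 < 3; ++s2) dp[s1][s2] = nxt[s1][s2];
    }
    long long ans = NEG;
    for (int s1 = 1; s1 < 3; ++s1)      // both subarrays must have started
        for (int s2 = 1; s2 < 3; ++s2) ans = std::max(ans, dp[s1][s2]);
    return ans;
}
```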

- Omar Ahmed

This could also be solved greedily. We first swap the elements so a[i] >= b[i] for every i. Then, the optimal solution must be equal to one of these two cases:

  1. Max subarray sum of a + max subarray sum of b
  2. Max sum of two (disjoint) subarrays of a

The key observation is that if a subarray is in the optimal solution, then every element in it must be the greater element at its index, unless both are selected. If the two subarrays are disjoint, the solution fits into case 2 once we swap the subarray in b over to a. If not, we can swap according to our rule, so the set of indices selected for b is a subset of the indices selected for a, and the solution is at most case 1.

Then, both problems can be solved using the classic dp algorithm.
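The two cases can be sketched as follows (function names ours; kadane is the classic max-subarray DP, and two_disjoint uses prefix/suffix best-subarray arrays):

```cpp
#include <algorithm>
#include <vector>

// Kadane: maximum sum of a nonempty contiguous subarray.
long long kadane(const std::vector<long long>& v) {
    long long best = v[0], cur = v[0];
    for (size_t i = 1; i < v.size(); ++i) {
        cur = std::max(v[i], cur + v[i]);
        best = std::max(best, cur);
    }
    return best;
}

// Case 2: maximum sum of two disjoint nonempty subarrays of v.
long long two_disjoint(const std::vector<long long>& v) {
    size_t n = v.size();
    std::vector<long long> pre(n), suf(n);
    long long cur = 0;
    for (size_t i = 0; i < n; ++i) {       // pre[i]: best subarray ending at or before i
        cur = std::max(v[i], cur + v[i]);
        pre[i] = (i ? std::max(pre[i - 1], cur) : cur);
    }
    cur = 0;
    for (size_t i = n; i-- > 0; ) {        // suf[i]: best subarray starting at or after i
        cur = std::max(v[i], cur + v[i]);
        suf[i] = (i + 1 < n ? std::max(suf[i + 1], cur) : cur);
    }
    long long best = pre[0] + suf[1];
    for (size_t i = 1; i + 1 < n; ++i) best = std::max(best, pre[i] + suf[i + 1]);
    return best;
}

long long greedy_answer(std::vector<long long> a, std::vector<long long> b) {
    for (size_t i = 0; i < a.size(); ++i)
        if (a[i] < b[i]) std::swap(a[i], b[i]);   // ensure a[i] >= b[i]
    return std::max(kadane(a) + kadane(b),        // case 1
                    two_disjoint(a));             // case 2
}
```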

– Yan Pan

E. Watching Fireworks is Fun

Let f[i][j] (1 <= i <= m, 1 <= j <= n) be the maximum happiness at time t_i if you end at position j. The transitions are defined as f[i][j] = b_i - |a_i - j| + max(f[i-1][k]), where k ranges from j - d * (t_i - t_{i-1}) to j + d * (t_i - t_{i-1}), inclusive. This represents the set of positions at time t_{i-1} from which position j is reachable by time t_i. The answer is max(f[m][j]) over all j. Naively, these transitions take O(n^2 m) time to evaluate.

This can be optimized using a range-max query data structure, such as a segment tree or a monotonic max-queue, which work in O(log n) and amortized O(1), respectively (multisets are too slow to fit in TL). This reduces the time complexity to O(nm log n) or O(nm). It may also be necessary to reduce memory consumption by storing only f[i-1][...] and f[i][...] at step i, reducing memory complexity to O(n).
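The O(1)-per-position transition is a sliding-window maximum over f[i-1][...]; a minimal sketch of that primitive with a monotonic deque (function name ours; flip the comparisons for a minimization variant):

```cpp
#include <algorithm>
#include <deque>
#include <vector>

// out[j] = max of in[l..r] over the window [j - w, j + w], clamped to [0, n).
// Monotonic deque: indices whose values are strictly decreasing front to back.
std::vector<long long> window_max(const std::vector<long long>& in, long long w) {
    int n = in.size();
    std::vector<long long> out(n);
    std::deque<int> dq;
    int r = 0;                                 // next index to push
    for (int j = 0; j < n; ++j) {
        long long hi = std::min<long long>(n - 1, j + w);
        while (r <= hi) {                      // extend window on the right
            while (!dq.empty() && in[dq.back()] <= in[r]) dq.pop_back();
            dq.push_back(r++);
        }
        long long lo = std::max<long long>(0, j - w);
        while (dq.front() < lo) dq.pop_front(); // drop indices left of window
        out[j] = in[dq.front()];
    }
    return out;
}
```

At step i of the fireworks DP, one such pass with w = d * (t_i - t_{i-1}) yields all the needed maxima in O(n).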

– Dustin

F. Coin Collecting

There are two different observations here: The first is that we can get rid of one of the three types of coins. In particular, set a' = a-c, b' = b-c, c' = 0 for all people and add c to your final answer for each person. This is like saying that each person will definitely give you c coins, and then you have to account for the difference.

The second idea is binary search. If we knew exactly which people to select for coins of type a or b, the deciding criterion would be a threshold k: if b-a > k, we select b; if b-a < k, we select a; and if b-a = k, we choose either type depending on the restrictions on x and y. We can then binary search over k, deciding each time whether our value is too big or too small.

–GusterBuster27

That explanation is pretty opaque.  Let me try to explain it.  Let X be the set of people picked for their gold coins, and Y and Z be the sets for silver and bronze. The first observation is that when you do the replacement of (a,b,c) by (a-c, b-c,0) for each triple, then the sets that achieve the maximum number of coins are the same before and after the replacement.  This is because if we let f(X,Y,Z) be the value of that solution in the original problem and f’(X,Y,Z) be the value of that solution in the modified problem, then we know f and f’ are related as follows:

f(X,Y,Z) = c*n + f’(X,Y,Z)

i.e. the difference in these two values is independent of the sets X,Y, and Z.

Now, think about an easier problem.  Suppose that we had no constraints on the sizes of X and Y, and only had a constraint on the size of Z.  Namely that |Z| = z.  Call this the “unrestricted problem”.   How could we solve the problem?  It’s easy.  There’s a simple greedy solution.  Sort the people (largest to smallest) based on the value of max(a,b).  Then clearly we’ll want to pick the first n-z of those to be in the X and Y sets, and the remaining ones to be in Z.  And the set X will contain the ones with a≥b, and the set Y will contain those with a≤b.

Now, suppose, just by luck, that when we did this it turned out that |X|=x and |Y|=y.  We would then know that this is the optimal solution.  Because the solution is the optimum one over a bigger space of possible solutions (ones where |X| and |Y| are unrestricted).  So it must also be the optimum solution over the small subset of solutions in which |X|=x and |Y|=y.

There’s a way to force this to happen!

Suppose we transform every pair (a,b) into (a-s,b) for some constant s.  Again letting f and f’ be the solution values for these two problems, and assuming that |X|=x, we have

f(X,Y,Z) = x*s + f’(X,Y,Z)

In other words the optimum solution (the sets X,Y,Z) does not change when the constant s is subtracted from all the a values.

Notice that in the unrestricted problem if we vary s, it will change the size of the sets X and Y.  If we increase s, it will decrease |X|, and increase |Y|.  So this means we can do a binary search over s to find a value such that |X|=x and |Y|=y in the unrestricted problem.  And when we find this value, the sets X and Y must be optimal solutions to the original problem.  Voilà!
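The per-s greedy for the unrestricted problem can be sketched like this (function name, input format, and the tie-break toward b are our assumptions; an outer binary search over s then looks for |X| = x, and the original value is recovered as f = x*s + f', plus c*n from the first transformation):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Unrestricted greedy for a fixed shift s: each person kept out of Z
// contributes max(a_i - s, b_i); keep the n - z largest contributions.
// Returns (total value f', |X|), where X holds kept people with a_i - s > b_i.
std::pair<long long, int> unrestricted(const std::vector<long long>& a,
                                       const std::vector<long long>& b,
                                       int z, long long s) {
    int n = a.size();
    std::vector<int> idx(n);
    for (int i = 0; i < n; ++i) idx[i] = i;
    std::sort(idx.begin(), idx.end(), [&](int i, int j) {
        return std::max(a[i] - s, b[i]) > std::max(a[j] - s, b[j]);
    });
    long long value = 0;
    int x_size = 0;
    for (int t = 0; t < n - z; ++t) {
        int i = idx[t];
        if (a[i] - s > b[i]) { value += a[i] - s; ++x_size; }  // goes to X
        else                 { value += b[i]; }                // goes to Y
    }
    return {value, x_size};
}
```

Ties at b - a = s need careful handling in a full solution, since several people may sit exactly on the threshold at the critical s.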

PS: It can also be set up as a min-cost flow problem. Although this would have to be done with care in order to run in time.  –Danny Sleator

G. Evaluate Expression