Call the tiles which are stuck in place "fixed values", and call all other values "unfixed values". First, append large fixed values (say, 106) to either end of the array, so that every unfixed value is between two fixed values. At the end, we then subtract 2⋅106 from the answer.
Define a range to be the space between two consecutive fixed tiles al and ar.
Instead of the original scenario of a sequence of tiles, we can imagine that we have K+1 "ranges", where the ith range consists of two values mi=min(al,ar),Mi=max(al,ar), and some size leni, satisfying ∑K+1i=1leni=N−K. Our job is to allocate N−K values to the K+1 ranges, where the ith range gets a sequence of values bi,1,bi,2,…,bi,leni, such that the value we want to minimize is ∑K+1i=1∑lenij=0max(bi,j,bi,j+1) where we also define bi,0=mi, bi,leni+1=Mi as fixed values.
Claim: For a range i, it is optimal for the values to be sorted such that bi,j≤bi,j+1 for j∈[1,leni−1].
Proof: This holds from the 0−1 principle. If all numbers are 0 or 1, then clearly it is optimal for the values to be sorted in the above manner. So, it is optimal to sort even when the numbers are not 0 or 1.
More rigorously, the objective function we want to minimize can be written as ∑K+1i=1∑lenij=0∑∞v=0max(fv(bi,j),fv(bi,j+1))=∑∞v=0∑K+1i=1∑lenij=0max(fv(bi,j),fv(bi,j+1)). where fv(x)=1 if x>v, and otherwise fv(x)=0. Now, for each value of v, the inner summation is minimized when the values are sorted. So, it is optimal to sort the values in increasing order.
So, for the K=0 case, there is only one range, and we simply sort the values to compute ∑len1j=0max(b1,j,b1,j+1).
For N≤50, we can do a O(∏K+1i=1leni)≤O((NK+1)K+1) state DP. For each of the unfixed values in increasing value order, we decide which range to put it in, and place the value in the first position which is unfilled in the range. The DP state is the number of elements filled for each range so far (placed1,placed2,…,placedK+1), and the value of the state is the sum over all i of ∑placedi−1j=0max(bi,j,bi,j+1) if placedi<leni, and ∑lenij=0max(bi,j,bi,j+1) if placedi=leni.
Richard's Code for N≤50:
#include <bits/stdc++.h> using namespace std; #define sz(x) int((x).size()) int main() { cin.tie(0)->sync_with_stdio(0); int N, K; cin >> N >> K; multiset<int> multi_vals; vector<int> inp_a; for (int i = 1; i <= N; i++) { int a; cin >> a; multi_vals.insert(a); inp_a.push_back(a); } vector<pair<int, int>> xys; for (int i = 1; i <= K; i++) { int x, y; cin >> x; y = inp_a[x - 1]; xys.push_back(make_pair(x, y)); multi_vals.erase(multi_vals.find(y)); } xys.push_back(make_pair(0, 1000000)); xys.push_back(make_pair(N + 1, 1000000)); sort(begin(xys), end(xys)); vector<int> vals; for (auto u : multi_vals) vals.push_back(u); // xys and vals store the input vector<pair<pair<int, int>, int>> rangs; //ranges: ((min value, max value), size) int addon_zerosz = 0; for (int i = 0; i + 1 < sz(xys); i++) { int rang_size = xys[i + 1].first - xys[i].first - 1; if (rang_size == 0) { addon_zerosz += max(xys[i + 1].second, xys[i].second); } else { pair<int, int> rang_vals = make_pair(xys[i].second, xys[i + 1].second); if (rang_vals.first > rang_vals.second) swap(rang_vals.first, rang_vals.second); rangs.push_back(make_pair(rang_vals, rang_size)); } } map<vector<int>, int> dp; dp[vector<int>(sz(rangs), 0)] = 0; for(auto v: vals){ //for each value in increasing order map<vector<int>, int> ndp; for(const auto& u: dp){ vector<int> st = u.first; //current # of placed values in each range for(int i = 0; i < sz(rangs); i++){ //place the next value in range i if(st[i]+1 <= rangs[i].second){ vector<int> new_st = st; int new_val = u.second; new_st[i]++; if(st[i] == 0){ new_val+=max(v, rangs[i].first.first)-v; } if(st[i] == rangs[i].second-1){ new_val+=max(v, rangs[i].first.second)-rangs[i].first.second; } if(!ndp.count(new_st) || ndp[new_st] >= new_val){ ndp[new_st] = new_val; } } } } swap(dp, ndp); } assert(sz(dp) == 1); int ans = dp.begin()->second; //add in fixed values for (auto u : vals) { ans += u; } for (auto u : rangs) { ans += u.first.second; } ans += addon_zerosz; ans -= 2 * 1000000; cout << ans << "\n"; }
Now, using the sorted property, notice that in the expression ∑K+1i=1∑lenij=0max(bi,j,bi,j+1), we almost always have max(bi,j,bi,j+1)=bi,j+1. The only j where this might not be the case are j=0,leni. However, we know the value of ∑K+1i=1∑lenij=0bi,j+1 is just the sum of all non-fixed values plus ∑K+1i=1Mi.
So, it suffices to minimize ∑K+1i=1∑lenij=0max(bi,j,bi,j+1)−bi,j+1 instead. From the above observation, we know that ∑K+1i=1∑lenij=0max(bi,j,bi,j+1)−bi,j+1=∑K+1i=1max(bi,0,bi,1)−bi,1+max(bi,leni,bi,leni+1)−bi,leni+1=∑K+1i=1max(mi,bi,1)−bi,1+max(bi,leni,Mi)−Mi
For the K=1 subtask, we have 2 ranges. Notice that M1=M2=+106, so that for i=1,2, we have max(bi,leni,bi,leni+1)−bi,leni+1=Mi−Mi=0. Also, for K=1, we have both m1=m2=m as the only fixed value. So, we want to minimize ∑2i=1max(m,bi,1)−bi,1. Let the unfixed values be c1,c2,…,cN−1, in increasing value order. One of the values bi,1 is forced to be the value c1, and because max(m,bi,1)−bi,1 is a decreasing function in bi,1, it suffices to maximize the other bi,1. It is easy to prove that the maximum such value is cmax(len1,len2)+1, where we allocate a prefix of the values c1,c2,…,cN−1 to be in one range, and a suffix of the values to be in the other range.
Now, we go for the full solution. Define the bounds of a range i to be [bi,1,bi,leni]. Define the sequence of non-fixed values in increasing order as c1,c2,…,cN−K.
Claim: There is an optimal solution where for every two ranges one of the bounds contains the other, or the bounds do not intersect.
Proof: WLOG, let the two ranges be ranges 1 and 2. Let the values currently allocated to range 1 and 2 be cs1≤cs2≤⋯≤cslen1+len2 for some sequence of indices s1<s2⋯<slen1+len2, such that len1 of the values are allocated to range 1 and len2 of the values are allocated to range 2.
Now, if indices s1,slen1+len2 both belonged to the same range, then one range would contain the other. Thus, we can assume WLOG that range 1 contains index s1 and range 2 contains index slen1+len2. Also, if first len1 values corresponding to indices s1,…,slen1 all belong to range 1, then the bounds do not intersect, and we are done.
Otherwise, there is a k such that sk belongs to range 2 and sk+1 belongs to range 1. Now, if we can show that we can swap the values such that sk belongs to range 1 and sk+1 belongs to range 2, then after making as many of these repeated swaps as possible, we are left in the case where the bounds do not intersect, and we are done. This is because if we list out which range each sk belongs to as a sequence of 1s and 2s, each swap decreases the number of inversions by 1.
Notice that because s1 corresponds to b1,1 and slen1+len2 corresponds to b2,len2, sk does not correspond to b2,len2, and sk+1 does not correspond to b1,1. So, the effect of making the swap on sk,sk+1 on b1,1,b1,len1,b2,1,b2,len2 can only result in increasing b2,1 and decreasing b1,len1.
Recall that the quantity for range i that we want to minimize is max(mi,bi,1)−bi,1+max(bi,leni,Mi)−Mi, which is decreasing when we increase bi,1 and decrease bi,leni. Thus, making the swap does not decrease the summation, as desired.
Now, we can use dynamic programming, on consecutive subsegments of the array c1,c2,…,cN−K and a subset of ranges (O(N22K) states in total). The state stores the minimum sum of differences attainable if for each range in the subset we only allocate values within the contiguous subsequence to it.
Specifically, dp[L][R][sub] to be the minimum cost of assigning non-fixed values cL,cL+1,…,cR to completely fill the ranges determined by the mask sub, where do not necessarily use all of the values cL,…,cR.
First, if the number of required values to fill all ranges in sub is more than R−L+1, then we set the cost to +∞. Now, there are three types of transitions:
Clearly, we must have M≥L+size(sub1)−1, where size(sub1)=∑i∈sub1leni. Using a similar swapping argument as the proof of the previous claim, it can be shown that it is optimal for M=L+size(sub1)−1 (in other words, all of the values in the range [L,M] exactly correspond to ranges in sub1 and no other ranges.
The answer is dp[1][N−K][2# of ranges−1], plus the fixed quantity that we subtracted out earlier.
For each L,R and mask sub, we iterate over all submasks of sub, for a total time complexity of O(N23K).
Richard's Code (Full Solution):
#include <bits/stdc++.h> using namespace std; #define sz(x) int((x).size()) const int MOD = 1e9+7; int dp[305][305][1<<7]; int size_sum[1<<7]; int main() { cin.tie(0)->sync_with_stdio(0); int N, K; cin >> N >> K; multiset<int> multi_vals; vector<int> inp_a; for(int i = 1; i <= N; i++){ int a; cin >> a; multi_vals.insert(a); inp_a.push_back(a); } vector<pair<int, int>> xys; for(int i = 1; i <= K; i++){ int x, y; cin >> x; y = inp_a[x-1]; xys.push_back(make_pair(x, y)); multi_vals.erase(multi_vals.find(y)); } xys.push_back(make_pair(0, 1000000)); xys.push_back(make_pair(N+1, 1000000)); sort(begin(xys), end(xys)); vector<int> vals; for(auto u: multi_vals) vals.push_back(u); //xys and vals store the input vector<pair<pair<int, int>, int>> rangs; //((min, max), size) int addon_zerosz = 0; for(int i = 0; i+1 < sz(xys); i++){ int rang_size = xys[i+1].first-xys[i].first-1; if(rang_size == 0){ addon_zerosz+=max(xys[i+1].second, xys[i].second); } else{ pair<int, int> rang_vals = make_pair(xys[i].second, xys[i+1].second); if(rang_vals.first > rang_vals.second) swap(rang_vals.first, rang_vals.second); rangs.push_back(make_pair(rang_vals, rang_size)); } } for(int i = 0; i < (1<<sz(rangs)); i++){ for(int j = 0; j < sz(rangs); j++){ if((i>>j)&1){ size_sum[i]+=rangs[j].second; } } } for(int i = 0; i < 305; i++) for(int j = 0; j < 305; j++) for(int k = 0; k < (1<<7); k++) dp[i][j][k] = MOD; for(int L = sz(vals)-1; L >= 0; L--){ for(int R = L; R < sz(vals); R++){ for(int mask = 0; mask < (1<<sz(rangs)); mask++){ //compute dp[L][R][mask] int rang_size = R-L+1; if(size_sum[mask] > rang_size) continue; if(size_sum[mask] <= rang_size-1){ //not use either end dp[L][R][mask] = min(dp[L][R][mask], dp[L+1][R][mask]); dp[L][R][mask] = min(dp[L][R][mask], dp[L][R-1][mask]); } //split in half somewhere by some submask for(int submask = mask; ; submask = ((submask-1)&mask)){ if(submask == 0) break; if(submask == mask) continue; dp[L][R][mask] = min(dp[L][R][mask], dp[L][L+size_sum[submask]-1][submask]+dp[L+size_sum[submask]][R][mask^submask]); } for(int k = 0; k < sz(rangs); k++){ //assign L, R to be left and right of range k if((mask>>k)&1){ if(size_sum[1<<k] == 1 && L < R) continue; //compute contribution of range k to the sum int small_val = vals[L]; int big_val = vals[R]; int contrib = max(big_val, rangs[k].first.second)-rangs[k].first.second + max(small_val, rangs[k].first.first)-small_val; assert(contrib >= 0); if((1<<k) == mask){ //only thing left } else{ assert(L+1 <= R-1); contrib+=dp[L+1][R-1][mask^(1<<k)]; } dp[L][R][mask] = min(dp[L][R][mask], contrib); } } } } } int ans = dp[0][sz(vals)-1][(1<<sz(rangs))-1]; //add fixed term for(auto u: vals){ ans+=u; } for(auto u: rangs){ ans+=u.first.second; } ans+=addon_zerosz; ans-=2*1000000; cout << ans << "\n"; }