Since a friendship group must all have the same label, let's lock in on a specific label $L$ and denote the sorted indices of the cows with label $L$ as $b_1, b_2, \ldots, b_k$ (assuming there are $k$ cows with label $L$). The number of friendship groups contributed by the label is the minimum number of disjoint subarrays that $b$ must be partitioned such that the last element of each subarray must be at most $x$ more than the first element of the subarray.
For a fixed $x$ and $L$, this can be solved with a greedy algorithm. Let's iterate over the indices in sorted order and keep two pointers $l$ and $r$, the left and right pointers respectively. We will keep incrementing $r$ until $b_r - b_l > x$. Then, we will increase the number of groups by $1$ and set $l$ to $r$. We will repeat this process until $r$ has reached the end of $b$. Over all $x$, since the sum of lengths of $b$ over all $L$ is $N$, this will take $\mathcal{O}(N^2)$ time. This is sufficient to pass subtask $1$.
For subtask $2$, there are only $10$ possible values of $L$. For each $L$, let's create an array $\mathtt{next}$ where $\mathtt{next}_i$ stores the least $j \geq i$ such that $a_j = L$ (or $N + 1$ if there is no such $j$). This array can be precomputed by looping through $a$ backwards. For a fixed $x$, we can count the number of groups in $\mathcal{O}(\frac{N}{x})$ time using the following algorithm:
pointer = first i such that a[i] = L while pointer <= n: add 1 to number of groups if pointer + x + 1 > n: break out of the loop pointer = next[pointer + x + 1]
Initially, the pointer is at the first index of first group. In every iteration, we are jumping to the start of the next group, which must be at least $x + 1$ indices forwards. In the worst case, we will jump $\frac{N}{x}$ times. Over all $x$, the total running time is $\mathcal{O}(N \log N)$ due to the harmonic series. Finally, over all $L$ this becomes $\mathcal{O}(10N \log N) \approx \mathcal{O}(N \log N)$.
For subtask $3$, the number of possible $L$ can be huge, but the length of $b$ for each $L$ is at most $10$. Observe that the maximum number of groups that can be made is $10$ because each group must have at least $1$ element. If we fix and denote the number of groups we want to make as $g$, then we want to find $l$, the minimum $x$ such that $g$ groups can be made. This can be done with binary search.
Let's create an array $c_1, c_2, \ldots, c_k$, where $c_i$ is the value of $l$ for each $i = g$. It is apparent that $c$ must be sorted in decreasing order. Essentially, $c_g$ is the barrier point of $x$ that splits $g$ groups and $g+1$ groups. We know that $1$ group can be made for $x$ in the interval $[c_1, N]$, $2$ groups for $[c_2, c_1-1]$, and so on.
The complexity for a fixed $L$ is $\mathcal{O}(10^2 \log N) \approx \mathcal{O}(\log N)$. Over all $L$, the total complexity is $\mathcal{O}(\frac{N}{10} \cdot 10^2 \log N) \approx \mathcal{O}(N \log N)$.
For full credit, make the following observation:
Inevitably, if we set $z$ to be a relatively large value, then the number of $L$ we have to process will be relatively small since they are inversely proportional. Using our subtask $2$ algorithm, the total complexity to process all such $L$ is $\mathcal{O}(\frac{N}{z} \cdot N \log N)$.
Now, collect all $L$ such that $L$ appears less than $z$ times in $a$. Using our subtask $3$ algorithm, the total complexity to process all such $L$ is $\mathcal{O}(\frac{N}{z} \cdot z^2 \log N)$.
It becomes apparent that the complexities of our subtask $2$ and $3$ algorithms are inversely proportional. We must choose a suitable value for $z$ such that the complexities for both subtasks are as equal as possible.
Finally, if we set $z \approx \sqrt{N}$, then the complexity of both subtasks is $\mathcal{O}(N \sqrt{N} \log {N})$, which is enough to receive full credit.
My C++ Code:
#include <bits/stdc++.h> using namespace std; const int N = 1e5 + 5; vector<int> lab_idx[N]; // use two pointers to manually calculate number of groups int get_groups(int label, int x) { int cnt = 0, l = 0, r = 0; auto &v = lab_idx[label]; while (l < (int)v.size()) { while (r < (int)v.size() && v[r] - v[l] <= x) { r++; } cnt++; l = r; } return cnt; } int n; vector<int> nxt; int get_groups_big(int label, int x) { int cnt = 0, i = lab_idx[label][0]; while (i < n) { cnt++; if (i + x + 1 >= n) break; i = nxt[i + x + 1]; } return cnt; } int main() { cin.tie(0)->sync_with_stdio(0); cin >> n; nxt.resize(n + 1); for (int i = 0; i < n; i++) { int c; cin >> c; lab_idx[c].push_back(i); } vector<int> small_ans(n + 1), large_ans(n + 1); int SQ = (int)sqrt(n); for (int label = 1; label <= n; label++) { auto& v = lab_idx[label]; int lab_sz = v.size(); if (lab_sz > SQ) { // Subtask 2 fill(nxt.begin(), nxt.end(), 0); for (int i : v) { nxt[i] = i; } nxt[n] = n; for (int i = n - 1; i >= 0; i--) { if (!nxt[i]) nxt[i] = nxt[i + 1]; } for (int x = 1; x <= n; x++) { large_ans[x] += get_groups_big(label, x); } } else { // Subtask 3 for (int groups = 1; groups <= lab_sz; groups++) { int l = 0, r = n + 1; while (l < r - 1) { int mid = l + (r - l) / 2; if (get_groups(label, mid) < groups) { r = mid; } else { l = mid; } } // will contribute to x in the range of [1, r) small_ans[1]++; if (r <= n) small_ans[r]--; } } } for (int i = 2; i <= n; i++) { small_ans[i] += small_ans[i - 1]; } for (int i = 1; i <= n; i++) { cout << small_ans[i] + large_ans[i] << "\n"; } }
Bonus: Solve this problem in $O(N\sqrt N)$ time using the method from this blog post.