Partial Credit: $N,Q\le 5000$
To solve this subtask, we need to be able to answer each query in $O(N)$ time. Define $S=a_1+a_2+\cdots +a_N$, and suppose that the candidate we are considering is $d$.
This immediately leads to an $O(NQ)$ solution.
Danny Mittal's code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.StringTokenizer;
public class SleepingInClassHarder {
public static void main(String[] args) throws IOException {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
int n = Integer.parseInt(in.readLine());
long[] times = new long[n];
long sum = 0;
StringTokenizer tokenizer = new StringTokenizer(in.readLine());
for (int j = 0; j < n; j++) {
times[j] = Long.parseLong(tokenizer.nextToken());
sum += times[j];
}
StringBuilder out = new StringBuilder();
for (int q = Integer.parseInt(in.readLine()); q > 0; q--) {
long targetNumber = Long.parseLong(in.readLine());
if (sum % targetNumber == 0L) {
long res = ((long) n) + (sum / targetNumber);
long curr = 0;
for (long time : times) {
curr += time;
if (curr % targetNumber == 0L) {
res -= 2L;
}
}
out.append(res);
} else {
out.append(-1);
}
out.append('\n');
}
System.out.print(out);
}
}
Full Solution: We need a faster way to compute $x_d$.
Define $q_i=\gcd(p_i,S)$. If $d|S$, then $d|p_i\Leftrightarrow d|q_i$, so we can ignore all the $p_i$ and instead count the number of $i$ such that $d|q_i$. Each $q_i$ is a factor of $S$. Knowing the number of times each factor occurs is enough to answer all queries, so we first count them.
Let $\sigma_0(S)$ denote the number of factors of $S$. The maximum number of factors that can occur within the input constraints is achieved by the largest highly composite number not exceeding $10^{18}$. This is
which has $103680$ factors (see OEIS).
In the worst case, there are $O(\sigma_0(S))$ distinct nontrivial queries. If we can answer each query naively in $O(\sigma_0(S))$, this part by itself will take $O(\sigma_0(S)^2)$, which is too slow.
A faster way is to observe that if we arrange the factors in a multidimensional array, with a dimension for each prime factor, answering the queries from the frequencies is just computing multidimensional prefix sums.
Using inclusion-exclusion, we can compute each prefix sum in $O(2^{\omega(S)})$, where $\omega(S)$ counts the number of distinct prime factors of $S$ (not to be confused with the $\omega$ from complexity theory), but this is still slow.
A better way is to iterate over the dimensions, and compute prefix sums along that direction. This is a generalization of what is commonly referred to as Sum Over Subsets (SOS) and runs in $O(\omega(S)\sigma_0(S))$.
To compute the dimensions of the array, it helps to know the prime factorization of $S$. One way to do this is to use a factorization algorithm that is fast enough to factor numbers up to $10^{18}$. Alternatively, one can use trial division up to $10^6$ to eliminate all but at most two prime factors, then compute $\gcd$s of the remaining factor with the prefix sums and the queries. It is possible that a semiprime is left unfactored, but treating it as prime will not affect the correctness of the queries in that case since no query could distinguish it from being prime.
#include <cstdio>
#include <vector>
#include <numeric>
int N;
long long prefix[100005];
int Q;
long long queries[100005];
std::vector<std::pair<long long,int> > primes;
int num_factors(){
int cnt=1;
for(int i=0;i<primes.size();i++){
cnt*=primes[i].second+1;
}
return cnt;
}
int freq[103680];
int& get_freq(long long num){
int code=0;
for(int j=0;j<primes.size();j++){
int cnt=0;
while(num%primes[j].first==0){
num/=primes[j].first;
cnt++;
}
code=code*(primes[j].second+1)+std::min(cnt,primes[j].second);
}
return freq[code];
}
int main(){
scanf("%d",&N);
for(int i=1;i<=N;i++){
scanf("%lld",&prefix[i]);
prefix[i]+=prefix[i-1];
}
long long rest=prefix[N];
for(int p=2;p<=1000000;p++){
if(rest%p==0){
int cnt=0;
while(rest%p==0){
rest/=p;
cnt++;
}
primes.emplace_back(p,cnt);
}
}
//now rest has at most two prime factors
for(int i=1;i<N;i++){
long long tmp=std::gcd(rest,prefix[i]);
if(tmp!=1&&tmp!=rest){
if(tmp>rest/tmp){
tmp=rest/tmp;
}
if(tmp*tmp==rest){
primes.emplace_back(tmp,2);
}else{
primes.emplace_back(tmp,1);
primes.emplace_back(rest/tmp,1);
}
rest=1;
}
}
scanf("%d",&Q);
for(int i=0;i<Q;i++){
scanf("%lld",&queries[i]);
long long tmp=queries[i];
if(rest%tmp==0&&tmp!=1&&tmp!=rest){
if(tmp>rest/tmp){
tmp=rest/tmp;
}
if(tmp*tmp==rest){
primes.emplace_back(tmp,2);
}else{
primes.emplace_back(tmp,1);
primes.emplace_back(rest/tmp,1);
}
rest=1;
}
}
if(rest!=1){
//assume it is prime; can't tell anyway
primes.emplace_back(rest,1);
}
for(int i=1;i<=N;i++){
get_freq(prefix[i])++;
}
int block=1;
for(int i=primes.size()-1;i>=0;i--){
for(int code=num_factors()-1;code>=0;code--){
if(code/block%(primes[i].second+1)!=0){
freq[code-block]+=freq[code];
}
}
block*=primes[i].second+1;
}
for(int i=0;i<Q;i++){
if(prefix[N]%queries[i]!=0){
printf("-1\n");
}else{
long long ans=N+prefix[N]/queries[i];
ans-=get_freq(queries[i])*2;
printf("%lld\n",ans);
}
}
}
In the implementation above, each factor $f$ of $S$ gets encoded to a location $\text{code}(f)$ in the $\texttt{freq}$ array. This encoding is generalized to other numbers $g$ by mapping them to the location $\text{code}(\gcd(S,g))$.
The part that performs the multidimensional prefix sum is as follows:
int block=1;
for(int i=primes.size()-1;i>=0;i--){
for(int code=num_factors()-1;code>=0;code--){
if(code/block%(primes[i].second+1)!=0){
freq[code-block]+=freq[code];
}
}
block*=primes[i].second+1;
}