(Analysis by Avnith Vijayram)

Let $up_x$ be the expected time for Bessie to reach an ancestor of $x$ for the first time when starting at $x$. If $x$ is a leaf node, then $up_x=1$. Otherwise, the expected time to either reach $x$ again or an ancestor of $x$ is the average of $up_c$ over all children $c$ of $x$ plus $1$. Since all ancestors of $x$ are equally likely to be the first one visited in this scenario, the probability we land on $x$ again is $\frac{1}{d_x+1}$, where $d_x$ is the number of ancestors of $x$. Thus, we have

$$up_x=1 + \frac{1}{|C_x|}\sum\limits_{c \in C_x} up_c+\frac{1}{d_x+1}up_x$$

Here, $C_x$ is the set of children of node $x$. We can solve for $up_x$, giving us

$$up_x=\left(1 + \frac{1}{d_x}\right)\left(1 + \frac{1}{|C_x|}\sum\limits_{c \in C_x} up_c\right)$$

We can compute $up_x$ for all nodes $x$ (except the root) with a DFS in $O(N)$.

Subtask 1:

Let $root_x$ represent the expected time to reach node $1$ starting at node $x$. We have $root_1 = 0$. Additionally, letting $A_x$ denote the set of ancestors of $x$, we have

$$ root_x = up_x + \frac{1}{d_x} \sum\limits_{a \in A_x} root_a $$

We can compute $root_x$ for all $x$ in $O(N)$ with a DFS from node $1$ by propagating the sum $\sum\limits_{a \in A_x} root_a$ down the tree.

Subtask 2:

Let $E_y$ be the expected time to reach node $y$ from node $1$. Define $down_y$ be the expected time to reach node $y$ from $par_y$, the parent of $y$, and let $down_1=0$. We have $E_y = \sum\limits_{a \in A_y \cup \{y\}} down_a$.

If we start from $par_y$, we either take the correct step to $y$ with chance $\frac{1}{|C_{par_y}|}$ but otherwise have to return to $par_y$ or an ancestor of $par_y$. Thus,

$$ down_y = 1 + \frac{|C_{par_y}|-1}{|C_{par_y}|} \left(\mathop{\text{avg}}_{c \in C_{par_y} \setminus \{y\}} up_c + \mathop{\text{avg}}_{a \in A_{y}} \left( E_{par_y} - E_a \right) + down_y\right)$$

$E_{par_y} - E_a$ represents the time it takes to reach $par_y$ from $a$ since every path from $1$ to $par_y$ goes through $a$. We can rearrange the equation to the following form.

$$ down_y = |C_{par_y}| + \sum_{c \in C_{par_y} \setminus \{y\}} up_c + \left(|C_{par_y}|-1\right)\left( E_{par_y} - \frac{1}{d_y} \sum_{a \in A_y} E_a \right)$$

Notice that $down_{y}$ only depends on the values $E_a$ for all ancestors $a$ of $y$. By maintaining $E_{par_y}$ and $\sum\limits_{a \in A_y} E_a$, we can compute $down_y$ and $E_y$ for all nodes $y$ with a DFS in $O(N)$.

Answering Queries:

Given nodes $x$, $y$, we want to determine the expected time to reach $y$ starting at $x$. Let $S$ be the set of common ancestors of $x$ and $y$ and let $\ell \in S$ be the lowest common ancestor (LCA). Any path from $x$ to $y$ must go through a node in $S$. Let $w$ be the first node in $S$ we visit when starting at node $x$. We can split the path into $x \to w$ and $w \to y$. By linearity of expectation, the answer is the average expected time from $x \to w$ over all $w \in S$ plus the average expected time from $w \to y$ over all $w \in S$. The latter is $\frac{1}{|S|}\sum\limits_{w \in S} \left( E_y - E_w \right)=E_y-\frac{1}{|S|}\sum\limits_{w\in S} E_w$ by linearity of expectation since the path from $1$ to $y$ must pass through $w$.

Now, we want to compute the average expected time from $x \to w$ over all $w \in S$. Since all $w \in S$ are equally likely, we need to find the expected time to reach any $w \in S$. Let $\ell=x_0, x_1, x_2, \ldots, x_m =x$ be the nodes on the path from $\ell$ to $x$. Define $reach_i$ as the expected time to reach a node in $S$ when starting at $x_i$. Our goal is to compute $reach_m$.

We can observe the following relation:

$$ reach_{i} = up_{x_i} + \frac{1}{d_{x_i}}\sum_{j=1}^{i-1}reach_{j}$$

This is because it takes expected time $up_{x_i}$ to move to an ancestor of $x_i$, and we have an $\frac{|S|}{d_{x_i}}$ chance to land in $S$ and an $\frac{1}{d_{x_i}}$ chance each to land on $x_1, x_2, \ldots, x_{i-1}$.

Subtask 3:

In this subtask, we use the fact that the expected maximum depth of the tree is $O(\log N)$. Thus, we can compute $reach_m$ in $O(m^2)$ or $O(m)$ per query for a total time complexity of $O(N + Q \log^2 N)$ or $O(N + Q \log N)$, respectively.

Full solution:

We claim that

$$ reach_{i} = up_{x_i} + \sum\limits_{j=1}^{i-1} \frac{up_{x_j}}{d_{x_j}+1}$$

This is because for each $x_j$, the probability that it is visited before any of its ancestors is $\frac{1}{d_{x_j}+1}$.

Thus, to compute $reach_m$ we just need to sum $\frac{up_{x_i}}{d_{x_i}+1}$ for all $x_i$ on the path from $\ell$ to $y$, both exclusive. For each $v$, we can precompute the sum of $\frac{up_{z}}{d_{z}+1}$ over all nodes $z$ on the path from node $1$ to node $v$ in $O(N)$. This allows us to compute the sum along a path by taking differences of two of these sums. We now have a solution in $O(N \log N + Q \log N)$, though this may vary slightly depending on implementation of LCA.

#include <bits/stdc++.h>
using namespace std;

using ll = long long;
const int mxn = 2e5+5;
const ll M = 1e9+7;
ll modpow(ll x, ll p) {
    ll a = 1;
    while (p) {
        if (p & 1) a = a * x % M;
        x = x * x % M;
        p /= 2;
    }
    return a;
}
ll inv(ll x) {
    assert(x != 0);
    return modpow(x, M-2);
}

ll par[mxn];
vector<ll> childs[mxn];

ll d[mxn], up[mxn], e[mxn], esum[mxn], upd[mxn];

ll dfs1(int x, ll dep) {
    d[x] = dep;
    ll cx = childs[x].size();
    if (cx == 0) {
        up[x] = 1;
        return up[x];
    }
    
    ll upsum = 0;
    for (auto c : childs[x]) {
        upsum += dfs1(c, dep+1);
    }
    upsum %= M;

    if (dep > 0) {
        up[x] = (1 + inv(dep)) * (1 + inv(cx) * upsum % M) % M;
    }
    return up[x];
}

void dfs2(int x, ll cp, ll upcsum, ll ep, ll epsum, ll updsum) {
    ll downx = 0;
    if (x != 0) {
        downx = (cp + upcsum + (cp - 1) * (ep - inv(d[x]) * epsum % M)) % M;
        if (downx < 0) downx += M;
    }
    e[x] = (ep + downx) % M;
    esum[x] = (epsum + e[x]) % M;
    if (x > 0) updsum = (updsum + up[x] * inv(d[x]+1)) % M;
    upd[x] = updsum;

    ll cx = childs[x].size();
    ll upsum = 0;
    for (auto c : childs[x]) {
        upsum += up[c];
    }

    for (auto c : childs[x]) {
        dfs2(c, cx, (upsum + M - up[c]) % M, e[x], esum[x], upd[x]);
    }
}

int main() {
    int n, q; cin >> n >> q;
    for (int i = 1; i < n; i++) {
        int p; cin >> p; p--;
        childs[p].push_back(i);
        par[i] = p;
    }

    dfs1(0, 0);
    dfs2(0, 0, 0, 0, 0, 0);

    vector<vector<int>> jmp(18, vector<int>(n, -1));
    for (int i = 1; i < n; i++) jmp[0][i] = par[i];
    for (int j = 1; j < 18; j++) {
        for (int i = 0; i < n; i++) {
            int nx = jmp[j-1][i];
            if (nx != -1) nx = jmp[j-1][nx];
            jmp[j][i] = nx;
        }
    }

    auto jump = [&](int x, int l) -> int {
        for (int j = 17; j >= 0; j--) {
            if (x == -1) return x;
            if (l & (1 << j)) x = jmp[j][x];
        }
        return x;
    };

    while (q--) {
        int x, y; cin >> x >> y; x--; y--;
        
        int l;
        int nx = x, ny = y;
        if (d[nx] > d[ny]) nx = jump(nx, d[nx]-d[ny]);
        if (d[nx] < d[ny]) ny = jump(ny, d[ny]-d[nx]);
        if (nx == ny) l = nx;
        else {
            for (int j = 17; j >= 0; j--) {
                if (jmp[j][nx] != jmp[j][ny]) {
                    nx = jmp[j][nx];
                    ny = jmp[j][ny];
                }
            }
            assert(par[nx] == par[ny]);
            l = par[nx];
        }

        if (l == x) {
            cout << (e[y]-e[x]+M)%M << endl;
        } else {
            ll reachm = (up[x] + upd[par[x]] - upd[l] + M) % M;
            ll wtoy = (e[y] - (inv(d[l]+1) * esum[l] % M) + M) % M;
            cout << (reachm + wtoy) % M << endl;
        }
    }
}