Posts tagged codeforces


It's currently a rainy day up here in Saratoga Springs, NY, so I've decided to write about a problem that introduced me to Fenwick trees, otherwise known as binary indexed trees. The data structure exploits the binary representation of numbers to speed up computation from $O(N)$ to $O(\log N)$ in a similar way that binary lifting does in the least common ancestor problem.

Consider a vector of elements $X = (x_1,x_2,\ldots,x_N)$. A common problem is to find the sum of a range of elements. Define $S_{jk} = \sum_{i=j}^kx_i$. We want to compute $S_{jk}$ quickly for any $j$ and $k$. The first thing to do is define $F(k) = \sum_{i=1}^kx_i$, where we let $F(0) = 0$. Then, we can rewrite $S_{jk} = F(k) - F(j-1)$, so our problem reduces to finding a way to compute $F$ quickly.

Computing $F$ directly from $X$ is an $O(N)$ operation, but incrementing some $x_i$ is an $O(1)$ operation. On the other hand, we can precompute $F$ with some dynamic programming since $F(k) = x_k + F(k-1)$. In this way, computing $F$ is an $O(1)$ operation, but if we increment $x_i$, we have to update $F(j)$ for all $j \geq i$, which is an $O(N)$ operation. The Fenwick tree makes both of these operations $O(\log N)$.
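To make the trade-off concrete, here's a quick sketch of the two naive extremes (my own illustration, storing $X$ with $0$-indexing).

#include <vector>

using namespace std;

// F(k): sum of the first k elements; O(N) per query, but updates to X are O(1)
int querySlow(const vector<int> &X, int k) {
  int res = 0;
  for (int i = 0; i < k; ++i) res += X[i];
  return res;
}

// precompute F(k) = x_k + F(k - 1); O(1) per query, but after incrementing
// x_i, F[j] must be rebuilt for all j >= i, which is O(N)
vector<int> buildPrefixSums(const vector<int> &X) {
  vector<int> F(X.size() + 1, 0);
  for (size_t k = 1; k <= X.size(); ++k) F[k] = F[k - 1] + X[k - 1];
  return F;
}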

Fenwick Tree

The key idea of the Fenwick tree is to cache certain $S_{jk}$. Suppose that we wanted to compute $F(n)$. First, we write $$n = d_02^0 + d_12^1 + \cdots + d_m2^m.$$ If we remove all the terms where $d_i = 0$, we can rewrite this as $$ n = 2^{i_1} + 2^{i_2} + \cdots + 2^{i_l},~\text{where}~i_1 < i_2 < \cdots < i_l. $$ Let $n_j = \sum_{k = j}^l 2^{i_k}$, and $n_{l + 1} = 0$. Then, we have that $$ F(n) = \sum_{k=1}^{l} S_{n_{k+1}+1,n_k}. $$ For example, for $n = 12 = 2^3 + 2^2$, we first sum the elements in $(8,12]$ and then the elements in $(0,8]$.

We can represent these intervals as nodes in a tree like this. Like a binary heap, the tree is stored as an array. The number before the colon is the index in the array. The number after the colon is the value of the node, which is the sum of the $x_i$ for $i$ in the interval $(a,b]$. The interval for node $i$ is $(p_i,i]$, where $p_i$ is the parent of node $i$.

Now, suppose $X = (58,62,96,87,9,46,64,54,87,7,51,84,33,69,43)$. Our tree should look like this.

Calculating $F(n)$

Suppose we wanted to calculate $F(13)$. We start at node $13$, and we walk up towards the root adding the values of all the nodes that we visit. In this case, we find that $F(13) = 738$. Writing the nodes visited in their binary representations reveals a curious thing: \begin{align*} 13 &= \left(1101\right)_2 \\ 12 &= \left(1100\right)_2 \\ 8 &= \left(1000\right)_2 \\ 0 &= \left(0000\right)_2. \end{align*} If you look closely, at each step, we simply remove the rightmost bit, so finding the parent node is easy.

Updating a Fenwick Tree

This is a little trickier, but it uses the same idea. Suppose that we want to increment $x_n$ by $\delta$. First, we increase the value of node $n$ by $\delta$. Recall that we can write $$ n = 2^{i_1} + 2^{i_2} + \cdots + 2^{i_l},~\text{where}~i_1 < i_2 < \cdots < i_l. $$ Now if $j < i_1$, node $n + 2^{j}$ is a descendant of node $n$. Thus, the next node we need to update is $n + 2^{i_1}$. We repeat this process of adding the rightmost bit and updating the value of the node until we exceed the capacity of the tree. For instance, if we add $4$ to $x_5$, we'll update the nodes in blue.

Two's Complement

If you read the above carefully, you'll note that we often need to find the rightmost bit. We subtract it when summing and add it when updating. Using the fact that binary numbers are represented with two's complement, there's an elegant trick we can use to make finding the rightmost bit easy.

Consider a 32-bit signed integer with bits $n = b_{31}b_{30}\cdots b_{0}$. For $0 \leq i \leq 30$, $b_i = 1$ indicates a term of $2^i$. If $b_{31} = 0$, then $n$ is positive and $$ n = \sum_{i \in \left\{0 \leq i \leq 30~:~b_i = 1\right\}}2^i. $$ On the other hand if $b_{31} = 1$, we still have the same terms but we subtract $2^{31}$, so $$ n = -2^{31} + \sum_{i \in \left\{0 \leq i \leq 30~:~b_i = 1\right\}}2^i, $$ which makes $n$ negative.

As an example of the result of flipping $b_{31}$, we have \begin{align*} 49 &= (00000000000000000000000000110001)_{2} \\ -2147483599 &= (10000000000000000000000000110001)_{2}. \end{align*}

Now, consider the operation of negation. Fix $x$ to be a nonnegative integer. Let $y$ be such that $-2^{31} + y = -x$, so solving, we find that $$y = -x + 2^{31} = -x + 1 + \sum_{i=0}^{30}2^i.$$ Therefore, $y$ is the positive integer we get by flipping all the bits of $x$ except $b_{31}$ and adding $1$. Making $x$ negative, $-x = -2^{31} + y$ will have $b_{31}$ flipped, too. Using $49$ as an example again, we see that \begin{align*} 49 &= (00000000000000000000000000110001)_{2} \\ -49 &= (11111111111111111111111111001111)_{2}. \end{align*}

This process looks something like this: $$ x = (\cdots 10\cdots0)_2 \xrightarrow{\text{Flip bits}} (\cdots 01\cdots1)_2 \xrightarrow{+1} (\cdots 10\cdots0)_2 = y. $$ In this way, $x$ and $y$ have the same rightmost bit. $-x$ has all the same bits as $y$ except for $b_{31}$. Thus, $x \land -x$ gives us the rightmost bit.
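As a quick sanity check (my addition, not part of the solution), we can print $x \land -x$ for a few values.

#include <bitset>
#include <iostream>

using namespace std;

int main() {
  // x & -x extracts the rightmost set bit thanks to two's complement
  for (int x : {49, 12, 8, 1024}) {
    cout << bitset<32>(x) << " -> " << (x & -x) << '\n'; // prints 1, 4, 8, 1024
  }
  return 0;
}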

Fenwick Tree Implementation

Using this trick, the implementation of the Fenwick tree is just a couple dozen lines. My implementation is adapted for an $X$ that is $0$-indexed.

class FenwickTree {
  vector<int> tree;
public:
  FenwickTree(int N) : tree(N + 1, 0) {}
  // sums the elements from 0 to i inclusive
  int sum(int i) {
    if (i < 0) return 0;
    ++i; // use 1-indexing, we're actually summing first i + 1 elements 
    if (i > tree.size() - 1) i = tree.size() - 1;
    int res = 0;
    while (i > 0) {
      res += tree[i];
      i -= (i & -i); // hack to get least bit based on two's complement
    }
    return res;
  }
  // sums the elements from i to j inclusive
  int sum(int i, int j) {
    return sum(j) - sum(i - 1);
  }
  // update counts
  void update(int i, int delta) {
    ++i;  // convert to 1-indexing  
    while (i < tree.size()) {
      tree[i] += delta;
      i += (i & -i);
    }
  }
};
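
To tie this back to the earlier example, here's a short usage sketch (my addition), assuming the class above is in scope along with the usual includes.

#include <iostream>
#include <vector>

using namespace std;

int main() {
  vector<int> X{58, 62, 96, 87, 9, 46, 64, 54, 87, 7, 51, 84, 33, 69, 43};
  FenwickTree tree(X.size());
  for (int i = 0; i < (int) X.size(); ++i) tree.update(i, X[i]);
  cout << tree.sum(12) << '\n'; // F(13) = 738 from the walk above; the class is 0-indexed
  tree.update(4, 4);            // add 4 to x_5 as in the update example
  cout << tree.sum(12) << '\n'; // 742
  return 0;
}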

Vika and Segments

The original motivation for me to learn about Fenwick trees was the problem, Vika and Segments. Here's the problem statement:

Vika has an infinite sheet of squared paper. Initially all squares are white. She introduced a two-dimensional coordinate system on this sheet and drew $n$ black horizontal and vertical segments parallel to the coordinate axes. All segments have width equal to $1$ square, which means every segment occupies some set of neighbouring squares situated in one row or one column.

Your task is to calculate the number of painted cells. If a cell was painted more than once, it should be calculated exactly once.

The first thing to do is join together the horizontal lines that overlap and the vertical lines that overlap. The basic idea is to count all the squares that are painted by the horizontal lines. Then, we sort the vertical lines by their x-coordinates and sweep from left to right.

For each vertical line, we count the squares that it covers and subtract out its intersection with the horizontal lines. This is where the Fenwick tree comes into play.

Each vertical line has endpoints $(x,y_1)$ and $(x,y_2)$, where $y_1 < y_2$. As we sweep from left to right, we keep track of which horizontal lines are active. Let $Y$ be an array of $0$s and $1$s. We set $Y[y] = 1$ when we encounter a horizontal line at height $y$, and $Y[y] = 0$ when that horizontal line ends. Every time that we encounter a vertical line, we'll want to compute $\sum_{y = y_1}^{y_2}Y[y]$, which we can do quickly with the Fenwick tree.

Now, the range of possible coordinates is large, so there are some details with coordinate compression, but I believe the comments in the code are clear enough.

struct LineEndpoint {
  int x, y;
  bool start, end;
  LineEndpoint(int x, int y, bool start) : x(x), y(y), start(start), end(!start)
  {}
};

void joinLines(map<int, vector<pair<int, int>>> &lines) {
  for (map<int, vector<pair<int, int>>>::iterator lineSegments = lines.begin();
       lineSegments != lines.end(); ++lineSegments) {
    sort((lineSegments -> second).begin(), (lineSegments -> second).end());
    vector<pair<int, int>> newLineSegments;
    newLineSegments.push_back((lineSegments -> second).front());
    for (int i = 1; i < (lineSegments -> second).size(); ++i) {
      if (newLineSegments.back().second + 1 >= (lineSegments -> second)[i].first) { // join line segment
        // make line as large as possible
        newLineSegments.back().second = max((lineSegments -> second)[i].second, newLineSegments.back().second);
      } else { // start a new segment
        newLineSegments.push_back((lineSegments -> second)[i]);
      }
    }
    (lineSegments -> second).swap(newLineSegments);
  }
}

int main(int argc, char *argv[]) {
  ios::sync_with_stdio(false); cin.tie(NULL);
  int N; cin >> N; // number of segments
  map<int, vector<pair<int, int>>> horizontalLines; // index by y coordinate
  map<int, vector<pair<int, int>>> verticalLines; // index by x coordinate
  for (int n = 0; n < N; ++n) { // go through segements
    int x1, y1, x2, y2;
    cin >> x1 >> y1 >> x2 >> y2;
    if (y1 == y2) {
      if (x1 > x2) swap(x1, x2);
      horizontalLines[y1].emplace_back(x1, x2);
    } else if (x1 == x2) {
      if (y1 > y2) swap(y1, y2);
      verticalLines[x1].emplace_back(y1, y2);
    }
  }  
  // first join horizontal and vertical segments that coincide  
  joinLines(horizontalLines); joinLines(verticalLines);
  /* now compress coordinates
   * partition range so that queries can be answered exactly
   */
  vector<int> P; 
  for (pair<int, vector<pair<int, int>>> lineSegments : verticalLines) {
    for (pair<int, int> lineSegment : lineSegments.second) {
      P.push_back(lineSegment.first - 1);
      P.push_back(lineSegment.second);
    }
  }
  sort(P.begin(), P.end());
  P.resize(unique(P.begin(), P.end()) - P.begin());
  /* Now let M = P.size(). We have M + 1 partitions.
   * (-INF, P[0]], (P[0], P[1]], (P[1], P[2]], ..., (P[M - 2], P[M-1]], (P[M-1],INF]
   */
  unordered_map<int, int> coordinateBucket;
  for (int i = 0; i < P.size(); ++i) coordinateBucket[P[i]] = i;
  // begin keep track of blackened squares
  long long blackenedSquares = 0;
  // sort the horizontal lines end points to prepare for a scan
  // tuple is (x-coordinate, flag for left or right endpoint, y-coordinate)
  vector<LineEndpoint> horizontalLineEndpoints;  
  for (pair<int, vector<pair<int, int>>> lineSegments : horizontalLines) {
    for (pair<int, int> lineSegment : lineSegments.second) {
      horizontalLineEndpoints.emplace_back(lineSegment.first, lineSegments.first, true); // start
      horizontalLineEndpoints.emplace_back(lineSegment.second, lineSegments.first, false); //end
      // horizontal lines don't coincide with one another, count them all
      blackenedSquares += lineSegment.second - lineSegment.first + 1;
    }
  }  
  // now prepare to scan vertical lines from left to right
  sort(horizontalLineEndpoints.begin(), horizontalLineEndpoints.end(), 
       [](LineEndpoint &a, LineEndpoint &b) -> bool {
         if (a.x != b.x) return a.x < b.x;
         if (a.start != b.start) return a.start; // add lines before removing them
         return a.y < b.y;
       });
  FenwickTree horizontalLineState(P.size() + 1);
  vector<LineEndpoint>::iterator lineEndpoint = horizontalLineEndpoints.begin();
  for (pair<int, vector<pair<int, int>>> lineSegments : verticalLines) {
    /* update the horizontal line state
     * process endpoints that occur before vertical line      
     * add line if it occurs at the vertical line
     */
    while (lineEndpoint != horizontalLineEndpoints.end() && 
           (lineEndpoint -> x < lineSegments.first ||
            (lineEndpoint -> x == lineSegments.first && lineEndpoint -> start))) {
      int bucketIdx = lower_bound(P.begin(), P.end(), lineEndpoint -> y) - P.begin();
      if (lineEndpoint -> start) { // add the line
        horizontalLineState.update(bucketIdx, 1);
      } else if (lineEndpoint -> end) { // remove the line
        horizontalLineState.update(bucketIdx, -1);
      }
      ++lineEndpoint;
    }
    for (pair<int, int> lineSegment : lineSegments.second) {
      // count all squares
      blackenedSquares += lineSegment.second - lineSegment.first + 1;
      // subtract away intersections, make sure we start at first bucket that intersects with line
      blackenedSquares -= horizontalLineState.sum(coordinateBucket[lineSegment.first - 1] + 1, 
                                                  coordinateBucket[lineSegment.second]);
    }    
  }
  cout << blackenedSquares << endl;
  return 0;
}


Doing the problem Minimum spanning tree for each edge gave me a refresher on minimum spanning trees and forced me to learn a new technique, binary lifting.

Here's the problem.

Connected undirected weighted graph without self-loops and multiple edges is given. Graph contains $N$ vertices and $M$ edges.

For each edge $(u, v)$ find the minimal possible weight of the spanning tree that contains the edge $(u, v)$.

The weight of the spanning tree is the sum of weights of all edges included in spanning tree.

It's a pretty simple problem statement, and at a high level not even that hard to solve. I figured out rather quickly the general approach. First, let $w: V \times V \rightarrow \mathbb{R}$ be the weight of an edge.

  1. Find any minimum spanning tree, $T = (V,E)$.
  2. Given that minimum spanning tree, root the tree.
  3. Now, given any edge $(u,v)$, if $(u,v) \in E$, return the weight of the spanning tree. If not, add that edge. Since the tree is already connected, adding the edge must create a cycle.
    1. Find the cycle by finding the least common ancestor of $u$ and $v$.
    2. Remove the edge with highest weight along the cycle that is not $(u,v)$. The weight of the minimum spanning tree with $(u,v)$ is obtained by adding $w(u,v)$ and subtracting the weight of edge that we removed.

Unfortunately, implementing this solution is rather tricky. I already knew how to find the minimum spanning tree. Finding the least common ancestor efficiently was not so easy, however.

Minimum Spanning Tree

Consider an undirected weighted graph $G = (V,E,w)$. A minimum spanning tree is a subgraph $MST = (V, E^\prime, w)$, where we choose $E^\prime \subset E$ such that $\sum_{(u,v) \in E^\prime,~u < v} w(u,v)$ is minimized.

There are two standard algorithms to solve this problem, Prim's algorithm and Kruskal's algorithm. Apparently, Prim's is faster on dense graphs, whereas Kruskal's is faster on sparse graphs. Also, to get the advantages of Prim's algorithm, you'll need to have some type of fancy priority queue implemented. I always use Prim's because that's what the USACO taught me.

Prim's algorithm is actually pretty easy to memorize, too, because it's so similar to Dijkstra's algorithm. In fact, I use the exact same implementation of the binary heap. Let's go through the steps.

  • Initialization: Set the parent of each vertex to be a sentinel, say $-1$ if the vertices are labeled from $0$ to $N - 1$. Initialize $E^\prime = \emptyset$. These are the edges of the minimum spanning tree consisting of just vertex $0$. Set vertex $0$ as visited and push all its neighbors onto the priority queue with the value being $w(0,v)$ for $(0,v) \in E$. Set each parent of such $v$ to be $0$.
  • Maintenance: At the start, we have subgraph $G_k$, which is the minimum spanning tree of some vertices $v_0,v_1,...,v_{k-1}$. Now, we're going to add vertices one-by-one if the priority queue is not empty.

    1. Pop a vertex $v_k$ from the priority queue. This is the vertex that is closest to the current minimum spanning tree. Mark $v_k$ as visited.
    2. Let $P(v_k)$ be the parent of $v_k$. Add the edge $\left(P(v_k), v_k\right)$ to our minimum spanning tree.
    3. For each neighbor $u$ of $v_k$, we have several cases.

      1. If $u$ has never been added to the priority queue, push $u$ with value $w(v_k, u)$. Set $u$'s parent to be $v_k$.
      2. If $u$ is in the priority queue already, and $w(v_k, u)$ is smaller than the current distance of $u$ from the minimum spanning tree, decrease the value of key $u$ in the priority queue to $w(v_k, u)$. Set $u$'s parent to be $v_k$.
      3. If $u$ was already visited or $w(v_k,u)$ is bigger than the current distance of $u$ from the minimum spanning tree, do nothing.

      We need to prove that this actually creates a minimum spanning tree $G_{k + 1}$. Let $G_{k + 1}^\prime$ be a minimum spanning tree of vertices $v_0, v_1, v_2,\ldots,v_k$. Let $W(G)$ denote the sum of the edge weights of graph $G.$

      Suppose for a contradiction that $G_{k + 1}$ is not a minimum spanning tree. $G_{k + 1}$ contains some edge $(u, v_k)$, and $G^\prime_{k+1}$ contains an edge $(u^\prime, v_k)$. Now, we have that $W(G_{k + 1}) = W(G_k) + w(u,v_k)$ and $W(G^\prime_{k + 1}) = W(G_k^\prime) + w(u^\prime,v_k).$ Clearly, $u, u^\prime \in \{v_0,v_1,\ldots,v_{k - 1}\}$. Since $G_k$ is a minimum spanning tree of $v_0, v_1,\ldots,v_{k-1}$ and $w(u,v_k) \leq w(v,v_k)$ for any $v \in \{v_0,v_1,\ldots,v_{k - 1}\}$ by our induction hypothesis and construction, we cannot have that $W(G_{k+1}) > W(G_{k+1}^\prime)$. Thus, $G_{k+1}$ is the minimum spanning tree of vertices $v_0,v_1,\ldots,v_k$, and this greedy approach works.

  • Termination: There's nothing to be done here. Just return $G_{N}$.

Here's an example of this algorithm in action. Vertices and edges in the minimum spanning tree are colored red. Possible candidates for the next vertex are colored in blue. The most recently added vertex is bolded.

Here's the code. It takes an adjacency list and returns the subset of the adjacency list that is a minimum spanning tree, which is not necessarily unique.

vector<unordered_map<int, int>> findMinimumSpanningTree(const vector<unordered_map<int, int>> &adjacencyList) {
  int N = adjacencyList.size();
  vector<unordered_map<int, int>> minimumSpanningTree(N);
  vector<int> visited(N, false);
  vector<int> parent(N, -1);
  phillypham::priority_queue pq(N); // keep track of closest vertex
  pq.push(0, 0);
  while (!pq.empty()) {
    // find closest vertex to current tree
    int currentVertex = pq.top(); 
    int minDistance = pq.at(currentVertex);
    pq.pop();    
    visited[currentVertex] = true;
    // add edge to tree if it has a parent
    if (parent[currentVertex] != -1) { 
      minimumSpanningTree[parent[currentVertex]][currentVertex] = minDistance;
      minimumSpanningTree[currentVertex][parent[currentVertex]] = minDistance;
    }    
    // relax neighbors step
    for (pair<int, int> nextVertex : adjacencyList[currentVertex]) { // loop through edges
      if (!visited[nextVertex.first]) { // ignore vertices that were already visited
        if (!pq.count(nextVertex.first)) { // add vertex to priority queue for first time
          parent[nextVertex.first] = currentVertex; 
          pq.push(nextVertex.first, nextVertex.second); // distance is edge weight
        } else if (pq.at(nextVertex.first) > nextVertex.second) {
          parent[nextVertex.first] = currentVertex;
          pq.decrease_key(nextVertex.first, nextVertex.second);
        }
      }
    }
  }  
  return minimumSpanningTree;
}
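
For completeness, here's a minimal sketch of an indexed binary heap that supports the interface used above: push, pop, top, at, count, and decrease_key. This is just an illustration; the class I actually use may differ in details.

#include <vector>

using namespace std;

namespace phillypham {
  // minimal sketch of an indexed min-heap with decrease_key;
  // keys are assumed to be the integers 0 to N - 1
  class priority_queue {
    vector<int> heap;      // heap[i] is the key at heap position i
    vector<int> position;  // position[key] is its heap index, or -1 if absent
    vector<int> value;     // value[key] is its priority
    void swapNodes(int i, int j) {
      swap(heap[i], heap[j]);
      position[heap[i]] = i; position[heap[j]] = j;
    }
    void siftUp(int i) {
      while (i > 0 && value[heap[i]] < value[heap[(i - 1)/2]]) {
        swapNodes(i, (i - 1)/2);
        i = (i - 1)/2;
      }
    }
    void siftDown(int i) {
      for (;;) {
        int smallest = i, l = 2*i + 1, r = 2*i + 2;
        if (l < (int) heap.size() && value[heap[l]] < value[heap[smallest]]) smallest = l;
        if (r < (int) heap.size() && value[heap[r]] < value[heap[smallest]]) smallest = r;
        if (smallest == i) break;
        swapNodes(i, smallest);
        i = smallest;
      }
    }
  public:
    explicit priority_queue(int N) : position(N, -1), value(N) {}
    bool empty() const { return heap.empty(); }
    int count(int key) const { return position[key] != -1 ? 1 : 0; }
    int at(int key) const { return value[key]; }
    int top() const { return heap.front(); }
    void push(int key, int v) {
      value[key] = v; position[key] = heap.size();
      heap.push_back(key);
      siftUp(position[key]);
    }
    void decrease_key(int key, int v) {
      value[key] = v;
      siftUp(position[key]);
    }
    void pop() {
      position[heap.front()] = -1;
      heap.front() = heap.back(); heap.pop_back();
      if (!heap.empty()) { position[heap.front()] = 0; siftDown(0); }
    }
  };
}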

Least Common Ancestor

The trickier part is to find the cycle and the max edge along the cycle whenever we add an edge. Suppose we're trying to add edge $(u,v)$ not in $E^\prime$. We can imagine the cycle starting and stopping at the least common ancestor of $u$ and $v$. The naive way to find this is to use two pointers, one starting at $u$ and the other at $v$. Whichever pointer points to a vertex of greater depth, replace that pointer with one to its parent until both pointers point to the same vertex.

This is too slow. To fix this, we use a form of path compression that some call binary lifting. Imagine that we have vertices $0,1,\ldots,N-1$. We denote the $2^j$th ancestor of vertex $i$ by $P_{i,j}$. Storing all $P_{i,j}$ takes $O(N\log N)$ storage since the height of the tree is at most $N$. Then, if we were trying to find the $k$th ancestor, we could use the binary representation and rewrite $k = 2^{k_1} + 2^{k_2} + \cdots + 2^{k_m}$, where $k_1 < k_2 < \cdots < k_m$. In this way, from $v$, we could first find the $2^{k_m}$th ancestor $a_1$. Then, we'd find the $2^{k_{m-1}}$th ancestor of $a_1$. Call it $a_2$. Continuing in this manner, $a_m$ would be the $k$th ancestor of $v$. $m \sim O(\log N)$, and each lookup takes constant time, so computing the $k$th ancestor is an $O(\log N)$ operation.

We can also compute all $P_{i,j}$ in $O(N\log N)$ time. We initially know $P_{i,0}$ since that's just the parent of vertex $i$. Then for each $j = 0,1,\ldots,l-1$, where $l = \lfloor \log_2 H \rfloor$ and $H$ is the height of the tree, we have that \begin{equation} P_{i,j+1} = P_{P_{i,j},j} \end{equation} if the depth of $i$ is at least $2^{j + 1}$, since $P_{i,j}$ is the $2^j$th ancestor of vertex $i$. The $2^j$th ancestor of vertex $P_{i,j}$ will then be the $2^j + 2^j = 2^{j+1}$th ancestor of vertex $i$.

Thus, if we want to find the least common ancestor of $u$ and $v$, we use the following recursive algorithm, in which we have three cases. A sketch in code follows the list.

  • Base case: If $u = v$, the least common ancestor is $u$. If $P_{u,0} = P_{v,0}$, the least common ancestor is $P_{u,0}$.
  • Case 1 is that the depths of $u$ and $v$ are different. Assume without loss of generality that the depth of $u$ is greater than the depth of $v$. Replace $u$ with $P_{u,j}$, where $j$ is chosen to be as large as possible such that the depth of $P_{u,j}$ is at least the depth of $v$.
  • Case 2 is that the depths of $u$ and $v$ are the same. Replace $u$ with $P_{u,j}$ and $v$ with $P_{v,j}$, where $j$ is the largest integer such that $P_{u,j}$ and $P_{v,j}$ are distinct.
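
Here's what that recursion might look like in code, a minimal sketch (the helper name is my own), assuming the parentDepth and ancestor tables built in the next section. findMaxEdge below has the same structure but also combines edge weights.

#include <cmath>
#include <utility>
#include <vector>

using namespace std;

int findLeastCommonAncestor(int u, int v,
                            const vector<pair<int, int>> &parentDepth,
                            const vector<vector<int>> &ancestor) {
  if (u == v) return u; // base case
  if (parentDepth[u].second < parentDepth[v].second) swap(u, v);
  // now depth(u) >= depth(v)
  if (parentDepth[u].second != parentDepth[v].second) {
    // case 1: lift u by the largest power of 2 that keeps depth(u) >= depth(v)
    int logDelta = floor(log2(parentDepth[u].second - parentDepth[v].second));
    return findLeastCommonAncestor(ancestor[u][logDelta], v, parentDepth, ancestor);
  }
  if (ancestor[u][0] == ancestor[v][0]) return ancestor[u][0]; // base case
  // case 2: lift both by the largest power of 2 that keeps them distinct
  int j = ancestor[u].size() - 1;
  while (ancestor[u][j] == ancestor[v][j]) --j;
  return findLeastCommonAncestor(ancestor[u][j], ancestor[v][j], parentDepth, ancestor);
}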

I'm sure that you're thinking that this is all very interesting, but how about that edge of max weight in the cycle? This paradigm can be used to find this, too.

Suppose we have some function that acts on the edges $g : E^\prime \rightarrow \mathbb{R}$. We have some combiner function $C$ with which we can extend $g$ to $\tilde{g} : V \times V \rightarrow \mathbb{R}$ in the following manner. Consider any two vertices $u$ and $v$. If $(u,v) \in E^\prime$, then $\tilde{g}(u,v) = g(u,v)$. If there is not an edge between $u$ and $v$, there is an intermediate vertex $w$ on the path between them. Then, we define \begin{equation} \tilde{g}(u,v) = C\left(\tilde{g}(u, w), \tilde{g}(w, v)\right). \end{equation}

When we add an edge $(u,v) \not\in E^\prime$, we want to add $w(u,v)$ and remove an edge of weight $\tilde{g}(u,v)$, where here we take $g = w$ restricted to tree edges and $C = \max$. Such a function can be computed in parallel with finding the least common ancestor. Let $r$ be the least common ancestor. As we walk up the tree from $u$ to $r$, we'll compute $\tilde{g}(u,r)$. Similarly, we'll compute $\tilde{g}(r,v)$ as we walk up the tree from $v$ to $r$. This will give us $\tilde{g}(u,v)$ after applying $C$.

Let's walk through the steps with code now. Recall our minimum spanning tree is an adjacency list of type vector<unordered_map<int, int>>.

  1. First, we'll root the tree and compute the parent and depth of each vertex.

     vector<pair<int, int>> rootTree(int root, const vector<unordered_map<int, int>> &adjacencyList) {
       int N = adjacencyList.size();
       vector<pair<int, int>> parentDepth(N);
       stack<int, vector<int>> s;
       s.push(root);
       parentDepth[root] = make_pair(-1, 0);
       while (!s.empty()) {
         int currentVertex = s.top(); s.pop();
         for (pair<int, int> nextVertex : adjacencyList[currentVertex]) {
           if (nextVertex.first != parentDepth[currentVertex].first) {
             parentDepth[nextVertex.first] = make_pair(currentVertex, parentDepth[currentVertex].second + 1);
             s.push(nextVertex.first);
           }
         }
       }
       return parentDepth;
     }
    
  2. Next, we compute $P_{i,j}$ and $\tilde{g}(i,P_{i,j})$. $P_{i,j}$ corresponds to ancestor[i][j], and $\tilde{g}(i,P_{i,j})$ corresponds to maxEdge[i][j].

     void memoizeAncestorAndMaxEdge(vector<vector<int>> &ancestor, 
                                    vector<vector<int>> &maxEdge,
                                    vector<unordered_map<int, int>> &minimumSpanningTree,
                                    const vector<pair<int, int>> &parentDepth) {
       int N = minimumSpanningTree.size();
       int maxLogDepth = 0;
       for (int i = 0; i < N; ++i) {
         int logDepth = parentDepth[i].second == 0 ? 0 : floor(log2(parentDepth[i].second) + 1);
         if (maxLogDepth < logDepth) maxLogDepth = logDepth;
         ancestor.push_back(vector<int>(logDepth));
         maxEdge.push_back(vector<int>(logDepth));
         if (logDepth > 0) {
           ancestor.back().front() = parentDepth[i].first;
           maxEdge.back().front() = minimumSpanningTree[parentDepth[i].first][i];
         }
       }            
       for (int j = 1; j < maxLogDepth; ++j) {
         for (int i = 0; i < N; ++i) {          
           int logDepth = parentDepth[i].second == 0 ? 0 : floor(log2(parentDepth[i].second) + 1);
           // go up 2^(j-1) + 2^(j-1) = 2^j levels
           if (j < logDepth) {
             ancestor[i][j] = ancestor[ancestor[i][j - 1]][j - 1];
             maxEdge[i][j] = max(maxEdge[i][j-1], maxEdge[ancestor[i][j - 1]][j - 1]);
           }
         }
       }
     }
    
  3. Now, we can recursively compute $\tilde{g}(u,v)$ whenever we want to add $(u,v) \not\in E^\prime$. Note that one of the base cases is handled in the else part of the if statement.

     int findMaxEdge(int u, int v, 
                     const vector<pair<int, int>> &parentDepth, 
                     const vector<vector<int>> &ancestor,
                     const vector<vector<int>> &maxEdge) {
       if (u == v) return 0;
       if (parentDepth[u].second < parentDepth[v].second) swap(u, v);
       // now depth(u) >= depth(v)
       if (parentDepth[u].second != parentDepth[v].second) {
         int depthDelta = parentDepth[u].second - parentDepth[v].second;
         int logDelta = floor(log2(depthDelta));
         return max(maxEdge[u][logDelta], findMaxEdge(ancestor[u][logDelta], v, parentDepth, ancestor, maxEdge));
       } else { // depth(u) == depth(v)
         int L = 0; 
         int U = parentDepth[u].second == 0 ? 0 : floor(log2(parentDepth[u].second) + 1);
         int mid = L + (U - L)/2;
         while (L < U) { // find smallest index such that ancestor[u][j] == ancestor[v][j]
           if (ancestor[u][mid] == ancestor[v][mid]) {
             U = mid;
           } else { // ancestor[u][mid] != ancestor[v][mid]
             L = mid + 1;
           }
           mid = L + (U - L)/2;
         }
         if (mid == 0) return max(maxEdge[u][mid], maxEdge[v][mid]);
         --mid; // recursively run on the shallowest distinct ancestors
         int maxEdgeWeight = max(maxEdge[u][mid], maxEdge[v][mid]);    
         return max(maxEdgeWeight, findMaxEdge(ancestor[u][mid], ancestor[v][mid], 
                                               parentDepth, ancestor, maxEdge));
       }
     }
    

Code

Here's the main loop, the code that glues all these functions together.

long long computeTreeWeightHelper(int parent, int vertex,
                                  const vector<unordered_map<int, int>> &adjacencyList) {
  long long weight = 0LL;
  for (pair<int, int> nextVertex : adjacencyList[vertex]) {
    if (nextVertex.first != parent) {
      weight += nextVertex.second + computeTreeWeightHelper(vertex, nextVertex.first, adjacencyList);
    }
  }
  return weight;
}

long long computeTreeWeight(const vector<unordered_map<int, int>> &adjacencyList) { 
  return computeTreeWeightHelper(-1, 0, adjacencyList);
}

int main(int argc, char *argv[]) {
  ios::sync_with_stdio(false); cin.tie(NULL);
  int N, M; cin >> N >> M; // vertices and edges
  vector<tuple<int, int, int>> edgeList; edgeList.reserve(M);
  vector<unordered_map<int, int>> adjacencyList(N);
  for (int m = 0; m < M; ++m) {
    int u, v, w; cin >> u >> v >> w;
    --u; --v;
    if (adjacencyList[u].count(v)) {
      adjacencyList[u][v] = min(w, adjacencyList[u][v]);
    } else {
      adjacencyList[u][v] = w;
    }
    if (adjacencyList[v].count(u)) {
      adjacencyList[v][u] = min(w, adjacencyList[v][u]);
    } else {
      adjacencyList[v][u] = w;
    }
    edgeList.emplace_back(u, v, w);
  }
  vector<unordered_map<int, int>> minimumSpanningTree = findMinimumSpanningTree(adjacencyList);
  // do lots of pre-processing
  long long mstWeight = computeTreeWeight(minimumSpanningTree);
  vector<pair<int, int>> parentDepth = rootTree(0, minimumSpanningTree);
  // for special pair of vertices find least common ancestor and heaviest edge between vertices
  vector<vector<int>> ancestor; ancestor.reserve(N); // ancestor[u][j] is 2^j ancestor of u
  vector<vector<int>> maxEdge; maxEdge.reserve(N); // maxEdge[u][j] is heaviest edge between u and ancestor[u][j]
  memoizeAncestorAndMaxEdge(ancestor, maxEdge, minimumSpanningTree, parentDepth);

  // now iterate through given edges and include each and remove heaviest edge in cycle
  for (tuple<int, int, int> edge : edgeList) {
    if (minimumSpanningTree[get<0>(edge)].count(get<1>(edge))) {
      if (minimumSpanningTree[get<0>(edge)][get<1>(edge)] == get<2>(edge)) {
        cout << mstWeight << '\n';
      } else {
        cout << mstWeight + get<2>(edge) - minimumSpanningTree[get<0>(edge)][get<1>(edge)] << '\n';
      }
    } else {      
      // now for each edge take diffs with the minimum spanning tree
      int maxEdgeWeight = findMaxEdge(get<0>(edge), get<1>(edge), 
                                      parentDepth, ancestor, maxEdge);     
      cout << mstWeight + get<2>(edge) - maxEdgeWeight << '\n';
    }
  }
  cout << flush;
  return 0;
}

For what it's worth, here's my complete submission, 18541620.


Lately, I've come across several programming competitions with a strong mathematical component. You would think that this would be my strong suit, but I actually struggled on a few of them. For this reason, I'll write about some of the solutions here.

Sharing Candies

Here's Sharing Candies from CodeChef.

Alvin and Berto have gotten tired of eating chocolates, so now they have decided to eat candies instead.

Alvin has $A$ apple candies, and Berto has $B$ banana candies. (I know, they have weird tastes.) Alvin and Berto always want the split of candies to be as fair as possible. The problem is, Alvin only wants apple candies and Berto only wants banana candies!

Here comes Chef to the rescue! Chef bought an infinite number of candy packs. There are two types of packs:

  • Packs containing exactly $C$ apple candies.
  • Packs containing exactly $D$ banana candies.

Chef wants to give some (could be zero) apple candy packs to Alvin and some (could be zero) banana candy packs to Berto in such a way that the absolute difference between the number of candies they have is minimized. What is this minimum absolute difference?

Note that Chef doesn't want to open any pack; he gives each pack in full.

Let $x$ be the number of packs of apple candies that Chef gives Alvin and $y$ be the number of packs of banana candies. The absolute difference can be written \begin{equation} \left|(A + xC) - (B + yD)\right| = \left|(A - B) + (xC - yD)\right|. \end{equation}

Let $d$ be the greatest common divisor of $C$ and $D$. Then, by Bézout's identity, we have that \begin{equation} xC - yD = kd \end{equation} for any $k$ and some $x$ and $y$. Moreover, every integer solution $(x^\prime,y^\prime)$ belongs to the set \begin{equation} S_{x,y,k} = \left\{\left(x + l\frac{D}{d}, y + l\frac{C}{d}\right) : l \in \mathbb{Z}\right\}. \end{equation}

By choosing $l$ large enough, we can have both $x^\prime \geq 0$ and $y^\prime \geq 0$, so we have nonnegative integer solutions. Thus, our solution is \begin{equation} \min\left(\left|A - B\right| \bmod d, d - \left(\left|A - B\right| \bmod d\right)\right) \end{equation} since those are the two numbers closest to $0$ that we can reach by adding multiples of $d$ to $A - B$. Note that we don't actually have to calculate $x$ and $y$, so we just find $d$ with the Euclidean algorithm. Here's the code.

#include <algorithm>
#include <cmath>
#include <climits>
#include <iostream>

using namespace std;

long long computeGcd(long long a, long long b) {
  if (a < b) swap(a, b);
  // now a >= b
  if (b == 0) return a;
  long long q = a/b;
  a -= q*b;
  return computeGcd(b, a);
}

long long minimizeDifference(long long A, long long B,
                             long long C, long long D) {
  long long delta = A > B ? A - B : B - A;
  long long gcd = computeGcd(C, D);  
  return min(delta % gcd, gcd - (delta % gcd));
}

int main(int argc, char *argv[]) {
  ios::sync_with_stdio(false); cin.tie(NULL);
  int T; cin >> T;
  for (int t = 0; t < T; ++t) {
    long long A, B, C, D;
    cin >> A >> B >> C >> D;
    cout << minimizeDifference(A, B, C, D) << '\n';
  }
  cout << flush;
  return 0;
}

Bear and Tower of Cubes

Here's Bear and Tower of Cubes from Codeforces.

Limak is a little polar bear. He plays by building towers from blocks. Every block is a cube with positive integer length of side. Limak has infinitely many blocks of each side length.

A block with side a has volume $a^3$. A tower consisting of blocks with sides $a_1,a_2,\ldots,a_k$  has the total volume $a_1^3 + a_2^3 + \cdots + a_k^3$.

Limak is going to build a tower. First, he asks you to tell him a positive integer $X$ — the required total volume of the tower. Then, Limak adds new blocks greedily, one by one. Each time he adds the biggest block such that the total volume doesn't exceed $X$.

Limak asks you to choose $X$ not greater than $m$. Also, he wants to maximize the number of blocks in the tower at the end (however, he still behaves greedily). Secondarily, he wants to maximize $X$.

Can you help Limak? Find the maximum number of blocks his tower can have and the maximum $X \leq m$ that results in this number of blocks.

The key observation, which I missed, is that if $a$ is the greatest integer such that $a^3 \leq m$, then we should choose the first block to have length either $a$ or $a-1$.

For a proof of this fact, suppose that we choose our first block to be of length $b$. Then, it must be true that $b^3 \leq X < (b+1)^3$. Our new volume limit is then $X - b^3$ after choosing a block of length $b$. Now, note that \begin{equation} 0 \leq X - b^3 < (b+1)^3 - b^3 = 3b^2 + 3b + 1. \end{equation} In this way, the upper bound on our new volume limit $m^\prime = X - b^3$ increases as a function of $b$, so it makes sense to choose $b$ as large as possible. But if we choose $b = a$, we're constrained by $X \leq m$, so we only have $0 \leq m^\prime \leq m - a^3$, and it's possible that $m$ is not much larger than $a^3$. Choosing $b = a - 1$ instead lets us take $X = a^3 - 1$, which makes $m^\prime = a^3 - 1 - (a-1)^3$ as large as possible.

Thus, for every volume limit we choose a block of length $a$ or $a - 1$. Since the volume limit decreases quickly, we can just use a brute force recursive solution and keep track of the number of blocks and total volume as we go. Here's the code.

#include <cmath>
#include <iostream>
#include <utility>

using namespace std;

pair<int, long long> findMaxBlocksAndVolumeHelper(long long volumeLimit, 
                                                  int currentBlocks,
                                                  long long currentVolume) {
  if (volumeLimit == 0) return make_pair(currentBlocks, currentVolume);
  long long maxA = floor(cbrt(volumeLimit));
  pair<int, long long> bigBlockState = findMaxBlocksAndVolumeHelper(volumeLimit - maxA*maxA*maxA,
                                                                    currentBlocks + 1,
                                                                    currentVolume + maxA*maxA*maxA);
  if (maxA > 1) {
    pair<int, long long> smallBlockState = findMaxBlocksAndVolumeHelper(maxA*maxA*maxA - 1 - (maxA - 1)*(maxA - 1)*(maxA - 1),
                                                                        currentBlocks + 1,
                                                                        currentVolume + (maxA - 1)*(maxA - 1)*(maxA - 1));
    if (smallBlockState.first > bigBlockState.first ||
        (smallBlockState.first == bigBlockState.first && smallBlockState.second > bigBlockState.second)) 
      return smallBlockState;
  } 
  return bigBlockState;

}

pair<int, long long> findMaxBlocksAndVolume(long long volumeLimit) {
  return findMaxBlocksAndVolumeHelper(volumeLimit, 0, 0);
}

int main(int argc, char *argv[]) {
  ios::sync_with_stdio(false); cin.tie(NULL);
  long long M; cin >> M;
  pair<int, long long> blockVolume = findMaxBlocksAndVolume(M);
  cout << blockVolume.first << ' ' << blockVolume.second << endl;
  return 0;
}

Longest Increasing Subsequences

In another CodeChef problem, Longest Increasing Subsequences, we make use of the ternary numeral system.

Chef recently learned about the classic Longest Increasing Subsequence problem. However, Chef found out that while the length of the longest increasing subsequence is unique, the longest increasing subsequence itself is not necessarily unique; for example, in the array $[1, 3, 2, 4]$, there are two longest increasing subsequences: $[1, 3, 4]$ and $[1, 2, 4]$.

Chef decided to investigate on this more, and now he has sufficient mastery in it that he was able to come up with a problem:

Given an integer $K$, output an integer $N$ and a permutation of the array $[1, 2,\ldots, N]$ such that there are exactly $K$ longest increasing subsequences. Chef also requires that $1 \leq N \leq 100$, otherwise he found the problem is too easy.

In case there are multiple possible answers, any one will be accepted.

Consider blocks of decreasing sequences $B_1, B_2,\ldots,B_n$, where for all $x \in B_i$ and $y \in B_j$, $x < y$ if $i < j$. For example, we might have $n = 3$ and \begin{align*} B_1 &= [5,4,3,2,1] \\ B_2 &= [8,7,6] \\ B_3 &= [10,9]. \end{align*} If we concatenate these sequences, we have a permutation of $[1,2,\ldots,10]$ such that the length of a longest increasing subsequence is $3$. We can make any such subsequence by choosing exactly one number from each block. Thus, the number of such subsequences is $5 \times 3 \times 2 = 30$ in this case.

Now imagine all the blocks are of the same size, say $k$. We want to maximize the number of sequences that we can encode with such blocks. Fix $N$. Then, we have about $N/k$ blocks that can represent $f_N(k) = k^{N/k}$ longest increasing subsequences with this strategy. We have that \begin{equation} f_N^\prime(k) = N\exp\left(\frac{N}{k}\log k\right) \frac{1 - \log k}{k^2}, \end{equation} so $f_N$ is maximized at $k = e$. Since $k$ must be an integer, we set $k = \lceil e \rceil = 3$. Thus, our block size is fixed to be $3$.
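As a sanity check on the integer choice, compare $f_N(k)^{1/N} = k^{1/k}$ for the candidates around $e$: $$ 2^{1/2} \approx 1.4142, \qquad 3^{1/3} \approx 1.4422, \qquad 4^{1/4} = \sqrt{2} \approx 1.4142, $$ so $k = 3$ beats its neighbors, and larger $k$ only does worse since $k^{1/k}$ is decreasing for $k > e$.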

Now, suppose that we write \begin{equation} K = d_0 + d_13 + d_23^2 + \cdots + d_n3^n = \sum_{j = 0}^nd_j3^j, \end{equation} where $d_n \neq 0$. Suppose that $K \geq 9$, too. Then, we have $n$ blocks, $B_1,B_2,\ldots,B_n$ of size 3. If $d_n = 2$, we actually let $B_1$ be of size $6$. In this way, by concatenating $B_1,B_2,\ldots,B_n$ we have $d_n3^n$ longest increasing sequences.

Now suppose that $d_j > 0$. We can add $d_j3^j$ sequences by inserting a sequence between $B_{n - j}$ and $B_{n - j + 1}$. If $d_j = 1$, we insert an increasing sequence of length $n - j$, all of whose numbers are smaller than everything that appears before it in the permutation; a longest increasing subsequence through it is the inserted run followed by one element from each of $B_{n - j + 1},\ldots,B_n$, which gives $3^j$ additional subsequences. If $d_j = 2$, we insert an increasing sequence of length $n - j + 1$ with the same properties, but we transpose the first two numbers, which doubles the count to $2 \cdot 3^j$. As an example, consider $K = 71 = 2 \cdot 3^0 + 2 \cdot 3^1 + 3^2 + 2 \cdot 3^3$. We'll have 3 blocks. Since no $d_j = 0$, we need to interleave a sequence after every block. Thus, our complete sequence will be $$ \overbrace{[14,13,12,11,10,9]}^{B_1} [8] \overbrace{[17,16,15]}^{B_2} [6,5,7] \overbrace{[20,19,18]}^{B_3} [2,1,3,4], $$ which gives exactly $K = 71$ longest increasing subsequences.

When $K < 9$, we can simply use the sequence $[K,K-1,\ldots, 1]$. Here's the code.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

using namespace std;

const int MAX_N = 100;

vector<int> makeSequence(int K) {
  if (K <= 8) { // subsequences of length 1
    vector<int> sequence(K);
    for (int i = 0; i < K; ++i) sequence[i] = K - i;
    return sequence;
  }
  vector<int> base3K;
  do {
    base3K.push_back(K % 3);
    K /= 3;
  } while (K > 0);  
  /* maxBit this is the length of the longest increasing subsequence, 
   * K = d_0*3^0 + d_1*3^1 + ... + d_maxBit*3^maxBit
   */
  int maxBit = base3K.size() - 1;   
  int N = 3*maxBit;
  if (base3K.back() == 2) N += 3;
  for (int i = 0; i <= maxBit - 1; ++i) {
    if (base3K[i] > 0) {
      N += maxBit - i;
      if (base3K[i] == 2) ++N;
    }
  }
  vector<int> sequence(N);
  int currentIdx = 1;
  int endCursor = N;  
  for (int i = 0; i < maxBit; ++i) { // interleave to get other ternary
    if (base3K[i] > 0) {
      int j = endCursor - (maxBit - i);
      if (base3K[i] == 2) --j;
      for (; j < endCursor; ++j) {
        sequence[j] = currentIdx++;
      }
      endCursor -= maxBit - i;
      if (base3K[i] == 2) { // if digit is 2
        --endCursor;
        swap(sequence[endCursor], sequence[endCursor + 1]);
      }      
    }
    sequence[--endCursor] = N - 2;
    sequence[--endCursor] = N - 1;
    sequence[--endCursor] = N;
    N -= 3;
  }
  if (endCursor > 0) { // first digit in base 3 is 2    
    sequence[--endCursor] = N - 2; 
    sequence[--endCursor] = N - 1;
    sequence[--endCursor] = N;
    N -= 3;
    swap(sequence[0], sequence[3]);
    swap(sequence[1], sequence[4]);
    swap(sequence[2], sequence[5]);
  }
  return sequence;
}

int main(int argc, char *argv[]) {
  ios::sync_with_stdio(false); cin.tie(NULL);
  int T; cin >> T;
  for (int t = 0; t < T; ++t) {
    int K; cin >> K;
    vector<int> sequence = makeSequence(K);
    cout << sequence.size() << '\n';
    copy(sequence.begin(), sequence.end() - 1, ostream_iterator<int>(cout, " "));
    cout << sequence.back() << '\n'; 
  }
  cout << flush;
  return 0;
}

Chef and Super Array

Another problem from CodeChef is Chef and Super Array.

Chef has an array $A$ consisting of $N$ elements. He wants to add some elements into the array as per the below mentioned process.

After each minute, Chef iterates over the array in order from left to right, and takes every two neighbouring pair of elements, say $x$ and $y$, he adds a new element $x + y$ in the middle of elements $x$ and $y$.

For example, if initial array $A = \{1, 6, 9\}$.

  • After first minute, the array A will be equal to $\{1, \mathbf{7}, 6, \mathbf{15}, 9\}$. Please note that the elements shown in the bold font are the newly added elements during first minute. As you can observe that $7 = 1 + 6$, and $15 = 6 + 9$.
  • After second minute, the array will be $\{1, \mathbf{8}, 7, \mathbf{13}, 6, \mathbf{21}, 15, \mathbf{24}, 9\}$. Once again, elements added during the second minute, are shown in bold.

Chef wants to know the sum of elements between $x$th and $y$th positions in the array $A$ (i.e. $A_x + A_{x + 1} + \cdots + A_y$) after $m$ minutes. As the answer could be large, output it modulo $10^9+7 = 1000000007$. Please note that we use $1$-based indexing in the problem.

The important thing to note in this problem is the recursive structure of it. Consider an array $A = \{A_1,A_2,\ldots,A_N\}$. Let $A_i = a$ and $A_{i+1} = b$. Let $S_{a,b}(k)$ be the sum of elements between $a$ and $b$ after $k$ steps of the process. Clearly, $S_{a,b}(0) = 0$. Now, suppose $c$ is between $a$ and $b$. After a step, c appears $3$ times, and we add an additional $a$ and $b$. For example, $\{a,c,b\}$ becomes $\{a, a + c, c, b + c, b\}$. Thus, $S_{a,b}(k + 1) = 3S_{a,b}(k) + (a + b)$. Since we can write $S_{a,b}(0) = (3^0 - 1)\frac{a + b}{2},$ we have that in general, \begin{equation} S_{a,b}(k) = \left(3^k - 1\right)\frac{a + b}{2}. \label{eqn:sum_between} \end{equation}

Now if we use $0$-indexing, element $i$ of array $A$ has index $i2^m$ after $m$ steps. Suppose that we want to sum elements up to index $x$. If $i2^m \leq x$, we can simply use Equation \ref{eqn:sum_between} with $a = A_{i-1}$ and $b = A_i$.

What about the case when $(i-1)2^m \leq x < i2^m$ in general? Let $a = A_{i-1}$, $b = A_i$ and $c = a + b$. Set $x^\prime = x - (i-1)2^m$. Let $T_{a,b,m,x^\prime}$ be the sum of elements with indices from $(i-1)2^m$ to $x$. We have that \begin{equation} T_{a,b,m,x^\prime} = \begin{cases} a, &x^\prime = 0 \\ a + \left(3^m - 1\right)\frac{a + b}{2}, &x^\prime = 2^m - 1 \\ T_{a,c,m - 1, x^\prime}, &x^\prime < 2^{m - 1} \\ a + \left(3^{m - 1} - 1\right)\frac{a + c}{2} + T_{c,b,m - 1, x^\prime - 2^{m - 1}}, &x^\prime \geq 2^{m - 1} \end{cases} \end{equation} since we can imagine that the process starts after $1$ step and reaches this point after $m - 1$ steps. The first step turns the pair $(a,b)$ into $(a,c,b)$, so the left half is generated by $(a,c)$ and the right half by $(c,b)$, which is exactly what the last two cases encode. Here's the code for this.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

using namespace std;

const int MOD = 1000000007;
const int MAX_M = 30;

long long p2[MAX_M + 1];
long long p3[MAX_M + 1];

/* consider the subarray newA[k],...,newA[l] after m minutes
 * where newA[k] = a, newA[l] = b
 */
int subSum(int i, int a, int b, int m) {
  if (i == 0) return a;
  if (i == p2[m] - 1) return (a + (a + b)*(p3[m] - 1)/2) % MOD;
  if (i < p2[m - 1]) return subSum(i, a, a + b, m - 1);
  return (subSum(p2[m - 1] - 1, a, a + b, m - 1) + subSum(i - p2[m - 1], a + b, b, m - 1)) % MOD;
}

// find the sum of newA[0], newA[1],...,newA[i]
int cumulativeSum(long long i, const vector<int> &A, int m) {
  if (i < 0) return 0;
  int idx = 0;
  int S = 0;
  // in new array A[idx] is located at index idx*2^m
  while (idx*p2[m] <= i) { 
    S += subSum(min(p2[m] - 1, i - idx*p2[m]), A[idx], A[idx + 1], m);
    S %= MOD;
    ++idx;
  }
  return S;
}

int sum(const vector<int> &A, int m, long long x, long long y) {
  int S = cumulativeSum(y, A, m) - cumulativeSum(x - 1, A, m);
  S %= MOD;
  if (S < 0) S += MOD;
  return S;
}

int main(int argc, char *argv[]) {
  ios::sync_with_stdio(false); cin.tie(NULL);
  // precompute powers of 2 and 3
  p2[0] = 1LL; p3[0] = 1LL;
  for (int i = 1; i <= MAX_M; ++i) {
    p2[i] = p2[i - 1]*2LL; p3[i] = p3[i - 1]*3LL;
  }
  // run through test cases
  int T; cin >> T;
  for (int t = 0; t < T; ++t) {
    int N, m;
    long long x, y;
    cin >> N >> m;
    cin >> x >> y;
    --x; --y; // 0-index
    vector<int> A; A.reserve(N + 1);
    for (int i = 0; i < N; ++i) {
      int a; cin >> a; A.push_back(a);
    }
    A.push_back(0); // dummy number padding
    cout << sum(A, m, x, y) << '\n';
  }
  cout << flush; 
  return 0;
}


Recently, I came across two problems that required the convex hull trick. Often, we have a set of lines, and given a time $t$, we want to find out which line has the maximal (or minimal) value. To make this more concrete, here's the first problem in which I came across this technique, Cyclist Race.

Chef has been invited to be a judge for a cycling race.

You may think about cyclists in the race as points on a line. All the cyclists start at the same point with an initial speed of zero. All of them move in the same direction. Some cyclists may force their cycles to speed up. The change of speed takes place immediately. The rest of the time, they move with a constant speed. There are N cyclists in total, numbered from 1 to N.

In order to award points to the participants, Chef wants to know how far the cyclists have reached at certain points of time. Corresponding to the above two, there will be two types of queries.

  • Change the speed of cyclist i at some time.
  • Ask for the position of the race leader at some time.

Return the results of the second type of query.

Now, it's not too hard to see that the distance traveled by each cyclist is a piecewise linear function. For each query of the second type, we could just evaluate all these piecewise linear functions and take the max, but there's a better way.

In this particular problem, the speed is always increasing. So if you look at the thick lines in the title picture that indicate which cyclist is in the lead, it forms the bottom of a convex hull, hence the name, the convex hull trick. The distance of the lead cyclist is also piecewise linear, so the goal becomes to merge the piecewise linear functions of all the cyclist into one.

Essentially, we'll want to create a vector $\left((x_0, l_0), (x_1,l_1),\ldots,(x_{n-1},l_{n-1})\right)$, where $l_i$ is a line, and $x_i$ is the $x$-coordinate at which the line becomes relevant. That is, line $l_i$ has the maximal value on the interval $\left[x_i, x_{i+1}\right]$, where $x_n = \infty$. In this way, if we sort our vector by $x_i$, then to find the maximal value at $x$, we can do a binary search.

To build this vector, we first need all our lines sorted in ascending order by slope. We iterate through our lines, but we only want to keep some of them. Let us call our convex hull $C$. Initialize $C = \left((-\infty, l_0)\right)$. Now, suppose we encounter line $l$ and we need to make a decision. Recall that the slope of $l$ will be greater than or equal to that of any line in $C$. There are several cases.

  1. If $C$ has at least two lines, consider the previous line $l^\prime$ and the line before $l^\prime$, $l^{\prime\prime}$. $l^\prime$ becomes relevant at its intersection with $l^{\prime\prime}$, at say $x^\prime$. If $l$ intersects $l^{\prime\prime}$ at $x$ with $x \leq x^\prime$, then $l$ becomes relevant before $l^\prime$, so we remove $l^\prime$. Repeat until we have removed as many unnecessary lines as possible. Then, append to our vector $(x,l)$, where $x$ is the $x$-coordinate of the intersection of $l$ and the remaining last line.
  2. If $C$ has only one line and $l$ has the same slope as that line, then only keep the line with the bigger $y$-intercept.
  3. If $C$ has only one line and $l$ has a larger slope, then append to our vector $(x,l)$, where $x$ is the $x$-coordinate of the intersection of $l$ and the one line in $C$.

See the title picture for an example. We'd initialize with the green line $\left(-\infty, y = 0.1x\right)$. Now, we're in case 3, so we'd add the blue line $(0, y = 0.2x)$ since the two lines intersect at $(0,0)$. For the thick red line, we find ourselves in case 1. We'll pop off the blue line and find ourselves in case 3 again, so now our vector is $C = \left((-\infty, y= 0.1x),(0,y=x/3)\right)$.

The next line we encounter is the green line $y = 0.5x - 0.8$. We're in case 1, but we can't pop off any lines since its intersection with the other two lines is far to the right, so we simply append this line and its intersection with the thick red line, so we have $$C = \left((-\infty, y= 0.1x), (0,y=x/3), (4.8,y=0.5x-0.8)\right).$$ Our next line is the thick blue line, $y = 2x/3 - 1.4$. This intersects the thick red line at $x = 4.2$, while the previous green line intersects it at $x = 4.8$, so we'll pop off the green line $y = 0.5x-0.8$, and push the thick blue line onto the stack to get $C = \left((-\infty, y= 0.1x), (0,y=x/3), (4.2,y=2x/3-1.4)\right)$. Finally, we encounter the thick green line, and our convex hull is $$ C = \left((-\infty, y= 0.1x), (0,y=x/3), (4.2,y=2x/3-1.4),(5.4, y=2x-8.6)\right). $$

Hopefully, it's clear that the natural data structure to keep track of the lines is a vector, which we'll use as a stack. Then, adding new lines is just a constant time operation. The most time-consuming operation is the initial sorting of the lines, so constructing the convex hull is $O(N\log N)$, where $N$ is the number of lines. Evaluating the distance at some $x$ is $O(\log N)$ using binary search as we discussed earlier. If we have $M$, queries the total running time will be $O\left((N + M)\log N \right)$.

Here's the code for this problem.

#include <algorithm>
#include <iostream>
#include <utility>
#include <vector>

using namespace std;

class Line {
private:
  long double m, b; // y = mx + b
public:
  Line(long double m, long double b) : m(m), b(b) {}
  long double& slope() { return m; }
  long double& yIntercept() { return b; }  
  pair<long double, long double> intersect(Line other) {
    long double x = (this -> b - other.yIntercept())/(other.slope() - this -> m);
    long double y = (this -> m)*x + this -> b;
    return make_pair(x, y);
  }
  double getY(long double x) { return m*x + b; }
};

vector<pair<long double, Line>> makeConvexHull(vector<Line> &lines) {
  int N = lines.size(); 
  vector<pair<long double, Line>> convexHull; convexHull.reserve(N);
  if (N == 0) return convexHull;
  convexHull.emplace_back(0, lines.front());
  for (int i = 1; i < N; ++i) {
    // pop stack while new line's interesection with second to last line is to the left of last line
    // note that slopes are strictly increasing
    while (convexHull.size() > 1 &&
           lines[i].intersect(convexHull[convexHull.size() - 2].second).first <= convexHull.back().first) {
      convexHull.pop_back();
    }
    // check intersection with x = 0 when there's only 1 line
    // check intersection with x = 0 when there's only 1 line
    if (convexHull.size() == 1 && lines[i].yIntercept() >= convexHull.back().first) convexHull.pop_back();
    if (convexHull.empty()) { // new line dominates from x = 0 onward
      convexHull.emplace_back(0, lines[i]);
    } else {
      convexHull.emplace_back(lines[i].intersect(convexHull.back().second).first, lines[i]);
    }
  }
  return convexHull;
}

int main(int argc, char *argv[]) {
  ios::sync_with_stdio(false); cin.tie(NULL);
  int N, Q; cin >> N >> Q;      // number of cyclists and queries
  // current speed, time change, distance traveled so far
  vector<Line> lines; // y = mx + b, first is m, second is b
  lines.emplace_back(0,0);
  vector<pair<int, long long>> cyclists(N, make_pair(0, 0)); // first is speed v, second is x, where x + vt is location at time t
  vector<long long> queryTimes;
  for (int q = 0; q < Q; ++q) {
    int queryType; cin >> queryType;
    long long time;
    switch(queryType) {
    case 1:                     // speed change
      int cyclist, newSpeed;
      cin >> time >> cyclist >> newSpeed;
      --cyclist;   // convert to 0 indexing
      // suppose current function is x + vt
      // new function is x + v*time + (t - time)*newSpeed
      // (x + v*time - time*newSpeed) + newSpeed*t
      // = (x + (v-newSpeed)*time) + newSpeed*t
      cyclists[cyclist].second += time*(cyclists[cyclist].first - newSpeed);
      cyclists[cyclist].first = newSpeed;      
      lines.emplace_back(newSpeed, cyclists[cyclist].second);
      break;
    case 2:                     // leader position
      cin >> time;
      queryTimes.push_back(time);
      break;
    }
  }
  // sort slopes in ascending order
  sort(lines.begin(), lines.end(),
       [](Line &a, Line &b) -> bool {
         if (a.slope() != b.slope()) return a.slope() < b.slope();
         return a.yIntercept() < b.yIntercept();
       }); 
  if (lines.size() == 1) {      // there will always be at least 1 line
    for (long long t : queryTimes) {
      cout << (long long) lines.back().getY(t) << '\n';
    }
  } else {
    // eliminate irrelevant lines, first is time where the line becomes relevant
    vector<pair<long double, Line>> convexHull = makeConvexHull(lines); 
    // since times are strictly increasing just walk through lines 1 by 1
    int cursor = 0;
    for (long long t : queryTimes) {
      while (cursor + 1 < convexHull.size() && convexHull[cursor + 1].first <= t) ++cursor;
      cout << (long long) convexHull[cursor].second.getY(t) << '\n';
    }
  }
  cout << flush;
  return 0;
}

In this particular problem, negative times make no sense, so we can start at $x = 0$ for our convex hull.

Applications to Dynamic Programming

In certain cases, we can apply this trick to a dynamic programming problem to significantly improve the running time. Consider the problem Levels and Regions.

Radewoosh is playing a computer game. There are $N$ levels, numbered $1$ through $N$. Levels are divided into $K$ regions (groups). Each region contains some positive number of consecutive levels.

The game repeats the following process:

  1. If all regions are beaten then the game ends immediately. Otherwise, the system finds the first region with at least one non-beaten level. Let $X$ denote this region.
  2. The system creates an empty bag for tokens. Each token will represent one level and there may be many tokens representing the same level.
    • For each already beaten level $i$ in the region $X$, the system adds $t_i$ tokens to the bag (tokens representing the $i$-th level).
    • Let $j$ denote the first non-beaten level in the region $X$. The system adds $t_j$ tokens to the bag.
  3. Finally, the system takes a uniformly random token from the bag and a player starts the level represented by the token. A player spends one hour and beats the level, even if he has already beaten it in the past.

Given $N$, $K$ and values $t_1,t_2,\ldots,t_N$, your task is to split levels into regions. Each level must belong to exactly one region, and each region must contain a non-empty, consecutive set of levels. What is the minimum possible expected number of hours required to finish the game?

It's not obvious at all how the convex hull trick should apply here. Well, at least it wasn't obvious to me. Let's do some math first. Consider first the case where we have only 1 region, consisting of levels $1,2,\ldots,n$. Let $T_i$ be the time at which we finish level $i$. Write \begin{equation} T_n = T_1 + (T_2 - T_1) + (T_3 - T_2) + \cdots + (T_n - T_{n-1}). \label{eqn:expectation_Tn} \end{equation}

$T_1 = 1$ since we'll just put $t_1$ tokens in the bag and draw from those $t_1$ tokens, so with probability $t_1/t_1 = 1$ we play and beat the first level immediately. Now $T_{i} - T_{i-1}$ is the time that it takes to beat level $i$ given that levels $1,2,\ldots,i-1$ are beaten. This has a geometric distribution. Every time we try to beat level $i$, we'll put $t_1 + t_2 + \cdots + t_i$ tokens in the bag and try to draw one of the $t_i$ tokens labeled $i$. The probability of doing so is \begin{equation} p = \frac{t_i}{\sum_{j=1}^i t_j}. \end{equation} Thus, we'll have that \begin{align} \mathbb{E}(T_i - T_{i-1}) &= p + 2(1-p)p + 3(1-p)^2p + \cdots = \sum_{k=1}^\infty k(1-p)^{k-1}p \nonumber\\ &= \frac{p}{\left(1-(1-p)\right)^2} = \frac{1}{p} \nonumber\\ &= \frac{\sum_{j=1}^i t_j}{t_i}. \label{eqn:expectation_delta_T} \end{align}

Now by linearity of expectation, applying Equation \ref{eqn:expectation_delta_T} to Equation \ref{eqn:expectation_Tn} gives us \begin{equation} \mathbb{E}T_n = \sum_{i = 1}^n\frac{\sum_{j=1}^i t_j}{t_i}. \label{eqn:expectation_T_n} \end{equation}
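As a quick sanity check of Equation \ref{eqn:expectation_T_n}, here's a small Monte Carlo simulation. This is my own sketch, not part of the solution, and the token counts are an arbitrary example; it plays the bag process for a single region many times and compares the average number of hours with the closed form.

#include <iostream>
#include <random>
#include <vector>

using namespace std;

int main() {
  vector<int> t{3, 1, 4, 1, 5}; // token counts for a single region (arbitrary example)
  int n = t.size();
  // closed form: E[T_n] = sum over i of S_i/t_i
  long double expected = 0, S = 0;
  for (int i = 0; i < n; ++i) {
    S += t[i];
    expected += S/t[i];
  }
  // Monte Carlo: play the bag process repeatedly and average the hours spent
  mt19937 gen(2024);
  const int TRIALS = 200000;
  long long totalHours = 0;
  for (int trial = 0; trial < TRIALS; ++trial) {
    int beaten = 0;           // levels 0,...,beaten-1 are beaten
    long long bagSize = t[0]; // bag holds tokens for levels 0,...,beaten
    long long prefix = 0;     // tokens belonging to already-beaten levels
    while (beaten < n) {
      uniform_int_distribution<long long> dist(0, bagSize - 1);
      ++totalHours;           // every draw costs one hour
      if (dist(gen) >= prefix) { // drew a token of the first non-beaten level
        prefix += t[beaten];
        ++beaten;
        if (beaten < n) bagSize += t[beaten];
      }
    }
  }
  cout << "closed form: " << (double) expected << '\n';
  cout << "simulated:   " << (double) totalHours/TRIALS << '\n';
  return 0;
}

For these token counts, the closed form gives $1 + 4 + 2 + 9 + 2.8 = 18.8$, and the simulated average should land close to that.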

Now, let $E_{k,n}$ denote the minimum expected time to beat the first $n$ levels if we split them into $k$ regions. Note that each region must have at least 1 level, so this is only well-defined if $n \geq k$. The levels must be beaten in order, so imagine putting levels $l,l+1,\ldots,n$ into the last region. Since each region must have at least 1 level, we'll have that $k \leq l \leq n$, which gives us the recurrence relation \begin{equation} E_{k,n} = \inf_{k \leq l \leq n}\left\{E_{k - 1, l - 1} + \sum_{i = l}^n\frac{\sum_{j=l}^i t_j}{t_i}\right\} \label{eqn:expectation_recurrence} \end{equation} by Equation \ref{eqn:expectation_T_n}.

Now, suppose we define \begin{equation} S_k = \sum_{i=1}^k t_i ~\text{and}~ R_k = \sum_{i=1}^k \frac{1}{t_i}, \label{eqn:sum_dp} \end{equation} which leads to \begin{equation} E_{1,n} = \mathbb{E}T_n = \sum_{i=1}^n\frac{S_i}{t_i}. \label{eqn:base_expectation} \end{equation} Then, Equation \ref{eqn:expectation_recurrence} becomes \begin{align} E_{k,n} &= \inf_{k \leq l \leq n}\left\{E_{k - 1, l - 1} + \sum_{i = l}^n\frac{\sum_{j=l}^i t_j}{t_i}\right\} \nonumber\\ &= \inf_{k \leq l \leq n}\left\{E_{k - 1, l - 1} + \sum_{i = l}^n\frac{S_i - S_{l-1}}{t_i}\right\} \nonumber\\ &= \inf_{k \leq l \leq n}\left\{E_{k - 1, l - 1} + \sum_{i = l}^n\frac{S_i}{t_i} - S_{l-1}\left(R_n - R_{l-1}\right)\right\} \nonumber\\ &= \inf_{k \leq l \leq n}\left\{E_{k - 1, l - 1} + \left(E_{1,n} - E_{1,l-1}\right) - S_{l-1}\left(R_n - R_{l-1}\right)\right\} \nonumber\\ &= E_{1,n} + \inf_{k \leq l \leq n}\left\{\left(E_{k - 1, l - 1} - E_{1,l-1} + S_{l-1}R_{l-1}\right) - S_{l-1}R_n\right\} \label{eqn:expectation_line} \end{align} by Equations \ref{eqn:base_expectation} and \ref{eqn:sum_dp}.
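Before optimizing anything, it may help to see the recurrence as straightforward code. Below is a minimal $O(KN^2)$ baseline of my own (not part of the original solution) that evaluates the $\inf$ in Equation \ref{eqn:expectation_line} by brute force, using $0$-indexed arrays like the solution further down; it's handy for checking the convex hull version on small inputs.

#include <iomanip>
#include <iostream>
#include <vector>

using namespace std;

int main() {
  int N, K; cin >> N >> K;
  vector<int> t(N);
  for (int &x : t) cin >> x;
  // prefix sums: S[i] = t_0 + ... + t_i, R[i] = 1/t_0 + ... + 1/t_i,
  // E1[i] = expected hours to beat levels 0,...,i with a single region
  vector<long double> S(N), R(N), E1(N);
  long double s = 0, r = 0, e = 0;
  for (int i = 0; i < N; ++i) {
    s += t[i]; r += 1.0L/t[i]; e += s/t[i];
    S[i] = s; R[i] = r; E1[i] = e;
  }
  // eV[k][n]: minimum expected hours for levels 0,...,n in k+1 regions (valid for n >= k)
  vector<vector<long double>> eV(K, E1); // row 0 is exactly the one-region case
  for (int k = 1; k < K; ++k) {
    for (int n = k; n < N; ++n) {
      long double best = 0; bool first = true;
      for (int l = k; l <= n; ++l) { // last region holds levels l,...,n
        long double cand = eV[k - 1][l - 1]
          + (E1[n] - E1[l - 1]) - S[l - 1]*(R[n] - R[l - 1]);
        if (first || cand < best) { best = cand; first = false; }
      }
      eV[k][n] = best;
    }
  }
  cout << fixed << setprecision(10) << eV[K - 1][N - 1] << endl;
  return 0;
}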

Do you see the lines of decreasing slope in Equation \ref{eqn:expectation_line} yet? It's okay if you don't. Look at the expression inside the $\inf$: the $y$-intercept is the part in parentheses, and the slope is $-S_{l-1}$, which is decreasing in $l$. So we index our lines by $l$.
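Written out explicitly (just a relabeling of Equation \ref{eqn:expectation_line}), the line indexed by $l$ is \begin{equation} y_l(x) = \underbrace{-S_{l-1}}_{\text{slope}}x + \underbrace{\left(E_{k - 1, l - 1} - E_{1,l-1} + S_{l-1}R_{l-1}\right)}_{y\text{-intercept}}, \end{equation} so that $E_{k,n} = E_{1,n} + \inf_{k \leq l \leq n} y_l(R_n)$. In other words, we want the lower envelope of these lines evaluated at $x = R_n$. Since $S_{l-1}$ increases with $l$, the slopes strictly decrease as lines are added, and since $R_n$ increases with $n$, the query points only move right: exactly the setting for the convex hull trick.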

Fix $k$. Imagine that we have calculated $E_{j,n}$ for all $j < k$. Now, we're going to calculate $E_{k,k},E_{k,k+1},\ldots, E_{k,N}$ in that order. To calculate $E_{k,n}$, we only need to consider the lines $l = k,\ldots,n$. The intercept of line $l$ does not vary as a function of $n$, so we can add lines one by one: before calculating $E_{k,n}$, we add the line with slope $-S_{n-1}$. Our answer will be $E_{K,N}$.

In this way, the convex hull trick turns an $O(KN^2)$ dynamic program into an $O(KN\log N)$ one. Notice that here, instead of building our convex hull before making any queries like in the previous problem, we update it dynamically. Here's the code; the indexing differs slightly from the math above since the code is $0$-indexed.

#include <algorithm>
#include <iomanip>
#include <iostream>
#include <utility>
#include <vector>

using namespace std;

class Line {
private:
  long double m, b; // y = mx + b
public:
  Line(long double m, long double b) : m(m), b(b) {}
  long double slope() { return m; }
  long double& yIntercept() { return b; }  
  pair<long double, long double> intersect(Line other) {
    long double x = (this -> b - other.yIntercept())/(other.slope() - this -> m);
    long double y = (this -> m)*x + this -> b;
    return make_pair(x, y);
  }
  long double getY(long double x) { return m*x + b; }
};

int main(int argc, char *argv[]) {
  ios::sync_with_stdio(false); cin.tie(NULL);
  int N, K; cin >> N >> K; // number of levels and regions
  vector<int> T; T.reserve(N);  // tokens
  for (int i = 0; i < N; ++i) {
    int t; cin >> t;
    T.push_back(t);
  }
  vector<long long> S; S.reserve(N); // cumulative sum of tokens
  vector<long double> R; R.reserve(N); // cumulative sum of token reciprocals
  S.push_back(T.front());
  R.push_back(1.0/T.front());
  for (int n = 1; n < N; ++n) {
    S.push_back(S.back() + T[n]);
    R.push_back(R.back() + 1.0L/T[n]);
  }
  /* let eV[k][n] be the minimum expected time to beat
   * levels 0,1,...,n-1 spread across regions 0,1,...,k-1
   */ 
  vector<vector<long double>> eV(K, vector<long double>(N)); // only indices eV[k][n] with n >= k are valid
  eV.front().front() = 1;
  for (int n = 1; n < N; ++n) { // time to beat each level is a geometric distribution
    eV[0][n] = eV[0][n-1] + ((long double) S[n])/T[n];
  }
  /* eV[k][n] = min(eV[k-1][l-1] + (S[l]-S[l-1])/T[l] + ... + (S[n]-S[l-1])/T[n]),
   * where we vary k <= l <= n. That is, we're putting everything with index at least l
   * in the last region. Note that each region needs at least 1 level. This simplifies to
   * eV[k][n] = min(eV[k-1][l-1] + eV[0][n] - eV[0][l-1] - S[l-1](R[n] - R[l-1]))
   *          = eV[0][n] + min{(eV[k-1][l-1] - eV[0][l-1] + S[l-1]R[l-1]) - S[l-1]R[n]}.
   * Thus, for each l we have a line with slope -S[l-1].
   * -S[l-1] is strictly decreasing and R[n] is strictly increasing.
   * Use the convex hull trick.
   */ 
  vector<pair<long double, Line>> convexHull; convexHull.reserve(N);
  for (int k = 1; k < K; ++k) {    
    /* we add lines in order of decreasing slope;
     * convexHull[i].first is the x-coordinate where
     * convexHull[i].second and convexHull[i-1].second intersect
     */
    convexHull.clear();
    int cursor = 0;
    for (int n = k; n < N; ++n) { // calculate eV[k][n]
      /* add lines l = k,...,n to build convex hull
       * loop invariant is that lines l = k,...,n-1 have been added, so just add line l = n
       */
      long double slope = -S[n - 1];
      long double yIntercept = eV[k - 1][n - 1] - eV[0][n - 1] + S[n - 1]*R[n - 1];
      Line l(slope, yIntercept);
      while (convexHull.size() > 1 && 
             convexHull.back().first >= l.intersect(convexHull[convexHull.size() - 2].second).first) {
        convexHull.pop_back(); // get rid of irrelevant lines by checking that line intersection is to the left
      }
      // if only one line is left, compare y-intercepts, that is, the intersections with x = 0;
      // queries use x = R[n] > 0 and the new line has the smaller slope,
      // so the old line is never optimal if its intercept is at least as large
      if (convexHull.size() == 1 && l.yIntercept() <= convexHull.back().second.yIntercept()) convexHull.pop_back();
      convexHull.emplace_back(convexHull.empty() ? 0 : l.intersect(convexHull.back().second).first, l); // add line
      /* binary search for the correct line since hull entries are sorted by intersection x-coordinate;
       * the old cursor (clamped in case pops shrank the hull) is a valid lower bound,
       * since every entry before it has an intersection x-coordinate at most R[n-1] <= R[n]
       */
      if (cursor >= (int) convexHull.size()) cursor = (int) convexHull.size() - 1;
      cursor = upper_bound(convexHull.begin() + cursor, convexHull.end(), R[n],
                           [](long double x, const pair<long double, Line> &l) { return x < l.first; }) - convexHull.begin();
      --cursor; // we want the last entry whose intersection x-coordinate is at most R[n]
      eV[k][n] = eV[0][n] + convexHull[cursor].second.getY(R[n]);

    }
  }
  cout << fixed;
  cout << setprecision(10);
  cout << eV.back().back() << endl;
  return 0;
}

By the way, the drawing was done with GeoGebra. It's a pretty cool way to do math.