PhillyPham

When stuck inside because of the snow, what else is there to do but bake and code? I made these brownies here. They are amazingly moist, but I'll probably cut down on the sugar next time I make them.

On the algorithms side, I finally got to the Platinum Division on the USA Computing Olympiad. It's primarily for high school students, but I find it fun to participate anyway.

One of the competition's problems makes clever use of binary search, which I want to write about. Basically, there are times when the solution is very hard to compute, but it is not too costly to verify. If the solution is numerical and bounded, we can guess solutions with binary search. I've actually been quite familiar with this strategy for a year now, but somehow I missed it in this particular problem. Here, we use a two-dimensional binary search. Thankfully, I got enough points to get promoted to the next division anyway.

Angry Cows

Here's the problem statement:

Bessie the cow has designed what she thinks will be the next big hit video game: "Angry Cows". The premise, which she believes is completely original, is that the player shoots a cow with a slingshot into a one-dimensional scene consisting of a set of hay bales located at various points on a number line; the cow lands with sufficient force to detonate the hay bales in close proximity to her landing site, which in turn might set off a chain reaction that causes additional hay bales to explode. The goal is to use a single cow to start a chain reaction that detonates all the hay bales. There are $N$ hay bales located at distinct integer positions $x_1,x_2,\ldots,x_N$ on the number line. If a cow is launched with power $R$ landing at position $x$, this will cause a blast of "radius $R$", engulfing all hay bales within the range $x-R \ldots x+R$. These hay bales then themselves explode (all simultaneously), each with a blast radius of $R-1$. Any not-yet-exploded bales caught in these blasts then all explode (all simultaneously) with blast radius $R-2$, and so on.

Please determine the minimum amount of power $R$ with which a single cow may be launched so that, if it lands at an appropriate location, it will cause subsequent detonation of every single hay bale in the scene.

INPUT FORMAT (file angry.in):

The first line of input contains $N$ ($2 \leq N \leq 50,000$). The remaining $N$ lines all contain integers $x_1 \ldots x_N$ (each in the range $0 \ldots 1,000,000,000$).

OUTPUT FORMAT (file angry.out):

Please output the minimum power $R$ with which a cow must be launched in order to detonate all the hay bales. Answers should be rounded and printed to exactly $1$ decimal point.

So, assume the hay bales are sorted so that $x_1 \leq \cdots \leq x_N$. The minimum blast radius must be at most $(x_N - x_1)/2$ since we can just launch a cow with that power at the midpoint and destroy all the hay bales without the chain reaction. It's also worth noting that if the optimal blast radius is $R^*,$ then $2R^* \in \mathbb{Z}$, that is, twice the optimal blast radius is an integer. Since all the hay bales are located at integer coordinates, adding less than $0.5$ to the radius will never encompass another hay bale. Finally, the last observation is that we should fire the cow so that the very left of the blast lines up exactly with a hay bale since we would not gain anything by having the hay bale strictly inside the blast radius.

Let $L$ be the index of the leftmost hay bale hit by the initial blast. Thus, we could brute force by trying all $2R^* \in \{0,1,\ldots,x_N-x_1\}$ and $L \in \{1,2,\ldots,N\}$. To check if such values work, we can simulate the chain reaction which takes $O(N)$ time. Thus, brute force would take $O\left(N^2(x_N - x_1)\right)$ time. This is where binary search comes in.
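For tiny inputs, the brute force just described can be sketched directly, which is handy for cross-checking a faster solution. This is only an illustration: the class name `AngryBrute` and the helpers `detonatesAll` and `minPower` are my own names, and the simulation assumes, as argued above, that the left edge of the first blast sits exactly on a hay bale.

```java
import java.util.Arrays;

public class AngryBrute {
    /** Simulates the chain reaction for diameter t = 2R when the blast's left edge sits on x[l]. */
    static boolean detonatesAll(int[] x, int l, int t) {
        int n = x.length;
        // Leftward chain: x[l] is the leftmost bale in the first blast; each wave has radius one less.
        double r = t / 2.0;
        int i = l;
        while (i > 0) {
            r -= 1.0;
            int j = i;
            while (j > 0 && x[j - 1] >= x[i] - r) --j;
            if (j == i) return false; // the wave reached no new bale
            i = j;
        }
        // Rightward chain: start from the rightmost bale inside [x[l], x[l] + t].
        r = t / 2.0;
        i = l;
        while (i < n - 1 && x[i + 1] <= (long) x[l] + t) ++i;
        while (i < n - 1) {
            r -= 1.0;
            int j = i;
            while (j < n - 1 && x[j + 1] <= x[i] + r) ++j;
            if (j == i) return false; // the wave reached no new bale
            i = j;
        }
        return true;
    }

    /** Tries every diameter t = 2R and every leftmost index l; only sensible for tiny inputs. */
    static double minPower(int[] bales) {
        int[] x = bales.clone();
        Arrays.sort(x);
        int range = x[x.length - 1] - x[0];
        for (int t = 0; t <= range; ++t) {
            for (int l = 0; l < x.length; ++l) {
                if (detonatesAll(x, l, t)) return t / 2.0;
            }
        }
        return range / 2.0; // not reached: t = range with l = 0 always works
    }

    public static void main(String[] args) {
        // Bales at 1, 3, 8, 10, 11: the gap from 3 to 8 forces a first blast of diameter at least 5.
        System.out.println(minPower(new int[] {8, 10, 3, 11, 1}));
    }
}
```

For the bales in `main`, `minPower` returns 3.0: diameter 5 fails for every choice of leftmost bale, while diameter 6 with its left edge on the bale at 3 works.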

During the contest, it was obvious to me that we should do a binary search to find $2R^*$ considering that $x_N - x_1$ could be as large as $10^9$. However, this is not fast enough, as that only gets us $O\left(N^2\log(x_N-x_1)\right)$ time, and $N^2$ can be as large as $2.5 \times 10^9$. After sleeping on it, I made the key insight that we can binary search on the index of the leftmost hay bale, too, so now we have $O\left(N\log(N)\log(x_N-x_1)\right)$ time, which is adequate.

To make this explicit, here's the code:

import java.io.*;
import java.util.*;

public class angry {

    /* check that all the hay bales to the left of idx explode 
     * if we throw cow of power T/2 at hayBales[idx] + T/2
     */
    public static boolean leftExplodes(int idx, int T, int[] hayBales) {
        double currentFloor = hayBales[idx];
        double currentR = T/2.0;
        int left; // leftmost exploded bale
        for (left = idx; left >= 0 && hayBales[left] >= currentFloor; --left) {
            if (left == 0 || hayBales[left - 1] >= currentFloor) continue;
            currentR -= 1.0;
            currentFloor = hayBales[left] - currentR;
        }
        return left == -1;
    }

    public static boolean isDiameterPossible(int T, int[] hayBales) {
        int N = hayBales.length;
        int leftMin = 0; // inclusive
        int leftMax = N; // exclusive         
        int leftIdx = leftMin + (leftMax - leftMin)/2;
        while (leftMin < leftMax) { // find smallest left such that this doesn't work
            if (leftExplodes(leftIdx, T, hayBales)) {
                leftMin = leftIdx + 1;
            } else {
                leftMax = leftIdx;
            }
            leftIdx = leftMin + (leftMax - leftMin)/2;            
        }        
        --leftIdx; // this works
        // now check that the right explodes
        double currentCeiling = hayBales[leftIdx] + T;
        double currentR = T/2.0;
        int right;
        for (right = leftIdx; right < N && hayBales[right] <= currentCeiling; ++right) {
            if (right == N - 1 || hayBales[right + 1] <= currentCeiling)  continue;
            currentR -= 1.0;
            currentCeiling = hayBales[right] + currentR;
        }        
        return right == N;        
    }        

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new FileReader("angry.in"));
        PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter("angry.out")));
        int N = Integer.parseInt(in.readLine());
        int[] hayBales = new int[N];
        for (int n = 0; n < N; ++n) hayBales[n] = Integer.parseInt(in.readLine());
        Arrays.sort(hayBales);

        // search for T = 2R
        int minT = 0; int maxT = hayBales[N - 1] - hayBales[0];
        int T = minT + (maxT - minT)/2;       
        while (minT < maxT) { // find smallest T that works
            if (isDiameterPossible(T, hayBales)) {
                maxT = T;
            } else {
                minT = T + 1;
            }
            T = minT + (maxT - minT)/2;
        }
        out.printf("%.1f\n", T/2.0);
        in.close();
        out.close();
    }

}
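Both searches above follow the same pattern: find the boundary of a monotone yes/no question. As a minimal sketch, that pattern can be pulled out into a helper. The name `firstTrue` and the use of `IntPredicate` are my own choices for illustration, and the helper assumes the predicate is false up to some point and true from there on, with the upper bound known to be true.

```java
import java.util.function.IntPredicate;

public class FirstTrue {
    /**
     * Returns the smallest t in [lo, hi] for which ok.test(t) is true.
     * Assumes ok is monotone (false ... false true ... true) and ok.test(hi) holds.
     */
    static int firstTrue(int lo, int hi, IntPredicate ok) {
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2; // avoids overflow of lo + hi
            if (ok.test(mid)) {
                hi = mid;      // mid works, so the answer is mid or smaller
            } else {
                lo = mid + 1;  // mid fails, so the answer is strictly larger
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Smallest t with t * t >= 50; prints 8.
        System.out.println(firstTrue(0, 100, t -> t * t >= 50));
    }
}
```

With such a helper, the outer loop in `main` would amount to `firstTrue(0, maxT, t -> isDiameterPossible(t, hayBales))`.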

Tags: algorithm, usaco, life, cooking

After finishing Far From the Madding Crowd, I started reading Middlemarch by George Eliot. This is one of the first books that I've read by a female author. Despite its length at over 800 pages, I found the nearly 6 months it took me to finish worthwhile.

Middlemarch details the happenings of a small provincial town, primarily focusing on the lives of Dorothea, a serious young woman committed to devoting her life to a higher cause, and Dr. Lydgate, a French-educated doctor with grand ambitions of making a profound medical discovery.

Clearly, both Dorothea and Dr. Lydgate have noble intentions. Throughout the novel, they are proven to be of good character, too. One of the novel's themes is how the "imperfect social state" can make carrying out such noble intentions impossible. For Dorothea, the imperfect social state is the second-class role of women in society along with her naive marriage to the older Mr. Casaubon. Dr. Lydgate confronts a town in upheaval, mistrustful of change and his new medical ideas. His somewhat hidebound view of marriage traps him in a marriage with the spendthrift Rosamond.

One very interesting aspect of this novel is that both Dorothea and Rosamond are married without children. Having acquired husbands, being members of the gentry, and not having children to raise, both characters struggle with ennui and with what exactly a woman can do. Many of the male characters are dismissive of women's capacity for serious intellectual endeavors and see them only as entertainment. In the end, I find the novel to be ambivalent on a woman's role.

On one hand, Dorothea's impulsive, self-sacrificing nature leads to her disastrous first marriage with Mr. Casaubon, but Dorothea's second self-sacrifice ends in happiness. Dorothea gives up her grand ideas of improving the lot of the poor with her fortune, yet the narrator notes that the "growing good of the world is partly dependent on unhistoric acts" such as simply being a wife and mother. As for Rosamond, we should despise her for secretly disobeying her husband's wishes and using her beauty to manipulate, yet we are made to pity her because we realize these are the only mechanisms of agency that she has. Perhaps the message here is simply that women should have more freedom to choose their life, whether it be as a wife or fighting for social reform.

Given my perpetual loneliness, I found the romance and marriages of the novel most interesting. Mr. Farebrother's advice to Fred Vincy, "Men outlive their love, but they don't outlive the consequences of their recklessness," I found all too humorous as I'm still living with the consequences of my recklessness. While I often yearn for the seemingly simple, intentional nature of Victorian courtship, both Dorothea's first marriage and Dr. Lydgate's marriage to Rosamond lead to unhappiness. Caleb Garth perhaps says it best:

Young folks may get fond of each other before they know what life is, and they may think it all holiday if they can only get together; but it soon turns into working day,

which alludes to the same idea that occurred in Far from the Madding Crowd. Dates are often rather artificial environments where we only see one side of the person. We associate in pleasure away from the true hardships of life. I have often thought that this is why couples from The Bachelor and The Bachelorette never last.

I suppose that there is something to be said for these so-called "organic" relationships that spring up by chance because you and the other person just naturally have similar interests or friends. When I was watching Master of None, I thought about how Rachel and Dev get into a relationship. It's a nearly year-long process of chance encounters: one night stand, run into each other somewhere, go on a date, hook up, run into each other again, etc. Because I never could stand the ambiguity of intentions in this type of relationship, I often envied how simple the process was in Victorian novels, where a guy would simply visit her house a few times, maybe talk to her father, and then propose. I have to admit, though, that perhaps this "organic" type of relationship may lead to more "similarity of pursuits" that binds a couple more strongly since if you're spending more time together by chance it's very likely that you're similar people. Perhaps this is why online dating fails for so many.

Anyway, I'm just rambling now. The best is probably a compromise between the two. Relationships can't be forced, so there needs to be a certain amount of chemistry. But in my limited experience, I do think people are putting a little too much faith in chance, for they don't want to be seen as trying too hard. Both men and women being more honest and intentional would probably save a lot of lonely souls out there.

Tags: literature, life

Yesterday, I made a leg of lamb. I rather liked how it turned out, so I'll write some notes here for posterity.

Ingredients

3 lbs leg of lamb, preferably with the bone
Spices
- 1 teaspoon salt
- 1/4 cup olive oil
- 1/2 teaspoon black pepper
- 1 lime/lemon
- 3-4 sprigs of fresh rosemary
- 1 teaspoon thyme
- 4 cloves garlic
- 1 teaspoon onion granules
2 cups chicken stock

Directions

Zest the lime and squeeze out the juice. Combine with all the spices. Make slits in the lamb and rub in the spice olive oil mixture. It looks like this.
Sear the leg of lamb in a dutch oven with lard. This takes about 3 minutes per side on medium-high heat. Add chicken stock to the dutch oven.
Braise it for about 45 minutes in the dutch oven with the lid on at 325 degrees Fahrenheit. Remove the lid and cook it another 30 minutes at 350 degrees. When done, it will look like this.
Slice and serve with reduced chicken stock. The bone marrow is an especially nice treat.

Tags: recipe, cooking

In probability, math contests, and programming contests, we often need to count. Here, I'll write about a few cases that I see pretty often. Before we jump into things, recall the binomial coefficient and various ways of calculating it: $$ {n \choose k} = \frac{n!}{k!(n-k)!} = {n - 1 \choose k } + {n - 1 \choose k - 1 }, $$ where $n \geq k$, and the coefficient is $0$ if $n < k.$ Thus, we compute the binomial coefficient with dynamic programming by using a triangular array: \begin{matrix} 1 \\ 1 & 1 \\ 1 & 2 & 1\\ 1 & 3 & 3 & 1\\ 1 & 4 & 6 & 4 & 1\\ \vdots & \vdots & \vdots & \ddots & \ddots & \ddots \end{matrix} where if we define $$ X_{n,k} = \begin{cases} {n \choose k}, &k \leq n \\ 0, &k > n, \end{cases} $$ then ${n \choose k} = X_{n,k} = X_{n-1,k} + X_{n-1,k-1}.$

We will see that we can count many things in this manner.
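As a concrete sketch of the triangular array, here is one way to fill it row by row with the recurrence $X_{n,k} = X_{n-1,k} + X_{n-1,k-1}$. The class and method names are hypothetical, and `long` entries overflow somewhere past $n = 66$, so this is only meant for small $n$.

```java
import java.util.Arrays;

public class PascalTriangle {
    /** Returns x with x[n][k] = C(n, k) for 0 <= k <= n <= maxN, filled row by row. */
    static long[][] build(int maxN) {
        long[][] x = new long[maxN + 1][];
        for (int n = 0; n <= maxN; ++n) {
            x[n] = new long[n + 1];
            x[n][0] = 1; // choose nothing
            x[n][n] = 1; // choose everything
            for (int k = 1; k < n; ++k) {
                x[n][k] = x[n - 1][k] + x[n - 1][k - 1];
            }
        }
        return x;
    }

    public static void main(String[] args) {
        long[][] x = build(10);
        System.out.println(Arrays.toString(x[4])); // prints [1, 4, 6, 4, 1]
        System.out.println(x[10][5]);              // prints 252
    }
}
```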

From $n$ objects with replacement

Some cases when we draw from $n$ objects with replacement are an infinite deck of cards, assigning types or categories, or drawing from $\{1,2,3,4,5,6\}$ by rolling a die.

Ordered set of $k$ objects

If we are sampling $k$ objects from $n$ objects with replacement, each of the $k$ objects has $n$ possibilities. Thus, there are $\boxed{n^k}$ possibilities.

For example, if $k$ distinct people are buying from a selection of $n$ ice cream flavors, there are $n^k$ different ways this group of people can order. Another common situation is a sequence of $k$ bits. In this case $n = 2,$ so there are $2^k$ possible sequences.

Unordered set of $k$ objects

Another way to think of this is putting $k$ balls into $n$ bins, where the balls are not distinct. The method for solving this problem is also known as stars and bars.

Imagine each ball as a star, so we have $k$ of them. Along with the stars we have $n - 1$ bars. Now arrange these $(n-1) + k$ objects in any order. For example, take $$\star\star \mid \mid \star \mid \star\star\star.$$ This order would correspond to $2$ balls in bin $1$, $0$ balls in bin $2$, $1$ ball in bin $3$, and $3$ balls in bin $4$. We would have $(n-1 + k)!$ orderings if the objects were distinct. Since the $k$ balls and $n-1$ bars are identical, we divide by $(n-1)!$ and $k!$. Thus, the number of possible sets is $$ \frac{(n-1 + k)!}{(n-1)!k!} = \boxed{{n + k - 1\choose k}.} $$
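To sanity-check the stars and bars count, a small sketch can compare the formula against direct enumeration of how many balls land in each bin. The names `choose` and `countByEnumeration` are made up for this illustration, and the enumeration assumes at least one bin.

```java
public class StarsAndBars {
    /** C(n, k) via the multiplicative formula; every intermediate value is itself a binomial. */
    static long choose(int n, int k) {
        if (k < 0 || k > n) return 0;
        long c = 1;
        for (int i = 1; i <= k; ++i) {
            c = c * (n - k + i) / i; // after this step, c = C(n - k + i, i)
        }
        return c;
    }

    /** Counts ways to put `balls` identical balls into `bins` labeled bins (bins >= 1). */
    static long countByEnumeration(int bins, int balls) {
        if (bins == 1) return 1; // all remaining balls go into the last bin
        long total = 0;
        for (int first = 0; first <= balls; ++first) {
            total += countByEnumeration(bins - 1, balls - first);
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 4, k = 6; // 4 bins and 6 balls, as in the picture above
        System.out.println(countByEnumeration(n, k)); // prints 84
        System.out.println(choose(n + k - 1, k));     // prints 84
    }
}
```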

From $n$ objects without replacement

This scenario commonly occurs when we have a finite collection of objects such as a deck of cards.

Unordered set of $k$ objects

We might see this situation when counting the number of $5$-card hands in poker for instance. The order that you draw the cards doesn't matter.

If we have $n$ cards, for the first card there are $n$ possibilities. For the next card, there are $n-1$ possibilities. Thus, if we draw $k$ cards, we have $n(n-1)\cdots(n-k+1) = n!/(n-k)!$ possible draws. Since the order doesn't matter, we divide by $k!.$ Thus, the count is $$\frac{n!}{(n-k)!k!} = \boxed{{n \choose k}.}$$

This makes the formula $${n \choose k} = {n-1 \choose k} + {n - 1 \choose k -1}$$ for computing the binomial coefficient intuitive. Imagine trying to choose $k$ objects from $n$ objects. We can either include or not include the $n$th object. If we don't include the $n$th object we choose $k$ objects from the first $n-1$ objects, which gives us the ${n-1 \choose k}$ term. If we do include the $n$th object then, we only choose $k-1$ objects from the first $n-1$ objects, which gives us the ${n-1 \choose k - 1}$ term.

Another common use of the binomial coefficient is counting paths. Suppose we are on a grid, we can only make right and down moves, and we are trying to get from $A$ to $B$, where $B$ is $k$ moves to the right and $l$ moves downward. Our set of $n = k + l$ objects is $\{1, 2,\ldots,n\}.$ We choose $k$ indices to make a right move, and the rest of the moves will be down. Then, the number of paths is ${n \choose k}.$
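The path count can also be checked with a small dynamic program over the grid, where each cell stores the number of ways to reach it. This is a sketch with a hypothetical `countPaths` helper, not something the argument above depends on.

```java
public class GridPaths {
    /** Counts paths that use exactly `right` right moves and `down` down moves. */
    static long countPaths(int right, int down) {
        long[][] ways = new long[down + 1][right + 1];
        for (int i = 0; i <= down; ++i) {
            for (int j = 0; j <= right; ++j) {
                if (i == 0 || j == 0) {
                    ways[i][j] = 1; // only one straight path along an edge
                } else {
                    ways[i][j] = ways[i - 1][j] + ways[i][j - 1]; // arrive from above or from the left
                }
            }
        }
        return ways[down][right];
    }

    public static void main(String[] args) {
        // k = 3 right moves and l = 2 down moves: C(5, 3) = 10.
        System.out.println(countPaths(3, 2)); // prints 10
    }
}
```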

Ordered set of $k$ objects

In this case, we care about the order of the cards we draw. From the discussion above, it's calculated the same way as an unordered set of $k$ objects except we don't divide by $k!$, so the number of possible draws is $$ n(n-1)\cdots(n-k+1) = \boxed{\frac{n!}{(n-k)!} = (n)_k,} $$ where I have used the Pochhammer symbol to denote the falling factorial.

Rencontres Numbers

These count the number of permutations with a certain number of fixed points. A permutation on $n$ elements is an element of the symmetric group $S_n$. For those without a background in algebra, it is essentially a way of rearranging $n$ objects. From our discussion above, there are $n!$ ways to do so. For example, $$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4\\ 3 & 2 & 4 & 1\\ \end{pmatrix}$$ reorders $(1,2,3,4)$ as $(3,2,4,1)$. $\sigma$ can be thought of as a function $\sigma(1) = 3$, $\sigma(2) = 2$, $\sigma(3) = 4$, and $\sigma(4) = 1$. Since for only $x = 2$ do we have $\sigma(x) = x$, there is only $1$ fixed point.

Now let $D_{n,k}$ be the number of permutations of $n$ objects with $k$ fixed points. Let $D_{0,0} = 1$. Clearly, $D_{1,0} = 0$ since $\sigma(1) = 1$ is the only possible permutation for $n = 1.$ Then, we have the recursive formula for $k > 0$ $$ D_{n,k} = {n \choose k}D_{n-k,0}, $$ which can be thought of as taking an unordered set of $k$ points to be fixed from $\{1,2,\ldots,n\}$, hence the ${n \choose k}$. For the remaining $n - k$ points, we have no fixed points because we want exactly $k$ fixed points.

So if we know $D_{0,0}, D_{1,0},\ldots,D_{n,0}$, we can calculate $D_{n+1,1}, D_{n+1,2},\ldots, D_{n+1,n+1}$. Then, for $k = 0$, since there are $(n+1)!$ total permutations, we have that $$ D_{n+1,0} = (n+1)! - \sum_{k=1}^{n+1}D_{n+1,k}. $$ There's a better way to calculate $D_{n,0}$, though, which I learned at CTY. These permutations with no fixed points are called derangements. Clearly, $D_{0,0} = 1$ and $D_{1,0} = 0$.

Now assume $n \geq 2$. Focus on element $n$. A permutation can be thought of as disjoint cycles. Recall the notation in abstract algebra, where we may write $$\sigma = \begin{pmatrix}1 & 2 & 3\end{pmatrix}\begin{pmatrix}4 & 5\end{pmatrix}\in S_5,$$ which gives us $2$ cycles, one of length $3$ and the other of length $2$. A cycle of length $1$ is a fixed point, so $n$ is part of a cycle of length $2$ or more than $2$. In the case that $n$ is part of a cycle of length $2$, there are $n-1$ options for the other element in the cycle. The number of ways to permute the remaining elements is $D_{n-2,0}$. In the case that $n$ is part of a cycle of length greater than $2$, we can consider permuting the first $n-1$ elements with no fixed points. For each such permutation, we have $n - 1$ elements after which we can insert element $n$, so it becomes part of an existing cycle. In this way, we have that $$ D_{n,0} = (n-1)\left(D_{n-1,0} + D_{n-2,0}\right). $$

Again, we have a triangular array \begin{matrix} 1 \\ 0 & 1 \\ 1 & 0 & 1\\ 2 & 3 & 0 & 1\\ 9 & 8 & 6 & 0 & 1\\ \vdots & \vdots & \vdots & \ddots & \ddots & \ddots \end{matrix}

One helpful way to visualize this process that I like is to imagine a dance class with $n$ couples. After each dance everyone has to find a new partner. There are $D_{n,k}$ ways that $k$ couples stay the same.
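Putting the two recurrences together, the triangular array can be generated with a short sketch like the one below. The class name and layout are my own, and `long` arithmetic limits this to roughly $n \leq 20$ since the entries of each row sum to $n!$.

```java
import java.util.Arrays;

public class Rencontres {
    /** Returns d with d[n][k] = number of permutations of n elements with exactly k fixed points. */
    static long[][] table(int maxN) {
        // Derangements via D_{n,0} = (n - 1)(D_{n-1,0} + D_{n-2,0}), with D_{0,0} = 1 and D_{1,0} = 0.
        long[] derange = new long[maxN + 1];
        derange[0] = 1;
        for (int n = 2; n <= maxN; ++n) {
            derange[n] = (n - 1) * (derange[n - 1] + derange[n - 2]);
        }
        long[][] d = new long[maxN + 1][];
        for (int n = 0; n <= maxN; ++n) {
            d[n] = new long[n + 1];
            long choose = 1; // C(n, 0)
            for (int k = 0; k <= n; ++k) {
                d[n][k] = choose * derange[n - k];   // D_{n,k} = C(n, k) D_{n-k,0}
                choose = choose * (n - k) / (k + 1); // becomes C(n, k + 1)
            }
        }
        return d;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(table(4)[4])); // prints [9, 8, 6, 0, 1]
    }
}
```

The row for $n = 4$ sums to $24 = 4!$, as it should.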

Bell Numbers

The Bell numbers count the ways to partition a set. Consider the set $S = \{1,2,3\}$. The possible nonempty subsets are $\{1\}$, $\{2\}$, $\{3\}$, $\{1,2\}$, $\{2,3\}$, $\{1,3\}$, and $\{1,2,3\}$. A partition would be a group of disjoint nonempty subsets such that each element of $S$ is an element of some subset in the partition. Thus, our partitions are $\left\{\{1\},\{2\},\{3\}\right\}$, $\left\{\{1\},\{2,3\}\right\}$, $\left\{\{1, 3\},\{2\}\right\}$, $\left\{\{1, 2\},\{3\}\right\}$, and $\left\{\{1, 2, 3\}\right\}$.

Let $B_n$ be the number of ways to partition a set of $n$ objects. $B_0 = 1$, $B_1 = 1$, $B_2 = 2$ and $B_3 = 5$ for example. In general to calculate $B_{n+1}$, we have the recurrence relation $$ \boxed{B_{n+1} = \sum_{k=0}^n{n \choose k}B_{n-k} = \sum_{k=0}^n{n \choose k}B_{k}} $$ since ${n \choose k} = {n \choose n-k}$. To see this, consider partitions of the set $\{1,2,\ldots,n,n+1\}$. In the partition there is a subset that contains $n+1$, say $S$. We can have $|S| = k + 1 \in \{1,2,\ldots,n,n+1\}$. Clearly, $n+1 \in S$. Choosing the other $k$ elements of $S$ amounts to selecting an unordered set from $\{1,2,\ldots,n\}$, hence the ${n \choose k}$ factor in each term. For the remaining $n + 1 - (k+1) = n - k$ objects there are $B_{n-k}$ ways to partition them. Thus, we have the terms ${n \choose k}B_{n-k}$. We avoid double counting since the partitions corresponding to each term are disjoint because $n+1$ is in a subset of different size.
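The recurrence translates almost directly into code. Below is a minimal sketch using the form $B_{n+1} = \sum_{k=0}^n {n \choose k}B_k$; the class and method names are invented for this example, and `long` only holds the Bell numbers up to about $n = 25$.

```java
import java.util.Arrays;

public class BellNumbers {
    /** Returns b with b[n] = B_n for 0 <= n <= maxN. */
    static long[] bell(int maxN) {
        long[] b = new long[maxN + 1];
        b[0] = 1;
        for (int n = 0; n < maxN; ++n) {
            long choose = 1; // C(n, 0)
            for (int k = 0; k <= n; ++k) {
                b[n + 1] += choose * b[k];
                choose = choose * (n - k) / (k + 1); // becomes C(n, k + 1)
            }
        }
        return b;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(bell(6))); // prints [1, 1, 2, 5, 15, 52, 203]
    }
}
```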

Catalan Numbers

Consider strings of $n$ $0$s and $n$ $1$s, so the string has length $2n$. Let $\Omega^{2n}$ be the set of such strings where no initial substring has more $1$s than $0$s, and let $C_n = \left|\Omega^{2n}\right|$ be the number of them. Formally, $$C_n = \left|\left\{ X = (x_1,x_2,\ldots,x_{2n}) \in \{0,1\}^{2n} : \sum_{i=1}^{2n} x_i = n, \sum_{i=1}^k x_i \leq \frac{k}{2}, \forall k \in \{1,2,\ldots,2n\}\right\}\right|.$$ $C_n$ is the $n$th Catalan number.

To calculate these numbers first note that $C_0 = 1$ and that every $X \in \Omega^{2(n+1)}$ for $n \geq 0$ can be written $$X = (0,X_1,1,X_2),$$ where $X_1$ and $X_2$ are elements of $\Omega^{2k}$ and $\Omega^{2(n-k)}$, respectively, for some $k \in \{0,1,\ldots,n\}$. Such a form is unique. To see this, let $X = (x_1,x_2,\ldots,x_{2n},x_{2n+1},x_{2n+2})$. Note that by the definition, the first number in the string must be a $0$. Since the total numbers of $0$s and $1$s in the sequence must be equal, there exists an even index $j$ such that $\sum_{i=1}^j x_i = j/2$. Fix $j$ to be the smallest such index. We must have that $x_j = 1$ since otherwise the definition would have been violated as $\sum_{i=1}^{j-1}x_i = j/2 > (j-1)/2$.

Then, we'll have $$X = (x_1 = 0,X_1,x_j = 1,X_2),$$ where $X_1$ is a string of length $2k = j-2$ and $X_2$ has length $2n + 2 - j = 2n-2k$. We show that $X_1 \in \Omega^{2k}$ and $X_2 \in \Omega^{2(n-k)}$. Since there are an equal number of $0$s and $1$s at index $j$, $X_1$ must have an equal number of $0$s and $1$s. If at any point $1 \leq l \leq 2k$, we have that $\sum_{i=2}^{l + 1}x_i > l/2$ then $\sum_{i=1}^{l+1}x_i \geq (l+1)/2$, which implies that $X \not\in \Omega^{2(n+1)}$ or that there is an index smaller than $j$ such that the initial substring has an equal number of $0$s and $1$s since $l+1 \leq j-1$. Both are a contradiction so we have $X_1 \in \Omega^{2k}$. Showing $X_2 \in \Omega^{2(n-k)}$ is similar. We have that $X_2$ must have an equal number of $0$s and $1$s in order for the whole string to have an equal number of $0$s and $1$s. If for any $1 \leq l \leq 2(n-k)$, we have that $\sum_{i=j+1}^{j+l}x_i > l/2$, then $$ \sum_{i=1}^{j+l}x_i = \sum_{i=1}^{j}x_i + \sum_{i=j+1}^{j+l}x_i = \frac{j}{2} + \sum_{i=j+1}^{j+l}x_i > \frac{j}{2} + \frac{l}{2} = \frac{j+l}{2}, $$ which implies that $X \not\in \Omega^{2(n+1)}$, which is a contradiction.

Thus, we have our desired result that $X = (x_1 = 0,X_1,x_j = 1,X_2)$, where $X_1 \in \Omega^{2k}$ and $X_2 \in \Omega^{2(n-k)}$, where $k \in \{0,1,\ldots,n\}$. Varying $k$, we come upon the recurrence relation $$\boxed{C_{n+1} = \sum_{k=0}^nC_kC_{n-k}.}$$ This is a pretty nice solution, but we can actually do better and find a closed-form solution.
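As a quick check on the recurrence, the sketch below computes $C_n$ from $C_{n+1} = \sum_{k=0}^n C_kC_{n-k}$ and compares it with a direct count of the valid strings for small $n$. The helper names are hypothetical, and the brute force enumerates all $2^{2n}$ bit strings, so it is only usable for small $n$.

```java
public class CatalanRecurrence {
    /** Returns c with c[n] = C_n, using C_{n+1} = sum_{k=0}^{n} C_k C_{n-k}. */
    static long[] catalan(int maxN) {
        long[] c = new long[maxN + 1];
        c[0] = 1;
        for (int n = 0; n < maxN; ++n) {
            for (int k = 0; k <= n; ++k) {
                c[n + 1] += c[k] * c[n - k];
            }
        }
        return c;
    }

    /** Counts strings with n 0s and n 1s in which no prefix has more 1s than 0s. */
    static long bruteForce(int n) {
        long count = 0;
        for (int mask = 0; mask < (1 << (2 * n)); ++mask) {
            int ones = 0;
            boolean ok = true;
            for (int i = 0; i < 2 * n && ok; ++i) {
                ones += (mask >> i) & 1;  // bit i plays the role of x_{i+1}
                ok = 2 * ones <= i + 1;   // prefix of length i + 1 has at most (i + 1)/2 ones
            }
            if (ok && ones == n) ++count;
        }
        return count;
    }

    public static void main(String[] args) {
        long[] c = catalan(6);
        for (int n = 0; n <= 6; ++n) {
            System.out.println(n + ": " + c[n] + " " + bruteForce(n));
        }
    }
}
```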

Consider the generating function $$ c(x) = \sum_{n=0}^\infty C_nx^n = 1 + \sum_{n=1}^\infty C_nx^n = 1 + x\sum_{n=0}^\infty C_{n+1}x^n. $$ Substituting in the recurrence relation, for $C_{n+1}$, we have that \begin{align*} c(x) &= 1 + x\sum_{n=0}^\infty C_{n+1}x^n = 1 + x\sum_{n=0}^\infty \sum_{k=0}^nC_kC_{n-k}x^n \\ &= 1 + x\sum_{n=0}^\infty x^n\sum_{k=0}^nC_kC_{n-k} = 1 + x\left[\sum_{n=0}^\infty C_n x^n\right]^2 \\ &= 1 + x[c(x)]^2. \end{align*} Solving for $c(x)$ with the quadratic formula, we find that $$ c(x) = \frac{1 \pm \sqrt{1-4x}}{2x} = \frac{2}{1\mp\sqrt{1-4x}}. $$ Since $c(0) = 1$, $$\displaystyle c(x) = \frac{1 - \sqrt{1-4x}}{2x} = \frac{1}{2x}\left(1 - \sqrt{1-4x}\right).$$

Consider the Taylor series of $f(y) = \sqrt{1+y}.$ By induction, for $n \geq 1,$ $$f^{(n)}(y) = (-1)^{n+1}\frac{\prod_{k=1}^{n-1}(2k-1)}{2^n}(1+y)^{-(2n-1)/2} \Rightarrow f^{(n)}(0) = (-1)^{n+1}\frac{\prod_{k=1}^{n-1}(2k-1)}{2^n},$$ where the empty product for $n = 1$ is $1$. Moreover, \begin{align*} f^{(n)}(0) &= (-1)^{n+1}\frac{\prod_{k=1}^{n-1}(2k-1)}{2^n} = (-1)^{n+1}\frac{\prod_{k=1}^{n-1}(2k-1)}{2^n}\cdot \frac{2^n n!(2n-1)}{2^n n!(2n-1)} \\ &= \frac{(-1)^{n+1}}{4^n(2n-1)}\frac{(2n)!}{n!}. \end{align*}

Thus, we have that $$ f(y) = \sqrt{1+y} = 1 + \sum_{n=1}^\infty \frac{(-1)^{n+1}}{4^n(2n-1)}\frac{(2n)!}{n!n!}y^n, $$ so we have \begin{align*} c(x) &= \frac{1}{2x}(1 - f(-4x)) = \frac{1}{2x}\left(\sum_{n=1}^\infty \frac{(-1)^{n}}{4^n(2n-1)}\frac{(2n)!}{n!n!}(-4x)^n\right) \\ &= \frac{1}{2x}\left(\sum_{n=1}^\infty \frac{(-1)^{2n}}{(2n-1)}\frac{(2n)!}{n!n!}x^n\right) = \sum_{n=1}^\infty \frac{1}{2n(2n-1)}\frac{(2n)!}{(n-1)!n!}x^{n-1} \\ &= \sum_{n=1}^\infty \frac{1}{n}\frac{(2n-2)!}{(n-1)!(n-1)!}x^{n-1} = \sum_{n=1}^\infty \frac{1}{n}{2(n-1) \choose n-1 }x^{n-1} \\ &= \sum_{n=0}^\infty \frac{1}{n+1}{2n \choose n}x^{n} = \sum_{n=0}^\infty C_nx^{n}, \end{align*} so $\displaystyle \boxed{C_n = \frac{1}{n+1}{2n \choose n}.}$
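The closed form is easy to evaluate directly. Here is a small sketch that builds ${2n \choose n}$ multiplicatively and then divides by $n+1$; the class name is made up, and with `long` arithmetic the intermediate products stay in range only up to about $n = 30$.

```java
public class CatalanClosedForm {
    /** Computes C_n = C(2n, n) / (n + 1); intended for 0 <= n <= 30 or so. */
    static long catalan(int n) {
        long c = 1; // running value of C(n + i, i)
        for (int i = 1; i <= n; ++i) {
            c = c * (n + i) / i; // exact: the result is the binomial C(n + i, i)
        }
        return c / (n + 1); // C(2n, n) is always divisible by n + 1
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 10; ++n) {
            System.out.println(n + ": " + catalan(n));
        }
    }
}
```

The first few values are 1, 1, 2, 5, 14, 42, matching what the recurrence gives.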

Tags: combinatorics, math, probability

One of my favorite ways to bring people together, fellowship, and share a little bit about my culture is to cook a huge pot of phở (Vietnamese beef noodle soup for the uninitiated). It's fairly easy to make, and it's a novel experience for most people, who are accustomed to restaurant phở. I especially recommend making some during the cold winter months. After making it a few times I've come up with my own recipe. You'll need a stock pot that holds at least 16 quarts to proceed.

Ingredients

Beef
- 2-3 lbs leg bones
- 2-3 lbs neck bones
- 2-3 lbs oxtail
- 5-7 lbs eye round roast, freeze and thinly slice, let it come to room temperature before serving
Spices (you can opt to toast the spices over medium heat)
- 5-10 whole star anise
- 2 cardamom pods
- 1 tablespoon whole coriander seeds
- 1 tablespoon whole fennel seeds
- 1 scant teaspoon whole cloves
- 2 sticks of cinnamon
2-4 nubs of ginger
2 medium-sized onions
1 dozen green onions also known as scallions
1 tablespoon salt
1/4-1/3 cups of a sweetener, sugar or syrup works
1 cup fish sauce, I recommend the Red Boat brand
6 lbs of fresh noodles, bánh phở tươi
Condiments
- Sriracha
- Hoisin sauce
- Thai chili peppers
- Cilantro
- Thai basil
- Mung bean sprouts
- Limes
- Green onions

Steps

Ahead of time, freeze your eye round roasts and thinly slice them. This is easily the most labor-intensive part. Set aside and refrigerate. Let the slices sit for 2-3 hours at room temperature before serving.
Parboil the bones and oxtail for a cleaner broth. Bring water to a boil. Put the bones and oxtail in the water. Let the water return to a boil. After 5-10 minutes, dump the water, and wash the bones and oxtail. Return the bones and oxtail to the pot, fill it with water, and simmer.
Char the onions and ginger under the broiler. This usually takes about 10 minutes. Add the onions and ginger to the pot. Also, add the white part of the green onions.
Toast the spices and put them in a spice bag or tea infuser. Add the bag of spices or tea infuser to the pot.
Add fish sauce, salt, and sweetener. Back when I followed Paleo more strictly, I refused to use sugar, so I used maple syrup. In reality, sugar works just as well.
Now let the broth simmer. I find 8 hours is enough. You can go longer for a more intense flavor. If you serve it after just 8 hours, you can just add more water to make more broth. It's a little bit like making a second brew of tea.
Add more fish sauce, salt, or sugar to taste.

Serving

Bring the thinly sliced eye round roast out. Wash the vegetables. Remove the thick stems from cilantro. Cut the limes into eighths. Cut the Thai chili peppers and the green part of the scallions.
Filter out broth into a smaller pot. Skim excess fat. Don't skim all of it, though. The fat makes the broth more savory. Bring the smaller pot to a boil.
To cook the noodles, bring another pot of water to a boil. Add the noodles and stir them around for about 20 seconds. Drain with a colander.
Put the noodles in a bowl and add the raw meat to the bowl. Pour the boiling broth over the raw meat to cook the meat.
For your VIP guests, dig out some oxtail from the larger stock pot. The braised, fatty meat melts in your mouth.
Add condiments and enjoy!

After making a few bowls, I usually let guests make their own. This recipe may not be the most authentic, but it tastes pretty good in my opinion. Notice that most times and ingredients are given in ranges and are not exact. The recipe is pretty forgiving, and you can modify it according to your preference.

Tags: cooking, recipe

Happy New Year, everyone! As a way to start off the year, I thought that it would be interesting to write about something that has evolved a lot over the past year: the Republican field of presidential candidates.

At Penn, I've been working on Snapstream Searcher, which searches through closed captioning television scripts. I've decided to see how the number of times each candidate is mentioned on TV has changed over the year. Check out the chart and do your own analysis here.

As you can see in the title picture, Donald Trump has surged in popularity since he announced his candidacy in June. In general, every candidate experiences a surge in mentions upon announcing his or her candidacy. Usually the surge is not sustained, though.

Many candidates lost popularity over the course of 2015. Jeb Bush lost quite a bit of ground, but perhaps no one has suffered as much as Chris Christie.

Other candidates like Ben Carson are passing fads with a bump from October to November before fading away:

One cool feature I added is the ability to zoom. The D3 brush calculates the coordinates, and then I update the scales and axes. The overflow is hidden with SVG clipping. To illustrate the usefulness of this feature, we can focus in on the September debate. Here, we see Carly Fiorina's bump in popularity due to her strong debate performance.

Another cool feature that I added is the ability to see actual data points. If you hold the Control key and hover over a point, you can see a tooltip.

Play around with it here, and let me know what you think!

Tags: stat, javascript, d3.js, data viz

One of the more interesting problems that I've come across recently is to calculate the distribution of the last time a simple random walk is at $0.$

Let $X_1,X_2,\ldots,X_{2n}$ be independent, identically distributed random variables such that $P(X_i = 1) = P(X_i = -1) = 1/2.$ Define $S_k = \sum_{i=1}^k X_i.$ Then, we have a path $$(0,0) \rightarrow (1,S_1) \rightarrow (2,S_2) \rightarrow \cdots \rightarrow (2n,S_{2n}).$$ Define the random variable $L_{2n} = \sup\{ k \leq 2n : S_k = 0\}.$ We want the distribution of $L_{2n}.$

Note that we have that \begin{equation} \mathbb{P}(S_{2n} = 0) = 2^{-2n}{2n \choose n} \end{equation} since we have $n$ positive steps and $n$ negative steps.
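For small $n$, this count is easy to confirm by listing every path. The sketch below, with a hypothetical `countZeroPaths` helper, encodes each path as a bit mask where a set bit means a $+1$ step.

```java
public class ReturnToZero {
    /** Counts the +/-1 step sequences of length 2n whose sum is 0, by enumerating all 2^(2n) of them. */
    static long countZeroPaths(int n) {
        long count = 0;
        for (int mask = 0; mask < (1 << (2 * n)); ++mask) {
            // Bit i set means X_{i+1} = +1, so S_{2n} = 0 exactly when n bits are set.
            if (Integer.bitCount(mask) == n) ++count;
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 3;
        long paths = countZeroPaths(n);          // C(6, 3) = 20
        double probability = paths / Math.pow(2, 2 * n);
        System.out.println(paths + " of " + (1 << (2 * n)) + " paths, probability " + probability);
    }
}
```

For $n = 3$, there are $20$ such paths out of $64$, so the probability is $20/64 = 0.3125$.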

Let $\displaystyle N_{n,x} ={n \choose (n+x)/2}$ denote the number of paths from $(0,0)$ to $(n,x)$ since $(n+x)/2$ positive steps implies there are $(n-x)/2$ negative steps, and $(n+x)/2 - (n-x)/2 = x$. Note that $n + x$ must be even for this to be well defined. If $n + x$ is not even, then $N_{n,x} = 0$ since $x$ must have the same parity as $n.$ First, we prove the reflection principle.

Reflection Principle

If $x,y > 0,$ the number of paths from $(0,x)$ to $(n,y)$ that are $0$ at some time, that is, paths that touch the $x$-axis, is equal to the total number of paths from $(0,-x)$ to $(n,y),$ which is $N_{n,y+x}.$ Therefore, the number of paths from $(0,x)$ to $(n,y)$ that do not touch $0$ is $N_{n,|y-x|} - N_{n,y+x}.$

We can establish a one-to-one correspondence between the set $A$, the paths from $(0,x)$ to $(n,y)$ that are $0$ at some time and the set $B$ the paths from $(0,-x)$ to $(n,y)$.

Consider any path $P$ in $A$. $P$ must include the point $(m,0),$ where $0 < m < n$. Fix $m$ to be the greatest such integer. We construct a path $Q_1$ from $(0,-x)$ to $(m,0)$ by going in the opposite direction as $P.$ We construct a path $Q_2$ from $(m,0)$ to $(n,y)$ by mirroring $P$. Thus, we have that $Q = Q_1 \cup Q_2 \in B.$

Now consider any path $Q$ in $B$. Since paths are continuous, $Q$ must cross the $x$-axis, so $Q$ includes a point $(m,0)$, where $0 < m < n.$ Fix $m$ to be the greatest such integer. We construct $P_1,$ a path from $(0,x)$ to $(m,0)$ by going in the opposite direction as $Q$. We construct $P_2$ by mirroring $Q$. Thus, we have that $P = P_1 \cup P_2 \in A.$

So, we have established a one-to-one correspondence, and therefore, we have proven $|A| = |B|.$
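For small $n$, the Reflection Principle can also be checked by brute force. The following is a minimal sketch, not part of the original argument: the class name `ReflectionCheck` and its helper methods are made up for illustration. It enumerates all $2^n$ step sequences starting at height $x$, counts those that hit $0$ and end at $y$, and compares the count with $N_{n,x+y}.$

```java
class ReflectionCheck {
    // N_{n,x}: number of +1/-1 paths with n steps and net displacement x.
    static long paths(int n, int x) {
        if (Math.abs(x) > n || (n + x) % 2 != 0) return 0;
        int k = (n + x) / 2; // number of positive steps
        long result = 1;
        for (int i = 1; i <= k; ++i) result = result * (n - k + i) / i;
        return result;
    }

    // Brute force: number of paths from (0,x) to (n,y) that are 0 at some time.
    static long touching(int n, int x, int y) {
        long count = 0;
        for (int mask = 0; mask < (1 << n); ++mask) {
            int s = x;
            boolean hit = false;
            for (int step = 0; step < n; ++step) {
                s += ((mask >> step) & 1) == 1 ? 1 : -1;
                if (s == 0) hit = true;
            }
            if (hit && s == y) ++count;
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 10;
        for (int x = 1; x <= 4; ++x) {
            for (int y = 1; y <= 4; ++y) {
                System.out.printf("x=%d y=%d brute force=%d formula=%d%n",
                                  x, y, touching(n, x, y), paths(n, x + y));
            }
        }
    }
}
```

The enumeration is exponential in $n$, so it is only meant as a sanity check for small cases.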

Symmetry of Zeroes

$\mathbb{P}(S_1 \neq 0, S_2 \neq 0,\ldots,S_{2n} \neq 0) = \mathbb{P}(S_{2n} = 0).$

First note that $$ \mathbb{P}(S_1 \neq 0,\ldots,S_{2n} \neq 0) = \mathbb{P}(S_1 > 0,\ldots,S_{2n} > 0) + \mathbb{P}(S_1 < 0,\ldots,S_{2n} < 0) $$ since we can never have the path touch $0.$ Also note that the two terms are equal, so \begin{equation} \mathbb{P}(S_1 \neq 0, S_2 \neq 0,\ldots,S_{2n} \neq 0) = 2\mathbb{P}(S_1 > 0, S_2 > 0,\ldots,S_{2n} > 0). \end{equation} Now, note that \begin{equation} \mathbb{P}(S_1 > 0, S_2 > 0,\ldots,S_{2n} > 0) = \sum_{r=1}^{n}\mathbb{P}(S_1 > 0, S_2 > 0,\ldots,S_{2n} = 2r) \end{equation} since we have taken an even number of steps.

To calculate $\mathbb{P}(S_1 > 0, S_2 > 0,\ldots,S_{2n} = 2r),$ we note that $X_1 = 1$ since $S_1 > 0$. Then, the number of paths from $(1,1)$ to $(2n,2r)$ that do not touch $0$ by the Reflection Principle is $N_{2n-1,2r-1} - N_{2n-1,2r+1}$. Thus, \begin{align*} \mathbb{P}(S_1 > 0, S_2 > 0,\ldots,S_{2n} = 2r) &= \left(\frac{1}{2}\right)\left(\frac{1}{2^{2n-1}}\right)\left(N_{2n-1,2r-1} - N_{2n-1,2r+1}\right) \\ &= \left(\frac{1}{2^{2n}}\right)\left(N_{2n-1,2r-1} - N_{2n-1,2r+1}\right). \end{align*}

So, we have that \begin{align*} \mathbb{P}(S_1 > 0, S_2 > 0,\ldots,S_{2n} > 0) &= \left(\frac{1}{2^{2n}}\right) \sum_{r=1}^n\left(N_{2n-1,2r-1} - N_{2n-1,2r+1}\right) \\ &= \frac{1}{2^{2n}}N_{2n-1,1} = \frac{1}{2^{2n}}{2n - 1 \choose n}\\ &= \frac{1}{2^{2n}}\frac{(2n-1)!}{n!(n-1)!} = \frac{1}{2}\frac{1}{2^{2n}}\frac{(2n)!}{n!n!} \\ &= \frac{1}{2}\mathbb{P}(S_{2n} = 0) \end{align*} by telescoping since $N_{2n-1,2n+1} = 0$ and substituting in the previous equations.

Recalling the earlier equation, we have the desired result, $$ \mathbb{P}(S_1 \neq 0, S_2 \neq 0,\ldots,S_{2n} \neq 0) = 2\mathbb{P}(S_1 > 0, S_2 > 0,\ldots,S_{2n} > 0) = \mathbb{P}(S_{2n} = 0). $$
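As a quick sanity check of this identity (a worked example that is not part of the original argument), take $n = 2,$ so there are $2^4 = 16$ equally likely paths. The only paths that stay strictly positive are $(+,+,+,+),$ $(+,+,+,-),$ and $(+,+,-,+),$ and by symmetry there are $3$ that stay strictly negative, so $$ \mathbb{P}(S_1 \neq 0,\ldots,S_4 \neq 0) = \frac{3 + 3}{16} = \frac{6}{16} = 2^{-4}{4 \choose 2} = \mathbb{P}(S_4 = 0). $$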

Distribution of Last Zero Visit

Let $L_{2n}$ be the random variable whose value is the last time the simple random walk visited $0.$ Formally, $$L_{2n} = \sup\left\{2k : k \in \{0,\ldots,n\},~\sum_{i=1}^{2k} X_i = 0\right\}.$$ Then, we have that $$\mathbb{P}(L_{2n} = 2k) = 2^{-2n}{2k \choose k}{2n-2k \choose n-k}.$$

To see this, we have that \begin{align*} \mathbb{P}(L_{2n} = 2k) &= \mathbb{P}(S_{2k} = 0, S_{2k+1} \neq 0, \ldots,S_{2n} \neq 0) \\ &= \mathbb{P}(S_{2k} = 0, S_{2k+1} - S_{2k} \neq 0, \ldots,S_{2n} - S_{2k} \neq 0) \\ &= \mathbb{P}(S_{2k} = 0)\mathbb{P}(S_1 \neq 0, \ldots,S_{2n-2k} \neq 0) \\ &= \mathbb{P}(S_{2k} = 0)\mathbb{P}(S_{2n-2k} = 0), \end{align*} where the last equality is by Symmetry of Zeroes. Thus, we have that $$\mathbb{P}(L_{2n} = 2k) = \mathbb{P}(S_{2k} = 0)\mathbb{P}(S_{2n-2k} = 0) =2^{-2n}{2k \choose k}{2n-2k \choose n-k}$$ as desired.

Conclusion
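The closed form is easy to check numerically for small $n$. Here is a hedged sketch: the class `LastZeroDistribution` and its method names are hypothetical and chosen just for this example. It compares the formula against a brute-force enumeration of all $2^{2n}$ paths.

```java
class LastZeroDistribution {
    // Binomial coefficient C(n, k) in exact long arithmetic (fine for small n).
    static long choose(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; ++i) result = result * (n - k + i) / i;
        return result;
    }

    // P(L_{2n} = 2k) from the closed form 2^{-2n} C(2k,k) C(2n-2k,n-k).
    static double formula(int n, int k) {
        return (double) choose(2 * k, k) * choose(2 * n - 2 * k, n - k) / Math.pow(2, 2 * n);
    }

    // Enumerates all 2^{2n} paths; counts[k] is the number whose last zero is at time 2k.
    static long[] bruteForceCounts(int n) {
        long[] counts = new long[n + 1];
        for (int mask = 0; mask < (1 << (2 * n)); ++mask) {
            int s = 0;
            int last = 0; // S_0 = 0, so the last zero is at least time 0
            for (int step = 1; step <= 2 * n; ++step) {
                s += ((mask >> (step - 1)) & 1) == 1 ? 1 : -1;
                if (s == 0) last = step;
            }
            ++counts[last / 2];
        }
        return counts;
    }

    public static void main(String[] args) {
        int n = 5;
        long[] counts = bruteForceCounts(n);
        for (int k = 0; k <= n; ++k) {
            System.out.printf("P(L = %d): brute force %.6f, formula %.6f%n",
                              2 * k, counts[k] / Math.pow(2, 2 * n), formula(n, k));
        }
    }
}
```

Calling `formula(15, k)` for $k = 0,\ldots,15$ gives the values for the $n = 15$ case.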

Let's look at what this distribution looks like when $n = 15.$ We have a symmetric distribution about $15$, so the mean is $15.$ However, the distribution is U-shaped, and the most likely values are very far from the mean. Thus, to take an analogy from sports, if two evenly matched teams were to play against each other multiple times over the course of a season, the most likely scenario is for one team to lead the other team the entire season. So, saying that team A led team B the entire season is an almost meaningless statistic.

Tags: probability, combinatorics, math, stat

Croissants. I love to eat Sam's Club croissants in bulk. I never thought making them would be such a tedious process, but now my respect goes to the bakers who wake up at 5 am to painstakingly make this bread... and sell it for a buck or two.

This is hard, and this recipe is still in beta, so please take it with a grain of salt. If you wish to follow it, please follow it precisely.

My goal is to make croissants with a honeycomb interior, but I noticed that this is evidently hard. I have used this recipe with minor alterations.

By the way, making this requires a lot of planning ahead.

Stuff

Poolish

160 g of King Arthur Bread flour
160 g of water (heated to about 80 degrees F)
1/8 tsp of instant yeast (I used SAF instant yeast)

Dough

The poolish above
362 g of King Arthur Bread flour
135 g Whole milk
67 g of granulated white sugar
1 tsp and 1/8 tsp of SAF instant yeast
10 g of Redmond real salt
22 g of softened Plugra non salt butter
Roll-in butter: 286 g of Plugra non salt butter

Procedure

DAY 1: Make the poolish by mixing all the poolish ingredients above. Wrap the bowl tightly with cling wrap and leave it for about 12 to 16 hours. It will become fuzzy, like the picture below.
DAY 1.5: Now we need to make the dough. Mix the poolish with all the dough ingredients above, omitting the roll-in butter, and make the dough by hand. The dough will look ugly. Wrap it tightly, and keep it in the fridge overnight.
DAY 2: Soften the roll-in butter and sandwich it between two sheets of parchment paper or wax paper. Roll it out into a 7.5 inch x 7.5 inch square. It should be neither too soft nor too hard.
Roll out the dough you kept in the fridge overnight into an 11 inch by 11 inch square, then place the butter like this below.
Fold the dough over the butter. MAKE SURE to cover all the butter with the dough. Pay attention to corners and edges where butter may spill out.
Roll out into an 11 x 24 inch sheet. Do not let any butter spill out during this procedure.
Cut one edge off from the dough to expose it, just like the gif below, then fold it as if it is a business letter. You do this to prevent trapping excess dough in the fold. This will be your first fold.

Folding technique
Wrap it and rest it in fridge for 1 hour minimum.
Make an 11 x 24 inch sheet again and repeat the fold and the rest two more times. More folds = less flaky dough. Keep to about three folds in total. I have yet to try two, but I am lazy at this moment. Wrap it up and rest it in the fridge.
DAY 3: You can leave the dough overnight after the third fold, but if you are in a rush, rest it about 2 hours before you make a sheet again.
Cut the dough exactly in half. Use a measuring stick! Roll out the dough into a 10 x 18 inch sheet.
Make a small cut every 4.5 inches on one of the longer sides of the sheet. On the other side (the top), mark the first cut at 2.25 inches and then every 4.5 inches thereafter. Using a measuring stick, make diagonal cuts from the corner to the top mark and repeat. Horrible pic below.
Make a small cut on the bottom of the triangle, then gently stretch it out. Roll from the bottom to the pointy top. Make sure to roll it tightly.

Folding technique
I like to stretch the tip a little so I can get more small "steps" on my croissants. Here is the end product.
Beat one egg and add 1 tbsp of water.
Egg wash then let it proof for 1 to 2 hours. It will almost double in size.
Heat oven to 400 degrees.
Egg wash them again, then bake at 370 degrees for 12 minutes and 350 degrees for 15 minutes until crisp and brown.
You can add things in the croissant like this... but for chocolate croissants, you would rather have it square. I do not have any pictures or cutting tutorials of such a masterpiece.

Model shots

Model style 1

Model style 2

Humphry the snail

Over Halloween weekend, my roommate Masato and I decided to have another one of our cook offs and made donuts together. He made these Fluffy Yeasted Donuts, and I made these Apple Cider Doughnuts. I thought mine came out pretty well, but it was pretty much universally agreed upon that Masato's came out better.

I usually eat mostly Paleo, so working with flour and dough was a very novel experience. In this particular recipe, the dough was very sticky, so it required sprinkling a lot of extra flour when rolling and cutting the dough (thanks Masato for the pro tip!). In the end, I thought that the flavor was great, but the donuts were pretty dense. I really enjoyed the fluffiness of Masato's donuts more. Some future diabetics said that it could be sweeter, but that's just their opinion.

As for life, things are going pretty well. I'm finally no longer sick, so I've gotten some good workouts in. Sprinting this morning with Michael Vo was death. I lost every set except the first, and I only won probably because he got confused. Reapplying to graduate school is definitely stressful, but I'm learning to cope. Lately, I haven't had time to code too much, but hopefully I'll get back into that soon.

Tags: life, cooking

Some time ago, I was doing a problem on HackerRank that introduced me to two new data structures that I want to write about. The problem is called Cross the River.

The premise is this:

You're standing on a shore of a river. You'd like to reach the opposite shore.

The river can be described with two straight lines on the Cartesian plane, describing the shores. The shore you're standing on is $Y=0$ and another one is $Y=H$.

There are some rocks in the river. Each rock is described with its coordinates and the number of points you'll gain in case you step on this rock.

You can choose the starting position arbitrarily on the first shore. Then, you will make jumps. More precisely, you can jump to the position $(X_2,Y_2)$ from the position $(X_1,Y_1)$ in case $\left|Y_2-Y_1\right| \leq dH$, $\left|X_2-X_1\right| \leq dW$ and $Y_2>Y_1$. You can jump only on the rocks and the shores.

What is the maximal sum of scores of all the used rocks you can obtain so that you cross the river, i.e. get to the opposite shore?

No two rocks share the same position, and it is guaranteed that there exists a way to cross the river.

Now, my first instinct was to use dynamic programming. If $Z_i$ is the point value of rock $i$, and $S_i$ is the max score at rock $i$, then $$ S_i = \begin{cases} Z_i + \max\{S_j : 1 \leq Y_i - Y_j \leq dH,~|X_i - X_j| \leq dW\} &\text{if rock $i$ is reachable} \\ -\infty &\text{otherwise,} \end{cases} $$ where we assume the existence of rocks with $Y$ coordinate $0$ and point value $0$ for all $X.$

Thus, we can sort the rocks by their $Y$ coordinate and visit them in order. However, we run into the problem that if $dW$ and $dH$ are large we may need to check a large number of rocks visited previously, so this approach is $O(N^2).$
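To make the recurrence concrete, here is a minimal sketch of the quadratic approach. It is not the code I submitted: the class name `NaiveCrossTheRiver`, the `{y, x, z}` array layout for a rock, and the handling of a direct shore-to-shore jump are assumptions made for this illustration.

```java
import java.util.Arrays;

class NaiveCrossTheRiver {
    /**
     * Quadratic DP over rocks given as {y, x, z}.
     * Returns the best total score, or Long.MIN_VALUE if the far shore is unreachable.
     */
    static long solve(int[][] rocks, int H, int dH, int dW) {
        int[][] sorted = rocks.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[0], b[0])); // visit by Y
        int n = sorted.length;
        long[] best = new long[n];
        // assumption: if H <= dH, we may jump straight across with score 0
        long answer = (H <= dH) ? 0 : Long.MIN_VALUE;
        for (int i = 0; i < n; ++i) {
            // the starting shore acts as a rock of value 0 at every X
            long previous = (sorted[i][0] <= dH) ? 0 : Long.MIN_VALUE;
            for (int j = 0; j < i; ++j) { // this inner scan makes it O(N^2)
                int dy = sorted[i][0] - sorted[j][0];
                int dx = Math.abs(sorted[i][1] - sorted[j][1]);
                if (dy >= 1 && dy <= dH && dx <= dW && best[j] > Long.MIN_VALUE) {
                    previous = Math.max(previous, best[j]);
                }
            }
            best[i] = (previous == Long.MIN_VALUE) ? Long.MIN_VALUE : previous + sorted[i][2];
            if (best[i] > Long.MIN_VALUE && H - sorted[i][0] <= dH) {
                answer = Math.max(answer, best[i]); // can finish from this rock
            }
        }
        return answer;
    }

    public static void main(String[] args) {
        int[][] rocks = { {3, 0, 5}, {6, 1, -2}, {6, 5, 100}, {9, 2, 4} };
        System.out.println(solve(rocks, 10, 3, 2));
    }
}
```

In the example, the rock worth $100$ is too far sideways to reach, so the best route uses the other three rocks.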

My dynamic programming approach was the right idea, but it needs some improvements. Somehow, we need to speed up the process of looking through the previous rocks. To do this, we do two things:

Implement a way to quickly find the max score in a range $[X-dW, X + dW]$
Only store the scores of rocks in range $[Y-dH, Y)$

To accomplish these tasks, we use two specialized data structures.

Segment Trees

Segment trees solve the first problem. They provide a way to query a value (such as a maximum or minimum) over a range and update these values in $\log$ time. The key idea is to use a binary tree, where the nodes correspond to segments instead of indices.

For example, suppose that we have $N$ indices $i = 0,1,\ldots, N-1$ with corresponding values $v_i.$ Let $k$ be the smallest integer such that $2^k \geq N.$ The root node of our binary tree will be the interval $[0,2^k).$ The first left child will be $[0,2^{k-1}),$ and the first right child will be $[2^{k-1},2^k).$ In general, for some node $[a,b)$, if $b - a > 1$, then the left child is $[a,(a+b)/2),$ and the right child is $[(a+b)/2,b).$ Otherwise, if $b - a = 1$, there are no children, and the node is a leaf. For example, if $5 \leq N \leq 8$, our segment tree looks like this.

In general, there are $2^0 + 2^1 + 2^2 + \cdots + 2^k = 2^{k+1} - 1$ nodes needed. $2N - 1 \leq 2^{k+1} - 1 \leq 2^2(N-1) - 1$, so the amount of memory needed is $O(N).$ Here's the code for constructing the tree.

import java.util.Arrays;

class MaxSegmentTree {
    private long[] maxes;        
    private int size;
    public MaxSegmentTree(int size) {
        int actualSize = 1;
        while (actualSize < size) actualSize *= 2;
        this.size = actualSize;
        // if size is 2^k, we need 2^(k+1) - 1 nodes for all the intervals
        maxes = new long[2*actualSize - 1];
        Arrays.fill(maxes, Long.MIN_VALUE);
    }
    ...
}

Now, for each node $[a,b),$ we store a value $\max(v_a,v_{a+1},\ldots,v_{b-1}).$ An update call consists of two parameters, an index $k$ and a new $v_k.$ We traverse the binary tree until we reach the node $[k, k+1)$ and update that node. Then, we update the max of each ancestor by taking the max of its left and right child, since the segment of each child is always contained in the segment of the parent. In practice, this is done recursively like this.

class MaxSegmentTree {
    ...
    public long set(int key, long value) {
        return set(key, value, 0, 0, this.size);
    }
    /** 
     * @param node index of node since the binary tree is implemented with an array
     * @param l    lower bound of segment (inclusive)
     * @param r    upper bound of segment (exclusive)
     */
    private long set(int key, long value,
                     int node, int l, int r) {
        // if not in range, do not set anything
        if (key < l || key >= r) return maxes[node]; 
        if (l + 1 == r) {
            // return when you reach a leaf
            maxes[node] = value;
            return value;
        }
        int mid = l + (r-l)/2;
        // left node
        long left = set(key, value, 2*(node + 1) - 1, l, mid);
        // right node
        long right = set(key, value, 2*(node + 1), mid, r);
        maxes[node] = Math.max(left, right);
        return maxes[node];
    }
    ...
}

A range max query takes two parameters: the lower bound of the range and the upper bound of the range in the form $[i,j).$ We obtain the max recursively. Let $[l,r)$ be the segment corresponding to a node. If $[l,r) \subseteq [i,j),$ we return the max associated with $[l,r)$. If $[l,r) \cap [i,j) = \emptyset,$ we ignore this node. Otherwise, $[l,r) \cap [i,j) \neq \emptyset,$ and $\exists k \in [l,r)$ such that $k \not\in [i,j),$ so $l < i < r$ or $l < j < r.$ In this case, we descend to the child nodes. The algorithm looks like this.

class MaxSegmentTree {
    ...
    /** 
     * @param i from index, inclusive
     * @param j to index, exclusive
     * @return the max value in a segment.
     */
    public long max(int i, int j) {
        return max(i, j, 0, 0, this.size);
    }

    private long max(int i, int j, int node, int l, int r) {
        // if in interval
        if (i <= l && r <= j) return maxes[node];
        // if completely outside interval
        if (j <= l || i >= r ) return Long.MIN_VALUE;
        int mid = l + (r-l)/2;    
        long left = max(i, j, 2*(node+1) - 1, l, mid);
        long right = max(i, j, 2*(node+1), mid, r);
        return Math.max(left, right);
    }
    ...
}

I prove that this operation is $O(\log_2 N).$ To simplify things, let us assume that $N$ is a power of $2$, so $2^k = N.$ I claim that the worst case is $[i,j) = [1, 2^k - 1).$ Clearly this is true when $k = 2$ since we'll have to visit all the nodes but $[0,1)$ and $[3,4),$ so we visit $5 = 4k - 3 = 4\log_2 N - 3$ nodes.

Now, for our induction hypothesis we assume that the operation is $O(\log_2 N)$ for $1,2,\ldots, k - 1$. Then, for some $k$, we can assume that $i < 2^{k-1}$ and $j > 2^{k-1}$ since otherwise, we only descend one half of the tree, and it reduces to the $k - 1$ case. Now, given $[i, j)$ and some node $[l,r)$, we'll stop there if $[i,j) \cap [l,r) = \emptyset$ or $[l,r) \subseteq [i,j).$ Otherwise, we'll descend to the node's children. Now, we have assumed that $i < 2^{k-1} < j,$ so if we're on the left side of the tree, $j > r$ for all such nodes. We're not going to visit any nodes with $r \leq i,$ we'll stop at nodes with $l \geq i$ and compare their max, and we'll descend into nodes with $l < i < r$. At any given node on the left side, if $[l,r)$ is not a leaf and $l < i < r$, we'll choose to descend. Let the left child be $[l_l, r_l)$ and the right child be $[l_r,r_r)$. The two child segments are disjoint, so we will only choose to descend one of them since only one of $l_l < i < r_l$ or $l_r < i < r_r$ can be true. Since $l_l = l < i$, we'll stop only at the right child if $l_r = i.$ If $i$ is not odd, we'll stop before we reach a leaf. Thus, the worst case is when $i$ is odd.

On the right side, we reach a similar conclusion, where we stop when $r_l = j,$ and so the worst case is when $j$ is odd. To see this visually, here's an example of the query $[1,7)$ when $k = 3.$ Nodes where we visit the children are colored red. Nodes where we compare a max are colored green.

Thus, we'll descend at $2k - 1 = 2\log_2 N - 1$ nodes and compare maxes at $2(k-1) = 2(\log_2 N - 1)$ nodes, so $4\log_2 N - 3$ nodes are visited.
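These counts can be checked empirically. The sketch below is only an illustration: the class `QueryNodeCounter` and its fields are hypothetical names. It replays the recursion of the range max query without storing any values, and tallies the nodes where we descend and the nodes where we compare a max for the query $[1, 2^k - 1).$ Nodes disjoint from the query are ignored, as in the count above.

```java
class QueryNodeCounter {
    static int descended; // nodes whose children we visit
    static int compared;  // nodes fully inside [i, j), whose stored max is used

    // Mirrors the recursion of the range max query on the segment [l, r).
    static void count(int i, int j, int l, int r) {
        if (j <= l || i >= r) return; // disjoint from the query: ignored
        if (i <= l && r <= j) {       // contained in the query: compare its max
            ++compared;
            return;
        }
        ++descended;
        int mid = l + (r - l) / 2;
        count(i, j, l, mid);
        count(i, j, mid, r);
    }

    // Returns {descended, compared} for the query [1, 2^k - 1) on a tree of size 2^k.
    static int[] worstCase(int k) {
        descended = 0;
        compared = 0;
        int n = 1 << k;
        count(1, n - 1, 0, n);
        return new int[] {descended, compared};
    }

    public static void main(String[] args) {
        for (int k = 2; k <= 10; ++k) {
            int[] c = worstCase(k);
            System.out.printf("k=%d descended=%d compared=%d total=%d%n",
                              k, c[0], c[1], c[0] + c[1]);
        }
    }
}
```

For each $k$ the totals should match $2k - 1,$ $2(k-1),$ and $4k - 3.$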

Max Queues

Now, the segment tree contains the max score at each $X$ coordinate, but we want our segment tree to only contain values corresponding to rocks that are within range of our current position. If our current height is $Y$, we want rocks $j$ if $0 < Y - Y_j \leq dH.$

Recall that we visit the rocks in order of their $Y$ coordinate. Thus, for each $X$ coordinate we add the rock to some data structure when we visit it, and we remove it when it becomes out of range. Since rocks with smaller $Y$ coordinates become out of range first, this is a first in, first out (FIFO) situation, so we use a queue.

However, when removing a rock, we need to know when to update the segment tree. So, the queue needs to keep track of maxes. We can do this with two queues. The primary queue is a normal queue. The second queue will contain a monotone decreasing sequence. Upon adding to the queue, we maintain this invariant by removing all the smaller elements from the tail of the second queue. In this way, the head of the second queue will always contain the max element since it would have been removed otherwise. When we remove an element from the primary queue, if the two heads are equal in value, we remove the head of each queue. Here is the code.

import java.util.ArrayDeque;
import java.util.Deque;

class MaxQueue<E extends Comparable<? super E>> extends ArrayDeque<E> {
    // non-increasing subsequence of elements; the head is the current max
    private Deque<E> q;
    public MaxQueue() {
        super();
        q = new ArrayDeque<E>();
    }

    @Override
    public void clear() {
        q.clear();
        super.clear();
    }

    @Override
    public E poll() {
        if (!super.isEmpty() && q.peek().equals(super.peek())) q.poll();
        return super.poll();
    }

    @Override
    public E remove() {
        if (!super.isEmpty() && q.peek().equals(super.peek())) q.remove();
        return super.remove();
    }

    @Override
    public boolean add(E e) {
        // remove all the smaller elements from the tail, so q stays non-increasing
        while (!q.isEmpty() && q.peekLast().compareTo(e) < 0) q.pollLast();
        q.addLast(e);
        return super.add(e);
    }

    @Override
    public boolean offer(E e) {
        // remove all the smaller elements from the tail, so q stays non-increasing
        while (!q.isEmpty() && q.peekLast().compareTo(e) < 0) q.pollLast();
        q.offerLast(e);
        return super.offer(e);
    }

    public E max() {
        return q.element();
    }
}
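The monotone-queue idea is easier to see in isolation. As a hedged illustration that is separate from the solution, here is a sketch that uses the same two-queue trick to compute the max of every window of a fixed length. The class `SlidingWindowMax` and the method `windowMaxes` are names invented for this example, and it assumes $1 \leq w \leq$ the array length.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class SlidingWindowMax {
    /** Returns the max of every length-w window of values (assumes 1 <= w <= values.length). */
    static long[] windowMaxes(long[] values, int w) {
        long[] result = new long[values.length - w + 1];
        Deque<Long> window = new ArrayDeque<>(); // plain FIFO holding the current window
        Deque<Long> maxes = new ArrayDeque<>();  // non-increasing candidates; head is the max
        for (int i = 0; i < values.length; ++i) {
            if (window.size() == w) {
                // evict the oldest element; it is the head of maxes only if it is the max
                long removed = window.pollFirst();
                if (maxes.peekFirst() == removed) maxes.pollFirst();
            }
            // smaller candidates at the tail can never be the max again
            while (!maxes.isEmpty() && maxes.peekLast() < values[i]) maxes.pollLast();
            maxes.addLast(values[i]);
            window.addLast(values[i]);
            if (window.size() == w) result[i - w + 1] = maxes.peekFirst();
        }
        return result;
    }

    public static void main(String[] args) {
        long[] values = {1, 3, -1, -3, 5, 3, 6, 7};
        System.out.println(Arrays.toString(windowMaxes(values, 3)));
    }
}
```

Each value enters and leaves each deque at most once, so the total work is linear in the length of the array.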

Solution

With these two data structures the solution is pretty short. We keep one segment tree that stores the current max at each $X$ coordinate. For each $X$, we keep a queue to keep track of all possible maxes. The one tricky part is to make sure that we look at all rocks at a certain height before updating the segment tree since lateral moves are not possible. Each rock is only added and removed from a queue once, and we can find the max in $\log$ time, so the running time is $O(N\log N)$, where $N$ is the number of rocks. Here's the code.

public class CrossTheRiver {

    private static final int MAX_X = 100000;
    ...
    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        PrintWriter out = new PrintWriter(new BufferedWriter(new OutputStreamWriter(System.out)));
        StringTokenizer st = new StringTokenizer(in.readLine());
        int N = Integer.parseInt(st.nextToken()); // rocks
        int H = Integer.parseInt(st.nextToken()); // height
        int dH = Integer.parseInt(st.nextToken()); // max y jump
        int dW = Integer.parseInt(st.nextToken()); // max x jump        
        Rock[] rocks = new Rock[N];
        for (int i = 0; i < N; ++i) { // read through rocks
            st = new StringTokenizer(in.readLine());           
            int Y = Integer.parseInt(st.nextToken());
            int X = Integer.parseInt(st.nextToken()); // 0 index
            int Z = Integer.parseInt(st.nextToken());
            rocks[i] = new Rock(X, Y, Z);
        }        
        Arrays.sort(rocks);                
        long[] cumulativeScore = new long[N];
        // Long.MIN_VALUE marks rocks that cannot be reached
        Arrays.fill(cumulativeScore, Long.MIN_VALUE);
        MaxSegmentTree sTree = new MaxSegmentTree(MAX_X + 1);
        ArrayList<MaxQueue<Long>> maxX = new ArrayList<MaxQueue<Long>>(MAX_X + 1);
        for (int i = 0; i <= MAX_X; ++i) maxX.add(new MaxQueue<Long>());
        int i = 0; // current rock
        int j = 0; // in range rocks
        while (i < N) {
            int currentY = rocks[i].y;
            while (rocks[j].y < currentY - dH) {
                // clear out rocks that are out of range;
                // unreachable rocks were never added to a queue, so skip them
                if (cumulativeScore[j] > Long.MIN_VALUE) {
                    maxX.get(rocks[j].x).poll();
                    if (maxX.get(rocks[j].x).isEmpty()) {
                        sTree.set(rocks[j].x, Long.MIN_VALUE);
                    } else {
                        sTree.set(rocks[j].x, maxX.get(rocks[j].x).max());
                    }
                }
                ++j;
            }
            while (i < N && rocks[i].y == currentY) {
                // get previous max score from segment tree
                long previousScore = sTree.max(rocks[i].x - dW, rocks[i].x + dW + 1);
                if (rocks[i].y <= dH && previousScore < 0) previousScore = 0;
                if (previousScore > Long.MIN_VALUE) {  // make sure rock is reachable
                    cumulativeScore[i] = rocks[i].score + previousScore;
                    // keep max queue up to date
                    maxX.get(rocks[i].x).add(cumulativeScore[i]);
                }
                ++i;
            }
            // now update segment tree; only reachable rocks have a queue entry
            for (int k = i - 1; k >= 0 && rocks[k].y == currentY; --k) {
                if (cumulativeScore[k] > Long.MIN_VALUE
                        && cumulativeScore[k] == maxX.get(rocks[k].x).max()) {
                    sTree.set(rocks[k].x, cumulativeScore[k]);
                }
            }
        }

        long maxScore = Long.MIN_VALUE;
        for (i = N - 1; i >= 0 && H - rocks[i].y <= dH; --i) {
            if (maxScore < cumulativeScore[i]) maxScore = cumulativeScore[i];
        }
        out.println(maxScore);
        in.close();
        out.close();
    }
}

Tags: competitive programming, hackerrank, data structure, algorithm
