Posts tagged d3.js


Happy New Year, everyone! To start off the year, I thought it would be interesting to write about something that has evolved a lot over the past year: the Republican field of presidential candidates.

At Penn, I've been working on Snapstream Searcher, which searches through closed-captioning television scripts. I decided to see how the number of times each candidate is mentioned on TV has changed over the year. Check out the chart and do your own analysis here.

As you can see in the title picture, Donald Trump has surged in popularity since he announced his candidacy in June. In general, every candidate experiences a surge in mentions upon announcing his or her candidacy. Usually, though, the surge is not sustained.

Many candidates lost popularity over the course of 2015. Jeb Bush lost quite a bit of ground, but perhaps no one has suffered as much as Chris Christie.

Other candidates, like Ben Carson, were passing fads, with a bump from October to November before fading away:

One cool feature I added is the ability to zoom. The D3 brush calculates the coordinates of the selection, and then I update the scales and axes. The overflow is hidden with SVG clipping. To illustrate the usefulness of this feature, we can zoom in on the September debate. Here, we see Carly Fiorina's bump in popularity due to her strong debate performance.

Another cool feature I added is the ability to see the actual data points. If you hold the Control key and hover over a point, a tooltip appears.

Play around with it here, and let me know what you think!



Back when I lived in Boston, I was an avid CrossFitter. For a variety of reasons, mainly financial, I no longer go to a CrossFit box, but I'm still interested in the sport. Being a bit of a data nerd, I've always been curious about what makes an elite CrossFitter and how much height and weight play a role. To satisfy my curiosity, I scraped data on the top 2,000 athletes from CrossFit Games, and created a pivot chart in D3.js, where you can compare statistics on workouts and lifts by different groups of athletes.
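To give a feel for what a pivot chart computes, here is a small sketch in plain JavaScript. It is only an illustration under assumptions: the field names (`ageGroup`, `run5k`), the age-group labels, and the rows themselves are invented, not taken from the scraped data. Rows are grouped by one field and a numeric field is summarized per group. The statistic is passed in as a function, so a median can be swapped in for a mean when a group contains implausible entries.

```javascript
// Mean of a non-empty array of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Median of a non-empty array (average of the two middle values for even length).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Group rows by one field and summarize another field per group.
function pivot(rows, groupKey, valueKey, statistic) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row[groupKey])) groups.set(row[groupKey], []);
    groups.get(row[groupKey]).push(row[valueKey]);
  }
  const result = {};
  for (const [key, values] of groups) result[key] = statistic(values);
  return result;
}

// Made-up rows for illustration only; 5k run times in seconds.
const athletes = [
  { ageGroup: "18-24", run5k: 1200 },
  { ageGroup: "18-24", run5k: 1260 },
  { ageGroup: "18-24", run5k: 12 }, // implausible entry, e.g. a data-entry slip
  { ageGroup: "25-29", run5k: 1230 },
  { ageGroup: "25-29", run5k: 1290 },
];

const meanByAge = pivot(athletes, "ageGroup", "run5k", mean);
const medianByAge = pivot(athletes, "ageGroup", "run5k", median);
console.log(meanByAge);   // { '18-24': 824, '25-29': 1260 }
console.log(medianByAge); // { '18-24': 1200, '25-29': 1260 }
```

One bad row drags the mean for the first group down to 824 seconds, while the median stays at a believable 1200.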

Play around with the data yourself at 2015 CrossFit Open Pivot Chart. Be careful: the data may not be that reliable. If there are a lot of outliers, it may be better to use a robust statistic like the median instead of the mean (in particular, the Sprint 400m and Run 5k workouts seem to have this problem). If you don't choose your groups wisely, you may run into Simpson's paradox by leaving out an important variable. For example, from the chart below, one may conclude that back squat strength decreases with age.

Mean Back Squat by Age Group

But now, when we consider gender, we have an entirely different story:

Mean Back Squat by Gender and Age Group

Unsurprisingly, women back squat less than men do. Back squat strength remains stable with age for women, and if anything, back squat strength actually increases slightly with age for men. Whoa, what's going on here? Check this out:

Gender and Age of Top CrossFit Athletes

Notice that the 3 rightmost female bars (red) are taller than the 3 rightmost male bars (blue). On the other hand, the 3 leftmost female bars are shorter than the 3 leftmost male bars. Thus, it seems that women age better than men in the CrossFit world. This means there are more women than men in the older age groups, so the average back squat of those groups appears lower when, in reality, there is simply a greater proportion of women in them. You can reach the same conclusion from the title picture, where the bars are stacked.
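A toy calculation makes the effect concrete. The head counts and mean back squats below are invented for illustration and are not the scraped CrossFit numbers. Within each gender, the older group squats as much as or more than the younger one, yet the combined average falls because the gender mix shifts.

```javascript
// The overall mean of a group is the average of its subgroup means,
// weighted by how many athletes are in each subgroup.
function weightedMean(parts) {
  const total = parts.reduce((n, p) => n + p.count, 0);
  return parts.reduce((sum, p) => sum + p.mean * p.count, 0) / total;
}

// Made-up numbers (back squat in lb): no decline with age within either gender.
const younger = [
  { gender: "M", count: 80, mean: 400 },
  { gender: "F", count: 20, mean: 250 },
];
const older = [
  { gender: "M", count: 40, mean: 410 }, // men slightly stronger
  { gender: "F", count: 60, mean: 250 }, // women unchanged
];

const youngerOverall = weightedMean(younger);
const olderOverall = weightedMean(older);
console.log(youngerOverall); // 370
console.log(olderOverall);   // 314
```

The combined average drops from 370 to 314 even though neither gender got weaker. The only thing that changed is the share of women, from 20% to 60%.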

Height and Weight

There definitely seems to be a prototypical build for an elite CrossFit athlete. For men it's about 5'10" and 200 lb, with a lot of athletes just over 6 feet, too.

Male Height and Weight

For women, most athletes seem to be about 5'6" and 145 lb, which happen to be my dimensions. Smaller female athletes who barely break 5 feet are pretty well represented, too. I was somewhat surprised at the lack of taller women.

Female Height and Weight

Some open workouts, like 1A, which was a one-rep max clean and jerk, favored larger athletes:

Open Workout 1A by Height (One-rep Max Clean & Jerk)

Other open workouts, like 4, which was an ascending rep ladder of cleans and handstand push-ups, favored smaller athletes:

Open Workout 4 by Height (Cleans and Handstand Push-ups)

You can check the other workouts yourself, but overall, this year's open seemed fair with regard to athlete size.

Anyway, feel free to play with the data yourself here. Let me know if you find anything cool and if you have any suggestions for improving usability.