Modeling whether an outbreak of COVID-19 virus will hit San Francisco — Epistemic status: To be clear, this is a highly simplified model and should not be used to incite panic or to instill a false sense of security. Please enjoy with a healthy level of skeptical ...
Google Analytics for Scrolling on a Static Website (or Google Analytics is Creepy) — A couple of months ago, I helped one of my friends set up Scroll Tracking with Google Analytics on an experimental website. While working on this I discovered that I could do really cool/creepy stuff ...
Visualizing shared budgets and dividing up household expenses fairly — My partner and I have been tracking our finances ever since we moved in together back in 2014. Originally, we started just tracking our shared expenses like rent and groceries in order to make it ...
Analyzing Edge.org forum data - Experiences with Crowdsourcing Analysis — Epistemic status: First foray into this type of analysis, expecting a 33% chance of at least 1 major technical error. I recently signed up to take part in an experiment on crowdsourcing data analysis run ...
Text Mining and Natural Language Processing on Health Forums — As part of the Insight Health Data Science Fellowship, I just got to spend the last 3 weeks working on a pretty fun project applying natural language processing to medical health forums. The goal is ...
Parallel IPython with Jupyter Notebooks on a SLURM cluster — I figured it would be such a piece of cake to get my Jupyter IPython notebooks to run in parallel on my work cluster, but in the end, I had so much trouble trying to find ...
Q&A on the Route Crime Calculator — I've recently built a cool little web app that helps evaluate the number of crimes occurring along a person's travel route in the City of Chicago. To elaborate a little on the rationale and methodology, ...
The Barbie Hadoop Cluster (Multi-Node Cluster) — I've finally finished the Barbie Hadoop Cluster! It's been several months since I started the project, but after the hiatus I was ready to come back and get it finished. Due to space limitations in ...
The Barbie Hadoop Cluster - Stage 1 — Ever since I took Coursera's Intro to Data Science course, I've been dreaming of getting my own MapReduce system up and running. It just seems so much more rewarding to have my own data framework ...
A Better LinkedIn — I've been wanting to fix my old paper resumé's abhorrent lack of data visualization with an interactive web version. And since I'm about to hit the job market anyway, I figured now is the perfect ...
Predicting Bike Share Usage — I recently worked on a Kaggle competition while I was taking Coursera's Introduction to Data Science. Here's a quick summary of what I came up with. The goal of this project was to come up ...