Data Science Portfolio

Here are a few examples of projects I have worked on over the past few years. See my GitHub for more projects.

Senior Thesis

Used SAS to backtest Joel Greenblatt’s Magic Formula on securities from the S&P 500 between 1985 and 2014. The portfolio yielded an average annual return of 35.65% (compared to 12.57% for the value-weighted S&P 500) and a 33.56% CAGR over the investing horizon. The formula created a portfolio of equally weighted investments in the 30 S&P 500 securities that ranked best on Earnings Yield and Return on Capital (ROC).
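The average annual return and the CAGR reported above differ because the first is an arithmetic mean of yearly returns while the second compounds them. The Python sketch below illustrates the difference. The yearly returns in it are made-up numbers for illustration, not the thesis data, and the function names are my own.

```python
def average_annual_return(returns):
    """Arithmetic mean of yearly returns (0.10 means +10%)."""
    return sum(returns) / len(returns)


def cagr(returns):
    """Compound annual growth rate implied by a sequence of yearly returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r  # compound each year's return
    return growth ** (1.0 / len(returns)) - 1.0


# Hypothetical yearly returns, chosen only to show the effect of volatility.
yearly_returns = [0.50, -0.20, 0.30]

avg = average_annual_return(yearly_returns)
compound = cagr(yearly_returns)

print(f"Average annual return: {avg:.2%}")
print(f"CAGR: {compound:.2%}")
```

With volatile returns the CAGR comes out below the arithmetic average (here roughly 16% versus 20%), which is the same direction as the gap between the two thesis figures.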

Greenblatt’s Magic Formula relies entirely on two financial ratios: Return on Capital (ROC) and Earnings Yield. Greenblatt believed that he could find “good companies [that are trading] at a bargain price” (Greenblatt 2005) using only these two ratios. Greenblatt calculates Earnings Yield and Return on Capital as follows:

  1. Earnings Yield = EBIT / Enterprise Value
  2. Return on Capital = EBIT / (Working Capital + Net Fixed Assets)
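As a rough illustration, the two ratios above translate directly into code. This is a Python sketch rather than the SAS used in the thesis, and the balance-sheet figures are hypothetical.

```python
def earnings_yield(ebit, enterprise_value):
    """Earnings Yield = EBIT / Enterprise Value."""
    return ebit / enterprise_value


def return_on_capital(ebit, working_capital, net_fixed_assets):
    """Return on Capital = EBIT / (Working Capital + Net Fixed Assets)."""
    return ebit / (working_capital + net_fixed_assets)


# Hypothetical company, figures in millions of dollars.
ey = earnings_yield(ebit=120, enterprise_value=1000)
roc = return_on_capital(ebit=120, working_capital=150, net_fixed_assets=250)

print(f"Earnings Yield: {ey:.1%}")
print(f"Return on Capital: {roc:.1%}")
```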

After each company in the index is ranked by ROC and by Earnings Yield, the Magic Formula Portfolio is ready to be assembled. The Magic Formula ranking system works by summing a security's ROC rank and its Earnings Yield rank to give it a “Magic Formula Rank.” After ranking every security in the index, create a portfolio consisting of the 30 best-ranked securities on this list. Hold this portfolio for a year, then rebalance using the same procedure.
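The ranking and selection steps can be sketched in a few lines of Python. This is not the SAS code from the thesis. It assumes that rank 1 goes to the highest ratio, that the lowest combined rank is best, and that ties are broken alphabetically by ticker. The tickers and ratio values are hypothetical.

```python
def magic_formula_portfolio(companies, size=30):
    """Pick the `size` best-ranked securities and weight them equally.

    `companies` maps ticker -> (earnings_yield, return_on_capital).
    Assumption: rank 1 is the highest ratio, lowest combined rank wins.
    """
    by_ey = sorted(companies, key=lambda t: companies[t][0], reverse=True)
    by_roc = sorted(companies, key=lambda t: companies[t][1], reverse=True)
    ey_rank = {ticker: i + 1 for i, ticker in enumerate(by_ey)}
    roc_rank = {ticker: i + 1 for i, ticker in enumerate(by_roc)}

    # Magic Formula Rank = Earnings Yield rank + Return on Capital rank.
    combined = {t: ey_rank[t] + roc_rank[t] for t in companies}

    # Ties are broken alphabetically by ticker (an arbitrary choice).
    ranked = sorted(companies, key=lambda t: (combined[t], t))
    picks = ranked[:size]
    if not picks:
        return {}
    weight = 1.0 / len(picks)
    return {ticker: weight for ticker in picks}


# Hypothetical four-stock universe: (earnings yield, return on capital).
universe = {
    "AAA": (0.20, 0.30),
    "BBB": (0.08, 0.45),
    "CCC": (0.15, 0.10),
    "DDD": (0.05, 0.05),
}

portfolio = magic_formula_portfolio(universe, size=2)
print(portfolio)
```

In a backtest, this selection would be rerun once a year on fresh fundamentals to rebalance the holdings.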

Magic Formula Returns

Magic Formula vs S&P 500

Andrew Ng's Coursera Work

While taking Andrew Ng's Machine Learning course on Coursera, I converted all of the exercises we did in Octave to Python. Below is a link to all the IPython notebooks I created during this process. These notebooks cover concepts such as:

For each IPython notebook, I code the machine learning algorithm both with and without the Python libraries that typically do the heavy computation.
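To give a flavor of the "without libraries" side, here is a sketch of multivariate linear regression fitted by batch gradient descent in plain Python. It is not taken from the notebooks. The function names, learning rate, iteration count, and toy data are all assumptions chosen for illustration.

```python
def predict(theta, row):
    """Linear hypothesis: theta[0] + theta[1]*x1 + theta[2]*x2 + ..."""
    return theta[0] + sum(t * x for t, x in zip(theta[1:], row))


def gradient_descent(X, y, alpha=0.1, iterations=5000):
    """Fit linear regression by batch gradient descent on squared error."""
    m, n = len(X), len(X[0])
    theta = [0.0] * (n + 1)  # intercept plus one weight per feature
    for _ in range(iterations):
        errors = [predict(theta, row) - target for row, target in zip(X, y)]
        # Gradients of (1 / 2m) * sum(error^2) for each parameter.
        grad_intercept = sum(errors) / m
        grads = [
            sum(e * row[j] for e, row in zip(errors, X)) / m
            for j in range(n)
        ]
        # Update all parameters simultaneously.
        theta[0] -= alpha * grad_intercept
        for j in range(n):
            theta[j + 1] -= alpha * grads[j]
    return theta


# Toy data generated from y = 1 + 2*x1 + 3*x2 (no noise).
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]

theta = gradient_descent(X, y)
print([round(t, 4) for t in theta])
```

On this noise-free data the fitted parameters approach the generating values 1, 2, and 3. A library version would typically replace the loops with vectorized array operations or a ready-made estimator.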

Multivariate Linear Regression

Logistic Regression