Construct the value-weighted market return using CRSP data, replicating the market return time series available on Kenneth French's website. Also calculate the equal-weighted market return and the lagged total market capitalization. Your output should be from January 1926 to December 2017, at a monthly frequency.
Part 1
Before calculating the portfolio time series, I clean the data as part of my PS Q1 function. The cleaning steps and the assumptions behind them are as follows:
- Universe of stocks: Following Ken French's procedure, I restrict the sample to common shares (share codes 10 and 11) and to securities traded on the New York Stock Exchange, the American Stock Exchange, or the Nasdaq Stock Exchange (exchange codes 1, 2, and 3).
- Missing returns: Observations missing both RET and DLRET are removed from the dataset. Returns recorded as missing-value codes are set to NA.
- Delisting return calculation: We use cum-dividend returns for all calculations. The delisting return is used when no holding-period return is available. When both are available, we use the compounded return \((1 + RET)(1 + DLRET) - 1\).
- Market capitalization calculation: Market capitalization is the product of the share price and the number of shares outstanding. Because the price in the dataset is the end-of-period price, we lag market capitalization by one period, so that the weights for a given month use values known at the end of the previous month.
- Portfolio weights: We calculate both equal-weighted and value-weighted returns.
- Sample period: To include as much data as possible, we use the full dataset from January 1926 to December 2017.
- Definition of portfolio weights: For the equal-weighted portfolio, we take the arithmetic mean of the monthly returns of all stocks available in that month. For the value-weighted portfolio, each stock's weight is its lagged market capitalization divided by the total lagged market capitalization of all stocks in the portfolio.
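The return and weighting rules above can be sketched for a single month as follows. This is a minimal illustration in plain Python, not the code behind the reported results: the function names `combine_returns` and `market_returns`, the tuple layout, and the toy numbers are all assumptions made for the example. It also assumes that a stock needs a positive lagged market capitalization to enter either portfolio, which the text does not state explicitly.

```python
from typing import Optional


def combine_returns(ret: Optional[float], dlret: Optional[float]) -> Optional[float]:
    """Total return for one stock-month from RET and DLRET (None = missing)."""
    if ret is None and dlret is None:
        return None  # both missing: the observation is dropped
    if ret is None:
        return dlret  # only the delisting return is available
    if dlret is None:
        return ret  # only the holding-period return is available
    return (1.0 + ret) * (1.0 + dlret) - 1.0  # both available: compound them


def market_returns(stocks):
    """stocks: list of (ret, dlret, lagged_market_cap) tuples for one month.

    Returns (equal_weighted_return, value_weighted_return, total_lagged_cap).
    """
    usable = []
    for ret, dlret, lag_me in stocks:
        r = combine_returns(ret, dlret)
        # Assumption of this sketch: require a positive lagged market cap.
        if r is not None and lag_me is not None and lag_me > 0:
            usable.append((r, lag_me))
    if not usable:
        raise ValueError("no usable observations in this month")
    total_me = sum(me for _, me in usable)
    ew = sum(r for r, _ in usable) / len(usable)
    vw = sum(r * me for r, me in usable) / total_me
    return ew, vw, total_me


# Toy month: (RET, DLRET, lagged market cap); the last stock is dropped.
month = [
    (0.02, None, 100.0),
    (None, -0.30, 50.0),
    (0.10, -0.50, 50.0),
    (None, None, 75.0),
]
ew, vw, total_me = market_returns(month)
print(ew, vw, total_me)
```

Applying `market_returns` month by month to the cleaned sample would give the three output series (equal-weighted return, value-weighted return, and lagged total market value).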
Input Dataset
Variable Name | Variable Type |
---|---|
PERMNO | integer |
date | Date |
SHRCD | integer |
EXCHCD | integer |
RET | character |
DLRET | character |
PRC | numeric |
SHROUT | integer |
Output Dataset
Variable Name | Variable Type | Variable Description |
---|---|---|
Year | numeric | Year |
Month | numeric | Month |
Stock_lag_MV | numeric | Total market value the previous month (in millions) |
Stock_Ew_Ret | numeric | Equal-weighted returns |
Stock_Vw_Ret | numeric | Value-weighted returns |
Part 2
Using the risk-free rate of return from French’s website, report the following moments of the market excess returns for both time series (4 decimal digits): annualized return, annualized volatility, annualized Sharpe ratio, skewness, and excess kurtosis. Annualized values should be calculated geometrically. You should be comparing between July 1926 to December 2017, at a monthly frequency.
Input Dataset
The input comes with data from Kenneth French’s website:
Variable Name | Variable Type |
---|---|
Year | numeric |
Month | numeric |
Market_minus_Rf | numeric |
SMB | numeric |
HML | numeric |
Rf | numeric |
The summary statistics are in Table 1 below. Five statistics are reported: annualized mean, annualized standard deviation, annualized Sharpe ratio, skewness, and excess kurtosis. Column 1 gives the statistics for the replicated value-weighted market portfolio of stocks calculated in the previous question. Column 2 gives the same statistics for the value-weighted market portfolio of stocks from Ken French's website.
Statistic | Estimated FF Market Excess Return | Actual FF Market Excess Return |
---|---|---|
Annualized Mean | 0.06719 | 0.06428 |
Annualized Standard Deviation | 0.1846 | 0.1850 |
Annualized Sharpe Ratio | 0.3638 | 0.3473 |
Skewness | 0.1676 | 0.1878 |
Excess Kurtosis | 7.6617 | 7.8814 |
From the previous question, we have a time series of value-weighted market returns, namely \(\{r_t\}_{t=1}^T\). Let the market return from French's website be given by \(\{r_t^F\}_{t=1}^T\). I compute these statistics as follows:
- Sample period: Monthly from July 1926 to December 2017.
- Skewness: Skewness of the monthly return is calculated without annualization or logs using the full sample.
- Excess Kurtosis: Excess kurtosis of the monthly return is calculated without annualization or logs using the full sample.
- Annualized Mean: We first calculate the geometric mean of the monthly returns over the period, \(\hat{\mu}_M\), and then annualize it using \(\hat{\mu}_A = (1 + \hat{\mu}_M)^{12} - 1\).
- Annualized Standard Deviation: We first calculate the standard deviation of the monthly returns over the period, \(\hat{\sigma}_M\), and then annualize it using \(\hat{\sigma}_A = \sqrt{12}\hat{\sigma}_M\).
- Sharpe Ratio: Calculated as the ratio of the annualized mean to the annualized standard deviation, \(\hat{\mu}_A / \hat{\sigma}_A\).
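The five statistics defined above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the code used for Table 1: the function name `summary_stats` and the toy input are assumptions, and so is the choice of estimators (sample standard deviation with an \(n - 1\) denominator; skewness and excess kurtosis from population central moments), since the text does not say which ones were used.

```python
import math


def summary_stats(excess):
    """excess: monthly excess returns in decimal units (0.01 means 1%)."""
    n = len(excess)
    # Geometric mean of monthly returns, then compounded over 12 months.
    mu_m = math.exp(sum(math.log1p(r) for r in excess) / n) - 1.0
    ann_mean = (1.0 + mu_m) ** 12 - 1.0
    # Monthly standard deviation (n - 1 denominator), scaled by sqrt(12).
    mean = sum(excess) / n
    sigma_m = math.sqrt(sum((r - mean) ** 2 for r in excess) / (n - 1))
    ann_vol = math.sqrt(12.0) * sigma_m
    # Skewness and excess kurtosis on monthly returns, no annualization.
    m2 = sum((r - mean) ** 2 for r in excess) / n
    m3 = sum((r - mean) ** 3 for r in excess) / n
    m4 = sum((r - mean) ** 4 for r in excess) / n
    return {
        "ann_mean": ann_mean,
        "ann_vol": ann_vol,
        "sharpe": ann_mean / ann_vol,
        "skew": m3 / m2 ** 1.5,
        "excess_kurt": m4 / m2 ** 2 - 3.0,
    }


# Toy input, not real data.
stats = summary_stats([0.02, 0.00, 0.02, 0.00])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

With real data, the input would be the replicated value-weighted return minus `Rf` for each month from July 1926 to December 2017, with both series in the same units.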
Part 3
Report (up to 8 decimal digits) the correlation between your time series and French's time series, and the maximum absolute difference between the two time series. Is it zero? If not, justify whether the difference is economically negligible or not. What are the reasons for a nonzero difference? You should be comparing between July 1926 to December 2017, at a monthly frequency.
Table 2 below shows the time-series correlation between the replicated value-weighted market portfolio of stocks and the value-weighted market portfolio of stocks from Ken French’s website. I also report the maximum difference between the two series.
The sample is restricted between July 1926 and December 2017.
The difference between the replicated portfolio and the one from French's website is not zero because of differences in data processing and sample selection. In addition, Kenneth French's dataset reports values to only two decimal places. Because the correlation is very close to 1, the moments of the two time series are similar. For my replication, however, the maximum absolute difference is around 1%, which is too large to be considered negligible if economic decisions were to be made based on my results.
Correlation: 0.99980530
Maximum Absolute Difference: 0.01168131
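The two comparison numbers can be computed as in the sketch below. It is a minimal illustration, not the original code: the function name `compare_series` and the toy inputs are assumptions, and both series are assumed to be aligned by month and expressed in the same units (French's files report returns in percent, so they would need to be divided by 100 to match decimal returns).

```python
import math


def compare_series(mine, french):
    """Pearson correlation and maximum absolute difference of two aligned series."""
    if len(mine) != len(french) or len(mine) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(mine)
    mean_a = sum(mine) / n
    mean_b = sum(french) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(mine, french))
    var_a = sum((a - mean_a) ** 2 for a in mine)
    var_b = sum((b - mean_b) ** 2 for b in french)
    corr = cov / math.sqrt(var_a * var_b)
    max_abs_diff = max(abs(a - b) for a, b in zip(mine, french))
    return corr, max_abs_diff


# Toy series, not real data.
corr, max_abs_diff = compare_series([0.01, 0.02, 0.03], [0.011, 0.020, 0.030])
print(f"Correlation: {corr:.8f}")
print(f"Maximum Absolute Difference: {max_abs_diff:.8f}")
```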