On the economic significance of the benchmark portfolio

A basic issue underlying financial theory is the constitution of the market portfolio. Hence the adequacy of its usual proxy, the S&P500, is of paramount importance. Using 17 industry portfolios, we form an equally-weighted (passive) portfolio statistically identical to the S&P500 with respect to volatility. We find that, about half the time, the industry portfolio has higher returns than the S&P500. We offer this as an explanation for the flatness of the CAPM noted and questioned in early studies by Basu (1977), Black, Jensen and Scholes (1972), and Reinganum (1981). We suggest that the partial inefficiency of the S&P500 is laden with serious implications for investors and portfolio managers, question the behavioral motivation for its continued use as a benchmark, and introduce new measures of full diversification. We estimate a Jensen's Alpha error of 2.04% associated with the wrong proxy for the market portfolio.


Introduction
Requests for proper benchmarks against which to measure performance are pervasive in the finance industry, because benchmarks have two critical uses. First, investors need to be able to evaluate performance. Second, managers may be compensated according to the "alpha" that they earn. That is, they are paid to outperform a benchmark after adjusting for risk.
A basic issue underlying most of financial theory is the constitution of the "market portfolio" M. In theory, M is a portfolio comprised of all tradable assets and does not have a tangible existence. Hence the adequacy of its usual proxy, the S&P500, is of paramount theoretical as well as practical importance. Although the most recent 10 years is clearly a unique period of financial history, the recent lackluster returns may not be a direct result of these specific recent events. In spite of the cautionary recommendations of academics like Wharton's Jeremy Siegel (cf. Tergesen, 2005: 100), investors still rely more on exemplary indices than on broader diversification. That is, market performance is generally measured through broad-based market indices such as the Standard & Poor's 500 Index (S&P500), the Dow-Jones Industrial Average (DJIA) and the Russell-3000; yet the extant literature leaves a question about the appropriateness of these indices in the context of efficiency.
In this paper, we pose two basic questions: "Do major market indices fall on the efficient frontier, and, if not, what causes the continued benchmarking against them?" These questions are critical for portfolio managers because linking performance to a totally (or even partially) inefficient proxy for the market portfolio has serious implications not only for themselves, but also for the trillions of investment dollars pegged to major-index performance.
The statistical motivators of our analysis (Giraud, Hedges and Wright, 2001: 27) suggest that: "The vast majority of hedge funds attempt to generate returns from market inefficiencies, local arbitrage opportunities, or market information." However, it seems rather odd that returns generated from inefficiencies would be measured against a benchmark that is inefficient. Surely this injects a bias regarding the amount of abnormal returns.
This paper comprises four sections. Its first section evaluates the S&P500 as to whether it can fully, even if asymptotically, serve as a proxy for the theoretical concept of "M". Its second develops a broad-based portfolio of assets which we offer as an alternative with at least as good chances of fulfilling the tall order of representing a true market portfolio. The third section summarizes our current thrust and evaluates its prospects for helping delineate new research directions. The fourth section provides an analysis with respect to the investment errors associated with choosing an inefficient proxy for the market portfolio.

2.0
Background: The large shoes of "M"

Great expectations, meager satisfactions
If portfolio managers were surveyed about recent capital market performance, the most likely response would be that the market's 10-year return has been poor. In fact, since the market correction in April 2000, the capital markets have been hit by a variety of shocks including 9/11, firm-specific problems (e.g., WorldCom, Tyco, Martha Stewart's Omnimedia, and Global Crossing) as well as some relatively market-wide issues (e.g., NYSE and AIG in 2008-09). Ex post data over this period show that an investor would have done slightly better by investing in short-term Treasury securities than holding "the market"; what's more, those returns would have been risk-free! As a result, hedging and derivative-related activity is having a field day. In a 2-page feature in Business Week, Mara Der Hovanesian (2005) informs us that the market for derivatives tripled last year to the impressive sum of $8.5 trillion. Awed by this lopsidedness, she warns us that this pile-up has all the solidity of a deck of cards. Institutional investors are aware of this risk, perversely due to their collective "sheepishness for risk" and herd-like behavior in risk avoidance. The comprehensive analysis by Baigent and Massaro (2005) shows how the very mechanisms of portfolio insurance designed to stabilize small market fluctuations actually worsened the large ones. Hence it appears to them that little can be done by the individual investor attempting to make his or her institution benefit from all the market devices now available, and all the mathematical niceties that could accompany them.
This paper contributes two important questions. The first question is an empirical one: it asks whether or not the major market indices do actually fall on the efficient frontier. And, as it has been shown that the leading index, the S&P500, actually does not, we follow with questioning why the practice of benchmarking against the S&P500 persists in the face of its known inadequacies.
The latter question is of paramount practical importance for portfolio managers, because linking performance to a totally (or even partially) inefficient proxy for the market portfolio has serious implications not only for themselves, but also for the trillions of investment dollars pegged to major index performance. For example, exchange traded funds exist for "Spiders", which are pegged to the S&P500, and "Diamonds", which are pegged to the DJIA. Lynn Cohn (2004) of S&P Communications documents that the funds pegged to the S&P500 have passed the trillion-dollar mark. But what is the cost of this pegging? If there is any evidence of the S&P500's even periodic inefficiency, the search must begin for alternative proxies for the market portfolio, but this doesn't seem to be imminent.

Thrust of our analysis
Here's the problem as we see it: in order for a firm's stock to rise, it must have had "good" historical performance also accompanied by heavy trading volume. But we also know that DeBondt and Thaler (1985) show significant evidence of market overreaction in the long run, say three years. Thus, securities that have done well in the past, a requisite for inclusion in the S&P500, tend to perform poorly in subsequent periods. In short, the S&P500 is including securities at their peak, and removing them at their nadir, a recipe for poor performance. DeBondt and Thaler's (1985) findings imply that the S&P500 contains a negative bias because of its method of construction, but we still require an explanation for its resilience as a benchmark. Dreman and Berry (1995) suggest that there may be a "career" incentive. That is, when an analyst makes a prediction similar to others, he might be protected, even if wrong, in the context of "no-one else got it right either!" Although this is rational risk-averse behavior, something akin to it was described long ago by Keynes' "beauty contest" analogy. In other words, it may be the case that analysts attempt to predict what they expect average opinion to be, instead of forming their own forecast. Hence, analysts seemingly use the S&P500 because everyone else is being measured against it, or because it represents what average opinion "ought to be".
In this analysis we examine only the S&P500, one of the most oft cited market indicators, and there are two important reasons for this limitation. The first is that there is a long-standing history of the S&P500, which dates back to 1928. A lack of history excludes indices such as the Russell 3000, which comprises a larger proportion of the outstanding securities than the S&P500. The second reason is that the size of the S&P500 is large enough to represent a fair approximation of "the market". The DJIA, on the other hand, has an equally long history, but its original complement of 16 stocks, and even now at 30, is not sufficiently large to qualify as "market-wide" representation. Still, we contend that our arguments are relevant regardless of the exact choice of market index.
Using data from Kenneth French's data library over the time span 1928 to 2014, we find evidence that, about half the time, an innovative equally-weighted portfolio of 17 industry sectors has better returns than the S&P500, but with statistically identical risk. The relevance of this finding is that, at best, about one-half of the time empirical tests of the Capital Asset Pricing Model (CAPM) are using the wrong proxy for the market portfolio. We say this because a more exhaustive search might actually reveal the "true" market portfolio that renders the proportion to an even lower quantity. This gives heavy credence to Roll's (1977) thesis on the testability of the CAPM.
Part of the difficulty with the S&P500 is that it most likely contains a bias. The bias is partially "survivorship", but there's more than that. To be included in the S&P500 a firm must meet minimum values of market capitalization and trading volume (or liquidity). But what are the necessary conditions for a large market capitalization? In part, it's a "run-up" of price, but it also requires that prices do not mean-revert. There is substantial evidence to suggest that price changes tend to not mean-revert if they are supported by heavy trading volume (Stickel & Verrecchia, 1994). Thus, the first criterion implies the second, which is an element of redundancy, at least in the design of the S&P500.

Centrality of the issue
The issue at hand is central to extant financial theory as it touches on the fundamental tenets underlying the CAPM. In the context of capital market performance, academic and non-academic researchers have long posited the question "why is the estimated slope of the CAPM so low?" This seemingly perpetual question has cast doubts on the usefulness of the CAPM as a predictor of returns. Black, Jensen and Scholes (1972) find that low-beta stocks had returns higher than predicted by the CAPM; Banz (1981) and Reinganum (1981) find a size effect in which small firms have higher rates of return than predicted by the CAPM; and Basu (1977) finds that low P/E stocks have returns that are higher than can be explained by the CAPM. These empirical findings converge in that they all suggest a CAPM with a slope (or market risk premium) that is too low.
Researchers have suggested two possible explanations for the anomalies listed above. The first explanation is provided by Black (1972: 455) who suggests that: "If there is a riskless asset, then the slope of the line relating the expected return on a risky asset to its β must be smaller than it is when there are no restrictions on borrowing." A second explanation for the empirical anomalies reported in tests of the CAPM can be found in "Roll's critique" (Roll, 1977). He states that testing the CAPM is actually a joint test of the CAPM and the efficiency of the market portfolio. An asset or portfolio is characterized as "efficient" if it provides the highest expected return for a given level of risk, or the lowest level of risk for a given expected return. That is, the asset or portfolio must fall on the efficient frontier if the CAPM is to be tested properly. This requirement is reiterated by Roll and Ross (1994: 101) who state that: "Not finding a positive cross-sectional relation means that the index proxies used in empirical testing are not ex ante mean-variance-efficient." More recently, Baigent (2014) shows that a major deficiency of the M-Squared (Modigliani and Modigliani, 1997) measure of performance is the absence of the benchmark return in their risk-adjusted performance metric. However, if the presence of a benchmark is central to the acceptability of a risk-adjusted performance measure, then the benchmark itself is required to be efficient.
Most empirical tests of the CAPM rely on a proxy for the market portfolio, and one of the most common is the S&P500 index. The S&P500 is relatively desirable in view of the two desiderata of a market portfolio. First, unlike the DJIA that amounts to an equally weighted composite of only 30 industrial stocks, it is weighted by a much larger number of market values. Second, its component stocks are all actively traded (one of the requirements for inclusion among the 500 composite stocks). These two virtues render it attractive as a proxy for the market portfolio. However, the story does not stop here because the S&P500 has also shouldered the far more important responsibility of being essential to any empirical testing of the CAPM.

3.0
Testing the S&P500 as a proxy for "M"

3.01 Unresolved aspects to be tackled
As implied by Roll's seminal work, the S&P500 (or any other index) is itself a critical input to any empirical test of the CAPM. Thus, there are two issues to be addressed: (i) testing the crucial assumption that the risk-return characteristics of the S&P500 are statistically identical to those of a broad-based portfolio of assets, and (ii) should they turn out not to be statistically identical, discussing the implications of the widespread use of an inefficient proxy for the market portfolio. We address these two issues in turn.
In search of an alternative to or a broadening of the S&P500, in keeping with the spirit of Siegel's recent recommendations, we constructed an "Inclusive Industry Portfolio (IP)" as an equally weighted portfolio of seventeen industry indices. (We chose this approach because it is the most "passive" index we could envision.) The result is that the annual total returns and their variance are statistically identical to the S&P500 over the period 1927-2003. However, we note that in about 51% of the periods of observation, the IP had higher rates of return than the S&P500. That is to say, about half of the time, the proxy for the market portfolio does not lie on the efficient frontier. Therefore, since the S&P500 is consistently used as a proxy for the market portfolio, its probability of being on the efficient frontier is about the same as flipping a fair coin. Since it routinely underestimates the market risk premium, the estimated slope of the CAPM will be lower than it should be. We offer this as a further explanation for the higher-than-expected returns of low-beta stocks, small firms and low P/E stocks. [We acknowledge that the data for this study end at 2003 but argue that a portfolio of 17 industry sectors has an expectation of representing the market portfolio. A search for a suitable representative that includes the recent financial crisis of 2008 continues.]
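The construction of the IP and the count of periods in which it beats the S&P500 can be sketched as follows. This is a minimal illustration, not the computation behind the paper's results: the function names are hypothetical, and the returns are made-up numbers for three industries over four years rather than the 17 industry series.

```python
from statistics import mean


def equal_weighted_returns(industry_returns):
    """Average the industry returns within each period (equal weights).

    industry_returns: list of periods, each a list of per-industry returns.
    """
    return [mean(period) for period in industry_returns]


def share_of_periods_ahead(portfolio, benchmark):
    """Fraction of periods in which the portfolio return exceeds the benchmark's."""
    if len(portfolio) != len(benchmark):
        raise ValueError("return series must have the same length")
    wins = sum(1 for p, b in zip(portfolio, benchmark) if p > b)
    return wins / len(portfolio)


# Illustrative annual returns in percent (assumed values, not the paper's data).
industries = [
    [10.0, 14.0, 12.0],
    [-5.0, -1.0, -3.0],
    [20.0, 22.0, 24.0],
    [3.0, 5.0, 1.0],
]
sp500 = [11.0, -2.0, 23.0, 4.0]

ip = equal_weighted_returns(industries)
print("IP returns:", ip)
print("Share of periods IP beats S&P500:", share_of_periods_ahead(ip, sp500))
```

Equal weighting needs no forecast or rebalancing rule beyond the average itself, which is the sense in which the text calls the IP "passive".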
This finding has implications for interpreting the well-known study of Fama and French (1992). Fama and French claim to have evidence that the slope of the line relating expected return to systematic risk is "flat". We contend that this is not necessarily the failure of the CAPM at predicting returns, but instead, it may be an empirical failure in finding an efficient proxy for the market portfolio. In a related article, Athanasoulis and Shiller (2000) point out the importance of the market portfolio. Theoretically, they show that the construction of a "world share market" has social benefits. Although we do not form a "world share market," we do construct a broad-based industry portfolio and, in so doing, are able to move towards quantifying the social benefit of having an efficient proxy for the market portfolio. The following is an excerpt from Athanasoulis and Shiller's article: "This world share market would represent a radical innovation, since at the present time only a small fraction of world endowments are traded. Using a stochastic endowment economy where preferences are mean variance, it is shown that creating such a market may be justified in terms of its contribution to social welfare" (p. 301).
Lastly, in the context of social welfare benefits, the findings in this analysis have serious implications for financial analysts and portfolio managers. If the market risk premium (or slope of the CAPM) is too low, then required rates of return are also too low. Thus there are malfunctions due to predispositions to accepting investments that should be rejected or rejecting those that should be accepted. Alarmingly, portfolio managers may be taking long positions in securities that are over-priced and firms may even be investing in capital projects that have negative NPVs (net present values).

Rejecting the S&P500 as a Proxy for "M"
To test the efficiency of the S&P500 we obtained total annual return data on the S&P500 from 1927 to 2014. In addition, extrapolating beyond the recommendations of even the contrarian authors of financial theory, we obtained returns on 17 industry portfolios from Ken French's data library. The 17 industry portfolios are food, mining, oil, textiles, durables, consumer products, fabricated products, chemicals, construction, steel, machinery, automobiles, transportation, utilities, retail, financial and "other." These categories span the complete market. The returns on the S&P500 and the inclusive IP are shown in Figure 1. Casual observation shows that their returns closely map one another. Testing the relationship through an ordinary least squares (OLS) regression defined by equation (1) yields the results shown in Table 1. (T-statistics are reported in parentheses, and "***" indicates significance at the 1% level.) The OLS results reveal that the equally-weighted portfolio (IP) and the S&P500 closely map one another. The coefficient of determination, $R^2$, indicates that the correlation over the 87-year observation period is 0.959.
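A single-regressor OLS fit of this kind can be sketched in a few lines. The sketch assumes the IP return is regressed on the S&P500 return with an intercept; the text does not state which series is the dependent variable, so that choice, the function name and the sample numbers are all illustrative.

```python
from statistics import mean


def ols_fit(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, r_squared)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # With one regressor, R-squared equals the squared sample correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Illustrative annual returns in percent (assumed values, not the paper's data).
sp500 = [11.0, -2.0, 23.0, 4.0, 15.0]
ip = [12.0, -3.0, 22.0, 3.0, 17.0]

a, b, r2 = ols_fit(sp500, ip)
print(f"intercept={a:.3f}, slope={b:.3f}, R^2={r2:.3f}")
```

A slope near one and an intercept near zero would be consistent with the two series "closely mapping" one another, but they do not by themselves say which portfolio had the higher return in any given year.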
We also tested for the relative volatility of the equally-weighted industry portfolio and the S&P500. This is stated in null hypothesis form below. Combined with the high correlation obtained above, the inconclusiveness of both these t-tests suggests that the S&P500 cannot be proven statistically superior or inferior to an equally weighted portfolio formed across 17 major industry sectors. These findings could be construed as contributing to the traditional view of the S&P500 as a reasonable, at least partially efficient proxy for the market portfolio; yet we argue that they also bring up some deeper issues discussed in the next section.
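As a rough illustration of how such comparisons are commonly set up, the sketch below computes a paired t-statistic for the null of equal mean returns and a ratio of sample variances for the null of equal volatility. These are assumed stand-ins: the exact hypotheses and test statistics used in the paper are not reproduced here, and the function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t_statistic(a, b):
    """t-statistic for H0: mean(a - b) = 0, with len(a) - 1 degrees of freedom."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def variance_ratio(a, b):
    """Ratio of sample variances; values near 1 are consistent with equal volatility."""
    return variance(a) / variance(b)


# Illustrative annual returns in percent (assumed values, not the paper's data).
ip = [12.0, -3.0, 22.0, 3.0, 17.0]
sp500 = [11.0, -2.0, 23.0, 4.0, 15.0]

print("paired t:", paired_t_statistic(ip, sp500))
print("variance ratio:", variance_ratio(ip, sp500))
```

Both statistics would still need to be compared with critical values from the t and F distributions at the chosen significance level; that step is omitted to keep the sketch free of external libraries.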

4.0
Constructing a broad-based portfolio of assets

4.01 Monetary relevance of the issue
The "market portfolio" is defined in the literature as a fully diversified portfolio of all tradable assets where the weight of each asset is based on its market value (which requires actively traded or liquid assets). The S&P500 is frequently used as a proxy for the market portfolio, but we point out that the proxy for the market portfolio is a critical input to both the market model and the CAPM. The empirical requirement that the proxy be efficient is difficult to attain because it is supposed to be ex ante efficient while researchers typically work with ex post data. As shown in Figure 2, the risk-return characteristic of the market portfolio (or its proxy) determines the slope of the capital market line (CML).
Figure 2: The capital market line and the efficient frontier
The work of Sharpe (1964) transposes the CML to the security market line (SML) where the position of the proxy for the market portfolio is paramount. That is, if the real M (or M*) lies below the efficient frontier, even periodically, then the slope of the SML is too low. In Figure 3, the slope of the security market line is the market risk premium:
$$E(R_M) - R_f$$
For expositional completeness, the CAPM is expressed as:
$$E(R_i) = R_f + \beta_i \left[ E(R_M) - R_f \right]$$
where $R_f$ is the risk-free rate, $\beta_i$ is the systematic risk of asset $i$, and $E(R_M)$ is the expected return on the market portfolio.
The statistical tests shown in the previous section indicate that the S&P500 and the equally-weighted industry portfolio are identical in terms of risk and return. However, the critical issue for fund managers is to choose the "most efficient" fund in each investment period. If the S&P500 has a return that falls below the industry portfolio half the time, there must be a real dollar cost to pegging to the S&P500. To examine this cost we assume that we have perfect foresight regarding choice between the S&P500 and the industry portfolio so that we always choose the efficient portfolio (EP). The startling results are shown in Figure 4. Moreover, the impact of the S&P500's periodic inefficiency has an average opportunity cost of 2.76% per year. In an environment where success is measured in small units, an annual cost of 276 basis points is significant, even when the statistics indicate the portfolios are identical.
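The perfect-foresight comparison can be sketched as below, assuming that "choosing the efficient portfolio" means taking the higher of the two returns in each period (reasonable only because the two portfolios are treated as having identical risk). The function names and the sample returns are illustrative and do not reproduce the 2.76% figure.

```python
def efficient_portfolio_returns(sp500, ip):
    """Per-period return of the perfect-foresight efficient portfolio (EP)."""
    return [max(s, i) for s, i in zip(sp500, ip)]


def average_opportunity_cost(sp500, ip):
    """Mean per-period shortfall of the S&P500 relative to the EP."""
    ep = efficient_portfolio_returns(sp500, ip)
    return sum(e - s for e, s in zip(ep, sp500)) / len(sp500)


def cumulative_return(returns_pct):
    """Compound a series of percentage returns into one cumulative percentage."""
    growth = 1.0
    for r in returns_pct:
        growth *= 1.0 + r / 100.0
    return (growth - 1.0) * 100.0


# Illustrative annual returns in percent (assumed values, not the paper's data).
sp500 = [10.0, -5.0, 20.0, 0.0]
ip = [12.0, -6.0, 20.0, 4.0]

ep = efficient_portfolio_returns(sp500, ip)
print("EP returns:", ep)
print("Average opportunity cost (pct points):", average_opportunity_cost(sp500, ip))
print("Cumulative S&P500 vs EP:", cumulative_return(sp500), cumulative_return(ep))
```

Because the EP return is never below the S&P500 return in any period, the average cost is non-negative by construction; its size depends on how often, and by how much, the industry portfolio comes out ahead.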

Practical implication for investors
We take issue with the conventional wisdom that holding the S&P500 is a passive investment. In fact, Markman (2002) points out that: "Unlike most index publishers, such as the Nasdaq and Dow Jones, Standard & Poor's adds and subtracts stocks from its three broad indexes - the Largecap 500, the Midcap 400, and the Smallcap 600 - frequently in accordance with a largely subjective list of criteria that includes market capitalization, liquidity and their representation of industrial sectors." On the date of Markman's article, of the 45 stocks that were added to the S&P500 in 2000 and that remained in the index, 22 had declined by more than 50%, 13 were down by more than 75%, and eight had declined by more than 85%. Although beyond the purview of this analysis, it seems reasonable to question the selection process.
In fact, it may be argued that momentum causes inclusion in the S&P500, but DeBondt and Thaler's (1985) "winners become losers" finding rings loud in the aftermath.
The clear conclusion, not foreseen by the conventional wisdom, is that the S&P500 is as much a managed mutual fund as a true broad-based index - investors beware! But to be fair, the S&P500 has several redeeming qualities.
For example, periodically reconstituting the portfolio is a good thing, and basing this restructuring of the index on performance is going from good to better. But therein lies the rub: even though ex ante perfection is clearly beyond mortal capabilities, we all expect the decision makers at Standard & Poor's to have better foresight than everyone else regarding what to include or exclude from a major market indicator.

Estimating the error of using an inefficient benchmark
Suppose that we had perfect foresight and were able to predict, ex ante, the efficient proxy for the market portfolio with a universe constrained to the S&P500 and the 17-sector industry portfolio (IP). That is, we write the historical returns on the benchmark "M" as $R_{M,t} = \max(R_{S\&P500,t}, R_{IP,t})$ for each period $t$ so that M lies on the efficient frontier. Using data from CISDM's database, we consider the case of the equally-weighted hedge fund index (HFI) over the time period 1994-2014. As such, measuring HFI against the S&P500 would indicate an excess return of 3.98%, but when measured against a more efficient market proxy (M), the excess is only 1.94%. In terms of cumulative returns, HFI has an excess return of 66.25% over the S&P500, but falls short (-550.06%) when compared to M.
A further issue becomes apparent through Sharpe's ratio. An investor would consider HFI a superior investment to either the S&P500 or M because it provides a "better" reward-to-risk ratio; however, it is clearly easier to outperform a ratio of 0.46 than 0.65. If the HFI had a Sharpe's ratio of, say, 0.5, we would erroneously consider it to have a better reward-to-risk ratio if the S&P500 is the benchmark, but not if M is the benchmark.
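The benchmark-dependence of this verdict can be made concrete with a short sketch. It takes the two benchmark ratios quoted above (0.46 read as the S&P500's and 0.65 as M's) and the hypothetical fund ratio of 0.5 from the text; the function names and the inputs to `sharpe_ratio` are illustrative assumptions.

```python
def sharpe_ratio(mean_return, risk_free, volatility):
    """Excess return per unit of total risk (all inputs in the same units)."""
    return (mean_return - risk_free) / volatility


def beats_benchmark(fund_sharpe, benchmark_sharpe):
    """True when the fund offers the higher reward-to-risk ratio."""
    return fund_sharpe > benchmark_sharpe


SP500_SHARPE = 0.46   # benchmark ratio quoted in the text
M_SHARPE = 0.65       # ratio of the more efficient proxy quoted in the text
fund_sharpe = 0.50    # the text's hypothetical fund

print("Beats S&P500:", beats_benchmark(fund_sharpe, SP500_SHARPE))
print("Beats M:", beats_benchmark(fund_sharpe, M_SHARPE))

# Assumed inputs, only to show how a ratio of 0.5 could arise.
print("Example ratio:", sharpe_ratio(mean_return=10.0, risk_free=2.0, volatility=16.0))
```

The same fund is classified as superior or inferior purely according to which proxy is used as the yardstick, which is the point the paragraph makes.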
The misrepresentation of the market portfolio carries with it two glaring errors when Jensen's Alpha is computed. The first is the measure of systematic risk. When measured against M, the HFI beta is 0.50 but when measured against the S&P500 it is 0.45. The second error pertains to the unbiased expectation of the return on the market portfolio, which is, in fact, downward biased. The average annual return on the S&P500 is 11.13% compared to 14.53% for M. The difference of 3.40% significantly alters the slope of the security market line.
The differences in the slope (through expected return on the market portfolio) and the estimation of beta are manifested in Jensen's Alpha values of 194.33 basis points (measured against M) versus 398.18 basis points (measured against the S&P500). The difference between the two is 203.84 basis points, or 2.04%, which is the performance error.
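How the choice of proxy moves Jensen's Alpha can be sketched with the standard definition, alpha = asset return - [risk-free rate + beta x (market return - risk-free rate)]. The betas (0.45 and 0.50) and benchmark returns (11.13% and 14.53%) come from the text; the fund return and risk-free rate below are assumed placeholders, so the printed alphas are illustrative and are not the paper's values.

```python
def jensens_alpha(asset_return, risk_free, beta, market_return):
    """Return in excess of the CAPM-required return (same units as the inputs)."""
    required = risk_free + beta * (market_return - risk_free)
    return asset_return - required


# From the text: beta and average benchmark return under each proxy.
BETA_VS_SP500, SP500_RETURN = 0.45, 11.13
BETA_VS_M, M_RETURN = 0.50, 14.53

# Assumed placeholders (not reported in this section).
fund_return = 11.0
risk_free = 4.0

alpha_sp500 = jensens_alpha(fund_return, risk_free, BETA_VS_SP500, SP500_RETURN)
alpha_m = jensens_alpha(fund_return, risk_free, BETA_VS_M, M_RETURN)

print(f"alpha vs S&P500: {alpha_sp500:.2f}%")
print(f"alpha vs M:      {alpha_m:.2f}%")
print(f"overstatement:   {alpha_sp500 - alpha_m:.2f} percentage points")
```

The gap between the two alphas does not depend on the assumed fund return, since that term cancels; it is driven entirely by the betas, the two benchmark returns and the risk-free rate.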
To conclude this section, we have provided evidence, albeit over only a fourteen-year period, that selection of an inefficient proxy for the market portfolio has serious consequences for investors and money managers.

Summary and future research directions
Figure 4 shows the annual returns on the equally-weighted industry portfolio and the S&P500 and the cumulative returns respectively. Although the statistics support the hypothesis that the portfolios are not distinguishable in practice, we must note that statistical testing averages things out over all observations and individual observations are lost in the process. The fact is, sometimes the S&P500 has a higher return than the industry portfolio so that it lies on the efficient frontier. However, there are periods in which the industry portfolio lies on the efficient frontier and the S&P500 falls below it. This occurred 39 times out of 76 in our sample (or 51.3% of the time). Therefore this research has unearthed that, about half the time, there is a portfolio with equal variance but higher return than the S&P500. The first implication is for financial theoreticians: could this not be the cause of the security market line of the CAPM being too flat?
The second implication of our findings is relevant for portfolio managers. If portfolio managers are searching for stocks with non-zero "alphas," their investment decisions may be incorrect. Equation (2) determines Jensen's Alpha:
$$J_i = \bar{R}_i - \left[ R_f + \beta_i \left( \bar{R}_M - R_f \right) \right] \qquad (2)$$
where $\bar{R}_i$ is the average return on asset $i$, $\bar{R}_M$ is the average return on the proxy for the market portfolio, $R_f$ is the risk-free rate and $\beta_i$ is the systematic risk of asset $i$. If $J_i > 0$, then managers (portfolio or financial) should choose to take a long position in asset $i$. However, if the slope of the CAPM is too low, they will be overestimating the abnormal return that could be earned. Stated differently, they may be taking a long position in assets with negative $J_i$, or buying over-priced assets.
The third implication is relevant for individual investors as well as portfolio managers. Managers' performance is generally measured against a benchmark portfolio. If the benchmark portfolio is the S&P500, it will be easy to outperform the market about half the time because it does not fall on the efficient frontier. This is misleading for investors and results in higher-than-required compensation for portfolio managers in those periods. Moreover, there must be cases where even adjusting for risk will not undo this effect, because our evidence reveals the existence of at least one other portfolio with the same volatility but greater return in those periods. The implications of this finding are widespread as it causes a significant cost to all categories of financial investment.
The fourth and last issue to be discussed concerns Roll's suggestion that there is nothing unique about the market portfolio. He suggests that it is always possible to choose any efficient portfolio as an index and then find the minimum variance portfolio that is uncorrelated (zero beta) with this index. With the exception of some lone voices such as Jeremy Siegel's, the literature seems to have missed that this should lead to the development of measures of diversity.
The looming issue that stems from this research is how to form diversified portfolios without making reference to an already existing, absolutely a priori market index (i.e., when a beta of 1.0 is not in the cards because we don't know the market portfolio). Robert Shiller and colleagues at Case-Shiller-Weiss have been making efforts to develop a true market portfolio. Also, several measures of diversity have been proposed, with the more common ones in the economic and strategic management literatures being the Entropy measure and the Hirschman-Herfindahl index. In addition to these classic but possibly problematic indices, recent work by Acar and Bhatnagar (2003) and Acar and Troutt (2008) proposes the use of calibrated measures of diversity.
Because the field may be moving toward paying greater attention to various aspects of diversity within a portfolio (e.g., Aggarwal & Samwick, 2003; Tergesen, 2005), we deem this to be an inviting avenue for future research.

APPENDIX
Several measures of diversity have been proposed, with the more common ones in the economic and strategic management literatures being the Entropy measure and the Hirschman-Herfindahl index. In addition to these classic but possibly problematic indices, recent work by Acar and Bhatnagar (2003) proposes the use of calibrated measures of diversity. These authors point out that a proper range of values for a diversification index should be from 0 (no diversification or full concentration) to 1.0 (full diversification). They also show that, under a large number of investment alternatives such as the total list of S&P stocks or even the economic sectors entailed, the usual measures of diversification will only operate properly for fully concentrated or fully diversified portfolios, neither condition being valid for the major financial portfolios in use nowadays.
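For reference, the two classic measures named above can be computed as in the sketch below, using their standard textbook definitions: the Hirschman-Herfindahl index as the sum of squared portfolio proportions and the Entropy measure as the negative sum of each proportion times its natural log. The calibrated measures are not implemented here, and the function names are hypothetical.

```python
from math import log


def herfindahl_index(weights):
    """Sum of squared portfolio proportions: 1 when fully concentrated, 1/n when equal."""
    return sum(p * p for p in weights)


def entropy_measure(weights):
    """Negative sum of p*ln(p): 0 when fully concentrated, ln(n) when equal."""
    return -sum(p * log(p) for p in weights if p > 0)


concentrated = [1.0, 0.0, 0.0, 0.0]
equal_four = [0.25] * 4
equal_seventeen = [1 / 17] * 17  # e.g. an equally weighted 17-sector portfolio

for name, w in [("concentrated", concentrated),
                ("equal, n=4", equal_four),
                ("equal, n=17", equal_seventeen)]:
    print(name, herfindahl_index(w), entropy_measure(w))
```

Neither raw measure is bounded between 0 and 1 for every n (the Herfindahl floor is 1/n and the Entropy ceiling is ln n), which is one reason a rescaled index with a fixed 0-to-1 range is attractive.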
Acar and Bhatnagar demonstrate how the existing diversity measures could be replaced by calibrated ones that operate properly (remain approximately linear) for intermediate distributions likely to be used by portfolio managers, such as the "triangular" distribution. In particular, they derive two measures that have this property and also are approximately linear in the intervening range:
In equations (3) and (4), as in the literature on the economics of industrial organization (IO), $p_i$ is the proportion of security $i$ in the portfolio, and $n$ is the total number of securities. The application of diversification measures is a departure from measuring riskiness relative to a market portfolio. Because mathematical derivations in a recent article by Acar and Troutt (2008) show the A2 measure to be fully linear, we contend that this points to a rich new area of research within the overall domain advocated by Aggarwal & Samwick (2003) and Tergesen (2005).

Table 2
A risk-neutral investor would only be interested in the first two rows of Table 2, giving no consideration to risk.