Portfolio Optimization via Generalized Multivariate Shrinkage

The shrinkage method of Ledoit and Wolf (2003; 2004a; 2004b) has had some success in estimating a well-conditioned covariance matrix for high-dimensional portfolios. This paper generalizes the shrinkage method of Ledoit and Wolf to a multivariate shrinkage setting, in which the well-conditioned covariance matrix is estimated as a weighted average of multiple priors instead of a single one. It can be argued that the generalized multivariate shrinkage approach reduces estimation error and uncertainty when projecting the true covariance matrix onto the line spanned by the priors and the sample covariance matrix. Hence, the generalized multivariate shrinkage is less subject to sampling variation. Empirically, I use U.S. firms to form portfolios for out-of-sample forecasting. With Ledoit and Wolf's approach as the benchmark, out-of-sample portfolios constructed with the proposed method achieve significant variance reductions and sizable improvements in information ratios. JEL Classifications: G11, G12


Introduction
Estimating a well-conditioned and invertible variance-covariance matrix for high-dimensional portfolio selection is a long-standing difficulty. To address this issue, Ledoit and Wolf (2003; 2004a; 2004b) optimize investment portfolios by shrinking the sample covariance matrix towards a structured estimator, trading off estimation error against bias. This trade-off between bias and variance is realized through the shrinkage weights in the projection of the true covariance matrix onto the geometric line between a structured estimator and the sample covariance matrix. Their shrinkage method is theoretically and empirically attractive for covariance estimation in high-dimensional portfolios because it guarantees a well-defined and invertible variance-covariance matrix.
The shrinkage approach is also referred to as empirical Bayesian shrinkage, e.g., in Ledoit and Wolf (2004b) and DeMiguel, Garlappi, Nogales, and Uppal (2009), with a natural Bayesian interpretation of the trade-off. However, in these contexts, the Bayesian decision-maker is assumed to rely on a single prior (a single structured matrix) or, equivalently, to be neutral to uncertainty about priors in the sense of Knight (1921). Given the difficulty of estimating the moments of asset returns and the sensitivity to the choice of a particular prior, it is important to consider multiple priors and to seek robust portfolio rules that work well for a set of possible models.
Let $F$ be a vector of $K$ shrinkage targeting matrices, each of dimension $N \times N$. Hence, $F$ contains $K$ priors in the given system. Theoretically, $K$ can be infinite; in practice, however, this is inefficient.

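Consider the optimization problem:
$$\min_{\alpha} \; E\left[\lVert \Sigma^* - \Sigma \rVert^2\right], \qquad \Sigma^* = (\alpha' \otimes I)F + (1 - \alpha'\mathbf{1})S, \qquad (1)$$
where $\lVert \cdot \rVert^2$ is the squared Frobenius norm and $\alpha$ is a $K \times 1$ vector of shrinkage weights. In equation (1), $\otimes$ is the Kronecker product, $I$ is an $N \times N$ identity matrix and $\mathbf{1}$ denotes a conformable vector of ones. $\Sigma^*$, which depends on $\alpha$, is the linear convex combination of the multivariate shrinkage targeting matrices, $F$, and the sample covariance matrix, $S$. $\Sigma$ is the population covariance matrix with elements $\sigma_{ij}$. Hence, the quadratic loss function can be written as follows:
$$L(\alpha) = \sum_{i=1}^{N}\sum_{j=1}^{N} \left(\alpha' f_{ij} + (1 - \alpha'\mathbf{1})s_{ij} - \sigma_{ij}\right)^2. \qquad (2)$$
The corresponding risk function is:
$$R(\alpha) = \sum_{i=1}^{N}\sum_{j=1}^{N} E\left[\left(\alpha' f_{ij} + (1 - \alpha'\mathbf{1})s_{ij} - \sigma_{ij}\right)^2\right], \qquad (3)$$
where $f_{ij}$ is a $K \times 1$ vector whose $k$-th element is the $(i,j)$ element of the $k$-th targeting matrix, and $s_{ij}$ is the $(i,j)$ element of $S$. Note that the second-order condition in equation (5) is a positive definite matrix, so a solution exists for minimizing equations (1) and (3).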
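The problem above can be rearranged into a least-squares form, which makes the existence of a minimizer easier to see. The following is an illustrative rearrangement rather than the paper's own derivation: it assumes that $\Sigma^*$ is the combination $\sum_k \alpha_k F_k + (1 - \sum_k \alpha_k) S$ of $K$ targets, and the symbols $F_k$, $A$, $b$ and the trace inner product are introduced here for exposition only.

```latex
% Assumed notation: F_k is the k-th targeting matrix, <X, Y> = tr(X'Y).
\begin{align*}
\Sigma^*(\alpha) - \Sigma
  &= (S - \Sigma) - \sum_{k=1}^{K} \alpha_k \,(S - F_k), \\
R(\alpha)
  &= E\lVert S - \Sigma \rVert^2 - 2\,\alpha' b + \alpha' A\,\alpha, \\
A_{kl} &= E\,\langle S - F_k,\; S - F_l \rangle, \qquad
b_k = E\,\langle S - \Sigma,\; S - F_k \rangle, \\
\frac{\partial R}{\partial \alpha} &= 2\,(A\alpha - b) = 0
  \;\;\Longrightarrow\;\; A\,\alpha^* = b .
\end{align*}
```

Because $A$ is a Gram matrix it is positive semidefinite, and it is positive definite when the directions $S - F_k$ are linearly independent, which is the condition under which $\alpha^*$ is unique.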
Setting the first-order condition to zero and solving for the optimal weight vector $\alpha^*$, I derive the optimal multivariate shrinkage intensity given in equation (6), which involves a $K \times K$ matrix. See Appendix A for the proof of the optimal multivariate shrinkage intensity of equation (6).
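To make the first-order condition concrete, the sketch below solves a finite-sample, "oracle" version of the problem in which the true covariance matrix is known and the expectation is dropped. This is an illustration under stated assumptions, not the estimator of equation (6): the function names, the simulated data and the choice of identity and diagonal targets are all hypothetical, and the weights are not constrained to lie between zero and one.

```python
import numpy as np

def oracle_intensity(S, targets, Sigma):
    """Least-squares weights on the targets when the true Sigma is known (illustrative)."""
    D = [S - F for F in targets]                                 # directions S - F_k
    A = np.array([[np.sum(Dk * Dl) for Dl in D] for Dk in D])    # K x K Gram matrix
    b = np.array([np.sum((S - Sigma) * Dk) for Dk in D])         # right-hand side
    return np.linalg.solve(A, b)

def combine(S, targets, alpha):
    """Sigma* = sum_k alpha_k F_k + (1 - sum_k alpha_k) S."""
    Sigma_star = (1.0 - np.sum(alpha)) * S
    for a_k, F_k in zip(alpha, targets):
        Sigma_star = Sigma_star + a_k * F_k
    return Sigma_star

def frobenius_loss(estimate, Sigma):
    """Squared Frobenius distance between an estimate and the true matrix."""
    return float(np.sum((estimate - Sigma) ** 2))

def simulate(n_assets=20, n_obs=40, seed=0):
    """Hypothetical data: Gaussian returns with a random positive definite Sigma."""
    rng = np.random.default_rng(seed)
    B = rng.normal(size=(n_assets, n_assets))
    Sigma = B @ B.T / n_assets + np.eye(n_assets)
    X = rng.multivariate_normal(np.zeros(n_assets), Sigma, size=n_obs)
    S = np.cov(X, rowvar=False)
    identity = np.trace(S) / n_assets * np.eye(n_assets)         # scaled identity target
    diagonal = np.diag(np.diag(S))                               # diagonal target
    return S, Sigma, identity, diagonal

S, Sigma, identity, diagonal = simulate()
alpha_single = oracle_intensity(S, [identity], Sigma)
alpha_multi = oracle_intensity(S, [identity, diagonal], Sigma)
print("loss of S:          ", frobenius_loss(S, Sigma))
print("loss, one target:   ", frobenius_loss(combine(S, [identity], alpha_single), Sigma))
print("loss, two targets:  ", frobenius_loss(combine(S, [identity, diagonal], alpha_multi), Sigma))
```

Since the single-target problem is nested in the two-target problem, the oracle loss with two targets cannot exceed the loss with one, which in turn cannot exceed the loss of $S$ itself. The feasible estimator must replace the unknown quantities with consistent estimates, so this ordering is not guaranteed out of sample.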

Remark 1:
The main difference between equation (6) and Ledoit and Wolf is that $\alpha^*$ is now a vector solution for multivariate targeting matrices. The asymptotic theorem and properties derived in Ledoit and Wolf (2003; 2004a; 2004b) remain valid for equation (6).

Interpretation
An interpretation of the generalized multivariate shrinkage from frequentist statistics is based on a property of conditional variances and covariances: conditioning on the full set $F$ does not increase the expected conditional variance relative to conditioning on $\Xi$, where $\Xi$ is a subset of $F$. That is, one never does worse in predicting $\Sigma$ when additional information is conditioned on.
To convey the intuition underlying the multi-prior model, I provide a Bayesian interpretation of the generalized multivariate shrinkage. $\Sigma^*$ in equation (1) can be seen as the combination of two signals: prior information and sample information. Prior information states that the true covariance matrix $\Sigma$ lies on the sphere centered at the shrinkage target $\mu I$ with radius $\alpha_1$. Sample information states that $\Sigma$ lies on another sphere, centered at the sample covariance matrix $S$ with radius $\beta$. Bringing together prior and sample information, $\Sigma$ must lie on the intersection of the two spheres, which is the area $A$ in Graph 1. At the center of the area $A$ stands $\Sigma^*$.

Graph 1
However, to address the sensitivity to, and uncertainty about, a single prior, I introduce a second prior, say $M$, which also states that the true covariance matrix $\Sigma$ lies on a sphere, centered at the shrinkage target $M$ with radius $\alpha_2$. Now, bringing together the two priors and the sample information, $\Sigma$ must lie on the intersection of these three spheres, which is the area $B$ in Graph 2. At the center of area $B$ stands $\Sigma^*$. Area $B$ is smaller than area $A$ in Graph 1. This reduction in uncertainty improves the precision with which $\Sigma^*$ is located by reducing variance.
Theoretically, the number of structured targeting matrices can be infinite, at a computational cost. Notice that Ledoit and Wolf (2004b) choose the shrinkage target $\mu I$ because $\mu I$ is all bias and no variance. By contrast, the sample matrix, $S$, is consistent but carries a lot of estimation error. Similarly, other structured targeting matrices, e.g., the market model and the constant correlation matrix, are biased but have much less variation.
An alternative way to interpret the multivariate shrinkage philosophy is geometric. Lemma 2.1 in Ledoit and Wolf (2004b) is a projection theorem in Hilbert space, a rewriting of the Pythagorean theorem, as in Graph 3. $\Sigma^*$ is the projection of $\Sigma$ onto the line between $\mu I$ and $S$.

Graph 3
However, since $\Sigma$ is unknown, Graph 3, which is based on the single shrinkage target $\mu I$ alone, cannot uniquely determine the true $\Sigma$, as shown in Graph 4. The variance of the $\Sigma$ estimate is a circle with radius $\eta_1^2$. Accordingly, the projection range for $\Sigma^*$ is the line between $\mu I$ and $S$.

Graph 4

To reduce the variance of $\Sigma$, or the projection range for $\Sigma^*$, I introduce another shrinkage target, say $M$. Used alone in place of $\mu I$, $M$ has properties with respect to $S$ and $\Sigma$ analogous to those in Graph 3. However, when $M$, $\mu I$ and $S$ are combined, I first determine a line between $M$ and $\mu I$. Then, based on this line, I can uniquely determine a point for $\Sigma$, given that $M$ has no estimation error, as illustrated in Graph 5.

Graph 5
However, in general, the structured targeting matrix $M$ still exhibits some variation, as illustrated in Graph 6, although its variation is much smaller than that of the sample matrix $S$. This provides a variance reduction in $\Sigma$, or a projection range reduction for $\Sigma^*$, relative to Graph 4. The variation of $\Sigma$ is no longer a circle but is reduced to the arc $AB$ in Graph 6, whose projection for $\Sigma^*$ is only a segment, $CD$, of the line between $\mu I$ and $S$.

Consistent Estimator of Optimal Multivariate Shrinkage Intensity
The rest of this paper takes the optimal multivariate shrinkage intensity of equation (6) as the object of estimation. As shown by Ledoit and Wolf (2003), the optimal shrinkage intensity vanishes asymptotically at the order $O(1/T)$; thus, to simplify the estimation, the optimal multivariate shrinkage intensity takes the form given in equation (7), in which a matrix is an asymptotic estimator for the first term on the right-hand side of equation (6), while the remaining terms in equation (6) are asymptotically estimated by $\Pi$ and $\Theta$, both $K \times 1$ vectors. Equation (7) can be verified by Theorem 1 of Ledoit and Wolf (2003).
Following Ledoit and Wolf (2003) and the formula for the shrinkage intensity in Appendix B of Ledoit and Wolf (2004a), I can asymptotically estimate each element of $\Pi$ and $\Theta$. In the empirical section, I consider four shrinkage targeting matrices, namely the market model, constant correlation, identity and diagonal targets. For the market model, the consistent estimator of $\rho_{ij}^k$ is built from the market return and its sample mean, where $s_{00}$ is the variance of the market return, $s_{i0}$ is the covariance between stock $i$ and the market return, and $s_{ij}$ is the covariance between stocks $i$ and $j$.
For the constant correlation model, a consistent estimator of $\rho_{ij}^k$ follows from Ledoit and Wolf (2003). The identity and diagonal models are treated correspondingly.
Applying Theorem 1 and Lemma 3 of Ledoit and Wolf (2003), I have ( ) Finally, the optimal multivariate shrinkage intensity is determined by: ( ) In the case of ∑ as the optimal shrinkage intensity.

Portfolios and Multivariate Shrinkage Targeting Matrices
Consider a general mean-variance portfolio (MVP) of the Markowitz (1952) type with a universe of $N$ stocks, whose returns are distributed with mean vector $\mu$ and covariance matrix $\Sigma$. Markowitz (1952) defines the problem of portfolio selection as:
$$\min_{w} \; w'\Sigma w \quad \text{subject to} \quad w'\mathbf{1} = 1, \;\; w'\mu = q, \qquad (8)$$
where $\mathbf{1}$ denotes a conformable vector of ones, and $q$ is the expected rate of return required on the portfolio as a constraint. The well-known solution is
$$w = \frac{(c - qb)\,\Sigma^{-1}\mathbf{1} + (qa - b)\,\Sigma^{-1}\mu}{ac - b^2},$$
with $a = \mathbf{1}'\Sigma^{-1}\mathbf{1}$, $b = \mathbf{1}'\Sigma^{-1}\mu$ and $c = \mu'\Sigma^{-1}\mu$. In this paper, I also estimate a global minimum variance portfolio (GMVP) as
$$\min_{w} \; w'\Sigma w \quad \text{subject to} \quad w'\mathbf{1} = 1, \qquad (9)$$
with its solution as
$$w = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}.$$
Note that the solutions of both equations (8) and (9) involve the inverse of the covariance matrix. The conventional approach is to use the sample covariance matrix, $\hat{\Sigma}$, to approximate the population matrix, $\Sigma$. However, in a high-dimensional portfolio selection problem, the sample covariance matrix is typically not well-conditioned and may not even be invertible. The generalized multivariate shrinkage method obtains an estimator that is both well-conditioned and asymptotically more accurate than the sample covariance matrix. The estimator is distribution-free and has a simple explicit formula that is easy to compute and interpret.
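The closed-form weights of the two portfolio problems can be sketched as follows. This is a minimal illustration of the textbook Markowitz solutions, not the code used in the paper; the function names and the scalars `a`, `b`, `c` are assumptions introduced here, and any covariance estimator can be passed in as `Sigma`.

```python
import numpy as np

def gmvp_weights(Sigma):
    """Global minimum variance weights: Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    ones = np.ones(Sigma.shape[0])
    x = np.linalg.solve(Sigma, ones)          # Sigma^{-1} 1 without forming the inverse
    return x / (ones @ x)

def mvp_weights(Sigma, mu, q):
    """Minimum variance weights subject to full investment and target return q."""
    ones = np.ones(len(mu))
    si_one = np.linalg.solve(Sigma, ones)     # Sigma^{-1} 1
    si_mu = np.linalg.solve(Sigma, mu)        # Sigma^{-1} mu
    a, b, c = ones @ si_one, ones @ si_mu, mu @ si_mu
    d = a * c - b ** 2                        # zero if mu is proportional to a vector of ones
    return ((c - q * b) * si_one + (q * a - b) * si_mu) / d

# Hypothetical inputs for illustration only.
rng = np.random.default_rng(0)
B = rng.normal(size=(5, 5))
Sigma = B @ B.T + np.eye(5)
mu = np.array([0.010, 0.020, 0.015, 0.030, 0.005])
w = mvp_weights(Sigma, mu, q=0.02)
print("weights sum:", w.sum(), "expected return:", w @ mu)
```

Using `np.linalg.solve` instead of an explicit inverse is a design choice for numerical stability; it still fails when `Sigma` is singular, which is the situation the shrinkage estimator is meant to avoid.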
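The shrinkage estimator is expressed as a weighted average of the multivariate shrinkage targeting matrices and the sample covariance matrix as
$$\Sigma^* = \sum_{k=1}^{K} \alpha_k F_k + \Big(1 - \sum_{k=1}^{K} \alpha_k\Big) S,$$
where $F_k$ is the $k$-th targeting matrix in $F$ and $\alpha = (\alpha_1, \dots, \alpha_K)'$ is a shrinkage intensity vector with $\alpha_k \ge 0$ and $\sum_k \alpha_k \le 1$. The shrinkage intensity reflects the trade-off between estimation error and bias. $\Sigma^*$, as the estimator of $\Sigma$, is used in equations (8) and (9) to obtain optimal portfolio weights.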
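The effect of the weighted average on conditioning can be seen in a small simulation with more assets than observations. The sketch below is illustrative only: the weights `[0.2, 0.2]` are arbitrary rather than estimated, the function name `shrink` is hypothetical, and the constraint check reflects the convex-combination reading of the shrinkage intensities.

```python
import numpy as np

def shrink(S, targets, alpha):
    """Convex combination of targeting matrices and the sample covariance matrix."""
    alpha = np.asarray(alpha, dtype=float)
    if np.any(alpha < 0) or alpha.sum() > 1:
        raise ValueError("weights must be non-negative and sum to at most one")
    Sigma_star = (1.0 - alpha.sum()) * S
    for a_k, F_k in zip(alpha, targets):
        Sigma_star = Sigma_star + a_k * F_k
    return Sigma_star

rng = np.random.default_rng(0)
n_assets, n_obs = 50, 30                       # more assets than observations
X = rng.normal(size=(n_obs, n_assets))         # hypothetical return data
S = np.cov(X, rowvar=False)                    # rank at most n_obs - 1, hence singular
identity = np.trace(S) / n_assets * np.eye(n_assets)
diagonal = np.diag(np.diag(S))

Sigma_star = shrink(S, [identity, diagonal], [0.2, 0.2])   # arbitrary illustrative weights
print("rank of S:", np.linalg.matrix_rank(S), "of", n_assets)
print("smallest eigenvalue of Sigma*:", np.linalg.eigvalsh(Sigma_star).min())
```

Because both targets are positive definite and $S$ is positive semidefinite, any positive weight on a target makes the combination positive definite and therefore invertible.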
To determine the components of $F$, I consider four types of targeting matrices, namely the identity ($I$), market ($M$), constant correlation ($C$), and diagonal ($D$) targeting matrices, all of which have been applied in Ledoit and Wolf (2003; 2004a; 2004b). In the empirical section, I compare the out-of-sample performance of multivariate shrinkage to the single targeting shrinkage of Ledoit and Wolf. In particular, the formulas for the shrinkage targeting matrices are as follows.
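The four targeting matrices can be sketched in code as below. This follows the usual Ledoit and Wolf constructions as I understand them and is an assumption about the exact definitions used here: the identity target is scaled by the average sample variance, the constant correlation target uses the average off-diagonal sample correlation, and the market model target uses a single-index regression on a market return series `x0` that the caller must supply. All function names are hypothetical.

```python
import numpy as np

def identity_target(S):
    """Scaled identity: average sample variance on the diagonal."""
    n = S.shape[0]
    return np.trace(S) / n * np.eye(n)

def diagonal_target(S):
    """Sample variances on the diagonal, zeros elsewhere."""
    return np.diag(np.diag(S))

def constant_correlation_target(S):
    """Sample variances with a common correlation equal to the average sample correlation."""
    n = S.shape[0]
    sd = np.sqrt(np.diag(S))
    corr = S / np.outer(sd, sd)
    r_bar = (corr.sum() - n) / (n * (n - 1))     # mean of off-diagonal correlations
    C = r_bar * np.outer(sd, sd)
    np.fill_diagonal(C, np.diag(S))
    return C

def market_model_target(X, x0):
    """Single-index covariance s00 * beta beta' + Delta from T x N returns X and market x0."""
    T = X.shape[0]
    Xc = X - X.mean(axis=0)
    mc = x0 - x0.mean()
    s00 = mc @ mc / (T - 1)                      # market variance
    beta = (Xc.T @ mc / (T - 1)) / s00           # OLS slopes
    resid = Xc - np.outer(mc, beta)
    delta = (resid ** 2).sum(axis=0) / (T - 1)   # residual variances
    return s00 * np.outer(beta, beta) + np.diag(delta)

rng = np.random.default_rng(0)
X = rng.normal(scale=0.05, size=(120, 10))       # hypothetical monthly returns
x0 = X.mean(axis=1)                              # equal-weighted market proxy (assumption)
S = np.cov(X, rowvar=False)
targets = {"I": identity_target(S), "D": diagonal_target(S),
           "C": constant_correlation_target(S), "M": market_model_target(X, x0)}
print({name: F.shape for name, F in targets.items()})
```

With the residual variance computed using the same divisor as the sample covariance, the market model target reproduces the sample variances on its diagonal, so all four targets differ from $S$ only in how they restrict the covariances (and, for the identity target, the variances).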

Data
Monthly U.S. stock returns from January 1980 to December 2010 are taken from the Center for Research in Security Prices (CRSP). The portfolios, with monthly rebalancing, are constructed similarly to DeMiguel et al. (2009) and Jagannathan and Ma (2003): in April of each year, I randomly select $N$ assets among all assets in the CRSP data set for which there is return data for the previous 120 months as well as for the next 12 months. I then treat these randomly selected $N$ assets as the asset universe for the next 12 months. Kirby and Ostdiek (2012) have shown that the targeted conditional expected excess return ($q$, determined by investors in equation (8)) greatly affects the out-of-sample performance of portfolios; thus, in this paper, different expected return constraints are considered, $q = 3\%, 8\%, 12\%, 16\%, 20\%$ per year, for portfolio sizes $N = 30, 50, 80, 100, 225, 500$. This range of portfolio sizes covers important benchmarks such as the DJIA, Xetra DAX, DJ STOXX 50, FTSE 100, NASDAQ-100, NIKKEI 225, and S&P 500, similar to Ledoit and Wolf (2004a). For the expected returns, $\mu$ in equation (8), I take the average of realized returns over the estimation window of the past 120 months.
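The yearly selection of the asset universe can be sketched as follows. The data layout is an assumption made for illustration: `returns` is a months-by-assets array with `NaN` marking missing observations, and `t` is the row index of the first out-of-sample month; the function name and arguments are hypothetical.

```python
import numpy as np

def select_universe(returns, t, n_assets, rng, lookback=120, horizon=12):
    """Randomly pick n_assets columns with complete data from t - lookback to t + horizon - 1."""
    if t < lookback or t + horizon > returns.shape[0]:
        raise ValueError("not enough months around t")
    window = returns[t - lookback : t + horizon]
    eligible = np.flatnonzero(~np.isnan(window).any(axis=0))   # columns without gaps
    if len(eligible) < n_assets:
        raise ValueError("fewer eligible assets than requested")
    return np.sort(rng.choice(eligible, size=n_assets, replace=False))

# Hypothetical panel: 200 months, 30 assets, two with incomplete histories.
rng = np.random.default_rng(0)
panel = rng.normal(scale=0.05, size=(200, 30))
panel[:150, 0] = np.nan      # asset 0 starts trading late
panel[190:, 1] = np.nan      # asset 1 stops trading early
universe = select_universe(panel, t=180, n_assets=10, rng=rng)
estimation_window = panel[60:180, universe]     # previous 120 months
mu_hat = estimation_window.mean(axis=0)         # average realized returns
print(universe, mu_hat.shape)
```

Requiring data for the following 12 months mirrors the selection rule described above; it conditions on survival over the holding period, which is worth keeping in mind when interpreting out-of-sample results.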

Empirical Results
The performance of the generalized multivariate shrinkage is compared to the single targeting performance of the Ledoit and Wolf approach in terms of: (i) reduction in out-of-sample portfolio standard deviation; (ii) improvement in out-of-sample portfolio information ratio; and (iii) lower portfolio turnover. Turnover is defined as the total turnover of Grinold and Kahn (2000, Chapter 16) and DeMiguel et al. (2009). In general, the higher the turnover, the less attractive the portfolio is to an active manager.
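The three performance measures can be computed along the following lines. This is a sketch under assumptions: the information ratio is taken as the mean out-of-sample return divided by its standard deviation, and turnover is the average absolute change between the new target weights and the weights that have drifted with realized returns, in the spirit of DeMiguel et al. (2009). The exact definitions used in the tables may differ, and the function names are hypothetical.

```python
import numpy as np

def out_of_sample_stats(portfolio_returns):
    """Return (standard deviation, information ratio) of a return series."""
    r = np.asarray(portfolio_returns, dtype=float)
    std = r.std(ddof=1)
    return std, r.mean() / std

def average_turnover(weights, asset_returns):
    """Mean sum of absolute trades; weights and asset_returns are periods x assets."""
    weights = np.asarray(weights, dtype=float)
    asset_returns = np.asarray(asset_returns, dtype=float)
    total = 0.0
    for t in range(len(weights) - 1):
        grown = weights[t] * (1.0 + asset_returns[t])   # holdings after period-t returns
        drifted = grown / grown.sum()                   # weights just before rebalancing
        total += np.abs(weights[t + 1] - drifted).sum()
    return total / (len(weights) - 1)

# Two assets rebalanced back to equal weights after one period.
w = [[0.5, 0.5], [0.5, 0.5]]
r = [[0.10, -0.10], [0.00, 0.00]]
print(average_turnover(w, r))
print(out_of_sample_stats([0.01, 0.03, -0.02, 0.04]))
```

In the two-asset example the holdings drift to 55% and 45% before rebalancing, so restoring equal weights requires trading 5% in each asset.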
To measure the statistical significance of out-of-sample performance, I use bootstrapping methods. In particular, to compute the p-values for the information ratios, I apply the bootstrapping method proposed by Ledoit and Wolf (2008), while to test the hypothesis of the equality of two given portfolios' variances, I employ the stationary bootstrap of Politis and Romano (1994); the resulting bootstrap p-values are then generated by the methodology suggested in Ledoit and Wolf (2008, Remark 3.2). The programming code for the robust tests of Ledoit and Wolf (2008) is available at http://www.econ.uzh.ch/faculty/wolf/publications.html.
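The resampling step of the stationary bootstrap can be sketched as below. This shows only how blocks of random, geometrically distributed length are drawn, with wrap-around at the end of the sample; it is not the studentized test of Ledoit and Wolf (2008), for which the authors' own code should be used. The function names, the average block length and the log variance ratio statistic are assumptions chosen for illustration.

```python
import numpy as np

def stationary_bootstrap_indices(n, avg_block, rng):
    """Index sequence for one stationary bootstrap resample of length n."""
    p = 1.0 / avg_block                       # probability of starting a new block
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    for t in range(1, n):
        if rng.random() < p:
            idx[t] = rng.integers(n)          # jump to a random start
        else:
            idx[t] = (idx[t - 1] + 1) % n     # continue the block, wrapping around
    return idx

def bootstrap_log_variance_ratio(r1, r2, n_boot, avg_block, rng):
    """Resampled log(var(r1) / var(r2)), using the same indices for both series."""
    r1, r2 = np.asarray(r1, dtype=float), np.asarray(r2, dtype=float)
    out = np.empty(n_boot)
    for b in range(n_boot):
        idx = stationary_bootstrap_indices(len(r1), avg_block, rng)
        out[b] = np.log(np.var(r1[idx], ddof=1) / np.var(r2[idx], ddof=1))
    return out

rng = np.random.default_rng(0)
ret_a = rng.normal(scale=0.04, size=240)      # hypothetical portfolio returns
ret_b = rng.normal(scale=0.05, size=240)
draws = bootstrap_log_variance_ratio(ret_a, ret_b, n_boot=200, avg_block=5, rng=rng)
print(draws.mean(), draws.std())
```

Resampling both return series with the same indices preserves their contemporaneous dependence, which matters when the two portfolios are built from the same assets.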
Table 1 reports the out-of-sample standard deviations for the different portfolios, with p-values in parentheses. The standard deviations of portfolios estimated with the single targeting of the identity matrix are set as the benchmark, so that all other models are compared to it in the equality tests. The difference between two portfolios is significant if the p-value is less than 5%. I denote any p-value less than 1% by 0.00. All other pairwise p-values are also computed and used in the discussion, although I do not report them due to space limits. The averages of portfolio standard deviations are also reported.
Among the single targetings of the mean-variance and global minimum-variance portfolios, shrinkages towards the market model ($M$) have the lowest standard deviations, while shrinkages towards constant correlation ($C$) are the best for the portfolios with $q = 3\%$. In contrast, among the various multivariate shrinkages, shrinkages towards the combination of the market model and the diagonal matrix ($MD$) consistently achieve the lowest standard deviations, while shrinkages towards $CD$ and $MCD$ perform better for the portfolios with $q = 3\%$. Hence, comparing the best multivariate shrinkage, $MD$, to the best single targeting shrinkage, $M$, $MD$ always obtains a larger reduction in portfolio standard deviation than $M$ across different portfolio types and sizes. Importantly, the differences between $MD$ and $M$ are statistically significant.
On average, the multivariate shrinkage targetings have lower standard deviations for the global minimum-variance portfolios and the portfolios with $q$ at modest levels, such as 3%-12%. However, the single targeting performs better when the portfolios are required to meet a relatively high level of conditional expected returns, i.e., $q = 16\%, 20\%$. This evidence is in line with the finding of Kirby and Ostdiek (2012) that targeting a high conditional expected excess return may lead to poor out-of-sample performance because it greatly magnifies both estimation risk and portfolio turnover.
Table 2 presents the empirical results for the out-of-sample information ratios and the p-values of the corresponding equality tests. Among the single targetings, shrinkages towards the identity matrix ($I$) have the highest information ratios for the small and medium portfolios, such as $N = 30, 80, 100$, while shrinkages towards the market model ($M$) obtain the highest information ratios for the large portfolios. Among the multivariate shrinkage estimations, I find that $IMD$ achieves the highest information ratios for portfolio sizes from 30 to 225, while $MD$ is the best for portfolios of size 500. Similarly, comparing the best multivariate shrinkage to the best single targeting, $IMD$ and $MD$ consistently outperform $I$ and $M$ across different portfolio types, sizes and $q$ constraints. Also, the differences between multivariate shrinkage and single targeting are statistically significant, as indicated by p-values that are generally smaller than 5%. However, on average, single targetings have higher information ratios than multivariate shrinkages for the large portfolios.
Table 3 shows that shrinkages towards identity ($I$) and market model ($M$) have the lowest turnovers among the single targetings. By contrast, the multivariate shrinkage of $IMD$ has the lowest turnovers across the portfolios, while $IMCD$ obtains the lowest for the global minimum-variance portfolio and the mean-variance portfolios with $q = 3\%$ and some of those with $q = 8\%$. Comparing the best multivariate shrinkages to the best single targetings, $IMD$ and $IMCD$ always have lower turnovers than $I$ and $M$. On average, all multivariate shrinkages have lower turnovers than single targetings.
Additionally, in many cases a multivariate shrinkage performs better than any single component shrinkage of that multivariate shrinkage. For instance, in terms of the standard deviation, the multivariate shrinkage of $CD$ for portfolio size $N = 30$ and $q = 8\%$ has a standard deviation of 0.4112, smaller than those of the single targetings of both $C$, 0.415, and $D$, 0.4326. In terms of the information ratio, the corresponding comparison is against the single component targetings of $I$, 0.2851, $C$, 0.264, and $D$, 0.265.
It is also interesting to see the difference between the shrinkage intensities of the multivariate shrinkage and those of the single targetings, as plotted in Figure 1. All other portfolios have similar graphical results. I compare each single targeting matrix with the multivariate shrinkages that contain it as a component. The estimated shrinkage intensities vary significantly across targeting matrices. For instance, the shrinkage intensity of the single targeting market model is around 0.65, while the multivariate shrinkages that include the market model have the lowest, around 0.2 on average. Moreover, the comparison shows that the shrinkage intensities estimated by multivariate shrinkages are always lower than those of single targetings.
Figure 2 plots the sum of the component shrinkage intensities of each multivariate shrinkage against the intensities of the single targetings. For instance, the shrinkage intensity of $MCD$ is computed by summing the shrinkage intensities of $M$, $C$ and $D$, which are estimated by the multivariate shrinkage of $MCD$. The single targetings of $I$ and $D$ have the lowest shrinkage intensity in general, whereas most multivariate shrinkages have a higher shrinkage intensity. In addition, the variance of the shrinkage intensity is larger for the multivariate shrinkage method than for single targetings, except for the single targeting of constant correlation.

Conclusion
This paper generalizes the single targeting shrinkage method of Ledoit and Wolf to a multivariate shrinkage setting and provides the solution for the optimal shrinkage intensity of the generalized multivariate shrinkage. Mean-variance and global minimum-variance portfolios of various sizes and expected return constraints are constructed to assess out-of-sample portfolio performance.
Empirically, the generalized multivariate shrinkage outperforms the single targeting method in reducing out-of-sample portfolio variance across different portfolio types, sizes, and conditional expected returns. The proposed multivariate shrinkage method has higher information ratios only for small and medium-sized portfolios. Additionally, the out-of-sample portfolios of the generalized multivariate shrinkage appear more attractive to an active portfolio manager.

Appendix A
The risk function of the objective function is derived as:
$$R(\alpha) = \sum_{i=1}^{N}\sum_{j=1}^{N} E\left[\left(\alpha' f_{ij} + (1 - \alpha'\mathbf{1})s_{ij} - \sigma_{ij}\right)^2\right],$$
where $f_{ij}$ is a $K$-dimensional vector with elements $(f_{ij}^1, \dots, f_{ij}^K)$, $f_{ij}^k$ is the element in the $i$-th row and $j$-th column of the $k$-th targeting matrix, $s_{ij}$ is a scalar, $\mathbf{1}$ is a $K$-dimensional vector of ones, and $\sigma_{ij}$ is the element of $\Sigma$ in the $i$-th row and $j$-th column.
Minimizing the risk function, $R(\alpha)$, with respect to $\alpha$ gives the first-order derivative
$$\frac{\partial R(\alpha)}{\partial \alpha} = 2\sum_{i=1}^{N}\sum_{j=1}^{N} E\left[(f_{ij} - s_{ij}\mathbf{1})\left(\alpha' f_{ij} + (1 - \alpha'\mathbf{1})s_{ij} - \sigma_{ij}\right)\right].$$
Note that the second-order condition is positive definite, so a solution exists for minimizing equations (1) and (3).
Setting the first-order condition to zero and solving for the optimal $\alpha^*$ gives the formula for the optimal multivariate shrinkage intensity in equation (6).
$C$ is a constant correlation targeting matrix, and $M$ is a market model as in Ledoit and Wolf (2003). With the scalars so defined, the third term on the right-hand side of equation (11) is equal to zero. This completes the proof of Lemma 1(1). The proofs of (2) and (3) are similar.

For the constant correlation targeting matrix $C$, with elements $c_{ij}$, I have $C$ estimated as $c_{ii} = s_{ii}$ and $c_{ij} = \bar{r}\sqrt{s_{ii}s_{jj}}$ for $i \neq j$, where $\bar{r}$ is the average sample correlation. For the market model, I obtain the market risk beta for individual stocks by the following regression:
$$x_{it} = a_i + \beta_i x_{0t} + \varepsilon_{it},$$
where $\delta_{ii}$ denotes the variance of the residual $\varepsilon_{it}$ and $s_{00}$ the sample variance of the market return. The covariance matrix implied by this market model is
$$M = s_{00}\,\beta\beta' + \Delta,$$
with $\beta$ the vector containing the estimated risk betas of individual stocks and $\Delta$ the diagonal matrix containing the residual variances $\delta_{ii}$.
[a] Constrained portfolio with an annual target expected return of 300 basis points; [b] constrained portfolio with an annual target expected return of 800 basis points; [c] constrained portfolio with an annual target expected return of 1200 basis points; [d] constrained portfolio with an annual target expected return of 1600 basis points; [e] constrained portfolio with an annual target expected return of 2000 basis points; [f] GMV represents a global minimum variance portfolio. n represents the portfolio size. (1) averages the turnovers of the single targeting methods; (2) averages the two-targeting-matrix methods; (3) averages the three-targeting-matrix methods; (4) averages all multivariate targeting methods.

Figure 2
Figure 2 plots the sum of the component shrinkage intensities of each multivariate shrinkage, compared to the shrinkage intensities of single targetings. For instance, the shrinkage intensity of $MCD$ is computed by summing the shrinkage intensities of $M$, $C$ and $D$.

Let $F$ be a vector with $K$ elements, each of which is a targeting matrix. $\alpha$ is a $K$-dimensional weighting vector, $I$ is an $N \times N$ identity matrix and $\mathbf{1}$ denotes a conformable vector of ones. $\Sigma^*(\alpha)$ is the linear combination of the multivariate shrinkage targeting matrices, $F$, and the sample covariance matrix, $S$. $\Sigma$ is an $N \times N$ population covariance matrix.
, elements on the diagonal are

Table 1: Standard Deviation and Robust Tests of Out-of-Sample Portfolio Forecasting
Ledoit and Wolf denotes the single shrinkage targeting matrix method in Ledoit and Wolf (2003, 2004). I, M, C and D denote shrinkage towards the identity, market, constant correlation and diagonal matrices, respectively. Multivariate targeting matrices consist of any combination of these shrinkage targeting matrices. [a] Constrained portfolio with an annual target expected return of 300 basis points; [b] constrained portfolio with an annual target expected return of 800 basis points; [c] constrained portfolio with an annual target expected return of 1200 basis points; [d] constrained portfolio with an annual target expected return of 1600 basis points; [e] constrained portfolio with an annual target expected return of 2000 basis points; [f] GMV represents a global minimum variance portfolio. n represents the portfolio size. p-values are reported in parentheses. (1) averages the standard deviations of the single targeting methods; (2) averages the two-targeting-matrix methods; (3) averages the three-targeting-matrix methods; (4) averages all multivariate targeting methods. Standard deviation is annualized by multiplying by 12.

Table 3: Portfolio Turnovers