Authors: Gupta A.K., Nagar D.K.

Tags: mathematics, applied mathematics, natural sciences, monograph

ISBN: 1-58488-046-5

Year: 2000

Text
                    CHAPMAN & HALL/CRC
Monographs and Surveys
Pure and Applied Mathematics


CHAPMAN & HALL/CRC Monographs and Surveys in Pure and Applied Mathematics Main Editors H. Brezis, Universite de Paris R.G. Douglas, Texas A&M University A. Jeffrey, University of Newcastle upon Tyne (Founding Editor) Editorial Board H. Amann, University of Zurich R. Aris, University of Minnesota G.I. Barenblatt, University of Cambridge H. Begehr, Freie Universitat Berlin P. Bullen, University of British Columbia R.J. Elliott, University of Alberta R.P. Gilbert, University of Delaware R. Glowinski, University of Houston D. Jerison, Massachusetts Institute of Technology K. Kirchgassner, Universitat Stuttgart B. Lawson, State University of New York B. Moodie, University of Alberta S. Mori, Kyoto University L.E. Payne, Cornell University D.B. Pearson, University of Hull I. Raeburn, University of Newcastle G.R Roach, University of Strathclyde I. Stakgold, University of Delaware W.A. Strauss, Brown University J. van der Hoek, University of Adelaide
CHAPMAN & HALL/CRC Monographs and Surveys in Pure and Applied Mathematics 104
MATRIX VARIATE DISTRIBUTIONS
A.K. GUPTA
D.K. NAGAR
CHAPMAN & HALL/CRC
Boca Raton London New York Washington, D.C.
Library of Congress Cataloging-in-Publication Data Gupta, A. K. (Arjun K.), 1938- Matrix variate distributions / A.K. Gupta, D.K. Nagar. p. cm. — (Monographs and surveys in pure and applied mathematics) Includes bibliographical references and index. ISBN 1-58488-046-5 1. Distribution (Probability theory) 2. Multivariate analysis. 3. Random matrices. I. Nagar, D. K. II. Title. III. Series: Chapman & Hall/CRC monographs and surveys in pure and applied mathematics. QA273.6.G875 1999 519.24—dc21 99-40291 CIP This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher. The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying. Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe. © 2000 by Chapman & Hall/CRC No claim to original U.S. Government works International Standard Book Number 1-58488-046-5 Library of Congress Card Number 99-40291 Printed in the United States of America 1 234567890 Printed on acid-free paper
To Meera
AKG

To the memory of my mother
DKN
PREFACE Random matrices play an important role in the study of multivariate statistical methods. They have been found useful in physics, economics, psychology and other fields of scientific investigation. The literature on the subject is widely dispersed throughout statistical journals. A lot of material has accumulated over the years and an analytical review of the material was deemed necessary. At present there is no book available which deals with this subject primarily. This volume presents most of the developments that have taken place in continuous matrix variate distribution theory in a systematic and integrated form. Some new results have also been included. It is hoped that this volume will stimulate further research and help advance the field of multivariate statistical analysis. This book will be especially useful to graduate students, teachers and researchers who are interested in multivariate statistical analysis. The first author presented parts of this book in a one semester course at Bowling Green State University. It can also serve as a source of supplementary reading and reference to many researchers. It is assumed that the reader is familiar with introductory multivariate statistical analysis and matrix algebra. This work supplements the four volume encyclopaedic work by Johnson and Kotz on the Distributions in Statistics (Discrete Distributions, Continuous Univariate Distributions 1, Continuous Univariate Distributions 2 and Continuous Multivariate Distributions) which has been an important contribution to statistical literature. The authors have also benefited by many fine books in multivariate analysis, e.g., Anderson, An Introduction to Multivariate Statistical Analysis; Kshirsagar, Multivariate Analysis; Srivastava and Khatri, An Introduction to Multivariate Statistics; Muirhead, Aspects of Multivariate Statistical Theory; and Siotani, Hayakawa and Fujikoshi, Modern Multivariate Statistical Analysis: A Graduate Course and Handbook. 
After introducing the basic mathematical results from matrix algebra, integration, zonal polynomials and hypergeometric functions in Chapter 1, we study the matrix variate normal and Wishart distributions in Chapters 2 and 3, respectively. We discuss the matrix variate t-distribution in Chapter 4, the matrix variate beta and F-distributions in Chapter 5, and the matrix variate Dirichlet distributions in Chapter 6. The distribution of matrix quadratic forms is given in Chapter 7, and some miscellaneous distributions are presented in Chapter 8. The last chapter, Chapter 9, gives some general families of distributions. Every chapter is followed by a set of problems. Finally, a bibliography, which contains only items cited in the text, has been provided. It is not exhaustive, especially as regards papers on applied topics.

It is a pleasure to express our thanks to all who contributed to the production of this book. Foremost, we would like to record our gratitude to the late Professor C. G. Khatri for many valuable suggestions and for reading parts of the manuscript critically before his untimely death. The authors also benefited greatly from conversations with Professors V. Girko, S. Konishi, S. Kotz, N. Sugiura, and C. G. Troskie. Furthermore, Drs. J. Chen, D. Song, J. Tang and T. Varga helped by reading parts of the manuscript. Thanks are also due to Drs. D. J. de Waal, P. C. N. Groenewald, R. D. Gupta, J. M. Juritz, Ravindra Khattree, N. J. le Roux, D. G. Marx, D. G. Nel, H. M. Nel, J. J. J. Roux, W. Y. Tan and C. A. van der Merwe for providing the preprints and reprints of their work.

The first author would like to acknowledge his special thanks to his wife, Meera, for her continued encouragement and his children Alka, Mita and Nisha, for their support and help in editing the book. However, whatever errors of omission or commission remain are entirely ours. The academic environment at the Department of Mathematics and Statistics, Bowling Green State University and the Department of Mathematics, Universidad de Antioquia, was essential to complete this project. Ms. Mary Lince, CRC Press, has been very cooperative during the publishing phase. Finally, we very much appreciate the help rendered by Ms. Cynthia Patterson in the early stages of the production of this book.

Bowling Green, Ohio
August, 1999.
A. K. Gupta
D. K. Nagar
TABLE OF CONTENTS

1 PRELIMINARIES
  1.1. INTRODUCTION
  1.2. MATRIX ALGEBRA
  1.3. JACOBIANS OF TRANSFORMATIONS
  1.4. INTEGRATION
  1.5. ZONAL POLYNOMIALS
  1.6. HYPERGEOMETRIC FUNCTIONS OF MATRIX ARGUMENT
  1.7. LAGUERRE POLYNOMIALS
  1.8. GENERALIZED HERMITE POLYNOMIALS
  1.9. NOTION OF RANDOM MATRIX
  PROBLEMS

2 MATRIX VARIATE NORMAL DISTRIBUTION
  2.1. INTRODUCTION
  2.2. DENSITY FUNCTION
  2.3. PROPERTIES
  2.4. SINGULAR MATRIX VARIATE NORMAL DISTRIBUTION
  2.5. SYMMETRIC MATRIX VARIATE NORMAL DISTRIBUTION
  2.6. RESTRICTED MATRIX VARIATE NORMAL DISTRIBUTION
  2.7. MATRIX VARIATE θ-GENERALIZED NORMAL DISTRIBUTION
  PROBLEMS

3 WISHART DISTRIBUTION
  3.1. INTRODUCTION
  3.2. DENSITY FUNCTION
  3.3. PROPERTIES
  3.4. INVERTED WISHART DISTRIBUTION
  3.5. NONCENTRAL WISHART DISTRIBUTION
  3.6. MATRIX VARIATE GAMMA DISTRIBUTION
  3.7. APPROXIMATIONS
  PROBLEMS

4 MATRIX VARIATE t-DISTRIBUTION
  4.1. INTRODUCTION
  4.2. DENSITY FUNCTION
  4.3. PROPERTIES
  4.4. INVERTED MATRIX VARIATE t-DISTRIBUTION
  4.5. DISGUISED MATRIX VARIATE t-DISTRIBUTION
  4.6. RESTRICTED MATRIX VARIATE t-DISTRIBUTION
  4.7. NONCENTRAL MATRIX VARIATE t-DISTRIBUTION
  4.8. DISTRIBUTION OF QUADRATIC FORMS
  PROBLEMS

5 MATRIX VARIATE BETA DISTRIBUTIONS
  5.1. INTRODUCTION
  5.2. DENSITY FUNCTIONS
  5.3. PROPERTIES
  5.4. RELATED DISTRIBUTIONS
  5.5. NONCENTRAL MATRIX VARIATE BETA DISTRIBUTION
  PROBLEMS

6 MATRIX VARIATE DIRICHLET DISTRIBUTIONS
  6.1. INTRODUCTION
  6.2. DENSITY FUNCTIONS
  6.3. PROPERTIES
  6.4. RELATED DISTRIBUTIONS
  6.5. NONCENTRAL MATRIX VARIATE DIRICHLET DISTRIBUTIONS
  PROBLEMS

7 DISTRIBUTION OF QUADRATIC FORMS
  7.1. INTRODUCTION
  7.2. DENSITY FUNCTION
  7.3. PROPERTIES
  7.4. FUNCTIONS OF QUADRATIC FORMS
  7.5. SERIES REPRESENTATION OF THE DENSITY
  7.6. NONCENTRAL DENSITY FUNCTION
  7.7. EXPECTED VALUES
  7.8. WISHARTNESS AND INDEPENDENCE OF QUADRATIC FORMS OF THE TYPE $XAX'$
  7.9. WISHARTNESS AND INDEPENDENCE OF QUADRATIC FORMS OF THE TYPE $XAX' + \frac{1}{2}(LX' + XL') + C$
  7.10. WISHARTNESS AND INDEPENDENCE OF QUADRATIC FORMS OF THE TYPE $XAX' + L_1X' + XL_2' + C$
  PROBLEMS

8 MISCELLANEOUS DISTRIBUTIONS
  8.1. INTRODUCTION
  8.2. UNIFORM DISTRIBUTION ON STIEFEL MANIFOLD
  8.3. VON MISES-FISHER DISTRIBUTION
  8.4. BINGHAM MATRIX DISTRIBUTION
  8.5. GENERALIZED BINGHAM-VON MISES MATRIX DISTRIBUTION
  8.6. MANIFOLD NORMAL DISTRIBUTION
  8.7. MATRIX ANGULAR CENTRAL GAUSSIAN DISTRIBUTION
  8.8. BIMATRIX WISHART DISTRIBUTION
  8.9. BETA-WISHART DISTRIBUTION
  8.10. CONFLUENT HYPERGEOMETRIC FUNCTION KIND 1 DISTRIBUTION
  8.11. CONFLUENT HYPERGEOMETRIC FUNCTION KIND 2 DISTRIBUTION
  8.12. HYPERGEOMETRIC FUNCTION DISTRIBUTIONS
  8.13. GENERALIZED HYPERGEOMETRIC FUNCTION DISTRIBUTIONS
  8.14. COMPLEX MATRIX VARIATE DISTRIBUTIONS
  PROBLEMS

9 GENERAL FAMILIES OF MATRIX VARIATE DISTRIBUTIONS
  9.1. INTRODUCTION
  9.2. MATRIX VARIATE LIOUVILLE DISTRIBUTIONS
  9.3. MATRIX VARIATE SPHERICAL DISTRIBUTIONS
  9.4. MATRIX VARIATE ELLIPTICALLY CONTOURED DISTRIBUTIONS
  9.5. ORTHOGONALLY INVARIANT AND RESIDUAL INDEPENDENT MATRIX DISTRIBUTIONS
  PROBLEMS

GLOSSARY OF NOTATIONS AND ABBREVIATIONS
REFERENCES
SUBJECT INDEX
CHAPTER 1
PRELIMINARIES

1.1. INTRODUCTION

The multivariate normal distribution plays a pivotal role in the theory of multivariate statistical analysis. Besides mathematical tractability, there are other reasons for this phenomenon. Often the multivariate observations are at least approximately normally distributed. Even when the original data are not multivariate normal, the central limit theorem allows the sampling distributions of certain statistics to be approximated by the normal distribution. The independent multivariate observations are often written in terms of a matrix, which is known as the sample observation matrix (Roy, 1957). In such a matrix, when sampling from a multivariate normal distribution, the columns are distributed independently as multivariate normal with common mean vector and dispersion matrix. The assumption of independence of multivariate observations is not met in multivariate time series, stochastic processes and repeated measurements on multivariate variables. In these cases, the matrix of observations leads to the introduction of the matrix variate normal distribution. As already stated, multivariate statistical analysis depends heavily upon the multivariate normal distribution. Therefore, the distribution of the sample covariance matrix, which has a Wishart distribution (Wishart, 1928), plays a central role in almost all multivariate inferential procedures. A distribution closely connected to the Wishart distribution, known as the matrix variate beta (Khatri, 1959a; Olkin and Rubin, 1964), was introduced by Hsu (1939a) while studying the distribution of the roots of a certain determinantal equation. The matrix variate t-distribution was first obtained by Kshirsagar (1961a), when he proved that the unconditional distribution of the usual estimator of the parameter matrix of regression coefficients has a matrix variate t-distribution. The subsequent development of the theory of random matrices was brought about by theoretical and practical considerations.
Furthermore, multivariate techniques depend upon functions of random matrices such as determinants, traces and characteristic roots. Thus, random matrices are the backbone of multivariate statistical analysis. Random matrices have found applications in many fields. Wigner (1967) applied the theory of random matrices to nuclear physics. Treatment of this application and its development are reported by Mehta (1991). Carmeli (1974, 1983), dealing with the statistical theory of energy levels and its relation to random matrices, studied the complex Gaussian random matrix and introduced the quaternion random matrix. Girko and Gupta (1996) surveyed distributions of random matrices and their applications in such diverse fields as control theory, stochastic systems, linear stochastic programming, molecular chemistry, experiment planning and ring accelerators. Random matrices have also been used in studies connected with information theory, pattern recognition problems, statistical signal analysis, target detection, identification procedures, and multiple time series. Random matrices are also widely used in experimental studies in various branches such as agriculture, anthropology, biology, cybernetics, economics, education, medicine, and psychology. In these studies the observed random phenomena often can be described by random matrices which include the dependence structure of the relevant random vectors.

Many books on multivariate statistical analysis, e.g., Kshirsagar (1972); Srivastava and Khatri (1979); Muirhead (1982); Anderson (1984); Siotani, Hayakawa and Fujikoshi (1985), give some results on matrix random variables. In particular, all of them cover the Wishart distribution. The book by Gupta and Varga (1993) covers most results on matrix variate elliptically contoured distributions. The present volume incorporates most of the results on matrix variate distributions.

1.2. MATRIX ALGEBRA

This section presents a brief discussion of some of the definitions and theorems from matrix algebra. These results can be found in any book on linear algebra, e.g., Bellman (1970), Graybill (1983), Magnus and Neudecker (1988), or in books on multivariate statistical analysis, e.g., Roy (1957), Rao (1973), Muirhead (1982), Anderson (1984), Siotani, Hayakawa and Fujikoshi (1985), and Gupta and Varga (1993).
DEFINITION 1.2.1. Let $A = (a_{ij})$ be a square matrix of order $p$. Then $A$ is called
(i) nonsingular if $\det(A) \neq 0$,
(ii) a diagonal matrix, denoted by $\mathrm{diag}(a_{11}, \ldots, a_{pp})$, if $a_{ij} = 0$, $i \neq j$,
(iii) an identity matrix, denoted by $I_p$, if $A$ is diagonal and $a_{ii} = 1$, $i = 1, \ldots, p$,
(iv) a symmetric matrix if $a_{ij} = a_{ji}$ for $i \neq j$, or equivalently $A = A'$,
(v) a lower triangular matrix if $a_{ij} = 0$, $i < j$,
(vi) an upper triangular matrix if $a_{ij} = 0$, $i > j$,
(vii) an orthogonal matrix if $AA' = A'A = I_p$,
(viii) an idempotent matrix if $A^2 = A$,
(ix) a symmetric idempotent matrix if $A = A'$ and $A^2 = A$,
(x) a positive definite (positive semidefinite) matrix, denoted by $A > 0$ ($A \geq 0$), if $A$ is symmetric and for every $p$-dimensional nonzero vector $v$, $v'Av > 0$ ($v'Av \geq 0$).

THEOREM 1.2.1. Let $A$, $B$ be $p \times p$ and $C$ be $q \times p$ matrices. Then we have the following results.
(i) If $A > 0$, then $A^{-1} > 0$.
(ii) If $A \geq 0$, then $CAC' \geq 0$.
(iii) If $q \leq p$, $A > 0$, and $\mathrm{rank}(C) = q$, then $CAC' > 0$.
(iv) If $A > 0$, $B > 0$ and $A - B > 0$, then $B^{-1} - A^{-1} > 0$.
(v) If $A > 0$ and $B > 0$, then $\det(A + B) \geq \det(A) + \det(B)$.

DEFINITION 1.2.2. Let $A$ be a $p \times p$ matrix. Then the roots (with multiplicity) of the equation $\det(A - \lambda I_p) = 0$ are called characteristic roots or eigenvalues of the matrix $A$.

THEOREM 1.2.2. Let $\lambda_1, \ldots, \lambda_p$ be the characteristic roots of $A$ $(p \times p)$. Then the following results hold.
(i) $\det(A) = \prod_{i=1}^{p} \lambda_i$.
(ii) $\mathrm{tr}(A) = \sum_{i=1}^{p} \lambda_i$.
(iii) $\mathrm{rank}(A)$ = the number of nonzero characteristic roots, provided $A$ is diagonalizable (in particular, if $A$ is symmetric).
(iv) $A$ is nonsingular if and only if all its characteristic roots are nonzero.
(v) Furthermore, if we assume that $A$ is symmetric, then the characteristic roots are real.
(vi) $A$ is positive definite (positive semidefinite) if and only if all the characteristic roots of $A$ are positive (non-negative).
(vii) $A$ is symmetric idempotent of $\mathrm{rank}(A) = k$ if and only if all the nonzero characteristic roots are unity.

DEFINITION 1.2.3. Let $A = (a_{ij})$ be a $p \times q$ matrix. Then a $2 \times 2$ partition of $A$ is defined as
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},$$
with row blocks of sizes $r$ and $p - r$ and column blocks of sizes $s$ and $q - s$, where the submatrices $A_{11}$, $A_{12}$, $A_{21}$, and $A_{22}$ are
$A_{11} = (a_{ij})$, $i = 1, \ldots, r$, $j = 1, \ldots, s$,
$A_{12} = (a_{ij})$, $i = 1, \ldots, r$, $j = s + 1, \ldots, q$,
$A_{21} = (a_{ij})$, $i = r + 1, \ldots, p$, $j = 1, \ldots, s$,
$A_{22} = (a_{ij})$, $i = r + 1, \ldots, p$, $j = s + 1, \ldots, q$.
Similarly, an $m \times n$ partition of $A$ is defined as
$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{pmatrix},$$
with row blocks of sizes $p_1, \ldots, p_m$ and column blocks of sizes $q_1, \ldots, q_n$,
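where $p_1 + \cdots + p_m = p$ and $q_1 + \cdots + q_n = q$. Thus, for $m = 1$, $p_1 = p$, $n = q$, $q_1 = \cdots = q_n = 1$, one can write $A$ as
$$A = (a_1, \ldots, a_q),$$
where $a_1, \ldots, a_q$ are the $p$-dimensional column vectors of $A$. Also, when $n = 1$, $q_1 = q$, $m = p$, $p_1 = \cdots = p_m = 1$, we have
$$A = \begin{pmatrix} a_1^{*\prime} \\ \vdots \\ a_p^{*\prime} \end{pmatrix},$$
where $a_1^{*\prime}, \ldots, a_p^{*\prime}$ are the $q$-dimensional row vectors of $A$.

THEOREM 1.2.3. Let $A$ be a nonsingular square matrix of order $p$. Then,
(i) $(kA)^{-1} = k^{-1}A^{-1}$, where $k \neq 0$ is a scalar,
(ii) $(AB)^{-1} = B^{-1}A^{-1}$, where $B$ $(p \times p)$ is nonsingular,
(iii) $A^{-1} = \mathrm{diag}(a_{11}^{-1}, \ldots, a_{pp}^{-1})$ if $A = \mathrm{diag}(a_{11}, \ldots, a_{pp})$, $a_{ii} \neq 0$, $i = 1, \ldots, p$.
(iv) For $B$ $(p \times q)$, $D$ $(q \times p)$ and nonsingular $C$ $(q \times q)$,
$$(A + BCD)^{-1} = A^{-1} - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}.$$
(v) For symmetric $A$ $(p \times p)$ partitioned as $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$,
$$A^{-1} = \begin{pmatrix} A_{11\cdot 2}^{-1} & -A_{11\cdot 2}^{-1}A_{12}A_{22}^{-1} \\ -A_{22}^{-1}A_{21}A_{11\cdot 2}^{-1} & A_{22\cdot 1}^{-1} \end{pmatrix} = \begin{pmatrix} A_{11\cdot 2}^{-1} & -A_{11}^{-1}A_{12}A_{22\cdot 1}^{-1} \\ -A_{22\cdot 1}^{-1}A_{21}A_{11}^{-1} & A_{22\cdot 1}^{-1} \end{pmatrix},$$
where $A_{11\cdot 2} = A_{11} - A_{12}A_{22}^{-1}A_{21}$ and $A_{22\cdot 1} = A_{22} - A_{21}A_{11}^{-1}A_{12}$, assuming $A_{11}^{-1}$, $A_{22}^{-1}$, $A_{11\cdot 2}^{-1}$ and $A_{22\cdot 1}^{-1}$ exist.

THEOREM 1.2.4. Let $A$ be a $p \times q$ matrix. Then we have the following results.
(i) $0 \leq \mathrm{rank}(A) \leq \min(p, q)$.
(ii) If $A = 0$, then $\mathrm{rank}(A) = 0$.
(iii) $\mathrm{rank}(A) = \mathrm{rank}(A') = \mathrm{rank}(AA') = \mathrm{rank}(A'A)$.
(iv) $\mathrm{rank}(A + B) \leq \mathrm{rank}(A) + \mathrm{rank}(B)$, for $A$ and $B$ of order $p \times q$.
(v) $\mathrm{rank}(AB) \leq \min(\mathrm{rank}(A), \mathrm{rank}(B))$, for $A$ $(p \times q)$ and $B$ $(q \times r)$.
(vi) If $B$ $(p \times p)$ and $C$ $(q \times q)$ are nonsingular, then $\mathrm{rank}(BAC) = \mathrm{rank}(A)$.
(vii) For $A$ $(p \times p)$, $\mathrm{rank}(A) = p$ iff $A$ is nonsingular.

THEOREM 1.2.5. For the trace function, defined as the sum of the diagonal elements of a square matrix, we have
(i) $\mathrm{tr}(A) = \mathrm{tr}(A')$, $A$ $(p \times p)$,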
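(ii) $\mathrm{tr}(kA) = k\,\mathrm{tr}(A)$, $A$ $(p \times p)$ and $k \neq 0$ is a scalar,
(iii) $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$, $A$ $(p \times p)$, $B$ $(p \times p)$,
(iv) $\mathrm{tr}(AB) = \mathrm{tr}(BA)$, $A$ $(p \times q)$, $B$ $(q \times p)$,
(v) $\mathrm{tr}(ABC) = \mathrm{tr}(ACB)$, where $A$ $(p \times p)$, $B$ $(p \times p)$ and $C$ $(p \times p)$ are symmetric,
(vi) $\mathrm{tr}(HAH') = \mathrm{tr}(A)$, where $H$ $(p \times p)$ is orthogonal,
(vii) $\mathrm{tr}(A) = \mathrm{rank}(A)$ if $A$ is idempotent, and
(viii) $\mathrm{tr}(A^k) = \sum_{i=1}^{p} \lambda_i^k$, where $k$ is a positive integer and $\lambda_1, \ldots, \lambda_p$ are the characteristic roots of $A$ $(p \times p)$.

THEOREM 1.2.6. Let $A$ be a symmetric nonsingular matrix of order $p$. Then, for $B$ $(q \times p)$ and $C$ $(p \times q)$,
(i) $\det(I_p + CB) = \det(I_q + BC)$.
(ii) For a partition $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$,
$\det(A) = \det(A_{11}) \det(A_{22} - A_{21}A_{11}^{-1}A_{12})$, if $A_{11}$ is nonsingular,
$\det(A) = \det(A_{22}) \det(A_{11} - A_{12}A_{22}^{-1}A_{21})$, if $A_{22}$ is nonsingular.

THEOREM 1.2.7. Let $A = (a_1, \ldots, a_p) = (a_1^*, \ldots, a_p^*)'$ be an orthogonal matrix of order $p$. Then,
(i) $A^{-1} = A'$,
(ii) the characteristic roots of $A$ have unit modulus; in particular, the real ones are either $+1$ or $-1$,
(iii) $\det(A) = \pm 1$,
(iv) $a_i'a_j = 0$, $i \neq j$, $a_i'a_i = 1$, $i, j = 1, \ldots, p$,
(v) $a_i^{*\prime}a_j^* = 0$, $i \neq j$, $a_i^{*\prime}a_i^* = 1$, $i, j = 1, \ldots, p$.

THEOREM 1.2.8. Let $A$ $(p \times p)$ be an idempotent matrix. Then,
(i) $I_p - A$ is idempotent,
(ii) $PAP^{-1}$ is idempotent, where $P$ $(p \times p)$ is a nonsingular matrix,
(iii) $HAH'$ is idempotent, where $H$ $(p \times p)$ is orthogonal,
(iv) the nonzero characteristic roots of $A$ are unity,
(v) $\mathrm{tr}(A) = \mathrm{rank}(A)$,
(vi) $A^k$ is idempotent, where $k$ is a positive integer,
(vii) if $\mathrm{rank}(A) = p$, then $A = I_p$.

DEFINITION 1.2.4. Let $A = (a_{ij})$ be a square matrix of order $p$. Then the submatrices $A_{(\alpha)}$ and $A_{[\alpha]}$, $1 \leq \alpha \leq p$, of the matrix $A$ are defined as
$$A_{(\alpha)} = \begin{pmatrix} a_{11} & \cdots & a_{1\alpha} \\ \vdots & & \vdots \\ a_{\alpha 1} & \cdots & a_{\alpha\alpha} \end{pmatrix}$$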
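and
$$A_{[\alpha]} = \begin{pmatrix} a_{p-\alpha+1,\,p-\alpha+1} & \cdots & a_{p-\alpha+1,\,p} \\ \vdots & & \vdots \\ a_{p,\,p-\alpha+1} & \cdots & a_{pp} \end{pmatrix} = A^{(p-\alpha+1)},$$
respectively.

THEOREM 1.2.9. Let $A$ and $B$ be two lower (upper) triangular matrices of order $p$. Then the following results hold.
(i) $A^{-1}$ is lower (upper) triangular.
(ii) $(AB)^{-1}$ is lower (upper) triangular.
(iii) $A_{(\alpha)}$ is lower (upper) triangular and $(A_{(\alpha)})^{-1} = (A^{-1})_{(\alpha)}$.
(iv) $A_{[\alpha]}$ is lower (upper) triangular and $(A_{[\alpha]})^{-1} = (A^{-1})_{[\alpha]}$.
(v) If $A$ is lower triangular and partitioned as
$$A = \begin{pmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{pmatrix},$$
where $A_{11}$ and $A_{22}$ are nonsingular, then
$$A^{-1} = \begin{pmatrix} A_{11}^{-1} & 0 \\ -A_{22}^{-1}A_{21}A_{11}^{-1} & A_{22}^{-1} \end{pmatrix}.$$
(vi) If $A$ is upper triangular and partitioned as
$$A = \begin{pmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{pmatrix},$$
where $A_{11}$ and $A_{22}$ are nonsingular, then
$$A^{-1} = \begin{pmatrix} A_{11}^{-1} & -A_{11}^{-1}A_{12}A_{22}^{-1} \\ 0 & A_{22}^{-1} \end{pmatrix}.$$
(vii) $(AB)_{[\alpha]} = A_{[\alpha]}B_{[\alpha]}$.
(viii) $(AB)_{(\alpha)} = A_{(\alpha)}B_{(\alpha)}$.

THEOREM 1.2.10. (spectral decomposition of a symmetric matrix) Let $A$ $(p \times p)$ be a symmetric matrix. Then there exists an orthogonal matrix $H$ such that $A = HDH'$, where $D$ is a diagonal matrix having the characteristic roots of $A$ as its diagonal elements.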
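Parts (iii), (iv), (vii) and (viii) of Theorem 1.2.9 lend themselves to a quick numerical check. The following is a minimal sketch in Python, not taken from the book: the helper names (`lower_inverse`, `leading`, `trailing`) and the two example matrices are illustrative assumptions, and exact rational arithmetic is used so that the comparisons are exact.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lower_inverse(L):
    """Invert a nonsingular lower triangular matrix by forward substitution."""
    p = len(L)
    inv = [[Fraction(0)] * p for _ in range(p)]
    for j in range(p):                      # column j solves L x = e_j
        for i in range(j, p):
            rhs = Fraction(int(i == j)) - sum(L[i][k] * inv[k][j] for k in range(j, i))
            inv[i][j] = rhs / L[i][i]
    return inv

def leading(A, a):
    """A_(a): the a x a block in the upper left corner."""
    return [row[:a] for row in A[:a]]

def trailing(A, a):
    """A_[a]: the a x a block in the lower right corner."""
    return [row[-a:] for row in A[-a:]]

# Two hypothetical lower triangular matrices of order 3.
A = [[Fraction(v) for v in row] for row in [[2, 0, 0], [1, 3, 0], [4, 5, 6]]]
B = [[Fraction(v) for v in row] for row in [[1, 0, 0], [2, 1, 0], [0, 3, 2]]]
A_inv = lower_inverse(A)
AB = matmul(A, B)

for a in (1, 2, 3):
    assert leading(A_inv, a) == lower_inverse(leading(A, a))          # part (iii)
    assert trailing(A_inv, a) == lower_inverse(trailing(A, a))        # part (iv)
    assert trailing(AB, a) == matmul(trailing(A, a), trailing(B, a))  # part (vii)
    assert leading(AB, a) == matmul(leading(A, a), leading(B, a))     # part (viii)
```

The upper triangular case can be checked the same way by transposing both matrices, since the transpose of a lower triangular matrix is upper triangular.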
THEOREM 1.2.11. (square root factorization) Let $A$ $(p \times p)$ be a positive definite matrix. Then there exists a positive definite matrix $B$ $(p \times p)$ such that $A = B^2$. Furthermore, we define the square root of $A$ as $A^{1/2} = B$.

THEOREM 1.2.12. (rank factorization) Let $A$ $(p \times p)$ be a positive semidefinite matrix of rank $q$. Then there exists a matrix $B$ $(p \times q)$ of rank $q$ such that $A = BB'$.

THEOREM 1.2.13. Let $A$ $(q \times p)$ be of rank $q$ $(\leq p)$. Then there exist an orthogonal matrix $H$ $(p \times p)$ and a positive definite matrix $B$ $(q \times q)$ such that
$$A = B\,(I_q \;\; 0)\,H,$$
where $0$ denotes the $q \times (p - q)$ null matrix.

THEOREM 1.2.14. (Cholesky decomposition) Let $A$ $(p \times p)$ be a positive definite matrix. Then there exists a unique lower (upper) triangular matrix $T$ $(p \times p)$ with positive diagonal elements such that $A = TT'$.

THEOREM 1.2.15. Let $A$ $(q \times p)$ be of rank $q \leq p$. Then there exist a lower (upper) triangular matrix $T$ $(q \times q)$ with positive diagonal elements, and a semiorthogonal matrix $H_1$ $(q \times p)$, $H_1H_1' = I_q$, such that $A = TH_1$. Moreover, this representation is unique.

THEOREM 1.2.16. Let $A_1, \ldots, A_m$ be $p \times p$ symmetric matrices. Then a necessary and sufficient condition for simultaneous diagonalization of $A_1, \ldots, A_m$ by an orthogonal matrix $H$ is that $A_iA_j = A_jA_i$ for every pair $(i, j)$, $i \neq j$, $i, j = 1, \ldots, m$.

THEOREM 1.2.17. Let $A_1, \ldots, A_m$ be $p \times p$ symmetric idempotent matrices and $A_iA_j = 0$, $i \neq j$. Then there exists an orthogonal matrix $H$ $(p \times p)$ such that
$$H'A_1H = \begin{pmatrix} \Lambda_{r_1} & 0 \\ 0 & 0 \end{pmatrix}, \quad H'A_2H = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \Lambda_{r_2} & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \ldots, \quad H'A_mH = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \Lambda_{r_m} & 0 \\ 0 & 0 & 0 \end{pmatrix},$$
where $\Lambda_{r_i} = \mathrm{diag}(\lambda_{i1}, \ldots, \lambda_{ir_i})$, the $\lambda_{ij}$'s are the characteristic roots of $A_i$, $r_i = \mathrm{rank}(A_i)$, $i = 1, \ldots, m$, and the null matrices are of appropriate orders.

From Theorem 1.2.10, we can derive the following well-known representation.

THEOREM 1.2.18. (spectral representation) Let $A$ $(p \times p)$ be a symmetric matrix. Then the matrix $A$ can be written as
$$A = \sum_{j=1}^{m} \lambda_j A_j,$$
where $A_1, \ldots, A_m$ are symmetric idempotent matrices, $\mathrm{rank}(A_j) = l_j$, $A_iA_j = 0$,
$i \neq j$, and $\lambda_1, \ldots, \lambda_m$ are the characteristic roots of $A$ with multiplicities $l_1, \ldots, l_m$ respectively, $\lambda_1 > \cdots > \lambda_m$.

THEOREM 1.2.19. Let $A$ $(p \times p)$ be a symmetric matrix written as
$$A = \sum_{j=1}^{m} \alpha_j A_j,$$
where $\alpha_1, \ldots, \alpha_m$ are positive real numbers, $A_1, \ldots, A_m$ are symmetric idempotent matrices, $A_iA_j = 0$, $i \neq j$, and $\sum_{j=1}^{m} A_j = I_p$. Then the matrix $A$ is positive definite and
$$A^{-1} = \Big(\sum_{j=1}^{m} \alpha_j A_j\Big)^{-1} = \sum_{j=1}^{m} \alpha_j^{-1} A_j.$$

THEOREM 1.2.20. Let $A_1, \ldots, A_m$ be symmetric matrices of order $p$ and let $A = \sum_{j=1}^{m} A_j$. Consider the following four conditions:
(i) $A_j^2 = A_j$, $j = 1, \ldots, m$,
(ii) $A_iA_j = 0$, $i \neq j$,
(iii) $A^2 = A$,
(iv) $\sum_{j=1}^{m} \mathrm{rank}(A_j) = \mathrm{rank}(A)$.
Then (a) any two of the conditions (i), (ii), and (iii) imply the remaining, and (b) conditions (iii) and (iv) imply (i) and (ii).

The above result was given by Graybill and Marsaglia (1957). For an easy proof of this result, the reader is referred to Loynes (1966) and Searle (1971, pp. 62-63).

The next two theorems state some useful results concerning the Kronecker product, also called the direct product.

DEFINITION 1.2.5. The Kronecker product of two matrices $A$ $(m \times n) = (a_{ij})$ and $B$ $(p \times q) = (b_{ij})$, denoted by $A \otimes B$, is the $mp \times nq$ matrix defined by
$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{pmatrix} = (a_{ij}B).$$

Using Definition 1.2.5, the following properties of the Kronecker product can be easily proved (see Graham, 1981; Graybill, 1983).

THEOREM 1.2.21.
(i) For any nonzero scalars $\alpha$ and $\beta$, $(\alpha A) \otimes (\beta B) = \alpha\beta (A \otimes B)$.
(ii) For $A$ $(m \times n)$, $B$ $(m \times n)$ and any $C$, $(A + B) \otimes C = (A \otimes C) + (B \otimes C)$,
$C \otimes (A + B) = (C \otimes A) + (C \otimes B)$.
(iii) $(A \otimes B) \otimes C = A \otimes (B \otimes C)$.
(iv) $(A \otimes B)' = A' \otimes B'$.
(v) For $A$ $(m \times m)$ and $B$ $(n \times n)$, $\mathrm{tr}(A \otimes B) = (\mathrm{tr}\,A)(\mathrm{tr}\,B)$.
(vi) For $A$ $(m \times n)$, $B$ $(p \times q)$, $C$ $(n \times r)$ and $D$ $(q \times s)$, $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$.
(vii) For nonsingular matrices $A$ and $B$, $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.
(viii) If $P$ and $Q$ are orthogonal matrices, then $P \otimes Q$ is an orthogonal matrix.
(ix) If $P$ and $Q$ are positive definite matrices, then $P \otimes Q$ is also a positive definite matrix.
(x) For $A$ $(m \times m)$ and $B$ $(n \times n)$, $\det(A \otimes B) = \det(A)^n \det(B)^m$.
(xi) If the eigenvalues of $A$ $(m \times m)$ are $a_i$, $i = 1, \ldots, m$, and of $B$ $(n \times n)$ are $b_j$, $j = 1, \ldots, n$, then the eigenvalues of $A \otimes B$ are $a_ib_j$, $i = 1, \ldots, m$, $j = 1, \ldots, n$.

DEFINITION 1.2.6. For a matrix $X$ $(m \times n)$, $\mathrm{vec}(X)$ is the $mn \times 1$ vector defined as
$$\mathrm{vec}(X) = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix},$$
where $x_i$, $i = 1, \ldots, n$, is the $i$th column of $X$.

THEOREM 1.2.22. For $A$ $(p \times m)$, $B$ $(n \times q)$, $C$ $(q \times m)$, $D$ $(q \times n)$, $E$ $(m \times m)$, and $X$ $(m \times n)$, we have
(i) $\mathrm{vec}(AXB) = (B' \otimes A)\,\mathrm{vec}(X)$,
(ii) $\mathrm{tr}(CXB) = (\mathrm{vec}(C'))'(I_q \otimes X)\,\mathrm{vec}(B)$,
(iii) $\mathrm{tr}(DX'EXB) = (\mathrm{vec}(X))'(D'B' \otimes E)\,\mathrm{vec}(X) = (\mathrm{vec}(X))'(BD \otimes E')\,\mathrm{vec}(X)$.

Now we define the commutation matrix, also known as the permutation matrix, which transforms $\mathrm{vec}(A)$ into $\mathrm{vec}(A')$. The commutation matrix $K_{pq}$ of order $pq \times pq$ is defined as
$$K_{pq} = \sum_{i=1}^{p} \sum_{j=1}^{q} (H_{ij} \otimes H_{ij}'), \tag{1.2.1}$$
where the matrix $H_{ij}$ $(p \times q)$ has a unit element at the $(i, j)$th place and zero elsewhere. Note that
$$H_{ij} = a_ib_j', \tag{1.2.2}$$
where $a_i$ $(p \times 1)$ and $b_j$ $(q \times 1)$ have unity as the $i$th and $j$th element, respectively, with the remaining elements zero. Using this representation, the following properties can easily be proved.
(i)
$$K_{pq} = K_{qp}' = K_{qp}^{-1}. \tag{1.2.3}$$
(ii)
$$K_{pq}\,\mathrm{vec}(A) = \mathrm{vec}(A'), \tag{1.2.4}$$
where $A$ is a $p \times q$ matrix.
(iii)
$$K_{pq}(A \otimes B)K_{rs} = B \otimes A, \tag{1.2.5}$$
where $A$ is a $q \times r$ matrix and $B$ is a $p \times s$ matrix.
(iv)
$$\mathrm{tr}\{K_{pq}(A' \otimes B)\} = \mathrm{tr}(A'B) = (\mathrm{vec}(A))'\,\mathrm{vec}(B), \tag{1.2.6}$$
for $A$ $(p \times q)$, $B$ $(p \times q)$.
(v)
$$\mathrm{vec}(A \otimes B) = (I_n \otimes K_{sm} \otimes I_r)(\mathrm{vec}(A) \otimes \mathrm{vec}(B)), \tag{1.2.7}$$
for $A$ $(m \times n)$ and $B$ $(r \times s)$.

For proof of these results, one can refer to Magnus and Neudecker (1979), and Neudecker and Wansbeek (1983). Next we define the vec notation for a symmetric matrix (Brown, 1974).

DEFINITION 1.2.7. For a symmetric matrix $X$ $(p \times p)$, $\mathrm{vecp}(X)$ is a $\frac{1}{2}p(p + 1)$-dimensional column vector formed from the elements above and including the diagonal, taken columnwise. In other words, if
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{p1} & x_{p2} & \cdots & x_{pp} \end{pmatrix},$$
then
$$\mathrm{vecp}(X) = (x_{11}, x_{12}, x_{22}, \ldots, x_{1p}, \ldots, x_{pp})' = \mathrm{vecp}(X').$$
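The Kronecker product, the vec operator and the commutation matrix are easy to experiment with on small integer matrices. The sketch below is an illustration in plain Python, not the authors' construction: the helper names and the example matrices are assumptions made here. It builds $K_{pq}$ directly from (1.2.1) and checks Theorem 1.2.22 (i) together with (1.2.3) to (1.2.7) for one choice of dimensions.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def eye(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def kron(A, B):
    """Kronecker product (a_ij B) of Definition 1.2.5."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def vec(X):
    """Stack the columns of X into a single column (Definition 1.2.6)."""
    return [[x] for col in zip(*X) for x in col]

def commutation(p, q):
    """K_pq = sum_i sum_j (H_ij (x) H_ij'), following (1.2.1)."""
    K = [[0] * (p * q) for _ in range(p * q)]
    for i in range(p):
        for j in range(q):
            H = [[int((r, c) == (i, j)) for c in range(q)] for r in range(p)]
            term = kron(H, transpose(H))
            K = [[k + t for k, t in zip(rk, rt)] for rk, rt in zip(K, term)]
    return K

# Hypothetical example matrices.
A = [[1, 2, 3], [4, 5, 6]]      # 2 x 3
C = [[0, 1, 2], [3, 1, 0]]      # 2 x 3
X = [[1, 0], [2, 1], [0, 3]]    # 3 x 2
B = [[2, 1], [0, 1]]            # 2 x 2
D = [[1, 2], [3, 4], [5, 6]]    # 3 x 2

# Theorem 1.2.22 (i): vec(AXB) = (B' (x) A) vec(X).
assert vec(matmul(matmul(A, X), B)) == matmul(kron(transpose(B), A), vec(X))

K23, K32 = commutation(2, 3), commutation(3, 2)
assert K23 == transpose(K32) and matmul(K23, K32) == eye(6)     # (1.2.3)
assert matmul(K23, vec(A)) == vec(transpose(A))                 # (1.2.4)
# (1.2.5) with A of order q x r = 2 x 3 and D of order p x s = 3 x 2.
assert matmul(matmul(K32, kron(A, D)), K32) == kron(D, A)
# (1.2.6) with A and C both of order p x q = 2 x 3.
assert trace(matmul(K23, kron(transpose(A), C))) == trace(matmul(transpose(A), C))
# (1.2.7) with A of order m x n = 2 x 3 and D of order r x s = 3 x 2.
perm = kron(eye(3), kron(commutation(2, 2), eye(3)))
assert vec(kron(A, D)) == matmul(perm, kron(vec(A), vec(D)))
```

Column vectors are stored as one-column matrices so that a single `matmul` covers both matrix-matrix and matrix-vector products.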
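DEFINITION 1.2.8. The matrix $B_p$ of order $p^2 \times \frac{1}{2}p(p + 1)$, with typical element
$$(B_p)_{ij,gh} = \tfrac{1}{2}(\delta_{ig}\delta_{jh} + \delta_{ih}\delta_{jg}), \quad i \leq p,\ j \leq p,\ g \leq h \leq p, \tag{1.2.8}$$
where $\delta_{rs}$ is the Kronecker delta, is called the transition matrix.

It may be noticed that the rank of $B_p$ is $\frac{1}{2}p(p + 1)$. The Moore-Penrose inverse of $B_p$ is
$$B_p^+ = (B_p'B_p)^{-1}B_p', \tag{1.2.9}$$
which is of order $\frac{1}{2}p(p + 1) \times p^2$ with typical element
$$(B_p^+)_{gh,ij} = (2 - \delta_{gh})(B_p)_{ij,gh} = \begin{cases} 1, & ij = gh \text{ or } ij = hg, \\ 0, & \text{otherwise,} \end{cases} \quad i \leq p,\ j \leq p,\ g \leq h \leq p. \tag{1.2.10}$$
For a symmetric matrix $X$ $(p \times p)$, the matrices $B_p$ and $B_p^+$ can be used to express $\mathrm{vecp}(X)$ in terms of $\mathrm{vec}(X)$ and $\mathrm{vec}(X)$ in terms of $\mathrm{vecp}(X)$, respectively:
$$\mathrm{vecp}(X) = B_p'\,\mathrm{vec}(X), \tag{1.2.11}$$
$$\mathrm{vec}(X) = (B_p^+)'\,\mathrm{vecp}(X). \tag{1.2.12}$$
The $p^2 \times p^2$ idempotent matrix $M_p = B_pB_p^+$ has the typical element
$$(M_p)_{ij,gh} = \tfrac{1}{2}(\delta_{ig}\delta_{jh} + \delta_{ih}\delta_{jg}), \quad i \leq p,\ j \leq p,\ g \leq p,\ h \leq p. \tag{1.2.13}$$
It is interesting to note that
$$\tfrac{1}{2}(I_{p^2} + K_{pp}) = M_p, \tag{1.2.14}$$
$$M_pB_p = B_p, \tag{1.2.15}$$
and
$$B_p^+M_p = B_p^+. \tag{1.2.16}$$
Further, if $Y$ is a $p \times p$ matrix, then from (1.2.4) and (1.2.14) we get
$$M_p\,\mathrm{vec}(Y) = \tfrac{1}{2}(I_{p^2} + K_{pp})\,\mathrm{vec}(Y) = \tfrac{1}{2}(\mathrm{vec}(Y) + \mathrm{vec}(Y')) = \mathrm{vec}(X), \tag{1.2.17}$$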
where $X = \frac{1}{2}(Y + Y') = X'$. Thus for a symmetric $Y$ $(p \times p)$, $M_p\,\mathrm{vec}(Y) = \mathrm{vec}(Y)$. For a matrix $A$ $(p \times m)$,
$$M_p(A \otimes A) = (A \otimes A)M_m,$$
and for $A$ $(p \times p)$,
$$\det(B_p'(A \otimes A)B_p) = 2^{-\frac{1}{2}p(p-1)} \det(A)^{p+1}. \tag{1.2.18}$$
In particular, if $A = I_p$ then $\det(B_p'B_p) = 2^{-\frac{1}{2}p(p-1)}$.

1.3. JACOBIANS OF TRANSFORMATIONS

Let $X$ and $Y$ be two matrices having the same number of independent elements $x_1, \ldots, x_p$ and $y_1, \ldots, y_p$ respectively. Consider the matrix transformation $Y = F(X)$. Then the Jacobian of the transformation from $X$ to $Y$ is defined as
$$J(X \to Y) = \operatorname{mod} \det \begin{pmatrix} \dfrac{\partial x_1}{\partial y_1} & \cdots & \dfrac{\partial x_1}{\partial y_p} \\ \vdots & & \vdots \\ \dfrac{\partial x_p}{\partial y_1} & \cdots & \dfrac{\partial x_p}{\partial y_p} \end{pmatrix},$$
where mod denotes the absolute value. The following results for Jacobians are well known. Their proofs and details are given in Deemer and Olkin (1951), Olkin (1953), Olkin and Roy (1954), Roy (1957), Olkin and Rubin (1964), Perlman (1977), and Rogers (1980).
(i) $dx_1 \cdots dx_p = J(X \to Y)\, dy_1 \cdots dy_p$.
(ii) If $J(X \to Y) \neq 0$, then $J(Y \to X) = \{J(X \to Y)\}^{-1}$.
(iii) If $Y = F(Z)$ and $Z = G(X)$, then $J(X \to Y) = J(X \to Z)J(Z \to Y)$.
(iv) If $Y = F(X)$ and $Z = G(W)$, then $J(X, W \to Y, Z) = J(X \to Y)J(W \to Z)$.
(v) $J(X \to Y) = J((dX) \to (dY))$, where $(dZ)$ is the matrix of differentials of the elements of $Z$.
(vi) If $y_i = f_i(x_1, \ldots, x_m, x_{m+1}, \ldots, x_{m+n})$, $i = 1, \ldots, m$, where $x_1, \ldots, x_{m+n}$ are subject to $n$ constraints $f_i(x_1, \ldots, x_{m+n}) = 0$, $i = m + 1, \ldots, m + n$, then
$$J(y_1, \ldots, y_m \to x_1, \ldots, x_m) = \frac{J(f_1, \ldots, f_{m+n} \to x_1, \ldots, x_{m+n})}{J(f_{m+1}, \ldots, f_{m+n} \to x_{m+1}, \ldots, x_{m+n})}.$$
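For a linear map the definition above can be evaluated exactly, which gives a simple way to see the chain rule (iii) at work. The sketch below is illustrative only; the matrices and helper names are assumptions made here. It takes $Y = AZ$ with $Z = XB$, reads off the matrices of partial derivatives from Theorem 1.2.22 (i), namely $\mathrm{vec}(Z) = (B' \otimes I_p)\,\mathrm{vec}(X)$ and $\mathrm{vec}(Y) = (I_q \otimes A)\,\mathrm{vec}(Z)$, and evaluates their determinants with exact rational arithmetic.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def eye(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def kron(A, B):
    """Kronecker product (a_ij B)."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def det(M):
    """Determinant by Gaussian elimination over exact rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        result *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return sign * result

# Hypothetical example: X, Z and Y are p x q with p = 2 and q = 3.
A = [[2, 1], [1, 3]]                    # p x p
B = [[1, 2, 0], [0, 1, 1], [1, 0, 2]]   # q x q
p, q = len(A), len(B)

J_Z_X = abs(det(kron(transpose(B), eye(p))))   # J(Z -> X) for Z = XB
J_Y_Z = abs(det(kron(eye(q), A)))              # J(Y -> Z) for Y = AZ
J_Y_X = abs(det(kron(transpose(B), A)))        # J(Y -> X) for Y = AXB

# Chain rule (iii), written for the maps X -> Z -> Y.
assert J_Y_X == J_Y_Z * J_Z_X
# Theorem 1.2.21 (x) gives the same value in closed form.
assert J_Y_X == abs(det(A)) ** q * abs(det(B)) ** p
```

The closed form recovered in the last line is the product of a power of $\det(A)$ and a power of $\det(B)$, with exponents equal to the number of columns and rows of $X$ respectively.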
The Jacobians of certain transformations which are needed in the subsequent chapters are now given, where for simplicity "mod" has been suppressed from their values.

LINEAR TRANSFORMATIONS

(i) For $y$ $(p \times 1)$, $x$ $(p \times 1)$ and $A$ $(p \times p)$, if $y = Ax$, then
$$J(y \to x) = \det(A). \tag{1.3.1}$$
(ii) For $Y$ $(p \times q)$, $X$ $(p \times q)$, and $A$ $(p \times p)$, if $Y = AX$, then
$$J(Y \to X) = \det(A)^q. \tag{1.3.2}$$
(iii) For $Y$ $(p \times q)$, $X$ $(p \times q)$, and $B$ $(q \times q)$, if $Y = XB$, then
$$J(Y \to X) = \det(B)^p. \tag{1.3.3}$$
(iv) For $Y$ $(p \times q)$, $X$ $(p \times q)$, $A$ $(p \times p)$, and $B$ $(q \times q)$, if $Y = AXB$, then
$$J(Y \to X) = \det(A)^q \det(B)^p. \tag{1.3.4}$$
(v) For $Y$ $(p \times p)$, $X$ $(p \times p)$ symmetric and $A$ $(p \times p)$, if $Y = AXA'$, then
$$J(Y \to X) = \det(A)^{p+1}. \tag{1.3.5}$$
(vi) For $Y$ $(p \times q)$, $X$ $(p \times q)$ and a scalar $a$, if $Y = aX$, then
$$J(Y \to X) = a^{pq}, \tag{1.3.6}$$
which follows from (ii) by taking $A = aI_p$.
(vii) For lower triangular matrices $Y$ $(p \times p)$, $X$ $(p \times p)$, and $A = (a_{ij})$, if $Y = AX$, then
$$J(Y \to X) = \prod_{i=1}^{p} a_{ii}^{i}. \tag{1.3.7}$$
(viii) For upper triangular matrices $Y$ $(p \times p)$, $X$ $(p \times p)$, and $A = (a_{ij})$, if $Y = AX$, then
$$J(Y \to X) = \prod_{i=1}^{p} a_{ii}^{p-i+1}. \tag{1.3.8}$$
(ix) For lower triangular matrices $Y$ $(p \times p)$, $X$ $(p \times p)$, $A$ $(p \times p) = (a_{ij})$ and $B$ $(p \times p) = (b_{ij})$, if $Y = AXB$, then
$$J(Y \to X) = \prod_{i=1}^{p} \big(a_{ii}^{i}\, b_{ii}^{p-i+1}\big). \tag{1.3.9}$$
(x) For a symmetric matrix $Y$ $(p \times p)$ and lower triangular matrices $X$ $(p \times p)$ and $A$ $(p \times p) = (a_{ij})$, if $Y = AX' + XA'$, then
$$J(Y \to X) = 2^p \prod_{i=1}^{p} a_{ii}^{p-i+1}. \tag{1.3.10}$$
(xi) For a symmetric matrix $Y$ $(p \times p)$ and upper triangular matrices $X$ $(p \times p)$
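and $A$ $(p \times p) = (a_{ij})$, if $Y = AX' + XA'$, then
$$J(Y \to X) = 2^p \prod_{i=1}^{p} a_{ii}^{i}. \tag{1.3.11}$$

INVERSE TRANSFORMATIONS

(xii) For nonsingular matrices $Y$ $(p \times p)$ and $X$ $(p \times p)$, if $Y = X^{-1}$, then
$$J(Y \to X) = \det(X)^{-2p}. \tag{1.3.12}$$
(xiii) For nonsingular symmetric matrices $Y$ $(p \times p)$ and $X$ $(p \times p)$, if $Y = X^{-1}$, then
$$J(Y \to X) = \det(X)^{-(p+1)}. \tag{1.3.13}$$

QUADRATIC TRANSFORMATIONS

(xiv) For a symmetric positive definite matrix $Y$ $(p \times p)$ and a lower triangular matrix $T$ $(p \times p) = (t_{ij})$, if $Y = TT'$, then
$$J(Y \to T) = 2^p \prod_{i=1}^{p} t_{ii}^{p-i+1}, \tag{1.3.14}$$
and if $Y = TAT'$, then
$$J(Y \to T) = 2^p \prod_{i=1}^{p} \big(t_{ii}^{p-i+1} \det(A_{(i)})\big), \tag{1.3.15}$$
where $A$ $(p \times p) = (a_{ij})$ is nonsingular and $A_{(i)} = (a_{jk})$, $j, k = 1, \ldots, i$.
(xv) For a symmetric positive definite matrix $Y$ $(p \times p)$ and an upper triangular matrix $T$ $(p \times p) = (t_{ij})$, if $Y = TT'$, then
$$J(Y \to T) = 2^p \prod_{i=1}^{p} t_{ii}^{i}, \tag{1.3.16}$$
and if $Y = TAT'$, then
$$J(Y \to T) = 2^p \prod_{i=1}^{p} \big(t_{ii}^{i} \det(A_{[i]})\big), \tag{1.3.17}$$
where $A$ $(p \times p) = (a_{ij})$ is nonsingular and $A_{[i]} = (a_{jk})$, $j, k = p - i + 1, \ldots, p$.
(xvi) For symmetric positive definite matrices $Y$ $(p \times p)$, $X$ $(p \times p)$, $U$ $(p \times p)$ and $V$ $(p \times p)$, if $U = X + Y$ and $V = U^{-\frac{1}{2}}Y(U^{-\frac{1}{2}})'$ where $U = U^{\frac{1}{2}}(U^{\frac{1}{2}})'$, then
$$J(X, Y \to U, V) = \det(U)^{\frac{1}{2}(p+1)}. \tag{1.3.18}$$
(xvii) For symmetric positive definite matrices $Y$ $(p \times p)$ and $X$ $(p \times p)$, if $Y = (I_p + X)^{-\frac{1}{2}}X((I_p + X)^{-\frac{1}{2}})'$ where $(I_p + X) = (I_p + X)^{\frac{1}{2}}((I_p + X)^{\frac{1}{2}})'$,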
then
$$J(Y\to X)=\det(I_p+X)^{-(p+1)}. \tag{1.3.19}$$

(xviii) For symmetric positive definite matrices $Y\,(p\times p)$ and $X\,(p\times p)$, if $Y = X^{2}$, then Olkin and Rubin (1964) showed that
$$J(Y\to X)=\prod_{i\le j}(\delta_i+\delta_j)=h(\delta_1,\ldots,\delta_p), \tag{1.3.20}$$
where $\delta_1,\ldots,\delta_p$ are the eigenvalues of the matrix $X$. For $p = 2, 3$ and $4$, (1.3.20) simplifies to
$$h(\delta_1,\delta_2)=2^{2}a_2a_1,$$
$$h(\delta_1,\delta_2,\delta_3)=2^{3}a_3(a_1a_2-a_3),$$
$$h(\delta_1,\delta_2,\delta_3,\delta_4)=2^{4}a_4(a_1a_2a_3-a_3^{2}-a_1^{2}a_4),$$
where $a_k$ is the $k$th elementary symmetric function of $\delta_1,\ldots,\delta_p$. Further, $a_k=\operatorname{tr}_k(X)$, where $\operatorname{tr}_k(X)$ is the sum of all $k$th order principal minors of $X$.

(xix) For symmetric positive definite matrices $Y\,(p\times p)$, $X\,(p\times p)$ and a symmetric matrix $B\,(p\times p)$, if $Y = XBX$, then (Olkin and Rubin, 1964)
$$J(Y\to X)=\prod_{i\le j}(\lambda_i+\lambda_j), \tag{1.3.21}$$
where $\lambda_1,\ldots,\lambda_p$ are the eigenvalues of $B^{\frac12}XB^{\frac12}$.

ORTHOGONAL TRANSFORMATIONS

Let the rank of the matrix $X\,(p\times n)$ be $p\ (\le n)$. Then $X$ has the unique representation $X = TH_1$, where $T\,(p\times p)$ is a lower triangular matrix with positive diagonal elements and $H_1\,(p\times n)$ is a semiorthogonal matrix, i.e., $H_1H_1' = I_p$. Here the matrix $X$ has $np$ variables, $T$ has $\frac12 p(p+1)$ variables, and the semiorthogonal matrix $H_1$ has $np-\frac12 p(p+1)$ functionally independent variables due to the restriction $H_1H_1' = I_p$. To obtain the Jacobian $J(X\to T,H_1)$, consider the differential form
$$(dX)=(d(TH_1))=(dT)H_1+T(dH_1). \tag{1.3.22}$$
Then $J(X\to T,H_1)=J((dX)\to(dT),(dH_1))$. Let
$$H=\begin{pmatrix}H_1\\ H_2\end{pmatrix}\begin{matrix}p\\ n-p\end{matrix}$$
be an orthogonal matrix. Then $H_1H'=(I_p\ \ 0)$ and
$$(d(H_1H'))=(0\ \ 0)=(dH_1)H'+H_1(dH_1'\ \ dH_2').$$
Also, let $R_1=(dH_1)H_1'$ and $R_2=(dH_1)H_2'$. Then it is easy to see that $R_1\,(p\times p)$ is skew-symmetric. Post-multiplying the differential form (1.3.22) by $H'$ we get
$$(dX)H'=(dT)H_1H'+T(dH_1)H'$$
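$$\phantom{(dX)H'}=(dT)(I_p\ \ 0)+T(R_1\ \ R_2)=(W_1\ \ W_2),$$
where $W_1=(dT)+TR_1=(w^{1}_{ij})$ and $W_2=TR_2$. Thus
$$J(X\to T,H_1)=J((dX)\to(dT),(dH_1))=J((dX)\to W)\,J(W_1\to(dT),R_1)\,J(W_2\to R_2)\,J(R_1,R_2\to(dH_1)).$$
Now $J((dX)\to W)=\operatorname{mod}\det(H')^{p}=1$, $J(W_2\to R_2)=\det(T)^{n-p}$, and
$$J(R_1,R_2\to(dH_1))=g_{n,p}(H_1)\quad\text{(say)}.$$
Then
$$J(X\to T,H_1)=J(W_1\to(dT),R_1)\,\det(T)^{n-p}\,g_{n,p}(H_1). \tag{1.3.23}$$
Further, if
$$(dT)=\begin{pmatrix}dt_{11}&0&\cdots&0\\ dt_{21}&dt_{22}&\cdots&0\\ \vdots&\vdots&&\vdots\\ dt_{p1}&dt_{p2}&\cdots&dt_{pp}\end{pmatrix}\quad\text{and}\quad R_1=\begin{pmatrix}0&r_{12}&\cdots&r_{1p}\\ -r_{12}&0&\cdots&r_{2p}\\ \vdots&\vdots&&\vdots\\ -r_{1p}&-r_{2p}&\cdots&0\end{pmatrix},$$
then
$$w^{1}_{ij}=\begin{cases}dt_{ij}+(TR_1)_{ij}, & i\ge j,\\ (TR_1)_{ij}, & i<j,\end{cases}$$
and
$$J(W_1\to(dT),R_1)=t_{11}^{\,p-1}t_{22}^{\,p-2}\cdots t_{p-1,p-1}=\prod_{i=1}^{p}t_{ii}^{\,p-i}. \tag{1.3.24}$$
Substituting from (1.3.24) in (1.3.23), we finally get
$$J(X\to T,H_1)=\prod_{i=1}^{p}t_{ii}^{\,n-i}\,g_{n,p}(H_1), \tag{1.3.25}$$
where
$$g_{n,p}(H_1)=J\bigl((dH_1)(H_1'\ \ H_2')\to(dH_1)\bigr). \tag{1.3.26}$$
Here $g_{n,p}(H_1)\,dH_1$ defines the invariant measure on the Stiefel manifold $O(p,n)$,
$$O(p,n)=\{H_1\,(p\times n): H_1H_1'=I_p\}, \tag{1.3.27}$$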
and is denoted by $[(dH_1)H_1']$. For $p = n$, the Stiefel manifold reduces to the orthogonal group, $O(p,p)=O(p)=\{H\,(p\times p): HH'=I_p\}$, and the invariant measure on $O(p)$ is $[(dH)H']$. In the next section it is shown that
$$\int_{O(p,n)}[(dH_1)H_1']=\frac{2^{p}\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}=\operatorname{Vol}(O(p,n)). \tag{1.3.28}$$
For $p = n$,
$$\operatorname{Vol}(O(p))=\frac{2^{p}\pi^{\frac12 p^{2}}}{\Gamma_p(\frac12 p)}.$$
Dividing $[(dH_1)H_1']$ by the volume (or the surface area) of the Stiefel manifold, we obtain the unit invariant measure,
$$[dH_1]=\frac{[(dH_1)H_1']}{\operatorname{Vol}(O(p,n))}. \tag{1.3.29}$$
Thus, the measure (1.3.29) is the normalized surface area of the $np-\frac12 p(p+1)$ dimensional surface in the $np$-space defined by (1.3.27). From here the density of $H_1$ for $n\ge p$ is obtained as
$$\frac{\Gamma_p(\frac12 n)}{2^{p}\pi^{\frac12 np}}\,g_{n,p}(H_1),\qquad H_1H_1'=I_p.$$
For $p = n$, we have
$$[dH]=\frac{[(dH)H']}{\operatorname{Vol}(O(p))},$$
which is known as the unit invariant Haar measure on the orthogonal group $O(p)$.

Partition $H_1\,(p\times n)$ as
$$H_1=\begin{pmatrix}V\\ V_1\end{pmatrix}\begin{matrix}q\\ p-q\end{matrix}.$$
Now choose $Z\,((p-q)\times(n-q))$ and $G(V)\,((n-q)\times n)$ such that $ZZ'=I_{p-q}$, $\begin{pmatrix}V\\ G(V)\end{pmatrix}$ is orthogonal, $V_1=ZG(V)$, and the relationship between $Z\in O(p-q,n-q)$ and $V_1\in O(p-q,n)$ is one-to-one. Then the unit invariant measure $[dH_1]$ can be decomposed as the product
$$[dH_1]=[dV][dZ], \tag{1.3.30}$$
where $[dV]$ and $[dZ]$ are unit invariant measures on $O(q,n)$ and $O(p-q,n-q)$ respectively. The decomposition (1.3.30) was derived by Chikuse (1990a). She has also given the sequential decomposition of $[dH_1]$ into the product of several invariant measures. For further details the reader is referred to James (1954), Herz (1955), Farrell (1985), Muirhead (1982), and Chikuse (1990a, 1990b).
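The Jacobians of the linear transformations listed in this section can be spot-checked numerically, since each one is the absolute determinant of a linear map on the functionally independent elements. The sketch below is only an illustration, not part of the original text: it assumes NumPy, and the helper names `lower_index_pairs` and `abs_jacobian` are hypothetical. It checks (1.3.5), (1.3.7) and (1.3.10) for one random choice of $A$.

```python
import numpy as np

def lower_index_pairs(p):
    """Positions (i, j), i >= j: the free elements of a symmetric or lower triangular matrix."""
    return [(i, j) for j in range(p) for i in range(j, p)]

def abs_jacobian(linear_map, p, symmetric):
    """|det| of a linear map acting on the free elements listed by lower_index_pairs.

    symmetric=True  : the argument X is symmetric (basis element has x_ij = x_ji = 1)
    symmetric=False : the argument X is lower triangular (basis element has x_ij = 1)
    """
    idx = lower_index_pairs(p)
    M = np.zeros((len(idx), len(idx)))
    for col, (i, j) in enumerate(idx):
        E = np.zeros((p, p))
        E[i, j] = 1.0
        if symmetric:
            E[j, i] = 1.0
        Y = linear_map(E)
        M[:, col] = [Y[r, c] for (r, c) in idx]
    return abs(np.linalg.det(M))

rng = np.random.default_rng(0)
p = 4
A = rng.standard_normal((p, p))          # general square matrix for (1.3.5)
L = np.tril(rng.standard_normal((p, p))) # lower triangular matrix for (1.3.7), (1.3.10)
d = np.abs(np.diag(L))
i = np.arange(1, p + 1)                  # 1-based index, as in the text

# (1.3.5): Y = A X A', X symmetric
jac_5 = abs_jacobian(lambda X: A @ X @ A.T, p, symmetric=True)
ref_5 = abs(np.linalg.det(A)) ** (p + 1)

# (1.3.7): Y = L X, X lower triangular
jac_7 = abs_jacobian(lambda X: L @ X, p, symmetric=False)
ref_7 = np.prod(d ** i)

# (1.3.10): Y = L X' + X L', X lower triangular, Y symmetric
jac_10 = abs_jacobian(lambda X: L @ X.T + X @ L.T, p, symmetric=False)
ref_10 = 2.0 ** p * np.prod(d ** (p - i + 1))

print(jac_5, ref_5)
print(jac_7, ref_7)
print(jac_10, ref_10)
```

Each printed pair should agree up to rounding error; because "mod" is suppressed in the text, the comparison is made on absolute values.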
1.4. INTEGRATION

Integrals involving functions of matrix arguments are frequently used in this book. In this section, we study such integrals. Let $f(X)$ be a scalar function of the matrix $X$. Then
$$\int_{R} f(X)\,dX$$
is defined as the iterated integral of $f(X)$ with respect to each element of $X$ separately over a region $R$ in the space defined by the simplex bounding the ranges of the elements of $X$. Evaluation of these integrals is facilitated by the use of the Laplace transform, discussed in detail in Herz (1955).

DEFINITION 1.4.1. Let $f(A)$ be a function of $A\,(p\times p)>0$ and $Z = X+\iota Y$, $\iota=\sqrt{-1}$, be a $p\times p$ complex symmetric matrix. Then, the Laplace transform $g(Z)$ of $f(A)$ is defined as
$$g(Z)=\int_{A>0}\operatorname{etr}(-ZA)\,f(A)\,dA,$$
where the integral is assumed to be absolutely convergent in the right half plane $\operatorname{Re}(Z)=X>X_0>0$.

The Laplace transform $g(Z)$ of $f(A)$ defined above is an analytic function of $Z$ in the right half plane $\operatorname{Re}(Z)=X>X_0>0$. In addition, if
$$\int_{-\infty<Y=Y'<\infty}|g(X+\iota Y)|\,dY<\infty \tag{1.4.1}$$
for all $X>X_0>0$, and
$$\lim_{X\to\infty}\int_{-\infty<Y=Y'<\infty}|g(X+\iota Y)|\,dY=0, \tag{1.4.2}$$
then the unique inverse Laplace transform $f(A)$ of $g(Z)$ is
$$f(A)=\frac{2^{\frac12 p(p-1)}}{(2\pi\iota)^{\frac12 p(p+1)}}\int_{\operatorname{Re}(Z)=X>X_0>0}\operatorname{etr}(ZA)\,g(Z)\,dZ. \tag{1.4.3}$$

An important property of the Laplace transform is the convolution result. If $g_1$ and $g_2$ are the respective Laplace transforms of $f_1$ and $f_2$, then $g_1g_2$ is the Laplace transform of $f_3$, where
$$f_3(B)=\int_{0<A<B}f_1(B-A)\,f_2(A)\,dA. \tag{1.4.4}$$

Some integrals useful in the matrix variate distribution theory are now given.

DEFINITION 1.4.2. The multivariate gamma function, denoted by $\Gamma_p(a)$, is defined as
$$\Gamma_p(a)=\int_{A>0}\operatorname{etr}(-A)\det(A)^{a-\frac12(p+1)}\,dA, \tag{1.4.5}$$
where $\operatorname{Re}(a)>\frac12(p-1)$, and the integral is over the space of $p\times p$ symmetric positive definite matrices.
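The multivariate gamma function $\Gamma_p(a)$ can be expressed as a product of ordinary gamma functions, as given in the following theorem.

THEOREM 1.4.1. For $\operatorname{Re}(a)>\frac12(p-1)$,
$$\Gamma_p(a)=\pi^{\frac14 p(p-1)}\prod_{i=1}^{p}\Gamma\bigl[a-\tfrac12(i-1)\bigr].$$

Proof: By definition
$$\Gamma_p(a)=\int_{A>0}\operatorname{etr}(-A)\det(A)^{a-\frac12(p+1)}\,dA.$$
Substitute $A = TT'$, where $T$ is a lower triangular matrix with $t_{ii}>0$, $i=1,\ldots,p$. Then $\operatorname{tr}(A)=\operatorname{tr}(TT')=\sum_{j\le i}t_{ij}^{2}$, $\det(A)=\det(TT')=\det(T)^{2}=\prod_{i=1}^{p}t_{ii}^{2}$, and from (1.3.14)
$$J(A\to T)=2^{p}\prod_{i=1}^{p}t_{ii}^{\,p-i+1}.$$
Hence,
$$\Gamma_p(a)=2^{p}\int\cdots\int_{\substack{-\infty<t_{ij}<\infty,\ j<i\\ t_{ii}>0}}\prod_{i=1}^{p}(t_{ii}^{2})^{a-\frac12(p+1)}\exp\Bigl(-\sum_{j\le i}t_{ij}^{2}\Bigr)\prod_{i=1}^{p}t_{ii}^{\,p-i+1}\prod_{j\le i}dt_{ij}$$
$$=\Bigl[\prod_{j<i}\int_{-\infty}^{\infty}\exp(-t_{ij}^{2})\,dt_{ij}\Bigr]\Bigl[\prod_{i=1}^{p}2\int_{0}^{\infty}\exp(-t_{ii}^{2})\,(t_{ii}^{2})^{a-\frac12 i}\,dt_{ii}\Bigr]$$
$$=\pi^{\frac14 p(p-1)}\prod_{i=1}^{p}\Gamma\bigl[a-\tfrac12(i-1)\bigr].\ \blacksquare$$

A particular Laplace transform which is quite useful is
$$\int_{A>0}\operatorname{etr}(-AZ)\det(A)^{a-\frac12(p+1)}\,dA=\Gamma_p(a)\det(Z)^{-a}. \tag{1.4.6}$$
Herz (1955) proved that the above integral is absolutely convergent for $\operatorname{Re}(Z)>0$ and $\operatorname{Re}(a)>\frac12(p-1)$. Hence, for real $Z>0$, substituting $\Lambda = Z^{\frac12}AZ^{\frac12}$ with the Jacobian $J(A\to\Lambda)=\det(Z)^{-\frac12(p+1)}$ in the above integral, we get
$$\int_{A>0}\operatorname{etr}(-AZ)\det(A)^{a-\frac12(p+1)}\,dA=\det(Z)^{-a}\int_{\Lambda>0}\operatorname{etr}(-\Lambda)\det(\Lambda)^{a-\frac12(p+1)}\,d\Lambda=\Gamma_p(a)\det(Z)^{-a}.$$
This proves (1.4.6) for real $Z$. It follows for complex $Z$ by analytic continuation since, for $\operatorname{Re}(Z)>0$, $\det(Z)\ne 0$ and $\det(Z)^{a}$ is well defined by continuation.

Using the inversion formula (1.4.3), after verifying the conditions (1.4.1) and (1.4.2), Herz (1955) gave the inversion of (1.4.6) as
$$\frac{2^{\frac12 p(p-1)}}{(2\pi\iota)^{\frac12 p(p+1)}}\int_{\operatorname{Re}(Z)=X>X_0>0}\operatorname{etr}(ZA)\det(Z)^{-a-\frac12(p+1)}\,dZ=\frac{\det(A)^{a}}{\Gamma_p\bigl[a+\frac12(p+1)\bigr]},\qquad A>0.$$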
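The product formula of Theorem 1.4.1 is straightforward to evaluate on the log scale. The following sketch is an illustration only: it assumes real $a$, uses SciPy, and the function name `log_multivariate_gamma` is hypothetical. It compares the formula with SciPy's own `multigammaln`, which computes the logarithm of the same function.

```python
import numpy as np
from scipy.special import gammaln, multigammaln

def log_multivariate_gamma(a, p):
    """log Gamma_p(a) via Theorem 1.4.1, for real a > (p - 1) / 2."""
    i = np.arange(1, p + 1)
    return 0.25 * p * (p - 1) * np.log(np.pi) + np.sum(gammaln(a - 0.5 * (i - 1)))

a, p = 3.7, 4
from_theorem = log_multivariate_gamma(a, p)
from_scipy = multigammaln(a, p)      # SciPy's log multivariate gamma
scalar_case = log_multivariate_gamma(a, 1)

print(from_theorem, from_scipy)
```

For $p = 1$ the product collapses to the ordinary gamma function, which gives a second quick sanity check.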
DEFINITION 1.4.3. The multivariate beta function, denoted by $\beta_p(a,b)$, is defined by
$$\beta_p(a,b)=\int_{0<A<I_p}\det(A)^{a-\frac12(p+1)}\det(I_p-A)^{b-\frac12(p+1)}\,dA, \tag{1.4.7}$$
where $\operatorname{Re}(a)>\frac12(p-1)$ and $\operatorname{Re}(b)>\frac12(p-1)$.

The multivariate beta function $\beta_p(a,b)$ can be expressed in terms of multivariate gamma functions.

THEOREM 1.4.2. For $\operatorname{Re}(a)>\frac12(p-1)$ and $\operatorname{Re}(b)>\frac12(p-1)$,
$$\beta_p(a,b)=\frac{\Gamma_p(a)\Gamma_p(b)}{\Gamma_p(a+b)}=\beta_p(b,a). \tag{1.4.8}$$

Proof: We have
$$\Gamma_p(a)\Gamma_p(b)=\int_{A>0}\operatorname{etr}(-A)\det(A)^{a-\frac12(p+1)}\,dA\int_{B>0}\operatorname{etr}(-B)\det(B)^{b-\frac12(p+1)}\,dB$$
$$=\int_{A>0}\int_{B>0}\operatorname{etr}\{-(A+B)\}\det(A)^{a-\frac12(p+1)}\det(B)^{b-\frac12(p+1)}\,dA\,dB.$$
Now making the transformation $W = A+B$, $Z=(A+B)^{-\frac12}A(A+B)^{-\frac12}$ (where $(A+B)^{\frac12}$ is the symmetric square root of $A+B$) with the Jacobian $J(A,B\to Z,W)=\det(W)^{\frac12(p+1)}$, we get
$$\Gamma_p(a)\Gamma_p(b)=\int_{W>0}\operatorname{etr}(-W)\det(W)^{a+b-\frac12(p+1)}\,dW\int_{0<Z<I_p}\det(Z)^{a-\frac12(p+1)}\det(I_p-Z)^{b-\frac12(p+1)}\,dZ$$
$$=\Gamma_p(a+b)\,\beta_p(a,b).\ \blacksquare$$

Alternatively, Theorem 1.4.2 can be proved by using the convolution formula (1.4.4). Substituting $A=(I_p+B)^{-1}$ in (1.4.7) with Jacobian $J(A\to B)=J((dA)\to(dB))=\det(I_p+B)^{-(p+1)}$, we get an equivalent integral representation for the multivariate beta function as
$$\beta_p(a,b)=\int_{B>0}\det(B)^{b-\frac12(p+1)}\det(I_p+B)^{-(a+b)}\,dB. \tag{1.4.9}$$
The incomplete gamma and beta functions corresponding to (1.4.5) and (1.4.7), which are expressible in terms of hypergeometric functions, are defined in Section 1.6.

Now we generalize the multivariate beta function.
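DEFINITION 1.4.4. The multivariate Dirichlet function, denoted by $\beta_p(a_1,\ldots,a_r;b)$, is defined by
$$\beta_p(a_1,\ldots,a_r;b)=\int\cdots\int_{\substack{0<\sum_{i=1}^{r}Z_i<I_p\\ Z_i>0}}\prod_{i=1}^{r}\det(Z_i)^{a_i-\frac12(p+1)}\det\Bigl(I_p-\sum_{i=1}^{r}Z_i\Bigr)^{b-\frac12(p+1)}\prod_{i=1}^{r}dZ_i, \tag{1.4.10}$$
where $\operatorname{Re}(a_i)>\frac12(p-1)$, $i=1,\ldots,r$, and $\operatorname{Re}(b)>\frac12(p-1)$.

The relation between the multivariate Dirichlet function and the multivariate gamma function is given in the following theorem.

THEOREM 1.4.3. For $\operatorname{Re}(a_i)>\frac12(p-1)$, $i=1,\ldots,r$, and $\operatorname{Re}(b)>\frac12(p-1)$,
$$\beta_p(a_1,\ldots,a_r;b)=\frac{\prod_{i=1}^{r}\Gamma_p(a_i)\,\Gamma_p(b)}{\Gamma_p(a+b)}, \tag{1.4.11}$$
where $a=\sum_{i=1}^{r}a_i$.

Proof: First consider the integral
$$\phi(Z)=\int\cdots\int_{Z_i>0}\prod_{i=1}^{r}\det(Z_i)^{a_i-\frac12(p+1)}\det\Bigl(I_p-\sum_{i=1}^{r}Z_i\Bigr)^{b-\frac12(p+1)}\prod_{i=1}^{r-1}dZ_i. \tag{1.4.12}$$
Substituting $\sum_{i=1}^{r}Z_i=Z$, $W_i=Z^{-\frac12}Z_i(Z^{-\frac12})'$, $i=1,\ldots,r-1$, where $Z^{\frac12}(Z^{\frac12})'=Z$, in (1.4.12) with Jacobian
$$J(Z_1,\ldots,Z_{r-1},Z_r\to W_1,\ldots,W_{r-1},Z)=\det(Z)^{\frac12(r-1)(p+1)},$$
we get
$$\phi(Z)=\det(Z)^{a-\frac12(p+1)}\det(I_p-Z)^{b-\frac12(p+1)}\int\cdots\int_{\substack{0<\sum_{i=1}^{r-1}W_i<I_p\\ W_i>0}}\prod_{i=1}^{r-1}\det(W_i)^{a_i-\frac12(p+1)}\det\Bigl(I_p-\sum_{i=1}^{r-1}W_i\Bigr)^{a_r-\frac12(p+1)}\prod_{i=1}^{r-1}dW_i$$
$$=\det(Z)^{a-\frac12(p+1)}\det(I_p-Z)^{b-\frac12(p+1)}\beta_p(a_1,\ldots,a_{r-1};a_r). \tag{1.4.13}$$
Now from (1.4.13) and (1.4.7) we can write
$$\beta_p(a_1,\ldots,a_r;b)=\int_{0<Z<I_p}\phi(Z)\,dZ$$
$$=\beta_p(a_1,\ldots,a_{r-1};a_r)\int_{0<Z<I_p}\det(Z)^{a-\frac12(p+1)}\det(I_p-Z)^{b-\frac12(p+1)}\,dZ$$
$$=\beta_p(a_1,\ldots,a_{r-1};a_r)\,\beta_p(a,b). \tag{1.4.14}$$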
From the recurrence relation (1.4.14) we get
$$\beta_p(a_1,\ldots,a_r;b)=\beta_p\Bigl(\sum_{i=1}^{r}a_i,b\Bigr)\beta_p\Bigl(\sum_{i=1}^{r-1}a_i,a_r\Bigr)\cdots\beta_p(a_1,a_2). \tag{1.4.15}$$
Substituting for the multivariate beta functions on the right hand side of (1.4.15) from (1.4.8), we get the result. $\blacksquare$

The following result, due to Olkin (1959), is the matrix variate analog of Liouville's extension of the Dirichlet integral.

THEOREM 1.4.4. Let $f(V)$ be a continuous scalar function of the symmetric matrix $V\,(p\times p)$. Then for $B\,(p\times p)>A\,(p\times p)>0$, $\operatorname{Re}(a_i)>\frac12(p-1)$, $i=1,\ldots,r$, and $\sum_{i=1}^{r}a_i=a$,
$$\int\cdots\int_{\substack{A<\sum_{i=1}^{r}Z_i<B\\ Z_i>0}}\prod_{i=1}^{r}\det(Z_i)^{a_i-\frac12(p+1)}f\Bigl(\sum_{i=1}^{r}Z_i\Bigr)\prod_{i=1}^{r}dZ_i$$
$$=\beta_p(a_1,a_2)\,\beta_p(a_1+a_2,a_3)\cdots\beta_p\Bigl(\sum_{i=1}^{r-1}a_i,a_r\Bigr)\int_{A<Z<B}\det(Z)^{a-\frac12(p+1)}f(Z)\,dZ.$$

Proof: Making the same transformation as in Theorem 1.4.3, the above left hand side integral becomes
$$\int\cdots\int_{\substack{0<\sum_{i=1}^{r-1}W_i<I_p\\ W_i>0}}\prod_{i=1}^{r-1}\det(W_i)^{a_i-\frac12(p+1)}\det\Bigl(I_p-\sum_{i=1}^{r-1}W_i\Bigr)^{a_r-\frac12(p+1)}\prod_{i=1}^{r-1}dW_i\int_{A<Z<B}\det(Z)^{a-\frac12(p+1)}f(Z)\,dZ$$
$$=\beta_p(a_1,\ldots,a_{r-1};a_r)\int_{A<Z<B}\det(Z)^{a-\frac12(p+1)}f(Z)\,dZ,$$
which is obtained by using (1.4.10). The desired result now follows from (1.4.15). $\blacksquare$

The following integral is useful in the theory of correlation matrices in multivariate statistical analysis.

THEOREM 1.4.5. Let $R=(r_{ij})$ with $r_{ij}=r_{ji}$, $i\ne j$, $i,j=1,\ldots,p$, and $r_{ii}=1$. Then, for $\operatorname{Re}(a)>\frac12(p-1)$, we have
$$\int_{0<R<I_p}\det(R)^{a-\frac12(p+1)}\,dR=\frac{\Gamma_p(a)}{[\Gamma(a)]^{p}},$$
where $dR=\prod_{i<j}dr_{ij}$.

Proof: We have
$$\Gamma_p(a)=\int_{A>0}\operatorname{etr}(-A)\det(A)^{a-\frac12(p+1)}\,dA. \tag{1.4.16}$$
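Making the transformation $a_{ij}=\sqrt{a_{ii}}\sqrt{a_{jj}}\,r_{ij}$, $i\ne j$, $i,j=1,\ldots,p$, and $a_{ii}=a_{ii}$, with the Jacobian
$$J(a_{11},\ldots,a_{pp},a_{12},\ldots,a_{p-1,p}\to a_{11},\ldots,a_{pp},r_{12},\ldots,r_{p-1,p})=\prod_{i=1}^{p}a_{ii}^{\frac12(p-1)},$$
in (1.4.16), we get
$$\Gamma_p(a)=\int_{0<R<I_p}\det(R)^{a-\frac12(p+1)}\,dR\ \prod_{i=1}^{p}\int_{a_{ii}>0}a_{ii}^{\,a-1}\exp(-a_{ii})\,da_{ii}.$$
The result follows since $\int_{a_{ii}>0}a_{ii}^{\,a-1}\exp(-a_{ii})\,da_{ii}=\Gamma(a)$. $\blacksquare$

Bellman (1956) generalized the multivariate gamma function as follows.

DEFINITION 1.4.5. The generalized multivariate gamma function, denoted by $\Gamma_p^{*}(a_1,\ldots,a_p)$, is defined as
$$\Gamma_p^{*}(a_1,\ldots,a_p)=\int_{A>0}\frac{\det(A)^{a_p-\frac12(p+1)}\operatorname{etr}(-A)}{\prod_{\alpha=1}^{p-1}\det(A_{[\alpha]})^{m_{\alpha+1}}}\,dA,$$
where $a_j=m_1+\cdots+m_j$ and $\operatorname{Re}(a_j)>\frac12(j-1)$, $j=1,\ldots,p$.

THEOREM 1.4.6. For $\operatorname{Re}(a_j)>\frac12(j-1)$, $j=1,\ldots,p$,
$$\Gamma_p^{*}(a_1,\ldots,a_p)=\pi^{\frac14 p(p-1)}\prod_{j=1}^{p}\Gamma\bigl[a_j-\tfrac12(j-1)\bigr].$$

Proof: By definition
$$\Gamma_p^{*}(a_1,\ldots,a_p)=\int_{A>0}\frac{\det(A)^{a_p-\frac12(p+1)}\operatorname{etr}(-A)}{\prod_{\alpha=1}^{p-1}\det(A_{[\alpha]})^{m_{\alpha+1}}}\,dA, \tag{1.4.17}$$
where $a_j=m_1+\cdots+m_j$. Let $A=TT'$, where $T$ is a lower triangular matrix with positive diagonal elements, and partition $A$ and $T$ as
$$A=\begin{pmatrix}A_{11}&A_{12}\\ A_{21}&A_{22}\end{pmatrix}\begin{matrix}\alpha\\ p-\alpha\end{matrix},\qquad T=\begin{pmatrix}T_{11}&0\\ T_{21}&T_{22}\end{pmatrix}\begin{matrix}\alpha\\ p-\alpha\end{matrix},$$
with column blocks of sizes $\alpha$ and $p-\alpha$. Now it is easy to see that $A_{[\alpha]}=A_{11}=T_{11}T_{11}'$, $\det(A_{[\alpha]})=\prod_{i=1}^{\alpha}t_{ii}^{2}$, $\det(A)=\prod_{i=1}^{p}t_{ii}^{2}$ and $\operatorname{tr}(A)=\sum_{j\le i}t_{ij}^{2}$. From (1.3.14) we have $J(A\to T)=2^{p}\prod_{i=1}^{p}t_{ii}^{\,p-i+1}$. Hence we can write (1.4.17) as
$$\Gamma_p^{*}(a_1,\ldots,a_p)=2^{p}\int\cdots\int_{\substack{-\infty<t_{ij}<\infty,\ j<i\\ t_{ii}>0}}\frac{\prod_{i=1}^{p}(t_{ii}^{2})^{a_p-\frac12(p+1)}\exp\bigl(-\sum_{j\le i}t_{ij}^{2}\bigr)}{\prod_{\alpha=1}^{p-1}\prod_{i=1}^{\alpha}(t_{ii}^{2})^{m_{\alpha+1}}}\prod_{i=1}^{p}t_{ii}^{\,p-i+1}\prod_{j\le i}dt_{ij}$$
$$=\Bigl[\prod_{j<i}\int_{-\infty<t_{ij}<\infty}\exp(-t_{ij}^{2})\,dt_{ij}\Bigr]\Bigl[\prod_{i=1}^{p}2\int_{t_{ii}>0}(t_{ii}^{2})^{a_i-\frac12 i}\exp(-t_{ii}^{2})\,dt_{ii}\Bigr].$$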
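The desired result now follows since $\int_{-\infty<t_{ij}<\infty}\exp(-t_{ij}^{2})\,dt_{ij}=\sqrt{\pi}$ and $2\int_{t_{ii}>0}(t_{ii}^{2})^{a_i-\frac12 i}\exp(-t_{ii}^{2})\,dt_{ii}=\Gamma\bigl[a_i-\frac12(i-1)\bigr]$. $\blacksquare$

THEOREM 1.4.7. For $\operatorname{Re}(a_j)>\frac12(j-1)$, $j=1,\ldots,p$, and $B>0$,
$$\int_{A>0}\frac{\det(A)^{a_p-\frac12(p+1)}\operatorname{etr}(-BA)}{\prod_{\alpha=1}^{p-1}\det(A_{[\alpha]})^{m_{\alpha+1}}}\,dA=\frac{\Gamma_p^{*}(a_1,\ldots,a_p)\det(B)^{-a_p}}{\prod_{\alpha=1}^{p-1}\det\bigl((B^{-1})_{[\alpha]}\bigr)^{m_{\alpha+1}}}, \tag{1.4.18}$$
where $a_j=m_1+\cdots+m_j$.

Proof: Since $B>0$, let $B=UU'$, where $U=(u_{ij})$ is an upper triangular matrix. Substitute $\Lambda=U'AU$; then $\operatorname{tr}(BA)=\operatorname{tr}(UU'A)=\operatorname{tr}(U'AU)=\operatorname{tr}(\Lambda)$, and $J(A\to\Lambda)=\det(U)^{-(p+1)}=\det(B)^{-\frac12(p+1)}$. Now partition $U$, $A$ and $\Lambda$ as
$$U=\begin{pmatrix}U_{11}&U_{12}\\ 0&U_{22}\end{pmatrix}\begin{matrix}\alpha\\ p-\alpha\end{matrix},\qquad A=\begin{pmatrix}A_{11}&A_{12}\\ A_{21}&A_{22}\end{pmatrix}\begin{matrix}\alpha\\ p-\alpha\end{matrix},\qquad \Lambda=\begin{pmatrix}\Lambda_{11}&\Lambda_{12}\\ \Lambda_{21}&\Lambda_{22}\end{pmatrix}\begin{matrix}\alpha\\ p-\alpha\end{matrix},$$
with column blocks of sizes $\alpha$ and $p-\alpha$. Then $\Lambda_{11}=\Lambda_{[\alpha]}=U_{11}'A_{11}U_{11}$, and
$$\det(\Lambda_{[\alpha]})=\det(U_{11}'A_{11}U_{11})=\det(A_{11})\prod_{i=1}^{\alpha}u_{ii}^{2}=\det(A_{[\alpha]})\prod_{i=1}^{\alpha}u_{ii}^{2}.$$
Thus the left hand side of (1.4.18) reduces to
$$\frac{\det(B)^{-a_p}}{\prod_{\alpha=1}^{p-1}\prod_{i=1}^{\alpha}(u_{ii}^{-2})^{m_{\alpha+1}}}\int_{\Lambda>0}\frac{\det(\Lambda)^{a_p-\frac12(p+1)}\operatorname{etr}(-\Lambda)}{\prod_{\alpha=1}^{p-1}\det(\Lambda_{[\alpha]})^{m_{\alpha+1}}}\,d\Lambda=\frac{\Gamma_p^{*}(a_1,\ldots,a_p)\det(B)^{-a_p}}{\prod_{\alpha=1}^{p-1}\prod_{i=1}^{\alpha}(u_{ii}^{-2})^{m_{\alpha+1}}}.$$
Now the result follows by noting that $\prod_{i=1}^{\alpha}u_{ii}^{-2}=\det\bigl((B^{-1})_{[\alpha]}\bigr)$. $\blacksquare$

The proofs of the above theorems are due to Olkin (1959). He has also generalized the multivariate beta integral, as given in the next theorem.

DEFINITION 1.4.6. The generalized multivariate beta function, denoted by $\beta_p^{*}(a_1,\ldots,a_p;b_1,\ldots,b_p)$, is defined by
$$\beta_p^{*}(a_1,\ldots,a_p;b_1,\ldots,b_p)=\int_{0<A<I_p}\frac{\det(A)^{a_p-\frac12(p+1)}\det(I_p-A)^{b_p-\frac12(p+1)}}{\prod_{\alpha=1}^{p-1}\bigl\{\det(A_{[\alpha]})^{m_{\alpha+1}}\det\bigl((I_p-A)_{[\alpha]}\bigr)^{k_{\alpha+1}}\bigr\}}\,dA,$$
where $a_j=\sum_{i=1}^{j}m_i$, $b_j=\sum_{i=1}^{j}k_i$, $\operatorname{Re}(a_j)>\frac12(j-1)$, and $\operatorname{Re}(b_j)>\frac12(j-1)$, $j=1,\ldots,p$.

THEOREM 1.4.8. For $\operatorname{Re}(a_j)>\frac12(j-1)$ and $\operatorname{Re}(b_j)>\frac12(j-1)$, $j=1,\ldots,p$,
$$\beta_p^{*}(a_1,\ldots,a_p;b_1,\ldots,b_p)=\frac{\Gamma_p^{*}(a_1,\ldots,a_p)\,\Gamma_p^{*}(b_1,\ldots,b_p)}{\Gamma_p^{*}(a_1+b_1,\ldots,a_p+b_p)}.$$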
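Proof: We have, by Definition 1.4.5,
$$\Gamma_p^{*}(a_1,\ldots,a_p)\,\Gamma_p^{*}(b_1,\ldots,b_p)=\int_{A>0}\int_{B>0}\frac{\det(A)^{a_p-\frac12(p+1)}\det(B)^{b_p-\frac12(p+1)}\operatorname{etr}(-A-B)}{\prod_{\alpha=1}^{p-1}\bigl\{\det(A_{[\alpha]})^{m_{\alpha+1}}\det(B_{[\alpha]})^{k_{\alpha+1}}\bigr\}}\,dA\,dB, \tag{1.4.19}$$
where $a_j=\sum_{i=1}^{j}m_i$, $b_j=\sum_{i=1}^{j}k_i$, $\operatorname{Re}(a_j)>\frac12(j-1)$, and $\operatorname{Re}(b_j)>\frac12(j-1)$, $j=1,\ldots,p$. Now, let $A+B=TT'$, where $T$ is a lower triangular matrix with positive diagonal elements, and $W=T^{-1}A(T')^{-1}$. Then $A=TWT'$, $B=T(I_p-W)T'$, $\det(B_{[\alpha]})=\det\bigl((I_p-W)_{[\alpha]}\bigr)\prod_{i=1}^{\alpha}t_{ii}^{2}$, $\det(A_{[\alpha]})=\det(W_{[\alpha]})\prod_{i=1}^{\alpha}t_{ii}^{2}$, and $\operatorname{tr}(A+B)=\sum_{j\le i}t_{ij}^{2}$. The Jacobian of the transformation, from (1.3.14) and (1.3.18), is given by $J(A,B\to W,T)=2^{p}\prod_{i=1}^{p}t_{ii}^{\,2p-i+2}$. Hence (1.4.19) can be written as
$$\Gamma_p^{*}(a_1,\ldots,a_p)\,\Gamma_p^{*}(b_1,\ldots,b_p)=\int_{0<W<I_p}\frac{\det(W)^{a_p-\frac12(p+1)}\det(I_p-W)^{b_p-\frac12(p+1)}}{\prod_{\alpha=1}^{p-1}\bigl\{\det(W_{[\alpha]})^{m_{\alpha+1}}\det\bigl((I_p-W)_{[\alpha]}\bigr)^{k_{\alpha+1}}\bigr\}}\,dW$$
$$\times\ 2^{p}\int\cdots\int_{\substack{-\infty<t_{ij}<\infty,\ j<i\\ t_{ii}>0}}\frac{\prod_{i=1}^{p}(t_{ii}^{2})^{a_p+b_p-\frac12 i}\exp\bigl(-\sum_{j\le i}t_{ij}^{2}\bigr)}{\prod_{\alpha=1}^{p-1}\prod_{i=1}^{\alpha}(t_{ii}^{2})^{m_{\alpha+1}+k_{\alpha+1}}}\prod_{j\le i}dt_{ij}$$
$$=\beta_p^{*}(a_1,\ldots,a_p;b_1,\ldots,b_p)\,\Gamma_p^{*}(a_1+b_1,\ldots,a_p+b_p).$$
The last equality follows from Definition 1.4.6 and Theorem 1.4.6. $\blacksquare$

In matrix variate distribution theory quite often we transform
$$Y=TH_1, \tag{1.4.20}$$
where $Y\,(p\times n)$ has rank $p\le n$, $T\,(p\times p)=(t_{ij})$ is a lower triangular matrix with $t_{ii}>0$, $i=1,\ldots,p$, and $H_1\,(p\times n)$ is a semiorthogonal matrix, $H_1H_1'=I_p$. The Jacobian of this transformation, given in (1.3.25), is
$$J(Y\to T,H_1)=g_{n,p}(H_1)\prod_{i=1}^{p}t_{ii}^{\,n-i}, \tag{1.4.21}$$
where $g_{n,p}(H_1)\,dH_1$ defines the invariant measure on the Stiefel manifold $O(p,n)$. The following integral, which involves $g_{n,p}(H_1)$, is used to derive the distribution of certain transformed matrices.

THEOREM 1.4.9. For $n\ge p$,
$$\int_{H_1H_1'=I_p}g_{n,p}(H_1)\,dH_1=\frac{2^{p}\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}.$$

Proof: Let $y_{ij}$, $i=1,\ldots,p$, $j=1,\ldots,n$, be $np$ $(p\le n)$ independent standard normal variates. The joint density of these variates is
$$(2\pi)^{-\frac12 np}\exp\Bigl\{-\frac12\sum_{i=1}^{p}\sum_{j=1}^{n}y_{ij}^{2}\Bigr\},\qquad -\infty<y_{ij}<\infty. \tag{1.4.22}$$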
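Define
$$Y=\begin{pmatrix}y_{11}&\cdots&y_{1n}\\ \vdots&&\vdots\\ y_{p1}&\cdots&y_{pn}\end{pmatrix}.$$
Then the density (1.4.22) in matrix notation is
$$(2\pi)^{-\frac12 np}\operatorname{etr}\bigl(-\tfrac12 YY'\bigr),\qquad Y\in\mathbb{R}^{p\times n}.$$
Since (1.4.22) is a density function, we have
$$(2\pi)^{-\frac12 np}\int_{Y\in\mathbb{R}^{p\times n}}\operatorname{etr}\bigl(-\tfrac12 YY'\bigr)\,dY=1. \tag{1.4.23}$$
Making the transformation (1.4.20) with Jacobian (1.4.21) in (1.4.23) and noting that
$$\operatorname{tr}(YY')=\operatorname{tr}(TT')=\sum_{j\le i}t_{ij}^{2},$$
we have
$$1=(2\pi)^{-\frac12 np}\int\cdots\int_{\substack{-\infty<t_{ij}<\infty,\ j<i\\ t_{ii}>0}}\int_{H_1H_1'=I_p}\prod_{i=1}^{p}t_{ii}^{\,n-i}\exp\Bigl(-\frac12\sum_{j\le i}t_{ij}^{2}\Bigr)g_{n,p}(H_1)\,dH_1\prod_{j\le i}dt_{ij}$$
$$=(2\pi)^{-\frac12 np}\Bigl[\prod_{j<i}\int_{-\infty<t_{ij}<\infty}\exp\bigl(-\tfrac12 t_{ij}^{2}\bigr)\,dt_{ij}\Bigr]\Bigl[\prod_{i=1}^{p}\int_{t_{ii}>0}t_{ii}^{\,n-i}\exp\bigl(-\tfrac12 t_{ii}^{2}\bigr)\,dt_{ii}\Bigr]\int_{H_1H_1'=I_p}g_{n,p}(H_1)\,dH_1.$$
Now using
$$\int_{-\infty<t_{ij}<\infty}\exp\bigl(-\tfrac12 t_{ij}^{2}\bigr)\,dt_{ij}=(2\pi)^{\frac12}$$
and
$$\int_{t_{ii}>0}t_{ii}^{\,n-i}\exp\bigl(-\tfrac12 t_{ii}^{2}\bigr)\,dt_{ii}=2^{\frac12(n-i-1)}\,\Gamma\bigl[\tfrac12(n-i+1)\bigr],$$
and Theorem 1.4.1, the result follows. $\blacksquare$

Since $g_{n,p}(H_1)\,dH_1$ defines the invariant measure on $O(p,n)$, the above theorem gives the surface area or volume of the Stiefel manifold $O(p,n)$. That is,
$$\operatorname{Vol}(O(p,n))=\int_{O(p,n)}[(dH_1)H_1']=\int_{H_1H_1'=I_p}g_{n,p}(H_1)\,dH_1=\frac{2^{p}\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}.$$

The following theorem (Hsu, 1940) is useful in deriving the distribution of quadratic forms.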
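THEOREM 1.4.10. Let $Y\,(p\times n)$ be of rank $p\ (\le n)$ and $f(\cdot)$ be a function of $Y$ which depends on $Y$ through $YY'$ only. Then,
$$\int_{YY'=A}f(YY')\,dY=\frac{\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\det(A)^{\frac12(n-p-1)}f(A). \tag{1.4.24}$$

Proof: Note that
$$\int_{YY'=A}f(YY')\,dY=f(A)\int_{YY'=A}dY.$$
Now transform $Y=TH_1$ and $A=TT'$, where $T$ is a lower triangular matrix with positive diagonal elements, and $H_1\,(p\times n)$ is a semiorthogonal matrix. The Jacobian of this transformation is
$$J(Y\to A,H_1)=J(Y\to T,H_1)\,J(T\to A)=\prod_{i=1}^{p}t_{ii}^{\,n-i}\,g_{n,p}(H_1)\Bigl\{2^{p}\prod_{i=1}^{p}t_{ii}^{\,p-i+1}\Bigr\}^{-1}=2^{-p}\det(A)^{\frac12(n-p-1)}g_{n,p}(H_1).$$
Thus, we have
$$\int_{YY'=A}dY=2^{-p}\det(A)^{\frac12(n-p-1)}\int_{H_1H_1'=I_p}g_{n,p}(H_1)\,dH_1=\frac{\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\det(A)^{\frac12(n-p-1)}. \tag{1.4.25}$$
This last step is derived by using Theorem 1.4.9. $\blacksquare$

THEOREM 1.4.11. For $Y\,(p\times n)$ and $\operatorname{Re}(m)>n+p-1$,
$$\int_{Y\in\mathbb{R}^{p\times n}}\det(I_p+YY')^{-\frac12 m}\,dY=\frac{\pi^{\frac12 np}\,\Gamma_p\bigl[\frac12(m-n)\bigr]}{\Gamma_p(\frac12 m)}.$$

Proof: We first prove the theorem when $\operatorname{rank}(Y)=p\le n$. In this case, using Theorem 1.4.10, we get
$$\int_{Y\in\mathbb{R}^{p\times n}}\det(I_p+YY')^{-\frac12 m}\,dY=\int_{A>0}\int_{YY'=A}\det(I_p+YY')^{-\frac12 m}\,dY\,dA$$
$$=\frac{\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\int_{A>0}\det(A)^{\frac12(n-p-1)}\det(I_p+A)^{-\frac12 m}\,dA$$
$$=\frac{\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\,\beta_p\bigl(\tfrac12 n,\tfrac12(m-n)\bigr). \tag{1.4.26}$$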
The last step follows from (1.4.9). Further simplification of the beta function in (1.4.26), using Theorem 1.4.2, gives the desired result in this case. For the case $\operatorname{rank}(Y)=n<p$, writing $\det(I_p+YY')=\det(I_n+Y'Y)$ and applying the above result, we get
$$\int_{Y\in\mathbb{R}^{p\times n}}\det(I_p+YY')^{-\frac12 m}\,dY=\int_{Y\in\mathbb{R}^{p\times n}}\det(I_n+Y'Y)^{-\frac12 m}\,dY=\frac{\pi^{\frac12 np}\,\Gamma_n\bigl[\frac12(m-p)\bigr]}{\Gamma_n(\frac12 m)}=\frac{\pi^{\frac12 np}\,\Gamma_p\bigl[\frac12(m-n)\bigr]}{\Gamma_p(\frac12 m)}.\ \blacksquare$$

Another useful result on integration is a generalization of Sverdrup's lemma (Sverdrup, 1947) by Kabe (1965) and Khatri (1965).

THEOREM 1.4.12. Let $Y\,(p\times n)$ be of rank $p\le n$, $D\,(q\times n)$ be of rank $q\le n$ and $C\,(n\times n)$ be a symmetric positive definite matrix. Then for $p+q\le n$,
$$\int_{\substack{YC^{-1}Y'=A\\ DY'=B'}}f(YC^{-1}Y',DY')\,dY=\frac{\pi^{\frac12 p(n-q)}}{\Gamma_p\bigl[\frac12(n-q)\bigr]}\det(C)^{\frac12 p}\det(DCD')^{-\frac12 p}\det\bigl(A-B(DCD')^{-1}B'\bigr)^{\frac12(n-p-q-1)}f(A,B').$$

Proof: Let $C^{\frac12}$ be the unique symmetric positive definite root of $C$. Since $DC^{\frac12}$ is of rank $q$, we can find a matrix $L\,((n-q)\times n)$ such that
$$G=\begin{pmatrix}DC^{\frac12}\\ L\end{pmatrix}\quad\text{and}\quad GG'=\begin{pmatrix}DCD'&0\\ 0&I_{n-q}\end{pmatrix}.$$
Now let
$$Y=X(G^{-1})'C^{\frac12}=(X_1\ \ X_2)(G^{-1})'C^{\frac12}, \tag{1.4.27}$$
where $X_1$ and $X_2$ are of order $p\times q$ and $p\times(n-q)$ respectively. The Jacobian of the transformation is $J(Y\to X)=\det(DCD')^{-\frac12 p}\det(C)^{\frac12 p}$. From (1.4.27) we get
$$YC^{-1}Y'=X_1(DCD')^{-1}X_1'+X_2X_2' \tag{1.4.28}$$
and
$$X_1=YD'=B. \tag{1.4.29}$$
Substituting (1.4.29) in (1.4.28) we get
$$YC^{-1}Y'=B(DCD')^{-1}B'+X_2X_2', \tag{1.4.30}$$
and
$$\int_{\substack{YC^{-1}Y'=A\\ DY'=B'}}f(YC^{-1}Y',DY')\,dY=\det(C)^{\frac12 p}\det(DCD')^{-\frac12 p}\int_{X_2X_2'=V}f_1(X_2X_2')\,dX_2, \tag{1.4.31}$$
where $V=A-B(DCD')^{-1}B'$ and
$$f_1(X_2X_2')=f\bigl(X_2X_2'+B(DCD')^{-1}B',\,B'\bigr).$$
Finally, using Theorem 1.4.10 to evaluate the integral in (1.4.31), we get
$$\int_{X_2X_2'=V}f_1(X_2X_2')\,dX_2=\frac{\pi^{\frac12 p(n-q)}}{\Gamma_p\bigl[\frac12(n-q)\bigr]}\det(V)^{\frac12(n-p-q-1)}f_1(V). \tag{1.4.32}$$
Substituting for $V$ in (1.4.32) gives the desired result. $\blacksquare$

COROLLARY 1.4.12.1. For $p=1$, $Y=y'$, $A=a$, $B'=b$, the above integral becomes
$$\int_{\substack{y'C^{-1}y=a\\ Dy=b}}f(y'C^{-1}y,Dy)\,dy=\frac{\pi^{\frac12(n-q)}}{\Gamma\bigl[\frac12(n-q)\bigr]}\det(C)^{\frac12}\det(DCD')^{-\frac12}\bigl(a-b'(DCD')^{-1}b\bigr)^{\frac12(n-q-2)}f(a,b). \tag{1.4.33}$$
The result (1.4.33) was derived by Sverdrup (1947).

1.5. ZONAL POLYNOMIALS

In this and subsequent sections we give a brief description of zonal polynomials and hypergeometric functions of matrix arguments developed by Herz (1955), Hua (1959), and James (1960, 1961b, 1964).

Let $S\,(p\times p)$ be a symmetric matrix and $V_k$ be the vector space of homogeneous polynomials $\phi(S)$ of degree $k$ in the $\frac12 p(p+1)$ distinct elements of $S$. The space $V_k$ can be decomposed into a direct sum of irreducible invariant subspaces $V_\kappa$, where $\kappa=(k_1,\ldots,k_p)$, $k_1+\cdots+k_p=k$, $k_1\ge\cdots\ge k_p\ge 0$. Then the polynomial $(\operatorname{tr}S)^{k}\in V_k$ has the unique decomposition into polynomials $C_\kappa(S)\in V_\kappa$ as
$$(\operatorname{tr}S)^{k}=\sum_{\kappa}C_\kappa(S). \tag{1.5.1}$$
Thus we have

DEFINITION 1.5.1. The zonal polynomial $C_\kappa(S)$ is the component of $(\operatorname{tr}S)^{k}$ in the subspace $V_\kappa$.

The zonal polynomial $C_\kappa(S)$ is defined for all $k$ and $p$, but for a partition $\kappa$ of $k$ into more than $p$ parts, it is identically zero. These polynomials are invariant under orthogonal transformation, i.e.,
$$C_\kappa(S)=C_\kappa(HSH'),\qquad H\in O(p). \tag{1.5.2}$$
Hence $C_\kappa(S)$ is a symmetric homogeneous polynomial in the characteristic roots of $S$. Also, if $R$ is a symmetric positive definite matrix, then
$$C_\kappa(RS)=C_\kappa(R^{\frac12}SR^{\frac12}), \tag{1.5.3}$$
where $R^{\frac12}$ is the unique symmetric positive definite square root of $R$. Khatri (1971) has shown that
$$|C_\kappa(S)|\le C_\kappa(S_0), \tag{1.5.4}$$
where $S_0=\operatorname{diag}(|s_1|,\ldots,|s_p|)$, and $s_i$, $i=1,\ldots,p$, are the characteristic roots of $S$. If $S=I_p$, and the partition $\kappa$ of $k$ has $r$ nonzero parts, then Constantine (1963) and James (1964) have shown that
$$C_\kappa(I_p)=\frac{2^{2k}\,k!\,(\frac12 p)_\kappa\prod_{i<j}^{r}(2k_i-2k_j-i+j)}{\prod_{i=1}^{r}(2k_i+r-i)!}, \tag{1.5.5}$$
where $(\frac12 p)_\kappa=\prod_{i=1}^{r}\bigl(\frac12(p-i+1)\bigr)_{k_i}$ with $(a)_k=a(a+1)\cdots(a+k-1)$, $(a)_0=1$. James (1964) has tabulated $C_\kappa(S)$ up to $k=6$, and Parkhurst and James (1974) have extended these tables up to $k=12$.

Next we define the generalized hypergeometric coefficient, which frequently occurs in integrals involving zonal polynomials. Let $\kappa=(k_1,\ldots,k_p)$, $k_1\ge\cdots\ge k_p\ge 0$, $k_1+\cdots+k_p=k$. Then
$$(a)_\kappa=\prod_{j=1}^{p}\bigl(a-\tfrac12(j-1)\bigr)_{k_j}. \tag{1.5.6}$$
Using the notation
$$\Gamma_p(a,\kappa)=\pi^{\frac14 p(p-1)}\prod_{j=1}^{p}\Gamma\bigl[a+k_j-\tfrac12(j-1)\bigr],\qquad \operatorname{Re}(a)>\tfrac12(p-1)-k_p, \tag{1.5.7}$$
so that $\Gamma_p(a,0)=\Gamma_p(a)$, we can write (1.5.6) as
$$(a)_\kappa=\frac{\Gamma_p(a,\kappa)}{\Gamma_p(a)}. \tag{1.5.8}$$
Khatri (1966) introduced the notation
$$\Gamma_p(a,-\kappa)=\pi^{\frac14 p(p-1)}\prod_{j=1}^{p}\Gamma\bigl[a-k_j-\tfrac12(p-j)\bigr],\qquad \operatorname{Re}(a)>\tfrac12(p-1)+k_1,$$
which is also used here. Alternatively we can write
$$\Gamma_p(a,-\kappa)=\frac{(-1)^{k}\,\Gamma_p(a)}{\bigl(-a+\frac12(p+1)\bigr)_\kappa}. \tag{1.5.9}$$
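Formulas (1.5.5) and (1.5.6) can be evaluated exactly in rational arithmetic, and (1.5.1) with $S=I_p$ gives a consistency check: the values $C_\kappa(I_p)$ over all partitions $\kappa$ of $k$ must sum to $(\operatorname{tr}I_p)^{k}=p^{k}$. The sketch below is an illustration with hypothetical helper names, using only the Python standard library; partitions are stored through their nonzero parts.

```python
from fractions import Fraction
from math import factorial

def partitions(k, max_part=None):
    """Yield the partitions of k as non-increasing tuples of positive integers."""
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def pochhammer(a, k):
    """(a)_k = a (a + 1) ... (a + k - 1), with (a)_0 = 1."""
    out = Fraction(1)
    for m in range(k):
        out *= a + m
    return out

def gen_pochhammer(a, kappa):
    """Generalized hypergeometric coefficient (a)_kappa of (1.5.6)."""
    out = Fraction(1)
    for j, kj in enumerate(kappa, start=1):
        out *= pochhammer(Fraction(a) - Fraction(j - 1, 2), kj)
    return out

def zonal_at_identity(kappa, p):
    """C_kappa(I_p) from (1.5.5); kappa holds the r nonzero parts."""
    r, k = len(kappa), sum(kappa)
    num = Fraction(2 ** (2 * k) * factorial(k)) * gen_pochhammer(Fraction(p, 2), kappa)
    for i in range(1, r + 1):
        for j in range(i + 1, r + 1):
            num *= 2 * kappa[i - 1] - 2 * kappa[j - 1] - i + j
    den = 1
    for i in range(1, r + 1):
        den *= factorial(2 * kappa[i - 1] + r - i)
    return num / den

def sum_over_partitions(k, p):
    """Sum of C_kappa(I_p) over all partitions kappa of k."""
    return sum(zonal_at_identity(kappa, p) for kappa in partitions(k))

print(zonal_at_identity((2,), 3), zonal_at_identity((1, 1), 3))
print(sum_over_partitions(4, 3))
```

Partitions with more than $p$ parts contribute zero automatically, because the factor $\bigl(\frac12(p-i+1)\bigr)_{k_i}$ vanishes for $i=p+1$; this matches the remark after Definition 1.5.1.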
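Having defined the zonal polynomials, we now give certain integrals involving them. We will have several occasions to use these results.

LEMMA 1.5.1.
$$\int_{O(p)}\{\operatorname{tr}(RH)\}^{2k}\,[dH]=\sum_{\kappa}\frac{(\frac12)_k}{(\frac12 p)_\kappa}\,C_\kappa(RR'), \tag{1.5.10}$$
$$\int_{O(p)}C_\kappa(RHSH')\,[dH]=\frac{C_\kappa(R)\,C_\kappa(S)}{C_\kappa(I_p)}, \tag{1.5.11}$$
where $[dH]$ is the unit invariant Haar measure on the orthogonal group $O(p)$.

Proof: See James (1961b) for (1.5.10) and James (1960) for (1.5.11). $\blacksquare$

LEMMA 1.5.2. Let $Z\,(p\times p)$ be a complex symmetric matrix whose real part is positive definite and $T\,(p\times p)$ be a complex symmetric matrix. Then
$$\int_{S>0}\operatorname{etr}(-ZS)\det(S)^{a-\frac12(p+1)}C_\kappa(TS)\,dS=\Gamma_p(a,\kappa)\det(Z)^{-a}C_\kappa(TZ^{-1}),\qquad \operatorname{Re}(a)>\tfrac12(p-1), \tag{1.5.12}$$
and
$$\int_{S>0}\operatorname{etr}(-ZS)\det(S)^{a-\frac12(p+1)}C_\kappa(TS^{-1})\,dS=\Gamma_p(a,-\kappa)\det(Z)^{-a}C_\kappa(TZ),\qquad \operatorname{Re}(a)>\tfrac12(p-1)+k_1. \tag{1.5.13}$$

Proof: See Constantine (1963) for (1.5.12) and Khatri (1966) for (1.5.13). $\blacksquare$

It may be noted that (1.5.12) is the Laplace transform of $\det(S)^{a-\frac12(p+1)}C_\kappa(TS)$. Thus, using the inverse Laplace transform, we get
$$\frac{2^{\frac12 p(p-1)}}{(2\pi\iota)^{\frac12 p(p+1)}}\int_{\operatorname{Re}(Z)>0}\operatorname{etr}(ZS)\det(Z)^{-a}C_\kappa(TZ^{-1})\,dZ=\frac{1}{\Gamma_p(a,\kappa)}\det(S)^{a-\frac12(p+1)}C_\kappa(TS),\qquad \operatorname{Re}(a)>\tfrac12(p-1). \tag{1.5.14}$$
Similarly, from (1.5.13) we get
$$\frac{2^{\frac12 p(p-1)}}{(2\pi\iota)^{\frac12 p(p+1)}}\int_{\operatorname{Re}(Z)>0}\operatorname{etr}(ZS)\det(Z)^{-a}C_\kappa(TZ)\,dZ=\frac{1}{\Gamma_p(a,-\kappa)}\det(S)^{a-\frac12(p+1)}C_\kappa(TS^{-1}),\qquad \operatorname{Re}(a)>\tfrac12(p-1)+k_1. \tag{1.5.15}$$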
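LEMMA 1.5.3. Let $R\,(p\times p)$ be a symmetric matrix. Then
$$\int_{0<S<I_p}\det(S)^{a-\frac12(p+1)}\det(I_p-S)^{b-\frac12(p+1)}C_\kappa(RS)\,dS=\frac{\Gamma_p(a,\kappa)\,\Gamma_p(b)}{\Gamma_p(a+b,\kappa)}\,C_\kappa(R),\qquad \operatorname{Re}(a)>\tfrac12(p-1),\ \operatorname{Re}(b)>\tfrac12(p-1), \tag{1.5.16}$$
and
$$\int_{0<S<I_p}\det(S)^{a-\frac12(p+1)}\det(I_p-S)^{b-\frac12(p+1)}C_\kappa(RS^{-1})\,dS=\frac{\Gamma_p(a,-\kappa)\,\Gamma_p(b)}{\Gamma_p(a+b,-\kappa)}\,C_\kappa(R),\qquad \operatorname{Re}(a)>\tfrac12(p-1)+k_1,\ \operatorname{Re}(b)>\tfrac12(p-1). \tag{1.5.17}$$

Proof: See Constantine (1963) for (1.5.16) and Khatri (1966) for (1.5.17). $\blacksquare$

LEMMA 1.5.4. Let $T\,(p\times p)$ be a complex symmetric matrix. Then
$$\int_{S>0}\det(S)^{a-\frac12(p+1)}\det(I_p+S)^{-(a+b)}C_\kappa(TS)\,dS=\frac{\Gamma_p(a,\kappa)\,\Gamma_p(b,-\kappa)}{\Gamma_p(a+b)}\,C_\kappa(T),\qquad \operatorname{Re}(a)>\tfrac12(p-1),\ \operatorname{Re}(b)>\tfrac12(p-1)+k_1, \tag{1.5.18}$$
and
$$\int_{S>0}\det(S)^{a-\frac12(p+1)}\det(I_p+S)^{-(a+b)}C_\kappa(TS^{-1})\,dS=\frac{\Gamma_p(a,-\kappa)\,\Gamma_p(b,\kappa)}{\Gamma_p(a+b)}\,C_\kappa(T),\qquad \operatorname{Re}(a)>\tfrac12(p-1)+k_1,\ \operatorname{Re}(b)>\tfrac12(p-1). \tag{1.5.19}$$

Proof: See Khatri (1966). $\blacksquare$

LEMMA 1.5.5. Let $T\,(p\times p)$ be a complex symmetric matrix. Then
$$\int_{S>0}\operatorname{etr}(-S)\det(S)^{a-\frac12(p+1)}(\operatorname{tr}S)^{j}\,C_\kappa(TS)\,dS=\frac{\Gamma_p(a,\kappa)\,\Gamma(pa+j+k)}{\Gamma(pa+k)}\,C_\kappa(T),\qquad \operatorname{Re}(a)>\tfrac12(p-1), \tag{1.5.20}$$
and
$$\int_{S>0}\operatorname{etr}(-S)\det(S)^{a-\frac12(p+1)}(\operatorname{tr}S)^{j}\,C_\kappa(TS^{-1})\,dS=\frac{\Gamma_p(a,-\kappa)\,\Gamma(pa+j-k)}{\Gamma(pa-k)}\,C_\kappa(T),\qquad \operatorname{Re}(a)>\tfrac12(p-1)+k_1, \tag{1.5.21}$$
where $j=0,1,2,\ldots$.

Proof: See Khatri (1966). $\blacksquare$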
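In the following theorem we give a generalization of Lemma 1.5.3.

THEOREM 1.5.1. Let $R\,(p\times p)$ be a symmetric matrix and $a=\sum_{i=1}^{r}a_i$. Then,
$$\int\cdots\int_{\substack{0<\sum_{i=1}^{r}Z_i<I_p\\ Z_i>0}}\prod_{i=1}^{r}\det(Z_i)^{a_i-\frac12(p+1)}\det\Bigl(I_p-\sum_{i=1}^{r}Z_i\Bigr)^{b-\frac12(p+1)}C_\kappa\Bigl(R\sum_{i=1}^{r}Z_i\Bigr)\prod_{i=1}^{r}dZ_i$$
$$=\frac{\Gamma_p(a,\kappa)}{\Gamma_p(a+b,\kappa)}\,\frac{\Gamma_p(b)\prod_{i=1}^{r}\Gamma_p(a_i)}{\Gamma_p(a)}\,C_\kappa(R),\qquad \operatorname{Re}(a_i)>\tfrac12(p-1),\ i=1,\ldots,r,\ \operatorname{Re}(b)>\tfrac12(p-1), \tag{1.5.22}$$
and
$$\int\cdots\int_{\substack{0<\sum_{i=1}^{r}Z_i<I_p\\ Z_i>0}}\prod_{i=1}^{r}\det(Z_i)^{a_i-\frac12(p+1)}\det\Bigl(I_p-\sum_{i=1}^{r}Z_i\Bigr)^{b-\frac12(p+1)}C_\kappa\Bigl(R\Bigl(\sum_{i=1}^{r}Z_i\Bigr)^{-1}\Bigr)\prod_{i=1}^{r}dZ_i$$
$$=\frac{\Gamma_p(a,-\kappa)}{\Gamma_p(a+b,-\kappa)}\,\frac{\Gamma_p(b)\prod_{i=1}^{r}\Gamma_p(a_i)}{\Gamma_p(a)}\,C_\kappa(R),\qquad \operatorname{Re}(a_i)>\tfrac12(p-1),\ i=1,\ldots,r,\ \operatorname{Re}(a)>\tfrac12(p-1)+k_1,\ \operatorname{Re}(b)>\tfrac12(p-1). \tag{1.5.23}$$

Proof: Here we give the proof of (1.5.22). The proof of (1.5.23) is similar. The integral on the left hand side of (1.5.22), using (1.4.12) and (1.4.13), can be written as
$$\int\cdots\int_{\substack{0<\sum_{i=1}^{r}Z_i<I_p\\ Z_i>0}}\prod_{i=1}^{r}\det(Z_i)^{a_i-\frac12(p+1)}\det\Bigl(I_p-\sum_{i=1}^{r}Z_i\Bigr)^{b-\frac12(p+1)}C_\kappa\Bigl(R\sum_{i=1}^{r}Z_i\Bigr)\prod_{i=1}^{r}dZ_i$$
$$=\int_{0<Z<I_p}\phi(Z)\,C_\kappa(RZ)\,dZ$$
$$=\beta_p(a_1,\ldots,a_{r-1};a_r)\int_{0<Z<I_p}\det(Z)^{a-\frac12(p+1)}\det(I_p-Z)^{b-\frac12(p+1)}C_\kappa(RZ)\,dZ$$
$$=\beta_p(a_1,\ldots,a_{r-1};a_r)\,\frac{\Gamma_p(a,\kappa)\,\Gamma_p(b)}{\Gamma_p(a+b,\kappa)}\,C_\kappa(R).$$
The last step is derived using (1.5.16). Now substituting for $\beta_p(a_1,\ldots,a_{r-1};a_r)$ from (1.4.11) and simplifying gives the desired result. $\blacksquare$

For many other results on zonal polynomials and integrals involving zonal polynomials, the reader is referred to Subrahmaniam (1976) and Muirhead (1982).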
1.6. HYPERGEOMETRIC FUNCTIONS OF MATRIX ARGUMENT

Distributional results for random matrices are often derived in terms of hypergeometric functions of matrix arguments. Bochner (1952) defined the Bessel function of matrix argument as the inverse Laplace transform of the exponential function. Herz (1955) introduced the hypergeometric function of matrix argument using Laplace and inverse Laplace transforms. Constantine (1963) gave the power series representation in terms of zonal polynomials, as given below.

DEFINITION 1.6.1. The hypergeometric function of matrix argument is defined by
$${}_mF_n(a_1,\ldots,a_m;b_1,\ldots,b_n;S)=\sum_{k=0}^{\infty}\sum_{\kappa}\frac{(a_1)_\kappa\cdots(a_m)_\kappa}{(b_1)_\kappa\cdots(b_n)_\kappa}\,\frac{C_\kappa(S)}{k!}, \tag{1.6.1}$$
where $a_i$, $i=1,\ldots,m$, and $b_j$, $j=1,\ldots,n$, are arbitrary complex numbers, $S\,(p\times p)$ is a complex symmetric matrix and $\sum_\kappa$ denotes summation over all partitions $\kappa$ of $k$.

Conditions for convergence of the series (1.6.1) are:
(i) none of the $b_j$ is zero, an integer or a half integer less than or equal to $\frac12(p-1)$,
(ii) if $a_i$ is a negative integer, say $-r$, then the function reduces to a finite polynomial of degree $pr$,
(iii) the series converges for all $S\,(p\times p)$ if $m<n+1$,
(iv) if $m=n+1$, the series converges for all $S\,(p\times p)$ such that $\|S\|<1$, where the norm $\|S\|$ denotes the maximum absolute value of the characteristic roots of $S$,
(v) unless the series terminates, it diverges for all $S\ne 0$ if $m>n+1$.

From Definition 1.6.1 it follows that
$${}_0F_0(S)=\sum_{k=0}^{\infty}\sum_{\kappa}\frac{C_\kappa(S)}{k!}=\sum_{k=0}^{\infty}\frac{(\operatorname{tr}S)^{k}}{k!}=\operatorname{etr}(S).$$

DEFINITION 1.6.2. The hypergeometric function of two symmetric matrices $S\,(p\times p)$ and $T\,(p\times p)$ is defined by
$${}_mF_n^{(p)}(a_1,\ldots,a_m;b_1,\ldots,b_n;S,T)=\sum_{k=0}^{\infty}\sum_{\kappa}\frac{(a_1)_\kappa\cdots(a_m)_\kappa}{(b_1)_\kappa\cdots(b_n)_\kappa}\,\frac{C_\kappa(S)\,C_\kappa(T)}{C_\kappa(I_p)\,k!}. \tag{1.6.2}$$
Conditions for convergence of (1.6.2) are similar to the conditions for the convergence of (1.6.1), except that for $m=n+1$ the series converges for $\|S\|<1$ or $\|T\|<1$. If both $S\,(p\times p)$ and $T\,(p\times p)$ are such that $\|S\|<1$ and $\|T\|<1$, then the series will converge more rapidly.
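To make the series (1.6.1) concrete, it can be truncated at $k=2$, where the zonal polynomials have simple closed forms. The sketch below is an illustration only and rests on two inputs that are not derived at this point of the text: the standard low-order formulas $C_{(1)}(S)=\operatorname{tr}S$, $C_{(2)}(S)=\frac13[(\operatorname{tr}S)^{2}+2\operatorname{tr}S^{2}]$, $C_{(1,1)}(S)=\frac23[(\operatorname{tr}S)^{2}-\operatorname{tr}S^{2}]$, and the identity ${}_1F_0(a;Z)=\det(I_p-Z)^{-a}$ for $\|Z\|<1$ (Corollary 1.6.2.1 of this section). The function names are hypothetical; NumPy is assumed.

```python
import numpy as np

def zonal_up_to_degree_two(S):
    """C_(1)(S), C_(2)(S), C_(1,1)(S) from the standard low-order formulas (assumed)."""
    t = np.trace(S)
    u = np.trace(S @ S)
    return t, (t**2 + 2.0 * u) / 3.0, 2.0 * (t**2 - u) / 3.0

def hyp0F0_truncated(S):
    """Series (1.6.1) for 0F0, keeping the terms with k <= 2."""
    c1, c2, c11 = zonal_up_to_degree_two(S)
    return 1.0 + c1 + (c2 + c11) / 2.0

def hyp1F0_truncated(a, S):
    """Series (1.6.1) for 1F0(a; S), keeping the terms with k <= 2."""
    c1, c2, c11 = zonal_up_to_degree_two(S)
    coef_2 = a * (a + 1.0)     # (a)_kappa for kappa = (2), from (1.5.6)
    coef_11 = a * (a - 0.5)    # (a)_kappa for kappa = (1, 1), from (1.5.6)
    return 1.0 + a * c1 + (coef_2 * c2 + coef_11 * c11) / 2.0

rng = np.random.default_rng(1)
p, a, eps = 3, 1.5, 1e-2
M = rng.standard_normal((p, p))
M = (M + M.T) / 2.0
Z = eps * M / np.linalg.norm(M, 2)   # symmetric, largest |eigenvalue| equal to eps

err_0F0 = abs(hyp0F0_truncated(Z) - np.exp(np.trace(Z)))
err_1F0 = abs(hyp1F0_truncated(a, Z) - np.linalg.det(np.eye(p) - Z) ** (-a))
print(err_0F0, err_1F0)
```

Both errors should be of third order in `eps`, since only the terms with $k\ge 3$ are dropped.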
It is clear from Definition 1.6.2 that the order of $S$ and $T$ is unimportant, and if one of the arguments is the identity matrix, this function reduces to the hypergeometric function of one matrix argument. By averaging the hypergeometric function of one matrix argument over the orthogonal group $O(p)$, one can obtain the hypergeometric function of two matrices as follows.

THEOREM 1.6.1. If $S\,(p\times p)$ is a symmetric positive definite matrix and $T\,(p\times p)$ is a symmetric matrix, then
$$\int_{O(p)}{}_mF_n(a_1,\ldots,a_m;b_1,\ldots,b_n;SHTH')\,[dH]={}_mF_n^{(p)}(a_1,\ldots,a_m;b_1,\ldots,b_n;S,T). \tag{1.6.3}$$

Proof: The result follows by expanding the integrand using (1.6.1) and then integrating term by term using (1.5.11). $\blacksquare$

Some of the results given in Section 1.5 can be extended to hypergeometric functions.

THEOREM 1.6.2. Let $Z\,(p\times p)$ be a complex symmetric matrix whose real part is positive definite and $T\,(p\times p)$ be a complex symmetric matrix. Then
$$\int_{S>0}\operatorname{etr}(-ZS)\det(S)^{a-\frac12(p+1)}\,{}_mF_n(a_1,\ldots,a_m;b_1,\ldots,b_n;ST)\,dS$$
$$=\Gamma_p(a)\det(Z)^{-a}\,{}_{m+1}F_n(a_1,\ldots,a_m,a;b_1,\ldots,b_n;Z^{-1}T),\qquad \operatorname{Re}(a)>\tfrac12(p-1), \tag{1.6.4}$$
and
$$\int_{S>0}\operatorname{etr}(-ZS)\det(S)^{a-\frac12(p+1)}\,{}_mF_n^{(p)}(a_1,\ldots,a_m;b_1,\ldots,b_n;ST,R)\,dS$$
$$=\Gamma_p(a)\det(Z)^{-a}\,{}_{m+1}F_n^{(p)}(a_1,\ldots,a_m,a;b_1,\ldots,b_n;Z^{-1}T,R),\qquad \operatorname{Re}(a)>\tfrac12(p-1), \tag{1.6.5}$$
where $R\,(p\times p)$ is a symmetric matrix.

Proof: Expanding the hypergeometric function in the integrand and integrating term by term using Lemma 1.5.2 gives the desired result. $\blacksquare$

COROLLARY 1.6.2.1. For $\|Z\|<1$,
$${}_1F_0(a;Z)=\det(I_p-Z)^{-a}.$$

Proof: From (1.6.4), letting $T=I_p$ and replacing $Z$ by $Z^{-1}$, we get
$${}_1F_0(a;Z)=\frac{\det(Z)^{-a}}{\Gamma_p(a)}\int_{S>0}\operatorname{etr}(-Z^{-1}S)\det(S)^{a-\frac12(p+1)}\,{}_0F_0(S)\,dS.$$
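Now substituting $Z^{-\frac12}SZ^{-\frac12}=A$ with Jacobian $J(S\to A)=\det(Z)^{\frac12(p+1)}$ and using ${}_0F_0(ZA)=\operatorname{etr}(ZA)$, we get
$${}_1F_0(a;Z)=\frac{1}{\Gamma_p(a)}\int_{A>0}\operatorname{etr}\{-A(I_p-Z)\}\det(A)^{a-\frac12(p+1)}\,dA=\det(I_p-Z)^{-a},$$
which follows from (1.4.6). $\blacksquare$

THEOREM 1.6.3. Let $R\,(p\times p)$ be a symmetric matrix. Then
$$\int_{0<S<I_p}\det(S)^{a-\frac12(p+1)}\det(I_p-S)^{b-\frac12(p+1)}\,{}_mF_n(a_1,\ldots,a_m;b_1,\ldots,b_n;RS)\,dS$$
$$=\frac{\Gamma_p(a)\,\Gamma_p(b)}{\Gamma_p(a+b)}\,{}_{m+1}F_{n+1}(a_1,\ldots,a_m,a;b_1,\ldots,b_n,a+b;R). \tag{1.6.6}$$

Proof: The result follows by expanding the hypergeometric function in the integrand and integrating term by term using Lemma 1.5.3. $\blacksquare$

COROLLARY 1.6.3.1. For $\operatorname{Re}(\alpha)>\frac12(p-1)$, $\operatorname{Re}(\beta)>\frac12(p-1)$, $\operatorname{Re}(\beta-\alpha)>\frac12(p-1)$ and symmetric $R\,(p\times p)$,
$${}_1F_1(\alpha;\beta;R)=\frac{\Gamma_p(\beta)}{\Gamma_p(\alpha)\,\Gamma_p(\beta-\alpha)}\int_{0<S<I_p}\det(S)^{\alpha-\frac12(p+1)}\det(I_p-S)^{\beta-\alpha-\frac12(p+1)}\operatorname{etr}(RS)\,dS. \tag{1.6.7}$$

Proof: Substituting $m=n=0$, $a=\alpha$, and $b=\beta-\alpha$ in (1.6.6), we get
$${}_1F_1(\alpha;\beta;R)=\frac{\Gamma_p(\beta)}{\Gamma_p(\alpha)\,\Gamma_p(\beta-\alpha)}\int_{0<S<I_p}\det(S)^{\alpha-\frac12(p+1)}\det(I_p-S)^{\beta-\alpha-\frac12(p+1)}\,{}_0F_0(RS)\,dS,$$
for $\operatorname{Re}(\alpha)>\frac12(p-1)$ and $\operatorname{Re}(\beta-\alpha)>\frac12(p-1)$. The result follows by using ${}_0F_0(RS)=\operatorname{etr}(RS)$. $\blacksquare$

COROLLARY 1.6.3.2. For $\operatorname{Re}(\alpha)>\frac12(p-1)$, $\operatorname{Re}(\gamma-\alpha)>\frac12(p-1)$ and symmetric $R\,(p\times p)$, where $\operatorname{Re}(R)<I_p$,
$${}_2F_1(\alpha,\beta;\gamma;R)=\frac{\Gamma_p(\gamma)}{\Gamma_p(\alpha)\,\Gamma_p(\gamma-\alpha)}\int_{0<S<I_p}\det(S)^{\alpha-\frac12(p+1)}\det(I_p-S)^{\gamma-\alpha-\frac12(p+1)}\det(I_p-RS)^{-\beta}\,dS. \tag{1.6.8}$$

Proof: Substituting $m=1$, $n=0$, $a_1=\beta$, $a=\alpha$, and $b=\gamma-\alpha$ in (1.6.6), we get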
$${}_2F_1(\alpha,\beta;\gamma;R)=\frac{\Gamma_p(\gamma)}{\Gamma_p(\alpha)\,\Gamma_p(\gamma-\alpha)}\int_{0<S<I_p}\det(S)^{\alpha-\frac12(p+1)}\det(I_p-S)^{\gamma-\alpha-\frac12(p+1)}\,{}_1F_0(\beta;RS)\,dS,$$
for $\operatorname{Re}(\alpha)>\frac12(p-1)$ and $\operatorname{Re}(\gamma-\alpha)>\frac12(p-1)$. Now using Corollary 1.6.2.1 the result follows. $\blacksquare$

The integral representations (1.6.7) and (1.6.8) are generalizations of the classical confluent hypergeometric function ${}_1F_1$ and Gauss hypergeometric function ${}_2F_1$ respectively, and are due to Herz (1955). He also generalized Kummer's and Euler's relations for the classical ${}_1F_1$ and ${}_2F_1$ functions to the matrix argument. The hypergeometric functions ${}_1F_1$ and ${}_2F_1$ satisfy the following relations (Herz, 1955):
$${}_1F_1(\alpha;\gamma;S)=\operatorname{etr}(S)\,{}_1F_1(\gamma-\alpha;\gamma;-S), \tag{1.6.9}$$
$${}_2F_1(\alpha,\beta;\gamma;S)=\det(I_p-S)^{-\beta}\,{}_2F_1\bigl(\gamma-\alpha,\beta;\gamma;-S(I_p-S)^{-1}\bigr)=\det(I_p-S)^{\gamma-\alpha-\beta}\,{}_2F_1(\gamma-\alpha,\gamma-\beta;\gamma;S). \tag{1.6.10}$$
Subrahmaniam (1973) proved (1.6.10) using the partial differential equation for ${}_2F_1$. Using the zonal polynomial expansion it is easy to establish the confluence relations
$$\lim_{\alpha\to\infty}{}_1F_1\Bigl(\alpha;\gamma;\frac{1}{\alpha}S\Bigr)={}_0F_1(\gamma;S), \tag{1.6.11}$$
$$\lim_{\beta\to\infty}{}_2F_1\Bigl(\alpha,\beta;\gamma;\frac{1}{\beta}S\Bigr)={}_1F_1(\alpha;\gamma;S). \tag{1.6.12}$$

From Theorem 1.6.2 it is seen that the ${}_{m+1}F_n$ function can be obtained from ${}_mF_n$ by means of a Laplace transform. Conversely, the ${}_mF_n$ function can be obtained from the ${}_{m+1}F_n$ function by using an inverse Laplace transform. There is also an inverse Laplace transform which enables the ${}_mF_{n+1}$ function to be obtained from the ${}_mF_n$ function (see Herz, 1955, p. 485). It has already been shown that ${}_0F_0(S)=\operatorname{etr}(S)$ and ${}_1F_0(a;S)=\det(I_p-S)^{-a}$. The hypergeometric functions ${}_0F_1$ and ${}_1F_1$ have the following integral representations, given by Herz (1955) and James (1961a).

THEOREM 1.6.4. Let $X\,(p\times n)$, $p\le n$, be a real matrix and $H=\begin{pmatrix}H_1\\ H_2\end{pmatrix}\in O(n)$, where $H_1$ is $p\times n$. Then
$$\int_{O(n)}\operatorname{etr}(XH_1')\,[dH]={}_0F_1\bigl(\tfrac12 n;\tfrac14 XX'\bigr)$$
and
$$\int_{O(p,n)}\operatorname{etr}(XH_1')\,[dH_1]={}_0F_1\bigl(\tfrac12 n;\tfrac14 XX'\bigr),$$
where $[dH_1]$ denotes the unit invariant measure on $O(p,n)$.

THEOREM 1.6.5. Let $H_1\in O(p,n)$, i.e., $H_1$ is $p\times n$ and $H_1H_1'=I_p$. Further let $[dH_1]$ be the normalized invariant measure on $O(p,n)$, so that $\int_{O(p,n)}[dH_1]=1$. If $X\,(n\times n)$ is a positive definite matrix, then
$$\int_{O(p,n)}\operatorname{etr}(XH_1'H_1)\,[dH_1]={}_1F_1\bigl(\tfrac12 p;\tfrac12 n;X\bigr).$$
Next, by using Theorem 1.6.4, we derive an integral useful in the study of the noncentral density of the Wishart matrix and in the theory of quadratic forms.

THEOREM 1.6.6. For $X\,(p\times n)$ of rank $p\le n$ and $L\,(p\times n)$,
$$\int_{XX'=A}\operatorname{etr}(LX')\,dX=\frac{\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\det(A)^{\frac12(n-p-1)}\,{}_0F_1\bigl(\tfrac12 n;\tfrac14 LL'A\bigr).$$

Proof: Transform $X=TH_1$, where $H_1$ is $p\times n$, $H_1H_1'=I_p$, and $T\,(p\times p)$ is a lower triangular matrix with positive diagonal elements, with Jacobian, from (1.3.25),
$$J(X\to T,H_1)=\prod_{i=1}^{p}t_{ii}^{\,n-i}\,g_{n,p}(H_1),$$
where $g_{n,p}(H_1)\,dH_1$ defines the invariant measure on $O(p,n)$. Then
$$\int_{XX'=A}\operatorname{etr}(LX')\,dX=\int_{TT'=A}\prod_{i=1}^{p}t_{ii}^{\,n-i}\int_{H_1\in O(p,n)}\operatorname{etr}(T'LH_1')\,g_{n,p}(H_1)\,dH_1\,dT$$
$$=\frac{2^{p}\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\int_{TT'=A}\prod_{i=1}^{p}t_{ii}^{\,n-i}\int_{H_1\in O(p,n)}\operatorname{etr}(T'LH_1')\,[dH_1]\,dT$$
$$=\frac{2^{p}\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\int_{TT'=A}\prod_{i=1}^{p}t_{ii}^{\,n-i}\,{}_0F_1\bigl(\tfrac12 n;\tfrac14 T'LL'T\bigr)\,dT. \tag{1.6.13}$$
The expression (1.6.13) has been obtained by using Theorem 1.6.4. Further transforming $TT'=S$, with the Jacobian $J(T\to S)=2^{-p}\prod_{i=1}^{p}t_{ii}^{-(p-i+1)}$, we get the final result. $\blacksquare$

There is yet another type of confluent hypergeometric function, $\Psi$, of matrix argument, defined below (Muirhead, 1970).

DEFINITION 1.6.3. The confluent hypergeometric function $\Psi$ of a symmetric matrix $R\,(p\times p)$ is defined by
$$\Psi(a,c;R)=\frac{1}{\Gamma_p(a)}\int_{S>0}\operatorname{etr}(-RS)\det(S)^{a-\frac12(p+1)}\det(I_p+S)^{c-a-\frac12(p+1)}\,dS, \tag{1.6.14}$$
where $\operatorname{Re}(R)>0$ and $\operatorname{Re}(a)>\frac12(p-1)$.

Using (1.6.8), it can easily be proved that the confluent hypergeometric function $\Psi$ can also be obtained as a limit of the Gauss hypergeometric function,
$$\lim_{c\to\infty}{}_2F_1(a,b;c;I_p-cR^{-1})=\det(R)^{b}\,\Psi\bigl(b,\,b-a+\tfrac12(p+1);R\bigr).$$
Whittaker's function of matrix argument has been studied by Abdi (1968). The Bessel functions of matrix argument are defined as follows.
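DEFINITION 1.6.4. The Bessel function (type one Bessel function of Herz) of matrix argument, denoted by $A_\gamma(S)$, is defined as
$$A_\gamma(S) = \frac{1}{\Gamma_p[\gamma+\frac12(p+1)]} \sum_{k=0}^\infty \sum_\kappa \frac{C_\kappa(-S)}{(\gamma+\frac12(p+1))_\kappa\, k!} = \frac{1}{\Gamma_p[\gamma+\frac12(p+1)]}\; {}_0F_1\!\left(\gamma+\tfrac12(p+1);-S\right), \qquad (1.6.15)$$
where $\operatorname{Re}(\gamma) > -1$.

It can be easily shown that the Bessel function defined above has the integral representation
$$A_\gamma(S) = \frac{2^{\frac12 p(p-1)}}{(2\pi\iota)^{\frac12 p(p+1)}} \int_{\operatorname{Re}(Z)>0} \operatorname{etr}(Z - SZ^{-1}) \det(Z)^{-\gamma-\frac12(p+1)}\,dZ. \qquad (1.6.16)$$
The result (1.6.16) can be proved by expanding $\operatorname{etr}(-SZ^{-1})$ in zonal polynomials and using (1.5.14). The Laplace transform of $\det(S)^\gamma A_\gamma(S)$ is derived from (1.6.15) as
$$\int_{S>0} \operatorname{etr}(-SZ) \det(S)^\gamma A_\gamma(S)\,dS = \operatorname{etr}(-Z^{-1}) \det(Z)^{-\gamma-\frac12(p+1)}. \qquad (1.6.17)$$
For $p = 1$, the relation between $A_\gamma(\cdot)$ and the ordinary Bessel function $J_\gamma(\cdot)$ (Luke, 1969; p. 212) is given by
$$J_\gamma(t) = A_\gamma\!\left(\tfrac14 t^2\right)\left(\tfrac12 t\right)^\gamma.$$

DEFINITION 1.6.5. The type two Bessel function of Herz of matrix argument, $B_\delta$, is defined as
$$B_\delta(WZ) = \det(W)^{-\delta} \int_{S>0} \operatorname{etr}(-SW) \operatorname{etr}(-S^{-1}Z) \det(S)^{-\delta-\frac12(p+1)}\,dS, \qquad (1.6.18)$$
where $\operatorname{Re}(W) > 0$ and $\operatorname{Re}(Z) > 0$.

By changing variables from $S$ to $S^{-1}$, we note that $B_{-\delta}(Z) = B_\delta(Z)\det(Z)^\delta$ and we can write
$$B_\delta(Z) = \int_{S>0} \operatorname{etr}(-SZ) \operatorname{etr}(-S^{-1}) \det(S)^{\delta-\frac12(p+1)}\,dS. \qquad (1.6.19)$$
For $p = 1$, the relation between $B_\delta(\cdot)$ and the Bessel function of the third kind of imaginary argument $K_\delta(\cdot)$ (Luke, 1969; p. 212) is given by
$$K_\delta(t) = \tfrac12\, B_\delta\!\left(\tfrac14 t^2\right)\left(\tfrac12 t\right)^\delta.$$
Next we define incomplete gamma and beta functions of matrix argument.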
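Before turning to those definitions, the $p = 1$ relation $J_\gamma(t) = A_\gamma(\tfrac14 t^2)(\tfrac12 t)^\gamma$ can be checked numerically. The sketch below assumes only that, for $p = 1$, (1.6.15) reduces to $A_\gamma(s) = {}_0F_1(\gamma+1;-s)/\Gamma(\gamma+1)$; it compares the result with the closed form of $J_{1/2}$ and with the standard integral representation of $J_0$. All function names are illustrative.

```python
import math

def hyp0f1(b, z, terms=80):
    """Scalar 0F1(b; z) = sum_k z^k / ((b)_k k!)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= z / ((b + k) * (k + 1))
    return total

def bessel_A(gamma, s):
    """A_gamma(s) for p = 1: 0F1(gamma + 1; -s) / Gamma(gamma + 1)."""
    return hyp0f1(gamma + 1.0, -s) / math.gamma(gamma + 1.0)

def bessel_J_via_A(gamma, t):
    """J_gamma(t) computed as A_gamma(t^2 / 4) * (t / 2)^gamma."""
    return bessel_A(gamma, 0.25 * t * t) * (0.5 * t) ** gamma

def bessel_J0_integral(t, steps=2000):
    """J_0(t) = (1/pi) * integral over [0, pi] of cos(t sin(theta)), midpoint rule."""
    h = math.pi / steps
    return sum(math.cos(t * math.sin((i + 0.5) * h)) for i in range(steps)) / steps

t = 2.3
j_half_series = bessel_J_via_A(0.5, t)
j_half_closed = math.sqrt(2.0 / (math.pi * t)) * math.sin(t)   # closed form of J_{1/2}
j0_series = bessel_J_via_A(0.0, t)
j0_integral = bessel_J0_integral(t)
print(j_half_series, j_half_closed)
print(j0_series, j0_integral)
```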
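DEFINITION 1.6.6. For $\operatorname{Re}(a) > \tfrac12(p-1)$, the incomplete gamma function is defined by
$$\gamma_p(a,B) = \int_{0<A<B} \det(A)^{a-\frac12(p+1)} \operatorname{etr}(-A)\,dA. \qquad (1.6.20)$$

THEOREM 1.6.7. For $\operatorname{Re}(a) > \tfrac12(p-1)$,
$$\gamma_p(a,B) = \det(B)^a\, \frac{\Gamma_p(a)\,\Gamma_p[\frac12(p+1)]}{\Gamma_p[a+\frac12(p+1)]}\; {}_1F_1\!\left(a; a+\tfrac12(p+1); -B\right). \qquad (1.6.21)$$

Proof: Substituting $S = B^{-\frac12}AB^{-\frac12}$ with Jacobian $J(A\to S) = \det(B)^{\frac12(p+1)}$ in (1.6.20) we get
$$\gamma_p(a,B) = \det(B)^a \int_{0<S<I_p} \det(S)^{a-\frac12(p+1)} \operatorname{etr}(-BS)\,dS = \det(B)^a\, \frac{\Gamma_p(a)\,\Gamma_p[\frac12(p+1)]}{\Gamma_p[a+\frac12(p+1)]}\; {}_1F_1\!\left(a; a+\tfrac12(p+1); -B\right).$$
The last equality is obtained from Corollary 1.6.3.1. ■

DEFINITION 1.6.7. For $\operatorname{Re}(a) > \tfrac12(p-1)$ and $\operatorname{Re}(b) > \tfrac12(p-1)$, the incomplete beta function is defined by
$$\beta_p(a,b,B) = \int_{0<A<B} \det(A)^{a-\frac12(p+1)} \det(I_p-A)^{b-\frac12(p+1)}\,dA \qquad (1.6.22)$$
where $0 < B < I_p$.

THEOREM 1.6.8. For $\operatorname{Re}(a) > \tfrac12(p-1)$, $\operatorname{Re}(b) > \tfrac12(p-1)$ and $0 < B < I_p$,
$$\beta_p(a,b,B) = \frac{\Gamma_p(a)\,\Gamma_p[\frac12(p+1)]}{\Gamma_p[a+\frac12(p+1)]} \det(B)^a\; {}_2F_1\!\left(a, -b+\tfrac12(p+1); a+\tfrac12(p+1); B\right). \qquad (1.6.23)$$

Proof: Substituting $S = B^{-\frac12}AB^{-\frac12}$ with Jacobian $J(A\to S) = \det(B)^{\frac12(p+1)}$ in (1.6.22) we get
$$\beta_p(a,b,B) = \det(B)^a \int_{0<S<I_p} \det(S)^{a-\frac12(p+1)} \det(I_p-BS)^{b-\frac12(p+1)}\,dS = \frac{\Gamma_p(a)\,\Gamma_p[\frac12(p+1)]}{\Gamma_p[a+\frac12(p+1)]} \det(B)^a\; {}_2F_1\!\left(a, -b+\tfrac12(p+1); a+\tfrac12(p+1); B\right).$$
The last equality is obtained from Corollary 1.6.3.2. ■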
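For $p = 1$, Theorems 1.6.7 and 1.6.8 reduce to the classical identities $\gamma(a,x) = x^a a^{-1}\,{}_1F_1(a;a+1;-x)$ and $B_x(a,b) = x^a a^{-1}\,{}_2F_1(a,1-b;a+1;x)$. The following sketch checks both by comparing Simpson quadrature of the defining integrals with truncated hypergeometric series. The parameter values and helper names are arbitrary illustrative choices, and $a = 3$ is picked so that the integrands are smooth at the origin.

```python
import math

def hyp_series(num, den, z, terms=200):
    """Generalized hypergeometric series with numerator/denominator parameter lists."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        ratio = z / (k + 1)
        for a in num:
            ratio *= (a + k)
        for c in den:
            ratio /= (c + k)
        term *= ratio
    return total

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    s = f(lo) + f(hi)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0

a, b = 3.0, 2.5
# Gamma_1(a) * Gamma_1(1) / Gamma_1(a + 1), the constant in (1.6.21) and (1.6.23) for p = 1
const = math.gamma(a) * math.gamma(1.0) / math.gamma(a + 1.0)

upper_g = 1.2   # plays the role of B in (1.6.20)
gamma_lhs = simpson(lambda t: t ** (a - 1.0) * math.exp(-t), 0.0, upper_g)
gamma_rhs = upper_g ** a * const * hyp_series([a], [a + 1.0], -upper_g)

upper_b = 0.6   # plays the role of B in (1.6.22), 0 < B < 1
beta_lhs = simpson(lambda t: t ** (a - 1.0) * (1.0 - t) ** (b - 1.0), 0.0, upper_b)
beta_rhs = const * upper_b ** a * hyp_series([a, 1.0 - b], [a + 1.0], upper_b)

print(gamma_lhs, gamma_rhs)
print(beta_lhs, beta_rhs)
```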
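1.7. LAGUERRE POLYNOMIALS

Laguerre polynomials of matrix argument were introduced by Herz (1955). Constantine (1966) modified his definition and gave the following integral representation.

DEFINITION 1.7.1. The Laguerre polynomial $L_\kappa^\gamma(S)$ of a symmetric matrix $S\,(p\times p)$ corresponding to the partition $\kappa$ of $k$ is defined as
$$L_\kappa^\gamma(S) = \operatorname{etr}(S) \int_{R>0} \operatorname{etr}(-R) \det(R)^\gamma\, C_\kappa(R)\, A_\gamma(RS)\,dR, \qquad (1.7.1)$$
where $A_\gamma(R)$ is the Bessel function and $\operatorname{Re}(\gamma) > -1$.

Substituting for $A_\gamma(RS)$ from (1.6.16) in (1.7.1), changing the order of integration and integrating with respect to $R$ we get
$$L_\kappa^\gamma(S) = \frac{2^{\frac12 p(p-1)}}{(2\pi\iota)^{\frac12 p(p+1)}}\, \Gamma_p\!\left(\gamma+\tfrac12(p+1),\kappa\right) \int_{\operatorname{Re}(Z)>0} \operatorname{etr}(Z) \det(Z)^{-\gamma-\frac12(p+1)}\, C_\kappa(I_p - SZ^{-1})\,dZ. \qquad (1.7.2)$$
Further, write
$$\frac{C_\kappa(I_p - SZ^{-1})}{C_\kappa(I_p)} = \sum_{t=0}^k \sum_\tau \binom{\kappa}{\tau} \frac{C_\tau(-SZ^{-1})}{C_\tau(I_p)} \qquad (1.7.3)$$
where $\binom{\kappa}{\tau}$ is the generalized binomial coefficient (Constantine, 1966), and $\tau$ is a partition of $t$. Substituting (1.7.3) in (1.7.2) we get
$$L_\kappa^\gamma(S) = \frac{2^{\frac12 p(p-1)}}{(2\pi\iota)^{\frac12 p(p+1)}}\, \Gamma_p\!\left(\gamma+\tfrac12(p+1),\kappa\right) C_\kappa(I_p) \sum_{t=0}^k \sum_\tau \binom{\kappa}{\tau} \frac{1}{C_\tau(I_p)} \int_{\operatorname{Re}(Z)>0} \operatorname{etr}(Z) \det(Z)^{-\gamma-\frac12(p+1)}\, C_\tau(-SZ^{-1})\,dZ.$$
Now using (1.5.14), we obtain the series representation for $L_\kappa^\gamma(S)$ as
$$L_\kappa^\gamma(S) = \left(\gamma+\tfrac12(p+1)\right)_\kappa C_\kappa(I_p) \sum_{t=0}^k \sum_\tau \binom{\kappa}{\tau} \frac{C_\tau(-S)}{(\gamma+\frac12(p+1))_\tau\, C_\tau(I_p)}. \qquad (1.7.4)$$
Clearly $L_\kappa^\gamma(S)$ is a symmetric polynomial of degree $k$ in the eigenvalues of $S$ and
$$L_\kappa^\gamma(0) = \left(\gamma+\tfrac12(p+1)\right)_\kappa C_\kappa(I_p), \qquad (1.7.5)$$
$$|L_\kappa^\gamma(S)| \le \left(\gamma+\tfrac12(p+1)\right)_\kappa C_\kappa(I_p) \operatorname{etr}(S), \qquad \gamma > -1.$$
Next we give the Laplace transform of $\det(S)^\gamma L_\kappa^\gamma(S)$, which is useful in the theory of quadratic forms.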
42 CHAPTER 1. PRELIMINARIES THEOREM 1.7.1. Let Ζ (ρ χ ρ), and Τ (ρ χ ρ) be complex symmetric matrices, Re(Z) > 0. Then Js>oeti(-ZS)aet(SrLZ(TS)dS = (7 + 1(р + 1))лГр[7 + |(р + 1)] detiZy^^C^Ij, - Z~lT). (1.7.6) Proof: Substituting from (1.7.4) in the left hand side of (1.7.6), and using Lemma 1.5.2, we get I etr(- ΖS)det{SyCT(- ST) dS Js>o ■(^5*+»+!»+.)]адЁЕ(^. Now the result follows from (1.7.3). ■ The generating function for the Laguerre polynomial L^.(S) is £ЕВД^Я = det(/p - Z)-t№ f ^ etr{-SHZ(Ip - Z^H'} [dH], k=0 к Ьк{1р)К\ JO(p) \\Z\\ < 1, S > 0, (1.7.7) which can be proved by multiplying both sides by det(S)7 and showing that their Laplace transforms are equal. 1.8. GENERALIZED HERMITE POLYNOMIALS In this section we define the generalized Hermite polynomial and its extensions. These functions of matrix arguments play an important role in the study of the distribution of quadratic forms. Hayakawa (1969) modified the definition given by Herz (1955) and defined the Hermite polynomial of matrix argument as HK(T) = 7rb>etr(TT') J eti(-UU' - 2lTU')Ck(-UU') dU, (1.8.1) where Τ (ρ χ η) is a real matrix, and CK(·) is a zonal polynomial. In 1972 he extended the above definition by introducing CK(—UAU') in place of CK(UU') in (1.8.1) where Α (η χ n) is a real symmetric matrix. He denoted these polynomials by PK(T, A), and studied several of its properties. He also calculated expressions for PK(T, A) up to
1.8. GENERALIZED HERMITE POLYNOMIALS 43 к = 4. Crowther (1975) called these polynomials Hayakawa polynomials and further extended them to PK(T,A, B) where Τ (ρ χ η) is a complex matrix, and A(n χ η), Β (ρ χ ρ) are real symmetric matrices: ΡΚ(Τ,Α,Β) = 7rb>etr(TT') / eti(-UUf - 2tTU,)CK(-BUAU,)dU = тгЬр у etr{-(C/ + iT)(J7 + ^'^(-В/УЛС/') dJ7 = £?[СК(-В(У - lT)A(V - lT)% (1.8.2) where CK(S) is a zonal polynomial and expectation is with respect to the p.d.f. 7r-2nPetr(-W)· Prom (1.8.2) it is easily seen that PK(T,A,B) = PK(-T,A,B). For В = Jp, and Τ real, PK(T,AJP) = PK(T,A). For Τ = 0, by using invariance property and integrating over 0(n), we get Р*(0,ДВ) = B[CK(-BVAV')] = π-Ьр / / etrf-WJCici-BVAV') </У dK JA>0 JVV'=A см) = г fiarfT λ L etr("A) ^W"{n~P~l)C,(-BA)dK ip(±n)CK(In) Ja>o 1 ч CK(A)CK(- 2П)« CK(In) and hence P(0A)- (l-n) WW-1') = (#№-*)■ An upper bound for \PK(T,A, B)\ can be obtained as \PK(T,A,B)\ < etr(TT)PK(0,A,B) = (-n)Ketr(TT) ад) . Crowther (1975) has calculated the polynomials PK(T,A,B) for κ = (1), (2), (1,1), (3), (2,1), and (1,1,1).
1.9. NOTION OF RANDOM MATRIX

In this section we define basic concepts related to random matrices. The format of this section corresponds to the standard treatment of the univariate case and its step by step generalization (le Roux, 1978; Anderson, 1984; Hogg and Craig, 1994).

A matrix random phenomenon is an observable phenomenon which can be represented in matrix form and which under repeated observations yields different outcomes that are not deterministically predictable. Instead the outcomes obey certain conditions of statistical regularity. The set of descriptions of all possible outcomes which may occur on observing a matrix random phenomenon is the sample space $\mathcal{S}$. A matrix event is a subset of the sample space $\mathcal{S}$. A measure of the degree of certainty with which a given matrix event will occur when observing a matrix random phenomenon can be found by defining a probability function on subsets of the sample space $\mathcal{S}$, which assigns a probability to every matrix event according to the three postulates of Kolmogorov (Rao, 1973).

DEFINITION 1.9.1. A matrix $X\,(p\times n)$ consisting of $np$ elements $x_{11}(\cdot), x_{12}(\cdot), \ldots, x_{pn}(\cdot)$ which are real valued functions defined on the sample space $\mathcal{S}$ is a real random matrix if the range $R^{p\times n}$ of
$$\begin{pmatrix} x_{11}(\cdot) & \cdots & x_{1n}(\cdot)\\ \vdots & & \vdots\\ x_{p1}(\cdot) & \cdots & x_{pn}(\cdot) \end{pmatrix}$$
consists of Borel sets of $np$-dimensional real space and if for each Borel set $B$ of real $np$-tuples, arranged in a matrix,
$$\begin{pmatrix} x_{11} & \cdots & x_{1n}\\ \vdots & & \vdots\\ x_{p1} & \cdots & x_{pn} \end{pmatrix},$$
in $R^{p\times n}$, the set
$$\left\{ s\in\mathcal{S} : \begin{pmatrix} x_{11}(s_{11}) & \cdots & x_{1n}(s_{1n})\\ \vdots & & \vdots\\ x_{p1}(s_{p1}) & \cdots & x_{pn}(s_{pn}) \end{pmatrix} \in B \right\}$$
is an event in $\mathcal{S}$.

Now that we have defined a random matrix, let us define its probability density function. Throughout this book we shall consider only real continuous random matrices. Furthermore, no distinction will be made between a random matrix and its realization.

DEFINITION 1.9.2. A scalar function $f_X(X)$ such that

(i) $f_X(X) \ge 0$
(ii) $\int_X f_X(X)\,dX = 1$ and

(iii) $P(X\in A) = \int_A f_X(X)\,dX$

where $A$ is a subset of the space of realizations of $X$, defines the probability density function (p.d.f.) of the random matrix $X$.

DEFINITION 1.9.3. A scalar function $f_{X,Y}(X,Y)$ such that

(i) $f_{X,Y}(X,Y) \ge 0$

(ii) $\int_Y\int_X f_{X,Y}(X,Y)\,dX\,dY = 1$ and

(iii) $P((X,Y)\in A) = \iint_A f_{X,Y}(X,Y)\,dX\,dY$

where $A$ is a subset of the space of realizations of $(X,Y)$, defines the joint (bimatrix variate) p.d.f. of $X$ and $Y$.

DEFINITION 1.9.4. Let the random matrices $X\,(p\times n)$ and $Y\,(r\times s)$ have the joint p.d.f. $f_{X,Y}(X,Y)$. Then

(i) the marginal p.d.f. of $X$ is defined by
$$f_X(X) = \int_Y f_{X,Y}(X,Y)\,dY,$$
and

(ii) the conditional p.d.f. of $X$ given $Y$ is defined by
$$f_{X|Y}(X|Y) = \frac{f_{X,Y}(X,Y)}{f_Y(Y)}, \qquad f_Y(Y) > 0,$$
where $f_Y(Y)$ is the marginal p.d.f. of $Y$.

Likewise, one can define the marginal p.d.f. of $Y$, and the conditional p.d.f. of $Y$ given $X$. Two random matrices $X\,(p\times n)$ and $Y\,(r\times s)$ are independently distributed if and only if
$$f_{X,Y}(X,Y) = f_X(X)\, f_Y(Y)$$
where $f_X(X)$ and $f_Y(Y)$ are the marginal densities of $X$ and $Y$ respectively.

DEFINITION 1.9.5. The moment generating function (m.g.f.) of the random matrix $X\,(p\times n)$ is defined as
$$M_X(Z) = \int_X \operatorname{etr}(ZX')\, f_X(X)\,dX$$
where $Z\,(p\times n)$ is a real arbitrary matrix.

A function $M_X(Z)$ is a m.g.f. if and only if it is positive and continuous in a neighborhood of $Z = 0$, where $M_X(0) = 1$. In this case, the p.d.f. is determined uniquely by the m.g.f.
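The characteristic function (c.f.) of a random matrix $X\,(p\times n)$ is defined by $\phi_X(Z) = M_X(\iota Z)$. The m.g.f. of a bimatrix variate distribution is defined by
$$M_{X_1,X_2}(Z_1,Z_2) = E[\exp\{\operatorname{tr}(Z_1X_1') + \operatorname{tr}(Z_2X_2')\}] = \int_{X_1}\int_{X_2} \exp\{\operatorname{tr}(Z_1X_1') + \operatorname{tr}(Z_2X_2')\}\, f_{X_1,X_2}(X_1,X_2)\,dX_1\,dX_2.$$
The function $M_{X_1,X_2}(Z_1,Z_2)$ is a m.g.f. if and only if it is positive and continuous in the neighborhood of $Z_1 = 0$ and $Z_2 = 0$, where $M_{X_1,X_2}(0,0) = 1$. The m.g.f.'s of the marginal distributions of $X_j$, $j = 1, 2$, are given by
$$M_{X_1}(Z_1) = M_{X_1,X_2}(Z_1,0) \quad\text{and}\quad M_{X_2}(Z_2) = M_{X_1,X_2}(0,Z_2)$$
respectively. In this case the joint p.d.f. $f_{X_1,X_2}(X_1,X_2)$ is determined uniquely.

Let $X\,(p\times n)$ be a random matrix and $h(X) = (h_{ij}(X))$ where $h_{ij} : R^{p\times n}\to R$, $i = 1,\ldots,r$, $j = 1,\ldots,s$. Then the expected value of the function $h(X)$ is an $r\times s$ matrix defined by
$$E[h(X)] = (E(h_{ij}(X)))$$
when $E(h_{ij}(X))$ exists. From the above it is an easy consequence that

(i) $E(A) = A$, $A$ a constant matrix,

(ii) for $A\,(p\times r)$ and $B\,(s\times q)$, $E[A\,h(X)\,B] = A\,E[h(X)]\,B$,

(iii) for $h_1(X)$ and $h_2(X)$ of the same order, $E\{h_1(X) + h_2(X)\} = E\{h_1(X)\} + E\{h_2(X)\}$.

Thus for the random matrix $X\,(p\times n)$, the mean matrix is given by $E(X) = (E(x_{ij}))$. The $pn\times rs$ covariance matrix of the random matrices $X\,(p\times n)$ and $Y\,(r\times s)$ is defined by
$$\operatorname{cov}(X,Y) = \operatorname{cov}(\operatorname{vec}(X'),\operatorname{vec}(Y')) = E\{(\operatorname{vec}(X') - E\operatorname{vec}(X'))(\operatorname{vec}(Y') - E\operatorname{vec}(Y'))'\}$$
$$= E\{\operatorname{vec}(X')(\operatorname{vec}(Y'))'\} - E\{\operatorname{vec}(X')\}\,E\{(\operatorname{vec}(Y'))'\}$$
$$= \begin{pmatrix} \operatorname{cov}(x_1^*,y_1^*) & \operatorname{cov}(x_1^*,y_2^*) & \cdots & \operatorname{cov}(x_1^*,y_r^*)\\ \vdots & \vdots & & \vdots\\ \operatorname{cov}(x_p^*,y_1^*) & \operatorname{cov}(x_p^*,y_2^*) & \cdots & \operatorname{cov}(x_p^*,y_r^*) \end{pmatrix}$$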
PROBLEMS 47 where χ*' and у*- are the 2th and jth rows of the matrices X and Υ respectively, г = 1,... ,p and j = 1,... ,r. As a special case of above we get the covariance matrix of X as· cov(X) = cov(vec(X')) = E{vec(X')(vec(X')y} - E{vec(X')}E{(vec(X'))'} ( cov(x*) cov(x*,x*) ··· cov(x*vx*p)\ \cov(x*p,xl) co\(xp,x*2) cov(x;) / We have given most of the results needed in the book. Several other results, in addition to these, will be given in the text, with relevant references. PROBLEMS 1.1. Prove that 1.2. For B(pxp) >0, prove that Lo n>=2det(A(Q))*- dA Мйь---АШ**(Я ) , where 6,- = крЧ+1 + ■ ■ ■ + kp, bj > \{j - 1), j = 1,...,p. (Olkin, 1959) 1.3. Show that ... r det(A)^-§(P+1)etr(-A)(trA)t JA ,* ч _,. . where a,- = mi Η h m,-, a,- > \{j — 1), j = 1,...,p, and (U) L lTa=2det(A(Q))^ dA = (g Η ΓΑ· · ■ ■ Л), where 6,- = Vj+i + · · · + fcp, &,· > \{j - 1), j = 1,... ,p. 1.4. Let Д = (ry) be.the matrix of correlations and dR = n?<jdr#. Then, show that detCfi)^-^^1' JO Γ;(α1;...,αρ) ) ' г аеЦКГ"^' i;{ai,...,c K > Jo<r<ip Upal\ det(№)m.+i nS=i Г(о,:
48 CHAPTER 1. PRELIMINARIES where a,j = πΐχ Η h rrij, a > \{j — 1), j = 1,... ,p and r det(fi)^-^i) Г;(6Ь...,6Р) W Уо<я</, ΐΖ=2 det(/Z(e))*-i *" Ш=1 Г(Ь) ' where fy = /cp_i+i + · · · + /cp, bj > \{j - 1), j = 1,... ,p. (Olkin, 1959) 1.5. Prove Theorem 1.4.2 using triangular decomposition of matrix A in (1.4.7). 1.6. Show that det(A)^-^(p+1) det(/p - A)b'-^+V г det(Ajap~2^^ det(7p - Α)ν*"*η'χ> Jo<a<ip YH=2 {det(A(a))-°-1 det((/p - A){a))k«-i] dA = βρ(αι,...,αρ;6ι,...,6ρ) where a3- = T%=p-j+imu bj = Epi=p-j+iku Re(aj) > \{j - 1), and Re^·) > 1.7. Show that I aet(S)^~^(l + ±tip-lS)Y"{n+Tnp) dS JS>0 ч Π ' Г[1(п + шр)] V ' 1.8. Show that for m > p, n,· > p, j = 1,..., к and η = Σ*=ι Щ-, г г n)=ictet(yJ-)^-1,-1)det(Jl> + E;.1yJ)-i("^-P-1) Л Jrr Λ/,χ) Juk>o nPa=idet((Ip + ZUUi)W) i=i 1 1 '2nfc;27 = A>(2nb--->9nfc;om) and у у П*=1 detiU^-r-V det(Jp + Σ*=1 [/,)-i("+"-P-D * = /3p(-n1,...,-n*;;-m) (Olkin and Rubin, 1964) 1.9. Show that /yeR,xn Π det((/p + ГГ)И)-^ det(/p + YY')~"n dY a=l = (2тг)> ^(αχ,.-.,αρ) Γ^Οχ + Ιη,.-.,Ορ + Ιη)'
PROBLEMS 49 where a,j = πΐχ + ... + rrij, and Re(a,j) > \{j — 1), j = 1,... ,p. 1.10. Show that for symmetric R(p χ ρ), / · · · / π садг-*™ det (ip - έ ^)M(P+1) Zi>0 г г mFn(<*u ... ,ате; А,..., &; Д(/р - Σ^)) Π ^ t=l t=l Щ=1 Гр(а^)Гр(6) / ^ \ 1 р\1^г=1 &г + О) i=l where Re(6) > \{р — 1), and Re(a,i) > \{p — 1), i = 1,... ,r. 1.11. Let /(V) be a continuous scalar function of the symmetric matrix V (p xp), a* > |(p — 1), г = 1,..., /c and 6, > |(p — 1), j/ = 1,..., ^. Then show that A; £ j · · · | Π detW)*-*^ Π det(V^-s(p+D o<ELi^.1^<Bi=1 i=1 Vi>0,i=l,...,A: И^>0,^=1,...,£ A: A: £ t=l i=l j=l £ 1 = Д>(аь ..., afc_i; а*)/?р(&ь..., 6*_r, 6*)/?p( ^ 6i? -(p + 1)) i=i z / det(Z)Si=i^-5(p+Ddet(B _ Ζ)Σ'=Λ/(Ζ)όΖ. J0<Z<B (Olkin, 1979) 1.12. Let / and g be continuous scalar functions of a symmetric matrix V (ρ χ ρ), Q>i > \{v — 1), г = 1,..., /c and bj > \{p — 1), j = 1,..., £. Then show that J · · · J Π det^)*-*^1* Π detiW,·)6^*^4 °<Σ·=1*+Σ^<βί=1 i=1 Vi>0,t=ll...,A: И^>0,^=1,...,£ к . e к t Α\λί . t=l j=l t=l j=l = A>K · - -, afc_i; a*) / · · · / det(X)£-=1 *-^+1> X>0 Wj>oj=i,...,e
50 CHAPTER 1. PRELIMINARIES Π det(H^-*(p+1)/POs( Σ wj) dX Π dWJ 7=1 j=i j=i βρφι,... M-\'M) J' - · - ί det(X)^Li «i-iCH-D det(y)^-i bj~^p+l)f(X)g(Y) dX dY. 3=1 3=1 3=1 = βΡ(θΊ, · · ·, Ofc_i; ak)Pp(bi,..., 6/_i;6/) 0<X+Y<B Y>0 (Olkin, 1979) 1.13. For Re(i) > \{p - 1), prove that (i) / etr(-y) det^)'"1 Π det^)"1^) dY = Γρ(ί, *)CK(/P) 2=1 and etr(-y) det(y)*-1 Jy>o (ii) / etr(-y) det(y)'-1 Π det(Y{i))~lCK(Y) dY = Γρ(ί, *)CK(/P). */y>0 i=2 (Gupta and Nagar, 1998) 1.14. For Re(7) > \{p - 1) and Re(7 - α - β) > \{j> - 1), prove that Fin. /9.τ.η_Γρ(7)Γρ(7-α-/3) 1.15. Prove that / det^)7"^*1* det(/p + A)"p 2Ή(<*, /3; 7; -A) dA Ja>o = Γρ(7)Γρ(α + ρ - 7)rp(j9 + ρ - 7) Гр(р)Гр(а + /3 + р-7) where Re(7) > \{p — 1) and Re(p + а - 7) > |(p _ 1)· [HINT: Use (1.6.10) and transform U = A(IP + A)~\] (Subrahmaniam, 1973) 1.16. Prove that [ det(A)7"^+1) det(/p - A)^ib+VCK((IP - A)B) 2Fi(a, β; 7; A) dA Ja>o _ Гр(7)Гр(р, /с)Гр(7 + ρ - a - β, /с) Гр(р + 7 ~ а, /с)Гр(7 + ρ - /5, /с) " (Subrahmaniam, 1973; Kabe, 1979)
PROBLEMS 51 1.17. For Β (ρ χ ρ) symmetric positive definite matrix, show that (i) / det(A)a"2 и*1* det(/p + CA)-a~b dA J0<A<B = βρ(α, i(p + 1)) det(B)a 2FX (a, a + 6; a + i(p + 1); -ВС), where Re(-BC) < Ip and Re(o) > \{p - 1), and (ii) f det(A)e_*i,H-1)det(/- + CA)-e-bdA .ΛΑ>Β = Д>(&, i(p + 1)) det(C)-a"6 det(B)-6 2F1(6,a + 6;6 + ^(p + l);-(BC)-1), where Re(-(BC)~l) < Jp, and Re(6) > \{p- 1). 1.18. Prove that / det^)*"^1) det(/p + A)~b det(/p + BA)"C dA JA>0 = βρ(α, b + c-a) det(B)~c 2Fi(b + с - α, с; 6 + с; Ιρ - β-1), where Re(/P - B~l) < Jp, Re(6 + с - α) > \{p - 1), and Re(a) > |(p - 1). 1.19. For С (ρ χ ρ) symmetric positive definite matrix, prove that (i) / det(A)a"2(p+1)etr(-AC)dA = det(C)-a7p(a,C*BC*) J0<A<B where Re (a) > \{p — 1), and (ii) / det(A)a-^p+1) det(/p - CA)6"^1) <L4 = det(C)"a/3p(a, 6, С*ВС*) Л)<Л<В where Re(a) > \{p - 1), Re(6) > \{p - 1), and С±В& < Ip. 1.20. For Re(a) > \{p - 1), Re(6) > \{p - 1), and 0 < В < Ip prove that f det^"^1) det(/p - A^^^F^a, β; ъ АВ) dA J0<A<Ip = /3ρ(α,6)3^2(α,α,/3;α + 6,7;β) (Subrahmaniam, 1973) 1.21. For Υ (pxn) and X(pxn), rank(X) = ρ < η, show that f etr(AY' - XX') det(XX')c~* ^ φ(α> c; **') ^ тгЬтр[с + ^(п-р-1)] , 1, . 1 1л
52 CHAPTER 1. PRELIMINARIES where Re(a) > \{p — 1) and Re(c) > —\n + m. 1.22. Prove that / etr(-Xy) det(Y)b~12^+l\Fl(α, α - с + ^(p + 1); 6; -У) JY>0 ч Z ' = rp(6)det(X)6"a^(a,c;X). 1.23. Prove that | det(y)6"2(p+1) etr(-AY) Φ(α, с; У) ОУ = Гр(Ь)Гр[Ь-с+|(р + 1)] Гр[а + 6-с+^(р + 1)] 2ii(b,b-c+-(p+l);o + 6-c+-(p+l);/p-A·), where Re(X) > 0, Re(6 - с) > -1, and Re(a) > \{p - 1). 1.24. Prove that ί etr(-Xy) det(y)6"2(p+1) xFi(a; c; AY) dY = Гр(6) det(X)"6 2Fi(a, 6; c; AX"1), where Re(AX_1) < Ip. 1.25. Show that for S G RmX7\ n>mandp<$, / eti(-SX' - XX')pFq(au ... ,ар;6ь ... ,6P; XX') dX JS£Rmxn ч 'и №i)*-■·№*)* fc! where 7 = |(n — m — 1). 1.26. Let the elements of a matrix A be functions of a random variable x. Let A be symmetric positive definite for all values of x. Then prove that E(A~l) — {E(A)}~1 is positive semidefinite, provided E(A~l) and E(A) exist. (Groves and Rothenberg, 1969) 1.27. Let Χ {ρ χ ρ) > 0 be a random matrix with p.d.f. fx(X)- Show that £[det(X)] = / det(X)fdet{x)(det(X))d(det(X)) (le Roux, 1978)
PROBLEMS 53 1.28. Let X G Rpxn and Υ G RqXTn be random matrices with joint p.d.f. f(X,Y). Let gi(X) and д2(У) denote the marginal densities and hi(X\Y) and h2(Y\X) be the conditional densities. Assume f(X,Y), gi(X), 9ι(Υ), hi(X\Y), and h2(Y\X) are defined for all X G RpXn and Υ G R9Xm. Suppose there exists Y0 G R9Xm such that h2(Y0\X) φ 0 for all X G RpXn. Then show that fiyyl_^2(TOl№) дл,г;-/с д2(Го|Х) > where /с is a constant. (Gupta and Varga, 1992)
CHAPTER 2

MATRIX VARIATE NORMAL DISTRIBUTION

2.1. INTRODUCTION

The random variable $x$, with the p.d.f.
$$(2\pi\sigma^2)^{-\frac12} \exp\left\{-\frac{1}{2\sigma^2}(x-\mu)^2\right\}, \qquad x\in R, \qquad (2.1.1)$$
where $\mu\in R$, is said to have a normal distribution with mean $\mu$ and variance $\sigma^2$. The multivariate generalization of (2.1.1) for $x = (x_1,\ldots,x_p)'$ is
$$(2\pi)^{-\frac12 p} \det(\Sigma)^{-\frac12} \operatorname{etr}\left\{-\tfrac12 \Sigma^{-1}(x-\mu)(x-\mu)'\right\}, \qquad x\in R^p,\ \mu\in R^p,\ \Sigma > 0, \qquad (2.1.2)$$
and the random vector $x$ is said to have a multivariate normal distribution, denoted by $x\sim N_p(\mu,\Sigma)$, with mean vector $\mu$ and covariance matrix $\Sigma$. This distribution has been studied extensively and plays a key role in multivariate statistical analysis. In this chapter, we discuss its matrix variate generalization, i.e., the matrix variate normal distribution, which is one of the most important matrix variate distributions.

2.2. DENSITY FUNCTION

DEFINITION 2.2.1. The random matrix $X\,(p\times n)$ is said to have a matrix variate normal distribution with mean matrix $M\,(p\times n)$ and covariance matrix $\Sigma\otimes\Psi$ where $\Sigma\,(p\times p) > 0$ and $\Psi\,(n\times n) > 0$, if $\operatorname{vec}(X') \sim N_{pn}(\operatorname{vec}(M'), \Sigma\otimes\Psi)$.

We shall use the notation $X\sim N_{p,n}(M,\Sigma\otimes\Psi)$. We now derive the density of the random matrix $X$.

THEOREM 2.2.1. If $X\sim N_{p,n}(M,\Sigma\otimes\Psi)$, then the p.d.f. of $X$ is given by
$$(2\pi)^{-\frac12 np} \det(\Sigma)^{-\frac12 n} \det(\Psi)^{-\frac12 p} \operatorname{etr}\left\{-\tfrac12 \Sigma^{-1}(X-M)\Psi^{-1}(X-M)'\right\}, \qquad X\in R^{p\times n},\ M\in R^{p\times n}. \qquad (2.2.1)$$
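As a numerical illustration of the statement of Theorem 2.2.1, the following sketch evaluates the density of a $2\times 2$ matrix normal variate in two ways: as the multivariate normal density of $\operatorname{vec}(X')$ with covariance $\Sigma\otimes\Psi$, and through the trace form (2.2.1). It is a check at one point under assumed numerical values of $\Sigma$, $\Psi$, $M$ and $X$, not a proof; the helper functions are written out in plain Python and their names are illustrative.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kron(A, B):
    """Kronecker product A (x) B as a list of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def det(A):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def inv2(A):
    """Inverse of a 2 x 2 matrix."""
    d = det(A)
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

p, n = 2, 2
Sigma = [[2.0, 0.6], [0.6, 1.0]]      # assumed p x p positive definite matrix
Psi = [[1.5, -0.4], [-0.4, 0.8]]      # assumed n x n positive definite matrix
M = [[0.5, -1.0], [0.0, 2.0]]
X = [[1.0, 0.2], [-0.3, 1.1]]

Dev = [[X[i][j] - M[i][j] for j in range(n)] for i in range(p)]
x = [v for row in Dev for v in row]   # vec((X - M)') stacks the rows of X - M

K = kron(Sigma, Psi)                          # covariance of vec(X')
K_inv = kron(inv2(Sigma), inv2(Psi))          # (Sigma (x) Psi)^{-1} = Sigma^{-1} (x) Psi^{-1}
quad = sum(x[r] * K_inv[r][c] * x[c] for r in range(p * n) for c in range(p * n))
dens_vec = (2 * math.pi) ** (-0.5 * p * n) * det(K) ** -0.5 * math.exp(-0.5 * quad)

T = matmul(matmul(inv2(Sigma), Dev), matmul(inv2(Psi), transpose(Dev)))
tr = sum(T[i][i] for i in range(p))
dens_mat = ((2 * math.pi) ** (-0.5 * n * p) * det(Sigma) ** (-0.5 * n)
            * det(Psi) ** (-0.5 * p) * math.exp(-0.5 * tr))
print(dens_vec, dens_mat)
```

The two printed values should coincide up to rounding, reflecting the identities $\det(\Sigma\otimes\Psi) = \det(\Sigma)^n\det(\Psi)^p$ and $x'(\Sigma^{-1}\otimes\Psi^{-1})x = \operatorname{tr}\{\Sigma^{-1}(X-M)\Psi^{-1}(X-M)'\}$.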
Proof: Let $x = \operatorname{vec}(X')$ and $m = \operatorname{vec}(M')$. Then, according to Definition 2.2.1, $x\sim N_{pn}(m,\Sigma\otimes\Psi)$, and its p.d.f. is
$$(2\pi)^{-\frac12 pn} \det(\Sigma\otimes\Psi)^{-\frac12} \operatorname{etr}\left\{-\tfrac12(\Sigma\otimes\Psi)^{-1}(x-m)(x-m)'\right\}.$$
Using Theorems 1.2.21 and 1.2.22, we get
$$\det(\Sigma\otimes\Psi)^{-\frac12} = \det(\Sigma)^{-\frac12 n}\det(\Psi)^{-\frac12 p} \qquad (2.2.2)$$
$$\operatorname{tr}\{(\Sigma\otimes\Psi)^{-1}(x-m)(x-m)'\} = \operatorname{tr}\{(\Sigma^{-1}\otimes\Psi^{-1})(x-m)(x-m)'\} = \operatorname{tr}\{\Sigma^{-1}(X-M)\Psi^{-1}(X-M)'\}. \qquad (2.2.3)$$
Now, from (2.2.2) and (2.2.3), the result (2.2.1) is easily established. ■

This distribution belongs to the class of matrix variate elliptically contoured distributions studied in Chapter 9. In particular for $M = 0$, the distribution belongs to (i) the class of right spherical distributions if $\Psi = I_n$, (ii) the class of left spherical distributions if $\Sigma = I_p$, and (iii) the class of spherical distributions if $\Psi = I_n$ and $\Sigma = I_p$.

The matrix variate normal distribution arises when sampling from a multivariate normal population. Let $x_1,\ldots,x_N$ be a random sample of size $N$ from $N_p(\mu,\Sigma)$. Define the observation random matrix (e.g., see Roy, 1957; Siotani, Hayakawa and Fujikoshi, 1985) as
$$X = (x_1,\ldots,x_N) = \begin{pmatrix} x_{11} & \cdots & x_{1N}\\ x_{21} & \cdots & x_{2N}\\ \vdots & & \vdots\\ x_{p1} & \cdots & x_{pN} \end{pmatrix}, \qquad (2.2.4)$$
then $X' \sim N_{N,p}(e\mu', I_N\otimes\Sigma)$, where $e\,(N\times 1) = (1,\ldots,1)'$.

2.3. PROPERTIES

In this section, we study various properties of the matrix variate normal distribution.

THEOREM 2.3.1. If $X\sim N_{p,n}(M,\Sigma\otimes\Psi)$, then $X'\sim N_{n,p}(M',\Psi\otimes\Sigma)$.

Proof: It suffices to prove that the exponents occurring in the densities of $\operatorname{vec}(X')$ and $\operatorname{vec}(X)$ are equal. This, however, follows easily from Theorem 1.2.22. ■

THEOREM 2.3.2. If $X\sim N_{p,n}(M,\Sigma\otimes\Psi)$, then the characteristic function of $X$ is
$$\phi_X(Z) = \operatorname{etr}\left(\iota Z'M - \tfrac12 Z'\Sigma Z\Psi\right). \qquad (2.3.1)$$
2.3. PROPERTIES 57 Proof: We have φχ(Ζ) = E{etr(iXZ')}, l = л/3! = E[exp{L(vec(X'))'vec(Z')}}. Now we know that vec(X') ~ ATpn(vec(M/), Σ <8> Ψ). Hence, from the characteristic function of a multivariate normal distribution, we get φχ(Ζ) = exp{4vec(M'))'vec(Z') - i(vec(Z'))'№ ® φ) vec(Z')} = eti (t,Z'Μ-^Ζ'ΣΖΦ). The last equality follows from Theorem 1.2.22. ■ THEOREM 2.3.3. Let Χ ~ ΛΓρ,η(Μ,Σ<Ε>Φ), and Μ = (m^·), Σ = (σ«), Φ = (?Ы· ТЛеп, ft> EfaijiXw) = ahi2^hJ2 + miihmi2J2 (llj &\Xiij1Xi2J2'Cizh) = ™ΐ\3\σWzV323Z ' ^12^2^113^1 J3 ~r 'rrii3J30'iii2^jiJ2 ' ^ilj 1^1232^1333 and (ill) ■ty{Xi1j1Xi2J2Xi333Xi4H) = σηχΑ.Ψ3\3\σΪ2Ϊ3Ψ3233 ' σ4Ϊ2Ψ3ΐ32σ^ΐ3 Ψ3Λ33 + <JiiizWji3z(^ui24)U32 ~^~ rnii3irni232<JuizWjAl3z + ^iiji^13^3^1412^4^2 ' 'rrii232™"i333<*iiU/lr3i34 ' ™i\3\™i\3\01213^3233 ' ^i4J4^i2J2°ν4Ϊ3Ύ}3i33 ' 'ηίίΪ4347ηΐ·333σ4^2/ψ3ΐ32 ' 'rri4Jirrii232rrii3J3'rrii4J4· Proof: Prom Theorem 2.3.2, the characteristic function of X is ^x(Z) = exp{MZ)}, (2.3.2) where ρ η -ι ρ ρ η η Μ^) = * Σ Σ mvzv ~ 9 Σ Σ Σ Σ ζΐόΨό^σα (2.3.3) i=l j=l Ζ ΐ=1 ί=1 A:=l j=l and Z = (ζ^·). Now, from (2.3.2) ■^-φχ(Ζ) = exp{h(Z)}-^-h(Z) (2.3.4) and ρ η ρ Π -Ι , ρ П ρ Π 71 ΣΣ^ζϋ - 91Σ Σ 4^·σ« + Σ Σ Σ ^Λ^σα t=l j=l z ^ i=l j=l t=l fc=l j=l ppn ρ ρ η η ν"| + Σ Σ Σ ЪзФззЫ** + Σ Σ Σ Σ ζϋ^^σα \ г=1 ί=1 j=l г=1 ί=1 fc=l j=l ' J L j=l г=1 ί=1 fc=l 3= Φ* φί Φ5
58 CHAPTER 2. MATRIX VARIATE NORMAL DISTRIBUTION urriix ζίΐ3ιΨήήσ44 + Σ UjikZiik^ii fc=l Φή ρ ρ π + Σ ^hh^j^tn + Σ Σ ifrhkZtkVtv. ί=1 ί=1 fc=l Фч φ%ι φ3\ (2.3.5) Substituting from (2.3.5) in (2.3.4), we get E(Xiih) = --^—Φχ(Ζ)\ζ=0 = m43i' Now differentiating (2.3.4) with respect to Zt2J-2, we get d2 ^zidi^zi2J2 φχ(Ζ) = exp{h(Z)} д h(z).^-h(z)+ d2 OZiljl &Zi2j2 OZi^OZi^ h{Z) , (2.3.6) where -~^—h(Z) is obtained from (2.3.5) by replacing ζχ by z2 and j\ by j2- Further differentiating (2.3.5) with respect to zi2j-2, OZiljl OZi2j2 h(Z) = -aili2ipjlj2. (2.3.7) Hence, ι Я2 E(XhhXi2h) = ~ΪΒ . ο ^(Z)|Z=0 = ahi2i>jlJ2 + ™ΰ. £ Ozi\ji OZi2J2 ,ΤΠι. Now, from (2.3.6) &ziiji &zi2J2 VZizh φχ(Ζ) = exp{/i(Z)} dz, 9 ВД.^ВД. * d*. dz;. -a(z) + + + + UziijiUZi2j2 92 h(Z)--?-h(Z) иггззз h(Z)--?-h(Z) OZi0 7*0 d2 OZiljl Vzizh — h(Z) ■ ^-h(Z) OZi2j2OZi3j3 &ziiji a3 OZiljl &zi2J2^'Zi3J3 h(Z) (2.3.8) Using (2.3.5), (2.3.17), and d3 Vziiji ^zi2J2^Zi3J3 -h(Z)=0
2.3. PROPERTIES 59 in (2.3.8), we get d3 E(xilhxi2j2xi3h) = - φχ{Ζ)\ — ТПЧЗ\(Т12гъ'Фз23ъ + ТПг232С7Чгъ'Фз\ЗЪ ' η^χζ3ζσ^ύ2Ψ3\3i ' rriiiji'rrii2J2rrii333· ш Continuing this procedure one can also establish (iii). ■ COROLLARY 2.3.3.1. Let X ~ iVp,n(0, Σ <g> Φ), then ( V ^\ХЧ31Хг232) = σ4ΐ2Ψ3ΐ32 (ll) ■fcs{Xi1j1Xi2J2Xi333) = ^ and (ill) ■ty{Xi1j1Xi2j2Xi3j3Xi4j4) = (Jixi2W3\32(Jizi\{r3z3\ 'σϊ\ϊζΨ3\3ζσΪ2ΪΑ.Ψ323\ + ^1114 ψji J4 G%2iz Ψ3233 · Proof: Substitute M = 0 in Theorem 2.3.3. ■ van der Merwe (1980) has derived expectations of the traces of certain functions of X. Some of these are given in the next theorem. THEOREM 2.3.4. Let X ~ NPy7l(M, Σ <g> Φ) and Σ = (σ^·), Φ = (^). Then, for any constant matrix A(p χ n) and a = 0,1,2,...,.., we have (i) E{ti{XX')) = tr(E)txfr) (it) E{ti{XX'{AA')a)) = ΪΓ(Σ(Α4')α)ΐΓ(Φ) (iii) E{ti2{XA!)) = ϊγ(ΣΑΦΑ') (iv) E{ti(XA!f) = ϊγ(ΣΑΦΑ') (υ) E(tr(XA')tr(XA'(AA')a)) = ΪΓ(ΣΑΦΑ'(Α4')α). Proof: Here we give the proof for (i) and (iii); the others can be similarly derived. (i) E(ti(xx')) = s(ttw) = Σ Σ £(*«*«) г=1 j=l г=1 j=\ = ЕЕ^^ = М2)1х(Ф). (iii) £(tr2(XA')) = £(tr2tf), where Г = ΧΑ' ~ JVPiP(0, Σ ® (ΑΨΑ')) = £(£</«)2 = £(ΣΣ№) г=1 г=1 j=l = £i>to) = έί>* (where ЛФЛ' = Wfc)) t=lj=l г=1j=l = ϊγ(ΣΑΦΑ'). ■ Η. Μ. Nel (1977) has derived expectations of certain matrix valued functions of X, some of which are given in the next five theorems.
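THEOREM 2.3.5. Let $X\sim N_{p,n}(M,\Sigma\otimes\Psi)$, then

(i) $E(X'AX) = \operatorname{tr}(\Sigma A')\Psi + M'AM$, $A\,(p\times p)$

(ii) $E(XAX') = \operatorname{tr}(A'\Psi)\Sigma + MAM'$, $A\,(n\times n)$

(iii) $E(XAX) = \Sigma A'\Psi + MAM$, $A\,(n\times p)$

(iv) $E(\operatorname{tr}(AX)X) = \Sigma A'\Psi + \operatorname{tr}(AM)M$, $A\,(n\times p)$

(v) $E(\operatorname{tr}(AX)X') = \Psi A\Sigma + \operatorname{tr}(AM)M'$, $A\,(n\times p)$

(vi) $E(\operatorname{tr}(AX')X) = \Sigma A\Psi + \operatorname{tr}(AM')M$, $A\,(p\times n)$

(vii) $E(\operatorname{tr}(AX')X') = \Psi A'\Sigma + \operatorname{tr}(AM')M'$, $A\,(p\times n)$.

Proof: (i) Let $X = (x_{ij})$ and $A = (a_{ij})$, then the $(i,j)$th element of $X'AX$ is $\sum_{k=1}^p\sum_{t=1}^p x_{ti}a_{tk}x_{kj}$, and from Theorem 2.3.3 we get
$$E(X'AX) = \left(E\left(\sum_{k=1}^p\sum_{t=1}^p x_{ti}a_{tk}x_{kj}\right)\right) = \left(\sum_{k=1}^p\sum_{t=1}^p a_{tk}(\sigma_{tk}\psi_{ij} + m_{ti}m_{kj})\right)$$
$$= \left(\psi_{ij}\sum_{k=1}^p\sum_{t=1}^p a_{tk}\sigma_{tk} + \sum_{k=1}^p\sum_{t=1}^p a_{tk}m_{ti}m_{kj}\right) = \operatorname{tr}(A'\Sigma)\Psi + M'AM.$$

(ii) From Theorem 2.3.1, $X'\sim N_{n,p}(M',\Psi\otimes\Sigma)$. Therefore, result (ii) follows from result (i).

(iii) The $(i,j)$th element of $XAX$ is $\sum_{t=1}^p\sum_{k=1}^n x_{ik}a_{kt}x_{tj}$ and hence using Theorem 2.3.3, we get
$$E(XAX) = \left(E\left(\sum_{t=1}^p\sum_{k=1}^n x_{ik}a_{kt}x_{tj}\right)\right) = \left(\sum_{t=1}^p\sum_{k=1}^n a_{kt}(\sigma_{it}\psi_{kj} + m_{ik}m_{tj})\right) = \Sigma A'\Psi + MAM.$$

(iv) The $(i,j)$th element of $\operatorname{tr}(AX)X$ is $x_{ij}\sum_{k=1}^n\sum_{t=1}^p a_{kt}x_{tk}$ and hence from Theorem 2.3.3, we get
$$E(\operatorname{tr}(AX)X) = \left(E\left(x_{ij}\sum_{k=1}^n\sum_{t=1}^p a_{kt}x_{tk}\right)\right) = \left(\sum_{k=1}^n\sum_{t=1}^p a_{kt}(\sigma_{it}\psi_{jk} + m_{ij}m_{tk})\right) = \Sigma A'\Psi + \operatorname{tr}(AM)M.$$

(v)-(vii) Using Theorem 2.3.1, the result (iv), and noting that $\operatorname{tr}(AX) = \operatorname{tr}((AX)') = \operatorname{tr}(A'X')$, the results follow. ■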
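The element-by-element argument used in this proof can be replayed numerically. The sketch below takes the second-moment formula $E(x_{ij}x_{kl}) = \sigma_{ik}\psi_{jl} + m_{ij}m_{kl}$ of Theorem 2.3.3 (i) as given, builds the expectations in parts (i)-(iii) entry by entry, and compares them with the closed matrix forms. It therefore checks only the index algebra, not the distributional result; the numerical matrices and helper names are arbitrary assumptions.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

p, n = 2, 3
Sigma = [[2.0, 0.6], [0.6, 1.0]]                               # p x p, symmetric
Psi = [[1.5, -0.4, 0.2], [-0.4, 0.8, 0.1], [0.2, 0.1, 1.2]]    # n x n, symmetric
M = [[0.5, -1.0, 0.3], [0.0, 2.0, -0.7]]                       # p x n

def second_moment(i, j, k, l):
    """E(x_ij x_kl) from Theorem 2.3.3 (i)."""
    return Sigma[i][k] * Psi[j][l] + M[i][j] * M[k][l]

# (i) E(X'AX), A is p x p, result is n x n
A_pp = [[1.0, -2.0], [0.5, 3.0]]
lhs_i = [[sum(A_pp[t][k] * second_moment(t, i, k, j) for t in range(p) for k in range(p))
          for j in range(n)] for i in range(n)]
rhs_i = add(scale(trace(matmul(Sigma, transpose(A_pp))), Psi),
            matmul(matmul(transpose(M), A_pp), M))

# (ii) E(XAX'), A is n x n, result is p x p
A_nn = [[1.0, 2.0, 0.0], [-1.0, 0.5, 3.0], [0.4, 0.0, -2.0]]
lhs_ii = [[sum(A_nn[k][l] * second_moment(i, k, j, l) for k in range(n) for l in range(n))
           for j in range(p)] for i in range(p)]
rhs_ii = add(scale(trace(matmul(transpose(A_nn), Psi)), Sigma),
             matmul(matmul(M, A_nn), transpose(M)))

# (iii) E(XAX), A is n x p, result is p x n
A_np = [[1.0, 0.0], [2.0, -1.0], [0.3, 0.7]]
lhs_iii = [[sum(A_np[k][t] * second_moment(i, k, t, j) for k in range(n) for t in range(p))
            for j in range(n)] for i in range(p)]
rhs_iii = add(matmul(matmul(Sigma, transpose(A_np)), Psi),
              matmul(matmul(M, A_np), M))

print(max_abs_diff(lhs_i, rhs_i), max_abs_diff(lhs_ii, rhs_ii), max_abs_diff(lhs_iii, rhs_iii))
```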
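It may be noted that some of the results given in Theorem 2.3.4 can be derived from Theorem 2.3.5.

THEOREM 2.3.6. Let $X\sim N_{p,n}(M,\Sigma\otimes\Psi)$, then

(i) $E(XAXBX) = MA\Sigma B'\Psi + \Sigma B'M'A'\Psi + \Sigma A'\Psi BM + MAMBM$, $A\,(n\times p)$, $B\,(n\times p)$,

(ii) $E(X'AXBX) = M'A\Sigma B'\Psi + \operatorname{tr}(\Sigma B'M'A')\Psi + \operatorname{tr}(A\Sigma)\Psi BM + M'AMBM$, $A\,(p\times p)$, $B\,(n\times p)$,

(iii) $E(X'AX'BX) = \operatorname{tr}(\Sigma B')M'A\Psi + \operatorname{tr}(AM'B\Sigma)\Psi + \Psi A'\Sigma BM + M'AM'BM$, $A\,(p\times n)$, $B\,(p\times p)$,

(iv) $E(X'AXBX') = \operatorname{tr}(B\Psi)M'A\Sigma + \Psi B'M'A'\Sigma + \operatorname{tr}(A\Sigma)\Psi BM' + M'AMBM'$, $A\,(p\times p)$, $B\,(n\times n)$,

(v) $E(XAX'BX') = MA\Psi B'\Sigma + \operatorname{tr}(AM'B\Psi)\Sigma + \operatorname{tr}(A\Psi)\Sigma BM' + MAM'BM'$, $A\,(n\times n)$, $B\,(p\times n)$,

(vi) $E(X'AX'BX') = M'A\Psi B'\Sigma + \Psi B'MA'\Sigma + \Psi A'\Sigma BM' + M'AM'BM'$, $A\,(p\times n)$, $B\,(p\times n)$,

(vii) $E(XAX'BX) = \operatorname{tr}(B\Sigma)MA\Psi + \Sigma B'MA'\Psi + \operatorname{tr}(A\Psi)\Sigma BM + MAM'BM$, $A\,(n\times n)$, $B\,(p\times p)$.

Proof: (i) The $(i,j)$th element of $XAXBX$ is $\sum_{g=1}^p\sum_{\ell=1}^n\sum_{t=1}^p\sum_{k=1}^n x_{ik}a_{kt}x_{t\ell}b_{\ell g}x_{gj}$. Hence,
$$E(XAXBX) = \left(\sum_{g=1}^p\sum_{\ell=1}^n\sum_{t=1}^p\sum_{k=1}^n a_{kt}b_{\ell g}\,E(x_{ik}x_{t\ell}x_{gj})\right)$$
$$= \left(\sum_{g=1}^p\sum_{\ell=1}^n\sum_{t=1}^p\sum_{k=1}^n a_{kt}b_{\ell g}\{m_{ik}\sigma_{tg}\psi_{\ell j} + m_{t\ell}\sigma_{ig}\psi_{kj} + m_{gj}\sigma_{it}\psi_{k\ell} + m_{ik}m_{t\ell}m_{gj}\}\right)$$
$$= MA\Sigma B'\Psi + \Sigma B'M'A'\Psi + \Sigma A'\Psi BM + MAMBM.$$
The proofs of (ii), (iii), and (iv) follow similar steps. For the proof of (v), (vi), and (vii), notice that $X'\sim N_{n,p}(M',\Psi\otimes\Sigma)$ and use the results (ii), (i), and (iv) respectively. ■

THEOREM 2.3.7. Let $X\sim N_{p,n}(M,\Sigma\otimes\Psi)$, then

(i) $E(\operatorname{tr}(X'AXB)X) = \operatorname{tr}(A'\Sigma)\operatorname{tr}(B'\Psi)M + \Sigma A'MB'\Psi + \Sigma AMB\Psi + \operatorname{tr}(M'AMB)M$, $A\,(p\times p)$, $B\,(n\times n)$,

(ii) $E(\operatorname{tr}(AX)XBX) = MB\Sigma A'\Psi + \Sigma A'\Psi BM + \operatorname{tr}(AM)\Sigma B'\Psi + \operatorname{tr}(AM)MBM$, $A\,(n\times p)$, $B\,(n\times p)$,

(iii) $E(\operatorname{tr}(AX)X'BX) = M'B\Sigma A'\Psi + \Psi A\Sigma BM + \operatorname{tr}(AM)\operatorname{tr}(B\Sigma)\Psi + \operatorname{tr}(AM)M'BM$, $A\,(n\times p)$, $B\,(p\times p)$.

Proof: (i) The $(r,s)$th element of $\operatorname{tr}(X'AXB)X$ is
$$\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^p\sum_{t=1}^p x_{ti}x_{kj}x_{rs}a_{tk}b_{ji}.$$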
62 CHAPTER 2. MATRIX VARIATE NORMAL DISTRIBUTION Hence, E{ti{X'AXB)X) , η η ρ ρ ν = ( Σ Σ Σ Σ atkbji{rnrsatki)ij + rntiarkil>sj + rnkjart^si + mrsmtimkj) j \ t=l j=l fc=i t=l ' = tr(A'E) ϊγ(Β'Φ)Μ + ΣΑ'ΜΒ'Φ + ΣΑΜΒΦ + tr(M'AMJ3)M. (ii) The (r, s)th element of ti(AX)XBX is ρ η η ρ Σ Σ Σ Σ aijhtXrkXtsXji- ί=1 fc=l i=l j=l Hence, £(tr(AX)X£X) , ρ η η ρ ν = ( Σ Σ Σ Σ aijbkt{rnrkatj^si + rntsarj^ki + гп^а^фкз + гпгктит0г) J ^ ί=1 А:=1 г=1 j=l ' = ΜΒΣΑ'Φ + ΣΑ'ΦΒΜ + ϊγ(ΑΜ)Σ£'Φ + tr( AM)MBM. (iii) The derivation is similar to (ii). ■ THEOREM 2.3.8. Let X ~ NPi7l(M, Σ <g> Φ), then (i) E(XAXBXCX) = ΣσΦΒΣΑ'Φ + ΣΑ'ΦΒΣΟ'Φ + ϊγ(ΑΣ6"Φ)Σ£'Φ + ΜΑΜ£Σ6"Φ + МАЪС'М'В'Ъ + ЕСМ'β'Μ'Α'Φ + ΜΑΣΒ'ΦΟΜ + ЕБ'М'А'ФСМ + ΣΑ'ΦΒΜΟΜ + Μ AM ВМС Μ, Α (η χ ρ), Β (η χ ρ), С (η χ ρ), (ii) E{X'AXBXCX) = ϊγ(Σ6"Φ£ΣΑ')Φ + ϊγ(ΑΣ)Φ£Σ6"Φ + ΦΟΣΑ'ΣΒ'Φ + ΜΆΜΒΣσΨ + М'АЕС'М'Б'Ф + tr( АМВМСТ>)Ъ + М'АЕБ'ФСМ + tr( АМВТ,)ЪСМ + tr( ΑΣ)Φ£ΜΟΜ + Μ'AM BMC Μ, Α {ρ χ ρ), Β (η χ ρ), С (η χ ρ), (iii) E{XAX'BXCX) = ϊγ(£Σ)Σ6"ΦΑ'Φ + ϊγ(ΑΦ)Σ£Σ6"Φ + Σ£'Σ6"ΦΑΦ + ΜΑΜ'£Σ6"Φ + ti(MCEB)MAV + Σ6"Μ'£'ΜΑ'Φ + ΐΓ(Σβ)ΜΑΦΟΜ + ИВ'МА'ЪСМ + tr(^)E5MCM + МАМ'ВМСМ, Α{ηχ η), β (ρ χ ρ), С (η χ ρ), (iv) E{X'AX'BXCX) = ϊγ(Σ6"ΦΑ') ίτ(ΒΣ)Φ + ΦΑ'Σ£Σ6"Φ + ΦΟΣΒΣΑΦ + ΜΆΜ'£Σ6"Φ + \х(МСЪВ)М'АЪ + ΪΓ(ΑΜ,βΜΟΣ)Φ + ti(BE)M'AVCM + ΐΓ(ΑΜ,βΣ)ΦΟΜ + ФА'ЕБМСМ + М'АМ'ВМСМ, Α(ρχ η), β (ρ χ ρ), С (η χ ρ), (υ) E{X'AXBX'CX) = tr(EC"Ei4') ϊγ(ΒΦ)Φ + tr(AE) ϊγ(ΟΣ)Φ£Φ + ^(АЕС;Е)ФВ;Ф + tr(EC)M'AM^ + М'АЕС'МБ'Ф + ΐΓ(ΑΜΒΜ,σΣ)Φ + ΐΓ(ΒΦ)Μ,ΑΣσΜ + фб'М'А'есм + Ъ(АЕ)ЪВМ'СМ + М'АМВМ'СМ, Α(ρχ ρ), Β (η χ η), С (ρ χ ρ), (vi) E{X'AXBXCX') = νσΦΒΣΑ'Σ + tr(AE) ϊγ(ΟΦ)Φ£Σ + ΦΟΦΒΣΑΣ + ίΓ(Φσ)Μ;ΑΜΒΕ + ΐΓ(ΒΜσΦ)Μ,ΑΣ + ФСМ'В'М'А'Е + М'АЕБ'ФСМ' + ΐΓ(ΣΑΜΒ)ΦαΜ/ + ti{AT)^BMCM' + M'AMBMCM', Α{ρχ ρ), Β (η χ ρ), С (η χ η).
2.3. PROPERTIES 63 Proof: As in the proof of Theorem 2.3.7, the results (i)-(vi) can be derived by taking the (г, j)th element of the random matrix, substituting its expected value from Theorem 2.3.3, and converting the resulting expression in matrix form. ■ THEOREM 2.3.9. Let Χ ~ ΛΓρ,η(Μ, Σ <g> Φ), then (i) E{ti{XBXCX')XA) = Σ^Β'ΦΟΦΑ + Ъ(СЪ)Е2В'ЪА + ϊγ(Σ)ΣΒ'Φ6"ΦΑ + tr(MBE) ti(CV)MA + ti(MCVB) tr(E)MA + ЕМВМСФА + ϊγ(Μ6"ΦΒΕ)ΜΑ + ЕВ'М'МС'ФА + ЕМС'М'В'ФА + tr(MBMCM')MA, Α (η χ η), Β (η χ ρ), С (η χ η), (η) E(ti(BXCX')XAX) = ЕВЕА'ФС'Ф + tr(BE) ^(СФ)ЕА'Ф + ЕВ'ЕА'ФСФ + tr(BE) ^(СФ)МАМ + МАЕВМСФ + ЕВМСФАМ + МАЕВ'МС'Ф + ЕВ'МС'ФАМ + ϊγ(Μ6"ΜΒ')ΕΑ'Φ + ti{BMCM')MAM, Α{ηχ ρ), Β (η χ ρ), С (η χ η), (Hi) E{ti{BXCX')X'AX) = ϊγ(ΕΒΣΑ')Φ6"Φ + tr(AE) tr(BE) ϊγ(ΟΦ)Φ + ^(ЕВТА')ФСФ + tr(BE) ^(СФ)М'АМ + М'АЕВМСФ + ФС'М'В'ЕАМ + М'АЕВ'МС'Ф + ФСМ'ВЕАМ + tr(AE) ϊγ(ΜΟΜ'Β)Φ + tr(BMCM')M'AM, Α (ρ χ ρ), β (ρ χ ρ), С (η χ η), fw; £(tr(AX)XBXcx) = ев'фсеа'ф + еа'фвесф + есфаев'ф + ϊγ(ΑΜ)Μ£Σ6"Φ + ϊγ(ΑΜ)Ε6"Μ'Β'Φ + МВМСЕА'Ф + ^(МА)ЕВ'ФСМ + МВЕА'ФСМ + ΕΑΦΒΜΦΜ + tr( ΑΜ)ΜΒΜΦΜ, Α (ρ χ ρ), Β (η χ ρ), С (η χ ρ), (ν) E{ti(AX)X'BXCX) = ^(ВЕ)ФСЕА'Ф + ФАЕВЕСФ + ϊγ(ΕΒΈ6"ΦΑ)Φ + ϊγ(ΜΑ)Μ'ΒΕ6"Φ + tr(MA) ^(МСЕВ)Ф + М'ВМСФА'Ф + tr(BE) tr(MA)VCM + М'ВЕА'ФСМ + ФАЕВМСМ + ti(MA)M'BMCM, Α(ρχ ρ), Β (ρ χ ρ), С (η χ ρ), (vi) E(ti(AX)X,BX,CX) = ФВ'ЕСФА'Ф + ^(ЕС)ФАЕВФ + ^(ЕСЕВФА)Ф + ti(MA) ^(ЕС)М'ВФ + tr(MA) ϊγ(ΜΒΈ6")Φ + М'ВМ'СЕА'Ф + ^(МА)ФВ'ЕСМ + М'ВФАЕСМ + ФАЕВМСМ + ti{AM)M'BM'CM, Α{ρχ ρ), Β (ρ χ ρ), С {ρ χ ρ), fvt»; B(tr(AY)X;BXCX;) = ^(ФС)ФАЕВЕ + ^(ВЕ)ФСФАЕ + ФС'ФАЕВ'Е + М'ВМСФАЕ + М'ВЕА'ФСМ' + ФАЕВМСМ' + ίΓ(Φσ) tr(AM)M'BE + ϊγ(ΑΜ)Φ6"Μ'ΒΈ + tr(AM) ^(ВЕ)ФСМ' + ti{AM)M'BMCM', Α(ηχ ρ), Β (ρ χ ρ), С (η χ η). Proof: The results can be derived by using the procedure described above. ■ It should be noted that Theorems 2.3.5-2.3.9 are sufficiently general to cover many cases. Neudecker and Wansbeek (1987) have given an alternative method of derivation of (v) of Theorem 2.3.8. 
They (and von Rosen, 1988b) have also given the expectation of $X\otimes X\otimes X\otimes X$, and $\operatorname{cov}(\operatorname{vec}(XAX'),\operatorname{vec}(XBX'))$ (derived in Chapter 7). Further let $X\sim N_{p,p}(0, I_p\otimes I_p)$ and define $\mu_k = E(AX)^k$, $k = 2, 3, \ldots,$
and $B = AA'$, where $A\,(p\times p)$ is a constant matrix. Then, Hudak and Richter (1996), besides many other results, have proved that $\mu_{2k-1} = 0$, and
$$\mu_{2k} = \operatorname{tr}(\mu_{2k-2})B + (2k-2)B\mu_{2k-2}, \qquad k = 2, 3, \ldots,$$
with $\mu_2 = B$. From this recurrence formula it can be deduced that $E(X^2) = I_p$, and
$$E(X^{2k}) = (p+2k-2)(p+2k-4)\cdots(p+2)\,I_p, \qquad k = 2, 3, \ldots.$$

THEOREM 2.3.10. If $X\sim N_{p,n}(M,\Sigma\otimes\Psi)$, $D\,(m\times p)$ is of rank $m\le p$ and $C\,(n\times t)$ is of rank $t\le n$, then $DXC\sim N_{m,t}(DMC, (D\Sigma D')\otimes(C'\Psi C))$.

Proof: The characteristic function of $DXC$ is
$$\phi_{DXC}(Z) = E[\operatorname{etr}(\iota DXCZ')] = E[\operatorname{etr}(\iota XZ_1')], \qquad Z_1' = CZ'D.$$
Now, from Theorem 2.3.2, we get
$$\phi_{DXC}(Z) = \operatorname{etr}\left(\iota Z_1'M - \tfrac12 Z_1'\Sigma Z_1\Psi\right) = \operatorname{etr}\left\{\iota Z'(DMC) - \tfrac12 Z'(D\Sigma D')Z(C'\Psi C)\right\}. \qquad (2.3.9)$$
Since (2.3.9) is the characteristic function of a matrix variate normal distribution with mean $DMC$ and covariance matrix $(D\Sigma D')\otimes(C'\Psi C)$, the proof is complete. ■

COROLLARY 2.3.10.1. In the above theorem, let $m = t = 1$, $D = d'\,(1\times p)$ and $C = c\,(n\times 1)$, then
$$d'Xc \sim N(d'Mc, (d'\Sigma d)(c'\Psi c)).$$
Furthermore,
$$\frac{\{d'(X-M)c\}^2}{(d'\Sigma d)(c'\Psi c)} \sim \chi_1^2.$$

COROLLARY 2.3.10.2. In the above theorem,

(i) if $m = p$, and $D = \Sigma^{-\frac12}$, then $\Sigma^{-\frac12}XC \sim N_{p,t}(\Sigma^{-\frac12}MC, I_p\otimes(C'\Psi C))$,

(ii) if $t = n$, and $C = \Psi^{-\frac12}$, then $DX\Psi^{-\frac12} \sim N_{m,n}(DM\Psi^{-\frac12}, (D\Sigma D')\otimes I_n)$.
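The covariance structure in Theorem 2.3.10 can also be seen directly from Kronecker algebra: since $\operatorname{vec}((DXC)') = (D\otimes C')\operatorname{vec}(X')$, the covariance of $\operatorname{vec}((DXC)')$ is $(D\otimes C')(\Sigma\otimes\Psi)(D\otimes C')' = (D\Sigma D')\otimes(C'\Psi C)$. The following sketch verifies both identities for one arbitrary choice of matrices; the numerical values and helper names are assumptions made for illustration only.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kron(A, B):
    """Kronecker product A (x) B as a list of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def flatten_rows(A):
    """vec(A'): the rows of A stacked into one list."""
    return [a for row in A for a in row]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

p, n = 2, 3
Sigma = [[2.0, 0.6], [0.6, 1.0]]
Psi = [[1.5, -0.4, 0.2], [-0.4, 0.8, 0.1], [0.2, 0.1, 1.2]]
D = [[1.0, 2.0], [0.0, -1.0]]                  # m x p with m = 2
C = [[1.0, 0.0], [0.5, 1.0], [-1.0, 2.0]]      # n x t with t = 2

T = kron(D, transpose(C))                      # maps vec(X') to vec((DXC)')
cov_x = kron(Sigma, Psi)                       # covariance of vec(X')
cov_y = matmul(matmul(T, cov_x), transpose(T))
expected = kron(matmul(matmul(D, Sigma), transpose(D)),
                matmul(matmul(transpose(C), Psi), C))

# The linear map itself, checked on one fixed matrix X
X = [[1.0, 0.2, -0.5], [-0.3, 1.1, 0.7]]
y_direct = flatten_rows(matmul(matmul(D, X), C))
x_flat = flatten_rows(X)
y_via_T = [sum(T[r][c] * x_flat[c] for c in range(p * n)) for r in range(len(T))]

print(max_abs_diff(cov_y, expected))
print(max(abs(a - b) for a, b in zip(y_direct, y_via_T)))
```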
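THEOREM 2.3.11. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi)$, and partition $X$, $M$, $\Sigma$, and $\Phi$ as

$$X = \begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix}, \quad M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, \quad \Phi = \begin{pmatrix} \Phi_{11} & \Phi_{12} \\ \Phi_{21} & \Phi_{22} \end{pmatrix},$$

where $X_{11}$ and $M_{11}$ are $m \times t$, $\Sigma_{11}$ is $m \times m$, and $\Phi_{11}$ is $t \times t$. Then $X_{11} \sim N_{m,t}(M_{11}, \Sigma_{11} \otimes \Phi_{11})$.

Proof: The result follows by taking $D = (I_m \;\; 0)$ and $C' = (I_t \;\; 0)$ in Theorem 2.3.10. ■

THEOREM 2.3.12. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi)$, and partition $X$, $M$, $\Sigma$, and $\Phi$ as

$$X = \begin{pmatrix} X_{1r} \\ X_{2r} \end{pmatrix} = (X_{1c} \;\; X_{2c}), \quad M = \begin{pmatrix} M_{1r} \\ M_{2r} \end{pmatrix} = (M_{1c} \;\; M_{2c}), \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, \quad \Phi = \begin{pmatrix} \Phi_{11} & \Phi_{12} \\ \Phi_{21} & \Phi_{22} \end{pmatrix},$$

where $X_{1r}$ and $M_{1r}$ are $m \times n$, $X_{1c}$ and $M_{1c}$ are $p \times t$, $\Sigma_{11}$ is $m \times m$, and $\Phi_{11}$ is $t \times t$. Then,

(i) $X_{1r} \sim N_{m,n}(M_{1r}, \Sigma_{11} \otimes \Phi)$, $X_{1c} \sim N_{p,t}(M_{1c}, \Sigma \otimes \Phi_{11})$,

(ii) $X_{2r}|X_{1r} \sim N_{p-m,n}(M_{2r} + \Sigma_{21}\Sigma_{11}^{-1}(X_{1r} - M_{1r}), \Sigma_{22\cdot 1} \otimes \Phi)$, and $X_{2c}|X_{1c} \sim N_{p,n-t}(M_{2c} + (X_{1c} - M_{1c})\Phi_{11}^{-1}\Phi_{12}, \Sigma \otimes \Phi_{22\cdot 1})$,

where $\Sigma_{22\cdot 1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$ and $\Phi_{22\cdot 1} = \Phi_{22} - \Phi_{21}\Phi_{11}^{-1}\Phi_{12}$.

Proof: (i) In Theorem 2.3.11 substitute $t = n$ to get the density of $X_{1r}$, and $m = p$ to get the density of $X_{1c}$.

(ii) Let $\Sigma^{-1} = \begin{pmatrix} \Sigma^{11} & \Sigma^{12} \\ \Sigma^{21} & \Sigma^{22} \end{pmatrix}$, then $\Sigma^{11} = \Sigma_{11\cdot 2}^{-1}$, $\Sigma^{22} = \Sigma_{22\cdot 1}^{-1}$, $\Sigma^{12} = -\Sigma_{11\cdot 2}^{-1}\Sigma_{12}\Sigma_{22}^{-1} = -\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22\cdot 1}^{-1} = (\Sigma^{21})'$, and

$$\begin{aligned}
(X - M)'\Sigma^{-1}(X - M) &= \big((X_{1r} - M_{1r})' \;\; (X_{2r} - M_{2r})'\big) \begin{pmatrix} \Sigma^{11} & \Sigma^{12} \\ \Sigma^{21} & \Sigma^{22} \end{pmatrix} \begin{pmatrix} X_{1r} - M_{1r} \\ X_{2r} - M_{2r} \end{pmatrix} \\
&= (X_{1r} - M_{1r})'\Sigma^{11}(X_{1r} - M_{1r}) + (X_{1r} - M_{1r})'\Sigma^{12}(X_{2r} - M_{2r}) \\
&\quad + (X_{2r} - M_{2r})'\Sigma^{21}(X_{1r} - M_{1r}) + (X_{2r} - M_{2r})'\Sigma^{22}(X_{2r} - M_{2r})
\end{aligned}$$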
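$$\begin{aligned}
&= (X_{1r} - M_{1r})'\big(\Sigma^{11} - \Sigma^{12}(\Sigma^{22})^{-1}\Sigma^{21}\big)(X_{1r} - M_{1r}) + (X_{1r} - M_{1r})'\Sigma^{12}(\Sigma^{22})^{-1}\Sigma^{21}(X_{1r} - M_{1r}) \\
&\quad + (X_{1r} - M_{1r})'\Sigma^{12}(X_{2r} - M_{2r}) + (X_{2r} - M_{2r})'\Sigma^{21}(X_{1r} - M_{1r}) + (X_{2r} - M_{2r})'\Sigma^{22}(X_{2r} - M_{2r}) \\
&= (X_{1r} - M_{1r})'\Sigma_{11}^{-1}(X_{1r} - M_{1r}) \\
&\quad + \big(X_{2r} - M_{2r} - \Sigma_{21}\Sigma_{11}^{-1}(X_{1r} - M_{1r})\big)'\Sigma_{22\cdot 1}^{-1}\big(X_{2r} - M_{2r} - \Sigma_{21}\Sigma_{11}^{-1}(X_{1r} - M_{1r})\big).
\end{aligned}$$

Thus, the density of $X$ can be written as

$$\begin{aligned}
f(X) &= (2\pi)^{-\frac{1}{2}np}\det(\Sigma)^{-\frac{1}{2}n}\det(\Phi)^{-\frac{1}{2}p}\operatorname{etr}\big[-\tfrac{1}{2}(X - M)'\Sigma^{-1}(X - M)\Phi^{-1}\big] \\
&= (2\pi)^{-\frac{1}{2}mn}\det(\Sigma_{11})^{-\frac{1}{2}n}\det(\Phi)^{-\frac{1}{2}m}\operatorname{etr}\big[-\tfrac{1}{2}(X_{1r} - M_{1r})'\Sigma_{11}^{-1}(X_{1r} - M_{1r})\Phi^{-1}\big] \\
&\quad \cdot (2\pi)^{-\frac{1}{2}(p-m)n}\det(\Sigma_{22\cdot 1})^{-\frac{1}{2}n}\det(\Phi)^{-\frac{1}{2}(p-m)} \\
&\quad \cdot \operatorname{etr}\big[-\tfrac{1}{2}\big(X_{2r} - M_{2r} - \Sigma_{21}\Sigma_{11}^{-1}(X_{1r} - M_{1r})\big)'\Sigma_{22\cdot 1}^{-1}\big(X_{2r} - M_{2r} - \Sigma_{21}\Sigma_{11}^{-1}(X_{1r} - M_{1r})\big)\Phi^{-1}\big].
\end{aligned}$$

Hence, $X_{2r}|X_{1r} \sim N_{p-m,n}(M_{2r} + \Sigma_{21}\Sigma_{11}^{-1}(X_{1r} - M_{1r}), \Sigma_{22\cdot 1} \otimes \Phi)$. Since from Theorem 2.3.1,

$$X' = \begin{pmatrix} X_{1c}' \\ X_{2c}' \end{pmatrix} \sim N_{n,p}\left(\begin{pmatrix} M_{1c}' \\ M_{2c}' \end{pmatrix}, \Phi \otimes \Sigma\right),$$

the above result gives $X_{2c}'|X_{1c}' \sim N_{n-t,p}(M_{2c}' + \Phi_{21}\Phi_{11}^{-1}(X_{1c}' - M_{1c}'), \Phi_{22\cdot 1} \otimes \Sigma)$. Therefore, $X_{2c}|X_{1c} \sim N_{p,n-t}(M_{2c} + (X_{1c} - M_{1c})\Phi_{11}^{-1}\Phi_{12}, \Sigma \otimes \Phi_{22\cdot 1})$. ■

THEOREM 2.3.13. If $X \sim N_{p,n}(M, \Sigma \otimes \Phi)$, then $\operatorname{tr}\{\Sigma^{-1}(X - M)\Phi^{-1}(X - M)'\} \sim \chi^2_{np}$.

Proof: The result follows by noting that $\operatorname{tr}\{\Sigma^{-1}(X - M)\Phi^{-1}(X - M)'\} = \operatorname{tr}(YY')$, where $Y = \Sigma^{-\frac{1}{2}}(X - M)\Phi^{-\frac{1}{2}}$ which, according to Corollary 2.3.10.2, is distributed as $N_{p,n}(0, I_p \otimes I_n)$. ■

THEOREM 2.3.14. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi)$, and $B\,(n \times t)$ and $D\,(n \times s)$ be given matrices. Then $XB$ and $XD$ are independent if and only if $B'\Phi D = 0$.

Proof: Without loss of generality, assume that $M = 0$. The matrix of covariances between $XB$ and $XD$ is given by

$$\operatorname{cov}(XB, XD) = \operatorname{cov}\big(\operatorname{vec}((XB)'), \operatorname{vec}((XD)')\big)$$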
$$\begin{aligned}
&= E\{\operatorname{vec}((XB)')(\operatorname{vec}((XD)'))'\} \\
&= E\{(I_p \otimes B')\operatorname{vec}(X')(\operatorname{vec}(X'))'(I_p \otimes D)\} \\
&= (I_p \otimes B')E\{\operatorname{vec}(X')(\operatorname{vec}(X'))'\}(I_p \otimes D) \\
&= (I_p \otimes B')(\Sigma \otimes \Phi)(I_p \otimes D) \\
&= \Sigma \otimes (B'\Phi D).
\end{aligned} \tag{2.3.10}$$

It follows that $\operatorname{cov}(XB, XD) = 0$ if and only if $B'\Phi D = 0$. This completes the proof of the theorem. ■

In (2.3.10), by taking $t = 1$, $B = e_i\,(n \times 1)$, $s = 1$, and $D = e_j\,(n \times 1)$, we get $\operatorname{cov}(x_i, x_j) = \psi_{ij}\Sigma$, $i, j = 1, \ldots, n$, where $x_i$ is the $i$th column of the matrix $X$. Further, it can be shown that $\operatorname{cov}(x_i^*, x_j^*) = \sigma_{ij}\Phi$, $i, j = 1, \ldots, p$, where $x_i^{*\prime}$ is the $i$th row of the matrix $X$.

THEOREM 2.3.15. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi)$, and $A\,(r \times p)$ and $C\,(q \times p)$ be given matrices. Then $AX$ and $CX$ are independent if and only if $A\Sigma C' = 0$.

Proof: The proof is similar to the proof of Theorem 2.3.14. ■

By combining the results of Theorems 2.3.14 and 2.3.15, we get the following.

THEOREM 2.3.16. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi)$, and $A\,(r \times p)$, $B\,(n \times t)$, $C\,(q \times p)$, and $D\,(n \times s)$ be given matrices. Then $AXB$ and $CXD$ are independent if and only if either $A\Sigma C' = 0$ or $B'\Phi D = 0$.

Now, we generalize a result of Basu and Khatri (1969) for the matrix variate normal case.

THEOREM 2.3.17. Let $X \sim N_{p,n}(M, \Sigma_1 \otimes \Sigma_2)$, and $f_{ij}(X)$ be a real-valued function of $X$, $i = 1, \ldots, r$, $j = 1, \ldots, s$. If $F(X) = (f_{ij}(X)) \sim N_{r,s}(\mu, \Phi_1 \otimes \Phi_2)$ for every $M \in \mathbb{R}^{p \times n}$, $\Sigma_1 > 0$ and $\Sigma_2 > 0$, then $F(X) = AXB + C$ almost everywhere, where $A$, $B$ and $C$ do not depend on $M$, $\Sigma_1$ and $\Sigma_2$.

Proof: Noting the fact that $\operatorname{vec}(X')$ is multivariate normal, the proof follows from Basu and Khatri (1969). ■

THEOREM 2.3.18. Let $X \sim N_{p,n}(0, I_p \otimes I_n)$, and $X = TL$, where $T\,(p \times p) = (t_{ij})$, $t_{ii} > 0$, is a lower triangular matrix and $L\,(p \times n)$ is a semiorthogonal matrix, $LL' = I_p$. Then $T$ and $L$ are independently distributed. The p.d.f. of $T$ is

$$\big\{2^{\frac{1}{2}np - p}\,\Gamma_p(\tfrac{1}{2}n)\big\}^{-1} \prod_{i=1}^{p} t_{ii}^{n-i} \operatorname{etr}\big(-\tfrac{1}{2}TT'\big), \tag{2.3.11}$$
that is, the $t_{ij}$'s are independently distributed, $p \ge i \ge j \ge 1$, with $t_{ii}^2 \sim \chi^2_{n-i+1}$, $i = 1, \ldots, p$, and $t_{ij} \sim N(0, 1)$, $p \ge i > j \ge 1$.

Proof: The p.d.f. of $X$ is given by

$$(2\pi)^{-\frac{1}{2}np}\operatorname{etr}\big(-\tfrac{1}{2}XX'\big).$$

Now using the transformation $X = TL$ with Jacobian, from (1.3.25), $J(X \to T, L) = g_{n,p}(L)\prod_{i=1}^{p} t_{ii}^{n-i}$, where $g_{n,p}(L)$ is a function of $L$ only, we get the joint density of $T$ and $L$ as

$$(2\pi)^{-\frac{1}{2}np} \prod_{i=1}^{p} t_{ii}^{n-i} \operatorname{etr}\big(-\tfrac{1}{2}TT'\big) g_{n,p}(L). \tag{2.3.12}$$

From (2.3.12), it follows that $T$ and $L$ are independent and the density of $T$, using Theorem 1.4.9, is given by

$$\frac{2^{-\frac{1}{2}np + p}}{\Gamma_p(\frac{1}{2}n)} \prod_{i=1}^{p} t_{ii}^{n-i} \operatorname{etr}\big(-\tfrac{1}{2}TT'\big). \tag{2.3.13}$$

This completes the proof of the theorem. ■

A result more general than Theorem 2.3.18 is proven in the following theorem.

THEOREM 2.3.19. Let $Y\,(p \times n)$ be a random matrix with $\operatorname{rank}(Y) = p \le n$, and p.d.f. $f(YY')$. If $Y = TL$, where $T = (t_{ij})$, $t_{ii} > 0$, is a lower triangular matrix and $L$ is a semiorthogonal matrix, $LL' = I_p$, then $T$ and $L$ are independently distributed and the p.d.f. of $T$ is

$$\frac{2^p\pi^{\frac{1}{2}np}}{\Gamma_p(\frac{1}{2}n)} \prod_{i=1}^{p} t_{ii}^{n-i} f(TT'). \tag{2.3.14}$$

Proof: As in the proof of Theorem 2.3.18, the joint density of $T$ and $L$ is now given by

$$f(TT') \prod_{i=1}^{p} t_{ii}^{n-i}\, g_{n,p}(L). \tag{2.3.15}$$

From (2.3.15), it follows that $T$ and $L$ are independent. Integrating (2.3.15) with respect to $L$, by using Theorem 1.4.9, we get (2.3.14). ■

The random matrix $L$, in Theorems 2.3.18 and 2.3.19, has uniform distribution over the Stiefel manifold $O(p, n) = \{L : LL' = I_p\}$, which will be studied in Chapter 8.

2.4. SINGULAR MATRIX VARIATE NORMAL DISTRIBUTION

The density (2.2.1) of $X$ does not exist if $\Sigma \otimes \Phi$ is positive semidefinite but singular. In this case, $X$ is said to have a singular normal distribution, which we now define.
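One way to see why the density breaks down is through the identity $\det(\Sigma \otimes \Phi) = \det(\Sigma)^n \det(\Phi)^p$: the normalizing constant of the density involves $\det(\Sigma)^{-\frac{1}{2}n}\det(\Phi)^{-\frac{1}{2}p}$, which is undefined as soon as either factor is singular. The following is a small exact check of that identity; the helper names `kron` and `det` and the example matrices are illustrative assumptions, not part of the text.

```python
# Exact check that det(Sigma kron Phi) = det(Sigma)^n * det(Phi)^p, and that a
# singular Sigma makes Sigma kron Phi singular. Names and data are illustrative.

def kron(a, b):
    # Kronecker product: block (i, j) of the result is a[i][j] * b.
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def det(a):
    # Laplace expansion along the first row; adequate for small integer matrices.
    if len(a) == 1:
        return a[0][0]
    total = 0
    for j, x in enumerate(a[0]):
        if x == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * x * det(minor)
    return total

Phi = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]   # n x n, positive definite, n = 3
Sigma_pd = [[2, 1], [1, 3]]               # p x p, positive definite, p = 2
Sigma_sing = [[1, 2], [2, 4]]             # p x p, rank 1 (singular)

p, n = len(Sigma_pd), len(Phi)
det_regular = det(kron(Sigma_pd, Phi))
det_singular = det(kron(Sigma_sing, Phi))

assert det_regular == det(Sigma_pd) ** n * det(Phi) ** p
assert det_singular == 0
```

In the singular case the distribution is concentrated on a lower-dimensional subspace, which is what the definition that follows makes precise.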
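DEFINITION 2.4.1. Let $X\,(p \times n)$ be a random matrix with $E(X) = M$ and $\operatorname{cov}(X) = \Sigma \otimes \Phi$, where $\Sigma\,(p \times p)$ and $\Phi\,(n \times n)$ are positive semidefinite with ranks $p_1\,(\le p)$ and $n_1\,(\le n)$ respectively. Then $X$ is said to have a singular matrix variate normal distribution if there exist matrices $H\,(p \times p_1)$ and $R\,(n_1 \times n)$ of ranks $p_1$ and $n_1$ respectively such that $X = HYR + M$ for some random matrix $Y \sim N_{p_1,n_1}(0, P \otimes Q)$, $P\,(p_1 \times p_1) > 0$ and $Q\,(n_1 \times n_1) > 0$. We will denote this by $X \sim N_{p,n}(M, \Sigma \otimes \Phi \,|\, p_1, n_1)$.

From Theorem 2.3.10, it follows that $\Sigma = HPH'$ and $\Phi = R'QR$. It may be noted that if either (i) $p_1 = p$ and $n_1$ is smaller than $n$, or (ii) $p_1$ is smaller than $p$ and $n_1 = n$, then also $\Sigma \otimes \Phi$ is singular and the random matrix $X$ has a singular matrix variate normal distribution.

THEOREM 2.4.1. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi \,|\, p_1, n_1)$, then

$$\phi_X(Z) = \operatorname{etr}\big(\iota Z'M - \tfrac{1}{2}Z'\Sigma Z\Phi\big).$$

Proof: By definition,

$$\begin{aligned}
\phi_X(Z) &= E[\operatorname{etr}(\iota XZ')] \\
&= E[\operatorname{etr}\{\iota(HYR + M)Z'\}] \\
&= \operatorname{etr}(\iota MZ')\,\phi_Y(H'ZR') \\
&= \operatorname{etr}(\iota MZ')\operatorname{etr}\big(-\tfrac{1}{2}PH'ZR'QRZ'H\big) \\
&= \operatorname{etr}\big\{\iota Z'M - \tfrac{1}{2}Z'(HPH')Z(R'QR)\big\}. \;\; ■
\end{aligned}$$

THEOREM 2.4.2. If $X \sim N_{p,n}(M, \Sigma \otimes \Phi \,|\, p_1, n_1)$, $D\,(m \times p)$, and $C\,(n \times t)$, then $DXC \sim N_{m,t}(DMC, (D\Sigma D') \otimes (C'\Phi C) \,|\, m_1, t_1)$, where $m_1 = \operatorname{rank}(D\Sigma D')$ and $t_1 = \operatorname{rank}(C'\Phi C)$.

Proof: The characteristic function of $DXC$ is

$$\begin{aligned}
\phi_{DXC}(Z) &= E[\operatorname{etr}(\iota DXCZ')] = \phi_X(D'ZC') \\
&= \operatorname{etr}\big(\iota MCZ'D - \tfrac{1}{2}CZ'D\Sigma D'ZC'\Phi\big) \\
&= \operatorname{etr}\big[\iota DMCZ' - \tfrac{1}{2}(D\Sigma D')Z(C'\Phi C)Z'\big],
\end{aligned}$$

from which the result follows. ■

THEOREM 2.4.3. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi \,|\, p_1, n_1)$, and partition $X$, $M$, $\Sigma$, and $\Phi$ as

$$X = \begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix}, \quad M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix},$$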
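$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, \quad \Phi = \begin{pmatrix} \Phi_{11} & \Phi_{12} \\ \Phi_{21} & \Phi_{22} \end{pmatrix},$$

where $X_{11}$ and $M_{11}$ are $m \times t$, $\Sigma_{11}$ is $m \times m$, and $\Phi_{11}$ is $t \times t$. Then $X_{11} \sim N_{m,t}(M_{11}, \Sigma_{11} \otimes \Phi_{11} \,|\, m_1, t_1)$, where $m_1 = \operatorname{rank}(\Sigma_{11})$ and $t_1 = \operatorname{rank}(\Phi_{11})$.

Proof: In Theorem 2.4.2, let $D = (I_m \;\; 0)$ and $C' = (I_t \;\; 0)$ so that $DXC = X_{11}$, $D\Sigma D' = \Sigma_{11}$ and $C'\Phi C = \Phi_{11}$. ■

2.5. SYMMETRIC MATRIX VARIATE NORMAL DISTRIBUTION

Let $Y\,(p \times p) \sim N_{p,p}(N, \Sigma \otimes \Phi)$, then $\operatorname{vec}(Y') \sim N_{p^2}(\operatorname{vec}(N'), \Sigma \otimes \Phi)$ and $\operatorname{vec}(Y) \sim N_{p^2}(\operatorname{vec}(N), \Phi \otimes \Sigma)$. Now, using the transformations

$$\operatorname{vec}(X') = M_p\operatorname{vec}(Y'), \tag{2.5.1}$$

and

$$\operatorname{vec}(X) = M_p\operatorname{vec}(Y), \tag{2.5.2}$$

where the matrix $M_p$ is defined in Section 1.2, we have

$$\operatorname{vec}(X') \sim N_{p^2}(M_p\operatorname{vec}(N'), M_p(\Sigma \otimes \Phi)M_p), \tag{2.5.3}$$

and

$$\operatorname{vec}(X) \sim N_{p^2}(M_p\operatorname{vec}(N), M_p(\Phi \otimes \Sigma)M_p). \tag{2.5.4}$$

From (1.2.17), note that $X = X'$. Therefore, from (2.5.3) and (2.5.4), $M_p(\Sigma \otimes \Phi)M_p = M_p(\Phi \otimes \Sigma)M_p$, which is satisfied if $\Sigma\Phi = \Phi\Sigma$. The characteristic function of $\operatorname{vec}(X')$ is

$$\begin{aligned}
\phi_{\operatorname{vec}(X')}(\operatorname{vec}(T')) &= E[\exp\{\iota(\operatorname{vec}(T'))'\operatorname{vec}(X')\}], \quad T\,(p \times p) = T' \\
&= E[\exp\{\iota(\operatorname{vec}(T'))'M_p\operatorname{vec}(Y')\}] \\
&= E[\exp\{\iota(M_p\operatorname{vec}(T'))'\operatorname{vec}(Y')\}] \\
&= \exp\big[\iota(M_p\operatorname{vec}(N'))'\operatorname{vec}(T') - \tfrac{1}{2}(M_p\operatorname{vec}(T'))'(\Sigma \otimes \Phi)(M_p\operatorname{vec}(T'))\big] \\
&= \exp\big[\iota(\operatorname{vec}(M'))'\operatorname{vec}(T) - \tfrac{1}{2}(\operatorname{vec}(T))'(\Sigma \otimes \Phi)(\operatorname{vec}(T))\big] \\
&= \operatorname{etr}\big[\iota TM - \tfrac{1}{2}T\Sigma T\Phi\big],
\end{aligned} \tag{2.5.5}$$

where $\operatorname{vec}(M') = M_p\operatorname{vec}(N')$ gives $M = M'$. Since $X\,(p \times p)$ is a symmetric matrix, it contains only $\frac{1}{2}p(p+1)$ distinct elements and therefore the covariance matrix of $X$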
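should be a matrix of order $\frac{1}{2}p(p+1) \times \frac{1}{2}p(p+1)$. To obtain this covariance matrix, we derive the characteristic function of $\operatorname{vecp}(X)$,

$$\begin{aligned}
\phi_{\operatorname{vecp}(X)}(\operatorname{vecp}(T)) &= E[\exp\{\iota(\operatorname{vecp}(T))'\operatorname{vecp}(X)\}] \\
&= E[\exp\{\iota(B_p\operatorname{vecp}(T))'\operatorname{vec}(X)\}] \\
&= \exp\big\{\iota(\operatorname{vec}(M))'B_p\operatorname{vecp}(T) - \tfrac{1}{2}(B_p\operatorname{vecp}(T))'(\Sigma \otimes \Phi)B_p\operatorname{vecp}(T)\big\} \\
&= \exp\big\{\iota(B_p'\operatorname{vec}(M))'\operatorname{vecp}(T) - \tfrac{1}{2}(\operatorname{vecp}(T))'B_p'(\Sigma \otimes \Phi)B_p\operatorname{vecp}(T)\big\} \\
&= \exp\big\{\iota(\operatorname{vecp}(M))'\operatorname{vecp}(T) - \tfrac{1}{2}(\operatorname{vecp}(T))'B_p'(\Sigma \otimes \Phi)B_p\operatorname{vecp}(T)\big\},
\end{aligned}$$

where $\operatorname{vecp}(X)$ and $B_p$ have been defined in Section 1.2. Thus, we define the symmetric matrix variate normal distribution as follows.

DEFINITION 2.5.1. Let $X\,(p \times p)$ be a symmetric random matrix and $M$, $\Sigma$, and $\Phi$ be constant symmetric $p \times p$ matrices such that $\Sigma\Phi = \Phi\Sigma$. If the $\frac{1}{2}p(p+1) \times 1$ vector $\operatorname{vecp}(X)$ formed from $X$ is distributed as $N_{\frac{1}{2}p(p+1)}(\operatorname{vecp}(M), B_p'(\Sigma \otimes \Phi)B_p)$, then $X$ is said to have a symmetric matrix variate normal distribution, with mean matrix $M$ and covariance matrix $B_p'(\Sigma \otimes \Phi)B_p$, and is denoted as $X = X' \sim SN_{p,p}(M, B_p'(\Sigma \otimes \Phi)B_p)$.

From Definition 2.5.1, the probability density function of $X$, in terms of $\operatorname{vecp}(X)$, is

$$(2\pi)^{-\frac{1}{4}p(p+1)}\det(B_p'(\Sigma \otimes \Phi)B_p)^{-\frac{1}{2}}\exp\big[-\tfrac{1}{2}(\operatorname{vecp}(X) - \operatorname{vecp}(M))'B_p^{+}(\Sigma \otimes \Phi)^{-1}B_p^{+\prime}(\operatorname{vecp}(X) - \operatorname{vecp}(M))\big]. \tag{2.5.6}$$

Using (1.2.12), (2.5.6) can be written in terms of $\operatorname{vec}(X)$ as

$$(2\pi)^{-\frac{1}{4}p(p+1)}\det(B_p'(\Sigma \otimes \Phi)B_p)^{-\frac{1}{2}}\exp\big[-\tfrac{1}{2}(\operatorname{vec}(X) - \operatorname{vec}(M))'(\Sigma \otimes \Phi)^{-1}(\operatorname{vec}(X) - \operatorname{vec}(M))\big] \tag{2.5.7}$$

which, applying Theorem 1.2.22, can be rewritten as

$$(2\pi)^{-\frac{1}{4}p(p+1)}\det(B_p'(\Sigma \otimes \Phi)B_p)^{-\frac{1}{2}}\operatorname{etr}\big[-\tfrac{1}{2}\Sigma^{-1}(X - M)\Phi^{-1}(X - M)\big]. \tag{2.5.8}$$

We now derive the product moments of the elements of the random matrix $X$, given by H. M. Nel (1977) and D. G. Nel (1978).

THEOREM 2.5.1. Let $X = X' \sim SN_{p,p}(M, B_p'(\Sigma \otimes \Phi)B_p)$, then

(i) $E(x_{ij}) = m_{ij}$,

(ii) $\operatorname{cov}(x_{ij}, x_{k\ell}) = \frac{1}{4}(\psi_{ik}\sigma_{j\ell} + \sigma_{jk}\psi_{i\ell} + \sigma_{i\ell}\psi_{jk} + \sigma_{ik}\psi_{j\ell})$,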
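(iii) $E(x_{ij}x_{k\ell}x_{rs}) = m_{ij}\operatorname{cov}(x_{k\ell}, x_{rs}) + m_{k\ell}\operatorname{cov}(x_{ij}, x_{rs}) + m_{rs}\operatorname{cov}(x_{ij}, x_{k\ell}) + m_{ij}m_{k\ell}m_{rs}$, and

(iv)
$$\begin{aligned}
E(x_{ij}x_{k\ell}x_{rs}x_{tq}) &= \operatorname{cov}(x_{ij}, x_{tq})\operatorname{cov}(x_{k\ell}, x_{rs}) + \operatorname{cov}(x_{ij}, x_{k\ell})\operatorname{cov}(x_{rs}, x_{tq}) + \operatorname{cov}(x_{ij}, x_{rs})\operatorname{cov}(x_{k\ell}, x_{tq}) \\
&\quad + m_{ij}m_{tq}\operatorname{cov}(x_{k\ell}, x_{rs}) + m_{ij}m_{k\ell}\operatorname{cov}(x_{rs}, x_{tq}) + m_{ij}m_{rs}\operatorname{cov}(x_{k\ell}, x_{tq}) \\
&\quad + m_{k\ell}m_{rs}\operatorname{cov}(x_{ij}, x_{tq}) + m_{rs}m_{tq}\operatorname{cov}(x_{ij}, x_{k\ell}) + m_{k\ell}m_{tq}\operatorname{cov}(x_{ij}, x_{rs}) \\
&\quad + m_{ij}m_{k\ell}m_{rs}m_{tq}.
\end{aligned}$$

Proof: From the characteristic function (2.5.5), using the method of Theorem 2.3.3, the results are easily obtained. ■

Results parallel to the ones given in Theorems 2.3.5-2.3.9 can also be derived in a similar manner using the above theorem. For $A\,(p \times p)$, $C\,(p \times p)$, and $D\,(p \times p)$ constant matrices we similarly have

$$E(XAX) = \tfrac{1}{4}[\Sigma A'\Phi + \Phi A'\Sigma + \operatorname{tr}(A\Phi)\Sigma + \operatorname{tr}(A\Sigma)\Phi] + MAM, \tag{2.5.9}$$

$$E(\operatorname{tr}(AX)X) = \tfrac{1}{4}[\Sigma A'\Phi + \Phi A\Sigma + \Sigma A\Phi + \Phi A'\Sigma] + \operatorname{tr}(AM)M, \tag{2.5.10}$$

and

$$\begin{aligned}
E(XAXCX) &= \tfrac{1}{4}[MA\Sigma C'\Phi + MA\Phi C'\Sigma + \Sigma C'MA'\Phi + \Phi C'MA'\Sigma + \Sigma A'\Phi CM + \Phi A'\Sigma CM \\
&\quad + \operatorname{tr}(C\Sigma)MA\Phi + \operatorname{tr}(C\Phi)MA\Sigma + \operatorname{tr}(C'MA'\Sigma)\Phi + \operatorname{tr}(C'MA'\Phi)\Sigma \\
&\quad + \operatorname{tr}(A\Sigma)\Phi CM + \operatorname{tr}(A\Phi)\Sigma CM] + MAMCM.
\end{aligned}$$

Many other higher order expectations are given by H. M. Nel (1977) and D. G. Nel (1978). It may be noted that the moments of $X = X' \sim SN_{p,p}(M, B_p'(\Sigma \otimes \Phi)B_p)$ can also be obtained from the moments of nonsymmetric $Y \sim N_{p,p}(M, \Sigma \otimes \Phi)$ by substituting $\frac{1}{2}(Y + Y')$ for $X - M$. For example, $E(XX') = E[\frac{1}{4}(Y + Y')(Y + Y')'] + \cdots$

THEOREM 2.5.2. Let $X = X' \sim SN_{p,p}(M, B_p'(\Sigma \otimes \Phi)B_p)$, $A\,(p \times p)$ be a symmetric matrix such that $\Sigma A\Phi = \Phi A\Sigma$, and $h(X)$ be an elementary symmetric function of $X$. Then

$$E[\operatorname{etr}(AX)h(X)] = \operatorname{etr}\big[AM + \tfrac{1}{2}\Sigma A\Phi A\big]E(h(Y)),$$

where $Y = Y' \sim SN_{p,p}(M + \Sigma A\Phi, B_p'(\Sigma \otimes \Phi)B_p)$.

Proof: We have

$$E[\operatorname{etr}(AX)h(X)] = (2\pi)^{-\frac{1}{4}p(p+1)}\det(B_p'(\Sigma \otimes \Phi)B_p)^{-\frac{1}{2}} \int h(X)\operatorname{etr}\big[AX - \tfrac{1}{2}\Sigma^{-1}(X - M)\Phi^{-1}(X - M)\big]\,dX.$$

Simplifying the term within square brackets, using $\Sigma A\Phi = \Phi A\Sigma$, we get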
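$$\begin{aligned}
E[\operatorname{etr}(AX)h(X)] &= (2\pi)^{-\frac{1}{4}p(p+1)}\det(B_p'(\Sigma \otimes \Phi)B_p)^{-\frac{1}{2}}\operatorname{etr}\big(AM + \tfrac{1}{2}\Sigma A\Phi A\big) \\
&\quad \cdot \int h(X)\operatorname{etr}\big[-\tfrac{1}{2}\Sigma^{-1}(X - M - \Sigma A\Phi)\Phi^{-1}(X - M - \Sigma A\Phi)\big]\,dX \\
&= \operatorname{etr}\big[AM + \tfrac{1}{2}\Sigma A\Phi A\big]\int h(X)f(X)\,dX,
\end{aligned}$$

where $f(X)$ denotes the density of $SN_{p,p}(M + \Sigma A\Phi, B_p'(\Sigma \otimes \Phi)B_p)$. This completes the proof of the theorem. ■

THEOREM 2.5.3. Let $X = X' \sim SN_{p,p}(M, B_p'(\Sigma \otimes \Phi)B_p)$, then $\operatorname{tr}(X) \sim N(\operatorname{tr}(M), \operatorname{tr}(\Sigma\Phi))$.

Proof: The characteristic function of $\operatorname{tr}(X)$ is

$$\begin{aligned}
\phi_{\operatorname{tr}(X)}(t) &= E[\exp\{\iota t\operatorname{tr}(X)\}] \\
&= E[\exp\{\iota\operatorname{tr}(TX)\}], \quad \text{where } T = tI_p, \\
&= \exp\big[\iota t\operatorname{tr}(M) - \tfrac{1}{2}t^2\operatorname{tr}(\Sigma\Phi)\big].
\end{aligned}$$

The last equality is obtained from (2.5.5). Hence, the proof is complete. ■

THEOREM 2.5.4. Let $X = X' \sim SN_{p,p}(M, B_p'(\Sigma \otimes \Phi)B_p)$, then $AXA' \sim SN_{q,q}(AMA', B_q'((A\Sigma A') \otimes (A\Phi A'))B_q)$, where $A\,(q \times p)$ is of rank $q \le p$.

Proof: The characteristic function of $AXA'$ is

$$\begin{aligned}
\phi_{AXA'}(T) &= E[\operatorname{etr}(\iota TAXA')] \\
&= E[\operatorname{etr}(\iota(A'TA)X)] \\
&= \operatorname{etr}\big[\iota T(AMA') - \tfrac{1}{2}T(A\Sigma A')T(A\Phi A')\big],
\end{aligned}$$

from which the result follows immediately. ■

THEOREM 2.5.5. Let $X = X' \sim SN_{p,p}(M, B_p'(\Sigma \otimes \Phi)B_p)$, and partition $X$, $M$, $\Sigma$, and $\Phi$ as

$$X = \begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix}, \quad M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, \quad \Phi = \begin{pmatrix} \Phi_{11} & \Phi_{12} \\ \Phi_{21} & \Phi_{22} \end{pmatrix},$$

where $X_{11}$, $M_{11}$, $\Sigma_{11}$, and $\Phi_{11}$ are $t \times t$. Then, $X_{11} \sim SN_{t,t}(M_{11}, B_t'(\Sigma_{11} \otimes \Phi_{11})B_t)$.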
Proof: Let $A\,(t \times p) = (I_t \;\; 0)$, then $AXA' = X_{11}$, $AMA' = M_{11}$, $A\Sigma A' = \Sigma_{11}$ and $A\Phi A' = \Phi_{11}$. Now, from Theorem 2.5.4, the result follows. ■

Many authors have studied the matrix variate symmetric normal distribution. H. M. Nel (1977) derived the marginal and conditional distributions and the distribution of the roots. D. G. Nel (1978) applied this distribution to derive the asymptotic expansion of a Wishart matrix. Hayakawa and Kikuchi (1979) derived moments of a function of $\operatorname{tr}(X)$ using zonal polynomials.

2.6. RESTRICTED MATRIX VARIATE NORMAL DISTRIBUTION

DEFINITION 2.6.1. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi)$ and $C\,(n \times s)$ be a constant matrix of rank $s\,(\le n)$. If the domain of definition of $X$ is restricted to the subspace $XC = 0$ and if $MC = 0$, then the distribution of $X$ is called restricted matrix variate normal with restriction $XC = 0$, and is denoted by $X \sim N_{p,n}(M, \Sigma \otimes \Phi \,|\, s, C)$.

In the following theorem, we derive an explicit form of the restricted matrix variate normal density.

THEOREM 2.6.1. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi \,|\, s, C)$, then the density of $X$ is given by

$$(2\pi)^{-\frac{1}{2}(n-s)p}\det(\Phi)^{-\frac{1}{2}p}\det(C'\Phi C)^{\frac{1}{2}p}\det(\Sigma)^{-\frac{1}{2}(n-s)}\operatorname{etr}\big\{-\tfrac{1}{2}\Phi^{-1}(X - M)'\Sigma^{-1}(X - M)\big\}, \quad XC = 0.$$

Proof: The density function of the unrestricted matrix $X$ is

$$f(X) = (2\pi)^{-\frac{1}{2}np}\det(\Sigma)^{-\frac{1}{2}n}\det(\Phi)^{-\frac{1}{2}p}\operatorname{etr}\big\{-\tfrac{1}{2}\Sigma^{-1}(X - M)\Phi^{-1}(X - M)'\big\}, \quad X \in \mathbb{R}^{p \times n}.$$

Hence, the density of the restricted matrix $X$ is

$$\frac{f(X)}{\displaystyle\int_{\substack{X \in \mathbb{R}^{p \times n} \\ XC = 0}} f(X)\,dX} = \frac{\operatorname{etr}\big\{-\tfrac{1}{2}\Sigma^{-1}(X - M)\Phi^{-1}(X - M)'\big\}}{\displaystyle\int_{\substack{X \in \mathbb{R}^{p \times n} \\ XC = 0}} \operatorname{etr}\big\{-\tfrac{1}{2}\Sigma^{-1}(X - M)\Phi^{-1}(X - M)'\big\}\,dX}. \tag{2.6.1}$$

From Theorem 1.4.12, the denominator on the right hand side can be evaluated as

$$\int_{XC = 0} \operatorname{etr}\big\{-\tfrac{1}{2}\Sigma^{-1}(X - M)\Phi^{-1}(X - M)'\big\}\,dX = \int_{W > 0}\int_{\substack{(X - M)\Phi^{-1}(X - M)' = W \\ C'(X - M)' = 0}} \operatorname{etr}\big\{-\tfrac{1}{2}\Sigma^{-1}(X - M)\Phi^{-1}(X - M)'\big\}\,dX\,dW$$
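$$\begin{aligned}
&= \int_{W > 0}\int_{\substack{Y\Phi^{-1}Y' = W \\ C'Y' = 0}} \operatorname{etr}\big(-\tfrac{1}{2}\Sigma^{-1}Y\Phi^{-1}Y'\big)\,dY\,dW \\
&= \frac{\pi^{\frac{1}{2}(n-s)p}}{\Gamma_p[\frac{1}{2}(n-s)]}\det(\Phi)^{\frac{1}{2}p}\det(C'\Phi C)^{-\frac{1}{2}p}\int_{W > 0}\det(W)^{\frac{1}{2}(n-p-s-1)}\operatorname{etr}\big(-\tfrac{1}{2}\Sigma^{-1}W\big)\,dW \\
&= (2\pi)^{\frac{1}{2}(n-s)p}\det(\Phi)^{\frac{1}{2}p}\det(C'\Phi C)^{-\frac{1}{2}p}\det(\Sigma)^{\frac{1}{2}(n-s)}.
\end{aligned} \tag{2.6.2}$$

Now substituting (2.6.2) in (2.6.1), we get the density of the restricted matrix $X$ as

$$(2\pi)^{-\frac{1}{2}(n-s)p}\det(\Phi)^{-\frac{1}{2}p}\det(C'\Phi C)^{\frac{1}{2}p}\det(\Sigma)^{-\frac{1}{2}(n-s)}\operatorname{etr}\big\{-\tfrac{1}{2}\Phi^{-1}(X - M)'\Sigma^{-1}(X - M)\big\}, \quad XC = 0. \;\; ■$$

THEOREM 2.6.2. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi)$ and $B\,(r \times p)$ be a constant matrix of rank $r \le p$. If the domain of definition of $X$ is restricted to the subspace $BX = 0$ and if $BM = 0$, then $X' \sim N_{n,p}(M', \Phi \otimes \Sigma \,|\, r, B')$.

Proof: From Theorem 2.3.1, $Y = X' \sim N_{n,p}(M', \Phi \otimes \Sigma)$. Also, the restriction $BX = 0$ is equivalent to $X'B' = 0$, i.e., $YB' = 0$. Now, from Definition 2.6.1, it is obvious that $Y \sim N_{n,p}(M', \Phi \otimes \Sigma \,|\, r, B')$. ■

THEOREM 2.6.3. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi \,|\, s, C)$. The characteristic function of $X$ is

$$\phi_X(Z) = \operatorname{etr}\big[\iota MZ' - \tfrac{1}{2}\Sigma Z\Phi Z' + \tfrac{1}{2}\Sigma Z\Phi C(C'\Phi C)^{-1}C'\Phi Z'\big].$$

Proof: The characteristic function of $X$ is given by

$$\begin{aligned}
\phi_X(Z) &= (2\pi)^{-\frac{1}{2}(n-s)p}\det(\Phi)^{-\frac{1}{2}p}\det(C'\Phi C)^{\frac{1}{2}p}\det(\Sigma)^{-\frac{1}{2}(n-s)} \int_{\substack{X \in \mathbb{R}^{p \times n} \\ XC = 0}} \operatorname{etr}\big[\iota XZ' - \tfrac{1}{2}\Sigma^{-1}(X - M)\Phi^{-1}(X - M)'\big]\,dX \\
&= (2\pi)^{-\frac{1}{2}(n-s)p}\det(\Phi)^{-\frac{1}{2}p}\det(C'\Phi C)^{\frac{1}{2}p}\det(\Sigma)^{-\frac{1}{2}(n-s)}\operatorname{etr}\big\{\iota MZ' - \tfrac{1}{2}\Sigma Z\Phi Z'\big\} \\
&\quad \cdot \int_{XC = 0} \operatorname{etr}\big\{-\tfrac{1}{2}\Sigma^{-1}(X - M - \iota\Sigma Z\Phi)\Phi^{-1}(X - M - \iota\Sigma Z\Phi)'\big\}\,dX.
\end{aligned} \tag{2.6.3}$$

Now, let $Y = X - M - \iota\Sigma Z\Phi$ so that $YC = -\iota\Sigma Z\Phi C = -\iota\Lambda$ (say). We have

$$\int_{\substack{X \in \mathbb{R}^{p \times n} \\ XC = 0}} \operatorname{etr}\big\{-\tfrac{1}{2}\Sigma^{-1}(X - M - \iota\Sigma Z\Phi)\Phi^{-1}(X - M - \iota\Sigma Z\Phi)'\big\}\,dX$$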
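$$\begin{aligned}
&= \int_{\substack{Y \in \mathbb{R}^{p \times n} \\ YC = -\iota\Sigma Z\Phi C}} \operatorname{etr}\big(-\tfrac{1}{2}\Sigma^{-1}Y\Phi^{-1}Y'\big)\,dY \\
&= \iint_{\substack{Y\Phi^{-1}Y' = W \\ C'Y' = -\iota\Lambda'}} \operatorname{etr}\big(-\tfrac{1}{2}\Sigma^{-1}Y\Phi^{-1}Y'\big)\,dY\,dW \\
&= \frac{\pi^{\frac{1}{2}(n-s)p}}{\Gamma_p[\frac{1}{2}(n-s)]}\det(\Phi)^{\frac{1}{2}p}\det(C'\Phi C)^{-\frac{1}{2}p} \int_{W + \Lambda(C'\Phi C)^{-1}\Lambda' > 0} \operatorname{etr}\big(-\tfrac{1}{2}\Sigma^{-1}W\big)\det\big(W + \Lambda(C'\Phi C)^{-1}\Lambda'\big)^{\frac{1}{2}(n-p-s-1)}\,dW \\
&= \frac{\pi^{\frac{1}{2}(n-s)p}}{\Gamma_p[\frac{1}{2}(n-s)]}\det(\Phi)^{\frac{1}{2}p}\det(C'\Phi C)^{-\frac{1}{2}p}\operatorname{etr}\big\{\tfrac{1}{2}Z\Phi C(C'\Phi C)^{-1}C'\Phi Z'\Sigma\big\}\,\Gamma_p[\tfrac{1}{2}(n-s)]\det(2\Sigma)^{\frac{1}{2}(n-s)} \\
&= (2\pi)^{\frac{1}{2}(n-s)p}\det(\Phi)^{\frac{1}{2}p}\det(C'\Phi C)^{-\frac{1}{2}p}\det(\Sigma)^{\frac{1}{2}(n-s)}\operatorname{etr}\big\{\tfrac{1}{2}Z\Phi C(C'\Phi C)^{-1}C'\Phi Z'\Sigma\big\}.
\end{aligned} \tag{2.6.4}$$

Now by substituting (2.6.4) in (2.6.3), we get the desired result. ■

THEOREM 2.6.4. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi \,|\, s, C)$, $B\,(m \times p)$ be of rank $m \le p$ and $D\,(n \times n)$ be a nonsingular matrix. Then $BXD \sim N_{m,n}(BMD, (B\Sigma B') \otimes (D'\Phi D) \,|\, s, D^{-1}C)$.

Proof: The characteristic function of $BXD$ is

$$\begin{aligned}
\phi_{BXD}(Z) &= E[\operatorname{etr}(\iota BXDZ')] = E[\operatorname{etr}\{\iota X(B'ZD')'\}] \\
&= \operatorname{etr}\big[\iota BMDZ' - \tfrac{1}{2}(B\Sigma B')Z(D'\Phi D)Z' \\
&\qquad + \tfrac{1}{2}(B\Sigma B')Z(D'\Phi D)D^{-1}C\{(D^{-1}C)'(D'\Phi D)(D^{-1}C)\}^{-1}(D^{-1}C)'(D'\Phi D)Z'\big],
\end{aligned} \tag{2.6.5}$$

which is the characteristic function of a random matrix with distribution $N_{m,n}(BMD, (B\Sigma B') \otimes (D'\Phi D) \,|\, s, D^{-1}C)$. ■

In the above theorem let $m = p$, then the p.d.f. of $Y = BXD$ becomes

$$(2\pi)^{-\frac{1}{2}(n-s)p}\det(D'\Phi D)^{-\frac{1}{2}p}\det(C'\Phi C)^{\frac{1}{2}p}\det(B\Sigma B')^{-\frac{1}{2}(n-s)}\operatorname{etr}\big\{-\tfrac{1}{2}(D'\Phi D)^{-1}(Y - BMD)'(B\Sigma B')^{-1}(Y - BMD)\big\}, \quad YD^{-1}C = 0. \tag{2.6.6}$$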
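The characteristic function in Theorem 2.6.3 can be read as that of a matrix variate normal whose column covariance $\Phi$ has been replaced by $\Phi_C = \Phi - \Phi C(C'\Phi C)^{-1}C'\Phi$. This reading is an interpretation offered here, not a statement from the text. A quick consistency check is that $\Phi_C C = 0$, which matches the restriction $XC = 0$. The sketch below verifies this with exact rational arithmetic; all helper names and the example matrices are illustrative assumptions.

```python
from fractions import Fraction

# Exact check that Phi_C = Phi - Phi C (C' Phi C)^{-1} C' Phi annihilates C.
# Helper names and the example matrices are illustrative assumptions.

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def matsub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse(a):
    # Gauss-Jordan elimination over exact fractions; assumes a is nonsingular.
    size = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(size)]
           for i, row in enumerate(a)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]

Phi = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]   # n x n, positive definite, n = 3
C = [[1, 0], [1, 1], [0, 2]]              # n x s of rank s = 2
Ct = transpose(C)

PhiC = matmul(Phi, C)                               # Phi C
middle = inverse(matmul(Ct, PhiC))                  # (C' Phi C)^{-1}
Phi_C = matsub(Phi, matmul(matmul(PhiC, middle), transpose(PhiC)))

product = matmul(Phi_C, C)                          # expected to be the zero matrix
assert all(entry == 0 for row in product for entry in row)
```

Exact fractions are used so that the zero matrix is obtained without rounding error; the identity itself follows from $\Phi_C C = \Phi C - \Phi C(C'\Phi C)^{-1}(C'\Phi C)$.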
Also, using the transformation $Y = BXD$, with the Jacobian $J(X \to Y)$, from Theorem 2.6.1 we get the p.d.f. of $Y$ as

$$(2\pi)^{-\frac{1}{2}(n-s)p}\det(\Phi)^{-\frac{1}{2}p}\det(C'\Phi C)^{\frac{1}{2}p}\det(\Sigma)^{-\frac{1}{2}(n-s)}\operatorname{etr}\big\{-\tfrac{1}{2}\Phi^{-1}(B^{-1}YD^{-1} - M)'\Sigma^{-1}(B^{-1}YD^{-1} - M)\big\}\,J(X \to Y), \quad YD^{-1}C = 0. \tag{2.6.7}$$

Since the density of $Y$ is unique, comparing (2.6.6) and (2.6.7) we get the following result.

LEMMA 2.6.1. Let the matrix $X$ be of order $p \times n$ and transform $Y = BXD$, such that $XC = 0$, where $B\,(p \times p)$ and $D\,(n \times n)$ are nonsingular matrices and $C\,(n \times s)$ is of rank $s \le n$. Then the Jacobian of the transformation is $J(X \to Y) = \det(D)^{-p}\det(B)^{-(n-s)}$.

2.7. MATRIX VARIATE $\theta$-GENERALIZED NORMAL DISTRIBUTION

Another way of extending the concept of normal distribution was shown by Goodman and Kotz (1973). They introduced the multivariate $\theta$-generalized normal distribution. A random vector $y\,(p \times 1)$ is said to have a vector variate $\theta$-generalized normal distribution if it can be written as $y = Cx + \mu$, where $\mu\,(p \times 1)$ is a constant vector, $C$ is a $p \times p$ nonsingular matrix, and $x = (x_1, \ldots, x_p)'$ is a random vector whose elements are independent and each has the probability density function

$$\frac{1}{2\Gamma\big(1 + \frac{1}{\theta}\big)}\exp\big(-|x_i|^{\theta}\big), \quad \theta > 0, \; x_i \in \mathbb{R}, \; i = 1, \ldots, p. \tag{2.7.1}$$

The distribution of $y$ is denoted by $N_p(\mu, C, \theta)$. An extension of this concept to the matrix variate case has been given by Gupta and Varga (1995a).

DEFINITION 2.7.1. Let $\theta > 0$. Then $X = (x_{ij})$, $i = 1, \ldots, p$, $j = 1, \ldots, n$, has a matrix variate standard $\theta$-generalized normal distribution if the $x_{ij}$'s are independent and identically distributed random variables with p.d.f.

$$\frac{1}{2\Gamma\big(1 + \frac{1}{\theta}\big)}\exp\big(-|x_{ij}|^{\theta}\big), \quad \theta > 0, \; x_{ij} \in \mathbb{R}, \; i = 1, \ldots, p, \; j = 1, \ldots, n.$$

DEFINITION 2.7.2. Let $\theta > 0$. Then the random matrix $Y\,(p \times n)$ is said to have a matrix variate $\theta$-generalized normal distribution if $Y$ can be written as $Y = AXB + M$, where $X\,(p \times n)$ is a standard $\theta$-generalized normal random matrix, and $A\,(p \times p)$, $B\,(n \times n)$, and $M\,(p \times n)$ are constant matrices, with $A$ and $B$ being nonsingular.
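The univariate density used in Definition 2.7.1 is easy to evaluate directly. The sketch below codes it with the standard library, compares it with the Laplace density at $\theta = 1$ and with the $N(0, \frac{1}{2})$ density at $\theta = 2$, and approximates its total mass by a midpoint rule. The function names, the integration window and the step count are assumptions made for this illustration.

```python
import math

def theta_normal_pdf(x, theta):
    """Density exp(-|x|^theta) / (2 * Gamma(1 + 1/theta)) from Definition 2.7.1."""
    return math.exp(-abs(x) ** theta) / (2.0 * math.gamma(1.0 + 1.0 / theta))

def total_mass(theta, half_width=12.0, steps=48000):
    """Midpoint-rule approximation of the integral over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    return h * sum(theta_normal_pdf(-half_width + (i + 0.5) * h, theta)
                   for i in range(steps))

x0 = 0.7
laplace_value = 0.5 * math.exp(-abs(x0))                  # Laplace density at x0
normal_value = math.exp(-x0 ** 2) / math.sqrt(math.pi)    # N(0, 1/2) density at x0

# theta = 1 reproduces the Laplace density, theta = 2 the normal with variance 1/2.
assert math.isclose(theta_normal_pdf(x0, 1.0), laplace_value)
assert math.isclose(theta_normal_pdf(x0, 2.0), normal_value)

# The density integrates to one (up to truncation and quadrature error).
masses = {theta: total_mass(theta) for theta in (1.0, 2.0, 4.0)}
assert all(abs(m - 1.0) < 1e-4 for m in masses.values())
```

Note that $\theta = 2$ gives variance $\frac{1}{2}$ rather than 1, so the standard $\theta$-generalized normal at $\theta = 2$ is a rescaled standard normal.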
The distribution of $Y$ is denoted by $N_{p,n}(M, A, B, \theta)$. For $n = 1$, we get the multivariate $\theta$-generalized normal distribution. Furthermore, the case $n = p = 1$ reduces to the Laplace density for $\theta = 1$, and the normal density for $\theta = 2$. It approaches the uniform density as $\theta \to \infty$, and an improper uniform one over the real line as $\theta \to 0$.

The probability density function of a matrix variate $\theta$-generalized normal distribution is given in the following theorem.

THEOREM 2.7.1. Let $Y \sim N_{p,n}(M, A, B, \theta)$. Then the probability density function of $Y$ is

$$\Big\{2\Gamma\big(1 + \tfrac{1}{\theta}\big)\Big\}^{-np}\det(A)^{-n}\det(B)^{-p}\exp\Big(-\sum_{i=1}^{p}\sum_{j=1}^{n}\Big|\sum_{k=1}^{p}\sum_{\ell=1}^{n} a^{ik}(y_{k\ell} - m_{k\ell})b^{\ell j}\Big|^{\theta}\Big), \tag{2.7.2}$$

where $A^{-1} = (a^{ik})$, $B^{-1} = (b^{\ell j})$, $M = (m_{k\ell})$, and $Y = (y_{k\ell})$.

Proof: The p.d.f. of $X$ is

$$\Big\{2\Gamma\big(1 + \tfrac{1}{\theta}\big)\Big\}^{-np}\exp\Big(-\sum_{i=1}^{p}\sum_{j=1}^{n}|x_{ij}|^{\theta}\Big).$$

Let $Y = AXB + M$. Substituting $x_{ij} = \sum_{k=1}^{p}\sum_{\ell=1}^{n} a^{ik}(y_{k\ell} - m_{k\ell})b^{\ell j}$ along with the Jacobian of the transformation $J(X \to Y) = \det(A)^{-n}\det(B)^{-p}$ in the above density, we get (2.7.2). ■

Linear transformations of matrices with matrix variate $\theta$-generalized normal distribution also have matrix variate $\theta$-generalized normal distribution. This is proved in the next theorem.

THEOREM 2.7.2. Let $Y \sim N_{p,n}(M, A, B, \theta)$. Let $C\,(p \times p)$ and $D\,(n \times n)$ be nonsingular matrices, $L$ be a $p \times n$ matrix, and define $Z = CYD + L$. Then

$$Z \sim N_{p,n}(CMD + L, CA, BD, \theta). \tag{2.7.3}$$

Proof: Let $X \sim N_{p,n}(0, I_p, I_n, \theta)$ and $Y = AXB + M$. Then $Z = (CA)X(BD) + (CMD + L)$, where $CA$ and $BD$ are nonsingular. From this (2.7.3) follows. ■

It may be remarked here that $N_{p,n}(M, A, B, 2) = N_{p,n}(M, \frac{1}{2}(AA') \otimes (B'B))$. Indeed, let $Y \sim N_{p,n}(M, A, B, 2)$. Then $Y = \frac{1}{\sqrt{2}}A(\sqrt{2}X)B + M$, where $\sqrt{2}X \sim N_{p,n}(0, I_p \otimes I_n)$, from which the statement follows. Therefore the matrix variate normal distribution is a special case of the matrix variate $\theta$-generalized normal distributions.

The relationship between matrix variate $\theta$-generalized normal distributions and multivariate $\theta$-generalized normal distributions is pointed out in the next theorem.

THEOREM 2.7.3. $Y \sim N_{p,n}(M, A, B, \theta)$ if and only if $\operatorname{vec}(Y') \sim N_{np}(\operatorname{vec}(M'), A \otimes B', \theta)$.
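Proof: $Y = AXB + M$ is equivalent to $\operatorname{vec}(Y') = (A \otimes B')\operatorname{vec}(X') + \operatorname{vec}(M')$, from which the statement of the theorem follows. ■

The next theorem shows that the parameters of a matrix variate $\theta$-generalized normal distribution are not uniquely determined.

THEOREM 2.7.4. $N_{p,n}(M, A, B, \theta)$ and $N_{p,n}(M^*, A^*, B^*, \theta)$ define the same distribution if and only if $M = M^*$ and

(a) in the case of $\theta = 2$, there exist $G\,(p \times p)$ and $H\,(n \times n)$ orthogonal matrices and $c > 0$ such that $A^* = cAG$ and $B^* = \frac{1}{c}HB$,

(b) in the case of $\theta \ne 2$, there exist $P\,(p \times p)$ and $Q\,(n \times n)$ signed permutation matrices and $c > 0$ such that $A^* = cAP$ and $B^* = \frac{1}{c}QB$.

Proof: The sufficiency of the conditions is obvious. To prove necessity assume that $N_{pn}(\operatorname{vec}(M'), A \otimes B', \theta)$ and $N_{pn}(\operatorname{vec}(M^{*\prime}), A^* \otimes B^{*\prime}, \theta)$ define the same distribution. Since the first distribution is symmetric about $\operatorname{vec}(M')$ and the second one about $\operatorname{vec}(M^{*\prime})$, we must have $M = M^*$.

(a) If $\theta = 2$, we get

$$N\big(\operatorname{vec}(M'), \tfrac{1}{2}(AA') \otimes (B'B)\big) = N\big(\operatorname{vec}(M'), \tfrac{1}{2}(A^*A^{*\prime}) \otimes (B^{*\prime}B^*)\big).$$

Hence there exists $c^2 > 0$ such that $A^*A^{*\prime} = c^2AA'$ and $B^{*\prime}B^* = \frac{1}{c^2}B'B$. But then we can find $G\,(p \times p)$ and $H\,(n \times n)$ orthogonal matrices such that $A^* = cAG$ and $B^* = \frac{1}{c}HB$.

(b) If $\theta \ne 2$ we use Theorem 3 of Goodman and Kotz (1973), which says that $A^* \otimes B^{*\prime}$ can differ from $A \otimes B'$ by at most a post-multiplicative signed permutation matrix $R$. That is, $A^* \otimes B^{*\prime} = (A \otimes B')R$, or $A^{-1}A^* \otimes (B^{-1})'B^{*\prime} = R$. This last equation is equivalent to $A^{-1}A^* = cP$, $(B^{-1})'B^{*\prime} = \frac{1}{c}Q$, where $P\,(p \times p)$ and $Q\,(n \times n)$ are signed permutation matrices, and $c > 0$. ■

The first four moments of a matrix variate $\theta$-generalized normal distribution are derived next. For notational ease we will write $\eta = \frac{1}{\theta}$.

THEOREM 2.7.5. Let $X \sim N_{p,n}(0, I_p, I_n, \theta)$, then

(i) $E(x_{ij}) = 0$,

(ii) $E(x_{i_1j_1}x_{i_2j_2}) = \dfrac{\Gamma(3\eta)}{\Gamma(\eta)}\delta_{i_1i_2}\delta_{j_1j_2}$,

(iii) $E(x_{i_1j_1}x_{i_2j_2}x_{i_3j_3}) = 0$, and

(iv)
$$\begin{aligned}
E(x_{i_1j_1}x_{i_2j_2}x_{i_3j_3}x_{i_4j_4}) &= \Big[\frac{\Gamma(5\eta)}{\Gamma(\eta)} - 3\frac{\Gamma^2(3\eta)}{\Gamma^2(\eta)}\Big]\delta_{i_1i_2i_3i_4}\delta_{j_1j_2j_3j_4} \\
&\quad + \frac{\Gamma^2(3\eta)}{\Gamma^2(\eta)}\big(\delta_{i_1i_2}\delta_{j_1j_2}\delta_{i_3i_4}\delta_{j_3j_4} + \delta_{i_1i_3}\delta_{j_1j_3}\delta_{i_2i_4}\delta_{j_2j_4} + \delta_{i_1i_4}\delta_{j_1j_4}\delta_{i_2i_3}\delta_{j_2j_3}\big),
\end{aligned}$$

where $\delta_{i_1i_2i_3i_4} = 1$ if $i_1 = i_2 = i_3 = i_4$ and $0$ otherwise.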
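Proof: If $x_{ij}$ has the p.d.f. $f(x_{ij}) = \dfrac{1}{2\Gamma(1 + \eta)}\exp\{-|x_{ij}|^{\theta}\}$ and $k \ge 1$, then

$$E(|x_{ij}|^k) = \frac{1}{\Gamma(1 + \eta)}\int_0^{\infty} x_{ij}^k \exp(-x_{ij}^{\theta})\,dx_{ij} = \frac{\Gamma((k+1)\eta)}{\Gamma(\eta)}.$$

Thus if $k$ is a non-negative integer, we have

$$E(x_{ij}^{2k}) = \frac{\Gamma((2k+1)\eta)}{\Gamma(\eta)}. \tag{2.7.4}$$

Using (2.7.4) and the fact that the elements of $X$ are independent of each other we obtain the results of the theorem. ■

THEOREM 2.7.6. Let $Y \sim N_{p,n}(0, A, B, \theta)$, then

(i) $E(y_{ij}) = 0$,

(ii) $E(y_{i_1j_1}y_{i_2j_2}) = \dfrac{\Gamma(3\eta)}{\Gamma(\eta)}g_{i_1i_2}h_{j_1j_2}$,

(iii) $E(y_{i_1j_1}y_{i_2j_2}y_{i_3j_3}) = 0$, and

(iv)
$$\begin{aligned}
E(y_{i_1j_1}y_{i_2j_2}y_{i_3j_3}y_{i_4j_4}) &= \Big[\frac{\Gamma(5\eta)}{\Gamma(\eta)} - 3\frac{\Gamma^2(3\eta)}{\Gamma^2(\eta)}\Big]r_{i_1i_2i_3i_4}q_{j_1j_2j_3j_4} \\
&\quad + \frac{\Gamma^2(3\eta)}{\Gamma^2(\eta)}\big(g_{i_1i_2}h_{j_1j_2}g_{i_3i_4}h_{j_3j_4} + g_{i_1i_3}h_{j_1j_3}g_{i_2i_4}h_{j_2j_4} + g_{i_1i_4}h_{j_1j_4}g_{i_2i_3}h_{j_2j_3}\big),
\end{aligned}$$

where $g_{uv} = \sum_{k=1}^{p} a_{uk}a_{vk}$, $h_{uv} = \sum_{\ell=1}^{n} b_{\ell u}b_{\ell v}$, $r_{uvwt} = \sum_{k=1}^{p} a_{uk}a_{vk}a_{wk}a_{tk}$, and $q_{uvwt} = \sum_{\ell=1}^{n} b_{\ell u}b_{\ell v}b_{\ell w}b_{\ell t}$.

Proof: The results can be obtained from Theorem 2.7.5 by expressing $Y$ as $Y = AXB$ where $X \sim N_{p,n}(0, I_p, I_n, \theta)$. ■

THEOREM 2.7.7. Let $Y \sim N_{p,n}(M, A, B, \theta)$, then

(i) $E(y_{ij}) = m_{ij}$,

(ii) $E(y_{i_1j_1}y_{i_2j_2}) = \dfrac{\Gamma(3\eta)}{\Gamma(\eta)}g_{i_1i_2}h_{j_1j_2} + m_{i_1j_1}m_{i_2j_2}$,

(iii) $E(y_{i_1j_1}y_{i_2j_2}y_{i_3j_3}) = \dfrac{\Gamma(3\eta)}{\Gamma(\eta)}\big[g_{i_1i_2}h_{j_1j_2}m_{i_3j_3} + g_{i_1i_3}h_{j_1j_3}m_{i_2j_2} + g_{i_2i_3}h_{j_2j_3}m_{i_1j_1}\big] + m_{i_1j_1}m_{i_2j_2}m_{i_3j_3}$, and

(iv)
$$\begin{aligned}
E(y_{i_1j_1}y_{i_2j_2}y_{i_3j_3}y_{i_4j_4}) &= \Big[\frac{\Gamma(5\eta)}{\Gamma(\eta)} - 3\frac{\Gamma^2(3\eta)}{\Gamma^2(\eta)}\Big]r_{i_1i_2i_3i_4}q_{j_1j_2j_3j_4} \\
&\quad + \frac{\Gamma^2(3\eta)}{\Gamma^2(\eta)}\big(g_{i_1i_2}h_{j_1j_2}g_{i_3i_4}h_{j_3j_4} + g_{i_1i_3}h_{j_1j_3}g_{i_2i_4}h_{j_2j_4} + g_{i_1i_4}h_{j_1j_4}g_{i_2i_3}h_{j_2j_3}\big) \\
&\quad + \frac{\Gamma(3\eta)}{\Gamma(\eta)}\big(m_{i_1j_1}m_{i_2j_2}g_{i_3i_4}h_{j_3j_4} + m_{i_1j_1}m_{i_3j_3}g_{i_2i_4}h_{j_2j_4} + m_{i_1j_1}m_{i_4j_4}g_{i_2i_3}h_{j_2j_3} \\
&\qquad + m_{i_2j_2}m_{i_3j_3}g_{i_1i_4}h_{j_1j_4} + m_{i_2j_2}m_{i_4j_4}g_{i_1i_3}h_{j_1j_3} + m_{i_3j_3}m_{i_4j_4}g_{i_1i_2}h_{j_1j_2}\big) \\
&\quad + m_{i_1j_1}m_{i_2j_2}m_{i_3j_3}m_{i_4j_4},
\end{aligned}$$

where the functions $g$, $h$, $r$, and $q$ are defined in Theorem 2.7.6.

Proof: The results follow from Theorem 2.7.6 if we express $Y$ as $Y = X + M$ where $X \sim N_{p,n}(0, A, B, \theta)$. ■

COROLLARY 2.7.7.1. Let $X \sim N_{p,n}(M, A, B, \theta)$, then $E(X) = M$ and

$$\operatorname{cov}(\operatorname{vec}(X')) = \frac{\Gamma(3\eta)}{\Gamma(\eta)}(AA') \otimes (B'B). \tag{2.7.5}$$

COROLLARY 2.7.7.2. Let $Y \sim N_{p,n}(M, A, B, \theta)$, then

$$\operatorname{corr}(y_{i_1j_1}, y_{i_2j_2}) = \frac{g_{i_1i_2}h_{j_1j_2}}{\sqrt{g_{i_1i_1}g_{i_2i_2}h_{j_1j_1}h_{j_2j_2}}},$$

and hence $\operatorname{corr}(y_{i_1j_1}, y_{i_2j_2})$ does not depend on $\theta$.

Using the expressions for the moments the following result can be derived.

THEOREM 2.7.8. Let $X \sim N_{p,n}(M, A, B, \theta)$ and $E\,(r \times p)$, $C\,(n \times k)$, $F\,(q \times p)$, and $D\,(n \times \ell)$ be constant matrices. Then $EXC$ and $FXD$ are uncorrelated if and only if either $C'B'BD = 0$ or $EAA'F' = 0$. In particular, $XC$ and $XD$ are uncorrelated iff $C'B'BD = 0$, and $EX$ and $FX$ are uncorrelated iff $EAA'F' = 0$.

Proof: Using (2.7.5) we get

$$\begin{aligned}
\operatorname{cov}\big(\operatorname{vec}((EXC)'), \operatorname{vec}((FXD)')\big) &= \operatorname{cov}\big((E \otimes C')\operatorname{vec}(X'), (F \otimes D')\operatorname{vec}(X')\big) \\
&= (E \otimes C')\frac{\Gamma(3\eta)}{\Gamma(\eta)}\{(AA') \otimes (B'B)\}(F' \otimes D) \\
&= \frac{\Gamma(3\eta)}{\Gamma(\eta)}(EAA'F') \otimes (C'B'BD),
\end{aligned}$$

and the last expression equals zero iff $EAA'F' = 0$ or $C'B'BD = 0$. ■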
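The gamma-function ratios that drive Theorems 2.7.5 to 2.7.7 can be tabulated directly. The following sketch evaluates the second and fourth moments of a standard $\theta$-generalized normal variable, together with the bracketed coefficient in part (iv), and checks them against the two familiar special cases ($\theta = 2$, a normal variable with variance $\frac{1}{2}$, and $\theta = 1$, a Laplace variable). The function names are illustrative assumptions.

```python
import math

def abs_moment(k, theta):
    """E|x|^k = Gamma((k + 1) * eta) / Gamma(eta), where eta = 1 / theta."""
    eta = 1.0 / theta
    return math.gamma((k + 1) * eta) / math.gamma(eta)

def fourth_moment_excess(theta):
    """Coefficient Gamma(5 eta)/Gamma(eta) - 3 Gamma(3 eta)^2 / Gamma(eta)^2 in part (iv)."""
    return abs_moment(4, theta) - 3.0 * abs_moment(2, theta) ** 2

# theta = 2: normal with variance 1/2, so E x^2 = 1/2 and E x^4 = 3 * (1/2)^2.
var_normal = abs_moment(2, 2.0)
fourth_normal = abs_moment(4, 2.0)

# theta = 1: Laplace with density exp(-|x|)/2, so E x^2 = 2! and E x^4 = 4!.
var_laplace = abs_moment(2, 1.0)
fourth_laplace = abs_moment(4, 1.0)

assert math.isclose(var_normal, 0.5)
assert math.isclose(fourth_normal, 0.75)
assert math.isclose(var_laplace, 2.0)
assert math.isclose(fourth_laplace, 24.0)

# The extra fourth-order term vanishes in the normal case and is positive for Laplace.
assert abs(fourth_moment_excess(2.0)) < 1e-12
assert math.isclose(fourth_moment_excess(1.0), 12.0)
```

For $\theta = 2$ the bracketed coefficient is zero, so part (iv) collapses to the usual three-pairing formula for normal variables. The factor `abs_moment(2, theta)` is also the scalar multiplying $(AA') \otimes (B'B)$ in (2.7.5).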
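The next theorem shows that matrix variate $\theta$-generalized normal distributions have maximal entropy in a certain class of distributions.

THEOREM 2.7.9. Let $X\,(p \times n)$ be a random matrix with p.d.f. $f$ such that $E\|AXB + M\|_{\theta}^{\theta} = c$, where $A\,(p \times p)$ and $B\,(n \times n)$ are nonsingular matrices, $M$ is a $p \times n$ matrix, $c$ is a given scalar, and for a $p \times n$ matrix $Y$ we define $\|Y\|_{\theta}$ by $\|Y\|_{\theta}^{\theta} = \sum_{i}\sum_{j}|y_{ij}|^{\theta}$. Then the entropy of $X$, that is, $E(-\ln f(X))$, is maximized iff $X = Y$ a.e., where

$$Y \sim N_{p,n}\Big(-A^{-1}MB^{-1}, \Big(\frac{c\theta}{pn}\Big)^{\frac{1}{\theta}}A^{-1}, B^{-1}, \theta\Big).$$

The maximal entropy is

$$\frac{pn}{\theta}\Big[1 + \ln\frac{c\theta}{pn}\Big] - \ln\Big\{\Big(2\Gamma\big(1 + \tfrac{1}{\theta}\big)\Big)^{-np}\det(A)^{n}\det(B)^{p}\Big\}.$$

Proof: See Gupta and Varga (1995a). ■

PROBLEMS

2.1. Let the p.d.f. of $X\,(p \times n)$ be given by (2.2.1). Derive the characteristic function of $X$.

2.2. Let $X\,(p \times n) \sim N_{p,n}(M_1, \Sigma_1 \otimes \Phi_1)$ and $Y\,(p \times n) \sim N_{p,n}(M_2, \Sigma_2 \otimes \Phi_2)$ be independently distributed. Prove that $X + Y \sim N_{p,n}(M_1 + M_2, (\Sigma_1 \otimes \Phi_1) + (\Sigma_2 \otimes \Phi_2))$.

2.3. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi)$, and partition $X$, $M$, $\Sigma$ and $\Phi$ as

$$X = \begin{pmatrix} X_{1r} \\ X_{2r} \end{pmatrix} = (X_{1c} \;\; X_{2c}), \quad M = \begin{pmatrix} M_{1r} \\ M_{2r} \end{pmatrix} = (M_{1c} \;\; M_{2c}), \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, \quad \Phi = \begin{pmatrix} \Phi_{11} & \Phi_{12} \\ \Phi_{21} & \Phi_{22} \end{pmatrix},$$

where $X_{1r}$ and $M_{1r}$ are $m \times n$, $X_{1c}$ and $M_{1c}$ are $p \times t$, $\Sigma_{11}$ is $m \times m$, and $\Phi_{11}$ is $t \times t$. Then, prove that (i) $X_{1r}$ and $X_{2r}$ are independent if and only if $\Sigma_{12} = 0$, and (ii) $X_{1c}$ and $X_{2c}$ are independent if and only if $\Phi_{12} = 0$.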
PROBLEMS 83 2.4. Let X = (жь..., xn) ~ NPy7l(M, Σ<Ε>Φ) and denote its p.di. by p(X). Further, let /(2/1I2/2) be the conditional density of уг given y2- Using suitable notations for the means and covariances, write down explicitly p(X) = /i(xi)/2(a52|a*i)/3(a53|a>b a*2) · · · fn(xn\xu · · ·, ®n-i)· 2.5. Prove Theorem 2.3.3(iii). 2.6. Let X(pxn)~ Α^ρ,η(0, Σ <g> Φ) and Σ = (σ0·), Φ = (^»j)· Then show that Г/ (Xiij'i Xi2J2ХгзЗЗXU3aXib35ХгвЗб ) = σ4ΐ2 Ψ3\32σΐ3*4 Ψ3334σΐδί6 rjs36 ι σ%\%2Ψ3\3\σϊζ^Ψ3Ζ3δσϊ\4'ψ3\3$ ' аг\г2'Фз\32<7гъгб'ФзъЗЬ<7игъ'1гUjb 1 <Jiii3Wjijz(Ji2i4^J234(Ji5i6,^3536 ' σi\iz^hhG^Ъ^г323bG^Ь^3\3ь + σχ\χζΎ3\3ζσΪ2Χ$Ψ3^3^σΐ\^Ψ3\3^ ~^~ <Jhi4V;JiJ4(7i2i3^3233(7i5i6V3536 + <Jiii4WjiJ4<Ji2i5'lP3235<Ji3i6'^3336 ' аЧ14Г3134аъгбГ323б<7гзг5'1Рзз35 l (7hi5VjlJ5Cri2i3Vj2J3(7Ui6Vj4J6 ' °ti 15 %"lj5 ^24 ^.72.74 °*3*6 ^rj3j6 + Gi\ib^3\3bGnib^323b(J^4^3334 ' ^ii»6^j4j6^*2t3^J2J3^*4*5^rj4j5 + σχ\χ$Ψ3\3$σΪ2Ϊ4Ύ3^σΪ3^Ψ333^ ~l~ ahi6V;jiJ6<Ji2i5^3235<Ji3i4^3334· 2.7. Prove Theorem 2.3.4(ii), (iv) and (v). 2.8. Prove Theorem 2.3.5(v)-(vii). 2.9. Prove Theorem 2.3.6(ii)-(vii). 2.10. Prove Theorem 2.3.8. 2.11. Prove Theorem 2.3.9. 2.12. Let Χ ~ ΑΓρ>η(Μ,Σ <g> Φ). Then, for given matrices A, £, and С of suitable order, find (i) £(X'AX'£X'CX) (ii) ^(X'AX'BX'CX') (iii) £(XAX'£XCX') (iv) £(tr(XAX£X')*') (v) £(tr(XAX'£X')X) (vi) £(tr(X'AX'BX)X') (vii) £(tr(i4XBX')XCX') (viii) £;(tr(AX)X,BX,CX) (ix) E(ti(AX)X'BX'CX') (χ) £?(ίΓ(Χ;Α)ΧΒΧσΧ) (xi) Е(Ь(Х'А)Х'ВХ'СХ'). 2.13. Let Χ ~ Νρ>η(Μ, Σ <g> Φ). Then, prove that £(X (g) X) = νβο(Σ)(νβο(Φ))' + Μ <g> M. (Neudecker and Wansbeek, 1987)
84 CHAPTER 2. MATRIX VARIATE NORMAL DISTRIBU^N 2.14. Let X ~ iVp,n(0, Σ ® Φ). Then, for given Α (ρ χ η) and о = 0,1,2,...,.. find (i) E(ti(XA')2(AA')a) (ii) Е(Ы(ХА'АХ'(АА')а)) (iii) E(ti(XX')ti(XX'(AA')a)) (iv) E(ti((XX')(AA')°)) (v) E(ti3(XX')). 2.15. Let X (pxn) and У (рхтг) be identically distributed random matrices. Suppose that Y\X ~ NPiTl(aX + £,Σ <g> Jn) with 5(pxn) and |a| < 1. Then, prove that X ~ iVPfn((l - a)~lB, (1 - α2)"^ <g> In) and (ί)~.ν,,.(α-«)-(^),(ΐ-α=Γ·(αΣε f )·*.)· (Bekker and Roux, 1990) 2.16. Let X (pxn) and У (ρ χ η) be identically distributed random matrices. Suppose that Y\X ~ iVp>n(AX + £, /p <8> /n) with В (pxn) and A(pxp) is symmetric. Then, prove that X ~ ATp>n((Jp - A)~lB, (Ip - A2)~l <g> Jn) and X\ ,/(/ρ-Α)-^\ / (/p-A2)-1 A(Ip-A*)-i\, y)~ 2p^{(ip-a)-ib)'[a(ip-a>)-i (ιρ-α*)-4>· (Bekker and Roux, 1990) 2.17. Let X (pxn) and Υ (pxn) be identically distributed random matrices. Further, let X and V = Υ - aX be independent with V ~ NPy7l(B, Σ®Ιη), B(pxn) and \a\ < 1. Then, prove that X ~ iVp>n((l - α)_1£,Σ <g> Jn) and ί γ j also has a matrix variate normal distribution. (Bekker and Roux, 1990) 2.18. Let X and Υ be ρ χ η, identically distributed random matrices. Suppose that Y\X ~ NPy7l(AX + Β, Σ <g> Φ), where В is ρ χ η, and A is a p χ ρ matrix which satisfies the following conditions: (i) A is symmetric, (ii) таХг|С1^(А)| < 1, (iii) ΑΣ = ΣΑ. Define Ζ = ( v]. Then, prove that ((ΙΡ-Α)-*Β\ /E(/rAV ΣΑ(/Ρ-Α^\ (Gupta and Varga, 1994b)
PROBLEMS 85 2.19. Let X and Υ be ρ χ η, identically distributed random matrices. Suppose that X and V = Υ - AX are independent, and V ~ iVp>n(J3,E <g> Φ) where β is ρ χ η, and A is ρ χ ρ matrix which satisfies the conditions (i)-(iii) of Problem 2.18. Define Ζ = ( γ J. Then, prove that 7 Μ ,{Ά-Λ)-'Β\ /Σ{Ι,-Α')-' ΣΑ(ί,-Λ!)-\ (Gupta and Varga, 1994b) 2.20. Let X and У be ρ χ η, identically distributed random matrices with E(X) = E(Y) = 0 and suppose vec(X') has covariance matrix Σ <g> Φ. Moreover, suppose that A is nonsingular and satisfies the conditions (i)-(iii) of Problem 2.18. Let Ζ = ( Y j. Then, prove that z~n^Q-{za ")··)■ if and only if X and V = (Ip - A2) 2 (Y - AX) are independent and identically distributed. (Gupta and Varga, 1994b) 2.21. Let X (pxn) and У (qxn) be random matrices. Suppose that Y\X ~ Nq>n(C+ ΌΧ,Σ2 <g> Φ) and X ~ NPtTl(F,Zi <g> Φ). Let Ζ = ( Υ Then prove that Z ~ N^ ((z>/+ с) . (Д Σ2 + £ljD<) ° *> (Gupta and Varga, 1994b) 2.22. Let Χ (ρ χ η), F({xn) be random matrices and suppose that Y\X ~ Nqtn(C+ ϋΧ,Σ2 ® Φ), Х|Г = Го ~ Νρ<η(Μ,Σχ <g> Φ), where С(q χ η), £>(<? χ ρ), %2(q x ς), Φ (η χ η), Μ (ρ χ η), Σι (ρ χ ρ), Σχ > 0, Σ2 > 0, Φ > 0, and У0 is a fixed qxn matrix. Define Β = Σ^'Σ^1, Α = Μ - Σ^ϋ'Σ^Υο, ({IP-BD)-\A + BC)\ ,χ\ Ν = ρ / and Ζ = ( £ . Then, prove that V(/,-ob)-1(c+-da); W , / (7p-BD)-% (/Ρ-Β£>)-1ΒΣ2\ Ν (Gupta and Varga, 1992)
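2.23. Let $X = X' \sim SN_{p,p}(M, B_p'(\Sigma \otimes \Phi)B_p)$. Then, for given $A\,(p \times p)$ and $C\,(p \times p)$, prove that

(i)
$$\begin{aligned}
E(\operatorname{tr}(CX)XAX) &= \tfrac{1}{4}[MA\Sigma C'\Phi + MA\Phi C\Sigma + MA\Sigma C\Phi + MA\Phi C'\Sigma + \Sigma C'\Phi AM + \Phi C\Sigma AM \\
&\quad + \Sigma C\Phi AM + \Phi C'\Sigma AM + \operatorname{tr}(CM)\Sigma A'\Phi + \operatorname{tr}(A\Sigma)\operatorname{tr}(CM)\Phi \\
&\quad + \operatorname{tr}(A\Phi)\operatorname{tr}(CM)\Sigma + \operatorname{tr}(CM)\Phi A'\Sigma] + \operatorname{tr}(CM)MAM,
\end{aligned}$$

(ii)
$$\begin{aligned}
E(\operatorname{tr}(AXCX)X) &= \tfrac{1}{4}[\operatorname{tr}(A\Sigma C'\Phi)M + \operatorname{tr}(A\Phi)\operatorname{tr}(C\Sigma)M + \operatorname{tr}(A\Phi C'\Sigma)M + \operatorname{tr}(A\Sigma)\operatorname{tr}(\Phi C)M \\
&\quad + \Sigma C'MA'\Phi + \Phi AMC\Sigma + \Sigma AMC\Phi + \Phi C'MA'\Sigma \\
&\quad + \Sigma A'MC'\Phi + \Phi CMA\Sigma + \Sigma CMA\Phi + \Phi A'MC'\Sigma] + \operatorname{tr}(AMCM)M.
\end{aligned}$$

(H. M. Nel, 1977)

2.24. Let $X \sim N_{p,n}(M, \Sigma \otimes I_n)$. Assuming a priori that $M \sim N_{p,n}(0, \Omega \otimes I_n)$, derive its posterior distribution.

2.25. Let $X \sim N_{p,n}(M, \Sigma \otimes \Phi \,|\, s, C)$. Partition $X$ as $X = \begin{pmatrix} X_{1r} \\ X_{2r} \end{pmatrix}$, where $X_{1r}$ is $p_1 \times n$, $X_{2r}$ is $p_2 \times n$, and $p_1 + p_2 = p$, and derive the marginal p.d.f. of $X_{1r}$.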
CHAPTER 3

WISHART DISTRIBUTION

3.1. INTRODUCTION

Let $y_1, \ldots, y_n$ be $n$ independent standard normal variables. Then, $w = \sum_{i=1}^{n} y_i^2 \sim \chi^2_n$ with p.d.f.
\[
\{2^{\frac{1}{2}n}\Gamma(\tfrac{1}{2}n)\}^{-1} w^{\frac{1}{2}n-1} \exp(-\tfrac{1}{2}w), \quad w > 0. \tag{3.1.1}
\]
A $p$-variate generalization of (3.1.1) has been given by Krishnamoorthy and Parthasarathy (1951). In this chapter, we study a matrix variate generalization of (3.1.1), known as the Wishart distribution (Wishart, 1928). The discovery of this distribution has contributed enormously to the development of multivariate analysis, e.g., see Roy (1957), Kshirsagar (1972), Press (1972), Giri (1977), Srivastava and Khatri (1979), Muirhead (1982), Anderson (1984), and Siotani, Hayakawa and Fujikoshi (1985).

3.2. DENSITY FUNCTION

In this section, we derive the density of a Wishart matrix using normal vectors. We begin by defining the Wishart distribution.

DEFINITION 3.2.1. A $p \times p$ random symmetric positive definite matrix $S$ is said to have a Wishart distribution with parameters $p$, $n$, and $\Sigma\,(p \times p) > 0$, written as $S \sim W_p(n, \Sigma)$, if its p.d.f. is given by
\[
\{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\det(\Sigma)^{\frac{1}{2}n}\}^{-1} \det(S)^{\frac{1}{2}(n-p-1)} \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1}S), \quad S > 0,\ n \ge p. \tag{3.2.1}
\]

Fisher (1915) derived this distribution for $p = 2$ in order to study the distribution of the correlation coefficient from a normal sample. Wishart (1928) obtained the distribution for arbitrary $p$ as the joint distribution of sample variances and covariances from a multivariate normal population. Because of its important role in multivariate statistical analysis, various authors have given different derivations, e.g., see Wishart and Bartlett (1933), Ingham (1933), Mahalanobis, Bose and Roy (1937),
Madow (1938), Hsu (1939b), Elfving (1947), Sverdrup (1947), Rasch (1948), Ogawa (1953), James (1954), Mauldon (1955), Wijsman (1957), Kshirsagar (1959), and Jambunathan (1965). This distribution, for $\Sigma = I_p$, belongs to the class of orthogonally invariant and residual independent distributions discussed in Chapter 9. The orthogonal invariance and residual independence properties in this case are given in Theorems 3.3.2 and 3.3.4 respectively.

If $x_1, \ldots, x_n$ are independent $N_p(0, \Sigma)$, then $X = (x_1, \ldots, x_n)$ has a matrix variate normal distribution. Further, if $n \ge p$, then $XX' > 0$ with probability one (Stein, 1969; Dykstra, 1970) and $XX' \sim W_p(n, \Sigma)$ as shown below.

THEOREM 3.2.1. Let $X \sim N_{p,n}(0, \Sigma \otimes I_n)$, $n \ge p$, then $XX' > 0$ with probability one.

Proof: Let $X = (x_1, \ldots, x_n)$. Then, it suffices to show that $X$ has rank $p \le n$, that is, any $p$ random vectors $x_1, \ldots, x_p$ are linearly independent with probability one. Now,
\[
\begin{aligned}
P\{x_1, \ldots, x_p \text{ are linearly independent}\}
&= 1 - P\{x_1, \ldots, x_p \text{ are linearly dependent}\} \\
&\ge 1 - \sum_{i=1}^{p} P\{x_i \text{ is a linear combination of others}\} \\
&= 1 - p\,P\Big\{x_1 = \sum_{j=2}^{p} d_j x_j, \text{ for at least one } d_j \ne 0\Big\}.
\end{aligned}
\]
Since $x_1, \ldots, x_p$ are independent random vectors having a nondegenerate continuous distribution with covariance matrix $\Sigma > 0$,
\[
P\Big\{x_1 = \sum_{j=2}^{p} d_j x_j, \text{ for at least one } d_j \ne 0\Big\} = 0.
\]
Hence, $P\{x_1, \ldots, x_p \text{ are linearly independent}\} = 1$ and the proof is complete. ■

The above theorem has been proven by Eaton and Perlman (1973) without assuming normality (see also Das Gupta, 1971).

THEOREM 3.2.2. Let $X \sim N_{p,n}(0, \Sigma \otimes I_n)$ and define $S = XX'$, $n \ge p$. Then $S \sim W_p(n, \Sigma)$.

Proof: The density of $X$ is
\[
(2\pi)^{-\frac{1}{2}np} \det(\Sigma)^{-\frac{1}{2}n} \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1}XX').
\]
Since $XX' > 0$ with probability one, make the transformation $X = TH_1$, where $T\,(p \times p) = (t_{ij})$ is a lower triangular matrix with $t_{ii} > 0$, $i = 1, \ldots, p$, and $H_1\,(p \times n)$
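is a semiorthogonal matrix, $H_1H_1' = I_p$. The Jacobian of this transformation, $J(X \to T, H_1) = \prod_{i=1}^{p} t_{ii}^{\,n-i}\, g_{n,p}(H_1)$, is given in (1.3.25). Hence, the joint density of $T$ and $H_1$ is
\[
(2\pi)^{-\frac{1}{2}np} \det(\Sigma)^{-\frac{1}{2}n} \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1}TT') \prod_{i=1}^{p} t_{ii}^{\,n-i}\, g_{n,p}(H_1).
\]
Now, integrating out $H_1$ using Theorem 1.4.9, we get the marginal density of $T$ as
\[
\frac{2^{-\frac{1}{2}np+p}}{\Gamma_p(\tfrac{1}{2}n)} \det(\Sigma)^{-\frac{1}{2}n} \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1}TT') \prod_{i=1}^{p} t_{ii}^{\,n-i}. \tag{3.2.2}
\]
In (3.2.2), let $S = TT'\,(= XX')$ with the Jacobian $J(T \to S) = \{2^p \prod_{i=1}^{p} t_{ii}^{\,p-i+1}\}^{-1}$, then the density of $S$ is
\[
\{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\det(\Sigma)^{\frac{1}{2}n}\}^{-1} \det(S)^{\frac{1}{2}(n-p-1)} \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1}S). \qquad ■
\]

Note that in the derivation of the Wishart density given above it is assumed that $n\,(\ge p)$ is an integer, but the density (3.2.1) exists for all $n \ge p$. If $n < p$, the density of $XX'$ is called, by some authors, a pseudo Wishart, e.g., Kshirsagar (1972), Siotani, Hayakawa and Fujikoshi (1985). If $\Sigma$ is of less than full rank, say $p_1$, then by Definition 2.4.1, $X \sim N_{p,n}(0, \Sigma \otimes I_n \mid p_1, n)$, and there exists a matrix $H\,(p \times p_1)$ of rank $p_1$ such that $X = HY$, where $Y \sim N_{p_1,n}(0, I_{p_1} \otimes I_n)$. In this case, $S = HYY'H'$, where $YY' \sim W_{p_1}(n, I_{p_1})$, and $S$ is said to have a singular Wishart distribution.

We now derive the c.d.f. of a Wishart matrix.

THEOREM 3.2.3. Let $S \sim W_p(n, \Sigma)$, then
\[
P(S < \Lambda) = \frac{\Gamma_p[\tfrac{1}{2}(p+1)]\,\det(\Lambda)^{\frac{1}{2}n}}{2^{\frac{1}{2}np}\det(\Sigma)^{\frac{1}{2}n}\,\Gamma_p[\tfrac{1}{2}(n+p+1)]}\; {}_1F_1\big(\tfrac{1}{2}n;\ \tfrac{1}{2}(n+p+1);\ -\tfrac{1}{2}\Sigma^{-1}\Lambda\big),
\]
where $\Lambda\,(p \times p) > 0$.

Proof: We have
\[
P(S < \Lambda) = \{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\det(\Sigma)^{\frac{1}{2}n}\}^{-1} \int_{0 < S < \Lambda} \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1}S) \det(S)^{\frac{1}{2}(n-p-1)}\, dS. \tag{3.2.3}
\]
Substituting $B = \Lambda^{-\frac{1}{2}} S \Lambda^{-\frac{1}{2}}$ with the Jacobian $J(S \to B) = \det(\Lambda)^{\frac{1}{2}(p+1)}$ in (3.2.3) and writing $\operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1}S) = {}_0F_0(-\tfrac{1}{2}\Sigma^{-1}S)$, we get
\[
P(S < \Lambda) = \frac{\det(\Lambda)^{\frac{1}{2}n}}{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\det(\Sigma)^{\frac{1}{2}n}} \int_{0 < B < I_p} \det(B)^{\frac{1}{2}(n-p-1)}\, {}_0F_0\big(-\tfrac{1}{2}\Lambda^{\frac{1}{2}}\Sigma^{-1}\Lambda^{\frac{1}{2}}B\big)\, dB.
\]
The proof is completed by using Theorem 1.6.3. ■

In Theorem 3.2.2, we have derived the Wishart density assuming $X \sim N_{p,n}(0, \Sigma \otimes \Psi)$ where $\Psi = I_n$. However, if $\Psi \ne I_n$, under certain conditions on $\Psi$, $XX'$ is still distributed as Wishart as shown below.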
THEOREM 3.2.4. Let $X \sim N_{p,n}(0, \Sigma \otimes \Psi \mid p, q)$, where $\Psi\,(n \times n)$ is a symmetric idempotent matrix of rank $q \ge p$. Then $XX' \sim W_p(q, \Sigma)$.

Proof: Since $\Psi$ is singular, from Definition 2.4.1, we can write $X = YR$, where $Y \sim N_{p,q}(0, \Sigma \otimes I_q)$ and $R\,(q \times n)$ is a matrix of rank $q \ge p$ with $\Psi = R'R$. Note that $RR'$ is an idempotent matrix of full rank and hence an identity matrix. Now, $XX' = YRR'Y' = YY' \sim W_p(q, \Sigma)$ according to Theorem 3.2.2. ■

A result closely related to the above theorem is the following.

THEOREM 3.2.5. Let $X \sim N_{p,n}(0, \Sigma \otimes I_n)$ and $\Psi\,(n \times n)$ be a symmetric idempotent matrix of rank $q \ge p$, then $X\Psi X' \sim W_p(q, \Sigma)$.

Proof: Since $\Psi\,(n \times n)$ is of rank $q \le n$, one can write $\Psi = BB'$, where $B\,(n \times q)$ is of rank $q$. Now, from Theorem 2.3.10, $Y = XB \sim N_{p,q}(0, \Sigma \otimes B'B)$. Here, $B'B$ is an idempotent matrix of full rank and hence $B'B = I_q$. The result follows from Theorem 3.2.2, by noting that $XBB'X' = X\Psi X' = YY'$. ■

THEOREM 3.2.6. Let $X \sim N_{p,m}(0, \Sigma \otimes I_m)$ and $A\,(p \times p)$ be a constant symmetric positive semidefinite matrix of rank $r \ge m$ such that $A\Sigma A = A$. Then $X'AX \sim W_m(r, I_m)$.

Proof: Write $\Sigma = CC'$ where $C\,(p \times p)$ is a nonsingular matrix and $X = CY$. Then $Y \sim N_{p,m}(0, I_p \otimes I_m)$ and $X'AX = Y'(C'AC)Y$. Since $C'AC$ is an idempotent matrix because $A\Sigma A = A$, the result follows from Theorem 3.2.5. ■

3.3. PROPERTIES

3.3.1. Invariance and Decomposition of S

THEOREM 3.3.1. Let $S \sim W_p(n, \Sigma)$ and $A$ be any $p \times p$ nonsingular matrix. Then, $ASA' \sim W_p(n, A\Sigma A')$.

Proof: The result follows by making the transformation $V = ASA'$ with Jacobian $J(S \to V) = \det(A)^{-(p+1)}$ in the density of $S$ given by (3.2.1). ■

COROLLARY 3.3.1.1. Let $S \sim W_p(n, \Sigma)$ and $\Sigma^{-1} = A'A$, then $ASA' \sim W_p(n, I_p)$.

THEOREM 3.3.2. Let $S \sim W_p(n, I_p)$ and $H\,(p \times p)$ be an orthogonal matrix, whose elements are either constants or random variables distributed independently of $S$. Then, the distribution of $S$ is invariant under the transformation $S \to HSH'$ and is independent of $H$ in the latter case.

Proof: First, let $H$ be a constant matrix.
Then from Theorem 3.3.1, $HSH' \sim W_p(n, I_p)$. If, however, $H$ is a random orthogonal matrix, the conditional distribution of $HSH'$ given $H$ is $W_p(n, I_p)$. Since this distribution does not depend on $H$, $HSH' \sim W_p(n, I_p)$. ■
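As a numerical illustration of Theorem 3.3.2, the following sketch evaluates the logarithm of the density (3.2.1) and checks that, for $\Sigma = I_p$, it takes the same value at $S$ and at $HSH'$ for a rotation $H$. It uses the Python standard library only. The function names (`wishart_logpdf`, `log_multigamma`, and the small matrix helpers) are illustrative choices and not notation from the text, and the multivariate gamma function is computed from the product formula $\Gamma_p(a) = \pi^{p(p-1)/4}\prod_{i=1}^{p}\Gamma(a - \tfrac{1}{2}(i-1))$.

```python
import math

def log_multigamma(a, p):
    """Log of the multivariate gamma function Gamma_p(a)."""
    return (p * (p - 1) / 4) * math.log(math.pi) + sum(
        math.lgamma(a - 0.5 * i) for i in range(p))

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def cholesky(A):
    """Lower triangular L with A = L L' for symmetric positive definite A."""
    p = len(A)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def wishart_logpdf(S, n, Sigma):
    """Log of the W_p(n, Sigma) density (3.2.1) at a positive definite S."""
    p = len(S)
    Ls, Lg = cholesky(S), cholesky(Sigma)
    logdet_S = 2 * sum(math.log(Ls[i][i]) for i in range(p))
    logdet_Sigma = 2 * sum(math.log(Lg[i][i]) for i in range(p))
    # tr(Sigma^{-1} S) equals the squared Frobenius norm of Lg^{-1} Ls;
    # each column of Lg^{-1} Ls is obtained by forward substitution.
    tr = 0.0
    for c in range(p):
        y = [0.0] * p
        for i in range(p):
            y[i] = (Ls[i][c] - sum(Lg[i][k] * y[k] for k in range(i))) / Lg[i][i]
        tr += sum(v * v for v in y)
    return (-(n * p / 2) * math.log(2) - log_multigamma(n / 2, p)
            - (n / 2) * logdet_Sigma + ((n - p - 1) / 2) * logdet_S - tr / 2)

# Usage: the density at S and at HSH' agrees when Sigma = I_2.
I2 = [[1.0, 0.0], [0.0, 1.0]]
S = [[3.0, 1.0], [1.0, 2.0]]
t = 0.7  # arbitrary rotation angle
H = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
HSH = matmul(matmul(H, S), transpose(H))
print(wishart_logpdf(S, 5, I2), wishart_logpdf(HSH, 5, I2))
```

The invariance holds here because both $\det(S)$ and $\operatorname{tr}(S)$ are unchanged by $S \to HSH'$; with a general $\Sigma$ the two values would differ. For $p = 1$ the function reduces to the log of the chi-square density (3.1.1).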
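THEOREM 3.3.3. Let $S \sim W_p(n, \Sigma)$, and $n$ be an integer, then $S = XX'$, where $X \sim N_{p,n}(0, \Sigma \otimes I_n)$.

Proof: Let $V = ASA'$, where $\Sigma^{-1} = A'A$. Then, according to Corollary 3.3.1.1, $V \sim W_p(n, I_p)$. Define an independent random matrix $L\,(p \times n)$ such that $LL' = I_p$, with the density $c^{-1}g_{n,p}(L)$, where $c = 2^p\pi^{\frac{1}{2}np}/\Gamma_p(\tfrac{1}{2}n)$ and $g_{n,p}(L)$ is given in (1.3.26). Then, the joint density of $L$ and $V$ is
\[
c^{-1}\{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\}^{-1} \operatorname{etr}(-\tfrac{1}{2}V) \det(V)^{\frac{1}{2}(n-p-1)} g_{n,p}(L).
\]
Now, using the transformations (i) $V = TT'$, where $T = (t_{ij})$ is a lower triangular matrix with $t_{ii} > 0$, and (ii) $TL = Y$, with the Jacobians $J(V \to T) = 2^p \prod_{i=1}^{p} t_{ii}^{\,p-i+1}$ and $J(T, L \to Y) = \{g_{n,p}(L) \prod_{i=1}^{p} t_{ii}^{\,n-i}\}^{-1}$ given in (1.3.14) and (1.3.25) respectively, we get the density of $Y$, after some simplification, as
\[
(2\pi)^{-\frac{1}{2}np} \operatorname{etr}(-\tfrac{1}{2}YY'), \quad Y \in \mathbb{R}^{p \times n}.
\]
Hence, $Y \sim N_{p,n}(0, I_p \otimes I_n)$, and $V = YY'$. It follows that $S = A^{-1}YY'(A^{-1})' = XX'$ (say), where $X \sim N_{p,n}(0, A^{-1}(A^{-1})' \otimes I_n)$. This completes the proof since $\Sigma = A^{-1}(A^{-1})'$. ■

The following result is of importance in multivariate analysis and is known as Bartlett's decomposition, Bartlett (1933).

THEOREM 3.3.4. Let $S \sim W_p(n, I_p)$ and $S = TT'$, where $T = (t_{ij})$ is a lower triangular matrix with $t_{ii} > 0$. Then, $t_{ij}$, $1 \le j \le i \le p$, are independently distributed, $t_{ii}^2 \sim \chi^2_{n-i+1}$, $1 \le i \le p$, and $t_{ij} \sim N(0, 1)$, $1 \le j < i \le p$.

Proof: The density of $S$ is
\[
\{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\}^{-1} \det(S)^{\frac{1}{2}(n-p-1)} \operatorname{etr}(-\tfrac{1}{2}S). \tag{3.3.1}
\]
Making the transformation $S = TT'$, with Jacobian $J(S \to T) = 2^p \prod_{i=1}^{p} t_{ii}^{\,p-i+1}$, in (3.3.1), we get the joint density of $t_{11}, t_{21}, \ldots, t_{p1}, t_{p2}, \ldots, t_{pp}$ as
\[
\begin{aligned}
&2^p \{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\}^{-1} \prod_{i=1}^{p} t_{ii}^{\,n-i} \exp\Big(-\tfrac{1}{2}\sum_{1 \le j \le i \le p} t_{ij}^2\Big) \\
&\quad = \prod_{1 \le j < i \le p} \Big\{ (2\pi)^{-\frac{1}{2}} \exp(-\tfrac{1}{2}t_{ij}^2) \Big\} \prod_{i=1}^{p} \Big\{ \frac{2\,(t_{ii}^2)^{\frac{1}{2}(n-i)} \exp(-\tfrac{1}{2}t_{ii}^2)}{2^{\frac{1}{2}(n-i+1)}\,\Gamma[\tfrac{1}{2}(n-i+1)]} \Big\}, \\
&\qquad t_{ii} > 0,\ 1 \le i \le p,\quad -\infty < t_{ij} < \infty,\ 1 \le j < i \le p.
\end{aligned} \tag{3.3.2}
\]
From (3.3.2), it is easily seen that $t_{ij}$, $1 \le j \le i \le p$, are independently distributed and $t_{ij} \sim N(0, 1)$, $1 \le j < i \le p$. By substituting $y_{ii} = t_{ii}^2$, one can show that $t_{ii}^2 \sim \chi^2_{n-i+1}$, $1 \le i \le p$. ■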
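Bartlett's decomposition also gives a direct way to simulate $S \sim W_p(n, I_p)$ without generating a $p \times n$ normal matrix. The sketch below, which uses only the Python standard library, fills a lower triangular $T$ as in Theorem 3.3.4 and forms $S = TT'$; a Monte Carlo average is then compared with $E(S) = nI_p$. The function names, the seed, and the number of replications are illustrative assumptions, and the chi-square variables are drawn as gamma variables with shape $\tfrac{1}{2}(n-i+1)$ and scale 2.

```python
import math
import random

def bartlett_factor(n, p, rng):
    """Lower triangular T with t_ii^2 ~ chi^2_{n-i+1} and t_ij ~ N(0,1), j < i."""
    T = [[0.0] * p for _ in range(p)]
    for i in range(p):  # zero-based row i is row i+1 of the theorem
        # chi^2 with n-i degrees of freedom is Gamma(shape=(n-i)/2, scale=2)
        T[i][i] = math.sqrt(rng.gammavariate((n - i) / 2.0, 2.0))
        for j in range(i):
            T[i][j] = rng.gauss(0.0, 1.0)
    return T

def wishart_identity(n, p, rng):
    """One draw of S = T T', distributed as W_p(n, I_p)."""
    T = bartlett_factor(n, p, rng)
    return [[sum(T[i][k] * T[j][k] for k in range(p)) for j in range(p)]
            for i in range(p)]

def mean_wishart(n, p, reps, seed=0):
    """Monte Carlo estimate of E(S), which should be close to n * I_p."""
    rng = random.Random(seed)
    acc = [[0.0] * p for _ in range(p)]
    for _ in range(reps):
        S = wishart_identity(n, p, rng)
        for i in range(p):
            for j in range(p):
                acc[i][j] += S[i][j] / reps
    return acc

if __name__ == "__main__":
    for row in mean_wishart(n=6, p=3, reps=20000, seed=1):
        print([round(v, 2) for v in row])
```

To simulate $W_p(n, \Sigma)$ one could, by Theorem 3.3.1, replace $T$ with $CT$ where $\Sigma = CC'$.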
A similar result can also be proved for an upper triangular factorization of $S$, as given in the next theorem.

THEOREM 3.3.5. Let $S \sim W_p(n, I_p)$, and $S = TT'$, where $T = (t_{ij})$ is an upper triangular matrix with $t_{ii} > 0$. Then $t_{ij}$, $1 \le i \le j \le p$, are independently distributed, $t_{ii}^2 \sim \chi^2_{n-p+i}$, $1 \le i \le p$, and $t_{ij} \sim N(0, 1)$, $1 \le i < j \le p$.

Proof: Similar to the proof of Theorem 3.3.4. ■

3.3.2. Distribution of Sample Covariance Matrix

THEOREM 3.3.6. Let $x_1, \ldots, x_N$ be independent $N_p(\mu, \Sigma)$, $\Sigma > 0$, $N > p$. Define $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$ and $S = \sum_{i=1}^{N}(x_i - \bar{x})(x_i - \bar{x})'$. Then, (i) $\bar{x}$ and $S$ are independently distributed, (ii) $\bar{x} \sim N_p(\mu, \frac{1}{N}\Sigma)$, and (iii) $S \sim W_p(n, \Sigma)$, where $n = N - 1$.

Proof: Let $X\,(p \times N) = (x_1, \ldots, x_N)$, then $X \sim N_{p,N}(\mu e', \Sigma \otimes I_N)$, where $e'\,(1 \times N) = (1, \ldots, 1)$. The density of $X$ is
\[
(2\pi)^{-\frac{1}{2}Np} \det(\Sigma)^{-\frac{1}{2}N} \operatorname{etr}\{-\tfrac{1}{2}\Sigma^{-1}(X - \mu e')(X - \mu e')'\}. \tag{3.3.3}
\]
Now, transform $X_1 = XH$, where $H\,(N \times N)$ is an orthogonal matrix, $H = (N^{-\frac{1}{2}}e \ \ B)$, obtaining
\[
X_1 = (\sqrt{N}\,\bar{x} \ \ XB), \qquad XX' = X_1X_1' = N\bar{x}\bar{x}' + YY',
\]
where $Y\,(p \times (N-1)) = XB$. Further,
\[
(X - \mu e')(X - \mu e')' = XX' - \mu e'X' - Xe\mu' + \mu e'e\mu', \tag{3.3.4}
\]
\[
\mu e'X' = \mu e'HX_1' = N\mu\bar{x}' \tag{3.3.5}
\]
and
\[
\mu e'e\mu' = N\mu\mu'. \tag{3.3.6}
\]
Hence, (3.3.4) can be written as
\[
\begin{aligned}
(X - \mu e')(X - \mu e')' &= N\bar{x}\bar{x}' + YY' - N\mu\bar{x}' - N\bar{x}\mu' + N\mu\mu' \\
&= N(\bar{x} - \mu)(\bar{x} - \mu)' + YY'.
\end{aligned} \tag{3.3.7}
\]
Now, substituting from (3.3.5), (3.3.6), and (3.3.7) together with the Jacobian of transformation $J(X \to \sqrt{N}\,\bar{x}, Y) = 1$ in (3.3.3), we get the joint density of $\sqrt{N}\,\bar{x}$ and $Y$ as
\[
\begin{aligned}
f(\sqrt{N}\,\bar{x}, Y) ={}& (2\pi)^{-\frac{1}{2}p} \det(\Sigma)^{-\frac{1}{2}} \operatorname{etr}\{-\tfrac{N}{2}\Sigma^{-1}(\bar{x} - \mu)(\bar{x} - \mu)'\} \\
&\times (2\pi)^{-\frac{1}{2}(N-1)p} \det(\Sigma)^{-\frac{1}{2}(N-1)} \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1}YY').
\end{aligned} \tag{3.3.8}
\]
From (3.3.8), it is evident that $\bar{x}$ and $Y$ are independent, $\bar{x} \sim N_p(\mu, \frac{1}{N}\Sigma)$ and $Y \sim N_{p,n}(0, \Sigma \otimes I_n)$. Hence, $YY' \sim W_p(n, \Sigma)$. Since $S = YY'$, which follows from the identity (3.3.7) by substituting $\bar{x}$ for $\mu$, the proof of the theorem is complete. ■

In the above theorem it has been proved that the sample covariance matrix $S$, while sampling from a multivariate normal population, has a Wishart distribution. In this case $\frac{1}{N}S$ is the maximum likelihood estimator (MLE) of $\Sigma$ under the assumption that $\Sigma$ is positive definite. The distribution of $S$ was first derived by Fisher (1915) and Wishart (1928) when $\Sigma$ is positive definite. Eben (1994) derived the distribution of $S$ when the inverse of the covariance matrix is a band matrix. Tsai (1995) obtained the MLE of $\Sigma$ under the assumption $\Sigma > I_p$ and has also derived its density.

It may also be noted that $(\bar{x}, S)$ form a complete sufficient set of statistics for $(\mu, \Sigma)$ and hence the Wishart matrix plays an important role in drawing inferences about the parameters of a multivariate normal distribution. There is a vast literature on this topic and the reader is referred to Roy (1957), Kshirsagar (1972), Eaton (1972), Giri (1977), Srivastava and Khatri (1979), Muirhead (1982), Anderson (1984), and Siotani, Hayakawa and Fujikoshi (1985). It may also be noted that Ghurye and Olkin (1969) have derived the minimum variance unbiased estimate of the Wishart density.

3.3.3. Characteristic Function and Additive Property of Wishart Matrices

THEOREM 3.3.7. Let $S \sim W_p(n, \Sigma)$, then the characteristic function of $S$, i.e., the joint characteristic function of $s_{11}, s_{12}, \ldots, s_{pp}$, is
\[
\phi_S(Z) = \det(I_p - 2\iota Z\Sigma)^{-\frac{1}{2}n}, \tag{3.3.9}
\]
where $Z = Z'\,(p \times p) = (\tfrac{1}{2}(1 + \delta_{ij})z_{ij})$ and $\delta_{ij}$ is the Kronecker delta.
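Proof: The characteristic function of $S$ is
\[
\begin{aligned}
\phi_S(Z) &= E[\operatorname{etr}(\iota ZS)] \\
&= \{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\det(\Sigma)^{\frac{1}{2}n}\}^{-1} \int_{S>0} \operatorname{etr}\{-\tfrac{1}{2}(I_p - 2\iota Z\Sigma)\Sigma^{-1}S\} \det(S)^{\frac{1}{2}(n-p-1)}\, dS \\
&= \{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\det(\Sigma)^{\frac{1}{2}n}\}^{-1} \det\big(\tfrac{1}{2}(I_p - 2\iota Z\Sigma)\Sigma^{-1}\big)^{-\frac{1}{2}n}\, \Gamma_p(\tfrac{1}{2}n).
\end{aligned} \tag{3.3.10}
\]
The above equality is obtained by using (1.4.6). Now, simplifying (3.3.10) we get the desired result. ■

The above result can also be derived by assuming $n$ to be an integer and using the decomposition given in Theorem 3.3.3, e.g., see Anderson (1984).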
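THEOREM 3.3.8. Let $S_1, \ldots, S_k$ be independently distributed with $S_j \sim W_p(n_j, \Sigma)$, $j = 1, \ldots, k$. Then, $\sum_{j=1}^{k} S_j \sim W_p(\sum_{j=1}^{k} n_j, \Sigma)$.

Proof: The characteristic function of $\sum_{j=1}^{k} S_j$ is
\[
\begin{aligned}
E\Big[\operatorname{etr}\Big\{\iota\Big(\sum_{j=1}^{k} S_j\Big)Z\Big\}\Big] &= \prod_{j=1}^{k} E[\operatorname{etr}(\iota S_jZ)] \\
&= \prod_{j=1}^{k} \det(I_p - 2\iota Z\Sigma)^{-\frac{1}{2}n_j} \\
&= \det(I_p - 2\iota Z\Sigma)^{-\frac{1}{2}\sum_{j=1}^{k} n_j}. \qquad ■
\end{aligned}
\]

In the above theorem, when the covariance matrices are not equal, the distribution of $\sum_{j=1}^{k} S_j$ is not Wishart. For $k = 2$, the density involves the ${}_1F_1$ function and is given in Problem 3.5. Further, let $S_j \sim W_p(n_j, \Sigma_j)$, $j = 1, \ldots, k+r$, and define $Q_1 = \sum_{j=1}^{k} \lambda_jS_j$ and $Q_2 = \sum_{j=k+1}^{k+r} \lambda_jS_j$, where the $\lambda_j$'s are positive constants. Then the asymptotic distributions of $-\ln\{\det(Q_1)\}$, $-\ln\{\det(Q_1Q_2^{-1})\}$ and $-\ln\{\det(Q_1(Q_1 + Q_2)^{-1})\}$ have been derived by Gupta, Chattopadhyay, and Krishnaiah (1975).

3.3.4. Marginal and Conditional Distributions

THEOREM 3.3.9. Let $S \sim W_p(n, \Sigma)$ and partition $S$ and $\Sigma$ as
\[
S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},
\]
where $S_{11}$ and $\Sigma_{11}$ are $q \times q$, and $S_{22}$ and $\Sigma_{22}$ are $(p-q) \times (p-q)$. Let $S_{11\cdot 2} = S_{11} - S_{12}S_{22}^{-1}S_{21}$, $\Sigma_{11\cdot 2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$, then
(i) $S_{22} \sim W_{p-q}(n, \Sigma_{22})$,
(ii) $S_{11\cdot 2} \sim W_q(n - p + q, \Sigma_{11\cdot 2})$,
(iii) $S_{11\cdot 2}$ and $(S_{12}, S_{22})$ are independent,
(iv) $S_{12} \mid S_{22} \sim N_{q,p-q}(\Sigma_{12}\Sigma_{22}^{-1}S_{22},\ \Sigma_{11\cdot 2} \otimes S_{22})$.

Proof: Let
\[
\Sigma^{-1} = \begin{pmatrix} \Sigma^{11} & \Sigma^{12} \\ \Sigma^{21} & \Sigma^{22} \end{pmatrix}, \qquad \Sigma^{11}\,(q \times q).
\]
Then $\Sigma^{11} = \Sigma_{11\cdot 2}^{-1}$, $\Sigma^{22} = \Sigma_{22\cdot 1}^{-1}$, $\Sigma^{12} = -\Sigma_{11\cdot 2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}$ and $\Sigma^{21} = -\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11\cdot 2}^{-1}$. Also, note that
\[
\begin{aligned}
\operatorname{tr}(\Sigma^{-1}S) ={}& \operatorname{tr}(\Sigma^{11}S_{11} + \Sigma^{12}S_{21}) + \operatorname{tr}(\Sigma^{21}S_{12} + \Sigma^{22}S_{22}) \\
={}& \operatorname{tr}[\Sigma^{11}(S_{11} - S_{12}S_{22}^{-1}S_{21} + S_{12}S_{22}^{-1}S_{21})] + \operatorname{tr}(\Sigma^{12}S_{21}) + \operatorname{tr}(\Sigma^{21}S_{12}) \\
&+ \operatorname{tr}[(\Sigma^{22} - \Sigma^{21}(\Sigma^{11})^{-1}\Sigma^{12} + \Sigma^{21}(\Sigma^{11})^{-1}\Sigma^{12})S_{22}] \\
={}& \operatorname{tr}(\Sigma^{11}S_{11\cdot 2}) + \operatorname{tr}(\Sigma_{22}^{-1}S_{22}) + \operatorname{tr}(\Sigma^{11}S_{12}S_{22}^{-1}S_{21}) \\
&+ \operatorname{tr}(\Sigma^{12}S_{21}) + \operatorname{tr}(\Sigma^{21}S_{12}) + \operatorname{tr}[\Sigma^{21}(\Sigma^{11})^{-1}\Sigma^{12}S_{22}]
\end{aligned}
\]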
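\[
\begin{aligned}
={}& \operatorname{tr}(\Sigma^{11}S_{11\cdot 2}) + \operatorname{tr}(\Sigma_{22}^{-1}S_{22}) \\
&+ \operatorname{tr}[\Sigma^{11}(S_{12} + (\Sigma^{11})^{-1}\Sigma^{12}S_{22})S_{22}^{-1}(S_{12} + (\Sigma^{11})^{-1}\Sigma^{12}S_{22})'] \\
={}& \operatorname{tr}(\Sigma_{11\cdot 2}^{-1}S_{11\cdot 2}) + \operatorname{tr}(\Sigma_{22}^{-1}S_{22}) \\
&+ \operatorname{tr}[\Sigma_{11\cdot 2}^{-1}(S_{12} - \Sigma_{12}\Sigma_{22}^{-1}S_{22})S_{22}^{-1}(S_{12} - \Sigma_{12}\Sigma_{22}^{-1}S_{22})']
\end{aligned}
\]
and
\[
\det(S) = \det(S_{11\cdot 2})\det(S_{22}), \qquad \det(\Sigma) = \det(\Sigma_{11\cdot 2})\det(\Sigma_{22}).
\]
Now, transforming $S_{11\cdot 2} = S_{11} - S_{12}S_{22}^{-1}S_{21}$ with Jacobian $J(S_{11} \to S_{11\cdot 2}) = 1$, the joint density of $S_{12}$, $S_{22}$, and $S_{11\cdot 2}$, obtained from the density of $S$, can be written as
\[
f(S_{12}, S_{22}, S_{11\cdot 2}) = f_1(S_{11\cdot 2})\, f_2(S_{12}, S_{22}), \tag{3.3.11}
\]
where
\[
\begin{aligned}
f_1(S_{11\cdot 2}) ={}& \{2^{\frac{1}{2}(n-p+q)q}\,\Gamma_q[\tfrac{1}{2}(n-p+q)]\det(\Sigma_{11\cdot 2})^{\frac{1}{2}(n-p+q)}\}^{-1} \\
&\times \det(S_{11\cdot 2})^{\frac{1}{2}(n-p+q)-\frac{1}{2}(q+1)} \operatorname{etr}(-\tfrac{1}{2}\Sigma_{11\cdot 2}^{-1}S_{11\cdot 2})
\end{aligned} \tag{3.3.12}
\]
and
\[
\begin{aligned}
f_2(S_{12}, S_{22}) ={}& \{2^{\frac{1}{2}n(p-q)}\det(\Sigma_{22})^{\frac{1}{2}n}\,\Gamma_{p-q}(\tfrac{1}{2}n)\}^{-1} \det(S_{22})^{\frac{1}{2}(n-p+q-1)} \operatorname{etr}(-\tfrac{1}{2}\Sigma_{22}^{-1}S_{22}) \\
&\times (2\pi)^{-\frac{1}{2}q(p-q)} \det(\Sigma_{11\cdot 2})^{-\frac{1}{2}(p-q)} \det(S_{22})^{-\frac{1}{2}q} \\
&\times \operatorname{etr}\{-\tfrac{1}{2}\Sigma_{11\cdot 2}^{-1}(S_{12} - \Sigma_{12}\Sigma_{22}^{-1}S_{22})S_{22}^{-1}(S_{12} - \Sigma_{12}\Sigma_{22}^{-1}S_{22})'\}.
\end{aligned} \tag{3.3.13}
\]
From (3.3.11), it follows that $S_{11\cdot 2}$ and $(S_{12}, S_{22})$ are independent, and from (3.3.12), we have $S_{11\cdot 2} \sim W_q(n - p + q, \Sigma_{11\cdot 2})$. Further, from equation (3.3.13), results (i) and (iv) are easily established. ■

The density of $S$ has been used to prove the above theorem. However, an alternative proof assuming $n$ to be an integer can also be given using the decomposition given in Theorem 3.3.3, e.g., see Srivastava and Khatri (1979).

THEOREM 3.3.10. Let $S = (S_{ij})$ and $\Sigma = (\Sigma_{ij})$, where $S_{ij}\,(p_i \times p_j)$ and $\Sigma_{ij}\,(p_i \times p_j)$, $i, j = 1, \ldots, k$, $p_1 + \cdots + p_k = p$. If $S \sim W_p(n, \Sigma)$, then $S_{ii} \sim W_{p_i}(n, \Sigma_{ii})$, $i = 1, \ldots, k$. Moreover, if $\Sigma_{ij} = 0$, $i \ne j$, then they are independent.

Proof: Assuming $n$ is an integer, from Theorem 3.3.3, $S = XX'$ where $X \sim N_{p,n}(0, \Sigma \otimes I_n)$. Partition $X$ as
\[
X = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_k \end{pmatrix}
\]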
where $X_i\,(p_i \times n) \sim N_{p_i,n}(0, \Sigma_{ii} \otimes I_n)$ (see Theorem 2.3.12). Now,
\[
S = XX' = \begin{pmatrix} X_1X_1' & X_1X_2' & \cdots & X_1X_k' \\ \vdots & \vdots & & \vdots \\ X_kX_1' & X_kX_2' & \cdots & X_kX_k' \end{pmatrix}
\]
and $S_{ii} = X_iX_i' \sim W_{p_i}(n, \Sigma_{ii})$, $i = 1, \ldots, k$. Further, if $\Sigma_{ij} = 0$, $i \ne j$, then the $X_i$'s are independent and hence the $S_{ii}$'s are independent. ■

COROLLARY 3.3.10.1. Let $S = (s_{ij}) \sim W_p(n, I_p)$. Then, $s_{ii} \sim \chi^2_n$, $i = 1, \ldots, p$, and they are independent.

Proof: Take $p_i = 1$, $i = 1, \ldots, p$, in the above theorem. ■

3.3.5. Distribution of $ASA'$ and $(AS^{-1}A')^{-1}$

THEOREM 3.3.11. Let $S \sim W_p(n, \Sigma)$. Then, for $A\,(q \times p)$ with $\operatorname{rank}(A) = q \le p$, $ASA' \sim W_q(n, A\Sigma A')$.

Proof: The characteristic function of $ASA'$ is
\[
\begin{aligned}
\phi_{ASA'}(Z) &= E[\operatorname{etr}(\iota ASA'Z)], \quad Z\,(q \times q) = Z' \\
&= E[\operatorname{etr}(\iota A'ZAS)] \\
&= \det(I_p - 2\iota A'ZA\Sigma)^{-\frac{1}{2}n} \\
&= \det(I_q - 2\iota ZA\Sigma A')^{-\frac{1}{2}n}.
\end{aligned}
\]
The proof is complete by observing that $\det(I_q - 2\iota ZA\Sigma A')^{-\frac{1}{2}n}$ is the characteristic function of a random matrix distributed as $W_q(n, A\Sigma A')$. ■

COROLLARY 3.3.11.1. Let $S \sim W_p(n, \Sigma)$. Then, $\dfrac{a'Sa}{a'\Sigma a} \sim \chi^2_n$, where $a\,(p \times 1) \ne 0$.

Proof: In Theorem 3.3.11, substitute $q = 1$. ■

Mitra (1969) has given a counterexample to show that the converse of Corollary 3.3.11.1 is not true in general. However, if $\dfrac{a'Sa}{a'\Sigma a} \sim \chi^2_n$ for all nonnull $a \in \mathbb{R}^p$ and $S$ can be written as $S = YAY'$, where the column vectors of $Y\,(p \times n)$ are independently distributed as normal and $A\,(n \times n)$ is a symmetric nonrandom matrix of full rank, then $S \sim W_p(n, \Sigma)$, see Rao (1973), p. 535.

THEOREM 3.3.12. Let $S \sim W_p(n, \Sigma)$ and $y\,(p \times 1)$ be a random vector distributed independently of $S$, and $P(y \ne 0) = 1$. Then, $\dfrac{y'Sy}{y'\Sigma y} \sim \chi^2_n$ and is independent of $y$.

Proof: In Theorem 3.3.11, take $q = 1$ and $A = y'$. Then, the conditional distribution of $\dfrac{y'Sy}{y'\Sigma y}$ given $y$ is $\chi^2_n$, which is also the unconditional distribution. ■
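COROLLARY 3.3.12.1. Let $x_1, \ldots, x_N$ be a random sample from $N_p(\mu, \Sigma)$, $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$ and $S = \sum_{i=1}^{N}(x_i - \bar{x})(x_i - \bar{x})'$. Then, $\dfrac{\bar{x}'S\bar{x}}{\bar{x}'\Sigma\bar{x}} \sim \chi^2_n$, $n = N - 1$, and is independent of $\bar{x}$.

Proof: From Theorem 3.3.6, $S \sim W_p(n, \Sigma)$ and is independent of $\bar{x}$. The result follows from the above theorem. ■

THEOREM 3.3.13. Let $S \sim W_p(n, \Sigma)$ and $A\,(q \times p)$ be a matrix of rank $q \le p$. Then, $(AS^{-1}A')^{-1} \sim W_q(n - p + q, (A\Sigma^{-1}A')^{-1})$.

Proof: Let $B = A\Sigma^{-\frac{1}{2}}$ and $\Lambda = \Sigma^{-\frac{1}{2}}S\Sigma^{-\frac{1}{2}}$, where $\Sigma^{\frac{1}{2}}$ is the symmetric positive definite square root of $\Sigma$, then $\Lambda \sim W_p(n, I_p)$ and
\[
(AS^{-1}A')^{-1} = (A\Sigma^{-\frac{1}{2}}\Sigma^{\frac{1}{2}}S^{-1}\Sigma^{\frac{1}{2}}\Sigma^{-\frac{1}{2}}A')^{-1} = (B\Lambda^{-1}B')^{-1}.
\]
So, we need to prove that $(B\Lambda^{-1}B')^{-1} \sim W_q(n - p + q, (BB')^{-1})$, since $BB' = A\Sigma^{-1}A'$. Let $B = C\,(I_q \ \ 0)\,H$, where $C\,(q \times q)$ is of full rank and $HH' = H'H = I_p$. Now,
\[
\begin{aligned}
(B\Lambda^{-1}B')^{-1} &= \Big\{ C\,(I_q \ \ 0)\,H\Lambda^{-1}H' \begin{pmatrix} I_q \\ 0 \end{pmatrix} C' \Big\}^{-1} \\
&= (C^{-1})' \Big\{ (I_q \ \ 0)\,V^{-1} \begin{pmatrix} I_q \\ 0 \end{pmatrix} \Big\}^{-1} C^{-1} \\
&= (C^{-1})'(V^{11})^{-1}C^{-1},
\end{aligned}
\]
where $V = H\Lambda H' \sim W_p(n, I_p)$ and $V^{11}\,(q \times q) = (V_{11} - V_{12}V_{22}^{-1}V_{21})^{-1} = V_{11\cdot 2}^{-1}$, with
\[
V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}, \qquad V_{11}\,(q \times q).
\]
From Theorem 3.3.9, $V_{11\cdot 2} \sim W_q(n - p + q, I_q)$. Hence $(C^{-1})'(V^{11})^{-1}C^{-1} \sim W_q(n - p + q, (CC')^{-1})$. But $CC' = BB' = A\Sigma^{-1}A'$. This completes the proof. ■

COROLLARY 3.3.13.1. Let $S \sim W_p(n, \Sigma)$ and $a \in \mathbb{R}^p$, $a \ne 0$. Then $\dfrac{a'\Sigma^{-1}a}{a'S^{-1}a} \sim \chi^2_{n-p+1}$.

Proof: Take $q = 1$ in the above theorem. ■

THEOREM 3.3.14. Let $S \sim W_p(n, \Sigma)$ and $y\,(p \times 1)$ be a random vector distributed independently of $S$, and $P(y \ne 0) = 1$. Then, $\dfrac{y'\Sigma^{-1}y}{y'S^{-1}y} \sim \chi^2_{n-p+1}$ and is independent of $y$.

Proof: In Theorem 3.3.13 take $q = 1$ and $A = y'$. Then, the conditional distribution of $\dfrac{y'\Sigma^{-1}y}{y'S^{-1}y}$ given $y$ is $\chi^2_{n-p+1}$, which is also the unconditional distribution. ■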
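The two quadratic-form results, Corollaries 3.3.11.1 and 3.3.13.1, differ only in their degrees of freedom ($n$ versus $n - p + 1$), and this difference is easy to see in simulation. The sketch below is a minimal Monte Carlo check for $p = 2$ using the Python standard library; $S$ is generated as a sum of $n$ outer products of $N_2(0, \Sigma)$ vectors (Theorem 3.3.3), and the sample means of the two ratios are compared with $n$ and $n - 1$. The function names, the choice of $\Sigma$ and $a$, the seed, and the number of replications are all illustrative assumptions.

```python
import math
import random

def sample_wishart_2x2(n, Sigma, rng):
    """S = sum of n outer products x x' with x ~ N_2(0, Sigma)."""
    # Cholesky factor of the 2x2 matrix Sigma
    c11 = math.sqrt(Sigma[0][0])
    c21 = Sigma[1][0] / c11
    c22 = math.sqrt(Sigma[1][1] - c21 * c21)
    s11 = s12 = s22 = 0.0
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1, x2 = c11 * z1, c21 * z1 + c22 * z2
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
    return [[s11, s12], [s12, s22]]

def quad_form(a, M):
    """a' M a for a symmetric 2x2 matrix M."""
    return M[0][0] * a[0] ** 2 + 2 * M[0][1] * a[0] * a[1] + M[1][1] * a[1] ** 2

def quad_form_inv(a, M):
    """a' M^{-1} a for a symmetric nonsingular 2x2 matrix M."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return (M[1][1] * a[0] ** 2 - 2 * M[0][1] * a[0] * a[1]
            + M[0][0] * a[1] ** 2) / det

def simulate_ratio_means(n, Sigma, a, reps, seed=0):
    """Sample means of a'Sa/a'Sigma a and a'Sigma^{-1}a/a'S^{-1}a."""
    rng = random.Random(seed)
    direct = inverse = 0.0
    for _ in range(reps):
        S = sample_wishart_2x2(n, Sigma, rng)
        direct += quad_form(a, S) / quad_form(a, Sigma) / reps
        inverse += quad_form_inv(a, Sigma) / quad_form_inv(a, S) / reps
    return direct, inverse

if __name__ == "__main__":
    Sigma = [[2.0, 0.6], [0.6, 1.0]]
    m_direct, m_inverse = simulate_ratio_means(7, Sigma, (1.0, -2.0), 20000, seed=3)
    # Expected values are n = 7 and n - p + 1 = 6 respectively.
    print(round(m_direct, 2), round(m_inverse, 2))
```

Only the means are compared here; a fuller check would compare the empirical distributions with the corresponding chi-square laws.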
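COROLLARY 3.3.14.1. Let $\bar{x}$ and $S$ be defined as in Theorem 3.3.6, then $\dfrac{\bar{x}'\Sigma^{-1}\bar{x}}{\bar{x}'S^{-1}\bar{x}} \sim \chi^2_{n-p+1}$ and is independent of $\bar{x}$.

Proof: From Theorem 3.3.6, $S \sim W_p(n, \Sigma)$ and is independent of $\bar{x}$. The result follows from the above theorem. ■

It may be noted that the distribution of Hotelling's $T^2$ can be derived from the above corollary as in Muirhead (1982), p. 98.

3.3.6. Expected Values

In this section, we give expected values of the elements of $S$ and some of its scalar and matrix valued functions.

THEOREM 3.3.15. Let $S = (s_{ij}) \sim W_p(n, \Sigma)$, then
(i) $E(s_{ij}) = n\sigma_{ij}$, $\operatorname{cov}(s_{ij}, s_{k\ell}) = n(\sigma_{ik}\sigma_{j\ell} + \sigma_{i\ell}\sigma_{jk})$,
(ii) $E(SAS) = n\Sigma A'\Sigma + n\operatorname{tr}(\Sigma A)\Sigma + n^2\Sigma A\Sigma$,
(iii) $E(\operatorname{tr}(AS)S) = n\Sigma A\Sigma + n\Sigma A'\Sigma + n^2\operatorname{tr}(A\Sigma)\Sigma$,
(iv) $E(\operatorname{tr}(AS)\operatorname{tr}(BS)) = n\operatorname{tr}(A\Sigma B\Sigma) + n\operatorname{tr}(A'\Sigma B\Sigma) + n^2\operatorname{tr}(A\Sigma)\operatorname{tr}(B\Sigma)$,
where $\Sigma = (\sigma_{ij})$ and $A\,(p \times p)$ and $B\,(p \times p)$ are constant matrices.

Proof: (i) Assuming $n$ to be an integer and using Theorem 3.3.3, we can write $S = YY'$, $s_{ij} = \sum_{r=1}^{n} y_{ir}y_{jr}$, where $Y\,(p \times n) = (y_{ij}) \sim N_{p,n}(0, \Sigma \otimes I_n)$. Hence, using Corollary 2.3.3.1,
\[
E(s_{ij}) = \sum_{r=1}^{n} E(y_{ir}y_{jr}) = \sum_{r=1}^{n} \sigma_{ij} = n\sigma_{ij} \tag{3.3.14}
\]
and
\[
\begin{aligned}
E(s_{ij}s_{k\ell}) &= \sum_{r=1}^{n}\sum_{t=1}^{n} E(y_{ir}y_{jr}y_{kt}y_{\ell t}) \\
&= \sum_{r=1}^{n} E(y_{ir}y_{jr}y_{kr}y_{\ell r}) + \sum_{r=1}^{n}\sum_{\substack{t=1 \\ t \ne r}}^{n} E(y_{ir}y_{jr}y_{kt}y_{\ell t}) \\
&= \sum_{r=1}^{n} (\sigma_{ik}\sigma_{j\ell} + \sigma_{i\ell}\sigma_{jk} + \sigma_{ij}\sigma_{k\ell}) + \sum_{r=1}^{n}\sum_{\substack{t=1 \\ t \ne r}}^{n} \sigma_{ij}\sigma_{k\ell} \\
&= n(\sigma_{ik}\sigma_{j\ell} + \sigma_{i\ell}\sigma_{jk} + \sigma_{ij}\sigma_{k\ell}) + n(n-1)\sigma_{ij}\sigma_{k\ell}.
\end{aligned} \tag{3.3.15}
\]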
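From (3.3.14) and (3.3.15), we get
\[
\operatorname{cov}(s_{ij}, s_{k\ell}) = n(\sigma_{ik}\sigma_{j\ell} + \sigma_{i\ell}\sigma_{jk}).
\]
(ii) The $(i, j)$th element of $SAS$ is $\sum_{t=1}^{p}\sum_{k=1}^{p} s_{ik}a_{kt}s_{tj}$ and hence,
\[
\begin{aligned}
E(SAS) &= E\Big(\Big(\sum_{t=1}^{p}\sum_{k=1}^{p} s_{ik}a_{kt}s_{tj}\Big)\Big) \\
&= \Big(\sum_{t=1}^{p}\sum_{k=1}^{p} a_{kt}E(s_{ik}s_{tj})\Big) \\
&= \Big(n\sum_{t=1}^{p}\sum_{k=1}^{p} a_{kt}(\sigma_{it}\sigma_{jk} + \sigma_{ij}\sigma_{kt} + n\sigma_{ik}\sigma_{tj})\Big) \\
&= n\Sigma A'\Sigma + n\operatorname{tr}(A\Sigma)\Sigma + n^2\Sigma A\Sigma.
\end{aligned}
\]
The proofs of the other two expected values are similar. ■

By differentiating the moment generating function of $S \sim W_p(n, \Sigma)$, de Waal and D. G. Nel (1973) have derived the following results:
(i) $E(S^2) = n\{(n+1)\Sigma + (\operatorname{tr}\Sigma)I_p\}\Sigma$,
(ii) $E(S^3) = n\{(n^2 + 3n + 4)\Sigma^2 + 2(n+1)(\operatorname{tr}\Sigma)\Sigma + (n+1)(\operatorname{tr}\Sigma^2)I_p + (\operatorname{tr}\Sigma)^2I_p\}\Sigma$, and
(iii) $E(S^4) = n\{(n^3 + 6n^2 + 21n + 20)\Sigma^3 + (3n^2 + 10n + 12)(\operatorname{tr}\Sigma)\Sigma^2 + (2n^2 + 5n + 5)(\operatorname{tr}\Sigma^2)\Sigma + 3(n+1)(\operatorname{tr}\Sigma)^2\Sigma + (n^2 + 2n + 4)(\operatorname{tr}\Sigma^3)I_p + 3(n+1)(\operatorname{tr}\Sigma)(\operatorname{tr}\Sigma^2)I_p + (\operatorname{tr}\Sigma)^3I_p\}\Sigma$.

The result (i) can also be obtained from Theorem 3.3.15(ii) by substituting $A = I_p$. For an alternative proof of Theorem 3.3.15(ii), see Styan (1979). Haff (1979), using an identity involving the Wishart matrix and assuming $A$ is positive semidefinite, has also obtained an expression for $E(SAS)$. Wishart (1928) derived the central moments up to fourth order of the elements of $S$. Haff has also derived expected values similar to the above theorem for $S^{-1}$ as given in the next theorem.

THEOREM 3.3.16. Let $S \sim W_p(n, \Sigma)$, then
(i) $E(s^{ij}) = \dfrac{\sigma^{ij}}{n-p-1}$, $\quad n - p - 1 > 0$,
(ii) $\operatorname{cov}(s^{ij}, s^{k\ell}) = \dfrac{2(n-p-1)^{-1}\sigma^{ij}\sigma^{k\ell} + \sigma^{ik}\sigma^{j\ell} + \sigma^{i\ell}\sigma^{kj}}{(n-p)(n-p-1)(n-p-3)}$, $\quad n - p - 3 > 0$,
(iii) $E(S^{-1}AS^{-1}) = \dfrac{\operatorname{tr}(\Sigma^{-1}A)\Sigma^{-1}}{(n-p)(n-p-1)(n-p-3)} + \dfrac{\Sigma^{-1}A\Sigma^{-1}}{(n-p)(n-p-3)}$, $\quad n - p - 3 > 0$,
where $S^{-1} = (s^{ij})$, $\Sigma^{-1} = (\sigma^{ij})$ and $A\,(p \times p)$ is a constant positive semidefinite matrix.

The following expected values were derived by von Rosen (1988a).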
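THEOREM 3.3.17. Let $S \sim W_p(n, \Sigma)$, then
(i) $E(S^{-3}) = (c_3c_1 + c_3c_2 + c_4c_1 + 5c_4c_2)\Sigma^{-3} + (2c_3c_2 + c_4c_1 + c_4c_2)(\operatorname{tr}\Sigma^{-1})\Sigma^{-2} - (c_3c_2 + c_4c_2)(\operatorname{tr}\Sigma^{-2})\Sigma^{-1} - c_4c_2(\operatorname{tr}\Sigma^{-1})^2\Sigma^{-1}$, $\quad n - p - 5 > 0$,
(ii) $E((\operatorname{tr}S^{-1})S^{-1}) = c_1(\operatorname{tr}\Sigma^{-1})\Sigma^{-1} + 2c_2\Sigma^{-2}$, $\quad n - p - 3 > 0$, and
(iii) $E((\operatorname{tr}S^{-1})S) = \dfrac{n}{n-p-1}(\operatorname{tr}\Sigma^{-1})\Sigma - \dfrac{2}{n-p-1}I_p$, $\quad n - p - 1 > 0$,
where $c_1 = (n-p-2)c_2$, $c_2 = \{(n-p)(n-p-1)(n-p-3)\}^{-1}$, $c_3 = (n-p-3)\{(n-p-5)(n-p+1)\}^{-1}$ and $c_4 = 2\{(n-p-5)(n-p+1)\}^{-1}$.

Marx (1981) obtained the following expected values.

THEOREM 3.3.18. Let $S \sim W_p(n, \Sigma)$, then
(i) $E(S^{-1}AS^{-1}) = c_1\Sigma^{-1}A\Sigma^{-1} + c_2[\Sigma^{-1}A'\Sigma^{-1} + \operatorname{tr}(A\Sigma^{-1})\Sigma^{-1}]$,
(ii) $E(\operatorname{tr}(AS^{-1})S^{-1}) = c_1\operatorname{tr}(A\Sigma^{-1})\Sigma^{-1} + c_2[\Sigma^{-1}A'\Sigma^{-1} + \Sigma^{-1}A\Sigma^{-1}]$,
(iii) $E(\operatorname{tr}(AS^{-1})\operatorname{tr}(BS^{-1})) = c_1\operatorname{tr}(A\Sigma^{-1})\operatorname{tr}(B\Sigma^{-1}) + c_2[\operatorname{tr}(B\Sigma^{-1}A\Sigma^{-1}) + \operatorname{tr}(B\Sigma^{-1}A'\Sigma^{-1})]$,
where $c_1$ and $c_2$ are defined in Theorem 3.3.17 and $A\,(p \times p)$, $B\,(p \times p)$ are constant matrices.

Proof: (i) Let $S^{-1} = (s^{ij})$ and $A = (a_{ij})$. Then the $(i, j)$th element of $S^{-1}AS^{-1}$ is $\sum_{t=1}^{p}\sum_{k=1}^{p} s^{it}a_{tk}s^{kj}$ and
\[
\begin{aligned}
E(S^{-1}AS^{-1}) &= E\Big(\Big(\sum_{t=1}^{p}\sum_{k=1}^{p} s^{it}a_{tk}s^{kj}\Big)\Big) \\
&= \Big(\sum_{t=1}^{p}\sum_{k=1}^{p} a_{tk}E(s^{it}s^{kj})\Big) \\
&= \Big(\sum_{t=1}^{p}\sum_{k=1}^{p} a_{tk}\{\operatorname{cov}(s^{it}, s^{kj}) + E(s^{it})E(s^{kj})\}\Big).
\end{aligned}
\]
By substituting for $\operatorname{cov}(s^{it}, s^{kj})$ and $E(s^{ij})$ from Theorem 3.3.16, we obtain
\[
\begin{aligned}
E(S^{-1}AS^{-1}) &= \Big(\sum_{t=1}^{p}\sum_{k=1}^{p} a_{tk}\{c_2(2(n-p-1)^{-1}\sigma^{it}\sigma^{kj} + \sigma^{ik}\sigma^{tj} + \sigma^{ij}\sigma^{kt}) + (n-p-1)^{-2}\sigma^{it}\sigma^{kj}\}\Big) \\
&= c_2[2(n-p-1)^{-1}\Sigma^{-1}A\Sigma^{-1} + \Sigma^{-1}A'\Sigma^{-1} + \operatorname{tr}(A\Sigma^{-1})\Sigma^{-1}] + (n-p-1)^{-2}\Sigma^{-1}A\Sigma^{-1} \\
&= c_1\Sigma^{-1}A\Sigma^{-1} + c_2[\Sigma^{-1}A'\Sigma^{-1} + \operatorname{tr}(A\Sigma^{-1})\Sigma^{-1}].
\end{aligned}
\]
(ii) The $(i, j)$th element of $\operatorname{tr}(AS^{-1})S^{-1}$ is $s^{ij}\sum_{t=1}^{p}\sum_{k=1}^{p} a_{tk}s^{kt}$. Thus,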
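\[
\begin{aligned}
E(\operatorname{tr}(AS^{-1})S^{-1}) &= E\Big(\Big(\sum_{t=1}^{p}\sum_{k=1}^{p} a_{tk}s^{kt}s^{ij}\Big)\Big) \\
&= \Big(\sum_{t=1}^{p}\sum_{k=1}^{p} a_{tk}E(s^{kt}s^{ij})\Big) \\
&= \Big(\sum_{t=1}^{p}\sum_{k=1}^{p} a_{tk}\{c_2(2(n-p-1)^{-1}\sigma^{kt}\sigma^{ij} + \sigma^{ki}\sigma^{tj} + \sigma^{kj}\sigma^{ti}) + (n-p-1)^{-2}\sigma^{kt}\sigma^{ij}\}\Big) \\
&= c_2[2(n-p-1)^{-1}\operatorname{tr}(A\Sigma^{-1})\Sigma^{-1} + \Sigma^{-1}A\Sigma^{-1} + \Sigma^{-1}A'\Sigma^{-1}] + (n-p-1)^{-2}\operatorname{tr}(A\Sigma^{-1})\Sigma^{-1} \\
&= c_1\operatorname{tr}(A\Sigma^{-1})\Sigma^{-1} + c_2[\Sigma^{-1}A\Sigma^{-1} + \Sigma^{-1}A'\Sigma^{-1}].
\end{aligned}
\]
(iii) Pre-multiplying the result (ii) by the constant matrix $B$ and taking the trace, the desired result follows. ■

Styan (1989), using a result from Olkin and Rubin (1962), has also proved Theorem 3.3.18(i). He has also derived an expression for $E(SAS^{-1})$, where $S \sim W_p(n, \Sigma)$ and $A$ is a square nonrandom matrix, not necessarily symmetric, as
\[
E(SAS^{-1}) = \frac{1}{n-p-1}\,[n\Sigma A\Sigma^{-1} - A' - \operatorname{tr}(A)I_p].
\]

THEOREM 3.3.19. Let $S \sim W_p(n, \Sigma)$, then
(i) $E(C_\kappa(S)) = 2^k(\tfrac{1}{2}n)_\kappa\, C_\kappa(\Sigma)$ and
(ii) $E(C_\kappa(S^{-1})) = 2^{-k}\,\dfrac{\Gamma_p(\tfrac{1}{2}n, -\kappa)}{\Gamma_p(\tfrac{1}{2}n)}\, C_\kappa(\Sigma^{-1})$, $\quad \tfrac{1}{2}n > \tfrac{1}{2}(p-1) + k_1$.

Proof: (i) We have
\[
\begin{aligned}
E(C_\kappa(S)) &= \{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\det(\Sigma)^{\frac{1}{2}n}\}^{-1} \int_{S>0} C_\kappa(S)\det(S)^{\frac{1}{2}(n-p-1)} \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1}S)\, dS \\
&= \{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\det(\Sigma)^{\frac{1}{2}n}\}^{-1}\, \Gamma_p(\tfrac{1}{2}n, \kappa)\, 2^{\frac{1}{2}np} \det(\Sigma^{-1})^{-\frac{1}{2}n}\, C_\kappa(2\Sigma),
\end{aligned}
\]
where we have used Lemma 1.5.2.
(ii) The proof is similar to the above, using Lemma 1.5.2. ■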
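From (i) above we have
\[
E[(\operatorname{tr}S)^k] = \sum_{\kappa} E(C_\kappa(S)) = 2^k\sum_{\kappa} (\tfrac{1}{2}n)_\kappa\, C_\kappa(\Sigma).
\]
Since $\Sigma^{-\frac{1}{2}}S\Sigma^{-\frac{1}{2}} \sim W_p(n, I_p)$, it follows that
\[
E[\{\operatorname{tr}(\Sigma^{-1}S)\}^k] = 2^k\sum_{\kappa} (\tfrac{1}{2}n)_\kappa\, C_\kappa(I_p) = 2^k(\tfrac{1}{2}np)_k,
\]
where the last step has been obtained by using (see Subrahmaniam, 1976)
\[
\sum_{\kappa} (n)_\kappa\, C_\kappa(I_p) = (np)_k.
\]
This result has also been obtained by Muirhead (1986), who also derived the following results:
\[
\begin{aligned}
E[\{\operatorname{tr}(\Sigma^{-1}S)\}^{-k}] &= (-\tfrac{1}{2})^k\{(-\tfrac{1}{2}np + 1)_k\}^{-1}, \quad 2k < np, \\
E[\{\operatorname{tr}(\Sigma^{-1}S)\}^k\operatorname{tr}(S)] &= n2^k(\tfrac{1}{2}np + 1)_k(\operatorname{tr}\Sigma), \\
E[\{\operatorname{tr}(\Sigma^{-1}S)\}^k\operatorname{tr}(S^{-1})] &= (n-p-1)^{-1}2^k(\tfrac{1}{2}np - 1)_k(\operatorname{tr}\Sigma^{-1}), \quad n > p + 1, \\
E[\{\operatorname{tr}(\Sigma^{-1}S)\}^k\operatorname{tr}(\Sigma S^{-1})] &= (n-p-1)^{-1}p\,2^k(\tfrac{1}{2}np - 1)_k, \quad n > p + 1, \\
E[\{\operatorname{tr}(\Sigma^{-1}S)\}^k\det(S)^h] &= 2^{ph+k}(\tfrac{1}{2}np + ph)_k\,\frac{\Gamma_p(\tfrac{1}{2}n + h)}{\Gamma_p(\tfrac{1}{2}n)}\det(\Sigma)^h, \\
E[\{\operatorname{tr}(\Sigma^{-1}S)\}^r C_\kappa(SB)] &= 2^{k+r}(\tfrac{1}{2}np + k)_r(\tfrac{1}{2}n)_\kappa\, C_\kappa(B\Sigma), \\
E[\{\operatorname{tr}(\Sigma^{-1}S)\}^{-r} C_\kappa(SB)] &= \frac{(-1)^r 2^{k-r}}{(-\tfrac{1}{2}np - k + 1)_r}(\tfrac{1}{2}n)_\kappa\, C_\kappa(B\Sigma), \quad r < \tfrac{1}{2}np + k,
\end{aligned}
\]
where $B\,(p \times p)$ is a constant matrix.

THEOREM 3.3.20. Let $S = TT' \sim W_p(n, I_p)$, where $T = (t_{ij})$ is a lower triangular matrix with positive diagonal elements, then $E(T'T)^{-1} = B$, where $B = \operatorname{diag}(b_1, \ldots, b_p)$ with
\[
b_1 = \frac{1}{n-2}
\]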
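and
\[
b_j = \frac{n-1}{(n-j-1)(n-j)}, \quad j = 2, \ldots, p.
\]

Proof: From Theorem 3.3.4, it is known that the $t_{ij}$'s ($1 \le j \le i \le p$) are independent, with $t_{ij} \sim N(0, 1)$, $1 \le j < i \le p$, and $t_{ii}^2 \sim \chi^2_{n-i+1}$, $i = 1, \ldots, p$. Let
\[
a_i = E\Big(\frac{1}{t_{ii}^2}\Big) = \frac{1}{n-i-1}, \quad i = 1, \ldots, p. \tag{3.3.16}
\]
For any diagonal matrix $D$ with diagonal elements $\pm 1$, $DTD$ and $T$ have the same distribution and therefore
\[
B = E(T'T)^{-1} = E[(DTD)'(DTD)]^{-1} = DBD,
\]
which implies that $B$ is a diagonal matrix. Writing
\[
T = \begin{pmatrix} T_{11} & 0 \\ T_{21} & T_{22} \end{pmatrix},
\]
we get
\[
T^{-1} = \begin{pmatrix} T_{11}^{-1} & 0 \\ R_{21} & T_{22}^{-1} \end{pmatrix}
\]
and
\[
(T'T)^{-1} = T^{-1}(T^{-1})' = \begin{pmatrix} T_{11}^{-1}(T_{11}^{-1})' & T_{11}^{-1}R_{21}' \\ R_{21}(T_{11}^{-1})' & R_{21}R_{21}' + T_{22}^{-1}(T_{22}^{-1})' \end{pmatrix},
\]
where $R_{21} = -T_{22}^{-1}T_{21}T_{11}^{-1}$. Now, taking expectation we get
\[
E(T'T)^{-1} = B = \begin{pmatrix} E(T_{11}'T_{11})^{-1} & 0 \\ 0 & E(R_{21}R_{21}' + (T_{22}'T_{22})^{-1}) \end{pmatrix}. \tag{3.3.17}
\]
Letting $T_{11}$ be $(p-1) \times (p-1)$, $T_{22} = t_{pp}$, $T_{21}\,(1 \times (p-1)) = t_{21}'$ and using the independence of $T_{11}$, $t_{pp}$ and $t_{21}$, we get
\[
\begin{aligned}
b_p &= E\Big[\frac{1}{t_{pp}^2}\{1 + t_{21}'(T_{11}'T_{11})^{-1}t_{21}\}\Big] \\
&= E\Big[\frac{1}{t_{pp}^2}\Big]\, E\{1 + t_{21}'(T_{11}'T_{11})^{-1}t_{21}\}.
\end{aligned} \tag{3.3.18}
\]
Since $t_{21} \sim N_{p-1}(0, I_{p-1})$, and from (3.3.17), $E(T_{11}'T_{11})^{-1} = \operatorname{diag}(b_1, \ldots, b_{p-1})$, we have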
\[
E\{t_{21}'(T_{11}'T_{11})^{-1}t_{21}\} = \operatorname{tr}\{E(t_{21}t_{21}')E(T_{11}'T_{11})^{-1}\} = \sum_{j=1}^{p-1} b_j. \tag{3.3.19}
\]
Now, using (3.3.16) and (3.3.19) in (3.3.18), one obtains
\[
b_p = a_p\Big(1 + \sum_{j=1}^{p-1} b_j\Big).
\]
By an inductive process, it is straightforward to show that
\[
b_j = a_j\Big(1 + \sum_{i=1}^{j-1} b_i\Big), \quad j = 2, \ldots, p, \tag{3.3.20}
\]
and $b_1 = a_1$. Solving equations (3.3.20) in terms of the $a_j$'s, we get
\[
b_1 = a_1, \qquad b_j = a_j\prod_{i=1}^{j-1}(1 + a_i), \quad j = 2, \ldots, p,
\]
and using (3.3.16), we finally get
\[
b_1 = \frac{1}{n-2}
\]
and
\[
b_j = \frac{n-1}{(n-j-1)(n-j)}, \quad j = 2, \ldots, p. \qquad ■
\]

The above result has been derived by Eaton and Olkin (1987). Using this procedure one can derive a similar result for an upper triangular factorization of $S$ as given in the next theorem.

THEOREM 3.3.21. Let $S = TT' \sim W_p(n, I_p)$, where $T = (t_{ij})$ is an upper triangular matrix with positive diagonal elements. Then $E(T'T)^{-1} = B$, where $B = \operatorname{diag}(b_1, \ldots, b_p)$ with
\[
b_j = \frac{n-1}{(n-p+j-1)(n-p+j-2)}, \quad j = 1, 2, \ldots, p-1,
\]
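and
\[
b_p = \frac{1}{n-2}.
\]

Proof: Similar to the proof of Theorem 3.3.20. ■

THEOREM 3.3.22. If $S \sim W_p(n, \Sigma)$, then
(i) $\dfrac{\det(S)}{\det(\Sigma)} \sim \prod_{i=1}^{p} u_i$, where the $u_i$'s are independent and $u_i \sim \chi^2_{n-i+1}$, $i = 1, \ldots, p$, and
(ii)
\[
E[\det(S)^h] = 2^{ph}\det(\Sigma)^h \prod_{i=1}^{p} \frac{\Gamma[\tfrac{1}{2}(n-i+1) + h]}{\Gamma[\tfrac{1}{2}(n-i+1)]}, \quad \operatorname{Re}(h) > -\tfrac{1}{2}n + \tfrac{1}{2}(p-1). \tag{3.3.21}
\]

Proof: (i) Let $V = \Sigma^{-\frac{1}{2}}S\Sigma^{-\frac{1}{2}}$, then from Corollary 3.3.1.1, $V \sim W_p(n, I_p)$. Now, from Theorem 3.3.4, $V$ can be written as $TT'$ and
\[
\det(V) = \det(S)\det(\Sigma)^{-1} = \prod_{i=1}^{p} t_{ii}^2 = \prod_{i=1}^{p} u_i, \tag{3.3.22}
\]
where the $u_i = t_{ii}^2$ are independently distributed as $\chi^2_{n-i+1}$, $i = 1, \ldots, p$.
(ii) From (3.3.22), we have
\[
\begin{aligned}
E[\det(S)^h] &= \det(\Sigma)^h\, E\Big(\prod_{i=1}^{p} u_i^h\Big) \\
&= \det(\Sigma)^h \prod_{i=1}^{p} \Big\{ 2^h\,\frac{\Gamma[\tfrac{1}{2}(n-i+1) + h]}{\Gamma[\tfrac{1}{2}(n-i+1)]} \Big\}.
\end{aligned}
\]
Alternately, $E[\det(S)^h]$ can be evaluated using the density of $S$ as follows:
\[
\begin{aligned}
E[\det(S)^h] &= \int_{S>0} \det(S)^h\, \frac{\det(S)^{\frac{1}{2}(n-p-1)} \operatorname{etr}(-\tfrac{1}{2}\Sigma^{-1}S)}{2^{\frac{1}{2}np}\det(\Sigma)^{\frac{1}{2}n}\,\Gamma_p(\tfrac{1}{2}n)}\, dS \\
&= \frac{2^{ph}\det(\Sigma)^h\,\Gamma_p(\tfrac{1}{2}n + h)}{\Gamma_p(\tfrac{1}{2}n)}.
\end{aligned}
\]
Substituting $\Gamma_p(\cdot)$ from Theorem 1.4.1 we get (3.3.21). ■

The statistic $n^{-p}\det(S)$ is known as the sample generalized variance (Wilks, 1932). Many test statistics in multivariate statistical analysis are functions of the sample generalized variance (e.g., see Anderson, 1984; Gupta and Tang, 1984, 1986a, 1986b, 1987, 1988; Sen Gupta, 1987).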
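Formula (3.3.21) is straightforward to evaluate numerically. The sketch below, using only the Python standard library, computes $E[\det(S)^h]$ on the log scale for real $h$ and checks two special cases that follow directly from (3.3.21): for $h = 1$ the product of gamma ratios collapses to $n(n-1)\cdots(n-p+1)\det(\Sigma)$, and for $h = -1$ it collapses to $\det(\Sigma)^{-1}\prod_{i=1}^{p}(n-i-1)^{-1}$. The function name `det_moment` and its argument names are illustrative and do not come from the text.

```python
import math

def det_moment(n, p, h, det_sigma=1.0):
    """E[det(S)^h] for S ~ W_p(n, Sigma), from (3.3.21), for real h.

    Requires h > -(n - p + 1)/2 so that every gamma argument is positive.
    """
    if h <= -0.5 * (n - p + 1):
        raise ValueError("moment does not exist for this h")
    log_val = p * h * math.log(2.0) + h * math.log(det_sigma)
    for i in range(1, p + 1):
        a = 0.5 * (n - i + 1)
        log_val += math.lgamma(a + h) - math.lgamma(a)
    return math.exp(log_val)

if __name__ == "__main__":
    n, p = 7, 3
    # h = 1: n (n-1) ... (n-p+1) det(Sigma) = 7 * 6 * 5 = 210 for det(Sigma) = 1
    print(det_moment(n, p, 1))
    # Expected sample generalized variance n^{-p} det(S)
    print(det_moment(n, p, 1) / n ** p)
```

Working with `math.lgamma` rather than `math.gamma` avoids overflow when $n$ is large.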
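THEOREM 3.3.23. Let $S \sim W_p(n, \Sigma)$, then the characteristic function of $\operatorname{tr}(S)$ is
\[
\phi_{\operatorname{tr}(S)}(z) = \det(I_p - 2\iota z\Sigma)^{-\frac{1}{2}n}
\]
and the $k$th moment of $\operatorname{tr}(S)$ is
\[
E[(\operatorname{tr}S)^k] = 2^k\sum_{\kappa} (\tfrac{1}{2}n)_\kappa\, C_\kappa(\Sigma), \quad k = 0, 1, 2, \ldots.
\]

Proof: The characteristic function of $\operatorname{tr}(S)$ is
\[
\begin{aligned}
\phi_{\operatorname{tr}(S)}(z) &= E[\exp\{\iota z\operatorname{tr}(S)\}] \\
&= E[\exp\{\iota\operatorname{tr}(ZS)\}], \quad Z = zI_p, \\
&= \det(I_p - 2\iota z\Sigma)^{-\frac{1}{2}n}.
\end{aligned} \tag{3.3.23}
\]
The last equality is obtained from the characteristic function of $S$. Now, expanding (3.3.23), for $\|2z\Sigma\| < 1$,
\[
\phi_{\operatorname{tr}(S)}(z) = \sum_{k=0}^{\infty}\sum_{\kappa} (\tfrac{1}{2}n)_\kappa\,\frac{C_\kappa(2\iota z\Sigma)}{k!},
\]
from which the coefficient of $\dfrac{(\iota z)^k}{k!}$ gives the $k$th moment of $\operatorname{tr}(S)$. ■

A number of results have also been obtained on the expected values of the elementary symmetric functions of $S \sim W_p(n, \Sigma)$. The following results have been derived by de Waal and D. G. Nel (1973):
\[
\begin{aligned}
E(\operatorname{tr}_j S) ={}& n(n-1)\cdots(n-j+1)(\operatorname{tr}_j\Sigma), \\
E[(\operatorname{tr}_1 S)(\operatorname{tr}_2 S)] ={}& n(n-1)(n+2)(\operatorname{tr}_1\Sigma)(\operatorname{tr}_2\Sigma) - 6n(n-1)(\operatorname{tr}_3\Sigma), \\
E[(\operatorname{tr}_1 S)(\operatorname{tr}_3 S)] ={}& n(n-1)(n-2)(n+2)(\operatorname{tr}_1\Sigma)(\operatorname{tr}_3\Sigma) - 8n(n-1)(n-2)(\operatorname{tr}_4\Sigma), \\
E(\operatorname{tr}_2 S)^2 ={}& n(n+2)(n-1)(n+1)(\operatorname{tr}_2\Sigma)^2 - 4n(n+2)(n-1)(\operatorname{tr}_1\Sigma)(\operatorname{tr}_3\Sigma) \\
& - 4n(n-1)(2n-5)(\operatorname{tr}_4\Sigma), \\
E[(\operatorname{tr}_1 S)^2(\operatorname{tr}_2 S)] ={}& n(n+2)(n+4)(n-1)(\operatorname{tr}_1\Sigma)^2(\operatorname{tr}_2\Sigma) - 4n(n-1)(n+2)(\operatorname{tr}_2\Sigma)^2 \\
& - 12n(n-1)(n+2)(\operatorname{tr}_1\Sigma)(\operatorname{tr}_3\Sigma) + 48n(n-1)(\operatorname{tr}_4\Sigma), \\
E[(\operatorname{tr}_1 S)^2] ={}& n(n+2)(\operatorname{tr}_1\Sigma)^2 - 4n(\operatorname{tr}_2\Sigma), \\
E[(\operatorname{tr}_1 S)^3] ={}& n(n+2)(n+4)(\operatorname{tr}_1\Sigma)^3 - 12n(n+2)(\operatorname{tr}_1\Sigma)(\operatorname{tr}_2\Sigma) + 24n(\operatorname{tr}_3\Sigma), \\
E[(\operatorname{tr}_1 S)^4] ={}& n(n+2)(n+4)(n+6)(\operatorname{tr}_1\Sigma)^4 - 24n(n+2)(n+4)(\operatorname{tr}_2\Sigma)(\operatorname{tr}_1\Sigma)^2 \\
& + 48n(n+2)(\operatorname{tr}_2\Sigma)^2 + 96n(n+2)(\operatorname{tr}_1\Sigma)(\operatorname{tr}_3\Sigma) - 192n(\operatorname{tr}_4\Sigma),
\end{aligned}
\]
where $\operatorname{tr}_j S$ is the $j$th elementary symmetric function of the matrix $S$. For further work in this direction, the reader is referred to Pillai and Gupta (1967, 1968), de Waal (1972a, 1978), de Waal and D. G. Nel (1973), Saw (1973), and Shah and Khatri (1974).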
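The second and third moments of $\operatorname{tr}_1 S = \operatorname{tr}(S)$ listed above can be cross-checked against the characteristic function (3.3.23). If $\lambda_1, \ldots, \lambda_p$ are the eigenvalues of $\Sigma$, then (3.3.23) factors as $\prod_i (1 - 2\iota z\lambda_i)^{-n/2}$, so $\operatorname{tr}(S)$ has the law of $\sum_i \lambda_i u_i$ with independent $u_i \sim \chi^2_n$, whose cumulants are $\kappa_r = n\,2^{r-1}(r-1)!\sum_i \lambda_i^r$. The sketch below, using only the Python standard library, computes the moments both ways; the function names are illustrative, and the eigenvalues are passed in directly rather than computed from a matrix.

```python
import math
from itertools import combinations

def elementary_symmetric(lams, j):
    """j-th elementary symmetric function of the eigenvalues (tr_j Sigma)."""
    return sum(math.prod(c) for c in combinations(lams, j))

def trace_moments_formula(n, lams):
    """E[(tr S)^2] and E[(tr S)^3] from the de Waal and Nel expressions."""
    e1, e2, e3 = (elementary_symmetric(lams, j) for j in (1, 2, 3))
    m2 = n * (n + 2) * e1 ** 2 - 4 * n * e2
    m3 = n * (n + 2) * (n + 4) * e1 ** 3 - 12 * n * (n + 2) * e1 * e2 + 24 * n * e3
    return m2, m3

def trace_moments_cumulants(n, lams):
    """The same moments from the cumulants of sum_i lam_i * chi^2_n."""
    p1, p2, p3 = (sum(l ** r for l in lams) for r in (1, 2, 3))
    k1, k2, k3 = n * p1, 2 * n * p2, 8 * n * p3
    # Raw moments in terms of cumulants: m2 = k2 + k1^2, m3 = k3 + 3 k2 k1 + k1^3
    return k2 + k1 ** 2, k3 + 3 * k2 * k1 + k1 ** 3

if __name__ == "__main__":
    print(trace_moments_formula(5, [1, 2, 3]))    # (1040, 41040)
    print(trace_moments_cumulants(5, [1, 2, 3]))  # (1040, 41040)
```

With integer eigenvalues the two computations agree exactly, since every step is integer arithmetic.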
3.3. PROPERTIES 107 3.3.7. Distributions of Correlation, Regression Matrices and S"1 THEOREM 3.3.24. Let R = (r^·) be the correlation matrix of a random sample of size N = η + 1 from Νρ(μ, Σ). Then, the density of R when Σ = diag(an,..., σ^) is P^- det(R)^-p~l\ -1 < ry < 1, i < j. (3.3.24) Proof: Note that r^· = -0=, where S = (sij) is defined in Theorem 3.3.6. The density of S is {2Ьтр(^п) det^b}"1 det(5)2(n-p-1} etr ( - \^~lS). Now, making the transformation ι ι к к S = diag(si1?..., 5Й>)Д diagisfx,..., s&), with the Jacobian J(S -> Sn,..., Spp, R) = Π£=ι 5^ , we get the joint density of 5ц,..., Spp and R as fi,jr1r<-ft)lint«)y'—>. (3.3.25) Prom (3.3.25), it is seen that Sn,...,Spp and R are independently distributed and su ~ σ"ϋΧ^, г = 1,... ,p. The density of R is obtained from (3.3.25) and is given by (3.3.24). ■ From the above theorem it is clear that if S = (s^) ~ \νρ(η,Σ), Σ = (σ^·), then for Gij = 0, г φ j, χι = ^, г = 1,... ,p are independently distributed as chi-square with η degrees of freedom. In the general case when σ^ φ 0, the joint distribution of xb ..., xp is given by the following theorem (Mathai and Tan, 1977). THEOREM 3.3.25. Let S = (s0·) ~ \Υρ(η,Σ), where Σ = (σ0·). Then the joint distribution о{щ = ^, г = 1,... ,p is Σ^ΣΣ··-Σ^Κ···λ; ν·.,η, (3.3.26) m=0 l^71/771· α1=0α2=0 αρ=0 Ζ Ζ гуДеге Αα is the coefficient of z^z^2 · · · ζ*** in the expansion of [b(z)]m = Σ™1Ζ=0 Σ^=ο • · ·Σ£=0 Ααζ?ζ? ---φ with det(/p - A{z)) = 1 - b(z), A(z) = ( 0 Z1P12 ZlPlp\ Z2P21 0 ··· Z2P2p \ZpPpi ZpPp2 0 / y/0~ii&jj
108 CHAPTER 3. WISHART DISTRIBUTION and /< Theorem 3.3.5 is a special case of a general result derived by Jensen (1970). He obtained the joint distribution of щ = \t^E^1^·), г = 1,... ,<?, where 5 = (5^·), Sij (Pi x Pj), hj = 1, · · · ,<7, Pi +P2 + ''' +Pq = p. THEOREM 3.3.26. Let S ~ Wp(n, Σ), and partition S as ' S\\ S\2 \ q ,52i S22 J Р-Я q p-q Then the distribution of the regression coefficient matrix В = S^S12 is (3.3.27) where β = ΣϊΐΣ12. Proof: From Theorem 3.3.9, it is known that Su ~ W9(n,En) and 52i|5n ~ Νρ-ςις(Σ21Σ^8η,Σ22Λ <g> 5n). Now, using Theorem 2.3.1 and 2.3.10, we get SnlSl2\Su ~ Ν^Σ^Ση,εΰ1 0Σ2Μ). The distribution of В = SiiS12 is then derived by integrating out Sn from the joint density of S^iSl2 and 5ц. Thus, we have the density of β as (2π)-*«<"-<> det(E22.1)-i«{2br,(in) det^n)*"}-1 / detiiu)^"^-2'-1) etr f- ±SU{(B - β)Σ^Λ(Β - β)' + Σ^1} JSu>o L 2 dS'и. The above integral is evaluated using (1.4.6), finally giving the density of β as (3.3.27). ■ The above density was derived by Kshirsagar (1961a) and is known as matrix variate ^-density. This density is studied in the next chapter. It may be remarked here that the distribution of В is not known when 5 has a noncentral Wishart distribution. However, in the linear case the result has been derived by Juritz and Troskie (1976). THEOREM 3.3.27. Let S ~ Ης,(π,Σ), then the density ofV = 5"1 is {2^prp(^)det(E)b}"1det(y)-2(n+p+1)etr(- ^Σ-V"1), V > 0. (3.3.28)
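Proof: In the density of $S$ given by
$$\{2^{\frac12 np}\,\Gamma_p(\tfrac12 n)\det(\Sigma)^{\frac12 n}\}^{-1}\det(S)^{\frac12(n-p-1)}\,\mathrm{etr}\big(-\tfrac12\Sigma^{-1}S\big),$$
making the transformation $V = S^{-1}$, with the Jacobian $J(S \to V) = \det(V)^{-(p+1)}$, we get the density of $V$ as given in (3.3.28). ■

THEOREM 3.3.28. Let $S \sim W_p(n, I_p)$ and $x \sim N_p(0, I_p)$ be independent. Then,
$$\frac{n-p+1}{p}\,x'(C'C)^{-1}x \sim F_{p,\,n-p+1},$$
where $S = CC'$, the matrix $C$ being either triangular or nonsingular, and $F_{p,q}$ is the $F$-distribution with $p$ and $q$ degrees of freedom.

Proof: Let $y = (C^{-1})'x$. Then $y\,|\,C \sim N_p(0, (C^{-1})'C^{-1})$. Denote the conditional and the unconditional densities of $y$ by $f(y\,|\,C)$ and $f(y)$ respectively. Then,
$$f(y) = E_C[f(y\,|\,C)] = E_C[f(y\,|\,CC')] = E_{CC'}[f(y\,|\,CC')] = \int_{S>0} f(y\,|\,S)\,g(S)\,dS,$$
where $g(S)$ is the p.d.f. of $S$. Now,
$$f(y) = (2\pi)^{-\frac12 p}\{2^{\frac12 np}\,\Gamma_p(\tfrac12 n)\}^{-1}\int_{S>0}\det(S)^{\frac12(n+1-p-1)}\,\mathrm{etr}\big(-\tfrac12 S - \tfrac12 Syy'\big)\,dS$$
$$= \frac{\Gamma_p[\tfrac12(n+1)]}{\pi^{\frac12 p}\,\Gamma_p(\tfrac12 n)}\det(I_p + yy')^{-\frac12(n+1)} = \frac{\Gamma[\tfrac12(n+1)]}{\pi^{\frac12 p}\,\Gamma[\tfrac12(n-p+1)]}\,(1 + y'y)^{-\frac12(n+1)}.$$
Finally using Theorem 1.4.10, we get the density of $y'y = x'(C'C)^{-1}x = v$ (say) as
$$\{\beta(\tfrac12 p, \tfrac12(n-p+1))\}^{-1}\,v^{\frac12(p-2)}(1+v)^{-\frac12(n+1)}, \quad v > 0,$$
which is the desired result. ■

THEOREM 3.3.29. Let $S \sim W_p(n, I_p)$, and $a \in \mathbb{R}^p$, $a \neq 0$. Then
$$\frac{a'S^{-1}a}{a'S^{-2}a}$$
is distributed as $xy$, where $x$ and $y$ are independent, $x \sim B^{I}(\tfrac12(n-p+2), \tfrac12(p-1))$ and $y \sim \chi^2_{n-p+1}$.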
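The statement of Theorem 3.3.29 can be sanity-checked by simulation. The sketch below is only an illustration, not part of the original text: it reads the statistic as $w = a'S^{-1}a/a'S^{-2}a$ (the form used in the proof that follows), draws $S$ with `scipy.stats.wishart`, and compares the sample mean of $w$ with $E(x)E(y)$ for the beta and chi-square variables named in the theorem. The function names `simulate_w` and `theoretical_mean` are hypothetical helpers introduced here.

```python
import numpy as np
from scipy import stats


def simulate_w(n, p, a, size, seed=0):
    """Draw w = a'S^{-1}a / a'S^{-2}a for S ~ W_p(n, I_p)."""
    S = stats.wishart.rvs(df=n, scale=np.eye(p), size=size, random_state=seed)
    u = np.linalg.inv(S) @ a                  # rows hold S^{-1} a, shape (size, p)
    # a'S^{-2}a equals (S^{-1}a)'(S^{-1}a) because S is symmetric.
    return (u @ a) / np.sum(u * u, axis=1)


def theoretical_mean(n, p):
    """E(xy) for independent x ~ Beta((n-p+2)/2, (p-1)/2) and y ~ chi2(n-p+1)."""
    x_mean = stats.beta.mean(0.5 * (n - p + 2), 0.5 * (p - 1))
    y_mean = stats.chi2.mean(n - p + 1)
    return x_mean * y_mean


if __name__ == "__main__":
    n, p = 12, 3
    w = simulate_w(n, p, np.array([1.0, -2.0, 0.5]), size=40000, seed=1)
    print("sample mean:", w.mean(), "theoretical mean:", theoretical_mean(n, p))
```

A comparison of means is a weak check; a fuller check would compare the empirical distribution of $w$ with simulated products $xy$.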
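Proof: Let
$$w = \frac{a'S^{-1}a}{a'S^{-2}a}. \tag{3.3.29}$$
From Theorem 3.3.2, it is known that for any orthogonal matrix $\Gamma$ $(p \times p)$, the distribution of $\Gamma S\Gamma'$ is $W_p(n, I_p)$. Now, let $V = (v_{ij}) = \Gamma S\Gamma'$ and choose the orthogonal matrix $\Gamma$ as
$$\Gamma = \big((a'a)^{-\frac12}a \quad C\big)',$$
so that $\Gamma a = (a'a)^{\frac12}(1, 0, \ldots, 0)'$. Then, $S^{-1} = \Gamma'V^{-1}\Gamma$, $S^{-2} = \Gamma'V^{-2}\Gamma$,
$$a'S^{-1}a = (a'a)\,v^{11} \tag{3.3.30}$$
and
$$a'S^{-2}a = (a'a)\sum_{i=1}^{p}(v^{1i})^2, \tag{3.3.31}$$
where $V^{-1} = (v^{ij})$. By substituting from (3.3.30) and (3.3.31) in (3.3.29), we get
$$w = \frac{v^{11}}{\sum_{i=1}^{p}(v^{1i})^2}.$$
Now, let $V = TT'$, where $T$ is an upper triangular matrix with positive diagonal elements, and partition $T$ as
$$T = \begin{pmatrix} t_{11} & t' \\ 0 & T_{22} \end{pmatrix}, \qquad T_{22}\ ((p-1) \times (p-1)).$$
Then,
$$T^{-1} = \begin{pmatrix} t_{11}^{-1} & -t_{11}^{-1}t'T_{22}^{-1} \\ 0 & T_{22}^{-1} \end{pmatrix}$$
and
$$V^{-1} = (TT')^{-1} = (T')^{-1}T^{-1} = \begin{pmatrix} t_{11}^{-2} & -t_{11}^{-2}t'T_{22}^{-1} \\ -t_{11}^{-2}(T_{22}')^{-1}t & (T_{22}T_{22}')^{-1} + t_{11}^{-2}(T_{22}')^{-1}tt'T_{22}^{-1} \end{pmatrix}. \tag{3.3.32}$$
From (3.3.32) it follows that
$$w = \frac{t_{11}^{-2}}{t_{11}^{-4} + t_{11}^{-4}\,t'(T_{22}'T_{22})^{-1}t} = \frac{t_{11}^{2}}{1 + t'(T_{22}'T_{22})^{-1}t}.$$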
From Theorems 3.3.5 and 3.3.28, it is known that $t_{11}^2$ and $t'(T_{22}'T_{22})^{-1}t$ are independent, with $t_{11}^2 \sim \chi^2_{n-p+1}$ and $t'(T_{22}'T_{22})^{-1}t \sim \frac{p-1}{n-p+2}F_{p-1,\,n-p+2}$. Since $\{1 + t'(T_{22}'T_{22})^{-1}t\}^{-1} \sim B^{I}(\tfrac12(n-p+2), \tfrac12(p-1))$, the theorem follows. ■

The above result has been derived by Gupta and Nagar (1994). They have also derived the distribution of $w$ in terms of the Whittaker function.

THEOREM 3.3.30. Let $S \sim W_p(n, I_p)$. Then
$$\lambda = \frac{\det(S)}{\{\mathrm{tr}(S)/p\}^{p}}$$
and $\mathrm{tr}(S)$ are independent.

Proof: Let $R = \mathrm{diag}(s_{11}^{-1/2}, \ldots, s_{pp}^{-1/2})\,S\,\mathrm{diag}(s_{11}^{-1/2}, \ldots, s_{pp}^{-1/2})$. Then from Theorem 3.3.24, $s_{11}, \ldots, s_{pp}$ and $R$ are independent, and $s_{ii} \sim \chi^2_n$, $i = 1, \ldots, p$. Further, let $y_i = s_{ii}/\sum_{j=1}^{p}s_{jj}$, $i = 1, \ldots, p-1$, and $z = \sum_{j=1}^{p}s_{jj}$. Then $(y_1, \ldots, y_{p-1})$ and $z$ are independent. Now, since
$$\lambda = p^{p}\prod_{i=1}^{p-1}y_i\Big(1 - \sum_{i=1}^{p-1}y_i\Big)\det(R)$$
is a function of $y_1, \ldots, y_{p-1}$ and $\det(R)$ only, it is independent of $\mathrm{tr}(S) = z$. ■

The statistic $\lambda$ given above is the likelihood ratio test statistic for the sphericity hypothesis, first studied by Mauchly (1940) (also see Gupta, 1977; Muirhead, 1982; Anderson, 1984; Amey and Gupta, 1992).

3.4. INVERTED WISHART DISTRIBUTION

DEFINITION 3.4.1. A random matrix $V$ $(p \times p)$ is said to be distributed as inverted Wishart, with $m$ degrees of freedom and parameter matrix $\Psi$ $(p \times p)$, denoted by $V \sim IW_p(m, \Psi)$, if its density is given by
$$\frac{2^{-\frac12(m-p-1)p}\det(\Psi)^{\frac12(m-p-1)}}{\Gamma_p[\tfrac12(m-p-1)]\det(V)^{\frac12 m}}\,\mathrm{etr}\big(-\tfrac12 V^{-1}\Psi\big), \quad V > 0,\ \Psi > 0,\ m > 2p.$$

The inverted Wishart distribution is the matrix variate generalization of the inverted gamma distribution. This distribution has been used as a conjugate prior for the covariance matrix in a normal distribution. The relation between the Wishart and inverted Wishart distributions is given in the following theorem.

THEOREM 3.4.1. Let $V \sim IW_p(m, \Psi)$, then $V^{-1} \sim W_p(m-p-1, \Psi^{-1})$.

Proof: The density of $V$ is
$$\frac{2^{-\frac12(m-p-1)p}\det(\Psi)^{\frac12(m-p-1)}}{\Gamma_p[\tfrac12(m-p-1)]\det(V)^{\frac12 m}}\,\mathrm{etr}\big(-\tfrac12 V^{-1}\Psi\big).$$
Transforming $S = V^{-1}$ with Jacobian $J(V \to S) = \det(S)^{-(p+1)}$, we get the density of $S$ as
$$\frac{2^{-\frac12(m-p-1)p}\det(\Psi)^{\frac12(m-p-1)}}{\Gamma_p[\tfrac12(m-p-1)]}\det(S)^{\frac12(m-2p-2)}\,\mathrm{etr}\big(-\tfrac12 S\Psi\big),$$
which is the Wishart density with parameters $m - p - 1$ and $\Psi^{-1}$. ■

The marginal distribution of any square submatrix on the main diagonal of an inverted Wishart matrix is also an inverted Wishart.

THEOREM 3.4.2. Let $V \sim IW_p(m, \Psi)$ and partition $V$ and $\Psi$ as
$$V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}, \qquad \Psi = \begin{pmatrix} \Psi_{11} & \Psi_{12} \\ \Psi_{21} & \Psi_{22} \end{pmatrix},$$
where $V_{11}$ and $\Psi_{11}$ are $q \times q$. Then, $V_{11} \sim IW_q(m - 2p + 2q, \Psi_{11})$.

Proof: From Theorem 3.4.1, $V^{-1} \sim W_p(m-p-1, \Psi^{-1})$. Let
$$V^{-1} = \begin{pmatrix} V^{11} & V^{12} \\ V^{21} & V^{22} \end{pmatrix}, \qquad V^{11}\ (q \times q).$$
Then from Theorem 3.3.9, $V^{11\cdot2} = V^{11} - V^{12}(V^{22})^{-1}V^{21} \sim W_q(m - 2p + q - 1, \Psi^{11\cdot2})$, where $\Psi^{11\cdot2} = \Psi^{11} - \Psi^{12}(\Psi^{22})^{-1}\Psi^{21}$ and
$$\Psi^{-1} = \begin{pmatrix} \Psi^{11} & \Psi^{12} \\ \Psi^{21} & \Psi^{22} \end{pmatrix}.$$
Now, since $V^{11\cdot2} = V_{11}^{-1}$ and $\Psi^{11\cdot2} = \Psi_{11}^{-1}$, we have $V_{11}^{-1} \sim W_q(m - 2p + q - 1, \Psi_{11}^{-1})$ and hence, $V_{11} \sim IW_q(m - 2p + 2q, \Psi_{11})$. ■

COROLLARY 3.4.2.1. Any diagonal element of an inverted Wishart matrix is distributed as inverted gamma.

Proof: Take $q = 1$ in Theorem 3.4.2, and write $V_{11} = v_{11}$, $\Psi_{11} = \psi_{11}$; then $v_{11} \sim IW_1(m - 2p + 2, \psi_{11})$. The density of $v_{11}$ from Definition 3.4.1 is
$$\{2^{\frac12(m-2p)}\,\Gamma[\tfrac12(m-2p)]\}^{-1}\,\psi_{11}^{\frac12(m-2p)}\,v_{11}^{-\frac12(m-2p+2)}\exp\Big(-\frac{\psi_{11}}{2v_{11}}\Big), \quad v_{11} > 0,\ m > 2p,$$
which is an inverted gamma density. ■

Different techniques have been used to derive the first and second order moments of the inverted Wishart matrix. Kaufman (1967) derived the moments using a factorization theorem. Das Gupta (1968) employed invariance arguments. Haff (1979) established an identity by applying Stokes' theorem and derived the first two moments. von Rosen (1988a) gave a general method to obtain the $r$th order moment and obtained explicit expressions up to fourth order. Some of these results are given in the next two theorems.
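The parametrization $IW_p(m, \Psi)$ of Definition 3.4.1 differs from the one used by common software. As a hedged illustration (not part of the original text), the sketch below codes the log-density of Definition 3.4.1 directly and compares it with `scipy.stats.invwishart`, assuming SciPy's `df` corresponds to $m - p - 1$ and its `scale` to $\Psi$. It also draws the diagonal element $v_{11}$ and compares its mean with the inverted gamma of Corollary 3.4.2.1. The names `book_iw_logpdf` and `sample_v11` are hypothetical helpers.

```python
import numpy as np
from scipy import stats
from scipy.special import multigammaln


def book_iw_logpdf(V, m, Psi):
    """Log-density of V ~ IW_p(m, Psi) as written in Definition 3.4.1."""
    p = V.shape[0]
    nu = m - p - 1                                   # Wishart d.f. of V^{-1}
    _, logdet_V = np.linalg.slogdet(V)
    _, logdet_Psi = np.linalg.slogdet(Psi)
    return (-0.5 * nu * p * np.log(2.0)
            + 0.5 * nu * logdet_Psi
            - multigammaln(0.5 * nu, p)
            - 0.5 * m * logdet_V
            - 0.5 * np.trace(np.linalg.solve(V, Psi)))   # tr(V^{-1} Psi)


def sample_v11(m, Psi, size, seed=0):
    """Draw the (1,1) element of V ~ IW_p(m, Psi) via SciPy's parametrization."""
    p = Psi.shape[0]
    V = stats.invwishart.rvs(df=m - p - 1, scale=Psi, size=size, random_state=seed)
    return V[:, 0, 0]


if __name__ == "__main__":
    Psi = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.5]])
    V = np.array([[1.0, 0.2, 0.1], [0.2, 0.8, 0.0], [0.1, 0.0, 1.2]])
    m, p = 20, 3
    print(book_iw_logpdf(V, m, Psi),
          stats.invwishart.logpdf(V, df=m - p - 1, scale=Psi))
    # Corollary 3.4.2.1: v11 is inverted gamma with shape (m-2p)/2, scale psi11/2.
    ig = stats.invgamma(a=0.5 * (m - 2 * p), scale=0.5 * Psi[0, 0])
    print(sample_v11(m, Psi, 20000, seed=3).mean(), ig.mean())
```

Under this mapping SciPy's mean, `scale / (df - p - 1)`, becomes $\Psi/(m - 2p - 2)$.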
THEOREM 3.4.3. Let $V \sim IW_p(m, \Psi)$, then

(i) $E(v_{ij}) = \dfrac{\psi_{ij}}{m - 2p - 2}$, $m - 2p - 2 > 0$,

(ii) $\mathrm{cov}(v_{ij}, v_{kl}) = \dfrac{2(m-2p-2)^{-1}\psi_{ij}\psi_{kl} + \psi_{ik}\psi_{jl} + \psi_{il}\psi_{kj}}{(m-2p-1)(m-2p-2)(m-2p-4)}$, $m - 2p - 4 > 0$,

(iii) $E(VAV) = \dfrac{\mathrm{tr}(A\Psi)\Psi + (m-2p-2)\Psi A\Psi}{(m-2p-1)(m-2p-2)(m-2p-4)}$, $m - 2p - 4 > 0$,

where $V = (v_{ij})$, $\Psi = (\psi_{ij})$, and $A$ $(p \times p)$ is a constant positive semidefinite matrix.

Proof: See Haff (1979). ■

THEOREM 3.4.4. Let $V \sim IW_p(m, \Psi)$, then

(i) $E(V^3) = (c_1c_3 + c_2c_3 + c_1c_4 + 5c_2c_4)\Psi^3 + (2c_2c_3 + c_1c_4 + c_2c_4)(\mathrm{tr}\,\Psi)\Psi^2 - (c_2c_3 + c_2c_4)(\mathrm{tr}\,\Psi^2)\Psi - c_2c_4(\mathrm{tr}\,\Psi)^2\Psi$, $m - 2p - 6 > 0$,

(ii) $E(\mathrm{tr}(V)V) = c_1(\mathrm{tr}\,\Psi)\Psi + 2c_2\Psi^2$, $m - 2p - 4 > 0$,

(iii) Wjv-) = ""-"-„'^f"27'· - - * -»> о.

where $c_1 = (m - 2p - 3)c_2$, $c_2 = \{(m-2p-1)(m-2p-2)(m-2p-4)\}^{-1}$, $c_3 = (m-2p-4)\{(m-2p-6)(m-2p)\}^{-1}$, and $c_4 = 2\{(m-2p-6)(m-2p)\}^{-1}$.

Proof: See von Rosen (1988a). ■

THEOREM 3.4.5. Let $V \sim IW_p(m, \Psi)$, then

(i) $E(VAV) = c_1\Psi A\Psi + c_2[\Psi A'\Psi + \mathrm{tr}(A\Psi)\Psi]$,

(ii) $E(\mathrm{tr}(AV)V) = c_1\,\mathrm{tr}(A\Psi)\Psi + c_2[\Psi A'\Psi + \Psi A\Psi]$,

(iii) $E(\mathrm{tr}(AV)\,\mathrm{tr}(BV)) = c_1\,\mathrm{tr}(A\Psi)\,\mathrm{tr}(B\Psi) + c_2[\mathrm{tr}(B\Psi A\Psi) + \mathrm{tr}(B\Psi A'\Psi)]$,

where $c_1$, $c_2$ are defined in Theorem 3.4.4 and $A$ $(p \times p)$, $B$ $(p \times p)$ are constant matrices.

Proof: From Theorem 3.4.1 we know that $V^{-1} \sim W_p(m - p - 1, \Psi^{-1})$. The results then follow by using Theorem 3.3.18. ■

3.5. NONCENTRAL WISHART DISTRIBUTION

The noncentral Wishart distribution is the matrix variate generalization of the noncentral chi-square distribution. It is useful in studying the robustness and power of most multivariate tests.

DEFINITION 3.5.1. A $p \times p$ random symmetric positive definite matrix $S$ is said to have a noncentral Wishart distribution with parameters $p$, $n$, $\Sigma > 0$ and $\Theta$, written
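as $S \sim W_p(n, \Sigma, \Theta)$, if its p.d.f. is given by
$$\{2^{\frac12 np}\,\Gamma_p(\tfrac12 n)\det(\Sigma)^{\frac12 n}\}^{-1}\,\mathrm{etr}\big(-\tfrac12\Theta\big)\,\mathrm{etr}\big(-\tfrac12\Sigma^{-1}S\big)\det(S)^{\frac12(n-p-1)}\,{}_0F_1\big(\tfrac12 n; \tfrac14\Theta\Sigma^{-1}S\big), \quad S > 0,\ n > p, \tag{3.5.1}$$
where ${}_0F_1$ is the hypergeometric function (Bessel function). The matrix $\Theta$ is called the noncentrality parameter matrix.

When $\Theta = 0$, the noncentral Wishart distribution reduces to the Wishart distribution defined in Section 3.2. This distribution, like the Wishart distribution, can also be derived from the normal distribution.

THEOREM 3.5.1. Let $X \sim N_{p,n}(M, \Sigma \otimes I_n)$, $n > p$, then $S = XX' \sim W_p(n, \Sigma, \Theta)$, where $\Theta = \Sigma^{-1}MM'$.

Proof: The Laplace transform of $f(S)$, the density of $S = XX'$, is
$$g(Z) = E[\mathrm{etr}(-ZS)], \quad Z\ (p \times p) = Z'$$
$$= E[\mathrm{etr}(-ZXX')]$$
$$= (2\pi)^{-\frac12 np}\det(\Sigma)^{-\frac12 n}\int_{X \in \mathbb{R}^{p \times n}}\mathrm{etr}\big\{-ZXX' - \tfrac12\Sigma^{-1}(X - M)(X - M)'\big\}\,dX. \tag{3.5.2}$$
Now, write the trace of the quadratic form in the exponent as
$$\mathrm{tr}\big\{-ZXX' - \tfrac12\Sigma^{-1}(X-M)(X-M)'\big\} = \mathrm{tr}\big\{-\tfrac12(2Z + \Sigma^{-1})\big(X - (2Z + \Sigma^{-1})^{-1}\Sigma^{-1}M\big)\big(X - (2Z + \Sigma^{-1})^{-1}\Sigma^{-1}M\big)' + \tfrac12\Sigma^{-1}MM'\Sigma^{-1}(2Z + \Sigma^{-1})^{-1} - \tfrac12\Sigma^{-1}MM'\big\}. \tag{3.5.3}$$
Substituting from (3.5.3) in (3.5.2) and evaluating the integral we get
$$g(Z) = \det(\Sigma)^{-\frac12 n}\det(2Z + \Sigma^{-1})^{-\frac12 n}\,\mathrm{etr}\big\{-\tfrac12\Sigma^{-1}MM' + \tfrac12\Sigma^{-1}MM'\Sigma^{-1}(2Z + \Sigma^{-1})^{-1}\big\}$$
$$= 2^{-\frac12 np}\det(\Sigma)^{-\frac12 n}\det\big(Z + \tfrac12\Sigma^{-1}\big)^{-\frac12 n}\,\mathrm{etr}\big(-\tfrac12\Theta\big)\,{}_0F_0\big(\tfrac14\Theta\Sigma^{-1}(Z + \tfrac12\Sigma^{-1})^{-1}\big), \quad \mathrm{Re}\big(Z + \tfrac12\Sigma^{-1}\big) > 0. \tag{3.5.4}$$
The density $f(S)$ of $S$ is obtained by finding the inverse Laplace transform of (3.5.4) as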
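$$f(S) = \frac{2^{\frac12 p(p-1)}}{(2\pi i)^{\frac12 p(p+1)}}\int \mathrm{etr}(SZ)\,g(Z)\,dZ$$
$$= 2^{-\frac12 np}\det(\Sigma)^{-\frac12 n}\,\mathrm{etr}\big(-\tfrac12\Theta\big)\frac{2^{\frac12 p(p-1)}}{(2\pi i)^{\frac12 p(p+1)}}\int \mathrm{etr}(SZ)\det\big(Z + \tfrac12\Sigma^{-1}\big)^{-\frac12 n}\,{}_0F_0\big(\tfrac14\Theta\Sigma^{-1}(Z + \tfrac12\Sigma^{-1})^{-1}\big)\,dZ$$
$$= \frac{2^{-\frac12 np}\det(\Sigma)^{-\frac12 n}\,\mathrm{etr}(-\tfrac12\Theta)}{\Gamma_p(\tfrac12 n)}\det(S)^{\frac12(n-p-1)}\,\mathrm{etr}\big(-\tfrac12\Sigma^{-1}S\big)\,{}_0F_1\big(\tfrac12 n; \tfrac14\Theta\Sigma^{-1}S\big). \tag{3.5.5}$$
The last equality is obtained by applying the result (1.5.14). ■

The noncentral Wishart density (3.5.5) was derived by Herz (1955) and James (1954, 1955, 1964). In the case $\mathrm{rank}(\Theta) = 1, 2$, the results were obtained by Anderson and Girshick (1944), Anderson (1946) and Herz (1955), whereas Weibull (1953) and James (1955) gave the results for $\mathrm{rank}(\Theta) = 3$. When $\Sigma = I_p$ and the only nonzero element of $\Theta$ is $\theta_{11}$, the p.d.f. of $S = (s_{ij})$ simplifies to
$$\{2^{\frac12 np}\,\Gamma_p(\tfrac12 n)\}^{-1}\det(S)^{\frac12(n-p-1)}\exp\big(-\tfrac12\mathrm{tr}\,S - \tfrac12\theta_{11}\big)\,{}_0F_1\big(\tfrac12 n; \tfrac14\theta_{11}s_{11}\big), \tag{3.5.6}$$
where now ${}_0F_1(\cdot)$ is the Bessel function of a scalar argument.

In the rest of this section, we study some basic properties of the noncentral Wishart distribution.

THEOREM 3.5.2. Let $X \sim N_{p,n}(M, \Sigma \otimes I_n)$, $n > p$, and $A$ $(q \times p)$ be any matrix of rank $q \le p$. Then, $AXX'A' \sim W_q(n, A\Sigma A', (A\Sigma A')^{-1}AMM'A')$.

Proof: Let $Y = AX$; then from Theorem 2.3.10, $Y \sim N_{q,n}(AM, (A\Sigma A') \otimes I_n)$. From Theorem 3.5.1, we get $YY' = AXX'A' \sim W_q(n, A\Sigma A', (A\Sigma A')^{-1}AMM'A')$. ■

THEOREM 3.5.3. Let $S \sim W_p(n, \Sigma, \Theta)$. Then, the characteristic function of $S$ is
$$\det(I_p - 2i\Sigma Z)^{-\frac12 n}\,\mathrm{etr}\big\{-\tfrac12\Theta + \tfrac12(I_p - 2i\Sigma Z)^{-1}\Theta\big\},$$
where $Z = Z'$ $(p \times p) = (\tfrac12(1 + \delta_{ij})z_{ij})$ and $\delta_{ij}$ is the Kronecker's delta.

Proof: By definition, the characteristic function of $S$ is
$$\phi_S(Z) = \{2^{\frac12 np}\det(\Sigma)^{\frac12 n}\,\Gamma_p(\tfrac12 n)\}^{-1}\,\mathrm{etr}\big(-\tfrac12\Theta\big)\int_{S>0}\det(S)^{\frac12(n-p-1)}\,\mathrm{etr}\big(iZS - \tfrac12\Sigma^{-1}S\big)\,{}_0F_1\big(\tfrac12 n; \tfrac14\Theta\Sigma^{-1}S\big)\,dS. \tag{3.5.7}$$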
116 CHAPTER 3. WISHART DISTRIBUTION Now using (1.6.4), we get / det(5)2(n-p"1) etr {lZS - \^~lS) οΉ^η; ^ΘΣ"^) dS = Γρ(±η) det (IE"1 - cZ)^n Л(1п; \щ\&~1 ~ ^"W1) = 2*^Γρ(^η) det^)2ndet(/p - 2iEZ)-inetr{i(/p - г^Я^в}. (3.5.8) Substituting from (3.5.8) in (3.5.7), we get φ5(Ζ) = det(/p - 2^Z)-betr{- ^θ + hlv - 2ιΣΖ)~ιθ]. m Next we derive a differential equation for the characteristic function of the non- central Wishart matrix which is useful in the study of approximation of noncentral distribution by a central distribution (Steyn and Roux, 1972). THEOREM 3.5.4. Let X ~ NPi7l(M, Σ® Jn), n>p, S = XX' and Γ = (7ii); where jij = |(1 + 6{j)zij, Zij = Zji, i, j = 1,... ,p and 6{j is the Kronecker's delta. Then the characteristic function φ of S satisfies the differential equation || = б[п(Ф - 2d?)-1 + (Φ - 2J)"4MM'$($ - г^Г)"^ мЛеге Φ = Σ"1 and §§ = (|£). Proof: Let Χ = (жь...,жп), жа = (xla,..., х^)', Μ = (mb...,mn), ma = (mia,..., тттра)', а = 1,..., η and Φ = (^0·). Then S = Σα=ι жа< = (srt), srt = Σα=ι Xra^ia, and жа, a = 1,..., η are independently distributed as /a = cexp ~ ο Σ fajfaa - mia)(xja - TTlja) ' i,j=l where с = (2π) 2?det(#)2. The characteristic function of S is φ = E[eti(iTS)] /oo /-oo г l/n\Pn • · · / exp hJ^ZrtSrt ( Π //?) Π Π άχίβ- -oo ./-oo L _^ J ч λ_ι »·_ι λ_ι /?=1 t=l/?=l Differentiating 0 w.r.t. z\j we get 50 σώ r°° r°° / \ y -^- = ■■■/ «ryetriJi) (ΠΛ)ΠΠ^· (3·5·9) OZij J-oo J-oo ^β=ι ί=ιβ=1 Multiplying (3.5.9) by ψ^ and summing over j we get
3.5. NONCENTRAL WISHART DISTRIBUTION 117 Ρ βΛ roo roo / Ρ η \ η ρ η Σ>^=/ "■■/ ί^ΣΣ^ΐαχ,α etr(,r5)(ΠΛ)ΠΠ^- j=l °Ζ\3 J~°° */-°°Vj=la=l ' 0=1 г=1/?=1 Now using the result Σ%=ιΦΐ№α = E?=i^tj(sja - rnja) + Т^=\Ф%зЩа, the above expression can be rewritten as p дф p n Σ ΨϋΈ— = wi + L Σ Σ ФгзГПз*У\* (3.5.10) j=l a2:lj j=i α=1 where n roo roo ρ / n \ p n Wl = L Σ / · ' · / Χ1* Σ ^ijfea - Ща) etl(iTS) ( Д //?) Π Π α=1·/-οο J-oo j=1 ^=1 ' i=l0=1 and /OO ΓΟΟ . fl У " - - - / χ1α eti(tTS) ( Π ίβ) Π Π dxW- ■°° -7-00 /?=1 г=1/?=1 Further, tt;i can be written as n^ roo roo r roo . Ρ ^ ν ϊ Wi = iJ2 "· I { Xla ( Σ Ψϋ(Χ3<* ~ Ща) ) βίφΓ5)/α dxia \ α=1 J-oo J-oo w-oo 7=1 ^ η ρ η ( Π Λ) Π Π <**„. (3.5.11) /?=1 <?=1/?=1 Now integrating out xict using the result 7..-.U. = - ρ 9 ΣΨϋ(Χ3<* ~ mja)/a = -^—/a, and Λ Ρ -— (χιαetr(iTS)) = (txia j^270·χ^α + 6ц) etr(iTS) the expression (3.5.11) is simplified as roo roo , P n ч η ρ η /οο roo / y \ / \ • · · / (26 5^ 7ij 51 ZlaZja + niii ) βίφΓ5) ( [J //?) Π Π dX9P ■~ */-°°V j=l a=l 0=1 <7=1/?=1 = ,|ηία^ + 2Σ7ϊ^} (3.5.12) Substituting w\ from (3.5.12) in (3.5.10) we finally get p дф ( ρ дф " p дф ( ν дф ν Ί Σ^ϋ--— = ά п6цф + 2 ^7zj ο— + Σ Σ ФчЩаУга \. (3.5.13) j=l ^lj ^ j=l ^lj a=lj=] -1
118 CHAPTER 3. WISHART DISTRIBUTION Similarly by differentiating φ w.r.t. z2j·,... ,zpj, we can derive (p — 1) differential equations which together with (3.5.13) can, in general, be written as Σ ΨΰΓ" = Μ n<W + 2 Σ %■ я- + Σ Σ ФчЩаУ** \ (3.5.14) j=l ^J l j=l ^'j a=lj=l j where yea= Γ --- Γ xeaeti(iTS)( Π //?) Π Π <**ί* * = 1, - - - ,Ρ- (3-5.15) Further using the result ρ ρ ρ 3=1 3=1 3=1 together with (3.5.15) we have Σ % Σ yja^ite = Σ т*<* / '"" / W (Σ ^j(xia - mja)) etr(*TS)/a dxia ^ j=l a=l a=l -7-00 J-ooU-ooV=1 J ρ η 71 у Tl у 71 (Π //?) Π Π dx^ + 0 Σ ^ Σ щат^. /?=1 $=1/?=1 j=l a=l Φ" (^)^(ζ,α) Now solving the integral inside the curly brackets, as before, we have ρ η ρ η ρ η Σ % Σ Узсст* = Ι Σ 2Ή Σ Vjamea + Φ Σ ^<J Σ ЩаГП£а· (3-5.16) j=l α=1 j=l α=1 j=l α=1 Further let ya = (yia,..., Ура)' then equations (3.5.14) and (3.5.16) can be written as (Ф - 2ιΓ)|| = с{п1рф + Φ £ may>a} (3.5.17) and (Φ - 2бГ) Σ 2/α™ά = Φ* Σ ™α™ά = <^MM' (3.5.18) а=1 а=1 respectively. Finally substituting for Σα=ι 772α2/ά fr°m (3.5.18) in (3.5.17), we get (Φ - 2бГ)Ц = ψ/ρ + ФММ'Ф(Ф - 2ιΓ)~ι}φ i. е. -^ = Лп(Ф - 2d?)-1 + (Ф - г^Г^ФММ'ФСФ - 2ιΤ)~ι}φ. ш THEOREM 3.5.5. Lei 5^ ~ ^(η^,Σ,θ^), j = l,...,/c be independently distributed, then J2j=1 Sj ~ Wp(n, Σ, Θ) where η = Σ,$=ι Щ and θ = Σ$=ι Qj-
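Proof: The characteristic function of $S = \sum_{j=1}^{k}S_j$ is
$$\phi_S(Z) = E[\mathrm{etr}(iZS)] = \prod_{j=1}^{k}E[\mathrm{etr}(iZS_j)]$$
$$= \prod_{j=1}^{k}\det(I_p - 2i\Sigma Z)^{-\frac12 n_j}\,\mathrm{etr}\big\{-\tfrac12\Theta_j + \tfrac12(I_p - 2i\Sigma Z)^{-1}\Theta_j\big\}$$
$$= \det(I_p - 2i\Sigma Z)^{-\frac12 n}\,\mathrm{etr}\big\{-\tfrac12\Theta + \tfrac12(I_p - 2i\Sigma Z)^{-1}\Theta\big\},$$
which is the characteristic function of a noncentral Wishart matrix with parameters $n$, $\Sigma$ and $\Theta$. ■

When $S_j \sim W_p(n_j, \Sigma_j, \Theta_j)$, $j = 1, \ldots, k$ are independently distributed, Chikuse and Davis (1986) derived the distribution of $\sum_{j=1}^{k}S_j$ in series involving generalized Laguerre polynomials.

THEOREM 3.5.6. Let $S \sim W_p(n, \Sigma, \Theta)$; then
$$E[\det(S)^h] = \frac{2^{ph}\,\Gamma_p(\tfrac12 n + h)}{\Gamma_p(\tfrac12 n)}\det(\Sigma)^h\,\mathrm{etr}\big(-\tfrac12\Theta\big)\,{}_1F_1\big(\tfrac12 n + h; \tfrac12 n; \tfrac12\Theta\big), \quad \mathrm{Re}(h) > -\tfrac12 n + \tfrac12(p-1). \tag{3.5.19}$$

Proof: From the density (3.5.1), we get
$$E[\det(S)^h] = \{2^{\frac12 np}\,\Gamma_p(\tfrac12 n)\det(\Sigma)^{\frac12 n}\}^{-1}\,\mathrm{etr}\big(-\tfrac12\Theta\big)\int_{S>0}\mathrm{etr}\big(-\tfrac12\Sigma^{-1}S\big)\det(S)^{\frac12(n-p-1)+h}\,{}_0F_1\big(\tfrac12 n; \tfrac14\Theta\Sigma^{-1}S\big)\,dS.$$
Now, using (1.6.4) and simplifying, the result follows. ■

THEOREM 3.5.7. Let $Y \sim N_{p,n}(M, \Sigma \otimes I_n)$, $n > p$, $S = YY' = (s_{ij})$ and $MM' = (\omega_{ij})$. Then

(i) $E(s_{ij}) = n\sigma_{ij} + \omega_{ij}$ and

(ii) $E(s_{ij}s_{kl}) = (n\sigma_{ij} + \omega_{ij})(n\sigma_{kl} + \omega_{kl}) + n(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}) + \sigma_{jl}\omega_{ik} + \sigma_{il}\omega_{jk} + \sigma_{jk}\omega_{il} + \sigma_{ik}\omega_{jl}$.

Proof: (i) From Theorem 2.3.5(ii) we get
$$E(YY') = n\Sigma + MM', \tag{3.5.20}$$
and hence $E(s_{ij}) = n\sigma_{ij} + \omega_{ij}$.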
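(ii) From Theorem 2.3.8(v), we get
$$E(YY'BYY') = n\,\mathrm{tr}(B\Sigma)\Sigma + n^2\Sigma B\Sigma + n\Sigma B'\Sigma + nMM'B\Sigma + MM'B'\Sigma + \mathrm{tr}(BMM')\Sigma + \mathrm{tr}(B\Sigma)MM' + \Sigma B'MM' + n\Sigma BMM' + MM'BMM'. \tag{3.5.21}$$
Now
$$E(YY'BYY') = E(SBS) = \Big(\sum_{k=1}^{p}\sum_{j=1}^{p}E(s_{ij}b_{jk}s_{kl})\Big)$$
and hence
$$\sum_{k=1}^{p}\sum_{j=1}^{p}E(s_{ij}b_{jk}s_{kl}) = \sum_{k=1}^{p}\sum_{j=1}^{p}\big[n\sigma_{il}b_{jk}\sigma_{kj} + n^2\sigma_{ij}b_{jk}\sigma_{kl} + n\sigma_{ij}b_{kj}\sigma_{kl} + n\omega_{ij}b_{jk}\sigma_{kl} + \omega_{ij}b_{kj}\sigma_{kl} + \sigma_{il}b_{jk}\omega_{kj} + \omega_{il}b_{jk}\sigma_{kj} + \sigma_{ij}b_{kj}\omega_{kl} + n\sigma_{ij}b_{jk}\omega_{kl} + \omega_{ij}b_{jk}\omega_{kl}\big].$$
Next, setting $b_{jk} = 1$ for the given pair $(j, k)$ and all other elements of $B$ equal to zero, we get
$$E(s_{ij}s_{kl}) = (n\sigma_{ij} + \omega_{ij})(n\sigma_{kl} + \omega_{kl}) + n(\sigma_{il}\sigma_{jk} + \sigma_{ik}\sigma_{jl}) + \sigma_{jl}\omega_{ik} + \sigma_{il}\omega_{kj} + \sigma_{jk}\omega_{il} + \sigma_{ik}\omega_{jl}. \ ■$$

In the case $M = 0$, the above theorem gives the first two moments of $S$, where $S \sim W_p(n, \Sigma)$.

Premultiplying (3.5.20) and (3.5.21) by $\Sigma^{-1}$, setting $B = \Sigma^{-1}$ in (3.5.21), and taking the trace of the resulting equation, for $n > p$, we get
$$E[\mathrm{tr}(\Sigma^{-1}S)] = np + \mathrm{tr}(\Theta)$$
and
$$E[\mathrm{tr}(\Sigma^{-1}S\Sigma^{-1}S)] = np(n + p + 1) + 2(n + p + 1)\,\mathrm{tr}(\Theta) + \mathrm{tr}(\Theta^2),$$
where $S \sim W_p(n, \Sigma, \Theta)$. For an identity involving expectation of noncentral Wishart matrices, the reader is referred to Leung (1994).

Shah and Khatri (1974) have proved that if $S \sim W_p(n, \Sigma, \Theta)$ with $\Theta = \Sigma^{-1}W$, $W = MM'$ and $\mathrm{tr}_i\,S$ is the $i$th elementary symmetric function of $S$, then

(i) $E(\mathrm{tr}_p\,S) = E[\det(S)] = \det(\Sigma)\Big[n^{(p)} + \sum_{i=1}^{p}(n - i)^{(p-i)}\,\mathrm{tr}_i\,\Theta\Big]$

and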
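where $n^{(j)} = n(n-1)\cdots(n-j+1)$, $\Sigma(i(j))$ and $W(i(j))$ are submatrices obtained by considering the $i_1, i_2, \ldots, i_j$ rows and $i_1, i_2, \ldots, i_j$ columns of the matrices $\Sigma$ and $W$ respectively, and $\sum_{i(j)}$ denotes the sum $\sum_{i_1=1}^{p}\cdots\sum_{i_j=1}^{p}$ taken over $i_1 > i_2 > \cdots > i_j$. Saw (1973) has shown that
$$E[\mathrm{tr}_j(\Sigma^{-1}S)] = \sum_{i=0}^{j}(n - i)^{(j-i)}\binom{p-i}{j-i}\,\mathrm{tr}_i(\Theta), \quad 1 \le j \le p \le n.$$

THEOREM 3.5.8. Let $S \sim W_p(n, I_p, \Theta)$, $\Theta = \mathrm{diag}(\theta_{11}, 0, \ldots, 0)$ and $S = TT'$, where $T$ $(p \times p) = (t_{ij})$ is a lower triangular matrix with diagonal elements $t_{ii} > 0$. Then, $t_{ij}$, $1 \le j \le i \le p$ are independently distributed, $t_{11}^2 \sim \chi'^2_n(\theta_{11})$, $t_{ii}^2 \sim \chi^2_{n-i+1}$, $i = 2, \ldots, p$; and $t_{ij} \sim N(0, 1)$, $1 \le j < i \le p$.

Proof: The density of $S$ for $\Theta = \mathrm{diag}(\theta_{11}, 0, \ldots, 0)$ and $\Sigma = I_p$ from (3.5.1) is
$$\{2^{\frac12 np}\,\Gamma_p(\tfrac12 n)\}^{-1}\exp\big(-\tfrac12\theta_{11}\big)\,\mathrm{etr}\big(-\tfrac12 S\big)\det(S)^{\frac12(n-p-1)}\,{}_0F_1\big(\tfrac12 n; \tfrac14\theta_{11}s_{11}\big). \tag{3.5.22}$$
Let $S = TT'$ so that
$$\mathrm{tr}(S) = \sum_{i=1}^{p}\sum_{j=1}^{i}t_{ij}^2, \qquad \det(S) = \prod_{i=1}^{p}t_{ii}^2,$$
and from (1.3.14),
$$J(S \to T) = 2^p\prod_{i=1}^{p}t_{ii}^{p-i+1}.$$
The joint density of $t_{ij}$, $1 \le j \le i \le p$, obtained from (3.5.22) is
$$2^p\{2^{\frac12 np}\,\Gamma_p(\tfrac12 n)\}^{-1}\exp\big(-\tfrac12\theta_{11}\big)\exp\Big(-\tfrac12\sum_{i=1}^{p}\sum_{j=1}^{i}t_{ij}^2\Big)\prod_{i=1}^{p}t_{ii}^{n-i}\,{}_0F_1\big(\tfrac12 n; \tfrac14\theta_{11}t_{11}^2\big),$$
$$t_{ii} > 0,\ 1 \le i \le p, \quad -\infty < t_{ij} < \infty,\ 1 \le j < i \le p. \tag{3.5.23}$$
From (3.5.23) it is easily seen that $t_{ij}$, $1 \le j \le i \le p$ are independently distributed and $t_{ij} \sim N(0, 1)$, $1 \le j < i \le p$. Substituting $y_{ii} = t_{ii}^2$ one can show that $t_{11}^2 \sim \chi'^2_n(\theta_{11})$ and $t_{ii}^2 \sim \chi^2_{n-i+1}$, $i = 2, \ldots, p$. ■

There is also the noncentral inverted Wishart distribution defined by Roux and Becker (1984).

DEFINITION 3.5.2. A random matrix $V$ $(p \times p)$ is said to be distributed as noncentral inverted Wishart with $m$ degrees of freedom and parameter matrices $\Psi$ $(p \times p)$ and $\Theta$ $(p \times p)$, denoted by $V \sim IW_p(m, \Psi, \Theta)$, if its density is given by
$$\frac{2^{-\frac12(m-p-1)p}\det(\Psi)^{\frac12(m-p-1)}}{\Gamma_p[\tfrac12(m-p-1)]\det(V)^{\frac12 m}}\,\mathrm{etr}\big(-\tfrac12\Theta\big)\,\mathrm{etr}\big(-\tfrac12 V^{-1}\Psi\big)\,{}_0F_1\big(\tfrac12(m-p-1); \tfrac14\Theta\Psi V^{-1}\big), \quad V > 0,\ \Psi > 0,\ m > 2p.$$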
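Theorem 3.5.8 gives a direct recipe for simulating a noncentral Wishart matrix with $\Sigma = I_p$ and a rank-one noncentrality matrix. The following sketch is an illustration under stated assumptions, not the authors' procedure: it builds $T$ from the distributions listed in the theorem, and for comparison also forms $S = XX'$ as in Theorem 3.5.1 with $M$ chosen so that $MM' = \mathrm{diag}(\theta_{11}, 0, \ldots, 0)$. Both sample means should be close to $nI_p + \Theta$, the first moment in Theorem 3.5.7(i). The function names are hypothetical.

```python
import numpy as np


def sample_ncw_bartlett(n, p, theta11, size, seed=0):
    """S = TT' with T built from the distributions in Theorem 3.5.8."""
    rng = np.random.default_rng(seed)
    # Strictly lower triangular part: independent N(0, 1) entries.
    T = np.tril(rng.standard_normal((size, p, p)), k=-1)
    d = np.empty((size, p))
    d[:, 0] = rng.noncentral_chisquare(n, theta11, size)     # t_11^2
    if p > 1:
        # t_ii^2 ~ chi2(n - i + 1) for i = 2, ..., p (1-based index).
        d[:, 1:] = rng.chisquare(n - np.arange(1, p), size=(size, p - 1))
    idx = np.arange(p)
    T[:, idx, idx] = np.sqrt(d)
    return T @ np.swapaxes(T, 1, 2)


def sample_ncw_direct(n, p, theta11, size, seed=0):
    """S = XX' with X ~ N_{p,n}(M, I_p (x) I_n) and MM' = diag(theta11, 0, ..., 0)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((size, p, n))
    X[:, 0, 0] += np.sqrt(theta11)        # M has a single nonzero entry
    return X @ np.swapaxes(X, 1, 2)


if __name__ == "__main__":
    n, p, theta11 = 8, 3, 4.0
    print(sample_ncw_bartlett(n, p, theta11, 20000, seed=1).mean(axis=0).round(2))
    print(sample_ncw_direct(n, p, theta11, 20000, seed=2).mean(axis=0).round(2))
```

The triangular construction needs only $p(p+1)/2$ random draws per matrix instead of $np$, which matters when $n$ is large.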
This distribution is a matrix variate generalization of the inverted noncentral gamma distribution. It may be noted that if $V \sim IW_p(m, \Psi, \Theta)$, then $V^{-1} \sim W_p(m - p - 1, \Psi^{-1}, \Theta)$.

3.6. MATRIX VARIATE GAMMA DISTRIBUTION

Asoo (1969) defined the matrix variate gamma distribution as follows.

DEFINITION 3.6.1. A random positive definite matrix $W$ $(p \times p)$ is said to follow a matrix variate gamma distribution, denoted as $W \sim G_p(a, C)$, if its p.d.f. is
$$\{\Gamma_p(a)\det(C)^{-a}\}^{-1}\,\mathrm{etr}(-CW)\det(W)^{a - \frac12(p+1)}, \quad W > 0,$$
where $C$ $(p \times p) > 0$ and $a > \tfrac12(p - 1)$.

Note that if $S \sim W_p(n, \Sigma)$, then $S \sim G_p(\tfrac12 n, \tfrac12\Sigma^{-1})$. Similarly the random positive definite matrix $W$ $(p \times p)$ has the noncentral matrix variate gamma distribution, $G_p(a, C, \Theta)$, if its p.d.f. is
$$\{\Gamma_p(a)\det(C)^{-a}\}^{-1}\,\mathrm{etr}(-\Theta - CW)\det(W)^{a - \frac12(p+1)}\,{}_0F_1(a; \Theta CW), \quad W > 0,$$
where $C$ $(p \times p) > 0$, $a > \tfrac12(p - 1)$ and the symmetric matrix $\Theta$ is the noncentrality parameter. In this case if $S \sim W_p(n, \Sigma, \Theta)$, then $S \sim G_p(\tfrac12 n, \tfrac12\Sigma^{-1}, \tfrac12\Theta)$.

From Definitions 3.4.1 and 3.6.1, we define the matrix variate inverted gamma distribution with the notation $W \sim IG_p(m, B)$, if its p.d.f. is
$$\frac{\det(B)^{m - \frac12(p+1)}}{\Gamma_p[m - \tfrac12(p+1)]}\det(W)^{-m}\,\mathrm{etr}(-BW^{-1}), \quad W > 0,$$
where $B$ $(p \times p) > 0$ and $m > p$. If $W \sim IG_p(m, B)$, then $W^{-1} \sim G_p(m - \tfrac12(p+1), B)$. Conversely if $W \sim G_p(a, C)$, then $W^{-1} \sim IG_p(a + \tfrac12(p+1), C)$.

Using Bellman's (1956) integral identities, one can also give the following generalizations of the matrix variate gamma distribution (see Olkin, 1959).

DEFINITION 3.6.2. A random positive definite matrix $W$ $(p \times p)$ is said to follow Bellman gamma type I distribution, denoted by $W \sim BG_p(a_1, \ldots, a_p; C)$, if its p.d.f. is given by
{r;(oi,...,Op) Π det(C(a))-m4"1etr(-C^)det(^)^-^^+1) Π det(W^)~m^\ ^ a=l J a=l
where $C$ $(p \times p) > 0$ is a constant matrix, $a_j = m_1 + \cdots + m_j$, and $a_j > \tfrac12(j - 1)$, $j = 1, \ldots, p$.
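The relations stated after Definition 3.6.1, namely that $W_p(n, \Sigma)$ coincides with $G_p(\tfrac12 n, \tfrac12\Sigma^{-1})$ and that the inverted gamma $IG_p(m, B)$ is an inverted Wishart in disguise, can be verified numerically. The sketch below is illustrative only: it codes the two log-densities as written above and compares them with `scipy.stats.wishart` and `scipy.stats.invwishart`, assuming the SciPy mapping `df = 2m - p - 1`, `scale = 2B` for the inverted case. The function names are hypothetical.

```python
import numpy as np
from scipy import stats
from scipy.special import multigammaln


def matrix_gamma_logpdf(W, a, C):
    """Log-density of W ~ G_p(a, C) from Definition 3.6.1."""
    p = W.shape[0]
    _, logdet_W = np.linalg.slogdet(W)
    _, logdet_C = np.linalg.slogdet(C)
    return (a * logdet_C - multigammaln(a, p)
            - np.trace(C @ W) + (a - 0.5 * (p + 1)) * logdet_W)


def inverted_matrix_gamma_logpdf(W, m, B):
    """Log-density of W ~ IG_p(m, B) as defined in Section 3.6."""
    p = W.shape[0]
    a = m - 0.5 * (p + 1)
    _, logdet_W = np.linalg.slogdet(W)
    _, logdet_B = np.linalg.slogdet(B)
    return (a * logdet_B - multigammaln(a, p)
            - m * logdet_W - np.trace(np.linalg.solve(W, B)))   # tr(B W^{-1})


if __name__ == "__main__":
    Sigma = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.5]])
    W = np.array([[1.0, 0.2, 0.1], [0.2, 0.8, 0.0], [0.1, 0.0, 1.2]])
    n, p = 7, 3
    print(matrix_gamma_logpdf(W, 0.5 * n, 0.5 * np.linalg.inv(Sigma)),
          stats.wishart.logpdf(W, df=n, scale=Sigma))
    m, B = 6.0, 0.5 * Sigma
    print(inverted_matrix_gamma_logpdf(W, m, B),
          stats.invwishart.logpdf(W, df=2 * m - p - 1, scale=2 * B))
```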
3.6. MATRIX VARIATE GAMMA DISTRIBUTION 123 The generalized multivariate gamma function Г*(аь... ,ар) is defined in Theorem 1.4.6, and the matrices A^ and A^ are given in Definition 1.2.4. DEFINITION 3.6.3. A random positive definite matrix, W (pxp), is said to follow Bellman gamma type II distribution, denoted byW~ BGp*(bi,..., bp\ B), if its p.d.f. is given by {г;(Ьь ..., bp) [J det(BM)-*4 l eti(-BW) det^)*-*^ Π det^))"*-1 ^ a=l ' a=2 where Β (ρ χ ρ) > 0 is a constant matrix, bj = kp-j+i + · · · + kp, and bj > \{j — 1), j = l,....,p. THEOREM 3.6.1. Let S = TT ~ Wp(n, Σ), where Τ {pxp) is a lower triangular matrix with positive diagonal elements, then the distribution of the matrix R = Τ'Σ~ιΤ is {2bprp(in)}"1det(^)^71-2) jQdet^))-^^ (- |д), R > 0. Pi-oof: The density S is {22η?Τρ(^η) det(E)b}_1 det(5)2(n-p-1} etr (- ^Z~lS). Let S = TV, then the Jacobian of transformation is J(S -+T) = 2P Π-Li *«"i+\ and the density of Τ = {Uj) is 2р{22пргрЬп) det(E)b}_1 det(TT')2(n-p-1) etr ( - ^Σ~ιΤΤ') f[ %~ί+ι. t=l Write Σ"1 = A'A where A = (%·) is a lower triangular matrix and transform Ri = (ru(i)) = AT, which is a lower triangular matrix and гцщ = data. The Jacobian of transformation from (1.3.7) is J(T ->· Ri) = ΠΡ=ι αϊϊ\ and the density of Ri is given ЬУ 2ψ^Γρ(\η)}~1 det^R^-r-V etr (- ^ВД) Д ^f. τιΛ i * Hpt.ftf ff.^^-P-1) Pt.r I- -Ft.PA Now, let Д = R[RX = ΤΣ~ιΤ and get r«(i) \det(R{i+l)) ] 'Z V··,? A 1*И>, »=P The Jacobian of this transformation is J(J?i —>· R) = 2 ρΠί=ι^ζ(ΐ)> and fr°m tne density of Л1? we get the density of R as {2>rp(in)}"1det(i?)^"-2)ndet(i?(i))-1etr(- ±я). . Tan and Guttman (1971) derived the above density in a slightly different form and
called it the disguised Wishart distribution. However, this distribution is a special case of the Bellman gamma distribution type II given above. The disguised inverted Wishart distribution has been studied by Gupta and Ofori-Nyarko (1995).

3.7. APPROXIMATIONS

In this section we derive approximations to the distributions of a linear combination of Wishart matrices and of a noncentral Wishart matrix. Linear combinations of independent Wishart matrices arise in matrix quadratic forms, MANOVA random effects models, and robustness studies involving mixtures of multivariate normal distributions.

Let $S_j \sim W_p(n_j, \Sigma_j)$, $j = 1, \ldots, k$ be mutually independent. Consider the linear combination
$$S = \sum_{j=1}^{k}a_jS_j, \quad a_j > 0.$$
In the univariate case, the distribution of a linear combination of chi-square variables has been approximated by a chi-square distribution by equating the first two moments. In the present case Tan and R. P. Gupta (1983) have approximated the distribution of $S$ by the distribution of $W$, where $W \sim W_p(n, \Sigma)$ and $n$ and $\Sigma$ have been obtained by comparing their expected values and the generalized variances. Write $S = (s_{uv})$, $\mathrm{vecp}(S) = (s_{11}, s_{12}, s_{22}, \ldots, s_{1p}, \ldots, s_{pp})'$, $A_1 = \mathrm{cov}(\mathrm{vecp}(S))$, and $A_2 = \mathrm{cov}(\mathrm{vecp}(W))$. Then
$$E(S) = \sum_{j=1}^{k}a_jn_j\Sigma_j, \tag{3.7.1}$$
$$E(W) = n\Sigma, \tag{3.7.2}$$
$$A_1 = 2\sum_{j=1}^{k}a_j^2n_j\,B_p'(\Sigma_j \otimes \Sigma_j)B_p, \tag{3.7.3}$$
and
$$A_2 = 2n\,B_p'(\Sigma \otimes \Sigma)B_p, \tag{3.7.4}$$
where the expressions (3.7.3) and (3.7.4) have been obtained by using a result given in Problem 3.19, and the matrix $B_p$ $(p^2 \times \tfrac12 p(p+1))$ has been defined in Section 1.2. Now equating the expected values from (3.7.1) and (3.7.2) and the generalized variances from (3.7.3) and (3.7.4) we get
$$\Sigma = \frac{1}{n}\sum_{j=1}^{k}a_jn_j\Sigma_j \tag{3.7.5}$$
and
$$n = \Big\{\frac{n^{\frac12 p(p+1)}\det(A_2)}{\det(A_1)}\Big\}^{\frac{2}{p(p+1)}}. \tag{3.7.6}$$
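To make (3.7.5) and (3.7.6) concrete, the following sketch computes the approximating $n$ and $\Sigma$ numerically. It is an illustration under stated assumptions rather than the authors' implementation: instead of forming $B_p$, it builds $\mathrm{cov}(\mathrm{vecp}(S_j))$ elementwise from $\mathrm{cov}(s_{ij}, s_{kl}) = n(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk})$, which follows from the Wishart moments, and it uses the closed form $n^{\frac12 p(p+1)}\det(A_2) = 2^p\det(\sum_j a_jn_j\Sigma_j)^{p+1}$. The ordering of the elements of $\mathrm{vecp}$ does not affect the determinant. The function names are hypothetical.

```python
import numpy as np
from itertools import combinations_with_replacement


def cov_vecp_wishart(n, Sigma):
    """cov(vecp(S)) for S ~ W_p(n, Sigma), built element by element."""
    p = Sigma.shape[0]
    pairs = list(combinations_with_replacement(range(p), 2))
    A = np.empty((len(pairs), len(pairs)))
    for r, (i, j) in enumerate(pairs):
        for c, (k, l) in enumerate(pairs):
            A[r, c] = n * (Sigma[i, k] * Sigma[j, l] + Sigma[i, l] * Sigma[j, k])
    return A


def wishart_approximation(a, n_list, Sigmas):
    """Return (n, Sigma) matching E(S) and det cov(vecp(S)) of S = sum a_j S_j."""
    p = Sigmas[0].shape[0]
    mean_S = sum(aj * nj * Sj for aj, nj, Sj in zip(a, n_list, Sigmas))      # (3.7.1)
    A1 = sum(aj ** 2 * cov_vecp_wishart(nj, Sj)
             for aj, nj, Sj in zip(a, n_list, Sigmas))                       # (3.7.3)
    _, logdet_A1 = np.linalg.slogdet(A1)
    _, logdet_mean = np.linalg.slogdet(mean_S)
    # (3.7.6) with n^{p(p+1)/2} det(A2) = 2^p det(sum a_j n_j Sigma_j)^{p+1}.
    log_n = (2.0 / (p * (p + 1))) * (p * np.log(2.0) + (p + 1) * logdet_mean - logdet_A1)
    n = float(np.exp(log_n))
    return n, mean_S / n                                                     # (3.7.5)


if __name__ == "__main__":
    S1 = np.array([[2.0, 0.3], [0.3, 1.0]])
    S2 = np.array([[1.0, -0.2], [-0.2, 3.0]])
    print(wishart_approximation([0.5, 2.0], [6, 9], [S1, S2]))
```

When all $\Sigma_j$ are equal and all $a_j = 1$, the sum is exactly Wishart and the routine returns $n = \sum_j n_j$, which is a convenient sanity check.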
It may be noted that $n^{\frac12 p(p+1)}\det(A_2)$ does not depend on $n$. Using (1.2.18), we get
$$\det(A_2) = (2n)^{\frac12 p(p+1)}\det\big(B_p'(\Sigma \otimes \Sigma)B_p\big) = 2^p n^{\frac12 p(p+1)}\det(\Sigma)^{p+1}$$
$$= 2^p n^{\frac12 p(p+1)}\det\Big(\frac{1}{n}\sum_{j=1}^{k}a_jn_j\Sigma_j\Big)^{p+1} = 2^p n^{-\frac12 p(p+1)}\det\Big(\sum_{j=1}^{k}a_jn_j\Sigma_j\Big)^{p+1}$$
and therefore, since $\det(A_1)$ and $\det(A_2)$ have been equated,
$$n^{\frac12 p(p+1)}\det(A_1) = n^{\frac12 p(p+1)}\det(A_2) = 2^p\det\Big(\sum_{j=1}^{k}a_jn_j\Sigma_j\Big)^{p+1}.$$

Another approximation to the distribution of $S$ has been obtained by Khatri (1989), by comparing the expected values and the total variance. Yet another approximation can be given by a generalized Gram-Charlier series expansion, which becomes quite complicated if higher order derivatives are included (Tan, 1980 and Tan and R. P. Gupta, 1982).

The noncentral Wishart distribution has been approximated by a Wishart distribution (Steyn and Roux, 1972) by using the representation of the noncentral Wishart matrix in normal vectors. Let $X \sim N_{p,n}(M, \Sigma \otimes I_n)$, $n > p$. Then $S = XX' = (s_{ij})$ has a noncentral Wishart distribution. From Theorem 3.5.7 the first two moments of $S$ are given by
$$E(s_{ij}) = n\sigma_{ij} + \omega_{ij} \tag{3.7.7}$$
and
$$E(s_{ij}s_{kl}) = (n\sigma_{ij} + \omega_{ij})(n\sigma_{kl} + \omega_{kl}) + n(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}) + \sigma_{jl}\omega_{ik} + \sigma_{il}\omega_{jk} + \sigma_{jk}\omega_{il} + \sigma_{ik}\omega_{jl}, \tag{3.7.8}$$
where $MM' = (\omega_{ij})$. When $M = 0$, i.e., $\omega_{ij} = 0$, the above moments reduce to the moments of the Wishart distribution given by
$$E(s_{ij}) = n\sigma_{ij} \tag{3.7.9}$$
and
$$E(s_{ij}s_{kl}) = n^2\sigma_{ij}\sigma_{kl} + n(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}). \tag{3.7.10}$$
Now consider a Wishart matrix $B = (b_{ij})$, $B \sim W_p(n, \Sigma^*)$, where $\Sigma^* = \Sigma + \frac{1}{n}MM'$. Then from (3.7.9) and (3.7.10) we have
$$E(b_{ij}) = n\sigma_{ij} + \omega_{ij} \tag{3.7.11}$$
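and
$$E(b_{ij}b_{kl}) = (n\sigma_{ij} + \omega_{ij})(n\sigma_{kl} + \omega_{kl}) + n(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}) + \sigma_{jl}\omega_{ik} + \sigma_{il}\omega_{jk} + \sigma_{jk}\omega_{il} + \sigma_{ik}\omega_{jl} + \frac{1}{n}(\omega_{ik}\omega_{jl} + \omega_{il}\omega_{jk}). \tag{3.7.12}$$
Comparing (3.7.7) with (3.7.11) and (3.7.8) with (3.7.12) it is seen that the first order moments of $S$ and $B$ are identical, whereas the second order moments differ in terms of order $O(n^{-1})$, i.e.,
$$E(b_{ij}) = E(s_{ij}) \quad \text{and} \quad E(b_{ij}b_{kl}) = E(s_{ij}s_{kl}) + O(n^{-1}).$$
This suggests that we can approximate the distribution of $S$ by a Wishart distribution with parameters $n$ and $\Sigma + \frac{1}{n}MM'$. Note that the characteristic function $\phi$ of $S$ satisfies the differential equation given in Theorem 3.5.4, viz.
$$\frac{\partial\phi}{\partial Z} = i\{n(\Phi - 2i\Gamma)^{-1} + (\Phi - 2i\Gamma)^{-1}\Phi MM'\Phi(\Phi - 2i\Gamma)^{-1}\}\phi, \tag{3.7.13}$$
where $\Phi = \Sigma^{-1}$, and $\Gamma = (\tfrac12(1 + \delta_{ij})z_{ij})$. When $M = 0$, this differential equation reduces to
$$\frac{\partial\phi}{\partial Z} = ni(\Phi - 2i\Gamma)^{-1}\phi = ni(I_p - 2i\Gamma\Sigma)^{-1}\Sigma\phi. \tag{3.7.14}$$
From (3.7.14), the characteristic function, $\phi^*$, of $B$ satisfies the following differential equation
$$\frac{\partial\phi^*}{\partial Z} = ni\Big\{I_p - 2i\Gamma\Big(\Sigma + \frac{1}{n}MM'\Big)\Big\}^{-1}\Big(\Sigma + \frac{1}{n}MM'\Big)\phi^*. \tag{3.7.15}$$
Now, by taking $\Gamma$ such that the conditions for convergence of matrix series are satisfied, from (3.7.15) it follows that
$$\frac{\partial\phi^*}{\partial Z} = ni\Big[(I_p - 2i\Gamma\Sigma)^{-1} + (I_p - 2i\Gamma\Sigma)^{-1}\frac{2i}{n}\Gamma MM'(I_p - 2i\Gamma\Sigma)^{-1} + O(n^{-2})\Big]\Big(\Sigma + \frac{1}{n}MM'\Big)\phi^*.$$
Thus
$$\frac{\partial\phi^*}{\partial Z} = ni\Big[(\Phi - 2i\Gamma)^{-1} + \frac{1}{n}(\Phi - 2i\Gamma)^{-1}\Phi MM'\Phi(\Phi - 2i\Gamma)^{-1} + O(n^{-2})\Big]\phi^*. \tag{3.7.16}$$
The expressions in (3.7.13) and (3.7.16) differ only in terms of order $O(n^{-2})$, which indicates the closeness of approximation of the noncentral Wishart distribution by a central Wishart distribution. For further results on approximation of noncentral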
PROBLEMS 127 Wishart distribution by a Wishart distribution see Tan (1979), Tan and R. P. Gupta (1982), and Kollo and von Rosen (1995). For results on the asymptotic expansion of the Wishart density, the reader is referred to Sugiura (1973), D. G. Nel (1978), and D. G. Nel and Groenewald (1979). PROBLEMS 3.1. Let X = (χι,..., xn), where x{ ~ Νρ(μ, Σ), г = 1,..., Ν are independently distributed. Further, let Α (Ν χ Ν) be a constant matrix of rank (N - r). Then, prove that XAX' is positive definite with probability one if Ν > ρ + r. 3.2. Let S ~ Wp(n, Σ) and Χ ~ ΝΡ|Τη(0, Σ <g> Im) are independently distributed. Assuming m < p, prove that S + XX' ~ Wp(m + η, Σ). 3.3. Let X ~ ATp,m(M, Σ <g> Φ|β, C),n>p + s. Show that (X - M)4rl(X - Μ)' ~ Wp{n-s,V). 3.4. Prove Theorem 3.3.7, when η is an integer by expressing the matrix S in normal variables. 3.5. Let S\ ~ Wp(ni, Σι) and 52 ~ Wp(n2, Σ2) be independent. Show that the p.d.f. of S = Si + S2 is given by {2^ni+n2)prp[i(n1 + n2)] det(E!)b det(E2)b}_1 etr (- ^lS) det(S)^+n>-p-V iFx^na; ±(щ + n2); ^(ΣΓ1 - Σ2 x)5), S > 0. 3.6. Prove Theorem 3.3.9, when η is an integer by expressing the matrix S in normal variables. 3.7. Let S ~ νΡρ(π, Σ) and partition S and Σ as in Theorem 3.3.9. Then, prove that 5ц and 522 are independent if and only if Σ12 = 0. 3.8. Let S ~ Wp(n, Σ) and for A(pxp) = (a{j) define A^ = (а#), г, j = 1,..., r. Then prove that det(gH) dettE^) det^"1!) det(EH) 'Γ-1'···'Ρ> where det(5^) = det(E^) = 1, are independently distributed as x*_r+1, r = l,...,p. 3.9. Let 5 ~ Wp in, ^EJ. Prove that the asymptotic distribution, as η —>· oo, of (ti§hl£UisJV(0,l). 3.10. Let 5 ~ Wp(n, Σ). Prove that the asymptotic distribution of y/n (j^ettL — lj is normal with mean 0 and variance 2p.
128 CHAPTER 3. WISHART DISTRIBUTION 3.11. Let Si ~ Wp(nbE) and S2 ~ Wv(n2,T) be independent. Show that Sx + S2 and (Si + 52)~2515^"1(5i + S2)2 axe independent, where (Si + S2)i is any square root depending only on S\ + S2 and not on the individual values of S\ and S2. (Perlman, 1977) 3.12. Let Si ~ Wp(rii, Σ), г = 1,..., d be independently distributed. (i) If S = Σ$=ι Sj and g(Su · · · > Sd) are independently distributed, then show that the random variable g(ASiAr,..., ASdA') has the same distribution as g{Si,...,Sd) for any nonsingular matrix A(p χ p). (ii) If for each В > 0, there is an Μ with β = MM' and g(MSlM\ ..., MSdM') and #(5i,..., 5d) have identical distribution, then prove that S = Υ%=\ Sj and g(S\, ...,Sd) are independent. (Olkin and Rubin, 1964) 3.13. Let Si ~ Wp(nuY), г = l,...,d be independently distributed. Then, prove that the random matrices (a) Wj = (Si + · · · + Sj)~^Sj+l(Si + -· + Sj)~K j = 1,..., d - 1, where (S\ + ··· + Sj)* is the triangular root of Si + · · · + Sj, are independently distributed, and (b) Zj = (Si + · · · + Sj+l)'iSj+1(Si + -"+ Sf+iJ-i, j = 1,..., d - 1 where (Si Η l· 5j+i)2 is any nonsingular square root depending only on Si Η l· Sj+ι, are independently distributed. 3.14. Let Si ~ Wp(rii, Σ), г = 1,..., d be independently distributed, and (Si Η h Sj)2 be any square root depending only on Si Η l· Sj. Then, show that the random matrices WJ- = (Si + ... + Si)-^+i(Si + --- + 5J-)-i,i = l,...,d-l are not independent. However, Wi,..., Wd-i are independent where Щ = (Si + - - - + Si+i)-iSi+i(Si + -" + Sj+i)~l(Si + · ■ · + Si+i)2. (Olkin and Rubin, 1964; Perlman, 1977) 3.15. Let S ~ V^p(n, Σ). Then, show that (i) £[ln{det(S)}] = ln{det(E)} +pln2 + £> ρ Γ1 -(n-t+1) г=1 2V where ?/>(·) is the psi-function. (ii) When Σ = Jp, a GF, α ^ 0, (q/S-1a)(a/S-2a) (a'a)2 - - -, η > p+5. (η — p)(n — ρ — l)(n — ρ — 3)(n — ρ — 5)'
PROBLEMS 129 3.16. Let nS ~ Wp(n,Ip) and S = Ip + η *W', where W = (wij). Furthermore, let α be a fixed vector. Then prove that (i) E{w\x) = 2 (ii) E{w\2) = 1 (iii) E{a'Wa)2 = 2(α'α)2 (iv) E{a'W2a) = {p + l)a'a. 3.17. Let S ~ Wp(n, Σ) and put a = \E (^), where δ = \ tr^"1). Prove that (i)a = ytr(E_l5) δ \ trS (ii) 0 < α < 1 for all Σ > 0. 3.18. Let S ~ Wp(7i, Σ) and и be distributed as beta with parameters {\m, \{n — m)) independent of 5, η > га. Further, let A = uS and a be any ρ χ 1 vector of constants. Then prove that (ii) E(A) = τηΣ. 3.19. Let S ~ И^(гс, Σ). Prove that cov(vec(5)) = п(/и + Κρρ)(Σ <g> Σ), and cov(vecp(5)) = 2ηΒ^(Σ <g> Σ)£ρ where the matrices Kw and Bp are defined in Section 1.2. 3.20. Let 5 = TT" ~ W3(n, /3), where Τ = (Uj) is a lower triangular matrix with positive diagonal elements. Prove that (ii)£l^) = ("-2)(n-3)(n-4)'n>4 (iii) £ А31-*з2^2Л2 = — ι— 4 V V *?i*3s У (n-3)(n-4)2' ^j_ j_\ = (n -1) t\Ai t\2) (n-2)(n-4)(n-5)' N£br + j = ,. „w. , ,_ ^>">5 ^2l(^31 ~ ^32^22 ^21) _ ^32 \ _ ^ — 3n — 2 *11*22*33 _ *22W ~ (П - 2)(l» - 3)(l» - 4)2(n - 5)'
130 CHAPTER 3. WISHART DISTRIBUTION and hence, show that for η > 6, (ιέ* ° ° E(T'T)-2 = a n2—3n—2 a U (n-2)(n-4)2(n-5) U (") *5p ~ ^Χη-,Η-l. Where Σ_1 = (*°")> a A (n-l)(n2-3n-6) V υ пи»-*) / 3.21. Let 5 ~ Wp(n,Ip) and 5 = TV, where Τ is a lower triangular matrix with positive diagonal elements. Further, let Q = E{VDT)~2, where D is a diagonal matrix with elements ±1. Then (i) show that Q is a diagonal matrix, and (ii) find a recurrence relation between the diagonal elements of Q. (Krishnamoorthy and Gupta, 1989) 3.22. Let S ~ Wp(n, Σ) and S = TV, where Τ is a lower triangular matrix with positive diagonal elements. Show that (i) ^ = ^, where S"1^), σϊ>Ρ' (iii) when Σ = Jp, E{VAT)~l = (t^·), where w^ = β&φ^ г φ j\ wu = (£?Sj, "« = ^b^E^^ + fe], i = 2,...,p, A"1 = Б = (by), and ft-V2r[i|n..+1)>t = l,-,P- 3.23. Prove the results (i) and (ii) in Problem 3.21, where Τ is an upper triangular matrix. 3.24. Let S ~ Wp(7i,E) and S = TV, where Τ is an upper triangular matrix with positive diagonal elements. Show that (i) t2u = —, where S~l = (sij). $11 (ii) t2n ~ ^xLp+υ where Σ-1 = (**). 3.25. Let r be the sample correlation coefficient from a sample of size η + 1 from a bivariate normal population. Assuming that the population correlation coefficient r is different from zero, show that the p.d.f. of r is given by |^<1V)hi-r>)><"-»>(1-rt-"M4i;,. + i;i(i+^). 3.26. Let R be the correlation matrix of a random sample of size η+1 from Νρ(μ, Ip). Then, prove that
PROBLEMS 131 3.27. Let S ~ Wp(7i,E) and a priori Σ ~ JWp(ra,#). Show that given S, the posterior distribution of Σ is IWp(n + m, 5 + Φ). 3.28. Let Χι ~ Νρ(μ^Σ), г = l,...,iV be independently distributed. Prove that under suitable transformation 5 = ΣίΙι(®ι — ж)(ж; — ж)', where ж = jj Σ?=ι &% can be represented as S = Eil^1 Уг2/'г witn 2/г ~ Np(i/U Σ), г = 1,..., N - 1 independent. 3.29. Let X ~ ΛΓρ?η(Μ, Σ®/η), Μ = ( mi *n* mn J where тщ,..., ran are scalars. Derive the p.d.f. of XX'. 3.30. Let 5 ~ ν^ρ(η,Σ,θ) and α (ρ χ 1) be a vector of constants. Then prove that a'Sa ~ (α'Σα)^(λ), where λ = 2££g». 3.31. Let 5 ~ Wp(n, Σ, Θ). Prove that the characteristic function of tr(5) is 0tr(S)(i) = det(/p - 2αΣ)~ϊη exp[itti{QZ(Ip - 2^Σ)-1}]. 3.32. Let S ~ ν^ρ(η,Σ). Then show that EiS-1 (8) S"1) = d^"1 (8) Σ"1) + c2 vec^Xvec^-1))' + ο2Κρρ(Σ~ι (8) Σ"1) and E(S~l (g) 5"1 (g) S~l) = ο3θι(Σ-1 (g) Σ"1 <g> Σ"1) + (c4ci + c3c2) vec(Σ-1)(vec(Σ-1)), <g> Σ"1 + (c3c2 + ο4ο2)Ρι(Σ-1 (g) Σ"1 <g> Σ"1) + C3C2P2 vec(Σ-1)(vec(Σ-1)), <g> Σ^Ρ, + ο3ο2Ρ2Ρι(Σ-1 (g) Σ"1 (g> Σ"1)Ρ2 + (c4ci - ο3ο2)Ρ2(Σ-1 (g) Σ"1 (g) Σ"1) + c4c2P2^i(S-1 (g) Σ"1 <g> Σ"1) + 2c4c2 vec(Σ-1)(vec(Σ-1)), <g> Σ~ιΡ2Ρ1 - (c3c2 + ο4ο2)Σ-1 (g) (vec^-^vec^-1))') - c4c2PiP2 vec^-^vec^-1))' (g) Σ"1, where Pi = Kpp (g> /p, P2 = Ip <g> i^, and cb c2, c3 and c4 are defined in Theorem 3.3.17. 3.33. Let S ~ Wp(n, Σ). Then show that E(S~l <g> S) = Σ"1 (g) Σ η — ρ — 1 -(vec(7p)(vec(/p))' + K„), η - ρ - 1 > 0. η—ρ
CHAPTER 4

MATRIX VARIATE t-DISTRIBUTION

4.1. INTRODUCTION

Let $x$ and $v$ be independent random variables distributed as standard normal and chi-square with $n$ degrees of freedom, respectively. Then, the random variable
$$t = \left(\frac{v}{n}\right)^{-\frac12} x$$
is said to have a t-distribution with $n$ degrees of freedom. In the multivariate case, $x$ is replaced by the vector $\mathbf{x}$, which is distributed as $N_p(0, \Sigma)$, and we define
$$\mathbf{t} = \left(\frac{v}{n}\right)^{-\frac12} \mathbf{x}, \qquad (4.1.1)$$
which is distributed as multivariate t with parameters $n$ and $\Sigma$. The density of $\mathbf{t}$ is given by
$$\frac{\Gamma[\frac12(n+p)]}{(n\pi)^{\frac12 p}\,\Gamma(\frac12 n)}\det(\Sigma)^{-\frac12}\left(1+\frac1n\,\mathbf{t}'\Sigma^{-1}\mathbf{t}\right)^{-\frac12(n+p)},\quad \mathbf{t}\in\mathbb{R}^p. \qquad (4.1.2)$$
It is also known that $\mathbf{t}$ has the representation
$$\mathbf{t} = (S^{-\frac12})'\mathbf{y}, \qquad (4.1.3)$$
where $S = S^{\frac12}(S^{\frac12})' \sim W_p(n+p-1, \Sigma^{-1})$ and $\mathbf{y} \sim N_p(0, nI_p)$ are independent.

In this chapter the matrix variate generalization of (4.1.2) is studied. Because of its applications in Bayesian inference, many researchers have studied this distribution, e.g., Khatri (1959a), Kshirsagar (1961a), Tan (1964), Tiao and Zellner (1964), Geisser (1965), Dickey (1967, 1976), Juritz (1973), Rinco (1973), Haq and Rinco (1976), Marx (1981), Marx and Nel (1982), Javier (1982), Javier and Gupta (1985b), and Phillips (1985).
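As a small numerical illustration of the multivariate t density (4.1.2), the following sketch evaluates its logarithm. The function name `multivariate_t_logpdf` and the test values are illustrative choices, not part of the text; the only check made is that the case $p = 1$, $n = 1$ reduces to the standard Cauchy density.

```python
import math
import numpy as np

def multivariate_t_logpdf(t, n, Sigma):
    """Log of the density (4.1.2) at the point t (a length-p vector)."""
    t = np.asarray(t, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    p = t.shape[0]
    _, logdet = np.linalg.slogdet(Sigma)           # log det(Sigma)
    quad = float(t @ np.linalg.solve(Sigma, t))    # t' Sigma^{-1} t
    log_const = (math.lgamma(0.5 * (n + p))
                 - 0.5 * p * math.log(n * math.pi)
                 - math.lgamma(0.5 * n))
    return log_const - 0.5 * logdet - 0.5 * (n + p) * math.log1p(quad / n)

# p = 1, n = 1 is the standard Cauchy density, which equals 1/pi at the origin.
cauchy_at_zero = math.exp(multivariate_t_logpdf([0.0], 1, [[1.0]]))
```

Working on the log scale avoids overflow in the gamma functions when $n$ or $p$ is large.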
4.2. DENSITY FUNCTION

The matrix variate t-distribution is defined as follows.

DEFINITION 4.2.1. The random matrix $T\,(p\times m)$ is said to have a matrix variate t-distribution with parameters $M$, $\Sigma$, $\Omega$, and $n$ if its p.d.f. is given by
$$\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(\Sigma)^{-\frac12 m}\det(\Omega)^{-\frac12 p}\det\!\big(I_p+\Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\big)^{-\frac12(n+m+p-1)}, \qquad (4.2.1)$$
where $T\in\mathbb{R}^{p\times m}$, $M\in\mathbb{R}^{p\times m}$, $\Omega\,(m\times m)>0$, $\Sigma\,(p\times p)>0$ and $n>0$. We shall denote this by $T\sim T_{p,m}(n,M,\Sigma,\Omega)$.

Dickey, Dawid and Kadane (1986) call the matrices $\Omega$ and $\Sigma$ the spread matrices and $n$ the degrees of freedom. This distribution belongs to the class of matrix variate elliptically contoured distributions studied in Chapter 9. In particular, for $M=0$, this distribution belongs to (i) the class of right spherical distributions if $\Omega=I_m$, (ii) the class of left spherical distributions if $\Sigma=I_p$, and (iii) the class of spherical distributions if $\Omega=I_m$ and $\Sigma=I_p$. When $n=1$, this distribution may be called the matrix variate Cauchy. When $m=1$ or $p=1$ it reduces to a multivariate t-distribution (Cornish, 1954, 1955, 1962; Dunnett and Sobel, 1954; Lin, 1972). More specifically, when $m=1$, $T=\mathbf{t}\,(p\times1)$, $M=\boldsymbol\mu\,(p\times1)$, $\Omega=\omega$ and (4.2.1) becomes
$$\frac{\Gamma[\frac12(n+p)]}{\pi^{\frac12 p}\,\Gamma(\frac12 n)}\det(\Sigma)^{-\frac12}\omega^{-\frac12 p}\left(1+\frac1\omega(\mathbf{t}-\boldsymbol\mu)'\Sigma^{-1}(\mathbf{t}-\boldsymbol\mu)\right)^{-\frac12(n+p)},\quad \mathbf{t}\in\mathbb{R}^p,$$
which will be denoted by $\mathbf{t}\sim t_p(n,\omega,\boldsymbol\mu,\Sigma)$. For $p=1$, by taking $M=\boldsymbol\nu'$ and $\Sigma=\sigma$, it is easily seen that $T'=\mathbf{t}\sim t_m(n,\sigma,\boldsymbol\nu,\Omega)$.

This distribution can be derived in a manner similar to the univariate theory, as shown in the following theorem.

THEOREM 4.2.1. Let $S\sim W_p(n+p-1,\Sigma^{-1})$, independent of $X\sim N_{p,m}(0,I_p\otimes\Omega)$. Define
$$T=(S^{-\frac12})'X+M, \qquad (4.2.2)$$
where $M\,(p\times m)$ is a constant matrix, and $S^{\frac12}(S^{\frac12})'=S$. Then, $T\sim T_{p,m}(n,M,\Sigma,\Omega)$.

Proof: The joint density of $S$ and $X$ is given by
$$\frac{\det(\Sigma)^{\frac12(n+p-1)}\det(\Omega)^{-\frac12 p}}{2^{\frac12(n+m+p-1)p}\,\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(S)^{\frac12(n-2)}\,\mathrm{etr}\Big(-\frac12\Sigma S-\frac12 X\Omega^{-1}X'\Big),\quad S>0,\ X\in\mathbb{R}^{p\times m}.$$
Now, let $T=(S^{-\frac12})'X+M$. The Jacobian of the transformation is $J(X\to T)=\det(S)^{\frac12 m}$.
Substituting for $X$ in terms of $T$ in the joint density of $X$ and $S$, and
multiplying the resulting expression by $J(X\to T)$, we get the joint p.d.f. of $T$ and $S$ as
$$\frac{\det(\Sigma)^{\frac12(n+p-1)}\det(\Omega)^{-\frac12 p}}{2^{\frac12(n+m+p-1)p}\,\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(S)^{\frac12(n+m-2)}\,\mathrm{etr}\Big[-\frac12 S\{\Sigma+(T-M)\Omega^{-1}(T-M)'\}\Big],\quad S>0,\ T\in\mathbb{R}^{p\times m}.$$
Now, integrating out $S$ using the multivariate gamma integral (1.4.6), the density of $T$ is obtained as
$$\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(\Sigma)^{-\frac12 m}\det(\Omega)^{-\frac12 p}\det\!\big(I_p+\Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\big)^{-\frac12(n+m+p-1)},\quad T\in\mathbb{R}^{p\times m}.\ \blacksquare$$

The above result was proved by Dickey (1967). Another representation of $T$ when $\Sigma$ and $\Omega$ are symmetric nonnegative definite matrices is given by Dickey, Dawid and Kadane (1986).

4.3. PROPERTIES

In this section, various properties of the random matrix $T$ are studied using its p.d.f. and the representation (4.2.2). First, we derive expected values of the random matrix $T$ and some of its functions.

THEOREM 4.3.1. Let $T\sim T_{p,m}(n,M,\Sigma,\Omega)$, then $E(T)=M$ and
$$\mathrm{cov}(\mathrm{vec}(T'))=\frac{1}{n-2}\,\Sigma\otimes\Omega,\quad n>2.$$

Proof: According to Theorem 4.2.1 the random matrix $T$ can be represented as
$$T=(S^{-\frac12})'X+M, \qquad (4.3.1)$$
where $S^{\frac12}(S^{\frac12})'\sim W_p(n+p-1,\Sigma^{-1})$ and $X\sim N_{p,m}(0,I_p\otimes\Omega)$ are independent. From (4.3.1), it is seen that $T\mid S\sim N_{p,m}(M,S^{-1}\otimes\Omega)$ and therefore
$$E(T\mid S)=M \qquad (4.3.2)$$
and
$$\mathrm{cov}(\mathrm{vec}(T')\mid S)=S^{-1}\otimes\Omega. \qquad (4.3.3)$$
Now, from (4.3.2), we have $E(T)=M$ since the conditional expectation does not depend on $S$. Also, from (4.3.3) we have
$$\mathrm{cov}(\mathrm{vec}(T'))=E_S\{\mathrm{cov}(\mathrm{vec}(T')\mid S)\}=E_S(S^{-1}\otimes\Omega)=(n-2)^{-1}\,\Sigma\otimes\Omega.$$
The last step follows from Theorem 3.3.16. ■

THEOREM 4.3.2. Let $T\sim T_{p,m}(n,M,\Sigma,\Omega)$, then

(i) $E(TCT)=(n-2)^{-1}\Sigma C'\Omega+MCM$, $\ C\,(m\times p)$,

(ii) $E(TCT')=(n-2)^{-1}\,\mathrm{tr}(C'\Omega)\Sigma+MCM'$, $\ C\,(m\times m)$,

(iii) $E(TCT'DT)=(n-2)^{-1}\Sigma D'MC'\Omega+(n-2)^{-1}\,\mathrm{tr}(D\Sigma)MC\Omega+(n-2)^{-1}\,\mathrm{tr}(C'\Omega)\Sigma DM+MCM'DM$, $\ C\,(m\times m)$, $D\,(p\times p)$,

(iv) $E(TCTDT)=(n-2)^{-1}\Sigma D'M'C'\Omega+(n-2)^{-1}MC\Sigma D'\Omega+(n-2)^{-1}\Sigma C'\Omega DM+MCMDM$, $\ C\,(m\times p)$, $D\,(m\times p)$,

where $n>2$.

Proof: The representation (4.3.1) yields
$$\begin{aligned}E(TCT)&=E_S\big[E_X\{((S^{-\frac12})'X+M)C((S^{-\frac12})'X+M)\mid S\}\big]\\&=E_S\big[E_X\{((S^{-\frac12})'XC(S^{-\frac12})'X+(S^{-\frac12})'XCM+MC(S^{-\frac12})'X+MCM)\mid S\}\big]\\&=E_S\big[(S^{-\frac12})'S^{-\frac12}C'\Omega+MCM\big],\end{aligned}$$
where the last equality follows from Theorem 2.3.5, since $X\sim N_{p,m}(0,I_p\otimes\Omega)$. Now, using Theorem 3.3.16, the result is easily obtained. The derivation of $E(TCT')$ is similar. Also,
$$\begin{aligned}E(TCTDT)&=E_S\big[E_X\{((S^{-\frac12})'X+M)C((S^{-\frac12})'X+M)D((S^{-\frac12})'X+M)\mid S\}\big]\\&=E_S\big[E_X\{((S^{-\frac12})'XC(S^{-\frac12})'XD(S^{-\frac12})'X+(S^{-\frac12})'XCMD(S^{-\frac12})'X\\&\qquad+MC(S^{-\frac12})'XD(S^{-\frac12})'X+MCMD(S^{-\frac12})'X+(S^{-\frac12})'XC(S^{-\frac12})'XDM\\&\qquad+(S^{-\frac12})'XCMDM+MC(S^{-\frac12})'XDM+MCMDM)\mid S\}\big]\\&=E_S\big[(S^{-\frac12})'S^{-\frac12}D'M'C'\Omega+MC(S^{-\frac12})'S^{-\frac12}D'\Omega+(S^{-\frac12})'S^{-\frac12}C'\Omega DM+MCMDM\big],\end{aligned}$$
where the last equality has been obtained from Theorems 2.3.5 and 2.3.6. The desired result now follows from Theorem 3.3.16. The derivation of $E(TCT'DT)$ is similar. ■
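The representation (4.2.2) also gives a direct way to simulate $T$, and Theorem 4.3.1 can then be checked by Monte Carlo. The sketch below is only an illustration under stated assumptions: $n$ is taken to be a positive integer so that the Wishart matrix can be built from Gaussian columns, the Cholesky factor of $S$ is used as $S^{\frac12}$, and the function name `sample_matrix_t`, the seed and the parameter values are arbitrary choices.

```python
import numpy as np

def sample_matrix_t(rng, size, n, M, Sigma, Omega):
    """Draw `size` matrices T = (S^{-1/2})' X + M as in (4.2.2); integer n >= 1 assumed."""
    M = np.asarray(M, dtype=float)
    p, m = M.shape
    nu = n + p - 1                                      # Wishart degrees of freedom
    L_inv = np.linalg.cholesky(np.linalg.inv(Sigma))    # Sigma^{-1} = L_inv L_inv'
    L_om = np.linalg.cholesky(Omega)                    # Omega = L_om L_om'
    G = L_inv @ rng.standard_normal((size, p, nu))      # columns ~ N_p(0, Sigma^{-1})
    S = G @ np.swapaxes(G, 1, 2)                        # S ~ W_p(nu, Sigma^{-1})
    C = np.linalg.cholesky(S)                           # S = C C', so C plays the role of S^{1/2}
    X = rng.standard_normal((size, p, m)) @ L_om.T      # rows of X ~ N_m(0, Omega)
    Y = np.linalg.solve(np.swapaxes(C, 1, 2), X)        # (C')^{-1} X = (S^{-1/2})' X
    return Y + M

rng = np.random.default_rng(0)
n = 10
M = np.array([[1.0, -1.0], [0.5, 2.0]])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Omega = np.array([[1.0, 0.3], [0.3, 2.0]])

T = sample_matrix_t(rng, 40000, n, M, Sigma, Omega)
flat = T.reshape(len(T), -1)              # row-major flattening gives vec(T')
mean_hat = flat.mean(axis=0).reshape(M.shape)
cov_hat = np.cov(flat, rowvar=False)
cov_theory = np.kron(Sigma, Omega) / (n - 2)   # Theorem 4.3.1
```

With these settings the sample mean and covariance should agree with $M$ and $(n-2)^{-1}\Sigma\otimes\Omega$ up to Monte Carlo error; for small $n$ the heavy tails make the sample covariance converge slowly.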
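THEOREM 4.3.3. If $T\sim T_{p,m}(n,M,\Sigma,\Omega)$, then $T'\sim T_{m,p}(n,M',\Omega,\Sigma)$.

Proof: The result follows by noting that
$$f(T)\propto\det\!\big(I_p+\Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\big)^{-\frac12(n+m+p-1)}=\det\!\big(I_m+\Omega^{-1}(T'-M')\Sigma^{-1}(T'-M')'\big)^{-\frac12(n+p+m-1)}.\ \blacksquare$$

It may be noted that the matrix variate t-distribution is a mixture of matrix variate normal distributions, and the matrix variate normal distribution is, itself, a limiting case of the matrix variate t-distribution, as shown below.

THEOREM 4.3.4. Let $T\sim T_{p,m}(n,M,n\Sigma,\Omega)$; then $T\xrightarrow{d}X$ as $n\to\infty$, where $X\sim N_{p,m}(M,\Sigma\otimes\Omega)$ and "$\xrightarrow{d}$" denotes convergence in distribution.

Proof: The p.d.f. of $T$ is
$$f(T)=\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(n\Sigma)^{-\frac12 m}\det(\Omega)^{-\frac12 p}\det\!\Big(I_p+\frac1n\Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\Big)^{-\frac12(n+m+p-1)}.$$
Now, since $\lim_{n\to\infty}\det(I_p+\frac1nA)^{-\frac12(n+m+p-1)}=\mathrm{etr}(-\frac12A)$, where $A=\Sigma^{-1}(T-M)\Omega^{-1}(T-M)'$, and
$$\lim_{n\to\infty}\frac{n^{-\frac12 mp}\,\Gamma_p[\frac12(n+m+p-1)]}{\Gamma_p[\frac12(n+p-1)]}=\Big(\frac12\Big)^{\frac12 mp},$$
we have
$$\lim_{n\to\infty}f(T)=(2\pi)^{-\frac12 mp}\det(\Sigma)^{-\frac12 m}\det(\Omega)^{-\frac12 p}\,\mathrm{etr}\Big\{-\frac12\Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\Big\},\quad T\in\mathbb{R}^{p\times m}.\ \blacksquare$$

In the next three theorems, we will derive distributions of certain linear transformations of the matrix $T$. Some of these results were derived by Tan (1969a).

THEOREM 4.3.5. Let $T\sim T_{p,m}(n,M,\Sigma,\Omega)$, and let $A\,(p\times p)$ and $B\,(m\times m)$ be nonsingular matrices; then $ATB\sim T_{p,m}(n,AMB,A\Sigma A',B'\Omega B)$.

Proof: Transforming $W=ATB$, with the Jacobian of transformation $J(T\to W)=\det(A)^{-m}\det(B)^{-p}$, from the density (4.2.1) of $T$ we get the density of $W$ as
$$\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(A\Sigma A')^{-\frac12 m}\det(B'\Omega B)^{-\frac12 p}\det\!\big(I_p+(A\Sigma A')^{-1}(W-AMB)(B'\Omega B)^{-1}(W-AMB)'\big)^{-\frac12(n+m+p-1)},\quad W\in\mathbb{R}^{p\times m},$$
and, hence, the result. ■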
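Theorems 4.3.3 and 4.3.5 can be checked numerically from the density (4.2.1). The following sketch is one way to do so; the helper names `log_multigamma` and `matrix_t_logpdf` and all numerical values are illustrative assumptions. It compares the log-density of $T$ with that of $T'$ under swapped parameters, and with that of $ATB$ after accounting for the Jacobian $|\det A|^{-m}|\det B|^{-p}$.

```python
import math
import numpy as np

def log_multigamma(a, p):
    """log Gamma_p(a), the multivariate gamma function."""
    return 0.25 * p * (p - 1) * math.log(math.pi) + sum(
        math.lgamma(a - 0.5 * i) for i in range(p))

def matrix_t_logpdf(T, n, M, Sigma, Omega):
    """Log of the p.d.f. (4.2.1) of T_{p,m}(n, M, Sigma, Omega) evaluated at T."""
    T, M, Sigma, Omega = (np.asarray(a, dtype=float) for a in (T, M, Sigma, Omega))
    p, m = T.shape
    D = T - M
    Q = np.linalg.solve(Sigma, D) @ np.linalg.solve(Omega, D.T)  # Sigma^{-1} D Omega^{-1} D'
    a = 0.5 * (n + m + p - 1)
    return (log_multigamma(a, p) - log_multigamma(0.5 * (n + p - 1), p)
            - 0.5 * m * p * math.log(math.pi)
            - 0.5 * m * np.linalg.slogdet(Sigma)[1]
            - 0.5 * p * np.linalg.slogdet(Omega)[1]
            - a * np.linalg.slogdet(np.eye(p) + Q)[1])

rng = np.random.default_rng(1)
p, m, n = 2, 3, 4.5
T = rng.standard_normal((p, m))
M = rng.standard_normal((p, m))
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Omega = np.array([[1.0, 0.3, 0.0], [0.3, 2.0, 0.4], [0.0, 0.4, 1.5]])
A = np.array([[1.0, 2.0], [0.0, 3.0]])                              # nonsingular p x p
B = np.array([[2.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])   # nonsingular m x m

log_f = matrix_t_logpdf(T, n, M, Sigma, Omega)
# Theorem 4.3.3: T' ~ T_{m,p}(n, M', Omega, Sigma) has the same density value.
log_f_transposed = matrix_t_logpdf(T.T, n, M.T, Omega, Sigma)
# Theorem 4.3.5: density of W = ATB equals f(T) times the Jacobian J(T -> W).
log_jacobian = -m * np.linalg.slogdet(A)[1] - p * np.linalg.slogdet(B)[1]
log_f_transformed = matrix_t_logpdf(A @ T @ B, n, A @ M @ B,
                                    A @ Sigma @ A.T, B.T @ Omega @ B)
```

The agreement of `log_f` and `log_f_transposed` also confirms numerically that the normalizing constant of (4.2.1) is symmetric in $p$ and $m$.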
COROLLARY 4.3.5.1. In the above theorem, (i) if $A=\Sigma^{-\frac12}$, then $\Sigma^{-\frac12}TB\sim T_{p,m}(n,\Sigma^{-\frac12}MB,I_p,B'\Omega B)$, and (ii) if $B=\Omega^{-\frac12}$, then $AT\Omega^{-\frac12}\sim T_{p,m}(n,AM\Omega^{-\frac12},A\Sigma A',I_m)$.

THEOREM 4.3.6. Let $T\sim T_{p,m}(n,M,\Sigma,\Omega)$ and $B\,(m\times r)$ be a matrix of rank $r\le m$. Then, $TB\sim T_{p,r}(n,MB,\Sigma,B'\Omega B)$.

Proof: According to Theorem 4.2.1, $T$ can be represented as
$$T=(S^{-\frac12})'X+M, \qquad (4.3.4)$$
where $S^{\frac12}(S^{\frac12})'\sim W_p(n+p-1,\Sigma^{-1})$ and $X\sim N_{p,m}(0,I_p\otimes\Omega)$ are independently distributed. Postmultiplying (4.3.4) by the matrix $B$, we get the representation for $TB$ as
$$TB=(S^{-\frac12})'(XB)+MB,$$
where, from Theorem 2.3.10, $XB\sim N_{p,r}(0,I_p\otimes(B'\Omega B))$. Hence it follows, from Theorem 4.2.1, that $TB\sim T_{p,r}(n,MB,\Sigma,B'\Omega B)$. ■

THEOREM 4.3.7. Let $T\sim T_{p,m}(n,M,\Sigma,\Omega)$ and $A\,(s\times p)$ be a matrix of rank $s\le p$. Then, $AT\sim T_{s,m}(n,AM,A\Sigma A',\Omega)$.

Proof: Let $Y=AT$; then $Y'=T'A'$. From Theorem 4.3.3, we have $T'\sim T_{m,p}(n,M',\Omega,\Sigma)$ and from Theorem 4.3.6, we get $Y'=T'A'\sim T_{m,s}(n,M'A',\Omega,A\Sigma A')$. Now, using Theorem 4.3.3 again, we get $Y=AT\sim T_{s,m}(n,AM,A\Sigma A',\Omega)$. ■

Combining the above two results, we get the following theorem.

THEOREM 4.3.8. Let $T\sim T_{p,m}(n,M,\Sigma,\Omega)$ and $A\,(s\times p)$, $B\,(m\times r)$ be constant matrices of ranks $s\,(\le p)$ and $r\,(\le m)$, respectively. Then $ATB\sim T_{s,r}(n,AMB,A\Sigma A',B'\Omega B)$.

Proof: Let $W=AY$ and $Y=TB$. From Theorem 4.3.6, $Y\sim T_{p,r}(n,MB,\Sigma,B'\Omega B)$ and from Theorem 4.3.7, $W=ATB\sim T_{s,r}(n,AMB,A\Sigma A',B'\Omega B)$. ■

The marginal and conditional distributions for column (row) partitions of $T$ were derived by Dickey (1967) and are presented below (see also Box and Tiao, 1973).

THEOREM 4.3.9. Let $T\sim T_{p,m}(n,M,\Sigma,\Omega)$ and partition $T$, $M$, $\Sigma$, and $\Omega$ as
$$T=\begin{pmatrix}T_{1r}\\T_{2r}\end{pmatrix}=\begin{pmatrix}T_{1c}&T_{2c}\end{pmatrix},\qquad M=\begin{pmatrix}M_{1r}\\M_{2r}\end{pmatrix}=\begin{pmatrix}M_{1c}&M_{2c}\end{pmatrix},$$
where $T_{1r}$ and $M_{1r}$ have $p_1$ rows, $T_{2r}$ and $M_{2r}$ have $p_2$ rows, $T_{1c}$ and $M_{1c}$ have $m_1$ columns, and $T_{2c}$ and $M_{2c}$ have $m_2$ columns,
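$$\Sigma=\begin{pmatrix}\Sigma_{11}&\Sigma_{12}\\\Sigma_{21}&\Sigma_{22}\end{pmatrix},\quad\text{and}\quad\Omega=\begin{pmatrix}\Omega_{11}&\Omega_{12}\\\Omega_{21}&\Omega_{22}\end{pmatrix},$$
where $\Sigma_{11}$ is $p_1\times p_1$, $\Sigma_{22}$ is $p_2\times p_2$, $\Omega_{11}$ is $m_1\times m_1$ and $\Omega_{22}$ is $m_2\times m_2$. Then,

(i) $T_{2r}\sim T_{p_2,m}(n,M_{2r},\Sigma_{22},\Omega)$,
$$T_{1r}\mid T_{2r}\sim T_{p_1,m}\big(n+p_2,\ M_{1r}+\Sigma_{12}\Sigma_{22}^{-1}(T_{2r}-M_{2r}),\ \Sigma_{11\cdot2},\ \Omega(I_m+\Omega^{-1}(T_{2r}-M_{2r})'\Sigma_{22}^{-1}(T_{2r}-M_{2r}))\big),$$
and

(ii) $T_{2c}\sim T_{p,m_2}(n,M_{2c},\Sigma,\Omega_{22})$,
$$T_{1c}\mid T_{2c}\sim T_{p,m_1}\big(n+m_2,\ M_{1c}+(T_{2c}-M_{2c})\Omega_{22}^{-1}\Omega_{21},\ \Sigma(I_p+\Sigma^{-1}(T_{2c}-M_{2c})\Omega_{22}^{-1}(T_{2c}-M_{2c})'),\ \Omega_{11\cdot2}\big).$$

Proof: (i) From (4.2.1), the density of $T$ is
$$\begin{aligned}f(T)&=\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(\Sigma)^{-\frac12 m}\det(\Omega)^{-\frac12 p}\det\!\big(I_p+\Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\big)^{-\frac12(n+m+p-1)}\\&=\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(\Sigma)^{-\frac12 m}\det(\Omega)^{-\frac12 p}\det\!\big(I_m+\Omega^{-1}(T-M)'\Sigma^{-1}(T-M)\big)^{-\frac12(n+m+p-1)}. \qquad (4.3.5)\end{aligned}$$
Now, the quadratic form $(T-M)'\Sigma^{-1}(T-M)$ can be written as
$$\begin{aligned}(T-M)'\Sigma^{-1}(T-M)&=\big(T_{1r}-M_{1r}-\Sigma_{12}\Sigma_{22}^{-1}(T_{2r}-M_{2r})\big)'\Sigma_{11\cdot2}^{-1}\big(T_{1r}-M_{1r}-\Sigma_{12}\Sigma_{22}^{-1}(T_{2r}-M_{2r})\big)\\&\quad+(T_{2r}-M_{2r})'\Sigma_{22}^{-1}(T_{2r}-M_{2r}). \qquad (4.3.6)\end{aligned}$$
Substituting (4.3.6) in (4.3.5) and noting that $\det(\Sigma)=\det(\Sigma_{22})\det(\Sigma_{11\cdot2})$, we can factorize the density of $T$ as
$$f(T)=f_1(T_{2r})\,f_2(T_{1r}\mid T_{2r}),$$
where
$$f_1(T_{2r})=\frac{\Gamma_{p_2}[\frac12(n+m+p_2-1)]}{\pi^{\frac12 mp_2}\,\Gamma_{p_2}[\frac12(n+p_2-1)]}\det(\Sigma_{22})^{-\frac12 m}\det(\Omega)^{-\frac12 p_2}\det\!\big(I_m+\Omega^{-1}(T_{2r}-M_{2r})'\Sigma_{22}^{-1}(T_{2r}-M_{2r})\big)^{-\frac12(n+m+p_2-1)} \qquad (4.3.7)$$
and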
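$$\begin{aligned}f_2(T_{1r}\mid T_{2r})&=\frac{\Gamma_{p_1}[\frac12(n+m+p-1)]}{\pi^{\frac12 mp_1}\,\Gamma_{p_1}[\frac12(n+p-1)]}\det\!\big(I_m+\Omega^{-1}(T_{2r}-M_{2r})'\Sigma_{22}^{-1}(T_{2r}-M_{2r})\big)^{-\frac12 p_1}\det(\Omega)^{-\frac12 p_1}\det(\Sigma_{11\cdot2})^{-\frac12 m}\\&\quad\times\det\!\Big(I_m+\big(I_m+\Omega^{-1}(T_{2r}-M_{2r})'\Sigma_{22}^{-1}(T_{2r}-M_{2r})\big)^{-1}\Omega^{-1}\big(T_{1r}-M_{1r}-\Sigma_{12}\Sigma_{22}^{-1}(T_{2r}-M_{2r})\big)'\\&\qquad\qquad\Sigma_{11\cdot2}^{-1}\big(T_{1r}-M_{1r}-\Sigma_{12}\Sigma_{22}^{-1}(T_{2r}-M_{2r})\big)\Big)^{-\frac12(n+m+p-1)}. \qquad (4.3.8)\end{aligned}$$
From (4.3.7) and (4.3.8), it follows that the marginal distribution of $T_{2r}$ is $T_{p_2,m}(n,M_{2r},\Sigma_{22},\Omega)$ and the conditional distribution of $T_{1r}$ given $T_{2r}$ is $T_{p_1,m}(n+p_2,\ M_{1r}+\Sigma_{12}\Sigma_{22}^{-1}(T_{2r}-M_{2r}),\ \Sigma_{11\cdot2},\ \Omega(I_m+\Omega^{-1}(T_{2r}-M_{2r})'\Sigma_{22}^{-1}(T_{2r}-M_{2r})))$, respectively.

(ii) From Theorem 4.3.3,
$$T'=\begin{pmatrix}T_{1c}'\\T_{2c}'\end{pmatrix}\sim T_{m,p}(n,M',\Omega,\Sigma),$$
and from part (i), $T_{2c}'\sim T_{m_2,p}(n,M_{2c}',\Omega_{22},\Sigma)$ and
$$T_{1c}'\mid T_{2c}'\sim T_{m_1,p}\big(n+m_2,\ M_{1c}'+\Omega_{12}\Omega_{22}^{-1}(T_{2c}-M_{2c})',\ \Omega_{11\cdot2},\ \Sigma(I_p+\Sigma^{-1}(T_{2c}-M_{2c})\Omega_{22}^{-1}(T_{2c}-M_{2c})')\big).$$
Now, the distributions of $T_{2c}$ and $T_{1c}\mid T_{2c}$ are obtained using Theorem 4.3.3. ■

From the above theorem, the matrix variate t-density can be written as the product of multivariate t-densities. Setting $m_1=1$ and $m_2=m-1$, $T_{1c}=\mathbf{t}_1$, and $T_{2c}=(\mathbf{t}_2,\ldots,\mathbf{t}_m)$ in (ii), we get
$$\mathbf{t}_1\mid T_{2c}\sim T_{p,1}\big(n+m-1,\ M_{1c}+(T_{2c}-M_{2c})\Omega_{22}^{-1}\Omega_{21},\ \Sigma(I_p+\Sigma^{-1}(T_{2c}-M_{2c})\Omega_{22}^{-1}(T_{2c}-M_{2c})'),\ \Omega_{11\cdot2}\big),$$
which is the $p$-dimensional multivariate t-density. Next, from the marginal distribution of $T_{2c}$, one can see that $\mathbf{t}_2\mid\mathbf{t}_3,\ldots,\mathbf{t}_m$ is also multivariate t. Repeating this procedure $(m-1)$ times, it is easy to see that the density of $T$ can be expressed as
$$f(T)=f_1(\mathbf{t}_1\mid\mathbf{t}_2,\ldots,\mathbf{t}_m)\,f_2(\mathbf{t}_2\mid\mathbf{t}_3,\ldots,\mathbf{t}_m)\cdots f_m(\mathbf{t}_m),$$
where every density on the right hand side is a $p$-dimensional multivariate t. Similarly, using part (i) of Theorem 4.3.9, it can be proved that the density of $T$ is the product of $p$ $m$-dimensional multivariate t-densities.

It may be noted that while in Chapter 2 the normal matrix $X$ is merely an arrangement of the multivariate normal vector $\mathrm{vec}(X')$, this is not the case with the matrix $T$, as pointed out by Dickey, Dawid, and Kadane (1986). For consider the matrix $T\,(2\times m)=(\mathbf{t}_1^*\ \ \mathbf{t}_2^*)'\sim T_{2,m}(n,0,\Sigma,\Omega)$.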
Then, according to Theorem 4.3.9, the marginal distribution of $\mathbf{t}_1^*$ is multivariate t with $n$ degrees of freedom, and the conditional distribution of $\mathbf{t}_2^*\mid\mathbf{t}_1^*$ will have $n+1$ degrees of freedom. If $T$ were merely an arrangement of the elements of the vector $\mathrm{vec}(T')$, then the distribution of $\mathrm{vec}(T')=\begin{pmatrix}\mathbf{t}_1^*\\\mathbf{t}_2^*\end{pmatrix}$ would be a $2m$-variate t-distribution with $n$ degrees of freedom. In that case, contrary to the above, the distribution of $\mathbf{t}_2^*\mid\mathbf{t}_1^*$ would have $n+m$ degrees of freedom.
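The column factorization in Theorem 4.3.9(ii) can be verified numerically: the joint log-density of $T$ should equal the sum of the marginal log-density of $T_{2c}$ and the conditional log-density of $T_{1c}$ given $T_{2c}$, the latter with $n+m_2$ degrees of freedom. The sketch below assumes the density (4.2.1); the helper names and the numerical values are illustrative only.

```python
import math
import numpy as np

def log_multigamma(a, p):
    """log Gamma_p(a), the multivariate gamma function."""
    return 0.25 * p * (p - 1) * math.log(math.pi) + sum(
        math.lgamma(a - 0.5 * i) for i in range(p))

def matrix_t_logpdf(T, n, M, Sigma, Omega):
    """Log of the p.d.f. (4.2.1) of T_{p,m}(n, M, Sigma, Omega) evaluated at T."""
    T, M, Sigma, Omega = (np.asarray(a, dtype=float) for a in (T, M, Sigma, Omega))
    p, m = T.shape
    D = T - M
    Q = np.linalg.solve(Sigma, D) @ np.linalg.solve(Omega, D.T)
    a = 0.5 * (n + m + p - 1)
    return (log_multigamma(a, p) - log_multigamma(0.5 * (n + p - 1), p)
            - 0.5 * m * p * math.log(math.pi)
            - 0.5 * m * np.linalg.slogdet(Sigma)[1]
            - 0.5 * p * np.linalg.slogdet(Omega)[1]
            - a * np.linalg.slogdet(np.eye(p) + Q)[1])

def split_columns_logpdf(T, n, M, Sigma, Omega, m1):
    """Return (log f(T_2c), log f(T_1c | T_2c)) following Theorem 4.3.9(ii)."""
    T, M, Sigma, Omega = (np.asarray(a, dtype=float) for a in (T, M, Sigma, Omega))
    m2 = T.shape[1] - m1
    D2 = T[:, m1:] - M[:, m1:]                              # T_2c - M_2c
    O11, O12 = Omega[:m1, :m1], Omega[:m1, m1:]
    O21, O22 = Omega[m1:, :m1], Omega[m1:, m1:]
    W = np.linalg.solve(O22, O21)                           # Omega_22^{-1} Omega_21
    log_marginal = matrix_t_logpdf(T[:, m1:], n, M[:, m1:], Sigma, O22)
    location = M[:, :m1] + D2 @ W
    row_spread = Sigma + D2 @ np.linalg.solve(O22, D2.T)    # Sigma(I_p + Sigma^{-1} D2 O22^{-1} D2')
    col_spread = O11 - O12 @ W                              # Omega_{11.2}
    log_conditional = matrix_t_logpdf(T[:, :m1], n + m2, location, row_spread, col_spread)
    return log_marginal, log_conditional

rng = np.random.default_rng(2)
n, m1 = 3.0, 1
T = rng.standard_normal((2, 3))
M = rng.standard_normal((2, 3))
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Omega = np.array([[1.0, 0.3, 0.0], [0.3, 2.0, 0.4], [0.0, 0.4, 1.5]])

log_joint = matrix_t_logpdf(T, n, M, Sigma, Omega)
log_marginal, log_conditional = split_columns_logpdf(T, n, M, Sigma, Omega, m1)
```

Replacing `n + m2` by `n` in the conditional term breaks the identity, which illustrates why $T$ is not simply a rearranged multivariate t vector.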
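In (4.2.1) if $\Omega=I_m$, then the columns of the matrix $T$ are uncorrelated. Further, if $M=\boldsymbol\mu\mathbf{e}'$, where $\boldsymbol\mu\,(p\times1)$ is a constant vector and $\mathbf{e}\,(m\times1)=(1,\ldots,1)'$, then the p.d.f. of $T=(\mathbf{t}_1,\ldots,\mathbf{t}_m)$ is
$$f(\mathbf{t}_1,\ldots,\mathbf{t}_m)=\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(\Sigma)^{-\frac12 m}\det\!\big(I_p+\Sigma^{-1}(T-\boldsymbol\mu\mathbf{e}')(T-\boldsymbol\mu\mathbf{e}')'\big)^{-\frac12(n+m+p-1)}. \qquad (4.3.9)$$
The distribution of $A=\sum_{j=1}^m(\mathbf{t}_j-\bar{\mathbf{t}})(\mathbf{t}_j-\bar{\mathbf{t}})'$, where $\bar{\mathbf{t}}=\frac1m\sum_{j=1}^m\mathbf{t}_j$, is given in the following theorem.

THEOREM 4.3.10. Let $T=(\mathbf{t}_1,\ldots,\mathbf{t}_m)$ be distributed as (4.3.9). Then, the distribution of $A=\sum_{j=1}^m(\mathbf{t}_j-\bar{\mathbf{t}})(\mathbf{t}_j-\bar{\mathbf{t}})'$ is
$$\Big\{\beta_p\big(\tfrac12(m-1),\tfrac12(n+p-1)\big)\Big\}^{-1}\det(\Sigma)^{-\frac12(m-1)}\det(A)^{\frac12(m-p-2)}\det(I_p+\Sigma^{-1}A)^{-\frac12(n+m+p-2)},\quad A>0,$$
and $\sqrt{m}\,\bar{\mathbf{t}}\mid A\sim T_{p,1}(n+m-1,\sqrt{m}\,\boldsymbol\mu,\Sigma+A,1)$.

Proof: Let $H\,(m\times m)=(m^{-\frac12}\mathbf{e}\ \ B)$ be an orthogonal matrix. Transform $Y=TH=(\mathbf{y}_1\ \ Y_2)$, where $\mathbf{y}_1\,(p\times1)=\sqrt{m}\,\bar{\mathbf{t}}$ and $Y_2\,(p\times(m-1))$. Then, from (3.3.7),
$$(T-\boldsymbol\mu\mathbf{e}')(T-\boldsymbol\mu\mathbf{e}')'=m(\bar{\mathbf{t}}-\boldsymbol\mu)(\bar{\mathbf{t}}-\boldsymbol\mu)'+Y_2Y_2'. \qquad (4.3.10)$$
Note that if $\boldsymbol\mu$ is replaced by $\bar{\mathbf{t}}$ in (4.3.10), then we get $A=Y_2Y_2'$. Substitute from (4.3.10) in (4.3.9), together with the Jacobian of transformation $J(T\to\sqrt{m}\,\bar{\mathbf{t}},Y_2)=1$, to get the joint density of $\sqrt{m}\,\bar{\mathbf{t}}$ and $Y_2$ as
$$\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(\Sigma)^{-\frac12 m}\det\!\big(I_p+m\Sigma^{-1}(\bar{\mathbf{t}}-\boldsymbol\mu)(\bar{\mathbf{t}}-\boldsymbol\mu)'+\Sigma^{-1}Y_2Y_2'\big)^{-\frac12(n+m+p-1)},\quad\bar{\mathbf{t}}\in\mathbb{R}^p,\ Y_2\in\mathbb{R}^{p\times(m-1)}.$$
Making use of Theorem 1.4.10, the joint density of $\sqrt{m}\,\bar{\mathbf{t}}$ and $A$ is given by
$$\begin{aligned}&\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(\Sigma)^{-\frac12 m}\int_{Y_2Y_2'=A}\det\!\big(I_p+m\Sigma^{-1}(\bar{\mathbf{t}}-\boldsymbol\mu)(\bar{\mathbf{t}}-\boldsymbol\mu)'+\Sigma^{-1}Y_2Y_2'\big)^{-\frac12(n+m+p-1)}dY_2\\&\quad=\frac{\pi^{-\frac12 p}\,\Gamma_p[\frac12(n+m+p-1)]}{\Gamma_p[\frac12(n+p-1)]\,\Gamma_p[\frac12(m-1)]}\det(\Sigma)^{-\frac12 m}\det(A)^{\frac12(m-p-2)}\det\!\big(I_p+m\Sigma^{-1}(\bar{\mathbf{t}}-\boldsymbol\mu)(\bar{\mathbf{t}}-\boldsymbol\mu)'+\Sigma^{-1}A\big)^{-\frac12(n+m+p-1)}\\&\quad=f_1(A)\,f_2(\sqrt{m}\,\bar{\mathbf{t}}\mid A),\quad\bar{\mathbf{t}}\in\mathbb{R}^p,\ A>0,\end{aligned}$$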
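where
$$f_1(A)=\Big\{\beta_p\big(\tfrac12(m-1),\tfrac12(n+p-1)\big)\Big\}^{-1}\det(\Sigma)^{-\frac12(m-1)}\det(A)^{\frac12(m-p-2)}\det(I_p+\Sigma^{-1}A)^{-\frac12(n+m+p-2)},\quad A>0,$$
and
$$f_2(\sqrt{m}\,\bar{\mathbf{t}}\mid A)=\frac{\Gamma[\frac12(n+m+p-1)]}{\pi^{\frac12 p}\,\Gamma[\frac12(n+m-1)]}\det(\Sigma+A)^{-\frac12}\det\!\big(I_p+m(\Sigma+A)^{-1}(\bar{\mathbf{t}}-\boldsymbol\mu)(\bar{\mathbf{t}}-\boldsymbol\mu)'\big)^{-\frac12(n+m+p-1)},\quad\bar{\mathbf{t}}\in\mathbb{R}^p.\ \blacksquare$$

4.4. INVERTED MATRIX VARIATE t-DISTRIBUTION

In this section, we define the inverted matrix variate t-distribution.

DEFINITION 4.4.1. The random matrix $T\,(p\times m)$ is said to have an inverted matrix variate t-distribution with parameters $M\in\mathbb{R}^{p\times m}$, $\Sigma\,(p\times p)>0$, and $\Omega\,(m\times m)>0$, if its p.d.f. is given by
$$\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(\Sigma)^{-\frac12 m}\det(\Omega)^{-\frac12 p}\det\!\big(I_p-\Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\big)^{\frac12(n-2)},\quad T\in\mathbb{R}^{p\times m}, \qquad (4.4.1)$$
where $I_p-\Sigma^{-1}(T-M)\Omega^{-1}(T-M)'>0$. We shall denote this by $T\sim IT_{p,m}(n,M,\Sigma,\Omega)$.

When $m=1$, $T=\mathbf{t}\,(p\times1)$, $M=\boldsymbol\mu\,(p\times1)$, $\Omega=\omega$ and the density (4.4.1) reduces to
$$\frac{\Gamma[\frac12(n+p)]}{\pi^{\frac12 p}\,\Gamma(\frac12 n)}\det(\Sigma)^{-\frac12}\omega^{-\frac12 p}\Big(1-\frac1\omega(\mathbf{t}-\boldsymbol\mu)'\Sigma^{-1}(\mathbf{t}-\boldsymbol\mu)\Big)^{\frac12 n-1},\quad\mathbf{t}\in\mathbb{R}^p,$$
which is the inverted multivariate t-density and will be denoted by $\mathbf{t}\sim It_p(n,\omega,\boldsymbol\mu,\Sigma)$. When $p=1$, by taking $M=\boldsymbol\nu'$, $\Sigma=\sigma$, it is easily seen that $T'=\mathbf{t}\sim It_m(n,\sigma,\boldsymbol\nu,\Omega)$.

Khatri (1959a) derived the above density, but the following derivation of the inverted matrix variate t-distribution is due to Dickey (1967).

THEOREM 4.4.1. Let $S\sim W_p(n+p-1,I_p)$ and $X\sim N_{p,m}(0,I_p\otimes I_m)$ be independently distributed. For $M\in\mathbb{R}^{p\times m}$, define
$$T=\Sigma^{\frac12}(S+XX')^{-\frac12}X\Omega^{\frac12}+M, \qquad (4.4.2)$$
where $S+XX'=(S+XX')^{\frac12}((S+XX')^{\frac12})'$ and $\Sigma^{\frac12}$ and $\Omega^{\frac12}$ are the symmetric square roots of the positive definite matrices $\Sigma\,(p\times p)$ and $\Omega\,(m\times m)$, respectively. Then, $T\sim IT_{p,m}(n,M,\Sigma,\Omega)$.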
Proof: The joint density of $S$ and $X$ is
$$\frac{\pi^{-\frac12 mp}}{2^{\frac12(n+m+p-1)p}\,\Gamma_p[\frac12(n+p-1)]}\det(S)^{\frac12(n-2)}\,\mathrm{etr}\Big\{-\frac12(S+XX')\Big\},\quad S>0,\ X\in\mathbb{R}^{p\times m}.$$
Transforming $U=S+XX'$, with Jacobian $J(S\to U)=1$, we get the joint density of $U$ and $X$ as
$$\frac{\pi^{-\frac12 mp}}{2^{\frac12(n+m+p-1)p}\,\Gamma_p[\frac12(n+p-1)]}\det(U-XX')^{\frac12(n-2)}\,\mathrm{etr}\Big(-\frac12U\Big),\quad U-XX'>0,\ X\in\mathbb{R}^{p\times m}. \qquad (4.4.3)$$
Now, let $T=\Sigma^{\frac12}U^{-\frac12}X\Omega^{\frac12}+M$. Then, $J(X\to T)=\det(U)^{\frac12 m}\det(\Sigma)^{-\frac12 m}\det(\Omega)^{-\frac12 p}$ and the joint density of $T$ and $U$, from (4.4.3), is
$$\begin{aligned}&\Big\{2^{\frac12(n+m+p-1)p}\,\Gamma_p[\tfrac12(n+m+p-1)]\Big\}^{-1}\det(U)^{\frac12(n+m-2)}\,\mathrm{etr}\Big(-\frac12U\Big)\\&\quad\times\frac{\Gamma_p[\frac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\frac12(n+p-1)]}\det(\Sigma)^{-\frac12 m}\det(\Omega)^{-\frac12 p}\det\!\big(I_p-\Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\big)^{\frac12(n-2)},\quad U>0,\ T\in\mathbb{R}^{p\times m}. \qquad (4.4.4)\end{aligned}$$
From (4.4.4), it is easily seen that $T\sim IT_{p,m}(n,M,\Sigma,\Omega)$ and is independent of $U$. ■

It may also be noted that (i) if $T\sim IT_{p,m}(n,M,\Sigma,\Omega)$ then $T'\sim IT_{m,p}(n,M',\Omega,\Sigma)$, (ii) if $T\sim IT_{p,m}(n,M,n\Sigma,\Omega)$ then $T\xrightarrow{d}X$ as $n\to\infty$, where $X\sim N_{p,m}(M,\Sigma\otimes\Omega)$, and (iii) if $T\sim IT_{p,m}(n,M,\Sigma,\Omega)$, and $A\,(p\times p)$, $B\,(m\times m)$ are nonsingular matrices, then $ATB\sim IT_{p,m}(n,AMB,A\Sigma A',B'\Omega B)$. The corresponding results on marginal and conditional distributions can also be derived (see Problems 4.11-4.15).

4.5. DISGUISED MATRIX VARIATE t-DISTRIBUTION

The type of distributions introduced in this section were derived by Olkin and Rubin (1964). Tan (1973) studied their properties and called them disguised matrix variate t-distributions because of their similarities with the matrix variate t-distribution and their relationship with the matrix variate beta distribution. First, we derive the lower and upper disguised matrix variate t-densities.

THEOREM 4.5.1. Let $S\sim W_m(n,\Phi)$ and $X\sim N_{p,m}(0,\Sigma\otimes\Phi)$ be independent and
$$T=XU^{-1}, \qquad (4.5.1)$$
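where $S=U'U$ and $U$ is a lower triangular matrix with positive diagonal elements. Then, $T$ is said to have a lower disguised matrix variate t-distribution given by
$$\{K(m,p,n+p)\}^{-1}\det(\Sigma)^{-\frac12 m}\det(I_m+T'\Sigma^{-1}T)^{-\frac12(n+p-m-1)}\prod_{i=1}^m\det\!\big((I_m+T'\Sigma^{-1}T)_{[i]}\big)^{-1},\quad T\in\mathbb{R}^{p\times m}, \qquad (4.5.2)$$
where
$$K(m,p,n)=\frac{\pi^{\frac12 mp}\,\Gamma_m[\frac12(n-p)]}{\Gamma_m(\frac12 n)}=K(p,m,n). \qquad (4.5.3)$$

Proof: Since $T=XU^{-1}=X\Phi^{-\frac12}(U\Phi^{-\frac12})^{-1}$, where $\Phi^{\frac12}$ is the lower triangular matrix with positive diagonal elements such that $\Phi=(\Phi^{\frac12})'\Phi^{\frac12}$, the distribution of $T$ remains invariant under the transformation $X\to X\Phi^{-\frac12}$ and $U\to U\Phi^{-\frac12}$. Hence, without loss of generality take $\Phi=I_m$. Now, the joint density of $X$ and $S$ is
$$\frac{\pi^{-\frac12 mp}}{2^{\frac12 m(n+p)}\,\Gamma_m(\frac12 n)}\det(\Sigma)^{-\frac12 m}\,\mathrm{etr}\Big\{-\frac12(X'\Sigma^{-1}X+S)\Big\}\det(S)^{\frac12(n-m-1)}.$$
Transforming $S=U'U$, $T=XU^{-1}$, with Jacobian $J(S,X\to U,T)=2^m\det(U)^p\prod_{i=1}^mu_{ii}^{i}$, where $U=(u_{ij})$, the joint density of $U$ and $T$ is given by
$$\frac{2^m\,\pi^{-\frac12 mp}}{2^{\frac12 m(n+p)}\,\Gamma_m(\frac12 n)}\det(\Sigma)^{-\frac12 m}\det(U'U)^{\frac12(n+p-m-1)}\,\mathrm{etr}\Big\{-\frac12(U'T'\Sigma^{-1}TU+U'U)\Big\}\prod_{i=1}^mu_{ii}^{i}.$$
Now, let $W=U'(I_m+T'\Sigma^{-1}T)U$. Then, the Jacobian of transformation is
$$J(U\to W)=2^{-m}\prod_{i=1}^m\Big[u_{ii}^{-i}\det\!\big((I_m+T'\Sigma^{-1}T)_{[i]}\big)^{-1}\Big]$$
and the joint density of $W$ and $T$ is
$$\frac{\pi^{-\frac12 mp}}{2^{\frac12 m(n+p)}\,\Gamma_m(\frac12 n)}\det(\Sigma)^{-\frac12 m}\prod_{i=1}^m\det\!\big((I_m+T'\Sigma^{-1}T)_{[i]}\big)^{-1}\det(I_m+T'\Sigma^{-1}T)^{-\frac12(n+p-m-1)}\det(W)^{\frac12(n+p-m-1)}\,\mathrm{etr}\Big(-\frac12W\Big).$$
Integrating out $W$ in the above density using the multivariate gamma integral (1.4.6), one obtains the p.d.f. of $T$ as
$$\frac{\Gamma_m[\frac12(n+p)]}{\pi^{\frac12 mp}\,\Gamma_m(\frac12 n)}\det(\Sigma)^{-\frac12 m}\prod_{i=1}^m\det\!\big((I_m+T'\Sigma^{-1}T)_{[i]}\big)^{-1}\det(I_m+T'\Sigma^{-1}T)^{-\frac12(n+p-m-1)},\quad T\in\mathbb{R}^{p\times m}.\ \blacksquare$$

It may be noted that if $T=(\mathbf{t}_1,\ldots,\mathbf{t}_m)$, then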
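$$I_m+T'\Sigma^{-1}T=I_m+\begin{pmatrix}\mathbf{t}_1'\Sigma^{-1}\mathbf{t}_1&\cdots&\mathbf{t}_1'\Sigma^{-1}\mathbf{t}_m\\\vdots&&\vdots\\\mathbf{t}_m'\Sigma^{-1}\mathbf{t}_1&\cdots&\mathbf{t}_m'\Sigma^{-1}\mathbf{t}_m\end{pmatrix}, \qquad (4.5.4)$$
from which it is seen that
$$(I_m+T'\Sigma^{-1}T)_{[i]}=I_i+\begin{pmatrix}\mathbf{t}_{m-i+1}'\Sigma^{-1}\mathbf{t}_{m-i+1}&\cdots&\mathbf{t}_{m-i+1}'\Sigma^{-1}\mathbf{t}_m\\\vdots&&\vdots\\\mathbf{t}_m'\Sigma^{-1}\mathbf{t}_{m-i+1}&\cdots&\mathbf{t}_m'\Sigma^{-1}\mathbf{t}_m\end{pmatrix}$$
and
$$\det\!\big((I_m+T'\Sigma^{-1}T)_{[i]}\big)=\det(\Sigma)^{-1}\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big). \qquad (4.5.5)$$
Substituting (4.5.5) in (4.5.2), the density of $T$ can be equivalently written as
$$\{K(m,p,n+p)\}^{-1}\det(\Sigma)^{\frac12(n+p-1)}\prod_{i=1}^m\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-1}\det(\Sigma+TT')^{-\frac12(n+p-m-1)},\quad T\in\mathbb{R}^{p\times m}. \qquad (4.5.6)$$
If, in (4.5.1), we take $U$ as an upper triangular matrix, we obtain what is known as the upper disguised matrix variate t-distribution.

THEOREM 4.5.2. Let $S\sim W_m(n,\Phi)$ and $X\sim N_{p,m}(0,\Sigma\otimes\Phi)$ be independent and $T=XU^{-1}$, where $S=U'U$ and $U$ is the upper triangular matrix with positive diagonal elements. Then, $T$ is said to have an upper disguised matrix variate t-distribution given by
$$\{K(m,p,n+p)\}^{-1}\det(\Sigma)^{-\frac12 m}\det(I_m+T'\Sigma^{-1}T)^{-\frac12(n+p-m-1)}\prod_{i=1}^m\det\!\big((I_m+T'\Sigma^{-1}T)^{[i]}\big)^{-1},\quad T\in\mathbb{R}^{p\times m}, \qquad (4.5.7)$$
where $K(m,p,n)$ is defined by (4.5.3).

Proof: The proof is similar to the one given for Theorem 4.5.1. ■

From (4.5.4), we get
$$(I_m+T'\Sigma^{-1}T)^{[i]}=I_i+\begin{pmatrix}\mathbf{t}_1'\Sigma^{-1}\mathbf{t}_1&\cdots&\mathbf{t}_1'\Sigma^{-1}\mathbf{t}_i\\\vdots&&\vdots\\\mathbf{t}_i'\Sigma^{-1}\mathbf{t}_1&\cdots&\mathbf{t}_i'\Sigma^{-1}\mathbf{t}_i\end{pmatrix} \qquad (4.5.8)$$
and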
$$\det\!\big((I_m+T'\Sigma^{-1}T)^{[i]}\big)=\det(\Sigma)^{-1}\det\!\Big(\Sigma+\sum_{j=1}^i\mathbf{t}_j\mathbf{t}_j'\Big). \qquad (4.5.9)$$
Substituting (4.5.9) in (4.5.7), the upper disguised matrix variate t-density can be equivalently written as
$$\{K(m,p,n+p)\}^{-1}\det(\Sigma)^{\frac12(n+p-1)}\prod_{i=1}^m\det\!\Big(\Sigma+\sum_{j=1}^i\mathbf{t}_j\mathbf{t}_j'\Big)^{-1}\det(\Sigma+TT')^{-\frac12(n+p-m-1)},\quad T\in\mathbb{R}^{p\times m}. \qquad (4.5.10)$$
The above two results were derived by Olkin and Rubin (1964, p. 465). Tan (1973) called these distributions lower and upper disguised matrix variate t. If the density of $W=T-M$ is (4.5.2), we shall write $T\sim\underline{DT}_{p,m}(n+p,M,\Sigma)$, and if it is (4.5.7), then $T\sim\overline{DT}_{p,m}(n+p,M,\Sigma)$.

Next, we derive expected values of the matrix $T$ and some of its functions.

THEOREM 4.5.3. (i) If $T\sim\underline{DT}_{p,m}(n,M,\Sigma)$, then $E(T)=M$ and $\mathrm{cov}(\mathrm{vec}(T'))=\Sigma\otimes B$, where $B=\mathrm{diag}(b_1,\ldots,b_m)$ with
$$b_j=\frac{n-p-1}{(n-p-m+j-2)(n-p-m+j-1)},\quad j=1,2,\ldots,m-1,$$
and
$$b_m=\frac{1}{n-p-2}.$$
(ii) If $T\sim\overline{DT}_{p,m}(n,M,\Sigma)$, then $E(T)=M$ and $\mathrm{cov}(\mathrm{vec}(T'))=\Sigma\otimes B$, where $B=\mathrm{diag}(b_1,\ldots,b_m)$, with
$$b_1=\frac{1}{n-p-2}$$
and
$$b_j=\frac{n-p-1}{(n-p-j-1)(n-p-j)},\quad j=2,\ldots,m.$$

Proof: (i) Notice that the random matrix $T$ can be represented as
$$T=XU^{-1}+M, \qquad (4.5.11)$$
where $X$ and $U$ are independently distributed, $X\sim N_{p,m}(0,\Sigma\otimes I_m)$, $U'U\sim W_m(n-p,I_m)$, and $U$ is a lower triangular matrix with positive diagonal elements. From (4.5.11), it is clear that $T\mid U\sim N_{p,m}(M,\Sigma\otimes(UU')^{-1})$, i.e., the conditional mean of $T$ given $U$ is $M$, which is independent of $U$, and hence the unconditional mean of $T$ is also $M$. Further, the conditional variance of $\mathrm{vec}(T')$ given $U$ is $\Sigma\otimes(UU')^{-1}$. Therefore, the unconditional variance of $\mathrm{vec}(T')$ is $E_U(\Sigma\otimes(UU')^{-1})$. Now, from Theorem 3.3.21, we get the desired result.

(ii) The proof is similar to part (i). ■
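The diagonal entries $b_j$ of Theorem 4.5.3 are simple to tabulate. The sketch below computes them for the lower and upper cases and forms $\Sigma\otimes B$; the function name `disguised_t_b` is illustrative, and the guard $n-p-m-1>0$ is an assumption added here so that every denominator is positive (the theorem as stated above does not spell out the moment condition).

```python
import numpy as np

def disguised_t_b(n, p, m, kind="lower"):
    """Diagonal entries b_1, ..., b_m of B in Theorem 4.5.3 (lower or upper case)."""
    nu = n - p
    if nu - m - 1 <= 0:
        # Assumed condition: keeps all denominators below strictly positive.
        raise ValueError("need n - p - m - 1 > 0")
    if kind == "lower":
        b = [(nu - 1) / ((nu - m + j - 2) * (nu - m + j - 1)) for j in range(1, m)]
        b.append(1.0 / (nu - 2))
    elif kind == "upper":
        b = [1.0 / (nu - 2)]
        b += [(nu - 1) / ((nu - j - 1) * (nu - j)) for j in range(2, m + 1)]
    else:
        raise ValueError("kind must be 'lower' or 'upper'")
    return np.array(b)

n, p, m = 12, 2, 3
b_lower = disguised_t_b(n, p, m, "lower")
b_upper = disguised_t_b(n, p, m, "upper")

Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
cov_lower = np.kron(Sigma, np.diag(b_lower))   # cov(vec(T')) in the lower case
```

In this parametrization the upper-case coefficients are the lower-case ones in reverse order, which reflects the reversal of column order between the two triangular factorizations.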
THEOREM 4.5.4. If $T\sim\underline{DT}_{p,m}(n,M,\Sigma)$, then

(i) $E(TCT)=\Sigma C'B+MCM$, $\ C\,(m\times p)$,

(ii) $E(TCT')=\mathrm{tr}(CB)\Sigma+MCM'$, $\ C\,(m\times m)$,

(iii) $E(TCT'DT)=\mathrm{tr}(\Sigma D')MCB+\Sigma D'MC'B+\mathrm{tr}(CB)\Sigma DM+MCM'DM$, $\ C\,(m\times m)$, $D\,(p\times p)$,

(iv) $E(TCTDT)=\Sigma D'M'C'B+MC\Sigma D'B+\Sigma C'BDM+MCMDM$, $\ C\,(m\times p)$, $D\,(m\times p)$,

where the matrix $B$ is defined in Theorem 4.5.3(i).

Proof: (i) Using the representation (4.5.11), we get
$$\begin{aligned}E(TCT)&=E[(XU^{-1}+M)C(XU^{-1}+M)]\\&=E_UE_X[(XU^{-1}CXU^{-1}+MCXU^{-1}+XU^{-1}CM+MCM)\mid U]\\&=E_U[E_X\{(XU^{-1}CXU^{-1}+MCM)\mid U\}]\\&=E_U[\Sigma C'(U^{-1})'U^{-1}+MCM],\quad\text{from Theorem 2.3.5}\\&=\Sigma C'E_U(UU')^{-1}+MCM\\&=\Sigma C'B+MCM.\end{aligned}$$
The last step follows from Theorem 3.3.21.

(ii) Derivation of $E(TCT')$ is similar to (i).

(iii) As in part (i), we have
$$\begin{aligned}E(TCT'DT)&=E[(XU^{-1}+M)C(XU^{-1}+M)'D(XU^{-1}+M)]\\&=E_UE_X[(XU^{-1}C(U^{-1})'X'DXU^{-1}+MC(U^{-1})'X'DXU^{-1}+XU^{-1}CM'DXU^{-1}+MCM'DXU^{-1}\\&\qquad+XU^{-1}C(U^{-1})'X'DM+MC(U^{-1})'X'DM+XU^{-1}CM'DM+MCM'DM)\mid U]\\&=E_U[\mathrm{tr}(\Sigma D')MC(UU')^{-1}+\Sigma D'MC'(UU')^{-1}+\mathrm{tr}(C(UU')^{-1})\Sigma DM]+MCM'DM\\&=\mathrm{tr}(\Sigma D')MCB+\Sigma D'MC'B+\mathrm{tr}(CB)\Sigma DM+MCM'DM.\end{aligned}$$

(iv) This result can be derived in the same manner as (iii). ■

It may be noted that if $T\sim\overline{DT}_{p,m}(n,M,\Sigma)$, then the results (i)-(iv) given above still hold, but the matrix $B$ is now given by Theorem 4.5.3(ii).

We now derive the distribution of certain functions of the lower (upper) disguised matrix $T$.

THEOREM 4.5.5. Let $A\,(p\times p)$ be a constant nonsingular matrix. (i) If $T\sim\underline{DT}_{p,m}(n,M,\Sigma)$, then $AT\sim\underline{DT}_{p,m}(n,AM,A\Sigma A')$,
and (ii) if $T\sim\overline{DT}_{p,m}(n,M,\Sigma)$, then $AT\sim\overline{DT}_{p,m}(n,AM,A\Sigma A')$.

Proof: (i) The density of $T$ is
$$\{K(m,p,n)\}^{-1}\det(\Sigma)^{-\frac12 m}\prod_{i=1}^m\det\!\big((I_m+(T-M)'\Sigma^{-1}(T-M))_{[i]}\big)^{-1}\det\!\big(I_m+(T-M)'\Sigma^{-1}(T-M)\big)^{-\frac12(n-m-1)},\quad T\in\mathbb{R}^{p\times m}. \qquad (4.5.12)$$
Substituting $W=AT$, with Jacobian $J(T\to W)=\det(A)^{-m}$, in (4.5.12) we get the density of $W$ as
$$\{K(m,p,n)\}^{-1}\det(A\Sigma A')^{-\frac12 m}\prod_{i=1}^m\det\!\big((I_m+(W-AM)'(A\Sigma A')^{-1}(W-AM))_{[i]}\big)^{-1}\det\!\big(I_m+(W-AM)'(A\Sigma A')^{-1}(W-AM)\big)^{-\frac12(n-m-1)},\quad W\in\mathbb{R}^{p\times m}.$$
Hence, the result.

(ii) This can be proved in the same way as part (i). ■

COROLLARY 4.5.5.1. (i) If $T\sim\underline{DT}_{p,m}(n,M,\Sigma)$, then $\Sigma^{-\frac12}T\sim\underline{DT}_{p,m}(n,\Sigma^{-\frac12}M,I_p)$, and (ii) if $T\sim\overline{DT}_{p,m}(n,M,\Sigma)$, then $\Sigma^{-\frac12}T\sim\overline{DT}_{p,m}(n,\Sigma^{-\frac12}M,I_p)$.

Proof: Put $A=\Sigma^{-\frac12}$ in Theorem 4.5.5. ■

THEOREM 4.5.6. Let $A\,(r\times p)$ be a constant matrix of rank $r\le p$. (i) If $T\sim\underline{DT}_{p,m}(n,M,\Sigma)$, then $AT\sim\underline{DT}_{r,m}(n-p+r,AM,A\Sigma A')$, and (ii) if $T\sim\overline{DT}_{p,m}(n,M,\Sigma)$, then $AT\sim\overline{DT}_{r,m}(n-p+r,AM,A\Sigma A')$.

Proof: Here we give the proof for (i) only, since the proof of (ii) is similar. Let $W=T-M$; then from (4.5.11) $W$ can be represented as $W=T-M=XU^{-1}$ and $AW=A(T-M)=AXU^{-1}$, where $AX\sim N_{r,m}(0,A\Sigma A'\otimes I_m)$. Therefore, by definition, $A(T-M)\sim\underline{DT}_{r,m}(n-p+r,0,A\Sigma A')$ and the result follows immediately. ■
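The density (4.5.12) and the invariance in Theorem 4.5.5(i) can be explored numerically. The sketch below is hedged in two ways: the constant is taken as $K(m,p,n)=\pi^{\frac12 mp}\Gamma_m[\frac12(n-p)]/\Gamma_m(\frac12 n)$, which is an assumption reconstructed from the proof of Theorem 4.5.1, and the subscript $[i]$ is read as the lower-right $i\times i$ principal block, as suggested by (4.5.5). The function names and numerical values are illustrative.

```python
import math
import numpy as np

def log_multigamma(a, p):
    """log Gamma_p(a), the multivariate gamma function."""
    return 0.25 * p * (p - 1) * math.log(math.pi) + sum(
        math.lgamma(a - 0.5 * i) for i in range(p))

def lower_disguised_t_logpdf(T, n, M, Sigma):
    """Log of the lower disguised matrix variate t density (4.5.12) at T."""
    T, M, Sigma = (np.asarray(a, dtype=float) for a in (T, M, Sigma))
    p, m = T.shape
    D = T - M
    G = np.eye(m) + D.T @ np.linalg.solve(Sigma, D)    # I_m + (T-M)' Sigma^{-1} (T-M)
    # Assumed form of K(m, p, n); see the lead-in above.
    log_K = (0.5 * m * p * math.log(math.pi)
             + log_multigamma(0.5 * (n - p), m) - log_multigamma(0.5 * n, m))
    # Sum of log-determinants of the lower-right i x i principal blocks, i = 1..m.
    log_blocks = sum(np.linalg.slogdet(G[m - i:, m - i:])[1] for i in range(1, m + 1))
    return (-log_K - 0.5 * m * np.linalg.slogdet(Sigma)[1]
            - log_blocks - 0.5 * (n - m - 1) * np.linalg.slogdet(G)[1])

rng = np.random.default_rng(3)
p, m, n = 2, 3, 8.0
T = rng.standard_normal((p, m))
M = rng.standard_normal((p, m))
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0, 2.0], [0.0, 3.0]])                 # nonsingular p x p

log_f = lower_disguised_t_logpdf(T, n, M, Sigma)
# Theorem 4.5.5(i): density of W = AT is f(T) times the Jacobian det(A)^{-m}.
log_f_transformed = lower_disguised_t_logpdf(A @ T, n, A @ M, A @ Sigma @ A.T)
log_jacobian = -m * np.linalg.slogdet(A)[1]

# Scalar sanity check: for p = m = 1 and n = 3 the density at T = M equals 1/2.
scalar_value = math.exp(lower_disguised_t_logpdf([[0.0]], 3, [[0.0]], [[1.0]]))
```

The invariance check does not depend on the form of $K$, since the constant cancels; only the scalar check exercises it.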
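THEOREM 4.5.7. Let $T\,(p\times m)=(T_{1c}\ \ T_{2c})$, $T_{1c}\,(p\times m_1)$ and $M\,(p\times m)=(M_{1c}\ \ M_{2c})$, $M_{1c}\,(p\times m_1)$, with $m_2=m-m_1$. (i) If $T\sim\underline{DT}_{p,m}(n,M,\Sigma)$, then $T_{2c}\sim\underline{DT}_{p,m_2}(n,M_{2c},\Sigma)$, and $W_1^{-1}(T_{1c}-M_{1c})\sim\underline{DT}_{p,m_1}(n-m_2,0,I_p)$, where $W_1=\{\Sigma+(T_{2c}-M_{2c})(T_{2c}-M_{2c})'\}^{\frac12}$, and (ii) if $T\sim\overline{DT}_{p,m}(n,M,\Sigma)$, then $T_{1c}\sim\overline{DT}_{p,m_1}(n,M_{1c},\Sigma)$, and $W_2^{-1}(T_{2c}-M_{2c})\sim\overline{DT}_{p,m_2}(n-m_1,0,I_p)$, where $W_2=\{\Sigma+(T_{1c}-M_{1c})(T_{1c}-M_{1c})'\}^{\frac12}$.

Proof: We shall give the proof of (i) only, since the proof of (ii) follows similar steps. Without loss of generality, it can be assumed that $M=0$. The lower disguised matrix variate t-density given by (4.5.6) is
$$\{K(p,m,n)\}^{-1}\det(\Sigma)^{\frac12(n-1)}\prod_{i=1}^m\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-1}\det(\Sigma+TT')^{-\frac12(n-m-1)},\quad T\in\mathbb{R}^{p\times m}, \qquad (4.5.13)$$
where $K(p,m,n)$ is defined by (4.5.3) and $T=(\mathbf{t}_1,\ldots,\mathbf{t}_m)$. Integrating (4.5.13) with respect to $\mathbf{t}_1$, we get the joint density of $\mathbf{t}_2,\mathbf{t}_3,\ldots,\mathbf{t}_m$ as
$$\{K(p,m,n)\}^{-1}\det(\Sigma)^{\frac12(n-1)}\prod_{i=1}^{m-1}\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-1}\int\det\!\Big(\Sigma+\sum_{j=1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-\frac12(n-m+1)}d\mathbf{t}_1. \qquad (4.5.14)$$
Substituting $\mathbf{y}_1=(\Sigma+\sum_{j=2}^m\mathbf{t}_j\mathbf{t}_j')^{-\frac12}\mathbf{t}_1$, with the Jacobian $J(\mathbf{t}_1\to\mathbf{y}_1)=\det(\Sigma+\sum_{j=2}^m\mathbf{t}_j\mathbf{t}_j')^{\frac12}$, in (4.5.14), we get
$$\{K(p,m,n)\}^{-1}\det(\Sigma)^{\frac12(n-1)}\prod_{i=1}^{m-1}\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-1}\det\!\Big(\Sigma+\sum_{j=2}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-\frac12(n-m)}\int\det(I_p+\mathbf{y}_1\mathbf{y}_1')^{-\frac12(n-m+1)}d\mathbf{y}_1. \qquad (4.5.15)$$
Evaluating the integral in (4.5.15) using Theorem 1.4.11 and simplifying, we get the joint density of $\mathbf{t}_2,\ldots,\mathbf{t}_m$ as
$$\{K(p,m-1,n)\}^{-1}\det(\Sigma)^{\frac12(n-1)}\prod_{i=1}^{m-1}\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-1}\det\!\Big(\Sigma+\sum_{j=2}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-\frac12(n-m)}. \qquad (4.5.16)$$
Now, integrate (4.5.16) with respect to $\mathbf{t}_2$, using the same procedure, to get the joint density of $\mathbf{t}_3,\ldots,\mathbf{t}_m$ as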
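$$\{K(p,m-2,n)\}^{-1}\det(\Sigma)^{\frac12(n-1)}\prod_{i=1}^{m-2}\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-1}\det\!\Big(\Sigma+\sum_{j=3}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-\frac12(n-m+1)}.$$
Repeating this procedure $m_1$ times, we get the joint density of $\mathbf{t}_{m_1+1},\ldots,\mathbf{t}_m$ as
$$\{K(p,m_2,n)\}^{-1}\det(\Sigma)^{\frac12(n-1)}\prod_{i=1}^{m_2}\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-1}\det\!\Big(\Sigma+\sum_{j=m_1+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-\frac12(n-m_2-1)},$$
i.e., $(\mathbf{t}_{m_1+1},\ldots,\mathbf{t}_m)\sim\underline{DT}_{p,m_2}(n,M_{2c},\Sigma)$. To prove the second part, note that
$$\det(\Sigma+TT')=\det(\Sigma+T_{2c}T_{2c}')\det(I_p+W_1^{-1}T_{1c}T_{1c}'W_1^{-1}), \qquad (4.5.17)$$
$$\prod_{i=1}^m\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)=\prod_{i=1}^{m_2}\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)\prod_{i=m_2+1}^m\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big) \qquad (4.5.18)$$
and
$$\begin{aligned}\prod_{i=m_2+1}^m\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)&=\prod_{i=1}^{m_1}\det\!\Big(\Sigma+\sum_{j=m_1-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)\\&=\det(\Sigma+T_{2c}T_{2c}')^{m_1}\prod_{i=1}^{m_1}\det\!\Big(I_p+W_1^{-1}\sum_{j=m_1-i+1}^{m_1}\mathbf{t}_j\mathbf{t}_j'W_1^{-1}\Big). \qquad (4.5.19)\end{aligned}$$
Now, substituting (4.5.19) in (4.5.18), and (4.5.17) and (4.5.18) in (4.5.13), we get the joint density of $T_{1c}$ and $T_{2c}$ as
$$\begin{aligned}&\{K(p,m,n)\}^{-1}\det(\Sigma)^{\frac12(n-1)}\prod_{i=1}^{m_2}\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-1}\det(\Sigma+T_{2c}T_{2c}')^{-\frac12(n-m-1+2m_1)}\\&\quad\times\prod_{i=1}^{m_1}\det\!\Big(I_p+W_1^{-1}\sum_{j=m_1-i+1}^{m_1}\mathbf{t}_j\mathbf{t}_j'W_1^{-1}\Big)^{-1}\det(I_p+W_1^{-1}T_{1c}T_{1c}'W_1^{-1})^{-\frac12(n-m-1)}.\end{aligned}$$
Now, transforming $\mathbf{y}_j=W_1^{-1}\mathbf{t}_j$, $j=1,\ldots,m_1$, so that $Y=(\mathbf{y}_1,\ldots,\mathbf{y}_{m_1})=W_1^{-1}T_{1c}$, with Jacobian $J(T_{1c}\to Y)=\det(W_1)^{m_1}$, the joint density of $Y$ and $T_{2c}$ is
$$\begin{aligned}&\{K(p,m,n)\}^{-1}\det(\Sigma)^{\frac12(n-1)}\prod_{i=1}^{m_2}\det\!\Big(\Sigma+\sum_{j=m-i+1}^m\mathbf{t}_j\mathbf{t}_j'\Big)^{-1}\det(\Sigma+T_{2c}T_{2c}')^{-\frac12(n-m_2-1)}\\&\quad\times\prod_{i=1}^{m_1}\det\!\Big(I_p+\sum_{j=m_1-i+1}^{m_1}\mathbf{y}_j\mathbf{y}_j'\Big)^{-1}\det(I_p+YY')^{-\frac12(n-m-1)}. \qquad (4.5.20)\end{aligned}$$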
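From (4.5.20), it is easily seen that $Y$ and $T_{2c}$ are independently distributed, $T_{2c}\sim\underline{DT}_{p,m_2}(n,0,\Sigma)$ and $Y=W_1^{-1}T_{1c}\sim\underline{DT}_{p,m_1}(n-m_2,0,I_p)$. ■

THEOREM 4.5.8. Let $T\sim\overline{DT}_{p,m}(n,M,\Sigma)$ and partition $T$, $M$ and $\Sigma$ as
$$T=\begin{pmatrix}T_{1r}\\T_{2r}\end{pmatrix},\qquad M=\begin{pmatrix}M_{1r}\\M_{2r}\end{pmatrix},\qquad\Sigma=\begin{pmatrix}\Sigma_{11}&\Sigma_{12}\\\Sigma_{21}&\Sigma_{22}\end{pmatrix},$$
where $T_{1r}$ and $M_{1r}$ have $p_1$ rows, $T_{2r}$ and $M_{2r}$ have $p_2$ rows, $\Sigma_{11}$ is $p_1\times p_1$ and $\Sigma_{22}$ is $p_2\times p_2$. Then,

(i) $T_{1r}\sim\overline{DT}_{p_1,m}(n-p_2,M_{1r},\Sigma_{11})$, $T_{2r}\sim\overline{DT}_{p_2,m}(n-p_1,M_{2r},\Sigma_{22})$, and

(ii) $(T_{1r}-M_{1r})-\Sigma_{12}\Sigma_{22}^{-1}(T_{2r}-M_{2r})\sim\overline{DT}_{p_1,m}(n-p_2,0,\Sigma_{11\cdot2})$, $(T_{2r}-M_{2r})-\Sigma_{21}\Sigma_{11}^{-1}(T_{1r}-M_{1r})\sim\overline{DT}_{p_2,m}(n-p_1,0,\Sigma_{22\cdot1})$.

Proof: (i) According to Theorem 4.5.2, we can write
$$T-M=XU^{-1},$$
where $X\sim N_{p,m}(0,\Sigma\otimes I_m)$ and $U'U\sim W_m(n-p,I_m)$ are independent and $U$ is an upper triangular matrix with positive diagonal elements. Partitioning $X$ as $X=\begin{pmatrix}X_{1r}\\X_{2r}\end{pmatrix}$, with $X_{1r}$ of $p_1$ rows and $X_{2r}$ of $p_2$ rows, we have $T_{1r}-M_{1r}=X_{1r}U^{-1}$ and $T_{2r}-M_{2r}=X_{2r}U^{-1}$, where $X_{1r}\sim N_{p_1,m}(0,\Sigma_{11}\otimes I_m)$ and $X_{2r}\sim N_{p_2,m}(0,\Sigma_{22}\otimes I_m)$. Hence, the results follow from Theorem 4.5.2.

(ii) Here, we have
$$(T_{1r}-M_{1r})-\Sigma_{12}\Sigma_{22}^{-1}(T_{2r}-M_{2r})=(X_{1r}-\Sigma_{12}\Sigma_{22}^{-1}X_{2r})U^{-1},$$
where $X_{1r}-\Sigma_{12}\Sigma_{22}^{-1}X_{2r}\sim N_{p_1,m}(0,\Sigma_{11\cdot2}\otimes I_m)$ and hence
$$(T_{1r}-M_{1r})-\Sigma_{12}\Sigma_{22}^{-1}(T_{2r}-M_{2r})\sim\overline{DT}_{p_1,m}(n-p_2,0,\Sigma_{11\cdot2}).$$
The proof of the second part is similar. ■

When $T$ has a lower disguised matrix variate t-distribution, results similar to Theorem 4.5.8 can also be derived.

4.6. RESTRICTED MATRIX VARIATE t-DISTRIBUTION

Tan (1969b) defined a restricted form of the matrix variate t-distribution which occurs in the derivation of the posterior distribution of a parameter of a generalized multivariate normal process. In this section, we study this restricted matrix variate t-distribution.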
152 CHAPTER 4. MATRIX VARJATE t-DISTRIBUTION DEFINITION 4.6.1. A random matrix Τ (ρ χ ra), such that TC = 0, where С (rax s) is constant matrix of rank s(<m), is said to have restricted matrix variate t-distribution if its p.d.f is given by Г [l(n + m + ρ - s - 1)] ,p dem-h{m-s) aei{Cmb det(Jp + Σ-ιΤΩ-ιΤ')-!(η+™+Ρ-5-ΐ) This density will be denoted by Τρ?7η(π, 0, E,Q|s,C). Further, if for Μ (ρ χ га), MC = 0, and Τ - Μ ~ ТРут(п, 0, Σ, Ω|«, C), then Τ ~ ΓΡ}7η(π, Μ, Σ, Ω|β, C). This density can be derived by using the representation of Τ given in Theorem 4.2.1, where now X ~ iVp,m(0, Σ <g> Ω|β, С). THEOREM 4.6.1. Let S ~ Wp(n + p - Ι,Σ"1) independent of X ~ ЛГр,та(0, JP <g> Ω|5, С). Define Τ = (S"i)rX + Μ, гуДеге Μ (ρ χ πι) is a constant matrix, such that MC = 0, and ^(S^y = 5. Then, Τ ~ Гр,та(тг, Μ, Σ,Ω|β, С). Proof: Given 5, Τ is distributed as NPfm{M, S~l ® Ω|θ, С). Hence, the unconditional density of Τ is given by 2-i(n+m+p-s-l)p _I(m-s)p rrirw-ur,m det(E)5^-1)det(n)""det(C'nC)^ rp[5(n + p-l)J /" det(5)5("+m-s-2) etr {- Js(T - AT )Ω_1(Γ - MY - Jes} dS Js>o l 2 2 J = ,-^^ΓΡ[Ι(η + m + ρ - . - 1)] de ,p ^ rp[i(n + p-l)]det(E)iim-> V ' V ' det(Jp + Σ_1(Γ - Μ)Ω~ι(Τ - M)0~*(n+m+p~e~1), TC = 0, which is the required result. ■ The above theorem can also be proved along the lines as Theorem 4.2.1. THEOREM 4.6.2. If Τ ~ ΓΡ}7η(π, Μ, Σ, Ω|β, С), and Β (ρ χ ρ) and Ό (ra x ra) are constant nonsingular matrices, then BTD ~ Tp?m(n, BMD, ΒΣΒ', D'Q,D\s, D~lC). Proof: See Problem 4.24. ■ 4.7. NONCENTRAL MATRIX VARIATE t-DISTRIBUTION In Section 4.2, we defined the matrix variate ^-distribution. In the subsequent section, it was represented as Τ = (S~^)'X + M, where X ~ iVP,m(0, Ip <8> Ω), and S ~ Wp{n + ρ - 1, Σ"1). When S ~ Wp(n + ρ - 1, Σ"1, θ) or X~ ATp?m(M, Ip <g> Ω), the distribution of T, so obtained is called the noncentral matrix variate ^-distribution.
4.7. NONCENTRAL MATRIX VARIATE t-DISTRIBUTION 153 THEOREM 4.7.1. Let X ~ NPtm(0Jp <g> Ω) and S ~ Wp(n + p- Ι,Σ^,Θ) be independent. Then the distribution of Γ=(5"*);Χ, where S = S^(S^)r, is the lower noncentral matrix variate t and the density of Τ is given by det(Ip + Σ-ιΤΩ-ιΓ')-5(-+™+ρ-ι) lFl(I(„ + m + ρ - 1); i(n + ρ - 1); hip + Σ-1τςι-1,Γ)-1θ), τ e Rpxm. (4.7.1) Proof: The joint density of X and S is (27r)-5mPdet(fi)-5petr (- ^П-1Х'Х)Ы1п*р-1'>ргЛ(п + р- 1)]}"1 detiE)^""^1) det(S)*<»-2> etr (- ^Σ5 - |θ) 0ii(|(n + ρ - 1); ^ΘΣ5). Transforming Τ = (5_5)'ΛΓ, with the Jacobian J(X -> T) = det(S)5m, we get the joint density of Τ and 5 as |2i(„+m+P-i)P7rimpΓρjl(n + p _ ^j|_1 det(E)i(n+P-i) det(n)-*pdet(5)i(n+m-2) etr {- ^(ΓΩ-Ύ' + E)S - ^0} 0Fi(\(n + p-l); ±0ES). (4.7.2) To find the marginal density of Τ we integrate out (4.7.2), with respect to 5. Let U = ±(ΤΩ-χΤ' + E)iS(ra-17v + E)i, then J(S -> U) = аеЬ(1(ТП-1Т' + Σ))~^+1^ and we can write / det(S)5("+m-2> etr {- i(TO-1r + Σ)5} „fi(\{n + ρ - 1); 7©ES) dS 7s>o l 2 ' ^2 4 ' = 25("+m+P-1)p det(rn_1T' + s)_5(n+m+i'-1) / eti(-U) det(uWn+m-2) Ju>o oFi(\{n + P - 1); |(ΓΩ_1Γ + Σ)-5ΘΣ(ΓΩ"1Τ' + E)"*tf) d*7 = 2^n+m+p-1)prp[i(n + m +p - 1)] det(TO_1r + Ε)-**"-*"4*-4 ^χ^η + τη + ρ-^^η + ρ-Ι^^/ρ + Σ^ΓΩ-^')-^). The last equality is obtained by using Theorem 1.6.2. Hence, the density of Τ is given by (4.7.1). ■
154 CHAPTER 4. MATRIX VARJATE t-DISTRIBUTION In the above theorem, if Ω = /m, Σ = /p, and θ = diag(#n,0,... ,0), then the p.d.f. of Τ simplifies to r[l(n + m + p-l)] , l . TTri(n+m+p_1} π*"*Γρβ(η + ρ-1)] V 2 XV ^ ' ^(^(n + m+p-ljj^n + p-l);^11^!), where (Ip + TTf)~l = (rlj). These distributions were studied by Juritz and Troskie (1976), Marx (1981), and Hayakawa (1985). Marx (1981) and Hayakawa (1985) also derived asymptotic expansions of the p.d.f. of Τ given in the next theorem. THEOREM 4.7.2. The asymptotic expansion for the lower noncentral matrix van- ate t-density (4-7.1), (ι)ι/Σ = ΝΦ, θ = 0(1), N = n + p-l, is (2^-2mPdet^)-bdet(Q)-2petr (- 1φ-ιΤΩ~ιΤ) [l + -^ + 0(N~2)], where с = Knp(m -p - 1) + 1{α(Ω-ιΤ'φ-ιΤ)2 - 2mtr(Q-1T^"1T)} + ^mtr(e) - ^ Κ(ΤΩ-ιΤΦ~ιθ), and (ii) if Σ = ΝΦ,Θ = Νθ1} θι = 0(1), Ν = η + ρ-1, is (2^-bPdet^)-2mdet(Q)-2pdet(Jp + ©^betr {- hlp + Θι)φ-1ΤΩ"1Γ} where d = ^mp(m-p- 1) - ^тЬ^Ф^Т^Г) - ^ηϊχ{φ-1ΤΩ.-1Τθι{Ιρ + θχ)"1} - \ tr{(7p - 2е1)(Ф~1т-1Т')2} + Jm[{tr 0!(/р + ©О"1}2 - (m- 1)&{©!(/„ + Θ!)-1}2]. Proof: See Hayakawa (1985). ■ THEOREM 4.7.3. Let X ~ Np,m(M, Ip ® Ω) and S ~ Wp(n + ρ - 1, Σ"1) 6e independent. Then, the distribution of T=(S-i)'X,
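where $S = S^{\frac12}(S^{\frac12})'$, is the upper noncentral matrix variate t, and the density of $T$ is given by

$$\frac{\Gamma_p[\tfrac12(n+m+p-1)]}{\pi^{\frac12 mp}\,\Gamma_p[\tfrac12(n+p-1)]}\det(\Omega)^{-\frac12 p}\det(\Sigma)^{-\frac12 m}\det(I_p+\Sigma^{-1}T\Omega^{-1}T')^{-\frac12(n+m+p-1)}f(M,T,\Sigma,\Omega), \tag{4.7.3}$$

where

$$f(M,T,\Sigma,\Omega)=\left\{\Gamma_p[\tfrac12(n+m+p-1)]\right\}^{-1}\operatorname{etr}\left(-\tfrac12\Omega^{-1}M'M\right)\int_{S>0}\det(S)^{\frac12(n+m-2)}\operatorname{etr}\left\{-S+\sqrt{2}\,M\Omega^{-1}T'(T\Omega^{-1}T'+\Sigma)^{-\frac12}S^{\frac12}\right\}dS. \tag{4.7.4}$$

Proof: The joint density of $X$ and $S$ is

$$(2\pi)^{-\frac12 mp}\det(\Omega)^{-\frac12 p}\operatorname{etr}\left\{-\tfrac12\Omega^{-1}(X-M)'(X-M)\right\}\left\{2^{\frac12(n+p-1)p}\,\Gamma_p[\tfrac12(n+p-1)]\right\}^{-1}\det(\Sigma)^{\frac12(n+p-1)}\det(S)^{\frac12(n-2)}\operatorname{etr}\left(-\tfrac12\Sigma S\right).$$

Transforming $T=(S^{-\frac12})'X$, with $J(X\to T)=\det(S)^{\frac12 m}$, we get the joint density of $T$ and $S$ as

$$\frac{2^{-\frac12(n+m+p-1)p}\,\pi^{-\frac12 mp}}{\Gamma_p[\tfrac12(n+p-1)]}\det(\Omega)^{-\frac12 p}\det(\Sigma)^{\frac12(n+p-1)}\operatorname{etr}\left(-\tfrac12\Omega^{-1}M'M\right)\det(S)^{\frac12(n+m-2)}\operatorname{etr}\left\{-\tfrac12(T\Omega^{-1}T'+\Sigma)S+T\Omega^{-1}M'(S^{\frac12})'\right\}. \tag{4.7.5}$$

To find the marginal density of $T$, we integrate (4.7.5) with respect to $S$. Let $U=\tfrac12(T\Omega^{-1}T'+\Sigma)^{\frac12}S(T\Omega^{-1}T'+\Sigma)^{\frac12}$; then $J(S\to U)=\det\left(\tfrac12(T\Omega^{-1}T'+\Sigma)\right)^{-\frac12(p+1)}$ and we can write

$$\begin{aligned}
&\int_{S>0}\operatorname{etr}\left\{-\tfrac12(T\Omega^{-1}T'+\Sigma)S+T\Omega^{-1}M'(S^{\frac12})'\right\}\det(S)^{\frac12(n+m-2)}\,dS\\
&\quad=2^{\frac12(n+m+p-1)p}\det(T\Omega^{-1}T'+\Sigma)^{-\frac12(n+m+p-1)}\int_{U>0}\det(U)^{\frac12(n+m-2)}\operatorname{etr}\left\{-U+\sqrt{2}\,M\Omega^{-1}T'(T\Omega^{-1}T'+\Sigma)^{-\frac12}U^{\frac12}\right\}dU.
\end{aligned} \tag{4.7.6}$$

Substituting (4.7.6) in (4.7.5), we get the result (4.7.3). ■

The integral (4.7.4) has not been evaluated so far. However, when $M = 0$, $f(0,T,\Sigma,\Omega)=1$ and we get the central case.

When $X\sim N_{p,m}(M, I_p\otimes\Omega)$ and $S\sim W_p(n+p-1,\Sigma^{-1},\Theta)$, the distribution of $T=(S^{-\frac12})'X$ is called the doubly noncentral matrix variate t. However, its density has not been evaluated, owing to the complexity of certain integrals involved in the derivation. Marx (1981) has given an asymptotic expansion for the density of $T$ in this case.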
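Although the density (4.7.3) involves an unevaluated integral, the representation $T=(S^{-\frac12})'X$ of Theorem 4.7.3 is easy to simulate. The sketch below is only an illustration under stated assumptions: the degrees of freedom $n+p-1$ are taken to be an integer so that the Wishart matrix can be built as a sum of outer products, $I_p\otimes\Omega$ is read as "rows of $X$ independent with covariance $\Omega$", and the lower Cholesky factor is used as $S^{\frac12}$. The function name and arguments are hypothetical.

```python
import numpy as np

def sample_upper_noncentral_t(n, M, Sigma, Omega, rng):
    """Draw one T = (S^{-1/2})' X with S ~ W_p(n+p-1, Sigma^{-1}) and
    X ~ N_{p,m}(M, I_p (x) Omega), S and X independent."""
    p, m = M.shape
    dof = n + p - 1                      # assumed to be an integer >= p
    # Wishart draw: S = G G', columns of G i.i.d. N_p(0, Sigma^{-1}).
    A = np.linalg.cholesky(np.linalg.inv(Sigma))
    G = A @ rng.standard_normal((p, dof))
    S = G @ G.T
    # Rows of X are independent N_m(row of M, Omega).
    C = np.linalg.cholesky(Omega)
    X = M + rng.standard_normal((p, m)) @ C.T
    # With S^{1/2} = L (lower Cholesky factor), (S^{-1/2})' = (L')^{-1}.
    L = np.linalg.cholesky(S)
    T = np.linalg.solve(L.T, X)
    return T, X, S
```

Any other square root with $S=S^{\frac12}(S^{\frac12})'$ could be substituted for the Cholesky factor; setting `M` to a zero matrix gives draws from the central case.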
156 CHAPTER 4. MATRIX VARIATE t-DISTRIBUWN 4.8. DISTRIBUTION OF QUADRATIC FORMS In this section we study the distribution of quadratic forms of the type TAT', where A(mxm) is symmetric positive definite and the random matrix Τ(pxm),p <m has ^-distribution. It may be recalled here that for ρ < га, TAT' > 0 with probability 1. The next result is due to Javier and Gupta (1985b). THEOREM 4.8.1. Let Τ ~ TPim(n, 0, Σ, Ω) and A(mxm) be a symmetric positive definite matrix. Then the p.d.f. ofW = TAT' for ρ <m is given by {/^fa + ^ + P-1)»^™)} det(E)-bdet(QA)-2p det(Wr)i(m~p~1) det(Jp + E-1Wr)""*(n+m+p""1) lFt\\(n + m + p-l);(Ip + Σ-1W)-1Σ-1W,в),W>0, (4.8.1) where В = Im- A~*$l~lA~%. Proof: The density of Τ is 4.П-П1 l l det(E)"mdet(Q)"p Гр[|(п + т + р-1)] >ΡΓρ[1(η + ρ-1)] det(Jp + Σ-ιΤΩ-ιΤ')-έ(-+^+ρ-ΐ)? Therefore the density of W = TAT', is given by _ rp[i(n + m + p-l)] f{W) = Wr if 1 Ί det(Z)-^det(Q)-^ π2^Γρ[|(η + ρ-1)] / det(Jp + Σ-ιΤΩ-ιΤ')-έ(-+^+Ρ~ι) dr. Now let У = ТА*. Then J(T -> У) = det(A)"^, and hence M ' π*^Γρ[1(η + ρ-1)] ^ V ' [ det(/p + Е"1УА-*П-1А-*У/)"*(п+тарН,"1) dY- (4·8·2) Next write det(/p + Е"1УА-*П-Ы-*У/)"*(П+Я,+Р"1) = det(/p + E-1yy/)"*(n+mr+1,"1) det(/p - (/p + Е-1УУ/)"1Е"1У(Д„ - A~^~lΑ~^)Υ')~^η+τη+1ρ-^
АЛ. NONCENTRAL MATRIX VARIATE t-DISTRIBUTION 157 = det(/p + E-1yy')"^"+m+P~1) det(7m - Y'(IP + Σ-ΐγγ>)-ιΣ-ιγΒ}-\{η+τη+Ρ-ι) = det(/p + s-iyy')-§(n+m+P-i) i-Pom) {\(n + m + p-l); Y'{IP + Tr1YY' )-1Е"1УВ). (4.8.3) Substituting from (4.8.3) in (4.8.2), we get 'W = ΪρΓ^Γ!Ρ"!?ΐ сВД-Wet(Afi)-^(B), (4.8.4) 7Г2таРГр[^(п+р- 1)] where g(B) = ( aet(Ip + E-lYY,)-1'{n+m+p~l) Jyy'=w i*om) (\{n + m + ρ - 1); Г(/p + Е^УГ)-1Е-1ГБ) ЙУ. (4.8.5) Since Б is a symmetric matrix, the integral (4.8.5) is invariant under the transformation В —>· ΗΒΗ', Η Ε O(ra). Hence, from (4.8.5), using Theorem 1.6.1, we obtain f{w) = r[|(n + m + p-l)] ^ |p π*Τρβ(η + ρ-1)] K ' K > [ det(7p + s-iyy')-§(n+m+p-D ι*ο"° (^(n + m + ρ - 1); (IP + Σ^ΥΥΤ^ΥΥ'', -В) <*У. (4.8.6) Finally using Theorem 1.4.10 we get the desired result. ■ If m < p, then Τ ~ Tm,p(n, 0, Ω, Σ) and for Α (ρ χ ρ) > 0, the density of R = TAT is obtained from the above theorem as {^(^(n + p + m-lJ.ipJpdetinj-iMetiEArb det^)^-™-1) det(/m + Q-i^)-K-+p+^-D lFp\^(n + p + m-l);(Im + n-lR)-ln-lR,B),R>0, where В = Ip- Α~^Σ~ιΑ~^ The synthetic representation (4.2.2) can also be used to obtain the density of W = TAT, where Τ ~ Τρ?7η(η,Μ,Σ,Ω), ρ < га. Briefly the approach is as follows. The matrix Τ can be represented as
$$T=(S^{-\frac12})'X+M,$$

where independently $S\sim W_p(n+p-1,\Sigma^{-1})$ and $X\sim N_{p,m}(0,I_p\otimes\Omega)$. Then $T|S\sim N_{p,m}(M,S^{-1}\otimes\Omega)$, and $W|S$ is distributed as $YAY'$, where $Y\sim N_{p,m}(M,S^{-1}\otimes\Omega)$. Such quadratic forms have been studied in Chapter 7. From there, the conditional density of $W$ given $S$, $f_{W|S}(W|S)$, can be obtained. Then the unconditional density of $W$, $f_W(W)$, is

$$f_W(W)=E_S[f_{W|S}(W|S)],$$

where $E_S$ denotes the expectation with respect to $S\sim W_p(n+p-1,\Sigma^{-1})$. When $M=0$, the null density of $W$ can be obtained by using either (7.2.1) or (7.2.5) or (7.2.7). For $M\neq 0$, Marx (1983), using the density function in Theorem 7.6.2, has obtained the nonnull density of $W$ in terms of an integral. The density of $T'BT$, where the matrix $T$ $(p\times m)$, $(p>m)$, has a lower noncentral t-distribution, has been derived by Hayakawa (1985) in terms of invariant polynomials (Davis, 1979, 1980).

Quadratic forms in disguised matrix variate t have been found useful in the study of simultaneous equations in econometric problems, e.g., see Tiao and Zellner (1964), Goldberger (1970), Chang (1972), Tiao, Tan and Chang (1970), and Tan (1973). The next two theorems give the distributions of quadratic forms in $T$.

THEOREM 4.8.2. Let $T$ $(p\times m)$ be distributed as a lower or upper disguised matrix variate t with parameters $n$, $M$, and $\Sigma$. Then $(T-M)(T-M)'$ is distributed as $Y'Y$, where $Y\sim T_{m,p}(n-p-m+1,0,I_m,\Sigma)$.

Proof: According to Theorems 4.5.1 and 4.5.2, $T-M$ can be represented as

$$T-M=XU^{-1},$$

where $X$ and $U$ are independent, $X\sim N_{p,m}(0,\Sigma\otimes I_m)$, $U'U=S\sim W_m(n-p,I_m)$, and $U$ is a lower or upper triangular matrix with positive diagonal elements. Then

$$(T-M)(T-M)'=X(U'U)^{-1}X'=XS^{-1}X'=(S^{-\frac12}X')'(S^{-\frac12}X')=Y'Y,$$

where $S^{\frac12}$ is the symmetric positive definite square root of $S$. Since $X'\sim N_{m,p}(0,I_m\otimes\Sigma)$ and $S\sim W_m(n-p,I_m)$ are independent, $S^{-\frac12}X'=Y\sim T_{m,p}(n-p-m+1,0,I_m,\Sigma)$. ■

The density of $Y'Y$ is given in (4.8.1).

THEOREM 4.8.3.
Let A{p χ ρ) be a positive semidefinite matrix of rank r(<p). (i) If Τ ~ βΤρ?7η(η,Μ,Σ); and ΑΣΑ = A, then the p.d.f of Wx = (T - MY A (Τ -Μ) is {pm(\r, i(n - ρ))}'' detO^)^™1* det(/m + И^)-*<»-|Н--т-1) m Π det((Jm + WiJh)"1, r > ra, η > m + p, and (ii) if Τ ~ ШРуГП(п, Μ, Σ), and ΑΣΑ = A, then the p.d.f. ofW2 = (T-MYA(T- M) is
PROBLEMS 159 {Pm(\r, i(n - p))}_1 det(^2)5('—-1) det(/m + w2)-^-^-m~^ m Π det((/m + W2)W)~\ r > ra, η > ra + p, Proof: From Theorem 4.5.1, we can write (T - M)'A(T -M) = (U~lYQU~l where Q = X'AX, X ~ iVp,m(0,E <g> Jm), U'U ~ V^m(n - p, Jm) and C/ is a lower triangular matrix with positive diagonal elements. Using Theorem 3.2.6 we have Q~Wm(r9Im). Now the joint density of U and Q is {11 "^ —1 m 2Ы»-г*-*)гт(-г)Гт[-(п -ρ)]} det(trtO*(n-,,-ra-1) Π4 dettQ)^-™-1) etr {- |(Q + UU')}. Transforming Wx = (JJ-^'QU-1, Ζ = U'(Im + WX)U with the Jacobian J(Q,U ->· WUZ) = det([/'[/)5(m+1)2-mn™i{4det((/m + Wi)w)}-\ the joint density of Wx and Ζ is given by j2im(n-i*T) rm[i(n + г - ρ)] Γ1 det(Z)*(n+r-I,-m-« etr (- iz) |/3m(ir, i(n -ρ))}"' det(W1)i<'-m-1> det(7m + iy1)-*(n-p+r-m-1) 77г Ildetii/n. + lW-1. (4-8.7) г=1 From (4.8.7) it is easy to see that W\ and Ζ are independent and the distribution of Wi is given in (i) above. Similarly one can prove the result (ii). ■ PROBLEMS 4.1. Let S ~ Wp(n + p - Ι,Σ"1), independent of Χ ~ ΛΓΡ}7η(0,Φ <g> Ω). Define T = Φ5(5-5)'Φ"~5Χ + Μ, where Μ (ρ χ га) is a constant matrix, S$(Si)' = S and φέ(φέ)' = Φ. Prove that Τ - Tp?m(n, Μ, Φ*Σ(Φ*)', Ω). (Marx, 1981) 4.2. Let Τ ~ Tp?m(n, Μ, Σ, Ω). Then show that the characteristic function of Τ is {Γρ(^)}-1 βίφΖΜ')Β-*(-ΖΩΖ'Σ), Z(px ra), where <5 = |(n + p — 1) and £a(·) is the type two Bessel function of Herz
160 CHAPTER 4. MATRIX VARIATE t-DISTRIBUTION defined in Section 1.6. 4.3. Let Τ ~ ΤΡιΤη(η, Μ, Ε, Ω), then prove that (i) EiTCTDTFT) = c1EF,QDEC,Q + c2{ED,QFEC,Q + tr(F'Q.DE)EC"Q} + c&C'SlDUF'Q, + ^{ED'QCEF'Q + tr(C,QDE)EF,Q} + ci tr(F'QCE)E.D'Q + ο2{Ε6"Ω.ΡΕ£>Ώ + EFQCED'Q} + (n - 2)"1{ED,M,C/QFM + HF'M'D'M'C'Sl + HC'SIDMFM + MCHD'SIFM + MCUF'M'D'Sl + MCMDUF'Q,} + MCMDMFM. (ii) £(TCT'DTFT) = tr(QF,QC,){citr(DE)E + c2ED,E + c2EDE} + tr(CQ) tr(FQ){ciE.DE + с2Е£>'Е + c2 tr(DE)E} + tr(CQF,Q){c1ED,E + c2E£>E + c2 tr(Z7E)E} + (η - 2)-1{tr(C,Q)EDMFM/ + HD'MC'SIFM' + tr(FM,D,MC,Q)E + tr(D,E)MCQFM/ + MCSIF'M'D'Y, + tr(FQ)MCM,DE} + MCM'DMFM', where the matrices C, Д and F are of appropriate order, c\ = (n — 3)c2, c2 = {(n - l)(n - 2)(n - 4)}"1, and η > 4. 4.4. Let the joint density of Χ (ρ χ га) and Ω (m χ га) > 0 be 2 -1 m(p+n+m-1) ^ - i mp rm[i(n + m-l)] • det(E)-5mdet(*)-5(n+m-1' det(^)-^n+p+2m'> etr i{(X - M)"Z~l{X -M) + Щ0Г1 X eW* Then, prove that (i) given Ω, X ~ Np,m(M, Σ ® Ω), (ii) Ω ~ IWm{n + 2m, Φ), and (iii) X ~ Tp,m(n, Μ, Σ, Φ) . 4.5. In Problem 4.4, let Ω = (^ £>), Ωη Κ χ m,). * = (£ £)> Фп (rrii xmi), mi+m2 = га, Ω22α = Ω22—Ω21Ω111Ω12, Φ22.ι = Φ^-Φ^Φι/Φ^ and T = Ω^Ω^. Prove that (i) Ωχι and (Τ,Ω22.χ) are independent, (и)Пц~Л^т1(п + 2т,Фц), (iii) Ω22.ι ~ IWm2(n + 2ra + rab Φ22.ι), and (iv) Τ ~ Ттаьта2(п + 2ra + mx, Ф^Фи, »n» φ22·ι)·
PROBLEMS 161 4.6. Let X ~ iVp}7n(0, Σ <g> Ω) and υ ~ χ^ be independent. Prove that the p.d.f. of T=(l)-"Xis Щ(п + тр)} det(£)-bdet(n)-b (птг)ЬрГ(|п) (ι + - tr(E-1TO-1r'))~|(n+mp), τ g Rpxm. 4.7. In Problem 4.6, prove that the distribution of 5 = TQ.~lT', for m > p, is r[l(n + mp)] det(E)4mdet(s)^(n^i)(1 + lte(rig))->(-^)> 5 > o. п*"*Г(±п)Гр(±т) ^ ; V ; V η ν ^ 4.8. Let 5 ~ Wp(n + ρ - 1, Ip) and X ~ NP,m(0, Σ"1 <g> Ω) be independent. Prove that Τ = (5"5);X + Μ ~ Τρ?7η(η,Μ,Σ,Ω), where M(p χ m) is a constant matrix and 52(52)' = 5. 4.9. Let Τ (η, Μ, Σ, Ω), and A (ra χ £), and С (т х s) be constant matrices. Prove that cov(TA,TC) = (n - 2)"ΧΣ (g) (ΑΏΟ), η > 2 and hence show that cov(ti,ti) = (n-2)-1tjyE, where Ω = (ωίό) and Τ = (tu ..., tm). 4.10. Let Τ r>*J ^v τη (η, Μ, Σ, Ω), and В (г χ ρ) and D (s χ ρ) be constant matrices. Prove that cov(£T, DT) = {n- 2)"ΧΩ <g> BED', η > 2 and hence show that cov(tJ,t;) = (n-2)-1a0A where Σ = (σ^·) and £*' is the zth row of the matrix T. 4.11. Let Τ ~ /TPiTO(n, Μ, Σ, Ω). Show that the m.g.f. of Τ is etr(ZM;)0Fi(-(n + m+p- 1);-ΖΩΖ'Σ), Ζ (ρ χ га), where 0^1 (·) is defined in Section 1.6. 4.12. Let Τ ~ 1ТРуГП(п, Μ, Σ, Ω) and Б (га χ r) be a matrix of rank r < ra. Then, prove that ТБ ~ JTp,m(n + ra - r, MB, Σ, ΒΏΒ). (HINT: Let B0 = (Β Βχ) where Bi (rax(ra—r)) is such that В is nonsingular and then find the marginal distribution of ТВ from the distribution of TB0.) 4.13. Let Xi ~ ATp?m.(0, Σ <g> Jmi), г = 1,..., к and 5 ~ Wp(n, Σ) be independent. Further, let Sk = S + Σ*=ι XjX'j and 7} = (5 + H=i ВД)"^,·, j = 1,..., fc. Then, show that Tb ..., Tk are independent and derive the distribution of 7}.
162 CHAPTER 4. MATRIX VARJATE t-DISTRIBUTION 4.14. Let Τ ~ ITPtm(n, Μ, Σ, Ω), and partition Τ, Μ, Σ and Ω as т=(^1г)Р1=(т1с T2c\M=(™lr)Pl=(Mlc м2с), mi m2 Σ=(Σΐ1 El2V1,andQ=ffiu fil2>)mi. \Σ2ι Σ22/ p2 \Ω2ι Ω22/ m2 Pi £>2 rni ra2 Then, prove that (i) T2r ~ IT^m{n + p2, M2r, Σ22, Ω), Tlr\T2r ~ /TPl,m(n,Mlr + ^12^22 (^2r ~~ ^2r)> Ση.2, «(An " (T* - Μ^'Σ^Τ* - M2r))) and (ii) T2c ~ ITp,m2(n + m2, M2c, Σ,Ω22), Tlc\T2c ~ /^^(η,Μ^+^-Μ^Ω^Ω,!, (/p - (T2c - Μ^Ω^ - Μ2ο)')Σ,Ωη.2). 4.15. Let Τ = (tbt2,...,tTO) ~ /TPiTO(n,Μ,Σ,Ω) and denote its p.d.f. by p(T). Further let /(ух |у2) be the conditional density of yx given y2. Using suitable notation, write down explicitly p{T) = /i(ti)/2(t2|ti)/3(t3|tb h) · · · fm(tm\tu ..., t^). 4.16. Prove Theorem 4.5.2. 4.17. Prove Theorem 4.5.3(ii). 4.18. Prove Theorem 4.5.4(iv). 4.19. Let Τ ~ 2ΖΓΡ}7η(η, Μ, Σ), and A(t χ p) and Cfsxp) be constant matrices. Prove that cov(AT, CT) = В <g> (ΑΣσ;), where Б is a diagonal matrix given in Theorem 4.5.3(i). Also, show that where Σ = (σ^·) and t*' is the zth row of the matrix T. 4.20. Let Τ ~ ϋΤΡιπι(η,Μ,Σ), and A(ixp) and C(sxp) be constant matrices. Derive the results stated in Problem 4.19, where the matrix В is now given in Theorem 4.5.3(ii). 4.21. Let S ~ Wp(njp,0), where Θ = diag(0n,O,... ,0) and partition S as S = / S S \ \ ol a2 )' ^n (QxQ)· Prove that the distribution of В = 5^512 is given by V b2i b22 / 'ii^w'·"'*'det<7'+sв'г,<"+'",, л &"+p - * J"; 5δ""")· where (wij) = (Iq + BBf)~\ (Juritz and Troskie, 1976)
PROBLEMS 163 4.22. Let the joint density of Χ (ρ χ га) and Σ (ρ χ ρ) > 0 be 2-i(n+m+p-s-l)p -|(m-s)p -7Y7 det(Q)-2pdet(C,QC)2pdet(^)2(n+p"1) Γρ[|(η + ρ-1)] V ; V ; V ; det(E)^m+n-s"2) etr {- hl{X - M)Q~l(X - M)' - ^ΣΦ}, where the domain of definition of X is restricted to all X Ε Rpxm such that XC = 0, and MC = 0, for a fixed matrix С (m χ s) of rank 5 < ra. Then, prove that (i) given Σ, X ~ TV^M, Σ"1 <g> Ω|β, С) (ii) Σ~ Wp(n + p- Ι,Φ"1), and (iii) Χ ~Τρ,4η,Μ, Φ,Ω|5,С). 4.23. If Τ ~ ΤΡ}7η(η,Μ,ηΣ,Ω|5,α), then prove that Τ Д X as η -> oo, where Χ~^(Μ,Σ®Ω|β,σ). 4.24. Prove Theorem 4.6.2. (HINT: Use the transformation W = BTD, with the Jacobian J{T -> W) given in Lemma 2.6.1.) 4.25. Let Τ ~ Тр,т(п, Μ, Σ <g> Ω|β, С) and partition Τ, Μ and Ω as Ώχι Ωχ2\ 771 χ T = (Tlc T2c),M=(Mlc М2с),П = πΐι ra2 rni m2 ^21 "22/ m2 77Τ-ι ^2 where πΐ\ + ra2 = га. Then, prove that (i) Tic~TPimi(n,Mlc,E^n|s,C) and (ii) T2c|Tlc - Tp,m2(n + mb M2c + Ω21ΩΓι1(Τ1ο - Mlc)', (Σ + (Tlc - Μ^ΩΓχ1^ - Mlcy)^22.ik, C). 4.26. Let 5 ~ Wp(n + p- l,a/p), a > 0, and X ~ ΛΓΡ}7η(0, Jp<g> Jm) be independently distributed. Derive the distribution of Τ = (5 + ХХ')~зХ. 4.27. Let ρ χ 1 random vectors xu...,xn have the joint density π2^Γ(|ι/) I i=1 ) where ι/ > 0 and Λ (ρ χ ρ) > 0. Define A = Σ%=ι(χά - *){nj - *)', -/V* = T,jLi Xj and η = AT - 1 > p. Show that (i) the p.d.f. of χ is π2*Τ(±ι/) L J
164 CHAPTER 4. MATRIX VARIATE t-DISTRIBUTION (ii) the p.d.f. of A is * ρ Д ρ ^T)] det(A)-*" aet(A)^-?~V{v + ь^-Ы)}-*^), A > 0, (Ш) £[de.W1 - ^ί^ffiffW." > 2'»· И s[deWA] = ^(in + .)i^^i^)det(ArA, ν > 2(rp + 1) and (v) E[CK(A)) = Sn*V{~v)k\\n)CK{A), ν > 2k. (Sutradhar and Ali, 1989; Joarder and Ali, 1992)
CHAPTER 5

MATRIX VARIATE BETA DISTRIBUTIONS

5.1. INTRODUCTION

The random variable $u$ with the p.d.f.

$$\{\beta(a,b)\}^{-1}u^{a-1}(1-u)^{b-1},\quad 0<u<1, \tag{5.1.1}$$

where $a>0$ and $b>0$, is said to have a beta type I distribution with parameters $(a,b)$. The random variable $v$ with p.d.f.

$$\{\beta(a,b)\}^{-1}v^{a-1}(1+v)^{-(a+b)},\quad v>0, \tag{5.1.2}$$

where $a>0$ and $b>0$, is said to have a beta type II distribution with parameters $(a,b)$. Since (5.1.2) can be obtained from (5.1.1) by the transformation $v=\frac{u}{1-u}$, some authors call the distribution of $v$ an inverted beta distribution. In this chapter, we study several generalizations that lead to matrix variate analogs of the beta type I and type II distributions.

5.2. DENSITY FUNCTIONS

First we shall define the matrix variate beta distributions of type I and type II.

DEFINITION 5.2.1. A $p\times p$ random symmetric positive definite matrix $U$ is said to have a matrix variate beta type I distribution with parameters $(a,b)$, denoted as $U\sim B_p^{I}(a,b)$, if its p.d.f. is given by

$$\{\beta_p(a,b)\}^{-1}\det(U)^{a-\frac12(p+1)}\det(I_p-U)^{b-\frac12(p+1)},\quad 0<U<I_p, \tag{5.2.1}$$

where $a>\frac12(p-1)$, $b>\frac12(p-1)$, and $\beta_p(a,b)$ is the multivariate beta function given by (1.18).
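The link between (5.1.1) and (5.1.2) can be checked numerically. The sketch below evaluates both univariate densities and confirms that substituting $u=v/(1+v)$ with Jacobian $(1+v)^{-2}$ turns the type I density into the type II density. It also evaluates $\beta_p(a,b)=\Gamma_p(a)\Gamma_p(b)/\Gamma_p(a+b)$ on the log scale, assuming the usual definition $\Gamma_p(a)=\pi^{p(p-1)/4}\prod_{i=1}^{p}\Gamma(a-\tfrac12(i-1))$; the function names are illustrative.

```python
import math

def log_multigamma(a, p):
    """log Gamma_p(a) = p(p-1)/4 * log(pi) + sum_{i=1}^p log Gamma(a - (i-1)/2)."""
    return (0.25 * p * (p - 1) * math.log(math.pi)
            + sum(math.lgamma(a - 0.5 * i) for i in range(p)))

def log_multibeta(a, b, p):
    """log beta_p(a, b) = log Gamma_p(a) + log Gamma_p(b) - log Gamma_p(a + b)."""
    return log_multigamma(a, p) + log_multigamma(b, p) - log_multigamma(a + b, p)

def beta1_pdf(u, a, b):
    """Univariate beta type I density (5.1.1)."""
    return math.exp(-log_multibeta(a, b, 1)) * u ** (a - 1) * (1 - u) ** (b - 1)

def beta2_pdf(v, a, b):
    """Univariate beta type II density (5.1.2)."""
    return math.exp(-log_multibeta(a, b, 1)) * v ** (a - 1) * (1 + v) ** (-(a + b))

a, b, v = 2.5, 4.0, 0.7
u = v / (1 + v)              # inverse of the transformation v = u / (1 - u)
jacobian = (1 + v) ** -2     # du/dv
lhs = beta1_pdf(u, a, b) * jacobian
rhs = beta2_pdf(v, a, b)
```

Here `lhs` and `rhs` agree up to rounding, which is the change of variables behind the name "inverted beta".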
Using Theorem 1.6.8, the c.d.f. of $U$ is obtained as

$$P(U<\Lambda)=\frac{\Gamma_p(a+b)\,\Gamma_p[\tfrac12(p+1)]}{\Gamma_p(b)\,\Gamma_p[a+\tfrac12(p+1)]}\det(\Lambda)^{a}\,{}_2F_1\left(a,-b+\tfrac12(p+1);a+\tfrac12(p+1);\Lambda\right),\quad 0<\Lambda<I_p.$$

DEFINITION 5.2.2. A $p\times p$ random symmetric positive definite matrix $V$ is said to have a matrix variate beta type II distribution with parameters $(a,b)$, denoted as $V\sim B_p^{II}(a,b)$, if its p.d.f. is given by

$$\{\beta_p(a,b)\}^{-1}\det(V)^{a-\frac12(p+1)}\det(I_p+V)^{-(a+b)},\quad V>0, \tag{5.2.2}$$

where $a>\frac12(p-1)$ and $b>\frac12(p-1)$.

As in the univariate case, the density (5.2.2) can be obtained from (5.2.1) by transforming $U=(I_p+V)^{-1}V$, together with the Jacobian $J(U\to V)=\det(I_p+V)^{-(p+1)}$. The matrix variate beta type II distribution is also known as the matrix variate F-distribution. These distributions belong to the class of orthogonally invariant and residual independent distributions discussed in Chapter 9.

The c.d.f. of $V$, using the transformation $U=(I_p+V)^{-1}V$ and Theorem 1.6.8, is given by

$$P(V<\Lambda)=\frac{\Gamma_p(a+b)\,\Gamma_p[\tfrac12(p+1)]}{\Gamma_p(b)\,\Gamma_p[a+\tfrac12(p+1)]}\det\left((I_p+\Lambda)^{-1}\Lambda\right)^{a}\,{}_2F_1\left(a,-b+\tfrac12(p+1);a+\tfrac12(p+1);(I_p+\Lambda)^{-1}\Lambda\right),\quad \Lambda>0.$$

By means of a bilinear transformation of the random matrix $U$, a generalized matrix variate beta type I distribution is generated as given in the following theorem.

THEOREM 5.2.1. Let $U\sim B_p^{I}(a,b)$. Then for given $p\times p$ symmetric matrices $\Phi\ (\geq 0)$ and $\Omega\ (>\Phi)$, the random matrix $X$ $(p\times p)$ defined by

$$X=(\Omega-\Phi)^{\frac12}U(\Omega-\Phi)^{\frac12}+\Phi \tag{5.2.3}$$

has the p.d.f.

$$\frac{\det(X-\Phi)^{a-\frac12(p+1)}\det(\Omega-X)^{b-\frac12(p+1)}}{\beta_p(a,b)\det(\Omega-\Phi)^{(a+b)-\frac12(p+1)}},\quad \Phi<X<\Omega. \tag{5.2.4}$$

Proof: The Jacobian of the transformation (5.2.3) is $J(U\to X)=\det(\Omega-\Phi)^{-\frac12(p+1)}$. Hence, the p.d.f. of $U$ is transformed to the p.d.f. of $X$ given by (5.2.4). ■

DEFINITION 5.2.3. A $p\times p$ random symmetric positive definite matrix $X$ is said to have a generalized matrix variate beta type I distribution with parameters $a$, $b$; $\Omega$; $\Phi$, denoted by $X\sim GB_p^{I}(a,b;\Omega,\Phi)$, if its p.d.f. is given by (5.2.4).
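The bilinear map (5.2.3) and its inverse are straightforward to compute. The following sketch, with made-up $2\times 2$ inputs and hypothetical helper names, uses the symmetric square root of $\Omega-\Phi$ and checks that the image of a matrix with $0<U<I_p$ satisfies $\Phi<X<\Omega$.

```python
import numpy as np

def sym_sqrt(A):
    """Symmetric positive definite square root via the eigendecomposition."""
    w, Q = np.linalg.eigh(A)
    return (Q * np.sqrt(w)) @ Q.T

def to_generalized(U, Omega, Phi):
    """Map (5.2.3): X = (Omega - Phi)^{1/2} U (Omega - Phi)^{1/2} + Phi."""
    R = sym_sqrt(Omega - Phi)
    return R @ U @ R + Phi

def to_standard(X, Omega, Phi):
    """Inverse map: U = (Omega - Phi)^{-1/2} (X - Phi) (Omega - Phi)^{-1/2}."""
    R_inv = np.linalg.inv(sym_sqrt(Omega - Phi))
    return R_inv @ (X - Phi) @ R_inv

def is_pd(A):
    """True if the symmetric part of A is positive definite."""
    return bool(np.all(np.linalg.eigvalsh((A + A.T) / 2) > 0))

# Illustrative inputs: U with 0 < U < I_2, Phi >= 0 and Omega > Phi.
U = np.array([[0.5, 0.1], [0.1, 0.3]])
Phi = np.array([[1.0, 0.2], [0.2, 0.5]])
Omega = Phi + np.array([[2.0, -0.3], [-0.3, 1.0]])
X = to_generalized(U, Omega, Phi)
```

Because $X-\Phi=RUR$ and $\Omega-X=R(I_p-U)R$ with $R=(\Omega-\Phi)^{\frac12}$, both differences inherit positive definiteness from $U$ and $I_p-U$.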
When $\Phi=0$ and $\Omega=I_p$, the above definition yields the standard beta type I distribution (5.2.1). Further, if $X\sim GB_p^{I}(a,b;\Omega,\Phi)$, then $(\Omega-\Phi)^{-\frac12}(X-\Phi)(\Omega-\Phi)^{-\frac12}\sim B_p^{I}(a,b)$.

THEOREM 5.2.2. Let $V\sim B_p^{II}(a,b)$. For given $p\times p$ symmetric matrices $\Phi\ (\geq 0)$ and $\Omega\ (>\Phi)$, the random matrix $Y$ defined by

$$Y=(\Omega+\Phi)^{\frac12}V(\Omega+\Phi)^{\frac12}+\Phi \tag{5.2.5}$$

has the p.d.f.

$$\frac{\det(Y-\Phi)^{a-\frac12(p+1)}\det(\Omega+Y)^{-(a+b)}}{\beta_p(a,b)\det(\Omega+\Phi)^{-b}},\quad Y>\Phi. \tag{5.2.6}$$

Proof: The Jacobian of the transformation (5.2.5) is $J(V\to Y)=\det(\Omega+\Phi)^{-\frac12(p+1)}$, from which the p.d.f. of $Y$ follows. ■

DEFINITION 5.2.4. A $p\times p$ random symmetric positive definite matrix $Y$ is said to have a generalized matrix variate beta type II distribution with parameters $a$, $b$; $\Omega$ and $\Phi$ if its p.d.f. is given by (5.2.6). In this case we write $Y\sim GB_p^{II}(a,b;\Omega,\Phi)$.

When $\Phi=0$ and $\Omega=I_p$, the above definition yields the standard beta type II distribution. Further, if $Y\sim GB_p^{II}(a,b;\Omega,\Phi)$, then $(\Omega+\Phi)^{-\frac12}(Y-\Phi)(\Omega+\Phi)^{-\frac12}\sim B_p^{II}(a,b)$.

In univariate statistical analysis, if $x$ and $y$ are independent chi-square random variables with degrees of freedom $n_1$ and $n_2$ respectively, then $\frac{x}{x+y}$ is distributed as beta type I, and $\frac{x}{y}$ is distributed as beta type II. In the multivariate case, the Wishart distribution plays the role of the chi-square distribution, and these ratios have been generalized in many ways. As is often the case, these generalizations can take a number of different forms. Many of these generalizations have been studied extensively in the literature, e.g., see Hsu (1939a), Khatri (1959a, 1970a), Olkin (1959), Olkin and Rubin (1964), Tan (1969c), Mitra (1970), Javier (1982), Javier and Gupta (1985a), and Uhlig (1994). In the next two theorems, we give derivations of beta distributions of type I and II, generalizing the ratios $\frac{x}{x+y}$ and $\frac{x}{y}$ to the matrix case. The other generalizations, which do not lead to the beta distributions (5.2.1) or (5.2.2), will be studied in Section 5.

To derive the beta density from the Wishart density, the following result is needed.
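THEOREM 5.2.3. Let $X\sim N_{p,n_1}(0,\Sigma\otimes I_{n_1})$ and $S_2\sim W_p(n_2,\Sigma)$ be independent. Further let $S=S_2+XX'$ and $Z=S^{-\frac12}X$, where $S^{\frac12}$ is a nonsingular square root of $S$. Then $S$ and $Z$ are independent, $S\sim W_p(n_1+n_2,\Sigma)$ and $Z\sim IT_{p,n_1}(n_2-p+1,0,I_p,I_{n_1})$.

Proof: The joint density of $X$ and $S_2$ is

$$\left\{2^{\frac12(n_1+n_2)p}\,\pi^{\frac12 n_1p}\,\Gamma_p(\tfrac12 n_2)\det(\Sigma)^{\frac12(n_1+n_2)}\right\}^{-1}\det(S_2)^{\frac12(n_2-p-1)}\operatorname{etr}\left\{-\tfrac12\Sigma^{-1}(S_2+XX')\right\},\quad X\in\mathbb{R}^{p\times n_1},\ S_2>0. \tag{5.2.7}$$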
Making the transformation $S_2=S-XX'$, $Z=S^{-\frac12}X$, with Jacobian $J(S_2,X\to S,Z)=\det(S)^{\frac12 n_1}$, from (5.2.7), we get the joint density of $S$ and $Z$ as

$$\left\{2^{\frac12(n_1+n_2)p}\,\Gamma_p[\tfrac12(n_1+n_2)]\det(\Sigma)^{\frac12(n_1+n_2)}\right\}^{-1}\det(S)^{\frac12(n_1+n_2-p-1)}\operatorname{etr}\left(-\tfrac12\Sigma^{-1}S\right)\frac{\Gamma_p[\tfrac12(n_1+n_2)]}{\pi^{\frac12 n_1p}\,\Gamma_p(\tfrac12 n_2)}\det(I_p-ZZ')^{\frac12(n_2-p-1)},\quad Z\in\mathbb{R}^{p\times n_1},\ S>0. \tag{5.2.8}$$

From (5.2.8), it is easily seen that $S$ and $Z$ are independent, $S\sim W_p(n_1+n_2,\Sigma)$ and $Z\sim IT_{p,n_1}(n_2-p+1,0,I_p,I_{n_1})$ as defined in Section 4.4. ■

Now, we give the derivation of the matrix variate beta type I density.

THEOREM 5.2.4. Let $Z$ be defined as in Theorem 5.2.3.
(i) If $n_1\geq p$, then $ZZ'\sim B_p^{I}(\tfrac12 n_1,\tfrac12 n_2)$.
(ii) If $n_1<p$, then $Z'Z\sim B_{n_1}^{I}(\tfrac12 p,\tfrac12(n_1+n_2-p))$.

Proof: (i) From Theorems 5.2.3 and 1.4.10, the density of $U=ZZ'$ is obtained as

$$\begin{aligned}
\frac{\Gamma_p[\tfrac12(n_1+n_2)]}{\pi^{\frac12 n_1p}\,\Gamma_p(\tfrac12 n_2)}\int_{ZZ'=U}\det(I_p-ZZ')^{\frac12(n_2-p-1)}\,dZ
&=\frac{\Gamma_p[\tfrac12(n_1+n_2)]}{\pi^{\frac12 n_1p}\,\Gamma_p(\tfrac12 n_2)}\cdot\frac{\pi^{\frac12 n_1p}}{\Gamma_p(\tfrac12 n_1)}\det(U)^{\frac12(n_1-p-1)}\det(I_p-U)^{\frac12(n_2-p-1)}\\
&=\{\beta_p(\tfrac12 n_1,\tfrac12 n_2)\}^{-1}\det(U)^{\frac12(n_1-p-1)}\det(I_p-U)^{\frac12(n_2-p-1)},
\end{aligned}$$

which proves that $U\sim B_p^{I}(\tfrac12 n_1,\tfrac12 n_2)$.

(ii) Again from Theorem 1.4.10, if $n_1<p$, we get

$$\int_{Z'Z=W}\det(I_{n_1}-Z'Z)^{\frac12(n_2-p-1)}\,dZ=\frac{\pi^{\frac12 n_1p}}{\Gamma_{n_1}(\tfrac12 p)}\det(W)^{\frac12(p-n_1-1)}\det(I_{n_1}-W)^{\frac12(n_2-p-1)},$$

and hence $W\sim B_{n_1}^{I}(\tfrac12 p,\tfrac12(n_1+n_2-p))$. ■

It may be noted that if in (i) above $n_1<p$, or in (ii) $n_1>p$, then the density functions of $U$ and $W$ do not exist, and the distributions are called singular beta distributions (Mitra, 1970). The above results were derived by Hsu (1939a), and by Khatri (1959a) for a triangular root of $S$.

For $n_1\geq p$, from Theorems 5.2.3 and 5.2.4(i), it is observed that $XX'=S_1\sim W_p(n_1,\Sigma)$ and therefore $U=ZZ'=(S_1+S_2)^{-\frac12}S_1((S_1+S_2)^{-\frac12})'\sim B_p^{I}(\tfrac12 n_1,\tfrac12 n_2)$. This, therefore, gives a natural generalization of the ratio $\frac{x}{x+y}$ in the univariate case. It may be noted here that $(S_1+S_2)^{\frac12}$ can be taken any reasonable
5.2. DENSITY FUNCTIONS 169 square root depending on S\ + S2. Mitra (1970) took this square root to be a lower triangular matrix and assumed щ +n2 > ρ only. He then studied certain properties of the random matrix U using the density free approach. Khatri (1970a) further relaxed the restrictions on щ and n2 and derived Mitra's results. THEOREM 5.2.5. Let Sx ~ Wp{nuIp) and S2 ~ Wp(n2Jp) be independent. Define v = s;isls;K where S2 is a symmetric square root of S2. Then, V ~ Β^^Πχ, \n2). Proof: The joint density of S\ and S2 is ^(^^(l^r^)}-1 etr{_ l(5i + 52)} det(51)^ni-p-1)det(52)^(n2-p-1), Si > 0, S2 > 0. Transforming V = SPS^P, with Jacobian J^Si ->■ V,S2) = det(52)5(p+1), we get the joint density of S2 and V as (2|(η1+η2)ΡΓρ(ΐηι)Γρ(ΐη2)|-etr |_ ι{Ιρ + v)Saj det(Vr)*(ni"p-1) det(52)2(ni+n2-p-1), V > 0, S2 > 0. (5.2.9) Now, integrating out S2 from (5.2.9), using the multivariate gamma integral, completes the proof. ■ The above result has been derived by Olkin and Rubin (1964). The converse of this result is not true. That is if X\ (pxp) and X2 (pxp) are independent random matrices and X2 2ΧλΧ2 2 has beta type II distribution, then it does not necessarily follow that Χι and X2 have Wishart density. Hence this property does not characterize Wishart distribution. Roux (1975) has given the following result. THEOREM 5.2.6. Let Xi(p x ρ), г = 1,2 be independent random matrices with density Τρ[αί + \ν+\(ρ+1)} χΛΪ^-Ρ-ι) Гр(а;)Г> + |(р+1)ГЧЛг) ifi (а{ + ^ + -fi> + 1); ν + - (p + 1); -Xi) ,X{>0, where Re(|i/ + \(p + 1) - a<) > \(p - 1), and Re(oi) > \(p - 1), г = 1,2. Then Proof: Making the transformation Υ = X2 2 ΧλΧ2 2 with Jacobian J(XUX2 —>· y?X2) = det(X2)2^+1\ in the joint density of X\ and X2 we get the joint density of Υ and X2 as
170 CHAPTER 5. MATRIX VARJATE BETA DISTRIBUTIONS n{^:|g::ii}"oT^ ^ Λ (αϊ + ii/ + J(p + 1); ι/ + i(p + 1); -Χ2Υ) 1Г1(а2 + ^+±(р+1);и+^(р+1у,-Х2). (5.2.10) Integrating (5.2.10) with respect to X2, the p.d.f. of У is given by Π {pf\+r^+j;P+?i!ldet(y)^-^) / det(X2r /=ДГРК)Г>+|(р+1)]/ Ух2>о lFl (αι + \v + I(p +!);„+ I(p + 1); -Х2Г) i^i(02 + \v+\(p+l);v+^(p+l);-X2) dX2. (5.2.11) Using (1.6.7), we can write iFifa + lv + lfp+lbv+lfp+Vi-XtY) _ I> + l(p+l)] ι etl(_YlXoYls\ detiS)^2^^1^01-^^1) det(/p - 5)ϊ(2"+ρ+ΐ)-^-έ(ρ+ΐ) dS. (5.2.12) Substituting from (5.2.12) in (5.2.11), and using (1.6.4) we get ^2 + ^+\{j>+l)} detfni(*-p-l) Γ^αΟΓρί^Γρ^+^ίρ+Ι)-^] 64rj / det(5)J(2l/+p+1)+ai-5(P+1) det(/p - 5)ϊ(2"+ρ+«-<1ι-έ0'+ι) detryS)""-*^1) J0<S</y 2*Ί (" + ξ(ρ + 1); a2 + \v+\(p+l);v+\{p+ l); -(Г5)"1) d5 = rp[a2 + it/+|(p+l)] i^, r , 1+α2_Λ(ρ+1) Γρ(αι)Γρ(α2)Γρ[^ + l(p + 1) - βι] d6t(r} io<s</P d6t(6) det(7p - 5)i(2^HH-i)-«,-i(iH-i) det(/p + y5)-a2-i(2,+P+i) d5 = rP/Vra2\det(y)°2"l(P+1) 1Ρ(αι)Γρ(α2) 2Fi (αχ + a2; a2 + -i/ + -(p + 1); a2 + -i/ + -(p + 1); -У), [ from = t^/V^I det(r)a2"^+1) det(/p + y)-(-i+^), У > 0 Ιρ(αι)Ιρ(α2) 2 4V from (1.6.8)],
From Theorems 5.2.3 and 5.2.4, it is clear that $U=(S_1+S_2)^{-\frac12}S_1((S_1+S_2)^{-\frac12})'\sim B_p^{I}(\tfrac12 n_1,\tfrac12 n_2)$ and is independent of $S_1+S_2$, which holds for all $\Sigma>0$ and all square roots of $S_1+S_2$ depending on $S_1+S_2$ only. However, analogous results do not hold for $V=S_2^{-\frac12}S_1S_2^{-\frac12}$, since the distribution of $V$ and its independence from $S_1+S_2$ depend on $\Sigma$ and the choice of the square root of $S_2$. If $S_2^{\frac12}$ is a symmetric square root and $\Sigma=\alpha I_p$, then from Theorem 5.2.5, we have $V\sim B_p^{II}(\tfrac12 n_1,\tfrac12 n_2)$, but $V$ is not independent of $S_1+S_2$. If $\Sigma\neq\alpha I_p$, then the distribution of $V$ is not beta type II. If $S_2^{\frac12}$ is taken as a triangular matrix, then $V$ and $S_1+S_2$ are independent for all $\Sigma>0$, but again the distribution of $V$ is not beta type II (see Section 4). Considering these facts, Perlman (1977) defined

$$\tilde V=(S_1+S_2)^{-\frac12}S_1S_2^{-1}(S_1+S_2)^{\frac12}=((S_1+S_2)^{\frac12})'S_2^{-1}S_1((S_1+S_2)^{-\frac12})'=\tilde V'$$

and proved that $\tilde V\sim B_p^{II}(\tfrac12 n_1,\tfrac12 n_2)$, and is independent of $S_1+S_2$ for all $\Sigma>0$ and all choices of square root $(S_1+S_2)^{\frac12}$, provided that the choice is made in a measurable way depending only on $S_1+S_2$.

5.3. PROPERTIES

In this section, we study some properties of the random matrices distributed as matrix variate beta type I and II.

THEOREM 5.3.1. Let $U\sim B_p^{I}(a,b)$ and $A$ $(p\times p)$ be a constant nonsingular matrix. Then $AUA'\sim GB_p^{I}(a,b;AA',0)$.

Proof: The density of $U$ is

$$\{\beta_p(a,b)\}^{-1}\det(U)^{a-\frac12(p+1)}\det(I_p-U)^{b-\frac12(p+1)},\quad 0<U<I_p. \tag{5.3.1}$$

Making the transformation $X=AUA'$, with the Jacobian $J(U\to X)=\det(A)^{-(p+1)}$, the density of $X$, obtained from (5.3.1), is

$$\{\beta_p(a,b)\}^{-1}\det(AA')^{-(a+b)+\frac12(p+1)}\det(X)^{a-\frac12(p+1)}\det(AA'-X)^{b-\frac12(p+1)},\quad 0<X<AA',$$

which is the desired result. ■

A similar result for the beta type II distribution is given in the next theorem.

THEOREM 5.3.2. Let $V\sim B_p^{II}(a,b)$ and $A$ $(p\times p)$ be a constant nonsingular matrix. Then $AVA'\sim GB_p^{II}(a,b;AA',0)$.
Proof: The density of $V$ is

$$\{\beta_p(a,b)\}^{-1}\det(V)^{a-\frac12(p+1)}\det(I_p+V)^{-(a+b)},\quad V>0. \tag{5.3.2}$$

Making the transformation $Y=AVA'$, with the Jacobian $J(V\to Y)=\det(A)^{-(p+1)}$, the density of $Y$, obtained from (5.3.2), is

$$\{\beta_p(a,b)\}^{-1}\det(AA')^{b}\det(Y)^{a-\frac12(p+1)}\det(AA'+Y)^{-(a+b)},\quad Y>0,$$

which completes the proof of the theorem. ■

In the next two theorems, it is shown that matrix variate beta distributions are orthogonally invariant.

THEOREM 5.3.3. Let $U\sim B_p^{I}(a,b)$ and $H$ $(p\times p)$ be an orthogonal matrix, whose elements are either constants or random variables distributed independently of $U$. Then, the distribution of $U$ is invariant under the transformation $U\to HUH'$, and is independent of $H$ in the latter case.

Proof: First, let $H$ be a constant matrix. Then, from Theorem 5.3.1, $HUH'\sim B_p^{I}(a,b)$ since $HH'=I_p$. If, however, $H$ is a random orthogonal matrix, then $HUH'|H\sim B_p^{I}(a,b)$. Since this distribution does not depend on $H$, $HUH'\sim B_p^{I}(a,b)$. ■

THEOREM 5.3.4. Let $V\sim B_p^{II}(a,b)$ and $H$ $(p\times p)$ be an orthogonal matrix whose elements are either constants or random variables distributed independently of $V$. Then, the distribution of $V$ is invariant under the transformation $V\to HVH'$, and is independent of $H$ in the latter case.

Proof: Similar to the proof of Theorem 5.3.3. ■

The relationship between beta type I and type II matrices is now exhibited. First, we derive the densities of $U^{-1}$ and $V^{-1}$.

THEOREM 5.3.5. Let $U\sim B_p^{I}(a,b)$; then the density of $X=U^{-1}$ is

$$\{\beta_p(a,b)\}^{-1}\det(X)^{-(a+b)}\det(X-I_p)^{b-\frac12(p+1)},\quad X>I_p, \tag{5.3.3}$$

where $a>\frac12(p-1)$ and $b>\frac12(p-1)$.

Proof: Making the transformation $X=U^{-1}$, with the Jacobian $J(U\to X)=\det(X)^{-(p+1)}$, in the density of $U$, the result follows. ■

Now (5.3.3) may be called the inverse beta type I density and denoted by $IB_p(a,b)$. From Theorem 5.3.5, it is clear that if $U\sim B_p^{I}(a,b)$, then $U^{-1}$ does not follow the beta type I distribution. However, it is easily seen that $I_p-U\sim B_p^{I}(b,a)$ and $U^{-1}-I_p\sim B_p^{II}(b,a)$.
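The two closing relations can be seen concretely through the Wishart construction of Theorem 5.2.4, where $U=(S_1+S_2)^{-\frac12}S_1((S_1+S_2)^{-\frac12})'$. In that representation $I_p-U$ is the same expression with $S_1$ and $S_2$ interchanged, which is why the parameters swap. The sketch below is illustrative only: it takes $\Sigma=I_p$, integer degrees of freedom, and the Cholesky factor as the square root, and checks the algebra for a single draw.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n1, n2 = 3, 7, 9          # illustrative sizes with n1 >= p

def wishart_identity(dof, p, rng):
    """Draw W_p(dof, I_p) as G G' with G a p x dof standard normal matrix."""
    G = rng.standard_normal((p, dof))
    return G @ G.T

S1 = wishart_identity(n1, p, rng)
S2 = wishart_identity(n2, p, rng)

# Square root of S1 + S2 chosen as the lower Cholesky factor L, so L L' = S1 + S2.
L_inv = np.linalg.inv(np.linalg.cholesky(S1 + S2))
U = L_inv @ S1 @ L_inv.T
U_complement = L_inv @ S2 @ L_inv.T          # same construction with S1, S2 swapped

eig_U = np.linalg.eigvalsh(U)
A = np.linalg.inv(U) - np.eye(p)
eig_inv_minus_I = np.linalg.eigvalsh((A + A.T) / 2)
```

For this draw, `np.eye(p) - U` coincides with `U_complement`, the eigenvalues of `U` lie in $(0,1)$, and those of $U^{-1}-I_p$ are the positive numbers $(1-\lambda)/\lambda$ for each eigenvalue $\lambda$ of $U$, consistent with a type II matrix being positive definite without an upper bound.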
For a beta type II random matrix $V$, the distribution of $V^{-1}$ is also beta type II, as shown in the following theorem.
THEOREM 5.3.6. Let $V\sim B_p^{II}(a,b)$; then $Y=V^{-1}\sim B_p^{II}(b,a)$.

Proof: Making the transformation $Y=V^{-1}$, with the Jacobian $J(V\to Y)=\det(Y)^{-(p+1)}$, in the density of $V$, the result follows. ■

THEOREM 5.3.7. (i) Let $U\sim B_p^{I}(a,b)$ and $V=(I_p-U)^{-\frac12}U(I_p-U)^{-\frac12}$; then $V\sim B_p^{II}(a,b)$.
(ii) Similarly, if $V\sim B_p^{II}(a,b)$ and $U=(I_p+V)^{-\frac12}V(I_p+V)^{-\frac12}$, then $U\sim B_p^{I}(a,b)$.

Proof: (i) Since $U$ commutes with any rational function of $U$, $V=(I_p-U)^{-\frac12}U(I_p-U)^{-\frac12}=(I_p-U)^{-1}U$, and the Jacobian of this transformation is $J(U\to V)=\det(I_p+V)^{-(p+1)}$. Now, making the substitution in the density of $U$ given by (5.2.1), the result follows.
(ii) The proof is similar to part (i). ■

The characteristic functions of $U$ and $V$ are now obtained in the following theorems.

THEOREM 5.3.8. Let $U\sim B_p^{I}(a,b)$. Then the characteristic function of $U=(u_{ij})$, i.e., the joint characteristic function of $u_{11},u_{12},\ldots,u_{pp}$, is

$$\phi_U(Z)={}_1F_1(a;a+b;\iota Z),$$

where $Z=Z'$ $(p\times p)=\left(\tfrac12(1+\delta_{ij})z_{ij}\right)$ and $\iota=\sqrt{-1}$.

Proof: By definition,

$$\begin{aligned}
\phi_U(Z)&=E[\operatorname{etr}(\iota ZU)]\\
&=\{\beta_p(a,b)\}^{-1}\int_{0<U<I_p}\operatorname{etr}(\iota ZU)\det(U)^{a-\frac12(p+1)}\det(I_p-U)^{b-\frac12(p+1)}\,dU\\
&={}_1F_1(a;a+b;\iota Z).
\end{aligned}$$

The last equality follows from Corollary 1.6.3.1. ■

It may be noted here that if $X\sim GB_p^{I}(a,b;\Omega,\Phi)$, then the characteristic function of $X$ can be obtained from the above theorem. Since $X=(\Omega-\Phi)^{\frac12}U(\Omega-\Phi)^{\frac12}+\Phi$, where $U\sim B_p^{I}(a,b)$, we have

$$\begin{aligned}
\phi_X(Z)&=E[\operatorname{etr}(\iota ZX)]\\
&=E[\operatorname{etr}\{\iota Z((\Omega-\Phi)^{\frac12}U(\Omega-\Phi)^{\frac12}+\Phi)\}]\\
&=\operatorname{etr}(\iota Z\Phi)\,E[\operatorname{etr}\{\iota(\Omega-\Phi)^{\frac12}Z(\Omega-\Phi)^{\frac12}U\}]\\
&=\operatorname{etr}(\iota Z\Phi)\,\phi_U\left((\Omega-\Phi)^{\frac12}Z(\Omega-\Phi)^{\frac12}\right)\\
&=\operatorname{etr}(\iota Z\Phi)\,{}_1F_1(a;a+b;\iota Z(\Omega-\Phi)).
\end{aligned} \tag{5.3.4}$$
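The two maps in Theorem 5.3.7 are mutual inverses, and the symmetric sandwich form coincides with the one-sided product $(I_p-U)^{-1}U$ because $U$ commutes with functions of itself. A small numerical sketch, with an arbitrary test matrix and hypothetical helper names, makes this concrete:

```python
import numpy as np

def sym_inv_sqrt(A):
    """Inverse of the symmetric positive definite square root of A."""
    w, Q = np.linalg.eigh(A)
    return (Q / np.sqrt(w)) @ Q.T

def type1_to_type2(U):
    """V = (I - U)^{-1/2} U (I - U)^{-1/2}, as in Theorem 5.3.7(i)."""
    R = sym_inv_sqrt(np.eye(len(U)) - U)
    return R @ U @ R

def type2_to_type1(V):
    """U = (I + V)^{-1/2} V (I + V)^{-1/2}, as in Theorem 5.3.7(ii)."""
    R = sym_inv_sqrt(np.eye(len(V)) + V)
    return R @ V @ R

# Illustrative symmetric matrix with all eigenvalues strictly between 0 and 1.
U = np.array([[0.6, 0.2, 0.0],
              [0.2, 0.4, 0.1],
              [0.0, 0.1, 0.3]])
V = type1_to_type2(U)
U_back = type2_to_type1(V)
```

Here `V` equals `np.linalg.inv(np.eye(3) - U) @ U`, the matrix $I_p+V$ equals $(I_p-U)^{-1}$, and `U_back` recovers `U`.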
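THEOREM 5.3.9. Let $V\sim B_p^{II}(a,b)$. Then, the characteristic function of $V=(v_{ij})$, i.e., the joint characteristic function of $v_{11},v_{12},\ldots,v_{pp}$, is

$$\phi_V(Z)=\frac{\Gamma_p(a+b)}{\Gamma_p(b)}\,\Psi\left(a;-b+\tfrac12(p+1);-\iota Z\right),$$

where $Z=Z'$ $(p\times p)=\left(\tfrac12(1+\delta_{ij})z_{ij}\right)$, $\iota=\sqrt{-1}$, and $\operatorname{Re}(-\iota Z)>0$.

Proof: Here, the characteristic function of $V$ is given by

$$\phi_V(Z)=\{\beta_p(a,b)\}^{-1}\int_{V>0}\operatorname{etr}(\iota ZV)\det(V)^{a-\frac12(p+1)}\det(I_p+V)^{-(a+b)}\,dV.$$

Now, using the Definition 1.6.3 of the confluent hypergeometric function $\Psi$, the result follows. ■

In this case as well, if $Y\sim GB_p^{II}(a,b;\Omega,\Phi)$, then

$$\phi_Y(Z)=\frac{\Gamma_p(a+b)}{\Gamma_p(b)}\operatorname{etr}(\iota Z\Phi)\,\Psi\left(a;-b+\tfrac12(p+1);-\iota Z(\Omega+\Phi)\right), \tag{5.3.5}$$

where $\operatorname{Re}(-\iota Z(\Omega+\Phi))>0$.

The marginal and conditional distributions of $U$ are given next.

THEOREM 5.3.10. Let $U=\begin{pmatrix}U_{11}&U_{12}\\U_{21}&U_{22}\end{pmatrix}$, $U_{11}$ $(p_1\times p_1)$, $p_1+p_2=p$, and $U_{22\cdot1}=U_{22}-U_{21}U_{11}^{-1}U_{12}$. If $U\sim B_p^{I}(a,b)$, then $U_{11}$ and $U_{22\cdot1}$ are independently distributed, $U_{11}\sim B_{p_1}^{I}(a,b)$ and $U_{22\cdot1}\sim B_{p_2}^{I}(a-\tfrac12 p_1,b)$. Further, $U_{21}|U_{11},U_{22\cdot1}\sim IT_{p_2,p_1}(2b-p+1,0,I_{p_2}-U_{22\cdot1},U_{11}(I_{p_1}-U_{11}))$.

Proof: The density of $U$ is

$$f(U)=\{\beta_p(a,b)\}^{-1}\det(U)^{a-\frac12(p+1)}\det(I_p-U)^{b-\frac12(p+1)},\quad 0<U<I_p. \tag{5.3.6}$$

From the partition of $U$, we have

$$\det(U)=\det(U_{11})\det(U_{22\cdot1}) \tag{5.3.7}$$

and

$$\begin{aligned}
\det(I_p-U)&=\det(I_{p_1}-U_{11})\det\left(I_{p_2}-U_{22}-U_{21}(I_{p_1}-U_{11})^{-1}U_{12}\right)\\
&=\det(I_{p_1}-U_{11})\det\left(I_{p_2}-U_{22\cdot1}-U_{21}(U_{11}^{-1}+(I_{p_1}-U_{11})^{-1})U_{12}\right)\\
&=\det(I_{p_1}-U_{11})\det\left(I_{p_2}-U_{22\cdot1}-U_{21}U_{11}^{-1}(I_{p_1}-U_{11})^{-1}U_{12}\right).
\end{aligned} \tag{5.3.8}$$

Now making the transformation $U_{11}=U_{11}$, $U_{21}=U_{21}$ and $U_{22\cdot1}=U_{22}-U_{21}U_{11}^{-1}U_{12}$, with Jacobian $J(U_{11},U_{22},U_{21}\to U_{11},U_{22\cdot1},U_{21})=1$, and substituting (5.3.7) and (5.3.8) in (5.3.6), we get the joint density of $U_{11}$, $U_{22\cdot1}$, and $U_{21}$ as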
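$$\begin{aligned}
f(U_{11}, U_{22\cdot1}, U_{21}) ={}& \{\beta_p(a,b)\}^{-1} \det(U_{11})^{a-\frac12(p+1)} \det(U_{22\cdot1})^{a-\frac12(p+1)} \det(I_{p_1}-U_{11})^{b-\frac12(p+1)} \\
& \times \det\big(I_{p_2}-U_{22\cdot1}-U_{21}U_{11}^{-1}(I_{p_1}-U_{11})^{-1}U_{12}\big)^{b-\frac12(p+1)} \\
={}& \{\beta_{p_1}(a,b)\}^{-1} \det(U_{11})^{a-\frac12(p_1+1)} \det(I_{p_1}-U_{11})^{b-\frac12(p_1+1)} \\
& \times \{\beta_{p_2}(a-\tfrac12 p_1, b)\}^{-1} \det(U_{22\cdot1})^{a-\frac12 p_1-\frac12(p_2+1)} \det(I_{p_2}-U_{22\cdot1})^{b-\frac12(p_2+1)} \\
& \times \frac{\Gamma_{p_2}(b)}{\pi^{\frac12 p_1p_2}\,\Gamma_{p_2}(b-\tfrac12 p_1)} \det\big(U_{11}(I_{p_1}-U_{11})\big)^{-\frac12 p_2} \det(I_{p_2}-U_{22\cdot1})^{-(b-\frac12(p_2+1))} \\
& \times \det\big(I_{p_2}-U_{22\cdot1}-U_{21}U_{11}^{-1}(I_{p_1}-U_{11})^{-1}U_{12}\big)^{b-\frac12(p+1)},
\end{aligned} \tag{5.3.9}$$
where $0 < U_{11} < I_{p_1}$, $0 < U_{22\cdot1} < I_{p_2}$ and $I_{p_2}-U_{22\cdot1}-U_{21}U_{11}^{-1}(I_{p_1}-U_{11})^{-1}U_{12} > 0$. The three groups of factors are the densities of $U_{11}$, of $U_{22\cdot1}$, and the conditional density of $U_{21}$ given $U_{11}$ and $U_{22\cdot1}$. From the factorization (5.3.9), the result follows. ■

THEOREM 5.3.11. Let $V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}$, $V_{ij}$ $(p_i \times p_j)$, $p_1 + p_2 = p$, and $V_{22\cdot1} = V_{22} - V_{21}V_{11}^{-1}V_{12}$. If $V \sim B_p^{II}(a,b)$, then $V_{11}$ and $V_{22\cdot1}$ are independently distributed, $V_{11} \sim B_{p_1}^{II}(a, b-\tfrac12 p_2)$, $V_{22\cdot1} \sim B_{p_2}^{II}(a-\tfrac12 p_1, b)$, and $V_{21} \mid V_{11}, V_{22\cdot1} \sim T_{p_2,p_1}\big(2a+2b-p+1, 0, I_{p_2}+V_{22\cdot1}, V_{11}(I_{p_1}+V_{11})\big)$.

Proof: Similar to the proof of Theorem 5.3.10. ■

The distributions of certain matrix valued functions, viz. $AUA'$, $AVA'$, $(AU^{-1}A')^{-1}$, $(AV^{-1}A')^{-1}$, where $A$ $(q \times p)$ is a constant matrix of rank $q$ $(\le p)$, are now derived.

THEOREM 5.3.12. Let $U \sim B_p^{I}(a,b)$. Then, for a constant matrix $A$ $(q \times p)$ of rank $q$ $(\le p)$, $AUA' \sim GB_q^{I}(a, b; AA', 0)$.

Proof: Write $A = M (I_q \;\; 0)\, \Gamma$, where $M$ $(q \times q)$ is nonsingular and $\Gamma$ $(p \times p)$ is orthogonal. Now
$$AUA' = M (I_q \;\; 0)\, \Gamma U \Gamma' (I_q \;\; 0)' M' = M X_{11} M',$$
where $X = \Gamma U \Gamma'$ and $X_{11}$ $(q \times q)$ is the first principal diagonal block of $X$. From Theorems 5.3.3 and 5.3.10, we know that $X \sim B_p^{I}(a,b)$ and $X_{11} \sim B_q^{I}(a,b)$. Hence, using Theorem 5.3.1, $M X_{11} M' \sim GB_q^{I}(a, b; MM', 0)$ and the result follows by noting that $MM' = AA'$. ■

COROLLARY 5.3.12.1. Let $U \sim B_p^{I}(a,b)$ and $\mathbf{a} \in R^p$, $\mathbf{a} \neq \mathbf{0}$, then $\dfrac{\mathbf{a}'U\mathbf{a}}{\mathbf{a}'\mathbf{a}} \sim B^{I}(a,b)$.

Proof: Take $q = 1$ in Theorem 5.3.12. ■

In Corollary 5.3.12.1 the distribution of $\mathbf{a}'U\mathbf{a}/\mathbf{a}'\mathbf{a}$ does not depend on $\mathbf{a}$. Thus if $\mathbf{y}$ $(p \times 1)$ is a random vector, independent of $U$, and $P(\mathbf{y} \neq \mathbf{0}) = 1$, then it follows that $\dfrac{\mathbf{y}'U\mathbf{y}}{\mathbf{y}'\mathbf{y}} \sim B^{I}(a,b)$.

THEOREM 5.3.13. Let $V \sim B_p^{II}(a,b)$. Then, for a constant matrix $A$ $(q \times p)$ of rank $q$ $(\le p)$, $AVA' \sim GB_q^{II}\big(a, b-\tfrac12(p-q); AA', 0\big)$.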
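Proof: Similar to the proof of Theorem 5.3.12. ■

COROLLARY 5.3.13.1. Let $V \sim B_p^{II}(a,b)$ and $\mathbf{a} \in R^p$, $\mathbf{a} \neq \mathbf{0}$, then $\dfrac{\mathbf{a}'V\mathbf{a}}{\mathbf{a}'\mathbf{a}} \sim B^{II}\big(a, b-\tfrac12(p-1)\big)$.

Proof: Take $q = 1$ in Theorem 5.3.13. ■

In Corollary 5.3.13.1, the distribution of $\mathbf{a}'V\mathbf{a}/\mathbf{a}'\mathbf{a}$ does not depend on $\mathbf{a}$. Thus, if $\mathbf{y}$ $(p \times 1)$ is a random vector, independent of $V$, and $P(\mathbf{y} \neq \mathbf{0}) = 1$, then also $\dfrac{\mathbf{y}'V\mathbf{y}}{\mathbf{y}'\mathbf{y}} \sim B^{II}\big(a, b-\tfrac12(p-1)\big)$.

THEOREM 5.3.14. Let $A$ $(q \times p)$ be a constant matrix of rank $q$ $(\le p)$. (i) If $U \sim B_p^{I}(a,b)$, then $(AU^{-1}A')^{-1} \sim GB_q^{I}\big(a-\tfrac12(p-q), b; (AA')^{-1}, 0\big)$. (ii) If $V \sim B_p^{II}(a,b)$, then $(AV^{-1}A')^{-1} \sim GB_q^{II}\big(a-\tfrac12(p-q), b; (AA')^{-1}, 0\big)$.

Proof: Write $A = M (I_q \;\; 0)\, \Gamma$, where $M$ $(q \times q)$ is nonsingular and $\Gamma$ $(p \times p)$ is orthogonal. Now,
$$\begin{aligned}
(AV^{-1}A')^{-1} &= \big[M (I_q \;\; 0)\, \Gamma V^{-1} \Gamma' (I_q \;\; 0)' M'\big]^{-1} \\
&= (M')^{-1} \big[(I_q \;\; 0)\, Y^{-1} (I_q \;\; 0)'\big]^{-1} M^{-1} \\
&= (M')^{-1} (Y^{11})^{-1} M^{-1},
\end{aligned}$$
where $Y = \begin{pmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{pmatrix} = \Gamma V \Gamma' \sim B_p^{II}(a,b)$, $Y_{11}$ $(q \times q)$, and $Y^{11} = (Y_{11} - Y_{12}Y_{22}^{-1}Y_{21})^{-1} = Y_{11\cdot2}^{-1}$. From Theorem 5.3.11, $Y_{11\cdot2} \sim B_q^{II}\big(a-\tfrac12(p-q), b\big)$ and from Theorem 5.3.2, $(M')^{-1} Y_{11\cdot2} M^{-1} \sim GB_q^{II}\big(a-\tfrac12(p-q), b; (MM')^{-1}, 0\big)$. The proof of (ii) is now completed by observing that $MM' = AA'$. Similarly, one can prove part (i). ■

From the above theorem, when $\mathbf{a} \in R^p$, $\mathbf{a} \neq \mathbf{0}$, it follows that
$$\frac{\mathbf{a}'\mathbf{a}}{\mathbf{a}'U^{-1}\mathbf{a}} \sim B^{I}\big(a-\tfrac12(p-1), b\big) \quad \text{and} \quad \frac{\mathbf{a}'\mathbf{a}}{\mathbf{a}'V^{-1}\mathbf{a}} \sim B^{II}\big(a-\tfrac12(p-1), b\big).$$

In the next six theorems we give expected values of the elements of beta type I and type II matrices and some of their scalar and matrix valued functions.

THEOREM 5.3.15. Let $U \sim B_p^{I}(a,b)$. Then,
(i) $E[\det(U)^h] = \dfrac{\Gamma_p(a+h)\Gamma_p(a+b)}{\Gamma_p(a+b+h)\Gamma_p(a)}$, $\operatorname{Re}(a+h) > \tfrac12(p-1)$, and
(ii) $E[\det(I_p-U)^h] = \dfrac{\Gamma_p(b+h)\Gamma_p(a+b)}{\Gamma_p(a+b+h)\Gamma_p(b)}$, $\operatorname{Re}(b+h) > \tfrac12(p-1)$.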
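Proof: From the density of $U$, we have
$$\begin{aligned}
E[\det(U)^h] &= \{\beta_p(a,b)\}^{-1} \int_{0<U<I_p} \det(U)^{h+a-\frac12(p+1)} \det(I_p-U)^{b-\frac12(p+1)}\, dU \\
&= \frac{\beta_p(a+h, b)}{\beta_p(a,b)}, \quad \operatorname{Re}(a+h) > \tfrac12(p-1) \\
&= \frac{\Gamma_p(a+h)\Gamma_p(a+b)}{\Gamma_p(a+b+h)\Gamma_p(a)}.
\end{aligned}$$
Similarly $E[\det(I_p-U)^h]$ can be derived. ■

THEOREM 5.3.16. Let $U \sim B_p^{I}(a,b)$. Then,
(i) $E[C_\kappa(U)] = \dfrac{(a)_\kappa}{(a+b)_\kappa}\, C_\kappa(I_p)$, and
(ii) $E[C_\kappa(U^{-1})] = \dfrac{(-a-b+\tfrac12(p+1))_\kappa}{(-a+\tfrac12(p+1))_\kappa}\, C_\kappa(I_p)$, $\operatorname{Re}(a) > \tfrac12(p-1) + k_1$.

Proof: From the density of $U$, we have
$$\begin{aligned}
E[C_\kappa(U)] &= \{\beta_p(a,b)\}^{-1} \int_{0<U<I_p} C_\kappa(U) \det(U)^{a-\frac12(p+1)} \det(I_p-U)^{b-\frac12(p+1)}\, dU \\
&= \frac{1}{\beta_p(a,b)} \frac{\Gamma_p(a,\kappa)\Gamma_p(b)}{\Gamma_p(a+b,\kappa)}\, C_\kappa(I_p) \\
&= \frac{(a)_\kappa}{(a+b)_\kappa}\, C_\kappa(I_p),
\end{aligned}$$
where the integral has been evaluated by using (1.5.16). The $E[C_\kappa(U^{-1})]$ is similarly derived by applying (1.5.17). ■

THEOREM 5.3.17. Let $V \sim B_p^{II}(a,b)$. Then,
(i) $E[\det(V)^h] = \dfrac{\Gamma_p(a+h)\Gamma_p(b-h)}{\Gamma_p(a)\Gamma_p(b)}$, $\operatorname{Re}(a+h) > \tfrac12(p-1)$, $\operatorname{Re}(b-h) > \tfrac12(p-1)$, and
(ii) $E[\det((I_p+V)^{-1}V)^h] = \dfrac{\Gamma_p(a+h)\Gamma_p(a+b)}{\Gamma_p(a)\Gamma_p(a+b+h)}$, $\operatorname{Re}(a+h) > \tfrac12(p-1)$.

Proof: By definition,
$$\begin{aligned}
E[\det(V)^h] &= \{\beta_p(a,b)\}^{-1} \int_{V>0} \det(V)^{a+h-\frac12(p+1)} \det(I_p+V)^{-(a+b)}\, dV \\
&= \frac{\beta_p(a+h, b-h)}{\beta_p(a,b)}, \quad \operatorname{Re}(a+h) > \tfrac12(p-1),\ \operatorname{Re}(b-h) > \tfrac12(p-1) \\
&= \frac{\Gamma_p(a+h)\Gamma_p(b-h)}{\Gamma_p(a)\Gamma_p(b)}.
\end{aligned}$$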
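Notice that $(I_p+V)^{-1}V \sim B_p^{I}(a,b)$ and hence, $E[\det((I_p+V)^{-1}V)^h]$ is obtained from Theorem 5.3.15. ■

THEOREM 5.3.18. Let $V \sim B_p^{II}(a,b)$. Then,
(i) $E[C_\kappa(V)] = \dfrac{(-1)^k (a)_\kappa}{(-b+\tfrac12(p+1))_\kappa}\, C_\kappa(I_p)$, $\operatorname{Re}(b) > \tfrac12(p-1) + k_1$, and
(ii) $E[C_\kappa(V^{-1})] = \dfrac{(-1)^k (b)_\kappa}{(-a+\tfrac12(p+1))_\kappa}\, C_\kappa(I_p)$, $\operatorname{Re}(a) > \tfrac12(p-1) + k_1$.

Proof: By definition,
$$\begin{aligned}
E[C_\kappa(V)] &= \{\beta_p(a,b)\}^{-1} \int_{V>0} C_\kappa(V) \det(V)^{a-\frac12(p+1)} \det(I_p+V)^{-(a+b)}\, dV \\
&= \frac{1}{\beta_p(a,b)} \frac{\Gamma_p(a,\kappa)\Gamma_p(b,-\kappa)}{\Gamma_p(a+b)}\, C_\kappa(I_p), \quad \operatorname{Re}(b) > \tfrac12(p-1) + k_1 \\
&= \frac{(-1)^k (a)_\kappa}{(-b+\tfrac12(p+1))_\kappa}\, C_\kappa(I_p),
\end{aligned}$$
where the integral has been evaluated by using Lemma 1.5.4 and simplification has been done using (1.5.9). By noting that $V^{-1} \sim B_p^{II}(b,a)$, $E[C_\kappa(V^{-1})]$ is obtained from $E[C_\kappa(V)]$. ■

Konno (1988), using Haff's (1979) method, derived identities for expectations of certain functions of beta type I and type II matrices. When $U \sim GB_p^{I}(\tfrac12 n_1, \tfrac12 n_2; \Omega, 0)$, he gave an identity for $E[g(U) \operatorname{tr}((\Omega-U)^{-1}T)]$, where $g(U)$ is a scalar function and $T$ $(p \times p)$ is a matrix valued function of $U$ and $\Omega$. From this identity, the following results are obtained.

THEOREM 5.3.19. Let $U \sim GB_p^{I}(\tfrac12 n_1, \tfrac12 n_2; \Omega, 0)$, then
(i) $E(u_{ij}) = \dfrac{n_1}{n}\, \omega_{ij}$,
(ii) $E(u_{ij}u_{k\ell}) = \dfrac{n_1}{n(n-1)(n+2)} \big[\{n_1(n+1)-2\}\omega_{ij}\omega_{k\ell} + n_2(\omega_{j\ell}\omega_{ik} + \omega_{i\ell}\omega_{kj})\big]$,
where $n = n_1 + n_2$, $U = (u_{ij})$ and $\Omega = (\omega_{ij})$.

Proof: See Konno (1988). ■

From Theorem 5.3.19, we immediately get
$$\operatorname{cov}(u_{ij}, u_{k\ell}) = \frac{n_1 n_2}{n(n-1)(n+2)} \Big[-\frac{2}{n}\, \omega_{ij}\omega_{k\ell} + \omega_{j\ell}\omega_{ik} + \omega_{i\ell}\omega_{kj}\Big], \tag{5.3.10}$$
and
$$E(UAU) = \frac{n_1}{n(n-1)(n+2)} \big[\{n_1(n+1)-2\}\Omega A\Omega + n_2\{(\Omega A\Omega)' + \operatorname{tr}(\Omega A)\Omega\}\big], \tag{5.3.11}$$
where $A$ $(p \times p)$ is a fixed matrix.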
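The step from Theorem 5.3.19 to the covariance formula (5.3.10) is pure algebra, so it can be verified exactly with rational arithmetic. The sketch below is an illustration only: the helper names and the example matrix `omega` are assumptions, not part of the text. It encodes $E(u_{ij})$, $E(u_{ij}u_{k\ell})$ and (5.3.10) and checks that $\operatorname{cov}(u_{ij},u_{k\ell}) = E(u_{ij}u_{k\ell}) - E(u_{ij})E(u_{k\ell})$ for every index combination.

```python
from fractions import Fraction
from itertools import product

def mean_u(i, j, omega, n1, n2):
    """E(u_ij) from Theorem 5.3.19(i)."""
    return Fraction(n1, n1 + n2) * omega[i][j]

def second_moment_u(i, j, k, l, omega, n1, n2):
    """E(u_ij u_kl) from Theorem 5.3.19(ii)."""
    n = n1 + n2
    c = Fraction(n1, n * (n - 1) * (n + 2))
    return c * ((n1 * (n + 1) - 2) * omega[i][j] * omega[k][l]
                + n2 * (omega[j][l] * omega[i][k] + omega[i][l] * omega[k][j]))

def cov_u(i, j, k, l, omega, n1, n2):
    """cov(u_ij, u_kl) as written in (5.3.10)."""
    n = n1 + n2
    c = Fraction(n1 * n2, n * (n - 1) * (n + 2))
    return c * (-Fraction(2, n) * omega[i][j] * omega[k][l]
                + omega[j][l] * omega[i][k] + omega[i][l] * omega[k][j])

def covariance_identity_holds(omega, n1, n2):
    """Check (5.3.10) against Theorem 5.3.19 for all i, j, k, l."""
    p = len(omega)
    return all(
        cov_u(i, j, k, l, omega, n1, n2)
        == second_moment_u(i, j, k, l, omega, n1, n2)
        - mean_u(i, j, omega, n1, n2) * mean_u(k, l, omega, n1, n2)
        for i, j, k, l in product(range(p), repeat=4)
    )

if __name__ == "__main__":
    omega = [[2, 1], [1, 3]]  # hypothetical symmetric positive definite Omega
    print(covariance_identity_holds(omega, n1=5, n2=7))
```

For $p = 1$ and $\Omega = 1$ the second moment reduces to $n_1(n_1+2)/\{n(n+2)\}$, the familiar second moment of a scalar beta variable with parameters $\tfrac12 n_1$ and $\tfrac12 n_2$.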
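When $V \sim GB_p^{II}(\tfrac12 n_1, \tfrac12 n_2; \Omega, 0)$, $h(V)$ is a scalar function of $V$ and $T$ $(p \times p)$ is a matrix valued function of $V$ and $\Omega$, Konno (1988) also derived an identity for $E[h(V) \operatorname{tr}((\Omega+V)^{-1}T)]$, from which the following results were obtained.

THEOREM 5.3.20. Let $V \sim GB_p^{II}(\tfrac12 n_1, \tfrac12 n_2; \Omega, 0)$, then
(i) $E(v_{ij}) = \dfrac{n_1}{n_2-p-1}\, \omega_{ij}$, $n_2-p-1 > 0$, and
(ii) $E(v_{ij}v_{k\ell}) = \dfrac{n_1}{(n_2-p)(n_2-p-1)(n_2-p-3)} \big[\{n_1(n_2-p-2)+2\}\omega_{ij}\omega_{k\ell} + (n-p-1)(\omega_{j\ell}\omega_{ik} + \omega_{i\ell}\omega_{kj})\big]$, $n_2-p-3 > 0$.

Proof: See Konno (1988). ■

From the above theorem, one can easily see that for $n_2-p-3 > 0$,
$$\operatorname{cov}(v_{ij}, v_{k\ell}) = \frac{n_1(n-p-1)}{(n_2-p)(n_2-p-1)(n_2-p-3)} \Big[\frac{2}{n_2-p-1}\, \omega_{ij}\omega_{k\ell} + \omega_{j\ell}\omega_{ik} + \omega_{i\ell}\omega_{kj}\Big] \tag{5.3.12}$$
and
$$E(VAV) = \frac{n_1}{(n_2-p)(n_2-p-1)(n_2-p-3)} \big[\{n_1(n_2-p-2)+2\}\Omega A\Omega + (n-p-1)\{(\Omega A\Omega)' + \operatorname{tr}(\Omega A)\Omega\}\big], \tag{5.3.13}$$
where $A$ $(p \times p)$ is a fixed matrix. Further, by noting that the distributions of $U^{-1}$ and $\Omega^{-1} + V^{-1}$ are identical, Konno (1988) derived, with $U^{-1} = (u^{ij})$ and $\Omega^{-1} = (\omega^{ij})$,
$$E(U^{-1}) = \frac{n-p-1}{n_1-p-1}\, \Omega^{-1}, \quad n_1-p-1 > 0, \tag{5.3.14}$$
$$E(u^{ij}u^{k\ell}) = \frac{n-p-1}{(n_1-p)(n_1-p-1)(n_1-p-3)} \big[\{(n-p)(n_1-p-3)+n_2\}\omega^{ij}\omega^{k\ell} + n_2(\omega^{j\ell}\omega^{ik} + \omega^{i\ell}\omega^{kj})\big], \quad n_1-p-3 > 0, \tag{5.3.15}$$
$$\operatorname{cov}(u^{ij}, u^{k\ell}) = \frac{n_2(n-p-1)}{(n_1-p)(n_1-p-1)(n_1-p-3)} \Big[\frac{2}{n_1-p-1}\, \omega^{ij}\omega^{k\ell} + \omega^{j\ell}\omega^{ik} + \omega^{i\ell}\omega^{kj}\Big], \quad n_1-p-3 > 0, \tag{5.3.16}$$
and
$$E(U^{-1}AU^{-1}) = \frac{n-p-1}{(n_1-p)(n_1-p-1)(n_1-p-3)} \big[\{(n-p)(n_1-p-3)+n_2\}\Omega^{-1}A\Omega^{-1} + n_2\{(\Omega^{-1}A\Omega^{-1})' + \operatorname{tr}(\Omega^{-1}A)\Omega^{-1}\}\big], \tag{5.3.17}$$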
where $A$ $(p \times p)$ is a fixed matrix.

In the remaining part of this section we give various factorizations of beta type I and type II matrices. It is interesting to note that, like the Wishart matrix, which factorizes into normal matrices, the beta type I and type II matrices factorize into inverted $t$- and $t$-matrices.

THEOREM 5.3.21. Let $U \sim B_p^{I}(a,b)$. If $a$ is half an integer, then $U$ can be factorized as $U = XX'$, where $X \sim IT_{p,2a}(2b-p+1, 0, I_p, I_{2a})$.

Proof: Let $2a = m$ and $L$ $(p \times m)$ be a semiorthogonal ($LL' = I_p$) random matrix which is independent of $U$. Then, the joint density of $L$ and $U$ is given by
$$c^{-1} \{\beta_p(\tfrac12 m, b)\}^{-1} \det(U)^{\frac12(m-p-1)} \det(I_p-U)^{b-\frac12(p+1)}\, g_{m,p}(L), \tag{5.3.18}$$
where $c = \dfrac{2^p \pi^{\frac12 mp}}{\Gamma_p(\tfrac12 m)}$ and $g_{m,p}(L)$ is defined in (1.3.26). Since $U > 0$, with probability one, we can write $U = TT'$ where $T$ $(p \times p)$ is a lower triangular matrix with positive diagonal elements. Further, since $m \ge p$, we can write $TL = X$, where $X$ $(p \times m)$ is a random matrix of rank $p$. Now transforming $U = TT'$, $TL = X$, with the Jacobian (from (1.3.14) and (1.3.25))
$$J(U, L \to X) = J(U \to T)\, J(T, L \to X) = 2^p \prod_{i=1}^p t_{ii}^{p-i+1} \Big\{\prod_{i=1}^p t_{ii}^{m-i}\, g_{m,p}(L)\Big\}^{-1},$$
the density of $X$ is given by
$$\frac{\Gamma_p(\tfrac12 m + b)}{\pi^{\frac12 mp}\, \Gamma_p(b)} \det(I_p - XX')^{\frac12(2b-p+1)-1}, \quad I_p - XX' > 0. \tag{5.3.19}$$
From (5.3.19), it is clear that $X \sim IT_{p,2a}(2b-p+1, 0, I_p, I_{2a})$. ■

The result for beta type II matrix, corresponding to Theorem 5.3.21, is given below.

THEOREM 5.3.22. Let $V \sim B_p^{II}(a,b)$. If $a$ is half an integer, then $V$ can be factorized as $V = YY'$, where $Y \sim T_{p,2a}(2b-p+1, 0, I_p, I_{2a})$.

Proof: Similar to the proof of Theorem 5.3.21. ■

THEOREM 5.3.23. Let $U \sim B_p^{I}(a,b)$ and $U = TT'$, where $T = (t_{ij})$ is an upper triangular matrix with $t_{ii} > 0$, $i = 1, \ldots, p$. Partition $T$ as
$$T = \begin{pmatrix} T_{11} & \mathbf{t} \\ \mathbf{0}' & t_{pp} \end{pmatrix}, \quad T_{11}\ ((p-1) \times (p-1)). \tag{5.3.20}$$
Then, $t_{pp}$, $\mathbf{y} = (1-t_{pp}^2)^{-1/2} (I_{p-1} - T_{11}T_{11}')^{-1/2}\, \mathbf{t}$ and $T_{11}$ are independently distributed, $t_{pp}^2 \sim B^{I}(a,b)$, $\mathbf{y} \sim It_{p-1}(2b-p+1, \mathbf{0}, I_{p-1})$ and the distribution of $T_{11}$ is the same as that of $T$ with $p$ and $a$ replaced by $p-1$ and $a-\tfrac12$ respectively.
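Proof: Making the transformation $U = TT'$, with the Jacobian of transformation $J(U \to T) = 2^p \prod_{i=1}^p t_{ii}^{i}$, in the density of $U$, we get the p.d.f. of $T$ as
$$2^p \{\beta_p(a,b)\}^{-1} \prod_{i=1}^p (t_{ii}^2)^{a-\frac12(p-i+1)} \det(I_p - TT')^{b-\frac12(p+1)}, \tag{5.3.21}$$
where $-\infty < t_{ij} < \infty$, $i < j$, $i, j = 1, \ldots, p$, and $t_{ii} > 0$, $i = 1, \ldots, p$. From (5.3.20), we have
$$\begin{aligned}
\det(I_p - TT') &= \det\begin{pmatrix} I_{p-1} - T_{11}T_{11}' - \mathbf{t}\mathbf{t}' & -t_{pp}\mathbf{t} \\ -t_{pp}\mathbf{t}' & 1-t_{pp}^2 \end{pmatrix} \\
&= (1-t_{pp}^2) \det\big(I_{p-1} - T_{11}T_{11}' - (1 + t_{pp}^2(1-t_{pp}^2)^{-1})\mathbf{t}\mathbf{t}'\big) \\
&= (1-t_{pp}^2) \det(I_{p-1} - T_{11}T_{11}') \det\big(I_{p-1} - (1-t_{pp}^2)^{-1}(I_{p-1} - T_{11}T_{11}')^{-1}\mathbf{t}\mathbf{t}'\big) \\
&= (1-t_{pp}^2) \det(I_{p-1} - T_{11}T_{11}') \big(1 - (1-t_{pp}^2)^{-1}\mathbf{t}'(I_{p-1} - T_{11}T_{11}')^{-1}\mathbf{t}\big).
\end{aligned} \tag{5.3.22}$$
Substituting (5.3.22) in (5.3.21) we get $f(T_{11}, \mathbf{t}, t_{pp})$, the joint density of $T_{11}$, $\mathbf{t}$ and $t_{pp}$, as
$$f(T_{11}, \mathbf{t}, t_{pp}) = f_1(T_{11})\, f_2(t_{pp})\, f_3(\mathbf{t} \mid T_{11}, t_{pp}),$$
where
$$f_1(T_{11}) = 2^{p-1} \{\beta_{p-1}(a-\tfrac12, b)\}^{-1} \prod_{i=1}^{p-1} (t_{ii}^2)^{a-\frac12-\frac12(p-i)} \det(I_{p-1} - T_{11}T_{11}')^{b-\frac12 p}, \tag{5.3.23}$$
$$f_2(t_{pp}) = 2 \{\beta(a,b)\}^{-1} (t_{pp}^2)^{a-\frac12} (1-t_{pp}^2)^{b-1}, \tag{5.3.24}$$
and
$$f_3(\mathbf{t} \mid T_{11}, t_{pp}) = \frac{\Gamma(b)}{\pi^{\frac12(p-1)}\, \Gamma(b-\tfrac12(p-1))} (1-t_{pp}^2)^{-\frac12(p-1)} \det(I_{p-1} - T_{11}T_{11}')^{-\frac12} \big(1 - (1-t_{pp}^2)^{-1}\mathbf{t}'(I_{p-1} - T_{11}T_{11}')^{-1}\mathbf{t}\big)^{b-\frac12(p+1)}. \tag{5.3.25}$$
Now transforming $x_p = t_{pp}^2$ and $\mathbf{y} = (1-t_{pp}^2)^{-1/2}(I_{p-1} - T_{11}T_{11}')^{-1/2}\, \mathbf{t}$, with the Jacobian $J(\mathbf{t}, t_{pp} \to \mathbf{y}, x_p) = (2t_{pp})^{-1} (1-x_p)^{\frac12(p-1)} \det(I_{p-1} - T_{11}T_{11}')^{\frac12}$, we get the desired result. ■

Results similar to the above were derived by Kshirsagar (1961b, 1972) when $T$ is a lower triangular matrix.

THEOREM 5.3.24. Let $U \sim B_p^{I}(a,b)$ and $U = TT'$, where $T = (t_{ij})$ is an upper triangular matrix with $t_{ii} > 0$, $i = 1, \ldots, p$. Then, $t_{11}^2, \ldots, t_{pp}^2$ are independently distributed, $t_{ii}^2 \sim B^{I}\big(a-\tfrac12(p-i), b\big)$, $i = 1, \ldots, p$.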
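Since $\det(U) = \prod_{i=1}^p t_{ii}^2$, the statement of Theorem 5.3.24 can be cross-checked against the determinant moment of Theorem 5.3.15(i): the $h$-th moment of $\det(U)$ must equal the product of the $h$-th moments of $p$ independent scalar beta type I variables with parameters $(a-\tfrac12(p-i), b)$. The sketch below assumes the usual product form $\Gamma_p(a) = \pi^{p(p-1)/4} \prod_{j=0}^{p-1} \Gamma(a - \tfrac12 j)$ of the multivariate gamma function; the function names are illustrative.

```python
import math

def log_multigamma(a, p):
    """log Gamma_p(a) via the product of ordinary gamma functions (needs a > (p - 1)/2)."""
    return (p * (p - 1) / 4) * math.log(math.pi) + sum(
        math.lgamma(a - j / 2) for j in range(p)
    )

def log_det_moment(h, a, b, p):
    """log E[det(U)^h] for U ~ B_p^I(a, b), as in Theorem 5.3.15(i)."""
    return (log_multigamma(a + h, p) + log_multigamma(a + b, p)
            - log_multigamma(a + b + h, p) - log_multigamma(a, p))

def log_beta_moment(h, a, b):
    """log E[x^h] for a scalar beta type I variable x with parameters (a, b)."""
    return (math.lgamma(a + h) + math.lgamma(a + b)
            - math.lgamma(a + b + h) - math.lgamma(a))

def log_det_moment_via_factors(h, a, b, p):
    """Same moment from Theorem 5.3.24: t_ii^2 ~ B^I(a - (p - i)/2, b), independent."""
    return sum(log_beta_moment(h, a - (p - i) / 2, b) for i in range(1, p + 1))

if __name__ == "__main__":
    h, a, b, p = 1.2, 3.0, 2.5, 4
    print(log_det_moment(h, a, b, p))
    print(log_det_moment_via_factors(h, a, b, p))
```

Working on the log scale avoids overflow of the gamma functions for large parameters; the two printed values agree up to rounding.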
Proof: From Theorem 5.3.23, it is known that $t_{pp}^2 \sim B^{I}(a,b)$ and is independent of $T_{11}$, which has the same distribution as $T$ with $p$ and $a$ replaced by $p-1$ and $a-\tfrac12$ respectively. Further partitioning $T_{11}$ yields $t_{p-1,p-1}^2 \sim B^{I}(a-\tfrac12, b)$. Repeated application of this procedure completes the proof of the theorem. ■

The next result, derived by Javier and Gupta (1985a), is a matrix variate generalization of a result given in Rao (1952).

THEOREM 5.3.25. If $X \sim B_p^{I}(a,b)$ and $Y \sim B_p^{I}(a+b, c)$ are independent, then $U = Y^{1/2} X (Y^{1/2})' \sim B_p^{I}(a, b+c)$.

Proof: The joint density of $X$ and $Y$ is
$$\{\beta_p(a,b)\beta_p(a+b,c)\}^{-1} \det(X)^{a-\frac12(p+1)} \det(I_p-X)^{b-\frac12(p+1)} \det(Y)^{a+b-\frac12(p+1)} \det(I_p-Y)^{c-\frac12(p+1)}, \quad 0 < X < I_p,\ 0 < Y < I_p. \tag{5.3.26}$$
Making the transformation $U = Y^{1/2} X (Y^{1/2})'$ with the Jacobian $J(X, Y \to U, Y) = \det(Y)^{-\frac12(p+1)}$ in (5.3.26) we get the joint density of $U$ and $Y$ as
$$\frac{\Gamma_p(a+b+c)}{\Gamma_p(a)\Gamma_p(b)\Gamma_p(c)} \det(U)^{a-\frac12(p+1)} \det\big(I_p - Y^{-1/2}U(Y^{-1/2})'\big)^{b-\frac12(p+1)} \det(Y)^{b-\frac12(p+1)} \det(I_p-Y)^{c-\frac12(p+1)}, \quad 0 < U < Y < I_p. \tag{5.3.27}$$
Now to obtain the marginal density of $U$, we need to evaluate
$$\int_{U<Y<I_p} \det\big(I_p - Y^{-1/2}U(Y^{-1/2})'\big)^{b-\frac12(p+1)} \det(Y)^{b-\frac12(p+1)} \det(I_p-Y)^{c-\frac12(p+1)}\, dY. \tag{5.3.28}$$
Substituting in (5.3.28), $W = (I_p-U)^{-1/2}(Y-U)((I_p-U)^{-1/2})'$ with the Jacobian $J(Y \to W) = \det(I_p-U)^{\frac12(p+1)}$, we get
$$\det(I_p-U)^{b+c-\frac12(p+1)} \int_{0<W<I_p} \det(W)^{b-\frac12(p+1)} \det(I_p-W)^{c-\frac12(p+1)}\, dW = \det(I_p-U)^{b+c-\frac12(p+1)}\, \frac{\Gamma_p(b)\Gamma_p(c)}{\Gamma_p(b+c)}. \tag{5.3.29}$$
Integration of $Y$ in (5.3.27), using (5.3.28) and (5.3.29), completes the proof of the theorem. ■

A shorter proof of the above theorem can be given by using the m.g.f. of $Y^{1/2} X (Y^{1/2})'$.

5.4. RELATED DISTRIBUTIONS

In this section, we study some distributions related to the matrix variate beta type I and type II distributions.
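THEOREM 5.4.1. Let $S_i \sim W_p(n_i, I_p)$, $i = 1, 2$ be independent. For a $p \times p$ matrix $A = (a_{jk})$, write $A^{[i]} = (a_{jk})$, $1 \le j, k \le i$, and $A_{[i]} = (a_{jk})$, $i \le j, k \le p$.
(i) If $S_2 = TT'$, where $T$ is a lower triangular matrix with positive diagonal elements, then the distribution of $U = T'(S_1+S_2)^{-1}T$ is given by
$$\{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1}\, \frac{\det(U)^{\frac12 n_2} \det(I_p-U)^{\frac12(n_1-p-1)}}{\prod_{i=1}^p \det(U_{[i]})}, \quad 0 < U < I_p. \tag{5.4.1}$$
Further $U$ and $S_1 + S_2$ are independently distributed.
(ii) If $S_2 = TT'$, where $T$ is an upper triangular matrix with positive diagonal elements, then the distribution of $U = T'(S_1+S_2)^{-1}T$ is given by
$$\{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1}\, \frac{\det(U)^{\frac12 n_2} \det(I_p-U)^{\frac12(n_1-p-1)}}{\prod_{i=1}^p \det(U^{[i]})}, \quad 0 < U < I_p. \tag{5.4.2}$$
Further $U$ and $S_1 + S_2$ are independently distributed.

Proof: (i) The joint p.d.f. of $S_1$ and $S_2$ is
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2)\}^{-1} \operatorname{etr}\{-\tfrac12(S_1+S_2)\} \det(S_1)^{\frac12(n_1-p-1)} \det(S_2)^{\frac12(n_2-p-1)}.$$
Transforming $S = S_1 + S_2$ and $S_2 = TT'$, with the Jacobian of transformation $J(S_1, S_2 \to S, T) = 2^p \prod_{i=1}^p t_{ii}^{p-i+1}$, we get the joint p.d.f. of $S$ and $T$ as
$$\{2^{\frac12(n_1+n_2-2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2)\}^{-1} \operatorname{etr}(-\tfrac12 S) \det(S - TT')^{\frac12(n_1-p-1)} \prod_{i=1}^p t_{ii}^{n_2-i},$$
where $S - TT' > 0$, $t_{ii} > 0$ and $-\infty < t_{ij} < \infty$, $i > j$. Further, transforming $U = T'S^{-1}T$ with the Jacobian $J(T \to U) = \{2^p \prod_{i=1}^p t_{ii}^{i} \det((S^{-1})_{[i]})\}^{-1}$, we get the joint p.d.f. of $S$ and $U$,
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2)\}^{-1} \operatorname{etr}(-\tfrac12 S) \det(S)^{\frac12(n_1+n_2-p-1)} \det(U)^{\frac12(n_2-p-1)} \det(I_p-U)^{\frac12(n_1-p-1)} \det(U)^{\frac12(p+1)} \prod_{i=1}^p \det(U_{[i]})^{-1}, \tag{5.4.3}$$
since $\det(S - TT') = \det(S)\det(I_p - U)$, $\prod_{i=1}^p t_{ii}^2 = \det(S)\det(U)$ and $\prod_{i=1}^p t_{ii}^{2i} = \prod_{i=1}^p \det(U_{[i]}) \{\det((S^{-1})_{[i]})\}^{-1}$. From (5.4.3), it is easy to see that $S$, i.e., $S_1 + S_2$, and $U$ are independent, $S \sim W_p(n_1+n_2, I_p)$ and the distribution of $U$ is given by (5.4.1). (ii) Proof is similar to part (i). ■

THEOREM 5.4.2. Let $S_i \sim W_p(n_i, \Sigma)$, $i = 1, 2$ be independent, $S_2 = TT'$, where $T$ is a triangular matrix with positive diagonal elements, and $V = T^{-1}S_1(T^{-1})'$. Then, $V$ and $S_1 + S_2$ are independently distributed. Further (i) if $T$ is a lower triangular matrix, then the p.d.f. of $V$ is
$$\{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1}\, \frac{\det(V)^{\frac12(n_1-p-1)} \det(I_p+V)^{-\frac12(n_1+n_2-p-1)}}{\prod_{i=1}^p \det((I_p+V)^{[i]})}, \quad V > 0, \tag{5.4.4}$$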
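and (ii) if $T$ is an upper triangular matrix, then the p.d.f. of $V$ is
$$\{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1}\, \frac{\det(V)^{\frac12(n_1-p-1)} \det(I_p+V)^{-\frac12(n_1+n_2-p-1)}}{\prod_{i=1}^p \det((I_p+V)_{[i]})}, \quad V > 0. \tag{5.4.5}$$
Here, for a $p \times p$ matrix $A = (a_{jk})$, $A^{[i]} = (a_{jk})$, $1 \le j, k \le i$, and $A_{[i]} = (a_{jk})$, $i \le j, k \le p$.

Proof: (i) The joint p.d.f. of $S_1$ and $S_2$ is
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}\{-\tfrac12 \Sigma^{-1}(S_1+S_2)\} \det(S_1)^{\frac12(n_1-p-1)} \det(S_2)^{\frac12(n_2-p-1)}. \tag{5.4.6}$$
Transforming $S_2 = TT'$ and $V = T^{-1}S_1(T^{-1})'$, with the Jacobian of transformation $J(S_1, S_2 \to V, T) = 2^p \prod_{i=1}^p t_{ii}^{2(p+1)-i}$, the joint density of $V$ and $T$ is given by
$$\{2^{\frac12(n_1+n_2-2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}\{-\tfrac12 \Sigma^{-1}T(I_p+V)T'\} \det(V)^{\frac12(n_1-p-1)} \prod_{i=1}^p t_{ii}^{n_1+n_2-i}. \tag{5.4.7}$$
Further transforming $S = T(I_p+V)T'$, with the Jacobian $J(T \to S) = \{2^p \prod_{i=1}^p t_{ii}^{p-i+1} \det((I_p+V)^{[i]})\}^{-1}$, the joint density of $V$ and $S$, obtained from (5.4.7), is given by
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}(-\tfrac12 \Sigma^{-1}S) \det(S)^{\frac12(n_1+n_2-p-1)} \det(V)^{\frac12(n_1-p-1)} \det(I_p+V)^{-\frac12(n_1+n_2-p-1)} \prod_{i=1}^p \det((I_p+V)^{[i]})^{-1}. \tag{5.4.8}$$
From (5.4.8) it is clear that $V$ and $S = S_1 + S_2$ are independently distributed, $S \sim W_p(n_1+n_2, \Sigma)$ and the p.d.f. of $V$ is given by (5.4.4). (ii) The proof is similar to part (i). ■

Theorems 5.4.1 and 5.4.2 were proved by Olkin and Rubin (1964). In Theorem 5.4.2, we notice that $V = S_2^{-1/2} S_1 (S_2^{-1/2})'$ and $S_1 + S_2$ are independent for all $\Sigma > 0$ when $S_2^{1/2}$ is a triangular square root. However this is not the case when $S_2^{1/2}$ is a symmetric square root, as shown by Olkin and Rubin (1964) and given in the next theorem.

THEOREM 5.4.3. Let $S_i \sim W_p(n_i, \Sigma)$, $i = 1, 2$ be independent. Define $S = S_1 + S_2$ and $V = S_2^{-1/2} S_1 S_2^{-1/2}$, where $S_2^{1/2} S_2^{1/2} = S_2$. Then $S$ and $V$ are not independent and their joint p.d.f. is given by
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}(-\tfrac12 \Sigma^{-1}S) \det(S)^{\frac12(n_1+n_2-p-1)} \det(V)^{\frac12(n_1-p-1)} \det(I_p+V)^{-\frac12(n_1+n_2-p-1)} \prod_{i \le j} \Big(\frac{\delta_i+\delta_j}{\lambda_i+\lambda_j}\Big), \quad S > 0,\ V > 0,$$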
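where $\lambda_i$ and $\delta_i$ $(i = 1, \ldots, p)$ are the eigenvalues of $\{(I_p+V)^{1/2} S (I_p+V)^{1/2}\}^{1/2}$ and $(I_p+V)^{-1/2} \{(I_p+V)^{1/2} S (I_p+V)^{1/2}\}^{1/2} (I_p+V)^{-1/2}$ respectively.

Proof: The joint p.d.f. of $S_1$ and $S_2$ is given by (5.4.6). Let $S_2 = X^2$, $V = X^{-1}S_1X^{-1}$, and $\delta_1, \ldots, \delta_p$ be the eigenvalues of $X$. Then, from (1.3.5) and (1.3.20), the Jacobian of transformation is $J(S_1, S_2 \to V, X) = \prod_{i \le j} (\delta_i+\delta_j) \det(X)^{p+1}$, and the joint p.d.f. of $V$ and $X$ is obtained as
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}\{-\tfrac12 \Sigma^{-1}X(I_p+V)X\} \det(X^2)^{\frac12(n_1+n_2-p-1)} \prod_{i \le j} (\delta_i+\delta_j) \det(V)^{\frac12(n_1-p-1)}.$$
Now transforming $S = X(I_p+V)X$ with Jacobian $J(X \to S) = \prod_{i \le j} (\lambda_i+\lambda_j)^{-1}$, where $\lambda_1, \ldots, \lambda_p$ are the eigenvalues of $(I_p+V)^{1/2} X (I_p+V)^{1/2}$, the joint p.d.f. of $S$ and $V$ is
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}(-\tfrac12 \Sigma^{-1}S) \det(S)^{\frac12(n_1+n_2-p-1)} \det(V)^{\frac12(n_1-p-1)} \det(I_p+V)^{-\frac12(n_1+n_2-p-1)} \prod_{i \le j} \Big(\frac{\delta_i+\delta_j}{\lambda_i+\lambda_j}\Big), \quad S > 0,\ V > 0. \tag{5.4.9}$$
Now the independence of $V$ and $S$ depends on the factorization of (5.4.9) into two functions, one of $V$ alone and the other of $S$ alone. However, this factorization depends on the factorization of $\prod_{i \le j} \big(\frac{\delta_i+\delta_j}{\lambda_i+\lambda_j}\big)$. For $p = 2$, from (1.3.20) let
$$\prod_{i \le j} \Big(\frac{\delta_i+\delta_j}{\lambda_i+\lambda_j}\Big) = \frac{h(\delta_1+\delta_2)}{h(\lambda_1+\lambda_2)}, \tag{5.4.10}$$
where now
$$h(\delta_1+\delta_2) = 2^2 \delta_1\delta_2(\delta_1+\delta_2) = 4 \det(X) \operatorname{tr}(X), \tag{5.4.11}$$
$$h(\lambda_1+\lambda_2) = 4 \det((I_p+V)X) \operatorname{tr}(X(I_p+V)), \tag{5.4.12}$$
and
$$X = (I_p+V)^{-1/2} \{(I_p+V)^{1/2} S (I_p+V)^{1/2}\}^{1/2} (I_p+V)^{-1/2}. \tag{5.4.13}$$
From (5.4.10)-(5.4.13), we get
$$\frac{h(\delta_1+\delta_2)}{h(\lambda_1+\lambda_2)} = \frac{\operatorname{tr}\big[(I_p+V)^{-1}\{(I_p+V)^{1/2} S (I_p+V)^{1/2}\}^{1/2}\big]}{\det(I_p+V) \operatorname{tr}\big[\{(I_p+V)^{1/2} S (I_p+V)^{1/2}\}^{1/2}\big]}. \tag{5.4.14}$$
Now, (5.4.14) does not factorize to give the independence of $V$ and $S$. ■

Next, we derive the p.d.f. of $U$ when the Wishart matrices $S_1$ and $S_2$ have different covariance matrices.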
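THEOREM 5.4.4. Let $S_i \sim W_p(n_i, \Sigma_i)$, $i = 1, 2$ be independent. If
$$U = (S_1+S_2)^{-1/2} S_1 ((S_1+S_2)^{-1/2})',$$
where $(S_1+S_2)^{1/2}((S_1+S_2)^{1/2})'$ is a reasonable nonsingular factorization of $S_1 + S_2$, then the p.d.f. of $U$ is given by
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma_1)^{\frac12 n_1} \det(\Sigma_2)^{\frac12 n_2}\}^{-1} \det(U)^{\frac12(n_1-p-1)} \det(I_p-U)^{\frac12(n_2-p-1)} \int_{S>0} \det(S)^{\frac12(n_1+n_2-p-1)} \operatorname{etr}\{-\tfrac12 \Sigma_2^{-1}S - \tfrac12(\Sigma_1^{-1} - \Sigma_2^{-1}) S^{1/2} U (S^{1/2})'\}\, dS, \quad 0 < U < I_p. \tag{5.4.15}$$

Proof: The joint density of $S_1$ and $S_2$ is
$$\prod_{i=1}^2 \{2^{\frac12 n_i p}\, \Gamma_p(\tfrac12 n_i) \det(\Sigma_i)^{\frac12 n_i}\}^{-1} \operatorname{etr}(-\tfrac12 \Sigma_i^{-1}S_i) \det(S_i)^{\frac12(n_i-p-1)}.$$
Making the transformation $S_1 + S_2 = S$, $S_1 = S^{1/2} U (S^{1/2})'$ with Jacobian $J(S_1, S_2 \to U, S) = \det(S)^{\frac12(p+1)}$, we get the joint p.d.f. of $U$ and $S$ as
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma_1)^{\frac12 n_1} \det(\Sigma_2)^{\frac12 n_2}\}^{-1} \det(U)^{\frac12(n_1-p-1)} \det(I_p-U)^{\frac12(n_2-p-1)} \det(S)^{\frac12(n_1+n_2-p-1)} \operatorname{etr}\{-\tfrac12 \Sigma_2^{-1} S^{1/2}(I_p-U)(S^{1/2})' - \tfrac12 \Sigma_1^{-1} S^{1/2} U (S^{1/2})'\}, \quad 0 < U < I_p,\ S > 0. \tag{5.4.16}$$
To find the marginal density of $U$, we integrate (5.4.16) with respect to $S$, obtaining (5.4.15). ■

When $\Sigma_1 = \Sigma_2 = \Sigma$, the integral in the p.d.f. (5.4.15) can be easily evaluated as
$$\int_{S>0} \det(S)^{\frac12(n_1+n_2-p-1)} \operatorname{etr}(-\tfrac12 \Sigma^{-1}S)\, dS = 2^{\frac12(n_1+n_2)p}\, \Gamma_p[\tfrac12(n_1+n_2)] \det(\Sigma)^{\frac12(n_1+n_2)},$$
and in this case $U \sim B_p^{I}(\tfrac12 n_1, \tfrac12 n_2)$.

THEOREM 5.4.5. If $U$ is distributed as (5.4.15), then
(i) $E[\det(U)^h] = \dfrac{\Gamma_p(\tfrac12 n_1+h)\, \Gamma_p[\tfrac12(n_1+n_2)]}{\Gamma_p(\tfrac12 n_1)\, \Gamma_p[\tfrac12(n_1+n_2)+h]} \det(\Sigma_1^{-1}\Sigma_2)^{\frac12 n_1}\, {}_2F_1\big(\tfrac12 n_1+h, \tfrac12(n_1+n_2); \tfrac12(n_1+n_2)+h; I_p - \Sigma_1^{-1/2}\Sigma_2\Sigma_1^{-1/2}\big)$, $\operatorname{Re}(h) > -\tfrac12(n_1-p+1)$,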
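and
(ii) $E[\det(I_p-U)^h] = \dfrac{\Gamma_p(\tfrac12 n_2+h)\, \Gamma_p[\tfrac12(n_1+n_2)]}{\Gamma_p(\tfrac12 n_2)\, \Gamma_p[\tfrac12(n_1+n_2)+h]} \det(\Sigma_1^{-1}\Sigma_2)^{\frac12 n_1}\, {}_2F_1\big(\tfrac12 n_1, \tfrac12(n_1+n_2); \tfrac12(n_1+n_2)+h; I_p - \Sigma_1^{-1/2}\Sigma_2\Sigma_1^{-1/2}\big)$, $\operatorname{Re}(h) > -\tfrac12(n_2-p+1)$.

Proof: From (5.4.15), we get
$$E[\det(U)^h] = \{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma_1)^{\frac12 n_1} \det(\Sigma_2)^{\frac12 n_2}\}^{-1} \int_{0<U<I_p} \int_{S>0} \det(U)^{\frac12(n_1-p-1)+h} \det(I_p-U)^{\frac12(n_2-p-1)} \det(S)^{\frac12(n_1+n_2-p-1)} \operatorname{etr}\{-\tfrac12 \Sigma_2^{-1}S - \tfrac12(\Sigma_1^{-1} - \Sigma_2^{-1}) S^{1/2} U (S^{1/2})'\}\, dU\, dS. \tag{5.4.17}$$
Since
$$\operatorname{etr}\{\tfrac12(\Sigma_2^{-1} - \Sigma_1^{-1}) S^{1/2} U (S^{1/2})'\} = {}_0F_0\big(\tfrac12(\Sigma_2^{-1} - \Sigma_1^{-1}) S^{1/2} U (S^{1/2})'\big),$$
we use the integral (1.6.6) to obtain
$$\int_{0<U<I_p} \det(U)^{\frac12(n_1-p-1)+h} \det(I_p-U)^{\frac12(n_2-p-1)}\, {}_0F_0\big(\tfrac12(\Sigma_2^{-1} - \Sigma_1^{-1}) S^{1/2} U (S^{1/2})'\big)\, dU = \frac{\Gamma_p(\tfrac12 n_1+h)\, \Gamma_p(\tfrac12 n_2)}{\Gamma_p[\tfrac12(n_1+n_2)+h]}\, {}_1F_1\big(\tfrac12 n_1+h; \tfrac12(n_1+n_2)+h; \tfrac12(\Sigma_2^{-1} - \Sigma_1^{-1})S\big). \tag{5.4.18}$$
Now, substituting (5.4.18) in (5.4.17), we get
$$E[\det(U)^h] = \{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1)\, \Gamma_p[\tfrac12(n_1+n_2)+h] \det(\Sigma_1)^{\frac12 n_1} \det(\Sigma_2)^{\frac12 n_2}\}^{-1}\, \Gamma_p(\tfrac12 n_1+h) \int_{S>0} \det(S)^{\frac12(n_1+n_2-p-1)} \operatorname{etr}(-\tfrac12 \Sigma_2^{-1}S)\, {}_1F_1\big(\tfrac12 n_1+h; \tfrac12(n_1+n_2)+h; \tfrac12(\Sigma_2^{-1} - \Sigma_1^{-1})S\big)\, dS. \tag{5.4.19}$$
The integral in (5.4.19) can easily be evaluated by using (1.6.4) to give the desired result. The derivation of $E[\det(I_p-U)^h]$ is similar. ■

Theorems 5.4.4 and 5.4.5 are special cases of the results derived by de Waal (1970) for Dirichlet matrices and are given in Theorems 6.4.1 and 6.4.3. The statistic $\det(U) = \dfrac{\det(S_1)}{\det(S_1+S_2)}$ is used to test the null hypothesis $H: \Sigma_1 = \Sigma_2$. The distribution of $U$ given in (5.4.15) is needed to study the power of this test.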
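For $p = 1$, Theorem 5.4.5(i) can be checked numerically. With $S_1 \sim \sigma_1^2\chi^2_{n_1}$ and $S_2 \sim \sigma_2^2\chi^2_{n_2}$ independent and $\rho = \sigma_2^2/\sigma_1^2$, one has $U = B/\{B + \rho(1-B)\}$ with $B \sim B^{I}(\tfrac12 n_1, \tfrac12 n_2)$, and the theorem reads $E(U^h) = \dfrac{\Gamma(\tfrac12 n_1+h)\Gamma(\tfrac12 n)}{\Gamma(\tfrac12 n_1)\Gamma(\tfrac12 n+h)}\, \rho^{\frac12 n_1}\, {}_2F_1(\tfrac12 n_1+h, \tfrac12 n; \tfrac12 n+h; 1-\rho)$, $n = n_1+n_2$. The following is a minimal sketch, not a general-purpose implementation: it assumes $|1-\rho| < 1$ so that the Gauss series converges, and $n_1, n_2 > 2$ so that the integrand vanishes at both endpoints; all function names are illustrative.

```python
import math

def hyp2f1(a, b, c, z, terms=400):
    """Truncated Gauss series for 2F1(a, b; c; z); intended for |z| < 1 only."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * z
    return total

def moment_formula(h, n1, n2, var1, var2):
    """Scalar (p = 1) instance of Theorem 5.4.5(i)."""
    a1, a12 = n1 / 2, (n1 + n2) / 2
    rho = var2 / var1
    log_c = (math.lgamma(a1 + h) + math.lgamma(a12)
             - math.lgamma(a1) - math.lgamma(a12 + h))
    return math.exp(log_c) * rho ** a1 * hyp2f1(a1 + h, a12, a12 + h, 1 - rho)

def moment_by_quadrature(h, n1, n2, var1, var2, steps=4000):
    """E[U^h] for U = B / (B + rho (1 - B)), B ~ Beta(n1/2, n2/2), by Simpson's rule.

    Assumes n1 > 2 and n2 > 2, so the integrand is zero at x = 0 and x = 1.
    """
    a1, a2, rho = n1 / 2, n2 / 2, var2 / var1
    log_beta = math.lgamma(a1) + math.lgamma(a2) - math.lgamma(a1 + a2)

    def integrand(x):
        if x <= 0.0 or x >= 1.0:
            return 0.0
        u = x / (x + rho * (1 - x))
        log_pdf = (a1 - 1) * math.log(x) + (a2 - 1) * math.log(1 - x) - log_beta
        return u ** h * math.exp(log_pdf)

    step = 1.0 / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * step)
    return total * step / 3

if __name__ == "__main__":
    args = dict(h=1.5, n1=6, n2=4, var1=2.0, var2=1.0)
    print(moment_formula(**args), moment_by_quadrature(**args))
```

When `var1 == var2` the hypergeometric factor equals one and the formula reduces to the null moment of Theorem 5.3.15 with $a = \tfrac12 n_1$, $b = \tfrac12 n_2$, which is the quantity needed for the size of the test; unequal variances give the moments under the alternative.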
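5.5. NONCENTRAL MATRIX VARIATE BETA DISTRIBUTION

In Section 5.2, we defined the matrix variate beta type I distribution and subsequently derived it using the representation $U = (S_1+S_2)^{-1/2} S_1 ((S_1+S_2)^{-1/2})'$, where $S_i \sim W_p(n_i, \Sigma)$, $i = 1, 2$ are independent. In case $S_2 \sim W_p(n_2, \Sigma, \Theta)$, the corresponding distribution of $U$ is called the noncentral matrix variate beta type I(A) (Hart and Money, 1976) and is derived in the next theorem (de Waal, 1968).

THEOREM 5.5.1. Let $S_1 \sim W_p(n_1, \Sigma)$ and $S_2 \sim W_p(n_2, \Sigma, \Theta)$ be independently distributed. Define $U = (S_1+S_2)^{-1/2} S_1 ((S_1+S_2)^{-1/2})'$, where $(S_1+S_2)^{1/2}((S_1+S_2)^{1/2})'$ is a reasonable nonsingular factorization of $S_1 + S_2$. Then the p.d.f. of $U$ is given by
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \det(U)^{\frac12(n_1-p-1)} \det(I_p-U)^{\frac12(n_2-p-1)} \int_{S>0} \det(S)^{\frac12(n_1+n_2-p-1)} \operatorname{etr}(-\tfrac12 \Sigma^{-1}S)\, {}_0F_1\big(\tfrac12 n_2; \tfrac14 \Sigma^{-1}\Theta S^{1/2}(I_p-U)(S^{1/2})'\big)\, dS, \quad 0 < U < I_p. \tag{5.5.1}$$

Proof: The joint density of $S_1$ and $S_2$ is
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \operatorname{etr}\{-\tfrac12 \Sigma^{-1}(S_1+S_2)\} \det(S_1)^{\frac12(n_1-p-1)} \det(S_2)^{\frac12(n_2-p-1)}\, {}_0F_1\big(\tfrac12 n_2; \tfrac14 \Sigma^{-1}\Theta S_2\big).$$
Using the transformation $S_1 + S_2 = S$, $S_1 = S^{1/2} U (S^{1/2})'$ with Jacobian $J(S_1, S_2 \to U, S) = \det(S)^{\frac12(p+1)}$, we get the joint p.d.f. of $U$ and $S$ as
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \det(U)^{\frac12(n_1-p-1)} \det(I_p-U)^{\frac12(n_2-p-1)} \det(S)^{\frac12(n_1+n_2-p-1)} \operatorname{etr}(-\tfrac12 \Sigma^{-1}S)\, {}_0F_1\big(\tfrac12 n_2; \tfrac14 \Sigma^{-1}\Theta S^{1/2}(I_p-U)(S^{1/2})'\big). \tag{5.5.2}$$
To find the marginal density of $U$, we integrate (5.5.2) with respect to $S$. ■

Substituting $\Theta = 0$ gives the central matrix variate beta type I density. When the rank of $\Theta$ is unity, the linear case, the distribution of $U = (u_{ij})$ is given by Kshirsagar (1961b) as
$$\{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1} \det(U)^{\frac12(n_1-p-1)} \det(I_p-U)^{\frac12(n_2-p-1)} \exp(-\tfrac12 \theta^2)\, {}_1F_1\big(\tfrac12(n_1+n_2); \tfrac12 n_2; \tfrac12 \theta^2(1-u_{11})\big), \quad 0 < U < I_p,$$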
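where $\theta^2$ is the only nonzero eigenvalue of $\Theta$. He also discussed the planar case, i.e., when the rank of $\Theta$ is two.

If $W = (S_1+S_2)^{-1/2} S_2 ((S_1+S_2)^{-1/2})'$, $S_1 \sim W_p(n_1, \Sigma)$ and $S_2 \sim W_p(n_2, \Sigma, \Theta)$, then $W = I_p - U$ and from Theorem 5.5.1, its p.d.f. is
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \det(W)^{\frac12(n_2-p-1)} \det(I_p-W)^{\frac12(n_1-p-1)} \int_{S>0} \det(S)^{\frac12(n_1+n_2-p-1)} \operatorname{etr}(-\tfrac12 \Sigma^{-1}S)\, {}_0F_1\big(\tfrac12 n_2; \tfrac14 \Sigma^{-1}\Theta S^{1/2} W (S^{1/2})'\big)\, dS, \quad 0 < W < I_p.$$
This distribution is known as noncentral matrix variate beta type I(B), and for $p = 1$, the distribution of $W$ is the usual noncentral beta type I distribution.

The next theorem gives the moments of $\det(U)$ and $\det(I_p-U)$, when $U$ has the noncentral matrix variate beta type I(A) distribution.

THEOREM 5.5.2. If $U$ is distributed as (5.5.1), then
(i) $E[\det(U)^h] = \dfrac{\Gamma_p(\tfrac12 n_1+h)\, \Gamma_p[\tfrac12(n_1+n_2)]}{\Gamma_p(\tfrac12 n_1)\, \Gamma_p[\tfrac12(n_1+n_2)+h]} \operatorname{etr}(-\tfrac12 \Theta)\, {}_1F_1\big(\tfrac12(n_1+n_2); \tfrac12(n_1+n_2)+h; \tfrac12 \Theta\big)$, $\operatorname{Re}(h) > -\tfrac12(n_1-p+1)$, and
(ii) $E[\det(I_p-U)^h] = \dfrac{\Gamma_p(\tfrac12 n_2+h)\, \Gamma_p[\tfrac12(n_1+n_2)]}{\Gamma_p(\tfrac12 n_2)\, \Gamma_p[\tfrac12(n_1+n_2)+h]} \operatorname{etr}(-\tfrac12 \Theta)\, {}_2F_2\big(\tfrac12(n_1+n_2), \tfrac12 n_2+h; \tfrac12 n_2, \tfrac12(n_1+n_2)+h; \tfrac12 \Theta\big)$, $\operatorname{Re}(h) > -\tfrac12(n_2-p+1)$.

Proof: From (5.5.1), we have
$$E[\det(U)^h] = \{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2) \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \int_{0<U<I_p} \int_{S>0} \det(U)^{\frac12(n_1-p-1)+h} \det(I_p-U)^{\frac12(n_2-p-1)} \det(S)^{\frac12(n_1+n_2-p-1)} \operatorname{etr}(-\tfrac12 \Sigma^{-1}S)\, {}_0F_1\big(\tfrac12 n_2; \tfrac14 \Sigma^{-1}\Theta S^{1/2}(I_p-U)(S^{1/2})'\big)\, dU\, dS. \tag{5.5.3}$$
Now using the integral (1.6.6) we can write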
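$$\int_{0<U<I_p} \det(U)^{\frac12(n_1-p-1)+h} \det(I_p-U)^{\frac12(n_2-p-1)}\, {}_0F_1\big(\tfrac12 n_2; \tfrac14 \Sigma^{-1}\Theta S^{1/2}(I_p-U)(S^{1/2})'\big)\, dU = \frac{\Gamma_p(\tfrac12 n_1+h)\, \Gamma_p(\tfrac12 n_2)}{\Gamma_p[\tfrac12(n_1+n_2)+h]}\, {}_0F_1\big(\tfrac12(n_1+n_2)+h; \tfrac14 \Sigma^{-1}\Theta S\big). \tag{5.5.4}$$
Substituting (5.5.4) in (5.5.3), we get
$$E[\det(U)^h] = \{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1)\, \Gamma_p[\tfrac12(n_1+n_2)+h] \det(\Sigma)^{\frac12(n_1+n_2)}\}^{-1}\, \Gamma_p(\tfrac12 n_1+h) \operatorname{etr}(-\tfrac12 \Theta) \int_{S>0} \det(S)^{\frac12(n_1+n_2-p-1)} \operatorname{etr}(-\tfrac12 \Sigma^{-1}S)\, {}_0F_1\big(\tfrac12(n_1+n_2)+h; \tfrac14 \Sigma^{-1}\Theta S\big)\, dS. \tag{5.5.5}$$
The integral in (5.5.5) can be evaluated by using (1.6.4). Similarly one can derive $E[\det(I_p-U)^h]$. ■

The distribution of $U$ given in (5.5.1) is useful in studying the power of the test statistic $\dfrac{\det(S_1)}{\det(S_1+S_2)}$ for testing certain hypotheses in multivariate statistical analysis, e.g., see Roy (1966), de Waal (1968), Pillai and Gupta (1969), Gupta (1971a, 1971b, 1971c), Das Gupta (1972), Nagarsenker (1979), and Gupta and Javier (1986).

Asoo (1969), following the univariate density, defined the noncentral matrix variate beta type I density as follows.

DEFINITION 5.5.1. A symmetric positive definite random matrix $U$ $(p \times p)$ is said to have noncentral matrix variate beta type I(B) distribution with parameters $a$, $b$ and $\Theta$, if its p.d.f. is given by
$$\frac{\operatorname{etr}(-\Theta)}{\beta_p(a,b)} \det(U)^{a-\frac12(p+1)} \det(I_p-U)^{b-\frac12(p+1)}\, {}_1F_1(a+b; a; \Theta U), \quad 0 < U < I_p.$$

In Theorem 5.2.5, it was shown that $V = S_2^{-1/2} S_1 S_2^{-1/2} \sim B_p^{II}(\tfrac12 n_1, \tfrac12 n_2)$, where $S_i \sim W_p(n_i, I_p)$, $i = 1, 2$ are independent and $S_2^{1/2}$ is a symmetric square root of $S_2$. Here, we derive the p.d.f. of $V$ when $S_2 \sim W_p(n_2, I_p, \Theta)$. This distribution of $V$ is called the noncentral matrix variate beta type II(A), see de Waal (1969).

THEOREM 5.5.3. Let $S_1 \sim W_p(n_1, I_p)$ and $S_2 \sim W_p(n_2, I_p, \Theta)$ be independently distributed. Then the p.d.f. of $V = S_2^{-1/2} S_1 S_2^{-1/2}$, where $S_2^{1/2}$ is a symmetric square root of $S_2$, is given by
$$\{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \det(V)^{\frac12(n_1-p-1)} \det(I_p+V)^{-\frac12(n_1+n_2)}\, {}_1F_1\big(\tfrac12(n_1+n_2); \tfrac12 n_2; \tfrac12 \Theta(I_p+V)^{-1}\big), \quad V > 0. \tag{5.5.6}$$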
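Proof: The joint p.d.f. of $S_1$ and $S_2$ is
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2)\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \operatorname{etr}\{-\tfrac12(S_1+S_2)\} \det(S_1)^{\frac12(n_1-p-1)} \det(S_2)^{\frac12(n_2-p-1)}\, {}_0F_1\big(\tfrac12 n_2; \tfrac14 \Theta S_2\big).$$
Transforming $V = S_2^{-1/2} S_1 S_2^{-1/2}$, with the Jacobian $J(S_1 \to V) = \det(S_2)^{\frac12(p+1)}$, the joint p.d.f. of $V$ and $S_2$ is given by
$$\{2^{\frac12(n_1+n_2)p}\, \Gamma_p(\tfrac12 n_1) \Gamma_p(\tfrac12 n_2)\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \det(V)^{\frac12(n_1-p-1)} \operatorname{etr}\{-\tfrac12(I_p+V)S_2\} \det(S_2)^{\frac12(n_1+n_2-p-1)}\, {}_0F_1\big(\tfrac12 n_2; \tfrac14 \Theta S_2\big), \quad S_2 > 0,\ V > 0. \tag{5.5.7}$$
From (1.6.4), we have
$$\int_{S_2>0} \operatorname{etr}\{-\tfrac12(I_p+V)S_2\} \det(S_2)^{\frac12(n_1+n_2-p-1)}\, {}_0F_1\big(\tfrac12 n_2; \tfrac14 \Theta S_2\big)\, dS_2 = 2^{\frac12(n_1+n_2)p}\, \Gamma_p[\tfrac12(n_1+n_2)] \det(I_p+V)^{-\frac12(n_1+n_2)}\, {}_1F_1\big(\tfrac12(n_1+n_2); \tfrac12 n_2; \tfrac12 \Theta(I_p+V)^{-1}\big). \tag{5.5.8}$$
Now, integrating (5.5.7), using (5.5.8), we get the marginal p.d.f. of $V$ as
$$\frac{\Gamma_p[\tfrac12(n_1+n_2)]}{\Gamma_p(\tfrac12 n_1)\Gamma_p(\tfrac12 n_2)} \operatorname{etr}(-\tfrac12 \Theta) \det(V)^{\frac12(n_1-p-1)} \det(I_p+V)^{-\frac12(n_1+n_2)}\, {}_1F_1\big(\tfrac12(n_1+n_2); \tfrac12 n_2; \tfrac12 \Theta(I_p+V)^{-1}\big), \quad V > 0,$$
which completes the proof of the theorem. ■

If we transform $U = (I_p+V)^{-1}$ with Jacobian $J(V \to U) = \det(U)^{-(p+1)}$ in the above theorem, then the p.d.f. of $U$ is
$$\{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \det(U)^{\frac12(n_2-p-1)} \det(I_p-U)^{\frac12(n_1-p-1)}\, {}_1F_1\big(\tfrac12(n_1+n_2); \tfrac12 n_2; \tfrac12 \Theta U\big), \quad 0 < U < I_p,$$
which is the noncentral matrix variate beta type I(B) distribution defined by Asoo (1969).

In the special case when $\Theta = \operatorname{diag}(\theta_{11}, 0, \ldots, 0)$, the density (5.5.6) simplifies to
$$\{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1} \exp(-\tfrac12 \theta_{11}) \det(V)^{\frac12(n_1-p-1)} \det(I_p+V)^{-\frac12(n_1+n_2)}\, {}_1F_1\big(\tfrac12(n_1+n_2); \tfrac12 n_2; \tfrac12 \theta_{11} v^{11}\big), \quad V > 0,$$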
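where $(I_p+V)^{-1} = (v^{ij})$, and ${}_1F_1$ is the hypergeometric function of scalar argument (Rainville, 1970).

If $F = S_2^{1/2} S_1^{-1} S_2^{1/2}$, $S_1 \sim W_p(n_1, I_p)$ and $S_2 \sim W_p(n_2, I_p, \Theta)$ are independent, then $F = V^{-1}$ and from Theorem 5.5.3, its p.d.f. is
$$\{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \det(F)^{\frac12(n_2-p-1)} \det(I_p+F)^{-\frac12(n_1+n_2)}\, {}_1F_1\big(\tfrac12(n_1+n_2); \tfrac12 n_2; \tfrac12 \Theta F(I_p+F)^{-1}\big), \quad F > 0.$$
This distribution is known as noncentral matrix variate beta type II(B) and for $p = 1$, the distribution of $F$ is the usual noncentral beta type II distribution. Following the univariate density, Asoo (1969) has defined the noncentral matrix variate beta type II distribution as follows:

DEFINITION 5.5.2. A symmetric positive definite random matrix $V$ $(p \times p)$ is said to have noncentral matrix variate beta type II(B) distribution with parameters $a$, $b$ and $\Theta$, if its p.d.f. is given by
$$\{\beta_p(a,b)\}^{-1} \det(V)^{a-\frac12(p+1)} \det(I_p+V)^{-(a+b)} \operatorname{etr}(-\Theta)\, {}_1F_1\big(a+b; a; \Theta V(I_p+V)^{-1}\big), \quad V > 0.$$

In the above density by transforming $U = (I_p+V)^{-1}V$, we get the noncentral matrix variate beta type I(B) distribution given in Definition 5.5.1.

THEOREM 5.5.4. If $V$ is distributed as noncentral matrix variate beta type II(A) with p.d.f. (5.5.6), then
(i) $E[\det(V)^h] = \dfrac{\Gamma_p(\tfrac12 n_1+h)\, \Gamma_p(\tfrac12 n_2-h)}{\Gamma_p(\tfrac12 n_1)\, \Gamma_p(\tfrac12 n_2)} \operatorname{etr}(-\tfrac12 \Theta)\, {}_1F_1\big(\tfrac12 n_2-h; \tfrac12 n_2; \tfrac12 \Theta\big)$, $-\tfrac12(n_1-p+1) < \operatorname{Re}(h) < \tfrac12(n_2-p+1)$, and
(ii) $E[\det(I_p+V)^{-h}] = \dfrac{\Gamma_p(\tfrac12 n_2+h)\, \Gamma_p[\tfrac12(n_1+n_2)]}{\Gamma_p(\tfrac12 n_2)\, \Gamma_p[\tfrac12(n_1+n_2)+h]} \operatorname{etr}(-\tfrac12 \Theta)\, {}_2F_2\big(\tfrac12(n_1+n_2), \tfrac12 n_2+h; \tfrac12 n_2, \tfrac12(n_1+n_2)+h; \tfrac12 \Theta\big)$, $\operatorname{Re}(h) > -\tfrac12(n_2-p+1)$.

Proof: By definition,
$$E[\det(V)^h] = \{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \int_{V>0} \det(V)^{\frac12(n_1-p-1)+h} \det(I_p+V)^{-\frac12(n_1+n_2)}\, {}_1F_1\big(\tfrac12(n_1+n_2); \tfrac12 n_2; \tfrac12 \Theta(I_p+V)^{-1}\big)\, dV. \tag{5.5.9}$$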
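Substituting $U = (I_p+V)^{-1}$ with Jacobian $J(V \to U) = \det(U)^{-(p+1)}$ in (5.5.9), and using (1.6.6), we get
$$\begin{aligned}
E[\det(V)^h] &= \{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1} \operatorname{etr}(-\tfrac12 \Theta) \int_{0<U<I_p} \det(U)^{\frac12(n_2-2h-p-1)} \det(I_p-U)^{\frac12(n_1+2h-p-1)}\, {}_1F_1\big(\tfrac12(n_1+n_2); \tfrac12 n_2; \tfrac12 \Theta U\big)\, dU \\
&= \{\beta_p(\tfrac12 n_1, \tfrac12 n_2)\}^{-1}\, \frac{\Gamma_p(\tfrac12 n_1+h)\, \Gamma_p(\tfrac12 n_2-h)}{\Gamma_p[\tfrac12(n_1+n_2)]} \operatorname{etr}(-\tfrac12 \Theta)\, {}_2F_2\big(\tfrac12 n_2-h, \tfrac12(n_1+n_2); \tfrac12(n_1+n_2), \tfrac12 n_2; \tfrac12 \Theta\big) \\
&= \frac{\Gamma_p(\tfrac12 n_1+h)\, \Gamma_p(\tfrac12 n_2-h)}{\Gamma_p(\tfrac12 n_1)\, \Gamma_p(\tfrac12 n_2)} \operatorname{etr}(-\tfrac12 \Theta)\, {}_1F_1\big(\tfrac12 n_2-h; \tfrac12 n_2; \tfrac12 \Theta\big).
\end{aligned}$$
Similarly one can prove the second part. ■

PROBLEMS

5.1. Let $S \mid \Sigma \sim W_p(n, \Sigma)$ where $S$ $(p \times p)$ $> 0$ and $\Sigma$ $(p \times p)$ $> 0$. Assume that a priori $\Sigma \sim IW_p(m, \Psi)$. Prove that the marginal distribution of $S$ is a generalized matrix variate beta type II distribution.

5.2. Let $X \sim B_p^{II}(a,b)$ and $Y \sim B_p^{I}(a+b, c)$. Prove that $(I_p+X)^{-1/2} Y (I_p+X)^{-1/2} \sim B_p^{I}(b, a+c)$.

5.3. Let $S \sim W_p(n_1, I_p)$ and $X \sim N_{p,n_2}(0, I_p \otimes I_{n_2})$, $n_2 \le p$, be independently distributed. Then show that,
(i) $F = X'(S + XX')^{-1}X \sim B_{n_2}^{I}\big(\tfrac12 p, \tfrac12(n_1+n_2-p)\big)$, and
(ii) $X'S^{-1}X \sim B_{n_2}^{II}\big(\tfrac12 p, \tfrac12(n_1+n_2-p)\big)$.

5.4. Let $S \sim W_p(n, \Sigma)$ and $A$ $(p \times r)$ be a given matrix of rank $r$ $(\le \tfrac12 p)$. Define $W = (A'S^{-1}A)^{-1}$, $\Delta = (A'\Sigma^{-1}A)^{-1}$, and $B = \Delta^{1/2} W^{-1} (A'S^{-1}\Sigma S^{-1}A)^{-1} W^{-1} \Delta^{1/2}$. Then, (i) $W$ and $B$ are independent, (ii) $W \sim W_r(n-p+r, \Delta)$, and (iii) $B \sim B_r^{I}\big(\tfrac12(n-p+2r), \tfrac12(p-r)\big)$. (Khatri and Rao, 1987)

5.5. Let $S \sim W_p(n_1+n_2, \Sigma)$ and $U \sim B_p^{I}(\tfrac12 n_1, \tfrac12 n_2)$ be independently distributed. Show that $S_1 = S^{1/2} U (S^{1/2})'$ and $S_2 = S^{1/2}(I_p-U)(S^{1/2})'$ are independent, $S_1 \sim W_p(n_1, \Sigma)$ and $S_2 \sim W_p(n_2, \Sigma)$.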
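5.6. Let $T \sim T_{p,m}(n, M, \Sigma, \Omega)$. Then prove that $\Sigma^{-1/2}(T-M)\Omega^{-1}(T-M)'\Sigma^{-1/2} \sim B_p^{II}\big(\tfrac12 m, \tfrac12(n+p-1)\big)$.

5.7. Prove Theorem 5.3.4.

5.8. Prove Theorem 5.3.7(ii).

5.9. Prove Theorem 5.3.11.

5.10. Derive the characteristic function of $Y$, where $Y \sim GB_p^{II}(a, b; \Omega, \Psi)$.

5.11. Let $X \sim GB_p^{I}(a, b; \Omega, 0)$ and partition $X$ and $\Omega$ as
$$X = \begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix}, \quad \Omega = \begin{pmatrix} \Omega_{11} & \Omega_{12} \\ \Omega_{21} & \Omega_{22} \end{pmatrix}, \quad X_{ij},\ \Omega_{ij}\ (p_i \times p_j), \quad p_1 + p_2 = p.$$
Then, prove that
(i) $X_{11}$ and $X_{22\cdot1}$ are independent, $X_{11} \sim GB_{p_1}^{I}(a, b; \Omega_{11}, 0)$, $X_{22\cdot1} \sim GB_{p_2}^{I}(a-\tfrac12 p_1, b; \Omega_{22\cdot1}, 0)$, and
(ii) $X_{21} \mid X_{11}, X_{22\cdot1} \sim IT_{p_2,p_1}\big(2b-p+1, \Omega_{21}\Omega_{11}^{-1}X_{11}, \Omega_{22\cdot1} - X_{22\cdot1}, (I_{p_1} - \Omega_{11}^{-1}X_{11})X_{11}\big)$,
where $X_{22\cdot1} = X_{22} - X_{21}X_{11}^{-1}X_{12}$ and $\Omega_{22\cdot1} = \Omega_{22} - \Omega_{21}\Omega_{11}^{-1}\Omega_{12}$.

5.12. Let $Y \sim GB_p^{II}(a, b; \Sigma, 0)$ and partition $Y$ and $\Sigma$ as
$$Y = \begin{pmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, \quad Y_{ij},\ \Sigma_{ij}\ (p_i \times p_j), \quad p_1 + p_2 = p.$$
Then, prove that
(i) $Y_{11}$ and $Y_{22\cdot1}$ are independent, $Y_{11} \sim GB_{p_1}^{II}(a, b-\tfrac12 p_2; \Sigma_{11}, 0)$, $Y_{22\cdot1} \sim GB_{p_2}^{II}(a-\tfrac12 p_1, b; \Sigma_{22\cdot1}, 0)$, and
(ii) $Y_{21} \mid Y_{11}, Y_{22\cdot1} \sim T_{p_2,p_1}\big(2a+2b-p+1, \Sigma_{21}\Sigma_{11}^{-1}Y_{11}, \Sigma_{22\cdot1} + Y_{22\cdot1}, (I_{p_1} + \Sigma_{11}^{-1}Y_{11})Y_{11}\big)$,
where $Y_{22\cdot1} = Y_{22} - Y_{21}Y_{11}^{-1}Y_{12}$ and $\Sigma_{22\cdot1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$.

5.13. Prove Theorem 5.3.13.

5.14. Prove Theorem 5.3.14(i).

5.15. Prove Theorem 5.3.15(ii).

5.16. Prove Theorem 5.3.16(ii).

5.17. Let $X \sim GB_p^{I}(a, b; \Omega, 0)$, then prove that
$$E[\det(X)^h] = \det(\Omega)^h E[\det(U)^h], \quad E[\det(\Omega - X)^h] = \det(\Omega)^h E[\det(I_p - U)^h],$$
$$E(C_\kappa(X)) = \frac{C_\kappa(\Omega)}{C_\kappa(I_p)}\, E(C_\kappa(U)) \quad \text{and} \quad E(C_\kappa(X^{-1})) = \frac{C_\kappa(\Omega^{-1})}{C_\kappa(I_p)}\, E(C_\kappa(U^{-1})),$$
where $U \sim B_p^{I}(a, b)$.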
PROBLEMS 195 5.18. Let Υ ~ GBjfia, 6; Σ, 0), then prove that E[det(Y)h] = det(E)hE[det(V)h] E{CK{Y)) = ^1e(Ck(V)) and E{C.{Y-')) = ^^E{C.{V-')\ where V ~Б^7(а,6). 5.19. Let U ~ βρ(α, 6) and C/ = TT", where Τ = (ί^·) is a lower triangular matrix with positive diagonal elements. Show that t\^ ..., t^ are independently distributed, t\ ~ B\a - \{i - 1),6), г = 1,... ,p. 5.20. Let J7 ~ GBfab) and t/W = (uy), 1 < i, j < a, then find ρ Ε a=l ^(ndet(C/W)^det(C/)/l). 5.21. Let U ~ Bp(a,6). Then prove that ά^($-ι))> r = 1,... ,p are independently distributed as BJ(a — |(r — 1),6), r = l,...,p. 5.22. Let V ~ В?(а,Ъ) and V = TV, where Τ = (ί0·) is a lower triangular matrix with positive diagonal elements. Show that t\^ ..., t^ are independently distributed, t\ ~ B"(a - \{i - 1),6 - |(p - г)), г = 1,... ,ρ. 5.23. Let С/ ~ В£(±пи \п2). Then show that (i) E((trU)U) = Пг{п1Р(п+1Н2(п2-р)} w vv J J n(n- l)(n + 2) p (ii) S((tr IT')^) = ("-P-l)iP("-l>)("x-P-3) + n,&> + 2)} (ni-p)(ni-p-l)(ni-p-3) πχ -p-3 > 0. where щ + П2 = η. 5.24. Let G = T'(/p + V)~lT = (9ij), where V = TV ~ 5£J(±nb |n2) and Τ is a lower triangular matrix with positive diagonal elements. Then show that E(9ii) = ^=^. (Bilodeau and Srivastava, 1992) 5.25. Prove Theorem 5.4.1(ii). 5.26. Prove Theorem 5.4.2(ii). 5.27. Prove Theorem 5.4.5(ii).
196 CHAPTER 5. MATRIX VARIATE BETA DISTRIBUTIONS 5.28. Let Si ~ \νρ(ηίΊΥ>ι), г = 1,2 be independently distributed. Derive the distribution of V = S2~ * Si S2~ *. 5.29. In Problem 5.28 derive E[det(V)h]. 5.30. Prove Theorem 5.5.2(ii). 5.31. Prove Theorem 5.5.4(ii). 5.32. Let Si ~ Wp(nbEi) and S2 ~ ν7ρ(η2,Σ2,θ) be independently distributed. Prove that the p.d.f. of V = S2 2SXS2 2 is given by ^^r^nfafa) det(Ex)b det(E2)b}_1 etr (- |θ) det(V)^ni-p-1'> [ det(52)^n'+^-p-i) etr {- httlS2 + Sf ΣΓ1.?! V)| 7s2>o l 2 J 0F1(in2;^E2-152)d52)y>0. (de Waal, 1969) 5.33. Let 5|Σ ~ Wp(n,Z) and a priori Σ ~ /Μ^τπ,Φ,θ). Then, prove that the marginal density of S is {&(|(m - ρ - 1), |n) J"' etr (- ±θ) det(^)^^"1) det(5)2(n-p-1} det(5 + φ)-έ(™+»-ρ-υ which is the generalized noncentral matrix variate beta type 11(B) density \щ\(т - ρ - 1), Φ and |θ, where \d with parameters |n, |(ra — ρ — 1), Φ and |θ, where \θ is the noncentrality parameter. 5.34. Let 5 ~ И^(п,/Р) and partition 5 as S = (5У), i,j = 1,2, Sn (<? χ q). Then prove that Sn, 522 and #12 = £ii2£i2£n2 are independently distributed. Further, show that the p.d.f. of #12 is r9[I(n-p + g)]det^ Λι2^12) 5.35. In Problem 5.34, derive the distribution of R = -R12.R'12 for ρ > 2<?. 5.36. Let S ~ Wp(n, Σ) and partition 5 and Σ as /Sn 5i2 \ <? ν_/^Ση Σι2 \ q \S21 S22 J ρ - q' \Σ21 Σ22 ) ρ- q q p-q q p-q
PROBLEMS 197 where ρ > 2q. Further, let Ei2 = 0. Prove that (i) matrices Slh2 = Su — S12S22S21, S22 and S^S^^i aie independently distributed, (ii) Sl2S2~2lS2i ~ Wq(p — q, En), and (iii) det(5de?^t(5 . is distributed as the product of independent beta variables. 5.37. Let S ~ ν7ρ+ς(η, Σ) and partition /S11 Sn\p /Σ11 Σι2\ ρ \S2i S22J q \Σ2ι Σ22/ Q ρ q ρ q Define G = Sl2S22lS2i· Then, show that the p.d.f. of G, for ρ < q, is given by {2*"Γρ(|ς) det^n.^}"1 det(/p - P)Ktr (- \ς^20) det(G)^-?-V xFifin; ^; \p*G), G > 0, where Ρ = Σπ5Σ12Σ2"21Σ21Σ^ and Ρ* = Δ^Δ^Δ^, with Δ = Σ-1. 5.38. Let S\ ~ \νρ(ηι,Σ) and 52 ~ ν7ρ(η2,Σ) be independently distributed. Prove that deffgffig^ and Si + 52 are independent and hence, deduce that if 5» ~ Wp(rii, Σ), г = 1,2,...... are all independent, then det(Si) det(Si + S2) det(Si+S2y det(5i + 52 + 53)''"'" are all independent. 5.39. Let S and Σ be defined as in Problem 5.37. Define R = Sn*Sl2S£S2lSU*. Then, show that (i) for ρ = 1, the p.d.f. of R is given by Γ(!9)Γ[Ι(η-<?)]* (1_Я) ^Ч2п'2П'29'рЛ> where Ρ = ρ2, (Mathai, 1981) (ii) for arbitrary p, the p.d.f. of R is {rp(i9)rp[i(n - 9)]}"1 det(2E„.2)-*"det(/p - P)h det(7p - r^-p-v-V det(R)^~p-l) f det(S)^n-p~1) Js>o etr (- ^г/.25) xF^in; |9; |p*S**Si) dS,0<R< IP. (Troskie, 1969)
198 CHAPTER 5. MATRIX VARJATE BETA DISTRIBUTIONS 5.40. Let R be distributed as in Problem 5.39(ii), then prove that rp(5n + /i)rp[5(n-9)J 2Fi (-n, -n; -n + h;P), Re(h) > --(n - ρ - q + 1).
CHAPTER 6

MATRIX VARIATE DIRICHLET DISTRIBUTIONS

6.1. INTRODUCTION

The random variables $u_1,\dots,u_r$ are said to have the Dirichlet type I distribution with parameters $a_1,\dots,a_{r+1}$ if their joint p.d.f. is given by

$$\frac{\Gamma\big(\sum_{i=1}^{r+1}a_i\big)}{\prod_{i=1}^{r+1}\Gamma(a_i)}\prod_{i=1}^{r}u_i^{a_i-1}\Big(1-\sum_{i=1}^{r}u_i\Big)^{a_{r+1}-1},\quad 0<u_i<1,\ \sum_{i=1}^{r}u_i<1, \tag{6.1.1}$$

where $a_i>0$, $i=1,\dots,r+1$. The random variables $v_1,\dots,v_r$ are said to follow the Dirichlet type II distribution with parameters $b_1,\dots,b_{r+1}$ if their joint p.d.f. is given by

$$\frac{\Gamma\big(\sum_{i=1}^{r+1}b_i\big)}{\prod_{i=1}^{r+1}\Gamma(b_i)}\prod_{i=1}^{r}v_i^{b_i-1}\Big(1+\sum_{i=1}^{r}v_i\Big)^{-\sum_{i=1}^{r+1}b_i},\quad v_i>0, \tag{6.1.2}$$

where $b_i>0$, $i=1,\dots,r+1$. Letting $v_i=u_i\big(1-\sum_{j=1}^{r}u_j\big)^{-1}$, $i=1,\dots,r$, we can obtain the p.d.f. (6.1.2) from (6.1.1) with parameters $a_1,\dots,a_{r+1}$. For this reason, (6.1.2) is also known as the inverted Dirichlet distribution (Tiao and Guttman, 1965). In this chapter we study matrix variate generalizations of (6.1.1) and (6.1.2).

6.2. DENSITY FUNCTIONS

The matrix variate Dirichlet type I and II distributions are defined as follows.

DEFINITION 6.2.1. The $p\times p$ symmetric positive definite random matrices $U_1,\dots,U_r$ are said to have the matrix variate Dirichlet type I distribution with parameters $a_1,\dots,a_{r+1}$, denoted by $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$, if their joint p.d.f. is given by

$$\{\beta_p(a_1,\dots,a_r;a_{r+1})\}^{-1}\prod_{i=1}^{r}\det(U_i)^{a_i-\frac12(p+1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{a_{r+1}-\frac12(p+1)},\quad 0<U_i<I_p,\ 0<\sum_{i=1}^{r}U_i<I_p, \tag{6.2.1}$$
where $a_i>\tfrac12(p-1)$, $i=1,\dots,r+1$, and

$$\beta_p(a_1,\dots,a_r;a_{r+1})=\frac{\prod_{i=1}^{r+1}\Gamma_p(a_i)}{\Gamma_p\big(\sum_{i=1}^{r+1}a_i\big)}. \tag{6.2.2}$$

DEFINITION 6.2.2. The $p\times p$ symmetric positive definite random matrices $V_1,\dots,V_r$ are said to have the matrix variate Dirichlet type II distribution with parameters $b_1,\dots,b_{r+1}$, denoted by $(V_1,\dots,V_r)\sim D_p^{II}(b_1,\dots,b_r;b_{r+1})$, if their joint p.d.f. is given by

$$\{\beta_p(b_1,\dots,b_r;b_{r+1})\}^{-1}\prod_{i=1}^{r}\det(V_i)^{b_i-\frac12(p+1)}\det\Big(I_p+\sum_{i=1}^{r}V_i\Big)^{-\sum_{i=1}^{r+1}b_i},\quad V_i>0, \tag{6.2.3}$$

where $b_i>\tfrac12(p-1)$, $i=1,\dots,r+1$, and $\beta_p(b_1,\dots,b_r;b_{r+1})$ is defined in (6.2.2).

For $r=1$, (6.2.1) reduces to the p.d.f. of the matrix variate beta type I distribution and (6.2.3) to the p.d.f. of the matrix variate beta type II distribution given in Chapter 5. For $p=1$, (6.2.1) and (6.2.3) reduce to (6.1.1) and (6.1.2) respectively. The matrix variate Dirichlet distributions are special cases of the matrix variate Liouville distributions discussed in Chapter 9.

In univariate distribution theory, if $x_i$, $i=1,\dots,r$, and $y$ are independent random variables having chi-square distributions with $n_i$, $i=1,\dots,r$, and $m$ degrees of freedom respectively, then the joint p.d.f. of $u_1,\dots,u_r$, where

$$u_i=\frac{x_i}{\sum_{j=1}^{r}x_j+y},\quad i=1,\dots,r, \tag{6.2.4}$$

is Dirichlet type I with parameters $\tfrac12n_1,\dots,\tfrac12n_r,\tfrac12m$. Further, if

$$v_i=\frac{x_i}{y},\quad i=1,\dots,r, \tag{6.2.5}$$

then the joint density of $v_1,\dots,v_r$ is Dirichlet type II with parameters $\tfrac12n_1,\dots,\tfrac12n_r,\tfrac12m$. In the matrix variate case the Wishart distribution plays the role of the chi-square distribution. Let $S_i\sim W_p(n_i,\Sigma)$, $i=1,\dots,r$, and $B\sim W_p(m,\Sigma)$ be independent random matrices. Then natural generalizations of the ratios (6.2.4) and (6.2.5) are

$$U_i=\Big(\sum_{j=1}^{r}S_j+B\Big)^{-\frac12}S_i\Big(\sum_{j=1}^{r}S_j+B\Big)^{-\frac12},\quad i=1,\dots,r, \tag{6.2.6}$$

and

$$V_i=B^{-\frac12}S_iB^{-\frac12},\quad i=1,\dots,r, \tag{6.2.7}$$

where $A^{\frac12}$ denotes a square root of the matrix $A$. The distributions of $U_i$ ($V_i$) depend on the definition of the root in (6.2.6) ((6.2.7)). Also, certain independence properties depend on the choice of the root; e.g., see Problems 3.11-3.14.
Here we derive the densities (6.2.1) and (6.2.3) by suitably choosing the root in (6.2.6) and (6.2.7). The densities of $U_i$ ($V_i$) for other choices of the roots, which do not yield (6.2.1) and (6.2.3), are derived in Section 6.4.
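The scalar case $p=1$ of the ratios (6.2.4) can be checked by simulation. The following is a minimal sketch, not part of the original text: it uses only Python's standard library, the function names `dirichlet_type1_from_chisq` and `sample_means` are illustrative, and it relies on the standard fact that a chi-square variable with $k$ degrees of freedom is a gamma variable with shape $k/2$ and scale 2. The sample means are compared with the Dirichlet type I means $n_i/(n+m)$.

```python
import random

def dirichlet_type1_from_chisq(dfs, m, rng):
    """One draw of (u_1, ..., u_r) built as in (6.2.4), scalar case p = 1."""
    # A chi-square variable with k degrees of freedom is Gamma(shape=k/2, scale=2).
    xs = [rng.gammavariate(k / 2.0, 2.0) for k in dfs]
    y = rng.gammavariate(m / 2.0, 2.0)
    total = sum(xs) + y
    return [x / total for x in xs]

def sample_means(dfs, m, draws=20000, seed=0):
    """Monte Carlo estimate of E(u_i) for the construction above."""
    rng = random.Random(seed)
    sums = [0.0] * len(dfs)
    for _ in range(draws):
        for i, u in enumerate(dirichlet_type1_from_chisq(dfs, m, rng)):
            sums[i] += u
    return [s / draws for s in sums]

dfs, m = [3, 5], 4                      # illustrative degrees of freedom
means = sample_means(dfs, m)
# Dirichlet type I mean with parameters n_i/2 and m/2: n_i / (n + m).
expected = [k / (sum(dfs) + m) for k in dfs]
```

With these illustrative degrees of freedom the estimated means should lie close to $3/12$ and $5/12$; the agreement is only a sanity check of the first moment, not a proof of the distributional claim.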
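THEOREM 6.2.1. Let $S_i\sim W_p(n_i,\Sigma)$, $i=1,\dots,r$, and $B\sim W_p(m,\Sigma)$ be independently distributed. Define

$$U_i=S^{-\frac12}S_i\big(S^{-\frac12}\big)',\quad i=1,\dots,r, \tag{6.2.8}$$

where $S=\sum_{i=1}^{r}S_i+B$ and $S^{\frac12}\big(S^{\frac12}\big)'$ is any reasonable factorization of $S$. Then $(U_1,\dots,U_r)\sim D_p^{I}(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)$.

Proof: The joint density of $S_1,\dots,S_r$ and $B$ is given by

$$\prod_{i=1}^{r}\Big[\big\{2^{\frac12n_ip}\Gamma_p(\tfrac12n_i)\det(\Sigma)^{\frac12n_i}\big\}^{-1}\operatorname{etr}\big(-\tfrac12\Sigma^{-1}S_i\big)\det(S_i)^{\frac12(n_i-p-1)}\Big]\big\{2^{\frac12mp}\Gamma_p(\tfrac12m)\det(\Sigma)^{\frac12m}\big\}^{-1}\operatorname{etr}\big(-\tfrac12\Sigma^{-1}B\big)\det(B)^{\frac12(m-p-1)}. \tag{6.2.9}$$

Making the transformation $\sum_{i=1}^{r}S_i+B=S$, $S_i=S^{\frac12}U_i\big(S^{\frac12}\big)'$, $i=1,\dots,r$, with Jacobian $J(S_1,\dots,S_r,B\to U_1,\dots,U_r,S)=\det(S)^{\frac12r(p+1)}$ in (6.2.9), the joint density of $U_1,\dots,U_r$ and $S$ is

$$\Big\{2^{\frac12(m+n)p}\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)\det(\Sigma)^{\frac12(m+n)}\Big\}^{-1}\prod_{i=1}^{r}\det(U_i)^{\frac12(n_i-p-1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{\frac12(m-p-1)}\det(S)^{\frac12(m+n-p-1)}\operatorname{etr}\big(-\tfrac12\Sigma^{-1}S\big), \tag{6.2.10}$$

where $n=\sum_{i=1}^{r}n_i$. From (6.2.10), it is easily seen that $(U_1,\dots,U_r)$ and $S$ are independent and the density of $(U_1,\dots,U_r)$ is given by

$$\frac{\Gamma_p[\tfrac12(m+n)]}{\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)}\prod_{i=1}^{r}\det(U_i)^{\frac12(n_i-p-1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{\frac12(m-p-1)}.\ \blacksquare$$

For $r=1$, the above theorem gives the matrix variate beta type I distribution discussed in Chapter 5.

THEOREM 6.2.2. Let $S_i\sim W_p(n_i,I_p)$, $i=1,\dots,r$, and $B\sim W_p(m,I_p)$ be independently distributed. Define

$$V_i=B^{-\frac12}S_iB^{-\frac12},\quad i=1,\dots,r, \tag{6.2.11}$$

where $B^{\frac12}B^{\frac12}=B$. Then $(V_1,\dots,V_r)\sim D_p^{II}(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)$.

Proof: The joint density of $S_1,\dots,S_r$ and $B$ is given by (6.2.9) with $\Sigma=I_p$. Making the transformation $S_i=B^{\frac12}V_iB^{\frac12}$, $i=1,\dots,r$, with $J(S_1,\dots,S_r\to V_1,\dots,V_r)=\det(B)^{\frac12r(p+1)}$, we obtain the joint density of $V_1,\dots,V_r$ and $B$ as

$$\Big\{2^{\frac12(m+n)p}\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)\Big\}^{-1}\prod_{i=1}^{r}\det(V_i)^{\frac12(n_i-p-1)}\operatorname{etr}\Big\{-\tfrac12\Big(I_p+\sum_{i=1}^{r}V_i\Big)B\Big\}\det(B)^{\frac12(m+n-p-1)}, \tag{6.2.12}$$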
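where $n=\sum_{i=1}^{r}n_i$. Integrating out $B$ using

$$\int_{B>0}\operatorname{etr}\Big\{-\tfrac12\Big(I_p+\sum_{i=1}^{r}V_i\Big)B\Big\}\det(B)^{\frac12(m+n-p-1)}\,dB=2^{\frac12(m+n)p}\,\Gamma_p[\tfrac12(m+n)]\det\Big(I_p+\sum_{i=1}^{r}V_i\Big)^{-\frac12(m+n)},$$

we get $(V_1,\dots,V_r)\sim D_p^{II}(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)$. ■

For $r=1$, the above theorem gives the matrix variate beta type II distribution. The p.d.f.'s (6.2.1) and (6.2.3) are called the standard matrix variate Dirichlet type I and II distributions. Next we derive what are known as the generalized matrix variate Dirichlet type I and II distributions.

THEOREM 6.2.3. Let $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$ and let $\Phi_1,\dots,\Phi_r,\Omega$ be symmetric matrices such that $\Omega>0$ and $\Omega-\sum_{i=1}^{r}\Phi_i>0$. Define

$$Z_i=\Big(\Omega-\sum_{i=1}^{r}\Phi_i\Big)^{\frac12}U_i\Big(\Omega-\sum_{i=1}^{r}\Phi_i\Big)^{\frac12}+\Phi_i,\quad i=1,\dots,r. \tag{6.2.13}$$

Then $(Z_1,\dots,Z_r)$ have the generalized matrix variate Dirichlet type I distribution with p.d.f.

$$\frac{\prod_{i=1}^{r}\det(Z_i-\Phi_i)^{a_i-\frac12(p+1)}\det\big(\Omega-\sum_{i=1}^{r}Z_i\big)^{a_{r+1}-\frac12(p+1)}}{\beta_p(a_1,\dots,a_r;a_{r+1})\det\big(\Omega-\sum_{i=1}^{r}\Phi_i\big)^{\sum_{i=1}^{r+1}a_i-\frac12(p+1)}},\quad \Phi_i<Z_i<\Omega,\ i=1,\dots,r,\ \sum_{i=1}^{r}Z_i<\Omega. \tag{6.2.14}$$

Proof: Making the transformation

$$U_i=\Big(\Omega-\sum_{i=1}^{r}\Phi_i\Big)^{-\frac12}(Z_i-\Phi_i)\Big(\Omega-\sum_{i=1}^{r}\Phi_i\Big)^{-\frac12},\quad i=1,\dots,r,$$

with Jacobian $J(U_1,\dots,U_r\to Z_1,\dots,Z_r)=\det\big(\Omega-\sum_{i=1}^{r}\Phi_i\big)^{-\frac12r(p+1)}$ in (6.2.1), we get (6.2.14). ■

If $(Z_1,\dots,Z_r)$ has p.d.f. (6.2.14), then we will write $(Z_1,\dots,Z_r)\sim GD_p^{I}(a_1,\dots,a_r;a_{r+1};\Omega;\Phi_1,\dots,\Phi_r)$. Note that $GD_p^{I}(a_1,\dots,a_r;a_{r+1};I_p;0,\dots,0)=D_p^{I}(a_1,\dots,a_r;a_{r+1})$.

THEOREM 6.2.4. Let $(V_1,\dots,V_r)\sim D_p^{II}(b_1,\dots,b_r;b_{r+1})$ and let $\Phi_1,\dots,\Phi_r,\Omega$ be symmetric matrices such that $\Omega>0$ and $\Omega+\sum_{i=1}^{r}\Phi_i>0$. Define

$$Y_i=\Big(\Omega+\sum_{i=1}^{r}\Phi_i\Big)^{\frac12}V_i\Big(\Omega+\sum_{i=1}^{r}\Phi_i\Big)^{\frac12}+\Phi_i,\quad i=1,\dots,r.$$

Then $(Y_1,\dots,Y_r)$ have the generalized matrix variate Dirichlet type II distribution with p.d.f.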
$$\frac{\det\big(\Omega+\sum_{i=1}^{r}\Phi_i\big)^{b_{r+1}}}{\beta_p(b_1,\dots,b_r;b_{r+1})}\prod_{i=1}^{r}\det(Y_i-\Phi_i)^{b_i-\frac12(p+1)}\det\Big(\Omega+\sum_{i=1}^{r}Y_i\Big)^{-\sum_{i=1}^{r+1}b_i},\quad Y_i>\Phi_i,\ i=1,\dots,r. \tag{6.2.15}$$

Proof: Making the transformation $V_i=\big(\Omega+\sum_{i=1}^{r}\Phi_i\big)^{-\frac12}(Y_i-\Phi_i)\big(\Omega+\sum_{i=1}^{r}\Phi_i\big)^{-\frac12}$, $i=1,\dots,r$, with the Jacobian $J(V_1,\dots,V_r\to Y_1,\dots,Y_r)=\det\big(\Omega+\sum_{i=1}^{r}\Phi_i\big)^{-\frac12r(p+1)}$ in (6.2.3), we get (6.2.15). ■

If $(Y_1,\dots,Y_r)$ has p.d.f. (6.2.15), then we will write $(Y_1,\dots,Y_r)\sim GD_p^{II}(b_1,\dots,b_r;b_{r+1};\Omega;\Phi_1,\dots,\Phi_r)$. In this case $GD_p^{II}(b_1,\dots,b_r;b_{r+1};I_p;0,\dots,0)=D_p^{II}(b_1,\dots,b_r;b_{r+1})$.

Next we define and derive the inverse Dirichlet distribution. The inverse Dirichlet distribution can be obtained from the matrix variate Dirichlet type I distribution by means of an inverse transformation.

THEOREM 6.2.5. Let $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$. Define $X_i=U_i^{-1}$, $i=1,\dots,r$. Then the joint p.d.f. of $X_1,\dots,X_r$ is given by

$$\{\beta_p(a_1,\dots,a_r;a_{r+1})\}^{-1}\prod_{i=1}^{r}\det(X_i)^{-a_i-\frac12(p+1)}\det\Big(I_p-\sum_{i=1}^{r}X_i^{-1}\Big)^{a_{r+1}-\frac12(p+1)},\quad X_i>I_p,\ i=1,\dots,r,\ 0<\sum_{i=1}^{r}X_i^{-1}<I_p, \tag{6.2.16}$$

where $a_i>\tfrac12(p-1)$, $i=1,\dots,r+1$, and $\beta_p(a_1,\dots,a_r;a_{r+1})$ is defined in (6.2.2).

Proof: The transformation $X_i=U_i^{-1}$, $i=1,\dots,r$, with Jacobian $J(U_1,\dots,U_r\to X_1,\dots,X_r)=\prod_{i=1}^{r}\det(X_i)^{-(p+1)}$, in the p.d.f. of $(U_1,\dots,U_r)$ yields the joint p.d.f. of $X_1,\dots,X_r$ as given above. ■

The distribution of $(X_1,\dots,X_r)$ given in the above theorem is called the inverse Dirichlet distribution (Xu, 1987). If $(X_1,\dots,X_r)$ has p.d.f. (6.2.16), then we will write $(X_1,\dots,X_r)\sim ID_p(a_1,\dots,a_r;a_{r+1})$. For $r=1$, this distribution reduces to the inverse beta distribution given in Theorem 5.3.5. Like the matrix variate Dirichlet distribution, this distribution can also be derived using Wishart matrices.

THEOREM 6.2.6. Let $W_i\sim IW_p(n_i+p+1,\Phi)$, $i=1,\dots,r$, and $V\sim IW_p(m+p+1,\Phi)$ be independently distributed. Define

$$X_i=W^{\frac12}W_iW^{\frac12},\quad i=1,\dots,r, \tag{6.2.17}$$

where $W=\sum_{i=1}^{r}W_i^{-1}+V^{-1}$ and $W^{\frac12}\big(W^{\frac12}\big)'$ is any reasonable factorization of $W$. Then $(X_1,\dots,X_r)\sim ID_p(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)$.

Proof: From Theorem 3.4.1, $S_i=W_i^{-1}\sim W_p(n_i,\Phi^{-1})$, $i=1,\dots,r$, and $B=V^{-1}\sim W_p(m,\Phi^{-1})$.
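Now, from Theorem 6.2.1, the matrices $X_i^{-1}=\big(W^{\frac12}W_iW^{\frac12}\big)^{-1}=S^{-\frac12}S_iS^{-\frac12}$, $i=1,\dots,r$, where $S=\sum_{i=1}^{r}S_i+B$, are jointly distributed as $D_p^{I}(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)$. Finally, using Theorem 6.2.5, the result follows. ■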
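All densities of this section share the normalizing constant (6.2.2). The sketch below shows one way to evaluate it on the log scale using only Python's standard library. It assumes the standard product formula for the multivariate gamma function, $\Gamma_p(a)=\pi^{p(p-1)/4}\prod_{i=1}^{p}\Gamma\big(a-\tfrac12(i-1)\big)$; the function names `log_multigamma` and `log_beta_p` are illustrative and not taken from the text.

```python
import math

def log_multigamma(a, p):
    """log Gamma_p(a), assuming Gamma_p(a) = pi^{p(p-1)/4} * prod_{i=1}^p Gamma(a - (i-1)/2)."""
    if a <= (p - 1) / 2.0:
        raise ValueError("Gamma_p(a) requires a > (p - 1)/2")
    log_pi_term = (p * (p - 1) / 4.0) * math.log(math.pi)
    return log_pi_term + sum(math.lgamma(a - (i - 1) / 2.0) for i in range(1, p + 1))

def log_beta_p(params, p):
    """log beta_p(a_1, ..., a_r; a_{r+1}) as in (6.2.2); params lists all r + 1 parameters."""
    return sum(log_multigamma(a, p) for a in params) - log_multigamma(sum(params), p)

# For p = 1 and r = 1 the constant is the ordinary beta function: B(2, 3) = 1/12.
log_b = log_beta_p([2.0, 3.0], 1)
```

Working with logarithms avoids overflow of the gamma functions for large parameters; exponentiate only at the end if the constant itself is needed.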
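6.3. PROPERTIES

In this section, we study certain properties of the matrix variate Dirichlet type I and II distributions. It may be noted that the densities (6.2.1) and (6.2.3) are orthogonally invariant; that is, for any fixed orthogonal matrix $\Gamma$ $(p\times p)$, the distribution of $(\Gamma U_1\Gamma',\Gamma U_2\Gamma',\dots,\Gamma U_r\Gamma')$ is the same as the distribution of $(U_1,\dots,U_r)$, and similarly the distribution of $(\Gamma V_1\Gamma',\Gamma V_2\Gamma',\dots,\Gamma V_r\Gamma')$ is the same as that of $(V_1,\dots,V_r)$.

THEOREM 6.3.1. (i) If $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$ and

$$V_i=\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{-\frac12}U_i\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{-\frac12},\quad i=1,\dots,r, \tag{6.3.1}$$

then $(V_1,\dots,V_r)\sim D_p^{II}(a_1,\dots,a_r;a_{r+1})$.

(ii) If $(V_1,\dots,V_r)\sim D_p^{II}(b_1,\dots,b_r;b_{r+1})$ and

$$U_i=\Big(I_p+\sum_{i=1}^{r}V_i\Big)^{-\frac12}V_i\Big(I_p+\sum_{i=1}^{r}V_i\Big)^{-\frac12},\quad i=1,\dots,r, \tag{6.3.2}$$

then $(U_1,\dots,U_r)\sim D_p^{I}(b_1,\dots,b_r;b_{r+1})$.

Proof: (i) Let $Z=I_p-\sum_{i=1}^{r}U_i$ and $V_i=Z^{-\frac12}U_iZ^{-\frac12}$, $i=1,\dots,r-1$; then $V_r=Z^{-1}-\big(I_p+\sum_{i=1}^{r-1}V_i\big)$. The Jacobian of the transformation (6.3.1) is given by

$$\begin{aligned}J(U_1,\dots,U_r\to V_1,\dots,V_r)&=J(U_1,\dots,U_r\to V_1,\dots,V_{r-1},Z)\,J(V_1,\dots,V_{r-1},Z\to V_1,\dots,V_r)\\&=\det(Z)^{\frac12(r-1)(p+1)}\det(Z)^{p+1}\\&=\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{\frac12(r+1)(p+1)}=\det\Big(I_p+\sum_{i=1}^{r}V_i\Big)^{-\frac12(r+1)(p+1)}.\end{aligned}$$

Now, making the transformation and substituting for the Jacobian in the joint density of $U_1,\dots,U_r$ given in (6.2.1), we get the desired result.

(ii) The proof of the second part follows similarly. ■

THEOREM 6.3.2. If $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$, then $(U_1,\dots,U_s)\sim D_p^{I}\big(a_1,\dots,a_s;\sum_{i=s+1}^{r+1}a_i\big)$, $s<r$, and the density of $(U_{s+1},\dots,U_r)\mid(U_1,\dots,U_s)$ is given by

$$\frac{\prod_{i=s+1}^{r}\det(U_i)^{a_i-\frac12(p+1)}}{\beta_p(a_{s+1},\dots,a_r;a_{r+1})\det\big(I_p-\sum_{i=1}^{s}U_i\big)^{\sum_{i=s+1}^{r+1}a_i-\frac12(p+1)}}\det\Big(I_p-\sum_{i=1}^{s}U_i-\sum_{i=s+1}^{r}U_i\Big)^{a_{r+1}-\frac12(p+1)},\quad U_i>0,\ \sum_{i=s+1}^{r}U_i<I_p-\sum_{i=1}^{s}U_i.$$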
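Proof: First we find the marginal density of $U_1,\dots,U_{r-1}$ by integrating out $U_r$ from the joint density of $U_1,\dots,U_r$ as

$$\{\beta_p(a_1,\dots,a_r;a_{r+1})\}^{-1}\prod_{i=1}^{r-1}\det(U_i)^{a_i-\frac12(p+1)}\int_{0<U_r<I_p-\sum_{i=1}^{r-1}U_i}\det(U_r)^{a_r-\frac12(p+1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{a_{r+1}-\frac12(p+1)}dU_r. \tag{6.3.3}$$

Now, substituting $Z_r=\big(I_p-\sum_{i=1}^{r-1}U_i\big)^{-\frac12}U_r\big(I_p-\sum_{i=1}^{r-1}U_i\big)^{-\frac12}$ with Jacobian $J(U_r\to Z_r)=\det\big(I_p-\sum_{i=1}^{r-1}U_i\big)^{\frac12(p+1)}$ in (6.3.3), we get

$$\{\beta_p(a_1,\dots,a_r;a_{r+1})\}^{-1}\prod_{i=1}^{r-1}\det(U_i)^{a_i-\frac12(p+1)}\det\Big(I_p-\sum_{i=1}^{r-1}U_i\Big)^{a_r+a_{r+1}-\frac12(p+1)}\int_{0<Z_r<I_p}\det(Z_r)^{a_r-\frac12(p+1)}\det(I_p-Z_r)^{a_{r+1}-\frac12(p+1)}dZ_r.$$

But

$$\{\beta_p(a_1,\dots,a_r;a_{r+1})\}^{-1}\int_{0<Z_r<I_p}\det(Z_r)^{a_r-\frac12(p+1)}\det(I_p-Z_r)^{a_{r+1}-\frac12(p+1)}dZ_r=\{\beta_p(a_1,\dots,a_{r-1};a_r+a_{r+1})\}^{-1}.$$

Hence, we get $(U_1,\dots,U_{r-1})\sim D_p^{I}(a_1,\dots,a_{r-1};a_r+a_{r+1})$. Repeating this procedure $r-s$ times gives the marginal density of $(U_1,\dots,U_s)$. The second part of the theorem now follows immediately. ■

COROLLARY 6.3.2.1. If $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$, then $U_i\sim B_p^{I}\big(a_i,\sum_{j=1(\neq i)}^{r+1}a_j\big)$, $i=1,\dots,r$.

THEOREM 6.3.3. If $(V_1,\dots,V_r)\sim D_p^{II}(b_1,\dots,b_r;b_{r+1})$, then $(V_1,\dots,V_s)\sim D_p^{II}(b_1,\dots,b_s;b_{r+1})$, $s<r$, and the density of $(V_{s+1},\dots,V_r)\mid(V_1,\dots,V_s)$ is given by

$$\frac{\det\big(I_p+\sum_{i=1}^{s}V_i\big)^{\sum_{i=1}^{s}b_i+b_{r+1}}\prod_{i=s+1}^{r}\det(V_i)^{b_i-\frac12(p+1)}}{\beta_p\big(b_{s+1},\dots,b_r;\sum_{i=1}^{s}b_i+b_{r+1}\big)}\det\Big(I_p+\sum_{i=1}^{s}V_i+\sum_{i=s+1}^{r}V_i\Big)^{-\sum_{i=1}^{r+1}b_i},\quad V_i>0.$$

Proof: The proof is similar to the one given for Theorem 6.3.2. In this case, to obtain the marginal density of $V_1,\dots,V_{r-1}$, we substitute $W_r=\big(I_p+\sum_{i=1}^{r-1}V_i\big)^{-\frac12}V_r\big(I_p+\sum_{i=1}^{r-1}V_i\big)^{-\frac12}$ with the Jacobian $J(V_r\to W_r)=\det\big(I_p+\sum_{i=1}^{r-1}V_i\big)^{\frac12(p+1)}$. ■

COROLLARY 6.3.3.1. If $(V_1,\dots,V_r)\sim D_p^{II}(b_1,\dots,b_r;b_{r+1})$, then $V_i\sim B_p^{II}(b_i,b_{r+1})$, $i=1,\dots,r$.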
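Corollary 6.3.3.1 can be illustrated in the scalar case $p=1$ by simulation. The sketch below is an illustration under stated assumptions rather than part of the original text: it uses the standard construction of a scalar Dirichlet type II vector as ratios $g_i/g_{r+1}$ of independent gamma variables (the gamma analogue of (6.2.5)), and the fact that a scalar beta type II variable with parameters $(b_i,b_{r+1})$ has mean $b_i/(b_{r+1}-1)$ when $b_{r+1}>1$. The function names and parameter values are illustrative.

```python
import random

def dirichlet_type2_draw(b, b_last, rng):
    """One draw of (v_1, ..., v_r) for p = 1, as ratios g_i / g_{r+1} of gamma variates."""
    g_last = rng.gammavariate(b_last, 1.0)
    return [rng.gammavariate(b_i, 1.0) / g_last for b_i in b]

def marginal_means(b, b_last, draws=40000, seed=1):
    """Monte Carlo estimate of E(v_i) for each component."""
    rng = random.Random(seed)
    sums = [0.0] * len(b)
    for _ in range(draws):
        for i, v in enumerate(dirichlet_type2_draw(b, b_last, rng)):
            sums[i] += v
    return [s / draws for s in sums]

b, b_last = [1.5, 2.5], 6.0             # illustrative parameters, b_last > 1
means = marginal_means(b, b_last)
# Mean of a scalar beta type II variable with parameters (b_i, b_last).
expected = [b_i / (b_last - 1.0) for b_i in b]
```

Note that the marginal mean of $v_i$ depends only on $b_i$ and $b_{r+1}$, in line with the corollary: the other parameters $b_j$, $j\neq i$, do not enter.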
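THEOREM 6.3.4. Let $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$ and define

$$W_i=\Big(I_p-\sum_{j=1}^{s}U_j\Big)^{-\frac12}U_i\Big(I_p-\sum_{j=1}^{s}U_j\Big)^{-\frac12},\quad i=s+1,\dots,r.$$

Then (i) $(W_{s+1},\dots,W_r)$ and $(U_1,\dots,U_s)$ are independent, and (ii) $(W_{s+1},\dots,W_r)\sim D_p^{I}(a_{s+1},\dots,a_r;a_{r+1})$.

Proof: Transforming $W_i=\big(I_p-\sum_{j=1}^{s}U_j\big)^{-\frac12}U_i\big(I_p-\sum_{j=1}^{s}U_j\big)^{-\frac12}$, $i=s+1,\dots,r$, with Jacobian $J(U_{s+1},\dots,U_r\to W_{s+1},\dots,W_r)=\det\big(I_p-\sum_{i=1}^{s}U_i\big)^{\frac12(r-s)(p+1)}$, in the joint density of $(U_1,\dots,U_r)$, we get the desired result. ■

THEOREM 6.3.5. Let $(V_1,\dots,V_r)\sim D_p^{II}(b_1,\dots,b_r;b_{r+1})$ and define

$$Z_i=\Big(I_p+\sum_{j=1}^{s}V_j\Big)^{-\frac12}V_i\Big(I_p+\sum_{j=1}^{s}V_j\Big)^{-\frac12},\quad i=s+1,\dots,r.$$

Then (i) $(Z_{s+1},\dots,Z_r)$ and $(V_1,\dots,V_s)$ are independent, and (ii) $(Z_{s+1},\dots,Z_r)\sim D_p^{II}\big(b_{s+1},\dots,b_r;\sum_{j=1}^{s}b_j+b_{r+1}\big)$.

Proof: Similar to the proof of Theorem 6.3.4. ■

In the next two theorems, we derive the joint p.d.f.'s of partial sums of random matrices distributed as matrix variate Dirichlet type I or II.

THEOREM 6.3.6. Let $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$ and define

$$U_{(i)}=\sum_{j=r_{i-1}^*+1}^{r_i^*}U_j,\quad a_{(i)}=\sum_{j=r_{i-1}^*+1}^{r_i^*}a_j,\quad r_0^*=0,\quad r_i^*=\sum_{j=1}^{i}r_j,\quad i=1,\dots,\ell.$$

Then $(U_{(1)},\dots,U_{(\ell)})\sim D_p^{I}(a_{(1)},\dots,a_{(\ell)};a_{r+1})$.

Proof: Make the transformation

$$U_{(i)}=\sum_{j=r_{i-1}^*+1}^{r_i^*}U_j\quad\text{and}\quad W_j=U_{(i)}^{-\frac12}U_jU_{(i)}^{-\frac12}, \tag{6.3.4}$$

$j=r_{i-1}^*+1,\dots,r_i^*-1$, $i=1,\dots,\ell$. The Jacobian of this transformation is given by

$$\begin{aligned}J\big(U_1,\dots,U_r\to W_1,\dots,W_{r_1^*-1},U_{(1)},\dots,W_{r_{\ell-1}^*+1},\dots,W_{r-1},U_{(\ell)}\big)&=\prod_{i=1}^{\ell}J\big(U_{r_{i-1}^*+1},\dots,U_{r_i^*}\to W_{r_{i-1}^*+1},\dots,W_{r_i^*-1},U_{(i)}\big)\\&=\prod_{i=1}^{\ell}\det(U_{(i)})^{\frac12(r_i-1)(p+1)}.\end{aligned}\tag{6.3.5}$$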
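Now, substituting from (6.3.4) and (6.3.5) in the joint density of $(U_1,\dots,U_r)$ given by (6.2.1), we get the joint density of $W_{r_{i-1}^*+1},\dots,W_{r_i^*-1},U_{(i)}$, $i=1,\dots,\ell$, as

$$\{\beta_p(a_1,\dots,a_r;a_{r+1})\}^{-1}\prod_{i=1}^{\ell}\det(U_{(i)})^{a_{(i)}-\frac12(p+1)}\det\Big(I_p-\sum_{i=1}^{\ell}U_{(i)}\Big)^{a_{r+1}-\frac12(p+1)}\prod_{i=1}^{\ell}\Big\{\prod_{j=r_{i-1}^*+1}^{r_i^*-1}\det(W_j)^{a_j-\frac12(p+1)}\det\Big(I_p-\sum_{j=r_{i-1}^*+1}^{r_i^*-1}W_j\Big)^{a_{r_i^*}-\frac12(p+1)}\Big\}, \tag{6.3.6}$$

where $0<U_{(i)}<I_p$, $\sum_{i=1}^{\ell}U_{(i)}<I_p$, $0<W_j<I_p$, $j=r_{i-1}^*+1,\dots,r_i^*-1$, $\sum_{j=r_{i-1}^*+1}^{r_i^*-1}W_j<I_p$, $i=1,\dots,\ell$. From (6.3.6), it is easy to see that $(U_{(1)},\dots,U_{(\ell)})$ and $(W_{r_{i-1}^*+1},\dots,W_{r_i^*-1})$, $i=1,\dots,\ell$, are independently distributed. Further, $(U_{(1)},\dots,U_{(\ell)})\sim D_p^{I}(a_{(1)},\dots,a_{(\ell)};a_{r+1})$ and $(W_{r_{i-1}^*+1},\dots,W_{r_i^*-1})\sim D_p^{I}(a_{r_{i-1}^*+1},\dots,a_{r_i^*-1};a_{r_i^*})$, $i=1,\dots,\ell$. ■

When $\ell=1$, $\sum_{i=1}^{r}U_i\sim B_p^{I}\big(\sum_{i=1}^{r}a_i,a_{r+1}\big)$.

THEOREM 6.3.7. Let $(V_1,\dots,V_r)\sim D_p^{II}(b_1,\dots,b_r;b_{r+1})$ and define

$$V_{(i)}=\sum_{j=r_{i-1}^*+1}^{r_i^*}V_j,\quad b_{(i)}=\sum_{j=r_{i-1}^*+1}^{r_i^*}b_j,\quad r_0^*=0,\quad r_i^*=\sum_{j=1}^{i}r_j,\quad i=1,\dots,\ell.$$

Then $(V_{(1)},\dots,V_{(\ell)})\sim D_p^{II}(b_{(1)},\dots,b_{(\ell)};b_{r+1})$.

Proof: Make the transformation

$$V_{(i)}=\sum_{j=r_{i-1}^*+1}^{r_i^*}V_j\quad\text{and}\quad Z_j=V_{(i)}^{-\frac12}V_jV_{(i)}^{-\frac12}, \tag{6.3.7}$$

$j=r_{i-1}^*+1,\dots,r_i^*-1$, $i=1,\dots,\ell$. The Jacobian of this transformation is given by

$$\begin{aligned}J\big(V_1,\dots,V_r\to Z_1,\dots,Z_{r_1^*-1},V_{(1)},\dots,Z_{r_{\ell-1}^*+1},\dots,Z_{r-1},V_{(\ell)}\big)&=\prod_{i=1}^{\ell}J\big(V_{r_{i-1}^*+1},\dots,V_{r_i^*}\to Z_{r_{i-1}^*+1},\dots,Z_{r_i^*-1},V_{(i)}\big)\\&=\prod_{i=1}^{\ell}\det(V_{(i)})^{\frac12(r_i-1)(p+1)}.\end{aligned}\tag{6.3.8}$$

Now, substituting from (6.3.7) and (6.3.8) in the joint density of $(V_1,\dots,V_r)$ given by (6.2.3), it can easily be shown that $(V_{(1)},\dots,V_{(\ell)})$ and $(Z_{r_{i-1}^*+1},\dots,Z_{r_i^*-1})$, $i=1,\dots,\ell$, are independently distributed. Further, $(V_{(1)},\dots,V_{(\ell)})\sim D_p^{II}(b_{(1)},\dots,b_{(\ell)};b_{r+1})$ and $(Z_{r_{i-1}^*+1},\dots,Z_{r_i^*-1})\sim D_p^{I}(b_{r_{i-1}^*+1},\dots,b_{r_i^*-1};b_{r_i^*})$, $i=1,\dots,\ell$. ■

When $\ell=1$, the distribution of $\sum_{i=1}^{r}V_i$ is beta type II with parameters $\sum_{i=1}^{r}b_i$ and $b_{r+1}$.

Next, we give generalizations of the results for the marginal and conditional distributions of a beta random matrix.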
208 CHAPTER 6. MATRIX VARIATE DIRICHLET DISTRIBUTIONS THEOREM 6.3.8. Let (Uu ...,Ur)~ Я£(аь ..., ar; ar+l) and define Ui=[rT TJ ,Pi+P2=P, \^21(i) U22(i)J P2 P\ P2 ^22-1(0 "~ ^22(0 ^21(0^11(1)^12(0' r _^ Ai0 = \JP1 ~ Σ UIIU)J ' J=t+1 j=i and ^21(0 "~ ^21(0 + ( A^ ^21(i)J^oA j=i+l for г = 1,2,..., г. ТЛеп, (ty (С^и(1),- - - ,Ε^ιΐ(τ-)) ~ ^(01,...,^;^+!), (ϋ) (С/22-1(1),- - -, ^22-i(r)) ~ ^(^ι - |τ>ι,-.., a». — |pi;ar+i + |pi(r - 1)), and fill,) (Ζ21(1),···,Ζ21(γ))|(^11(0>^22·1(0> 2 = l,...,r) ~ JT^,^ (2ar+i-£+1, 0, Jp2- Σ^ι^.ιο·)^-1), where A = diag(Ab ..., Ar). Proof: See Tan (1968, 1969c). ■ THEOREM 6.3.9. Let (Vi,..., Vr) ~ Df/fa, ... A; 6r+1) and define (Ущг) V\2{i)\ Pi Vi= τ/ τ/ ,Ρι+Ρ2=Ρ, \ V21{i) V22(i) J P2 Pi P2 V22-l(i) = V22(i) ~ ^21(0^7(0^12(0' Bi0= (/Pl + Y^Vii(j)) ' 3=г ^ = ^(/Р1+ Σ 4))(^+Σ4))"' j=i+l j=i and j'=t+l fori = 1,2,.. .,r. ТЛеп, ft (Vli(i),..., Vii(p)) ~ £#(6b ... Л; br+i - 5P2),
(ii) $(V_{22\cdot1(1)},\dots,V_{22\cdot1(r)})\sim D_{p_2}^{II}(b_1-\tfrac12p_1,\dots,b_r-\tfrac12p_1;b_{r+1})$, and

(iii) $(W_{21(1)},\dots,W_{21(r)})\mid(V_{11(i)},V_{22\cdot1(i)},\ i=1,\dots,r)\sim T_{p_2,rp_1}\big(2\sum_{i=1}^{r+1}b_i-p_2-rp_1+1,\,0,\,I_{p_2}+\sum_{i=1}^{r}V_{22\cdot1(i)},\,B^{-1}\big)$, where $B=\operatorname{diag}(B_1,\dots,B_r)$.

Proof: See Tan (1968, 1969c). ■

Next we derive factorizations of the matrix variate Dirichlet density.

THEOREM 6.3.10. Let $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$ and define

$$\begin{aligned}X_1&=U_1\\&\ \,\vdots\\X_r&=(I_p-U_1-\cdots-U_{r-1})^{-\frac12}U_r(I_p-U_1-\cdots-U_{r-1})^{-\frac12}.\end{aligned}\tag{6.3.9}$$

Then $X_1,\dots,X_r$ are independently distributed, $X_i\sim B_p^{I}\big(a_i,\sum_{j=i+1}^{r+1}a_j\big)$, $i=1,\dots,r$.

Proof: The density of $(U_1,\dots,U_r)$ is given by (6.2.1). From the above transformation it is easy to see that

$$\begin{aligned}\det(I_p-U_1)&=\det(I_p-X_1)\\\det(I_p-U_1-U_2)&=\det\big(I_p-X_1-(I_p-X_1)^{\frac12}X_2(I_p-X_1)^{\frac12}\big)=\det(I_p-X_1)\det(I_p-X_2)\\&\ \,\vdots\\\det(I_p-U_1-\cdots-U_r)&=\det(I_p-X_1)\cdots\det(I_p-X_r)\\\det(U_1)&=\det(X_1)\\\det(U_2)&=\det(X_2)\det(I_p-X_1)\\&\ \,\vdots\\\det(U_r)&=\det(X_r)\det(I_p-X_1)\cdots\det(I_p-X_{r-1})\end{aligned}$$

and

$$J(U_1,\dots,U_r\to X_1,\dots,X_r)=\prod_{i=2}^{r}\prod_{j=1}^{i-1}\det(I_p-X_j)^{\frac12(p+1)}=\prod_{i=1}^{r}\det(I_p-X_i)^{\frac12(r-i)(p+1)}.$$

Substituting appropriately in the density of $(U_1,\dots,U_r)$, one obtains
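$$\{\beta_p(a_1,\dots,a_r;a_{r+1})\}^{-1}\det(X_1)^{a_1-\frac12(p+1)}\prod_{i=2}^{r}\Big\{\det(X_i)\prod_{j=1}^{i-1}\det(I_p-X_j)\Big\}^{a_i-\frac12(p+1)}\prod_{i=1}^{r}\det(I_p-X_i)^{a_{r+1}-\frac12(p+1)}\prod_{i=1}^{r}\det(I_p-X_i)^{\frac12(r-i)(p+1)},\quad 0<X_i<I_p,\ i=1,\dots,r. \tag{6.3.10}$$

Combining the factors containing $X_i$ and using the result

$$\beta_p(a_1,\dots,a_r;a_{r+1})=\prod_{i=1}^{r}\beta_p\Big(a_i,\sum_{j=i+1}^{r+1}a_j\Big) \tag{6.3.11}$$

we obtain the desired result. ■

THEOREM 6.3.11. Let $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$ and define

$$\begin{aligned}X_r&=U_r\\X_{r-1}&=(I_p-U_r)^{-\frac12}U_{r-1}(I_p-U_r)^{-\frac12}\\&\ \,\vdots\\X_1&=(I_p-U_r-\cdots-U_2)^{-\frac12}U_1(I_p-U_r-\cdots-U_2)^{-\frac12}.\end{aligned}\tag{6.3.12}$$

Then $X_1,\dots,X_r$ are independently distributed, $X_i\sim B_p^{I}\big(a_i,\sum_{j=1}^{i-1}a_j+a_{r+1}\big)$, $i=1,\dots,r$.

Proof: Similar to the proof of Theorem 6.3.10. ■

THEOREM 6.3.12. Let $(V_1,\dots,V_r)\sim D_p^{II}(b_1,\dots,b_r;b_{r+1})$ and define

$$\begin{aligned}Y_r&=V_r\\Y_{r-1}&=(I_p+V_r)^{-\frac12}V_{r-1}(I_p+V_r)^{-\frac12}\\&\ \,\vdots\\Y_1&=(I_p+V_r+\cdots+V_2)^{-\frac12}V_1(I_p+V_r+\cdots+V_2)^{-\frac12}.\end{aligned}\tag{6.3.13}$$

Then $Y_1,\dots,Y_r$ are independently distributed, $Y_i\sim B_p^{II}\big(b_i,\sum_{j=i+1}^{r+1}b_j\big)$, $i=1,\dots,r$.

Proof: Observe that

$$\begin{aligned}\det(I_p+V_r)&=\det(I_p+Y_r)\\\det(I_p+V_r+V_{r-1})&=\det(I_p+Y_r)\det(I_p+Y_{r-1})\\&\ \,\vdots\\\det(I_p+V_r+\cdots+V_1)&=\det(I_p+Y_r)\cdots\det(I_p+Y_1)\end{aligned}$$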
$$\begin{aligned}\det(V_r)&=\det(Y_r)\\\det(V_{r-1})&=\det(Y_{r-1})\det(I_p+Y_r)\\&\ \,\vdots\\\det(V_1)&=\det(Y_1)\det(I_p+Y_r)\cdots\det(I_p+Y_2).\end{aligned}$$

Substituting these together with the Jacobian of the transformation

$$J(V_1,\dots,V_r\to Y_1,\dots,Y_r)=\prod_{j=2}^{r}\det(I_p+Y_j)^{\frac12(j-1)(p+1)}$$

in the density of $(V_1,\dots,V_r)$ and simplifying, one obtains

$$\{\beta_p(b_1,\dots,b_r;b_{r+1})\}^{-1}\prod_{i=1}^{r}\det(Y_i)^{b_i-\frac12(p+1)}\prod_{i=1}^{r}\det(I_p+Y_i)^{-\sum_{j=i}^{r+1}b_j},\quad Y_i>0. \tag{6.3.14}$$

Now using (6.3.11) the desired result follows. ■

THEOREM 6.3.13. Let $(V_1,\dots,V_r)\sim D_p^{II}(b_1,\dots,b_r;b_{r+1})$ and define

$$\begin{aligned}Y_1&=V_1\\Y_2&=(I_p+V_1)^{-\frac12}V_2(I_p+V_1)^{-\frac12}\\&\ \,\vdots\\Y_r&=(I_p+V_1+\cdots+V_{r-1})^{-\frac12}V_r(I_p+V_1+\cdots+V_{r-1})^{-\frac12}.\end{aligned}\tag{6.3.15}$$

Then $Y_1,\dots,Y_r$ are independently distributed, $Y_i\sim B_p^{II}\big(b_i,\sum_{j=1}^{i-1}b_j+b_{r+1}\big)$, $i=1,\dots,r$.

Proof: Similar to the proof of Theorem 6.3.12. ■

The above results have been derived using matrix transformations. Tan (1969c) has derived Theorem 6.3.11 and Theorem 6.3.12 using certain results on marginal and conditional distributions. Likewise, using suitable inverse transformations, one can derive the matrix variate Dirichlet distribution from independent beta matrices, as given in the following theorem.

THEOREM 6.3.14. Let $X_1,\dots,X_r$ be independent $p\times p$ random matrices, $X_i\sim B_p^{I}(a_i,\beta_i)$, $i=1,\dots,r$. Define

$$\begin{aligned}U_1&=X_1\\U_2&=(I_p-X_1)^{\frac12}X_2(I_p-X_1)^{\frac12}\\&\ \,\vdots\\U_r&=(I_p-X_1)^{\frac12}\cdots(I_p-X_{r-1})^{\frac12}X_r(I_p-X_{r-1})^{\frac12}\cdots(I_p-X_1)^{\frac12}.\end{aligned}$$

Then $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;\beta_r)$ iff $\beta_i=a_{i+1}+\beta_{i+1}$, $i=1,\dots,r-1$.
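The factorization (6.3.11) of the normalizing constant, on which Theorems 6.3.10 to 6.3.14 rest, is a telescoping product of multivariate gamma functions and can be confirmed numerically. The following sketch uses only the standard library and assumes the standard product formula $\Gamma_p(a)=\pi^{p(p-1)/4}\prod_{i=1}^{p}\Gamma\big(a-\tfrac12(i-1)\big)$; the helper names and the parameter values are illustrative.

```python
import math

def log_multigamma(a, p):
    """log Gamma_p(a), assuming Gamma_p(a) = pi^{p(p-1)/4} * prod_{i=1}^p Gamma(a - (i-1)/2)."""
    return (p * (p - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(a - (i - 1) / 2.0) for i in range(1, p + 1))

def log_beta_p(params, p):
    """log beta_p(a_1, ..., a_r; a_{r+1}) as in (6.2.2); params lists all r + 1 parameters."""
    return sum(log_multigamma(a, p) for a in params) - log_multigamma(sum(params), p)

p = 3
a = [1.5, 2.0, 2.5, 3.0]                # a_1, ..., a_r, a_{r+1}; each exceeds (p - 1)/2
r = len(a) - 1

# Left side of (6.3.11): the Dirichlet normalizing constant.
lhs = log_beta_p(a, p)
# Right side: product over i of beta_p(a_i, a_{i+1} + ... + a_{r+1}).
rhs = sum(log_beta_p([a[i], sum(a[i + 1:])], p) for i in range(r))
```

The two sides agree up to floating-point rounding. The second arguments $\sum_{j=i+1}^{r+1}a_j$ are exactly the values $\beta_i$ forced by the condition $\beta_i=a_{i+1}+\beta_{i+1}$ with $\beta_r=a_{r+1}$ in Theorem 6.3.14.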
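THEOREM 6.3.15. Let $X_1,\dots,X_r$ be independent $p\times p$ random matrices, and $X_i\sim B_p^{I}(a_i,\beta_i)$, $i=1,\dots,r$. Define

$$\begin{aligned}U_r&=X_r\\U_{r-1}&=(I_p-X_r)^{\frac12}X_{r-1}(I_p-X_r)^{\frac12}\\&\ \,\vdots\\U_1&=(I_p-X_r)^{\frac12}\cdots(I_p-X_2)^{\frac12}X_1(I_p-X_2)^{\frac12}\cdots(I_p-X_r)^{\frac12}.\end{aligned}$$

Then $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;\beta_1)$ iff $\beta_{i+1}=a_i+\beta_i$, $i=1,\dots,r-1$.

THEOREM 6.3.16. Let $Y_1,\dots,Y_r$ be independent $p\times p$ random matrices, $Y_i\sim B_p^{II}(a_i,\beta_i)$, $i=1,\dots,r$. Define

$$\begin{aligned}V_r&=Y_r\\V_{r-1}&=(I_p+Y_r)^{\frac12}Y_{r-1}(I_p+Y_r)^{\frac12}\\&\ \,\vdots\\V_1&=(I_p+Y_r)^{\frac12}\cdots(I_p+Y_2)^{\frac12}Y_1(I_p+Y_2)^{\frac12}\cdots(I_p+Y_r)^{\frac12}.\end{aligned}$$

Then $(V_1,\dots,V_r)\sim D_p^{II}(a_1,\dots,a_r;\beta_r)$ iff $\beta_i=a_{i+1}+\beta_{i+1}$, $i=1,\dots,r-1$.

THEOREM 6.3.17. Let $Y_1,\dots,Y_r$ be independent $p\times p$ random matrices, and $Y_i\sim B_p^{II}(a_i,\beta_i)$, $i=1,\dots,r$. Define

$$\begin{aligned}V_1&=Y_1\\V_2&=(I_p+Y_1)^{\frac12}Y_2(I_p+Y_1)^{\frac12}\\&\ \,\vdots\\V_r&=(I_p+Y_1)^{\frac12}\cdots(I_p+Y_{r-1})^{\frac12}Y_r(I_p+Y_{r-1})^{\frac12}\cdots(I_p+Y_1)^{\frac12}.\end{aligned}$$

Then $(V_1,\dots,V_r)\sim D_p^{II}(a_1,\dots,a_r;\beta_1)$ iff $\beta_{i+1}=a_i+\beta_i$, $i=1,\dots,r-1$.

From the transformations given in Theorem 6.3.14, one can see that

$$I_p-\sum_{i=1}^{r}U_i=(I_p-X_1)^{\frac12}\cdots(I_p-X_{r-1})^{\frac12}(I_p-X_r)(I_p-X_{r-1})^{\frac12}\cdots(I_p-X_1)^{\frac12},$$

where $I_p-X_1,\dots,I_p-X_r$ are independent, $I_p-X_i\sim B_p^{I}(\beta_i,a_i)$, $i=1,\dots,r$, and $I_p-\sum_{i=1}^{r}U_i\sim B_p^{I}\big(\beta_r,\sum_{i=1}^{r}a_i\big)$ iff $\beta_i=a_{i+1}+\beta_{i+1}$, $i=1,\dots,r-1$. Similarly, from Theorem 6.3.15, one obtains

$$I_p-\sum_{i=1}^{r}U_i=(I_p-X_r)^{\frac12}\cdots(I_p-X_2)^{\frac12}(I_p-X_1)(I_p-X_2)^{\frac12}\cdots(I_p-X_r)^{\frac12},$$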
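where $I_p-X_1,\dots,I_p-X_r$ are independent, $I_p-X_i\sim B_p^{I}(\beta_i,a_i)$, $i=1,\dots,r$, and $I_p-\sum_{i=1}^{r}U_i\sim B_p^{I}\big(\beta_1,\sum_{i=1}^{r}a_i\big)$ iff $\beta_{i+1}=a_i+\beta_i$, $i=1,\dots,r-1$. Thus we obtain the following result, generalizing a result given by Javier and Gupta (1985a) (see Theorem 5.3.25) and Rao (1952).

THEOREM 6.3.18. Let $W_1,\dots,W_r$ be independent $p\times p$ random matrices, $W_i\sim B_p^{I}(c_i,d_i)$, $i=1,\dots,r$. Then

$$W_1^{\frac12}\cdots W_{r-1}^{\frac12}W_rW_{r-1}^{\frac12}\cdots W_1^{\frac12}\sim B_p^{I}\Big(c_r,\sum_{i=1}^{r}d_i\Big)$$

iff $c_i=d_{i+1}+c_{i+1}$, $i=1,\dots,r-1$, and

$$W_r^{\frac12}\cdots W_2^{\frac12}W_1W_2^{\frac12}\cdots W_r^{\frac12}\sim B_p^{I}\Big(c_1,\sum_{i=1}^{r}d_i\Big)$$

iff $c_{i+1}=c_i+d_i$, $i=1,\dots,r-1$.

Tiao and Guttman (1965) derived a certain asymptotic distribution for the univariate Dirichlet type I distribution. Here we give the matrix variate generalization of their result, due to Javier and Gupta (1985a).

THEOREM 6.3.19. Let $(U_1,\dots,U_r)\sim D_p^{I}(a_1,\dots,a_r;a_{r+1})$ and let $W=(W_1,\dots,W_r)$ be defined by $W_i=a_{r+1}U_i$, $i=1,\dots,r$. Then $W$ is asymptotically distributed as a product of independent matrix variate gamma densities; more specifically,

$$\lim_{a_{r+1}\to\infty}f(W)=\prod_{i=1}^{r}\frac{\det(W_i)^{a_i-\frac12(p+1)}\operatorname{etr}(-W_i)}{\Gamma_p(a_i)},$$

where $f(W)$ denotes the density of the matrix $W$.

Proof: In the joint density of $(U_1,\dots,U_r)$ given by (6.2.1) transform $W_i=a_{r+1}U_i$, $i=1,\dots,r$, with the Jacobian $J(U_1,\dots,U_r\to W_1,\dots,W_r)=a_{r+1}^{-\frac12rp(p+1)}$. The density of $W=(W_1,\dots,W_r)$ is given by

$$\frac{\Gamma_p\big(\sum_{i=1}^{r+1}a_i\big)}{a_{r+1}^{\,p\sum_{i=1}^{r}a_i}\,\Gamma_p(a_{r+1})\prod_{i=1}^{r}\Gamma_p(a_i)}\prod_{i=1}^{r}\det(W_i)^{a_i-\frac12(p+1)}\det\Big(I_p-\frac{1}{a_{r+1}}\sum_{i=1}^{r}W_i\Big)^{a_{r+1}-\frac12(p+1)}.$$

The result follows, since

$$\lim_{a_{r+1}\to\infty}\frac{\Gamma_p\big(\sum_{i=1}^{r+1}a_i\big)}{a_{r+1}^{\,p\sum_{i=1}^{r}a_i}\,\Gamma_p(a_{r+1})}=1$$

and

$$\lim_{a_{r+1}\to\infty}\det\Big(I_p-\frac{1}{a_{r+1}}\sum_{i=1}^{r}W_i\Big)^{a_{r+1}-\frac12(p+1)}=\operatorname{etr}\Big(-\sum_{i=1}^{r}W_i\Big).\ \blacksquare$$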
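The first limit used in the proof of Theorem 6.3.19 can be examined numerically. The sketch below evaluates the logarithm of $\Gamma_p\big(\sum_{i=1}^{r+1}a_i\big)\big/\big\{a_{r+1}^{\,p\sum_{i=1}^{r}a_i}\Gamma_p(a_{r+1})\big\}$ for increasing $a_{r+1}$ and should approach zero. It uses only the standard library and again assumes the product formula $\Gamma_p(a)=\pi^{p(p-1)/4}\prod_{i=1}^{p}\Gamma\big(a-\tfrac12(i-1)\big)$; the function names and parameter values are illustrative.

```python
import math

def log_multigamma(a, p):
    """log Gamma_p(a), assuming Gamma_p(a) = pi^{p(p-1)/4} * prod_{i=1}^p Gamma(a - (i-1)/2)."""
    return (p * (p - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(a - (i - 1) / 2.0) for i in range(1, p + 1))

def log_constant_ratio(a, a_last, p):
    """log of Gamma_p(sum(a) + a_last) / (a_last^{p * sum(a)} * Gamma_p(a_last))."""
    s = sum(a)
    return (log_multigamma(s + a_last, p)
            - p * s * math.log(a_last)
            - log_multigamma(a_last, p))

a, p = [1.5, 2.0], 2                    # illustrative a_1, ..., a_r and dimension
ratios = [log_constant_ratio(a, a_last, p) for a_last in (10.0, 1e3, 1e6)]
```

The entries of `ratios` shrink toward zero roughly like $1/a_{r+1}$, which is the behaviour the proof relies on; the computation says nothing about the rate of convergence of the density itself.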
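An analogous result for the Dirichlet type II distribution is easily shown to be the following.

THEOREM 6.3.20. Let $(V_1,\dots,V_r)\sim D_p^{II}(b_1,\dots,b_r;b_{r+1})$ and let $W=(W_1,\dots,W_r)$ be defined by $W_i=b_{r+1}V_i$, $i=1,\dots,r$. Then $W$ is asymptotically distributed as a product of independent matrix variate gamma densities; more specifically,

$$\lim_{b_{r+1}\to\infty}g(W)=\prod_{i=1}^{r}\frac{\det(W_i)^{b_i-\frac12(p+1)}\operatorname{etr}(-W_i)}{\Gamma_p(b_i)},$$

where $g(W)$ denotes the density of the matrix $W$.

6.4. RELATED DISTRIBUTIONS

In this section, we study distributions that are closely related to the matrix variate Dirichlet type I and type II distributions by generalizing the ratios (6.2.4) and (6.2.5).

THEOREM 6.4.1. Let $S_i\sim W_p(n_i,\Sigma_1)$, $i=1,\dots,r$, and $B\sim W_p(m,\Sigma_2)$ be independently distributed. Define

$$U_i=S^{-\frac12}S_i\big(S^{-\frac12}\big)',\quad i=1,\dots,r,$$

where $S^{\frac12}\big(S^{\frac12}\big)'=\sum_{i=1}^{r}S_i+B=S$ is a reasonable factorization of $S$. Then, the density of $U_1,\dots,U_r$ is given by

$$\frac{\det(\Sigma_1)^{-\frac12n}\det(\Sigma_2)^{-\frac12m}}{2^{\frac12(m+n)p}\,\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)}\prod_{i=1}^{r}\det(U_i)^{\frac12(n_i-p-1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{\frac12(m-p-1)}\int_{S>0}\det(S)^{\frac12(m+n-p-1)}\operatorname{etr}\Big\{-\tfrac12\Sigma_2^{-1}S+\tfrac12\big(\Sigma_2^{-1}-\Sigma_1^{-1}\big)S^{\frac12}\Big(\sum_{i=1}^{r}U_i\Big)\big(S^{\frac12}\big)'\Big\}dS,$$
$$0<U_i<I_p,\quad 0<\sum_{i=1}^{r}U_i<I_p, \tag{6.4.1}$$

where $n=\sum_{i=1}^{r}n_i$.

Proof: The joint density of $S_1,\dots,S_r$ and $B$ is given by

$$\prod_{i=1}^{r}\Big[\big\{2^{\frac12n_ip}\Gamma_p(\tfrac12n_i)\det(\Sigma_1)^{\frac12n_i}\big\}^{-1}\det(S_i)^{\frac12(n_i-p-1)}\operatorname{etr}\big(-\tfrac12\Sigma_1^{-1}S_i\big)\Big]\big\{2^{\frac12mp}\Gamma_p(\tfrac12m)\det(\Sigma_2)^{\frac12m}\big\}^{-1}\det(B)^{\frac12(m-p-1)}\operatorname{etr}\big(-\tfrac12\Sigma_2^{-1}B\big).$$

Making the transformation $\sum_{i=1}^{r}S_i+B=S$, $S_i=S^{\frac12}U_i\big(S^{\frac12}\big)'$, $i=1,\dots,r$, with Jacobian $J(S_1,\dots,S_r,B\to U_1,\dots,U_r,S)=\det(S)^{\frac12r(p+1)}$, we get the joint density of $U_1,\dots,U_r$ and $S$ as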
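$$\frac{\det(\Sigma_1)^{-\frac12n}\det(\Sigma_2)^{-\frac12m}}{2^{\frac12(m+n)p}\,\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)}\prod_{i=1}^{r}\det(U_i)^{\frac12(n_i-p-1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{\frac12(m-p-1)}\det(S)^{\frac12(m+n-p-1)}\operatorname{etr}\Big\{-\tfrac12\Sigma_2^{-1}S+\tfrac12\big(\Sigma_2^{-1}-\Sigma_1^{-1}\big)S^{\frac12}\Big(\sum_{i=1}^{r}U_i\Big)\big(S^{\frac12}\big)'\Big\}. \tag{6.4.2}$$

Now, integrating (6.4.2) with respect to $S$ we get the desired result (6.4.1). ■

In Theorem 6.4.1, if $\Sigma_1=\Sigma_2$, then $(U_1,\dots,U_r)\sim D_p^{I}(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)$. For $r=1$, the distribution of $U_1$ is given in Theorem 5.4.4.

THEOREM 6.4.2. If the joint density of the symmetric positive definite random matrices $U_1,\dots,U_r$ is (6.4.1), then the density of $Z=\sum_{i=1}^{r}U_i$ is given by

$$\frac{\det(\Sigma_1)^{-\frac12n}\det(\Sigma_2)^{-\frac12m}}{2^{\frac12(m+n)p}\,\Gamma_p(\tfrac12m)\Gamma_p(\tfrac12n)}\det(Z)^{\frac12(n-p-1)}\det(I_p-Z)^{\frac12(m-p-1)}\int_{S>0}\det(S)^{\frac12(m+n-p-1)}\operatorname{etr}\Big\{-\tfrac12\Sigma_2^{-1}S+\tfrac12\big(\Sigma_2^{-1}-\Sigma_1^{-1}\big)S^{\frac12}Z\big(S^{\frac12}\big)'\Big\}dS,\quad 0<Z<I_p. \tag{6.4.3}$$

Proof: Substituting $\sum_{i=1}^{r}U_i=Z$, $W_i=Z^{-\frac12}U_iZ^{-\frac12}$, $i=1,\dots,r-1$, where $Z^{\frac12}Z^{\frac12}=Z$, in (6.4.1) with the Jacobian of transformation $J(U_1,\dots,U_{r-1},U_r\to W_1,\dots,W_{r-1},Z)=\det(Z)^{\frac12(r-1)(p+1)}$, we get the joint density of $W_1,\dots,W_{r-1}$ and $Z$ as

$$\{\beta_p(\tfrac12n_1,\dots,\tfrac12n_{r-1};\tfrac12n_r)\}^{-1}\prod_{i=1}^{r-1}\det(W_i)^{\frac12(n_i-p-1)}\det\Big(I_p-\sum_{i=1}^{r-1}W_i\Big)^{\frac12(n_r-p-1)}$$
$$\times\frac{\det(\Sigma_1)^{-\frac12n}\det(\Sigma_2)^{-\frac12m}}{2^{\frac12(m+n)p}\,\Gamma_p(\tfrac12m)\Gamma_p(\tfrac12n)}\det(Z)^{\frac12(n-p-1)}\det(I_p-Z)^{\frac12(m-p-1)}\int_{S>0}\det(S)^{\frac12(m+n-p-1)}\operatorname{etr}\Big\{-\tfrac12\Sigma_2^{-1}S+\tfrac12\big(\Sigma_2^{-1}-\Sigma_1^{-1}\big)S^{\frac12}Z\big(S^{\frac12}\big)'\Big\}dS, \tag{6.4.4}$$

where $0<W_i<I_p$, $i=1,\dots,r-1$, $\sum_{i=1}^{r-1}W_i<I_p$ and $0<Z<I_p$. Now, from (6.4.4), it is clear that $(W_1,\dots,W_{r-1})$ and $Z$ are independently distributed, $(W_1,\dots,W_{r-1})\sim D_p^{I}(\tfrac12n_1,\dots,\tfrac12n_{r-1};\tfrac12n_r)$, and the density of $Z$ is given by (6.4.3). ■

Next we give moments of $\prod_{i=1}^{r}\det(U_i)^{n_i}$ and $\det\big(I_p-\sum_{i=1}^{r}U_i\big)$.

THEOREM 6.4.3. If the joint density of the symmetric positive definite random matrices $U_1,\dots,U_r$ is (6.4.1), then

(i)
$$E\Big[\prod_{i=1}^{r}\det(U_i)^{n_i}\Big]^{h}=\frac{\Gamma_p[\tfrac12(m+n)]\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i+hn_i)}{\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)\,\Gamma_p[\tfrac12(m+n)+hn]}\det\big(\Sigma_1^{-1}\Sigma_2\big)^{\frac12n}\,{}_2F_1\big(\tfrac12n+nh,\tfrac12(m+n);\tfrac12(m+n)+hn;I_p-\Sigma_1^{-\frac12}\Sigma_2\Sigma_1^{-\frac12}\big),$$
$$\operatorname{Re}(n_ih)>-\tfrac12(n_i-p+1),\quad i=1,\dots,r,$$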
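and

(ii)
$$E\Big[\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{h}\Big]=\frac{\Gamma_p[\tfrac12(m+n)]\,\Gamma_p(\tfrac12m+h)}{\Gamma_p(\tfrac12m)\,\Gamma_p[\tfrac12(m+n)+h]}\det\big(\Sigma_1^{-1}\Sigma_2\big)^{\frac12n}\,{}_2F_1\big(\tfrac12n,\tfrac12(m+n);\tfrac12(m+n)+h;I_p-\Sigma_1^{-\frac12}\Sigma_2\Sigma_1^{-\frac12}\big),$$
$$\operatorname{Re}(h)>-\tfrac12(m-p+1).$$

Proof: (i) From (6.4.1), we have

$$E\Big[\prod_{i=1}^{r}\det(U_i)^{n_i}\Big]^{h}=\frac{2^{-\frac12(m+n)p}\det(\Sigma_1)^{-\frac12n}\det(\Sigma_2)^{-\frac12m}}{\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)}\int\!\cdots\!\int_{\substack{0<U_i<I_p\\0<\sum_{i=1}^{r}U_i<I_p}}\prod_{i=1}^{r}\det(U_i)^{\frac12(2hn_i+n_i-p-1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{\frac12(m-p-1)}$$
$$\times\int_{S>0}\operatorname{etr}\Big\{-\tfrac12\Sigma_2^{-1}S+\tfrac12\big(\Sigma_2^{-1}-\Sigma_1^{-1}\big)S^{\frac12}\Big(\sum_{i=1}^{r}U_i\Big)\big(S^{\frac12}\big)'\Big\}\det(S)^{\frac12(m+n-p-1)}\,dS\,dU_1\cdots dU_r. \tag{6.4.5}$$

Writing

$$\operatorname{etr}\Big\{\tfrac12\big(\Sigma_2^{-1}-\Sigma_1^{-1}\big)S^{\frac12}\Big(\sum_{i=1}^{r}U_i\Big)\big(S^{\frac12}\big)'\Big\}={}_0F_0\Big(\tfrac12\big(\Sigma_2^{-1}-\Sigma_1^{-1}\big)S^{\frac12}\Big(\sum_{i=1}^{r}U_i\Big)\big(S^{\frac12}\big)'\Big),$$

and using the integral given in Problem 1.10, we get

$$\int\!\cdots\!\int_{\substack{0<U_i<I_p\\0<\sum_{i=1}^{r}U_i<I_p}}\prod_{i=1}^{r}\det(U_i)^{\frac12(2hn_i+n_i-p-1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{\frac12(m-p-1)}{}_0F_0\Big(\tfrac12\big(\Sigma_2^{-1}-\Sigma_1^{-1}\big)S^{\frac12}\Big(\sum_{i=1}^{r}U_i\Big)\big(S^{\frac12}\big)'\Big)dU_1\cdots dU_r$$
$$=\frac{\prod_{i=1}^{r}\Gamma_p[\tfrac12n_i(1+2h)]\,\Gamma_p(\tfrac12m)}{\Gamma_p[\tfrac12(m+n)+hn]}\,{}_1F_1\big(\tfrac12n(1+2h);\tfrac12(m+n)+hn;\tfrac12\big(\Sigma_2^{-1}-\Sigma_1^{-1}\big)S\big),\quad\operatorname{Re}(n_ih)>-\tfrac12(n_i-p+1). \tag{6.4.6}$$

Now, substituting (6.4.6) in (6.4.5) we have

$$E\Big[\prod_{i=1}^{r}\det(U_i)^{n_i}\Big]^{h}=\frac{2^{-\frac12(m+n)p}\det(\Sigma_1)^{-\frac12n}\det(\Sigma_2)^{-\frac12m}\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i+hn_i)}{\Gamma_p[\tfrac12(m+n)+hn]\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)}\int_{S>0}\operatorname{etr}\big(-\tfrac12\Sigma_2^{-1}S\big)\det(S)^{\frac12(m+n-p-1)}\,{}_1F_1\big(\tfrac12n(1+2h);\tfrac12(m+n)+hn;\tfrac12\big(\Sigma_2^{-1}-\Sigma_1^{-1}\big)S\big)\,dS.$$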
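Using the integral (1.6.4), we get the desired result.

(ii) The derivation of $E\big[\det\big(I_p-\sum_{i=1}^{r}U_i\big)^{h}\big]$ is similar. ■

The above results were derived by de Waal (1970) when $S^{\frac12}$ is a lower triangular matrix. The next theorem gives the results derived by Olkin and Rubin (1964). In it, $A^{[i]}$ denotes the leading $i\times i$ principal submatrix of a $p\times p$ matrix $A$, and $A_{[i]}$ the trailing principal submatrix formed by rows and columns $i,\dots,p$.

THEOREM 6.4.4. Let $S_i\sim W_p(n_i,\Sigma)$, $i=1,\dots,r$, and $B\sim W_p(m,\Sigma)$ be independent.

(i) If $B=TT'$, where $T$ is a lower triangular matrix with positive diagonal elements, then the joint density of $V_j=T^{-1}S_j(T^{-1})'$, $j=1,\dots,r$, is given by

$$\{\beta_p(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)\}^{-1}\frac{\prod_{j=1}^{r}\det(V_j)^{\frac12(n_j-p-1)}\det\big(I_p+\sum_{j=1}^{r}V_j\big)^{-\frac12(m+n-p-1)}}{\prod_{i=1}^{p}\det\big(\big(I_p+\sum_{j=1}^{r}V_j\big)^{[i]}\big)}. \tag{6.4.7}$$

(ii) If $B=TT'$, where $T$ is an upper triangular matrix with positive diagonal elements, then the joint density of $V_j=T^{-1}S_j(T^{-1})'$, $j=1,\dots,r$, is given by

$$\{\beta_p(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)\}^{-1}\frac{\prod_{j=1}^{r}\det(V_j)^{\frac12(n_j-p-1)}\det\big(I_p+\sum_{j=1}^{r}V_j\big)^{-\frac12(m+n-p-1)}}{\prod_{i=1}^{p}\det\big(\big(I_p+\sum_{j=1}^{r}V_j\big)_{[i]}\big)}. \tag{6.4.8}$$

Proof: In the joint density of $S_1,\dots,S_r$ and $B$ given by (6.2.9), making the transformation $B=TT'$ ($T=(t_{ij})$, $t_{ii}>0$, is lower triangular) and $V_j=T^{-1}S_j(T^{-1})'$, $j=1,\dots,r$, with Jacobian $J(S_1,\dots,S_r,B\to V_1,\dots,V_r,T)=2^{p}\det(TT')^{\frac12r(p+1)}\prod_{j=1}^{p}t_{jj}^{\,p-j+1}$, we get the joint density of $V_1,\dots,V_r$ and $T$ as

$$\Big\{2^{\frac12(m+n)p}\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)\Big\}^{-1}\det(\Sigma)^{-\frac12(m+n)}\prod_{i=1}^{r}\det(V_i)^{\frac12(n_i-p-1)}\operatorname{etr}\Big\{-\tfrac12\Sigma^{-1}T\Big(I_p+\sum_{i=1}^{r}V_i\Big)T'\Big\}\det(TT')^{\frac12(m+n-p-1)}\,2^{p}\prod_{j=1}^{p}t_{jj}^{\,p-j+1}. \tag{6.4.9}$$

Now, in order to obtain the joint density of $V_1,\dots,V_r$ we need to integrate (6.4.9) with respect to $T$. For this, consider the integral

$$\int\operatorname{etr}\Big\{-\tfrac12\Sigma^{-1}T\Big(I_p+\sum_{i=1}^{r}V_i\Big)T'\Big\}\det(TT')^{\frac12(m+n-p-1)}\,2^{p}\prod_{j=1}^{p}t_{jj}^{\,p-j+1}\,dT. \tag{6.4.10}$$

Substituting $W=T\big(I_p+\sum_{j=1}^{r}V_j\big)T'$ with the Jacobian $J(T\to W)=\big\{2^{p}\prod_{j=1}^{p}t_{jj}^{\,p-j+1}\prod_{i=1}^{p}\det\big(\big(I_p+\sum_{j=1}^{r}V_j\big)^{[i]}\big)\big\}^{-1}$, the above integral becomes

$$\det\Big(I_p+\sum_{j=1}^{r}V_j\Big)^{-\frac12(m+n-p-1)}\Big\{\prod_{i=1}^{p}\det\Big(\Big(I_p+\sum_{j=1}^{r}V_j\Big)^{[i]}\Big)\Big\}^{-1}\int_{W>0}\det(W)^{\frac12(m+n-p-1)}\operatorname{etr}\big(-\tfrac12\Sigma^{-1}W\big)\,dW$$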
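$$=\det\Big(I_p+\sum_{j=1}^{r}V_j\Big)^{-\frac12(m+n-p-1)}\Big\{\prod_{i=1}^{p}\det\Big(\Big(I_p+\sum_{j=1}^{r}V_j\Big)^{[i]}\Big)\Big\}^{-1}\Gamma_p[\tfrac12(m+n)]\det\big(\tfrac12\Sigma^{-1}\big)^{-\frac12(m+n)}. \tag{6.4.11}$$

Now, from (6.4.9) and (6.4.11) we get (6.4.7). The proof for the case when $B=TT'$ ($T$ upper triangular) is similar. ■

In the above theorem, without loss of generality, $\Sigma$ can be taken as $I_p$ since the densities (6.4.7) and (6.4.8) do not depend on it. Now, it may be noted that we have three different joint densities of $V_j$, $j=1,\dots,r$, namely the density obtained in Theorem 6.2.2 from (6.2.11), (6.4.7), and (6.4.8), depending on whether the root of the matrix $B$ is symmetric, lower triangular or upper triangular respectively.

6.5. NONCENTRAL MATRIX VARIATE DIRICHLET DISTRIBUTIONS

Here, we derive the distributions of $(U_1,\dots,U_r)$ and $(V_1,\dots,V_r)$, defined by (6.2.6) and (6.2.7) respectively, when the matrix $B$ has a noncentral Wishart distribution.

THEOREM 6.5.1. Let $S_i\sim W_p(n_i,\Sigma)$, $i=1,\dots,r$, and $B\sim W_p(m,\Sigma,\Theta)$ be independently distributed. Define

$$U_i=S^{-\frac12}S_i\big(S^{-\frac12}\big)',\quad i=1,\dots,r,$$

where $S=\sum_{i=1}^{r}S_i+B$ and $S^{\frac12}\big(S^{\frac12}\big)'$ is any reasonable factorization of $S$. Then, the joint density of $U_1,\dots,U_r$ is given by

$$\Big\{2^{\frac12(m+n)p}\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)\det(\Sigma)^{\frac12(m+n)}\Big\}^{-1}\operatorname{etr}\big(-\tfrac12\Theta\big)\prod_{i=1}^{r}\det(U_i)^{\frac12(n_i-p-1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{\frac12(m-p-1)}$$
$$\times\int_{S>0}\operatorname{etr}\big(-\tfrac12\Sigma^{-1}S\big)\det(S)^{\frac12(m+n-p-1)}\,{}_0F_1\Big(\tfrac12m;\tfrac14\Theta\Sigma^{-1}S^{\frac12}\Big(I_p-\sum_{i=1}^{r}U_i\Big)\big(S^{\frac12}\big)'\Big)dS,$$
$$0<U_i<I_p,\ i=1,\dots,r,\quad 0<\sum_{i=1}^{r}U_i<I_p. \tag{6.5.1}$$

Proof: The joint density of $S_1,\dots,S_r$ and $B$ is given by

$$\prod_{i=1}^{r}\Big[\big\{2^{\frac12n_ip}\Gamma_p(\tfrac12n_i)\det(\Sigma)^{\frac12n_i}\big\}^{-1}\operatorname{etr}\big(-\tfrac12\Sigma^{-1}S_i\big)\det(S_i)^{\frac12(n_i-p-1)}\Big]\big\{2^{\frac12mp}\Gamma_p(\tfrac12m)\det(\Sigma)^{\frac12m}\big\}^{-1}\operatorname{etr}\big(-\tfrac12\Theta\big)\det(B)^{\frac12(m-p-1)}\operatorname{etr}\big(-\tfrac12\Sigma^{-1}B\big)\,{}_0F_1\big(\tfrac12m;\tfrac14\Theta\Sigma^{-1}B\big). \tag{6.5.2}$$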
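Making the transformation $\sum_{i=1}^{r}S_i+B=S$, $S_i=S^{\frac12}U_i\big(S^{\frac12}\big)'$, $i=1,\dots,r$, with Jacobian $J(S_1,\dots,S_r,B\to U_1,\dots,U_r,S)=\det(S)^{\frac12r(p+1)}$ in (6.5.2), we get the joint density of $U_1,\dots,U_r$ and $S$ as

$$\Big\{2^{\frac12(m+n)p}\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)\det(\Sigma)^{\frac12(m+n)}\Big\}^{-1}\operatorname{etr}\big(-\tfrac12\Theta\big)\prod_{i=1}^{r}\det(U_i)^{\frac12(n_i-p-1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{\frac12(m-p-1)}$$
$$\times\operatorname{etr}\big(-\tfrac12\Sigma^{-1}S\big)\det(S)^{\frac12(m+n-p-1)}\,{}_0F_1\Big(\tfrac12m;\tfrac14\Theta\Sigma^{-1}S^{\frac12}\Big(I_p-\sum_{i=1}^{r}U_i\Big)\big(S^{\frac12}\big)'\Big),$$
$$S>0,\quad 0<U_i<I_p,\ i=1,\dots,r,\quad 0<\sum_{i=1}^{r}U_i<I_p. \tag{6.5.3}$$

Integrating (6.5.3) with respect to $S$ we get (6.5.1). ■

The above result was derived by de Waal (1972b) for a triangular root of $S$. Substituting $\Theta=0$ in (6.5.1), we get the results of Theorem 6.2.1. When $\Sigma=I_p$ and $\Theta=\operatorname{diag}(\theta,0,\dots,0)$, the joint distribution of $U_i=(u_{jk(i)})$, $i=1,\dots,r$, is given by Troskie (1967) as

$$\{\beta_p(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)\}^{-1}\exp\big(-\tfrac12\theta\big)\prod_{i=1}^{r}\det(U_i)^{\frac12(n_i-p-1)}\det\Big(I_p-\sum_{i=1}^{r}U_i\Big)^{\frac12(m-p-1)}{}_1F_1\Big(\tfrac12(m+n);\tfrac12m;\tfrac12\theta\Big(1-\sum_{i=1}^{r}u_{11(i)}\Big)\Big). \tag{6.5.4}$$

For $r=1$, Theorem 6.5.1 reduces to Theorem 5.5.1 and the result (6.5.1) simplifies to (5.5.1).

THEOREM 6.5.2. Let $S_i\sim W_p(n_i,\Sigma)$, $i=1,\dots,r$, and $B\sim W_p(m,\Sigma,\Theta)$ be independently distributed. Define

$$V_i=B^{-\frac12}S_iB^{-\frac12},\quad i=1,\dots,r,$$

where $B^{\frac12}B^{\frac12}=B$. Then, the joint density of $V_1,\dots,V_r$ is given by

$$\Big\{2^{\frac12(m+n)p}\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)\det(\Sigma)^{\frac12(m+n)}\Big\}^{-1}\operatorname{etr}\big(-\tfrac12\Theta\big)\prod_{i=1}^{r}\det(V_i)^{\frac12(n_i-p-1)}$$
$$\times\int_{B>0}\operatorname{etr}\Big\{-\tfrac12\Sigma^{-1}B^{\frac12}\Big(I_p+\sum_{i=1}^{r}V_i\Big)B^{\frac12}\Big\}\det(B)^{\frac12(m+n-p-1)}\,{}_0F_1\big(\tfrac12m;\tfrac14\Theta\Sigma^{-1}B\big)\,dB, \tag{6.5.5}$$

where $V_i>0$, $i=1,\dots,r$.

Proof: The joint density of $S_1,\dots,S_r$ and $B$ is given by (6.5.2). Making the transformation $S_i=B^{\frac12}V_iB^{\frac12}$, $i=1,\dots,r$, with $J(S_1,\dots,S_r\to V_1,\dots,V_r)=\det(B)^{\frac12r(p+1)}$,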
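we get the joint density of $V_1,\dots,V_r$ and $B$ as

$$\Big\{2^{\frac12(m+n)p}\Gamma_p(\tfrac12m)\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)\det(\Sigma)^{\frac12(m+n)}\Big\}^{-1}\operatorname{etr}\big(-\tfrac12\Theta\big)\prod_{i=1}^{r}\det(V_i)^{\frac12(n_i-p-1)}\operatorname{etr}\Big\{-\tfrac12\Sigma^{-1}B^{\frac12}\Big(I_p+\sum_{i=1}^{r}V_i\Big)B^{\frac12}\Big\}\det(B)^{\frac12(m+n-p-1)}\,{}_0F_1\big(\tfrac12m;\tfrac14\Theta\Sigma^{-1}B\big). \tag{6.5.6}$$

Integrating (6.5.6) with respect to $B$, we get the desired result. ■

The above result was derived by Troskie (1972). For $\Sigma=I_p$, the integral in (6.5.5), using (1.6.4), becomes

$$\int_{B>0}\operatorname{etr}\Big\{-\tfrac12B\Big(I_p+\sum_{i=1}^{r}V_i\Big)\Big\}\det(B)^{\frac12(m+n-p-1)}\,{}_0F_1\big(\tfrac12m;\tfrac14\Theta B\big)\,dB=2^{\frac12(m+n)p}\,\Gamma_p[\tfrac12(m+n)]\det\Big(I_p+\sum_{i=1}^{r}V_i\Big)^{-\frac12(m+n)}{}_1F_1\Big(\tfrac12(m+n);\tfrac12m;\tfrac12\Theta\Big(I_p+\sum_{i=1}^{r}V_i\Big)^{-1}\Big)$$

and the density (6.5.5) simplifies to

$$\{\beta_p(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)\}^{-1}\operatorname{etr}\big(-\tfrac12\Theta\big)\prod_{i=1}^{r}\det(V_i)^{\frac12(n_i-p-1)}\det\Big(I_p+\sum_{i=1}^{r}V_i\Big)^{-\frac12(m+n)}{}_1F_1\Big(\tfrac12(m+n);\tfrac12m;\tfrac12\Theta\Big(I_p+\sum_{i=1}^{r}V_i\Big)^{-1}\Big),\quad V_i>0,\ i=1,\dots,r. \tag{6.5.7}$$

For $\Theta=0$, the density (6.5.7) reduces to the Dirichlet type II density. When $\Theta=\operatorname{diag}(\theta,0,\dots,0)$, the density (6.5.7) simplifies to

$$\{\beta_p(\tfrac12n_1,\dots,\tfrac12n_r;\tfrac12m)\}^{-1}\exp\big(-\tfrac12\theta\big)\prod_{i=1}^{r}\det(V_i)^{\frac12(n_i-p-1)}\det\Big(I_p+\sum_{i=1}^{r}V_i\Big)^{-\frac12(m+n)}{}_1F_1\big(\tfrac12(m+n);\tfrac12m;\tfrac12\theta\lambda^{11}\big),\quad V_i>0,\ i=1,\dots,r,$$

where $\big(I_p+\sum_{i=1}^{r}V_i\big)^{-1}=(\lambda^{ij})$.

THEOREM 6.5.3. If the joint density of the symmetric positive definite random matrices $U_1,\dots,U_r$ is (6.5.1), then

(i)
$$E\Big[\prod_{i=1}^{r}\det(U_i)^{n_i}\Big]^{h}=\frac{\Gamma_p[\tfrac12(m+n)]\prod_{i=1}^{r}\Gamma_p[\tfrac12n_i(1+2h)]}{\Gamma_p[\tfrac12(m+n)+hn]\prod_{i=1}^{r}\Gamma_p(\tfrac12n_i)}\operatorname{etr}\big(-\tfrac12\Theta\big)\,{}_1F_1\big(\tfrac12(m+n);\tfrac12(m+n)+hn;\tfrac12\Theta\big),\quad\operatorname{Re}(n_ih)>-\tfrac12(n_i-p+1),$$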
6.5. NONCENTRAL MATRIX VABJATE DIBJCHLET DISTRIBUTIONS 221 and 2F2(-m + h, -(m + n); -m, -(m + n) + h; -θ), 2 '2V Re(/i) > --(ra-p + 1). Proof: (i) Prom (6.5.1), we have 4й«°*Г-Ч^&-1-& [■■■ [ Πdet(i/i)5(2ft'li+ni-p-1)det (/p - ΣUi)l 0<C/i</P 2^ i=l ' i=l / etr (- ^~lS) det(5)^m+n-p-1} Js>o ч 2 ' 0F1(im;^E-152(/p-^C/i)(5^),)d5dC/1 · · · dUr. (6.5.8) i=l Now, using the integral given in Problem 1.10 we can write j · · · j Π dct(ui)i(2fcni+ni-p-1) det (/p - Σ ui) έν °<ΣΙ=ι ^</P oF1(im;^E-15^(/p-^K-)(5^),)dC/1 · · · dUr Δ 4 i=l Гр[|(т + n) + /m] V2V y '4 r Re(nih) >--(п{-р+1). (6.5.9) Substituting (6.5.9) in (6.5.8), we have F\UA^(Tn^h 2^(TO+")pdet(E)-^(-"+") Щ=1 Гр[*тц(1 + 2ft)] , 1 ν 4Sdet(i/i)J ЪМ&> rp[> + n) + Metr^2°) / etr(-|E-15)det(5)i(m+n-,,-1) oFj (|(то + η) + ftn; -ΘΣ_15) dS. (6.5.10) <2V ' '4 Now, using integral (1.6.4), we obtain the desired result
222 CHAPTER 6. MATRIX VARJATE DIRJCHLET DISTRIBUTIONS (ii) The derivation of £[det(Jp - ££=1 Ui)h] is similar. ■ de Waal (1972b) and Gupta and Nagar (1987) have derived asymptotic expansions of suitable functions of -21пЩ=1 det(C/{)ni and -21ndet(Jp - ££=1 Щ. PROBLEMS 6.1. Let the random matrix T(pxm) be partitioned as Τ = (7\, ..., Tk), Т{(рχгаг·), гаг· > ρ, г = 1,..., /с and πΐ\Λ h mjt = т. Define Вг· = Т{Г[, г = 1,..., к. Prove that (i) (Вь ..., Вк) ~ D^^mi,..., \mk· \(η + ρ-ΐ)) if Τ - Тр,та(п, О, /р, /та) and (п)(В1,...,В0~ОДть...,§т^ 6.2. Let (Uu ..., Ur) ~ Df,(au ..., ar; 6) and X ~ Β£(Σ?=ι α* + &, c) be independent. Prove that (X*£/iX5,..., XhUrX*) ~ £>£(аь ..., ar; 6 + c). 6.3. Let (E/i,..., C/r) ~ Dp(a,i,..., ar; 6). Prove that for any nonzero α G Kp, a't/ia a47ra\ 7 1 -£>ί(αι,...,αΓ;6). 6.4. Let (J7b ..., Ur) ~ Dj,(au ..., ar; 6) and X ~ B£J(c, ΣΓ=ι ^ + 6) be independent. Prove that ((Jp + X)-iUx(Ip + X)~K · · ·, (/P + ЛТ*ВД + -Χ")"*) ~ £>£(аь...,аг; Ь +с). 6.5. Let (Ε/χ,..., Ur) ~ Dp(cii,..., ar; ar+i). Prove that for any nonzero α G Kp, / а'а а'а \ ,/ 1, . 1, . Ι^Γ^'··'^^)~ί)ι(αΐ-2(ρ-1)'···'α'·-2(ρ-1); ar+1 + i(p-l)(r-l)). 6.6. If (Ui,..., Ur) ~ Dp(ai,..., Or; ar+i), then show that r (С/ь...,С/г_ь/р-^С/у,С/г+ь...,С/г) ~ D^(ai,...,ai_i,ar+i, аг-+ь...,аг;аг·). 6.7. Let (J7b ..., J7r) ~£>£(ab ..., ar; ar+i) and S~ И^(2а, Σ), а = Е£}а,·, be independent. Define Wi = S^UiS*, i = 1,... ,r-l, and Wr = 5^(/p-Ey=i tfj)Sl Then show that Wi,..., Wr are independent, Wi ~ И^р(2аг·, Σ), г = 1,..., г -1 and И^-И^р(2аг+1,Е). 6.8. Let(Zi,...,ZP)-G^(ai,...,aP;aP+i;a«i,...,«P). (i) Show that, for* < r, (Zb ...,Z,)~ GDfau ..., as;E^+iV>Ω-£;=*+As Φΐ,···>Φβ)· (ii)' Prove that Zt- - £Вр7(аг·, Σ^\{τΗ) 4,Ω - Σ;=ι(*·) Φ* Φ.·), г = 1,..., г. (iii) Derive the conditional distribution of (Zs+i,..., Zr)\(Zu..., Zs).
PROBLEMS 223 6.9. Let (Yi,..., Yr) ~ GDjfibu ... A; br+i; Ω; Φι, .··, Фг)· Derive the marginal distribution of (Yi,...,ys), s < r, and the conditional distribution of (Ув+1,...,Уг)|(Уь...,Ув). 6.10. Prove Theorem 6.3.1(ii). 6.11. Prove Theorem 6.3.20. 6.12. Prove that the inverse matrix variate Dirichlet distribution is orthogonally invariant, that is, if (Xi,..., Xr) ~ IDp(ai,..., αΓ; ar+i) then for any fixed orthogonal matrix Γ(ρχ ρ), the distribution of (ΓΧχΓ', ΓΧ2Γ',..., ГХГГ') is same as that of (Xi,..., Xr). 6.13. Let (Хь..., Xr) ~ IDp(a,i,..., ar; ar+i). Prove that for any nonzero a G Kp, 6.14. Let (Xb..., Xr) ~ IDp(au..., ar; 6) and У ~ /£Ρ(Σ[=1 α* + b, c) be independent. Prove that (Y^X^,..., У 2Xryl) - /£>р(аь ..., аг; Ы- с). 6.15. Let (ХЬ...,ХГ) ~ i\Dp(ai,...,ar;ar+i). Show that, for s < r, (Xi,... ,XS) ~ /Г)р(а1,...,а5·,^^^!^), and the density of (Xs+b ... ,Xr)|(Xb ... ,XS) is given by IIUn det(Xt-)—^+1) det(/p - Σ|=ι ΧΓ1 - Σ·=5+ι Xrl)^~h^) pp(as+u ..., ar; ar+1) det(Jp - ELi Xf1)*^! «-*<ρ+ι> 0<Xr1</P^ = s+l,...,r, Σ,ΧΓι<Ιρ-Σ,^1· i=s+l i=l Hence or otherwise show that Хг· ~ /Бр(аг·, ^Йгл=1 aj) · 6.16. If (Xb..., Xr) ~ IDp(a,i,..., ar; ar+i), then show that ί Χχ,..., Xi-ι ,у1Р — /2 Xj ) ' ^*"+ι,..., ХГ J v i=i y ~ IDp(au ..., α»_ι, Or+i, a»+i,..., ar; a»). 6.17. Let (Xb...,Xr) - /Dp(ab...,ar;ar+1). Define YJ = (Jp - Σ^ι*/1)*-*. (A> - Ej=i ^"i"1)^ г = 5 + 1,..., r. Then show that (i) (У5+1,... ,УГ) - /Dp(ae+i,... ,ar;ar+i), and (ii) (У5+1,..., Yr) and (Xb ..., Xs) are independent. 6.18. Let (Xb...,Xr) ~ /£>ρ(αι,...,αΓ;αΓ+ι) and V ~ JWp(2a + ρ + 1, Φ), α = IZyilttj, be independent. Define Yi = V^XiV^, г = l,...,r — 1, and Yr = W(JP - EJ=i Х/1)"1^. Then show that Уь ... ,УГ are independent, Y{ ~ IWp(2ai + ρ + 1, Φ), г = 1,..., г - 1 and Yr ~ Wp(2ar+1 + ρ + 1, Φ).
224 CHAPTER 6. MATRIX VARIATE DIRICHLET DISTRIBUTIONS 6.19. Let (Χι,... ,Xr) ~ IDp(a,i,..., αΓ; ar+i) and the random matrix Xi be partitioned as Xi = Ι ν ^ v·21^^ I » where Хца) is a matrix of order 9x5. Then \Αΐ2(») A 22(i)/ show that (i) №2(1), - - -, Лад) ~ IDp-qfai - |<?,..., ar - \q\ ar+x + \q(r - 1)), and (ii) (1ц.2(1),..., Хц.2(г)) ~ IDp(ai,..., ar; ar+i) where Xn.2(o = ^n(0~ ^12(0^22(0^21(0· 6.20. Let (Xi,...,Xr) ~ IDp(a,i,...,ar;ar+i) and A(gxp) be a constant matrix such that A A' = Iq. Then show that (АХгА',..., AXrA') ~ Л><(сц - ί(ρ - ς),..., ar - \{V - q); ar+l + -(p-q)(r-l)). 6.21. Prove Theorem 6.4.3(ii). 6.22. Prove Theorem 6.4.4(ii). 6.23. Let (Ε/χ,..., Ur) be distributed as in Theorem 6.5.1. Derive the distribution of Ζ = Σί=ι Ui. 6.24. Let (VI,..., Vr) be distributed as in Theorem 6.5.2. Derive the distribution of 6.25. Prove Theorem 6.5.3(ii). 6.26. Let {жу, j = 1,..., Ni} be a random sample from a p-variate normal population with mean vector μ{ and covariance matrix Σ*, г = 1,..., к. Then show that for testing Η : Σι = · · · = Σ^; μλ = · · · = /xfc, the likelihood ratio criterion is a function of Dirichlet type I matrices. 6.27. Let (Vu...,Vr) - i^J(bi,... A;br+i) and W{ = br+lVu i = l,...,r. Then show that the p.d.f. of W = (Wu ..., Wr) can be expanded as aetiWi)*-1*^ eti(-Wj) 'iW)'R rM oj , 3α2 + 4α2 _з WJ>0, t = l,...,r, where αχ = tr[(- ΣΓ=ι W;)2] + 2mtr(- ELi Ж) + m2p - \mp{p + 1), a2 = 2 tr[(- ΣΓ-1 И^)3]+3т tr[(- Π=ι И^{)2]-m3p+ |m2p(p+1)- |тр(2р2+3р-1) and m = 5ZJ=i bj· (Gupta and Song, 1990)
CHAPTER 7

DISTRIBUTION OF QUADRATIC FORMS

7.1. INTRODUCTION

Let $x$ $(n \times 1)$ be a random vector and $A$ $(n \times n)$ be a symmetric matrix. The quadratic form in $x$ associated with $A$ is defined as
\[ s = x'Ax. \tag{7.1.1} \]
The distribution of $s$, assuming $x \sim N_n(\mu, \Sigma)$, has been studied extensively by many authors, e.g., Kotz, Johnson and Boyd (1967a, 1967b), Johnson and Kotz (1970, 1972), Khatri (1980), Konishi, Niki and Gupta (1988), and Mathai and Provost (1992). When $X$ $(p \times n)$ is a random matrix, the matrix quadratic form in $X$ associated with $A$ is defined by
\[ S = XAX'. \tag{7.1.2} \]
In this chapter we study the distribution of $S$ assuming that $X$ has a matrix variate normal distribution. The distribution of $S$ has been studied by Khatri (1959b, 1962, 1963, 1966, 1971, 1975, 1977, 1980), Hogg (1963), Hayakawa (1966, 1972), Shah (1970), Crowther (1975), and Gupta and Varga (1991, 1992, 1993, 1994d).

7.2. DENSITY FUNCTION

In this section we derive the density of $S$ when $E(X) = 0$. First we give the derivation due to Khatri (1966).

THEOREM 7.2.1. Let $X \sim N_{p,n}(0, \Sigma \otimes \Phi)$, $n > p$, $\Sigma > 0$, and $\Phi > 0$. Then the density function of $S = XAX'$, where $A\,(n \times n) > 0$, is given by
\[ \bigl\{2^{np/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(A\Phi)^{-p/2} \det(\Sigma)^{-n/2} \det(S)^{(n-p-1)/2} \operatorname{etr}\bigl(-\tfrac12 q^{-1}\Sigma^{-1}S\bigr)\; {}_0F_0^{(n)}\bigl(B, \tfrac12 q^{-1}\Sigma^{-1}S\bigr), \quad S > 0, \tag{7.2.1} \]
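where $B = I_n - qA^{-1/2}\Phi^{-1}A^{-1/2}$, $q > 0$ is an arbitrary constant and $A^{1/2}A^{1/2} = A$.

Proof: The density of $X$ is
\[ (2\pi)^{-np/2} \det(\Sigma)^{-n/2} \det(\Phi)^{-p/2} \operatorname{etr}\bigl(-\tfrac12 \Sigma^{-1}X\Phi^{-1}X'\bigr), \quad X \in \mathbb{R}^{p\times n}. \]
Transforming $Y = XA^{1/2}$, with the Jacobian $J(X \to Y) = \det(A)^{-p/2}$, we obtain the density of $Y$ as
\[ \begin{aligned} &(2\pi)^{-np/2} \det(\Sigma)^{-n/2} \det(A\Phi)^{-p/2} \operatorname{etr}\bigl(-\tfrac12 \Sigma^{-1}YA^{-1/2}\Phi^{-1}A^{-1/2}Y'\bigr) \\ &\quad = (2\pi)^{-np/2} \det(\Sigma)^{-n/2} \det(A\Phi)^{-p/2} \operatorname{etr}\bigl(-\tfrac12 q^{-1}\Sigma^{-1}YY' + \tfrac12 q^{-1}\Sigma^{-1}YBY'\bigr), \quad Y \in \mathbb{R}^{p\times n}. \end{aligned} \]
Now using Definition 1.6.1, we can write
\[ \operatorname{etr}\bigl(\tfrac12 q^{-1}\Sigma^{-1}YBY'\bigr) = {}_0F_0\bigl(\tfrac12 q^{-1}Y'\Sigma^{-1}YB\bigr), \]
and integrating out the density of $Y$ over the surface $YY' = S$, we get the density of $S$ as
\[ (2\pi)^{-np/2} \det(\Sigma)^{-n/2} \det(A\Phi)^{-p/2}\, g(B), \tag{7.2.2} \]
where
\[ g(B) = \int_{YY'=S} \operatorname{etr}\bigl(-\tfrac12 q^{-1}\Sigma^{-1}YY'\bigr)\; {}_0F_0\bigl(\tfrac12 q^{-1}Y'\Sigma^{-1}YB\bigr)\, dY. \tag{7.2.3} \]
Since $B$ is a symmetric matrix, the integral (7.2.3) is invariant under the transformation $B \to HBH'$, $H \in O(n)$, and hence under integration with respect to $H$ over the orthogonal group $O(n)$. Hence using (1.6.3), we get
\[ \begin{aligned} g(B) &= \int_{YY'=S} \operatorname{etr}\bigl(-\tfrac12 q^{-1}\Sigma^{-1}YY'\bigr)\; {}_0F_0^{(n)}\bigl(B, \tfrac12 q^{-1}\Sigma^{-1}YY'\bigr)\, dY \\ &= \frac{\pi^{np/2}}{\Gamma_p(\tfrac12 n)} \det(S)^{(n-p-1)/2} \operatorname{etr}\bigl(-\tfrac12 q^{-1}\Sigma^{-1}S\bigr)\; {}_0F_0^{(n)}\bigl(B, \tfrac12 q^{-1}\Sigma^{-1}S\bigr). \end{aligned} \tag{7.2.4} \]
The last step is obtained by using (1.4.24). Now substituting for $g(B)$ from (7.2.4) in (7.2.2) we get the desired result. ■

We will write $S \sim Q_{p,n}(A, \Sigma, \Phi)$ if the density of $S$ is (7.2.1). It may be noted here that for $A\Phi = I_n$ the density (7.2.1) reduces to the Wishart density $W_p(n, \Sigma)$. The density of $S$, in an equivalent form, can also be written as
\[ \bigl\{2^{np/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(A\Phi)^{-p/2} \det(\Sigma)^{-n/2} \det(S)^{(n-p-1)/2}\; {}_0F_0^{(n)}\bigl(\Phi^{-1}A^{-1}, -\tfrac12 \Sigma^{-1}S\bigr), \quad S > 0. \tag{7.2.5} \]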
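The remark that (7.2.1) reduces to the Wishart density when $A\Phi = I_n$ can be illustrated numerically. The following is only a sketch under stated assumptions: it uses NumPy, takes $A = \Phi^{-1}$ as one convenient way to obtain $A\Phi = I_n$, builds $X = \Sigma^{1/2} Z \Phi^{1/2}$ from a standard normal matrix $Z$, and uses the hypothetical helpers `sym_sqrt` and `random_spd`, which are not part of the text. It checks that $XAX'$ coincides with the Wishart construction $\Sigma^{1/2} Z Z' \Sigma^{1/2}$, and that $B$ vanishes for $q = 1$, so that the hypergeometric factor in (7.2.1) equals one.

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

def random_spd(k, rng):
    """Hypothetical helper: a random symmetric positive definite k x k matrix."""
    G = rng.standard_normal((k, k))
    return G @ G.T + k * np.eye(k)

rng = np.random.default_rng(0)
p, n = 3, 5
Sigma = random_spd(p, rng)
Phi = random_spd(n, rng)
A = np.linalg.inv(Phi)                      # assumption: A = Phi^{-1}, so A Phi = I_n

Z = rng.standard_normal((p, n))             # Z ~ N_{p,n}(0, I_p (x) I_n)
X = sym_sqrt(Sigma) @ Z @ sym_sqrt(Phi)     # X ~ N_{p,n}(0, Sigma (x) Phi)

S = X @ A @ X.T                             # matrix quadratic form (7.1.2)
W = sym_sqrt(Sigma) @ Z @ Z.T @ sym_sqrt(Sigma)   # usual W_p(n, Sigma) construction
wishart_gap = np.max(np.abs(S - W))

# With q = 1 the matrix B of Theorem 7.2.1 is zero when A Phi = I_n.
q = 1.0
A_inv_half = np.linalg.inv(sym_sqrt(A))
B = np.eye(n) - q * A_inv_half @ np.linalg.inv(Phi) @ A_inv_half
b_norm = np.max(np.abs(B))

print(wishart_gap, b_norm)
```

Both printed quantities should be at the level of floating-point rounding error. The check is algebraic, so it holds for any draw of $Z$, not only on average.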
7.2. DENSITY FUNCTION 227 By substituting from in (7.2.5) we obtain the expansion in terms of zonal polynomials, which, however, is only slowly convergent. The following expansion in terms of Laguerre polynomials may be preferable for computational purposes |(2^)bprp(in)r1det(E)-bdet(5)^(n-p-1)etr(-V1E-15) The forms of the density (7.2.5) and (7.2.7), for Φ = /n, were given by Hayakawa (1966) and Shah (1970) respectively. Khatri (1966) also derived the following form for the density of 5, which is useful in obtaining expected values of CK(S). THEOREM 7.2.2. Let X ~ iVp,n(0,E <g> Φ), η > ρ, and Σ > 0, Φ > 0. Then the density of S = XAX', where Α (η χ η) > 0, is given by J2bprp(in)}~1det(E)-bdet(Q)^det(5)^n-p-1) / etr (- hrbH&H'jrbs) [dH], (7.2.8) JHeO{n) v 2 ' where H' = {H[ H'2) is annxn orthogonal matrix with H\ (pxn) and #2 ((n-p)xn) andQ~l = A^A%. Proof: By using (7.2.6), and (1.5.11), we have 1 °° 0Ft]{B ,\q-l^s) = Σ Σ fc=o к CK{In)k\ -£?£/томй(яя'(йТ5 l)H)w = LM°a{BH'4"-"£-'s)H^dw· '72·9' where [dH] is the unit invariant Haar measure defined on 0(n). Now using (7.2.9) it is easy to see that etr (- \q-^-lS) 0Ft\B, ^E^S) = j^ ^ etr (- ^SH.QH',) [dH]. (7.2.10)
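Finally, by substituting (7.2.10) and $Q^{-1} = A^{1/2}\Phi A^{1/2}$ in (7.2.1) we get (7.2.8). ■

From the p.d.f. (7.2.5) we get the c.d.f. of $S$ as
\[ P(S < \Omega) = \bigl\{2^{np/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(A\Phi)^{-p/2} \det(\Sigma)^{-n/2} \int_{0<S<\Omega} \det(S)^{(n-p-1)/2}\; {}_0F_0^{(n)}\bigl(\Phi^{-1}A^{-1}, -\tfrac12 \Sigma^{-1}S\bigr)\, dS. \tag{7.2.11} \]
Expanding ${}_0F_0^{(n)}(\Phi^{-1}A^{-1}, -\tfrac12 \Sigma^{-1}S)$, using (7.2.6), we can write
\[ \begin{aligned} &\int_{0<S<\Omega} \det(S)^{(n-p-1)/2}\; {}_0F_0^{(n)}\bigl(\Phi^{-1}A^{-1}, -\tfrac12 \Sigma^{-1}S\bigr)\, dS \\ &\quad = \sum_{k=0}^{\infty}\sum_{\kappa} \frac{C_\kappa(\Phi^{-1}A^{-1})}{C_\kappa(I_n)\,k!} \int_{0<S<\Omega} \det(S)^{(n-p-1)/2}\, C_\kappa\bigl(-\tfrac12 \Sigma^{-1}S\bigr)\, dS \\ &\quad = \det(\Omega)^{n/2} \sum_{k=0}^{\infty}\sum_{\kappa} \frac{C_\kappa(\Phi^{-1}A^{-1})}{C_\kappa(I_n)\,k!}\, \frac{\Gamma_p(\tfrac12 n, \kappa)\,\Gamma_p[\tfrac12(p+1)]}{\Gamma_p(\tfrac12(n+p+1), \kappa)}\, C_\kappa\bigl(-\tfrac12 \Sigma^{-1}\Omega\bigr) \\ &\quad = \det(\Omega)^{n/2}\, \frac{\Gamma_p(\tfrac12 n)\,\Gamma_p[\tfrac12(p+1)]}{\Gamma_p[\tfrac12(n+p+1)]}\; {}_1F_1^{(n)}\bigl(\tfrac12 n; \tfrac12(n+p+1); \Phi^{-1}A^{-1}, -\tfrac12 \Sigma^{-1}\Omega\bigr). \end{aligned} \tag{7.2.12} \]
The last two expressions have been obtained by using (1.5.16) and (1.6.2), respectively. Substituting from (7.2.12) in (7.2.11), we get the c.d.f. of $S$ as
\[ P(S < \Omega) = \frac{\Gamma_p[\tfrac12(p+1)]}{\Gamma_p[\tfrac12(n+p+1)]} \det(A\Phi)^{-p/2} \det(2\Sigma)^{-n/2} \det(\Omega)^{n/2}\; {}_1F_1^{(n)}\bigl(\tfrac12 n; \tfrac12(n+p+1); \Phi^{-1}A^{-1}, -\tfrac12 \Sigma^{-1}\Omega\bigr). \tag{7.2.13} \]
The corresponding results for the p.d.f.'s (7.2.1) and (7.2.7) can also be derived. However, they are quite involved.

7.3. PROPERTIES

In this section we first derive the m.g.f. of $S$ and then study some properties of its distribution.

THEOREM 7.3.1. If $S \sim Q_{p,n}(A, \Sigma, \Phi)$, then the moment generating function of $S$ is
\[ M_S(Z) = \det(q^{-1}A\Phi)^{-p/2} \det(\Lambda)^{-n/2}\; {}_1F_0^{(n)}\bigl(\tfrac12 n; B, \Lambda^{-1}\bigr) \tag{7.3.1} \]
\[ \phantom{M_S(Z)} = \det(q^{-1}A\Phi)^{-p/2} \det(\Lambda)^{-n/2} \prod_{j=1}^{n} \det(I_p - b_j\Lambda^{-1})^{-1/2}, \tag{7.3.2} \]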
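where $b_j$, $j = 1, \ldots, n$ are the roots of $B = I_n - qA^{-1/2}\Phi^{-1}A^{-1/2}$, and $\Lambda = I_p - 2q\Sigma Z$.

Proof: From (7.2.1), we have
\[ \begin{aligned} M_S(Z) &= \bigl\{2^{np/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(A\Phi)^{-p/2} \det(\Sigma)^{-n/2} \int_{S>0} \det(S)^{(n-p-1)/2} \operatorname{etr}(ZS) \operatorname{etr}\bigl(-\tfrac12 q^{-1}\Sigma^{-1}S\bigr)\; {}_0F_0^{(n)}\bigl(B, \tfrac12 q^{-1}\Sigma^{-1}S\bigr)\, dS \\ &= \bigl\{2^{np/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(A\Phi)^{-p/2} \det(\Sigma)^{-n/2} \int_{S>0} \det(S)^{(n-p-1)/2} \operatorname{etr}\bigl(-\tfrac12 q^{-1}S\Sigma^{-1}\Lambda\bigr)\; {}_0F_0^{(n)}\bigl(B, \tfrac12 q^{-1}\Sigma^{-1}S\bigr)\, dS. \end{aligned} \]
Now, using (1.6.5), we get (7.3.1). To derive (7.3.2), consider
\[ E[\operatorname{etr}(ZXAX')] = (2\pi)^{-np/2} \det(\Sigma)^{-n/2} \det(\Phi)^{-p/2} \int_{X \in \mathbb{R}^{p\times n}} \operatorname{etr}\bigl(ZXAX' - \tfrac12 \Sigma^{-1}X\Phi^{-1}X'\bigr)\, dX. \tag{7.3.3} \]
Now making the transformation $Y = q^{-1/2}\Sigma^{-1/2}XA^{1/2}$, with Jacobian $J(X \to Y) = q^{np/2}\det(\Sigma)^{n/2}\det(A)^{-p/2}$, we obtain
\[ M_S(Z) = (2\pi)^{-np/2} \det(q^{-1}A\Phi)^{-p/2} \int_{Y \in \mathbb{R}^{p\times n}} \operatorname{etr}\bigl(\tfrac12 YBY' - \tfrac12 \Lambda YY'\bigr)\, dY. \tag{7.3.4} \]
Since $YY'$ is invariant under post multiplication of $Y$ by an orthogonal matrix, we can take $B$ in (7.3.4) to be a diagonal matrix with the $b_j$'s as diagonal elements. Hence (7.3.4) can be written as
\[ M_S(Z) = (2\pi)^{-np/2} \det(q^{-1}A\Phi)^{-p/2} \prod_{j=1}^{n} \int_{y_j \in \mathbb{R}^p} \exp\bigl\{-\tfrac12 y_j'(\Lambda - b_jI_p)y_j\bigr\}\, dy_j, \tag{7.3.5} \]
where $y_j$ $(p \times 1)$ is the $j$th column of $Y$. Now (7.3.2) follows immediately by evaluating the integral. ■

An alternate expression for $M_S(Z)$, given in (7.3.2), is
\[ M_S(Z) = \prod_{j=1}^{n} \det\bigl(I_p - 2\ell_j\Sigma^{1/2}Z\Sigma^{1/2}\bigr)^{-1/2}, \tag{7.3.6} \]
where $\ell_j$, $j = 1, \ldots, n$, are the characteristic roots of $A\Phi$. If in (7.3.3), we transform $Y = \Sigma^{-1/2}X\Phi^{-1/2}$, with the Jacobian $J(X \to Y) = \det(\Sigma)^{n/2}\det(\Phi)^{p/2}$, and use the expansion
\[ \operatorname{etr}\bigl(Y'\Sigma^{1/2}Z\Sigma^{1/2}Y\Phi^{1/2}A\Phi^{1/2}\bigr) = \sum_{k=0}^{\infty}\sum_{\kappa} \frac{1}{k!}\, C_\kappa\bigl(Y'\Sigma^{1/2}Z\Sigma^{1/2}Y\Phi^{1/2}A\Phi^{1/2}\bigr), \]
where $\kappa = (k_1, \ldots, k_n)$, $k_1 \ge \cdots \ge k_n \ge 0$, $k_1 + \cdots + k_n = k$, we get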
230 CHAPTER 7. DISTRIBUTION OF QUADRATIC FORMS Οκ(Υ'Σ*ΖΣ*Υ4>*Α4!*) άΥ. (7.3.7) Since (7.3.7) is homogeneous and symmetric function in ФгАФг, we have as in the proof of Theorem 7.2.1, ms{Z) = {2τ)-^±Έ^^Μ- \yr)c^z^yr)dY = g Σ (|п)«а(АФ)Ск(2£^)^ (7 з g) A:=0 * ^ CK(In) where the last two expressions have been obtained by using Theorem 1.4.10 and Lemma 1.5.2. Another expression for the m.g.f. of S can be given by using the density (7.2.8) as follows. MS{Z) = {2bprp(in)}_1det(E)-bdet(Q)^| j det(5)^n"p-1> etr [ZS - ^Z-iH&Hfi-is) dS [dH] = det(Q)*p J aet(HlQH[)-12naet(Ip-2E^ZE^HlQH[)~l)-12n[dH}. (7.3.9) Using the expansion, in terms of zonal polynomials, ,=o. НЛ2 in (7.3.9) we get where MS(Z) =det(Q)^±j:y(]-n)Kg(tfZEl>), (7.3.10) k=o« »Л2 g^ZZ?)= [ detiHiQKy^CJZizZ^HiQHiy^idli]. (7.3.11) JHeO(n) Note that ς(Έ,ϊΖΥ,ϊ) is a homogeneous and symmetric function in Σ2ΖΣ2. Proceeding as in the proof of Theorem 7.2.1, we get g&Zllh) = ^^ / det^QH'^C^QH'^) [dH]. (7.3.12)
Since (7.3.8) and (7.3.10) are both m.g.f.'s of $S$, by comparing the coefficients of $C_\kappa(Z\Sigma)$, we get
\[ \det(Q)^{p/2} \int_{H \in O(n)} \det(H_1QH_1')^{-n/2}\, C_\kappa\bigl((H_1QH_1')^{-1}\bigr)\, [dH] = \frac{C_\kappa(I_p)}{C_\kappa(I_n)}\, C_\kappa(Q^{-1}). \tag{7.3.13} \]
Substituting (7.3.13) in (7.3.12), we get
\[ g(\Sigma^{1/2}Z\Sigma^{1/2}) = \frac{C_\kappa(Z\Sigma)}{C_\kappa(I_n)}\, C_\kappa(Q^{-1}) \det(Q)^{-p/2}. \tag{7.3.14} \]

THEOREM 7.3.2. Let $S \sim Q_{p,n}(A, \Sigma, \Phi)$, and $B$ $(p \times p)$ be any constant nonsingular matrix. Then, $BSB' \sim Q_{p,n}(A, B\Sigma B', \Phi)$.

Proof: The result follows by transforming $W = BSB'$, with the Jacobian $J(S \to W) = \det(B)^{-(p+1)}$, in the density (7.2.1). ■

COROLLARY 7.3.2.1. Let $S \sim Q_{p,n}(A, \Sigma, \Phi)$, and $\Sigma = (C'C)^{-1}$. Then, $CSC' \sim Q_{p,n}(A, I_p, \Phi)$.

THEOREM 7.3.3. Let $S \sim Q_{p,n}(A, I_p, \Phi)$ and $H$ $(p \times p)$ be an orthogonal matrix, whose elements are either constants or random variables distributed independently of $S$. Then, the distribution of $S$ is invariant under the transformation $S \to HSH'$, and is independent of $H$ in the latter case.

Proof: First, let $H$ be a constant matrix. Then, from Theorem 7.3.2, $HSH' \sim Q_{p,n}(A, I_p, \Phi)$ since $HH' = I_p$. If, however, $H$ is a random orthogonal matrix, then the conditional distribution of $HSH'$ given $H$ is $Q_{p,n}(A, I_p, \Phi)$. Since this distribution does not depend on $H$, $HSH' \sim Q_{p,n}(A, I_p, \Phi)$. ■

THEOREM 7.3.4. Let $S = (S_{ij})$ and $\Sigma = (\Sigma_{ij})$, where $S_{ij}$ $(p_i \times p_j)$ and $\Sigma_{ij}$ $(p_i \times p_j)$, $i, j = 1, \ldots, k$, $p_1 + \cdots + p_k = p$. If $S \sim Q_{p,n}(A, \Sigma, \Phi)$, then $S_{ii} \sim Q_{p_i,n}(A, \Sigma_{ii}, \Phi)$, $i = 1, \ldots, k$. Moreover, if $\Sigma_{ij} = 0$, $i \ne j$, then they are independent.

Proof: By using the definition $S = XAX'$, where $X \sim N_{p,n}(0, \Sigma \otimes \Phi)$, we can write $S_{ij} = X_iAX_j'$ with $X' = (X_1', \ldots, X_k')$, $X_i$ $(p_i \times n)$. Then $X_i \sim N_{p_i,n}(0, \Sigma_{ii} \otimes \Phi)$, and consequently $S_{ii} \sim Q_{p_i,n}(A, \Sigma_{ii}, \Phi)$. Further, if $\Sigma_{ij} = 0$, $i \ne j$, then the $X_i$'s are independent. Therefore the $S_{ii}$'s are independent. ■

Next we give results on expected values of functions of quadratic forms. For proofs and other details the reader is referred to Section 7.7.

THEOREM 7.3.5. Let $X \sim N_{p,n}(0, \Sigma \otimes \Phi)$, and define $S_A = XAX'$ and $S_B = XBX'$, where the constant matrices $A$ $(n \times n)$ and $B$ $(n \times n)$ need not be symmetric.
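Then

(i) $E(S_A) = \operatorname{tr}(A\Phi)\Sigma$,

(ii) $E(S_ACS_B) = \operatorname{tr}(\Phi B'\Phi A')\operatorname{tr}(C\Sigma)\Sigma + \operatorname{tr}(A\Phi)\operatorname{tr}(B\Phi)\Sigma C\Sigma + \operatorname{tr}(A\Phi B'\Phi)\Sigma C'\Sigma$,

(iii) $E(\operatorname{tr}(S_BC)S_A) = \operatorname{tr}(A'\Phi B\Phi)\Sigma C'\Sigma + \operatorname{tr}(A\Phi)\operatorname{tr}(B\Phi)\operatorname{tr}(C\Sigma)\Sigma + \operatorname{tr}(A'\Phi B'\Phi)\Sigma C\Sigma$,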
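and

(iv) $\operatorname{cov}(\operatorname{vec}(S_A), \operatorname{vec}(S_B)) = \operatorname{tr}(A\Phi B'\Phi)\,\Sigma \otimes \Sigma + \operatorname{tr}(A'\Phi B'\Phi)\,(\Sigma \otimes \Sigma)K_{pp}$,

where $K_{pp}$ denotes the $p^2 \times p^2$ commutation matrix.

By substituting $A = B$ in (iv), we get the covariance matrix of $\operatorname{vec}(S_A)$ as
\[ \operatorname{cov}(\operatorname{vec}(S_A)) = \operatorname{tr}(A\Phi A'\Phi)\,\Sigma \otimes \Sigma + \operatorname{tr}(A'\Phi A'\Phi)\,(\Sigma \otimes \Sigma)K_{pp}. \]

THEOREM 7.3.6. Let $S \sim Q_{p,n}(A, \Sigma, \Phi)$. Then
\[ E(C_\kappa(ZS)) = 2^k (\tfrac12 n)_\kappa\, \frac{C_\kappa(A\Phi)\, C_\kappa(Z\Sigma)}{C_\kappa(I_n)}. \tag{7.3.15} \]

Proof: From the p.d.f. (7.2.8), we obtain
\[ E(C_\kappa(ZS)) = \bigl\{2^{np/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(\Sigma)^{-n/2} \det(Q)^{p/2} \int_{H \in O(n)} \int_{S>0} C_\kappa(ZS) \det(S)^{(n-p-1)/2} \operatorname{etr}\bigl(-\tfrac12 \Sigma^{-1/2}H_1QH_1'\Sigma^{-1/2}S\bigr)\, dS\, [dH]. \]
Next, use of Lemma 1.5.2 yields
\[ E(C_\kappa(ZS)) = 2^k (\tfrac12 n)_\kappa \det(Q)^{p/2} \int_{H \in O(n)} \det(H_1QH_1')^{-n/2}\, C_\kappa\bigl(\Sigma^{1/2}Z\Sigma^{1/2}(H_1QH_1')^{-1}\bigr)\, [dH]. \]
Finally, substituting from (7.3.14) in the above expression, and noting that $C_\kappa(Q^{-1}) = C_\kappa(A\Phi)$, gives the desired result. ■

When $A\Phi = I_n$, $S \sim W_p(n, \Sigma)$, and (7.3.15) simplifies to
\[ E(C_\kappa(ZS)) = 2^k (\tfrac12 n)_\kappa\, C_\kappa(Z\Sigma). \tag{7.3.16} \]
The result (7.3.16) was derived by Constantine (1963). Similarly the expectation of a zonal polynomial in $S^{-1}$ is given by
\[ \begin{aligned} E(C_\kappa(ZS^{-1})) &= \bigl\{2^{np/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(\Sigma)^{-n/2} \det(Q)^{p/2} \int_{H \in O(n)} \int_{S>0} C_\kappa(ZS^{-1}) \det(S)^{(n-p-1)/2} \operatorname{etr}\bigl(-\tfrac12 \Sigma^{-1/2}H_1QH_1'\Sigma^{-1/2}S\bigr)\, dS\, [dH] \\ &= 2^{-k}\, \frac{\Gamma_p(\tfrac12 n, -\kappa)}{\Gamma_p(\tfrac12 n)} \det(Q)^{p/2} \int_{H \in O(n)} \det(H_1QH_1')^{-n/2}\, C_\kappa\bigl(\Sigma^{-1/2}Z\Sigma^{-1/2}H_1QH_1'\bigr)\, [dH], \quad \tfrac12(n-p+1) > k_1. \end{aligned} \tag{7.3.17} \]
The last expression is obtained by using Lemma 1.5.2. The integral in (7.3.17) is a homogeneous and symmetric function in $\Sigma^{-1/2}Z\Sigma^{-1/2}$. Therefore,
\[ E(C_\kappa(ZS^{-1})) = 2^{-k}\, \frac{\Gamma_p(\tfrac12 n, -\kappa)}{\Gamma_p(\tfrac12 n)}\, \frac{C_\kappa(Z\Sigma^{-1})}{C_\kappa(I_p)} \det(Q)^{p/2} \int_{H \in O(n)} \det(H_1QH_1')^{-n/2}\, C_\kappa(H_1QH_1')\, [dH], \quad \tfrac12(n-p+1) > k_1. \tag{7.3.18} \]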
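The above integral is not available in the literature. However, for $A^{1/2}\Phi A^{1/2} = Q^{-1} = I_n$, i.e., when $S \sim W_p(n, \Sigma)$, its value is $C_\kappa(I_p)$. Hence,
\[ E(C_\kappa(ZS^{-1})) = 2^{-k}\, \frac{\Gamma_p(\tfrac12 n, -\kappa)}{\Gamma_p(\tfrac12 n)}\, C_\kappa(Z\Sigma^{-1}), \quad \tfrac12(n-p+1) > k_1. \tag{7.3.19} \]
The above expression can be further simplified by using (1.5.9).

7.4. FUNCTIONS OF QUADRATIC FORMS

In this section, we derive some distributions of functions of positive definite matrix quadratic forms. These distributions are useful in deriving the distribution of characteristic roots, which are fundamental in the study of multivariate tests.

THEOREM 7.4.1. Let $X$ $(p \times n)$ and $Y$ $(p \times m)$ be independent, $X \sim N_{p,n}(0, \Sigma \otimes \Phi)$ and $Y \sim N_{p,m}(0, \Sigma \otimes I_m)$. Then the p.d.f. of $F = Y'(XAX')^{-1}Y$, for $m < p < n$, is given by
\[ \frac{\Gamma_p[\tfrac12(m+n)]}{\Gamma_p(\tfrac12 n)\,\Gamma_m(\tfrac12 p)}\, q^{(m+n)p/2} \det(A\Phi)^{-p/2} \det(F)^{(p-m-1)/2} \det(I_m + qF)^{-(m+n)/2}\; {}_1F_0^{(n)}\bigl(\tfrac12(m+n); B, R^*\bigr), \quad F > 0, \tag{7.4.1} \]
where
\[ R^* = \begin{pmatrix} (I_m + qF)^{-1} & 0 \\ 0 & I_{p-m} \end{pmatrix}. \]

Proof: Since $F$ is invariant under the transformation $X \to \Sigma^{-1/2}X$ and $Y \to \Sigma^{-1/2}Y$, we can assume $\Sigma = I_p$. Hence the joint p.d.f. of $S = XAX'$ and $Y$ is
\[ \bigl\{2^{(m+n)p/2}\,\pi^{mp/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(A\Phi)^{-p/2} \det(S)^{(n-p-1)/2} \operatorname{etr}\bigl\{-\tfrac12(YY' + q^{-1}S)\bigr\}\; {}_0F_0^{(n)}\bigl(B, \tfrac12 q^{-1}S\bigr), \quad S > 0,\ Y \in \mathbb{R}^{p\times m}. \tag{7.4.2} \]
Now making the transformation $Z = S^{-1/2}Y$, with Jacobian $J(Y \to Z) = \det(S)^{m/2}$, we get the joint p.d.f. of $Z$ and $S$ as
\[ \bigl\{2^{(m+n)p/2}\,\pi^{mp/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(A\Phi)^{-p/2} \det(S)^{(m+n-p-1)/2} \operatorname{etr}\bigl\{-\tfrac12 q^{-1}(I_p + qZZ')S\bigr\}\; {}_0F_0^{(n)}\bigl(B, \tfrac12 q^{-1}S\bigr), \quad S > 0,\ Z \in \mathbb{R}^{p\times m}. \]
Integrating this joint p.d.f. with respect to $S$, using (1.6.5), we get the p.d.f. of $Z$ as
\[ \frac{\Gamma_p[\tfrac12(m+n)]}{\pi^{mp/2}\,\Gamma_p(\tfrac12 n)}\, q^{(m+n)p/2} \det(A\Phi)^{-p/2} \det(I_p + qZZ')^{-(m+n)/2}\; {}_1F_0^{(n)}\bigl(\tfrac12(m+n); B, (I_p + qZZ')^{-1}\bigr), \quad Z \in \mathbb{R}^{p\times m}. \]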
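Note that
\[ C_\kappa\bigl((I_p + qZZ')^{-1}\bigr) = C_\kappa\begin{pmatrix} (I_m + qZ'Z)^{-1} & 0 \\ 0 & I_{p-m} \end{pmatrix} \]
and $\det(I_p + qZZ') = \det(I_m + qZ'Z)$. Therefore, the density of $Z$ can be written as
\[ \frac{\Gamma_p[\tfrac12(m+n)]}{\pi^{mp/2}\,\Gamma_p(\tfrac12 n)}\, q^{(m+n)p/2} \det(A\Phi)^{-p/2} \det(I_m + qZ'Z)^{-(m+n)/2}\; {}_1F_0^{(n)}\left(\tfrac12(m+n); B, \begin{pmatrix} (I_m + qZ'Z)^{-1} & 0 \\ 0 & I_{p-m} \end{pmatrix}\right), \quad Z \in \mathbb{R}^{p\times m}. \]
Now, by using Theorem 1.4.10, we get the density of $F = Z'Z$, for $m < p$. ■

COROLLARY 7.4.1.1. If $A\Phi = I_n$, then the random matrix $S$ has the distribution $W_p(n, \Sigma)$ and $F \sim B_m^{II}(\tfrac12 p, \tfrac12(m+n-p))$.

THEOREM 7.4.2. Let $X$ $(p \times n)$ and $Y$ $(p \times m)$ be independent, $X \sim N_{p,n}(0, \Sigma \otimes \Phi)$ and $Y \sim N_{p,m}(0, \Sigma_1 \otimes I_m)$. Then

(i) the p.d.f. of $F_1 = X'(YY')^{-1}X$, for $n < p < m$, is given by
\[ \frac{\Gamma_p[\tfrac12(m+n)]}{\Gamma_p(\tfrac12 m)\,\Gamma_n(\tfrac12 p)} \det(\Omega)^{n/2} \det(\Phi)^{-p/2} \det(F_1)^{(p-n-1)/2} \det\bigl(I_n + (q\Phi)^{-1}F_1\bigr)^{-(m+n)/2}\; {}_1F_0^{(p)}\bigl(\tfrac12(m+n); \Omega^*, F_1(q\Phi + F_1)^{-1}\bigr), \quad F_1 > 0, \]
where $q > 0$, $\Omega = \Sigma^{-1/2}\Sigma_1\Sigma^{-1/2}$ and $\Omega^* = I_p - q\Omega$, and

(ii) the p.d.f. of $F_2 = (YY')^{-1/2}XX'(YY')^{-1/2}$, for $n > p$, $m > p$, is given by
\[ \frac{\Gamma_p[\tfrac12(m+n)]}{\Gamma_p(\tfrac12 m)\,\Gamma_p(\tfrac12 n)} \det(\Omega)^{n/2} \det(\Phi)^{-p/2} \det(F_2)^{(n-p-1)/2} \det\bigl(I_p + q^{-1}\Omega F_2\bigr)^{-(m+n)/2}\; {}_1F_0^{(n)}\bigl(\tfrac12(m+n); B, F_2(q\Omega^{-1} + F_2)^{-1}\bigr), \quad F_2 > 0, \]
where $B = I_n - q\Phi^{-1}$.

Proof: (i) Since $F_1$ is invariant under the transformation $X \to \Sigma^{-1/2}X$ and $Y \to \Sigma^{-1/2}Y$, we can take $X \sim N_{p,n}(0, I_p \otimes \Phi)$ and $Y \sim N_{p,m}(0, \Omega \otimes I_m)$, where $\Omega = \Sigma^{-1/2}\Sigma_1\Sigma^{-1/2}$. Further, for $m > p$, $YY' = S \sim W_p(m, \Omega)$. Now transforming $Z = S^{-1/2}X$, with the Jacobian $J(X \to Z) = \det(S)^{n/2}$, in the joint density of $S$ and $X$ we obtain the joint density of $Z$ and $S$ as
\[ \bigl\{2^{(m+n)p/2}\,\pi^{np/2}\,\Gamma_p(\tfrac12 m)\bigr\}^{-1} \det(\Omega)^{-m/2} \det(\Phi)^{-p/2} \det(S)^{(m+n-p-1)/2} \operatorname{etr}\bigl\{-\tfrac12 S(\Omega^{-1} + Z\Phi^{-1}Z')\bigr\}, \quad S > 0,\ Z \in \mathbb{R}^{p\times n}. \]
Integrating $S$ in the above density, we get the marginal density of $Z$ as
\[ \frac{\Gamma_p[\tfrac12(m+n)]}{\pi^{np/2}\,\Gamma_p(\tfrac12 m)} \det(\Omega)^{n/2} \det(\Phi)^{-p/2} \det\bigl(I_p + \Omega Z\Phi^{-1}Z'\bigr)^{-(m+n)/2}, \quad Z \in \mathbb{R}^{p\times n}. \tag{7.4.3} \]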
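Since
\[ \begin{aligned} \det(I_p + \Omega Z\Phi^{-1}Z') &= \det(I_n + \Phi^{-1}Z'\Omega Z) \\ &= \det(\Phi)^{-1} \det\bigl(\Phi + q^{-1}Z'Z - q^{-1}Z'\Omega^*Z\bigr) \\ &= q^{-n} \det(\Phi)^{-1} \det(q\Phi + Z'Z) \det\bigl(I_n - Z'\Omega^*Z(q\Phi + Z'Z)^{-1}\bigr) \\ &= q^{-n} \det(\Phi)^{-1} \det(q\Phi + Z'Z) \det\bigl(I_p - Z(q\Phi + Z'Z)^{-1}Z'\Omega^*\bigr), \end{aligned} \]
we get
\[ \det(I_p + \Omega Z\Phi^{-1}Z')^{-(m+n)/2} = \bigl\{q^{-n} \det(\Phi)^{-1} \det(q\Phi + Z'Z)\bigr\}^{-(m+n)/2}\; {}_1F_0\bigl(\tfrac12(m+n); Z(q\Phi + Z'Z)^{-1}Z'\Omega^*\bigr). \tag{7.4.4} \]
Now substituting (7.4.4) in (7.4.3), and integrating over $Z'Z = F_1$, the density of $F_1$ is
\[ \frac{\Gamma_p[\tfrac12(m+n)]}{\pi^{np/2}\,\Gamma_p(\tfrac12 m)} \det(\Omega)^{n/2} \det(\Phi)^{-p/2} \int_{Z'Z=F_1} \det\bigl(I_n + q^{-1}\Phi^{-1}Z'Z\bigr)^{-(m+n)/2}\; {}_1F_0\bigl(\tfrac12(m+n); Z(q\Phi + Z'Z)^{-1}Z'\Omega^*\bigr)\, dZ. \]
The integral in the above density is a homogeneous and symmetric function in $\Omega^*$. Transforming $\Omega^* \to H\Omega^*H'$, $H \in O(p)$, and integrating with respect to $H$ over $O(p)$, by using Theorem 1.6.1, we get
\[ \frac{\Gamma_p[\tfrac12(m+n)]}{\pi^{np/2}\,\Gamma_p(\tfrac12 m)} \det(\Omega)^{n/2} \det(\Phi)^{-p/2} \int_{Z'Z=F_1} \det\bigl(I_n + q^{-1}\Phi^{-1}Z'Z\bigr)^{-(m+n)/2}\; {}_1F_0^{(p)}\bigl(\tfrac12(m+n); (q\Phi + Z'Z)^{-1}Z'Z, \Omega^*\bigr)\, dZ. \]
Finally, the result follows from Theorem 1.4.10.

(ii) The proof is similar to that of (i). ■

COROLLARY 7.4.2.1. If in the above theorem $\Sigma = \Sigma_1$, then the p.d.f. of $F_1$, for $n < p < m$, is given by
\[ \frac{\Gamma_p[\tfrac12(m+n)]}{\Gamma_p(\tfrac12 m)\,\Gamma_n(\tfrac12 p)} \det(\Phi)^{-p/2} \det(F_1)^{(p-n-1)/2} \det\bigl(I_n + \Phi^{-1}F_1\bigr)^{-(m+n)/2}, \quad F_1 > 0. \]

COROLLARY 7.4.2.2. If in the above theorem (ii), $\Phi = I_n$, then the p.d.f. of $F_2$, for $n > p$, $m > p$, is given by
\[ \frac{\Gamma_p[\tfrac12(m+n)]}{\Gamma_p(\tfrac12 m)\,\Gamma_p(\tfrac12 n)} \det(\Omega)^{n/2} \det(F_2)^{(n-p-1)/2} \det\bigl(I_p + \Omega F_2\bigr)^{-(m+n)/2}, \quad F_2 > 0. \]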
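The chain of determinant identities leading to (7.4.4) in the proof of Theorem 7.4.2 can be checked numerically. The sketch below is illustrative only: it assumes NumPy, picks arbitrary dimensions and an arbitrary $q$, and uses a hypothetical helper `random_spd` to generate positive definite test matrices. It compares $\det(I_p + \Omega Z \Phi^{-1} Z')$ with the $n \times n$ form and with the final factorized form involving $\Omega^* = I_p - q\Omega$.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, q = 4, 3, 0.7          # arbitrary illustrative values

def random_spd(k):
    """Hypothetical helper: a random symmetric positive definite k x k matrix."""
    G = rng.standard_normal((k, k))
    return G @ G.T + k * np.eye(k)

Omega = random_spd(p)
Phi = random_spd(n)
Z = rng.standard_normal((p, n))
Omega_star = np.eye(p) - q * Omega
Phi_inv = np.linalg.inv(Phi)

# Left-hand side: p x p determinant appearing in (7.4.3).
lhs = np.linalg.det(np.eye(p) + Omega @ Z @ Phi_inv @ Z.T)

# First step: the equivalent n x n determinant.
step1 = np.linalg.det(np.eye(n) + Phi_inv @ Z.T @ Omega @ Z)

# Last step: factorized form used to obtain (7.4.4).
M = np.linalg.inv(q * Phi + Z.T @ Z)
rhs = (q ** (-n) / np.linalg.det(Phi)
       * np.linalg.det(q * Phi + Z.T @ Z)
       * np.linalg.det(np.eye(p) - Z @ M @ Z.T @ Omega_star))

rel_err_step1 = abs(lhs - step1) / abs(lhs)
rel_err_rhs = abs(lhs - rhs) / abs(lhs)
print(rel_err_step1, rel_err_rhs)
```

Both relative errors should be at rounding level. The first and last equalities are instances of the identity $\det(I + MN) = \det(I + NM)$; the middle steps only substitute $\Omega = q^{-1}(I_p - \Omega^*)$ and factor out $q\Phi + Z'Z$.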
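It may be noted that the density of $F_3 = (XX')^{-1/2}YY'(XX')^{-1/2}$, $n > p$, $m > p$, where $X$ and $Y$ are as in Theorem 7.4.1, can be obtained from the density of $F_2$ by the transformation $F_2 \to F_3^{-1}$. Hence, the densities of $F_3$ and $F_4 = (YY')^{1/2}(XX')^{-1}(YY')^{1/2}$ are identical.

THEOREM 7.4.3. Let $S$ $(p \times p)$ and $Y$ $(p \times m)$ be independent, $S \sim Q_{p,n}(A, I_p, \Phi)$ and $Y \sim N_{p,m}(0, qI_p \otimes I_m)$. Then

(i) the p.d.f. of $F_5 = (S + YY')^{-1/2}YY'(S + YY')^{-1/2}$, for $m > p$, is given by
\[ \frac{q^{np/2}\,\Gamma_p[\tfrac12(m+n)]}{\Gamma_p(\tfrac12 m)\,\Gamma_p(\tfrac12 n)} \det(A\Phi)^{-p/2} \det(F_5)^{(m-p-1)/2} \det(I_p - F_5)^{(n-p-1)/2}\; {}_1F_0^{(n)}\bigl(\tfrac12(m+n); B, I_p - F_5\bigr), \quad 0 < F_5 < I_p, \]
where $q > 0$, $B = I_n - qA^{-1/2}\Phi^{-1}A^{-1/2}$, and

(ii) the p.d.f. of $F_6 = Y'(S + YY')^{-1}Y$, for $p > m$, is given by
\[ \frac{q^{np/2}\,\Gamma_p[\tfrac12(m+n)]}{\Gamma_p(\tfrac12 n)\,\Gamma_m(\tfrac12 p)} \det(A\Phi)^{-p/2} \det(F_6)^{(p-m-1)/2} \det(I_m - F_6)^{(n-p-1)/2}\; {}_1F_0^{(n)}\bigl(\tfrac12(m+n); B, I_m - F_6\bigr), \quad 0 < F_6 < I_m. \]

Proof: The joint density of $S$ and $Y$ is
\[ \bigl\{2^{(m+n)p/2}\,(q\pi)^{mp/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(A\Phi)^{-p/2} \operatorname{etr}\bigl\{-\tfrac12 q^{-1}(S + YY')\bigr\} \det(S)^{(n-p-1)/2}\; {}_0F_0^{(n)}\bigl(B, \tfrac12 q^{-1}S\bigr), \quad S > 0,\ Y \in \mathbb{R}^{p\times m}. \tag{7.4.5} \]
Now making the transformation $G = S + YY'$ and $Z = G^{-1/2}Y$, with the Jacobian $J(Y, S \to Z, G) = \det(G)^{m/2}$, in (7.4.5), we get the joint density of $G$ and $Z$ as
\[ \bigl\{2^{(m+n)p/2}\,(q\pi)^{mp/2}\,\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(A\Phi)^{-p/2} \operatorname{etr}\bigl(-\tfrac12 q^{-1}G\bigr) \det(G)^{(m+n-p-1)/2} \det(I_p - ZZ')^{(n-p-1)/2}\; {}_0F_0^{(n)}\bigl(B, \tfrac12 q^{-1}G(I_p - ZZ')\bigr), \quad G > 0,\ Z \in \mathbb{R}^{p\times m}. \]
Integrating this joint density with respect to $G$, by using Theorem 1.6.2, we get the marginal density of $Z$ as
\[ \frac{q^{np/2}\,\Gamma_p[\tfrac12(m+n)]}{\pi^{mp/2}\,\Gamma_p(\tfrac12 n)} \det(A\Phi)^{-p/2} \det(I_p - ZZ')^{(n-p-1)/2}\; {}_1F_0^{(n)}\bigl(\tfrac12(m+n); B, I_p - ZZ'\bigr), \quad Z \in \mathbb{R}^{p\times m}. \]
Finally, by using Theorem 1.4.10, we get the density of $F_5 = ZZ'$ if $m > p$, and the density of $F_6 = Z'Z$ if $m < p$. ■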
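The above theorem is a generalization of Theorems 5.2.3 and 5.2.4. If we let $A\Phi = I_n$ here, we get $F_5 \sim B_p^{I}(\tfrac12 m, \tfrac12 n)$ and $F_6 \sim B_m^{I}(\tfrac12 p, \tfrac12(m+n-p))$.

Next, we study the density of $\operatorname{tr}(S)$.

THEOREM 7.4.4. Let $S \sim Q_{p,n}(A, \Sigma, \Phi)$. Then, the p.d.f. of $u = \operatorname{tr}(S)$, for $n > p$, is
\[ \bigl\{2^{np/2}\,\Gamma(\tfrac12 np)\bigr\}^{-1} \det(A\Phi)^{-p/2} \det(\Sigma)^{-n/2}\, u^{(np-2)/2} \exp\bigl(-\tfrac12 q^{-1}u\bigr)\; {}_0F_0^{(np)}\bigl(B_1, \tfrac12 q^{-1}u\bigr), \quad u > 0, \tag{7.4.6} \]
where $B_1 = I_{np} - q\bigl(\Sigma^{-1} \otimes A^{-1/2}\Phi^{-1}A^{-1/2}\bigr)$.

Proof: Writing $S = XAX'$, with $X \sim N_{p,n}(0, \Sigma \otimes \Phi)$, and by using Theorem 1.2.22, we can write
\[ u = (\operatorname{vec}(X'))'(I_p \otimes A)\operatorname{vec}(X'), \tag{7.4.7} \]
where $\operatorname{vec}(X') \sim N_{np}(0, \Sigma \otimes \Phi)$. Since (7.4.7) is a quadratic form in $\operatorname{vec}(X')$, the density of $u$, from Theorem 7.2.1, is given by
\[ \bigl\{2^{np/2}\,\Gamma(\tfrac12 np)\bigr\}^{-1} \det\bigl((I_p \otimes A)(\Sigma \otimes \Phi)\bigr)^{-1/2}\, u^{(np-2)/2} \exp\bigl(-\tfrac12 q^{-1}u\bigr)\; {}_0F_0^{(np)}\bigl(I_{np} - q(I_p \otimes A^{-1/2})(\Sigma \otimes \Phi)^{-1}(I_p \otimes A^{-1/2}), \tfrac12 q^{-1}u\bigr), \quad u > 0. \]
Substituting $\det((I_p \otimes A)(\Sigma \otimes \Phi)) = \det(\Sigma)^n \det(A\Phi)^p$ and $(I_p \otimes A^{-1/2})(\Sigma \otimes \Phi)^{-1}(I_p \otimes A^{-1/2}) = \Sigma^{-1} \otimes A^{-1/2}\Phi^{-1}A^{-1/2}$ in the above expression we get the final result. ■

The c.d.f. of $u$, by using the p.d.f. (7.4.6), is
\[ \begin{aligned} P(u \le w) &= \bigl\{2^{np/2}\,\Gamma(\tfrac12 np)\bigr\}^{-1} \det(A\Phi)^{-p/2} \det(\Sigma)^{-n/2} \int_{u<w} u^{(np-2)/2} \exp\bigl(-\tfrac12 q^{-1}u\bigr)\; {}_0F_0^{(np)}\bigl(B_1, \tfrac12 q^{-1}u\bigr)\, du \\ &= \bigl\{2^{np/2}\,\Gamma(\tfrac12 np)\bigr\}^{-1} \det(A\Phi)^{-p/2} \det(\Sigma)^{-n/2} \sum_{k=0}^{\infty} \frac{C_{(k)}(B_1)}{k!\,C_{(k)}(I_{np})}\, (2q)^{-k} \int_{u<w} \exp\bigl(-\tfrac12 q^{-1}u\bigr)\, u^{\frac12 np + k - 1}\, du \\ &= \bigl\{\Gamma(\tfrac12 np)\bigr\}^{-1} q^{np/2} \det(A\Phi)^{-p/2} \det(\Sigma)^{-n/2} \sum_{k=0}^{\infty} \frac{C_{(k)}(B_1)}{k!\,C_{(k)}(I_{np})}\, \gamma\bigl(\tfrac12 np + k, \tfrac{w}{2q}\bigr), \end{aligned} \]
where only the one-part partitions $\kappa = (k)$ contribute because the second argument of ${}_0F_0^{(np)}$ is a scalar, and $\gamma(\alpha, x) = \int_0^x \exp(-t)\,t^{\alpha-1}\,dt$ is the incomplete gamma function.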
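The proof of Theorem 7.4.4 rests on three algebraic facts: the representation (7.4.7) of $\operatorname{tr}(XAX')$ as a quadratic form in $\operatorname{vec}(X')$, the determinant identity $\det((I_p \otimes A)(\Sigma \otimes \Phi)) = \det(\Sigma)^n \det(A\Phi)^p$, and the Kronecker product identity for $B_1$. The following NumPy sketch checks them on arbitrary test matrices. It is an illustration under stated assumptions, not part of the original derivation: `random_spd` and `sym_sqrt` are hypothetical helpers, and $\operatorname{vec}(X')$ is taken as the rows of $X$ stacked into one vector.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 3, 4                   # arbitrary illustrative dimensions

def random_spd(k):
    """Hypothetical helper: a random symmetric positive definite k x k matrix."""
    G = rng.standard_normal((k, k))
    return G @ G.T + k * np.eye(k)

def sym_sqrt(M):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

A, Phi, Sigma = random_spd(n), random_spd(n), random_spd(p)
X = rng.standard_normal((p, n))    # any p x n matrix works; the identities are algebraic

# (7.4.7): tr(X A X') = vec(X')' (I_p (x) A) vec(X'), with vec(X') = stacked rows of X.
v = X.reshape(-1)
u_trace = np.trace(X @ A @ X.T)
u_quad = v @ np.kron(np.eye(p), A) @ v

# det((I_p (x) A)(Sigma (x) Phi)) = det(Sigma)^n det(A Phi)^p, compared on the log scale.
K = np.kron(np.eye(p), A) @ np.kron(Sigma, Phi)
logdet_lhs = np.linalg.slogdet(K)[1]
logdet_rhs = n * np.linalg.slogdet(Sigma)[1] + p * np.linalg.slogdet(A @ Phi)[1]

# (I_p (x) A^{-1/2})(Sigma (x) Phi)^{-1}(I_p (x) A^{-1/2}) = Sigma^{-1} (x) A^{-1/2} Phi^{-1} A^{-1/2}.
A_inv_half = np.linalg.inv(sym_sqrt(A))
left = (np.kron(np.eye(p), A_inv_half)
        @ np.linalg.inv(np.kron(Sigma, Phi))
        @ np.kron(np.eye(p), A_inv_half))
right = np.kron(np.linalg.inv(Sigma), A_inv_half @ np.linalg.inv(Phi) @ A_inv_half)
kron_gap = np.max(np.abs(left - right))

print(abs(u_trace - u_quad), abs(logdet_lhs - logdet_rhs), kron_gap)
```

All three printed differences should be at rounding level. Log-determinants are used for the second check only to avoid comparing very large numbers directly.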
7.5. SERIES REPRESENTATION OF THE DENSITY

Let $X = (x_1, \ldots, x_n) \sim N_{p,n}(M, \Sigma \otimes \Phi)$. The quadratic form $XAX'$, $A = (a_{ij}) > 0$, can be written as
\[ XAX' = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\,x_ix_j'. \]
For $M = 0$, the density of $XAX'$, in various forms, using different methods, has been derived by Khatri (1966), Hayakawa (1966), and Shah (1970). These are given in Section 7.2. Using suitable transformations, we can easily show that the density of $XAX'$ is the same as that of
\[ S = \sum_{j=1}^{n} y_jy_j' = YY', \tag{7.5.1} \]
where $Y = (y_1, \ldots, y_n) \sim N_{p,n}(\Delta, \Sigma \otimes D)$, $\Delta = (\delta_1, \ldots, \delta_n)$, $D = \operatorname{diag}(a_1, \ldots, a_n)$ and $a_1 \ge \cdots \ge a_n > 0$ are the characteristic roots of $A\Phi$. If $a_1 = \cdots = a_n = a_0$, then $S \sim W_p(n, a_0\Sigma, a_0^{-1}\Sigma^{-1}\Delta\Delta')$. In this section we study series representations of the density of the quadratic form for $\Delta = 0$ (i.e., the central case), given by Khatri (1971), who generalized the results of Kotz, Johnson and Boyd (1967a). The series representation for $\Delta \ne 0$ will be discussed in Section 7.6.

Write the density $f(S)$ of $S$ as
\[ f(S) = \sum_{k=0}^{\infty}\sum_{\kappa} a_\kappa h_\kappa(S). \tag{7.5.2} \]
Then Khatri (1971) has studied two types of representations:

(i) Power-series and Wishart type representations:
\[ f_1(S) = \sum_{k=0}^{\infty}\sum_{\kappa} a_\kappa^{(1)} h_\kappa^{(1)}(S), \tag{7.5.3} \]
where
\[ h_\kappa^{(1)}(S) = w_p(\tfrac12 n, \gamma\Sigma^{-1}; S)\,\bigl\{(\tfrac12 n)_\kappa\bigr\}^{-1} C_\kappa(\Sigma^{-1}S) \tag{7.5.4} \]
with
\[ w_p(\tfrac12 n, \gamma\Sigma^{-1}; S) = \bigl\{\Gamma_p(\tfrac12 n)\bigr\}^{-1} \det(S)^{(n-p-1)/2} \operatorname{etr}(-\gamma\Sigma^{-1}S), \quad n > p. \]
For $\gamma = 0$, (7.5.3) reduces to the power-series representation of Hayakawa (1966); for $\gamma > 0$, it is the Wishart type representation (or a mixture of Wishart densities); and for $\gamma = (2q)^{-1}$, $q > 0$, it is the representation (7.2.1) given by Khatri (1966).

(ii) Laguerre type representation:
\[ f_2(S) = \sum_{k=0}^{\infty}\sum_{\kappa} a_\kappa^{(2)} h_\kappa^{(2)}(S), \tag{7.5.5} \]
7.5. SERIES REPRESENTATION OF THE DENSITY 239 where h?(S) = U)p(in,7S-1;5){(in)(c}"1L|("-p-1)(aE-15), α φ 0. (7.5.6) For α = 7 = \q~l, q > 0, Shah (1970) obtained this representation and is given in Section 7.2. To derive representations (7.5.3), and (7.5.5), and to study their convergence properties, we shall need the following results. LEMMA 7.5.1. Let S (ρ χ ρ) be a positive definite matrix and θ (q x q), q > p, be a real symmetric matrix. Let #i,..., 0ς be the characteristic roots of Θ, such that \ωθ{\ < 1, i = 1,..., q, where ω is any real or complex number. Then, for Я G 0(q), Н' = (Н[ Щ),Н1(рхд), etii-wSH^H'^I? - wH^H'J-1} [dH], (7.5.7) and Е^-|(Р+1)(5)-Щ-| < р-Ч det(J,-рЯ^Д!)-* V k\CK{Iq)\ JHeO(q) etilpSH^oH'^Ip + ptf^otfi)-1} [dH\ < p-\\ - pe)"^exp{pe(l + pe)"1 tr(5)}, (7.5.8) where θο = diag(|#i|,..., \9q\), б = тахг|0;|; o.nd ρ is any number such that 0 < pe < 1. Proof: Prom Lemma 1.5.1, for Я G 0(#), we have C"(ffif9) = / CK{HXQH[) [dH]. (7.5.9) Substituting from (7.5.9) in the left hand side of (7.5.7), we get у У iZ'^^js) °κ(ωθ>) k=0 « k\CK(Iq) - /„ 0,,EE^V)^|^V|. (Τ.5.10) JHeO(q)k=0 K k\CK{Ip) Now by using (1.7.7) in the integrand of (7.5.10), we get / / det(/p - ωΗοΗ,ΘΗΐΗί)-0 е^-иЯЯзЯ^Я^Тр - ω^θ^)"^} [dH] [dH3l (7.5.11)
240 CHAPTER 7. DISTRIBUTION OF QUADRATIC FORMS f Я 0 \ f Η Η \ where Я3 G 0(p). Next transforming ( 3 T ] Я = Я4 , i.e., ( 3 l ) = Я4 = (^41 V and [d#] = [dH4], (7.5.11) becomes / f det(Ip-uHAleH'Al)-fi JH4eO(q)JH3eO(p) eti{-ujSHAieH'Al(Ip - а;Я410Я41)-1} [dHA] [dH3]. Integrating with respect to Я3 and replacing Яц by Яь (7.5.7) follows. To prove (7.5.8), note that VL" {S)ucJU\ ~\ ()Щ) <ΣΣ^^+1\^· (7.5.12) By using (7.5.7) in (7.5.12), the inequalities are easily obtained. ■ Using the representation (7.5.1) and results on normal distributions, we get the Laplace transform of the p.d.f. of S. LEMMA 7.5.2. Let Ζ (pxp) be a complex symmetric matrix such that Re(Z) > 0. Then, the Laplace transform of the density of S, for Δ φ 0, is E[eti(-ZS)} = Π det(/p + 2α,·ΣΖ)"* exp {- £ δ'άΖ(Ιρ + 2а^)~1б\. When Δ = 0, this Laplace transform reduces to E[eti(-ZS)\ = Π det(/p + 2α,·ΣΖ)-*, j=l which is the Laplace transform of the density of S in the central case. Next lemma follows from an application of Lebesgue dominated convergence theorem. LEMMA 7.5.3. Let {hK} be a sequence of complex valued measurable functions on the space of positive definite matrices such that OO ι ι Σ Σα*Μ^) < α etr(£S), for almost all S > 0, A:=0 ' * ' where {aK} is a sequence of complex numbers, a is a real number, and В is a symmetric matrix. Define /(S) = EE«A(S), A:=0 *
7.5. SERIES REPRESENTATION OF THE DENSITY 241 (well defined a.e. for S > 0). Then, the Laplace transforms hK{Z) and f{Z) of hK{S) and f{S), respectively exist for Re(Z) > B, and f(Z) = Σ Σ *MZ) for Re(Z) > B. (7.5.13) A:=0 * The definition and existence of Laplace transform are given in Chapter 1. In order to obtain explicit expressions for a^ of (7.5.3), and а$ of (7.5.5), we use the following method. Let us write K{Z)=i{Z)C«{G{Z)l (7.5.14) where ξ(Ζ) φ 0, is analytic for Re(Z) > B, and G{Z) is a one to one function. Further let θ = G(Z). Then G-l[G(Z)} = G-l(G) = Z. Define M<9>-t§w (7-5Л5) where L0(Z) = E{eti(-ZS)}, i.e., f(Z) = L0(Z). Hence, from (7.5.15), we get _ ~ ~K{G-\Q)) ~ L·2? Ki(G-4e))' = ΣΣ^.(Θ), (7.5.17) A:=0 * where the last two steps have been obtained by using (7.5.13) and (7.5.14). Now equating the coefficients of Οκ(θ) in (7.5.16) and (7.5.17), we get the explicit form for aK. THEOREM 7.5.1. For the power senes and Wishart type representation (7.5.3), a« ~a° ^ГоДо' (7·5 8} where 41} = det(D)-ipdet(2E)-in, D = diag(ab ... л), Μ = diag(/?b... ,βη), β. = 2απ~' Pj = 2-^,j = i,.-.,n. Proof: From (7.5.4), using Lemma 1.5.2, we get A№) = {(in)(crp(in)}"1ji>oetr{-(Z + 7E-1)5}det(5)i(»-'-1)C(,(E-15)dS = det(E)*"det(7/p + ΣΖ)-^Οκ({-γΙρ + ΣΖ)'1), Re{Z + 7Σ-1) > 0.
242 CHAPTER 7. DISTRIBUTION OF QUADRATIC FORMS Now let ξ(Ζ) = det(E)bdet(7/p + ΣΖ)"Κ and G(Z) = θ = (jlp + EZ)"1. Then Ζ = Σ-^θ"1 - 7/p) = G_1(0). Using Lemma 7.5.2 and (7.5.16), we have Μ(θ) = n?=1det(/p + 2a^G-1(e))-^ άβί(Σγ2ηάβί(ΊΙρ + ΣΟ-ι(Θ))-12η = det(E)"b det(2L>)"^ f[ det(/p - A©)"* = det(£)-bdet(2D)-b£Ei^^l^i®); (7.5.19) where the last equality is written by comparing (7.3.6) and (7.3.8). The series expansion in (7.5.19) is valid if and only if max | chi θ| < -, or min | ch;(EZ) + 7| > e, (7.5.20) г б г where б = max, \Pj\ = max, I7 — (20j)_1|. Now comparing coefficients of CK(Q) in (7.5.17) and (7.5.19), we have αϊ1» = am {\n)«CK{Ai) 0 k\ CK(Q- Using a$ from (7.5.18), we get the power series and Wishart type representation (7.5.3) of the density of 5 as where 7 is any real positive number. For this series to be a density, it should satisfy conditions of Lemma 7.5.3. Here we have ElEWwl = <ff Σ [ Σ Ск( i?rfn ^^(^.τΣ-1; s)\ A:=0 * k\CK{In) pV2 Now, using (1.5.4) and t = max,· \0j\ = max,· I7 - (2a,·) x|, we have |C«(Ai)| < CK(A10) < ekCK(In), where Al0 = diag(|A|, · · ■, Ш)- Hence OO I I I Σ Σ«№(5) ^аоЧ(кb~£)Σ_1·'5)· k=0{ к I Z For В = -(7 - б)Е-1, Re(Z) > β satisfies the condition (7.5.20) and therefore from the uniqueness property of Laplace transform we get the density (7.5.21). The series
7.5. SERIES REPRESENTATION OF THE DENSITY 243 (7.5.21) is uniformly convergent if 7 > e. For choosing 7, and for rapid convergence of the series (7.5.21), we give upper bound for e$\S) = From (7.5.21), we have klai№\S) = aP{rp(\n)} Σ Σ«»($) fc=N+l к 1 О^СЛАОСДЕ-1^) (7.5.22) (7.5.23) k\CK(In) det(5)5("-P-1)etr(-7E-15). Next, using Theorem 1.4.10, we can write C^E^S) det(5)5("-p-1) et^-yE"1,?) = π-5"ΡΓρ(^η)| _ C<«(E-1yy,)etr(-7E-1yy)<iy. (7.5.24) Substituting from (7.5.24), and then using the results д(Л1)а(Е-1УУ/) k\CK(In) = f Οκ{Έ-ιΥΗΑλΗΎ) [dH], JHeOin) we get k\a^h^(S) = π->4χ) / / etr(-7E-1yy') K K K ' u JYY'=sJHeO(n) K ' 1неО(п) Ο^Σ^ΥΗΑ^Ύ') [dH] dY. Hence, e$(S) = π-^Ρα^Ι / / etr(-7E-1yy') I Jyy'=s JHnOin) t j:c^-1y^h'y,)[dHw k=N+l к k\ - ,-W« = ττ-ϊ'^αχ-'Ι / I etr(-7E-1yy') I Jyy'=s JHeO(n) l №-lYH^H'Y'Vk[dH]dY k=N+l k\ (7.5.25) (i) Let Αι be negative semidefinite, i.e., 7 < ^Γ , where ax > · · · > an > 0. Since, Σ A:=7V+1 (-*)* k\ exp(-x) - £ "(-*)* I . x"+1 Jt=0 Jfc! < (N+l)\'
244 CHAPTER 7. DISTRIBUTION OF QUADRATIC FORMS we can write Σ {- tx(p-1YH(-A1)ITY')}k I {^(Е-1УЯ(-Л1)Я'У)}Л'+1 Jt=jv+i *■· < (N + l)\ ~ Σ Щ^у . (7-5-26) where C\ is a zonal polynomial, λ = (£χ,..., £η), ίχ > ■ ■ ■ > £n > 0, ίχ 4 + ίη = N + l. Hence e$(S) < π-ϊηραΡ [ ί βίτ^Σ^ΥΥ') JYY'=sJHeO(n) Σ (ϊνΤΊ)! [dH] dY- Now using / Сл(Е-1УЯ(-Л1)Я'Г) [dH] = Cx(-Ax)C^YY')^ JHeO{n) Cx{In) Cx(-Ax) < Cx(Al0) < Cx(In), and integrating over the surface YY' = S, we get eii)(S)<4'4(i»,7E-';5).''"£|E12 where e = ^ 7· For the uniform convergence of the series (7.5.21), we need 4α~ < Ί < dr* Note that the power series expansion does not converge rapidly and uniformly, because in this case 7 = 0. The best choice of 7 is 7 = ^-, when Ax is negative semidefinite, and б = \{·£- — ^)· (ii) If 7 > 2^-, the matrix Αι will have negative as well as positive elements. Let A10 = diag(|A|H..,|&|). Then {^(Σ-ΎΗΑ^Ύ'^Ι ~ {^(Е-1УЯА10Я,У)}А: Σ w" кГ " * Σ k=N+l л" A:' A:=7V+1 Λ· (ΑΓ+1)! = E^^f^T^etri^yr). (7-5.28)
7.5. SERIES REPRESENTATION OF THE DENSITY 245 Now substituting from (7.5.28) in (7.5.25), we get χ Jyy'=s JHeO(n) ΟΧ(Σ-ιΥΗΑι0ΗΎ') [dH] dY (N+l)\ Сд(Е_1УУ') йУ -p-i) etr{-(7 - cJE-^CaCE-1^ < ^(in, (7 - β)Σ"1; ^"+1 ^ + Ι)Γ' (7.5.29) where б = max, |/?j| = max, I7 — (2oj)_1|. For uniform convergence of (7.5.21), 7 > e gives 7 > 4^-. The best choice for б is б = inf7maxj I7 — (2oj)-1|, which gives 7 = j(— + —) and hence e = j(— -) and 7 — б = ^. Therefore the choice ' 4 ^ αϊ αη' 4^αη ct\' ' 2αη 7 = i(— + — )isa better choice than the best choice in (i). THEOREM 7.5.2. For the Laguerre type representation (7.5.5), (2) J2)(|nl^(A2) /7гом 4 a° ^ГЩЛО' (7'5'30) where a{Q] = det(.D)-Wet(2EHndet(Jn - A2)&, D = diag(ab... ,an); A2 = diag(0b ..., 0n), and φά = γζ^^, j = 1,... ,n. Proof: From (7.5.6), and Theorem 1.7.1, we have h%\Z) = det(E)b det(7/p + ΣΖ)-*ηΟκ(Ιρ - α(ΊΙρ + ΣΖ)"1), Re(7/P + ΣΖ) > 0. Now let ξ(Ζ) = det(E)bdet(7/p + EZ)"b, and G(Z) = θ = Ip - α(ΊΙρ + ΣΖ)"1. Then Ζ = Σ~ι[α(Ιρ - θ)"1 - ΊΙΡ) = G~l(e). Using Lemma 7.5.2 and (7.5.16), we have M(Q) = ΠΓ=ι det(/P + 2ai{a{Ip - Θ)-1 - 7/,})-* det(E)b det(a(/p - Θ)-1)-^" ί=1 ,^^ψο^αψχ (7531) Α:=0 « Κ· ^κ\*η)
246 CHAPTER 7. DISTRIBUTION OF QUADRATIC FORMS The series expansion in (7.5.31) is valid if and only if тах|сп^0| < —, or maxll- _ ,_,_. 1 < —, (7.5.32) * ' 6ι' χ I chi(EZ)+7l 6i' v J where e\ = max, \</>j\. Now comparing the coefficients of Οκ(θ) in (7.5.17) and (7.5.31), we have -m_ Л2)(ъп)кСк(А2) а« ~а° k\ cK(in)'u Using a,W from (7.5.30), we get the Laguerre type representation (7.5.5) of the density of S as f2(S) = aywp[-n,^ 1;S)}212 ϊΤγΤμ ' (7.5.SS) A:=0 к klCK(In) where 7 is any real positive number. It may be noted here that the series (7.5.33) is convergent if and only if 61 = maxj \<f>j\ < 1. Therefore, we choose 7 and α such that 61 < 1. Then using Lemma 7.5.1, £|E«2)(S)| < (ι -ρ)-»-(ι - ^)_1424(^> (7 - γ^ς-1;*), where ρ is a real number, 61 < ρ < 1. Since Re(Z) > —(7 — ^-)Σ_1, satisfies the condition (7.5.32) and therefore from the uniqueness property of Laplace transform and Lemma 7.5.3, f2(S) defines the density of S. The series (7.5.33) is uniformly convergent if 7 > -f^-. ■ The results given in this section were derived by Khatri (1971). For the Laguerre series expansion, he has also given upper bound for J2) era A:=7V+1 « 7.6. NONCENTRAL DENSITY FUNCTION Let Χ ~ ΑΓρ?η(Μ, Σ <g> Φ). In Section 7.2, the density of S = XAX', Α (η χ η) > 0, η > ρ, for Μ = 0 has been derived. In this section, the density of S for Μ φ 0, called the noncentral density, is derived. THEOREM 7.6.1. Let X ~ /Vp?n(M, Σ <g> In), n>p, αηάΣ> 0. Then the density of S = XAX', Α (η χ n) > 0, is given by {22ηρΓρ(^η)}_1 det^)-2ndet(A)-2petr (- h:~lMM' - \q^~lS) det(S)*<n-p-1) 00 ι ι 1
7.6. NONCENTRAL DENSITY FUNCTION 247 where q > 0, In — qA is positive definite and PK(-, ·, ·) is the generalized Hayakawa polynomial defined in Section 1.8. Proof: The density of S = XAX', X ~ NPtn(M, Σ <g> Jn), can be obtained from the density of S = XAX', Χ ~ ΝΡι71(Σ-^Μ,Ιρ <g> Jn), by transforming S -> E^SEi Therefore we derive the density of S = XAX', Χ ~ ΛΓΡ}η(μ, Jp <g> Jn), μ = Σ^Μ, which is given by f(S) = (2tt)-^p / etr {- \{X - μ){Χ - μ)'} dX JXAX' = S L Δ J = (2π)-1ϊηρ [ etr [- \{qXAX' + X(In - qA)X' Jxax'=s L 2 -2μΧ' + μμ'}1όΧ. (7.6.1) Note that the integral *-*""_£etr [- {i/ - ^(*('« - **)* - M(/n - qA)-i)} {U-^(X(In - qA)i - μ(Ιη - qA)-l)}'] dU (7.6.2) is unity since U ~ JVp,n (^ (X(/n - qA)? - μ(Ιη - qA)~b) , \lv ® /n). Next multiplying (7.6.2) and (7.6.1), and changing the order of integration we get f(S) = (2^)~bPetr {- 1-μμ! + |μ(/η - *А)"У} J eti{-UU' - V2tU{In - ςΑ)-*μ')} [ etr [- \qXAX' + V2iU(In - qA)ix'] dXdU. (7.6.3) JXAX'=S L 2 J Now substituting Υ = ХАъ, with the Jacobian J(X —>· У) = det(A)~2P? and using Theorem 1.6.6, we get [ etr [- \qXAX' + y/2tU(In - qA)iX'] dX JXAX'=S L 2 -I ι = руцт det(A)-i'etr (- -9S) det(S)^-'-1) oFi(\n,-\u(A-l-qIn)U'S) = -^ det(A)-^etr (- i9s) det(S)^-1) Γρ(|η) 2 Σ Σ 7ттгй^( - ί^_1 - «W 4 (7·6·4) A:=0 * V27i/«/C· Z
248 CHAPTER 7. DISTRIBUTION OF QUADRATIC FORMS In (7.6.3), substitute from (7.6.4), and use (1.8.2) to get {2bTp(in)}-1 det(A)-btr (- ψμ' - ±qs) det{S)*^i) ΣΣ(ρ^^(^-^)-Μ-1-9/45);5>ο. A:=0 * V2 Finally transforming S ->· Σ25Σ2 we get the desired result. ■ THEOREM 7.6.2. Lei X ~ NPy7l(M, Σ <g> Φ), η > ρ, Σ > 0 and Φ > 0. ТДеп *Ле density function of S = XAX'', Α(η χ n) > 0, is given by {2^pΓp(irι)}"1det(Σ)-Ьdet(β)-2petr(-iΣ-1Mφ-1M,) 1 °° 1 *(-1*г-5)<И5)М~ >ЕЕд^ Ρκ(-^=Σ"5Μφ-5(/η - дВГКВ'1 - qln, |ς-*5Σ"*), 5 > 0, w/iere <? > 0; β = #2 АФг; /та — qB is positive definite and PK(-, ·, ·) is the generalized Hayakawa polynomial defined in Section 1.8. Proof: Note that XAX' = Υ BY', where Υ ~ ΛΓρ,η(Μφ-2, Σ®/n) and В = φ2 Αφέ. The result now follows from Theorem 7.6.1. ■ COROLLARY 7.6.2.1. For Μ = 0, *Ле above density of S reduces to (7.2.1). Proof: When Μ = 0, from (1.8.3), we have .1 ™-ι_ ι > P.(ftB- -rt.ir.5E4) - (у1'.-'-у^;). („5) Using (7.6.5) in Theorem 7.6.2 for Μ = 0, and simplifying we get the desired result. ■ COROLLARY 7.6.2.2. The density of S = XX' is given by {2Ьpгp(irι)}"1det(Σ)-^ndet(Φ)-^etг(-iΣ-1Mφ-1M,) ι °° ι elt(-i,E-.S)d,t<^— E?iFL_ Ρκ(-^Σ"5Μφ-5(7η - 9Φ)-έ, Φ-1 - qln, ^Σ-55Σ-5), S > 0. COROLLARY 7.6.2.3. Рог Ф?ЛФ5 = In, S ~ Η^η,Σ,Σ^ΜΦ^Μ').
7.6. NONCENTRAL DENSITY ΡϋΝΟΉΟΝ 249 COROLLARY 7.6.2.4. For X (1 χ η) = ж', M(l χ η) = т! = (mb...,mn), Л = /η,Σ(1 χ 1) = 1, Φ = diag^i/»,,...,^/^), Σ£ι** = η, s = χ'χ - Σϋι^ίΧ^(^ί) гуДеге χ7^ is α noncentral chi-square distribution with щ degrees of freedom and noncentrality parameter ω* = Σ^ΐϊ+^+ηί-ι+ι ~7^· The density ofs, from Theorem 7.6.2, is {2br(in)}-1(n^)"%xp(-iE-i-^)^(n-2) For calculating PK(t',A, B), Crowther (1975) has given a method of utilizing the cumulants of certain quadratic form involving A. Khatri (1977), using Laplace transforms, has generalized the results of Shah (1971) to the noncentral case. The density of 5, derived by Khatri (1977), is {2>rp(in)}"1det(^-1E)-betr(-^E-15)det(5)^n-p-1) oo -ι ι 1 Σ Σ (I^)jfe!L"(29E"i5Iri'7" ~ qA' ^E-iMifoyl)-1 - /„)-*), S > 0. where q > 0 is a constant which governs the convergence of the series, and L£(S, A, T) is the generalized Laguerre polynomial defined by Khatri (1977). When A = In, i.e., S ~ Wp(n, Σ, Y>~lMM'), he also obtained the following representation of noncentral Wishart density in terms of the generalized Laguerre polynomials, {2bprp(in)}"1det(E)-betr(-iE-15)det(5)^(n-p-1) Next, following the method similar to Section 7.5, we derive a series representation for the density of S in the noncentral case. Since S has \p{p + 1) distinct random variables, we can write the series form of f(S) as /(s) = EE«4f«s), (7-6.6) A:=0 К where К = (&1Ь &12,..., klp, k22, · · ·, k2p,..., *„,), k{j > 0, Σ?=ι Σ?=» *« = *> Σ* is the multinomial sum, a# is a constant, and /k(S) is a suitable function of S. This series uses ^p(p + 1) partitions whereas the series (7.5.2) uses ρ partitions of /c, and structurally these two representations are different. For the convergence of the series (7.6.6), we require |/(5)| < b etr(BS), for all S > 0,
250 CHAPTER 7. DISTRIBUTION OF QUADRATIC FORMS where b is a real constant and —β is a positive semidefinite matrix. Since the density of S can be obtained from the density of Σ~2£Σ~2, (Khatri, 1975), we therefore without loss of generality take Σ = Ip in the following derivation. From Lemma 7.5.2, the Laplace transform, L0(Z), of S is Lo(Z) = Π det(/p + 2a3Z)-* exp {- £ δ'όΖ{Ιρ + 2αάΖ)~ιδλ. (7.6.7) Let f(Z) be the Laplace transform of (7.6.6). Then (7.6.6) is the p.d.f. of S if and only if f(Z) = Lo(Z), (7.6.8) for almost all Ζ such that Re(Z) > 0. Further let θ = (0{i) = (ηΙρ + Z)~\ i.e., Ζ = θ-1 - ηΙρ. Then from (7.6.7) we have L0(G-1 - 7/P) = a0 det(©)*n Π det(7p " fte)_i exP f Σ ^Θ(7ρ " /%©)~Ч}> j=i L i=i } (7-6.9) where a0 = aet(2D)-^pexp{-\Z]=i^~}, "j = ^t^ D = diag(ab ... ,an), and bj = 2aj'2~\ j = 1,..., n. Now (7.6.9) can be written as L0(O-1 - 7/p) = det(©)*n £ Σ α* 7\fc(0), (7.6.10) A:=0 К where TV* (θ) = IBU П?=*Ф", *tf = fy, i,j = 1, · ·. ,p. From (7.6.8) and (7.6.10), we get /(Z) = det(7/P + Z)~in £ Σ ακΝκ((ΊΙρ + Ζ)~ι). (7.6.11) A:=0 ΛΓ Comparing (7.6.11) with (7.6.6), it follows that the Laplace transform of /k(S) is fK(Z) = det(7/P + Z)-inNK(frIp + z)~1)· (7-6.12) Thus, we need to find the function /k(S) whose Laplace transform is (7.6.12). To do so let us consider gk(S,Q)=Y/{(±n)xy1Cx(QS)wp(±n,1Ip;S), (7.6.13) where Wt (^n,7/P;5) = {rp^njJ^detiSJ^-^etri^S), n>p, A = (iu i2, · · ·, tp), t\ > h > · · · > tp > 0, and ex + £2 + · · · + £P = k. The Laplace transform of (7.6.13) is gk(S,Q) = j:{rp(^n)(\n)xY1Js>oetT(-ZS)Cx(QS)wp(^nnIP;S)dS
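$$= \sum_{\lambda}\{\Gamma_p(\tfrac12 n)(\tfrac12 n)_\lambda\}^{-1}\int_{S>0}\operatorname{etr}\{-(\gamma I_p+Z)S\}\,C_\lambda(QS)\det(S)^{\frac12(n-p-1)}\,dS$$
$$= \det(\gamma I_p+Z)^{-\frac12 n}\sum_{\lambda}C_\lambda\big(Q(\gamma I_p+Z)^{-1}\big) = \det(\gamma I_p+Z)^{-\frac12 n}\{\operatorname{tr}(Q(\gamma I_p+Z)^{-1})\}^{k} = \sum_{K} f_K(Z)\,N_K(Q)\,c_K, \tag{7.6.14}$$
where $c_K = 2^{\,k-\sum_{i=1}^{p}k_{ii}}\,k!\,\big\{\prod_{i=1}^{p}\prod_{j=i}^{p}k_{ij}!\big\}^{-1}$. Then, from the uniqueness of the Laplace transform, the function $f_K(S)$ whose Laplace transform $f_K(Z)$ is the coefficient of $N_K(Q)c_K$ in (7.6.14) is the coefficient of $N_K(Q)c_K$ in the expansion of $g_k(S,Q)$. Thus
$$f(S) = \sum_{k=0}^{\infty}\sum_{K} a_K f_K(S), \tag{7.6.15}$$
where $f_K(S)$ can be obtained as described above. For convergence of the series (7.6.15), Khatri (1975) has obtained bounds for $|\sum_K a_K f_K(S)|$ for $k \ge 1$. He has also tabulated $a_K$ and $f_K(S)$ for $k = 1, 2$.

7.7. EXPECTED VALUES

The matrix quadratic forms studied in this chapter are defined in terms of $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$. The matrix variate normal distribution has been studied in Chapter 2, where several expected values of functions of $XAX'$ were also given. For the sake of completeness, we state those below (for proofs the reader is referred to Chapter 2). Throughout this section the matrices of the quadratic forms need not be symmetric.

THEOREM 7.7.1. Let $S_A = XAX'$ and $S_B = XBX'$, where $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$, and $A\,(n\times n)$ and $B\,(n\times n)$ are constant matrices. Then
(i) $E(S_A) = \operatorname{tr}(A\Phi)\Sigma + MAM'$,
(ii) $E(S_A C S_B) = \operatorname{tr}(\Phi B'\Phi A')\operatorname{tr}(C\Sigma)\Sigma + \operatorname{tr}(A\Phi)\operatorname{tr}(B\Phi)\Sigma C\Sigma + \operatorname{tr}(A\Phi B'\Phi)\Sigma C'\Sigma + \operatorname{tr}(B\Phi)MAM'C\Sigma + MA\Phi B'M'C'\Sigma + \operatorname{tr}(AM'CMB\Phi)\Sigma + \operatorname{tr}(C\Sigma)MA\Phi BM' + \Sigma C'MA'\Phi BM' + \operatorname{tr}(A\Phi)\Sigma CMBM' + MAM'CMBM'$, and
(iii) $E(\operatorname{tr}(S_B C)S_A) = \operatorname{tr}(A'\Phi B\Phi)\Sigma C'\Sigma + \operatorname{tr}(A\Phi)\operatorname{tr}(B\Phi)\operatorname{tr}(C\Sigma)\Sigma + \operatorname{tr}(A'\Phi B'\Phi)\Sigma C\Sigma + \operatorname{tr}(B\Phi)\operatorname{tr}(C\Sigma)MAM' + MA\Phi BM'C\Sigma + \Sigma C'MB'\Phi AM' + MA\Phi B'M'C'\Sigma + \Sigma CMB\Phi AM' + \operatorname{tr}(A\Phi)\operatorname{tr}(M'CMB)\Sigma + \operatorname{tr}(BM'CM)MAM'$,
where $C\,(p\times p)$ is a constant matrix.

Next we derive the covariance matrix of $\operatorname{vec}(XAX')$ and $\operatorname{vec}(XBX')$, a result due to Neudecker and Wansbeek (1987).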
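THEOREM 7.7.2. Let $S_A = XAX'$ and $S_B = XBX'$, where $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$. Then
$$\operatorname{cov}(\operatorname{vec}(S_A), \operatorname{vec}(S_B)) = \operatorname{tr}(A\Phi B'\Phi)\,\Sigma\otimes\Sigma + MA'\Phi BM'\otimes\Sigma + \Sigma\otimes MA\Phi B'M' + \{\operatorname{tr}(A'\Phi B'\Phi)\,\Sigma\otimes\Sigma + MA'\Phi B'M'\otimes\Sigma + \Sigma\otimes MA\Phi BM'\}K_{pp},$$
where $K_{pp}$ is the commutation matrix defined in Section 1.2.

Proof: From Theorem 7.7.1, we have
$$E(S_ACS_B) - E(S_A)CE(S_B) = \operatorname{tr}(\Phi B'\Phi A')\operatorname{tr}(C\Sigma)\Sigma + \operatorname{tr}(A\Phi B'\Phi)\Sigma C'\Sigma + MA\Phi B'M'C'\Sigma + \operatorname{tr}(AM'CMB\Phi)\Sigma + \operatorname{tr}(C\Sigma)MA\Phi BM' + \Sigma C'MA'\Phi BM'$$
$$= \sum_{i=1}^{3}P_iC'Q_i + \sum_{i=4}^{6}\operatorname{tr}(C'P_i)Q_i, \tag{7.7.1}$$
where $P_1 = \operatorname{tr}(A\Phi B'\Phi)\Sigma$, $P_3 = MA\Phi B'M'$, $P_5 = MA'\Phi B'M'$, $P_2 = P_4 = P_6 = \Sigma = Q_1 = Q_3 = Q_5$, $Q_2 = MA'\Phi BM'$, $Q_4 = \operatorname{tr}(A'\Phi B'\Phi)\Sigma$ and $Q_6 = MA\Phi BM'$. Now for $D\,(p\times p)$, we can write
$$\operatorname{tr}\{K_{pp}(C'\otimes D)\operatorname{cov}(\operatorname{vec}(S_A),\operatorname{vec}(S_B))\}$$
$$= \operatorname{tr}\big[K_{pp}(C'\otimes D)\{E(\operatorname{vec}(S_A)(\operatorname{vec}(S_B))') - E(\operatorname{vec}(S_A))E(\operatorname{vec}(S_B))'\}\big]$$
$$= \operatorname{tr}\big[(C'\otimes D)\{E(\operatorname{vec}(S_A)(\operatorname{vec}(S_B'))') - E(\operatorname{vec}(S_A))E(\operatorname{vec}(S_B'))'\}\big]$$
$$= \operatorname{tr}\big[E\{\operatorname{vec}(DS_AC)(\operatorname{vec}(S_B'))'\} - E\{\operatorname{vec}(DS_AC)\}E\{(\operatorname{vec}(S_B'))'\}\big]$$
$$= \operatorname{tr}\{E(S_ACS_BD)\} - \operatorname{tr}\{E(S_A)CE(S_B)D\}. \tag{7.7.2}$$
The expression (7.7.2) has been obtained by using the properties of the Kronecker product and the commutation matrix given in Section 1.2. Now, using (7.7.1) in (7.7.2), we get
$$\operatorname{tr}\{K_{pp}(C'\otimes D)\operatorname{cov}(\operatorname{vec}(S_A),\operatorname{vec}(S_B))\} = \sum_{i=1}^{3}\operatorname{tr}(P_iC'Q_iD) + \sum_{i=4}^{6}\operatorname{tr}(C'P_i)\operatorname{tr}(DQ_i)$$
$$= \sum_{i=1}^{3}\operatorname{tr}\{K_{pp}(C'\otimes D)(Q_i\otimes P_i)\} + \sum_{i=4}^{6}\operatorname{tr}\{K_{pp}(C'\otimes D)K_{pp}(Q_i\otimes P_i)\}, \tag{7.7.3}$$
since
$$\operatorname{tr}(C'Q_iDP_i) = \operatorname{tr}\{K_{pp}(C'Q_i\otimes DP_i)\} = \operatorname{tr}\{K_{pp}(C'\otimes D)(Q_i\otimes P_i)\}$$
and
$$\operatorname{tr}(P_iC')\operatorname{tr}(Q_iD) = \operatorname{tr}(P_iC'\otimes Q_iD) = \operatorname{tr}\{(C'\otimes D)(P_i\otimes Q_i)\} = \operatorname{tr}\{(C'\otimes D)K_{pp}(Q_i\otimes P_i)K_{pp}\}.$$
The result (7.7.3) holds for any $C$ and $D$. Hence
$$\operatorname{cov}(\operatorname{vec}(S_A),\operatorname{vec}(S_B)) = \sum_{i=1}^{3}(Q_i\otimes P_i) + \sum_{i=4}^{6}K_{pp}(Q_i\otimes P_i). \tag{7.7.4}$$
By substituting for $P_i$, $Q_i$, $i = 1,\ldots,6$, in (7.7.4) and using $K_{pp}(Q_i\otimes P_i) = (P_i\otimes Q_i)K_{pp}$, we get the desired result. ■

Letting $A = B$ in the above theorem, we get the following result.

COROLLARY 7.7.2.1. The covariance matrix of $\operatorname{vec}(S_A)$ is given by
$$\operatorname{cov}(\operatorname{vec}(S_A)) = \operatorname{tr}(A\Phi A'\Phi)\,\Sigma\otimes\Sigma + MA'\Phi AM'\otimes\Sigma + \Sigma\otimes MA\Phi A'M' + \{\operatorname{tr}(A'\Phi A'\Phi)\,\Sigma\otimes\Sigma + MA'\Phi A'M'\otimes\Sigma + \Sigma\otimes MA\Phi AM'\}K_{pp}. \tag{7.7.5}$$

When $\Phi = I_n$, (7.7.5) reduces to the result given by Neudecker (1985). For $A = I_n$, $\Phi = I_n$, (7.7.5) gives the covariance matrix of the noncentral Wishart matrix, as in Magnus and Neudecker (1979). By substituting $M = 0$ in Theorems 7.7.1 and 7.7.2 we get the results given in Theorem 7.3.5. von Rosen (1988b) derived $E[(XAX')\otimes(XBX')]$ when $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$. Tracy and Sultana (1993) derived $E[(XAX')\otimes(XBX')\otimes(XCX')]$ when $X \sim N_{p,n}(0, \Sigma\otimes\Phi)$. Kang and Kim (1996) gave a general result for $E[\otimes_{i=1}^{k}(XA_iX')]$ for $X \sim N_{p,n}(0, \Sigma\otimes\Phi)$.

7.8. WISHARTNESS AND INDEPENDENCE OF QUADRATIC FORMS OF THE TYPE $XAX'$

So far we have studied the distribution of $XAX'$, where $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$, for $M = 0$ and $M \ne 0$, i.e., the central and noncentral cases. Under certain conditions these quadratic forms follow a Wishart or noncentral Wishart distribution, as noted in Chapter 3 and also in this chapter. In the present section we give conditions for Wishartness of quadratic forms of the type $XAX'$. Conditions for independence of two or more such quadratic forms are also given. In the next section we derive similar conditions for quadratic forms of the type $XAX' + \tfrac12(LX' + XL') + C$. Most of the results derived in the present section can be obtained as special cases of the results in the next section.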
However, for the sake of completeness and readability, the results for two types of quadratic forms are presented sequentially.
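Looking back at the proof of Theorem 7.7.2, the commutation-matrix identities it relies on can be checked numerically. The following is a minimal sketch, assuming NumPy and a column-stacking $\operatorname{vec}$; the helper names `commutation_matrix` and `vec` and the random test matrices are illustrative choices, not part of the text.

```python
import numpy as np

def commutation_matrix(p, q):
    """Return K such that K @ vec(A) == vec(A.T) for any p x q matrix A."""
    K = np.zeros((p * q, p * q))
    for i in range(p):
        for j in range(q):
            # column-major vec: vec(A)[j*p + i] = A[i, j], vec(A.T)[i*q + j] = A[i, j]
            K[i * q + j, j * p + i] = 1.0
    return K

def vec(A):
    """Stack the columns of A into one vector."""
    return A.reshape(-1, order="F")

rng = np.random.default_rng(0)
p = 3
C, D, P, Q = (rng.standard_normal((p, p)) for _ in range(4))
K = commutation_matrix(p, p)

# K vec(S) = vec(S')
assert np.allclose(K @ vec(C), vec(C.T))
# tr{K (C (x) D)} = tr(C D)
assert np.isclose(np.trace(K @ np.kron(C, D)), np.trace(C @ D))
# K (Q (x) P) K = P (x) Q
assert np.allclose(K @ np.kron(Q, P) @ K, np.kron(P, Q))
# tr(P C' Q D) = tr{K (C' (x) D)(Q (x) P)}, the step used for the terms i = 1, 2, 3
lhs = np.trace(P @ C.T @ Q @ D)
rhs = np.trace(K @ np.kron(C.T, D) @ np.kron(Q, P))
assert np.isclose(lhs, rhs)
```

The last assertion is the identity that turns each term $\operatorname{tr}(P_iC'Q_iD)$ into a trace against $K_{pp}(C'\otimes D)$, which is what allows the covariance matrix to be read off term by term.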
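Let us now consider the quadratic forms of the type $XAX'$. First we derive the m.g.f. of $XAX'$ for $\Phi = I_n$.

THEOREM 7.8.1. Let $X \sim N_{p,n}(M, \Sigma\otimes I_n)$. Then the m.g.f. of $S = XAX'$, where $A\,(n\times n)$ is a symmetric matrix of rank $t$, is given by
$$M_S(Z) = \prod_{j=1}^{t}\det(I_p - 2\lambda_j\Sigma Z)^{-\frac12}\exp\Big\{\sum_{j=1}^{t}\lambda_j q_j'M'Z(I_p - 2\lambda_j\Sigma Z)^{-1}Mq_j\Big\}, \tag{7.8.1}$$
where $A = Q\begin{pmatrix}D_\lambda & 0\\ 0 & 0\end{pmatrix}Q'$, $Q\,(n\times n)$ is an orthogonal matrix, $Q = (q_1,\ldots,q_n)$, $D_\lambda = \operatorname{diag}(\lambda_1,\ldots,\lambda_t)$ and $\lambda_1,\ldots,\lambda_t$ are the nonzero characteristic roots of $A$.

Proof: The m.g.f. of $S = XAX'$ is given by
$$M_S(Z) = (2\pi)^{-\frac12 np}\det(\Sigma)^{-\frac12 n}\int\operatorname{etr}\{ZXAX' - \tfrac12\Sigma^{-1}(X-M)(X-M)'\}\,dX. \tag{7.8.2}$$
Since $A$ is a symmetric matrix of rank $t\ (\le n)$, we can write $A = Q\begin{pmatrix}D_\lambda & 0\\ 0 & 0\end{pmatrix}Q'$, where $D_\lambda = \operatorname{diag}(\lambda_1,\ldots,\lambda_t)$, $\lambda_j$, $j = 1,\ldots,t$, are the nonzero characteristic roots of $A$, and $Q\,(n\times n)$ is an orthogonal matrix. Using the transformation $Y = \Sigma^{-\frac12}XQ$ in (7.8.2) with the Jacobian $J(X\to Y) = \det(\Sigma)^{\frac12 n}$, we have
$$M_S(Z) = (2\pi)^{-\frac12 np}\int\operatorname{etr}\{\Sigma^{\frac12}Z\Sigma^{\frac12}YQ'AQY' - \tfrac12(YQ' - \Sigma^{-\frac12}M)(YQ' - \Sigma^{-\frac12}M)'\}\,dY$$
$$= (2\pi)^{-\frac12 np}\int_{Y\in\mathbb{R}^{p\times n}}\operatorname{etr}\{\Sigma^{\frac12}Z\Sigma^{\frac12}Y_1D_\lambda Y_1' - \tfrac12(Y-N)(Y-N)'\}\,dY$$
$$= (2\pi)^{-\frac12 pt}\int_{Y_1\in\mathbb{R}^{p\times t}}\operatorname{etr}\{\Sigma^{\frac12}Z\Sigma^{\frac12}Y_1D_\lambda Y_1' - \tfrac12(Y_1-N_1)(Y_1-N_1)'\}\,dY_1$$
$$= (2\pi)^{-\frac12 pt}\operatorname{etr}(-\tfrac12N_1N_1')\int_{Y_1\in\mathbb{R}^{p\times t}}\operatorname{etr}\{-\tfrac12(Y_1Y_1' - 2\Sigma^{\frac12}Z\Sigma^{\frac12}Y_1D_\lambda Y_1') + N_1Y_1'\}\,dY_1, \tag{7.8.3}$$
where $N = \Sigma^{-\frac12}MQ$, $Y = (Y_1\ Y_2)$, $Y_1\,(p\times t)$, and $N = (N_1\ N_2)$, $N_1\,(p\times t)$. Writing $Y_1 = (y_{11},\ldots,y_{1t})$ and $N_1 = (n_{11},\ldots,n_{1t})$, we get
$$\operatorname{tr}(N_1N_1') = \sum_{j=1}^{t}n_{1j}'n_{1j}, \tag{7.8.4}$$
$$\operatorname{tr}(N_1Y_1') = \sum_{j=1}^{t}n_{1j}'y_{1j}, \tag{7.8.5}$$
and
$$\operatorname{tr}(Y_1Y_1' - 2\Sigma^{\frac12}Z\Sigma^{\frac12}Y_1D_\lambda Y_1') = \operatorname{tr}\Big\{\sum_{j=1}^{t}(I_p - 2\lambda_j\Sigma^{\frac12}Z\Sigma^{\frac12})y_{1j}y_{1j}'\Big\}. \tag{7.8.6}$$
By substituting from (7.8.4), (7.8.5) and (7.8.6) in (7.8.3) we get
$$M_S(Z) = (2\pi)^{-\frac12 pt}\exp\Big(-\tfrac12\sum_{j=1}^{t}n_{1j}'n_{1j}\Big)\prod_{j=1}^{t}\int_{y_{1j}\in\mathbb{R}^{p}}\exp\{-\tfrac12y_{1j}'(I_p - 2\lambda_j\Sigma^{\frac12}Z\Sigma^{\frac12})y_{1j} + n_{1j}'y_{1j}\}\,dy_{1j}$$
$$= \prod_{j=1}^{t}\det(I_p - 2\lambda_j\Sigma^{\frac12}Z\Sigma^{\frac12})^{-\frac12}\exp\Big\{-\tfrac12\sum_{j=1}^{t}n_{1j}'n_{1j} + \tfrac12\sum_{j=1}^{t}n_{1j}'(I_p - 2\lambda_j\Sigma^{\frac12}Z\Sigma^{\frac12})^{-1}n_{1j}\Big\}$$
$$= \prod_{j=1}^{t}\det(I_p - 2\lambda_j\Sigma^{\frac12}Z\Sigma^{\frac12})^{-\frac12}\exp\Big\{\sum_{j=1}^{t}\lambda_jn_{1j}'\Sigma^{\frac12}Z\Sigma^{\frac12}(I_p - 2\lambda_j\Sigma^{\frac12}Z\Sigma^{\frac12})^{-1}n_{1j}\Big\}. \tag{7.8.7}$$
The final result is obtained from (7.8.7) by noting that $N_1 = \Sigma^{-\frac12}M(q_1,\ldots,q_t)$. ■

COROLLARY 7.8.1.1. If $\lambda_i = 1$, $i = 1,\ldots,t$, $t \ge p$, then $S \sim W_p(t, \Sigma, \Sigma^{-1}MAM')$.

Proof: Substituting $\lambda_i = 1$, $i = 1,\ldots,t$, in (7.8.1) and noting that $A = Q\begin{pmatrix}I_t & 0\\ 0 & 0\end{pmatrix}Q' = \sum_{j=1}^{t}q_jq_j'$, we have the m.g.f. of $S$ as
$$M_S(Z) = \det(I_p - 2\Sigma Z)^{-\frac12 t}\operatorname{etr}\{Z(I_p - 2\Sigma Z)^{-1}MAM'\},$$
which is the m.g.f. of a noncentral Wishart matrix. ■

From the above theorem we obtain the cumulant generating function of $S$.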
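THEOREM 7.8.2. Let $X \sim N_{p,n}(M, \Sigma\otimes I_n)$. Then the c.g.f. of $S = XAX'$, where $A\,(n\times n)$ is a symmetric matrix of rank $t$, is given by
$$\ln M_S(Z) = \sum_{s=1}^{\infty}\frac{2^{s-1}}{s}\operatorname{tr}(A^s)\operatorname{tr}((\Sigma Z)^s) + \sum_{s=0}^{\infty}2^s\operatorname{tr}\{MA^{s+1}M'Z(\Sigma Z)^s\}. \tag{7.8.8}$$

Proof: From Theorem 7.8.1, we get
$$\ln M_S(Z) = -\frac12\sum_{j=1}^{t}\ln\{\det(I_p - 2\lambda_j\Sigma Z)\} + \sum_{j=1}^{t}\lambda_jq_j'M'Z(I_p - 2\lambda_j\Sigma Z)^{-1}Mq_j. \tag{7.8.9}$$
Note that
$$\ln\{\det(I_p - 2\lambda_j\Sigma Z)\} = -\sum_{s=1}^{\infty}\frac{2^s\lambda_j^s}{s}\operatorname{tr}((\Sigma Z)^s), \tag{7.8.10}$$
and
$$\sum_{j=1}^{t}\lambda_jq_j'M'Z(I_p - 2\lambda_j\Sigma Z)^{-1}Mq_j = \sum_{j=1}^{t}\lambda_jq_j'M'Z\Big(\sum_{s=0}^{\infty}(2\lambda_j\Sigma Z)^s\Big)Mq_j = \sum_{s=0}^{\infty}2^s\operatorname{tr}\Big\{M'Z(\Sigma Z)^sM\sum_{j=1}^{t}\lambda_j^{s+1}q_jq_j'\Big\}, \tag{7.8.11}$$
where the series expansions in (7.8.10) and (7.8.11) are valid for $\max_i|\operatorname{ch}_i(\lambda_j\Sigma Z)| < \tfrac12$, which can be met since $Z$ is arbitrary. From Theorem 7.8.1, we have
$$A^s = \sum_{j=1}^{t}\lambda_j^sq_jq_j' \tag{7.8.12}$$
and
$$\operatorname{tr}(A^s) = \sum_{j=1}^{t}\lambda_j^s. \tag{7.8.13}$$
Finally, substituting (7.8.10) and (7.8.11) in (7.8.9) and simplifying the resulting expression using (7.8.12) and (7.8.13), we get the desired result. ■

COROLLARY 7.8.2.1. If $\lambda_i = 1$, $i = 1,\ldots,t$, $t \ge p$, then $S \sim W_p(t, \Sigma, \Sigma^{-1}MAM')$ with the c.g.f.
$$\ln M_S(Z) = t\sum_{s=1}^{\infty}\frac{2^{s-1}}{s}\operatorname{tr}((\Sigma Z)^s) + \sum_{s=0}^{\infty}2^s\operatorname{tr}\{MAM'Z(\Sigma Z)^s\}. \tag{7.8.14}$$

Proof: From Corollary 7.8.1.1, $S \sim W_p(t, \Sigma, \Sigma^{-1}MAM')$. Now the result follows by substituting $\lambda_i = 1$, $i = 1,\ldots,t$, i.e., $A^2 = A$ and $\operatorname{tr}(A) = t$, in (7.8.8). ■

Next we derive conditions for Wishartness of a matrix quadratic form $XAX'$.

THEOREM 7.8.3. Let $S = XAX'$, where $X \sim N_{p,n}(M, \Sigma\otimes I_n)$. The necessary and sufficient condition for $S$ to be distributed as $W_p(t, \Sigma, \Sigma^{-1}MAM')$ is that $A$ is idempotent of rank $t \ge p$.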
Proof: The m.g.f. of $XAX'$ is given in Theorem 7.8.1. Let $A$ be idempotent of rank $t \ge p$. Then $\lambda_i = 1$, $i = 1,\ldots,t$, and thus from Corollary 7.8.1.1, $S \sim W_p(t, \Sigma, \Sigma^{-1}MAM')$ with the m.g.f.
$$M_S(Z) = \det(I_p - 2\Sigma Z)^{-\frac12 t}\operatorname{etr}\{Z(I_p - 2\Sigma Z)^{-1}MAM'\}. \tag{7.8.15}$$
Conversely, if $XAX'$ is distributed as noncentral Wishart with parameters $t$, $\Sigma$ and $\Sigma^{-1}MAM'$, then its m.g.f. given in (7.8.1) must be identical with (7.8.15). Hence, equating the logarithms of these two expressions, from (7.8.8) and (7.8.15), we get
$$\sum_{s=1}^{\infty}\frac{2^{s-1}}{s}\operatorname{tr}(A^s)\operatorname{tr}((\Sigma Z)^s) + \sum_{s=0}^{\infty}2^s\operatorname{tr}\{MA^{s+1}M'Z(\Sigma Z)^s\} = t\sum_{s=1}^{\infty}\frac{2^{s-1}}{s}\operatorname{tr}((\Sigma Z)^s) + \sum_{s=0}^{\infty}2^s\operatorname{tr}\{MAM'Z(\Sigma Z)^s\}. \tag{7.8.16}$$
Since this holds for any linear function of $Z$, by equating the coefficients of $\operatorname{tr}((\Sigma Z)^s)$ we get $\operatorname{tr}(A^s) = t$, $s = 1, 2, \ldots$, i.e., $\sum_{j=1}^{t}\lambda_j^s = t$, $s = 1, 2, \ldots$. Consequently $\sum_{j=1}^{t}\lambda_j^2(\lambda_j - 1)^2 = 0$, and hence $\lambda_1 = \lambda_2 = \cdots = \lambda_t = 1$, i.e., $A$ is idempotent of rank $t$. It may be noted that for $A = A^s$, the identity (7.8.16) is satisfied. ■

COROLLARY 7.8.3.1. The necessary and sufficient condition for $S = XAX'$, where $X \sim N_{p,n}(0, \Sigma\otimes I_n)$, to be distributed as $W_p(t, \Sigma)$ is that $A$ is idempotent of rank $t \ge p$.

An alternate proof of Theorem 7.8.3 can be given by using the condition for (noncentral) chi-squaredness of the diagonal elements of $XAX'$. The conditions for the Wishartness of a matrix quadratic form $XAX'$, when the columns of $X$ are correlated, are given next.

THEOREM 7.8.4. Let $S = XAX'$, where $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$. The necessary and sufficient condition for $S$ to be distributed as $W_p(t, \Sigma, \Sigma^{-1}MAM')$ is that $A\Phi A = A$ and $\operatorname{rank}(A) = t \ge p$.

Proof: Note that $XAX'$ has the same distribution as $Y(\Phi^{\frac12}A\Phi^{\frac12})Y'$, where $Y \sim N_{p,n}(M\Phi^{-\frac12}, \Sigma\otimes I_n)$. The condition for $Y(\Phi^{\frac12}A\Phi^{\frac12})Y'$ to be distributed as $W_p(t, \Sigma, \Sigma^{-1}MAM')$ is that $\Phi^{\frac12}A\Phi^{\frac12}$ is idempotent of rank $t \ge p$, i.e., $\Phi^{\frac12}A\Phi^{\frac12}\Phi^{\frac12}A\Phi^{\frac12} = \Phi^{\frac12}A\Phi^{\frac12}$ or equivalently $A\Phi A = A$, and $\operatorname{rank}(\Phi^{\frac12}A\Phi^{\frac12}) = t\ (\ge p)$. ■

COROLLARY 7.8.4.1. The necessary and sufficient condition for $S = XAX'$, where $X \sim N_{p,n}(0, \Sigma\otimes\Phi)$, to be distributed as $W_p(t, \Sigma)$ is that $A\Phi A = A$ and $\operatorname{rank}(A) = t \ge p$.

In the remainder of this section we prove some theorems about the stochastic independence of quadratic forms of the type $XAX'$, where $X \sim N_{p,n}(0, \Sigma\otimes I_n)$. We first state a lemma (see Khatri, 1959b) which will be used in the proof of independence.
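LEMMA 7.8.1. Let $A\,(m\times n)$ be a matrix of rank $r \le n$ $(n \le m)$. Then there exists an orthogonal matrix $Q\,(n\times n)$ such that
$$A = \begin{pmatrix}T_1 & 0\\ T_2 & 0\end{pmatrix}Q,$$
where $T_1\,(r\times r)$ is a lower triangular matrix with positive diagonal elements, and $T_2\,((m-r)\times r)$ is a linear function of $T_1$.

THEOREM 7.8.5. Let $S_A = XAX'$ and $S_B = XBX'$, where $X \sim N_{p,n}(M, \Sigma\otimes I_n)$, and $A\,(n\times n)$ and $B\,(n\times n)$ are constant symmetric matrices. The necessary and sufficient condition for $S_A$ and $S_B$ to be stochastically independent is that $AB = 0$.

Proof: The joint m.g.f. of $S_A$ and $S_B$ is
$$M_{S_A,S_B}(Z_1,Z_2) = (2\pi)^{-\frac12 np}\det(\Sigma)^{-\frac12 n}\int\operatorname{etr}\{Z_1XAX' + Z_2XBX' - \tfrac12\Sigma^{-1}(X-M)(X-M)'\}\,dX. \tag{7.8.17}$$
If $AB = 0$, then from Theorem 1.2.17, there exists an orthogonal matrix $Q$ such that
$$Q'AQ = \begin{pmatrix}D_\alpha & 0\\ 0 & 0\end{pmatrix}\ \text{(blocks of order } r,\ n-r\text{)}\quad\text{and}\quad Q'BQ = \begin{pmatrix}0 & 0 & 0\\ 0 & D_\beta & 0\\ 0 & 0 & 0\end{pmatrix}\ \text{(blocks of order } r,\ s,\ n-r-s\text{)},$$
where $D_\alpha\,(r\times r)$ and $D_\beta\,(s\times s)$ are diagonal matrices of the nonzero characteristic roots of $A$ and $B$, respectively. Using the transformation $Y = \Sigma^{-\frac12}XQ$ in (7.8.17) with the Jacobian $J(X\to Y) = \det(\Sigma)^{\frac12 n}$, we have
$$M_{S_A,S_B}(Z_1,Z_2) = (2\pi)^{-\frac12 np}\int\operatorname{etr}\{\Sigma^{\frac12}Z_1\Sigma^{\frac12}YQ'AQY' + \Sigma^{\frac12}Z_2\Sigma^{\frac12}YQ'BQY' - \tfrac12(YQ' - \Sigma^{-\frac12}M)(YQ' - \Sigma^{-\frac12}M)'\}\,dY$$
$$= (2\pi)^{-\frac12 np}\int_{Y\in\mathbb{R}^{p\times n}}\operatorname{etr}\{\Sigma^{\frac12}Z_1\Sigma^{\frac12}Y_1D_\alpha Y_1' + \Sigma^{\frac12}Z_2\Sigma^{\frac12}Y_2D_\beta Y_2' - \tfrac12(Y-N)(Y-N)'\}\,dY$$
$$= (2\pi)^{-\frac12 p(r+s)}\int_{Y_1\in\mathbb{R}^{p\times r}}\int_{Y_2\in\mathbb{R}^{p\times s}}\operatorname{etr}\{\Sigma^{\frac12}Z_1\Sigma^{\frac12}Y_1D_\alpha Y_1' - \tfrac12(Y_1-N_1)(Y_1-N_1)' + \Sigma^{\frac12}Z_2\Sigma^{\frac12}Y_2D_\beta Y_2' - \tfrac12(Y_2-N_2)(Y_2-N_2)'\}\,dY_1\,dY_2$$
$$= M_{S_A}(Z_1)M_{S_B}(Z_2),$$
where $N = \Sigma^{-\frac12}MQ$, $Y\,(p\times n) = (Y_1\ Y_2\ Y_3)$, $Y_1\,(p\times r)$, $Y_2\,(p\times s)$, and $N\,(p\times n) = (N_1\ N_2\ N_3)$, $N_1\,(p\times r)$, $N_2\,(p\times s)$. The last step follows from (7.8.3). Hence $S_A$ and $S_B$ are independent. This proves the sufficiency.

Conversely, if $S_A$ and $S_B$ are independent, then $M_{S_A,S_B}(Z_1,Z_2) = M_{S_A}(Z_1)M_{S_B}(Z_2)$ must hold for $Z_1 = Z$ and $Z_2 = \rho Z$, where $\rho \ne 0$. In this case
$$M_{S_A,S_B}(Z_1,Z_2) = M_{S_{A+\rho B}}(Z) = M_{S_A}(Z)M_{S_B}(\rho Z). \tag{7.8.18}$$
Let $Q_1\,(n\times n)$, $Q_2\,(n\times n)$ and $Q\,(n\times n)$ be orthogonal matrices such that $Q_1'AQ_1 = \operatorname{diag}(\alpha_1,\ldots,\alpha_r,0,\ldots,0)$, $Q_2'BQ_2 = \operatorname{diag}(\beta_1,\ldots,\beta_s,0,\ldots,0)$ and $Q'(A+\rho B)Q = \operatorname{diag}(\lambda_1,\ldots,\lambda_t,0,\ldots,0)$, $r = \operatorname{rank}(A)$, $s = \operatorname{rank}(B)$ and $t = \operatorname{rank}(A+\rho B)$. Then the m.g.f. and the c.g.f. of $S_A$, $S_B$ and $S_{A+\rho B}$ can be obtained from (7.8.1) and (7.8.8) respectively by making appropriate substitutions. Thus we get
$$\ln M_{S_A}(Z) = \sum_{s=1}^{\infty}\frac{2^{s-1}}{s}\operatorname{tr}(A^s)\operatorname{tr}((\Sigma Z)^s) + \sum_{s=0}^{\infty}2^s\operatorname{tr}\{MA^{s+1}M'Z(\Sigma Z)^s\},$$
$$\ln M_{S_B}(\rho Z) = \sum_{s=1}^{\infty}\frac{2^{s-1}}{s}\operatorname{tr}((\rho B)^s)\operatorname{tr}((\Sigma Z)^s) + \sum_{s=0}^{\infty}2^s\operatorname{tr}\{M(\rho B)^{s+1}M'Z(\Sigma Z)^s\},$$
and
$$\ln M_{S_{A+\rho B}}(Z) = \sum_{s=1}^{\infty}\frac{2^{s-1}}{s}\operatorname{tr}((A+\rho B)^s)\operatorname{tr}((\Sigma Z)^s) + \sum_{s=0}^{\infty}2^s\operatorname{tr}\{M(A+\rho B)^{s+1}M'Z(\Sigma Z)^s\}.$$
Now taking the logarithm of (7.8.18) and substituting $\ln M_{S_A}(Z)$, $\ln M_{S_B}(\rho Z)$ and $\ln M_{S_{A+\rho B}}(Z)$ from the above equations, after simplifying, we have
$$\sum_{s=1}^{\infty}\frac{2^{s-1}}{s}\{\operatorname{tr}((A+\rho B)^s) - \operatorname{tr}(A^s) - \operatorname{tr}((\rho B)^s)\}\operatorname{tr}((\Sigma Z)^s) = \sum_{s=0}^{\infty}2^s\operatorname{tr}\big[M\{A^{s+1} + (\rho B)^{s+1} - (A+\rho B)^{s+1}\}M'Z(\Sigma Z)^s\big], \tag{7.8.19}$$
which must be true for any linear function of $Z$ and any value of $\rho$. Equating coefficients of $\operatorname{tr}((\Sigma Z)^s)$, we have
$$\operatorname{tr}((A+\rho B)^s) = \operatorname{tr}(A^s) + \operatorname{tr}((\rho B)^s),\quad s = 1, 2, \ldots.$$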
For $s = 4$, equating the coefficients of $\rho^2$, we get
$$2\operatorname{tr}(CC') + \operatorname{tr}(C^2) = 0, \tag{7.8.20}$$
where $C = AB = (c_{ij})$, say. Since $\operatorname{tr}((C')^2) = \operatorname{tr}(C^2)$, it is easy to see that
$$\operatorname{tr}((C+C')^2) = 2\operatorname{tr}(CC') + 2\operatorname{tr}(C^2). \tag{7.8.21}$$
Since $\operatorname{tr}(CC') = \sum_{i,j}c_{ij}^2$, from (7.8.20) and (7.8.21) we get
$$2\sum_{i,j}c_{ij}^2 + \operatorname{tr}((C+C')^2) = 0.$$
The second term in the above equation is the trace of the square of a symmetric matrix and hence is nonnegative. The first term is a sum of squared quantities. Thus $c_{ij} = 0$, i.e., $AB = 0$, which proves the necessity. ■

COROLLARY 7.8.5.1. Let $S = XAX'$ and $V = XL$, where $L\,(n\times t)$ is a constant matrix, and $X \sim N_{p,n}(M, \Sigma\otimes I_n)$. The necessary and sufficient condition for $S$ and $V$ to be stochastically independent is that $AL = 0$.

Proof: Let $AL = 0$; then $ALL' = 0$. Now if $S$ and $VV'$ are independent, then $S$ and $V$ are also independent. From Theorem 7.8.5, $S$ and $VV' = XLL'X'$ are independent if and only if $ALL' = 0$. Thus the sufficiency is proved. Conversely, if $S$ and $V$ are independent then, from Theorem 7.8.5, $ALL' = 0$. From Lemma 7.8.1, we can write
$$L = \begin{pmatrix}T_1 & 0\\ T_2 & 0\end{pmatrix}Q, \tag{7.8.22}$$
where $\operatorname{rank}(L) = r$, $T_1\,(r\times r)$ is a lower triangular matrix with positive diagonal elements, $T_2\,((n-r)\times r)$ is a linear function of $T_1$, and $Q\,(t\times t)$ is an orthogonal matrix. Using (7.8.22) in $ALL' = 0$, we get
$$A\begin{pmatrix}T_1\\ T_2\end{pmatrix}(T_1'\ \ T_2') = 0.$$
Therefore
$$A\begin{pmatrix}T_1\\ T_2\end{pmatrix} = 0,\qquad A\begin{pmatrix}T_1 & 0\\ T_2 & 0\end{pmatrix}Q = 0,\quad\text{i.e., } AL = 0.$$
Hence $ALL' = 0$ if and only if $AL = 0$. This completes the proof of necessity. ■
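The trace argument in the necessity part of Theorem 7.8.5 can be illustrated numerically. The sketch below assumes NumPy; the helper `rho2_coefficient`, the random symmetric matrices and the seed are hypothetical choices made only for this illustration. It extracts the coefficient of $\rho^2$ in $\operatorname{tr}((A+\rho B)^4)$ and compares it with $4\operatorname{tr}(CC') + 2\operatorname{tr}(C^2)$, $C = AB$, which is twice the left side of (7.8.20).

```python
import numpy as np

def rho2_coefficient(A, B):
    """Coefficient of rho^2 in the quartic f(rho) = tr((A + rho*B)^4)."""
    f = lambda rho: np.trace(np.linalg.matrix_power(A + rho * B, 4))
    # (f(1) + f(-1))/2 = c0 + c2 + c4, with c0 = f(0) and c4 = tr(B^4)
    return (f(1.0) + f(-1.0)) / 2 - f(0.0) - np.trace(np.linalg.matrix_power(B, 4))

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n)); A = A + A.T   # symmetric
B = rng.standard_normal((n, n)); B = B + B.T   # symmetric
C = A @ B

coef = rho2_coefficient(A, B)
# the coefficient equals 4 tr(CC') + 2 tr(C^2) ...
assert np.isclose(coef, 4 * np.trace(C @ C.T) + 2 * np.trace(C @ C))
# ... which is the sum of the two terms discussed in the proof
assert np.isclose(coef, 2 * np.sum(C**2) + np.trace((C + C.T) @ (C + C.T)))

# A pair with AB = 0: eigenvectors taken from disjoint columns of an orthogonal matrix
Qm, _ = np.linalg.qr(rng.standard_normal((n, n)))
A0 = Qm[:, :2] @ np.diag([1.5, -0.7]) @ Qm[:, :2].T
B0 = Qm[:, 2:4] @ np.diag([2.0, 0.3]) @ Qm[:, 2:4].T
assert np.allclose(A0 @ B0, 0)
assert abs(rho2_coefficient(A0, B0)) < 1e-9
```

For symmetric $A$ and $B$ with $AB \ne 0$ the coefficient is strictly positive, which is why the requirement $\operatorname{tr}((A+\rho B)^4) = \operatorname{tr}(A^4) + \operatorname{tr}((\rho B)^4)$ forces $AB = 0$.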
COROLLARY 7.8.5.2. Let $S_i = XA_iX'$, $i = 1,\ldots,k$, where $X \sim N_{p,n}(M, \Sigma\otimes I_n)$. Then the quadratic forms $S_1,\ldots,S_k$ are stochastically independent if and only if $A_iA_j = 0$, $i \ne j$.

In Chapter 2, we have proved the independence of $\bar{x} = \frac1N\sum_{i=1}^{N}x_i$ and $S = \sum_{i=1}^{N}(x_i - \bar{x})(x_i - \bar{x})'$, together with the Wishartness of $S$, where $x_i \sim N_p(\mu, \Sigma)$, $i = 1,\ldots,N$ $(N > p)$. This result can now easily be obtained from the above theorem. Let $X = (x_1,\ldots,x_N)$; then $X \sim N_{p,N}(\mu e', \Sigma\otimes I_N)$. Note that $\bar{x} = \frac1N Xe$ and $S = X(I_N - \frac1N ee')X'$. From Corollary 7.8.5.1, it follows that $\bar{x}$ and $S$ are independent since $(I_N - \frac1N ee')e = 0$. Also the matrix $(I_N - \frac1N ee')$ is idempotent of rank $N - 1$, and the noncentrality matrix $\Sigma^{-1}\mu e'(I_N - \frac1N ee')e\mu'$ vanishes; therefore, from Theorem 7.8.3 (see also Corollary 7.8.3.1), $S \sim W_p(N-1, \Sigma)$.

THEOREM 7.8.6. Let $S_A = XAX'$ and $S_B = XBX'$, where $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$, and $A\,(n\times n)$ and $B\,(n\times n)$ are constant symmetric matrices. The necessary and sufficient condition for $S_A$ and $S_B$ to be stochastically independent is that $A\Phi B = 0$.

Proof: Note that the quadratic forms $XAX'$ and $XBX'$ are stochastically independent if and only if the quadratic forms $Y(\Phi^{\frac12}A\Phi^{\frac12})Y'$ and $Y(\Phi^{\frac12}B\Phi^{\frac12})Y'$, where $Y \sim N_{p,n}(M\Phi^{-\frac12}, \Sigma\otimes I_n)$, are stochastically independent. Hence from Theorem 7.8.5, we get the condition $\Phi^{\frac12}A\Phi^{\frac12}\Phi^{\frac12}B\Phi^{\frac12} = 0$, i.e., $A\Phi B = 0$. ■

COROLLARY 7.8.6.1. Let $S = XAX'$ and $V = XL$, where $L\,(n\times t)$ is a constant matrix, and $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$. The necessary and sufficient condition for $S$ and $V$ to be stochastically independent is that $A\Phi L = 0$.

Proof: Note that $S = XAX'$ and $V = XL$ are stochastically independent if and only if $Y(\Phi^{\frac12}A\Phi^{\frac12})Y'$ and $Y(\Phi^{\frac12}L)$, where $Y \sim N_{p,n}(M\Phi^{-\frac12}, \Sigma\otimes I_n)$, are stochastically independent. Therefore from Corollary 7.8.5.1, we get $\Phi^{\frac12}A\Phi^{\frac12}\Phi^{\frac12}L = 0$, i.e., $A\Phi L = 0$. ■

COROLLARY 7.8.6.2. Let $S_i = XA_iX'$, $i = 1,\ldots,k$, where $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$, and $A_i\,(n\times n)$ are constant symmetric matrices. Then the quadratic forms $S_1,\ldots,S_k$ are stochastically independent if and only if $A_i\Phi A_j = 0$, $i \ne j$.
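The sample mean and sample sum-of-products example above rests on three algebraic facts about the centering matrix $I_N - \frac1N ee'$. A minimal numerical sketch follows, assuming NumPy; the dimensions, the seed and the variable names (`H`, `S_direct`, `S_quad`) are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
p, N = 3, 8
X = rng.standard_normal((p, N))        # columns play the role of x_1, ..., x_N
e = np.ones((N, 1))
H = np.eye(N) - e @ e.T / N            # centering matrix I_N - (1/N) e e'

xbar = X @ e / N                       # sample mean vector, p x 1
S_direct = sum((X[:, [i]] - xbar) @ (X[:, [i]] - xbar).T for i in range(N))
S_quad = X @ H @ X.T                   # the quadratic form X A X' with A = H

assert np.allclose(H @ H, H)           # H is idempotent
rank_H = int(round(np.trace(H)))       # rank of an idempotent matrix equals its trace
assert rank_H == N - 1
assert np.allclose(H @ e, 0)           # A L = 0 with L = e/N, as in Corollary 7.8.5.1
assert np.allclose(S_direct, S_quad)   # both expressions for S agree
```

With $A = I_N - \frac1N ee'$ and $L = \frac1N e$, the condition $AL = 0$ of Corollary 7.8.5.1 and the idempotency condition of Theorem 7.8.3 are exactly the assertions checked here.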
The above results are taken from Khatri (1959b). As mentioned in the beginning of this section, these results on quadratic forms have been derived by comparing moment generating functions. An alternate proof of Corollary 7.8.6.1 has been given by Hogg (1963). It may be noted that the proofs of Theorems 7.8.3 and 7.8.5 can also be given by first obtaining conditions on the diagonal elements of the quadratic forms, together with the additional conditions obtained from the moment generating function. Next we give a general form of Cochran's theorem.

THEOREM 7.8.7. Let $XAX' = \sum_{i=1}^{k}XA_iX'$, where $X \sim N_{p,n}(M, \Sigma\otimes I_n)$, $\operatorname{rank}(A) = r\ (\ge p)$, and $\operatorname{rank}(A_i) = r_i\ (\ge p)$, $i = 1,\ldots,k$. Consider the following four conditions:
(i) $XA_iX' \sim W_p(r_i, \Sigma, \Sigma^{-1}MA_iM')$,
(ii) $XA_iX'$ and $XA_jX'$, $i \ne j$, are stochastically independent,
(iii) $XAX' \sim W_p(r, \Sigma, \Sigma^{-1}MAM')$, and
(iv) $r = \sum_{i=1}^{k}r_i$.
Then, (a) any two of the conditions (i), (ii), and (iii) imply the remaining conditions, and (b) conditions (iii) and (iv) imply (i) and (ii).

Proof: Let any two of the conditions (i), (ii), and (iii) be satisfied. Then from Theorems 7.8.3 and 7.8.5, any two of the conditions (i), (ii), and (iii) of Theorem 1.2.20 will hold and consequently the remaining conditions of that theorem will also hold. Hence all four conditions of Theorem 1.2.20 will hold. Therefore all the conditions of Theorem 7.8.7 hold, which proves part (a). Further, let the conditions (iii) and (iv) hold. Then from Theorem 7.8.3, the conditions (iii) and (iv) of Theorem 1.2.20 hold. Consequently conditions (i) and (ii) of Theorem 1.2.20 also hold, and hence (i) and (ii) of Theorem 7.8.7 follow, which completes the proof of part (b). ■

COROLLARY 7.8.7.1. Let $XAX' = \sum_{i=1}^{k}XA_iX'$, where $A^2 = A$, $X \sim N_{p,n}(0, \Sigma\otimes I_n)$, $\operatorname{rank}(A) = r\ (\ge p)$, and $\operatorname{rank}(A_i) = r_i\ (\ge p)$, $i = 1,\ldots,k$. Then $XA_iX' \sim W_p(r_i, \Sigma)$, $i = 1,\ldots,k$, and are independent if and only if $r = \sum_{i=1}^{k}r_i$.

It is noticeable that the conditions for Wishartness and independence of quadratic forms of the type $XAX'$, $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$, do not depend on $\Sigma$, and hence they are valid even when $\Sigma$ is singular, i.e., when $X \sim N_{p,n}(0, \Sigma\otimes\Phi \mid p_1, n)$, $p_1 < p$. However, when $X \sim N_{p,n}(0, \Sigma\otimes\Phi \mid p, n_1)$, $n_1 < n$, the conditions given in Theorems 7.8.4 and 7.8.6 are no longer valid. For this case, in the following theorem, conditions for Wishartness of the quadratic form of the type $XAX'$ are given without proof.

THEOREM 7.8.8. Let $S = XAX'$, where $X \sim N_{p,n}(M, \Sigma\otimes\Phi \mid p, n_1)$, $n_1 < n$. The necessary and sufficient conditions for $S$ to be distributed as $W_p(t, \Sigma, \Sigma^{-1}MAM')$ are (i) $\Phi A\Phi A\Phi = \Phi A\Phi$, (ii) $MA\Phi = MA\Phi A\Phi$, and (iii) $MAM' = MA\Phi AM'$, where $\operatorname{rank}(A\Phi) = t \ge p$.

7.9. WISHARTNESS AND INDEPENDENCE OF QUADRATIC FORMS OF THE TYPE $XAX' + \tfrac12(LX' + XL') + C$

In Section 7.8 we studied conditions for Wishartness and independence of quadratic forms of the type $XAX'$, where $X \sim N_{p,n}(M, \Sigma\otimes I_n)$. In this section we study conditions for Wishartness and independence of quadratic forms which are generalizations of the quadratic forms of the type $XAX'$. The results derived here are more general and include the results derived in Section 7.8 as special cases. The method of proof of the preceding section can be used here for deriving conditions for Wishartness and independence, but we will follow a slightly different approach. The generalized quadratic
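form in $X\,(p\times n)$ is defined by
$$S = XAX' + \tfrac12(LX' + XL') + C, \tag{7.9.1}$$
where $A\,(n\times n) = A'$, $L\,(p\times n)$ and $C\,(p\times p) = C'$ are constant matrices. We first derive the m.g.f. of $S$ when $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$. We begin with a lemma which expresses the m.g.f. of $S$ in terms of the m.g.f. of $YBY'$, where the distribution of $Y\,(p\times n)$ is matrix variate normal and $B\,(n\times n)$ is a symmetric matrix.

LEMMA 7.9.1. Let $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$. Then the m.g.f. of $S = XAX' + \tfrac12(LX' + XL') + C$ can be expressed as
$$M_S(Z) = \operatorname{etr}\{Z(C + ML') + \tfrac12L\Phi L'Z\Sigma Z\}\,M_{YBY'}(Z), \tag{7.9.2}$$
where $Y \sim N_{p,n}(M\Phi^{-\frac12} + \Sigma ZL\Phi^{\frac12}, \Sigma\otimes I_n)$ and $B = \Phi^{\frac12}A\Phi^{\frac12}$.

Proof: The m.g.f. of $S$ is
$$M_S(Z) = E[\operatorname{etr}(ZS)] = E\big[\operatorname{etr}\{Z(XAX' + \tfrac12(LX' + XL') + C)\}\big]$$
$$= E\big[\operatorname{etr}\{Z(YBY' + \tfrac12(NY' + YN') + C)\}\big],\quad Y \sim N_{p,n}(M\Phi^{-\frac12}, \Sigma\otimes I_n),$$
$$= (2\pi)^{-\frac12 np}\det(\Sigma)^{-\frac12 n}\int\operatorname{etr}\{Z(YBY' + \tfrac12(NY' + YN') + C) - \tfrac12\Sigma^{-1}(Y-\mu)(Y-\mu)'\}\,dY,$$
where $B = \Phi^{\frac12}A\Phi^{\frac12}$, $N = L\Phi^{\frac12}$ and $\mu = M\Phi^{-\frac12}$. Next, writing the exponent inside the integral as
$$\operatorname{tr}\{Z(YBY' + \tfrac12(NY' + YN') + C) - \tfrac12\Sigma^{-1}(Y-\mu)(Y-\mu)'\} = \operatorname{tr}\{Z(C + \mu N') + \tfrac12NN'Z\Sigma Z\} + \operatorname{tr}(ZYBY') - \tfrac12\operatorname{tr}\{\Sigma^{-1}(Y - \mu - \Sigma ZN)(Y - \mu - \Sigma ZN)'\},$$
we get
$$M_S(Z) = (2\pi)^{-\frac12 np}\det(\Sigma)^{-\frac12 n}\operatorname{etr}\{Z(C + \mu N') + \tfrac12NN'Z\Sigma Z\}\int_{Y\in\mathbb{R}^{p\times n}}\operatorname{etr}\{ZYBY' - \tfrac12\Sigma^{-1}(Y - \mu - \Sigma ZN)(Y - \mu - \Sigma ZN)'\}\,dY$$
$$= \operatorname{etr}\{Z(C + \mu N') + \tfrac12NN'Z\Sigma Z\}\,E[\operatorname{etr}(ZYBY')],$$
where $Y \sim N_{p,n}(\mu + \Sigma ZN, \Sigma\otimes I_n)$. The proof is completed by substituting $\mu = M\Phi^{-\frac12}$ and $N = L\Phi^{\frac12}$. ■

Now the m.g.f. of $S$ is evaluated in the following theorem.

THEOREM 7.9.1. Let $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$. Then the m.g.f. of $S = XAX' + \tfrac12(LX' + XL') + C$, where $A\,(n\times n)$ is a symmetric matrix of rank $t$, is given by
$$M_S(Z) = \prod_{j=1}^{t}\det(I_p - 2\lambda_j\Sigma Z)^{-\frac12}\operatorname{etr}\{Z(C + ML') + \tfrac12L\Phi L'Z\Sigma Z\}\exp\Big\{\sum_{j=1}^{t}\lambda_jq_j'\Phi^{-\frac12}(M + \Sigma ZL\Phi)'Z(I_p - 2\lambda_j\Sigma Z)^{-1}(M + \Sigma ZL\Phi)\Phi^{-\frac12}q_j\Big\}, \tag{7.9.3}$$
where $\Phi^{\frac12}A\Phi^{\frac12} = Q\begin{pmatrix}D_\lambda & 0\\ 0 & 0\end{pmatrix}Q'$, $Q\,(n\times n)$ is an orthogonal matrix, $Q = (q_1,\ldots,q_n)$, $D_\lambda = \operatorname{diag}(\lambda_1,\ldots,\lambda_t)$ and $\lambda_1,\ldots,\lambda_t$ are the nonzero characteristic roots of $\Phi^{\frac12}A\Phi^{\frac12}$.

Proof: From Lemma 7.9.1, we have
$$M_S(Z) = \operatorname{etr}\{Z(C + ML') + \tfrac12L\Phi L'Z\Sigma Z\}\,M_{YBY'}(Z), \tag{7.9.4}$$
where $Y \sim N_{p,n}(M\Phi^{-\frac12} + \Sigma ZL\Phi^{\frac12}, \Sigma\otimes I_n)$ and $B = \Phi^{\frac12}A\Phi^{\frac12}$. Now appropriately substituting from Theorem 7.8.1 for $M_{YBY'}(Z)$, we get the desired result. ■

COROLLARY 7.9.1.1. If $A\Phi A = A$, then the m.g.f. of $S$ is given by
$$M_S(Z) = \det(I_p - 2\Sigma Z)^{-\frac12 t}\operatorname{etr}\{Z(I_p - 2\Sigma Z)^{-1}(M + \Sigma ZL\Phi)A(M + \Sigma ZL\Phi)' + Z(C + ML') + \tfrac12L\Phi L'Z\Sigma Z\}.$$

Proof: If $A\Phi A = A$ holds, then $\Phi^{\frac12}A\Phi^{\frac12}$ is an idempotent matrix of rank $t$, i.e., $\lambda_i = 1$, $i = 1,\ldots,t$, and $\Phi^{\frac12}A\Phi^{\frac12} = Q\begin{pmatrix}I_t & 0\\ 0 & 0\end{pmatrix}Q' = \sum_{j=1}^{t}q_jq_j'$. Substituting these in (7.9.3) we get the result. ■

The c.g.f. of $S$ is derived in the next theorem.

THEOREM 7.9.2. Let $X \sim N_{p,n}(M, \Sigma\otimes\Phi)$. Then the c.g.f. of $XAX' + \tfrac12(LX' + XL') + C$, where $A\,(n\times n)$ is a symmetric matrix of rank $t$, is given by
$$\ln M_S(Z) = \operatorname{tr}\{Z(C + ML') + \tfrac12L\Phi L'Z\Sigma Z\} + \sum_{s=1}^{\infty}\frac{2^{s-1}}{s}\operatorname{tr}((\Phi A)^s)\operatorname{tr}((\Sigma Z)^s) + \sum_{s=0}^{\infty}2^s\operatorname{tr}\{Z(\Sigma Z)^s(M + \Sigma ZL\Phi)A(\Phi A)^s(M + \Sigma ZL\Phi)'\}.$$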
Proof: From (7.9.2), we obtain
\[ \ln M_S(Z) = \mathrm{tr}\{Z(C + ML') + \tfrac{1}{2}L\Psi L'Z\Sigma Z\} + \ln M_{YBY'}(Z), \tag{7.9.5} \]
where $Y \sim N_{p,n}(M\Psi^{-1/2} + \Sigma ZL\Psi^{1/2}, \Sigma \otimes I_n)$ and $B = \Psi^{1/2}A\Psi^{1/2}$ is of rank $t$ with $\lambda_1, \ldots, \lambda_t$ being the nonzero characteristic roots. Using Theorem 7.8.2, the c.g.f. of $YBY'$ is obtained as
\[ \ln M_{YBY'}(Z) = \sum_{s=1}^{\infty} \frac{2^{s-1}}{s}\, \mathrm{tr}(\Psi A)^s\, \mathrm{tr}(\Sigma Z)^s + \sum_{s=0}^{\infty} 2^s\, \mathrm{tr}\{Z(\Sigma Z)^s(M + \Sigma ZL\Psi)A(\Psi A)^s(M + \Sigma ZL\Psi)'\}. \tag{7.9.6} \]
Now using (7.9.6) in (7.9.5) we get the desired result. ■

Alternatively, a proof can be constructed parallel to the proof of Theorem 7.8.2.

The following lemmas, which are used in the sequel, give conditions for chi-squaredness and independence of second degree polynomials in an $n$-variate normal vector.

LEMMA 7.9.2. Let $P(x) = x'Ax + \ell'x + c$, where $x \sim N_n(\mu, I_n)$, $A$ ($n \times n$) is a symmetric matrix of rank $t$, $\ell$ ($n \times 1$) is a constant vector, and $c$ is a scalar. Then the necessary and sufficient conditions for $P(x)$ to be distributed as noncentral chi-square with $t$ degrees of freedom and noncentrality parameter $(\mu + \tfrac{1}{2}\ell)'A(\mu + \tfrac{1}{2}\ell)$ are that (i) $A^2 = A$, (ii) $A\ell = \ell$, and (iii) $c = \tfrac{1}{4}\ell'A\ell$.

LEMMA 7.9.3. Let $P_A(x) = x'Ax + \ell'x + c$ and $P_B(x) = x'Bx + n'x + d$, where $x \sim N_n(\mu, I_n)$, $A$ ($n \times n$) and $B$ ($n \times n$) are symmetric matrices, $\ell$ ($n \times 1$) and $n$ ($n \times 1$) are constant vectors, and $c$ and $d$ are scalars. Then the necessary and sufficient conditions for $P_A(x)$ and $P_B(x)$ to be distributed independently are (i) $AB = 0$, (ii) $\ell'B = 0$, (iii) $n'A = 0$, and (iv) $\ell'n = 0$.

Lemma 7.9.2 was established by Khatri (1962) and the proof of Lemma 7.9.3 was given by Laha (1956).

LEMMA 7.9.4. Let $A$ ($n \times n$) and $B$ ($n \times n$) be symmetric matrices and $L$ ($p \times n$) and $N$ ($p \times n$) be matrices such that $t = \mathrm{rank}(A \;\, L')$, $u = \mathrm{rank}(B \;\, N')$, $AB = 0$, $LB = 0$, $NA = 0$ and $LN' = 0$. Then there exists a semiorthogonal matrix $Q$ ($n \times (t + u)$), $t + u \le n$, such that
\[ L = (T \;\; 0)Q', \quad N = (0 \;\; U)Q', \quad A = Q \begin{pmatrix} E & 0 \\ 0 & 0 \end{pmatrix} Q', \quad \text{and} \quad B = Q \begin{pmatrix} 0 & 0 \\ 0 & F \end{pmatrix} Q', \]
where $E$ ($t \times t$) and $F$ ($u \times u$) are symmetric matrices, and $T$ and $U$ are $p \times t$ and $p \times u$ respectively.

Proof: See Khatri (1962). ■

THEOREM 7.9.3. Let $S = XAX' + \tfrac{1}{2}(LX' + XL') + C$, where $X \sim N_{p,n}(M, \Sigma \otimes I_n)$. The necessary and sufficient conditions for $S$ to be distributed as $W_p(t, \Sigma, \Sigma^{-1}(M + \tfrac{1}{2}L)A(M + \tfrac{1}{2}L)')$ are that (i) $A^2 = A$, $\mathrm{rank}(A) = t \ge p$, (ii) $LA = L$, and (iii) $C = \tfrac{1}{4}LAL'$.
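The sufficiency half of Theorem 7.9.3 rests on completing the square: under conditions (i) to (iii), $S$ coincides with $(X + \tfrac{1}{2}L)A(X + \tfrac{1}{2}L)'$ for every $X$. A minimal NumPy sketch of that identity follows; the helper names and the way $A$, $L$ and $C$ are generated (a random orthogonal projection for $A$, then $L = L_0A$) are assumptions made only for this illustration.

```python
import numpy as np

def second_degree_form(X, A, L, C):
    """S = X A X' + (L X' + X L') / 2 + C, as in (7.9.1)."""
    return X @ A @ X.T + 0.5 * (L @ X.T + X @ L.T) + C

def wishart_conditions_hold(A, L, C, tol=1e-10):
    """Check A^2 = A, LA = L and C = LAL'/4 (conditions (i)-(iii), rank aside)."""
    return bool(
        np.allclose(A @ A, A, atol=tol)
        and np.allclose(L @ A, L, atol=tol)
        and np.allclose(C, 0.25 * L @ A @ L.T, atol=tol)
    )

rng = np.random.default_rng(0)
p, n, t = 2, 5, 3

# A: symmetric idempotent matrix of rank t (orthogonal projection onto t random directions).
U, _ = np.linalg.qr(rng.standard_normal((n, t)))
A = U @ U.T
L = rng.standard_normal((p, n)) @ A      # forces LA = L
C = 0.25 * L @ A @ L.T                   # condition (iii)

X = rng.standard_normal((p, n))          # any p x n matrix; the identity is purely algebraic
S = second_degree_form(X, A, L, C)
Y = X + 0.5 * L
completed_square = Y @ A @ Y.T

print(wishart_conditions_hold(A, L, C), np.allclose(S, completed_square))
```

Because the identity holds for each fixed $X$, the distributional statement then follows from the result for $YAY'$ with $Y = X + \tfrac{1}{2}L$.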
Proof: Let us assume that the conditions (i)-(iii) hold. Then
\[ S = XAX' + \tfrac{1}{2}(LAX' + XAL') + \tfrac{1}{4}LAL' = (X + \tfrac{1}{2}L)A(X + \tfrac{1}{2}L)' = YAY', \]
where $Y \sim N_{p,n}(M + \tfrac{1}{2}L, \Sigma \otimes I_n)$. Now from Theorem 7.8.3, $YAY' \sim W_p(t, \Sigma, \Sigma^{-1}(M + \tfrac{1}{2}L)A(M + \tfrac{1}{2}L)')$. Conversely, let $S = (s_{ij}) \sim W_p(t, \Sigma, \Sigma^{-1}(M + \tfrac{1}{2}L)A(M + \tfrac{1}{2}L)')$, with
\[ s_{ii} = x_i^{*\prime}Ax_i^* + \ell_i^{*\prime}x_i^* + c_{ii}, \]
where $x_i^{*\prime}$ and $\ell_i^{*\prime}$ denote the $i$th row of the matrices $X$ and $L$ respectively, and $c_{ii}$ is the $i$th diagonal element of the matrix $C$. Then the diagonal elements $s_{ii}$ are distributed as noncentral chi-square if and only if (Lemma 7.9.2) $A^2 = A$, $\ell_i^{*\prime}A = \ell_i^{*\prime}$, and $c_{ii} = \tfrac{1}{4}\ell_i^{*\prime}A\ell_i^*$, $i = 1, \ldots, p$, or equivalently, $A^2 = A$, $LA = L$, and the diagonal elements of $\tfrac{1}{4}LAL' - C$ are all zero. Therefore, under these conditions,
\[ S = (X + \tfrac{1}{2}L)A(X + \tfrac{1}{2}L)' + C - \tfrac{1}{4}LAL'. \]
Now, given that $S$ is a noncentral Wishart matrix, and $(X + \tfrac{1}{2}L)A(X + \tfrac{1}{2}L)'$ is also a noncentral Wishart matrix, we must have $C - \tfrac{1}{4}LAL' = 0$. This completes the proof of necessity. ■

THEOREM 7.9.4. Let $S = XAX' + \tfrac{1}{2}(LX' + XL') + C$, where $X \sim N_{p,n}(M, \Sigma \otimes \Psi)$. The necessary and sufficient conditions for $S$ to be distributed as $W_p(t, \Sigma, \Sigma^{-1}(M + \tfrac{1}{2}L)A(M + \tfrac{1}{2}L)')$ are that (i) $A\Psi A = A$, $\mathrm{rank}(A) = t \ge p$, (ii) $L\Psi A = L$, and (iii) $C = \tfrac{1}{4}L\Psi L'$.

Proof: Note that $S = XAX' + \tfrac{1}{2}(LX' + XL') + C$ has the same distribution as $S^* = Y(\Psi^{1/2}A\Psi^{1/2})Y' + \tfrac{1}{2}(L\Psi^{1/2}Y' + Y\Psi^{1/2}L') + C$, where $Y \sim N_{p,n}(M\Psi^{-1/2}, \Sigma \otimes I_n)$. From Theorem 7.9.3, the necessary and sufficient conditions for $S^*$ (and hence for $S$) to be distributed as $W_p(t, \Sigma, \Sigma^{-1}(M + \tfrac{1}{2}L)A(M + \tfrac{1}{2}L)')$ are that $\Psi^{1/2}A\Psi^{1/2}\Psi^{1/2}A\Psi^{1/2} = \Psi^{1/2}A\Psi^{1/2}$, $L\Psi^{1/2}\Psi^{1/2}A\Psi^{1/2} = L\Psi^{1/2}$ and $\tfrac{1}{4}L\Psi^{1/2}(\Psi^{1/2}A\Psi^{1/2})\Psi^{1/2}L' = C$. Now conditions (i), (ii) and (iii) are obtained from the above conditions upon simplification. ■

THEOREM 7.9.5. Let $S_1 = XAX' + \tfrac{1}{2}(LX' + XL') + C$ and $S_2 = XBX' + \tfrac{1}{2}(NX' + XN') + D$, where $X \sim N_{p,n}(M, \Sigma \otimes I_n)$, $A$ ($n \times n$) $= A'$, $B$ ($n \times n$) $= B'$, $C$ ($p \times p$) $= C'$, $D$ ($p \times p$) $= D'$, $L$ ($p \times n$), and $N$ ($p \times n$) are constant matrices. Then the necessary and sufficient conditions for $S_1$ and $S_2$ to be stochastically independent are (i) $AB = 0$, (ii) $LB = 0$, (iii) $NA = 0$ and (iv) $LN' = 0$.
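Proof: The joint m.g.f. of $S_1$ and $S_2$ is
\[ \begin{aligned} M_{S_1,S_2}(Z_1, Z_2) = (2\pi)^{-\frac{1}{2}np}\det(\Sigma)^{-\frac{1}{2}n} \int \mathrm{etr}\{ & Z_1(XAX' + \tfrac{1}{2}(LX' + XL') + C) + Z_2(XBX' + \tfrac{1}{2}(NX' + XN') + D) \\ & - \tfrac{1}{2}\Sigma^{-1}(X - M)(X - M)'\}\, dX. \end{aligned} \tag{7.9.7} \]
If the conditions (i)-(iv) hold, then using Lemma 7.9.4, there exists a semiorthogonal matrix $Q$ ($n \times q$) $= (Q_1 \;\; Q_2)$ such that
\[ L = (T \;\; 0)Q', \quad N = (0 \;\; U)Q', \quad A = Q \begin{pmatrix} E & 0 \\ 0 & 0 \end{pmatrix} Q', \quad \text{and} \quad B = Q \begin{pmatrix} 0 & 0 \\ 0 & F \end{pmatrix} Q', \]
where $q = t + u$ ($\le n$), $t = \mathrm{rank}(A \;\, L')$, $u = \mathrm{rank}(B \;\, N')$, $E = E'$, $F = F'$, and $E$, $F$, $T$ and $U$ are matrices of order $t \times t$, $u \times u$, $p \times t$ and $p \times u$ respectively. Let $Q_3$ ($n \times (n - q)$) be a semiorthogonal matrix such that $Q_0$ ($n \times n$) $= (Q \;\; Q_3)$ is orthogonal. Next, using the transformation $Y = \Sigma^{-1/2}XQ_0$ in (7.9.7), with Jacobian $J(X \to Y) = \det(\Sigma)^{\frac{1}{2}n}$, we get
\[ M_{S_1,S_2}(Z_1, Z_2) = (2\pi)^{-\frac{1}{2}np} \int \mathrm{etr}\{h(Z_1, Z_2, Y) - \tfrac{1}{2}(YQ_0' - \Sigma^{-1/2}M)(YQ_0' - \Sigma^{-1/2}M)'\}\, dY, \tag{7.9.8} \]
where
\[ \begin{aligned} h(Z_1, Z_2, Y) = {} & Z_1[\Sigma^{1/2}YQ_0'AQ_0Y'\Sigma^{1/2} + \tfrac{1}{2}(T \;\; 0)Q'Q_0Y'\Sigma^{1/2} + \tfrac{1}{2}\Sigma^{1/2}YQ_0'Q(T \;\; 0)' + C] \\ & + Z_2[\Sigma^{1/2}YQ_0'BQ_0Y'\Sigma^{1/2} + \tfrac{1}{2}(0 \;\; U)Q'Q_0Y'\Sigma^{1/2} + \tfrac{1}{2}\Sigma^{1/2}YQ_0'Q(0 \;\; U)' + D] \\ = {} & Z_1[\Sigma^{1/2}Y_1EY_1'\Sigma^{1/2} + \tfrac{1}{2}TY_1'\Sigma^{1/2} + \tfrac{1}{2}\Sigma^{1/2}Y_1T' + C] \\ & + Z_2[\Sigma^{1/2}Y_2FY_2'\Sigma^{1/2} + \tfrac{1}{2}UY_2'\Sigma^{1/2} + \tfrac{1}{2}\Sigma^{1/2}Y_2U' + D] \end{aligned} \tag{7.9.9} \]
and $Y$ ($p \times n$) $= (Y_1 \;\; Y_2 \;\; Y_3)$, $Y_1$ ($p \times t$), $Y_2$ ($p \times u$), $Y_3$ ($p \times (n - q)$), $q = t + u$. Furthermore,
\[ \begin{aligned} (YQ_0' - \Sigma^{-1/2}M)(YQ_0' - \Sigma^{-1/2}M)' &= (Y - \Sigma^{-1/2}MQ_0)(Y - \Sigma^{-1/2}MQ_0)' = (Y - K)(Y - K)' \\ &= (Y_1 - K_1)(Y_1 - K_1)' + (Y_2 - K_2)(Y_2 - K_2)' + (Y_3 - K_3)(Y_3 - K_3)', \end{aligned} \tag{7.9.10} \]
where $K$ ($p \times n$) $= \Sigma^{-1/2}MQ_0 = (K_1 \;\; K_2 \;\; K_3)$, $K_1$ ($p \times t$), $K_2$ ($p \times u$), $K_3$ ($p \times (n - q)$). Now substituting (7.9.9) and (7.9.10) in (7.9.8), and integrating with respect to $Y_3$,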
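we have
\[ \begin{aligned} M_{S_1,S_2}(Z_1, Z_2) = {} & (2\pi)^{-\frac{1}{2}qp} \int_{Y_1 \in \mathbb{R}^{p \times t}} \int_{Y_2 \in \mathbb{R}^{p \times u}} \mathrm{etr}\{Z_1[\Sigma^{1/2}Y_1EY_1'\Sigma^{1/2} + \tfrac{1}{2}TY_1'\Sigma^{1/2} + \tfrac{1}{2}\Sigma^{1/2}Y_1T' + C] \\ & + Z_2[\Sigma^{1/2}Y_2FY_2'\Sigma^{1/2} + \tfrac{1}{2}UY_2'\Sigma^{1/2} + \tfrac{1}{2}\Sigma^{1/2}Y_2U' + D] \\ & - \tfrac{1}{2}(Y_1 - K_1)(Y_1 - K_1)' - \tfrac{1}{2}(Y_2 - K_2)(Y_2 - K_2)'\}\, dY_1\, dY_2 \\ = {} & M_{S_1^*}(\Sigma^{1/2}Z_1\Sigma^{1/2})\, M_{S_2^*}(\Sigma^{1/2}Z_2\Sigma^{1/2}), \end{aligned} \]
where
\[ S_1^* = Y_1EY_1' + \tfrac{1}{2}\Sigma^{-1/2}TY_1' + \tfrac{1}{2}Y_1T'\Sigma^{-1/2} + \Sigma^{-1/2}C\Sigma^{-1/2}, \]
\[ S_2^* = Y_2FY_2' + \tfrac{1}{2}\Sigma^{-1/2}UY_2' + \tfrac{1}{2}Y_2U'\Sigma^{-1/2} + \Sigma^{-1/2}D\Sigma^{-1/2}, \]
$Y_1 \sim N_{p,t}(\Sigma^{-1/2}MQ_1, I_p \otimes I_t)$ and $Y_2 \sim N_{p,u}(\Sigma^{-1/2}MQ_2, I_p \otimes I_u)$. Evaluating $M_{S_1^*}(\Sigma^{1/2}Z_1\Sigma^{1/2})$ and $M_{S_2^*}(\Sigma^{1/2}Z_2\Sigma^{1/2})$ using Theorem 7.9.1 and writing $E$, $F$, $T$ and $U$ in terms of $A$, $B$, $L$ and $N$, it can be seen that $M_{S_1^*}(\Sigma^{1/2}Z_1\Sigma^{1/2}) = M_{S_1}(Z_1)$ and $M_{S_2^*}(\Sigma^{1/2}Z_2\Sigma^{1/2}) = M_{S_2}(Z_2)$. Hence $S_1$ and $S_2$ are independent. This proves sufficiency.

Conversely, if $S_1 = (s_{1ij})$ and $S_2 = (s_{2ij})$ are independent, then their diagonal elements are also independent, where
\[ s_{1ii} = x_i^{*\prime}Ax_i^* + \tfrac{1}{2}(\ell_i^{*\prime}x_i^* + x_i^{*\prime}\ell_i^*) + c_{ii}, \qquad s_{2ii} = x_i^{*\prime}Bx_i^* + \tfrac{1}{2}(n_i^{*\prime}x_i^* + x_i^{*\prime}n_i^*) + d_{ii}, \]
$x_i^{*\prime}$, $\ell_i^{*\prime}$ and $n_i^{*\prime}$ denote the $i$th row of the matrices $X$, $L$ and $N$, and $c_{ii}$ and $d_{ii}$ are the $i$th diagonal elements of the matrices $C$ and $D$ respectively. Now from Lemma 7.9.3, $s_{1ii}$ and $s_{2ii}$ are independent if and only if $AB = 0$, $\ell_i^{*\prime}B = 0$, $n_i^{*\prime}A = 0$, and $\ell_i^{*\prime}n_i^* = 0$, $i = 1, \ldots, p$, or equivalently
\[ AB = 0, \quad LB = 0, \quad NA = 0 \quad \text{and diagonal elements of } LN' = 0. \tag{7.9.11} \]
Further, since $AB = 0$, we can find an orthogonal matrix $Q$ ($n \times n$) $= (Q_1 \;\; Q_2 \;\; Q_3)$, $Q_1$ ($n \times r$), $Q_2$ ($n \times s$), $Q_3$ ($n \times (n - q)$), $q = r + s$, such that
\[ Q_1'AQ_1 = \mathrm{diag}(\alpha_1, \ldots, \alpha_r) = D_\alpha, \tag{7.9.12} \]
\[ Q_2'BQ_2 = \mathrm{diag}(\beta_1, \ldots, \beta_s) = D_\beta, \tag{7.9.13} \]
where $\alpha_i$, $i = 1, \ldots, r$ and $\beta_i$, $i = 1, \ldots, s$ are the nonzero characteristic roots of $A$ and $B$ respectively. Next using the transformation $Y = (Y_1 \;\; Y_2 \;\; Y_3) = \Sigma^{-1/2}XQ$,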
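$Y_1$ ($p \times r$), $Y_2$ ($p \times s$), $Y_3$ ($p \times (n - q)$) in (7.9.7) with the Jacobian $J(X \to Y) = \det(\Sigma)^{\frac{1}{2}n}$, and (7.9.11), (7.9.12), and (7.9.13), we get
\[ \begin{aligned} M_{S_1,S_2}(Z_1, Z_2) = {} & (2\pi)^{-\frac{1}{2}np} \int \mathrm{etr}\{Z_1[\Sigma^{1/2}Y_1D_\alpha Y_1'\Sigma^{1/2} + \tfrac{1}{2}LQ_1Y_1'\Sigma^{1/2} + \tfrac{1}{2}\Sigma^{1/2}Y_1Q_1'L' + C] \\ & + Z_2[\Sigma^{1/2}Y_2D_\beta Y_2'\Sigma^{1/2} + \tfrac{1}{2}NQ_2Y_2'\Sigma^{1/2} + \tfrac{1}{2}\Sigma^{1/2}Y_2Q_2'N' + D] \\ & + Z_1[\tfrac{1}{2}LQ_3Y_3'\Sigma^{1/2} + \tfrac{1}{2}\Sigma^{1/2}Y_3Q_3'L'] + Z_2[\tfrac{1}{2}NQ_3Y_3'\Sigma^{1/2} + \tfrac{1}{2}\Sigma^{1/2}Y_3Q_3'N'] \\ & - \tfrac{1}{2}(Y - K)(Y - K)'\}\, dY \\ = {} & M_{S_1^*}(\Sigma^{1/2}Z_1\Sigma^{1/2})\, M_{S_2^*}(\Sigma^{1/2}Z_2\Sigma^{1/2})\, M_{Y_3}(\Sigma^{1/2}(Z_1L + Z_2N)Q_3), \end{aligned} \tag{7.9.14} \]
where $K = \Sigma^{-1/2}MQ = (K_1 \;\; K_2 \;\; K_3)$, $K_1$ ($p \times r$), $K_2$ ($p \times s$), $K_3$ ($p \times (n - q)$),
\[ S_1^* = Y_1D_\alpha Y_1' + \tfrac{1}{2}\Sigma^{-1/2}LQ_1Y_1' + \tfrac{1}{2}Y_1Q_1'L'\Sigma^{-1/2} + \Sigma^{-1/2}C\Sigma^{-1/2}, \]
\[ S_2^* = Y_2D_\beta Y_2' + \tfrac{1}{2}\Sigma^{-1/2}NQ_2Y_2' + \tfrac{1}{2}Y_2Q_2'N'\Sigma^{-1/2} + \Sigma^{-1/2}D\Sigma^{-1/2}, \]
$Y_1 \sim N_{p,r}(K_1, I_p \otimes I_r)$, $Y_2 \sim N_{p,s}(K_2, I_p \otimes I_s)$ and $Y_3 \sim N_{p,n-q}(K_3, I_p \otimes I_{n-q})$. Now, from Theorem 2.3.2, the m.g.f. of $Y_3$ is
\[ M_{Y_3}(\Sigma^{1/2}(Z_1L + Z_2N)Q_3) = \mathrm{etr}\{(Z_1L + Z_2N)Q_3Q_3'M' + \tfrac{1}{2}\Sigma(Z_1L + Z_2N)Q_3Q_3'(Z_1L + Z_2N)'\}. \tag{7.9.15} \]
Substituting from (7.9.15) in (7.9.14) and taking the logarithm we get
\[ \begin{aligned} \ln M_{S_1,S_2}(Z_1, Z_2) = {} & \{\mathrm{tr}(Z_1LQ_3Q_3'M' + \tfrac{1}{2}\Sigma Z_1LQ_3Q_3'L'Z_1) + \ln M_{S_1^*}(\Sigma^{1/2}Z_1\Sigma^{1/2})\} \\ & + \{\mathrm{tr}(Z_2NQ_3Q_3'M' + \tfrac{1}{2}\Sigma Z_2NQ_3Q_3'N'Z_2) + \ln M_{S_2^*}(\Sigma^{1/2}Z_2\Sigma^{1/2})\} + \mathrm{tr}(\Sigma Z_1LQ_3Q_3'N'Z_2) \\ = {} & \ln M_{S_1}(Z_1) + \ln M_{S_2}(Z_2) + \mathrm{tr}(\Sigma Z_1LQ_3Q_3'N'Z_2). \end{aligned} \tag{7.9.16} \]
The last step follows from Theorem 7.9.2. Therefore for independence of $S_1$ and $S_2$, we must have $\mathrm{tr}(\Sigma Z_1LQ_3Q_3'N'Z_2) = 0$, i.e., $\mathrm{tr}\{\Sigma Z_1L(I_n - Q_1Q_1' - Q_2Q_2')N'Z_2\} = 0$, i.e., $\mathrm{tr}(\Sigma Z_1LN'Z_2) = 0$ (since $LQ_2 = 0$ and $NQ_1 = 0$ by (7.9.11)) for all symmetric matrices $Z_1$ and $Z_2$, and hence $LN' = 0$. This completes the proof of necessity. ■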
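The structure used in the sufficiency argument can be made concrete: when $A$, $B$, $L$, $N$ have the block form of Lemma 7.9.4, $S_1$ depends on $X$ only through $XQ_1$ and $S_2$ only through $XQ_2$, and for $X \sim N_{p,n}(M, \Sigma \otimes I_n)$ these two blocks of columns are independent. The sketch below, assuming NumPy, builds such matrices and checks conditions (i)-(iv) together with this dependence structure; all names, dimensions and the random construction are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n, t, u = 2, 6, 2, 3

def random_sym(k):
    W = rng.standard_normal((k, k))
    return 0.5 * (W + W.T)

# Semiorthogonal Q = (Q1 Q2) as in Lemma 7.9.4, built from a reduced QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((n, t + u)))
Q1, Q2 = Q[:, :t], Q[:, t:]

E, F = random_sym(t), random_sym(u)
T, U = rng.standard_normal((p, t)), rng.standard_normal((p, u))
A, B = Q1 @ E @ Q1.T, Q2 @ F @ Q2.T
L, N = T @ Q1.T, U @ Q2.T
C, D = random_sym(p), random_sym(p)

def second_degree_form(X, A, L, C):
    """X A X' + (L X' + X L') / 2 + C."""
    return X @ A @ X.T + 0.5 * (L @ X.T + X @ L.T) + C

def independence_conditions(A, B, L, N, tol=1e-10):
    """Conditions (i)-(iv) of Theorem 7.9.5, each reported as True/False."""
    products = {"AB": A @ B, "LB": L @ B, "NA": N @ A, "LN'": L @ N.T}
    return {name: bool(np.abs(value).max() < tol) for name, value in products.items()}

X = rng.standard_normal((p, n))
P1, P2 = Q1 @ Q1.T, Q2 @ Q2.T            # projections onto the column spaces of Q1 and Q2

S1 = second_degree_form(X, A, L, C)
S2 = second_degree_form(X, B, N, D)
S1_from_block1 = second_degree_form(X @ P1, A, L, C)   # uses only X Q1
S2_from_block2 = second_degree_form(X @ P2, B, N, D)   # uses only X Q2

print(independence_conditions(A, B, L, N))
print(np.allclose(S1, S1_from_block1), np.allclose(S2, S2_from_block2))
```

The check is algebraic, not a simulation of independence; it only shows that the two forms are functions of disjoint orthogonal blocks of the columns of $X$.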
THEOREM 7.9.6. Let $S_1 = XAX' + \tfrac{1}{2}(LX' + XL') + C$ and $S_2 = XBX' + \tfrac{1}{2}(NX' + XN') + D$, where $X \sim N_{p,n}(M, \Sigma \otimes \Psi)$, $A$ ($n \times n$) $= A'$, $B$ ($n \times n$) $= B'$, $C$ ($p \times p$) $= C'$, $D$ ($p \times p$) $= D'$, $L$ ($p \times n$), and $N$ ($p \times n$) are constant matrices. Then the necessary and sufficient conditions for $S_1$ and $S_2$ to be stochastically independent are (i) $A\Psi B = 0$, (ii) $L\Psi B = 0$, (iii) $N\Psi A = 0$ and (iv) $L\Psi N' = 0$.

Proof: Note that the forms $S_1 = XAX' + \tfrac{1}{2}(LX' + XL') + C$ and $S_2 = XBX' + \tfrac{1}{2}(NX' + XN') + D$ are stochastically independent if and only if the quadratic forms $Y(\Psi^{1/2}A\Psi^{1/2})Y' + \tfrac{1}{2}(L\Psi^{1/2}Y' + Y\Psi^{1/2}L') + C$ and $Y(\Psi^{1/2}B\Psi^{1/2})Y' + \tfrac{1}{2}(N\Psi^{1/2}Y' + Y\Psi^{1/2}N') + D$, where $Y \sim N_{p,n}(M\Psi^{-1/2}, \Sigma \otimes I_n)$, are stochastically independent. Hence from Theorem 7.9.5, we get the required conditions. ■

COROLLARY 7.9.6.1. In the above notation, $S_1$ and the linear form $NX' + XN'$ are stochastically independent if and only if (i) $A\Psi N' = 0$, and (ii) $L\Psi N' = 0$.

COROLLARY 7.9.6.2. In the above notation, the linear forms $LX' + XL'$ and $NX' + XN'$ are stochastically independent if and only if $L\Psi N' = 0$.

COROLLARY 7.9.6.3. Let $X \sim N_{p,n}(M, \Sigma \otimes \Psi)$. Then the quadratic forms $(X + L_i)A_i(X + L_i)'$, $i = 1, \ldots, k$, are stochastically independent if and only if $A_i\Psi A_j = 0$, $i \ne j$, $i, j = 1, \ldots, k$.

THEOREM 7.9.7. Let $(X + L)A(X + L)' = \sum_{i=1}^{k}(X + L_i)A_i(X + L_i)'$, where $X \sim N_{p,n}(M, \Sigma \otimes I_n)$, $\mathrm{rank}(A) = r$ ($\ge p$), and $\mathrm{rank}(A_i) = r_i$ ($\ge p$), $i = 1, \ldots, k$. Consider the following four conditions: (i) $(X + L_i)A_i(X + L_i)' \sim W_p(r_i, \Sigma, \Sigma^{-1}(M + L_i)A_i(M + L_i)')$, (ii) $(X + L_i)A_i(X + L_i)'$ and $(X + L_j)A_j(X + L_j)'$, $i \ne j$, are stochastically independent, (iii) $(X + L)A(X + L)' \sim W_p(r, \Sigma, \Sigma^{-1}(M + L)A(M + L)')$, and (iv) $r = \sum_{i=1}^{k} r_i$. Then, (a) any two of the conditions (i), (ii), and (iii) imply the remaining conditions, and (b) conditions (iii) and (iv) imply (i) and (ii).

Proof: The proof is similar to the proof of Theorem 7.8.7. ■

7.10. WISHARTNESS AND INDEPENDENCE OF QUADRATIC FORMS OF THE TYPE $XAX' + L_1X' + XL_2' + C$

Consider the polynomial of the type
\[ S = XAX' + L_1X' + XL_2' + C, \tag{7.10.1} \]
where $A$ ($n \times n$) $= A'$, $L_1$ ($p \times n$), $L_2$ ($p \times n$) and $C$ ($p \times p$) are constant matrices. For $L_1 = L_2 = \tfrac{1}{2}L$, $C = C'$, the polynomial (7.10.1) reduces to the polynomial (7.9.1) of Section 7.9. Conditions for Wishartness and independence, for this case, when $X \sim N_{p,n}(M, \Sigma \otimes \Psi)$, are given there. In this section we discuss such conditions for polynomials of type (7.10.1) when $X \sim N_{p,n}(M, \Sigma \otimes \Psi \mid r, s)$, $r \le p$, $s \le n$. First we derive its m.g.f.

Since $\Sigma \ge 0$ and $\Psi \ge 0$ are of ranks $r$ and $s$ respectively, we can write $\Sigma = B_1B_1'$ and $\Psi = BB'$, where $B_1$ ($p \times r$) and $B$ ($n \times s$) are of ranks $r$ and $s$ respectively. From Definition 2.4.1, we can write $X = M + B_1YB'$ where $Y \sim N_{r,s}(0, I_r \otimes I_s)$. Therefore $S$ can be written as
\[ S = B_1YA_{(1)}Y'B_1' + L_{(1)}Y'B_1' + B_1YL_{(2)}' + C_{(1)}, \tag{7.10.2} \]
where $A_{(1)} = B'AB$, $L_{(1)} = (MA + L_1)B$, $L_{(2)} = (MA + L_2)B$, and $C_{(1)} = MAM' + L_1M' + ML_2' + C$. Now for any arbitrary matrix $Z$ ($p \times p$), and $Z_0 = \tfrac{1}{2}(Z + Z')$,
\[ \begin{aligned} \mathrm{tr}(ZS) &= \mathrm{tr}[Z_0B_1YA_{(1)}Y'B_1' + (ZL_{(1)} + Z'L_{(2)})Y'B_1' + ZC_{(1)}] \\ &= \mathrm{tr}[B_1'Z_0B_1YA_{(1)}Y' + B_1'(ZL_{(1)} + Z'L_{(2)})Y'] + \mathrm{tr}(ZC_{(1)}). \end{aligned} \tag{7.10.3} \]
Note that $\mathrm{tr}(B_1'Z_0B_1YA_{(1)}Y') = (\mathrm{vec}(Y'))'(B_1'Z_0B_1 \otimes A_{(1)})\,\mathrm{vec}(Y')$ and $\mathrm{tr}(\bar{L}Y') = (\mathrm{vec}(Y'))'\,\mathrm{vec}(\bar{L}')$, where $2\bar{L} = B_1'(ZL_{(1)} + Z'L_{(2)})$. Hence, (7.10.3) can be written as
\[ \mathrm{tr}(ZS) = (\mathrm{vec}(Y'))'(B_1'Z_0B_1 \otimes A_{(1)})\,\mathrm{vec}(Y') + 2(\mathrm{vec}(Y'))'\,\mathrm{vec}(\bar{L}') + \mathrm{tr}(ZC_{(1)}), \tag{7.10.4} \]
where $\mathrm{vec}(Y') \sim N_{rs}(0, I_r \otimes I_s)$. Using (7.10.4), the m.g.f. of $S$ is
\[ \begin{aligned} M_S(Z) &= E[\exp\{(\mathrm{vec}(Y'))'(B_1'Z_0B_1 \otimes A_{(1)})\,\mathrm{vec}(Y') + 2(\mathrm{vec}(Y'))'\,\mathrm{vec}(\bar{L}') + \mathrm{tr}(ZC_{(1)})\}] \\ &= \det(I_{rs} - 2(B_1'Z_0B_1 \otimes A_{(1)}))^{-\frac{1}{2}}\, \mathrm{etr}(ZC_{(1)})\, E[\exp\{2(\mathrm{vec}(Y'))'\,\mathrm{vec}(\bar{L}')\}], \end{aligned} \]
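where now $\mathrm{vec}(Y') \sim N_{rs}(0, (I_{rs} - 2(B_1'Z_0B_1 \otimes A_{(1)}))^{-1})$. Therefore
\[ M_S(Z) = \det(I_{rs} - 2(B_1'Z_0B_1 \otimes A_{(1)}))^{-\frac{1}{2}}\, \mathrm{etr}(ZC_{(1)}) \exp\{2(\mathrm{vec}(\bar{L}'))'(I_{rs} - 2(B_1'Z_0B_1 \otimes A_{(1)}))^{-1}\,\mathrm{vec}(\bar{L}')\}. \tag{7.10.5} \]
Next let the spectral decomposition of $A_{(1)} = B'AB$ be $A_{(1)} = \sum_{j=1}^{m} \lambda_jE_j$, where $E_j$ is symmetric, $E_j^2 = E_j$, $E_iE_j = 0$, $i \ne j$, and $\lambda_1 > \cdots > \lambda_m$ are the nonzero characteristic roots of $A_{(1)}$ with multiplicities $f_1, \ldots, f_m$ respectively. Let $E_0 = I_s - \sum_{j=1}^{m} E_j$; then $E_0^2 = E_0$, $E_0E_j = 0$, and
\[ \begin{aligned} (I_{rs} - 2(B_1'Z_0B_1 \otimes A_{(1)}))^{-1} &= \Big(I_{rs} - 2\Big(B_1'Z_0B_1 \otimes \sum_{j=1}^{m} \lambda_jE_j\Big)\Big)^{-1} \\ &= \Big(I_r \otimes E_0 + \sum_{j=1}^{m} (I_r - 2\lambda_jB_1'Z_0B_1) \otimes E_j\Big)^{-1} \\ &= I_r \otimes E_0 + \sum_{j=1}^{m} (I_r - 2\lambda_jB_1'Z_0B_1)^{-1} \otimes E_j. \end{aligned} \tag{7.10.6} \]
Now
\[ \begin{aligned} (\mathrm{vec}(\bar{L}'))'(I_{rs} - 2(B_1'Z_0B_1 \otimes A_{(1)}))^{-1}\,\mathrm{vec}(\bar{L}') &= \mathrm{tr}(\bar{L}E_0\bar{L}') + \sum_{j=1}^{m} \mathrm{tr}\{(I_r - 2\lambda_jB_1'Z_0B_1)^{-1}\bar{L}E_j\bar{L}'\} \\ &= \frac{1}{4}\sum_{j=0}^{m} \mathrm{tr}\{(I_r - 2\lambda_jB_1'Z_0B_1)^{-1}B_1'(ZL_{(1)} + Z'L_{(2)})E_j(L_{(1)}'Z' + L_{(2)}'Z)B_1\}, \end{aligned} \tag{7.10.7} \]
where we define $\lambda_0 = 0$, and
\[ \det(I_{rs} - 2(B_1'Z_0B_1 \otimes A_{(1)}))^{-\frac{1}{2}} = \prod_{j=1}^{m} \det(I_r - 2\lambda_jB_1'Z_0B_1)^{-\frac{1}{2}f_j}. \tag{7.10.8} \]
Substituting from (7.10.7) and (7.10.8) in (7.10.5), and using $B_1(I_r - 2\lambda_jB_1'Z_0B_1)^{-1}B_1' = (I_p - 2\lambda_j\Sigma Z_0)^{-1}\Sigma$, we get
\[ \begin{aligned} M_S(Z) = {} & \prod_{j=1}^{m} \det(I_r - 2\lambda_jB_1'Z_0B_1)^{-\frac{1}{2}f_j}\, \mathrm{etr}(ZC_{(1)}) \\ & \times \mathrm{etr}\Big\{\frac{1}{2}\sum_{j=0}^{m} B_1(I_r - 2\lambda_jB_1'Z_0B_1)^{-1}B_1'(ZL_{(1)} + Z'L_{(2)})E_j(L_{(1)}'Z' + L_{(2)}'Z)\Big\} \\ = {} & \prod_{j=1}^{m} \det(I_p - 2\lambda_j\Sigma Z_0)^{-\frac{1}{2}f_j}\, \mathrm{etr}(ZC_{(1)}) \\ & \times \mathrm{etr}\Big\{\frac{1}{2}\sum_{j=0}^{m} (I_p - 2\lambda_j\Sigma Z_0)^{-1}\Sigma(ZL_{(1)} + Z'L_{(2)})E_j(L_{(1)}'Z' + L_{(2)}'Z)\Big\}. \end{aligned} \tag{7.10.9} \]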
If $L_{(1)} = L_{(2)} = L_{(0)}$ (say), and $C_{(1)}$ is symmetric, then $(L_1 - L_2)\Psi = 0$ and (7.10.9) reduces to
\[ M_S(Z) = \mathrm{etr}(Z_0C_{(2)} + 2\Sigma Z_0\Omega_0Z_0) \prod_{j=1}^{m} \det(I_p - 2\lambda_j\Sigma Z_0)^{-\frac{1}{2}f_j}\, \mathrm{etr}\Big\{\sum_{j=1}^{m} \frac{1}{\lambda_j}(I_p - 2\lambda_j\Sigma Z_0)^{-1}\Omega_jZ_0\Big\}, \tag{7.10.10} \]
where $\Omega_j = L_{(0)}E_jL_{(0)}' = (MA + L_1)BE_jB'(MA + L_1)'$, $j = 0, 1, \ldots, m$, and $C_{(2)} = C_{(1)} - \sum_{j=1}^{m} \frac{\Omega_j}{\lambda_j}$. Now from (7.10.10), it is clear that $S$ is distributed as $\sum_{j=1}^{m} \lambda_jW_j + \tfrac{1}{2}(Y + Y')$, where $W_1, \ldots, W_m$ and $Y$ are independent, $Y \sim N_{p,p}(C_{(2)}, 4\Sigma \otimes \Omega_0)$ and, if $f_j \ge p$, then $W_j \sim W_p(f_j, \Sigma, \lambda_j^{-2}\Omega_j)$, $j = 1, \ldots, m$.

The results on Wishartness and independence of quadratic forms of the type (7.10.1) have also been given by Khatri (1980). Some of these are stated below without proof.

THEOREM 7.10.1. Let $S = XAX' + L_1X' + XL_2' + C$, where $A$ ($n \times n$) $= A'$, $L_1$ ($p \times n$), $L_2$ ($p \times n$) and $C$ ($p \times p$) are constant matrices, and $X \sim N_{p,n}(M, \Sigma \otimes \Psi \mid r, s)$. Then $S$ is distributed as $\sum_{j=1}^{m} \lambda_jW_j$, where $W_1, \ldots, W_m$ are independent, $W_j \sim W_p(f_j, \Sigma, \lambda_j^{-2}\Omega_j)$, for distinct nonzero $\lambda_1, \lambda_2, \ldots, \lambda_m$, if and only if (i) $\lambda_1, \lambda_2, \ldots, \lambda_m$ are distinct nonzero characteristic roots of $\Psi A$ with multiplicities $f_1, f_2, \ldots, f_m$ respectively such that $f_j \ge p$, $j = 1, \ldots, m$, (ii) $L_1\Psi = L_2\Psi$, (iii) $(L_1 + MA)\Psi = L\Psi A\Psi$ for some matrix $L$, (iv) $\Omega_j = (L_1 + MA)(BE_jB')(L_1 + MA)'$, and (v) $MAM' + L_1M' + ML_2' + C = \sum_{j=1}^{m} \frac{\Omega_j}{\lambda_j}$.

The asymptotic distribution of $\sum_{j=1}^{m} \lambda_jW_j$, under the conditions of Theorem 7.10.1, is also given by Khatri (1980).

THEOREM 7.10.2. Let $S_i = XA_iX' + L_{1i}X' + XL_{2i}' + C_i$, where $A_i$ ($n \times n$) $= A_i'$, $L_{1i}$ ($p \times n$), $L_{2i}$ ($p \times n$) and $C_i$ ($p \times p$) are constant matrices, $i = 1, 2$, and $X \sim N_{p,n}(M, \Sigma \otimes \Psi \mid r, s)$. Then $S_1$ and $S_2$ are independently distributed if and only if (i) $\Psi A_1\Psi A_2\Psi = 0$, (ii) $(L_{j1} + MA_1)\Psi A_2\Psi = (L_{j2} + MA_2)\Psi A_1\Psi = 0$, $j = 1, 2$, and (iii) the coefficients of the elements of $Z_1$ and $Z_2$ from $\mathrm{tr}\{(Z_2L_{(12)} + Z_2'L_{(22)})\Psi(Z_1L_{(11)} + Z_1'L_{(21)})'\}$ are zero, where $L_{(ji)} = L_{ji} + MA_i$, $i, j = 1, 2$.

PROBLEMS

7.1. Show that the Laplace transform of the density of $S = XAX'$, where $X \sim N_{p,n}(0, \Sigma \otimes I_n)$, is
274 CHAPTER 7. DISTRIBUTION OF QUADRATIC FORMS k=0 κ Κ· ^κ(Ιη) where q is a real positive quantity and G = Ιρ+2ςΖΣ. Hence derive the density (7.2.7). (Shah, 1970) 7.2. Derive the p.d.f. of и = tr(5) using the density (7.2.5) of S, as f(u) = J2bpr(^np)det(A^)^det(E)bJ η^ηρ~2) oFo^E"1 Θ A-H-lA~K -\u), и > 0, and then show that P(u < w) = Ыпрг{^пр + l) det(A#)Wet(E)bj wfrp ιΕ^ρ\\ηρ^ηρ+1;Σ-1 ^A~^-lA-K-\w). (Hayakawa, 1966) 7.3. Show that the limiting distributions of F\ and F2 as q —>· oo are given by -^fet^det(0)-bdet(*)-^det(F1)^-™-1) Гр(5т)Гп(5р) iFoW(5(ro + n);-fi-\«-1i'1)> Fx > 0, and -^^7^det(n)-bdet(*)-bdet(F2)i("--1) Гр(2т)Гр(2П) ^(^(m + n);-*-1^-1^), F2 > 0, respectively. (Khatri, 1966) 7.4. Let S ~ <2p,n(A Σ, Φ). Derive the p.d.f. of S~l in the forms parallel to (7.2.1), (7.2.5) and'(7.2.7). 7.5. Let S (ρ χ p) and Υ (ρ χ га) be independent, 5 ~ QP,n( A /P, Φ) and Υ ~ iVp>rn(0, /p (g) Jm). Derive the density of SIVY'S'*, for ρ < ra.
PROBLEMS 275 7.6. Let the density of S is given by (7.2.1). Show that E[det{S)h] = д(ь+Ь)г 2*^(2" + A) det(i№)-ipdet(Z)* Гр(2П) = ^+Η?2^Г^ + V άβί(ΑΦ)-^det(Z)* 2*ίη)(^Ρ. ^n + A; ^n; In - i*-*^1*"*). 7.7. Let S be distributed as in Theorem 7.6.1. Then show that E[det(S)h] = д-(ь+Ь)г2^Гр^+Д) det(A)-^det(E)heti(- \тг1ММ') гр(2п) ч 2 У where Re(h) > —\{п - ρ + 1). 7.8. Let Χ ~ ΑΓρ,η(Μ, Σ <g> Φ), and Q = (Χ - М)Ъ~\Х - Μ)' - (Χχ - Μ^Φ^1 (Χ\ - Μχ)', where Χ = (Χχ Χ2), Л\ (ρ χ <?), Μ = ( Μλ Μ2), Μχ (ρ x q) and Φ = ( -11 -12 V Фп (<?x<7), and η > ρ+<?. Then show that Q ~ Wp(n-q, Σ). 7.9. Let Χ ~ Α^ρ,η(ΓW,Σ(g)Jn) where Г(рхг) and W(rxn) are constant matrices, rank(W) = r < p. Define Я = WW, G = XW'H~\ and Q = XX' - GHG'. Then prove that G and Q are independent, and Q ~ Wp(n — r, Σ), η — r > p. 7.10. Let X ~ NPi7l(M, Σ <g> Φ), and A, Au ..., Afc_i, Afc be η χ η real symmetric matrices so that Α = Αι + ··· + Ak_x + A*. Let XAX\ XAXX',..., ΧΑ*_ιΛ"' have Wishart distributions and let Ak be positive semidefinite. Then ΧΑχΧ', ...,XAk-\X' and XAkXf are stochastically independent, and XAkX' has a Wishart distribution. (Hogg, 1963) 7.11. Let X(p χ n) be a real matrix. Let A(n χ η) and Б (η χ η) be symmetric idempotent matrices of rank r and 5 respectively, ρ <r < s. Then prove that a necessary and sufficient condition for В — A to be positive semidefinite is that det(XAX') < det(XBX'), for all X e Rpxn. (Hogg, 1963) 7.12. Let XAX1 = £ti ΧΑ*', where Χ ~ Α^,η(Μ, Σ <g> Jn). Then prove that any one of the following six conditions is a necessary and sufficient condition for
276 CHAPTER 7. DISTRIBUTION OF QUADRATIC FORMS Theorem 7.8.7. (i) Д2 = Д-, г = 1,..., /с, and XA^X' and XAjX\ г Φ j, are stochastically independent, (ii) A\ = Д, г = 1,..., /с, and XAX' - Wp(iank(A), Σ, Σ^ΜΜ'), (iii) AiAj = О, г ^ j, and ХДХ' - И^р(гапк(Д), Σ, Σ"1 МДМ'), г = 1,..., /с, (iv) ДА,- = 0, г φ j, and XΑΧ' ~ Wp(rank(A), Σ, Σ"1 МАМ'), (ν) Α2 = A, and XAiX' and XAjX\ г Φ j, are stochastically independent, and (vi) A2 = Д and XAiX' ~ Жр(гапк(Д·), Σ, Σ"1 МДМ'), г = 1,..., /с. 7.13. Let Si = Χ ΑΧ' + Ζ^Χ' + XL'2 + С and 52 = XBX' + i^X' + XN'2 + Д where * ~ iVPln(0, Σ <g> Ф). Then show that (ί)£?(5ι) = ίτ(ΑΦ)Σ + σ, (ii) E(SlGS2) = ЦФБ'ФА') ϊγ(£Σ)Σ + ϊγ(ΑΦ) ϊγ(£Φ)Σ£Σ + ϊγ(ΑΦ£'Φ)Σ£'Σ + ϊγ(ΑΦ)Σ££> + Ζ^ΦΛ^'Σ + tr(GE)L^iV£ + tr(L'2G7\^)E + ZG'L2Wi + ti(BV)CGE + CGD, and (iii) £?(ίΓ(52σ)5ι) = ϊγ(Α'Φ£Φ)Σ£'Σ + ϊγ(ΑΦ) ϊγ(ΒΦ) tr(GE)E + ϊγ(Α'Φ£'Φ)Σ£Σ + ϊγ(ΑΦ) tr(DG)E + Ζ^ΦΛ^'Σ + Σ£'ΛΓ2ΦΖ/2 + ϊγ(Σ£) ίΓ(Φ5)σ + tr(jDG)C. 7.14. Let Χ - Α^ρ,η(Μ, Σ <g> Φ) and define S = D'XAX'D + \(LX'B + £'XL') + C. Then show that S(S) = D'MAM'D + ti(AV)D'ED + ]:{LM'B + B'ML') + С and cov(S) = Μρ[{2ϊγ(ΑΦ)2}(£>'Σ£> <g> D'HD) + 4£>'ΜΑΦΑΜ'£> <g> DTD + Ζ,ΦΖ/ (g) £'Σ£ + IL^AM'D <g> £'Σ£> + 2(Ζ,ΦΑΜ'£>)' <g> (Β'Σ£>)']ΜΡ. (Brown and Neudecker, 1988) 7.15. Derive E(Si), E(SiGS2) and E(ti(S2G)Si) where Sx and S2 are defined in Problem 7.13 and Χ ~ ΛΓρ,η(Μ, Σ <g> Φ). 7.16. Prove Theorem 7.8.9. (Hint: use Definition 2.4.1). 7.17. Let XAX'+LX'+XL'+C = Y^=i(X+Li)Ai(x+Li)^ where x ~ νρΛμ, ς® Φ), rank(A) = г (> ρ), and гапк(Д) = r{ (> ρ), г = 1,... ,/c. Consider the following conditions: (ai) (X + Li)Ai{X + U)' ~ Wp(ru Σ, Σ"Χ(Μ + Ь{)А{(М + U)'), (a2) (X + Li)Ai(X + Li)' and (X + Lj)A5(X + La)\ г Ф j, are stochastically independent, (аз) (X + L)A(X + L)' ~ Wp(r, Σ, Έ~\Μ + L)A(M + L)'), (ci) Ai4>Ai = Ai, г = 1, ...,/c, (с2)ЛИ-0,г^·,
PROBLEMS 277 (c3) АЪА = А, and (С4)Г = Е?=1^ Then, prove that (a) any two of the three conditions (ai), (a,2), and (аз), or (b) any two of the three conditions (ci), (c2), and (сз) or (c) any one set of (a;) and (cj), г φ j, i,j = 1,2,3; or (d) conditions (сз) and (аз); or (e) conditions (сз) and (c4) are necessary and sufficient for all the remaining conditions. (Khatri, 1962) 7.18. Let S = XAX' + L^X' + XL'2 + C, where X ~ iVp,n(0,E <g> Jn). Show that the necessary and sufficient conditions for S to be distributed as V + У, where V ~ Wp(rank(A), Σ), Υ (ρ χ ρ) is normal and is independent of V, are (i) A2 = A, rank(A) > ρ (ii) (Lx - L2)A = 0, and (iii) LXA = 0. 7.19. Let Si = XA^'+^X'+XL'^^ and S2 = XA2X,+NlX,+XN!i+C2, where X ~ iVp,n(0, Σ <g> In). Then show that S\ and S2 are stochastically independent if and only if (i) АгА2 = 0, (ii) LXA2 = 0, (iii) NXAX = 0, (iv) {Nl-N2)A1 = 0, (v) (Ll - L2)A2 = 0, and (vi) (^щ {^)(~ν"-Ν2)>) = 0' (Khatri, 1980) 7.20. Let X ~ ΛΓρ?η(Μ, Σ <g> Jn), η > ρ, Σ > 0 and S = XAX' where гапк(Л) = t {p < t < ri). Further let S = (5^·), г, j = 1,..., &, 5^ (<? x <?), and &<? = p. Then show that the necessary and sufficient condition for principal minors 5ц,..., Skk to be distributed as the principal minors of a Wishart matrix is that A be idempotent. (Gupta and Chattopadhyay, 1979)
CHAPTER 8

MISCELLANEOUS DISTRIBUTIONS

8.1. INTRODUCTION

In Chapters 1-7 we introduced the basic matrix variate distributions. These distributions, because of their wide applicability in multivariate statistical analysis and other fields, have been studied extensively. There are many other matrix variate distributions which have not been classified in the foregoing chapters. In this chapter we give these distributions which, among others, have been studied by James (1954), Herz (1955), Khatri (1970a), Roux (1971), Downs (1972), van der Merwe and Roux (1974), Khatri and Mardia (1977), Mardia and Khatri (1977), de Waal (1979, 1983), and Chikuse (1990a, 1990b, 1991a, 1991b, 1993a, 1993b). However, the coverage here is not exhaustive. Patil, Boswell, Ratnaparkhi and Roux (1984) have also written a classified bibliography of statistical distributions which includes matrix variate distributions.

8.2. UNIFORM DISTRIBUTION ON STIEFEL MANIFOLD

The uniform distribution on the Stiefel manifold, $O(p, n)$, has already been encountered in Chapter 1, while studying the Jacobian of a certain transformation involving a semiorthogonal matrix $X$ ($p \times n$), $p \le n$, $XX' = I_p$. Recall that
\[ J((dX)X_0' \to (dX))\, dX \tag{8.2.1} \]
defines an invariant measure on the Stiefel manifold $O(p, n)$ and is denoted by $[(dX)X']$. Here $X_0'$ ($n \times n$) $= (X' \;\; X_1')$, $X_0'X_0 = I_n$ and
\[ O(p, n) = \{X\,(p \times n) : XX' = I_p\}. \]
In Section 1.4, it was shown that
\[ \mathrm{Vol}(O(p, n)) = \int_{O(p,n)} [(dX)X'] = \frac{2^p\pi^{\frac{1}{2}np}}{\Gamma_p(\frac{1}{2}n)}. \]
Thus
\[ \frac{\Gamma_p(\frac{1}{2}n)}{2^p\pi^{\frac{1}{2}np}}\, [(dX)X'] = [dX] \tag{8.2.2} \]
defines the probability element of the invariant distribution of the random matrix $X$, known as the uniform distribution on the Stiefel manifold $O(p, n)$ and denoted by $\mathcal{U}_{p,n}$. Note that the random matrix $X$ ($p \times n$) has $np - \tfrac{1}{2}p(p + 1)$ functionally independent and $\tfrac{1}{2}p(p + 1)$ functionally dependent elements. Let $X_I$ be the set of functionally independent elements and $X_D$ be the set of functionally dependent elements of $X$. Let $J(XX' \to X_D)$ be the Jacobian of transformation from $XX'$ to $X_D$ at $X_I$ (Roy, 1957, p. 170). Then the probability density function of $X$, with parameters $p$ and $n$, is defined as
\[ f_{p,n}(X) = \frac{\Gamma_p(\frac{1}{2}n)}{\pi^{\frac{1}{2}np}}\, \{J(XX' \to X_D)\}^{-1}, \quad X \in O(p, n). \tag{8.2.3} \]
The above density was given by Khatri (1970a). An alternative representation of density (8.2.3) in terms of generalized Eulerian angles $\theta_{ij}$, $i = 1, \ldots, p$, $j = i + 1, \ldots, n$ (Hoffman, Raffenetti and Ruedenberg, 1972; Girko and Gupta, 1996) is given by Khatri and Mardia (1977). Let $P_{ij}(\theta_{ij})$ be an $n \times n$ matrix with unities on the diagonal except in the $(i, i)$th and $(j, j)$th positions, which contain $\cos\theta_{ij}$, and with all off-diagonal elements zero except the $(i, j)$th and $(j, i)$th elements, which are $\sin\theta_{ij}$ and $-\sin\theta_{ij}$, respectively, $j > i$. Further, let
\[ P = \prod_{i=1}^{p} \prod_{j=i+1}^{n} P_{ij}(\theta_{ij}), \]
where the product is written from right to left. The matrix $P$ ($n \times n$) is orthogonal and its first $p$ rows can be chosen to represent $X$ in polar coordinates, $p \le n$. Then
\[ -\pi < \theta_{j-1,j} < \pi, \quad j = 2, 3, \ldots, n, \qquad -\tfrac{1}{2}\pi < \theta_{ij} < \tfrac{1}{2}\pi, \quad j \ne i + 1. \tag{8.2.4} \]
Using (8.2.3), Khatri (1970a) has derived several results for the uniform density. Here we state some of them without proof.

THEOREM 8.2.1. Let $X \sim \mathcal{U}_{p,n}$. Then for $M \in O(p)$ and $N \in O(n)$, (i) $XN \sim \mathcal{U}_{p,n}$, (ii) $MX \sim \mathcal{U}_{p,n}$, and (iii) $MXN \sim \mathcal{U}_{p,n}$.

THEOREM 8.2.2. Let $X = (X_1 \;\; X_2)$ be distributed as (8.2.3), where $X_i$ is $p \times n_i$, $i = 1, 2$, with $n = n_1 + n_2$, $n_2 \ge p$. Then all the elements of $X_1$ can be taken as random and the random matrices $X_1$ and $W = (I_p - X_1X_1')^{-1/2}X_2$ are independent, $W \sim \mathcal{U}_{p,n_2}$ and $X_1 \sim IT_{p,n_1}(n_2 + p + 1, 0, I_p, I_{n_1})$.
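The construction in Theorem 8.2.2 can be checked algebraically: for any semiorthogonal $X = (X_1 \;\; X_2)$ with $n_2 \ge p$, the matrix $W = (I_p - X_1X_1')^{-1/2}X_2$ is again semiorthogonal, because $X_2X_2' = I_p - X_1X_1'$. The NumPy sketch below verifies this for one matrix; the helper name `inv_sym_sqrt`, the dimensions and the use of a QR factorization to produce a point of $O(p, n)$ are illustrative assumptions, and the check says nothing about the distributional part of the theorem.

```python
import numpy as np

def inv_sym_sqrt(S):
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T

rng = np.random.default_rng(3)
p, n1, n2 = 2, 3, 4
n = n1 + n2

# A point of O(p, n): transpose of the orthonormal factor of a random n x p matrix.
Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
X = Q.T
X1, X2 = X[:, :n1], X[:, n1:]

# W = (I_p - X1 X1')^{-1/2} X2 lies in O(p, n2).
W = inv_sym_sqrt(np.eye(p) - X1 @ X1.T) @ X2

print(np.allclose(X @ X.T, np.eye(p)), np.allclose(W @ W.T, np.eye(p)))
```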
For $n_1 = p$, the above theorem was proved by Herz (1955, Lemma 3.7) using the uniqueness of the Fourier transform and results on the hypergeometric function of matrix argument.

THEOREM 8.2.3. Let $X$ ($p \times n$) $\sim \mathcal{U}_{p,n}$ and $R = (R_1 \;\; R_2) \sim \mathcal{U}_{p,n+m}$, $R_1$ ($p \times n$), $R_2$ ($p \times m$), be independent and let the elements of $R_2$ be functionally independent. Then $W = ((R_1R_1')^{1/2}X \;\; R_2) \sim \mathcal{U}_{p,n+m}$.

THEOREM 8.2.4. Let $X \sim \mathcal{U}_{p,n}$ and $X' = (X_1' \;\; X_2')$, $X_1 = (X_{11} \;\; X_{12})$, $X_2 = (X_{21} \;\; X_{22})$, where $X_{ij}$ is a matrix of order $p_i \times n_j$, $i, j = 1, 2$, $p_1 + p_2 = p$, $n_1 + n_2 = n$, $n_1 \ge p_1$. Define the matrices $T_1$ ($n_1 \times (n_1 - p_1)$) of rank $n_1 - p_1$ such that $X_{11}T_1 = 0$, $T_1'T_1 = I_{n_1 - p_1}$, and $T_2 = (I_{n_2} + X_{12}'(X_{11}X_{11}')^{-1}X_{12})^{1/2}$. Then the random matrices $Y = (X_{21}T_1 \;\; X_{22}T_2)$ and $X_1$ are independent, $Y \sim \mathcal{U}_{p_2,n-p_1}$ and $X_1 \sim \mathcal{U}_{p_1,n}$.

THEOREM 8.2.5. Let the random matrix $Y$ ($p \times n$), $n \ge p$, have a density with respect to the Lebesgue measure. Further let the distribution of $Y$ be invariant under the transformation $Y \to YN$ for any orthogonal matrix $N$ ($n \times n$). Then $YY'$ and $X = (YY')^{-1/2}Y$ are independently distributed, and $X \sim \mathcal{U}_{p,n}$.

In the above theorem it is assumed that the distribution of $YN$ does not depend on $N$. Without this condition $X$ and $YY'$ are not independent. Moreover, the distribution of $X$ may not be uniform, as shown by Chikuse (1990b) (also see Section 8.7).

Consider a random sample $X_i$ ($p \times n$), $i = 1, \ldots, N$, from the uniform distribution on the Stiefel manifold. Mardia and Khatri (1977) derived the distributions of a matrix $Q$, defined as an average over the sample, and of $D = \mathrm{diag}(\omega_1, \ldots, \omega_p)$, where $\omega_i$, $i = 1, \ldots, p$, are the eigenvalues of $Q$, and also gave their asymptotic distributions.

8.3. VON MISES-FISHER DISTRIBUTION

The von Mises-Fisher (or Langevin) matrix variate distribution defined in this section is useful in orientation statistics (Downs, 1972; Khatri and Mardia, 1977).

DEFINITION 8.3.1. The random matrix $X$ ($p \times n$), $p \le n$, is said to have the von Mises-Fisher distribution with parameter $F$ ($p \times n$), if its probability element is given by
\[ a(F)\, \mathrm{etr}(FX')\, [dX], \quad X \in O(p, n), \tag{8.3.1} \]
where $[dX]$ is the unit invariant measure on $O(p, n)$ and $a(F)$ is the normalizing constant.

If the probability element of a random matrix $X$ ($p \times n$) is given by (8.3.1), we will write $X \sim M_{p,n}(F)$. This distribution is a special case, for $C = I_p$, of Downs (1972), who studied the distribution of $X$ when it lies on the Stiefel $C$-manifold: $S(C) = \{X\,(p \times n) : XX' = C > 0\}$. An alternate representation of (8.3.1) in terms of
generalized Eulerian angles can be given by using (8.2.4) for $[dX]$. For $F = 0$, the distribution (8.3.1) reduces to the uniform distribution on the Stiefel manifold.

If the rank of $F$ ($p \times n$) is $r \le p$, then the singular value decomposition of $F$ can be written as
\[ F = \Delta'D_\phi\Theta, \]
where $\Delta \in O(r, p)$, $\Theta \in O(r, n)$, $D_\phi = \mathrm{diag}(\phi_1, \ldots, \phi_r)$, $\phi_i > 0$, $i = 1, \ldots, r$, and $\phi_1^2 \ge \phi_2^2 \ge \cdots \ge \phi_r^2 > 0$ are the nonzero eigenvalues of $FF'$. For the uniqueness of this decomposition we assume $\phi_1 > \phi_2 > \cdots > \phi_r > 0$, and that the elements in the first column of $\Theta$ are positive. The matrices $\Delta$ and $\Theta$ indicate orientations and $\phi_1, \phi_2, \ldots, \phi_r$ are concentration parameters in the $r$ directions determined by $\Delta$ and $\Theta$. The distribution has modal orientation $M = \Delta'\Theta$. It is rotationally symmetric around $M$ (Chikuse, 1991b).

The normalizing constant $a(F)$ in (8.3.1) can be evaluated by using Theorem 1.6.4,
\[ \{a(F)\}^{-1} = \int_{X \in O(p,n)} \mathrm{etr}(FX')\, [dX] = {}_0F_1(\tfrac{1}{2}n; \tfrac{1}{4}F'F) = {}_0F_1(\tfrac{1}{2}n; \tfrac{1}{4}D_\phi^2), \]
where $D_\phi^2 = \mathrm{diag}(\phi_1^2, \ldots, \phi_r^2)$. The m.g.f. of $X \sim M_{p,n}(F)$ is given by
\[ M_X(Z) = \int_{X \in O(p,n)} a(F)\, \mathrm{etr}(FX')\, \mathrm{etr}(ZX')\, [dX] = \frac{{}_0F_1(\frac{1}{2}n; \frac{1}{4}(F + Z)(F + Z)')}{{}_0F_1(\frac{1}{2}n; \frac{1}{4}FF')}. \]
The last step is obtained by using Theorem 1.6.4.

Now partition $X$ ($p \times n$) $= \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$, $X_i$ ($p_i \times n$), $i = 1, 2$, and $F = \begin{pmatrix} F_1 \\ F_2 \end{pmatrix}$ similarly. The marginal distribution of $X_1$, when $X \sim M_{p,n}(F)$, can be obtained by using the decomposition of the unit invariant measure $[dX]$, as given in Chapter 1. For given $X_1$ ($p_1 \times n$), we can find $X_3$ ($(n - p_1) \times n$) $= G(X_1)$ and $Y$ ($p_2 \times (n - p_1)$) such that $X_0'$ ($n \times n$) $= (X_1' \;\; X_3')$ is orthogonal and $X_2 = YX_3$. The invariant measure $[dX]$, $X \in O(p, n)$, can be decomposed (Chikuse, 1990a) as
\[ [dX] = [dX_1]\,[dY], \quad X_1 \in O(p_1, n), \quad Y \in O(p_2, n - p_1). \]
Further, when $X \sim M_{p,n}(F)$, using this factorization, the joint probability element of $X_1$ and $Y$ is given by (Khatri and Mardia, 1977; Chikuse, 1990a)
\[ \{{}_0F_1(\tfrac{1}{2}n; \tfrac{1}{4}F'F)\}^{-1} \exp\{\mathrm{tr}(F_1X_1') + \mathrm{tr}(X_3F_2'Y)\}\, [dX_1]\,[dY], \quad X_1 \in O(p_1, n), \quad Y \in O(p_2, n - p_1). \tag{8.3.2} \]
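The singular value decomposition $F = \Delta'D_\phi\Theta$ and the modal orientation $M = \Delta'\Theta$ described above can be computed directly. The following NumPy sketch is one way to do so; the function names `vmf_orientation` and `random_stiefel` are hypothetical, and the sign normalization implements the stated convention that the first column of $\Theta$ is positive. It also checks, on randomly generated points of $O(p, n)$, that $\mathrm{tr}(FX')$ never exceeds $\mathrm{tr}(FM') = \sum_i \phi_i$, which is why $M$ is the mode of (8.3.1) when $F$ has full rank.

```python
import numpy as np

def vmf_orientation(F, tol=1e-12):
    """Return (Delta, phi, Theta) with F = Delta' diag(phi) Theta, Delta in O(r, p), Theta in O(r, n)."""
    U, phi, Vt = np.linalg.svd(F, full_matrices=False)
    r = int(np.sum(phi > tol))                    # rank of F
    Delta, Theta = U[:, :r].T, Vt[:r, :]
    # Uniqueness convention: make the first column of Theta nonnegative.
    signs = np.where(Theta[:, 0] < 0, -1.0, 1.0)
    return signs[:, None] * Delta, phi[:r], signs[:, None] * Theta

def random_stiefel(rng, p, n):
    """Some point of O(p, n) (used only as a test point, no claim about its distribution)."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
    return Q.T

rng = np.random.default_rng(4)
p, n = 2, 4
F = rng.standard_normal((p, n))                   # full rank with probability one

Delta, phi, Theta = vmf_orientation(F)
M = Delta.T @ Theta                               # modal orientation

trace_at_mode = np.trace(F @ M.T)
trace_values = [np.trace(F @ random_stiefel(rng, p, n).T) for _ in range(200)]

print(phi, trace_at_mode, max(trace_values))
```

Since the log of the unnormalized density is $\mathrm{tr}(FX')$, larger $\phi_i$ concentrate the distribution more tightly around $M$.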
8.3. VON MISES-FISHER DISTBJBUTION 283 The marginal probability element of X\, after integrating with respect to Y, is [oF^n-^FF)]'1 expMFiX,)} 0Fi(|(n -pi); i(/n - ВД№2) [dXx], Xx G 0(pbn). Note that when F2 = 0, X\ ~ MPu7l(F). The conditional probability element of Χι given X2 is {oFi (|(n - p2); i(/n - ВД№) }"' <*г(ВД) [dX] [dXa]"1, where [dX] [dX2]~l is the unit Haar measure of X\ given X2 subject to ΧχΧ'χ = IPl and XiX2 = 0. Hence the conditional distribution of X\ given X2 is essentially a von Mises-Fisher distribution. If we partition Χ (ρ χ n) = (Χι X2), Xi(p x щ), г = 1,2, n2 > ρ, and F = (F\ F2) similarly, then (Chikuse, 1990a) using Theorem 8.8.2, the joint probability element of Xx and W = {IP- Х1Х[)~\Х2 is π2ηιΡΓρ(^ηι) I 4^ 4 ') det(/p - XiXi)^712^-1) [<W] dXi. From above it is apparent that W ~ MPy7l2((Ip - X-^X'^F^ and the p.d.f. of X\ is 3=^Whi")}"'«**"«*,>} Thus, if F2 = 0, W and Xi are independently distributed. The matrix W is distributed uniformly on 0(p,n2) and the p.d.f. of X\ is Using the sequential decomposition of invariant measure on 0(p,n) into those for independent measures on component Stiefel manifolds and on subspaces of component rectangular matrices, Chikuse (1990a) has given further decomposition of (8.3.2). Khatri and Mardia (1977) have given first two moments of X, and approximations to (8.3.1). For further insight into this distribution, and its special cases the reader is referred to Downs (1972), Khatri and Mardia (1977), Jupp and Mardia (1979), Chikuse (1990a, 1990b, 1991b, 1993a), and Bingham, Chang and Richards (1992).
8.4. BINGHAM MATRIX DISTRIBUTION

The Bingham matrix distribution defined in this section is the obvious analogue on the Stiefel manifold of Bingham's antipodally symmetric distribution on the sphere (Bingham, 1974).

DEFINITION 8.4.1. The random matrix $X$ ($p \times n$), $p \le n$, is said to have the Bingham matrix distribution with parameter $A$ ($n \times n$) $= A'$, if its probability element is given by
\[ b(A)\, \mathrm{etr}(XAX')\, [dX], \quad X \in O(p, n), \tag{8.4.1} \]
where $[dX]$ is the unit invariant measure on $O(p, n)$ and $b(A)$ is the normalizing constant. For identifiability of $A$ we take $\mathrm{tr}(A) = 0$.

If the probability element of a random matrix $X$ ($p \times n$) is given by (8.4.1), we will write $X \sim B_{p,n}(A)$. A generalization of (8.4.1) may be given as
\[ b_1(A, B)\, \mathrm{etr}(BXAX')\, [dX], \quad X \in O(p, n), \tag{8.4.2} \]
where $B$ ($p \times p$) is a symmetric matrix and $b_1(A, B)$ is the normalizing constant. We shall denote this as $X \sim B_{p,n}(A, B)$. For $B = I_p$, (8.4.2) reduces to (8.4.1), and for $B = 0$, the matrix variate Bingham distribution reduces to the uniform distribution on the Stiefel manifold. An alternate representation of (8.4.1) in terms of generalized Eulerian angles can be given by using (8.2.4) for $[dX]$. The Bingham matrix distribution (8.4.1) is a special case of the generalized von Mises-Fisher matrix variate distribution introduced by Khatri and Mardia (1977) (see Section 8.5).

The normalizing constant $b(A)$ in (8.4.1) can be evaluated by using Theorem 1.6.4,
\[ \{b(A)\}^{-1} = \int_{X \in O(p,n)} \mathrm{etr}(XAX')\, [dX] = {}_1F_1(\tfrac{1}{2}p; \tfrac{1}{2}n; A). \]
Let us partition $X$ ($p \times n$) $= \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$, $X_i$ ($p_i \times n$), $i = 1, 2$. For given $X_1$ ($p_1 \times n$), we can find $X_3$ ($(n - p_1) \times n$) $= G(X_1)$ and $Y$ ($p_2 \times (n - p_1)$) such that $X_0'$ ($n \times n$) $= (X_1' \;\; X_3')$ is orthogonal and $X_2 = YX_3$. Then using the factorization of the invariant measure over the Stiefel manifolds, given in Section 1.3, the joint probability element of $X_1$ and $Y$ is
\[ \{{}_1F_1(\tfrac{1}{2}p; \tfrac{1}{2}n; A)\}^{-1} \exp\{\mathrm{tr}(X_1AX_1') + \mathrm{tr}(YX_3AX_3'Y')\}\, [dX_1]\,[dY], \quad X_1 \in O(p_1, n), \quad Y \in O(p_2, n - p_1). \tag{8.4.3} \]
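Two properties mentioned in this section, antipodal symmetry and the need for the constraint $\mathrm{tr}(A) = 0$, follow from the form of the kernel $\mathrm{etr}(XAX')$ on $O(p, n)$: replacing $X$ by $-X$ leaves it unchanged, and replacing $A$ by $A + cI_n$ multiplies it by the constant $e^{cp}$, which is absorbed by $b(A)$. A small NumPy sketch of both facts follows; the function names `bingham_log_kernel` and `center`, the dimensions and the random inputs are assumptions made for illustration.

```python
import numpy as np

def bingham_log_kernel(X, A, B=None):
    """tr(X A X') for (8.4.1), or tr(B X A X') for the generalization (8.4.2)."""
    G = X @ A @ X.T
    return float(np.trace(G if B is None else B @ G))

def center(A):
    """Shift A by a multiple of the identity so that tr(A) = 0."""
    n = A.shape[0]
    return A - np.trace(A) / n * np.eye(n)

rng = np.random.default_rng(5)
p, n = 2, 5
W = rng.standard_normal((n, n))
A = 0.5 * (W + W.T)                               # symmetric parameter matrix
Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
X = Q.T                                           # a point of O(p, n)

A0 = center(A)
# On O(p, n) the two log-kernels differ by the constant p * tr(A) / n, whatever X is.
shift = bingham_log_kernel(X, A) - bingham_log_kernel(X, A0)

print(shift, p * np.trace(A) / n)
print(bingham_log_kernel(X, A), bingham_log_kernel(-X, A))
```

Because the shift does not depend on $X$, $A$ and its centered version define the same distribution, which is the identifiability issue that the constraint $\mathrm{tr}(A) = 0$ resolves.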
8.5. GENERALIZED BINGHAM-VON MISES MATRIX DISTRIBUTION 285 Now integrating (8.4.3) with respect to У, using Theorem 1.6.4, we get the probability element of X\ as {i^i(n) (|n, |p; Α) Υ' ^{ХгАХ[) ^ (|(n - Pl); ^p2; X3AXj) №1 Xi€0(pi,n). since Xq^o = In· From (8.4.3), it is also seen that the conditional distribution of Υ given Χχ is Bingham matrix distribution, Β^^-^Χ^ΑΧ^), Х$ = G(X\). Using the sequential decomposition of invariant measure on 0(p, n) into those for independent measures on component Stiefel manifolds and on subspaces of component rectangular matrices, Chikuse (1990a) has given further decomposition of (8.4.3). If we partition Χ (ρ χ ή) = (Χι X2), Xiip x щ), г = 1,2, n2 > p, and A = \ aU a12 )' Ai (ni x nj)' ηι + η2 = η, then (Chikuse, 1990a) using Theorem 8.2.2, the joint probability element of Χχ and W = (Ip — ΧΎΧ[)~ϊΧ2 is уДл , {iii(|n; |p; A)}"' etriX^n^ + (JP - Χ^ί)^^^ Γρ(|η)_ + 2(JP - ΧλΧ[)±νΤΑ!12Χ[} det(Jp - X1XJ)i(n2"p"1) Μ^Ί dXi- Thus the conditional probability element of W given Xi is generalized Bingham-von Mises-Fisher matrix variate distribution (Khatri and Mardia, 1977) discussed in the next section. The Bingham matrix distribution on the Stiefel manifold has been generalized by Prentice (1982). He has also obtained the large sample maximum likelihood estimators and uniformity test. 8.5. GENERALIZED BINGHAM-VON MISES MATRIX DISTRIBUTION Let Χ ~ ΑΓρ>η(Μ, Σ<8>Ψ), ρ <η. The conditional distribution of X on the Stiefel manifold 0(p, n) is known as generalized Bingham-von Mises-Fisher distribution. Khatri and Mardia (1977), using the density of Έ~ϊΧΧ'Έ~ϊ = S given in (7.6.6), derived the probability element of X given XX' = /p, as {^(Σ,Φ,Δ)}"1 eti^XVX' + Σ-*ΔφΧ') [dX], X G 0(p,n), (8.5.1) where [dX] is the unit invariant measure on 0(ρ,η), Φ = QDQf, Q'Q = /n, D = diag(ab..., αη), V = aln - ^Φ"1, Δ = H'^MQD, a is any arbitrary number, and g(-) is the normalizing constant.
286 CHAPTER 8. MISCELLANEOUS DISTRIBUTIONS de Waal (1979), using the density of XX' given in Theorem 7.6.2, derived the probability element of X given XX' = /p, as {ΛΓ(Ω, Φ, Μ)}"1 etr{Q(X - МЩХ - Μ)'} [dX], X G 0(ρ, η), (8.5.2) where ΛΓ(Ω, Φ, Μ) = βΪΓ(ΩΜΦΜ') etr(tf2) £ £ (οη) *! Г Α:=0 κ ^2 '* J РЯ(^*МФ*(/П - (7Φ-1)-", Φ - <?/„, -Ω), with Ω-1 = —2Σ, Φ-1 = Φ and Ρκ(·) is the Hayakawa polynomial. Next we give some special cases of (8.5.2). (i) Μ = 0 (Generalized Bingham Distribution) The probability element of X is {ϋΓ(Ω,Φ,Ο)}"1 еЬ(ПХФХ') [dX], X e 0(p,n) where ΛΓ(Ω, Φ, 0) = eti(qQ) 0Ρο(Φ - g/«, Ω). (ϋ) Μ = 0, Ω = Ιρ (Bingham Distribution) The probability element of X is {K(IP, Ф, О)}"1 еЬг(ХФХ') [dX], X e 0(p, n) where K(IP, Ф, 0) = exp(p<?) iFi (-ρ; -η; Φ - qln). (iii) Φ = In (von Mises-Fisher (or Langevin) Distribution) The probability element of X is {ΑΓ(Ω, Jn, Μ)}"1 βίτ(Ω + ΩΜΜ') βίτ(-2ΩΜΛ"') [dX], X e 0(p, n) where by evaluating the above density we obtain ΑΓ(Ω, Jn, M) = βΐΓ(Ω + ΩΜΜ') 0Fi (^p; ^n; Ω2ΜΜ'). (iv) Ω = /p (Bingham-von Mises-Fisher Distribution) The probability element of X is {#(/„, Φ, Μ)}"1 etr{(X - Μ)Φ(Χ - Μ)'} [dX], X € 0(p, n) where ΑΓ(/Ρ, Φ, Μ) = βΐι(ΜΦΜ') exp(p<?) £ Σ { Й») fc!} "' Рк(МФ*(1п - ίΦ-^,Φ - <?/n, -/„).
8.6. MANIFOLD NORMAL DISTRIBUTION 287 (ν) Ω = /ρ, Φ = In (von Mises-Fisher (or Langevin) Distribution) The probability element of X is {Κ(Ιρ,Ιη,Μ)}~1 etr((X - M)(X - M)f) [dX], X e 0(p,n) (8.5.3) where, for q = 0, K(IpJn,M) = etr(MM')EEi(^) k^Ll^-^i-MM1). The form (8.5.3) for the von Mises-Fisher distribution is written in a different manner than (8.3.1). 8.6. MANIFOLD NORMAL DISTRIBUTION Let Χ ~ ΑΓρη(Μ, Σ <g> Ψ), ρ < п. The conditional distribution of X on the Stiefel C-manifold S(C) = {Χ (ρ χ η) : XX' = С > 0} is known as the manifold normal distribution. de Waal (1983), using the density of XX' = S given in Theorem 7.6.2, derived the probability element of X given XX1 = S as {ΛΓ(Σ, Φ, Μ)}"1 etr (- ^Е-1ХФ-1;Г + Σ^ΧΦ^Μ') [dX]c, (8.6.1) where 00 Γ 1 ^ — 1 1 1 κ(Σ, φ, Μ) = Ε Σ {(2η)κ fc!} лЬ^мф-*, φ-i, -E-isE-i), Jt=0 κ and [dX],; is the content element on S(C). He has also given an approximation to (8.6.1). The m.g.f. of X is MX(Z) = {Κ{Σ, Φ, Μ)}"1 f etr(XZ') etr (- ^E_1 JfΦ_1Χ' + Σ^ΧΦ^Μ') [dX]c = {if (Ε, Φ, Μ)}"1 f etr (- ^Ε-^φ-1^ Js(C) v 2 + Σ_1ΧΦ_1(Μ + ΣΖΦ)') [dX]c _ #(Σ,Φ,Μ+ΣΖΦ)) #(Σ,Φ,Μ) ' It may be noted that for С = Ip, [dX]c = [dX] and (8.6.1) reduces to (8.5.1). Hence manifold normal distribution can be regarded as a generalization of Bingham-von Mises-Fisher distribution discussed in Section 8.5.
8.7. MATRIX ANGULAR CENTRAL GAUSSIAN DISTRIBUTION

In Sections 8.3 through 8.6, we have defined the matrix von Mises-Fisher distribution and the Bingham matrix distribution, together with their extensions. The Bingham matrix distribution is an antipodally symmetric distribution. Chikuse (1990b), using the polar decomposition of a random matrix, has proposed the matrix angular central Gaussian distribution as an alternative to the Bingham matrix distribution for modeling antipodally symmetric orientational data on the Stiefel manifold (see also Tyler, 1987).

For any random matrix $X$ ($p \times n$) of rank $p \le n$, the unique polar decomposition of $X$ is defined (Chikuse, 1990b) as
$$X = S_X^{1/2} H_X \tag{8.7.1}$$
with $S_X = XX'$ and $H_X = (XX')^{-1/2}X$, so that $H_X \in O(p,n)$ and $S_X^{1/2}$ is the unique positive definite square root of $S_X$.

Let $f_X(X)$ be the density of $X$ ($p \times n$). Then the joint probability element of $S_X$ and $H_X$ is (Chikuse, 1990b)
$$\frac{\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\, f_X(S_X^{1/2}H_X)\, \det(S_X)^{\frac12(n-p-1)}\,[dH_X]\,dS_X. \tag{8.7.2}$$
Integrating (8.7.2) with respect to $S_X$, the probability element of $H_X$ is obtained as
$$\frac{\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\,[dH_X] \int_{S_X > 0} f_X(S_X^{1/2}H_X)\, \det(S_X)^{\frac12(n-p-1)}\,dS_X. \tag{8.7.3}$$
Integrating (8.7.2) with respect to $H_X$, the density of $S_X$ is given by
$$\frac{\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\, \det(S_X)^{\frac12(n-p-1)} \int_{H_X \in O(p,n)} f_X(S_X^{1/2}H_X)\,[dH_X]. \tag{8.7.4}$$
From (8.7.2), it may be noted that if the density of $XN$, $N \in O(n)$, does not depend on $N$, then $S_X$ and $H_X$ are independent, the distribution of $H_X$ is uniform, and the density of $S_X$ is
$$\frac{\pi^{\frac12 np}}{\Gamma_p(\frac12 n)}\, f_X(S_X^{1/2}H_X)\, \det(S_X)^{\frac12(n-p-1)}, \quad S_X > 0,$$
where $f_X(S_X^{1/2}H_X)$ is then free of $H_X$.

If $X \sim N_{p,n}(0, I_p \otimes \Psi)$, then the distribution of $H_X$ is called the matrix angular central Gaussian (ACG) distribution with parameters $p$, $n$ and $\Psi$. This is denoted by $H_X \sim ACG_{p,n}(\Psi)$. From (8.7.3), the probability element of $H_X$ is
$$\det(\Psi)^{-\frac12 p}\, \det(H_X \Psi^{-1} H_X')^{-\frac12 n}\,[dH_X], \quad H_X \in O(p,n). \tag{8.7.5}$$
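As a quick numerical sanity check of (8.7.5), the sketch below specializes to $p = 1$, $n = 2$, where $O(1,2)$ is the unit circle and the unit invariant measure is $d\theta/(2\pi)$. It confirms that the ACG density has total mass 1, is antipodally symmetric, and is constant when the parameter matrix is the identity. The function names and the test matrix are illustrative assumptions, not part of the original text.

```python
import math

def acg_density_circle(theta, psi):
    """ACG density (8.7.5) for p = 1, n = 2, relative to d(theta) / (2*pi).

    psi is a symmetric positive definite 2x2 matrix [[a, b], [b, d]].
    """
    a, b, d = psi[0][0], psi[0][1], psi[1][1]
    det_psi = a * d - b * b
    h1, h2 = math.cos(theta), math.sin(theta)
    # Quadratic form h * inverse(psi) * h' using the closed-form 2x2 inverse.
    quad = (d * h1 * h1 - 2.0 * b * h1 * h2 + a * h2 * h2) / det_psi
    # det(psi)^(-p/2) * quad^(-n/2) with p = 1 and n = 2.
    return det_psi ** -0.5 / quad

def total_mass(psi, steps=4000):
    """Average of the density over an equally spaced grid on the circle."""
    return sum(
        acg_density_circle(2.0 * math.pi * k / steps, psi) for k in range(steps)
    ) / steps

if __name__ == "__main__":
    psi = [[2.0, 0.5], [0.5, 1.0]]  # hypothetical parameter matrix
    print(total_mass(psi))
```

The equally spaced average is used because the integrand is smooth and periodic, so this simple rule converges very quickly.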
8.8. BIMATRIX WISHABT DISTRIBUTION 289 The density of Sx = XX', from (7.2.1), is |ffi| **(*)*<—» etr (- \sx) Я*(/. - *-\ \SX), Sx > 0. Note that the ACG distribution (8.7.5) is invariant under the transformation Ηχ -* QHx, Q £ 0(p). In case Φ = /n, this distribution reduces to the uniform distribution over 0(p,n). If the density of random matrix X (pxn) is of the form g(XX'), then the density of XN, N £ 0(n) obviously does not depend on N and therefore (i) Ηχ and Sx are independent, (ii) Ηχ is distributed uniformly over 0(p,n), and (iii) the density of Sx has the form 7T2nJ? — det(Sx)-^-r-Vg(Sx), Sx > 0. Тр\2П) Chikuse (1990b) has proved the necessity of conditions (i), (ii) and (iii) for the density of X to be of the form g(XX'). For g(Sx) = etr(—\Sx), the conditions (i), (ii) and (iii) provide characterization of matrix variate standard normal distribution. For g(Sx) = det(/p + 5x)~2(n+m+P~1), the conditions (i), (ii) and (iii) give a characterization of matrix variate ί-distribution, Tp?n(rn, 0,/p,/n). For other relevant results the reader is referred to Chikuse (1990a, 1990b). 8.8. BIMATRIX WISHART DISTRIBUTION The following distribution has been given by Roux and Raath (1973). DEFINITION 8.8.1. A random bimatrix X = (XUX2)} where X{(p χ ρ) is symmetric, i = 1,2, is said to have bimatrix Wishart distribution with parameters n\ (> ρ), n2 (> ρ), Φι (> 0), Ψ2 (> 0) and θ(= θ') if its p.d.f is given by Π [{2^Γρ(±η^^ Here LI is Laguerre polynomial of matrix argument defined in Chapter 1. For θ = 0, it is easily seen that the matrices X\ and X2 are independent, and -X"i~Wrp(ni,«i),i = l,2. THEOREM 8.8.1. The joint m.g.f of X = (XUX2) is given by 2 - °° CK(Q) =o MXl,X2(ZuZ2) = Π det(/p - 2Ф^)"Ь Σ £ .,У,Д12 f[CK(2<HiZi(Ip-2<i>iZi)-1).
290 CHAPTER 8. MISCELLANEOUS DISTRIBUTIONS Proof: The joint m.g.f. of X\ and X2 is 1 MXuX2{ZuZ2) = \l[{2^Tp(-ni)aetm^}~ ΣΣ CK(Q) %*? k\[CK(IP)}2 k=0 π {(Η Γ L·etr (XiZi - ¥^det(Xi)IK" P-I) ,i(ni_p_i)/l , Now transforming Y{ = %*Х{%* with Jacobian J(X{ -+ У;) = det^)^1), and using Theorem 1.7.1, the integral on the right hand side becomes Jx >o etr (ХЛ - ΙφΓ1^) det(Xi)i(n'-,,-1)^(n|-^1) (^Г1*,) **« = 2^det(<i>i)ini{^ni)KTp(^ni) det(/p - 2Ф{^)~Ь C„(-2*iZi(/p - Way1), Ip - 2%Zi > 0. Finally substituting from (8.8.2) in (8.8.1) we get the desired result. ■ From the Definition 8.8.1, it can easily be shown that the marginal p.d.f. of Xi is Wp(n.i, Φί), ΐ = 1,2. The conditional density of X\ given X2 can easily be seen to be ^PTp(lni) det^)""'}"1 etr (- ^ΦΓ'-Χί) det^)^"'-""1) ΣΣ^ρΠ [{(b).}"d'"'"'"" М- *■ * > ° 8.9. BETA-WISHART DISTRIBUTION In this section we give two distributions of bimatrix X = (Хг,Х2), with specified marginals and conditionals. First we define beta-Wishart type I distribution. DEFINITION 8.9.1. A random bimatrix X = (XUX2), where Xi (ρ χ ρ) is sym- metnc, г = 1,2, is said to have beta-Wishart type I distribution with parameters n\ (> V), n2 (> p), n3 (> p), and Σ > 0 if its p.d.f. is given by {2Ьргр(^щ) detE)bД,(1п2, Ι^)}"' etr (- ^Σ"1^) det^)^"—з) det(X2)2(n2"p-1) det(Xi - X2)^3-P-D? о < X2 < ΧΎ. From Definition 8.9.1, it can be shown that Χχ-Η^ηχ,Σ),
$$X_2 \sim CH_p^{II}\!\left(\tfrac12 n_1, \tfrac12 n_3, \tfrac12(n_1 - n_2 + p + 1); \tfrac12\Sigma^{-1}, \text{kind 2}\right),$$
and
$$X_2 \mid X_1 \sim GB_p^{I}\!\left(\tfrac12 n_2, \tfrac12 n_3; X_1, 0\right),$$
where $CH_p^{II}$ denotes the confluent hypergeometric function kind 2 and type II distribution defined in Section 8.11. For $p = 1$, the above distribution reduces to the beta-Stacy distribution (Mihram and Hultquist, 1967).

Next we define the beta-Wishart type II distribution.

DEFINITION 8.9.2. A random bimatrix $X = (X_1, X_2)$, where $X_i$ ($p \times p$) is symmetric, $i = 1,2$, is said to have a beta-Wishart type II distribution with parameters $n_1$ ($> p$), $n_2$ ($> p$) if its p.d.f. is given by
$$\left\{2^{\frac12(n_1+n_2)p}\,\Gamma_p(\tfrac12 n_1)\,\Gamma_p(\tfrac12 n_2)\right\}^{-1} \operatorname{etr}\!\left\{-\tfrac12 X_1(I_p + X_2)\right\} \det(X_1)^{\frac12(n_1+n_2-p-1)} \det(X_2)^{\frac12(n_2-p-1)}, \quad X_1 > 0,\; X_2 > 0.$$

From Definition 8.9.2, it can be shown that
$$X_1 \sim W_p(n_1, I_p), \qquad X_1 \mid X_2 \sim W_p(n_1 + n_2, (I_p + X_2)^{-1}), \qquad X_2 \mid X_1 \sim W_p(n_2, X_1^{-1}).$$
It may be noted that if $X_1 \sim W_p(n_1, I_p)$ and $U \sim W_p(n_2, I_p)$ are independent, then the joint distribution of $X_2 = X_1^{-1/2} U X_1^{-1/2}$ and $X_1$ is the beta-Wishart type II distribution with parameters $n_1$ and $n_2$. This result is given in Chapter 5, in (5.2.9), where it is proved that $X_2 = X_1^{-1/2} U X_1^{-1/2} \sim B_p^{II}(\tfrac12 n_2, \tfrac12 n_1)$.

8.10. CONFLUENT HYPERGEOMETRIC FUNCTION KIND 1 DISTRIBUTION

Here we define a matrix variate distribution in terms of the confluent hypergeometric function. This distribution arises in the study of ratios of certain random matrices.

DEFINITION 8.10.1. A random symmetric positive definite matrix $X$ ($p \times p$) is said to have a confluent hypergeometric function kind 1 distribution if its p.d.f. is given by
$$\frac{\Gamma_p(\alpha)\,\Gamma_p(\beta - n)}{\Gamma_p(n)\,\Gamma_p(\beta)\,\Gamma_p(\alpha - n)}\, \det(X)^{n - \frac12(p+1)}\, {}_1F_1(\alpha; \beta; -X), \quad X > 0, \tag{8.10.1}$$
where $\operatorname{Re}(\beta - n) > 0$ and $\operatorname{Re}(\alpha - n) > 0$. The parameters $n$, $\alpha$, and $\beta$ are restricted to take values such that the density function is non-negative.
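We denote this distribution by $CH_p(n, \alpha, \beta, \text{kind 1})$. By transforming $X = A^{1/2} Y A^{1/2}$, ($A > 0$), with the Jacobian $J(X \to Y) = \det(A)^{\frac12(p+1)}$, the density of $Y$ is obtained as
$$\frac{\Gamma_p(\alpha)\,\Gamma_p(\beta - n)}{\Gamma_p(n)\,\Gamma_p(\beta)\,\Gamma_p(\alpha - n)}\, \det(A)^n \det(Y)^{n - \frac12(p+1)}\, {}_1F_1(\alpha; \beta; -AY), \quad Y > 0. \tag{8.10.2}$$
We denote this distribution by $CH_p(n, \alpha, \beta; A, \text{kind 1})$. When $\alpha = \beta$, the $CH_p(n, \alpha, \beta; A, \text{kind 1})$ density simplifies to $G_p(n, A)$ defined in Chapter 3.

The c.d.f. of $X$, obtained from (8.10.1) by termwise integration of the zonal polynomial series, is
$$P(X < \Omega) = \frac{\Gamma_p(\alpha)\,\Gamma_p(\beta - n)\,\Gamma_p[\tfrac12(p+1)]}{\Gamma_p(\beta)\,\Gamma_p(\alpha - n)\,\Gamma_p[n + \tfrac12(p+1)]}\, \det(\Omega)^n\, {}_2F_2\!\left(\alpha, n; \beta, n + \tfrac12(p+1); -\Omega\right).$$

The confluent hypergeometric function kind 1 distribution arises as the distribution of a ratio of beta and gamma matrices, as shown in the following theorem.

THEOREM 8.10.1. Let $W \sim G_p(n, I_p)$ and $U \sim B_p^{I}(a,b)$ be independent. Then $X = U^{-1/2} W U^{-1/2} \sim CH_p(n, a+n, a+b+n, \text{kind 1})$.

Proof: The joint density of $W$ and $U$ is given by
$$\{\beta_p(a,b)\,\Gamma_p(n)\}^{-1} \det(U)^{a - \frac12(p+1)} \det(I_p - U)^{b - \frac12(p+1)} \operatorname{etr}(-W) \det(W)^{n - \frac12(p+1)}, \quad 0 < U < I_p,\; W > 0.$$
Now transform $X = U^{-1/2} W U^{-1/2}$, with the Jacobian $J(W \to X) = \det(U)^{\frac12(p+1)}$. The joint density of $U$ and $X$ is given by
$$\{\beta_p(a,b)\,\Gamma_p(n)\}^{-1} \det(X)^{n - \frac12(p+1)} \operatorname{etr}(-UX) \det(U)^{a + n - \frac12(p+1)} \det(I_p - U)^{b - \frac12(p+1)}, \quad 0 < U < I_p,\; X > 0. \tag{8.10.3}$$
Integrating (8.10.3) with respect to $U$, using Corollary 1.6.3.1, we get the marginal p.d.f. of $X$ as
$$\frac{\Gamma_p(a+b)\,\Gamma_p(a+n)}{\Gamma_p(a)\,\Gamma_p(n)\,\Gamma_p(a+b+n)}\, \det(X)^{n - \frac12(p+1)}\, {}_1F_1(a+n; a+b+n; -X), \quad X > 0,$$
which completes the proof of the theorem. ■

In the next theorem we derive the m.g.f. of $X \sim CH_p(n, \alpha, \beta, \text{kind 1})$.

THEOREM 8.10.2. Let $X \sim CH_p(n, \alpha, \beta, \text{kind 1})$. Then the m.g.f. of $X$ is
$$M_X(Z) = \frac{\Gamma_p(\alpha)\,\Gamma_p(\beta - n)}{\Gamma_p(\beta)\,\Gamma_p(\alpha - n)}\, \det(I_p - Z)^{-n}\, {}_2F_1\!\left(n, \beta - \alpha; \beta; (I_p - Z)^{-1}\right), \tag{8.10.4}$$
where $I_p - Z > 0$.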
8.10. CONFLUENT HYPERGEOMETRIC FUNCTION KIND 1 DISTRIBUTION 293 Proof: The m.g.f. of X is M*<z> - rjffij5&«-„ Jx>0ett(zx)*«*r»■*"■* («^-*>« - -*Wi/ str{-(/,-Z)X>det(X)->«> (a - n) Ух>о Γρ(η)Γρ(/?)Γρ(α - η) Jx>o 1F1(p-a;0;X)dX Τρ(α)Τρ(β-η) Τρ(β)Γρ(α-η) det(/p - Ζ)-" 2ί\(η, /? - a; /3; (Jp - Ζ)"1), where the last two steps have been obtained using (1.6.9) and (1.6.4), respectively. ■ From (8.10.4) the m.g.f. of tr(X) is easily obtained as Mtr^(Z) -Γρ(/3)Γρ(α-η)5οΣ(1" *) (/3M! W (δ·10·5) for |1 - z\ > 0. Next expanding (1 - г)-<*чн-*) = Σ~0(ηΡ + *0*lf> W < h and substituting in (8.10.5) and equating the coefficients of ^, we obtain Now we give certain properties of the confluent hypergeometric function kind 1 distribution. THEOREM 8.10.3. Let X ~ CHp(n,a,p,kind 1) and A be any ρ χ ρ constant nonsingular matrix. Then AX A' ~ CHp(n, a, /3; (AA')~l, kind 1). THEOREM 8.10.4. Let X ~ CHp(n,a,P,kind 1), and Η (ρ χ ρ) be an orthogonal matrix whose elements are either constants or random variables distributed independently of X. Then, the distribution of X is invariant under the transformation X —>> HXH', and is independent of Η in the latter case. Proof: The proof is similar to the proof of Theorem 3.3.2. ■ THEOREM 8.10.5. Let X ~ CHp(n, a, β, kind 1), and partition X as χ _ ( ХП Х12 \ Я \ ^2i ^22 J p-q q p-q Then Xu and Х22л = ^22 ~~ -^21-^n -^"12 are independent and Хц ~ CHq(n,a — \{ν~4),β- \(p-q),kind 1), and X22-i ~CHp-q(n- \q,a-\q,fi- \q,kind Ϊ).
294 CHAPTER 8. MISCELLANEOUS DISTRIBUTIONS Proof: The theorem can be proved by using the integral representation of 1F1 in (8.10.1) and integrating with respect to Xu- ■ THEOREM 8.10.6. Let X ~ CHp(n,a,p,kind 1). Then for a constant matrix A(qx p), with rank(A) = q<p, AX A ~ CHq(n, a-\(p- q),p - \{p - q); (AA!)~l, kind 1). COROLLARY 8.10.6.1. Let X ~ CHp(n,a,fi,kind Ϊ). Then, for αφ 0, ^ ~ СНх{ща- I(p- i),/3- I(p- i),fcjnd l). In the above corollary it is clear that the distribution of ^f does not depend on a. Thus, for any random vector y(px 1) distributed independently of X with P(y φ 0) = 1,*2* ~ СНх{п,а- \{p ~ 1),/? - \{v~ l),kind 1). THEOREM 8.10.7. Let X ~ CHp(n,a,p,kind 1), and A(q χ ρ) be a constant matrix of rank q <p. Then (AX~lA')~l ~ CHq(n- \(p-q),ot- \(p- q),fi- \{p — q);AA',kind 1). COROLLARY 8.10.7.1. Let X ~ CHp(n, α,/З, fend i). ТЛеп, for α φ 0; ^^ ~ Ctf^n - ±(p- l),a- \{p- IIP- \{p- l\kmdi). THEOREM 8.10.8. Let X ~ C#p(n, a,/3, fend 1). Then (i) E[CK(X)} = (Гг(~ f ^ У?(^ ^У^^^С^)^ Re(a -*)> \iP " 1) + *ь and (ii) E[CK(X-1)} = iw^Vwi^ C*(/p)' Re(n) > ^ " 1} + fcl· (/7 - nj^^-n + 5(p + 1))K 2 THEOREM 8.10.9. Lei X - C#p(n, a, /3; a"1 A, fend i). Then X Д 7o5 a -+ oo, гуДеге £Ле р. <£/. о/ У гя ^гЪеп by г^гГт detiArdetiy)"-^4оВД-^У), ^ > О, Γρ(η)Γρ(/?) гуДеге "X —> Υ " denotes convergence in distribution. If, on the other hand, X ~ CHp(n, α, β\ β A, kind 1), then X Д У as β -¥ oo, where Υ ~ <2£7/(η, α - η; Α"*, 0). These two results were obtained by van der Merwe and Roux (1974) by using the confluence relations given in Chapter 1.
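For $p = 1$, Theorem 8.10.1 says that $X = W/U$, with $W$ gamma and $U$ beta, follows $CH_1(n, a+n, a+b+n, \text{kind 1})$. The sketch below compares the gamma-ratio moment formula of Problem 8.9 (with $p = 1$) against the moments computed directly from independence, and against a seeded Monte Carlo estimate of the mean. The function names and parameter values are illustrative assumptions; the simulation is a plausibility check only.

```python
import math
import random

def ch1_moment(h, n, alpha, beta):
    """E[X^h] for p = 1 from the gamma-ratio formula of Problem 8.9."""
    g = math.gamma
    return (g(beta - n) * g(n + h) * g(alpha - n - h)
            / (g(n) * g(alpha - n) * g(beta - n - h)))

def ratio_moment(h, n, a, b):
    """E[(W/U)^h] for independent W ~ Gamma(n, 1) and U ~ Beta(a, b)."""
    g = math.gamma
    e_w = g(n + h) / g(n)                                  # E[W^h]
    e_u_inv = g(a - h) * g(a + b) / (g(a) * g(a + b - h))  # E[U^(-h)], needs h < a
    return e_w * e_u_inv

def simulate_mean(n, a, b, draws=200_000, seed=0):
    """Seeded Monte Carlo estimate of E[W/U]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        total += rng.gammavariate(n, 1.0) / rng.betavariate(a, b)
    return total / draws

if __name__ == "__main__":
    n, a, b = 2.5, 6.0, 3.0  # hypothetical parameter values
    print(ch1_moment(1, n, a + n, a + b + n), ratio_moment(1, n, a, b))
    print(simulate_mean(n, a, b))
```

With these values the first moment is $n(a+b-1)/(a-1)$, which gives a closed form to compare against.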
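8.11. CONFLUENT HYPERGEOMETRIC FUNCTION KIND 2 DISTRIBUTION

In Section 8.10 we defined the confluent hypergeometric function kind 1 distribution. In this section we study certain distributions which correspond to the confluent hypergeometric function of kind 2 defined in Chapter 1.

DEFINITION 8.11.1. A symmetric random matrix $X$ ($p \times p$) is said to have a confluent hypergeometric function kind 2 and type I distribution if its p.d.f. is given by
$$\frac{\Gamma_p(\alpha)\,\Gamma_p[\alpha - \beta + \tfrac12(p+1)]}{\Gamma_p(n)\,\Gamma_p(\alpha - n)\,\Gamma_p[n - \beta + \tfrac12(p+1)]}\, \det(X)^{n - \frac12(p+1)}\, \Psi(\alpha, \beta; X), \quad X > 0,$$
where $\operatorname{Re}(n, \alpha - n) > \tfrac12(p-1)$ and $\operatorname{Re}(n - \beta) > -1$. The parameters $n$, $\alpha$, and $\beta$ are restricted to take values such that the density function is non-negative.

This distribution will be denoted by $CH_p^{I}(n, \alpha, \beta, \text{kind 2})$. By transforming $X = A^{1/2} Y A^{1/2}$, ($A > 0$), with the Jacobian $J(X \to Y) = \det(A)^{\frac12(p+1)}$, the density of $Y$ is
$$\frac{\Gamma_p(\alpha)\,\Gamma_p[\alpha - \beta + \tfrac12(p+1)]}{\Gamma_p(n)\,\Gamma_p(\alpha - n)\,\Gamma_p[n - \beta + \tfrac12(p+1)]}\, \det(A)^n \det(Y)^{n - \frac12(p+1)}\, \Psi(\alpha, \beta; AY), \quad Y > 0.$$
This distribution will be denoted by $CH_p^{I}(n, \alpha, \beta; A, \text{kind 2})$.

THEOREM 8.11.1. Let $W \sim G_p(n, I_p)$ and $V \sim B_p^{II}(a,b)$ be independent. Then $X = V^{1/2} W V^{1/2} \sim CH_p^{I}(n, b+n, n - a + \tfrac12(p+1), \text{kind 2})$.

Proof: Making the transformation $X = V^{1/2} W V^{1/2}$ with the Jacobian $J(W \to X) = \det(V)^{-\frac12(p+1)}$ in the joint density of $V$ and $W$, and then integrating with respect to $V$ (after replacing $V$ by $V^{-1}$ in the integral), we get the density of $X$ as
$$\frac{\det(X)^{n - \frac12(p+1)}}{\beta_p(a,b)\,\Gamma_p(n)} \int_{V > 0} \operatorname{etr}(-VX) \det(V)^{b + n - \frac12(p+1)} \det(I_p + V)^{-(a+b)}\,dV = \frac{\Gamma_p(a+b)\,\Gamma_p(b+n)}{\Gamma_p(n)\,\Gamma_p(a)\,\Gamma_p(b)}\, \det(X)^{n - \frac12(p+1)}\, \Psi\!\left(b+n, n - a + \tfrac12(p+1); X\right), \quad X > 0.$$
The last step is obtained from Definition 1.6.13. ■

If $X \sim CH_p^{I}(n, \alpha, \beta, \text{kind 2})$, then the Laplace transform of its density is
$$\frac{\Gamma_p(\alpha)\,\Gamma_p[\alpha - \beta + \tfrac12(p+1)]}{\Gamma_p(\alpha - n)\,\Gamma_p[\alpha + n - \beta + \tfrac12(p+1)]}\, {}_2F_1\!\left(n - \beta + \tfrac12(p+1), n; \alpha + n - \beta + \tfrac12(p+1); I_p - Z\right), \tag{8.11.1}$$
where $\operatorname{Re}(I_p - Z) < I_p$, $\operatorname{Re}(n - \beta) > -1$ and $\operatorname{Re}(\alpha) > \tfrac12(p-1)$.

Now we give certain properties of the confluent hypergeometric function kind 2 and type I distribution.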
296 CHAPTER 8. MISCELLANEOUS DISTRIBUTIONS THEOREM 8.11.2. Let X ~ CHfo,a,P,kind 2), and A be any ρ χ ρ constant nonsingular matrix. Then AX A' ~ СЩ(п, α, β, (ΑΑ')~ι, hind 2). THEOREM 8.11.3. Let X ~ CH£(n,a,P,kind 2), and Η (ρ χ ρ) be an orthogonal matrix whose elements are either constants or random variables distributed independently of X. Then, the distribution of X is invariant under the transformation X —> HXH', and is independent of Η in the latter case. THEOREM 8.11.4. Let X ~ ΟΗ^η,α,β, kind 2), and partition X as -Xll -^12 \ Q x= . ^21 -^22 J P — Q q p-q Then Xu and X22-i = ^22 ~~ ХцХпХп are independent, Хц ~ СН^(п,а — \{p — q\ β-l(p- q), kind 2), and Х22Л ~ СЩ_д(п -\q,a- \q, β-^p-q), kind 2). THEOREM 8.11.5. Let X ~ СЩ(п,а,Р,Ыпа 2). Then for a constant matrix A(qx p), with rank(A) = q < p, AX A! ~ СЩ(п, a-\{p-q)^-\{p-q)', {AA!)~l, kind 2). COROLLARY 8.11.5.1. Let X ~ СЩ(п,а,Р,Ыпа 2). Then, for a фО, ^ ~ CH[(n,a- l-(p-l)^-l-{p-l),kmd 2). In the above corollary it is noted that the distribution of 9^L does not depend on a. Thus, for any random vector y(pxl) distributed independently of X with P(y φ 0) = 1,*** ~ СН((щ а-\{р- 1),/? - \{ρ - l),kind 2). THEOREM 8.11.6. Let X ~ CH*(n,a,P,kind 2), and A(q χ ρ) be a constant matrix of rank q < p. Then {AX~lA!)-1 ~ CE{{n - \{p - <?), α - \{p -q),fi-\{p- q)\AA!,kind 2). COROLLARY 8.11.6.1. Let X ~ СЩ(п,а,Р,Ыпа 2). Then, for αφ 0, ^^^СЯ[(п-^(р-1),а-Ь(р-1),/3-^(р-1),Ьп^). Next, we define confluent hypergeometric function kind 2 and type II distribution and study its properties. DEFINITION 8.11.2. A random symmetric matrix Χ (ρ χ ρ) is said to have a confluent hypergeometric function kind 2 and type II distribution, if its p.d.f. is given rtr fn+ Ж1 **(*Г »™ etr(-X) *(«,/?; X), X > 0, Γρ{η}Γρ[η - β + ^{p + 1)J where Re(n,a) > \{p - 1) and Re(n - β) > — 1.
8.11. CONFLUENT HYPERGEOMETRIC FUNCTION KIND 2 DISTRIBUTION 297 The parameters n, a and β are restricted to take values such that the density function is non-negative. We wiil denote this distribution by СЩ^п, α, β, kind 2). By transforming X = Α$ΥΑ*, (A > 0) with the Jacobian J(X -> Y) = det(A)^1), we get the density of У as ГЫГ i+ UL 1Ч,+Л det^ ^(УГ >^> etr(-Ar) Φ(α, β; ΑΥ\ Υ > 0, Гр(п)Гр[п - β + 2 (ρ + 1)J Re(n) > |(ρ - 1), Re(a) > |(ρ - 1), Re(n - /3) > -1. This distribution will be denoted by CH™(n, a,/3; A,kind 2). THEOREM 8.11.7. Lei V^ ~ Gp(n,Ip) and U ~ £p(a,&) be independent Then X = U*WU* ~ CH^(n,a,n- b+ \{j> + \),kind 2). Proof: See Khattree and R. D. Gupta (1989). ■ THEOREM 8.11.8. Let X ~ СЯ^(п,а,/3, kind 2). Then its т.д./. is Μχ(Ζ) = 2Γι{η-β+^(ρ+1),η;α + η-β+^(ρ+1);Ζ). Proof: The m.g.f. of X is given by MX(Z) = Γρ[α-/? + η+|(ρ+1)] Γρ(η)Γρ[η-/? + ±(ρ+1)] ί etr{-(/p - Z)X}det(X)n-ib+V Ψ(α,β;Χ)άΧ Jx>o = 2F1{n-p+hp+l),n;a + n-p+^(p+l);Z), Re(Ip - Z) > 0, Re(a) > -(p - 1), Re(n - β) > -1. The last step has been obtained by using the Laplace transform of confluent hyper- geometric function given in Problem 1.23 in Chapter 1. ■ The m.g.f. of tr(X), from the above theorem, is derived as Mtl{X)(Z) = 2F1(n-p+^(p+l),n;a + n-p + i(p + 1); zlp) fc=o к (n + a-/?+i(i)+l))K fc! ,M V 7 From (8.11.2) the fcth moment of tr(X) is obtained as Pi^v^l ν (η)κ(η-β+ \{p+ 1)),^ ,rN ,, 9 £Mi=i:(n+a_i+i(p+1))5Ufc=u-,--
THEOREM 8.11.9. Let $X \sim CH_p^{II}(n, \alpha, \beta, \text{kind 2})$, and let $A$ be any $p \times p$ constant nonsingular matrix. Then $AXA' \sim CH_p^{II}(n, \alpha, \beta; (AA')^{-1}, \text{kind 2})$.

THEOREM 8.11.10. Let $X \sim CH_p^{II}(n, \alpha, \beta, \text{kind 2})$, and let $H$ ($p \times p$) be an orthogonal matrix whose elements are either constants or random variables distributed independently of $X$. Then the distribution of $X$ is invariant under the transformation $X \to HXH'$, and is independent of $H$ in the latter case.

Khattree and R. D. Gupta (1989) have derived many results on expectations using Theorem 8.11.8. They have shown that if $X \sim CH_p^{II}(n, a, n - b + \tfrac12(p+1), \text{kind 2})$, then for a vector $t$ ($p \times 1$) $\ne 0$,
$$\frac{t'Xt}{t't} \sim CH_1^{II}(n, a, n - b + 1, \text{kind 2}) \quad \text{and} \quad \frac{t't}{t'X^{-1}t} \sim CH_1^{II}\!\left(n - \tfrac12(p-1), a, n - b + 1, \text{kind 2}\right).$$

8.12. HYPERGEOMETRIC FUNCTION DISTRIBUTIONS

In this section we give hypergeometric function distributions of two types. First we define the hypergeometric function distribution of type I.

DEFINITION 8.12.1. A random symmetric matrix $X$ ($p \times p$) is said to have a hypergeometric function distribution of type I if its p.d.f. is given by
$$\frac{\Gamma_p(\gamma + n - \alpha)\,\Gamma_p(\gamma + n - \beta)}{\Gamma_p(\gamma)\,\Gamma_p(n)\,\Gamma_p(\gamma + n - \alpha - \beta)}\, \det(X)^{n - \frac12(p+1)} \det(I_p - X)^{\gamma - \frac12(p+1)}\, {}_2F_1(\alpha, \beta; \gamma; I_p - X), \quad 0 < X < I_p, \tag{8.12.1}$$
where $\operatorname{Re}(\gamma + n - \alpha - \beta) > \tfrac12(p-1)$, $\operatorname{Re}(\gamma) > \tfrac12(p-1)$ and $\operatorname{Re}(n) > \tfrac12(p-1)$. The parameters $\alpha$, $\beta$, $\gamma$ and $n$ are restricted to take values such that the density function is non-negative.

We will denote this distribution by $H_p^{I}(n, \alpha, \beta, \gamma)$. For $\alpha = \gamma$, the density (8.12.1) reduces to
$$\{\beta_p(\gamma, n - \beta)\}^{-1} \det(X)^{n - \beta - \frac12(p+1)} \det(I_p - X)^{\gamma - \frac12(p+1)}, \quad 0 < X < I_p,$$
and for $\beta = \gamma$, the hypergeometric function density of type I (8.12.1) reduces to
$$\{\beta_p(n - \alpha, \gamma)\}^{-1} \det(X)^{n - \alpha - \frac12(p+1)} \det(I_p - X)^{\gamma - \frac12(p+1)}, \quad 0 < X < I_p.$$

THEOREM 8.12.1. Let $U \sim B_p^{I}(a,b)$ and $V \sim B_p^{I}(c,d)$ be independent. Then $Z = U^{1/2} V U^{1/2} \sim H_p^{I}(c, b, c + d - a, b + d)$.
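Proof: The joint density of $U$ and $V$ is given by
$$\{\beta_p(a,b)\,\beta_p(c,d)\}^{-1} \det(U)^{a - \frac12(p+1)} \det(I_p - U)^{b - \frac12(p+1)} \det(V)^{c - \frac12(p+1)} \det(I_p - V)^{d - \frac12(p+1)}, \quad 0 < U < I_p,\; 0 < V < I_p.$$
Making the transformation $Z = U^{1/2} V U^{1/2}$, with Jacobian $J(V \to Z) = \det(U)^{-\frac12(p+1)}$, and integrating out $U$ from the joint density of $U$ and $Z$, we get the marginal density of $Z$ as
$$\{\beta_p(a,b)\,\beta_p(c,d)\}^{-1} \det(Z)^{c - \frac12(p+1)} \int_{Z < U < I_p} \det(U)^{a - c - \frac12(p+1)} \det(I_p - U)^{b - \frac12(p+1)} \det(I_p - U^{-1}Z)^{d - \frac12(p+1)}\,dU. \tag{8.12.2}$$
Now substituting $A = (I_p - Z)^{-1/2}(I_p - U)(I_p - Z)^{-1/2}$, (8.12.2) becomes
$$\{\beta_p(a,b)\,\beta_p(c,d)\}^{-1} \det(Z)^{c - \frac12(p+1)} \det(I_p - Z)^{b + d - \frac12(p+1)} \int_{0 < A < I_p} \det(A)^{b - \frac12(p+1)} \det(I_p - A)^{d - \frac12(p+1)} \det(I_p - (I_p - Z)A)^{-(c + d - a)}\,dA$$
$$= \frac{\beta_p(b,d)}{\beta_p(a,b)\,\beta_p(c,d)}\, \det(Z)^{c - \frac12(p+1)} \det(I_p - Z)^{b + d - \frac12(p+1)}\, {}_2F_1(b, c + d - a; b + d; I_p - Z), \quad 0 < Z < I_p.$$
The last step has been obtained by using Corollary 1.6.3.2. ■

For $c = a + b$, the above theorem gives $Z \sim B_p^{I}(a, b + d)$, as proved in Theorem 5.3.25.

The m.g.f. of $X \sim H_p^{I}(n, \alpha, \beta, \gamma)$ is given by
$$M_X(Z) = \frac{\Gamma_p(\gamma + n - \alpha)\,\Gamma_p(\gamma + n - \beta)}{\Gamma_p(\gamma)\,\Gamma_p(n)\,\Gamma_p(\gamma + n - \alpha - \beta)} \int_{0 < X < I_p} \operatorname{etr}(ZX) \det(X)^{n - \frac12(p+1)} \det(I_p - X)^{\gamma - \frac12(p+1)}\, {}_2F_1(\alpha, \beta; \gamma; I_p - X)\,dX$$
$$= \frac{\Gamma_p(\gamma + n - \alpha)\,\Gamma_p(\gamma + n - \beta)}{\Gamma_p(\gamma)\,\Gamma_p(n)\,\Gamma_p(\gamma + n - \alpha - \beta)} \sum_{k=0}^{\infty} \sum_{\kappa} \frac{1}{k!} \int_{0 < Y < I_p} C_\kappa(Z(I_p - Y)) \det(Y)^{\gamma - \frac12(p+1)} \det(I_p - Y)^{n - \frac12(p+1)}\, {}_2F_1(\alpha, \beta; \gamma; Y)\,dY$$
$$= {}_2F_2(n, \gamma + n - \alpha - \beta; \gamma + n - \alpha, \gamma + n - \beta; Z),$$
where the last step is derived by using the result of Problem 1.16.

van der Merwe and Roux (1974), using Gauss' hypergeometric function, have defined the following density.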
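DEFINITION 8.12.2. A random symmetric matrix $X$ ($p \times p$) is said to have a hypergeometric function distribution of type II if its p.d.f. is given by
$$\frac{\Gamma_p(\alpha)\,\Gamma_p(\beta)\,\Gamma_p(\gamma - n)}{\Gamma_p(n)\,\Gamma_p(\gamma)\,\Gamma_p(\alpha - n)\,\Gamma_p(\beta - n)}\, \det(A)^n \det(X)^{n - \frac12(p+1)}\, {}_2F_1(\alpha, \beta; \gamma; -AX), \quad X > 0, \tag{8.12.3}$$
where $A > 0$, $\operatorname{Re}(\gamma - n) > \tfrac12(p-1)$, $\operatorname{Re}(\alpha - n) > \tfrac12(p-1)$ and $\operatorname{Re}(\beta - n) > \tfrac12(p-1)$. The parameters $n$, $\alpha$, $\beta$ and $\gamma$ are restricted to take values such that the density function is non-negative.

Denote the above distribution by $H_p^{II}(n, \alpha, \beta, \gamma; A)$. The c.d.f. of the random matrix $X$ can be shown to be
$$P(X < \Omega) = \frac{\Gamma_p(\alpha)\,\Gamma_p(\beta)\,\Gamma_p(\gamma - n)\,\Gamma_p[\tfrac12(p+1)]}{\Gamma_p(\gamma)\,\Gamma_p(\alpha - n)\,\Gamma_p(\beta - n)\,\Gamma_p[n + \tfrac12(p+1)]}\, \det(A)^n \det(\Omega)^n\, {}_3F_2\!\left(\alpha, \beta, n; n + \tfrac12(p+1), \gamma; -A\Omega\right), \tag{8.12.4}$$
where ${}_3F_2(\cdot)$ is the generalized hypergeometric function defined in Chapter 1. (The factor $\Gamma_p(n)$ of the density cancels against the one produced by integrating $\det(X)^{n - \frac12(p+1)} C_\kappa(X)$ over $0 < X < \Omega$, so it does not appear in the denominator.)

For $\beta = \gamma$, the density (8.12.3) reduces to
$$\{\beta_p(n, \alpha - n)\}^{-1} \det(A)^n \det(X)^{n - \frac12(p+1)} \det(I_p + AX)^{-\alpha}, \quad X > 0,$$
which is the density of the generalized beta type II distribution $GB_p^{II}(n, \alpha - n; A^{-1}, 0)$.

From the confluence relation (1.6.12) it follows that if $X \sim H_p^{II}(n, \alpha, \beta, \gamma; \beta^{-1}A)$, then $X \to Y$ in distribution as $\beta \to \infty$, where $Y \sim CH_p(n, \alpha, \gamma; A, \text{kind 1})$.

The following theorem gives the hypergeometric function distribution of type II as the distribution of a ratio of two independent beta matrices.

THEOREM 8.12.2. Let $U \sim B_p^{I}(a,b)$ and $V \sim B_p^{II}(c,d)$ be independent. Then $X = U^{-1/2} V U^{-1/2} \sim H_p^{II}(c, a + c, c + d, a + b + c; I_p)$.

Proof: The joint density of $U$ and $V$ is given by
$$\{\beta_p(a,b)\,\beta_p(c,d)\}^{-1} \det(U)^{a - \frac12(p+1)} \det(I_p - U)^{b - \frac12(p+1)} \det(V)^{c - \frac12(p+1)} \det(I_p + V)^{-(c+d)}, \quad 0 < U < I_p,\; V > 0.$$
Transforming $X = U^{-1/2} V U^{-1/2}$, with the Jacobian $J(V \to X) = \det(U)^{\frac12(p+1)}$, and integrating out $U$ from the joint density of $U$ and $X$, we get the marginal density of $X$ as
$$\{\beta_p(a,b)\,\beta_p(c,d)\}^{-1} \det(X)^{c - \frac12(p+1)} \int_{0 < U < I_p} \det(U)^{a + c - \frac12(p+1)} \det(I_p - U)^{b - \frac12(p+1)} \det(I_p + XU)^{-(c+d)}\,dU$$
$$= \frac{\Gamma_p(a+b)\,\Gamma_p(a+c)\,\Gamma_p(c+d)}{\Gamma_p(a)\,\Gamma_p(c)\,\Gamma_p(d)\,\Gamma_p(a+b+c)}\, \det(X)^{c - \frac12(p+1)}\, {}_2F_1(a + c, c + d; a + b + c; -X), \quad X > 0.$$
The last step has been obtained by using Corollary 1.6.3.2. ■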
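For $p = 1$, Theorem 8.12.2 and the c.d.f. of the type II distribution can be checked together: simulate $X = V/U$ and compare the empirical probability $P(X < \omega)$ with the ${}_3F_2$ expression, which for $p = 1$ has constant $\Gamma(\alpha)\Gamma(\beta)\Gamma(\gamma-n) / \{\Gamma(\gamma)\Gamma(\alpha-n)\Gamma(\beta-n)\Gamma(n+1)\}$. The sketch below is a hedged illustration: the series routine, function names, and parameter values are assumptions, and the power series is only used for $0 < \omega < 1$, where it converges.

```python
import math
import random

def hyp3f2(a1, a2, a3, b1, b2, z, terms=400):
    """Truncated power series for 3F2(a1, a2, a3; b1, b2; z), valid for |z| < 1."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (a1 + k) * (a2 + k) * (a3 + k) / ((b1 + k) * (b2 + k)) * z / (k + 1)
    return total

def h2_cdf(w, n, alpha, beta, gam):
    """P(X < w) for the p = 1 type II distribution with A = 1 and 0 < w < 1."""
    g = math.gamma
    const = (g(alpha) * g(beta) * g(gam - n)
             / (g(gam) * g(alpha - n) * g(beta - n) * g(n + 1.0)))
    return const * w ** n * hyp3f2(alpha, beta, n, n + 1.0, gam, -w)

def simulated_cdf(w, a, b, c, d, draws=200_000, seed=1):
    """Empirical P(V/U < w) with U ~ Beta(a, b) and V ~ beta type II (c, d)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        u = rng.betavariate(a, b)
        # A beta type II variate is a ratio of independent gamma variates.
        v = rng.gammavariate(c, 1.0) / rng.gammavariate(d, 1.0)
        hits += (v / u) < w
    return hits / draws

if __name__ == "__main__":
    a, b, c, d = 3.0, 2.0, 2.0, 3.0  # hypothetical parameter values
    print(h2_cdf(0.5, c, a + c, c + d, a + b + c), simulated_cdf(0.5, a, b, c, d))
```

Agreement between the two numbers within Monte Carlo error supports both the parameter mapping $(c, a+c, c+d, a+b+c)$ and the form of the c.d.f. constant.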
8.13. GENERALIZED HYPERGEOMETRIC FUNCTION DISTRIBUTIONS

Roux (1971), by multiplying Wishart, beta, and Dirichlet densities by a generalized hypergeometric function, has defined a number of densities which are given in this section.

(i) The random symmetric matrix $X$ ($p \times p$) is said to have a Generalized Hypergeometric Function (GHF) type I distribution if its density is
$$\frac{\operatorname{etr}(-X) \det(X)^{n - \frac12(p+1)}}{\Gamma_p(n)\; {}_{r+1}F_s(a_1, \ldots, a_r, n; b_1, \ldots, b_s; \Theta)}\; {}_rF_s(a_1, \ldots, a_r; b_1, \ldots, b_s; \Theta X), \quad X > 0, \tag{8.13.1}$$
where $n > \tfrac12(p-1)$ and the parameters $a_i$, $b_j$, $i = 1, \ldots, r$, $j = 1, \ldots, s$, are restricted to take those values for which the density function is non-negative.

For $\Theta = 0$, we get the Wishart density, $W_p(2n, \tfrac12 I_p)$, as a special case of the above density. When $r = 0$, $s = 1$, and $b_1 = n$ we get
$$\frac{\operatorname{etr}(-\Theta)}{\Gamma_p(n)}\, \operatorname{etr}(-X) \det(X)^{n - \frac12(p+1)}\, {}_0F_1(n; \Theta X), \tag{8.13.2}$$
which is the noncentral Wishart density, $W_p(2n, \tfrac12 I_p, \Theta)$. For $\Theta = I_p$, $r = s = 1$, the density (8.13.1) reduces to
$$\frac{\operatorname{etr}(-X) \det(X)^{n - \frac12(p+1)}}{\Gamma_p(n)\; {}_2F_1(a_1, n; b_1; I_p)}\; {}_1F_1(a_1; b_1; X).$$
Simplifying this density using the results
$${}_2F_1(a_1, n; b_1; I_p) = \frac{\Gamma_p(b_1)\,\Gamma_p(b_1 - a_1 - n)}{\Gamma_p(b_1 - a_1)\,\Gamma_p(b_1 - n)}$$
and
$${}_1F_1(a_1; b_1; X) = \operatorname{etr}(X)\, {}_1F_1(b_1 - a_1; b_1; -X),$$
it can be easily seen that $X \sim CH_p(n, b_1 - a_1, b_1, \text{kind 1})$.

(ii) The random symmetric matrix $X$ ($p \times p$) is said to have a Generalized Hypergeometric Function (GHF) type II distribution if its density is
$$\frac{\Gamma_p(m + n) \det(X)^{m - \frac12(p+1)} \det(I_p - X)^{n - \frac12(p+1)}}{\Gamma_p(m)\,\Gamma_p(n)\; {}_{r+1}F_{s+1}(a_1, \ldots, a_r, m; b_1, \ldots, b_s, m + n; \Theta)}\; {}_rF_s(a_1, \ldots, a_r; b_1, \ldots, b_s; \Theta X), \quad 0 < X < I_p, \tag{8.13.3}$$
where $\Theta = \Theta'$, $m > \tfrac12(p-1)$, $n > \tfrac12(p-1)$ and the parameters $a_i$, $b_j$, $i = 1, \ldots, r$, $j = 1, \ldots, s$, are restricted to take those values for which the density function is non-negative.
For $\Theta = 0$, the GHF type II density (8.13.3) reduces to the matrix variate beta type I density with parameters $m$ and $n$. For $r = s = 1$, $a_1 = m + n$, and $b_1 = m$ we get the density of $X$ as
$$\frac{\operatorname{etr}(-\Theta)}{\beta_p(m,n)}\, \det(X)^{m - \frac12(p+1)} \det(I_p - X)^{n - \frac12(p+1)}\, {}_1F_1(m + n; m; \Theta X), \quad 0 < X < I_p,$$
which is the noncentral matrix variate beta type I(B) density with parameters $m$, $n$, and $\Theta$.

(iii) The random symmetric matrix $X$ ($p \times p$) is said to have a Generalized Hypergeometric Function (GHF) type III distribution if its density is
$$\frac{\Gamma_p(m + n) \det(X)^{m - \frac12(p+1)} \det(I_p + X)^{-(m+n)}}{\Gamma_p(m)\,\Gamma_p(n)\; {}_{r+1}F_{s+1}(a_1, \ldots, a_r, n; b_1, \ldots, b_s, m + n; \Theta)}\; {}_rF_s(a_1, \ldots, a_r; b_1, \ldots, b_s; \Theta(I_p + X)^{-1}), \quad X > 0, \tag{8.13.4}$$
where $\Theta = \Theta'$, $m > \tfrac12(p-1)$, $n > \tfrac12(p-1)$ and the parameters $a_i$, $b_j$, $i = 1, \ldots, r$, $j = 1, \ldots, s$, are restricted to take those values for which the density function is non-negative.

For $\Theta = 0$, the GHF type III density (8.13.4) reduces to the matrix variate beta type II density with parameters $m$ and $n$. For $r = s = 1$, $a_1 = m + n$, and $b_1 = n$ we get the density of $X$ as
$$\frac{\operatorname{etr}(-\Theta)}{\beta_p(m,n)}\, \det(X)^{m - \frac12(p+1)} \det(I_p + X)^{-(m+n)}\, {}_1F_1(m + n; n; \Theta(I_p + X)^{-1}), \quad X > 0,$$
which is the noncentral matrix variate beta type II(A) density with parameters $m$, $n$, and $\Theta$ (see Theorem 5.5.3).

(iv) The $p \times p$ random symmetric matrices $X_1, \ldots, X_q$ are said to have a Generalized Hypergeometric Function (GHF) type IV distribution if their joint density is
$$\frac{\Gamma_p(m + n) \prod_{k=1}^{q} \det(X_k)^{m_k - \frac12(p+1)} \det\!\left(I_p - \sum_{k=1}^{q} X_k\right)^{n - \frac12(p+1)}}{\prod_{k=1}^{q} \Gamma_p(m_k)\,\Gamma_p(n)\; {}_{r+1}F_{s+1}(a_1, \ldots, a_r, m; b_1, \ldots, b_s, m + n; \Theta)}\; {}_rF_s\!\left(a_1, \ldots, a_r; b_1, \ldots, b_s; \Theta \sum_{k=1}^{q} X_k\right), \quad \sum_{k=1}^{q} X_k < I_p,\; X_k > 0,\; k = 1, \ldots, q, \tag{8.13.5}$$
where $m = \sum_{k=1}^{q} m_k$, $\Theta = \Theta'$, $m_k > \tfrac12(p-1)$, $k = 1, \ldots, q$, $n > \tfrac12(p-1)$, and the parameters $a_i$, $b_j$, $i = 1, \ldots, r$, $j = 1, \ldots, s$, are restricted to take those values for which the density function is non-negative.

Note that for $q = 1$, the GHF type IV density (8.13.5) becomes the GHF type II density (8.13.3).
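Taking $q = 1$ and $p = 1$, the $r = s = 1$, $a_1 = m + n$, $b_1 = m$ case of the GHF type II density is the scalar density $e^{-\theta} x^{m-1}(1-x)^{n-1}\,{}_1F_1(m+n; m; \theta x)/B(m,n)$ on $(0,1)$. The sketch below checks numerically that it integrates to 1, which is what the identity ${}_2F_2(m+n, m; m, m+n; \theta) = e^{\theta}$ implies. The series routine, the midpoint rule, and the parameter values are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def hyp1f1(a, b, z, terms=200):
    """Truncated power series for the scalar confluent function 1F1(a; b; z)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (a + k) / (b + k) * z / (k + 1)
    return total

def noncentral_beta_density(x, m, n, theta):
    """Scalar (p = 1) noncentral beta type I(B) density on (0, 1)."""
    beta_mn = math.gamma(m) * math.gamma(n) / math.gamma(m + n)
    return (math.exp(-theta) / beta_mn
            * x ** (m - 1) * (1 - x) ** (n - 1)
            * hyp1f1(m + n, m, theta * x))

def total_mass(m, n, theta, steps=2000):
    """Midpoint-rule approximation of the integral of the density over (0, 1)."""
    h = 1.0 / steps
    return h * sum(
        noncentral_beta_density((k + 0.5) * h, m, n, theta) for k in range(steps)
    )

if __name__ == "__main__":
    print(total_mass(2.0, 3.0, 1.5))  # hypothetical m, n, theta
    print(total_mass(2.0, 3.0, 0.0))  # central case: ordinary beta density
```

Shape parameters of at least 1 are chosen so that the integrand stays bounded at the endpoints and the plain midpoint rule is adequate.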
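For $\Theta = 0$, the GHF type IV density (8.13.5) reduces to the matrix variate Dirichlet type I density with parameters $m_1, \ldots, m_q$ and $n$. For $r = s = 1$, $a_1 = m + n$, and $b_1 = m$ we get the joint density of $X_1, \ldots, X_q$ as
$$\frac{\operatorname{etr}(-\Theta)\,\Gamma_p(m + n)}{\prod_{k=1}^{q} \Gamma_p(m_k)\,\Gamma_p(n)} \prod_{k=1}^{q} \det(X_k)^{m_k - \frac12(p+1)} \det\!\left(I_p - \sum_{k=1}^{q} X_k\right)^{n - \frac12(p+1)} {}_1F_1\!\left(m + n; m; \Theta \sum_{k=1}^{q} X_k\right), \quad \sum_{k=1}^{q} X_k < I_p,\; X_k > 0,\; k = 1, \ldots, q,$$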
which is the noncentral matrix variate Dirichlet type I density with parameters $m_1, \ldots, m_q$; $n$, and $\Theta$.

(v) The $p \times p$ random symmetric matrices $X_1, \ldots, X_q$ are said to have a Generalized Hypergeometric Function (GHF) type V distribution if their joint density is
$$\frac{\Gamma_p(m + n) \prod_{k=1}^{q} \det(X_k)^{m_k - \frac12(p+1)} \det\!\left(I_p + \sum_{k=1}^{q} X_k\right)^{-(m+n)}}{\prod_{k=1}^{q} \Gamma_p(m_k)\,\Gamma_p(n)\; {}_{r+1}F_{s+1}(a_1, \ldots, a_r, n; b_1, \ldots, b_s, m + n; \Theta)}\; {}_rF_s\!\left(a_1, \ldots, a_r; b_1, \ldots, b_s; \Theta\Big(I_p + \sum_{k=1}^{q} X_k\Big)^{-1}\right), \quad X_k > 0,\; k = 1, \ldots, q, \tag{8.13.6}$$
where $m = \sum_{k=1}^{q} m_k$, $\Theta = \Theta'$, $m_k > \tfrac12(p-1)$, $k = 1, \ldots, q$, $n > \tfrac12(p-1)$, and the parameters $a_i$, $b_j$, $i = 1, \ldots, r$, $j = 1, \ldots, s$, are restricted to take those values for which the density function is non-negative.

Note that for $q = 1$, the GHF type V density (8.13.6) becomes the GHF type III density (8.13.4).

For $\Theta = 0$, the GHF type V density reduces to the matrix variate Dirichlet type II density with parameters $m_1, \ldots, m_q$ and $n$. For $r = s = 1$, $a_1 = m + n$, and $b_1 = n$ we get the joint density of $X_1, \ldots, X_q$ as
$$\frac{\operatorname{etr}(-\Theta)\,\Gamma_p(m + n)}{\prod_{k=1}^{q} \Gamma_p(m_k)\,\Gamma_p(n)} \prod_{k=1}^{q} \det(X_k)^{m_k - \frac12(p+1)} \det\!\left(I_p + \sum_{k=1}^{q} X_k\right)^{-(m+n)} {}_1F_1\!\left(m + n; n; \Theta\Big(I_p + \sum_{k=1}^{q} X_k\Big)^{-1}\right), \quad X_k > 0,\; k = 1, \ldots, q,$$
which is the noncentral matrix variate Dirichlet type II density with parameters $m_1, \ldots, m_q$; $n$, and $\Theta$.

8.14. COMPLEX MATRIX VARIATE DISTRIBUTIONS

The complex multivariate distributions play an important role in various fields of research. The complex multivariate Gaussian distribution was introduced by Wooding (1956), Turin (1960), and Goodman (1963a). The complex Wishart distribution was derived by Goodman (1963a) to approximate the distribution of an estimate of the spectral density matrix for a vector valued stationary Gaussian process. In multiple time series analysis, complex multivariate distributions are used to describe estimators of frequency domain parameters. For applications of these distributions in time series analysis, reference may be made to Wahba (1968, 1971), Goodman and Dubman (1969), Hannan (1970), Priestley, Subba Rao and Tong (1973), Brillinger (1969, 1975), and Shaman (1980).
This distribution has also been found useful in nuclear physics in studying the distribution of spacings between energy levels of nuclei in high excitation. For further details reference may be made to Dyson (1962a, 1962b, 1962c), Dyson and Mehta (1963a, 1963b), Bronk (1965), Porter (1965), and Carmeli (1974, 1983).
The complex multivariate elliptically symmetric distribution has been studied by Krishnaiah and Lin (1986) and by Khatri and Bhavsar (1990). This family includes the complex multivariate Gaussian and complex multivariate t-distributions. The joint distributions of the roots of some complex random matrices have been derived by James (1964), Wigner (1965), and Khatri (1965).

Parallel to the real case, substantial work in the complex case has been carried out. The distributions of several test statistics in the complex case have been studied by several authors, e.g., see Goodman (1963b), Khatri (1965, 1969, 1970b), Pillai and Jouris (1971), Nagarsenker and Das (1975), Chikuse (1976), Krishnaiah (1976), Gupta (1971a, 1973, 1976), Fang, Krishnaiah and Nagarsenker (1982), Gupta and Rathie (1983a, 1983b), Gupta and Conradie (1987), Gupta and Nagar (1985, 1987, 1988, 1989, 1992), Nagar, Jain and Gupta (1985), and Nagar and Gupta (1993).

A number of results on the distribution of complex random matrices have also been derived. Srivastava (1965) gave a derivation of the complex Wishart distribution. A characterization of the complex Wishart distribution has been given by Gupta and Kabe (1998). James (1964) and Khatri (1965) derived the complex central as well as the noncentral matrix variate beta distributions. A systematic treatment of the distributions of complex random matrices was given by Tan (1968), which included the Gaussian, Wishart, beta, and Dirichlet distributions.

Kabe (1984) defined the hypercomplex matrix variate Gaussian distribution, which includes Hamilton's quaternions, biquaternions, octonions, and bioctonions. He also studied the corresponding sampling distribution theory. Rautenbach and Roux (1985) have also derived the quaternion distribution and studied its properties.

PROBLEMS

8.1. Let the joint p.d.f.
of the random matrices X\ {ρ χ ρ) and X2 (j> x p) be Π [{2ЬрГр(^) det^)^}"1 etr (- ^Фг1^) det(Xt)^^ ^ν^(αι)*·· ■(«*·)« <?«(θ) A IV1-λ r'rlK-P-il/L-iy^ XuX2>0. Then show that the joint m.g.f. of Χχ and Xi is Псдазд-гФ,^)-1). (Roux and Raath, 1973) 8.2. Prove that (8.10.1) is a density.
PROBLEMS 8.3 305 Let X ~ CHp(m,щ + m, щ + n2 + m, kind 1), where n1? n2 and m are positive integers. Then show that ΕΥγ-ч m(n-p-l) r ^ W = ~ 1—Г" A» ni - ρ - 1 > 0, £(x2) = ярг1) = ni - ρ - 1 m(m+ l)(n — ρ — 1) (πι - ρ)(ηι - ρ - 1)(ηι - ρ - 3) {(ηι - ρ)(ηχ - ρ - 3) + η2(ηι - 1)}7ρ, ηι - ρ - 3 > 0, 2ηι (2m — ρ — 1)π /ρ, 2m - ρ - 1 > 0, 2 = 4ni[(2m - 1){(η+ 1) +η2(ρ+ 1) - 2} + 2η2 - ρ(ρ + 1)η2] , 1 ] (2m-p)(2m-p-l)(2m-p-3)n(n-l)(n + 2) p' 2m - ρ - 3 > 0, where щ +η2 = η. 8.4. Prove Theorem 8.10.4. 8.5. Prove Theorem 8.10.5. 8.6. Prove Theorem 8.10.6. 8.7. Prove Theorem 8.10.7. 8.8. Prove Theorem 8.10.8. 8.9. Let X ~ CHp(n, a, /?, kind 1). Then show that Γρ(β - n)Tp{n + h)Tp(a -n-h) E[det(X)h] = Γρ(η)Γρ(α - η)Τρ(β -n-h) ~n + «(P - 1) < Re(/i) < min[Re(a - n), Re(0 - n)] - -(p - 1). 8.10. Derive the Laplace transform of (8.11.1). 8.11. Prove Theorem 8.11.3. 8.12. Prove Theorem 8 8.13. Prove Theorem 8 8.14. Prove Theorem 8 8.15. Prove Theorem 8. 8.16. Prove Theorem 8 8.17. Prove Theorem 8. 11.4. 11.5. 11.6. 11.7. 11.9. 11.10.
306 CHAPTER 8. MISCELLANEOUS DISTRIBUTIONS 8.18. Let X ~ СЩ'(п, a, n - b + ±(p + 1),kind 2). Then show that m F(r (x-4\ - ГГ (a + t-*i-|(p-j))ti r , . (й)адх-1))= p(e + b-|CP+i)) (iii) E(tr(X-2)) (n-i(p+l))(b-i(p+l))' Re(n)>i(p + l),Re(6)>i(p+l), 2. _ l(a + b-i(p + 3))(g + fc-i(p + l))p(p + 2) (b - I(p + 3))(ft - i(p + l))(n - \{p + 3))(n - I(p + 1)) f(o + fr-|(p+l))(g + fr-ip)p(p-l) (b-I(p+l))(b-ip)(„-i(p+l))(„-ip)' Re(n) > |(p + 3), Re(i>) > i(p + 3), Re(/i) > max [- b + \{p + 1), -n + §(p + 1)]. (Khattree and R. D. Gupta, 1989) 8.19. Let X ~ CH^(n, α, β, kind 2), and partition X as -X"ll -X"l2 \ Q X= . ^21 ^22 у Ρ - 9 Then show that Xu and Χ22·ι = ^22 ~~ ^n^fi1^^ are independently distributed. Furthermore X221 ~ СЯ£ (n - ±9, α, β - \q, kind 2). 8.20. If X ~ CfijJ(n, a, /3, kind 2), then show that FfrWWbl - Γρ(" + *)Γ,(β ~ Π - h)Tp[n - β + \{p + 1) + ft] 1 Ч JJ~ Γρ(η)Γρ(α-η)Γρ[η-/?+§(ρ+1) -η + max [Re(/3 - 1), -(p - 1)] < Re(ft) < Re(a - n) - -(p - 1). 8.21. If X ~ CHjfin, α, β, kind 2), then prove that that £[detpOh] = ,h, _ rp(n+ft)rp[n-/?+i(p+l) + ft] Γρ(η)Γρ[η-/3+|(ρ+1)] ' Re(ft) > -n + max[Re(/?-l),-(p+l)].
PROBLEMS 307 8.22. Let W ~ Gp(mJp) and U ~ #p~(n,a,/?,7) be independent. Then show that the p.d.f. of X = U-3WU~i is ГР(7 + η - а)Гр(7 + η - β)Γρ(η + m)Tp(j + η + τη-α-β) Γρ(η)Γρ(7 + η-α- /?)ΓΡ(7 + η + τη- α)Γρ(7 + η + τη- β)Τρ(τη) aet(X)Tn-12{p+l)2F2(n + πι,Ί + η + πι-α-β; 7 + η + τη - α, 7 + η + m - β\ -Χ), Χ > 0. 8.23. Let Χ ~ Щ{щα,/3,7). Then show that ч*1 _ ГрЬ ~ п + α)Γρ(^ + η - 0)гр(7 + А) E[det(X)h Гр(7)Гр(7 + п-а- β)Γρ(Ί + n + h) 3F2(a,/?,7 + /i;7,7 + ra + /i;/p), Re(/i) > 0. 8.24. Let X ~ CH^(n, a, /?, 7; A). Then show that *[de Ч = Γρ(7 - η)Γ (n + Λ)Γρ(α - η - ft) Γ,(/3 - η- h) 1 ν 7 J Γρ(η)Γρ(α - η)Γρ(β - η)Γρ(7 - η - h) κ ' ' -η + 1(ρ - 1) < Re(ft) < min[Re(/? - η), Re(7 - η)] - |(ρ - 1). 8.25. Show that the m.g.f. of the GHF type I distribution (8.13.1) is given by MX(Z) = r+.Fs(ai ar!n;bl,.. ^efo-Z)-') de _ z < r+iFs(ab ..., ar, n; 6b ..., 6S; Θ) 8.26. Show that the hth moment of det(X), where X has GHF type I distribution (8.13.1), is given by E\det(X)h] = Гр(п + A) r+i^(Qi» ...,Or,n + h;bi,...,ba;0) rp(n) r+iFs(ai,..., ar, n; 61?..., 6S; Θ) Ite(A)>-n+|(p-l). 8.27. Show that the following results hold good for the generalized hypergeometric function density (8.13.3), HrWvVbl - Γρ(^ + η)Γρ(™ + ft) r+iFs+i(ai,... ,ar,m + ft;fri,. ■ ■ ,frs,m + n + ft;6) r+\Fs+i(ai,..., ar, m; 61,..., 6S, m + η; Θ) Re(ft) > -m + -(p - 1).
308 CHAPTER 8. MISCELLANEOUS DISTRIBUTIONS and E[det(Ip-X)]-Tp{n)rAm + n + h) r+ifi+i(fli,..., ar, m; 6b ..., frs, m + η + Α; Θ) r+iFs+i(ai,..., ar, ra; &b ..., &s, ra + π; θ) Re(A) > -n+-(p- 1). (Roux, 1971) 8.28. For the density (8.13.4), show that E[det(X)h] and £[det(Jp + X)~h] are given by h] _ Tp(m + h)Tp(n - h) E[det(X)h] = Гр(т)Гр(п) r+ifi+i(fli, · · ·, Qr, η - Д; Ьь..., Ьд> m + η; Θ) r+i-^s+i(a1?..., ar, n; 6b ..., &s, ra + η; θ) -τη + -(ρ - 1) < Ite(A) <n--(p- 1), and £[det(Jp + X)-'4 = _M _ Γρ(η + Α)Γρ(τη + η) Гр(п)Гр(т + η + Λ) r+iFs+i(ai,..., Or, η + h; Ьь ..., 6S, m + η + Α; Θ) r+i-Fs+i(ai,..., ar, n; 6b ..., 6S, η + Λ; θ) Ite(A)>-n + -(p-l), respectively. (Roux, 1971) 8.29. For the density (8.13.5), prove the following. FiTTrWYWl _ nLirp(mfc + Afc)rp(r7i + n) ^lfldet(Xfc) J " Ш=1ГРК)Гр(т + п + А) r+iFs+i(ai,... ,ar,m + A;6i,... ,6s,m + η + Α;θ) Γ+ι^β+ι(αι,..., Or, m; 6ι,..., 6β, m + η; θ) Re(Afc) > -mfc + -(p-l), A; = !,...,<?
PROBLEMS 309 where h = Σ£_ι hk, and E[det{Ip-^Xk) j - Гр{т + п + к)Гр{п) r+iFs+i(au ..., αΓ, m; 61, ..., bs, m + η + h; Θ) r+i-Fs+i(αϊ,..., ar, m; 61,..., 6S, m + η; θ) R*(A)>-n+i(p-l), respectively. (Roux, 1971) 8.30. For the density (8.13.6), prove the following. F\ ΓΓ rW X ^1 - nLirp(mfc + Afc)rp(n-A) r+ifi+i(Qi,·· · ,ar,ra + А;^ь· · . A,m + n;6) r+iFs+i(ai,..., ar, ra; 61?..., 6S, ra + η; θ) Re(Afc) > -rafc + -(p- 1), A; = l,...,g, Ite(A)<n-i(p-l), where A = Σ£=ι ^fc> and B[det(/p + L^J ]-Гр(т + п + Д)Гр(п) r+iFs+i(ai,... ,ar,m;6i,... ,6s,m + n + Α;Θ) γ+ι^+ι(αι,- ·· ,ar,m;6i,... ,6s,ra + η;θ) Re(A) >-n+-(p-l), respectively. (Roux, 1971) 8.31. Let W ~ Gp(ni,Ip) and С/ ~ C#p(n2, a, 0, kind 1) be independent. Let X = W + U and Υ = (W + U)-*U(W + U)~^. Then show that (i) the random matrix X is distributed as GHF type I with density Γρ(ί6)Γρ(α-η2)Γρ(η1+η2)Γρ(η2) v ; v ' 2F2(n2,P- а;щ + η2,β;Χ), Χ > 0,
310 CHAPTER 8. MISCELLANEOUS DISTRIBUTIONS and (ii) the random matrix Υ is distributed as GHF type II with density 2F1(n1 +η2,β-α;β;Υ),0<Υ< Ιρ.
CHAPTER 9

GENERAL FAMILIES OF MATRIX VARIATE DISTRIBUTIONS

9.1. INTRODUCTION

In Chapters 1 through 7 we have considered a number of models leading to matrix variate generalizations of well known continuous distributions. These distributions usually arise as sampling distributions when the underlying population is multivariate normal. In this chapter we study families of distributions which are defined through a functional form assumption, either on the density function, or on the characteristic function, or through an invariance property.

9.2. MATRIX VARIATE LIOUVILLE DISTRIBUTIONS

In this section we study a family of distributions defined through a functional form assumption on the density. The random variables $x_1, \ldots, x_r$ are said to have Liouville distribution of the first kind if their joint p.d.f. is proportional to

$$\prod_{i=1}^{r} x_i^{a_i-1}\, g\Bigl(\sum_{i=1}^{r} x_i\Bigr), \quad 0 < x_i < \infty,\ a_i > 0,\ i = 1, \ldots, r. \tag{9.2.1}$$

Here $g$ is a measurable positive real valued function defined on the interval $(0,\infty)$ such that $\int_0^{\infty} g(r) r^{s-1}\, dr$ exists for all $s > 0$. The random variables $y_1, \ldots, y_r$ are said to have Liouville distribution of the second kind if their joint p.d.f. is proportional to

$$\prod_{i=1}^{r} y_i^{b_i-1}\, g\Bigl(\sum_{i=1}^{r} y_i\Bigr), \quad 0 < y_i < 1,\ \sum_{i=1}^{r} y_i < 1,\ b_i > 0,\ i = 1, \ldots, r, \tag{9.2.2}$$
where $g$ is a measurable positive real valued function defined on the interval $(0,1)$ such that $\int_0^1 g(r) r^{s-1}\, dr$ exists for all $s > 0$. These families of distributions were defined by Marshall and Olkin (1979), and Sivazlian (1981). They include Dirichlet distributions, and have found applications in compositional data (Aitchison, 1986), and life time data (Barlow and Mendal, 1992). The distributional properties of these families and their extensions have been studied by Sivazlian (1981), R. D. Gupta and Richards (1987, 1990, 1991, 1992), Fang, Kotz and Ng (1990), Song and Gupta (1997), and Gupta and Song (1996). In this section we give matrix variate generalizations of (9.2.1) and (9.2.2) studied by R. D. Gupta and Richards (1987).

DEFINITION 9.2.1. The $p \times p$ symmetric positive definite random matrices $X_1, \ldots, X_r$ are said to have Liouville distribution of the first kind if their joint p.d.f. is proportional to

$$\prod_{i=1}^{r} \det(X_i)^{a_i - \frac{1}{2}(p+1)}\, g\Bigl(\sum_{i=1}^{r} X_i\Bigr), \quad X_i > 0,\ a_i > \tfrac{1}{2}(p-1),\ i = 1, \ldots, r, \tag{9.2.3}$$

where $g(\cdot)$ is positive, continuous, supported on $S = \{X\ (p \times p) : X > 0\}$ such that

$$\int_{T>0} \det(T)^{a - \frac{1}{2}(p+1)} g(T)\, dT < \infty,$$

and $a = \sum_{i=1}^{r} a_i$. This distribution will be denoted by $L_p^{(1)}(g, a_1, \ldots, a_r)$.

The normalizing constant of the density (9.2.3) depends on the function $g$ and, for given $g$, can be evaluated explicitly. In general, this constant can be written in terms of the Weyl fractional integral defined below.

If a real valued continuous function $f$ defined on the space of $p \times p$ symmetric positive definite matrices satisfies the condition

$$\int_{T>0} \det(T)^{\alpha - \frac{1}{2}(p+1)} f(T)\, dT < \infty, \tag{9.2.4}$$

where $\alpha > \frac{1}{2}(p-1)$, then the Weyl fractional integral of order $\alpha$ of $f$ is defined as

$$W^{\alpha} f(T) = \frac{1}{\Gamma_p(\alpha)} \int_{S>T} \det(S-T)^{\alpha - \frac{1}{2}(p+1)} f(S)\, dS. \tag{9.2.5}$$

Properties of $W^{\alpha} f(T)$ are given by Gindikin (1964), and Rooney (1972) for $p = 1$, and by Richards (1984) for arbitrary $p$. There is a one to one correspondence between $f(\cdot)$ and its Weyl fractional integral $W^{\alpha} f(\cdot)$.
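For $p = 1$ the definition (9.2.5) reduces to the one-dimensional integral $W^{\alpha}f(t) = \Gamma(\alpha)^{-1}\int_t^{\infty}(s-t)^{\alpha-1}f(s)\,ds$, which can be checked numerically. The Python sketch below is illustrative only: the names `weyl_integral` and `exact_for_exponential`, the truncation point `upper`, the grid size `n`, and the use of the trapezoidal rule are assumptions made here and do not come from the text. The rule is only adequate for $\alpha \ge 1$, where the integrand stays bounded. For $f(s) = e^{-cs}$, $c > 0$, a direct calculation gives $W^{\alpha}f(t) = c^{-\alpha}e^{-ct}$, which serves as the reference value.

```python
import math
import numpy as np


def weyl_integral(f, t, alpha, upper=60.0, n=600_001):
    """Approximate W^alpha f(t) for p = 1 by the trapezoidal rule.

    Assumes alpha >= 1 and that f decays fast enough for the
    truncation of (t, infinity) to (t, t + upper) to be harmless.
    """
    u = np.linspace(0.0, upper, n)           # u = s - t
    vals = u ** (alpha - 1.0) * f(t + u)     # (s - t)^(alpha - 1) * f(s)
    h = u[1] - u[0]
    integral = h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    return integral / math.gamma(alpha)


def exact_for_exponential(t, alpha, c):
    """Closed form of W^alpha f(t) when f(s) = exp(-c s), c > 0."""
    return c ** (-alpha) * math.exp(-c * t)


# Compare the numerical value with the closed form for f(s) = exp(-2 s).
approx = weyl_integral(lambda s: np.exp(-2.0 * s), t=0.7, alpha=1.5)
exact = exact_for_exponential(0.7, 1.5, 2.0)
```

For this exponential family, applying the closed form twice gives $c^{-\alpha}c^{-\beta}e^{-ct}$, so the orders of integration add under composition.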
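The Weyl fractional integral $W^{\alpha}$ also satisfies the semigroup property $W^{\alpha+\beta} = W^{\alpha}W^{\beta}$, $\alpha > \frac{1}{2}(p-1)$, $\beta > \frac{1}{2}(p-1)$.

Now we turn to the evaluation of the normalizing constant $A$ of the density (9.2.3),

$$\frac{1}{A} = \int_{X_1>0} \cdots \int_{X_r>0} \prod_{i=1}^{r} \det(X_i)^{a_i - \frac{1}{2}(p+1)}\, g\Bigl(\sum_{i=1}^{r} X_i\Bigr) \prod_{i=1}^{r} dX_i = \frac{\prod_{i=1}^{r} \Gamma_p(a_i)}{\Gamma_p(a)} \int_{T>0} \det(T)^{a - \frac{1}{2}(p+1)} g(T)\, dT. \tag{9.2.6}$$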
The last step is obtained by using Theorem 1.4.4. From the definition of the Weyl fractional integral (9.2.5), it follows that

$$\frac{1}{A} = \prod_{i=1}^{r} \Gamma_p(a_i)\, W^{a} g(0). \tag{9.2.7}$$

DEFINITION 9.2.2. The $p \times p$ symmetric positive definite random matrices $Y_1, \ldots, Y_r$ are said to have Liouville distribution of the second kind if their joint p.d.f. is proportional to

$$\prod_{i=1}^{r} \det(Y_i)^{b_i - \frac{1}{2}(p+1)}\, g\Bigl(\sum_{i=1}^{r} Y_i\Bigr), \quad 0 < Y_i < I_p,\ \sum_{i=1}^{r} Y_i < I_p,\ b_i > \tfrac{1}{2}(p-1),\ i = 1, \ldots, r, \tag{9.2.8}$$

where $g(\cdot)$ is positive, continuous, supported on $S = \{X\ (p \times p) : 0 < X < I_p\}$ such that

$$\int_{S} \det(T)^{b - \frac{1}{2}(p+1)} g(T)\, dT < \infty,$$

and $b = \sum_{i=1}^{r} b_i$. This distribution will be denoted by $L_p^{(2)}(g, b_1, \ldots, b_r)$.

The normalizing constant of the density (9.2.8) is given in (9.2.7).

Next we give some special cases of the above densities.

(i) In (9.2.3) taking $g(T) = \det(I_p + T)^{-\sum_{i=1}^{r+1} a_i}$, where $a_{r+1} > \frac{1}{2}(p-1)$, we get the matrix variate Dirichlet type II distribution with parameters $(a_1, \ldots, a_r; a_{r+1})$.

(ii) In (9.2.3) taking $g(T) = \operatorname{etr}(-T)$, we get the product of Wishart densities.

(iii) In (9.2.8) taking $g(T) = \det(I_p - T)^{b_{r+1} - \frac{1}{2}(p+1)}$, where $b_{r+1} > \frac{1}{2}(p-1)$, we get the matrix variate Dirichlet type I distribution with parameters $(b_1, \ldots, b_r; b_{r+1})$.

Next we study some properties of the above distributions. The first theorem gives the relationship between the first kind and the second kind. The proof is similar to that of Theorem 6.3.1.

THEOREM 9.2.1. (i) If $(Y_1, \ldots, Y_r) \sim L_p^{(2)}(g, b_1, \ldots, b_r)$ and

$$X_i = \Bigl(I_p - \sum_{j=1}^{r} Y_j\Bigr)^{-\frac{1}{2}} Y_i \Bigl(I_p - \sum_{j=1}^{r} Y_j\Bigr)^{-\frac{1}{2}}, \quad i = 1, \ldots, r,$$

then $(X_1, \ldots, X_r) \sim L_p^{(1)}(f, b_1, \ldots, b_r)$ where $f(T) = \det(I_p + T)^{-b - \frac{1}{2}(p+1)}\, g(T(I_p + T)^{-1})$, $T > 0$. Further there is a one to one correspondence between $g(\cdot)$ and $f(\cdot)$.

(ii) If $(X_1, \ldots, X_r) \sim L_p^{(1)}(g, a_1, \ldots, a_r)$ and

$$Y_i = \Bigl(I_p + \sum_{j=1}^{r} X_j\Bigr)^{-\frac{1}{2}} X_i \Bigl(I_p + \sum_{j=1}^{r} X_j\Bigr)^{-\frac{1}{2}}, \quad i = 1, \ldots, r,$$
314 CHAPTER 9. GENERAL FAMILIES OF MATRIX VARIATE DISTRIBUTIONS then (Yi,..., Уг) ~ Lf?\f, ab ..., ar), where f(T) = det(/p - Τ)-α-№+ι)9(Τ(Ιρ - Τ)"1), О < Τ < Ιρ. Further there is one to one correspondence between g(-) and /(·). In the following theorems we give stochastic representation of X\,..., Xr where (X1,...,Xr)~LM(g,a1,...,ar),i = l,2. THEOREM 9.2.2. Let (Xu ..., Xr) ~ L<%, ob ..., ar). Define Χό = YSYjY), j = 1,... ,r - 1, and Xr = y)(Ip - ZrjZl Yj)YS. Then (Уь ... ,yr_i) and Yr are independent, (Уь ..., Yr-i) ~ Dfai,..., αΓ_ι; ar) and Yr ~ L^\g, EJU a»). From this result, in view of Theorem 6.3.4, it is easily seen that (Σ*0~έ(έ*)(Σ*Γ*~*ί(Σβ*. Σ «0-s<r· г=1 г=1 г=1 г=1 i=s+l THEOREM 9.2.3. Let (Xu ..., Xr) ~ L<%, ob ..., ar). £e/me x, = i?nn'-iIIW. x2 = yr* π nii.^/p - υ,) Π у/у/, J'=2 j=2 Т/геп Уь ..., ΥΓ are independent, Yk ~ £ρ(Σ£=ι a*, ^fc+i), A; = 1,..., r — 1, and Yr ~ THEOREM 9.2.4. Lei (Xb..., Xr) ~ L<%,ab..., or). £e/ine j=l J=l *2 = П* nVp + n+wr^nVp + ^riy,*, j=2 j=2 Xr = УЛ'р + П-1ГЫ-1(/р + 1;-1)-5уЛ Т/геп Yi,...,YJ. are independent, Yk ~ Βρ^α^+ι,Σ^ a^), A; = l,...,r — l.; and
9.3. MATRIX VARIATE SPHERICAL DISTRIBUTIONS 315 THEOREM 9.2.5. Let (XU...,XT)~ L^(g,ai, ...,ar). Define X2 = YrHlp-Y^Y^Ip-Y^YrK XT = Y}(IP - Ух)* · · · (IP - Y^HIp - Yr-ι)" --{Iv- ΥιΫ'Υΐ Then Yi,..., Yr are independent, Yk ~ Bfak, Σ[=α:+ι α»), к = 1,..., г — 1, and Yr ~ THEOREM 9.2.6. Let (Xu ..., Xr) ~ L®(g, a1?..., ar). Then (г) (Хи...,Ха) ~L®(ga,ai,...,aa), s <r, where gs(T) = Wag(T) is the Weyl fractional integral of order a = ΣΓ=β+ι a*> ^(Xs+1,...,Xr)|(Xb...,Xs)^L^s(/s,as+b...5ar),5<r; where fs(T)= ga(£=iXi) ■ Proof: (i) The joint p.d.f. of Xu ..., Xs (s < r) is A/ ···/ ΠάβΚΧ,Γ-^^ίΣ^) Π dXt, (9.2.9) where A is given by (9.2.7). Integrating Xs+i,...,Xr, using Theorem 1.4.4, from (9.2.9), we get = ^ ndet№)ai"|(p+1)^E-«aiff(E^) 2=1 2=1 2 = 1 2 = 1 The last step is obtained by using the definition of Weyl fractional integral, (ii) The proof is straightforward. ■ 9.3. MATRIX VARIATE SPHERICAL DISTRIBUTIONS Sometimes it is desirable to study robustness of normal theory model under nonnormal situation. The class of elliptically contoured distributions in such studies is useful because the density functions of such distributions have the same elliptical shape as the normal density. For properties of these distributions one can refer to Kelker (1970), Chmielewski (1981), Cambanis, Huang and Simons (1981) and Muirhead (1982).
In this and subsequent sections we study the matrix variate generalizations of these families. We begin by defining the matrix variate spherical distribution studied by Dawid (1977, 1978) and Fang and Chen (1984).

DEFINITION 9.3.1. The random matrix $X$ $(p \times n)$ is said to have
(i) right spherical distribution if $X \stackrel{d}{=} X\Lambda$, $\forall\, \Lambda \in O(n)$,
(ii) left spherical distribution if $X \stackrel{d}{=} \Gamma X$, $\forall\, \Gamma \in O(p)$, and
(iii) spherical distribution if $X \stackrel{d}{=} \Gamma X \Lambda$, $\forall\, \Gamma \in O(p)$, $\forall\, \Lambda \in O(n)$.

It may be noted that if $X$ $(p \times n)$ is right spherical, then, for $T$ $(p \times n)$, its characteristic function is of the form $\phi(TT')$. We have

$$\begin{aligned}
\phi_X(T) &= E[\operatorname{etr}(\iota T X')] \\
&= E[\operatorname{etr}(\iota T \Lambda \Lambda' X')], \quad \Lambda \in O(n) \\
&= E[\operatorname{etr}(\iota (T\Lambda)(X\Lambda)')], \quad \Lambda \in O(n) \\
&= E[\operatorname{etr}(\iota (T\Lambda) X')], \quad \text{since } X \stackrel{d}{=} X\Lambda,\ \forall\, \Lambda \in O(n) \\
&= \phi_X(T\Lambda).
\end{aligned}$$

Hence $\phi_X(T)$ is invariant under $O(n)$ and is a function of the maximal invariant under $O(n)$, i.e., for some function $\phi$, $\phi_X(T) = \phi(TT')$. If $X$ $(p \times n)$ is right spherical with the characteristic function $\phi(TT')$, we will denote it by $X \sim RS_{p,n}(\phi)$. If $X$ $(p \times n)$ is left spherical, we will write $X \sim LS_{p,n}(\phi)$.

THEOREM 9.3.1. If $X$ $(p \times n)$ is right spherical, then
(i) $X'$ is left spherical,
(ii) $-X$ is right spherical, $-X \stackrel{d}{=} X$.

THEOREM 9.3.2. Let $X \sim RS_{p,n}(\phi)$.
(i) For a constant matrix $A$ $(q \times p)$, $AX \sim RS_{q,n}(\psi)$ where $\psi(TT') = \phi(A'TT'A)$, $T$ $(q \times n)$.
(ii) For $X = (X_1 \;\; X_2)$, where $X_1$ is $p \times m$, $X_1 \sim RS_{p,m}(\phi)$.

In Chapter 8 we have defined the uniform distribution over the Stiefel manifold. This distribution belongs to the class of right spherical distributions, as shown in Theorem 8.2.1. Its converse is given in the next theorem.

THEOREM 9.3.3. Let $X$ $(p \times n)$ be right spherical and $XX' = I_p$, $p \le n$. Then $X \sim \mathcal{U}_{p,n}$.

THEOREM 9.3.4. The distribution of a right spherical matrix $X$ $(p \times n)$ is fully determined by that of $XX'$.
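Proof: Let $Y$ $(p \times n)$ be another right spherical matrix such that $XX' \stackrel{d}{=} YY'$. Further let $U \sim \mathcal{U}_{n,n}$ with characteristic function $\omega(ZZ')$, $Z$ $(n \times n)$. The characteristic function of $X$ at $T$ $(p \times n)$ is

$$\begin{aligned}
\phi_X(T) &= E_X[\operatorname{etr}(\iota T X')] \\
&= E_X\Bigl\{\int_{O(n)} \operatorname{etr}(\iota T U X')\,[dU]\Bigr\} \\
&= E_X\Bigl\{\int_{O(n)} \operatorname{etr}(\iota T' X U')\,[dU]\Bigr\} \\
&= E_X[\omega(T'XX'T)] \\
&= E_Y[\omega(T'YY'T)] \\
&= \phi_Y(T).
\end{aligned}$$

Therefore $X \stackrel{d}{=} Y$. ■

From the above theorem, it follows that the uniform distribution is the unique right spherical distribution over $O(p,n)$.

For a right spherical matrix, in general the density may not exist. However, if $X$ has a density with respect to Lebesgue measure on $\mathbb{R}^{p \times n}$, then it is of the form $f(XX')$. Some examples of this distribution are given below.

(i) When $X \sim N_{p,n}(0, \Sigma \otimes I_n)$, the density of $X$ is

$$(2\pi)^{-\frac{1}{2}np} \det(\Sigma)^{-\frac{1}{2}n} \operatorname{etr}\bigl(-\tfrac{1}{2}\Sigma^{-1}XX'\bigr), \quad X \in \mathbb{R}^{p \times n},$$

with the characteristic function $\operatorname{etr}(-\frac{1}{2}\Sigma TT')$.

(ii) When $X \sim T_{p,n}(\delta, 0, \Sigma, I_n)$, the density of $X$ is

$$\frac{\Gamma_p[\frac{1}{2}(\delta + n + p - 1)]}{\pi^{\frac{1}{2}np}\, \Gamma_p[\frac{1}{2}(\delta + p - 1)]} \det(\Sigma)^{-\frac{1}{2}n} \det(I_p + \Sigma^{-1}XX')^{-\frac{1}{2}(\delta + n + p - 1)}, \quad X \in \mathbb{R}^{p \times n},$$

with the characteristic function

$$\bigl\{\Gamma_p[\tfrac{1}{2}(\delta + p - 1)]\bigr\}^{-1} B_{-\frac{1}{2}(\delta + p - 1)}\bigl(\tfrac{1}{4}\Sigma TT'\bigr),$$

where $B_{\delta}(\cdot)$ is Herz's Bessel function of the second kind of order $\delta$.

THEOREM 9.3.5. If $X$ $(p \times n)$ is right spherical and $K$ $(n \times m)$ is a fixed matrix, then the distribution of $XK$ depends on $K$ only through $K'K$.

Proof: Let the matrix $H$ $(n \times m)$ be such that $H'H = K'K$. Then $H = \Gamma K$ for some $\Gamma \in O(n)$. Hence $XH = X\Gamma K \stackrel{d}{=} XK$, from which it follows that the distribution of $XK$ depends on $K$ only through $K'K$. ■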
In the above theorem, if $K'K = I_m$ then the distribution of $XK$ is right spherical. This can be shown by evaluating the c.f. of $XK$. Let $X = (X_1 \;\; X_2)$, $X_1$ $(p \times (n-m))$, $X_2$ $(p \times m)$, $K' = (K_1' \;\; K_2')$, $K_1$ $((n-m) \times m) = 0$, $K_2$ $(m \times m) = I_m$. Then $K'K = I_m$, and therefore $XK = X_2$ is right spherical.

It is easy to show that if the distribution of $X$ is a mixture of right spherical distributions, then $X$ is right spherical. It follows that if $X$ $(p \times n)$, conditional on a random variable $v$, is right spherical and $Q$ $(q \times p)$ is a function of $v$, then $QX$ is right spherical.

The results given above have obvious analogues for left spherical distributions. Now from the theory of spherical distributions many results for the uniform distribution can be derived. In the next theorem we give some results for the uniform distribution.

THEOREM 9.3.6. Let $U \sim \mathcal{U}_{p,n}$.
(i) Partition $U' = (U_1' \;\; U_2')$, $U_1$ $(q \times n)$, $1 \le q < p$. Then $U_1 \sim \mathcal{U}_{q,n}$.
(ii) For fixed $\Gamma \in O(q,p)$, $p \ge q$, $\Gamma U \sim \mathcal{U}_{q,n}$.
(iii) $U$ is spherical.
(iv) If $n = p$, then $U' = U^{-1} \sim \mathcal{U}_{p,p}$.

Proof: (i) Since $U$ is also right spherical, $U_1$ is right spherical. From the fact that $U_1U_1' = I_q$, and Theorem 9.3.2, the result follows.
(ii) Note that $\Gamma U$ is right spherical and $(\Gamma U)(\Gamma U)' = I_q$. Therefore, from Theorem 9.3.3, $\Gamma U \sim \mathcal{U}_{q,n}$.
(iii) For $q = p$, $\Gamma \in O(p)$, and $\Gamma U \sim \mathcal{U}_{p,n}$. Hence $U$ is left spherical. Since $U$ is also right spherical, the result follows.
(iv) Since $UU' = I_p$, $U' = U^{-1}$, from (iii) $U$ is left spherical and hence $U' = U^{-1}$ is right spherical. Now the result follows from Theorem 9.3.2, since $U^{-1}(U^{-1})' = I_p$. ■

We now study the stochastic representation of the spherical distribution.

THEOREM 9.3.7. Let $X \sim RS_{p,n}(\phi)$. Then there exists a random matrix $A$ $(p \times p)$ such that

$$X \stackrel{d}{=} AU \tag{9.3.1}$$

where $U \sim \mathcal{U}_{p,n}$ is independent of $A$.

Proof: For $X$ $(p \times n)$, we can find $A$ $(p \times p)$ such that $XX' = AA'$. Let $U \sim \mathcal{U}_{p,n}$ be independent of $A$. Define $Y = AU$. Then $YY' = AA' = XX'$. From Theorem 9.3.4 it follows that $X \stackrel{d}{=} Y = AU$.
■

The matrix $A$ in the stochastic representation (9.3.1) is not unique. One can take it to be lower (upper) triangular matrix with non-negative diagonal elements or right spherical matrix with $A > 0$. Further, in addition, if we assume that $P(\det(XX') = 0) = 0$, then the distribution of $A$ is unique. The next theorem proves the uniqueness of $A$ when it is lower triangular.
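Before the uniqueness result, the lower triangular choice of $A$ in (9.3.1) can be made concrete with a small numerical sketch. It is illustrative only: the function names `sample_stiefel` and `triangular_factor` are hypothetical, the construction $U = (ZZ')^{-1/2}Z$ with Gaussian $Z$ is a standard way to draw from the uniform distribution on the Stiefel manifold that is assumed here rather than taken from this section, and the code checks only the algebraic identities $AA' = XX'$, $UU' = I_p$ and $X = AU$, not the independence of $A$ and $U$.

```python
import numpy as np


def sample_stiefel(p, n, rng):
    """Draw U (p x n) with UU' = I_p as (ZZ')^{-1/2} Z, Z Gaussian."""
    z = rng.standard_normal((p, n))
    w, v = np.linalg.eigh(z @ z.T)            # ZZ' = V diag(w) V'
    inv_sqrt = (v / np.sqrt(w)) @ v.T         # symmetric (ZZ')^{-1/2}
    return inv_sqrt @ z


def triangular_factor(x):
    """Return lower triangular A (positive diagonal) with AA' = XX',
    together with U = A^{-1} X, so that X = A U and UU' = I_p."""
    a = np.linalg.cholesky(x @ x.T)
    u = np.linalg.solve(a, x)
    return a, u


rng = np.random.default_rng(0)
p, n = 3, 7
u0 = sample_stiefel(p, n, rng)                # a point on the Stiefel manifold
x = rng.standard_normal((p, n))               # Gaussian X is right spherical
a, u = triangular_factor(x)                   # pathwise factorization X = A U
```

The Cholesky factor is used because it returns exactly the kind of matrix the text singles out: lower triangular with positive diagonal whenever $XX'$ is nonsingular.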
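THEOREM 9.3.8. Let $X \sim RS_{p,n}(\phi)$ and $P(\det(XX') \ne 0) = 1$. Then for $A$, $B$ lower triangular matrices with positive diagonal elements and $U \sim \mathcal{U}_{p,n}$, $Q \sim \mathcal{U}_{p,n}$,
(i) $X \stackrel{d}{=} AU$ and $X \stackrel{d}{=} BU \Rightarrow A \stackrel{d}{=} B$,
(ii) $X \stackrel{d}{=} AU$ and $X \stackrel{d}{=} AQ \Rightarrow U \stackrel{d}{=} Q$.

Proof: (i) Note that $AA' \stackrel{d}{=} BB'$. Now consider the one to one function $f(A) = AA'$. Then for any Borel measurable function $h(\cdot)$,

$$E\{h(A)\} = E\{h(f^{-1}(AA'))\} = E\{h(f^{-1}(BB'))\} = E\{h(B)\}$$

and hence $A \stackrel{d}{=} B$.
(ii) Define the function $g(AQ) = (A, Q)$. Then

$$(A, Q) = g(AQ) \stackrel{d}{=} g(X) \stackrel{d}{=} g(AU) = (A, U),$$

and hence $Q \stackrel{d}{=} U$. ■

For studying the spherical distribution, the singular value decomposition of the matrix $X$ $(p \times n)$ provides a powerful tool. When $p \le n$, let $X = G\Lambda H$, where $G \in O(p)$, $H \in O(p,n)$, $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_p)$, $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$, and the $\lambda_i$'s are the eigenvalues of $(XX')^{\frac{1}{2}}$.

THEOREM 9.3.9. If $X$ $(p \times n)$, $p \le n$, is spherical, then

$$X \stackrel{d}{=} U\Lambda V \tag{9.3.2}$$

where $U \sim \mathcal{U}_{p,p}$, $V \sim \mathcal{U}_{p,n}$ and $\Lambda$ are mutually independent.

Proof: Let $X = G\Lambda H$, $G \in O(p)$, $H \in O(p,n)$, and $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_p)$, be the singular value decomposition of the matrix $X$ $(p \times n)$. Further let $U^* \sim \mathcal{U}_{p,p}$, $V^* \sim \mathcal{U}_{n,n}$ be independent of $(G, \Lambda, H)$, and define $U = U^*G$, $V = HV^*$, and $X^* = U^*XV^*$. Then, given $(G, \Lambda, H)$, $U \sim \mathcal{U}_{p,p}$, $V \sim \mathcal{U}_{p,n}$ are independent. Hence $U$, $V$ and $\Lambda$ are independent. Now, since $X$ is spherical, for given $U^*$ and $V^*$, $X^* = U^*XV^* \stackrel{d}{=} X$. The proof is completed by noting that $X^* = U\Lambda V$. ■

THEOREM 9.3.10. If $X$ $(p \times n)$ is spherical, then its characteristic function is of the form $\phi(\lambda(TT'))$, where $T$ $(p \times n)$, $\lambda(TT') = \operatorname{diag}(\tau_1, \ldots, \tau_p)$, and $\tau_1 \ge \cdots \ge \tau_p \ge 0$ are the eigenvalues of $TT'$.

Proof: From the definition of the spherical distribution, it follows that the characteristic function of $X$ is

$$\begin{aligned}
\phi_X(T) &= E[\operatorname{etr}(\iota T X')] \\
&= E[\operatorname{etr}(\iota (\Gamma T \Delta)(\Gamma X \Delta)')], \quad \Gamma \in O(p),\ \Delta \in O(n) \\
&= \phi_X(\Gamma T \Delta).
\end{aligned}$$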
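Thus the characteristic function $\phi_X(T)$ satisfies the equation $\phi_X(T) = \phi_X(\Gamma T \Delta)$ for every $\Gamma \in O(p)$ and $\Delta \in O(n)$. The maximal invariant of $T$ in this case is $\lambda(TT')$. Hence $\phi_X(T) = \phi(\lambda(TT'))$ for some function $\phi$. ■

From the above theorem it follows that, if the density of a spherical matrix $X$ exists, then it is of the form $f(\lambda(XX'))$.

THEOREM 9.3.11. Let $X \sim RS_{p,n}(\phi)$. If the second order moments of $X$ exist then
(i) $E(X) = 0$,
(ii) $\operatorname{cov}(X) = V \otimes I_n$, where $V = E(x_1x_1')$, $X = (x_1, \ldots, x_n)$.

Proof: (i) Since $-X \stackrel{d}{=} X$, it follows that $E(X) = 0$.
(ii) See Fang and Zhang (1990, p. 104). ■

THEOREM 9.3.12. Let $X \sim RS_{p,n}(\phi)$ with the density $f(XX')$. Then the density of $S = XX'$, $n \ge p$, is

$$\frac{\pi^{\frac{1}{2}np}}{\Gamma_p(\frac{1}{2}n)} \det(S)^{\frac{1}{2}(n-p-1)} f(S), \quad S > 0.$$

Proof: Let $h(\cdot)$ be a non-negative Borel measurable function. Then

$$\begin{aligned}
E[h(S)] &= \int_{X \in \mathbb{R}^{p \times n}} h(XX') f(XX')\, dX \\
&= \int_{S>0} \int_{XX'=S} h(XX') f(XX')\, dX\, dS \\
&= \int_{S>0} h(S) f(S) \Bigl(\int_{XX'=S} dX\Bigr) dS \\
&= \frac{\pi^{\frac{1}{2}np}}{\Gamma_p(\frac{1}{2}n)} \int_{S>0} h(S) \det(S)^{\frac{1}{2}(n-p-1)} f(S)\, dS.
\end{aligned}$$

The last step is obtained by using Theorem 1.4.10. Hence the density of $S$ is

$$\frac{\pi^{\frac{1}{2}np}}{\Gamma_p(\frac{1}{2}n)} \det(S)^{\frac{1}{2}(n-p-1)} f(S), \quad S > 0.$$ ■

From the above theorem, it follows that if $X \sim RS_{p,n}(\phi)$, with density $f(XX')$, then $XX' \sim L_p^{(1)}(f, \frac{1}{2}n)$, $n \ge p$.

COROLLARY 9.3.12.1. Let $X \sim N_{p,n}(0, \Sigma \otimes I_n)$. Then

$$f(XX') = (2\pi)^{-\frac{1}{2}np} \det(\Sigma)^{-\frac{1}{2}n} \operatorname{etr}\bigl(-\tfrac{1}{2}\Sigma^{-1}XX'\bigr), \quad X \in \mathbb{R}^{p \times n},$$

and $S = XX' \sim W_p(n, \Sigma)$, $n \ge p$, with the density

$$\bigl\{2^{\frac{1}{2}np}\, \Gamma_p(\tfrac{1}{2}n) \det(\Sigma)^{\frac{1}{2}n}\bigr\}^{-1} \det(S)^{\frac{1}{2}(n-p-1)} \operatorname{etr}\bigl(-\tfrac{1}{2}\Sigma^{-1}S\bigr), \quad S > 0.$$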
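Corollary 9.3.12.1 can be checked numerically in two ways. The sketch below, in Python with NumPy, is illustrative only: the function names, the particular $\Sigma$, the evaluation point `s0`, the seed and the number of replications are assumptions made here. The first check evaluates the density of Theorem 9.3.12 with the Gaussian $f$ and compares it with the Wishart density stated in the corollary. The second is a Monte Carlo check of the standard Wishart mean $E(S) = n\Sigma$, which is not stated in this section.

```python
import math
import numpy as np


def log_multigamma(a, p):
    """log Gamma_p(a) = p(p-1)/4 * log(pi) + sum_{i=1}^p log Gamma(a - (i-1)/2)."""
    return (p * (p - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(a - 0.5 * i) for i in range(p)
    )


def f_normal(s, sigma, n):
    """Gaussian density of X written as a function f(S) of S = XX'."""
    p = s.shape[0]
    quad = np.trace(np.linalg.solve(sigma, s))       # tr(Sigma^{-1} S)
    const = (2.0 * math.pi) ** (-0.5 * n * p) * np.linalg.det(sigma) ** (-0.5 * n)
    return const * math.exp(-0.5 * quad)


def density_via_theorem(s, sigma, n):
    """pi^{np/2} / Gamma_p(n/2) * det(S)^{(n-p-1)/2} * f(S)."""
    p = s.shape[0]
    log_c = 0.5 * n * p * math.log(math.pi) - log_multigamma(0.5 * n, p)
    return math.exp(log_c) * np.linalg.det(s) ** (0.5 * (n - p - 1)) * f_normal(s, sigma, n)


def wishart_density(s, sigma, n):
    """Wishart W_p(n, Sigma) density as written in the corollary."""
    p = s.shape[0]
    log_c = -(0.5 * n * p * math.log(2.0) + log_multigamma(0.5 * n, p)
              + 0.5 * n * math.log(np.linalg.det(sigma)))
    quad = np.trace(np.linalg.solve(sigma, s))
    return math.exp(log_c - 0.5 * quad) * np.linalg.det(s) ** (0.5 * (n - p - 1))


sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
s0 = np.array([[3.0, 1.0], [1.0, 4.0]])
n, reps = 5, 20_000

# Monte Carlo: columns of each X are i.i.d. N(0, Sigma), and S = XX'.
rng = np.random.default_rng(1)
chol = np.linalg.cholesky(sigma)
x = chol @ rng.standard_normal((reps, 2, n))
s_mean = (x @ np.swapaxes(x, 1, 2)).mean(axis=0)     # estimate of E(S)
```

The two density functions differ only in how the constant is assembled, so their agreement confirms that $\pi^{\frac{1}{2}np}(2\pi)^{-\frac{1}{2}np} = 2^{-\frac{1}{2}np}$ is the only simplification needed to pass from the theorem to the corollary.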
COROLLARY 9.3.12.2. Let $X \sim T_{p,n}(\delta, 0, \Sigma, I_n)$. Then

$$f(XX') = \frac{\Gamma_p[\frac{1}{2}(\delta + n + p - 1)]}{\pi^{\frac{1}{2}np}\, \Gamma_p[\frac{1}{2}(\delta + p - 1)]} \det(\Sigma)^{-\frac{1}{2}n} \det(I_p + \Sigma^{-1}XX')^{-\frac{1}{2}(\delta + n + p - 1)}, \quad X \in \mathbb{R}^{p \times n},$$

and $S = XX' \sim GB_p^{II}(\frac{1}{2}n, \frac{1}{2}(\delta + p - 1); \Sigma, 0)$ with the density

$$\bigl\{\beta_p\bigl(\tfrac{1}{2}n, \tfrac{1}{2}(\delta + p - 1)\bigr)\bigr\}^{-1} \det(\Sigma)^{-\frac{1}{2}n} \det(S)^{\frac{1}{2}(n-p-1)} \det(I_p + \Sigma^{-1}S)^{-\frac{1}{2}(\delta + n + p - 1)}, \quad S > 0.$$

The above results have also been studied in Chapters 3 and 5 respectively. Theorem 9.3.12 can be generalized as follows and gives the joint distribution of several quadratic forms in terms of the Liouville distribution.

THEOREM 9.3.13. Let $X \sim RS_{p,n}(\phi)$ with the density $f(XX')$. Partition $X$ as $X = (X_1, \ldots, X_r)$, $X_i$ $(p \times n_i)$, $n_i \ge p$, $i = 1, \ldots, r$, $\sum_{i=1}^{r} n_i = n$. Define $S_i = X_iX_i'$, $i = 1, \ldots, r$. Then $(S_1, \ldots, S_r) \sim L_p^{(1)}(f, \frac{1}{2}n_1, \ldots, \frac{1}{2}n_r)$ with p.d.f.

$$\frac{\pi^{\frac{1}{2}np}}{\prod_{i=1}^{r} \Gamma_p(\frac{1}{2}n_i)} \prod_{i=1}^{r} \det(S_i)^{\frac{1}{2}(n_i-p-1)} f\Bigl(\sum_{i=1}^{r} S_i\Bigr), \quad S_i > 0,\ i = 1, \ldots, r.$$

Proof: The proof is similar to the proof of Theorem 9.3.12 and hence is not given here. ■

The above theorem has been generalized further by Anderson and Fang (1987). Let $X \sim RS_{p,n}(\phi)$ with the density $f(XX')$, and $A$ $(n \times n)$ be a symmetric matrix. Then

$$XAX' \sim L_p^{(1)}(f_1, \tfrac{1}{2}k) \tag{9.3.3}$$

where $f_1(T) = W^{\frac{1}{2}(n-k)} f(T)$, if and only if $A^2 = A$ and $\operatorname{rank}(A) = k \ge p$. Further, let $A_1$ $(n \times n)$, ..., $A_s$ $(n \times n)$ be symmetric matrices. Then

$$(XA_1X', \ldots, XA_sX') \sim L_p^{(1)}(f_1, \tfrac{1}{2}n_1, \ldots, \tfrac{1}{2}n_s), \tag{9.3.4}$$

where $f_1(T) = W^{\frac{1}{2}(n - n_1 - \cdots - n_s)} f(T)$, if and only if $A_iA_j = \delta_{ij}A_i$, and $\operatorname{rank}(A_i) = n_i$, $n_i \ge p$, $i, j = 1, \ldots, s$.

It may be mentioned that the class of matrix variate spherical distributions studied here are generalizations of multivariate spherical distributions. There are several other classes of matrix variate generalizations of multivariate spherical distributions studied by Jensen and Good (1981), Fang and Chen (1984, 1986) and Fang and Anderson (1990).
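The link between Theorem 9.3.13 and the quadratic forms in (9.3.4) can be seen with the simplest admissible choice of the $A_i$: diagonal 0/1 matrices that select consecutive column blocks of $X$. The Python sketch below is illustrative only, with the helper name `block_projections` and the block sizes being assumptions made here. It checks the algebra, namely $A_iA_j = \delta_{ij}A_i$, $\operatorname{rank}(A_i) = n_i$ and $XA_iX' = X_iX_i'$, and makes no distributional claim.

```python
import numpy as np


def block_projections(sizes):
    """Diagonal 0/1 matrices A_1, ..., A_r selecting consecutive column blocks."""
    n = sum(sizes)
    mats, start = [], 0
    for size in sizes:
        a = np.zeros((n, n))
        idx = np.arange(start, start + size)
        a[idx, idx] = 1.0                      # ones on the block's diagonal
        mats.append(a)
        start += size
    return mats


rng = np.random.default_rng(2)
p, sizes = 2, [3, 4, 5]                        # n_i >= p for every block
x = rng.standard_normal((p, sum(sizes)))

projections = block_projections(sizes)
blocks = np.split(x, np.cumsum(sizes)[:-1], axis=1)    # X_1, ..., X_r
quad_forms = [x @ a @ x.T for a in projections]        # X A_i X'
```

With these projections the quadratic forms $XA_iX'$ are exactly the matrices $S_i = X_iX_i'$ of Theorem 9.3.13, and they sum to $XX'$ because $\sum_i A_i = I_n$.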
9.4. MATRIX VARIATE ELLIPTICALLY CONTOURED DISTRIBUTIONS

The class of matrix variate elliptically contoured distributions can be defined in many ways, e.g., see Fang and Zhang (1990) and Gupta and Varga (1993).

DEFINITION 9.4.1. Let $X$ $(p \times n) \sim RS_{p,n}(\phi)$ and $M$ $(p \times m)$, $B$ $(n \times m)$ be constant matrices. Then the random matrix $Y$ $(p \times m)$, where

$$Y = M + XB, \quad \Sigma = B'B,$$

is said to have matrix variate elliptically contoured distribution, denoted by $Y \sim ERS_{p,m}(M, \Sigma, \phi)$.

The characteristic function of $Y$ can be shown to be

$$\operatorname{etr}(\iota T'M)\, \phi(T\Sigma T'), \quad T\ (p \times m). \tag{9.4.1}$$

If the density of $Y$ $(p \times m)$ exists, then it has the form

$$\det(\Sigma)^{-\frac{1}{2}p} f\bigl((Y - M)\Sigma^{-1}(Y - M)'\bigr). \tag{9.4.2}$$

Next we give the distribution of a linear transformation of an elliptically contoured matrix.

THEOREM 9.4.1. Let $Y$ $(p \times m) \sim ERS_{p,m}(M, \Sigma, \phi)$ and $C$ $(m \times q)$, $N$ $(p \times q)$ be constant matrices. Then $Z = N + YC \sim ERS_{p,q}(N + MC, C'\Sigma C, \phi)$.

Proof: The characteristic function of $Z$, evaluated at $T$ $(p \times q)$, is

$$\begin{aligned}
\phi_Z(T) &= E[\operatorname{etr}(\iota T Z')] \\
&= \operatorname{etr}(\iota T N')\, E[\operatorname{etr}(\iota (T C' Y'))] \\
&= \operatorname{etr}(\iota T (N' + C'M'))\, \phi(T(C'\Sigma C)T').
\end{aligned} \tag{9.4.3}$$

The above expression of the characteristic function of $Y$ is derived from Definition 9.4.1 and the characteristic function (9.4.1). Now from (9.4.3) the desired result follows. ■

COROLLARY 9.4.1.1. Partition $Y$, $M$ and $\Sigma$ as $Y = (Y_1 \;\; Y_2)$, $Y_1$ $(p \times q)$, $M = (M_1 \;\; M_2)$, $M_1$ $(p \times q)$ and

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, \quad \Sigma_{11}\ (q \times q).$$

Then $Y_1 \sim ERS_{p,q}(M_1, \Sigma_{11}, \phi)$.

Proof: In the above theorem, let $N = 0$, and $C' = (I_q \;\; 0)$. Then $Y_1 = Z \sim ERS_{p,q}(M_1, \Sigma_{11}, \phi)$. ■
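The parameter bookkeeping in Theorem 9.4.1 and Corollary 9.4.1.1 is purely algebraic and can be verified on any realization of $X$. The Python sketch below is illustrative only: the dimensions, the seed and the variable names (`m_mat`, `n_mat`, `c_sel`) are assumptions made here, and a Gaussian matrix is used merely as a convenient stand-in for a right spherical $X$. It checks that $Z = N + YC$ can be rewritten as $(N + MC) + X(BC)$ with $(BC)'(BC) = C'\Sigma C$, and that the selector with $C' = (I_q \;\; 0)$ picks out $Y_1$, $M_1$ and $\Sigma_{11}$.

```python
import numpy as np

rng = np.random.default_rng(3)
p, n, m, q = 2, 6, 4, 3

x = rng.standard_normal((p, n))          # stand-in for a right spherical X
m_mat = rng.standard_normal((p, m))      # location matrix M
b = rng.standard_normal((n, m))          # B, so that Sigma = B'B
sigma = b.T @ b

y = m_mat + x @ b                        # Y = M + XB

c = rng.standard_normal((m, q))          # constant C
n_mat = rng.standard_normal((p, q))      # constant N
z = n_mat + y @ c                        # Z = N + YC

# Z written directly in the form of Definition 9.4.1.
z_direct = (n_mat + m_mat @ c) + x @ (b @ c)
scale_z = (b @ c).T @ (b @ c)            # should equal C' Sigma C

# Corollary 9.4.1.1: N = 0 and C' = (I_q 0) select the first q columns.
c_sel = np.vstack([np.eye(q), np.zeros((m - q, q))])
y1 = y @ c_sel
```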
THEOREM 9.4.2. Let $Y \sim ERS_{p,m}(M, \Sigma, \phi)$ and partition $Y$, $M$ and $\Sigma$ as

$$Y = \begin{pmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{pmatrix},\ Y_{11}\ (q \times r), \quad M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix},\ M_{11}\ (q \times r), \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},\ \Sigma_{11}\ (r \times r).$$

Then $Y_{11} \sim ERS_{q,r}(M_{11}, \Sigma_{11}, \phi^*)$, where $\phi^*(A_{11}) = \phi(A)$, for

$$A\ (p \times p) = \begin{pmatrix} A_{11} & 0 \\ 0 & 0 \end{pmatrix}, \quad A_{11}\ (q \times q).$$

Proof: The proof is straightforward and is left to the reader as an exercise. ■

THEOREM 9.4.3. Let $Y \sim ERS_{p,m}(M, \Sigma, \phi)$. If the second order moments of $Y$ exist then
(i) $E(Y) = M$,
(ii) $\operatorname{cov}(Y) = V \otimes \Sigma$, where $V = E(x_1x_1')$, $X = (x_1, \ldots, x_n)$ and $Y = M + XB$, $\Sigma = B'B$.

For further results on the matrix variate elliptically contoured distribution the reader is referred to Hayakawa (1986, 1987, 1989), Sutradhar and Ali (1989), Fang and Zhang (1990), Gupta and Varga (1991, 1994a, 1994c, 1994d, 1995b, 1997), Wong and Liu (1994), Li and Fang (1995), Girko and Gupta (1996), Gupta and Girko (1996) and Gupta (1998).

9.5. ORTHOGONALLY INVARIANT AND RESIDUAL INDEPENDENT MATRIX DISTRIBUTIONS

In the preceding chapters we have seen that the Wishart, gamma, beta type I and beta type II distributions are orthogonally invariant. That is, the distribution of $X$ $(p \times p) > 0$ is the same as that of $\Gamma X \Gamma'$, $\Gamma \in O(p)$. Many other properties follow from this fact, e.g., the diagonal elements of $X$ are identically distributed. Additionally, for $X = TT'$, where $T$ is a triangular matrix, the diagonal elements of $T$ are independent. Motivated by these common properties, Khatri, Khattree and R. D. Gupta (1991) have defined the orthogonally invariant and residual independent, ORIARIM in short, family of distributions, $\mathcal{C}_p$.

DEFINITION 9.5.1. The random symmetric positive definite matrix $X$ $(p \times p)$ is said to have an ORIARIM distribution if
(i) for any $\Gamma \in O(p)$, the distributions of $X$ and $\Gamma X \Gamma'$ are identical, and
(ii) for any lower triangular factorization $X = TT'$, $T = (T_{ij})$, the diagonal blocks $T_{ii}$ $(p_i \times p_i)$, $i = 1, \ldots, k$, are independent, for any partition $\{p_1, p_2, \ldots, p_k\}$ of $p$.

When $X$ has an ORIARIM distribution, we will write $X \in \mathcal{C}_p$.
The matrix variate beta type I and type II, gamma ($C = I_p$), Wishart ($\Sigma = I_p$), and inverted Wishart ($\Psi = I_p$) distributions belong to this class. Next we give some properties of the class of ORIARIM distributions.
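For the Wishart member of this class the two defining properties can be probed by simulation. The Python sketch below is illustrative only: the seed, the sample size and the dimensions are assumptions made here, and the reference values rest on Bartlett's decomposition (for $W \sim W_p(n, I_p)$ with $W = TT'$, $T$ lower triangular, the $t_{ii}^2$ are independent $\chi^2_{n-i+1}$ variables), a standard fact that is not stated in this section. Near-zero sample correlations are only a necessary consequence of independence, so this is a sanity check rather than a verification of Definition 9.5.1.

```python
import numpy as np

rng = np.random.default_rng(4)
p, n, reps = 3, 6, 20_000

z = rng.standard_normal((reps, p, n))
w = z @ np.swapaxes(z, 1, 2)                 # reps draws from W_p(n, I_p)
t = np.linalg.cholesky(w)                    # lower triangular, W = T T'

# Squared diagonal elements of T, one row per draw.
t_diag_sq = np.diagonal(t, axis1=1, axis2=2) ** 2

diag_means = t_diag_sq.mean(axis=0)          # compare with n, n-1, ..., n-p+1
diag_corr = np.corrcoef(t_diag_sq, rowvar=False)

# Orthogonal invariance implies identically distributed diagonal elements of W.
w_diag_means = np.diagonal(w, axis1=1, axis2=2).mean(axis=0)
```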
324 CHAPTER 9. GENERAL FAMILIES OF MATRIX VARJATE DISTRIBUTIONS THEOREM 9.5.1. Let X G Cp. Partition X as x=\Xn Xu) q %2\ ^22 J P-Q q p-q Then Xu and Χ22Ί =-^22 ~" -^21-^11 -^12 are independent and Хц G Cq, and -Χ22Ί ^ Lsp-q. THEOREM 9.5.2. Let X G Cp. Then, for α (ρ χ 1) φ О, (г) has same distribution as Хц where X = (xij), and a'a (ii) —r^-;— has same distribution as xu where X~l = (xlJ). a'X xa THEOREM 9.5.3. Let X eCp and Υ G Cp be independent Further let Τ and U be two different square roots ofY. Then TXT' and UXU' have identical distributions. THEOREM 9.5.4. Let X G Cp and Υ G Cp be independent. Then for any square root Τ of Υ', the distribution of Ζ = TXT' belongs to Cp. From the above theorem it follows that if Γ = (Γι Γ2), Γ* (ρ χ ρ*), г = 1,2, V\ +Ρ2 = Ρ is a random orthogonal matrix independent of Ζ G Cp, then ^[ΖΓ^^ G C^ and (ΓΊΖ"1^)-1 G C^ are independent. Further if E(Z), E(Z~l), and E(Za), a an integer, exist, then (i) E(Z) = alp (n)E(Z'l) = bIp (iii) E(Z") = Ca/p, where a = Е(хцуп), b = E(xu)E(yu), and the constant ca depends on moments of order less than or equal to α of X and Y. Let Z® be any principal minor of Ζ of order г and Υ = TV, X = UU' be lower triangular factorizations. Then det(Zfl) 2 2 ^"det^-1))-^"' l~ 1'···'ρ' where det(Z^) = 1, are independent and S(det(Zr) = nS(«g) provided the expectations involved exist. Let A ~ Bfaubil Bi ~ B^ici.di), i = 1,2, A ~ ^(a,6), and Б - В£7М be independent. Define Zx = AM2(Af)' (9.5.1) Z2 = B{B2(B{)' (9.5.2)
9.5. OEIARIM DISTRIBUTIONS 325 Z3 = A*B(A*)' (9.5.3) and Z4 = B$A(B2)f. (9.5.4) Then from Theorem 9.5.4, it follows that Z{ G Cp, г = 1,2,3,4. From Theorem 8.12.1, Z\ ~ #ρ(α2,δι,α2 + δ2 — аьδι + δ2) and its p.d.f. is 2^ι(δι, α2 + δ2 - αϊ; δχ + δ2; /ρ - Ζχ), 0 < Ζχ < /ρ. The density of Ζ2 can be shown to be 2Fi(di + c2,c2 + d2;ci + c2 + di + d2;Ip - Z2), Z2 > 0. (9.5.5) Next from the joint p.d.f. of A and B, by transforming Z3 = Αϊ Β Αϊ with the Jacobian J(A, β —>· A, Z3) = det(A)~2(p+1), and using the definition of 2Fb the marginal p.d.f. of Z3 is obtained as βρ{α, b)pp(c, d) Note that the distribution of ZA is same as that of Z3. Next let X{ ~ Gp(muIp), Y< ~ /£р(п{ + \{p+ l),/p), г = 1,2, X ~ Gp(m,/P), and У ~ IGp(n + \{p + 1),IP) be independent. Let Zb = х\хг(Х.\)' (9.5.6) Z6 = 1?У2(*?У (9-5.7) Z7 = X?Y{Xh)' (9.5.8) and Z8 = У-5Х(Г5)'. (9.5.9) Then the p.d.f. of Zb is {rp(m1)rp(m2)}-1det(Z5ri-5(P+1)Bmi_m2(Z5), Z5 > 0. Since V = {YfrjYhr1 = Prt^-1*!-*, 1Ϊ"1 = (*Ί~*)'1ί~έ ~ Gp(nb/P), У2 * ~ Gp(n2,Ip), the p.d.f. of Z6 obtained from the p.d.f. of Z5 is {rp(n1)rp(n2)}"1det(Z6)-Tll-^+1)Bni_n2(Z6), Z6 > 0, where Bg(-) is the Herz's Bessel function of type II. Note that Z7 ~ BpJ(ra,n), and also Z8 ~ BpJ(ra,n) which follows from Theorem 5.2.5.
326 CHAPTER 9. GENERAL FAMILIES OF MATRIX VARJATE DISTRIBUTIONS Further define the following random matrices which again belong to the class Cp: Z9 = AiX(Ai)' (9.5.10) Z10 = Χ*Α(ΧΪ)' (9.5.11) Zn = BtX(B*)' (9.5.12) Zl2 = X±B{X*)' (9.5.13) Z13 = Α$Υ(Α$)' (9.5.14) Zu = Υ$Α(Υ*)' (9.5.15) Z15 = BiY(B*y (9.5.16) and Zie = У*В(У*У. (9.5.17) The random matrices Z9 and Ζχο have the same density, given in Theorem 8.11.7. The random matrices Zu and Zi2 have the same density, given in Theorem 8.11.1. Similarly, the random matrices Z13 and Ζχ4 have the same density as do the random matrices Z\$ and Zie, given by «Ρίαΐπΐ\ detCZw)"""^1) Λ (α + n; a + 6 + n; -Ζ^1), Z13 > 0, and i^^det(ZJ,)--^)»(c + n;»-d+i(P+l);^),Zie>0, respectively, where ι-Ρι(-) and Φ(·) are confluent hypergeometric functions of kind 1 and 2. The p.d.f.'s of random matrices Z», г = 1,..., 16 have been studied by Khattree and R. D. Gupta (1992). However, in their paper the p.d.f.'s of Z2, Z5, Z6, Zu and Zi3 seem to be in error. These p.d.f.'s have been given here in their corrected forms. Next we give some properties of the random matrix Z\ eCp. (i) For α (ρ χ 1)^0, o!Z\a a'a and H\ 0*2, &i, a2 + b2- αϊ; b\ + b2) ,„-i ~ H[(a2 - -{v - 1), bu a2 + b2 - <ц; bx + b2). α Δι α κ ζ ' (ii) Let Zi = (zUj) and Zx l = (z\J). Then гш ~ H{(a2,bua2 + b2- ax\bx + 62), i = 1,... ,p and zf ~ H[{a2 - \[p - 1),bua2 + b2- <ц; bi + Ь2), г = 1,... ,p.
9.5. ORIARIM DISTRIBUTIONS 327 Ζι = ,Ρι+Ρ2=Ρ· (iii) Let Z\\\ Ζ112 \ V\ Ζ121 Zu2 J V2 V\ V2 Then Zin and Z122.i = ^122 ~~ ^121^111^112 are independent, Zm ~ H (α2,&ι,α2 + b2 - ax; bi + b2) and Z122.1 ~ Щ2(а2 ~ \Pi,h,a2 + b2- ax; Ьг + b2). (iv) Let Zf = (zijk), l<j,k<i. Define det(Zf]) Vi = ,г = 1,...,ρ and det(ZfJ) = l. detizf"11)' Then ^i,... ,vp are mutually independent and v% ~ #f(a2 — \{i — 1),ί>ι,α2 + £>2 — αϊ; bi + 62), г = 1,... ,p. Further the p.d.f. of det(Zi) is the same as that of Π£=ι ν%· (ν) For ρ = 2, det(Zi)i ~ Я1/(а2 - 1,6ι,α2 + b2 - аг;Ьг + b2) (vi) Using the representation Ζλ = ΑϊΑ2(Αϊ)\ the following expected values can easily be obtained: α^2 ВД) = (ai + bi)(o2+b2) 4». ! (2ai + 2bt - ρ - l)(2a2 + 262 - ρ - 1) r Ь{АХ ) = τττ^ ln E{CK(Z{)) = (2ai-p-l)(2a2-p-l) (αι)κ(α2)κ едо, (αϊ + ί>ι)κ(α2 + b2)K P(r (7-i\\ - (~ai ~bi + \{p + !))«(-02 ~ h + \(p + 1))» ( k[ l )} ~ (-αχ + i(p + 1))„ (-a2 + I(p + !))„ ЭД), S(Z?) = αχα2 3(al+b1)(a2 + b2) Re(oi) > *i + -(p-l), г = 1,2, (αι + 1)(α2 + 1)(ρ + 2) (αϊ +6ι + 1)(α2 + 62 + 1) (2ai-l)(2a2-l)(p-l) (2ai+2b! -l)(2a2 + 262-l) £(ζΓ2) = (2ai + 2bi - ρ - l)(2a2 + 2b2 - ρ - 1) 3(2αχ -p-l)(2a2-p-l) (2ai + 2fei - ρ - 3) (2αχ-ρ-3) (2a2 + 262 - Ρ - 3)(p + 2) (2ai + 26χ - p)(2a2 + 262 - p)(p - 1) (2a2-p-3) (2ai -p)(2a2-p) 4» Re(oi)>-(p + 3),t = l,2. The results (i) through (vi) given above are based on Khattree and R. D. Gupta
(1992), who also derived them for $Z_2$ and $Z_3$. Similar results for the random matrices $Z_9$, $Z_{11}$ and $Z_{15}$ are available in Khatri, Khattree and R. D. Gupta (1991). Using their approach, results for the other random matrices can also be derived.

PROBLEMS

9.1. Prove Theorem 9.2.1.

9.2. Prove Theorem 9.2.2.

9.3. Prove Theorem 9.2.3.

9.4. Prove Theorem 9.2.4.

9.5. Prove Theorem 9.2.5.

9.6. Prove Theorem 9.2.6.

9.7. Prove that if $X\,(p \times n)$ is spherical, then for given $K\,(q \times p)$ and $Q\,(n \times m)$, the distribution of $KXQ$ depends only on $KK'$ and $Q'Q$.

9.8. Let $X \sim RS_{p,n}(\phi)$ with density function $f(XX')$. Partition $X$ as $X = (X_1\ X_2)$, $X_i\,(p \times n_i)$, $n_i \ge p$, $i = 1, 2$, $n_1 + n_2 = n$. Then prove that
$$(XX')^{-1/2}X_1X_1'(XX')^{-1/2} \sim B_p^{I}\big(\tfrac{1}{2}n_1, \tfrac{1}{2}n_2\big).$$

9.9. Let $X \sim RS_{p,n}(\phi)$ with density function $f(XX')$. Partition the random matrix $X\,(p \times n)$ as $X = (X_1, \ldots, X_r)$, $X_i\,(p \times n_i)$, $n_i \ge p$, $i = 1, \ldots, r$, $n_1 + \cdots + n_r = n$. Define $W_i = (XX')^{-1/2}X_iX_i'(XX')^{-1/2}$, $i = 1, \ldots, r-1$. Then prove that
$$(W_1, \ldots, W_{r-1}) \sim D_p^{I}\big(\tfrac{1}{2}n_1, \ldots, \tfrac{1}{2}n_{r-1};\ \tfrac{1}{2}n_r\big).$$

9.10. Let $X \sim RS_{p,n}(\phi)$ with density function $f(XX')$, $n \ge p$. Then prove that the density function of $W = (XX')^{-1}$ is
$$\frac{\pi^{\frac{1}{2}np}}{\Gamma_p(\frac{1}{2}n)} \det(W)^{-\frac{1}{2}(n+p+1)} f(W^{-1}), \quad W > 0.$$

9.11. Let $S_i \sim W_p(n_i, I_p)$, $i = 1, 2$, be independent. Prove that $(S_1+S_2)^{-1}S_i(S_1+S_2)^{-1} \in C_p$, $i = 1, 2$.

9.12. For the random matrix $Z_5$ defined in (9.5.6), prove that

(i) for $\mathbf{a}\,(p \times 1) \neq 0$, the p.d.f. of $v = \mathbf{a}'Z_5\mathbf{a}/\mathbf{a}'\mathbf{a}$ is
$$2\{\Gamma(m_1)\Gamma(m_2)\}^{-1} v^{\frac{1}{2}(m_1+m_2-2)} K_{m_1-m_2}(2\sqrt{v}), \quad v > 0,$$

(ii) for $\mathbf{a}\,(p \times 1) \neq 0$, the p.d.f. of $u = \mathbf{a}'\mathbf{a}/\mathbf{a}'Z_5^{-1}\mathbf{a}$ is
$$2\big\{\Gamma[m_1-\tfrac{1}{2}(p-1)]\,\Gamma[m_2-\tfrac{1}{2}(p-1)]\big\}^{-1} u^{\frac{1}{2}(m_1+m_2-p-1)} K_{m_1-m_2}(2\sqrt{u}), \quad u > 0,$$

where $K_\delta$ is the Bessel function of scalar argument of the third kind.
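9.13. (contd.) Partition $Z_5$ as
$$Z_5 = \begin{pmatrix} Z_{511} & Z_{512} \\ Z_{521} & Z_{522} \end{pmatrix}, \quad p_1 + p_2 = p,$$
where $Z_{511}$ is $p_1 \times p_1$ and $Z_{522}$ is $p_2 \times p_2$. Then prove that the random matrices $Z_{511}$ and $Z_{522\cdot 1} = Z_{522} - Z_{521}Z_{511}^{-1}Z_{512}$ are independent. Further prove that the p.d.f. of $Z_{511}$ is
$$\{\Gamma_{p_1}(m_1)\Gamma_{p_1}(m_2)\}^{-1} \det(Z_{511})^{m_1-\frac{1}{2}(p_1+1)} B_{m_1-m_2}(Z_{511}), \quad Z_{511} > 0,$$
and the p.d.f. of $Z_{522\cdot 1}$ is
$$\big\{\Gamma_{p_2}(m_1-\tfrac{1}{2}p_1)\,\Gamma_{p_2}(m_2-\tfrac{1}{2}p_1)\big\}^{-1} \det(Z_{522\cdot 1})^{m_1-\frac{1}{2}(p_1+p_2+1)} B_{m_1-m_2}(Z_{522\cdot 1}), \quad Z_{522\cdot 1} > 0.$$

9.14. (contd.) Let $Z_5^{[i]} = (z_{5jk})$, $1 \le j, k \le i$. Define
$$v_i = \frac{\det(Z_5^{[i]})}{\det(Z_5^{[i-1]})}, \quad i = 1, \ldots, p, \quad \text{and} \quad \det(Z_5^{[0]}) = 1.$$
Then prove that $v_1, \ldots, v_p$ are independent and the p.d.f. of $v_i$ is
$$2\big\{\Gamma[m_1-\tfrac{1}{2}(i-1)]\,\Gamma[m_2-\tfrac{1}{2}(i-1)]\big\}^{-1} v_i^{\frac{1}{2}(m_1+m_2-i-1)} K_{m_1-m_2}(2\sqrt{v_i}), \quad v_i > 0.$$

9.15. (contd.) Prove that
$$E(Z_5) = m_1m_2\, I_p, \quad \mathrm{Re}(m_i) > \tfrac{1}{2}(p-1),\ i = 1, 2,$$
$$E(Z_5^{-1}) = \frac{4}{(2m_1-p-1)(2m_2-p-1)}\, I_p,$$
$$E(C_\kappa(Z_5)) = (m_1)_\kappa (m_2)_\kappa\, C_\kappa(I_p),$$
$$E(C_\kappa(Z_5^{-1})) = \frac{1}{\big(-m_1+\tfrac{1}{2}(p+1)\big)_\kappa \big(-m_2+\tfrac{1}{2}(p+1)\big)_\kappa}\, C_\kappa(I_p), \quad \mathrm{Re}(m_i) > k_1 + \tfrac{1}{2}(p-1),\ i = 1, 2.$$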
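The matrix moments quoted in property (vi) and in Problem 9.15 can be cross-checked against the zonal-polynomial expectations stated alongside them. The sketch below is one possible check, in exact rational arithmetic, and rests on two assumptions that are labeled in the comments: that $E(Z^2)$ and $E(Z^{-1})$ are scalar multiples of $I_p$ (orthogonal invariance of the class $C_p$), and the standard degree-one and degree-two identities $\mathrm{tr}(Z) = C_{(1)}(Z)$ and $\mathrm{tr}(Z^2) = C_{(2)}(Z) - \tfrac{1}{2}C_{(1,1)}(Z)$. All function names are hypothetical and chosen for illustration only.

```python
from fractions import Fraction as F


def poch(a, kappa):
    """Generalized Pochhammer symbol (a)_kappa = prod_i (a - (i-1)/2)_{k_i}."""
    out = F(1)
    for i, k in enumerate(kappa):        # i = 0 corresponds to the first part k_1
        for j in range(k):
            out *= F(a) - F(i, 2) + j
    return out


def zonal_at_identity(kappa, p):
    """C_kappa(I_p) for partitions of 1 and 2 only (standard values)."""
    table = {
        (1,): F(p),
        (2,): F(p * (p + 2), 3),
        (1, 1): F(2 * p * (p - 1), 3),
    }
    return table[tuple(kappa)]


def z1_second_moment_via_zonal(p, a1, b1, a2, b2):
    """Coefficient c with E(Z_1^2) = c I_p, built from E[C_kappa(Z_1)].

    Assumption: E(Z_1^2) is proportional to I_p, so c = E[tr(Z_1^2)] / p,
    and tr(Z^2) = C_(2)(Z) - C_(1,1)(Z) / 2.
    """
    def e_zonal(kappa):
        ratio = (poch(a1, kappa) * poch(a2, kappa)
                 / (poch(a1 + b1, kappa) * poch(a2 + b2, kappa)))
        return ratio * zonal_at_identity(kappa, p)

    return (e_zonal((2,)) - e_zonal((1, 1)) / 2) / p


def z1_second_moment_closed_form(p, a1, b1, a2, b2):
    """Coefficient of I_p in the closed form for E(Z_1^2) in property (vi)."""
    lead = F(a1 * a2) / (3 * (a1 + b1) * (a2 + b2))
    t1 = F((a1 + 1) * (a2 + 1) * (p + 2)) / ((a1 + b1 + 1) * (a2 + b2 + 1))
    t2 = (F((2 * a1 - 1) * (2 * a2 - 1) * (p - 1))
          / ((2 * a1 + 2 * b1 - 1) * (2 * a2 + 2 * b2 - 1)))
    return lead * (t1 - t2)


def z5_inverse_moment_via_zonal(p, m1, m2):
    """Coefficient of I_p in E(Z_5^{-1}) from E[C_(1)(Z_5^{-1})] (Problem 9.15)."""
    kappa = (1,)
    shift = F(p + 1, 2)
    denom = poch(-F(m1) + shift, kappa) * poch(-F(m2) + shift, kappa)
    return zonal_at_identity(kappa, p) / denom / p


def z5_inverse_moment_closed_form(p, m1, m2):
    """Coefficient of I_p in the closed form 4 / ((2 m1 - p - 1)(2 m2 - p - 1))."""
    return F(4) / ((2 * m1 - p - 1) * (2 * m2 - p - 1))


if __name__ == "__main__":
    args = (3, F(5, 2), F(2), F(3), F(7, 2))   # illustrative parameter values
    print(z1_second_moment_via_zonal(*args) == z1_second_moment_closed_form(*args))
    print(z5_inverse_moment_via_zonal(3, 4, 5) == z5_inverse_moment_closed_form(3, 4, 5))
```

Because the arithmetic is exact, agreement for a few parameter choices is a useful consistency check on the transcription of the formulas, though it is not a proof of them.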
GLOSSARY OF NOTATIONS AND ABBREVIATIONS

$A\,(p \times q)$ : matrix with $p$ rows and $q$ columns
$A = (a_{ij})$ : matrix with elements $a_{ij}$'s
$A'$ : transposed matrix of $A$
$A^{-1} = (a^{ij})$ : inverse of a nonsingular matrix $A$, with elements $a^{ij}$
$A^{[\alpha]}$ : $A^{[\alpha]} = (a_{ij})$, $1 \le i, j \le \alpha$
$A_{[\alpha]}$ : $A_{[\alpha]} = (a_{ij})$, $p - \alpha + 1 \le i, j \le p$
$A_{(\alpha)}$ : $A_{(\alpha)} = (a_{ij})$, $\alpha \le i, j \le p$
$D_a = \mathrm{diag}(a_1, \ldots, a_p)$ : diagonal matrix with elements $a_1, \ldots, a_p$ along the main diagonal
$I_p$ : unit matrix of order $p$
$\det(A)$ : determinant of a nonsingular square matrix
$\mathrm{tr}(A)$ : trace of a square matrix
$A \otimes B$ : Kronecker product (direct product) of the matrices $A$ and $B$
$A > 0$ : $A$ is positive definite
$A \ge 0$ : $A$ is positive semidefinite
$A > B$ : $A - B$ is positive definite
$A \ge B$ : $A - B$ is positive semidefinite
$\mathbf{a}\,(p \times 1)$ : column vector with elements $a_1, \ldots, a_p$
$\mathbf{e}\,(p \times 1)$ : column vector with elements unity
$\mathbf{e}_i\,(p \times 1)$ : column vector with unity at $i$th place and zero elsewhere
$\mathrm{vec}(X)$ : for a matrix $X\,(m \times n)$, $\mathrm{vec}(X)$ is an $mn \times 1$ vector defined as $\mathrm{vec}(X) = (\mathbf{x}_1', \ldots, \mathbf{x}_n')'$,
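where $\mathbf{x}_i$, $i = 1, \ldots, n$, is the $i$th column of $X$
$\mathrm{vecp}(X)$ : for a symmetric matrix $X\,(p \times p)$, $\mathrm{vecp}(X)$ is a $\frac{1}{2}p(p+1)$ column vector formed from the elements above and including the diagonal, taken columnwise. In other words, if
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{p1} & x_{p2} & \cdots & x_{pp} \end{pmatrix}, \quad \text{then} \quad \mathrm{vecp}(X) = (x_{11}, x_{12}, x_{22}, \ldots, x_{1p}, \ldots, x_{pp})'$$
$O(p, n)$ : Stiefel manifold, $O(p, n) = \{H_1\,(p \times n) : H_1H_1' = I_p\}$
$O(p)$ : orthogonal group, $O(p) = \{H\,(p \times p) : HH' = I_p\}$
$\blacksquare$ : end of the proof of a theorem (corollary)
$\sim$ : is distributed as
$\stackrel{d}{=}$ : equal in distribution
$E(x)$, $E(\mathbf{x})$, $E(X)$ : expected values of random quantity $x$, $\mathbf{x}$ and $X$ respectively
$\mathrm{var}(x)$ : variance of a random variable $x$
$\mathrm{cov}(x, y)$ : covariance of random variables $x$ and $y$
$\mathrm{corr}(x, y)$ : correlation coefficient between random variables $x$ and $y$
$\mathrm{var}(\mathbf{x})$ : covariance matrix of a random vector $\mathbf{x}$
$\mathrm{cov}(\mathbf{x}, \mathbf{y})$ : covariance matrix of random vectors $\mathbf{x}$ and $\mathbf{y}$
$\mathrm{cov}(X, Y)$ : covariance matrix of random matrices $X\,(p \times n)$ and $Y\,(r \times s)$, $\mathrm{cov}(X, Y) = \mathrm{cov}(\mathrm{vec}(X'), \mathrm{vec}(Y'))$
$\mathrm{etr}(A)$ : $\exp\{\mathrm{tr}(A)\}$
$J(X \to Y)$ : Jacobian of the transformation $Y = F(X)$
$\delta_{ij}$ : Kronecker delta, $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$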
$\Gamma_p(a)$ : multivariate gamma function, $\Gamma_p(a) = \pi^{\frac{1}{4}p(p-1)} \prod_{j=1}^{p} \Gamma[a - \frac{1}{2}(j-1)]$, $\mathrm{Re}(a) > \frac{1}{2}(p-1)$
$\beta_p(a, b)$ : multivariate beta function, $\beta_p(a, b) = \dfrac{\Gamma_p(a)\Gamma_p(b)}{\Gamma_p(a+b)}$
$\Gamma_p^*(a_1, \ldots, a_p)$ : generalized multivariate gamma function, $\Gamma_p^*(a_1, \ldots, a_p) = \pi^{\frac{1}{4}p(p-1)} \prod_{j=1}^{p} \Gamma[a_j - \frac{1}{2}(j-1)]$
$\beta_p^*(a_1, \ldots, a_p; b_1, \ldots, b_p)$ : generalized multivariate beta function, $\beta_p^*(a_1, \ldots, a_p; b_1, \ldots, b_p) = \dfrac{\Gamma_p^*(a_1, \ldots, a_p)\,\Gamma_p^*(b_1, \ldots, b_p)}{\Gamma_p^*(a_1+b_1, \ldots, a_p+b_p)}$
$\int_{A>0} f(A)\,dA$ : integral of $f(A)$ over the domain $\{A : A > 0\}$ where $dA = \prod_{i \le j} da_{ij}$
$\int_{G(X)=0} f(X)\,dX$ : integral of $f(X)$ over the domain $\{X : G(X) = 0\}$ where $dX = \prod_{i,j} dx_{ij}$
$[(dH_1)H_1']$ : invariant measure on Stiefel manifold
$[(dH)H']$ : invariant measure on orthogonal group
$[dH_1]$ : unit invariant measure on Stiefel manifold
$[dH]$ : unit invariant measure on orthogonal group or Haar measure
p.d.f. : probability density function
c.d.f. : cumulative distribution function
m.g.f. : moment generating function
c.f. : characteristic function
c.g.f. : cumulant generating function
$\mathrm{Re}(h)$ : real part of $h$

UNIVARIATE DISTRIBUTIONS

$N(\mu, \sigma^2)$ : normal distribution; its probability density function is
$$\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \quad x \in \mathbb{R},\ \mu \in \mathbb{R} \text{ and } \sigma \in \mathbb{R}_+$$
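$\chi_n^2$ : chi-square distribution; its probability density function is
$$\frac{1}{2^{\frac{1}{2}n}\Gamma(\frac{1}{2}n)}\, x^{\frac{1}{2}n-1} \exp\left(-\tfrac{1}{2}x\right), \quad x \in \mathbb{R}_+,\ n > 0$$
$t_n$ : Student's $t$-distribution; its probability density function is
$$\frac{\Gamma[\frac{1}{2}(n+1)]}{\sqrt{n\pi}\,\Gamma(\frac{1}{2}n)} \left(1 + \frac{x^2}{n}\right)^{-\frac{1}{2}(n+1)}, \quad x \in \mathbb{R},\ n > 0$$
$B^{I}(a, b)$ : beta type I distribution; its probability density function is
$$\frac{1}{\beta(a, b)}\, x^{a-1}(1-x)^{b-1}, \quad 0 < x < 1,\ a > 0,\ b > 0$$
$B^{II}(a, b)$ : beta type II distribution; its probability density function is
$$\frac{1}{\beta(a, b)}\, x^{a-1}(1+x)^{-(a+b)}, \quad x > 0,\ a > 0,\ b > 0$$

MULTIVARIATE DISTRIBUTIONS

$N_p(\boldsymbol{\mu}, \Sigma)$ : multivariate normal distribution; its p.d.f. is
$$(2\pi)^{-\frac{1}{2}p} \det(\Sigma)^{-\frac{1}{2}}\, \mathrm{etr}\left\{-\tfrac{1}{2}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})'\right\}, \quad \mathbf{x} \in \mathbb{R}^p,\ \boldsymbol{\mu} \in \mathbb{R}^p, \text{ and } \Sigma > 0$$
$t_p(n, \omega, \boldsymbol{\mu}, \Sigma)$ : multivariate $t$-distribution; its p.d.f. is
$$\frac{\Gamma[\frac{1}{2}(n+p)]}{(\pi\omega)^{\frac{1}{2}p}\,\Gamma(\frac{1}{2}n)} \det(\Sigma)^{-\frac{1}{2}} \left(1 + \frac{1}{\omega}(\mathbf{x}-\boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)^{-\frac{1}{2}(n+p)}, \quad \mathbf{x} \in \mathbb{R}^p,\ \boldsymbol{\mu} \in \mathbb{R}^p,\ \omega > 0, \text{ and } \Sigma > 0$$
$D^{I}(a_1, \ldots, a_r; a_{r+1})$ : Dirichlet type I distribution; its p.d.f. is
$$\frac{\Gamma\left(\sum_{i=1}^{r+1} a_i\right)}{\prod_{i=1}^{r+1}\Gamma(a_i)} \prod_{i=1}^{r} u_i^{a_i-1} \left(1 - \sum_{i=1}^{r} u_i\right)^{a_{r+1}-1}, \quad 0 < u_i < 1,\ \sum_{i=1}^{r} u_i < 1,$$
where $a_i > 0$, $i = 1, \ldots, r+1$
$D^{II}(b_1, \ldots, b_r; b_{r+1})$ : Dirichlet type II distribution; its p.d.f. is
$$\frac{\Gamma\left(\sum_{i=1}^{r+1} b_i\right)}{\prod_{i=1}^{r+1}\Gamma(b_i)} \prod_{i=1}^{r} v_i^{b_i-1} \left(1 + \sum_{i=1}^{r} v_i\right)^{-\sum_{i=1}^{r+1} b_i}, \quad v_i > 0,$$
where $b_i > 0$, $i = 1, \ldots, r+1$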
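MATRIX VARIATE DISTRIBUTIONS

$N_{p,n}(M, \Sigma \otimes \Psi)$ : matrix variate normal distribution; its p.d.f. is
$$(2\pi)^{-\frac{1}{2}np} \det(\Sigma)^{-\frac{1}{2}n} \det(\Psi)^{-\frac{1}{2}p}\, \mathrm{etr}\left\{-\tfrac{1}{2}\Sigma^{-1}(X-M)\Psi^{-1}(X-M)'\right\}, \quad X \in \mathbb{R}^{p \times n},$$
where $M \in \mathbb{R}^{p \times n}$, $\Sigma > 0$ and $\Psi > 0$
$SN_{p,p}(M, B_p'(\Sigma \otimes \Psi)B_p)$ : symmetric matrix variate normal distribution; its p.d.f. is
$$(2\pi)^{-\frac{1}{4}p(p+1)} \det(B_p'(\Sigma \otimes \Psi)B_p)^{-\frac{1}{2}}\, \mathrm{etr}\left[-\tfrac{1}{2}\Sigma^{-1}(X-M)\Psi^{-1}(X-M)\right], \quad X = X' \in \mathbb{R}^{p \times p},$$
where $M = M' \in \mathbb{R}^{p \times p}$, $\Sigma > 0$ and $\Psi > 0$
$N_{p,n}(M, \Sigma \otimes \Psi \mid s, C)$ : restricted matrix variate normal distribution; its p.d.f. is
$$(2\pi)^{-\frac{1}{2}(n-s)p} \det(\Psi)^{-\frac{1}{2}p} \det(C'\Psi C)^{\frac{1}{2}p} \det(\Sigma)^{-\frac{1}{2}(n-s)}\, \mathrm{etr}\left\{-\tfrac{1}{2}\Sigma^{-1}(X-M)\Psi^{-1}(X-M)'\right\}, \quad XC = 0,$$
where $M \in \mathbb{R}^{p \times n}$, $\Sigma > 0$ and $\Psi > 0$
$N_{p,n}(M, A, B, \theta)$ : matrix variate $\theta$-generalized normal distribution; its p.d.f. is
$$\left[2\Gamma\left(1 + \tfrac{1}{\theta}\right)\right]^{-np} \det(A)^{-n} \det(B)^{-p} \exp\left\{-\sum_{i=1}^{p}\sum_{j=1}^{n} \left|\sum_{k=1}^{p}\sum_{\ell=1}^{n} a^{ik}(x_{k\ell} - m_{k\ell})\,b^{\ell j}\right|^{\theta}\right\}, \quad X \in \mathbb{R}^{p \times n},$$
where $M \in \mathbb{R}^{p \times n}$, $A > 0$, $B > 0$, $A^{-1} = (a^{ik})$, $B^{-1} = (b^{\ell j})$, $M = (m_{k\ell})$, and $X = (x_{k\ell})$
$W_p(n, \Sigma)$ : Wishart distribution; its probability density function is
$$\left\{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n) \det(\Sigma)^{\frac{1}{2}n}\right\}^{-1} \det(S)^{\frac{1}{2}(n-p-1)}\, \mathrm{etr}\left(-\tfrac{1}{2}\Sigma^{-1}S\right), \quad S > 0,\ n \ge p$$
$W_p(n, \Sigma, \Theta)$ : noncentral Wishart distribution; its p.d.f. is
$$\left\{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n) \det(\Sigma)^{\frac{1}{2}n}\right\}^{-1} \mathrm{etr}\left(-\tfrac{1}{2}\Theta\right) \mathrm{etr}\left(-\tfrac{1}{2}\Sigma^{-1}S\right) \det(S)^{\frac{1}{2}(n-p-1)}\; {}_0F_1\left(\tfrac{1}{2}n;\ \tfrac{1}{4}\Theta\Sigma^{-1}S\right), \quad S > 0,\ n \ge p,$$
where ${}_0F_1$ is the hypergeometric function (Bessel function)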
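$IW_p(m, \Psi)$ : inverted Wishart distribution; its probability density function is
$$\frac{2^{-\frac{1}{2}(m-p-1)p} \det(\Psi)^{\frac{1}{2}(m-p-1)}}{\Gamma_p[\frac{1}{2}(m-p-1)]\, \det(V)^{\frac{1}{2}m}}\, \mathrm{etr}\left(-\tfrac{1}{2}V^{-1}\Psi\right), \quad V > 0,$$
where $\Psi > 0$ and $m > 2p$
$IW_p(m, \Psi, \Theta)$ : noncentral inverted Wishart distribution; its p.d.f. is
$$\frac{2^{-\frac{1}{2}(m-p-1)p} \det(\Psi)^{\frac{1}{2}(m-p-1)}}{\Gamma_p[\frac{1}{2}(m-p-1)]}\, \mathrm{etr}\left(-\tfrac{1}{2}\Theta\right) \mathrm{etr}\left(-\tfrac{1}{2}V^{-1}\Psi\right) \det(V)^{-\frac{1}{2}m}\; {}_0F_1\left(\tfrac{1}{2}(m-p-1);\ \tfrac{1}{4}\Theta\Psi V^{-1}\right), \quad V > 0,$$
where $\Psi > 0$ and $m > 2p$
$G_p(a, C)$ : matrix variate gamma distribution; its probability density function is
$$\left\{\Gamma_p(a) \det(C)^{-a}\right\}^{-1} \mathrm{etr}(-CW) \det(W)^{a-\frac{1}{2}(p+1)}, \quad W > 0,$$
where $C > 0$ and $a > \frac{1}{2}(p-1)$
$IG_p(m, B)$ : inverted matrix variate gamma distribution; its probability density function is
$$\frac{\det(B)^{m-\frac{1}{2}(p+1)}}{\Gamma_p[m - \frac{1}{2}(p+1)]} \det(W)^{-m}\, \mathrm{etr}(-BW^{-1}), \quad W > 0,$$
where $B > 0$, and $m > p$
$G_p(a, C, \Theta)$ : noncentral matrix variate gamma distribution; its probability density function is
$$\left\{\Gamma_p(a) \det(C)^{-a}\right\}^{-1} \mathrm{etr}(-\Theta - CW) \det(W)^{a-\frac{1}{2}(p+1)}\; {}_0F_1(a;\ \Theta CW), \quad W > 0,\ a > \tfrac{1}{2}(p-1)$$
$BG_p^{I}(a_1, \ldots, a_p; C)$ : Bellman gamma type I distribution; its p.d.f. is
$$\left\{\Gamma_p^*(a_1, \ldots, a_p) \prod_{\alpha=1}^{p} \det(C_{(\alpha)})^{-m_\alpha}\right\}^{-1} \mathrm{etr}(-CW) \det(W)^{a_p-\frac{1}{2}(p+1)} \prod_{\alpha=1}^{p-1} \det(W^{[\alpha]})^{-m_{\alpha+1}}, \quad W > 0,$$
where $C > 0$, $a_j = m_1 + \cdots + m_j$, $a_j > \frac{1}{2}(j-1)$, $j = 1, \ldots, p$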
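$BG_p^{II}(b_1, \ldots, b_p; B)$ : Bellman gamma type II distribution; its p.d.f. is
$$\left\{\Gamma_p^*(b_1, \ldots, b_p) \prod_{\alpha=1}^{p} \det(B^{[\alpha]})^{-k_\alpha}\right\}^{-1} \mathrm{etr}(-BW) \det(W)^{b_p-\frac{1}{2}(p+1)} \prod_{\alpha=2}^{p} \det(W_{(\alpha)})^{-k_{\alpha-1}}, \quad W > 0,$$
where $B > 0$, $b_j = k_{p-j+1} + \cdots + k_p$, $b_j > \frac{1}{2}(j-1)$, $j = 1, \ldots, p$
$T_{p,m}(n, M, \Sigma, \Omega)$ : matrix variate $t$-distribution; its p.d.f. is
$$\frac{\Gamma_p[\frac{1}{2}(n+m+p-1)]}{\pi^{\frac{1}{2}mp}\,\Gamma_p[\frac{1}{2}(n+p-1)]} \det(\Sigma)^{-\frac{1}{2}m} \det(\Omega)^{-\frac{1}{2}p} \det\left(I_p + \Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\right)^{-\frac{1}{2}(n+m+p-1)}, \quad T \in \mathbb{R}^{p \times m},$$
where $M \in \mathbb{R}^{p \times m}$, $\Omega\,(m \times m) > 0$, $\Sigma\,(p \times p) > 0$ and $n > 0$
$IT_{p,m}(n, M, \Sigma, \Omega)$ : inverted matrix variate $t$-distribution; its p.d.f. is
$$\frac{\Gamma_p[\frac{1}{2}(n+m+p-1)]}{\pi^{\frac{1}{2}mp}\,\Gamma_p[\frac{1}{2}(n+p-1)]} \det(\Sigma)^{-\frac{1}{2}m} \det(\Omega)^{-\frac{1}{2}p} \det\left(I_p - \Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\right)^{\frac{1}{2}(n-2)}, \quad T \in \mathbb{R}^{p \times m},$$
where $I_p - \Sigma^{-1}(T-M)\Omega^{-1}(T-M)' > 0$, $M \in \mathbb{R}^{p \times m}$, $\Omega\,(m \times m) > 0$, $\Sigma\,(p \times p) > 0$ and $n > 0$
$UDT_{p,m}(n+p, M, \Sigma)$ : upper disguised matrix variate $t$-distribution; its p.d.f. is
$$\{K(m, p, n+p)\}^{-1} \det(\Sigma)^{-\frac{1}{2}m} \det\left(I_m + (T-M)'\Sigma^{-1}(T-M)\right)^{-\frac{1}{2}(n+p-m-1)} \prod_{i=1}^{m} \det\left(\left(I_m + (T-M)'\Sigma^{-1}(T-M)\right)^{[i]}\right)^{-1}, \quad T \in \mathbb{R}^{p \times m}$$
$LDT_{p,m}(n+p, M, \Sigma)$ : lower disguised matrix variate $t$-distribution; its p.d.f. is
$$\{K(m, p, n+p)\}^{-1} \det(\Sigma)^{-\frac{1}{2}m} \det\left(I_m + (T-M)'\Sigma^{-1}(T-M)\right)^{-\frac{1}{2}(n+p-m-1)} \prod_{i=1}^{m} \det\left(\left(I_m + (T-M)'\Sigma^{-1}(T-M)\right)_{[i]}\right)^{-1}, \quad T \in \mathbb{R}^{p \times m}$$
$T_{p,m}(n, M, \Sigma, \Omega \mid s, C)$ : restricted matrix variate $t$-distribution; its p.d.f. is
$$\frac{\Gamma_p[\frac{1}{2}(n+m+p-s-1)]}{\pi^{\frac{1}{2}(m-s)p}\,\Gamma_p[\frac{1}{2}(n+p-1)]} \det(\Sigma)^{-\frac{1}{2}(m-s)} \det(C'\Omega C)^{\frac{1}{2}p} \det(\Omega)^{-\frac{1}{2}p} \det\left(I_p + \Sigma^{-1}(T-M)\Omega^{-1}(T-M)'\right)^{-\frac{1}{2}(n+m+p-s-1)}, \quad TC = 0$$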
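$B_p^{I}(a, b)$ : matrix variate beta type I distribution; its density is
$$\{\beta_p(a, b)\}^{-1} \det(U)^{a-\frac{1}{2}(p+1)} \det(I_p - U)^{b-\frac{1}{2}(p+1)}, \quad 0 < U < I_p,$$
where $a > \frac{1}{2}(p-1)$, $b > \frac{1}{2}(p-1)$, and $\beta_p(a, b)$ is the multivariate beta function
$B_p^{II}(a, b)$ : matrix variate beta type II distribution; its density is
$$\{\beta_p(a, b)\}^{-1} \det(V)^{a-\frac{1}{2}(p+1)} \det(I_p + V)^{-(a+b)}, \quad V > 0,$$
where $a > \frac{1}{2}(p-1)$ and $b > \frac{1}{2}(p-1)$
$D_p^{I}(a_1, \ldots, a_r; a_{r+1})$ : matrix variate Dirichlet type I distribution; its density is
$$\{\beta_p(a_1, \ldots, a_r; a_{r+1})\}^{-1} \prod_{i=1}^{r} \det(U_i)^{a_i-\frac{1}{2}(p+1)} \det\left(I_p - \sum_{i=1}^{r} U_i\right)^{a_{r+1}-\frac{1}{2}(p+1)}, \quad 0 < U_i < I_p,\ 0 < \sum_{i=1}^{r} U_i < I_p,$$
where $a_i > \frac{1}{2}(p-1)$, $i = 1, \ldots, r+1$, and
$$\beta_p(a_1, \ldots, a_r; a_{r+1}) = \frac{\prod_{i=1}^{r+1}\Gamma_p(a_i)}{\Gamma_p\left(\sum_{i=1}^{r+1} a_i\right)}$$
$D_p^{II}(b_1, \ldots, b_r; b_{r+1})$ : matrix variate Dirichlet type II distribution; its density is
$$\{\beta_p(b_1, \ldots, b_r; b_{r+1})\}^{-1} \prod_{i=1}^{r} \det(V_i)^{b_i-\frac{1}{2}(p+1)} \det\left(I_p + \sum_{i=1}^{r} V_i\right)^{-\sum_{i=1}^{r+1} b_i}, \quad V_i > 0,$$
where $b_i > \frac{1}{2}(p-1)$, $i = 1, \ldots, r+1$
$Q_{p,n}(A, \Sigma, \Psi)$ : the density of $S = XAX'$, $A > 0$, $X \sim N_{p,n}(M, \Sigma \otimes \Psi)$ is
$$\left\{2^{\frac{1}{2}np}\,\Gamma_p(\tfrac{1}{2}n)\right\}^{-1} \det(A\Psi)^{-\frac{1}{2}p} \det(\Sigma)^{-\frac{1}{2}n} \det(S)^{\frac{1}{2}(n-p-1)}\, \mathrm{etr}\left(-\tfrac{1}{2}q^{-1}\Sigma^{-1}S\right) {}_0F_0^{(n)}\left(B,\ \tfrac{1}{2}q^{-1}\Sigma^{-1}S\right), \quad S > 0,$$
where $B = I_n - qA^{-\frac{1}{2}}\Psi^{-1}A^{-\frac{1}{2}}$ and $q > 0$ is an arbitrary constant
$M_{p,n}(F)$ : von Mises-Fisher distribution with parameter matrix $F\,(p \times n)$; its probability element is given by
$$a(F)\, \mathrm{etr}(FX')\,[dX], \quad X \in O(p, n),\ p \le n,$$
where $[dX]$ is the unit invariant measure on $O(p, n)$ and $a(F)$ is the normalizing constant given by
$$\{a(F)\}^{-1} = {}_0F_1\left(\tfrac{1}{2}n;\ \tfrac{1}{4}FF'\right) = {}_0F_1\left(\tfrac{1}{2}n;\ \tfrac{1}{4}F'F\right)$$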
$B_{p,n}(A)$ : Bingham matrix distribution with parameter matrix $A = A'$; its probability element is given by
$$b(A)\, \mathrm{etr}(XAX')\,[dX], \quad X \in O(p, n),\ p \le n,$$
where $[dX]$ is the unit invariant measure on $O(p, n)$ and $b(A)$ is the normalizing constant given by
$$\{b(A)\}^{-1} = {}_1F_1\left(\tfrac{1}{2}p;\ \tfrac{1}{2}n;\ A\right)$$
$B_{p,n}(A, B)$ : generalized Bingham matrix distribution with parameter matrices $A = A'$ and $B = B'$; its probability element is given by
$$b_1(A, B)\, \mathrm{etr}(BXAX')\,[dX], \quad X \in O(p, n),$$
where $b_1(A, B)$ is the normalizing constant
$ACG_{p,n}(\Psi)$ : matrix angular central Gaussian distribution (ACG) with parameters $p$, $n$ and $\Psi > 0$; its probability element is
$$\det(\Psi)^{-\frac{1}{2}p} \det(H\Psi^{-1}H')^{-\frac{1}{2}n}\,[dH], \quad H \in O(p, n)$$
$CH_p(n, \alpha, \beta, \text{kind 1})$ : confluent hypergeometric function kind 1 distribution; its p.d.f. is given by
$$\frac{\Gamma_p(\alpha)\Gamma_p(\beta-n)}{\Gamma_p(n)\Gamma_p(\beta)\Gamma_p(\alpha-n)} \det(X)^{n-\frac{1}{2}(p+1)}\; {}_1F_1(\alpha;\ \beta;\ -X), \quad X > 0,$$
where $\mathrm{Re}(\beta - n) > 0$, and $\mathrm{Re}(\alpha - n) > 0$. The parameters $n$, $\alpha$, and $\beta$ are restricted to take values such that the density function is non-negative
$CH_p^{I}(n, \alpha, \beta, \text{kind 2})$ : confluent hypergeometric function kind 2 and type I distribution; its p.d.f. is given by
$$\frac{\Gamma_p(\alpha)\,\Gamma_p[\alpha-\beta+\frac{1}{2}(p+1)]}{\Gamma_p(n)\,\Gamma_p(\alpha-n)\,\Gamma_p[n-\beta+\frac{1}{2}(p+1)]} \det(X)^{n-\frac{1}{2}(p+1)}\, \Psi(\alpha, \beta; X), \quad X > 0,$$
where $\mathrm{Re}(n) > \frac{1}{2}(p-1)$, $\mathrm{Re}(\alpha - n) > \frac{1}{2}(p-1)$ and $\mathrm{Re}(n - \beta) > -1$. The parameters $n$, $\alpha$ and $\beta$ are restricted to take values such that the density function is non-negative
$CH_p^{II}(n, \alpha, \beta, \text{kind 2})$ : confluent hypergeometric function kind 2 and type II distribution; its p.d.f. is given by
$$\frac{\Gamma_p[\alpha-\beta+n+\frac{1}{2}(p+1)]}{\Gamma_p(n)\,\Gamma_p[n-\beta+\frac{1}{2}(p+1)]} \det(X)^{n-\frac{1}{2}(p+1)}\, \mathrm{etr}(-X)\, \Psi(\alpha, \beta; X), \quad X > 0,$$
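where $\mathrm{Re}(n) > \frac{1}{2}(p-1)$, $\mathrm{Re}(\alpha) > \frac{1}{2}(p-1)$ and $\mathrm{Re}(n - \beta) > -1$. The parameters $n$, $\alpha$ and $\beta$ are restricted to take values such that the density function is non-negative
$H_p^{I}(n, \alpha, \beta, \gamma)$ : hypergeometric function distribution of type I; its p.d.f. is given by
$$\frac{\Gamma_p(\gamma+n-\alpha)\,\Gamma_p(\gamma+n-\beta)}{\Gamma_p(\gamma)\,\Gamma_p(n)\,\Gamma_p(\gamma+n-\alpha-\beta)} \det(X)^{n-\frac{1}{2}(p+1)} \det(I_p - X)^{\gamma-\frac{1}{2}(p+1)}\; {}_2F_1(\alpha, \beta;\ \gamma;\ I_p - X), \quad 0 < X < I_p,$$
where $\mathrm{Re}(\gamma + n - \alpha - \beta) > \frac{1}{2}(p-1)$, $\mathrm{Re}(\gamma) > \frac{1}{2}(p-1)$ and $\mathrm{Re}(n) > \frac{1}{2}(p-1)$. The parameters $\alpha$, $\beta$, $\gamma$ and $n$ are restricted to take values such that the density function is non-negative
$H_p^{II}(n, \alpha, \beta, \gamma; A)$ : hypergeometric function distribution of type II; its p.d.f. is given by
$$\frac{\Gamma_p(\alpha)\,\Gamma_p(\beta)\,\Gamma_p(\gamma-n) \det(A)^{n}}{\Gamma_p(n)\,\Gamma_p(\gamma)\,\Gamma_p(\alpha-n)\,\Gamma_p(\beta-n)} \det(X)^{n-\frac{1}{2}(p+1)}\; {}_2F_1(\alpha, \beta;\ \gamma;\ -AX), \quad X > 0,$$
where $A > 0$, $\mathrm{Re}(\gamma - n) > \frac{1}{2}(p-1)$, $\mathrm{Re}(\alpha - n) > \frac{1}{2}(p-1)$ and $\mathrm{Re}(\beta - n) > \frac{1}{2}(p-1)$. The parameters $n$, $\alpha$, $\beta$ and $\gamma$ are restricted to take values such that the density function is non-negative
$L_p^{(1)}(g, a_1, \ldots, a_r)$ : matrix variate Liouville distribution of the first kind; its p.d.f. is proportional to
$$\prod_{i=1}^{r} \det(X_i)^{a_i-\frac{1}{2}(p+1)}\, g\left(\sum_{i=1}^{r} X_i\right), \quad X_i > 0,\ a_i > \tfrac{1}{2}(p-1),\ i = 1, \ldots, r,$$
where $g(\cdot)$ is positive, continuous, supported on $\mathcal{S} = \{X\,(p \times p) : X > 0\}$ such that
$$\int_{T>0} \det(T)^{a-\frac{1}{2}(p+1)}\, g(T)\, dT < \infty,$$
and $a = \sum_{i=1}^{r} a_i$
$L_p^{(2)}(g, b_1, \ldots, b_r)$ : matrix variate Liouville distribution of the second kind; its p.d.f. is proportional to
$$\prod_{i=1}^{r} \det(X_i)^{b_i-\frac{1}{2}(p+1)}\, g\left(\sum_{i=1}^{r} X_i\right), \quad 0 < X_i < I_p,\ \sum_{i=1}^{r} X_i < I_p,\ b_i > \tfrac{1}{2}(p-1),\ i = 1, \ldots, r,$$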
where $g(\cdot)$ is positive, continuous, supported on $\mathcal{S} = \{X\,(p \times p) : 0 < X < I_p\}$ such that
$$\int_{0<T<I_p} \det(T)^{b-\frac{1}{2}(p+1)}\, g(T)\, dT < \infty,$$
and $b = \sum_{i=1}^{r} b_i$.
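Several glossary entries are short enough to compute directly. The following sketch, which uses only the Python standard library, illustrates $\mathrm{vec}(X)$, $\mathrm{vecp}(X)$, the multivariate gamma and beta functions, and the $p = 1$ case of the Wishart density. The function names and the list-of-rows matrix representation are illustrative assumptions, not part of the glossary, and the logarithmic form of $\Gamma_p$ is chosen here only for numerical stability.

```python
import math


def vec(X):
    """vec(X): stack the columns of an m x n matrix (given as a list of rows)."""
    m, n = len(X), len(X[0])
    return [X[i][j] for j in range(n) for i in range(m)]


def vecp(X):
    """vecp(X): elements on and above the diagonal of a symmetric p x p
    matrix, taken columnwise: x11, x12, x22, x13, ..., xpp."""
    p = len(X)
    return [X[i][j] for j in range(p) for i in range(j + 1)]


def log_multigamma(a, p):
    """log Gamma_p(a) = (p(p-1)/4) log(pi) + sum_{j=1}^p log Gamma(a - (j-1)/2).
    Requires real a > (p-1)/2."""
    return (p * (p - 1) / 4) * math.log(math.pi) + sum(
        math.lgamma(a - (j - 1) / 2) for j in range(1, p + 1)
    )


def log_multibeta(a, b, p):
    """log beta_p(a, b) = log Gamma_p(a) + log Gamma_p(b) - log Gamma_p(a + b)."""
    return log_multigamma(a, p) + log_multigamma(b, p) - log_multigamma(a + b, p)


def chi2_pdf(x, n):
    """Chi-square density with n degrees of freedom, as in the glossary."""
    log_f = (-(n / 2) * math.log(2) - math.lgamma(n / 2)
             + (n / 2 - 1) * math.log(x) - x / 2)
    return math.exp(log_f)


def wishart_pdf_p1(s, n, sigma2):
    """Wishart density W_p(n, Sigma) specialized to p = 1, Sigma = sigma2."""
    log_norm = ((n / 2) * math.log(2) + log_multigamma(n / 2, 1)
                + (n / 2) * math.log(sigma2))
    return math.exp(-log_norm + 0.5 * (n - 2) * math.log(s) - s / (2 * sigma2))


if __name__ == "__main__":
    print(vec([[1, 2], [3, 4]]))
    print(vecp([[1, 2, 3], [2, 4, 5], [3, 5, 6]]))
    print(math.exp(log_multigamma(3.0, 2)))
    print(wishart_pdf_p1(2.5, 5, 1.7), chi2_pdf(2.5 / 1.7, 5) / 1.7)
```

For $p = 1$ the Wishart density reduces to that of $\sigma^2$ times a $\chi_n^2$ variable, which is what the last line of the demonstration compares.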
REFERENCES

Abdi, W. H. (1968) Whittaker's $M_{k,m}$-function of a matrix argument, Rend. Circ. Mat. Palermo, 17, 333-342.
Aitchison, J. (1986) The Statistical Analysis of Compositional Data, Chapman and Hall, New York.
Amey, A. K. A. and Gupta, A. K. (1992) Testing sphericity under a mixture model, Aust. J. Statist., 34, 451-460.
Anderson, T. W. (1946) The non-central Wishart distribution and certain problems of multivariate statistics, Ann. Math. Statist., 17, 409-431. Correction (1964), 35, 923-24.
Anderson, T. W. (1984) An Introduction to Multivariate Statistical Analysis, 2nd ed., John Wiley & Sons, New York.
Anderson, T. W. and Fang, K. T. (1987) Cochran's theorem for elliptically contoured distributions, Sankhya, A49(3), 305-315.
Anderson, T. W. and Girshick, M. A. (1944) Some extensions of the Wishart distribution, Ann. Math. Statist., 15, 345-357.
Asoo, Y. (1969) On the Γ-distribution of matric argument and its related distributions, Memoirs of Faculty of Literature and Sciences, Shimane University, Natural Science, 2, 1-13.
Barlow, R. E. and Mendel, M. B. (1992) De Finetti-type representation for life distributions, J. Amer. Statist. Assoc., 87, 1116-1122.
Bartlett, M. S. (1933) On the theory of statistical regression, Proc. Roy. Soc. Edinb., 53, 260-283.
Basu, D. and Khatri, C. G. (1969) Some characterizations of statistics, Sankhya, A31, 199-208.
Bekker, A. and Roux, J. J. J. (1990) Some characterizations of the matrix variate normal distribution, South African Statist. J., 24, 45-54.
Bellman, R. (1956) A generalization of some integral identities due to Ingham and Siegel, Duke Math. J., 23, 571-577.
Bellman, R. (1970) Introduction to Matrix Analysis, 2nd ed., McGraw-Hill, New York.
Bilodeau, M. and Srivastava, M. S. (1992) Estimation of the eigenvalues of $\Sigma_1\Sigma_2^{-1}$, J. Multivariate Anal., 41, 1-13.
Bingham, C. (1974) An antipodally symmetric distribution on sphere, Ann. Statist., 2(6), 1201-1225.
Bingham, C., Chang, T. and Richards, Donald St. P. (1992) Approximating the matrix Fisher and Bingham distributions: Applications to spherical regression and procrustes analysis, J. Multivariate Anal., 41, 314-337.
Bochner, S. (1952) Bessel functions and modular relations of higher type and hyperbolic differential equations, Communications du Seminaire Mathematique de l'Universite de Lund, Tome Supplementaire dedie a Marcel Riesz, 12-20.
Box, G. E. P. and Tiao, G. C. (1973) Bayesian Inference in Statistical Analysis, Addison-Wesley, Massachusetts.
Brillinger, D. R. (1969) Asymptotic properties of spectral estimate of second order, Biometrika, 56, 375-387.
Brillinger, D. R. (1975) Time Series: Data Analysis and Theory, Holt, Rinehart and Winston, New York.
Bronk, R. V. (1965) Exponential ensembles for random matrix, J. Math. Phys., 6, 228-237.
Brown, M. W. (1974) Generalized least square estimators in the analysis of variance, South African Statist. J., 8, 1-24.
Brown, M. W. and Neudecker, H. (1988) The covariance matrix of a general symmetric second degree matrix polynomial under normality assumption, Linear Algebra Appl., 103, 113-120.
Cambanis, S., Huang, S. and Simons, G. (1981) On the theory of elliptically contoured distributions, J. Multivariate Anal., 11, 368-385.
Carmeli, M. (1974) Statistical theory of energy levels and random matrices in physics, J. Statist. Phys., 10, 259-297.
Carmeli, M. (1983) Statistical Theory and Random Matrices in Physics, Marcel Dekker, New York.
Chang, Y. C. (1972) Bayesian Analysis of Multivariate Regressions Subjected to Constraints, Ph. D. thesis, University of Wisconsin, Madison.
Chikuse, Y. (1976) Partial differential equations for hypergeometric functions of complex matrices and their applications, Ann. Inst. Statist. Math., 28, 187-199.
Chikuse, Y. (1990a) Distributions of orientations on Stiefel manifolds, J. Multivariate Anal., 33(2), 247-264.
Chikuse, Y. (1990b) The matrix angular central Gaussian distribution, J. Multivariate Anal., 33(2), 265-274.
Chikuse, Y. (1991a) High dimensional limit theorems and matrix decompositions on the Stiefel manifold, J. Multivariate Anal., 36(2), 145-162.
Chikuse, Y. (1991b) Asymptotic expansions for distributions of the large sample matrix resultant and related statistics on the Stiefel manifold, J. Multivariate Anal., 39(2), 270-283.
Chikuse, Y. (1993a) High dimensional asymptotic expansions for the matrix Langevin distributions on the Stiefel manifold, J. Multivariate Anal., 44(1), 82-101.
Chikuse, Y. (1993b) Asymptotic theory for the concentrated matrix Langevin distributions on the Grassmann manifold, Statistical Sciences and Data Analysis (K. Matusita, M. L. Puri and T. Hayakawa, eds.), VSP, Netherlands, 237-245.
Chikuse, Y. and Davis, A. W. (1986) Some properties of invariant polynomials with matrix arguments and their applications in econometrics, Ann. Inst. Statist. Math., 38, 109-122.
Chmielewski, M. A. (1981) Elliptically symmetric distributions: A review and bibliography, Int. Statist. Rev., 49, 67-74.
Constantine, A. G. (1963) Some noncentral distribution problems in multivariate analysis, Ann. Math. Statist., 34, 1270-1285.
Constantine, A. G. (1966) The distribution of Hotelling's generalized $T_0^2$, Ann. Math. Statist., 37, 215-225.
Cornish, E. A. (1954) The multivariate t-distribution associated with a set of normal sample deviates, Aust. J. Phys., 7, 531-542.
Cornish, E. A. (1955) The sampling distributions of statistics derived from the multivariate t-distribution, Aust. J. Phys., 8, 193-199.
Cornish, E. A. (1962) The multivariate t-distribution associated with the general multivariate normal distribution, Division of Mathematical Statistics, Commonwealth Scientific and Industrial Research Organization, Australia, Technical Report No. 13.
Crowther, N. A. S. (1975) The exact non-central distribution of a quadratic form in normal vectors, South African Statist. J., 9, 27-36.
Das Gupta, S. (1968) Some aspects of discrimination function coefficients, Sankhya, A30, 387-400.
Das Gupta, S. (1971) Nonsingularity of the sample covariance matrix, Sankhya, A33, 475-478.
Das Gupta, S. (1972) Non-central matrix-variate beta distribution and Wilks' U distribution, Sankhya, A34, 357-362.
Davis, A. W. (1979) Invariant polynomials with two matrix arguments extending the zonal polynomials: Applications to multivariate distribution theory, Ann. Inst. Statist. Math., A31, 465-485.
Davis, A. W. (1980) Invariant polynomials with two matrix arguments extending the zonal polynomials, Multivariate Analysis V (P. R. Krishnaiah, ed.), North-Holland, 287-299.
Dawid, A. P. (1977) Spherical matrix distributions and a multivariate model, J. Roy. Statist. Soc., B39, 254-261.
Dawid, A. P. (1978) Extendability of spherical matrix distributions, J. Multivariate Anal., 8, 559-566.
Deemer, W. L. and Olkin, I. (1951) The Jacobians of certain matrix transformations useful in multivariate analysis, Based on lectures of P. L. Hsu at the University of North Carolina, 1947, Biometrika, 38, 345-367.
de Waal, D. J. (1968) An asymptotic distribution for the determinant of a non-central B statistic in multivariate analysis, South African Statist. J., 2, 77-84.
de Waal, D. J. (1969) The non-central multivariate beta type 2 distribution, South African Statist. J., 3, 101-108.
de Waal, D. J. (1970) Distributions connected with a multivariate beta statistic, Ann. Math. Statist., 41(3), 1091-1095. Correction (1971), 42(6), 2165-2166.
de Waal, D. J. (1972a) On the expected values of the elementary symmetric functions of a noncentral Wishart matrix, Ann. Math. Statist., 43, 344-347.
de Waal, D. J. (1972b) An asymptotic distribution of noncentral multivariate Dirichlet variates, South African Statist. J., 6, 31-40.
de Waal, D. J. (1978) The expected values of the elementary symmetric functions of some matrices, South African Statist. J., 12, 75-82.
de Waal, D. J. (1979) On the normalizing constant for the Bingham-von Mises-Fisher matrix distribution, South African Statist. J., 13, 103-112.
de Waal, D. J. (1983) Quadratic forms and manifold normal distributions, Contributions in Statistics, Essays in honour of Norman L. Johnson (P. K. Sen, ed.), North-Holland, 115-121.
de Waal, D. J. and Nel, D. G. (1973) On some expectations with respect to Wishart matrices, South African Statist. J., 7, 61-67.
Dickey, J. M. (1967) Matricvariate generalizations of the multivariate t distribution and the inverted multivariate t distribution, Ann. Math. Statist., 38(2), 511-518.
Dickey, J. M. (1976) A new representation of Student's t as a function of independent t's, with a generalization to matrix t, J. Multivariate Anal., 6, 343-346.
Dickey, J. M., Dawid, A. P. and Kadane, J. B. (1986) Subjective-probability assessment methods for multivariate-t and matrix-t models, Bayesian Inference and Decision Techniques (P. Goel and A. Zellner, eds.), Elsevier Science Publishers B. V., Chapter 12, 177-195.
Downs, T. D. (1972) Orientation statistics, Biometrika, 59, 665-676.
Dunnett, C. W. and Sobel, M. (1954) A bivariate generalization of Student's t-distribution with tables for certain special cases, Biometrika, 41, 153-169.
Dykstra, R. L. (1970) Establishing the positive definiteness of the sample covariance matrix, Ann. Math. Statist., 41, 2153-2154.
Dyson, F. J. (1962a) Statistical theory of the energy levels of complex systems I, J. Math. Phys., 3, 140-156.
Dyson, F. J. (1962b) Statistical theory of the energy levels of complex systems II, J. Math. Phys., 3, 157-165.
Dyson, F. J. (1962c) Statistical theory of the energy levels of complex systems III, J. Math. Phys., 3, 166-175.
Dyson, F. J. and Mehta, M. L. (1963a) Statistical theory of the energy levels of complex systems IV, J. Math. Phys., 4, 701-712.
Dyson, F. J. and Mehta, M. L. (1963b) Statistical theory of the energy levels of complex systems V, J. Math. Phys., 4, 713-719.
Eaton, M. L. (1972) Multivariate Statistical Analysis, Institute of Mathematical Statistics, University of Copenhagen, Denmark.
Eaton, M. L. and Olkin, I. (1987) Best equivariant estimators of a Cholesky decomposition, Ann. Statist., 15, 1639-1650.
Eaton, M. L. and Perlman, M. D. (1973) The nonsingularity of generalized sample covariance matrices, Ann. Statist., 1, 710-717.
Eben, K. (1994) A generalization of Wishart density for the case when the inverse of the covariance matrix is a band matrix, Math. Bohem., 119(4), 337-346.
Elfving, G. (1947) A simple method of deducing certain distributions connected with multivariate sampling, Skandinavisk Aktuarietidskrift, 30, 56-74.
Fang, C., Krishnaiah, P. R. and Nagarsenkar, B. N. (1982) Asymptotic distribution of the likelihood ratio test statistics for covariance structures of the complex multivariate normal distributions, J. Multivariate Anal., 12, 597-611.
Fang, K. T. and Anderson, T. W. (Eds.) (1990) Statistical Inference in Elliptically Contoured and Related Distributions, Allerton Press, New York.
Fang, K. T. and Chen, H. F. (1984) Relationships among classes of spherical matrix distributions, Acta Mathematica Sinica (English Series), 1(2), 138-148.
Fang, K. T. and Chen, H. F. (1986) On the spectral decompositions of spherical matrix distributions and some of their subclasses, J. Math. Res. Exposition, 1, 47-156.
Fang, K. T., Kotz, S. and Ng, K. W. (1990) Symmetric Multivariate and Related Distributions, Chapman and Hall, New York.
Fang, K. T. and Zhang, Y. T. (1990) Generalized Multivariate Analysis, Science Press, Beijing and Springer-Verlag, Berlin.
Farrell, R. H. (1985) Multivariate Calculation: Use of the Continuous Groups, Springer-Verlag, New York.
Fisher, R. A. (1915) Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population, Biometrika, 10, 507-521.
Geisser, S. (1965) Bayesian estimation in multivariate analysis, Ann. Math. Statist., 36, 150-159.
Ghurye, S. G. and Olkin, I. (1969) Unbiased estimation of some multivariate probability densities and related functions, Ann. Math. Statist., 40(4), 1261-1271.
Gindikin, S. G. (1964) Analysis in homogeneous domains, Russian Math. Surveys, 19, 1-90.
Giri, N. C. (1977) Multivariate Statistical Inference, Academic Press, New York.
Girko, V. L. and Gupta, A. K. (1996) Multivariate elliptically contoured linear models and some aspects of the theory of random matrices, Multidimensional Statistical Analysis and Theory of Random Matrices (A. K. Gupta and V. L. Girko, eds.), VSP, Netherlands, 327-386.
Goldberger, A. S. (1970) Criteria and constraints in multivariate regression, EME 7026, Social System Research Institute, University of Wisconsin, Madison, paper presented at the Second World Congress of the Econometric Society, Cambridge, England, September 1970.
Goodman, N. R. (1963a) Statistical analysis based on a certain multivariate complex Gaussian distribution (an introduction), Ann. Math. Statist., 34, 152-177.
Goodman, N. R. (1963b) The distribution of the determinant of complex Wishart distributed matrix, Ann. Math. Statist., 34, 178-180.
Goodman, N. R. and Dubman, M. R. (1969) Theory of time-varying spectral analysis and complex Wishart process, Multivariate Analysis-II (P. R. Krishnaiah, ed.), Academic Press, New York, 351-365.
Goodman, T. R. and Kotz, S. (1973) Multivariate θ-generalized normal distributions, J. Multivariate Anal., 3, 204-219.
Graham, Alexander (1981) Kronecker Products and Matrix Calculus with Applications, Ellis Horwood, Chichester.
Graybill, F. A. (1983) Matrices with Applications in Statistics, Wadsworth, Belmont, California.
Graybill, F. A. and Marsaglia, G. (1957) Idempotent matrices and quadratic forms in general linear hypothesis, Ann. Math. Statist., 28, 678-686.
Groves, T. and Rothenberg, T. (1969) A note on the expected value of an inverse matrix, Biometrika, 56, 690-691.
Gupta, A. K. (1971a) Distribution of Wilks' likelihood ratio criterion in the complex case, Ann. Inst. Statist. Math., 23, 77-87.
Gupta, A. K. (1971b) Noncentral distribution of Wilks' statistic in MANOVA, Ann. Math. Statist., 42, 1254-1261.
Gupta, A. K. (1971c) On a stochastic inequality for the Wilks' statistic, Ann. Inst. Statist. Math., 27, 341-348.
Gupta, A. K. (1973) On a test for reality of the covariance matrix of a complex Gaussian distribution, J. Statist. Comp. Simul., 2, 333-342.
Gupta, A. K. (1976) Nonnull distribution of Wilks' statistic for MANOVA in the complex case, Commun. Statist.-Simul. Comp., 5, 177-188.
Gupta, A. K. (1977) On the distribution of sphericity test criterion in the multivariate Gaussian distribution, Aust. J. Statist., 19, 202-205.
Gupta, A. K. (1998) Multivariate elliptically contoured and θ-generalized normal models, Random Oper. Stochastic Equations, 6, 281-290.
Gupta, A. K. and Chattopadhyay, A. K. (1979) Gammaization and Wishartness of dependent quadratic forms, Commun. Statist.-Theory Meth., A8(9), 945-951.
Gupta, A. K., Chattopadhyay, A. K. and Krishnaiah, P. R. (1975) Asymptotic distributions of the determinants of some random matrices, Commun. Statist., 4, 33-47.
Gupta, A. K. and Conradie, W. (1987) Quadratic forms in complex normal variates: Basic results, Statistica, 47, 37-84.
Gupta, A. K. and Girko, V. L. (Eds.) (1996) Multidimensional Statistical Analysis and Theory of Random Matrices, Proceedings of the Sixth Lukacs Symposium, VSP, Netherlands.
Gupta, A. K. and Javier, W. R. (1986) Nonnull distribution of the determinant of B-statistic in multivariate analysis, South African Statist. J., 20, 87-102.
Gupta, A. K. and Kabe, D. G. (1998) Characterization of gamma and the complex Wishart densities, Applied Statistical Science III (E. Ahmed, M. Ahsanullah and B. K. Sinha, eds.), Nova Science, 393-400.
Gupta, A. K. and Nagar, D. K. (1985) Nonnull distribution of LR-statistic for testing $\mu = \mu_0$, $\Sigma = \sigma^2 I$ in complex multivariate normal model, Statistica, 45(4), 457-464.
Gupta, A. K. and Nagar, D. K. (1987) Distribution of the product of determinants of random matrices connected with non-central matric variate Dirichlet distribution, South African Statist. J., 21, 141-153.
Gupta, A. K. and Nagar, D. K. (1988) Nonnull distribution of likelihood ratio criterion for testing multisample sphericity in the complex case, Aust. J. Statist., 30(3), 307-318.
Gupta, A. K. and Nagar, D. K. (1989) Asymptotic nonnull distribution of likelihood ratio statistic for testing homogeneity of complex multivariate Gaussian populations, J. Statist. Comp. Simul., 31, 83-91.
Gupta, A. K. and Nagar, D. K. (1992) Distribution of LR-statistic for testing $H : \mu = \mu_0$; $\Sigma = \sigma^2 I$ in multivariate complex Gaussian distribution, Statistica, 52(2), 255-267.
Gupta, A. K. and Nagar, D. K. (1994) A note on the distribution of $(a'S^{-1}a)(a'S^{-2}a)^{-1}$, Random Oper. Stochastic Equations, 2(4), 331-334.
Gupta, A. K. and Nagar, D. K. (1998) Quadratic forms in disguised matrix t-variate, Statistics, 30, 357-374.
Gupta, A. K. and Offori-Nyarko, S. (1995) On disguised inverted Wishart distribution, Proc. Amer. Math. Soc., 123, 2557-2562.
Gupta, A. K. and Rathie, P. N. (1983a) Nonnull distribution of Wilks' Λ in the complex linear case, Statistica, 43(3), 445-450.
Gupta, A. K. and Rathie, P. N. (1983b) On the noncentral distribution of the determinant of a complex Wishart matrix, Metron, 41, 109-116.
Gupta, A. K. and Song, D. (1990) Asymptotic expansion of the matrix variate Dirichlet distribution, Department of Mathematics and Statistics, Bowling Green State University, Technical Report No. 90-04.
Gupta, A. K. and Song, D. (1996) Generalized Liouville distribution, Comput. Math. Appl., 32(2), 103-109.
Gupta, A. K. and Tang, J. (1984) Distribution of likelihood ratio statistic for testing equality of covariance matrices of multivariate Gaussian models, Biometrika, 71, 555-559.
Gupta, A. K. and Tang, J. (1986a) Some properties of LR-tests for generalized variances of two multivariate normal populations, Publications de l'Institut de Statistique de l'Universite de Paris, 31, 59-69.
Gupta, A. K. and Tang, J. (1986b) Exact distribution of certain general test statistic in multivariate analysis, Aust. J. Statist., 28, 104-114.
Gupta, A. K. and Tang, J. (1987) On testing equality of generalized variances of k multivariate normal populations, Publications de l'Institut de Statistique de l'Universite de Paris, 32, 29-42.
Gupta, A. K. and Tang, J. (1988) A general distribution theory for a class of likelihood ratio criteria, Aust. J. Statist., 30, 359-366.
Gupta, A. K. and Varga, T. (1991) Rank of a quadratic form in an elliptically contoured matrix random variable, Statist. Probab. Lett., 11, 131-134.
Gupta, A. K. and Varga, T. (1992) Characterization of matrix variate normal distribution, J. Multivariate Anal., 41, 80-88.
Gupta, A. K. and Varga, T. (1993) Elliptically Contoured Models In Statistics, Kluwer Academic Publishers, Dordrecht.
Gupta, A. K. and Varga, T. (1994a) Some applications of the stochastic representation of matrix variate elliptically contoured distributions, Random Oper. Stochastic Equations, 2(1), 1-11.
Gupta, A. K. and Varga, T. (1994b) Characterization of matrix variate normality through conditional distributions, Math. Methods Statist., 3(2), 163-170.
Gupta, A. K. and Varga, T. (1994c) A new class of matrix variate elliptically contoured distributions, J. Italian Statist. Soc., 3, 255-270.
Gupta, A. K. and Varga, T. (1994d) Moments and other expected values for matrix variate elliptically contoured distributions, Statistica, 54, 361-373.
Gupta, A. K. and Varga, T. (1995a) Matrix variate θ-generalized normal distribution, Trans. Amer. Math. Soc., 347(4), 1429-1437.
Gupta, A. K. and Varga, T. (1995b) Some inference problems for matrix variate elliptically contoured distributions, Statistics, 26, 219-229.
Gupta, A. K. and Varga, T. (1997) Characterization of matrix variate elliptically contoured distributions, Advances in the Theory and Practice of Statistics: A Volume in Honor of Samuel Kotz (N. L. Johnson and N. Balakrishnan, eds.), John Wiley & Sons, New York, 455-467.
Gupta, R. D. and Richards, Donald St. P. (1987) Multivariate Liouville distributions, J. Multivariate Anal., 23, 233-256.
Gupta, R. D. and Richards, Donald St. P. (1990) The Dirichlet distributions and polynomial regression, J. Multivariate Anal., 32, 95-102.
Gupta, R. D. and Richards, Donald St. P. (1991) Multivariate Liouville distributions, II, Probab. Math. Statist., 12(2), 291-301.
Gupta, R. D. and Richards, Donald St. P. (1992) Multivariate Liouville distributions, III, J. Multivariate Anal., 43, 29-57.
Haff, L. R. (1979) An identity for the Wishart distribution with applications, J. Multivariate Anal., 9, 531-544.
Hannan, E. J. (1970) Multiple Time Series, John Wiley & Sons, New York.
Haq, M. S. and Rinco, S. (1976) β-expectation tolerance regions for a generalized multivariate model with normal error variables, J. Multivariate Anal., 6, 414-421.
Hart, M. L. and Money, A. H. (1976) On Wilks' multivariate generalization of the correlation ratio, Biometrika, 63(1), 59-67.
Hayakawa, T. (1966) On the distribution of a quadratic form in a multivariate normal sample, Ann. Inst. Statist. Math., 18, 191-200.
Hayakawa, T. (1969) On the distributions of the latent roots of a positive definite random symmetric matrix I, Ann. Inst. Statist. Math., 21, 1-21.
Hayakawa, T. (1972) On the distribution of the multivariate quadratic form in multivariate normal sample, Ann. Inst. Statist. Math., 25, 205-230.
Hayakawa, T. (1985) On the distribution of a quadratic form of a matrix t-variate, Statistical Theory and Data Analysis (K. Matusita, ed.), Elsevier Science Publishers B. V. (North-Holland), 249-256.
Hayakawa, T. (1986) On testing hypotheses of covariance matrices under an elliptical population, J. Statist. Plan. Inf., 13, 193-202.
Hayakawa, T. (1987) Normalizing and variance stabilizing transformation of multivariate statistics under an elliptical population, Ann. Inst. Statist. Math., 39, 299-306.
Hayakawa, T. (1989) On the distributions of the functions of the F-matrix under an elliptical population, J. Statist. Plan. Inf., 21, 41-52.
Hayakawa, T. and Kikuchi, Y. (1979) The moments of a function of traces of a matrix with a multivariate symmetric normal distribution, South African Statist. J., 13, 71-82.
Herz, C. S. (1955) Bessel functions of matrix argument, Ann. Math., 61, 474-523.
Hoffman, D. K., Raffenetti, R. C. and Ruedenberg, K. (1972) Generalization of Euler angles to N-dimensional orthogonal matrices, J. Math. Phys., 13(4), 528-532.
Hogg, R. V. (1963) On the independence of certain Wishart variables, Ann. Math. Statist., 34, 935-939.
Hogg, R. V. and Craig, A. T. (1994) Introduction to Mathematical Statistics, 5th ed., MacMillan, New York.
Hsu, P. L. (1939a) On the distribution of the roots of certain determinantal equations, Annals of Eugenics, 9, 250-258.
Hsu, P. L. (1939b) A new proof of the joint product moment distribution, Proc. Camb. Phil. Soc., 35, 336-338.
Hsu, P. L. (1940) An algebraic derivation of the distribution of rectangular coordinates, Proc. Edinb. Math. Soc., 6, 185-189.
Hua, L. K. (1959) Harmonic Analysis of Functions of Several Complex Variables in Classical Domains, Moscow (in Russian), American Mathematical Society (English Translation).
Hudak, D. and Richter, G. (1996) Moments of special normally distributed matrices, Statistics, 27, 363-378.
Ingham, A. E. (1933) An integral which occurs in statistics, Proc. Camb. Phil. Soc., 29, 271-276.
Jambunathan, M. V. (1965) A quick method of deriving Wishart's distribution, Current Science, 34, 78.
James, A. T. (1954) Normal multivariate analysis and the orthogonal group, Ann. Math. Statist., 25, 40-75.
James, A. T. (1955) The noncentral Wishart distribution, Proc. Roy. Soc. Lond., A229, 364-366.
James, A. T. (1960) The distribution of the latent roots of the covariance matrix, Ann. Math. Statist., 31, 151-158.
James, A. T. (1961a) The distribution of noncentral means with known covariance matrix, Ann. Math. Statist., 32, 874-882.
James, A. T. (1961b) Zonal polynomials of the real positive definite symmetric matrices, Ann. Math., 74, 456-469.
James, A. T. (1964) Distributions of matrix variate and latent roots derived from normal samples, Ann. Math. Statist., 35, 475-501.
Javier, W. R. (1982) On the Distributions of Certain Random Matric Variates and Their Functions, Ph. D. thesis, Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, Ohio.
Javier, W. R. and Gupta, A. K. (1985a) On generalized matric variate beta distributions, Mathematische Operationsforschung und Statistik, 16(4), 549-558.
Javier, W. R. and Gupta, A. K. (1985b) On matric variate-t distribution, Commun. Statist.-Theory Meth., 14(6), 1413-1425.
Jensen, D. R. (1970) The joint distribution of traces of Wishart matrices and some applications, Ann. Math. Statist., 41(1), 133-145.
Jensen, D. R. and Good, I. J. (1981) Invariant distributions associated with matrix laws under structural symmetry, J. Roy. Statist. Soc., B43, 327-332.
Joarder, A. H. and Ali, M. M. (1992) On some generalized Wishart expectations, Commun. Statist.-Theory Meth., 21(1), 283-294.
Johnson, N. L. and Kotz, S. (1970) Continuous Univariate Distributions-2, Houghton Mifflin, New York.
Johnson, N. L. and Kotz, S. (1972) Continuous Multivariate Distributions, John Wiley and Sons, New York.
Jupp, P. E. and Mardia, K. V. (1979) Maximum likelihood estimators for the matrix von Mises-Fisher and Bingham distributions, Ann. Statist., 7(3), 599-606.
Juritz, J. M. (1973) Aspects of Noncentral Multivariate t-distributions, Ph. D. thesis, University of Cape Town, RSA.
Juritz, J. M. and Troskie, C. G. (1976) Noncentral matrix T distributions, South African Statist. J., 10, 1-7.
Kabe, D. G. (1965) Generalization of Sverdrup's Lemma and its applications to multivariate distribution theory, Ann. Math. Statist., 36, 671-676.
Kabe, D. G. (1979) On Subrahmaniam's conjecture for an integral involving zonal polynomials, Utilitas Math., 15, 245-248.
Kabe, D. G. (1984) Classical statistical analysis based on a certain hypercomplex multivariate normal distribution, Metrika, 31, 63-76.
Kang, C. and Kim, B. C. (1996) The Nth moment of matrix quadratic form, Statist. Probab. Lett., 28, 291-297.
Kaufman, G. M. (1967) Some Bayesian moment formulae, Centre for Operations Research and Econometrics, Catholic University of Louvain, Heverlee, Belgium, Report No. 6716.
Kelker, D. (1970) Distribution theory of spherical distributions and a location-scale parameter generalization, Sankhya, A32, 419-430.
Khatri, C. G. (1959a) On the mutual independence of certain statistics, Ann. Math. Statist., 30, 1258-1262.
Khatri, C. G. (1959b) Conditions for the forms of the type XAX' to be distributed independently or to obey Wishart distribution, Calcutta Statist. Assoc. Bull., 8, 162-168.
Khatri, C. G. (1962) Conditions for Wishartness and independence of second degree polynomials in a normal vector, Ann. Math. Statist., 33, 1002-1007.
Khatri, C. G. (1963) Further contributions to Wishartness and independence of second degree polynomials in a normal vectors, J. Indian Statist. Assoc., 1, 61-70.
Khatri, C. G. (1965) Classical statistical analysis based on a certain multivariate complex Gaussian distribution, Ann. Math. Statist., 36, 98-114.
Khatri, C. G. (1966) On certain distribution problems based on positive definite quadratic functions in normal vectors, Ann. Math. Statist., 37, 468-479.
Khatri, C. G. (1969) Non-central distributions of i-th largest characteristic roots of three matrices concerning complex multivariate normal populations, Ann. Inst. Statist. Math., 21, 23-32.
Khatri, C. G. (1970a) A note on Mitra's paper "A density-free approach to the matrix variate beta distribution," Sankhya, A32, 311-318.
Khatri, C. G. (1970b) On the moments of traces of two matrices in three situations for complex multivariate normal populations, Sankhya, A32, 65-80.
Khatri, C. G. (1971) Series representation of distributions of quadratic form in the normal vectors and generalized variance, J. Multivariate Anal., 1(2), 199-214.
Khatri, C. G. (1975) Distribution of a quadratic form in normal vectors (multivariate non-central case), A Modern Course on Statistical Distributions in Scientific Work, Volume 1, Models and Structures (G. P. Patil, S. Kotz, and J. K. Ord, eds.), D. Reidel, Dordrecht-Holland, 345-354.
Khatri, C. G. (1977) Distribution of a quadratic form in noncentral normal vectors using generalized Laguerre polynomials, South African Statist. J., 11, 167-179.
Khatri, C. G. (1980) Quadratic forms in normal variables, Handbook of Statistics, Volume 1 (P. R. Krishnaiah, ed.), North-Holland, 443-469.
Khatri, C. G. (1989) Multivariate generalization of t-statistic based on the mean square successive differences, Commun. Statist.-Theory Meth., 18(5), 1983-1992.
Khatri, C. G. and Bhavsar, C. D. (1990) Some asymptotic inferential problems connected with complex elliptical distribution, J. Multivariate Anal., 35, 66-85.
Khatri, C. G. and Mardia, K. V. (1977) The von Mises-Fisher matrix distribution in orientation statistics, J. Roy. Statist. Soc., 39(1), 95-106.
Khatri, C. G. and Rao, C. R. (1987) Effects of estimated noise covariance matrix in optimal signal detection, IEEE Trans. Acoustics, Speech, and Signal Processing, 35, 671-679.
Khatri, C. G., Khattree, R. and Gupta, R. D. (1991) On a class of orthogonal invariant and residual independent matrix distributions, Sankhya, B53(1), 1-10.
Khattree, R. and Gupta, R. D. (1989) Estimation of matrix valued realized signal to noise ratio, J. Multivariate Anal., 30(2), 312-327.
Khattree, R. and Gupta, R. D. (1992) Some probability distributions connected with beta and gamma matrices, Commun. Statist.-Theory Meth., 21(2), 369-390.
Kollo, T. and von Rosen, D. (1995) Approximating by the Wishart distribution, Ann. Inst. Statist. Math., 47, 767-783.
Konishi, S., Niki, N. and Gupta, A. K. (1988) Asymptotic expansions for the distribution of quadratic forms in normal variables, Ann. Inst. Statist. Math., 40(2), 279-296.
Konno, Y. (1988) Exact moments of the multivariate F and beta distributions, J. Japan Statist. Soc., 18(2), 123-130.
Kotz, S., Johnson, N. L. and Boyd, D. W. (1967a) Series representations of distributions of quadratic forms in normal variables I: central case, Ann. Math. Statist., 38, 823-837.
Kotz, S., Johnson, N. L. and Boyd, D. W. (1967b) Series representations of distributions of quadratic forms in normal variables II: non-central case, Ann. Math. Statist., 38, 838-848.
Krishnaiah, P. R. (1976) Some recent developments on complex multivariate distributions, J. Multivariate Anal., 6, 1-30.
Krishnaiah, P. R. and Lin, J. (1986) Complex elliptically symmetric distributions, Commun. Statist.-Theory Meth., 15(12), 3693-3718.
Krishnamoorthy, A. S. and Parthasarathy, M. (1951) A multivariate gamma-type distribution, Ann. Math. Statist., 22, 549-557. Correction (1960), 31, 229.
Krishnamoorthy, K. and Gupta, A. K. (1989) Improved minimax estimation of a normal covariance matrix, Canadian J. Statist., 17, 91-102.
Kshirsagar, A. M. (1959) Bartlett decomposition and Wishart distribution, Ann. Math. Statist., 30, 239-241.
Kshirsagar, A. M. (1961a) Some extensions of the multivariate t distribution and the multivariate generalization of the distribution of the regression coefficients, Proc. Camb. Phil. Soc., 57, 80-85.
Kshirsagar, A. M. (1961b) The non-central multivariate beta distribution, Ann. Math. Statist., 32, 104-111.
Kshirsagar, A. M. (1972) Multivariate Analysis, Marcel Dekker, New York.
Laha, R. G. (1956) On the stochastic independence of two second-degree polynomial statistics in normally distributed variates, Ann. Math. Statist., 27, 790-796.
le Roux, N. J. (1978) The Algebra of Random Matrices, Ph. D. thesis, Faculty of Science, University of South Africa, RSA.
Leung, P. L. (1994) An identity for the noncentral Wishart distribution with application, J. Multivariate Anal., 48, 107-114.
Li, R. Z. and Fang, K. T. (1995) Estimation of scale matrix of elliptically contoured matrix distribution, Statist. Probab. Lett., 24, 289-297.
Lin, P. E. (1972) Some characterizations of the multivariate t distribution, J. Multivariate Anal., 2, 339-344.
Loynes, R. M. (1966) On idempotent matrices, Ann. Math. Statist., 37, 295-296.
Luke, Y. L. (1969) The Special Functions and Their Approximations, Volume I, Academic Press, New York.
Madow, W. A. (1938) Contribution to the theory of multivariate statistical analysis, Trans. Amer. Math. Soc., 44, 454-495.
Magnus, J. R. and Neudecker, H. (1979) The commutation matrix: some properties and applications, Ann. Statist., 7, 381-394.
Magnus, J. R. and Neudecker, H. (1988) Matrix Differential Calculus with Applications in Statistics and Econometrics, John Wiley & Sons, Chichester.
Mahalanobis, P. C., Bose, R. C. and Roy, S. N. (1937) Normalization of statistical variates and the use of rectangular coordinates in the theory of sampling distributions, Sankhya, 3, 1-40.
Mardia, K. V. and Khatri, C. G. (1977) Uniform distribution on a Stiefel manifold, J. Multivariate Anal., 7, 468-473.
Marshall, A. and Olkin, I. (1979) Inequalities: Theory of Majorization and Its Applications, Academic Press, New York.
Marx, D. G. (1981) Aspects of the Matric t-Distribution, Ph. D. thesis, University of The Orange Free State, RSA.
Marx, D. G. (1983) Quadratic forms of a matric-t variate, Ann. Inst. Statist. Math., 35, 347-353. Correction (1985), 37, 567.
Marx, D. G. and Nel, D. G. (1982) A note on the distribution of linear combinations of matric-, vector- and scalar-t variates, Department of Mathematical Statistics, University of The Orange Free State, Technical Report No. 85.
Mathai, A. M. (1981) Distribution of the canonical correlation matrix, Ann. Inst. Statist. Math., A33, 35-43.
Mathai, A. M. and Provost, S. B. (1992) Quadratic Forms In Random Variables, Marcel Dekker, New York.
Mathai, A. M. and Tan, W. Y. (1977) The non-null distribution of the likelihood ratio criterion for testing the hypothesis that the covariance matrix is diagonal, Canadian J. Statist., 5(1), 63-74.
Mauchly, J. W. (1940) Significance test for sphericity of a normal n-variate distribution, Ann. Math. Statist., 11, 204-209.
Mauldon, J. G. (1955) Pivotal quantities for Wishart's and related distributions, and a paradox in fiducial theory, J. Roy. Statist. Soc., B17, 79-85.
Mehta, M. L. (1991) Random Matrices, 2nd ed., Academic Press, New York.
Mirham, G. A. and Hultquist, R. A. (1967) A bivariate warning-time/failure-time distribution, J. Amer. Statist. Assoc., 62, 589-599.
Mitra, S. K. (1969) Some characteristic and noncharacteristic properties of the Wishart distribution, Sankhya, A31(1), 19-22.
Mitra, S. K. (1970) A density-free approach to the matrix variate beta distribution, Sankhya, A32, 81-88.
Muirhead, R. J. (1970) Asymptotic distributions of some multivariate tests, Ann. Math. Statist., 41, 1002-1010.
Muirhead, R. J. (1982) Aspects of Multivariate Statistical Theory, John Wiley & Sons, New York.
Muirhead, R. J. (1986) A note on some Wishart expectations, Metrika, 33, 247-251.
Nagar, D. K. and Gupta, A. K. (1993) Asymptotic non-null distribution of likelihood ratio statistic for testing $\mu = \mu_0;\ \Sigma = \sigma^2 I_p$ in complex multivariate Gaussian model, Statistica, 53(4), 603-617.
Nagar, D. K., Jain, S. K. and Gupta, A. K. (1985) Distribution of LRC for testing sphericity of a complex multivariate Gaussian model, Int. J. Math. & Math. Sci., 8(3), 555-562.
Nagarsenker, B. N. (1979) Noncentral distribution of Wilk's statistic for test of three hypotheses, Sankhya, A41(1 & 2), 67-81.
Nagarsenkar, B. N. and Das, M. M. (1975) Exact distribution of sphericity criterion in the complex case and its percentage points, Commun. Statist., 4(4), 362-374.
Nel, D. G. (1978) On the symmetric multivariate normal distribution and the asymptotic expansion of a Wishart matrix, South African Statist. J., 12, 145-159.
Nel, D. G. and Groenewald, P. C. N. (1979) On a Fisher-Cornish type expansion of Wishart matrices, Department of Mathematical Statistics, University of the Orange Free State, Technical Report No. 47.
Nel, H. M. (1977) On distributions and moments associated with matrix normal distributions, Department of Mathematical Statistics, University of the Orange Free State, Technical Report No. 24.
Neudecker, H. (1985) On the dispersion matrix of a matrix quadratic form connected with the noncentral Wishart distribution, Linear Algebra Appl., 70, 257-267.
Neudecker, H. and Wansbeek, T. (1983) Some results on commutation matrices with statistical applications, Canadian J. Statist., 11, 221-231.
Neudecker, H. and Wansbeek, T. (1987) Fourth order properties of normally distributed random matrices, Linear Algebra Appl., 97, 13-21.
Ogawa, J. (1953) On the sampling distributions of classical statistics in multivariate analysis, Osaka Math. J., 5, 13-52.
Olkin, I. (1953) Note on the Jacobians of certain matrix transformations useful in multivariate analysis, Biometrika, 40, 43-46.
Olkin, I. (1959) A class of integral identities with matrix argument, Duke Math. J., 26, 207-214.
Olkin, I. (1979) Matrix extensions of Liouville-Dirichlet-type integrals, Linear Algebra Appl., 28, 155-160.
Olkin, I. and Roy, S. N. (1954) On multivariate distribution theory, Ann. Math. Statist., 25, 329-33
Olkin, I. and Rubin, H. (1962) A characterization of the Wishart distribution, Ann. Math. Statist., 33(4), 1272-1280.
Olkin, I. and Rubin, H. (1964) Multivariate beta distributions and independence properties of Wishart distribution, Ann. Math. Statist., 35, 261-269. Correction (1966), 37(1), 297.
Parkhurst, A. M. and James, A. T. (1974) Zonal polynomials of order 1 through 12, Selected Tables in Mathematical Statistics (H. L. Harter and D. B. Owen, eds.), American Mathematical Society, Providence, R. I., 199-388.
Patil, G. P., Boswell, M. T., Ratnaparkhi, M. V. and Roux, J. J. J. (1984) Dictionary and Classified Bibliography of Statistical Distributions in Scientific Work, Volume 3, Multivariate Models, International Co-operative Publishing House, Fairland, Maryland.
Perlman, M. D. (1977) A note on the matrix-variate F distribution, Sankhya, A39(3), 290-298.
Phillips, P. C. B. (1985) The distribution of matrix quotients, J. Multivariate Anal., 16, 157-161.
Pillai, K. C. S. and Gupta, A. K. (1967) On the distribution of the second elementary symmetric function of the roots of a matrix, Ann. Inst. Statist. Math., 19, 167-179.
Pillai, K. C. S. and Gupta, A. K. (1968) On the noncentral distribution of second elementary symmetric function of the roots of a matrix, Ann. Math. Statist., 39, 833-839.
Pillai, K. C. S. and Gupta, A. K. (1969) On the exact distribution of Wilks' criterion, Biometrika, 56, 109-118.
Pillai, K. C. S. and Jouris, G. M. (1971) Some distribution problems in the multivariate complex Gaussian case, Ann. Math. Statist., 42, 517-525.
Porter, C. E. (1965) Statistical Theory of Spectra: Fluctuations, Academic Press, New York.
Prentice, M. J. (1982) Antipodally symmetric distributions for orientation statistics, J. Statist. Plan. Inf., 6, 205-214.
Press, S. J. (1972) Applied Multivariate Analysis, Holt, Rinehart and Winston, Inc., New York.
Priestly, M. B., Subba Rao, T. and Tong, H. (1973) Identification of the structure of multivariable stochastic systems, Multivariate Analysis-III (P. R. Krishnaiah, ed.), Academic Press, New York, 351-368.
Rainville, E. D. (1970) Special Functions, Macmillan, New York.
Rao, C. R. (1952) Advanced Statistical Methods in Biometric Research, John Wiley & Sons, New York.
Rao, C. R. (1973) Linear Statistical Inference and Its Applications, 2nd ed., John Wiley & Sons, New York.
Rasch, G. (1948) A functional equation for Wishart's distribution, Ann. Math. Statist., 19, 262-266.
Rautenbach, H. M. and Roux, J. J. J. (1985) Statistical analysis based on quaternion normal random variables, Department of Statistics, University of South Africa, RSA, Research Report No. 85-09.
Richards, Donald St. P. (1984) Hyperspherical models, fractional derivatives and exponential distributions on matrix spaces, Sankhya, A46(2), 155-165.
Rinco, S. (1973) β-Expectation Tolerance Regions Based on the Structural Models, Ph. D. thesis, The University of Western Ontario, London, Canada.
Rogers, G. S. (1980) Matrix Derivatives, Marcel Dekker, New York.
Rooney, P. G. (1972) On the ranges of certain fractional integrals, Canadian J. Math., 24, 1198-1216.
Roux, J. J. J. (1971) On generalized multivariate distributions, South African Statist. J., 5, 91-100.
Roux, J. J. J. (1975) New families of multivariate distributions, A Modern Course on Statistical Distributions in Scientific Work, Volume 1, Models and Structures (G. P. Patil, S. Kotz, and J. K. Ord, eds.), D. Reidel, Dordrecht-Holland, 281-297.
Roux, J. J. J. and Becker, P. J. (1984) On prior inverted Wishart distribution, Department of Statistics and Operations Research, University of South Africa, Pretoria, Research Report No. 2.
Roux, J. J. J. and Raath, E. L. (1973) Generalized Laguerre series forms of Wishart distributions, South African Statist. J., 7, 23-34.
Roy, J. (1966) Power of the likelihood-ratio test used in analysis of dispersion, Multivariate Analysis (P. R. Krishnaiah, ed.), Academic Press, New York, 105-127.
Roy, S. N. (1957) Some Aspects of Multivariate Analysis, John Wiley & Sons, New York.
Saw, J. G. (1973) Expectation of elementary symmetric functions of a Wishart matrix, Ann. Statist., 1(3), 580-582.
Searle, S. R. (1971) Linear Models, John Wiley & Sons, New York.
Sen Gupta, A. (1987) Tests for standardized generalized variances of multivariate normal populations of possibly different dimensions, J. Multivariate Anal., 23(2), 51-59.
Shah, B. K. (1970) Distribution theory of a positive definite quadratic form with matrix argument, Ann. Math. Statist., 41(2), 692-697.
Shah, B. K. and Khatri, C. G. (1974) Proof of conjectures about the expected values of the elementary symmetric functions of a noncentral Wishart matrix, Ann. Statist., 2(4), 833-836.
Shaman, P. (1980) The inverted complex Wishart distribution and its application to spectral estimates, J. Multivariate Anal., 10, 51-59.
Siotani, M., Hayakawa, T. and Fujikoshi, Y. (1985) Modern Multivariate Statistical Analysis: A Graduate Course and Handbook, American Sciences Press, Columbus, Ohio, USA.
Sivazlian, B. D. (1981) On a multivariate extension of the gamma and beta distributions, SIAM J. Appl. Math., 41, 205-209.
Song, D. and Gupta, A. K. (1997) Properties of generalized Liouville distribution, Random Oper. Stochastic Equations, 5(4), 337-348.
Srivastava, M. S. (1965) On the complex Wishart Distribution, Ann. Math. Statist., 36, 312-315.
Srivastava, M. S. and Khatri, C. G. (1979) An Introduction to Multivariate Statistics, North Holland, New York.
Stein, C. (1969) Multivariate Analysis I, Department of Statistics, Stanford University, USA, Technical Report No. 42.
Steyn, H. S. and Roux, J. J. J. (1972) Approximations for the non-central Wishart distributions, South African Statist. J., 6, 165-173.
Styan, G. P. H. (1979) Three useful expressions for expectations involving Wishart matrices, Statistical Data Analysis and Inference (Y. Dodge, ed.), Elsevier Science Publishers B. V., 283-196
Subrahmaniam, K. (1973) On some functions of matrix argument, Utilitas Math., 3, 83-106.
Subrahmaniam, K. (1976) Recent trends in multivariate normal distribution: On the zonal polynomials and other functions of matrix argument, Sankhya, A38, 221-258.
Sugiura, N. (1973) Derivatives of the characteristic root of a symmetric or a Hermitian matrix with two applications in multivariate analysis, Commun. Statist., 1(5), 393-417.
Sutradhar, B. C. and Ali, M. M. (1989) A generalization of the Wishart distribution for the elliptical model and its moments for the multivariate t model, J. Multivariate Anal., 22, 155-162.
Sverdrup, E. (1947) Derivation of the Wishart distribution of the second order sample moments by straight forward integration of a multiple integral, Skandinavisk Aktuarietidskrift, 30, 151-166.
Tan, W. Y. (1964) Bayesian Analysis of Random Effect Models, Ph. D. thesis, University of Wisconsin, Madison.
Tan, W. Y. (1968) Some distribution theory associated with complex Gaussian distribution, Tamkang J., 7, 263-301.
Tan, W. Y. (1969a) Some results on multivariate regression analysis, Nanta Mathematica, 3, 54-71.
Tan, W. Y. (1969b) The restricted matric-t distribution and its applications in deriving posterior distributions of parameters in multivariate regression analysis, Department of Statistics, University of Wisconsin, Madison, Technical Report No. 205.
Tan, W. Y. (1969c) Note on the multivariate and the generalized multivariate beta distributions, J. Amer. Statist. Assoc., 64, 230-241.
Tan, W. Y. (1973) Multivariate studentization and its applications, Canadian J. Statist., 1(2), 181-199.
Tan, W. Y. (1979) On the approximation of noncentral Wishart distribution by Wishart distribution, Metron, 37(3), 49-58.
Tan, W. Y. (1980) On approximating multivariate distributions, Multivariate Statistical Analysis (R. P. Gupta, ed.), North-Holland, Amsterdam, 237-249.
Tan, W. Y. and Gupta, R. P. (1982) On approximating the non-central Wishart distribution by Wishart distribution: A monte carlo study, Commun. Statist.-Simul. Comp., 11(1), 47-64.
Tan, W. Y. and Gupta, R. P. (1983) On approximating a linear combination of central Wishart matrices with positive coefficients, Commun. Statist.-Theory Meth., 12(22), 2589-2600.
Tan, W. Y. and Guttman, I. (1971) A disguised Wishart variable and a related theorem, J. Roy. Statist. Soc., B33, 147-152.
Tiao, G. C. and Guttman, I. (1965) The inverted Dirichlet distribution with application, J. Amer. Statist. Assoc., 60, 793-805.
Tiao, G. C., Tan, W. Y. and Chang, Y. C. (1970) A Bayesian approach to multivariate regression subject to linear constraints, paper presented at the Second World Congress of the Econometric Society, Cambridge, England, September 1970.
Tiao, G. C. and Zellner, A. (1964) On the Bayesian estimation of multivariate regression, J. Roy. Statist. Soc., B26, 277-285.
Tracy, D. S. and Sultan, S. A. (1993) Third moment of matrix quadratic form, Statist. Probab. Lett., 16, 71-76.
Troskie, C. G. (1967) Noncentral multivariate Dirichlet distributions, South African Statist. J., 1, 21-32.
Troskie, C. G. (1969) The generalised multiple correlation matrix, South African Statist. J., 3(2), 109-122.
Troskie, C. G. (1972) The distributions of some test criteria depending on multivariate Dirichlet distributions, South African Statist. J., 6, 151-163.
Tsai, M. T. (1995) A generalization of Wishart distribution, Statist. Probab. Lett., 24, 67-70.
Turin, G. L. (1960) The characteristic function of Hermitian quadratic forms in complex normal variables, Biometrika, 47, 199-201.
Tyler, D. E. (1987) Statistical analysis for the angular central Gaussian distribution on the sphere, Biometrika, 74(3), 579-589.
Uhlig, H. (1994) On singular Wishart and singular multivariate beta distributions, Ann. Statist., 22(1), 395-405.
van der Merwe, C. A. (1980) Expectations of the traces of functions of a multivariate normal variable, Department of Mathematical Statistics, University of the Orange Free State, RSA, Technical Report No. 56.
van der Merwe, G. J. and Roux, J. J. J. (1974) On a generalized matrix-variate hypergeometric distribution, South African Statist. J., 8, 49-58.
von Rosen, D. (1988a) Moments for the inverted Wishart distribution, Scand. J. Statist., 15, 97-109.
von Rosen, D. (1988b) Moments of matrix normal variables, Statistics, 19, 575-583.
Weibull, M. (1953) The distribution of t- and F-statistics and of correlation and regression coefficients in stratified samples from normal populations with different means, Skandinavisk Aktuarietidskrift, 1-2, Supplement, 1-106.
Whaba, G. (1968) On the distributions of some statistics useful in the analysis of jointly stationary time series, Ann. Math. Statist., 39, 1849-1862.
Whaba, G. (1971) Some tests of independence for stationary multivariate time series, J. Roy. Statist. Soc., B33, 153-166.
Wigner, E. P. (1965) Distribution laws for the roots of a random Hermitian matrix, Statistical Theory of Spectra: Fluctuations (C. E. Porter, ed.), Academic Press, New York, 446-461.
Wigner, E. P. (1967) Random matrices in physics, SIAM Review, 9, 1-23.
Wijsman, R. A. (1957) Random orthogonal transformation and their use in some classical distribution problems in multivariate analysis, Ann. Math. Statist., 28, 415-423.
Wilks, S. S. (1932) Certain generalizations in the analysis of variance, Biometrika, 24, 471-494.
Wishart, J. (1928) The generalized product moment distribution in samples from a normal multivariate population, Biometrika, A20, 32-52.
Wishart, J. and Bartlett, M. S. (1933) The generalized product moment distribution in a normal system, Proc. Camb. Phil. Soc., 29, 260-270.
Wong, C. S. and Liu, D. (1994) Moments for left elliptically contoured random matrices, J. Multivariate Anal., 49, 1-23.
Wooding, R. A. (1956) The multivariate distribution of complex normal variables, Biometrika, 43, 212-215.
Xu, Jian-Lun (1987) Inverse Dirichlet distribution and its applications, Acta Mathematicae Applicatae Sinica, 10, 91-100. Reprinted in Statistical Inference in Elliptically Contoured and Related Distributions (K. T. Fang and T. W. Anderson, eds.), Allerton Press, New York.
SUBJECT INDEX Approximations, 124 Asymptotic Distribution: Dirichlet type I, 213 Dirichlet type II, 214 Bessel Function: type I, 39 type II, 39 Beta Distribution: definition type I, 165 type II, 166 characteristic function, 173, 174 generalized type I, 166 generalized type II, 167 noncentral, 188 Beta function: generalized multivariate, 23 incomplete, 40, 51 multivariate, 18-19 Characteristic function: definition, 46 beta distribution, 173, 174 elliptically contoured distribution, 322 normal distribution, 56-57 Wishart distribution, 93 noncentral Wishart distribution, 115 restricted normal distribution, 75 singular normal distribution, 69 Characteristic roots, 3, 5 Choleskny decomposition, 7 Commutation matrix, 9 Complex distributions, 303-304 Conditional distribution: inverted ^-distribution, 162 normal distribution, 65-66 ^-distribution, 138-140 Wishart distribution, 94-95 Confluent hypergeometric function: type I, 36 type II, 38 Covariance matrix, 46 Dirichlet: asymptotic distribution, 213 inverse Dirichlet distribution, 203 type I distribution, 199 type II distribution, 200 noncentral distribution, 218 multivariate Dirichlet function, 21 also see Distribution Distributions: angular central Gaussian, 288-289 Bellman gamma, 122 beta type I also see Beta Distribution beta type II also see Beta Distribution beta-Wishart, 290-291 bimatrix Wishart, 289-290 Bingham, 284-285 Bingham-von Mises, 285-287 complex, 303-304 confluent hypergeometric function kind 1, 291-294 kind 2 and type I, 295-296 kind 2 and type I, 296-298 correlation matrix, 107 Dirichlet type I, 199 Dirichlet type II, 200 elliptically contoured, 322-323 gamma, 122 generalized hypergeometric function, 301-303
  hypergeometric function
    type I, 298
    type II, 300
  inverse Dirichlet distribution, 203
  inverted gamma, 122
  Liouville
    type I, 312
    type II, 313
  normal
    definition, 55
    restricted, 74
    singular, 68
    symmetric, 70
    θ-generalized, 77
  orthogonally invariant and residually independent, 323
  quadratic form, 225
  regression matrix, 107
  sample covariance matrix, 92
  spherical, 315-321
  t
    definition, 133
    disguised, 143
    inverted, 142
    quadratic form, 156
    noncentral, 152
    restricted, 151
  uniform, 279-281, 316
  von Mises-Fisher, 281-283
Elementary symmetric function, 15, 72
Elliptically contoured models:
  definition, 322
  characteristic function, 322
  marginal distribution, 323
Entropy, 82
Gamma distribution, see Distributions
Gamma function:
  generalized multivariate, 23-24
  incomplete, 40
  multivariate, 18-19
Generalized Hermite polynomial, 42-43
Generalized hypergeometric coefficient, 30
Generalized hypergeometric function:
  integrals, 35-38
  one matrix, 34
  two matrices, 34
Generalized hypergeometric function distributions, 301-303
Generalized Laguerre polynomial, 41
Haar measures, 17
Hermite polynomial, 42-43
Hypergeometric function distribution:
  type I, 298
  type II, 300
Independence:
  linear form, 260, 161
  quadratic form, 258, 261
Idempotent matrix, 2, 3, 5, 7-8
Integration, 18
Inverse beta distribution, 172
Inverse Dirichlet distribution, 203
Inverse Laplace transform, 18
Inverted Wishart distribution, 111-113
Invariance, 90, 172, 204, 223, 231, 293, 296, 298, 323
Invariant measure, 16-17
  orthogonal group, 17
  Stiefel manifold, 16
Jacobian of transformation:
  inverse, 14
  linear, 13-14
  quadratic, 14-15
  orthogonal, 15-17
Kronecker product of matrices, 8
Laguerre polynomial, see also generalized Laguerre polynomial
Liouville distribution, see also Distributions
Laplace transform:
  convolution, 18
  inverse, 18
Latent root, see characteristic roots
Lower triangular matrix, 2, 5-6
Marginal distribution:
  beta type I distribution, 174
  beta type II distribution, 175
  elliptically contoured distribution, 322
  inverted Wishart distribution, 111
  normal distribution, 65-66
  singular normal distribution, 69
  t-distribution, 138-140
  Wishart distribution, 94-95
Matrix:
  characteristic roots, 3, 5, 7
  Cholesky decomposition, 7
  commutation, 9
  idempotent, 2-3, 5
  Kronecker product, 8
  nonsingular, 2
  orthogonal, 2, 5
  partition, 3-4
  positive definite, 2
  random, 44
  rank, 3-5
  rank factorization, 7
  spectral decomposition, 6
  spectral representation, 7
  square root factorization, 7
  symmetric, 2
  trace, 3-5
  transition, 11
  triangular
    lower, 2, 6
    upper, 2, 6
  vec, 9
  vecp, 10
Moments:
  beta, 178-179
  inverted Wishart, 113
  elliptically contoured, 323
  normal, 57-59
  Wishart, 98
Moment generating function:
  definition, 45
  quadratic function distribution, 228-230
  S = XAX', 254
  S = XAX' + ½(LX' + XL') + C, 263
  S = XAX' + L₁X' + XL₂' + C, 271-273
Multivariate:
  beta function, 20
  Dirichlet distribution, 199
  Dirichlet function, 21
  gamma function, 18-19
  inverted t-distribution, 142
  normal distribution, 55
  t-distribution, 133, 134
Noncentral distribution:
  beta, 188
  Dirichlet, 218
  inverted Wishart, 121
  quadratic form, 246
  t, 152
  Wishart, 113
Nonsingular matrix, 2
Normal distribution:
  expected values, 57-64
  characteristic function, 56-57
  conditional, 65
  marginal, 65
  restricted, 74
  singular, 68
  symmetric, 70
  θ-generalized, 77
Orthogonal group:
  invariant measure, 16-17
  volume, 17, 25-26
Orthogonal matrix, 2, 5
Partition of a matrix, 3
Positive definite matrix, 2
Quadratic form:
  distribution, 225
  expected values, 231-233
  involving t variables, 156
  involving Wishart matrix, 98-102
  moment generating function, 228-230
  noncentral distribution, 246-251
  Wishartness, 90, 256-257, 265-266, 270, 273
Random matrix:
  characteristic function, 46
  conditional distribution, 45
  covariance matrix, 46-47
  definition, 44
  expected value, 46
  moment generating function, 45
Rank factorization, 7
Rank of a matrix, 3
Roots of a matrix, 3
Sample:
  correlation matrix, 107
  covariance matrix, 92
  generalized variance, 105
Singular normal distribution, 68
Spectral decomposition, 6
Spherical distribution, 315
Square root factorization, 7
Symmetric matrix, 2
Stiefel manifold:
  definition, 16
  invariant measure, 16-17
  volume, 17, 25-26
Sverdrup's lemma, 28-29
Symmetric normal distribution, 70
t-distribution:
  conditional, 138-140
  definition, 134
  disguised, 143
  expected value, 135-136, 146-147, 160
  inverted, 142
  marginal, 138-140, 162
  noncentral, 152
  quadratic form, 156
  restricted, 151
Trace of a matrix, 4
Transition matrix, 11
Triangular matrix, 2, 5-6
Uniform distribution, 279-281, 316
Upper triangular matrix, 2, 5-6
vec of a matrix, 9
vecp of a matrix, 10
Volume, 17, 25-26
Wishart distribution:
  additive property, 93
  characteristic function, 93
  conditional distribution, 94-95
  cumulative distribution function, 89
  definition, 87
  invariance, 90
  inverted, 111
  marginal distribution, 94-95
  moments, 98, 113
  noncentral, 113
  noncentral inverted, 121
  triangular factorization, 91-92, 102, 104, 110, 121
  zonal polynomial, 101-102
Wishartness, 90, 256-257, 265-266, 270, 273
Zonal polynomial:
  expectation
    beta distribution, 177-178, 194-195
    oriarim distribution, 306, 327, 329
    quadratic form, 232-233
    Wishart distribution, 101-102
  integrals, 31-33, 50