Springer Series in Statistics Advisors: P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger

Moshe Shaked J. George Shanthikumar

Stochastic Orders

Moshe Shaked
Department of Mathematics
University of Arizona
Tucson, AZ 85721

J. George Shanthikumar
Department of Industrial Engineering and Operations Research
University of California, Berkeley
Berkeley, CA 94720

Library of Congress Control Number: 2006927724 ISBN-10: 0-387-32915-3 ISBN-13: 978-0387-32915-4 Printed on acid-free paper. © 2007 Springer Science+Business Media, LLC All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. 9 8 7 6 5 4 3 2 1 springer.com

To my wife Edith and to my children Tal, Shanna, and Lila M.S. To my wife Mellony and to my children Devin, Rajan, and Sohan J.G.S.

Preface

Stochastic orders and inequalities have been used during the last 40 years, at an accelerated rate, in many diverse areas of probability and statistics. Such areas include reliability theory, queuing theory, survival analysis, biology, economics, insurance, actuarial science, operations research, and management science. The purpose of this book is to collect in one place essentially all that is known about these orders up to the present. In addition, the book illustrates some of the usefulness and applicability of these stochastic orders.

This book is a major extension of the first six chapters in Shaked and Shanthikumar [515]. The idea that led us to write those six chapters arose as follows. In our own research in reliability theory and operations research we have been using, for years, several notions of stochastic orders. Often we would encounter a result that we could easily (or not so easily) prove, but we could not tell whether it was known or new. Even when we were sure that a result was known, we would not know right away where it could be found. Also, sometimes we would prove a result for the purpose of an application, only to realize later that a stronger result (stronger than what we needed) had already been derived elsewhere. We also often have had difficulties giving a reference for one source that contained everything about stochastic orders that we needed in a particular paper. In order to avoid such difficulties we wrote the first six chapters in Shaked and Shanthikumar [515].

Since 1994 the theory of stochastic orders has grown significantly. We think that now is the time to put in one place essentially all that is known about these orders. This book is the result of this effort.

The simplest way of comparing two distribution functions is by the comparison of the associated means. However, such a comparison is based on only two single numbers (the means), and therefore it is often not very informative. In addition to this, the means sometimes do not exist. In many instances in applications one has more detailed information, for the purpose of comparison of two distribution functions, than just the two means. Several orders of distribution functions, that take into account various forms of possible knowledge about the two underlying distribution functions, are studied in Chapters 1 and 2.

When one wishes to compare two distribution functions that have the same mean (or that are centered about the same value), one is usually interested in the comparison of the dispersion of these distributions. The simplest way of doing it is by the comparison of the associated standard deviations. However, such a comparison, again, is based on only two single numbers, and therefore it is often not very informative. In addition to this, again, the standard deviations sometimes do not exist. Several orders of distribution functions, which take into account various forms of possible knowledge about the two underlying distribution functions (in addition to the fact that they are centered about the same value), are studied in Chapter 3.

Orders that can be used for the joint comparison of both the location and the dispersion of distribution functions are studied in Chapters 4 and 5. The analogous orders for multivariate distribution functions are studied in Chapters 6 and 7.

When one is interested in the comparison of a sequence of distribution functions, associated with the random variables $X_i$, $i = 1, 2, \ldots$, then one can use, of course, any of the orders described in Chapters 1–7 for the purpose of comparing any two of these distributions. However, the parameter $i$ may now introduce some patterns that connect all the underlying distributions. For example, suppose not only that the random variables $X_i$, $i = 1, 2, \ldots$, increase stochastically in $i$, but also that the increase is sharper for larger $i$’s. Then the sequence $X_i$, $i = 1, 2, \ldots$, is stochastically increasing in a convex sense. Such notions of stochastic convexity and concavity are studied in Chapter 8.
Notions of positive dependence of two random variables $X_1$ and $X_2$ have been introduced in the literature in an effort to mathematically describe the property that “large (respectively, small) values of $X_1$ go together with large (respectively, small) values of $X_2$.” Many of these notions of positive dependence are defined by means of some comparison of the joint distribution of $X_1$ and $X_2$ with their distribution under the theoretical assumption that $X_1$ and $X_2$ are independent. Often such a comparison can be extended to general pairs of bivariate distributions with given marginals. This fact led researchers to introduce various notions of positive dependence orders. These orders are designed to compare the strength of the positive dependence of the two underlying bivariate distributions. Many of these orders can be further extended to comparisons of general multivariate distributions that have the same marginals. In Chapter 9 we describe these orders.

We have in mind a wide spectrum of readers and users of this book. On one hand, the text can be useful for those who are already familiar with many aspects of stochastic orders, but who are not aware of all the developments in this area. On the other hand, people who are not very familiar with stochastic orders, but who know something about them, can use this book for the purpose of studying or widening their knowledge and understanding of this important area.

We wish to thank Haijun Li, Asok K. Nanda, and Taizhong Hu for critical readings of several drafts of the manuscript. Their comments led to a substantial improvement in the presentation of some of the results in these chapters. We also thank Yigal Gerchak and Marco Scarsini for some illuminating suggestions. We thank our academic advisors John A. Buzacott (of J. G. S.) and Albert W. Marshall (of M. S.) who, years ago, introduced us to some aspects of the area of stochastic orders.

Tucson, Berkeley, August 16, 2006

Moshe Shaked J. George Shanthikumar

Contents

1 Univariate Stochastic Orders
  1.A The Usual Stochastic Order
    1.A.1 Definition and equivalent conditions
    1.A.2 A characterization by construction on the same probability space
    1.A.3 Closure properties
    1.A.4 Further characterizations and properties
    1.A.5 Some properties in reliability theory
  1.B The Hazard Rate Order
    1.B.1 Definition and equivalent conditions
    1.B.2 The relation between the hazard rate and the usual stochastic orders
    1.B.3 Closure properties and some characterizations
    1.B.4 Comparison of order statistics
    1.B.5 Some properties in reliability theory
    1.B.6 The reversed hazard order
  1.C The Likelihood Ratio Order
    1.C.1 Definition
    1.C.2 The relation between the likelihood ratio and the hazard and reversed hazard orders
    1.C.3 Some properties and characterizations
    1.C.4 Shifted likelihood ratio orders
  1.D The Convolution Order
  1.E Complements

2 Mean Residual Life Orders
  2.A The Mean Residual Life Order
    2.A.1 Definition
    2.A.2 The relation between the mean residual life and some other stochastic orders
    2.A.3 Some closure properties
    2.A.4 A property in reliability theory
  2.B The Harmonic Mean Residual Life Order
    2.B.1 Definition
    2.B.2 The relation between the harmonic mean residual life and some other stochastic orders
    2.B.3 Some closure properties
    2.B.4 Properties in reliability theory
  2.C Complements

3 Univariate Variability Orders
  3.A The Convex Order
    3.A.1 Definition and equivalent conditions
    3.A.2 Closure and other properties
    3.A.3 Conditions that lead to the convex order
    3.A.4 Some properties in reliability theory
    3.A.5 The m-convex orders
  3.B The Dispersive Order
    3.B.1 Definition and equivalent conditions
    3.B.2 Properties
  3.C The Excess Wealth Order
    3.C.1 Motivation and definition
    3.C.2 Properties
  3.D The Peakedness Order
    3.D.1 Definition
    3.D.2 Some properties
  3.E Complements

4 Univariate Monotone Convex and Related Orders
  4.A The Monotone Convex and Monotone Concave Orders
    4.A.1 Definitions and equivalent conditions
    4.A.2 Closure properties and some characterizations
    4.A.3 Conditions that lead to the increasing convex and increasing concave orders
    4.A.4 Further properties
    4.A.5 Some properties in reliability theory
    4.A.6 The starshaped order
    4.A.7 Some related orders
  4.B Transform Orders: Convex, Star, and Superadditive Orders
    4.B.1 Definitions
    4.B.2 Some properties
    4.B.3 Some related orders
  4.C Complements

5 The Laplace Transform and Related Orders
  5.A The Laplace Transform Order
    5.A.1 Definitions and equivalent conditions
    5.A.2 Closure and other properties
  5.B Orders Based on Ratios of Laplace Transforms
    5.B.1 Definitions and equivalent conditions
    5.B.2 Closure properties
    5.B.3 Relationship to other stochastic orders
  5.C Some Related Orders
    5.C.1 The factorial moments order
    5.C.2 The moments order
    5.C.3 The moment generating function order
  5.D Complements

6 Multivariate Stochastic Orders
  6.A Notations and Preliminaries
  6.B The Usual Multivariate Stochastic Order
    6.B.1 Definition and equivalent conditions
    6.B.2 A characterization by construction on the same probability space
    6.B.3 Conditions that lead to the multivariate usual stochastic order
    6.B.4 Closure properties
    6.B.5 Further properties
    6.B.6 A property in reliability theory
    6.B.7 Stochastic ordering of stochastic processes
  6.C The Cumulative Hazard Order
    6.C.1 Definition
    6.C.2 The relationship between the cumulative hazard order and the usual multivariate stochastic order
  6.D Multivariate Hazard Rate Orders
    6.D.1 Definitions and basic properties
    6.D.2 Preservation properties
    6.D.3 The dynamic multivariate hazard rate order
  6.E The Multivariate Likelihood Ratio Order
    6.E.1 Definition
    6.E.2 Some properties
    6.E.3 A property in reliability theory
  6.F The Multivariate Mean Residual Life Order
    6.F.1 Definition
    6.F.2 The relation between the multivariate mean residual life and the dynamic multivariate hazard rate orders
    6.F.3 A property in reliability theory
  6.G Other Multivariate Stochastic Orders
    6.G.1 The orthant orders
    6.G.2 The scaled order statistics orders
  6.H Complements

7 Multivariate Variability and Related Orders
  7.A The Monotone Convex and Monotone Concave Orders
    7.A.1 Definitions
    7.A.2 Closure properties
    7.A.3 Further properties
    7.A.4 Convex and concave ordering of stochastic processes
    7.A.5 The $(m_1, m_2)$-icx orders
    7.A.6 The symmetric convex order
    7.A.7 The componentwise convex order
    7.A.8 The directional convex and concave orders
    7.A.9 The orthant convex and concave orders
  7.B Multivariate Dispersion Orders
    7.B.1 A strong multivariate dispersion order
    7.B.2 A weak multivariate dispersion order
    7.B.3 Dispersive orders based on constructions
  7.C Multivariate Transform Orders: Convex, Star, and Superadditive Orders
  7.D The Multivariate Laplace Transform and Related Orders
    7.D.1 The multivariate Laplace transform order
    7.D.2 The multivariate factorial moments order
    7.D.3 The multivariate moments order
  7.E Complements

8 Stochastic Convexity and Concavity
  8.A Regular Stochastic Convexity
    8.A.1 Definitions
    8.A.2 Closure properties
    8.A.3 Stochastic m-convexity
  8.B Sample Path Convexity
    8.B.1 Definitions
    8.B.2 Closure properties
  8.C Convexity in the Usual Stochastic Order
    8.C.1 Definitions
    8.C.2 Closure properties
  8.D Strong Stochastic Convexity
    8.D.1 Definitions
    8.D.2 Closure properties
  8.E Stochastic Directional Convexity
    8.E.1 Definitions
    8.E.2 Closure properties
  8.F Complements

9 Positive Dependence Orders
  9.A The PQD and the Supermodular Orders
    9.A.1 Definition and basic properties: The bivariate case
    9.A.2 Closure properties
    9.A.3 The multivariate case
    9.A.4 The supermodular order
  9.B The Orthant Ratio Orders
    9.B.1 The (weak) orthant ratio orders
    9.B.2 The strong orthant ratio orders
  9.C The LTD, RTI, and PRD Orders
  9.D The PLRD Order
  9.E Association Orders
  9.F The PDD Order
  9.G Ordering Exchangeable Distributions
  9.H Complements

References
Author Index
Subject Index

Note

Throughout the book “increasing” means “nondecreasing” and “decreasing” means “nonincreasing.” Expectations are assumed to exist whenever they are written. The “inverse” of a monotone function (which is not strictly monotone) means the right continuous version of it, unless stated otherwise. For example, if $F$ is a distribution function, then the right continuous version of its inverse is $F^{-1}(u) = \sup\{x : F(x) \le u\}$, $u \in [0, 1]$.

The following aging notions will be encountered often throughout the text. Let $X$ be a random variable with distribution function $F$ and survival function $\bar{F} \equiv 1 - F$.

(i) The random variable $X$ (or its distribution) is said to be IFR [increasing failure rate] if $\bar{F}$ is logconcave. It is said to be DFR [decreasing failure rate] if $\bar{F}$ is logconvex.

(ii) The nonnegative random variable $X$ (or its distribution) is said to be IFRA [increasing failure rate average] if $-\log \bar{F}$ is starshaped; that is, if $-\log \bar{F}(t)/t$ is increasing in $t \ge 0$. It is said to be DFRA [decreasing failure rate average] if $-\log \bar{F}$ is antistarshaped; that is, if $-\log \bar{F}(t)/t$ is decreasing in $t \ge 0$.

(iii) The nonnegative random variable $X$ (or its distribution) is said to be NBU [new better than used] if $\bar{F}(s)\bar{F}(t) \ge \bar{F}(s + t)$ for all $s \ge 0$ and $t \ge 0$. It is said to be NWU [new worse than used] if $\bar{F}(s)\bar{F}(t) \le \bar{F}(s + t)$ for all $s \ge 0$ and $t \ge 0$.

(iv) The random variable $X$ (or its distribution) is said to be DMRL [decreasing mean residual life] if $\dfrac{\int_t^\infty \bar{F}(s)\,ds}{\bar{F}(t)}$ is decreasing in $t$ over $\{t : \bar{F}(t) > 0\}$. It is said to be IMRL [increasing mean residual life] if $\dfrac{\int_t^\infty \bar{F}(s)\,ds}{\bar{F}(t)}$ is increasing in $t$ over $\{t : \bar{F}(t) > 0\}$.

(v) The nonnegative random variable $X$ (or its distribution) is said to be NBUE [new better than used in expectation] if $\dfrac{\int_t^\infty \bar{F}(s)\,ds}{\bar{F}(t)} \le EX$ for all $t \ge 0$. It is said to be NWUE [new worse than used in expectation] if $\dfrac{\int_t^\infty \bar{F}(s)\,ds}{\bar{F}(t)} \ge EX$ for all $t \ge 0$.
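As an illustration of the aging notions above, the NBU and NWU inequalities can be explored numerically on a grid. The following sketch is not part of the original text. It assumes a Weibull survival function $\bar{F}(t) = \exp(-t^k)$ with unit scale, and the names `weibull_survival`, `is_nbu`, `is_nwu`, and `GRID` are hypothetical. A grid check can refute an inequality or give numerical evidence for it, but it is not a proof.

```python
import math


def weibull_survival(t, shape):
    """Survival function exp(-t**shape) of a unit-scale Weibull (assumed example)."""
    return math.exp(-(t ** shape))


def is_nbu(survival, grid, tol=1e-12):
    """Check survival(s) * survival(t) >= survival(s + t) on all grid pairs."""
    return all(
        survival(s) * survival(t) >= survival(s + t) - tol
        for s in grid
        for t in grid
    )


def is_nwu(survival, grid, tol=1e-12):
    """Check survival(s) * survival(t) <= survival(s + t) on all grid pairs."""
    return all(
        survival(s) * survival(t) <= survival(s + t) + tol
        for s in grid
        for t in grid
    )


# Grid of nonnegative time points used for the check (an arbitrary choice).
GRID = [0.25 * i for i in range(21)]

print(is_nbu(lambda t: weibull_survival(t, 2.0), GRID))  # shape 2
print(is_nwu(lambda t: weibull_survival(t, 0.5), GRID))  # shape 1/2
```

On this grid the shape-2 case is consistent with NBU and the shape-1/2 case with NWU, while the exponential case (shape 1) satisfies both inequalities up to rounding.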

The majorization order will be used in some places in the text. Recall from Marshall and Olkin [383] that a vector $a = (a_1, a_2, \ldots, a_n)$ is said to be smaller in the majorization order than the vector $b = (b_1, b_2, \ldots, b_n)$ (denoted $a \prec b$) if $\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} b_i$ and if $\sum_{i=1}^{j} a_{[i]} \le \sum_{i=1}^{j} b_{[i]}$ for $j = 1, 2, \ldots, n - 1$, where $a_{[i]}$ [$b_{[i]}$] is the $i$th largest element of $a$ [$b$], $i = 1, 2, \ldots, n$. An $n$-dimensional function $\phi$ is called Schur convex [concave] if $a \prec b \Longrightarrow \phi(a) \le [\ge]\ \phi(b)$.

The notation $\mathbb{N} \equiv \{\ldots, -1, 0, 1, \ldots\}$, $\mathbb{N}_{+} \equiv \{0, 1, \ldots\}$, and $\mathbb{N}_{++} \equiv \{1, 2, \ldots\}$ will be used in this text.
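The majorization condition recalled above can be checked directly from its definition. The sketch below is an illustration only and is not taken from the text; the function name `is_majorized` and the tolerance parameter are assumptions made for the example.

```python
def is_majorized(a, b, tol=1e-12):
    """Return True if a is smaller than b in the majorization order (a ≺ b)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # The total sums must agree.
    if abs(sum(a) - sum(b)) > tol:
        return False
    # Compare partial sums of the j largest elements, j = 1, ..., n - 1.
    a_desc = sorted(a, reverse=True)
    b_desc = sorted(b, reverse=True)
    partial_a = 0.0
    partial_b = 0.0
    for x, y in zip(a_desc[:-1], b_desc[:-1]):
        partial_a += x
        partial_b += y
        if partial_a > partial_b + tol:
            return False
    return True


def sum_of_squares(v):
    """A standard example of a Schur convex function."""
    return sum(x * x for x in v)


print(is_majorized([1, 1, 1], [3, 0, 0]))
print(sum_of_squares([1, 1, 1]) <= sum_of_squares([3, 0, 0]))
```

The second line of output illustrates Schur convexity for the sum of squares on one pair of vectors; it is a single instance, not a verification of the property.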

1 Univariate Stochastic Orders

In this chapter we study stochastic orders that compare the “location” or the “magnitude” of random variables. The most important and common orders that are considered in this chapter are the usual stochastic order $\le_{\mathrm{st}}$, the hazard rate order $\le_{\mathrm{hr}}$, and the likelihood ratio order $\le_{\mathrm{lr}}$. Some variations of these orders, and some related orders, are also examined in this chapter.

1.A The Usual Stochastic Order

1.A.1 Definition and equivalent conditions

Let $X$ and $Y$ be two random variables such that

$$P\{X > x\} \le P\{Y > x\} \quad \text{for all } x \in (-\infty, \infty). \tag{1.A.1}$$

Then $X$ is said to be smaller than $Y$ in the usual stochastic order (denoted by $X \le_{\mathrm{st}} Y$). Roughly speaking, (1.A.1) says that $X$ is less likely than $Y$ to take on large values, where “large” means any value greater than $x$, and that this is the case for all $x$’s. Note that (1.A.1) is the same as

$$P\{X \le x\} \ge P\{Y \le x\} \quad \text{for all } x \in (-\infty, \infty). \tag{1.A.2}$$
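Condition (1.A.1) can be explored numerically by comparing two survival functions on a grid. The sketch below is an illustration that is not part of the original text: the normal distributions, the grid, and the names `normal_survival`, `st_leq_on_grid`, and `GRID` are all assumptions chosen for the example. Agreement on a finite grid is only numerical evidence, whereas a single violation does show that the order fails.

```python
import math


def normal_survival(mean, sd):
    """Return the survival function x -> P{Z > x} of a normal(mean, sd^2) law."""
    def survival(x):
        return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))
    return survival


def st_leq_on_grid(surv_x, surv_y, grid, tol=1e-12):
    """Check P{X > x} <= P{Y > x} at every grid point, as in (1.A.1)."""
    return all(surv_x(x) <= surv_y(x) + tol for x in grid)


# Grid of points in [-5, 5] (an arbitrary choice for the illustration).
GRID = [-5.0 + 0.1 * i for i in range(101)]

standard = normal_survival(0.0, 1.0)
shifted = normal_survival(1.0, 1.0)
spread = normal_survival(0.0, 2.0)

print(st_leq_on_grid(standard, shifted, GRID))  # a location shift
print(st_leq_on_grid(standard, spread, GRID))   # same mean, larger spread
print(st_leq_on_grid(spread, standard, GRID))
```

In the second pair the two survival curves cross at the common mean, so the check fails in both directions: two distributions need not be comparable in the usual stochastic order.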

It is easy to verify (by noting that every closed interval is an infinite intersection of open intervals) that $X \le_{\mathrm{st}} Y$ if, and only if,

$$P\{X \ge x\} \le P\{Y \ge x\} \quad \text{for all } x \in (-\infty, \infty). \tag{1.A.3}$$

In fact, we can recast (1.A.1) and (1.A.3) in a seemingly more general, but actually an equivalent, way as follows:

$$P\{X \in U\} \le P\{Y \in U\} \quad \text{for all upper sets } U \subseteq (-\infty, \infty). \tag{1.A.4}$$

(In the univariate case, that is on the real line, a set $U$ is an upper set if, and only if, it is an open or a closed right half line.) In the univariate case the equivalence of (1.A.4) with (1.A.1) and (1.A.3) is trivial, but in Chapter 6 it will be seen that the generalizations of each of these three conditions to the multivariate case yield different definitions of stochastic orders.

Still another way of rewriting (1.A.1) or (1.A.3) is the following:

$$E[I_U(X)] \le E[I_U(Y)] \quad \text{for all upper sets } U \subseteq (-\infty, \infty), \tag{1.A.5}$$

where $I_U$ denotes the indicator function of $U$. From (1.A.5) it follows that if $X \le_{\mathrm{st}} Y$, then

$$E\left[\sum_{i=1}^{m} a_i I_{U_i}(X)\right] - b \le E\left[\sum_{i=1}^{m} a_i I_{U_i}(Y)\right] - b \tag{1.A.6}$$

for all $a_i \ge 0$, $i = 1, 2, \ldots, m$, $b \in (-\infty, \infty)$, and $m \ge 0$. Given an increasing function $\phi$, it is possible, for each $m$, to define a sequence of $U_i$’s, a sequence of $a_i$’s, and a $b$ (all of which may depend on $m$), such that as $m \to \infty$ then (1.A.6) converges to

$$E[\phi(X)] \le E[\phi(Y)], \tag{1.A.7}$$

provided the expectations exist. It follows that $X \le_{\mathrm{st}} Y$ if, and only if, (1.A.7) holds for all increasing functions $\phi$ for which the expectations exist.

The expressions $\int_x^\infty P\{X > y\}\,dy$ and $\int_x^\infty P\{Y > y\}\,dy$ are used extensively in Chapters 2, 3, and 4. It is of interest to note that $X \le_{\mathrm{st}} Y$ if, and only if,

$$\int_x^\infty P\{Y > y\}\,dy - \int_x^\infty P\{X > y\}\,dy \quad \text{is decreasing in } x \in (-\infty, \infty). \tag{1.A.8}$$

If $X$ and $Y$ are discrete random variables taking on values in $\mathbb{N}$, then we have the following. Let $p_i = P\{X = i\}$ and $q_i = P\{Y = i\}$, $i \in \mathbb{N}$. Then $X \le_{\mathrm{st}} Y$ if, and only if,

$$\sum_{j=-\infty}^{i} p_j \ge \sum_{j=-\infty}^{i} q_j, \quad i \in \mathbb{N},$$

or, equivalently, $X \le_{\mathrm{st}} Y$ if, and only if,

$$\sum_{j=i}^{\infty} p_j \le \sum_{j=i}^{\infty} q_j, \quad i \in \mathbb{N}.$$
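For discrete random variables with finite support, the tail-sum condition above can be verified exactly. The following sketch is an illustration only: the two probability mass functions `P_X` and `P_Y` are made-up examples, and the names `tail`, `st_leq`, and `expectation` are hypothetical. Exact rational arithmetic is used so that no rounding tolerance is needed. Since the tail sums change only at support points, it suffices to compare them there.

```python
from fractions import Fraction


def tail(pmf, i):
    """Tail sum of the probabilities pmf[j] over all j >= i."""
    return sum(prob for j, prob in pmf.items() if j >= i)


def st_leq(p, q):
    """Check the tail-sum condition for X <=_st Y at every support point."""
    support = sorted(set(p) | set(q))
    return all(tail(p, i) <= tail(q, i) for i in support)


def expectation(pmf, phi):
    """E[phi(X)] for a finitely supported probability mass function."""
    return sum(prob * phi(j) for j, prob in pmf.items())


# Assumed example distributions on {0, 1, 2}.
P_X = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
P_Y = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}

print(st_leq(P_X, P_Y))


def cube(i):
    """An increasing function, used to illustrate (1.A.7)."""
    return i ** 3


print(expectation(P_X, cube), expectation(P_Y, cube))
```

The last line illustrates (1.A.7) for one increasing function; the characterization itself requires the inequality for all increasing functions.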

1.A.2 A characterization by construction on the same probability space

An important characterization of the usual stochastic order is the following theorem (here $=_{\mathrm{st}}$ denotes equality in law).

Theorem 1.A.1. Two random variables $X$ and $Y$ satisfy $X \le_{\mathrm{st}} Y$ if, and only if, there exist two random variables $\hat{X}$ and $\hat{Y}$, defined on the same probability space, such that

$$\hat{X} =_{\mathrm{st}} X, \tag{1.A.9}$$

$$\hat{Y} =_{\mathrm{st}} Y, \tag{1.A.10}$$

and

$$P\{\hat{X} \le \hat{Y}\} = 1. \tag{1.A.11}$$

Proof. Obviously (1.A.9), (1.A.10), and (1.A.11) imply that $X \le_{\mathrm{st}} Y$. In order to prove the necessity part of Theorem 1.A.1, let $F$ and $G$ be, respectively, the distribution functions of $X$ and $Y$, and let $F^{-1}$ and $G^{-1}$ be the corresponding right continuous inverses (see Note on page 1). Define $\hat{X} = F^{-1}(U)$ and $\hat{Y} = G^{-1}(U)$ where $U$ is a uniform $[0, 1]$ random variable. Then it is easy to see that $\hat{X}$ and $\hat{Y}$ satisfy (1.A.9) and (1.A.10). From (1.A.2) it is seen that (1.A.11) also holds.
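The construction used in the proof, feeding one common uniform variable into both inverse distribution functions, can be sketched in a few lines. This is an illustration under stated assumptions and not part of the original text: the two exponential distributions (rates 2 and 1), the seed, and the names `exponential_quantile`, `coupled_samples`, and `PAIRS` are all choices made for the example. For these continuous, strictly increasing distribution functions the right continuous inverse coincides with the ordinary inverse.

```python
import math
import random


def exponential_quantile(rate):
    """Return the inverse distribution function u -> -log(1 - u) / rate."""
    def quantile(u):
        return -math.log(1.0 - u) / rate
    return quantile


def coupled_samples(inv_f, inv_g, n, seed=0):
    """Draw n pairs (inv_f(U), inv_g(U)) that share the same uniform U."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.random()  # uniform on [0, 1)
        pairs.append((inv_f(u), inv_g(u)))
    return pairs


# X is exponential with rate 2 and Y is exponential with rate 1,
# so that P{X > x} <= P{Y > x} for all x.
PAIRS = coupled_samples(exponential_quantile(2.0), exponential_quantile(1.0), 1000)

# Every coupled pair satisfies x_hat <= y_hat, as in (1.A.11).
print(all(x_hat <= y_hat for x_hat, y_hat in PAIRS))
```

Each coordinate, taken alone, is a sample from the intended marginal distribution; only the joint behavior is special to the coupling.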

Theorem 1.A.1 is a special case of a more general result that is stated in Section 6.B.2. From (1.A.2) and Theorem 1.A.1 it follows that the random variables $X$ and $Y$, with the respective distribution functions $F$ and $G$, satisfy $X \le_{\mathrm{st}} Y$ if, and only if,

$$F^{-1}(u) \le G^{-1}(u), \quad \text{for all } u \in (0, 1). \tag{1.A.12}$$

Another way of restating Theorem 1.A.1 is the following. We omit the obvious proof of it.

Theorem 1.A.2. Two random variables $X$ and $Y$ satisfy $X \le_{\mathrm{st}} Y$ if, and only if, there exist a random variable $Z$ and functions $\psi_1$ and $\psi_2$ such that $\psi_1(z) \le \psi_2(z)$ for all $z$ and $X =_{\mathrm{st}} \psi_1(Z)$ and $Y =_{\mathrm{st}} \psi_2(Z)$.

In some applications, when the random variables $X$ and $Y$ are such that $X \le_{\mathrm{st}} Y$, one may wish to construct a $\hat{Y}$ [$\hat{X}$] on the probability space on which $X$ [$Y$] is defined, such that $\hat{Y} =_{\mathrm{st}} Y$ and $P\{X \le \hat{Y}\} = 1$ [$\hat{X} =_{\mathrm{st}} X$ and $P\{\hat{X} \le Y\} = 1$]. This is always possible. Here we will show how this can be done when the distribution function $F$ [$G$] of $X$ [$Y$] is absolutely continuous. When this is the case, $F(X)$ [$G(Y)$] is uniformly distributed on $[0, 1]$, and therefore $\hat{Y} = G^{-1}(F(X))$ [$\hat{X} = F^{-1}(G(Y))$] is the desired construction $\hat{Y}$ [$\hat{X}$].

1.A.3 Closure properties

Using (1.A.1) through (1.A.11) it is easy to prove each of the following closure results. The following notation will be used: For any random variable $Z$ and an event $A$, let $[Z \mid A]$ denote any random variable that has as its distribution the conditional distribution of $Z$ given $A$.

Theorem 1.A.3. (a) If X ≤st Y and g is any increasing [decreasing] function, then g(X) ≤st [≥st ] g(Y ). (b) Let X1 , X2 , . . . , Xm be a set of independent random variables and let Y1 , Y2 , . . . , Ym be another set of independent random variables. If Xi ≤st Yi for i = 1, 2, . . . , m, then, for any increasing function ψ : Rm → R, one has ψ(X1 , X2 , . . . , Xm ) ≤st ψ(Y1 , Y2 , . . . , Ym ). In particular,

m

Xj ≤st

j=1

m

Yj .

j=1

That is, the usual stochastic order is closed under convolutions.
(c) Let {Xj, j = 1, 2, ...} and {Yj, j = 1, 2, ...} be two sequences of random variables such that Xj →st X and Yj →st Y as j → ∞, where “→st” denotes convergence in distribution. If Xj ≤st Yj, j = 1, 2, ..., then X ≤st Y.
(d) Let X, Y, and Θ be random variables such that [X | Θ = θ] ≤st [Y | Θ = θ] for all θ in the support of Θ. Then X ≤st Y. That is, the usual stochastic order is closed under mixtures.

In the next result and in the sequel we define $\sum_{j=1}^{0} a_j \equiv 0$ for any sequence {aj, j = 1, 2, ...}.

Theorem 1.A.4. Let {Xj, j = 1, 2, ...} be a sequence of nonnegative independent random variables, and let M be a nonnegative integer-valued random variable which is independent of the Xi's. Let {Yj, j = 1, 2, ...} be another sequence of nonnegative independent random variables, and let N be a nonnegative integer-valued random variable which is independent of the Yi's. If Xi ≤st Yi, i = 1, 2, ..., and if M ≤st N, then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{st}} \sum_{j=1}^{N} Y_j.$$
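The closure under convolutions in Theorem 1.A.3(b) can be checked exactly on small discrete distributions. The following is a minimal sketch, not part of the original text: the helper names (`survival`, `st_leq`, `convolve`) and the four probability mass functions are hypothetical, and the order is tested by comparing survival functions pointwise on the support {0, 1, 2, ...}.

```python
from fractions import Fraction as Fr

def survival(pmf):
    """Return [P{X > 0}, P{X > 1}, ...] for a pmf on {0, 1, ..., len(pmf) - 1}."""
    out, acc = [], Fr(0)
    total = sum(pmf, Fr(0))
    for p in pmf:
        acc += p
        out.append(total - acc)
    return out

def st_leq(pmf_x, pmf_y):
    """Check X <=st Y by comparing P{X > k} <= P{Y > k} for every k."""
    n = max(len(pmf_x), len(pmf_y))
    sx = survival(pmf_x + [Fr(0)] * (n - len(pmf_x)))
    sy = survival(pmf_y + [Fr(0)] * (n - len(pmf_y)))
    return all(a <= b for a, b in zip(sx, sy))

def convolve(p, q):
    """pmf of the sum of two independent variables with pmfs p and q."""
    out = [Fr(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Hypothetical example distributions with X1 <=st Y1 and X2 <=st Y2.
x1 = [Fr(1, 2), Fr(1, 2)]
y1 = [Fr(1, 4), Fr(1, 4), Fr(1, 2)]
x2 = [Fr(1, 3), Fr(2, 3)]
y2 = [Fr(1, 3), Fr(1, 3), Fr(1, 3)]

x_sum = convolve(x1, x2)   # pmf of X1 + X2
y_sum = convolve(y1, y2)   # pmf of Y1 + Y2
print(st_leq(x_sum, y_sum))
```

Exact rational arithmetic is used so that the comparison is not affected by rounding; the check itself is only an illustration of the theorem on one example, not a proof.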

Another related result is given next.

Theorem 1.A.5. Let {Xj, j = 1, 2, ...} be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi's. Let {Yj, j = 1, 2, ...} be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the Yi's. Suppose that for some positive integer K we have that

$$\sum_{j=1}^{K} X_j \le_{\mathrm{st}} [\ge_{\mathrm{st}}]\; Y_1 \quad \text{and} \quad M \le_{\mathrm{st}} [\ge_{\mathrm{st}}]\; KN,$$

then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{st}} [\ge_{\mathrm{st}}] \sum_{j=1}^{N} Y_j.$$

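Proof. The assumptions yield

$$\sum_{i=1}^{M} X_i \le_{\mathrm{st}} [\ge_{\mathrm{st}}] \sum_{i=1}^{KN} X_i = \sum_{i=1}^{N} \sum_{j=K(i-1)+1}^{Ki} X_j \le_{\mathrm{st}} [\ge_{\mathrm{st}}] \sum_{i=1}^{N} Y_i.$$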

Consider now a family of distribution functions {Gθ, θ ∈ 𝒳} where 𝒳 is a subset of the real line R. Let X(θ) denote a random variable with distribution function Gθ. For any random variable Θ with support in 𝒳, and with distribution function F, let us denote by X(Θ) a random variable with distribution function H given by

$$H(y) = \int_{\mathcal{X}} G_\theta(y)\, dF(\theta), \quad y \in \mathbb{R}.$$

The following result is a generalization of both parts (a) and (c) of Theorem 1.A.3.

Theorem 1.A.6. Consider a family of distribution functions {Gθ, θ ∈ 𝒳} as above. Let Θ1 and Θ2 be two random variables with supports in 𝒳 and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random variables such that Yi =st X(Θi), i = 1, 2; that is, suppose that the distribution function of Yi is given by

$$H_i(y) = \int_{\mathcal{X}} G_\theta(y)\, dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

If

$$X(\theta) \le_{\mathrm{st}} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{1.A.13}$$

and if

$$\Theta_1 \le_{\mathrm{st}} \Theta_2, \tag{1.A.14}$$

then

$$Y_1 \le_{\mathrm{st}} Y_2. \tag{1.A.15}$$

Proof. Note that, by (1.A.13), P{X(θ) > y} is increasing in θ for all y. Thus

$$P\{Y_1 > y\} = \int_{\mathcal{X}} P\{X(\theta) > y\}\, dF_1(\theta) \le \int_{\mathcal{X}} P\{X(\theta) > y\}\, dF_2(\theta) = P\{Y_2 > y\} \quad \text{for all } y,$$

where the inequality follows from (1.A.14) and (1.A.7). Thus (1.A.15) follows from (1.A.1).
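Theorem 1.A.6 can be illustrated numerically with a discrete mixing variable. The sketch below is an illustration under stated assumptions, not part of the original text: it takes X(θ) to be binomial with parameters 2 and θ (a family that is stochastically increasing in θ), and the parameter values, weights and helper names are all hypothetical.

```python
from fractions import Fraction as Fr
from math import comb

def binom_pmf(n, theta):
    """pmf of a Binomial(n, theta) variable on {0, ..., n}."""
    return [comb(n, k) * theta**k * (1 - theta)**(n - k) for k in range(n + 1)]

def mix(thetas, weights, n):
    """pmf of X(Theta): mixture of Binomial(n, theta) over a discrete Theta."""
    out = [Fr(0)] * (n + 1)
    for theta, w in zip(thetas, weights):
        for k, p in enumerate(binom_pmf(n, theta)):
            out[k] += w * p
    return out

def survival(pmf):
    """Return [P{X > 0}, P{X > 1}, ...]."""
    return [sum(pmf[k + 1:], Fr(0)) for k in range(len(pmf))]

thetas = [Fr(1, 4), Fr(1, 2), Fr(3, 4)]   # common support of Theta_1 and Theta_2
w1 = [Fr(1, 2), Fr(1, 4), Fr(1, 4)]       # distribution of Theta_1
w2 = [Fr(1, 4), Fr(1, 4), Fr(1, 2)]       # distribution of Theta_2, Theta_1 <=st Theta_2

h1 = mix(thetas, w1, 2)                   # pmf of Y1 = X(Theta_1)
h2 = mix(thetas, w2, 2)                   # pmf of Y2 = X(Theta_2)
ordered = all(a <= b for a, b in zip(survival(h1), survival(h2)))
print(ordered)
```

The comparison of the two mixture survival functions mirrors the inequality in the proof above, with the integrals replaced by finite sums.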


Note that, using the notation that is introduced below before Theorem 1.A.14, (1.A.13) can be rewritten as {X(θ), θ ∈ 𝒳} ∈ SI. The following example shows an application of Theorem 1.A.6 in the area of Bayesian imperfect repair; a related result is given in Example 1.B.16.

Example 1.A.7. Let Θ1 and Θ2 be two random variables with supports in 𝒳 = (0, 1] and distribution functions F1 and F2, respectively. For some survival function $\bar K$, define $\bar G_\theta = \bar K^{1-\theta}$, θ ∈ (0, 1], and let X(θ) have the survival function $\bar K^{1-\theta}$. Note that (1.A.13) holds because $\bar K^{1-\theta}(y) \le \bar K^{1-\theta'}(y)$ for all y whenever 0 < θ ≤ θ′ ≤ 1. Thus, if Θ1 ≤st Θ2 then Yi, with survival function $\bar H_i$ defined by

$$\bar H_i(y) = \int_0^1 \bar K^{1-\theta}(y)\, dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2,$$

satisfy Y1 ≤st Y2.

1.A.4 Further characterizations and properties

Clearly, if X ≤st Y then EX ≤ EY. However, as the following result shows, if two random variables are ordered in the usual stochastic order and have the same expected values, they must have the same distribution.

Theorem 1.A.8. If X ≤st Y and if E[h(X)] = E[h(Y)] for some strictly increasing function h, then X =st Y.

Proof. First we prove the result when h(x) = x. Let X̂ and Ŷ be as in Theorem 1.A.1. If P{X̂ < Ŷ} > 0, then EX = EX̂ < EŶ = EY, a contradiction to the assumption EX = EY. Therefore X =st X̂ = Ŷ =st Y. Now let h be some strictly increasing function. Observe that if X ≤st Y, then h(X) ≤st h(Y) and therefore from the above result we have that h(X) =st h(Y). The strict monotonicity of h yields X =st Y.

Other results that give conditions, involving stochastic orders, which imply stochastic equalities, are given in Theorems 3.A.43, 3.A.60, 4.A.69, 5.A.15, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16.

As was mentioned above, if X ≤st Y, then EX ≤ EY. It is easy to find counterexamples which show that the converse is false. However, X ≤st Y implies other moment inequalities (for example, EX³ ≤ EY³). Thus one may wonder whether X ≤st Y can be characterized by a collection of moment inequalities. Brockett and Kahane [109, Corollary 1] showed that there exist no finite number of moment inequalities which imply X ≤st Y. In fact, they showed it for many other stochastic orders that are studied later in this book.

In order to state the next characterization we define the following class of bivariate functions:

$$\mathcal{G}_{\mathrm{st}} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) \text{ is increasing in } x \text{ and decreasing in } y\}.$$


Theorem 1.A.9. Let X and Y be independent random variables. Then X ≤st Y if, and only if,

$$\phi(X, Y) \le_{\mathrm{st}} \phi(Y, X) \quad \text{for all } \phi \in \mathcal{G}_{\mathrm{st}}. \tag{1.A.16}$$

Proof. Suppose that (1.A.16) holds. The function φ defined by φ(x, y) ≡ x belongs to Gst. Therefore X ≤st Y. In order to prove the “only if” part, suppose that X ≤st Y. Let φ ∈ Gst and define ψ(x, y) = φ(x, −y). Then ψ is increasing on R². Since X and Y are independent it follows that X and −Y are independent and also that −X and Y are independent. Since X ≤st Y it follows (for example, from Theorem 1.A.1) that −Y ≤st −X. Therefore, by Theorem 1.A.3(b), we have ψ(X, −Y) ≤st ψ(Y, −X), that is, φ(X, Y) ≤st φ(Y, X).

The next result is a similar characterization. In order to state it we need the following notation: Let φ1 and φ2 be two bivariate functions. Denote Δφ21(x, y) = φ2(x, y) − φ1(x, y). The proof of the following theorem is omitted.

Theorem 1.A.10. Let X and Y be two independent random variables. Then X ≤st Y if, and only if, Eφ1(X, Y) ≤ Eφ2(X, Y) for all φ1 and φ2 which satisfy that, for each y, Δφ21(x, y) decreases in x on {x ≤ y}; for each x, Δφ21(x, y) increases in y on {y ≥ x}; and Δφ21(x, y) ≥ −Δφ21(y, x) whenever x ≤ y.

Another similar characterization is given in Theorem 4.A.36.

Let X and Y be two random variables with distribution functions F and G, respectively. Let M(F, G) denote the Fréchet class of bivariate distributions with fixed marginals F and G. Abusing notation we write (X̂, Ŷ) ∈ M(F, G) to mean that the jointly distributed random variables X̂ and Ŷ have the marginal distribution functions F and G, respectively. The Fortet-Mourier-Wasserstein distance between the finite mean random variables X and Y is defined by

$$d(X, Y) = \inf_{(\hat X, \hat Y) \in \mathcal{M}(F, G)} E|\hat Y - \hat X|. \tag{1.A.17}$$

Theorem 1.A.11. Let X and Y be two finite mean random variables such that EX ≤ EY. Then X ≤st Y if, and only if, d(X, Y) = EY − EX.

Proof. Suppose that d(X, Y) = EY − EX. The infimum in (1.A.17) is attained for some (X̂, Ŷ), and we have E|Ŷ − X̂| = E(Ŷ − X̂). Therefore P{X̂ ≤ Ŷ} = 1, and from Theorem 1.A.1 it follows that X ≤st Y.

Conversely, suppose that X ≤st Y. Let X̂ and Ŷ be as in Theorem 1.A.1. Then, for any (X′, Y′) ∈ M(F, G) we have that E|Y′ − X′| ≥ |EY′ − EX′| = EŶ − EX̂. Therefore d(X, Y) = EY − EX.
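Theorem 1.A.11 can be illustrated on two equally weighted samples. The sketch below is an illustration only: the sample values and helper names are hypothetical, and the search is restricted to couplings that pair each x value with one y value (permutation couplings), which is an assumption of the sketch rather than the full Fréchet class M(F, G). Pairing the sorted samples plays the role of the coupling (X̂, Ŷ) of Theorem 1.A.1.

```python
from fractions import Fraction as Fr
from itertools import permutations

def mean(values):
    """Exact mean of a list of Fractions."""
    return sum(values, Fr(0)) / len(values)

def coupling_cost(xs, ys):
    """E|Y^ - X^| when the pairs (xs[i], ys[i]) are equally likely."""
    return mean([abs(y - x) for x, y in zip(xs, ys)])

def best_permutation_cost(xs, ys):
    """Smallest E|Y^ - X^| over all one-to-one pairings of xs with ys."""
    return min(coupling_cost(xs, perm) for perm in permutations(ys))

# Ordered case: the sorted samples satisfy x_(i) <= y_(i) for every i.
xs = [Fr(1), Fr(2), Fr(4)]
ys = [Fr(2), Fr(3), Fr(7)]
quantile_cost = coupling_cost(sorted(xs), sorted(ys))
best_cost = best_permutation_cost(xs, ys)

# Unordered case with EX <= EY: here the distance exceeds EY - EX.
xs2 = [Fr(0), Fr(5)]
ys2 = [Fr(2), Fr(4)]
best_cost2 = best_permutation_cost(xs2, ys2)

print(quantile_cost, mean(ys) - mean(xs))
print(best_cost2, mean(ys2) - mean(xs2))
```

In the ordered case the sorted pairing attains EY − EX; in the second case no pairing does, in line with the "only if" direction.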


A simple sufficient condition which implies the usual stochastic order is described next. The following notation will be used. Let a(x) be defined on I, where I is a subset of the real line. The number of sign changes of a in I is defined by

$$S^-(a) = \sup S^-[a(x_1), a(x_2), \ldots, a(x_m)], \tag{1.A.18}$$

where S⁻(y1, y2, ..., ym) is the number of sign changes of the indicated sequence, zero terms being discarded, and the supremum in (1.A.18) is extended over all sets x1 < x2 < ··· < xm such that xi ∈ I and m < ∞. The proof of the next theorem is simple and therefore it is omitted.

Theorem 1.A.12. Let X and Y be two random variables with (discrete or continuous) density functions f and g, respectively. If

$$S^-(g - f) = 1 \quad \text{and the sign sequence is } -, +,$$

then X ≤st Y.

Let X1 be a nonnegative random variable with distribution function F1 and survival function $\bar F_1 \equiv 1 - F_1$. Define the Laplace transform of X1 by

$$\varphi_{X_1}(\lambda) = \int_0^\infty e^{-\lambda x}\, dF_1(x), \quad \lambda > 0,$$

and denote

$$a_\lambda^{X_1}(n) = \frac{(-1)^n}{n!} \frac{d^n}{d\lambda^n} \left[ \frac{1 - \varphi_{X_1}(\lambda)}{\lambda} \right], \quad n \ge 0,\ \lambda > 0,$$

and

$$\alpha_\lambda^{X_1}(n) = \lambda^n a_\lambda^{X_1}(n - 1), \quad n \ge 1,\ \lambda > 0.$$

Similarly, for a nonnegative random variable X2 with distribution function F2 and survival function $\bar F_2 \equiv 1 - F_2$, define $\alpha_\lambda^{X_2}(n)$. It can be shown that $\alpha_\lambda^{X_1}$ and $\alpha_\lambda^{X_2}$ are discrete survival functions (see the proof of the next theorem); denote the corresponding discrete random variables by Nλ(X1) and Nλ(X2). The following result gives a Laplace transform characterization of the order ≤st.

Theorem 1.A.13. Let X1 and X2 be two nonnegative random variables, and let Nλ(X1) and Nλ(X2) be as described above. Then

$$X_1 \le_{\mathrm{st}} X_2 \iff N_\lambda(X_1) \le_{\mathrm{st}} N_\lambda(X_2) \quad \text{for all } \lambda > 0.$$

Proof. First suppose that X1 ≤st X2. Select a λ > 0. Let Z1, Z2, ..., be independent exponential random variables with mean 1/λ. It can be shown that

$$\alpha_\lambda^{X_1}(n) = P\Bigl\{ \sum_{i=1}^{n} Z_i \le X_1 \Bigr\} \quad \text{and that} \quad \alpha_\lambda^{X_2}(n) = P\Bigl\{ \sum_{i=1}^{n} Z_i \le X_2 \Bigr\}.$$

It thus follows that Nλ(X1) ≤st Nλ(X2).

Now suppose that Nλ(X1) ≤st Nλ(X2) for all λ > 0. Select an x > 0. Thus $\alpha_{n/x}^{X_1}(n) \le \alpha_{n/x}^{X_2}(n)$. Letting n → ∞, one obtains $\bar F_1(x) \le \bar F_2(x)$ for all continuity points x of F1 and F2. Therefore, X1 ≤st X2 by (1.A.1).
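As a quick sanity check of the definitions used in Theorem 1.A.13, consider the special case in which X1 is exponential with rate μ > 0. This worked case is added for illustration only, and μ is a symbol introduced here, not one used in the original text.

```latex
\varphi_{X_1}(\lambda) = \frac{\mu}{\mu + \lambda},
\qquad
\frac{1 - \varphi_{X_1}(\lambda)}{\lambda} = \frac{1}{\mu + \lambda},
\\[4pt]
a_\lambda^{X_1}(n)
  = \frac{(-1)^n}{n!} \cdot \frac{(-1)^n\, n!}{(\mu + \lambda)^{n+1}}
  = \frac{1}{(\mu + \lambda)^{n+1}},
\qquad
\alpha_\lambda^{X_1}(n)
  = \lambda^n a_\lambda^{X_1}(n - 1)
  = \Bigl( \frac{\lambda}{\lambda + \mu} \Bigr)^{n}.
```

The last expression agrees with $P\{\sum_{i=1}^{n} Z_i \le X_1\} = E\bigl[e^{-\mu(Z_1 + \cdots + Z_n)}\bigr]$ for independent exponential Zi with mean 1/λ. It is decreasing in μ for every n and λ, so a smaller rate μ (a stochastically larger X1) gives a pointwise larger α, consistent with the implication ⟹ of the theorem.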


The implication ⟹ in Theorem 1.A.13 can be generalized as follows. A family of random variables {Z(θ), θ ∈ Θ} (Θ is a subset of the real line) is said to be stochastically increasing in the usual stochastic order (denoted by {Z(θ), θ ∈ Θ} ∈ SI) if Z(θ) ≤st Z(θ′) whenever θ ≤ θ′. Recall from Theorem 1.A.3(a) that if X1 ≤st X2, then g(X1) ≤st g(X2) for any increasing function g. The following result gives a stochastic generalization of this fact.

Theorem 1.A.14. If {Z(θ), θ ∈ Θ} ∈ SI and if X1 ≤st X2, where Xk and Z(θ) are independent for k = 1, 2 and θ ∈ Θ, then Z(X1) ≤st Z(X2).

Note that Theorem 1.A.14 is a restatement of Theorem 1.A.6.

Let X be a random variable and denote by X(−∞,a] the truncation of X at a, that is, X(−∞,a] has as its distribution the conditional distribution of X given that X ≤ a. X(a,∞) is similarly defined. It is simple to prove the following result. Results that are stronger than this are contained in Theorems 1.B.20, 1.B.55, and 1.C.27.

Theorem 1.A.15. Let X be any random variable. Then X(−∞,a] and X(a,∞) are increasing in a in the sense of the usual stochastic order.

An interesting example in which truncated random variables are compared is the following.

Example 1.A.16. Let X^(1), X^(2), ..., X^(n) be independent and identically distributed random variables. For a fixed t, let $X^{(1)}_{(t,\infty)}, X^{(2)}_{(t,\infty)}, \ldots, X^{(n)}_{(t,\infty)}$ be the corresponding truncations, and assume that they are also independent and identically distributed. Then

$$\bigl[\max\{X^{(1)}, X^{(2)}, \ldots, X^{(n)}\}\bigr]_{(t,\infty)} \le_{\mathrm{st}} \max\bigl\{X^{(1)}_{(t,\infty)}, X^{(2)}_{(t,\infty)}, \ldots, X^{(n)}_{(t,\infty)}\bigr\},$$

where $[\max\{X^{(1)}, X^{(2)}, \ldots, X^{(n)}\}]_{(t,\infty)}$ denotes the corresponding truncation of max{X^(1), X^(2), ..., X^(n)}. The proof consists of a straightforward verification of (1.A.2) for the compared random variables.

Let φ1 and φ2 be two functions that satisfy φ1(x) ≤ φ2(x) for all x ∈ R, and let X be a random variable. Then, clearly, φ1(X) ≤ φ2(X) almost surely. From Theorem 1.A.1 we thus obtain the following result.

Theorem 1.A.17. Let X be a random variable and let φ1 and φ2 be two functions that satisfy φ1(x) ≤ φ2(x) for all x ∈ R. Then φ1(X) ≤st φ2(X). In particular, if φ is a function that satisfies x ≤ [≥] φ(x) for all x ∈ R, then X ≤st [≥st] φ(X).

Remark 1.A.18. The set of all distribution functions on R is a lattice with respect to the order ≤st. That is, if X and Y are random variables with distributions F and G, then there exist random variables Z and W such that Z ≤st X, Z ≤st Y, W ≥st X, and W ≥st Y. Explicitly, Z has the survival function $\min\{\bar F, \bar G\}$ and W has the survival function $\max\{\bar F, \bar G\}$.


The next four theorems give conditions under which the corresponding spacings are ordered according to the usual stochastic order. Let X1, X2, ..., Xm be any random variables with the corresponding order statistics X(1) ≤ X(2) ≤ ··· ≤ X(m). Define the corresponding spacings by U(i) = X(i) − X(i−1), i = 2, 3, ..., m. When the dependence on m is to be emphasized, we will denote the spacings by U(i:m).

Theorem 1.A.19. Let X1, X2, ..., Xm, Xm+1 be independent and identically distributed IFR (DFR) random variables. Then

$$(m - i + 1)\, U_{(i:m)} \ge_{\mathrm{st}} [\le_{\mathrm{st}}]\; (m - i)\, U_{(i+1:m)}, \quad i = 2, 3, \ldots, m - 1,$$

and

$$(m - i + 2)\, U_{(i:m+1)} \ge_{\mathrm{st}} [\le_{\mathrm{st}}]\; (m - i + 1)\, U_{(i:m)}, \quad i = 2, 3, \ldots, m.$$

The proof of Theorem 1.A.19 is not given here. A stronger version of the DFR part of Theorem 1.A.19 is given in Theorem 1.B.31. Some of the conclusions of Theorem 1.A.19 can be obtained under different conditions. These are stated in the next two theorems. Again, the proofs are not given. In the next two theorems we take X(0) ≡ 0, and thus U(1) = X(1). For the following theorem recall from page 1 the definition of Schur concavity.

Theorem 1.A.20. Let X1, X2, ..., Xm be nonnegative random variables with an absolutely continuous joint distribution function. If the joint density function of X1, X2, ..., Xm is Schur concave (Schur convex), then

$$(m - i + 1)\, U_{(i:m)} \ge_{\mathrm{st}} [\le_{\mathrm{st}}]\; (m - i)\, U_{(i+1:m)}, \quad i = 1, 2, \ldots, m - 1.$$

Theorem 1.A.21. Let X1, X2, ..., Xm be independent exponential random variables with possibly different parameters. Then

$$(m - i + 1)\, U_{(i:m)} \le_{\mathrm{st}} (m - i)\, U_{(i+1:m)}, \quad i = 1, 2, \ldots, m - 1.$$

Theorem 1.A.22. Let X1, X2, ..., Xm be independent and identically distributed random variables with a finite support, and with an increasing [decreasing] density function over that support. Then

$$U_{(i:m)} \ge_{\mathrm{st}} [\le_{\mathrm{st}}]\; U_{(i+1:m)}, \quad i = 2, 3, \ldots, m - 1.$$

The proof of Theorem 1.A.22 uses the likelihood ratio order, and therefore it is deferred to Section 1.C, Remark 1.C.3.

Note that any absolutely continuous DFR random variable has a decreasing density function. Thus we see that the assumption in the DFR part of Theorem 1.A.19 is stronger than the assumption in the decreasing part of Theorem 1.A.22, but the conclusion in the DFR part of Theorem 1.A.19 is stronger than the conclusion in the decreasing part of Theorem 1.A.22. It is of interest to compare Theorems 1.A.19–1.A.22 with Theorems 1.B.31 and 1.C.42.

From Theorem 1.A.1 it is obvious that if X(1) ≤ X(2) ≤ ··· ≤ X(m) are the order statistics corresponding to the random variables X1, X2, ..., Xm, then X(1) ≤st X(2) ≤st ··· ≤st X(m). Now let X(1) ≤ X(2) ≤ ··· ≤ X(m) be the order statistics corresponding to the random variables X1, X2, ..., Xm, and let Y(1) ≤ Y(2) ≤ ··· ≤ Y(m) be the order statistics corresponding to the random variables Y1, Y2, ..., Ym. As usual, for any distribution function F, we let $\bar F \equiv 1 - F$ denote the corresponding survival function.

Theorem 1.A.23. (a) Let X1, X2, ..., Xm be independent random variables with distribution functions F1, F2, ..., Fm, respectively. Let Y1, Y2, ..., Ym be independent and identically distributed random variables with a common distribution function G. Then X(i) ≤st Y(i) for all i = 1, 2, ..., m if, and only if,

$$\prod_{j=1}^{m} F_j(x) \ge G^m(x) \quad \text{for all } x;$$

that is, if, and only if, X(m) ≤st Y(m).
(b) Let X1, X2, ..., Xm be independent random variables with survival functions $\bar F_1, \bar F_2, \ldots, \bar F_m$, respectively. Let Y1, Y2, ..., Ym be independent and identically distributed random variables with a common survival function $\bar G$. Then X(i) ≥st Y(i) for all i = 1, 2, ..., m if, and only if,

$$\prod_{j=1}^{m} \bar F_j(x) \ge \bar G^m(x) \quad \text{for all } x;$$

that is, if, and only if, X(1) ≥st Y(1).

The proof of Theorem 1.A.23 is not given here. More comparisons of order statistics in the usual stochastic order can be found in Theorem 6.B.23 and in Corollary 6.B.24.

The following neat example compares a sum of independent heterogeneous exponential random variables with an Erlang random variable; it is of interest to compare it with Examples 1.B.5 and 1.C.49. We do not give the proof here.

Example 1.A.24. Let Xi be an exponential random variable with mean $\lambda_i^{-1} > 0$, i = 1, 2, ..., m, and assume that the Xi's are independent. Let Yi, i = 1, 2, ..., m, be independent, identically distributed, exponential random variables with mean $\eta^{-1}$. Then

$$\sum_{i=1}^{m} X_i \ge_{\mathrm{st}} \sum_{i=1}^{m} Y_i \iff \sqrt[m]{\lambda_1 \lambda_2 \cdots \lambda_m} \le \eta.$$
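The condition in Example 1.A.24 can be probed numerically for m = 2, where both survival functions have closed forms. The following is a rough sketch under stated assumptions: the rates 1 and 4, the grid of time points and the function names are hypothetical choices, and checking the inequality on a finite grid is only an illustration, not a verification of the equivalence.

```python
import math

def hypoexp_survival(t, lam1, lam2):
    """P{X1 + X2 > t} for independent exponentials with distinct rates."""
    return (lam2 * math.exp(-lam1 * t) - lam1 * math.exp(-lam2 * t)) / (lam2 - lam1)

def erlang2_survival(t, eta):
    """P{Y1 + Y2 > t} for i.i.d. exponentials with common rate eta."""
    return math.exp(-eta * t) * (1.0 + eta * t)

def dominates_on_grid(lam1, lam2, eta, grid):
    """True if P{X1 + X2 > t} >= P{Y1 + Y2 > t} at every grid point."""
    return all(
        hypoexp_survival(t, lam1, lam2) >= erlang2_survival(t, eta) for t in grid
    )

lam1, lam2 = 1.0, 4.0
eta_star = math.sqrt(lam1 * lam2)          # geometric mean of the two rates
grid = [0.01 * k for k in range(1, 2001)]  # t = 0.01, 0.02, ..., 20.0

at_threshold = dominates_on_grid(lam1, lam2, eta_star, grid)
below_threshold = dominates_on_grid(lam1, lam2, 1.9, grid)
print(eta_star, at_threshold, below_threshold)
```

With η equal to the geometric mean the comparison holds at every grid point, while for η = 1.9 (below the geometric mean) it already fails for small t.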

The next example may be compared with Examples 1.B.6, 1.C.51, and 4.A.45.


Example 1.A.25. Let Xi be a binomial random variable with parameters ni and pi, i = 1, 2, ..., m, and assume that the Xi's are independent. Let Y be a binomial random variable with parameters n and p where $n = \sum_{i=1}^{m} n_i$. Then

$$\sum_{i=1}^{m} X_i \ge_{\mathrm{st}} Y \iff p \le \sqrt[n]{p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}},$$

and

$$\sum_{i=1}^{m} X_i \le_{\mathrm{st}} Y \iff 1 - p \le \sqrt[n]{(1 - p_1)^{n_1} (1 - p_2)^{n_2} \cdots (1 - p_m)^{n_m}}.$$
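For two Bernoulli variables the first equivalence in Example 1.A.25 can be checked with exact arithmetic. This is a small sketch with hypothetical parameter values (p1 = 1/2, p2 = 1/8, so that the bound on p is 1/4) and hypothetical helper names; it examines two values of p only.

```python
from fractions import Fraction as Fr
from math import comb

def binom_pmf(n, p):
    """pmf of a Binomial(n, p) variable on {0, ..., n}."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """pmf of the sum of two independent variables with pmfs a and b."""
    out = [Fr(0)] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

def st_leq(pmf_x, pmf_y):
    """Check P{X >= k} <= P{Y >= k} for all k (pmfs of equal length)."""
    tail_x = tail_y = Fr(0)
    for px, py in zip(reversed(pmf_x), reversed(pmf_y)):
        tail_x += px
        tail_y += py
        if tail_x > tail_y:
            return False
    return True

p1, p2 = Fr(1, 2), Fr(1, 8)
sum_pmf = convolve(binom_pmf(1, p1), binom_pmf(1, p2))  # pmf of X1 + X2

# (p1 * p2) ** (1/2) = 1/4, so p = 1/4 meets the bound and p = 3/10 does not.
holds_at_bound = st_leq(binom_pmf(2, Fr(1, 4)), sum_pmf)
holds_above_bound = st_leq(binom_pmf(2, Fr(3, 10)), sum_pmf)
print(holds_at_bound, holds_above_bound)
```

The failure for p = 3/10 comes from the top atom: P{Y = 2} = 9/100 exceeds P{X1 + X2 = 2} = 1/16.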

The following example gives necessary and sufficient conditions for the comparison of normal random variables; it is generalized in Example 6.B.29. See related results in Examples 3.A.51 and 4.A.46.

Example 1.A.26. Let X be a normal random variable with mean μX and variance σX², and let Y be a normal random variable with mean μY and variance σY². Then X ≤st Y if, and only if, μX ≤ μY and σX² = σY².

Example 1.A.27. Let the random variable X have a unimodal density, symmetric about 0. Then

$$(X + a)^2 \le_{\mathrm{st}} (X + b)^2 \quad \text{whenever } |a| \le |b|.$$

Example 1.A.28. Let X have a multivariate normal density with mean vector 0 and variance-covariance matrix Σ1. Let Y have a multivariate normal density with mean vector 0 and variance-covariance matrix Σ1 + Σ2, where Σ2 is a nonnegative definite matrix. Then

$$\|X\|^2 \le_{\mathrm{st}} \|Y\|^2,$$

where ‖·‖ denotes the Euclidean norm.

The next result involves the total time on test (TTT) transform and the observed TTT random variable. Let F be the distribution function of a nonnegative random variable, and suppose, for simplicity, that 0 is the left endpoint of the support of F. The TTT transform associated with F is defined by

$$H_F^{-1}(u) = \int_0^{F^{-1}(u)} \bar F(x)\, dx, \quad 0 \le u \le 1, \tag{1.A.19}$$

where $\bar F \equiv 1 - F$ is the survival function associated with F. The inverse, HF, of the TTT transform is a distribution function. If the mean $\mu = \int_0^\infty x\, dF(x) = \int_0^\infty \bar F(x)\, dx$ is finite, then HF has support in [0, μ]. If X has the distribution function F, then let Xttt be any random variable that has the distribution HF. The random variable Xttt is called the observed total time on test.

Theorem 1.A.29. Let X and Y be two nonnegative random variables. Then

$$X \le_{\mathrm{st}} Y \implies X_{\mathrm{ttt}} \le_{\mathrm{st}} Y_{\mathrm{ttt}}.$$

See related results in Theorems 3.B.1, 4.A.44, 4.B.8, 4.B.9, and 4.B.29.
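To make the TTT transform (1.A.19) concrete, here is the computation for an exponential distribution with rate λ > 0. This worked case is added for illustration only; the choice of distribution is an assumption of the example.

```latex
F(x) = 1 - e^{-\lambda x},
\qquad
F^{-1}(u) = -\frac{\ln(1 - u)}{\lambda},
\\[4pt]
H_F^{-1}(u)
  = \int_0^{F^{-1}(u)} e^{-\lambda x}\, dx
  = \frac{1 - e^{-\lambda F^{-1}(u)}}{\lambda}
  = \frac{u}{\lambda},
\qquad 0 \le u \le 1,
\\[4pt]
H_F(t) = \lambda t, \qquad 0 \le t \le \mu = \frac{1}{\lambda}.
```

So for an exponential X the observed total time on test Xttt is uniformly distributed on (0, μ), and HF indeed has support in [0, μ].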


1.A.5 Some properties in reliability theory

Recall from page 1 the definitions of the IFR, DFR, NBU, and NWU properties. The next result characterizes random variables that have these properties by means of the usual stochastic order. The statements in the next theorem follow at once from the definitions. Recall from Section 1.A.3 that for any random variable Z and an event A we denote by [Z | A] any random variable that has as its distribution the conditional distribution of Z given A.

Theorem 1.A.30. (a) The random variable X is IFR [DFR] if, and only if, [X − t | X > t] ≥st [≤st] [X − t′ | X > t′] whenever t ≤ t′.
(b) The nonnegative random variable X is NBU [NWU] if, and only if, X ≥st [≤st] [X − t | X > t] for all t > 0.

Note that if X is the lifetime of a device, then [X − t | X > t] is the residual life of such a device with age t. Theorem 1.A.30(a), for example, characterizes IFR and DFR random variables by the monotonicity of their residual lives with respect to the order ≤st. Theorem 1.A.30 should be compared to Theorem 1.B.38, where a similar characterization is given. Some multivariate analogs of Theorem 1.A.30(a) are used in Section 6.B.6 to introduce some multivariate IFR notions.

For a nonnegative random variable X with a finite mean, let AX denote the corresponding asymptotic equilibrium age. That is, if the distribution function of X is F, then the distribution function Fe of AX is defined by

$$F_e(x) = \frac{1}{EX} \int_0^x \bar F(y)\, dy, \quad x \ge 0, \tag{1.A.20}$$

where $\bar F \equiv 1 - F$ is the corresponding survival function. Recall from page 1 the definitions of the NBUE and the NWUE properties. The following result is immediate.

Theorem 1.A.31. The nonnegative random variable X with finite mean is NBUE [NWUE] if, and only if, X ≥st [≤st] AX.

Another characterization of NBUE random variables is the following. Recall from Section 1.A.4 the definition of the observed total time on test random variable Xttt.

Theorem 1.A.32. Let X be a nonnegative random variable with finite mean μ. Then X is NBUE if, and only if, Xttt ≥st U(0, μ), where U(0, μ) denotes a uniform random variable on (0, μ).

Let X be a nonnegative random variable with finite mean and distribution function F, and let AX be the corresponding asymptotic equilibrium age having the distribution function Fe given in (1.A.20). The requirement

$$X \ge_{\mathrm{st}} [A_X - t \mid A_X > t] \quad \text{for all } t \ge 0, \tag{1.A.21}$$

has been used in the literature as a way to deﬁne an aging property of the lifetime X. It turns out that this aging property is equivalent to the new better than used in convex ordering (NBUC) notion that is deﬁned in (4.A.31) in Chapter 4.

1.B The Hazard Rate Order

1.B.1 Definition and equivalent conditions

If X is a random variable with an absolutely continuous distribution function F, then the hazard rate of X at t is defined as r(t) = (d/dt)(− log(1 − F(t))). The hazard rate can alternatively be expressed as

$$r(t) = \lim_{\Delta t \downarrow 0} \frac{P\{t < X \le t + \Delta t \mid X > t\}}{\Delta t} = \frac{f(t)}{\bar F(t)}, \quad t \in \mathbb{R}, \tag{1.B.1}$$

where $\bar F \equiv 1 - F$ is the survival function and f is the corresponding density function. As can be seen from (1.B.1), the hazard rate r(t) can be thought of as the intensity of failure of a device, with a random lifetime X, at time t. Clearly, the higher the hazard rate is the smaller X should be stochastically. This is the motivation for the order discussed in this section.

Let X and Y be two nonnegative random variables with absolutely continuous distribution functions and with hazard rate functions r and q, respectively, such that

$$r(t) \ge q(t), \quad t \in \mathbb{R}. \tag{1.B.2}$$

Then X is said to be smaller than Y in the hazard rate order (denoted as X ≤hr Y).

Although the hazard rate order is usually applied to random lifetimes (that is, nonnegative random variables), definition (1.B.2) may also be used to compare more general random variables. In fact, even the absolute continuity, which is required in (1.B.2), is not really needed. It is easy to verify that (1.B.2) holds if, and only if,

$$\frac{\bar G(t)}{\bar F(t)} \quad \text{increases in } t \in (-\infty, \max(u_X, u_Y)) \tag{1.B.3}$$

(here a/0 is taken to be equal to ∞ whenever a > 0). Here F denotes the distribution function of X and G denotes the distribution function of Y, and uX and uY denote the corresponding right endpoints of the supports of X and of Y. Equivalently, (1.B.3) can be written as

$$\bar F(x) \bar G(y) \ge \bar F(y) \bar G(x) \quad \text{for all } x \le y. \tag{1.B.4}$$


Thus (1.B.3) or (1.B.4) can be used to define the order X ≤hr Y even if X and/or Y do not have absolutely continuous distributions. A useful further condition, which is equivalent to X ≤hr Y when X and Y have absolutely continuous distributions with densities f and g, respectively, is the following:

$$\frac{f(x)}{\bar F(y)} \ge \frac{g(x)}{\bar G(y)} \quad \text{for all } x \le y. \tag{1.B.5}$$

Rewriting (1.B.4) as

$$\frac{\bar F(t + s)}{\bar F(t)} \le \frac{\bar G(t + s)}{\bar G(t)} \quad \text{for all } s \ge 0 \text{ and all } t,$$

it is seen that X ≤hr Y if, and only if,

$$P\{X - t > s \mid X > t\} \le P\{Y - t > s \mid Y > t\} \quad \text{for all } s \ge 0 \text{ and all } t; \tag{1.B.6}$$

that is, if, and only if, the residual lives of X and Y at time t are ordered in the sense ≤st for all t. Equivalently, (1.B.6) can be written as

$$[X \mid X > t] \le_{\mathrm{st}} [Y \mid Y > t] \quad \text{for all } t. \tag{1.B.7}$$

Substituting $u = \bar F^{-1}(t)$ in (1.B.3) shows that X ≤hr Y if, and only if,

$$\frac{\bar G \bar F^{-1}(u)}{u} \ge \frac{\bar G \bar F^{-1}(v)}{v} \quad \text{for all } 0 < u \le v < 1.$$

Simple manipulations show that the latter condition is equivalent to

$$\frac{1 - F G^{-1}(1 - u)}{u} \le \frac{1 - F G^{-1}(1 - v)}{v} \quad \text{for all } 0 < u \le v < 1. \tag{1.B.8}$$

For discrete random variables that take on values in N the definition of ≤hr can be written in two different ways. Let X and Y be such random variables. We denote X ≤hr Y if

$$\frac{P\{X = n\}}{P\{X \ge n\}} \ge \frac{P\{Y = n\}}{P\{Y \ge n\}}, \quad n \in \mathbb{N}. \tag{1.B.9}$$

Equivalently, X ≤hr Y if

$$\frac{P\{X = n\}}{P\{X > n\}} \ge \frac{P\{Y = n\}}{P\{Y > n\}}, \quad n \in \mathbb{N}.$$

The discrete analog of (1.B.4) is that (1.B.9) holds if, and only if,

$$P\{X \ge n_1\} P\{Y \ge n_2\} \ge P\{X \ge n_2\} P\{Y \ge n_1\} \quad \text{for all } n_1 \le n_2. \tag{1.B.10}$$

In a similar manner (1.B.3) and (1.B.5) can be modiﬁed in the discrete case. Unless stated otherwise, we consider only random variables with absolutely continuous distribution functions in the following sections.
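The discrete definition (1.B.9) is easy to check mechanically. The sketch below is illustrative only: the three probability mass functions and the helper names are hypothetical, and (1.B.9) is tested in cross-multiplied form so that no division by a zero tail probability occurs (an implementation choice of this sketch).

```python
from fractions import Fraction as Fr

def tails(pmf):
    """Return [P{X >= 0}, P{X >= 1}, ...] for a pmf on {0, ..., len(pmf) - 1}."""
    out, acc = [], Fr(0)
    for p in reversed(pmf):
        acc += p
        out.append(acc)
    return out[::-1]

def st_leq(pmf_x, pmf_y):
    """Usual stochastic order: P{X >= n} <= P{Y >= n} for all n."""
    return all(a <= b for a, b in zip(tails(pmf_x), tails(pmf_y)))

def hr_leq(pmf_x, pmf_y):
    """Hazard rate order via (1.B.9), cross-multiplied:
    P{X = n} P{Y >= n} >= P{Y = n} P{X >= n} for all n."""
    tx, ty = tails(pmf_x), tails(pmf_y)
    return all(pmf_x[n] * ty[n] >= pmf_y[n] * tx[n] for n in range(len(pmf_x)))

# Hypothetical pmfs on {0, 1, 2}.
x = [Fr(1, 2), Fr(1, 4), Fr(1, 4)]
y = [Fr(1, 4), Fr(1, 2), Fr(1, 4)]
w = [Fr(1, 4), Fr(1, 4), Fr(1, 2)]

print(st_leq(x, y), hr_leq(x, y))   # ordered in <=st but not in <=hr
print(st_leq(x, w), hr_leq(x, w))   # ordered in both senses
```

The pair (x, y) shows that the hazard rate order is strictly stronger than the usual stochastic order on this example: the discrete hazard of y at n = 1 is 2/3, which exceeds the hazard 1/2 of x there.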


1.B.2 The relation between the hazard rate and the usual stochastic orders

By setting x = −∞ in (1.B.4) (or n1 = −∞ in (1.B.10)), and then using (1.A.1), we obtain the following result.

Theorem 1.B.1. If X and Y are two random variables such that X ≤hr Y, then X ≤st Y.

1.B.3 Closure properties and some characterizations

Let φ be a strictly increasing function with inverse φ⁻¹. If X has the survival function $\bar F$, then φ(X) has the survival function $\bar F \phi^{-1}$. Similarly, if Y has the survival function $\bar G$, then φ(Y) has the survival function $\bar G \phi^{-1}$. If X ≤hr Y, then from (1.B.3) it follows that

$$\frac{\bar G \phi^{-1}(t)}{\bar F \phi^{-1}(t)} \quad \text{increases in } t \text{ over } \{t : \bar G \phi^{-1}(t) > 0\}.$$

We have thus shown an important special case of the next theorem. When φ is just increasing (rather than strictly increasing) the result is still true, but the above simple argument is no longer sufficient for its proof.

Theorem 1.B.2. If X ≤hr Y, and if φ is any increasing function, then φ(X) ≤hr φ(Y).

In general, if X1 ≤hr Y1 and X2 ≤hr Y2, where X1 and X2 are independent random variables and Y1 and Y2 are also independent random variables, then it is not necessarily true that X1 + X2 ≤hr Y1 + Y2. However, if these random variables are IFR, then it is true. This is shown in Theorem 1.B.4, but first we state and prove the following lemma, which is of independent interest.

Lemma 1.B.3. If the random variables X and Y are such that X ≤hr Y and if Z is an IFR random variable independent of X and Y, then

$$X + Z \le_{\mathrm{hr}} Y + Z. \tag{1.B.11}$$

Proof. Denote by fW and $\bar F_W$ the density function and the survival function of any random variable W. Note that, for x ≤ y,

$$\begin{aligned}
&\bar F_{X+Z}(x) \bar F_{Y+Z}(y) - \bar F_{X+Z}(y) \bar F_{Y+Z}(x) \\
&\quad = \int_v \int_{u \ge v} \bigl[ f_X(u) \bar F_Z(x - u) f_Y(v) \bar F_Z(y - v) + f_X(v) \bar F_Z(x - v) f_Y(u) \bar F_Z(y - u) \bigr]\, du\, dv \\
&\qquad - \int_v \int_{u \ge v} \bigl[ f_X(u) \bar F_Z(y - u) f_Y(v) \bar F_Z(x - v) + f_X(v) \bar F_Z(y - v) f_Y(u) \bar F_Z(x - u) \bigr]\, du\, dv \\
&\quad = \int_v \int_{u \ge v} \bigl[ \bar F_X(u) f_Y(v) - f_X(v) \bar F_Y(u) \bigr] \times \bigl[ \bar F_Z(y - v) f_Z(x - u) - f_Z(y - u) \bar F_Z(x - v) \bigr]\, du\, dv,
\end{aligned}$$

where the second equality is obtained by integration by parts with respect to u and by collection of terms. Since X ≤hr Y it follows from (1.B.5) that the expression within the first set of brackets in the last integral is nonpositive. Since Z is IFR it can be verified that the quantity in the second pair of brackets in the last integral is also nonpositive. Therefore, the integral is nonnegative. This proves (1.B.11).

The above proof is very similar to the proof that a convolution of two independent IFR random variables is IFR. In fact, this convolution result can be shown to be a consequence of Lemma 1.B.3; see Corollary 1.B.39 in Section 1.B.5.

Theorem 1.B.4. Let (Xi, Yi), i = 1, 2, ..., m, be independent pairs of random variables such that Xi ≤hr Yi, i = 1, 2, ..., m. If Xi, Yi, i = 1, 2, ..., m, are all IFR, then

$$\sum_{i=1}^{m} X_i \le_{\mathrm{hr}} \sum_{i=1}^{m} Y_i.$$

Proof. Repeated application of (1.B.11), using the closure property of IFR under convolution, yields the desired result.

The following neat example compares a sum of independent heterogeneous exponential random variables with an Erlang random variable; it is of interest to compare it with Examples 1.A.24 and 1.C.49. We do not give the proof here.

Example 1.B.5. Let Xi be an exponential random variable with mean $\lambda_i^{-1} > 0$, i = 1, 2, ..., m, and assume that the Xi's are independent. Let Yi, i = 1, 2, ..., m, be independent, identically distributed, exponential random variables with mean $\eta^{-1}$. Then

$$\sum_{i=1}^{m} X_i \ge_{\mathrm{hr}} \sum_{i=1}^{m} Y_i \iff \sqrt[m]{\lambda_1 \lambda_2 \cdots \lambda_m} \le \eta.$$


The next example may be compared with Examples 1.A.25, 1.C.51, and 4.A.45.

Example 1.B.6. Let Xi be a binomial random variable with parameters ni and pi, i = 1, 2, ..., m, and assume that the Xi's are independent. Let Y be a binomial random variable with parameters n and p where $n = \sum_{i=1}^{m} n_i$. Then

$$\sum_{i=1}^{m} X_i \ge_{\mathrm{hr}} Y \iff p \le \frac{n}{\sum_{i=1}^{m} (n_i / p_i)},$$

and

$$\sum_{i=1}^{m} X_i \le_{\mathrm{hr}} Y \iff 1 - p \le \frac{n}{\sum_{i=1}^{m} (n_i / (1 - p_i))}.$$

A hazard rate order comparison of random sums is given in the following result.

Theorem 1.B.7. Let {Xi, i = 1, 2, ...} be a sequence of nonnegative IFR independent random variables. Let M and N be two discrete positive integer-valued random variables such that M ≤hr N (in the sense of (1.B.9) or (1.B.10)), and assume that M and N are independent of the Xi's. Then

$$\sum_{i=1}^{M} X_i \le_{\mathrm{hr}} \sum_{i=1}^{N} X_i.$$

The hazard rate order (unlike the usual stochastic order; see Theorem 1.A.3(d)) does not have the property of being simply closed under mixtures. However, under quite strong conditions the order ≤hr is closed under mixtures. This is shown in the next theorem.

Theorem 1.B.8. Let X, Y, and Θ be random variables such that [X | Θ = θ] ≤hr [Y | Θ = θ′] for all θ and θ′ in the support of Θ. Then X ≤hr Y.

Proof. Select a θ and a θ′ in the support of Θ. Let $\bar F(\cdot \mid \theta)$, $\bar G(\cdot \mid \theta)$, $\bar F(\cdot \mid \theta')$, and $\bar G(\cdot \mid \theta')$ be the survival functions of [X | Θ = θ], [Y | Θ = θ], [X | Θ = θ′], and [Y | Θ = θ′], respectively. For simplicity assume that these random variables have densities which we denote by f(· | θ), g(· | θ), f(· | θ′), and g(· | θ′), respectively. It is sufficient to show that for α ∈ (0, 1) we have

$$\frac{\alpha f(t \mid \theta) + (1 - \alpha) f(t \mid \theta')}{\alpha \bar F(t \mid \theta) + (1 - \alpha) \bar F(t \mid \theta')} \ge \frac{\alpha g(t \mid \theta) + (1 - \alpha) g(t \mid \theta')}{\alpha \bar G(t \mid \theta) + (1 - \alpha) \bar G(t \mid \theta')} \quad \text{for all } t \ge 0. \tag{1.B.12}$$

This is an inequality of the form

$$\frac{a + b}{c + d} \ge \frac{w + x}{y + z},$$

where all eight variables are nonnegative and by the assumptions of the theorem they satisfy

$$\frac{a}{c} \ge \frac{w}{y}, \quad \frac{a}{c} \ge \frac{x}{z}, \quad \frac{b}{d} \ge \frac{w}{y}, \quad \text{and} \quad \frac{b}{d} \ge \frac{x}{z}.$$

It is easy to verify that the latter four inequalities imply the former one, completing the proof of the theorem.
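One way to carry out the verification left to the reader in the proof of Theorem 1.B.8 is sketched below. It assumes, for simplicity, that the four denominators c, d, y, z are strictly positive; the symbol m is introduced here only for this argument.

```latex
m := \max\Bigl\{ \frac{w}{y},\ \frac{x}{z} \Bigr\}
\quad\Longrightarrow\quad
a \ge m c, \qquad b \ge m d, \qquad w \le m y, \qquad x \le m z,
\\[4pt]
\frac{a + b}{c + d} \ \ge\ \frac{m c + m d}{c + d} \ =\ m
\ =\ \frac{m y + m z}{y + z} \ \ge\ \frac{w + x}{y + z}.
```

The first two bounds use the four assumed inequalities, and the last two follow directly from the definition of m.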

It should be pointed out, however, that mixtures of distributions which are ordered by the hazard rate order are ordered by the usual stochastic order. That is, if X, Y, and Θ are random variables such that [X | Θ = θ] ≤hr [Y | Θ = θ] for all θ in the support of Θ, then X ≤st Y. This follows from a (conditional) application of Theorem 1.B.1, combined with the fact that the usual stochastic order is closed under mixtures (Theorem 1.A.3(d)).

In order to state the next characterization we define the following class of bivariate functions.

$$\mathcal{G}_{\mathrm{hr}} = \bigl\{ \phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) \text{ is increasing in } x, \text{ for each } y, \text{ on } \{x \ge y\}, \text{ and is decreasing in } y, \text{ for each } x, \text{ on } \{y \ge x\} \bigr\}.$$

Theorem 1.B.9. Let X and Y be independent random variables. Then X ≤hr Y if, and only if,

$$\phi(X, Y) \le_{\mathrm{st}} \phi(Y, X) \quad \text{for all } \phi \in \mathcal{G}_{\mathrm{hr}}. \tag{1.B.13}$$

Proof. Suppose that (1.B.13) holds. Select an x and a y such that x ≥ y. Let φ(u, v) = I{u≥x, v≥y}, where IA denotes the indicator function of the set A. It is easy to see that φ(u, v) is increasing in u. In addition, for a fixed u and v such that v ≥ u, we have that φ(u, v) = 1 if u ≥ x and φ(u, v) = 0 if u < x. Therefore, φ ∈ Ghr. Hence,

$$\bar F(y) \bar G(x) = E\phi(Y, X) \ge E\phi(X, Y) = \bar F(x) \bar G(y) \quad \text{whenever } x \ge y,$$

where $\bar F$ and $\bar G$ are the survival functions of X and Y, respectively. Therefore, by (1.B.4), X ≤hr Y.

Conversely, assume that X ≤hr Y. Let ψ : R → R be an increasing function and let φ ∈ Ghr. Denote a(x, y) = ψ(φ(x, y)) − ψ(φ(y, x)). For simplicity assume that a is differentiable and that X and Y have densities that we denote by f and g, respectively (otherwise, approximation arguments can be used). Then

$$\begin{aligned}
E a(X, Y) &= \int_{y=-\infty}^{\infty} \int_{x \ge y} a(x, y) \bigl[ f(x) g(y) - f(y) g(x) \bigr]\, dx\, dy \\
&= \int_{y=-\infty}^{\infty} \int_{x \ge y} \frac{\partial}{\partial x} a(x, y) \bigl[ \bar F(x) g(y) - f(y) \bar G(x) \bigr]\, dx\, dy \le 0,
\end{aligned}$$

where the second equality follows from integration by parts, and the inequality follows from X ≤hr Y, the fact that a(x, y) increases in x for all x ≥ y, and (1.B.5).

The next result is a similar characterization. It uses the notation of Theorem 1.A.10, and their comparison is of interest. The proof of the following theorem is omitted.

Theorem 1.B.10. Let X and Y be two independent random variables. Then X ≤hr Y if, and only if, Eφ1(X, Y) ≤ Eφ2(X, Y) for all φ1 and φ2 such that, for each x, ∆φ21(x, y) increases in y on {y ≥ x}, and such that ∆φ21(x, y) ≥ −∆φ21(y, x) whenever x ≤ y.

A further similar characterization is given in Theorem 4.A.36. The next result describes another characterization of the order ≤hr.

Theorem 1.B.11. Let X and Y be two, absolutely continuous or discrete, independent random variables. Then X ≤hr Y if, and only if,

$$[X \mid \min(X,Y) = z] \le_{\rm hr} [Y \mid \min(X,Y) = z] \quad \text{for all } z. \tag{1.B.14}$$

Also, X ≤hr Y if, and only if,

$$[X \mid \min(X,Y) = z] \le_{\rm st} [Y \mid \min(X,Y) = z] \quad \text{for all } z. \tag{1.B.15}$$

Proof. First suppose that X and Y are absolutely continuous. Denote the survival functions of X and Y by $\bar F$ and $\bar G$, respectively, and denote the corresponding density functions by f and g. Then

$$P[X > x \mid \min(X,Y) = z] = \begin{cases} 1, & \text{if } x < z, \\[4pt] \dfrac{\bar F(x)g(z)}{f(z)\bar G(z) + g(z)\bar F(z)}, & \text{if } x \ge z, \end{cases} \tag{1.B.16}$$

and

$$P[Y > x \mid \min(X,Y) = z] = \begin{cases} 1, & \text{if } x < z, \\[4pt] \dfrac{\bar G(x)f(z)}{f(z)\bar G(z) + g(z)\bar F(z)}, & \text{if } x \ge z. \end{cases} \tag{1.B.17}$$

Therefore

$$\frac{P[Y > x \mid \min(X,Y) = z]}{P[X > x \mid \min(X,Y) = z]} = \begin{cases} 1, & \text{if } x < z, \\[4pt] \dfrac{\bar G(x)}{\bar F(x)} \cdot \dfrac{f(z)}{g(z)}, & \text{if } x \ge z. \end{cases} \tag{1.B.18}$$

If X ≤hr Y, then $\frac{\bar G(z)}{\bar F(z)} \cdot \frac{f(z)}{g(z)} \ge 1$, and $\bar G(x)/\bar F(x)$ is increasing in x. Thus (1.B.18) is increasing in x, and (1.B.14) follows. Obviously (1.B.15) follows from (1.B.14).

Now suppose that (1.B.15) holds. Then from (1.B.16) and (1.B.17) we get that $\bar F(x)g(z) \le \bar G(x)f(z)$ for all x ≥ z. Therefore X ≤hr Y by (1.B.5). The proof when X and Y are discrete is similar.


Some related characterizations are given in the next result.

Theorem 1.B.12. Let X and Y be two independent random variables. The following conditions are equivalent:
(a) X ≤hr Y.
(b) E[α(X)]E[β(Y)] ≤ E[α(Y)]E[β(X)] for all functions α and β for which the expectations exist and such that β is nonnegative and α/β and β are increasing.
(c) For any two increasing functions a and b such that b is nonnegative, if E[a(X)b(X)] = 0, then E[a(Y)b(Y)] ≥ 0.

Proof. Assume (a). Let α and β be as in (b). Define φ1(x, y) = α(x)β(y) and φ2(x, y) = α(y)β(x). Then ∆φ21(x, y) = φ2(x, y) − φ1(x, y) = β(x)β(y) · [α(y)/β(y) − α(x)/β(x)], which is increasing in y. Note that ∆φ21(x, y) + ∆φ21(y, x) = 0. Condition (b) now follows from Theorem 1.B.10.

Assume (b). By taking, for some u ≤ v, $\alpha(x) = I_{(v,\infty)}(x)$ and $\beta(x) = I_{(u,\infty)}(x)$ in (b) one obtains (1.B.4), from which (a) follows.

Assume (b). Let a and b be two increasing functions such that b is nonnegative and such that E[a(X)b(X)] = 0. Define β(x) = b(x) and α(x) = a(x)b(x). Substitution in (b) yields E[a(Y)b(Y)] ≥ 0; that is, (c) holds.

Assume (c). Let α and β be as in (b). Denote c = E[α(X)]/E[β(X)]. Define a(x) = α(x)/β(x) − c and b(x) = β(x). Then E[a(X)b(X)] = 0. So, by (c), E[a(Y)b(Y)] ≥ 0. But the latter reduces to E[α(X)]E[β(Y)] ≤ E[α(Y)]E[β(X)], and this establishes (b).
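Condition (b) of Theorem 1.B.12 can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the two probability mass functions on {0, 1, 2}, the choice α(n) = n² and β(n) = n + 1, and all function names are hypothetical. The hazard rate order is tested in its discrete cross-product form, P{X ≥ a}P{Y ≥ b} ≥ P{X ≥ b}P{Y ≥ a} for a ≤ b.

```python
def tails(pmf):
    """Return the list [P{X >= n}] for a pmf given on {0, 1, ..., K}."""
    return [sum(pmf[n:]) for n in range(len(pmf))]


def hr_leq(pmf_x, pmf_y, tol=1e-12):
    """Check X <=hr Y through the cross-product condition on the tails."""
    tx, ty = tails(pmf_x), tails(pmf_y)
    k = len(tx)
    return all(tx[a] * ty[b] + tol >= tx[b] * ty[a]
               for a in range(k) for b in range(a, k))


def expect(pmf, fn):
    """E[fn(X)] for an integer-valued X with the given pmf."""
    return sum(p * fn(n) for n, p in enumerate(pmf))


def condition_b(pmf_x, pmf_y, alpha, beta, tol=1e-12):
    """Check E[alpha(X)]E[beta(Y)] <= E[alpha(Y)]E[beta(X)]."""
    lhs = expect(pmf_x, alpha) * expect(pmf_y, beta)
    rhs = expect(pmf_y, alpha) * expect(pmf_x, beta)
    return lhs <= rhs + tol


# Hypothetical example data (assumed, not from the text).
PMF_X = [0.5, 0.3, 0.2]
PMF_Y = [0.2, 0.3, 0.5]


def alpha(n):
    return n * n      # alpha / beta = n^2 / (n + 1) is increasing


def beta(n):
    return n + 1      # nonnegative and increasing
```

With these inputs `hr_leq(PMF_X, PMF_Y)` holds, and so does `condition_b(PMF_X, PMF_Y, alpha, beta)`; swapping the roles of the two distributions makes the inequality in (b) fail.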

Example 1.B.13. Let {N(t), t ≥ 0} be a nonhomogeneous Poisson process with mean function Λ (that is, Λ(t) ≡ E[N(t)], t ≥ 0). Let T1, T2, . . . be the successive epoch times, and let Xn ≡ Tn − Tn−1, n = 1, 2, . . . (where T0 ≡ 0), be the corresponding inter-epoch times. The survival function of Tn is given by

$$P\{T_n > t\} = \sum_{i=0}^{n-1} \frac{(\Lambda(t))^i}{i!}\, e^{-\Lambda(t)}, \quad t \ge 0,\ n = 1, 2, \ldots.$$

It is easy to verify that $P\{T_{n+1} > t\}/P\{T_n > t\}$ is increasing in t ≥ 0, n = 1, 2, . . ., and thus, by (1.B.3),

$$T_n \le_{\rm hr} T_{n+1}, \quad n = 1, 2, \ldots. \tag{1.B.19}$$
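The monotonicity of the ratio of consecutive epoch-time survival functions can be inspected numerically. The following is a minimal sketch, assuming a hypothetical mean function Λ(t) = √t and a finite grid of time points; the function names are illustrative only, and a grid check is of course not a proof.

```python
import math


def epoch_survival(n, mean_value):
    """P{T_n > t} written as a function of Lambda(t) = mean_value."""
    partial_sum = sum(mean_value ** i / math.factorial(i) for i in range(n))
    return math.exp(-mean_value) * partial_sum


def ratio_is_increasing(n, mean_fn, grid, tol=1e-12):
    """Check that P{T_{n+1} > t} / P{T_n > t} is nondecreasing along the grid."""
    ratios = [epoch_survival(n + 1, mean_fn(t)) / epoch_survival(n, mean_fn(t))
              for t in grid]
    return all(a <= b + tol for a, b in zip(ratios, ratios[1:]))


def sqrt_mean(t):
    """Hypothetical concave mean function, assumed for illustration."""
    return math.sqrt(t)


GRID = [0.1 * k for k in range(100)]
```

Calling `ratio_is_increasing(n, sqrt_mean, GRID)` for several values of n gives a quick check of the monotonicity used to obtain (1.B.19).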

A result that is stronger than (1.B.19) is given in Example 1.C.47. If we denote by Fn the distribution function of Tn, then

$$P\{X_{n+1} > t\} = \int_0^\infty P\{T_{n+1} - T_n > t \mid T_n = u\}\,dF_n(u) = \int_0^\infty \exp\{-[\Lambda(t+u) - \Lambda(u)]\}\,dF_n(u)$$

$$\phantom{P\{X_{n+1} > t\}} = E\bigl[\exp\{-[\Lambda(t + T_n) - \Lambda(T_n)]\}\bigr], \quad n = 0, 1, \ldots.$$

Fix t1 ≤ t2 and let α(x) ≡ exp{−[Λ(t2 + x) − Λ(x)]} and β(x) ≡ exp{−[Λ(t1 + x) − Λ(x)]}. Note that if Λ is concave, then α(x)/β(x) is increasing. Thus, by Theorem 1.B.12(b), if Λ is concave, then

$$\frac{P\{X_{n+1} > t_2\}}{P\{X_n > t_2\}} = \frac{E[\alpha(T_n)]}{E[\alpha(T_{n-1})]} \ge \frac{E[\beta(T_n)]}{E[\beta(T_{n-1})]} = \frac{P\{X_{n+1} > t_1\}}{P\{X_n > t_1\}}, \quad n = 1, 2, \ldots.$$

It follows, by (1.B.3), that

$$X_n \le_{\rm hr} X_{n+1}, \quad n = 1, 2, \ldots.$$

It can be shown in a similar manner that if Λ is convex, then Xn ≥hr Xn+1, n = 1, 2, . . ..

As another example of the use of Theorem 1.B.12 consider an increasing convex function H such that H(0) = 0. Let X and Y be nonnegative random variables such that X ≤hr Y. Then

$$\frac{E[H(X)]}{E[X]} \le \frac{E[H(Y)]}{E[Y]}.$$

Rather than using Theorem 1.B.12, one can also obtain the above inequality from (2.B.5) in Chapter 2, and from the fact that the hazard rate order implies the hmrl order (which is discussed there).

Other characterizations of the order ≤hr can be found in Theorems 2.A.6 and 5.A.22.

Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a subset of the real line. As in Section 1.A.3 let X(θ) denote a random variable with distribution function Gθ. For any random variable Θ with support in $\mathcal{X}$, and with distribution function F, let us denote by X(Θ) a random variable with distribution function H given by

$$H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.$$

The following result generalizes both Theorems 1.B.2 and 1.B.8, just as Theorem 1.A.6 generalized both parts (a) and (c) of Theorem 1.A.3.

Theorem 1.B.14. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as above. Let Θ1 and Θ2 be two random variables with supports in $\mathcal{X}$ and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random variables such that Yi =st X(Θi), i = 1, 2, that is, suppose that the survival function of Yi is given by

$$\bar H_i(y) = \int_{\mathcal{X}} \bar G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

If

$$X(\theta) \le_{\rm hr} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{1.B.20}$$

and if

$$\Theta_1 \le_{\rm hr} \Theta_2, \tag{1.B.21}$$

then

$$Y_1 \le_{\rm hr} Y_2. \tag{1.B.22}$$

Proof. Assumption (1.B.20) means that $\bar G_\theta(y)$ is TP2 (totally positive of order 2) as a function of $\theta \in \mathcal{X}$ and of y ∈ R (that is, $\bar G_\theta(y)\bar G_{\theta'}(y') \ge \bar G_\theta(y')\bar G_{\theta'}(y)$ whenever y ≤ y′ and θ ≤ θ′). Assumption (1.B.21) means that $\bar F_i(\theta)$, as a function of i ∈ {1, 2} and of $\theta \in \mathcal{X}$, is TP2. Also, from Theorem 1.B.1 it follows that $\bar G_\theta(y)$ is increasing in θ. Therefore, by Theorem 2.1 of Joag-Dev, Kochar, and Proschan [259], $\bar H_i(y)$ is TP2 in i ∈ {1, 2} and y ∈ R. That gives (1.B.22).

The following example shows an interesting and useful application of Theorem 1.B.14.

Example 1.B.15. Let $\{X_n^i, n \ge 0\}$ be a Markov chain with state space {1, 2, . . . , M} (M can be infinity) and transition matrix P, which starts from state i; that is, $X_0^i = i$. If $X_1^i \le_{\rm hr} X_1^{i'}$ for all i ≤ i′, then
(a) $I_1 \le_{\rm hr} I_2$ implies that $X_n^{I_1} \le_{\rm hr} X_n^{I_2}$ for all n ≥ 0, and
(b) $X_n^1 \le_{\rm hr} X_{n'}^1$ whenever n ≤ n′.

In order to prove (a), first note that the result is trivial for n = 0. Suppose that the result is true for n = k. Define $Y(i) = X_1^i$. By the Markov property, we have $X_{k+1}^i =_{\rm st} Y(X_k^i)$ for all i. By induction, $X_k^{I_1} \le_{\rm hr} X_k^{I_2}$. In particular, $Y(X_k^i) \le_{\rm hr} Y(X_k^{i'})$ for all i ≤ i′. Therefore, from Theorem 1.B.14 we get

$$X_{k+1}^{I_1} = Y(X_k^{I_1}) \le_{\rm hr} Y(X_k^{I_2}) = X_{k+1}^{I_2}.$$

In order to prove (b), note that $X_0^1 = 1 \le_{\rm hr} X_1^1$. So, by (a) and the Markov property we have $X_n^1 \le_{\rm hr} X_n^{X_1^1} =_{\rm st} X_{n+1}^1$.

The following example shows an application of Theorem 1.B.14 in the area of Bayesian imperfect repair.

Example 1.B.16. Let Θ1 and Θ2 be two random variables as in Example 1.A.7. Let Gθ, X(θ), Y1, and Y2 also be as in Example 1.A.7. Note that (1.B.20) holds because $\bar K^{1-\theta'}(y)/\bar K^{1-\theta}(y)$ is increasing in y whenever 0 < θ ≤ θ′ ≤ 1. Thus, if Θ1 ≤hr Θ2, then Y1 ≤hr Y2.

It is of interest to compare Example 1.B.16 to Example 5.B.13 which deals with random minima and maxima.

The next example deals with the same proportional hazard model as in Example 1.B.16; however, for convenience we change the notation.

Example 1.B.17. Let Θ and X be two nonnegative random variables with distribution functions F and G, respectively. Let Y have the survival function $\bar H$ defined as

$$\bar H(y) = \int_0^\infty \bar G^\theta(y)\,dF(\theta), \quad y \ge 0.$$

Suppose that G is absolutely continuous with hazard rate function r. Then H is also absolutely continuous, and we denote its hazard rate function by q. We will now show that if EΘ ≤ 1, then X ≤hr Y . In order to see it,

write $\bar H(y) = M(\log \bar G(y))$, where M is the moment generating function of Θ. Differentiating $-\log \bar H(y)$ we obtain

$$q(y) = -\frac{d}{dy}\log \bar H(y) = r(y)\,\frac{M'(\log \bar G(y))}{M(\log \bar G(y))} = r(y)\,\frac{E\bigl[\Theta e^{\Theta \log \bar G(y)}\bigr]}{E\bigl[e^{\Theta \log \bar G(y)}\bigr]} \le r(y)\,\frac{E\Theta \cdot E\bigl[e^{\Theta \log \bar G(y)}\bigr]}{E\bigl[e^{\Theta \log \bar G(y)}\bigr]} = r(y)E\Theta \le r(y),$$

where the first inequality follows from Chebyshev's Inequality (that is, $\mathrm{Cov}(\Theta, e^{\Theta \log \bar G(y)}) \le 0$), and the second inequality follows from EΘ ≤ 1. The stated result now follows from (1.B.2).

The following result gives a Laplace transform characterization of the order ≤hr. It should be compared with Theorem 1.A.13.

Theorem 1.B.18. Let X1 and X2 be two nonnegative random variables, and let Nλ(X1) and Nλ(X2) be as described in Theorem 1.A.13. Then

$$X_1 \le_{\rm hr} X_2 \iff N_\lambda(X_1) \le_{\rm hr} N_\lambda(X_2) \quad \text{for all } \lambda > 0,$$

where the notation Nλ(X1) ≤hr Nλ(X2) is in the sense of (1.B.9).

Proof. First suppose that X1 ≤hr X2. Denote

$$\Gamma_\lambda(n, x) = \lambda e^{-\lambda x}\,\frac{(\lambda x)^{n-1}}{(n-1)!}, \quad n \ge 1,\ x \ge 0.$$

Let $\alpha_\lambda^{X_1}(n) = P\{N_\lambda(X_1) \ge n\}$ and $\alpha_\lambda^{X_2}(n) = P\{N_\lambda(X_2) \ge n\}$ be as in the proof of Theorem 1.A.13. Then it can be verified that

$$\alpha_\lambda^{X_1}(n) = \int_0^\infty \Gamma_\lambda(n, x)\bar F_1(x)\,dx \quad \text{and} \quad \alpha_\lambda^{X_2}(n) = \int_0^\infty \Gamma_\lambda(n, x)\bar F_2(x)\,dx,$$

where $\bar F_1$ and $\bar F_2$ are the survival functions corresponding to X1 and X2. For n1 ≤ n2, some computation yields

$$\alpha_\lambda^{X_1}(n_1)\alpha_\lambda^{X_2}(n_2) - \alpha_\lambda^{X_1}(n_2)\alpha_\lambda^{X_2}(n_1) = \int_{y=0}^{\infty}\int_{x=0}^{y} \bigl[\Gamma_\lambda(n_1, x)\Gamma_\lambda(n_2, y) - \Gamma_\lambda(n_1, y)\Gamma_\lambda(n_2, x)\bigr] \times \bigl[\bar F_1(x)\bar F_2(y) - \bar F_1(y)\bar F_2(x)\bigr]\,dx\,dy.$$

It is not hard to verify that if x ≤ y and n1 ≤ n2, then $[\Gamma_\lambda(n_1, x)\Gamma_\lambda(n_2, y) - \Gamma_\lambda(n_1, y)\Gamma_\lambda(n_2, x)] \ge 0$. Also, using (1.B.4) it is seen that X1 ≤hr X2 implies $\bar F_1(x)\bar F_2(y) - \bar F_1(y)\bar F_2(x) \ge 0$ for x ≤ y. Thus, from (1.B.10) it is seen that Nλ(X1) ≤hr Nλ(X2).


Now suppose that Nλ(X1) ≤hr Nλ(X2) for every λ > 0. Define $c(n, \lambda) = \alpha_\lambda^{X_1}(n)/\alpha_\lambda^{X_2}(n)$. It can be shown that c(n, λ) increases in λ and decreases in n. Thus, c(n, n/x) ≥ c(n, n/y) whenever x ≤ y. Letting n → ∞ shows that $\bar F_1(x)/\bar F_2(x) \ge \bar F_1(y)/\bar F_2(y)$ for all continuity points x and y of F1 and F2 such that x ≤ y. Thus, from (1.B.3) it is seen that X1 ≤hr X2.

The implication =⇒ in Theorem 1.B.18 can be generalized in the same manner that Theorem 1.A.14 generalizes the implication =⇒ in Theorem 1.A.13. We will not state the result here since it is equivalent to Theorem 1.B.14. A related result is the following.

Theorem 1.B.19. Let X1, X2, . . . , Xm, Θ1, and Θ2 be independent nonnegative random variables. Define

$$N_j(t) = \sum_{i=1}^{m} I_{[\Theta_j X_i]}(t), \quad t \ge 0,\ j = 1, 2,$$

where

$$I_{[\Theta_j X_i]}(t) = \begin{cases} 1 & \text{if } \Theta_j X_i > t, \\ 0 & \text{if } \Theta_j X_i \le t. \end{cases}$$

If Θ1 ≤hr Θ2 then N1(t) ≤hr N2(t) in the sense of (1.B.9) for all t ≥ 0.

The following easy-to-prove result strengthens Theorem 1.A.15. An even stronger result appears in Theorem 1.C.27.

Theorem 1.B.20. Let X be any random variable. Then $X_{(-\infty,a]}$ and $X_{(a,\infty)}$ are increasing in a in the sense of the hazard rate order.

In Theorem 1.A.17 it was seen that if φ is a function which satisfies φ(x) ≤ x for all x ∈ R, then φ(X) ≤st X. The order ≤hr does not satisfy such a general property. However, we have the following easy-to-prove result.

Theorem 1.B.21. Let X be a nonnegative IFR random variable, and let a ≤ 1 be a positive constant. Then aX ≤hr X.

In fact, a necessary and sufficient condition for a nonnegative random variable X, with survival function $\bar F$, to satisfy aX ≤hr X for all 0 < a < 1, is that $\log \bar F(e^x)$ is concave in x ≥ 0.

In the next result it is shown that a random variable whose distribution is the mixture of two distributions of hazard rate ordered random variables is bounded from below and from above, in the hazard rate order sense, by these two random variables.

Theorem 1.B.22. Let X and Y be two random variables with distribution functions F and G, respectively. Let W be a random variable with the distribution function pF + (1 − p)G for some p ∈ (0, 1). If X ≤hr Y, then X ≤hr W ≤hr Y.


Proof. Let uX, uY, and uW denote the right endpoints of the supports of the corresponding random variables, and note that max(uX, uW) = max(uX, uY). Now, if X ≤hr Y, then

$$\frac{p\bar F(t) + (1-p)\bar G(t)}{\bar F(t)} = p + (1-p)\,\frac{\bar G(t)}{\bar F(t)}$$

is increasing in t ∈ (−∞, max(uX, uW)). Therefore, by (1.B.3), X ≤hr W. The proof that W ≤hr Y is similar.
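The sandwich X ≤hr W ≤hr Y of Theorem 1.B.22 can be illustrated on a small discrete example. The sketch below is an assumption-laden illustration: the two probability mass functions on {0, 1, 2}, the mixing weights, and the helper names are hypothetical, and the hazard rate order is checked through the discrete cross-product condition P{X ≥ a}P{Y ≥ b} ≥ P{X ≥ b}P{Y ≥ a} for a ≤ b.

```python
def tails(pmf):
    """Return the list [P{X >= n}] for a pmf given on {0, 1, ..., K}."""
    return [sum(pmf[n:]) for n in range(len(pmf))]


def hr_leq(pmf_x, pmf_y, tol=1e-12):
    """Check X <=hr Y through the cross-product condition on the tails."""
    tx, ty = tails(pmf_x), tails(pmf_y)
    k = len(tx)
    return all(tx[a] * ty[b] + tol >= tx[b] * ty[a]
               for a in range(k) for b in range(a, k))


def mixture(pmf_x, pmf_y, p):
    """pmf of W whose distribution function is p*F + (1 - p)*G."""
    return [p * a + (1.0 - p) * b for a, b in zip(pmf_x, pmf_y)]


def sandwich_holds(pmf_x, pmf_y, p):
    """Check X <=hr W <=hr Y for the mixture W with weight p on X."""
    pmf_w = mixture(pmf_x, pmf_y, p)
    return hr_leq(pmf_x, pmf_w) and hr_leq(pmf_w, pmf_y)


# Hypothetical example data (assumed, not from the text); X <=hr Y here.
PMF_X = [0.5, 0.3, 0.2]
PMF_Y = [0.2, 0.3, 0.5]
```

For instance, `sandwich_holds(PMF_X, PMF_Y, 0.4)` compares X, the 40/60 mixture, and Y in one call.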

Example 1.B.23. For a nonnegative random variable X with density function f, and for a nonnegative function w such that E[w(X)] exists, define $X^w$ as the random variable with the so-called weighted density function $f_w$ given by

$$f_w(x) = \frac{w(x)f(x)}{E[w(X)]}, \quad x \ge 0.$$

Similarly, for another nonnegative random variable Y with density function g, such that E[w(Y)] exists, define $Y^w$ as the random variable with the density function $g_w$ given by

$$g_w(x) = \frac{w(x)g(x)}{E[w(Y)]}, \quad x \ge 0.$$

We will show that if w is increasing, then

$$X \le_{\rm hr} Y \implies X^w \le_{\rm hr} Y^w. \tag{1.B.23}$$

In order to do this, first note that the hazard rate function $r_{X^w}$ of $X^w$ is given by

$$r_{X^w}(x) = \frac{w(x)r_X(x)}{E[w(X) \mid X > x]}, \quad x \ge 0,$$

where $r_X$ is the hazard rate function of X. Similarly, the hazard rate function $r_{Y^w}$ of $Y^w$ is given by

$$r_{Y^w}(x) = \frac{w(x)r_Y(x)}{E[w(Y) \mid Y > x]}, \quad x \ge 0,$$

where $r_Y$ is the hazard rate function of Y. Now, from X ≤hr Y it follows that [X | X > x] ≤hr [Y | Y > x] for all x ≥ 0. Next, using Theorem 1.B.2 and the monotonicity of w, we get that [w(X) | X > x] ≤hr [w(Y) | Y > x], and therefore, by Theorem 1.B.1, E[w(X) | X > x] ≤ E[w(Y) | Y > x]. Combining this inequality with $r_X \ge r_Y$, it is seen that $r_{X^w} \ge r_{Y^w}$.

The above random variables are also studied in Example 1.C.59. In particular, taking w to be the identity function w(x) = x, we see from (1.B.23) that the hazard rate ordering of X and Y implies the hazard rate ordering of the corresponding spread (or length-biased) random variables. See Example 8.B.12 for another result involving spreads.
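For a concrete instance of the length-biased case w(x) = x, take exponential random variables, for which everything is available in closed form. The sketch below is an illustration under that assumption (the exponential choice and the function names are not from the text): the length-biased version of an exponential variable with rate λ has density λ²x e^{−λx} and survival function (1 + λx)e^{−λx}, and the memoryless property gives E[X | X > x] = x + 1/λ.

```python
def length_biased_hazard_direct(rate, x):
    """Hazard rate of the length-biased Exp(rate) law, density over survival.

    density:  rate^2 * x * exp(-rate * x)
    survival: (1 + rate * x) * exp(-rate * x)
    """
    return rate * rate * x / (1.0 + rate * x)


def length_biased_hazard_formula(rate, x):
    """Same hazard via w(x) r_X(x) / E[w(X) | X > x] with w(x) = x."""
    hazard_x = rate                       # constant hazard of Exp(rate)
    conditional_mean = x + 1.0 / rate     # E[X | X > x] by memorylessness
    return x * hazard_x / conditional_mean


def weighted_order_holds(rate_x, rate_y, grid, tol=1e-12):
    """Check r_{X^w}(x) >= r_{Y^w}(x) on the grid (X^w <=hr Y^w)."""
    return all(length_biased_hazard_direct(rate_x, x) + tol
               >= length_biased_hazard_direct(rate_y, x) for x in grid)


GRID = [0.1 * k for k in range(1, 101)]
```

With rate 2 for X and rate 1 for Y, so that X ≤hr Y, `weighted_order_holds(2.0, 1.0, GRID)` checks the conclusion of (1.B.23) pointwise on the grid.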


Analogous to the result in Remark 1.A.18, it can be shown that the set of all distribution functions on R ∪ {∞} is a lattice with respect to the order ≤hr.

The following example may be compared to Examples 1.C.48, 2.A.22, 3.B.38, 4.B.14, 6.B.41, 6.D.8, 6.E.13, and 7.B.13.

Example 1.B.24. Let X and Y be two absolutely continuous nonnegative random variables with survival functions $\bar F$ and $\bar G$, respectively. Denote $\Lambda_1 = -\log \bar F$, $\Lambda_2 = -\log \bar G$, and $\lambda_i = \Lambda_i'$, i = 1, 2. Consider two nonhomogeneous Poisson processes N1 = {N1(t), t ≥ 0} and N2 = {N2(t), t ≥ 0} with mean functions Λ1 and Λ2 (see Example 1.B.13), respectively. Let $T_{i,1}, T_{i,2}, \ldots$ be the successive epoch times of process Ni, i = 1, 2. Note that $X =_{\rm st} T_{1,1}$ and $Y =_{\rm st} T_{2,1}$.

It turns out that the hazard rate ordering of the first two epoch times implies the hazard rate ordering of all the corresponding later epoch times; that is, it will be shown below that if X ≤hr Y, then $T_{1,n} \le_{\rm hr} T_{2,n}$, n ≥ 1.

The survival function $\bar F_{1,n}$ of $T_{1,n}$ is given by

$$\bar F_{1,n}(t) = P(T_{1,n} > t) = \sum_{j=0}^{n-1} \frac{(\Lambda_1(t))^j}{j!}\, e^{-\Lambda_1(t)} = \bar\Gamma_n(\Lambda_1(t)), \quad t \ge 0, \tag{1.B.24}$$

where $\bar\Gamma_n$ is the survival function of the gamma distribution with scale parameter 1 and shape parameter n. The corresponding density function $f_{1,n}$ is given by

$$f_{1,n}(t) = \gamma_n(\Lambda_1(t))\lambda_1(t), \quad t \ge 0,$$

where $\gamma_n$ is the density function associated with $\bar\Gamma_n$. The corresponding hazard rate function $r_{F_{1,n}}$ is given by

$$r_{F_{1,n}}(t) = r_{\Gamma_n}(\Lambda_1(t))\lambda_1(t), \quad t \ge 0,$$

where $r_{\Gamma_n}$ is the hazard rate function associated with $\bar\Gamma_n$. Similarly,

$$r_{F_{2,n}}(t) = r_{\Gamma_n}(\Lambda_2(t))\lambda_2(t), \quad t \ge 0.$$

If X ≤hr Y, then

$$r_{F_{1,n}}(t) = r_{\Gamma_n}(\Lambda_1(t))\lambda_1(t) \ge r_{\Gamma_n}(\Lambda_2(t))\lambda_2(t) = r_{F_{2,n}}(t), \quad t \ge 0,$$

where the inequality follows from λ1(t) ≥ λ2(t), Λ1(t) ≥ Λ2(t), and the fact that the hazard rate function of the gamma distribution described above is increasing.

Now let $X_{i,n} \equiv T_{i,n} - T_{i,n-1}$, n ≥ 1 (where $T_{i,0} \equiv 0$), be the inter-epoch times of the process Ni, i = 1, 2. Again, note that $X =_{\rm st} X_{1,1}$ and $Y =_{\rm st} X_{2,1}$. It turns out that, under some conditions, the hazard rate ordering of the first two inter-epoch times implies the hazard rate ordering of all the corresponding later inter-epoch times. Explicitly, it will be shown below that if X ≤hr Y, and if $\bar F$ and $\bar G$ are logconvex (that is, X and Y are DFR), and if

$$\frac{\lambda_2(t)}{\lambda_1(t)} \ \text{is increasing in } t \ge 0, \tag{1.B.25}$$

then $X_{1,n} \le_{\rm hr} X_{2,n}$ for each n ≥ 1.

For the purpose of this proof let us denote F by F1, and G by F2. Let $\bar G_{i,n}$ denote the survival function of $X_{i,n}$, i = 1, 2. The stated result is obvious for n = 1, so let us fix an n ≥ 2. Then, from (7) in Baxter [62] we obtain

$$\bar G_{i,n}(t) = \int_0^\infty \lambda_i(s)\,\frac{\Lambda_i^{n-2}(s)}{(n-2)!}\,\bar F_i(s+t)\,ds, \quad t \ge 0,\ i \in \{1, 2\}. \tag{1.B.26}$$

Condition (1.B.25) means that λi(t) is TP2 (totally positive of order 2) in (i, t). Condition (1.B.25) also implies that Λ2(t)/Λ1(t) is increasing in t ≥ 0, that is, Λi(t) is TP2 in (i, t). Since a product of TP2 kernels is TP2 we get that

$$\lambda_i(t)\,\frac{\Lambda_i^{n-2}(t)}{(n-2)!} \ \text{is TP}_2 \text{ in } (i, t).$$

The assumption F1 ≤hr F2 implies that $\bar F_i(s+t)$ is TP2 in (i, s) and in (i, t). Finally, the logconvexity of $\bar F_1$ and of $\bar F_2$ means that $\bar F_i(s+t)$ is TP2 in (s, t). Thus, by Theorem 5.1 on page 123 of Karlin [275], we get that $\bar G_{i,n}(t)$ is TP2 in (i, t); that is, $X_{1,n} \le_{\rm hr} X_{2,n}$.

The inequality $X_{1,n} \le_{\rm hr} X_{2,n}$, n ≥ 1, can also be obtained under slightly weaker assumptions, namely, that X ≤hr Y, that (1.B.25) holds, and that either X or Y is DFR; see Hu and Zhuang [245].

Example 1.B.25. Let X1, X2, Y1, and Y2 be independent, nonnegative random variables such that X1 =st X2 and Y1 =st Y2. Denote by λX and λY the hazard rate functions of X1 and Y1, respectively. If X1 ≤hr Y1, and if λY/λX is decreasing on [0, 1), then

min{max(X1, X2), max(Y1, Y2)} ≤hr min{max(X1, Y1), max(X2, Y2)}.


1.B.4 Comparison of order statistics

Let X1, X2, . . . , Xm be random variables. As usual denote the corresponding order statistics by $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$. When we want to emphasize the dependence on m, we denote the order statistics by $X_{(1:m)} \le X_{(2:m)} \le \cdots \le X_{(m:m)}$. The following three theorems compare the order statistics in the hazard rate order.

Theorem 1.B.26. If X1, X2, . . . , Xm are independent random variables, then $X_{(k)} \le_{\rm hr} X_{(k+1)}$ for k = 1, 2, . . . , m − 1.

A relatively simple proof of Theorem 1.B.26 can be obtained using the likelihood ratio order which is discussed in the next section. Therefore the proof of this theorem will be given there in Remark 1.C.40. Theorem 12.5 in Cramer and Kamps [136] extends Theorem 1.B.26 to the so-called sequential order statistics. Further comparisons of order statistics are given in the next two theorems.

Theorem 1.B.27. Let X1, X2, . . . , Xm be independent random variables. If Xj ≤hr Xm for all j = 1, 2, . . . , m − 1, then $X_{(k-1:m-1)} \le_{\rm hr} X_{(k:m)}$ for k = 2, 3, . . . , m.

Theorem 1.B.28. If X1, X2, . . . , Xm are independent random variables, then $X_{(k:m-1)} \ge_{\rm hr} X_{(k:m)}$ for k = 1, 2, . . . , m − 1.

From Theorem 1.B.27 it follows that if X1, X2, . . . , Xm are independent random variables, then

$$X_{(1:1)} \ge_{\rm hr} X_{(1:2)} \ge_{\rm hr} \cdots \ge_{\rm hr} X_{(1:m)}. \tag{1.B.27}$$

One may wonder what kind of results of this type hold without the independence assumption. Since $X_{(1:1)} \ge X_{(1:2)} \ge \cdots \ge X_{(1:m)}$ a.s., it follows from Theorem 1.A.1 that $X_{(1:1)} \ge_{\rm st} X_{(1:2)} \ge_{\rm st} \cdots \ge_{\rm st} X_{(1:m)}$ holds without any (independence) assumption. However, a counterexample in the literature shows that (1.B.27) does not always hold. We now describe some conditions under which (1.B.27) holds.

Let X = (X1, X2, . . . , Xm) be a random vector with a partially differentiable survival function $\bar F$. The function $R = -\log \bar F$ is called the hazard function of X, and the vector $r_X$ of partial derivatives, defined by

$$r_X(x) = \bigl(r_X^{(1)}(x), r_X^{(2)}(x), \ldots, r_X^{(m)}(x)\bigr) = \Bigl(\frac{\partial}{\partial x_1}R(x), \frac{\partial}{\partial x_2}R(x), \ldots, \frac{\partial}{\partial x_m}R(x)\Bigr), \tag{1.B.28}$$

for all $x \in \{x : \bar F(x) > 0\}$, is called the hazard gradient of X; see Johnson and Kotz [264] and Marshall [381]. Note that $r_X^{(i)}(x)$ can be interpreted as the conditional hazard rate of Xi evaluated at xi, given that Xj > xj for all j ≠ i. That is,

$$r_X^{(i)}(x) = \frac{f_i(x_i \mid X_j > x_j,\ j \ne i)}{\bar F_i(x_i \mid X_j > x_j,\ j \ne i)},$$

where $f_i(\cdot \mid X_j > x_j, j \ne i)$ and $\bar F_i(\cdot \mid X_j > x_j, j \ne i)$ are the conditional density and survival functions of Xi, given that Xj > xj for all j ≠ i. For convenience, here and below we set $r_X^{(i)}(x) = \infty$ for all $x \in \{x : \bar F(x) = 0\}$.

For any subset P ⊆ {1, 2, . . . , m} define

$$Y_P = \min_{i \in P} X_i.$$

Denote

$$1_P(i) = \begin{cases} 0 & \text{if } i \notin P, \\ 1 & \text{if } i \in P, \end{cases} \qquad 1_P = (1_P(1), 1_P(2), \ldots, 1_P(m)),$$

and $1_{P^c} = 1 - 1_P$, where 1 = (1, 1, . . . , 1), and $P^c$ denotes the complement of P in {1, 2, . . . , m}. Also denote

$$\infty \cdot 1_{P^c}(i) = \begin{cases} 0 & \text{if } i \notin P^c, \\ \infty & \text{if } i \in P^c, \end{cases}$$

and $\infty \cdot 1_{P^c} = (\infty \cdot 1_{P^c}(1), \infty \cdot 1_{P^c}(2), \ldots, \infty \cdot 1_{P^c}(m))$. Then the survival function $\bar G_P$ of $Y_P$ can be expressed as

$$\bar G_P(t) = \bar F(t \cdot 1_P - \infty \cdot 1_{P^c}), \quad t \in \mathbb{R}.$$

Theorem 1.B.29. Let (X1, X2, . . . , Xm) be a random vector with an absolutely continuous distribution function. Let P and Q be two subsets of {1, 2, . . . , m} such that P ⊂ Q. If

$$r^{(i)}(t \cdot 1_P - \infty \cdot 1_{P^c}) \le r^{(i)}(t \cdot 1_Q - \infty \cdot 1_{Q^c}), \quad t \in \mathbb{R},\ i \in P, \tag{1.B.29}$$

then $Y_P \ge_{\rm hr} Y_Q$.

A sufficient condition for (1.B.29) is that

$$r^{(i)}(x_1, x_2, \ldots, x_m) \ \text{is increasing in } x_j, \quad j \ne i,\ i = 1, 2, \ldots, m.$$

This is easily seen to be equivalent to the requirement that

$$\frac{\bar F(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_m)}{\bar F(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_m)} \ \text{is decreasing in } x_j,\ j \ne i, \ \text{whenever } x_i \le x_i',\ i = 1, 2, \ldots, m. \tag{1.B.30}$$

Condition (1.B.30) means that $\bar F$ is RR2 (reverse regular of order 2) in pairs; see Karlin [275]. In particular, it holds when X1, X2, . . . , Xm are independent. Karlin and Rinott [279] showed that some multivariate normal distributions, as well as the Dirichlet distribution, are RR2 in pairs. So Theorem 1.B.29 applies to these distributions.

When (X1, X2, . . . , Xm) has an exchangeable distribution function, then the corresponding multivariate hazard function R is permutation symmetric. Therefore each $r^{(i)}$ can be expressed by means of $r^{(1)}$ as follows:

$$r^{(i)}(x_1, x_2, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_m) = r^{(1)}(x_i, x_2, \ldots, x_{i-1}, x_1, x_{i+1}, \ldots, x_m), \quad i = 2, 3, \ldots, m.$$

Corollary 1.B.30. Let (X1, X2, . . . , Xm) be a random vector with an absolutely continuous exchangeable distribution function. If

$$r^{(1)}(\underbrace{t, t, \ldots, t}_{i \text{ times}}, \underbrace{-\infty, -\infty, \ldots, -\infty}_{m-i \text{ times}}) \le r^{(1)}(\underbrace{t, t, \ldots, t}_{i+1 \text{ times}}, \underbrace{-\infty, -\infty, \ldots, -\infty}_{m-i-1 \text{ times}}), \quad t \in \mathbb{R},\ i = 1, 2, \ldots, m-1, \tag{1.B.31}$$

then

$$X_{(1:1)} \ge_{\rm hr} X_{(1:2)} \ge_{\rm hr} \cdots \ge_{\rm hr} X_{(1:m)}. \tag{1.B.32}$$

If (1.B.31) is not imposed, then (1.B.32) need not be true; this follows from a counterexample in the literature.

The following result strengthens the DFR part of Theorem 1.A.19. Recall that the spacings that correspond to the random variables X1, X2, . . . , Xm are denoted by $U_{(i)} = X_{(i)} - X_{(i-1)}$, i = 2, 3, . . . , m, where the $X_{(i)}$'s are the corresponding order statistics. When the dependence on m is to be emphasized, we will denote the spacings by $U_{(i:m)}$.

Theorem 1.B.31. Let X1, X2, . . . , Xm, Xm+1 be independent and identically distributed, absolutely continuous, DFR random variables. Then

$$(m - i + 1)U_{(i:m)} \le_{\rm hr} (m - i)U_{(i+1:m)}, \quad i = 2, 3, \ldots, m - 1, \tag{1.B.33}$$

$$(m - i + 2)U_{(i:m+1)} \le_{\rm hr} (m - i + 1)U_{(i:m)}, \quad i = 2, 3, \ldots, m, \tag{1.B.34}$$

and

$$U_{(i:m)} \le_{\rm hr} U_{(i+1:m+1)}, \quad i = 2, 3, \ldots, m. \tag{1.B.35}$$

Note that (1.B.33)–(1.B.35) can be summarized as

$$(m - j + 1)U_{(j:m)} \le_{\rm hr} (n - i + 1)U_{(i:n)} \quad \text{whenever } i - j \ge \max\{0, n - m\}.$$

Theorem 1.B.31 is a simple consequence of Theorem 1.C.45 below. It is of interest to compare Theorem 1.B.31 to Theorems 1.A.19 and 1.A.22.

A comparison of such normalized spacings from two different samples is described next. Here $U_{(i:m)}$ denotes, as before, the ith spacing that corresponds to the sample X1, X2, . . . , Xm, and $V_{(j:n)}$ denotes the jth spacing that corresponds to the sample Y1, Y2, . . . , Yn. It is of interest to compare the next result with Theorem 1.C.45.


Theorem 1.B.32. For positive integers m and n, let X1 , X2 , . . . , Xm be independent identically distributed random variables with an absolutely continuous common distribution function, and let Y1 , Y2 , . . . , Yn be independent identically distributed random variables with a possibly diﬀerent absolutely continuous common distribution function. If X1 ≤hr Y1 , and if either X1 or Y1 is DFR, then (m − j + 1)U(j:m) ≤st (n − i + 1)V(i:n)

whenever i − j ≥ max{0, n − m}.

The hazard rate order is closed under the operation of taking minima, as the next result shows.

Theorem 1.B.33. Let (Xi, Yi), i = 1, 2, . . . , m, be independent pairs of random variables such that Xi ≤hr Yi, i = 1, 2, . . . , m. Then min{X1, X2, . . . , Xm} ≤hr min{Y1, Y2, . . . , Ym}.

Proof. Clearly, it is enough to show the result when m = 2. For simplicity assume that X1, X2, Y1, and Y2 have hazard rate functions r1, r2, q1, and q2, respectively. Then it is very easy to see that the hazard rate function of min{X1, X2} is r1 + r2 and the hazard rate function of min{Y1, Y2} is q1 + q2. By the assumptions of the theorem (see (1.B.2)) r1(t) ≥ q1(t) and r2(t) ≥ q2(t) for all t ≥ 0. Therefore r1(t) + r2(t) ≥ q1(t) + q2(t) for all t ≥ 0, that is, min{X1, X2} ≤hr min{Y1, Y2}.
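The closure under minima can also be seen through survival functions: for independent components the tail of the minimum is the product of the tails. The sketch below illustrates this for integer-valued variables; the tail vectors, the discrete cross-product form of the hazard rate order, and the function names are assumptions made for the illustration.

```python
def hr_leq_tails(tail_x, tail_y, tol=1e-12):
    """Check X <=hr Y from tail vectors [P{X >= n}] and [P{Y >= n}]."""
    k = len(tail_x)
    return all(tail_x[a] * tail_y[b] + tol >= tail_x[b] * tail_y[a]
               for a in range(k) for b in range(a, k))


def min_tail(tail_a, tail_b):
    """Tail of min(A, B) for independent A and B: product of the tails."""
    return [a * b for a, b in zip(tail_a, tail_b)]


# Hypothetical tail vectors on {0, 1, 2} (assumed, not from the text).
TAIL_X1 = [1.0, 0.5, 0.2]
TAIL_Y1 = [1.0, 0.8, 0.5]
TAIL_X2 = [1.0, 0.4, 0.1]
TAIL_Y2 = [1.0, 0.7, 0.3]

TAIL_MIN_X = min_tail(TAIL_X1, TAIL_X2)
TAIL_MIN_Y = min_tail(TAIL_Y1, TAIL_Y2)
```

Here each pair is hazard rate ordered, and `hr_leq_tails(TAIL_MIN_X, TAIL_MIN_Y)` checks that the two minima inherit the ordering.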

If the Xi ’s in Theorem 1.B.33 are identically distributed and if the Yi ’s in Theorem 1.B.33 are also identically distributed, then all order statistics (and not just the minima) corresponding to the Xi ’s and the Yi ’s can be compared in the hazard rate order. This is shown in the following result. Theorem 1.B.34. Let (Xi , Yi ), i = 1, 2, . . . , m, be independent pairs of absolutely continuous random variables such that Xi ≤hr Yi , i = 1, 2, . . . , m. Suppose that the Xi ’s are identically distributed and that the Yi ’s are identically distributed. Then X(k:m) ≤hr Y(k:m) ,

k = 1, 2, . . . , m.

(1.B.36)

If the Xi ’s or the Yi ’s in Theorem 1.B.34 are not identically distributed, then the conclusion (1.B.36) need not hold. However, the following result, from Chapter 16 by Boland and Proschan in [515], gives conditions under which (1.B.36) holds. Proposition 1.B.35. Let X1 , X2 , . . . , Xm [respectively, Y1 , Y2 , . . . , Ym ] be m independent (not necessarily identically distributed) absolutely continuous random variables, all with support (a, b) for some a < b. If Xi ≤hr Yj for all i and j, then X(k:m) ≤hr Y(k:m) , k = 1, 2, . . . , m. A result which is stronger than Proposition 1.B.35, but which uses Proposition 1.B.35 in its proof, is the following.


Theorem 1.B.36. Let X1 , X2 , . . . , Xm be m independent (not necessarily identically distributed) random variables, and let Y1 , Y2 , . . . , Yn be other n independent (not necessarily identically distributed) random variables, all having absolutely continuous distributions with support (a, b) for some a < b. If Xi ≤hr Yj for all i and j, then X(j:m) ≤hr Y(i:n)

whenever i − j ≥ max{0, n − m}.

The proof of Theorem 1.B.36 uses the likelihood ratio order which is discussed in the next section. Therefore the proof will be given in Remark 1.C.41.

The following example describes an interesting instance in which the two maxima are ordered in the hazard rate order. It may be compared with Example 3.B.32.

Example 1.B.37. Let Y1, Y2, . . . , Ym be independent exponential random variables with hazard rates λ1, λ2, . . . , λm, respectively. Let X1, X2, . . . , Xm be independent and identically distributed exponential random variables with hazard rate $\lambda = \sum_{i=1}^m \lambda_i/m$. Then

$$X_{(m:m)} \le_{\rm hr} Y_{(m:m)}. \tag{1.B.37}$$

Let Z1, Z2, . . . , Zm be independent and identically distributed exponential random variables with hazard rate $\tilde\lambda = \bigl(\prod_{i=1}^m \lambda_i\bigr)^{1/m}$. Then

$$Z_{(m:m)} \le_{\rm hr} Y_{(m:m)}. \tag{1.B.38}$$
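Inequalities (1.B.37) and (1.B.38) can be explored numerically through the criterion that the ratio of survival functions is increasing. The sketch below is a grid check under assumed inputs (the rates 1, 2, 3, the time grid, and the function names are hypothetical), not a proof.

```python
import math


def max_survival(rates, t):
    """P{max > t} for independent exponentials with the given hazard rates."""
    prod = 1.0
    for lam in rates:
        prod *= -math.expm1(-lam * t)   # 1 - exp(-lam * t), computed stably
    return 1.0 - prod


def survival_ratio_increasing(rates_small, rates_large, grid, tol=1e-12):
    """Check that P{max_large > t} / P{max_small > t} is nondecreasing on the grid."""
    ratios = [max_survival(rates_large, t) / max_survival(rates_small, t)
              for t in grid]
    return all(a <= b + tol for a, b in zip(ratios, ratios[1:]))


# Hypothetical heterogeneous rates (assumed, not from the text).
RATES_Y = [1.0, 2.0, 3.0]
M = len(RATES_Y)
ARITHMETIC = sum(RATES_Y) / M
GEOMETRIC = math.prod(RATES_Y) ** (1.0 / M)
RATES_X = [ARITHMETIC] * M
RATES_Z = [GEOMETRIC] * M
GRID = [0.05 * k for k in range(1, 101)]
```

`survival_ratio_increasing(RATES_X, RATES_Y, GRID)` corresponds to (1.B.37) and `survival_ratio_increasing(RATES_Z, RATES_Y, GRID)` to (1.B.38).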

In fact, from the arithmetic-geometric mean inequality ($\lambda \ge \tilde\lambda$) and Proposition 1.B.35, it follows that (1.B.38) implies (1.B.37).

1.B.5 Some properties in reliability theory

The order ≤hr can be trivially (but beneficially) used to characterize IFR random variables. The next result lists several such characterizations. Recall from Section 1.A.3 that for any random variable Z and an event A we denote by [Z | A] any random variable that has as its distribution the conditional distribution of Z given A.

Theorem 1.B.38. The random variable X is IFR [DFR] if, and only if, one of the following equivalent conditions holds (when the support of the distribution function of X is bounded, condition (iii) does not have a simple DFR analog):
(i) [X − t | X > t] ≥hr [≤hr] [X − t′ | X > t′] whenever t ≤ t′.
(ii) X ≥hr [≤hr] [X − t | X > t] for all t ≥ 0 (when X is a nonnegative random variable).
(iii) X + t ≤hr X + t′ whenever t ≤ t′.


Note that if X is the lifetime of a device, then [X − t | X > t] is the residual life of such a device with age t. Theorem 1.B.38(i), for example, characterizes IFR random variables by the monotonicity of their residual lives with respect to the order ≤hr. Some multivariate analogs of conditions (i) and (ii) of Theorem 1.B.38 are used in Section 6.D.3 to introduce a multivariate IFR notion.

Part (iii) of Theorem 1.B.38 can be used to prove the closure under convolution property of IFR random variables:

Corollary 1.B.39. Let X and Y be two independent IFR random variables. Then X + Y has an IFR distribution.

Proof. From Theorem 1.B.38(iii) it follows that X + t ≤hr X + t′ whenever t ≤ t′. Also, Y is independent of X + t and of X + t′ for all t and t′, respectively. From Lemma 1.B.3 it now follows that X + Y + t ≤hr X + Y + t′ whenever t ≤ t′. Thus, again from Theorem 1.B.38(iii), it follows that X + Y is IFR.

Recall from (1.A.20) that for a nonnegative random variable X with a ﬁnite mean we denote by AX the corresponding asymptotic equilibrium age. Recall from page 1 the deﬁnitions of the DMRL and the IMRL properties. The following result is immediate. Theorem 1.B.40. The nonnegative random variable X with ﬁnite mean is DMRL [IMRL] if, and only if, X ≥hr [≤hr ] AX . 1.B.6 The reversed hazard order If X is a random variable with an absolutely continuous distribution function F , then the reversed hazard rate of X at the point t is deﬁned as r˜(t) = (d/dt)(log F (t)). One interpretation of the reversed hazard rate at time t is the following. Suppose that X is nonnegative with distribution function F . Then X can be thought of as the lifetime of some device. Given that the device has already failed by time t, then the probability that it survived up to time t − ε (for a small ε > 0) is approximately ε · r˜(t). Some of the results regarding the hazard rate order have analogs when the hazard rate is replaced by the reversed hazard rate. Let X and Y be two random variables with absolutely continuous distribution functions and with reversed hazard rate functions r˜ and q˜, respectively, such that r˜(t) ≤ q˜(t), t ∈ R. (1.B.39) Then X is said to be smaller than Y in the reversed hazard rate order (denoted as X ≤rh Y ). In fact, the absolute continuity, which is required in (1.B.39), is not really needed. It easy to verify that (1.B.39) holds if, and only if,

1.B The Hazard Rate Order

G(t) F (t)

increases in t ∈ (min(lX , lY ), ∞)

37

(1.B.40)

(here a/0 is taken to be equal to ∞ whenever a > 0). Here F denotes the distribution function of X and G denotes the distribution function of Y , and lX and lY denote the corresponding left endpoints of the supports of X and of Y . Equivalently, (1.B.40) can be written as F (x)G(y) ≥ F (y)G(x)

for all x ≤ y.

(1.B.41)

Thus (1.B.40) or (1.B.41) can be used to define the order $X \le_{\mathrm{rh}} Y$ even if $X$ and/or $Y$ do not have absolutely continuous distributions. The analog of (1.B.5) for the reversed hazard order when $X$ and $Y$ have densities $f$ and $g$, respectively, is that $X \le_{\mathrm{rh}} Y$ if, and only if,

$$\frac{f(y)}{F(x)} \le \frac{g(y)}{G(x)} \quad \text{for all } x \le y. \tag{1.B.42}$$

Another condition that is equivalent to $X \le_{\mathrm{rh}} Y$ is

$$\frac{GF^{-1}(u)}{u} \le \frac{GF^{-1}(v)}{v} \quad \text{for all } 0 < u \le v < 1.$$

Finally, another condition that is equivalent to $X \le_{\mathrm{rh}} Y$ is

$$P\{X - t \le -s \mid X \le t\} \ge P\{Y - t \le -s \mid Y \le t\} \quad \text{for all } s \ge 0 \text{ and all } t,$$

or, equivalently,

$$[X \mid X \le t] \le_{\mathrm{st}} [Y \mid Y \le t] \quad \text{for all } t. \tag{1.B.43}$$
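For distributions concentrated on finitely many points, condition (1.B.41) can be checked directly. The following is only a minimal sketch: the helper name `is_rh_smaller` and the two probability vectors are made up for illustration, and both variables are assumed to live on the same ordered support points.

```python
from fractions import Fraction
from itertools import accumulate


def is_rh_smaller(p, q):
    """Check X <=rh Y via (1.B.41): F(x) G(y) >= F(y) G(x) for all x <= y.

    p and q are the probability vectors of X and Y on the same ordered
    support points (hypothetical helper, not from the text).
    """
    F = list(accumulate(p))  # distribution function of X on the grid
    G = list(accumulate(q))  # distribution function of Y on the grid
    n = len(F)
    return all(F[i] * G[j] >= F[j] * G[i]
               for i in range(n) for j in range(i, n))


# Illustrative probability vectors on the points 1, 2, 3 (assumed data).
p = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
q = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)]

print(is_rh_smaller(p, q))  # True: G/F takes the values 2/5, 5/8, 1
print(is_rh_smaller(q, p))  # False
```

Exact rational arithmetic is used so that ties in the ratio $G/F$ are not misjudged because of rounding.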

For discrete random variables $X$ and $Y$ that take on values in $\mathbb{N}$, we denote $X \le_{\mathrm{rh}} Y$ if

$$\frac{P\{X = n\}}{P\{X \le n\}} \le \frac{P\{Y = n\}}{P\{Y \le n\}}, \quad n \in \mathbb{N}. \tag{1.B.44}$$

A useful relationship between the hazard rate and the reversed hazard rate orders is described in the following theorem.

Theorem 1.B.41. Let $X$ and $Y$ be two continuous random variables with supports $(l_X, u_X)$ and $(l_Y, u_Y)$, respectively. Then

$$X \le_{\mathrm{hr}} Y \implies \phi(X) \ge_{\mathrm{rh}} \phi(Y)$$

for any continuous function $\phi$ which is strictly decreasing on $(l_X, u_Y)$. Also,

$$X \le_{\mathrm{rh}} Y \implies \phi(X) \ge_{\mathrm{hr}} \phi(Y)$$

for any such function $\phi$.

Using Theorem 1.B.41 it is easy to obtain the following analogs of results regarding the order $\le_{\mathrm{hr}}$.


Theorem 1.B.42. If $X$ and $Y$ are two random variables such that $X \le_{\mathrm{rh}} Y$, then $X \le_{\mathrm{st}} Y$.

Theorem 1.B.43. If $X \le_{\mathrm{rh}} Y$, and if $\phi$ is any increasing function, then $\phi(X) \le_{\mathrm{rh}} \phi(Y)$.

Lemma 1.B.44. If the random variables $X$ and $Y$ are such that $X \le_{\mathrm{rh}} Y$, and if $Z$ is a random variable independent of $X$ and $Y$ and has decreasing reversed hazard rate, then $X + Z \le_{\mathrm{rh}} Y + Z$.

Theorem 1.B.45. Let $(X_i, Y_i)$, $i = 1, 2, \ldots, m$, be independent pairs of random variables such that $X_i \le_{\mathrm{rh}} Y_i$, $i = 1, 2, \ldots, m$. If $X_i, Y_i$, $i = 1, 2, \ldots, m$, all have decreasing reversed hazard rates, then

$$\sum_{i=1}^{m} X_i \le_{\mathrm{rh}} \sum_{i=1}^{m} Y_i.$$

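Theorem 1.B.46. Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta] \le_{\mathrm{rh}} [Y \mid \Theta = \theta']$ for all $\theta$ and $\theta'$ in the support of $\Theta$. Then $X \le_{\mathrm{rh}} Y$.

In order to state a bivariate characterization result for the order $\le_{\mathrm{rh}}$ we define the following class of bivariate functions:

$$\mathcal{G}_{\mathrm{rh}} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) \text{ is increasing in } x, \text{ for each } y, \text{ on } \{x \le y\}, \text{ and is decreasing in } y, \text{ for each } x, \text{ on } \{y \le x\}\}.$$

The proof of the next result (Theorem 1.B.47) is similar to the proof of Theorem 1.B.9.

Theorem 1.B.47. Let $X$ and $Y$ be independent random variables. Then $X \le_{\mathrm{rh}} Y$ if, and only if,

$$\phi(X, Y) \le_{\mathrm{st}} \phi(Y, X) \quad \text{for all } \phi \in \mathcal{G}_{\mathrm{rh}}.$$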

The next result uses the notation of Theorem 1.A.10.

Theorem 1.B.48. Let $X$ and $Y$ be two independent random variables. Then $X \le_{\mathrm{rh}} Y$ if, and only if, $E\phi_1(X, Y) \le E\phi_2(X, Y)$ for all $\phi_1$ and $\phi_2$ such that, for each $y$, $\Delta\phi_{21}(x, y)$ decreases in $x$ on $\{x \le y\}$, and such that $\Delta\phi_{21}(x, y) \ge -\Delta\phi_{21}(y, x)$ whenever $x \le y$.

A further similar characterization is given in Theorem 4.A.36. The following result is an analog of Theorem 1.B.11.


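Theorem 1.B.49. Let $X$ and $Y$ be two independent random variables. Then $X \le_{\mathrm{rh}} Y$ if, and only if,

$$[X \mid \max(X, Y) = z] \le_{\mathrm{rh}} [Y \mid \max(X, Y) = z] \quad \text{for all } z. \tag{1.B.45}$$

Also, $X \le_{\mathrm{rh}} Y$ if, and only if,

$$[X \mid \max(X, Y) = z] \le_{\mathrm{st}} [Y \mid \max(X, Y) = z] \quad \text{for all } z. \tag{1.B.46}$$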

Proof. First suppose that $X$ and $Y$ are absolutely continuous. Denote the distribution functions of $X$ and $Y$ by $F$ and $G$, respectively, and denote the corresponding density functions by $f$ and $g$. Then

$$P[X \le x \mid \max(X, Y) = z] = \begin{cases} \dfrac{F(x)g(z)}{f(z)G(z) + g(z)F(z)}, & \text{if } x \le z, \\[2ex] 1, & \text{if } x > z, \end{cases} \tag{1.B.47}$$

and

$$P[Y \le x \mid \max(X, Y) = z] = \begin{cases} \dfrac{G(x)f(z)}{f(z)G(z) + g(z)F(z)}, & \text{if } x \le z, \\[2ex] 1, & \text{if } x > z. \end{cases} \tag{1.B.48}$$

Therefore

$$\frac{P[Y \le x \mid \max(X, Y) = z]}{P[X \le x \mid \max(X, Y) = z]} = \begin{cases} \dfrac{G(x)}{F(x)} \cdot \dfrac{f(z)}{g(z)}, & \text{if } x \le z, \\[2ex] 1, & \text{if } x > z. \end{cases} \tag{1.B.49}$$

If $X \le_{\mathrm{rh}} Y$, then $G(x)/F(x)$ is increasing in $x$, and $\frac{G(z)}{F(z)} \cdot \frac{f(z)}{g(z)} \le 1$. Thus (1.B.49) is increasing in $x$, and (1.B.45) follows. Obviously (1.B.46) follows from (1.B.45).

Now suppose that (1.B.46) holds. Then from (1.B.47) and (1.B.48) we get that $F(x)g(z) \ge G(x)f(z)$ for all $x \le z$. Therefore $X \le_{\mathrm{rh}} Y$ by (1.B.42). The proof when $X$ and $Y$ are discrete is similar.

The following result is an analog of Theorem 1.B.12. Theorem 1.B.50. Let X and Y be two independent random variables. The following conditions are equivalent: (a) X ≤rh Y . (b) E[α(X)]E[β(Y )] ≥ E[α(Y )]E[β(X)] for all functions α and β for which the expectations exist and such that β is nonnegative and α/β and β are decreasing. (c) For any increasing function a and a nonnegative decreasing function b, if E[a(Y )b(Y )] = 0, then E[a(X)b(X)] ≤ 0. Example 1.B.51. Let X and Y be two random variables with support [c, d], where c < 0 < d, and suppose that E[Y ] > 0. Let u be an increasing diﬀerentiable concave function, corresponding to the utility function of a risk-averse


individual. Let $k_X$ be a value which maximizes $g_X(k) \equiv E[u(kX)]$, and similarly let $k_Y$ be a value which maximizes $g_Y(k) \equiv E[u(kY)]$. Theorem 1.B.50(c) can be used to prove that if $X \le_{\mathrm{rh}} Y$, then $k_X \le k_Y$. In order to see it, first note that the result is trivial if $k_X = -\infty$ or if $k_Y = \infty$. Thus, let us assume that $k_X$ and $k_Y$ are finite. Note that then $k_X$ and $k_Y$ satisfy $E[Xu'(k_X X)] = 0$ and $E[Yu'(k_Y Y)] = 0$, where $u'$ denotes the derivative of $u$. Also note that from the assumption $E[Y] > 0$ it follows that $k_Y > 0$. Without loss of generality let $k_Y = 1$. Thus $E[Yu'(Y)] = 0$, and using the concavity of $u$ the assertion would follow if we show that $E[Xu'(X)] \le 0$. But this follows from Theorem 1.B.50(c), applied with $a(x) = x$ and $b(x) = u'(x)$, which is nonnegative and decreasing because $u$ is increasing and concave.

Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a subset of the real line. As in Section 1.A.3 let $X(\theta)$ denote a random variable with distribution function $G_\theta$. For any random variable $\Theta$ with support in $\mathcal{X}$, and with distribution function $F$, let us denote by $X(\Theta)$ a random variable with distribution function $H$ given by

$$H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.$$

The following result generalizes Theorem 1.B.43, just as Theorem 1.A.6 generalized Theorem 1.A.3(a). The proof of the next theorem is similar to the proof of Theorem 1.B.14 and is therefore omitted.

Theorem 1.B.52. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, $i = 1, 2$; that is, suppose that the distribution function of $Y_i$ is given by

$$H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

If

$$X(\theta) \le_{\mathrm{rh}} X(\theta') \quad \text{whenever } \theta \le \theta',$$

and if $\Theta_1 \le_{\mathrm{rh}} \Theta_2$, then $Y_1 \le_{\mathrm{rh}} Y_2$.

The following result, which is the “reversed hazard analog” of Theorem 1.B.18, gives a Laplace transform characterization of the order $\le_{\mathrm{rh}}$.

Theorem 1.B.53. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

$$X_1 \le_{\mathrm{rh}} X_2 \iff N_\lambda(X_1) \le_{\mathrm{rh}} N_\lambda(X_2) \quad \text{for all } \lambda > 0,$$

where the notation $N_\lambda(X_1) \le_{\mathrm{rh}} N_\lambda(X_2)$ is in the sense of (1.B.44).


The implication =⇒ in Theorem 1.B.53 can be generalized in the same manner that Theorem 1.A.14 generalizes the implication =⇒ in Theorem 1.A.13. We will not state the result here since it is equivalent to Theorem 1.B.52. The reversed hazard analog of Theorem 1.B.19 is the following. Theorem 1.B.54. Let X1 , X2 , . . . , Xm , Θ1 , and Θ2 be independent nonnegative random variables. Deﬁne Nj (t) for t ≥ 0 and j = 1, 2 as in Theorem 1.B.19. If Θ1 ≤rh Θ2 , then N1 (t) ≤rh N2 (t) in the sense of (1.B.44) for all t ≥ 0. The reversed hazard analog of Theorem 1.B.20 is the following. Theorem 1.B.55. Let X be any random variable. Then X(−∞,a] and X(a,∞) are increasing in a in the sense of the reversed hazard order. Analogous to the result in Remark 1.A.18, it can be shown that the set of all distribution functions on R ∪ {−∞} is a lattice with respect to the order ≤rh . The reversed hazard analog of Theorem 1.B.26 is the following. Theorem 1.B.56. If X1 , X2 , . . . , Xm are independent random variables, then X(k) ≤rh X(k+1) for k = 1, 2, . . . , m − 1. The reversed hazard analog of Theorem 1.B.27 is the following. Theorem 1.B.57. Let X1 , X2 , . . . , Xm be independent random variables. If Xm ≤rh Xj for all j = 1, 2, . . . , m − 1, then X(k−1:m−1) ≤rh X(k:m) for k = 2, 3, . . . , m. The reversed hazard analog of Theorem 1.B.28 is the following. Theorem 1.B.58. If X1 , X2 , . . . , Xm are independent random variables, then X(k:m−1) ≥rh X(k:m) for k = 1, 2, . . . , m − 1. The reversed hazard analogs of Theorems 1.B.33, 1.B.34, and 1.B.36 are the following results. Theorem 1.B.59. Let (Xi , Yi ), i = 1, 2, . . . , m, be independent pairs of random variables such that Xi ≤rh Yi , i = 1, 2, . . . , m. Then max{X1 , X2 , . . . , Xm } ≤rh max{Y1 , Y2 , . . . , Ym }. Theorem 1.B.60. Let (Xi , Yi ), i = 1, 2, . . . , m, be independent pairs of absolutely continuous random variables such that Xi ≤rh Yi , i = 1, 2, . . . , m. 
Suppose that the Xi ’s are identically distributed and that the Yi ’s are identically distributed. Then X(k:m) ≤rh Y(k:m) ,

k = 1, 2, . . . , m.


Theorem 1.B.61. Let X1 , X2 , . . . , Xm be m independent (not necessarily identically distributed) random variables, and let Y1 , Y2 , . . . , Yn be other n independent (not necessarily identically distributed) random variables, all having absolutely continuous distributions with support (a, b) for some a < b. If Xi ≤rh Yj for all i and j, then X(j:m) ≤rh Y(i:n)

whenever i − j ≥ max{0, n − m}.

Finally, the reversed hazard analog of Theorem 1.B.38 is the following.

Theorem 1.B.62. The random variable $X$ with support $(a, b)$, for some $-\infty \le a < b \le \infty$, has decreasing [increasing] reversed hazard rate if, and only if, one of the following equivalent conditions holds:
(i) $[X - t \mid X < t] \ge_{\mathrm{rh}} [\le_{\mathrm{rh}}]\ [X - t' \mid X < t']$ whenever $a < t \le t' < b$.
(ii) $X \le_{\mathrm{rh}} [\ge_{\mathrm{rh}}]\ [X - t \mid X < t]$ for all $t \in (a, b)$ (when $X$ is a nonpositive random variable).
(iii) $X + t \le_{\mathrm{rh}} [\ge_{\mathrm{rh}}]\ X + t'$ whenever $a < t \le t' < b$.

Corollary 1.B.63. Let $X$ and $Y$ be two independent random variables with decreasing reversed hazard rates. Then $X + Y$ has a decreasing reversed hazard rate.

1.C The Likelihood Ratio Order

1.C.1 Definition

Let $X$ and $Y$ be continuous [discrete] random variables with densities [discrete densities] $f$ and $g$, respectively, such that

$$\frac{g(t)}{f(t)} \text{ increases in } t \text{ over the union of the supports of } X \text{ and } Y \tag{1.C.1}$$

(here $a/0$ is taken to be equal to $\infty$ whenever $a > 0$), or, equivalently,

$$f(x)g(y) \ge f(y)g(x) \quad \text{for all } x \le y. \tag{1.C.2}$$

Then X is said to be smaller than Y in the likelihood ratio order (denoted by X ≤lr Y ). By integrating (1.C.2) over x ∈ A and y ∈ B, where A and B are measurable sets in R, it is seen that (1.C.2) is equivalent to P {X ∈ A}P {Y ∈ B} ≥ P {X ∈ B}P {Y ∈ A} for all measurable sets A and B such that A ≤ B, (1.C.3) where A ≤ B means that x ∈ A and y ∈ B imply that x ≤ y. Note that condition (1.C.3) does not directly involve the underlying densities, and thus


it applies uniformly to continuous distributions, or to discrete distributions, or even to mixed distributions.

At first glance, (1.C.1), (1.C.2), and (1.C.3) may seem to be unintuitive technical conditions. However, it turns out that in many situations they are very easy to verify, and this is one of the major reasons for the usefulness and importance of the order $\le_{\mathrm{lr}}$. It is also easy to verify by a simple differentiation (at least when $X$ and $Y$ have the same support) that

$$X \le_{\mathrm{lr}} Y \iff GF^{-1} \text{ is convex.} \tag{1.C.4}$$

Here $F$ and $G$ are the distribution functions of $X$ and $Y$, respectively.

1.C.2 The relation between the likelihood ratio and the hazard and reversed hazard orders

Note that from (1.C.1) it follows (in the continuous case) that

$$\int_{t=x}^{y} \int_{t'=y}^{\infty} f(t)g(t')\,dt'\,dt \ge \int_{t=x}^{y} \int_{t'=y}^{\infty} f(t')g(t)\,dt'\,dt \quad \text{for all } x \le y,$$

which, in turn (add $\int_y^\infty f(t)\,dt \int_y^\infty g(t')\,dt'$ to both sides), implies that

$$\int_{x}^{\infty} f(t)\,dt \int_{y}^{\infty} g(t')\,dt' \ge \int_{x}^{\infty} g(t)\,dt \int_{y}^{\infty} f(t')\,dt' \quad \text{for all } x \le y,$$

that is, (1.B.4). We thus have shown a part of the following result. The other parts of the next theorem are proven similarly (recall that the discrete versions of the orders $\le_{\mathrm{hr}}$ and $\le_{\mathrm{rh}}$ are defined in (1.B.9) and (1.B.44), respectively).

Theorem 1.C.1. If $X$ and $Y$ are two continuous or discrete random variables such that $X \le_{\mathrm{lr}} Y$, then $X \le_{\mathrm{hr}} Y$ and $X \le_{\mathrm{rh}} Y$ (and therefore $X \le_{\mathrm{st}} Y$).

Remark 1.C.2. Neither of the orders $\le_{\mathrm{hr}}$ and $\le_{\mathrm{rh}}$ (even if both hold simultaneously) implies the order $\le_{\mathrm{lr}}$. In order to see it let $X$ be a uniform random variable over the set $\{1, 2, 3, 4\}$ and let $Y$ have the probabilities $P\{Y = 1\} = .1$, $P\{Y = 2\} = .3$, $P\{Y = 3\} = .2$, and $P\{Y = 4\} = .4$. Then it is not true that $X \le_{\mathrm{lr}} Y$; however, in this case we have that $X \le_{\mathrm{hr}} Y$ and also that $X \le_{\mathrm{rh}} Y$.

Remark 1.C.3. Using Theorem 1.C.1 we can now give a proof of Theorem 1.A.22. Let $F$ and $f$ denote, respectively, the distribution function and the density function of $X_1$. Given $X_{(i-1:m)} = u$ and $X_{(i+1:m)} = v$, the conditional density of $U_{(i:m)}$ at the point $w$ is $\frac{f(u+w)}{F(v) - F(u)}$, $0 \le w \le v - u$, and the conditional density of $U_{(i+1:m)}$ at the point $w$ is $\frac{f(v-w)}{F(v) - F(u)}$, $0 \le w \le v - u$. Since $f$ is increasing [decreasing] it is seen that, conditionally, $U_{(i:m)} \ge_{\mathrm{lr}} [\le_{\mathrm{lr}}]\ U_{(i+1:m)}$, and therefore, by Theorem 1.C.1, $U_{(i:m)} \ge_{\mathrm{st}} [\le_{\mathrm{st}}]\ U_{(i+1:m)}$. Theorem 1.A.22 now follows from Theorem 1.A.3(d).
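The counterexample of Remark 1.C.2 can be verified mechanically. The sketch below is illustrative only; the helper `ratio_increasing` is a made-up name, and each order is tested through its monotone-ratio form (the ratio of densities for $\le_{\mathrm{lr}}$, of $P\{\cdot \ge n\}$ for $\le_{\mathrm{hr}}$, and of $P\{\cdot \le n\}$ for $\le_{\mathrm{rh}}$).

```python
from fractions import Fraction
from itertools import accumulate


def ratio_increasing(a, b):
    """True if b[k] / a[k] is increasing in k, tested in the
    cross-product form a[i] * b[j] >= a[j] * b[i] for all i <= j."""
    n = len(a)
    return all(a[i] * b[j] >= a[j] * b[i]
               for i in range(n) for j in range(i, n))


# Discrete densities of X and Y on {1, 2, 3, 4}, as in Remark 1.C.2.
f = [Fraction(1, 4)] * 4
g = [Fraction(1, 10), Fraction(3, 10), Fraction(2, 10), Fraction(4, 10)]

F, G = list(accumulate(f)), list(accumulate(g))   # P{X <= n}, P{Y <= n}
F_bar = list(accumulate(reversed(f)))[::-1]       # P{X >= n}
G_bar = list(accumulate(reversed(g)))[::-1]       # P{Y >= n}

is_lr = ratio_increasing(f, g)          # likelihood ratio order
is_hr = ratio_increasing(F_bar, G_bar)  # hazard rate order
is_rh = ratio_increasing(F, G)          # reversed hazard order
print(is_lr, is_hr, is_rh)  # False True True
```

The likelihood ratio order fails because $g/f$ drops from $1.2$ at the point 2 to $0.8$ at the point 3, while the two ratios of cumulative probabilities are still monotone.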


Although neither of the orders $\le_{\mathrm{hr}}$ and $\le_{\mathrm{rh}}$ implies the order $\le_{\mathrm{lr}}$ (see Remark 1.C.2), the following result gives a simple condition under which this is actually the case. The proof is immediate and is therefore omitted.

Theorem 1.C.4. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, (discrete or continuous) hazard rate functions $r$ and $q$, and (discrete or continuous) reversed hazard rate functions $\tilde r$ and $\tilde q$, respectively.
(a) If $X \le_{\mathrm{hr}} Y$ and if $\frac{q(t)}{r(t)}$ increases in $t$, then $X \le_{\mathrm{lr}} Y$.
(b) If $X \le_{\mathrm{rh}} Y$ and if $\frac{\tilde q(t)}{\tilde r(t)}$ increases in $t$, then $X \le_{\mathrm{lr}} Y$.

1.C.3 Some properties and characterizations

The usual stochastic order has the useful and important constructive property described in Theorem 1.A.1. There is no analogous property associated with the likelihood ratio order. Therefore it is of importance to understand better the relationship between the orders $\le_{\mathrm{st}}$ and $\le_{\mathrm{lr}}$. We already know from Theorems 1.C.1 and 1.B.1 that the likelihood ratio order implies the usual stochastic order. The following result characterizes the likelihood ratio order by means of the order $\le_{\mathrm{st}}$. It says that $X \le_{\mathrm{lr}} Y$ if, and only if, for any interval $I$, the conditional distribution of $X$, given that $X \in I$, is stochastically smaller than the conditional distribution of $Y$, given that $Y \in I$. As in Section 1.A.3, $[Z \mid A]$ denotes any random variable that has as its distribution the conditional distribution of $Z$ given $A$. It is of interest to contrast the next result with (1.B.7) and (1.B.43).

Theorem 1.C.5. The two random variables $X$ and $Y$ satisfy $X \le_{\mathrm{lr}} Y$ if, and only if,

$$[X \mid a \le X \le b] \le_{\mathrm{st}} [Y \mid a \le Y \le b] \quad \text{whenever } a \le b. \tag{1.C.5}$$

Proof. Suppose that (1.C.5) holds. Select an $a$ and a $b$ such that $a < b$. Then

$$\frac{P\{u \le X \le b\}}{P\{a \le X \le b\}} \le \frac{P\{u \le Y \le b\}}{P\{a \le Y \le b\}} \quad \text{whenever } u \in [a, b].$$

It follows then that

$$\frac{P\{a \le X < u\}}{P\{u \le X \le b\}} \ge \frac{P\{a \le Y < u\}}{P\{u \le Y \le b\}} \quad \text{whenever } u \in [a, b].$$

That is,

$$\frac{P\{a \le X < u\}}{P\{a \le Y < u\}} \ge \frac{P\{u \le X \le b\}}{P\{u \le Y \le b\}} \quad \text{whenever } u \in [a, b].$$

In particular, for $u < b \le v$,

$$\frac{P\{u \le X < b\}}{P\{u \le Y < b\}} \ge \frac{P\{b \le X \le v\}}{P\{b \le Y \le v\}}.$$

Therefore, when $X$ and $Y$ are continuous random variables,

$$\frac{P\{a \le X < u\}}{P\{a \le Y < u\}} \ge \frac{P\{b \le X \le v\}}{P\{b \le Y \le v\}} \quad \text{whenever } a < u \le b \le v.$$

Now let $a \to u$ and $b \to v$ to obtain (1.C.2). The proof for discrete random variables is similar.

Conversely, suppose that $X \le_{\mathrm{lr}} Y$; then clearly $[X \mid a \le X \le b] \le_{\mathrm{lr}} [Y \mid a \le Y \le b]$ whenever $a < b$ (see also Theorem 1.C.6). From Theorems 1.C.1 and 1.B.1 we obtain (1.C.5).
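Theorem 1.C.5 also gives a practical way to locate where the likelihood ratio order fails: some interval must violate (1.C.5). The following sketch searches all intervals with endpoints in the support for the pair $X$, $Y$ of Remark 1.C.2. The function names are hypothetical, and discrete densities are represented as dictionaries mapping values to probabilities.

```python
from fractions import Fraction


def truncated(pmf, a, b):
    """Discrete density of [Z | a <= Z <= b]; pmf maps value -> probability."""
    mass = sum(pr for v, pr in pmf.items() if a <= v <= b)
    return {v: pr / mass for v, pr in pmf.items() if a <= v <= b}


def is_st_smaller(p, q):
    """X <=st Y iff P{X >= t} <= P{Y >= t} for every threshold t."""
    def tail(pmf, t):
        return sum(pr for v, pr in pmf.items() if v >= t)
    return all(tail(p, t) <= tail(q, t) for t in set(p) | set(q))


def failing_intervals(p, q):
    """All intervals [a, b] with endpoints in the common support
    on which the comparison (1.C.5) is violated."""
    pts = sorted(set(p) & set(q))
    return [(a, b) for a in pts for b in pts if a <= b
            and not is_st_smaller(truncated(p, a, b), truncated(q, a, b))]


# X uniform on {1, 2, 3, 4}; Y as in Remark 1.C.2.
p = {n: Fraction(1, 4) for n in (1, 2, 3, 4)}
q = {1: Fraction(1, 10), 2: Fraction(3, 10),
     3: Fraction(2, 10), 4: Fraction(4, 10)}

print(failing_intervals(p, q))  # [(2, 3)]
```

On the interval $[2, 3]$ the truncated $X$ puts mass $1/2$ on the point 3, whereas the truncated $Y$ puts only $2/5$ there, so the stochastic comparison fails exactly where $g/f$ decreases.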

The likelihood ratio order is preserved under general truncations of the involved random variables. This is stated in the next theorem, the proof of which follows directly from (1.C.2).

Theorem 1.C.6. If $X$ and $Y$ are two random variables such that $X \le_{\mathrm{lr}} Y$, then for any measurable set $A \subseteq \mathbb{R}$ we have $[X \mid X \in A] \le_{\mathrm{lr}} [Y \mid Y \in A]$.

By combining Theorems 1.C.5 and 1.C.6 it is seen that $X \le_{\mathrm{lr}} Y$ if, and only if,

$$[X \mid X \in A] \le_{\mathrm{st}} [Y \mid Y \in A] \quad \text{for all measurable sets } A \subseteq \mathbb{R}. \tag{1.C.6}$$

In fact, one can take (1.C.6) as the definition of the likelihood ratio order. The advantage of this approach is that it does not directly involve the underlying densities, and thus, similarly to condition (1.C.3), it applies uniformly to continuous distributions, or to discrete distributions, or even to mixed distributions.

Using the characterization (1.C.3), it is not hard to obtain the following result.

Theorem 1.C.7. Let $\{X_j, j = 1, 2, \ldots\}$ and $\{Y_j, j = 1, 2, \ldots\}$ be two sequences of random variables such that $X_j \to_{\mathrm{st}} X$ and $Y_j \to_{\mathrm{st}} Y$ as $j \to \infty$. If $X_j \le_{\mathrm{lr}} Y_j$, $j = 1, 2, \ldots$, then $X \le_{\mathrm{lr}} Y$.

Let $\psi$ be a strictly monotone increasing [decreasing] differentiable function with inverse $\psi^{-1}$. If $X$ has the density function $f$, then $\psi(X)$ has the density function $(f\psi^{-1})/|\psi'(\psi^{-1})|$. Similarly, if $Y$ has the density function $g$, then $\psi(Y)$ has the density function $(g\psi^{-1})/|\psi'(\psi^{-1})|$. If $X \le_{\mathrm{lr}} Y$, then from (1.C.1) it follows that

$$\frac{(f\psi^{-1})(u)/|\psi'(\psi^{-1}(u))|}{(g\psi^{-1})(u)/|\psi'(\psi^{-1}(u))|}$$

decreases [increases] over the union of the supports of $\psi(X)$ and $\psi(Y)$. We have thus proved an important special case of Theorem 1.C.8 below. For discrete random variables the result is proven in a similar manner. When $\psi$ is just monotone (rather than strictly monotone) the result is still true, but the preceding simple argument is no longer sufficient for its proof.


Theorem 1.C.8. If $X \le_{\mathrm{lr}} Y$ and $\psi$ is any increasing [decreasing] function, then $\psi(X) \le_{\mathrm{lr}} [\ge_{\mathrm{lr}}]\ \psi(Y)$.

If $X_1 \le_{\mathrm{lr}} Y_1$ and $X_2 \le_{\mathrm{lr}} Y_2$, where $X_1$ and $X_2$ are independent random variables, and $Y_1$ and $Y_2$ are also independent random variables, then it is not necessarily true that $X_1 + X_2 \le_{\mathrm{lr}} Y_1 + Y_2$. However, if these random variables have logconcave densities, then it is true. In fact, a slightly stronger result is true:

Theorem 1.C.9. Let $(X_i, Y_i)$, $i = 1, 2, \ldots, m$, be independent pairs of random variables such that $X_i \le_{\mathrm{lr}} Y_i$, $i = 1, 2, \ldots, m$. If $X_i, Y_i$, $i = 1, 2, \ldots, m$, all have (continuous or discrete) logconcave densities, except possibly one $X_l$ and one $Y_k$ ($l \ne k$), then

$$\sum_{i=1}^{m} X_i \le_{\mathrm{lr}} \sum_{i=1}^{m} Y_i.$$

Proof. Since a convolution of random variables with logconcave densities has a logconcave density, it is enough to show that if $W_1$, $W_2$, and $Z$ are independent random variables such that $W_1 \le_{\mathrm{lr}} W_2$, and $Z$ has a logconcave density function, then $W_1 + Z \le_{\mathrm{lr}} W_2 + Z$. We will give the proof for the continuous case; the proof for the discrete case is similar. Let $f_{W_i}$, $f_{W_i + Z}$, $i = 1, 2$, and $f_Z$ denote the density functions of the indicated random variables. Then

$$f_{W_i + Z}(t) = \int_{-\infty}^{\infty} f_Z(t - w) f_{W_i}(w)\,dw, \quad i = 1, 2,\ t \in \mathbb{R}.$$

The assumption W1 ≤lr W2 means that fWi (w), as a function of w and of i ∈ {1, 2}, is TP2 . The logconcavity of fZ means that fZ (t−w), as a function of t and of w, is TP2 . Therefore, by the basic composition formula (Karlin [275]) we see that fWi +Z (t) is TP2 in i ∈ {1, 2} and t; that is, W1 + Z ≤lr W2 + Z.
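As a numerical illustration of Theorem 1.C.9 in the discrete case, the sketch below convolves Bernoulli densities (which are logconcave) and checks the likelihood ratio order of the sums through (1.C.2). The success probabilities and all function names are assumptions chosen for this example only.

```python
from fractions import Fraction


def convolve(p, q):
    """Discrete density of the sum of two independent variables on {0, 1, ...}."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def sum_of_bernoullis(probs):
    """Discrete density of a sum of independent Bernoulli variables."""
    pmf = [Fraction(1)]
    for pr in probs:
        pmf = convolve(pmf, [1 - pr, pr])
    return pmf


def is_lr_smaller(f, g):
    """Check (1.C.2): f(x) g(y) >= f(y) g(x) for all x <= y."""
    n = len(f)
    return all(f[i] * g[j] >= f[j] * g[i]
               for i in range(n) for j in range(i, n))


# Assumed success probabilities with p[i] <= p_prime[i] for every i,
# so that the i-th Bernoulli variables are likelihood ratio ordered.
p = [Fraction(1, 5), Fraction(1, 2), Fraction(7, 10)]
p_prime = [Fraction(3, 10), Fraction(1, 2), Fraction(9, 10)]

sum_x = sum_of_bernoullis(p)
sum_y = sum_of_bernoullis(p_prime)
print(is_lr_smaller(sum_x, sum_y))  # True
print(is_lr_smaller(sum_y, sum_x))  # False
```

The second check returns `False` because two different distributions cannot be likelihood ratio ordered in both directions.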

Example 1.C.10. Consider $m$ independent Bernoulli trials with probability $p_i$ of success in the $i$th trial. Let $q(k, \mathbf{p})$ denote the probability of $k$ successes, $k = 0, 1, \ldots, m$, where $\mathbf{p} = (p_1, p_2, \ldots, p_m)$. Then $q(k+1, \mathbf{p})/q(k, \mathbf{p})$ is increasing in each $p_i$ for $k = 0, 1, \ldots, m - 1$. In order to see it, let $X_i$ be a Bernoulli random variable with probability $p_i$ of success, $i = 1, 2, \ldots, m$, and assume that the $X_i$'s are independent. Similarly, let $Y_i$ be a Bernoulli random variable with probability $p_i'$ of success, $i = 1, 2, \ldots, m$, and assume that the $Y_i$'s are independent. Obviously, the discrete density functions of the $X_i$'s and of the $Y_i$'s are logconcave, and if $\mathbf{p} \le \mathbf{p}'$, then $X_i \le_{\mathrm{lr}} Y_i$, $i = 1, 2, \ldots, m$. The stated result thus follows from Theorem 1.C.9.

For nonnegative random variables, Theorem 1.C.9 can be generalized further by having more $Y_i$'s summed than $X_i$'s. Under the assumptions of Theorem 1.C.9, one then obtains, for $m \le n$, that

$$\sum_{i=1}^{m} X_i \le_{\mathrm{lr}} \sum_{i=1}^{n} Y_i.$$

Of course, in this case, for $m + 1 \le i \le n$, the $Y_i$'s only need to have logconcave densities; they do not have to have corresponding $X_i$'s to which they need to be comparable in the order $\le_{\mathrm{lr}}$. One may expect that the latter inequality can be extended to the following one:

$$\sum_{i=1}^{M} X_i \le_{\mathrm{lr}} \sum_{i=1}^{N} Y_i,$$

where $M$ and $N$ are two discrete positive integer-valued random variables, independent of the $X_i$'s and of the $Y_i$'s, respectively, such that $M \le_{\mathrm{lr}} N$. Indeed this inequality is true under some additional assumptions on the distributions of the $X_i$'s and the $Y_i$'s that will not be stated here. An important special case is the following theorem.

Theorem 1.C.11. Let $\{X_i, i = 1, 2, \ldots\}$ be a sequence of nonnegative independent random variables with logconcave densities. Let $M$ and $N$ be two discrete positive integer-valued random variables such that $M \le_{\mathrm{lr}} N$, and assume that $M$ and $N$ are independent of the $X_i$'s. Then

$$\sum_{i=1}^{M} X_i \le_{\mathrm{lr}} \sum_{i=1}^{N} X_i.$$

In Pellerey [445] it is claimed that the conclusion of Theorem 1.C.11 holds even under the weaker assumption that $M \le_{\mathrm{hr}} N$ (in the sense of (1.B.9) or (1.B.10)). However, there is a mistake in [445] (see Pellerey [446]).

It is of interest to compare Theorem 1.C.11 to the following result, which combines uses of the likelihood ratio and the hazard [reversed hazard] rate orders.

Theorem 1.C.12. Let $\{X_i, i = 1, 2, \ldots\}$ be a sequence of nonnegative independent random variables that are IFR [have decreasing reversed hazard rates]. Let $M$ and $N$ be two discrete positive integer-valued random variables such that $M \le_{\mathrm{lr}} N$, and assume that $M$ and $N$ are independent of the $X_i$'s. Then

$$\sum_{i=1}^{M} X_i \le_{\mathrm{hr}} [\le_{\mathrm{rh}}]\ \sum_{i=1}^{N} X_i.$$

Note that the hazard rate part of Theorem 1.C.12 is weaker than Theorem 1.B.7 because of Theorem 1.C.1. The hazard rate order can be characterized by means of the likelihood ratio order and the appropriate equilibrium age variables. Recall from (1.A.20) that for nonnegative random variables X and Y with ﬁnite means we denote by AX and AY the corresponding asymptotic equilibrium ages. The following result is immediate from (1.B.3) and (1.C.1).


Theorem 1.C.13. Let $X$ and $Y$ be two nonnegative random variables with finite positive means. Then $X \le_{\mathrm{hr}} Y$ if, and only if, $A_X \le_{\mathrm{lr}} A_Y$.

In light of Theorem 1.C.13 it is of interest to note that the order $\le_{\mathrm{lr}}$ can also be used to characterize the hazard rate order as is described in the next theorem. Let $X$ and $Y$ be two nonnegative random variables with finite means and suppose that $X \le_{\mathrm{st}} Y$ and that $EX < EY$. Let $F$ and $G$ be the distribution functions of $X$ and of $Y$, respectively, with survival functions $\bar F \equiv 1 - F$ and $\bar G \equiv 1 - G$. Define the random variable $Z_{X,Y}$ as the random variable that has the density function $h$ given by

$$h(z) = \frac{\bar G(z) - \bar F(z)}{EY - EX}, \quad z \ge 0. \tag{1.C.7}$$

Theorem 1.C.14. Let $X$ and $Y$ be two nonnegative random variables with finite means such that $X \le_{\mathrm{st}} Y$ and such that $EY > EX > 0$. Then

$$A_X \le_{\mathrm{lr}} Z_{X,Y} \iff A_Y \le_{\mathrm{lr}} Z_{X,Y} \iff X \le_{\mathrm{hr}} Y,$$

where $Z_{X,Y}$ has the density function given in (1.C.7).

Proof. Denote by $f_e$ the density function of $A_Y$. Then, using (1.A.20), we obtain

$$\frac{h(x)}{f_e(x)} = \frac{EY}{EY - EX}\left(1 - \frac{\bar F(x)}{\bar G(x)}\right), \quad x \ge 0,$$

and the second stated equivalence follows from (1.C.1) and (1.B.3). The proof of the first equivalence is similar.
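A quick numerical sanity check of Theorem 1.C.14 can be made with exponential variables, for which everything is available in closed form. The sketch below is an illustration under stated assumptions: $X$ and $Y$ are taken to be exponential with rates 2 and 1 (so that $X \le_{\mathrm{hr}} Y$), the density of the equilibrium age is taken to be the survival function divided by the mean, and the helper names are made up.

```python
import math

RATE_X, RATE_Y = 2.0, 1.0                    # assumed rates: X ~ Exp(2), Y ~ Exp(1)
MEAN_X, MEAN_Y = 1.0 / RATE_X, 1.0 / RATE_Y


def survival(rate, z):
    """Survival function of an exponential variable."""
    return math.exp(-rate * z)


def h(z):
    """Density (1.C.7) of Z_{X,Y}."""
    return (survival(RATE_Y, z) - survival(RATE_X, z)) / (MEAN_Y - MEAN_X)


def equilibrium_density(rate, mean, z):
    """Density of the equilibrium age: survival function over the mean."""
    return survival(rate, z) / mean


def is_increasing(values):
    return all(a <= b for a, b in zip(values, values[1:]))


grid = [0.1 * k for k in range(1, 51)]
ratio_to_a_y = [h(z) / equilibrium_density(RATE_Y, MEAN_Y, z) for z in grid]
ratio_to_a_x = [h(z) / equilibrium_density(RATE_X, MEAN_X, z) for z in grid]

# Both ratios increase on the grid, consistent with A_Y <=lr Z and A_X <=lr Z.
print(is_increasing(ratio_to_a_y), is_increasing(ratio_to_a_x))  # True True
```

In this example the two ratios are $2(1 - e^{-z})$ and $e^{z} - 1$, which are visibly increasing; the grid check only confirms the algebra.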

It is of interest to contrast Theorem 1.C.14 with Theorems 2.A.5 and 2.B.3.

The likelihood ratio order enjoys a closure under mixture property which is similar to the closure under mixture property of the hazard rate order stated in Theorem 1.B.8. This is stated next; the proof is similar to the proof of Theorem 1.B.8; we omit the details.

Theorem 1.C.15. Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta] \le_{\mathrm{lr}} [Y \mid \Theta = \theta']$ for all $\theta$ and $\theta'$ in the support of $\Theta$. Then $X \le_{\mathrm{lr}} Y$.

As a corollary of Theorem 1.C.15 we obtain the following result.

Corollary 1.C.16. Let $N$ be a positive integer-valued random variable, and let $X_i$, $i = 1, 2, \ldots$, be random variables which are independent of $N$. Let $Y$ be a random variable such that $X_i \le_{\mathrm{lr}} Y$, $i = 1, 2, \ldots$. Then $X_N \le_{\mathrm{lr}} Y$.

Consider now a family of (continuous or discrete) density functions $\{g_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a subset of the real line. As in Section 1.A.3 let $X(\theta)$ denote a random variable with density function $g_\theta$. For any random variable $\Theta$ with support in $\mathcal{X}$, and with distribution function $F$, let us denote by $X(\Theta)$ a random variable with density function $h$ given by

$$h(y) = \int_{\mathcal{X}} g_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.$$

The following result generalizes both Theorems 1.C.8 and 1.C.15, just as Theorem 1.A.6 generalized parts (a) and (c) of Theorem 1.A.3.

Theorem 1.C.17. Consider a family of density functions $\{g_\theta, \theta \in \mathcal{X}\}$ as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, $i = 1, 2$, that is, suppose that the density function of $Y_i$ is given by

$$h_i(y) = \int_{\mathcal{X}} g_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

If

$$X(\theta) \le_{\mathrm{lr}} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{1.C.8}$$

and if

$$\Theta_1 \le_{\mathrm{lr}} \Theta_2, \tag{1.C.9}$$

then

$$Y_1 \le_{\mathrm{lr}} Y_2. \tag{1.C.10}$$

Proof. We give the proof under the assumption that $\Theta_1$ and $\Theta_2$ are absolutely continuous with density functions $f_1$ and $f_2$, respectively. The proof for the discrete case is similar. Assumption (1.C.8) means that $g_\theta(y)$, as a function of $\theta$ and of $y$, is TP$_2$. Assumption (1.C.9) means that $f_i(\theta)$, as a function of $i \in \{1, 2\}$ and of $\theta$, is TP$_2$. Therefore, by the basic composition formula (Karlin [275]) we see that $h_i(y)$ is TP$_2$ in $i \in \{1, 2\}$ and $y$. That gives (1.C.10).
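Theorem 1.C.17 can be illustrated with a small discrete mixture. In the sketch below, which is only an example under assumed data, $X(\theta)$ is binomial with 3 trials and success probability $\theta$, and $\Theta_1$, $\Theta_2$ take values in a three-point set with hypothetical weights `w1` and `w2`. The code checks (1.C.8) and (1.C.9) and then the conclusion (1.C.10).

```python
from fractions import Fraction
from math import comb

N_TRIALS = 3                                              # assumed number of trials
THETAS = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]  # ordered parameter values


def binomial_pmf(n, theta):
    """Discrete density of a binomial(n, theta) variable on {0, ..., n}."""
    return [comb(n, y) * theta**y * (1 - theta)**(n - y) for y in range(n + 1)]


def is_lr_smaller(f, g):
    """Check (1.C.2): f(x) g(y) >= f(y) g(x) for all x <= y."""
    n = len(f)
    return all(f[i] * g[j] >= f[j] * g[i]
               for i in range(n) for j in range(i, n))


def mixture(weights):
    """Discrete density of X(Theta) when Theta has the given weights on THETAS."""
    comps = [binomial_pmf(N_TRIALS, t) for t in THETAS]
    return [sum(w * c[y] for w, c in zip(weights, comps))
            for y in range(N_TRIALS + 1)]


# Hypothetical mixing weights; w2 / w1 is increasing, so Theta_1 <=lr Theta_2.
w1 = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
w2 = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)]

family_ordered = all(                                  # condition (1.C.8)
    is_lr_smaller(binomial_pmf(N_TRIALS, s), binomial_pmf(N_TRIALS, t))
    for s, t in zip(THETAS, THETAS[1:]))
mixing_ordered = is_lr_smaller(w1, w2)                 # condition (1.C.9)
mixtures_ordered = is_lr_smaller(mixture(w1), mixture(w2))  # conclusion (1.C.10)
print(family_ordered, mixing_ordered, mixtures_ordered)  # True True True
```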

A related result is the following; see also Theorems 1.B.19 and 1.B.54.

Theorem 1.C.18. Let $X_1, X_2, \ldots, X_m$, $\Theta_1$, and $\Theta_2$ be independent nonnegative random variables. Define $N_j(t)$ for $t \ge 0$ and $j = 1, 2$ as in Theorem 1.B.19. If $\Theta_1 \le_{\mathrm{lr}} \Theta_2$, then $N_1(t) \le_{\mathrm{lr}} N_2(t)$ for all $t \ge 0$.

The following example is an application of Theorem 1.C.17; it may be compared to Examples 1.A.7 and 1.B.16.

Example 1.C.19. Let $\Theta_1$ and $\Theta_2$ be two nonnegative random variables with distribution functions $F_1$ and $F_2$, respectively. Let $G$ be some absolutely continuous distribution function, and let $g$ be the corresponding density function. Denote by $X(\theta)$ a random variable with the distribution function $G^\theta$. Define $Y_i = X(\Theta_i)$; that is, the distribution function $H_i$ of $Y_i$ is given by

$$H_i(y) = \int_0^\infty G^\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

Note that the density function $k_\theta$ of $X(\theta)$ is given by

$$k_\theta(y) = \theta g(y) G^{\theta - 1}(y), \quad y \in \mathbb{R}.$$

It is easy to verify that (1.C.8) holds. Thus, by Theorem 1.C.17, if $\Theta_1 \le_{\mathrm{lr}} \Theta_2$, then $Y_1 \le_{\mathrm{lr}} Y_2$.

Now, denote by $\tilde X(\theta)$ a random variable with the survival function $\bar G^\theta$, where $\bar G \equiv 1 - G$. Define $\tilde Y_i = \tilde X(\Theta_i)$; that is, the survival function $\bar{\tilde H}_i$ of $\tilde Y_i$ is given by

$$\bar{\tilde H}_i(y) = \int_0^\infty \bar G^\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

Note that the density function $\tilde k_\theta$ of $\tilde X(\theta)$ is given by

$$\tilde k_\theta(y) = \theta g(y) \bar G^{\theta - 1}(y), \quad y \in \mathbb{R}.$$

It is easy to verify now that $\tilde X(\theta) \ge_{\mathrm{lr}} \tilde X(\theta')$ whenever $\theta \le \theta'$. Thus, by an obvious modification of Theorem 1.C.17, if $\Theta_1 \le_{\mathrm{lr}} \Theta_2$, then $\tilde Y_1 \ge_{\mathrm{lr}} \tilde Y_2$.

In order to state a bivariate characterization result for the order $\le_{\mathrm{lr}}$ we define the following class of bivariate functions:

$$\mathcal{G}_{\mathrm{lr}} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) \le \phi(y, x) \text{ whenever } x \le y\}.$$

Theorem 1.C.20. Let $X$ and $Y$ be independent random variables. Then $X \le_{\mathrm{lr}} Y$ if, and only if,

$$\phi(X, Y) \le_{\mathrm{st}} \phi(Y, X) \quad \text{for all } \phi \in \mathcal{G}_{\mathrm{lr}}. \tag{1.C.11}$$

Proof. We give the proof for the absolutely continuous case only; the proof for the discrete case is similar. Suppose that (1.C.11) holds. Select $u$, $v$, $\Delta u > 0$, and $\Delta v > 0$ such that $u \le v$. As before, let $I_A$ denote the indicator function of the set $A$, and define $\phi(x, y) = I_{\{u - \Delta u \le y \le u,\ v \le x \le v + \Delta v\}}$. Clearly, $\phi \in \mathcal{G}_{\mathrm{lr}}$. Hence

$$P\{v \le X \le v + \Delta v,\ u - \Delta u \le Y \le u\} = E\phi(X, Y) \le E\phi(Y, X) = P\{v \le Y \le v + \Delta v,\ u - \Delta u \le X \le u\}.$$

Dividing both sides by $\Delta u \Delta v$ and letting $\Delta u \to 0$ and $\Delta v \to 0$, we obtain (1.C.2), that is, $X \le_{\mathrm{lr}} Y$.

Conversely, suppose that $X \le_{\mathrm{lr}} Y$. Let $\phi \in \mathcal{G}_{\mathrm{lr}}$ and let $\psi$ be an increasing function. Then

$$E[\psi(\phi(Y, X)) - \psi(\phi(X, Y))] = \int_y \int_x [\psi(\phi(y, x)) - \psi(\phi(x, y))] f(x)g(y)\,dx\,dy$$

$$= \iint_{y \ge x} [\psi(\phi(y, x)) - \psi(\phi(x, y))][f(x)g(y) - f(y)g(x)]\,dy\,dx \ge 0.$$


A typical application of Theorem 1.C.20 is shown in the proof of Theorem 6.B.15 in Chapter 6. Another typical application is the following result.

Theorem 1.C.21. Let $X_1, X_2, \ldots, X_m$ be independent random variables such that $X_1 \le_{\mathrm{lr}} X_2 \le_{\mathrm{lr}} \cdots \le_{\mathrm{lr}} X_m$. Let $a_1, a_2, \ldots, a_m$ be constants such that $a_1 \le a_2 \le \cdots \le a_m$. Then

$$\sum_{i=1}^{m} a_{m-i+1} X_i \le_{\mathrm{st}} \sum_{i=1}^{m} a_{\pi_i} X_i \le_{\mathrm{st}} \sum_{i=1}^{m} a_i X_i,$$

where π = (π1 , π2 , . . . , πm ) denotes any permutation of (1, 2, . . . , m). Proof. We only give the proof when m = 2; the general case then can be obtained by pairwise interchanges. So, suppose that X1 ≤lr X2 and that a1 ≤ a2 . Deﬁne φ by φ(x, y) = a1 y + a2 x. Then it is easy to verify that φ ∈ Glr . Thus, by Theorem 1.C.20, a1 X2 + a2 X1 ≤st a1 X1 + a2 X2 .
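The case $m = 2$ treated in the proof can be checked exactly for discrete variables. The following sketch uses made-up discrete densities `x1` and `x2` on $\{0, 1, 2\}$ with $X_1 \le_{\mathrm{lr}} X_2$ and assumed constants $a_1 = 1$, $a_2 = 3$; the helper names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def linear_combination_pmf(a, b, p, q):
    """Discrete density of a*U + b*V for independent U ~ p and V ~ q,
    where p and q map values to probabilities."""
    out = defaultdict(Fraction)
    for u, pu in p.items():
        for v, qv in q.items():
            out[a * u + b * v] += pu * qv
    return dict(out)


def is_st_smaller(p, q):
    """X <=st Y iff P{X >= t} <= P{Y >= t} for every threshold t."""
    def tail(pmf, t):
        return sum(pr for v, pr in pmf.items() if v >= t)
    return all(tail(p, t) <= tail(q, t) for t in set(p) | set(q))


# Assumed data: x2[n] / x1[n] is increasing in n, so X1 <=lr X2.
x1 = {0: Fraction(1, 2), 1: Fraction(3, 10), 2: Fraction(1, 5)}
x2 = {0: Fraction(1, 5), 1: Fraction(3, 10), 2: Fraction(1, 2)}
a1, a2 = 1, 3  # assumed constants with a1 <= a2

lower = linear_combination_pmf(a2, a1, x1, x2)  # a1*X2 + a2*X1
upper = linear_combination_pmf(a1, a2, x1, x2)  # a1*X1 + a2*X2
print(is_st_smaller(lower, upper))  # True
print(is_st_smaller(upper, lower))  # False
```

Pairing the larger constant with the larger variable gives the stochastically larger sum, as the theorem asserts.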

The next two results are characterizations similar to the one in Theorem 1.C.20. They use the notation of Theorem 1.A.10, and their comparison is of interest. The proofs of the following two theorems are omitted.

Theorem 1.C.22. Let $X$ and $Y$ be two independent random variables. Then $X \le_{\mathrm{lr}} Y$ if, and only if, $E\phi_1(X, Y) \le E\phi_2(X, Y)$ for all functions $\phi_1$ and $\phi_2$ that satisfy $\Delta\phi_{21}(x, y) \ge 0$ whenever $x \le y$, and $\Delta\phi_{21}(x, y) \ge -\Delta\phi_{21}(y, x)$ whenever $x \le y$.

Theorem 1.C.23. Let $X$ and $Y$ be two independent random variables. Then $X \le_{\mathrm{lr}} Y$ if, and only if, $\phi_1(X, Y) \le_{\mathrm{st}} \phi_2(X, Y)$ for all $\phi_1$ and $\phi_2$ that satisfy $\Delta\phi_{21}(x, y) \ge 0$ whenever $x \le y$, and $\phi_1(x, y) \le \phi_2(y, x)$ for all $x$ and $y$ (then, in particular, $\Delta\phi_{21}(x, y) \ge -\Delta\phi_{21}(y, x)$ whenever $x \le y$).

The next theorem gives a characterization of the likelihood ratio order in the spirit of Theorems 1.B.11 and 1.B.49.

Theorem 1.C.24. Let $X$ and $Y$ be two independent random variables. Then $X \le_{\mathrm{lr}} Y$ if, and only if,

$$[X \mid \min(X, Y) = z_1, \max(X, Y) = z_2] \le_{\mathrm{lr}} [Y \mid \min(X, Y) = z_1, \max(X, Y) = z_2] \quad \text{for all } z_1 \le z_2.$$


Proof. First suppose that $X$ and $Y$ are absolutely continuous with density functions $f$ and $g$, respectively. Then

$$P[X = z_1 \mid \min(X, Y) = z_1, \max(X, Y) = z_2] = 1 - P[X = z_2 \mid \min(X, Y) = z_1, \max(X, Y) = z_2]$$

$$= P[Y = z_2 \mid \min(X, Y) = z_1, \max(X, Y) = z_2] = 1 - P[Y = z_1 \mid \min(X, Y) = z_1, \max(X, Y) = z_2]$$

$$= \frac{f(z_1)g(z_2)}{f(z_1)g(z_2) + f(z_2)g(z_1)},$$

and the stated result follows. The proof when $X$ and $Y$ are discrete is similar.

Another similar characterization is given in Theorem 4.A.36. The following result gives a Laplace transform characterization of the order ≤lr . It should be compared with Theorems 1.A.13, 1.B.18, and 1.B.53. The proof is omitted. Theorem 1.C.25. Let X1 and X2 be two nonnegative random variables, and let Nλ (X1 ) and Nλ (X2 ) be as described in Theorem 1.A.13. Then X1 ≤lr X2 ⇐⇒ Nλ (X1 ) ≤lr Nλ (X2 )

for all λ > 0.

The implication $\Longrightarrow$ in Theorem 1.C.25 can be generalized in the same manner that Theorem 1.A.14 generalizes the implication $\Longrightarrow$ in Theorem 1.A.13. We will not state the result here since it is equivalent to Theorem 1.C.17.

Some interesting simple implications of the likelihood ratio order are described in the following theorem.

Theorem 1.C.26. Let $X$, $Y$, and $Z$ be independent random variables. If $X \le_{\mathrm{lr}} Y$, then

$$[X \mid X + Y = v] \le_{\mathrm{lr}} [Y \mid X + Y = v] \quad \text{for all } v,$$

$$[X \mid X + Z = v] \le_{\mathrm{lr}} [Y \mid Y + Z = v] \quad \text{for all } v,$$

and

$$[Z \mid X + Z = v] \ge_{\mathrm{lr}} [Z \mid Y + Z = v] \quad \text{for all } v.$$

Proof. We give only the proof of the first inequality; the proofs of the other two are similar. First suppose that $X$ and $Y$ are absolutely continuous with density functions $f$ and $g$, respectively. Denote the density function of $X + Y$ by $h$. Then the density function of $[Y \mid X + Y = v]$ is given by $\frac{f(v - \cdot)g(\cdot)}{h(v)}$, and the density function of $[X \mid X + Y = v]$ is given by $\frac{f(\cdot)g(v - \cdot)}{h(v)}$. It is now seen that the monotonicity of $g/f$ implies the monotonicity of the ratio of the above two density functions. The proof when $X$ and $Y$ are discrete is similar.


The next, easily proven, result is stronger than Theorems 1.A.15, 1.B.20, and 1.B.55.

Theorem 1.C.27. Let $X$ be any random variable. Then $X_{(-\infty, a]}$ and $X_{(a, \infty)}$ are increasing in $a$ in the sense of the likelihood ratio order.

A similar setting in which the order $\le_{\mathrm{hr}}$ gives rise to the order $\le_{\mathrm{lr}}$ is described in the following result.

Theorem 1.C.28. Let $X$, $Y$, and $T$ be random variables such that $T$ is independent of $(X, Y)$. If $X \le_{\mathrm{hr}} Y$, then $[T \mid T < X] \le_{\mathrm{lr}} [T \mid T < Y]$.

Proof. For simplicity assume that $T$ is absolutely continuous with density function $f_T$. Let $\bar F_X$ and $\bar F_Y$ be the survival functions of $X$ and $Y$. The density function of $[T \mid T < X]$ is proportional to $f_T \bar F_X$ and the density function of $[T \mid T < Y]$ is proportional to $f_T \bar F_Y$. The stated result now follows from (1.B.3).

An analog of the remark after Theorem 1.B.21 is the following result; its proof is straightforward.

Theorem 1.C.29. Let X be a nonnegative, absolutely continuous, random variable with the density function f. Then aX ≤lr X for all 0 < a < 1 if, and only if, $\log f(e^x)$ is concave in x ≥ 0.

In the next result it is shown that a random variable, whose distribution is the mixture of two distributions of likelihood ratio ordered random variables, is bounded from below and from above, in the likelihood ratio order sense, by these two random variables.

Theorem 1.C.30. Let X and Y be two random variables with distribution functions F and G, respectively. Let W be a random variable with the distribution function pF + (1 − p)G for some p ∈ (0, 1). If X ≤lr Y, then X ≤lr W ≤lr Y.

Proof. Let A and B be two measurable sets such that A ≤ B; see (1.C.3). If X ≤lr Y, then
P{X ∈ A}P{W ∈ B} = P{X ∈ A}(pP{X ∈ B} + (1 − p)P{Y ∈ B})
≥ P{X ∈ B}(pP{X ∈ A} + (1 − p)P{Y ∈ A})
= P{X ∈ B}P{W ∈ A},
where the inequality follows from (1.C.3). Thus, by (1.C.3), X ≤lr W. The proof that W ≤lr Y is similar.
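The sandwich X ≤lr W ≤lr Y of Theorem 1.C.30 can be checked numerically in a discrete setting. The following is only a sketch: the two probability mass functions, the mixing weight p = 1/3, and the helper names `is_lr_smaller` and `mixture` are hypothetical choices made for illustration, and the discrete cross-product test stands in for the density-ratio criterion.

```python
from fractions import Fraction


def is_lr_smaller(f, g):
    """Return True if pmf f is smaller than pmf g in the likelihood ratio order.

    Both pmfs live on the same ordered finite support. The cross-product
    form f[i] * g[j] >= f[j] * g[i] for i <= j avoids dividing by zero.
    """
    k = len(f)
    return all(f[i] * g[j] >= f[j] * g[i]
               for i in range(k) for j in range(i, k))


def mixture(f, g, p):
    """pmf of W, whose distribution function is p*F + (1 - p)*G."""
    return [p * a + (1 - p) * b for a, b in zip(f, g)]


# Hypothetical pmfs on {0, 1, 2, 3}; g/f = 1/4, 2/3, 3/2, 4 is increasing.
f = [Fraction(4, 10), Fraction(3, 10), Fraction(2, 10), Fraction(1, 10)]
g = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
w = mixture(f, g, Fraction(1, 3))

if __name__ == "__main__":
    print(is_lr_smaller(f, w), is_lr_smaller(w, g))
```

Exact rational arithmetic is used so that ties in the likelihood ratio are not misjudged by rounding.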


Analogous to the result in Remark 1.A.18, it can be shown that some general sets of distribution functions on R are lattices with respect to the order ≤lr.

Let X1, X2, ..., Xm be random variables, and let X(k:m) denote the corresponding kth order statistic, k = 1, 2, ..., m.

Theorem 1.C.31. Let X1, X2, ..., Xm be m independent random variables, all with absolutely continuous distribution functions, all having the same support which is an interval of the real line, and all having differentiable densities.
(a) If X1 ≤lr X2 ≤lr · · · ≤lr Xm, then
X(k−1:m) ≤lr X(k:m), 2 ≤ k ≤ m,
and
X(k−1:m−1) ≤lr X(k:m), 2 ≤ k ≤ m.
(b) If X1 ≥lr X2 ≥lr · · · ≥lr Xm, then
X(k:m) ≤lr X(k:m−1), 1 ≤ k ≤ m − 1.

A similar result for a finite population is the following. Consider a finite population of size N which is linearly ordered, and suppose, without loss of generality, that it can be represented as {1, 2, ..., N}. Here let X(1) ≤ X(2) ≤ · · · ≤ X(m) denote the order statistics corresponding to a simple random sample of size m from this population.

Theorem 1.C.32. Let X(1) ≤ X(2) ≤ · · · ≤ X(m) be defined as in the preceding paragraph. Then X(1) ≤lr X(2) ≤lr · · · ≤lr X(m).

Proof. For each k ∈ {1, 2, ..., m}, let $f_k$ denote the discrete density of X(k). Then
\[ f_k(j) = \begin{cases} \dfrac{\binom{j-1}{k-1}\binom{N-j}{m-k}}{\binom{N}{m}}, & j = k, k+1, \dots, k+N-m; \\[1ex] 0, & \text{otherwise.} \end{cases} \]
Therefore, for k ∈ {1, 2, ..., m − 1}, we have
\[ \frac{f_{k+1}(j)}{f_k(j)} = \begin{cases} 0, & j = k; \\[0.5ex] \dfrac{(m-k)(j-k)}{k(N-j-m+k+1)}, & j = k+1, k+2, \dots, k+N-m; \\[1ex] \infty, & j = k+N-m+1. \end{cases} \]
This is increasing in j, and therefore X(k) ≤lr X(k+1).
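The discrete densities in the proof of Theorem 1.C.32 are easy to tabulate, so the monotone ratio can be confirmed directly. The sketch below assumes a hypothetical population size N = 10 and sample size m = 4; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction
from math import comb


def order_stat_pmf(k, m, N):
    """pmf of X_(k) for a simple random sample of size m from {1, ..., N}.

    Returns a list indexed by j - 1 for j = 1, ..., N.
    """
    total = comb(N, m)
    return [Fraction(comb(j - 1, k - 1) * comb(N - j, m - k), total)
            for j in range(1, N + 1)]


def is_lr_smaller(f, g):
    """Cross-product test for f <=lr g on a common ordered finite support."""
    size = len(f)
    return all(f[a] * g[b] >= f[b] * g[a]
               for a in range(size) for b in range(a, size))


N, m = 10, 4  # hypothetical population and sample sizes
pmfs = {k: order_stat_pmf(k, m, N) for k in range(1, m + 1)}

if __name__ == "__main__":
    for k in range(1, m):
        print(k, is_lr_smaller(pmfs[k], pmfs[k + 1]))
```

The cross-product form handles the endpoints j = k and j = k + N − m + 1, where the ratio is 0 or infinite, without any special casing.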


Under some conditions the likelihood ratio order is closed under the formation of order statistics. As above, let X(j:m) denote the jth order statistic associated with the random variables X1, X2, ..., Xm, and let Y(i:n) denote similarly the ith order statistic associated with the random variables Y1, Y2, ..., Yn.

Theorem 1.C.33. Let X1, X2, ..., Xm be m independent random variables, and let Y1, Y2, ..., Yn be other n independent random variables. If
Xj ≤lr Yi for all 1 ≤ j ≤ m and 1 ≤ i ≤ n,
then
X(j:m) ≤lr Y(i:n) whenever j ≤ i and m − j ≥ n − i.

Proof. First we give the proof when X1, X2, ..., Xm and Y1, Y2, ..., Yn all have absolutely continuous distribution functions. In this proof we use an idea of Chan, Proschan, and Sethuraman [123]. Let $f_j$, $F_j$, and $\bar F_j \equiv 1 - F_j$ denote the density, distribution, and survival functions of $X_j$. Similarly, let $g_i$, $G_i$, and $\bar G_i$ denote the density, distribution, and survival functions of $Y_i$. The density functions of X(j:m) and Y(i:n) are given by
\[ f_{X_{(j:m)}}(t) = \sum_{\pi} f_{\pi_1}(t) F_{\pi_2}(t) \cdots F_{\pi_j}(t) \bar F_{\pi_{j+1}}(t) \cdots \bar F_{\pi_m}(t), \]
and
\[ g_{Y_{(i:n)}}(t) = \sum_{\sigma} g_{\sigma_1}(t) G_{\sigma_2}(t) \cdots G_{\sigma_i}(t) \bar G_{\sigma_{i+1}}(t) \cdots \bar G_{\sigma_n}(t), \]
where $\sum_{\pi}$ signifies the sum over all permutations π = (π1, π2, ..., πm) of (1, 2, ..., m), and $\sum_{\sigma}$ similarly denotes the sum over all permutations σ = (σ1, σ2, ..., σn) of (1, 2, ..., n). Write
\[ \frac{g_{Y_{(i:n)}}(t)}{f_{X_{(j:m)}}(t)} = \frac{\sum_{\sigma} g_{\sigma_1}(t) G_{\sigma_2}(t) \cdots G_{\sigma_i}(t) \bar G_{\sigma_{i+1}}(t) \cdots \bar G_{\sigma_n}(t)}{\sum_{\pi} f_{\pi_1}(t) F_{\pi_2}(t) \cdots F_{\pi_j}(t) \bar F_{\pi_{j+1}}(t) \cdots \bar F_{\pi_m}(t)}. \tag{1.C.12} \]
Now, for any choice of a permutation π of (1, 2, ..., m) and a permutation σ of (1, 2, ..., n) we have
\[ \frac{g_{\sigma_1}(t) G_{\sigma_2}(t) \cdots G_{\sigma_i}(t) \bar G_{\sigma_{i+1}}(t) \cdots \bar G_{\sigma_n}(t)}{f_{\pi_1}(t) F_{\pi_2}(t) \cdots F_{\pi_j}(t) \bar F_{\pi_{j+1}}(t) \cdots \bar F_{\pi_m}(t)} = \frac{g_{\sigma_1}(t)}{f_{\pi_1}(t)} \times \frac{G_{\sigma_2}(t) \cdots G_{\sigma_j}(t)}{F_{\pi_2}(t) \cdots F_{\pi_j}(t)} \times \frac{\bar G_{\sigma_{i+1}}(t) \cdots \bar G_{\sigma_n}(t)}{\bar F_{\pi_{m-n+i+1}}(t) \cdots \bar F_{\pi_m}(t)} \times \frac{G_{\sigma_{j+1}}(t) \cdots G_{\sigma_i}(t)}{\bar F_{\pi_{j+1}}(t) \cdots \bar F_{\pi_{m-n+i}}(t)}. \]


Since $X_{\pi_1} \le_{\mathrm{lr}} Y_{\sigma_1}$ we see from (1.C.1) that the first fraction above is increasing in t. From $X_{\pi_k} \le_{\mathrm{lr}} Y_{\sigma_k}$ and Theorem 1.C.1 it follows that $X_{\pi_k} \le_{\mathrm{rh}} Y_{\sigma_k}$; but that means that $G_{\sigma_k}(t)/F_{\pi_k}(t)$ is increasing in t, k = 2, ..., j, and therefore the second fraction above is increasing in t. Similarly, from $X_{\pi_{k+m-n}} \le_{\mathrm{lr}} Y_{\sigma_k}$ and Theorem 1.C.1 it also follows that $X_{\pi_{k+m-n}} \le_{\mathrm{hr}} Y_{\sigma_k}$; but that means that $\bar G_{\sigma_k}(t)/\bar F_{\pi_{k+m-n}}(t)$ is increasing in t, k = i + 1, ..., n, and therefore the third fraction above is increasing in t. The fourth fraction above obviously increases in t too, and thus the whole product increases in t.

Note that if $a_1, a_2, \dots, a_m$ and $b_1, b_2, \dots, b_n$ are all nonnegative univariate functions, such that $a_j(t)/b_i(t)$ is increasing in t for all 1 ≤ j ≤ m and 1 ≤ i ≤ n, then $\sum_{j=1}^{m} a_j(t) / \sum_{i=1}^{n} b_i(t)$ is also increasing in t. It follows from this fact, and from (1.C.12), that $g_{Y_{(i:n)}}(t)/f_{X_{(j:m)}}(t)$ is increasing in t, and from (1.C.1) we obtain the stated result.

The result for the case when the random variables do not necessarily have absolutely continuous distribution functions follows from the above proof and the closure of the likelihood ratio order under weak convergence (Theorem 1.C.7).

Some of the results that are described in the following pages are stated in the literature (see Section 1.E) only for random variables with absolutely continuous distribution functions. However, by the closure of the likelihood ratio order under weak convergence (Theorem 1.C.7) these results are true also for random variables that do not necessarily have absolutely continuous distribution functions. As a corollary of Theorem 1.C.33 we obtain the following result. Corollary 1.C.34. Let X1 , X2 , . . . , Xm be m independent random variables and let Y1 , Y2 , . . . , Ym be other m independent random variables. If Xj ≤lr Yi , for all choices of i and j, then X(k) ≤lr Y(k) , k = 1, 2, . . . , m. Example 1.C.35. Let X and Y be two independent random variables. If X ≤lr Y , then min{X, Y } ≤lr Y and X ≤lr max{X, Y }. Example 1.C.36. Let X, Y , and Z be three independent random variables. If X ≤lr Y ≤lr Z, then min{X, Y } ≤lr min{Y, Z} and max{X, Y } ≤lr max{Y, Z}. By letting all the Xj ’s and Yi ’s in Theorem 1.C.33 be identically distributed we obtain the following result. Theorem 1.C.37. For positive integers m and n, let X1 , X2 , . . . , Xmax{m,n} be independent identically distributed random variables. Then X(j:m) ≤lr X(i:n)

whenever j ≤ i and m − j ≥ n − i.
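For a concrete feel for the index condition in Theorem 1.C.37, one can take the common distribution to be Uniform(0, 1), an assumption made here purely for illustration. The jth order statistic of m such variables then has density $m\binom{m-1}{j-1}x^{j-1}(1-x)^{m-j}$, and the density ratio can be scanned on a grid. The function names and the grid are hypothetical.

```python
from math import comb


def uniform_order_stat_density(x, j, m):
    """Density at x in (0, 1) of the j-th order statistic of m i.i.d. Uniform(0, 1)."""
    return m * comb(m - 1, j - 1) * x ** (j - 1) * (1 - x) ** (m - j)


def lr_ratio_increasing(j, m, i, n, grid):
    """Check on the grid that density(X_(i:n)) / density(X_(j:m)) is nondecreasing."""
    ratios = [uniform_order_stat_density(x, i, n) / uniform_order_stat_density(x, j, m)
              for x in grid]
    return all(a <= b * (1 + 1e-12) for a, b in zip(ratios, ratios[1:]))


grid = [k / 200 for k in range(1, 200)]  # interior points of (0, 1)

if __name__ == "__main__":
    # j <= i and m - j >= n - i holds for (j, m, i, n) = (2, 5, 3, 4).
    print(lr_ratio_increasing(2, 5, 3, 4, grid))
    # The condition fails for (3, 4, 2, 5).
    print(lr_ratio_increasing(3, 4, 2, 5, grid))
```

A grid check of this kind can only suggest monotonicity; in the uniform case the ratio is proportional to $x^{i-j}(1-x)^{(n-i)-(m-j)}$, which makes the role of the two index conditions explicit.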

In particular, it follows from Theorem 1.C.37 that
\[ X_1 \le_{\mathrm{lr}} X_{(m:m)}, \quad m = 2, 3, \dots \tag{1.C.13} \]
and
\[ X_1 \ge_{\mathrm{lr}} X_{(1:m)}, \quad m = 2, 3, \dots. \tag{1.C.14} \]

Note that (1.C.13) and (1.C.14) can also be obtained by induction from Example 1.C.35. The following two corollaries of Theorem 1.C.37 can be compared to Theorems 1.B.27 and 1.B.28.

Corollary 1.C.38. Let X1, X2, ..., Xm be independent identically distributed random variables. Then X(k−1:m−1) ≤lr X(k:m) for k = 2, 3, ..., m.

Corollary 1.C.39. Let X1, X2, ..., Xm be independent identically distributed random variables. Then X(k:m−1) ≥lr X(k:m) for k = 1, 2, ..., m − 1.

Remark 1.C.40. The likelihood ratio order can be used to provide a proof of Theorem 1.B.26. Let X1, X2, ..., Xm be independent nonnegative random variables, and let X(1) ≤ X(2) ≤ · · · ≤ X(m) denote the corresponding order statistics. Fix s and t such that 0 ≤ s ≤ t. For j = 1, 2, ..., m, define $M_j = 1$ if $X_j \le s$, and $M_j = 0$ if $X_j > s$, and also define $N_j = 1$ if $X_j \le t$, and $N_j = 0$ if $X_j > t$. Denote $M = \sum_{j=1}^{m} M_j$ and $N = \sum_{j=1}^{m} N_j$. Note that, for j = 1, 2, ..., m, we have
\[ P\{M < j\} = P\{X_{(j)} > s\} \quad \text{and} \quad P\{N < j\} = P\{X_{(j)} > t\}. \]
Since $P\{M_j = 1\} = P\{X_j \le s\} \le P\{X_j \le t\} = P\{N_j = 1\}$ it is easily seen that $M_j \le_{\mathrm{lr}} N_j$, j = 1, 2, ..., m. Also, obviously, $M_j$ and $N_j$ have logconcave discrete density functions. Thus, from Theorem 1.C.9 it is seen that M ≤lr N. Therefore, by Theorem 1.C.1, M ≤rh N. Thus, from (1.B.44), we get that
\[ \frac{P\{N < j\}}{P\{M < j\}} \quad \text{is increasing in } j \ge 1. \]
Therefore, for k such that 1 ≤ k ≤ m − 1 we have
\[ \frac{P\{X_{(k)} > t\}}{P\{X_{(k)} > s\}} = \frac{P\{N < k\}}{P\{M < k\}} \le \frac{P\{N < k+1\}}{P\{M < k+1\}} = \frac{P\{X_{(k+1)} > t\}}{P\{X_{(k+1)} > s\}}. \]
From (1.B.3) it thus follows that X(k) ≤hr X(k+1).

Remark 1.C.41. The likelihood ratio order can be used to provide a proof of Theorem 1.B.36. Let the Xi's and the Yj's be as in that theorem. Assume that Xi ≤hr Yj for all i, j. We first show that there exists a random variable Z with support (a, b) such that Xi ≤hr Z ≤hr Yj for all i, j. Let $r_{X_i}$ and $r_{Y_j}$ denote the hazard rate functions of the indicated random variables. From the assumption that Xi ≤hr Yj for all i, j it follows by (1.B.2) that
\[ \min\{r_{X_1}(t), r_{X_2}(t), \dots, r_{X_m}(t)\} \ge \max\{r_{Y_1}(t), r_{Y_2}(t), \dots, r_{Y_n}(t)\}, \quad t \in (a, b). \]


Let q be a function which satisfies
\[ \min\{r_{X_1}(t), r_{X_2}(t), \dots, r_{X_m}(t)\} \ge q(t) \ge \max\{r_{Y_1}(t), r_{Y_2}(t), \dots, r_{Y_n}(t)\}, \quad t \in (a, b); \]
for example, let $q(t) = \min\{r_{X_1}(t), r_{X_2}(t), \dots, r_{X_m}(t)\}$. It can be shown that q is indeed a hazard rate function. Let Z be a random variable with the hazard rate function q. Then indeed Xi ≤hr Z ≤hr Yj for all i, j. Now, let $Z_1, Z_2, \dots, Z_{\max\{m,n\}}$ be independent random variables which are distributed as Z. Then, for i ≤ j and m − i ≥ n − j we have
X(i:m) ≤hr Z(i:m) (by Proposition 1.B.35)
≤lr Z(j:n) (by Theorem 1.C.37)
≤hr Y(j:n) (by Proposition 1.B.35),

and Theorem 1.B.36 follows from the fact that the likelihood ratio order implies the hazard rate order.

Recall that for a collection X1, X2, ..., Xm of nonnegative random variables, the spacings are defined by U(i) ≡ X(i) − X(i−1), i = 1, 2, ..., m, where X(0) ≡ 0. The following result may be compared with Theorems 1.A.19, 1.A.21, and 1.B.31.

Theorem 1.C.42. Let X1, X2, ..., Xm be independent exponential random variables with possibly different parameters. Then
\[ U_{(1)} \le_{\mathrm{lr}} \frac{m-i+1}{m} \cdot U_{(i)}, \quad i = 1, 2, \dots, m. \]

It is worth mentioning that Kochar and Kirmani [313] claimed that if X1, X2, ..., Xm are independent and identically distributed random variables with a common logconvex density, then U(i) ≤lr ((m − i)/(m − i + 1))U(i+1) for i = 1, 2, ..., m − 1. However, Misra and van der Meulen [396] showed via a counterexample that this is not correct.

For spacings that are not "normalized" we have the following results. We denote by U(i:m) = X(i:m) − X(i−1:m) the ith spacing that corresponds to a sample X1, X2, ..., Xm of size m.

Theorem 1.C.43. Let X1, X2, ..., Xm, Xm+1 be independent, identically distributed, nonnegative random variables with a common logconvex density. Then
U(i:m) ≤lr U(i+1:m), 1 ≤ i ≤ m − 1,
U(i:m+1) ≤lr U(i:m), 1 ≤ i ≤ m,
and
U(i:m) ≤lr U(i+1:m+1), 1 ≤ i ≤ m.


Note that the three statements of the above theorem can be summarized as U(j:m) ≤lr U(i:n)

whenever i − j ≥ max{0, n − m}.

We also have the following result. Theorem 1.C.44. Let X1 , X2 , . . . , Xm , Xm+1 be independent, identically distributed, nonnegative random variables with a common logconcave density. Then U(i:m) ≥lr U(i+1:m+1) , 1 ≤ i ≤ m. A comparison of spacings from two diﬀerent samples, that is similar to Theorem 1.B.32, is described next. In fact, it will be argued after the next theorem that the next result strengthens Theorem 1.B.31. Here U(i:m) = X(i:m) − X(i−1:m) denotes, as before, the ith spacing that corresponds to the sample X1 , X2 , . . . , Xm , and V(j:n) denotes, similarly, the jth spacing that corresponds to the sample Y1 , Y2 , . . . , Yn . Other results which give related comparisons can be found in Theorem 4.B.17 and in Examples 6.B.25 and 6.E.15. Theorem 1.C.45. For positive integers m and n, let X1 , X2 , . . . , Xm be independent identically distributed random variables with an absolutely continuous common distribution function, and let Y1 , Y2 , . . . , Yn be independent identically distributed random variables with a possibly diﬀerent absolutely continuous common distribution function. If X1 ≤lr Y1 , and if either X1 or Y1 is DFR, then (m − j + 1)U(j:m) ≤hr (n − i + 1)V(i:n)

whenever i − j ≥ max{0, n − m}.

Taking X1 =st Y1 in Theorem 1.C.45 it is seen that Theorem 1.B.31 is a consequence of Theorem 1.C.45.

In the following example it is shown that, under the proper conditions, random minima and maxima are ordered in the likelihood ratio order sense; see related results in Examples 3.B.39, 4.B.16, 5.A.24 and 5.B.13.

Example 1.C.46. Let X1, X2, ... be a sequence of absolutely continuous nonnegative independent and identically distributed random variables with a common distribution function $F_{X_1}$ and a common density function $f_{X_1}$. Let N1 and N2 be two positive integer-valued random variables which are independent of the Xi's. Denote $X_{(1:N_j)} \equiv \min\{X_1, X_2, \dots, X_{N_j}\}$ and $X_{(N_j:N_j)} \equiv \max\{X_1, X_2, \dots, X_{N_j}\}$, j = 1, 2. Then the density function of $X_{(N_j:N_j)}$ is given by
\[ f_{X_{(N_j:N_j)}}(x) = \sum_{n=1}^{\infty} n F_{X_1}^{n-1}(x) f_{X_1}(x) P\{N_j = n\}, \quad x \ge 0,\ j = 1, 2. \]
If N1 ≤lr N2, then $P\{N_j = n\}$ is TP2 in n ≥ 1 and j ∈ {1, 2}. Also, $n F_{X_1}^{n-1}(x) f_{X_1}(x)$ is TP2 in n ≥ 1 and x ≥ 0. Therefore, by the Basic Composition Formula (Karlin [275]) it follows that $f_{X_{(N_j:N_j)}}(x)$ is TP2 in x ≥ 0 and j ∈ {1, 2}. That is, $X_{(N_1:N_1)} \le_{\mathrm{lr}} X_{(N_2:N_2)}$. In a similar fashion it can be shown also that $X_{(1:N_1)} \ge_{\mathrm{lr}} X_{(1:N_2)}$.

Example 1.C.47. Let {N(t), t ≥ 0} be a nonhomogeneous Poisson process with mean function Λ (that is, Λ(t) ≡ E[N(t)], t ≥ 0), and let T1, T2, ... be the successive epoch times. The survival function of $T_n$ is given by
\[ P\{T_n > t\} = \sum_{i=0}^{n-1} \frac{(\Lambda(t))^i}{i!} e^{-\Lambda(t)}, \quad t \ge 0, \]
and the density function of $T_n$ is given by
\[ f_n(t) = \lambda(t) \frac{(\Lambda(t))^{n-1}}{(n-1)!} e^{-\Lambda(t)}, \quad t \ge 0, \]
where $\lambda(t) \equiv \frac{d}{dt}\Lambda(t)$, n = 1, 2, .... It is easy to verify that $f_{n+1}(t)/f_n(t)$ is increasing in t ≥ 0, n = 1, 2, ..., and therefore
\[ T_n \le_{\mathrm{lr}} T_{n+1}, \quad n = 1, 2, \dots. \]
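The monotone ratio in Example 1.C.47 is explicit: dividing the two densities gives $f_{n+1}(t)/f_n(t) = \Lambda(t)/n$, which is increasing because Λ is. The short sketch below checks this identity numerically for a hypothetical mean function $\Lambda(t) = t^2$; the mean function, the grid, and the function names are assumptions made for illustration.

```python
import math


def epoch_density(t, n, mean_fn, rate_fn):
    """Density of the n-th epoch time T_n of a nonhomogeneous Poisson process.

    mean_fn is the mean function Lambda and rate_fn is its derivative lambda.
    """
    big_lambda = mean_fn(t)
    return rate_fn(t) * big_lambda ** (n - 1) / math.factorial(n - 1) * math.exp(-big_lambda)


def mean_fn(t):
    """Hypothetical mean function Lambda(t) = t^2."""
    return t ** 2


def rate_fn(t):
    """Intensity lambda(t) = d/dt Lambda(t) = 2t."""
    return 2 * t


grid = [0.1 * k for k in range(1, 51)]  # t in (0, 5]

if __name__ == "__main__":
    for n in (1, 2, 3):
        ratios = [epoch_density(t, n + 1, mean_fn, rate_fn) / epoch_density(t, n, mean_fn, rate_fn)
                  for t in grid]
        print(n, all(a <= b for a, b in zip(ratios, ratios[1:])))
```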

Theorem 2.6 on page 182 of Kamps [273] extends Example 1.C.47 (as it extends Theorem 1.C.45) to the so called generalized order statistics. A further extension is described in Franco, Ruiz, and Ruiz [205].

The following example may be compared to Examples 1.B.24, 2.A.22, 3.B.38, 4.B.14, 6.B.41, 6.D.8, 6.E.13, and 7.B.13.

Example 1.C.48. Let X and Y be two absolutely continuous nonnegative random variables with survival functions $\bar F$ and $\bar G$ and density functions f and g, respectively. Denote $\Lambda_1 = -\log \bar F$, $\Lambda_2 = -\log \bar G$, and $\lambda_i = \Lambda_i'$, i = 1, 2. Consider two nonhomogeneous Poisson processes $N_1 = \{N_1(t), t \ge 0\}$ and $N_2 = \{N_2(t), t \ge 0\}$ with mean functions $\Lambda_1$ and $\Lambda_2$ (see Example 1.B.13), respectively. Let $T_{i,1}, T_{i,2}, \dots$ be the successive epoch times of process $N_i$, i = 1, 2. Note that $X =_{\mathrm{st}} T_{1,1}$ and $Y =_{\mathrm{st}} T_{2,1}$.

It turns out that, under some conditions, the likelihood ratio ordering of the first two epoch times implies the likelihood ratio ordering of all the corresponding later epoch times. Explicitly, it will be shown below that if X ≤lr Y, and if
\[ \frac{\Lambda_2(t)}{\Lambda_1(t)} \quad \text{is increasing in } t \ge 0, \tag{1.C.15} \]
then $T_{1,n} \le_{\mathrm{lr}} T_{2,n}$, n ≥ 1. From (1.B.24) it is easy to see that the density function $f_{1,n}$ of $T_{1,n}$ is given by
\[ f_{1,n}(t) = f(t) \frac{(\Lambda_1(t))^{n-1}}{(n-1)!}, \quad t \ge 0,\ n \ge 1, \]
and that the density function $f_{2,n}$ of $T_{2,n}$ is given by
\[ f_{2,n}(t) = g(t) \frac{(\Lambda_2(t))^{n-1}}{(n-1)!}, \quad t \ge 0,\ n \ge 1. \]
Thus,
\[ \frac{f_{2,n}(t)}{f_{1,n}(t)} = \frac{g(t)}{f(t)} \left( \frac{\Lambda_2(t)}{\Lambda_1(t)} \right)^{n-1}. \]


Now, if X ≤lr Y and (1.C.15) holds, then $f_{2,n}/f_{1,n}$ is increasing and we obtain $T_{1,n} \le_{\mathrm{lr}} T_{2,n}$.

Now let $X_{i,n} \equiv T_{i,n} - T_{i,n-1}$, n ≥ 1 (where $T_{i,0} \equiv 0$), be the inter-epoch times of the process $N_i$, i = 1, 2. Again, note that $X =_{\mathrm{st}} X_{1,1}$ and $Y =_{\mathrm{st}} X_{2,1}$. It turns out that, under some conditions, the likelihood ratio ordering of the first two inter-epoch times implies the likelihood ratio ordering of all the corresponding later inter-epoch times. Explicitly, it will be shown below that if X ≤hr Y, if f and g are logconvex, and if (1.B.25) holds, then $X_{1,n} \le_{\mathrm{lr}} X_{2,n}$ for each n ≥ 1.

First note that by Theorem 1.C.4 we have X ≤lr Y. For the purpose of the following proof we denote f by $f_1$ and g by $f_2$. Let $g_{i,n}$ denote the density function of $X_{i,n}$, i = 1, 2. The stated result is obvious for n = 1, so let us fix an n ≥ 2. From (1.B.26) we obtain
\[ g_{i,n}(t) = \int_0^{\infty} \lambda_i(s) \frac{\Lambda_i^{n-2}(s)}{(n-2)!} f_i(s+t)\, ds, \quad t \ge 0,\ i = 1, 2. \]
As in Example 1.B.24, we have that
\[ \lambda_i(t) \frac{\Lambda_i^{n-2}(t)}{(n-2)!} \quad \text{is TP}_2 \text{ in } (i, t). \]
The assumption $F_1 \le_{\mathrm{lr}} F_2$ implies that $f_i(s+t)$ is TP2 in (i, s) and in (i, t). Finally, the logconvexity of $f_1$ and of $f_2$ means that $f_i(s+t)$ is TP2 in (s, t). Thus, by Theorem 5.1 on page 123 of Karlin [275], we get that $g_{i,n}(t)$ is TP2 in (i, t); that is, $X_{1,n} \le_{\mathrm{lr}} X_{2,n}$.

The following neat example compares a sum of independent heterogeneous exponential random variables with an Erlang random variable; it is of interest to compare it with Examples 1.A.24 and 1.B.5. We do not give the proof here.

Example 1.C.49. Let $X_i$ be an exponential random variable with mean $\lambda_i^{-1} > 0$, i = 1, 2, ..., m, and assume that the $X_i$'s are independent. Let $Y_i$, i = 1, 2, ..., m, be independent, identically distributed, exponential random variables with mean $\eta^{-1}$. Then
\[ \sum_{i=1}^{m} X_i \ge_{\mathrm{lr}} \sum_{i=1}^{m} Y_i \iff \frac{\lambda_1 + \lambda_2 + \cdots + \lambda_m}{m} \le \eta. \]
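The threshold in Example 1.C.49 can be seen numerically for m = 2, where both densities have simple closed forms: for distinct rates the sum $X_1 + X_2$ has density $\frac{\lambda_1\lambda_2}{\lambda_2-\lambda_1}(e^{-\lambda_1 t} - e^{-\lambda_2 t})$, and $Y_1 + Y_2$ has the Erlang density $\eta^2 t e^{-\eta t}$. The sketch below uses hypothetical rates 1 and 3 (arithmetic mean 2); the function names and the grid are illustrative assumptions.

```python
import math


def hypoexp2_density(t, lam1, lam2):
    """Density of X1 + X2 for independent exponentials with distinct rates lam1, lam2."""
    return lam1 * lam2 / (lam2 - lam1) * (math.exp(-lam1 * t) - math.exp(-lam2 * t))


def erlang2_density(t, eta):
    """Density of Y1 + Y2 for i.i.d. exponentials with rate eta."""
    return eta ** 2 * t * math.exp(-eta * t)


def sum_dominates_erlang(lam1, lam2, eta, grid):
    """Grid check that density(X1 + X2) / density(Y1 + Y2) is nondecreasing."""
    ratios = [hypoexp2_density(t, lam1, lam2) / erlang2_density(t, eta) for t in grid]
    return all(a <= b * (1 + 1e-9) for a, b in zip(ratios, ratios[1:]))


grid = [0.05 * k for k in range(1, 201)]  # t in (0, 10]

if __name__ == "__main__":
    for eta in (1.5, 2.0, 2.5):
        print(eta, sum_dominates_erlang(1.0, 3.0, eta, grid))
```

With η = 2 the ratio reduces to a constant multiple of sinh(t)/t, which is increasing; with η = 1.5, below the mean rate, the ratio already decreases near t = 0.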

A related example is the following. Recall from page 2 the deﬁnition of the majorization order ≺ among n-dimensional vectors. It is of interest to compare the example below with Example 3.B.34.


Example 1.C.50. Let $X_i$ be an exponential random variable with mean $\lambda_i^{-1} > 0$, i = 1, 2, ..., m, and let $Y_i$ be an exponential random variable with mean $\eta_i^{-1} > 0$, i = 1, 2, ..., m. If $(\lambda_1, \lambda_2, \dots, \lambda_m) \succ (\eta_1, \eta_2, \dots, \eta_m)$, then
\[ \sum_{i=1}^{m} X_i \ge_{\mathrm{lr}} \sum_{i=1}^{m} Y_i. \]

The next example may be compared with Examples 1.A.25, 1.B.6, and 4.A.45.

Example 1.C.51. Let $X_i$ be a binomial random variable with parameters $n_i$ and $p_i$, i = 1, 2, ..., m, and assume that the $X_i$'s are independent. Let Y be a binomial random variable with parameters n and p where $n = \sum_{i=1}^{m} n_i$. Then
\[ \sum_{i=1}^{m} X_i \ge_{\mathrm{lr}} Y \iff p \le \frac{n}{\sum_{i=1}^{m} (n_i/p_i)}, \]
and
\[ \sum_{i=1}^{m} X_i \le_{\mathrm{lr}} Y \iff 1 - p \le \frac{n}{\sum_{i=1}^{m} (n_i/(1 - p_i))}. \]
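Both thresholds in Example 1.C.51 are weighted harmonic means, and they can be checked exactly with rational arithmetic. The following sketch uses hypothetical parameters $(n_1, p_1) = (2, 1/5)$ and $(n_2, p_2) = (3, 3/5)$, for which the first threshold is p ≤ 1/3 and the second is p ≥ 1/2; all function names are illustrative.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """pmf of a Binomial(n, p) variable as a list indexed by k = 0, ..., n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def convolve(a, b):
    """pmf of the sum of two independent nonnegative integer-valued variables."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def is_lr_smaller(f, g):
    """Cross-product test for f <=lr g on a common ordered finite support."""
    size = len(f)
    return all(f[a] * g[b] >= f[b] * g[a]
               for a in range(size) for b in range(a, size))


# Hypothetical parameters chosen for illustration.
ns = [2, 3]
ps = [Fraction(1, 5), Fraction(3, 5)]
n = sum(ns)

sum_pmf = [Fraction(1)]
for n_i, p_i in zip(ns, ps):
    sum_pmf = convolve(sum_pmf, binomial_pmf(n_i, p_i))

p_upper = n / sum(Fraction(n_i) / p_i for n_i, p_i in zip(ns, ps))
q_upper = n / sum(Fraction(n_i) / (1 - p_i) for n_i, p_i in zip(ns, ps))

if __name__ == "__main__":
    print(p_upper, 1 - q_upper)
    print(is_lr_smaller(binomial_pmf(n, p_upper), sum_pmf))
    print(is_lr_smaller(sum_pmf, binomial_pmf(n, 1 - q_upper)))
```

At the boundary values the likelihood ratio has a tie at one end of the support, which is why exact fractions are preferable to floating point here.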

The order ≤lr can be used to characterize random variables with logconcave densities. The next result lists several such characterizations. It shows that logconcavity can be interpreted as an aging notion in reliability theory by a correct use of the likelihood ratio ordering. This theorem may be compared to Theorem 1.B.38.

Theorem 1.C.52. The random variable X has a logconcave density (that is, a Polya frequency of order 2 (PF2)) if, and only if, one of the following equivalent conditions holds:
(i) [X − t | X > t] ≥lr [X − t′ | X > t′] whenever t ≤ t′.
(ii) X ≥lr [X − t | X > t] for all t ≥ 0 (when X is a nonnegative random variable).
(iii) X + t ≤lr X + t′ whenever t ≤ t′.

Random variables that satisfy (i) in Theorem 1.C.52 (and hence any of the conditions of that theorem) are said to have the ILR (increasing likelihood ratio) property; see Section 13.D.2 by Righter in [515]. A multivariate extension of parts (i) and (ii) of Theorem 1.C.52 is given in Section 6.E.3.

Another connection between logconcavity and the likelihood ratio order is illustrated in the next result. It is worthwhile to compare the following result with Theorem 6.B.9 in Section 6.B.3.

Theorem 1.C.53. Let X1, X2, ..., Xm be independent random variables having logconcave density functions. Then
\[ \Bigl[ X_i \Bigm| \sum_{j=1}^{m} X_j = s \Bigr] \le_{\mathrm{lr}} \Bigl[ X_i \Bigm| \sum_{j=1}^{m} X_j = s' \Bigr] \quad \text{whenever } s \le s',\ i = 1, 2, \dots, m. \]

Proof. Since the convolution of logconcave density functions is logconcave, it is sufficient to prove the result for m = 2 and i = 1. Let $f_1$ and $f_2$ denote the density functions of $X_1$ and $X_2$, respectively. The conditional density of $X_1$, given $X_1 + X_2 = s$, is
\[ f_{X_1 | X_1 + X_2 = s}(x_1) = \frac{f_1(x_1) f_2(s - x_1)}{\int f_1(u) f_2(s - u)\, du}. \]
Thus,
\[ \frac{f_{X_1 | X_1 + X_2 = s'}(x_1)}{f_{X_1 | X_1 + X_2 = s}(x_1)} = \frac{f_2(s' - x_1) \int f_1(u) f_2(s - u)\, du}{f_2(s - x_1) \int f_1(u) f_2(s' - u)\, du}. \tag{1.C.16} \]
The logconcavity of $f_2$ implies that the expression in (1.C.16) increases in $x_1$, whenever s ≤ s′. By (1.C.1) the proof is complete.

Theorems 1.C.52 and 1.C.53 have straightforward discrete analogs, which we do not state here. A few other properties of the order ≤lr can be found in Lemma 13.D.1 in Chapter 13 by Righter, and in (14.B.7) in Chapter 14 by Shanthikumar and Yao, in [515].

An interesting closure property of logconcave density functions is described in the following result.

Theorem 1.C.54. Let X1, X2, ..., Xm be independent, identically distributed random variables with a common logconcave density function. Then the ith order statistic X(i:m) also has a logconcave density function, 1 ≤ i ≤ m.

Proof. Let f, F, and $\bar F$ denote, respectively, the density, distribution, and survival function of $X_1$. Then the density function of X(i:m) is given by
\[ f_{(i:m)}(x) = m \binom{m-1}{i-1} F^{i-1}(x) f(x) \bar F^{m-i}(x). \]
Since the logconcavity of f implies the logconcavity of F and of $\bar F$, it follows that $f_{(i:m)}$ is logconcave.

Misra and van der Meulen [396] showed the preservation of logconcavity and logconvexity from the parent density to the density of the corresponding spacings. The likelihood ratio order can be used to characterize some aging notions in reliability theory. Recall from (1.A.20) that for a nonnegative random variable X with a ﬁnite mean we denote by AX the corresponding asymptotic equilibrium age. Recall from page 1 the deﬁnitions of the IFR and the DFR properties. The following result is immediate. It is of interest to contrast it with Theorems 1.A.31 and 1.B.40


Theorem 1.C.55. The nonnegative random variable X with finite mean is IFR [DFR] if, and only if, X ≥lr [≤lr] $A_X$.

An interesting comparison of asymptotic equilibrium ages is described in the next example. Recall from page 1 the definition of the DMRL property.

Example 1.C.56. Let X and Y be two independent nonnegative DMRL random variables with survival functions $\bar F$ and $\bar G$, density functions f and g, and asymptotic equilibrium ages $A_X$ and $A_Y$, respectively. Let $A_{\min\{X,Y\}}$ denote the asymptotic equilibrium age of min{X, Y}. Then
\[ \min\{A_X, A_Y\} \le_{\mathrm{lr}} A_{\min\{X,Y\}}. \]
In order to see this, assume, for simplicity, that the supports of X and of Y are (0, ∞). Note that the density function of $\min\{A_X, A_Y\}$ is given by
\[ f_{\min\{A_X, A_Y\}}(t) = (EX\,EY)^{-1} \Bigl[ \bar F(t) \int_t^{\infty} \bar G(x)\, dx + \bar G(t) \int_t^{\infty} \bar F(x)\, dx \Bigr], \quad t \ge 0, \]
and the density function of $A_{\min\{X,Y\}}$ is given by
\[ f_{A_{\min\{X,Y\}}}(t) = \bigl( E[\min\{X, Y\}] \bigr)^{-1} \bar F(t) \bar G(t), \quad t \ge 0. \]
Therefore
\[ \frac{f_{A_{\min\{X,Y\}}}(t)}{f_{\min\{A_X, A_Y\}}(t)} = \frac{EX\,EY}{E[\min\{X, Y\}]} \bigl[ m(t) + l(t) \bigr]^{-1}, \quad t \ge 0, \]
where m and l are the mean residual life functions of X and of Y, given by m(t) = E[X − t | X > t] and l(t) = E[Y − t | Y > t], t ≥ 0. The functions m and l are decreasing by the DMRL assumptions, and therefore $\min\{A_X, A_Y\} \le_{\mathrm{lr}} A_{\min\{X,Y\}}$ by (1.C.1).

In the following example it is shown that if X is increasing in Θ in the likelihood ratio sense, then the posterior distribution of Θ is increasing in X in the same sense.

Example 1.C.57. Let X be a random variable whose distribution function depends on the real parameter Θ. Denote the prior density function of Θ by π, and denote the posterior density function of Θ, given X = x, by π*(· | x). Also, denote the conditional density of X, given Θ = θ, by f(· | θ), and denote the marginal density of X by g. If X is increasing in Θ in the likelihood ratio sense (that is, if [X | Θ = θ] ≤lr [X | Θ = θ′] whenever θ ≤ θ′), then Θ is increasing in X in the likelihood ratio sense (that is, [Θ | X = x] ≤lr [Θ | X = x′] whenever x ≤ x′). The proof of this statement is easy by noting that
\[ \pi^*(\theta \mid x) = \frac{f(x \mid \theta)\pi(\theta)}{g(x)}. \]
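A discrete analog of Example 1.C.57 can be computed exactly. The sketch below is an illustration under stated assumptions, not part of the example itself: Θ takes the hypothetical values 1/4, 1/2, 3/4 with a hypothetical prior, and, given Θ = θ, X is Binomial(4, θ), a family that is increasing in θ in the likelihood ratio sense. The posterior is formed by Bayes' formula as in the display above.

```python
from fractions import Fraction
from math import comb

# Hypothetical parameter values, prior weights, and number of trials.
thetas = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n = 4


def likelihood(x, theta):
    """Binomial(n, theta) probability of observing X = x."""
    return comb(n, x) * theta ** x * (1 - theta) ** (n - x)


def posterior(x):
    """Posterior pmf of Theta given X = x, listed in increasing order of theta."""
    joint = [likelihood(x, th) * pr for th, pr in zip(thetas, prior)]
    marginal = sum(joint)  # plays the role of g(x)
    return [v / marginal for v in joint]


def is_lr_smaller(f, g):
    """Cross-product test for f <=lr g on a common ordered finite support."""
    size = len(f)
    return all(f[a] * g[b] >= f[b] * g[a]
               for a in range(size) for b in range(a, size))


posteriors = [posterior(x) for x in range(n + 1)]

if __name__ == "__main__":
    for x in range(n):
        print(x, is_lr_smaller(posteriors[x], posteriors[x + 1]))
```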


An extension of Example 1.C.57 to the multivariate likelihood ratio order is given in Example 6.E.16.

Example 1.C.58. Let X be a random variable whose distribution function depends on the random parameter $\Theta_1$ or, in other circumstances, on the random parameter $\Theta_2$. Denote the prior density functions of $\Theta_1$ and $\Theta_2$ by $\pi_1$ and $\pi_2$, respectively, and denote the posterior density functions of $\Theta_1$ and $\Theta_2$, given X = x, by $\pi_1^*(\cdot \mid x)$ and $\pi_2^*(\cdot \mid x)$, respectively. Also, denote the conditional density of X, given $\Theta_1 = \theta$ or $\Theta_2 = \theta$, by f(· | θ), and denote the marginal density of X by $g_1$ or by $g_2$, according to whether X depends on $\Theta_1$ or on $\Theta_2$. Then, for any x, we have that
\[ \Theta_1 \le_{\mathrm{lr}} \Theta_2 \implies [\Theta_1 \mid X = x] \le_{\mathrm{lr}} [\Theta_2 \mid X = x]. \]
The proof of this statement is easy by noting that
\[ \pi_i^*(\theta \mid x) = \frac{f(x \mid \theta)\pi_i(\theta)}{g_i(x)}, \quad i = 1, 2. \]

Example 1.C.59. Recall from Example 1.B.23 that for a nonnegative random variable X with density function f, and for a nonnegative function w such that E[w(X)] exists, we denote by $X^w$ the random variable with the weighted density function $f_w$ given by
\[ f_w(x) = \frac{w(x) f(x)}{E[w(X)]}, \quad x \ge 0. \tag{1.C.17} \]
Similarly, for another nonnegative random variable Y with density function g, such that E[w(Y)] exists, we denote by $Y^w$ the random variable with the density function $g_w$ given by
\[ g_w(x) = \frac{w(x) g(x)}{E[w(Y)]}, \quad x \ge 0. \tag{1.C.18} \]
It is then obvious that
\[ X \le_{\mathrm{lr}} Y \implies X^w \le_{\mathrm{lr}} Y^w. \]

Example 1.C.60. Let X be a nonnegative random variable with density function f, and for a nonnegative function w such that E[w(X)] exists, let $X^w$ be the random variable with the weighted density function $f_w$ given in (1.C.17). It is then obvious that if w is increasing [decreasing], then X ≤lr [≥lr] $X^w$. In particular, the inequality X ≤lr $X^w$ holds when $X^w$ is the length-biased version of X; that is, when w(x) = x, x ≥ 0.

Example 1.C.61. Let the random variable X have a generalized skew normal distribution with parameters n and λ, that is, suppose that its density function is given by
\[ f(x; n, \lambda) = \frac{\Phi^n(\lambda x)\phi(x)}{C(n, \lambda)}, \quad x \in \mathbb{R}, \]


where φ and Φ are, respectively, the density and distribution functions of a standard normal random variable, and C(n, λ) is given by
\[ C(n, \lambda) = \int_{-\infty}^{\infty} \Phi^n(\lambda x)\phi(x)\, dx. \]

Let Y have a generalized skew normal distribution with parameters n1 and λ. It is easy to see that if λ > [ x ≤lr Y for all x ∈ (lX , uX ).


Another shifted likelihood ratio stochastic order is defined next. Let X and Y be two absolutely continuous random variables with support [0, ∞). Suppose that
X ≤lr [Y − x | Y > x] for all x ≥ 0.
Then X is said to be smaller than Y in the down shifted likelihood ratio order (denoted as X ≤lr↓ Y). Note that in the above definition only nonnegative random variables are compared. This is because for the down shifted likelihood ratio order it is not possible to take an analog of (1.C.19), such as X ≤lr Y − x, as a definition. The reason is that here, by taking x very large, it is seen that practically no random variables would satisfy such an order relation. Note that in the definition above, the right-hand side [Y − x | Y > x] can take on (as x varies) any value in the right neighborhood of 0. Therefore the support of the compared random variables is restricted here to be [0, ∞).

Let f and g denote the density functions of X and Y, respectively. An analog of (1.C.20) is the following:
\[ X \le_{\mathrm{lr}\downarrow} Y \iff \frac{g(t + x)}{f(t)} \text{ is increasing in } t \ge 0 \text{ for all } x \ge 0. \tag{1.C.21} \]

(A discrete version of the down shifted likelihood ratio order is defined and used in Section 6.B.3.) It is readily apparent that for nonnegative random variables with support [0, ∞) we have
X ≤lr↓ Y =⇒ X ≤lr Y.
We describe now some further properties of the down shifted likelihood ratio order.

Theorem 1.C.70. Let {Xj, j = 1, 2, ...} and {Yj, j = 1, 2, ...} be two sequences of random variables, with support [0, ∞), such that Xj →st X and Yj →st Y as j → ∞. If Xj ≤lr↓ Yj, j = 1, 2, ..., then X ≤lr↓ Y.

The following result is an analog of Theorem 1.C.63; however, it does not follow at once from Theorem 1.C.15. Its proof can be found in Lillo, Nanda, and Shaked [361].

Theorem 1.C.71. Let X, Y, and Θ be random variables such that [X | Θ = θ] and [Y | Θ = θ] are absolutely continuous and have the support [0, ∞) for all θ in the support of Θ. If [X | Θ = θ] ≤lr↓ [Y | Θ = θ′] for all θ and θ′ in the support of Θ, then X ≤lr↓ Y.

More properties are listed next.

Theorem 1.C.72. Let X and Y be two absolutely continuous random variables with support [0, ∞). If X or Y or both have logconvex densities on [0, ∞), and if X ≤lr Y, then X ≤lr↓ Y.


Theorem 1.C.73. Let X and Y be two absolutely continuous random variables with differentiable densities on their support [0, ∞). Then X ≤lr↓ Y if, and only if, there exists a random variable Z with a logconvex density on [0, ∞) such that X ≤lr Z ≤lr Y.

Theorem 1.C.74. Let X be an absolutely continuous random variable with support [0, ∞). Then X ≤lr↓ X if, and only if, f is logconvex on [0, ∞).

Theorem 1.C.75. Let X and Y be two absolutely continuous random variables with support [0, ∞). If X ≤lr↓ Y and if Y has a decreasing density function on [0, ∞), then φ(X) ≤lr↓ φ(Y) for any strictly increasing twice differentiable convex function φ : [0, ∞) → [0, ∞) (with first and second derivatives φ′ and φ″) such that $\phi''(x)/(\phi'(x))^2$ is decreasing.

Example 1.C.76. An interesting family of distribution functions, with associated random variables that are ordered in the down shifted likelihood ratio order, is the Pareto family. Explicitly, for θ ∈ (0, ∞), let $X_\theta$ be a random variable with density function $f_\theta$ defined by
\[ f_\theta(x) = \frac{\theta}{(1 + x)^{\theta + 1}}, \quad x \ge 0. \]

Then, by verifying (1.C.21), it is easy to see that Xθ1 ≤lr↓ Xθ2 whenever θ1 ≥ θ2 > 0. Some results that compare order statistics in the shifted likelihood ratio orders are described next. Again, X(j:m) denotes the jth order statistic associated with the random variables X1 , X2 , . . . , Xm , and Y(i:n) denotes the ith order statistic associated with the random variables Y1 , Y2 , . . . , Yn . An analog of Theorem 1.C.33 for the order ≤lr↑ is the following result. Note that in the following theorem the assumption is stronger than the assumption in Theorem 1.C.33, but so is the conclusion. Theorem 1.C.77. Let X1 , X2 , . . . , Xm be m independent random variables, and let Y1 , Y2 , . . . , Yn be other n independent random variables, all having absolutely continuous distributions. If Xj ≤lr↑ Yi for all 1 ≤ j ≤ m and 1 ≤ i ≤ n, then X(j:m) ≤lr↑ Y(i:n)

whenever j ≤ i and m − j ≥ n − i.

Proof. Fix an x ≥ 0 and denote by (X − x)(j:m) the jth order statistic among the random variables X1 − x, X2 − x, ..., Xm − x. By assumption we have Xj − x ≤lr Yi for all 1 ≤ j ≤ m and 1 ≤ i ≤ n. Therefore from Theorem 1.C.33 we get (X − x)(j:m) ≤lr Y(i:n) whenever j ≤ i and m − j ≥ n − i. Since x ≥ 0 was arbitrary, the stated result follows from the fact that (X − x)(j:m) = X(j:m) − x.
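Returning to the Pareto family of Example 1.C.76, criterion (1.C.21) can be probed numerically. The sketch below evaluates the ratio of the shifted density of the candidate larger variable to the density of the candidate smaller one on a grid, for a few shifts x ≥ 0. The particular values of θ, the shifts, the grid, and the function names are hypothetical choices, and a grid scan of this kind only suggests, rather than proves, monotonicity.

```python
def pareto_density(x, theta):
    """Pareto density theta / (1 + x)^(theta + 1) on [0, infinity)."""
    return theta / (1 + x) ** (theta + 1)


def down_shifted_lr(theta_small, theta_large, shifts, grid):
    """Grid check of (1.C.21): g(t + x) / f(t) nondecreasing in t for every shift x.

    f is the Pareto density with parameter theta_small (the candidate smaller
    variable) and g the Pareto density with parameter theta_large.
    """
    for x in shifts:
        ratios = [pareto_density(t + x, theta_large) / pareto_density(t, theta_small)
                  for t in grid]
        if any(a > b * (1 + 1e-12) for a, b in zip(ratios, ratios[1:])):
            return False
    return True


shifts = [0.0, 0.5, 1.0, 2.0, 5.0]
grid = [0.1 * k for k in range(0, 101)]  # t in [0, 10]

if __name__ == "__main__":
    # theta_1 = 3 >= theta_2 = 1.5, so X_3 should be the smaller variable.
    print(down_shifted_lr(3.0, 1.5, shifts, grid))
    print(down_shifted_lr(1.5, 3.0, shifts, grid))
```

Taking both parameters equal also passes the check, in line with Theorem 1.C.74, since the Pareto density is logconvex.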

For the down shifted likelihood ratio order, the method of proof used in the proof of Theorem 1.C.33 only yields comparisons of minima as described in the following result.


Theorem 1.C.78. Let $X_1, X_2, \dots, X_m$ be $m$ independent random variables, and let $Y_1, Y_2, \dots, Y_n$ be other $n$ independent random variables, all having absolutely continuous distributions with support $[0, \infty)$. If $X_j \le_{\mathrm{lr}\downarrow} Y_i$ for all $1 \le j \le m$ and $1 \le i \le n$, then
$$X_{(1:m)} \le_{\mathrm{lr}\downarrow} Y_{(1:n)} \qquad \text{whenever } m \ge n.$$

Now let $X_1, X_2, \dots$ be independent and identically distributed random variables. Taking $Y_i =_{\mathrm{st}} X_j$ for all $i$ and $j$ in Theorems 1.C.77 and 1.C.78, and using Theorems 1.C.66 and 1.C.74, we obtain the following analogs of Theorem 1.C.37. Note that in the next theorem (unlike in Theorem 1.C.37) we assume logconcavity or logconvexity of the underlying density function, but the conclusion in part (a) of the next theorem is stronger than the conclusion in Theorem 1.C.37.

Theorem 1.C.79. (a) Let $X_1, X_2, \dots$ be independent and identically distributed absolutely continuous random variables with an interval support. If the common density function is logconcave, then
$$X_{(j:m)} \le_{\mathrm{lr}\uparrow} X_{(i:n)} \qquad \text{whenever } j \le i \text{ and } m - j \ge n - i.$$

(b) Let $X_1, X_2, \dots$ be independent and identically distributed absolutely continuous random variables with support $[0, \infty)$. If the common density function is logconvex on $[0, \infty)$, then
$$X_{(1:m)} \le_{\mathrm{lr}\downarrow} X_{(1:n)} \qquad \text{whenever } m \ge n.$$
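The Pareto family of Example 1.C.76 gives a quick numerical sanity check of Theorem 1.C.79(b). The sketch below assumes that condition (1.C.21), which is not restated in this part of the text, requires $g(t + x)/f(t)$ to be increasing in $t \ge 0$ for every $x \ge 0$, where $f$ and $g$ are the densities of $X$ and $Y$. The function names are hypothetical, and a grid check of this kind can only illustrate the ordering, not prove it. The last call uses the fact that the minimum of $m$ independent copies of $X_\theta$ has survival function $(1 + x)^{-m\theta}$, so it is again a member of the family, with parameter $m\theta$.

```python
def pareto_density(theta, x):
    """Density theta / (1 + x)**(theta + 1) on x >= 0, as in Example 1.C.76."""
    return theta / (1.0 + x) ** (theta + 1.0)


def down_shifted_lr_holds(theta_x, theta_y, shifts, ts):
    """Grid check of the ASSUMED form of (1.C.21).

    Returns True when g(t + x) / f(t) is nondecreasing along `ts` for every
    shift x in `shifts`, where f is the density of X (parameter theta_x) and
    g is the density of Y (parameter theta_y).
    """
    for x in shifts:
        ratios = [pareto_density(theta_y, t + x) / pareto_density(theta_x, t) for t in ts]
        for earlier, later in zip(ratios, ratios[1:]):
            # Small relative tolerance to absorb floating-point rounding.
            if later < earlier * (1.0 - 1e-12):
                return False
    return True


shifts = [0.0, 0.5, 1.0, 5.0]
ts = [0.1 * k for k in range(200)]

# theta_1 = 3 >= theta_2 = 1.5, the situation covered by Example 1.C.76.
print(down_shifted_lr_holds(3.0, 1.5, shifts, ts))
# Roles reversed: the ratio is already decreasing for the shift x = 0.
print(down_shifted_lr_holds(1.5, 3.0, shifts, ts))
# Minima of m and n i.i.d. copies, m >= n, have parameters m * theta and n * theta.
theta, m, n = 1.2, 5, 2
print(down_shifted_lr_holds(m * theta, n * theta, shifts, ts))
```

Because $\log f_\theta(x) = \log\theta - (\theta + 1)\log(1 + x)$ is convex in $x$, the Pareto density is logconvex on $[0, \infty)$, so the third check is consistent with part (b) of the theorem.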

1.D The Convolution Order

Let $X$ and $Y$ be two random variables such that
$$Y =_{\mathrm{st}} X + U, \tag{1.D.1}$$
where $U$ is a nonnegative random variable, independent of $X$. Then $X$ is said to be smaller than $Y$ in the convolution order (denoted as $X \le_{\mathrm{conv}} Y$). Obviously, the convolution order is a partial order. It is equivalent to the information order which is defined for statistical experiments when the underlying parameter is a location parameter.

The convolution order is obviously closed under increasing linear transformations. That is, for any $a \in \mathbb{R}$ and $b \ge 0$ we have
$$X \le_{\mathrm{conv}} Y \implies a + bX \le_{\mathrm{conv}} a + bY.$$
The convolution order is obviously also closed under convolutions. That is, let $X_1, X_2, \dots, X_n$ be a set of independent random variables, and let $Y_1, Y_2, \dots, Y_n$ be another set of independent random variables. Then
$$X_j \le_{\mathrm{conv}} Y_j, \ j = 1, 2, \dots, n \implies \sum_{i=1}^{n} X_i \le_{\mathrm{conv}} \sum_{i=1}^{n} Y_i.$$
It is obvious from Theorem 1.A.2 and (1.D.1) that
$$X \le_{\mathrm{conv}} Y \implies X \le_{\mathrm{st}} Y.$$

For any nonnegative random variable $X$ we denote by $L_X$ its classical Laplace transform, that is,
$$L_X(s) = E[e^{-sX}], \qquad s \ge 0.$$
Recall that a nonnegative function $\phi$ is a Laplace transform of a nonnegative measure on $(0, \infty)$ if, and only if, $\phi$ is completely monotone, that is, all the derivatives $\phi^{(n)}$ of $\phi$ exist, and they satisfy $(-1)^n \phi^{(n)}(x) \ge 0$ for all $x \ge 0$ and $n = 1, 2, \dots$. It follows that for nonnegative random variables we have
$$X \le_{\mathrm{conv}} Y \iff \frac{L_Y(s)}{L_X(s)} \text{ is a completely monotone function in } s \ge 0. \tag{1.D.2}$$
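The step from (1.D.1) to (1.D.2) can be sketched as follows. The derivation uses only the independence of $X$ and $U$ and the characterization of Laplace transforms recalled above; the symbol $L_U$ for the Laplace transform of $U$ is introduced here for illustration.

```latex
\begin{align*}
L_Y(s) &= E\bigl[e^{-s(X+U)}\bigr]
        = E\bigl[e^{-sX}\bigr]\,E\bigl[e^{-sU}\bigr]
        = L_X(s)\,L_U(s), \\
\frac{L_Y(s)}{L_X(s)} &= L_U(s), \qquad s \ge 0 .
\end{align*}
```

Being the Laplace transform of a nonnegative random variable, $L_U$ is completely monotone, which gives one direction. Conversely, if the ratio is completely monotone, then, since it equals 1 at $s = 0$, it is the Laplace transform of some nonnegative random variable $U$; taking $U$ independent of $X$, the uniqueness of Laplace transforms yields $Y =_{\mathrm{st}} X + U$.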

Example 1.D.1. Let $X_i$ be an exponential random variable with mean $1/\lambda_i$, $i = 1, 2$. If $\lambda_1 > \lambda_2$, then $X_1 \le_{\mathrm{conv}} X_2$. To see this, note that the ratio of the Laplace transforms of $X_2$ and $X_1$ at $s$ is equal to
$$\frac{\lambda_2}{\lambda_1} \cdot \frac{s + \lambda_1}{s + \lambda_2},$$
and it is easy to verify that this ratio is completely monotone. The result thus follows from (1.D.2).

Example 1.D.2. Let $X_1, X_2, \dots, X_n$ be independent and identically distributed exponential random variables with mean $1/\lambda$ for some $\lambda > 0$. Denote the corresponding order statistics by $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$. Then
$$X_{(i)} \le_{\mathrm{conv}} X_{(j)} \qquad \text{whenever } 1 \le i < j \le n.$$
To see this, note that $X_{(k+1)} =_{\mathrm{st}} X_{(k)} + Z_k$, where $Z_k$ is an exponential random variable with mean $((n - k)\lambda)^{-1}$, independent of $X_{(k)}$, $k = 1, 2, \dots, n - 1$, and use the transitivity property of the order $\le_{\mathrm{conv}}$.
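Both examples can be checked numerically through Laplace transforms. The sketch below is illustrative only and all function names are hypothetical. For Example 1.D.1 it uses one explicit candidate for $U$, namely a random variable that equals 0 with probability $\lambda_2/\lambda_1$ and is exponential with rate $\lambda_2$ otherwise; its Laplace transform coincides with the ratio displayed in the example. For Example 1.D.2 it computes the Laplace transform of $X_{(k)}$ independently of the spacing representation, using the fact that $e^{-\lambda X_{(k)}}$ has a Beta$(n - k + 1, k)$ distribution, and then compares consecutive ratios with the Laplace transform of $Z_k$.

```python
import math


def laplace_exp(rate, s):
    """Laplace transform E[exp(-s X)] of an exponential variable with the given rate."""
    return rate / (rate + s)


def laplace_u(lam1, lam2, s):
    """Laplace transform of the candidate U for Example 1.D.1 (needs lam1 > lam2).

    U is 0 with probability lam2 / lam1 and exponential with rate lam2 otherwise.
    """
    p0 = lam2 / lam1
    return p0 + (1.0 - p0) * laplace_exp(lam2, s)


def laplace_order_stat(k, n, lam, s):
    """Laplace transform of the k-th order statistic of n i.i.d. exponentials (rate lam).

    Uses E[V**a] for V ~ Beta(n - k + 1, k) with a = s / lam.
    """
    a = s / lam
    log_value = (math.lgamma(n - k + 1 + a) + math.lgamma(n + 1)
                 - math.lgamma(n + 1 + a) - math.lgamma(n - k + 1))
    return math.exp(log_value)


def gap_example_1(lam1, lam2, grid):
    """Largest |L_{X1}(s) * L_U(s) - L_{X2}(s)| over the grid."""
    return max(abs(laplace_exp(lam1, s) * laplace_u(lam1, lam2, s) - laplace_exp(lam2, s))
               for s in grid)


def gap_example_2(n, lam, grid):
    """Largest |L_{X(k+1)}(s) / L_{X(k)}(s) - L_{Z_k}(s)| over the grid and k = 1..n-1."""
    worst = 0.0
    for k in range(1, n):
        for s in grid:
            ratio = laplace_order_stat(k + 1, n, lam, s) / laplace_order_stat(k, n, lam, s)
            worst = max(worst, abs(ratio - laplace_exp((n - k) * lam, s)))
    return worst


grid = [0.0, 0.1, 1.0, 5.0, 25.0]
print(gap_example_1(3.0, 1.5, grid))
print(gap_example_2(6, 0.7, grid))
```

Both printed gaps should be at the level of floating-point rounding, which is what (1.D.1) predicts for these choices of $U$ and $Z_k$.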

1.E Complements

Section 1.A: The usual stochastic order is used in many areas of applications, but there is no single source where many of the basic results can all be found. Some standard references are the books of Lehmann [342], Marshall and Olkin [383], Ross [475], and Müller and Stoyan [419], where most of the results described in Section 1.A can be found. For example, Theorem 1.A.2 can be found in Marshall and Olkin [383]. The characterization of the usual stochastic order by the monotonicity described in (1.A.8) is taken from Müller [407], whereas the characterization given in (1.A.12) can be found in Fellman [193]. The comparison of the random sums in Theorem 1.A.5 is motivated by ideas in Pellerey and Shaked [455]; it was communicated to us by Pellerey [444]. The application of the order ≤st in Bayesian imperfect repair (Example 1.A.7) is taken from Lim, Lu, and Park [364]. The result which gives conditions for stochastic equality (Theorem 1.A.8) can be found in Baccelli and Makowski [27] and in Scarsini and Shaked [494]. Lemma 2.1 of Costantini and Pasqualucci [135] with n = 1 is an interesting variation of Theorem 1.A.8. The bivariate characterizations in Theorems 1.A.9 and 1.A.10 are taken from Shanthikumar and Yao [532] and from Righter and Shanthikumar [466], respectively. The characterization of the order ≤st by means of the Fortet-Mourier-Wasserstein distance (Theorem 1.A.11) is taken from Adell and de la Cal [3]. The Laplace transform characterization of the order ≤st (Theorem 1.A.13) can be found in Kebir [281] and in Kan and Yi [274]. An extension of Theorem 1.A.13 to more general orders can be found in Nanda [422]. The closure of the order ≤st under a stochastically increasing family of random variables (Theorem 1.A.14) is taken from Shaked and Wong [524]. The condition for the usual stochastic order, given in Theorem 1.A.17, has been communicated to us by Gerchak and He [210]. The comparison of truncated maximum with truncations maximum (Example 1.A.16) can be found in Pellerey and Petakos [453]. The lattice property of the order ≤st (Remark 1.A.18) is given in Müller and Scarsini [418]. The four results that give the stochastic orderings of the spacings, Theorems 1.A.19–1.A.22, can be found in Barlow and Proschan [35], Ebrahimi and Spizzichino [178], Pledger and Proschan [458], and Joag-Dev [258], respectively.
The stochastic comparison of order statistics of independent random variables with the order statistics of independent and identically distributed random variables (Theorem 1.A.23) is taken from Ma [371]; it generalizes some previous results in the literature. The stochastic comparison of a sum of independent heterogeneous exponential random variables with a proper Erlang random variable (Example 1.A.24) is taken from Bon and Păltănea [105], where more refined comparisons can also be found. The stochastic comparison of a sum of independent heterogeneous binomial random variables with a proper binomial random variable (Example 1.A.25) is taken from Boland, Singh, and Cukic [102]. The necessary and sufficient conditions for the comparison of normal random variables (Example 1.A.26) are taken from Müller [413]; an extension of this result to Kotz-type distributions is given in Ding and Zhang [168]. The stochastic comparisons of norms, in Examples 1.A.27 and 1.A.28, are taken from Lapidoth and Moser [333]. The TTT transform (1.A.19) is introduced in Barlow, Bartholomew, Bremner, and Brunk [32], and is further studied in Barlow and Doksum [34] and in Barlow and Campo [33]. The observed total time on test random variable $X_{\mathrm{ttt}}$ is defined and studied in Li and Shaked [356], where the implication in Theorem 1.A.29 can be found. The characterizations of the NBUE and the NWUE aging notions by means of the usual stochastic order (Theorem 1.A.31) can be found in Whitt [565] and in Fagiuoli and Pellerey [187]. The other characterization, by means of the random variable $X_{\mathrm{ttt}}$ (Theorem 1.A.32), is taken from Li and Shaked [356]. The aging notion that is described in (1.A.21) is studied in Mugdadi and Ahmad [402]. Boland, Singh, and Cukic [103] studied an order, called the stochastic precedence order, according to which the random variable X is smaller than the random variable Y if P{X < Y} ≥ P{Y < X}. If X and Y are independent, then X ≤st Y implies that X is smaller than Y in the stochastic precedence order.

Section 1.B: Many of the basic results regarding the hazard rate order can be found in Ross [475] and in Müller and Stoyan [419]. The characterization (1.B.8) can be found in Lehmann and Rojo [345]. The results regarding the preservation of the orders ≤hr and ≤rh under monotone increasing transformations (Theorems 1.B.2 and 1.B.43) can be found in Keilson and Sumita [283]. The closure under convolutions result (Theorem 1.B.4) and the bivariate characterization result (Theorem 1.B.9) are taken from Kijima [291] and Shanthikumar and Yao [532]. A special case of Lemma 1.B.3 can be found in Mukherjee and Chatterjee [403]. The hazard rate order comparison of a sum of independent heterogeneous exponential random variables with a proper Erlang random variable (Example 1.B.5) is taken from Bon and Păltănea [105], where more refined comparisons can also be found. The hazard rate order comparison of a sum of independent heterogeneous binomial random variables with a proper binomial random variable (Example 1.B.6) is taken from Boland, Singh, and Cukic [102]. The hazard rate order comparison of random sums (Theorem 1.B.7) can be found in Pellerey [445]; some related results are Theorem 7.2 of Kijima [291] and Proposition 2.2 of Kebir [282].
The closure under mixtures result (Theorem 1.B.8) can be found in Boland, El-Neweihi, and Proschan [97]; a generalization of it is contained in Nanda, Jain, and Singh [424]. The bivariate characterizations in Theorems 1.B.10 and 1.B.11 are taken from Righter and Shanthikumar [466] and from Cheng and Righter [128], respectively. The characterizations given in Theorem 1.B.12 can be found in Capéraà [118] and in Joag-Dev, Kochar, and Proschan [259]. The hazard rate ordering result regarding the inter-epoch times of a nonhomogeneous Poisson process (Example 1.B.13) is taken from Kochar [309] where other applications of Theorem 1.B.12 can also be found. The hazard rate ordering of the epoch times of a nonhomogeneous Poisson process (1.B.19) can be found in Baxter [62]. The closure property of the order ≤hr under hazard rate ordered mixtures (Theorem 1.B.14) is taken from Shaked and Wong [524]; a related result is Proposition 4.1 of Kebir [282]. The preservation of the order ≤hr under the formation of a proper Markov chain (Example 1.B.15) can essentially be found in Ross, Shanthikumar, and Zhu [478]; they gave a version of this preservation result for the order ≤rh. The application of the order ≤hr in Bayesian imperfect repair (Example 1.B.16) is inspired by Lim, Lu, and Park [364], but the result given here is stronger than their Theorem 4.1(iii). The hazard rate order comparison of a proportional hazard mixture with its parent distribution (Example 1.B.17) is taken from Gupta and Gupta [214]. The Laplace transform characterization of the order ≤hr (Theorem 1.B.18) can be found in Kebir [281] and in Kan and Yi [274]. An extension of Theorem 1.B.18 to more general orders can be found in Nanda [422]. The result about the inheritance of the order ≤hr, from the mixing scales to the underlying counting processes (Theorem 1.B.19), is essentially taken from Ma [374]. The closure property which is given in Theorem 1.B.21 can be found in Kochar [305]; the necessary and sufficient condition, given after Theorem 1.B.21, is taken from Ma [374]. The result involving the hazard rate comparison of weighted random variables (Example 1.B.23) is taken from Nanda and Jain [423]; see also Bartoszewicz and Skolimowska [51]. The hazard rate comparison of epoch times of nonhomogeneous Poisson processes in Example 1.B.24 can be found in Ahmadi and Arghami [6] and in Belzunce, Lillo, Ruiz, and Shaked [69]; in the latter paper the result is extended to nonhomogeneous pure birth processes. The hazard rate order comparison of inter-epoch times of nonhomogeneous Poisson processes in Example 1.B.24 is taken from Belzunce, Lillo, Ruiz, and Shaked [69], who also obtained a similar result for the more general nonhomogeneous pure birth processes. The hazard rate order comparison of series systems of parallel systems (Example 1.B.25) can be found in Valdés and Zequeira [553]. The proof of Theorem 1.B.26 (given in Remark 1.C.40) is taken from Boland, Shaked, and Shanthikumar [101].
The hazard rate order comparisons of order statistics described in Theorems 1.B.27 and 1.B.28 can be found in Korwar [321]. The conditions that lead to the hazard rate ordering of minima (Theorem 1.B.29 and Corollary 1.B.30) are taken from Navarro and Shaked [430]. The two results that give the hazard rate orderings of the spacings (Theorem 1.B.31) can be found in Kochar and Kirmani [313] and in Khaledi and Kochar [285], whereas the comparison of spacings from two different samples (Theorem 1.B.32) is taken from Khaledi and Kochar [285]; further results can be found in Hu and Wei [240] and in Misra and van der Meulen [396]. The closure property under formations of order statistics (Theorem 1.B.34) is taken from Singh and Vijayasree [537]; see also Lynch, Mimmack, and Proschan [369]. Boland, El-Neweihi, and Proschan [97] show, by a counterexample, that the conclusion of Theorem 1.B.34 need not hold when the Xi's or the Yi's are not identically distributed. Extensions of Theorem 1.B.34 can be found in Shaked and Shanthikumar [516], in Belzunce, Mercader, and Ruiz [70], and in Hu and Zhuang [247]. The general comparison result, given in Theorem 1.B.36, is taken from Boland, Hu, Shaked, and Shanthikumar [99]; see related results in Franco, Ruiz, and Ruiz [205] and in Hu and Zhuang [247]. The hazard rate order comparisons of maxima of heterogeneous exponential random variables (Example 1.B.37) are taken from Dykstra, Kochar, and Rojo [174] and from Khaledi and Kochar [287]. The closure under convolution property of IFR random variables (Corollary 1.B.39) can be found, for example, in Barlow and Proschan [36, page 100]. The characterizations of the DMRL and the IMRL aging notions by means of the hazard rate order (Theorem 1.B.40) can be found in Brown [111, page 229], in Whitt [565], and in Fagiuoli and Pellerey [187]. The observation that essentially reduces the study of the reversed hazard rate order into the study of the hazard rate order (Theorem 1.B.41) is taken from Nanda and Shaked [428]. The bivariate characterization results for the reversed hazard order (Theorems 1.B.47 and 1.B.49) can be found in Shanthikumar, Yamazaki, and Sakasegawa [529] and in Cheng and Righter [128]. The application of the reversed hazard order in economics, described in Example 1.B.51, is taken from Eeckhoudt and Gollier [180]; further results in this vein can be found in Kijima and Ohnishi [293]. The closure property of the order ≤rh under reversed hazard rate ordered mixtures (Theorem 1.B.52) is taken from Shaked and Wong [524]; a related result is Proposition 4.1 of Kebir [282]. The Laplace transform characterization of the order ≤rh (Theorem 1.B.53) is taken from Kebir [281]. The result about the inheritance of the order ≤rh, from the mixing scales to the underlying counting processes (Theorem 1.B.54), is essentially taken from Ma [374]. The results about the reversed hazard rate ordering of order statistics (Theorems 1.B.56 and 1.B.57), and the characterizations of the reversed hazard rate order given in Theorem 1.B.62, can be found in Block, Savits, and Singh [96], whereas the result described in Theorem 1.B.58 is taken from Hu and He [232].
The preservation of the order statistics in the sense of the order ≤rh (Theorem 1.B.60) can be found in Nanda, Jain, and Singh [426]. An order among nonnegative random variables, which is defined by stipulating the monotonicity of the ratio of the hazard rate functions (when they exist), is studied in Kalashnikov and Rachev [271], Sengupta and Deshpande [500], and Rowell and Siegrist [479]. Equivalently, if F and G are survival functions, and we denote $R_F = -\log F$ and $R_G = -\log G$, then the order mentioned above can be defined by requiring that the composition $R_F \circ R_G^{-1}$ be convex on [0, ∞). The notion of the monotonicity of the ratio of hazard rate functions is used in Examples 1.B.24 (see (1.B.25)) and 1.B.25, as well as in Theorem 1.C.4. Sengupta and Deshpande [500] and Rowell and Siegrist [479] also studied the orders defined by stipulating that $R_F \circ R_G^{-1}$ be starshaped or superadditive. Brown and Shanthikumar [112], Lillo, Nanda, and Shaked [361], Hu and Zhu [242], Di Crescenzo and Longobardi [165], and Belzunce, Ruiz, and Ruiz [74] have introduced and studied various shifted hazard and reversed hazard rate orders. Similar orders which extend the likelihood ratio order are studied in Section 1.C.4.


Section 1.C: Again, many of the basic results regarding the likelihood ratio order can be found in Ross [475] and in Müller and Stoyan [419]. Condition (1.C.3) is implicit in Block, Savits, and Shaked [95], and it is explicit in Müller [408]. The relation (1.C.4) is mentioned in Chan, Proschan, and Sethuraman [123]. The sufficient conditions for X ≤lr Y, given in Theorem 1.C.4, have been noted in Belzunce, Lillo, Ruiz, and Shaked [69]. The closure property of the likelihood ratio order under conditioning (Theorem 1.C.5) is observed in Whitt [561]. Many variations of Theorem 1.C.5 with respect to general sample spaces can be found in Whitt [561] and in Rüschendorf [485]. The closure under limits property of the order ≤lr (Theorem 1.C.7) is taken from Müller [408]. The result regarding the preservation of the order ≤lr under monotone increasing transformations (Theorem 1.C.8) can be found in Keilson and Sumita [283]. The several closure under convolution results (Theorems 1.C.9, 1.C.11, and 1.C.12) as well as the bivariate characterization result (Theorem 1.C.20) are taken from Shanthikumar and Yao [532]; a related result is Proposition 2.4 of Kebir [282]. A special case of Theorem 1.C.9 can be found in Mukherjee and Chatterjee [403]. The result about the number of successes in independent trials (Example 1.C.10) is statement (7) in Samuels [488], who attributed it to Ghurye and Wallace. The characterization of the order ≤hr by means of the order ≤lr, given in Theorem 1.C.14, is taken from Di Crescenzo [164]; a density of the form (1.C.7) can be found in Adell and Lekuona [4, page 773]. The likelihood ratio order comparison of a random random variable with a fixed random variable (Corollary 1.C.16) is a slight generalization of Problem B in Szekli [544, page 22]. The closure property of the order ≤lr under likelihood ratio ordered mixtures (Theorem 1.C.17) is an extension of a result in Kebir [282].
The result about the inheritance of the order ≤lr, from the mixing scales to the underlying counting processes (Theorem 1.C.18), is taken from Ma [374]. Example 1.C.19 is inspired by Theorem 4.12 of Asadi and Shanbhag [23], but Example 1.C.19 has weaker assumptions (Θ1 and Θ2 need not be degenerate) and stronger conclusions (Y1 and Y2 are ordered in the likelihood ratio order, rather than in the hazard rate order) than the result of Asadi and Shanbhag [23]. The result in Theorem 1.C.21 is a special case of a result in Ross [475]. The bivariate characterizations in Theorems 1.C.22, 1.C.23, and 1.C.24 are taken from Righter and Shanthikumar [466] and from Chapter 13 by Righter in [515]. The Laplace transform characterization of the order ≤lr (Theorem 1.C.25) can be found in Kebir [281]. An extension of Theorem 1.C.25 to more general orders can be found in Nanda [422]. The conditional likelihood ratio orderings, described in Theorem 1.C.26, can be found in Ku and Niu [324] and in Chapter 14 by Shanthikumar and Yao in [515]. The setting in which the order ≤hr gives rise to the order ≤lr, as described in Theorem 1.C.28, is essentially taken from Ross, Shanthikumar, and Zhu [478]; they gave a version of this result for the order ≤rh. The necessary and sufficient condition for aX ≤lr X (Theorem 1.C.29) can be found in Hu, Nanda, Xie, and Zhu [237]. The likelihood ratio order comparisons of the order statistics given in Theorem 1.C.31 are taken from Bapat and Kochar [31] and from Hu, Zhu, and Wei [243]; an extension of the first part of Theorem 1.C.31(a) can be found in Ma [373]. The result about the likelihood ratio order comparison of order statistics of a simple random sample from a finite population (Theorem 1.C.32) can be found in Kochar and Korwar [315]. The general result which compares order statistics from two samples of different size (Theorem 1.C.33) is taken from Lillo, Nanda, and Shaked [362]; see related results in Franco, Ruiz, and Ruiz [205] and in Hu and Zhuang [247]. Belzunce and Shaked [78] extended Theorem 1.C.33 to comparison of lifetimes of coherent systems in reliability theory; see also Belzunce, Franco, Ruiz, and Ruiz [66]. The closure property under formation of order statistics (Corollary 1.C.34) can be found in Chan, Proschan, and Sethuraman [123]; a special case of this result can be found in Singh and Vijayasree [537]. The likelihood ratio order comparison of the order statistics given in Theorem 1.C.37 is taken from Raqab and Amin [465]. Theorem 2.6 in Kamps [273, page 182] extends Theorem 1.C.37 to the so called generalized order statistics; see also Korwar [322] and Hu and Zhuang [247]. The special case of Theorem 1.C.37 when j = i, is extended in Nanda, Misra, Paul, and Singh [427] to the case when the sample sizes m and n are random. Nanda, Misra, Paul, and Singh [427] also extend the special case of Theorem 1.C.37 when m = n, to the case when the common sample size is random. The likelihood ratio order comparison of normalized spacings (Theorem 1.C.42) can be found in Kochar and Korwar [314], whereas the comparisons for nonnormalized spacings (Theorem 1.C.43) are special cases of results in Misra and van der Meulen [396] and in Hu and Zhuang [246, 248].
The comparison of spacings that correspond to random variables with logconcave density (Theorem 1.C.44) is a special case of a result of Hu and Zhuang [246, 248]. The comparison of spacings from two different samples (Theorem 1.C.45) is taken from Khaledi and Kochar [285]; an extension of this result can be found in Franco, Ruiz, and Ruiz [205], and a related result can be found in Belzunce, Mercader, and Ruiz [70]. The results about the likelihood ratio order comparisons of random minima and maxima (Example 1.C.46) are taken from Shaked and Wong [526]; see a related result in Bartoszewicz [49]. The result about the likelihood ratio comparison of the successive epochs of a nonhomogeneous Poisson process (Example 1.C.47) is given in Kochar [307, 309], where it is also shown that it implies the likelihood order comparison of successive record values of a sequence of independent and identically distributed random variables. The likelihood ratio comparisons of epoch and inter-epoch times of nonhomogeneous Poisson processes (Example 1.C.48) are taken from Belzunce, Lillo, Ruiz, and Shaked [69], who also extended them to comparisons of epoch and inter-epoch times of nonhomogeneous pure birth processes. The likelihood ratio order comparison of a sum of independent heterogeneous exponential random variables with a proper Erlang random variable (Example 1.C.49) is a combination of results from Boland, El-Neweihi, and Proschan [98] and from Bon and Păltănea [105], where more refined comparisons can also be found. For instance, the comparison in Example 1.C.50 is given in Boland, El-Neweihi, and Proschan [98]. The likelihood ratio order comparison of a sum of independent heterogeneous binomial random variables with a proper binomial random variable (Example 1.C.51) is taken from Boland, Singh, and Cukic [102]. An interpretation of logconcavity and logconvexity as aging notions can be found in Shaked and Shanthikumar [506], where the proof of parts (i) and (ii) of Theorem 1.C.52 can be found. A proof of (1.C.13) can also be found there. The likelihood ratio ordering of random variables conditioned on their sum (Theorem 1.C.53) is essentially Example 12 of Lehmann [343]. The closure property of logconcave densities under order statistics (Theorem 1.C.54) is a generalization of an observation in Li and Lu [355]. The characterizations of the IFR and the DFR aging notions by means of the likelihood ratio order (Theorem 1.C.55) can be found in Whitt [565]. The likelihood ratio order comparison of the asymptotic equilibrium ages, given in Example 1.C.56, is a special case of a result of Bon and Illayk [104]. The likelihood ratio monotonicity of the parameter in the observation, given the likelihood ratio monotonicity of the observation in the parameter (Example 1.C.57), can be found in Whitt [560], whereas the preservation of the likelihood ratio order of the priors by the posteriors (Example 1.C.58) is given as Remark 3.14 in Spizzichino [539]. The comparison of the weighted random variables (Example 1.C.59) can be found in Bartoszewicz and Skolimowska [51]. An extension of the implication in Example 1.C.59, when $X^w$ and $Y^w$ are the length-biased versions of X and of Y, respectively, is given in Hu and Zhuang [244].
An extension of the implication in Example 1.C.59 to multivariate weighted distributions can be found in Jain and Nanda [253]. The result in Example 1.C.60 is taken from Bartoszewicz and Skolimowska [51]; extensions of the inequality $X \le_{\mathrm{lr}} X^w$, when $X^w$ is the length-biased version of X, are given in Ross [476]. The ordering of generalized skew normal random variables (Example 1.C.61) is taken from Gupta and Gupta [215]. The up shifted likelihood ratio order is introduced in Shanthikumar and Yao [530]. The results described in Section 1.C.4 can mostly be found in Lillo, Nanda, and Shaked [361, 362]. An extension of Theorem 1.C.77 is given in Belzunce, Ruiz, and Ruiz [74]; see also Belzunce and Shaked [78]. Ramos Romero and Sordo Díaz [464] defined an order that is reminiscent of the order ≤lr↑ as defined in (1.C.19). According to their definition, the nonnegative random variable X is said to be smaller than the nonnegative random variable Y if aX ≤lr Y for every 0 < a < 1. Lehmann and Rojo [345] used the characterization (1.C.4) in order to define stochastic orders that are stronger than ≤lr. For example, let X and Y be two random variables with distribution functions F and G, respectively, and consider the stipulation that, for a fixed k,
$$\frac{d^n}{du^n}\, G F^{-1}(u) \ge 0 \qquad \text{for all } 0 < u < 1 \text{ and all } n = 1, 2, \dots, k.$$
If k ≥ 3, then X is stochastically smaller than Y in a sense that is stronger than ≤lr. The order ≤lr is obtained when k = 2. Lehmann and Rojo [345] showed, for example, that if X1, X2, ..., Xm are independent, identically distributed, then X1 is smaller than max{X1, X2, ..., Xm}, in the above sense, with k = m. Chang [126] considered four exponential random variables X1, X2, Y1, and Y2, with the corresponding rates λ1, λ2, µ1, and µ2, where X1 and X2 are independent, and Y1 and Y2 are independent. He obtained the necessary and sufficient conditions on λ1, λ2, µ1, and µ2, for each of the following results: (i) X1 + X2 ≤lr Y1 + Y2, (ii) X1 + X2 ≥lr Y1 + Y2, and (iii) X1 + X2 and Y1 + Y2 are not comparable in the likelihood ratio order.

Section 1.D: The discussion in this section follows Shaked and Suarez-Llorens [520].

Fagiuoli and Pellerey [185] have introduced an approach that describes a unified point of view regarding some of the orders studied in this chapter and some of the orders studied in Chapters 2, 3, and 4. This approach led Fagiuoli and Pellerey to introduce some families of new orders. Several properties of these orders were studied in Fagiuoli and Pellerey [185], in Nanda, Jain, and Singh [424, 425], and in Hu, Kundu, and Nanda [236]; see also Hesselager [221]. Another general approach that unifies some of the orders studied in this chapter and in Chapter 2 was introduced in Hu, Nanda, Xie, and Zhu [237]. Other orders that are related to the orders ≤st and ≤lr have been introduced and studied in Di Crescenzo [163]. Yanagimoto and Sibuya [571], Zijlstra and de Kroon [577], and Shanthikumar and Yao [532], extended the definitions of X ≤st Y, X ≤hr Y, and X ≤lr Y, to jointly distributed random variables X and Y; see also Arcones, Kvam, and Samaniego [15]. Ebrahimi and Pellerey [177] have introduced a stochastic order based on a notion of uncertainty and studied its relationship to some of the orders studied in this chapter.

2 Mean Residual Life Orders

In this chapter we study two orders that are based on comparisons of functionals of mean residual lives. Like the orders in Chapter 1, the purpose of the orders here is to compare the “location” or the “magnitude” of random variables. Among other things, the relationship between the orders of Chapter 1 and the orders in this chapter will be analyzed.

2.A The Mean Residual Life Order

2.A.1 Definition

If $X$ is a random variable with a survival function $\bar F$ and a finite mean $\mu$, the mean residual life of $X$ at $t$ is defined as
$$m(t) = \begin{cases} E[X - t \mid X > t], & \text{for } t < t^*; \\ 0, & \text{otherwise}, \end{cases} \tag{2.A.1}$$
where $t^* = \sup\{t : \bar F(t) > 0\}$. Note that if $X$ is an almost surely positive random variable, then $m(0) = \mu$. By the finiteness of $\mu$ we have that $m(t) < \infty$ for all $t < \infty$. However, it is possible that $m(\infty) \equiv \lim_{t \to \infty} m(t) = \infty$. A useful observation is that $m(t) = \bigl(\int_t^\infty \bar F(x)\,dx\bigr)/\bar F(t)$ when $t^* = \infty$.

Although in (2.A.1) there is no restriction on the support of $X$, the mean residual life function is usually of interest when $X$ is a nonnegative random variable. In that case $X$ can be thought of as a lifetime of a device and $m(t)$ then expresses the conditional expected residual life of the device at time $t$ given that the device is still alive at time $t$.

Clearly, $m(t) \ge 0$, but not every nonnegative function is a mean residual life (mrl) function corresponding to some random variable. In fact, a function $m$ is an mrl function of some nonnegative random variable with an absolutely continuous distribution function if, and only if, $m$ satisfies the following properties:

(i) $0 \le m(t) < \infty$ for all $t \ge 0$,
(ii) $m(0) > 0$,
(iii) $m$ is continuous,
(iv) $m(t) + t$ is increasing on $[0, \infty]$, and
(v) when there exists a $t_0$ such that $m(t_0) = 0$, then $m(t) = 0$ for all $t \ge t_0$. Otherwise, when there does not exist such a $t_0$ with $m(t_0) = 0$, then $\int_0^\infty \frac{1}{m(t)}\,dt = \infty$.
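The integral representation of $m(t)$ noted after (2.A.1) is easy to evaluate numerically. The following sketch is an illustration with hypothetical function names, not a library routine: it approximates the tail integral of a survival function by a midpoint rule after mapping $[t, \infty)$ onto $[0, 1)$, and compares the result with two closed forms that follow directly from the definition. For an exponential lifetime with rate $\lambda$ the mrl function is the constant $1/\lambda$, and for the survival function $(1 + x)^{-\theta}$ with $\theta > 1$ it is $(1 + t)/(\theta - 1)$. It also checks property (iv) on a grid.

```python
import math


def mean_residual_life(survival, t, n=20000):
    """Approximate m(t) = (integral from t to infinity of survival(x) dx) / survival(t).

    The substitution x = t + u / (1 - u) maps [t, inf) onto [0, 1); the integral
    is then approximated by the midpoint rule with n cells. Returns 0.0 when
    survival(t) is 0, matching the convention in (2.A.1).
    """
    s_t = survival(t)
    if s_t <= 0.0:
        return 0.0
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        x = t + u / (1.0 - u)
        total += survival(x) / (1.0 - u) ** 2  # Jacobian dx/du = 1 / (1 - u)**2
    return total * h / s_t


def is_nondecreasing(values, tol=1e-9):
    """True when each value is at least the previous one, up to a small tolerance."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))


def exp_survival(x):
    return math.exp(-2.0 * x)        # exponential lifetime with rate 2


def pareto_survival(x):
    return (1.0 + x) ** -3.0         # survival function (1 + x)**(-theta), theta = 3


print(mean_residual_life(exp_survival, 1.5), 1.0 / 2.0)
print(mean_residual_life(pareto_survival, 2.0), (1.0 + 2.0) / (3.0 - 1.0))

grid = [0.5 * k for k in range(8)]
print(is_nondecreasing([mean_residual_life(pareto_survival, t) + t for t in grid]))
```

The midpoint rule is used because it never evaluates the integrand at $u = 1$, where the change of variables is singular; any other quadrature that avoids the endpoint would serve equally well.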

Clearly, the smaller the mrl function is, the smaller $X$ should be in some stochastic sense. This is the motivation for the order discussed in this section.

Let $X$ and $Y$ be two random variables with mrl functions $m$ and $l$, respectively, such that
$$m(t) \le l(t) \qquad \text{for all } t. \tag{2.A.2}$$
Then $X$ is said to be smaller than $Y$ in the mean residual life order (denoted as $X \le_{\mathrm{mrl}} Y$). Analogously to (1.B.3), it can be shown that $X \le_{\mathrm{mrl}} Y$ if, and only if,
$$\frac{\int_t^\infty \bar G(u)\,du}{\int_t^\infty \bar F(u)\,du} \quad \text{increases in } t \text{ over } \Bigl\{t : \int_t^\infty \bar F(u)\,du > 0\Bigr\}, \tag{2.A.3}$$
where $\bar F$ and $\bar G$ denote the survival functions of $X$ and $Y$, respectively, or equivalently, if, and only if,
$$\bar G(t) \int_t^\infty \bar F(u)\,du \le \bar F(t) \int_t^\infty \bar G(u)\,du \qquad \text{for all } t, \tag{2.A.4}$$
or equivalently, if, and only if,
$$\frac{E[(Y - t)_+]}{E[(X - t)_+]} \quad \text{increases in } t \text{ over } \{t : E[(X - t)_+] > 0\}, \tag{2.A.5}$$
where, for any real number $a$, we let $a_+$ denote the positive part of $a$; that is, $a_+ = a$ if $a \ge 0$ and $a_+ = 0$ if $a < 0$. Analogously to (1.B.5), we also have that $X \le_{\mathrm{mrl}} Y$ if, and only if,
$$\frac{\bar F(s)}{\int_t^\infty \bar F(u)\,du} \ge \frac{\bar G(s)}{\int_t^\infty \bar G(u)\,du} \qquad \text{for all } s \le t \tag{2.A.6}$$
such that the denominators are positive. It is worthwhile to note that Condition (2.A.5) uses the expectations $E[(X - t)_+]$ and $E[(Y - t)_+]$ as (3.A.5) in Chapter 3 and (4.A.4) in Chapter 4 do.

For discrete random variables that take on values in $\mathbb{N}_+$ the definition of $\le_{\mathrm{mrl}}$ should be modified. Let $X$ be such a random variable with a finite mean $\mu$. The mrl function of $X$ at $n$ is defined as
$$m(n) = \begin{cases} E[X - n \mid X \ge n], & \text{for } n \le n^*; \\ 0, & \text{otherwise}, \end{cases}$$
where $n^* = \max\{n : P\{X \ge n\} > 0\}$. Note that for such a random variable $m(0) = \mu$. By the finiteness of $\mu$ we have that $m(n) < \infty$ for $n < \infty$. Let $X$ and $Y$ be two such random variables with mrl functions $m$ and $l$, respectively. We denote $X \le_{\mathrm{mrl}} Y$ if
$$m(n) \le l(n) \qquad \text{for all } n \ge 0. \tag{2.A.7}$$

The discrete analog of (2.A.3) is that (2.A.7) holds if, and only if, ∞ ∞ j=n P {Y ≥ j} ∞ increases in n over N+ ∩ {n : P {X ≥ j} > 0}. j=n P {X ≥ j} j=n The discrete analog of (2.A.4) is that (2.A.7) holds if, and only if, P {Y ≥ n}

∞

P {X ≥ j} ≤ P {X ≥ n}

j=n+1

∞

P {Y ≥ j}

for all n ≥ 0.

j=n+1

The discrete analog of (2.A.6) is that X ≤mrl Y if, and only if, P {Y ≥ m} P {X ≥ m} ∞ ≥ ∞ P {X ≥ j} j=n+1 j=n+1 P {Y ≥ j}

for all m ≤ n

such that the denominators are positive. 2.A.2 The relation between the mean residual life and some other stochastic orders If X is a random variable with mrl function m and hazard rate function r, it is not hard to verify that t∗ x m(t) = exp − r(u)du dx, for t < t∗ . (2.A.8) t

t

Therefore, if Y is another random variable with mrl function l and hazard rate function q and (1.B.2) is satisfied, that is, X ≤hr Y, then X ≤mrl Y. We thus have proved the following result.

Theorem 2.A.1. If X and Y are two random variables such that X ≤hr Y, then X ≤mrl Y.

Neither of the orders ≤st and ≤mrl implies the other; counterexamples can be found in the literature. The next result, however, gives a condition under which X ≤mrl Y if, and only if, X ≤hr Y. Therefore, in particular, under that condition, X ≤mrl Y =⇒ X ≤st Y.

Theorem 2.A.2. Let X and Y be two random variables with mrl functions m and l, respectively. Suppose that $m(t)/l(t)$ increases in t. Then, if X ≤mrl Y, then X ≤hr Y.


Proof. It is not hard to verify that m is differentiable over {t : P{X > t} > 0} and that if X has the hazard rate function r, then

$$r(t) = \frac{m'(t) + 1}{m(t)},$$

where $m'$ denotes the derivative of m. Similarly, if Y has the hazard rate function q, then

$$q(t) = \frac{l'(t) + 1}{l(t)}.$$

The monotonicity of $m(t)/l(t)$, together with (2.A.2), implies that

$$r(t) = \frac{m'(t)}{m(t)} + \frac{1}{m(t)} \ge \frac{l'(t)}{l(t)} + \frac{1}{l(t)} = q(t),$$

that is, X ≤hr Y.
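The identity $r(t) = (m'(t) + 1)/m(t)$ used in the proof can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a Weibull law with shape 2 and scale 1 (so that the hazard rate is 2t), and the helper names `simpson`, `mrl`, and `hazard_from_mrl` are illustrative choices.

```python
import math

def survival(t: float) -> float:
    """Survival function of the assumed Weibull law (shape 2, scale 1)."""
    return math.exp(-t * t)

def simpson(f, a: float, b: float, n: int = 20000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def mrl(t: float, upper: float = 12.0) -> float:
    """m(t) = int_t^inf Fbar(u) du / Fbar(t); the tail beyond `upper` is negligible here."""
    return simpson(survival, t, upper) / survival(t)

def hazard_from_mrl(t: float, h: float = 1e-4) -> float:
    """r(t) = (m'(t) + 1) / m(t), with m' approximated by a central difference."""
    m_prime = (mrl(t + h) - mrl(t - h)) / (2 * h)
    return (m_prime + 1) / mrl(t)

for t in (0.5, 1.0, 2.0):
    # The recovered hazard rate should be close to the exact value 2t.
    print(t, hazard_from_mrl(t), 2 * t)
```

The finite-difference step and the integration cutoff are tuning choices for this particular distribution; heavier tails would need a larger cutoff.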

Under a condition that is weaker than the one in Theorem 2.A.2 one merely obtains that X ≤mrl Y implies that X ≤st Y. This is shown in the next result.

Theorem 2.A.3. Let X and Y be two nonnegative random variables with mrl functions m and l, respectively. Suppose that $\frac{m(t)}{l(t)} \ge \frac{m(0)}{l(0)}$ (that is, $\frac{m(t)}{l(t)} \ge \frac{EX}{EY}$ when X and Y are almost surely positive), t ≥ 0. If X ≤mrl Y, then X ≤st Y.

Proof. Let $\bar F$ be the survival function of X. It is not hard to verify that

$$\bar F(t) = \frac{EX}{m(t)} \exp\Big\{-\int_0^t \frac{1}{m(x)}\,dx\Big\} \quad \text{over } \{t : P\{X > t\} > 0\}.$$

Similarly, the survival function of Y can be expressed as

$$\bar G(t) = \frac{EY}{l(t)} \exp\Big\{-\int_0^t \frac{1}{l(x)}\,dx\Big\} \quad \text{over } \{t : P\{Y > t\} > 0\}.$$

Therefore, under the assumptions of the theorem, it is seen that

$$\frac{\bar G(t)}{\bar F(t)} \ge 1.$$
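The inversion formula used in this proof can be illustrated on a case where everything is available in closed form. The sketch below is an illustration only: it assumes a Lomax (Pareto type II) law with survival function $(1+t)^{-a}$ and a hypothetical shape a = 3, for which m(t) = (1 + t)/(a − 1), and it rebuilds the survival function from m alone.

```python
import math

A = 3.0  # hypothetical Lomax shape parameter (must exceed 1 for a finite mean)

def mrl(t: float) -> float:
    """Mean residual life of the Lomax law with survival (1 + t)^(-A)."""
    return (1.0 + t) / (A - 1.0)

def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def survival_from_mrl(t: float) -> float:
    """Fbar(t) = (EX / m(t)) * exp(-int_0^t dx / m(x)), using EX = m(0)."""
    integral = simpson(lambda x: 1.0 / mrl(x), 0.0, t)
    return mrl(0.0) / mrl(t) * math.exp(-integral)

for t in (0.5, 2.0, 5.0):
    # Reconstructed survival next to the exact value (1 + t)^(-A).
    print(t, survival_from_mrl(t), (1.0 + t) ** (-A))
```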

The mean residual life order can be characterized by means of the hazard rate order and the appropriate equilibrium age variables. Recall from (1.A.20) that for nonnegative random variables X and Y with finite means we denote by $A_X$ and $A_Y$ the corresponding asymptotic equilibrium ages. The following result follows at once from (1.B.3) and (2.A.3). It may be contrasted with Theorem 1.C.13.

Theorem 2.A.4. For nonnegative random variables X and Y with finite means we have X ≤mrl Y if, and only if, $A_X \le_{\mathrm{hr}} A_Y$.


In the next theorem the order ≤mrl is characterized by ordering two related random variables in the sense of the hazard rate order. Let X and Y be two nonnegative random variables with finite means and suppose that X ≤st Y and that EX < EY. Let F and G be the distribution functions of X and of Y, respectively. Define the random variable $Z_{X,Y}$ as the random variable that has the density function h given by (1.C.7), as in Theorem 1.C.14; see also Theorem 2.B.3.

Theorem 2.A.5. Let X and Y be two nonnegative random variables with finite means such that X ≤st Y and such that EY > EX > 0. Then

$$X \le_{\mathrm{mrl}} Y \iff A_Y \le_{\mathrm{hr}} Z_{X,Y} \iff A_X \le_{\mathrm{hr}} Z_{X,Y},$$

where $Z_{X,Y}$ has the density function given in (1.C.7).

Proof. Denote by $\bar G_e$ and $\bar H$ the survival functions of $A_Y$ and $Z_{X,Y}$, respectively. Using (1.A.20) and (1.C.7) we compute

$$\frac{\bar H(x)}{\bar G_e(x)} = \frac{EY}{EY - EX}\left[1 - \frac{\int_x^\infty \bar F(u)\,du}{\int_x^\infty \bar G(u)\,du}\right], \quad x \ge 0,$$

and the first stated equivalence follows from (2.A.3) and (1.B.3). The second equivalence is proven similarly.

Some characterizations of the hazard rate order by means of the order ≤mrl are given below. We denote by Exp(µ) any exponential random variable with mean µ.

Theorem 2.A.6. Let X and Y be two continuous nonnegative random variables. Then X ≤hr Y if, and only if,

$$\min\{X, \mathrm{Exp}(\mu)\} \le_{\mathrm{mrl}} \min\{Y, \mathrm{Exp}(\mu)\} \quad \text{for all } \mu > 0.$$

The proof of Theorem 2.A.6 uses the Laplace transform order which is discussed in Chapter 5, and it will be given in Remark 5.A.23. Note that from Theorem 2.A.6 it follows, for continuous nonnegative random variables, that X ≤hr Y if, and only if, min{X, Z} ≤mrl min{Y, Z} for any nonnegative random variable Z which is independent of X and of Y. This is so because X ≤hr Y implies min{X, Z} ≤hr min{Y, Z} by Theorem 1.B.33, and the latter implies the above inequality by Theorem 2.A.1.

The proof of the next result is not given here.

Theorem 2.A.7. Let X and Y be two continuous nonnegative random variables. Then X ≤hr Y if, and only if,

$$1 - e^{-sX} \le_{\mathrm{mrl}} 1 - e^{-sY} \quad \text{for all } s > 0.$$

A characterization of the order ≤mrl , by means of the increasing convex order, is given in Theorem 4.A.24.


2.A.3 Some closure properties

In general, if $X_1 \le_{\mathrm{mrl}} Y_1$ and $X_2 \le_{\mathrm{mrl}} Y_2$, where $X_1$ and $X_2$ are independent random variables and $Y_1$ and $Y_2$ are also independent random variables, then it is not necessarily true that $X_1 + X_2 \le_{\mathrm{mrl}} Y_1 + Y_2$. However, if these random variables are IFR, then it is true. This is shown in Theorem 2.A.9, but first we state and prove the following lemma, which is of independent interest.

Lemma 2.A.8. If the random variables X and Y are such that X ≤mrl Y and if Z is an IFR random variable which is independent of X and Y, then

$$X + Z \le_{\mathrm{mrl}} Y + Z. \tag{2.A.9}$$

Proof. Denote by $f_W$ and $\bar F_W$ the density function and the survival function of any random variable W. Note that

$$\int_{x=s}^\infty \bar F_{X+Z}(x)\,dx = \int_{-\infty}^\infty \bar F_X(u)\bar F_Z(s-u)\,du \quad \text{for all } s.$$

Now, for s ≤ t, compute

$$\begin{aligned}
&\int_{x=s}^\infty \bar F_{X+Z}(x)\,dx \int_{y=t}^\infty \bar F_{Y+Z}(y)\,dy - \int_{x=t}^\infty \bar F_{X+Z}(x)\,dx \int_{y=s}^\infty \bar F_{Y+Z}(y)\,dy \\
&\quad = \int_v \int_{u \ge v} \big[\bar F_X(u)\bar F_Z(s-u)\bar F_Y(v)\bar F_Z(t-v) + \bar F_X(v)\bar F_Z(s-v)\bar F_Y(u)\bar F_Z(t-u)\big]\,du\,dv \\
&\qquad - \int_v \int_{u \ge v} \big[\bar F_X(u)\bar F_Z(t-u)\bar F_Y(v)\bar F_Z(s-v) + \bar F_X(v)\bar F_Z(t-v)\bar F_Y(u)\bar F_Z(s-u)\big]\,du\,dv \\
&\quad = \int_v \int_{u \ge v} \Big[\int_{x=u}^\infty \bar F_X(x)\,dx \cdot \bar F_Y(v) - \int_{x=u}^\infty \bar F_Y(x)\,dx \cdot \bar F_X(v)\Big] \\
&\qquad\qquad \times \big[f_Z(s-u)\bar F_Z(t-v) - f_Z(t-u)\bar F_Z(s-v)\big]\,du\,dv,
\end{aligned}$$

where the second equality is obtained by integration by parts and by collection of terms. Since X ≤mrl Y it follows from (2.A.4) that the expression within the first set of brackets in the last integral is nonpositive. Since Z is IFR it can be verified that the quantity in the second pair of brackets in the last integral is also nonpositive. Therefore the integral is nonnegative. This proves (2.A.9).

Theorem 2.A.9. Let $(X_i, Y_i)$, i = 1, 2, . . . , m, be independent pairs of random variables such that $X_i \le_{\mathrm{mrl}} Y_i$, i = 1, 2, . . . , m. If $X_i$, $Y_i$, i = 1, 2, . . . , m, are all IFR, then

$$\sum_{i=1}^m X_i \le_{\mathrm{mrl}} \sum_{i=1}^m Y_i.$$


Proof. Repeated application of (2.A.9), using the closure property of IFR under convolution, yields the desired result.

Another interesting lemma is stated next. Recall that a random variable X is said to be (or to have) decreasing mean residual life (DMRL) if m(t) is decreasing in t.

Lemma 2.A.10. If the random variables X and Y are such that X ≤hr Y and if Z is a DMRL random variable independent of X and Y, then X + Z ≤mrl Y + Z.

Proof. Integrating the identity in the proof of Lemma 1.B.3, we obtain that, for s ≤ t, one has

$$\begin{aligned}
&\int_{x=s}^\infty \bar F_{X+Z}(x)\,dx \int_{y=t}^\infty \bar F_{Y+Z}(y)\,dy - \int_{x=t}^\infty \bar F_{X+Z}(x)\,dx \int_{y=s}^\infty \bar F_{Y+Z}(y)\,dy \\
&\quad = \int_v \int_{u \ge v} \big[\bar F_X(u) f_Y(v) - f_X(v)\bar F_Y(u)\big] \\
&\qquad\qquad \times \Big[\int_{y=t}^\infty \bar F_Z(y-v)\,dy \cdot \bar F_Z(s-u) - \int_{x=s}^\infty \bar F_Z(x-v)\,dx \cdot \bar F_Z(t-u)\Big]\,du\,dv.
\end{aligned}$$

The result now follows from the assumptions.

It should be pointed out that a theorem such as Theorem 2.A.9 cannot be obtained from Lemma 2.A.10. The reason is that the inductive argument used to prove Theorem 2.A.9 does not have an analog based on Lemma 2.A.10.

Theorem 2.A.11. Let X be a DMRL random variable, and let Z be a nonnegative random variable independent of X. Then X ≤mrl X + Z.

Proof. Let $F_X$, $F_Z$, and $F_{X+Z}$ denote the distribution functions of the corresponding random variables, and let $\bar F_X$ and $\bar F_{X+Z}$ denote the corresponding survival functions. Then, for any $t \in \mathbb{R}$ we have

$$\begin{aligned}
\bar F_X(t) \int_t^\infty \bar F_{X+Z}(u)\,du &= \bar F_X(t) \int_t^\infty \int_0^\infty \bar F_X(u-z)\,dF_Z(z)\,du \\
&= \bar F_X(t) \int_0^\infty \int_t^\infty \bar F_X(u-z)\,du\,dF_Z(z) \\
&= \bar F_X(t) \int_0^\infty \int_{t-z}^\infty \bar F_X(u)\,du\,dF_Z(z) \\
&\ge \int_0^\infty \bar F_X(t-z) \int_t^\infty \bar F_X(u)\,du\,dF_Z(z) \\
&= \bar F_{X+Z}(t) \int_t^\infty \bar F_X(u)\,du,
\end{aligned}$$

where the inequality follows from the assumption that X is DMRL. The stated result now follows from (2.A.4).
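A small closed-form illustration of Theorem 2.A.11, offered as a sketch under stated assumptions rather than as part of the original argument: take X to be Erlang with 2 unit-rate stages (IFR, hence DMRL) and Z to be an independent unit-rate exponential, so that X + Z is Erlang with 3 stages. The function names below are illustrative.

```python
import math

def erlang_survival(t: float, k: int) -> float:
    """P{W > t} for W ~ Erlang(k, rate 1): exp(-t) * sum_{j<k} t^j / j!."""
    return math.exp(-t) * sum(t ** j / math.factorial(j) for j in range(k))

def erlang_mrl(t: float, k: int) -> float:
    """Mean residual life of Erlang(k, rate 1) at t.

    Uses int_t^inf exp(-u) u^j / j! du = P{Erlang(j + 1) > t}, so the
    integrated survival function is a sum of Erlang survival functions.
    """
    tail = sum(erlang_survival(t, i) for i in range(1, k + 1))
    return tail / erlang_survival(t, k)

for t in (0.0, 1.0, 5.0):
    # mrl of X = Erlang(2) next to mrl of X + Z = Erlang(3)
    print(t, erlang_mrl(t, 2), erlang_mrl(t, 3))
```

In this example the two mrl functions are (2 + t)/(1 + t) and (3 + 2t + t²/2)/(1 + t + t²/2), and the first is the smaller one for every t ≥ 0, in agreement with (2.A.2).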


A mean residual life order comparison of random sums is given in the following result.

Theorem 2.A.12. Let $\{X_i, i = 1, 2, \dots\}$ be a sequence of independent and identically distributed nonnegative IFR random variables. Let M and N be two discrete positive integer-valued random variables such that M ≤mrl N (in the sense of (2.A.7)), and assume that M and N are independent of the $X_i$'s. Then

$$\sum_{i=1}^M X_i \le_{\mathrm{mrl}} \sum_{i=1}^N X_i.$$

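The mean residual life order does not have the property of being simply closed under mixtures. However, under quite strong conditions the order ≤mrl is closed under mixtures. This is shown in the next theorem which may be compared with Theorem 1.B.8.

Theorem 2.A.13. Let X, Y, and Θ be random variables such that $[X \mid \Theta = \theta] \le_{\mathrm{mrl}} [Y \mid \Theta = \theta']$ for all θ and θ′ in the support of Θ. Then X ≤mrl Y.

Proof. The proof is similar to the proof of Theorem 1.B.8. Select a θ and a θ′ in the support of Θ. Let $\bar F(\cdot \mid \theta)$, $\bar G(\cdot \mid \theta)$, $\bar F(\cdot \mid \theta')$, and $\bar G(\cdot \mid \theta')$ be the survival functions of $[X \mid \Theta = \theta]$, $[Y \mid \Theta = \theta]$, $[X \mid \Theta = \theta']$, and $[Y \mid \Theta = \theta']$, respectively. It is sufficient to show that for α ∈ (0, 1) we have

$$\frac{\alpha \int_t^\infty \bar F(u \mid \theta)\,du + (1-\alpha)\int_t^\infty \bar F(u \mid \theta')\,du}{\alpha \bar F(t \mid \theta) + (1-\alpha)\bar F(t \mid \theta')} \le \frac{\alpha \int_t^\infty \bar G(u \mid \theta)\,du + (1-\alpha)\int_t^\infty \bar G(u \mid \theta')\,du}{\alpha \bar G(t \mid \theta) + (1-\alpha)\bar G(t \mid \theta')} \quad \text{for all } t \ge 0.$$

The proof of this inequality is similar to the proof of (1.B.12).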

An analog of Theorem 1.B.12 exists for the order ≤mrl. This is stated next.

Theorem 2.A.14. Let X and Y be two nonnegative independent random variables. Then X ≤mrl Y if, and only if, for all functions α and β such that β is nonnegative and α/β and β are increasing, one has

$$E[\alpha^*(X)]E[\beta^*(Y)] \le E[\alpha^*(Y)]E[\beta^*(X)],$$

provided the expectations exist, where

$$\alpha^*(x) = \int_0^x \alpha(u)\,du \quad \text{and} \quad \beta^*(x) = \int_0^x \beta(u)\,du.$$

In particular, if X ≤mrl Y, then

$$\frac{E[Y^n]}{E[X^n]} \quad \text{is increasing in } n. \tag{2.A.10}$$
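As a quick illustration of (2.A.10), the following sketch assumes X is Erlang with 2 unit-rate stages and Y is Erlang with 3 unit-rate stages. Here X ≤mrl Y holds by Theorem 2.A.11, since X is DMRL and Y has the distribution of X plus an independent exponential. The moments of these laws are $E[W^n] = \Gamma(k+n)/\Gamma(k)$; the function name is an illustrative choice.

```python
import math

def erlang_moment(k: int, n: int) -> float:
    """E[W^n] for W ~ Erlang(k, rate 1), namely Gamma(k + n) / Gamma(k)."""
    return math.gamma(k + n) / math.gamma(k)

# Ratio E[Y^n] / E[X^n] for Y ~ Erlang(3) and X ~ Erlang(2); it equals (n + 2) / 2.
ratios = [erlang_moment(3, n) / erlang_moment(2, n) for n in range(8)]
print(ratios)
```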


Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a subset of the real line. As in Sections 1.A.3 and 1.C.3 let X(θ) denote a random variable with distribution function $G_\theta$. For any random variable Θ with support in $\mathcal{X}$, and with distribution function F, let us denote by X(Θ) a random variable with distribution function H given by

$$H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.$$

The following result is comparable to Theorems 1.A.6, 1.B.14, 1.B.52, and 1.C.17.

Theorem 2.A.15. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, i = 1, 2, that is, suppose that the distribution function of $Y_i$ is given by

$$H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

If

$$X(\theta) \le_{\mathrm{mrl}} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{2.A.11}$$

and if

$$\Theta_1 \le_{\mathrm{hr}} \Theta_2, \tag{2.A.12}$$

then

$$Y_1 \le_{\mathrm{mrl}} Y_2. \tag{2.A.13}$$

The proof of Theorem 2.A.15 uses the increasing convex order, and is therefore given in Remark 4.A.29 in Chapter 4.

A Laplace transform characterization of the order ≤mrl is given next; it may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, and 1.C.25.

Theorem 2.A.16. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

$$X_1 \le_{\mathrm{mrl}} X_2 \iff N_\lambda(X_1) \le_{\mathrm{mrl}} N_\lambda(X_2) \quad \text{for all } \lambda > 0,$$

where the notation $N_\lambda(X_1) \le_{\mathrm{mrl}} N_\lambda(X_2)$ is in the sense of (2.A.7).

Proof. We use the notation of Theorem 1.A.13. Denote the distribution and survival functions of $X_k$ by $F_k$ and $\bar F_k$, k = 1, 2. For k = 1, 2, note that $\alpha_\lambda^{X_k}(n)$ can be written as

$$\alpha_\lambda^{X_k}(n) = \int_0^\infty \sum_{i=n}^\infty e^{-\lambda x}\frac{(\lambda x)^i}{i!}\,dF_k(x) = \begin{cases} 1, & n = 0, \\ \int_0^\infty \lambda e^{-\lambda x}\frac{(\lambda x)^{n-1}}{(n-1)!}\bar F_k(x)\,dx, & n = 1, 2, \dots. \end{cases} \tag{2.A.14}$$


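Therefore

$$P[N_\lambda(X_k) = n] = \int_0^\infty e^{-\lambda x}\frac{(\lambda x)^n}{n!}\,dF_k(x), \quad n = 0, 1, 2, \dots. \tag{2.A.15}$$

From (2.A.15) it is seen that

$$E[N_\lambda(X_k)] = \lambda E[X_k], \quad k = 1, 2, \tag{2.A.16}$$

provided the expectations exist.

First assume that $X_1 \le_{\mathrm{mrl}} X_2$. For the sake of this proof replace temporarily the notation $\alpha_\lambda^{X_1}(n)$ and $\alpha_\lambda^{X_2}(n)$, by $\alpha_{\lambda,1}(n)$ and $\alpha_{\lambda,2}(n)$, respectively. We also denote $E[X_1]$ and $E[X_2]$ by $\mu_1$ and $\mu_2$, respectively. The proof of $N_\lambda(X_1) \le_{\mathrm{mrl}} N_\lambda(X_2)$ will consist of showing the following three inequalities:

$$\frac{\sum_{n=0}^\infty \alpha_{\lambda,2}(n)}{\sum_{n=0}^\infty \alpha_{\lambda,1}(n)} \le \frac{\sum_{n=1}^\infty \alpha_{\lambda,2}(n)}{\sum_{n=1}^\infty \alpha_{\lambda,1}(n)}, \tag{2.A.17}$$

$$\frac{\sum_{n=1}^\infty \alpha_{\lambda,2}(n)}{\sum_{n=1}^\infty \alpha_{\lambda,1}(n)} \le \frac{\sum_{n=2}^\infty \alpha_{\lambda,2}(n)}{\sum_{n=2}^\infty \alpha_{\lambda,1}(n)}, \tag{2.A.18}$$

and

$$\sum_{n=m}^\infty \alpha_{\lambda,k}(n) \quad \text{is TP}_2 \text{ in } k \in \{1, 2\} \text{ and } m \ge 2. \tag{2.A.19}$$

In order to prove (2.A.17) note that from (2.A.16) it follows that

$$\sum_{n=0}^\infty \alpha_{\lambda,k}(n) = 1 + \lambda\mu_k, \quad k = 1, 2,$$

and

$$\sum_{n=1}^\infty \alpha_{\lambda,k}(n) = \lambda\mu_k, \quad k = 1, 2. \tag{2.A.20}$$

But since $X_1 \le_{\mathrm{mrl}} X_2$ implies that $\mu_1 \le \mu_2$ it follows that

$$\frac{1 + \lambda\mu_2}{1 + \lambda\mu_1} \le \frac{\lambda\mu_2}{\lambda\mu_1},$$

and (2.A.17) is obtained.

Next notice that (2.A.18) is equivalent to

$$\frac{\alpha_{\lambda,2}(1)}{\alpha_{\lambda,1}(1)} \le \frac{\sum_{n=1}^\infty \alpha_{\lambda,2}(n)}{\sum_{n=1}^\infty \alpha_{\lambda,1}(n)}. \tag{2.A.21}$$

Since $\sum_{n=1}^\infty \alpha_{\lambda,k}(n) = \lambda\mu_k$, k = 1, 2, and

$$\alpha_{\lambda,k}(1) = \int_0^\infty \lambda e^{-\lambda x}\bar F_k(x)\,dx = \lambda\Big[\mu_k - \int_0^\infty \lambda e^{-\lambda x}\int_x^\infty \bar F_k(u)\,du\,dx\Big], \quad k = 1, 2,$$

it follows that (2.A.21) is the same as

$$\mu_1 \int_0^\infty \lambda e^{-\lambda x}\int_x^\infty \bar F_2(u)\,du\,dx - \mu_2 \int_0^\infty \lambda e^{-\lambda x}\int_x^\infty \bar F_1(u)\,du\,dx \ge 0. \tag{2.A.22}$$

Rewriting the left-hand side of (2.A.22) we see that

$$\begin{aligned}
&\int_0^\infty \lambda e^{-\lambda x}\Big[\mu_1 \int_x^\infty \bar F_2(u)\,du - \mu_2 \int_x^\infty \bar F_1(u)\,du\Big]dx \\
&\quad = \int_0^\infty \lambda e^{-\lambda x}\Big[\int_0^\infty \bar F_1(u)\,du \int_x^\infty \bar F_2(u)\,du - \int_0^\infty \bar F_2(u)\,du \int_x^\infty \bar F_1(u)\,du\Big]dx \\
&\quad \ge 0,
\end{aligned}$$

where the inequality follows from the TP2-ness of $\int_x^\infty \bar F_k(u)\,du$ in k = 1, 2, and x ≥ 0 (see (2.A.3)). This proves (2.A.22), and hence (2.A.18).

Finally, in order to prove (2.A.19), notice, using a straightforward computation, that, for m ≥ 2,

$$\sum_{n=m}^\infty \alpha_{\lambda,k}(n) = \int_0^\infty \lambda^2 e^{-\lambda x}\frac{(\lambda x)^{m-2}}{(m-2)!}\int_x^\infty \bar F_k(u)\,du\,dx. \tag{2.A.23}$$

By assumption, $\int_x^\infty \bar F_k(u)\,du$ is TP2 in k ∈ {1, 2} and x ≥ 0. Furthermore, $\lambda^2 e^{-\lambda x}\frac{(\lambda x)^{m-2}}{(m-2)!}$ is TP2 in m ≥ 2 and x ≥ 0. Thus, it follows that $\sum_{n=m}^\infty \alpha_{\lambda,k}(n)$ is TP2 in k ∈ {1, 2} and m ≥ 2, and this establishes (2.A.19).

Now suppose that $N_\lambda(X_1) \le_{\mathrm{mrl}} N_\lambda(X_2)$ for all λ > 0. Then

$$\frac{\sum_{n=m}^\infty \alpha_{\lambda,1}(n)}{\alpha_{\lambda,1}(m)} \le \frac{\sum_{n=m}^\infty \alpha_{\lambda,2}(n)}{\alpha_{\lambda,2}(m)}, \quad m = 0, 1, 2, \dots.$$

For m ≥ 2, by (2.A.23) and (2.A.14),

$$\frac{\int_0^\infty \lambda e^{-\lambda u}\frac{(\lambda u)^{m-2}}{(m-2)!}\Big(\int_u^\infty \bar F_1(x)\,dx\Big)du}{\int_0^\infty \lambda e^{-\lambda u}\frac{(\lambda u)^{m-1}}{(m-1)!}\bar F_1(u)\,du} \le \frac{\int_0^\infty \lambda e^{-\lambda u}\frac{(\lambda u)^{m-2}}{(m-2)!}\Big(\int_u^\infty \bar F_2(x)\,dx\Big)du}{\int_0^\infty \lambda e^{-\lambda u}\frac{(\lambda u)^{m-1}}{(m-1)!}\bar F_2(u)\,du}. \tag{2.A.24}$$

For a fixed y > 0, define λ = (m − 1)/y. Letting m → ∞ (λ → ∞), we have

$$\int_0^\infty \lambda e^{-\lambda u}\frac{(\lambda u)^{m-2}}{(m-2)!}\Big(\int_u^\infty \bar F_k(x)\,dx\Big)du \to \int_y^\infty \bar F_k(x)\,dx,$$

and

$$\int_0^\infty \lambda e^{-\lambda u}\frac{(\lambda u)^{m-1}}{(m-1)!}\bar F_k(u)\,du \to \bar F_k(y), \quad k = 1, 2,$$

as long as y is a continuity point of $\bar F_1(x)$ and $\bar F_2(x)$. For such y's, (2.A.24) gives us

$$\frac{\int_y^\infty \bar F_1(x)\,dx}{\bar F_1(y)} \le \frac{\int_y^\infty \bar F_2(x)\,dx}{\bar F_2(y)}.$$

It follows that $X_1 \le_{\mathrm{mrl}} X_2$ since the set of continuity points of $\bar F_1(x)$ and $\bar F_2(x)$ is dense in the set of positive real numbers.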
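The quantities in this proof can be made concrete in a case with a known answer. The sketch below is an illustration under assumptions, not part of the proof: X is taken to be exponential with a hypothetical mean `MU`, the Poisson rate `LAM` is likewise hypothetical, and the tail probabilities $\alpha_\lambda^X(n)$ are computed by numerical integration of the second form in (2.A.14). In this case the mixed Poisson variable is geometric, so $\alpha_\lambda^X(n) = q^n$ with q = λµ/(1 + λµ), and its discrete mrl function is constant and equal to λµ.

```python
import math

LAM = 2.0  # hypothetical Poisson rate lambda
MU = 1.5   # hypothetical mean of the exponential random variable X

def simpson(f, a: float, b: float, n: int = 4000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def alpha(n: int) -> float:
    """P[N_lambda(X) >= n] from the second form of (2.A.14), X ~ Exp(mean MU)."""
    if n == 0:
        return 1.0
    norm = math.factorial(n - 1)

    def integrand(x: float) -> float:
        gamma_kernel = LAM * math.exp(-LAM * x) * (LAM * x) ** (n - 1) / norm
        return gamma_kernel * math.exp(-x / MU)  # times the survival function of X

    return simpson(integrand, 0.0, 60.0)  # the integrand is negligible beyond 60

alphas = [alpha(n) for n in range(61)]
q = LAM * MU / (1.0 + LAM * MU)

def discrete_mrl(n: int) -> float:
    """m(n) = sum_{j > n} P[N >= j] / P[N >= n], with the sum truncated at 60."""
    return sum(alphas[n + 1:]) / alphas[n]

print(alphas[3], q ** 3)            # tail probability against the geometric value
print(sum(alphas[1:]), LAM * MU)    # E[N_lambda(X)] against lambda * E[X], cf. (2.A.16)
print(discrete_mrl(0), discrete_mrl(5))
```

The truncation point 60 for both the sum and the integral is chosen for these particular parameter values; larger λµ would require larger cutoffs.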

An analog of Theorem 1.B.21 is the following result.

Theorem 2.A.17. Let X be a nonnegative DMRL random variable, and let a ≤ 1 be a positive constant. Then aX ≤mrl X.

Proof. It is easy to verify that the mean residual life function of aX is given by $a\,m(t/a)$, for all t, where m is the mean residual life function of X. Now

$$a\,m\Big(\frac{t}{a}\Big) \le m\Big(\frac{t}{a}\Big) \le m(t) \quad \text{for all } t,$$

where the first inequality follows from a ∈ (0, 1] and the second inequality follows from the assumption that X is DMRL (note that t/a ≥ t). The proof now follows from (2.A.2).

In the next result it is shown that a random variable, whose distribution is the mixture of two distributions of mean residual life ordered random variables, is bounded from below and from above, in the mean residual life order sense, by these two random variables.

Theorem 2.A.18. Let X and Y be two random variables with distribution functions F and G, respectively. Let W be a random variable with the distribution function pF + (1 − p)G for some p ∈ (0, 1). If X ≤mrl Y, then X ≤mrl W ≤mrl Y.

The proof of Theorem 2.A.18 is similar to the proof of Theorem 1.B.22, but it uses (2.A.3) instead of (1.B.3). We omit the details.

The following result is proven in Remark 4.A.25 of Section 4.A.3.

Theorem 2.A.19. Let X and Y be two random variables. If X ≤mrl Y, then φ(X) ≤mrl φ(Y) for every increasing convex function φ.

Analogous to the result in Remark 1.A.18, it can be shown that the set of all distribution functions on $\mathbb{R}_+$ with finite means is a lattice with respect to the order ≤mrl.

Let $X_1, X_2, \dots, X_m$ be random variables, and let $X_{(k:m)}$ denote the corresponding kth order statistic, k = 1, 2, . . . , m.

Theorem 2.A.20. Let $X_1, X_2, \dots, X_m$ be m independent random variables. If $X_i \le_{\mathrm{mrl}} X_m$, i = 1, 2, . . . , m − 1, then $X_{(m-1:m-1)} \le_{\mathrm{mrl}} X_{(m:m)}$.


Let $X_1, X_2, \dots, X_m$ be nonnegative random variables and let $U_{(i:m)} = X_{(i:m)} - X_{(i-1:m)}$ denote the corresponding spacings, i = 1, 2, . . . , m (where $U_{(1:m)} = X_{(1:m)}$). Similarly, let $Y_1, Y_2, \dots, Y_n$ be nonnegative random variables and let $V_{(i:n)}$ denote the corresponding spacings, i = 1, 2, . . . , n.

Theorem 2.A.21. For positive integers m and n, let $X_1, X_2, \dots, X_m$ be independent identically distributed nonnegative random variables, and let $Y_1, Y_2, \dots, Y_n$ be other independent identically distributed nonnegative random variables. If $X_1 \le_{\mathrm{mrl}} Y_1$, and if $X_1$ is IMRL and $Y_1$ is DMRL, then

$$(m - j + 1)U_{(j:m)} \le_{\mathrm{mrl}} (n - i + 1)V_{(i:n)} \quad \text{for } j \le m \text{ and } i \le n.$$

The following example may be compared to Examples 1.B.24, 1.C.48, 3.B.38, 4.B.14, 6.B.41, 6.D.8, 6.E.13, and 7.B.13.

Example 2.A.22. Let X and Y be two absolutely continuous nonnegative random variables with survival functions $\bar F$ and $\bar G$ and density functions f and g, respectively. Denote $\Lambda_1 = -\log \bar F$, $\Lambda_2 = -\log \bar G$, and $\lambda_i = \Lambda_i'$, i = 1, 2. Consider two nonhomogeneous Poisson processes $N_1 = \{N_1(t), t \ge 0\}$ and $N_2 = \{N_2(t), t \ge 0\}$ with mean functions $\Lambda_1$ and $\Lambda_2$ (see Example 1.B.13), respectively. Let $T_{i,1}, T_{i,2}, \dots$ be the successive epoch times of process $N_i$, and let $X_{i,n} \equiv T_{i,n} - T_{i,n-1}$, n ≥ 1 (where $T_{i,0} \equiv 0$), be the inter-epoch times of the process $N_i$, i = 1, 2. Note that $X =_{\mathrm{st}} X_{1,1}$ and $Y =_{\mathrm{st}} X_{2,1}$.

It turns out that, under some conditions, the mean residual life ordering of the first two inter-epoch times implies the mean residual life ordering of all the corresponding later inter-epoch times. Explicitly, it will be shown below that if X ≤mrl Y, if X and Y are IMRL, and if (1.B.25) holds, then $X_{1,n} \le_{\mathrm{mrl}} X_{2,n}$ for each n ≥ 1.

For the purpose of this proof we denote $\bar F$ by $\bar F_1$ and $\bar G$ by $\bar F_2$. The stated result is obvious for n = 1. So let us fix n ≥ 2. The survival function $\bar G_{i,n}$ of $X_{i,n}$, i = 1, 2, is given in (1.B.26). From (2.A.3) it is seen that the stated result is equivalent to

$$\int_t^\infty \bar G_{i,n}(x)\,dx \quad \text{is TP}_2 \text{ in } (i, t);$$

that is, to

$$\int_{s=0}^\infty \lambda_i(s)\frac{\Lambda_i^{n-2}(s)}{(n-2)!}\int_{u=s+t}^\infty \bar F_i(u)\,du\,ds \quad \text{is TP}_2 \text{ in } (i, t). \tag{2.A.25}$$

Now, from Example 1.B.24 we know that (1.B.25) implies that $\lambda_i(s)\frac{\Lambda_i^{n-2}(s)}{(n-2)!}$ is TP2 in (i, s). The assumption $F_1 \le_{\mathrm{mrl}} F_2$ means that

$$\int_{u=s+t}^\infty \bar F_i(u)\,du \quad \text{is TP}_2 \text{ in } (i, s) \text{ and in } (i, t).$$

Finally, the assumption that $F_i$ is IMRL means that

$$\int_{u=s+t}^\infty \bar F_i(u)\,du \quad \text{is TP}_2 \text{ in } (s, t).$$

Thus (2.A.25) follows from Theorem 5.1 on page 123 of Karlin [275].

2.A.4 A property in reliability theory

The order ≤mrl can be used to characterize DMRL random variables. As in Section 1.A.3, $[Z \mid A]$ denotes any random variable that has as its distribution the conditional distribution of Z given A.

Theorem 2.A.23. The random variable X is DMRL if, and only if, any one of the following equivalent conditions holds:
(i) $[X - t \mid X > t] \ge_{\mathrm{mrl}} [X - t' \mid X > t']$ whenever t ≤ t′.
(ii) $X \ge_{\mathrm{mrl}} [X - t \mid X > t]$ for all t ≥ 0 (when X is a nonnegative random variable).
(iii) $X + t \le_{\mathrm{mrl}} X + t'$ whenever t ≤ t′.

The proofs of all these statements are trivial and are thus omitted. Other characterizations of DMRL and IMRL random variables, by means of other stochastic orders, can be found in Theorems 2.B.17, 3.A.56, 3.C.13, and 4.A.51. A multivariate extension of parts (i) and (ii) of Theorem 2.A.23 is given in Section 6.F.3.

An interesting application of part (iii) of Theorem 2.A.23 is the following corollary. Its proof consists of a combination of Theorem 2.A.23(iii) with Lemma 2.A.8 (or, alternatively, a combination of Theorem 2.A.23(iii), Theorem 1.B.38(iii), and Lemma 2.A.10).

Corollary 2.A.24. Let X be a DMRL random variable and let Y be an IFR random variable. If X and Y are independent, then X + Y is DMRL.

2.B The Harmonic Mean Residual Life Order

2.B.1 Definition

Let X and Y be two nonnegative random variables with mrl functions m and l, respectively, and suppose that the harmonic averages of m and l are comparable as follows:

$$\Big[\frac{1}{x}\int_0^x \frac{1}{m(u)}\,du\Big]^{-1} \le \Big[\frac{1}{x}\int_0^x \frac{1}{l(u)}\,du\Big]^{-1} \quad \text{for all } x > 0. \tag{2.B.1}$$

Then X is said to be smaller than Y in the harmonic mean residual life order (denoted as X ≤hmrl Y).

Notice that

$$\frac{1}{m(u)} = \frac{\bar F(u)}{\int_u^\infty \bar F(v)\,dv} = -\frac{d}{du}\log\Big(\int_u^\infty \bar F(v)\,dv\Big).$$

Therefore

$$\int_0^x \frac{1}{m(u)}\,du = \log\Big(\frac{EX}{\int_x^\infty \bar F(u)\,du}\Big).$$

Similarly

$$\int_0^x \frac{1}{l(u)}\,du = \log\Big(\frac{EY}{\int_x^\infty \bar G(u)\,du}\Big).$$

Thus it is seen that (2.B.1) holds if, and only if,

$$\frac{\int_x^\infty \bar F(u)\,du}{EX} \le \frac{\int_x^\infty \bar G(u)\,du}{EY} \quad \text{for all } x \ge 0. \tag{2.B.2}$$
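The passage from (2.B.1) to (2.B.2) can be traced numerically. The sketch below is illustrative only and rests on assumed distributions: X is a unit-mean exponential and Y is Erlang with 2 unit-rate stages, for which the integrated survival functions $\int_x^\infty \bar F(u)\,du$ are $e^{-x}$ and $(2 + x)e^{-x}$. The function names are hypothetical.

```python
import math

def stop_loss_exp(x: float) -> float:
    """int_x^inf Fbar(u) du for X ~ Exp(mean 1); the value at 0 is EX = 1."""
    return math.exp(-x)

def stop_loss_erlang2(x: float) -> float:
    """int_x^inf Gbar(u) du for Y ~ Erlang(2, rate 1); the value at 0 is EY = 2."""
    return (2.0 + x) * math.exp(-x)

def harmonic_mrl(stop_loss, x: float) -> float:
    """Harmonic average of the mrl function over (0, x).

    Uses int_0^x du / m(u) = log(E / int_x^inf Fbar(u) du), so the
    harmonic average in (2.B.1) equals x divided by that logarithm.
    """
    return x / math.log(stop_loss(0.0) / stop_loss(x))

for x in (0.5, 2.0, 8.0):
    lhs = stop_loss_exp(x) / stop_loss_exp(0.0)          # left side of (2.B.2)
    rhs = stop_loss_erlang2(x) / stop_loss_erlang2(0.0)  # right side of (2.B.2)
    print(x, harmonic_mrl(stop_loss_exp, x), harmonic_mrl(stop_loss_erlang2, x), lhs, rhs)
```

Both comparisons point the same way at every x tried, as the equivalence of (2.B.1) and (2.B.2) requires.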

For discrete random variables that take on values in $\mathbb{N}_+$ the definition of ≤hmrl should be modified. Let X and Y be two such random variables. We denote X ≤hmrl Y if

$$\frac{\sum_{j=n}^\infty P\{X \ge j\}}{E[X]} \le \frac{\sum_{j=n}^\infty P\{Y \ge j\}}{E[Y]}, \quad n = 1, 2, \dots. \tag{2.B.3}$$

2.B.2 The relation between the harmonic mean residual life and some other stochastic orders

Since the harmonic averages of m and l are increasing functionals of m and l, respectively, it follows that

$$X \le_{\mathrm{mrl}} Y \implies X \le_{\mathrm{hmrl}} Y.$$

The order ≤hmrl is closely related to the order ≤icx which is studied in Section 4.A. The reader may find it helpful to browse over that section now, since some of the ideas that are explained there are used below. Note that both (2.B.2) and (2.B.3) are equivalent to

$$\frac{E[(X-t)_+]}{E[X]} \le \frac{E[(Y-t)_+]}{E[Y]} \quad \text{for all } t \ge 0, \tag{2.B.4}$$

and from (2.B.4) it follows that X ≤hmrl Y if, and only if,

$$\frac{E[\phi(X)]}{E[X]} \le \frac{E[\phi(Y)]}{E[Y]} \quad \text{for all increasing convex functions } \phi : [0, \infty) \to \mathbb{R}, \tag{2.B.5}$$

such that the expectations exist. It is worthwhile to note that condition (2.B.4) uses the expectations $E[(X-t)_+]$ and $E[(Y-t)_+]$ as (2.A.5) and as (3.A.5) in Chapter 3 and (4.A.4) in Chapter 4 do. In Chapter 4, where the order ≤icx is studied, we will use (2.B.4) in order to derive a relationship between the orders ≤hmrl and ≤icx (see Theorem 4.A.28).

Neither of the orders ≤st and ≤hmrl implies the other; counterexamples can be found in the literature.

Letting x → 0 in (2.B.1) we obtain m(0) ≤ l(0), that is,

$$X \le_{\mathrm{hmrl}} Y \implies E[X \mid X > 0] \le E[Y \mid Y > 0].$$

Thus, when X and Y are positive almost surely, then

$$X \le_{\mathrm{hmrl}} Y \implies EX \le EY. \tag{2.B.6}$$

If EX = EY, then

$$X \le_{\mathrm{hmrl}} Y \iff X \le_{\mathrm{cx}} Y, \tag{2.B.7}$$

where the order ≤cx is studied in Section 3.A (see (3.A.7)). Thus, from (3.A.4) it follows that if X ≤hmrl Y and EX = EY, then Var[X] ≤ Var[Y]. Under the proper condition, even if X and Y do not have the same mean, one can still get the variance inequality; this is shown in the next result.

Theorem 2.B.1. Let X and Y be two almost surely positive random variables with finite second moments. If X ≤hmrl Y, and if Y is NWUE, then Var[X] ≤ Var[Y].

Proof. From (2.B.5) we get

$$\frac{E[X^2]}{E[X]} \le \frac{E[Y^2]}{E[Y]}. \tag{2.B.8}$$

From Barlow and Proschan [36, page 187] it is seen that $\mathrm{Var}[Y] \ge \{E[Y]\}^2$, since Y is NWUE. Thus, using (2.B.6), we see that Var[Y] ≥ E[Y]E[X]. Therefore

$$\begin{aligned}
\mathrm{Var}[Y] &\ge \frac{E[X]}{E[Y]}\mathrm{Var}[Y] + \{E[Y] - E[X]\}E[X] \\
&= \frac{E[X]}{E[Y]} \cdot E[Y^2] - \{E[X]\}^2 \\
&\ge E[X^2] - \{E[X]\}^2 = \mathrm{Var}[X],
\end{aligned}$$

where the last inequality follows from (2.B.8).

The harmonic mean residual life order can be characterized by means of the usual stochastic order and the appropriate equilibrium age variables. Recall from (1.A.20) that for nonnegative random variables X and Y with finite means we denote by $A_X$ and $A_Y$ the corresponding asymptotic equilibrium ages. The following result follows at once from (1.A.1) and (2.B.2). It may be contrasted with Theorems 1.C.13 and 2.A.4.


Theorem 2.B.2. For nonnegative random variables X and Y with finite means we have X ≤hmrl Y if, and only if, $A_X \le_{\mathrm{st}} A_Y$.

In the next theorem the order ≤hmrl is characterized by ordering two related random variables in the sense of the usual stochastic order. Let X and Y be two nonnegative random variables with finite means and suppose that X ≤st Y and that EX < EY. Let F and G be the distribution functions of X and of Y, respectively. Define the random variable $Z_{X,Y}$ as the random variable that has the density function h given by (1.C.7), as in Theorem 1.C.14; see also Theorem 2.A.5.

Theorem 2.B.3. Let X and Y be two nonnegative random variables with finite means such that X ≤st Y and such that EY > EX > 0. Then

$$X \le_{\mathrm{hmrl}} Y \iff A_Y \le_{\mathrm{st}} Z_{X,Y} \iff A_X \le_{\mathrm{st}} Z_{X,Y},$$

where $Z_{X,Y}$ has the density function given in (1.C.7).

Proof. It is easy to see that (here $\bar H$ is the survival function of $Z_{X,Y}$, $\bar G_e$ is the survival function of $A_Y$, and $\bar F_e$ is as in (1.A.20))

$$\bar H(x) - \bar G_e(x) = \frac{EX}{EY - EX}\big[\bar G_e(x) - \bar F_e(x)\big], \quad x \ge 0.$$

Thus the first stated equivalence follows from Theorem 2.B.2. The proof of the second equivalence is similar.

The order ≤hmrl can characterize the order ≤mrl as follows.

Theorem 2.B.4. Let X and Y be two nonnegative random variables with finite means. Then X ≤mrl Y if, and only if, $[X - t \mid X > t] \le_{\mathrm{hmrl}} [Y - t \mid Y > t]$ for all t ≥ 0.

The proof of Theorem 2.B.4 consists of applying (2.B.2) to $[X - t \mid X > t]$ and $[Y - t \mid Y > t]$, for each t ≥ 0, and then showing that the resulting inequality is equivalent to (2.A.3). We omit the details.

2.B.3 Some closure properties

Under the proper conditions, the order ≤hmrl is closed under the operation of convolution. First we prove the following lemma. Recall that a nonnegative random variable X with a finite mean is called NBUE (new better than used in expectation) if $E[X - t \mid X > t] \le E[X]$ for all t > 0. Note that a nonnegative NBUE random variable must be almost surely positive.

Lemma 2.B.5. If the two almost surely positive random variables X and Y are such that X ≤hmrl Y, and if Z is an NBUE nonnegative random variable independent of X and Y, then X + Z ≤hmrl Y + Z.


Proof. Let F, G, and H [$\bar F$, $\bar G$, and $\bar H$] be the distribution [survival] functions corresponding to X, Y, and Z, respectively. The corresponding equilibrium age distribution [survival] functions will be denoted by $F_e$, $G_e$, and $H_e$ [$\bar F_e$, $\bar G_e$, and $\bar H_e$]. Let $A_X$, $A_Y$, $A_Z$, $A_{X+Z}$, and $A_{Y+Z}$ denote the asymptotic equilibrium ages corresponding to X, Y, Z, X + Z, and Y + Z, respectively. Now compute

$$\begin{aligned}
P\{A_{X+Z} > t\} &= \frac{1}{E[X+Z]}\int_{v=t}^\infty P\{X + Z > v\}\,dv \\
&= \frac{1}{EX + EZ}\int_{v=t}^\infty \int_{u=0}^\infty \bar F(v-u)\,dH(u)\,dv \\
&= \frac{1}{EX + EZ}\int_{u=0}^\infty \int_{v=t}^\infty \bar F(v-u)\,dv\,dH(u) \\
&= \frac{1}{EX + EZ}\int_{u=0}^\infty \int_{v=t-u}^\infty \bar F(v)\,dv\,dH(u) \\
&= \frac{1}{EX + EZ}\Big[\int_{u=0}^t \int_{v=t-u}^\infty \bar F(v)\,dv\,dH(u) + \int_{u=t}^\infty \int_{v=0}^\infty \bar F(v)\,dv\,dH(u) + \int_{u=t}^\infty \int_{v=t-u}^0 dv\,dH(u)\Big] \\
&= \frac{1}{EX + EZ}\Big[EX\int_0^t \bar F_e(t-u)\,dH(u) + EX \cdot \bar H(t) + \int_t^\infty \bar H(u)\,du\Big] \\
&= \frac{1}{EX + EZ}\big[EX \cdot P\{A_X + Z > t\} + EZ \cdot \bar H_e(t)\big],
\end{aligned}$$

where $A_X$ and Z are taken to be independent in the above expression. Now, since Z is NBUE we have that $Z \ge_{\mathrm{st}} A_Z$. Therefore

$$P\{A_X + Z > t\} \ge P\{A_X + A_Z > t\} \ge P\{A_Z > t\} = \bar H_e(t). \tag{2.B.9}$$

Now notice that

$$\begin{aligned}
P\{A_{X+Z} > t\} &= \frac{1}{EX + EZ}\big[EX \cdot P\{A_X + Z > t\} + EZ \cdot \bar H_e(t)\big] \\
&\le \frac{1}{EY + EZ}\big[EY \cdot P\{A_X + Z > t\} + EZ \cdot \bar H_e(t)\big] \\
&\le \frac{1}{EY + EZ}\big[EY \cdot P\{A_Y + Z > t\} + EZ \cdot \bar H_e(t)\big] \\
&= P\{A_{Y+Z} > t\}
\end{aligned}$$

($A_Y$ and Z are taken to be independent in the above), where the first inequality follows from (2.B.6) and (2.B.9), and the second inequality follows from Theorem 2.B.2. The result now follows from Theorem 2.B.2.
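The decomposition of $P\{A_{X+Z} > t\}$ derived at the start of this proof can be sanity-checked in a case where every term has a closed form. The sketch below assumes, purely for illustration, that X is Erlang with 2 unit-rate stages and Z is an independent unit-rate exponential. It uses the fact that the equilibrium age of an Erlang(k) variable is an equal-weight mixture of Erlang(1), ..., Erlang(k), so that $A_X + Z$ is an equal-weight mixture of Erlang(2) and Erlang(3).

```python
import math

def erlang_survival(t: float, k: int) -> float:
    """P{W > t} for W ~ Erlang(k, rate 1)."""
    return math.exp(-t) * sum(t ** j / math.factorial(j) for j in range(k))

def equilibrium_survival(t: float, k: int) -> float:
    """P{A_W > t} for W ~ Erlang(k, rate 1): (1/k) * sum_{i=1..k} P{Erlang(i) > t}."""
    return sum(erlang_survival(t, i) for i in range(1, k + 1)) / k

EX, EZ = 2.0, 1.0  # means of the assumed X ~ Erlang(2) and Z ~ Exp(1)

def left_side(t: float) -> float:
    """P{A_{X+Z} > t}, where X + Z ~ Erlang(3)."""
    return equilibrium_survival(t, 3)

def right_side(t: float) -> float:
    """(EX * P{A_X + Z > t} + EZ * He_bar(t)) / (EX + EZ)."""
    ax_plus_z = 0.5 * (erlang_survival(t, 2) + erlang_survival(t, 3))
    return (EX * ax_plus_z + EZ * equilibrium_survival(t, 1)) / (EX + EZ)

for t in (0.0, 1.0, 4.0):
    print(t, left_side(t), right_side(t))
```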


Repeated application of Lemma 2.B.5, using the closure property of NBUE under convolution, and noting that every NBUE random variable is almost surely positive, yields the following result.

Theorem 2.B.6. Let $(X_i, Y_i)$, i = 1, 2, . . . , m, be independent pairs of nonnegative random variables such that $X_i \le_{\mathrm{hmrl}} Y_i$, i = 1, 2, . . . , m. If $X_i$, $Y_i$, i = 1, 2, . . . , m, are all NBUE, then

$$\sum_{i=1}^m X_i \le_{\mathrm{hmrl}} \sum_{i=1}^m Y_i.$$

Using Theorem 2.B.6 we can prove the following result.

Theorem 2.B.7. Let $X_1, X_2, \dots$ and $Y_1, Y_2, \dots$ each be a sequence of NBUE nonnegative independent and identically distributed random variables such that $X_i \le_{\mathrm{hmrl}} Y_i$, i = 1, 2, . . .. Let M and N be integer-valued positive random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively, such that M ≤hmrl N. Then

$$\sum_{j=1}^M X_j \le_{\mathrm{hmrl}} \sum_{j=1}^N Y_j.$$

Proof. The proof here is similar to the proof of Theorem 4.A.9. The reader may wish to look at that proof before continuing to read the present proof. From Theorem 2.B.6 and (2.B.4) it is seen that

$$\frac{1}{mE[X_1]}E\Big[\Big(\sum_{i=1}^m X_i - u\Big)_+\Big] \le \frac{1}{mE[Y_1]}E\Big[\Big(\sum_{i=1}^m Y_i - u\Big)_+\Big]$$

(all the $X_i$'s have the same mean, and also all the $Y_i$'s have the same mean). Therefore

$$\frac{E\big[\big(\sum_{i=1}^m X_i - u\big)_+\big]}{E[X_1]} \le \frac{E\big[\big(\sum_{i=1}^m Y_i - u\big)_+\big]}{E[Y_1]} \quad \text{for all } u \ge 0,\ m = 1, 2, \dots.$$

Thus

$$\begin{aligned}
\frac{E\big[\big(\sum_{i=1}^M X_i - u\big)_+\big]}{E\big[\sum_{i=1}^M X_i\big]} &= \frac{\sum_{m=1}^\infty E\big[\big(\sum_{i=1}^m X_i - u\big)_+\big]P\{M = m\}}{E[M]E[X_1]} \\
&\le \frac{\sum_{m=1}^\infty E\big[\big(\sum_{i=1}^m Y_i - u\big)_+\big]P\{M = m\}}{E[M]E[Y_1]} \\
&= \frac{E\big[\big(\sum_{i=1}^M Y_i - u\big)_+\big]}{E\big[\sum_{i=1}^M Y_i\big]}.
\end{aligned}$$

Therefore (again by (2.B.4)) we have

$$\sum_{i=1}^M X_i \le_{\mathrm{hmrl}} \sum_{i=1}^M Y_i. \tag{2.B.10}$$


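Now let φ be an increasing convex function and denote $g(n) \equiv E[\phi(Y_1 + Y_2 + \cdots + Y_n)]$. In the proof of Theorem 4.A.9 it is shown that g(n) is increasing and convex in n. Therefore, since M ≤hmrl N, we have that

$$\frac{E\big[\phi\big(\sum_{i=1}^M Y_i\big)\big]}{E[M]} \le \frac{E\big[\phi\big(\sum_{i=1}^N Y_i\big)\big]}{E[N]},$$

and since the $Y_i$'s have the same mean we have that

$$\frac{E\big[\phi\big(\sum_{i=1}^M Y_i\big)\big]}{E\big[\sum_{i=1}^M Y_i\big]} = \frac{E\big[\phi\big(\sum_{i=1}^M Y_i\big)\big]}{E[M]E[Y_1]} \le \frac{E\big[\phi\big(\sum_{i=1}^N Y_i\big)\big]}{E[N]E[Y_1]} = \frac{E\big[\phi\big(\sum_{i=1}^N Y_i\big)\big]}{E\big[\sum_{i=1}^N Y_i\big]}.$$

Thus we have that

$$\sum_{i=1}^M Y_i \le_{\mathrm{hmrl}} \sum_{i=1}^N Y_i. \tag{2.B.11}$$

The inequalities (2.B.10) and (2.B.11) yield the stated result.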

A result that is related to Theorem 2.B.7 is given next. It is of interest to compare it to Theorem 1.A.5.

Theorem 2.B.8. Let $\{X_j, j = 1, 2, \dots\}$ be a sequence of nonnegative independent and identically distributed NBUE random variables, and let M be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j, j = 1, 2, \dots\}$ be another sequence of nonnegative independent and identically distributed NBUE random variables, and let N be a positive integer-valued random variable which is independent of the $Y_i$'s. Suppose that for some positive integer K we have

$$\sum_{i=1}^K X_i \le_{\mathrm{hmrl}} [\ge_{\mathrm{hmrl}}]\ Y_1,$$

and

$$M \le_{\mathrm{hmrl}} [\ge_{\mathrm{hmrl}}]\ KN.$$

Then

$$\sum_{j=1}^M X_j \le_{\mathrm{hmrl}} [\ge_{\mathrm{hmrl}}]\ \sum_{j=1}^N Y_j.$$

We do not give a detailed proof of Theorem 2.B.8 here since it is similar to the proof of Theorem 4.A.12 in Section 4.A.1. In order to construct a proof of Theorem 2.B.8 from the proof of Theorem 4.A.12 one just uses the equivalence (2.B.7) and one replaces the application of Theorem 4.A.9 by an application of Theorem 2.B.7.

Two other similar theorems are the following. Their proofs are similar to the proofs of Theorems 4.A.13 and 4.A.14 in Section 4.A.1.

Theorem 2.B.9. Let $\{X_j, j = 1, 2, \dots\}$ be a sequence of nonnegative independent and identically distributed NBUE random variables, and let M be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j, j = 1, 2, \dots\}$ be another sequence of nonnegative independent and identically distributed NBUE random variables, and let N be a positive integer-valued random variable which is independent of the $Y_i$'s. Also, let $\{N_j, j = 1, 2, \dots\}$ be a sequence of independent random variables that are distributed as N. If for some positive integer K we have

$$\sum_{i=1}^K X_i \le_{\mathrm{hmrl}} Y_1 \quad \text{and} \quad M \le_{\mathrm{hmrl}} \sum_{i=1}^K N_i,$$

or if we have

$$KX_1 \le_{\mathrm{hmrl}} Y_1 \quad \text{and} \quad M \le_{\mathrm{hmrl}} KN,$$

or if we have

$$KX_1 \le_{\mathrm{hmrl}} Y_1 \quad \text{and} \quad M \le_{\mathrm{hmrl}} \sum_{i=1}^K N_i,$$

then

$$\sum_{j=1}^M X_j \le_{\mathrm{hmrl}} \sum_{j=1}^N Y_j.$$

Theorem 2.B.10. Let $\{X_j,\ j = 1, 2, \dots\}$ be a sequence of nonnegative independent and identically distributed NBUE random variables, and let $M$ be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j,\ j = 1, 2, \dots\}$ be another sequence of nonnegative independent and identically distributed NBUE random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$'s. If for some positive integers $K_1$ and $K_2$, such that $K_1 \le K_2$, we have

\[
\sum_{i=1}^{K_1} X_i \le_{\mathrm{hmrl}} \frac{K_1}{K_2}\, Y_1
\quad\text{and}\quad
M \le_{\mathrm{hmrl}} K_2 N,
\]

then

\[
\sum_{j=1}^{M} X_j \le_{\mathrm{hmrl}} \sum_{j=1}^{N} Y_j.
\]

The harmonic mean residual life order does not have the property of being simply closed under mixtures. However, under quite strong conditions the order ≤hmrl is closed under mixtures. This is shown in the next theorem, which may be compared with Theorems 1.B.8 and 2.A.13.

Theorem 2.B.11. Let $X$ and $Y$ be nonnegative random variables, and let $\Theta$ be another random variable, such that $[X \mid \Theta = \theta] \le_{\mathrm{hmrl}} [Y \mid \Theta = \theta']$ for all $\theta$ and $\theta'$ in the support of $\Theta$. Then $X \le_{\mathrm{hmrl}} Y$.

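Proof. The proof is similar to the proof of Theorem 1.B.8. Select a $\theta$ and a $\theta'$ in the support of $\Theta$. Let $\overline{F}(\cdot \mid \theta)$, $\overline{G}(\cdot \mid \theta)$, $\overline{F}(\cdot \mid \theta')$, and $\overline{G}(\cdot \mid \theta')$ be the survival functions of $[X \mid \Theta = \theta]$, $[Y \mid \Theta = \theta]$, $[X \mid \Theta = \theta']$, and $[Y \mid \Theta = \theta']$, respectively. Let $E[X \mid \theta]$, $E[Y \mid \theta]$, $E[X \mid \theta']$, and $E[Y \mid \theta']$ be the corresponding expectations. By (2.B.2) it is sufficient to show that for $\alpha \in (0, 1)$ we have

\[
\frac{\alpha \int_t^\infty \overline{F}(u \mid \theta)\,du + (1-\alpha) \int_t^\infty \overline{F}(u \mid \theta')\,du}{\alpha E[X \mid \theta] + (1-\alpha) E[X \mid \theta']}
\le
\frac{\alpha \int_t^\infty \overline{G}(u \mid \theta)\,du + (1-\alpha) \int_t^\infty \overline{G}(u \mid \theta')\,du}{\alpha E[Y \mid \theta] + (1-\alpha) E[Y \mid \theta']}
\quad\text{for all } t \ge 0. \tag{2.B.12}
\]

The proof of this inequality is similar to the proof of (1.B.12).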

Another condition under which the order ≤hmrl is closed under mixtures is given in the following theorem.

Theorem 2.B.12. Let $X$ and $Y$ be nonnegative random variables, and let $\Theta$ be another random variable, such that $[X \mid \Theta = \theta] \le_{\mathrm{hmrl}} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Furthermore, assume that

\[
\frac{E[Y \mid \Theta = \theta]}{E[X \mid \Theta = \theta]} = k \quad (\text{independent of } \theta). \tag{2.B.13}
\]

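Then $X \le_{\mathrm{hmrl}} Y$.

Proof. As in the proof of Theorem 2.B.11, select a $\theta$ and a $\theta'$ in the support of $\Theta$. Let $\overline{F}(\cdot \mid \theta)$, $\overline{G}(\cdot \mid \theta)$, $\overline{F}(\cdot \mid \theta')$, and $\overline{G}(\cdot \mid \theta')$ be the survival functions of $[X \mid \Theta = \theta]$, $[Y \mid \Theta = \theta]$, $[X \mid \Theta = \theta']$, and $[Y \mid \Theta = \theta']$, respectively. Let $E[X \mid \theta]$, $E[Y \mid \theta]$, $E[X \mid \theta']$, and $E[Y \mid \theta']$ be the corresponding expectations. Let $\alpha \in (0, 1)$. Note that from (2.B.13) we obtain

\[
\frac{\alpha E[Y \mid \theta] + (1-\alpha) E[Y \mid \theta']}{\alpha E[X \mid \theta] + (1-\alpha) E[X \mid \theta']} = k. \tag{2.B.14}
\]

Also, from $[X \mid \Theta = \theta] \le_{\mathrm{hmrl}} [Y \mid \Theta = \theta]$, $[X \mid \Theta = \theta'] \le_{\mathrm{hmrl}} [Y \mid \Theta = \theta']$, and (2.B.13), we get, for $t \ge 0$, that

\[
k \int_t^\infty \overline{F}(u \mid \theta)\,du \le \int_t^\infty \overline{G}(u \mid \theta)\,du
\quad\text{and}\quad
k \int_t^\infty \overline{F}(u \mid \theta')\,du \le \int_t^\infty \overline{G}(u \mid \theta')\,du,
\]

and hence

\[
k \left[ \alpha \int_t^\infty \overline{F}(u \mid \theta)\,du + (1-\alpha) \int_t^\infty \overline{F}(u \mid \theta')\,du \right]
\le \alpha \int_t^\infty \overline{G}(u \mid \theta)\,du + (1-\alpha) \int_t^\infty \overline{G}(u \mid \theta')\,du.
\]

From this inequality and (2.B.14) we obtain (2.B.12), and this completes the proof.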

Consider now a family of distribution functions $\{G_\theta,\ \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a subset of the real line. As in Sections 1.A.3 and 1.C.3 let $X(\theta)$ denote a random variable with distribution function $G_\theta$. For any random variable $\Theta$ with support in $\mathcal{X}$, and with distribution function $F$, let us denote by $X(\Theta)$ a random variable with distribution function $H$ given by

\[
H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.
\]

The following result is comparable to Theorems 1.A.6, 1.B.14, 1.B.52, 1.C.17 and 2.A.15.

Theorem 2.B.13. Consider a family of distribution functions $\{G_\theta,\ \theta \in \mathcal{X}\}$ as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, $i = 1, 2$, that is, suppose that the distribution function of $Y_i$ is given by

\[
H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.
\]

If

\[
X(\theta) \le_{\mathrm{hmrl}} X(\theta') \quad\text{whenever } \theta \le \theta', \tag{2.B.15}
\]

and if

\[
\Theta_1 \le_{\mathrm{hr}} \Theta_2, \tag{2.B.16}
\]

then

\[
Y_1 \le_{\mathrm{hmrl}} Y_2. \tag{2.B.17}
\]

The proof of Theorem 2.B.13 uses the increasing convex order, and is therefore given in Remark 4.A.29 in Chapter 4.

A Laplace transform characterization of the order ≤hmrl is given next; it may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, and 2.A.16.

Theorem 2.B.14. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

\[
X_1 \le_{\mathrm{hmrl}} X_2 \iff N_\lambda(X_1) \le_{\mathrm{hmrl}} N_\lambda(X_2) \quad\text{for all } \lambda > 0,
\]

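where the notation $N_\lambda(X_1) \le_{\mathrm{hmrl}} N_\lambda(X_2)$ is in the sense of (2.B.3).

Proof. First assume that $X_1 \le_{\mathrm{hmrl}} X_2$. As in the proof of Theorem 2.A.16 we temporarily replace the notation $\alpha_\lambda^{X_1}(n)$ and $\alpha_\lambda^{X_2}(n)$ by $\alpha_{\lambda,1}(n)$ and $\alpha_{\lambda,2}(n)$, respectively. We also denote the survival function and the mean of $X_k$ by $\overline{F}_k$ and $\mu_k$, respectively, $k = 1, 2$. Let $m \ge 2$. Using (2.A.23) we have

\[
\mu_1 \sum_{n=m}^{\infty} \overline{P}_2(n) - \mu_2 \sum_{n=m}^{\infty} \overline{P}_1(n)
= \int_0^\infty \lambda^2 e^{-\lambda x}\, \frac{(\lambda x)^{m-2}}{(m-2)!}
\left[ \mu_1 \int_x^\infty \overline{F}_2(u)\,du - \mu_2 \int_x^\infty \overline{F}_1(u)\,du \right] dx.
\]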

The integrand is nonnegative by the assumption of the theorem, and one direction of the proof is complete. The proof of the converse statement is similar to the proof of the converse of Theorem 2.A.16.

The following result gives necessary and sufficient conditions for two random variables to be equal in the sense of the order ≤hmrl.

Theorem 2.B.15. Let $X$ and $Y$ be two nonnegative random variables with positive expectations, such that $EX \le EY$. Then $X =_{\mathrm{hmrl}} Y$ if, and only if, $X =_{\mathrm{st}} BY$ for some Bernoulli random variable $B$, independent of $Y$.

Proof. First assume that $X =_{\mathrm{st}} BY$ for some Bernoulli random variable $B$, independent of $Y$. Then

\[
\frac{E[(X - t)_+]}{E[X]} = \frac{E[(BY - t)_+]}{E[BY]} = \frac{E[(Y - t)_+]\,P\{B = 1\}}{E[Y]\,P\{B = 1\}} = \frac{E[(Y - t)_+]}{E[Y]} \quad\text{for all } t \ge 0,
\]

and thus $X =_{\mathrm{hmrl}} Y$ follows from (2.B.4). Conversely, suppose that $X =_{\mathrm{hmrl}} Y$. By (2.B.2) this means that

\[
\frac{\int_t^\infty P\{X > u\}\,du}{EX} = \frac{\int_t^\infty P\{Y > u\}\,du}{EY} \quad\text{for all } t \ge 0,
\]

which yields

\[
P\{X > t\} = \frac{EX}{EY} \cdot P\{Y > t\}, \quad t \ge 0.
\]

That is, $X =_{\mathrm{st}} BY$, where $B$ is a Bernoulli random variable such that $P\{B = 1\} = EX/EY$.
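The first half of the proof of Theorem 2.B.15 can be checked numerically on a small discrete example. The sketch below is illustrative only: the distribution of Y, the value q = P{B = 1}, the evaluation grid, and all helper names are assumptions that do not come from the text. It uses exact rational arithmetic to confirm that X = BY and Y have the same normalized stop-loss transform E[(· − t)+]/E[·], the quantity compared in (2.B.4).

```python
from fractions import Fraction


def mean(dist):
    """E[X] for a finite discrete distribution given as {value: probability}."""
    return sum(p * x for x, p in dist.items())


def stop_loss(dist, t):
    """E[(X - t)+] for a finite discrete distribution."""
    return sum(p * max(x - t, 0) for x, p in dist.items())


def normalized_stop_loss(dist, t):
    """E[(X - t)+] / E[X], the ratio compared in (2.B.4)."""
    return stop_loss(dist, t) / mean(dist)


# Hypothetical example (not from the text): Y on {1, 2, 4} and q = P{B = 1}.
Y = {Fraction(1): Fraction(1, 2), Fraction(2): Fraction(1, 4), Fraction(4): Fraction(1, 4)}
q = Fraction(1, 3)

# Distribution of X = B * Y with B independent of Y:
# mass 1 - q at 0 and mass q * P{Y = y} at each y.
X = {Fraction(0): 1 - q}
for y, p in Y.items():
    X[y] = X.get(y, 0) + q * p

grid = [Fraction(k, 4) for k in range(0, 20)]  # t = 0, 1/4, ..., 19/4
ratios_equal = all(
    normalized_stop_loss(X, t) == normalized_stop_loss(Y, t) for t in grid
)
print(ratios_equal)
print(mean(X) == q * mean(Y))
```

Because the factor P{B = 1} cancels in the ratio, the two transforms agree at every grid point, even though E[X] is strictly smaller than E[Y].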

From the proof of Theorem 2.B.15 it is seen, in contrast to (2.B.6), that if X ≤hmrl Y , then it does not necessarily follow that EX ≤ EY (unless X and Y are positive almost surely). In the next result it is shown that a random variable, whose distribution is the mixture of two distributions of harmonic mean residual life ordered random variables, is bounded from below and from above, in the harmonic mean residual life order sense, by these two random variables. Theorem 2.B.16. Let X and Y be two nonnegative random variables with distribution functions F and G, respectively. Let W be a random variable with the distribution function pF + (1 − p)G for some p ∈ (0, 1). If X ≤hmrl Y , then X ≤hmrl W ≤hmrl Y . Proof. By assumption, (2.B.2) holds. Therefore

\[
\frac{\int_x^\infty \overline{F}(u)\,du}{EX}
\le \frac{p \int_x^\infty \overline{F}(u)\,du + (1-p) \int_x^\infty \overline{G}(u)\,du}{p\,EX + (1-p)\,EY}
\le \frac{\int_x^\infty \overline{G}(u)\,du}{EY}
\quad\text{for all } x \ge 0,
\]

and the stated result follows from (2.B.2).
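A quick numerical illustration of Theorem 2.B.16 follows. It is a sketch under stated assumptions: X and Y are taken to be exponential with rates 2 and 1 (a choice made here, not in the text), for which the ratio in (2.B.2) is exp(-rate·x), so X ≤hmrl Y. The function names, the mixing weights, and the grid are likewise hypothetical.

```python
import math


def hmrl_ratio_exp(rate, x):
    """(integral_x^inf of the survival function) / mean for an exponential
    distribution with the given rate; this equals exp(-rate * x)."""
    return math.exp(-rate * x)


def hmrl_ratio_mixture(p, rate_x, rate_y, x):
    """The same ratio for W with distribution p*F + (1-p)*G, where F and G
    are exponential with rates rate_x and rate_y."""
    numerator = (p * math.exp(-rate_x * x) / rate_x
                 + (1 - p) * math.exp(-rate_y * x) / rate_y)
    denominator = p / rate_x + (1 - p) / rate_y
    return numerator / denominator


# Assumed example: X ~ Exp(rate 2), Y ~ Exp(rate 1), so X <=hmrl Y.
RATE_X, RATE_Y = 2.0, 1.0
TOL = 1e-12  # tolerance for floating-point comparisons

sandwich_holds = all(
    hmrl_ratio_exp(RATE_X, x) <= hmrl_ratio_mixture(p, RATE_X, RATE_Y, x) + TOL
    and hmrl_ratio_mixture(p, RATE_X, RATE_Y, x) <= hmrl_ratio_exp(RATE_Y, x) + TOL
    for p in (0.1, 0.5, 0.9)
    for x in (0.0, 0.25, 0.5, 1.0, 2.0, 5.0)
)
print(sandwich_holds)
```

The middle ratio is a weighted average of the two outer ratios, which is why it always falls between them.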

2.B.4 Properties in reliability theory

The order ≤hmrl can be used to characterize DMRL random variables. As in Section 1.A.3, $[Z \mid A]$ denotes any random variable that has as its distribution the conditional distribution of $Z$ given $A$.

Theorem 2.B.17. The nonnegative random variable $X$ is DMRL if, and only if, $[X - t \mid X > t] \ge_{\mathrm{hmrl}} [X - t' \mid X > t']$ whenever $t' \ge t \ge 0$.

The proof is simple and thus omitted. Other characterizations of DMRL and IMRL random variables, by means of other stochastic orders, can be found in Theorems 2.A.23, 3.A.56, 3.C.13, and 4.A.51.

The order ≤hmrl can also be used to characterize NBUE random variables as follows.

Theorem 2.B.18. Let $X$ be a nonnegative random variable with a finite positive mean. Then the following assertions are equivalent:
(i) $X \le_{\mathrm{hmrl}} X + Y$ for any nonnegative random variable $Y$ with a finite positive mean, which is independent of $X$.
(ii) $X$ is NBUE.
(iii) $X + Y_1 \le_{\mathrm{hmrl}} X + Y_2$ whenever $Y_1$ and $Y_2$ are almost surely positive random variables with finite means, which are independent of $X$, such that $Y_1 \le_{\mathrm{hmrl}} Y_2$.

Proof. Suppose that (i) holds. Then, taking $Y =_{\mathrm{a.s.}} y$ for some $y > 0$, we get from (2.B.4) that

\[
\frac{E[(X - t)_+]}{E[X]} \le \frac{E[(X + y - t)_+]}{E[X] + y}, \quad t \ge 0.
\]

Upon rearrangement this gives

\[
y\,E[(X - t)_+] \le E[X]\,\bigl\{ E[(X + y - t)_+] - E[(X - t)_+] \bigr\}, \quad t \ge 0;
\]

that is,

\[
E[(X - t)_+] \le \frac{E[X]}{y} \int_{t-y}^{t} P\{X > u\}\,du, \quad t \ge 0.
\]

Letting $y \to 0$ we obtain

\[
E[(X - t)_+] \le E[X]\,P\{X > t\}, \quad t \ge 0,
\]

that is, $X$ is NBUE.

The statement (ii) $\Longrightarrow$ (iii) is Lemma 2.B.5.

Now assume that (iii) holds. Let $Y_1 =_{\mathrm{a.s.}} a$ and $Y_2 =_{\mathrm{a.s.}} y$, where $0 < a < y$. It is easy to verify (for instance, using (2.B.4)) that $Y_1 \le_{\mathrm{hmrl}} Y_2$. Hence, by (iii), $X + a \le_{\mathrm{hmrl}} X + y$; that is, by (2.B.4),

\[
(E[X] + y)\,E[(X + a - t)_+] \le (E[X] + a)\,E[(X + y - t)_+], \quad t \ge 0.
\]

Letting $a \to 0$ we obtain

\[
(E[X] + y)\,E[(X - t)_+] \le E[X]\,E[(X + y - t)_+], \quad t \ge 0,\ y \ge 0.
\]

Integrating both sides of the above inequality with respect to the distribution of $Y$ ($Y$ is any random variable as described in (i)) we obtain

\[
(E[X] + E[Y])\,E[(X - t)_+] \le E[X]\,E[(X + Y - t)_+], \quad t \ge 0,
\]

that is, by (2.B.4), we have $X \le_{\mathrm{hmrl}} X + Y$.
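The two inequalities that drive this proof can be checked in closed form for a concrete NBUE law. The following sketch assumes X is uniform on (0, 1) (an example chosen here, not taken from the text); the helper names and grids are hypothetical. It verifies the NBUE inequality E[(X − t)+] ≤ E[X]P{X > t} and assertion (i) for degenerate Y = y, using exact rational arithmetic.

```python
from fractions import Fraction

MEAN_U = Fraction(1, 2)  # E[U] for U uniform on (0, 1)


def stop_loss_uniform(s):
    """E[(U - s)+] for U uniform on (0, 1) and any real s."""
    if s <= 0:
        return MEAN_U - s
    if s >= 1:
        return Fraction(0)
    return (1 - s) ** 2 / 2


def survival_uniform(t):
    """P{U > t} for U uniform on (0, 1)."""
    return min(max(1 - t, Fraction(0)), Fraction(1))


ts = [Fraction(k, 10) for k in range(0, 16)]  # t = 0, 0.1, ..., 1.5
ys = [Fraction(k, 10) for k in range(1, 21)]  # y = 0.1, 0.2, ..., 2.0

# Assertion (ii): E[(U - t)+] <= E[U] * P{U > t} for all t >= 0.
nbue_holds = all(
    stop_loss_uniform(t) <= MEAN_U * survival_uniform(t) for t in ts
)

# Assertion (i) with Y = y almost surely, written as in (2.B.4):
# E[(U - t)+] / E[U] <= E[(U + y - t)+] / (E[U] + y),
# where E[(U + y - t)+] = E[(U - (t - y))+].
assertion_i_holds = all(
    stop_loss_uniform(t) / MEAN_U <= stop_loss_uniform(t - y) / (MEAN_U + y)
    for t in ts
    for y in ys
)
print(nbue_holds, assertion_i_holds)
```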

Another characterization of NBUE random variables by means of the usual stochastic order is given in Theorem 1.A.31.

2.C Complements

Section 2.A: Basic properties of the mrl function (which is also called the biometric function) can be found in Yang [572] and references therein. Some properties of the mrl functions are summarized in Shaked and Shanthikumar [513], where further references can be found. The counterexamples mentioned after Theorem 2.A.1 can also be found in that paper and further counterexamples can be found in Gupta and Kirmani [216] and in Alzaid [12]. The conditions under which the ≤mrl order implies the ≤hr and the ≤st orders (Theorems 2.A.2 and 2.A.3) are taken from Gupta and Kirmani [216]. The equivalence of the order ≤mrl and (2.A.3) can be found, for example, in Singh [536]. The characterization of the order ≤mrl which is given in Theorem 2.A.5 is taken from Di Crescenzo [164]. The characterizations of the order ≤hr by means of the order ≤mrl, given in Theorems 2.A.6 and 2.A.7, can be found in Belzunce, Gao, Hu, and Pellerey [67]. The closure under convolution results of the order ≤mrl in Section 2.A.3 were communicated to us by Pellerey [444]. A special case of Lemma 2.A.8 can be found in Mukherjee and Chatterjee [403]. Theorem 2.A.9 can be found in Pellerey [448] and Theorem 2.A.12 can be found in Fagiuoli and Pellerey [186]. The fact that a DMRL random variable increases in the order ≤mrl when a nonnegative random variable is added to it (Theorem 2.A.11) is a result that is slightly stronger than a result in Frostig [207]. The closure under mixtures result (Theorem 2.A.13) is taken from Nanda, Jain, and Singh [424]. The characterization of the

mrl order that is given in Theorem 2.A.14 can be found in Joag-Dev, Kochar, and Proschan [259], whereas its special case given in (2.A.10) is taken from Fagiuoli and Pellerey [187]. Fagiuoli and Pellerey [187] have extended (2.A.10) to sums of mrl ordered random variables. The closure under mixtures property of the order ≤mrl (Theorem 2.A.15) is a special case of a result of Hu, Kundu, and Nanda [236], and it can also be found in Hu, Nanda, Xie, and Zhu [237]; see also Theorem 3.4 in Ahmed [7]. The Laplace transform characterization of the order ≤mrl (Theorem 2.A.16) is taken from Shaked and Wong [524]; see also Kan and Yi [274]. An extension of Theorem 2.A.16 to more general orders can be found in Nanda [422]. The mean residual life order comparisons of order statistics (Theorems 2.A.20 and 2.A.21) can be found in Hu, Zhu, and Wei [243] and in Hu and Wei [240]. The comparison of inter-epoch times of two nonhomogeneous Poisson processes in the sense of the mean residual life order (Example 2.A.22) is taken from Belzunce, Lillo, Ruiz, and Shaked [69]. The result that a convolution of an IFR and a DMRL random variables is DMRL (Corollary 2.A.24) can be found in Kopocinska and Kopocinski [320]. Nanda, Singh, Misra, and Paul [429] studied a notion of reversed residual lifetime, and introduced and studied a stochastic order based on it. An order which is related to the mean residual life order is introduced in Ebrahimi and Zahedi [179]. If $m$ and $l$ are the mrl functions of $X$ and $Y$, respectively, then the order is defined by requiring $\frac{d}{dt}(l(t) - m(t))$ to be monotone in $t$. Ebrahimi and Zahedi [179] show that this order implies the mean residual life order. In Kirmani [297] it is claimed that the spacings, from a sample of independent and identically distributed IMRL random variables, are ordered in the mean residual life order. However, the proof of Kirmani is erroneous; see Kirmani [298].

Section 2.B: The order ≤hmrl is studied, for example, in Deshpande, Singh, Bagai, and Jain [161] and in Heilmann and Schröter [219]. Baccelli and Makowski [28] call it the forward recurrence times stochastic order (see an additional comment on the paper of Baccelli and Makowski [28] in Section 4.C). The counterexamples mentioned after (2.B.5) can be found, for example, in Mi [394]. In fact, Gerchak and Golani [209] have noticed that the example given on page 489 of Wolff [567] shows that it is possible for both X ≤st Y and Y ≤hmrl X to hold simultaneously in the strict sense. The comparison of the expectations of ≤hmrl ordered random variables, described in (2.B.6), is a special case of a result of Nanda, Jain, and Singh [425]. The variance inequality (Theorem 2.B.1) can be found in Kirmani [297]. The characterization of the order ≤hmrl which is given in Theorem 2.B.3 is taken from Di Crescenzo [164]. The characterization of the order ≤mrl by means of the order ≤hmrl (Theorem 2.B.4) can be

found in Hu, Kundu, and Nanda [236]. The preservation under convolution property of the order ≤hmrl (Theorem 2.B.6) is taken from Pellerey [448, 449] (the latter is a correction note), and the closure under random summations property of the order ≤hmrl (Theorem 2.B.7) is also taken from Pellerey [448, 449], though it is alluded to in Heilmann and Schröter [219]. These results (Theorems 2.B.6 and 2.B.7) can also be found in Baccelli and Makowski [28]. A slight extension of Theorem 2.B.6 is given in Lefèvre and Utev [340]. Theorems 2.B.8–2.B.10 have been communicated to us by Pellerey [447]. The closure under mixtures properties of the order ≤hmrl (Theorems 2.B.11 and 2.B.12) are taken from Nanda, Jain, and Singh [424] and from Lefèvre and Utev [340], respectively, whereas Theorem 2.B.13 is inspired by Ahmed, Soliman, and Khider [9]. The Laplace transform characterization of the order ≤hmrl (Theorem 2.B.14) is taken from Shaked and Wong [524]. An extension of Theorem 2.B.14 to more general orders can be found in Nanda [422]. The conditions under which X =hmrl Y (Theorem 2.B.15) can be found in Lefèvre and Utev [340]. The NBUE characterization, given in Theorem 2.B.18, is taken from Lefèvre and Utev [340].

3 Univariate Variability Orders

In this chapter we study stochastic orders that compare the “variability” or the “dispersion” of random variables. The most important and common orders that are studied in this chapter are the convex and the dispersive orders. We also study in this chapter the excess wealth order (which is also called the right spread order) which is found to be useful in an increasing number of applications. Various related orders are also examined in this chapter.

3.A The Convex Order

3.A.1 Definition and equivalent conditions

Let $X$ and $Y$ be two random variables such that

\[
E[\phi(X)] \le E[\phi(Y)] \quad\text{for all convex functions } \phi : \mathbb{R} \to \mathbb{R}, \tag{3.A.1}
\]

provided the expectations exist. Then X is said to be smaller than Y in the convex order (denoted as X ≤cx Y ). Roughly speaking, convex functions are functions that take on their (relatively) larger values over regions of the form (−∞, a) ∪ (b, ∞) for a < b. Therefore, if (3.A.1) holds, then Y is more likely to take on “extreme” values than X. That is, Y is “more variable” than X. It should be mentioned here that in (3.A.1) it is suﬃcient to consider only functions φ that are convex on the union of the supports of X and Y rather than over the whole real line; we will not keep repeating this point throughout this chapter. One can also deﬁne a concave order by requiring (3.A.1) to hold for all concave functions φ (denoted as X ≤cv Y ). However, X ≤cv Y if, and only if, Y ≤cx X. Therefore, it is not necessary to have a separate discussion for the concave order. Note that the functions φ1 and φ2 , deﬁned by φ1 (x) = x and φ2 (x) = −x, are both convex. Therefore, from (3.A.1) it easily follows that

\[
X \le_{\mathrm{cx}} Y \implies E[X] = E[Y], \tag{3.A.2}
\]

provided the expectations exist. Later it will be helpful to observe that if $E[X] = E[Y]$, then

\[
\int_{-\infty}^{\infty} \bigl[\overline{F}(u) - \overline{G}(u)\bigr]\,du = \int_{-\infty}^{\infty} \bigl[F(u) - G(u)\bigr]\,du = 0, \tag{3.A.3}
\]

provided the integrals exist, where $\overline{F}$ [$F$] and $\overline{G}$ [$G$] are the survival [distribution] functions of $X$ and $Y$, respectively. The function $\phi$, defined by $\phi(x) = x^2$, is convex. Therefore, from (3.A.1) and (3.A.2), it follows that

\[
X \le_{\mathrm{cx}} Y \implies \mathrm{Var}[X] \le \mathrm{Var}[Y], \tag{3.A.4}
\]

whenever $\mathrm{Var}(Y) < \infty$. For a fixed $a$, the function $\phi_a$, defined by $\phi_a(x) = (x - a)_+$, and the function $\varphi_a$, defined by $\varphi_a(x) = (a - x)_+$, are both convex. (The reader is encouraged to draw a sketch of $\phi_a$ and $\varphi_a$ since they are very handy in the analysis of the order ≤cx as well as in the analysis of the monotone convex and the monotone concave orders discussed in Chapter 4.) Therefore, if $X \le_{\mathrm{cx}} Y$, then

\[
E[(X - a)_+] \le E[(Y - a)_+] \quad\text{for all } a \tag{3.A.5}
\]

and

\[
E[(a - X)_+] \le E[(a - Y)_+] \quad\text{for all } a, \tag{3.A.6}
\]

provided the expectations exist. Alternatively, using a simple integration by parts, it is seen that (3.A.5) and (3.A.6) can be rewritten as

\[
\int_x^\infty \overline{F}(u)\,du \le \int_x^\infty \overline{G}(u)\,du \quad\text{for all } x \tag{3.A.7}
\]

and

\[
\int_{-\infty}^{x} F(u)\,du \le \int_{-\infty}^{x} G(u)\,du \quad\text{for all } x, \tag{3.A.8}
\]

provided the integrals exist. In fact, when E[X] = E[Y ], (3.A.7) is equivalent to X ≤cx Y . To see this equivalence, note that every convex function can be approximated by (that is, is a limit of) positive linear combinations of the functions φa ’s, for various choices of a’s, and of the function φ(x) = −x. By (3.A.7), E[φa (X)] ≤ E[φa (Y )] for all a’s, and this fact, together with the equality of the means of X and Y , implies (3.A.1). We thus have proved the ﬁrst part of the following result. The other part is proven similarly. Theorem 3.A.1. Let X and Y be two random variables such that E[X] = E[Y ]. Then (a) X ≤cx Y if, and only if, (3.A.7) holds.

(b) $X \le_{\mathrm{cx}} Y$ if, and only if, (3.A.8) holds.

By adding $a$ to both sides of the inequality in (3.A.5), it is seen that (3.A.5) can be rewritten as

\[
E[\max\{X, a\}] \le E[\max\{Y, a\}] \quad\text{for all } a. \tag{3.A.9}
\]

Thus, when $E[X] = E[Y]$, then (3.A.9) is equivalent to $X \le_{\mathrm{cx}} Y$. In a similar manner (3.A.6) can be rewritten. The following theorem provides another characterization of the convex order.

Theorem 3.A.2. Let $X$ and $Y$ be two random variables such that $E[X] = E[Y]$. Then $X \le_{\mathrm{cx}} Y$ if, and only if,

\[
E|X - a| \le E|Y - a| \quad\text{for all } a \in \mathbb{R}. \tag{3.A.10}
\]

Proof. Clearly, if $X \le_{\mathrm{cx}} Y$, then (3.A.10) holds. So suppose that (3.A.10) holds. Without loss of generality it can be assumed that $EX = EY = 0$. A straightforward computation gives

\[
E|X - a| = a + 2 \int_a^\infty \overline{F}(u)\,du = -a + 2 \int_{-\infty}^{a} F(u)\,du. \tag{3.A.11}
\]

The result now follows from (3.A.7) or (3.A.8).
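For distributions with finite support, the criteria (3.A.5) and (3.A.10) reduce to finitely many comparisons. The sketch below is an illustration under assumptions made here: the function names and the two example distributions are hypothetical, and the check relies on the observation that E[(Y − a)+] − E[(X − a)+] is piecewise linear in a with kinks only at support points and vanishes outside the support range when the means agree, so testing the support points suffices.

```python
from fractions import Fraction


def mean(dist):
    """E[X] for a finite discrete distribution {value: probability}."""
    return sum(p * x for x, p in dist.items())


def stop_loss(dist, a):
    """E[(X - a)+], the left-hand side of (3.A.5)."""
    return sum(p * max(x - a, 0) for x, p in dist.items())


def abs_deviation(dist, a):
    """E|X - a|, the quantity compared in (3.A.10)."""
    return sum(p * abs(x - a) for x, p in dist.items())


def is_cx_smaller(X, Y):
    """Check X <=cx Y for finite discrete distributions: equal means plus
    (3.A.5) at every support point of X or Y."""
    if mean(X) != mean(Y):
        return False
    knots = sorted(set(X) | set(Y))
    return all(stop_loss(X, a) <= stop_loss(Y, a) for a in knots)


# Hypothetical example: both distributions have mean 0, Y is more spread out.
X = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
Y = {-2: Fraction(1, 3), 0: Fraction(1, 3), 2: Fraction(1, 3)}

print(is_cx_smaller(X, Y))
print(is_cx_smaller(Y, X))

# The same ordering seen through Theorem 3.A.2.
knots = sorted(set(X) | set(Y))
abs_dev_ordered = all(abs_deviation(X, a) <= abs_deviation(Y, a) for a in knots)
print(abs_dev_ordered)
```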

The function −E|X − ·| is called the potential of the probability measure of X. Similarly, −E|Y − ·| is the potential of the probability measure of Y . Thus, (3.A.10) can be written as −E|X − ·| ≥ −E|Y − ·| pointwise. Using this observation, we obtain from Chacon and Walsh [122] the following characterization. Theorem 3.A.3. Let X and Y be two random variables such that E[X] = E[Y ] = 0. Then X ≤cx Y if, and only if, for a standard Brownian motion from 0, {B(t), t ≥ 0}, there exist two stopping times T1 and T2 , such that T1 ≤ T2 almost surely, and X =st B(T1 ) and Y =st B(T2 ). An immediate consequence of (3.A.5) is shown next. Denote the supports of X and Y by supp(X) and supp(Y ). Let lX = inf{x : x ∈ supp(X)} and uX = sup{x : x ∈ supp(X)}. Deﬁne lY and uY similarly. Then we have that if X ≤cx Y , then lY ≤ lX and uY ≥ uX . As proof, suppose, for example, that uY < uX . Let a be such that uY < a < uX . Then E[(Y − a)+ ] = 0 < E[(X − a)+ ], in contradiction to (3.A.5). Therefore we must have uY ≥ uX . Similarly, using (3.A.6), it can be shown that lY ≤ lX . As a consequence we have that if X and Y are random variables whose supports are intervals, then X ≤cx Y =⇒ supp(X) ⊆ supp(Y ).

(3.A.12)

An important characterization of the convex order by construction on the same probability space is stated in the next theorem.

Theorem 3.A.4. The random variables $X$ and $Y$ satisfy $X \le_{\mathrm{cx}} Y$ if, and only if, there exist two random variables $\hat{X}$ and $\hat{Y}$, defined on the same probability space, such that

\[
\hat{X} =_{\mathrm{st}} X, \qquad \hat{Y} =_{\mathrm{st}} Y,
\]

and $\{\hat{X}, \hat{Y}\}$ is a martingale, that is,

\[
E[\hat{Y} \mid \hat{X}] = \hat{X} \quad\text{a.s.} \tag{3.A.13}
\]

Furthermore, the random variables $\hat{X}$ and $\hat{Y}$ can be selected such that $[\hat{Y} \mid \hat{X} = x]$ is increasing in $x$ in the usual stochastic order ≤st.
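The martingale condition (3.A.13) can be illustrated with a tiny coupling. The sketch below is only an example built for this purpose: the pair (X̂, Ŷ) with Ŷ = X̂ + Z, where X̂ and Z are independent and uniform on {−1, 1}, the list of convex test functions, and the helper names are all assumptions, not constructions from the text. It checks (3.A.13) exactly and then compares E[φ(X̂)] with E[φ(Ŷ)] for a few convex φ.

```python
from fractions import Fraction
import math

HALF = Fraction(1, 2)

# Joint distribution of (X_hat, Y_hat) with Y_hat = X_hat + Z, where X_hat
# and Z are independent and uniform on {-1, 1} (an assumed example).
joint = {}
for x in (-1, 1):
    for z in (-1, 1):
        key = (x, x + z)
        joint[key] = joint.get(key, 0) + HALF * HALF


def conditional_mean_y(joint_dist, x0):
    """E[Y_hat | X_hat = x0] computed from the joint distribution."""
    mass = sum(p for (x, _), p in joint_dist.items() if x == x0)
    return sum(p * y for (x, y), p in joint_dist.items() if x == x0) / mass


# Martingale property (3.A.13): E[Y_hat | X_hat = x] = x for every x.
martingale_ok = all(conditional_mean_y(joint, x0) == x0 for x0 in (-1, 1))


def expect_x(phi):
    """E[phi(X_hat)]."""
    return sum(float(p) * phi(x) for (x, _), p in joint.items())


def expect_y(phi):
    """E[phi(Y_hat)]."""
    return sum(float(p) * phi(y) for (_, y), p in joint.items())


# A few convex test functions; the inequality E phi(X) <= E phi(Y) is
# what (3.A.1) requires for every convex phi.
convex_functions = [lambda v: v * v, abs, math.exp, lambda v: max(v - 0.5, 0.0)]
convex_order_ok = all(
    expect_x(phi) <= expect_y(phi) + 1e-12 for phi in convex_functions
)
print(martingale_ok, convex_order_ok)
```

A finite list of test functions cannot prove the convex order; it only shows the direction of the inequality on this example.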

It is not easy to prove the constructive part of Theorem 3.A.4. However, it is easy to prove that if random variables $\hat{X}$ and $\hat{Y}$ as described in the theorem exist, then $X \le_{\mathrm{cx}} Y$. Just note that if $\phi$ is a convex function, then by Jensen's Inequality,

\[
E[\phi(X)] = E[\phi(\hat{X})] = E\,\phi\bigl(E[\hat{Y} \mid \hat{X}]\bigr) \le E\bigl\{E[\phi(\hat{Y}) \mid \hat{X}]\bigr\} = E[\phi(\hat{Y})] = E[\phi(Y)],
\]

which is (3.A.1).

Other characterizations of the convex order are described in the next theorem.

Theorem 3.A.5. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, respectively, and with equal finite means. Then each of the following two statements is a necessary and sufficient condition for $X \le_{\mathrm{cx}} Y$:

\[
\int_0^p F^{-1}(u)\,du \ge \int_0^p G^{-1}(u)\,du \quad\text{for all } p \in [0, 1]; \tag{3.A.14}
\]

and

\[
\int_p^1 F^{-1}(u)\,du \le \int_p^1 G^{-1}(u)\,du \quad\text{for all } p \in [0, 1]. \tag{3.A.15}
\]

Proof. Since $EX = \int_0^1 F^{-1}(u)\,du$ and $EY = \int_0^1 G^{-1}(u)\,du$, and since $EX = EY$, it follows that for any $p \in [0, 1]$ the inequality

\[
\int_p^1 F^{-1}(u)\,du \le \int_p^1 G^{-1}(u)\,du \tag{3.A.16}
\]

is equivalent to the inequality

\[
\int_0^p F^{-1}(u)\,du \ge \int_0^p G^{-1}(u)\,du. \tag{3.A.17}
\]

It follows that (3.A.14) and (3.A.15) are equivalent. Thus, we just need to show that X ≤cx Y is equivalent to (3.A.14).

We only give the proof for the case when the distribution functions $F$ and $G$ of $X$ and $Y$ are continuous; the proof for the general case is similar, though notationally more complex. Without loss of generality, suppose that $F$ and $G$ are not identical. Since $EX = EY$, it follows that $F$ and $G$ must cross each other at least once. If either (3.A.7) or (3.A.14) holds, then, if there is a first time that $F$ crosses $G$, it must cross it there from below. Similarly, if there is a last time that $F$ crosses $G$, it also must cross it there from below. (Thus, if there is a finite number of crossings, then it must be odd.) Let $(y_0, p_0)$, $(y_1, p_1)$, and $(y_2, p_2)$ be three consecutive crossing points as depicted in Figure 3.A.1. Note that $(y_0, p_0)$ may be $(-\infty, 0)$ (we then adopt the convention that $0 \cdot (-\infty) \equiv 0$), and that $(y_2, p_2)$ may be $(\infty, 1)$ (we then adopt the convention that $0 \cdot \infty \equiv 0$). Note that by the continuity assumption we have $p_i = F(y_i) = G(y_i)$, $i = 0, 1, 2$.

[Fig. 3.A.1. Typical segments of F and G when X ≤cx Y. Horizontal axis: y, with marks y0, y1, y2. Vertical axis: p, with marks p0, p1, p2, 1.]

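Assume that $X \le_{\mathrm{cx}} Y$. Then

\[
\int_{y_2}^{\infty} \overline{F}(x)\,dx \le \int_{y_2}^{\infty} \overline{G}(x)\,dx. \tag{3.A.18}
\]

Thus

\[
\begin{aligned}
\int_{p_2}^{1} F^{-1}(u)\,du &= y_2 (1 - p_2) + \int_{y_2}^{\infty} \overline{F}(x)\,dx \\
&\le y_2 (1 - p_2) + \int_{y_2}^{\infty} \overline{G}(x)\,dx \quad (\text{by (3.A.18)}) \\
&= \int_{p_2}^{1} G^{-1}(u)\,du.
\end{aligned} \tag{3.A.19}
\]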

Now, for $u \in [p_1, p_2]$ we have that $F^{-1}(u) - G^{-1}(u) \le 0$ (see Figure 3.A.1). Thus $\int_p^1 (F^{-1}(u) - G^{-1}(u))\,du$ is increasing in $p \in [p_1, p_2]$. Therefore, from (3.A.19) we get that

\[
\int_p^1 F^{-1}(u)\,du \le \int_p^1 G^{-1}(u)\,du \quad\text{for } p \in [p_1, p_2]. \tag{3.A.20}
\]

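From $X \le_{\mathrm{cx}} Y$ we also have

\[
\int_{-\infty}^{y_0} F(x)\,dx \le \int_{-\infty}^{y_0} G(x)\,dx. \tag{3.A.21}
\]

Thus

\[
\begin{aligned}
\int_0^{p_0} F^{-1}(u)\,du &= y_0 p_0 - \int_{-\infty}^{y_0} F(x)\,dx \\
&\ge y_0 p_0 - \int_{-\infty}^{y_0} G(x)\,dx \quad (\text{by (3.A.21)}) \\
&= \int_0^{p_0} G^{-1}(u)\,du.
\end{aligned} \tag{3.A.22}
\]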

Now, for $u \in [p_0, p_1]$ we have that $F^{-1}(u) - G^{-1}(u) \ge 0$ (see Figure 3.A.1). Thus $\int_0^p (F^{-1}(u) - G^{-1}(u))\,du$ is increasing in $p \in [p_0, p_1]$. Therefore, from (3.A.22) we get that

\[
\int_0^p F^{-1}(u)\,du \ge \int_0^p G^{-1}(u)\,du \quad\text{for } p \in [p_0, p_1]. \tag{3.A.23}
\]

Thus we see from (3.A.20) and (3.A.23) that for each $p \in [0, 1]$ either (3.A.16) or (3.A.17) holds. Therefore, (3.A.14) (or, equivalently, (3.A.15)) holds.

Conversely, assume that (3.A.14) (or, equivalently, (3.A.15)) holds. Then

\[
\int_{p_2}^{1} F^{-1}(u)\,du \le \int_{p_2}^{1} G^{-1}(u)\,du. \tag{3.A.24}
\]

Thus

\[
\begin{aligned}
\int_{y_2}^{\infty} \overline{F}(x)\,dx &= \int_{p_2}^{1} F^{-1}(u)\,du - y_2 (1 - p_2) \\
&\le \int_{p_2}^{1} G^{-1}(u)\,du - y_2 (1 - p_2) \quad (\text{by (3.A.24)}) \\
&= \int_{y_2}^{\infty} \overline{G}(x)\,dx.
\end{aligned} \tag{3.A.25}
\]

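Now, for $x \in [y_1, y_2]$ we have that $\overline{F}(x) - \overline{G}(x) \le 0$ (see Figure 3.A.1). Thus $\int_y^\infty (\overline{F}(x) - \overline{G}(x))\,dx$ is increasing in $y \in [y_1, y_2]$. Therefore, from (3.A.25) we get that

\[
\int_y^\infty \overline{F}(x)\,dx \le \int_y^\infty \overline{G}(x)\,dx \quad\text{for } y \in [y_1, y_2]. \tag{3.A.26}
\]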

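From (3.A.14) we also have

\[
\int_0^{p_0} F^{-1}(u)\,du \ge \int_0^{p_0} G^{-1}(u)\,du. \tag{3.A.27}
\]

Thus

\[
\begin{aligned}
\int_{-\infty}^{y_0} F(x)\,dx &= y_0 p_0 - \int_0^{p_0} F^{-1}(u)\,du \\
&\le y_0 p_0 - \int_0^{p_0} G^{-1}(u)\,du \quad (\text{by (3.A.27)}) \\
&= \int_{-\infty}^{y_0} G(x)\,dx.
\end{aligned} \tag{3.A.28}
\]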

Now, for $x \in [y_0, y_1]$ we have that $F(x) - G(x) \le 0$ (see Figure 3.A.1). Thus $\int_{-\infty}^{y} (F(x) - G(x))\,dx$ is decreasing in $y \in [y_0, y_1]$. Therefore, from (3.A.28) we get that

\[
\int_{-\infty}^{y} F(x)\,dx \le \int_{-\infty}^{y} G(x)\,dx \quad\text{for } y \in [y_0, y_1]. \tag{3.A.29}
\]

Thus we see from (3.A.26) and (3.A.29) that for each y ∈ R either (3.A.7) or (3.A.8) hold. Therefore X ≤cx Y .
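Condition (3.A.14) becomes a finite check when X and Y are uniformly distributed on two samples of the same size n. In that case (an assumption of this sketch, not of the theorem) both quantile functions are step functions with jumps only at multiples of 1/n, so the integrals in (3.A.14) are piecewise linear in p and it is enough to compare them at p = k/n, where they equal partial sums of the sorted samples divided by n. The function names and sample values below are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate


def lower_quantile_integrals(sample):
    """Values of integral_0^{k/n} F^{-1}(u) du for k = 0, ..., n, where F is
    the distribution function of the uniform distribution on the sample."""
    n = len(sample)
    return [Fraction(s, n) for s in accumulate(sorted(sample), initial=0)]


def is_cx_smaller_by_quantiles(xs, ys):
    """Check (3.A.14) for two integer samples of equal size and equal mean."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    if sum(xs) != sum(ys):
        return False  # the convex order forces equal means, see (3.A.2)
    return all(
        a >= b
        for a, b in zip(lower_quantile_integrals(xs), lower_quantile_integrals(ys))
    )


# Hypothetical samples with the same mean; ys is more spread out.
xs = [1, 2, 2, 3]
ys = [0, 1, 3, 4]

print(lower_quantile_integrals(xs))
print(lower_quantile_integrals(ys))
print(is_cx_smaller_by_quantiles(xs, ys))
print(is_cx_smaller_by_quantiles(ys, xs))
```

For equally weighted samples this partial-sum comparison is the classical majorization check.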

We now give a bivariate characterization result for the order ≤cx that is similar to the characterizations given in Theorems 1.A.9, 1.B.9, 1.B.47, and 1.C.20, for the orders ≤st, ≤hr, ≤rh, and ≤lr, respectively. We define the following class of bivariate functions:

\[
\mathcal{G}_{\mathrm{cx}} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) - \phi(y, x) \text{ is convex in } x \text{ for all } y\}.
\]

Theorem 3.A.6. Let $X$ and $Y$ be independent random variables. Then $X \le_{\mathrm{cx}} Y$ if, and only if,

\[
E[\phi(X, Y)] \le E[\phi(Y, X)] \quad\text{for all } \phi \in \mathcal{G}_{\mathrm{cx}}. \tag{3.A.30}
\]

Proof. Suppose that (3.A.30) holds. Let $\psi$ be a univariate convex function. Define $\phi(x, y) = \psi(x)$. Then $\phi \in \mathcal{G}_{\mathrm{cx}}$ and from (3.A.30) we see that $X \le_{\mathrm{cx}} Y$. Conversely, suppose that $X \le_{\mathrm{cx}} Y$. Let $\phi \in \mathcal{G}_{\mathrm{cx}}$ and let $\hat{Y}$ be another random variable, independent of $X$ and $Y$, such that $\hat{Y} =_{\mathrm{st}} Y$. Define $\psi$ by $\psi(x) \equiv E[\phi(x, \hat{Y}) - \phi(\hat{Y}, x)]$. From the independence of $X$ and $\hat{Y}$ it follows that $\psi$ is convex. Therefore, since $X \le_{\mathrm{cx}} Y$, it follows that

\[
E[\phi(X, Y)] - E[\phi(Y, X)] = E[\psi(X)] \le E[\psi(Y)] = 0.
\]

Another characterization of the convex order, by means of the number of sign changes of two distribution functions, is given in Theorem 3.A.45 in Section 3.A.3.

Let $X$ be a random variable with survival function $\overline{F}$, and let $h : [0, 1] \to [0, 1]$ be an increasing function that satisfies $h(0) = 0$ and $h(1) = 1$. Such a function $h$ is called a probability transformation function. Consider the functional

\[
V_h(X) = -\int_{-\infty}^{\infty} x\,dh(\overline{F}(x)); \tag{3.A.31}
\]

this functional is called the Yaari functional and it is of interest in economics. Theorem 3.A.7. Let X and Y be two random variables with the same ﬁnite means. Then X ≤cx Y if, and only if, Vh (X) ≤ Vh (Y )

for every convex probability transformation function h.

As can be seen from (3.A.2), only random variables that have the same means can be compared by the order ≤cx . Often, however, we do not want a variability order to depend on the location of the involved distributions. Several ideas for using the order ≤cx to deﬁne a variability order that is independent of the locations of the underlying random variables X and Y have been suggested in the literature. When X and Y have ﬁnite means, one idea is to say that X is less variable than Y if [X − EX] ≤cx [Y − EY ].

(3.A.32)

This is sometimes called the dilation order. When the random variables $X$ and $Y$ satisfy (3.A.32), we denote $X \le_{\mathrm{dil}} Y$. For nonnegative random variables $X$ and $Y$ with finite means one can define $X$ as less variable than $Y$ if

\[
\frac{X}{EX} \le_{\mathrm{cx}} \frac{Y}{EY}. \tag{3.A.33}
\]

This is sometimes called the Lorenz order. When the nonnegative random variables X and Y satisfy (3.A.33), we denote X ≤Lorenz Y . Bhattacharjee and Sethuraman [88] introduced a stochastic order, for nonnegative random variables with ﬁnite means, denoted by ≤hnbue . Kochar [306] showed that the orders ≤hnbue and ≤Lorenz are equivalent. The dilation order can be characterized as follows.

Theorem 3.A.8. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, respectively, and with finite expectations. Then $X \le_{\mathrm{dil}} Y$ if, and only if,

\[
\frac{1}{1-p} \int_p^1 [F^{-1}(u) - G^{-1}(u)]\,du \le \int_0^1 [F^{-1}(u) - G^{-1}(u)]\,du \quad\text{for all } p \in [0, 1). \tag{3.A.34}
\]

Proof. Denote $\Delta = EX - EY$. Then the stochastic inequality $X \le_{\mathrm{dil}} Y$ can be rewritten as $X - \Delta \le_{\mathrm{cx}} Y$. Denote by $F_\Delta$ the distribution function of $X - \Delta$, and note that from Theorem 3.A.5 we have that $X - \Delta \le_{\mathrm{cx}} Y$ if, and only if,

\[
\int_p^1 F_\Delta^{-1}(u)\,du \le \int_p^1 G^{-1}(u)\,du \quad\text{for all } p \in [0, 1]. \tag{3.A.35}
\]

Since $F_\Delta(x) = F(x + \Delta)$ for all $x \in \mathbb{R}$ it follows that $F_\Delta^{-1}(u) = F^{-1}(u) - \Delta$ for all $u \in [0, 1]$. Therefore (3.A.35) is equivalent to

\[
\int_p^1 [F^{-1}(u) - \Delta]\,du \le \int_p^1 G^{-1}(u)\,du \quad\text{for all } p \in [0, 1];
\]

that is,

\[
\int_p^1 [F^{-1}(u) - G^{-1}(u)]\,du \le \int_p^1 [EX - EY]\,du \quad\text{for all } p \in [0, 1];
\]

that is,

\[
\frac{1}{1-p} \int_p^1 [F^{-1}(u) - G^{-1}(u)]\,du \le EX - EY \quad\text{for all } p \in [0, 1). \tag{3.A.36}
\]

Now, since $EX = \int_0^1 F^{-1}(u)\,du$ and $EY = \int_0^1 G^{-1}(u)\,du$ it is seen that (3.A.36) is equivalent to (3.A.34).

For each $p \in (0, 1)$, the quantity $\int_0^1 [F^{-1}(u) - G^{-1}(u)]\,du$ on the right-hand side of (3.A.34) is a weighted average of $\frac{1}{1-p} \int_p^1 [F^{-1}(u) - G^{-1}(u)]\,du$ and of $\frac{1}{p} \int_0^p [F^{-1}(u) - G^{-1}(u)]\,du$. Thus, from Theorem 3.A.8 we obtain that $X \le_{\mathrm{dil}} Y$ if, and only if,

\[
\int_0^1 [F^{-1}(u) - G^{-1}(u)]\,du \le \frac{1}{p} \int_0^p [F^{-1}(u) - G^{-1}(u)]\,du \quad\text{for all } p \in (0, 1].
\]

Also, $X \le_{\mathrm{dil}} Y$ if, and only if,

\[
\frac{1}{1-p} \int_p^1 [F^{-1}(u) - G^{-1}(u)]\,du \le \frac{1}{p} \int_0^p [F^{-1}(u) - G^{-1}(u)]\,du \quad\text{for all } p \in (0, 1).
\]

For $p \in [0, 1]$, let us denote the $p$-quantiles of $X$ and of $Y$ by $x(p) = F^{-1}(p)$ and $y(p) = G^{-1}(p)$, respectively. As in Jewitt [256], we observe that

\[
\int_0^p F^{-1}(u)\,du = \int_{-\infty}^{F^{-1}(p)} x\,dF(x) = p\,E[X \mid X \le x(p)].
\]

Similarly, $\int_p^1 F^{-1}(u)\,du = (1-p)\,E[X \mid X \ge x(p)]$, $\int_0^p G^{-1}(u)\,du = p\,E[Y \mid Y \le y(p)]$, and $\int_p^1 G^{-1}(u)\,du = (1-p)\,E[Y \mid Y \ge y(p)]$. Thus we see that each of the following three statements is a necessary and sufficient condition for $X \le_{\mathrm{dil}} Y$:

\[
E[X \mid X \ge x(p)] - E[Y \mid Y \ge y(p)] \le EX - EY \quad\text{for all } p \in [0, 1), \tag{3.A.37}
\]

\[
E[X \mid X \le x(p)] - E[Y \mid Y \le y(p)] \ge EX - EY \quad\text{for all } p \in (0, 1], \tag{3.A.38}
\]

and

\[
E[X \mid X \ge x(p)] - E[Y \mid Y \ge y(p)] \le E[X \mid X \le x(p)] - E[Y \mid Y \le y(p)] \quad\text{for all } p \in (0, 1).
\]

Rewriting (3.A.37) and (3.A.38) we see that under the conditions of Theorem 3.A.8 we have that $X \le_{\mathrm{dil}} Y$ if, and only if,

\[
E[X - EX \mid X \ge x(p)] \le E[Y - EY \mid Y \ge y(p)] \quad\text{for all } p \in [0, 1). \tag{3.A.39}
\]

Also, $X \le_{\mathrm{dil}} Y$ if, and only if,

\[
E[X - EX \mid X \le x(p)] \ge E[Y - EY \mid Y \le y(p)] \quad\text{for all } p \in (0, 1]. \tag{3.A.40}
\]

When $EX = EY$ we have that $X \le_{\mathrm{dil}} Y \iff X \le_{\mathrm{cx}} Y$. Therefore, when $EX = EY$, the convex order can be characterized by noting that $X \le_{\mathrm{cx}} Y$ if, and only if,

\[
E[X \mid X \ge x(p)] \le E[Y \mid Y \ge y(p)] \quad\text{for all } p \in [0, 1). \tag{3.A.41}
\]

Also then, $X \le_{\mathrm{cx}} Y$ if, and only if,

\[
E[X \mid X \le x(p)] \ge E[Y \mid Y \le y(p)] \quad\text{for all } p \in (0, 1]. \tag{3.A.42}
\]

Another characterization of the dilation order is given next.

Theorem 3.A.9. Let $X$ and $Y$ be two random variables with finite means, and let the corresponding distribution functions be $F$ and $G$, respectively. Then $X \le_{\mathrm{dil}} Y$ if, and only if,

\[
\int_0^1 \phi(p)\,[F^{-1}(p) - EX]\,dp \le \int_0^1 \phi(p)\,[G^{-1}(p) - EY]\,dp,
\]

for any increasing function $\phi$ on $[0, 1]$ for which the integrals above are well-defined.

The Lorenz order is closely connected to the so-called Lorenz curve defined as follows. Let $X$ be a nonnegative random variable with distribution function $F$. The Lorenz curve $L_X$, corresponding to $X$, is defined as
$$L_X(p) = \frac{\int_0^p F^{-1}(u)\,du}{\int_0^1 F^{-1}(u)\,du}, \quad p \in [0,1]. \tag{3.A.43}$$
The Lorenz curve is used in economics to measure the inequality of incomes. Let $Y$ be another nonnegative random variable with distribution function $G$. The Lorenz curve $L_Y$, corresponding to $Y$, is defined analogously. The next theorem, which follows from Theorem 3.A.5, highlights the connection between the Lorenz curve and the Lorenz order.

Theorem 3.A.10. Let $X$ and $Y$ be two nonnegative random variables with equal means. Then $X \le_{\mathrm{Lorenz}} Y$ (or, equivalently, $X \le_{\mathrm{cx}} Y$) if, and only if,
$$L_X(p) \ge L_Y(p) \quad\text{for all } p \in [0,1].$$
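The comparison in Theorem 3.A.10 can be illustrated numerically. The following is only a minimal sketch, assuming that each random variable is uniformly distributed on a finite sample of the same size, so that the Lorenz curve (3.A.43) is piecewise linear and is determined by its values at $p = k/n$. The function name `lorenz_curve` and the two samples are hypothetical choices made for this illustration.

```python
from fractions import Fraction

def lorenz_curve(sample):
    """Values of the Lorenz curve at p = 0, 1/n, ..., 1 for a random
    variable that is uniform on the given nonnegative sample.
    Between these points the curve is linear."""
    xs = sorted(sample)
    total = sum(xs)
    curve = [Fraction(0)]
    running = Fraction(0)
    for x in xs:
        running += x                     # partial sum of the k smallest values
        curve.append(running / total)    # equals L(k/n)
    return curve

# Two hypothetical samples with the same mean (3); y is more spread out.
x = [Fraction(v) for v in (2, 3, 3, 4)]
y = [Fraction(v) for v in (0, 1, 4, 7)]

lx, ly = lorenz_curve(x), lorenz_curve(y)
# Both curves are linear between the same grid points, so comparing
# them on the grid is enough.
dominates = all(a >= b for a, b in zip(lx, ly))
print(lx)
print(ly)
print(dominates)
```

Exact rational arithmetic is used so that the pointwise comparison is not affected by rounding.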

Another related characterization of the Lorenz order is described next. Let $\Psi$ be the set of all measurable mappings from $\mathbb{R}_+$ to $[0,1]$. For any nonnegative random variable $X$ with a finite mean define the Lorenz zonoid in $\mathbb{R}_+^2$ by
$$L(X) = \left\{ \left( \int_0^\infty \psi(x)\,dF(x),\ \frac{1}{EX}\int_0^\infty x\,\psi(x)\,dF(x) \right) : \psi \in \Psi \right\},$$
where $F$ denotes the distribution function of $X$.

Theorem 3.A.11. Let $X$ and $Y$ be two nonnegative random variables with finite means. Then
$$X \le_{\mathrm{Lorenz}} Y \iff L(X) \subseteq L(Y).$$

Ramos and Sordo [463] defined what they called a "second-order absolute Lorenz order" by requiring two random variables $X$ and $Y$, with finite means and with distribution functions $F$ and $G$, respectively, to satisfy
$$\int_p^1 \int_0^u [F^{-1}(v) - EX]\,dv\,du \ge \int_p^1 \int_0^u [G^{-1}(v) - EY]\,dv\,du \quad\text{for all } p \in [0,1].$$

3.A.2 Closure and other properties

Using (3.A.1) through (3.A.13) it is easy to prove each of the closure results in the first two parts of the following theorem. (Recall from Section 1.A.3 that for any random variable $Z$ and any event $A$ we denote by $[Z \mid A]$ any random variable whose distribution is the conditional distribution of $Z$ given $A$.)

Theorem 3.A.12. (a) Let $X$ and $Y$ be two random variables. Then
$$X \le_{\mathrm{cx}} Y \iff -X \le_{\mathrm{cx}} -Y.$$


(b) Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta] \le_{\mathrm{cx}} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Then $X \le_{\mathrm{cx}} Y$. That is, the convex order is closed under mixtures.

(c) Let $\{X_j, j = 1, 2, \ldots\}$ and $\{Y_j, j = 1, 2, \ldots\}$ be two sequences of random variables such that $X_j \to_{\mathrm{st}} X$ and $Y_j \to_{\mathrm{st}} Y$ as $j \to \infty$. Assume that
$$E|X_j| \to E|X| \quad\text{and}\quad E|Y_j| \to E|Y| \quad\text{as } j \to \infty. \tag{3.A.44}$$
If $X_j \le_{\mathrm{cx}} Y_j$, $j = 1, 2, \ldots$, then $X \le_{\mathrm{cx}} Y$.

(d) Let $X_1, X_2, \ldots, X_m$ be a set of independent random variables and let $Y_1, Y_2, \ldots, Y_m$ be another set of independent random variables. If $X_i \le_{\mathrm{cx}} Y_i$ for $i = 1, 2, \ldots, m$, then
$$\sum_{j=1}^m X_j \le_{\mathrm{cx}} \sum_{j=1}^m Y_j.$$

That is, the convex order is closed under convolutions.

In order to prove part (c) of Theorem 3.A.12 we will use the characterization of the convex order given in Theorem 3.A.2. Without loss of generality it can be assumed that $EX_j = EY_j = EX = EY = 0$ for all $j$. From (3.A.11) we have that
$$E|X_j - a| = -a + 2\int_{-\infty}^a F_j(u)\,du \quad\text{for all } a,$$
where $F_j$ denotes the distribution function of $X_j$. In particular, when $a = 0$, it is seen that $E|X_j| = 2\int_{-\infty}^0 F_j(u)\,du$. Therefore
$$E|X_j - a| = E|X_j| - a + 2\int_0^a F_j(u)\,du.$$
Using (3.A.44) it is seen that, as $j \to \infty$, the latter expression converges to $E|X| - a + 2\int_0^a F(u)\,du = E|X - a|$, where $F$ is the distribution function of $X$. That is, for all $a$, $E|X_j - a| \to E|X - a|$, as $j \to \infty$. Similarly, $E|Y_j - a| \to E|Y - a|$, as $j \to \infty$. The result now follows from Theorem 3.A.2.

One way of proving part (d) of Theorem 3.A.12 is the following. Note that part (b) of Theorem 3.A.12 can be rephrased as follows: Let $Z_1$, $Z_2$, and $\Theta$ be independent random variables and let $g$ be a bivariate function such that
$$g(Z_1, \theta) \le_{\mathrm{cx}} g(Z_2, \theta) \quad\text{for all } \theta \text{ in the support of } \Theta. \tag{3.A.45}$$
Then $g(Z_1, \Theta) \le_{\mathrm{cx}} g(Z_2, \Theta)$. If $Z_1$ and $Z_2$ satisfy $Z_1 \le_{\mathrm{cx}} Z_2$, then the function $g$, defined by $g(z, \theta) = z + \theta$, satisfies (3.A.45), since the order $\le_{\mathrm{cx}}$ is closed under shifts. Thus we have shown that if $Z_1 \le_{\mathrm{cx}} Z_2$ and $\Theta$ is any random variable independent of $Z_1$ and $Z_2$, then
$$Z_1 + \Theta \le_{\mathrm{cx}} Z_2 + \Theta. \tag{3.A.46}$$
Repeated applications of (3.A.46) yield part (d) of Theorem 3.A.12.

It should be pointed out, in contrast to part (a) of Theorem 3.A.12, that if $X$ and $Y$ are such that $X \le_{\mathrm{cx}} Y$, it is not necessarily true that $X \le_{\mathrm{cx}} -Y$ also, even when $EX = EY = 0$. This can be seen easily from (3.A.12).
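The closure property (3.A.46) can be checked exactly for small discrete distributions. The sketch below is an illustration only. It assumes the familiar characterization of the convex order for random variables with equal means through $E[(X - a)_+] \le E[(Y - a)_+]$ for all $a$, and it uses the fact that for finitely supported distributions both sides are piecewise linear in $a$ with kinks only at support points, so that it suffices to compare them there. The helper names (`cx_leq`, `convolve`, and so on) and the three distributions are hypothetical.

```python
from fractions import Fraction
from itertools import product

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def stop_loss(pmf, a):
    """E[(X - a)_+] for a finite distribution given as {value: probability}."""
    return sum(max(x - a, 0) * p for x, p in pmf.items())

def cx_leq(pmf_x, pmf_y):
    """Check X <=_cx Y for finitely supported X and Y: equal means and
    E[(X - a)_+] <= E[(Y - a)_+] at every support point a."""
    if mean(pmf_x) != mean(pmf_y):
        return False
    points = set(pmf_x) | set(pmf_y)
    return all(stop_loss(pmf_x, a) <= stop_loss(pmf_y, a) for a in points)

def convolve(pmf_u, pmf_v):
    """Distribution of U + V for independent U and V."""
    out = {}
    for (u, p), (v, q) in product(pmf_u.items(), pmf_v.items()):
        out[u + v] = out.get(u + v, 0) + p * q
    return out

half, third = Fraction(1, 2), Fraction(1, 3)
z1 = {Fraction(0): Fraction(1)}                  # degenerate at 0
z2 = {Fraction(-1): half, Fraction(1): half}     # symmetric two-point law
theta = {Fraction(0): third, Fraction(1): third, Fraction(5): third}

print(cx_leq(z1, z2))                                     # Z1 <=_cx Z2
print(cx_leq(convolve(z1, theta), convolve(z2, theta)))   # (3.A.46)
print(cx_leq(z2, z1))                                     # reverse direction
```

Rational arithmetic keeps the equality of means and the comparisons exact.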


Without condition (3.A.44) the conclusion of part (c) of Theorem 3.A.12 need not be true. For example, let the $X_j$'s be all uniformly distributed on $[0.5, 1.5]$, and let the $Y_j$'s be such that $P\{Y_j = 0\} = (j-1)/j$ and $P\{Y_j = j\} = 1/j$, $j \ge 2$. Note that the distributions of the $Y_j$'s converge to a distribution that is degenerate at 0. Here $X_j \le_{\mathrm{cx}} Y_j$, $j = 2, 3, \ldots$, but it is not true that $X \le_{\mathrm{cx}} Y$.

For nonnegative random variables, a "random sums" analog of Theorem 3.A.12(d) follows. We omit the proof (however, in Theorem 8.A.13 of Chapter 8 we give a proof of a special case of the following theorem).

Theorem 3.A.13. Let $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ each be a sequence of nonnegative independent random variables such that $X_i \le_{\mathrm{cx}} Y_i$, $i = 1, 2, \ldots$. Let $M$ and $N$ be integer-valued positive random variables that are independent of the $\{X_i\}$ and $\{Y_i\}$ sequences, respectively, such that $M \le_{\mathrm{cx}} N$. If the $X_i$'s or the $Y_i$'s are increasing in $i$ in the convex order, then
$$\sum_{j=1}^M X_j \le_{\mathrm{cx}} \sum_{j=1}^N Y_j.$$

A result that is related to Theorem 3.A.13 is given next. It is of interest to compare it to Theorems 1.A.5 and 2.B.8.

Theorem 3.A.14. Let $\{X_j, j = 1, 2, \ldots\}$ be a sequence of nonnegative independent and identically distributed random variables, and let $M$ be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j, j = 1, 2, \ldots\}$ be another sequence of independent and identically distributed random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$'s. Suppose that for some positive integer $K$ we have
$$\sum_{i=1}^K X_i \le_{\mathrm{cx}} [\ge_{\mathrm{cx}}]\ Y_1 \quad\text{and}\quad M \le_{\mathrm{cx}} [\ge_{\mathrm{cx}}]\ KN.$$
Then
$$\sum_{j=1}^M X_j \le_{\mathrm{cx}} [\ge_{\mathrm{cx}}]\ \sum_{j=1}^N Y_j.$$

We do not give a detailed proof of Theorem 3.A.14 here since it is similar to the proof of Theorem 4.A.12 in Section 4.A.1. Two other similar theorems are the following. Their proofs are similar to the proofs of Theorems 4.A.13 and 4.A.14 in Section 4.A.1. Theorem 3.A.15. Let {Xj , j = 1, 2, . . . } be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi ’s. Let


$\{Y_j, j = 1, 2, \ldots\}$ be another sequence of independent and identically distributed random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$'s. Also, let $\{N_j, j = 1, 2, \ldots\}$ be a sequence of independent random variables that are distributed as $N$. If for some positive integer $K$ we have
$$\sum_{i=1}^K X_i \le_{\mathrm{cx}} Y_1 \quad\text{and}\quad M \le_{\mathrm{cx}} \sum_{i=1}^K N_i,$$
or if we have
$$K X_1 \le_{\mathrm{cx}} Y_1 \quad\text{and}\quad M \le_{\mathrm{cx}} KN,$$
or if we have
$$K X_1 \le_{\mathrm{cx}} Y_1 \quad\text{and}\quad M \le_{\mathrm{cx}} \sum_{i=1}^K N_i,$$
then
$$\sum_{j=1}^M X_j \le_{\mathrm{cx}} \sum_{j=1}^N Y_j.$$

Theorem 3.A.16. Let $\{X_j, j = 1, 2, \ldots\}$ be a sequence of nonnegative independent and identically distributed random variables, and let $M$ be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j, j = 1, 2, \ldots\}$ be another sequence of independent and identically distributed random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$'s. If for some positive integers $K_1$ and $K_2$, such that $K_1 \le K_2$, we have
$$\sum_{i=1}^{K_1} X_i \le_{\mathrm{cx}} \frac{K_1}{K_2}\,Y_1 \quad\text{and}\quad M \le_{\mathrm{cx}} K_2 N,$$
then
$$\sum_{j=1}^M X_j \le_{\mathrm{cx}} \sum_{j=1}^N Y_j.$$

Another result which involves a comparison of random sums, with respect to the convex order, is given in Example 9.A.19. Theorem 3.A.12(d) can be generalized to situations in which the Xj ’s or the Yj ’s are not necessarily independent. For example, the result (7.A.13) in Chapter 7 is a generalization of Theorem 3.A.12(d). The next result is a trivial illustration of a case in which one of the independence assumptions is dropped.


Theorem 3.A.17. Let $X$ be a random variable with a finite mean. Then
$$X + EX \le_{\mathrm{cx}} 2X.$$

Proof. Let $X'$ be an independent copy of $X$. Then, for any convex function $\phi$ for which the expectations below exist, one has
$$E\phi(X + EX) = E\phi(E(X + X' \mid X)) \le E\phi(X + X') \le E\phi(2X),$$
where the first inequality follows from Jensen's Inequality and the second inequality follows from Example 3.A.29 below (with $n = 2$).

Theorem 3.A.17 can also be easily proven using Theorem 3.A.4. The following result provides a generalization of Theorem 3.A.17; see a comment after Theorem 3.A.18. Recall from (3.A.32) the definition of the dilation order.

Theorem 3.A.18. Let $X$ be a random variable with a finite mean. Then
$$X \le_{\mathrm{dil}} aX \quad\text{whenever } a \ge 1.$$

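Proof. Without loss of generality assume that $EX = 0$. Let $\phi$ be a convex function which, without loss of generality, can be assumed to satisfy $\phi(0) = 0$ and $\phi \ge 0$; indeed, subtracting from $\phi$ a supporting line at 0 changes $E\phi(X)$ and $E\phi(kX)$ by the same constant because $EX = 0$. Then, for $k \ge 1$, the convexity of $\phi$ together with $\phi(0) = 0$ gives $k\phi(x) \le \phi(kx)$ for all $x$, and therefore
$$E\phi(X) \le E[k\phi(X)] \le E\phi(kX).$$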

From Theorem 3.A.18 it follows that
$$X + (k-1)EX \le_{\mathrm{cx}} kX \quad\text{whenever } k \ge 1,$$
which is, indeed, a generalization of Theorem 3.A.17.

From Theorem 3.A.12(a) it is not hard to see that
$$X \le_{\mathrm{dil}} Y \iff -X \le_{\mathrm{dil}} -Y.$$
Another property of the dilation and of the convex orders is described in the following theorem.

Theorem 3.A.19. Let $X_1$ and $X_2$ ($Y_1$ and $Y_2$) be two independent copies of $X$ ($Y$), where $X$ and $Y$ have finite means. If $X \le_{\mathrm{dil}} Y$, then
$$X_1 - X_2 \le_{\mathrm{dil}} Y_1 - Y_2.$$
If $X \le_{\mathrm{cx}} Y$, then
$$X_1 - X_2 \le_{\mathrm{cx}} Y_1 - Y_2.$$

Proof. Using the fact that $X \le_{\mathrm{dil}} Y$ if, and only if, $-X \le_{\mathrm{dil}} -Y$, and the fact that the dilation order is closed under convolutions (see Theorem 3.A.12(d)), the stated result follows. The proof of $X \le_{\mathrm{cx}} Y \Longrightarrow X_1 - X_2 \le_{\mathrm{cx}} Y_1 - Y_2$ is similar (using Theorem 3.A.12(a) and (d)).


An interesting comparison of sums of random variables in the convex order is the following result.

Theorem 3.A.20. Let $X_1, X_2, \ldots, X_n$, and $Z$ be random variables. Then
$$X_1 + X_2 + \cdots + X_n \ge_{\mathrm{cx}} E[X_1 \mid Z] + E[X_2 \mid Z] + \cdots + E[X_n \mid Z],$$
provided the conditional expectations above exist.

Proof. Let $\phi$ be a convex function. By Jensen's Inequality we have
$$E\phi(X_1 + X_2 + \cdots + X_n) = E\big\{E[\phi(X_1 + X_2 + \cdots + X_n) \mid Z]\big\} \ge E\big\{\phi\big(E[X_1 \mid Z] + E[X_2 \mid Z] + \cdots + E[X_n \mid Z]\big)\big\},$$
and the stated result follows.

Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a convex subset (that is, an interval) of the real line or of $\mathbb{N}$. As in Section 1.A.3 let $X(\theta)$ denote a random variable with distribution function $G_\theta$. For any random variable $\Theta$ with support in $\mathcal{X}$, and with distribution function $F$, let us denote by $X(\Theta)$ a random variable with distribution function $H$ given by
$$H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.$$

Theorem 3.A.21. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, $i = 1, 2$; that is, suppose that the distribution function of $Y_i$ is given by
$$H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$
If for every convex function $\phi$
$$E[\phi(X(\theta))] \text{ is convex in } \theta, \tag{3.A.47}$$
and if $\Theta_1 \le_{\mathrm{cx}} \Theta_2$, then $Y_1 \le_{\mathrm{cx}} Y_2$.

The proof of Theorem 3.A.21 is similar to the proof of Theorem 4.A.18 below, and therefore we omit it. It is worth mentioning that condition (3.A.47) is the same as the condition $\{X(\theta), \theta \in \mathcal{X}\} \in \mathrm{SCX}$ which is studied in Section 8.A of Chapter 8.

The following corollary of Theorem 3.A.21 shows that the convex order is closed under products of nonnegative random variables. A variation of this corollary is given in Example 4.A.19.


Corollary 3.A.22. Let X1 and X2 be a pair of independent random variables, and let Y1 and Y2 be another pair of independent random variables. If Xi ≤cx Yi , i = 1, 2, then X1 X2 ≤cx Y1 Y2 . Proof. Using Theorem 3.A.21 twice we see that X1 X2 ≤cx Y1 X2 ≤cx Y1 Y2 , and the stated result follows from the transitivity property of the convex order.

An interesting variation of Theorem 3.A.21 is the following. Again, we omit the proof because it is similar to the proof of Theorem 4.A.18.

Theorem 3.A.23. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as described before Theorem 3.A.21. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, $i = 1, 2$; that is, suppose that the distribution function of $Y_i$ is given by
$$H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$
If for every convex function $\phi$
$$E[\phi(X(\theta))] \text{ is increasing in } \theta,$$
and if $\Theta_1 \le_{\mathrm{st}} \Theta_2$, then $Y_1 \le_{\mathrm{cx}} Y_2$.

The next result indicates the "minimal" and the "maximal" random variables, with respect to the order $\le_{\mathrm{cx}}$, when the support and the mean are given. The proof, using (3.A.7) or (3.A.8) for example, is trivial and is thus omitted.

Theorem 3.A.24. Let $X$ be a random variable with mean $EX$. Denote the left [right] endpoint of the support of $X$ by $l_X$ [$u_X$] (see the paragraph preceding (3.A.12) for the exact definition of $l_X$ and $u_X$). Let $Z$ be a random variable such that $P\{Z = l_X\} = (u_X - EX)/(u_X - l_X)$ and $P\{Z = u_X\} = (EX - l_X)/(u_X - l_X)$. Then
$$EX \le_{\mathrm{cx}} X \le_{\mathrm{cx}} Z, \tag{3.A.48}$$
where in (3.A.48) (and in (3.A.49)) $EX$ denotes a random variable that takes on the value $EX$ with probability 1.


Another result that indicates the "minimal" random variable, with respect to the order $\le_{\mathrm{cx}}$, for some rich families of random variables when the mean is given is Theorem 3.A.46.

It follows from the first inequality of (3.A.48) and from the fact that for any two random variables $U$ and $V$ one has $U \le_{\mathrm{cx}} V \iff V \le_{\mathrm{cv}} U$, that if $X$ is a random variable with mean $EX$, then
$$X \le_{\mathrm{cv}} EX. \tag{3.A.49}$$

In analogy to Theorem 1.A.17 we have the following results. We omit the proof of Theorem 3.A.25; however, the necessity part of Theorem 3.A.25 is a special case of Theorem 3.A.26. In the next three theorems we assume that all the random variables that are considered have finite means.

Theorem 3.A.25. Let $X$ be a nonnegative random variable that is not degenerate at 0 and let $g$ be a nonnegative function defined on $[0, \infty)$. If $g(x) > 0$ for all $x > 0$, and if $g$ is increasing on $[0, \infty)$, and if $g(x)/x$ is decreasing [increasing] on $(0, \infty)$, then
$$g(X) \le_{\mathrm{Lorenz}} [\ge_{\mathrm{Lorenz}}]\ X.$$
For example, if $X$ is a nonnegative random variable, then
$$X + a \le_{\mathrm{Lorenz}} X \quad\text{whenever } a > 0.$$

The proof of the next theorem follows from results in Chapter 4 (see Theorem 4.B.5 and the first part of the proof of Theorem 4.B.4).

Theorem 3.A.26. Let $X$ be a nonnegative random variable that is not degenerate at 0, and let $g$ and $h$ be nonnegative increasing functions, defined on $[0, \infty)$, such that $g(x) > 0$ and $h(x) > 0$ for all $x > 0$. If $h(x)/g(x)$ is increasing in $x \in (0, \infty)$, then
$$g(X) \le_{\mathrm{Lorenz}} h(X).$$

Using Theorem 3.A.25 it is not too hard to prove the following result.

Theorem 3.A.27. Let $X$ and $Z$ be two independent nonnegative random variables that are not degenerate at 0 and let $g$ be a nonnegative function defined on $[0, \infty)^2$ such that $g(Z, X)$ is not degenerate at 0. If $g(z, x)/x$ is increasing in $x$ for every $z$, and if $g(z, x)$ is increasing in $x$ for every $z$, then
$$X \le_{\mathrm{Lorenz}} g(Z, X).$$

The Lorenz order often implies the harmonic mean residual life order, as the following result shows.

Theorem 3.A.28. Let $X$ and $Y$ be two nonnegative random variables with positive expectations. If $X \le_{\mathrm{Lorenz}} Y$ and if $EX \le EY$, then $X \le_{\mathrm{hmrl}} Y$.


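Proof. For $t \ge 0$ we have
$$\frac{E[(X - t)_+]}{EX} \le \frac{E\big[\big(\frac{EX}{EY}\cdot Y - t\big)_+\big]}{EX} = E\Big[\Big(\frac{Y}{EY} - \frac{t}{EX}\Big)_+\Big] = \frac{E\big[\big(Y - \frac{EY}{EX}\cdot t\big)_+\big]}{EY} \le \frac{E[(Y - t)_+]}{EY},$$
where the first inequality follows from $X \le_{\mathrm{Lorenz}} Y$ (that is, $X \le_{\mathrm{cx}} \frac{EX}{EY}\,Y$), and the second inequality follows from $EX \le EY$. The stated result now follows from (2.B.4).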

Let us now return to the characterization of the convex order given in Theorem 3.A.4. This characterization is sometimes useful for establishing the relation $\le_{\mathrm{cx}}$ between two random variables. The next example is a fine illustration of this procedure.

Example 3.A.29. Let $X_1, X_2, \ldots$ be independent and identically distributed random variables. Denote by $\bar{X}_n$ the sample mean of $X_1, X_2, \ldots, X_n$. That is, $\bar{X}_n = (X_1 + X_2 + \cdots + X_n)/n$. It is well known that if the variances exist, then for every $n \ge 2$ one has $\mathrm{Var}(\bar{X}_n) \le \mathrm{Var}(\bar{X}_{n-1})$. But more than that is true. In fact, if the expectation of $X_1$ exists, then for each $n \ge 2$ one has
$$\bar{X}_n \le_{\mathrm{cx}} \bar{X}_{n-1}.$$
In order to see it note that from the exchangeability of $X_1, X_2, \ldots, X_n$ it follows that $E[X_i \mid \bar{X}_n] = \bar{X}_n$ for all $i \le n$. Therefore $E[\bar{X}_{n-1} \mid \bar{X}_n] = \bar{X}_n$. That is, $\{\bar{X}_n, \bar{X}_{n-1}\}$ is a martingale. The result now follows from Theorem 3.A.4.

An extension of Example 3.A.29 to the multivariate case is given in Example 7.A.11. A result that is similar to Example 3.A.29 is the following (actually it is a generalization of Example 3.A.29 as will be argued below).

Theorem 3.A.30. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed random variables. Let $\phi_1, \phi_2, \ldots, \phi_n$ be measurable real functions. Denote $\bar{\phi} = \frac{1}{n}\sum_{i=1}^n \phi_i$. Then
$$\sum_{i=1}^n \bar{\phi}(X_i) \le_{\mathrm{cx}} \sum_{i=1}^n \phi_i(X_i).$$

The proof of Theorem 3.A.30 consists of verifying, using the exchangeability of the $X_i$'s, that
$$E\left[\frac{1}{n!}\sum_\pi \sum_{i=1}^n \phi_{\pi_i}(X_i) \,\Big|\, \sum_{i=1}^n \bar{\phi}(X_i)\right] = \sum_{i=1}^n \bar{\phi}(X_i)$$


and that
$$\frac{1}{n!}\sum_\pi \sum_{i=1}^n \phi_{\pi_i}(X_i) =_{\mathrm{st}} \sum_{i=1}^n \phi_i(X_i).$$
The desired result then follows from Theorem 3.A.4.

Corollary 3.A.31. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed random variables. Let $a_1, a_2, \ldots, a_n$ be real constants. Denote $\bar{a} = \frac{1}{n}\sum_{i=1}^n a_i$. Then
$$\bar{a}\sum_{i=1}^n X_i \le_{\mathrm{cx}} \sum_{i=1}^n a_i X_i.$$

By taking $a_i = 1/(n-1)$ for $i = 1, 2, \ldots, n-1$, and $a_n = 0$, it is easily seen that Example 3.A.29 is a special case of Corollary 3.A.31.

Example 3.A.32. Let $m \le m'$ be two positive integers, and let $M$ and $N$ be two Poisson random variables with means $m\lambda$ and $m'\lambda$, respectively, for some $\lambda > 0$. Define $X = mN$ and $Y = m'M$. Then, using Example 3.A.29, it can be shown that $X \le_{\mathrm{cx}} Y$. This result can be extended to the case where $m$ and $m'$ are not integers, by approximating $m/m'$ with rational numbers.

Two other simple results that follow from Theorem 3.A.4 are the following theorems.

Theorem 3.A.33. Let $X$ and $Y$ be independent random variables with finite means and suppose that $EY = a$. Then
$$aX \le_{\mathrm{cx}} YX.$$

Proof. Clearly, $E[YX \mid X] = aX$ and the result now follows from Theorem 3.A.4. This result is also an immediate consequence of Corollary 3.A.22 if one takes there $X_1 = a$ almost surely, $X_2 = X$, and $Y_1 = Y$ and $Y_2 = X$.

Theorem 3.A.34. Let $X$ and $Y$ be independent random variables with finite means and suppose that $EY = 0$. Then
$$X \le_{\mathrm{cx}} X + Y.$$

Proof. Clearly, $E[X + Y \mid X] = X$ and the result follows from Theorem 3.A.4. Another way of proving this result is to use Theorem 3.A.12(d).

Recall from (3.A.32) the deﬁnition of the dilation order. From Theorem 3.A.34 it follows that if X and Y are independent random variables with ﬁnite means, then X ≤dil X + Y. (3.A.50) Recall from page 2 the deﬁnition of the majorization order a ≺ b among n-dimensional vectors. The next result strengthens Corollary 3.A.31.


Theorem 3.A.35. Let $X_1, X_2, \ldots, X_n$ be exchangeable random variables. Let $a = (a_1, a_2, \ldots, a_n)$ and $b = (b_1, b_2, \ldots, b_n)$ be two vectors of constants. If $a \prec b$, then
$$\sum_{i=1}^n a_i X_i \le_{\mathrm{cx}} \sum_{i=1}^n b_i X_i. \tag{3.A.51}$$

Proof. Below, for any constants a, b, c, and d the notation a ≤ [b, c] stands for a ≤ min{b, c}, and the notation [b, c] ≤ d stands for max{b, c} ≤ d. By a wellknown property of the majorization order it suﬃces to prove the result only for n = 2. Let X1 and X2 be exchangeable random variables, and let a1 , a2 , b1 , and b2 be four constants such that b1 ≤ a1 ≤ a2 ≤ b2 and a1 + a2 = b1 + b2 . Denote X(1) = min{X1 , X2 } and X(2) = max{X1 , X2 }. Then, almost surely, b1 X(2) + b2 X(1) ≤ [a1 X(1) + a2 X(2) , a1 X(2) + a2 X(1) ] ≤ b1 X(1) + b2 X(2) and a1 X(2) + a2 X(1) + a1 X(1) + a2 X(2) = b1 X(2) + b2 X(1) + b1 X(1) + b2 X(2) . Hence for any convex function φ we have, almost surely, φ(a1 X(2) + a2 X(1) ) + φ(a1 X(1) + a2 X(2) ) ≤ φ(b1 X(2) + b2 X(1) ) + φ(b1 X(1) + b2 X(2) ). Therefore, 2Eφ(a1 X1 + a2 X2 ) = E[φ(a1 X(2) + a2 X(1) ) + φ(a1 X(1) + a2 X(2) )] ≤ E[φ(b1 X(2) + b2 X(1) ) + φ(b1 X(1) + b2 X(2) )] = 2Eφ(b1 X1 + b2 X2 ), and the stated result follows.
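The key step of the proof above, the almost sure inequality for $n = 2$, is easy to probe numerically. The following sketch is only an illustration. The coefficient vectors, the grid of values for $(X_1, X_2)$, the list of convex test functions, and the function name `pair_inequality_holds` are all hypothetical choices, and a small tolerance is added because the computation is in floating point.

```python
import math
from itertools import product

def pair_inequality_holds(phi, a, b, x1, x2, tol=1e-9):
    """Check phi(a1*X(2) + a2*X(1)) + phi(a1*X(1) + a2*X(2))
          <= phi(b1*X(2) + b2*X(1)) + phi(b1*X(1) + b2*X(2))
    for one realization (x1, x2), where X(1) = min and X(2) = max."""
    lo, hi = min(x1, x2), max(x1, x2)
    a1, a2 = a
    b1, b2 = b
    lhs = phi(a1 * hi + a2 * lo) + phi(a1 * lo + a2 * hi)
    rhs = phi(b1 * hi + b2 * lo) + phi(b1 * lo + b2 * hi)
    return lhs <= rhs + tol

# Coefficients with b1 <= a1 <= a2 <= b2 and a1 + a2 = b1 + b2, as in the proof.
a = (0.4, 0.6)
b = (0.1, 0.9)

# A few convex test functions and a grid of realizations of (X1, X2).
convex_functions = [abs, lambda t: t * t, math.exp, lambda t: max(t - 1.0, 0.0)]
grid = [k / 2 for k in range(-6, 7)]

ok = all(pair_inequality_holds(phi, a, b, x1, x2)
         for phi in convex_functions
         for x1, x2 in product(grid, grid))
print(ok)
```

Taking expectations of the pointwise inequality, as in the proof, then gives (3.A.51) for $n = 2$; swapping the roles of `a` and `b` makes the check fail for strictly convex test functions.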

A result that is related to Theorem 3.A.35 is Theorem 4.A.39. Another result that is related to Theorem 3.A.35 is Theorem 7.B.8 in Chapter 7 by Tong in [515]; the latter compares $\sum_{i=1}^n b_i X_i$ and $\sum_{i=1}^n a_i X_i$ in the sense of the peakedness order of Section 3.D, rather than in the sense of the order $\le_{\mathrm{cx}}$.

From Theorem 3.A.35 it follows that if the $X_i$'s are exchangeable (in particular, if they are identically distributed), if $a_i \ge 0$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^n a_i = 1$, and if $X_1 \le_{\mathrm{cx}} Y$ for some random variable $Y$, then
$$\sum_{i=1}^n a_i X_i \le_{\mathrm{cx}} Y. \tag{3.A.52}$$

The next result shows that (3.A.52) is true even if the Xi ’s are not exchangeable, but have any joint distribution.


Theorem 3.A.36. Let $X_1, X_2, \ldots, X_n$ and $Y$ be $n + 1$ random variables. If $X_i \le_{\mathrm{cx}} Y$, $i = 1, 2, \ldots, n$, then
$$\sum_{i=1}^n a_i X_i \le_{\mathrm{cx}} Y,$$
whenever $a_i \ge 0$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^n a_i = 1$.

Proof. Let $\phi$ be any convex function for which the expectations below exist. Then
$$E\Big[\phi\Big(\sum_{i=1}^n a_i X_i\Big)\Big] \le E\Big[\sum_{i=1}^n a_i \phi(X_i)\Big] = \sum_{i=1}^n a_i E[\phi(X_i)] \le \sum_{i=1}^n a_i E[\phi(Y)] = E[\phi(Y)],$$
where the first inequality follows from the convexity of $\phi$, and the second inequality from $X_i \le_{\mathrm{cx}} Y$, $i = 1, 2, \ldots, n$.

Similar results are described in Theorems 5.A.14, 5.C.8, and 5.C.18.

An interesting result in which the coefficients in (3.A.51) are replaced by Bernoulli random variables is described next. Let $I_p$ denote a Bernoulli random variable with probability of success $p$, that is, $P\{I_p = 1\} = 1 - P\{I_p = 0\} = p$.

Theorem 3.A.37. Let $X_1, X_2, \ldots, X_n$ be nonnegative exchangeable random variables, and let $I_{p_1}, I_{p_2}, \ldots, I_{p_n}$ and $I_{q_1}, I_{q_2}, \ldots, I_{q_n}$ be independent Bernoulli random variables that are independent of $X_1, X_2, \ldots, X_n$. If $p \prec q$, then
$$\sum_{i=1}^n I_{p_i} X_i \ge_{\mathrm{cx}} \sum_{i=1}^n I_{q_i} X_i.$$

A result that is related to Theorem 3.A.37 is Theorem 4.A.38.

Example 3.A.38. If the $X_i$'s in Theorem 3.A.37 are all identically equal to 1, then we get that $p \prec q$ implies that
$$\sum_{i=1}^n I_{p_i} \ge_{\mathrm{cx}} \sum_{i=1}^n I_{q_i}.$$
In particular,
$$\sum_{i=1}^n I_{q_i} \le_{\mathrm{cx}} Y,$$
where $Y$ is a binomial random variable having the parameters $n$ and $\bar{q} = \sum_{i=1}^n q_i/n$.
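The last inequality of Example 3.A.38 can be verified exactly for a small case. The sketch below is an illustration under stated assumptions: it uses the characterization of $\le_{\mathrm{cx}}$, for random variables with equal means, through $E[(X - a)_+]$, which for distributions on $\{0, 1, \ldots, n\}$ only needs to be checked at integer $a$. The success probabilities and the helper names are hypothetical.

```python
from fractions import Fraction

def bernoulli_sum_pmf(probs):
    """pmf (as a list indexed by 0..n) of a sum of independent
    Bernoulli variables with the given success probabilities."""
    pmf = [Fraction(1)]
    for p in probs:
        new = [Fraction(0)] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new[k] += mass * (1 - p)      # this indicator equals 0
            new[k + 1] += mass * p        # this indicator equals 1
        pmf = new
    return pmf

def stop_loss(pmf, a):
    """E[(S - a)_+] for S with values 0..n and the given pmf."""
    return sum(max(k - a, 0) * mass for k, mass in enumerate(pmf))

def cx_leq(pmf_x, pmf_y):
    """X <=_cx Y for distributions on {0, ..., n}: equal means
    (stop-loss at 0) and ordered stop-loss values at each integer."""
    n = max(len(pmf_x), len(pmf_y))
    same_mean = stop_loss(pmf_x, 0) == stop_loss(pmf_y, 0)
    return same_mean and all(
        stop_loss(pmf_x, a) <= stop_loss(pmf_y, a) for a in range(n))

q = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]   # hypothetical q_i's
q_bar = sum(q) / len(q)

hetero = bernoulli_sum_pmf(q)                  # sum of the I_{q_i}
binom = bernoulli_sum_pmf([q_bar] * len(q))    # binomial(n, q_bar)

print(cx_leq(hetero, binom))
print(cx_leq(binom, hetero))
```

Here the binomial variable is itself built as a sum of Bernoulli variables with the common parameter $\bar{q}$, which corresponds to the vector $(\bar{q}, \ldots, \bar{q}) \prec q$ in the example.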


Conceptually it can be expected that if the random variables $X_1, X_2, \ldots, X_n$ are "more positively [negatively] associated" than the random variables $Y_1, Y_2, \ldots, Y_n$ in some sense, but otherwise $X_i =_{\mathrm{st}} Y_i$ for each $i$, then $\sum_{i=1}^n X_i \ge_{\mathrm{cx}} [\le_{\mathrm{cx}}]\ \sum_{i=1}^n Y_i$. The following result is a formalization of this idea. Recall that random variables $X_1, X_2, \ldots, X_n$ are said to be positively associated if
$$\mathrm{Cov}(h_1(X_1, X_2, \ldots, X_n), h_2(X_1, X_2, \ldots, X_n)) \ge 0 \tag{3.A.53}$$
for all increasing functions $h_1$ and $h_2$ for which the above covariance is defined. Similarly, $X_1, X_2, \ldots, X_n$ are said to be negatively associated if
$$\mathrm{Cov}(h_1(X_{i_1}, X_{i_2}, \ldots, X_{i_k}), h_2(X_{j_1}, X_{j_2}, \ldots, X_{j_{n-k}})) \le 0 \tag{3.A.54}$$
for all choices of disjoint subsets $\{i_1, i_2, \ldots, i_k\}$ and $\{j_1, j_2, \ldots, j_{n-k}\}$ of $\{1, 2, \ldots, n\}$, and for all increasing functions $h_1$ and $h_2$ for which the above covariance is defined.

Theorem 3.A.39. Let $X_1, X_2, \ldots, X_n$ be positively [negatively] associated random variables, and let $Y_1, Y_2, \ldots, Y_n$ be independent random variables such that $X_i =_{\mathrm{st}} Y_i$, $i = 1, 2, \ldots, n$. Then
$$\sum_{i=1}^n X_i \ge_{\mathrm{cx}} [\le_{\mathrm{cx}}]\ \sum_{i=1}^n Y_i.$$

Theorem 3.A.39 follows from Theorem 9.A.23 in Chapter 9; see a comment there after that theorem. A Laplace transform characterization of the order ≤cx is stated next; it may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, 2.A.16, and 2.B.14. We do not give the proof of this characterization here since it follows easily from Theorem 4.A.21 in Chapter 4. Theorem 3.A.40. Let X1 and X2 be two nonnegative random variables, and let Nλ (X1 ) and Nλ (X2 ) be as described in Theorem 1.A.13. Then X1 ≤cx X2 ⇐⇒ Nλ (X1 ) ≤cx Nλ (X2 )

for all λ > 0.

Example 3.A.41. Let Θ be a random variable whose realization, θ, is a parameter of interest. In the context of statistical inference the distribution function of Θ is called a prior distribution. Let X and Y be two random variables whose distribution functions depend on θ, that is, the conditional distribution of X given Θ = θ is, say, Fθ , and the conditional distribution of Y given Θ = θ is, say, Gθ . Let L(a, θ) be the loss incurred when Θ = θ and when the action a has been taken (a is a number in the action space A which is a compact subset of R). In the following discussion, every expected value that is mentioned is assumed to exist.


If $X = x$ is observed, and action $a$ is taken, then the expected loss is $E[L(a, \Theta) \mid X = x]$. The minimal expected loss, given that $X = x$ has been observed, is then
$$\min_{a \in A} E[L(a, \Theta) \mid X = x].$$
Therefore the expected minimal expected loss, for an experiment in which $X$ is used for inference on $\theta$, is
$$E\Big[\min_{a \in A} E[L(a, \Theta) \mid X]\Big].$$
Similarly, the expected minimal expected loss, for an experiment in which $Y$ is used for inference on $\theta$, is
$$E\Big[\min_{a \in A} E[L(a, \Theta) \mid Y]\Big].$$
We say that $Y$ is more informative than $X$ for $\Theta$ if
$$E\Big[\min_{a \in A} E[L(a, \Theta) \mid X]\Big] \ge E\Big[\min_{a \in A} E[L(a, \Theta) \mid Y]\Big] \tag{3.A.55}$$
for any loss function $L$, and any action space $A$, for which the minima and the expected values above are well defined.

Let $U = E[\Theta \mid X]$ and $V = E[\Theta \mid Y]$ be the posterior means in the corresponding experiments. Obviously,
$$EU = E[E[\Theta \mid X]] = E\Theta = E[E[\Theta \mid Y]] = EV. \tag{3.A.56}$$
Take $A = [0, 1]$ and consider the loss function $L_c(a, \theta) = a(\theta - c)$, where $c$ is some constant. Then
$$\min_{a \in A} E[L_c(a, \Theta) \mid X] = \min_{a \in A} a\,E[\Theta - c \mid X] = \min\{0, U - c\} = -(c - U)_+,$$
and, similarly,
$$\min_{a \in A} E[L_c(a, \Theta) \mid Y] = -(c - V)_+.$$
From (3.A.55) we get that $E[(c - U)_+] \le E[(c - V)_+]$ for all $c$. Therefore, from (3.A.6) and (3.A.56) it follows that
$$E[\Theta \mid X] \le_{\mathrm{cx}} E[\Theta \mid Y].$$

The following result is an analog of Theorem 1.A.8; similar results are Theorems 3.A.59, 4.A.48, 4.A.69, 5.A.15, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16.


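Theorem 3.A.42. Let $X$ and $Y$ be two random variables. Suppose that $X \le_{\mathrm{cx}} Y$ [$X \le_{\mathrm{cv}} Y$] and that $E[X^2] = E[Y^2]$, provided the expectations exist. Then $X =_{\mathrm{st}} Y$.

Proof. Denote the distribution functions of $X$ and $Y$ by $F$ and $G$, respectively, and the corresponding survival functions by $\bar{F} = 1 - F$ and $\bar{G} = 1 - G$. Then
$$E[Y^2] - E[X^2] = 2\int_{u=-\infty}^{0}\int_{v=-\infty}^{u} (G(v) - F(v))\,dv\,du + 2\int_{u=0}^{\infty}\int_{v=u}^{\infty} (\bar{G}(v) - \bar{F}(v))\,dv\,du.$$
By Theorem 3.A.1 both inner integrals are nonnegative. From $E[X^2] = E[Y^2]$ we thus obtain $\int_{v=-\infty}^{u} F(v)\,dv = \int_{v=-\infty}^{u} G(v)\,dv$ for $u \le 0$, and $\int_{v=u}^{\infty} \bar{F}(v)\,dv = \int_{v=u}^{\infty} \bar{G}(v)\,dv$ for $u \ge 0$. Differentiating these equalities we obtain $F = G$.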

Theorem 3.A.42 can be strengthened as follows; we do not detail the proof here.

Theorem 3.A.43. Let $X$ and $Y$ be two random variables. Suppose that $X \le_{\mathrm{cx}} Y$ [$X \le_{\mathrm{cv}} Y$] and that for some strictly convex function $\phi$ we have that $E[\phi(X)] = E[\phi(Y)]$, provided the expectations exist. Then $X =_{\mathrm{st}} Y$.

Theorem 3.A.60 below is a generalization of Theorem 3.A.43.

3.A.3 Conditions that lead to the convex order

Once the relation $X \le_{\mathrm{cx}} Y$ has been established between the two random variables $X$ and $Y$ it can be of great use. However, given the two random variables and their distributions it is sometimes not clear how to verify that $X \le_{\mathrm{cx}} Y$. In this section we point out several simple conditions that imply the convex order. Recall the notation $S^-(a)$ (defined in (1.A.18)) for the number of sign changes of the function $a$.

Theorem 3.A.44. Let $X$ and $Y$ be two random variables with equal means, density functions $f$ and $g$, distribution functions $F$ and $G$, and survival functions $\bar{F}$ and $\bar{G}$, respectively. Then $X \le_{\mathrm{cx}} Y$ if any of the following conditions hold:
$$S^-(g - f) = 2 \text{ and the sign sequence is } +, -, +, \tag{3.A.57}$$
$$S^-(\bar{F} - \bar{G}) = 1 \text{ and the sign sequence is } +, -, \tag{3.A.58}$$
$$S^-(G - F) = 1 \text{ and the sign sequence is } +, -. \tag{3.A.59}$$

Proof. We will prove the result for the continuous case; the proof in the discrete case is similar. Suppose that S − (g − f ) = 2 and that the sign sequence is +, −, +. Let a and b (a < b) be two of the crossing points, where the deﬁnition of a crossing point is self-explanatory. Denote I1 = (−∞, a], I2 = (a, b],


and $I_3 = (b, \infty)$. Then $g(x) - f(x) \ge 0$ on $I_1$, $g(x) - f(x) \le 0$ on $I_2$, and $g(x) - f(x) \ge 0$ on $I_3$. Therefore
$$G(x) - F(x) = \int_{-\infty}^x [g(u) - f(u)]\,du$$
is increasing on $I_1$, decreasing on $I_2$, and increasing on $I_3$. It is also clear that $\lim_{x \to -\infty} [G(x) - F(x)] = \lim_{x \to \infty} [G(x) - F(x)] = 0$. Combining all these observations shows that $S^-(G - F) = 1$ and that the sign sequence is $+, -$.

Now suppose that $S^-(G - F) = 1$ and that the sign sequence is $+, -$. Let $c$ be a crossing point. Denote $J_1 = (-\infty, c]$ and $J_2 = (c, \infty)$. Then $G(x) - F(x) \ge 0$ on $J_1$ and $G(x) - F(x) \le 0$ on $J_2$. Clearly
$$\lim_{x \to -\infty} \int_{-\infty}^x [G(u) - F(u)]\,du = 0$$
and from the equality of the means (see (3.A.3)) it follows that
$$\lim_{x \to \infty} \int_{-\infty}^x [G(u) - F(u)]\,du = 0.$$
Combining these observations shows that (3.A.8) holds. This proves that (3.A.57) and (3.A.59) imply $X \le_{\mathrm{cx}} Y$. Note that $S^-(\bar{F} - \bar{G}) = S^-(G - F)$ with the same sign sequence. This observation, together with (3.A.59), shows that (3.A.58) implies $X \le_{\mathrm{cx}} Y$.
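Condition (3.A.57) can be explored numerically by counting sign changes of $g - f$ on a grid. The sketch below is a rough illustration only: a grid can miss sign changes that occur between grid points, and the choice of two normal densities with a common mean (standard deviations 1 and 2), the grid, and the function names are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def sign_sequence(values, eps=1e-15):
    """Signs of the entries that are not (numerically) zero, with
    consecutive repetitions collapsed, e.g. ['+', '-', '+']."""
    signs = []
    for v in values:
        if abs(v) <= eps:
            continue                       # zeros do not count as sign changes
        s = '+' if v > 0 else '-'
        if not signs or signs[-1] != s:
            signs.append(s)
    return signs

grid = [-8 + 0.01 * k for k in range(1601)]          # grid on [-8, 8]
f = [normal_pdf(x, 0.0, 1.0) for x in grid]          # density of X
g = [normal_pdf(x, 0.0, 2.0) for x in grid]          # density of Y

seq = sign_sequence([gv - fv for gv, fv in zip(g, f)])
print(seq)             # observed sign sequence of g - f on the grid
print(len(seq) - 1)    # number of sign changes found on the grid
```

A sign sequence of $+, -, +$ with equal means is what (3.A.57) requires for $X \le_{\mathrm{cx}} Y$.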

The condition (3.A.58) (or, equivalently, (3.A.59)) is not only sufficient for $X \le_{\mathrm{cx}} Y$, but, for nonnegative random variables, it can also characterize the convex order as the following theorem shows.

Theorem 3.A.45. Let $X$ and $Y$ be two nonnegative random variables with equal means. Then $X \le_{\mathrm{cx}} Y$ if, and only if, there exist random variables $Z_1, Z_2, \ldots$, with distribution functions $F_1, F_2, \ldots$, such that $Z_1 =_{\mathrm{st}} X$, $EZ_j = EY$, $j = 1, 2, \ldots$, $Z_j \to_{\mathrm{st}} Y$ as $j \to \infty$, and
$$S^-(\bar{F}_j - \bar{F}_{j+1}) = 1 \text{ and the sign sequence is } +, -, \quad j = 1, 2, \ldots.$$

If the random variables in Theorem 3.A.45 are not nonnegative then the sufficiency part of that theorem is not correct. This can be seen by noting that Example 1 of Müller [410] describes a sequence of distribution functions (say of the random variables $Z_1, Z_2, \ldots$), and two other distribution functions (say of the random variables $X$ and $Y$, which are not nonnegative), which satisfy all the conditions in Theorem 3.A.45, but such that $X \not\le_{\mathrm{cx}} Y$. We thank Taizhong Hu for pointing out this fact to us.

In Theorem 3.A.24 we obtained the "minimal" random variable with respect to the order $\le_{\mathrm{cx}}$ when the support and the mean are given. Now, with the aid of Theorem 3.A.44, we can obtain the "minimal" random variables with respect to the order $\le_{\mathrm{cx}}$ for some rich families of random variables when the mean is given. This is shown in the next result.


Theorem 3.A.46. Let $X$ be a nonnegative random variable with mean $\mu$.

(a) Suppose that $X$ has a density function that is decreasing on $[0, \infty)$. Let $Y$ be uniformly distributed over the interval $[0, 2\mu]$ (so that $EY = \mu$). Then $Y \le_{\mathrm{cx}} X$.

(b) Suppose that $X$ has a density function that is decreasing and convex on $[0, \infty)$. Let $Z$ have the triangular distribution over the interval $[0, 3\mu]$ with density function
$$f_Z(x) = \begin{cases} \dfrac{2}{3\mu} - \dfrac{2}{9\mu^2}\,x, & \text{if } 0 \le x \le 3\mu, \\ 0, & \text{otherwise} \end{cases}$$
(so that $EZ = \mu$). Then $Z \le_{\mathrm{cx}} X$.

Proof. In order to prove (a) let $f_X$ and $f_Y$ denote the density functions of $X$ and $Y$, respectively. It is easy to see, using the fact that $EX = EY$, that $S^-(f_X - f_Y) = 2$ and that the sign sequence is $+, -, +$. The result now follows from Theorem 3.A.44. The proof of (b) is similar.
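Part (a) of Theorem 3.A.46 can be illustrated with a concrete decreasing density. The sketch below takes $X$ to be exponential with mean $\mu$ (a hypothetical choice made only for this illustration) and compares the closed-form expressions $E[(Y - a)_+] = (2\mu - a)^2/(4\mu)$ for $0 \le a \le 2\mu$ and $E[(X - a)_+] = \mu e^{-a/\mu}$ for $a \ge 0$ on a grid. It assumes the characterization of $\le_{\mathrm{cx}}$, for equal means, through these expectations, and a grid check is of course only suggestive.

```python
import math

def stop_loss_uniform(a, mu):
    """E[(Y - a)_+] for Y uniform on [0, 2*mu]."""
    if a <= 0:
        return mu - a
    if a >= 2 * mu:
        return 0.0
    return (2 * mu - a) ** 2 / (4 * mu)

def stop_loss_exponential(a, mu):
    """E[(X - a)_+] for X exponential with mean mu."""
    if a <= 0:
        return mu - a
    return mu * math.exp(-a / mu)

mu = 1.5                                       # hypothetical common mean
grid = [-1 + 0.01 * k for k in range(1201)]    # grid on [-1, 11]

# Y <=_cx X requires E[(Y - a)_+] <= E[(X - a)_+] for all a; a small
# tolerance absorbs floating-point rounding.
ok = all(stop_loss_uniform(a, mu) <= stop_loss_exponential(a, mu) + 1e-12
         for a in grid)
print(ok)
```

For $a \le 0$ both expressions reduce to $\mu - a$, which reflects the equality of the means.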

Some illustrations of the applicability of Theorem 3.A.44 are shown in the following examples.

Example 3.A.47. The following statements can be proven by verifying, using the method in Shaked [502], that in each one of them the two random variables have the same mean, and that their densities satisfy (3.A.57).

(a) Let $X$ and $Y$ have, respectively, the Poisson and the Pascal distributions with the discrete densities $f$ and $g$ given by
$$f(x) = e^{-\lambda/\alpha}\,\frac{(\lambda/\alpha)^x}{x!}, \quad x = 0, 1, \ldots,$$
$$g(x) = \Big(\frac{\alpha}{1+\alpha}\Big)^{\lambda}\Big(\frac{1}{1+\alpha}\Big)^{x}\,\frac{\Gamma(x+\lambda)}{\Gamma(\lambda)\,x!}, \quad x = 0, 1, \ldots,$$
where $\alpha > 0$ and $\lambda > 0$. Then $X \le_{\mathrm{cx}} Y$.

(b) Let $X$ and $Y$ have, respectively, the exponential and the power distributions with the densities $f$ and $g$ given by
$$f(x) = (\gamma - 1)\delta^{-1}\exp\{-(\gamma - 1)\delta^{-1}x\}, \quad x \ge 0,$$
$$g(x) = (\gamma/\delta)(1 + x/\delta)^{-\gamma-1}, \quad x \ge 0,$$
where $\gamma > 1$ and $\delta > 0$ (so that both means are equal to $\delta/(\gamma - 1)$). Then $X \le_{\mathrm{cx}} Y$.


(c) Let $X$ and $Y$ have, respectively, the binomial and the Polya distributions with the discrete densities $f$ and $g$ given by
$$f(x) = \binom{n}{x}\Big(\frac{\alpha}{\alpha+\beta}\Big)^{x}\Big(\frac{\beta}{\alpha+\beta}\Big)^{n-x}, \quad x = 0, 1, \ldots, n,$$
$$g(x) = \binom{n}{x}\frac{\Gamma(\alpha+\beta)\,\Gamma(\alpha+x)\,\Gamma(\beta+n-x)}{\Gamma(\alpha)\,\Gamma(\beta)\,\Gamma(\alpha+\beta+n)}, \quad x = 0, 1, \ldots, n,$$
where $\alpha > 0$ and $\beta > 0$. Then $X \le_{\mathrm{cx}} Y$.

(d) Let $X$ and $Y$ have, respectively, the discrete densities $f$ and $g$ given by
$$f(x) = \Big(\frac{\beta-1}{\alpha+\beta-1}\Big)^{\lambda}\Big(\frac{\alpha}{\alpha+\beta-1}\Big)^{x}\,\frac{\Gamma(x+\lambda)}{\Gamma(\lambda)\,x!}, \quad x = 0, 1, \ldots,$$
$$g(x) = \frac{\Gamma(\alpha+\beta)\,\Gamma(\beta+\lambda)\,\Gamma(\lambda+x)\,\Gamma(\alpha+x)}{\Gamma(\alpha)\,\Gamma(\beta)\,\Gamma(\lambda)\,\Gamma(\alpha+\beta+\lambda+x)\,x!}, \quad x = 0, 1, \ldots,$$
where $\alpha > 0$, $\beta > 1$, and $\lambda > 0$. Then $X \le_{\mathrm{cx}} Y$.

Example 3.A.48. Let $X$ and $Y$ be Bernoulli random variables with parameters $p$ and $q$, respectively, where $0 < p \le q \le 1$. Then
$$\frac{X}{p} \ge_{\mathrm{cx}} \frac{Y}{q}.$$
This can be seen by easily verifying (3.A.59), where $F$ and $G$ there are the distribution functions of $Y$ and $X$, respectively.

A further illustration of the applicability of Theorem 3.A.44 is shown in the following example.

Example 3.A.49. Let $U_{(i:n)}$ be the $i$th order statistic from a sample of $n$ uniform $[0, 1]$ random variables. By examination of the density functions of the normalized variables $\frac{n+1}{i}\,U_{(i:n)}$ it is possible to verify (3.A.57) and obtain the following results (see also Example 4.B.13):
$$U_{(i+1:n)} \le_{\mathrm{Lorenz}} U_{(i:n)}, \quad\text{for all } i \le n - 1,$$
$$U_{(i:n)} \le_{\mathrm{Lorenz}} U_{(i:n+1)}, \quad\text{for all } i \le n + 1,$$
$$U_{(n-i+1:n+1)} \le_{\mathrm{Lorenz}} U_{(n-i:n)}, \quad\text{for all } i \le n,$$
and
$$U_{(n+2:2n+3)} \le_{\mathrm{Lorenz}} U_{(n+1:2n+1)}, \quad\text{for all } n.$$
The last inequality may be described as "sample medians exhibit less variability as sample size increases." Arnold and Villasenor [21], who derived the above results, give many other Lorenz order inequalities for order statistics and record values associated with various parametric families; see also Wilfling [566] and Kleiber [304].


Example 3.A.50. Let $X_{(i:n)}$ denote the ith order statistic in a sample of n independent and identically distributed random variables having the common distribution function $F$, survival function $\bar F$, and density function $f$. Recall that a function $\phi : [0,\infty) \to [0,\infty)$ is said to be regularly varying at ∞ with index ρ ∈ ℝ if
$$\lim_{x\to\infty} \frac{\phi(tx)}{\phi(x)} = t^{\rho}, \quad \text{for all } t \in [0, \infty).$$
The function φ is said to be regularly varying at −∞ with index ρ if φ(−x) is regularly varying at ∞ with index ρ. Finally, the function φ is said to be regularly varying at 0 with index ρ if $\phi(x^{-1})$ is regularly varying at ∞ with index ρ.

For F with support (−∞, ∞) Kleiber [303] showed:
(a) If $F$ is regularly varying at −∞ with index α < 0, and if f is monotone on (−∞, c] for some c, then $X_{(j:m)} \le_{\mathrm{dil}} X_{(i:n)}$ implies i ≤ j.
(b) If $\bar F$ is regularly varying at ∞ with index α < 0, and if f is monotone on [c, ∞) for some c, then $X_{(j:m)} \le_{\mathrm{dil}} X_{(i:n)}$ implies n − i ≤ m − j.

For F with support [0, ∞) Kleiber [303] also showed:
(c) If $F$ is regularly varying at 0 with index α < 0, and if f is monotone on (0, c] for some c > 0, then $X_{(j:m)} \le_{\mathrm{Lorenz}} X_{(i:n)}$ implies i ≤ j.
(d) If $\bar F$ is regularly varying at ∞ with index α < 0, and if f is monotone on [c, ∞) for some c, then $X_{(j:m)} \le_{\mathrm{Lorenz}} X_{(i:n)}$ implies n − i ≤ m − j.

The following example gives necessary and sufficient conditions for the comparison of normal random variables; it is generalized in Example 7.A.13. See related results in Examples 1.A.26 and 4.A.46.

Example 3.A.51. Let X be a normal random variable with mean $\mu_X$ and variance $\sigma_X^2$, and let Y be a normal random variable with mean $\mu_Y$ and variance $\sigma_Y^2$. Then $X \le_{\mathrm{cx}} Y$ if, and only if, $\mu_X = \mu_Y$ and $\sigma_X^2 \le \sigma_Y^2$.

Analogous to the result in Remark 1.A.18, it can be shown that the set of all distribution functions on ℝ, with any fixed finite mean, is a lattice with respect to the order $\le_{\mathrm{cx}}$.

Let X and Y be two random variables with densities f and g, respectively. Recall that supp(X) and supp(Y) denote the respective supports. We say that X is uniformly less variable than Y (denoted as $X \le_{\mathrm{uv}} Y$) if supp(X) ⊆ supp(Y) and the ratio f(x)/g(x) is unimodal over supp(Y), where the mode is a supremum, but X and Y are not ordered by the usual stochastic order (see definition in Section 1.A). The relation $\le_{\mathrm{uv}}$ is not a transitive order: it is possible to have $X \le_{\mathrm{uv}} Y$ and $Y \le_{\mathrm{uv}} Z$ but not $X \le_{\mathrm{uv}} Z$. However, it is useful as a simple condition which implies (3.A.57). The next theorem points out this relationship. The proof of the theorem is easy and is therefore omitted.


Theorem 3.A.52. Let X and Y be two random variables with densities f and g, respectively, such that supp(X) ⊆ supp(Y). Then $X \le_{\mathrm{uv}} Y$ if, and only if,
$$S^-(g - cf) \le 2 \text{ whenever } c > 0, \text{ and in case of equality the sign sequence is } +, -, +. \tag{3.A.60}$$

From (3.A.60) and (3.A.57) we see that the order $\le_{\mathrm{uv}}$ is a sufficient condition for the order $\le_{\mathrm{cx}}$ provided the underlying random variables have equal means. This is formally stated in the next theorem.

Theorem 3.A.53. Let X and Y be two random variables with absolutely continuous distributions and equal means such that supp(X) ⊆ supp(Y). If $X \le_{\mathrm{uv}} Y$, then $X \le_{\mathrm{cx}} Y$.

A relation that is even stronger than $\le_{\mathrm{uv}}$ is defined next. Its usefulness is that it gives a simple sufficient condition for the order $\le_{\mathrm{uv}}$ and therefore for the order $\le_{\mathrm{cx}}$. Again, let X and Y be two random variables with densities f and g, respectively. We say that X is logconcave relative to Y (denoted by $X \le_{\mathrm{lc}} Y$) if f/g is logconcave. The relation $\le_{\mathrm{lc}}$, unlike the relation $\le_{\mathrm{uv}}$, is transitive, and it implies the relation $\le_{\mathrm{uv}}$ as the next result shows. Again, the proof is trivial and hence it is omitted.

Theorem 3.A.54. Let X and Y be two random variables with densities f and g, respectively, such that supp(X) ⊆ supp(Y) and $S^-(g - f) = 2$. Then
$$X \le_{\mathrm{lc}} Y \implies X \le_{\mathrm{uv}} Y.$$

3.A.4 Some properties in reliability theory

Recall from page 1 the definitions of NBUE and NWUE random variables. Such random variables are of interest in reliability theory. The next result shows that NBUE [NWUE] random variables are smaller [larger] than exponential random variables with the same means with respect to the convex order. We denote by Exp(µ) an exponential random variable with mean µ.

Theorem 3.A.55. If X is an NBUE [NWUE] random variable with mean µ, then
$$X \le_{\mathrm{cx}} [\ge_{\mathrm{cx}}]\ \mathrm{Exp}(\mu), \tag{3.A.61}$$
or, equivalently,
$$X \ge_{\mathrm{cv}} [\le_{\mathrm{cv}}]\ \mathrm{Exp}(\mu). \tag{3.A.62}$$

The proof consists of showing that if $\bar F$ is the survival function of X, then
$$\int_x^{\infty} \bar F(u)\,du \le [\ge]\ \mu e^{-x/\mu}, \quad x \ge 0,$$
and the result then follows from (3.A.7). We omit the details.
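The integrated-survival inequality used in the proof of Theorem 3.A.55 can be illustrated numerically. The sketch below assumes, as an illustrative choice only, that X has an Erlang distribution with two stages and mean µ (an NBUE distribution), and it approximates the integral by a midpoint rule; the helper names, the truncation point, and the tolerance are hypothetical.

```python
import math

def erlang2_survival(u, mu):
    """Survival function of an Erlang(2) variable with mean mu (rate 2/mu)."""
    rate = 2.0 / mu
    return (1.0 + rate * u) * math.exp(-rate * u)

def integrated_survival(survival, x, mu, upper_factor=40.0, steps=20000):
    """Midpoint-rule approximation of the integral of survival(u, mu) over
    [x, x + upper_factor * mu]; the neglected tail is assumed negligible."""
    width = upper_factor * mu / steps
    return sum(survival(x + (k + 0.5) * width, mu) for k in range(steps)) * width

mu = 2.0                                    # illustrative mean
xs = [0.25 * k for k in range(41)]          # grid on [0, 10]
lhs = [integrated_survival(erlang2_survival, x, mu) for x in xs]
rhs = [mu * math.exp(-x / mu) for x in xs]  # bound for Exp(mu)
hnbue_ok = all(l <= r + 1e-6 for l, r in zip(lhs, rhs))
print(hnbue_ok)
```

For this particular distribution the integral has the closed form $(\mu + x)e^{-2x/\mu}$, so the inequality reduces to $1 + u \le e^{u}$ with $u = x/\mu$.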


Random variables that satisfy (3.A.61) are called harmonic new better [worse] than used in expectation (HNBUE [HNWUE]). Sometimes such random variables are defined by $X \le_{\mathrm{hmrl}} [\ge_{\mathrm{hmrl}}]\ \mathrm{Exp}(\mu)$ rather than by (3.A.61), but by (2.B.7) these two definitions are the same.

Recall from page 1 the definition of IMRL and DMRL random variables. The following result characterizes such random variables by means of the dilation order defined in (3.A.32). Other characterizations of DMRL and IMRL random variables, by means of other stochastic orders, can be found in Theorems 2.A.23, 2.B.17, 3.C.13, and 4.A.51.

Theorem 3.A.56. The nonnegative random variable X is DMRL [IMRL] if, and only if,
$$[X \mid X > t] \ge_{\mathrm{dil}} [\le_{\mathrm{dil}}]\ [X \mid X > t'] \quad \text{whenever } t \le t'.$$

Two related results are stated next without proofs.

Theorem 3.A.57. Let X and Y be two random variables that have a common support of the form (0, ∞), and that have finite means. If X and/or Y is IMRL, and if $X \le_{\mathrm{mrl}} Y$, then $X \le_{\mathrm{dil}} Y$.

Theorem 3.A.58. Let X and Y be two random variables that have a common support of the form (0, ∞), and that have finite means. If X is NBUE and Y is NWUE, then
$$X \le_{\mathrm{mrl}} Y \iff X \le_{\mathrm{dil}} Y \iff EX \le EY.$$

3.A.5 The m-convex orders

Let S be a subinterval of the real line. The subinterval S may be open, half-open, or closed, finite or infinite. Fix a positive integer m, and consider the class $\mathcal{M}^{S}_{m\text{-cx}}$ of all functions $\phi : S \to \mathbb{R}$ whose mth derivative $\phi^{(m)}$ exists and satisfies $\phi^{(m)}(x) \ge 0$ for all x ∈ S, or which are limits of sequences of functions whose mth derivative is continuous and nonnegative on S. Let X and Y be two random variables that take on values in S such that
$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all functions } \phi \in \mathcal{M}^{S}_{m\text{-cx}}, \tag{3.A.63}$$
provided the expectations exist. Then X is said to be smaller than Y in the m-convex order (denoted as $X \le^{S}_{m\text{-cx}} Y$). For random variables X and Y that take on values in N++ the definition of the m-cx order is similar; it can be found in Denuit and Lefèvre [146].

In a similar manner one can define the m-concave order and observe that
$$X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} X \le^{S}_{m\text{-cv}} Y & \text{when } m \text{ is odd}, \\ Y \le^{S}_{m\text{-cv}} X & \text{when } m \text{ is even}. \end{cases}$$
It can be shown that


$$X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \ldots, m-1, \text{ and} \\ E(X-t)_+^{m-1} \le E(Y-t)_+^{m-1} \text{ for all } t \in S, \end{cases} \tag{3.A.64}$$
and also that
$$X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \ldots, m-1, \text{ and} \\ (-1)^m\big[E(t-Y)_+^{m-1} - E(t-X)_+^{m-1}\big] \ge 0 \text{ for all } t \in S. \end{cases} \tag{3.A.65}$$

Note that the order $\le^{S}_{1\text{-cx}}$ is just the order $\le_{\mathrm{st}}$, and that the order $\le^{S}_{2\text{-cx}}$ is the order $\le_{\mathrm{cx}}$. Menezes, Geiss, and Tressler [390] gave the following interpretation to the order $\le^{S}_{3\text{-cx}}$: if $X \le^{S}_{3\text{-cx}} Y$, then, of course, X and Y have the same mean and variance, but X then has smaller rightside risk than Y.

Let F and G be the distribution functions of X and Y, respectively. Denote $F^{[0]}(t) = F(t)$, and, for k ≥ 1, denote $F^{[k]}(t) = \int_{-\infty}^{t} F^{[k-1]}(x)\,dx$. Similarly, denote $\bar F^{[0]}(t) = \bar F(t)$, and, for k ≥ 1, denote $\bar F^{[k]}(t) = \int_{t}^{\infty} \bar F^{[k-1]}(x)\,dx$. Define $G^{[k]}$ and $\bar G^{[k]}$ in a similar manner. Using the identities
$$F_Y^{[m-1]}(t) - F_X^{[m-1]}(t) = \frac{E(t-Y)_+^{m-1} - E(t-X)_+^{m-1}}{(m-1)!}$$
and
$$\bar F_Y^{[m-1]}(t) - \bar F_X^{[m-1]}(t) = \frac{E(Y-t)_+^{m-1} - E(X-t)_+^{m-1}}{(m-1)!}$$
(which are easily proven by induction and Fubini’s Theorem) we obtain from (3.A.64) and (3.A.65) that
$$X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \ldots, m-1, \text{ and} \\ (-1)^m\big[F_Y^{[m-1]}(t) - F_X^{[m-1]}(t)\big] \ge 0 \text{ for all } t \in \mathbb{R}, \end{cases} \tag{3.A.66}$$
and also that
$$X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \ldots, m-1, \text{ and} \\ \bar F_Y^{[m-1]}(t) - \bar F_X^{[m-1]}(t) \ge 0 \text{ for all } t \in \mathbb{R}. \end{cases} \tag{3.A.67}$$

Using the identities
$$m! \int_{t}^{\infty} \big[\bar F_Y^{[m-1]}(x) - \bar F_X^{[m-1]}(x)\big]\,dx = E(Y-t)_+^{m} - E(X-t)_+^{m}$$
and
$$m! \int_{-\infty}^{t} \big[F_Y^{[m-1]}(x) - F_X^{[m-1]}(x)\big]\,dx = E(t-Y)_+^{m} - E(t-X)_+^{m},$$
we obtain from (3.A.66) and (3.A.67) that
$$X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \ldots, m-1, \text{ and} \\ E(Y-t)_+^{m} - E(X-t)_+^{m} \text{ is decreasing in } t \in \mathbb{R}, \end{cases}$$
and also that
$$X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \ldots, m-1, \text{ and} \\ (-1)^m\big[E(t-Y)_+^{m} - E(t-X)_+^{m}\big] \text{ is increasing in } t \in \mathbb{R}. \end{cases}$$

Fishburn [203] has reported some attempts at obtaining an analog of Theorem 3.A.4 for the 3-cx order.

From (3.A.63) it is seen that if $X \le^{S}_{m\text{-cx}} Y$, then
$$EX^k \le EY^k \quad \text{for } k \ge m \text{ such that } k - m \text{ is even}.$$
If, moreover, X and Y are nonnegative, then
$$EX^k \le EY^k \quad \text{for } k \ge m.$$

Motivated by Theorem 3.A.42 (see also Theorems 1.A.8, 4.A.48, 4.A.69, 5.A.15, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16) we have the following result.

Theorem 3.A.59. Let X and Y be two random variables that take on values in S. If $X \le^{S}_{m\text{-cx}} Y$, and if $E[X^m] = E[Y^m]$, then $X =_{\mathrm{st}} Y$.

Theorem 3.A.59 can be strengthened to the following result in a way that is analogous to the way in which Theorem 3.A.43 strengthened Theorem 3.A.42; we do not detail the proof here.

Theorem 3.A.60. Let X and Y be two random variables that take on values in S. If $X \le^{S}_{m\text{-cx}} Y$ and if $E[\phi(X)] = E[\phi(Y)]$ for some $\phi \in \mathcal{M}^{S}_{m\text{-cx}}$ which satisfies $\phi^{(m)}(x) > 0$ for all x ∈ S, then $X =_{\mathrm{st}} Y$.

Note that Theorems 1.A.8, 3.A.43, and 3.A.59 are all special cases of Theorem 3.A.60.

A generalization of (3.A.12) is given in the next theorem. The notations $l_X$, $u_X$, $l_Y$, and $u_Y$ are described before (3.A.12).

Theorem 3.A.61. Let X and Y be two random variables that take on values in S. If $X \le^{S}_{m\text{-cx}} Y$, then $u_X \le u_Y$. Also, if m is even, then $l_X \ge l_Y$, and if m is odd, then $l_X \le l_Y$.

Some closure properties of the order $\le^{S}_{m\text{-cx}}$ are given in the next theorem.

Theorem 3.A.62. (a) Let X and Y be two random variables that take on values in S. Then
$$X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} -X \le^{-S}_{m\text{-cx}} -Y & \text{when } m \text{ is even}, \\ -Y \le^{-S}_{m\text{-cx}} -X & \text{when } m \text{ is odd}. \end{cases}$$


(b) Let X, Y, and Θ be random variables such that $[X \mid \Theta = \theta] \le^{S}_{m\text{-cx}} [Y \mid \Theta = \theta]$ for all θ in the support of Θ. Then $X \le^{S}_{m\text{-cx}} Y$. That is, the m-convex order is closed under mixtures.
(c) If $X \le^{S}_{m\text{-cx}} Y$, then $cX \le^{cS}_{m\text{-cx}} cY$ whenever c > 0, where $cS = \{x \in \mathbb{R} : x/c \in S\}$.
(d) If $X \le^{S}_{m\text{-cx}} Y$, then $cX \le^{cS}_{m\text{-cx}} cY$ whenever c < 0 and m is even, and $cY \le^{cS}_{m\text{-cx}} cX$ whenever c < 0 and m is odd.
(e) If $X \le^{S}_{m\text{-cx}} Y$, then $X + d \le^{S+d}_{m\text{-cx}} Y + d$ for all d ∈ ℝ, where $S + d = \{x \in \mathbb{R} : x - d \in S\}$; that is, the m-convex order is shift-invariant.
(f) Let $\{X_j, j = 1, 2, \ldots\}$ and $\{Y_j, j = 1, 2, \ldots\}$ be two sequences of random variables that take on values in S, such that $X_j \to_{\mathrm{st}} X$ and $Y_j \to_{\mathrm{st}} Y$ as j → ∞. Assume that $E(X)_+^{m-1}$ and $E(Y)_+^{m-1}$ are finite and that $E(X_j)_+^{m-1} \to E(X)_+^{m-1}$ and that $E(Y_j)_+^{m-1} \to E(Y)_+^{m-1}$ as j → ∞. If $X_i \le^{S}_{m\text{-cx}} Y_i$ for all integers i, then $X \le^{S}_{m\text{-cx}} Y$. That is, the m-convex order is closed under limits.
(g) Let $X_1, X_2, \ldots, X_n$ be a set of independent random variables and let $Y_1, Y_2, \ldots, Y_n$ be another set of independent random variables, all taking on values in S. If $X_i \le^{S}_{m\text{-cx}} Y_i$ for i = 1, 2, . . . , n, then
$$\sum_{j=1}^{n} X_j \le^{R}_{m\text{-cx}} \sum_{j=1}^{n} Y_j,$$
where R denotes the union of the supports of the distribution functions of the two sums. That is, the m-convex order is closed under convolutions.
(h) Let $X_1, X_2, \ldots$ be a set of independent random variables and let $Y_1, Y_2, \ldots$ be another set of independent random variables, all taking on values in S. If $X_i \le^{S}_{m\text{-cx}} Y_i$ for i = 1, 2, . . ., then, for any positive integer-valued random variable N which is independent of the $X_i$’s and of the $Y_j$’s, one has
$$\sum_{j=1}^{N} X_j \le^{\tilde R}_{m\text{-cx}} \sum_{j=1}^{N} Y_j,$$
where $\tilde R$ denotes the union of the supports of the distribution functions of the two compound sums.

Theorem 3.A.62(h) can be extended as follows.

Theorem 3.A.63. Let $X_1, X_2, \ldots$ be a set of independent random variables and let $Y_1, Y_2, \ldots$ be another set of independent random variables, all taking on values in S. Let $N_1$ be an integer-valued random variable that is independent of the $X_i$’s, and let $N_2$ be an integer-valued random variable that is independent of the $Y_i$’s, both taking on values in Q. If $X_i \le^{S}_{m\text{-cx}} Y_i$ for i = 1, 2, . . ., and if $N_1 \le^{Q}_{m\text{-cx}} N_2$, then
$$\sum_{j=1}^{N_1} X_j \le^{\tilde R}_{m\text{-cx}} \sum_{j=1}^{N_2} Y_j,$$
where $\tilde R$ denotes the union of the supports of the distribution functions of the two compound sums.

Theorem 3.A.19 can be extended as follows.

Theorem 3.A.64. Let $X_1$ and $X_2$ ($Y_1$ and $Y_2$) be two independent copies of X (Y). If $X \le^{S}_{2m\text{-cx}} Y$, then
$$X_1 - X_2 \le^{R}_{2m\text{-cx}} Y_1 - Y_2,$$
where R denotes the union of the supports of the distribution functions of the two differences.

The proof of Theorem 3.A.64 is similar to the proof of Theorem 3.A.19 (using Theorem 3.A.62(a) and (g)).

Recall from (1.A.20) that for nonnegative random variables X and Y with finite means, we denote by $A_X$ and $A_Y$ the corresponding asymptotic equilibrium ages.

Theorem 3.A.65. Let X and Y be two nonnegative random variables. Then, for m ≥ 2 we have
$$X \le^{[0,\infty)}_{m\text{-cx}} Y \iff A_X \le^{[0,\infty)}_{(m-1)\text{-cx}} A_Y.$$
In particular, $X \le_{\mathrm{cx}} Y \iff A_X \le_{\mathrm{st}} A_Y$.

We now describe a generalization of Theorem 3.A.44. Let $B_m(S; \mu_1, \mu_2, \ldots, \mu_{m-1})$ denote the class of all the random variables X whose distribution functions have support in S and which have the first m − 1 moments $EX^k = \mu_k$, k = 1, 2, . . . , m − 1.

Theorem 3.A.66. Let X and Y be two random variables in $B_m(S; \mu_1, \mu_2, \ldots, \mu_{m-1})$ with distribution functions F and G, respectively, and with density functions f and g, respectively.
(a) If $S^-(F - G) \le m - 1$ and if the last sign of F − G is a +, then $X \le^{S}_{m\text{-cx}} Y$.
(b) If $S^-(f - g) \le m$ and if the last sign of g − f is a +, then $X \le^{S}_{m\text{-cx}} Y$.

The following example describes typical applications of Theorem 3.A.66.

Example 3.A.67. Let X have the Gamma density given by
$$f_X(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,x^{\alpha-1}e^{-\beta x}, \quad x > 0,$$
where α > 0 and β > 0 are constants, and let Y have the inverse Gaussian density given by
$$f_Y(x) = \frac{\alpha x^{-3/2}}{\sqrt{2\pi\beta}}\exp\Big\{-\frac{(\alpha - \beta x)^2}{2\beta x}\Big\}, \quad x > 0,$$
where also here α > 0 and β > 0 are constants. Note that X and Y have the same mean α/β and the same second moment $\alpha(\alpha+1)/\beta^2$. We claim that

$X \le^{[0,\infty)}_{3\text{-cx}} Y$. In order to see it, first note that without loss of generality we can take the means to be equal to 1, that is, β = α. Now, a straightforward computation yields
$$\log\frac{f_X(x)}{f_Y(x)} = C + \Big(\alpha + \frac{1}{2}\Big)\log x - \frac{\alpha x}{2} + \frac{\alpha}{2x}, \quad x > 0,$$
where C is some constant. The first derivative of the above expression is a quadratic form in 1/x, which cannot have more than two zeroes, so the expression itself has no more than three sign changes. In addition, the above expression tends to −∞ as x → ∞. The stated result now follows from Theorem 3.A.66(b).

Let Z have the lognormal density given by
$$f_Z(x) = \frac{1}{x\tau\sqrt{2\pi}}\exp\Big\{-\frac{(\log x - \nu)^2}{2\tau^2}\Big\}, \quad x > 0,$$
where τ > 0 and ν > 0 are constants. With the choice $\tau^2 = \log\big(1 + \frac{1}{\alpha}\big)$ and $\nu = \frac{1}{2}\log\frac{\alpha^3}{(\alpha+1)\beta^2}$ we have that X and Z have the same mean α/β and the same second moment $\alpha(\alpha+1)/\beta^2$. We now claim that $X \le^{[0,\infty)}_{3\text{-cx}} Z$. In order to see it, again note that without loss of generality we can take the means to be equal to 1, that is, β = α. Now, a straightforward computation yields
$$\log\frac{f_X(x)}{f_Z(x)} = C + \Big(\alpha + \frac{1}{2}\Big)\log x - \alpha x + \frac{\log^2 x}{2\tau^2}, \quad x > 0,$$
where C is some constant. Substituting u = log x, the above expression is seen to be the difference of a quadratic form in u and an exponential function, which cannot have more than three sign changes. In addition, the above expression tends to −∞ as x → ∞. The stated result again follows from Theorem 3.A.66(b).

Theorem 3.A.24 can be viewed as a result that gives the “minimal” and the “maximal” random variables with respect to the order $\le^{S}_{2\text{-cx}}$ when the (bounded) support and the mean are given. The following theorem gives the extrema with respect to the order $\le^{S}_{3\text{-cx}}$ when the first two moments are given. Here we take S = [a, b] for some finite a and b.

Theorem 3.A.68. Let $X \in B_3([a, b]; \mu_1, \mu_2)$. Consider the random variables $X^{(3)}_{\min}$ and $X^{(3)}_{\max}$ in $B_3([a, b]; \mu_1, \mu_2)$ defined by
$$X^{(3)}_{\min} = \begin{cases} a & \text{with probability } \dfrac{\mu_2 - \mu_1^2}{(a-\mu_1)^2 + \mu_2 - \mu_1^2}, \\[2ex] \mu_1 + \dfrac{\mu_2 - \mu_1^2}{\mu_1 - a} & \text{with probability } \dfrac{(a-\mu_1)^2}{(a-\mu_1)^2 + \mu_2 - \mu_1^2}, \end{cases}$$
and
$$X^{(3)}_{\max} = \begin{cases} \mu_1 - \dfrac{\mu_2 - \mu_1^2}{b - \mu_1} & \text{with probability } \dfrac{(b-\mu_1)^2}{(b-\mu_1)^2 + \mu_2 - \mu_1^2}, \\[2ex] b & \text{with probability } \dfrac{\mu_2 - \mu_1^2}{(b-\mu_1)^2 + \mu_2 - \mu_1^2}. \end{cases}$$
Then $X^{(3)}_{\min} \le^{[a,b]}_{3\text{-cx}} X \le^{[a,b]}_{3\text{-cx}} X^{(3)}_{\max}$.

An effective method for deriving the support points and the associated probabilities of the stochastic extrema in general (that is, for m’s other than 3) will be described next. For the purpose of somewhat simplifying the expressions below we take a = 0. Thus we describe how to obtain the support points and the associated probabilities of $X^{(m)}_{\min}$ and $X^{(m)}_{\max}$ in $B_m([0, b]; \mu_1, \mu_2, \ldots, \mu_{m-1})$.

If m is even, m = 2k, say, then the support of $X^{(2k)}_{\min}$ in $B_{2k}([0, b]; \mu_1, \mu_2, \ldots, \mu_{2k-1})$ consists of k interior points $x_1, x_2, \ldots, x_k$, $0 < x_1 < x_2 < \cdots < x_k < b$, which are the k distinct roots of the equation (denoting $\mu_0 = 1$)
$$\begin{vmatrix} 1 & x & x^2 & \cdots & x^k \\ \mu_0 & \mu_1 & \mu_2 & \cdots & \mu_k \\ \mu_1 & \mu_2 & \mu_3 & \cdots & \mu_{k+1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mu_{k-1} & \mu_k & \mu_{k+1} & \cdots & \mu_{2k-1} \end{vmatrix} = 0;$$
the corresponding probabilities $p_1, p_2, \ldots, p_k$ are now found by solving
$$p_1 x_1^j + p_2 x_2^j + \cdots + p_k x_k^j = \mu_j, \quad j = 0, 1, \ldots, k-1.$$
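As an illustration of the even case with m = 4 (k = 2), the determinant equation reduces to a quadratic in x, and the two probabilities follow from the linear equations for j = 0 and j = 1. The sketch below is a minimal numerical version under that reduction; the function name and the choice of input moments (those of the uniform distribution on [0, 1]) are illustrative assumptions.

```python
import math

def xmin4_support(mu1, mu2, mu3):
    """Two support points and masses matching mu1, mu2, mu3 (case m = 4, k = 2)."""
    # Expanding the 3x3 determinant along its first row (with mu0 = 1) gives
    # (mu2 - mu1^2) x^2 - (mu3 - mu1*mu2) x + (mu1*mu3 - mu2^2) = 0.
    qa = mu2 - mu1 ** 2
    qb = -(mu3 - mu1 * mu2)
    qc = mu1 * mu3 - mu2 ** 2
    root = math.sqrt(qb * qb - 4 * qa * qc)
    x1, x2 = (-qb - root) / (2 * qa), (-qb + root) / (2 * qa)
    # Masses from the equations for j = 0 and j = 1:
    # p1 + p2 = 1 and p1*x1 + p2*x2 = mu1.
    p2 = (mu1 - x1) / (x2 - x1)
    return [(x1, 1.0 - p2), (x2, p2)]

# Moments of the uniform distribution on [0, 1] (illustrative input).
atoms = xmin4_support(1 / 2, 1 / 3, 1 / 4)
moments = [sum(p * x ** j for x, p in atoms) for j in range(4)]
print(atoms)
print(moments)
```

Although only the equations for j = 0 and j = 1 are solved, the resulting two-point distribution also reproduces the second and third moments, as the printed list of moments shows.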

The support of $X^{(2k)}_{\max}$ in $B_{2k}([0, b]; \mu_1, \mu_2, \ldots, \mu_{2k-1})$ consists of the points 0, b, and k − 1 interior points $x_2, x_3, \ldots, x_k$, $0 < x_2 < x_3 < \cdots < x_k < b$, which are the k − 1 distinct roots of the equation
$$\begin{vmatrix} 1 & x & x^2 & \cdots & x^{k-1} \\ \mu_2 - b\mu_1 & \mu_3 - b\mu_2 & \mu_4 - b\mu_3 & \cdots & \mu_{k+1} - b\mu_k \\ \mu_3 - b\mu_2 & \mu_4 - b\mu_3 & \mu_5 - b\mu_4 & \cdots & \mu_{k+2} - b\mu_{k+1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mu_k - b\mu_{k-1} & \mu_{k+1} - b\mu_k & \mu_{k+2} - b\mu_{k+1} & \cdots & \mu_{2k-1} - b\mu_{2k-2} \end{vmatrix} = 0;$$
the corresponding probabilities $p_1, p_2, \ldots, p_{k+1}$ are now found by solving the Vandermonde system
$$p_1 + p_2 + \cdots + p_{k+1} = 1,$$
$$p_2 x_2^j + p_3 x_3^j + \cdots + p_k x_k^j + p_{k+1} b^j = \mu_j, \quad j = 1, 2, \ldots, k.$$

When m is odd, m = 2k + 1, say, then the support of $X^{(2k+1)}_{\min}$ in $B_{2k+1}([0, b]; \mu_1, \mu_2, \ldots, \mu_{2k})$ consists of 0 and k interior points $x_2, x_3, \ldots, x_{k+1}$, $0 < x_2 < x_3 < \cdots < x_{k+1} < b$, which are the k distinct roots of the equation
$$\begin{vmatrix} 1 & x & x^2 & \cdots & x^k \\ \mu_1 & \mu_2 & \mu_3 & \cdots & \mu_{k+1} \\ \mu_2 & \mu_3 & \mu_4 & \cdots & \mu_{k+2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mu_k & \mu_{k+1} & \mu_{k+2} & \cdots & \mu_{2k} \end{vmatrix} = 0;$$
the corresponding probabilities $p_1, p_2, \ldots, p_{k+1}$ are now found by solving
$$p_1 + p_2 + \cdots + p_{k+1} = 1,$$
$$p_2 x_2^j + p_3 x_3^j + \cdots + p_{k+1} x_{k+1}^j = \mu_j, \quad j = 1, 2, \ldots, k.$$

The support of $X^{(2k+1)}_{\max}$ in $B_{2k+1}([0, b]; \mu_1, \mu_2, \ldots, \mu_{2k})$ consists of the point b and k interior points $x_1, x_2, \ldots, x_k$, $0 < x_1 < x_2 < \cdots < x_k < b$, which are the k distinct roots of the equation
$$\begin{vmatrix} 1 & x & x^2 & \cdots & x^k \\ \mu_1 - b & \mu_2 - b\mu_1 & \mu_3 - b\mu_2 & \cdots & \mu_{k+1} - b\mu_k \\ \mu_2 - b\mu_1 & \mu_3 - b\mu_2 & \mu_4 - b\mu_3 & \cdots & \mu_{k+2} - b\mu_{k+1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mu_k - b\mu_{k-1} & \mu_{k+1} - b\mu_k & \mu_{k+2} - b\mu_{k+1} & \cdots & \mu_{2k} - b\mu_{2k-1} \end{vmatrix} = 0;$$
the corresponding probabilities $p_1, p_2, \ldots, p_{k+1}$ are now found by solving the Vandermonde system
$$p_1 x_1^j + p_2 x_2^j + \cdots + p_k x_k^j + p_{k+1} b^j = \mu_j, \quad j = 0, 1, \ldots, k.$$

Explicit descriptions for the distribution functions of $X^{(m)}_{\min}$ and $X^{(m)}_{\max}$, for values of m up to 5, are given in Tables 3.A.1 and 3.A.2, where in Table 3.A.2 we use the notation
$$\Delta = \big[(\mu_1 - b)(\mu_4 - b\mu_3) - (\mu_2 - b\mu_1)(\mu_3 - b\mu_2)\big]^2 - 4\big[(\mu_1 - b)(\mu_3 - b\mu_2) - (\mu_2 - b\mu_1)^2\big] \times \big[(\mu_2 - b\mu_1)(\mu_4 - b\mu_3) - (\mu_3 - b\mu_2)^2\big].$$

Denuit, De Vylder, and Lefèvre [142] obtained also the extrema with respect to the order $\le^{S}_{m\text{-cx}}$ when not only the first m − 1 moments and the support are given, but also when the density function of X is known to be unimodal with a known mode. Tables that are similar to Tables 3.A.1 and 3.A.2, but when the mode is known, are available in Denuit, Lefèvre, and Shaked [153, 154].
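The two-point extrema for m = 3 (Theorem 3.A.68) are simple enough to compute exactly. The sketch below evaluates them with rational arithmetic and compares them, through the criterion (3.A.64), with the uniform distribution on [0, 1]; the comparison variable, the grid, and all function names are illustrative assumptions rather than part of the theorem.

```python
from fractions import Fraction

def three_cx_extrema(a, b, mu1, mu2):
    """Atoms (value, probability) of the 3-cx minimum and maximum on [a, b]."""
    var = mu2 - mu1 ** 2
    lo_den = (a - mu1) ** 2 + var
    hi_den = (b - mu1) ** 2 + var
    x_min = [(a, var / lo_den), (mu1 + var / (mu1 - a), (a - mu1) ** 2 / lo_den)]
    x_max = [(mu1 - var / (b - mu1), (b - mu1) ** 2 / hi_den), (b, var / hi_den)]
    return x_min, x_max

def moment(atoms, k):
    """k-th moment of a discrete distribution given by its atoms."""
    return sum(p * x ** k for x, p in atoms)

def truncated_power(atoms, t, power):
    """E (X - t)_+^power for a discrete distribution given by its atoms."""
    return sum(p * max(x - t, 0) ** power for x, p in atoms)

def uniform_truncated_square(t):
    """E (U - t)_+^2 for U uniform on [0, 1] and 0 <= t <= 1."""
    return (1 - t) ** 3 / 3

a, b = Fraction(0), Fraction(1)
mu1, mu2 = Fraction(1, 2), Fraction(1, 3)   # first two moments of U
x_min, x_max = three_cx_extrema(a, b, mu1, mu2)

grid = [Fraction(k, 20) for k in range(21)]
sandwich_ok = all(
    truncated_power(x_min, t, 2) <= uniform_truncated_square(t) <= truncated_power(x_max, t, 2)
    for t in grid
)
print(x_min, x_max, sandwich_ok)
```

Both extremal distributions reproduce the prescribed first two moments, so by (3.A.64) with m = 3 only the ordering of $E(\cdot - t)_+^2$ remains to be checked; the grid check is a sanity test and not a proof.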

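Table 3.A.1. Probability distribution of $X^{(m)}_{\min} \in B_m([0, b]; \mu_1, \mu_2, \ldots, \mu_{m-1})$
$$\begin{array}{c|l|l}
m & \text{Support point} & \text{Probability mass} \\ \hline
1 & 0 & 1 \\ \hline
2 & \mu_1 & 1 \\ \hline
3 & 0 & \dfrac{\mu_2 - \mu_1^2}{\mu_2} \\
  & \dfrac{\mu_2}{\mu_1} & \dfrac{\mu_1^2}{\mu_2} \\ \hline
4 & r_+ = \dfrac{\mu_3 - \mu_1\mu_2 + \sqrt{(\mu_3 - \mu_1\mu_2)^2 - 4(\mu_2 - \mu_1^2)(\mu_1\mu_3 - \mu_2^2)}}{2(\mu_2 - \mu_1^2)} & \dfrac{\mu_1 - r_-}{r_+ - r_-} \\
  & r_- = \dfrac{\mu_3 - \mu_1\mu_2 - \sqrt{(\mu_3 - \mu_1\mu_2)^2 - 4(\mu_2 - \mu_1^2)(\mu_1\mu_3 - \mu_2^2)}}{2(\mu_2 - \mu_1^2)} & 1 - \dfrac{\mu_1 - r_-}{r_+ - r_-} \\ \hline
5 & 0 & 1 - p_+ - p_- \\
  & t_+ = \dfrac{\mu_1\mu_4 - \mu_2\mu_3 + \sqrt{(\mu_1\mu_4 - \mu_2\mu_3)^2 - 4(\mu_1\mu_3 - \mu_2^2)(\mu_2\mu_4 - \mu_3^2)}}{2(\mu_1\mu_3 - \mu_2^2)} & p_+ = \dfrac{\mu_2 - t_-\mu_1}{t_+(t_+ - t_-)} \\
  & t_- = \dfrac{\mu_1\mu_4 - \mu_2\mu_3 - \sqrt{(\mu_1\mu_4 - \mu_2\mu_3)^2 - 4(\mu_1\mu_3 - \mu_2^2)(\mu_2\mu_4 - \mu_3^2)}}{2(\mu_1\mu_3 - \mu_2^2)} & p_- = \dfrac{\mu_2 - t_+\mu_1}{t_-(t_- - t_+)}
\end{array}$$

Table 3.A.2. Probability distribution of $X^{(m)}_{\max} \in B_m([0, b]; \mu_1, \mu_2, \ldots, \mu_{m-1})$
$$\begin{array}{c|l|l}
m & \text{Support point} & \text{Probability mass} \\ \hline
1 & b & 1 \\ \hline
2 & 0 & \dfrac{b - \mu_1}{b} \\
  & b & \dfrac{\mu_1}{b} \\ \hline
3 & \dfrac{b\mu_1 - \mu_2}{b - \mu_1} & \dfrac{(b - \mu_1)^2}{(b - \mu_1)^2 + \mu_2 - \mu_1^2} \\
  & b & \dfrac{\mu_2 - \mu_1^2}{(b - \mu_1)^2 + \mu_2 - \mu_1^2} \\ \hline
4 & 0 & 1 - p_1 - p_2 \\
  & \dfrac{\mu_3 - b\mu_2}{\mu_2 - b\mu_1} & p_1 = \dfrac{(\mu_2 - b\mu_1)^3}{(\mu_3 - b\mu_2)(\mu_3 - 2b\mu_2 + b^2\mu_1)} \\
  & b & p_2 = \dfrac{\mu_1\mu_3 - \mu_2^2}{b(\mu_3 - 2b\mu_2 + b^2\mu_1)} \\ \hline
5 & z_+ = \dfrac{(\mu_1 - b)(\mu_4 - b\mu_3) - (\mu_2 - b\mu_1)(\mu_3 - b\mu_2) + \sqrt{\Delta}}{2\big((\mu_1 - b)(\mu_3 - b\mu_2) - (\mu_2 - b\mu_1)^2\big)} & q_+ = \dfrac{\mu_2 - (b + z_-)\mu_1 + bz_-}{(z_+ - z_-)(z_+ - b)} \\
  & z_- = \dfrac{(\mu_1 - b)(\mu_4 - b\mu_3) - (\mu_2 - b\mu_1)(\mu_3 - b\mu_2) - \sqrt{\Delta}}{2\big((\mu_1 - b)(\mu_3 - b\mu_2) - (\mu_2 - b\mu_1)^2\big)} & q_- = \dfrac{\mu_2 - (b + z_+)\mu_1 + bz_+}{(z_- - z_+)(z_- - b)} \\
  & b & 1 - q_+ - q_-
\end{array}$$

3.B The Dispersive Order

3.B.1 Definition and equivalent conditions

Let X and Y be two random variables with distribution functions F and G, respectively. Let $F^{-1}$ and $G^{-1}$ be the right continuous inverses of F and G, respectively, and assume that
$$F^{-1}(\beta) - F^{-1}(\alpha) \le G^{-1}(\beta) - G^{-1}(\alpha) \quad \text{whenever } 0 < \alpha \le \beta < 1. \tag{3.B.1}$$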

Then X is said to be smaller than Y in the dispersive order (denoted as $X \le_{\mathrm{disp}} Y$). It is conceptually clear that the order $\le_{\mathrm{disp}}$ indeed corresponds to a comparison of X and Y by variability because it requires the difference between any two quantiles of X to be smaller than the difference between the corresponding quantiles of Y.

It is clear from (3.B.1) that the order $\le_{\mathrm{disp}}$ is location-free. That is,
$$X \le_{\mathrm{disp}} Y \iff X + c \le_{\mathrm{disp}} Y \quad \text{for any real } c. \tag{3.B.2}$$

For a fixed α, one can find a c such that the inverse of the distribution of X + c, which is $F^{-1}(\cdot) + c$, satisfies $F^{-1}(\alpha) + c = G^{-1}(\alpha) = x_0$, say. It follows then from (3.B.2) that F(x − c) ≥ G(x) for all $x \ge x_0$. Similarly, it can be seen that F(x − c) ≤ G(x) for all $x \le x_0$. This is true for every α (c and $x_0$ are determined by α). By varying α one can obtain any desired c of the form $G^{-1}(\alpha) - F^{-1}(\alpha)$. In fact, it can be shown that $X \le_{\mathrm{disp}} Y$ if, and only if,
$$S^-(F(\cdot - c) - G(\cdot)) \le 1 \text{ for all } c, \text{ with the sign sequence being } -, + \text{ in the case of equality.} \tag{3.B.3}$$

It is not hard to prove that condition (3.B.3) is equivalent to the following condition:
$$G(G^{-1}(\alpha) + c) \le F(F^{-1}(\alpha) + c) \quad \text{for all } \alpha \in (0, 1) \text{ and } c > 0, \tag{3.B.4}$$
or, equivalently,
$$G(G^{-1}(\alpha) - c) \ge F(F^{-1}(\alpha) - c) \quad \text{for all } \alpha \in (0, 1) \text{ and } c > 0. \tag{3.B.5}$$
Alternatively, (3.B.4) and (3.B.5) can be written as
$$(X - F^{-1}(\alpha))_+ \le_{\mathrm{st}} (Y - G^{-1}(\alpha))_+, \quad \alpha \in (0, 1). \tag{3.B.6}$$

From (3.B.1) it is clear that $X \le_{\mathrm{disp}} Y$ if, and only if,
$$G^{-1}(\alpha) - F^{-1}(\alpha) \quad \text{increases in } \alpha \in (0, 1), \tag{3.B.7}$$
or, equivalently, if, and only if,
$$\bar G^{-1}(\alpha) - \bar F^{-1}(\alpha) \quad \text{decreases in } \alpha \in (0, 1), \tag{3.B.8}$$
where $\bar F \equiv 1 - F$ and $\bar G \equiv 1 - G$ are the survival functions associated with X and Y, respectively. Let $R \equiv -\log \bar F$ and $Q \equiv -\log \bar G$ denote the cumulative hazard functions of X and Y, respectively. Note that $R^{-1}(z) = \bar F^{-1}(e^{-z})$ and $Q^{-1}(z) = \bar G^{-1}(e^{-z})$. Thus from (3.B.8) we obtain that $X \le_{\mathrm{disp}} Y$ if, and only if,
$$Q^{-1}(z) - R^{-1}(z) \quad \text{increases in } z \ge 0. \tag{3.B.9}$$

Substituting α = F(x) in (3.B.7) we obtain that $X \le_{\mathrm{disp}} Y$ if, and only if,
$$G^{-1}(F(x)) - x \quad \text{increases in } x. \tag{3.B.10}$$

When X and Y have densities f and g, respectively, then $X \le_{\mathrm{disp}} Y$ if, and only if,
$$g(G^{-1}(\alpha)) \le f(F^{-1}(\alpha)) \quad \text{for all } \alpha \in (0, 1); \tag{3.B.11}$$
this can be obtained at once by differentiation of (3.B.10) and a simple substitution. When X and Y have hazard rate functions r and q, then (3.B.11) can alternatively be recast as
$$q(G^{-1}(\alpha)) \le r(F^{-1}(\alpha)) \quad \text{for all } \alpha \in (0, 1). \tag{3.B.12}$$
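Conditions (3.B.1) and (3.B.11) are easy to test on a grid of levels when quantile and density functions are available. The sketch below is a minimal illustration using two normal distributions from Python's standard library; the choice of distributions, the grid of levels, and the function names are assumptions made for the example.

```python
from statistics import NormalDist

def quantile_spread_ordered(x, y, levels):
    """Check (3.B.1): every interquantile distance of x is at most that of y."""
    qx = [x.inv_cdf(a) for a in levels]
    qy = [y.inv_cdf(a) for a in levels]
    return all(
        qx[j] - qx[i] <= qy[j] - qy[i] + 1e-12
        for i in range(len(levels))
        for j in range(i, len(levels))
    )

def density_quantile_ordered(x, y, levels):
    """Check (3.B.11): g(G^{-1}(a)) <= f(F^{-1}(a)) at every level a."""
    return all(
        y.pdf(y.inv_cdf(a)) <= x.pdf(x.inv_cdf(a)) + 1e-12 for a in levels
    )

levels = [k / 100 for k in range(1, 100)]
X = NormalDist(mu=0.0, sigma=1.0)
Y = NormalDist(mu=3.0, sigma=2.0)   # different location, larger scale

x_le_y = quantile_spread_ordered(X, Y, levels)
x_le_y_density = density_quantile_ordered(X, Y, levels)
y_le_x = quantile_spread_ordered(Y, X, levels)
print(x_le_y, x_le_y_density, y_le_x)
```

The shift in location plays no role in either check, in line with (3.B.2); only the difference in scale matters in this example.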

The dispersive order can be characterized also by comparing transformations of the random variables X and Y. For example, for continuous random variables X and Y we have that $X \le_{\mathrm{disp}} Y$ if, and only if,
$$Y =_{\mathrm{st}} \phi(X) \quad \text{for some } \phi \text{ which satisfies } \phi(x') - \phi(x) \ge x' - x \text{ whenever } x \le x'. \tag{3.B.13}$$
In order to prove it just let φ be $G^{-1}F$. When the φ in (3.B.13) is differentiable, the condition on φ there is the same as φ′ ≥ 1, where φ′ denotes the derivative of φ. An equivalent way of recasting (3.B.13) for continuous random variables X and Y is the following:
$$Y =_{\mathrm{st}} X + \psi(X) \quad \text{for some increasing function } \psi. \tag{3.B.14}$$
Condition (3.B.13) can also be rewritten as
$$X =_{\mathrm{st}} \varphi(Y) \quad \text{for some increasing } \varphi \text{ which satisfies } \varphi(x') - \varphi(x) \le x' - x \text{ whenever } x \le x'. \tag{3.B.15}$$
In fact, (3.B.15) characterizes $X \le_{\mathrm{disp}} Y$ even if X and Y are not continuous random variables.

The next characterization of the dispersive order that we describe is by means of observed total time on test random variables (see Section 1.A.4). Let F be an absolutely continuous distribution function of a nonnegative random variable X, and suppose, for simplicity, that 0 is the left endpoint of the support of F. Let $H_F^{-1}$ be as defined in (1.A.19), and let $X_{\mathrm{ttt}}$ have the distribution function $H_F$. Denote by $h_F$ the density function associated with $H_F$. Then it is easy to see that
$$h_F\big(H_F^{-1}(u)\big) = \frac{f(F^{-1}(u))}{1 - u}, \quad 0 \le u < 1, \tag{3.B.16}$$
where f is the density function associated with F. Similarly, if G is another absolutely continuous distribution function, then the density $h_G$, of the inverse of the TTT transform $H_G$ that is associated with G, satisfies
$$h_G\big(H_G^{-1}(u)\big) = \frac{g(G^{-1}(u))}{1 - u}, \quad 0 \le u < 1, \tag{3.B.17}$$
where g is the density function associated with G. Let Y and $Y_{\mathrm{ttt}}$ have the distribution functions G and $H_G$, respectively. From (3.B.11), (3.B.16), and (3.B.17) we obtain the following result.

Theorem 3.B.1. Let X and Y be two nonnegative random variables with absolutely continuous distribution functions having 0 as the left endpoint of their supports. Then
$$X \le_{\mathrm{disp}} Y \iff X_{\mathrm{ttt}} \le_{\mathrm{disp}} Y_{\mathrm{ttt}}.$$

See related results in Theorems 1.A.29, 4.A.44, 4.B.8, 4.B.9, and 4.B.29.

Next we mention a characterization by means of the so-called Q-addition (quantiles-addition). The random variable Y with distribution function G is said to be the Q-addition of the random variables X and Z, with corresponding distribution functions F and H, if $G^{-1}(\alpha) = F^{-1}(\alpha) + H^{-1}(\alpha)$ for all α ∈ (0, 1). If X and Y have distribution functions F and G, respectively, then by (3.B.1), $X \le_{\mathrm{disp}} Y$ if, and only if,
$$H^{-1}(\alpha) \equiv G^{-1}(\alpha) - F^{-1}(\alpha) \quad \text{is increasing in } \alpha \in (0, 1).$$
That means that $H^{-1}$ is an inverse of a distribution function of a random variable Z, say. Thus we see that $X \le_{\mathrm{disp}} Y$ if, and only if, Y is a Q-addition of X and Z for some random variable Z.

Another characterization of the order $\le_{\mathrm{disp}}$ is given in the following theorem.

Theorem 3.B.2. Let X and Y be two random variables. Then $X \le_{\mathrm{disp}} Y$ if, and only if, for every increasing function φ and increasing concave function h such that φ and ψ(·) ≡ h(φ(·)) are integrable with respect to the distribution of Y, and for every real number c, we have that
$$E\phi(X - c) \ge E\phi(Y) \implies E\psi(X - c) \ge E\psi(Y).$$

It is worthwhile to mention that two twice differentiable functions φ and ψ satisfy ψ(·) ≡ h(φ(·)) for some increasing concave function h if, and only if, φ″/φ′ ≥ ψ″/ψ′ (see Pratt [459] or Arrow [22]).

Like the convex order (see Theorem 3.A.7), the dispersive order can be characterized by means of Yaari functionals $V_h$ defined in (3.A.31).


Theorem 3.B.3. Let X and Y be two random variables with the same finite means. Then $X \le_{\mathrm{disp}} Y$ if, and only if,
$$V_h(X) \le V_h(Y) \quad \text{for every probability transformation function } h \le 1.$$

Before leaving this subsection we should mention an alternative way of comparing by dispersion random variables that are symmetric about 0. In such a case one may say (as an alternative to (3.B.1)) that X is less dispersed than Y if
$$F^{-1}(\alpha) - F^{-1}(1/2) \le [\ge]\ G^{-1}(\alpha) - G^{-1}(1/2) \quad \text{whenever } \alpha \ge [\le]\ 1/2.$$
If X and/or Y are not necessarily symmetric, then one can define an order that is weaker than $\le_{\mathrm{disp}}$ by requiring
$$F^{-1}(\alpha) - F^{-1}(1 - \alpha) \le G^{-1}(\alpha) - G^{-1}(1 - \alpha), \quad \alpha \in [1/2, 1];$$
see Townsend and Colonius [552].

If X and Y are positive random variables, then, as an alternative to (3.B.1), one can say that X is less dispersed than Y if $\log X \le_{\mathrm{disp}} \log Y$. The latter condition is equivalent to $X \le_{*} Y$, where the order $\le_{*}$ is defined in Section 4.B (see Theorem 4.B.1).

3.B.2 Properties

The dispersive order satisfies some desirable closure properties but does not satisfy some other desirable properties. For example, it is very easy to verify the following result (compare it to Theorem 3.A.18).

Theorem 3.B.4. Let X be a random variable. Then
$$X \le_{\mathrm{disp}} aX \quad \text{whenever } a \ge 1.$$

Theorem 3.B.4 can be generalized as follows. For two functions φ and ψ let us denote $\phi \le_{\mathrm{disp}} \psi$ if
$$\phi(y) - \phi(x) \le \psi(y) - \psi(x) \quad \text{whenever } x \le y. \tag{3.B.18}$$
Note that if φ and ψ are differentiable then $\phi \le_{\mathrm{disp}} \psi$ if, and only if, φ′ ≤ ψ′, where φ′ and ψ′ are the derivatives of φ and ψ, respectively. Now let X be a random variable. Write ψ(X) = φ(X) + (ψ(X) − φ(X)). From (3.B.14) we obtain the following result.

Theorem 3.B.5. Let X be a random variable. Then
$$\phi(X) \le_{\mathrm{disp}} \psi(X) \quad \text{whenever } \phi \le_{\mathrm{disp}} \psi.$$

Another simple desirable property that is easily verified is the following theorem.

Theorem 3.B.6. Let X and Y be two random variables. Then
$$X \le_{\mathrm{disp}} Y \iff -X \le_{\mathrm{disp}} -Y. \tag{3.B.19}$$


However, the dispersive order is not closed under convolutions. In fact, it is not even true in general that for any two independent random variables $X$ and $Y$ we have $X \le_{\mathrm{disp}} X + Y$. This observation follows from the next theorem, the proof of which is omitted.

Theorem 3.B.7. The random variable $X$ satisfies

\[
X \le_{\mathrm{disp}} X + Y \quad \text{for any random variable } Y \text{ independent of } X
\]

if, and only if, $X$ has a logconcave density.

A random variable $Z$ is said to be dispersive if $X + Z \le_{\mathrm{disp}} Y + Z$ whenever $X \le_{\mathrm{disp}} Y$ and $Z$ is independent of $X$ and $Y$. From Theorem 3.B.7 it follows that every dispersive random variable must be strongly unimodal (that is, have a logconcave density). It turns out that strong unimodality is also a sufficient condition for dispersivity, as the next result shows. Again the proof is omitted.

Theorem 3.B.8. The random variable $X$ is dispersive if, and only if, $X$ has a logconcave density.

Other characterizations of random variables with logconcave densities are given in Theorem 1.C.52. From Theorem 3.B.8 we obtain, by iteration, the following result.

Theorem 3.B.9. Let $X_1, X_2, \ldots, X_n$ be a set of independent random variables, and let $Y_1, Y_2, \ldots, Y_n$ be another set of independent random variables. If the $X_i$'s and the $Y_i$'s have logconcave densities, and if $X_i \le_{\mathrm{disp}} Y_i$, $i = 1, 2, \ldots, n$, then

\[
\sum_{i=1}^{n} X_i \le_{\mathrm{disp}} \sum_{i=1}^{n} Y_i.
\]
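As an illustration of Theorem 3.B.7, the sketch below (Python, not from the original text) takes $X$ and $Y$ independent exponential random variables with mean 1, so that $X$ has a logconcave density and $X + Y$ has the Erlang distribution function $1 - e^{-x}(1 + x)$. It checks on a grid that the quantile difference is increasing. The bisection helper `bisect_quantile` and the grid size are illustrative choices.

```python
import math

def bisect_quantile(cdf, u, lo=0.0, hi=1.0):
    """Invert a continuous increasing cdf on [0, inf) by bisection."""
    while cdf(hi) < u:          # expand the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exp_quantile(u):
    """Quantile function of X ~ Exp(mean 1)."""
    return -math.log(1.0 - u)

def erlang2_cdf(x):
    """Distribution function of X + Y for independent Exp(mean 1) variables."""
    return 1.0 - math.exp(-x) * (1.0 + x)

grid = [k / 200 for k in range(1, 200)]
gaps = [bisect_quantile(erlang2_cdf, u) - exp_quantile(u) for u in grid]

# X <=_disp X + Y corresponds to a nondecreasing quantile gap.
is_increasing = all(b >= a - 1e-9 for a, b in zip(gaps, gaps[1:]))
print(is_increasing)
```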

The dispersive order is closed under increasing convex and decreasing concave transformations when the underlying random variables are ordered in the usual stochastic order. We have the following result.

Theorem 3.B.10. Let $X$ and $Y$ be two random variables such that $X \le_{\mathrm{st}} Y$.

(a) If $X \le_{\mathrm{disp}} Y$, then

\[
\phi(X) \le_{\mathrm{disp}} \phi(Y) \quad \text{for all increasing convex and all decreasing concave functions } \phi. \tag{3.B.20}
\]

(b) If $X \ge_{\mathrm{disp}} Y$, then

\[
\phi(X) \ge_{\mathrm{disp}} \phi(Y) \quad \text{for all decreasing convex and all increasing concave functions } \phi. \tag{3.B.21}
\]


Proof. First we prove (3.B.20) when $\phi$ is increasing and convex. Let $F$ and $G$ denote the distribution functions of $X$ and $Y$, respectively, and let $F^{-1}$ and $G^{-1}$ be the respective inverses. For simplicity suppose that $F$, $G$, and $\phi$ are differentiable with derivatives $f$, $g$, and $\phi'$, respectively. The condition $X \le_{\mathrm{st}} Y$ implies that (see (1.A.12))

\[
F^{-1}(\alpha) \le G^{-1}(\alpha) \quad \text{for all } \alpha \in (0, 1).
\]

Since $\phi$ is convex it follows that $\phi'$ is increasing. Therefore

\[
\phi'(F^{-1}(\alpha)) \le \phi'(G^{-1}(\alpha)) \quad \text{for all } \alpha \in (0, 1). \tag{3.B.22}
\]

The condition $X \le_{\mathrm{disp}} Y$ implies that (see (3.B.11))

\[
g(G^{-1}(\alpha)) \le f(F^{-1}(\alpha)) \quad \text{for all } \alpha \in (0, 1). \tag{3.B.23}
\]

Since $\phi$ is increasing it follows that $\phi' \ge 0$. Therefore, combining (3.B.22) and (3.B.23), we see that

\[
g(G^{-1}(\alpha))\,\phi'(F^{-1}(\alpha)) \le f(F^{-1}(\alpha))\,\phi'(G^{-1}(\alpha)) \quad \text{for all } \alpha \in (0, 1),
\]

and, again from (3.B.11), it is seen that the latter inequality is equivalent to $\phi(X) \le_{\mathrm{disp}} \phi(Y)$. If $\phi$ is decreasing and concave, then $-\phi$ is increasing and convex. Therefore, from what we just proved it follows that $-\phi(X) \le_{\mathrm{disp}} -\phi(Y)$. From Theorem 3.B.6 we obtain that $\phi(X) \le_{\mathrm{disp}} \phi(Y)$. The proof of (3.B.21) is similar.
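Part (a) of Theorem 3.B.10 can be checked on a concrete pair. In the Python sketch below (an illustration, not from the original text) $X$ and $Y$ are exponential with rates 2 and 1, and $\phi(x) = x^2$, which is increasing and convex on the support $[0, \infty)$. The choice of distributions, of $\phi$, and of the grid are assumptions made for the example.

```python
import math

def is_nondecreasing(values, tol=1e-12):
    return all(b - a >= -tol for a, b in zip(values, values[1:]))

def q_x(u):  # quantile function of X ~ Exp(rate 2)
    return -math.log(1.0 - u) / 2.0

def q_y(u):  # quantile function of Y ~ Exp(rate 1)
    return -math.log(1.0 - u)

def phi(x):  # increasing and convex on [0, inf)
    return x * x

grid = [k / 1000 for k in range(1, 1000)]

# X <=_st Y: quantiles ordered pointwise, cf. (1.A.12).
st_ok = all(q_x(u) <= q_y(u) for u in grid)
# X <=_disp Y: the quantile gap is nondecreasing.
disp_ok = is_nondecreasing([q_y(u) - q_x(u) for u in grid])
# For increasing phi, the quantile function of phi(X) is phi(q_x(u)).
phi_disp_ok = is_nondecreasing([phi(q_y(u)) - phi(q_x(u)) for u in grid])

print(st_ok, disp_ok, phi_disp_ok)
```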

Theorem 3.B.10 can be generalized in several ways. Here are two generalizations of the increasing convex part of (3.B.20).

Theorem 3.B.11. Let $X$ and $Y$ be two random variables such that $X \le_{\mathrm{st}} Y$.

(a) If $X \le_{\mathrm{disp}} Y$, then $\phi(X) \le_{\mathrm{disp}} \psi(Y)$ whenever $\phi \le_{\mathrm{disp}} \psi$ (in the sense of (3.B.18)) and $\phi$ or $\psi$ is an increasing convex function.

(b) If $X \le_{\mathrm{disp}} Y$, then $\phi(X) \le_{\mathrm{disp}} \psi(Y)$ whenever $\phi$ and $\psi$ are differentiable and their derivatives, $\phi'$ and $\psi'$, respectively, satisfy $\phi'(x) \le \psi'(y)$ for all $x \le y$.

A relation similar to (3.B.20) can be used as a sufficient condition for $X \le_{\mathrm{disp}} Y$. The next result states such a condition. Note that in (3.B.24) the directions of the monotonicity in the convex and the concave cases are interchanged.

Theorem 3.B.12. Let $X$ and $Y$ be two random variables such that $X \le_{\mathrm{st}} Y$. If

\[
\phi(X) \le_{\mathrm{disp}} \phi(Y) \quad \text{for some decreasing convex or increasing concave function } \phi, \tag{3.B.24}
\]

then $X \le_{\mathrm{disp}} Y$.


The proof of Theorem 3.B.12 uses Theorem 3.B.10. If $\phi$ in (3.B.24) is increasing and concave, then $\phi^{-1}$ is increasing and convex. Since $X \le_{\mathrm{st}} Y$ it follows that $\phi(X) \le_{\mathrm{st}} \phi(Y)$. Therefore, by Theorem 3.B.10, $X = \phi^{-1}(\phi(X)) \le_{\mathrm{disp}} \phi^{-1}(\phi(Y)) = Y$. The proof for a decreasing and convex $\phi$ is similar.

For random variables with equal left-end support points the assumption in Theorems 3.B.10 and 3.B.11 of the comparison of $X$ and $Y$ in the usual stochastic order need not be stated. This is because of the following observation. Here, for random variables $X$ and $Y$, we denote the corresponding endpoints of their supports by $l_X$, $u_X$, $l_Y$, and $u_Y$ as defined before (3.A.12).

Theorem 3.B.13. (a) If $X$ and $Y$ are random variables such that $l_X = l_Y > -\infty$, then

\[
X \le_{\mathrm{disp}} Y \implies X \le_{\mathrm{st}} Y.
\]

(b) If $X$ and $Y$ are random variables such that $u_X = u_Y < \infty$, then

\[
X \le_{\mathrm{disp}} Y \implies X \ge_{\mathrm{st}} Y.
\]

For example, if $X$ and $Y$ are nonnegative random variables such that $l_X = l_Y = 0$, then Theorem 3.B.13(a) applies. A stronger version of this fact is described in Remark 4.B.35.

The proof of Theorem 3.B.13(a) is based on the fact that if $F$ and $G$ are the distribution functions of $X$ and $Y$, respectively, then $F^{-1}(0) = l_X = l_Y = G^{-1}(0)$. Therefore, from (3.B.1) one obtains that $F^{-1}(\beta) \le G^{-1}(\beta)$ for all $\beta \in (0, 1)$, that is, $X \le_{\mathrm{st}} Y$ by (1.A.12). The proof of Theorem 3.B.13(b) is similar.

The following result can be shown using the same kind of argument.

Theorem 3.B.14. If $X$ and $Y$ are random variables having the same finite support and satisfying $X \le_{\mathrm{disp}} Y$, then they must have the same distribution.

The next result is an analog of (3.A.12). We omit the proof.

Theorem 3.B.15. Let $X$ and $Y$ be random variables whose supports are intervals. Then

\[
X \le_{\mathrm{disp}} Y \implies \mu\{\mathrm{supp}(X)\} \le \mu\{\mathrm{supp}(Y)\},
\]

where $\mu$ denotes the Lebesgue measure.

Suppose that $X$ and $Y$ are two random variables with distributions $F$ and $G$, respectively, such that $X \le_{\mathrm{disp}} Y$. Then by taking $c = 0$ in (3.B.3) we see that (3.A.59) holds for the random variables $X - EX$ and $Y - EY$. We thus have proved the following implication.

Theorem 3.B.16. Let $X$ and $Y$ be two random variables with finite means. Then

\[
X \le_{\mathrm{disp}} Y \implies X \le_{\mathrm{dil}} Y.
\]


A more refined result can be obtained by combining (3.C.7) and (3.C.9) in Section 3.C below. From Theorem 3.B.16, (3.A.32), and (3.A.4) it follows that if $X \le_{\mathrm{disp}} Y$, then

\[
\mathrm{Var}(X) \le \mathrm{Var}(Y), \tag{3.B.25}
\]

whenever $\mathrm{Var}(Y) < \infty$. From Theorem 3.B.7 it follows that

\[
X \le_{\mathrm{conv}} Y, \text{ and } X \text{ has a logconcave density} \implies X \le_{\mathrm{disp}} Y. \tag{3.B.26}
\]

In contrast to (3.B.19), if $X \le_{\mathrm{disp}} Y$, it does not necessarily follow that $X \le_{\mathrm{disp}} -Y$. In order to see it, let $X$ be an exponential random variable with mean 1. Clearly $X \le_{\mathrm{disp}} X$ (in fact, this is the case for any random variable $X$). The distribution function of $X$ is concave on $[0, \infty)$, and the distribution function of $-X$ is convex on $(-\infty, 0)$. Since the order $\le_{\mathrm{disp}}$ is preserved under shifts, it follows that $X \not\le_{\mathrm{disp}} -X$.

Using an argument as in the proof of Theorem 3.A.44, we obtain the following sufficient condition for the dispersive order.

Theorem 3.B.17. Let $X$ and $Y$ be random variables with respective densities $f$ and $g$. If

\[
S^-(f(\cdot - c) - g(\cdot)) \le 2 \text{ for all } c, \text{ with the sign sequence being } -, +, - \text{ in the case of equality,} \tag{3.B.27}
\]

then $X \le_{\mathrm{disp}} Y$.

Another sufficient condition for $X \le_{\mathrm{disp}} Y$ is given next.

Theorem 3.B.18. Let $X$ and $Y$ be two absolutely continuous random variables with hazard rate functions (see (1.B.1)) $r$ and $q$, respectively. If

\[
r(u) \ge q(u + x) \quad \text{for all } u \text{ and } x \ge 0, \tag{3.B.28}
\]

then $X \le_{\mathrm{disp}} Y$.

Proof. Let $F$ and $G$ denote the distribution functions of $X$ and $Y$, respectively. Condition (3.B.28) implies that $r(u) \ge q(u)$; that is, $X \le_{\mathrm{hr}} Y$. This, in turn, implies $X \le_{\mathrm{st}} Y$, which, in turn, implies (1.A.12). Now, (3.B.28) therefore gives $r(F^{-1}(\alpha)) \ge q(G^{-1}(\alpha))$ for all $\alpha \in (0, 1)$, which is equivalent to $X \le_{\mathrm{disp}} Y$ by (3.B.12).
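The exponential example given before Theorem 3.B.17, in which $X$ and $-X$ are compared, can be checked through quantile functions. In the Python sketch below (an illustration, not from the original text) the quantile function of $-X$ is $\log u$, so the quantile gap between $-X$ and $X$ is $\log(u(1 - u))$, which rises and then falls. The grid and tolerance are illustrative choices.

```python
import math

def is_nondecreasing(values, tol=1e-12):
    return all(b - a >= -tol for a, b in zip(values, values[1:]))

def q_pos(u):   # quantile function of X ~ Exp(mean 1)
    return -math.log(1.0 - u)

def q_neg(u):   # quantile function of -X
    return math.log(u)

grid = [k / 1000 for k in range(1, 1000)]
gap = [q_neg(u) - q_pos(u) for u in grid]   # equals log(u * (1 - u))

x_disp_negx = is_nondecreasing(gap)                  # would indicate X <=_disp -X
negx_disp_x = is_nondecreasing([-g for g in gap])    # would indicate -X <=_disp X
print(x_disp_negx, negx_disp_x)
```

Neither check succeeds on the grid, so $X$ and $-X$ are not comparable in the dispersive order in this example.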

Let X be a random variable and denote by X(−∞,a] the truncation of X at a as deﬁned in Section 1.A.4. One would expect X(−∞,a] to increase in a in the sense of the dispersion order. This is not always the case, but it is the case if the distribution function F of X is logconcave; that is, if X has decreasing reverse hazard (see Section 1.B.6). This is shown in the next result,


which is an analog of Theorem 1.A.15 for the dispersion order. The proof of the first part of the theorem consists of verifying that for $\alpha \le \beta$ the quantity $F^{-1}(\beta F(a)) - F^{-1}(\alpha F(a))$ increases in $a$ when $F$ is logconcave. The other parts of the theorem are proven similarly. The notation $X_{(a,b)}$ for $a < b$ is self-explanatory.

Theorem 3.B.19. Let $X$ be a random variable with distribution function $F$, survival function $\bar{F}$, and density $f$.

(a) If $F$ is logconcave, then $X_{(-\infty,a]}$ increases in $a$ in the sense of the dispersion order.

(b) If $\bar{F}$ is logconcave (that is, if $X$ is IFR), then $X_{(a,\infty)}$ decreases in $a$ in the sense of the dispersion order.

(c) If $f$ is logconcave, then $X_{(a,b)}$ decreases in $a$ ($< b$) and increases in $b$ ($> a$) in the sense of the dispersion order.

Recall from page 1 the definitions of the IFR, DFR, NBU, NWU, DMRL and IMRL properties. The following theorems list some relations between the dispersion order and some other orders. The proofs are mostly straightforward and are not detailed here.

Theorem 3.B.20. Let $X$ and $Y$ be two nonnegative random variables.

(a) If $X \le_{\mathrm{hr}} Y$ and $X$ or $Y$ is DFR, then $X \le_{\mathrm{disp}} Y$.

(b) If $X \le_{\mathrm{disp}} Y$ and $X$ or $Y$ is IFR, then $X \le_{\mathrm{hr}} Y$.

(c) If $X$ is NBU and $Y$ is NWU, then $X \le_{\mathrm{disp}} Y \iff X \le_{\mathrm{hr}} Y$.

A version of parts (a) and (b) of Theorem 3.B.20, where $\le_{\mathrm{hr}}$ is replaced by $\le_{\mathrm{rh}}$, and DFR and IFR are replaced by monotonicity conditions on the reversed hazard rate function, can be found in Bartoszewicz [44].

Recall from (1.A.20) that for nonnegative random variables $X$ and $Y$ with finite means, we denote by $A_X$ and $A_Y$ the corresponding asymptotic equilibrium ages. The following result may be contrasted with Theorem 2.A.4.

Theorem 3.B.21. Let $X$ and $Y$ be two nonnegative random variables.

(a) If $X \le_{\mathrm{mrl}} Y$ and $X$ or $Y$ is IMRL, then $A_X \le_{\mathrm{disp}} A_Y$.

(b) If $A_X \le_{\mathrm{disp}} A_Y$ and $X$ or $Y$ is DMRL, then $X \le_{\mathrm{mrl}} Y$.

(c) If $X \le_{\mathrm{disp}} Y$ and $X$ is DMRL and $Y$ is IMRL, then $X \le_{\mathrm{mrl}} Y$.

Example 3.B.22. Let $X_1, X_2, \ldots, X_n$ be independent DFR random variables, and let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ be the corresponding order statistics. Then $X_{(1)}$ is DFR (since its hazard rate function is the sum of the hazard rate functions of the $X_i$'s). From Theorem 1.B.26 we see that $X_{(1)} \le_{\mathrm{hr}} X_{(i)}$, $i = 2, 3, \ldots, n$. Therefore, by Theorem 3.B.20(a), we have that

\[
X_{(1)} \le_{\mathrm{disp}} X_{(i)}, \quad i = 2, 3, \ldots, n.
\]


Example 3.B.23. Let $X_1, X_2, \ldots, X_m, X_{m+1}$ be independent and identically distributed DFR random variables and let the corresponding spacings be denoted by $U_{(i:m)}$ as in Theorem 1.B.31. It is easy to see then that the spacings are DFR random variables (see Barlow and Proschan [35]). Then, from Theorems 1.B.31 and 3.B.20(a) we get

\[
(m - i + 1)U_{(i:m)} \le_{\mathrm{disp}} (m - i)U_{(i+1:m)}, \quad i = 2, 3, \ldots, m - 1, \tag{3.B.29}
\]

\[
(m - i + 2)U_{(i:m+1)} \le_{\mathrm{disp}} (m - i + 1)U_{(i:m)}, \quad i = 2, 3, \ldots, m, \tag{3.B.30}
\]

and

\[
U_{(i:m)} \le_{\mathrm{disp}} U_{(i+1:m+1)}, \quad i = 2, 3, \ldots, m. \tag{3.B.31}
\]

Note that (3.B.29)–(3.B.31) can be summarized as

\[
(m - j + 1)U_{(j:m)} \le_{\mathrm{disp}} (n - i + 1)U_{(i:n)} \quad \text{whenever } i - j \ge \max\{0, n - m\}.
\]

The dispersive order can be used to characterize IFR and DFR random variables as the following result shows.

Theorem 3.B.24. Let $X$ be a nonnegative random variable. Then $X$ is IFR [DFR] if, and only if,

\[
[X - t \mid X > t] \ge_{\mathrm{disp}} [\le_{\mathrm{disp}}]\ [X - t' \mid X > t'] \quad \text{whenever } t \le t'.
\]

Proof. If $X$ is IFR, then, by Theorem 3.B.19(b), $[X \mid X > t]$ is decreasing in $t$ in the sense of the dispersive order. Since the dispersive order is preserved under shifts, it is seen that $[X - t \mid X > t]$ is decreasing in $t$ in the sense of the dispersive order. The proof of the DFR case is similar, though one first needs to prove a DFR version of Theorem 3.B.19(b). The converses of the above statements are consequences of Theorems 1.A.30(a) and 3.B.13(a).

Under some regularity conditions on the distribution function of $X$ and on its support, but without the assumption of nonnegativity of $X$, we have a related characterization of the IFR and the DFR properties. We do not give the proof of this result here.

Theorem 3.B.25. Let $X$ be a random variable with a continuous distribution function, and with support of the form $(a, \infty)$, where $a \ge -\infty$ [respectively, $a > -\infty$]. Then $X$ is IFR [DFR] if, and only if,

\[
X \ge_{\mathrm{disp}} [\le_{\mathrm{disp}}]\ [X - t \mid X > t] \quad \text{for all } t > a.
\]

The next result states a preservation property of the order $\le_{\mathrm{disp}}$ which is useful in reliability theory as well as in nonparametric statistics. Let $X$ and $Y$ be two random variables. Let $X_{(1:n)} \le X_{(2:n)} \le \cdots \le X_{(n:n)}$ denote the order statistics from a sample $X_1, X_2, \ldots, X_n$ of independent and identically distributed random variables that have the same distribution as $X$. Similarly, let $Y_{(1:n)} \le Y_{(2:n)} \le \cdots \le Y_{(n:n)}$ denote the order statistics from another sample $Y_1, Y_2, \ldots, Y_n$ of independent and identically distributed random variables that have the same distribution as $Y$.


Theorem 3.B.26. Let $X$ and $Y$ be two random variables. If $X \le_{\mathrm{disp}} Y$, then $X_{(j:n)} \le_{\mathrm{disp}} Y_{(j:n)}$ for $j = 1, 2, \ldots, n$.

The proof follows at once from (3.B.10) and the fact that

\[
G_{j:n}^{-1} F_{j:n} = G^{-1} F \quad \text{for } j = 1, 2, \ldots, n,
\]

where $F$, $F_{j:n}$, $G$, and $G_{j:n}$ are the distribution functions of $X$, $X_{(j:n)}$, $Y$, and $Y_{(j:n)}$, respectively.

For the next result about comparison of order statistics we will need the following lemma.

Lemma 3.B.27. Let $E_{(j:m)}$ and $E_{(i:n)}$ denote the $j$th and the $i$th order statistics of samples from the exponential distribution with rate $\lambda > 0$ of sizes $m$ and $n$, respectively. Then

\[
E_{(j:m)} \le_{\mathrm{disp}} E_{(i:n)} \quad \text{whenever } i - j \ge \max\{0, n - m\}.
\]

Proof. Write $E_{(j:m)} =_{\mathrm{st}} \sum_{k=1}^{j} E_{m-j+k}$, where $E_{m-j+k}$ is an exponential random variable with rate $(m - j + k)\lambda$, $k = 1, 2, \ldots, j$, and the $E_{m-j+k}$'s are independent. Similarly, write $E_{(i:n)} =_{\mathrm{st}} \sum_{k=1}^{i} E_{n-i+k}$, where $E_{n-i+k}$ is an exponential random variable with rate $(n - i + k)\lambda$, $k = 1, 2, \ldots, i$, and the $E_{n-i+k}$'s are independent. It is easy to check, for instance using Theorem 3.B.4, that $E_{m-j+k} \le_{\mathrm{disp}} E_{n-i+k}$ because $m - j \ge n - i$. Since exponential random variables have logconcave densities, we obtain from Theorems 3.B.9 and 3.B.7, respectively, that

\[
E_{(j:m)} =_{\mathrm{st}} \sum_{k=1}^{j} E_{m-j+k} \le_{\mathrm{disp}} \sum_{k=1}^{j} E_{n-i+k} \le_{\mathrm{disp}} \sum_{k=1}^{i} E_{n-i+k} =_{\mathrm{st}} E_{(i:n)}
\]

because $j \le i$.
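Lemma 3.B.27 can be explored numerically. The Python sketch below (not from the original text) uses the fact that the distribution function of the $j$th order statistic of $m$ observations is a binomial tail in $p = F(x)$, inverts it by bisection, and checks on a grid that the quantile gap is nondecreasing. The function names, the chosen index combinations, and the grid are illustrative assumptions.

```python
import math

def order_stat_cdf_in_p(j, m, p):
    """P(at least j of m i.i.d. draws fall below a point where F equals p)."""
    return sum(math.comb(m, k) * p**k * (1.0 - p)**(m - k) for k in range(j, m + 1))

def exp_order_stat_quantile(j, m, u, rate=1.0):
    """u-quantile of the j-th order statistic of m i.i.d. Exp(rate) variables."""
    lo, hi = 0.0, 1.0
    for _ in range(200):                 # bisection on p = F(x) in [0, 1]
        mid = 0.5 * (lo + hi)
        if order_stat_cdf_in_p(j, m, mid) < u:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    return -math.log(1.0 - p) / rate     # map p back through the exponential quantile

def disp_leq(j, m, i, n, tol=1e-9):
    """Grid check of E_(j:m) <=_disp E_(i:n)."""
    grid = [k / 100 for k in range(1, 100)]
    gaps = [exp_order_stat_quantile(i, n, u) - exp_order_stat_quantile(j, m, u)
            for u in grid]
    return all(b - a >= -tol for a, b in zip(gaps, gaps[1:]))

# Index combinations (j, m, i, n) satisfying i - j >= max(0, n - m).
cases = [(1, 3, 2, 3), (2, 5, 3, 5), (1, 4, 3, 5), (2, 6, 2, 4)]
results = [disp_leq(*c) for c in cases]
print(results)
print(disp_leq(2, 3, 1, 3))   # here i - j = -1, so the condition is violated
```

All four admissible cases pass the grid check, while the last comparison, which violates the index condition, does not.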

Theorem 3.B.28. Let $X_{(j:m)}$ and $X_{(i:n)}$ denote the $j$th and the $i$th order statistics of samples from a DFR distribution $F$ of sizes $m$ and $n$, respectively. Then

\[
X_{(j:m)} \le_{\mathrm{disp}} X_{(i:n)} \quad \text{whenever } i - j \ge \max\{0, n - m\}.
\]

Proof. The distribution $F_{j:m}$ of $X_{(j:m)}$ can be expressed as $F_{j:m} = B_{j:m} F$, where $B_{j:m}$ is the beta distribution with parameters $j$ and $m - j + 1$. Similarly, the distribution $F_{i:n}$ of $X_{(i:n)}$ can be expressed as $F_{i:n} = B_{i:n} F$. Now write

\[
F_{j:m} = B_{j:m} G G^{-1} F = H_{j:m} G^{-1} F,
\]

where $G$ denotes the distribution function of an exponential random variable with mean 1, and $H_{j:m} = B_{j:m} G$. Note that $H_{j:m}$ is the distribution function of $E_{(j:m)}$ in Lemma 3.B.27. Similarly, write

\[
F_{i:n} = H_{i:n} G^{-1} F,
\]

and finally notice that

\[
F_{i:n}^{-1} F_{j:m} = \psi H_{i:n}^{-1} H_{j:m} \psi^{-1},
\]

where $\psi = F^{-1} G$. From Lemma 3.B.27 and (3.B.10) we see that $H_{i:n}^{-1} H_{j:m}(x) - x$ is increasing in $x$. The function $\psi$ is strictly convex because $F$ is DFR, and it satisfies $\psi(0) = 0$. Therefore, by a result of Bartoszewicz [40] it follows that $F_{i:n}^{-1} F_{j:m}(x) - x$ is increasing in $x$. The stated result now follows from (3.B.10).

As a corollary of Theorems 3.B.26 and 3.B.28 we get the following result.

Theorem 3.B.29. Let $X_{(j:m)}$ and $Y_{(i:n)}$ denote the $j$th and the $i$th order statistics of samples from the distributions $F$ and $G$ of sizes $m$ and $n$, respectively. If $F$ or $G$ is DFR, and if $X \le_{\mathrm{disp}} Y$, then

\[
X_{(j:m)} \le_{\mathrm{disp}} Y_{(i:n)} \quad \text{whenever } i - j \ge \max\{0, n - m\}.
\]

Proof. If $F$ is DFR, then $X_{(j:m)} \le_{\mathrm{disp}} X_{(i:n)} \le_{\mathrm{disp}} Y_{(i:n)}$ by Theorems 3.B.28 and 3.B.26, respectively. If $G$ is DFR, then $X_{(j:m)} \le_{\mathrm{disp}} Y_{(j:m)} \le_{\mathrm{disp}} Y_{(i:n)}$ by Theorems 3.B.26 and 3.B.28, respectively.

It is of interest to compare Theorem 3.B.29 to the following example (which follows from Example 3.A.50 and Theorem 3.B.16).

Example 3.B.30. Let $X_{(i:n)}$ denote the $i$th order statistic in a sample of $n$ independent and identically distributed random variables having the common distribution function $F$, survival function $\bar{F}$, and density function $f$. Recall the definition of regular variation from Example 3.A.50. For $F$ with support $(-\infty, \infty)$ we have:

(a) If $F$ is regularly varying at $-\infty$ with index $\alpha < 0$, and if $f$ is monotone on $(-\infty, c]$ for some $c$, then $X_{(j:m)} \le_{\mathrm{disp}} X_{(i:n)}$ implies $i \le j$.

(b) If $\bar{F}$ is regularly varying at $\infty$ with index $\alpha < 0$, and if $f$ is monotone on $[c, \infty)$ for some $c$, then $X_{(j:m)} \le_{\mathrm{disp}} X_{(i:n)}$ implies $n - i \le m - j$.

The dispersive order between $X$ and $Y$ implies the usual stochastic order between the corresponding spacings as the next result shows. In order to state it we use the following notation. Let $X_{(1:n)} \le X_{(2:n)} \le \cdots \le X_{(n:n)}$ and $Y_{(1:n)} \le Y_{(2:n)} \le \cdots \le Y_{(n:n)}$ be the order statistics as above. The corresponding spacings are defined by $U_{(i:n)} \equiv X_{(i:n)} - X_{(i-1:n)}$ and $V_{(i:n)} \equiv Y_{(i:n)} - Y_{(i-1:n)}$, $i = 2, 3, \ldots, n$. The proof of the next theorem is given in Example 6.B.25 in Chapter 6.

Theorem 3.B.31. Let $X$ and $Y$ be two random variables. If $X \le_{\mathrm{disp}} Y$, then $U_{(i:n)} \le_{\mathrm{st}} V_{(i:n)}$ for $i = 2, 3, \ldots, n$.

Theorem 2.7 on page 182 of Kamps [273] extends Theorem 3.B.31 to the spacings of the so-called generalized order statistics.

The following example describes an interesting instance in which the two maxima are ordered in the dispersive order. It may be compared with Example 1.B.37.


Example 3.B.32. Let $Y_1, Y_2, \ldots, Y_n$ be independent exponential random variables with hazard rates $\lambda_1, \lambda_2, \ldots, \lambda_n$, respectively. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed exponential random variables with hazard rate $\bar{\lambda} = \sum_{i=1}^{n} \lambda_i / n$. Then

\[
X_{(n:n)} \le_{\mathrm{disp}} Y_{(n:n)}. \tag{3.B.32}
\]

Let $Z_1, Z_2, \ldots, Z_n$ be independent and identically distributed exponential random variables with hazard rate $\tilde{\lambda} = \left(\prod_{i=1}^{n} \lambda_i\right)^{1/n}$. Then

\[
Z_{(n:n)} \le_{\mathrm{disp}} Y_{(n:n)}. \tag{3.B.33}
\]

Note that from the arithmetic-geometric mean inequality ($\bar{\lambda} \ge \tilde{\lambda}$) it follows that $X_1 \le_{\mathrm{hr}} Z_1$. Therefore, by Theorem 3.B.20(a), $X_1 \le_{\mathrm{disp}} Z_1$. Alternatively, we can see that $X_1 \le_{\mathrm{disp}} Z_1$ from Example 1.D.1 and (3.B.26). Hence, by Theorem 3.B.26, $X_{(n:n)} \le_{\mathrm{disp}} Z_{(n:n)}$. That is, (3.B.33) is actually a stronger result than (3.B.32).

Example 3.B.33. Let $Y_1, Y_2, \ldots, Y_n$ and $X_1, X_2, \ldots, X_n$ be as in Example 3.B.32. Denote the corresponding spacings by $U_{(i:n)} \equiv X_{(i:n)} - X_{(i-1:n)}$ and $V_{(i:n)} \equiv Y_{(i:n)} - Y_{(i-1:n)}$, $i = 2, 3, \ldots, n$. Then

\[
U_{(i:n)} \le_{\mathrm{disp}} V_{(i:n)}, \quad i = 2, 3, \ldots, n.
\]
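The chain of comparisons for the maxima in Example 3.B.32 can be examined numerically. The Python sketch below is an illustration, not part of the original text: the hazard rates in `rates` are hypothetical, the quantile of the heterogeneous maximum is obtained by bisection, and the ordering is checked only on a finite grid.

```python
import math

rates = [0.5, 1.0, 3.0]                       # hypothetical hazard rates of the Y_i
n = len(rates)
lam_bar = sum(rates) / n                      # arithmetic mean of the rates
lam_tilde = math.prod(rates) ** (1.0 / n)     # geometric mean of the rates

def iid_max_quantile(rate, u):
    """u-quantile of the maximum of n i.i.d. Exp(rate) variables."""
    return -math.log(1.0 - u ** (1.0 / n)) / rate

def y_max_cdf(x):
    """Distribution function of the maximum of independent Exp(rates[i]) variables."""
    return math.prod(1.0 - math.exp(-r * x) for r in rates)

def y_max_quantile(u):
    lo, hi = 0.0, 1.0
    while y_max_cdf(hi) < u:                  # expand the bracket
        hi *= 2.0
    for _ in range(200):                      # bisection
        mid = 0.5 * (lo + hi)
        if y_max_cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def is_nondecreasing(values, tol=1e-9):
    return all(b - a >= -tol for a, b in zip(values, values[1:]))

grid = [k / 200 for k in range(1, 200)]
qx = [iid_max_quantile(lam_bar, u) for u in grid]
qz = [iid_max_quantile(lam_tilde, u) for u in grid]
qy = [y_max_quantile(u) for u in grid]

x_le_z = is_nondecreasing([z - x for x, z in zip(qx, qz)])   # X_(n:n) <=_disp Z_(n:n)
z_le_y = is_nondecreasing([y - z for z, y in zip(qz, qy)])   # cf. (3.B.33)
x_le_y = is_nondecreasing([y - x for x, y in zip(qx, qy)])   # cf. (3.B.32)
print(x_le_z, z_le_y, x_le_y)
```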

A related example is the following. Recall from page 2 the definition of the majorization order $\prec$ among $n$-dimensional vectors. It is of interest to compare the example below with Example 1.C.50.

Example 3.B.34. Let $X_i$ be an exponential random variable with mean $\lambda_i^{-1} > 0$, $i = 1, 2, \ldots, m$, and let $Y_i$ be an exponential random variable with mean $\eta_i^{-1} > 0$, $i = 1, 2, \ldots, m$. If $(\lambda_1, \lambda_2, \ldots, \lambda_m) \succ (\eta_1, \eta_2, \ldots, \eta_m)$, then

\[
\sum_{i=1}^{m} X_i \ge_{\mathrm{disp}} \sum_{i=1}^{m} Y_i.
\]

Similar examples are the following.

Example 3.B.35. Let $X_i$ be a uniform random variable on $[0, \lambda_i^{-1}]$, $i = 1, 2, \ldots, m$, and let $Y_i$ be a uniform random variable on $[0, \eta_i^{-1}]$, $i = 1, 2, \ldots, m$. If $(\lambda_1, \lambda_2, \ldots, \lambda_m) \succ (\eta_1, \eta_2, \ldots, \eta_m)$, then

\[
\sum_{i=1}^{m} X_i \ge_{\mathrm{disp}} \sum_{i=1}^{m} Y_i.
\]

Example 3.B.36. Let $X_i$ be a Gamma random variable with density function $(1/\Gamma(\alpha))\lambda_i^{\alpha} x^{\alpha-1} e^{-\lambda_i x}$, $x > 0$, $i = 1, 2, \ldots, m$, and let $Y_i$ be a Gamma random variable with density function $(1/\Gamma(\alpha))\eta_i^{\alpha} x^{\alpha-1} e^{-\eta_i x}$, $x > 0$, $i = 1, 2, \ldots, m$. Here $\alpha \ge 1$, and the $\lambda_i$'s and the $\eta_i$'s are positive parameters. If $(\lambda_1, \lambda_2, \ldots, \lambda_m) \succ (\eta_1, \eta_2, \ldots, \eta_m)$, then

\[
\sum_{i=1}^{m} X_i \ge_{\mathrm{disp}} \sum_{i=1}^{m} Y_i.
\]

The proof of the next example is omitted.

Example 3.B.37. Let $\{N(t), t \ge 0\}$ be a nonhomogeneous Poisson process with mean function $\Lambda$ (that is, $\Lambda(t) \equiv E[N(t)]$, $t \ge 0$), and let $T_1, T_2, \ldots$ be the successive epoch times. If $\Lambda$ is strictly increasing and concave, then

\[
T_n \le_{\mathrm{disp}} T_{n+1}, \quad n = 1, 2, \ldots.
\]

In the following example the idea of the proof of Theorem 3.B.26 is used. This example may be compared with Examples 1.B.24, 1.C.48, 2.A.22, 4.B.14, 6.B.41, 6.D.8, 6.E.13, and 7.B.13.

Example 3.B.38. Let $X$ and $Y$ be two absolutely continuous nonnegative random variables with survival functions $\bar{F}$ and $\bar{G}$, respectively. Denote $\Lambda_1 = -\log \bar{F}$ and $\Lambda_2 = -\log \bar{G}$. Consider two nonhomogeneous Poisson processes $N_1 = \{N_1(t), t \ge 0\}$ and $N_2 = \{N_2(t), t \ge 0\}$ with mean functions $\Lambda_1$ and $\Lambda_2$ (see Example 3.B.37), respectively. Let $T_{i,1}, T_{i,2}, \ldots$ be the successive epoch times of process $N_i$, $i = 1, 2$. Note that $X =_{\mathrm{st}} T_{1,1}$ and $Y =_{\mathrm{st}} T_{2,1}$. It turns out that the dispersive ordering of the first two epoch times implies the dispersive ordering of all the corresponding later epoch times; that is, it will be shown below that if $X \le_{\mathrm{disp}} Y$, then $T_{1,n} \le_{\mathrm{disp}} T_{2,n}$, $n \ge 1$. In order to see it, fix an $n \ge 1$, and denote by $F_{1,n}$ and $F_{2,n}$ the distribution functions of $T_{1,n}$ and $T_{2,n}$, respectively. Note from (1.B.24) that

\[
F_{1,n}(t) = \psi_n(F(t)) \quad \text{and} \quad F_{2,n}(t) = \psi_n(G(t)),
\]

where $\psi_n(u) \equiv \Gamma_n(-\log(1 - u))$, $u \in [0, 1]$. Therefore,

\[
F_{2,n}^{-1}(F_{1,n}(t)) - t = (\psi_n(G))^{-1}(\psi_n(F(t))) - t = G^{-1}(F(t)) - t, \quad t \ge 0.
\]

Thus, from (3.B.10) it is seen that $X \le_{\mathrm{disp}} Y$ if, and only if, $T_{1,n} \le_{\mathrm{disp}} T_{2,n}$.

In the following example it is shown that, under the proper conditions, random minima and maxima are ordered in the dispersive order sense; see related results in Examples 1.C.46, 4.B.16, 5.A.24, and 5.B.13.

Example 3.B.39. Let $X_1, X_2, \ldots$, and $Y_1, Y_2, \ldots$, each be a sequence of independent and identically distributed random variables with common distribution functions $F_{X_1}$ and $F_{Y_1}$, respectively, and common survival functions $\bar{F}_{X_1}$ and $\bar{F}_{Y_1}$, respectively. Let $N$ be a positive integer-valued random variable, independent of the $X_i$'s and of the $Y_i$'s, with a Laplace transform $L_N$.


Denote $X_{(1:N)} = \min\{X_1, X_2, \ldots, X_N\}$, $X_{(N:N)} = \max\{X_1, X_2, \ldots, X_N\}$, $Y_{(1:N)} = \min\{Y_1, Y_2, \ldots, Y_N\}$, and $Y_{(N:N)} = \max\{Y_1, Y_2, \ldots, Y_N\}$. The distribution functions of $X_{(N:N)}$ and $Y_{(N:N)}$ are given by

\[
F_{X_{(N:N)}}(x) = L_N(-\log F_{X_1}(x)), \quad x \ge 0,
\]

and

\[
F_{Y_{(N:N)}}(x) = L_N(-\log F_{Y_1}(x)), \quad x \ge 0.
\]

If $X_1 \le_{\mathrm{disp}} Y_1$, then, for $0 < \alpha \le \beta < 1$ we compute

\[
\begin{aligned}
F_{X_{(N:N)}}^{-1}(\beta) - F_{X_{(N:N)}}^{-1}(\alpha)
&= F_{X_1}^{-1}\!\left(e^{-L_N^{-1}(\beta)}\right) - F_{X_1}^{-1}\!\left(e^{-L_N^{-1}(\alpha)}\right) \\
&\le F_{Y_1}^{-1}\!\left(e^{-L_N^{-1}(\beta)}\right) - F_{Y_1}^{-1}\!\left(e^{-L_N^{-1}(\alpha)}\right) \\
&= F_{Y_{(N:N)}}^{-1}(\beta) - F_{Y_{(N:N)}}^{-1}(\alpha).
\end{aligned}
\]

Therefore $X_{(N:N)} \le_{\mathrm{disp}} Y_{(N:N)}$. Similarly it can be shown that if $X_1 \le_{\mathrm{disp}} Y_1$, then $X_{(1:N)} \le_{\mathrm{disp}} Y_{(1:N)}$.

Example 3.B.40. Let $X$ (respectively, $Y$) have the central $t$-distribution with $\nu_X$ (respectively, $\nu_Y$) degrees of freedom. If $\nu_X \le \nu_Y$, then $X \ge_{\mathrm{disp}} Y$.

Example 3.B.41. As in Example 1.C.59, for nonnegative absolutely continuous random variables $X$ and $Y$, let $X^w$ and $Y^w$ be the random variables with the weighted density functions $f_w$ and $g_w$ given in (1.C.17) and (1.C.18). Suppose that $X \le_{\mathrm{disp}} Y$. If $X$ is DFR, if $Y$ is IFR, and if $w$ is decreasing and convex, then $X^w \le_{\mathrm{disp}} Y^w$.

Analogous to the result in Remark 1.A.18, it can be shown that a certain quotient set of all distribution functions on $\mathbb{R}$ is a lattice with respect to the order $\le_{\mathrm{disp}}$.

A consequence of the order $\le_{\mathrm{disp}}$ is given in the next theorem. It is a motivation for a multivariate dispersion order that is described in Chapter 7.

Theorem 3.B.42. Let $X$ and $X'$ be two independent and identically distributed random variables and let $Y$ and $Y'$ be two other independent and identically distributed random variables. If $X \le_{\mathrm{disp}} Y$, then $|X - X'| \le_{\mathrm{st}} |Y - Y'|$, that is,

\[
P\{|X - X'| > z\} \le P\{|Y - Y'| > z\} \quad \text{for all } z \ge 0. \tag{3.B.34}
\]

Proof. Denote the common distribution function of $X$ and $X'$ [respectively, $Y$ and $Y'$] by $F$ [$G$]. Select a $z \ge 0$. Then

\[
\begin{aligned}
P\{|X - X'| \le z\}
&= \int_{-\infty}^{\infty} [F(x + z) - F(x - z)]\,dF(x) \\
&= \int_0^1 \{F[F^{-1}(u) + z] - F[F^{-1}(u) - z]\}\,du \\
&\ge \int_0^1 \{G[G^{-1}(u) + z] - G[G^{-1}(u) - z]\}\,du \\
&= P\{|Y - Y'| \le z\},
\end{aligned}
\]

where the inequality is a consequence of (3.B.4) and (3.B.5). This proves (3.B.34).
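The quantile-integral representation used in the proof can be evaluated numerically. The Python sketch below (an illustration, not from the original text) applies a midpoint rule to exponential random variables with rates 2 and 1. For independent and identically distributed exponentials $|X - X'|$ is again exponential with the same rate, which provides exact values for comparison. The function names, the number of cells, and the chosen values of $z$ are assumptions of the example.

```python
import math

def exp_cdf(rate):
    return lambda x: 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0

def exp_quantile(rate):
    return lambda u: -math.log(1.0 - u) / rate

def prob_abs_diff_leq(cdf, quantile, z, cells=20000):
    """Midpoint-rule value of the integral over u in (0, 1) of
    F(F^{-1}(u) + z) - F(F^{-1}(u) - z), which equals P{|X - X'| <= z}."""
    total = 0.0
    for k in range(cells):
        x = quantile((k + 0.5) / cells)
        total += cdf(x + z) - cdf(x - z)
    return total / cells

zs = [0.1, 0.5, 1.0, 2.0]
p_x = [prob_abs_diff_leq(exp_cdf(2.0), exp_quantile(2.0), z) for z in zs]  # X ~ Exp(rate 2)
p_y = [prob_abs_diff_leq(exp_cdf(1.0), exp_quantile(1.0), z) for z in zs]  # Y ~ Exp(rate 1)

for z, a, b in zip(zs, p_x, p_y):
    # Exact values are 1 - exp(-2 z) for X and 1 - exp(-z) for Y.
    print(z, round(a, 6), round(b, 6), a >= b)
```

Since $X \le_{\mathrm{disp}} Y$ here, the computed probabilities for $X$ dominate those for $Y$ at every $z$, in line with (3.B.34).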

3.C The Excess Wealth Order

3.C.1 Motivation and definition

Let $X$ be a nonnegative random variable with distribution function $F$ and with a finite mean. Recall from (3.A.43) the definition of the Lorenz curve. The nonstandardized (or the generalized) Lorenz curve $\tilde{L}_X$, corresponding to $X$, is defined as

\[
\tilde{L}_X(p) = \int_0^p F^{-1}(u)\,du, \quad p \in [0, 1].
\]

Note that the requirement that $X$ is nonnegative is not needed in order for $\tilde{L}_X$ to be well defined. Thus, in this section we will not assume the nonnegativity of the discussed random variables, unless stated otherwise.

For a nonnegative random variable $X$ with a finite mean, a transform that is closely related to the nonstandardized Lorenz curve is the transform $T_X$ defined as

\[
T_X(p) = \int_0^{F^{-1}(p)} \bar{F}(x)\,dx, \quad p \in [0, 1].
\]

The transform $T_X$ is called the TTT transform, and is denoted by $H_F^{-1}$ in (1.A.19).

A third transform, that is related to the nonstandardized Lorenz curve and to the TTT transform, and which will be heavily used in this section, is the excess wealth transform $W_X$ defined as

\[
W_X(p) = \int_{F^{-1}(p)}^{\infty} \bar{F}(x)\,dx, \quad p \in (0, 1].
\]

Note that it is not necessary for the random variable $X$ to be nonnegative in order for $W_X$ to be well defined; it is only required that $X$ has a finite mean. This useful property of the excess wealth transform is one of the main reasons for its applicability as a tool that defines a stochastic order.

The transforms $\tilde{L}_X$, $T_X$, and $W_X$, when $X$ is nonnegative with a finite mean, are depicted in Figure 3.C.1. For $p \in (0, 1)$ the value $\tilde{L}_X(p)$ is depicted as the area of the region $A$ in the figure. The value $T_X(p)$ is the area of $A \cup B$, and the value $W_X(p)$ is the area of $C$. Note that the area of $A \cup B \cup C$ is $EX$.

[Fig. 3.C.1. Depiction of $\tilde{L}_X(p)$, $T_X(p)$, and $W_X(p)$: the graph of $F$ against $x$, with the level $p$ marked on the vertical axis, the point $F^{-1}(p)$ marked on the horizontal axis, and the regions $A$, $B$, and $C$.]

The order which is determined by the pointwise comparison of the excess wealth transforms of two random variables is of interest in this section. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$. Assume that

\[
W_X(p) \equiv \int_{F^{-1}(p)}^{\infty} \bar{F}(x)\,dx \le \int_{G^{-1}(p)}^{\infty} \bar{G}(x)\,dx \equiv W_Y(p) \quad \text{for all } p \in (0, 1). \tag{3.C.1}
\]

Then $X$ is said to be smaller than $Y$ in the excess wealth order (denoted as $X \le_{\mathrm{ew}} Y$).

Note that since $F^{-1}(p) = \bar{F}^{-1}(1 - p)$ and $G^{-1}(p) = \bar{G}^{-1}(1 - p)$, we see that $X \le_{\mathrm{ew}} Y$ if, and only if,

\[
\int_{\bar{F}^{-1}(p)}^{\infty} \bar{F}(x)\,dx \le \int_{\bar{G}^{-1}(p)}^{\infty} \bar{G}(x)\,dx \quad \text{for all } p \in (0, 1).
\]

If we define $\Psi_X(y) = \int_y^{\infty} \bar{F}(x)\,dx$ and $\Psi_Y(y) = \int_y^{\infty} \bar{G}(x)\,dx$, $y \in \mathbb{R}$, then $X \le_{\mathrm{ew}} Y$ if, and only if,

\[
\Psi_Y^{-1}(z) - \Psi_X^{-1}(z) \quad \text{is decreasing in } z \ge 0. \tag{3.C.2}
\]
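The three transforms introduced in this subsection can be evaluated numerically for a concrete distribution. The Python sketch below (an illustration, not from the original text) uses an exponential random variable with mean 1 and a composite midpoint rule. The truncation point `upper`, the number of cells, and the function names are assumptions of the example.

```python
import math

def midpoint(f, a, b, cells=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / cells
    return h * sum(f(a + (k + 0.5) * h) for k in range(cells))

def quantile(u):        # F^{-1}(u) for X ~ Exp(mean 1)
    return -math.log(1.0 - u)

def survival(x):        # survival function of X
    return math.exp(-x)

def lorenz(p):          # nonstandardized Lorenz curve: integral of F^{-1} over [0, p]
    return midpoint(quantile, 0.0, p)

def ttt(p):             # TTT transform: integral of the survival function over [0, F^{-1}(p)]
    return midpoint(survival, 0.0, quantile(p))

def excess_wealth(p, upper=60.0):
    # Integral of the survival function over [F^{-1}(p), infinity),
    # truncated at `upper`, beyond which exp(-x) is negligible.
    return midpoint(survival, quantile(p), upper)

for p in (0.25, 0.5, 0.9):
    L, T, W = lorenz(p), ttt(p), excess_wealth(p)
    # Closed forms for Exp(1): L = p + (1 - p) log(1 - p), T = p, W = 1 - p.
    print(p, round(L, 6), round(T, 6), round(W, 6), round(T + W, 6))
```

In this example $T_X(p) + W_X(p)$ reproduces the mean $EX = 1$, which matches the decomposition of the area $A \cup B \cup C$ described for Figure 3.C.1.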

In order to obtain another characterization of the $\le_{\mathrm{ew}}$ order, rewrite (3.C.1) as

\[
\int_p^1 \left[F^{-1}(u) - F^{-1}(p)\right] du \le \int_p^1 \left[G^{-1}(u) - G^{-1}(p)\right] du \tag{3.C.3}
\]

(this can be formally verified by Fubini's Theorem or, informally, by rewriting the area of the region $C$ in Figure 3.C.1 as the left-hand side above). It is thus seen that $X \le_{\mathrm{ew}} Y$ if, and only if,

\[
G^{-1}(p) - F^{-1}(p) \le \frac{1}{1 - p} \int_p^1 \left[G^{-1}(u) - F^{-1}(u)\right] du, \quad p \in (0, 1).
\]

By a straightforward differentiation it can be verified that the latter is equivalent to

\[
\frac{1}{1 - p} \int_p^1 \left[G^{-1}(u) - F^{-1}(u)\right] du \quad \text{is increasing in } p \in (0, 1). \tag{3.C.4}
\]

Thus, $X \le_{\mathrm{ew}} Y$ if, and only if, (3.C.4) holds.
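The differentiation step behind (3.C.4) can be spelled out as follows. This is a sketch that assumes the quantile functions are continuous at $p$, so that the integral can be differentiated there; the symbols $D$ and $A$ are shorthand introduced here for illustration and do not appear in the original text.

```latex
\[
D(u) \equiv G^{-1}(u) - F^{-1}(u), \qquad
A(p) \equiv \frac{1}{1-p}\int_p^1 D(u)\,du .
\]
\[
A'(p) = \frac{1}{(1-p)^2}\int_p^1 D(u)\,du - \frac{D(p)}{1-p}
      = \frac{A(p) - D(p)}{1-p}.
\]
```

Since $1 - p > 0$, the derivative $A'(p)$ is nonnegative exactly when $D(p) \le A(p)$, which is the inequality displayed just before (3.C.4).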

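Let $m_X$ and $m_Y$, defined by

\[
m_X(t) \equiv \frac{\int_t^{\infty} \bar{F}(x)\,dx}{\bar{F}(t)} \quad \text{and} \quad m_Y(t) \equiv \frac{\int_t^{\infty} \bar{G}(x)\,dx}{\bar{G}(t)}
\]

(for $t$'s for which the denominators are not 0), denote the mean residual life functions associated with $X$ and $Y$ (see (2.A.1)). Then it is seen that $X \le_{\mathrm{ew}} Y$ if, and only if,

\[
m_X(F^{-1}(p)) \le m_Y(G^{-1}(p)), \quad p \in (0, 1). \tag{3.C.5}
\]

Also, $X \le_{\mathrm{ew}} Y$ if, and only if,

\[
m_X(\bar{F}^{-1}(p)) \le m_Y(\bar{G}^{-1}(p)), \quad p \in (0, 1). \tag{3.C.6}
\]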
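The equivalence between (3.C.1) and (3.C.5) can be seen on a pair with closed-form transforms. In the Python sketch below (an illustration, not from the original text) $X$ is uniform on $(0, 1)$ and $Y$ is exponential with mean 1; the closed forms in the comments follow from direct integration, and the function names and the grid are assumptions of the example.

```python
import math

def w_uniform(p):        # excess wealth transform of X ~ Uniform(0, 1): (1 - p)^2 / 2
    return (1.0 - p) ** 2 / 2.0

def w_exponential(p):    # excess wealth transform of Y ~ Exp(mean 1): 1 - p
    return 1.0 - p

def mrl_uniform(t):      # mean residual life of X at t in [0, 1)
    return (1.0 - t) / 2.0

def mrl_exponential(t):  # mean residual life of Y, constant by memorylessness
    return 1.0

grid = [k / 100 for k in range(1, 100)]

# (3.C.1): pointwise comparison of the excess wealth transforms.
ew_by_transform = all(w_uniform(p) <= w_exponential(p) for p in grid)
# (3.C.5): comparison of mean residual lives at matching quantiles;
# F^{-1}(p) = p for the uniform and G^{-1}(p) = -log(1 - p) for the exponential.
ew_by_mrl = all(mrl_uniform(p) <= mrl_exponential(-math.log(1.0 - p)) for p in grid)
# For continuous F, W_X(p) = (1 - p) * m_X(F^{-1}(p)).
consistent = all(abs(w_uniform(p) - (1.0 - p) * mrl_uniform(p)) < 1e-12 for p in grid)
print(ew_by_transform, ew_by_mrl, consistent)
```

Both criteria agree here because, for a continuous distribution function, $W_X(p) = (1 - p)\,m_X(F^{-1}(p))$, so dividing (3.C.1) by $1 - p$ gives (3.C.5).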

Another characterization of the excess wealth order is given in Theorem 4.A.43.

Like the convex and the dispersive orders (see Theorems 3.A.7 and 3.B.3), the excess wealth order can be characterized by means of Yaari functionals $V_h$ defined in (3.A.31). Recall that an increasing function $h : [0, 1] \to [0, 1]$ is starshaped if $h(t)/t$ is increasing on $[0, 1]$.

Theorem 3.C.1. Let $X$ and $Y$ be two random variables with the same finite means. Then $X \le_{\mathrm{ew}} Y$ if, and only if,

\[
V_h(X) \le V_h(Y) \quad \text{for every starshaped probability transformation function } h.
\]

Jewitt [256] considered an order, called the location independent riskier order, that can be denoted by $\le_{\mathrm{lir}}$. It is shown in Fagiuoli, Pellerey, and Shaked [188] that $X \le_{\mathrm{lir}} Y \iff -X \le_{\mathrm{ew}} -Y$. Thus every result that holds for the order $\le_{\mathrm{ew}}$ can be reworded by means of the order $\le_{\mathrm{lir}}$.

3.C.2 Properties

It is easy to verify that the excess wealth order is location-independent. That is,

\[
X \le_{\mathrm{ew}} Y \implies X + a \le_{\mathrm{ew}} Y \quad \text{for any } a \in \mathbb{R}.
\]

From (3.C.4) and Theorem 3.A.8 we see that

\[
X \le_{\mathrm{ew}} Y \implies X \le_{\mathrm{dil}} Y. \tag{3.C.7}
\]


It follows that if $EX = EY$, then

\[
X \le_{\mathrm{ew}} Y \implies X \le_{\mathrm{cx}} Y. \tag{3.C.8}
\]

Shaked and Shanthikumar [518] showed that if $X \le_{\mathrm{cx}} Y$, then it does not necessarily follow that $X \le_{\mathrm{ew}} Y$.

From (3.C.7), (3.A.32), and (3.A.4) it follows that

\[
X \le_{\mathrm{ew}} Y \implies \mathrm{Var}(X) \le \mathrm{Var}(Y),
\]

provided $\mathrm{Var}(Y) < \infty$.

From (3.C.3) and (3.B.1) we see that for random variables with finite means, we have

\[
X \le_{\mathrm{disp}} Y \implies X \le_{\mathrm{ew}} Y. \tag{3.C.9}
\]

A characterization of the excess wealth order, which is similar to the characterization of the dispersive order given in Theorem 3.B.2, is given next.

Theorem 3.C.2. Let $X$ and $Y$ be two random variables. Then $X \le_{\mathrm{ew}} Y$ if, and only if, for all increasing convex functions $\phi$ and $h$ such that $\phi$ and $\psi(\cdot) \equiv h(\phi(\cdot))$ are integrable with respect to the distribution of $Y$, and for every real number $c$, we have that

\[
E\phi(X - c) \le E\phi(Y) \implies E\psi(X - c) \le E\psi(Y).
\]

It is worthwhile to mention that two twice differentiable functions $\phi$ and $\psi$ satisfy $\psi(\cdot) \equiv h(\phi(\cdot))$ for some increasing convex function $h$ if, and only if, $\phi''/\phi' \le \psi''/\psi'$.

Another characterization of the excess wealth order is described in the following theorem. It is similar to the characterization of the convex order in Theorem 3.A.45. Below, for any random variable $Z$, the function $\Psi_Z$ is as defined before (3.C.2).

Theorem 3.C.3. Let $X$ and $Y$ be two random variables with equal means. Then $X \le_{\mathrm{ew}} Y$ if, and only if, there exist random variables $Z_1, Z_2, \ldots$, with distribution functions $F_1, F_2, \ldots$, such that $Z_1 =_{\mathrm{st}} X$, $EZ_j = EY$, $j = 1, 2, \ldots$, $\Psi_{Z_j}(x) \to \Psi_Y(x)$ as $j \to \infty$ for all $x \in \mathbb{R}$, and, for any $c \ge 0$, it holds that

\[
S^-\!\left(\bar{F}_j(\cdot) - \bar{F}_{j+1}(\cdot - c)\right) = 1 \text{ and the sign sequence is } +, -, \quad j = 1, 2, \ldots.
\]

An important closure property of the excess wealth order is given next.

Theorem 3.C.4. Let $X$ and $Y$ be two continuous random variables with finite means. Then, for any increasing convex function $\phi$, we have

\[
X \le_{\mathrm{ew}} Y \implies \phi(X) \le_{\mathrm{ew}} \phi(Y).
\]

In the next two results we describe some relationships between the orders $\le_{\mathrm{ew}}$ and $\le_{\mathrm{mrl}}$. We denote the left endpoint of the support of a random variable $X$ by $l_X$.


Theorem 3.C.5. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, respectively, with finite means, and with finite left endpoints $l_X$ and $l_Y$ such that $l_X \le l_Y$. If $X \le_{\mathrm{ew}} Y$, and if either $X$ or $Y$ or both are DMRL, then $X \le_{\mathrm{mrl}} Y$.

Proof. We only give the proof for the case when the distribution functions $F$ and $G$ of $X$ and $Y$ are continuous; the proof for the general case is similar, though notationally more complex. Let $(y_0, p_0)$, $(y_1, p_1)$, and $(y_2, p_2)$ be three consecutive points of crossing as in the proof of Theorem 3.A.5 (see Figure 3.A.1). Note that by the continuity assumption we have $p_i = F(y_i) = G(y_i)$, $i = 0, 1, 2$.

Suppose that $Y$ is DMRL. For $p \in [p_1, p_2]$ we have $F^{-1}(p) \le G^{-1}(p)$, and therefore, for such a $p$ we have

\[
m_X(F^{-1}(p)) \le m_Y(G^{-1}(p)) \le m_Y(F^{-1}(p)),
\]

where the first inequality follows from (3.C.5), and the second from the assumption that $Y$ is DMRL. Thus,

\[
m_X(y) \le m_Y(y) \quad \text{for } y \in [y_1, y_2]. \tag{3.C.10}
\]

If $X$ (rather than $Y$) is DMRL, then (3.C.10) follows from

\[
m_X(G^{-1}(p)) \le m_X(F^{-1}(p)) \le m_Y(G^{-1}(p)), \quad p \in [p_1, p_2],
\]

where the first inequality follows from the assumption that $X$ is DMRL, and the second from (3.C.5).

Since $y_0 = F^{-1}(p_0) = G^{-1}(p_0)$, from $X \le_{\mathrm{ew}} Y$ we also have that

\[
\int_{y_0}^{\infty} \bar{F}(x)\,dx \le \int_{y_0}^{\infty} \bar{G}(x)\,dx. \tag{3.C.11}
\]

Now let $y \in (y_0, y_1)$. For $x \in (y_0, y)$ we have $\bar{F}(x) \ge \bar{G}(x)$. Therefore

\[
\int_{y_0}^{y} \bar{F}(x)\,dx \ge \int_{y_0}^{y} \bar{G}(x)\,dx. \tag{3.C.12}
\]

Hence

\[
\begin{aligned}
\int_y^{\infty} \bar{F}(x)\,dx
&= \int_{y_0}^{\infty} \bar{F}(x)\,dx - \int_{y_0}^{y} \bar{F}(x)\,dx \\
&\le \int_{y_0}^{\infty} \bar{G}(x)\,dx - \int_{y_0}^{y} \bar{F}(x)\,dx \quad \text{[by (3.C.11)]} \\
&\le \int_{y_0}^{\infty} \bar{G}(x)\,dx - \int_{y_0}^{y} \bar{G}(x)\,dx \quad \text{[by (3.C.12)]} \\
&= \int_y^{\infty} \bar{G}(x)\,dx.
\end{aligned}
\]

Therefore

\[
\int_y^{\infty} \bar{F}(x)\,dx \le \int_y^{\infty} \bar{G}(x)\,dx \quad \text{for all } y \in [y_0, y_1].
\]

But since $\bar{F}(y) \ge \bar{G}(y)$ for $y \in [y_0, y_1]$, we see that

\[
\frac{\int_y^{\infty} \bar{F}(x)\,dx}{\bar{F}(y)} \le \frac{\int_y^{\infty} \bar{G}(x)\,dx}{\bar{F}(y)} \le \frac{\int_y^{\infty} \bar{G}(x)\,dx}{\bar{G}(y)}.
\]

So

\[
m_X(y) \le m_Y(y) \quad \text{for } y \in [y_0, y_1]. \tag{3.C.13}
\]

That is, from (3.C.10) and (3.C.13) we have

\[
m_X(y) \le m_Y(y) \quad \text{for } y \in [y_0, y_2].
\]

In order to complete the proof we need to show that the interval $[l_X, \infty)$ is a union of segments $[y_0, y_2)$ as above. Suppose that a last point of crossing of $F$ and $G$ exists, and denote it by $(y_l, p_l)$. Denote $(y_0, p_0) = (y_{l-1}, p_{l-1})$, $(y_1, p_1) = (y_l, p_l)$, and $(y_2, p_2) = (\infty, 1)$, where $(y_{l-1}, p_{l-1})$ is the point of the next to the last crossing of $F$ and $G$. From the facts that $F^{-1}(p_1) = G^{-1}(p_1)$, and that $X \le_{\mathrm{ew}} Y$ implies $\int_{F^{-1}(p_1)}^{\infty} \bar{F}(x)\,dx \le \int_{G^{-1}(p_1)}^{\infty} \bar{G}(x)\,dx$, it follows that $F$ crosses $G$ from below at $(y_1, p_1)$, and therefore the interval $[y_0, \infty)$ is of the type described above.

Now suppose that a first point of crossing of $F$ and $G$ exists, and denote it by $(y_f, p_f)$. If $l_X < l_Y$, then at the first point of crossing, $F$ crosses $G$ from above. Thus, from the above proof it follows that $m_X(y) \le m_Y(y)$ for all $y \ge y_f$. The proof that $m_X(y) \le m_Y(y)$ also for $y < y_f$ is similar to the proof of (3.C.10). If $l_X = l_Y$, then consider two possible cases: (a) in the first point of crossing, $F$ crosses $G$ from above, and (b) in the first point of crossing, $F$ crosses $G$ from below. In case (a) we obtain $m_X(y) \le m_Y(y)$ for all $y$, as we obtained it above when $l_X < l_Y$. In case (b) denote $y_0 = \sup\{y : F(y) = G(y)\}$, $p_0 = F(y_0)$, $(y_1, p_1) = (y_f, p_f)$, and $(y_2, p_2) = (y_{f+1}, p_{f+1})$, where $(y_{f+1}, p_{f+1})$ is the point of the second crossing of $F$ and $G$. The interval $[y_0, y_2)$ is of the kind described above, and therefore $m_X(y) \le m_Y(y)$ for $y \in [y_0, y_2)$, and from it it also follows that $m_X(y) \le m_Y(y)$ for $y < y_0$.

Theorem 3.C.6. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, respectively, with finite means, and with finite left endpoints $l_X$ and $l_Y$ such that $l_X \le l_Y$. If $X \le_{\mathrm{mrl}} Y$, and if either $X$ or $Y$ or both are IMRL, then $X \le_{\mathrm{ew}} Y$.

Proof. Again, we only give the proof for the case when the distribution functions $F$ and $G$ of $X$ and $Y$ are continuous; the proof for the general case is similar, though notationally more complex. Let $(y_0, p_0)$, $(y_1, p_1)$, and $(y_2, p_2)$ be three consecutive points of crossing as in the proof of Theorem 3.A.5 (see Figure 3.A.1).


Suppose that $Y$ is IMRL. For $p \in [p_1, p_2]$ we have $F^{-1}(p) \le G^{-1}(p)$, and therefore, for such a $p$ we have
$$m_X(F^{-1}(p)) \le m_Y(F^{-1}(p)) \le m_Y(G^{-1}(p)),$$
where the first inequality follows from $X \le_{\mathrm{mrl}} Y$, and the second from the assumption that $Y$ is IMRL. Thus,
$$m_X(F^{-1}(p)) \le m_Y(G^{-1}(p)) \quad \text{for } p \in [p_1, p_2]. \tag{3.C.14}$$
If $X$ (rather than $Y$) is IMRL, then (3.C.14) follows from
$$m_X(F^{-1}(p)) \le m_X(G^{-1}(p)) \le m_Y(G^{-1}(p)), \quad p \in [p_1, p_2],$$
where the first inequality follows from the assumption that $X$ is IMRL, and the second from $X \le_{\mathrm{mrl}} Y$. Since $y_0 = F^{-1}(p_0) = G^{-1}(p_0)$, from $X \le_{\mathrm{mrl}} Y$ we also have that
$$m_X(F^{-1}(p_0)) \le m_Y(G^{-1}(p_0)). \tag{3.C.15}$$

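Now let $p \in (p_0, p_1)$. Since $\bar F(x) \ge \bar G(x)$ for $x \in [y_0, y_1]$ we see that
$$\int_{y_0}^{F^{-1}(p)} \bar F(x)\,dx \ge \int_{y_0}^{F^{-1}(p)} \bar G(x)\,dx \ge \int_{y_0}^{G^{-1}(p)} \bar G(x)\,dx, \tag{3.C.16}$$
where the second inequality follows from $F^{-1}(p) \ge G^{-1}(p)$. Therefore
$$
\begin{aligned}
m_X(F^{-1}(p)) &= \frac{\int_{F^{-1}(p)}^\infty \bar F(x)\,dx}{1-p}
= \frac{\int_{y_0}^\infty \bar F(x)\,dx - \int_{y_0}^{F^{-1}(p)} \bar F(x)\,dx}{1-p} \\
&\le \frac{\int_{y_0}^\infty \bar G(x)\,dx - \int_{y_0}^{G^{-1}(p)} \bar G(x)\,dx}{1-p} \qquad [\text{by (3.C.15) and (3.C.16)}] \\
&= m_Y(G^{-1}(p)), \quad \text{for } p \in [p_0, p_1].
\end{aligned}
$$
So, from the preceding inequality and from (3.C.14) we obtain
$$m_X(F^{-1}(p)) \le m_Y(G^{-1}(p)) \quad \text{for } p \in [p_0, p_2].$$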

In order to complete the proof we need to show that the interval $[l_X, \infty)$ is a union of segments $[y_0, y_2)$ as above. Suppose that a last point of crossing of $F$ and $G$ exists, and denote it by $(y_l, p_l)$. Denote $(y_0, p_0) = (y_{l-1}, p_{l-1})$, $(y_1, p_1) = (y_l, p_l)$, and $(y_2, p_2) = (\infty, 1)$, where $(y_{l-1}, p_{l-1})$ is the point of the next to the last crossing of $F$ and $G$. From the facts that $F(y_1) = G(y_1)$, and that $X \le_{\mathrm{mrl}} Y$ implies $\int_{y_1}^\infty \bar F(x)\,dx \le \int_{y_1}^\infty \bar G(x)\,dx$, it follows that $F$ crosses $G$ from below at $(y_1, p_1)$, and therefore the interval $[y_0, \infty)$ is of the type described above.

Now suppose that a first point of crossing of $F$ and $G$ exists, and denote it by $(y_f, p_f)$. If $l_X < l_Y$, then at the first point of crossing, $F$ crosses $G$ from above. Thus, from the above proof it follows that $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for all $p \ge p_f$. The proof that $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ also for $p < p_f$ is similar to the proof of (3.C.14). If $l_X = l_Y$, then consider two possible cases: (a) at the first point of crossing, $F$ crosses $G$ from above, and (b) at the first point of crossing, $F$ crosses $G$ from below. In case (a) we obtain $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for all $p$, as we obtained it above when $l_X < l_Y$. In case (b) denote $y_0 = \sup\{y : F(y) = G(y)\}$, $p_0 = F(y_0)$, $(y_1, p_1) = (y_f, p_f)$, and $(y_2, p_2) = (y_{f+1}, p_{f+1})$, where $(y_{f+1}, p_{f+1})$ is the point of the second crossing of $F$ and $G$. The interval $[y_0, y_2)$ is of the kind described above, and therefore $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for $p \in [p_0, p_2)$, and from this it also follows that $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for $p \le p_0$.

In summary, we have shown that $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for all $p \in (0, 1)$. Therefore $X \le_{\mathrm{ew}} Y$ by (3.C.5).
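The condition established at the end of the proof, $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for all $p \in (0, 1)$, can be checked numerically in a simple case. The following is a minimal sketch, not part of the original text: it assumes two exponential random variables (rates 2 and 1, chosen only for illustration), for which the mean residual life is constant, so both are IMRL, $X \le_{\mathrm{mrl}} Y$, and $l_X = l_Y = 0$. The function names, the trapezoidal rule, and the tail truncation are all assumptions of the sketch.

```python
import math

def mean_residual_life(survival, y, upper, steps=20000):
    """Approximate m(y) = (1 / survival(y)) * integral of survival over [y, upper].

    The trapezoidal rule is used, and `upper` truncates the infinite upper limit.
    """
    h = (upper - y) / steps
    total = 0.5 * (survival(y) + survival(upper))
    for k in range(1, steps):
        total += survival(y + k * h)
    return total * h / survival(y)

def exponential_mrl_at_quantiles(rate, probs):
    """Evaluate m(F^{-1}(p)) for an exponential distribution with the given rate."""
    survival = lambda x: math.exp(-rate * x)
    quantile = lambda p: -math.log(1.0 - p) / rate
    # Truncate the tail 40 mean lifetimes past the quantile (hypothetical cutoff).
    return [mean_residual_life(survival, quantile(p), quantile(p) + 40.0 / rate)
            for p in probs]

probs = [0.1, 0.25, 0.5, 0.75, 0.9]
m_x = exponential_mrl_at_quantiles(2.0, probs)  # X ~ exponential, rate 2 (mean 1/2)
m_y = exponential_mrl_at_quantiles(1.0, probs)  # Y ~ exponential, rate 1 (mean 1)
condition_holds = all(a <= b for a, b in zip(m_x, m_y))
print(condition_holds)
```

A grid of five values of $p$ is of course not a proof; the sketch only makes the quantities compared in the proof concrete.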

The following few results give conditions under which the order $\le_{\mathrm{ew}}$ is closed under convolutions.

Theorem 3.C.7. Let $(X_i, Y_i)$, $i = 1, 2, \ldots, m$, be independent pairs of random variables such that $X_i \le_{\mathrm{ew}} Y_i$, $i = 1, 2, \ldots, m$. If $X_i, Y_i$, $i = 1, 2, \ldots, m$, all have (continuous or discrete) logconcave densities, except possibly one $X_l$ and one $Y_k$ ($l \ne k$), then
$$\sum_{i=1}^m X_i \le_{\mathrm{ew}} \sum_{i=1}^m Y_i.$$

In order to prove Theorem 3.C.7 one first proves that if $X$ and $Y$ are two random variables such that $X \le_{\mathrm{ew}} Y$, and if $Z$ is a random variable with logconcave density that is independent of $X$ and of $Y$, then $X + Z \le_{\mathrm{ew}} Y + Z$. The statement of the theorem can then be derived from the fact that a convolution of random variables with logconcave densities has a logconcave density. We do not give the details here.

The next two results are analogs of Theorems 3.B.7 and 3.B.8.

Theorem 3.C.8. The random variable $X$ satisfies
$$X \le_{\mathrm{ew}} X + Y \quad \text{for any random variable } Y \text{ independent of } X$$
if, and only if, $X$ is IFR.

Theorem 3.C.9. Let $Z$ be a random variable. Then
$$X + Z \le_{\mathrm{ew}} Y + Z \quad \text{whenever } X \le_{\mathrm{disp}} Y \text{ and } Z \text{ is independent of } X \text{ and } Y$$
if, and only if, $Z$ is IFR.


Since a convolution of IFR random variables is IFR (see Corollary 1.B.39), repeated application of Theorem 3.C.9 yields the following result, which is an analog of Theorem 3.B.9.

Theorem 3.C.10. Let $X_1, X_2, \ldots, X_n$ be a set of independent random variables, and let $Y_1, Y_2, \ldots, Y_n$ be another set of independent random variables. If the $X_i$'s and the $Y_i$'s are all IFR, and if $X_i \le_{\mathrm{disp}} Y_i$, $i = 1, 2, \ldots, n$, then
$$\sum_{i=1}^n X_i \le_{\mathrm{ew}} \sum_{i=1}^n Y_i.$$

An interesting closure property of the order $\le_{\mathrm{ew}}$ is given next.

Theorem 3.C.11. Let $X_1, X_2, \ldots$ be a collection of independent and identically distributed random variables, and let $Y_1, Y_2, \ldots$ be another collection of independent and identically distributed random variables. Also, let $N$ be a positive, integer-valued, random variable, independent of the $X_i$'s and of the $Y_i$'s. If $X_1 \le_{\mathrm{ew}} Y_1$, then
$$\max\{X_1, X_2, \ldots, X_N\} \le_{\mathrm{ew}} \max\{Y_1, Y_2, \ldots, Y_N\}.$$

The following result may be compared to Theorem 3.B.31. By (3.C.9), we assume below less than is assumed in Theorem 3.B.31, but the conclusion is weaker. We use below the notation for spacings that was used in Theorem 3.B.31.

Theorem 3.C.12. Let $X$ and $Y$ be two random variables. If $X \le_{\mathrm{ew}} Y$, then
$$EU_{(n-1:n)} \le EV_{(n-1:n)} \quad \text{for } n = 2, 3, \ldots.$$

The order $\le_{\mathrm{ew}}$ can be used to characterize DMRL and IMRL random variables. The following result may be compared with Theorems 2.A.23, 2.B.17, 3.A.56, and 4.A.51. As in Section 1.A.3, $[Z \mid A]$ denotes any random variable that has as its distribution the conditional distribution of $Z$ given $A$.

Theorem 3.C.13. Let $X$ be a continuous random variable with a finite left endpoint of its support $l_X$. Then $X$ is DMRL [IMRL] if, and only if, any one of the following equivalent conditions holds:
(i) $[X - t \mid X > t] \ge_{\mathrm{ew}} [\le_{\mathrm{ew}}]\ [X - t' \mid X > t']$ whenever $t' \ge t \ge l_X$.
(ii) $X \ge_{\mathrm{ew}} [\le_{\mathrm{ew}}]\ [X - t \mid X > t]$ for all $t \ge l_X$ (when $l_X = 0$).

The proof of this result is omitted.

3.D The Peakedness Order

3.D.1 Definition

In this section we discuss a variability order that applies to random variables with symmetric distribution functions. This is one of the oldest (if not the oldest) variability notions that can be found in the literature. It stochastically compares random variables according to their distance from their center of symmetry.

Let $X$ be a random variable with a distribution function that is symmetric about $\mu$, and let $Y$ be another random variable with a distribution function that is symmetric about $\nu$. Suppose that
$$|X - \mu| \le_{\mathrm{st}} |Y - \nu|.$$
Then $X$ is said to be smaller than $Y$ in the peakedness order (denoted by $X \le_{\mathrm{peak}} Y$). Note that, in the literature, often $X$ is said to be more peaked about $\mu$ than $Y$ about $\nu$ if $X \le_{\mathrm{peak}} Y$.

The following result is easy to prove.

Theorem 3.D.1. Let $X$ and $Y$ be two random variables with different distribution functions, but with the same mean. Suppose that the distribution functions $F$ and $G$, of $X$ and $Y$, respectively, are symmetric about the common mean. Then $X \le_{\mathrm{peak}} Y$ if, and only if, $S^-(G - F) = 1$ and the sign sequence is $+, -$, where $S^-$ is defined in (1.A.18).

3.D.2 Some properties

The peakedness order satisfies some desirable closure properties. For example, it is easy to verify the following result.

Theorem 3.D.2. Let $X$ be a random variable with a symmetric distribution function. Then
$$X \le_{\mathrm{peak}} aX \quad \text{whenever } a \ge 1.$$

The closure results in the next theorem can also be easily verified.

Theorem 3.D.3. (a) Let $X$, $Y$, and $\Theta$ be random variables such that the distribution functions of $[X \mid \Theta = \theta]$ are symmetric about some $\mu$ (which is independent of $\theta$) and the distribution functions of $[Y \mid \Theta = \theta]$ are symmetric about some $\nu$ (which is also independent of $\theta$) and such that $[X \mid \Theta = \theta] \le_{\mathrm{peak}} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Then $X \le_{\mathrm{peak}} Y$. That is, the peakedness order is closed under mixtures.
(b) Let $\{X_j, j = 1, 2, \ldots\}$ and $\{Y_j, j = 1, 2, \ldots\}$ be two sequences of random variables with symmetric distribution functions such that $X_j \to_{\mathrm{st}} X$ and $Y_j \to_{\mathrm{st}} Y$ as $j \to \infty$. If $X_j \le_{\mathrm{peak}} Y_j$, $j = 1, 2, \ldots$, then $X \le_{\mathrm{peak}} Y$.

The peakedness order is also closed under convolutions of random variables that have unimodal symmetric distribution functions (that is, with mode at their center of symmetry). This is shown next.


Theorem 3.D.4. Let $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ be two sets of independent random variables, all having distribution functions that are symmetric about possibly different centers, and all having unimodal densities with possibly some probability mass at their respective centers. If $X_i \le_{\mathrm{peak}} Y_i$ for $i = 1, 2, \ldots, n$, then
$$\sum_{i=1}^n X_i \le_{\mathrm{peak}} \sum_{i=1}^n Y_i.$$
In particular, $\bar X \le_{\mathrm{peak}} \bar Y$, where $\bar X$ and $\bar Y$ denote the corresponding sample means.

Proof. Without loss of generality we may assume that all the centers of the $X_i$'s and of the $Y_i$'s are 0. First we prove the result for $n = 2$. Let $F_1$, $F_2$, $G_1$, and $G_2$ denote the distribution functions of $X_1$, $X_2$, $Y_1$, and $Y_2$, respectively. Select an $a > 0$. Then
$$
\begin{aligned}
P\{|X_1 + X_2| \le a\} &= 2\int_0^\infty [F_1(x + a) - F_1(x - a)]\,dF_2(x) \\
&\ge 2\int_0^\infty [F_1(x + a) - F_1(x - a)]\,dG_2(x) \\
&= 2\int_0^\infty [G_2(x + a) - G_2(x - a)]\,dF_1(x) \\
&\ge 2\int_0^\infty [G_2(x + a) - G_2(x - a)]\,dG_1(x) \\
&= P\{|Y_1 + Y_2| \le a\},
\end{aligned}
$$
where the first inequality follows from the unimodality of $X_1$ (therefore, the integrand is decreasing in $x \ge 0$) and from $X_2 \le_{\mathrm{peak}} Y_2$, and the second inequality follows from the unimodality of $Y_2$ and from $X_1 \le_{\mathrm{peak}} Y_1$. This proves the result for $n = 2$. The general result can be obtained by a simple induction together with the observation that a sum of independent random variables, all having distribution functions that are symmetric about 0 and all having unimodal densities, also has a unimodal density symmetric about 0.
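The integral representation used in the first line of the proof can be sanity-checked numerically. The sketch below is an illustration only and rests on assumptions not in the text: all variables are taken to be centered normal (so that $dF_2(x) = f_2(x)\,dx$ and the sum is again normal, giving an exact value to compare against), the integral is truncated at 12 standard deviations, and the function names are hypothetical.

```python
import math

def normal_cdf(x, sigma):
    """Distribution function of N(0, sigma^2)."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def normal_pdf(x, sigma):
    """Density of N(0, sigma^2)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def prob_abs_sum_at_most(a, sigma1, sigma2, steps=20000):
    """2 * integral_0^inf [F1(x + a) - F1(x - a)] f2(x) dx, by the trapezoidal rule.

    F1 is the N(0, sigma1^2) distribution function and f2 the N(0, sigma2^2)
    density; the integral is truncated at 12 * sigma2.
    """
    upper = 12.0 * sigma2
    h = upper / steps

    def integrand(x):
        return (normal_cdf(x + a, sigma1) - normal_cdf(x - a, sigma1)) * normal_pdf(x, sigma2)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    for k in range(1, steps):
        total += integrand(k * h)
    return 2.0 * total * h

def exact_prob(a, sigma1, sigma2):
    """P{|S| <= a} for S ~ N(0, sigma1^2 + sigma2^2)."""
    return math.erf(a / math.sqrt(2.0 * (sigma1 ** 2 + sigma2 ** 2)))

grid = [0.5, 1.0, 2.0, 4.0]
# X1, X2 ~ N(0, 1); Y1 ~ N(0, 4), Y2 ~ N(0, 9), so X_i <=_peak Y_i for i = 1, 2.
p_x = [prob_abs_sum_at_most(a, 1.0, 1.0) for a in grid]
p_y = [prob_abs_sum_at_most(a, 2.0, 3.0) for a in grid]
formula_matches = all(abs(p - exact_prob(a, 1.0, 1.0)) < 1e-6 for p, a in zip(p_x, grid))
sum_is_more_peaked = all(px >= py for px, py in zip(p_x, p_y))
```

Here `formula_matches` compares the integral with the closed form for a normal sum, and `sum_is_more_peaked` checks $P\{|X_1 + X_2| \le a\} \ge P\{|Y_1 + Y_2| \le a\}$ on the grid, which is the conclusion of the theorem for $n = 2$ in this special case.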

If $X_1, X_2, \ldots$ are independent and identically distributed random variables, then, for each $n$, we denote by $\bar X_n$ the sample mean of $X_1, X_2, \ldots, X_n$. That is, $\bar X_n = (X_1 + X_2 + \cdots + X_n)/n$. In Example 3.A.29 it is shown that $\bar X_n \le_{\mathrm{cx}} \bar X_{n-1}$. The following result shows that a similar property holds for the peakedness order under an additional condition.

Theorem 3.D.5. If $X_1, X_2, \ldots$ are independent and identically distributed random variables, having a common logconcave density function that is symmetric about a common value, then for each $n \ge 2$ one has
$$\bar X_n \le_{\mathrm{peak}} \bar X_{n-1}.$$


A relationship between the dispersive and the peakedness orders is described next.

Theorem 3.D.6. Let $X$ and $Y$ be two random variables having distribution functions that are symmetric about possibly different centers. If $X \le_{\mathrm{disp}} Y$, then $X \le_{\mathrm{peak}} Y$.
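For continuous symmetric distributions the defining relation $|X - \mu| \le_{\mathrm{st}} |Y - \nu|$ amounts to $F(\mu + a) - F(\mu - a) \ge G(\nu + a) - G(\nu - a)$ for all $a > 0$, which is easy to test on a grid. The following is a rough sketch under that continuity assumption; the helper names, the grid, and the tolerance are hypothetical, and a finite grid can only refute the order, not prove it.

```python
import math

def uniform_cdf(lo, hi):
    """Distribution function of the uniform distribution on (lo, hi)."""
    return lambda x: min(max((x - lo) / (hi - lo), 0.0), 1.0)

def normal_cdf(mu, sigma):
    """Distribution function of N(mu, sigma^2)."""
    return lambda x: 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def central_mass(cdf, center, a):
    """P{|X - center| <= a} for a continuous X with distribution function cdf."""
    return cdf(center + a) - cdf(center - a)

def peak_leq(cdf_x, mu, cdf_y, nu, grid, tol=1e-12):
    """Grid check of X <=_peak Y: X puts at least as much mass within a of its center."""
    return all(central_mass(cdf_x, mu, a) >= central_mass(cdf_y, nu, a) - tol
               for a in grid)

grid = [0.1 * k for k in range(1, 101)]

# Theorem 3.D.2 with a = 2: X ~ uniform(-1, 1) and 2X ~ uniform(-2, 2).
scale_ok = peak_leq(uniform_cdf(-1.0, 1.0), 0.0, uniform_cdf(-2.0, 2.0), 0.0, grid)
scale_reversed = peak_leq(uniform_cdf(-2.0, 2.0), 0.0, uniform_cdf(-1.0, 1.0), 0.0, grid)

# Different centers: N(5, 1) versus N(-1, 9).
normal_ok = peak_leq(normal_cdf(5.0, 1.0), 5.0, normal_cdf(-1.0, 3.0), -1.0, grid)
```

The second pair illustrates that the order ignores location: only the spread about each center matters.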

3.E Complements

Section 3.A: For historical reasons, the convex order is sometimes referred to as “dilation.” However, in recent literature the order defined in (3.A.32) is often called the dilation order. Some standard references about the convex order are Ross [475] and Müller and Stoyan [419], where many of the results described in Section 3.A can be found. Another monograph that studies the convex order (under the guise of the Lorenz order) is Arnold [19], and many of the results in this section that deal directly with the Lorenz order can be found there. The proof of Theorem 3.A.2 is taken from Muñoz-Perez and Sanches-Gomez [421]; an alternative proof, using ideas from the area of comparison of experiments, can be found in Torgersen [551, page 369]. Result (3.A.12) is taken from Hickey [223]. The present version of the characterization of the convex order given in Theorem 3.A.4 is taken from Müller and Rüschendorf [415]. The characterization of the convex order in Theorem 3.A.5 can be found in Fagiuoli, Pellerey, and Shaked [188]; see also Levy and Kroll [346] and Ramos and Sordo [463]. The characterization of the convex order by means of Yaari functionals (Theorem 3.A.7) can be found in Chateauneuf, Cohen, and Meilijson [127]. The characterization of the dilation order, given in Theorem 3.A.8, is taken from Fagiuoli, Pellerey, and Shaked [188]. The characterization given in Theorem 3.A.9 can be found in Ramos and Sordo [463]. The characterization of the Lorenz order by means of the Lorenz zonoids (Theorem 3.A.11) is taken from Arnold [20]. The result about the convex ordering of random sums (Theorem 3.A.13) is a special case of a result of Jean-Marie and Liu [254]; the extensions of it when the underlying random variables are identically distributed (Theorems 3.A.14–3.A.16) are taken from Pellerey [450]. Theorem 3.A.17 and some related results can be found in Berger [79].
The property of the increase in the dilation order with an increase in the scale (Theorem 3.A.18) is taken from Hickey [223]. The result about the dilation ordering of two differences (Theorem 3.A.19) can be found in Kochar and Carrière [312]. The convex order lower bound on $\sum_{i=1}^n X_i$, given in Theorem 3.A.20, is taken from Vyncke, Goovaerts, De Schepper, Kaas, and Dhaene [557]. The property of inheritance of the convex order from the mixing random variables to the mixed ones (Theorem 3.A.21) can be found in Schweder [499]; its variation, Theorem 3.A.23, is taken from Kottas and Gelfand [323]. The property of the preservation of the convex order under products of nonnegative random variables (Corollary 3.A.22) can be found in Whitt [562]. The Lorenz order comparison of $g(X)$ and $h(X)$ (Theorem 3.A.26) can be found in Wilfling [566]. The relationship between the orders $\le_{\mathrm{Lorenz}}$ and $\le_{\mathrm{hmrl}}$, given in Theorem 3.A.28, is taken from Lefèvre and Utev [340]. The result about the convex ordering of the sample means (Example 3.A.29) can be found in Marshall and Olkin [383, page 288]. Its generalizations (Theorem 3.A.30 and Corollary 3.A.31) are taken from Denuit and Vermandele [158] and from O’Cinneide [439]. The convex order comparison of scaled Poisson random variables (Example 3.A.32) is inspired by a result at the end of page 1078 in Bäuerle [60]. The convex order comparison in Theorem 3.A.33 can be found in O’Cinneide [439]. The closure property (3.A.50) of the dilation order can be found in Muñoz-Perez and Sanches-Gomez [421]. The majorization result (Theorem 3.A.35) is a special case of a result of Marshall and Proschan [384]; related results can be found in Ma [375]. The preservation of the convex order under linear convex combinations (Theorem 3.A.36) is taken from Pellerey [452]. The convex order comparison of sums of random variables with random coefficients (Theorem 3.A.37) can be found in Ma [375]. The particular case of it, given in Example 3.A.38, is a result of Karlin and Novikoff [277]; Marshall and Olkin [383, Section 15.E] obtained a generalization of this special case which is different from the result in Theorem 3.A.37. The convex order comparison of sums of positively [respectively, negatively] associated random variables, and independent random variables, given in Theorem 3.A.39, can be found in Denuit, Dhaene, and Ribas [143] [respectively, Shao [535]]; see also Boutsikas and Vaggelatou [107]. The Laplace transform characterization of the order $\le_{\mathrm{cx}}$ (Theorem 3.A.40) is taken from Shaked and Wong [524]; see also Kan and Yi [274].
The convex order comparison of posterior means, in the context of statistical experiments (Example 3.A.41), is essentially taken from Baker [30]. The condition for stochastic equality of $\le_{\mathrm{cx}}$-ordered random variables (Theorem 3.A.42) is a special case of a result by Denuit, Lefèvre, and Shaked [151], whereas its generalization (Theorem 3.A.43) has been motivated by a result in Bhattacharjee and Bhattacharya [87]; see also Huang and Lin [249]. The result that gives sufficient conditions for the convex order by means of the number of crossings of the underlying densities or distribution functions (Theorem 3.A.44) is taken from Shaked [502], but its origins may be found in Karlin and Novikoff [277], if not before. A proof of the characterization of the convex order by means of the number of crossings of two distribution functions (Theorem 3.A.45) can be found in Müller [407]; similar results are given in Borglin and Keiding [106]. The convex order comparison of normalized Bernoulli random variables (Example 3.A.48) can be found in Makowski [379]. The necessary and sufficient conditions for the comparison of normal random variables (Example 3.A.51) are taken from Müller [413]. The relations $\le_{\mathrm{uv}}$ and $\le_{\mathrm{lc}}$ were introduced in Whitt [564] as means to identify the order $\le_{\mathrm{cx}}$. The characterization of DMRL and IMRL random variables by means of the convex order (Theorem 3.A.56) is taken from Belzunce, Candel, and Ruiz [64]. The relationships between the orders $\le_{\mathrm{dil}}$ and $\le_{\mathrm{mrl}}$ that are described in Theorems 3.A.57 and 3.A.58 can be found in Belzunce, Pellerey, Ruiz, and Shaked [72] where further related results can also be found. The results on the $m$-convex order (Section 3.A.5) are mostly taken from Denuit, Lefèvre, and Shaked [151]. The condition that implies the stochastic equality of $\le^{S}_{m\text{-cx}}$-ordered random variables (Theorem 3.A.60) is taken from Denuit, Lefèvre, and Shaked [152]. The method for deriving the distributions of the stochastic extrema in $B_m([0, b]; \mu_1, \mu_2, \ldots, \mu_{m-1})$ is taken from Denuit, De Vylder, and Lefèvre [142]. The stochastic comparisons of the Gamma, inverse Gaussian, and lognormal random variables (Example 3.A.67) are taken from Kaas and Hesselager [270]. Tables 3.A.1 and 3.A.2 can be found in Denuit, Lefèvre, and Shaked [153, 154]. Theorem 3.A.63 can be found in Denuit, Lefèvre, and Utev [155]. The result about the $2m$-cx ordering of two differences (Theorem 3.A.64) is taken from Bassan, Denuit, and Scarsini [52]. Denuit and Lefèvre [146], Denuit, Lefèvre, and Utev [156], and Denuit, Lefèvre, and Mesfioui [149] studied discrete analogs of the $m$-convex order; in particular they obtained some analogs of the results in Section 3.A.5 for arithmetic random variables, as well as some specific results for the discrete case. Denuit, Lefèvre, and Utev [155] extended the $m$-convex order to Tchebycheff-type orders; see also Lynch [367]. Bhattacharjee [85] studied the order $\le_{\mathrm{cx}}$ under the restriction that the compared random variables are discrete. Metzger and Rüschendorf [393] studied variability orderings, which are related to $\le_{\mathrm{uv}}$ and $\le_{\mathrm{lc}}$, defined by requiring the ratio of the distribution functions $F/G$ or of the survival functions $\bar F/\bar G$ to be unimodal.
For example, they showed that if $X$ and $Y$ are two random variables with distribution functions $F$ and $G$, respectively, such that $\mathrm{supp}(X) \subseteq \mathrm{supp}(Y)$, and if $X \le_{\mathrm{uv}} Y$, then $F/G$ is unimodal. They also considered the order defined by requiring the ratio of a shifted density to another density $f(\cdot + a)/g(\cdot)$ to be unimodal for all $a$. This order is to be compared with the order $\le_{\mathrm{uv}}$ and also with the order $\le_{\mathrm{lr}\uparrow}$ studied in Section 1.C.4. Müller [412] considered an order that is defined by requiring (3.A.1) to hold for all so-called $(a, b)$-concave functions. Other related stochastic orders can be found in Müller [412] as well. An order which is related to the Lorenz order is studied in Zenga [576].

Section 3.B: Doksum [169] studied some properties of the dispersive order by stipulating (3.B.10) and calling it the “tail-order” (see Deshpande and Kochar [159] for further early references in which this order is studied). A basic paper on the dispersive order is Shaked [503] where many of the equivalent conditions described in Section 3.B.1 can be found. The conditions (3.B.3), (3.B.4), and (3.B.6) are taken, respectively, from Saunders [489], Hickey [223], and Muñoz-Perez [420]. Another characterization of the order $\le_{\mathrm{disp}}$, which is related to (3.B.3), is given in Burger [115]. The observation (3.B.15) has been noted in Müller and Stoyan [419]. The characterization of the dispersive order by means of the observed total time on test random variables (Theorem 3.B.1) can be found in Bartoszewicz [42]; other related results can be found in Bartoszewicz [39, 42]. The notion of Q-addition was introduced in Muñoz-Perez [420]. The characterization of the dispersive order given in Theorem 3.B.2 is taken from Landsberger and Meilijson [330]. The characterization of the dispersive order by means of Yaari functionals (Theorem 3.B.3) can be found in Chateauneuf, Cohen, and Meilijson [127]. The properties described in Section 3.B.2 have been collected from many sources. The result of Theorem 3.B.7 can be found in Droste and Wefelmeyer [171]. Several versions of Theorem 3.B.8 can be found in Lewis and Thompson [347] and in Lynch, Mimmack, and Proschan [368]. Some versions of Theorem 3.B.10 can be found in Bartoszewicz [37] and in Rojo and He [472]. Some related results appear in Hickey [223]; for example, his Theorem 4 can be obtained from (3.B.21) applied to the decreasing convex case. Theorems 3.B.14 and 3.B.15 are also taken from that paper. The relationship between the orders $\le_{\mathrm{disp}}$ and $\le_{\mathrm{conv}}$, given in (3.B.26), was noted in Shaked and Suarez-Llorens [520]. The sufficient condition for the dispersive order by means of comparison of shifted hazard rate functions (Theorem 3.B.18) can be found in Belzunce, Lillo, Ruiz, and Shaked [69]. Theorem 3.B.19 has been proved in Mailhot [377], whereas Theorem 3.B.20 combines results from Bartoszewicz [38, 40] and Bagai and Kochar [29].
The relationships between the orders $\le_{\mathrm{disp}}$ and $\le_{\mathrm{mrl}}$, given in Theorem 3.B.21, can be found in Bartoszewicz [44]. The result about the dispersive ordering of order statistics of DFR random variables (Example 3.B.22) is taken from Kochar [308]; some other related results can also be found there. The results about the dispersive ordering of the spacings of DFR random variables (Example 3.B.23) are taken from Kochar and Kirmani [313] and from Khaledi and Kochar [285]; an extension of these results can be found in Belzunce, Hu, and Khaledi [68]. The characterizations of IFR and DFR random variables by means of the dispersive order (Theorems 3.B.24 and 3.B.25) have been derived by Belzunce, Candel, and Ruiz [64], and by Pellerey and Shaked [456]. The results on the dispersive order comparisons of order statistics and spacings (Theorems 3.B.26, 3.B.28, 3.B.29, and 3.B.31) can be found in Bartoszewicz [39], in Khaledi and Kochar [286], and in Oja [440], whereas Example 3.B.30 is mentioned in Kleiber [303]; related results can be found in Alzaid and Proschan [14], in Belzunce, Hu, and Khaledi [68], in Belzunce, Mercader, and Ruiz [70], and in Hu and Zhuang [247]. An extension of Theorem 3.B.26 to order statistics from samples with random size can be found in Nanda, Misra, Paul, and Singh [427].


The dispersive order comparisons of maxima of heterogeneous exponential random variables (Example 3.B.32) are taken from Dykstra, Kochar, and Rojo [174] and from Khaledi and Kochar [287], whereas the comparison of the spacings (Example 3.B.33) is taken from Kochar and Korwar [314]. The comparison of sums of heterogeneous exponential random variables (Example 3.B.34) can be found in Kochar and Ma [317]. The comparisons of sums of uniform and Gamma random variables (Examples 3.B.35 and 3.B.36) are slightly weaker than results that are given in Khaledi and Kochar [288, 289]. The result about the dispersive order comparison of the successive epochs of a nonhomogeneous Poisson process (Example 3.B.37) is given in Kochar [310], though it is stated by means of the dispersive order comparison of successive record values of a sequence of independent and identically distributed random variables with a common DFR distribution function. The dispersive order comparison of epoch times of nonhomogeneous Poisson processes (Example 3.B.38) can be found in Belzunce, Lillo, Ruiz, and Shaked [69] and in Yue and Cao [575]. The results about the dispersive order comparisons of random minima and maxima (Example 3.B.39) are taken from Shaked and Wong [526]; a simple proof of these results is given in Bartoszewicz [49]. The comparison of t-distributed random variables (Example 3.B.40) can be found in Arias-Nicolás, Fernández-Ponce, Luque-Calvo, and Suárez-Llorens [17], whereas the comparison of weighted random variables (Example 3.B.41) can be found in Bartoszewicz and Skolimowska [51]. Finally, the result of Theorem 3.B.42 has been derived by Giovagnoli and Wynn [211] in order to motivate a definition of multivariate dispersive order (see Section 7.B); Theorem 3.B.42 was also obtained by Kusum, Kochar, and Deshpande [327] who actually derived it for logarithms of positive random variables.
Fernández-Ponce and Suárez-Llorens [197] introduced a “weakly dispersive” order by requiring that, corresponding to every interval of length $\varepsilon$ in the support of the “larger” variable, there exists an interval of the same length in the support of the “smaller” variable, such that the probability mass of the latter with respect to the distribution of the “smaller” variable is at least as large as the probability mass of the former with respect to the distribution of the “larger” variable. Belzunce, Hu, and Khaledi [68] studied an order, which they denoted by $\le_{\mathrm{disp\text{-}hr}}$, that is stronger than the order $\le_{\mathrm{disp}}$. Condition (3.B.1) can be written as
$$F^{-1}(\beta) - F^{-1}(\alpha) \le M\,[G^{-1}(\beta) - G^{-1}(\alpha)] \quad \text{whenever } 0 < \alpha < \beta < 1,$$
where $M = 1$. Lehmann [344] considered this condition for other possible values of $M$ in order to compare the tails of $F$ and $G$. Burger [115] studied, among other things, the above condition (with $M = 1$), but only for $\alpha$ and $\beta$ such that $0 < \alpha < G^{-1}(\mu) < \beta < 1$, where $\mu$ is some constant. Rojo [471] studied the above condition with $M = \infty$ in the sense
$$\limsup_{u \to 1} \frac{F^{-1}(u)}{G^{-1}(u)} < \infty,$$
and Bartoszewicz [43] obtained comparison results, with respect to the latter order, for the observed total time on test random variables $X_{\mathrm{ttt}}$ and $Y_{\mathrm{ttt}}$, with distribution functions as defined in (1.A.19).

Section 3.C: Most of the results about the excess wealth order that are described in this section are taken from Shaked and Shanthikumar [518], Fagiuoli, Pellerey, and Shaked [188], and Kochar, Li, and Shaked [316]. Fernandez-Ponce, Kochar, and Muñoz-Perez [195] also studied the excess wealth order by the name of the right spread order. The characterization of the excess wealth order given in (3.C.2) is taken from Chateauneuf, Cohen, and Meilijson [127]; the characterization of the excess wealth order by means of Yaari functionals (Theorem 3.C.1) can be found in that paper as well. The characterization of the excess wealth order given in Theorem 3.C.2 is a translation of the definition of the order $\le_{\mathrm{lir}}$ into the order $\le_{\mathrm{ew}}$, which can be done by virtue of Lemma 3.1 of Fagiuoli, Pellerey, and Shaked [188]. The characterization of the excess wealth order by means of the number of crossings of two distribution functions (Theorem 3.C.3) can be obtained in a similar manner from a correction by Müller [410] of Theorem 1 in Landsberger and Meilijson [330]. The conditions for the preservation of the order $\le_{\mathrm{ew}}$ under convolutions (Theorems 3.C.7–3.C.10) can essentially all be found in Hu, Chen, and Yao [231]. The result about the preservation of the excess wealth order under random maxima (Theorem 3.C.11) is taken from Li and Zuo [358], and the result that compares the expected values of the extreme spacings (Theorem 3.C.12) is a special case of a result of Li [353]. The characterization of DMRL and IMRL random variables by the order $\le_{\mathrm{ew}}$ (Theorem 3.C.13) is taken from Belzunce [63]. Belzunce, Hu, and Khaledi [68] studied a stochastic order, denoted by $\le_{\mathrm{disp\text{-}mrl}}$, which is stronger than the order $\le_{\mathrm{ew}}$.

Section 3.D: The peakedness order was introduced by Birnbaum [90].
The characterization of this order, given in Theorem 3.D.1, was observed in Kottas and Gelfand [323]. Theorem 3.D.4 was essentially proven by Birnbaum [90]; the proof given here is adapted from Bickel and Lehmann [89]. The result about the monotonicity of the sample means in the sense of the peakedness order (Theorem 3.D.5) is given in Proschan [461]; an extension of Theorem 3.D.5 can be found in Ma [372]. The relationship between the dispersive and the peakedness orders, given in Theorem 3.D.6, was observed in Shaked [503].

4 Univariate Monotone Convex and Related Orders

In Chapter 1 we studied orders that compare random variables according to their “magnitude”. In Chapter 3 the studied orders compare random variables according to their “variability”. The orders that are discussed in this chapter compare random variables according to both their “location” and their “spread”. The most important and common orders that are studied in this chapter are the increasing convex and the increasing concave orders. Also the transform orders that are studied here, that is, the convex, the star, and the superadditive orders, are of interest in many theoretical and practical applications. In addition, some other related orders are investigated in this chapter as well.

4.A The Monotone Convex and Monotone Concave Orders 4.A.1 Deﬁnitions and equivalent conditions Let X and Y be two random variables such that E[φ(X)] ≤ E[φ(Y )] for all increasing convex [concave] functions φ : R → R,

(4.A.1)

provided the expectations exist. Then X is said to be smaller than Y in the increasing convex [concave] order (denoted by X ≤icx Y [X ≤icv Y ]). Roughly speaking, if X ≤icx Y , then X is both “smaller” and “less variable” than Y in some stochastic sense. Similarly, if X ≤icv Y , then X is both “smaller” and “more variable” than Y in some stochastic sense. One can also deﬁne a decreasing convex [concave] order by requiring (4.A.1) to hold for all decreasing convex [concave] functions φ (denoted by X ≤dcx [≤dcv ] Y ). The terms “decreasing convex” and “decreasing concave” are counterintuitive in the sense that if X is smaller than Y in the sense

182

4 Univariate Monotone Convex and Related Orders

of either of these two orders, then X is “larger” than Y in some stochastic sense. These orders can be easily characterized using the orders ≤icx and ≤icv. Therefore, it is not necessary to have a separate discussion for these orders.

In analogy with Theorem 3.A.12(a), the orders ≤icx and ≤icv are related to each other as follows.

Theorem 4.A.1. Let X and Y be two random variables. Then

$$X \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\ Y \iff -X \ge_{\mathrm{icv}} [\ge_{\mathrm{icx}}]\ -Y.$$

The proof of Theorem 4.A.1 is based on the fact that a function φ satisfies that φ(x) is increasing and convex in x if, and only if, −φ(−x) is increasing and concave in x. We omit the straightforward details.

Note that the function φ, defined by φ(x) = x, is increasing and is both convex and concave. Therefore, from (4.A.1) it follows that

$$X \le_{\mathrm{icx}} Y \implies E[X] \le E[Y] \tag{4.A.2}$$

and that

$$X \le_{\mathrm{icv}} Y \implies E[X] \le E[Y], \tag{4.A.3}$$

provided the expectations exist.

Let F̄ [F] and Ḡ [G] be the survival [distribution] functions of X and Y, respectively. For a fixed a, the function φa, defined by φa(x) = (x − a)+, is increasing and convex. Therefore, if X ≤icx Y, then

$$E[(X-a)_+] \le E[(Y-a)_+] \quad \text{for all } a, \tag{4.A.4}$$

provided the expectations exist. Alternatively, using a simple integration by parts, it is seen that (4.A.4) can be rewritten as

$$\int_x^\infty \bar F(u)\,du \le \int_x^\infty \bar G(u)\,du \quad \text{for all } x, \tag{4.A.5}$$

provided the integrals exist.

For any real number a let a− denote the negative part of a, that is, a− = a if a ≤ 0 and a− = 0 if a > 0. For a fixed a, the function ζa, defined by ζa(x) = (x − a)−, is increasing and concave. Therefore, if X ≤icv Y, then

$$E[(X-a)_-] \le E[(Y-a)_-] \quad \text{for all } a, \tag{4.A.6}$$

provided the expectations exist. Alternatively, again using a simple integration by parts, it is seen that (4.A.6) can be rewritten as

$$\int_{-\infty}^x F(u)\,du \ge \int_{-\infty}^x G(u)\,du \quad \text{for all } x, \tag{4.A.7}$$

provided the integrals exist.
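For finitely supported distributions the conditions (4.A.4) and (4.A.6) can be checked mechanically. The sketch below is only an illustration under stated assumptions: distributions are represented as hypothetical `{value: probability}` dictionaries, and the helper names (`stop_loss`, `lower_part`, `icx_leq`, `icv_leq`, `negate`) are not from the text. Both sides of (4.A.4) and (4.A.6) are piecewise linear in a with kinks only at atoms, so comparing them at the atoms of the two distributions is enough. The last line illustrates the duality of Theorem 4.A.1 on one example.

```python
from fractions import Fraction

def stop_loss(dist, a):
    """E[(X - a)+] for a finite discrete distribution {value: probability}."""
    return sum(p * max(x - a, 0) for x, p in dist.items())

def lower_part(dist, a):
    """E[(X - a)-], with the text's convention a- = min(a, 0)."""
    return sum(p * min(x - a, 0) for x, p in dist.items())

def icx_leq(dist_x, dist_y):
    """Check (4.A.4) at every atom of the two distributions."""
    atoms = set(dist_x) | set(dist_y)
    return all(stop_loss(dist_x, a) <= stop_loss(dist_y, a) for a in atoms)

def icv_leq(dist_x, dist_y):
    """Check (4.A.6) at every atom of the two distributions."""
    atoms = set(dist_x) | set(dist_y)
    return all(lower_part(dist_x, a) <= lower_part(dist_y, a) for a in atoms)

def negate(dist):
    """Distribution of -X."""
    return {-x: p for x, p in dist.items()}

half = Fraction(1, 2)
X = {1: Fraction(1)}      # hypothetical example: degenerate at 1
Y = {0: half, 3: half}    # hypothetical example: mean 3/2, more spread out

print(icx_leq(X, Y))                    # X is smaller and less variable
print(icv_leq(X, Y))                    # fails: Y is more variable than X
print(icv_leq(negate(Y), negate(X)))    # duality of Theorem 4.A.1
```

Exact rational arithmetic is used so that equalities at the atoms are not disturbed by rounding.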


In fact (4.A.5) [(4.A.7)] is equivalent to X ≤icx Y [X ≤icv Y]. To see it, note that every increasing convex [concave] function can be approximated by (that is, is a limit of) positive linear combinations of the functions φa [ζa], for various choices of a. By (4.A.5), E[φa(X)] ≤ E[φa(Y)] for all a, and this fact implies (4.A.1) in the increasing convex case. Similarly, by (4.A.7), E[ζa(X)] ≤ E[ζa(Y)] for all a, and this fact implies (4.A.1) in the increasing concave case. We thus have proved the following result.

Theorem 4.A.2. Let X and Y be two random variables. Then X ≤icx Y [X ≤icv Y] if, and only if, (4.A.5) [(4.A.7)] holds.

The next two results give further characterizations of the order ≤icx. The first one is an analog of Theorem 3.A.5.

Theorem 4.A.3. Let X and Y be two random variables with distribution functions F and G, respectively. Then X ≤icx Y if, and only if,

$$\int_p^1 F^{-1}(u)\,du \le \int_p^1 G^{-1}(u)\,du \quad \text{for all } p \in [0, 1].$$
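The quantile condition of Theorem 4.A.3 takes a particularly simple form for two empirical distributions of the same size n: the integral of the quantile function over [p, 1] is piecewise linear in p with kinks at multiples of 1/n, so it suffices to compare the sums of the k largest observations for k = 1, ..., n. The following sketch assumes equally weighted samples of equal size; the function name `icx_leq_samples` is hypothetical.

```python
from itertools import accumulate

def icx_leq_samples(xs, ys):
    """Compare two equally weighted samples of the same size in the
    increasing convex order, via sums of the k largest values."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    top_x = accumulate(sorted(xs, reverse=True))  # running sums of largest x's
    top_y = accumulate(sorted(ys, reverse=True))  # running sums of largest y's
    return all(sx <= sy for sx, sy in zip(top_x, top_y))

print(icx_leq_samples([2, 2, 2, 2], [0, 1, 3, 5]))
print(icx_leq_samples([0, 1, 3, 5], [2, 2, 2, 2]))
```

The case k = n is the comparison of the sample means, in agreement with (4.A.2).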

Theorem 4.A.4. Let X and Y be two random variables with distribution functions F and G, respectively. Then X ≤icx Y if, and only if,

$$\int_0^1 F^{-1}(u)\,d\phi(u) \le \int_0^1 G^{-1}(u)\,d\phi(u)$$

for all increasing convex functions φ : [0, 1] → R.

Another necessary and sufficient condition for X ≤icx Y is the following:

$$F^{-1}(p) + \frac{1}{1-p}\int_{F^{-1}(p)}^{\infty} \bar F(x)\,dx \le G^{-1}(p) + \frac{1}{1-p}\int_{G^{-1}(p)}^{\infty} \bar G(x)\,dx, \quad p \in (0, 1). \tag{4.A.8}$$

Condition (4.A.8) may be compared with (3.C.1); see also Corollary 4.A.32.

An important characterization of the increasing convex and the increasing concave orders by construction on the same probability space is stated next.

Theorem 4.A.5. Two random variables X and Y satisfy X ≤icx Y [X ≤icv Y] if, and only if, there exist two random variables X̂ and Ŷ, defined on the same probability space, such that X̂ =st X, Ŷ =st Y, and {X̂, Ŷ} is a submartingale [{Ŷ, X̂} is a supermartingale], that is,

$$E[\hat Y \mid \hat X] \ge \hat X \quad [E[\hat X \mid \hat Y] \le \hat Y] \quad \text{almost surely.} \tag{4.A.9}$$

Furthermore, the random variables X̂ and Ŷ can be selected such that [Ŷ | X̂ = x] [[X̂ | Ŷ = x]] is increasing in x in the usual stochastic order ≤st.

The proof of this theorem is similar to the proof of Theorem 3.A.4. It is not easy to prove the constructive part of Theorem 4.A.5. However, it is easy to prove that if random variables X̂ and Ŷ as described in the theorem exist, then X ≤icx Y [X ≤icv Y]. For example, if the first inequality in (4.A.9) holds and if φ is an increasing convex function, then, using Jensen’s Inequality,

$$E[\phi(X)] = E[\phi(\hat X)] \le E\{\phi(E[\hat Y \mid \hat X])\} \le E\{E[\phi(\hat Y) \mid \hat X]\} = E[\phi(\hat Y)] = E[\phi(Y)],$$

which is (4.A.1).

Theorem 4.A.6.
(a) Two random variables X and Y satisfy X ≤icx Y if, and only if, there exists a random variable Z such that X ≤st Z ≤cx Y.
(b) Two random variables X and Y satisfy X ≤icx Y if, and only if, there exists a random variable Z such that X ≤cx Z ≤st Y.
(c) Two random variables X and Y satisfy X ≤icv Y if, and only if, there exists a random variable Z such that X ≤cv Z ≤st Y.
(d) Two random variables X and Y satisfy X ≤icv Y if, and only if, there exists a random variable Z such that X ≤st Z ≤cv Y.

Proof. First we prove part (a). It is obvious (see, for example, Theorem 4.A.34 below) that X ≤st Z ≤cx Y =⇒ X ≤icx Y. So suppose that X ≤icx Y. Let X̂ and Ŷ be defined on the same probability space, as in Theorem 4.A.5. Define Ẑ = E[Ŷ | X̂]. It is seen that E[Ŷ | Ẑ] = E[Ŷ | X̂] = Ẑ. Thus, by Theorem 3.A.4, Ẑ ≤cx Ŷ. Also, by Theorem 4.A.5, X̂ ≤ Ẑ, and therefore, by Theorem 1.A.1, X̂ ≤st Ẑ. Letting Z have the same distribution as Ẑ, we obtain the stated result.

Now we prove part (b). Again it is obvious that X ≤cx Z ≤st Y =⇒ X ≤icx Y. So suppose that X ≤icx Y. Let X̂ and Ŷ be defined on the same probability space, as in Theorem 4.A.5. Let Ẑ = Ŷ + X̂ − E[Ŷ | X̂]. Then, by Theorem 4.A.5, Ẑ ≤ Ŷ, and therefore, by Theorem 1.A.1, Ẑ ≤st Ŷ. Also, E[Ẑ | X̂] = X̂, and thus, by Theorem 3.A.4, X̂ ≤cx Ẑ. Letting Z have the same distribution as Ẑ, we obtain the stated result.

Parts (c) and (d) can be proven similarly. Alternatively, using Theorem 4.A.1, part (c) can be obtained from part (a), and part (d) can be obtained from part (b).

The following bivariate characterization of the orders ≤icx and ≤icv is analogous to Theorem 3.A.6. Its proof is similar to the proof of Theorem 3.A.6 and is therefore omitted. Define the following classes of bivariate functions:

$$\mathcal{G}_{\mathrm{icx}} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) - \phi(y, x) \text{ is increasing and convex in } x \text{ for all } y\}$$

and

$$\mathcal{G}_{\mathrm{icv}} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) - \phi(y, x) \text{ is increasing and concave in } x \text{ for all } y\}.$$

Theorem 4.A.7. Let X and Y be independent random variables. Then X ≤icx Y [X ≤icv Y] if, and only if,

$$E[\phi(X, Y)] \le E[\phi(Y, X)] \quad \text{for all } \phi \in \mathcal{G}_{\mathrm{icx}}\ [\mathcal{G}_{\mathrm{icv}}].$$

Another characterization of the increasing convex order, by means of the number of sign changes of two distribution functions, is given in Theorem 4.A.23 below.

4.A.2 Closure properties and some characterizations

Using (4.A.1) through (4.A.9) it is easy to prove each of the closure results in the first two parts of the following theorem. The last two parts can be proven as in Theorem 3.A.12. (Recall from Section 1.A.3 that for any random variable Z and any event A we denote by [Z | A] any random variable whose distribution is the conditional distribution of Z given A.)

Theorem 4.A.8.
(a) If X ≤icx Y [X ≤icv Y] and g is any increasing and convex [concave] function, then g(X) ≤icx [≤icv] g(Y).
(b) Let X, Y, and Θ be random variables such that [X | Θ = θ] ≤icx [≤icv] [Y | Θ = θ] for all θ in the support of Θ. Then X ≤icx [≤icv] Y. That is, the increasing convex [concave] order is closed under mixtures.
(c) Let {Xj, j = 1, 2, . . . } and {Yj, j = 1, 2, . . . } be two sequences of random variables such that Xj →st X and Yj →st Y as j → ∞. Assume that EX+ [EX−] and EY+ [EY−] are finite and that

$$E(X_j)_+ \to EX_+\ [E(X_j)_- \to EX_-] \quad \text{and} \quad E(Y_j)_+ \to EY_+\ [E(Y_j)_- \to EY_-] \quad \text{as } j \to \infty. \tag{4.A.10}$$

If Xj ≤icx [≤icv] Yj, j = 1, 2, . . ., then X ≤icx [≤icv] Y.
(d) Let X1, X2, . . . , Xm be a set of independent random variables and let Y1, Y2, . . . , Ym be another set of independent random variables. If Xi ≤icx [≤icv] Yi for i = 1, 2, . . . , m, then

$$\sum_{j=1}^{m} X_j \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\ \sum_{j=1}^{m} Y_j.$$

That is, the increasing convex [concave] order is closed under convolutions.

In part (c), as in Theorem 3.A.12, the condition (4.A.10) is necessary; without it the conclusion of part (c) may not hold.

Part (d) of Theorem 4.A.8 can be strengthened as follows.

Theorem 4.A.9. Let X1, X2, . . . and Y1, Y2, . . . each be a sequence of nonnegative independent and identically distributed random variables such that Xi ≤icx [≤icv] Yi, i = 1, 2, . . .. Let M and N be positive integer-valued random variables that are independent of the {Xi} and the {Yi} sequences, respectively, such that M ≤icx [≤icv] N. Then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\ \sum_{j=1}^{N} Y_j.$$

Proof. Let φ be an increasing convex [concave] function and denote g(n) ≡ E[φ(X1 + X2 + · · · + Xn)]. Clearly g(n) increases in n. Denote Sn = X1 + X2 + · · · + Xn for n ≥ 1. Now,

$$E[\phi(S_n + X_{n+1}) - \phi(S_n) \mid S_n = s] = E[\phi(s + X_{n+1}) - \phi(s)] = h(s),$$

say. Since φ is convex [concave] it follows that h(s) is increasing [decreasing] in s. Since Sn is increasing in n in the usual stochastic order, it follows that g(n + 1) − g(n) = E[h(Sn)] is increasing [decreasing] in n. That is, g(n) is increasing and convex [concave] in n. Therefore

$$E\Big[\phi\Big(\sum_{i=1}^{M} X_i\Big)\Big] \le E\Big[\phi\Big(\sum_{i=1}^{N} X_i\Big)\Big],$$

that is,

$$\sum_{i=1}^{M} X_i \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\ \sum_{i=1}^{N} X_i. \tag{4.A.11}$$

From Theorem 4.A.8 (b) and (d) it follows that

$$\sum_{i=1}^{N} X_i \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\ \sum_{i=1}^{N} Y_i,$$

and the proof is complete by the transitivity property of the order ≤icx [≤icv].
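The key step of the proof, that g(n) = E[φ(X1 + · · · + Xn)] is increasing and convex in n when φ is increasing and convex, can be checked exactly on a small example. The sketch below is an illustration only: the probability function `pmf_x` and the function `phi` are hypothetical choices, and the distribution of the partial sums is obtained by direct convolution.

```python
from fractions import Fraction

def convolve(p, q):
    """Distribution of the sum of two independent nonnegative integer
    variables, each given as a list of probabilities indexed by value."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def g_values(pmf_x, phi, n_max):
    """Return [g(1), ..., g(n_max)] with g(n) = E[phi(X_1 + ... + X_n)]."""
    dist, values = list(pmf_x), []
    for _ in range(n_max):
        values.append(sum(p * phi(s) for s, p in enumerate(dist)))
        dist = convolve(dist, pmf_x)
    return values

# Hypothetical example: X takes the values 0, 1, 2 with these probabilities.
pmf_x = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
phi = lambda s: max(s - 3, 0)          # increasing and convex

g = g_values(pmf_x, phi, 8)
first_diff = [b - a for a, b in zip(g, g[1:])]
print(all(d >= 0 for d in first_diff))                           # g increases
print(all(b >= a for a, b in zip(first_diff, first_diff[1:])))   # g is convex
```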


A special case of Theorem 4.A.9 is stated, and proven in a different manner, in Chapter 8 (see Theorem 8.A.13).

Remark 4.A.10. If in Theorem 4.A.9 the Xi’s are only assumed to be increasing [decreasing] in i in the increasing convex [concave] order (rather than being identically distributed), or if the same is assumed about the Yi’s, then the conclusion of the theorem is still true.

As a special case of the result mentioned in Remark 4.A.10 we obtain the following theorem.

Theorem 4.A.11. Let {Xi, i = 1, 2, . . . } be a sequence of nonnegative independent random variables such that Xi ≤st Xi+1, i = 1, 2, . . .. Let M and N be two discrete positive integer-valued random variables such that M ≤icx N, and assume that M and N are independent of the Xi’s. Then

$$\sum_{i=1}^{M} X_i \le_{\mathrm{icx}} \sum_{i=1}^{N} X_i.$$

The following result follows easily from Theorem 4.A.9. It is of interest to compare it to Theorems 1.A.5, 2.B.8, and 3.A.14.

Theorem 4.A.12. Let {Xj, j = 1, 2, . . . } be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi’s. Let {Yj, j = 1, 2, . . . } be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the Yi’s. Suppose that for some positive integer K we have

$$\sum_{i=1}^{K} X_i \le_{\mathrm{icx}} [\ge_{\mathrm{icx}}, \le_{\mathrm{icv}}, \ge_{\mathrm{icv}}]\ Y_1,$$

and

$$M \le_{\mathrm{icx}} [\ge_{\mathrm{icx}}, \le_{\mathrm{icv}}, \ge_{\mathrm{icv}}]\ KN.$$

Then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{icx}} [\ge_{\mathrm{icx}}, \le_{\mathrm{icv}}, \ge_{\mathrm{icv}}]\ \sum_{j=1}^{N} Y_j.$$

Proof. The assumptions yield

$$\sum_{i=1}^{M} X_i \le_{\mathrm{icx}} [\ge_{\mathrm{icx}}, \le_{\mathrm{icv}}, \ge_{\mathrm{icv}}]\ \sum_{i=1}^{KN} X_i = \sum_{i=1}^{N} \sum_{j=K(i-1)+1}^{Ki} X_j \le_{\mathrm{icx}} [\ge_{\mathrm{icx}}, \le_{\mathrm{icv}}, \ge_{\mathrm{icv}}]\ \sum_{i=1}^{N} Y_i,$$

where the inequalities follow from Theorem 4.A.9. This gives the stated result.


Some results that are related to Theorem 4.A.12 are given in the next theorem.

Theorem 4.A.13. Let {Xj, j = 1, 2, . . . } be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi’s. Let {Yj, j = 1, 2, . . . } be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the Yi’s. Also, let {Nj, j = 1, 2, . . . } be a sequence of independent random variables that are distributed as N. If for some positive integer K we have

$$\sum_{i=1}^{K} X_i \le_{\mathrm{icx}} Y_1 \quad \text{and} \quad M \le_{\mathrm{icx}} \sum_{i=1}^{K} N_i, \tag{4.A.12}$$

or if we have

$$K X_1 \le_{\mathrm{icx}} Y_1 \quad \text{and} \quad M \le_{\mathrm{icx}} KN, \tag{4.A.13}$$

or if we have

$$K X_1 \le_{\mathrm{icx}} Y_1 \quad \text{and} \quad M \le_{\mathrm{icx}} \sum_{i=1}^{K} N_i, \tag{4.A.14}$$

then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{icx}} \sum_{j=1}^{N} Y_j. \tag{4.A.15}$$

Proof. Assume that (4.A.13) holds. Then

$$\sum_{i=1}^{M} X_i \le_{\mathrm{icx}} \sum_{i=1}^{KN} X_i = \sum_{i=1}^{N} \sum_{j=K(i-1)+1}^{Ki} X_j \le_{\mathrm{cx}} \sum_{i=1}^{N} K X_i \le_{\mathrm{icx}} \sum_{i=1}^{N} Y_i,$$

where the first and the third inequalities follow from Theorem 4.A.9, and the second inequality follows from Theorem 3.A.13 and Example 3.A.29. This gives (4.A.15).

Next note, using Example 3.A.29, that $\sum_{i=1}^{K} N_i \le_{\mathrm{icx}} KN$. Thus, by Theorem 4.A.12, the conditions in (4.A.12) imply (4.A.15), and, by (4.A.13), the conditions in (4.A.14) imply (4.A.15).

A slight generalization of the conditions in (4.A.12) is given in the next theorem.


Theorem 4.A.14. Let {Xj, j = 1, 2, . . . } be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi’s. Let {Yj, j = 1, 2, . . . } be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the Yi’s. If for some positive integers K1 and K2, such that K1 ≤ K2, we have

$$\sum_{i=1}^{K_1} X_i \le_{\mathrm{icx}} \frac{K_1}{K_2}\, Y_1 \quad \text{and} \quad M \le_{\mathrm{icx}} K_2 N,$$

then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{icx}} \sum_{j=1}^{N} Y_j.$$

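Proof. The first assumption and Example 3.A.29 yield

$$K_1 \cdot \frac{\sum_{i=1}^{K_2} X_i}{K_2} \le_{\mathrm{cx}} K_1 \cdot \frac{\sum_{i=1}^{K_1} X_i}{K_1} \le_{\mathrm{icx}} \frac{K_1}{K_2}\, Y_1;$$

that is, $\sum_{i=1}^{K_2} X_i \le_{\mathrm{icx}} Y_1$. The result now follows from Theorem 4.A.12.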

Parts (a) and (d) of Theorem 4.A.8 can be generalized as follows.

Theorem 4.A.15. Let X1, X2, . . . , Xm be a set of independent random variables and let Y1, Y2, . . . , Ym be another set of independent random variables. If Xi ≤icx Yi for i = 1, 2, . . . , m, then

$$g(X_1, X_2, \ldots, X_m) \le_{\mathrm{icx}} g(Y_1, Y_2, \ldots, Y_m) \tag{4.A.16}$$

for every increasing and componentwise convex function g.

Proof. Without loss of generality we can assume that all the 2m random variables are independent because such an assumption does not affect the distributions of g(X1, X2, . . . , Xm) and g(Y1, Y2, . . . , Ym). The proof is by induction on m. For m = 1 the result is just Theorem 4.A.8(a). Assume that (4.A.16) is true for vectors of size m − 1. Let g and φ be increasing and componentwise convex functions. Then

$$E[\phi(g(X_1, X_2, \ldots, X_m)) \mid X_1 = x] = E[\phi(g(x, X_2, \ldots, X_m))] \le E[\phi(g(x, Y_2, \ldots, Y_m))] = E[\phi(g(X_1, Y_2, \ldots, Y_m)) \mid X_1 = x],$$

where the equalities above follow from the independence assumption and the inequality follows from the induction hypothesis. Taking expectations with respect to X1, we obtain

$$E[\phi(g(X_1, X_2, \ldots, X_m))] \le E[\phi(g(X_1, Y_2, \ldots, Y_m))].$$

Repeating the argument, but now conditioning on Y2, . . . , Ym and using (4.A.16) with m = 1, we see that

$$E[\phi(g(X_1, Y_2, \ldots, Y_m))] \le E[\phi(g(Y_1, Y_2, \ldots, Y_m))],$$

and this proves the result.

From Theorem 4.A.15 we obtain the following corollary.

Corollary 4.A.16. Let X1, X2, . . . , Xm be a set of independent random variables and let Y1, Y2, . . . , Ym be another set of independent random variables. If Xi ≤icx Yi for i = 1, 2, . . . , m, then

$$\max\{X_1, X_2, \ldots, X_m\} \le_{\mathrm{icx}} \max\{Y_1, Y_2, \ldots, Y_m\}.$$

From Corollary 4.A.16 and Theorem 4.A.1 it is easy to see that if X1, X2, . . . , Xm are independent random variables, and if Y1, Y2, . . . , Ym are independent random variables, and if Xi ≤icv Yi for i = 1, 2, . . . , m, then

$$\min\{X_1, X_2, \ldots, X_m\} \le_{\mathrm{icv}} \min\{Y_1, Y_2, \ldots, Y_m\}.$$

A comparison of maxima of two partial sums in the increasing convex order is given next. Recall from (3.A.54) the definition of negatively associated random variables.

Theorem 4.A.17. Let X1, X2, . . . , Xn be negatively associated random variables, and let Y1, Y2, . . . , Yn be independent random variables such that Xi =st Yi, i = 1, 2, . . . , n. Then

$$\max_{1 \le k \le n} \sum_{i=1}^{k} X_i \le_{\mathrm{icx}} \max_{1 \le k \le n} \sum_{i=1}^{k} Y_i.$$

Theorem 4.A.17 follows from Theorem 9.A.23 in Chapter 9; see a comment there after that theorem.

Consider now a family of distribution functions {Gθ, θ ∈ 𝒳} where 𝒳 is a convex subset (that is, an interval) of the real line or of N. As in Section 1.A.3 let X(θ) denote a random variable with distribution function Gθ. For any random variable Θ with support in 𝒳, and with distribution function F, let us denote by X(Θ) a random variable with distribution function H given by

$$H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.$$

The following result generalizes Theorem 4.A.8(a), just as Theorem 1.A.6 generalized Theorem 1.A.3(a).


Theorem 4.A.18. Consider a family of distribution functions {Gθ, θ ∈ 𝒳} as above. Let Θ1 and Θ2 be two random variables with supports in 𝒳 and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random variables such that Yi =st X(Θi), i = 1, 2; that is, suppose that the distribution function of Yi is given by

$$H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

If for every increasing convex [concave] function φ

$$E[\phi(X(\theta))] \text{ is increasing and convex [concave] in } \theta, \tag{4.A.17}$$

and if

$$\Theta_1 \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\ \Theta_2, \tag{4.A.18}$$

then

$$Y_1 \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\ Y_2. \tag{4.A.19}$$

Proof. Select an increasing convex [concave] function φ for which the expectations below exist, denote

$$\psi(\theta) = E[\phi(X(\theta))], \quad \theta \in \mathcal{X},$$

and notice that ψ is increasing and convex [concave] by (4.A.17). Then

$$E[\phi(Y_1)] = E[\psi(\Theta_1)] \le E[\psi(\Theta_2)] = E[\phi(Y_2)],$$

where the inequality follows from (4.A.18). This gives (4.A.19).

Note that (4.A.11) can be easily obtained from the result above. It is worth mentioning also that condition (4.A.17) is weaker than the condition {X(θ), θ ∈ 𝒳} ∈ SICX [SICV] which is studied in Section 8.A of Chapter 8. An extension of Theorem 4.A.18 is given as Theorem 4.A.65 below.

The following example illustrates the use of Theorem 4.A.18. It may be compared to Corollary 3.A.22.

Example 4.A.19. Let U, Θ1, and Θ2 be independent positive random variables. Define

$$Y_1 = \frac{U}{\Theta_1} \quad \text{and} \quad Y_2 = \frac{U}{\Theta_2}.$$

If Θ1 ≤icv [≤icx] Θ2, then Y1 ≥icx [≥icv] Y2. This can be proven by a simple application of Theorems 4.A.18 and 4.A.1.

An interesting variation of Theorem 4.A.18 is the following. Its proof is similar to the proof of Theorem 4.A.18 and is therefore omitted.


Theorem 4.A.20. Consider a family of distribution functions {Gθ, θ ∈ 𝒳} as described before Theorem 4.A.18. Let Θ1 and Θ2 be two random variables with supports in 𝒳 and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random variables such that Yi =st X(Θi), i = 1, 2; that is, suppose that the distribution function of Yi is given by

$$H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

If for every increasing convex [concave] function φ

$$E[\phi(X(\theta))] \text{ is increasing in } \theta,$$

and if Θ1 ≤st Θ2, then Y1 ≤icx [≤icv] Y2.

A Laplace transform characterization of the orders ≤icx and ≤icv is given next; it may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, 2.A.16, and 2.B.14.

Theorem 4.A.21. Let X1 and X2 be two nonnegative random variables, and let Nλ(X1) and Nλ(X2) be as described in Theorem 1.A.13. Then

$$X_1 \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\ X_2 \iff N_\lambda(X_1) \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\ N_\lambda(X_2) \quad \text{for all } \lambda > 0.$$

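Proof. First assume that X1 ≤icx [≤icv] X2. For k = 1, 2, denote the distribution function of Xk by Fk. Let φ be an increasing convex [concave] function. Without loss of generality assume that φ(0) = 0. Then, from (2.A.16) we have that

$$E[\phi(N_\lambda(X_k))] = \int_0^\infty \sum_{n=1}^{\infty} \phi(n)\, e^{-\lambda x} \frac{(\lambda x)^n}{n!}\, dF_k(x),$$

and therefore it is seen that it suffices to show that

$$g(x) \equiv \sum_{n=1}^{\infty} \phi(n)\, e^{-\lambda x} \frac{(\lambda x)^n}{n!}$$

is increasing and convex [concave] in x. Now compute

$$g'(x) = \sum_{n=1}^{\infty} \phi(n)\, \lambda e^{-\lambda x} \left[ \frac{(\lambda x)^{n-1}}{(n-1)!} - \frac{(\lambda x)^n}{n!} \right] = \lambda \sum_{n=0}^{\infty} [\phi(n+1) - \phi(n)]\, e^{-\lambda x} \frac{(\lambda x)^n}{n!}.$$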

If we denote ∆φ(n) ≡ φ(n + 1) − φ(n), then it is seen that

$$g'(x) = \lambda E\{\Delta_\phi[N(x)]\},$$

where {N(x), x ≥ 0} is a Poisson process with rate λ. Since ∆φ(n) ≥ 0, by the monotonicity of φ, it follows that g′(x) ≥ 0. Also, since ∆φ(n) ↑ [↓] n by the convexity [concavity] of φ, and since N(x) ↑st x, it follows that g′(x) ↑ [↓] x. Therefore g is increasing and convex [concave].

Now suppose that Nλ(X1) ≤icx Nλ(X2) for all λ > 0, that is, using the notation of the proof of Theorem 2.A.16,

$$\sum_{n=m}^{\infty} \alpha_{\lambda,1}(n) \le \sum_{n=m}^{\infty} \alpha_{\lambda,2}(n), \quad m = 0, 1, 2, \ldots.$$

Then for m ≥ 2, (2.A.23) yields

$$\int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-2}}{(m-2)!} \left[ \int_u^\infty \bar F_1(x)\,dx \right] du \le \int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-2}}{(m-2)!} \left[ \int_u^\infty \bar F_2(x)\,dx \right] du.$$

For any fixed y > 0 set λ = (m − 1)/y. It follows that as m → ∞ (then λ → ∞),

$$\int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-2}}{(m-2)!} \left[ \int_u^\infty \bar F_k(x)\,dx \right] du \to \int_y^\infty \bar F_k(x)\,dx, \quad k = 1, 2.$$

Therefore we obtain

$$\int_y^\infty \bar F_1(x)\,dx \le \int_y^\infty \bar F_2(x)\,dx, \quad y > 0,$$

that is X1 ≤icx X2 (see (4.A.5)). The proof of the converse for the ≤icv order is similar.
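The derivative identity used in the first part of the proof, g′(x) = λE{∆φ[N(x)]}, lends itself to a quick numerical sanity check. The sketch below is an illustration under stated assumptions: the series is truncated at a hypothetical cutoff `n_max`, the function `phi` and the rate `lam` are arbitrary choices with φ(0) = 0, and the derivative is approximated by a central finite difference.

```python
import math

def poisson_pmf(mean, n_max):
    """Poisson probabilities P{N = n}, n = 0..n_max, built iteratively."""
    probs = [math.exp(-mean)]
    for n in range(1, n_max + 1):
        probs.append(probs[-1] * mean / n)
    return probs

def g(x, lam, phi, n_max=200):
    """Truncated series for g(x) = sum_n phi(n) e^{-lam x} (lam x)^n / n!."""
    return sum(p * phi(n) for n, p in enumerate(poisson_pmf(lam * x, n_max)))

def g_prime(x, lam, phi, n_max=200):
    """lam * E[phi(N + 1) - phi(N)] with N Poisson distributed, mean lam * x."""
    pmf = poisson_pmf(lam * x, n_max)
    return lam * sum(p * (phi(n + 1) - phi(n)) for n, p in enumerate(pmf))

phi = lambda n: max(n - 2, 0)   # increasing, convex, phi(0) = 0
lam, x, h = 1.5, 2.0, 1e-5

finite_diff = (g(x + h, lam, phi) - g(x - h, lam, phi)) / (2 * h)
print(abs(finite_diff - g_prime(x, lam, phi)) < 1e-6)

# g' is nonnegative and increasing in x, so g is increasing and convex.
slopes = [g_prime(t, lam, phi) for t in (1.0, 2.0, 3.0)]
print(slopes[0] >= 0 and slopes[0] < slopes[1] < slopes[2])
```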

The implication =⇒ in Theorem 4.A.21 can be generalized in the same manner that Theorem 1.A.14 generalizes the implication =⇒ in Theorem 1.A.13. We will not state the result here since it is equivalent to Theorem 4.A.18.

4.A.3 Conditions that lead to the increasing convex and increasing concave orders

Once the relation X ≤icx Y or the relation X ≤icv Y has been established between the two random variables X and Y, it can be of great use. However, given the two random variables and their distribution functions it is sometimes not clear how to verify that X ≤icx Y or that X ≤icv Y. Parallel to the analysis in Section 3.A.3 we point out here some simple conditions that imply the increasing convex and the increasing concave orders.


Theorem 4.A.22. Let X and Y be two random variables with distribution functions F and G and survival functions F̄ and Ḡ, respectively, and with finite means such that EX ≤ EY.
(a) If S⁻(F̄ − Ḡ) ≤ 1 and the sign sequence is +, − [−, +] when equality holds, then X ≤icx Y [X ≤icv Y].
(b) If S⁻(G − F) ≤ 1 and the sign sequence is +, − [−, +] when equality holds, then X ≤icx Y [X ≤icv Y].

The proof of this theorem is similar to the proof of Theorem 3.A.44 and is not detailed here.

The condition in part (a) (or, equivalently, in part (b)) of Theorem 4.A.22 is not only sufficient for X ≤icx Y, but, for nonnegative random variables, it can also characterize the increasing convex order in a similar manner in which (3.A.58) (or, equivalently, (3.A.59)) characterizes the convex order in Theorem 3.A.45. This is stated next.

Theorem 4.A.23. Let X and Y be two nonnegative random variables such that EX ≤ EY. Then X ≤icx [≤icv] Y if, and only if, there exist random variables Z1, Z2, . . ., with distribution functions F1, F2, . . ., such that Z1 =st X, EZj ≤ EY, j = 1, 2, . . ., Zj →st Y as j → ∞, EZj → EY as j → ∞, and S⁻(F̄j − F̄j+1) = 1 and the sign sequence is +, − [−, +], j = 1, 2, . . ..

If the random variables in Theorem 4.A.23 are not nonnegative, then the sufficiency part of that theorem is not correct. This follows from the remark after Theorem 3.A.45.

An interesting characterization of the mean residual life order by means of the increasing convex order is the following result.

Theorem 4.A.24. Let X and Y be two random variables. Then X ≤mrl Y if, and only if,

$$[X - s \mid X > s] \le_{\mathrm{icx}} [Y - s \mid Y > s] \quad \text{for all } s. \tag{4.A.20}$$

Proof. Let F̄ and Ḡ be the survival functions of X and Y, respectively. Condition (4.A.20) can be written as

$$\frac{\int_t^\infty \bar F(s + u)\,du}{\bar F(s)} \le \frac{\int_t^\infty \bar G(s + u)\,du}{\bar G(s)} \quad \text{for all } s \text{ and all } t \ge 0,$$

which is equivalent to X ≤mrl Y by (2.A.6).
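The sign-change condition of Theorem 4.A.22(a) is easy to explore numerically on a grid. The sketch below is only an illustration: the two survival functions (a uniform distribution on [0.5, 1.5] and an exponential distribution with mean 1.2), the grid, and the tolerance are hypothetical choices, and the helper `sign_changes` simply counts sign changes in a sequence while ignoring zeros. A Riemann-sum version of (4.A.5) is then evaluated on the same grid.

```python
import math

def sign_changes(values, tol=1e-12):
    """Number of sign changes in a sequence, ignoring (near-)zero entries."""
    signs = [1 if v > 0 else -1 for v in values if abs(v) > tol]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def surv_x(t):
    """Survival function of a uniform variable on [0.5, 1.5] (mean 1)."""
    return min(1.0, max(0.0, 1.5 - t))

def surv_y(t):
    """Survival function of an exponential variable with mean 1.2."""
    return math.exp(-t / 1.2) if t > 0 else 1.0

h = 0.01
grid = [i * h for i in range(2001)]           # grid on [0, 20]
diff = [surv_x(t) - surv_y(t) for t in grid]  # survival of X minus survival of Y

changes = sign_changes(diff)
starts_positive = next(d for d in diff if abs(d) > 1e-12) > 0
print(changes, starts_positive)               # one change, sequence +, -

# Riemann-sum version of (4.A.5): tail integrals of the difference.
tail, worst = 0.0, float("-inf")
for d in reversed(diff):
    tail += d * h
    worst = max(worst, tail)
print(worst <= 0)
```

Here EX = 1 ≤ 1.2 = EY and the difference of the survival functions changes sign once, from + to −, so the tail integrals in (4.A.5) are ordered at every grid point, as the theorem predicts.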

Remark 4.A.25. Let φ be an increasing convex function. For any s let s′ be selected such that φ(s′) = s. Note that if (4.A.20) holds, then [X | X > s′] ≤icx [Y | Y > s′]. Therefore E[φ(X) | X > s′] ≤ E[φ(Y) | Y > s′], and therefore E[φ(X) − s | φ(X) > s] ≤ E[φ(Y) − s | φ(Y) > s]. Thus we have proven that if X ≤mrl Y, then φ(X) ≤mrl φ(Y) for every increasing convex function φ.


From Theorem 4.A.24 we see that if X ≤mrl Y, then [X | X > s] ≤icx [Y | Y > s] for all s. Letting s → −∞ we obtain from Theorem 4.A.8(c) the following result.

Theorem 4.A.26. Let X and Y be two random variables with finite means. If X ≤mrl Y, then X ≤icx Y.

An analog of Theorem 4.A.26 for the increasing concave order is the following result.

Theorem 4.A.27. Let X and Y be two random variables with finite means. If E[X | X ≤ x] ≤ E[Y | Y ≤ x] for all x ∈ R, then X ≤icv Y.

For positive random variables we have a result that is stronger than Theorem 4.A.26:

Theorem 4.A.28. Let X and Y be two almost surely positive random variables with finite means. If X ≤hmrl Y, then X ≤icx Y.

Proof. Let F̄ and Ḡ be the survival functions of X and Y, respectively. From (2.B.4) (or, equivalently, from (2.B.2)) it follows that

$$\frac{\int_t^\infty \bar F(u)\,du}{EX} \le \frac{\int_t^\infty \bar G(u)\,du}{EY} \quad \text{for all } t \ge 0. \tag{4.A.21}$$

Since, for almost surely positive random variables, X ≤hmrl Y implies that EX ≤ EY (see (2.B.6)), it follows that (4.A.5) holds.

Remark 4.A.29. With the help of Theorem 4.A.28 we can now provide proofs for Theorems 2.A.15 and 2.B.13.

First we prove Theorem 2.A.15. From (2.A.3) it is seen that assumption (2.A.11) means that $\int_y^\infty \bar G_\theta(u)\,du$, as a function of θ and of y, is TP2, where Ḡθ is the survival function associated with Gθ. Assumption (2.A.12) means that F̄i(θ), as a function of i ∈ {1, 2} and of θ, is TP2. From Theorem 4.A.28 and (4.A.5) it follows that $\int_y^\infty \bar G_\theta(u)\,du$ is increasing in θ. Therefore, by Theorem 2.1(i) of Lynch, Mimmack, and Proschan [369], $\int_{\mathcal{X}} \int_y^\infty \bar G_\theta(u)\,du\,dF_i(\theta)$ is TP2 in i ∈ {1, 2} and y. But

$$\int_{\mathcal{X}} \int_y^\infty \bar G_\theta(u)\,du\,dF_i(\theta) = \int_y^\infty \int_{\mathcal{X}} \bar G_\theta(u)\,dF_i(\theta)\,du,$$

and that, by (2.A.3), gives (2.A.13).

Next we prove Theorem 2.B.13. Fix an x > 0. From (2.B.2) it is seen that assumption (2.B.15) implies that $\int_y^\infty \bar G_\theta(u)\,du$ is TP2 in y ∈ {0, x} and θ, where Ḡθ is the survival function associated with Gθ. Assumption (2.B.16) means that F̄i(θ), as a function of i ∈ {1, 2} and of θ, is TP2. From Theorem 4.A.28 and (4.A.5) it follows that $\int_y^\infty \bar G_\theta(u)\,du$ is increasing in θ. Therefore, by Theorem 2.1(i) of Lynch, Mimmack, and Proschan [369], $\int_{\mathcal{X}} \int_y^\infty \bar G_\theta(u)\,du\,dF_i(\theta)$ is TP2 in i ∈ {1, 2} and y ∈ {0, x}. But

$$\int_{\mathcal{X}} \int_y^\infty \bar G_\theta(u)\,du\,dF_i(\theta) = \int_y^\infty \int_{\mathcal{X}} \bar G_\theta(u)\,dF_i(\theta)\,du$$

and this expression is TP2 in i ∈ {1, 2} and y ∈ {0, x} for all x > 0. Thus, by (2.B.2), we obtain (2.B.17).


Under quite weak conditions the order ≤dil implies the order ≤icx. This is shown in the next theorem. For any random variable Z, let lZ denote the left endpoint of the support of Z.

Theorem 4.A.30. Let X and Y be two random variables with finite means. If

$$l_X \le l_Y \tag{4.A.22}$$

and if X ≤dil Y, then X ≤icx Y.

Proof. Suppose that X ≤dil Y. Then

$$[X - EX] \le_{\mathrm{cx}} [Y - EY]. \tag{4.A.23}$$

Therefore, by (3.A.12) we get that supp(X − EX) ⊆ supp(Y − EY). Thus lY − EY ≤ lX − EX. Hence,

$$EY - EX \ge l_Y - l_X. \tag{4.A.24}$$

Combining (4.A.22) with (4.A.24) it is seen that

$$EX \le EY. \tag{4.A.25}$$

From (4.A.23) it follows that

$$X \le_{\mathrm{cx}} Y - (EY - EX), \tag{4.A.26}$$

and from (4.A.25) it follows that

$$Y - (EY - EX) \le_{\mathrm{st}} Y. \tag{4.A.27}$$

Using Theorem 4.A.6(b) it is seen that, from (4.A.26) and (4.A.27), we obtain X ≤icx Y. It is also easy to obtain X ≤icx Y from (4.A.26) and (4.A.27) by noticing that the usual stochastic order and the convex order both imply the increasing convex order.

As a corollary of Theorem 4.A.30 we obtain the following result.

Corollary 4.A.31. Let X and Y be two nonnegative random variables with finite means, such that X has the support [0, ∞). If X ≤dil Y, then X ≤icx Y.

A corollary of Theorem 4.A.30 and of (3.C.7) is the following result.

Corollary 4.A.32. Let X and Y be two random variables with finite means. If lX ≤ lY and if X ≤ew Y, then X ≤icx Y.

The next result gives a simple condition that implies the increasing convex order between a given random variable and a scale transformation of another random variable. Let X1, X2, . . . be a sequence of independent and identically distributed nonnegative random variables with a common distribution function F, and let Y1, Y2, . . . be another sequence of independent and identically distributed nonnegative random variables with a common distribution function G. Let X(n) ≡ max{X1, X2, . . . , Xn} be the nth order statistic of a sample of size n from the distribution F, n = 1, 2, . . .. Let Y(n) be similarly defined for n = 1, 2, . . .. Note that from Corollary 4.A.16 it follows that if X1 ≤icx Y1, then X(n) ≤icx Y(n) for all n = 1, 2, . . .. The following theorem is a weak converse of this observation. The proof is not given here.

Theorem 4.A.33. Let X1, X2, . . . be a sequence of independent and identically distributed nonnegative random variables and let Y1, Y2, . . . be another sequence of independent and identically distributed nonnegative random variables. If E[X(n)] ≤ E[Y(n)] for all n = 1, 2, . . ., then X1 ≤icx κY1 for some constant κ ≥ 1 that is independent of the distributions of X1 and Y1. The constant κ can be taken to be equal to $2(1 - e^{-1})^{-1}$.

4.A.4 Further properties

Let X and Y be two random variables. If E[φ(X)] ≤ E[φ(Y)] for all increasing functions φ, then (4.A.1) definitely holds. If E[φ(X)] ≤ E[φ(Y)] for all convex [concave] functions φ, then (4.A.1) also holds. From (1.A.7) and (3.A.1) we thus obtain the following result. Note that in the conclusion of the second part of (b) in the next theorem the random variables X and Y are interchanged.

Theorem 4.A.34. Let X and Y be two random variables.
(a) If X ≤st Y, then X ≤icx Y and X ≤icv Y.
(b) If X ≤cx Y, then X ≤icx Y and Y ≤icv X.

Thus we see that indeed the increasing convex [concave] order has both properties of ordering by size and ordering by variability. One indication of the ordering by size property is (4.A.2) [(4.A.3)], that is, the ordering of the expected values (when they exist) that follows from the increasing convex [concave] order. It turns out that the ordering of the expected values is actually the only indication of the ordering by size property.
If the two means are equal, then the monotone convex and the monotone concave orders reduce to the convex order of Section 3.A. This is stated formally in the following theorem. Theorem 4.A.35. Let X and Y be two random variables with ﬁnite means. (a) If X ≤icx Y and EX = EY , then X ≤cx Y . (b) If X ≤icv Y and EX = EY , then Y ≤cx X. Proof. If X ≤icx Y , then (4.A.5) (which is the same as (3.A.7)) holds. Part (a) now follows from Theorem 3.A.1(a). Part (b) is proven similarly using (4.A.7), (3.A.8), and Theorem 3.A.1(b).


The order ≤icx can be used to yield bivariate characterizations of the orders ≤st, ≤hr, ≤rh, and ≤lr (compare the following result to Theorems 1.A.10, 1.B.10, 1.B.48, 1.C.22, and 1.C.23). Let φ1 and φ2 be two bivariate functions and let ∆φ21(x, y) = φ2(x, y) − φ1(x, y). Consider the following set of conditions on φ1 and φ2:

(a) ∆φ21(x, y) ≥ −∆φ21(y, x) whenever x ≤ y.
(b) ∆φ21(x, y) ≥ 0 whenever x ≤ y.
(c) φ1(y, x) ≤ φ2(x, y) whenever x ≤ y.
(d) For each x, φ2(x, y) increases in y on {y ≥ x}.
(e) For each y, φ2(x, y) decreases in x on {x ≤ y}.
(f) For each x, ∆φ21(x, y) increases in y on {y ≥ x}.
(g) For each y, ∆φ21(x, y) decreases in x on {x ≤ y}.

The proof of the next theorem is omitted.

Theorem 4.A.36. Let X and Y be two independent random variables. Then
(i) X ≤st Y if, and only if,

$$\phi_1(X, Y) \le_{\mathrm{icx}} \phi_2(X, Y) \tag{4.A.28}$$

for all φ1 and φ2 satisfying (a), (b), (c), (d), (e), (f), and (g).
(ii) X ≤hr Y if, and only if, (4.A.28) holds for all φ1 and φ2 satisfying (a), (b), (c), (d), and (f).
(iii) X ≤rh Y if, and only if, (4.A.28) holds for all φ1 and φ2 satisfying (a), (b), (c), (e), and (g).
(iv) X ≤lr Y if, and only if, (4.A.28) holds for all φ1 and φ2 satisfying (a), (b), and (c).

A typical application of Theorem 4.A.36 is the following result (compare it to Theorem 1.C.21).

Theorem 4.A.37. Let X1, X2, . . . , Xm be independent random variables such that X1 ≤rh X2 ≤rh · · · ≤rh Xm. Let a1, a2, . . . , am be constants such that a1 ≤ a2 ≤ · · · ≤ am. Then

$$\sum_{i=1}^{m} a_{m-i+1} X_i \le_{\mathrm{icv}} \sum_{i=1}^{m} a_{\pi_i} X_i \le_{\mathrm{icv}} \sum_{i=1}^{m} a_i X_i,$$

where π = (π1, π2, . . . , πm) denotes any permutation of (1, 2, . . . , m).

Proof. We only give the proof when m = 2; the general case then can be obtained by pairwise interchanges. So, suppose that X1 ≤rh X2 and that a1 ≤ a2. Define φ1 and φ2 by φ1(x, y) = −a1x − a2y and φ2(x, y) = −a1y − a2x. Then it is easy to verify that (a), (b), (c), (e), and (g) above hold. Thus, by Theorem 4.A.36(iii), −a1X1 − a2X2 ≤icx −a1X2 − a2X1. By Theorem 4.A.1 this means a1X2 + a2X1 ≤icv a1X1 + a2X2.
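The verification of conditions (a), (b), (c), (e), and (g) in the proof above is routine, and it can also be spot-checked on a grid. The sketch below is an illustration only: the constants `a1` and `a2` are hypothetical values chosen with 0 ≤ a1 ≤ a2, the integer grid is arbitrary, and a check on finitely many points is of course not a proof.

```python
from itertools import product

a1, a2 = 1, 3   # hypothetical constants with 0 <= a1 <= a2

def phi1(x, y):
    return -a1 * x - a2 * y

def phi2(x, y):
    return -a1 * y - a2 * x

def delta(x, y):
    """The difference phi2 - phi1."""
    return phi2(x, y) - phi1(x, y)

grid = range(-3, 4)
pairs = [(x, y) for x, y in product(grid, grid) if x <= y]
# Triples x <= x2 <= y, used for monotonicity in the first argument.
triples = [(x, x2, y) for x, x2, y in product(grid, repeat=3) if x <= x2 <= y]

cond_a = all(delta(x, y) >= -delta(y, x) for x, y in pairs)
cond_b = all(delta(x, y) >= 0 for x, y in pairs)
cond_c = all(phi1(y, x) <= phi2(x, y) for x, y in pairs)
cond_e = all(phi2(x, y) >= phi2(x2, y) for x, x2, y in triples)
cond_g = all(delta(x, y) >= delta(x2, y) for x, x2, y in triples)

print(cond_a, cond_b, cond_c, cond_e, cond_g)
```

For these two functions, ∆φ21(x, y) = (a2 − a1)(y − x), which makes (a), (b), and (g) immediate.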


In the next few results we denote by Ip a Bernoulli random variable with probability of success p, that is, P {Ip = 1} = 1 − P {Ip = 0} = p. Recall from page 2 the deﬁnition of the majorization order ≺ among n-dimensional vectors. It is shown after the next theorem that it partially extends Theorem 3.A.37. Theorem 4.A.38. Let X1 , X2 , . . . , Xn be independent nonnegative random variables, and let Ip1 , Ip2 , . . . , Ipn and Iq1 , Iq2 , . . . , Iqn be independent Bernoulli random variables that are independent of X1 , X2 , . . . , Xn . Suppose that (i) 1 ≥ p1 ≥ p2 ≥ · · · ≥ pn and 1 ≥ q1 ≥ q2 ≥ · · · ≥ qn , (ii) Xn ≤st Xn−1 ≤st · · · ≤st X1 , and (iii) p ≺ q. Then

\[ \sum_{i=1}^{n} I_{p_i} X_i \;\le_{\mathrm{icv}}\; \sum_{i=1}^{n} I_{q_i} X_i. \]

If X1, X2, ..., Xn in Theorem 4.A.38 are identically distributed, then $E\big[\sum_{i=1}^{n} I_{p_i} X_i\big] = E\big[\sum_{i=1}^{n} I_{q_i} X_i\big]$, and therefore the conclusion in this theorem is $\sum_{i=1}^{n} I_{p_i} X_i \le_{\mathrm{cv}} \sum_{i=1}^{n} I_{q_i} X_i$; that is, $\sum_{i=1}^{n} I_{p_i} X_i \ge_{\mathrm{cx}} \sum_{i=1}^{n} I_{q_i} X_i$. This is the same as the conclusion of Theorem 3.A.37.

The following result partially extends Theorem 3.A.35.

Theorem 4.A.39. Let X1, X2, ..., Xn be independent and identically distributed nonnegative random variables, and let Ip1, Ip2, ..., Ipn be independent Bernoulli random variables that are independent of X1, X2, ..., Xn. Let a = (a1, a2, ..., an) and b = (b1, b2, ..., bn) be two vectors of constants. Suppose that (i) 1 ≥ p1 ≥ p2 ≥ ··· ≥ pn, (ii) a1 ≥ a2 ≥ ··· ≥ an and b1 ≥ b2 ≥ ··· ≥ bn, and (iii) a ≺ b. Then

\[ \sum_{i=1}^{n} I_{p_i} a_i X_i \;\le_{\mathrm{icx}}\; \sum_{i=1}^{n} I_{p_i} b_i X_i. \]

A family of nonnegative random variables {X(θ), θ > 0} is said to have the semigroup property if, for all θ1 > 0 and θ2 > 0, one has X(θ1 + θ2 ) =st X(θ1 ) + X(θ2 ), where X(θ1 ) and X(θ2 ) are independent. As a corollary of Theorem 4.A.39 we obtain the following result. Corollary 4.A.40. Let {X(θ), θ > 0} be a family of random variables with the semigroup property, and let Ip1 , Ip2 , . . . , Ipn be independent Bernoulli random variables that are independent of {X(θ), θ > 0}. Let θ = (θ1 , θ2 , . . . , θn ) and γ = (γ1 , γ2 , . . . , γn ) be two vectors of constants. Suppose that (i) 1 ≥ p1 ≥ p2 ≥ · · · ≥ pn ,


(ii) θ1 ≥ θ2 ≥ · · · ≥ θn and γ1 ≥ γ2 ≥ · · · ≥ γn , and (iii) θ ≺ γ. Then

\[ \sum_{i=1}^{n} I_{p_i} X(\theta_i) \;\le_{\mathrm{icx}}\; \sum_{i=1}^{n} I_{p_i} X(\gamma_i). \]
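As a numerical illustration of Corollary 4.A.40, the following sketch takes the Poisson family X(θ) ~ Poisson(θ), which has the semigroup property, and compares the two sides of the corollary through their stop-loss transforms E[(S − a)+] at integer a. The choice of family, the particular values of p, θ, and γ, the truncation point kmax, and all function names are assumptions made here for illustration only; they do not come from the text.

```python
import math

def poisson_pmf(theta, kmax):
    """[P{N = k} for k = 0..kmax] for N ~ Poisson(theta)."""
    pmf = [math.exp(-theta)]
    for k in range(1, kmax + 1):
        pmf.append(pmf[-1] * theta / k)
    return pmf

def thinned(p, pmf):
    """pmf of I_p * X, where I_p is Bernoulli(p) and independent of X."""
    out = [p * w for w in pmf]
    out[0] += 1.0 - p
    return out

def convolve(a, b):
    """pmf of a sum of two independent variables, truncated to len(a) terms."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(len(a))]

def stop_loss(pmf, a):
    """E[(S - a)+] for an integer-valued S with the given pmf."""
    return sum(w * (k - a) for k, w in enumerate(pmf) if k > a)

def icx_leq(pmf_x, pmf_y, tol=1e-12):
    # For nonnegative integer-valued variables the stop-loss transform is
    # piecewise linear with kinks at the integers, so integer a suffices.
    return all(stop_loss(pmf_x, a) <= stop_loss(pmf_y, a) + tol
               for a in range(len(pmf_x)))

def corollary_sides(p, theta, gamma, kmax=80):
    """pmfs of sum_i I_{p_i} X(theta_i) and sum_i I_{p_i} X(gamma_i)."""
    def total(params):
        acc = [1.0] + [0.0] * kmax
        for p_i, t_i in zip(p, params):
            acc = convolve(acc, thinned(p_i, poisson_pmf(t_i, kmax)))
        return acc
    return total(theta), total(gamma)

# p1 >= p2, both parameter vectors decreasing, and (2, 2) is majorized by (3, 1).
lhs, rhs = corollary_sides(p=(0.8, 0.3), theta=(2.0, 2.0), gamma=(3.0, 1.0))
print(icx_leq(lhs, rhs), icx_leq(rhs, lhs))
```

The first comparison is the one asserted by the corollary; the reverse comparison already fails at a = 0, because the means are 2.2 and 2.7.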

The following characterizations of the dilation order, by means of the order ≤icx, are similar to characterizations (3.A.39) and (3.A.40).

Theorem 4.A.41. Let X and Y be two random variables with distribution functions F and G, respectively, and with finite expectations. Then X ≤dil Y if, and only if, any of the following two statements hold:

\[ [X - EX \mid X \ge F^{-1}(p)] \;\le_{\mathrm{icx}}\; [Y - EY \mid Y \ge G^{-1}(p)] \quad \text{for all } p \in [0, 1), \]

and

\[ [X - EX \mid X \le F^{-1}(p)] \;\ge_{\mathrm{icx}}\; [Y - EY \mid Y \le G^{-1}(p)] \quad \text{for all } p \in [0, 1). \]

The following characterizations of the convex order, by means of the order ≤icx, are similar to characterizations (3.A.41) and (3.A.42). These characterizations follow at once from Theorem 4.A.41 and from (3.A.32).

Theorem 4.A.42. Let X and Y be two random variables with distribution functions F and G, respectively, and with equal finite means. Then X ≤cx Y if, and only if, any of the following two statements hold:

\[ [X \mid X \ge F^{-1}(p)] \;\le_{\mathrm{icx}}\; [Y \mid Y \ge G^{-1}(p)] \quad \text{for all } p \in [0, 1), \]

and

\[ [X \mid X \le F^{-1}(p)] \;\ge_{\mathrm{icx}}\; [Y \mid Y \le G^{-1}(p)] \quad \text{for all } p \in [0, 1). \]

In a manner similar to the characterization (3.B.6) of the dispersive order by the usual stochastic order, the increasing convex order can characterize the excess wealth order as follows. Theorem 4.A.43. Let X and Y be two continuous random variables with distribution functions F and G, respectively. Then X ≤ew Y if, and only if, (X − F −1 (α))+ ≤icx (Y − G−1 (α))+ ,

α ∈ (0, 1).

(4.A.29)

Proof. We give the proof under the assumption that F and G are strictly increasing; the more general proof can be found in the literature. First assume that (4.A.29) holds. Then, by (4.A.2) we get E[(X − F −1 (α))+ ] ≤ E[(Y − G−1 (α))+ ],

α ∈ (0, 1).

The latter inequality is easily seen to be equivalent to (3.C.5), and therefore X ≤ew Y .


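In order to obtain the converse note that (4.A.29) is equivalent to

\[ H(t, \alpha) \equiv \int_{t + G^{-1}(\alpha)}^{\infty} \bar G(x)\,dx - \int_{t + F^{-1}(\alpha)}^{\infty} \bar F(x)\,dx \ge 0, \qquad (t, \alpha) \in [0, \infty) \times (0, 1), \]

where $\bar F \equiv 1 - F$ and $\bar G \equiv 1 - G$ denote the survival functions of X and Y. Select an α ∈ (0, 1). Note that lim_{t→∞} H(t, α) = 0. If H(·, α) attains a minimum at t∗, then, since H(·, α) is continuous and differentiable, t∗ should satisfy $\partial H(t, \alpha)/\partial t\,\big|_{t = t^*} = 0$. This equality holds if, and only if,

\[ F(t^* + F^{-1}(\alpha)) = G(t^* + G^{-1}(\alpha)) = \beta, \text{ say.} \]

Since F and G are strictly increasing it is seen that F^{-1}(β) = t∗ + F^{-1}(α) and G^{-1}(β) = t∗ + G^{-1}(α). Therefore

\[ H(t^*, \alpha) = \int_{G^{-1}(\beta)}^{\infty} \bar G(x)\,dx - \int_{F^{-1}(\beta)}^{\infty} \bar F(x)\,dx \ge 0, \]

where the inequality follows from X ≤ew Y.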
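To make the function H(t, α) used in this proof concrete, here is a small sketch that evaluates it in closed form for two exponential random variables. The exponential family, the rates, the grids, and the function names are assumptions chosen here for illustration; for an exponential variable with rate λ the quantile is −log(1 − α)/λ and the tail integral of the survival function over [s, ∞) is exp(−λs)/λ.

```python
import math

def quantile_exp(rate, alpha):
    """alpha-quantile of an exponential distribution with the given rate."""
    return -math.log(1.0 - alpha) / rate

def tail_integral_exp(rate, s):
    """Integral of the survival function exp(-rate * x) over [s, infinity)."""
    return math.exp(-rate * s) / rate

def H(t, alpha, rate_x, rate_y):
    """H(t, alpha) from the proof, with X ~ Exp(rate_x) and Y ~ Exp(rate_y)."""
    return (tail_integral_exp(rate_y, t + quantile_exp(rate_y, alpha))
            - tail_integral_exp(rate_x, t + quantile_exp(rate_x, alpha)))

def min_H(rate_x, rate_y, t_grid, alpha_grid):
    """Smallest value of H over a finite grid (a check, not a proof)."""
    return min(H(t, a, rate_x, rate_y) for t in t_grid for a in alpha_grid)

t_grid = [0.1 * i for i in range(101)]             # t in [0, 10]
alpha_grid = [0.01 * i for i in range(1, 100)]     # alpha in (0, 1)

print(min_H(2.0, 1.0, t_grid, alpha_grid) >= 0.0)  # X has the larger rate
print(min_H(1.0, 2.0, t_grid, alpha_grid) >= 0.0)  # roles reversed
```

In this example H(t, α) simplifies to (1 − α)(e^{−λ_Y t}/λ_Y − e^{−λ_X t}/λ_X), which is nonnegative for all t ≥ 0 exactly when λ_X ≥ λ_Y, so the grid check agrees with the closed form.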

Let X and Y be two nonnegative random variables with respective distribution functions F and G. Let $H_F^{-1}$ and $H_G^{-1}$ be the TTT transforms associated with F and G, respectively (see (1.A.19)), and let $H_F$ and $H_G$ be the respective inverses. Let Xttt and Yttt be random variables with distribution functions $H_F$ and $H_G$ (see Section 1.A.4).

Theorem 4.A.44. Let X and Y be two nonnegative random variables. Then X ≤icv Y =⇒ Xttt ≤icv Yttt.

See related results in Theorems 1.A.29, 3.B.1, 4.B.8, 4.B.9, and 4.B.29. The next example may be compared with Examples 1.A.25, 1.B.6, and 1.C.51.

Example 4.A.45. Let Xi be a binomial random variable with parameters ni and pi, i = 1, 2, ..., m, and assume that the Xi's are independent. Let Y be a binomial random variable with parameters n and p where $n = \sum_{i=1}^{m} n_i$. Then

\[ \sum_{i=1}^{m} X_i \;\ge_{\mathrm{icx}}\; Y \iff p \le \sqrt[n]{p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}}, \]

and

\[ \sum_{i=1}^{m} X_i \;\le_{\mathrm{icx}}\; Y \iff p \ge \frac{\sum_{i=1}^{m} n_i p_i}{n}. \]
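The two thresholds in Example 4.A.45 can be checked numerically on a small case. The sketch below builds the exact distribution of X1 + X2 for one illustrative choice of parameters and compares stop-loss transforms at the integers, which suffices for integer-valued variables because the transform is piecewise linear between integers. The parameter values, the tolerance, and the helper names are assumptions made here, not part of the example.

```python
import math
from math import comb

def binom_pmf(n, p):
    """[P{B = k} for k = 0..n] for B ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """pmf of the sum of two independent nonnegative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def stop_loss(pmf, a):
    """E[(S - a)+] for an integer-valued S with the given pmf."""
    return sum(w * (k - a) for k, w in enumerate(pmf) if k > a)

def icx_leq(pmf_x, pmf_y, tol=1e-12):
    """Check E[(X - a)+] <= E[(Y - a)+] at every integer a in the joint range."""
    n_max = max(len(pmf_x), len(pmf_y))
    return all(stop_loss(pmf_x, a) <= stop_loss(pmf_y, a) + tol
               for a in range(n_max))

ns, ps = (3, 2), (0.2, 0.7)            # illustrative parameters (n_i, p_i)
n = sum(ns)
pmf_sum = [1.0]
for n_i, p_i in zip(ns, ps):
    pmf_sum = convolve(pmf_sum, binom_pmf(n_i, p_i))

p_arith = sum(n_i * p_i for n_i, p_i in zip(ns, ps)) / n            # weighted mean
p_geom = math.prod(p_i**n_i for n_i, p_i in zip(ns, ps)) ** (1.0 / n)  # n-th root

print(icx_leq(pmf_sum, binom_pmf(n, p_arith)))   # sum <=icx Y at the threshold
print(icx_leq(binom_pmf(n, p_geom), pmf_sum))    # sum >=icx Y at the threshold
```

Moving p slightly below the weighted mean breaks the first comparison through the means, and moving p slightly above the n-th root breaks the second through the probability of the largest value n.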

The following example gives necessary and suﬃcient conditions for the comparison of normal random variables; it is generalized in Example 7.A.13. See related results in Examples 1.A.26 and 3.A.51.


Example 4.A.46. Let X be a normal random variable with mean µX and variance $\sigma_X^2$, and let Y be a normal random variable with mean µY and variance $\sigma_Y^2$. Then X ≤icx Y if, and only if, µX ≤ µY and $\sigma_X^2 \le \sigma_Y^2$.

Example 4.A.47. Let X1, X2, ..., Xn be independent exponential random variables with distinct hazard rates λ1 > λ2 > ··· > λn > 0. Then $\frac{1}{n}\sum_{i=1}^{n} X_i \le_{\mathrm{icx}} X_n$.

Conditions for stochastic equality, for random variables that are ≤icx- or ≤icv-ordered, are given in the following result. This result may be compared to Theorems 1.A.8, 3.A.43, 3.A.60, 4.A.69, 5.A.15, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16.

Theorem 4.A.48. Let X and Y be two nonnegative random variables. Suppose that X ≤icx Y [X ≤icv Y] and that E[X^r] = E[Y^r] for some r ∈ (1, ∞) [r ∈ (0, 1)], provided the expectations exist. Then X =st Y.

This result is a corollary of Theorem 4.A.69 below with p = 1. In fact, the following stronger result, which is an analog of Theorem 3.A.43, holds for the orders ≤icx and ≤icv.

Theorem 4.A.49. Let X and Y be two random variables. Suppose that X ≤icx [≤icv] Y and that for some increasing strictly convex [concave] function φ we have that E[φ(X)] = E[φ(Y)], provided the expectations exist. Then X =st Y.

Of course, in Theorem 4.A.49 we can replace "increasing strictly convex [concave] function" by "decreasing strictly concave [convex] function."

Theorem 4.A.50. Let X1, X2, ..., Xn and Y1, Y2, ..., Yn (n ≥ 2) be two collections of independent and identically distributed random variables. If X1 ≤icx Y1 and if E[max{X1, X2, ..., Xn}] = E[max{Y1, Y2, ..., Yn}], then X1 =st Y1.

Analogous to the result in Remark 1.A.18, it can be shown that the set of all distribution functions on R with finite means is a lattice with respect to the order ≤icx.

Meilijson and Nádas [389] have proved the following result which, for the sake of simplicity, we describe informally. Let X be a random variable with mean residual life function m (see, for example, (2.A.1)). Define H by H(x) = m(x) + x = E[X | X > x], for all x, and note that H is increasing. Denote X̃ = H(X). Then X̃ ≥st Y for every random variable Y which satisfies Y ≤icx X. In fact, Meilijson and Nádas [389] proved that X̃ is the least stochastic majorant in the sense that if another random variable Z also satisfies Z ≥st Y for every Y such that Y ≤icx X, then X̃ ≤st Z.


4.A.5 Some properties in reliability theory

We have seen in Theorem 1.A.30 that a nonnegative random variable X is IFR [DFR] if, and only if, [X − t | X > t] ≥st [≤st] [X − t′ | X > t′] whenever t ≤ t′. A question of interest then is what one gets if, in the above condition, the order ≥st is replaced by the order ≥icx. It turns out that the order ≥icx can characterize another familiar aging notion in reliability theory. Recall from page 1 the definitions of DMRL and IMRL random variables. A combination of Theorems 2.A.23 and 4.A.24 provides a proof of the DMRL part of the next theorem. The proof of the IMRL part is similar.

Theorem 4.A.51. The nonnegative random variable X is DMRL [IMRL] if, and only if, [X − t | X > t] ≥icx [≤icx] [X − t′ | X > t′] whenever t ≤ t′.

Other characterizations of DMRL and IMRL random variables, by means of other stochastic orders, can be found in Theorems 2.A.23, 2.B.17, 3.A.56, and 3.C.13.

We will now describe a generalization of the sufficiency part of Theorem 4.A.51. For two independent random variables X and T, let XT denote a random variable that has the distribution of [X − T | X > T]. Note that XT is not the residual life of X given T.

Theorem 4.A.52. Let X, T1, and T2 be independent random variables. If T1 ≤rh T2, and if X is DMRL [IMRL], then XT1 ≥icx [≤icx] XT2.

Proof. We will prove the DMRL part only. The proof of the IMRL part is similar. Let $\bar F$ denote the survival function of X, and let $\bar G_i$ denote the survival function of XTi, i = 1, 2. Then, for any fixed x we have

\[ \int_x^{\infty} \bar G_2(y)\,dy - \int_x^{\infty} \bar G_1(y)\,dy = \frac{E[\bar F(T_1)]\, E\big[\int_x^{\infty} \bar F(T_2 + y)\,dy\big] - E[\bar F(T_2)]\, E\big[\int_x^{\infty} \bar F(T_1 + y)\,dy\big]}{E[\bar F(T_1)]\, E[\bar F(T_2)]}. \tag{4.A.30} \]

Define the functions α and β by $\alpha(t) = \int_x^{\infty} \bar F(t + y)\,dy$ and $\beta(t) = \bar F(t)$. Note that β is nonnegative and decreasing, and that α/β is decreasing because X is DMRL. Therefore, by Theorem 1.B.50(b), we see that the numerator in (4.A.30) is nonpositive for any x. It follows, by (4.A.5), that XT1 ≥icx XT2.

Note that if the nonnegative random variable X is DMRL [IMRL], then, from Theorem 4.A.51 it follows that

\[ X \;\ge_{\mathrm{icx}} [\le_{\mathrm{icx}}]\; [X - t \mid X > t] \quad \text{for all } t \ge 0. \tag{4.A.31} \]


Nonnegative random variables that satisfy (4.A.31) are called new better [worse] than used in convex ordering (NBUC [NWUC]) or new better [worse] than used in mean (NBUM [NWUM]). An equivalent definition of the NBUC notion, by means of the usual stochastic order, is given in (1.A.21). It is of interest to note that a nonnegative random variable X with survival function $\bar F$ is NBUC if, and only if,

\[ \int_{x+t}^{\infty} \bar F(y)\,dy \le \frac{\bar F(t)}{1 - \bar F(t)} \int_{x}^{x+t} \bar F(y)\,dy \quad \text{for all } t \ge 0 \text{ and } x \ge 0. \tag{4.A.32} \]

It is worthwhile to point out that a nonnegative random variable X that satisfies (4.A.31), but with the increasing concave (rather than the increasing convex) order, is said to be NBU(2) [NWU(2)]. If a nonnegative random variable X satisfies

\[ [X - t \mid X > t] \;\ge_{\mathrm{icv}} [\le_{\mathrm{icv}}]\; [X - t' \mid X > t'] \quad \text{whenever } t \le t', \tag{4.A.33} \]

then, in some places in the literature, the random variable X is said to have the IFR(2) [DFR(2)] property. However, Belzunce, Hu, and Khaledi [68] proved that the IFR(2) [DFR(2)] property is the same as the IFR [DFR] property. Thus they obtained the following characterization of the IFR [DFR] property.

Theorem 4.A.53. The nonnegative random variable X is IFR [DFR] if, and only if, (4.A.33) holds.

4.A.6 The starshaped order

A function φ : [0, ∞) → [0, ∞), which satisfies φ(0) = 0, is called starshaped if φ(x)/x is increasing in x on (0, ∞) (here we use the convention a/∞ = 0 for a > 0). Note that such a function is increasing. Note also that every increasing convex function φ on [0, ∞), such that φ(0) = 0, is starshaped. Let X and Y be two nonnegative random variables such that

\[ E[\varphi(X)] \le E[\varphi(Y)] \quad \text{for all starshaped functions } \varphi : [0, \infty) \to [0, \infty), \tag{4.A.34} \]

provided the expectations exist. Then X is said to be smaller than Y in the starshaped order (denoted by X ≤ss Y).

Theorem 4.A.54. Let X and Y be two nonnegative random variables with distribution functions F and G, respectively. Then X ≤ss Y if, and only if,

\[ \int_y^{\infty} x\,dF(x) \le \int_y^{\infty} x\,dG(x), \qquad y \ge 0. \tag{4.A.35} \]

Proof. The function φy, defined by

\[ \varphi_y(x) = \begin{cases} 0, & x \le y, \\ x, & x > y, \end{cases} \]


is starshaped. Thus, (4.A.34) =⇒ (4.A.35). Conversely, let φ be a starshaped function. Then h(x) = φ(x)/x is increasing in x on (0, ∞). Approximate h by a sequence of increasing step functions hn. Then (4.A.35) yields

\[ \int_0^{\infty} x\,h_n(x)\,dF(x) \le \int_0^{\infty} x\,h_n(x)\,dG(x). \]

Letting n → ∞, we obtain (4.A.34).

Theorem 4.A.54 shows that when the compared random variables have the same mean, then the starshaped order is equivalent to the usual stochastic ordering of the corresponding length-biased (or spread) random variables. Such random variables are studied in Examples 1.B.23, 1.C.59, 1.C.60, and 8.B.12.

Theorem 4.A.55. Let X and Y be two nonnegative random variables. Then X ≤st Y =⇒ X ≤ss Y =⇒ X ≤icx Y.

Proof. The first implication follows from the fact that a starshaped function φ, such that φ(0) = 0, is increasing. In order to prove the second implication, let φ be an increasing convex function. First suppose that φ(0) = 0. Then φ is starshaped and the inequality in (4.A.1) follows from X ≤ss Y. If φ(0) = a ≠ 0, then define φ̃(x) = φ(x) − a, x ≥ 0. The function φ̃ is increasing convex, and it satisfies φ̃(0) = 0. Thus, by the previous argument E[φ̃(X)] ≤ E[φ̃(Y)]; that is, E[φ(X)] − a ≤ E[φ(Y)] − a, and the inequality in (4.A.1) follows.
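A small discrete example may help to see that the starshaped order sits strictly between ≤st and ≤icx. The sketch below checks (4.A.35) for finitely supported distributions, using the tail sum of x·P{X = x} over x > y (the expectation of the function φy from the proof of Theorem 4.A.54). The two distributions and all helper names are assumptions introduced here for illustration; for finitely supported variables each of the compared functions is a step or piecewise linear function, so it is enough to test the thresholds in the joint support together with 0.

```python
def tail_first_moment(pmf, y):
    """Sum of x * P{X = x} over support points x > y, as in (4.A.35)."""
    return sum(x * w for x, w in pmf.items() if x > y)

def survival(pmf, t):
    """P{X > t}."""
    return sum(w for x, w in pmf.items() if x > t)

def stop_loss(pmf, a):
    """E[(X - a)+]."""
    return sum(w * (x - a) for x, w in pmf.items() if x > a)

def _leq_on_support(func, pmf_x, pmf_y, tol=1e-12):
    # Compare func at 0 and at every point of the joint support.
    thresholds = {0.0} | set(pmf_x) | set(pmf_y)
    return all(func(pmf_x, t) <= func(pmf_y, t) + tol for t in thresholds)

def ss_leq(pmf_x, pmf_y):
    return _leq_on_support(tail_first_moment, pmf_x, pmf_y)

def st_leq(pmf_x, pmf_y):
    return _leq_on_support(survival, pmf_x, pmf_y)

def icx_leq(pmf_x, pmf_y):
    return _leq_on_support(stop_loss, pmf_x, pmf_y)

# X is degenerate at 1; Y takes the values 0 and 2 with probability 1/2 each.
x = {1.0: 1.0}
y = {0.0: 0.5, 2.0: 0.5}

print(st_leq(x, y), ss_leq(x, y), icx_leq(x, y))
```

Here X ≤ss Y and X ≤icx Y hold, while X ≤st Y fails because P{Y > 0} = 1/2 < 1 = P{X > 0}; this shows that the first implication in Theorem 4.A.55 cannot be reversed in general.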

Some closure properties of the starshaped order are given in the next theorem.

Theorem 4.A.56. (a) If the nonnegative random variables X and Y are such that X ≤ss Y, and g is any starshaped function with g(0) = 0, then g(X) ≤ss g(Y). In particular, cX ≤ss cY for any c > 0. (b) Let X, Y, and Θ be random variables such that [X | Θ = θ] ≤ss [Y | Θ = θ] for all θ in the support of Θ. Then X ≤ss Y. That is, the starshaped order is closed under mixtures. (c) Let {Xj, j = 1, 2, ...} and {Yj, j = 1, 2, ...} be two sequences of nonnegative random variables such that Xj →st X and Yj →st Y as j → ∞. Assume that EX^2 and EY^2 are finite and that

\[ \frac{E X_j^2}{E X_j} \to \frac{E X^2}{E X} \quad \text{and} \quad \frac{E Y_j^2}{E Y_j} \to \frac{E Y^2}{E Y} \quad \text{as } j \to \infty. \]

If Xj ≤ss Yj, j = 1, 2, ..., then X ≤ss Y.

Theorem 4.A.57. Let X be a nonnegative random variable. Then I[a,∞)(X) ≤ss I[b,∞)(X) whenever b ≥ a ≥ 0, where I[a,∞) and I[b,∞) are the indicator functions of the indicated intervals.

The proof of Theorem 4.A.57 consists of verifying (4.A.35) in each of the cases y ≤ a, a < y ≤ b, and y > b.


4.A.7 Some related orders

Let X and Y be two random variables with survival functions $\bar F$ and $\bar G$, and distribution functions F and G, respectively. Let $F^{[k]}$, $\bar F^{[k]}$, $G^{[k]}$, and $\bar G^{[k]}$ be defined as in (3.A.66) and (3.A.67). The inequalities (4.A.5) and (4.A.7) can be generalized as follows: For a positive integer m suppose that

\[ \bar F^{[m-1]}(x) \le \bar G^{[m-1]}(x) \quad \text{for all } x, \tag{4.A.36} \]

or that

\[ F^{[m-1]}(x) \ge G^{[m-1]}(x) \quad \text{for all } x, \tag{4.A.37} \]

provided these integrals are finite (the integrals are finite if F and G have finite (m − 1)st moments). If (4.A.36) holds, then X is said to be smaller than Y in the m-icx order (denoted by X ≤m-icx Y). If it is known that X and Y take on values in N++, then the definition of the m-icx order can be modified, exploiting the special structure of N++; see Denuit and Lefèvre [146]. If (4.A.37) holds, then X is said to be smaller than Y in the m-icv order (denoted by X ≤m-icv Y). It is seen from the definition that the orders ≤1-icx and ≤1-icv are equivalent to the order ≤st, the order ≤2-icx is equivalent to the order ≤icx, and the order ≤2-icv is equivalent to the order ≤icv.

The orders ≤m-icx and ≤m-icv have some properties that are similar to the properties of the orders ≤icx and ≤icv. For example, the extension of (4.A.4) is that X ≤m-icx Y if, and only if,

\[ E\big[(X - a)_+^{\,m-1}\big] \le E\big[(Y - a)_+^{\,m-1}\big] \quad \text{for all } a. \tag{4.A.38} \]

The extension of (4.A.6) is that X ≤m-icv Y if, and only if,

\[ E\big[(X - a)_-^{\,m-1}\big] \le E\big[(Y - a)_-^{\,m-1}\big] \quad \text{for all } a. \tag{4.A.39} \]

The characterization (4.A.1) of the orders ≤icx and ≤icv has an analog for the orders ≤m-icx and ≤m-icv. We will not give the technical details here (see Section 4.C for a reference), but we just mention the following results. For m = 1, 2, ..., let $\mathcal{M}_{m\text{-icx}}$ be the set of all functions φ : R → R such that lim_{x→−∞} φ(x) is finite, and whose first m − 1 derivatives, φ^{(1)}, φ^{(2)}, ..., φ^{(m−1)}, exist, and are such that lim_{x→−∞} φ^{(j)}(x) = 0, j = 1, 2, ..., m − 1, and φ^{(m−1)} is increasing. Let $\overline{\mathcal{M}}_{m\text{-icx}}$ be the closure of $\mathcal{M}_{m\text{-icx}}$ in the topology of weak convergence (that is, pointwise convergence in each continuity point of the limit). Let X and Y be two random variables and suppose that the support of each of them contains an interval of the form (−∞, a) for some a. Then X ≤m-icx Y if, and only if,

\[ E[\varphi(X)] \le E[\varphi(Y)] \quad \text{for all functions } \varphi \in \overline{\mathcal{M}}_{m\text{-icx}}, \tag{4.A.40} \]

provided the expectations exist.


Next, for m = 1, 2, ..., let $\mathcal{M}_{m\text{-icv}}$ be the set of all functions φ : R → R such that lim_{x→∞} φ(x) is finite, whose first m − 1 derivatives, φ^{(1)}, φ^{(2)}, ..., φ^{(m−1)}, exist, and are such that lim_{x→∞} φ^{(j)}(x) = 0, j = 1, 2, ..., m − 1, and (−1)^{m−1} φ^{(m−1)} is increasing. Let $\overline{\mathcal{M}}_{m\text{-icv}}$ be the closure of $\mathcal{M}_{m\text{-icv}}$ in the topology of weak convergence. Let X and Y be two random variables and suppose that the support of each of them contains an interval of the form (a, ∞) for some a. Then X ≤m-icv Y if, and only if,

\[ E[\varphi(X)] \le E[\varphi(Y)] \quad \text{for all functions } \varphi \in \overline{\mathcal{M}}_{m\text{-icv}}, \tag{4.A.41} \]

provided the expectations exist. Let us denote X ≤∞-icx [≤∞-icv] Y if

\[ X \le_{m\text{-icx}} [\le_{m\text{-icv}}]\; Y \quad \text{for all positive integers } m. \tag{4.A.42} \]

A characterization of the order ≤∞-icv is given in Theorem 5.A.17. It can be shown that if X and Y have finite (m − 1)st moments, then X ≤m-icx Y =⇒ E[X] ≤ E[Y] and X ≤m-icv Y =⇒ E[X] ≤ E[Y], provided the expectations exist. In fact we have the following more general result.

Theorem 4.A.58. Let X and Y be two random variables with finite first m − 1 moments. If X ≤m-icx Y [X ≤m-icv Y], then EX^k < EY^k [(−1)^{k+1} EX^k < (−1)^{k+1} EY^k] for the smallest k for which EX^k ≠ EY^k.

Some closure properties of the orders ≤m-icx and ≤m-icv are stated next. We omit the proof of the following theorem. Note, however, that parts (b) and (c) of the next theorem are easy to prove. The proof of part (a) uses the fact that if φ ∈ Mm-icx [Mm-icv] then φ^{(j)} [(−1)^j φ^{(j)}] is nonnegative and increasing [decreasing] for all j ∈ {1, 2, ..., m − 1}, and therefore Mm-icx and Mm-icv are closed under compositions.

Theorem 4.A.59. (a) Let X and Y be two random variables and suppose that the support of each of them contains an interval of the form (−∞, a) [(a, ∞)] for some a. If X ≤m-icx [≤m-icv] Y and if g is any function in Mm-icx [Mm-icv], then g(X) ≤m-icx [≤m-icv] g(Y). (b) Let X, Y, and Θ be random variables such that, for all θ in the support of Θ, we have that [X | Θ = θ] ≤m-icx [≤m-icv] [Y | Θ = θ]. Then X ≤m-icx [≤m-icv] Y. That is, the m-icx [m-icv] order is closed under mixtures. (c) Let {Xj, j = 1, 2, ...} and {Yj, j = 1, 2, ...} be two sequences of random variables such that Xj →st X and Yj →st Y as j → ∞. Assume that $E(X_+)^{m-1}$ and $E(Y_+)^{m-1}$ are finite and that

\[ E(X_j)_+^{m-1} \to E X_+^{m-1} \;\; \big[E(X_j)_-^{m-1} \to E X_-^{m-1}\big] \quad \text{and} \quad E(Y_j)_+^{m-1} \to E Y_+^{m-1} \;\; \big[E(Y_j)_-^{m-1} \to E Y_-^{m-1}\big] \quad \text{as } j \to \infty. \tag{4.A.43} \]

If Xj ≤m-icx [≤m-icv] Yj, j = 1, 2, ..., then X ≤m-icx [≤m-icv] Y. (d) Let X1, X2, ..., Xl be a set of independent random variables and let Y1, Y2, ..., Yl be another set of independent random variables. If Xi ≤m-icx [≤m-icv] Yi for i = 1, 2, ..., l, then

\[ \sum_{j=1}^{l} X_j \;\le_{m\text{-icx}} [\le_{m\text{-icv}}]\; \sum_{j=1}^{l} Y_j. \]

That is, the m-icx [m-icv] order is closed under convolutions.

In part (c), as in Theorem 3.A.12, the condition (4.A.43) is necessary; without it the conclusion of part (c) may not hold. The following result, which extends the m-icx part of Theorem 4.A.59(d), is essentially the same as Theorem 8.A.29.

Theorem 4.A.60. Let X1, X2, ... be a set of independent random variables and let Y1, Y2, ... be another set of independent random variables. Let N1 be an integer-valued random variable that is independent of the Xi's, and let N2 be an integer-valued random variable that is independent of the Yi's. If Xi ≤m-icx Yi for i = 1, 2, ..., and if N1 ≤m-icx N2, then

\[ \sum_{j=1}^{N_1} X_j \;\le_{m\text{-icx}}\; \sum_{j=1}^{N_2} Y_j. \]

For the orders ≤m-icx and ≤m-icv , the analog of Theorem 3.A.12(a) is the following. Theorem 4.A.61. Let X and Y be two random variables. Then X ≤m-icx [≤m-icv ] Y ⇐⇒ −X ≥m-icv [≥m-icx ] − Y. The proof of Theorem 4.A.61 easily follows from (4.A.36) and (4.A.37). It is not hard to verify the next statement. Theorem 4.A.62. Consider two random variables X and Y . If X ≤m1 -icx [≤m1 -icv ] Y , then X ≤m2 -icx [≤m2 -icv ] Y for all m2 ≥ m1 . Since the order ≤1-icx is the same as the order ≤st we see that X ≤st Y =⇒ X ≤m-icx Y and that X ≤st Y =⇒ X ≤m-icv Y. The following obvious relationships hold between the orders of Section 3.A.5 and the present orders:


X ≤Sm-cx Y =⇒ X ≤m-icx Y, and

X ≤Sm-cv Y =⇒ X ≤m-icv Y.

Sufficient conditions for X ≤m-icx Y are given in the next result, which is related to Theorem 4.A.22. It is of interest to compare the next result with Theorem 3.A.66.

Theorem 4.A.63. Let X and Y be two nonnegative random variables with distribution functions F and G, respectively, and with density functions f and g, respectively, such that E[X^i] = E[Y^i], i = 1, 2, ..., m − 2, and E[X^{m−1}] ≤ E[Y^{m−1}]. (a) If S^−(F − G) ≤ m − 1 and if the last sign of F − G is a +, then X ≤m-icx Y. (b) If S^−(f − g) ≤ m and if the last sign of g − f is a +, then X ≤m-icx Y.

The following example describes a typical application of Theorem 4.A.63.

Example 4.A.64. Let the inverse Gaussian random variable Y, and the lognormal random variable Z, be as in Example 3.A.67; in particular they both have the mean α/β and the second moment α(α + 1)/β^2. We claim that Y ≤4-icx Z. In order to see it, first note, as in Example 3.A.67, that without loss of generality we can take the means to be equal to 1, that is, β = α. Now, a straightforward computation yields

\[ \log \frac{f_Y(x)}{f_Z(x)} = C + \frac{\log^2 x}{2\tau^2} - \frac{\alpha x}{2} - \frac{\alpha}{2x}, \qquad x > 0, \]

where C is some constant. Substituting u = log x, the second derivative of the above expression is seen to have two sign changes. Therefore the expression itself has at most four sign changes. We also have here, by a lengthy computation (see Kaas and Hesselager [270]), that E[Y^3] < E[Z^3]. The stated result now follows from Theorem 4.A.63(b).

In fact, it can be shown that if X, Y, and Z, are, respectively, Gamma, inverse Gaussian, and lognormal random variables (with parameters that are different from the ones in Example 3.A.67), such that E[X] = E[Y] = E[Z] and E[X^2] ≤ E[Y^2] ≤ E[Z^2], then X ≤3-icx Y, X ≤3-icx Z, and Y ≤4-icx Z. Some comparisons of Gamma, inverse Gaussian, lognormal, and Birnbaum-Saunders random variables in the ≤3-icv sense were derived by Klar [300].

Consider now a family of distribution functions {Gθ, θ ∈ R}. As in Section 1.A.3 let X(θ) denote a random variable with distribution function Gθ. For any random variable Θ with support R, and with distribution function F, let us denote by X(Θ) a random variable with distribution function H given by

\[ H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \qquad y \in R. \]


The following result generalizes Theorem 4.A.8(a), just as Theorem 1.A.6 generalized Theorem 1.A.3(a). Its proof is similar to the proof of Theorem 4.A.18, using the fact that Mm-icx and Mm-icv are closed under compositions. We omit the details.

Theorem 4.A.65. Consider a family of distribution functions {Gθ, θ ∈ R} as above. Let Θ1 and Θ2 be two random variables with support R and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random variables such that Yi =st X(Θi), i = 1, 2; that is, suppose that the distribution function of Yi is given by

\[ H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \qquad y \in R, \; i = 1, 2. \]

If ψφ, defined by ψφ(θ) ≡ E[φ(X(θ))], is in Mm-icx [Mm-icv] whenever φ ∈ Mm-icx [φ ∈ Mm-icv], and if Θ1 ≤m-icx [≤m-icv] Θ2, then Y1 ≤m-icx [≤m-icv] Y2.

For example, the family {Gθ, θ ≥ 0} of the Poisson distributions (or, in fact, every family of distribution functions whose associated density functions {gθ, θ ∈ R} satisfy that gθ(x) is totally positive of order m; see Karlin [275]) satisfies the condition in Theorem 4.A.65 that ψφ is in Mm-icx [Mm-icv] whenever φ ∈ Mm-icx [φ ∈ Mm-icv].

A Laplace transform characterization of the orders ≤m-icx and ≤m-icv is given next; it may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, 2.A.16, 2.B.14, and 4.A.21. Before stating it we make a few observations. First, note that the random variables X1 and X2 in the theorem below have the support [0, ∞). Then the characterizations (4.A.40) and (4.A.41) are still valid provided the test functions φ in (4.A.40) satisfy that φ^{(j)}(0) = 0 (rather than lim_{x→−∞} φ^{(j)}(x) = 0), j = 1, 2, ..., m − 1. Next, note that the random variables Nλ(X1) and Nλ(X2) in the theorem below are discrete with support N+. There are several ways of defining the orders ≤m-icx and ≤m-icv for such random variables. One possible way is by the requirement (4.A.36) or (4.A.37) (or, equivalently, by (4.A.38) or (4.A.39)). Another possible way is by replacing the integrals in (4.A.36) or (4.A.37) by sums. In the theorem below we adopt a definition that is a discrete analog of (4.A.40) and (4.A.41).

For m = 1, 2, ..., let $\mathcal{K}_{m\text{-icx}}$ be the set of functions φ : N+ → R such that $\Delta_\varphi^{(j)}(0) = 0$, j = 0, 1, ..., m − 1 (where $\Delta_\varphi^{(0)}(n) \equiv \varphi(n)$ and $\Delta_\varphi^{(j)}(n) = \Delta_\varphi^{(j-1)}(n+1) - \Delta_\varphi^{(j-1)}(n)$, j = 1, 2, ...), and such that $\Delta_\varphi^{(m-1)}(n)$ is increasing on N+. For the discrete random variables M1 and M2 denote M1 ≤m-icx M2 if E[φ(M1)] ≤ E[φ(M2)] for all functions φ ∈ $\mathcal{K}_{m\text{-icx}}$. Similarly, let $\mathcal{K}_{m\text{-icv}}$ be the set of functions φ : N+ → R such that $\lim_{n\to\infty} \Delta_\varphi^{(j)}(n) = 0$, j = 0, 1, ..., m − 1, and such that $(-1)^{m-1} \Delta_\varphi^{(m-1)}(n)$ is increasing on N+. For the discrete random variables M1 and M2 denote M1 ≤m-icv M2 if E[φ(M1)] ≤ E[φ(M2)] for all functions φ ∈ $\mathcal{K}_{m\text{-icv}}$.

Theorem 4.A.66. Let X1 and X2 be two nonnegative random variables, and let Nλ(X1) and Nλ(X2) be as described in Theorem 1.A.13. Then

\[ X_1 \le_{m\text{-icx}} [\le_{m\text{-icv}}]\; X_2 \iff N_\lambda(X_1) \le_{m\text{-icx}} [\le_{m\text{-icv}}]\; N_\lambda(X_2) \quad \text{for all } \lambda > 0. \]

The proof of this theorem is similar to the proof of Theorem 4.A.21 and is therefore omitted.

Another family of orders that are related to the ≤cx, ≤icx, and ≤icv orders can be defined by a generalization of (4.A.5) and (4.A.7) that is different from the generalization that is described in (4.A.36) and (4.A.37). Let X and Y be two nonnegative random variables with distribution functions F and G, and survival functions $\bar F$ and $\bar G$, respectively. Let p > 0 and suppose that E[X^p] and E[Y^p] exist. If

\[ \int_x^{\infty} u^{p-1} \bar F(u)\,du \le \int_x^{\infty} u^{p-1} \bar G(u)\,du \quad \text{for all } x, \quad \text{and} \quad E[X^p] = E[Y^p], \]

then X is said to be smaller than Y in pth order (denoted by X ≤p Y). If

\[ \int_x^{\infty} u^{p-1} \bar F(u)\,du \le \int_x^{\infty} u^{p-1} \bar G(u)\,du \quad \text{for all } x, \]

then X is said to be smaller than Y in p+ order (denoted by X ≤p+ Y). Finally, if

\[ \int_0^{x} u^{p-1} F(u)\,du \ge \int_0^{x} u^{p-1} G(u)\,du \quad \text{for all } x, \]

then X is said to be smaller than Y in p− order (denoted by X ≤p− Y). It is not hard to verify that for nonnegative random variables X and Y we have

\[ X \le_{p} Y \iff X^p \le_{\mathrm{cx}} Y^p, \tag{4.A.44} \]

\[ X \le_{p+} Y \iff X^p \le_{\mathrm{icx}} Y^p, \]

and

\[ X \le_{p-} Y \iff X^p \le_{\mathrm{icv}} Y^p. \tag{4.A.45} \]
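The equivalences above come down to a change of variable. As a brief sketch for the p+ case (the symbols v and a are introduced here only for illustration), substituting v = u^p, so that dv = p u^{p−1} du, gives for every x ≥ 0

\[ \int_x^{\infty} u^{p-1}\,\bar F(u)\,du \;=\; \frac{1}{p} \int_{x^p}^{\infty} P\{X^p > v\}\,dv \;=\; \frac{1}{p}\, E\big[(X^p - x^p)_+\big]. \]

Writing a = x^p, the inequality that defines X ≤p+ Y therefore reads E[(X^p − a)+] ≤ E[(Y^p − a)+] for all a ≥ 0. For a < 0 and nonnegative X we have E[(X^p − a)+] = E[X^p] − a, so that case follows from a = 0. This is the stop-loss form of X^p ≤icx Y^p. The p− case can be treated in the same way with the distribution functions in place of the survival functions.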

It is seen at once that X ≤p Y =⇒ X ≤p+ Y, and that X ≤p Y =⇒ Y ≤p− X. Notice that, for p = m, the order ≤p+ [≤p− ] is not the same as the order ≤m-icx [≤m-icv ]. In fact, X ≤m+ Y if, and only if,


\[ E[(X^m - a)_+] \le E[(Y^m - a)_+] \quad \text{for all } a \]

(compare this to (4.A.38)), and X ≤m− Y if, and only if,

\[ E[(X^m - a)_-] \le E[(Y^m - a)_-] \quad \text{for all } a \]

(compare this to (4.A.39)). It is easy to verify that the orders ≤p, ≤p+ and ≤p− are closed under mixtures. They are also closed under limits in distribution provided a condition on convergence of moments, which is an obvious modification of (4.A.10) (similar to (4.A.43)), holds. The following result points out some interrelationships among these orders.

Theorem 4.A.67. Let X and Y be two nonnegative random variables. If X ≤p+ [≤p−] Y, then X ≤q+ [≤q−] Y whenever q ≥ p [q ≤ p].

A relationship to the order ≤∗ is given next (the order ≤∗ is defined in Section 4.B below).

Theorem 4.A.68. Let X and Y be two nonnegative random variables that have finite pth moments and that are not degenerate at 0. If X ≤∗ Y and if E[X^p] = E[Y^p], then X ≤p Y.

A simple proof of Theorem 4.A.68 will be given in Remark 4.B.24. Motivated by the result of Theorem 1.A.8 (see also Theorems 3.A.43, 3.A.60, 4.A.48, 5.A.15, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16), the following results have been derived.

Theorem 4.A.69. Let X and Y be two nonnegative random variables. Suppose that X ≤p+ Y [X ≥p− Y] and that E[X^r] = E[Y^r] for some r ∈ (p, ∞) [r ∈ (0, p)], provided the expectations exist. Then X =st Y.

Theorem 4.A.70. Let X and Y be two nonnegative random variables with finite means and distribution functions F and G, respectively. If X ≤p Y and if

\[ \int_0^1 \big(F^{-1}(t)\big)^r\,d\varphi(t) = \int_0^1 \big(G^{-1}(t)\big)^r\,d\varphi(t) \]

for some r ≥ p and some increasing and strictly convex function φ : [0, 1] → R, then X =st Y.

We end this section by mentioning still another sequence of orders that is based on iterated integrals. If F is a distribution function, then let F^{-1} denote the inverse of F (see page 1). Denote recursively

\[ F_1^{-1}(p) = F^{-1}(p), \qquad p \in [0, 1], \]

and

\[ F_n^{-1}(p) = \int_p^1 F_{n-1}^{-1}(u)\,du, \qquad p \in [0, 1], \tag{4.A.46} \]


for n = 2, 3, .... Similarly define $G_n^{-1}$ for a distribution function G. For any positive integer m, if the distribution functions F and G, of the random variables X and Y, satisfy

\[ F_m^{-1}(p) \le G_m^{-1}(p) \quad \text{for } p \in [0, 1], \]

then we denote $X \le_m^{-1} Y$. It is easy to see that $X \le_1^{-1} Y \iff X \le_{\mathrm{st}} Y$. Also, if EX = EY, then, by Theorem 3.A.5 we see that $X \le_2^{-1} Y \iff X \le_{\mathrm{cx}} Y$. From (4.A.46) we obtain at once the following result.

Theorem 4.A.71. Let X and Y be two random variables. If $X \le_{m_1}^{-1} Y$, then $X \le_{m_2}^{-1} Y$ for all m2 ≥ m1.

A necessary condition for $X \le_m^{-1} Y$ is given in the next result.

Theorem 4.A.72. Let X and Y be two random variables. If $X \le_m^{-1} Y$, then

\[ E[\max\{X_1, X_2, \ldots, X_k\}] \le E[\max\{Y_1, Y_2, \ldots, Y_k\}], \qquad k \ge m - 1, \]

where the Xi's [Yi's] are independent random variables, all distributed according to the distribution of X [Y].

Proof. Let F and G denote the distribution functions of X and Y, respectively. A straightforward computation yields

\[ F_m^{-1}(0) = \frac{1}{(m-1)!}\, E[\max\{X_1, X_2, \ldots, X_{m-1}\}] \quad \text{and} \quad G_m^{-1}(0) = \frac{1}{(m-1)!}\, E[\max\{Y_1, Y_2, \ldots, Y_{m-1}\}]. \]

Therefore E[max{X1, X2, ..., Xm−1}] ≤ E[max{Y1, Y2, ..., Ym−1}]. The inequality for k > m − 1 now follows from Theorem 4.A.71.
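The computation behind this proof can be sketched as follows, taking the recursion (4.A.46) in the form $F_n^{-1}(p) = \int_p^1 F_{n-1}^{-1}(u)\,du$ with $F_1^{-1} = F^{-1}$; the closed form below is our own working and is offered as a check rather than as the authors' argument. Interchanging the order of integration and using induction on n gives

\[ F_n^{-1}(p) = \int_p^1 \frac{(u - p)^{n-2}}{(n-2)!}\, F^{-1}(u)\,du, \qquad n \ge 2, \]

while for independent copies X1, ..., Xk of X the maximum has distribution function F^k, so that

\[ E[\max\{X_1, \ldots, X_k\}] = \int_0^1 F^{-1}(u)\, k\,u^{k-1}\,du. \]

Setting p = 0, n = m, and k = m − 1 shows that $F_m^{-1}(0)$ and E[max{X1, ..., Xm−1}] agree up to the positive factor 1/(m − 1)!, which is the same for F and for G and so does not affect the comparison. For instance, for the uniform distribution on (0, 1) one finds $F_3^{-1}(0) = 1/3$, whereas E[max{X1, X2}] = 2/3.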

4.B Transform Orders: The Convex, Star, and Superadditive Orders

4.B.1 Definitions

Let X and Y be two nonnegative random variables with distribution functions F and G, respectively. Suppose that the support of X is an interval (finite or infinite).


We say that $X$ is smaller than $Y$ in the convex transform order (denoted by $X \le_{c} Y$) if $G^{-1}F(x)$ is convex in $x$ on the support of $F$. We say that $X$ is smaller than $Y$ in the star order (denoted by $X \le_{*} Y$) if $G^{-1}F(x)$ is starshaped in $x$ (that is, if $G^{-1}F(x)/x$ increases in $x \ge 0$). It is easily seen that $X \le_{*} Y$ if, and only if,

$$\frac{G^{-1}(u)}{F^{-1}(u)} \text{ is increasing in } u \in (0, 1). \tag{4.B.1}$$

Also, recalling the definition of the number of sign changes in (1.A.18), it is easily seen that $X \le_{*} Y$ if, and only if, for all $b > 0$ we have that

$$S^{-}(F(\cdot) - G(b\,\cdot)) \le 1, \tag{4.B.2}$$

and the sign sequence is $-, +$ if a crossing occurs. We say that $X$ is smaller than $Y$ in the superadditive order (denoted by $X \le_{\mathrm{su}} Y$) if $G^{-1}F(x)$ is superadditive in $x$ (that is, if $G^{-1}F(x + y) \ge G^{-1}F(x) + G^{-1}F(y)$ for all $x \ge 0$ and $y \ge 0$).

4.B.2 Some properties

Every nonnegative function that vanishes at 0, and that is increasing and convex on $[0, \infty)$, is also starshaped on $[0, \infty)$. Furthermore, every nonnegative function that vanishes at 0, and that is increasing and starshaped on $[0, \infty)$, is also superadditive on $[0, \infty)$. Therefore, for any two nonnegative random variables $X$ and $Y$ we have

$$X \le_{c} Y \implies X \le_{*} Y, \tag{4.B.3}$$

and

$$X \le_{*} Y \implies X \le_{\mathrm{su}} Y.$$

The star order is related to the dispersion order as follows.

Theorem 4.B.1. Let $X$ and $Y$ be two nonnegative random variables. Then

$$X \le_{*} Y \iff \log X \le_{\mathrm{disp}} \log Y. \tag{4.B.4}$$

Proof. The relation $X \le_{*} Y$ holds if, and only if, $G^{-1}F(x)/x$ is increasing in $x \ge 0$; that is, if, and only if, $\log G^{-1}F(x) - \log x = \log G^{-1}F(e^{\log x}) - \log x$ is increasing in $x$. The result now follows from (3.B.10).
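The quantile-ratio criterion (4.B.1) and the equivalence (4.B.4) can be illustrated numerically. The following is only a minimal sketch on a finite grid, not a proof: the helper names (`weibull_quantile`, `star_ordered`, `log_dispersion_ordered`) and the choice of a unit-scale Weibull with shape 2 for $X$ and a standard exponential for $Y$ are our own assumptions, and the dispersion order is checked through the (assumed) quantile-difference form, namely that $\log G^{-1}(u) - \log F^{-1}(u)$ is increasing in $u$.

```python
import math


def weibull_quantile(u, shape):
    """Quantile function of a Weibull distribution with unit scale."""
    return (-math.log(1.0 - u)) ** (1.0 / shape)


def is_increasing(values, tol=1e-12):
    """True if the sequence is nondecreasing up to a small tolerance."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))


def star_ordered(quantile_x, quantile_y, grid_size=999):
    """Grid check of (4.B.1): G^{-1}(u) / F^{-1}(u) increasing on (0, 1)."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return is_increasing([quantile_y(u) / quantile_x(u) for u in grid])


def log_dispersion_ordered(quantile_x, quantile_y, grid_size=999):
    """Grid check that log G^{-1}(u) - log F^{-1}(u) is increasing on (0, 1)."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    diffs = [math.log(quantile_y(u)) - math.log(quantile_x(u)) for u in grid]
    return is_increasing(diffs)


def weibull2(u):
    """Hypothetical X: Weibull with shape 2."""
    return weibull_quantile(u, 2.0)


def exponential(u):
    """Hypothetical Y: standard exponential (Weibull with shape 1)."""
    return weibull_quantile(u, 1.0)


print(star_ordered(weibull2, exponential))            # X <=_* Y on the grid
print(star_ordered(exponential, weibull2))            # the reverse comparison
print(log_dispersion_ordered(weibull2, exponential))  # log X <=_disp log Y
```

For this pair the quantile ratio equals $\sqrt{-\log(1-u)}$, which is increasing, so the first and third checks agree, as (4.B.4) predicts.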

An equivalent way of writing (4.B.4) is the following. For any two nonnegative random variables $X$ and $Y$,

$$X \le_{\mathrm{disp}} Y \iff e^{X} \le_{*} e^{Y}.$$

Under an obvious restriction, the superadditive (and hence also the star and the convex transform) order implies the dispersion order, as is shown in the next theorem.


Theorem 4.B.2. Let $X$ and $Y$ be two nonnegative random variables such that $X \le_{\mathrm{st}} Y$. If $X \le_{\mathrm{su}} Y$, then $X \le_{\mathrm{disp}} Y$.

Proof. Let $F$ and $G$ denote the distribution functions of $X$ and $Y$, respectively, and let $S_F$ denote the support of $F$. Let $x$ and $y$ be two values in $S_F$. Then

$$G^{-1}F(x + y) - (x + y) \ge G^{-1}F(x) + G^{-1}F(y) - (x + y) \ge G^{-1}F(y) - y,$$

where the first inequality follows from $X \le_{\mathrm{su}} Y$ and the second inequality follows from $F(x) \ge G(x)$ (that is, $G^{-1}F(x) \ge x$). Thus $G^{-1}F(x) - x$ is increasing in $x$. Now, from (3.B.10), we obtain $X \le_{\mathrm{disp}} Y$.

The condition $X \le_{\mathrm{st}} Y$ is clearly needed, because without it $X \le_{\mathrm{disp}} Y$ is impossible (see Theorem 3.B.13). The condition $X \le_{\mathrm{su}} Y$ by itself (in fact, even the condition $X \le_{*} Y$) does not necessarily imply that $X \le_{\mathrm{st}} Y$.

Theorem 4.B.2, together with (4.B.3), implies that if $X$ and $Y$ are two nonnegative random variables with finite means such that $X \le_{\mathrm{st}} Y$ and if $X \le_{\mathrm{su}} Y$ (and therefore if $X \le_{c} Y$ or if $X \le_{*} Y$), then (see Theorem 3.B.16) $[X - EX] \le_{\mathrm{cx}} [Y - EY]$, and in particular, $\mathrm{Var}(X) \le \mathrm{Var}(Y)$.

Another condition, under which $X \le_{\mathrm{su}} Y$ implies $X \le_{\mathrm{disp}} Y$, is given in the next theorem.

Theorem 4.B.3. Let $X$ and $Y$ be two nonnegative random variables with distributions $F$ and $G$, respectively, such that $\lim_{x \to 0} (G^{-1}F(x)/x) \ge 1$. If $X \le_{\mathrm{su}} Y$, then $X \le_{\mathrm{disp}} Y$. In particular, if $F$ and $G$ are absolutely continuous with $F(0) = G(0) = 0$ and their corresponding density functions $f$ and $g$ are such that $f(0) \ge g(0) > 0$, then $X \le_{\mathrm{su}} Y$ implies $X \le_{\mathrm{disp}} Y$.

The relationship between the orders $\le_{*}$ and $\le_{\mathrm{icx}}$ is described in the next theorem.

Theorem 4.B.4. Let $X$ and $Y$ be two nonnegative random variables such that $EX \le EY$. If $X \le_{*} Y$, then $X \le_{\mathrm{icx}} Y$.

Proof. First we show that $X \le_{*} Y \implies X \le_{\mathrm{Lorenz}} Y$. To this end we can assume temporarily, without loss of generality, since both orders are scale invariant, that $EX = EY = 1$. Let $F$ and $G$ denote the distribution functions of $X$ and of $Y$, respectively. If $F \equiv G$, then the result is trivial. Thus assume $F \not\equiv G$. From (4.B.2) (with $b = 1$), and from the fact that $EX = EY$, it follows that $S^{-}(G - F) = 1$, and that the sign sequence is $+, -$. Thus, from (3.A.59) we obtain $X \le_{\mathrm{Lorenz}} Y$. (Another proof of $X \le_{*} Y \implies X \le_{\mathrm{Lorenz}} Y$ can be found in Section 4.B.3.) Now suppose that $X \le_{\mathrm{Lorenz}} Y$ and that $EX \le EY$. Then

$$X \le_{\mathrm{cx}} \frac{EX}{EY}\, Y \le_{\mathrm{st}} Y.$$

Thus we see from Theorem 4.A.6(b) that $X \le_{\mathrm{icx}} Y$.

The following theorem describes a star order comparison of two functions of the same random variable.

Theorem 4.B.5. Let $X$ be a nonnegative random variable that is not degenerate at 0, and let $g$ and $h$ be nonnegative increasing functions, defined on $[0, \infty)$, such that $g(x) > 0$ and $h(x) > 0$ for all $x > 0$. If $h(x)/g(x)$ is increasing in $x \in (0, \infty)$, then $g(X) \le_{*} h(X)$.

Proof. Denote by $F$ the distribution function of $X$. From the assumption that $h(x)/g(x)$ is increasing in $x \in (0, \infty)$ it follows that

$$\frac{h(F^{-1}(u))}{g(F^{-1}(u))} \text{ is increasing in } u \in (0, 1).$$

Therefore, denoting by $F_g$ and $F_h$ the distribution functions of $g(X)$ and of $h(X)$, we have that

$$\frac{F_h^{-1}(u)}{F_g^{-1}(u)} \text{ is increasing in } u \in (0, 1).$$

Thus $g(X) \le_{*} h(X)$ by (4.B.1).

For example, if $X$ is a nonnegative random variable, then

$$X + a \le_{*} X \quad \text{whenever } a > 0.$$
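One way to read this example off Theorem 4.B.5 is sketched below. The particular choice $g(x) = x + a$ and $h(x) = x$ is our illustration and is not spelled out in the text; it also assumes, as the theorem requires, that $X$ is not degenerate at 0.

```latex
% Take g(x) = x + a and h(x) = x in Theorem 4.B.5, with a > 0.
% Both are nonnegative and increasing on [0, \infty), and positive on (0, \infty).
\[
  \frac{h(x)}{g(x)} = \frac{x}{x + a} = 1 - \frac{a}{x + a},
  \qquad
  \frac{d}{dx}\,\frac{x}{x + a} = \frac{a}{(x + a)^{2}} > 0 .
\]
% Hence h/g is increasing on (0, \infty), and the theorem gives
% g(X) \le_{*} h(X), that is, X + a \le_{*} X.
```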

An interesting property of the order $\le_{*}$ is given in the next theorem.

Theorem 4.B.6. Let $X$ and $Y$ be positive random variables. If $X \le_{*} Y$, then $X^{p} \le_{*} Y^{p}$ for any $p \ne 0$. In particular, $1/X \le_{*} 1/Y$.

Proof. Let $F$ and $G$ be the distribution functions of $X$ and $Y$, respectively. First consider the case where $p > 0$. Then the distribution functions $\tilde F$ and $\tilde G$ of $X^{p}$ and $Y^{p}$, respectively, are given by

$$\tilde F(x) = F(x^{1/p}) \quad\text{and}\quad \tilde G(x) = G(x^{1/p}), \quad x \ge 0.$$

Noting that $\tilde G^{-1}(\tilde F(x)) = (G^{-1}(F(x^{1/p})))^{p}$ we compute

$$\frac{\tilde G^{-1}(\tilde F(x))}{x} = \frac{(G^{-1}(F(x^{1/p})))^{p}}{x} = \left(\frac{G^{-1}(F(y))}{y}\right)^{p},$$

where $y = x^{1/p}$. From the assumption $X \le_{*} Y$ it is seen that the right-hand side of the above equation is increasing in $y \ge 0$, and therefore the left-hand side of that equation is increasing in $x \ge 0$.

Now, in order to complete the proof it is only necessary to prove that $1/X \le_{*} 1/Y$. Let now $\tilde F$ and $\tilde G$ denote the distribution functions of $1/X$ and $1/Y$, respectively. These are given by

$$\tilde F(x) = \bar F(1/x) \quad\text{and}\quad \tilde G(x) = \bar G(1/x), \quad x > 0,$$

where $\bar F \equiv 1 - F$ and $\bar G \equiv 1 - G$. Noting that $\tilde G^{-1} = 1/\bar G^{-1}$ and that $\bar G^{-1}\bar F = G^{-1}F$, we compute

$$\frac{\tilde G^{-1}(\tilde F(x))}{x} = \frac{1/x}{\bar G^{-1}(\bar F(1/x))} = \frac{1}{G^{-1}(F(1/x))\,x}.$$

From the assumption $X \le_{*} Y$ it is seen that the latter expression is increasing in $x > 0$.

Example 4.B.7. Let $X$ and $Y$ be two positive random variables, and let $E_1$ be a mean 1 exponential random variable which is independent of both $X$ and $Y$. Define $\tilde X = E_1/X$ and $\tilde Y = E_1/Y$; that is, the distributions of both $\tilde X$ and $\tilde Y$ are scale mixtures of exponential distributions. Then

$$X \le_{*} Y \implies \tilde X \le_{*} \tilde Y.$$

The proof is obtained by showing that $X \le_{*} Y \implies 1/\tilde X \le_{*} 1/\tilde Y$, and then using Theorem 4.B.6. We omit the details. See Remarks 5.A.2 and 5.B.1 for similar results.

A characterization of the order $\le_{c}$ by means of the observed total time on test random variables (see Section 1.A.4) is given next. Let $X$ and $Y$ be two random variables with absolutely continuous distribution functions $F$ and $G$, respectively. Suppose that 0 is the left endpoint of the supports of $X$ and $Y$. Let $H_F^{-1}$ and $H_G^{-1}$ be the TTT transforms associated with $F$ and $G$, respectively (see (1.A.19)), and let $H_F$ and $H_G$ be the respective inverses. Let $X_{\mathrm{ttt}}$ and $Y_{\mathrm{ttt}}$ be random variables with distribution functions $H_F$ and $H_G$.

Theorem 4.B.8. Let $X$ and $Y$ be two nonnegative random variables with absolutely continuous distribution functions having 0 as the left endpoint of their supports. Then

$$X \le_{c} Y \iff X_{\mathrm{ttt}} \le_{c} Y_{\mathrm{ttt}}.$$

Proof. Note that $X \le_{c} Y$ if, and only if,

$$\frac{f(F^{-1}(u))}{g(G^{-1}(u))} \text{ is increasing in } u \in [0, 1], \tag{4.B.5}$$

where $f$ and $g$ are the densities associated with $F$ and $G$. From (4.B.5), (3.B.16), and (3.B.17) it is seen that $X \le_{c} Y$ if, and only if, the ratio $h_F(H_F^{-1}(u))/h_G(H_G^{-1}(u))$ is increasing in $u \in [0, 1]$, where $h_F$ and $h_G$ are the density functions associated with $H_F$ and $H_G$, respectively. Thus, again by (4.B.5), we obtain the stated result.

A related result is the following.

Theorem 4.B.9. Let $X$ and $Y$ be two nonnegative random variables with absolutely continuous distribution functions having 0 as the left endpoint of their supports. If $X \le_{*} Y$, then $X_{\mathrm{ttt}} \le_{*} Y_{\mathrm{ttt}}$.

See related results in Theorems 1.A.29, 3.B.1, 4.A.44, and 4.B.29.

The following characterization of the order $\le_{*}$ is similar to the characterization of the order $\le_{\mathrm{hr}}$ in Theorem 1.B.12.

Theorem 4.B.10. Let $X$ and $Y$ be two random variables with continuous distribution functions $F$ and $G$, respectively, with common support $[0, \infty)$. The following conditions are equivalent:

(a) $X \le_{*} Y$.

(b) For all functions $\alpha$ and $\beta$, such that $\alpha$ is nonnegative and $\alpha$ and $\alpha/\beta$ are decreasing, and such that $\int_0^1 \alpha(u)\,dF^{-1}(u) < \infty$, $\int_0^1 \alpha(u)\,dG^{-1}(u) < \infty$, $0 \ne \int_0^1 \beta(u)\,dF^{-1}(u) < \infty$, and $0 \ne \int_0^1 \beta(u)\,dG^{-1}(u) < \infty$, we have

$$\frac{\int_0^1 \alpha(u)\,dG^{-1}(u)}{\int_0^1 \beta(u)\,dG^{-1}(u)} \le \frac{\int_0^1 \alpha(u)\,dF^{-1}(u)}{\int_0^1 \beta(u)\,dF^{-1}(u)}.$$

(c) For any two increasing functions $a$ and $b$ such that $b$ is nonnegative, if $\int_0^1 a(u)b(u)\,dF^{-1}(u) = 0$, then $\int_0^1 a(u)b(u)\,dG^{-1}(u) \le 0$.

The orders $\le_{c}$, $\le_{*}$, and $\le_{\mathrm{su}}$ can be used to characterize, respectively, IFR, IFRA, and NBU random variables as follows.

Theorem 4.B.11. Let Exp denote any exponential random variable (no matter what its mean is). Let $X$ be a nonnegative random variable. Then

$$X \text{ is IFR} \iff X \le_{c} \mathrm{Exp}, \qquad X \text{ is IFRA} \iff X \le_{*} \mathrm{Exp}, \qquad X \text{ is NBU} \iff X \le_{\mathrm{su}} \mathrm{Exp}.$$

The theorem follows at once from the definitions and the observation that a random variable is IFR [IFRA, NBU] if, and only if, the negative of the logarithm of its survival function is convex [starshaped, superadditive] on $(0, \infty)$.

The claim in the next example is easy to prove.

Example 4.B.12. Let $X$ be a nonnegative random variable with an absolutely continuous distribution function. Then $X$ has a decreasing density if, and only if, $U \le_{c} X$, where $U$ is a uniform$[0, 1]$ random variable.
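The observation behind Theorem 4.B.11 (that IFR, IFRA, and NBU correspond to convexity, starshapedness, and superadditivity of $-\log \bar F$) can be explored on a grid. The sketch below is illustrative only: the function names, the grid, and the choice of unit-scale Weibull distributions, for which $-\log \bar F(x) = x^{\text{shape}}$, are assumptions of ours, and a finite grid check is of course not a proof.

```python
def cumulative_hazard(x, shape):
    """-log of the survival function of a unit-scale Weibull: x ** shape."""
    return x ** shape


def classify(shape, n=100, step=0.05, tol=1e-9):
    """Grid checks of (convex, starshaped, superadditive) for -log survival."""
    xs = [step * i for i in range(1, n + 1)]
    lam = [cumulative_hazard(x, shape) for x in xs]

    # Convex: second differences on the equally spaced grid are nonnegative.
    convex = all(
        lam[i - 1] - 2.0 * lam[i] + lam[i + 1] >= -tol for i in range(1, n - 1)
    )

    # Starshaped: lam(x) / x is nondecreasing in x.
    ratios = [value / x for value, x in zip(lam, xs)]
    starshaped = all(b >= a - tol for a, b in zip(ratios, ratios[1:]))

    # Superadditive: lam(x + y) >= lam(x) + lam(y) on a coarser sub-grid.
    coarse = xs[::5]
    superadditive = all(
        cumulative_hazard(x + y, shape)
        >= cumulative_hazard(x, shape) + cumulative_hazard(y, shape) - tol
        for x in coarse
        for y in coarse
    )
    return convex, starshaped, superadditive


print(classify(2.0))  # Weibull with shape 2
print(classify(0.5))  # Weibull with shape 1/2
```

With shape 2 all three checks pass, which is consistent with that Weibull being IFR (and hence IFRA and NBU); with shape 1/2 all three fail.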


Example 4.B.13. Let $U_{(j:m)}$ and $U_{(i:n)}$ denote the $j$th and the $i$th order statistics of samples from the uniform distribution on $[0, 1]$ of sizes $m$ and $n$, respectively. Then

$$U_{(j:m)} \le_{*} U_{(i:n)} \quad \text{whenever } i - j \ge \max\{0, n - m\}.$$

This follows from Lemma 3.B.27 and (4.B.4), and from the fact that if $U$ is a uniform random variable on $[0, 1]$, then $-\log(1 - U)$ is a standard exponential random variable. It is worthwhile to mention that the above inequality, together with Theorem 4.B.4, yields the first three inequalities in Example 3.A.49.

The following example may be compared with Examples 1.B.24, 1.C.48, 2.A.22, 3.B.38, 6.B.41, 6.D.8, and 6.E.13.

Example 4.B.14. Let $X$ and $Y$ be two absolutely continuous nonnegative random variables with survival functions $\bar F$ and $\bar G$, respectively. Denote $\Lambda_1 = -\log \bar F$ and $\Lambda_2 = -\log \bar G$. Consider two nonhomogeneous Poisson processes $N_1 = \{N_1(t), t \ge 0\}$ and $N_2 = \{N_2(t), t \ge 0\}$ with mean functions $\Lambda_1$ and $\Lambda_2$ (see Example 3.B.37), respectively. Let $T_{i,1}, T_{i,2}, \ldots$ be the successive epoch times of process $N_i$, $i = 1, 2$. Note that $X =_{\mathrm{st}} T_{1,1}$ and $Y =_{\mathrm{st}} T_{2,1}$. It turns out that any of the three transform orderings of the first two epoch times implies the same ordering of all the corresponding later epoch times; that is, if $X \le_{c} [\le_{*}, \le_{\mathrm{su}}]\ Y$, then $T_{1,n} \le_{c} [\le_{*}, \le_{\mathrm{su}}]\ T_{2,n}$, $n \ge 1$. The proof of this fact is similar to the proof in Example 3.B.38, and is therefore omitted.

Similar to the orders $\le_{\mathrm{st}}$, $\le_{\mathrm{hr}}$, and $\le_{\mathrm{lr}}$ (see Theorems 1.B.34, 1.C.33, and 6.B.23), the orders $\le_{c}$, $\le_{*}$, and $\le_{\mathrm{su}}$ are also preserved under the formation of order statistics. This is shown in the next result.

Theorem 4.B.15. Let $(X_i, Y_i)$, $i = 1, 2, \ldots, m$, be independent pairs of random variables such that $X_i \le_{c} [\le_{*}, \le_{\mathrm{su}}]\ Y_i$, $i = 1, 2, \ldots, m$. Denote the corresponding order statistics by $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ and $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(m)}$. Suppose that the $X_i$'s are identically distributed and that the $Y_i$'s are identically distributed. Then

$$X_{(k)} \le_{c} [\le_{*}, \le_{\mathrm{su}}]\ Y_{(k)}, \quad k = 1, 2, \ldots, m. \tag{4.B.6}$$

Proof. Let $F$ [$G$] denote the common distribution function of the $X_i$'s [$Y_i$'s] and let $F_{(k)}$ [$G_{(k)}$] denote the distribution function of $X_{(k)}$ [$Y_{(k)}$]. Then it is well known that

$$F_{(k)}(x) = \frac{m!}{(k-1)!\,(m-k)!} \int_0^{F(x)} u^{k-1}(1-u)^{m-k}\,du$$

and, similarly,

$$G_{(k)}(x) = \frac{m!}{(k-1)!\,(m-k)!} \int_0^{G(x)} u^{k-1}(1-u)^{m-k}\,du.$$

Thus, $G_{(k)}^{-1}F_{(k)} = G^{-1}F$, and (4.B.6) follows from the assumptions of the theorem.
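The identity $G_{(k)}^{-1}F_{(k)} = G^{-1}F$ at the heart of this proof can be checked numerically. The sketch below is an illustration under our own assumptions: the parent distributions (standard exponential for the $X_i$'s, Weibull with shape 1/2 for the $Y_i$'s, so that $G^{-1}F(x) = x^2$), the helper names, and the use of the binomial-sum form of the order-statistic distribution function, which is the standard equivalent of the integral above.

```python
import math


def order_stat_cdf(p, k, m):
    """cdf of the k-th of m i.i.d. order statistics where the parent cdf equals p."""
    return sum(
        math.comb(m, j) * p**j * (1.0 - p) ** (m - j) for j in range(k, m + 1)
    )


def invert_increasing(func, target, lo, hi, iters=200):
    """Bisection for func(y) = target, with func increasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def F(x):
    """Hypothetical parent cdf of the X_i's: standard exponential."""
    return 1.0 - math.exp(-x)


def G(y):
    """Hypothetical parent cdf of the Y_i's: Weibull with shape 1/2."""
    return 1.0 - math.exp(-math.sqrt(y))


def transform_via_order_stats(x, k, m):
    """Evaluate G_(k)^{-1}(F_(k)(x)) numerically."""
    target = order_stat_cdf(F(x), k, m)
    return invert_increasing(
        lambda y: order_stat_cdf(G(y), k, m), target, 0.0, 1.0e4
    )


# For these parents G^{-1}F(x) = x**2, whatever k and m are.
for k in range(1, 6):
    print(k, transform_via_order_stats(1.5, k, 5), 1.5**2)
```

Because the map $p \mapsto F_{(k)}$ depends on the parent only through $p = F(x)$, the transform between the $k$th order statistics coincides with the transform between the parents, which is why convexity, starshapedness, or superadditivity carries over.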

In the following example it is shown that, under the proper conditions, random minima and maxima are ordered in the convex transform, star, and superadditive order senses; see related results in Examples 1.C.46, 3.B.39, 5.A.24, and 5.B.13.

Example 4.B.16. Let $X_1, X_2, \ldots$, and $Y_1, Y_2, \ldots$, each be a sequence of independent and identically distributed random variables. Let $N$ be a positive integer-valued random variable, independent of the $X_i$'s and of the $Y_i$'s. Denote $X_{(1:N)} = \min\{X_1, X_2, \ldots, X_N\}$, $X_{(N:N)} = \max\{X_1, X_2, \ldots, X_N\}$, $Y_{(1:N)} = \min\{Y_1, Y_2, \ldots, Y_N\}$, and $Y_{(N:N)} = \max\{Y_1, Y_2, \ldots, Y_N\}$. It can be shown that if $X_1 \le_{c} [\le_{*}, \le_{\mathrm{su}}]\ Y_1$, then $X_{(1:N)} \le_{c} [\le_{*}, \le_{\mathrm{su}}]\ Y_{(1:N)}$ and $X_{(N:N)} \le_{c} [\le_{*}, \le_{\mathrm{su}}]\ Y_{(N:N)}$.

The convex transform order between $X$ and $Y$ implies the usual stochastic order between ratios of the corresponding spacings, as the next result shows; related results can be found in Theorem 1.C.45, and in Examples 6.B.25 and 6.E.15. In the next result we use the following notation. Let $X_{(1:n)} \le X_{(2:n)} \le \cdots \le X_{(n:n)}$ and $Y_{(1:n)} \le Y_{(2:n)} \le \cdots \le Y_{(n:n)}$ be the order statistics corresponding to samples $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$; each consists of independent, identically distributed random variables, where the $X_i$'s have the same distribution as $X$, and the $Y_i$'s have the same distribution as $Y$. The corresponding spacings are defined by $U_{(i:n)} \equiv X_{(i:n)} - X_{(i-1:n)}$ and $V_{(i:n)} \equiv Y_{(i:n)} - Y_{(i-1:n)}$, $i = 2, 3, \ldots, n$.

Theorem 4.B.17. Let $X$ and $Y$ be two random variables. If $X \le_{c} Y$, then

$$\frac{U_{(j:n)}}{U_{(i:n)}} \le_{\mathrm{st}} \frac{V_{(j:n)}}{V_{(i:n)}} \quad \text{for } 2 \le i \le j \le n.$$

Proof. First note that from the convexity of $G^{-1}F$ we obtain

$$\frac{G^{-1}F(x_2) - G^{-1}F(x_1)}{x_2 - x_1} \le \frac{G^{-1}F(x_4) - G^{-1}F(x_3)}{x_4 - x_3}$$

whenever $x_1 \le [x_2, x_3] \le x_4$, where $x_1 \le [x_2, x_3] \le x_4$ denotes $x_1 \le x_2 \le x_4$ and $x_1 \le x_3 \le x_4$. Thus, for $2 \le i \le j \le n$,

$$
\begin{aligned}
P\left\{\frac{U_{(j:n)}}{U_{(i:n)}} > z\right\}
&= P\left\{\frac{X_{(j:n)} - X_{(j-1:n)}}{X_{(i:n)} - X_{(i-1:n)}} > z\right\} \\
&\le P\left\{\frac{G^{-1}F(X_{(j:n)}) - G^{-1}F(X_{(j-1:n)})}{G^{-1}F(X_{(i:n)}) - G^{-1}F(X_{(i-1:n)})} > z\right\} \\
&= P\left\{\frac{V_{(j:n)}}{V_{(i:n)}} > z\right\},
\end{aligned}
$$

where the last equality follows from the observation that the joint distribution of $G^{-1}F(X_{(i:n)})$, $G^{-1}F(X_{(i-1:n)})$, $G^{-1}F(X_{(j:n)})$, and $G^{-1}F(X_{(j-1:n)})$ is the same as the joint distribution of $Y_{(i:n)}$, $Y_{(i-1:n)}$, $Y_{(j:n)}$, and $Y_{(j-1:n)}$.

Under a weaker assumption than the one in Theorem 4.B.17 we have the following results.

Theorem 4.B.18. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, respectively, such that $F(0) = G(0) = 0$. Let $0 \le p \le q$. If $X \le_{*} Y$, then

(a) $E[X_{(i:n)}^{p}]/E[Y_{(i:n)}^{q}]$ is decreasing in $i$,
(b) $E[X_{(i:n)}^{p}]/E[Y_{(i:n)}^{q}]$ is increasing in $n$, and
(c) $E[X_{(n-i:n)}^{p}]/E[Y_{(n-i:n)}^{q}]$ is decreasing in $n$,

provided the expectations exist.

The notation in Theorem 4.B.17 is used in the next result.

Theorem 4.B.19. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{*} Y$, then $E[U_{(i:n)}] \le E[V_{(i:n)}]$, $i = 2, 3, \ldots, n$.

4.B.3 Some related orders

In this subsection we consider random variables $X$ and $Y$ with distribution functions $F$ and $G$, respectively, and with supports of the form $[0, a)$, $a > 0$ ($a$ can be infinity). We assume throughout this subsection that $X$ and $Y$ have finite means. Denote the mrl functions (see (2.A.1)) that are associated with $X$ and $Y$ by $m$ and $l$, respectively. The random variable $X$ is said to be smaller than $Y$ in the DMRL order (denoted by $X \le_{\mathrm{dmrl}} Y$) if

$$\frac{l(G^{-1}(u))}{m(F^{-1}(u))} \text{ is increasing in } u \in [0, 1]. \tag{4.B.7}$$

Note that (4.B.7) is the same as the condition

$$\frac{\frac{1}{EY}\int_{G^{-1}(u)}^{\infty} \bar G(x)\,dx}{\frac{1}{EX}\int_{F^{-1}(u)}^{\infty} \bar F(x)\,dx} \text{ is increasing in } u \in [0, 1], \tag{4.B.8}$$

where $\bar F$ and $\bar G$ are the survival functions associated with $F$ and $G$, respectively. Condition (4.B.8) can be written equivalently as

$$\frac{EY - H_G^{-1}(u)}{EX - H_F^{-1}(u)} \text{ is increasing in } u \in [0, 1],$$

where $H_F^{-1}$ and $H_G^{-1}$ are the TTT transforms (see (1.A.19)) that are associated with $F$ and $G$, respectively.

Theorem 4.B.20. Let $X$ and $Y$ be two random variables, each with support of the form $[0, a)$. If $X \le_{c} Y$, then $X \le_{\mathrm{dmrl}} Y$.

Proof. Let the equilibrium survival functions associated with $F$ and $G$ be defined as

$$\bar F_e(x) = \int_x^{\infty} \frac{\bar F(t)}{EX}\,dt \quad\text{and}\quad \bar G_e(x) = \int_x^{\infty} \frac{\bar G(t)}{EY}\,dt.$$

Let $\alpha(x) \equiv G^{-1}(F(x)) = \bar G^{-1}(\bar F(x))$ and let

$$\gamma(u) \equiv \bar F_e\,\alpha^{-1}\,\bar G_e^{-1}(u), \quad u \in [0, 1].$$

For simplicity suppose that $\alpha$ and $\gamma$ are differentiable. A lengthy straightforward computation gives

$$\gamma'(u) = \frac{EX}{EY} \cdot \frac{d}{dx}\,\alpha^{-1}(x)\bigg|_{x = \bar G_e^{-1}(u)}. \tag{4.B.9}$$

By assumption, $\alpha$ is convex. It follows from (4.B.9) that $\gamma$ is convex, and therefore $\gamma$ is starshaped. That is,

$$\frac{\bar F_e\,\bar F^{-1}\,\bar G\,\bar G_e^{-1}(u)}{u} \text{ is increasing in } u \in [0, 1].$$

Equivalently,

$$\frac{\bar F_e(F^{-1}(u))}{\bar G_e(G^{-1}(u))} \text{ is decreasing in } u \in [0, 1],$$

and (4.B.8) is obtained.

The random variable $X$ is said to be smaller than $Y$ in the NBUE order (denoted by $X \le_{\mathrm{nbue}} Y$) if

$$\frac{m(F^{-1}(u))}{l(G^{-1}(u))} \le \frac{EX}{EY} \quad \text{for all } u \in [0, 1]. \tag{4.B.10}$$

Note that (4.B.10) is the same as the condition

$$\frac{1}{EX}\int_{F^{-1}(u)}^{\infty} \bar F(x)\,dx \le \frac{1}{EY}\int_{G^{-1}(u)}^{\infty} \bar G(x)\,dx \quad \text{for all } u \in [0, 1]. \tag{4.B.11}$$

Condition (4.B.11) can be written equivalently as

$$\frac{H_F^{-1}(u)}{EX} \ge \frac{H_G^{-1}(u)}{EY} \quad \text{for all } u \in [0, 1].$$

From (4.B.11) and (3.C.1) it follows that if $EX = EY$, then $X \le_{\mathrm{nbue}} Y \iff X \le_{\mathrm{ew}} Y$. In other words, for nonnegative random variables $X$ and $Y$ we have

$$X \le_{\mathrm{nbue}} Y \iff \frac{X}{EX} \le_{\mathrm{ew}} \frac{Y}{EY}. \tag{4.B.12}$$

Without the condition that $EX = EY$ the orders $\le_{\mathrm{nbue}}$ and $\le_{\mathrm{ew}}$ are distinct (see Kochar, Li, and Shaked [316]). The following result is immediate from (4.B.7) and (4.B.10).

Theorem 4.B.21. Let $X$ and $Y$ be two random variables, each with support of the form $[0, a)$. If $X \le_{\mathrm{dmrl}} Y$, then $X \le_{\mathrm{nbue}} Y$.

In the following two theorems some further relationships among some orders are proven.

Theorem 4.B.22. Let $X$ and $Y$ be two random variables, each with support of the form $[0, a)$. If $X \le_{*} Y$, then $X \le_{\mathrm{nbue}} Y$.

Proof. If $X \le_{*} Y$, then, from Theorem 4.B.9, we have that $H_G^{-1}(u)/H_F^{-1}(u)$ is increasing in $u \in [0, 1]$ (see (4.B.1)). Therefore,

$$\frac{H_G^{-1}(u)}{H_F^{-1}(u)} \le \frac{H_G^{-1}(1)}{H_F^{-1}(1)} = \frac{EY}{EX}.$$

Rearranging gives $H_F^{-1}(u)/EX \ge H_G^{-1}(u)/EY$ for all $u \in [0, 1]$, which is the equivalent form of (4.B.11).

Recall from Theorem 3.B.16 that for random variables with the same means the dispersion order implies the convex order. Thus, from Theorem 4.B.2 it follows that for nonnegative random variables $X$ and $Y$ with finite means, such that $X/EX \le_{\mathrm{st}} Y/EY$, we have that the star order implies the Lorenz order. However, a stronger result is true: one can obtain the Lorenz order without assuming any usual stochastic comparison associated with $X$ and $Y$. This follows from Theorem 4.B.22 and the next result.

Theorem 4.B.23. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{\mathrm{nbue}} Y$, then $X \le_{\mathrm{Lorenz}} Y$.

Proof. The proof follows at once from (4.B.11) and (3.C.8).

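A summary of the implications among orders that were mentioned so far in this section is given in the following chart.

$$
\begin{array}{ccccc}
X \le_{c} Y & \Longrightarrow & X \le_{*} Y & \Longrightarrow & X \le_{\mathrm{su}} Y \\
\Downarrow & & \Downarrow & & \\
X \le_{\mathrm{dmrl}} Y & \Longrightarrow & X \le_{\mathrm{nbue}} Y & \Longrightarrow & X \le_{\mathrm{Lorenz}} Y
\end{array}
$$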
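The scaled TTT form of (4.B.11), $H_F^{-1}(u)/EX \ge H_G^{-1}(u)/EY$, lends itself to a quick numerical check. The following is a rough sketch only: the midpoint-rule integrator, the grid, the function names, and the example pair (a unit-scale Weibull with shape 2 against a standard exponential) are our assumptions rather than anything prescribed in the text.

```python
import math


def scaled_ttt(survival, quantile, mean, u, steps=2000):
    """Midpoint-rule value of (1 / mean) * int_0^{F^{-1}(u)} survival(x) dx."""
    upper = quantile(u)
    dx = upper / steps
    integral = dx * sum(survival((i + 0.5) * dx) for i in range(steps))
    return integral / mean


def nbue_ordered(x, y, grid_size=49, tol=1e-6):
    """Grid check of H_F^{-1}(u)/EX >= H_G^{-1}(u)/EY.

    x and y are (survival, quantile, mean) triples.
    """
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return all(scaled_ttt(*x, u) >= scaled_ttt(*y, u) - tol for u in grid)


# Hypothetical X: Weibull with shape 2 and unit scale; its mean is Gamma(1.5).
weibull2 = (
    lambda t: math.exp(-t * t),
    lambda u: math.sqrt(-math.log(1.0 - u)),
    math.gamma(1.5),
)

# Hypothetical Y: standard exponential, whose scaled TTT transform is u itself.
exponential = (
    lambda t: math.exp(-t),
    lambda u: -math.log(1.0 - u),
    1.0,
)

print(nbue_ordered(weibull2, exponential))  # X <=_nbue Y on the grid
print(nbue_ordered(exponential, weibull2))  # the reverse comparison
```

Dividing by the mean makes the comparison scale free, which matches the scale invariance of the Lorenz order at the end of the chart.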


Remark 4.B.24. Using the above facts, we provide here a simple proof of Theorem 4.A.68. Recall from Theorem 4.B.6 that for any $p > 0$ we have that $X \le_{*} Y$ if, and only if, $X^{p} \le_{*} Y^{p}$. Thus, from (4.A.44) and from Theorems 4.B.22 and 4.B.23 it is seen that if $X \le_{*} Y$, then $X^{p} \le_{\mathrm{Lorenz}} Y^{p}$. This observation, again with the aid of (4.A.44), proves Theorem 4.A.68.

The orders $\le_{\mathrm{dmrl}}$ and $\le_{\mathrm{nbue}}$ can be used to characterize, respectively, DMRL and NBUE random variables as follows.

Theorem 4.B.25. Let Exp denote any exponential random variable (no matter what its mean is). Let $X$ be a nonnegative random variable. Then

$$X \text{ is DMRL} \iff X \le_{\mathrm{dmrl}} \mathrm{Exp}, \qquad X \text{ is NBUE} \iff X \le_{\mathrm{nbue}} \mathrm{Exp}.$$

The theorem follows at once from the definitions and the observation that the mrl function of an exponential random variable is a constant.

Recall from (1.A.19) the definition of the TTT transform. We will now introduce and discuss an order that is defined through a comparison of TTT transforms. Let $X$ and $Y$ be two nonnegative random variables with distribution functions $F$ and $G$, respectively. If

$$\int_0^{F^{-1}(u)} \bar F(x)\,dx \le \int_0^{G^{-1}(u)} \bar G(x)\,dx \quad \text{for all } u \in (0, 1), \tag{4.B.13}$$

then $X$ is said to be smaller than $Y$ in the TTT order (denoted by $X \le_{\mathrm{ttt}} Y$). A simple sufficient condition for the order $\le_{\mathrm{ttt}}$ is the usual stochastic order:

$$X \le_{\mathrm{st}} Y \implies X \le_{\mathrm{ttt}} Y. \tag{4.B.14}$$

In order to verify (4.B.14) one may just notice that if $X \le_{\mathrm{st}} Y$, then $F^{-1}(u) \le G^{-1}(u)$ for all $u \in (0, 1)$ (see (1.A.12)). By letting $u \to 1$ in (4.B.13) it is seen that

$$X \le_{\mathrm{ttt}} Y \implies EX \le EY. \tag{4.B.15}$$
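To make (4.B.13) and (4.B.14) concrete, the sketch below approximates the two integrals by a midpoint rule and compares them on a grid. It is a minimal illustration under our own assumptions: the helper names, the grid, and the use of exponential distributions, for which the integral has the closed form $\int_0^{F^{-1}(u)} e^{-x/\mu}\,dx = \mu u$ when the mean is $\mu$.

```python
import math


def ttt_transform(survival, quantile, u, steps=2000):
    """Midpoint-rule value of int_0^{F^{-1}(u)} survival(x) dx."""
    upper = quantile(u)
    dx = upper / steps
    return dx * sum(survival((i + 0.5) * dx) for i in range(steps))


def exponential(mean):
    """(survival, quantile) pair of an exponential distribution with this mean."""
    return (
        lambda x: math.exp(-x / mean),
        lambda u: -mean * math.log(1.0 - u),
    )


def ttt_ordered(x, y, grid_size=49, tol=1e-6):
    """Grid check of (4.B.13) for two (survival, quantile) pairs."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return all(ttt_transform(*x, u) <= ttt_transform(*y, u) + tol for u in grid)


# Closed form for an exponential with mean 2 at u = 0.5 is 2 * 0.5.
print(ttt_transform(*exponential(2.0), 0.5))

# Exp(mean 1) <=_st Exp(mean 2), so (4.B.14) suggests the TTT order as well.
print(ttt_ordered(exponential(1.0), exponential(2.0)))
print(ttt_ordered(exponential(2.0), exponential(1.0)))
```

For two exponentials the comparison reduces to a comparison of the means, in line with (4.B.15).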

From (4.B.11) and (4.B.13) it follows that if $EX = EY$, then $X \le_{\mathrm{ttt}} Y \iff X \ge_{\mathrm{nbue}} Y$. In other words, for nonnegative random variables $X$ and $Y$ we have

$$X \ge_{\mathrm{nbue}} Y \iff \frac{X}{EX} \le_{\mathrm{ttt}} \frac{Y}{EY};$$

see a similar relation in (4.B.12). It is easy to see that for any two nonnegative random variables $X$ and $Y$ we have

$$X \le_{\mathrm{ttt}} Y \implies aX \le_{\mathrm{ttt}} aY \quad \text{for any } a > 0.$$

An important closure property of the order $\le_{\mathrm{ttt}}$, analogous to Theorem 3.C.4, is given next.

Theorem 4.B.26. Let $X$ and $Y$ be two finite mean continuous nonnegative random variables with interval supports, and with 0 being the common left endpoint of the supports. Then, for any increasing concave function $\phi$, such that $\phi(0) = 0$, we have

$$X \le_{\mathrm{ttt}} Y \implies \phi(X) \le_{\mathrm{ttt}} \phi(Y).$$

As a corollary we obtain an analog of (3.C.8):

Corollary 4.B.27. Let $X$ and $Y$ be two finite mean continuous nonnegative random variables with interval supports, and with 0 being the common left endpoint of the supports. Then

$$X \le_{\mathrm{ttt}} Y \implies X \le_{\mathrm{icv}} Y.$$

Proof. Suppose that $X \le_{\mathrm{ttt}} Y$. Let $\phi$ be an increasing concave function defined on $[0, \infty)$. Define $\tilde\phi(\cdot) = \phi(\cdot) - \phi(0)$, so that $\tilde\phi(0) = 0$. From Theorem 4.B.26 we obtain $\tilde\phi(X) \le_{\mathrm{ttt}} \tilde\phi(Y)$. Hence from (4.B.15) we get $E[\tilde\phi(X)] \le E[\tilde\phi(Y)]$, and this reduces to $E[\phi(X)] \le E[\phi(Y)]$, provided the expectations exist.

An interesting closure property of the order ≤ttt , analogous to Theorem 3.C.11, is given next. Theorem 4.B.28. Let X1 , X2 , . . . be a collection of independent and identically distributed random variables, and let Y1 , Y2 , . . . be another collection of independent and identically distributed random variables. Also, let N be a positive, integer-valued, random variable, independent of the Xi ’s and of the Yi ’s. If X1 and Y1 are nonnegative, and if X1 ≤ttt Y1 , then min{X1 , X2 , . . . , XN } ≤ttt min{Y1 , Y2 , . . . , YN }. Some interesting connections between the order ≤ttt and observed total time on test random variables are given in the next theorem. Let X and Y be two nonnegative random variables. Recall from Section 1.A.4 the deﬁnition of the observed total time on test random variables Xttt and Yttt . Theorem 4.B.29. Let X and Y be two nonnegative random variables. Then Xttt ≤st Yttt ⇐⇒ X ≤ttt Y and X ≤ttt Y =⇒ Xttt ≤ttt Yttt . Some related results can be found in Theorems 1.A.29, 3.B.1, 4.A.44, 4.B.8, and 4.B.9. The following example describes comparisons of random variables that arise in the model of imperfect repair, and as the lifetimes of series systems.


Example 4.B.30. Let $X$ be a nonnegative random variable with survival function $\bar F$. For $\theta > 0$, let $X(\theta)$ denote a random variable with the survival function $(\bar F)^{\theta}$. Similarly, if $Y$ is a nonnegative random variable with the survival function $\bar G$, then denote by $Y(\theta)$ a random variable with survival function $(\bar G)^{\theta}$. Suppose that both $X$ and $Y$ have 0 as the left endpoint of their supports.

(a) If $\theta > 1$, then $X \le_{\mathrm{ttt}} Y \implies X(\theta) \le_{\mathrm{ttt}} Y(\theta)$.
(b) If $\theta < 1$, then $X(\theta) \le_{\mathrm{ttt}} Y(\theta) \implies X \le_{\mathrm{ttt}} Y$.

A generalization of the TTT order is described next. This generalization contains as special cases the orders $\le_{\mathrm{st}}$, $\le_{\mathrm{lir}}$, and $\le_{\mathrm{ttt}}$. Let $\mathcal{H}$ denote the set of all functions $h$ such that $h(u) > 0$ for $u \in (0, 1)$, and $h(u) = 0$ for $u \notin [0, 1]$. For $h \in \mathcal{H}$, if

$$\int_{-\infty}^{F^{-1}(p)} h(F(x))\,dx \le \int_{-\infty}^{G^{-1}(p)} h(G(x))\,dx, \quad p \in (0, 1),$$

then we say that $X$ is smaller than $Y$ in the generalized total time on test transform order with respect to $h$. We denote this by $X \le_{\mathrm{ttt}}^{(h)} Y$.

Example 4.B.31. Let $X$ and $Y$ be random variables with the same left endpoint of support $a > -\infty$. Let $h$ be a constant function on $[0, 1]$; that is, $h(u) = c$, $u \in [0, 1]$, for some $c > 0$, and $h(u) = 0$ otherwise. Then $X \le_{\mathrm{ttt}}^{(h)} Y$ if, and only if, $F^{-1}(p) \le G^{-1}(p)$, $p \in (0, 1)$; that is (by (1.A.12)), if, and only if, $X \le_{\mathrm{st}} Y$.

Example 4.B.32. Let $h(u) = u$, $u \in [0, 1]$, and $h(u) = 0$ otherwise. Then $X \le_{\mathrm{ttt}}^{(h)} Y$ if, and only if,

$$\int_{-\infty}^{F^{-1}(p)} F(x)\,dx \le \int_{-\infty}^{G^{-1}(p)} G(x)\,dx, \quad p \in (0, 1);$$

that is, if, and only if, $X \le_{\mathrm{lir}} Y$; the order $\le_{\mathrm{lir}}$ is defined in Section 3.C.1.

Example 4.B.33. Let $X$ and $Y$ be nonnegative random variables with 0 being the left endpoint of their supports. Let $h(u) = 1 - u$, $u \in [0, 1]$, and $h(u) = 0$ otherwise. Then $X \le_{\mathrm{ttt}}^{(h)} Y$ if, and only if,

$$\int_0^{F^{-1}(p)} \bar F(x)\,dx \le \int_0^{G^{-1}(p)} \bar G(x)\,dx, \quad p \in (0, 1);$$

that is, if, and only if, $X \le_{\mathrm{ttt}} Y$.

The next result describes a relationship among the orders $\le_{\mathrm{ttt}}^{(h)}$ for different $h$'s.


Theorem 4.B.34. Let $X$ and $Y$ be two random variables with continuous distribution functions, having 0 as the left endpoint of their supports. Let $h_1, h_2 \in \mathcal{H}$. Suppose that $h_2(u)/h_1(u)$ is decreasing on $(0, 1)$. Then

$$X \le_{\mathrm{ttt}}^{(h_1)} Y \implies X \le_{\mathrm{ttt}}^{(h_2)} Y.$$

Remark 4.B.35. In Theorem 4.B.34 let $h_1(u) = u$ and $h_2(u) = c$ for some constant $c > 0$, $u \in [0, 1]$. Then by Theorem 4.B.34, $X \le_{\mathrm{ttt}}^{(h_1)} Y \implies X \le_{\mathrm{ttt}}^{(h_2)} Y$; that is, by Examples 4.B.31 and 4.B.32,

$$X \le_{\mathrm{lir}} Y \implies X \le_{\mathrm{st}} Y \tag{4.B.16}$$

when $X$ and $Y$ are two random variables with continuous distribution functions, having 0 as the left endpoint of their supports. Recall from Theorem 3.B.13(a) that if $X$ and $Y$ have 0 as the left endpoint of their supports, then $X \le_{\mathrm{disp}} Y \implies X \le_{\mathrm{st}} Y$. It is not hard to see that $X \le_{\mathrm{disp}} Y \implies X \le_{\mathrm{lir}} Y$. Thus (4.B.16) strengthens Theorem 3.B.13(a) when $X$ and $Y$ have 0 as the left endpoint of their supports.

Some relationships between the usual stochastic order $\le_{\mathrm{st}}$ and the orders $\le_{\mathrm{ttt}}^{(h)}$ are given next.

Theorem 4.B.36. Let $X$ and $Y$ be two nonnegative random variables with continuous distribution functions, having 0 as the left endpoint of their supports. Let $h \in \mathcal{H}$.

(a) If $h$ is decreasing on $[0, 1]$, then $X \le_{\mathrm{st}} Y \implies X \le_{\mathrm{ttt}}^{(h)} Y$.
(b) If $h$ is increasing on $[0, 1]$, then $X \le_{\mathrm{ttt}}^{(h)} Y \implies X \le_{\mathrm{st}} Y$.

A relationship between the order $\le_{\mathrm{icv}}$ and some orders $\le_{\mathrm{ttt}}^{(h)}$ is described next.

Theorem 4.B.37. Let $X$ and $Y$ be two random variables with continuous distribution functions, and supports $[0, a)$ and $[0, b)$, respectively, for some finite or infinite constants $a$ and $b$. Let $h \in \mathcal{H}$ be decreasing on $[0, 1]$. Then

$$X \le_{\mathrm{ttt}}^{(h)} Y \implies X \le_{\mathrm{icv}} Y.$$
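The generalized TTT functional $\int_0^{F^{-1}(p)} h(F(x))\,dx$ is easy to experiment with for the three choices of $h$ in Examples 4.B.31 to 4.B.33. The sketch below is illustrative only: the names `generalized_ttt`, `H_CHOICES`, and `ordered`, the midpoint rule, and the exponential test pair are our assumptions, and the lower limit 0 assumes supports with left endpoint 0.

```python
import math


def generalized_ttt(cdf, quantile, h, p, steps=2000):
    """Midpoint-rule value of int_0^{F^{-1}(p)} h(F(x)) dx (support starts at 0)."""
    upper = quantile(p)
    dx = upper / steps
    return dx * sum(h(cdf((i + 0.5) * dx)) for i in range(steps))


def exponential(mean):
    """(cdf, quantile) pair of an exponential distribution with this mean."""
    return (
        lambda x: 1.0 - math.exp(-x / mean),
        lambda u: -mean * math.log(1.0 - u),
    )


# The three weight functions of Examples 4.B.31, 4.B.32, and 4.B.33 on [0, 1].
H_CHOICES = {
    "constant (st)": lambda u: 1.0,
    "h(u) = u (lir)": lambda u: u,
    "h(u) = 1 - u (ttt)": lambda u: 1.0 - u,
}


def ordered(x, y, h, grid_size=19, tol=1e-6):
    """Grid check of the generalized TTT order with respect to h."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return all(
        generalized_ttt(*x, h, p) <= generalized_ttt(*y, h, p) + tol for p in grid
    )


small, large = exponential(1.0), exponential(2.0)
for label, h in H_CHOICES.items():
    print(label, ordered(small, large, h))
```

With the constant weight the functional is just the quantile $F^{-1}(p)$, and with $h(u) = 1 - u$ it is the TTT transform, so the same routine covers the special cases listed above.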

4.C Complements

Section 4.A: Some standard references for the monotone convex and concave orders are Ross [475] and Müller and Stoyan [419], where many of the results that are described in Section 4.A can be found. The characterizations of the order $\le_{\mathrm{icx}}$ by means of the quantile functions (Theorems 4.A.3 and 4.A.4) are taken from Sordo and Ramos [538]. The condition (4.A.8) is studied in Hürlimann [251]; there it is called the RaC (risk-adjusted capital) order. The present version of the characterizations of the orders $\le_{\mathrm{icx}}$ and $\le_{\mathrm{icv}}$, given in Theorem 4.A.5, is taken from Müller and Rüschendorf [415]. The two characterizations of the order $\le_{\mathrm{icx}}$, given in Theorem 4.A.6, can be found in Makowski [378]; an alternative proof of these results is given in Müller [407]. The result that gives the closure under random convolutions property of the monotone convex and concave orders (Theorem 4.A.9) and its proof are taken from Ross and Schechner [477]. Extensions of Theorem 4.A.9 are given in Jean-Marie and Liu [254]; for example, the results mentioned in Remark 4.A.10 can be found there. Theorem 4.A.11 can be found in Fagiuoli and Pellerey [186]. The comparisons of the random sums in Theorems 4.A.12–4.A.14 are motivated by ideas in Pellerey and Shaked [455]; they can be found in Pellerey [450]. The result that gives the closure under general convex increasing transformations property of the increasing convex order (Theorem 4.A.15) and its proof can be found in Ross [475]. The ordering of the maxima in the sense of $\le_{\mathrm{icx}}$ (Corollary 4.A.16) is implicit in Theorem 9 of Li, Li, and Jing [354]. The increasing convex order comparison of maxima of partial sums (Theorem 4.A.17) is taken from Shao [535]; see also Bulinski and Suquet [114]. The icx and icv comparisons of ratios (Example 4.A.19) are restatements of results of Pellerey and Semeraro [454]. The result that gives the closure under mixtures property of the increasing convex and concave orders (Theorem 4.A.20) has been motivated by a result of Ahmed, Soliman, and Khider [10]. The Laplace transform characterization of the orders $\le_{\mathrm{icx}}$ and $\le_{\mathrm{icv}}$ (Theorem 4.A.21) is essentially taken from Ross and Schechner [477] and from Shaked and Wong [524].
A proof of the characterization of the increasing convex order by means of the number of crossings of two distribution functions (Theorem 4.A.23) can be found in Müller [407]. The characterization of the order $\le_{\mathrm{mrl}}$ by the order $\le_{\mathrm{icx}}$ (Theorem 4.A.24) is taken from Brown and Shanthikumar [112]. The closure property of the order $\le_{\mathrm{mrl}}$ given in Remark 4.A.25 is also taken from Brown and Shanthikumar [112]. The sufficient condition for the increasing concave order in Theorem 4.A.27 is given on page 484 of Landsberger and Meilijson [329]. The fact that the order $\le_{\mathrm{hmrl}}$ implies the order $\le_{\mathrm{icx}}$ (Theorem 4.A.28) can be found in Fagiuoli and Pellerey [185]. The relationship between the orders $\le_{\mathrm{dil}}$ and $\le_{\mathrm{icx}}$ that is described in Theorem 4.A.30 and in Corollary 4.A.31 can be found in Belzunce, Pellerey, Ruiz, and Shaked [72]. The relationship between the orders $\le_{\mathrm{ew}}$ and $\le_{\mathrm{icx}}$ that is described in Corollary 4.A.32 can be found in Fagiuoli, Pellerey, and Shaked [188]; in Kochar, Li, and Shaked [316] it is shown that Corollary 4.A.32 can be easily obtained from Theorem 3.C.4. The result about the expected values of the extremes and the increasing convex order (Theorem 4.A.33) is taken from Downey and Maier [170]. The bivariate characterizations of the orders $\le_{\mathrm{st}}$, $\le_{\mathrm{hr}}$, and $\le_{\mathrm{lr}}$ in Theorem 4.A.36


are taken from Righter and Shanthikumar [466]; its application (Theorem 4.A.37) is taken from Kijima and Ohnishi [292]. The increasing convex and concave comparisons of linear functions of random variables with random coefficients, whose parameters are comparable in the majorization order (Theorems 4.A.38 and 4.A.39 and Corollary 4.A.40), are taken from Denuit and Frostig [144]; further results of this type can be found there. The characterizations of the dilation and the icx orders by means of the increasing convex order (Theorems 4.A.41 and 4.A.42) are taken from Sordo and Ramos [538]. The characterization of the excess wealth order by means of the increasing convex order (Theorem 4.A.43) can be found in Belzunce [63]. The inheritance of the icv order by the observed total time on test random variables (Theorem 4.A.44) is given in Li and Shaked [356]. The icx order comparison of a sum of independent heterogeneous binomial random variables with a proper binomial random variable (Example 4.A.45) is taken from Boland, Singh, and Cukic [102]. The necessary and sufficient conditions for the comparison of normal random variables (Example 4.A.46) are taken from Müller [413]. The icx comparison of the average of exponential random variables with the largest among them (Example 4.A.47) can be found in Argon and Andradóttir [16]. The condition for stochastic equality in the icx case of Theorem 4.A.48 can be found in Bhattacharjee and Bhattacharya [87]; the condition for stochastic equality in the icv case of Theorem 4.A.48 follows from the above condition and from Theorem 4.A.1. The condition for stochastic equality in Theorem 4.A.50 is taken from Sordo and Ramos [538]. The characterization of the DMRL and IMRL aging notions by means of the increasing convex order (Theorem 4.A.51) can be found in Cao and Wang [117], who also defined and studied the classes of NBUC and NWUC random variables. The terminology of NBUM and NWUM is due to Bergmann [81].
The characterization of NBUC random variables, given in (4.A.32), is taken from Belzunce, Ortega, and Ruiz [71]. The notions of NBU(2) and NWU(2) are defined in Deshpande, Kochar, and Singh [160]. The extension of the sufficiency condition in Theorem 4.A.51, given in Theorem 4.A.52, is taken from Li and Zuo [359]. Most of the results about the starshaped order (Section 4.A.6) can be found in Alzaid [13]. Most of the results on the orders $\le_{m\text{-icx}}$ and $\le_{m\text{-icv}}$ (Section 4.A.7) are taken from Rolski [473]; see also Mukherjee and Chatterjee [404], Fishburn and Lavalle [204], Wang and Young [558], Cheng and Pai [129], and references therein. Lefèvre and Utev [339] studied some stochastic orders among discrete random variables by replacing the integrals in (4.A.36) and in (4.A.37) by summations. Fishburn and Lavalle [204] also studied discrete analogs of the $\le_{m\text{-icv}}$ orders. Thorlund-Petersen [549] characterized the $\le_{3\text{-icv}}$ comparison of arithmetic random variables. The definition of the order $\le_{\infty\text{-icv}}$ can be found in Thistle [548] or in Fishburn and Lavalle [204] and in other references that are given in the latter paper. The moment inequalities that are given in Theorem 4.A.58 are also taken from Fishburn and Lavalle [204]; see further references there, and see also Carletti and Pellerey [121]. Theorem 4.A.60 can be found in Denuit, Lefèvre, and Utev [155]. The sufficient conditions for the $m$-icx order, in terms of sign changes (Theorem 4.A.63), are taken from Kaas and Hesselager [270]; the stochastic comparisons of the Gamma, inverse Gaussian, and lognormal random variables (Example 4.A.64) can also be found there. A variation of Theorem 4.A.65 can be found in Hesselager [222]. Some results that are related to Theorem 4.A.66 have been derived in Denuit [140]. Fishburn [201, 202] and Stoyan [540, page 22] extended the orders $\le_{m\text{-icx}}$ and $\le_{m\text{-icv}}$ by allowing $m$ to be any positive number (that is, not necessarily an integer). They did it by letting the $m$ in (4.A.38) and in (4.A.39) be any number greater than 0. Shaked and Wong [524] considered orders defined by requiring the test functions $\phi$ in (4.A.40) [respectively, (4.A.41)] to satisfy that $\phi^{(j)}$ [respectively, $(-1)^{j}\phi^{(j)}$] is increasing, $j = 0, 1, \ldots, m - 1$. Denuit, Lefèvre, and Shaked [151] studied the orders defined by requiring (4.A.38) and (4.A.39) to hold as well as $E(X - a)^{i} \le E(Y - a)^{i}$, $i = 1, 2, \ldots, m - 1$, where $a$ is the left endpoint of the support of the underlying random variables, and $a$ is assumed to be finite. The results about the orders $\le_{p}$, $\le_{p+}$, and $\le_{p-}$ (Theorems 4.A.67–4.A.69) are taken from Bhattacharjee and Sethuraman [88], Bhattacharjee [83], Li and Zhu [351], and Jun [265]. Note that the order that we denote by $\le_{p-}$ is not the same as, but is a modification of, an order discussed by these authors. Some generalizations of Theorem 4.A.69 can be found in Cai and Wu [116]. The condition for stochastic equality in Theorem 4.A.70 is taken from Sordo and Ramos [538]. The discussion involving the orders $\le^{-1}_{m}$ is motivated by Muliere and Scarsini [406]; extensions of these orders are developed in Wang and Young [558] and in Maccheroni, Muliere, and Zoli [376].
Bhattacharjee [85] studied the order ≤icx under the restriction that the compared random variables are discrete. Baccelli and Makowski [28] denote X ≤FR-st Y whenever (4.A.21) holds (that is, X ≤FR-st Y ⇐⇒ X ≤hmrl Y). They also define the orders ≤FR-cx and ≤FR-icx in a similar manner, and they study many closure properties of the orders ≤FR-st, ≤FR-cx, and ≤FR-icx. The order ≤FR-icx is a “hybrid” of the orders ≤hmrl (see (2.B.2)) and ≤3-icx (see (4.A.36)). It is defined by saying that the nonnegative random variables X and Y satisfy X ≤FR-icx Y if (here $\overline{F}$ and $\overline{G}$ denote the survival functions of X and Y, respectively)

\[ \frac{\int_x^\infty \int_{x_2}^\infty \overline{F}(x_1)\,dx_1\,dx_2}{EX} \le \frac{\int_x^\infty \int_{x_2}^\infty \overline{G}(x_1)\,dx_1\,dx_2}{EY} \quad \text{for all } x \ge 0. \]

Clearly, if EX = EY, then X ≤FR-icx Y if, and only if, X ≤2-icx Y. The order ≤FR-cx is defined by saying that the nonnegative random variables X and Y satisfy X ≤FR-cx Y if X ≤FR-icx Y and if $E[X^2]/E[X] = E[Y^2]/E[Y]$.

4.C Complements


Section 4.B: A good reference about the convex transform, star, and superadditive orders is Barlow and Proschan [36], where further references can be found. Many of the results given in this section can be found there. Another basic reference about the convex transform order is van Zwet [578]. The result about the relation of the star order and the dispersive order (Theorem 4.B.1) is implicit in Shaked [503], whereas the results about the relation of the superadditive order and the dispersive order (Theorems 4.B.2 and 4.B.3) can be found in Ahmed, Alzaid, Bartoszewicz, and Kochar [8]. The relationship between the star order and the icx order (Theorem 4.B.4) is taken from Szekli [544, page 23]; the idea of the first part of the proof of Theorem 4.B.4 is adopted from Arnold and Villasenor [21]. The property of the star order given in Theorem 4.B.6, when $p = -1$, can be found in Taillie [546]; Rivest [469] has obtained it for a general $p \ne 0$. The comparison of the exponential mixtures with respect to the order $\le_*$, given in Example 4.B.7, is taken from Bartoszewicz [50]. The characterization of the order $\le_c$ by means of observed total time on test random variables (Theorem 4.B.8) can be found in Barlow and Doksum [34]. The proof of the implication that is given in Theorem 4.B.9 can be found in Bartoszewicz [42, 45]. An interesting study of the relationship between the convex transform, star, and superadditive orders and some variability orders can be found in Metzger and Rüschendorf [393]. A characterization of the star order, by means of the monotonicity in $k$ of the ratio of the quantile functions of the corresponding order statistics $X_{(k)}$ and $Y_{(k)}$ (see (4.B.6)), is given in Bartoszewicz [41]. The characterization of the star order given in Theorem 4.B.10 is taken from Bartoszewicz [45]. The star ordering of order statistics from the uniform distribution (Example 4.B.13) can be found in Jeon, Kochar, and Park [255].
The three transform orderings of the epoch times of two nonhomogeneous Poisson processes (Example 4.B.14) are given in Gupta and Kirmani [217]. The result about the preservation of the convex transform, star, and superadditive orders under formation of order statistics (Theorem 4.B.15) is a special case of a result in Belzunce, Mercader, and Ruiz [70]. The results about the convex transform, star, and superadditive order comparisons of random minima and maxima (Example 4.B.16) are taken from Bartoszewicz [49]. An extension of Theorem 4.B.15 to order statistics from samples with a random size can be found in Nanda, Misra, Paul, and Singh [427]. This extension of Nanda, Misra, Paul, and Singh [427] also extends the results in Example 4.B.16. The fact that the convex transform order implies the usual stochastic order among ratios of spacings (Theorem 4.B.17) can be found in Oja [440]. The result about the monotonicity of the ratios of expected values of the order statistics which is implied by the order ≤∗ (Theorem 4.B.18) is given in Bartoszewicz [45]; see also Barlow and Proschan [35]. The inequalities between the expected values of spacings from diﬀerent samples (Theorem 4.B.19) are taken from Paul and Gutierrez [443]. The discussion of the DMRL and the NBUE orders in


Section 4.B.3 follows the work of Kochar and Wiens [319] and of Kochar [306], although some of the proofs here are different; see also Belzunce, Candel, and Ruiz [65] and Fernandez-Ponce, Kochar, and Muñoz-Perez [195]. The discussion of the TTT order in Section 4.B.3 follows the work of Kochar, Li, and Shaked [316]. The result about the preservation of the TTT order under random minima (Theorem 4.B.28) is taken from Li and Zuo [358]. The connections between the order $\le_{\mathrm{ttt}}$ and observed total time on test random variables (Theorem 4.B.29) can be found in Li and Shaked [356]. The comparisons of random variables of interest in reliability theory, given in Example 4.B.30, are taken from Li and Shaked [357]. The generalization $\le_{\mathrm{ttt}}^{(h)}$ of the TTT order has been introduced and studied in Li and Shaked [357]. The definitions of the orders $\le_c$, $\le_*$, and $\le_{\mathrm{su}}$, given in Section 4.B.1, are proper when the comparisons apply to distributions of nonnegative random variables. Van Zwet [578], Lawrence [334], and Loh [365] study modifications of these orders which apply to symmetric distributions.

5 The Laplace Transform and Related Orders

The most important common order that is studied in this chapter is the Laplace transform order. Like the orders that were discussed in Chapter 4, the Laplace transform order compares random variables according to both their “location” and their “spread”. Two other useful orders, based on ratios of Laplace transforms, are also discussed in this chapter. In addition, some other related orders are investigated in this chapter as well.

5.A The Laplace Transform Order

5.A.1 Definitions and equivalent conditions

The relations $X \le_{\mathrm{st}} Y$, $X \le_{\mathrm{cx}} Y$, $X \le_{\mathrm{icx}} Y$, and $X \le_{\mathrm{icv}} Y$, as well as many others, are defined by requiring $E[\phi(X)] \le E[\phi(Y)]$ to hold for all functions $\phi$ in some class of functions. For example, the class of functions which corresponds to the usual stochastic order is the class of all increasing functions. The order that is discussed in this section corresponds to the class of functions $\phi$ of the form $\phi(x) = -e^{-sx}$, where $s$ is a positive number. More explicitly, let $X$ and $Y$ be two nonnegative random variables such that

\[ E[\exp\{-sX\}] \ge E[\exp\{-sY\}] \quad \text{for all } s > 0. \tag{5.A.1} \]

Then $X$ is said to be smaller than $Y$ in the Laplace transform order (denoted by $X \le_{\mathrm{Lt}} Y$). Throughout this section we consider only nonnegative random variables.

For a nonnegative random variable $X$ with distribution function $F$ and survival function $\overline{F} \equiv 1 - F$, denote by

\[ f^*(s) = \int_0^\infty e^{-sx}\,dF(x) \quad \text{and} \quad \overline{F}^*(s) = \int_0^\infty e^{-sx}\,\overline{F}(x)\,dx \]

the Laplace-Stieltjes transform of $F$ (or the Laplace transform of $X$) and the Laplace transform of $\overline{F}$, respectively. Then it is easy to verify that

\[ \overline{F}^*(s) = s^{-1}(1 - f^*(s)) \quad \text{for all } s > 0. \tag{5.A.2} \]
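As a quick numerical illustration of (5.A.2), the sketch below compares both sides for an exponential random variable. The rate value, the truncation point of the integral, and the helper names (`trapezoid`, `f_star`, `F_bar_star`) are assumptions made only for this illustration; they are not part of the text.

```python
import math

def trapezoid(f, a, b, n=100_000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

RATE = 2.0  # hypothetical choice: X is exponential with mean 1/RATE

def f_star(s):
    # Laplace-Stieltjes transform of F, known in closed form for the exponential
    return RATE / (RATE + s)

def F_bar_star(s, upper=40.0):
    # Laplace transform of the survival function exp(-RATE * x),
    # integrated numerically and truncated at `upper`
    return trapezoid(lambda x: math.exp(-(s + RATE) * x), 0.0, upper)

for s in (0.5, 1.0, 3.0):
    # left and right sides of (5.A.2); they should agree up to quadrature error
    print(s, F_bar_star(s), (1.0 - f_star(s)) / s)
```

The two printed columns should agree up to the error of the quadrature rule; this is a sanity check on one distribution, not a proof of the identity.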

Using (5.A.2), the following result is easy to verify.

Theorem 5.A.1. Let $X$ and $Y$ be two nonnegative random variables with survival functions $\overline{F}$ and $\overline{G}$, respectively. Then $X \le_{\mathrm{Lt}} Y$ if, and only if,

\[ \int_0^\infty e^{-sx}\,\overline{F}(x)\,dx \le \int_0^\infty e^{-sx}\,\overline{G}(x)\,dx \quad \text{for all } s > 0. \tag{5.A.3} \]

Note that (5.A.3) can be written as

\[ E\min\{X, E_s\} \le E\min\{Y, E_s\} \quad \text{for all } s > 0, \tag{5.A.4} \]

where $E_s$ is an exponential random variable with mean $1/s$, which is independent of $X$ and of $Y$. Using (5.A.2) it is also easy to verify the statement that is given in the following remark.

Remark 5.A.2. Let $X$ and $Y$ be two positive random variables, and let $E_1$ be a mean 1 exponential random variable which is independent of both $X$ and $Y$. Define $\tilde{X} = E_1/X$ and $\tilde{Y} = E_1/Y$; that is, the distributions of both $\tilde{X}$ and $\tilde{Y}$ are scale mixtures of exponential distributions. Then

\[ X \le_{\mathrm{Lt}} Y \iff \tilde{Y} \le_{\mathrm{st}} \tilde{X}. \]

See similar results in Example 4.B.7 and in Remark 5.B.1.

If $X \le_{\mathrm{Lt}} Y$, then $(1 - E[\exp\{-sX\}])/s \le (1 - E[\exp\{-sY\}])/s$ for all $s > 0$. Letting $s \downarrow 0$ it is seen that

\[ X \le_{\mathrm{Lt}} Y \implies EX \le EY, \tag{5.A.5} \]

provided the expectations exist.

A function $\phi : [0, \infty) \to \mathbb{R}$ is said to be completely monotone if all its derivatives $\phi^{(n)}$ exist and satisfy $\phi^{(0)}(x) \equiv \phi(x) \ge 0$, $\phi^{(1)}(x) \le 0$, $\phi^{(2)}(x) \ge 0, \ldots$; that is, $\phi$ is completely monotone if $(-1)^n \phi^{(n)}(x) \ge 0$ for all $x > 0$ and $n = 0, 1, 2, \ldots$. It is well known that $\phi$ is completely monotone if, and only if, there exists a measure $\mu$ on $(0, \infty)$ such that

\[ \phi(x) = \int_0^\infty e^{-xu}\,\mu(du). \]

Therefore, if $X \le_{\mathrm{Lt}} Y$ and $\phi$ is completely monotone, then

\[ E[\phi(X)] = E\left[\int_0^\infty e^{-Xu}\,\mu(du)\right] = \int_0^\infty E[e^{-Xu}]\,\mu(du) \ge \int_0^\infty E[e^{-Yu}]\,\mu(du) = E[\phi(Y)], \]

provided the expectations exist. The function $\phi$, which is defined by $\phi(x) = \exp\{-sx\}$, is completely monotone for each $s > 0$. We thus have proven the following characterization of the order $\le_{\mathrm{Lt}}$.

Theorem 5.A.3. Let $X$ and $Y$ be two nonnegative random variables. Then $X \le_{\mathrm{Lt}} Y$ if, and only if,

\[ E[\phi(X)] \ge E[\phi(Y)] \tag{5.A.6} \]

for all completely monotone functions $\phi$, provided the expectations exist.

A similar result is the following.

Theorem 5.A.4. Let $X$ and $Y$ be two nonnegative random variables. Then $X \le_{\mathrm{Lt}} Y$ if, and only if, $E[\phi(X)] \le E[\phi(Y)]$ for all differentiable functions $\phi$ on $[0, \infty)$ with a completely monotone derivative, provided the expectations exist.

Next we characterize the order $\le_{\mathrm{Lt}}$ by a function of the respective moments. In order to do that we notice that if $X$ is a nonnegative random variable with survival function $\overline{F}$ such that all its moments exist, then

\[ \int_0^\infty e^{-sx}\,\overline{F}(x)\,dx = \sum_{i=0}^\infty \frac{(-s)^i}{i!} \int_0^\infty x^i\,\overline{F}(x)\,dx = \sum_{i=0}^\infty \frac{(-s)^i}{i!} \cdot \frac{EX^{i+1}}{i+1}. \]

Using this fact and Theorem 5.A.1, the proof of the next theorem is apparent.

Theorem 5.A.5. Let $X$ and $Y$ be nonnegative random variables that possess moments $\mu_i$ and $\nu_i$, respectively, $i = 1, 2, \ldots$. Then $X \le_{\mathrm{Lt}} Y$ if, and only if,

\[ \sum_{i=0}^\infty \frac{(-s)^i}{(i+1)!}\,\mu_{i+1} \le \sum_{i=0}^\infty \frac{(-s)^i}{(i+1)!}\,\nu_{i+1} \quad \text{for all } s > 0. \]
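The characterization by completely monotone functions (Theorem 5.A.3) can be illustrated on a small example. The sketch below assumes a two-point random variable $X$ on $\{0, 2\}$ and a constant $Y = 1.5$, for which $X \le_{\mathrm{Lt}} Y$ since $\frac12 + \frac12 e^{-2s} \ge e^{-s} \ge e^{-1.5s}$. It uses the completely monotone function $\phi(x) = 1/(1+x) = \int_0^\infty e^{-xu} e^{-u}\,du$. The distributions, the helper names, and the truncation of the integral are all assumptions chosen for illustration.

```python
import math

def expect(phi, values, probs):
    """E[phi(X)] for a discrete random variable given by values and probs."""
    return sum(p * phi(x) for x, p in zip(values, probs))

def trapezoid(f, a, b, n=100_000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

X = ([0.0, 2.0], [0.5, 0.5])  # hypothetical two-point distribution
Y = ([1.5], [1.0])            # hypothetical constant

def phi(x):
    # completely monotone: 1/(1+x) is the mixture of exp(-x u) against exp(-u) du
    return 1.0 / (1.0 + x)

def laplace(dist, u):
    return expect(lambda x: math.exp(-u * x), *dist)

def mixture_expectation(dist, upper=50.0):
    # integral of E[exp(-X u)] exp(-u) du, truncated at `upper`
    return trapezoid(lambda u: laplace(dist, u) * math.exp(-u), 0.0, upper)

direct_X, direct_Y = expect(phi, *X), expect(phi, *Y)
mixed_X, mixed_Y = mixture_expectation(X), mixture_expectation(Y)
print(direct_X, mixed_X)   # two ways of computing E[phi(X)]
print(direct_Y, mixed_Y)   # two ways of computing E[phi(Y)]
print(direct_X >= direct_Y)
```

The direct expectation and the mixture representation should agree for each variable, and the inequality in (5.A.6) should hold for this particular pair.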

A Laplace transform characterization of the order $\le_{\mathrm{Lt}}$ is stated next. It may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, 2.A.16, 2.B.14, and 4.A.21. We omit its proof.

Theorem 5.A.6. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

\[ X_1 \le_{\mathrm{Lt}} X_2 \iff N_\lambda(X_1) \le_{\mathrm{Lt}} N_\lambda(X_2) \quad \text{for all } \lambda > 0. \]

5.A.2 Closure and other properties

Using (5.A.1) and (5.A.6) it is easy to prove each of the closure results in the following theorem. The first part of the theorem follows from the observation that if $\phi$ is a completely monotone function and $g$ is a positive function with a completely monotone derivative, then $\phi(g)$ is completely monotone. Comments about the proof of the last part are given after the statement of the theorem. (Recall from Section 1.A.3 that for any random variable $Z$ and any event $A$ we denote by $[Z \mid A]$ any random variable whose distribution is the conditional distribution of $Z$ given $A$.)

Theorem 5.A.7. (a) If $X \le_{\mathrm{Lt}} Y$ and $g$ is any positive function with a completely monotone derivative, then $g(X) \le_{\mathrm{Lt}} g(Y)$.
(b) Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta] \le_{\mathrm{Lt}} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Then $X \le_{\mathrm{Lt}} Y$. That is, the Laplace transform order is closed under mixtures.
(c) Let $\{X_j, j = 1, 2, \ldots\}$ and $\{Y_j, j = 1, 2, \ldots\}$ be two sequences of random variables such that $X_j \to_{\mathrm{st}} X$ and $Y_j \to_{\mathrm{st}} Y$ as $j \to \infty$. If $X_j \le_{\mathrm{Lt}} Y_j$, $j = 1, 2, \ldots$, then $X \le_{\mathrm{Lt}} Y$.
(d) Let $X_1, X_2, \ldots, X_m$ be a set of independent random variables and let $Y_1, Y_2, \ldots, Y_m$ be another set of independent random variables. If $X_i \le_{\mathrm{Lt}} Y_i$ for $i = 1, 2, \ldots, m$, then

\[ g(X_1, X_2, \ldots, X_m) \le_{\mathrm{Lt}} g(Y_1, Y_2, \ldots, Y_m) \]

for all nonnegative functions $g$ on $[0, \infty)^m$ such that $(\partial/\partial x_i)\,g(x_1, x_2, \ldots, x_m)$ is completely monotone in $x_i$, $i = 1, 2, \ldots, m$. In particular, the Laplace transform order is closed under convolutions.

The proof of Theorem 5.A.7(d) is very similar to the proof of Theorem 4.A.15. The basic difference is that one should use Theorem 5.A.7(a) rather than Theorem 4.A.8(a) in the first step of the inductive argument.

Another closure property of the order $\le_{\mathrm{Lt}}$ is described in the following theorem.

Theorem 5.A.8. Let $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ each be a sequence of nonnegative independent random variables, and let $M$ and $N$ be integer-valued positive random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively. Suppose that there exists a nonnegative random variable $Z$ such that $X_i \le_{\mathrm{Lt}} Z \le_{\mathrm{Lt}} Y_j$ for all $i$ and $j$. If $M \le_{\mathrm{Lt}} N$, then

\[ \sum_{j=1}^{M} X_j \le_{\mathrm{Lt}} \sum_{j=1}^{N} Y_j. \]

Proof. Note that for all $s > 0$ we have

\begin{align*}
E\exp\Big\{-s \sum_{j=1}^{M} X_j\Big\}
&= \sum_{n=1}^{\infty} P\{M = n\} \prod_{j=1}^{n} E[\exp\{-sX_j\}] \\
&\ge \sum_{n=1}^{\infty} P\{M = n\}\,(E[\exp\{-sZ\}])^n \\
&= \sum_{n=1}^{\infty} P\{M = n\} \exp\{-n(-\log E[\exp\{-sZ\}])\} \\
&\ge \sum_{n=1}^{\infty} P\{N = n\} \exp\{-n(-\log E[\exp\{-sZ\}])\} \\
&= \sum_{n=1}^{\infty} P\{N = n\}\,(E[\exp\{-sZ\}])^n \\
&\ge \sum_{n=1}^{\infty} P\{N = n\} \prod_{j=1}^{n} E[\exp\{-sY_j\}] \\
&= E\exp\Big\{-s \sum_{j=1}^{N} Y_j\Big\},
\end{align*}

where the first and the last equalities follow from the independence of $M$ and $N$ of the $\{X_i\}$ and the $\{Y_i\}$ sequences, the first and the last inequalities follow from $X_i \le_{\mathrm{Lt}} Z \le_{\mathrm{Lt}} Y_j$ for all $i$ and $j$, and the middle inequality follows from $M \le_{\mathrm{Lt}} N$. The stated result now follows.

As a corollary of Theorem 5.A.8 we obtain the next result, which is an analog of Theorem 4.A.9. It is worthwhile to point out that Theorem 7.D.7, which is proven in Section 7.D.1, is a more general result than the following theorem.

Theorem 5.A.9. Let $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ each be a sequence of nonnegative independent and identically distributed random variables such that $X_i \le_{\mathrm{Lt}} Y_i$, $i = 1, 2, \ldots$. Let $M$ and $N$ be integer-valued positive random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively, such that $M \le_{\mathrm{Lt}} N$. Then

\[ \sum_{j=1}^{M} X_j \le_{\mathrm{Lt}} \sum_{j=1}^{N} Y_j. \]

A result that is related to Theorem 5.A.9 is given next. It is of interest to compare it to Theorems 1.A.5, 2.B.8, 3.A.14, and 4.A.12.

Theorem 5.A.10. Let $\{X_j, j = 1, 2, \ldots\}$ be a sequence of nonnegative independent and identically distributed random variables, and let $M$ be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j, j = 1, 2, \ldots\}$ be another sequence of independent and identically distributed random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$'s. Suppose that for some positive integer $K$ we have

\[ \sum_{i=1}^{K} X_i \le_{\mathrm{Lt}} [\ge_{\mathrm{Lt}}]\; Y_1 \quad \text{and} \quad M \le_{\mathrm{Lt}} [\ge_{\mathrm{Lt}}]\; KN. \]

Then

\[ \sum_{j=1}^{M} X_j \le_{\mathrm{Lt}} [\ge_{\mathrm{Lt}}] \sum_{j=1}^{N} Y_j. \]

We do not give a detailed proof of Theorem 5.A.10 here since it is similar to the proof of Theorem 4.A.12 in Section 4.A.1. Two other similar theorems are the following. Their proofs are similar to the proofs of Theorems 4.A.13 and 4.A.14 in Section 4.A.1.

Theorem 5.A.11. Let $\{X_j, j = 1, 2, \ldots\}$ be a sequence of nonnegative independent and identically distributed random variables, and let $M$ be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j, j = 1, 2, \ldots\}$ be another sequence of independent and identically distributed random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$'s. Also, let $\{N_j, j = 1, 2, \ldots\}$ be a sequence of independent random variables that are distributed as $N$. If for some positive integer $K$ we have

\[ \sum_{i=1}^{K} X_i \le_{\mathrm{Lt}} Y_1 \quad \text{and} \quad M \le_{\mathrm{Lt}} \sum_{i=1}^{K} N_i, \]

or if we have

\[ KX_1 \le_{\mathrm{Lt}} Y_1 \quad \text{and} \quad M \le_{\mathrm{Lt}} KN, \]

or if we have

\[ KX_1 \le_{\mathrm{Lt}} Y_1 \quad \text{and} \quad M \le_{\mathrm{Lt}} \sum_{i=1}^{K} N_i, \]

then

\[ \sum_{j=1}^{M} X_j \le_{\mathrm{Lt}} \sum_{j=1}^{N} Y_j. \]

Theorem 5.A.12. Let $\{X_j, j = 1, 2, \ldots\}$ be a sequence of nonnegative independent and identically distributed random variables, and let $M$ be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j, j = 1, 2, \ldots\}$ be another sequence of independent and identically distributed random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$'s. If for some positive integers $K_1$ and $K_2$, such that $K_1 \le K_2$, we have

\[ \sum_{i=1}^{K_1} X_i \le_{\mathrm{Lt}} \frac{K_1}{K_2}\,Y_1 \quad \text{and} \quad M \le_{\mathrm{Lt}} K_2 N, \]

then

\[ \sum_{j=1}^{M} X_j \le_{\mathrm{Lt}} \sum_{j=1}^{N} Y_j. \]

Recall from page 2 the definition of the majorization order $a \prec b$ among $n$-dimensional vectors.

Theorem 5.A.13. Let $X_1, X_2, \ldots, X_m$ be independent nonnegative random variables. Let $a_1 \ge a_2 \ge \cdots \ge a_m \ge 0$ and $b_1 \ge b_2 \ge \cdots \ge b_m \ge 0$ be constants such that $a \prec b$. If

\[ X_1 \le_{\mathrm{rh}} X_2 \le_{\mathrm{rh}} \cdots \le_{\mathrm{rh}} X_m, \]

then

\[ \sum_{i=1}^{m} a_i X_i \le_{\mathrm{Lt}} \sum_{i=1}^{m} a_{m-i+1} X_i \quad \text{and} \quad \sum_{i=1}^{m} b_i X_i \le_{\mathrm{Lt}} \sum_{i=1}^{m} a_i X_i. \]

Proof. By Theorem 5.A.7(d) the order $\le_{\mathrm{Lt}}$ is closed under convolutions. Thus, it suffices to prove the stated results for $m = 2$. Select an $s \ge 0$. In Theorem 1.B.50(b), take $\alpha(x) = e^{-a_1 s x}$ and $\beta(x) = e^{-a_2 s x}$ to obtain

\[ E[\exp\{-s(a_1 X_1 + a_2 X_2)\}] \ge E[\exp\{-s(a_1 X_2 + a_2 X_1)\}], \quad s \ge 0; \]

that is, $a_1 X_1 + a_2 X_2 \le_{\mathrm{Lt}} a_1 X_2 + a_2 X_1$. In order to prove the second statement, take $\alpha(x) = e^{-a_2 s x}$ and $\beta(x) = e^{-b_2 s x}$ in Theorem 1.B.50(b) to obtain

\[ \frac{E[\exp\{-b_2 X_2 s\}]}{E[\exp\{-a_2 X_2 s\}]} \ge \frac{E[\exp\{-b_2 X_1 s\}]}{E[\exp\{-a_2 X_1 s\}]}, \quad s \ge 0. \tag{5.A.7} \]

Also, by Theorem 3.A.35 we have $a_1 X_1 + a_2 X_1^* \le_{\mathrm{cx}} b_1 X_1 + b_2 X_1^*$, where $X_1^*$ is an independent copy of $X_1$. Therefore, $a_1 X_1 + a_2 X_1^* \ge_{\mathrm{Lt}} b_1 X_1 + b_2 X_1^*$, and hence,

\[ \frac{E[\exp\{-b_2 X_1 s\}]}{E[\exp\{-a_2 X_1 s\}]} = \frac{E[\exp\{-b_2 X_1^* s\}]}{E[\exp\{-a_2 X_1^* s\}]} \ge \frac{E[\exp\{-a_1 X_1 s\}]}{E[\exp\{-b_1 X_1 s\}]}, \quad s \ge 0. \tag{5.A.8} \]

Combining (5.A.7) and (5.A.8) we obtain $b_1 X_1 + b_2 X_2 \le_{\mathrm{Lt}} a_1 X_1 + a_2 X_2$.
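The two conclusions of Theorem 5.A.13 can be checked numerically in the case $m = 2$. The sketch below assumes independent exponential random variables with rates 2 and 1 (so that $X_1 \le_{\mathrm{rh}} X_2$, since they are ordered in the likelihood ratio order) and the weight vectors $a = (0.6, 0.4) \prec b = (0.9, 0.1)$. These choices, the grid of $s$ values, and the function name are illustrative assumptions only.

```python
def lt_weighted_sum(weights, rates, s):
    """Laplace transform of sum(w_i * X_i) for independent exponentials X_i.

    For X_i exponential with rate r_i, E exp{-s w_i X_i} = r_i / (r_i + s w_i).
    """
    out = 1.0
    for w, r in zip(weights, rates):
        out *= r / (r + s * w)
    return out

rates = [2.0, 1.0]   # hypothetical: X1 ~ Exp(rate 2), X2 ~ Exp(rate 1)
a = [0.6, 0.4]       # hypothetical weights, decreasing
b = [0.9, 0.1]       # hypothetical weights with a majorized by b

grid = [0.05 * k for k in range(1, 401)]  # finite grid of s values

# first conclusion: a1 X1 + a2 X2 <=Lt a2 X1 + a1 X2
first = all(lt_weighted_sum(a, rates, s) >= lt_weighted_sum(a[::-1], rates, s)
            for s in grid)
# second conclusion: b1 X1 + b2 X2 <=Lt a1 X1 + a2 X2
second = all(lt_weighted_sum(b, rates, s) >= lt_weighted_sum(a, rates, s)
             for s in grid)
print(first, second)
```

A check on a finite grid does not prove the ordering for all $s$, but for this example the inequalities can also be confirmed by comparing the quadratic denominators directly.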

The Laplace transform order is closed under linear convex combinations as the following theorem shows. This result is an analog of Theorem 3.A.36, and its proof is similar to the proof of that theorem; therefore the proof is omitted. Similar results are Theorems 5.C.8 and 5.C.18.

Theorem 5.A.14. Let $X_1, X_2, \ldots, X_n$ and $Y$ be $n + 1$ random variables. If $X_i \ge_{\mathrm{Lt}} Y$, $i = 1, 2, \ldots, n$, then

\[ \sum_{i=1}^{n} a_i X_i \ge_{\mathrm{Lt}} Y, \]

whenever $a_i \ge 0$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} a_i = 1$.

A result that is similar to Theorems 1.A.8, 3.A.43, 3.A.60, 4.A.69, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16, is the following.

Theorem 5.A.15. Let $X$ and $Y$ be two nonnegative random variables. Suppose that $X \le_{\mathrm{Lt}} Y$ and that $E[X^\alpha] = E[Y^\alpha]$ for some $\alpha < 0$ or for some $\alpha \in (0, 1)$, provided the expectations exist. Then $X =_{\mathrm{st}} Y$.

The function $\phi$ defined by $\phi(x) = \exp\{-sx\}$ is decreasing and convex for each $s > 0$. Therefore $-\phi$ is increasing and concave. We thus obtain the next result.

Theorem 5.A.16. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{\mathrm{icv}} Y$, then $X \le_{\mathrm{Lt}} Y$. In particular, if $X \le_{\mathrm{st}} Y$, then $X \le_{\mathrm{Lt}} Y$.

In fact, from (4.A.41) it follows that if $X \le_{m\text{-icv}} Y$, for any $m$, then $X \le_{\mathrm{Lt}} Y$. For random variables with finite supports we have the following characterization of the Laplace transform order by means of the orders $\le_{m\text{-icv}}$ that were studied in Section 4.A.7.

Theorem 5.A.17. Let $X$ and $Y$ be two random variables with finite supports. Then $X \le_{\mathrm{Lt}} Y$ if, and only if, $X \le_{\infty\text{-icv}} Y$ (where $\le_{\infty\text{-icv}}$ is defined in (4.A.42)).

Another strengthening of Theorem 5.A.16 is stated and proven next. Recall from Section 4.A.7 the definition of the order $\le_{p-}$.

Theorem 5.A.18. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{p-} Y$ for some $p \le 1$, then $X \le_{\mathrm{Lt}} Y$.

Proof. Recall from (4.A.45) that if $X \le_{p-} Y$, then $X^p \le_{\mathrm{icv}} Y^p$. Select an $s > 0$. Define $\phi(x) \equiv e^{-sx}$ and let

\[ h(x) \equiv \phi\big(x^{1/p}\big) = e^{-s x^{1/p}}. \]

It is easy to verify that the function $h$ is decreasing and convex, and therefore $-h$ is increasing and concave. From the fact that $X^p \le_{\mathrm{icv}} Y^p$ it follows that $-E[h(X^p)] \le -E[h(Y^p)]$, or, equivalently, that

\[ E\big[e^{-sX}\big] \ge E\big[e^{-sY}\big]. \]

Since the latter inequality holds for all $s > 0$ it follows that $X \le_{\mathrm{Lt}} Y$.

Closure properties of an order under the operation of taking minima are of importance in reliability theory. The next result gives conditions under which the order ≤Lt is closed under this operation. We do not give the proof here.

Theorem 5.A.19. Let the independent nonnegative random variables $X_1, X_2, \ldots, X_m, Y_1, Y_2, \ldots, Y_m$ have the survival functions $\overline{F}_1, \overline{F}_2, \ldots, \overline{F}_m, \overline{G}_1, \overline{G}_2, \ldots, \overline{G}_m$, respectively. If $X_i \le_{\mathrm{Lt}} Y_i$, $i = 1, 2, \ldots, m$, and $\overline{F}_i$ and $\overline{G}_i$, $i = 1, 2, \ldots, m$, are completely monotone, then

\[ \min\{X_1, X_2, \ldots, X_m\} \le_{\mathrm{Lt}} \min\{Y_1, Y_2, \ldots, Y_m\}. \]

Remark 5.A.20. Let $\{X, X_1, X_2, \ldots\}$ be a set of nonnegative independent and identically distributed random variables, and let $\{Y, Y_1, Y_2, \ldots\}$ be another set of nonnegative independent and identically distributed random variables. Denote by $X_{(i:n)}$ the $i$th order statistic in a sample of size $n$ from $\{X_1, X_2, \ldots\}$, and denote by $Y_{(i:n)}$ the $i$th order statistic in a sample of size $n$ from $\{Y_1, Y_2, \ldots\}$. If $X \le_{\mathrm{disp}} Y$, then, by Theorem 3.B.31, for $2 \le i \le n$ we have $X_{(i:n)} - X_{(i-1:n)} \le_{\mathrm{st}} Y_{(i:n)} - Y_{(i-1:n)}$, and therefore, by Theorem 5.A.16, we have

\[ X_{(i:n)} - X_{(i-1:n)} \le_{\mathrm{Lt}} Y_{(i:n)} - Y_{(i-1:n)}, \quad 2 \le i \le n. \tag{5.A.9} \]

Bartoszewicz [46] proved a similar result. He showed that if $X \le_{\mathrm{disp}} Y$, and if the $X_i$'s and the $Y_i$'s are independent, then

\[ X_{(i:n)} + Y_{(i-1:n)} \le_{\mathrm{Lt}} X_{(i-1:n)} + Y_{(i:n)}, \quad 2 \le i \le n. \tag{5.A.10} \]

This is different from (5.A.9) because $X_{(i-1:n)}$ and $X_{(i:n)}$ (and $Y_{(i-1:n)}$ and $Y_{(i:n)}$) in (5.A.9) have a particular joint distribution, whereas (5.A.10) involves only the marginal distributions of $X_{(i-1:n)}$ and $X_{(i:n)}$ (and of $Y_{(i-1:n)}$ and $Y_{(i:n)}$). Bartoszewicz [46] also proved that if $X \le_{\mathrm{disp}} Y$, and if the $X_i$'s and the $Y_i$'s are independent, then

\[ X_{(n+1-i:n+1)} + Y_{(n-i:n)} \le_{\mathrm{Lt}} X_{(n-i:n)} + Y_{(n+1-i:n+1)}, \quad 0 \le i \le n - 1, \]

and

\[ X_{(i:n)} + Y_{(i:n+1)} \le_{\mathrm{Lt}} X_{(i:n+1)} + Y_{(i:n)}, \quad 1 \le i \le n. \]

In reliability theory, motivated by (3.A.62) and Theorem 5.A.16, one may consider the class of nonnegative random variables $X$ which satisfy $X \ge_{\mathrm{Lt}} [\le_{\mathrm{Lt}}]\; \mathrm{Exp}(\mu)$ or, equivalently,

\[ \int_0^\infty e^{-su}\,P\{X > u\}\,du \ge [\le]\; \frac{\mu}{1 + s\mu} \quad \text{for } s \ge 0, \tag{5.A.11} \]

where $\mu$ is the mean of $X$. Such random variables have interesting aging properties. From Theorems 3.A.55 and 5.A.16 it is seen that if $X$ is NBUE [NWUE], then $X$ satisfies (5.A.11). Some researchers studied random variables $X$ which satisfy $X \ge_{\mathrm{Lt}} \mathrm{Gamma}(\alpha, \beta)$, where $\mathrm{Gamma}(\alpha, \beta)$ denotes a Gamma random variable with shape parameter $\alpha$ and scale parameter $\beta$, which has the same mean as $X$. See Klar [300], Hu and Lin [228], and references therein.

Let $X$ be a nonnegative random variable with a finite mean. Recall the definition of the asymptotic equilibrium age $A_X$ whose distribution function is given in (1.A.20). Let $Y$ be another nonnegative random variable with the corresponding asymptotic equilibrium age $A_Y$. From (5.A.3) it is seen at once that if $EX = EY$, then

\[ X \le_{\mathrm{Lt}} Y \iff A_X \ge_{\mathrm{Lt}} A_Y. \tag{5.A.12} \]

The next result indicates the “minimal” and the “maximal” random variables, with respect to the order $\le_{\mathrm{Lt}}$, when the mean and the variance are given. It is worthwhile to contrast it with Theorem 3.A.24.

Theorem 5.A.21. Let $Y$ be a nonnegative random variable with mean $\mu$ and variance $\sigma^2$. Let $X$ be a random variable such that $P\{X = 0\} = 1 - P\{X = (\mu^2 + \sigma^2)/\mu\} = \sigma^2/(\mu^2 + \sigma^2)$ (so that $EX = \mu$ and $\mathrm{Var}(X) = \sigma^2$) and let $Z$ be a random variable degenerate at $\mu$. Then

\[ X \le_{\mathrm{Lt}} Y \le_{\mathrm{Lt}} Z. \tag{5.A.13} \]

Proof. The right-side inequality in (5.A.13) follows at once from Jensen's Inequality. Let $\overline{F}$ and $\overline{G}$ be, respectively, the survival functions of $X$ and $Y$. In order to obtain the left-side inequality in (5.A.13) we will show that

\[ \int_0^\infty e^{-sx}\,\overline{F}(x)\,dx \le \int_0^\infty e^{-sx}\,\overline{G}(x)\,dx \quad \text{for all } s \ge 0. \tag{5.A.14} \]

The result will then follow from Theorem 5.A.1. Define the functions $\alpha$ and $\beta$ on $(0, \infty)$ by $\alpha(x) = \overline{F}(x)/\mu$ and $\beta(x) = \overline{G}(x)/\mu$. It is easy to see that both $\alpha$ and $\beta$ are density functions with a common mean $(\mu^2 + \sigma^2)/2\mu$. In fact, $\alpha$ is the density function of the uniform distribution over the interval $[0, (\mu^2 + \sigma^2)/\mu)$, whereas $\beta$ is a density which is decreasing on $[0, \infty)$. From Theorem 3.A.46 it now follows that

\[ \int_0^\infty \phi(x)\,\frac{\overline{F}(x)}{\mu}\,dx \le \int_0^\infty \phi(x)\,\frac{\overline{G}(x)}{\mu}\,dx \]

for all convex functions $\phi$, and in particular (5.A.14) holds.
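The bounds in (5.A.13) can be illustrated with a concrete $Y$. The sketch below assumes $Y$ is exponential with mean 1 (hence variance 1), so that the extremal $X$ puts mass $1/2$ at 0 and mass $1/2$ at 2, and $Z$ is degenerate at 1. The choice of $Y$, the grid of $s$ values, and the function names are assumptions for illustration.

```python
import math

mu, sigma2 = 1.0, 1.0                 # hypothetical: Y ~ exponential with mean 1
top = (mu ** 2 + sigma2) / mu         # upper atom of X, here 2
p0 = sigma2 / (mu ** 2 + sigma2)      # P{X = 0}, here 1/2

def lt_X(s):
    # Laplace transform of the two-point variable X
    return p0 + (1.0 - p0) * math.exp(-s * top)

def lt_Y(s):
    # Laplace transform of the exponential Y with mean mu
    return 1.0 / (1.0 + mu * s)

def lt_Z(s):
    # Laplace transform of Z degenerate at mu
    return math.exp(-s * mu)

grid = [0.05 * k for k in range(1, 401)]  # finite grid of s values

# X <=Lt Y <=Lt Z means the Laplace transforms are ordered the other way round
sandwich = all(lt_X(s) >= lt_Y(s) >= lt_Z(s) for s in grid)
print(sandwich)
```

A larger Laplace transform corresponds to a smaller variable in the order $\le_{\mathrm{Lt}}$, which is why the inequalities in the code are reversed relative to (5.A.13).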

A characterization of the hazard rate order, by means of the Laplace transform order, is described in the following theorem. Recall from Section 1.A.3 that for any random variable $Z$ and an event $A$ we denote by $[Z \mid A]$ any random variable that has as its distribution the conditional distribution of $Z$ given $A$.

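Theorem 5.A.22. Let $X$ and $Y$ be two continuous random variables with right support endpoints $u_X$ and $u_Y$, respectively. Then $X \le_{\mathrm{hr}} Y$ if, and only if,

\[ [X - t \mid X > t] \le_{\mathrm{Lt}} [Y - t \mid Y > t] \quad \text{for all } t < \min\{u_X, u_Y\}. \tag{5.A.15} \]

Proof. The fact that $X \le_{\mathrm{hr}} Y$ implies (5.A.15) follows from (1.B.6) and Theorem 5.A.16. In order to prove the converse, let us assume, for simplicity, that $u_X = u_Y = \infty$. Denote by $\overline{F}$ and $\overline{G}$ the survival functions of $X$ and $Y$, respectively, and by $F \equiv 1 - \overline{F}$ and $G \equiv 1 - \overline{G}$ the corresponding distribution functions. Now note that

\begin{align*}
& [X - t \mid X > t] \le_{\mathrm{Lt}} [Y - t \mid Y > t] \quad \text{for all } t \\
\iff\ & \int_0^\infty e^{-su}\,\frac{\overline{F}(u + t)}{\overline{F}(t)}\,du \le \int_0^\infty e^{-su}\,\frac{\overline{G}(u + t)}{\overline{G}(t)}\,du \quad \text{for all } t \text{ and } s > 0 \\
\iff\ & \frac{\int_t^\infty e^{-su}\,\overline{G}(u)\,du}{\int_t^\infty e^{-su}\,\overline{F}(u)\,du} \ge \frac{\overline{G}(t)}{\overline{F}(t)} \quad \text{for all } t \text{ and } s > 0 \\
\iff\ & \frac{\int_t^\infty e^{-su}\,\overline{G}(u)\,du}{\int_t^\infty e^{-su}\,\overline{F}(u)\,du} \quad \text{is increasing in } t \text{ for all } s > 0 \tag{5.A.16} \\
\iff\ & \frac{\frac{1}{s}\,e^{-st}\big[\overline{G}(t) - e^{st}\int_t^\infty e^{-su}\,dG(u)\big]}{\frac{1}{s}\,e^{-st}\big[\overline{F}(t) - e^{st}\int_t^\infty e^{-su}\,dF(u)\big]} \quad \text{is increasing in } t \text{ for all } s > 0 \\
\iff\ & \frac{\overline{G}(t) - e^{st}\int_t^\infty e^{-su}\,dG(u)}{\overline{F}(t) - e^{st}\int_t^\infty e^{-su}\,dF(u)} \quad \text{is increasing in } t \text{ for all } s > 0, \tag{5.A.17}
\end{align*}

where the second from last equivalence follows by integration by parts. Using the Dominated Convergence Theorem, it is not hard to see that $\lim_{s \to \infty} e^{st}\int_t^\infty e^{-su}\,dF(u) = \lim_{s \to \infty} e^{st}\int_t^\infty e^{-su}\,dG(u) = 0$. Therefore, letting $s \to \infty$ in (5.A.17) we obtain that $\overline{G}(t)/\overline{F}(t)$ is increasing in $t$; that is, $X \le_{\mathrm{hr}} Y$.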

Remark 5.A.23. The equivalence of $X \le_{\mathrm{hr}} Y$ and (5.A.16), together with (2.A.3), yields a proof of Theorem 2.A.6.

In the following example it is shown, under a proper condition which is stated by means of the Laplace transform order, that random minima and maxima are ordered in the usual stochastic order sense; see related results in Examples 1.C.46, 3.B.39, 4.B.16, and 5.B.13.

Example 5.A.24. Let $X_1, X_2, \ldots$ be a sequence of nonnegative independent and identically distributed random variables with a common distribution function $F_{X_1}$ and a common survival function $\overline{F}_{X_1}$. Let $N_1$ and $N_2$ be two positive integer-valued random variables, which are independent of the $X_i$'s, and which have the Laplace transforms $L_{N_1}$ and $L_{N_2}$. Denote $X_{(1:N_j)} \equiv \min\{X_1, X_2, \ldots, X_{N_j}\}$ and $X_{(N_j:N_j)} \equiv \max\{X_1, X_2, \ldots, X_{N_j}\}$, $j = 1, 2$. Then the survival function of $X_{(1:N_j)}$ is given by

\[ \overline{F}_{X_{(1:N_j)}}(x) = L_{N_j}\big(-\log \overline{F}_{X_1}(x)\big), \quad j = 1, 2. \]
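The formula for the survival function of a random minimum can be checked numerically. The sketch below assumes a geometric $N$ on $\{1, 2, \ldots\}$ with parameter $p$ and standard exponential $X_i$'s; it compares the transform expression with a direct (truncated) sum over the values of $N$. The parameter values and function names are assumptions made for this illustration.

```python
import math

p = 0.3  # hypothetical: P{N = n} = p * (1 - p)**(n - 1), n = 1, 2, ...

def laplace_N(u):
    # Laplace transform of the geometric N: E exp{-u N}
    return p * math.exp(-u) / (1.0 - (1.0 - p) * math.exp(-u))

def survival_X(x):
    # hypothetical common survival function: standard exponential
    return math.exp(-x)

def survival_min_direct(x, terms=2000):
    # P{min > x} = sum over n of P{N = n} * survival_X(x)**n (truncated sum)
    return sum(p * (1.0 - p) ** (n - 1) * survival_X(x) ** n
               for n in range(1, terms + 1))

def survival_min_transform(x):
    # the same probability through the Laplace transform of N
    return laplace_N(-math.log(survival_X(x)))

for x in (0.5, 1.0, 2.0):
    print(x, survival_min_direct(x), survival_min_transform(x))
```

Both columns should coincide up to rounding, since conditioning on $N = n$ gives the survival function $\overline{F}_{X_1}(x)^n = \exp\{-n(-\log \overline{F}_{X_1}(x))\}$.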

It is thus seen that if $N_1 \le_{\mathrm{Lt}} N_2$, then $X_{(1:N_1)} \ge_{\mathrm{st}} X_{(1:N_2)}$. In a similar manner it can be shown that if $N_1 \le_{\mathrm{Lt}} N_2$, then also $X_{(N_1:N_1)} \le_{\mathrm{st}} X_{(N_2:N_2)}$.

An example with a similar spirit is the following.

Example 5.A.25. Consider a compound Poisson process with rate $\lambda$ and distribution $\phi$. Suppose that this process is the (random) hazard rate function of a random variable $X$. Then the survival function $\overline{F}$ of $X$ is given by

\[ \overline{F}(t) = \exp\Big\{-\int_0^t \lambda[1 - L_\phi(s)]\,ds\Big\}, \quad t \ge 0, \tag{5.A.18} \]

where $L_\phi$ is the Laplace transform of $\phi$ (see Kebir [280, page 873]). Similarly let $Y$ have the survival function $\overline{G}$ given by

\[ \overline{G}(t) = \exp\Big\{-\int_0^t \lambda[1 - L_\varphi(s)]\,ds\Big\}, \quad t \ge 0, \tag{5.A.19} \]

where $\varphi$ is a distribution function, and where $L_\varphi$ is the Laplace transform of $\varphi$. It is now seen that if the random variable associated with $\phi$ is larger, in the Laplace transform order, than the random variable associated with $\varphi$, then $\overline{G}(t)/\overline{F}(t)$ is increasing in $t \ge 0$; that is (see (1.B.3)), $X \le_{\mathrm{hr}} Y$. A variation of this result is given in Example 5.B.14.

When $X$ is a nonnegative integer-valued random variable, then it is customary and convenient to analyze it using its probability generating function $E[t^X]$, $t \in (0, 1)$, rather than its Laplace transform $E[e^{-sX}]$, $s \ge 0$. This fact suggests the following definition. Let $X$ and $Y$ be two nonnegative integer-valued random variables such that

\[ E[t^X] \ge E[t^Y] \quad \text{for all } t \in (0, 1). \tag{5.A.20} \]

Then $X$ is said to be smaller than $Y$ in the probability generating function order (denoted as $X \le_{\mathrm{pgf}} Y$). It is not hard to verify the following relation which holds for any nonnegative integer-valued random variable $X$:

\[ \sum_{j=1}^{\infty} t^j\,P\{X \ge j\} = t \sum_{j=0}^{\infty} t^j\,P\{X > j\} = \frac{t\,(1 - E[t^X])}{1 - t} \quad \text{for all } t \in (0, 1). \]

We thus obtain the following analog of Theorem 5.A.1.

Theorem 5.A.26. Let $X$ and $Y$ be two nonnegative integer-valued random variables. Then $X \le_{\mathrm{pgf}} Y$ if, and only if,

\[ \sum_{j=1}^{\infty} t^j\,P\{X \ge j\} \le \sum_{j=1}^{\infty} t^j\,P\{Y \ge j\} \quad \text{for all } t \in (0, 1). \]

It is easy to see that (5.A.20) holds if, and only if, (5.A.1) holds. That is, X ≤pgf Y ⇐⇒ X ≤Lt Y.
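The identity behind Theorem 5.A.26, and the substitution $t = e^{-s}$ that links the probability generating function to the Laplace transform, can both be checked on a small example. The sketch below assumes a random variable on $\{0, 1, 2, 3\}$ with an arbitrary probability mass function; the mass function and the function names are illustrative assumptions.

```python
import math

pmf = [0.1, 0.2, 0.3, 0.4]  # hypothetical: pmf[k] = P{X = k}, k = 0, 1, 2, 3

def pgf(pmf, t):
    # probability generating function E[t**X]
    return sum(p * t ** k for k, p in enumerate(pmf))

def tail_series(pmf, t):
    # sum over j >= 1 of t**j * P{X >= j}; terms vanish beyond the support
    return sum(t ** j * sum(pmf[j:]) for j in range(1, len(pmf)))

def laplace(pmf, s):
    # Laplace transform E[exp(-s X)]
    return sum(p * math.exp(-s * k) for k, p in enumerate(pmf))

for t in (0.2, 0.5, 0.9):
    lhs = tail_series(pmf, t)
    rhs = t * (1.0 - pgf(pmf, t)) / (1.0 - t)
    print(t, lhs, rhs)

s = 0.7
print(laplace(pmf, s), pgf(pmf, math.exp(-s)))  # same quantity, two notations
```

Because $t = e^{-s}$ maps $s > 0$ onto $t \in (0, 1)$, comparing generating functions on $(0, 1)$ is the same as comparing Laplace transforms on $(0, \infty)$.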

5.B Orders Based on Ratios of Laplace Transforms

5.B.1 Definitions and equivalent conditions

In this section, for a nonnegative random variable $X$ with distribution function $F$ and survival function $\overline{F} \equiv 1 - F$, let us denote by

\[ L_X(s) = \int_0^\infty e^{-sx}\,dF(x) \quad \text{and} \quad L_X^*(s) = \int_0^\infty e^{-sx}\,\overline{F}(x)\,dx \]

the Laplace-Stieltjes transform of $F$ (or the Laplace transform of $X$) and the Laplace transform of $\overline{F}$, respectively. If $Y$ is another nonnegative random variable, we similarly define $L_Y$ and $L_Y^*$. If

\[ \frac{L_Y(s)}{L_X(s)} \quad \text{is decreasing in } s > 0, \tag{5.B.1} \]

then $X$ is said to be smaller than $Y$ in the Laplace transform ratio order (denoted by $X \le_{\mathrm{Lt\text{-}r}} Y$). If

\[ \frac{1 - L_Y(s)}{1 - L_X(s)} \quad \text{is decreasing in } s > 0, \tag{5.B.2} \]

then $X$ is said to be smaller than $Y$ in the reverse Laplace transform ratio order (denoted by $X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y$). Since $L_X^*(s) = s^{-1}(1 - L_X(s))$ and $L_Y^*(s) = s^{-1}(1 - L_Y(s))$ for all $s > 0$, it follows that

\[ X \le_{\mathrm{Lt\text{-}r}} Y \iff \frac{1 - sL_Y^*(s)}{1 - sL_X^*(s)} \quad \text{is decreasing in } s > 0, \]

and that

\[ X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y \iff \frac{L_Y^*(s)}{L_X^*(s)} \quad \text{is decreasing in } s > 0. \]
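For a concrete feel for (5.B.1) and (5.B.2), the sketch below evaluates both ratios for two exponential random variables, assuming $X$ has rate 3 and $Y$ has rate 1 (so $Y$ has the larger mean). The rates, the finite grid of $s$ values, and the helper names are assumptions for illustration; monotonicity on a grid is only suggestive, although here the ratios are $(b/a)(a+s)/(b+s)$ and $(a+s)/(b+s)$, which are decreasing in $s$ whenever $a > b$.

```python
def lt_exp(rate, s):
    # Laplace transform of an exponential random variable with the given rate
    return rate / (rate + s)

a, b = 3.0, 1.0  # hypothetical rates of X and Y, respectively

grid = [0.1 * k for k in range(1, 201)]  # increasing grid of s values

# ratio in (5.B.1): L_Y(s) / L_X(s)
ratio = [lt_exp(b, s) / lt_exp(a, s) for s in grid]
# ratio in (5.B.2): (1 - L_Y(s)) / (1 - L_X(s))
reverse_ratio = [(1.0 - lt_exp(b, s)) / (1.0 - lt_exp(a, s)) for s in grid]

def is_decreasing(seq):
    return all(u >= v for u, v in zip(seq, seq[1:]))

print(is_decreasing(ratio), is_decreasing(reverse_ratio))
```

Swapping the two rates makes both ratios increasing instead, which is consistent with the orders comparing $X$ and $Y$ in the direction of their means in this family.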

Using (5.A.2) it is easy to verify the statements that are given in the following remark.

Remark 5.B.1. Let $X$ and $Y$ be two positive random variables, and let $E_1$ be a mean 1 exponential random variable which is independent of both $X$ and $Y$. Define $\tilde X = E_1/X$ and $\tilde Y = E_1/Y$; that is, the distributions of both $\tilde X$ and $\tilde Y$ are scale mixtures of exponential distributions. Then

$$X \le_{\text{Lt-r}} Y \iff \tilde Y \le_{\text{hr}} \tilde X$$

and

$$X \le_{\text{r-Lt-r}} Y \iff \tilde Y \le_{\text{rh}} \tilde X.$$

See similar results in Example 4.B.7 and in Remark 5.A.2.
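The step behind Remark 5.B.1 can be written out as follows. This is a sketch rather than the authors' own derivation; it assumes the usual ratio characterizations of $\le_{\text{hr}}$ (ratio of survival functions) and $\le_{\text{rh}}$ (ratio of distribution functions), and it writes $\tilde X = E_1/X$, $\tilde Y = E_1/Y$ for the two exponential scale mixtures.

```latex
\begin{align*}
P\{\tilde X > t\} &= P\{E_1 > tX\} = E\bigl[e^{-tX}\bigr] = L_X(t), \qquad t \ge 0, \\
P\{\tilde X \le t\} &= 1 - L_X(t), \\
\frac{P\{\tilde Y > t\}}{P\{\tilde X > t\}} &= \frac{L_Y(t)}{L_X(t)}, \qquad
\frac{P\{\tilde Y \le t\}}{P\{\tilde X \le t\}} = \frac{1 - L_Y(t)}{1 - L_X(t)}.
\end{align*}
```

Monotonicity of the first ratio is (5.B.1) and corresponds to the hazard rate order; monotonicity of the second is (5.B.2) and corresponds to the reversed hazard rate order.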

The next theorem characterizes the orders $\le_{\text{Lt-r}}$ and $\le_{\text{r-Lt-r}}$ by means of functions of the respective moments. The characterization is an analog of the characterization of the Laplace transform order given in Theorem 5.A.5.

Theorem 5.B.2. Let $X$ and $Y$ be nonnegative random variables that possess moments $\mu_i$ and $\nu_i$, respectively, $i = 1, 2, \ldots$ ($\mu_0 = \nu_0 = 1$). Then

(a) $X \le_{\text{Lt-r}} Y$ if, and only if,

$$\frac{\sum_{i=0}^{\infty} \frac{(-s)^i}{i!}\,\nu_i}{\sum_{i=0}^{\infty} \frac{(-s)^i}{i!}\,\mu_i} \quad \text{is decreasing in } s > 0.$$

(b) $X \le_{\text{r-Lt-r}} Y$ if, and only if,

$$\frac{\sum_{i=1}^{\infty} \frac{(-s)^i}{i!}\,\nu_i}{\sum_{i=1}^{\infty} \frac{(-s)^i}{i!}\,\mu_i} \quad \text{is decreasing in } s > 0.$$

Proof. By writing $e^{-st} = \sum_{i=0}^{\infty} \frac{(-s)^i}{i!}\,t^i$, the result follows easily from the definitions.

5.B.2 Closure properties

We list below some preservation properties of the orders $\le_{\text{Lt-r}}$ and $\le_{\text{r-Lt-r}}$. Below, for any nonnegative random variable $Z$, we will denote by $L_Z$ the Laplace transform of $Z$.

Theorem 5.B.3. Let $X_1, X_2, \ldots$ be independent, identically distributed nonnegative random variables, and let $N_1$ and $N_2$ be positive integer-valued random variables which are independent of the $X_i$'s. Then

$$N_1 \le_{\text{Lt-r}} [\le_{\text{r-Lt-r}}]\; N_2 \Longrightarrow \sum_{i=1}^{N_1} X_i \le_{\text{Lt-r}} [\le_{\text{r-Lt-r}}]\; \sum_{i=1}^{N_2} X_i.$$

Proof. For $j = 1, 2$, we have

$$L_{X_1+X_2+\cdots+X_{N_j}}(s) = \sum_{i=1}^{\infty} P\{N_j = i\}\,L_{X_1+X_2+\cdots+X_i}(s) = \sum_{i=1}^{\infty} P\{N_j = i\}\,L^i_{X_1}(s) = L_{N_j}(-\log L_{X_1}(s)).$$

The stated results now follow from the assumptions.
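The composition formula in the proof of Theorem 5.B.3 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the summands are taken to be exponential with a hypothetical rate, the counting variables have hypothetical finite supports, and all function names are invented for this example. A finite grid check is evidence for the monotonicity of the ratio, not a proof.

```python
import math

def lt_exponential(s, rate=1.0):
    """Laplace transform E[exp(-s X)] of an exponential variable with the given rate."""
    return rate / (rate + s)

def lt_count(pmf, u):
    """L_N(u) = E[exp(-u N)] for a counting variable with pmf {n: P{N = n}}."""
    return sum(p * math.exp(-u * n) for n, p in pmf.items())

def lt_random_sum(pmf, s, rate=1.0):
    """Direct mixture form: sum_i P{N = i} * L_X(s)^i."""
    return sum(p * lt_exponential(s, rate) ** n for n, p in pmf.items())

def lt_random_sum_via_composition(pmf, s, rate=1.0):
    """Composition form from the proof: L_N(-log L_X(s))."""
    return lt_count(pmf, -math.log(lt_exponential(s, rate)))

def ratio_is_decreasing(pmf1, pmf2, grid, rate=1.0):
    """Check on a grid that L_{S_2}(s) / L_{S_1}(s) is nonincreasing in s."""
    ratios = [lt_random_sum(pmf2, s, rate) / lt_random_sum(pmf1, s, rate) for s in grid]
    return all(a >= b - 1e-12 for a, b in zip(ratios, ratios[1:]))

# Hypothetical counting variables with N2 = N1 + 1 in distribution,
# so that L_{N2}(u) / L_{N1}(u) = exp(-u) is decreasing in u.
pmf1 = {1: 0.5, 2: 0.5}
pmf2 = {2: 0.5, 3: 0.5}
grid = [0.1 * k for k in range(1, 50)]
```

In this example the ratio of the two random-sum transforms reduces to $L_{X_1}(s)$ itself, which is decreasing in $s$, in line with the conclusion of the theorem.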

If the Xi ’s are not assumed to be identically distributed, then stronger assumptions on the relationship between N1 and N2 yield the same conclusion. This is shown in the next two theorems.

Theorem 5.B.4. Let $X_1, X_2, \ldots$ be independent nonnegative random variables, and let $N_1$ and $N_2$ be positive integer-valued random variables which are independent of the $X_i$'s. If $N_1 \le_{\text{rh}} N_2$, then

$$\sum_{i=1}^{N_1} X_i \le_{\text{Lt-r}} \sum_{i=1}^{N_2} X_i.$$

Proof. For $j = 1, 2$, we have

$$L_{X_1+X_2+\cdots+X_{N_j}}(s) = \sum_{i=1}^{\infty} P\{N_j = i\} \prod_{k=1}^{i} L_{X_k}(s).$$

For $0 < s_1 < s_2$ we need to show that

$$\Bigl(\sum_{m=1}^{\infty} P\{N_1 = m\} \prod_{k=1}^{m} L_{X_k}(s_1)\Bigr)\Bigl(\sum_{n=1}^{\infty} P\{N_2 = n\} \prod_{k=1}^{n} L_{X_k}(s_2)\Bigr) - \Bigl(\sum_{m=1}^{\infty} P\{N_2 = m\} \prod_{k=1}^{m} L_{X_k}(s_1)\Bigr)\Bigl(\sum_{n=1}^{\infty} P\{N_1 = n\} \prod_{k=1}^{n} L_{X_k}(s_2)\Bigr) \le 0.$$

This follows from the remark after Theorem 2.1 of Joag-Dev, Kochar, and Proschan [259] by noting that

$$\bigl(g_1(m), g_2(m)\bigr) \equiv \Bigl(\prod_{k=1}^{m} L_{X_k}(s_2),\; \prod_{k=1}^{m} L_{X_k}(s_1)\Bigr)$$

is a pair of what Joag-Dev, Kochar, and Proschan [259] call DP2 functions of $m$, whenever $0 < s_1 < s_2$, and $g_1(m)$ is decreasing in $m$.

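Theorem 5.B.5. Let $X_1, X_2, \ldots$ be independent nonnegative random variables, and let $N_1$ and $N_2$ be positive integer-valued random variables which are independent of the $X_i$'s. If $N_1 \le_{\text{hr}} N_2$, and if $X_i \le_{\text{r-Lt-r}} X_{i+1}$, then

$$\sum_{i=1}^{N_1} X_i \le_{\text{r-Lt-r}} \sum_{i=1}^{N_2} X_i.$$

Proof. For $j = 1, 2$, we have

$$1 - L_{X_1+X_2+\cdots+X_{N_j}}(s) = \sum_{m=1}^{\infty} P\{N_j = m\}\Bigl(1 - \prod_{k=1}^{m} L_{X_k}(s)\Bigr) = \sum_{m=0}^{\infty} P\{N_j > m\} \prod_{k=1}^{m} L_{X_k}(s)\,\bigl(1 - L_{X_{m+1}}(s)\bigr),$$

where $\prod_{k=1}^{0} L_{X_k}(s) \equiv 1$. So for $0 < s_1 < s_2$ we have that

$$\bigl(1 - L_{X_1+\cdots+X_{N_1}}(s_1)\bigr)\bigl(1 - L_{X_1+\cdots+X_{N_2}}(s_2)\bigr) - \bigl(1 - L_{X_1+\cdots+X_{N_2}}(s_1)\bigr)\bigl(1 - L_{X_1+\cdots+X_{N_1}}(s_2)\bigr)$$

$$= \sum_{m=1}^{\infty} \sum_{n=0}^{m-1} \bigl[P\{N_1 > m\}P\{N_2 > n\} - P\{N_2 > m\}P\{N_1 > n\}\bigr] \times \prod_{k=1}^{n} L_{X_k}(s_1) \prod_{k=1}^{n} L_{X_k}(s_2)$$

$$\times \Bigl[\prod_{k=n+1}^{m} L_{X_k}(s_1)\,\bigl(1 - L_{X_{m+1}}(s_1)\bigr)\bigl(1 - L_{X_{n+1}}(s_2)\bigr) - \prod_{k=n+1}^{m} L_{X_k}(s_2)\,\bigl(1 - L_{X_{m+1}}(s_2)\bigr)\bigl(1 - L_{X_{n+1}}(s_1)\bigr)\Bigr] \le 0.$$

The last inequality follows since $N_1 \le_{\text{hr}} N_2$ implies that

$$P\{N_1 > m\}P\{N_2 > n\} - P\{N_2 > m\}P\{N_1 > n\} \le 0 \quad \text{for } m > n,$$

and $X_i \le_{\text{r-Lt-r}} X_{i+1}$ implies that

$$\bigl(1 - L_{X_{m+1}}(s_1)\bigr)\bigl(1 - L_{X_{n+1}}(s_2)\bigr) - \bigl(1 - L_{X_{m+1}}(s_2)\bigr)\bigl(1 - L_{X_{n+1}}(s_1)\bigr) \ge 0 \quad \text{for } m > n.$$

The stated result now follows.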

Some other preservation results are given in the following theorems.

Theorem 5.B.6. Let $X_1, X_2, \ldots, X_n$ be a set of independent nonnegative random variables and let $Y_1, Y_2, \ldots, Y_n$ be another set of independent nonnegative random variables. If $X_j \le_{\text{Lt-r}} Y_j$, $j = 1, 2, \ldots, n$, then

$$X_1 + X_2 + \cdots + X_n \le_{\text{Lt-r}} Y_1 + Y_2 + \cdots + Y_n.$$

Proof. Since $L_{X_1+X_2+\cdots+X_n}(s) = \prod_{i=1}^{n} L_{X_i}(s)$, we see that if $\dfrac{L_{Y_j}(s)}{L_{X_j}(s)}$ is decreasing in $s$, $j = 1, 2, \ldots, n$, then $\dfrac{L_{Y_1+Y_2+\cdots+Y_n}(s)}{L_{X_1+X_2+\cdots+X_n}(s)}$ is also decreasing in $s$.

As a special case of Theorem 5.B.6 we see that if X and Y are nonnegative independent random variables, then X ≤Lt-r X + Y.

(5.B.3)
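The special case (5.B.3) can also be seen by a direct computation. The following is a short worked sketch, assuming only the independence of $X$ and $Y$ and the definition (5.B.1):

```latex
\begin{align*}
\frac{L_{X+Y}(s)}{L_X(s)} = \frac{L_X(s)\,L_Y(s)}{L_X(s)} = L_Y(s) = E\bigl[e^{-sY}\bigr],
\end{align*}
```

which is decreasing in $s > 0$ because $Y \ge 0$, so the ratio in (5.B.1) is decreasing and $X \le_{\text{Lt-r}} X + Y$.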

Theorem 5.B.7. Let {Xj } and {Yj } be two sequences of random variables such that Xj →st X and Yj →st Y as j → ∞. If Xj ≤Lt-r [≤r-Lt-r ] Yj , j = 1, 2, . . ., then X ≤Lt-r [≤r-Lt-r ] Y .

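Theorem 5.B.8. Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta] \le_{\text{Lt-r}} [\le_{\text{r-Lt-r}}]\; [Y \mid \Theta = \theta']$ for all $\theta$ and $\theta'$ in the support of $\Theta$. Then $X \le_{\text{Lt-r}} [\le_{\text{r-Lt-r}}]\; Y$.

Proof. We only give the proof for the $\le_{\text{Lt-r}}$ order. The proof for the order $\le_{\text{r-Lt-r}}$ is similar. Note that

$$\frac{L_X(s)}{L_Y(s)} = \frac{E_\Theta L_{[X|\Theta]}(s)}{E_\Theta L_{[Y|\Theta]}(s)}.$$

It can be verified that $\dfrac{d}{ds}\dfrac{L_X(s)}{L_Y(s)} \ge 0$ if $\dfrac{d}{ds}\dfrac{L_{[X|\theta]}(s)}{L_{[Y|\theta']}(s)} \ge 0$ for all $\theta$ and $\theta'$ in the support of $\Theta$.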

In the next result it is shown that a random variable, whose distribution is the mixture of two distributions of $\le_{\text{Lt-r}}$ [$\le_{\text{r-Lt-r}}$] ordered random variables, is bounded from below and from above, in the $\le_{\text{Lt-r}}$ [$\le_{\text{r-Lt-r}}$] order sense, by these two random variables.

Theorem 5.B.9. Let $X$ and $Y$ be two nonnegative random variables with distribution functions $F$ and $G$, respectively. Let $W$ be a random variable with the distribution function $pF + (1 - p)G$ for some $p \in (0, 1)$. If $X \le_{\text{Lt-r}} [\le_{\text{r-Lt-r}}]\; Y$, then $X \le_{\text{Lt-r}} [\le_{\text{r-Lt-r}}]\; W \le_{\text{Lt-r}} [\le_{\text{r-Lt-r}}]\; Y$.

The proof of Theorem 5.B.9 is similar to the proof of Theorem 1.B.22, but it uses (5.B.1) [(5.B.2)] instead of (1.B.3). We omit the details.

5.B.3 Relationship to other stochastic orders

In this subsection we describe some relationships between the Laplace ratio orders and some other stochastic orders. We also mention some known counterimplications.

Theorem 5.B.10. Let $X$ and $Y$ be positive random variables. Then

$$X \le_{\text{Lt-r}} Y \Longrightarrow X \le_{\text{Lt}} Y \quad\text{and}\quad X \le_{\text{r-Lt-r}} Y \Longrightarrow X \le_{\text{Lt}} Y.$$

Proof. Denote $L_X(\infty) = \lim_{s\to\infty} L_X(s)$ and $L_Y(\infty) = \lim_{s\to\infty} L_Y(s)$. Since $L_X(0) = L_Y(0) = 1$ and $L_X(\infty) = L_Y(\infty) = 0$, we see that if $X \le_{\text{Lt-r}} Y$, then

$$\frac{L_Y(s)}{L_X(s)} \le \frac{L_Y(0)}{L_X(0)} = 1,$$

and if $X \le_{\text{r-Lt-r}} Y$, then

$$\frac{1 - L_Y(s)}{1 - L_X(s)} \ge \frac{1 - L_Y(\infty)}{1 - L_X(\infty)} = 1.$$

This proves the stated results.

As a corollary of Theorem 5.B.10 and (5.A.5) we see that

$$X \le_{\text{Lt-r}} Y \Longrightarrow EX \le EY,$$

and that

$$X \le_{\text{r-Lt-r}} Y \Longrightarrow EX \le EY,$$

provided the expectations exist.

The proof of the next theorem will not be given here.

Theorem 5.B.11. Let $X$ and $Y$ be nonnegative absolutely continuous or integer-valued random variables. Then

$$X \le_{\text{rh}} Y \Longrightarrow X \le_{\text{Lt-r}} Y \quad\text{and}\quad X \le_{\text{hr}} Y \Longrightarrow X \le_{\text{r-Lt-r}} Y.$$

The following result gives a relationship between the orders $\le_{\text{mrl}}$ and $\le_{\text{Lt-r}}$.

Theorem 5.B.12. Let $X$ and $Y$ be two nonnegative absolutely continuous random variables that possess all moments and with bounded support $[0, b]$. If $X \le_{\text{mrl}} Y$, then $b - Y \le_{\text{Lt-r}} b - X$.

Proof. Denote $g(1, n) = E[X^n]$ and $g(2, n) = E[Y^n]$. Since $X \le_{\text{mrl}} Y$ it follows from (2.A.10) that $g(i, n)$ is totally positive of order 2 in $i = 1, 2$, and in $n \ge 0$. Therefore, by the Basic Composition Formula (Karlin [275]) we have that

$$h(i, s) \equiv \sum_{n=0}^{\infty} g(i, n)\,\frac{s^n}{n!}$$

is totally positive of order 2 in $i = 1, 2$, and in $s \ge 0$. That is,

$$\frac{h(2, s)}{h(1, s)} = \frac{E e^{sY}}{E e^{sX}} \quad \text{is increasing in } s \ge 0. \tag{5.B.4}$$

It is easy to verify that (5.B.4) implies $b - Y \le_{\text{Lt-r}} b - X$.

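Counterexamples in the literature show that for nonnegative integer-valued random variables $X$ and $Y$ we have

$$X \le_{\text{hr}} Y \not\Longrightarrow X \le_{\text{Lt-r}} Y \not\Longrightarrow X \le_{\text{icv}} Y$$

and

$$X \le_{\text{rh}} Y \not\Longrightarrow X \le_{\text{r-Lt-r}} Y \not\Longrightarrow X \le_{\text{icv}} Y.$$

It is of interest to compare the above counterimplications, and the implications given in Theorems 5.B.10 and 5.B.11, with the implication $X \le_{\text{icv}} Y \Longrightarrow X \le_{\text{Lt}} Y$ given in Theorem 5.A.16.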

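From the above counterimplications it follows that for nonnegative integer-valued random variables $X$ and $Y$ we have

$$X \le_{\text{Lt-r}} Y \not\Longrightarrow X \le_{\text{r-Lt-r}} Y \quad\text{and}\quad X \le_{\text{r-Lt-r}} Y \not\Longrightarrow X \le_{\text{Lt-r}} Y.$$

Counterexamples in the literature also show that for nonnegative integer-valued random variables $X$ and $Y$ we have

$$X \le_{\text{Lt-r}} Y \not\Longrightarrow X \le_{\text{icx}} Y \quad\text{and}\quad X \le_{\text{r-Lt-r}} Y \not\Longrightarrow X \le_{\text{icx}} Y.$$

From (1.D.2) it follows that for nonnegative random variables,

$$X \le_{\text{conv}} Y \Longrightarrow X \le_{\text{Lt-r}} Y. \tag{5.B.5}$$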

Example 5.B.13. The Laplace ratio orders are useful for the purpose of stochastically comparing random minima and maxima. Let $X_1, X_2, \ldots$ be a sequence of nonnegative independent and identically distributed random variables. Let $N_1$ and $N_2$ be two positive integer-valued random variables which are independent of the $X_i$'s. Denote $X_{(1:N_j)} \equiv \min\{X_1, X_2, \ldots, X_{N_j}\}$ and $X_{(N_j:N_j)} \equiv \max\{X_1, X_2, \ldots, X_{N_j}\}$, $j = 1, 2$. Let the common distribution function, and the common survival function, of the $X_i$'s be denoted, respectively, by $F_{X_1}$ and $\bar F_{X_1}$, and let $F_{X_{(N_j:N_j)}}$ and $\bar F_{X_{(1:N_j)}}$ denote, respectively, the distribution function of $X_{(N_j:N_j)}$ and the survival function of $X_{(1:N_j)}$, $j = 1, 2$. Note that

$$F_{X_{(N_j:N_j)}}(x) = \sum_{n=1}^{\infty} F^n_{X_1}(x)\,p_{N_j}(n) = L_{N_j}(-\log F_{X_1}(x)), \quad j = 1, 2,$$

and that

$$\bar F_{X_{(1:N_j)}}(x) = \sum_{n=1}^{\infty} \bar F^n_{X_1}(x)\,p_{N_j}(n) = L_{N_j}(-\log \bar F_{X_1}(x)), \quad j = 1, 2.$$

Thus,

$$\frac{\bar F_{X_{(1:N_2)}}(x)}{\bar F_{X_{(1:N_1)}}(x)} = \frac{L_{N_2}(-\log \bar F_{X_1}(x))}{L_{N_1}(-\log \bar F_{X_1}(x))}.$$

Therefore

$$N_1 \le_{\text{Lt-r}} N_2 \Longrightarrow X_{(1:N_1)} \ge_{\text{hr}} X_{(1:N_2)}.$$

In a similar manner it can be shown that

$$N_1 \le_{\text{Lt-r}} N_2 \Longrightarrow X_{(N_1:N_1)} \le_{\text{rh}} X_{(N_2:N_2)},$$

N1 ≤r-Lt-r N2 =⇒ X(1:N1 ) ≥rh X(1:N2 ) , and that N1 ≤r-Lt-r N2 =⇒ X(N1 :N1 ) ≤hr X(N2 :N2 ) . From Theorem 5.B.11 and the above implications it follows that N1 ≤rh N2 =⇒ X(1:N1 ) ≥hr X(1:N2 ) , N1 ≤rh N2 =⇒ X(N1 :N1 ) ≤rh X(N2 :N2 ) , N1 ≤hr N2 =⇒ X(1:N1 ) ≥rh X(1:N2 ) ,

(5.B.6) (5.B.7)

and that

$$N_1 \le_{\text{hr}} N_2 \Longrightarrow X_{(N_1:N_1)} \le_{\text{hr}} X_{(N_2:N_2)}.$$

See related results in Examples 1.C.46, 3.B.39, 4.B.16, and 5.A.24.

The following example is a variation of Example 5.A.25; under a stronger condition (the order $\le_{\text{Lt-r}}$ is stronger than the order $\le_{\text{Lt}}$) we obtain a stronger conclusion.

Example 5.B.14. As in Example 5.A.25, let $X$ have a compound Poisson process, with rate $\lambda$ and distribution $\varphi$, as its (random) hazard rate function. The survival function of $X$ is given in (5.A.18), and it follows that its density function $f$ is given by

$$f(t) = \lambda[1 - L_\varphi(t)] \exp\Bigl\{-\int_0^t \lambda[1 - L_\varphi(s)]\,ds\Bigr\}, \quad t \ge 0.$$

Similarly let $Y$ have a compound Poisson process, with rate $\lambda$ and distribution $\phi$, as its (random) hazard rate function. Its survival function is given in (5.A.19), and its density function $g$ is given by

$$g(t) = \lambda[1 - L_\phi(t)] \exp\Bigl\{-\int_0^t \lambda[1 - L_\phi(s)]\,ds\Bigr\}, \quad t \ge 0.$$

It is now seen that if the random variable associated with $\varphi$ is larger, in the reverse Laplace transform ratio order (and hence, by Theorem 5.B.10, also larger in the Laplace transform order), than the random variable associated with $\phi$, then $g(t)/f(t)$ is increasing in $t \ge 0$; that is, $X \le_{\text{lr}} Y$.

5.C Some Related Orders

5.C.1 The factorial moments order

The factorial moments of a random variable $X$ are $\mu_i = E[X(X - 1) \cdots (X - i + 1)]$, $i = 1, 2, \ldots$. They are particularly useful when $X$ is a nonnegative integer-valued random variable, since they can be easily obtained from the probability generating function of $X$ by repeated differentiation. Throughout this subsection we consider only nonnegative integer-valued random variables. The $i$th factorial moment of such a random variable $X$ can also be written as $\mu_i = i!\,E\binom{X}{i}$, where $\binom{x}{i}$ is defined as 0 when $i > x$.

Let $X$ and $Y$ be two nonnegative integer-valued random variables such that

$$E\binom{X}{i} \le E\binom{Y}{i} \quad \text{for all } i \in \mathbb{N}_{++}. \tag{5.C.1}$$

Then $X$ is said to be smaller than $Y$ in the factorial moments order (denoted by $X \le_{\text{fm}} Y$).

For a real function $\phi$ defined on $\mathbb{N}_+$, define $\Delta^0\phi(x) = \phi(x)$, and $\Delta^j\phi(x) = \Delta^{j-1}\phi(x + 1) - \Delta^{j-1}\phi(x)$, $x \in \mathbb{N}_+$, $j = 1, 2, \ldots$. It can be shown that for every $\phi : \mathbb{N}_+ \to [0, \infty)$, one has

$$\phi(x) = \sum_{j=0}^{\infty} \Delta^j\phi(0)\binom{x}{j}, \quad x \in \mathbb{N}_+. \tag{5.C.2}$$

The following characterization of the order $\le_{\text{fm}}$ is a direct consequence of (5.C.2).

Theorem 5.C.1. Let $X$ and $Y$ be two nonnegative integer-valued random variables. Then $X \le_{\text{fm}} Y$ if, and only if,

$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all } \phi \text{ such that } \Delta^j\phi(0) \ge 0,\ j \in \mathbb{N}_+.$$
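The expansion (5.C.2) is the Newton forward-difference formula, and it can be evaluated mechanically. The sketch below is an illustration only; the function names are hypothetical, and the infinite sum is truncated at $j = x$, which loses nothing because $\binom{x}{j} = 0$ for $j > x$.

```python
from math import comb

def forward_differences_at_zero(phi, order):
    """Return [Δ^0 φ(0), Δ^1 φ(0), ..., Δ^order φ(0)] from a difference table."""
    values = [phi(x) for x in range(order + 1)]
    diffs = []
    while values:
        diffs.append(values[0])  # leading entry of the current row is Δ^j φ(0)
        values = [b - a for a, b in zip(values, values[1:])]
    return diffs

def newton_expansion(phi, x):
    """Evaluate sum_j Δ^j φ(0) * C(x, j); terms with j > x vanish."""
    diffs = forward_differences_at_zero(phi, x)
    return sum(d * comb(x, j) for j, d in enumerate(diffs))

def cube(x):
    """Example test function φ(x) = x^3 on the nonnegative integers."""
    return x ** 3
```

For the cube function all forward differences at 0 are nonnegative, so it belongs to the class of test functions appearing in Theorem 5.C.1.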

It is easy to see that

$$X \le_{\text{fm}} Y \Longrightarrow EX \le EY.$$

Some closure properties of the order $\le_{\text{fm}}$ are given in the next theorem.

Theorem 5.C.2. (a) Let $X$ and $Y$ be two nonnegative integer-valued random variables. If $X \le_{\text{fm}} Y$, then $X + k \le_{\text{fm}} Y + k$ for every $k \in \mathbb{N}_+$.

(b) Let $X$ and $Y$ be two nonnegative integer-valued random variables. If $X \le_{\text{fm}} Y$, then $kX \le_{\text{fm}} kY$ for every $k \in \mathbb{N}_+$.

(c) Let $X_1, X_2, \ldots, X_m$ be a set of independent nonnegative integer-valued random variables. Let $Y_1, Y_2, \ldots, Y_m$ be another set of independent nonnegative integer-valued random variables. If $X_i \le_{\text{fm}} Y_i$, $i = 1, 2, \ldots, m$, then

$$\sum_{i=1}^{m} X_i \le_{\text{fm}} \sum_{i=1}^{m} Y_i.$$

Proof. It is enough to prove part (a) for $k = 1$; the proof can then be completed by induction. But for $k = 1$ the desired result follows directly from the identity

$$\binom{x+1}{i+1} = \binom{x}{i+1} + \binom{x}{i}, \quad i, x \in \mathbb{N}_+.$$

A lengthy straightforward calculation yields

$$\Delta^j \binom{kx}{i}\bigg|_{x=0} \ge 0, \quad i, j, k \in \mathbb{N}_+.$$

Part (b) then follows. Finally, in order to prove part (c) it is enough to consider the case $m = 2$. This case follows immediately from the identity

$$\binom{x_1 + x_2}{i} = \sum_{j=0}^{i} \binom{x_1}{j}\binom{x_2}{i-j}, \quad x_1, x_2, i \in \mathbb{N}_+.$$

The next result shows that under some conditions the order $\le_{\text{fm}}$ is closed under formation of random sums. We do not give the proof here.

Theorem 5.C.3. Let $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ each be a sequence of nonnegative independent integer-valued random variables such that $X_i \le_{\text{fm}} Y_i$, $i = 1, 2, \ldots$. Let $M$ and $N$ be integer-valued nonnegative random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively, such that $M \le_{\text{icx}} N$. If the $X_i$'s or the $Y_i$'s are identically distributed, then

$$\sum_{j=1}^{M} X_j \le_{\text{fm}} \sum_{j=1}^{N} Y_j.$$

Select a positive integer $i$ and consider the real function $\phi$ defined on $\{i + 1, i + 2, \ldots\}$ by $\phi(x) = \binom{x}{i}$. A straightforward computation yields that $\phi(x) + \phi(x + 2) \ge 2\phi(x + 1)$ for $x \in \{i + 1, i + 2, \ldots\}$. That is, the function $\phi$ is convex on $\{i + 1, i + 2, \ldots\}$. Thus we have proven the following result.

Theorem 5.C.4. Let $X$ and $Y$ be two nonnegative integer-valued random variables. If $X \le_{\text{icx}} Y$, then $X \le_{\text{fm}} Y$. In particular, if $X \le_{\text{st}} Y$, then $X \le_{\text{fm}} Y$.

A relationship between the orders $\le_{\text{fm}}$ and $\le_{\text{pgf}}$ is given in the next result.

Theorem 5.C.5. Let $X$ and $Y$ be two nonnegative integer-valued random variables with bounded support $\{0, 1, 2, \ldots, b\}$. If $X \le_{\text{fm}} Y$, then $b - Y \le_{\text{pgf}} b - X$.

Proof. For $a \ge 1$ define $M_X(a) = E[a^X]$ and $M_Y(a) = E[a^Y]$. Note that the $i$th derivative of $M_X$ [$M_Y$] at 1 is $M_X^{(i)}(1) = E[X(X - 1) \cdots (X - i + 1)]$ [$M_Y^{(i)}(1) = E[Y(Y - 1) \cdots (Y - i + 1)]$]. Expanding $M_X$ and $M_Y$ about 1, using the finiteness of the support for convergence, it is seen that

$$M_X(a) = \sum_{i=0}^{\infty} \frac{M_X^{(i)}(1)}{i!}(a - 1)^i = \sum_{i=0}^{\infty} \frac{E[X(X - 1) \cdots (X - i + 1)]}{i!}(a - 1)^i \le \sum_{i=0}^{\infty} \frac{E[Y(Y - 1) \cdots (Y - i + 1)]}{i!}(a - 1)^i = M_Y(a),$$

where the inequality follows from the assumption that $X \le_{\text{fm}} Y$ and from the fact that $a \ge 1$. Thus,

$$E[a^X] \le E[a^Y] \quad \text{for all } a \ge 1. \tag{5.C.3}$$

Now it is easy to verify that (5.C.3) implies that $b - Y \le_{\text{pgf}} b - X$.

5.C.2 The moments order

Consider now two general (that is, not necessarily integer-valued) nonnegative random variables $X$ and $Y$ such that

$$E[X^i] \le E[Y^i] \quad \text{for all } i \in \mathbb{N}_{++}.$$

Then $X$ is said to be smaller than $Y$ in the moments order (denoted by $X \le_{\text{mom}} Y$). Thus $X \le_{\text{mom}} Y$ if, and only if,

$$E[\phi(X)] \le E[\phi(Y)] \tag{5.C.4}$$

for all polynomials $\phi$ with nonnegative coefficients. In fact, $X \le_{\text{mom}} Y$ if, and only if, (5.C.4) holds for all absolutely monotone functions $\phi$ of the form $\phi(x) = \sum_{k=0}^{\infty} a_k x^k$, where the $a_k$'s are nonnegative, provided the expectations exist. Clearly,

$$X \le_{\text{mom}} Y \Longrightarrow EX \le EY.$$

Some closure properties of the order $\le_{\text{mom}}$ are given in the next theorem. Its proof is similar to the proof of Theorem 5.C.2 (except that it is simpler) and is thus omitted.

Theorem 5.C.6. (a) Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{\text{mom}} Y$, then $X + k \le_{\text{mom}} Y + k$ for every $k \ge 0$.

(b) Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{\text{mom}} Y$, then $kX \le_{\text{mom}} kY$ for every $k \ge 0$.

(c) Let $X_1, X_2, \ldots, X_m$ be a set of independent nonnegative random variables. Let $Y_1, Y_2, \ldots, Y_m$ be another set of independent nonnegative random variables. If $X_i \le_{\text{mom}} Y_i$, $i = 1, 2, \ldots, m$, then

$$\sum_{i=1}^{m} X_i \le_{\text{mom}} \sum_{i=1}^{m} Y_i.$$

The next result shows that under some conditions the order ≤mom is closed under formation of random sums. We do not give the proof here.

Theorem 5.C.7. Let $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ each be a sequence of nonnegative independent random variables such that $X_i \le_{\text{mom}} Y_i$, $i = 1, 2, \ldots$. Let $M$ and $N$ be integer-valued nonnegative random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively, such that $M \le_{\text{icx}} N$. If the $X_i$'s or the $Y_i$'s are identically distributed, then

$$\sum_{j=1}^{M} X_j \le_{\text{mom}} \sum_{j=1}^{N} Y_j.$$

The moments order is closed under linear convex combinations as the following theorem shows. This result is an analog of Theorems 3.A.36 and 5.A.14. Its proof is similar to the proof of Theorem 3.A.36 and is therefore omitted. A similar result is Theorem 5.C.18.

Theorem 5.C.8. Let $X_1, X_2, \ldots, X_n$ and $Y$ be $n + 1$ random variables. If $X_i \le_{\text{mom}} Y$, $i = 1, 2, \ldots, n$, then

$$\sum_{i=1}^{n} a_i X_i \le_{\text{mom}} Y,$$

whenever $a_i \ge 0$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} a_i = 1$.

The following result gives a relationship between the orders $\le_{\text{fm}}$ and $\le_{\text{mom}}$.

Theorem 5.C.9. Let $X$ and $Y$ be two nonnegative integer-valued random variables. If $X \le_{\text{fm}} Y$, then $X \le_{\text{mom}} Y$. In particular, if $X \le_{\text{icx}} Y$ (or if $X \le_{\text{st}} Y$), then $X \le_{\text{mom}} Y$.

Proof. Denote $x^{[i]} = x(x - 1) \cdots (x - i + 1)$. The result will follow once we have shown that

$$x^i = \sum_{k=1}^{i} \alpha_k^{(i)} x^{[k]}, \quad i = 1, 2, \ldots, \tag{5.C.5}$$

where the $\alpha_k^{(i)}$'s are some nonnegative constants. The expression (5.C.5) can be found on page 4 of Johnson and Kotz [263]. The $\alpha_k^{(i)}$'s in (5.C.5) are the Stirling numbers of the second kind, which are known to be positive.
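The expansion (5.C.5) can be checked for small exponents. The sketch below is an illustration, not the computation from Johnson and Kotz; it assumes the standard recurrence $S(i, k) = k\,S(i - 1, k) + S(i - 1, k - 1)$ for the Stirling numbers of the second kind, and the function names are hypothetical.

```python
def stirling2(i, k):
    """Stirling number of the second kind S(i, k), via the standard recurrence."""
    if i == k:
        return 1
    if k == 0 or k > i:
        return 0
    return k * stirling2(i - 1, k) + stirling2(i - 1, k - 1)

def falling_factorial(x, k):
    """x^{[k]} = x (x - 1) ... (x - k + 1), with the empty product equal to 1."""
    result = 1
    for j in range(k):
        result *= x - j
    return result

def power_from_falling(x, i):
    """Right-hand side of (5.C.5): sum_{k=1}^{i} S(i, k) * x^{[k]}."""
    return sum(stirling2(i, k) * falling_factorial(x, k) for k in range(1, i + 1))
```

Because every coefficient $S(i, k)$ with $1 \le k \le i$ is a positive integer, an ordinary moment is a positive combination of factorial moments, which is the point of the proof.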

In order to obtain a Laplace transform characterization of the order $\le_{\text{mom}}$ we first prove the following result.

Theorem 5.C.10. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

$$X_1 \le_{\text{mom}} X_2 \Longrightarrow N_\lambda(X_1) \le_{\text{fm}} N_\lambda(X_2) \quad \text{for all } \lambda > 0.$$

Proof. For $k = 1, 2$, let $F_k$ denote the distribution function of $X_k$. By (2.A.15) we have

$$E[N_\lambda(X_k)(N_\lambda(X_k) - 1)(N_\lambda(X_k) - 2) \cdots (N_\lambda(X_k) - i + 1)] = \sum_{n=0}^{\infty} n(n - 1)(n - 2) \cdots (n - i + 1) \int_0^\infty \frac{(\lambda x)^n}{n!}\,e^{-\lambda x}\,dF_k(x)$$

$$= \int_0^\infty \sum_{n=0}^{\infty} n(n - 1)(n - 2) \cdots (n - i + 1)\,e^{-\lambda x}\,\frac{(\lambda x)^n}{n!}\,dF_k(x).$$

It is not difficult to verify that the $i$th factorial moment of a Poisson random variable with mean $\lambda x$ is given by

$$\sum_{n=0}^{\infty} n(n - 1)(n - 2) \cdots (n - i + 1)\,e^{-\lambda x}\,\frac{(\lambda x)^n}{n!} = (\lambda x)^i.$$

Therefore

$$E[N_\lambda(X_1)(N_\lambda(X_1) - 1)(N_\lambda(X_1) - 2) \cdots (N_\lambda(X_1) - i + 1)] = \lambda^i \int_0^\infty x^i\,dF_1(x) \le \lambda^i \int_0^\infty x^i\,dF_2(x) = E[N_\lambda(X_2)(N_\lambda(X_2) - 1)(N_\lambda(X_2) - 2) \cdots (N_\lambda(X_2) - i + 1)],$$

where the inequality follows from $X_1 \le_{\text{mom}} X_2$. Thus $N_\lambda(X_1) \le_{\text{fm}} N_\lambda(X_2)$.
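The Poisson factorial-moment identity used in this proof is easy to confirm numerically. The following is a rough sketch with a hypothetical function name; it truncates the series after a fixed number of terms, which is an assumption that is adequate only for moderate means.

```python
import math

def poisson_factorial_moment(mean, i, terms=200):
    """Approximate E[N (N-1) ... (N-i+1)] for N ~ Poisson(mean) by a truncated series."""
    total = 0.0
    pmf = math.exp(-mean)  # P{N = 0}
    for n in range(terms):
        falling = 1.0
        for j in range(i):
            falling *= n - j  # n (n-1) ... (n-i+1); zero whenever n < i
        total += falling * pmf
        pmf *= mean / (n + 1)  # update to P{N = n + 1}
    return total
```

With `mean` playing the role of $\lambda x$, the returned value should agree with $(\lambda x)^i$ up to truncation and rounding error.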

A Laplace transform characterization of the order $\le_{\text{mom}}$ is given next. It may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, 2.A.16, 2.B.14, 4.A.21, and 5.A.6.

Theorem 5.C.11. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

$$X_1 \le_{\text{mom}} X_2 \iff N_\lambda(X_1) \le_{\text{mom}} N_\lambda(X_2) \quad \text{for all } \lambda > 0.$$

Proof. If $X_1 \le_{\text{mom}} X_2$, then from Theorem 5.C.10 we get that $N_\lambda(X_1) \le_{\text{fm}} N_\lambda(X_2)$, and from Theorem 5.C.9 we get that $N_\lambda(X_1) \le_{\text{mom}} N_\lambda(X_2)$.

Now suppose that $N_\lambda(X_1) \le_{\text{mom}} N_\lambda(X_2)$ for all $\lambda > 0$. Then $E(N_\lambda(X_1))^i \le E(N_\lambda(X_2))^i$, $i = 1, 2, \ldots$. In particular, $E[N_\lambda(X_1)] \le E[N_\lambda(X_2)]$, therefore, by (2.A.16),

$$E[X_1] = E[N_\lambda(X_1)]/\lambda \le E[N_\lambda(X_2)]/\lambda = E[X_2].$$

Let the induction hypothesis be

$$E[X_1^i] \le E[X_2^i], \quad i = 1, 2, \ldots, m.$$

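Now observe the following. From (2.A.15) it is seen that

$$E\bigl[(N_\lambda(X_k))^{m+1}\bigr] = \sum_{n=0}^{\infty} n^{m+1} \int_0^\infty e^{-\lambda x}\,\frac{(\lambda x)^n}{n!}\,dF_k(x) = \int_0^\infty \Bigl(\sum_{n=0}^{\infty} n^{m+1} e^{-\lambda x}\,\frac{(\lambda x)^n}{n!}\Bigr)dF_k(x), \quad k = 1, 2.$$

The quantity $\sum_{n=0}^{\infty} n^{m+1} e^{-\lambda x}\,\frac{(\lambda x)^n}{n!}$ is the $(m + 1)$st moment of a Poisson random variable with mean $\lambda x$. It is not difficult to verify that

$$\sum_{n=0}^{\infty} n^{m+1} e^{-\lambda x}\,\frac{(\lambda x)^n}{n!} = a_{m+1}(\lambda x)^{m+1} + a_m(\lambda x)^m + \cdots + a_1(\lambda x) + a_0,$$

where $a_j > 0$, $j = 0, 1, 2, \ldots, m + 1$. Therefore

$$E\bigl[(N_\lambda(X_k))^{m+1}\bigr] = \sum_{j=0}^{m+1} a_j \lambda^j \int_0^\infty x^j\,dF_k(x), \quad k = 1, 2.$$

We know that $E\bigl[(N_\lambda(X_1))^{m+1}\bigr] \le E\bigl[(N_\lambda(X_2))^{m+1}\bigr]$ and therefore,

$$\sum_{j=0}^{m+1} a_j \lambda^j \int_0^\infty x^j\,dF_1(x) \le \sum_{j=0}^{m+1} a_j \lambda^j \int_0^\infty x^j\,dF_2(x)$$

for some $a_0, a_1, \ldots, a_{m+1} > 0$ and all $\lambda > 0$. The $j = 0$ terms on the two sides are equal, so the inequality can be rewritten as

$$a_{m+1}\lambda^{m+1}\bigl(E[X_1^{m+1}] - E[X_2^{m+1}]\bigr) \le \sum_{j=1}^{m} a_j \lambda^j \bigl(E[X_2^j] - E[X_1^j]\bigr).$$

The right-hand side is nonnegative by the induction hypothesis, and it grows at most like $\lambda^m$. If $E[X_1^{m+1}] - E[X_2^{m+1}] > 0$, then, by choosing sufficiently large $\lambda$, the left-hand side would be greater than the right-hand side, a contradiction. Thus we must have $E[X_1^{m+1}] - E[X_2^{m+1}] \le 0$. The result now follows by induction.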

The next result describes a relationship between the orders $\le_{\text{mom}}$ and $\le_{\text{r-Lt-r}}$; we omit its proof.

Theorem 5.C.12. Let $X$ and $Y$ be two nonnegative random variables. Then

$$X \le_{\text{r-Lt-r}} Y \Longrightarrow \frac{1}{Y} \le_{\text{mom}} \frac{1}{X}.$$

Finally we mention a related order. Let $X$ and $Y$ be two nonnegative random variables such that

$$\frac{E[Y^n]}{E[X^n]} \quad \text{is increasing in } n \in \mathbb{N}_+, \tag{5.C.6}$$

where, by convention, $E[X^0] = E[Y^0] = 1$. Then $X$ is said to be smaller than $Y$ in the moments ratio order (denoted as $X \le_{\text{mom-r}} Y$).

From (5.C.6) it is seen that $\dfrac{E[Y^n]}{E[X^n]} \ge \dfrac{E[Y^0]}{E[X^0]} = 1$. Thus we see that

$$X \le_{\text{mom-r}} Y \Longrightarrow X \le_{\text{mom}} Y. \tag{5.C.7}$$

From (2.A.10) it is seen that X ≤mrl Y =⇒ X ≤mom-r Y. Therefore, by Theorem 2.A.1, we also have that X ≤hr Y =⇒ X ≤mom-r Y.

(5.C.8)

In the proof of Theorem 5.B.12 it is essentially shown that for nonnegative random variables $X$ and $Y$ with bounded support $[0, b]$ we have

$$X \le_{\text{mom-r}} Y \Longrightarrow b - Y \le_{\text{Lt-r}} b - X.$$

This may be contrasted with (5.C.9) below (recall that $X \le_{\text{Lt-r}} Y \Longrightarrow X \le_{\text{Lt}} Y$; see Theorem 5.B.10).

The following result is obvious.

Theorem 5.C.13. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{\text{mom-r}} Y$, then $kX \le_{\text{mom-r}} kY$ for every $k \ge 0$.

The next result describes a relationship between the orders $\le_{\text{mom-r}}$ and $\le_{\text{Lt-r}}$; we omit its proof.

Theorem 5.C.14. Let $X$ and $Y$ be two nonnegative random variables. Then

$$X \le_{\text{Lt-r}} Y \Longrightarrow \frac{1}{Y} \le_{\text{mom-r}} \frac{1}{X}.$$

From (5.C.7) and Theorem 5.C.14 it is seen that if $X$ and $Y$ are nonnegative random variables, then

$$X \le_{\text{Lt-r}} Y \Longrightarrow \frac{1}{Y} \le_{\text{mom}} \frac{1}{X}.$$

5.C.3 The moment generating function order

Let $X$ and $Y$ be two nonnegative random variables such that $E e^{s_0 Y} < \infty$ for some $s_0 > 0$, and

$$E e^{sX} \le E e^{sY} \quad \text{for all } s > 0.$$

Then $X$ is said to be smaller than $Y$ in the moment generating function order (denoted by $X \le_{\text{mgf}} Y$). A simple integration by parts shows that $X \le_{\text{mgf}} Y$ if, and only if,

$$\int_0^\infty e^{sx}\,\bar F(x)\,dx \le \int_0^\infty e^{sx}\,\bar G(x)\,dx \quad \text{for all } s > 0,$$

where $\bar F$ and $\bar G$ are the survival functions of $X$ and of $Y$, respectively.

The following theorem is an analog of Theorem 5.A.5; its proof is similar to the proof of that result.

Theorem 5.C.15. Let $X$ and $Y$ be two nonnegative random variables. Then $X \le_{\text{mgf}} Y$ if, and only if,

$$\sum_{i=0}^{\infty} \frac{s^i}{(i + 1)!}\,EX^{i+1} \le \sum_{i=0}^{\infty} \frac{s^i}{(i + 1)!}\,EY^{i+1} \quad \text{for all } s > 0.$$
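The two equivalent forms of $\le_{\text{mgf}}$ above rest on one identity, which can be sketched as follows. This is an outline under the assumption that $E e^{sX}$ is finite and that the sum and the integral may be interchanged; it is not taken from the source.

```latex
\begin{align*}
E\bigl[e^{sX}\bigr] - 1
  &= E\Bigl[\int_0^{X} s\,e^{sx}\,dx\Bigr]
   = s\int_0^\infty e^{sx}\,\bar F(x)\,dx, \\
\int_0^\infty e^{sx}\,\bar F(x)\,dx
  &= \sum_{i=0}^{\infty}\frac{s^i}{i!}\int_0^\infty x^i\,\bar F(x)\,dx
   = \sum_{i=0}^{\infty}\frac{s^i}{(i+1)!}\,E X^{i+1}.
\end{align*}
```

The first line gives the survival-function form of the definition, and the second gives the series in Theorem 5.C.15, using $\int_0^\infty x^i\,\bar F(x)\,dx = E X^{i+1}/(i+1)$.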

It follows from Theorem 5.C.15 that

$$X \le_{\text{mom}} Y \Longrightarrow X \le_{\text{mgf}} Y.$$

Some closure properties of the order $\le_{\text{mgf}}$ are given below (recall from Section 1.A.3 that for any random variable $Z$ and any event $A$ we denote by $[Z \mid A]$ any random variable whose distribution is the conditional distribution of $Z$ given $A$).

Theorem 5.C.16. Let $X$ and $Y$ be two nonnegative random variables.

(a) If $X \le_{\text{mgf}} Y$, then $X + k \le_{\text{mgf}} Y + k$ for every $k > 0$.

(b) If $X \le_{\text{mgf}} Y$, then $kX \le_{\text{mgf}} kY$ for every $k > 0$.

(c) Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta] \le_{\text{mgf}} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Then $X \le_{\text{mgf}} Y$. That is, the moment generating function order is closed under mixtures.

(d) Let $X_1, X_2, \ldots, X_m$ be a set of independent random variables and let $Y_1, Y_2, \ldots, Y_m$ be another set of independent random variables. If $X_i \le_{\text{mgf}} Y_i$ for $i = 1, 2, \ldots, m$, then

$$\sum_{i=1}^{m} X_i \le_{\text{mgf}} \sum_{i=1}^{m} Y_i;$$

that is, the moment generating function order is closed under convolutions.

The next result is an analog of Theorems 5.A.9 and 5.C.7.

Theorem 5.C.17. Let $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ each be a sequence of nonnegative independent and identically distributed random variables such that $X_i \le_{\text{mgf}} Y_i$, $i = 1, 2, \ldots$. Let $M$ and $N$ be integer-valued nonnegative random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively, such that $M \le_{\text{mgf}} N$. Then

$$\sum_{j=1}^{M} X_j \le_{\text{mgf}} \sum_{j=1}^{N} Y_j.$$

The following result is an analog of Theorem 3.A.36; similar results are Theorems 5.A.14 and 5.C.8.

Theorem 5.C.18. Let $X_1, X_2, \ldots, X_n$ and $Y$ be $n + 1$ random variables. If $X_i \le_{\text{mgf}} Y$, $i = 1, 2, \ldots, n$, then

$$\sum_{i=1}^{n} a_i X_i \le_{\text{mgf}} Y,$$

whenever $a_i \ge 0$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} a_i = 1$.

The next result is an analog of Theorem 5.C.5; it describes a relationship between the orders $\le_{\text{mgf}}$ and $\le_{\text{Lt}}$.

Theorem 5.C.19. Let $X$ and $Y$ be two nonnegative random variables with bounded support $[0, b]$. Then $X \le_{\text{mgf}} Y$ if, and only if, $b - Y \le_{\text{Lt}} b - X$.

In particular, for random variables as in Theorem 5.C.19,

$$X \le_{\text{mom}} Y \Longrightarrow b - Y \le_{\text{Lt}} b - X. \tag{5.C.9}$$
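The equivalence in Theorem 5.C.19 can be traced to a one-line computation. The following is a sketch, assuming the definition of $\le_{\text{Lt}}$ as a pointwise comparison of Laplace transforms (the smaller variable has the larger transform):

```latex
\begin{align*}
L_{b-X}(s) = E\bigl[e^{-s(b-X)}\bigr] = e^{-sb}\,E\bigl[e^{sX}\bigr], \qquad s > 0,
\end{align*}
```

so that $L_{b-Y}(s) \ge L_{b-X}(s)$ for all $s > 0$ holds exactly when $E e^{sY} \ge E e^{sX}$ for all $s > 0$.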

5.D Complements Section 5.A: We used three main sources in order to collect the results regarding the Laplace transform order. These are Stoyan [540, Section 1.8], Kim and Proschan [294], and Alzaid, Kim, and Proschan [11]. The characterization (5.A.4) is taken from Denuit [141]. The characterization of the order ≤Lt in terms of exponential mixtures, given in Remark 5.A.2, is taken from Bartoszewicz [50]. The characterization described in Theorem 5.A.4 can be found in Bhattacharjee [84]. The Laplace transform characterization of the order ≤Lt given in Theorem 5.A.6 is essentially taken from Alzaid, Kim, and Proschan [11]. Some further characterizations of the Laplace transform order by means of inﬁnitely divisible distributions are given in Bartoszewicz [48]. The closure property of the order ≤Lt under random sums (Theorem 5.A.8) is taken from Bhattacharjee [86]. The

extensions of the closure property of the order ≤Lt under random sums (Theorems 5.A.10–5.A.12) can be found in Pellerey [450]. The majorization result (Theorem 5.A.13) is taken from Ma [375]. The result which gives the closure of the Laplace transform order under linear convex combinations (Theorem 5.A.14) can be found in Pellerey [452]. The condition which implies stochastic equality (in Theorem 5.A.15) is a combination of results in Cai and Wu [116] and in Bhattacharjee [84], where some generalizations of this condition can also be found. The characterization of the Laplace transform order by means of the order ≤∞-icv (Theorem 5.A.17) is taken from Thistle [548]; see also Fishburn and Lavalle [204] and further references in that paper. The implication of the order ≤Lt from the order ≤p− (Theorem 5.A.18) is essentially taken from Bhattacharjee [83]; see also Cai and Wu [116]. The closure property of the order ≤Lt under the operation of taking minima (Theorem 5.A.19) is taken from Alzaid, Kim, and Proschan [11]. Alzaid, Kim, and Proschan [11] also have a version of Theorem 5.A.19 which gives conditions under which the order ≤Lt is closed under the operation of taking maxima; however, their condition must be wrong, since it postulates that the $F_i$'s and the $G_i$'s (of Theorem 5.A.19) are completely monotone, but these functions are increasing, whereas all completely monotone functions must be decreasing. Looking over their proof it is seen that a sufficient condition, for the closure of the order ≤Lt under the operation of taking maxima, is that $e^{-tx}F_i(x)$ and $e^{-tx}G_i(x)$ be completely monotone in $x$ for each $t \ge 0$, $i = 1, 2, \ldots, m$. We are not aware of any study of the latter condition. The class of random lifetimes, defined by (5.A.11), is studied in Klefsjö [302]. The equivalence of the Laplace transform ordering of nonnegative random variables with equal means, and their corresponding asymptotic equilibrium ages, given in (5.A.12), is taken from Denuit [141].
The lower bound in (5.A.13), in the sense of ≤Lt , when the mean and the variance are given, can be found in Stoyan [540, page 23], who credited it to Rolski. The characterization of the order ≤hr by means of the order ≤Lt (Theorem 5.A.22) is given in Belzunce, Gao, Hu, and Pellerey [67]. The results about the stochastic comparisons of random minima and maxima (Example 5.A.24) are taken from Shaked and Wong [526]. The hazard rate order comparison of two nonnegative random variables with random hazard rate functions (Example 5.A.25) is a special case of Theorem 3 of Di Crescenzo and Pellerey [166]. Section 5.B: Most of the results in this section can be found in Shaked and Wong [525]. The characterizations of the orders ≤Lt-r and ≤r-Lt-r in terms of exponential mixtures, given in Remark 5.B.1, are taken from Bartoszewicz [50]. Di Crescenzo and Shaked [167] used (5.B.3) in order to obtain Laplace transform ratio order comparisons of many pairs of random variables. The relationship between the orders ≤mrl and ≤Lt-r (Theorem 5.B.12) is essentially proven in Fagiuoli and Pellerey [187]. The relationship between the orders ≤Lt-r and ≤conv , given in (5.B.5), was

noted in Shaked and Suarez-Llorens [520]. Extensions of the implications (5.B.6) and (5.B.7) to order statistics other than the minimum can be found in Nanda, Misra, Paul, and Singh [427]. The likelihood ratio order comparison of two nonnegative random variables with random hazard rate functions (Example 5.B.14) is essentially Remark 3 of Di Crescenzo and Pellerey [166].

Section 5.C: Many of the results in this section are taken from Lefèvre and Picard [338]. A discussion about other related orders can also be found in Lefèvre and Picard [338]. The closure properties of the order ≤fm (Theorem 5.C.2), as well as the simple proof of Theorem 5.C.9, have been communicated to us by Lefèvre [335]. The results that give the closure under random convolutions property of the factorial moments order (Theorem 5.C.3) and of the moments order (Theorem 5.C.7) are taken from Jean-Marie and Liu [254]. Lefèvre and Utev [339] have noticed that for finite random variables with support {0, 1, . . . , b} the discrete versions of the orders ≤m-icx, m = 2, 3, . . . , b (see Section 4.A.7), together with some conditions on the factorial moments, imply the order ≤fm; thus they generalized Theorem 5.C.5. The result which gives the closure of the moments order under linear convex combinations (Theorem 5.C.8) can be found in Pellerey [452]. The Laplace transform characterizations of the order ≤mom (Theorems 5.C.10 and 5.C.11) are taken from Shaked and Wong [524]. The relationship between the orders ≤mom and ≤r-Lt-r (Theorem 5.C.12) can be found in Bartoszewicz [47]. The moments ratio order has been introduced by Whitt [565] who has also obtained the implications (5.C.7) and (5.C.8). The relationship between the orders ≤mom-r and ≤Lt-r (Theorem 5.C.14) can be found in Bartoszewicz [47]. The moment generating function order is called the exponential order in Kaas, Heerwaarden, and Goovaerts [269]. Most of the results in Section 5.C.3 can be found in Klar and Müller [301].
The result that gives the closure of the order ≤mgf under linear convex combinations (Theorem 5.C.18) is taken from Li [352].

6 Multivariate Stochastic Orders

In this chapter we describe various extensions, of the univariate stochastic orders in Chapters 1 and 2, to the multivariate case. The most important common orders that are studied in this chapter are the multivariate stochastic orders ≤st and ≤lr . Multivariate extensions of the orders ≤hr and ≤mrl are also studied in this chapter. Also, we review here further analogs of the univariate order ≤st , such as the upper and lower orthants orders. In addition, some other related orders are investigated in this chapter as well.

6.A Notations and Preliminaries

In this chapter we will be concerned with random vectors that take on values in $\mathbb{R}^n \equiv (-\infty, \infty)^n$. When we say that the random vectors are nonnegative we mean that they take on values in $\mathbb{R}^n_+ = [0, \infty)^n$. Elements in $\mathbb{R}^n$ will be denoted by $x$, $y$, and so forth, or, more explicitly, as $x = (x_1, x_2, \dots, x_n)$, $y = (y_1, y_2, \dots, y_n)$, and so on. The space $\mathbb{R}^n$ is endowed with the usual componentwise partial order, which is defined as follows. Let $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n)$ be two vectors in $\mathbb{R}^n$; then we denote $x \le y$ if $x_i \le y_i$ for $i = 1, 2, \dots, n$. Let $x$ be a vector in $\mathbb{R}^n$ and let $I = \{i_1, i_2, \dots, i_k\} \subseteq \{1, 2, \dots, n\}$; then we denote

$$x_I = (x_{i_1}, x_{i_2}, \dots, x_{i_k}). \tag{6.A.1}$$

For a random vector $X$ that takes on values in $\mathbb{R}^n$, the interpretation of $X_I$ is similar. The complement of $I$ in $\{1, 2, \dots, n\}$ is denoted by $\bar{I} \equiv \{1, 2, \dots, n\} - I$. The vector of ones will be denoted by $e$, that is, $e = (1, 1, \dots, 1)$. The dimension of $e$ may vary from one formula to another, but it is always possible to determine it from the expression in which it appears. For example, if we write $x_I \ge te$, then it is obvious that the dimension of $e$ is $|I|$, that is, the cardinality of $I$.


Let φ be a univariate or a multivariate function with domain in Rn . If φ(x) ≤ [≥] φ(y) whenever x ≤ y, then we say that the function φ is increasing [decreasing]. A set U ⊆ Rn is called increasing or upper [decreasing or lower] if y ∈ U whenever y ≥ [≤] x and x ∈ U . If U is Borel measurable, then it is increasing [decreasing] if, and only if, its indicator function IU is increasing [decreasing]. In this chapter, and later in the book, when we consider increasing and decreasing sets, they are implicitly assumed to be Borel measurable.
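The definitions of increasing functions and upper sets can be checked by brute force on a finite grid. The following minimal Python sketch does this; the grid {0, 1, 2}^2 is only a finite stand-in for Rn, and the helper names `leq`, `is_upper`, and `is_increasing` are hypothetical, introduced here for illustration.

```python
from itertools import product

def leq(x, y):
    """Componentwise partial order: x <= y iff x_i <= y_i for every i."""
    return all(a <= b for a, b in zip(x, y))

def is_upper(subset, grid):
    """True if `subset` is an upper (increasing) set relative to `grid`:
    whenever x is in the subset and y >= x, then y is in the subset."""
    s = set(subset)
    return all(y in s for x in s for y in grid if leq(x, y))

def is_increasing(phi, grid):
    """True if phi(x) <= phi(y) whenever x <= y on `grid`."""
    return all(phi(x) <= phi(y) for x in grid for y in grid if leq(x, y))

grid = list(product(range(3), repeat=2))      # the points of {0, 1, 2}^2
U = [x for x in grid if x[0] + x[1] >= 3]      # an upper set
V = [x for x in grid if x[0] == 1]             # not an upper set

def indicator(subset):
    """Indicator function of `subset`."""
    s = set(subset)
    return lambda x: 1 if x in s else 0

print(is_upper(U, grid), is_increasing(indicator(U), grid))
print(is_upper(V, grid), is_increasing(indicator(V), grid))
```

On this grid the sketch also illustrates the equivalence stated above: a set is upper exactly when its indicator function is increasing.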

6.B The Usual Multivariate Stochastic Order

6.B.1 Definition and equivalent conditions

Let $X$ and $Y$ be two random vectors such that

$$P\{X \in U\} \le P\{Y \in U\} \quad \text{for all upper sets } U \subseteq \mathbb{R}^n. \tag{6.B.1}$$

Then $X$ is said to be smaller than $Y$ in the usual stochastic order (denoted by $X \le_{\mathrm{st}} Y$). Roughly speaking, (6.B.1) says that $X$ is less likely than $Y$ to take on large values, where "large" means any value in an increasing set $U$, for any increasing set $U$. Another way of rewriting (6.B.1) is the following:

$$E[I_U(X)] \le E[I_U(Y)] \quad \text{for all upper sets } U \subseteq \mathbb{R}^n, \tag{6.B.2}$$

where $I_U$ denotes the indicator function of $U$. From (6.B.2) it follows that if $X \le_{\mathrm{st}} Y$, then

$$E\Big[\sum_{i=1}^{m} a_i I_{U_i}(X) - b\Big] \le E\Big[\sum_{i=1}^{m} a_i I_{U_i}(Y) - b\Big] \tag{6.B.3}$$

for all $a_i \ge 0$, $i = 1, 2, \dots, m$, $b \in \mathbb{R}$, and $m \ge 0$. Given an increasing real function $\phi$ on $\mathbb{R}^n$, it is possible, for each $m$, to define a sequence of $U_i$'s, a sequence of $a_i$'s, and a $b$ (all of which may depend on $m$), such that, as $m \to \infty$, (6.B.3) converges to

$$E[\phi(X)] \le E[\phi(Y)], \tag{6.B.4}$$

provided the expectations exist. It follows that $X \le_{\mathrm{st}} Y$ if, and only if, (6.B.4) holds for all increasing functions $\phi$ for which the expectations exist.

6.B.2 A characterization by construction on the same probability space

As in the univariate case, the usual multivariate stochastic order can be characterized as follows:


Theorem 6.B.1. The random vectors $X$ and $Y$ satisfy $X \le_{\mathrm{st}} Y$ if, and only if, there exist two random vectors $\hat{X}$ and $\hat{Y}$, defined on the same probability space, such that

$$\hat{X} =_{\mathrm{st}} X, \tag{6.B.5}$$

$$\hat{Y} =_{\mathrm{st}} Y, \tag{6.B.6}$$

and

$$P\{\hat{X} \le \hat{Y}\} = 1. \tag{6.B.7}$$

Obviously, if (6.B.5)–(6.B.7) hold, then $X \le_{\mathrm{st}} Y$. We will not give the proof of Theorem 6.B.1 here; however, in the next subsection we point out an important special case in which the construction of $\hat{X}$ and of $\hat{Y}$ can be described explicitly. As in the univariate case (see Theorem 1.A.2) Theorem 6.B.1 can be restated as follows.

Theorem 6.B.2. The $n$-dimensional random vectors $X$ and $Y$ satisfy $X \le_{\mathrm{st}} Y$ if, and only if, there exist a random variable $Z$ and $\mathbb{R}^n$-valued functions $\psi_1$ and $\psi_2$ such that $\psi_1(z) \le \psi_2(z)$ for all $z \in \mathbb{R}$, and $X =_{\mathrm{st}} \psi_1(Z)$ and $Y =_{\mathrm{st}} \psi_2(Z)$.

In light of Theorem 6.B.1, the following question arises. Let $\{X(\theta), \theta \in \Theta\}$ be a collection of $n$-dimensional random vectors indexed by $\theta$, where $\Theta$ is a subset of $\mathbb{R}^m$ for some $m$ (see the beginning of Chapter 8 for a discussion about the meaning of this notation). Suppose that $X(\theta) \le_{\mathrm{st}} X(\theta')$ whenever $\theta \le \theta'$; that is, that $X(\theta)$ is stochastically increasing in $\theta$. Is it possible then to construct, on some probability space, a family $\{\hat{X}(\theta), \theta \in \Theta\}$ such that $\hat{X}(\theta) =_{\mathrm{st}} X(\theta)$ for all $\theta \in \Theta$, and such that $P\{\hat{X}(\theta) \le \hat{X}(\theta')\} = 1$ whenever $\theta \le \theta'$? It turns out that if $\Theta \subseteq \mathbb{R}$ (that is, $m = 1$) the answer is in the affirmative. However, when $m \ge 2$ this need not be the case; see Fill and Machida [200] for a counterexample and a further discussion.

6.B.3 Conditions that lead to the multivariate usual stochastic order

The first basic result described in this subsection gives sufficient conditions for the usual multivariate stochastic order by means of the usual univariate stochastic order. The proof is based on the well-known standard construction: Suppose that we are given a distribution of a random vector $X = (X_1, X_2, \dots, X_n)$ and we want to construct a random vector $\hat{X} = (\hat{X}_1, \hat{X}_2, \dots, \hat{X}_n)$ such that $\hat{X} =_{\mathrm{st}} X$. The interest in such constructions is in simulation theory as well as in other areas of applications. To do so, let $U_1, U_2, \dots, U_n$ be independent uniform$[0, 1]$ random variables and define

$$\hat{X}_1 = \inf\{x_1 : P\{X_1 \le x_1\} \ge U_1\},$$


$$\hat{X}_k = \inf\{x_k : P\{X_k \le x_k \mid X_1 = \hat{X}_1, \dots, X_{k-1} = \hat{X}_{k-1}\} \ge U_k\}, \quad k = 2, 3, \dots, n.$$

Then $\hat{X} =_{\mathrm{st}} X$.

The conditions given in the next result are natural for a construction of $\hat{X}$ and $\hat{Y}$, as needed in Theorem 6.B.1, using the standard construction. The result then follows from Theorem 6.B.1. Recall from Section 1.A.3 that for any random vector $Z$ and an event $A$ we denote by $[Z \mid A]$ any random vector that has as its distribution the conditional distribution of $Z$ given $A$.

Theorem 6.B.3. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If

$$X_1 \le_{\mathrm{st}} Y_1, \tag{6.B.8}$$

$$[X_2 \mid X_1 = x_1] \le_{\mathrm{st}} [Y_2 \mid Y_1 = y_1] \quad \text{whenever } x_1 \le y_1, \tag{6.B.9}$$

and in general, for $i = 2, 3, \dots, n$,

$$[X_i \mid X_1 = x_1, \dots, X_{i-1} = x_{i-1}] \le_{\mathrm{st}} [Y_i \mid Y_1 = y_1, \dots, Y_{i-1} = y_{i-1}] \quad \text{whenever } x_j \le y_j,\ j = 1, 2, \dots, i-1, \tag{6.B.10}$$

then $X \le_{\mathrm{st}} Y$.

Proof. First we construct $\hat{X}_1$ and $\hat{Y}_1$ on some probability space as described, for example, in Section 1.A.2. This is possible by (6.B.8). Any possible realization $(x_1, y_1)$ of $(\hat{X}_1, \hat{Y}_1)$ must satisfy $x_1 \le y_1$. Conditioned on every such possible realization $(x_1, y_1)$ we next construct $\hat{X}_2$ and $\hat{Y}_2$ on the same probability space, again as described, for example, in Section 1.A.2. This, again, is possible by (6.B.9). We thus have constructed so far $(\hat{X}_1, \hat{X}_2)$ and $(\hat{Y}_1, \hat{Y}_2)$. Any possible realization $((x_1, x_2), (y_1, y_2))$ of $((\hat{X}_1, \hat{X}_2), (\hat{Y}_1, \hat{Y}_2))$ must satisfy $x_j \le y_j$, $j = 1, 2$. Therefore, conditioned on every such possible realization $((x_1, x_2), (y_1, y_2))$ we next can construct $\hat{X}_3$ and $\hat{Y}_3$ on the same probability space, and so on. Continuing this procedure we finally arrive at random vectors $\hat{X}$ and $\hat{Y}$, which satisfy (6.B.7). By the standard construction they also satisfy (6.B.5) and (6.B.6). Therefore $X \le_{\mathrm{st}} Y$ by Theorem 6.B.1.
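The coupling used in this proof can be sketched numerically for a pair of 0/1-valued vectors. The Python sketch below is only an illustration under stated assumptions: it realizes each univariate step by the quantile (inverse distribution function) coupling driven by common uniforms, which is one way of carrying out the construction, and the Bernoulli parameters are hypothetical values chosen so that (6.B.8) and (6.B.9) hold.

```python
import random

def bernoulli_quantile(p, u):
    """inf{x : P{B <= x} >= u} for B ~ Bernoulli(p).
    P{B <= 0} = 1 - p, so the quantile is 0 when u <= 1 - p and 1 otherwise
    (the probability-zero case u = 0 is mapped to 0)."""
    return 0 if u <= 1.0 - p else 1

def standard_construction(p1, p2_given, u1, u2):
    """Sequential quantile construction of a 0/1 pair (Z1, Z2).
    p1 = P{Z1 = 1}; p2_given[z1] = P{Z2 = 1 | Z1 = z1}."""
    z1 = bernoulli_quantile(p1, u1)
    z2 = bernoulli_quantile(p2_given[z1], u2)
    return (z1, z2)

# Hypothetical parameters. (6.B.8): 0.3 <= 0.5. (6.B.9): for x1 <= y1,
# P{X2 = 1 | X1 = x1} <= P{Y2 = 1 | Y1 = y1} (0.2 <= 0.4, 0.2 <= 0.8, 0.5 <= 0.8).
X_P1, X_P2_GIVEN = 0.3, {0: 0.2, 1: 0.5}
Y_P1, Y_P2_GIVEN = 0.5, {0: 0.4, 1: 0.8}

def coupled_sample(rng):
    """Build (X-hat, Y-hat) from the same pair of uniforms."""
    u1, u2 = rng.random(), rng.random()
    x = standard_construction(X_P1, X_P2_GIVEN, u1, u2)
    y = standard_construction(Y_P1, Y_P2_GIVEN, u1, u2)
    return x, y

rng = random.Random(0)
pairs = [coupled_sample(rng) for _ in range(10000)]
all_ordered = all(x[0] <= y[0] and x[1] <= y[1] for x, y in pairs)
print(all_ordered)
```

Because both vectors are driven by the same uniforms, every simulated pair is ordered componentwise, which is the content of (6.B.7) in this special case.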

Conditions (6.B.8)–(6.B.10) can be used to deﬁne a new stochastic order. More explicitly, if X and Y satisfy (6.B.8)–(6.B.10), then X is said to be smaller than Y in the strong stochastic order (denoted by X ≤sst Y ). Theorem 6.B.3 simply says that X ≤sst Y =⇒ X ≤st Y . The order ≤sst is not an order in the usual sense; see Remark 6.B.5 below. Suppose that X = (X1 , X2 , . . . , Xn ) satisﬁes, for i = 2, 3, . . . , n, that


$$[X_i \mid X_1 = x_1, \dots, X_{i-1} = x_{i-1}] \le_{\mathrm{st}} [X_i \mid X_1 = x_1', \dots, X_{i-1} = x_{i-1}'] \quad \text{whenever } x_j \le x_j',\ j = 1, 2, \dots, i-1. \tag{6.B.11}$$

Then $X$ is said to be conditionally increasing in sequence (CIS). It is easy to see that if $X$ is CIS and if

$$[X_i \mid X_1 = x_1, \dots, X_{i-1} = x_{i-1}] \le_{\mathrm{st}} [Y_i \mid Y_1 = x_1, \dots, Y_{i-1} = x_{i-1}] \quad \text{for all } x_j,\ j = 1, 2, \dots, i-1, \tag{6.B.12}$$

then (6.B.10) holds. Similarly, if $Y$ is CIS and (6.B.12) holds, then (6.B.10) holds. We thus have proved the following result.

Theorem 6.B.4. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If either $X$ or $Y$ is CIS and (6.B.8) and (6.B.12) hold, then $X \le_{\mathrm{st}} Y$.

Remark 6.B.5. The order $\le_{\mathrm{sst}}$ is not an order in the usual sense. In fact, it is obvious that $X \le_{\mathrm{sst}} X \iff X$ is CIS.

Remark 6.B.6. Let $(U_1, U_2)$ be a bivariate random vector with uniform$[0, 1]$ margins, and with an absolutely continuous distribution function $F$. Then, as can easily be verified, $(U_1, U_2)$ is CIS if, and only if, $F(u_1, u_2)$ is a concave function of $u_1 \in [0, 1]$ for any $u_2 \in [0, 1]$.

A random vector $X = (X_1, X_2, \dots, X_n)$ is said to be weak conditionally increasing in sequence (WCIS) if, for $i = 2, 3, \dots, n$, we have

$$[(X_i, \dots, X_n) \mid X_1 = x_1, \dots, X_{i-2} = x_{i-2}, X_{i-1} = x_{i-1}] \le_{\mathrm{st}} [(X_i, \dots, X_n) \mid X_1 = x_1, \dots, X_{i-2} = x_{i-2}, X_{i-1} = x_{i-1}']$$

for all $x_j$, $j = 1, 2, \dots, i-2$, and $x_{i-1} \le x_{i-1}'$. It can be shown that if a random vector is CIS, then it is WCIS. The next result thus strengthens Theorem 6.B.4. We do not give its proof here.

Theorem 6.B.7. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If either $X$ or $Y$ is WCIS and (6.B.8) and (6.B.12) hold, then $X \le_{\mathrm{st}} Y$.

The second basic result of this subsection is a multivariate analog of the univariate implication $X \le_{\mathrm{lr}} Y \Longrightarrow X \le_{\mathrm{st}} Y$ (the latter follows from Theorems 1.C.1 and 1.B.1). (Another multivariate analog is given in Theorem 6.E.8.) Recall the definition of association given in (3.A.53). Association, along with the notions of CIS and WCIS, are concepts that indicate positive dependence among the random variables $X_1, X_2, \dots, X_n$.


Theorem 6.B.8. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors with density functions $f$ and $g$, respectively. If $X$ is associated, and if $g(x)/f(x)$ is increasing in $x$, then $X \le_{\mathrm{st}} Y$.

Proof. Let $\phi$ be an increasing function for which $E[\phi(Y)]$ exists. Then

$$\begin{aligned}
E[\phi(Y)] &= \int \phi(y) g(y)\,dy = \int \phi(y) \frac{g(y)}{f(y)} f(y)\,dy \\
&\ge \int \phi(y) f(y)\,dy \int \frac{g(y)}{f(y)} f(y)\,dy = E[\phi(X)],
\end{aligned}$$

where the inequality follows from (3.A.53) and from the monotonicity of $\phi(x)$ and of $g(x)/f(x)$ in $x$. The stated result now follows from (6.B.4).
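A discrete analog of this result can be checked by brute force against the definition (6.B.1). The Python sketch below is an illustration under assumptions not made in the text: probability mass functions on the grid {0, 1, 2}^2 replace densities, X is taken to have independent components (a standard sufficient condition for association), and g is obtained by tilting f with a hypothetical increasing ratio. The helper names are introduced here for illustration only.

```python
from itertools import product

GRID = list(product(range(3), repeat=2))   # support {0, 1, 2}^2

def leq(x, y):
    """Componentwise partial order."""
    return all(a <= b for a, b in zip(x, y))

def upper_sets(grid):
    """Yield every upper subset of `grid` (brute force over all subsets)."""
    n = len(grid)
    for mask in range(1 << n):
        s = {grid[i] for i in range(n) if (mask >> i) & 1}
        if all(y in s for x in s for y in grid if leq(x, y)):
            yield s

def st_leq(f, g, grid, tol=1e-12):
    """Check (6.B.1): P{X in U} <= P{Y in U} for every upper set U,
    where f and g are the probability mass functions of X and Y."""
    return all(
        sum(f[x] for x in U) <= sum(g[x] for x in U) + tol
        for U in upper_sets(grid)
    )

# X has independent components with hypothetical marginals f1 and f2.
f1 = [0.5, 0.3, 0.2]
f2 = [0.4, 0.4, 0.2]
f = {x: f1[x[0]] * f2[x[1]] for x in GRID}

# g(x) is proportional to f(x) * ratio(x), with ratio increasing in x,
# so that g(x)/f(x) is increasing.
ratio = {x: 1.0 + x[0] + 2.0 * x[1] for x in GRID}
norm = sum(f[x] * ratio[x] for x in GRID)
g = {x: f[x] * ratio[x] / norm for x in GRID}

print(st_leq(f, g, GRID), st_leq(g, f, GRID))
```

The enumeration is exponential in the grid size, so this kind of check is only practical for very small supports.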

In order to motivate the third basic result of this subsection, consider $m$ independent random variables $X_1, X_2, \dots, X_m$ and an increasing $m$-dimensional function $\phi$. It seems reasonable to expect that $[(X_1, X_2, \dots, X_m) \mid \phi(X_1, X_2, \dots, X_m) = s]$ is stochastically increasing in $s$. This is not always true, but the next result indicates an important instance in which this is the case. We omit the proof.

Theorem 6.B.9. Let $X_1, X_2, \dots, X_m$ be independent random variables, each with a logconcave density (that is, Polya frequency of order 2 (PF$_2$); see Theorem 1.C.52). Then

$$\Big[(X_1, X_2, \dots, X_m) \,\Big|\, \sum_{i=1}^{m} X_i = s\Big] \le_{\mathrm{st}} \Big[(X_1, X_2, \dots, X_m) \,\Big|\, \sum_{i=1}^{m} X_i = s'\Big]$$

whenever $s \le s'$.

A variation of Theorem 6.B.9 is stated next. In stating the conditions of Theorem 6.B.10 below we use the discrete analog of the univariate down shifted likelihood ratio order (see Section 1.C.4). Explicitly, let $X$ and $Y$ be univariate discrete random variables, each with support $\mathbb{N}_+$. Then we denote $X \le_{\mathrm{lr}\downarrow} Y$ if

$$\frac{P\{Y = m + l\}}{P\{X = m\}} \quad \text{is increasing in } m \ge 0 \text{ for all } l \ge 0. \tag{6.B.13}$$

Note that (6.B.13) is a discrete analog of (1.C.21).

Theorem 6.B.10. Let $X_1, X_2, \dots, X_m$ be independent random variables, each with support $\mathbb{N}_+$. Denote $S_i = \sum_{j=1}^{i} X_j$, $i = 1, 2, \dots, m$. If

$$X_i \le_{\mathrm{lr}\downarrow} S_i, \quad i = 2, 3, \dots, m,$$


and if

$$S_i \le_{\mathrm{lr}\downarrow} S_{i+1}, \quad i = 1, 2, \dots, m-1,$$

then

$$\Big[(X_1, X_2, \dots, X_m) \,\Big|\, \sum_{i=1}^{m} X_i = s\Big] \le_{\mathrm{st}} \Big[(X_1, X_2, \dots, X_m) \,\Big|\, \sum_{i=1}^{m} X_i = s'\Big]$$

whenever $s \le s' \in \mathbb{N}_+$.

In Theorem 6.B.9, the function $\phi$, which is mentioned just before that theorem, is $\phi(x_1, x_2, \dots, x_m) = \sum_{i=1}^{m} x_i$. Another case of interest is when $\phi(x_1, x_2, \dots, x_m) = x_{(i)}$, for some $i \in \{1, 2, \dots, m\}$, where $x_{(i)}$ is the $i$th smallest $x_j$. In fact we have the following result, whose proof we do not give. Note that it is not necessary to assume logconcavity in the next theorem.

Theorem 6.B.11. Let $X_1, X_2, \dots, X_m$ be independent and identically distributed random variables with a continuous distribution function. Let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ denote the corresponding order statistics. Let $1 \le r \le m$. Then
(a) for $1 \le k_1 < k_2 < \cdots < k_r \le m$, one has that $[(X_1, X_2, \dots, X_m) \mid X_{(k_1)} = s_1, X_{(k_2)} = s_2, \dots, X_{(k_r)} = s_r]$ is stochastically increasing in $s_1 \le s_2 \le \cdots \le s_r$;
(b) for $s_1 \le s_2 \le \cdots \le s_r$, one has that $[(X_1, X_2, \dots, X_m) \mid X_{(k_1)} = s_1, X_{(k_2)} = s_2, \dots, X_{(k_r)} = s_r]$ is stochastically decreasing in $1 \le k_1 < k_2 < \cdots < k_r \le m$.

A related result is given in the following theorem.

Theorem 6.B.12. Let $X_1, X_2, \dots, X_m$ be independent and identically distributed random variables with a continuous distribution function. Let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ denote the corresponding order statistics. Let $1 \le r \le m$. Then for $1 \le k \le m$, and $s \in \mathbb{R}$, one has that $[(X_1, X_2, \dots, X_m) \mid X_{(k-1)} < s < X_{(k)}]$ is stochastically increasing in $s$, and is stochastically decreasing in $k$.

Another result that is related to Theorem 6.B.11 is the following.

Theorem 6.B.13. Let $X_1, X_2, \dots, X_m$ be independent exponential random variables with possibly different parameters. Let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ denote the corresponding order statistics. Then $[(X_{(1)}, X_{(2)}, \dots, X_{(m)}) \mid X_{(1)} = s_1]$ is stochastically increasing in $s_1$.

The proof of Theorem 6.B.13 uses ideas involving the total hazard construction which is described in Section 6.C.2. Therefore we defer the proof of this theorem to Remark 6.C.2.
For the next result we need the definition of a copula. Let $F$ be an $n$-dimensional distribution function with univariate marginal distribution functions $F_1, F_2, \dots, F_n$. Then there exists an $n$-dimensional distribution function $C$, with uniform$[0, 1]$ marginal distributions, such that


$$F(x_1, x_2, \dots, x_n) = C(F_1(x_1), F_2(x_2), \dots, F_n(x_n)), \quad (x_1, x_2, \dots, x_n) \in \mathbb{R}^n. \tag{6.B.14}$$

The function $C$ is a copula associated with $F$. If $F$ is continuous, then $C$ is unique and can be obtained by

$$C(u_1, u_2, \dots, u_n) = F(F_1^{-1}(u_1), F_2^{-1}(u_2), \dots, F_n^{-1}(u_n)), \quad (u_1, u_2, \dots, u_n) \in [0, 1]^n; \tag{6.B.15}$$

see, for example, Nelsen [431]. Note that if $(U_1, U_2, \dots, U_n)$ has the distribution function $C$, then from (6.B.15) it follows that

$$(F_1^{-1}(U_1), F_2^{-1}(U_2), \dots, F_n^{-1}(U_n)) =_{\mathrm{st}} (X_1, X_2, \dots, X_n). \tag{6.B.16}$$

Theorem 6.B.14. Let the random vectors $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ have a common copula. If $X_i \le_{\mathrm{st}} Y_i$, $i = 1, 2, \dots, n$, then $X \le_{\mathrm{st}} Y$.

Proof. We only give the proof for the continuous case. Let $C$ be the common copula, and let $(U_1, U_2, \dots, U_n)$ be distributed according to $C$. Furthermore, let $F_i$ and $G_i$ denote the univariate distribution functions of $X_i$ and $Y_i$, respectively, $i = 1, 2, \dots, n$. From $X_i \le_{\mathrm{st}} Y_i$ and (1.A.12) we get $F_i^{-1}(u_i) \le G_i^{-1}(u_i)$ for all $u_i \in [0, 1]$, $i = 1, 2, \dots, n$. Hence

$$(F_1^{-1}(U_1), F_2^{-1}(U_2), \dots, F_n^{-1}(U_n)) \le_{\mathrm{a.s.}} (G_1^{-1}(U_1), G_2^{-1}(U_2), \dots, G_n^{-1}(U_n)).$$

The stated result now follows from (6.B.16).
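The argument of this proof can be sketched in a few lines of Python. Everything specific in the sketch is a hypothetical choice made for illustration: the bivariate copula (a uniform variable shifted by an independent amount, modulo 1, which keeps uniform margins), the exponential marginals, and their rates. The rates of the X margins are larger than those of the Y margins, so that each X margin is smaller in the usual stochastic order.

```python
import math
import random

def exp_quantile(rate, u):
    """Inverse distribution function of an exponential law with the given rate."""
    return -math.log(1.0 - u) / rate

def sample_copula(rng):
    """Draw (U1, U2) with uniform[0, 1) margins and positive dependence:
    U2 is U1 shifted by an independent amount in [0, 0.3), modulo 1."""
    u1 = rng.random()
    u2 = (u1 + 0.3 * rng.random()) % 1.0
    return u1, u2

X_RATES = (2.0, 3.0)   # hypothetical rates of X1, X2
Y_RATES = (1.0, 1.5)   # hypothetical rates of Y1, Y2 (smaller, so Xi <=st Yi)

def coupled_pair(rng):
    """Apply the marginal quantile functions of X and of Y to the same copula draw."""
    u = sample_copula(rng)
    x = tuple(exp_quantile(r, ui) for r, ui in zip(X_RATES, u))
    y = tuple(exp_quantile(r, ui) for r, ui in zip(Y_RATES, u))
    return x, y

rng = random.Random(1)
samples = [coupled_pair(rng) for _ in range(5000)]
print(all(x[0] <= y[0] and x[1] <= y[1] for x, y in samples))
```

Each simulated pair is ordered componentwise because the quantile functions are ordered pointwise, which mirrors the almost sure inequality in the proof.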

Theorem 6.B.14 may be compared with Theorem 7.A.38.

An interesting result, which gives conditions under which one can stochastically compare vectors of partial sums of independent random variables, is stated next.

Theorem 6.B.15. Let $\{Z_i\}_{i=1}^{n}$ be a sequence of independent random variables. If $Z_1 \le_{\mathrm{lr}} Z_2 \le_{\mathrm{lr}} \cdots \le_{\mathrm{lr}} Z_n$ then

$$\Big(Z_1, Z_1 + Z_2, \dots, \sum_{i=1}^{n} Z_i\Big) \le_{\mathrm{st}} \Big(Z_{\pi_1}, Z_{\pi_1} + Z_{\pi_2}, \dots, \sum_{i=1}^{n} Z_{\pi_i}\Big) \le_{\mathrm{st}} \Big(Z_n, Z_n + Z_{n-1}, \dots, \sum_{i=1}^{n} Z_i\Big),$$

for every permutation $(\pi_1, \pi_2, \dots, \pi_n)$ of $(1, 2, \dots, n)$.


In particular it follows from Theorem 6.B.15 that if the random variables $X$ and $Y$ are such that $X \le_{\mathrm{lr}} Y$, then

$$(X, X + Y) \le_{\mathrm{st}} (Y, X + Y). \tag{6.B.17}$$

Conclusion (6.B.17) does not necessarily follow from merely assuming that $X \le_{\mathrm{st}} Y$. This can be shown by a counterexample. The proof of (6.B.17) can be obtained from Theorem 1.C.20 as follows. Let $\psi$ be a bivariate increasing function. Then the function $\phi$, defined by $\phi(x, y) = \psi(x, x + y)$, belongs to $\mathcal{G}_{\mathrm{lr}}$. Therefore, from (1.C.11) one sees that $\psi(X, X + Y) \le_{\mathrm{st}} \psi(Y, X + Y)$ and this gives (6.B.17). The proof of Theorem 6.B.15 uses the same idea together with a conditioning argument.

6.B.4 Closure properties

Using (6.B.1) through (6.B.7) it is easy to prove each of the following closure results (note that parts (a) and (c) are special cases of part (b) in the next theorem).

Theorem 6.B.16.
(a) Let $X$ and $Y$ be two $n$-dimensional random vectors. If $X \le_{\mathrm{st}} Y$ and $g : \mathbb{R}^n \to \mathbb{R}^k$ is any $k$-dimensional increasing [decreasing] function, for any positive integer $k$, then the $k$-dimensional vectors $g(X)$ and $g(Y)$ satisfy $g(X) \le_{\mathrm{st}} [\ge_{\mathrm{st}}]\ g(Y)$.
(b) Let $X_1, X_2, \dots, X_m$ be a set of independent random vectors where the dimension of $X_i$ is $k_i$, $i = 1, 2, \dots, m$. Let $Y_1, Y_2, \dots, Y_m$ be another set of independent random vectors where the dimension of $Y_i$ is $k_i$, $i = 1, 2, \dots, m$. Denote $k = k_1 + k_2 + \cdots + k_m$. If $X_i \le_{\mathrm{st}} Y_i$ for $i = 1, 2, \dots, m$, then, for any increasing function $\psi : \mathbb{R}^k \to \mathbb{R}$, one has $\psi(X_1, X_2, \dots, X_m) \le_{\mathrm{st}} \psi(Y_1, Y_2, \dots, Y_m)$. That is, the usual multivariate stochastic order is closed under conjunctions. In particular, the usual multivariate stochastic order is closed under convolutions.
(c) Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If $X \le_{\mathrm{st}} Y$, then $X_I \le_{\mathrm{st}} Y_I$ for each $I \subseteq \{1, 2, \dots, n\}$. That is, the usual multivariate stochastic order is closed under marginalization.
(d) Let $\{X_j, j = 1, 2, \dots\}$ and $\{Y_j, j = 1, 2, \dots\}$ be two sequences of random vectors such that $X_j \to_{\mathrm{st}} X$ and $Y_j \to_{\mathrm{st}} Y$ as $j \to \infty$, where $\to_{\mathrm{st}}$ denotes convergence in distribution. If $X_j \le_{\mathrm{st}} Y_j$, $j = 1, 2, \dots$, then $X \le_{\mathrm{st}} Y$.
(e) Let $X$, $Y$, and $\Theta$ be random vectors such that $[X \mid \Theta = \theta] \le_{\mathrm{st}} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Then $X \le_{\mathrm{st}} Y$. That is, the usual stochastic order is closed under mixtures.

In (6.B.1) the random vectors $X$ and $Y$ can be taken to be of countable infinite dimension; that is, each of $X$ and $Y$ may correspond to an infinite


sequence of random variables. In such a case, if (6.B.1) holds for all upper sets in $\mathbb{R}^\infty$, then we still say that $X$ is smaller than $Y$ in the usual stochastic order (denoted as $X \le_{\mathrm{st}} Y$). A generalization of this idea is described in Section 6.B.7. The inequality (6.B.4), as well as Theorem 6.B.1, are still valid when $X$ and $Y$ have countable infinite dimension. We thus get the following result which involves multivariate random sums. Below, an empty sum is understood to be 0.

Theorem 6.B.17. Let $X_1, X_2, \dots, X_m$ be $m$ countably infinite vectors of nonnegative random variables, and let $Y_1, Y_2, \dots, Y_m$ be other $m$ such vectors. Let $M = (M_1, M_2, \dots, M_m)$ and $N = (N_1, N_2, \dots, N_m)$ be two vectors of nonnegative integers such that $M$ is independent of $X_1, X_2, \dots, X_m$, and $N$ is independent of $Y_1, Y_2, \dots, Y_m$. Denote by $X_{j,i}$ [$Y_{j,i}$] the $i$th element of $X_j$ [$Y_j$]. If $(X_1, X_2, \dots, X_m) \le_{\mathrm{st}} (Y_1, Y_2, \dots, Y_m)$, and if $M \le_{\mathrm{st}} N$, then

$$\Big(\sum_{i=1}^{M_1} X_{1,i}, \sum_{i=1}^{M_2} X_{2,i}, \dots, \sum_{i=1}^{M_m} X_{m,i}\Big) \le_{\mathrm{st}} \Big(\sum_{i=1}^{N_1} Y_{1,i}, \sum_{i=1}^{N_2} Y_{2,i}, \dots, \sum_{i=1}^{N_m} Y_{m,i}\Big).$$

Consider now $n$ families of univariate distribution functions $\{G_\theta^{(i)}, \theta \in \mathcal{X}_i\}$ where $\mathcal{X}_i$ is a subset of the real line $\mathbb{R}$, $i = 1, 2, \dots, n$. Let $X_i(\theta)$ denote a random variable with distribution function $G_\theta^{(i)}$, $i = 1, 2, \dots, n$. Let $\Theta = (\Theta_1, \Theta_2, \dots, \Theta_n)$ be a random vector with support in $\prod_{i=1}^{n} \mathcal{X}_i$, and with distribution function $F$. Consider the $n$-dimensional distribution function $H$ given by

$$H(y_1, y_2, \dots, y_n) = \int_{\mathcal{X}_1} \int_{\mathcal{X}_2} \cdots \int_{\mathcal{X}_n} \prod_{i=1}^{n} G_{\theta_i}^{(i)}(y_i)\, dF(\theta_1, \theta_2, \dots, \theta_n), \quad (y_1, y_2, \dots, y_n) \in \mathbb{R}^n. \tag{6.B.18}$$

The following result is a generalization of Theorem 6.B.16(e), and is a multivariate extension of Theorem 1.A.6; see Theorems 6.G.8, 7.A.37, 9.A.7, and 9.A.15 for related results.

Theorem 6.B.18. Let $\{G_\theta^{(i)}, \theta \in \mathcal{X}_i\}$, $i = 1, 2, \dots, n$, be $n$ families of univariate distribution functions as above. Let $\Theta_1$ and $\Theta_2$ be two random vectors with supports in $\prod_{i=1}^{n} \mathcal{X}_i$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random vectors with distribution functions $H_1$ and $H_2$ given by

$$H_j(y_1, y_2, \dots, y_n) = \int_{\mathcal{X}_1} \int_{\mathcal{X}_2} \cdots \int_{\mathcal{X}_n} \prod_{i=1}^{n} G_{\theta_i}^{(i)}(y_i)\, dF_j(\theta_1, \theta_2, \dots, \theta_n), \quad (y_1, y_2, \dots, y_n) \in \mathbb{R}^n,\ j = 1, 2.$$

If

$$X_i(\theta) \le_{\mathrm{st}} X_i(\theta') \quad \text{whenever } \theta \le \theta',\ i = 1, 2, \dots, n,$$

and if $\Theta_1 \le_{\mathrm{st}} \Theta_2$, then $Y_1 \le_{\mathrm{st}} Y_2$.

6.B.5 Further properties

Clearly if $X \le_{\mathrm{st}} Y$, then $EX \le EY$. However, similar to the univariate case, if two random vectors are ordered in the usual multivariate stochastic order and have the same expected values, then they must have the same distribution. This is shown in the following result, which is a multivariate generalization of Theorem 1.A.8. Similar results are given in Theorems 3.A.43, 3.A.60, 4.A.69, 5.A.15, 6.G.12, and 7.A.14–7.A.16.

Theorem 6.B.19. Let $X = (X_1, X_2, \dots, X_m)$ and $Y = (Y_1, Y_2, \dots, Y_m)$ be two random vectors. If $X \le_{\mathrm{st}} Y$ and if $E[h_i(X_i)] = E[h_i(Y_i)]$ for some strictly increasing function $h_i$, $i = 1, 2, \dots, m$, then $X =_{\mathrm{st}} Y$.

We will not give the complete proof of Theorem 6.B.19 here, but we will show a simple argument that proves it when $X$ and $Y$ are nonnegative random vectors. From the assumption $X \le_{\mathrm{st}} Y$ and from Theorem 6.B.16(c) it follows that $X_i \le_{\mathrm{st}} Y_i$. Since $E[h_i(X_i)] = E[h_i(Y_i)]$ it follows from Theorem 1.A.8 that $X_i =_{\mathrm{st}} Y_i$, and thus, in particular, $EX_i = EY_i$ for $i = 1, 2, \dots, m$. Therefore

$$E\Big[\sum_{i=1}^{m} \alpha_i X_i\Big] = \sum_{i=1}^{m} \alpha_i E[X_i] = \sum_{i=1}^{m} \alpha_i E[Y_i] = E\Big[\sum_{i=1}^{m} \alpha_i Y_i\Big]$$

whenever $\alpha_i \ge 0$, $i = 1, 2, \dots, m$. Also, from $X \le_{\mathrm{st}} Y$ it follows that

$$\sum_{i=1}^{m} \alpha_i X_i \le_{\mathrm{st}} \sum_{i=1}^{m} \alpha_i Y_i \quad \text{whenever } \alpha_i \ge 0,\ i = 1, 2, \dots, m.$$

Therefore, again by Theorem 1.A.8, we have that

$$\sum_{i=1}^{m} \alpha_i X_i =_{\mathrm{st}} \sum_{i=1}^{m} \alpha_i Y_i \quad \text{whenever } \alpha_i \ge 0,\ i = 1, 2, \dots, m.$$

Thus

$$E \exp\Big\{-\sum_{i=1}^{m} \alpha_i X_i\Big\} = E \exp\Big\{-\sum_{i=1}^{m} \alpha_i Y_i\Big\}$$

whenever $\alpha_i \ge 0$, $i = 1, 2, \dots, m$. From the unicity property of the Laplace transform we obtain $X =_{\mathrm{st}} Y$.


A straightforward analog of Theorem 1.A.15 is in general not true in the multivariate case. That is, if $X$ is any random vector and if $U_1$ and $U_2$ are any increasing sets such that $U_1 \supseteq U_2$, then it is not necessarily true that $[X \mid U_1] \le_{\mathrm{st}} [X \mid U_2]$; some property of positive dependence is needed to be imposed on $X$ in order for this result to hold. We do not give the details here.

Recall from (6.B.4) that $X = (X_1, X_2, \dots, X_m) \le_{\mathrm{st}} Y = (Y_1, Y_2, \dots, Y_m)$ if, and only if, $E[\phi(X)] \le E[\phi(Y)]$ for all increasing functions $\phi$, and that (6.B.2) says that $X \le_{\mathrm{st}} Y$ if, and only if, $E[\phi(X)] \le E[\phi(Y)]$ for all increasing indicator functions $\phi$. When $m = 2$ we have a further similar characterization of the multivariate order $\le_{\mathrm{st}}$, as is stated next. The proof is omitted.

Theorem 6.B.20. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors. Then $(X_1, X_2) \le_{\mathrm{st}} (Y_1, Y_2)$ if, and only if, $\phi_1(X_1) + \phi_2(X_2) \le_{\mathrm{st}} \phi_1(Y_1) + \phi_2(Y_2)$ for all increasing functions $\phi_1$ and $\phi_2$.

A random vector $(X_1, X_2, \dots, X_m)$ or its distribution is said to be permutation symmetric or exchangeable if $(X_1, X_2, \dots, X_m) =_{\mathrm{st}} (X_{\pi_1}, X_{\pi_2}, \dots, X_{\pi_m})$ for every permutation $\pi$ of $(1, 2, \dots, m)$. A set $U$ is said to be symmetric if $(x_1, x_2, \dots, x_m) \in U \Longrightarrow (x_{\pi_1}, x_{\pi_2}, \dots, x_{\pi_m}) \in U$ for every permutation $\pi$ of $(1, 2, \dots, m)$. For permutation symmetric random vectors the result in the following theorem holds. The proof uses symmetry arguments and is omitted.

Theorem 6.B.21. Let $X = (X_1, X_2, \dots, X_m)$ and $Y = (Y_1, Y_2, \dots, Y_m)$ be two permutation symmetric random vectors. Then $X \le_{\mathrm{st}} Y$ if, and only if, $P\{X \in U\} \le P\{Y \in U\}$ for all symmetric upper sets $U \subseteq \mathbb{R}^m$.

In the next result we obtain a comparison of order statistics with respect to $\le_{\mathrm{st}}$, but first we need a lemma. Let $z_1, z_2, \dots$ be a sequence of constants or of random variables. Denote by $z_{(i:m)}$ the $i$th smallest value among the first $m$ $z_i$'s.

Lemma 6.B.22. For any sequence of constants $z_1, z_2, \dots$ the following inequalities hold:

$$z_{(i:m)} \le z_{(i+1:m)}, \quad 1 \le i \le m-1. \tag{6.B.19}$$

$$z_{(i:m+1)} \le z_{(i:m)}, \quad 1 \le i \le m. \tag{6.B.20}$$

$$z_{(i:m)} \le z_{(i+1:m+1)}, \quad 1 \le i \le m. \tag{6.B.21}$$


Proof. The proof of (6.B.19) is obvious from the deﬁnition of the z(i:m) ’s. The proof of (6.B.20) is also quite simple—just note that if zm+1 ≤ z(i:m) , then z(i:m+1) ≤ z(i:m) , whereas if zm+1 > z(i:m) , then z(i:m+1) = z(i:m) . Finally, in order to prove (6.B.21), note that if zm+1 ≤ z(i:m) , then z(i+1:m+1) = z(i:m) , whereas if zm+1 > z(i:m) , then z(i:m) ≤ z(i+1:m+1) .
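The three inequalities of Lemma 6.B.22 are easy to check numerically on arbitrary finite sequences. The following Python sketch does so; the function names `order_stat` and `lemma_holds` are hypothetical helpers, and the random integer sequences (which include ties) are only test data.

```python
import random

def order_stat(z, i, m):
    """z_(i:m): the i-th smallest value among the first m entries of z (1-indexed)."""
    return sorted(z[:m])[i - 1]

def lemma_holds(z):
    """Check (6.B.19), (6.B.20), and (6.B.21) for every admissible i and m
    that can be formed from the finite sequence z."""
    n = len(z)
    ok = True
    for m in range(1, n + 1):
        for i in range(1, m):            # (6.B.19): 1 <= i <= m - 1
            ok &= order_stat(z, i, m) <= order_stat(z, i + 1, m)
        if m < n:                        # the next two need z_{m+1} to exist
            for i in range(1, m + 1):    # (6.B.20), (6.B.21): 1 <= i <= m
                ok &= order_stat(z, i, m + 1) <= order_stat(z, i, m)
                ok &= order_stat(z, i, m) <= order_stat(z, i + 1, m + 1)
    return ok

rng = random.Random(1)
sequences = [[rng.randint(0, 9) for _ in range(7)] for _ in range(200)]
print(all(lemma_holds(z) for z in sequences))
```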

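Theorem 6.B.23. Let $\{X_1, X_2, \dots\}$ and $\{Y_1, Y_2, \dots\}$ be two sequences of random variables such that

$$(X_1, X_2, \dots, X_k) \le_{\mathrm{st}} (Y_1, Y_2, \dots, Y_k), \quad k \ge 1. \tag{6.B.22}$$

Then

$$X_{(i:m)} \le_{\mathrm{st}} Y_{(j:n)} \quad \text{whenever } i \le j \text{ and } m - i \ge n - j. \tag{6.B.23}$$

Proof. First note that from (6.B.22) it follows that

$$X_{(i:m)} \le_{\mathrm{st}} Y_{(i:m)}, \quad 1 \le i \le m. \tag{6.B.24}$$

Now, if $m \ge n$, then

$$\begin{aligned}
X_{(i:m)} &\le_{\mathrm{a.s.}} X_{(i:n)} && \text{(by (6.B.20) and } m \ge n\text{)} \\
&\le_{\mathrm{st}} Y_{(i:n)} && \text{(by (6.B.24))} \\
&\le_{\mathrm{a.s.}} Y_{(j:n)} && \text{(by (6.B.19) and } i \le j\text{).}
\end{aligned}$$

And if $m < n$, then

$$\begin{aligned}
X_{(i:m)} &\le_{\mathrm{st}} Y_{(i:m)} && \text{(by (6.B.24))} \\
&\le_{\mathrm{a.s.}} Y_{(i+n-m:n)} && \text{(by (6.B.21) and } m < n\text{)} \\
&\le_{\mathrm{a.s.}} Y_{(j:n)} && \text{(by (6.B.19) and } j \ge i + n - m\text{).}
\end{aligned}$$

Since the almost sure relation $\le_{\mathrm{a.s.}}$ implies the relation $\le_{\mathrm{st}}$, we obtain (6.B.23) from the above inequalities.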

If in Theorem 6.B.23 we take $Y_i = X_i$, $i = 1, 2, \dots$, then obviously (6.B.22) holds. Thus we obtain the following corollary.

Corollary 6.B.24. Let $\{X_1, X_2, \dots\}$ be a sequence of (not necessarily independent) random variables. Then

$$X_{(i:m)} \le_{\mathrm{st}} X_{(j:n)} \quad \text{whenever } i \le j \text{ and } m - i \ge n - j.$$
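When both order statistics are computed from the same sequence, the chains of inequalities in the proof of Theorem 6.B.23 hold for every realization, so the comparison in Corollary 6.B.24 can be checked pathwise. The Python sketch below does this for finite prefixes of a sequence; the helper names and the dependent test sequence (a simple random walk, chosen to stress that independence is not needed) are hypothetical.

```python
import random

def order_stat(z, i, m):
    """z_(i:m): the i-th smallest value among the first m entries of z (1-indexed)."""
    return sorted(z[:m])[i - 1]

def corollary_holds_pathwise(z):
    """Check z_(i:m) <= z_(j:n) for all i <= j with m - i >= n - j,
    over every pair of prefix lengths m, n available in z."""
    size = len(z)
    for m in range(1, size + 1):
        for n in range(1, size + 1):
            for i in range(1, m + 1):
                for j in range(i, n + 1):          # enforces i <= j <= n
                    if m - i >= n - j:
                        if order_stat(z, i, m) > order_stat(z, j, n):
                            return False
    return True

def random_walk(rng, length):
    """A dependent sequence: partial sums of +/-1 steps."""
    path, position = [], 0
    for _ in range(length):
        position += rng.choice((-1, 1))
        path.append(position)
    return path

rng = random.Random(2)
walks = [random_walk(rng, 8) for _ in range(100)]
print(all(corollary_holds_pathwise(w) for w in walks))
```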

The next example shows that if two random variables are ordered in the dispersive order, then the corresponding vectors of spacings are ordered in the usual stochastic order. Related results can be found in Theorems 1.C.45 and 4.B.17, and in Example 6.E.15.


Example 6.B.25. Let $X$ and $Y$ be two random variables. Let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ denote the order statistics from a sample $X_1, X_2, \dots, X_n$ of independent and identically distributed random variables that have the same distribution as $X$. Similarly, let $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}$ denote the order statistics from another sample $Y_1, Y_2, \dots, Y_n$ of independent and identically distributed random variables that have the same distribution as $Y$. The corresponding spacings are defined by $U_{(i)} \equiv X_{(i)} - X_{(i-1)}$ and $V_{(i)} \equiv Y_{(i)} - Y_{(i-1)}$, $i = 2, 3, \dots, n$. Denote $U = (U_{(2)}, U_{(3)}, \dots, U_{(n)})$ and $V = (V_{(2)}, V_{(3)}, \dots, V_{(n)})$. We will now show that if $X \le_{\mathrm{disp}} Y$, then $U \le_{\mathrm{st}} V$.

Let $F$ and $G$ denote the distribution functions of $X$ and $Y$, respectively. Define $\hat{Y}_{(i)} = G^{-1}(F(X_{(i)}))$, $i = 1, 2, \dots, n$, and $\hat{V}_{(i)} = \hat{Y}_{(i)} - \hat{Y}_{(i-1)}$, $i = 2, 3, \dots, n$. Clearly, $(V_{(2)}, V_{(3)}, \dots, V_{(n)}) =_{\mathrm{st}} (\hat{V}_{(2)}, \hat{V}_{(3)}, \dots, \hat{V}_{(n)})$. Furthermore, from (3.B.10) we have that

$$\hat{V}_{(i)} = G^{-1}(F(X_{(i)})) - G^{-1}(F(X_{(i-1)})) \ge X_{(i)} - X_{(i-1)} = U_{(i)} \quad \text{a.s.},\ i = 2, 3, \dots, n.$$

Thus, it follows from Theorem 6.B.1 that $U \le_{\mathrm{st}} V$. In particular, from Theorem 6.B.16(c) we get that $U_{(i)} \le_{\mathrm{st}} V_{(i)}$ for $i = 2, 3, \dots, n$, and this proves Theorem 3.B.31.

For the next two examples recall from page 2 the definition of the majorization order $a \prec b$ among $n$-dimensional vectors.

Example 6.B.26. Let $X_1, X_2, \dots, X_n, Y_1, Y_2, \dots, Y_n$ be independent Gamma random variables where $X_i$ has the density function $f_i$ defined by

$$f_i(x) = \frac{\lambda_i^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\lambda_i x}, \quad x \ge 0,$$

where $\alpha > 0$ and $\lambda_i > 0$, $i = 1, 2, \dots, n$, and $Y_i$ has the density function $g_i$ defined by

$$g_i(x) = \frac{\mu_i^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\mu_i x}, \quad x \ge 0,$$

where $\alpha > 0$ is as above, and $\mu_i > 0$, $i = 1, 2, \dots, n$. Denote the corresponding order statistics by $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ and $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}$. Suppose that $(\lambda_1, \lambda_2, \dots, \lambda_n) \prec (\mu_1, \mu_2, \dots, \mu_n)$. If $\alpha \le 1$, then

$$(X_{(1)}, X_{(2)}, \dots, X_{(n)}) \le_{\mathrm{st}} (Y_{(1)}, Y_{(2)}, \dots, Y_{(n)}),$$

and if $\alpha \ge 1$, then

$$X_{(1)} \ge_{\mathrm{st}} Y_{(1)} \quad \text{and} \quad X_{(n)} \le_{\mathrm{st}} Y_{(n)}.$$

In particular, by taking $\alpha = 1$, it is seen that the above inequalities hold for heterogeneous exponential random variables.


Example 6.B.27. Let $X_1, X_2, \dots, X_n, Y_1, Y_2, \dots, Y_n$ be independent Weibull random variables where $X_i$ has the survival function $\bar{F}_i$ defined by

$$\bar{F}_i(x) = e^{-(\lambda_i x)^{\alpha}}, \quad x \ge 0,$$

where $\alpha > 0$ and $\lambda_i > 0$, $i = 1, 2, \dots, n$, and $Y_i$ has the survival function $\bar{G}_i$ defined by

$$\bar{G}_i(x) = e^{-(\mu_i x)^{\alpha}}, \quad x \ge 0,$$

where $\alpha > 0$ is as above, and $\mu_i > 0$, $i = 1, 2, \dots, n$. Denote the corresponding order statistics by $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ and $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}$. Suppose that $(\lambda_1, \lambda_2, \dots, \lambda_n) \prec (\mu_1, \mu_2, \dots, \mu_n)$. If $\alpha \le 1$, then

$$(X_{(1)}, X_{(2)}, \dots, X_{(n)}) \le_{\mathrm{st}} (Y_{(1)}, Y_{(2)}, \dots, Y_{(n)}).$$

Again, by taking $\alpha = 1$, it is seen that the above inequalities hold for heterogeneous exponential random variables.

Example 6.B.28. Let $X = (X_1, X_2, \dots, X_m)$ and $Y = (Y_1, Y_2, \dots, Y_m)$ be infinitely divisible random vectors with Lévy measures $\nu_X$ and $\nu_Y$, respectively; that is, $\nu_X$ and $\nu_Y$ satisfy $\int_{\mathbb{R}^m} (1 \wedge |x|)\, \nu_X(dx) < \infty$ and $\int_{\mathbb{R}^m} (1 \wedge |y|)\, \nu_Y(dy) < \infty$, and the characteristic functions of $X$ and of $Y$ can be written in the form

$$\varphi_X(t) = \exp\Big\{\int_{\mathbb{R}^m \setminus \{0\}} (e^{i(t \cdot x)} - 1)\, \nu_X(dx) + i(t \cdot b_X)\Big\}$$

and

$$\varphi_Y(t) = \exp\Big\{\int_{\mathbb{R}^m \setminus \{0\}} (e^{i(t \cdot y)} - 1)\, \nu_Y(dy) + i(t \cdot b_Y)\Big\},$$

respectively, for some $b_X, b_Y \in \mathbb{R}^m$. Assume that $\nu_X$ and $\nu_Y$ are concentrated on $[0, \infty)^m$. If $\nu_X(U) \le \nu_Y(U)$ for all Borel measurable upper sets in $\mathbb{R}^m$, and if $b_X \le b_Y$, then $X \le_{\mathrm{st}} Y$.

The following example gives necessary and sufficient conditions for the comparison of multivariate normal random vectors. See Examples 6.G.11, 7.A.13, 7.A.26, 7.A.39, 7.B.5, and 9.A.20 for related results.

Example 6.B.29. Let $X$ be a multivariate normal random vector with mean vector $\mu_X$ and variance-covariance matrix $\Sigma_X$, and let $Y$ be a multivariate normal random vector with mean vector $\mu_Y$ and variance-covariance matrix $\Sigma_Y$. Then $X \le_{\mathrm{st}} Y$ if, and only if, $\mu_X \le \mu_Y$ and $\Sigma_X = \Sigma_Y$.

6.B.6 A property in reliability theory

In this subsection we show how the multivariate order $\le_{\mathrm{st}}$ can be used as a tool for the purpose of defining aging properties for components whose lifetimes are not necessarily independent. The notions and notations introduced in this subsection will also be used in the rest of this chapter.


Let $T = (T_1, T_2, \dots, T_m)$ be a nonnegative random vector with an absolutely continuous distribution function. In this subsection it is helpful to think about $T_1, T_2, \dots, T_m$ as the lifetimes of $m$ components $1, 2, \dots, m$ that make up some system. Suppose that an observer observes the system continuously in time and records the failure times and the identities of the components that fail as time passes. Thus, a typical "history" that the observer has observed by time $t \ge 0$ is of the form

$$h_t = \{T_I = t_I,\ T_{\bar{I}} > te\}, \quad 0e \le t_I \le te,\ I \subseteq \{1, 2, \dots, m\}. \tag{6.B.25}$$

In (6.B.25) $I$ is the set of components that have already failed by time $t$ (with failure times $t_I$) and $\bar{I}$ is the set of components that are still alive at time $t$. Let

$$h_s = \{T_J = s_J,\ T_{\bar{J}} > se\}, \quad 0e \le s_J \le se,\ J \subseteq \{1, 2, \dots, m\}, \tag{6.B.26}$$

be another history. If $t \le s$ and the histories $h_t$ and $h_s$ are such that each component that failed in $h_t$ also failed in $h_s$, and, for components that failed in both histories, the failures in $h_s$ are earlier than the failures in $h_t$, then we say that the history $h_t$ is less severe or "more pleasant" than the history $h_s$ and we denote it by $h_t \le h_s$. Note that if $h_t$ and $h_s$ are as in (6.B.25) and (6.B.26), then $h_t \le h_s$ if, and only if, $I \subseteq J$ and $s_I \le t_I$.

For every vector $a = (a_1, a_2, \dots, a_m)$ denote by $a_+$ the vector $a_+ = ((a_1)_+, (a_2)_+, \dots, (a_m)_+)$. Recalling Theorem 1.A.30 we can define a nonnegative random vector $T$ as multivariate IFR if for $t \le s$ we have

$$[(T - te)_+ \mid h_t] \ge_{\mathrm{st}} [(T - se)_+ \mid h_s] \quad \text{whenever } h_t \le h_s. \tag{6.B.27}$$

Another possibility is to call the nonnegative random vector $T$ multivariate IFR if for $t \le s$ we have

$$[(T - te)_+ \mid h_t] \ge_{\mathrm{st}} [(T - se)_+ \mid h_s] \quad \text{whenever } h_t \text{ and } h_s \text{ coincide on } [0, t). \tag{6.B.28}$$

These two different definitions of multivariate IFR have some desirable properties. For example, a vector consisting of independent IFR random variables is multivariate IFR according to either one of these two definitions. However, perhaps the most important feature of these kinds of definitions is their intuitive interpretation. In the univariate case these two definitions coincide with the usual univariate definition of IFR. Further notions of multivariate IFR are studied in Section 6.D.3.

6.B.7 Stochastic ordering of stochastic processes

In Section 1.A.1 we saw how to define the usual stochastic order between two univariate random variables. In Section 6.B.1 we saw how this comparison can


be defined for two multivariate random vectors. The next level of generalization, then, is the stochastic comparison of two stochastic processes. In fact, several levels of generalization can be studied. The stochastic processes can be univariate (if their common state space S is a subset of R). Or they can be multivariate (if their common state space S is a subset of R^m for some m). Or, more generally, the common state space S can be any general space, according to the requirements of the particular application in which the order is to be used. In this subsection we consider only the case in which the random processes are univariate. Section 6.H contains some references for the more general results.

Let {X(t), t ∈ T} and {Y(t), t ∈ T} be two stochastic processes with state space S ⊆ R and time parameter space T (usually T = [0, ∞) or T = N+). Suppose that, for all choices of an integer m and t1 < t2 < · · · < tm in T, it holds that

$$(X(t_1), X(t_2), \dots, X(t_m)) \le_{st} (Y(t_1), Y(t_2), \dots, Y(t_m)),$$

where here ≤st is in the sense of Section 6.B.1. Then {X(t), t ∈ T} is said to be smaller than {Y(t), t ∈ T} in the usual stochastic order (denoted by {X(t), t ∈ T} ≤st {Y(t), t ∈ T}). It can be shown that {X(t), t ∈ T} ≤st {Y(t), t ∈ T} if, and only if,

$$E\{g(\{X(t), t \in T\})\} \le E\{g(\{Y(t), t \in T\})\} \tag{6.B.29}$$

for every increasing functional g for which the expectations in (6.B.29) exist (a functional g is called increasing if g({x(t), t ∈ T}) ≤ g({y(t), t ∈ T}) whenever x(t) ≤ y(t), t ∈ T). An analog of (6.B.1) can also be stated and proved, but it is not included here. However, we do state the following important property of the order ≤st, which is a generalization of Theorem 6.B.1.

Theorem 6.B.30. The random processes {X(t), t ∈ T} and {Y(t), t ∈ T} satisfy {X(t), t ∈ T} ≤st {Y(t), t ∈ T} if, and only if, there exist two random processes {X̂(t), t ∈ T} and {Ŷ(t), t ∈ T}, defined on the same probability space, such that

$$\{\hat X(t), t \in T\} =_{st} \{X(t), t \in T\}, \qquad \{\hat Y(t), t \in T\} =_{st} \{Y(t), t \in T\},$$

and

$$P\{\hat X(t) \le \hat Y(t),\ t \in T\} = 1.$$

For discrete-time processes (T = N+), an analog of Theorem 6.B.3 is given in Theorem 6.B.31. Its proof is the same as the proof of Theorem 6.B.3, except that Theorem 6.B.30 is applied at the end of the proof rather than Theorem 6.B.1.


Theorem 6.B.31. Let {X(n), n ∈ N+} = {X(0), X(1), X(2), . . . } and {Y(n), n ∈ N+} = {Y(0), Y(1), Y(2), . . . } be two discrete-time stochastic processes. If X(0) ≤st Y(0), and if

$$[X(i) \mid X(1) = x_1, \dots, X(i-1) = x_{i-1}] \le_{st} [Y(i) \mid Y(1) = y_1, \dots, Y(i-1) = y_{i-1}]$$

whenever xj ≤ yj, j = 1, 2, . . . , i − 1, i = 1, 2, 3, . . . , then {X(n), n ∈ N+} ≤st {Y(n), n ∈ N+}.

Theorems 6.B.2 and 6.B.4 also have straightforward analogs that we do not state here. The order ≤st for stochastic processes is closed under operations similar to those described in Theorem 6.B.16. In particular,

$$\{X(t), t \in T\} \le_{st} \{Y(t), t \in T\} \implies g(\{X(t), t \in T\}) \le_{st} g(\{Y(t), t \in T\})$$

for all increasing functionals g. The order is also closed under mixtures.

To see an important application of these ideas, consider two discrete-time homogeneous Markov processes {X1(n), n ∈ N+} and {X2(n), n ∈ N+} with a common state space S ⊆ R. Denote YX1(x) =st [X1(n + 1) | X1(n) = x] and YX2(x) =st [X2(n + 1) | X2(n) = x], x ∈ S. The proof of the next result follows directly from Theorem 6.B.31.

Theorem 6.B.32. Let {X1(n), n ∈ N+} and {X2(n), n ∈ N+} be two Markov processes as described above. Suppose that X1(0) ≤st X2(0) and that

$$Y_{X_1}(x) \le_{st} Y_{X_2}(x') \quad \text{whenever } x \le x'.$$

Then {X1(n), n ∈ N+} ≤st {X2(n), n ∈ N+}.

A variation of Theorem 6.B.32 for Markov chains (that is, discrete-time homogeneous Markov processes with state space in N) is given next. Recall that a Markov chain is called skip-free positive if it does not have positive jumps of magnitude more than one. For a Markov chain {X(n), n ∈ N+} with state space S ⊆ N we denote YX(i) =st [X(n + 1) | X(n) = i], i ∈ S. The proof of the following result is obtained by a straightforward construction of the two underlying Markov chains on the same probability space, and then using Theorem 6.B.30.

Theorem 6.B.33. Let {X1(n), n ∈ N+} and {X2(n), n ∈ N+} be two Markov chains. Suppose that X1(0) ≤st X2(0), that

$$Y_{X_1}(i) \le_{st} Y_{X_2}(i) \quad \text{for all } i,$$

and

$$Y_{X_2}(i) \ge i \quad \text{for all } i, \tag{6.B.30}$$

and that {X1(n), n ∈ N+} is skip-free positive. Then {X1(n), n ∈ N+} ≤st {X2(n), n ∈ N+}.
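The almost sure construction behind Theorem 6.B.32 can be illustrated numerically. The following is a minimal sketch, not taken from the text: the lazy random walks, their parameters, and the names `step` and `coupled_paths` are hypothetical. Both chains are driven by the same uniform variables through inverse-transform steps, and the parameters are chosen so that YX1(x) ≤st YX2(x′) whenever x ≤ x′.

```python
import random

LO, HI = 0, 10  # hypothetical finite state space {0, ..., 10}

def step(x, u, p_up, p_down):
    """Inverse-transform step of a lazy random walk, clamped to [LO, HI]."""
    if u < p_down:
        y = x - 1
    elif u < 1.0 - p_up:
        y = x
    else:
        y = x + 1
    return min(max(y, LO), HI)

def coupled_paths(x1_0, x2_0, n_steps, rng):
    """Drive both chains with the same uniforms (common random numbers)."""
    path1, path2 = [x1_0], [x2_0]
    for _ in range(n_steps):
        u = rng.random()
        # Chain 1 moves down more often and up less often than chain 2.
        path1.append(step(path1[-1], u, p_up=0.3, p_down=0.4))
        path2.append(step(path2[-1], u, p_up=0.5, p_down=0.2))
    return path1, path2

rng = random.Random(0)
path1, path2 = coupled_paths(2, 3, 200, rng)
# Pathwise ordering of the two coupled chains at every time step.
ordered = all(a <= b for a, b in zip(path1, path2))
print(ordered)
```

With these assumed parameters the two coupled paths stay ordered at every step, which is the pathwise statement that Theorem 6.B.30 turns into {X1(n)} ≤st {X2(n)}.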


The discrete-time homogeneous Markov process {X(n), n ∈ N+} is said to be stochastically monotone if YX(x) =st [X(n + 1) | X(n) = x] is stochastically increasing in x ∈ S. Note that stochastic monotonicity is a different condition from the almost sure monotonicity condition (6.B.30); neither of these implies the other. Denote by {X^(x)(n), n ∈ N+} the process {X(n), n ∈ N+} under the condition that X(0) = x. The following result is a direct consequence of Theorem 6.B.32.

Theorem 6.B.34. Let {X(n), n ∈ N+} be a discrete-time homogeneous Markov process that is stochastically monotone. Then

$$\{X^{(x)}(n), n \in \mathbb{N}_+\} \le_{st} \{X^{(x')}(n), n \in \mathbb{N}_+\} \tag{6.B.31}$$

whenever x ≤ x′.

For example, a discrete-time birth and death chain (with state space N) with birth probabilities P{X(n + 1) = i + 1 | X(n) = i} = pi and death probabilities P{X(n + 1) = i − 1 | X(n) = i} = 1 − pi, i ∈ N, is stochastically monotone if pi increases in i ∈ N. Hence it satisfies (6.B.31).

If two processes {X(t), t ∈ T} and {Y(t), t ∈ T} satisfy {X(t), t ∈ T} ≤st {Y(t), t ∈ T}, then, by Theorem 6.B.30, the first passage times TX(a) ≡ inf{t : X(t) > a} and TY(a) ≡ inf{t : Y(t) > a} (where inf ∅ = ∞) satisfy TX(a) ≥st TY(a) for all a. The reverse implication need not be true. By removing (6.B.30) from Theorem 6.B.33 we obtain the following result. Its proof consists of a proper construction of the two underlying Markov chains on the same probability space, and then using Theorem 6.B.30.

Theorem 6.B.35. Let {X1(n), n ∈ N+} and {X2(n), n ∈ N+} be two Markov chains. Suppose that X1(0) ≤st X2(0), that

$$Y_{X_1}(i) \le_{st} Y_{X_2}(i) \quad \text{for all } i,$$

and that {X1(n), n ∈ N+} is skip-free positive. Then TX1(a) ≥st TX2(a) for all a.

Suppose now that the two processes that we want to compare are point processes that, for distinction, we denote by {K(t), t ≥ 0} and {N(t), t ≥ 0}. That is, for each t ≥ 0, K(t) and N(t) are the numbers of jumps that the corresponding processes have experienced over the time interval (0, t]. In addition to the possible relationship {K(t), t ≥ 0} ≤st {N(t), t ≥ 0} between these processes, we will also consider two other, stronger possible relationships.

For any positive integer m, let B1, B2, . . . , Bm be bounded Borel sets of [0, ∞). Let K(Bi) and N(Bi) denote the number of jumps of the corresponding processes over the set Bi, i = 1, 2, . . . , m. Suppose that, for all choices of an integer m and bounded Borel sets B1, B2, . . . , Bm, it holds that (K(B1), K(B2), . . . , K(Bm)) ≤st (N(B1), N(B2), . . . , N(Bm)).


Then {K(t), t ≥ 0} is said to be smaller than {N(t), t ≥ 0} in the usual stochastic order over N (denoted by {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0}). (Here N denotes the space of integer-valued Radon measures.) The usual stochastic order over N gives a “global” comparison of the point processes {K(t), t ≥ 0} and {N(t), t ≥ 0}.

Let X1, X2, . . . be the sequence of interpoint distances of the process {K(t), t ≥ 0}, and let Y1, Y2, . . . be the sequence of interpoint distances of the process {N(t), t ≥ 0}. We assume that the Xi's and the Yi's are almost surely positive. We also assume that the processes are nonexplosive in the sense that limn→∞ Σ_{i=1}^{n} Xi = ∞ and limn→∞ Σ_{i=1}^{n} Yi = ∞ almost surely. Suppose that, for all choices of an integer m and indices i1, i2, . . . , im, it holds that (Xi1, Xi2, . . . , Xim) ≥st (Yi1, Yi2, . . . , Yim). Then {K(t), t ≥ 0} is said to be smaller than {N(t), t ≥ 0} in the usual stochastic order over R^∞ (denoted by {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0}). The usual stochastic order over R^∞ gives a “local” comparison of the point processes {K(t), t ≥ 0} and {N(t), t ≥ 0}.

Analogs of (6.B.29) can be stated and proven for the orders ≤st-N and ≤st-∞. Also, “almost sure” constructions, that are analogs of Theorem 6.B.30, can be shown for these orders. We do not give the technical details here. We note, however, that in such constructions the counterparts K̂ = {K̂(t), t ≥ 0} and N̂ = {N̂(t), t ≥ 0} of {K(t), t ≥ 0} and {N(t), t ≥ 0}, respectively, satisfy the following properties: The relationship {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0} means that K̂ is a thinning of N̂. The relationship {K(t), t ≥ 0} ≤st {N(t), t ≥ 0} means that N̂ has a.s. earlier and more numerous points than K̂ before each time instant t. The relationship {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0} means that the corresponding interpoint distances are shorter for N̂ than for K̂ a.s.

From this it is immediate that

$$\{K(t), t \ge 0\} \le_{\text{st-}\mathcal{N}} \{N(t), t \ge 0\} \implies \{K(t), t \ge 0\} \le_{st} \{N(t), t \ge 0\},$$

and that

$$\{K(t), t \ge 0\} \le_{\text{st-}\infty} \{N(t), t \ge 0\} \implies \{K(t), t \ge 0\} \le_{st} \{N(t), t \ge 0\}. \tag{6.B.32}$$

It can be shown that, in general, {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0} ⇏ {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0} and also that {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0} ⇏ {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0}. For renewal processes we have the following results.

Theorem 6.B.36. Consider two nondelayed renewal processes {K(t), t ≥ 0} and {N(t), t ≥ 0} with generic interpoint distances X and Y, respectively. The following three statements are equivalent.
(i) Y ≤st X,
(ii) {K(t), t ≥ 0} ≤st {N(t), t ≥ 0},
(iii) {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0}.

Proof. Note that from the independence of the interpoint distances it follows that (i)⇐⇒(iii). From (6.B.32) it follows that (iii)=⇒(ii). The implication (ii)=⇒(i) is obvious.
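The implication (i)=⇒(ii) of Theorem 6.B.36 can be seen on simulated paths. The sketch below is illustrative only and assumes exponential interpoint distances with hypothetical rates `rate_x` and `rate_y`; the two renewal processes are built from the same uniforms by inverse transform, so that every Yi is at most the corresponding Xi.

```python
import math
import random

def renewal_epochs(gaps):
    """Cumulative sums of the interpoint distances give the renewal epochs."""
    epochs, total = [], 0.0
    for g in gaps:
        total += g
        epochs.append(total)
    return epochs

def count(epochs, t):
    """Number of renewals in (0, t]."""
    return sum(1 for e in epochs if e <= t)

rng = random.Random(1)
uniforms = [rng.random() for _ in range(500)]
rate_x, rate_y = 1.0, 1.5  # assumed rates; rate_y >= rate_x gives Y <=st X
x_gaps = [-math.log(1.0 - u) / rate_x for u in uniforms]
y_gaps = [-math.log(1.0 - u) / rate_y for u in uniforms]

k_epochs = renewal_epochs(x_gaps)  # process K, interpoint distances X_i
n_epochs = renewal_epochs(y_gaps)  # process N, interpoint distances Y_i
grid = [0.5 * j for j in range(1, 101)]
# K(t) <= N(t) at every grid time, because each epoch of N is no later than that of K.
ordered = all(count(k_epochs, t) <= count(n_epochs, t) for t in grid)
print(ordered)
```

Because the coupled interpoint distances are ordered term by term, this construction also illustrates the “local” order ≤st-∞ of statement (iii).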

Theorem 6.B.37. Consider two nondelayed renewal processes {K(t), t ≥ 0} and {N(t), t ≥ 0} with generic interpoint distances X and Y, respectively. Let rX and rY denote the hazard rate functions corresponding to X and Y, respectively. If

$$r_X(t) \le r_Y(s) \quad \text{for all } 0 \le s \le t, \tag{6.B.33}$$

then {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0}.

Theorem 6.B.37 can be easily proven using the fact, mentioned above, that K̂ is a thinning of N̂. We do not give a detailed proof of it here. Note that (6.B.33) holds if Y ≤hr X and if X is DFR or if Y is DFR. The proofs of the next two theorems are similar to the proofs of Theorems 6.B.36 and 6.B.37, respectively.

Theorem 6.B.38. Consider two delayed renewal processes {K^d(t), t ≥ 0} and {N^d(t), t ≥ 0}, with the corresponding delays X^d and Y^d and with the same interrenewal distribution after the delay. The following statements are equivalent.
(i) Y^d ≤st X^d,
(ii) {K^d(t), t ≥ 0} ≤st {N^d(t), t ≥ 0},
(iii) {K^d(t), t ≥ 0} ≤st-∞ {N^d(t), t ≥ 0}.

Theorem 6.B.39. Consider two delayed renewal processes {K^d(t), t ≥ 0} and {N^d(t), t ≥ 0}, with the corresponding delays X^d and Y^d and with the same interrenewal distribution after the delay. Let rX^d denote the hazard rate function corresponding to X^d. If Y^d ≤hr X^d and if

$$r_{X^d}(t) \le r(s) \quad \text{for all } 0 \le s \le t, \tag{6.B.34}$$

where r is the hazard rate function associated with the common interrenewal distribution function, then {K^d(t), t ≥ 0} ≤st-N {N^d(t), t ≥ 0}.

Note that (6.B.34) holds, for example, if X ≤hr X^d, and if X is DFR or if X^d is DFR (where X denotes a generic interrenewal time with hazard rate function r). Finally we give conditions for two nonhomogeneous Poisson processes to be ordered according to the above orders.

Theorem 6.B.40. Let {K(t), t ≥ 0} and {N(t), t ≥ 0} be two nonhomogeneous Poisson processes with mean functions MK and MN, respectively, and with intensity functions λK and λN, respectively.
(i) If MK(t) ≤ MN(t), t ≥ 0, then {K(t), t ≥ 0} ≤st {N(t), t ≥ 0}.
(ii) If λK(t) ≤ λN(t), t ≥ 0, then {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0}.
(iii) If MK^{−1}(MN(t)) − t is increasing in t ≥ 0, then {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0}.

In the following example, parts (i) and (iii) of Theorem 6.B.40 are restated in the terminology of Examples 1.B.24, 1.C.48, 2.A.22, 3.B.38, 4.B.14, 6.D.8, 6.E.13, and 7.B.13.

Example 6.B.41. Let X and Y be two absolutely continuous nonnegative random variables with survival functions F and G, respectively. Denote Λ1 = − log F and Λ2 = − log G. Consider two nonhomogeneous Poisson processes N1 = {N1(t), t ≥ 0} and N2 = {N2(t), t ≥ 0} with mean functions Λ1 and Λ2 (see Example 1.B.13), respectively. Let Ti,1, Ti,2, . . . be the successive epoch times of process Ni, i = 1, 2. Note that X =st T1,1 and Y =st T2,1. It turns out that the usual stochastic ordering of the first epoch times of the two processes implies the multivariate usual stochastic ordering of all the corresponding later epoch times. Explicitly, part (i) of Theorem 6.B.40 says that if X ≤st Y, then (T1,1, T1,2, . . . , T1,n) ≤st (T2,1, T2,2, . . . , T2,n), n ≥ 1. Now let Xi,n ≡ Ti,n − Ti,n−1, n ≥ 1 (where Ti,0 ≡ 0), be the inter-epoch times of the process Ni, i = 1, 2. Part (iii) of Theorem 6.B.40 says that if X ≤disp Y, then (X1,1, X1,2, . . . , X1,n) ≤st (X2,1, X2,2, . . . , X2,n), n ≥ 1.
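Part (ii) of Theorem 6.B.40 corresponds to a thinning construction, which can be sketched as follows. The intensities `lam_n` and `lam_k`, the bound `lam_max`, and the function names are assumptions made for illustration; the points of N are generated by the standard thinning of a homogeneous Poisson process, and each point of N is then kept for K with probability λK(t)/λN(t).

```python
import math
import random

def lam_n(t):
    """Hypothetical intensity of N, bounded above by 3."""
    return 2.0 + math.sin(t)

def lam_k(t):
    """Hypothetical intensity of K; lam_k(t) <= lam_n(t) for all t."""
    return 1.0 + 0.5 * math.sin(t)

def simulate_pair(horizon, lam_max, rng):
    """Return (points of K, points of N) on (0, horizon]; K is a thinning of N."""
    k_points, n_points = [], []
    t = 0.0
    while True:
        t += rng.expovariate(lam_max)  # next point of the dominating process
        if t > horizon:
            break
        if rng.random() < lam_n(t) / lam_max:      # keep the point for N
            n_points.append(t)
            if rng.random() < lam_k(t) / lam_n(t):  # keep it for K as well
                k_points.append(t)
    return k_points, n_points

rng = random.Random(3)
k_points, n_points = simulate_pair(20.0, 3.0, rng)
is_thinning = set(k_points) <= set(n_points)
print(is_thinning, len(k_points), len(n_points))
```

Since every point of the simulated K is also a point of N, the counts of K over any Borel set are dominated by those of N on the same path, which is the “global” comparison described by ≤st-N.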

6.C The Cumulative Hazard Order

6.C.1 Definition

Let T = (T1, T2, . . . , Tm) be a nonnegative random vector with an absolutely continuous distribution function. In this section, as in Section 6.B.6, it is helpful to think about T1, T2, . . . , Tm as the lifetimes of m components 1, 2, . . . , m that make up some system. Consider a typical “history” of T at time t ≥ 0, which is of the form (see (6.B.25))

$$h_t = \{T_I = t_I,\ T_{\bar I} > te\}, \quad 0e \le t_I \le te,\ I \subseteq \{1, 2, \dots, m\}. \tag{6.C.1}$$

Given the history ht in (6.C.1), let i ∈ Ī be a component that is still alive at time t. Its multivariate conditional hazard rate, at time t, is defined as follows:

$$\lambda_{i|I}(t \mid t_I) = \lim_{\Delta t \downarrow 0} \frac{1}{\Delta t} P\{t < T_i \le t + \Delta t \mid T_I = t_I,\ T_{\bar I} > te\}, \tag{6.C.2}$$

where, of course, 0e ≤ tI ≤ te, and I ⊆ {1, 2, . . . , m}. As long as the item is alive it accumulates hazard at the rate of λi|I(t | tI) at time t. If I = {i1, i2, . . . , ik} and ti1 ≤ ti2 ≤ · · · ≤ tik, then the cumulative hazard of component i ∈ Ī at time t is

$$\Psi_{i|i_1,i_2,\dots,i_k}(t \mid t_{i_1}, t_{i_2}, \dots, t_{i_k}) = \int_0^{t_{i_1}} \lambda_{i|\emptyset}(u \mid t_\emptyset)\,du + \sum_{j=2}^{k} \int_{t_{i_{j-1}}}^{t_{i_j}} \lambda_{i|i_1,i_2,\dots,i_{j-1}}(u \mid t_{i_1}, t_{i_2}, \dots, t_{i_{j-1}})\,du + \int_{t_{i_k}}^{t} \lambda_{i|i_1,i_2,\dots,i_k}(u \mid t_{i_1}, t_{i_2}, \dots, t_{i_k})\,du. \tag{6.C.3}$$
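As a small worked instance of (6.C.3), the sketch below evaluates the cumulative hazard under a simplifying assumption that is not made in the text: the conditional hazard rate of the surviving component depends on the history only through the number of components that have already failed, so it is constant between failures. The function name `cumulative_hazard` and the load-sharing rate are hypothetical.

```python
def cumulative_hazard(t, failure_times, rate):
    """Evaluate (6.C.3) for a component that is alive at time t.

    failure_times: sorted failure times t_{i1} <= ... <= t_{ik} (all <= t)
                   of the components that have already failed.
    rate(k):       assumed hazard rate of the component while exactly k
                   other components have failed (constant between failures).
    """
    total, start = 0.0, 0.0
    for k, end in enumerate(failure_times):
        total += rate(k) * (end - start)  # integral over one inter-failure interval
        start = end
    total += rate(len(failure_times)) * (t - start)  # last piece, from t_{ik} to t
    return total

base, c = 0.5, 0.4  # hypothetical load-sharing parameters
psi = cumulative_hazard(4.0, [1.0, 2.5], lambda k: base * (1.0 + c * k))
print(round(psi, 6))  # 0.5*1.0 + 0.7*1.5 + 0.9*1.5 = 2.9
```

Each term of the sum in the code matches one integral in (6.C.3): the rate in force on an interval multiplied by the length of that interval.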

Let S = (S1, S2, . . . , Sm) be another nonnegative random vector with an absolutely continuous distribution function and with cumulative hazard functions Φ·|·(·|·), which are defined analogously to the Ψ's in (6.C.3). Select two integers j and l such that j ≤ l ≤ m. Let t1, t2, . . . , tj and s1, . . . , sj, . . . , sl be such that 0 ≤ t1 ≤ t2 ≤ · · · ≤ tj, and 0 ≤ si ≤ ti, i = 1, 2, . . . , j, and si ≥ 0, i = j + 1, . . . , l. Let sk1 ≤ sk2 ≤ · · · ≤ skl be the ordered si's. If for any integer α > l we have

$$\Phi_{\alpha|k_1,k_2,\dots,k_l}(u \mid s_{k_1}, s_{k_2}, \dots, s_{k_l}) \ge \Psi_{\alpha|1,2,\dots,j}(u \mid t_1, t_2, \dots, t_j) \tag{6.C.4}$$

whenever u ≥ max{tj, sj+1, sj+2, . . . , sl}, and if the same holds with 1, 2, . . . , l replaced by π1, π2, . . . , πl for every permutation π of (1, 2, . . . , m), then S is said to be smaller than T in the cumulative hazard order (denoted as S ≤ch T).

The order ≤ch is not an order in the usual sense; a comment, similar to the comment in Remark 6.B.5, applies to this order too. Explicitly, X ≤ch X means that X has the positive dependence property of “supporting lifetimes” discussed in Norros [437] and in Shaked and Shanthikumar [511].

Condition (6.C.4) simply states that at any time u the cumulative hazard of Sα is larger than the cumulative hazard of Tα whenever the history of the components corresponding to S is more “severe” than the history of the components corresponding to T. Thus (6.C.4) can be written as (see Section 6.B.6 for the definition of histories and for the definition of their comparison)

$$\Phi_\alpha(h'_u) \ge \Psi_\alpha(h_u) \quad \text{whenever } h'_u \ge h_u,$$

where α denotes a component that has not failed by time u in the history h′u.

In the univariate case (that is, m = 1) condition (6.C.4) simply says that − log P{S1 > u} ≥ − log P{T1 > u}. Therefore, in the univariate case S1 ≤ch T1 ⇐⇒ S1 ≤st T1. Thus, if the components of S are independent, and if the components of T are independent, then S ≤ch T ⇐⇒ S ≤st T. In the general multivariate case the two orders are not equivalent, but it will be shown below that if S ≤ch T, then S ≤st T.


6.C.2 The relationship between the cumulative hazard order and the usual multivariate stochastic order

The total hazard accumulated by the failure time Ti, given that Ti was the time of the kth failure and that the previous failure times were Tj1, Tj2, . . . , Tjk−1, is Ψi|j1,j2,...,jk−1(Ti | Tj1, Tj2, . . . , Tjk−1). It can be shown that the total hazards accumulated by the failure times Ti are independent standard (that is, mean one) exponential random variables. This fact motivates the following total hazard construction, which is of independent interest, but which we use here in order to show that if S ≤ch T, then S ≤st T.

The idea of the construction is as follows. The components accumulate hazard as long as they are alive with the rates given in (6.C.2). Each one of them dies when its accumulated hazard crosses a random threshold. The random thresholds are independent standard exponential random variables. Thus, by continuously comparing the accumulated hazards to the independent exponential random thresholds it is possible to determine the times at which the accumulated hazards cross the respective thresholds, and these times have the desired distribution. From this heuristic description it is seen that the multivariate conditional cumulative hazard functions, given in (6.C.3), determine the distribution of the generated random variables. This, indeed, is well known.

Let T = (T1, T2, . . . , Tm) be a nonnegative random vector with an absolutely continuous distribution function. Given the functions Ψ·|·(·|·) that are associated with T, as described in (6.C.3), we will describe now how to generate a random vector T̂ = (T̂1, T̂2, . . . , T̂m) such that T̂ =st T. Let X1, X2, . . . , Xm be independent standard exponential random variables. The total hazard construction will be described in m steps.

Step 1. In this step we determine the identity i1 of the component that fails first and its time of failure T̂i1. This is determined by

$$\hat T_{i_1} = \min\{\tilde T_1, \tilde T_2, \dots, \tilde T_m\},$$

where

$$\tilde T_j = \min\{t \ge 0 : \Psi_{j|\emptyset}(t \mid t_\emptyset) \ge X_j\}, \quad j = 1, 2, \dots, m,$$

and i1 is the index of the smallest T̃j.

Step k (k = 2, 3, . . . , m). Suppose that Steps 1, 2, . . . , k − 1 have already yielded T̂i1, T̂i2, . . . , T̂ik−1. Let I = {i1, i2, . . . , ik−1} and denote Ī = {j1, j2, . . . , jm−k+1}. In this step we determine the identity ik of the component that is the kth one to fail and its failure time T̂ik. This is determined by

$$\hat T_{i_k} = \min\{\tilde T_{j_1}, \tilde T_{j_2}, \dots, \tilde T_{j_{m-k+1}}\},$$


where here, for j ∈ Ī,

$$\tilde T_j = \min\{t \ge \hat T_{i_{k-1}} : \Psi_{j|i_1,i_2,\dots,i_{k-1}}(t \mid \hat T_{i_1}, \hat T_{i_2}, \dots, \hat T_{i_{k-1}}) \ge X_j\},$$

and ik is the index of the smallest T̃j, j ∈ Ī. It can be shown that indeed T̂ =st T.

Let S = (S1, S2, . . . , Sm) be another nonnegative random vector with an absolutely continuous distribution function and multivariate conditional cumulative hazard functions Φ·|·(·|·). Using the same independent standard exponential random variables X1, X2, . . . , Xm, construct Ŝ = (Ŝ1, Ŝ2, . . . , Ŝm) using the total hazard construction described above. Thus Ŝ and T̂ are constructed on the same probability space and they satisfy T̂ =st T and Ŝ =st S. Also, if (6.C.4) holds, that is, if S ≤ch T, then it is clear that P{Ŝ ≤ T̂} = 1. Thus, from Theorem 6.B.1, we see that we have proved the following theorem.

Theorem 6.C.1. Let S and T be two nonnegative random vectors with absolutely continuous distribution functions. If S ≤ch T, then S ≤st T.

It is worth mentioning that the total hazard construction is theoretically and practically different from the standard construction discussed in Section 6.B.3. In the standard construction the uniform random variables U1, U2, . . . , Un, which are used to generate the desired T̂1, T̂2, . . . , T̂n, can be used sequentially, that is, Ui can be used to generate T̂i, once T̂1, T̂2, . . . , T̂i−1 have already been generated, i = 1, 2, . . . , n. On the other hand, in the total hazard construction, the exponential random variables X1, X2, . . . , Xm are all used simultaneously in the generation of each T̂i.

Remark 6.C.2. Looking at Step 1 of the total hazard construction it is seen that it can be split into two substeps. First the value of the first order statistic, T̂(1) say, of the T̂j's is determined, and then the identity (index) of T̂(1) is selected. Similarly Step k can be split into two substeps. Suppose now that T = (T1, T2, . . . , Tm) is a vector of exponential random variables with possibly different parameters. Then also T̂ = (T̂1, T̂2, . . . , T̂m) is such a vector. Furthermore, T̂(1) is also an exponential random variable. If it is known that T̂(1) = s1 say, and if the identity of the smallest T̂j is also known, then, conditionally, the residual lives of the remaining m − 1 components are independent exponential random variables, and they do not depend on s1. If the identity of the smallest T̂j is not known, then the conditional distribution of the residual lives of the remaining m − 1 components is a mixture of distributions of independent exponential random variables, and it still does not depend on s1 (notice that the probabilities of the mixture do not depend on s1). Therefore the conditional distribution of (T̂(2) − s1, T̂(3) − s1, . . . , T̂(m) − s1), given T̂(1) = s1, does not depend on s1. It follows that [(T̂(1), T̂(2), . . . , T̂(m)) | T̂(1) = s1] is stochastically increasing in s1. Since T =st T̂ we obtain a proof of Theorem 6.B.13.
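The total hazard construction described in Steps 1 to m can be sketched in code for a simple special case. This is a minimal illustration under assumptions that are not part of the text: the hazard rates are constant between failures and follow a hypothetical load-sharing form, and the names `total_hazard_construction` and `load_sharing` are invented for the example. The same exponential thresholds are used for both vectors, as in the argument leading to Theorem 6.C.1.

```python
import random

def total_hazard_construction(thresholds, rate):
    """Generate failure times from exponential thresholds.

    rate(j, failed) is an assumed hazard rate of alive component j given the
    list of already failed components; it is constant between failures, so
    each accumulated hazard is piecewise linear in time.
    """
    m = len(thresholds)
    alive = set(range(m))
    hazard = [0.0] * m  # hazard accumulated so far by each component
    failed = []         # identities i_1, i_2, ... in order of failure
    times = [0.0] * m
    now = 0.0
    while alive:
        rates = {j: rate(j, failed) for j in alive}
        # Time each alive component still needs to reach its threshold.
        wait, nxt = min(
            (max(thresholds[j] - hazard[j], 0.0) / rates[j], j) for j in alive
        )
        for j in alive:
            hazard[j] += rates[j] * wait
        now += wait
        times[nxt] = now
        alive.remove(nxt)
        failed.append(nxt)
    return times

def load_sharing(base, c):
    """Hazard rate base[j] * (1 + c * number of failed components)."""
    return lambda j, failed: base[j] * (1.0 + c * len(failed))

rng = random.Random(2)
x = [rng.expovariate(1.0) for _ in range(3)]  # shared standard exponential thresholds
t_hat = total_hazard_construction(x, load_sharing([1.0, 0.5, 2.0], 0.4))
s_hat = total_hazard_construction(x, load_sharing([1.5, 0.5, 3.0], 0.4))
print(all(s <= t + 1e-9 for s, t in zip(s_hat, t_hat)))
```

In this assumed model the hazard rates used for `s_hat` dominate those used for `t_hat` and increase with the number of failures, so the two constructed vectors come out ordered componentwise on every draw of the thresholds.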


6.D Multivariate Hazard Rate Orders

6.D.1 Definitions and basic properties

The following notation will be used below. For any two real numbers x and y we denote x ∨ y = max{x, y} and x ∧ y = min{x, y}. If x = (x1, x2, . . . , xn) and y = (y1, y2, . . . , yn) are two vectors in R^n, then we denote x ∨ y = (x1 ∨ y1, x2 ∨ y2, . . . , xn ∨ yn) and x ∧ y = (x1 ∧ y1, x2 ∧ y2, . . . , xn ∧ yn).

Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two random vectors with respective survival functions F and G defined by F(x) = P{X > x} and G(x) = P{Y > x}, x ∈ R^n. We say that X is smaller than Y in the multivariate hazard rate order (denoted by X ≤hr Y) if

$$F(x)G(y) \le F(x \wedge y)G(x \vee y) \quad \text{for every } x \text{ and } y \text{ in } \mathbb{R}^n. \tag{6.D.1}$$

We say that X is smaller than Y in the weak multivariate hazard rate order (denoted by X ≤whr Y) if

$$\frac{G(x)}{F(x)} \text{ is increasing in } x \in \{x : G(x) > 0\}, \tag{6.D.2}$$

where in (6.D.2) we use the convention a/0 ≡ ∞ whenever a > 0. Note that (6.D.2) can be written equivalently as

$$F(y)G(x) \le F(x)G(y) \quad \text{whenever } x \le y. \tag{6.D.3}$$

Thus, from (6.D.1) and (6.D.3) it follows that

$$X \le_{hr} Y \implies X \le_{whr} Y. \tag{6.D.4}$$

Note that from (6.D.3) it follows that if y ∈ {x : G(x) = 0}, then y ∈ {x : F(x) = 0}. That is, if X ≤whr Y, then {x : F(x) > 0} ⊆ {x : G(x) > 0}.

It can be shown that the implication (6.D.4) is strict. However, when at least one of the survival functions of X and of Y is MTP2 (recall from Karlin and Rinott [278] that a function K : R^n → R+ is said to be multivariate totally positive of order 2 (MTP2) if K(x)K(y) ≤ K(x ∧ y)K(x ∨ y) for all x, y ∈ R^n), then, under some regularity conditions, the orders ≤hr and ≤whr are equivalent. This is shown next. Recall that a set S ⊆ R^n is called a lattice if for all x, y in S we have that x ∧ y and x ∨ y are in S.

Theorem 6.D.1. Let X and Y be two random vectors with respective survival functions F and G, and with a common support S which is a lattice. If F and/or G are/is MTP2, then

$$X \le_{whr} Y \implies X \le_{hr} Y. \tag{6.D.5}$$

Proof. Note that the left hand side of the implication (6.D.5) implies

$$F(x \vee y)G(y) \le F(y)G(x \vee y), \quad x, y \in \mathbb{R}^n,$$

and that the MTP2-ness of F implies

$$F(x)F(y) \le F(x \wedge y)F(x \vee y), \quad x, y \in \mathbb{R}^n.$$

Multiplication of these two inequalities yields

$$F(x \vee y)G(y)F(x)F(y) \le F(y)G(x \vee y)F(x \wedge y)F(x \vee y).$$

Now, from the assumption that S is a lattice it follows that if F(x)G(y) > 0, then F(y) and F(x ∨ y) are positive. Canceling these we obtain that (6.D.1) holds in this case. If F(x)G(y) = 0, then (6.D.1) obviously holds too. Therefore X ≤hr Y. In a similar manner the implication (6.D.5) can be shown when G is MTP2.
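Conditions (6.D.1) and (6.D.3) are easy to check numerically on a grid for simple models. The sketch below is only an illustration under assumed data: X and Y are taken to be vectors of independent exponential random variables with hypothetical rates, for which both survival functions are MTP2, and a small relative tolerance absorbs floating-point rounding.

```python
import itertools
import math

def surv(rates):
    """Survival function P{X > x} of independent exponentials, x in R^n."""
    return lambda x: math.exp(-sum(r * max(xi, 0.0) for r, xi in zip(rates, x)))

F = surv([2.0, 3.0])  # hypothetical X, with the larger hazard rates
G = surv([1.0, 1.5])  # hypothetical Y

def vmin(x, y):
    """Componentwise minimum x ∧ y."""
    return tuple(min(a, b) for a, b in zip(x, y))

def vmax(x, y):
    """Componentwise maximum x ∨ y."""
    return tuple(max(a, b) for a, b in zip(x, y))

grid = [0.25 * k for k in range(9)]
points = list(itertools.product(grid, repeat=2))
tol = 1e-12

# (6.D.1): F(x)G(y) <= F(x ∧ y)G(x ∨ y) for all x, y on the grid.
hr_ok = all(
    F(x) * G(y) <= F(vmin(x, y)) * G(vmax(x, y)) * (1.0 + tol)
    for x in points for y in points
)
# (6.D.3): F(y)G(x) <= F(x)G(y) whenever x <= y componentwise.
whr_ok = all(
    F(y) * G(x) <= F(x) * G(y) * (1.0 + tol)
    for x in points for y in points
    if all(a <= b for a, b in zip(x, y))
)
print(hr_ok, whr_ok)
```

A grid check of this kind can only refute an ordering, not prove it; for the assumed exponential model both inequalities can also be verified directly from the exponents.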

The order ≤hr is not an order in the usual sense (that is, it is not reflexive) because from (6.D.1) it follows that X ≤hr X ⇐⇒ P{X > x} is MTP2.

Consider now a random vector X = (X1, X2, . . . , Xn) with a partially differentiable survival function F. Let rX = (rX^(1), rX^(2), . . . , rX^(n)) be its hazard gradient as defined in (1.B.28). Let Y be another n-dimensional random vector with hazard gradient rY = (rY^(1), rY^(2), . . . , rY^(n)). The following result, which can be obtained by differentiation of (6.D.2), justifies the terminology “hazard rate order” for the orders that were introduced in (6.D.1) and (6.D.2).

Theorem 6.D.2. Let X and Y be n-dimensional random vectors with hazard gradients rX and rY, respectively. Then X ≤whr Y if, and only if,

$$r_X^{(i)}(x) \ge r_Y^{(i)}(x), \quad i = 1, 2, \dots, n,\ x \in \mathbb{R}^n.$$

A useful inequality is described next; we omit its proof.

Theorem 6.D.3. Let X = (X1, X2, . . . , Xn) be a random vector, and let X^I = (Y1, Y2, . . . , Yn) be a vector of independent random variables such that Xi =st Yi, i = 1, 2, . . . , n. If the survival function of X is MTP2, then X^I ≤hr X.

The relation X ≤hr Y does not necessarily imply X ≤st Y, where ≤st denotes the usual multivariate stochastic order discussed in Section 6.B. However, a generalization of the univariate Theorem 1.B.1 is given in (6.G.10) in Section 6.G.1. Theorem 6.G.9 is a multivariate generalization of (1.B.7).


6.D.2 Preservation properties

The orders ≤hr and ≤whr are closed under some common operations.

Theorem 6.D.4. (a) Let (X1, X2, . . . , Xn) and (Y1, Y2, . . . , Yn) be two n-dimensional random vectors. If (X1, X2, . . . , Xn) ≤hr [≤whr] (Y1, Y2, . . . , Yn), then (g1(X1), g2(X2), . . . , gn(Xn)) ≤hr [≤whr] (g1(Y1), g2(Y2), . . . , gn(Yn)) whenever gi : R → R is an increasing function, i = 1, 2, . . . , n.
(b) Let X1, X2, . . . , Xm be a set of independent random vectors where the dimension of Xi is ki, i = 1, 2, . . . , m. Let Y1, Y2, . . . , Ym be another set of independent random vectors where the dimension of Yi is ki, i = 1, 2, . . . , m. If Xi ≤hr [≤whr] Yi for i = 1, 2, . . . , m, then (X1, X2, . . . , Xm) ≤hr [≤whr] (Y1, Y2, . . . , Ym). That is, the multivariate hazard rate orders are closed under conjunctions.
(c) Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two n-dimensional random vectors. If X ≤hr [≤whr] Y, then X_I ≤hr [≤whr] Y_I for each I ⊆ {1, 2, . . . , n}. That is, the multivariate hazard rate orders are closed under marginalization.
(d) Let {Xj, j = 1, 2, . . . } and {Yj, j = 1, 2, . . . } be two sequences of random vectors such that Xj →st X and Yj →st Y as j → ∞, where →st denotes convergence in distribution. If Xj ≤hr [≤whr] Yj, j = 1, 2, . . ., then X ≤hr [≤whr] Y.

We will now describe some preservation properties of the multivariate hazard rate orders under random compositions. Let Fθ, θ ∈ X, be a family of n-dimensional survival functions, where X is a subset of the real line. Let X(θ) denote a random vector with survival function Fθ. For any random variable Θ with support in X, and with distribution function H, let us denote by X(Θ) a random vector with survival function G given by

$$G(x) = \int_{\mathcal{X}} F_\theta(x)\,dH(\theta), \quad x \in \mathbb{R}^n.$$

Theorem 6.D.5. Let Fθ, θ ∈ X, be a family of n-dimensional survival functions as above. Let Θ1 and Θ2 be two random variables with supports in X and distribution functions H1 and H2, respectively. Let Y1 and Y2 be two random vectors such that Yi =st X(Θi), i = 1, 2; that is, suppose that the survival function of Yi is given by

$$G_i(x) = \int_{\mathcal{X}} F_\theta(x)\,dH_i(\theta), \quad x \in \mathbb{R}^n,\ i = 1, 2.$$

If

$$X(\theta) \le_{whr} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{6.D.6}$$

and if Θ1 and Θ2 are ordered in the univariate hazard rate order; that is, if

$$\Theta_1 \le_{hr} \Theta_2, \tag{6.D.7}$$

then

$$Y_1 \le_{whr} Y_2. \tag{6.D.8}$$

Proof. Assumption (6.D.6) means that for each j ∈ {1, 2, . . . , n}, the function Fθ(x1, x2, . . . , xn) is TP2 (totally positive of order 2; that is, bivariate MTP2) as a function of θ ∈ X and of xj ∈ R. Assumption (6.D.7) means that the survival function H̄i(θ) = 1 − Hi(θ) is TP2 as a function of i ∈ {1, 2} and of θ ∈ X. Therefore, by Theorem 2.1 of Joag-Dev, Kochar, and Proschan [259], Gi(x1, x2, . . . , xn) is TP2 in i ∈ {1, 2} and in xj ∈ R, j = 1, 2, . . . , n. That is,

$$\frac{G_2(x_1, x_2, \dots, x_n)}{G_1(x_1, x_2, \dots, x_n)} \text{ is increasing in } x_j, \quad j = 1, 2, \dots, n.$$

By (6.D.2), this yields the stated result.

In the case where Y 1 and Y 2 in Theorem 6.D.5 are vectors of conditionally independent random variables, the conclusion (6.D.8) can be strengthened. For thispurpose, consider n families of univariate survival functions

F j,θ , θ ∈ X , j = 1, 2, . . . , n, where X is a subset of the real line. Let Xj (θ) denote a univariate random variable with survival function F j,θ . For any random variable Θ with support in X , and with distribution function H, let Xj (Θ) denote a univariate random variable with survival function given by X F j,θ (x)dH(θ), x ∈ R, j = 1, 2, . . . , n.

Theorem 6.D.6. Let {F̄j,θ, θ ∈ 𝒳} be n families of univariate survival functions as above, j = 1, 2, ..., n. Assume that for each j = 1, 2, ..., n, the univariate supports corresponding to all the F̄j,θ's are identical, 𝒴j, say. Let Θ1 and Θ2 be two random variables with supports in 𝒳 and distribution functions H1 and H2, respectively. Let Y1 = (Y11, Y12, ..., Y1n) and Y2 = (Y21, Y22, ..., Y2n) be two vectors of conditionally independent random variables such that Yij =st Xj(Θi), i = 1, 2, j = 1, 2, ..., n; that is, suppose that the survival function of Yi is given by

\[ \bar G_i(x_1, x_2, \dots, x_n) = \int_{\mathcal{X}} \prod_{j=1}^{n} \bar F_{j,\theta}(x_j)\, dH_i(\theta), \quad (x_1, x_2, \dots, x_n) \in \mathbb{R}^n,\ i = 1, 2. \tag{6.D.9} \]

If

Xj(θ) ≤hr Xj(θ′) whenever θ ≤ θ′, j = 1, 2, ..., n, (6.D.10)

and if Θ1 ≤hr Θ2, then Y1 ≤hr Y2.

Proof. Let θ ≤ θ′. From assumption (6.D.10), from the conditional independence of the Xj(θ)'s, and from the conditional independence of the Xj(θ′)'s, it follows by Theorem 6.D.4(b) that

(X1(θ), X2(θ), ..., Xn(θ)) ≤hr (X1(θ′), X2(θ′), ..., Xn(θ′)) whenever θ ≤ θ′.

Therefore, by Theorem 6.D.5 we get

Y1 ≤whr Y2. (6.D.11)

Next, it is easy to verify that Ḡi in (6.D.9) is TP2 in each pair of its variables when the other variables are held fixed, i = 1, 2. Therefore Ḡi is MTP2, i = 1, 2. Furthermore, from the assumption that for j = 1, 2, ..., n, all the F̄j,θ's have a corresponding univariate common support 𝒴j, it follows that Y1 and Y2 have a common support which is a lattice. The stated result now follows from (6.D.11) and Theorem 6.D.1.
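The mixture construction in (6.D.9) can be explored numerically. The following is a minimal sketch, not part of the original text: it assumes a finite parameter set, exponential families with survival functions exp(−x/θ) (so that the hazard rate 1/θ decreases in θ), and hypothetical mixing weights H1 and H2 chosen so that Θ1 ≤hr Θ2. It also assumes that the multivariate order ≤hr is the lattice inequality Ḡ1(x)Ḡ2(y) ≤ Ḡ1(x ∧ y)Ḡ2(x ∨ y) of Section 6.D.1. A check on a finite grid can only refute the ordering; it is not a proof.

```python
import itertools
import math

# Hypothetical setup: survival functions exp(-x/theta) for x >= 0, so a larger
# theta gives a smaller hazard rate 1/theta, as required by (6.D.10).
THETAS = (1.0, 2.0, 3.0)
H1 = (0.5, 0.3, 0.2)   # pmf of Theta_1 on THETAS
H2 = (0.2, 0.3, 0.5)   # pmf of Theta_2 on THETAS; Theta_1 <=hr Theta_2 here


def mixture_survival(x, weights):
    """Discrete version of (6.D.9): sum over theta of prod_j exp(-x_j/theta) * P{Theta = theta}."""
    return sum(
        w * math.prod(math.exp(-xj / theta) for xj in x)
        for theta, w in zip(THETAS, weights)
    )


def hr_leq(surv_f, surv_g, grid, tol=1e-12):
    """Check surv_f(x) * surv_g(y) <= surv_f(x ^ y) * surv_g(x v y) for all grid pairs."""
    for x, y in itertools.product(grid, repeat=2):
        lo = tuple(map(min, x, y))   # componentwise minimum x ^ y
        hi = tuple(map(max, x, y))   # componentwise maximum x v y
        if surv_f(x) * surv_g(y) > surv_f(lo) * surv_g(hi) + tol:
            return False
    return True


grid = list(itertools.product((0.0, 0.5, 1.0, 2.0), repeat=2))  # n = 2


def g1(x):
    return mixture_survival(x, H1)


def g2(x):
    return mixture_survival(x, H2)


y1_hr_y2 = hr_leq(g1, g2, grid)   # consistent with the conclusion of Theorem 6.D.6
y2_hr_y1 = hr_leq(g2, g1, grid)   # the reverse comparison fails on this grid
```

On this grid the inequality holds in the direction stated by the theorem and fails in the reverse direction, for example at x = (1, 1) and y = (0, 0).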

An interesting property of the order ≤whr, for nonnegative random vectors, is given next; see Theorem 6.G.15 for a related result.

Theorem 6.D.7. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two nonnegative random vectors. If X ≤whr Y, then

min{a1X1, ..., anXn} ≤hr min{a1Y1, ..., anYn} whenever ai > 0, i = 1, 2, ..., n. (6.D.12)

6.D.3 The dynamic multivariate hazard rate order

Let T = (T1, T2, ..., Tm) be a nonnegative random vector with an absolutely continuous distribution function. Denote the multivariate conditional hazard rate functions of T by λ·|·(·|·) as defined in (6.C.2). Clearly, the higher the multivariate conditional hazard rate functions are, the smaller T should be stochastically. This is the motivation for the order discussed in this subsection.

Let S = (S1, S2, ..., Sm) be another nonnegative random vector with an absolutely continuous distribution function. Denote its multivariate conditional hazard rate functions by η·|·(·|·), where the η's are defined analogously to the λ's in (6.C.2). Suppose that

ηi|I∪J(u | sI, sJ) ≥ λi|I(u | tI) whenever J ∩ I = ∅, sI ≤ tI ≤ ue, and sJ ≤ ue, (6.D.13)

where $i \in \overline{I \cup J}$. Then S is said to be smaller than T in the dynamic multivariate hazard rate order (denoted as S ≤dyn-hr T).

The order ≤dyn-hr is not an order in the usual sense; a comment, similar to the comment in Remark 6.B.5, applies to this order too. Explicitly, X ≤dyn-hr X means that X has the positive dependence property of “hazard rate increasing upon failures” discussed in Shaked and Shanthikumar [511].

Note that (6.D.13) can be written as (see Section 6.B.6 for the definition of histories and for the definition of their comparison)

ηi(hu) ≥ λi(h′u) whenever hu ≥ h′u,

where i denotes a component that has not failed by time u in the history hu.

The following example illustrates how the dynamic multivariate hazard rate order can be verified. This example may be compared with Examples 1.B.24, 1.C.48, 2.A.22, 3.B.38, 6.B.41, 6.E.13, and 7.B.13.

Example 6.D.8. Let X and Y be two absolutely continuous nonnegative random variables with survival functions F̄ and Ḡ, respectively. Denote Λ1 = − log F̄, Λ2 = − log Ḡ, and λi = Λi′, i = 1, 2. Consider two nonhomogeneous Poisson processes N1 = {N1(t), t ≥ 0} and N2 = {N2(t), t ≥ 0} with mean functions Λ1 and Λ2 (see Example 1.B.13), respectively. Let Ti,1, Ti,2, ... be the successive epoch times of process Ni, i = 1, 2. Note that X =st T1,1 and Y =st T2,1.

It turns out that the univariate hazard rate ordering of the first two epoch times implies the dynamic multivariate hazard rate ordering of the corresponding vectors of the later epoch times. Explicitly, it will be shown below that if X ≤hr Y, then (T1,1, T1,2, ..., T1,n) ≤dyn-hr (T2,1, T2,2, ..., T2,n) for each n ≥ 1.

Fix an n ≥ 1. Let η·|·(·|·) be the multivariate conditional hazard rate functions associated with (T1,1, T1,2, ..., T1,n) and let ζ·|·(·|·) be the multivariate conditional hazard rate functions associated with (T2,1, T2,2, ..., T2,n). First let us obtain an explicit expression for ζi|I(u | tI) under the restrictions on t and u in (6.D.13). Since T2,1 ≤ T2,2 ≤ ··· ≤ T2,n a.s., it follows that tI in (6.D.13) can be a realization (“history”) of observations up to time u only if I is of the form I = {1, 2, ..., m} for some m ≥ 1, or I = ∅ (that is, m = 0). Then we have

\[ \zeta_{i|I}(u \mid \mathbf{t}_I) = \begin{cases} \lambda_2(u), & \text{if } i = m + 1; \\ 0, & \text{if } i > m + 1; \end{cases} \qquad \text{where } I = \{1, 2, \dots, m\}. \]

Next, let us obtain an explicit expression for ηi|I∪J(u | sI∪J) under the restrictions on s, t, and u in (6.D.13). Since T1,1 ≤ T1,2 ≤ ··· ≤ T1,n a.s., we see that when I = {1, 2, ..., m}, then sI∪J in (6.D.13) can be a realization of observations up to time u only if J is of the form J = {m + 1, m + 2, ..., k} for some k ≥ m + 1, or J = ∅ (that is, k = m). Then we have

\[ \eta_{i|I\cup J}(u \mid \mathbf{s}_{I\cup J}) = \begin{cases} \lambda_1(u), & \text{if } i = k + 1; \\ 0, & \text{if } i > k + 1; \end{cases} \]

where I = {1, 2, ..., m} and J = {m + 1, m + 2, ..., k}.

Suppose that X ≤hr Y. Since i in (6.D.13) must satisfy $i \in \overline{I \cup J}$ (that is, i > k), we see that if k > m, then

\[ \begin{aligned} \eta_{i|I\cup J}(u \mid \mathbf{s}_{I\cup J}) &= \lambda_1(u) \ge 0 = \zeta_{i|I}(u \mid \mathbf{t}_I) && \text{if } i = k + 1; \\ \eta_{i|I\cup J}(u \mid \mathbf{s}_{I\cup J}) &= 0 = \zeta_{i|I}(u \mid \mathbf{t}_I) && \text{if } i > k + 1; \end{aligned} \]

so (6.D.13) holds with ζ·|·(·|·) replacing λ·|·(·|·). If k = m (that is, J = ∅), then, using X ≤hr Y, we get

\[ \begin{aligned} \eta_{i|I\cup J}(u \mid \mathbf{s}_{I\cup J}) &= \lambda_1(u) \ge \lambda_2(u) = \zeta_{i|I}(u \mid \mathbf{t}_I) && \text{if } i = k + 1; \\ \eta_{i|I\cup J}(u \mid \mathbf{s}_{I\cup J}) &= 0 = \zeta_{i|I}(u \mid \mathbf{t}_I) && \text{if } i > k + 1; \end{aligned} \]

so (6.D.13), with ζ·|·(·|·) replacing λ·|·(·|·), holds in this case too. Thus (T1,1, T1,2, ..., T1,n) ≤dyn-hr (T2,1, T2,2, ..., T2,n).

It should be noted that in Example 1.B.24 it was shown that if X ≤hr Y, then we have the univariate stochastic inequality T1,n ≤hr T2,n for each n ≥ 1. This stochastic inequality does not follow from the above result because the dynamic multivariate hazard rate order is not closed under marginalization.

In the univariate case (m = 1) condition (6.D.13) reduces to (1.B.2) [with a different notation]. We have already seen that in the univariate case S1 ≤hr T1 =⇒ S1 ≤st T1. This is also true in the general dynamic multivariate case. In order to see it, note that if (6.D.13) holds, then (6.C.4) holds, where in (6.C.4) the functions Ψ's are defined by means of the functions λ's as in (6.C.3) and the functions Φ's are analogously defined by means of the functions η's. We thus have proven the following result.

Theorem 6.D.9. If S and T are two nonnegative random vectors such that S ≤dyn-hr T, then S ≤ch T.

Let X(1) ≤ X(2) ≤ ··· ≤ X(n) be the order statistics corresponding to a sample of independent and identically distributed nonnegative random variables X1, X2, ..., Xn. Similarly, let Y(1) ≤ Y(2) ≤ ··· ≤ Y(n) be the order statistics corresponding to a sample of independent and identically distributed nonnegative random variables Y1, Y2, ..., Yn. In the next result, the vectors of order statistics are compared in the order ≤dyn-hr; it may be compared with Theorems 6.E.12, 7.B.4, and 7.B.12. The proof of the next result is similar to the proof of the main result in Example 6.D.8.


Theorem 6.D.10. Let X(1), X(2), ..., X(n) and Y(1), Y(2), ..., Y(n) be order statistics as described above. If X1 ≤hr Y1, then

(X(1), X(2), ..., X(n)) ≤dyn-hr (Y(1), Y(2), ..., Y(n)).

We will now see a property of the order ≤dyn-hr in reliability theory. Recall from Section 1.B.5 that a nonnegative random variable T is IFR if, and only if, either one of the following equivalent conditions holds:

[T − t | T > t] ≥hr [T − t′ | T > t′] whenever t ≤ t′, (6.D.14)

T ≥hr [T − t | T > t] for all t ≥ 0. (6.D.15)

With the dynamic multivariate analog of the order ≥hr, one can generalize (6.D.14) and (6.D.15) to the multivariate case, thus introducing notions of multivariate IFR distributions. This can be done in several ways. Below we show that various generalizations of (6.D.14) and (6.D.15) actually yield the same notion of multivariate IFR.

Let T be a nonnegative random vector. Recall from Section 6.B.6 the definition, the notation ht, and the comparison of histories associated with T. One possible multivariate analog of (6.D.14) is to require T to satisfy, for t ≤ s and histories ht and hs,

[(T − te)+ | ht] ≥dyn-hr [(T − se)+ | hs] whenever ht ≤ hs. (6.D.16)

Still another possible multivariate analog of (6.D.14) is to require T to satisfy, for t ≤ s,

[(T − te)+ | ht] ≥dyn-hr [(T − se)+ | hs] whenever ht and hs coincide on [0, t). (6.D.17)

An analog of (6.D.15) is to require T to satisfy (6.D.16) or (6.D.17) with t = 0; that is,

T ≥dyn-hr [(T − se)+ | hs] for any history hs, s ≥ 0. (6.D.18)

It turns out that these three conditions are equivalent. If we say that the nonnegative random vector T is multivariate IFR if it satisfies (6.D.16), then we have the following result, the proof of which can be found elsewhere.

Theorem 6.D.11. Let T be a nonnegative random vector. The following three statements are equivalent.
(i) T is multivariate IFR.
(ii) T satisfies (6.D.17).
(iii) T satisfies (6.D.18).

Note that if T is multivariate IFR in the sense of Theorem 6.D.11, it is also multivariate IFR in the sense of both (6.B.27) and (6.B.28).


6.E The Multivariate Likelihood Ratio Order

6.E.1 Definition

A multivariate analog of the univariate order ≤lr from Section 1.C will be introduced in this subsection. This order is sometimes also called the TP2 order.

Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two n-dimensional random vectors with absolutely continuous [or discrete] distribution functions and let f and g denote their [continuous or discrete] density functions, respectively. Suppose that

f(x)g(y) ≤ f(x ∧ y)g(x ∨ y) for every x and y in Rn. (6.E.1)

Then X is said to be smaller than Y in the multivariate likelihood ratio order (denoted as X ≤lr Y). Indeed, in the univariate case (n = 1), (6.E.1) reduces to (1.C.2).

The order ≤lr is not an order in the usual sense; a comment, similar to the comment in Remark 6.B.5, applies to this order too. Explicitly, X ≤lr X means that X has the positive dependence property of “multivariate TP2” discussed in Karlin and Rinott [278] and in Whitt [563]; see its definition in Example 6.E.16 below.

In the slightly more general case, when X and Y are nonnegative, some of the Xi's may be identically zero and the joint distribution of the rest is absolutely continuous or discrete. Suppose that X1, X2, ..., Xm are those that are identically zero for some 0 < m < n. Let f now denote the joint density of (Xm+1, Xm+2, ..., Xn). In that case we denote X ≤lr Y if

\[ f(\mathbf{x})\, g(\mathbf{y}) \le f\bigl(\mathbf{x} \wedge (y_{m+1}, y_{m+2}, \dots, y_n)\bigr) \times g\bigl((y_1, y_2, \dots, y_m),\ \mathbf{x} \vee (y_{m+1}, y_{m+2}, \dots, y_n)\bigr) \tag{6.E.2} \]

for every x = (xm+1, xm+2, ..., xn) and y = (y1, y2, ..., yn).

At first glance (6.E.1) and (6.E.2) seem to be unintuitive technical conditions. However, it turns out that in many situations they are very easy to verify and this is one of the major reasons for the usefulness and importance of the order ≤lr.

Another possible analog of (1.C.2) is to require that f(y)g(x) ≤ f(x)g(y) whenever x ≤ y. However, this does not yield an intuitive notion; see Remark 6.E.10.

6.E.2 Some properties

The multivariate likelihood ratio order is preserved under conditioning on any rectangular set A (that is, A of the form A = A1 × A2 × ··· × An where Ai ⊆ R, i = 1, 2, ..., n). This is shown in the next result. The proof is quite trivial and is omitted.


Theorem 6.E.1. If X and Y are two n-dimensional random vectors such that X ≤lr Y, then, for any measurable rectangular set A ⊆ Rn, we have that [X | X ∈ A] ≤lr [Y | Y ∈ A].

The above theorem can be generalized as follows. For A, B ⊆ Rn we denote A ∨ B = {x ∨ y : x ∈ A, y ∈ B} and A ∧ B = {x ∧ y : x ∈ A, y ∈ B}.

Theorem 6.E.2. Let A, B ⊆ Rn satisfy A ∨ B ⊆ B and A ∧ B ⊆ A. If X and Y are two n-dimensional random vectors such that X ≤lr Y, then [X | X ∈ A] ≤lr [Y | Y ∈ B].

Proof. Let f and g denote the density functions of X and Y, respectively. For any set C, let IC denote its indicator function. The assumptions imply

IA(x)IB(y) ≤ IA(x ∧ y)IB(x ∨ y) and f(x)g(y) ≤ f(x ∧ y)g(x ∨ y).

Therefore

\[ \frac{f(\mathbf{x}) I_A(\mathbf{x})}{P\{\mathbf{X} \in A\}} \cdot \frac{g(\mathbf{y}) I_B(\mathbf{y})}{P\{\mathbf{Y} \in B\}} \le \frac{f(\mathbf{x} \wedge \mathbf{y}) I_A(\mathbf{x} \wedge \mathbf{y})}{P\{\mathbf{X} \in A\}} \cdot \frac{g(\mathbf{x} \vee \mathbf{y}) I_B(\mathbf{x} \vee \mathbf{y})}{P\{\mathbf{Y} \in B\}}. \]

The following result shows that the order ≤lr is preserved under strictly monotone transformations of each individual coordinate of the underlying random vectors. The proof follows the lines of the proof of Theorem 1.C.8 and is omitted.

Theorem 6.E.3. Let ψi be any increasing function, i = 1, 2, ..., n. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two n-dimensional random vectors. If X ≤lr Y, then

(ψ1(X1), ψ2(X2), ..., ψn(Xn)) ≤lr (ψ1(Y1), ψ2(Y2), ..., ψn(Yn)).

The order ≤lr is closed under marginalization and under conjunctions as the following result shows. The first part of the theorem can easily be proven from the definitions. The proof of the second part uses ideas from the theory of total positivity and is not given here.

Theorem 6.E.4. (a) Let X1, X2, ..., Xm be a set of independent random vectors where the dimension of Xi is ki, i = 1, 2, ..., m. Let Y1, Y2, ..., Ym be another set of independent random vectors where the dimension of Yi is ki, i = 1, 2, ..., m. If Xi ≤lr Yi for i = 1, 2, ..., m, then

(X1, X2, ..., Xm) ≤lr (Y1, Y2, ..., Ym).

That is, the multivariate likelihood ratio order is closed under conjunctions.
(b) Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two n-dimensional random vectors. If X ≤lr Y, then XI ≤lr YI for each I ⊆ {1, 2, ..., n}. That is, the multivariate likelihood ratio order is closed under marginalization.
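For discrete distributions on a finite lattice, condition (6.E.1) and the two closure properties of Theorem 6.E.4 can be checked by direct enumeration. The sketch below is illustrative only: the pmfs p and q, the helper names, and the tolerance are assumptions introduced here, and X and Y are built as vectors of independent coordinates so that part (a) of the theorem applies.

```python
import itertools


def product_pmf(p, q):
    """Joint pmf of two independent coordinates with marginal pmfs p and q."""
    return {(i, j): p[i] * q[j] for i in range(len(p)) for j in range(len(q))}


def lr_leq(f, g, tol=1e-12):
    """Check f(x)g(y) <= f(x ^ y)g(x v y), i.e. (6.E.1), over a finite lattice."""
    for x, y in itertools.product(f, g):
        lo = tuple(map(min, x, y))   # componentwise minimum x ^ y
        hi = tuple(map(max, x, y))   # componentwise maximum x v y
        if f[x] * g[y] > f.get(lo, 0.0) * g.get(hi, 0.0) + tol:
            return False
    return True


def marginal(f, coords):
    """Marginal pmf of the coordinates listed in coords."""
    out = {}
    for x, mass in f.items():
        key = tuple(x[c] for c in coords)
        out[key] = out.get(key, 0.0) + mass
    return out


p = (0.5, 0.3, 0.2)    # q[k] / p[k] is increasing in k, so p <=lr q (univariate)
q = (0.2, 0.3, 0.5)
f = product_pmf(p, p)  # pmf of X = (X1, X2), independent coordinates
g = product_pmf(q, q)  # pmf of Y = (Y1, Y2), independent coordinates

x_lr_y = lr_leq(f, g)  # closure under conjunctions, Theorem 6.E.4(a)
y_lr_x = lr_leq(g, f)  # the reverse comparison fails
marginals_ordered = all(
    lr_leq(marginal(f, (c,)), marginal(g, (c,))) for c in (0, 1)
)                      # closure under marginalization, Theorem 6.E.4(b)
```

The enumeration is quadratic in the number of lattice points, so it is only practical for small supports.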


A result which shows the preservation of the order ≤lr under random summations is stated next. The proof is based on standard arguments from the theory of total positivity, and is omitted.

Theorem 6.E.5. Let X1, X2, ..., Xm be m countably infinite vectors of independent nonnegative random variables. Assume that X1, X2, ..., Xm are independent. Let M = (M1, M2, ..., Mm) and N = (N1, N2, ..., Nm) be two vectors of nonnegative integers which are independent of X1, X2, ..., Xm. Denote by Xj,i the ith element of Xj. If Xj,i has a logconcave density function for all j = 1, 2, ..., m and i ≥ 1, and if M ≤lr N, then

\[ \Bigl( \sum_{i=1}^{M_1} X_{1,i},\ \sum_{i=1}^{M_2} X_{2,i},\ \dots,\ \sum_{i=1}^{M_m} X_{m,i} \Bigr) \le_{\mathrm{lr}} \Bigl( \sum_{i=1}^{N_1} X_{1,i},\ \sum_{i=1}^{N_2} X_{2,i},\ \dots,\ \sum_{i=1}^{N_m} X_{m,i} \Bigr). \]

In the univariate case the likelihood ratio order implies the hazard rate order. It turns out that this is also the case in the multivariate case as the following two results show.

Theorem 6.E.6. If X and Y are two n-dimensional random vectors such that X ≤lr Y, then X ≤hr Y.

Proof. This result follows from Theorem 2.4 in Karlin and Rinott [278] with the MTP2 kernel K defined by

\[ K(\mathbf{x}, \mathbf{u}) = \prod_{i=1}^{n} 1_{(x_i, \infty)}(u_i). \]
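The implication of Theorem 6.E.6 can be illustrated on a small discrete example. This sketch assumes that the multivariate order ≤hr is defined by the same lattice inequality as (6.E.1), applied to the survival functions instead of the densities (the definition in Section 6.D.1); the pmfs and helper names are hypothetical.

```python
import itertools

SUPPORT = (0, 1, 2)
p = (0.5, 0.3, 0.2)
q = (0.2, 0.3, 0.5)
f = {(i, j): p[i] * p[j] for i in SUPPORT for j in SUPPORT}  # pmf of X
g = {(i, j): q[i] * q[j] for i in SUPPORT for j in SUPPORT}  # pmf of Y


def survival(pmf, x):
    """P{X1 > x1, ..., Xn > xn} for a pmf stored as a dict."""
    return sum(m for pt, m in pmf.items() if all(a > b for a, b in zip(pt, x)))


def lattice_leq(phi_f, phi_g, points, tol=1e-12):
    """Check phi_f(x) * phi_g(y) <= phi_f(x ^ y) * phi_g(x v y) for all pairs of points."""
    for x, y in itertools.product(points, repeat=2):
        lo = tuple(map(min, x, y))
        hi = tuple(map(max, x, y))
        if phi_f(x) * phi_g(y) > phi_f(lo) * phi_g(hi) + tol:
            return False
    return True


# Densities are compared on the support; survival functions are compared on
# thresholds that include one point below the support.
thresholds = list(itertools.product((-1, 0, 1, 2), repeat=2))
lr_holds = lattice_leq(lambda x: f.get(x, 0.0), lambda x: g.get(x, 0.0), list(f))
hr_holds = lattice_leq(lambda x: survival(f, x), lambda x: survival(g, x), thresholds)
```

Both checks succeed for this pair, in line with the theorem; the example does not address the converse, which is false in general.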

Theorem 6.E.7. If X and Y are two nonnegative n-dimensional random vectors such that X ≤lr Y, then X ≤dyn-hr Y.

Proof. First suppose that X > 0e a.s. Split {1, 2, ..., n} into three mutually exclusive sets, I, J, and L (so that $L = \overline{I \cup J}$). Select xI, xJ, yI, and t such that xI ≤ yI ≤ te and xJ ≤ te. Denote the densities of (XI, XJ, XL) and of (YI, YJ, YL) by f̃ and g̃, respectively. The density of [XL | XI = xI, XJ = xJ], with argument xL, is then f̃(xI, xJ, xL)/f̃I,J(xI, xJ), where f̃I,J is the marginal density of (XI, XJ). The density of [YL | YI = yI, YJ > te], with argument yL, is then

\[ \frac{\int_{\mathbf{y}_J > t\mathbf{e}} \tilde g(\mathbf{y}_I, \mathbf{y}_J, \mathbf{y}_L)\, d\mathbf{y}_J}{\int_{\mathbf{y}_J > t\mathbf{e}} \tilde g_{I,J}(\mathbf{y}_I, \mathbf{y}_J)\, d\mathbf{y}_J}, \]

where g̃I,J is the marginal density of (YI, YJ). Now select a yJ > te. Since yJ > te and xJ ≤ te it follows that xJ ≤ yJ. Also xI ≤ yI. Therefore, from the assumption that X ≤lr Y it follows that

\[ \tilde f(\mathbf{x}_I, \mathbf{x}_J, \mathbf{x}_L)\, \tilde g(\mathbf{y}_I, \mathbf{y}_J, \mathbf{y}_L) \le \tilde f(\mathbf{x}_I, \mathbf{x}_J, \mathbf{x}_L \wedge \mathbf{y}_L)\, \tilde g(\mathbf{y}_I, \mathbf{y}_J, \mathbf{x}_L \vee \mathbf{y}_L). \tag{6.E.3} \]

Integration of (6.E.3) over the region {yJ : yJ > te} yields

\[ \tilde f(\mathbf{x}_I, \mathbf{x}_J, \mathbf{x}_L) \int_{\mathbf{y}_J > t\mathbf{e}} \tilde g(\mathbf{y}_I, \mathbf{y}_J, \mathbf{y}_L)\, d\mathbf{y}_J \le \tilde f(\mathbf{x}_I, \mathbf{x}_J, \mathbf{x}_L \wedge \mathbf{y}_L) \int_{\mathbf{y}_J > t\mathbf{e}} \tilde g(\mathbf{y}_I, \mathbf{y}_J, \mathbf{x}_L \vee \mathbf{y}_L)\, d\mathbf{y}_J, \]

which, in turn, yields

\[ \frac{\tilde f(\mathbf{x}_I, \mathbf{x}_J, \mathbf{x}_L)}{\tilde f_{I,J}(\mathbf{x}_I, \mathbf{x}_J)} \times \frac{\int_{\mathbf{y}_J > t\mathbf{e}} \tilde g(\mathbf{y}_I, \mathbf{y}_J, \mathbf{y}_L)\, d\mathbf{y}_J}{\int_{\mathbf{y}_J > t\mathbf{e}} \tilde g_{I,J}(\mathbf{y}_I, \mathbf{y}_J)\, d\mathbf{y}_J} \le \frac{\tilde f(\mathbf{x}_I, \mathbf{x}_J, \mathbf{x}_L \wedge \mathbf{y}_L)}{\tilde f_{I,J}(\mathbf{x}_I, \mathbf{x}_J)} \times \frac{\int_{\mathbf{y}_J > t\mathbf{e}} \tilde g(\mathbf{y}_I, \mathbf{y}_J, \mathbf{x}_L \vee \mathbf{y}_L)\, d\mathbf{y}_J}{\int_{\mathbf{y}_J > t\mathbf{e}} \tilde g_{I,J}(\mathbf{y}_I, \mathbf{y}_J)\, d\mathbf{y}_J}. \]

That is, we have shown so far that

[XL | XI = xI, XJ = xJ] ≤lr [YL | YI = yI, YJ > te]. (6.E.4)

From Theorems 6.E.1 and 6.E.3 it now follows that

[XL − te | XI = xI, XJ = xJ, XL > te] ≤lr [YL − te | YI = yI, YJ > te, YL > te],

and from Theorem 6.E.4(b) it follows that, for k ∈ L, we have

[Xk − t | XI = xI, XJ = xJ, XL > te] ≤lr [Yk − t | YI = yI, YJ > te, YL > te], (6.E.5)

where here ≤lr denotes the univariate likelihood ratio order discussed in Section 1.C. From (6.E.5) it follows that the density of [Xk − t | XI = xI, XJ = xJ, XL > te] at zero is larger than the density of [Yk − t | YI = yI, YJ > te, YL > te] at zero. But the density of [Xk − t | XI = xI, XJ = xJ, XL > te] at zero is ηk|I∪J(t | xI, xJ) and the density of [Yk − t | YI = yI, YJ > te, YL > te] at zero is λk|I(t | yI), where η·|·(·|·) and λ·|·(·|·) denote the multivariate conditional hazard rate functions of X and Y, respectively. We thus have shown that X and Y satisfy (6.D.13) and this completes the proof of the theorem when X > 0e a.s. If X has some components that are identically zero a.s., then the above arguments still apply after some simple modifications.

A combination of Theorems 6.C.1, 6.D.9, and 6.E.7 shows that for nonnegative random vectors X and Y one has X ≤lr Y =⇒ X ≤st Y. But this is true in general as is stated in the next result, the proof of which we omit.

Theorem 6.E.8. If X and Y are two n-dimensional random vectors such that X ≤lr Y, then X ≤st Y.

Remark 6.E.9. A combination of Theorems 6.E.1 and 6.E.8 shows that

X ≤lr Y =⇒ [X | A] ≤st [Y | A] for all measurable rectangular sets A ⊆ Rn. (6.E.6)


The conclusion in (6.E.6) is a generalization of (1.C.6). However, the characterization of the order ≤lr in the univariate case, given in (1.C.6), does not generalize to the multivariate case. That is, X ≤lr Y does not necessarily imply that [X | A] ≤st [Y | A] for all measurable sets A ⊆ Rn.

Remark 6.E.10. Let X and Y be two n-dimensional random vectors with (continuous or discrete) density functions f and g, respectively. If it is only assumed that f(y)g(x) ≤ f(x)g(y) whenever x ≤ y (rather than (6.E.1)), then it is not necessarily true that X ≤st Y; counterexamples can be found in the literature. Note, however, that, under some additional conditions, the monotonicity of g(x)/f(x) in x implies that X ≤st Y; see, for example, Theorem 6.B.8.

A result that may be viewed as a generalization of Theorems 1.C.9 and 1.C.52 is stated next.

Theorem 6.E.11. Let X be an n-dimensional random vector.
(a) X ≤lr X + a for all a ≥ 0 if, and only if, X has independent components with logconcave density functions.
(b) If X has independent components with logconcave density functions, then X ≤lr X + Y for any random vector Y ≥ 0 independent of X.

In the next result, vectors of order statistics are compared in the multivariate order ≤lr. The result may be compared with Theorems 6.D.10, 7.B.4, and 7.B.12.

Theorem 6.E.12. Let X(1), X(2), ..., X(n) and Y(1), Y(2), ..., Y(n) be order statistics as in Theorem 6.D.10. If X1 ≤lr Y1, then

(X(1), X(2), ..., X(n)) ≤lr (Y(1), Y(2), ..., Y(n)).

The following example may be compared with Examples 1.B.24, 1.C.48, 2.A.22, 3.B.38, 6.B.41, 6.D.8, and 7.B.13.

Example 6.E.13. Let X and Y be two absolutely continuous nonnegative random variables with survival functions F̄ and Ḡ, and density functions f and g, respectively. Denote Λ1 = − log F̄, Λ2 = − log Ḡ, and λi = Λi′, i = 1, 2. Consider two nonhomogeneous Poisson processes N1 = {N1(t), t ≥ 0} and N2 = {N2(t), t ≥ 0} with mean functions Λ1 and Λ2 (see Example 1.B.13), respectively. Let Ti,1, Ti,2, ... be the successive epoch times of process Ni, i = 1, 2. Note that X =st T1,1 and Y =st T2,1.

It turns out that, under some conditions, the univariate likelihood ratio ordering of the first two epoch times implies the multivariate likelihood ratio ordering of the corresponding vectors of the later epoch times. Explicitly, it will be shown below that if X ≤hr Y, and if (1.B.25) holds, then (T1,1, T1,2, ..., T1,n) ≤lr (T2,1, T2,2, ..., T2,n) for each n ≥ 1. (Note that the condition X ≤hr Y, together with (1.B.25), is stronger than merely assuming X ≤lr Y; see Theorem 1.C.4.)


As is mentioned above, the stated result is true for n = 1. So let n ≥ 2. The density functions of (Ti,1, Ti,2, ..., Ti,n), i = 1, 2, are given by

h1,n(x1, x2, ..., xn) = λ1(x1)λ1(x2) ··· λ1(xn−1)f(xn) for x1 ≤ x2 ≤ ··· ≤ xn,

and

h2,n(x1, x2, ..., xn) = λ2(x1)λ2(x2) ··· λ2(xn−1)g(xn) for x1 ≤ x2 ≤ ··· ≤ xn.

Consider now (x1, x2, ..., xn) and (y1, y2, ..., yn) such that x1 ≤ x2 ≤ ··· ≤ xn and y1 ≤ y2 ≤ ··· ≤ yn. We want to prove that

\[ \begin{aligned} &\lambda_1(x_1 \wedge y_1)\lambda_1(x_2 \wedge y_2) \cdots \lambda_1(x_{n-1} \wedge y_{n-1}) f(x_n \wedge y_n) \times \lambda_2(x_1 \vee y_1)\lambda_2(x_2 \vee y_2) \cdots \lambda_2(x_{n-1} \vee y_{n-1}) g(x_n \vee y_n) \\ &\quad \ge \lambda_1(x_1)\lambda_1(x_2) \cdots \lambda_1(x_{n-1}) f(x_n) \times \lambda_2(y_1)\lambda_2(y_2) \cdots \lambda_2(y_{n-1}) g(y_n). \end{aligned} \tag{6.E.7} \]

Let E = {i ≤ n − 1 : xi ≥ yi}. Then (6.E.7) reduces to

\[ \prod_{i \in E} \lambda_1(y_i)\lambda_2(x_i) \cdot f(x_n \wedge y_n)\, g(x_n \vee y_n) \ge \prod_{i \in E} \lambda_1(x_i)\lambda_2(y_i) \cdot f(x_n)\, g(y_n), \]

and this follows from (1.B.25) and X ≤lr Y.

From the above result, and the closure of the likelihood ratio order under marginalization (Theorem 6.E.4(b)), it follows that if X ≤hr Y, and if (1.B.25) holds, then T1,n ≤lr T2,n, n ≥ 1. However, a stronger result is given in Example 1.C.48; this is so because the conditions X ≤hr Y and (1.B.25), together, imply the conditions X ≤lr Y and (1.C.15).

Now let Xi,n ≡ Ti,n − Ti,n−1, n ≥ 1 (where Ti,0 ≡ 0), be the inter-epoch times of the process Ni, i = 1, 2. Again, note that X =st X1,1 and Y =st X2,1. It turns out that, under some conditions, the univariate likelihood ratio ordering of the first two inter-epoch times implies the multivariate likelihood ratio ordering of the corresponding vectors of the later inter-epoch times. Explicitly, if X ≤hr Y, and if f and/or g are logconvex, and if λ1 and/or λ2 are logconvex, and if (1.B.25) holds, then (X1,1, X1,2, ..., X1,n) ≤lr (X2,1, X2,2, ..., X2,n) for each n ≥ 1. The proof of this statement will not be detailed here.

From the above result, and the closure of the likelihood ratio order under marginalization (Theorem 6.E.4(b)), it follows that if X ≤hr Y, and if f and/or g are logconvex, and if λ1 and/or λ2 are logconvex, and if (1.B.25) holds, then X1,n ≤lr X2,n, n ≥ 1. This is a different set of conditions for the last stochastic inequality than the set of conditions in Example 1.C.48.


Example 6.E.14. Recall that the spacings that correspond to the nonnegative random variables X1, X2, ..., Xn are denoted by U(i) = X(i) − X(i−1), i = 1, 2, ..., n, where the X(i)'s are the corresponding order statistics (here we take X(0) ≡ 0). The normalized spacings are defined by D(i) = (n − i + 1)U(i), i = 1, 2, ..., n. Now, let D(1), D(2), ..., D(n) be the normalized spacings associated with exponential random variables X1, X2, ..., Xn, where Xi has the hazard rate λi, i = 1, 2, ..., n. Let D∗(1), D∗(2), ..., D∗(n) be the normalized spacings associated with a sample of n independent and identically distributed exponential random variables that have the hazard rate $(1/n)\sum_{i=1}^{n} \lambda_i$. Then

(D∗(1), D∗(2), ..., D∗(n)) ≤lr (D(1), D(2), ..., D(n)).
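A small simulation can make the comparison of normalized spacings concrete. The sketch below is an illustration under stated assumptions, not a verification of the ≤lr ordering: it takes the exponential variables to be independent, uses hypothetical hazard rates (1, 2, 6), uses the normalization D(i) = (n − i + 1)U(i), and compares only Monte Carlo means, which is a necessary consequence of the likelihood ratio ordering (via the usual stochastic order of the marginals).

```python
import random


def normalized_spacings(sample):
    """Return D_(i) = (n - i + 1) * (X_(i) - X_(i-1)), i = 1..n, with X_(0) = 0."""
    n = len(sample)
    ordered = sorted(sample)
    previous = 0.0
    out = []
    for i, value in enumerate(ordered, start=1):
        out.append((n - i + 1) * (value - previous))
        previous = value
    return out


def mean_spacings(rates, n_rep, rng):
    """Monte Carlo means of the normalized spacings of independent exponentials."""
    totals = [0.0] * len(rates)
    for _ in range(n_rep):
        d = normalized_spacings([rng.expovariate(r) for r in rates])
        totals = [t + v for t, v in zip(totals, d)]
    return [t / n_rep for t in totals]


rng = random.Random(12345)
rates = (1.0, 2.0, 6.0)                 # hypothetical hazard rates
avg_rate = sum(rates) / len(rates)      # common hazard rate of the starred sample
mean_d = mean_spacings(rates, 20000, rng)
mean_d_star = mean_spacings((avg_rate,) * len(rates), 20000, rng)
```

For the identically distributed sample the normalized spacings are independent exponentials with the common rate, so every entry of `mean_d_star` should be close to 1/avg_rate. In the heterogeneous sample the first normalized spacing has the same mean, while the later ones are larger on average.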

The following example is similar to Example 6.B.25 except that under a different assumption we obtain a stronger conclusion. Other results which give related comparisons can be found in Theorems 1.C.45 and 4.B.17.

Example 6.E.15. Let X and Y be two random variables. Let X(1) ≤ X(2) ≤ ··· ≤ X(n) denote the order statistics from a sample X1, X2, ..., Xn of independent and identically distributed random variables that have the same distribution as X. Similarly, let Y(1) ≤ Y(2) ≤ ··· ≤ Y(n) denote the order statistics from another sample Y1, Y2, ..., Yn of independent and identically distributed random variables that have the same distribution as Y. The corresponding spacings are defined by U(i) ≡ X(i) − X(i−1) and V(i) ≡ Y(i) − Y(i−1), i = 2, 3, ..., n. Denote U = (U(2), U(3), ..., U(n)) and V = (V(2), V(3), ..., V(n)). Kochar [311] has shown that if X ≤lr Y, and if either X or Y has a logconvex density, then U ≤lr V.

The next example extends Example 1.C.57 to the multivariate likelihood ratio order.

Example 6.E.16. Let X be an n-dimensional random vector whose distribution function depends on the m-dimensional parameter Θ. Denote the prior density function of Θ by π(·), and denote the conditional density of X, given Θ = θ, by f(· | θ). Suppose that the m-dimensional density function of Θ is MTP2 (multivariate totally positive of order 2), that is, suppose that Θ ≤lr Θ, or, equivalently (see (6.E.1)), that π(θ)π(θ′) ≤ π(θ ∧ θ′)π(θ ∨ θ′) for every θ and θ′ in Rm. If, in addition, f(x | θ) is ((m + n)-dimensional) MTP2, then Θ is increasing in X in the likelihood ratio sense (that is, [Θ | X = x] ≤lr [Θ | X = x′] whenever x ≤ x′). The proof of this statement is similar to the proof of the statement in Example 1.C.57 and is omitted.

6.E.3 A property in reliability theory

In Theorem 1.C.52 it was shown that a nonnegative random variable T has a logconcave density if, and only if, either one of the following equivalent conditions holds:

[T − t | T > t] ≥lr [T − t′ | T > t′] whenever t ≤ t′, (6.E.8)

T ≥lr [T − t | T > t] for all t ≥ 0. (6.E.9)

We commented there that logconcavity can thus be interpreted as an aging notion in reliability theory. Having a multivariate analog of the order ≥lr one can generalize (6.E.8) and (6.E.9) to the multivariate case, thus introducing notions which can be considered as multivariate analogs of distributions with logconcave densities. This can be done in several ways. In this subsection we show that various generalizations of (6.E.8) and (6.E.9) actually yield the same notion of multivariate PF2 distributions.

Let T be a nonnegative random vector. Recall from Section 6.B.6 the definition, the notation ht, and the comparison of histories associated with T. One possible multivariate analog of (6.E.8) is to require T to satisfy, for t ≤ s and histories ht and hs,

[(T − te)+ | ht] ≥lr [(T − se)+ | hs] whenever ht ≤ hs. (6.E.10)

Still another possible multivariate analog of (6.E.8) is to require T to satisfy, for t ≤ s,

[(T − te)+ | ht] ≥lr [(T − se)+ | hs] whenever ht and hs coincide on [0, t). (6.E.11)

An analog of (6.E.9) is to require T to satisfy (6.E.10) or (6.E.11) with t = 0, that is,

T ≥lr [(T − se)+ | hs] for any history hs, s ≥ 0. (6.E.12)

It turns out that these three conditions are equivalent. If we say that the nonnegative random vector T is multivariate PF2 if it satisfies (6.E.10), then we have the following result, the proof of which is similar to the proof of Theorem 6.D.11.

Theorem 6.E.17. Let T be a nonnegative random vector. The following three statements are equivalent.
(i) T is multivariate PF2.
(ii) T satisfies (6.E.11).
(iii) T satisfies (6.E.12).

6.F The Multivariate Mean Residual Life Order

6.F.1 Definition

Let T = (T1, T2, ..., Tm) be a nonnegative random vector with a finite mean vector. Consider a typical history of T at time t ≥ 0, which is of the form (see (6.B.25))

\[ h_t = \{\mathbf{T}_I = \mathbf{t}_I,\ \mathbf{T}_{\bar I} > t\mathbf{e}\}, \quad 0\mathbf{e} \le \mathbf{t}_I \le t\mathbf{e},\ I \subseteq \{1, 2, \dots, m\}. \tag{6.F.1} \]

Given the history ht as in (6.F.1), let $i \in \bar I$ be a component that is still alive at time t. Its multivariate mean residual life, at time t, is defined as follows:

\[ m_{i|I}(t \mid \mathbf{t}_I) = E[T_i - t \mid \mathbf{T}_I = \mathbf{t}_I,\ \mathbf{T}_{\bar I} > t\mathbf{e}], \tag{6.F.2} \]

where, of course, 0e ≤ tI ≤ te and I ⊆ {1, 2, ..., m}. Clearly, the smaller the mrl function is, the smaller T should be in some stochastic sense. This is the motivation for the order discussed in this section.

Let S be another nonnegative random vector with a finite mean vector. Denote its multivariate mean residual life functions by l·|·(·|·), where the l's are defined analogously to the m's in (6.F.2). Suppose that

li|I∪J(u | sI, sJ) ≤ mi|I(u | tI) whenever J ∩ I = ∅, sI ≤ tI ≤ ue, and sJ ≤ ue, (6.F.3)

where $i \in \overline{I \cup J}$. Then S is said to be smaller than T in the multivariate mean residual life order (denoted as S ≤mrl T).

The order ≤mrl is not an order in the usual sense; a comment, similar to the comment in Remark 6.B.5, applies to this order too. Explicitly, X ≤mrl X means that X has the positive dependence property of “mrl decreasing upon failure” discussed in Shaked and Shanthikumar [513].

Note that (6.F.3) can be written as

li(hu) ≤ mi(h′u) whenever hu ≥ h′u,

where i denotes a component that has not failed by time u in the history hu.

In the univariate case (m = 1) condition (6.F.3) reduces to (2.A.2) [with a different notation]. We have already seen that in the univariate case S1 ≤hr T1 =⇒ S1 ≤mrl T1. This is also true in the general multivariate case as will be shown in the next subsection.

6.F.2 The relation between the multivariate mean residual life and the dynamic multivariate hazard rate orders

Theorem 6.F.1. If S and T are two nonnegative random vectors with finite mean vectors such that S ≤dyn-hr T, then S ≤mrl T.

Proof. Select a t > 0 and two histories ht and h′t such that ht ≤ h′t. It is not hard to verify that if S ≤dyn-hr T, then [(S − te)+ | h′t] ≤dyn-hr [(T − te)+ | ht]. From Theorems 6.D.9 and 6.C.1 it is seen that if [(S − te)+ | h′t] ≤dyn-hr [(T − te)+ | ht], then [(S − te)+ | h′t] ≤st [(T − te)+ | ht]. Therefore, for a component i, which is still alive at time t in history h′t, we have li(h′t) = E[Si − t | h′t] ≤ E[Ti − t | ht] = mi(ht), that is, S ≤mrl T.


6.F.3 A property in reliability theory

Recall from Section 2.A.4 that a nonnegative random variable T with a finite mean is DMRL if, and only if, either one of the following equivalent conditions holds:

[T − t | T > t] ≥mrl [T − t′ | T > t′] whenever t ≤ t′, (6.F.4)

T ≥mrl [T − t | T > t] for all t ≥ 0. (6.F.5)

With the multivariate analog of the order ≥mrl one can generalize (6.F.4) and (6.F.5) to the multivariate case, thus introducing notions of multivariate DMRL distributions. This can be done in several ways. In this subsection we show that various generalizations of (6.F.4) and (6.F.5) actually yield the same notion of multivariate DMRL.

Let T be a nonnegative random vector with a finite mean vector. A possible multivariate analog of (6.F.4) is to require, for t ≤ s and histories ht and hs, that T satisfies

[(T − te)+ | ht] ≥mrl [(T − se)+ | hs] whenever ht ≤ hs. (6.F.6)

Still another possible multivariate analog of (6.F.4) is to require, for t ≤ s, that T satisfies

[(T − te)+ | ht] ≥mrl [(T − se)+ | hs] whenever ht and hs coincide on [0, t). (6.F.7)

An analog of (6.F.5) is to require that T satisfies (6.F.6) or (6.F.7) with t = 0, that is,

T ≥mrl [(T − se)+ | hs] for any history hs, s ≥ 0. (6.F.8)

It turns out that these three conditions are equivalent. If we say that the nonnegative random vector T is multivariate DMRL if it satisfies (6.F.6), then we have the following result, the proof of which is similar to the proof of Theorem 6.D.11 and is omitted.

Theorem 6.F.2. Let T be a nonnegative random vector with a finite mean vector. The following three statements are equivalent.
(i) T is multivariate DMRL.
(ii) T satisfies (6.F.7).
(iii) T satisfies (6.F.8).

6.G Other Multivariate Stochastic Orders

6.G.1 The orthant orders

The usual multivariate stochastic order, discussed in Section 6.B, is a possible multivariate generalization of (1.A.4) or (1.A.7). In this section we discuss a few other possible generalizations of the univariate order $\le_{\rm st}$ which are straightforward analogs of (1.A.1) and of (1.A.2). These generalizations yield orders that are strictly weaker than the usual multivariate stochastic order.

For a random vector $X = (X_1, X_2, \dots, X_n)$ with distribution function $F$, let $\overline{F}$ be the multivariate survival function of $X$, that is,
\[ \overline{F}(x_1, x_2, \dots, x_n) \equiv P\{X_1 > x_1, X_2 > x_2, \dots, X_n > x_n\} \quad \text{for all } x. \]
Let $Y$ be another $n$-dimensional random vector with distribution function $G$ and survival function $\overline{G}$. If
\[ \overline{F}(x_1, x_2, \dots, x_n) \le \overline{G}(x_1, x_2, \dots, x_n) \quad \text{for all } x, \tag{6.G.1} \]
then we say that $X$ is smaller than $Y$ in the upper orthant order (denoted by $X \le_{\rm uo} Y$). If
\[ F(x_1, x_2, \dots, x_n) \ge G(x_1, x_2, \dots, x_n) \quad \text{for all } x, \tag{6.G.2} \]
then we say that $X$ is smaller than $Y$ in the lower orthant order (denoted by $X \le_{\rm lo} Y$).

The reason for this terminology is that sets of the form $\{x : x_1 > a_1, x_2 > a_2, \dots, x_n > a_n\}$, for some fixed $a$, are called upper orthants, and sets of the form $\{x : x_1 \le a_1, x_2 \le a_2, \dots, x_n \le a_n\}$, for some fixed $a$, are called lower orthants. Note that (6.G.1) can be written as
\[ E[I_U(X)] \le E[I_U(Y)] \quad \text{for all upper orthants } U. \tag{6.G.3} \]
Similarly, (6.G.2) can be written as
\[ E[I_L(X)] \ge E[I_L(Y)] \quad \text{for all lower orthants } L. \tag{6.G.4} \]
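For random vectors with finitely many atoms, conditions (6.G.1) and (6.G.2) can be checked exhaustively, because both the survival function and the distribution function are step functions that only change at atom coordinates. The following is a minimal sketch under that assumption; the helper names (`survival`, `cdf`, `corner_grid`, `leq_uo`, `leq_lo`) and the two example distributions are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import product

def survival(pmf, x):
    """Upper orthant probability P{X_1 > x_1, ..., X_n > x_n}."""
    return sum(p for pt, p in pmf.items() if all(c > t for c, t in zip(pt, x)))

def cdf(pmf, x):
    """Lower orthant probability P{X_1 <= x_1, ..., X_n <= x_n}."""
    return sum(p for pt, p in pmf.items() if all(c <= t for c, t in zip(pt, x)))

def corner_grid(pmf_x, pmf_y):
    """Corners that need checking: per coordinate, every atom value of either
    distribution plus one value below the smallest atom."""
    n = len(next(iter(pmf_x)))
    axes = []
    for i in range(n):
        vals = sorted({pt[i] for pt in pmf_x} | {pt[i] for pt in pmf_y})
        axes.append([vals[0] - 1] + vals)
    return list(product(*axes))

def leq_uo(pmf_x, pmf_y):
    """Check (6.G.1): survival function of X below that of Y at every corner."""
    return all(survival(pmf_x, a) <= survival(pmf_y, a)
               for a in corner_grid(pmf_x, pmf_y))

def leq_lo(pmf_x, pmf_y):
    """Check (6.G.2): distribution function of X above that of Y at every corner."""
    return all(cdf(pmf_x, a) >= cdf(pmf_y, a)
               for a in corner_grid(pmf_x, pmf_y))

half = Fraction(1, 2)
comonotone = {(0, 0): half, (1, 1): half}        # both coordinates move together
countermonotone = {(0, 1): half, (1, 0): half}   # coordinates move in opposition

print(leq_uo(countermonotone, comonotone))  # True
print(leq_lo(comonotone, countermonotone))  # True
```

The two example vectors have the same Bernoulli marginals, yet they are ordered in opposite directions by the two orthant orders, which illustrates that neither order by itself forces the other.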

Let $\psi$ be an $n$-variate function of the form
\[ \psi(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} g_i(x_i), \quad (x_1, x_2, \dots, x_n) \in R^n, \]
where the $g_i$'s are univariate nonnegative increasing functions. Every such function can be approximated by positive linear combinations of indicator functions of upper orthants. Therefore, using (6.G.3), we obtain the first part of the next theorem. The other part can be obtained similarly using (6.G.4).

Theorem 6.G.1. Let $X$ and $Y$ be two $n$-dimensional random vectors. Then
(a) $X \le_{\rm uo} Y$ if, and only if,
\[ E\Big[\prod_{i=1}^{n} g_i(X_i)\Big] \le E\Big[\prod_{i=1}^{n} g_i(Y_i)\Big] \tag{6.G.5} \]
for every collection $\{g_1, g_2, \dots, g_n\}$ of univariate nonnegative increasing functions.
(b) $X \le_{\rm lo} Y$ if, and only if,
\[ E\Big[\prod_{i=1}^{n} h_i(X_i)\Big] \ge E\Big[\prod_{i=1}^{n} h_i(Y_i)\Big] \tag{6.G.6} \]
for every collection $\{h_1, h_2, \dots, h_n\}$ of univariate nonnegative decreasing functions.

For a real $n$-variate function $g$, the multivariate difference operator $\Delta$ is defined by
\[ \Delta_x^y g = \sum_{(\epsilon_1, \epsilon_2, \dots, \epsilon_n) \in \{0,1\}^n} (-1)^{\sum_{i=1}^{n} \epsilon_i}\, g\big(\epsilon_1 x_1 + (1 - \epsilon_1) y_1, \dots, \epsilon_n x_n + (1 - \epsilon_n) y_n\big), \]
where $x$ and $y$ are elements of $R^n$. The function $g$ is called $\Delta$-monotone if
\[ \Delta_x^y g \ge 0 \quad \text{whenever } x \le y. \]
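The difference operator is a signed sum of $g$ over the $2^n$ corners of the box with lower corner $x$ and upper corner $y$. A short sketch of that sum follows; the function name `delta` and the test functions are illustrative assumptions.

```python
from itertools import product

def delta(g, x, y):
    """Signed sum of g over the corners of the box [x, y].

    eps_i = 1 selects x_i and eps_i = 0 selects y_i; each corner carries
    the sign (-1) ** (number of coordinates taken from x).
    """
    total = 0
    for eps in product((0, 1), repeat=len(x)):
        corner = [e * xi + (1 - e) * yi for e, xi, yi in zip(eps, x, y)]
        total += (-1) ** sum(eps) * g(*corner)
    return total

# In one dimension the operator is an ordinary increment g(y) - g(x).
print(delta(lambda a: a ** 2, (1,), (3,)))           # 8

# For g(a, b) = a * b the difference factorizes as (y1 - x1) * (y2 - x2).
print(delta(lambda a, b: a * b, (1, 2), (3, 5)))     # 6

# The negated product gives a negative difference on the same box.
print(delta(lambda a, b: -a * b, (1, 2), (3, 5)))    # -6
```

On the nonnegative quadrant the product $g(a, b) = ab$ has nonnegative differences over every box, while its negative fails the condition on the box shown.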

Let $\mathcal{M}$ be the set of all $n$-variate functions that are $\Delta$-monotone in any of their $k$ coordinates when the other $n - k$ coordinates are held fixed, $1 \le k \le n$. It can be shown that if $\psi \in \mathcal{M}$ and $X \le_{\rm uo} Y$, then $E[\psi(X)] \le E[\psi(Y)]$. Every distribution function is a member of $\mathcal{M}$. Thus we have proven the first part of the following theorem. The other part can be shown similarly.

Theorem 6.G.2. Let $X$ and $Y$ be two $n$-dimensional random vectors. Then
(a) $X \le_{\rm uo} Y$ if, and only if,
\[ E[\psi(X)] \le E[\psi(Y)] \quad \text{for every distribution function } \psi. \tag{6.G.7} \]
(b) $X \le_{\rm lo} Y$ if, and only if,
\[ E[\psi(X)] \ge E[\psi(Y)] \quad \text{for every survival function } \psi. \tag{6.G.8} \]

It is clear, for example from Theorem 6.G.2, that
\[ X \le_{\rm st} Y \Longrightarrow X \le_{\rm uo} Y \text{ and } X \le_{\rm lo} Y. \tag{6.G.9} \]

Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two random vectors. Note that if $X \le_{\rm uo} Y$, or if $X \le_{\rm lo} Y$, then $X_i \le_{\rm st} Y_i$, $i = 1, 2, \dots, n$. It follows that
\[ X \le_{\rm uo} Y \Longrightarrow EX \le EY \]
and
\[ X \le_{\rm lo} Y \Longrightarrow EX \le EY. \]

The following closure properties of the orthant orders can be easily verified using (6.G.1)–(6.G.4).

Theorem 6.G.3.
(a) Let $(X_1, X_2, \dots, X_n)$ and $(Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If $(X_1, X_2, \dots, X_n) \le_{\rm uo}\ [\le_{\rm lo}]\ (Y_1, Y_2, \dots, Y_n)$, then $(g_1(X_1), g_2(X_2), \dots, g_n(X_n)) \le_{\rm uo}\ [\le_{\rm lo}]\ (g_1(Y_1), g_2(Y_2), \dots, g_n(Y_n))$ whenever $g_i : R \to R$ is an increasing function, $i = 1, 2, \dots, n$.
(b) Let $X_1, X_2, \dots, X_m$ be a set of independent random vectors where the dimension of $X_i$ is $k_i$, $i = 1, 2, \dots, m$. Let $Y_1, Y_2, \dots, Y_m$ be another set of independent random vectors where the dimension of $Y_i$ is $k_i$, $i = 1, 2, \dots, m$. If $X_i \le_{\rm uo}\ [\le_{\rm lo}]\ Y_i$ for $i = 1, 2, \dots, m$, then $(X_1, X_2, \dots, X_m) \le_{\rm uo}\ [\le_{\rm lo}]\ (Y_1, Y_2, \dots, Y_m)$. That is, the orthant orders are closed under conjunctions.
(c) Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If $X \le_{\rm uo}\ [\le_{\rm lo}]\ Y$, then $X_I \le_{\rm uo}\ [\le_{\rm lo}]\ Y_I$ for each $I \subseteq \{1, 2, \dots, n\}$. That is, the orthant orders are closed under marginalization.
(d) Let $\{X_j, j = 1, 2, \dots\}$ and $\{Y_j, j = 1, 2, \dots\}$ be two sequences of random vectors such that $X_j \to_{\rm st} X$ and $Y_j \to_{\rm st} Y$ as $j \to \infty$, where $\to_{\rm st}$ denotes convergence in distribution. If $X_j \le_{\rm uo}\ [\le_{\rm lo}]\ Y_j$, $j = 1, 2, \dots$, then $X \le_{\rm uo}\ [\le_{\rm lo}]\ Y$.
(e) Let $X$, $Y$, and $\Theta$ be random vectors such that $[X \mid \Theta = \theta] \le_{\rm uo}\ [\le_{\rm lo}]\ [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Then $X \le_{\rm uo}\ [\le_{\rm lo}]\ Y$. That is, the orthant orders are closed under mixtures.

From parts (a) and (e) of Theorem 6.G.3 we obtain the following corollary.

Corollary 6.G.4. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two random vectors such that $X \le_{\rm uo}\ [\le_{\rm lo}]\ Y$, and let $Z$ be an $m$-dimensional random vector which is independent of $X$ and $Y$. Then
\[ (h_1(X_1, Z), h_2(X_2, Z), \dots, h_n(X_n, Z)) \le_{\rm uo}\ [\le_{\rm lo}]\ (h_1(Y_1, Z), h_2(Y_2, Z), \dots, h_n(Y_n, Z)), \]
whenever $h_i(x, z)$, $i = 1, 2, \dots, n$, are increasing in $x$ for every $z$.

By applying Corollary 6.G.4 twice (letting $Z$ there be an $n$-dimensional random vector, and letting each $h_i$ depend only on its first argument and on the $i$th component of the second argument, $i = 1, 2, \dots, n$), we get the following result. A strengthening of the following result is Theorem 6.G.18 below.

Theorem 6.G.5. Let $X$, $Y$, $Z$, and $W$ be $n$-dimensional random vectors such that $X$ and $Z$ are independent and $Y$ and $W$ are independent. Let $c_i : [0, \infty)^2 \to [0, \infty)$ be a continuous increasing function, $i = 1, 2, \dots, n$. If $X \le_{\rm uo}\ [\le_{\rm lo}]\ Y$ and $Z \le_{\rm uo}\ [\le_{\rm lo}]\ W$, then
\[ (c_1(X_1, Z_1), c_2(X_2, Z_2), \dots, c_n(X_n, Z_n)) \le_{\rm uo}\ [\le_{\rm lo}]\ (c_1(Y_1, W_1), c_2(Y_2, W_2), \dots, c_n(Y_n, W_n)). \]
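Theorem 6.G.5 can be illustrated numerically on small discrete vectors by taking $c_i(x, z) = x + z$ or $c_i(x, z) = \max(x, z)$, both continuous and increasing on $[0, \infty)^2$. The sketch below assumes finitely supported vectors; the helper names and the four example distributions are illustrative only.

```python
from fractions import Fraction
from itertools import product

def survival(pmf, x):
    """P{X_1 > x_1, ..., X_n > x_n} for a pmf given as {point: probability}."""
    return sum(p for pt, p in pmf.items() if all(c > t for c, t in zip(pt, x)))

def leq_uo(pmf_x, pmf_y):
    """Exhaustive check of the upper orthant order on the grid of atom values."""
    n = len(next(iter(pmf_x)))
    axes = []
    for i in range(n):
        vals = sorted({pt[i] for pt in pmf_x} | {pt[i] for pt in pmf_y})
        axes.append([vals[0] - 1] + vals)
    return all(survival(pmf_x, a) <= survival(pmf_y, a) for a in product(*axes))

def combine(pmf_a, pmf_b, c):
    """pmf of (c(A_1, B_1), ..., c(A_n, B_n)) for independent A and B."""
    out = {}
    for (a, pa), (b, pb) in product(pmf_a.items(), pmf_b.items()):
        pt = tuple(c(ai, bi) for ai, bi in zip(a, b))
        out[pt] = out.get(pt, 0) + pa * pb
    return out

half = Fraction(1, 2)
X = {(0, 1): half, (1, 0): half}
Y = {(0, 0): half, (1, 1): half}
Z = {(0, 0): half, (1, 0): half}
W = {(0, 0): half, (1, 1): half}

# Hypotheses of the theorem for the upper orthant order.
print(leq_uo(X, Y), leq_uo(Z, W))  # True True

# Conclusion for componentwise sums and componentwise maxima.
add = lambda u, v: u + v
print(leq_uo(combine(X, Z, add), combine(Y, W, add)))  # True
print(leq_uo(combine(X, Z, max), combine(Y, W, max)))  # True
```

Exact rational arithmetic via `Fraction` avoids spurious failures from floating-point rounding when the two survival functions coincide at a corner.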

Example 6.G.6. Consider an $n$-dimensional Markov chain $\{X_k = (X_{k,1}, \dots, X_{k,n}), k \ge 0\}$ defined by $X_0 = (0, \dots, 0)$ and
\[ X_{k+1} = \big(g_1(X_{k,1}, U^1_{k,1}, \dots, U^m_{k,1}), \dots, g_n(X_{k,n}, U^1_{k,n}, \dots, U^m_{k,n})\big), \quad n \ge 1, \]
where, for each $1 \le l \le m$, the random vectors $U^l_k = (U^l_{k,1}, \dots, U^l_{k,n})$, $k = 1, 2, \dots$, are independent and identically distributed, and the $g_i$'s are some deterministic $(m + 1)$-dimensional functions. Consider another $n$-dimensional Markov chain $\{Y_k = (Y_{k,1}, \dots, Y_{k,n}), k \ge 0\}$ similarly defined by $Y_0 = (0, \dots, 0)$ and
\[ Y_{k+1} = \big(g_1(Y_{k,1}, V^1_{k,1}, \dots, V^m_{k,1}), \dots, g_n(Y_{k,n}, V^1_{k,n}, \dots, V^m_{k,n})\big), \quad n \ge 1, \]
where, for each $1 \le l \le m$, the random vectors $V^l_k = (V^l_{k,1}, \dots, V^l_{k,n})$, $k = 1, 2, \dots$, are independent and identically distributed. If the $g_i$'s are increasing in their $m + 1$ arguments, if $U^l = \{U^l_k, k \ge 0\}$, $l = 1, \dots, m$, are independent, if $V^l = \{V^l_k, k \ge 0\}$, $l = 1, \dots, m$, are independent, and if $U^l_k \le_{\rm uo}\ [\le_{\rm lo}]\ V^l_k$, $l = 1, \dots, m$, $k \ge 0$, then, for each $k \ge 0$ we have

\[ (X_0, \dots, X_k) \le_{\rm uo}\ [\le_{\rm lo}]\ (Y_0, \dots, Y_k). \]
The proof uses Theorem 6.G.5, Corollary 6.G.4, and Theorem 6.G.3(b). We omit the details.

Another preservation property of the orthant orders is described in the next theorem. In the following theorem we define $\sum_{j=1}^{0} x_j \equiv 0$ for any sequence $\{x_j, j = 1, 2, \dots\}$. Similar results are Theorems 9.A.6 and 9.A.14.

Theorem 6.G.7. Let $X_j = (X_{j,1}, X_{j,2}, \dots, X_{j,m})$, $j = 1, 2, \dots$, be a sequence of nonnegative random vectors, and let $M = (M_1, M_2, \dots, M_m)$ and $N = (N_1, N_2, \dots, N_m)$ be two vectors of nonnegative integer-valued random variables. Assume that both $M$ and $N$ are independent of the $X_j$'s. If $M \le_{\rm uo}\ [\le_{\rm lo}]\ N$, then
\[ \Big(\sum_{j=1}^{M_1} X_{j,1}, \sum_{j=1}^{M_2} X_{j,2}, \dots, \sum_{j=1}^{M_m} X_{j,m}\Big) \le_{\rm uo}\ [\le_{\rm lo}]\ \Big(\sum_{j=1}^{N_1} X_{j,1}, \sum_{j=1}^{N_2} X_{j,2}, \dots, \sum_{j=1}^{N_m} X_{j,m}\Big). \]

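Proof. We only give the proof for the upper orthant order; the proof for the lower orthant order is similar. For $t = (t_1, t_2, \dots, t_m)$ we have
\begin{align*}
P\Big\{\bigcap_{i=1}^{m} \Big\{\sum_{j=1}^{M_i} X_{j,i} > t_i\Big\}\Big\}
&= \sum_{n_1=0}^{\infty} \sum_{n_2=0}^{\infty} \cdots \sum_{n_m=0}^{\infty} P\Big\{\bigcap_{i=1}^{m} \Big\{\sum_{j=1}^{n_i} X_{j,i} \le t_i < \sum_{j=1}^{n_i+1} X_{j,i}\Big\}\Big\} \times P\{M > (n_1, n_2, \dots, n_m)\} \\
&\le \sum_{n_1=0}^{\infty} \sum_{n_2=0}^{\infty} \cdots \sum_{n_m=0}^{\infty} P\Big\{\bigcap_{i=1}^{m} \Big\{\sum_{j=1}^{n_i} X_{j,i} \le t_i < \sum_{j=1}^{n_i+1} X_{j,i}\Big\}\Big\} \times P\{N > (n_1, n_2, \dots, n_m)\} \\
&= P\Big\{\bigcap_{i=1}^{m} \Big\{\sum_{j=1}^{N_i} X_{j,i} > t_i\Big\}\Big\}.
\end{align*}
The inequality follows from $M \le_{\rm uo} N$, since each $P\{M > (n_1, n_2, \dots, n_m)\}$ is an upper orthant probability.

[…] $[X \mid X > x] \le_{\rm uo} [Y \mid Y > x]$ for all $x \in R^n$ for which these conditional random vectors are well defined.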

It follows from Theorem 6.G.9 that
\[ X \le_{\rm whr} Y \Longrightarrow X \le_{\rm uo} Y; \tag{6.G.10} \]
this is a multivariate generalization of Theorem 1.B.1.

An interesting relationship between the order $\le_{\rm uo}$ and the orders $\le^{S}_{m\text{-cx}}$ and $\le_{m\text{-icx}}$ (defined in Sections 3.A.5 and 4.A.7, respectively) is given in the next theorem.

Theorem 6.G.10. Let $X = (X_1, X_2, \dots, X_m)$ and $Y = (Y_1, Y_2, \dots, Y_m)$ be random vectors such that the $(m - 1)$st moment exists for each $X_i$ and $Y_i$, $i = 1, 2, \dots, m$.
(a) If $X \le_{\rm uo} Y$, and if $E\big[\big(\sum_{i=1}^{m} X_i\big)^k\big] = E\big[\big(\sum_{i=1}^{m} Y_i\big)^k\big]$, $k = 1, 2, \dots, m - 1$, then $\sum_{i=1}^{m} X_i \le^{S}_{m\text{-cx}} \sum_{i=1}^{m} Y_i$, where $S$ is the assumed common support of $\sum_{i=1}^{m} X_i$ and of $\sum_{i=1}^{m} Y_i$, and $S$ is also assumed to be an interval.
(b) If $X \le_{\rm uo} Y$, and if $X$ and $Y$ are nonnegative, then $\sum_{i=1}^{m} X_i \le_{m\text{-icx}} \sum_{i=1}^{m} Y_i$.

It is of interest to compare Theorem 6.G.10 with Theorem 7.A.30 and with implication (9.A.19).

The following example gives sufficient conditions for the comparison of multivariate normal random vectors. See Examples 6.B.29, 7.A.13, 7.A.26, 7.A.39, 7.B.5, and 9.A.20 for related results.

Example 6.G.11. Let $X$ be a multivariate normal random vector with mean vector $\mu_X$ and variance-covariance matrix $\Sigma$, and let $Y$ be a multivariate normal random vector with mean vector $\mu_Y$ and variance-covariance matrix $\Sigma + D$, where $D$ is a matrix with zero diagonal elements such that $\Sigma + D$ is nonnegative definite. If $\mu_X \le \mu_Y$ and $D \ge 0$, then $X \le_{\rm uo} Y$.

The following results give conditions that ensure stochastic equality; see Theorems 1.A.8, 3.A.43, 3.A.60, 4.A.69, 5.A.15, 6.B.19, and 7.A.14–7.A.16 for similar results. First, in the bivariate case ($n = 2$) we have the following result; its proof is not given here since it is a special case of Theorem 6.G.13.

Theorem 6.G.12. Let $X = (X_1, X_2)$ and $Y = (Y_1, Y_2)$ be two bivariate random vectors. If $X_1 =_{\rm st} Y_1$, $X_2 =_{\rm st} Y_2$, $X \le_{\rm uo} Y$, and $X \le_{\rm lo} Y$, then $X =_{\rm st} Y$.

Note that when $n = 2$, Theorem 6.B.19 is a special case of Theorem 6.G.12, as can be seen from (6.G.9). If $n \ge 3$, then the conclusion of Theorem 6.G.12 need not hold. The following theorem gives conditions under which the conclusion $X =_{\rm st} Y$ holds.

Theorem 6.G.13. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two random vectors with distributions and survival functions $F$, $\overline{F}$, $G$, and $\overline{G}$, respectively. If the $m$-dimensional marginals of $X$ and $Y$ are equal ($m \le n - 1$) and if $X \le_{\rm uo} Y$, that is,
\[ \overline{F}(x) \le \overline{G}(x) \quad \text{for all } x \in R^n, \tag{6.G.11} \]
and if
\[ (-1)^n F(x) \ge (-1)^n G(x) \quad \text{for all } x \in R^n, \tag{6.G.12} \]
then $X =_{\rm st} Y$.

Proof. Write
\begin{align*}
\overline{F}(x) &= 1 - \sum_{i} P\{X_i \le x_i\} + \sum_{i < j} P\{X_i \le x_i, X_j \le x_j\} - \cdots + (-1)^n F(x) \\
&\ge 1 - \sum_{i} P\{Y_i \le x_i\} + \sum_{i < j} P\{Y_i \le x_i, Y_j \le x_j\} - \cdots + (-1)^n G(x) \\
&= \overline{G}(x), \quad x \in R^n,
\end{align*}
where the equality of the $m$-dimensional marginals and also assumption (6.G.12) were used. Thus we get that for each $x \in R^n$, $\overline{F}(x) \ge \overline{G}(x)$. This, together with (6.G.11), yields the stated result.
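The proof of Theorem 6.G.13 rests on the inclusion-exclusion expansion of the survival function in terms of the distribution functions of all marginals. The sketch below checks that identity on a small trivariate example; the function names and the example distribution are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

def marginal_cdf(pmf, idx, x):
    """P{X_i <= x_i for every i in idx}; equals 1 for the empty index set."""
    return sum(p for pt, p in pmf.items() if all(pt[i] <= x[i] for i in idx))

def survival_direct(pmf, x):
    """P{X_1 > x_1, ..., X_n > x_n} computed straight from the atoms."""
    return sum(p for pt, p in pmf.items() if all(c > t for c, t in zip(pt, x)))

def survival_by_inclusion_exclusion(pmf, x):
    """Alternating sum over all index sets I of (-1)^|I| P{X_I <= x_I}."""
    n = len(x)
    return sum((-1) ** r * marginal_cdf(pmf, idx, x)
               for r in range(n + 1)
               for idx in combinations(range(n), r))

quarter = Fraction(1, 4)
pmf = {(0, 0, 0): quarter, (1, 0, 1): quarter,
       (1, 1, 0): quarter, (2, 1, 1): quarter}

identity_holds = all(
    survival_direct(pmf, x) == survival_by_inclusion_exclusion(pmf, x)
    for x in product(range(-1, 3), repeat=3)
)
print(identity_holds)  # True
```

In the expansion, every term except the last depends only on marginals of dimension at most $n - 1$, which is why equality of those marginals reduces the comparison of $\overline{F}$ and $\overline{G}$ to the comparison of $(-1)^n F$ and $(-1)^n G$.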

An interesting relationship between the orders $\le_{\rm lo}$ and $\le_{\rm Lt}$ (see Section 5.A) is revealed in the following theorem.

Theorem 6.G.14. Let $X$ and $Y$ be two nonnegative random vectors. If $(X_1, X_2, \dots, X_n) \le_{\rm lo} (Y_1, Y_2, \dots, Y_n)$, then
\[ \sum_{i=1}^{n} a_i X_i \le_{\rm Lt} \sum_{i=1}^{n} a_i Y_i \quad \text{whenever } a_i \ge 0,\ i = 1, 2, \dots, n. \]

Proof. Select an $s \ge 0$ and $a_i \ge 0$, $i = 1, 2, \dots, n$. The function $g_i$ defined by $g_i(x) = \exp\{-a_i s x\}$ is decreasing and nonnegative. Therefore, from (6.G.6), we obtain that
\[ E \exp\Big\{-s \sum_{i=1}^{n} a_i X_i\Big\} \ge E \exp\Big\{-s \sum_{i=1}^{n} a_i Y_i\Big\} \quad \text{for all } s \ge 0, \]
and this yields the stated result.
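The inequality between Laplace transforms in the proof can be evaluated directly for finitely supported vectors. The following sketch uses two bivariate vectors with $X \le_{\rm lo} Y$ (their distribution functions satisfy $F \ge G$ everywhere); the function name, the weight vectors, and the grid of $s$ values are illustrative assumptions, and a finite grid can only spot-check the inequality.

```python
import math

def laplace_of_weighted_sum(pmf, a, s):
    """E exp{-s * sum_i a_i X_i} for a pmf given as {point: probability}."""
    return sum(p * math.exp(-s * sum(ai * xi for ai, xi in zip(a, pt)))
               for pt, p in pmf.items())

# X <=_lo Y: the distribution function of X dominates that of Y.
X = {(0, 0): 0.5, (1, 1): 0.5}
Y = {(0, 1): 0.5, (1, 0): 0.5}

weights = [(1.0, 1.0), (0.5, 2.0), (3.0, 0.0)]
s_values = [0.0, 0.1, 1.0, 5.0]

all_hold = all(
    laplace_of_weighted_sum(X, a, s) >= laplace_of_weighted_sum(Y, a, s) - 1e-12
    for a in weights for s in s_values
)
print(all_hold)  # True
```

For this pair the inequality reduces to $(1 - e^{-s a_1})(1 - e^{-s a_2}) \ge 0$, which explains why it holds for every nonnegative choice of weights.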

6.G.2 The scaled order statistics orders

Consider now nonnegative random vectors $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$. For any $z = (z_1, z_2, \dots, z_n)$ denote by $z_{(k)} = (z_1, z_2, \dots, z_n)_{(k)}$ the $k$th smallest $z_i$ in $\{z_1, z_2, \dots, z_n\}$. Thus, for a random vector $Z = (Z_1, Z_2, \dots, Z_n)$, the $k$th order statistic of $Z_1, Z_2, \dots, Z_n$ is $Z_{(k)} = (Z_1, Z_2, \dots, Z_n)_{(k)}$. In particular, $Z_{(1)} = \min\{Z_1, Z_2, \dots, Z_n\}$ and $Z_{(n)} = \max\{Z_1, Z_2, \dots, Z_n\}$. The next result describes the orders $\le_{\rm uo}$ and $\le_{\rm lo}$ in a new fashion when the underlying random vectors are nonnegative (see Theorem 6.D.7 for a related result).

Theorem 6.G.15. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative random vectors. Then
(a) $X \le_{\rm uo} Y$ if, and only if,
\[ \min\{a_1 X_1, \dots, a_n X_n\} \le_{\rm st} \min\{a_1 Y_1, \dots, a_n Y_n\} \tag{6.G.13} \]
whenever $a_i > 0$, $i = 1, 2, \dots, n$.
(b) $X \le_{\rm lo} Y$ if, and only if,
\[ \max\{a_1 X_1, \dots, a_n X_n\} \le_{\rm st} \max\{a_1 Y_1, \dots, a_n Y_n\} \tag{6.G.14} \]
whenever $a_i > 0$, $i = 1, 2, \dots, n$.

Proof. Condition (6.G.13) is the same as
\[ \overline{F}\Big(\frac{t}{a_1}, \frac{t}{a_2}, \dots, \frac{t}{a_n}\Big) \le \overline{G}\Big(\frac{t}{a_1}, \frac{t}{a_2}, \dots, \frac{t}{a_n}\Big) \]
whenever $t \ge 0$, $a_i > 0$, $i = 1, 2, \dots, n$, which is the same as
\[ \overline{F}(t_1, t_2, \dots, t_n) \le \overline{G}(t_1, t_2, \dots, t_n) \tag{6.G.15} \]
whenever $t_i > 0$, $i = 1, 2, \dots, n$. Using standard limiting arguments it is seen that (6.G.15) is the same as $X \le_{\rm uo} Y$. This proves (a). The proof of (b) is similar.
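Conditions (6.G.13) and (6.G.14) compare univariate distributions of scaled minima and maxima, so for finitely supported vectors each comparison is a finite check. The sketch below tries a small set of scalings; the helper names, the example vectors, and the choice of scalings are assumptions, and a finite set of scalings can refute the conditions but cannot establish them for all $a_i > 0$.

```python
from fractions import Fraction
from itertools import product

def scaled_extreme_pmf(pmf, a, extreme):
    """pmf of extreme{a_1 X_1, ..., a_n X_n}, with extreme being min or max."""
    out = {}
    for pt, p in pmf.items():
        v = extreme(ai * xi for ai, xi in zip(a, pt))
        out[v] = out.get(v, 0) + p
    return out

def leq_st(pmf_u, pmf_v):
    """Univariate usual stochastic order: P{U > t} <= P{V > t} at every atom t."""
    def tail(pmf, t):
        return sum(p for v, p in pmf.items() if v > t)
    return all(tail(pmf_u, t) <= tail(pmf_v, t) for t in set(pmf_u) | set(pmf_v))

half = Fraction(1, 2)
X = {(1, 2): half, (2, 1): half}   # X <=_uo Y and Y <=_lo X for this pair
Y = {(1, 1): half, (2, 2): half}

scalings = list(product([Fraction(1, 2), Fraction(1), Fraction(3)], repeat=2))

min_ok = all(leq_st(scaled_extreme_pmf(X, a, min), scaled_extreme_pmf(Y, a, min))
             for a in scalings)
max_ok = all(leq_st(scaled_extreme_pmf(Y, a, max), scaled_extreme_pmf(X, a, max))
             for a in scalings)
print(min_ok, max_ok)  # True True
```

Checking the tails only at atoms is enough here because both tail functions are right-continuous step functions that equal 1 below the smallest atom.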

Theorem 6.G.15 suggests the following class of orders which contains the orders $\le_{\rm uo}$ and $\le_{\rm lo}$ as special cases. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative random vectors. Suppose that
\[ (a_1 X_1, a_2 X_2, \dots, a_n X_n)_{(k)} \le_{\rm st} (a_1 Y_1, a_2 Y_2, \dots, a_n Y_n)_{(k)} \tag{6.G.16} \]
whenever $a_i > 0$, $i = 1, 2, \dots, n$. Then we say that $X$ is smaller than $Y$ in the $k$th scaled order statistic order (denoted by $X \le_{(k)} Y$), $k = 1, 2, \dots, n$. So $X \le_{\rm uo} Y \Longleftrightarrow X \le_{(1)} Y$ and $X \le_{\rm lo} Y \Longleftrightarrow X \le_{(n)} Y$.

The next theorem identifies a rich class of functions $\psi$ such that $E[\psi(X)] \le E[\psi(Y)]$ whenever $X \le_{(k)} Y$. First we need to introduce some notation. For $m \in \{1, 2, \dots, n\}$ let $\mathcal{A}_m$ be the set of all subsets of $\{1, 2, \dots, n\}$ of size $m$. As in Section 6.A, for $I = \{i_1, i_2, \dots, i_m\} \in \mathcal{A}_m$ and a vector $x = (x_1, x_2, \dots, x_n)$, we denote $x_I = (x_{i_1}, x_{i_2}, \dots, x_{i_m})$. Let $\mathcal{M}_{1,n}$ denote the class of all distribution functions corresponding to nonnegative finite measures on $R^n_+$. For $x \in R^n_+$, $I \in \mathcal{A}_m$, and $\psi \in \mathcal{M}_{1,n}$, we denote
\[ \tilde{\psi}(x_I, \infty e) = \lim_{x_{\bar{I}} \to \infty e} \psi(x_1, x_2, \dots, x_n), \]
where $\bar{I}$ is the complement of $I$ in $\{1, 2, \dots, n\}$. For $k \in \{1, 2, \dots, n\}$ let $\mathcal{M}_{k,n}$ be the class of functions $\phi : R^n_+ \to R$ of the form
\[ \phi(x_1, x_2, \dots, x_n) = \sum_{m=n-k+1}^{n} (-1)^{m-n+k-1} \binom{m-1}{n-k} \sum_{I \in \mathcal{A}_m} \tilde{\psi}(x_I, \infty e), \]
for some $\psi \in \mathcal{M}_{1,n}$, where $\sum_{I \in \mathcal{A}_m}$ denotes the sum over all the $\binom{n}{m}$ elements of $\mathcal{A}_m$. Note that for $k = 1$ the two definitions of $\mathcal{M}_{1,n}$ coincide. The proof of the next result is not given here; it can be found elsewhere.

Theorem 6.G.16. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative random vectors. Then $X \le_{(k)} Y$ if, and only if, $E[\phi(X)] \le E[\phi(Y)]$ for every $\phi \in \mathcal{M}_{k,n}$ for which the expectations exist.

Note that both parts of Theorem 6.G.2 are special cases of Theorem 6.G.16.

The orders $\le_{(k)}$ are closed under general monotone increasing transformations as the following theorem shows. The proof is easy and is omitted.

Theorem 6.G.17. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative random vectors. Let $b_i : R_+ \to R_+$ be a right continuous increasing function, $i = 1, 2, \dots, n$. If $X \le_{(k)} Y$, then
\[ \big(b_1(X_1), b_2(X_2), \dots, b_n(X_n)\big) \le_{(k)} \big(b_1(Y_1), b_2(Y_2), \dots, b_n(Y_n)\big). \]

The orders $\le_{(k)}$ also satisfy the following general closure property, the proof of which can be found elsewhere and is omitted.

Theorem 6.G.18. Let $X$, $Y$, $Z$, and $W$ be $n$-dimensional nonnegative random vectors such that $X$ and $Z$ are independent, and $Y$ and $W$ are independent. Let $c_i : R^2_+ \to R_+$ be a right continuous increasing function, $i = 1, 2, \dots, n$. If $X \le_{(k)} Y$ and $Z \le_{(k)} W$, then
\[ \big(c_1(X_1, Z_1), c_2(X_2, Z_2), \dots, c_n(X_n, Z_n)\big) \le_{(k)} \big(c_1(Y_1, W_1), c_2(Y_2, W_2), \dots, c_n(Y_n, W_n)\big). \]

From Theorem 6.G.18 we obtain the following two results as corollaries.

Theorem 6.G.19. Let $X$, $Y$, $Z$, and $W$ be $n$-dimensional nonnegative random vectors such that $X$ and $Z$ are independent and $Y$ and $W$ are independent. If $X \le_{(k)} Y$ and $Z \le_{(k)} W$, then $X + Z \le_{(k)} Y + W$; that is, the orders $\le_{(k)}$ are closed under convolutions.

Theorem 6.G.20. Let $X$, $Y$, $Z$, and $W$ be $n$-dimensional nonnegative random vectors such that $X$ and $Z$ are independent and $Y$ and $W$ are independent. If $X \le_{(k)} Y$ and $Z \le_{(k)} W$, then
\[ (\min(X_1, Z_1), \min(X_2, Z_2), \dots, \min(X_n, Z_n)) \le_{(k)} (\min(Y_1, W_1), \min(Y_2, W_2), \dots, \min(Y_n, W_n)) \]
and
\[ (\max(X_1, Z_1), \max(X_2, Z_2), \dots, \max(X_n, Z_n)) \le_{(k)} (\max(Y_1, W_1), \max(Y_2, W_2), \dots, \max(Y_n, W_n)). \]

The next result states a closure under marginalization property. In its statement $X^{(i)}$ denotes $(X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_n)$ and $Y^{(i)}$ denotes $(Y_1, \dots, Y_{i-1}, Y_{i+1}, \dots, Y_n)$, $i = 1, 2, \dots, n$.

Theorem 6.G.21. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative random vectors. Suppose that $X \le_{(k)} Y$.
(a) If $1 < k \le n$, then $X^{(i)} \le_{(k-1)} Y^{(i)}$.
(b) If $X$ and $Y$ are positive with probability one and if $1 \le k \le n - 1$, then $X^{(i)} \le_{(k)} Y^{(i)}$.

It is clear from (6.G.16) that $X \le_{\rm st} Y \Longrightarrow X \le_{(k)} Y$.

Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative random vectors. By letting $k - 1$ of the $a_i$'s in (6.G.16) go to 0, and by letting $n - k$ of the other $a_i$'s go to $\infty$, it is seen that if $X \le_{(k)} Y$, then $X_i \le_{\rm st} Y_i$, $i = 1, 2, \dots, n$ (this fact can also be obtained from Theorem 6.G.21). It follows that
\[ X \le_{(k)} Y \Longrightarrow EX \le EY. \]
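A small worked case may help fix ideas about the class $\mathcal{M}_{k,n}$ used in Theorem 6.G.16. Take $n = k = 2$; the random vector $(V_1, V_2)$ below is a name introduced only for this illustration. The sum over $m$ then has the two terms $m = 1$ and $m = 2$, with binomial coefficients $\binom{0}{0} = \binom{1}{0} = 1$ and signs $+1$ and $-1$, so that
\[ \phi(x_1, x_2) = \tilde{\psi}(x_1, \infty) + \tilde{\psi}(\infty, x_2) - \psi(x_1, x_2). \]
If $\psi(x_1, x_2) = P\{V_1 \le x_1, V_2 \le x_2\}$ is the distribution function of a probability measure, then by inclusion-exclusion
\[ \phi(x_1, x_2) = P\{V_1 \le x_1 \text{ or } V_2 \le x_2\} = 1 - P\{V_1 > x_1, V_2 > x_2\}. \]
Hence $E[\phi(X)] \le E[\phi(Y)]$ holds exactly when $E[\overline{\psi}(X)] \ge E[\overline{\psi}(Y)]$ for the survival function $\overline{\psi}$ of $(V_1, V_2)$, which is condition (6.G.8) for the order $\le_{\rm lo} = \le_{(2)}$.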

6.H Complements

Section 6.B: Many of the results described in Section 6.B can be found, or are alluded to, in Marshall and Olkin [383]. For example, the result given in Theorem 6.B.2 can be found there. Some studies of so-called integral stochastic orders, which have as their starting point relations such as (6.B.4) or (6.G.7), can be found in Marshall [382], in Mosler and Scarsini [400], in Müller [408], and in Dubra, Maccheroni, and Ok [172]. A proof of the fact that the usual stochastic order is equivalent to an almost sure construction (Theorem 6.B.1) can be found in Kamae, Krengel, and O’Brien [272], where this result is obtained for spaces that are more general than $R^n$. Theorem 6.B.3 was obtained originally in Veinott [556], but various versions of it appear elsewhere and it is often rediscovered; Shanthikumar [527] has identified a condition that is weaker than (6.B.8)–(6.B.10) and which still implies $X \le_{\rm st} Y$. A standard reference for notions of positive dependence such as association and CIS is Barlow and Proschan [36]. The condition under which CIS random vectors are stochastically ordered (Theorem 6.B.4) can be found, for example, in Langberg [332]. An extension of Theorem 6.B.4 can be found in Shanthikumar [527]. The notation $\le_{\rm sst}$ and the result in Remark 6.B.5 are taken from Li, Scarsini, and Shaked [348]. The characterization of the CIS notion for bivariate distribution functions with uniform[0, 1] margins (Remark 6.B.6) is taken from Nelsen [431, Corollary 5.2.11], where this result is derived in the context of copulas. The notion of positive dependence WCIS is introduced in Cohen and Sackrowitz [131], from which Theorem 6.B.7 is taken. The fact that association, together with the monotonicity of the ratio of the densities, implies the multivariate usual stochastic order (Theorem 6.B.8), is essentially proved in Proposition 2.6 of Perlman and Olkin [457]. The stochastic monotonicity of a random vector conditioned on the sum of its elements (Theorem 6.B.9) is taken from Efron [181], who credited it to Karlin; extensions of it can be found in Shanthikumar [527] as well as in Efron [181]. This theorem is put into the context of queuing theory in Daduna and Szekli [137]. The result which gives conditions, by means of the univariate down shifted likelihood ratio order, under which a random vector is stochastically increasing in its given sum (Theorem 6.B.10) can be found in Liggett [360].
The results that involve the stochastic monotonicity of a random vector conditioned on some of its order statistics (Theorems 6.B.11 and 6.B.12) are taken from Block, Bueno, Savits, and Shaked [91] and from Shanthikumar [527]; related results can be found in Bueno [113] and in Joag-Dev [257]. The stochastic monotonicity of the order statistics, of heterogeneous exponential random variables, in the ﬁrst order statistic (Theorem 6.B.13), is a strengthening of a result of Kochar and Korwar [314]; its conclusion also holds if it is merely assumed that X1 , X2 , . . . , Xm have proportional hazard functions (rather than having exponential distributions). The stochastic comparison of random vectors with a common copula (Theorem 6.B.14) can be found in Scarsini [491]; an extension of it is given in Li, Scarsini, and Shaked [348]. The result on the comparison of the vector of partial sums (Theorem 6.B.15) is taken from Boland, Proschan, and Tong [100], where the counterexample, mentioned after the theorem, can also be found. Some extensions of this result are given in Shaked, Shanthikumar, and Tong [519]. The result which compares random sums (Theorem 6.B.17) is taken from Pellerey [451], whereas the comparison of mixtures result (Theorem 6.B.18) is taken from Denuit

and Müller [157]. The conditions for stochastic equality (Theorem 6.B.19) can be found in Baccelli and Makowski [27]. The proof of Theorem 6.B.19 that is given in Section 6.B.5 follows the ideas of Scarsini and Shaked [494]. Lemma 2.1 of Costantini and Pasqualucci [135] is an interesting variation of Theorem 6.B.19. The characterizations of the usual stochastic order given in Theorems 6.B.20 and 6.B.21 are taken from Scarsini and Shaked [495]. The comparisons of order statistics, given in Theorem 6.B.23 and Corollary 6.B.24, can be found in Mi and Shaked [395]. These comparisons extend some results of Nanda and Shaked [428] and of Belzunce, Franco, Ruiz, and Ruiz [66, Corollary 3.2]; see a related result in Belzunce, Mercader, and Ruiz [70]. The result that is given in Example 6.B.25 is stated in Bartoszewicz [39], but without a detailed proof; an extension of it is given in Belzunce, Mercader, and Ruiz [70]. The usual stochastic order comparisons of vectors of order statistics of Gamma and Weibull random variables with different scale parameters (Examples 6.B.26 and 6.B.27) are taken from Hu [229] and from Sun and Zhang [543]. Several other examples of this kind can be found in Hu [229], and a general method for identifying such examples can be found in Hu [230]. The conditions under which infinitely divisible random vectors are comparable in the usual multivariate stochastic order (Example 6.B.28) can be found in Samorodnitsky and Taqqu [487]; see also Braverman [108] who has mistakenly confused the usual stochastic order with the upper orthant order. The necessary and sufficient conditions for the comparison of multivariate normal random vectors (Example 6.B.29) can be found in Müller [413]; extensions of this result to Kotz-type distributions are given in Ding and Zhang [168].
The multivariate IFR notions described in Section 6.B.6 are taken from Shaked and Shanthikumar [512]; however, the notion corresponding to (6.B.28) is equivalent to a multivariate IFR notion of Arjas [18]. General results concerning the usual stochastic comparison of stochastic processes (that is, results that are more general than Theorems 6.B.30 and 6.B.31) can be found in Kamae, Krengel, and O’Brien [272]; see also Block, Langberg, and Savits [93] and Rolski and Szekli [474]. Versions of the results regarding the usual stochastic comparison of Markov chains (Theorems 6.B.32 and 6.B.34) can be found in Stoyan [540, Chapter 4]. The comparison of Markov chains, one of which is skip-free positive (Theorem 6.B.35), is taken from Ferreira and Pacheco [199]; they obtained stronger results than Theorem 6.B.33 although they use a different terminology than the one used in this theorem. The discussion about the stochastic orders of point processes is based on Shaked and Szekli [521] and Szekli [544], although the definition of the orders $\le_{\rm st}$ and $\le_{\rm st\text{-}N}$ for point processes can be found already in Ebrahimi [176]; see related results in Schöttl [497]. Kulik and Szekli [325] extended these orders to $k$-variate point processes. The statements about the stochastic comparisons of the epoch and interepoch times of two nonhomogeneous Poisson processes (Example 6.B.41) are taken from Belzunce, Lillo, Ruiz, and Shaked [69].

Section 6.C: The development in this section follows the works of Norros [436, 437] and of Shaked and Shanthikumar [504]. A result that is similar to Theorem 6.C.1, but that gives conditions under which two point processes are stochastically ordered, can be found in Kwieciński and Szekli [328]. The fact (which is mentioned in Section 6.C.2) that the cumulative hazards of the components, by the time that they fail, are independent standard exponential random variables, follows from more general results of Aalen and Hoem [1, Section 4.5], Kurtz [326, Theorem 6.19(b)], and Jacobsen [252, Proposition 2.2.11].

Section 6.D: The development in Sections 6.D.1 and 6.D.2 follows the work of Hu, Khaledi, and Shaked [235], although the definition of the order $\le_{\rm whr}$ (with a different name), and its characterization by means of the hazard gradients (Theorem 6.D.2) can be found in Jain and Nanda [253]. In Hu, Khaledi, and Shaked [235] it is claimed that (6.D.12) in Theorem 6.D.7 is equivalent to $X \le_{\rm whr} Y$, but this is erroneous, as was communicated to us by Antonio Colangelo. An order that is stronger than the order $\le_{\rm whr}$ is mentioned in Collet, López, and Martínez [134]. The development in Section 6.D.3 follows the work of Shaked and Shanthikumar [505]. The dynamic multivariate hazard rate order comparison of the epoch times of two nonhomogeneous Poisson processes (Example 6.D.8) is taken from Belzunce, Lillo, Ruiz, and Shaked [69]. The comparison, in the dynamic hazard rate order, of vectors of order statistics (Theorem 6.D.10), can be found in Belzunce, Ruiz, and Ruiz [75]; an extension of it is given in Belzunce, Mercader, and Ruiz [70]. The multivariate IFR notions described in Section 6.D.3 are taken from Shaked and Shanthikumar [512]; some related notions and results can be found in Bassan and Spizzichino [56].

Section 6.E: The multivariate likelihood ratio order (though using a different terminology) is studied in Karlin and Rinott [278] and in Whitt [563].
The preservation under conditioning result (Theorem 6.E.2) can be found in Rinott and Scarsini [468]. The result which shows a preservation property of the order ≤lr under random summations (Theorem 6.E.5) is taken from Pellerey [451]. The result about the relationship between the multivariate likelihood ratio and the multivariate hazard rate order (Theorem 6.E.6) is taken from Hu, Khaledi, and Shaked [235]. The relationship between the multivariate likelihood ratio and the dynamic multivariate hazard rate order (Theorem 6.E.7) can be found in Shaked and Shanthikumar [511], whereas the notion of multivariate PF2 distributions is taken from Shaked and Shanthikumar [512]. Theorem 6.E.8 has been proved in the literature in various generalities; see, for example, Holley [226] or Preston [460]. For a proof of the present statement of Theorem 6.E.8 see Karlin and Rinott [278]. The implication (6.E.6) can be found in Whitt [563]. Shanthikumar and Koo [528] studied an order which is deﬁned as in (6.E.6), except that rather than requiring A there to be a rectangular set, they require the right-hand side of (6.E.6) to hold for all planar regions

A. The statement in Remark 6.E.9 that (1.C.6) does not generalize to the multivariate case follows from Rüschendorf [485, Theorem 8]. The order mentioned in Remark 6.E.10 is studied in Whitt [563], where other orders, related to the multivariate likelihood ratio order, are also studied. One of the counterexamples, mentioned in Remark 6.E.10, can be found in Whitt [563]. Other counterexamples can be found in Lehmann [341]; in that paper it is also claimed that Theorem 6.B.2 is wrong, but that claim is based on erroneous examples. The conditions for the monotonicity of the order $\le_{\rm lr}$, given in Theorem 6.E.11, are taken from Rinott and Scarsini [468]. The comparison, in the multivariate likelihood ratio order, of vectors of order statistics (Theorem 6.E.12), can be found in Belzunce, Ruiz, and Ruiz [75]; an extension of it is given in Belzunce, Mercader, and Ruiz [70]. The multivariate likelihood ratio comparisons of epoch and inter-epoch times of nonhomogeneous Poisson processes (Example 6.E.13) can be found in Belzunce, Lillo, Ruiz, and Shaked [69]; in that paper these results are also extended to nonhomogeneous pure birth processes. The likelihood ratio order comparison of the vectors of the normalized spacings associated with exponential random variables (Example 6.E.14) is taken from Kochar and Rojo [318]. The result about the likelihood ratio ordering of the posterior distributions (Example 6.E.16) can be found in Fahmy, Pereira, Proschan, and Shaked [189]; see also Purcaru and Denuit [462, Proposition 5.1]. A modification of Example 6.E.16 is Theorem 3.61 of Spizzichino [539]. The proof of the equivalence of the various multivariate PF$_2$ notions (Theorem 6.E.17) is given in Shaked and Shanthikumar [512].

Section 6.F: The development in this section follows the work of Shaked and Shanthikumar [513]. A notion that is related to the multivariate DMRL concept in Section 6.F.3 can be found in Bassan, Kochar, and Spizzichino [53].
Section 6.G: The orthant orders, which are already mentioned in Marshall and Olkin [383], have been studied further by several authors. Some of the results in Section 6.G.1 can be found in Tchen [547], Rüschendorf [481], and Mosler [401]. Several extensions of these orders can be found in Bergmann [82]. The closure results of the orthant orders given in Theorem 6.G.5, and the application to Markov chains given in Example 6.G.6, are taken from Li and Xu [350]. The result about the preservation of the orthant orders under random sums (Theorem 6.G.7) is taken from Wong [568]; this result also appeared in Denuit, Genest, and Marceau [145], and in Pellerey [451] there is an equivalent result with an alternative proof. The comparison of mixtures result (Theorem 6.G.8) can be found in Denuit and Müller [157]. The relationship between the orders $\le_{\rm uo}$ and $\le_{\rm whr}$, given in (6.G.10), can be found in Hu, Khaledi, and Shaked [235]. The relationship between the order $\le_{\rm uo}$ and the orders $\le^{S}_{m\text{-cx}}$ and $\le_{m\text{-icx}}$ (Theorem 6.G.10) is taken from Boutsikas and Vaggelatou [107]. The sufficient conditions for the comparison of multivariate normal random vectors (Example 6.G.11) can be found in Müller [413]. Theorem 6.G.13 is taken from Scarsini and Shaked [494], whereas Theorem 6.G.14 is adopted from Baccelli and Makowski [27]. Dyckerhoff and Mosler [173] introduced some relatively easy conditions for verifying $X \le_{\rm uo} Y$ or $X \le_{\rm lo} Y$ when $X$ and $Y$ have finite discrete supports. The development in Section 6.G.2 follows the work of Scarsini and Shaked [493]. Hennessy [220] considered the order which is defined by taking all the $a_i$'s in (6.G.16) to be equal to 1; he obtained for this order a result which is analogous to Theorem 6.G.16. A generalization of the order $\le_{\rm uo}$ is mentioned and studied in Daduna and Szekli [138].

7 Multivariate Variability and Related Orders

In this chapter we describe various extensions, of the univariate variability orders in Chapters 3 and 4, to the multivariate case. The most important common orders that are studied in this chapter are the increasing and the directional convex and concave orders. Multivariate extensions of the order ≤disp are also studied in this chapter. Some multivariate extensions of the transform orders, and of the Laplace transform order, are investigated in this chapter as well.

7.A The Monotone Convex and Monotone Concave Orders

7.A.1 Definitions

The multivariate orders ≤icx and ≤icv are defined in a similar fashion to their univariate counterparts discussed in Section 4.A. Let X and Y be two n-dimensional random vectors such that

$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all increasing convex [concave] functions } \phi : \mathbb{R}^n \to \mathbb{R}, \tag{7.A.1}$$

provided the expectations exist. Then X is said to be smaller than Y in the increasing convex [concave] order (denoted by X ≤icx Y [X ≤icv Y ]). One can also deﬁne a decreasing convex [concave] order by requiring (7.A.1) to hold for all decreasing convex [concave] functions φ. But the terms “decreasing convex” and “decreasing concave” orders are counterintuitive because if X is smaller than Y in the sense of either of these two orders, then X is “larger” than Y in some stochastic sense. These orders can easily be characterized using the orders ≤icx and ≤icv . It is therefore not necessary to have a separate discussion about these orders.
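The characterization mentioned in the last two sentences can be written out in one line. The following is a sketch added for illustration only; the auxiliary function ψ = −φ is a symbol introduced here and is not part of the text.

```latex
% phi decreasing convex  <=>  psi = -phi increasing concave
% phi decreasing concave <=>  psi = -phi increasing convex
\begin{align*}
E[\phi(X)] \le E[\phi(Y)] \ \text{for all decreasing convex } \phi
  &\iff E[\psi(Y)] \le E[\psi(X)] \ \text{for all increasing concave } \psi
  \iff Y \le_{\mathrm{icv}} X, \\
E[\phi(X)] \le E[\phi(Y)] \ \text{for all decreasing concave } \phi
  &\iff E[\psi(Y)] \le E[\psi(X)] \ \text{for all increasing convex } \psi
  \iff Y \le_{\mathrm{icx}} X .
\end{align*}
```

This makes precise the sense in which X is "larger" than Y under the decreasing orders: the roles of X and Y are interchanged when passing to ≤icv or ≤icx.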


For any i, i = 1, 2, ..., n, the function φi, defined by φi(x) = φi(x1, x2, ..., xn) = xi, is increasing and is both convex and concave. Therefore, from (7.A.1) it easily follows that

$$X \le_{\mathrm{icx}} Y \implies E[X] \le E[Y] \tag{7.A.2}$$

and that

$$X \le_{\mathrm{icv}} Y \implies E[X] \le E[Y], \tag{7.A.3}$$

provided the expectations exist. If the two n-dimensional random vectors X and Y are such that

$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all convex functions } \phi : \mathbb{R}^n \to \mathbb{R}, \tag{7.A.4}$$

provided the expectations exist, then X is said to be smaller than Y in the convex order (denoted by X ≤cx Y). For any i, i = 1, 2, ..., n, the function φi, defined as above, and the function ψi, defined by ψi(x) = ψi(x1, x2, ..., xn) = −xi, are both convex. Therefore, from (7.A.4) it follows that

$$X \le_{\mathrm{cx}} Y \implies E[X] = E[Y], \tag{7.A.5}$$

provided the expectations exist.

The multivariate convex order can be characterized by construction on the same probability space as the univariate convex order (see Theorem 3.A.4). This is stated next.

Theorem 7.A.1. The random vectors X and Y satisfy X ≤cx Y if, and only if, there exist two random vectors X̂ and Ŷ, defined on the same probability space, such that

$$\hat{X} =_{\mathrm{st}} X, \tag{7.A.6}$$

$$\hat{Y} =_{\mathrm{st}} Y, \tag{7.A.7}$$

and {X̂, Ŷ} is a martingale, that is,

$$E[\hat{Y} \mid \hat{X}] = \hat{X} \quad \text{a.s.} \tag{7.A.8}$$
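The "if" direction of Theorem 7.A.1 can be illustrated on a small finite probability space. The sketch below is not from the text: the two-point law of X̂, the mean-zero noise, and the handful of convex test functions are all hypothetical choices made for illustration. It builds Ŷ = X̂ + noise, checks the martingale condition (7.A.8) exactly, and compares E[φ(X̂)] with E[φ(Ŷ)] for each test function. A finite family of test functions cannot establish ≤cx by itself; the comparison only shows the inequality the theorem guarantees.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Hypothetical toy laws: X_hat is uniform on two points of R^2, and
# Y_hat = X_hat + noise, where the noise has mean zero and is independent of X_hat.
X_LAW = {(0, 0): HALF, (1, 1): HALF}
NOISE_LAW = {(1, -1): HALF, (-1, 1): HALF}


def joint_law():
    """Return the joint law of (X_hat, Y_hat) as {(x, y): probability}."""
    law = {}
    for (x, px), (e, pe) in product(X_LAW.items(), NOISE_LAW.items()):
        y = (x[0] + e[0], x[1] + e[1])
        law[(x, y)] = law.get((x, y), 0) + px * pe
    return law


def conditional_mean_of_y(law, x):
    """Compute E[Y_hat | X_hat = x] exactly."""
    mass = sum(p for (x0, _), p in law.items() if x0 == x)
    return tuple(
        sum(p * y[i] for (x0, y), p in law.items() if x0 == x) / mass
        for i in range(2)
    )


def expectation(law, phi, index):
    """E[phi(X_hat)] when index == 0, E[phi(Y_hat)] when index == 1."""
    return sum(p * phi(pair[index]) for pair, p in law.items())


# A few convex functions on R^2, chosen only as examples.
CONVEX_TESTS = {
    "squared norm": lambda v: v[0] ** 2 + v[1] ** 2,
    "max": lambda v: max(v),
    "abs difference": lambda v: abs(v[0] - v[1]),
    "sum (linear)": lambda v: v[0] + v[1],
}

LAW = joint_law()
for name, phi in CONVEX_TESTS.items():
    print(name, expectation(LAW, phi, 0), expectation(LAW, phi, 1))
```

Exact rational arithmetic is used so that the equality in (7.A.8) and the equality of means in (7.A.5) (the linear test function) are checked without rounding error.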

Similarly, the multivariate extension of Theorem 4.A.5 is the following.

Theorem 7.A.2. Two random vectors X and Y satisfy X ≤icx Y [X ≤icv Y] if, and only if, there exist two random vectors X̂ and Ŷ, defined on the same probability space, such that

$$\hat{X} =_{\mathrm{st}} X, \qquad \hat{Y} =_{\mathrm{st}} Y,$$

and {X̂, Ŷ} is a submartingale [{Ŷ, X̂} is a supermartingale], that is,

$$E[\hat{Y} \mid \hat{X}] \ge \hat{X} \quad \bigl[\, E[\hat{X} \mid \hat{Y}] \le \hat{Y} \,\bigr] \quad \text{a.s.}$$


The next theorem is a multivariate analog of Theorem 4.A.6. Its proof is similar to the proof of Theorem 4.A.6, and is therefore omitted.

Theorem 7.A.3. (a) Two random vectors X and Y satisfy X ≤icx Y if, and only if, there exists a random vector Z such that X ≤st Z ≤cx Y.
(b) Two random vectors X and Y satisfy X ≤icx Y if, and only if, there exists a random vector Z such that X ≤cx Z ≤st Y.

The next result is similar to a result of Veinott that can be found in Section 6.B.3. Veinott's result deals with the multivariate usual stochastic order (rather than the convex order) and does not assume independence of either the Xj's or the Yj's. However, the convex order is harder to work with as compared to the usual stochastic order. Thus we have the following result.

Theorem 7.A.4. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two n-dimensional random vectors. If Y1, Y2, ..., Yn are independent, and if

$$X_1 \le_{\mathrm{cx}} Y_1, \tag{7.A.9}$$

$$[X_2 \mid X_1 = x_1] \le_{\mathrm{cx}} Y_2 \quad \text{for all } x_1, \tag{7.A.10}$$

and, in general, for i = 2, 3, ..., n,

$$[X_i \mid X_1 = x_1, \dots, X_{i-1} = x_{i-1}] \le_{\mathrm{cx}} Y_i \quad \text{for all } x_j,\ j = 1, 2, \dots, i-1, \tag{7.A.11}$$

then

$$X \le_{\mathrm{cx}} Y. \tag{7.A.12}$$

The proof consists of constructing X̂ and Ŷ on the same probability space such that (7.A.6)–(7.A.8) hold. This can be done by first constructing independent Ŷ1, Ŷ2, ..., Ŷn such that Ŷ =st Y. To construct the X̂i's, note that by Theorem 3.A.4 (using (7.A.9)) it is possible to construct an X̂1 on the same probability space such that E[Ŷ1 | X̂1] = X̂1 a.s. Next, given X̂1 = x1, it is possible to construct, again using Theorem 3.A.4 and (7.A.10), an X̂2 on the same probability space such that E[Ŷ2 | X̂1, X̂2] = X̂2 a.s. Continuing this way, using Theorem 3.A.4 and (7.A.11), the vector X̂ is constructed. The vectors X̂ and Ŷ satisfy the conditions of Theorem 7.A.1, and thus (7.A.12) follows.

Note that under the conditions of Theorem 7.A.4 one has

$$\sum_{j=1}^{n} X_j \le_{\mathrm{cx}} \sum_{j=1}^{n} Y_j. \tag{7.A.13}$$

This inequality gives a stronger result than Theorem 3.A.12(d).
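The step from (7.A.12) to (7.A.13) is short enough to spell out. The following is a sketch added for illustration; the symbol g (an arbitrary univariate convex function) is introduced here and does not appear in the text. It uses only the fact that a convex function composed with a linear map is convex.

```latex
% g : R -> R convex  ==>  phi(x) = g(x_1 + ... + x_n) is convex on R^n,
% so phi is an admissible test function in (7.A.4).
\[
  \phi(x_1,\dots,x_n) := g\Bigl(\sum_{j=1}^{n} x_j\Bigr)
  \quad\Longrightarrow\quad
  E\Bigl[g\Bigl(\sum_{j=1}^{n} X_j\Bigr)\Bigr]
  = E[\phi(X)] \le E[\phi(Y)]
  = E\Bigl[g\Bigl(\sum_{j=1}^{n} Y_j\Bigr)\Bigr].
\]
```

Since g is an arbitrary convex function for which the expectations exist, this is exactly the univariate convex order between the two sums.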


7.A.2 Closure properties

The proofs of the following closure properties are similar to the univariate counterparts and are omitted.

Theorem 7.A.5. (a) Let X and Y be n-dimensional random vectors. If X ≤icx Y [X ≤icv Y] and g : R^n → R^m is any increasing convex [concave] function, then g(X) ≤icx [≤icv] g(Y).
(b) Let X, Y, and Θ be random vectors such that [X | Θ = θ] ≤icx [≤icv] [Y | Θ = θ] for all θ in the support of Θ. Then X ≤icx [≤icv] Y. That is, the increasing convex [concave] order is closed under mixtures.
(c) Let {Xj, j = 1, 2, ...} and {Yj, j = 1, 2, ...} be two sequences of random vectors such that Xj →st X and Yj →st Y as j → ∞. Assume that EXj → EX and that EYj → EY as j → ∞. If Xj ≤cx [≤icx, ≤icv] Yj, j = 1, 2, ..., then X ≤cx [≤icx, ≤icv] Y.
(d) Let X1, X2, ..., Xm be a set of independent random vectors and let Y1, Y2, ..., Ym be another set of independent random vectors. If Xi ≤icx [≤icv] Yi for i = 1, 2, ..., m, then

$$\sum_{j=1}^{m} X_j \ \le_{\mathrm{icx}}\ [\le_{\mathrm{icv}}]\ \sum_{j=1}^{m} Y_j.$$

That is, the increasing convex [concave] order is closed under convolutions.

Parts (a) and (d) of Theorem 7.A.5 can be generalized as follows.

Theorem 7.A.6. Let X1, X2, ..., Xm be a set of independent random vectors, let Y1, Y2, ..., Ym be another set of independent random vectors, and assume that Xi and Yi have the same dimension, i = 1, 2, ..., m. If Xi ≤icx Yi for i = 1, 2, ..., m, then g(X1, X2, ..., Xm) ≤icx g(Y1, Y2, ..., Ym) for every function g of a proper dimension that is increasing and convex in each argument.

A generalization of Theorem 7.A.5(d) is the following result which deals with vectors of random partial sums of random variables.

Theorem 7.A.7. Let {Xi} and {Yi} each be a sequence of independent random variables. Also, let {Mi} and {Ni} each be a sequence of independent positive integer-valued random variables, and suppose that the Xi's and the Mi's are independent and also that the Yi's and the Ni's are independent. Let

$$\tilde{M}_j = \sum_{i=1}^{j} M_i, \qquad \tilde{N}_j = \sum_{i=1}^{j} N_i, \qquad U_j = \sum_{i=1}^{\tilde{M}_j} X_i, \qquad V_j = \sum_{i=1}^{\tilde{N}_j} Y_i, \qquad j = 1, 2, \dots, m.$$


If

$$X_i \le_{\mathrm{icx}} Y_i, \qquad M_i \le_{\mathrm{st}} N_i, \qquad i = 1, 2, \dots, \tag{7.A.14}$$

and Yi ≥ 0 a.s., i = 1, 2, ..., then

$$(U_1, U_2, \dots, U_m) \le_{\mathrm{icx}} (V_1, V_2, \dots, V_m). \tag{7.A.15}$$

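Proof. According to Theorems 1.A.1 and 4.A.5 there exist sequences of random variables {X̂i}, {Ŷi}, {M̂i}, and {N̂i} such that

$$\hat{X}_i =_{\mathrm{st}} X_i, \qquad \hat{Y}_i =_{\mathrm{st}} Y_i, \qquad \hat{M}_i =_{\mathrm{st}} M_i, \qquad \hat{N}_i =_{\mathrm{st}} N_i, \qquad i = 1, 2, \dots,$$

and

$$\hat{M}_i \le \hat{N}_i \ \text{a.s.}, \qquad \hat{X}_i \le E[\hat{Y}_i \mid \hat{X}_i] \ \text{a.s.}, \qquad i = 1, 2, \dots.$$

Define

$$\hat{\tilde{M}}_j = \sum_{i=1}^{j} \hat{M}_i, \qquad \hat{\tilde{N}}_j = \sum_{i=1}^{j} \hat{N}_i, \qquad \hat{U}_j = \sum_{i=1}^{\hat{\tilde{M}}_j} \hat{X}_i, \qquad \hat{V}_j = \sum_{i=1}^{\hat{\tilde{N}}_j} \hat{Y}_i, \qquad j = 1, 2, \dots, m.$$

From (7.A.14) it is seen that

$$\hat{U}_j = \sum_{i=1}^{\hat{\tilde{M}}_j} \hat{X}_i \le E\Bigl[\, \sum_{i=1}^{\hat{\tilde{N}}_j} \hat{Y}_i \Bigm| \{\hat{X}_k\} \Bigr] = E\bigl[\hat{V}_j \bigm| \{\hat{X}_k\}\bigr] \quad \text{a.s.}, \qquad j = 1, 2, \dots, m.$$

Let φ be an increasing convex real function on R^m. Then

$$E[\phi(\hat{U}_1, \hat{U}_2, \dots, \hat{U}_m)] \le E\bigl[\phi\bigl(E[(\hat{V}_1, \hat{V}_2, \dots, \hat{V}_m) \mid \{\hat{X}_k\}]\bigr)\bigr] \le E\bigl[E[\phi(\hat{V}_1, \hat{V}_2, \dots, \hat{V}_m) \mid \{\hat{X}_k\}]\bigr] = E[\phi(\hat{V}_1, \hat{V}_2, \dots, \hat{V}_m)],$$

where the first inequality follows from the monotonicity of φ and the second inequality follows from Jensen's Inequality. Since (Û1, Û2, ..., Ûm) =st (U1, U2, ..., Um) and (V̂1, V̂2, ..., V̂m) =st (V1, V2, ..., Vm) we obtain (7.A.15).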

Let X1, X2, ..., Xm be m countably infinite vectors of independent nonnegative random variables, and let M = (M1, M2, ..., Mm) and N = (N1, N2, ..., Nm) be two vectors of nonnegative integers which are independent of the Xi's. Denote by Xj,i the ith element of Xj. From Theorems 2.3 and 2.4 of Pellerey [451] it seems that if M ≤cx [≤icx] N, then

$$\Bigl(\sum_{i=1}^{M_1} X_{1,i},\ \sum_{i=1}^{M_2} X_{2,i},\ \dots,\ \sum_{i=1}^{M_m} X_{m,i}\Bigr) \ \le_{\mathrm{cx}}\ [\le_{\mathrm{icx}}]\ \Bigl(\sum_{i=1}^{N_1} X_{1,i},\ \sum_{i=1}^{N_2} X_{2,i},\ \dots,\ \sum_{i=1}^{N_m} X_{m,i}\Bigr).$$

However, the proofs given in that paper yield somewhat different results; see Theorem 7.A.36 for the details.

The following two results can easily be proven using Theorem 7.A.1.


Theorem 7.A.8. Let X1, X2, ..., Xm be a set of independent random variables and let Y1, Y2, ..., Ym be another set of independent random variables. If Xi ≤cx Yi for i = 1, 2, ..., m, then (X1, X2, ..., Xm) ≤cx (Y1, Y2, ..., Ym).

A result that is slightly stronger than Theorem 7.A.8 is given in Theorem 7.A.24.

Theorem 7.A.9. Let the random vector X and the nonnegative random variable U be independent. If E[U] = 1, then X ≤cx UX.

From Theorem 3.B.15 [Theorem 4.B.23] and Theorem 7.A.8 we obtain the following result.

Theorem 7.A.10. Let X1, X2, ..., Xm be a set of nonnegative independent random variables, let Y1, Y2, ..., Ym be another set of nonnegative independent random variables, and assume that EXi = EYi, i = 1, 2, ..., m. If Xi ≤disp [≤nbue] Yi for i = 1, 2, ..., m, then (X1, X2, ..., Xm) ≤cx (Y1, Y2, ..., Ym).

An application of Theorem 7.A.1 is illustrated in the following example (which is, in fact, an extension of Example 3.A.29).

Example 7.A.11. Let X1, X2, ... be independent and identically distributed m-dimensional random variables. Denote by X̄n the sample mean of X1, X2, ..., Xn. That is, X̄n = (X1 + X2 + · · · + Xn)/n. If the expectation of X1 exists, then for any choice of positive integers n′ ≤ n one has X̄n ≤cx X̄n′. In order to see it note that by the symmetry of X1, X2, ..., Xn it follows that E[Xi | X̄n] = X̄n for all i ≤ n. Therefore E[X̄n′ | X̄n] = X̄n. That is, {X̄n, X̄n′} is a martingale. The result now follows from Theorem 7.A.1.

7.A.3 Further properties

Let X and Y be random vectors. If E[φ(X)] ≤ E[φ(Y)] for all increasing functions φ, then (7.A.1) obviously holds. Thus we obtain the following result.

Theorem 7.A.12. Let X and Y be two random vectors. If X ≤st Y, then X ≤icx Y and X ≤icv Y.

The following example gives necessary (and sufficient) conditions for the comparison of multivariate normal random vectors. See Examples 6.B.29, 6.G.11, 7.A.26, 7.A.39, 7.B.5, and 9.A.20 for related results.


Example 7.A.13. Let X be a multivariate normal random vector with mean vector µX and variance-covariance matrix ΣX, and let Y be a multivariate normal random vector with mean vector µY and variance-covariance matrix ΣY.
(a) If µX ≤ µY and if ΣY − ΣX is positive semidefinite, then X ≤icx Y.
(b) X ≤cx Y if, and only if, µX = µY and ΣY − ΣX is positive semidefinite.

Using Theorem 4.A.48 we can obtain conditions under which two nonnegative random vectors, that are comparable in the ≤icx or in the ≤icv orders, have the same distribution; related results are Theorems 1.A.8, 3.A.43, 3.A.60, 4.A.69, 5.A.15, 6.B.19, 6.G.12, and 6.G.13.

Theorem 7.A.14. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two nonnegative random vectors.
(a) If X ≤icx Y, and if E[Xi Xj] = E[Yi Yj] for all i and j, then X =st Y.
(b) If X ≤icv Y, and if EX = EY, and if E[Xi Xj] = E[Yi Yj] for all i and j, then X =st Y.

Proof. First we prove (a). From the assumption that X ≤icx Y it follows that Σ_{i=1}^n ai Xi ≤icx Σ_{i=1}^n ai Yi for all ai ≥ 0, i = 1, 2, ..., n. Also

$$E\Bigl[\Bigl(\sum_{i=1}^{n} a_i X_i\Bigr)^{2}\Bigr] = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j E[X_i X_j] = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j E[Y_i Y_j] = E\Bigl[\Bigl(\sum_{i=1}^{n} a_i Y_i\Bigr)^{2}\Bigr].$$

It then follows, from Theorem 4.A.48, that Σ_{i=1}^n ai Xi =st Σ_{i=1}^n ai Yi for all ai ≥ 0, i = 1, 2, ..., n. Thus we have that E[exp{−Σ_{i=1}^n ai Xi}] = E[exp{−Σ_{i=1}^n ai Yi}] for all ai ≥ 0, i = 1, 2, ..., n. From the uniqueness property of the Laplace transform we obtain X =st Y.

The proof of part (b) follows from part (a) and from the observation that if Σ_{i=1}^n ai Xi ≤icv Σ_{i=1}^n ai Yi and if E[Σ_{i=1}^n ai Xi] = E[Σ_{i=1}^n ai Yi], then Σ_{i=1}^n ai Xi ≥icx Σ_{i=1}^n ai Yi.

In a similar manner, using now Theorem 3.A.42 rather than Theorem 4.A.48, we can obtain conditions under which two (not necessarily nonnegative) random vectors, that are comparable in the ≤cx order, have the same distribution.

Theorem 7.A.15. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two (not necessarily nonnegative) random vectors. If X ≤cx Y, and if Var(Xi) = Var(Yi), i = 1, 2, ..., n, then X =st Y.

Proof. From the assumption that X ≤cx Y it follows that for i ≠ j we have

$$a_i^2 E X_i^2 + a_j^2 E X_j^2 + 2 a_i a_j E[X_i X_j] = E(a_i X_i + a_j X_j)^2 \le E(a_i Y_i + a_j Y_j)^2 = a_i^2 E Y_i^2 + a_j^2 E Y_j^2 + 2 a_i a_j E[Y_i Y_j],$$

where ai and aj are any constants. Since the variances are equal by assumption and the means are equal by (7.A.5), we have EXi² = EYi² and EXj² = EYj², and therefore ai aj E[Xi Xj] ≤ ai aj E[Yi Yj]. Since ai and aj are arbitrary (in particular, the product ai aj may be taken positive or negative), we see that E[Xi Xj] = E[Yi Yj]. Now, again from the assumption that X ≤cx Y it follows that Σ_{i=1}^n ai Xi ≤cx Σ_{i=1}^n ai Yi for all ai, i = 1, 2, ..., n. As in the proof of Theorem 7.A.14 we can show that E(Σ_{i=1}^n ai Xi)² = E(Σ_{i=1}^n ai Yi)². It then follows, from Theorem 3.A.42, that Σ_{i=1}^n ai Xi =st Σ_{i=1}^n ai Yi for all ai, i = 1, 2, ..., n. Therefore the characteristic functions of X and of Y are identical. This implies that X =st Y.

An interesting application of the orthant order in the context of the increasing convex and concave orders is given in the following result. The proof, which can be found elsewhere (see Section 7.D), is not given here.

Theorem 7.A.16. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors. Suppose that X ≤lo Y [respectively, X ≤uo Y] and that

$$-\infty < E[\phi_i(X_i)] = E[\phi_i(Y_i)] < \infty, \qquad i = 1, 2, \dots, n,$$

for some nonnegative strictly increasing convex functions φi, i = 1, 2, ..., n. If X and Y are comparable in the order ≤icx [respectively, ≤icv], then X =st Y.

Two orders related to the multivariate monotone convex order are discussed in Sections 7.A.6 and 7.A.7 below.

7.A.4 Convex and concave ordering of stochastic processes

In Section 6.B.7 we showed that some of the results regarding the usual stochastic ordering of random vectors can be extended to the usual stochastic ordering of stochastic processes. It turns out that some of the results regarding the monotone convex and concave orderings of random vectors can also be extended to the analogous orderings of stochastic processes. In this subsection we describe a basic result that formally states that two stochastic processes are comparable in the sense of any of these orders if, and only if, any finite dimensional marginals of them are comparable in the same sense.

Let {X(n), n ∈ N++} and {Y(n), n ∈ N++} be two discrete-time stochastic processes with state space R. If, for all choices of an integer m, it holds that

$$(X(1), X(2), \dots, X(m)) \ \le_{\mathrm{cx}}\ [\le_{\mathrm{icx}}, \le_{\mathrm{icv}}]\ (Y(1), Y(2), \dots, Y(m)),$$

then {X(n), n ∈ N++} is said to be smaller than {Y(n), n ∈ N++} in the convex [increasing convex, increasing concave] order (denoted by {X(n), n ∈ N++} ≤cx [≤icx, ≤icv] {Y(n), n ∈ N++}).

Below, a functional g is called convex [concave] if

$$g(\{\alpha x(n) + (1-\alpha) y(n),\ n \in \mathbb{N}_{++}\}) \ \le\ [\ge]\ \alpha\, g(\{x(n),\ n \in \mathbb{N}_{++}\}) + (1-\alpha)\, g(\{y(n),\ n \in \mathbb{N}_{++}\})$$

for all α ∈ [0, 1] and {x(n), n ∈ N++} and {y(n), n ∈ N++}.


Theorem 7.A.17. Let {X(n), n ∈ N++} and {Y(n), n ∈ N++} be two discrete-time stochastic processes with state space R. Then {X(n), n ∈ N++} ≤cx [≤icx, ≤icv] {Y(n), n ∈ N++} if, and only if,

$$E\{g(\{X(n),\ n \in \mathbb{N}_{++}\})\} \le E\{g(\{Y(n),\ n \in \mathbb{N}_{++}\})\} \tag{7.A.16}$$

for every continuous (with respect to the product topology in R^∞) convex [increasing convex, increasing concave] functional g for which the expectations in (7.A.16) exist.

Notice that the assumption of continuity with respect to the product topology is quite restrictive, but, as far as we know, it is the best result available.

7.A.5 The (m1, m2)-icx orders

The multivariate ≤icx order can be extended in a manner similar to the way in which the univariate order ≤m-icx in Section 4.A.7 extends the univariate ≤icx order. Only the bivariate extension will be described here.

Let (X1, X2) and (Y1, Y2) be two random vectors with a common support I × J, where I and J are finite, or half infinite, or infinite intervals in R. If E[φ(X1, X2)] ≤ E[φ(Y1, Y2)] for all (m1 + m2)-differentiable functions φ such that

$$\frac{\partial^{k_1+k_2}}{\partial x_1^{k_1}\,\partial x_2^{k_2}}\,\phi(x_1, x_2) \ge 0 \quad \text{on } I \times J \text{ whenever } 0 \le k_1 \le m_1,\ 0 \le k_2 \le m_2, \text{ and } k_1 + k_2 \ge 1,$$

then (X1, X2) is said to be smaller than (Y1, Y2) in the (m1, m2)-icx order (denoted by $(X_1, X_2) \le^{I\times J}_{(m_1,m_2)\text{-icx}} (Y_1, Y_2)$). If E[φ(X1, X2)] ≤ E[φ(Y1, Y2)] for all (m1 + m2)-differentiable functions φ such that

$$(-1)^{k_1+k_2+1}\,\frac{\partial^{k_1+k_2}}{\partial x_1^{k_1}\,\partial x_2^{k_2}}\,\phi(x_1, x_2) \ge 0 \quad \text{on } I \times J \text{ whenever } 0 \le k_1 \le m_1,\ 0 \le k_2 \le m_2, \text{ and } k_1 + k_2 \ge 1,$$

then (X1, X2) is said to be smaller than (Y1, Y2) in the (m1, m2)-icv order (denoted by $(X_1, X_2) \le^{I\times J}_{(m_1,m_2)\text{-icv}} (Y_1, Y_2)$).

The (m1, m2)-icx and the (m1, m2)-icv orders are related as follows:

$$(X_1, X_2) \le^{[a_1,b_1]\times[a_2,b_2]}_{(m_1,m_2)\text{-icv}} (Y_1, Y_2) \iff (b_1 - Y_1,\ b_2 - Y_2) \le^{[0,\,b_1-a_1]\times[0,\,b_2-a_2]}_{(m_1,m_2)\text{-icx}} (b_1 - X_1,\ b_2 - X_2),$$

and

$$(X_1, X_2) \le^{\mathbb{R}^2}_{(m_1,m_2)\text{-icv}} (Y_1, Y_2) \iff -(Y_1, Y_2) \le^{\mathbb{R}^2}_{(m_1,m_2)\text{-icx}} -(X_1, X_2).$$

Thus it suffices for most purposes to focus on the (m1, m2)-icx order only. Note that the orders $\le^{\mathbb{R}^2}_{(1,1)\text{-icx}}$ and $\le^{\mathbb{R}^2}_{(1,1)\text{-icv}}$ are the orders ≤uo and ≤lo (see Section 6.G.1). The orders $\le^{\mathbb{R}^2}_{(2,2)\text{-icx}}$ and $\le^{\mathbb{R}^2}_{(2,2)\text{-icv}}$ are the orders ≤uo-cx and ≤lo-cv which are discussed in Section 7.A.9 below. Also, the order $\le^{\mathbb{R}^2}_{(m,m)\text{-icv}}$ is the order ≤2m which is discussed in Section 7.A.9.

Some closure properties of the (m1, m2)-icx order are given in the next theorem. Some of the results below are stated for simplicity only for the case in which I = J = [0, ∞), but they can be rewritten for the general case.


Theorem 7.A.18. (a) Let (X1, X2) and (Y1, Y2) be two random vectors with a common support I × J. Let K and L be two intervals in R, and let φ1 : I → K and φ2 : J → L be two univariate functions with nonnegative first m1 and m2 derivatives, respectively. If $(X_1, X_2) \le^{I\times J}_{(m_1,m_2)\text{-icx}} (Y_1, Y_2)$, then $(\phi_1(X_1), \phi_2(X_2)) \le^{K\times L}_{(m_1,m_2)\text{-icx}} (\phi_1(Y_1), \phi_2(Y_2))$.
(b) Let (X1, X2), (Y1, Y2), and Θ be random vectors such that $[(X_1, X_2) \mid \Theta = \theta] \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} [(Y_1, Y_2) \mid \Theta = \theta]$ for all θ in the support of Θ. Then $(X_1, X_2) \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} (Y_1, Y_2)$. That is, the (m1, m2)-icx order is closed under mixtures.
(c) Let {(X11, X12), (X21, X22), ...} be a sequence of independent random vectors and let {(Y11, Y12), (Y21, Y22), ...} be another sequence of independent random vectors. Furthermore, let N be a positive integer-valued random variable which is independent of the above random vectors. If $(X_{j1}, X_{j2}) \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} (Y_{j1}, Y_{j2})$ for j = 1, 2, ..., then

$$\sum_{j=1}^{N} (X_{j1}, X_{j2}) \ \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}}\ \sum_{j=1}^{N} (Y_{j1}, Y_{j2}).$$

In particular, the (m1, m2)-icx order is closed under convolutions.

Part (c) of Theorem 7.A.18 can be used, for example, to prove (9.A.11) in Chapter 9.

The bivariate (m1, m2)-icx orders imply some interesting results on their univariate components.

Theorem 7.A.19. Let (X1, X2) and (Y1, Y2) be two random vectors with a common support [0, ∞)². Let φ be a bivariate function which satisfies

$$\frac{\partial^{k_1+k_2}}{\partial x_1^{k_1}\,\partial x_2^{k_2}}\,\phi(x_1, x_2) \ge 0 \quad \text{on } [0,\infty)^2 \text{ whenever } 0 \le k_1 \le m_1,\ 0 \le k_2 \le m_2, \text{ and } k_1 + k_2 \ge 1.$$

If $(X_1, X_2) \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} (Y_1, Y_2)$, then $\phi(X_1, X_2) \le_{(m_1+m_2)\text{-icx}} \phi(Y_1, Y_2)$.

This result can be used, for example, to prove the second inequality in Theorem 9.A.18 in Chapter 9.

Theorem 7.A.20. Let (X1, X2) and (Y1, Y2) be two random vectors, of independent components, with a common support [0, ∞)². Then

$$(X_1, X_2) \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} (Y_1, Y_2) \iff X_1 \le_{m_1\text{-icx}} Y_1 \ \text{and}\ X_2 \le_{m_2\text{-icx}} Y_2.$$

7.A.6 The symmetric convex order

Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors. When X and Y have exchangeable (that is, permutation symmetric) distribution functions, it is of interest to consider orders defined by the condition


Eφ(X) ≤ Eφ(Y) for all functions in a certain class of (permutation) symmetric functions. One such order is defined as follows. Suppose that X and Y are such that

$$E\phi(X) \le E\phi(Y) \quad \text{for all symmetric convex functions } \phi : \mathbb{R}^n \to \mathbb{R},$$

provided the expectations exist. Then X is said to be smaller than Y in the symmetric convex order (denoted as X ≤symcx Y). The following relationship between the orders ≤cx and ≤symcx is obvious.

Theorem 7.A.21. Let X and Y be two random vectors. If X ≤cx Y, then X ≤symcx Y.

A further discussion regarding the order ≤symcx can be found in Chapter 7 by Tong in [515].

7.A.7 The componentwise convex order

Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors. Suppose that X and Y are such that

$$E\phi(X) \le E\phi(Y) \quad \text{for all [increasing] functions } \phi : \mathbb{R}^n \to \mathbb{R} \text{ that are convex in each argument when the other arguments are held fixed,}$$

provided the expectations exist. Then X is said to be smaller than Y in the [increasing] componentwise convex order (denoted by X ≤ccx [≤iccx] Y). The following relationship between the orders ≤ccx [≤iccx] and ≤cx [≤icx] is obvious.

Theorem 7.A.22. Let X and Y be two random vectors. If X ≤ccx [≤iccx] Y, then X ≤cx [≤icx] Y.

The functions φ1(x1, x2, ..., xn) = xi xj and φ2(x1, x2, ..., xn) = −xi xj are both componentwise convex, 1 ≤ i < j ≤ n. The next result thus follows from Theorem 7.A.22 and (7.A.5).

Theorem 7.A.23. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors. If X ≤ccx Y, then Cov(Xi, Xj) = Cov(Yi, Yj), 1 ≤ i < j ≤ n.

Theorem 7.A.24. Let X1, X2, ..., Xm be a set of independent random variables and let Y1, Y2, ..., Ym be another set of independent random variables. If Xi ≤cx [≤icx] Yi for i = 1, 2, ..., m, then (X1, X2, ..., Xm) ≤ccx [≤iccx] (Y1, Y2, ..., Ym).


Proof. The parenthetical statement follows at once from Theorem 4.A.15. The proof of the other statement is similar to the proof of that theorem. As in there, we can assume, without loss of generality, that all the 2m random variables are independent. The proof is by induction on m. For m = 1 the result is obvious. Assume that the stated result is true for vectors of size m − 1. Let φ be a componentwise convex function. Then

$$E[\phi(X_1, X_2, \dots, X_m) \mid X_1 = x] = E[\phi(x, X_2, \dots, X_m)] \le E[\phi(x, Y_2, \dots, Y_m)] = E[\phi(X_1, Y_2, \dots, Y_m) \mid X_1 = x],$$

where the equalities above follow from the independence assumption and the inequality follows from the induction hypothesis. Taking expectations with respect to X1, we obtain E[φ(X1, X2, ..., Xm)] ≤ E[φ(X1, Y2, ..., Ym)]. Repeating the argument, but now conditioning on Y2, ..., Ym and using X1 ≤cx Y1, we see that E[φ(X1, Y2, ..., Ym)] ≤ E[φ(Y1, Y2, ..., Ym)], and this proves the result.
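The conclusion of Theorem 7.A.24 can be checked exactly on a small discrete example. The sketch below is an illustration under assumptions that are not in the text: the marginal laws and the four test functions are hypothetical choices. Each Yi has the law of Xi plus an independent mean-zero step, so Xi ≤cx Yi by the martingale characterization of the convex order. The functions x1·x2 and exp(x1·x2) are convex in each argument but not jointly convex (their Hessians at the origin are indefinite), so they are test functions for ≤ccx that are not available for ≤cx.

```python
from fractions import Fraction
from itertools import product
from math import exp

# Hypothetical marginals: X_i is uniform on {-1, 1}; Y_i has the law of X_i + e,
# with e uniform on {-1, 1} and independent of X_i, so that X_i <=cx Y_i.
X_MARGINAL = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
Y_MARGINAL = {-2: Fraction(1, 4), 0: Fraction(1, 2), 2: Fraction(1, 4)}


def expectation(phi, marginal, dim=2):
    """E[phi(V)] for a vector V with `dim` independent components, each ~ marginal."""
    total = 0
    for point in product(marginal, repeat=dim):
        prob = 1
        for coordinate in point:
            prob *= marginal[coordinate]
        total += prob * phi(point)
    return total


# Functions that are convex in each argument when the other one is held fixed.
COMPONENTWISE_CONVEX_TESTS = {
    "x1*x2": lambda v: v[0] * v[1],
    "-x1*x2": lambda v: -v[0] * v[1],
    "(x1*x2)^2": lambda v: (v[0] * v[1]) ** 2,
    "exp(x1*x2)": lambda v: exp(v[0] * v[1]),
}

for name, phi in COMPONENTWISE_CONVEX_TESTS.items():
    print(name, expectation(phi, X_MARGINAL), expectation(phi, Y_MARGINAL))
```

The pair x1·x2 and −x1·x2 forces E[X1 X2] = E[Y1 Y2], which is the mechanism behind Theorem 7.A.23. As with any finite family of test functions, the check illustrates the order but does not prove it.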

It is not hard to show that if X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) satisfy conditions (7.A.9)–(7.A.11) of Theorem 7.A.4, and if Y1, Y2, ..., Yn are independent, then, in fact, X ≤ccx Y. This observation provides an alternative proof for the ≤ccx case of Theorem 7.A.24.

The following results may be compared with Theorem 6.B.17.

Theorem 7.A.25. Let X1, X2, ..., Xm be m countably infinite vectors of independent nonnegative random variables. Let M = (M1, M2, ..., Mm) and N = (N1, N2, ..., Nm) be two vectors of nonnegative integers which are independent of X1, X2, ..., Xm. Denote by Xj,i the ith element of Xj. If Xj,i ≤cx [≤icx] Xj,i+1 for j = 1, 2, ..., m, and i ≥ 1, and if M ≤ccx [≤iccx] N, then

$$\Bigl(\sum_{i=1}^{M_1} X_{1,i},\ \sum_{i=1}^{M_2} X_{2,i},\ \dots,\ \sum_{i=1}^{M_m} X_{m,i}\Bigr) \ \le_{\mathrm{ccx}}\ [\le_{\mathrm{iccx}}]\ \Bigl(\sum_{i=1}^{N_1} X_{1,i},\ \sum_{i=1}^{N_2} X_{2,i},\ \dots,\ \sum_{i=1}^{N_m} X_{m,i}\Bigr).$$

The following example gives suﬃcient conditions for the comparison of multivariate normal random vectors. See Examples 6.B.29, 6.G.11, 7.A.13, 7.A.39, 7.B.5, and 9.A.20 for related results.


Example 7.A.26. Let X be a multivariate normal random vector with mean vector 0 and variance-covariance matrix Σ, and let Y be a multivariate normal random vector with mean vector 0 and variance-covariance matrix Σ + D, where D is a nonnegative diagonal matrix. Then X ≤ccx Y.

7.A.8 The directional convex and concave orders

Let ≤ denote the coordinatewise ordering in R^n. For x, y, z ∈ R^n we use the notation [x, y] ≤ z as a shorthand for x ≤ z and y ≤ z. Also, the notation z ≤ [x, y] stands for z ≤ x and z ≤ y. A function φ : R^n → R is said to be directionally convex [concave] if for any xi ∈ R^n, i = 1, 2, 3, 4, such that x1 ≤ [x2, x3] ≤ x4 and x1 + x4 = x2 + x3, one has

$$\phi(x_2) + \phi(x_3) \ \le\ [\ge]\ \phi(x_1) + \phi(x_4). \tag{7.A.17}$$
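The four-point inequality (7.A.17) can be explored numerically. The sketch below is an illustration only: the integer grid, the helper name `violates_four_point_inequality`, and the three test functions are hypothetical choices and are not part of the text. It searches a finite grid for a configuration x1 ≤ [x2, x3] ≤ x4 with x1 + x4 = x2 + x3 that violates the directionally convex version of (7.A.17). Finding no violation on a grid does not prove directional convexity; finding one disproves it.

```python
from itertools import product

GRID = range(-2, 3)  # integer grid {-2, ..., 2} in each coordinate


def violates_four_point_inequality(phi, dim=2):
    """Search the grid for x1 <= [x2, x3] <= x4 with x1 + x4 == x2 + x3 and
    phi(x2) + phi(x3) > phi(x1) + phi(x4); return a witness or None."""
    points = list(product(GRID, repeat=dim))
    for x1, x2, x3 in product(points, repeat=3):
        # x4 is determined by the constraint x1 + x4 = x2 + x3.
        x4 = tuple(b + c - a for a, b, c in zip(x1, x2, x3))
        ordered = all(
            a <= min(b, c) and max(b, c) <= d
            for a, b, c, d in zip(x1, x2, x3, x4)
        )
        if ordered and phi(x2) + phi(x3) > phi(x1) + phi(x4):
            return x1, x2, x3, x4
    return None


def product_fn(v):
    """x*y: not convex (indefinite Hessian), all second derivatives >= 0."""
    return v[0] * v[1]


def squared_difference(v):
    """(x - y)^2: convex, but its mixed second derivative is negative."""
    return (v[0] - v[1]) ** 2


def squared_norm(v):
    """x^2 + y^2: convex and separable."""
    return v[0] ** 2 + v[1] ** 2


print(violates_four_point_inequality(product_fn))
print(violates_four_point_inequality(squared_difference))
print(violates_four_point_inequality(squared_norm))
```

On this grid the product x·y satisfies (7.A.17) although it is not convex, while the convex function (x − y)² violates it, for instance at x1 = (0, 0), x2 = (1, 0), x3 = (0, 1), x4 = (1, 1).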

A function φ : R^n → R^m is called directionally convex [concave] if the coordinate functions φi, i = 1, 2, ..., m, defined by φ(x) = (φ1(x), φ2(x), ..., φm(x)), are directionally convex [concave]. Directional convexity neither implies, nor is implied by, conventional convexity. However, a univariate function is directionally convex [concave] if, and only if, it is convex [concave].

A function φ : R^n → R is said to be supermodular [submodular] if for any x, y ∈ R^n it satisfies φ(x) + φ(y) ≤ [≥] φ(x ∧ y) + φ(x ∨ y), where the operators ∧ and ∨ denote coordinatewise minimum and maximum, respectively. If φ : R^n → R has second partial derivatives, then it is supermodular if, and only if, ∂²φ/∂xi∂xj ≥ 0 for all i ≠ j. Many examples of supermodular functions can be found in Marshall and Olkin [383, Chapter 6].

Proposition 7.A.27. The following statements are equivalent:
(a) The function φ is directionally convex [concave].
(b) The function φ is supermodular [submodular] and coordinatewise convex [concave].
(c) For any x1, x2, y ∈ R^n, such that x1 ≤ x2 and y ≥ 0, one has φ(x1 + y) − φ(x1) ≤ [≥] φ(x2 + y) − φ(x2).

If φ is twice differentiable, then it is directionally convex [concave] if, and only if, all its second derivatives are nonnegative [nonpositive]. Another useful property of directionally convex [concave] functions is stated next.

Proposition 7.A.28. (a) If ψ : R^m → R^k is increasing and directionally convex [concave] and φ : R^n → R^m is increasing and directionally convex [concave], then the composition ψ(φ) is increasing and directionally


convex [concave]. In particular, if ψ : R → R is increasing and convex [concave] and φ : R^n → R is increasing and directionally convex [concave], then the composition ψ(φ) is increasing and directionally convex [concave].
(b) If ψ : R^m → R^k is increasing and directionally convex [concave] and φ : R^n → R^m is decreasing and directionally convex [concave], then the composition ψ(φ) is decreasing and directionally convex [concave]. In particular, if ψ : R → R is increasing and convex [concave] and φ : R^n → R is decreasing and directionally convex [concave], then the composition ψ(φ) is decreasing and directionally convex [concave].

Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors. Suppose that X and Y are such that

$$E\phi(X) \le E\phi(Y) \quad \text{for all [increasing] functions } \phi : \mathbb{R}^n \to \mathbb{R} \text{ that are directionally convex,}$$

provided the expectations exist. Then X is said to be smaller than Y in the [increasing] directionally convex order (denoted by X ≤dir-cx [≤idir-cx] Y). The orders ≤dir-cv and ≤idir-cv are defined similarly.

The following relationships among the orders ≤dir-cx [≤idir-cx] and ≤ccx [≤iccx] follow from Proposition 7.A.27. The last assertion in the next theorem follows from the observation that −φ is directionally concave if, and only if, φ is directionally convex.

Theorem 7.A.29. Let X and Y be two random vectors. If X ≤ccx [≤iccx] Y, then X ≤dir-cx [≤idir-cx] Y. Also, if X ≤dir-cx Y, then X ≤idir-cx Y and X ≥dir-cv Y.

From Proposition 7.A.28 we obtain the following result (which may be compared with Theorems 6.G.10 and 9.A.16).

Theorem 7.A.30. Let X and Y be two n-dimensional random vectors. If X ≤idir-cx Y, then φ(X) ≤idir-cx φ(Y) for any increasing and directionally convex function φ : R^n → R^m. In particular, φ(X) ≤icx φ(Y) for any increasing and directionally convex function φ : R^n → R.

Theorem 7.A.31. Let {Xj, j = 1, 2, ...} and {Yj, j = 1, 2, ...} be two sequences of random vectors such that Xj →st X and Yj →st Y as j → ∞. Assume that EXj → EX and that EYj → EY as j → ∞. If Xj ≤dir-cx Yj, j = 1, 2, ..., then X ≤dir-cx Y.

From Theorems 7.A.24 and 7.A.29 we immediately obtain the next result.

Theorem 7.A.32. Let X1, X2, ..., Xm be a set of independent random variables and let Y1, Y2, ..., Ym be another set of independent random variables. If Xi ≤cx [≤icx] Yi for i = 1, 2, ..., m, then (X1, X2, ..., Xm) ≤dir-cx [≤idir-cx] (Y1, Y2, ..., Ym).


A stronger result than the ≤cx and ≤dir-cx part of Theorem 7.A.32 is Theorem 7.A.38 below. Also, the ≤icx and ≤idir-cx part of Theorem 7.A.32 still holds if it is merely assumed that (Y1, Y2, ..., Ym) is CIS (as defined in (6.B.11)) rather than assuming that it consists of independent components.

The following result (which is a generalization of Theorem 7.A.32) shows that the directionally convex orders are closed under conjunctions.

Theorem 7.A.33. Let X1, X2, ..., Xm be a set of independent random vectors where the dimension of Xi is ki, i = 1, 2, ..., m. Let Y1, Y2, ..., Ym be another set of independent random vectors where the dimension of Yi is ki, i = 1, 2, ..., m. If Xi ≤dir-cx [≤idir-cx] Yi for i = 1, 2, ..., m, then (X1, X2, ..., Xm) ≤dir-cx [≤idir-cx] (Y1, Y2, ..., Ym).

Proof. It is enough to show that if X1 and Y1 are of the same dimension k1, and if Z is another random vector, of dimension k, which is independent of X1 and Y1, and if X1 ≤dir-cx [≤idir-cx] Y1, then (X1, Z) ≤dir-cx [≤idir-cx] (Y1, Z). The rest of the proof can then be obtained by induction and pairwise interchanges. So let φ be a (k1 + k)-dimensional [increasing] directionally convex function. Note that φ(x, z) is [increasing] directionally convex in x for any z, where the dimensions of x and z are k1 and k, respectively. Thus from X1 ≤dir-cx [≤idir-cx] Y1 and the independence assumption we obtain

$$E\phi(X_1, Z) = E\bigl[E[\phi(X_1, Z) \mid Z]\bigr] \le E\bigl[E[\phi(Y_1, Z) \mid Z]\bigr] = E\phi(Y_1, Z),$$

and the proof is complete.

The next result shows that the directionally convex orders are closed under convolutions.

Theorem 7.A.34. Let $\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m$ be a set of independent random vectors and let $\boldsymbol{Y}_1, \boldsymbol{Y}_2, \dots, \boldsymbol{Y}_m$ be another set of independent random vectors, all of the same dimension $k$. If $\boldsymbol{X}_i \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \boldsymbol{Y}_i$ for $i = 1, 2, \dots, m$, then

$$\sum_{i=1}^m \boldsymbol{X}_i \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \sum_{i=1}^m \boldsymbol{Y}_i.$$

Proof. Let $\phi : \mathbb{R}^k \to \mathbb{R}$ be any [increasing] directionally convex function. Then the function $\psi : \mathbb{R}^{km} \to \mathbb{R}$, defined by $\psi(\boldsymbol{x}_1, \boldsymbol{x}_2, \dots, \boldsymbol{x}_m) = \phi\left(\sum_{i=1}^m \boldsymbol{x}_i\right)$, is an [increasing] directionally convex function. The stated result now follows from Theorem 7.A.33. (The idir-cx part also follows directly from Theorems 7.A.30 and 7.A.33.)

A continuous analog of Theorem 7.A.34 (where the sums are replaced by integrals) is the following result.


7 Multivariate Variability and Related Orders

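Theorem 7.A.35. Let $\{X(\boldsymbol{t})\}_{\boldsymbol{t} \in \mathbb{R}^d}$ and $\{Y(\boldsymbol{t})\}_{\boldsymbol{t} \in \mathbb{R}^d}$ be two $\mathbb{R}$-valued random fields which are a.s. Riemann-integrable. Suppose that $(X(\boldsymbol{t}_1), X(\boldsymbol{t}_2), \dots, X(\boldsymbol{t}_k)) \le_{\text{idir-cx}} (Y(\boldsymbol{t}_1), Y(\boldsymbol{t}_2), \dots, Y(\boldsymbol{t}_k))$ for all $\boldsymbol{t}_1, \boldsymbol{t}_2, \dots, \boldsymbol{t}_k \in \mathbb{R}^d$, $k = 1, 2, \dots$. Then

$$\left(\int_{B_1} X(\boldsymbol{t})\,d\boldsymbol{t}, \int_{B_2} X(\boldsymbol{t})\,d\boldsymbol{t}, \dots, \int_{B_k} X(\boldsymbol{t})\,d\boldsymbol{t}\right) \le_{\text{idir-cx}} \left(\int_{B_1} Y(\boldsymbol{t})\,d\boldsymbol{t}, \int_{B_2} Y(\boldsymbol{t})\,d\boldsymbol{t}, \dots, \int_{B_k} Y(\boldsymbol{t})\,d\boldsymbol{t}\right)$$

for any disjoint bounded Borel-measurable sets $B_1, B_2, \dots, B_k$ in $\mathbb{R}^d$, $k = 1, 2, \dots$.

The following result may be compared with Theorem 7.A.25.

Theorem 7.A.36. Let $\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m$ be $m$ countably infinite vectors of independent nonnegative random variables. Let $\boldsymbol{M} = (M_1, M_2, \dots, M_m)$ and $\boldsymbol{N} = (N_1, N_2, \dots, N_m)$ be two vectors of nonnegative integers which are independent of $\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m$. Denote by $X_{j,i}$ the $i$th element of $\boldsymbol{X}_j$. If $X_{j,i} \le_{\text{cx}} [\le_{\text{icx}}]\ X_{j,i+1}$ for $j = 1, 2, \dots, m$, and $i \ge 1$, and if $\boldsymbol{M} \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \boldsymbol{N}$, then

$$\left(\sum_{i=1}^{M_1} X_{1,i}, \sum_{i=1}^{M_2} X_{2,i}, \dots, \sum_{i=1}^{M_m} X_{m,i}\right) \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \left(\sum_{i=1}^{N_1} X_{1,i}, \sum_{i=1}^{N_2} X_{2,i}, \dots, \sum_{i=1}^{N_m} X_{m,i}\right).$$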

Consider now, as in Section 6.B.4, $n$ families of univariate distribution functions $\{G^{(i)}_\theta, \theta \in \mathcal{X}_i\}$ where $\mathcal{X}_i$ is a subset of the real line $\mathbb{R}$, $i = 1, 2, \dots, n$. Let $X_i(\theta)$ denote a random variable with distribution function $G^{(i)}_\theta$, $i = 1, 2, \dots, n$. Below we give a result which provides comparisons of two random vectors, with distribution functions of the form (6.B.18), in the [increasing] directionally convex order. The following result is a multivariate extension of Theorems 3.A.21 and 4.A.18; see Theorems 6.B.17, 6.G.8, 9.A.7, and 9.A.15 for related results.

Theorem 7.A.37. Let $\{G^{(i)}_\theta, \theta \in \mathcal{X}_i\}$, $i = 1, 2, \dots, n$, be $n$ families of univariate distribution functions as above. Let $\boldsymbol{\Theta}_1$ and $\boldsymbol{\Theta}_2$ be two random vectors with supports in $\prod_{i=1}^n \mathcal{X}_i$ and distribution functions $F_1$ and $F_2$, respectively. Let $\boldsymbol{Y}_1$ and $\boldsymbol{Y}_2$ be two random vectors with distribution functions $H_1$ and $H_2$ given by

$$H_j(y_1, y_2, \dots, y_n) = \int_{\mathcal{X}_1} \int_{\mathcal{X}_2} \cdots \int_{\mathcal{X}_n} \prod_{i=1}^n G^{(i)}_{\theta_i}(y_i)\, dF_j(\theta_1, \theta_2, \dots, \theta_n), \quad (y_1, y_2, \dots, y_n) \in \mathbb{R}^n,\ j = 1, 2.$$

If for every [increasing] convex function $\phi$,

$$E[\phi(X_i(\theta))] \text{ is [increasing] convex in } \theta, \quad i = 1, 2, \dots, n,$$

and if Θ 1 ≤dir-cx [≤idir-cx ] Θ2 , then Y 1 ≤dir-cx [≤idir-cx ] Y 2 . The following result compares, with respect to ≤dir-cx , two random vectors with the same dependence structure. Recall the deﬁnition of CIS given in (6.B.11). If every permutation of the coordinates of a random vector is CIS, then the vector is said to be conditionally increasing (CI). Recall also the deﬁnition of a copula, given in (6.B.14). Theorem 7.A.38. Let the random vectors X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) have a common copula that is CI. If Xi ≤cx Yi , i = 1, 2, . . . , n, then X ≤dir-cx Y . Theorem 7.A.38 may be compared with Theorems 6.B.14 and 7.A.32. A result that is stronger than Theorem 7.A.38 is Theorem 9.A.25 in Section 9.A. The following example gives necessary and suﬃcient conditions for the comparison of multivariate normal random vectors. See Examples 6.B.29, 6.G.11, 7.A.13, 7.B.5, and 9.A.20 for related results. Example 7.A.39. Let X be a multivariate normal random vector with mean vector µX and variance-covariance matrix Σ X , and let Y be a multivariate normal random vector with mean vector µY and variance-covariance matrix Σ Y . Then X ≤dir-cx Y if, and only if, µX = µY and Σ X ≤ Σ Y . It is worth mentioning that the result in Example 7.A.26 implies the sufﬁciency part in Example 7.A.39. In closing this subsection it is worthwhile to mention that a stochastic order, which is deﬁned by requiring Eφ(X) ≤ Eφ(Y ) to hold for all supermodular [rather than supermodular and componentwise convex, that is, directionally convex] functions φ, is studied in Section 9.A.4. 7.A.9 The orthant convex and concave orders Analogous to the orthant orders studied in Section 6.G.1, one can introduce and study orthant convex and concave orders. This is done in this subsection. Let X = (X1 , X2 , . . . 
, $X_n)$ be a random vector with distribution function $F$ and multivariate survival function $\bar{F}$ (see the exact definition of a multivariate survival function in Section 6.G.1). Let $\boldsymbol{Y}$ be another $n$-dimensional random vector with distribution function $G$ and survival function $\bar{G}$. If

$$\int_{x_1}^{\infty} \int_{x_2}^{\infty} \cdots \int_{x_n}^{\infty} \bar{F}(u_1, u_2, \dots, u_n)\, du_1\, du_2 \cdots du_n \le \int_{x_1}^{\infty} \int_{x_2}^{\infty} \cdots \int_{x_n}^{\infty} \bar{G}(u_1, u_2, \dots, u_n)\, du_1\, du_2 \cdots du_n \quad \text{for all } \boldsymbol{x},$$

then we say that $\boldsymbol{X}$ is smaller than $\boldsymbol{Y}$ in the upper orthant-convex order (denoted by $\boldsymbol{X} \le_{\text{uo-cx}} \boldsymbol{Y}$). If

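$$\int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \cdots \int_{-\infty}^{x_n} F(u_1, u_2, \dots, u_n)\, du_1\, du_2 \cdots du_n \ge \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \cdots \int_{-\infty}^{x_n} G(u_1, u_2, \dots, u_n)\, du_1\, du_2 \cdots du_n \quad \text{for all } \boldsymbol{x},$$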

then we say that $\boldsymbol{X}$ is smaller than $\boldsymbol{Y}$ in the lower orthant-concave order (denoted by $\boldsymbol{X} \le_{\text{lo-cv}} \boldsymbol{Y}$). In analogy with Theorem 6.G.1 it is not hard to obtain the following characterizations of the orders $\le_{\text{uo-cx}}$ and $\le_{\text{lo-cv}}$.

Theorem 7.A.40. Let $\boldsymbol{X}$ and $\boldsymbol{Y}$ be two $n$-dimensional random vectors. Then

(a) $\boldsymbol{X} \le_{\text{uo-cx}} \boldsymbol{Y}$ if, and only if,

$$E\left[\prod_{i=1}^n g_i(X_i)\right] \le E\left[\prod_{i=1}^n g_i(Y_i)\right]$$

for every collection $\{g_1, g_2, \dots, g_n\}$ of univariate nonnegative increasing convex functions.

(b) $\boldsymbol{X} \le_{\text{lo-cv}} \boldsymbol{Y}$ if, and only if,

$$E\left[\prod_{i=1}^n h_i(X_i)\right] \le E\left[\prod_{i=1}^n h_i(Y_i)\right]$$

for every collection $\{h_1, h_2, \dots, h_n\}$ of univariate nonnegative increasing functions such that $h_i$ is concave on the union of the supports of $X_i$ and $Y_i$, $i = 1, 2, \dots, n$.

From Theorem 7.A.40 it is easy to obtain the following result which is an extension of the fact that if the random variables $X$ and $Y$ satisfy $X \le_{\text{icx}} Y$, then $\phi(X) \le_{\text{icx}} \phi(Y)$ for all real increasing convex functions $\phi$ on $\mathbb{R}$ (see Theorem 4.A.15).

Theorem 7.A.41. Let $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be two random vectors.

(a) If $\boldsymbol{X} \le_{\text{uo-cx}} \boldsymbol{Y}$, then $(\phi_1(X_1), \phi_2(X_2), \dots, \phi_n(X_n)) \le_{\text{uo-cx}} (\phi_1(Y_1), \phi_2(Y_2), \dots, \phi_n(Y_n))$ whenever $\phi_1, \phi_2, \dots, \phi_n$ are increasing convex functions.

(b) If $\boldsymbol{X} \le_{\text{lo-cv}} \boldsymbol{Y}$, then $(\phi_1(X_1), \phi_2(X_2), \dots, \phi_n(X_n)) \le_{\text{lo-cv}} (\phi_1(Y_1), \phi_2(Y_2), \dots, \phi_n(Y_n))$ whenever $\phi_1, \phi_2, \dots, \phi_n$ are increasing concave functions.
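The integral in the definition of the upper orthant-convex order can be computed as an expectation, since by Fubini's theorem the integral of the survival function over $\{\boldsymbol{u} : \boldsymbol{u} \ge \boldsymbol{x}\}$ equals $E\prod_{i=1}^n (X_i - x_i)_+$. The sketch below uses that identity for vectors with independent discrete components; the distributions, the grid, and the function name are illustrative assumptions.

```python
def upper_orthant_integral(marginals, x):
    """E[prod_i (X_i - x_i)_+] for a vector with independent discrete components.

    This equals the integral of the multivariate survival function over
    the upper orthant {u : u >= x}. `marginals` is a list of dicts
    {value: probability}, one per component.
    """
    result = 1.0
    for marginal, xi in zip(marginals, x):
        # Independence lets the expectation of the product factorize.
        result *= sum(p * max(v - xi, 0.0) for v, p in marginal.items())
    return result


# Assumed example: X_i is degenerate at 1, Y_i is uniform on {0, 2},
# so X_i <=_icx Y_i componentwise, with independent components.
x_marginals = [{1.0: 1.0}, {1.0: 1.0}]
y_marginals = [{0.0: 0.5, 2.0: 0.5}, {0.0: 0.5, 2.0: 0.5}]

grid = [k / 2.0 for k in range(-2, 7)]  # -1.0, -0.5, ..., 3.0
holds_on_grid = all(
    upper_orthant_integral(x_marginals, (a, b))
    <= upper_orthant_integral(y_marginals, (a, b)) + 1e-12
    for a in grid
    for b in grid
)
```

A grid check of this kind is only a necessary condition for the order, which requires the inequality for all $\boldsymbol{x}$.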


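From Theorems 7.A.40 and 6.G.1 it follows that $\boldsymbol{X} \le_{\text{uo}} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\text{uo-cx}} \boldsymbol{Y}$ and that $\boldsymbol{X} \le_{\text{lo}} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\text{lo-cv}} \boldsymbol{Y}$.

The following results may be compared with Theorems 7.A.25 and 7.A.36.

Theorem 7.A.42. Let $\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m$ be $m$ countably infinite vectors of independent nonnegative random variables. Let $\boldsymbol{M} = (M_1, M_2, \dots, M_m)$ and $\boldsymbol{N} = (N_1, N_2, \dots, N_m)$ be two vectors of nonnegative integers which are independent of $\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m$. Denote by $X_{j,i}$ the $i$th element of $\boldsymbol{X}_j$. If $X_{j,i} \le_{\text{icx}} [\ge_{\text{icv}}]\ X_{j,i+1}$ for $j = 1, 2, \dots, m$, and $i \ge 1$, and if $\boldsymbol{M} \le_{\text{uo-cx}} [\le_{\text{lo-cv}}]\ \boldsymbol{N}$, then

$$\left(\sum_{i=1}^{M_1} X_{1,i}, \sum_{i=1}^{M_2} X_{2,i}, \dots, \sum_{i=1}^{M_m} X_{m,i}\right) \le_{\text{uo-cx}} [\le_{\text{lo-cv}}]\ \left(\sum_{i=1}^{N_1} X_{1,i}, \sum_{i=1}^{N_2} X_{2,i}, \dots, \sum_{i=1}^{N_m} X_{m,i}\right).$$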

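Consider now the function $\phi : \mathbb{R}^n \to \mathbb{R}$ which is defined by $\phi(x_1, x_2, \dots, x_n) = \prod_{i=1}^n g_i(x_i)$, where each $g_i : \mathbb{R} \to \mathbb{R}$ is nonnegative, increasing, and convex [concave]. It is easy to verify that $\phi$ is increasing and directionally convex [concave]. Thus, from Theorem 7.A.40 we obtain that $\boldsymbol{X} \le_{\text{idir-cx}} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\text{uo-cx}} \boldsymbol{Y}$ and $\boldsymbol{X} \le_{\text{idir-cv}} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\text{lo-cv}} \boldsymbol{Y}$. It is worth mentioning that the supermodular order, studied in Section 9.A.4, implies the orders $\le_{\text{uo}}$, $\le_{\text{lo}}$, and $\le_{\text{idir-cx}}$, mentioned above.

We now describe a multivariate extension of the univariate order $\le_{m\text{-icv}}$ (see Section 4.A.7). A special case of this extension is the order $\le^{\mathbb{R}^2}_{(m,m)\text{-icv}}$ which is discussed in Section 7.A.5. A similar extension of the univariate order $\le_{m\text{-icx}}$ can also be defined and studied.

For $\boldsymbol{x} \in \mathbb{R}^n$, let $L(\boldsymbol{x}) = \{\boldsymbol{y} : \boldsymbol{y} \le \boldsymbol{x}\}$. For an $n$-dimensional distribution function $F$ define

$$F_1(\boldsymbol{x}) = F(\boldsymbol{x}) \quad \text{and} \quad F_m(\boldsymbol{x}) = \int_{L(\boldsymbol{x})} F_{m-1}(\boldsymbol{u})\, d\boldsymbol{u}.$$

For $n$-dimensional distribution functions $F$ and $G$ denote

$$F \le^n_m G \iff F_m(\boldsymbol{x}) \ge G_m(\boldsymbol{x}) \quad \text{for all } \boldsymbol{x} \in \mathbb{R}^n.$$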


When $m = 1$ the above order is equivalent to the lower orthant order defined in (6.G.2). When $m = 2$ the above order is a multivariate (left-sided) analog of (4.A.5). If $\boldsymbol{X}$ and $\boldsymbol{Y}$ have the distribution functions $F$ and $G$, respectively, then (as can be easily seen by taking $m = 2$) the relationship $F \le^n_2 G$ is the same as $\boldsymbol{X} \le_{\text{lo-cv}} \boldsymbol{Y}$. Also, the order $\le^2_m$ is the order $\le^{\mathbb{R}^2}_{(m,m)\text{-icv}}$ which is discussed in Section 7.A.5.

For any $n$-dimensional distribution function $F$, its $(n-1)$-dimensional marginal distribution functions are defined by

$$F^{(i)}(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) = F(x_1, \dots, x_{i-1}, \infty, x_{i+1}, \dots, x_n), \quad i = 1, 2, \dots, n.$$

The next result shows that the order $\le^n_m$ is preserved under marginalization. Before stating the next result we need the following definition. The distribution function $F$ is said to be margin-regular for $m > 1$ and $i \le n$ if for each $\boldsymbol{x}^{(i)} = (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$ for which $F^{(i)}(\boldsymbol{x}^{(i)}) < \infty$, there is an $x_i \in \mathbb{R}$ such that $F(x_1, x_2, \dots, x_n) < \infty$.

Theorem 7.A.43. For $n > 1$, $m > 1$, and $i \le n$, let $F$ and $G$ be two $n$-dimensional distribution functions such that $F \le^n_m G$ and $F$ is margin-regular for $m$ and $i$. Then $F^{(i)} \le^{n-1}_m G^{(i)}$.

7.B Multivariate Dispersion Orders

Different characterizations of the univariate order $\le_{\text{disp}}$ give rise to different multivariate dispersive orders. In this section we describe some such orders.

7.B.1 A strong multivariate dispersion order

Recall from (3.B.13) that for univariate random variables we have that $X \le_{\text{disp}} Y$ if, and only if, $Y =_{\text{st}} \phi(X)$ for some $\phi$ that satisfies $\phi(x') - \phi(x) \ge x' - x$ whenever $x \le x'$. An extension of this definition of the univariate dispersion order gives the multivariate dispersion order that is discussed in this subsection.

A function $\phi : \mathbb{R}^n \to \mathbb{R}^n$ is called an expansion if

$$\|\phi(\boldsymbol{x}) - \phi(\boldsymbol{x}')\| \ge \|\boldsymbol{x} - \boldsymbol{x}'\| \quad \text{for all } \boldsymbol{x} \text{ and } \boldsymbol{x}' \text{ in } \mathbb{R}^n.$$

Let $\boldsymbol{X}$ and $\boldsymbol{Y}$ be two $n$-dimensional random vectors. Suppose that

$$\boldsymbol{Y} =_{\text{st}} \phi(\boldsymbol{X}) \quad \text{for some expansion } \phi. \tag{7.B.1}$$

Then we say that $\boldsymbol{X}$ is less than $\boldsymbol{Y}$ in the strong multivariate dispersive order (denoted by $\boldsymbol{X} \le_{\text{SD}} \boldsymbol{Y}$). Let $J_\phi(\boldsymbol{x})$ denote the Jacobian matrix of $\phi$ at $\boldsymbol{x}$, that is,

$$J_\phi(\boldsymbol{x}) = \left(\frac{\partial \phi_i}{\partial x_j}\right).$$

It is useful to note that $\phi$ is an expansion if, and only if, $J_\phi^T(\boldsymbol{x}) J_\phi(\boldsymbol{x}) - I$ is nonnegative definite, where $I$ is the identity matrix; see Giovagnoli and Wynn [211].

It is very easy to show that the strong multivariate dispersion order $\le_{\text{SD}}$ is closed under conjunctions as the following result states.

Theorem 7.B.1. Let $\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m$ be a set of independent random vectors where the dimension of $\boldsymbol{X}_i$ is $k_i$, $i = 1, 2, \dots, m$. Let $\boldsymbol{Y}_1, \boldsymbol{Y}_2, \dots, \boldsymbol{Y}_m$ be another set of independent random vectors where the dimension of $\boldsymbol{Y}_i$ is $k_i$, $i = 1, 2, \dots, m$. If $\boldsymbol{X}_i \le_{\text{SD}} \boldsymbol{Y}_i$ for $i = 1, 2, \dots, m$, then $(\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m) \le_{\text{SD}} (\boldsymbol{Y}_1, \boldsymbol{Y}_2, \dots, \boldsymbol{Y}_m)$.

The strong multivariate dispersion order $\le_{\text{SD}}$ also satisfies the following closure property, the proof of which is omitted.

Theorem 7.B.2. Let $\boldsymbol{X}$ and $\boldsymbol{Y}$ be two $n$-dimensional random vectors. Let $A$ be an $n \times n$ matrix such that for any orthogonal matrix $\Gamma$ there exists an orthogonal matrix $\tilde{\Gamma}$ such that $\Gamma A \tilde{\Gamma} = A$. If $\boldsymbol{X} \le_{\text{SD}} \boldsymbol{Y}$, then $A\boldsymbol{X} \le_{\text{SD}} A\boldsymbol{Y}$.

The following result compares, with respect to the order $\le_{\text{SD}}$, two random vectors with the same dependence structure. Recall the definition of a copula, given in (6.B.14).

Theorem 7.B.3. Let the random vectors $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ have a common copula. If $X_i \le_{\text{disp}} Y_i$, $i = 1, 2, \dots, n$, then $\boldsymbol{X} \le_{\text{SD}} \boldsymbol{Y}$.

An interesting application of Theorem 7.B.3 is the following result which may be compared with Theorems 6.D.10, 6.E.12, and 7.B.12.

Theorem 7.B.4. Let $X_{(1)}, X_{(2)}, \dots, X_{(n)}$ and $Y_{(1)}, Y_{(2)}, \dots, Y_{(n)}$ be order statistics as in Theorem 6.D.10. If $X_1 \le_{\text{disp}} Y_1$, then $(X_{(1)}, X_{(2)}, \dots, X_{(n)}) \le_{\text{SD}} (Y_{(1)}, Y_{(2)}, \dots, Y_{(n)})$.

Proof. The vectors $(X_{(1)}, X_{(2)}, \dots, X_{(n)})$ and $(Y_{(1)}, Y_{(2)}, \dots, Y_{(n)})$ have the same copula. By Theorem 3.B.26, $X_1 \le_{\text{disp}} Y_1$ implies that $X_{(i)} \le_{\text{disp}} Y_{(i)}$, $i = 1, 2, \dots, n$. The stated result now follows from Theorem 7.B.3.
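The Jacobian criterion for expansions stated earlier in this subsection is easy to apply to a linear map $\phi(\boldsymbol{x}) = A\boldsymbol{x}$, whose Jacobian is the constant matrix $A$. The following minimal sketch handles only the case $n = 2$; the function name and the example matrices are illustrative assumptions.

```python
def is_linear_expansion_2d(a):
    """Check whether x -> a x is an expansion on R^2.

    Uses the criterion that J^T J - I must be nonnegative definite,
    where J = a is the (constant) Jacobian of the linear map.
    `a` is given as ((a11, a12), (a21, a22)).
    """
    (a11, a12), (a21, a22) = a
    # Entries of the symmetric matrix M = a^T a - I.
    m11 = a11 * a11 + a21 * a21 - 1.0
    m12 = a11 * a12 + a21 * a22
    m22 = a12 * a12 + a22 * a22 - 1.0
    # A symmetric 2x2 matrix is nonnegative definite if and only if
    # its trace and its determinant are both nonnegative.
    tol = 1e-12
    return (m11 + m22 >= -tol) and (m11 * m22 - m12 * m12 >= -tol)


scaling = ((2.0, 0.0), (0.0, 3.0))      # stretches every direction
rotation = ((0.0, -1.0), (1.0, 0.0))    # isometry: J^T J - I = 0
shear = ((1.0, 1.0), (0.0, 1.0))        # shrinks the distance along (1, -1)
mixed = ((0.5, 0.0), (0.0, 4.0))        # contracts the first coordinate
```

The shear has determinant 1, so it preserves volumes, yet it fails to be an expansion; this illustrates that a condition on the determinant alone does not imply the expansion property.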

An interesting example in which the order ≤SD arises naturally is the following. See also Examples 6.B.29, 6.G.11, 7.A.13, 7.A.26, 7.A.39, and 9.A.20.


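Example 7.B.5. Let $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ be a multivariate normal random vector with mean vector $\boldsymbol{\mu}_1$, and let $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be a multivariate normal random vector with mean vector $\boldsymbol{\mu}_2$. If $\boldsymbol{X}$ and $\boldsymbol{Y}$ have the same correlation matrix, and if $\text{Var}(X_i) \le \text{Var}(Y_i)$, $i = 1, 2, \dots, n$, then $\boldsymbol{X} \le_{\text{SD}} \boldsymbol{Y}$. This can be seen from Theorem 7.B.3 by noting that $\boldsymbol{X}$ and $\boldsymbol{Y}$ have the same copula, and that $\text{Var}(X_i) \le \text{Var}(Y_i)$ implies $X_i \le_{\text{disp}} Y_i$, $i = 1, 2, \dots, n$.

Arias-Nicolás, Fernández-Ponce, Luque-Calvo, and Suárez-Llorens [17] and Fernández-Ponce and Rodríguez-Griñolo [196] compared, respectively, some multivariate $t$ and Wishart random vectors with respect to the order $\le_{\text{SD}}$.

According to Oja [441], an $n$-dimensional random vector $\boldsymbol{Y}$ is said to be more scattered than another $n$-dimensional random vector $\boldsymbol{X}$ (denoted as $\boldsymbol{X} \le_{\Delta} \boldsymbol{Y}$) if $\boldsymbol{Y} =_{\text{st}} \phi(\boldsymbol{X})$ for some function $\phi : \mathbb{R}^n \to \mathbb{R}^n$ that has the property that

$$\Delta(\phi(\boldsymbol{x}_1), \phi(\boldsymbol{x}_2), \dots, \phi(\boldsymbol{x}_{n+1})) \ge \Delta(\boldsymbol{x}_1, \boldsymbol{x}_2, \dots, \boldsymbol{x}_{n+1}) \tag{7.B.2}$$

for all $\{\boldsymbol{x}_1, \boldsymbol{x}_2, \dots, \boldsymbol{x}_{n+1}\} \subset \mathbb{R}^n$, where $\Delta(\boldsymbol{x}_1, \boldsymbol{x}_2, \dots, \boldsymbol{x}_{n+1})$ is the volume of the simplex with vertices $\boldsymbol{x}_1, \boldsymbol{x}_2, \dots, \boldsymbol{x}_{n+1}$. It is useful to note that a function $\phi$ satisfies (7.B.2) for all $\{\boldsymbol{x}_1, \boldsymbol{x}_2, \dots, \boldsymbol{x}_{n+1}\} \subset \mathbb{R}^n$ if, and only if, the determinant of the Jacobian matrix of $\phi$ satisfies

$$|\text{Det}(J_\phi(\boldsymbol{x}))| \ge 1 \quad \text{for all } \boldsymbol{x} \in \mathbb{R}^n.$$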

The order $\le_{\Delta}$, as the order $\le_{\text{SD}}$, is a multivariate extension of the characterization (3.B.13) of the univariate order $\le_{\text{disp}}$. We have the following relationship between the orders $\le_{\Delta}$ and $\le_{\text{SD}}$: $\boldsymbol{X} \le_{\text{SD}} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\Delta} \boldsymbol{Y}$.

Fernández-Ponce and Suárez-Llorens [198] introduced a multivariate dispersion order that is even stronger than $\le_{\text{SD}}$. They did it by essentially requiring (7.B.1) to hold for a particular expansion $\phi$ which is a multivariate analog of the univariate function $\phi = G^{-1}F$ in (3.B.13) in Section 3.B.

7.B.2 A weak multivariate dispersion order

The property (3.B.34) of the univariate dispersive order has an obvious multivariate analog, which is used in this subsection in order to define a multivariate dispersion order.

Let $\boldsymbol{X}$ and $\boldsymbol{Y}$ be two $n$-dimensional random vectors. Let $\boldsymbol{X}'$ and $\boldsymbol{Y}'$ be such that $\boldsymbol{X}' =_{\text{st}} \boldsymbol{X}$ and $\boldsymbol{Y}' =_{\text{st}} \boldsymbol{Y}$ and such that $\boldsymbol{X}$ and $\boldsymbol{X}'$ are independent and $\boldsymbol{Y}$ and $\boldsymbol{Y}'$ are independent. Suppose that

$$\|\boldsymbol{X} - \boldsymbol{X}'\| \le_{\text{st}} \|\boldsymbol{Y} - \boldsymbol{Y}'\|,$$

where $\|\cdot\|$ is the Euclidean norm and $\le_{\text{st}}$ is the usual univariate stochastic order discussed in Section 1.A. Then we say that $\boldsymbol{X}$ is smaller than $\boldsymbol{Y}$ in the multivariate dispersion order (denoted by $\boldsymbol{X} \le_{\text{D}} \boldsymbol{Y}$).


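The multivariate dispersion order $\le_{\text{D}}$ has the desirable property that the traces of the corresponding covariance matrices are ordered as expected. This multivariate analog of (3.B.25) is shown in the next theorem.

Theorem 7.B.6. Let $\boldsymbol{X}$ and $\boldsymbol{Y}$ be two $n$-dimensional random vectors. If $\boldsymbol{X} \le_{\text{D}} \boldsymbol{Y}$, then

$$\text{tr}(\text{Cov}(\boldsymbol{X})) \le \text{tr}(\text{Cov}(\boldsymbol{Y})). \tag{7.B.3}$$

Proof. Let $\boldsymbol{X}'$ and $\boldsymbol{Y}'$ be such that $\boldsymbol{X}' =_{\text{st}} \boldsymbol{X}$ and $\boldsymbol{Y}' =_{\text{st}} \boldsymbol{Y}$ and such that $\boldsymbol{X}$ and $\boldsymbol{X}'$ are independent and $\boldsymbol{Y}$ and $\boldsymbol{Y}'$ are independent. Then $\text{Cov}(\boldsymbol{X}) = \frac{1}{2} E[(\boldsymbol{X} - \boldsymbol{X}')(\boldsymbol{X} - \boldsymbol{X}')^T]$, and $\text{Cov}(\boldsymbol{Y}) = \frac{1}{2} E[(\boldsymbol{Y} - \boldsymbol{Y}')(\boldsymbol{Y} - \boldsymbol{Y}')^T]$. Therefore

$$\text{tr}(\text{Cov}(\boldsymbol{X})) = \tfrac{1}{2} E\left[\text{tr}\left((\boldsymbol{X} - \boldsymbol{X}')(\boldsymbol{X} - \boldsymbol{X}')^T\right)\right] = \tfrac{1}{2} E\|\boldsymbol{X} - \boldsymbol{X}'\|^2 \le \tfrac{1}{2} E\|\boldsymbol{Y} - \boldsymbol{Y}'\|^2 = \text{tr}(\text{Cov}(\boldsymbol{Y}))$$

and (7.B.3) is obtained.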
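The identity $\text{tr}(\text{Cov}(\boldsymbol{X})) = \frac{1}{2} E\|\boldsymbol{X} - \boldsymbol{X}'\|^2$ used in the proof can be checked exactly on a small discrete distribution. The sketch below is only an illustration; the distribution `dist` and the function names are assumptions.

```python
from itertools import product


def trace_cov(dist):
    """tr(Cov(X)) for a discrete vector given as {point_tuple: probability}."""
    dim = len(next(iter(dist)))
    mean = [sum(p * pt[k] for pt, p in dist.items()) for k in range(dim)]
    return sum(
        p * sum((pt[k] - mean[k]) ** 2 for k in range(dim))
        for pt, p in dist.items()
    )


def half_mean_sq_distance(dist):
    """(1/2) E||X - X'||^2, where X' is an independent copy of X."""
    return 0.5 * sum(
        p * q * sum((a - b) ** 2 for a, b in zip(pt, qt))
        for (pt, p), (qt, q) in product(dist.items(), repeat=2)
    )


# Assumed three-point distribution on the plane.
dist = {(0, 0): 0.25, (1, 2): 0.5, (3, 1): 0.25}
gap = abs(trace_cov(dist) - half_mean_sq_distance(dist))
```

Since $\le_{\text{D}}$ orders $\|\boldsymbol{X} - \boldsymbol{X}'\|$ stochastically, it orders the second moment of that distance, and the identity converts this into the trace inequality (7.B.3).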

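The multivariate dispersion order $\le_{\text{D}}$ is location-free and rotation-free as the next result shows. The proof is simple and is omitted.

Theorem 7.B.7. Let $\boldsymbol{X}$ and $\boldsymbol{Y}$ be two $n$-dimensional random vectors. If $\boldsymbol{X} \le_{\text{D}} \boldsymbol{Y}$, then $\Gamma\boldsymbol{X} + \boldsymbol{a} \le_{\text{D}} \Lambda\boldsymbol{Y} + \boldsymbol{b}$, for all orthogonal matrices $\Gamma$ and $\Lambda$ and for all vectors $\boldsymbol{a}$ and $\boldsymbol{b}$.

The multivariate dispersion order $\le_{\text{D}}$ is also closed under conjunctions as the following result states.

Theorem 7.B.8. Let $\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m$ be a set of independent random vectors where the dimension of $\boldsymbol{X}_i$ is $k_i$, $i = 1, 2, \dots, m$. Let $\boldsymbol{Y}_1, \boldsymbol{Y}_2, \dots, \boldsymbol{Y}_m$ be another set of independent random vectors where the dimension of $\boldsymbol{Y}_i$ is $k_i$, $i = 1, 2, \dots, m$. If $\boldsymbol{X}_i \le_{\text{D}} \boldsymbol{Y}_i$ for $i = 1, 2, \dots, m$, then $(\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m) \le_{\text{D}} (\boldsymbol{Y}_1, \boldsymbol{Y}_2, \dots, \boldsymbol{Y}_m)$.

Proof. It is sufficient to prove the result when $m = 2$. Let $\boldsymbol{X}_1'$, $\boldsymbol{X}_2'$, $\boldsymbol{Y}_1'$, and $\boldsymbol{Y}_2'$ be independent copies, as in the definition of $\le_{\text{D}}$, such that

$$\boldsymbol{X}_1' =_{\text{st}} \boldsymbol{X}_1, \quad \boldsymbol{X}_2' =_{\text{st}} \boldsymbol{X}_2, \quad \boldsymbol{Y}_1' =_{\text{st}} \boldsymbol{Y}_1, \quad \text{and} \quad \boldsymbol{Y}_2' =_{\text{st}} \boldsymbol{Y}_2.$$

Let

$$\boldsymbol{X} = (\boldsymbol{X}_1, \boldsymbol{X}_2), \quad \boldsymbol{X}' = (\boldsymbol{X}_1', \boldsymbol{X}_2'), \quad \boldsymbol{Y} = (\boldsymbol{Y}_1, \boldsymbol{Y}_2), \quad \text{and} \quad \boldsymbol{Y}' = (\boldsymbol{Y}_1', \boldsymbol{Y}_2').$$

Then, since the two summands on each side are independent and $\le_{\text{st}}$ is closed under sums of independent random variables,

$$\|\boldsymbol{X} - \boldsymbol{X}'\|^2 = \|\boldsymbol{X}_1 - \boldsymbol{X}_1'\|^2 + \|\boldsymbol{X}_2 - \boldsymbol{X}_2'\|^2 \le_{\text{st}} \|\boldsymbol{Y}_1 - \boldsymbol{Y}_1'\|^2 + \|\boldsymbol{Y}_2 - \boldsymbol{Y}_2'\|^2 = \|\boldsymbol{Y} - \boldsymbol{Y}'\|^2.$$

That is, $\boldsymbol{X} \le_{\text{D}} \boldsymbol{Y}$.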

By construction on the same probability space (see Section 6.B.2), it is easy to prove the following result.

Theorem 7.B.9. Let $\boldsymbol{X}$ and $\boldsymbol{Y}$ be two $n$-dimensional random vectors. Then $\boldsymbol{X} \le_{\text{SD}} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\text{D}} \boldsymbol{Y}$.

7.B.3 Dispersive orders based on constructions

The standard construction of an $n$-dimensional random vector $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$, from a vector $(U_1, U_2, \dots, U_n)$ of independent uniform[0, 1] random variables, was described in Section 6.B.3. Here we first describe explicitly the function that transforms $(U_1, U_2, \dots, U_n)$ into $(X_1, X_2, \dots, X_n)$. Let $F$ be the distribution function of $\boldsymbol{X}$. Denote by $F_1(\cdot)$ the marginal distribution function of $X_1$, and denote by $F_{i+1|1,2,\dots,i}(\cdot \mid x_1, x_2, \dots, x_i)$ the conditional distribution function of $X_{i+1}$ given that $X_1 = x_1, X_2 = x_2, \dots, X_i = x_i$, $i = 1, 2, \dots, n-1$. The inverse of $F_1$ will be denoted by $F_1^{-1}(\cdot)$ and the inverse of $F_{i+1|1,2,\dots,i}(\cdot \mid x_1, x_2, \dots, x_i)$ will be denoted by $F^{-1}_{i+1|1,2,\dots,i}(\cdot \mid x_1, x_2, \dots, x_i)$ for every $(x_1, x_2, \dots, x_i)$ in the support of $(X_1, X_2, \dots, X_i)$, $i = 1, 2, \dots, n-1$. For $(u_1, u_2, \dots, u_n) \in (0, 1)^n$ denote

$$x_1 = F_1^{-1}(u_1), \tag{7.B.4}$$

and, by induction,

$$x_i = F^{-1}_{i|1,2,\dots,i-1}(u_i \mid x_1, x_2, \dots, x_{i-1}), \quad i = 2, 3, \dots, n. \tag{7.B.5}$$

Denote the transformation $(u_1, u_2, \dots, u_n) \to (x_1, x_2, \dots, x_n)$ described in (7.B.4) and (7.B.5) by $\Psi^*_F : (0, 1)^n \to \mathbb{R}^n$. It is well known that $\Psi^*_F(U_1, U_2, \dots, U_n) =_{\text{st}} (X_1, X_2, \dots, X_n)$. Let $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be another random vector with distribution function $G$, and denote the corresponding transformation by $\Psi^*_G$. Note that $\Psi^*_F$ and $\Psi^*_G$ can be thought of as “inverses” of $F$ and of $G$, respectively. The following order is a multivariate extension of the characterization (3.B.7) of the univariate order $\le_{\text{disp}}$. Suppose that

$$\Psi^*_G(\boldsymbol{u}) - \Psi^*_F(\boldsymbol{u}) \text{ is increasing in } \boldsymbol{u} \in (0, 1)^n.$$


Then X is said to be smaller than Y in the multivariate dispersion order (denoted by X ≤disp Y ). It is easy to prove that the order ≤disp is closed under conjunctions as the following result states. Theorem 7.B.10. Let X 1 , X 2 , . . . , X m be a set of independent random vectors where the dimension of X i is ki , i = 1, 2, . . . , m. Let Y 1 , Y 2 , . . . , Y m be another set of independent random vectors where the dimension of Y i is ki , i = 1, 2, . . . , m. If X i ≤disp Y i for i = 1, 2, . . . , m, then (X 1 , X 2 , . . . , X m ) ≤disp (Y 1 , Y 2 , . . . , Y m ). In particular, if the random variables X1 , X2 , . . . , Xn and Y1 , Y2 , . . . , Yn are independent and satisfy Xi ≤disp Yi , i = 1, 2, . . . , n, then (X1 , X2 , . . . , Xn ) ≤disp (Y1 , Y2 , . . . , Yn ). A useful property of the multivariate order ≤disp is given next. Recall from Section 6.B.3 the deﬁnition of a CIS random vector, and recall from Section 7.A.8 the deﬁnition of directionally convex functions. The proof of the following result is not given here. Theorem 7.B.11. Let X and Y be two nonnegative CIS random vectors. If X ≤disp Y , then Var[φ(X)] ≤ Var[φ(Y )]

for all increasing directionally convex functions φ.

In particular, if (X1 , X2 , . . . , Xn ) ≤disp (Y1 , Y2 , . . . , Yn ), then Var[X1 + X2 + · · · + Xn ] ≤ Var[Y1 + Y2 + · · · + Yn ]. The following result may be compared with Theorems 6.D.10, 6.E.12, and 7.B.4. Theorem 7.B.12. Let X(1) , X(2) , . . . , X(n) and Y(1) , Y(2) , . . . , Y(n) be order statistics as in Theorem 6.D.10. If X1 ≤disp Y1 , then (X(1) , X(2) , . . . , X(n) ) ≤disp (Y(1) , Y(2) , . . . , Y(n) ). The following example may be compared with Examples 1.B.24, 1.C.48, 2.A.22, 3.B.38, 6.B.41, 6.D.8, and 6.E.13. Example 7.B.13. Let X and Y be two absolutely continuous nonnegative random variables with survival functions F and G, respectively. Denote Λ1 = − log F and Λ2 = − log G. Consider two nonhomogeneous Poisson processes N1 = {N1 (t), t ≥ 0} and N2 = {N2 (t), t ≥ 0} with mean functions Λ1 and Λ2 (see Example 1.B.13), respectively. Let Ti,1 , Ti,2 , . . . be the successive epoch times of process Ni , i = 1, 2. Note that X =st T1,1 and Y =st T2,1 . If X ≤disp Y , then (T1,1 , T1,2 , . . . , T1,n ) ≤disp (T2,1 , T2,2 , . . . , T2,n ) for each n ≥ 1.
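The standard construction in (7.B.4) and (7.B.5), and the multivariate order $\le_{\text{disp}}$ defined from it, can be sketched for bivariate normal vectors, where both inverses have closed forms. This is an illustrative sketch under stated assumptions: the parameter values and function names are hypothetical, and monotonicity of the difference of the two transformations is checked only on a finite grid, which is a necessary condition and not a proof.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def standard_construction(u, mu, sigma, rho):
    """Psi*_F(u1, u2) for a bivariate normal with means `mu`,
    standard deviations `sigma`, and correlation `rho`."""
    z1 = STD_NORMAL.inv_cdf(u[0])
    z2 = STD_NORMAL.inv_cdf(u[1])
    x1 = mu[0] + sigma[0] * z1                      # x1 = F_1^{-1}(u1)
    cond_mean = mu[1] + rho * sigma[1] * z1         # E[X2 | X1 = x1]
    cond_sd = sigma[1] * (1.0 - rho * rho) ** 0.5   # sd of X2 given X1
    x2 = cond_mean + cond_sd * z2                   # x2 = F_{2|1}^{-1}(u2 | x1)
    return (x1, x2)


def difference_is_increasing(params_f, params_g, grid):
    """Grid check that Psi*_G(u) - Psi*_F(u) is componentwise increasing in u."""
    def diff(u):
        xf = standard_construction(u, *params_f)
        xg = standard_construction(u, *params_g)
        return (xg[0] - xf[0], xg[1] - xf[1])

    tol = 1e-12
    for u1 in grid:
        for u2 in grid:
            d_lo = diff((u1, u2))
            for v1 in grid:
                for v2 in grid:
                    if v1 >= u1 and v2 >= u2:
                        d_hi = diff((v1, v2))
                        if d_hi[0] < d_lo[0] - tol or d_hi[1] < d_lo[1] - tol:
                            return False
    return True


# Assumed parameters (mu, sigma, rho): same nonnegative correlation,
# larger standard deviations for G than for F.
params_f = ((0.0, 0.0), (1.0, 1.0), 0.5)
params_g = ((3.0, -1.0), (2.0, 1.5), 0.5)
grid = [k / 10.0 for k in range(1, 10)]
```

With these parameters the difference is linear and increasing in the normal quantiles of $u_1$ and $u_2$, so `difference_is_increasing(params_f, params_g, grid)` is expected to hold, while swapping the roles of the two vectors is expected to fail.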


The total hazard construction of a nonnegative $n$-dimensional random vector $\boldsymbol{T} = (T_1, T_2, \dots, T_n)$ with distribution function $F$, from a vector $(X_1, X_2, \dots, X_n)$ of independent standard exponential random variables, was described in Section 6.C.2. The construction defines a transformation of $(X_1, X_2, \dots, X_n)$ to $\hat{\boldsymbol{T}} = (\hat{T}_1, \hat{T}_2, \dots, \hat{T}_n)$ such that $\boldsymbol{T} =_{\text{st}} \hat{\boldsymbol{T}}$. Denote this transformation from $[0, \infty)^n$ to $[0, \infty)^n$ by $R^*_F$. Thus $R^*_F(X_1, X_2, \dots, X_n) =_{\text{st}} (T_1, T_2, \dots, T_n)$. Let $\boldsymbol{S} = (S_1, S_2, \dots, S_n)$ be another nonnegative random vector with distribution function $G$, and denote the corresponding transformation by $R^*_G$. Note that $R^*_F$ and $R^*_G$ can be thought of as “inverses” of the “total hazards” $-\log \bar{F}$ and $-\log \bar{G}$, respectively. The following order is a multivariate extension of the characterization (3.B.9) of the univariate order $\le_{\text{disp}}$. Suppose that

$$R^*_G(\boldsymbol{x}) - R^*_F(\boldsymbol{x}) \text{ is increasing in } \boldsymbol{x} \in [0, \infty)^n.$$

Then $\boldsymbol{T}$ is said to be smaller than $\boldsymbol{S}$ in the dynamic multivariate dispersion order (denoted by $\boldsymbol{T} \le_{\text{dyn-disp}} \boldsymbol{S}$).

The order $\le_{\text{dyn-disp}}$ is closed under conjunctions as the following, easy to prove, result states.

Theorem 7.B.14. Let $\boldsymbol{T}_1, \boldsymbol{T}_2, \dots, \boldsymbol{T}_m$ be a set of independent random vectors where the dimension of $\boldsymbol{T}_i$ is $k_i$, $i = 1, 2, \dots, m$. Let $\boldsymbol{S}_1, \boldsymbol{S}_2, \dots, \boldsymbol{S}_m$ be another set of independent random vectors where the dimension of $\boldsymbol{S}_i$ is $k_i$, $i = 1, 2, \dots, m$. If $\boldsymbol{T}_i \le_{\text{dyn-disp}} \boldsymbol{S}_i$ for $i = 1, 2, \dots, m$, then $(\boldsymbol{T}_1, \boldsymbol{T}_2, \dots, \boldsymbol{T}_m) \le_{\text{dyn-disp}} (\boldsymbol{S}_1, \boldsymbol{S}_2, \dots, \boldsymbol{S}_m)$.

A version of Theorem 7.B.11 holds for the order $\le_{\text{dyn-disp}}$, and is given next. Recall from Section 6.C.1 that a nonnegative random vector $\boldsymbol{T}$ has the positive dependence property of “supporting lifetimes” if $\boldsymbol{T} \le_{\text{ch}} \boldsymbol{T}$. The proof of the following result is not given here.

Theorem 7.B.15. Let $\boldsymbol{T}$ and $\boldsymbol{S}$ be two nonnegative random vectors with the supporting lifetimes property. If $\boldsymbol{T} \le_{\text{dyn-disp}} \boldsymbol{S}$, then $\text{Var}[\phi(\boldsymbol{T})] \le \text{Var}[\phi(\boldsymbol{S})]$

for all increasing directionally convex functions φ.

7.C Multivariate Transform Orders: Convex, Star, and Superadditive Orders

In this section we review some extensions of the univariate orders $\le_{\text{c}}$, $\le_{*}$, and $\le_{\text{su}}$, which were studied in Section 4.B. Let $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative random vectors with survival functions $\bar{F}$ and $\bar{G}$, respectively. Denote

$$\bar{F}_i(\boldsymbol{x}) = \frac{\bar{F}(x_1, \dots, x_{i-1}, x_i, x_{i+1}, \dots, x_n)}{\bar{F}(x_1, \dots, x_{i-1}, 0, x_{i+1}, \dots, x_n)}, \quad \boldsymbol{x} \ge \boldsymbol{0},$$

and

$$\bar{G}_i(\boldsymbol{x}) = \frac{\bar{G}(x_1, \dots, x_{i-1}, x_i, x_{i+1}, \dots, x_n)}{\bar{G}(x_1, \dots, x_{i-1}, 0, x_{i+1}, \dots, x_n)}, \quad \boldsymbol{x} \ge \boldsymbol{0}.$$

For any $(x_1, x_2, \dots, x_n) \ge \boldsymbol{0}$ and for any $i = 1, 2, \dots, n$, let $u_i$ be the solution of

$$\bar{G}_i(x_1, \dots, x_{i-1}, u_i, x_{i+1}, \dots, x_n) = \bar{F}_i(x_1, \dots, x_{i-1}, x_i, x_{i+1}, \dots, x_n).$$

If, for every $i = 1, 2, \dots, n$ and every $(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$, we have that $u_i$ is convex in $x_i$, then $\boldsymbol{X}$ is said to be smaller than $\boldsymbol{Y}$ in the multivariate convex transform order (denoted as $\boldsymbol{X} \le_{\text{mc}} \boldsymbol{Y}$). If, for every $i = 1, 2, \dots, n$ and every $(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$, we have that $u_i$ is starshaped in $x_i$, then $\boldsymbol{X}$ is said to be smaller than $\boldsymbol{Y}$ in the multivariate star order (denoted as $\boldsymbol{X} \le_{\text{m}*} \boldsymbol{Y}$). Finally, if, for every $i = 1, 2, \dots, n$ and every $(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$, we have that $u_i$ is superadditive in $x_i$, then $\boldsymbol{X}$ is said to be smaller than $\boldsymbol{Y}$ in the multivariate superadditive order (denoted as $\boldsymbol{X} \le_{\text{msu}} \boldsymbol{Y}$). Obviously,

$$\boldsymbol{X} \le_{\text{mc}} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\text{m}*} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\text{msu}} \boldsymbol{Y}.$$

The above three orders are partial orders in the sense that each of them is transitive and reflexive. They are also closed under marginalization:

Theorem 7.C.1. Let $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative $n$-dimensional random vectors. If $\boldsymbol{X} \le_{\text{mc}} [\le_{\text{m}*}, \le_{\text{msu}}]\ \boldsymbol{Y}$, then $\boldsymbol{X}_I \le_{\text{mc}} [\le_{\text{m}*}, \le_{\text{msu}}]\ \boldsymbol{Y}_I$ for each $I \subseteq \{1, 2, \dots, n\}$.

In analogy with Theorem 4.B.11, the above three orders can be used to define multivariate versions of the IFR, IFRA, and NBU aging notions.

7.D The Multivariate Laplace Transform and Related Orders

The orders we studied in Section 5.A have multivariate extensions, which we will briefly review in this section.

7.D.1 The multivariate Laplace transform order

Extending (5.A.1), we have the following definition of the multivariate Laplace transform order. Let $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative $n$-dimensional random vectors such that

$$E \exp\left\{-\sum_{i=1}^n s_i X_i\right\} \ge E \exp\left\{-\sum_{i=1}^n s_i Y_i\right\} \quad \text{for all } \boldsymbol{s} > \boldsymbol{0}. \tag{7.D.1}$$

Then $\boldsymbol{X}$ is said to be smaller than $\boldsymbol{Y}$ in the Laplace transform order (denoted as $\boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y}$). Throughout this section we consider only nonnegative random vectors.

As in the univariate case (see Theorem 5.A.7), the multivariate order $\le_{\text{Lt}}$ is closed under mixtures, limits in distribution, and convolutions. We do not formally state and prove these closure properties here.

The following property of the multivariate Laplace transform order can be verified easily. Recall the notation $\boldsymbol{X}_I$ and $\boldsymbol{Y}_I$ from (6.A.1).

Theorem 7.D.1. Let $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative $n$-dimensional random vectors. If $\boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y}$, then $\boldsymbol{X}_I \le_{\text{Lt}} \boldsymbol{Y}_I$ for each $I \subseteq \{1, 2, \dots, n\}$. That is, the multivariate Laplace transform order is closed under marginalization.

From Theorem 7.D.1 and (5.A.5) we see that

$$\boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y} \Longrightarrow E[X_i] \le E[Y_i], \quad i = 1, 2, \dots, n, \tag{7.D.2}$$

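provided the expectations exist. The following property is also easy to verify.

Theorem 7.D.2. Let $\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m$ be a set of independent random vectors where the dimension of $\boldsymbol{X}_i$ is $k_i$, $i = 1, 2, \dots, m$. Let $\boldsymbol{Y}_1, \boldsymbol{Y}_2, \dots, \boldsymbol{Y}_m$ be another set of independent random vectors where the dimension of $\boldsymbol{Y}_i$ is $k_i$, $i = 1, 2, \dots, m$. If $\boldsymbol{X}_i \le_{\text{Lt}} \boldsymbol{Y}_i$ for $i = 1, 2, \dots, m$, then $(\boldsymbol{X}_1, \boldsymbol{X}_2, \dots, \boldsymbol{X}_m) \le_{\text{Lt}} (\boldsymbol{Y}_1, \boldsymbol{Y}_2, \dots, \boldsymbol{Y}_m)$. That is, the multivariate Laplace transform order is closed under conjunctions.

Another closure property of the multivariate Laplace transform order is given in Theorem 7.D.7.

Theorem 7.D.3. Let $\boldsymbol{X}$ and $\boldsymbol{Y}$ be two nonnegative random vectors. If $\boldsymbol{X} \le_{\text{lo}} \boldsymbol{Y}$ or $\boldsymbol{X} \le_{\text{icv}} \boldsymbol{Y}$ or $\boldsymbol{X} \ge_{\text{dir-cx}} \boldsymbol{Y}$, then $\boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y}$. In particular, if $\boldsymbol{X} \le_{\text{st}} \boldsymbol{Y}$, then $\boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y}$.

Proof. The function $h_i$, defined by $h_i(x) = \exp\{-s_i x\}$, is nonnegative and decreasing for each $s_i > 0$, $i = 1, 2, \dots, n$. Therefore $\boldsymbol{X} \le_{\text{lo}} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y}$ by (6.G.6) and (7.D.1). The implication $\boldsymbol{X} \le_{\text{icv}} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y}$ follows from the fact that the function $\phi$, defined by $\phi(\boldsymbol{x}) = \exp\{-\sum_{i=1}^n s_i x_i\}$, is decreasing and convex for each $\boldsymbol{s} > \boldsymbol{0}$ (and therefore $-\phi$ is increasing and concave). Finally, the implication $\boldsymbol{X} \ge_{\text{dir-cx}} \boldsymbol{Y} \Longrightarrow \boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y}$ follows from the fact that the function $\phi$ above is directionally convex for each $\boldsymbol{s} > \boldsymbol{0}$.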


The following result is a multivariate analog of the right side of (5.A.13). It can be obtained from Jensen’s Inequality.

Theorem 7.D.4. Let $\boldsymbol{X}$ be a nonnegative random vector with mean vector $(\mu_1, \mu_2, \dots, \mu_n)$. Let $\boldsymbol{Z}$ be a random vector degenerate at $(\mu_1, \mu_2, \dots, \mu_n)$. Then $\boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Z}$.

The next result follows easily from (7.D.1).

Theorem 7.D.5. Let $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative $n$-dimensional random vectors. If $\boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y}$, then

$$\sum_{i=1}^n a_i X_i \le_{\text{Lt}} \sum_{i=1}^n a_i Y_i, \quad \text{whenever } a_i \ge 0,\ i = 1, 2, \dots, n.$$

A multivariate analog of Theorem 5.A.3 is the following result. Its proof is similar to the proof of Theorem 5.A.3 and is therefore omitted.

Theorem 7.D.6. Let $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be two nonnegative $n$-dimensional random vectors. Then $\boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y}$ if, and only if,

$$E\left[\prod_{i=1}^n \phi_i(X_i)\right] \ge E\left[\prod_{i=1}^n \phi_i(Y_i)\right]$$

for all completely monotone functions $\phi_i$, $i = 1, 2, \dots, n$, provided the expectations exist.

When $\boldsymbol{X}$ and $\boldsymbol{Y}$ are vectors of nonnegative integer-valued random variables, it is customary and convenient to work with their probability generating functions, rather than with their Laplace transforms. This suggests the following definition. Let $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be two vectors, of nonnegative integer-valued random variables, such that

$$E\left[\prod_{i=1}^n t_i^{X_i}\right] \ge E\left[\prod_{i=1}^n t_i^{Y_i}\right] \quad \text{for all } \boldsymbol{t} \in (0, 1)^n. \tag{7.D.3}$$

Then $\boldsymbol{X}$ is said to be smaller than $\boldsymbol{Y}$ in the multivariate probability generating function order (denoted by $\boldsymbol{X} \le_{\text{pgf}} \boldsymbol{Y}$). It is easy to see, by substituting $t_i = e^{-s_i}$, that (7.D.3) holds if, and only if, (7.D.1) holds. That is, $\boldsymbol{X} \le_{\text{pgf}} \boldsymbol{Y} \iff \boldsymbol{X} \le_{\text{Lt}} \boldsymbol{Y}$.

A preservation property of the Laplace transform order is described in the next theorem. It is a multivariate extension of Theorem 5.A.9.
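The definitions (7.D.1) and (7.D.3) and the Jensen comparison of Theorem 7.D.4 can be tried out on a small example. The sketch below is illustrative only: the distribution, the grid of arguments, and the function names are assumptions, and a finite grid cannot establish the order for all $\boldsymbol{s} > \boldsymbol{0}$.

```python
from math import exp


def laplace_transform(dist, s):
    """E exp(-sum_i s_i X_i) for a discrete vector {point_tuple: probability}."""
    return sum(
        p * exp(-sum(si * xi for si, xi in zip(s, pt)))
        for pt, p in dist.items()
    )


def pgf(dist, t):
    """E prod_i t_i^{X_i} for a vector of nonnegative integer-valued components."""
    total = 0.0
    for pt, p in dist.items():
        term = p
        for ti, xi in zip(t, pt):
            term *= ti ** xi
        total += term
    return total


# Assumed example: X puts mass 1/2 on (0, 2) and on (2, 0); its mean is (1, 1).
x_dist = {(0, 2): 0.5, (2, 0): 0.5}
z_dist = {(1, 1): 1.0}  # degenerate at the mean vector of X

s_grid = [0.1, 0.5, 1.0, 2.0]

# Jensen comparison: the transform of X dominates that of Z on the grid.
jensen_holds = all(
    laplace_transform(x_dist, (s1, s2)) >= laplace_transform(z_dist, (s1, s2)) - 1e-12
    for s1 in s_grid
    for s2 in s_grid
)

# Link between (7.D.1) and (7.D.3): the pgf at t_i = exp(-s_i) is the transform at s.
link_gap = max(
    abs(pgf(x_dist, (exp(-s1), exp(-s2))) - laplace_transform(x_dist, (s1, s2)))
    for s1 in s_grid
    for s2 in s_grid
)
```

The substitution $t_i = e^{-s_i}$ maps $(0, \infty)^n$ onto $(0, 1)^n$, which is why the two orders coincide for integer-valued vectors.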


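Theorem 7.D.7. For $i = 1, 2, \dots, m$, let $\{X_{j,i}, j = 1, 2, \dots\}$ be a sequence of nonnegative identically distributed random vectors, and assume that all the $X_{j,i}$’s are mutually independent. Let $\boldsymbol{M} = (M_1, M_2, \dots, M_m)$ and $\boldsymbol{N} = (N_1, N_2, \dots, N_m)$ be two vectors of nonnegative integer-valued random variables. Assume that both $\boldsymbol{M}$ and $\boldsymbol{N}$ are independent of the $X_{j,i}$’s. If $\boldsymbol{M} \le_{\text{pgf}} \boldsymbol{N}$, then

$$\left(\sum_{j=1}^{M_1} X_{j,1}, \sum_{j=1}^{M_2} X_{j,2}, \dots, \sum_{j=1}^{M_m} X_{j,m}\right) \le_{\text{Lt}} \left(\sum_{j=1}^{N_1} X_{j,1}, \sum_{j=1}^{N_2} X_{j,2}, \dots, \sum_{j=1}^{N_m} X_{j,m}\right).$$

Proof. For fixed $(n_1, n_2, \dots, n_m)$ and fixed $b_i > 0$, $i = 1, 2, \dots, m$, we compute

$$E\left[e^{-\sum_{i=1}^m b_i \sum_{j=1}^{n_i} X_{j,i}}\right] = \prod_{i=1}^m E\left[e^{-b_i \sum_{j=1}^{n_i} X_{j,i}}\right] = \prod_{i=1}^m \left[L_{X_{1,i}}(b_i)\right]^{n_i},$$

where $L_{X_{1,i}}$ denotes the Laplace transform of $X_{1,i}$, $i = 1, 2, \dots, m$. Therefore

$$E\left[e^{-\sum_{i=1}^m b_i \sum_{j=1}^{M_i} X_{j,i}}\right] = E\left[\prod_{i=1}^m \left[L_{X_{1,i}}(b_i)\right]^{M_i}\right] \ge E\left[\prod_{i=1}^m \left[L_{X_{1,i}}(b_i)\right]^{N_i}\right] = E\left[e^{-\sum_{i=1}^m b_i \sum_{j=1}^{N_i} X_{j,i}}\right].$$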
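The key step of the proof is that the Laplace transform of a random sum equals the probability generating function of the count evaluated at the Laplace transform of a summand. A minimal one-coordinate sketch of that identity follows; the distributions and function names are illustrative assumptions, and the brute-force version enumerates all outcomes and so is only practical for small supports.

```python
from itertools import product
from math import exp


def laplace_1d(dist, b):
    """L_X(b) = E exp(-b X) for a discrete variable {value: probability}."""
    return sum(p * exp(-b * x) for x, p in dist.items())


def compound_laplace_bruteforce(count_dist, summand_dist, b):
    """E exp(-b * sum_{j<=M} X_j), enumerating M and every summand outcome."""
    total = 0.0
    for n, pn in count_dist.items():
        # For n = 0 there is a single empty combination and the sum is 0.
        for combo in product(summand_dist.items(), repeat=n):
            prob = pn
            s = 0.0
            for x, px in combo:
                prob *= px
                s += x
            total += prob * exp(-b * s)
    return total


def compound_laplace_via_pgf(count_dist, summand_dist, b):
    """E[L_X(b)^M]: the pgf of M evaluated at t = L_X(b)."""
    t = laplace_1d(summand_dist, b)
    return sum(pn * t ** n for n, pn in count_dist.items())


# Assumed distributions for the count M and for one summand X_j.
count_dist = {0: 0.2, 1: 0.5, 3: 0.3}
summand_dist = {0.5: 0.4, 2.0: 0.6}
b = 0.7

gap = abs(
    compound_laplace_bruteforce(count_dist, summand_dist, b)
    - compound_laplace_via_pgf(count_dist, summand_dist, b)
)
```

Because $L_{X_{1,i}}(b_i) \in (0, 1]$ for nonnegative summands, the ordering of the probability generating functions of $\boldsymbol{M}$ and $\boldsymbol{N}$ transfers directly to the Laplace transforms of the random sums.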

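7.D.2 The multivariate factorial moments order

Let $\boldsymbol{X}$ and $\boldsymbol{Y}$ be two vectors of nonnegative integer-valued random variables such that

$$E\left[\prod_{i=1}^n \binom{X_i}{j_i}\right] \le E\left[\prod_{i=1}^n \binom{Y_i}{j_i}\right] \quad \text{for all } j_i \in \mathbb{N}_+,\ i = 1, 2, \dots, n. \tag{7.D.4}$$

Then $\boldsymbol{X}$ is said to be smaller than $\boldsymbol{Y}$ in the factorial moments order (denoted by $\boldsymbol{X} \le_{\text{fm}} \boldsymbol{Y}$). It is easy to see that $\boldsymbol{X} \le_{\text{fm}} \boldsymbol{Y} \Longrightarrow E\boldsymbol{X} \le E\boldsymbol{Y}$.

The proofs of the following three results are similar to the proofs of Theorems 5.C.2, 5.C.4, and 5.C.5, respectively. We omit the straightforward details.

Theorem 7.D.8. (a) Let $\boldsymbol{X} = (X_1, X_2, \dots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \dots, Y_n)$ be two vectors of nonnegative integer-valued random variables. If $\boldsymbol{X} \le_{\text{fm}} \boldsymbol{Y}$, then $\boldsymbol{X} + \boldsymbol{k} \le_{\text{fm}} \boldsymbol{Y} + \boldsymbol{k}$ for every $\boldsymbol{k} \in \mathbb{N}^n_+$.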

7.D The Multivariate Laplace Transform and Related Orders

353

(b) Let (X1 , X2 , . . . , Xn ) and (Y1 , Y2 , . . . , Yn ) be two vectors of nonnegative integer-valued random variables. If (X1 , X2 , . . . , Xn ) ≤fm (Y1 , Y2 , . . . , Yn ), then (k1 X1 , k2 X2 , . . . , kn Xn ) ≤fm (k1 Y1 , k2 Y2 , . . . , kn Yn ) for every (k1 , k2 , . . . , kn ) ∈ Nn+ . (c) Let X 1 , X 2 , . . . , X m be a set of independent n-dimensional vectors of nonnegative integer-valued random variables. Let Y 1 , Y 2 , . . . , Y m be another set of independent n-dimensional vectors of nonnegative integer-valued random variables. If X i ≤fm Y i , i = 1, 2, . . . , m, then m

X i ≤fm

i=1

m

Y i.

i=1
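For vectors with finite support, condition (7.D.4) can be checked directly by enumerating the joint probability mass function. The sketch below is illustrative only: the helper names and the two small distributions are hypothetical, and the check is restricted to orders j_i up to `max_order` (which suffices when the support is bounded by that value, since higher binomial coefficients vanish).

```python
from fractions import Fraction
from itertools import product
from math import comb


def binomial_moment(pmf, j):
    """E[prod_i C(X_i, j_i)] for a finite joint pmf {(x_1, ..., x_n): prob}."""
    total = Fraction(0)
    for x, prob in pmf.items():
        term = Fraction(prob)
        for x_i, j_i in zip(x, j):
            term *= comb(x_i, j_i)  # comb returns 0 when j_i > x_i
        total += term
    return total


def fm_leq(pmf_x, pmf_y, max_order):
    """Check (7.D.4) for all (j_1, ..., j_n) with 0 <= j_i <= max_order."""
    dim = len(next(iter(pmf_x)))
    return all(
        binomial_moment(pmf_x, j) <= binomial_moment(pmf_y, j)
        for j in product(range(max_order + 1), repeat=dim)
    )


# Hypothetical example: Y is obtained from X by raising each support point
# componentwise, (0,1) -> (1,1) and (2,0) -> (2,1), so X <=st Y.
pmf_x = {(0, 1): Fraction(1, 2), (2, 0): Fraction(1, 2)}
pmf_y = {(1, 1): Fraction(1, 2), (2, 1): Fraction(1, 2)}

print(fm_leq(pmf_x, pmf_y, max_order=3))  # comparison X <=fm Y
print(fm_leq(pmf_y, pmf_x, max_order=3))  # reverse comparison
```

Exact rational arithmetic is used so that ties are not disturbed by rounding; the example is consistent with the implication X ≤st Y =⇒ X ≤fm Y.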

Theorem 7.D.9. Let X and Y be two vectors of nonnegative integer-valued random variables. If X ≤icx Y, then X ≤fm Y. In particular, if X ≤st Y, then X ≤fm Y.

Theorem 7.D.10. Let X and Y be two vectors of nonnegative integer-valued random variables with bounded support ∏_{i=1}^n {0, 1, 2, . . . , b_i}. If X ≤fm Y, then b − Y ≤pgf b − X.

7.D.3 The multivariate moments order

Consider now two vectors, of general (that is, not necessarily integer-valued) nonnegative random variables, X and Y such that

$$E\prod_{i=1}^n X_i^{j_i} \le E\prod_{i=1}^n Y_i^{j_i} \quad \text{for all } j_i \in \mathbb{N}_+,\ i = 1, 2, \dots, n.$$

Then X is said to be smaller than Y in the moments order (denoted as X ≤mom Y). Clearly, X ≤mom Y =⇒ EX ≤ EY. The following three results are analogs of Theorems 5.C.6, 5.C.9, and 5.C.19. We omit the straightforward proofs.

Theorem 7.D.11.
(a) Let X and Y be two vectors of nonnegative random variables. If X ≤mom Y, then X + k ≤mom Y + k for every k ≥ 0.
(b) Let (X_1, X_2, . . . , X_n) and (Y_1, Y_2, . . . , Y_n) be two vectors of nonnegative random variables. If (X_1, X_2, . . . , X_n) ≤mom (Y_1, Y_2, . . . , Y_n), then (k_1 X_1, k_2 X_2, . . . , k_n X_n) ≤mom (k_1 Y_1, k_2 Y_2, . . . , k_n Y_n) for every (k_1, k_2, . . . , k_n) ≥ 0.
(c) Let X_1, X_2, . . . , X_m be a set of independent n-dimensional vectors of nonnegative random variables. Let Y_1, Y_2, . . . , Y_m be another set of independent n-dimensional vectors of nonnegative random variables. If X_i ≤mom Y_i, i = 1, 2, . . . , m, then

$$\sum_{i=1}^m X_i \le_{\mathrm{mom}} \sum_{i=1}^m Y_i.$$

Theorem 7.D.12. Let X and Y be two vectors of nonnegative integer-valued random variables. If X ≤fm Y, then X ≤mom Y. In particular, if X ≤icx Y (or if X ≤st Y), then X ≤mom Y.

Theorem 7.D.13. Let X and Y be two vectors of nonnegative random variables with bounded support ∏_{i=1}^n [0, b_i]. If X ≤mom Y, then b − Y ≤Lt b − X.

The ≤uo-cx order implies the multivariate moments order as it is described in the following result. This result follows at once from Theorem 7.A.40.

Theorem 7.D.14. Let X and Y be two vectors of nonnegative random variables. If X ≤uo-cx Y, then X ≤mom Y.
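The proof of Theorem 7.D.13 is omitted in the text. One possible line of argument, given here only as a hedged sketch, expands the exponential in a power series; the interchange of sum and expectation is justified because all terms are nonnegative and the support is bounded. Here s = (s_1, . . . , s_n) ≥ 0 is an arbitrary argument of the Laplace transform, and j! stands for ∏ j_i!.

```latex
\begin{align*}
E\, e^{-\sum_{i=1}^n s_i (b_i - X_i)}
  &= e^{-\sum_{i=1}^n s_i b_i}\; E\, e^{\sum_{i=1}^n s_i X_i} \\
  &= e^{-\sum_{i=1}^n s_i b_i} \sum_{j \in \mathbb{N}_+^n}
     \Bigl(\prod_{i=1}^n \frac{s_i^{j_i}}{j_i!}\Bigr)\, E \prod_{i=1}^n X_i^{j_i} \\
  &\le e^{-\sum_{i=1}^n s_i b_i} \sum_{j \in \mathbb{N}_+^n}
     \Bigl(\prod_{i=1}^n \frac{s_i^{j_i}}{j_i!}\Bigr)\, E \prod_{i=1}^n Y_i^{j_i}
   = E\, e^{-\sum_{i=1}^n s_i (b_i - Y_i)} .
\end{align*}
```

The inequality uses X ≤mom Y term by term. Since a smaller vector in the Laplace transform order is the one with the larger Laplace transform, the display reads b − Y ≤Lt b − X.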

7.E Complements

Section 7.A: The proofs of Theorems 7.A.1 and 7.A.2 can be derived from results of Strassen [541]; see, for instance, Rüschendorf [482]. Elton and Hill [183] derived a constructive proof of Theorem 7.A.1. Further references regarding these theorems and several variations of them can be found in Elton and Hill [182]. Most of the other results in this section are easy to derive. The first characterization of the order ≤icx, given in Theorem 7.A.3, can be found in Müller and Stoyan [419]. The result about the convex order comparison of two sums (7.A.13) is taken from Berger [79]. The comparison of vectors of random partial sums of random variables (Theorem 7.A.7) is taken from Jean-Marie and Liu [254]. Theorems 7.A.8 and 7.A.9 can be found in Arnold [19]. Results similar to the conclusions of Theorem 7.A.10 can be found in Alzaid and Proschan [14]. The convex order comparison of multivariate means (Example 7.A.11) is a variation of Lemma 1 of Bäuerle [59]. The necessary (and sufficient) conditions for the comparison of multivariate normal random vectors (Example 7.A.13) can be found in Müller [413]; some variations of the results in this example are given in Ding and Zhang [168]. The conditions which yield the stochastic equality of X and Y (Theorems 7.A.14 and 7.A.15) are taken from Li and Zhu [351] and from Scarsini [492], whereas Theorem 7.A.16 is taken from Baccelli and Makowski [27]. Some orders that are weaker than the multivariate convex order are studied in Mosler [399, Chapter 8]; for example, he studies the order defined by Eφ(a_1 X_1 + a_2 X_2 + · · · + a_n X_n) ≤ Eφ(a_1 Y_1 + a_2 Y_2 + · · · + a_n Y_n) for all univariate convex functions φ and constants a_1, a_2, . . . , a_n for which the expectations exist. Fernández and Molchanov [194] studied related orders. The material in Section 7.A.5 follows Denuit, Lefèvre, and Mesfioui [148]; a version of the (m_1, m_2)-icx order for discrete random vectors is studied in Denuit, Lefèvre, and Mesfioui [150]. A more general version of Theorem 7.A.17 can be found in Bassan and Scarsini [54]. The order ≤symcx is defined and studied in Marshall and Olkin [383, page 282]. The fact that random vectors that are comparable in the order ≤ccx must have the same covariance matrix (Theorem 7.A.23) can be found in Müller and Stoyan [419]. The "preservation property" of the convex order under independence (Theorem 7.A.24) can be found in Müller and Scarsini [417]. The results which compare random sums (Theorem 7.A.25) are taken from Pellerey [451]. The result about the ordering of multivariate normal random vectors according to the ≤ccx order (Example 7.A.26) is taken from Block and Sampson [94, Section 3]. The notion of directionally convex functions is studied in Shaked and Shanthikumar [509], though Fan and Lorentz [190], Marshall and Olkin [383, page 157], and Rüschendorf [483] mentioned such functions earlier. Most of the results about the directionally convex order (Section 7.A.8) are taken from Chang, Chao, Pinedo, and Shanthikumar [125] and from Meester and Shanthikumar [387]. The closure under limits property of the directionally convex order (Theorem 7.A.31) can be found in Müller and Stoyan [419]. The comparison of integrals result (Theorem 7.A.35) is taken from Miyoshi [397]. The results which compare random sums (Theorem 7.A.36) are corrected versions of Theorem 2.3 and a part of Theorem 2.4 of Pellerey [451]. The comparison of mixtures result (Theorem 7.A.37) can be found in Denuit and Müller [157], whereas the comparison of vectors with the same dependence structure (Theorem 7.A.38) can be found in Müller and Scarsini [417]. The necessary and sufficient conditions for the comparison of multivariate normal random vectors (Example 7.A.39) are taken from Müller [413]; an extension of this result to Kotz-type distributions is given in Ding and Zhang [168].
A discussion about the order ≤uo-cx can be found in Bergmann [82], where other orders, related to several unimodality notions, are also studied; the characterization given in Theorem 7.A.40(a) is taken from that paper. The preservation property of the order ≤uo-cx given in Theorem 7.A.41(a) can be found in Bergmann [80]. The results which compare random sums (Theorem 7.A.42) are taken from Pellerey [451]. Dyckerhoff and Mosler [173] introduced some relatively easy conditions for verifying X ≤uo-cx Y or X ≤lo-cv Y when X and Y have finite discrete supports. The material about the orders ≤nm is taken from O'Brien and Scarsini [438]. Scarsini [490] has studied the order ≤2m in some detail; in particular, he has identified a class U of functions such that (X_1, X_2) ≤2m (Y_1, Y_2) if, and only if, E[φ(X_1, X_2)] ≤ E[φ(Y_1, Y_2)] for all φ ∈ U. Müller [412] studied stochastic orders that are defined by requiring (7.A.1) to hold for all quasiconcave or increasing quasiconcave functions.


Arnold [20], building on previous ideas, introduced a multivariate Lorenz order that is based on the characterization of the univariate Lorenz order given in Theorem 3.A.11.

Section 7.B: The development in Sections 7.B.1 and 7.B.2 follows the work of Giovagnoli and Wynn [211]. The comparison of vectors with the same dependence structure (Theorem 7.B.3) can be found in Arias-Nicolás, Fernández-Ponce, Luque-Calvo, and Suárez-Llorens [17]. The conditions under which normal random vectors can be compared with respect to the order ≤SD (Example 7.B.5) are taken from Arias-Nicolás, Fernández-Ponce, Luque-Calvo, and Suárez-Llorens [17]. The comparison, in the order ≤SD, of vectors of order statistics (Theorem 7.B.4), has been communicated to us by Suárez-Llorens [542]. The orders that are studied in Section 7.B.3 were introduced in Shaked and Shanthikumar [518]; the properties of these orders, given in Theorems 7.B.11 and 7.B.15, can be found in that paper. The comparison, in the multivariate dispersive order, of vectors of order statistics (Theorem 7.B.12), can be found in Belzunce, Ruiz, and Ruiz [75]; an extension of it is given in Belzunce, Mercader, and Ruiz [70]. The result that compares vectors of epoch times of nonhomogeneous Poisson processes (Example 7.B.13) is taken from Belzunce and Ruiz [73]; an extension of it is given in Belzunce, Mercader, and Ruiz [70]. Khaledi and Kochar [290] and Belzunce, Ruiz, and Suárez-Llorens [76] introduced and studied multivariate dispersive orders that are generalizations, respectively, of characterizations (3.B.12) and (3.B.13) of the univariate order ≤disp.

Section 7.C: The multivariate transform orders in this section were introduced and studied in Roy [480].

Section 7.D: A basic paper on the multivariate Laplace transform order is Denuit [141], where many of the results in Section 7.D.1 can be found. The result about the preservation of the multivariate Laplace transform order under random sums (Theorem 7.D.7) is taken from Wong [568]; see also Pellerey [451]. The multivariate factorial moment order is studied in Lefèvre and Picard [337], where Theorems 7.D.9 and 7.D.10 can be found. That paper also mentions and studies the multivariate moments order. The closure properties of the multivariate order ≤fm (Theorem 7.D.8) have been communicated to us by Lefèvre [335].

8 Stochastic Convexity and Concavity

In this chapter we study stochastic monotonicities of parametric families of distributions with respect to various stochastic orders. We have already encountered stochastic monotonicities earlier in this book. For example, condition (1.A.13) in Theorem 1.A.6, condition (3.A.47) in Theorem 3.A.21, and condition (4.A.17) in Theorem 4.A.18 describe such monotonicities. In this chapter a systematic study of such stochastic monotonicities is given. Various notions of stochastic convexity and concavity are reviewed. A multivariate extension of the notion of stochastic convexity, namely, stochastic directional convexity, is investigated in this chapter as well. Let {Pθ , θ ∈ Θ} be a family of univariate distributions. Throughout this chapter Θ is a convex set (that is, an interval) of the real line R or of the set N+ . Let X(θ) denote a random variable with distribution Pθ . It is convenient and intuitive to replace the notation {Pθ , θ ∈ Θ} by {X(θ), θ ∈ Θ}, which we do throughout this chapter. Note that when we write {X(θ), θ ∈ Θ} we do not assume (and often we are not concerned with) any dependence (or independence) properties among the X(θ)’s. We are only interested in the “marginal distributions” {Pθ , θ ∈ Θ} of {X(θ), θ ∈ Θ} even when in some circumstances {X(θ), θ ∈ Θ} is a well-deﬁned stochastic process. Note also that X(θ) does not mean that X is a function of θ; it only indicates that the distribution of X(θ) is Pθ .

8.A Regular Stochastic Convexity

We start our discussion with the weakest notion of stochastic convexity and concavity and show its usefulness by a list of examples. Then, in the following sections, we introduce stronger notions which provide a systematic way of verifying the weak notion of this section.


8.A.1 Definitions

In the following definitions SI, SCX, SCV, SICX, SIL, SD, SDCV, and so forth, stand, respectively, for stochastically increasing, stochastically convex, stochastically concave, stochastically increasing and convex, stochastically increasing and linear, stochastically decreasing, stochastically decreasing and concave, and so forth. Let {X(θ), θ ∈ Θ} be a set of random variables. Denote

(a) {X(θ), θ ∈ Θ} ∈ SI [or SD] if Eφ(X(θ)) is increasing [or decreasing] in θ for all increasing functions φ,
(b) {X(θ), θ ∈ Θ} ∈ SCX [or SCV] if Eφ(X(θ)) is convex [or concave] in θ for all convex [or concave] functions φ,
(c) {X(θ), θ ∈ Θ} ∈ SICX [or SICV] if {X(θ), θ ∈ Θ} ∈ SI and Eφ(X(θ)) is increasing convex [or concave] in θ for all increasing convex [or concave] functions φ,
(d) {X(θ), θ ∈ Θ} ∈ SDCX [or SDCV] if {X(θ), θ ∈ Θ} ∈ SD and Eφ(X(θ)) is decreasing convex [or concave] in θ for all increasing convex [or concave] functions φ,
(e) {X(θ), θ ∈ Θ} ∈ SIL if {X(θ), θ ∈ Θ} ∈ SI and Eφ(X(θ)) is increasing convex in θ for all increasing convex functions φ, and is increasing concave in θ for all increasing concave functions φ,
(f) {X(θ), θ ∈ Θ} ∈ SDL if {X(θ), θ ∈ Θ} ∈ SD and Eφ(X(θ)) is decreasing convex in θ for all increasing convex functions φ, and is decreasing concave in θ for all increasing concave functions φ.

Note that {X(θ), θ ∈ Θ} ∈ SIL ⇐⇒ {X(θ), θ ∈ Θ} ∈ SICX ∩ SICV and {X(θ), θ ∈ Θ} ∈ SDL ⇐⇒ {X(θ), θ ∈ Θ} ∈ SDCX ∩ SDCV. Also, since a function is convex if, and only if, its negative is concave, we see that {X(θ), θ ∈ Θ} ∈ SCX ⇐⇒ {X(θ), θ ∈ Θ} ∈ SCV.

Example 8.A.1. Let X(µ, σ) be a normal random variable with mean µ and standard deviation σ. Then, for each σ > 0, one has {X(µ, σ), µ ∈ R} ∈ SIL. This follows from Example 8.D.4 and Theorem 8.D.11 below.

Example 8.A.2. Let X(λ) be a Poisson random variable with mean λ. Then {X(λ), λ ∈ [0, ∞)} ∈ SIL. This follows from Example 8.A.7 below.
Equivalently, Example 8.A.2 shows that a homogeneous Poisson process {K(t), t ≥ 0} is SIL.
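The SIL property of the Poisson family can be probed numerically with the generating functions max{x − a, 0} and min{x − a, 0}. The sketch below is an illustration under stated assumptions, not a proof: the series is truncated after a fixed number of terms, the value a = 3 and the grid of means are arbitrary choices, and convexity is tested only through second differences on that grid.

```python
import math


def poisson_expectation(lam, phi, terms=200):
    """Approximate E[phi(X)] for X ~ Poisson(lam) by truncating the series."""
    prob = math.exp(-lam)  # P{X = 0}
    total = prob * phi(0)
    for k in range(1, terms):
        prob *= lam / k  # P{X = k} obtained from P{X = k - 1}
        total += prob * phi(k)
    return total


def second_differences(values):
    """Discrete second differences v[i] - 2 v[i+1] + v[i+2] on an equally spaced grid."""
    return [a - 2 * b + c for a, b, c in zip(values, values[1:], values[2:])]


lams = [0.5 * i for i in range(21)]  # means 0, 0.5, ..., 10 (arbitrary grid)
convex_vals = [poisson_expectation(l, lambda x: max(x - 3, 0)) for l in lams]
concave_vals = [poisson_expectation(l, lambda x: min(x - 3, 0)) for l in lams]

tol = 1e-9  # slack for floating-point rounding
looks_icx = all(d >= -tol for d in second_differences(convex_vals))
looks_icv = all(d <= tol for d in second_differences(concave_vals))
increasing = all(
    b >= a - tol
    for vals in (convex_vals, concave_vals)
    for a, b in zip(vals, vals[1:])
)
print(looks_icx, looks_icv, increasing)
```

All three flags being true is what the SIL property predicts for these two test functions.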


Lynch [370] has found conditions under which a stationary renewal process {K(t), t ≥ 0} is SCX. Explicitly, let X_2, X_3, . . . be independent and identically distributed interrenewal times with a distribution function F. Let the time until the first renewal, X_1, have the equilibrium distribution function G given by

$$G(x) = \frac{\int_0^x \bar F(u)\,du}{EX_2}, \quad x \ge 0,$$

where $\bar F = 1 - F$ is the survival function. Lynch [370] has shown that if X_2 has a logconcave density function, then {K(t), t ∈ [0, ∞)} ∈ SCX.

Example 8.A.3. Let X(n, p) be a binomial random variable with mean np and variance np(1 − p). Then, for each p ∈ (0, 1), one has {X(n, p), n ∈ N++} ∈ SIL and, for each n ∈ N++, one has {X(n, p), p ∈ (0, 1)} ∈ SIL. These follow from Example 8.B.3 and Theorem 8.B.9 below.

Example 8.A.4. Let Y(n), n = 1, 2, . . ., be a sequence of nonnegative independent and identically distributed random variables with mean 1. For µ > 0 define X(µ, n) = µ ∑_{k=1}^n Y(k), n ∈ N++. Then, for each n ∈ N++, one has {X(µ, n), µ ∈ [0, ∞)} ∈ SIL and, for each µ > 0, one has {X(µ, n), n ∈ N++} ∈ SIL. The first result follows from Example 8.D.5 and Theorem 8.D.11 below. The second result follows from Example 8.B.4 and Theorem 8.B.9 below.

Specifically, when Y(n) in Example 8.A.4 is an exponential random variable we have the following example.

Example 8.A.5. Let X(µ, n) be an Erlang-n random variable with mean nµ and variance nµ^2. Then, for each n ∈ N++, one has {X(µ, n), µ ∈ [0, ∞)} ∈ SIL and, for each µ > 0, one has {X(µ, n), n ∈ N++} ∈ SIL.

By taking n = 1 in Example 8.A.4 we obtain the following result.

Example 8.A.6. Let Y be a nonnegative random variable. For µ > 0 define X(µ) = µY. Then {X(µ), µ ∈ [0, ∞)} ∈ SIL.

Example 8.A.7. Suppose that Θ is [0, ∞) or N++. The family of nonnegative random variables {X(θ), θ ∈ Θ} is said to have the semigroup property if, for all θ_1 and θ_2 in Θ, one has

$$X(\theta_1 + \theta_2) =_{\mathrm{st}} X(\theta_1) + X(\theta_2), \tag{8.A.1}$$

where X(θ_1) and X(θ_2) in (8.A.1) are independent. Note that {X(λ), λ ∈ [0, ∞)} of Example 8.A.2 has the semigroup property. Also, for each µ > 0, it is seen that {X(µ, n), n ∈ N++} of Example 8.A.4 has the semigroup property. If {X(θ), θ ∈ Θ} has the semigroup property, then {X(θ), θ ∈ Θ} ∈ SIL. This result follows from Example 8.B.7 and Theorem 8.B.9 below.

Example 8.A.8. The Beta distribution with parameters α > 0 and β > 0 is the one that has the density function defined as

$$f_{\alpha,\beta}(x) = \frac{1}{B(\alpha,\beta)}\, x^{\alpha-1}(1-x)^{\beta-1}, \quad 0 < x < 1,$$


where $B(\alpha,\beta) \equiv \int_0^1 x^{\alpha-1}(1-x)^{\beta-1}\,dx$. The beta distribution of the second kind with parameters α > 0 and β > 0 is the one that has the density function defined as

$$g_{\alpha,\beta}(x) = \frac{1}{B(\alpha,\beta)}\, \frac{x^{\alpha-1}}{(1+x)^{\alpha+\beta}}, \quad x > 0.$$

Fix a t > 0. Adell, Badía, and de la Cal [2] proved the following results:

(a) If X(θ) has the density function f_{tθ, t(1−θ)}, θ ∈ (0, 1), then {X(θ), θ ∈ (0, 1)} ∈ SICX.
(b) If Y(θ) has the density function f_{tθ+1, t(1−θ)+1}, θ ∈ (0, 1), then {Y(θ), θ ∈ (0, 1)} ∈ SICX.
(c) If Z(θ) has the density function g_{tθ, t}, θ > 0, then {Z(θ), θ > 0} ∈ SICX.

For a random variable Y, let $F_Y$ and $\bar F_Y$ denote its distribution and survival functions, respectively. Similarly, for a random variable X(θ), let $F_X(\cdot, \theta)$ and $\bar F_X(\cdot, \theta)$ denote the corresponding distribution and survival functions. Since the class of functions f_a(x) = max{x − a, 0} [min{x − a, 0}] for all a ∈ R generates all the increasing and convex [concave] functions, and since $E(\max\{X - a, 0\}) = \int_a^\infty \bar F_X(x)\,dx$ [$E(\min\{X - a, 0\}) = -\int_{-\infty}^a F_X(x)\,dx$] (see Section 4.A.1), we have the following equivalences.

Theorem 8.A.9.
(a) {X(θ), θ ∈ Θ} ∈ SICX [SICV] if, and only if, {X(θ), θ ∈ Θ} ∈ SI and $\int_x^\infty \bar F_X(y, \theta)\,dy$ [$\int_{-\infty}^x F_X(y, \theta)\,dy$] is increasing [decreasing] convex in θ for all x, and
(b) {X(θ), θ ∈ Θ} ∈ SDCX [SDCV] if, and only if, {X(θ), θ ∈ Θ} ∈ SD and $\int_x^\infty \bar F_X(y, \theta)\,dy$ [$\int_{-\infty}^x F_X(y, \theta)\,dy$] is decreasing [increasing] convex in θ for all x.

For discrete random variables we have the following analog of Theorem 8.A.9.

Theorem 8.A.10. Suppose that for each θ ∈ Θ, the support of X(θ) is in N. Then
(a) {X(θ), θ ∈ Θ} ∈ SICX [SICV] if, and only if, {X(θ), θ ∈ Θ} ∈ SI and $\sum_{l=k}^{\infty} P\{X(\theta) \ge l\}$ [$\sum_{l=-\infty}^{k} P\{X(\theta) \le l\}$] is increasing [decreasing] convex in θ for all k ∈ N, and
(b) {X(θ), θ ∈ Θ} ∈ SDCX [SDCV] if, and only if, {X(θ), θ ∈ Θ} ∈ SD and $\sum_{l=k}^{\infty} P\{X(\theta) \ge l\}$ [$\sum_{l=-\infty}^{k} P\{X(\theta) \le l\}$] is decreasing [increasing] convex in θ for all k ∈ N.
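The partial-sum criterion of Theorem 8.A.10(a) is easy to evaluate exactly for small discrete families. As a hedged illustration (the family and the helper names are choices made for this sketch), take X(n) uniformly distributed on {0, 1, . . . , n − 1}; its support is nonnegative, so the sum over l starts at 0. The code computes the sum of P{X(n) ≤ l} over l ≤ k with exact fractions and tests whether it is decreasing and convex in n.

```python
from fractions import Fraction


def cdf_partial_sum(n, k):
    """sum_{l=0}^{k} P{X(n) <= l} for X(n) uniform on {0, 1, ..., n - 1}."""
    return sum(min(Fraction(l + 1, n), Fraction(1)) for l in range(k + 1))


def is_decreasing_convex(values):
    """True if consecutive steps are nonpositive and nondecreasing."""
    steps = [b - a for a, b in zip(values, values[1:])]
    decreasing = all(s <= 0 for s in steps)
    convex = all(later >= earlier for earlier, later in zip(steps, steps[1:]))
    return decreasing and convex


k = 4  # arbitrary level used for the illustration
sums = [cdf_partial_sum(n, k) for n in range(1, 13)]
print([str(s) for s in sums])
print(is_decreasing_convex(sums))
```

For n ≥ k + 1 the computed values agree with (k + 1)(k + 2)/(2n), and for n ≤ k + 1 with k + 1 − (n − 1)/2, so the sequence is linear up to n = k + 1 and hyperbolic afterwards.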

Recall the following identity which holds for any random variable Z with mean EZ:

$$EZ = -\int_{-\infty}^{0} F(u)\,du + \int_{0}^{\infty} \bar F(u)\,du, \tag{8.A.2}$$

where $F$ and $\bar F$ are the distribution function and the survival function of Z, respectively. From Theorem 8.A.9 and (8.A.2) we thus obtain the next result.


Theorem 8.A.11. Suppose that EX(θ) is a linear function of θ.
(a) If {X(θ), θ ∈ Θ} ∈ SICX [SICV], then {X(θ), θ ∈ Θ} ∈ SICV [SICX], and therefore {X(θ), θ ∈ Θ} ∈ SIL.
(b) If {X(θ), θ ∈ Θ} ∈ SDCX [SDCV], then {X(θ), θ ∈ Θ} ∈ SDCV [SDCX], and therefore {X(θ), θ ∈ Θ} ∈ SDL.

From Example 8.A.6 it follows that if X(θ) is uniformly distributed on [0, θ], then {X(θ), θ ∈ [0, ∞)} ∈ SIL. However, in order to obtain the discrete analog of this result we need to proceed along a different route, as in the next example.

Example 8.A.12. Let X(n) be uniformly distributed on {0, 1, . . . , n − 1}. Then {X(n), n ∈ N+} ∈ SIL. In order to see it, first note that EX(n) is a linear function of n. Thus, by Theorem 8.A.11 it is sufficient to show that {X(n), n ∈ N+} ∈ SICV. Clearly, {X(n), n ∈ N+} ∈ SI. Now we compute, for k ∈ N+,

$$\sum_{l=0}^{k} P\{X(n) \le l\} = \frac{1}{n} \cdot \frac{(k+1)(k+2)}{2} \quad \text{for } n \ge k+1,$$

while for n ≤ k + 1 the sum equals k + 1 − (n − 1)/2.

This is a decreasing convex function of n. Thus the stated result follows from Theorem 8.A.10(a).

We will now present an application of these notions in establishing a stochastic inequality.

Theorem 8.A.13. Let {Y_k, k ∈ N++} be a sequence of independent and identically distributed nonnegative random variables independent of the two nonnegative discrete random variables M and N. Then
(a) M ≤icx [≤icv] N =⇒ $\sum_{k=1}^{M} Y_k$ ≤icx [≤icv] $\sum_{k=1}^{N} Y_k$, and
(b) M ≤cx N =⇒ $\sum_{k=1}^{M} Y_k$ ≤cx $\sum_{k=1}^{N} Y_k$.

Proof. Let φ be an increasing and convex [concave] function and define $\psi(n) = E\phi\bigl(\sum_{k=1}^{n} Y_k\bigr)$. Then ψ is an increasing and convex [concave] function (see Example 8.A.4). Therefore M ≤icx [≤icv] N implies that

$$E\phi\Bigl(\sum_{k=1}^{M} Y_k\Bigr) = E\psi(M) \le E\psi(N) = E\phi\Bigl(\sum_{k=1}^{N} Y_k\Bigr).$$

This establishes part (a). When M ≤cx N one has $E\sum_{k=1}^{M} Y_k = E\sum_{k=1}^{N} Y_k$ (see Theorem 4.A.35). This observation combined with part (a) completes the proof for part (b).
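The argument in the proof of Theorem 8.A.13 can be traced on a small discrete case. The sketch below is illustrative and rests on assumptions that are not in the text: the summands take the values 0 and 2 with probability 1/2 each, φ(x) = max{x − 2, 0}, M is identically 2, and N is uniform on {1, 3} (so that M ≤cx N). It computes ψ(n) = Eφ(Y_1 + · · · + Y_n) exactly by convolution and then compares Eψ(M) with Eψ(N).

```python
from fractions import Fraction


def convolve(p, q):
    """Distribution of the sum of two independent discrete variables given as {value: prob}."""
    out = {}
    for a, pa in p.items():
        for b, pb in q.items():
            out[a + b] = out.get(a + b, Fraction(0)) + pa * pb
    return out


def psi(n, pmf_y, phi):
    """psi(n) = E[phi(Y_1 + ... + Y_n)], computed by n-fold convolution."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        dist = convolve(dist, pmf_y)
    return sum(prob * phi(s) for s, prob in dist.items())


def expect(pmf, f):
    """E[f(V)] for a discrete variable V given as {value: prob}."""
    return sum(prob * f(v) for v, prob in pmf.items())


pmf_y = {0: Fraction(1, 2), 2: Fraction(1, 2)}  # assumed summand distribution


def phi(x):
    return max(x - 2, 0)  # an increasing convex test function


psi_values = [psi(n, pmf_y, phi) for n in range(7)]
increments = [b - a for a, b in zip(psi_values, psi_values[1:])]

pmf_m = {2: Fraction(1)}                         # M = 2 a.s.
pmf_n = {1: Fraction(1, 2), 3: Fraction(1, 2)}   # N uniform on {1, 3}
lhs = expect(pmf_m, lambda n: psi(n, pmf_y, phi))
rhs = expect(pmf_n, lambda n: psi(n, pmf_y, phi))
print(psi_values, lhs, rhs)
```

The increments of ψ are nonnegative and nondecreasing, which is the increasing convexity used in the proof, and the two expectations satisfy Eψ(M) ≤ Eψ(N).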

A stronger result than Theorem 8.A.13(b) is stated as Theorem 3.A.13 in Chapter 3. A stronger result than Theorem 8.A.13(a) is stated as Theorem 4.A.9 in Chapter 4. Theorem 8.A.13 can also be obtained from Theorem 4.A.18. In fact, we next restate Theorems 3.A.21 and 4.A.18 in terms of the terminology of this section (the assumption in Theorem 8.A.14(a) below is slightly stronger than the assumption in Theorem 4.A.18; see a comment after Theorem 4.A.18).


Theorem 8.A.14. Let {X(θ), θ ∈ X} be a collection of random variables, and let Θ_1 and Θ_2 be two X-valued random variables that are independent of {X(θ), θ ∈ X}.
(a) If {X(θ), θ ∈ X} ∈ SICX [SICV] and if Θ_1 ≤icx [≤icv] Θ_2, then X(Θ_1) ≤icx [≤icv] X(Θ_2).
(b) If {X(θ), θ ∈ X} ∈ SCX and if Θ_1 ≤cx Θ_2, then X(Θ_1) ≤cx X(Θ_2).

8.A.2 Closure properties

Closure properties of the notions that were introduced in Section 8.A.1 serve as the basis for studying the convexity and concavity properties of the performance measures of stochastic systems. In this subsection we describe some of these closure properties.

Theorem 8.A.15. Suppose that {X(θ), θ ∈ Θ} and {Y(θ), θ ∈ Θ} are two collections of random variables such that X(θ) and Y(θ) are independent for each θ. If {X(θ), θ ∈ Θ} ∈ SICX [or SICV] and {Y(θ), θ ∈ Θ} ∈ SICX [or SICV], then {X(θ) + Y(θ), θ ∈ Θ} ∈ SICX [or SICV].

Proof. We prove the convex case only. The concave case can be similarly proven. Let θ_i ∈ Θ, i = 1, 2, 3, 4, be such that θ_1 ≤ θ_2 = θ_3 ≤ θ_4 and θ_1 + θ_4 = θ_2 + θ_3. The stochastic monotonicity of X(θ) and Y(θ) can be used to construct four random variables $\hat X_1$, $\hat X_4$, $\hat Y_1$, and $\hat Y_4$ such that $\hat X_i =_{\mathrm{st}} X(\theta_i)$, $\hat Y_i =_{\mathrm{st}} Y(\theta_i)$, i = 1, 4, $\hat X_1 \le \hat X_4$ a.s., and $\hat Y_1 \le \hat Y_4$ a.s. (see Theorem 1.A.1). Furthermore $(\hat X_1, \hat X_4)$ and $(\hat Y_1, \hat Y_4)$ can be constructed so that they are independent. Let I_1 and I_2 be independent random variables, independent of $\hat X_1$, $\hat X_4$, $\hat Y_1$, and $\hat Y_4$, such that P{I_1 = 0} = P{I_1 = 1} = P{I_2 = 0} = P{I_2 = 1} = 1/2. Define $\hat X_2 = (1 - I_1)\hat X_1 + I_1 \hat X_4$, $\hat X_3 = I_1 \hat X_1 + (1 - I_1)\hat X_4$, $\hat Y_2 = (1 - I_2)\hat Y_1 + I_2 \hat Y_4$, and $\hat Y_3 = I_2 \hat Y_1 + (1 - I_2)\hat Y_4$. It is then not hard to see that $\hat X_2 =_{\mathrm{st}} \hat X_3$, $\hat Y_2 =_{\mathrm{st}} \hat Y_3$,

$$(\hat X_1, \hat Y_1) \le [(\hat X_2, \hat Y_2), (\hat X_3, \hat Y_3)] \le (\hat X_4, \hat Y_4) \quad \text{a.s.}$$

(where, for any four numbers a, b, c, and d, the notation a ≤ [b, c] ≤ d means a ≤ min{b, c} and max{b, c} ≤ d), and

$$(\hat X_1 + \hat Y_1) + (\hat X_4 + \hat Y_4) = (\hat X_2 + \hat Y_2) + (\hat X_3 + \hat Y_3) \quad \text{a.s.}$$

Then, for any increasing convex function φ, one has

$$E\phi(\hat X_1 + \hat Y_1) + E\phi(\hat X_4 + \hat Y_4) \ge E\phi(\hat X_2 + \hat Y_2) + E\phi(\hat X_3 + \hat Y_3).$$

Observe that $\hat X_2 \ge_{\mathrm{icx}} X(\theta_2)$ and $\hat Y_2 \ge_{\mathrm{icx}} Y(\theta_2)$. So by the preservation of the order ≥icx under convolution (see Theorem 4.A.8) it follows that $\hat X_2 + \hat Y_2 \ge_{\mathrm{icx}} X(\theta_2) + Y(\theta_2)$. That is, for any increasing convex function φ, one has

$$E\phi(\hat X_2 + \hat Y_2) \ge E\phi(X(\theta_2) + Y(\theta_2)).$$

Similarly,

$$E\phi(\hat X_3 + \hat Y_3) \ge E\phi(X(\theta_3) + Y(\theta_3)).$$

Therefore,

$$E\phi(X(\theta_1) + Y(\theta_1)) + E\phi(X(\theta_4) + Y(\theta_4)) \ge E\phi(X(\theta_2) + Y(\theta_2)) + E\phi(X(\theta_3) + Y(\theta_3)).$$

Combining this with the preservation of stochastic monotonicity under convolution (see Theorem 1.A.3), one has {X(θ) + Y(θ), θ ∈ Θ} ∈ SICX.
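The pathwise part of the construction in the proof of Theorem 8.A.15 can be verified by brute force. The following sketch is only an illustration: it draws integer realizations x1 ≤ x4 and y1 ≤ y4 (integers are an assumption made to keep the arithmetic exact), runs through the four outcomes of the indicators I_1 and I_2, and checks the sandwich relation, the sum identity, and the resulting inequality for one hypothetical increasing convex φ.

```python
import random


def mixed_pairs(x1, x4, y1, y4, i1, i2):
    """Return (x2, y2), (x3, y3) built from the indicators as in the proof."""
    x2 = (1 - i1) * x1 + i1 * x4
    x3 = i1 * x1 + (1 - i1) * x4
    y2 = (1 - i2) * y1 + i2 * y4
    y3 = i2 * y1 + (1 - i2) * y4
    return (x2, y2), (x3, y3)


def phi(s):
    """A hypothetical increasing convex test function on the nonnegative integers."""
    return max(s - 3, 0) ** 2 + s


def check_construction(trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        x1, x4 = sorted((rng.randint(0, 10), rng.randint(0, 10)))
        y1, y4 = sorted((rng.randint(0, 10), rng.randint(0, 10)))
        for i1 in (0, 1):
            for i2 in (0, 1):
                (x2, y2), (x3, y3) = mixed_pairs(x1, x4, y1, y4, i1, i2)
                # Sandwich relation, componentwise.
                if not (x1 <= min(x2, x3) and max(x2, x3) <= x4):
                    return False
                if not (y1 <= min(y2, y3) and max(y2, y3) <= y4):
                    return False
                # Pathwise sum identity.
                if (x1 + y1) + (x4 + y4) != (x2 + y2) + (x3 + y3):
                    return False
                # Consequence for an increasing convex function.
                if phi(x1 + y1) + phi(x4 + y4) < phi(x2 + y2) + phi(x3 + y3):
                    return False
    return True


print(check_construction())
```

The inequality for φ holds pointwise because the two middle sums lie between the extreme sums and have the same total; taking expectations gives the displayed inequality in the proof.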

A combination of Example 8.A.4 and Theorem 8.A.15 yields the following generalization of Example 8.A.4 which will be used later.

Example 8.A.16. Let Y(n), n = 1, 2, . . . , be a sequence of nonnegative independent and identically distributed random variables with mean 1, and let Z be a random variable which is independent of the Y(n)'s. For µ > 0 define X(µ, n) = Z + µ ∑_{k=1}^n Y(k), n ∈ N++. Then, for each n ∈ N++, one has {X(µ, n), µ ∈ R+} ∈ SIL and, for each µ > 0, one has {X(µ, n), n ∈ N++} ∈ SIL.

Theorem 8.A.17. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random variables, where Λ ⊆ R is a convex set, and let {Y(λ), λ ∈ Λ} be another family of random variables. Suppose that X(θ) and Y(λ) are independent for any choice of θ ∈ Θ and λ ∈ Λ.
(a) If {X(θ), θ ∈ Θ} ∈ SICX [SICV, SIL] and {Y(λ), λ ∈ Λ} ∈ SICX [SICV, SIL], then {Y(X(θ)), θ ∈ Θ} ∈ SICX [SICV, SIL].
(b) If {X(θ), θ ∈ Θ} ∈ SDCX [SDCV, SDL] and {Y(λ), λ ∈ Λ} ∈ SICX [SICV, SIL], then {Y(X(θ)), θ ∈ Θ} ∈ SDCX [SDCV, SDL].

Proof. We will prove the increasing convex case only. The other cases can be proven similarly. Using the construction in the proof of Theorem 1.A.1 for the usual stochastic order, it is easily verified that {Y(X(θ)), θ ∈ Θ} ∈ SI. Let φ be an increasing and convex function. Consider

$$E\phi(Y(X(\theta))) = E\psi(X(\theta)), \tag{8.A.3}$$

where ψ(λ) = Eφ(Y(λ)). Since {Y(λ), λ ∈ Λ} ∈ SICX, we see that ψ is an increasing and convex function. Therefore, since {X(θ), θ ∈ Θ} ∈ SICX, one sees from (8.A.3) that Eφ(Y(X(θ))) is increasing and convex in θ. Therefore {Y(X(θ)), θ ∈ Θ} ∈ SICX.

Example 8.A.18. Let Y(n), n = 1, 2, . . . , be a sequence of nonnegative independent and identically distributed random variables as in Example 8.A.4, but here, since we are interested only in convexity properties with respect to n, we let the common mean of the Y(n)'s be a fixed µ > 0. Denote $X(n) = \sum_{k=1}^{n} Y(k)$, n ∈ N++, and let $\tilde X(n)$ be the forward recurrence time associated with X(n), that is, let $\tilde X(n)$ have the survival function given by

$$P\{\tilde X(n) > x\} = \frac{\int_x^\infty P\{X(n) > u\}\,du}{n\mu}, \quad x \ge 0,\ n \in \mathbb{N}_{++}.$$

Then {$\tilde X(n)$, n ∈ N++} ∈ SIL. This follows, by Examples 8.A.12 and 8.A.16, and by Theorem 8.A.17, from the relation (proven below)

$$\tilde X(n) =_{\mathrm{st}} \tilde Y + \sum_{k=1}^{U(n)} Y(k), \tag{8.A.4}$$

where U(n) is a random variable which is uniformly distributed on {0, 1, . . . , n − 1}, and $\tilde Y$ is the forward recurrence time associated with Y(1), that is,

$$P\{\tilde Y > x\} = \frac{\int_x^\infty P\{Y(1) > u\}\,du}{\mu}, \quad x \ge 0.$$

The relation (8.A.4) can be proven as follows: Consider n independent renewal processes {N_i(t), t ≥ 0}, i = 1, 2, . . . , n, all with interrenewal times that are distributed as Y(1), and consider the renewal process {N(t), t ≥ 0} with interrenewal intervals which are the sums of the corresponding interrenewal intervals of the n independent renewal processes {N_i(t), t ≥ 0}, i = 1, 2, . . . , n. That is, the interrenewal times that are associated with {N(t), t ≥ 0} are distributed as X(n). Select a t > 0 and consider the associated forward recurrence time in the process {N(t), t ≥ 0}. Clearly the value t falls in an interrenewal interval which is the sum of the n interrenewal intervals corresponding to {N_1(t), t ≥ 0}, {N_2(t), t ≥ 0}, . . . , {N_n(t), t ≥ 0}. With probability 1/n, t falls in the interrenewal interval corresponding to the process {N_i(t), t ≥ 0}, i = 1, 2, . . . , n. Let U(n) + 1 be the index of the process in whose interrenewal interval t falls. Then U(n) is uniformly distributed on {0, 1, . . . , n − 1}. If t falls in an interval corresponding to {N_i(t), t ≥ 0} (that is, when U(n) = i − 1), then its forward recurrence time is $\tilde Y + \sum_{k=i+1}^{n} Y(k) =_{\mathrm{st}} \tilde Y + \sum_{k=1}^{n-i} Y(k)$. Unconditioning with respect to the value i of U(n) + 1 we obtain

$$\tilde X(n) =_{\mathrm{st}} \tilde Y + \sum_{k=1}^{n-U(n)-1} Y(k) =_{\mathrm{st}} \tilde Y + \sum_{k=1}^{U(n)} Y(k),$$

and the proof of (8.A.4) is complete. In Example 8.B.12 of Section 8.B the reader may find a related result.

Let {X(n), n ∈ N+} be a Markov chain with state space S (S = [0, ∞) or N+). Let Y(x) and Z(x) denote generic random variables representing [X(n + 1) | X(n) = x] and [X(n + 1) − x | X(n) = x], respectively (recall that, for a random variable U and an event A, we denote by [U | A] any random variable whose distribution is the conditional distribution of U given A). Note that Y(x) =st x + Z(x), x ∈ S.


Theorem 8.A.19. Suppose that X(0) = 0 a.s. If {Z(x), x ∈ S} ∈ SD and Z(x) ≥ 0 a.s. for each x ∈ S, then {X(n), n ∈ N+} ∈ SICV.

Proof. Since Z(x) ≥ 0 a.s. we have Y(x) ≥ x a.s., and therefore X(n) is a.s. increasing in n. For any increasing and concave function φ we have that φ(x + y) − φ(y) is increasing in x and decreasing in y. Therefore, since {Z(y), y ∈ S} ∈ SD, we see that Eφ(Z(y) + y) − φ(y) is decreasing in y. Since X(n) is a.s. increasing in n, we have

Eφ(Z(X(n + 1)) + X(n + 1)) − Eφ(X(n + 1)) ≤ Eφ(Z(X(n)) + X(n)) − Eφ(X(n)).

Noting that X(n + 1) =st Z(X(n)) + X(n), from the above inequality one obtains

Eφ(X(n + 2)) + Eφ(X(n)) ≤ Eφ(X(n + 1)) + Eφ(X(n + 1)).

That is, {X(n), n ∈ N+} ∈ SICV.

Let X(n) be the historical record value of a sequence of independent and identically distributed random variables {Dn , n ∈ N++ }. That is, X(n) = max{X(n − 1), Dn } = max{X(0), D1 , D2 , . . . , Dn }, n ∈ N++ . Theorem 8.A.20. If X(0) = 0 a.s., then {X(n), n ∈ N+ } ∈ SICV. Proof. We apply Theorem 8.A.19. Here Y (x) =st max{Dn , x} and Z(x) =st max{Dn − x, 0}. Clearly, {Z(x), x ≥ 0} satisﬁes the conditions of Theorem 8.A.19.
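Theorem 8.A.20 can be illustrated with an exact computation for a simple record process. In the sketch below the D_n are assumed, purely for illustration, to be uniform on {1, . . . , 6} (a fair die), so that P{X(n) ≤ x} = (x/6)^n for n ≥ 1; the two test functions are arbitrary increasing concave choices.

```python
from fractions import Fraction


def record_expectation(n, faces, phi):
    """E[phi(X(n))] where X(n) = max{0, D_1, ..., D_n} and D is uniform on {1, ..., faces}."""
    if n == 0:
        return Fraction(phi(0))  # X(0) = 0 a.s.
    total = Fraction(0)
    for x in range(1, faces + 1):
        # P{X(n) = x} = (x / faces)^n - ((x - 1) / faces)^n
        prob = Fraction(x, faces) ** n - Fraction(x - 1, faces) ** n
        total += prob * phi(x)
    return total


def is_increasing_concave(values):
    """True if consecutive steps are nonnegative and nonincreasing."""
    steps = [b - a for a, b in zip(values, values[1:])]
    return all(s >= 0 for s in steps) and all(
        later <= earlier for earlier, later in zip(steps, steps[1:])
    )


identity_values = [record_expectation(n, 6, lambda x: x) for n in range(9)]
capped_values = [record_expectation(n, 6, lambda x: min(x, 4)) for n in range(9)]
print(is_increasing_concave(identity_values), is_increasing_concave(capped_values))
```

Both sequences are increasing with diminishing increments, as the SICV property requires for these test functions.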

8.A.3 Stochastic m-convexity

Let S be a subinterval of the real line. Recall from Section 3.A.5 the class $\mathcal{M}^{S}_{m\text{-cx}}$ of all functions φ : S → R whose mth derivative φ^(m) exists and satisfies φ^(m)(x) ≥ 0, for all x ∈ S, or which are limits of sequences of functions whose mth derivative is continuous and nonnegative on S, m = 1, 2, . . .. A function φ : S → R is said to be m-increasing convex if $\phi \in \bigcap_{k=1}^{m} \mathcal{M}^{S}_{k\text{-cx}}$. A set of random variables {X(θ), θ ∈ Θ} (Θ is a subinterval of the real line) is said to be stochastically m-increasing convex if Eφ(X(θ)) is m-increasing convex in θ whenever φ is m-increasing convex. If Θ is a subinterval of N++, then the definition of stochastic m-increasing convexity is similar; we do not give the details here, but they can be found in Denuit, Lefèvre, and Utev [155]. The proofs of most of the following examples, as well as many other examples, can be found in Denuit, Lefèvre, and Utev [155].

Example 8.A.21. Let X(λ) be a Poisson random variable with mean λ. Then {X(λ), λ ∈ [0, ∞)} is stochastically m-increasing convex for each m ∈ N++.
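One way to see why the Poisson family of Example 8.A.21 behaves this way is the following sketch. It is not taken from the text; it assumes that φ is m times differentiable with nonnegative derivatives of orders 1, . . . , m and grows slowly enough for termwise differentiation of the series to be justified. Here Δ denotes the forward difference operator, Δφ(x) = φ(x + 1) − φ(x).

```latex
\begin{align*}
\frac{d}{d\lambda}\, E\phi(X(\lambda))
  &= \sum_{k=0}^{\infty} \phi(k)\, \frac{d}{d\lambda}\Bigl(e^{-\lambda}\frac{\lambda^{k}}{k!}\Bigr)
   = \sum_{k=1}^{\infty} \phi(k)\, e^{-\lambda}\frac{\lambda^{k-1}}{(k-1)!}
     - \sum_{k=0}^{\infty} \phi(k)\, e^{-\lambda}\frac{\lambda^{k}}{k!} \\
  &= E\bigl[\phi(X(\lambda)+1) - \phi(X(\lambda))\bigr]
   = E\,\Delta\phi(X(\lambda)), \\[4pt]
\frac{d^{k}}{d\lambda^{k}}\, E\phi(X(\lambda))
  &= E\,\Delta^{k}\phi(X(\lambda)), \qquad
\Delta^{k}\phi(x) = \int_{[0,1]^{k}} \phi^{(k)}(x + u_1 + \cdots + u_k)\, du_1 \cdots du_k \;\ge\; 0,
\quad k = 1, \dots, m .
\end{align*}
```

Under these assumptions the first m derivatives of Eφ(X(λ)) with respect to λ are nonnegative, which is the defining property of an m-increasing convex function of λ.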


Example 8.A.22. Let X(n, p) be a binomial random variable with mean np and variance np(1 − p). Then, for each p ∈ (0, 1), one has that {X(n, p), n ∈ N++} is stochastically m-increasing convex, and for each n ∈ N++, one has that {X(n, p), p ∈ (0, 1)} is stochastically m-increasing convex, for each m ∈ N++.

Example 8.A.23. Let Y(n), n = 1, 2, . . ., be a sequence of nonnegative independent and identically distributed random variables with mean 1. For µ > 0 define X(µ, n) = µ ∑_{k=1}^n Y(k), n ∈ N++. Then, for each n ∈ N++, one has that {X(µ, n), µ ∈ [0, ∞)} is stochastically m-increasing convex, and for each µ > 0, one has that {X(µ, n), n ∈ N++} is stochastically m-increasing convex, for each m ∈ N++.

Specifically, when Y(n) in Example 8.A.23 is an exponential random variable we have the following example.

Example 8.A.24. Let X(µ, n) be an Erlang-n random variable with mean nµ and variance nµ^2. Then, for each n ∈ N++, one has that {X(µ, n), µ ∈ [0, ∞)} is stochastically m-increasing convex, and for each µ > 0, one has that {X(µ, n), n ∈ N++} is stochastically m-increasing convex, for each m ∈ N++.

By taking n = 1 in Example 8.A.23 we obtain the following result.

Example 8.A.25. Let Y be a nonnegative random variable. For µ > 0 define X(µ) = µY. Then {X(µ), µ ∈ [0, ∞)} is stochastically m-increasing convex for each m ∈ N++.

When the set of random variables is parametrized by a location parameter, we have the following example.

Example 8.A.26. Let Y be a real random variable. For µ > 0 define X(µ) = Y + µ. Then {X(µ), µ ∈ [0, ∞)} is stochastically m-increasing convex for each m ∈ N++.

Another example of interest is the following.

Example 8.A.27. Let X(n) be uniformly distributed on {0, 1, . . . , n − 1}. Then {X(n), n ∈ N+} is stochastically m-increasing convex for each m ∈ N++.

Since the composition of two m-increasing convex functions is m-increasing convex, we obtain the following closure properties of stochastic m-convexity.

Theorem 8.A.28.
(a) Let ϕ : S → R be an m-increasing convex function. If {X(θ), θ ∈ Θ} is stochastically m-increasing convex, then {ϕ(X(θ)), θ ∈ Θ} is also stochastically m-increasing convex.
(b) Let ϑ : Θ → Θ be an m-increasing convex function. If {X(θ), θ ∈ Θ} is stochastically m-increasing convex, then {X(ϑ(θ)), θ ∈ Θ} is also stochastically m-increasing convex.

From Theorem 8.A.28(a) and Example 8.A.23 we obtain the following result.


Theorem 8.A.29. Let {Yn, n ≥ 1} be a sequence of nonnegative, independent and identically distributed random variables. Let {N(θ), θ ∈ Θ} be a set of nonnegative integer-valued random variables, independent of the Yn's. Define $X(\theta) = \sum_{n=1}^{N(\theta)} Y_n$. If {N(θ), θ ∈ Θ} is stochastically m-increasing convex, then {X(θ), θ ∈ Θ} is stochastically m-increasing convex.

8.B Sample Path Convexity

Sample path convexity is one powerful tool that can be used for the purpose of obtaining the regular convexity notions presented in Section 8.A. Two other related tools will be described in Sections 8.C and 8.D.

8.B.1 Definitions

Consider a family {X(θ), θ ∈ Θ} of random variables. Let θi ∈ Θ, i = 1, 2, 3, 4, be any four values such that θ1 ≤ θ2 ≤ θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3. If there exist four random variables $\hat{X}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 2, 3, 4, and

(a) (i) $\max[\hat{X}_2, \hat{X}_3] \le \hat{X}_4$ a.s. and (ii) $\hat{X}_2 + \hat{X}_3 \le \hat{X}_1 + \hat{X}_4$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and convex in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SICX(sp));
(b) (i) $\hat{X}_1 \le \min[\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \le \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and concave in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SICV(sp));
(c) (i) $\hat{X}_1 \ge \max[\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \ge \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and convex in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SDCX(sp));
(d) (i) $\hat{X}_4 \le \min[\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \le \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and concave in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SDCV(sp));
(e) (i) $\max[\hat{X}_2, \hat{X}_3] \le \hat{X}_4$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 = \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and linear in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SIL(sp));
(f) (i) $\hat{X}_1 \ge \max[\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 = \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and linear in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SDL(sp)).

Although Condition (i) in these definitions requires stochastic monotonicity in $X_i$, i = 1, 2, 3, 4, we do not require the construction of $\hat{X}_i$, i = 2, 3, to satisfy any a.s. monotonicity property (that is, we do not require that either $\hat{X}_2 \ge \hat{X}_3$ a.s. or $\hat{X}_2 \le \hat{X}_3$ a.s. be satisfied).

Example 8.B.1. Let X(µ, σ) be a normal random variable with mean µ and standard deviation σ. Then, for each σ > 0, one has {X(µ, σ), µ ∈ R} ∈ SIL(sp). This follows from Example 8.D.4 and Theorem 8.D.11 below.
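The definitions above are statements about four coupled random variables, so they are easy to test on simulated draws. The following sketch, under the assumption that a single shared standard normal draw is used to build the four copies (the construction suggested by Example 8.D.4), checks conditions (a), (b), and (e) for the normal location family of Example 8.B.1. The helper names and the particular parameter values are hypothetical.

```python
import random

def normal_location_coupling(mus, sigma, rng):
    """Four coupled draws mu_i + sigma * Z that share one standard normal Z."""
    z = rng.gauss(0.0, 1.0)
    return [mu + sigma * z for mu in mus]

def is_sicx_sp(x, tol=1e-9):
    """Conditions (a)(i) and (a)(ii) for one realization of the four copies."""
    x1, x2, x3, x4 = x
    return max(x2, x3) <= x4 + tol and x2 + x3 <= x1 + x4 + tol

def is_sicv_sp(x, tol=1e-9):
    """Conditions (b)(i) and (b)(ii) for one realization of the four copies."""
    x1, x2, x3, x4 = x
    return x1 <= min(x2, x3) + tol and x1 + x4 <= x2 + x3 + tol

def is_sil_sp(x, tol=1e-9):
    """Conditions (e)(i) and (e)(ii): increasing, with equality of the two sums."""
    x1, x2, x3, x4 = x
    return max(x2, x3) <= x4 + tol and abs((x1 + x4) - (x2 + x3)) <= tol

rng = random.Random(0)
mus = (0.0, 1.0, 2.5, 3.5)  # satisfies mu1 <= mu2 <= mu3 <= mu4 and mu1 + mu4 = mu2 + mu3
draws = [normal_location_coupling(mus, 2.0, rng) for _ in range(1000)]
print(all(is_sil_sp(x) for x in draws))
```

A simulation of this kind only confirms that one particular coupling works on the sampled draws; the definitions require the conditions to hold almost surely.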


Example 8.B.2. Let X(λ) be a Poisson random variable with mean λ. Then {X(λ), λ ∈ R+} ∈ SIL(sp). This follows from Example 8.B.7 below.

Example 8.B.3. Let X(n, p) be a binomial random variable with mean np and variance np(1 − p). Then, for each p ∈ (0, 1), one has {X(n, p), n ∈ N++} ∈ SIL(sp) and, for each n ∈ N++, one has {X(n, p), p ∈ (0, 1)} ∈ SIL(sp). The first result follows from Example 8.B.4 below. In order to prove the second result, first note that X(n, p) =st X1(p) + X2(p) + ··· + Xn(p), where Xj(p), j = 1, 2, ..., n, are independent and identically distributed Bernoulli random variables with P{Xj(p) = 1} = p. We will show that

$$\{X_1(p),\ p \in (0, 1)\} \in \text{SIL(sp)}. \tag{8.B.1}$$

The second result above then follows from Theorem 8.B.10 below. To prove (8.B.1) let pi, i = 1, 2, 3, 4, be such that 0 < p1 ≤ p2 ≤ p3 ≤ p4 < 1 and p1 + p4 = p2 + p3. Let U be a uniform (0, 1) random variable. Let $I_A$ denote the indicator function of A. Define

$$\hat{X}_1 = I_{\{U \le p_1\}}, \quad \hat{X}_2 = I_{\{U \le p_2\}}, \quad \hat{X}_3 = I_{\{U \le p_1\}} + I_{\{p_2 \le U \le p_4\}}, \quad \hat{X}_4 = I_{\{U \le p_4\}}.$$

Then $\hat{X}_i =_{st} X_1(p_i)$, i = 1, 2, 3, 4, and $\hat{X}_i$, i = 1, 2, 3, 4, satisfy the conditions given in the definitions of SICX(sp) and SICV(sp). This proves (8.B.1).

Example 8.B.4. Let Y(n), n = 1, 2, ..., be a sequence of nonnegative independent and identically distributed random variables with mean 1. For µ > 0 define $X(\mu, n) = \mu \sum_{k=1}^{n} Y(k)$, n ∈ N++. Then, for each n ∈ N++, one has {X(µ, n), µ ∈ R+} ∈ SIL(sp) and, for each µ > 0, one has {X(µ, n), n ∈ N++} ∈ SIL(sp). The first result follows from Example 8.D.5 and Theorem 8.D.11 below. The second result follows from Example 8.B.7 below.

Specifically, when Y(n) in Example 8.B.4 is an exponential random variable we have the following example.

Example 8.B.5. Let X(µ, n) be an Erlang-n random variable with mean nµ and variance nµ². Then, for each n ∈ N++, one has {X(µ, n), µ ∈ R+} ∈ SIL(sp) and, for each µ > 0, one has {X(µ, n), n ∈ N++} ∈ SIL(sp).

By taking n = 1 in Example 8.B.4 we obtain the following result.

Example 8.B.6. Let Y be a nonnegative random variable. For µ > 0 define X(µ) = µY. Then {X(µ), µ ∈ R+} ∈ SIL(sp).

Example 8.B.7. If {X(θ), θ ∈ Θ} has the semigroup property (see Example 8.A.7), then {X(θ), θ ∈ Θ} ∈ SIL(sp). In order to see it let θi ∈ Θ, i = 1, 2, 3, 4, be such that θ1 ≤ θ2 ≤ θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3. Let Zi, i = 1, 2, 3, 4, be independent random variables such that Z1 =st X(θ1), Z2 =st X(θ2 − θ1), Z3 =st X(θ3 − θ2), and Z4 =st X(θ4 − θ3), where, by convention, X(0) ≡ 0. Define

$$\hat{X}_1 = Z_1, \quad \hat{X}_2 = Z_1 + Z_2, \quad \hat{X}_3 = Z_1 + Z_3 + Z_4, \quad \text{and} \quad \hat{X}_4 = Z_1 + Z_2 + Z_3 + Z_4.$$

Then $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 2, 3, 4, and $\hat{X}_i$, i = 1, 2, 3, 4, satisfy the conditions given in the definitions of SICX(sp) and SICV(sp). This proves the result stated above.

The following theorem is obvious. A more general result is proven in Theorem 8.B.13 (see Corollary 8.B.14).

Theorem 8.B.8.
(a) If {X(θ), θ ∈ Θ} ∈ SICX(sp) [or SICV(sp)] and if φ is an increasing convex [or concave] function, then {φ(X(θ)), θ ∈ Θ} ∈ SICX(sp) [or SICV(sp)].
(b) If {X(θ), θ ∈ Θ} ∈ SDCX(sp) [or SDCV(sp)] and if φ is an increasing convex [or concave] function, then {φ(X(θ)), θ ∈ Θ} ∈ SDCX(sp) [or SDCV(sp)].

Theorem 8.B.8 shows that the sample path notions imply the regular notions of stochastic convexity/concavity. Counterexamples can be constructed to show that the reverse need not be true. We have the following results.

Theorem 8.B.9. SICX(sp) ⟹ SICX, SICV(sp) ⟹ SICV, SDCX(sp) ⟹ SDCX, SDCV(sp) ⟹ SDCV.
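The semigroup construction of Example 8.B.7 translates directly into code. The sketch below assumes the Poisson family of Example 8.B.2 as the semigroup and builds the four coupled variables from independent increments; the sampler `poisson` and the function `semigroup_coupling` are hypothetical helpers written for this illustration.

```python
import random

def poisson(mean, rng):
    """Draw a Poisson(mean) variate by counting unit-rate exponential arrivals in [0, mean]."""
    if mean <= 0.0:
        return 0  # convention X(0) = 0
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def semigroup_coupling(thetas, rng, draw=poisson):
    """Build the four coupled variables of Example 8.B.7 from independent increments Z1..Z4."""
    t1, t2, t3, t4 = thetas
    z1 = draw(t1, rng)
    z2 = draw(t2 - t1, rng)
    z3 = draw(t3 - t2, rng)
    z4 = draw(t4 - t3, rng)
    return z1, z1 + z2, z1 + z3 + z4, z1 + z2 + z3 + z4

rng = random.Random(1)
thetas = (1.0, 2.0, 4.0, 5.0)  # theta1 + theta4 = theta2 + theta3
samples = [semigroup_coupling(thetas, rng) for _ in range(5000)]

# Conditions (e)(i) and (e)(ii) of the SIL(sp) definition hold on every draw.
print(all(max(x2, x3) <= x4 and x1 + x4 == x2 + x3 for x1, x2, x3, x4 in samples))
```

Because θ4 − θ3 = θ2 − θ1, the third variable Z1 + Z3 + Z4 has parameter θ1 + (θ3 − θ2) + (θ2 − θ1) = θ3, which is why its marginal law is the required one.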


8.B.2 Closure properties

In this section we present some closure properties of the sample path convexity notions.

Theorem 8.B.10. Let {X(θ), θ ∈ Θ} and {Y(θ), θ ∈ Θ} be two families of random variables such that for each θ ∈ Θ, X(θ) and Y(θ) are independent. Then
(a) {X(θ), θ ∈ Θ} ∈ SICX(sp) and {Y(θ), θ ∈ Θ} ∈ SICX(sp) ⟹ {X(θ) + Y(θ), θ ∈ Θ} ∈ SICX(sp),
(b) {X(θ), θ ∈ Θ} ∈ SICV(sp) and {Y(θ), θ ∈ Θ} ∈ SICV(sp) ⟹ {X(θ) + Y(θ), θ ∈ Θ} ∈ SICV(sp),
(c) {X(θ), θ ∈ Θ} ∈ SDCX(sp) and {Y(θ), θ ∈ Θ} ∈ SDCX(sp) ⟹ {X(θ) + Y(θ), θ ∈ Θ} ∈ SDCX(sp), and
(d) {X(θ), θ ∈ Θ} ∈ SDCV(sp) and {Y(θ), θ ∈ Θ} ∈ SDCV(sp) ⟹ {X(θ) + Y(θ), θ ∈ Θ} ∈ SDCV(sp).

Proof. We will prove part (a) only, since the other parts can be similarly proven. Let θi ∈ Θ, i = 1, 2, 3, 4, be any four values such that θ1 ≤ θ2 ≤ θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3. From the definition of SICX(sp) one sees that there exist eight random variables $\hat{X}_i$, $\hat{Y}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{X}_i =_{st} X(\theta_i)$, $\hat{Y}_i =_{st} Y(\theta_i)$, i = 1, 2, 3, 4, and

$$\max[\hat{X}_2, \hat{X}_3] \le \hat{X}_4 \text{ a.s.}, \qquad \max[\hat{Y}_2, \hat{Y}_3] \le \hat{Y}_4 \text{ a.s.},$$

$$\hat{X}_2 + \hat{X}_3 \le \hat{X}_1 + \hat{X}_4 \text{ a.s.}, \qquad \hat{Y}_2 + \hat{Y}_3 \le \hat{Y}_1 + \hat{Y}_4 \text{ a.s.},$$

and $\hat{X}_i$ and $\hat{Y}_i$ are independent, i = 1, 2, 3, 4. Let $\hat{Z}_i = \hat{X}_i + \hat{Y}_i$, i = 1, 2, 3, 4. Then $\hat{Z}_i =_{st} X(\theta_i) + Y(\theta_i)$, i = 1, 2, 3, 4, and

$$\max[\hat{Z}_2, \hat{Z}_3] \le \hat{Z}_4 \text{ a.s.} \quad \text{and} \quad \hat{Z}_2 + \hat{Z}_3 \le \hat{Z}_1 + \hat{Z}_4 \text{ a.s.}$$

Therefore, {X(θ) + Y(θ), θ ∈ Θ} ∈ SICX(sp).

A combination of Example 8.B.4 and Theorem 8.B.10 yields the following generalization of Example 8.B.4 which will be used later.

Example 8.B.11. Let Y(n), n = 1, 2, ..., be a sequence of nonnegative independent and identically distributed random variables with mean 1, and let Z be a random variable which is independent of the Y(n)'s. For µ > 0 define $X(\mu, n) = Z + \mu \sum_{k=1}^{n} Y(k)$, n ∈ N++. Then, for each n ∈ N++, one has {X(µ, n), µ ∈ R+} ∈ SIL(sp), and, for each µ > 0, one has {X(µ, n), n ∈ N++} ∈ SIL(sp).

Example 8.B.12. Let Y(n), n = 1, 2, ..., be a sequence of nonnegative independent and identically distributed random variables with a common mean µ > 0, as in Example 8.A.18. Let Y* be the spread of the renewal process generated by the Y(n)'s; that is, if f is the density function of Y(1), then the density function of Y* is (1/µ)xf(x). Denote $X(n) = \sum_{k=1}^{n} Y(k)$, n ∈ N++, and let X*(n) be the spread corresponding to X(n). Then {X*(n), n ∈ N++} ∈ SIL(sp). This follows, by Example 8.B.11, from the relation

$$X^*(n) =_{st} Y^* + \sum_{k=1}^{n-1} Y(k). \tag{8.B.2}$$

The relation (8.B.2) can be proven as follows: Consider n independent renewal processes {Ni(t), t ≥ 0}, i = 1, 2, ..., n, all with interrenewal times that are distributed as Y(1), and consider the renewal process {N(t), t ≥ 0} with interrenewal intervals which are the sums of the corresponding interrenewal intervals of the n independent renewal processes {Ni(t), t ≥ 0}, i = 1, 2, ..., n. That is, the interrenewal times that are associated with {N(t), t ≥ 0} are distributed as X(n). Select a t > 0 and consider the spread corresponding to the process {N(t), t ≥ 0}. Clearly the value t falls in an interrenewal interval which is the sum of the n interrenewal intervals corresponding to {N1(t), t ≥ 0}, {N2(t), t ≥ 0}, ..., {Nn(t), t ≥ 0}. With probability 1/n, t falls in the interrenewal interval corresponding to the process {Ni(t), t ≥ 0}, i = 1, 2, ..., n. Let U(n) be the index of the process in whose interrenewal interval t falls. Then U(n) is uniformly distributed on {1, 2, ..., n}. If t falls in an interval corresponding to {Ni(t), t ≥ 0} (that is, when U(n) = i), then its spread is $Y^* + \sum_{k \ne i} Y(k) =_{st} Y^* + \sum_{k=1}^{n-1} Y(k)$. Note that the distribution of the spread is independent of i. Therefore, by unconditioning with respect to the value i of U(n) we obtain (8.B.2).

Theorem 8.B.13. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random variables, where Λ ⊂ R is a convex set. Also, let {Y(λ), λ ∈ Λ} be another family of random variables. Suppose that X(θ) and Y(λ) are independent for any choice of θ ∈ Θ and λ ∈ Λ.
(a) If {X(θ), θ ∈ Θ} ∈ SICX(sp) [SICV(sp)] and {Y(λ), λ ∈ Λ} ∈ SICX(sp) [SICV(sp)], then {Y(X(θ)), θ ∈ Θ} ∈ SICX(sp) [SICV(sp)].
(b) If {X(θ), θ ∈ Θ} ∈ SDCX(sp) [SDCV(sp)] and {Y(λ), λ ∈ Λ} ∈ SICX(sp) [SICV(sp)], then {Y(X(θ)), θ ∈ Θ} ∈ SDCX(sp) [SDCV(sp)].

Proof. We will prove the convex case of part (a) only, as the proofs of the other cases are similar.
Let θi ∈ Θ, i = 1, 2, 3, 4, be any four values such that θ1 ≤ θ2 ≤ θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3. Since {X(θ), θ ∈ Θ} ∈ SICX(sp), there exist four random variables $\hat{X}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 2, 3, 4, and

$$\max[\hat{X}_2, \hat{X}_3] \le \hat{X}_4 \text{ a.s.} \quad \text{and} \quad \hat{X}_2 + \hat{X}_3 \le \hat{X}_1 + \hat{X}_4 \text{ a.s.}$$

Let

$$X_2^* = \min[\hat{X}_4,\ \hat{X}_1 + \hat{X}_4 - \hat{X}_3] \tag{8.B.3}$$

and

$$X_1^* = X_2^* + \hat{X}_3 - \hat{X}_4.$$

Clearly, $X_1^*$ and $X_2^*$ ∈ Λ a.s., and

$$X_1^* \le \hat{X}_1 \quad \text{and} \quad X_2^* \ge \hat{X}_2. \tag{8.B.4}$$

Also,

$$\max[X_2^*, \hat{X}_3] \le \hat{X}_4 \quad \text{and} \quad X_1^* + \hat{X}_4 = X_2^* + \hat{X}_3.$$

Therefore, since {Y(λ), λ ∈ Λ} ∈ SICX(sp), there exist four random variables $Z_1^*$, $Z_2^*$, $\hat{Z}_3$, and $\hat{Z}_4$, defined on a common probability space, such that $Z_1^* =_{st} Y(X_1^*)$, $Z_2^* =_{st} Y(X_2^*)$, $\hat{Z}_3 =_{st} Y(\hat{X}_3)$, $\hat{Z}_4 =_{st} Y(\hat{X}_4)$, and

$$\max[Z_2^*, \hat{Z}_3] \le \hat{Z}_4 \text{ a.s.} \quad \text{and} \quad Z_2^* + \hat{Z}_3 \le Z_1^* + \hat{Z}_4 \text{ a.s.} \tag{8.B.5}$$

Since Y(λ) is stochastically increasing in λ, from (8.B.4) it is seen that there exist random variables $\hat{Z}_i$, i = 1, 2, such that $\hat{Z}_i =_{st} Y(\hat{X}_i)$, i = 1, 2, and

$$Z_1^* \le \hat{Z}_1 \quad \text{and} \quad Z_2^* \ge \hat{Z}_2.$$

Then from (8.B.5) one sees that

$$\max[\hat{Z}_2, \hat{Z}_3] \le \hat{Z}_4 \text{ a.s.} \quad \text{and} \quad \hat{Z}_2 + \hat{Z}_3 \le \hat{Z}_1 + \hat{Z}_4 \text{ a.s.}$$

The proof is completed by observing that $Y(X(\theta_i)) =_{st} \hat{Z}_i$, i = 1, 2, 3, 4.
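The auxiliary pair used in the proof of Theorem 8.B.13 can be checked mechanically. The sketch below computes X2* from (8.B.3) and X1* = X2* + X̂3 − X̂4 for one realization of the four coupled variables, then verifies (8.B.4) together with the monotonicity and linearity relations that follow it. The random quadruples are hypothetical test inputs constructed to satisfy the SICX(sp) conditions; membership in Λ is not checked.

```python
import random

def auxiliary_pair(x1, x2, x3, x4):
    """Return (X1*, X2*), with X2* as in (8.B.3) and X1* = X2* + X3 - X4."""
    x2_star = min(x4, x1 + x4 - x3)
    x1_star = x2_star + x3 - x4
    return x1_star, x2_star

def random_sicx_quadruple(rng):
    """Hypothetical input satisfying max[x2, x3] <= x4 and x2 + x3 <= x1 + x4."""
    x2, x3 = rng.uniform(0, 10), rng.uniform(0, 10)
    x4 = max(x2, x3) + rng.uniform(0, 5)
    x1 = x2 + x3 - x4 + rng.uniform(0, 5)
    return x1, x2, x3, x4

def check_construction(x, tol=1e-9):
    """Verify (8.B.4) and the two relations stated right after it."""
    x1, x2, x3, x4 = x
    x1_star, x2_star = auxiliary_pair(x1, x2, x3, x4)
    return (x1_star <= x1 + tol and x2_star >= x2 - tol        # (8.B.4)
            and max(x2_star, x3) <= x4 + tol                   # monotonicity
            and abs((x1_star + x4) - (x2_star + x3)) <= tol)   # X1* + X4 = X2* + X3

rng = random.Random(2)
print(all(check_construction(random_sicx_quadruple(rng)) for _ in range(1000)))
```

The point of the construction is that the starred pair together with X̂3 and X̂4 satisfies the linear relation exactly, so the SICX(sp) property of {Y(λ)} can be applied at those four arguments.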

By letting {Y(λ), λ ∈ Λ} of Theorem 8.B.13 be deterministic (we denote it then as a real function φ : Λ → R) we obtain the following corollary.

Corollary 8.B.14. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random variables, where Λ ⊂ R is a convex set, and let φ be a real function on Λ.
(a) If {X(θ), θ ∈ Θ} ∈ SICX(sp) [or SICV(sp)] and φ is increasing and convex [or concave], then {φ(X(θ)), θ ∈ Θ} ∈ SICX(sp) [or SICV(sp)].
(b) If {X(θ), θ ∈ Θ} ∈ SDCX(sp) [or SDCV(sp)] and φ is increasing and convex [or concave], then {φ(X(θ)), θ ∈ Θ} ∈ SDCX(sp) [or SDCV(sp)].

By letting {X(θ), θ ∈ Θ} of Theorem 8.B.13 be deterministic (we denote it then as a real function φ : Θ → Λ) we obtain the following corollary.

Corollary 8.B.15. Let {Y(λ), λ ∈ Λ} be a family of real-valued random variables, where Λ ⊂ R is a convex set, and let φ be a Λ-valued function on Θ, where Θ ⊂ R is a convex set.
(a) If {Y(λ), λ ∈ Λ} ∈ SICX(sp) [or SICV(sp)] and φ is increasing and convex [or concave], then {Y(φ(θ)), θ ∈ Θ} ∈ SICX(sp) [or SICV(sp)].
(b) If {Y(λ), λ ∈ Λ} ∈ SDCX(sp) [or SDCV(sp)] and φ is increasing and convex [or concave], then {Y(φ(θ)), θ ∈ Θ} ∈ SDCX(sp) [or SDCV(sp)].

Let {X(n), n ∈ N+} be a Markov chain with state space S (S = R+ or N+). Let $Y(x) =_{st} [X(n+1) \mid X(n) = x]$ and Z(x) = Y(x) − x, x ∈ S.


Theorem 8.B.16. Suppose X(0) = x0 a.s. If Z(x) ≥ 0 a.s. for each x ∈ S and {Z(x), x ∈ S} ∈ SI, then {X(n), n ∈ N+} ∈ SICX(sp).

Proof. Since Z(x) ≥ 0 a.s., for n1 ≤ n2 we have X(n1) ≤ X(n2) a.s. Let n3 and n4 be such that n1 ≤ n2 ≤ n3 ≤ n4 and n1 + n4 = n2 + n3. Define m = n4 − n2 = n3 − n1 and $Z^{(m)}(x) =_{st} [X(m) - x \mid X(0) = x]$. Since Z(x) is stochastically increasing in x, using sample path construction (as in the proof of Theorem 6.B.3 when it applies to Theorem 6.B.34 through Theorems 6.B.32 and 6.B.31), it can be established that $Z^{(m)}(x)$ is also stochastically increasing in x. Then there exist two random vectors $(\hat{X}_1, \hat{Z}_1)$ and $(\hat{X}_2, \hat{Z}_2)$ defined on a common probability space such that $(\hat{X}_i, \hat{Z}_i) =_{st} (X(n_i), Z^{(m)}(X(n_i)))$, i = 1, 2, and

$$(\hat{X}_1, \hat{Z}_1) \le (\hat{X}_2, \hat{Z}_2) \text{ a.s.} \tag{8.B.6}$$

Set

$$\hat{X}_3 = \hat{X}_1 + \hat{Z}_1 \quad \text{and} \quad \hat{X}_4 = \hat{X}_2 + \hat{Z}_2.$$

Since $Z^{(m)}(x) \ge 0$ a.s., from (8.B.6) it follows that

$$\max[\hat{X}_2, \hat{X}_3] \le \hat{X}_4 \quad \text{and} \quad \hat{X}_1 + \hat{X}_4 \ge \hat{X}_2 + \hat{X}_3.$$

The proof is now completed by noting that $X(n_i) =_{st} \hat{X}_i$, i = 1, 2, 3, 4.

Next consider a Galton-Watson branching process {X(n), n ∈ N+} in discrete time. Let Di, i = 1, 2, ..., be independent and identically distributed random variables such that Di has the same distribution as the number of offspring of an ancestor. Then, for this process, $Y(x) =_{st} \sum_{i=1}^{x} D_i$, x ∈ N+.

Theorem 8.B.17. Suppose Di ≥ 1 a.s. and P{Di > 1} > 0. If X(0) ≥ 1 a.s., then {X(n), n ∈ N+} ∈ SICX(sp).

Proof. First, condition on X(0) = x0. Since $Z(x) = Y(x) - x =_{st} \sum_{i=1}^{x} (D_i - 1)$ and Di ≥ 1 a.s., one sees that Z(x) ≥ 0 a.s. Also, it is easily seen that {Z(x), x ∈ N+} ∈ SI. Then, conditioned on X(0) = x0, the result of Theorem 8.B.17 follows immediately from Theorem 8.B.16. From the definition of sample path convexity, it is clear that by unconditioning with respect to X(0), the sample path convexity of {X(n), n ∈ N+} is preserved.

Now consider a nonhomogeneous Poisson process {N(t), t ≥ 0} with mean value function M(t) = EN(t). To avoid trivialities we assume that M is strictly increasing. Denote by Rn the nth epoch time of this process.

Theorem 8.B.18. If M is concave [convex], then {Rn, n ∈ N++} ∈ SICX(sp) [SICV(sp)].

Proof. Let {K(t), t ≥ 0} be a Poisson process with rate 1, and let Tn denote the nth epoch time of this process. By Example 8.B.4 we have {Tn, n ∈ N++} ∈ SIL(sp). Now, $\{R_n, n \in N_{++}\} =_{st} \{M^{-1}(T_n), n \in N_{++}\}$. Since M is increasing and concave [convex] it follows that $M^{-1}$ is increasing and convex [concave]. The result now follows from Corollary 8.B.14.
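The argument for Theorem 8.B.18 can be illustrated with a short simulation. The sketch below assumes the concave mean value function M(t) = √t (so that the inverse of M is s ↦ s²), couples four epoch times of a unit-rate Poisson process through independent Erlang increments as in Example 8.B.7, and applies the inverse of M. All function names and the choice of M are assumptions made for the illustration.

```python
import random

def erlang(n, rng):
    """Sum of n independent unit-mean exponentials (equals 0 when n == 0)."""
    return sum(rng.expovariate(1.0) for _ in range(n))

def m_inverse(s):
    """Inverse of the assumed mean value function M(t) = sqrt(t), which is concave."""
    return s * s

def coupled_epochs(ns, rng, inverse=m_inverse):
    """Coupled copies of R_{n1}, ..., R_{n4} obtained as M^{-1} of coupled unit-rate epochs."""
    n1, n2, n3, n4 = ns
    z1 = erlang(n1, rng)
    z2 = erlang(n2 - n1, rng)
    z3 = erlang(n3 - n2, rng)
    z4 = erlang(n4 - n3, rng)
    t_hat = (z1, z1 + z2, z1 + z3 + z4, z1 + z2 + z3 + z4)  # coupled T_{n1}, ..., T_{n4}
    return tuple(inverse(t) for t in t_hat)

def is_sicx_sp(r, tol=1e-9):
    """Conditions (a)(i) and (a)(ii) of the SICX(sp) definition for one realization."""
    r1, r2, r3, r4 = r
    return max(r2, r3) <= r4 + tol and r2 + r3 <= r1 + r4 + tol

rng = random.Random(4)
ns = (1, 3, 5, 7)  # n1 + n4 = n2 + n3
print(all(is_sicx_sp(coupled_epochs(ns, rng)) for _ in range(2000)))
```

With this coupling the inequality in (a)(ii) reduces to the convexity of s ↦ s² evaluated at four points with equal outer and inner sums, which is the deterministic fact behind Corollary 8.B.14.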

Theorem 8.B.19. If M is convex [concave], then {N (t), t ∈ [0, ∞)} ∈ SICX(sp) [SICV(sp)]. Proof. Let {K(t), t ≥ 0} be a Poisson process with rate 1. By Example 8.B.2 we have {K(t), t ∈ [0, ∞)} ∈ SIL(sp). Now, {N (t), t ∈ [0, ∞)} =st {K(M (t)), t ∈ [0, ∞)}. The result now follows from Corollary 8.B.15.

8.C Convexity in the Usual Stochastic Order

In some applications it is hard to find the construction needed to establish the sample path convexity of Section 8.B. Then the stochastic convexity notions of this section may be useful.

8.C.1 Definitions

Let {X(θ), θ ∈ Θ} be a family of random variables with survival functions $\bar{F}_\theta(x) = P\{X(\theta) > x\}$, θ ∈ Θ. The family {X(θ), θ ∈ Θ} is said to be stochastically increasing [decreasing] and convex [concave, linear] in the sense of the usual stochastic ordering if Eφ(X(θ)) is increasing [decreasing] and convex [concave, linear] for all increasing functions φ. We denote this by {X(θ), θ ∈ Θ} ∈ SICX(st) [SICV(st), SIL(st), SDCX(st), SDCV(st), SDL(st)]. It is easy to see the following characterization.

Theorem 8.C.1. The family {X(θ), θ ∈ Θ} satisfies {X(θ), θ ∈ Θ} ∈ SICX(st) [SICV(st), SDCX(st), SDCV(st)] if, and only if, $\bar{F}(x, \theta) = \bar{F}_\theta(x)$ is increasing and convex [increasing and concave, decreasing and convex, decreasing and concave] in θ for each fixed x.

The following are other characterizations of these notions.

Theorem 8.C.2. The family {X(θ), θ ∈ Θ} satisfies {X(θ), θ ∈ Θ} ∈ SICX(st) [SICV(st), SDCX(st), SDCV(st)] if, and only if, for any θi ∈ Θ, i = 1, 2, 3, 4, such that θ1 ≤ θ2 ≤ θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3, there exist four random variables $\hat{X}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 2, 3, 4, and $\hat{X}_1 \le [\le, \ge, \ge]\ \hat{X}_4$ a.s., $\min\{\hat{X}_1, \hat{X}_4\} \ge [\le, \ge, \le]\ \min\{\hat{X}_2, \hat{X}_3\}$ a.s., $\max\{\hat{X}_1, \hat{X}_4\} \ge [\le, \ge, \le]\ \max\{\hat{X}_2, \hat{X}_3\}$ a.s., and hence $\hat{X}_1 + \hat{X}_4 \ge [\le, \ge, \le]\ \hat{X}_2 + \hat{X}_3$ a.s.

Proof. We prove the increasing convex case only since the other cases can be proven similarly. Since X(θ) is stochastically increasing in θ there exist, on a common probability space, random variables $\hat{X}_1$ and $\hat{X}_4$ such that $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 4, and $\hat{X}_1 \le \hat{X}_4$ a.s. Let U be a uniform random variable on (0, 1) and define

$$X_2^* = I\left\{U \le \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1}\right\} \hat{X}_4 + \left(1 - I\left\{U \le \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1}\right\}\right) \hat{X}_1,$$

and

$$X_3^* = I\left\{U \le \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1}\right\} \hat{X}_1 + \left(1 - I\left\{U \le \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1}\right\}\right) \hat{X}_4.$$

Then

$$\min[X_2^*, X_3^*] = \min[\hat{X}_1, \hat{X}_4] \tag{8.C.1}$$

and

$$\max[X_2^*, X_3^*] = \max[\hat{X}_1, \hat{X}_4]. \tag{8.C.2}$$

Also note that

$$P\{X_2^* > x\} = \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1} \bar{F}(x, \theta_4) + \frac{\theta_4 - \theta_2}{\theta_4 - \theta_1} \bar{F}(x, \theta_1),$$

and

$$P\{X_3^* > x\} = \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1} \bar{F}(x, \theta_1) + \frac{\theta_4 - \theta_2}{\theta_4 - \theta_1} \bar{F}(x, \theta_4).$$

Since $\bar{F}(x, \theta)$ is increasing and convex in θ, it is then obvious that

$$X_2^* \ge_{st} X(\theta_2) \quad \text{and} \quad X_3^* \ge_{st} X(\theta_3).$$

Therefore, there exist $\hat{X}_2$ and $\hat{X}_3$ such that $\hat{X}_i =_{st} X(\theta_i)$, i = 2, 3, and $X_i^* \ge \hat{X}_i$, i = 2, 3. Then, from (8.C.1) and (8.C.2), one sees that

$$\min[\hat{X}_2, \hat{X}_3] \le \min[\hat{X}_1, \hat{X}_4] \quad \text{and} \quad \max[\hat{X}_2, \hat{X}_3] \le \max[\hat{X}_1, \hat{X}_4].$$

The proof is now completed by observing that $X(\theta_i) =_{st} \hat{X}_i$, i = 1, 2, 3, 4.

Example 8.C.3. Let X(p) be a geometric random variable with mean 1/(1 − p). Then {X(p), p ∈ (0, 1)} ∈ SICX(st).

Example 8.C.4. Let X(λ) be an exponential random variable with mean 1/λ. Then {X(λ), λ ∈ (0, ∞)} ∈ SDCX(st).

It is evident from Theorems 8.C.2 and 8.B.9 that one has the following results.

Theorem 8.C.5. SICX(st) ⟹ SICX(sp) ⟹ SICX, SICV(st) ⟹ SICV(sp) ⟹ SICV, SDCX(st) ⟹ SDCX(sp) ⟹ SDCX, SDCV(st) ⟹ SDCV(sp) ⟹ SDCV.

Observing that {ψ(θ), θ ∈ Θ} ∈ SICX(sp) for any increasing convex function ψ, and that it is not SICX(st), it is clear that the implications in Theorem 8.C.5 are strict.
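The characterization through the survival function makes Examples 8.C.3 and 8.C.4 easy to check on a grid. The sketch below is a numerical sanity check only. It assumes the geometric variable has support {1, 2, ...}, which is consistent with the stated mean 1/(1 − p), and it tests monotonicity and convexity of the survival function in the parameter through first and second differences on equally spaced grids. All names are illustrative.

```python
import math

def is_monotone_convex(values, increasing, tol=1e-12):
    """True if an equally spaced sequence is monotone in the given direction and convex."""
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    if increasing:
        monotone = all(d >= -tol for d in first)
    else:
        monotone = all(d <= tol for d in first)
    return monotone and all(d >= -tol for d in second)

def geometric_survival(x, p):
    """P{X > x} for x >= 0, assuming support {1, 2, ...} so that the mean is 1/(1 - p)."""
    return p ** math.floor(x)

def exponential_survival(x, lam):
    """P{X > x} for an exponential random variable with mean 1/lam."""
    return math.exp(-lam * x)

ps = [0.05 * i for i in range(1, 20)]    # grid of p in (0, 1)
lams = [0.1 * i for i in range(1, 51)]   # grid of lam in (0, 5]
xs = [0, 0.5, 1, 2, 3.7, 10]

# Example 8.C.3: the survival function is increasing and convex in p for each fixed x.
print(all(is_monotone_convex([geometric_survival(x, p) for p in ps], True) for x in xs))
# Example 8.C.4: the survival function is decreasing and convex in lam for each fixed x.
print(all(is_monotone_convex([exponential_survival(x, lam) for lam in lams], False) for x in xs))
```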


8.C.2 Closure properties

Unlike the two previous notions, stochastic convexity in the usual stochastic ordering does not have many closure properties. For example, there are no counterparts to Theorems 8.A.15 and 8.A.17 or Theorems 8.B.10 and 8.B.13 for this stochastic convexity notion. Instead, we present some specialized closure properties under random summation.

Theorem 8.C.6. Let {N(θ), θ ∈ Θ} be a family of discrete random variables on N+, let {X(n), n = 1, 2, ...} be a sequence of independent and identically distributed nonnegative random variables, and let X(0) = 0. Suppose that {N(θ), θ ∈ Θ} and {X(n), n ∈ N+} are mutually independent. Set $Y(\theta) = \sum_{n=0}^{N(\theta)} X(n)$, θ ∈ Θ. If {N(θ), θ ∈ Θ} ∈ SICX(st) [SICV(st), SDCX(st), SDCV(st)], then {Y(θ), θ ∈ Θ} ∈ SICX(st) [SICV(st), SDCX(st), SDCV(st)].

Proof. Consider the case {N(θ), θ ∈ Θ} ∈ SICX(st). The other three cases can be similarly proven. From Theorem 8.C.2 one knows that for any θi ∈ Θ, i = 1, 2, 3, 4, such that θ1 ≤ θ2 ≤ θ3 ≤ θ4, and θ1 + θ4 = θ2 + θ3, there exist four random variables $\hat{N}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{N}_i =_{st} N(\theta_i)$, i = 1, 2, 3, 4, and

$$\hat{N}_4 \ge \hat{N}_1 \text{ a.s.}, \tag{8.C.3}$$

$$\min\{\hat{N}_1, \hat{N}_4\} \ge \min\{\hat{N}_2, \hat{N}_3\} \text{ a.s.}, \tag{8.C.4}$$

$$\max\{\hat{N}_1, \hat{N}_4\} \ge \max\{\hat{N}_2, \hat{N}_3\} \text{ a.s., and hence} \tag{8.C.5}$$

$$\hat{N}_1 + \hat{N}_4 \ge \hat{N}_2 + \hat{N}_3 \text{ a.s.} \tag{8.C.6}$$

Define $\hat{Y}_i = \sum_{n=0}^{\hat{N}_i} X(n)$, i = 1, 2, 3, 4. Then, clearly, $\hat{Y}_i =_{st} Y(\theta_i)$, i = 1, 2, 3, 4. Furthermore, from (8.C.3)–(8.C.6), one sees that

$$\hat{Y}_4 \ge \hat{Y}_1 \text{ a.s.}, \tag{8.C.7}$$

$$\min\{\hat{Y}_1, \hat{Y}_4\} \ge \min\{\hat{Y}_2, \hat{Y}_3\} \text{ a.s.}, \tag{8.C.8}$$

$$\max\{\hat{Y}_1, \hat{Y}_4\} \ge \max\{\hat{Y}_2, \hat{Y}_3\} \text{ a.s., and hence} \tag{8.C.9}$$

$$\hat{Y}_1 + \hat{Y}_4 \ge \hat{Y}_2 + \hat{Y}_3 \text{ a.s.} \tag{8.C.10}$$

Theorem 8.C.6 then follows from Theorem 8.C.2.

Theorem 8.C.7. Consider {X(θ), θ ∈ Θ} and {Y(θ), θ ∈ Θ} and suppose that, for each θ, X(θ) and Y(θ) are independent. Define V(θ) = max{X(θ), Y(θ)} and W(θ) = min{X(θ), Y(θ)}.
(i) If {X(θ), θ ∈ Θ} ∈ SICX(st) [SDCX(st)] and {Y(θ), θ ∈ Θ} ∈ SICX(st) [SDCX(st)], then {W(θ), θ ∈ Θ} ∈ SICX(st) [SDCX(st)].
(ii) If {X(θ), θ ∈ Θ} ∈ SICV(st) [SDCV(st)] and {Y(θ), θ ∈ Θ} ∈ SICV(st) [SDCV(st)], then {V(θ), θ ∈ Θ} ∈ SICV(st) [SDCV(st)].

Proof. The stated results follow immediately from the observations that (i) the survival function of W(θ) at x is equal to P{X(θ) > x}P{Y(θ) > x}, (ii) the survival function of V(θ) at x is equal to 1 − (1 − P{X(θ) > x})(1 − P{Y(θ) > x}), and from Theorem 8.C.1.

Consider the imperfect repair model. A new item with an absolutely continuous survival function $\bar{F}$ undergoes an imperfect repair each time it fails before it is scrapped. With probability p the repair is unsuccessful and the item is scrapped. With probability 1 − p the repair is successful and minimal, that is, after a successful repair at time t the item is as good as a working item at age t. It is well known that if X(p) denotes the time to scrap, then the survival function of X(p) is $\bar{F}^{\,p}$. Thus, the following result is apparent.

Theorem 8.C.8. Let $\bar{F}$ be an absolutely continuous survival function such that $\bar{F}(0) = 1$. Then {X(p), p ∈ (0, 1)} ∈ SDCX(st).
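The imperfect repair model can be simulated to see the power form of the survival function of the time to scrap. The sketch below relies on the standard fact, assumed here rather than taken from the text, that under minimal repair the failure epochs form a nonhomogeneous Poisson process whose mean value function is the cumulative hazard of the baseline survival function. The baseline exp(−t²) and all function names are hypothetical choices.

```python
import math
import random

def baseline_survival(t):
    """Assumed baseline survival function exp(-t**2) (a Weibull-type choice)."""
    return math.exp(-t * t)

def hazard_inverse(s):
    """Inverse of the cumulative hazard -log(baseline_survival(t)) = t**2."""
    return math.sqrt(s)

def time_to_scrap(p, rng):
    """Simulate one time to scrap: each failure is fatal with probability p, else minimally repaired."""
    s = 0.0
    while True:
        s += rng.expovariate(1.0)     # next epoch of a unit-rate Poisson process
        if rng.random() < p:          # unsuccessful repair: the item is scrapped
            return hazard_inverse(s)  # map the epoch back to the item's own time scale

def empirical_survival(samples, t):
    """Fraction of simulated scrap times exceeding t."""
    return sum(x > t for x in samples) / len(samples)

rng = random.Random(7)
p, t = 0.5, 1.0
samples = [time_to_scrap(p, rng) for _ in range(20000)]
print(empirical_survival(samples, t), baseline_survival(t) ** p)  # the two values should be close
```

For a fixed t, the map p ↦ a^p with a in (0, 1] is decreasing and convex, which is the content of Theorem 8.C.8 via Theorem 8.C.1.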

8.D Strong Stochastic Convexity

Another notion which is sometimes useful in verifying the sample path convexity of Section 8.B is described in this section.

8.D.1 Definitions

Let {X(θ), θ ∈ Θ} be a family of random variables. The family {X(θ), θ ∈ Θ} is said to be stochastically [increasing, decreasing] and convex [concave, linear] almost everywhere if there exist $\{\hat{X}(\theta), \theta \in \Theta\}$ such that $\hat{X}(\theta) =_{st} X(\theta)$ for each θ ∈ Θ and $\hat{X}(\theta)$ is [increasing, decreasing] and convex [concave, linear] in θ. We denote this by {X(θ), θ ∈ Θ} ∈ SCX(ae) [SCV(ae), SL(ae), SICX(ae), SICV(ae), SIL(ae), SDCX(ae), SDCV(ae), SDL(ae)].

Although it appears that the definition of strong stochastic convexity/concavity is restrictive, several families of random variables do satisfy the conditions of this class of convexity/concavity. This is shown in the next theorem and in the corollaries and examples which follow it.

Theorem 8.D.1. Suppose that X(θ) = φ(θ, Z), where φ is a real-valued deterministic function, and Z is a random vector. If φ is convex [concave, linear, increasing convex, increasing concave, increasing linear, decreasing convex, decreasing concave, decreasing linear] in θ ∈ Θ, then {X(θ), θ ∈ Θ} ∈ SCX(ae) [SCV(ae), SL(ae), SICX(ae), SICV(ae), SIL(ae), SDCX(ae), SDCV(ae), SDL(ae)].

Corollary 8.D.2. Suppose that X(θ) = Z + ψ(θ), where ψ is a real-valued deterministic function, and Z is a random variable. If ψ is convex [concave, linear, increasing convex, increasing concave, increasing linear, decreasing convex, decreasing concave, decreasing linear] in θ ∈ Θ, then {X(θ), θ ∈ Θ} ∈ SCX(ae) [SCV(ae), SL(ae), SICX(ae), SICV(ae), SIL(ae), SDCX(ae), SDCV(ae), SDL(ae)].

Corollary 8.D.3. Suppose that X(θ) = Z · ψ(θ), where ψ is a real-valued deterministic function, and Z is a nonnegative random variable. If ψ is convex [concave, linear, increasing convex, increasing concave, increasing linear, decreasing convex, decreasing concave, decreasing linear] in θ ∈ Θ, then {X(θ), θ ∈ Θ} ∈ SCX(ae) [SCV(ae), SL(ae), SICX(ae), SICV(ae), SIL(ae), SDCX(ae), SDCV(ae), SDL(ae)].

Example 8.D.4. Let X(µ, σ) be a normal random variable with mean µ and standard deviation σ. Since for a unit normal random variable N(0, 1), we have $\hat{X}(\mu, \sigma) = \mu + \sigma N(0, 1) =_{st} X(\mu, \sigma)$, µ ∈ R, σ ∈ R+, we see that, for each σ > 0, {X(µ, σ), µ ∈ R} ∈ SIL(ae), and, for each µ ∈ R, {X(µ, σ), σ ∈ R+} ∈ SL(ae).

Similarly one can prove the result in the next example.

Example 8.D.5. Let Y(n), n = 1, 2, ..., be a sequence of nonnegative independent and identically distributed random variables with mean 1. For µ > 0 define $X(\mu) = \mu \sum_{k=1}^{n} Y(k)$, n ∈ N+. Then, for each n ∈ N++, one has {X(µ), µ ∈ R+} ∈ SIL(ae).

Specifically, when Y(n) in Example 8.D.5 is an exponential random variable we have the following example.

Example 8.D.6. Let X(µ, n) be an Erlang-n random variable with mean nµ and variance nµ². Then, for each n ∈ N++, one has {X(µ, n), µ ∈ R+} ∈ SIL(ae).

By taking n = 1 in Example 8.D.5 we obtain the following result.

Example 8.D.7. Let Y be a nonnegative random variable. For µ > 0 define X(µ) = µY. Then {X(µ), µ ∈ R+} ∈ SIL(ae).

The following generalization of Example 8.D.5 is easily observed.

Example 8.D.8. Let Y(n), n = 1, 2, ..., be a sequence of nonnegative independent and identically distributed random variables with mean 1, and let Z be a random variable which is independent of the Y(n)'s. For µ > 0 define $X(\mu) = Z + \mu \sum_{k=1}^{n} Y(k)$, n ∈ N++. Then, for each n ∈ N++, one has {X(µ), µ ∈ R+} ∈ SIL(ae).


Another sufficient condition (in addition to Theorem 8.D.1 and Corollaries 8.D.2 and 8.D.3) for strong convexity and concavity is described next. Let {X(θ), θ ∈ Θ} be a family of random variables, and let $F_\theta$ denote the distribution function of X(θ). If U is a uniform[0, 1] random variable, then $F_\theta^{-1}(U) =_{st} X(\theta)$. The following result follows at once from this observation.

Theorem 8.D.9. If $F_\theta^{-1}(u)$ is convex [concave, linear, increasing convex, increasing concave, increasing linear, decreasing convex, decreasing concave, decreasing linear] in θ ∈ Θ, for all u ∈ (0, 1), then {X(θ), θ ∈ Θ} ∈ SCX(ae) [SCV(ae), SL(ae), SICX(ae), SICV(ae), SIL(ae), SDCX(ae), SDCV(ae), SDL(ae)].

A sufficient condition for strong convexity and concavity, which is stated on $F_\theta$ (rather than on $F_\theta^{-1}$ as in Theorem 8.D.9), is described next. Recall the definition of supermodular and submodular functions given in Section 7.A.8.

Theorem 8.D.10. Let {X(θ), θ ∈ Θ} be a family of random variables, and suppose that all the partial second derivatives of $F_\theta(x)$ exist.
(a) If $F_\theta(x)$ is concave and strictly increasing in x, and is decreasing and concave in θ, and if $F_\theta(x)$ is submodular in (x, θ), then {X(θ), θ ∈ Θ} ∈ SICX(ae).
(b) If $F_\theta(x)$ is convex and strictly increasing in x, and is decreasing and convex in θ, and if $F_\theta(x)$ is supermodular in (x, θ), then {X(θ), θ ∈ Θ} ∈ SICV(ae).
(c) If $F_\theta(x)$ is concave and strictly increasing in x, and is increasing and concave in θ, and if $F_\theta(x)$ is supermodular in (x, θ), then {X(θ), θ ∈ Θ} ∈ SDCX(ae).
(d) If $F_\theta(x)$ is convex and strictly increasing in x, and is increasing and convex in θ, and if $F_\theta(x)$ is submodular in (x, θ), then {X(θ), θ ∈ Θ} ∈ SDCV(ae).

Proof. Only the proof of part (a) is given; the proofs of the other parts are similar. Let U be a uniform[0, 1] random variable and define $\hat{X}(\theta)$ by

$$F_\theta(\hat{X}(\theta)) = U. \tag{8.D.1}$$

Differentiating (8.D.1) for a fixed value of U, we obtain

$$\frac{\partial}{\partial x} F \cdot \frac{\partial}{\partial \theta} \hat{X} + \frac{\partial}{\partial \theta} F = 0, \tag{8.D.2}$$

and

$$\frac{\partial}{\partial x} F \cdot \frac{\partial^2}{\partial \theta^2} \hat{X} + \frac{\partial^2}{\partial x^2} F \left(\frac{\partial}{\partial \theta} \hat{X}\right)^2 + 2 \frac{\partial^2}{\partial x \partial \theta} F \cdot \frac{\partial}{\partial \theta} \hat{X} + \frac{\partial^2}{\partial \theta^2} F = 0. \tag{8.D.3}$$

The conditions stated in part (a) can be written as

$$\frac{\partial}{\partial x} F > 0, \quad \frac{\partial}{\partial \theta} F \le 0, \quad \frac{\partial^2}{\partial x^2} F \le 0, \quad \frac{\partial^2}{\partial \theta^2} F \le 0, \quad \text{and} \quad \frac{\partial^2}{\partial x \partial \theta} F \le 0. \tag{8.D.4}$$

From (8.D.2), (8.D.3), and (8.D.4) it is seen that $\frac{\partial}{\partial \theta} \hat{X} \ge 0$ and $\frac{\partial^2}{\partial \theta^2} \hat{X} \ge 0$, that is, {X(θ), θ ∈ Θ} ∈ SICX(ae).
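The quantile construction behind Theorem 8.D.9 and the implicit differentiation step (8.D.2) can be illustrated on a family whose quantile function is available in closed form. The sketch below assumes the exponential family with mean θ, for which the quantile function is linear in θ. This family is used only to illustrate Theorem 8.D.9 and the identity (8.D.2); it is not offered as an instance of Theorem 8.D.10. The function names are hypothetical.

```python
import math

def exp_cdf(x, theta):
    """Distribution function of an exponential random variable with mean theta."""
    return 1.0 - math.exp(-x / theta)

def exp_quantile(u, theta):
    """Solve F_theta(x) = u for x, as in (8.D.1)."""
    return -theta * math.log(1.0 - u)

def slope_from_8_D_2(u, theta):
    """Derivative of the quantile in theta obtained from (8.D.2): -(dF/dtheta) / (dF/dx)."""
    x = exp_quantile(u, theta)
    dF_dx = math.exp(-x / theta) / theta
    dF_dtheta = -math.exp(-x / theta) * x / theta ** 2
    return -dF_dtheta / dF_dx

def numeric_slope(u, theta, h=1e-6):
    """Central finite-difference derivative of the quantile with respect to theta."""
    return (exp_quantile(u, theta + h) - exp_quantile(u, theta - h)) / (2.0 * h)

for u in (0.1, 0.5, 0.9):
    for theta in (0.5, 1.0, 3.0):
        # The implicit-differentiation slope and the finite-difference slope should agree.
        print(u, theta, slope_from_8_D_2(u, theta), numeric_slope(u, theta))
```

For each fixed u the quantile is increasing and linear in θ, so Theorem 8.D.9 places this family in SIL(ae), in agreement with Example 8.D.7 applied to a unit-mean exponential Y.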

The following theorem is easily verified.

Theorem 8.D.11. SICX(ae) ⟹ SICX(sp) ⟹ SICX, SICV(ae) ⟹ SICV(sp) ⟹ SICV, SDCX(ae) ⟹ SDCX(sp) ⟹ SDCX, SDCV(ae) ⟹ SDCV(sp) ⟹ SDCV.

These are strict implications. It can be verified that the stochastic convexity in the usual stochastic order neither implies nor is implied by the strong stochastic convexity.

8.D.2 Closure properties

In this subsection we present some closure properties of the strong convexity notions. These results trivially follow from the closure properties of deterministic functions. Thus we will not give the proofs here.

Theorem 8.D.12. Let {X(θ), θ ∈ Θ} and {Y(θ), θ ∈ Θ} be two families of random variables such that for each θ ∈ Θ, X(θ) and Y(θ) are independent.
(a) {X(θ), θ ∈ Θ} ∈ SICX(ae) and {Y(θ), θ ∈ Θ} ∈ SICX(ae) imply that {f(X(θ), Y(θ)), θ ∈ Θ} ∈ SICX(ae) for any increasing and convex function f.
(b) {X(θ), θ ∈ Θ} ∈ SICV(ae) and {Y(θ), θ ∈ Θ} ∈ SICV(ae) imply that {f(X(θ), Y(θ)), θ ∈ Θ} ∈ SICV(ae) for any increasing and concave function f.
(c) {X(θ), θ ∈ Θ} ∈ SDCX(ae) and {Y(θ), θ ∈ Θ} ∈ SDCX(ae) imply that {f(X(θ), Y(θ)), θ ∈ Θ} ∈ SDCX(ae) for any increasing and convex function f.
(d) {X(θ), θ ∈ Θ} ∈ SDCV(ae) and {Y(θ), θ ∈ Θ} ∈ SDCV(ae) imply that {f(X(θ), Y(θ)), θ ∈ Θ} ∈ SDCV(ae) for any increasing and concave function f.

Theorem 8.D.13. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random variables, where Λ ⊂ R is a convex set. Also, let {Y(λ), λ ∈ Λ} be another family of random variables. Suppose that X(θ) and Y(λ) are independent for any choice of θ ∈ Θ and λ ∈ Λ.
(a) If {X(θ), θ ∈ Θ} ∈ SICX(ae) [SICV(ae)] and {Y(λ), λ ∈ Λ} ∈ SICX(ae) [SICV(ae)], then {Y(X(θ)), θ ∈ Θ} ∈ SICX(ae) [SICV(ae)].
(b) If {X(θ), θ ∈ Θ} ∈ SDCX(ae) [SDCV(ae)] and {Y(λ), λ ∈ Λ} ∈ SICX(ae) [SICV(ae)], then {Y(X(θ)), θ ∈ Θ} ∈ SDCX(ae) [SDCV(ae)].


8.E Stochastic Directional Convexity 8.E.1 Deﬁnitions In Sections 8.A–8.D of this chapter, the parameter space Θ, of the families of random variables {X(θ), θ ∈ Θ} that we studied, was a subset of the real line R. However, in some applications the parameter space is multidimensional, that is, Θ is a subset of Rm for some positive integer m ≥ 2. In this section we study such families of random variables or vectors. In such cases one is interested in convexity [concavity] properties with respect to the vector θ = (θ1 , θ2 , . . . , θm ). Rather than studying convexity [concavity] properties of {X(θ), θ ∈ Θ}, we will study here directional convexity [concavity] properties of such families of random variables or vectors. The reader may recall the deﬁnition of directional convexity [concavity] given in (7.A.17) of Section 7.A.8. Below Θ will always be a sublattice of Rm . Let {X(θ), θ ∈ Θ} be a family of random vectors. The family {X(θ), θ ∈ Θ} is said to be (a) stochastically increasing and directionally convex [concave] if {X(θ), θ ∈ Θ} ∈ SI and if Eφ(X(θ)) is directionally convex [concave] in θ for any increasing directionally convex [concave] function φ. We denote it by {X(θ), θ ∈ Θ} ∈ SI-DIR-CX [SI-DIR-CV]; (b) stochastically increasing and directionally linear if {X(θ), θ ∈ Θ} ∈ SI-DIR-CX ∩ SI-DIR-CV. We denote it by {X(θ), θ ∈ Θ} ∈ SI-DIR-L; (c) stochastically decreasing and directionally convex [concave] if {X(θ), θ ∈ Θ} ∈ SD and if Eφ(X(θ)) is directionally convex [concave] in θ for any increasing directionally convex [concave] function φ. We denote it by {X(θ), θ ∈ Θ} ∈ SD-DIR-CX [SD-DIR-CV]; (d) stochastically decreasing and directionally linear if {X(θ), θ ∈ Θ} ∈ SD-DIR-CX ∩ SD-DIR-CV. We denote it by {X(θ), θ ∈ Θ} ∈ SD-DIR-L. 
In particular, if X(θ) is a univariate random variable for all θ ∈ Θ, then {X(θ), θ ∈ Θ} ∈ SI-DIR-CX [SI-DIR-CV] if, and only if, {X(θ), θ ∈ Θ} ∈ SI and Eφ(X(θ)) is directionally convex [concave] in θ for any increasing convex [concave] function φ. Similarly, {X(θ), θ ∈ Θ} ∈ SD-DIR-CX [SD-DIR-CV] if, and only if, {X(θ), θ ∈ Θ} ∈ SD and Eφ(X(θ)) is directionally convex [concave] in θ for any increasing convex [concave] function φ. If both the parameter and the random variables are univariate, then the notions of SI-DIR-CX, SI-DIR-CV, SI-DIR-L, SD-DIR-CX, SD-DIR-CV, and SD-DIR-L reduce to the notions of SICX, SICV, SIL, SDCX, SDCV, and SDL, respectively.

In order to define stochastic directional convexity [concavity] in the sample path sense let {X(θ), θ ∈ Θ} be a family of random vectors as above. Let θ_i ∈ Θ, i = 1, 2, 3, 4, be any four vectors such that θ_1 ≤ [θ_2, θ_3] ≤ θ_4 and θ_1 + θ_4 = θ_2 + θ_3. If there exist four random variables $\hat{X}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{X}_i$ =st X(θ_i), i = 1, 2, 3, 4, and


(a) (i) $[\hat{X}_2, \hat{X}_3] \le \hat{X}_4$ a.s. and (ii) $\hat{X}_2 + \hat{X}_3 \le \hat{X}_1 + \hat{X}_4$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and directionally convex in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SI-DIR-CX(sp));
(b) (i) $\hat{X}_1 \le [\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \le \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and directionally concave in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SI-DIR-CV(sp));
(c) (i) $\hat{X}_1 \ge [\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \ge \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and directionally convex in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SD-DIR-CX(sp));
(d) (i) $\hat{X}_4 \le [\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \le \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and directionally concave in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SD-DIR-CV(sp));
(e) (i) $[\hat{X}_2, \hat{X}_3] \le \hat{X}_4$ a.s. and (ii) $\hat{X}_2 + \hat{X}_3 = \hat{X}_1 + \hat{X}_4$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and directionally linear in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SI-DIR-L(sp));
(f) (i) $\hat{X}_1 \ge [\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 = \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and directionally linear in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SD-DIR-L(sp)).

If both the parameter and the random variables are univariate, then the notions of SI-DIR-CX(sp), SI-DIR-CV(sp), SI-DIR-L(sp), SD-DIR-CX(sp), SD-DIR-CV(sp), and SD-DIR-L(sp) reduce to the notions of SICX(sp), SICV(sp), SIL(sp), SDCX(sp), SDCV(sp), and SDL(sp), respectively.

8.E.2 Closure properties

The following two results are extensions of Theorems 8.A.17 and 8.B.13 to the stochastic directional convexity setting. The proof of Theorem 8.E.1 is similar to the proof of Theorem 8.A.17, using Proposition 7.A.28.
The proof of Theorem 8.E.2 is similar to the proof of Theorem 8.B.13, where the minimum in (8.B.3) is performed coordinatewise. Theorem 8.E.1. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random vectors, and let {Y (λ), λ ∈ Λ} be another family of random vectors. Suppose that X(θ) and Y (λ) are independent for any choice of θ ∈ Θ and λ ∈ Λ. (a) If {X(θ), θ ∈ Θ} ∈ SI-DIR-CX [SI-DIR-CV, SI-DIR-L] and {Y (λ), λ ∈ Λ} ∈ SI-DIR-CX [SI-DIR-CV, SI-DIR-L], then {Y (X(θ)), θ ∈ Θ} ∈ SI-DIR-CX [SI-DIR-CV, SI-DIR-L]. (b) If {X(θ), θ ∈ Θ} ∈ SD-DIR-CX [SD-DIR-CV, SD-DIR-L] and {Y (λ), λ ∈ Λ} ∈ SI-DIR-CX [SI-DIR-CV, SI-DIR-L], then {Y (X(θ)), θ ∈ Θ} ∈ SD-DIR-CX [SD-DIR-CV, SD-DIR-L]. Theorem 8.E.2. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random vectors, and let {Y (λ), λ ∈ Λ} be another family of random vectors. Suppose that X(θ) and Y (λ) are independent for any choice of θ ∈ Θ and λ ∈ Λ.


(a) If {X(θ), θ ∈ Θ} ∈ SI-DIR-CX(sp) [SI-DIR-CV(sp), SI-DIR-L(sp)] and {Y(λ), λ ∈ Λ} ∈ SI-DIR-CX(sp) [SI-DIR-CV(sp), SI-DIR-L(sp)], then {Y(X(θ)), θ ∈ Θ} ∈ SI-DIR-CX(sp) [SI-DIR-CV(sp), SI-DIR-L(sp)].
(b) If {X(θ), θ ∈ Θ} ∈ SD-DIR-CX(sp) [SD-DIR-CV(sp), SD-DIR-L(sp)] and {Y(λ), λ ∈ Λ} ∈ SI-DIR-CX(sp) [SI-DIR-CV(sp), SI-DIR-L(sp)], then {Y(X(θ)), θ ∈ Θ} ∈ SD-DIR-CX(sp) [SD-DIR-CV(sp), SD-DIR-L(sp)].

From Theorem 8.E.2 it is easy to verify the following results.

Theorem 8.E.3. SI-DIR-CX(sp) =⇒ SI-DIR-CX, SI-DIR-CV(sp) =⇒ SI-DIR-CV, SD-DIR-CX(sp) =⇒ SD-DIR-CX, SD-DIR-CV(sp) =⇒ SD-DIR-CV.

The next results will be stated only for the increasing convex cases; however, they have versions that apply to the decreasing convex, the increasing concave, and the decreasing concave cases. By combining independent SI-DIR-CX [SI-DIR-CX(sp)] families of random vectors, one obtains a new SI-DIR-CX [SI-DIR-CX(sp)] family of random vectors.

Theorem 8.E.4. Let {X_i(θ_i), θ_i ∈ Θ_i} ∈ SI-DIR-CX [SI-DIR-CX(sp)], i = 1, 2, . . . , m, be mutually independent collections of random vectors. Define X(θ) = (X_1(θ_1), X_2(θ_2), . . . , X_m(θ_m)). Then {X(θ), θ ∈ $\times_{i=1}^{m}$ Θ_i} ∈ SI-DIR-CX [SI-DIR-CX(sp)].

The (sp) part of Theorem 8.E.4 can be proven by observing that, by independence, the constructions required by the definition of the SI-DIR-CX(sp) notion can be done coordinatewise. The other part of Theorem 8.E.4 can be verified by noticing that an m-variate directionally convex function is also directionally convex in any subset of the m coordinates, and again using the independence assumption.

As a special case of Theorem 8.E.4 it is seen that if the families of random variables {X_i(θ_i), θ_i ∈ Θ_i} ∈ SICX [SICX(sp)], i = 1, 2, . . . , m, then {(X_1(θ_1), X_2(θ_2), . . . , X_m(θ_m)), (θ_1, θ_2, . . . , θ_m) ∈ $\times_{i=1}^{m}$ Θ_i} ∈ SI-DIR-CX [SI-DIR-CX(sp)]. A version of Theorem 8.E.4, in which some or all of the parameters are the same, can also be stated and proven.
For example, if the families of random variables {X_i(θ), θ ∈ Θ} ∈ SICX [SICX(sp)], i = 1, 2, . . . , m, then {(X_1(θ), X_2(θ), . . . , X_m(θ)), θ ∈ Θ} ∈ SI-DIR-CX [SI-DIR-CX(sp)] (here all the parameters are the same).

Example 8.E.5. Recall from Example 8.B.4 that if Y(n), n = 1, 2, . . . , are nonnegative independent and identically distributed random variables, then $\{\sum_{k=1}^{n} Y(k),\ n \in \mathbb{N}_{++}\}$ ∈ SIL(sp). Now, let {Y_i(n), n = 1, 2, . . . }, i = 1, 2, . . . , m, be independent sequences of nonnegative independent and identically distributed random variables. Then

$$\Big\{\Big(\sum_{k=1}^{n_1} Y_1(k),\ \sum_{k=1}^{n_2} Y_2(k),\ \dots,\ \sum_{k=1}^{n_m} Y_m(k)\Big),\ (n_1, n_2, \dots, n_m) \in \mathbb{N}_{++}^{m}\Big\} \in \text{SI-DIR-CX(sp)}.$$

Similar examples can be constructed from the other examples in Section 8.B.

The following result illustrates the use of Theorems 8.E.1 and 8.E.2. For each θ ∈ Θ (where Θ is a convex subset of R or N) let {X(n, θ), n ∈ N+} be a Markov chain with state space S (S = [0, ∞) or N+). Let Y(x, θ) =st [X(n + 1, θ) | X(n, θ) = x], x ∈ S.

Theorem 8.E.6. Suppose that {Y(x, θ), (x, θ) ∈ S × Θ} ∈ SI-DIR-CX [SI-DIR-CV, SI-DIR-CX(sp), SI-DIR-CV(sp)]. If {X(0, θ), θ ∈ Θ} ∈ SICX [SICV, SICX(sp), SICV(sp)], then {X(n, θ), θ ∈ Θ} ∈ SICX [SICV, SICX(sp), SICV(sp)] for each n ∈ N+.

Proof. As an induction hypothesis assume that for some n we have {X(n, θ), θ ∈ Θ} ∈ SICX [SICV, SICX(sp), SICV(sp)].

(8.E.1)

Note that X(n + 1, θ) =st Y (X(n, θ), θ).

(8.E.2)

Now, from (8.E.1), (8.E.2), and from a straightforward extension of Theorem 8.E.2(a) (for the (sp) cases) [or of Theorem 8.E.1(a) (for the other cases)], one obtains that {X(n + 1, θ), θ ∈ Θ} ∈ SICX [SICV, SICX(sp), SICV(sp)].
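Returning to Example 8.E.5, the sample path conditions can be made concrete with an explicit coupling of partial sums. The following Python sketch is an illustration under stated assumptions: the coupling shown (reusing the block Y(n2+1), . . . , Y(n4) to build the third variable) is one possible construction and is not claimed to be the one used in Example 8.B.4. The function names and the integer-valued Y's (chosen so that equalities are exact) are hypothetical.

```python
import random


def coupled_partial_sums(n1, n2, n3, n4, ys):
    """Build (X1, X2, X3, X4) on one probability space from a single
    sequence ys of nonnegative terms, where n1 <= [n2, n3] <= n4 and
    n1 + n4 == n2 + n3. Each Xi has the law of a partial sum of n_i terms."""
    if not (n1 <= min(n2, n3) and max(n2, n3) <= n4 and n1 + n4 == n2 + n3):
        raise ValueError("indices must satisfy n1 <= [n2, n3] <= n4, n1 + n4 = n2 + n3")
    block = lambda a, b: sum(ys[a:b])      # Y(a+1) + ... + Y(b)
    x1 = block(0, n1)
    x2 = block(0, n2)
    # n1 + (n4 - n2) = n3 distinct i.i.d. terms, so X3 has the law of S(n3)
    x3 = block(0, n1) + block(n2, n4)
    x4 = block(0, n4)
    return x1, x2, x3, x4


def coupled_vectors(t1, t2, t3, t4, seqs):
    """Apply the univariate coupling coordinatewise, one independent
    sequence per coordinate; returns the four coupled vectors."""
    cols = [coupled_partial_sums(a, b, c, d, ys)
            for a, b, c, d, ys in zip(t1, t2, t3, t4, seqs)]
    return tuple(zip(*cols))


rng = random.Random(0)
seqs = [[rng.randint(0, 5) for _ in range(10)] for _ in range(3)]
x1, x2, x3, x4 = coupled_vectors((1, 2, 0), (3, 2, 4), (2, 5, 1), (4, 5, 5), seqs)
for i in range(3):
    # conditions (i) and (ii) of the SI-DIR-L(sp) definition, coordinate by coordinate
    print(x2[i] <= x4[i], x3[i] <= x4[i], x2[i] + x3[i] == x1[i] + x4[i])
```

Under this coupling condition (ii) holds with equality in every coordinate, which is stronger than the inequality required for SI-DIR-CX(sp).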

8.F Complements

Section 8.A: The notion of (regular) stochastic convexity/concavity is introduced in Shaked and Shanthikumar [508]. However, the condition {X(θ), θ ∈ Θ} ∈ SCX was encountered earlier by Schweder [499] who described it by saying that {X(θ), θ ∈ Θ} is “convexly parametrized.” The basic closure properties (Theorems 8.A.15 and 8.A.17) are established in Shaked and Shanthikumar [508]. As an example of the use of these results, we note that Theorem 1(b) of Lefèvre and Malice [336] can be obtained from a combination of Example 8.A.3 with Theorems 8.A.15 and 8.A.17. A slightly weaker version of the example regarding the forward recurrence times (Example 8.A.18) can be found in Makowski and Philips [380]. Temporal convexity of Markov processes (Theorems 8.A.19 and 8.A.20) is studied in Shaked and Shanthikumar [507, 509], Shanthikumar and Yao [534], and Li and Shaked [349]. Extensions of these notions to random vectors can be found in Chang, Chao, Pinedo, and Shanthikumar [125],


and to arbitrary random variables can be found in Meester [385] and in Meester and Shanthikumar [388]. A study of regular stochastic convexity by means of operators is developed in Adell and Perez-Palomares [5]. The results about stochastic m-convexity (Section 8.A.3) are mostly taken from Denuit, Lefèvre and Utev [155]. The stochastic m-increasing convexity of a family with a location parameter (Example 8.A.26) can be found in Denuit and Lefèvre [147]. “Derivatives” of stochastically convex and m-convex processes are introduced and studied in Adell and Lekuona [4].

Section 8.B: The notion of (sample path) stochastic convexity/concavity is introduced in Shaked and Shanthikumar [508]. A generalization of the notion of the semigroup property can be found in Shaked, Shanthikumar, and Tong [519]; Example 8.B.7 is a special case of a result due to them. The closure properties (Theorems 8.B.10 and 8.B.13) are established in Shaked and Shanthikumar [508]. The relation (8.B.2) between the spreads can be found in Goldstein and Rinott [212]. Temporal sample path convexity of Markov processes (Theorems 8.B.16 and 8.B.17) is studied in Shaked and Shanthikumar [508, 509]. Extensions of these notions to random vectors can be found in Chang, Chao, Pinedo, and Shanthikumar [125], and to arbitrary random variables can be found in Meester [385] and in Meester and Shanthikumar [388]. Theorem 8.B.18 is essentially proved in Kirmani and Gupta [299].

Section 8.C: Stochastic convexity/concavity in the usual stochastic ordering is introduced in Shaked and Shanthikumar [510]. Theorem 8.C.6 is established in Shaked and Shanthikumar [514], and Theorem 8.C.7 is established in Shaked and Shanthikumar [510]. Some variations of the stochastic convexity notions in Section 8.C, and also of the notions in Section 8.A, can be found in Atakan [24].

Section 8.D: The notion of strong stochastic convexity (in a different form) is introduced in Shanthikumar and Yao [531, 533].
The deﬁnition presented here is given in Meester and Shanthikumar [386]. Section 8.E: The notion of multivariate stochastic directional convexity is introduced in Meester and Shanthikumar [387]. Most of the results of this section are taken from that paper. The results yielding the parametric stochastic convexity and concavity of Markov processes (Theorem 8.E.6) can be found in Shaked and Shanthikumar [509]. Yao [573] introduced notions of stochastic supermodularity and submodularity that are weaker than the notions of stochastic directional convexity and concavity, respectively.

9 Positive Dependence Orders

Notions of positive dependence of two random variables X1 and X2 have been introduced in the literature in an effort to mathematically describe the property that “large (respectively, small) values of X1 tend to go together with large (respectively, small) values of X2.” Many of the notions of positive dependence are defined by means of some comparison of the joint distribution of X1 and X2 with their distribution under the theoretical assumption that X1 and X2 are independent. Often such a comparison can be extended to general pairs of bivariate distributions with given marginals. This fact led researchers to introduce various notions of positive dependence orders. These orders are designed to compare the strength of the positive dependence of the two underlying bivariate distributions. In this chapter we describe some such notions.

In many sections of this chapter we first describe a positive dependence order which compares two bivariate random vectors (or distributions). When the order can be extended to general n-dimensional (n > 2) random vectors, we will describe the extension in a later part of that section.

Most of the orders that we describe in this chapter are defined on the Fréchet class M(F1, F2) of bivariate distributions with fixed marginals F1 and F2. The upper bound of this class is the distribution defined by min{F1(x1), F2(x2)} (whose probability mass is concentrated on the set {(x1, x2) : F1(x1) = F2(x2)}). The lower bound of this class is the distribution defined by max{F1(x1) + F2(x2) − 1, 0} (whose probability mass is concentrated on the set {(x1, x2) : F1(x1) + F2(x2) = 1}).

9.A The PQD and the Supermodular Orders 9.A.1 Deﬁnition and basic properties: The bivariate case Let the random vector (X1 , X2 ) have the distribution function F , and let F1 and F2 denote, respectively, the marginal distributions of X1 and X2 .


Lehmann [343] defined (X1, X2) (or F) to be positive quadrant dependent (PQD) if

$$F(x_1, x_2) \ge F_1(x_1)F_2(x_2) \quad \text{for all } x_1 \text{ and } x_2. \tag{9.A.1}$$

Note that (9.A.1) can be rewritten as

$$F(x_1, x_2) \ge F^{I}(x_1, x_2) \quad \text{for all } x_1 \text{ and } x_2, \tag{9.A.2}$$

where $F^{I}(x_1, x_2) \equiv F_1(x_1)F_2(x_2)$ for all x1 and x2. This characterization of the PQD notion leads naturally to the definition of the PQD order that is described next.

For a random vector (X1, X2) with distribution function F, let $\bar{F}$ be the bivariate survival function of (X1, X2), that is, $\bar{F}(x_1, x_2) \equiv P\{X_1 > x_1, X_2 > x_2\}$ for all x1 and x2. Let (Y1, Y2) be another bivariate random vector with distribution function G and survival function $\bar{G}$. Suppose that F and G have the same univariate marginals; that is, suppose that both belong to M(F1, F2) for some univariate distribution functions F1 and F2. If

$$F(x_1, x_2) \le G(x_1, x_2) \quad \text{for all } x_1 \text{ and } x_2, \tag{9.A.3}$$

then we say that (X1, X2) is smaller than (Y1, Y2) in the PQD order (denoted by (X1, X2) ≤PQD (Y1, Y2)). Sometimes it will be useful to write this as F ≤PQD G. Using the assumption that F and G have the same univariate marginals, it is easy to see that (9.A.3) is equivalent to

$$\bar{F}(x_1, x_2) \le \bar{G}(x_1, x_2) \quad \text{for all } x_1 \text{ and } x_2.$$

Note that for random vectors (X1, X2) and (Y1, Y2), with distribution functions in M(F1, F2), we have (X1, X2) ≤PQD (Y1, Y2) ⇐⇒ (X1, X2) ≤uo (Y1, Y2) and (X1, X2) ≤PQD (Y1, Y2) ⇐⇒ (X1, X2) ≥lo (Y1, Y2); see (6.G.1) and (6.G.2) in Section 6.G.1. The reader should notice, however, that in (6.G.1) and (6.G.2) it is not required that (X1, X2) and (Y1, Y2) have the same marginals. Therefore, whereas the upper and lower orthant orders measure the size (or the location) of the underlying random vectors, the PQD order measures the amount of positive dependence of the underlying random vectors.

From (9.A.2) it is seen that F is PQD if, and only if, $F^{I}$ ≤PQD F. By Hoeffding’s Lemma (see Lehmann [343, page 1139]) we see that if (X1, X2) and (Y1, Y2) have distributions F and G in M(F1, F2), then

$$\mathrm{Cov}(X_1, X_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [F(x_1, x_2) - F_1(x_1)F_2(x_2)]\,dx_1\,dx_2$$

and

$$\mathrm{Cov}(Y_1, Y_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [G(x_1, x_2) - F_1(x_1)F_2(x_2)]\,dx_1\,dx_2,$$

provided the covariances are well deﬁned. It thus follows from (9.A.3) that if (X1 , X2 ) ≤PQD (Y1 , Y2 ), then Cov(X1 , X2 ) ≤ Cov(Y1 , Y2 ),

(9.A.4)

and therefore, since Var(Xi) = Var(Yi), i = 1, 2, we have that ρX1,X2 ≤ ρY1,Y2, where ρX1,X2 and ρY1,Y2 denote the correlation coefficients associated with (X1, X2) and (Y1, Y2), respectively, provided the underlying variances are well defined. Yanagimoto and Okamoto [570] have shown that some other correlation measures, such as Kendall’s τ, Spearman’s ρ, and Blomquist’s q, are preserved under the PQD order. The inequality (9.A.4), and the monotonicity of other correlation measures under the PQD order, can also be obtained as corollaries from (9.A.17) below.

Let (X1, X2) and (Y1, Y2) be random vectors with distribution functions F and G. If (X1, X2) ≤PQD (Y1, Y2), then

$$\bar{F}(x_1, x_2) \le \bar{G}(x_1, x_2) \quad \text{for all } x_1 \text{ and } x_2,$$

and

$$P\{X_1 > x_1, X_2 \le x_2\} \ge P\{Y_1 > x_1, Y_2 \le x_2\} \quad \text{for all } x_1 \text{ and } x_2.$$

Therefore

$$P\{X_2 > x_2 \mid X_1 > x_1\} \le P\{Y_2 > x_2 \mid Y_1 > x_1\} \quad \text{for all } x_1 \text{ and } x_2,$$

and

$$P\{X_2 \le x_2 \mid X_1 > x_1\} \ge P\{Y_2 \le x_2 \mid Y_1 > x_1\} \quad \text{for all } x_1 \text{ and } x_2.$$

Thus, for all x1 we have

$$\begin{aligned}
E[X_2 \mid X_1 > x_1] &= -\int_{-\infty}^{0} P\{X_2 \le x_2 \mid X_1 > x_1\}\,dx_2 + \int_{0}^{\infty} P\{X_2 > x_2 \mid X_1 > x_1\}\,dx_2 \\
&\le -\int_{-\infty}^{0} P\{Y_2 \le x_2 \mid Y_1 > x_1\}\,dx_2 + \int_{0}^{\infty} P\{Y_2 > x_2 \mid Y_1 > x_1\}\,dx_2 \\
&= E[Y_2 \mid Y_1 > x_1].
\end{aligned}$$
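Hoeffding's Lemma, used above to derive (9.A.4), is easy to check numerically for integer-valued random vectors, where the double integral reduces to a finite double sum. The following Python sketch is a hedged illustration: the pmf tables and function names are hypothetical, and the reduction to a sum over the grid relies on F − F1F2 being constant on unit squares and vanishing outside the support.

```python
def cdf2(p, a, b):
    """P{X1 <= a, X2 <= b} for a pmf table p[i][j] = P{X1 = i, X2 = j}."""
    return sum(p[i][j] for i in range(a + 1) for j in range(b + 1))


def covariance_direct(p):
    """Cov(X1, X2) = E[X1 X2] - E[X1] E[X2], computed from the pmf."""
    cells = [(i, j, p[i][j]) for i in range(len(p)) for j in range(len(p[0]))]
    e1 = sum(i * w for i, _, w in cells)
    e2 = sum(j * w for _, j, w in cells)
    e12 = sum(i * j * w for i, j, w in cells)
    return e12 - e1 * e2


def covariance_hoeffding(p):
    """Discrete form of Hoeffding's integral: the integrand
    F(x1, x2) - F1(x1) F2(x2) is piecewise constant on unit squares and
    vanishes outside [0, k1 - 1) x [0, k2 - 1)."""
    k1, k2 = len(p), len(p[0])
    total = 0.0
    for a in range(k1 - 1):
        for b in range(k2 - 1):
            f1 = cdf2(p, a, k2 - 1)    # marginal cdf of X1 at a
            f2 = cdf2(p, k1 - 1, b)    # marginal cdf of X2 at b
            total += cdf2(p, a, b) - f1 * f2
    return total


# Two hypothetical pmfs with the same Bernoulli(0.6) marginals; the joint cdf
# of q is pointwise below that of p, so q is smaller in the PQD order.
p = [[0.3, 0.1], [0.1, 0.5]]
q = [[0.2, 0.2], [0.2, 0.4]]
print(covariance_direct(p), covariance_hoeffding(p))
print(covariance_direct(q), covariance_hoeffding(q))
```

In this example the two ways of computing the covariance agree, and the PQD-smaller distribution has the smaller covariance, as (9.A.4) asserts.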


For random vectors (X1, X2) and (Y1, Y2) with distribution functions in M(F1, F2), the condition

$$E[X_2 \mid X_1 > x_1] \le E[Y_2 \mid Y_1 > x_1] \quad \text{for all } x_1 \tag{9.A.5}$$

can be used to define a positive dependence stochastic order. Such an order is discussed in Muliere and Petrone [405]. We see that if (X1, X2) ≤PQD (Y1, Y2), then (9.A.5) holds.

Let FL and FU denote the Fréchet lower and upper bounds in the class M(F1, F2). Then, for every distribution F ∈ M(F1, F2) we have

$$F_L \le_{\mathrm{PQD}} F \le_{\mathrm{PQD}} F_U. \tag{9.A.6}$$
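The sandwich (9.A.6) can be verified directly for a distribution on a finite grid by tabulating the joint distribution function and the two Fréchet bounds built from its marginals. The sketch below is illustrative; the pmf table and the helper names (`joint_cdf`, `frechet_bounds`, `pqd_leq`) are assumptions introduced here.

```python
def joint_cdf(p):
    """Table F[a][b] = P{X1 <= a, X2 <= b} from a pmf table p[i][j]."""
    k1, k2 = len(p), len(p[0])
    return [[sum(p[i][j] for i in range(a + 1) for j in range(b + 1))
             for b in range(k2)] for a in range(k1)]


def frechet_bounds(F):
    """Lower bound max{F1 + F2 - 1, 0} and upper bound min{F1, F2},
    using the marginals read off the last column and last row of F."""
    k1, k2 = len(F), len(F[0])
    f1 = [F[a][k2 - 1] for a in range(k1)]
    f2 = [F[k1 - 1][b] for b in range(k2)]
    lower = [[max(f1[a] + f2[b] - 1.0, 0.0) for b in range(k2)] for a in range(k1)]
    upper = [[min(f1[a], f2[b]) for b in range(k2)] for a in range(k1)]
    return lower, upper


def pqd_leq(F, G, tol=1e-12):
    """True if F <= G pointwise on the grid, as in (9.A.3); both tables are
    assumed to have the same marginals."""
    return all(F[a][b] <= G[a][b] + tol
               for a in range(len(F)) for b in range(len(F[0])))


# Hypothetical pmf on {0, 1, 2}^2.
p = [[0.2, 0.1, 0.0], [0.1, 0.2, 0.1], [0.0, 0.1, 0.2]]
F = joint_cdf(p)
F_L, F_U = frechet_bounds(F)
print(pqd_leq(F_L, F), pqd_leq(F, F_U))
```

Because all three tables share the marginals of F, the pointwise comparison of distribution functions is exactly the PQD comparison.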

9.A.2 Closure properties A powerful closure property of the PQD order is given in the next theorem. Theorem 9.A.1. Suppose that the four random vectors (X1 , X2 ), (Y1 , Y2 ), (U1 , U2 ), and (V1 , V2 ) satisfy (X1 , X2 ) ≤PQD (Y1 , Y2 )

and

(U1 , U2 ) ≤PQD (V1 , V2 ),

(9.A.7)

and suppose that (X1 , X2 ) and (U1 , U2 ) are independent, and also that (Y1 , Y2 ) and (V1 , V2 ) are independent. Then (φ(X1 , U1 ), ψ(X2 , U2 )) ≤PQD (φ(Y1 , V1 ), ψ(Y2 , V2 )), for all increasing functions φ and ψ.

(9.A.8)

Proof. From the monotonicity of φ and ψ it follows that the set {(u1, u2) : φ(x1, u1) ≤ a1, ψ(x2, u2) ≤ a2} is a lower quadrant for all x1, x2, a1, and a2. Therefore, for all a1 and a2 we have

$$\begin{aligned}
P\{\phi(X_1, U_1) \le a_1, \psi(X_2, U_2) \le a_2\} &= \int P\{\phi(X_1, u_1) \le a_1, \psi(X_2, u_2) \le a_2\}\,dH(u_1, u_2) \\
&\le \int P\{\phi(Y_1, u_1) \le a_1, \psi(Y_2, u_2) \le a_2\}\,dH(u_1, u_2) \\
&= P\{\phi(Y_1, U_1) \le a_1, \psi(Y_2, U_2) \le a_2\},
\end{aligned}$$

where H is the distribution function of (U1, U2). Thus, (φ(X1, U1), ψ(X2, U2)) ≤PQD (φ(Y1, U1), ψ(Y2, U2)), for all increasing functions φ and ψ.

(9.A.9)

In a similar manner one can show that (φ(Y1 , U1 ), ψ(Y2 , U2 )) ≤PQD (φ(Y1 , V1 ), ψ(Y2 , V2 )), for all increasing functions φ and ψ. From (9.A.9) and (9.A.10) one obtains (9.A.8).

(9.A.10)


In particular, if (9.A.7) holds, then (X1 + U1 , X2 + U2 ) ≤PQD (Y1 + V1 , Y2 + V2 ),

(9.A.11)

that is, the PQD order is closed under convolutions. From Theorem 9.A.1 it also follows that (X1, X2) ≤PQD (Y1, Y2) =⇒ (φ(X1), ψ(X2)) ≤PQD (φ(Y1), ψ(Y2)), for all increasing functions φ and ψ. The closure properties that are stated in the next theorem are easy to verify.

Theorem 9.A.2. (a) Let $\{(X_1^{(j)}, X_2^{(j)}), j = 1, 2, \dots\}$ and $\{(Y_1^{(j)}, Y_2^{(j)}), j = 1, 2, \dots\}$ be two sequences of random vectors such that $(X_1^{(j)}, X_2^{(j)}) \to_{st} (X_1, X_2)$ and $(Y_1^{(j)}, Y_2^{(j)}) \to_{st} (Y_1, Y_2)$ as j → ∞, where →st denotes convergence in distribution. If $(X_1^{(j)}, X_2^{(j)}) \le_{\mathrm{PQD}} (Y_1^{(j)}, Y_2^{(j)})$, j = 1, 2, . . ., then (X1, X2) ≤PQD (Y1, Y2).
(b) Let (X1, X2), (Y1, Y2), and Θ be random vectors such that [(X1, X2) | Θ = θ] ≤PQD [(Y1, Y2) | Θ = θ] for all θ in the support of Θ. Then (X1, X2) ≤PQD (Y1, Y2). That is, the PQD order is closed under mixtures.

Fang, Hu, and Joe [191] applied the idea of the PQD order to stationary Markov chains and showed that, if the process is stochastically increasing, then dependence (in the sense of the PQD order) is decreasing with the lag, namely, if {X1, X2, . . . } is a Markov chain and Xi is distributed according to F and if (X1, Xn) is distributed according to $F_{1n}$, n = 2, 3, . . ., then

$$F_{12} \ge_{\mathrm{PQD}} F_{13} \ge_{\mathrm{PQD}} \cdots \ge_{\mathrm{PQD}} F_{1n} \ge_{\mathrm{PQD}} \cdots \ge_{\mathrm{PQD}} F^{(2)},$$

(9.A.12)

where $F^{(2)}(x, y) = F(x)F(y)$. See also Remark 9.A.29 below. Another example is the following.

Example 9.A.3. Let φ and ψ be two Laplace transforms of positive random variables. Then F and G, defined by

$$F(x_1, x_2) = \phi(\phi^{-1}(x_1) + \phi^{-1}(x_2)), \quad (x_1, x_2) \in [0, 1]^2,$$

and

$$G(y_1, y_2) = \psi(\psi^{-1}(y_1) + \psi^{-1}(y_2)), \quad (y_1, y_2) \in [0, 1]^2,$$

are bivariate distribution functions with uniform[0, 1] marginals (such F and G are called Archimedean copulas). Let (X1, X2) and (Y1, Y2) be distributed according to F and G, respectively. Then (X1, X2) ≤PQD (Y1, Y2) if, and only if, $\psi^{-1}\phi$ is superadditive (that is, $\psi^{-1}\phi(x + y) \ge \psi^{-1}\phi(x) + \psi^{-1}\phi(y)$ for all x, y ≥ 0). Also, if $\phi^{-1}\psi$ has a completely monotone derivative, then (X1, X2) ≤PQD (Y1, Y2).
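Example 9.A.3 can be illustrated numerically. The Python sketch below takes, as an assumption of this illustration, the Laplace transforms φ(s) = (1 + s)^(−1/a) of gamma random variables, which generate what is commonly called the Clayton family; the parameter values a = 1 and a = 3, the grids, and the function names are hypothetical. For these choices ψ⁻¹φ(s) = (1 + s)³ − 1, which is convex and vanishes at 0, hence superadditive.

```python
def clayton(a):
    """Pair (phi, phi_inv) with phi(s) = (1 + s)^(-1/a), a > 0, the Laplace
    transform of a gamma random variable with shape 1/a and unit scale."""
    return (lambda s: (1.0 + s) ** (-1.0 / a)), (lambda u: u ** (-a) - 1.0)


def archimedean_copula(phi, phi_inv):
    """F(x1, x2) = phi(phi_inv(x1) + phi_inv(x2)), evaluated on (0, 1]^2."""
    return lambda x1, x2: phi(phi_inv(x1) + phi_inv(x2))


phi, phi_inv = clayton(1.0)
psi, psi_inv = clayton(3.0)
F = archimedean_copula(phi, phi_inv)
G = archimedean_copula(psi, psi_inv)
omega = lambda s: psi_inv(phi(s))          # the composition psi^{-1} o phi

# Pointwise comparison of the two distribution functions on a grid in (0, 1]^2.
grid = [k / 20 for k in range(1, 21)]
pqd_holds = all(F(u, v) <= G(u, v) + 1e-12 for u in grid for v in grid)

# Superadditivity of psi^{-1} o phi on a grid in [0, 5]^2.
pts = [k / 2 for k in range(0, 11)]
superadditive = all(omega(s + t) >= omega(s) + omega(t) - 1e-9
                    for s in pts for t in pts)

print(pqd_holds, superadditive)
```

The grid starts at 0.05 rather than 0 because the inverse transform is unbounded at 0. A grid check of this kind supports the ordering but does not prove it.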


9.A.3 The multivariate case

Let X = (X1, X2, . . . , Xn) be a random vector with distribution function F and survival function $\bar{F}$. Let Y = (Y1, Y2, . . . , Yn) be another random vector with distribution function G and survival function $\bar{G}$. If

$$F(x) \le G(x) \quad \text{for all } x, \tag{9.A.13}$$

and

$$\bar{F}(x) \le \bar{G}(x) \quad \text{for all } x, \tag{9.A.14}$$

then we say that X is smaller than Y in the PQD order (denoted by X ≤PQD Y). From (9.A.13) and (9.A.14) it follows that only random vectors with the same univariate marginals can be compared in the PQD order. It also follows that

$$X \le_{\mathrm{PQD}} Y \iff X \le_{\mathrm{uo}} Y \text{ and } X \ge_{\mathrm{lo}} Y. \tag{9.A.15}$$

An extension of Theorem 9.A.1 to the general multivariate case is the following. The proof of Theorem 9.A.4 is a straightforward extension of the proof of Theorem 9.A.1, and therefore it is omitted.

Theorem 9.A.4. Suppose that the four random vectors X = (X1, X2, . . . , Xn), Y = (Y1, Y2, . . . , Yn), U = (U1, U2, . . . , Un), and V = (V1, V2, . . . , Vn) satisfy

$$X \le_{\mathrm{PQD}} Y \quad \text{and} \quad U \le_{\mathrm{PQD}} V, \tag{9.A.16}$$

and suppose that X and U are independent, and also that Y and V are independent. Then (φ1(X1, U1), φ2(X2, U2), . . . , φn(Xn, Un)) ≤PQD (φ1(Y1, V1), φ2(Y2, V2), . . . , φn(Yn, Vn)), for all increasing functions φi, i = 1, 2, . . . , n.

In particular, if (9.A.16) holds, then X + U ≤PQD Y + V, that is, the PQD order is closed under convolutions. Also, from Theorem 9.A.4 it follows that (X1, X2, . . . , Xn) ≤PQD (Y1, Y2, . . . , Yn) =⇒ (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤PQD (φ1(Y1), φ2(Y2), . . . , φn(Yn)), for all increasing functions φi, i = 1, 2, . . . , n. The closure properties that are stated in the next theorem are easy to verify.


Theorem 9.A.5. (a) Let X_1, X_2, . . . , X_m be a set of independent random vectors where the dimension of X_i is k_i, i = 1, 2, . . . , m. Let Y_1, Y_2, . . . , Y_m be another set of independent random vectors where the dimension of Y_i is k_i, i = 1, 2, . . . , m. If X_i ≤PQD Y_i for i = 1, 2, . . . , m, then (X_1, X_2, . . . , X_m) ≤PQD (Y_1, Y_2, . . . , Y_m). That is, the PQD order is closed under conjunctions.
(b) Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two n-dimensional random vectors. If X ≤PQD Y, then X_I ≤PQD Y_I for each I ⊆ {1, 2, . . . , n}. That is, the PQD order is closed under marginalization.
(c) Let {X_j, j = 1, 2, . . . } and {Y_j, j = 1, 2, . . . } be two sequences of random vectors such that X_j →st X and Y_j →st Y as j → ∞, where →st denotes convergence in distribution. If X_j ≤PQD Y_j, j = 1, 2, . . ., then X ≤PQD Y.
(d) Let X, Y, and Θ be random vectors such that [X | Θ = θ] ≤PQD [Y | Θ = θ] for all θ in the support of Θ. Then X ≤PQD Y. That is, the PQD order is closed under mixtures.

From Theorem 9.A.5(b) and (9.A.4) it follows that if (X1, X2, . . . , Xn) ≤PQD (Y1, Y2, . . . , Yn), then, for all i1 ≠ i2, we have that Cov(Xi1, Xi2) ≤ Cov(Yi1, Yi2). Since the univariate marginals of X and Y are equal, it also follows that ρXi1,Xi2 ≤ ρYi1,Yi2, where ρXi1,Xi2 and ρYi1,Yi2 denote the correlation coefficients associated with (Xi1, Xi2) and (Yi1, Yi2), respectively, provided the underlying variances are well defined. Joe [260] has shown that some multivariate versions of the correlation measures Kendall’s τ, Spearman’s ρ, and Blomquist’s q are monotone with respect to the PQD order.

Another preservation property of the PQD order is described in the next theorem. In the following theorem we define $\sum_{j=1}^{0} x_j \equiv 0$ for any sequence {xj, j = 1, 2, . . . }. Similar results are Theorems 6.G.7 and 9.A.14.

Theorem 9.A.6. Let X_j = (Xj,1, Xj,2, . . . , Xj,m), j = 1, 2, . . ., be a sequence of nonnegative random vectors, and let M = (M1, M2, . . . , Mm) and N = (N1, N2, . . . , Nm) be two vectors of nonnegative integer-valued random variables. Assume that both M and N are independent of the X_j’s. If M ≤PQD N, then

$$\Big(\sum_{j=1}^{M_1} X_{j,1},\ \sum_{j=1}^{M_2} X_{j,2},\ \dots,\ \sum_{j=1}^{M_m} X_{j,m}\Big) \le_{\mathrm{PQD}} \Big(\sum_{j=1}^{N_1} X_{j,1},\ \sum_{j=1}^{N_2} X_{j,2},\ \dots,\ \sum_{j=1}^{N_m} X_{j,m}\Big).$$

Consider now, as in Section 6.B.4, n families of univariate distribution functions $\{G_\theta^{(i)}, \theta \in \mathcal{X}_i\}$ where $\mathcal{X}_i$ is a subset of the real line R, i = 1, 2, . . . , n. Let Xi(θ) denote a random variable with distribution function $G_\theta^{(i)}$, i = 1, 2, . . . , n. Below we give a result which provides comparisons of two random vectors, with distribution functions of the form (6.B.18), in the PQD order. The following result is obtained easily from Theorem 6.G.8; see Theorems 6.B.17, 7.A.37, and 9.A.15 for related results.

Theorem 9.A.7. Let $\{G_\theta^{(i)}, \theta \in \mathcal{X}_i\}$, i = 1, 2, . . . , n, be n families of univariate distribution functions as above. Let Θ_1 and Θ_2 be two random vectors with supports in $\prod_{i=1}^{n} \mathcal{X}_i$ and distribution functions F1 and F2, respectively. Let Y_1 and Y_2 be two random vectors with distribution functions H1 and H2 given by

$$H_j(y_1, y_2, \dots, y_n) = \int_{\mathcal{X}_1}\int_{\mathcal{X}_2} \cdots \int_{\mathcal{X}_n} \prod_{i=1}^{n} G_{\theta_i}^{(i)}(y_i)\,dF_j(\theta_1, \theta_2, \dots, \theta_n), \quad (y_1, y_2, \dots, y_n) \in \mathbb{R}^n,\ j = 1, 2.$$

If

$$X_i(\theta) \le_{st} X_i(\theta') \quad \text{whenever } \theta \le \theta', \quad i = 1, 2, \dots, n,$$

and if Θ_1 ≤PQD Θ_2, then Y_1 ≤PQD Y_2.

Example 9.A.8. Let X be an n-dimensional random vector with a density function f of the form

$$f(x) = |\Sigma|^{-1/2} g(x\Sigma^{-1}x'),$$

where Σ = (σij) is a positive definite n × n matrix, and g satisfies $\int_0^{\infty} r^{n-1} g(r^2)\,dr < \infty$. Such density functions are called elliptically contoured. Let Y be an n-dimensional random vector with a density function h of the form $h(x) = |\Lambda|^{-1/2} g(x\Lambda^{-1}x')$, where Λ = (λij) is a positive definite n × n matrix. If σii = λii, i = 1, 2, . . . , n, and σij ≤ λij, 1 ≤ i < j ≤ n, then X ≤PQD Y. In particular, multivariate normal random vectors with mean 0 and the same variances are ordered in the PQD order if their covariances are pointwise ordered.


9.A.4 The supermodular order The supermodular order, which is described in this subsection, is a suﬃcient condition that implies the PQD order, but it is also of independent interest. Recall from Section 7.A.8 that a function φ : Rn → R is said to be supermodular if for any x, y ∈ Rn it satisﬁes φ(x) + φ(y) ≤ φ(x ∧ y) + φ(x ∨ y), where the operators ∧ and ∨ denote coordinatewise minimum and maximum, respectively. Note that if φ : Rn → R is supermodular, then the function ψ, deﬁned by ψ(x1 , x2 , . . . , xn ) = φ(g1 (x1 ), g2 (x2 ), . . . , gn (xn )), is also supermodular, whenever gi : R → R, i = 1, 2, . . . , n, are all increasing or are all decreasing. Let X and Y be two n-dimensional random vectors such that E[φ(X)] ≤ E[φ(Y )]

for all supermodular functions φ : Rn → R,

provided the expectations exist. Then X is said to be smaller than Y in the supermodular order (denoted by X ≤sm Y ). Since the functions φx = I{y:y>x} and ψx = I{y:y≤x} are supermodular for each ﬁxed x, it is immediate that X ≤sm Y =⇒ X ≤PQD Y .

(9.A.17)
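The supermodularity of the two indicator functions used to obtain (9.A.17) can be confirmed by exhaustive enumeration on a small grid. The Python sketch below is illustrative; the grid {0, 1, 2}³ and the helper names are assumptions made for this check.

```python
from itertools import product


def is_supermodular(phi, lo, hi, dim):
    """Check phi(x) + phi(y) <= phi(x ∧ y) + phi(x ∨ y) for all x, y in
    {lo, ..., hi}^dim, with ∧ and ∨ the coordinatewise min and max."""
    points = list(product(range(lo, hi + 1), repeat=dim))
    for x in points:
        for y in points:
            meet = tuple(map(min, x, y))
            join = tuple(map(max, x, y))
            if phi(x) + phi(y) > phi(meet) + phi(join) + 1e-12:
                return False
    return True


def upper_indicator(x0):
    """y -> 1 if y > x0 in every coordinate, else 0 (indicator of an upper orthant)."""
    return lambda y: 1.0 if all(b > a for a, b in zip(x0, y)) else 0.0


def lower_indicator(x0):
    """y -> 1 if y <= x0 in every coordinate, else 0 (indicator of a lower orthant)."""
    return lambda y: 1.0 if all(b <= a for a, b in zip(x0, y)) else 0.0


corners = list(product(range(3), repeat=3))
print(all(is_supermodular(upper_indicator(c), 0, 2, 3) for c in corners))
print(all(is_supermodular(lower_indicator(c), 0, 2, 3) for c in corners))
# A function that is not supermodular, for contrast.
print(is_supermodular(lambda t: -t[0] * t[1], 0, 2, 2))
```

Taking expectations of these indicators under X and Y gives the survival and distribution functions, which is how the implication (9.A.17) arises.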

These implications also follow from Theorem 6.G.2 and (9.A.15) since every n-dimensional (n ≥ 2) distribution function, and any n-dimensional survival function, are supermodular functions. In fact, when n = 2 we have that (X1 , X2 ) ≤sm (Y1 , Y2 ) ⇐⇒ (X1 , X2 ) ≤PQD (Y1 , Y2 );

(9.A.18)

see, for example, Tchen [547]. From (9.A.17) it is seen that if X ≤sm Y, then X and Y must have the same univariate marginals. Some closure properties of the supermodular order are described in the next theorem.

Theorem 9.A.9. (a) Let (X1, X2, . . . , Xn) and (Y1, Y2, . . . , Yn) be two n-dimensional random vectors. If (X1, X2, . . . , Xn) ≤sm (Y1, Y2, . . . , Yn), then (g1(X1), g2(X2), . . . , gn(Xn)) ≤sm (g1(Y1), g2(Y2), . . . , gn(Yn)) whenever gi : R → R, i = 1, 2, . . . , n, are all increasing or are all decreasing.
(b) Let X_1, X_2, . . . , X_m be a set of independent random vectors where the dimension of X_i is k_i, i = 1, 2, . . . , m. Let Y_1, Y_2, . . . , Y_m be another set of independent random vectors where the dimension of Y_i is k_i, i = 1, 2, . . . , m. If X_i ≤sm Y_i for i = 1, 2, . . . , m, then (X_1, X_2, . . . , X_m) ≤sm (Y_1, Y_2, . . . , Y_m). That is, the supermodular order is closed under conjunctions.


(c) Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two n-dimensional random vectors. If X ≤sm Y, then X_I ≤sm Y_I for each I ⊆ {1, 2, . . . , n}. That is, the supermodular order is closed under marginalization.
(d) Let X, Y, and Θ be random vectors such that [X | Θ = θ] ≤sm [Y | Θ = θ] for all θ in the support of Θ. Then X ≤sm Y. That is, the supermodular order is closed under mixtures.
(e) Let {X_j, j = 1, 2, . . . } and {Y_j, j = 1, 2, . . . } be two sequences of random vectors such that X_j →st X and Y_j →st Y as j → ∞, where →st denotes convergence in distribution. If X_j ≤sm Y_j, j = 1, 2, . . ., then X ≤sm Y.

Proof. Part (a) follows from the fact that a composition of a supermodular function with coordinatewise functions, that are all increasing or are all decreasing, is a supermodular function. In order to see part (b) let X_1 and X_2 be two independent random vectors, and let Y_1 and Y_2 be two other independent random vectors. Suppose that X_1 ≤sm Y_1 and that X_2 ≤sm Y_2. Then, for any supermodular function φ (of the proper dimension) we have that

$$E\phi(X_1, X_2) = E\{E[\phi(X_1, X_2) \mid X_2]\} \le E\{E[\phi(Y_1, X_2) \mid X_2]\} = E\phi(Y_1, X_2) \le E\phi(Y_1, Y_2),$$

where the first inequality follows from the fact that φ(x_1, x_2) is supermodular in x_1 when x_2 is fixed, and the second inequality follows in a similar manner. Part (b) of Theorem 9.A.9 follows from the above by induction. Parts (c) and (d) are easy to prove. A proof of part (e) can be found in Müller and Scarsini [416].

From parts (a) and (d) of Theorem 9.A.9 we obtain the following corollary. Corollary 9.A.10. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two random vectors such that X ≤sm Y , and let Z be an m-dimensional random vector which is independent of X and Y . Then (h1 (X1 , Z), h2 (X2 , Z), . . . , hn (Xn , Z)) ≤sm (h1 (Y1 , Z), h2 (Y2 , Z), . . . , hn (Yn , Z)), whenever hi (x, z), i = 1, 2, . . . , n, are all increasing or are all decreasing in x for every z. Example 9.A.11. Let X and Y be two n-dimensional random vectors such that X ≤sm Y , and let Z be an n-dimensional random vector which is independent of X and Y . Then from Corollary 9.A.10 it follows that X ∧ Z ≤sm Y ∧ Z, and that X + Z ≤sm Y + Z.


By applying Corollary 9.A.10 twice (letting Z there be an n-dimensional random vector, and letting each hi depend only on its first argument and on the ith component of the second argument, i = 1, 2, . . . , n), we get the following result.

Theorem 9.A.12. Let X, Y, Z, and W be n-dimensional random vectors such that X and Z are independent and Y and W are independent. Let ci : [0, ∞)^2 → [0, ∞) be a continuous increasing function, i = 1, 2, . . . , n. If X ≤sm Y and Z ≤sm W, then

$$(c_1(X_1, Z_1), c_2(X_2, Z_2), \ldots, c_n(X_n, Z_n)) \le_{\mathrm{sm}} (c_1(Y_1, W_1), c_2(Y_2, W_2), \ldots, c_n(Y_n, W_n)).$$

Example 9.A.13. Let {X k = (Xk,1, . . . , Xk,n), k ≥ 0} and {Y k = (Yk,1, . . . , Yk,n), k ≥ 0} be two Markov chains as described in Example 6.G.6. If the gi's are increasing in their m + 1 arguments, if U l = {U lk, k ≥ 0}, l = 1, . . . , m, are independent, if V l = {V lk, k ≥ 0}, l = 1, . . . , m, are independent, and if U lk ≤sm V lk, l = 1, . . . , m, k ≥ 0, then, for each k ≥ 0 we have (X 0, . . . , X k) ≤sm (Y 0, . . . , Y k). The proof uses Theorem 9.A.12, Corollary 9.A.10, and Theorem 9.A.9(b). We omit the details.

Another preservation property of the supermodular order is described in the next theorem. In the following theorem we define $\sum_{j=1}^{0} x_j \equiv 0$ for any sequence {xj, j = 1, 2, . . . }. Similar results are Theorems 6.G.7 and 9.A.6.

Theorem 9.A.14. Let $\mathbf{X}_j = (X_{j,1}, X_{j,2}, \ldots, X_{j,m})$, j = 1, 2, . . ., be a sequence of nonnegative random vectors, and let M = (M1, M2, . . . , Mm) and N = (N1, N2, . . . , Nm) be two vectors of nonnegative integer-valued random variables. Assume that both M and N are independent of the $\mathbf{X}_j$'s. If M ≤sm N, then

$$\Bigl(\sum_{j=1}^{M_1} X_{j,1}, \sum_{j=1}^{M_2} X_{j,2}, \ldots, \sum_{j=1}^{M_m} X_{j,m}\Bigr) \le_{\mathrm{sm}} \Bigl(\sum_{j=1}^{N_1} X_{j,1}, \sum_{j=1}^{N_2} X_{j,2}, \ldots, \sum_{j=1}^{N_m} X_{j,m}\Bigr).$$

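Proof. Let φ be a supermodular function. Conditioning on the possible realizations of $(\mathbf{X}_1, \mathbf{X}_2, \ldots)$ we can write

$$E\,\phi\Bigl(\sum_{j=1}^{M_1} X_{j,1}, \sum_{j=1}^{M_2} X_{j,2}, \ldots, \sum_{j=1}^{M_m} X_{j,m}\Bigr) = E\Bigl\{E\Bigl[\phi\Bigl(\sum_{j=1}^{M_1} X_{j,1}, \sum_{j=1}^{M_2} X_{j,2}, \ldots, \sum_{j=1}^{M_m} X_{j,m}\Bigr) \Bigm| (\mathbf{X}_1, \mathbf{X}_2, \ldots)\Bigr]\Bigr\}.$$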


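Now, it is easy to see that for any realization $(\mathbf{x}_1, \mathbf{x}_2, \ldots)$ of $(\mathbf{X}_1, \mathbf{X}_2, \ldots)$, the function ψ, defined by

$$\psi(n_1, n_2, \ldots, n_m) = \phi\Bigl(\sum_{j=1}^{n_1} x_{j,1}, \sum_{j=1}^{n_2} x_{j,2}, \ldots, \sum_{j=1}^{n_m} x_{j,m}\Bigr),$$

is supermodular. Therefore, since M ≤sm N, we have that

$$E\Bigl[\phi\Bigl(\sum_{j=1}^{M_1} X_{j,1}, \ldots, \sum_{j=1}^{M_m} X_{j,m}\Bigr) \Bigm| (\mathbf{X}_1, \mathbf{X}_2, \ldots) = (\mathbf{x}_1, \mathbf{x}_2, \ldots)\Bigr] \le E\Bigl[\phi\Bigl(\sum_{j=1}^{N_1} X_{j,1}, \ldots, \sum_{j=1}^{N_m} X_{j,m}\Bigr) \Bigm| (\mathbf{X}_1, \mathbf{X}_2, \ldots) = (\mathbf{x}_1, \mathbf{x}_2, \ldots)\Bigr],$$

and thus

$$E\,\phi\Bigl(\sum_{j=1}^{M_1} X_{j,1}, \ldots, \sum_{j=1}^{M_m} X_{j,m}\Bigr) \le E\Bigl\{E\Bigl[\phi\Bigl(\sum_{j=1}^{N_1} X_{j,1}, \ldots, \sum_{j=1}^{N_m} X_{j,m}\Bigr) \Bigm| (\mathbf{X}_1, \mathbf{X}_2, \ldots)\Bigr]\Bigr\} = E\,\phi\Bigl(\sum_{j=1}^{N_1} X_{j,1}, \ldots, \sum_{j=1}^{N_m} X_{j,m}\Bigr).$$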
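The key step of the proof above, the supermodularity of ψ, can be checked numerically on a small lattice. The following is a minimal sketch under stated assumptions: the realization `x`, the choice φ(s1, s2) = s1 s2, and the helper name `is_supermodular` are all hypothetical and serve only as an illustration; the check is a brute-force test of the lattice inequality f(x ∨ y) + f(x ∧ y) ≥ f(x) + f(y), not a proof.

```python
from itertools import product

def is_supermodular(f, grid):
    """Brute-force check of f(x v y) + f(x ^ y) >= f(x) + f(y) on a finite lattice.

    grid is a list of iterables, one per coordinate.
    """
    points = list(product(*grid))
    for x in points:
        for y in points:
            join = tuple(max(a, b) for a, b in zip(x, y))
            meet = tuple(min(a, b) for a, b in zip(x, y))
            if f(join) + f(meet) < f(x) + f(y) - 1e-12:
                return False
    return True

# Hypothetical realization x_{j,i} >= 0 for j = 1..4 and i = 1, 2.
x = [(1.0, 0.5), (2.0, 0.0), (0.5, 3.0), (1.5, 1.0)]

def phi(s):
    # s1 * s2 is a supermodular function on R^2.
    return s[0] * s[1]

def psi(n):
    # psi(n1, n2) = phi(sum_{j <= n1} x_{j,1}, sum_{j <= n2} x_{j,2});
    # an empty sum (n_i = 0) equals 0, as in the convention of the text.
    s1 = sum(x[j][0] for j in range(n[0]))
    s2 = sum(x[j][1] for j in range(n[1]))
    return phi((s1, s2))

counts = [range(0, 5), range(0, 5)]
print(is_supermodular(psi, counts))
print(is_supermodular(lambda n: -psi(n), counts))
```

The nonnegativity of the x's matters here: it makes the partial sums increasing in the counts, which is what carries the supermodularity of φ over to ψ.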

Consider now, as in Section 6.B.4, n families of univariate distribution functions $\{G^{(i)}_\theta, \theta \in \mathcal{X}_i\}$ where $\mathcal{X}_i$ is a subset of the real line R, i = 1, 2, . . . , n. Let $X_i(\theta)$ denote a random variable with distribution function $G^{(i)}_\theta$, i = 1, 2, . . . , n. Below we give a result which provides comparisons of two random vectors, with distribution functions of the form (6.B.18), in the supermodular order. The following result is a generalization of Theorem 9.A.9(d); see Theorems 6.B.17, 6.G.8, 7.A.37, and 9.A.7 for related results.

Theorem 9.A.15. Let $\{G^{(i)}_\theta, \theta \in \mathcal{X}_i\}$, i = 1, 2, . . . , n, be n families of univariate distribution functions as above. Let Θ 1 and Θ 2 be two random vectors with supports in $\prod_{i=1}^{n} \mathcal{X}_i$ and distribution functions F1 and F2, respectively. Let Y 1 and Y 2 be two random vectors with distribution functions H1 and H2 given by

$$H_j(y_1, y_2, \ldots, y_n) = \int_{\mathcal{X}_1} \int_{\mathcal{X}_2} \cdots \int_{\mathcal{X}_n} \prod_{i=1}^{n} G^{(i)}_{\theta_i}(y_i)\, dF_j(\theta_1, \theta_2, \ldots, \theta_n), \quad (y_1, y_2, \ldots, y_n) \in \mathbb{R}^n,\ j = 1, 2.$$

If

$$X_i(\theta) \le_{\mathrm{st}} X_i(\theta') \quad \text{whenever } \theta \le \theta', \quad i = 1, 2, \ldots, n,$$

and if Θ 1 ≤sm Θ 2,


then Y 1 ≤sm Y 2.

Before stating the next result, it is worthwhile to mention that from Proposition 7.A.27 it follows that X ≤sm Y =⇒ X ≤dir-cx Y. The following result may be compared with Theorems 6.G.10 and 7.A.30.

Theorem 9.A.16. Let X and Y be two random vectors. If X ≤sm Y, then φ(X) ≤icx φ(Y) for any increasing supermodular function φ : R^n → R.

A consequence of Theorem 9.A.16, that is useful in queuing theory, is described in the following example.

Example 9.A.17. Let $\{A_i\}_{i=0}^{\infty}$ be a sequence of random variables, and let c be some constant. Define inductively

$$Q_0 = q; \qquad Q_{i+1} = [Q_i + A_i - c]_+, \quad i = 1, 2, \ldots,$$

for some fixed q. Similarly, let $\{A'_i\}_{i=0}^{\infty}$ be another sequence of random variables, and define inductively

$$Q'_0 = q; \qquad Q'_{i+1} = [Q'_i + A'_i - c]_+, \quad i = 1, 2, \ldots.$$

If $(A_0, A_1, \ldots, A_i) \le_{\mathrm{sm}} (A'_0, A'_1, \ldots, A'_i)$ for all i = 1, 2, . . ., then $Q_i \le_{\mathrm{icx}} Q'_i$ for all i = 1, 2, . . .. In fact, the above result holds even if $Q_0$ and $Q'_0$ are random variables satisfying $Q_0 \le_{\mathrm{icx}} Q'_0$.

As a particular case of Theorem 9.A.16 we have that

$$(X_1, X_2, \ldots, X_n) \le_{\mathrm{sm}} (Y_1, Y_2, \ldots, Y_n) \Longrightarrow \sum_{i=1}^{n} X_i \le_{\mathrm{cx}} \sum_{i=1}^{n} Y_i \tag{9.A.19}$$

(since X ≤sm Y =⇒ EX = EY ). A related result is the following. It shows that the larger in the supermodular order a random vector is, the “closer” are its coordinates in the proper stochastic sense. Theorem 9.A.18. Let (X1 , X2 ) and (Y1 , Y2 ) be two random vectors. If (X1 , X2 ) ≤sm (Y1 , Y2 ) (that is, (X1 , X2 ) ≤PQD (Y1 , Y2 ); see (9.A.18)), then Y1 − Y2 ≤cx X1 − X2 . Proof. Let φ be a univariate convex function. Then the function ψ, deﬁned by ψ(x1 , x2 ) = −φ(x1 − x2 ), is easily seen to be supermodular. Thus Eφ(Y1 − Y2 ) ≤ Eφ(X1 − X2 ). This proves the inequality.
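Both (9.A.19) and Theorem 9.A.18 can be verified exactly on a toy case. The sketch below is an illustration under assumptions that are not in the text: the two bivariate Bernoulli laws, the helper names, and the use of the standard characterization of ≤cx for finitely supported laws (equal means together with ordered stop-loss transforms E(X − t)+, which only needs checking at the atoms because both transforms are piecewise linear with kinks at the atoms).

```python
from fractions import Fraction as Fr

def stop_loss(dist, t):
    """E[(X - t)^+] for a finite distribution given as {value: probability}."""
    return sum(p * max(v - t, 0) for v, p in dist.items())

def mean(dist):
    return sum(p * v for v, p in dist.items())

def leq_cx(d1, d2):
    """True if d1 <=_cx d2: equal means and ordered stop-loss transforms at every atom."""
    if mean(d1) != mean(d2):
        return False
    atoms = set(d1) | set(d2)
    return all(stop_loss(d1, t) <= stop_loss(d2, t) for t in atoms)

def law(joint, g):
    """Distribution of g(x1, x2) under a joint pmf {(x1, x2): probability}."""
    out = {}
    for (x1, x2), p in joint.items():
        out[g(x1, x2)] = out.get(g(x1, x2), 0) + p
    return out

q = Fr(1, 4)
# Hypothetical pair with the same Bernoulli(1/2) marginals; indep <=_PQD comon.
indep = {(0, 0): q, (0, 1): q, (1, 0): q, (1, 1): q}   # role of (X1, X2)
comon = {(0, 0): Fr(1, 2), (1, 1): Fr(1, 2)}            # role of (Y1, Y2)

add = lambda a, b: a + b
sub = lambda a, b: a - b
print(leq_cx(law(indep, add), law(comon, add)))  # sums, as in (9.A.19)
print(leq_cx(law(comon, sub), law(indep, sub)))  # differences, as in Theorem 9.A.18
```

Exact rational arithmetic is used so that the comparisons are not affected by rounding.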


A consequence of Theorem 9.A.14 and (9.A.19) is described in the following example.

Example 9.A.19. Let X1, X2, . . . and Y1, Y2, . . . be two sequences of random variables. Let N1 and N2 be two independent and identically distributed positive integer-valued random variables independent of the Xi's and of the Yi's. Then

$$\sum_{i=1}^{N_1} X_i + \sum_{i=1}^{N_2} Y_i \le_{\mathrm{cx}} \sum_{i=1}^{N_1} (X_i + Y_i).$$

In order to see it, note that (N1, N2) ≤sm (N1, N1), and use Theorem 9.A.14 and (9.A.19). This proof was communicated to us by Taizhong Hu.

An interesting example in which the supermodular order arises naturally is the following. See also Examples 6.B.29, 6.G.11, 7.A.13, 7.A.26, 7.A.39, and 7.B.5.

Example 9.A.20. Let X be a multivariate normal random vector with mean vector 0 and variance-covariance matrix Σ, and let Y be a multivariate normal random vector with mean vector 0 and variance-covariance matrix Σ + D, where D is a matrix with zero diagonal elements such that Σ + D is nonnegative definite. Then X ≤sm Y if, and only if, all the entries of D are nonnegative.

The supermodular order can be used to bound some quite general random vectors. This is shown in the next three theorems. The proofs of these theorems are omitted. Theorem 9.A.21 can be considered to be an extension of the right-hand side of (9.A.6).

Theorem 9.A.21. Let X = (X1, X2, . . . , Xn) be a random vector and let $F_{X_i}$ be the marginal distribution of Xi, i = 1, 2, . . . , n. Then, for a uniform[0, 1] random variable U we have that

$$X \le_{\mathrm{sm}} (F_{X_1}^{-1}(U), F_{X_2}^{-1}(U), \ldots, F_{X_n}^{-1}(U)),$$

and therefore

$$X \le_{\mathrm{PQD}} (F_{X_1}^{-1}(U), F_{X_2}^{-1}(U), \ldots, F_{X_n}^{-1}(U)).$$

In particular, if the Xi ’s in Theorem 9.A.21, marginally, have the same (univariate) distribution function, then X ≤sm (X1 , X1 , . . . , X1 ), and therefore X ≤PQD (X1 , X1 , . . . , X1 ). Combining (9.A.19) and Theorem 9.A.21 it is seen, using the notation of Theorem 9.A.21, that

$$X_1 + X_2 + \cdots + X_n \le_{\mathrm{cx}} F_{X_1}^{-1}(U) + F_{X_2}^{-1}(U) + \cdots + F_{X_n}^{-1}(U). \tag{9.A.20}$$
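For marginals with finite support, the comonotone vector appearing in Theorem 9.A.21 and on the right-hand side of (9.A.20) can be built explicitly. The following is a minimal sketch, assuming the usual left-continuous inverse F^{-1}(u) = min{x : F(x) ≥ u}; the function names and the two marginals are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as Fr

def quantile(marginal, u):
    """Left-continuous inverse F^{-1}(u) = min{x : F(x) >= u} of a finite pmf {x: p}."""
    acc = 0
    for x in sorted(marginal):
        acc += marginal[x]
        if acc >= u:
            return x
    return max(marginal)

def comonotone_pmf(marginals):
    """Joint pmf of (F_1^{-1}(U), ..., F_n^{-1}(U)) for U uniform on [0, 1]."""
    # The quantile functions can only jump at cumulative probabilities,
    # so they are all constant on each interval (lo, hi] between such cuts.
    cuts = {Fr(0), Fr(1)}
    for m in marginals:
        acc = Fr(0)
        for x in sorted(m):
            acc += m[x]
            cuts.add(acc)
    cuts = sorted(cuts)
    joint = {}
    for lo, hi in zip(cuts, cuts[1:]):
        point = tuple(quantile(m, hi) for m in marginals)
        joint[point] = joint.get(point, 0) + (hi - lo)
    return joint

# Hypothetical marginals (exact rational probabilities).
m1 = {0: Fr(1, 2), 1: Fr(1, 2)}
m2 = {0: Fr(1, 4), 1: Fr(1, 4), 2: Fr(1, 2)}
joint = comonotone_pmf([m1, m2])

# Law of the comonotone sum, i.e. the right-hand side of (9.A.20) for these marginals.
comonotone_sum = {}
for point, p in joint.items():
    comonotone_sum[sum(point)] = comonotone_sum.get(sum(point), 0) + p
print(joint)
print(comonotone_sum)
```

Any other random vector with the marginals `m1` and `m2` then has, by (9.A.20), a sum that is smaller in the convex order than the law stored in `comonotone_sum`.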

A more detailed result is described next. Let X1, X2, . . . , Xn, Z, and U be random variables, where U has the uniform[0, 1] distribution. Let $F^{-1}_{X_i|Z}(U)$ denote the random variable $g_i(U, Z)$, where $g_i$ is defined by $g_i(u, z) = F^{-1}_{X_i|Z=z}(u)$, i = 1, 2, . . . , n.

Proposition 9.A.22. Let X = (X1, X2, . . . , Xn) be a random vector, and let $F_{X_i}$ be the marginal distribution of Xi, i = 1, 2, . . . , n. Let Z and U be two other random variables, such that U has a uniform[0, 1] distribution, and is independent of Z. Then

$$X_1 + X_2 + \cdots + X_n \le_{\mathrm{cx}} F^{-1}_{X_1|Z}(U) + F^{-1}_{X_2|Z}(U) + \cdots + F^{-1}_{X_n|Z}(U) \le_{\mathrm{cx}} F^{-1}_{X_1}(U) + F^{-1}_{X_2}(U) + \cdots + F^{-1}_{X_n}(U). \tag{9.A.21}$$

Proof. From (9.A.20) it is seen that for any convex function φ we have (below $F_Z$ denotes the distribution function of Z)

$$\begin{aligned}
E[\phi(X_1 + \cdots + X_n)] &= \int_{-\infty}^{\infty} E[\phi(X_1 + \cdots + X_n) \mid Z = z]\, dF_Z(z) \\
&\le \int_{-\infty}^{\infty} E[\phi(F^{-1}_{X_1|Z=z}(U) + \cdots + F^{-1}_{X_n|Z=z}(U)) \mid Z = z]\, dF_Z(z) \\
&= E[\phi(F^{-1}_{X_1|Z}(U) + \cdots + F^{-1}_{X_n|Z}(U))],
\end{aligned}$$

and the first inequality in (9.A.21) follows.

Next, note that the random vector $(F^{-1}_{X_1|Z}(U), F^{-1}_{X_2|Z}(U), \ldots, F^{-1}_{X_n|Z}(U))$ has the same marginals as (X1, X2, . . . , Xn) because

$$\begin{aligned}
P(X_i \le x) &= \int_{-\infty}^{\infty} P(X_i \le x \mid Z = z)\, dF_Z(z) \\
&= \int_{-\infty}^{\infty} P(F^{-1}_{X_i|Z=z}(U) \le x)\, dF_Z(z) \\
&= P(F^{-1}_{X_i|Z}(U) \le x), \quad -\infty \le x \le \infty,\ i = 1, 2, \ldots, n,
\end{aligned}$$

and the second inequality in (9.A.21) therefore follows from (9.A.20).

The next result has been motivated by the desire to generalize and unify Theorems 3.A.34 and 4.A.17. Recall the definition of negative association in (3.A.54). If the inequality (3.A.54) is reversed, that is, if the random variables X1, X2, . . . , Xn satisfy

$$\mathrm{Cov}(h_1(X_{i_1}, X_{i_2}, \ldots, X_{i_k}), h_2(X_{j_1}, X_{j_2}, \ldots, X_{j_{n-k}})) \ge 0 \tag{9.A.22}$$

for all choices of disjoint subsets {i1, i2, . . . , ik} and {j1, j2, . . . , jn−k} of {1, 2, . . . , n}, and for all increasing functions h1 and h2 for which the above


covariance is defined, then X1, X2, . . . , Xn are said to be weakly positively associated.

Theorem 9.A.23. Let X = (X1, X2, . . . , Xn) be a random vector, and let Y = (Y1, Y2, . . . , Yn) be a vector of independent random variables such that, marginally, Xi =st Yi, i = 1, 2, . . . , n.
(a) If X1, X2, . . . , Xn are weakly positively associated, then X ≥sm Y.
(b) If X1, X2, . . . , Xn are negatively associated, then X ≤sm Y.

A result that is stronger than Theorem 9.A.23 is given in Section 9.E below; see details in Remark 9.E.9. Combining Theorem 9.A.23 with Theorem 9.A.16 (and using the fact that positive association implies weak positive association) one obtains Theorems 3.A.34 and 4.A.17 (for the latter, note that the function $\phi(x) = \max_{1 \le k \le n} \sum_{i=1}^{k} x_i$ is increasing and supermodular).

Theorem 9.A.24. Let X = (X1, X2, . . . , Xn) be a vector of nonnegative random variables, and let Fi denote the marginal distribution of Xi, i = 1, 2, . . . , n. Suppose that

$$\sum_{i=1}^{n} \bar{F}_i(0) \le 1.$$

Then there exists a unique random vector Y = (Y1, Y2, . . . , Yn) with marginal distributions Fi, i = 1, 2, . . . , n, such that

$$P\{Y_i > 0, Y_j > 0\} = 0 \quad \text{for all } i \ne j, \tag{9.A.23}$$

and this Y satisfies Y ≤sm X.

The following result strengthens Theorem 7.A.38; the terminology that is used there is also used in the theorem below.

Theorem 9.A.25. Let the random vectors X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) have the respective copulas CX and CY. Let U X and U Y be distributed according to CX and CY. If Xi ≤cx Yi, i = 1, 2, . . . , n, if U X ≤sm U Y, and if U Y is CI, then X ≤dir-cx Y.

Example 9.A.26. Let Z1, Z2, . . . , Zn be a collection of independent and identically distributed random variables, let U1, U2, . . . , Un be another collection of independent and identically distributed random variables, and let V be still another random variable that is independent of the Ui's. Consider the random vectors Y and X defined as

$$(Y_1, Y_2, \ldots, Y_n) = (g_1(Z_1), g_2(Z_2), \ldots, g_n(Z_n)),$$
$$(X_1, X_2, \ldots, X_n) = (\tilde{g}_1(U_1, V), \tilde{g}_2(U_2, V), \ldots, \tilde{g}_n(U_n, V)),$$

where gi : R → R and g̃i : R^2 → R are measurable functions that satisfy

$$g_i(Z_i) =_{\mathrm{st}} \tilde{g}_i(U_i, V), \quad i = 1, 2, \ldots, n.$$

If g̃i is increasing in its second variable, i = 1, 2, . . . , n, then it is known that for fixed values u1, u2, . . . , un of U1, U2, . . . , Un we have that g̃1(u1, V), g̃2(u2, V), . . . , g̃n(un, V) are weakly positively associated. Thus, for a supermodular function φ : R^n → R we have (here V1, V2, . . . , Vn are independent copies of V)

$$\begin{aligned}
E\phi(X_1, X_2, \ldots, X_n) &= E\bigl\{E\bigl[\phi(\tilde{g}_1(U_1, V), \tilde{g}_2(U_2, V), \ldots, \tilde{g}_n(U_n, V)) \bigm| U_1, U_2, \ldots, U_n\bigr]\bigr\} \\
&\ge E\bigl\{E\bigl[\phi(\tilde{g}_1(U_1, V_1), \tilde{g}_2(U_2, V_2), \ldots, \tilde{g}_n(U_n, V_n)) \bigm| U_1, U_2, \ldots, U_n\bigr]\bigr\} \\
&= E\phi(Y_1, Y_2, \ldots, Y_n),
\end{aligned}$$

where the inequality follows from Theorem 9.A.23. Thus Y ≤sm X.

Example 9.A.27. Let Ω = {a1, a2, . . . , aN} be a finite population. Let X1, X2, . . . , Xn be a sample without replacement of size n ≤ N from Ω; that is,

$$P\{(X_1, X_2, \ldots, X_n) = (x_1, x_2, \ldots, x_n)\} = \frac{1}{N(N-1)\cdots(N-n+1)}, \quad (x_1, x_2, \ldots, x_n) \in \Omega^n,$$

provided all the xi's comprise different elements of Ω. Let Y1, Y2, . . . , Yn be a sample with replacement of size n from Ω; that is,

$$P\{(Y_1, Y_2, \ldots, Y_n) = (x_1, x_2, \ldots, x_n)\} = \frac{1}{N^n}, \quad (x_1, x_2, \ldots, x_n) \in \Omega^n.$$

Then (X1 , X2 , . . . , Xn ) ≤sm (Y1 , Y2 , . . . , Yn ). Example 9.A.28. Let φ and ψ be two Laplace transforms of positive random variables. Then F and G, deﬁned by F (x1 , x2 , . . . , xn ) = φ(φ−1 (x1 ) + φ−1 (x2 ) + · · · + φ−1 (xn )), (x1 , x2 , . . . , xn ) ∈ [0, 1]n , and G(y1 , y2 , . . . , yn ) = ψ(ψ −1 (y1 ) + ψ −1 (y2 ) + · · · + ψ −1 (yn )), (y1 , y2 , . . . , yn ) ∈ [0, 1]n , are multivariate distribution functions with uniform[0, 1] marginals (see Example 9.A.3). Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be distributed according to F and G, respectively. If φ−1 ψ has a completely monotone derivative, then X ≤sm Y . Hu, Xie, and Ruan [241] described various sets of conditions under which two multivariate Bernoulli random vectors are ordered with respect to the supermodular order.
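The comparison of Example 9.A.27 between sampling without and with replacement can be checked by exact enumeration on a small population. This is only a sketch: the population `omega`, the sample size, and the three test functions are hypothetical choices, and a finite list of test functions illustrates the order ≤sm but does not establish it. The three functions used are supermodular on the relevant domain (the product on the nonnegative orthant, the minimum, and a convex function of the sum).

```python
from fractions import Fraction as Fr
from itertools import permutations, product

def expect(phi, outcomes):
    """Expectation of phi under the uniform law on a list of equally likely outcomes."""
    outcomes = list(outcomes)
    return sum(Fr(phi(x)) for x in outcomes) / len(outcomes)

omega = [1, 2, 4, 7]   # hypothetical population of distinct positive values
n = 3

without_repl = list(permutations(omega, n))   # role of (X1, ..., Xn)
with_repl = list(product(omega, repeat=n))    # role of (Y1, ..., Yn)

supermodular_tests = {
    "product": lambda x: x[0] * x[1] * x[2],
    "minimum": min,
    "squared sum": lambda x: sum(x) ** 2,
}

for name, phi in supermodular_tests.items():
    lhs = expect(phi, without_repl)
    rhs = expect(phi, with_repl)
    print(name, lhs, rhs, lhs <= rhs)
```

Each printed line compares Eφ(X) with Eφ(Y) for one test function; the example asserts that the first never exceeds the second when φ is supermodular.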


Remark 9.A.29. Hu and Pan [239] elegantly extended (9.A.12) to the supermodular order. They also identiﬁed conditions under which any n corresponding values of two stationary Markov chains are comparable in the order ≤sm . See also Miyoshi and Rolski [398].

9.B The Orthant Ratio Orders

Some multivariate stochastic orders, that compare the "location" or "magnitude" of two random vectors, may be thought of as stochastic orders of positive dependence if the compared random vectors have the same univariate marginal distributions. For example, in the bivariate case, when this is the situation, the orthant orders ≤uo and ≤lo (see Section 6.G.1) become the order ≤PQD, or, equivalently (see (9.A.18)), the order ≤sm. On the other hand, some multivariate location orders do not give anything meaningful once the marginals are held fixed. For instance, the usual multivariate stochastic order ≤st can order two random vectors, with marginals that are stochastically equal, only if they have the same distributions (see Theorem 6.B.19). In this section we study, among other things, some stochastic orders of positive dependence that arise when the underlying random vectors are ordered with respect to some multivariate hazard rate stochastic orders that were discussed in Section 6.D, and have the same univariate marginal distributions.

9.B.1 The (weak) orthant ratio orders

Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two random vectors with respective distribution functions F and G, and with survival functions $\bar{F}$ and $\bar{G}$. We suppose that F and G belong to the same Fréchet class; that is, have the same univariate marginals. We say that X is smaller than Y in the lower orthant decreasing ratio order (denoted by X ≤lodr Y or F ≤lodr G) if

$$F(y)G(x) \ge F(x)G(y) \quad \text{whenever } x \le y. \tag{9.B.1}$$

This is equivalent to

$$\frac{G(x)}{F(x)} \ \text{is decreasing in } x \in \{x : G(x) > 0\}, \tag{9.B.2}$$

where in (9.B.2) we use the convention a/0 ≡ ∞ whenever a > 0. Note that (9.B.2) can be written equivalently as

$$\frac{F(x - u)}{F(x)} \le \frac{G(x - u)}{G(x)}, \quad u \ge 0,\ x \in \{x : F(x) > 0\} \cap \{x : G(x) > 0\}, \tag{9.B.3}$$


and it is also equivalent to

$$[X - x \mid X \le x] \ge_{\mathrm{lo}} [Y - x \mid Y \le x], \quad x \in \{x : F(x) > 0\} \cap \{x : G(x) > 0\}. \tag{9.B.4}$$

Note that from (9.B.2) it follows that {x : F(x) > 0} ⊆ {x : G(x) > 0}. Thus, in (9.B.3) and (9.B.4) we can formally replace the expression {x : F(x) > 0} ∩ {x : G(x) > 0} by the simpler expression {x : F(x) > 0}.

We say that X is smaller than Y in the upper orthant increasing ratio order (denoted by X ≤uoir Y or F ≤uoir G) if

$$\bar{F}(y)\bar{G}(x) \le \bar{F}(x)\bar{G}(y) \quad \text{whenever } x \le y. \tag{9.B.5}$$

This is equivalent to

$$\frac{\bar{G}(x)}{\bar{F}(x)} \ \text{is increasing in } x \in \{x : \bar{G}(x) > 0\},$$

where here, again, we use the convention a/0 ≡ ∞ whenever a > 0. Note that the above can be written equivalently as

$$\frac{\bar{F}(x + u)}{\bar{F}(x)} \le \frac{\bar{G}(x + u)}{\bar{G}(x)}, \quad u \ge 0,\ x \in \{x : \bar{F}(x) > 0\} \cap \{x : \bar{G}(x) > 0\}, \tag{9.B.6}$$

and it is also equivalent to

$$[X - x \mid X > x] \le_{\mathrm{uo}} [Y - x \mid Y > x], \quad x \in \{x : \bar{F}(x) > 0\} \cap \{x : \bar{G}(x) > 0\}. \tag{9.B.7}$$

Formally the expression $\{x : \bar{F}(x) > 0\} \cap \{x : \bar{G}(x) > 0\}$ in (9.B.6) and (9.B.7) can be replaced by the simpler expression $\{x : \bar{F}(x) > 0\}$. We note that if X and Y have the same marginals, then X ≤uoir Y if, and only if, X ≤whr Y; see (6.D.2).

The two orders ≤lodr and ≤uoir are closely related, as is indicated in the next result.

Theorem 9.B.1. Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two random vectors in the same Fréchet class.
(a) If X ≤lodr Y, then (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤uoir (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for any decreasing functions φ1, φ2, . . . , φn. Conversely, if (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤uoir (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for some strictly decreasing functions φ1, φ2, . . . , φn, then X ≤lodr Y.
(b) If X ≤uoir Y, then (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤lodr (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for any decreasing functions φ1, φ2, . . . , φn. Conversely, if (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤lodr (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for some strictly decreasing functions φ1, φ2, . . . , φn, then X ≤uoir Y.

The next result is similar to Theorem 9.B.1, but it involves increasing, rather than decreasing, functions. It shows that the orders ≤lodr and ≤uoir are closed under componentwise increasing transformations.


Theorem 9.B.2. Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two random vectors in the same Fréchet class.
(a) If X ≤lodr Y, then (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤lodr (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for any increasing functions φ1, φ2, . . . , φn. Conversely, if (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤lodr (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for some strictly increasing functions φ1, φ2, . . . , φn, then X ≤lodr Y.
(b) If X ≤uoir Y, then (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤uoir (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for any increasing functions φ1, φ2, . . . , φn. Conversely, if (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤uoir (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for some strictly increasing functions φ1, φ2, . . . , φn, then X ≤uoir Y.

Since the order ≤uoir is equivalent to the order ≤whr when the compared random vectors have the same marginals, it follows from Theorem 6.D.4 that the order ≤uoir is closed under conjunctions, marginalization, and convergence in distribution. Using Theorem 9.B.1 it is seen that also the order ≤lodr is closed under these operations.

If X ≤lodr Y, then from (9.B.4) it follows that [X | X ≤ x] ≥lo [Y | Y ≤ x] for all relevant x. Letting x → ∞ it is seen that (9.A.13) holds (with F and G being the distribution functions of X and Y, respectively). Similarly, if X ≤uoir Y, then (9.A.14) holds. Thus we have that

X ≤lodr Y and X ≤uoir Y =⇒ X ≤PQD Y.

Example 9.B.3. Recall from page 387 the definition of the Fréchet class M(F1, F2) and the Fréchet lower bound in that class which we denote here by $F^-$. Suppose that (X1, X2) has a distribution function F in M(F1, F2). Then $F^- \le_{\mathrm{lodr}} F$ and $F^- \le_{\mathrm{uoir}} F$.

Example 9.B.4. Let X and Y be two n-dimensional random vectors with Marshall-Olkin exponential distributions F and G with the survival functions given, for x ≥ 0, by

$$\bar{F}(x) = \exp\Bigl\{-\sum_{i=1}^{n} \lambda_i x_i - \sum_{1 \le i_1 < i_2 \le n} \lambda_{i_1 i_2}(x_{i_1} \vee x_{i_2}) - \cdots - \lambda_{12\cdots n}(x_1 \vee x_2 \vee \cdots \vee x_n)\Bigr\},$$

and

$$\bar{G}(x) = \exp\Bigl\{-\sum_{i=1}^{n} \theta_i x_i - \sum_{1 \le i_1 < i_2 \le n} \theta_{i_1 i_2}(x_{i_1} \vee x_{i_2}) - \cdots - \theta_{12\cdots n}(x_1 \vee x_2 \vee \cdots \vee x_n)\Bigr\},$$

where the λ's and the θ's are positive constants. Denote $\nu_A = \lambda_A - \theta_A$, A ⊆ {1, 2, . . . , n}. Then X ≤uoir Y if, and only if,

$$\begin{aligned}
&\nu_i \ge 0, && i \in \{1, 2, \ldots, n\}, \\
&\nu_{i_1} + \nu_{i_1 i_2} \ge 0, && \{i_1, i_2\} \subseteq \{1, 2, \ldots, n\}, \\
&\nu_{i_1} + \nu_{i_1 i_2} + \nu_{i_1 i_3} + \nu_{i_1 i_2 i_3} \ge 0, && \{i_1, i_2, i_3\} \subseteq \{1, 2, \ldots, n\}, \\
&\qquad \vdots
\end{aligned}$$

and

$$\sum_{A \ni i,\ A \subseteq \{1, 2, \ldots, n\}} \nu_A = 0.$$
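Condition (9.B.1) lends itself to a direct grid check. The sketch below revisits Example 9.B.3 with uniform[0, 1] marginals, taking F to be the independence copula; this choice of F, the grid, and the helper name `leq_lodr` are assumptions made for illustration only. A check on finitely many points can refute the order but cannot by itself establish it.

```python
from fractions import Fraction as Fr
from itertools import product

def leq_lodr(F, G, grid):
    """Check (9.B.1), F(y) G(x) >= F(x) G(y) whenever x <= y, on a finite bivariate grid."""
    pts = list(product(grid, repeat=2))
    for x, y in product(pts, repeat=2):
        if x[0] <= y[0] and x[1] <= y[1]:
            if F(*y) * G(*x) < F(*x) * G(*y):
                return False
    return True

def lower_bound(u, v):
    # Frechet lower bound for uniform[0, 1] marginals, restricted to [0, 1]^2.
    return max(u + v - 1, 0)

def independent(u, v):
    # Independence copula; it lies in the same Frechet class.
    return u * v

grid = [Fr(k, 10) for k in range(11)]
print(leq_lodr(lower_bound, independent, grid))   # lower bound vs. independence
print(leq_lodr(independent, lower_bound, grid))   # reversed roles
```

The first call has the Fréchet lower bound in the role of the smaller distribution, as in Example 9.B.3; the second call swaps the roles and fails, for instance at x = (1/2, 1/2) and y = (1, 1).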

9.B.2 The strong orthant ratio orders

Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two random vectors with respective distribution functions F and G, and with survival functions $\bar{F}$ and $\bar{G}$. As in Section 9.B.1, we suppose that F and G belong to the same Fréchet class; that is, have the same univariate marginals. We say that X is smaller than Y in the strong lower orthant decreasing ratio order (denoted by X ≤slodr Y or F ≤slodr G) if

$$F(x)G(y) \le F(x \vee y)G(y \wedge x), \quad x, y \in \mathbb{R}^n. \tag{9.B.8}$$

We say that X is smaller than Y in the strong upper orthant increasing ratio order (denoted by X ≤suoir Y or F ≤suoir G) if

$$\bar{F}(x)\bar{G}(y) \le \bar{F}(x \wedge y)\bar{G}(y \vee x), \quad x, y \in \mathbb{R}^n. \tag{9.B.9}$$

We note that if X and Y have the same marginals, then X ≤suoir Y if, and only if, X ≤hr Y; see (6.D.1).

By choosing x ≤ y in (9.B.8) we get (9.B.1), and by choosing x ≥ y in (9.B.9) we get (9.B.5), that is,

$$X \le_{\mathrm{slodr}} Y \Longrightarrow X \le_{\mathrm{lodr}} Y \quad \text{and} \quad X \le_{\mathrm{suoir}} Y \Longrightarrow X \le_{\mathrm{uoir}} Y. \tag{9.B.10}$$

Thus the orders ≤slodr and ≤suoir are often useful as a tool to identify random vectors that are ordered with respect to the orders ≤lodr and ≤uoir . The two orders ≤slodr and ≤suoir are closely related, and are preserved under componentwise increasing transformations, as is indicated in the next analog of Theorems 9.B.1 and 9.B.2. Theorem 9.B.5. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two random vectors in the same Fr´echet class. (a) If X ≤slodr Y , then (φ1 (X1 ), φ2 (X2 ), . . . , φn (Xn )) ≤suoir (φ1 (Y1 ), φ2 (Y2 ), . . . , φn (Yn )) for any decreasing functions φ1 , φ2 , . . . , φn . On the other hand, if (φ1 (X1 ), φ2 (X2 ), . . . , φn (Xn )) ≤suoir (φ1 (Y1 ), φ2 (Y2 ), . . . , φn (Yn )) for some strictly decreasing functions φ1 , φ2 , . . . , φn , then X ≤slodr Y .


(b) If X ≤suoir Y, then (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤slodr (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for any decreasing functions φ1, φ2, . . . , φn. On the other hand, if (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤slodr (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for some strictly decreasing functions φ1, φ2, . . . , φn, then X ≤suoir Y.
(c) If X ≤slodr Y, then (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤slodr (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for any increasing functions φ1, φ2, . . . , φn. On the other hand, if (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤slodr (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for some strictly increasing functions φ1, φ2, . . . , φn, then X ≤slodr Y.
(d) If X ≤suoir Y, then (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤suoir (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for any increasing functions φ1, φ2, . . . , φn. On the other hand, if (φ1(X1), φ2(X2), . . . , φn(Xn)) ≤suoir (φ1(Y1), φ2(Y2), . . . , φn(Yn)) for some strictly increasing functions φ1, φ2, . . . , φn, then X ≤suoir Y.

Since the order ≤suoir is equivalent to the order ≤hr when the compared random vectors have the same marginals, it follows from Theorem 6.D.4 that the order ≤suoir is closed under conjunctions, marginalization, and convergence in distribution. Using Theorem 9.B.5 it is seen that also the order ≤slodr is closed under these operations.

The converses of the implications in (9.B.10) are not true in general. However, under an additional assumption they are valid; these are given in the following theorem.

Theorem 9.B.6. Let X and Y be two random vectors in the same Fréchet class with respective distribution functions F and G, and respective survival functions $\bar{F}$ and $\bar{G}$.
(a) If F and/or G are/is MTP2, then X ≤lodr Y =⇒ X ≤slodr Y.
(b) If $\bar{F}$ and/or $\bar{G}$ are/is MTP2, then X ≤uoir Y =⇒ X ≤suoir Y.

Part (b) of the above theorem is similar to Theorem 6.D.1.
However, it turns out that since the compared random vectors are in the same Fréchet class, it is not needed, in Theorem 9.B.6(b), that they have a common support which is a lattice.

9.C The LTD, RTI, and PRD Orders

For any random vector (X1, X2) with distribution function F ∈ M(F1, F2) (see page 387 for the definition of M(F1, F2)) we define the conditional distribution function $F^L_{x_1}$ by

$$F^L_{x_1}(x_2) = P\{X_2 \le x_2 \mid X_1 \le x_1\} \tag{9.C.1}$$

for all x1 for which this conditional distribution is well defined. Barlow and Proschan [36] defined F (or X1 and X2) to be left tail decreasing (LTD) if

$$F^L_{x_1}(x_2) \ge F^L_{x_1'}(x_2) \quad \text{for all } x_1 \le x_1' \text{ and } x_2,$$

or, equivalently, if

$$(F^L_{x_1})^{-1}(u) \le (F^L_{x_1'})^{-1}(u) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.2}$$

Note that when $(F^L_{x_1})^{-1}(u)$ is continuous in u for all x1, then (9.C.2) can be equivalently written as

$$F^L_{x_1'}\bigl((F^L_{x_1})^{-1}(u)\bigr) \le u \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.3}$$

This notion leads to the following definition. Let (X1, X2) be a bivariate random vector with distribution function F ∈ M(F1, F2), and let (Y1, Y2) be another bivariate random vector with distribution function G ∈ M(F1, F2). Suppose that for any $x_1 \le x_1'$ we have

$$(F^L_{x_1})^{-1}(u) \le (F^L_{x_1'})^{-1}(v) \Longrightarrow (G^L_{x_1})^{-1}(u) \le (G^L_{x_1'})^{-1}(v) \quad \text{for all } u, v \in [0, 1]. \tag{9.C.4}$$

Then we say that (X1, X2) is smaller than (Y1, Y2) in the LTD order (denoted by (X1, X2) ≤LTD (Y1, Y2) or F ≤LTD G). Note that (9.C.4) can be equivalently written as

$$G^L_{x_1'}\bigl((G^L_{x_1})^{-1}(u)\bigr) \le F^L_{x_1'}\bigl((F^L_{x_1})^{-1}(u)\bigr) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.5}$$

It can be shown that if $F^L_{x_1}(x_2)$ and $G^L_{x_1}(x_2)$ are continuous in x2 for all x1, then (X1, X2) ≤LTD (Y1, Y2) if, and only if, for any $x_1 \le x_1'$,

$$F^L_{x_1}(x_2) \ge G^L_{x_1}(x_2') \Longrightarrow F^L_{x_1'}(x_2) \ge G^L_{x_1'}(x_2') \quad \text{for any } x_2 \text{ and } x_2'. \tag{9.C.6}$$

Note that (9.C.6) can be equivalently written as

$$(G^L_{x_1})^{-1}\bigl(F^L_{x_1}(x_2)\bigr) \le (G^L_{x_1'})^{-1}\bigl(F^L_{x_1'}(x_2)\bigr) \quad \text{for all } x_1 \le x_1' \text{ and } x_2,$$

that is, $(G^L_{x_1})^{-1}\bigl(F^L_{x_1}(x_2)\bigr)$ is increasing in x1 for all x2.

In the continuous case, it is immediate from (9.C.3) and (9.C.5) that F is LTD if, and only if, $F^I \le_{\mathrm{LTD}} F$, where $F^I$ is defined in Section 9.A, but this is true also when F is not continuous.

Theorem 9.C.1. Let (X1, X2) and (Y1, Y2) be two random vectors with distribution functions F, G ∈ M(F1, F2), such that $F^L_{x_1}(x_2)$ and $G^L_{x_1}(x_2)$ are continuous in x2 for all x1. Then (X1, X2) ≤LTD (Y1, Y2) =⇒ (X1, X2) ≤PQD (Y1, Y2).

Proof. Since F and G have the same marginals, we see from (9.C.6) that (X1, X2) ≤LTD (Y1, Y2) if, and only if,


$$F(x_1, x_2) \ge G(x_1, x_2') \Longrightarrow F(x_1', x_2) \ge G(x_1', x_2') \quad \text{for any } x_2,\ x_2', \text{ and } x_1 \le x_1'. \tag{9.C.7}$$

If (X1, X2) ≤PQD (Y1, Y2) did not hold, then there would have existed a point (x0, y0) such that F(x0, y0) > G(x0, y0). Let y < y0 be such that F(x0, y0) > F(x0, y) > G(x0, y0). Since F2(y) < F2(y0), one can then find an x such that x > x0 and F(x, y) < G(x, y0). But then F(x0, y) > G(x0, y0) and F(x, y) < G(x, y0) contradict (9.C.7).
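The LTD property that underlies this order, namely that the conditional distribution function in (9.C.1) is decreasing in the conditioning level, is easy to test for a joint law with finite support. The sketch below is an illustration only: the helper names and the three 2 × 2 probability tables are hypothetical, and for a finite pmf it suffices to evaluate the conditional distribution function at the atoms.

```python
from fractions import Fraction as Fr

def left_tail_cdf(pmf, x1, x2):
    """P{X2 <= x2 | X1 <= x1}, as in (9.C.1), for a finite joint pmf {(a, b): p}."""
    den = sum(p for (a, b), p in pmf.items() if a <= x1)
    num = sum(p for (a, b), p in pmf.items() if a <= x1 and b <= x2)
    return num / den

def is_ltd(pmf):
    """True if P{X2 <= x2 | X1 <= x1} is decreasing in x1 for every x2."""
    xs1 = sorted({a for a, _ in pmf})
    xs2 = sorted({b for _, b in pmf})
    for x2 in xs2:
        vals = [left_tail_cdf(pmf, x1, x2) for x1 in xs1]
        if any(later > earlier for earlier, later in zip(vals, vals[1:])):
            return False
    return True

# Hypothetical joint laws with Bernoulli(1/2) marginals.
positive = {(0, 0): Fr(3, 8), (0, 1): Fr(1, 8), (1, 0): Fr(1, 8), (1, 1): Fr(3, 8)}
negative = {(0, 0): Fr(1, 8), (0, 1): Fr(3, 8), (1, 0): Fr(3, 8), (1, 1): Fr(1, 8)}
independent = {(a, b): Fr(1, 4) for a in (0, 1) for b in (0, 1)}

print(is_ltd(positive), is_ltd(negative), is_ltd(independent))
```

Only the first coordinate is used for conditioning, which mirrors the asymmetry of the LTD notion in its two arguments.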

The LTD order is not symmetric in the sense that (X1, X2) ≤LTD (Y1, Y2) does not necessarily imply that (X2, X1) ≤LTD (Y2, Y1). However, it satisfies the following closure under monotone transformations property.

Theorem 9.C.2. Let (X1, X2) and (Y1, Y2) be two random vectors with distribution functions in the same Fréchet class. If (X1, X2) ≤LTD (Y1, Y2), then (φ(X1), ψ(X2)) ≤LTD (φ(Y1), ψ(Y2)) for all increasing functions φ and ψ.

Example 9.C.3. Let $\phi_\theta(t) \equiv (1 - t^\theta)^{1/\theta}$, t ∈ [0, 1], θ ∈ (0, 1). Then the function $C_{\phi_\theta}$, defined as

$$C_{\phi_\theta}(x, y) = \phi_\theta^{-1}\{\phi_\theta(x) + \phi_\theta(y)\}, \quad x, y \in [0, 1],$$

is a bivariate distribution function with uniform[0, 1] marginals (it is a particular Archimedean copula). If θ1 ≤ θ2, then $C_{\phi_{\theta_2}} \le_{\mathrm{LTD}} C_{\phi_{\theta_1}}$.

An order that is similar to the LTD order, but which is based on conditioning on right tails, rather than on left tails, is described next. For any random vector (X1, X2) with distribution function F ∈ M(F1, F2) we define the conditional distribution function $F^R_{x_1}$ by

$$F^R_{x_1}(x_2) = P\{X_2 \le x_2 \mid X_1 > x_1\} \tag{9.C.8}$$

for all x1 for which this conditional distribution is well defined. Barlow and Proschan [36] defined F (or X1 and X2) to be right tail increasing (RTI) if

$$F^R_{x_1}(x_2) \ge F^R_{x_1'}(x_2) \quad \text{for all } x_1 \le x_1' \text{ and } x_2,$$

or, equivalently, if

$$(F^R_{x_1})^{-1}(u) \le (F^R_{x_1'})^{-1}(u) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.9}$$

When $(F^R_{x_1})^{-1}(u)$ is continuous in u for all x1 then (9.C.9) can be written as

$$F^R_{x_1'}\bigl((F^R_{x_1})^{-1}(u)\bigr) \le u \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.10}$$

This notion leads to the following definition. Let (X1, X2) be a bivariate random vector with distribution function F ∈ M(F1, F2), and let (Y1, Y2) be another bivariate random vector with distribution function G ∈ M(F1, F2). Suppose that for any $x_1 \le x_1'$ we have


$$(F^R_{x_1})^{-1}(u) \le (F^R_{x_1'})^{-1}(v) \Longrightarrow (G^R_{x_1})^{-1}(u) \le (G^R_{x_1'})^{-1}(v) \quad \text{for all } u, v \in [0, 1]. \tag{9.C.11}$$

Then we say that (X1, X2) is smaller than (Y1, Y2) in the RTI order (denoted by (X1, X2) ≤RTI (Y1, Y2) or F ≤RTI G). In analogy to (9.C.5) we note that (9.C.11) can be written as

$$G^R_{x_1'}\bigl((G^R_{x_1})^{-1}(u)\bigr) \le F^R_{x_1'}\bigl((F^R_{x_1})^{-1}(u)\bigr) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.12}$$

It can be shown that if $F^R_{x_1}(x_2)$ and $G^R_{x_1}(x_2)$ are continuous in x2 for all x1, then (X1, X2) ≤RTI (Y1, Y2) if, and only if, for any $x_1 \le x_1'$,

$$F^R_{x_1}(x_2) \ge G^R_{x_1}(x_2') \Longrightarrow F^R_{x_1'}(x_2) \ge G^R_{x_1'}(x_2') \quad \text{for any } x_2 \text{ and } x_2'. \tag{9.C.13}$$

Note that (9.C.13) can be written as

$$(G^R_{x_1})^{-1}\bigl(F^R_{x_1}(x_2)\bigr) \le (G^R_{x_1'})^{-1}\bigl(F^R_{x_1'}(x_2)\bigr) \quad \text{for all } x_1 \le x_1' \text{ and } x_2,$$

that is, $(G^R_{x_1})^{-1}\bigl(F^R_{x_1}(x_2)\bigr)$ is increasing in x1 for all x2.

In the continuous case, it is immediate from (9.C.10) and (9.C.12) that F is RTI if, and only if, $F^I \le_{\mathrm{RTI}} F$, where $F^I$ is defined in Section 9.A, but this is true also when F is not continuous.

The following result is an analog of Theorem 9.C.1; its proof is similar to the proof of that theorem, and is therefore omitted.

Theorem 9.C.4. Let (X1, X2) and (Y1, Y2) be two random vectors with distribution functions F, G ∈ M(F1, F2), such that $F^R_{x_1}(x_2)$ and $G^R_{x_1}(x_2)$ are continuous in x2 for all x1. Then (X1, X2) ≤RTI (Y1, Y2) =⇒ (X1, X2) ≤PQD (Y1, Y2).

The RTI order is not symmetric in the sense that (X1, X2) ≤RTI (Y1, Y2) does not necessarily imply that (X2, X1) ≤RTI (Y2, Y1). However, it satisfies the following closure under monotone transformations property.

Theorem 9.C.5. Let (X1, X2) and (Y1, Y2) be two random vectors with distribution functions in the same Fréchet class. If (X1, X2) ≤RTI (Y1, Y2), then (φ(X1), ψ(X2)) ≤RTI (φ(Y1), ψ(Y2)) for all increasing functions φ and ψ.

The LTD and RTI orders are related to each other as follows.

Theorem 9.C.6. Let (X1, X2) and (Y1, Y2) be two random vectors in the same Fréchet class.
(a) If (X1, X2) ≤LTD (Y1, Y2), then (φ1(X1), φ2(X2)) ≤RTI (φ1(Y1), φ2(Y2)) for any decreasing functions φ1 and φ2. Conversely, if (φ1(X1), φ2(X2)) ≤RTI (φ1(Y1), φ2(Y2)) for some strictly decreasing functions φ1 and φ2, then (X1, X2) ≤LTD (Y1, Y2).


(b) If (X1, X2) ≤RTI (Y1, Y2), then (φ1(X1), φ2(X2)) ≤LTD (φ1(Y1), φ2(Y2)) for any decreasing functions φ1 and φ2. Conversely, if (φ1(X1), φ2(X2)) ≤LTD (φ1(Y1), φ2(Y2)) for some strictly decreasing functions φ1 and φ2, then (X1, X2) ≤RTI (Y1, Y2).

The orders ≤slodr and ≤suoir imply the LTD and RTI orders under some regularity conditions. This is shown in the next result.

Theorem 9.C.7. Let F and G be in the Fréchet class M(F1, F2). Assume that, for every x, the conditional distributions $F^L_x$ and $F^R_x$ (see (9.C.1) and (9.C.8)) are strictly increasing and continuous on their supports. Then

$$F \le_{\mathrm{slodr}} G \Longrightarrow F \le_{\mathrm{LTD}} G \quad \text{and} \quad F \le_{\mathrm{suoir}} G \Longrightarrow F \le_{\mathrm{RTI}} G.$$

Proof. It is enough to prove the first implication; the other implication then follows from Theorems 9.B.5 and 9.C.6. By (9.C.7), we need to show that for $x \le x'$, and for any $y$, $y'$, it holds that

$$F(x, y) \ge G(x, y') \Longrightarrow F(x', y) \ge G(x', y'). \tag{9.C.14}$$

Now assume that F ≤slodr G. So, for $x \le x'$ and $y' \le y$ we have

$$F(x, y)G(x', y') \le F(x', y)G(x, y'). \tag{9.C.15}$$

In the bivariate case, F ≤slodr G implies that F ≤PQD G. So the left-hand side inequality in (9.C.14) can hold only for $y' \le y$. If it does hold, then (9.C.15) implies the inequality on the right-hand side of (9.C.14).

In light of Theorem 9.C.7 it is of interest to note that the (weak) orthant ratio orders ≤lodr and ≤uoir do not imply the orders ≤LTD and ≤RTI, respectively. Counterexamples can be found in the literature.

An order that is of the same type as the LTD and RTI orders is the one that we study next. For any random vector (X1, X2), with distribution function F ∈ M(F1, F2), let $F_{x_1}$ denote the conditional distribution of X2 given that X1 = x1. Lehmann [343] defined F (or X1 and X2) to be positive regression dependent (PRD) if X2 is stochastically increasing in X1, that is, if

$$F_{x_1}(x_2) \ge F_{x_1'}(x_2) \quad \text{for all } x_1 \le x_1' \text{ and } x_2,$$

or, equivalently, if

$$F_{x_1}^{-1}(u) \le F_{x_1'}^{-1}(u) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.16}$$

Note that when $F_{x_1}^{-1}(u)$ is continuous in u for all x1, then (9.C.16) can be written as

$$F_{x_1'}\bigl(F_{x_1}^{-1}(u)\bigr) \le u \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.17}$$

This notion leads to the following definition.


Let (X1, X2) be a bivariate random vector with distribution function F ∈ M(F1, F2), and let (Y1, Y2) be another bivariate random vector with distribution function G ∈ M(F1, F2). Suppose that for any $x_1 \le x_1'$ we have

$$F^{-1}_{x_1}(u) \le F^{-1}_{x_1'}(v) \implies G^{-1}_{x_1}(u) \le G^{-1}_{x_1'}(v) \quad \text{for all } u, v \in [0, 1]. \tag{9.C.18}$$

Then we say that (X1, X2) is smaller than (Y1, Y2) in the PRD order (denoted by (X1, X2) ≤PRD (Y1, Y2) or F ≤PRD G). Note that (9.C.18) can be written as

$$G_{x_1'}\bigl(G^{-1}_{x_1}(u)\bigr) \le F_{x_1'}\bigl(F^{-1}_{x_1}(u)\bigr) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.19}$$

It can be shown that if $F_{x_1}(x_2)$ and $G_{x_1}(x_2)$ are continuous in x2 for all x1, then (X1, X2) ≤PRD (Y1, Y2) if, and only if, for any $x_1 \le x_1'$,

$$F_{x_1}(x_2) \ge G_{x_1}(x_2') \implies F_{x_1'}(x_2) \ge G_{x_1'}(x_2') \quad \text{for any } x_2 \text{ and } x_2'. \tag{9.C.20}$$

Note that (9.C.20) can be written as

$$G^{-1}_{x_1}\bigl(F_{x_1}(x_2)\bigr) \le G^{-1}_{x_1'}\bigl(F_{x_1'}(x_2)\bigr) \quad \text{for all } x_1 \le x_1' \text{ and } x_2, \tag{9.C.21}$$

that is, $G^{-1}_{x_1}(F_{x_1}(x_2))$ is increasing in x1 for all x2. In the continuous case, it is immediate from (9.C.17) and (9.C.19) that F is PRD if, and only if, $F^I$ ≤PRD F, where $F^I$ is defined in Section 9.A, but this is true also when F is not continuous.

The next result shows the relationship between the PRD, LTD, and RTI orders. We do not give the proof of it here.

Theorem 9.C.8. Let (X1, X2) and (Y1, Y2) be two random vectors with absolutely continuous distribution functions F, G ∈ M(F1, F2). Then

(X1, X2) ≤PRD (Y1, Y2) ⟹ (X1, X2) ≤LTD (Y1, Y2)

and

(X1, X2) ≤PRD (Y1, Y2) ⟹ (X1, X2) ≤RTI (Y1, Y2).

The PRD order is not symmetric in the sense that (X1, X2) ≤PRD (Y1, Y2) does not necessarily imply that (X2, X1) ≤PRD (Y2, Y1). However, it satisfies the following closure under monotone transformations property.

Theorem 9.C.9. Let (X1, X2) and (Y1, Y2) be two random vectors. If (X1, X2) ≤PRD (Y1, Y2), then (φ(X1), ψ(X2)) ≤PRD (φ(Y1), ψ(Y2)) for all increasing functions φ and ψ.

Example 9.C.10. Let U and V be any independent random variables, each having a continuous distribution. Define

$$X = U, \qquad Y_\rho = \rho U + (1 - \rho^2)^{1/2} V, \qquad \text{for } -1 \le \rho \le 1.$$

Then $(X, Y_{\rho_1})$ ≤PRD $(X, Y_{\rho_2})$ whenever $\rho_1 \le \rho_2$. A bivariate normal distribution is a particular case of this example when U and V are normally distributed.
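The normal special case of Example 9.C.10 can be checked by hand against (9.C.21). The following is only a sketch under added assumptions: U and V are taken to be standard normal, $|\rho_1|, |\rho_2| < 1$, and the symbols $s_k$, $F_x$, $G_x$ and $\Phi$ (the standard normal distribution function) are introduced here for illustration. With $s_k = (1 - \rho_k^2)^{1/2}$, the conditional distribution of $Y_{\rho_k}$ given X = x is normal with mean $\rho_k x$ and standard deviation $s_k$, so, writing $F_x$ for the case $\rho_1$ and $G_x$ for the case $\rho_2$,

$$F_x(y) = \Phi\!\left(\frac{y - \rho_1 x}{s_1}\right), \qquad G_x^{-1}(u) = \rho_2 x + s_2\,\Phi^{-1}(u),$$

$$G_x^{-1}\bigl(F_x(y)\bigr) = \rho_2 x + \frac{s_2}{s_1}\,(y - \rho_1 x) = \left(\rho_2 - \rho_1 \frac{s_2}{s_1}\right) x + \frac{s_2}{s_1}\, y.$$

This expression is increasing in x for every y if, and only if, $\rho_2 / s_2 \ge \rho_1 / s_1$. Since $\rho \mapsto \rho / (1 - \rho^2)^{1/2}$ is strictly increasing on $(-1, 1)$, that condition is the same as $\rho_1 \le \rho_2$, in agreement with the statement of the example.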


Example 9.C.11. Let U and V be any independent random variables, each having a continuous distribution. Define

$$X = U, \qquad Y_\alpha = \alpha U + V, \qquad \text{for } -\infty \le \alpha \le \infty.$$

Then $(X, Y_{\alpha_1})$ ≤PRD $(X, Y_{\alpha_2})$ whenever $\alpha_1 \le \alpha_2$.

Example 9.C.12. Let U and V be any independent random variables, each having a continuous distribution, such that U is distributed on (0, 1), while V is nonnegative. Define

$$X = U, \qquad Y_\alpha = (1 + \alpha U)V, \qquad \text{for } \alpha \ge -1.$$

Then $(X, Y_{\alpha_1})$ ≤PRD $(X, Y_{\alpha_2})$ whenever $\alpha_1 \le \alpha_2$.

9.D The PLRD Order

Let the random variables X1 and X2 have the joint distribution F. For any two intervals I1 and I2 of the real line, let us denote I1 ≤ I2 if x1 ∈ I1 and x2 ∈ I2 imply that x1 ≤ x2. For any two intervals I and J of the real line denote F(I, J) ≡ P{X1 ∈ I, X2 ∈ J}. Block, Savits, and Shaked [95] essentially defined F (or X1 and X2) to be positive likelihood ratio dependent if

F(I1, J1)F(I2, J2) ≥ F(I1, J2)F(I2, J1), whenever I1 ≤ I2 and J1 ≤ J2. (9.D.1)

In fact, Block, Savits, and Shaked [95] called F totally positive of order 2 (TP2) if (9.D.1) holds. When F has a (continuous or discrete) density f, then (9.D.1) is equivalent to the condition that f is TP2, that is,

f(x1, y1)f(x2, y2) ≥ f(x1, y2)f(x2, y1), whenever x1 ≤ x2 and y1 ≤ y2.

Then (9.D.1) is the same as the condition for the positive dependence notion that Lehmann [343] called positive likelihood ratio dependence (PLRD). This notion leads naturally to the order that is described below.

Let (X1, X2) be a bivariate random vector with distribution function F ∈ M(F1, F2), and let (Y1, Y2) be another bivariate random vector with distribution function G ∈ M(F1, F2). Suppose that

F(I1, J1)F(I2, J2)G(I1, J2)G(I2, J1) ≤ F(I1, J2)F(I2, J1)G(I1, J1)G(I2, J2), whenever I1 ≤ I2 and J1 ≤ J2, (9.D.2)

where the generic notation G(I, J) is obvious. Then we say that (X1, X2) is smaller than (Y1, Y2) in the PLRD order (denoted by (X1, X2) ≤PLRD (Y1, Y2) or F ≤PLRD G). Since only random vectors with the same univariate marginals can be compared in the PLRD order, we will implicitly assume this fact throughout this section.

When F and G have (continuous or discrete) densities f and g, then (9.D.2) is equivalent to

f(x1, y1)f(x2, y2)g(x1, y2)g(x2, y1) ≤ f(x1, y2)f(x2, y1)g(x1, y1)g(x2, y2), whenever x1 ≤ x2 and y1 ≤ y2.

If $\frac{\partial^2}{\partial x \partial y} f$ and $\frac{\partial^2}{\partial x \partial y} g$ exist, then (9.D.2) is equivalent to

$$f^2 \Delta_g - g^2 \Delta_f \ge 0,$$

where

$$\Delta_f \equiv f\,\frac{\partial^2 f}{\partial x \partial y} - \frac{\partial f}{\partial x} \cdot \frac{\partial f}{\partial y} \qquad \text{and} \qquad \Delta_g \equiv g\,\frac{\partial^2 g}{\partial x \partial y} - \frac{\partial g}{\partial x} \cdot \frac{\partial g}{\partial y}.$$
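The differential condition just stated can be read as a statement about the likelihood ratio g/f. The short computation below is a sketch added for illustration; it assumes that f and g are strictly positive at the point considered, and it writes $\Delta_f$ and $\Delta_g$ for the two quantities defined above. Since

$$\frac{\partial^2}{\partial x \partial y} \log f = \frac{1}{f^2}\left( f\,\frac{\partial^2 f}{\partial x \partial y} - \frac{\partial f}{\partial x} \cdot \frac{\partial f}{\partial y} \right) = \frac{\Delta_f}{f^2},$$

and similarly for g, it follows that

$$f^2 \Delta_g - g^2 \Delta_f = f^2 g^2 \left( \frac{\Delta_g}{g^2} - \frac{\Delta_f}{f^2} \right) = f^2 g^2\, \frac{\partial^2}{\partial x \partial y} \log \frac{g}{f}.$$

Under this positivity assumption the condition therefore says that log(g/f) has a nonnegative mixed partial derivative, that is, the ratio g/f is TP2 in (x, y).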

Obviously F is PLRD if, and only if, $F^I$ ≤PLRD F, where $F^I$ is defined in Section 9.A.

Theorem 9.D.1. Let (X1, X2) and (Y1, Y2) be two random vectors with distribution functions F, G ∈ M(F1, F2). Then

(X1, X2) ≤PLRD (Y1, Y2) ⟹ (X1, X2) ≤PQD (Y1, Y2).

Proof. Assume (X1, X2) ≤PLRD (Y1, Y2) and suppose that (X1, X2) ≰PQD (Y1, Y2). Then

F(x, y) > G(x, y) (9.D.3)

for some (x, y). Let I1 = (−∞, x], I2 = (x, ∞), J1 = (−∞, y] and J2 = (y, ∞). Then from (9.D.3), and from the fact that F and G have the same marginals, it follows that F(I1, J1) > G(I1, J1), F(I2, J2) > G(I2, J2), G(I1, J2) > F(I1, J2) and G(I2, J1) > F(I2, J1). Multiplying these four inequalities we obtain a contradiction to (9.D.2).


We do not know whether (X1, X2) ≤PLRD (Y1, Y2) ⟹ (X1, X2) ≤PRD (Y1, Y2).

The following closure properties of the PLRD order are easy to prove.

Theorem 9.D.2. (a) Let (X1, X2) and (Y1, Y2) be two random vectors such that (X1, X2) ≤PLRD (Y1, Y2). Then (φ(X1), ψ(X2)) ≤PLRD (φ(Y1), ψ(Y2)) for all increasing functions φ and ψ.
(b) Let $\{(X_1^{(j)}, X_2^{(j)}),\ j = 1, 2, \dots\}$ and $\{(Y_1^{(j)}, Y_2^{(j)}),\ j = 1, 2, \dots\}$ be two sequences of random vectors such that $(X_1^{(j)}, X_2^{(j)}) \to_{\mathrm{st}} (X_1, X_2)$ and $(Y_1^{(j)}, Y_2^{(j)}) \to_{\mathrm{st}} (Y_1, Y_2)$ as $j \to \infty$, where $\to_{\mathrm{st}}$ denotes convergence in distribution. If $(X_1^{(j)}, X_2^{(j)})$ ≤PLRD $(Y_1^{(j)}, Y_2^{(j)})$, j = 1, 2, ..., then (X1, X2) ≤PLRD (Y1, Y2).

Let FL and FU denote the Fréchet lower and upper bounds in the class M(F1, F2). Since FL assigns all its mass to some decreasing curve in R², and FU assigns all its mass to some increasing curve in R², it follows that for every distribution F ∈ M(F1, F2) we have

FL ≤PLRD F ≤PLRD FU.

By Theorem 9.D.1, this is a stronger result than (9.A.6).

The proof of the next result is similar to the proof of Theorem 9.D.1 and is therefore omitted.

Theorem 9.D.3. Let (X1, X2) and (Y1, Y2) be two random vectors such that (X1, X2) ≤PLRD (Y1, Y2) and (X1, X2) ≥PLRD (Y1, Y2). Then (X1, X2) =st (Y1, Y2).

Example 9.D.4. Let H and K be two continuous univariate distribution functions. For −1 ≤ α ≤ 1, define the following distribution function

$$F_\alpha(x, y) = H(x)K(y)\{1 + \alpha[1 - H(x)][1 - K(y)]\}, \quad \text{for all } x \text{ and } y.$$

Then $F_{\alpha_1}$ ≤PLRD $F_{\alpha_2}$ whenever $\alpha_1 \le \alpha_2$.

Example 9.D.5. Let φ and ψ be two Laplace transforms of positive random variables and let the random vectors (X1, X2) and (Y1, Y2) be distributed according to F and G as in Example 9.A.3. If $\phi^{-1}\psi$ has a completely monotone derivative, then (X1, X2) ≤PLRD (Y1, Y2).

Example 9.D.6. Let (X1, X2) and (Y1, Y2) be bivariate normal random vectors with the same marginals, and with correlation coefficients ρX and ρY, respectively. If ρX ≤ ρY, then (X1, X2) ≤PLRD (Y1, Y2).
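The claim of Example 9.D.4 can be probed numerically through the density form of (9.D.2). The sketch below is an illustration only, under the added assumption that H and K are both uniform on (0, 1), in which case the density of Fα is 1 + α(1 − 2u)(1 − 2v). The function names `fgm_density` and `plrd_ordered_on_grid`, the grid size and the tolerance are hypothetical choices made here; a grid check can refute the ordering but cannot prove it.

```python
from itertools import combinations


def fgm_density(alpha, u, v):
    """Density of F_alpha from Example 9.D.4 when H and K are uniform on (0, 1)."""
    return 1.0 + alpha * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)


def plrd_ordered_on_grid(alpha_f, alpha_g, m=21, tol=1e-12):
    """Check the four-point density inequality on an m-by-m grid of [0, 1]^2.

    Returns False as soon as some x1 < x2, y1 < y2 violates
    f(x1,y1) f(x2,y2) g(x1,y2) g(x2,y1) <= f(x1,y2) f(x2,y1) g(x1,y1) g(x2,y2).
    """
    grid = [i / (m - 1) for i in range(m)]

    def f(u, v):
        return fgm_density(alpha_f, u, v)

    def g(u, v):
        return fgm_density(alpha_g, u, v)

    for x1, x2 in combinations(grid, 2):      # pairs with x1 < x2
        for y1, y2 in combinations(grid, 2):  # pairs with y1 < y2
            lhs = f(x1, y1) * f(x2, y2) * g(x1, y2) * g(x2, y1)
            rhs = f(x1, y2) * f(x2, y1) * g(x1, y1) * g(x2, y2)
            if lhs > rhs + tol:  # tolerance absorbs floating-point rounding
                return False
    return True


if __name__ == "__main__":
    print(plrd_ordered_on_grid(-0.5, 0.7))  # parameters in increasing order
    print(plrd_ordered_on_grid(0.7, -0.5))  # parameters in the reverse order
```

With the parameters in increasing order the check finds no violation on the grid, while swapping them produces one already at the corner points (0, 0), (1, 1).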


9.E Association Orders

The random variables X1 and X2 are said to be associated if Cov(K(X1, X2), L(X1, X2)) ≥ 0 for all increasing functions K and L for which the covariance is well defined (see (3.A.53)). This notion leads to the order that is described below.

Let (X1, X2) be a bivariate random vector with distribution function F ∈ M(F1, F2), and let (Y1, Y2) be another bivariate random vector with distribution function G ∈ M(F1, F2). Suppose that

(Y1, Y2) =st (K(X1, X2), L(X1, X2)), (9.E.1)

for some increasing functions K and L which satisfy

K(x1, y1) < K(x2, y2), L(x1, y1) > L(x2, y2) ⟹ x1 < x2, y1 > y2. (9.E.2)

Then we say that (X1, X2) is smaller than (Y1, Y2) in the association order (denoted by (X1, X2) ≤assoc (Y1, Y2) or F ≤assoc G). Since only random vectors with the same univariate marginals are compared in the association order, we will implicitly assume this fact throughout this section.

The restriction (9.E.2) on the functions K and L is for the purpose of making the association order applicable in situations which are not symmetric in the X1 and X2 variables. [In case (9.E.2) is dropped, (X1, X2) ≥assoc (X2, X1) ≥assoc (X1, X2).] If K and L are partially differentiable increasing functions, then (9.E.2) is equivalent to

$$\frac{\partial}{\partial x} K(x, y) \cdot \frac{\partial}{\partial y} L(x, y) \ge \frac{\partial}{\partial y} K(x, y) \cdot \frac{\partial}{\partial x} L(x, y) \quad \text{for all } x \text{ and } y.$$

From the fact that increasing functions of independent random variables are associated, it follows that if $F^I$ ≤assoc F, then F is the distribution function of associated random variables, where $F^I$ is defined in Section 9.A.

The following closure property is easy to prove.

Theorem 9.E.1. Let (X1, X2) and (Y1, Y2) be two random vectors. If (X1, X2) ≤assoc (Y1, Y2), then (φ(X1), ψ(X2)) ≤assoc (φ(Y1), ψ(Y2)) for all strictly increasing functions φ and ψ.

The relationship between the association and the PQD orders is described in the next result.

Theorem 9.E.2. Let (X1, X2) and (Y1, Y2) be two random vectors with distribution functions F, G ∈ M(F1, F2). Then

(X1, X2) ≤assoc (Y1, Y2) ⟹ (X1, X2) ≤PQD (Y1, Y2).


Proof. Denote by F and G the distribution functions of (X1 , X2 ) and (Y1 , Y2 ), respectively. By assumption, (Y1 , Y2 ) =st (K(X1 , X2 ), L(X1 , X2 )) where K and L are increasing and satisfy (9.E.2). Fix a pair (x1 , x2 ). First suppose that K(x1 , x2 ) ≤ x1 and that L(x1 , x2 ) ≤ x2 . Then P {Y1 ≤ x1 , Y2 ≤ x2 } ≥ P {Y1 ≤ K(x1 , x2 ), Y2 ≤ L(x1 , x2 )} = P {K(X1 , X2 ) ≤ K(x1 , x2 ), L(X1 , X2 ) ≤ L(x1 , x2 )} ≥ P {X1 ≤ x1 , X2 ≤ x2 }, where the second inequality follows from the increasingness of K and of L. Thus (9.A.3) holds in this case. Next suppose that K(x1 , x2 ) ≤ x1 and that L(x1 , x2 ) > x2 . Then P {Y1 > x1 , Y2 < x2 } ≤ P {Y1 > K(x1 , x2 ), Y2 < L(x1 , x2 )} = P {K(X1 , X2 ) > K(x1 , x2 ), L(X1 , X2 ) < L(x1 , x2 )} ≤ P {X1 > x1 , X2 < x2 }, where the second inequality follows from (9.E.2). From the fact that (X1 , X2 ) and (Y1 , Y2 ) have the same univariate marginals it is seen that (9.A.3) holds in this case too. For the remaining two cases the inequality (9.A.3) follows in a similar way.

The relationship between the association and the PRD orders is described next.

Theorem 9.E.3. Let (X1, X2) and (Y1, Y2) be two random vectors with distribution functions F, G ∈ M(F1, F2) such that $F_{X_2|X_1}(x_2 \mid x_1)$ and $G_{Y_2|Y_1}(x_2 \mid x_1)$ are continuous in x2 for all x1. Then

(X1, X2) ≤PRD (Y1, Y2) ⟹ (X1, X2) ≤assoc (Y1, Y2).

Proof. Suppose that (X1, X2) ≤PRD (Y1, Y2). Define K and L by $K(x_1, x_2) \equiv x_1$ and $L(x_1, x_2) \equiv G^{-1}_{Y_2|Y_1}\bigl(F_{X_2|X_1}(x_2 \mid x_1) \,\big|\, x_1\bigr)$. Obviously K is an increasing function. Also, obviously L(x1, x2) is increasing in x2. Furthermore, from (9.C.21) it is seen that L(x1, x2) is also increasing in x1, and that (9.E.2) holds. Now note that since X1 =st Y1, we have, using the continuity assumptions stated, that (Y1, Y2) =st (K(X1, X2), L(X1, X2)). That is, (X1, X2) and (Y1, Y2) satisfy (9.E.1) and (9.E.2).

Example 9.E.4. Let U and V be any independent random variables. Define

$$X_\alpha = (1 - \alpha)U + \alpha V, \qquad Y = U, \qquad \text{for } \alpha \in [0, 1].$$

Then $(X_{\alpha_1}, Y)$ ≤assoc $(X_{\alpha_2}, Y)$ whenever $\alpha_1 \le \alpha_2$.


Example 9.E.5. Let U and V be any independent random variables. Define

$$X_\alpha = (1 - \alpha)U + \alpha V, \qquad Y = \alpha U + (1 - \alpha)V, \qquad \text{for } \alpha \in [0, \tfrac{1}{2}].$$

Then $(X_{\alpha_1}, Y)$ ≤assoc $(X_{\alpha_2}, Y)$ whenever $\alpha_1 \le \alpha_2$.

Example 9.E.6. Let (X1, X2) and (Y1, Y2) have bivariate normal distributions with correlation coefficients ρ1 and ρ2, respectively. Then (X1, X2) ≤assoc (Y1, Y2) if, and only if, −1 ≤ ρ1 ≤ ρ2 ≤ 1.

Capéraà, Fougères, and Genest [119] introduced an order that is related to the association order. In order to define it we need first to introduce some notation. Let (X1, X2) be a random vector with a continuous distribution function F ∈ M(F1, F2). Define $V_F \equiv F(X_1, X_2)$, and let $K_F$ denote the distribution function of $V_F$. For example, if the distribution function of (X1, X2) is the Fréchet upper bound FU ∈ M(F1, F2) (see page 387), then $K_{F_U}(v) = v$, v ∈ [0, 1]. If the distribution function of (X1, X2) is the Fréchet lower bound FL ∈ M(F1, F2), then $K_{F_L}(v) = 1$, v ∈ [0, 1]. Finally, if X1 and X2 are independent, with distribution function $F^I$ ∈ M(F1, F2), then $K_{F^I}(v) = v - v \log v$, v ∈ [0, 1]. These facts suggest the following order.

Let (X1, X2) and (Y1, Y2) be two random vectors with continuous distribution functions F, G ∈ M(F1, F2). Suppose that

$$K_F(v) \ge K_G(v), \quad \text{for all } v \in [0, 1].$$

Then we say that (X1, X2) is smaller than (Y1, Y2) in the Capéraà-Fougères-Genest order (denoted by (X1, X2) ≤CFG (Y1, Y2) or F ≤CFG G). Capéraà, Fougères, and Genest [119] showed that for every continuous distribution function F ∈ M(F1, F2) we have

FL ≤CFG F ≤CFG FU.

They also proved, under some regularity conditions, that

(X1, X2) ≤assoc (Y1, Y2) ⟹ (X1, X2) ≤CFG (Y1, Y2).

However, Capéraà, Fougères, and Genest [119] showed that ≤CFG does not imply ≤PQD, whereas Nelsen, Quesada-Molina, Rodríguez-Lallena, and Úbeda-Flores [433] showed that ≤PQD does not imply ≤CFG. Nelsen, Quesada-Molina, Rodríguez-Lallena, and Úbeda-Flores [432] introduced some generalizations of the order ≤CFG.

Another related order of interest is based on the notion of weak positive association which is defined in (9.A.22). Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors that have the same univariate marginals, and that satisfy


$$\mathrm{Cov}\bigl(h_1(X_{i_1}, X_{i_2}, \dots, X_{i_k}),\, h_2(X_{j_1}, X_{j_2}, \dots, X_{j_{n-k}})\bigr) \le \mathrm{Cov}\bigl(h_1(Y_{i_1}, Y_{i_2}, \dots, Y_{i_k}),\, h_2(Y_{j_1}, Y_{j_2}, \dots, Y_{j_{n-k}})\bigr)$$

for all choices of disjoint subsets $\{i_1, i_2, \dots, i_k\}$ and $\{j_1, j_2, \dots, j_{n-k}\}$ of {1, 2, ..., n}, and for all increasing functions h1 and h2 for which the above covariances are defined. Then X is said to be smaller than Y in the weak association order (denoted by X ≤w-assoc Y).

Some closure properties of the weak association order are described in the next theorem.

Theorem 9.E.7. (a) Let (X1, X2, ..., Xn) and (Y1, Y2, ..., Yn) be two n-dimensional random vectors. If (X1, X2, ..., Xn) ≤w-assoc (Y1, Y2, ..., Yn), then

(g1(X1), g2(X2), ..., gn(Xn)) ≤w-assoc (g1(Y1), g2(Y2), ..., gn(Yn))

whenever gi : R → R, i = 1, 2, ..., n, are all increasing.
(b) Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two n-dimensional random vectors. If X ≤w-assoc Y, then $X_I$ ≤w-assoc $Y_I$ for each I ⊆ {1, 2, ..., n}. That is, the weak association order is closed under marginalization.

An important useful property of the weak association order is the following.

Theorem 9.E.8. Let X and Y be two random vectors with the same univariate marginals. Then

X ≤w-assoc Y ⟹ X ≤sm Y.

Remark 9.E.9. Note that if X = (X1, X2, ..., Xn) is a vector of weakly positively associated random variables, as defined in (9.A.22), and if Y = (Y1, Y2, ..., Yn) is a vector of independent random variables such that, marginally, Xi =st Yi, i = 1, 2, ..., n, then X ≥w-assoc Y. Similarly, if X is a vector of negatively associated random variables, as defined in (3.A.54), and if Y is a vector of independent random variables such that, marginally, Xi =st Yi, i = 1, 2, ..., n, then X ≤w-assoc Y. Thus it is seen that Theorem 9.E.8 is a stronger result than Theorem 9.A.23.
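The three values of $K_F$ quoted earlier in this section for the Capéraà-Fougères-Genest order can be verified directly. The computation below is a sketch added for illustration; it assumes continuous marginals F1 and F2, and the uniform random variables $U_1 = F_1(X_1)$ and $U_2 = F_2(X_2)$ are notation introduced here. Under independence, $F^I(X_1, X_2) = U_1 U_2$ with $U_1$ and $U_2$ independent and uniform on (0, 1), so that, for v ∈ (0, 1],

$$K_{F^I}(v) = P\{U_1 U_2 \le v\} = \int_0^1 P\{U_2 \le v/u\}\, du = \int_0^v 1\, du + \int_v^1 \frac{v}{u}\, du = v - v \log v.$$

Under the Fréchet upper bound, $U_2 = U_1$ almost surely and $F_U(X_1, X_2) = \min(U_1, U_2) = U_1$, which gives $K_{F_U}(v) = v$. Under the Fréchet lower bound, $U_2 = 1 - U_1$ almost surely and $F_L(X_1, X_2) = \max(U_1 + U_2 - 1, 0) = 0$, which gives $K_{F_L}(v) = 1$ for every v ∈ [0, 1].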

9.F The PDD Order

Let the random variables X1 and X2 have the symmetric (or exchangeable, or interchangeable) joint distribution F. Shaked [501] defines F (or X1 and X2) to be positive definite dependent (PDD) if F is a positive definite kernel on S × S, where S is the support of X1 (and therefore, by symmetry, S is also the support of X2). Shaked [501] has shown that X1 and X2 are PDD if, and only if,

Cov(φ(X1), φ(X2)) ≥ 0 for every real function φ, (9.F.1)

provided the covariance is well defined. This notion naturally leads to the order that is defined below.

Let (X1, X2) be a bivariate random vector with distribution function $F \in M^{(s)}(\hat F)$, where $M^{(s)}(\hat F)$ is the class of all the bivariate symmetric distributions with univariate marginals $\hat F$. Let (Y1, Y2) be another bivariate random vector with distribution function $G \in M^{(s)}(\hat F)$. Suppose that

Cov(φ(X1), φ(X2)) ≤ Cov(φ(Y1), φ(Y2)) for every real function φ, (9.F.2)

provided the covariances are well defined. Then we say that (X1, X2) is smaller than (Y1, Y2) in the PDD order (denoted by (X1, X2) ≤PDD (Y1, Y2) or F ≤PDD G). Since only symmetric random vectors with the same univariate marginals are compared in the PDD order, we will implicitly assume this fact throughout this section.

Since Eφ(X1) = Eφ(X2) = Eφ(Y1) = Eφ(Y2) for every real function φ, it follows that (X1, X2) ≤PDD (Y1, Y2) if, and only if,

Eφ(X1)φ(X2) ≤ Eφ(Y1)φ(Y2) for every real function φ, (9.F.3)

provided the expectations exist. Thus, if (X1, X2) ≤PDD (Y1, Y2), then

P{X1 ∈ A, X2 ∈ A} ≤ P{Y1 ∈ A, Y2 ∈ A}

for all Borel-measurable sets A in R. Another characterization of the PDD order is given in the next theorem.

Theorem 9.F.1. Let F and G be two symmetric bivariate distributions in $M^{(s)}(\hat F)$. Then F ≤PDD G if, and only if, G(x, y) − F(x, y) is a positive definite kernel.

From (9.F.1) and (9.F.3) it is easily seen that F is PDD if, and only if, $F^I$ ≤PDD F, where $F^I$ is defined in Section 9.A.

A powerful closure property of the PDD order is described in the next theorem.

Theorem 9.F.2. Suppose that the four random vectors (X1, X2), (Y1, Y2), (U1, U2) and (V1, V2) satisfy

(X1, X2) ≤PDD (Y1, Y2) and (U1, U2) ≤PDD (V1, V2), (9.F.4)

and suppose that (X1, X2) and (U1, U2) are independent, and also that (Y1, Y2) and (V1, V2) are independent. Then

(φ(X1, U1), φ(X2, U2)) ≤PDD (φ(Y1, V1), φ(Y2, V2)),

for every increasing function φ.


In particular, if (9.F.4) holds, then the PDD order is closed under convolutions, that is,

(X1 + U1, X2 + U2) ≤PDD (Y1 + V1, Y2 + V2).

Using (9.F.3) it is easy to verify the following closure properties.

Theorem 9.F.3. (a) Let $\{(X_1^{(j)}, X_2^{(j)}),\ j = 1, 2, \dots\}$ and $\{(Y_1^{(j)}, Y_2^{(j)}),\ j = 1, 2, \dots\}$ be two sequences of random vectors such that $(X_1^{(j)}, X_2^{(j)}) \to_{\mathrm{st}} (X_1, X_2)$ and $(Y_1^{(j)}, Y_2^{(j)}) \to_{\mathrm{st}} (Y_1, Y_2)$ as $j \to \infty$, where $\to_{\mathrm{st}}$ denotes convergence in distribution. If $(X_1^{(j)}, X_2^{(j)})$ ≤PDD $(Y_1^{(j)}, Y_2^{(j)})$, j = 1, 2, ..., then (X1, X2) ≤PDD (Y1, Y2).
(b) Let (X1, X2), (Y1, Y2), and Θ be random vectors such that [(X1, X2) | Θ = θ] ≤PDD [(Y1, Y2) | Θ = θ] for all θ in the support of Θ. Then (X1, X2) ≤PDD (Y1, Y2). That is, the PDD order is closed under mixtures.

Example 9.F.4. Let (X1, X2) and (Y1, Y2) have exchangeable bivariate normal distributions with common marginals and correlation coefficients ρ1 and ρ2, respectively. If 0 ≤ ρ1 ≤ ρ2 ≤ 1, then (X1, X2) ≤PDD (Y1, Y2).

If (X1, X2) and (Y1, Y2) have distributions F and G which are not symmetric, but still have the same common marginals (that is, X1, X2, Y1, and Y2 are all identically distributed), then the PDD order can still be defined on the symmetrizations $\tilde F(x, y) = \tfrac{1}{2}[F(x, y) + F(y, x)]$ and $\tilde G(x, y) = \tfrac{1}{2}[G(x, y) + G(y, x)]$ of F and G.

Hu and Joe [234] applied the idea of the PDD order to stationary reversible Markov chains {X1, X2, ...}. They showed for such chains that, if X1 and X2 are PDD (in the sense (9.F.1)), then dependence (in the sense of the PDD order) is decreasing with the lag, namely,

$$F_{12} \ge_{\mathrm{PDD}} F_{13} \ge_{\mathrm{PDD}} \cdots \ge_{\mathrm{PDD}} F_{1n} \ge_{\mathrm{PDD}} \cdots \ge_{\mathrm{PDD}} F^{(2)},$$

where the $F_{1j}$'s and $F^{(2)}$ are as defined in (9.A.12).

An n-variate extension of the PDD order for the case when n ≥ 2 is suggested by (9.F.3). Explicitly, let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) have distribution functions with common marginals. Then we can say that X is less positively dependent than Y if

$$E \prod_{i=1}^{n} \phi(X_i) \le E \prod_{i=1}^{n} \phi(Y_i) \quad \text{for every nonnegative real function } \phi. \tag{9.F.5}$$

Note that for this definition it is not required that X and Y have exchangeable distribution functions; it is only required that X and Y have the same common marginals. One reason for the usefulness of inequality (9.F.5) is that it implies that

P{X1 ∈ A, X2 ∈ A, ..., Xn ∈ A} ≤ P{Y1 ∈ A, Y2 ∈ A, ..., Yn ∈ A}

for all Borel-measurable sets A in R.
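For distributions on a finite support, condition (9.F.3) becomes a statement about a quadratic form: if p and q are the symmetric joint probability matrices of (X1, X2) and (Y1, Y2), then Eφ(Y1)φ(Y2) − Eφ(X1)φ(X2) is the quadratic form of q − p evaluated at the vector of values of φ. The sketch below illustrates this with an independent pair and a comonotone pair (X1 = X2) sharing one marginal. It is an illustration under these assumptions, not a general test: the names `quad_form_gap`, `looks_pdd_ordered`, `marg`, `indep` and `comon` are hypothetical, and random probing of φ can only refute the ordering, not prove it.

```python
import random


def quad_form_gap(p, q, phi):
    """Return E[phi(Y1) phi(Y2)] - E[phi(X1) phi(X2)].

    p and q are joint pmf matrices (lists of rows) of (X1, X2) and (Y1, Y2)
    on a common finite support; phi lists the values of the function there.
    """
    n = len(phi)
    return sum((q[i][j] - p[i][j]) * phi[i] * phi[j]
               for i in range(n) for j in range(n))


def looks_pdd_ordered(p, q, trials=2000, tol=1e-12, seed=0):
    """Probe (9.F.3) with random functions phi; False means a violation was found."""
    rng = random.Random(seed)
    n = len(p)
    for _ in range(trials):
        phi = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        if quad_form_gap(p, q, phi) < -tol:
            return False
    return True


# Common marginal on a three-point support (an arbitrary illustrative choice).
marg = [0.2, 0.3, 0.5]
# Independent pair: joint pmf is the outer product of the marginal with itself.
indep = [[a * b for b in marg] for a in marg]
# Comonotone pair X1 = X2: all mass on the diagonal.
comon = [[marg[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

if __name__ == "__main__":
    print(looks_pdd_ordered(indep, comon))  # independent pair versus comonotone pair
    print(looks_pdd_ordered(comon, indep))  # the reverse comparison
```

In this particular case the gap equals the variance of φ under the marginal, which is why the first comparison never fails and the reverse one fails for any non-constant φ.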


9.G Ordering Exchangeable Distributions

Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors with exchangeable distributions. Let X(1) ≤ X(2) ≤ ··· ≤ X(n) and Y(1) ≤ Y(2) ≤ ··· ≤ Y(n) be the corresponding order statistics. Intuitively, if Y is “more positively dependent” than X (or, alternatively, Y is “less dispersed” than X), then we can expect the Yi's to “hang together” more than the Xi's. For example, we can expect quantities such as X(n) − X(1) or X(n) + X(n−1) − X(2) − X(1) to be stochastically larger than Y(n) − Y(1) or Y(n) + Y(n−1) − Y(2) − Y(1). This observation naturally leads to the following definitions.

Let X and Y be two n-dimensional random vectors with exchangeable distribution functions and with the same common marginals. We will write X ≤pd-1 Y if

$$\sum_{i=1}^{n} c_i X_{(i)} \ge_{\mathrm{st}} \sum_{i=1}^{n} c_i Y_{(i)} \quad \text{whenever } \sum_{i=1}^{n} c_i = 0. \tag{9.G.1}$$

When the interest is in the unordered components of the random vectors, then the following definition is useful. We will write X ≤pd-2 Y if

$$\sum_{i=1}^{n} c_i X_i \ge_{\mathrm{st}} \sum_{i=1}^{n} c_i Y_i \quad \text{whenever } \sum_{i=1}^{n} c_i = 0. \tag{9.G.2}$$

Recall from page 2 the definition of the majorization order a ≺ b among n-dimensional vectors. For any random variable W, let $F_W$ denote the distribution function of W. We will write X ≤pd-3 Y if

$$\bigl(F_{X_{(1)}}(x), F_{X_{(2)}}(x), \dots, F_{X_{(n)}}(x)\bigr) \succ \bigl(F_{Y_{(1)}}(x), F_{Y_{(2)}}(x), \dots, F_{Y_{(n)}}(x)\bigr) \quad \text{for all } x. \tag{9.G.3}$$

It is easy to verify that (9.G.3) is equivalent to

$$\bigl(E\phi(X_{(1)}), E\phi(X_{(2)}), \dots, E\phi(X_{(n)})\bigr) \succ \bigl(E\phi(Y_{(1)}), E\phi(Y_{(2)}), \dots, E\phi(Y_{(n)})\bigr)$$

for all monotone functions φ for which the expectations exist. A further insight into the meaning of (9.G.3) can be obtained by rewriting it as the set of inequalities

$$E\Bigl[\sum_{i=1}^{j} I_{(-\infty, x]}(X_{(i)})\Bigr] \ge E\Bigl[\sum_{i=1}^{j} I_{(-\infty, x]}(Y_{(i)})\Bigr], \quad \text{for } j = 1, 2, \dots, n, \text{ and all } x, \tag{9.G.4}$$

with equality holding for j = n. That is, for each j, the expected value of the number of order statistics which are less than or equal to x among the first j ordered Xi's is at least as large as the corresponding expected value based on the ordered Yi's.

When one is concerned only with the expectations of the order statistics, then the following stochastic order is useful. We will write X ≤pd-4 Y if

$$\bigl(EX_{(1)}, EX_{(2)}, \dots, EX_{(n)}\bigr) \succ \bigl(EY_{(1)}, EY_{(2)}, \dots, EY_{(n)}\bigr). \tag{9.G.5}$$
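Condition (9.G.5) is a plain majorization check on two vectors of expected order statistics, so it is easy to test once those expectations are known. The sketch below is an illustration under added assumptions: the helper name `majorizes` is hypothetical, X is taken to consist of n independent uniform (0, 1) variables (for which the expectation of the i-th order statistic is i/(n + 1)), and Y is the comonotone vector Y1 = ··· = Yn with the same uniform marginal (for which every order statistic has mean 1/2). Exact fractions are used so that the equal-totals requirement is not disturbed by rounding.

```python
from fractions import Fraction


def majorizes(a, b):
    """Return True if a majorizes b (a ≻ b).

    That is, both vectors have the same total, and for every k the sum of the
    k largest entries of a is at least the sum of the k largest entries of b.
    """
    if len(a) != len(b) or sum(a) != sum(b):
        return False
    a_desc = sorted(a, reverse=True)
    b_desc = sorted(b, reverse=True)
    run_a = run_b = 0
    for x, y in zip(a_desc, b_desc):
        run_a += x
        run_b += y
        if run_a < run_b:
            return False
    return True


n = 4
# Expected order statistics of n independent uniform(0, 1) variables: i / (n + 1).
iid_means = [Fraction(i, n + 1) for i in range(1, n + 1)]
# Comonotone vector with the same marginal: every order statistic has mean 1/2.
comonotone_means = [Fraction(1, 2)] * n

if __name__ == "__main__":
    print(majorizes(iid_means, comonotone_means))  # independent versus comonotone
    print(majorizes(comonotone_means, iid_means))  # the reverse comparison
```

In this illustration the independent vector plays the role of X and the comonotone one the role of Y in (9.G.5), which matches the intuition that the components of the more dependent vector hang together more.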

The next result describes some interrelationships among the orders ≤pd-k, k = 1, 2, 3, 4.

Theorem 9.G.1. Let X and Y be two n-dimensional random vectors with exchangeable distribution functions and with the same common marginals. Then

$$\begin{array}{ccccc} & & X \le_{\text{pd-1}} Y & \Longrightarrow & X \le_{\text{pd-2}} Y \\ & & \Downarrow & & \\ X \le_{\text{pd-3}} Y & \Longrightarrow & X \le_{\text{pd-4}} Y & & \end{array}$$

That is, ≤pd-1 implies both ≤pd-2 and ≤pd-4, and ≤pd-3 implies ≤pd-4.

Proof. First suppose that X ≤pd-1 Y. Let π = (π1, π2, ..., πn) denote a permutation of {1, 2, ..., n}, and let $\sum_\pi$ denote a summation over all such permutations. Then, by exchangeability, for any real z, and whenever $\sum_{i=1}^{n} c_i = 0$, we have

$$\begin{aligned} P\Bigl\{\sum_{i=1}^{n} c_i X_i > z\Bigr\} &= \frac{1}{n!} \sum_\pi P\Bigl\{\sum_{i=1}^{n} c_i X_i > z \,\Big|\, X_{\pi_1} \le X_{\pi_2} \le \cdots \le X_{\pi_n}\Bigr\} \\ &= \frac{1}{n!} \sum_\pi P\Bigl\{\sum_{i=1}^{n} c_{\pi_i} X_{(i)} > z\Bigr\} \\ &\ge \frac{1}{n!} \sum_\pi P\Bigl\{\sum_{i=1}^{n} c_{\pi_i} Y_{(i)} > z\Bigr\} \\ &= P\Bigl\{\sum_{i=1}^{n} c_i Y_i > z\Bigr\}, \end{aligned}$$

and (9.G.2) follows.

If we denote $a_i = EX_{(i)}$ and $b_i = EY_{(i)}$, i = 1, 2, ..., n, then from (9.G.1) it follows that $a_i - a_{i-1} \ge b_i - b_{i-1}$, i = 2, 3, ..., n. Also, $\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} b_i$. Now it is easily seen that a ≻ b, and thus (9.G.5) holds.

The proof of X ≤pd-3 Y ⟹ X ≤pd-4 Y is easy (see, for example, Marshall and Olkin [383, page 350]).

Some closure properties of the above orders are described in the following theorem.

Theorem 9.G.2. (a) For j = 1, 2, ..., let $X^{(j)}$ and $Y^{(j)}$ be two random vectors with exchangeable distribution functions and with the same common marginals such that $X^{(j)} \to_{\mathrm{st}} X$ and $Y^{(j)} \to_{\mathrm{st}} Y$ as $j \to \infty$, where $\to_{\mathrm{st}}$ denotes convergence in distribution. If $X^{(j)}$ ≤pd-k $Y^{(j)}$, j = 1, 2, ..., then X ≤pd-k Y, k = 1, 2, 3.
(b) Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two n-dimensional random vectors with exchangeable distribution functions and with the same common marginals. If X ≤pd-k Y, then $X_I$ ≤pd-k $Y_I$ for each I ⊆ {1, 2, ..., n}. That is, the ≤pd-k order is closed under marginalization, k = 1, 2, 3, 4.
(c) Let (X1, X2, ..., Xn) and (Y1, Y2, ..., Yn) be as in part (b). If (X1, X2, ..., Xn) ≤pd-k (Y1, Y2, ..., Yn), then

(aX1 + b, aX2 + b, ..., aXn + b) ≤pd-k (aY1 + b, aY2 + b, ..., aYn + b)

for any constants a and b, k = 1, 2, 3, 4.
(d) Let X and Y be as in part (b), and let Θ be another random vector. If [X | Θ = θ] ≤pd-k [Y | Θ = θ] for all θ in the support of Θ, then X ≤pd-k Y. That is, the ≤pd-k order is closed under mixtures, k = 1, 2, 3, 4.

In the bivariate case we have the following relationship.

Theorem 9.G.3. Let (X1, X2) and (Y1, Y2) be two random vectors with exchangeable distribution functions with common marginals. Then

(X1, X2) ≤PDD (Y1, Y2) ⟹ (X1, X2) ≤pd-3 (Y1, Y2).

Proof. Suppose that (X1, X2) ≤PDD (Y1, Y2). Then, for any real z we have

$$\begin{aligned} F_{X_{(1)}}(z) &= 1 - P\{\min(X_1, X_2) > z\} \\ &= 1 - E\, I_{(z, \infty)}(X_1)\, I_{(z, \infty)}(X_2) \\ &\ge 1 - E\, I_{(z, \infty)}(Y_1)\, I_{(z, \infty)}(Y_2) \\ &= F_{Y_{(1)}}(z), \end{aligned}$$

where the inequality follows from (9.F.3). Now, since $F_{X_{(1)}}(z) + F_{X_{(2)}}(z) = F_{Y_{(1)}}(z) + F_{Y_{(2)}}(z)$, it follows that $(F_{X_{(1)}}(z), F_{X_{(2)}}(z)) \succ (F_{Y_{(1)}}(z), F_{Y_{(2)}}(z))$, which is (9.G.3).

A relationship between the star order (see Section 4.B) and the order ≤pd-4 is described next. Theorem 9.G.4. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two vectors, each consisting of independent and identically distributed nonnegative random variables. If X1 ≤∗ Y1 , and if EX1 = EY1 , then X ≤pd-4 Y . Example 9.G.5. If X1 , X2 , . . . , Xn are conditionally independent and identically distributed (then they are exchangeable), and if Y1 , Y2 , . . . , Yn are independent and identically distributed, and if all the Xi ’s and Yi ’s have the same marginal distribution, then X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) satisfy X ≤pd-3 Y and, of course, also X ≤pd-4 Y ; this is shown in Shaked and Tong [523]. Hu and Hu [233] have shown that if X1 , X2 , . . . , Xn have some other properties of positive or negative dependence, and if Y1 , Y2 , . . . , Yn are independent, and if Xi =st Yi for i = 1, 2, . . . , n, then the above (that is, X ≤pd-3 Y and X ≤pd-4 Y ) also hold. Ebrahimi and Spizzichino [178] obtained conditions on the expected values of the order statistics that are associated with X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ), under which X ≤pd-4 Y .


Paul [442] gave conditions under which Xi ≤cx Yi, i = 1, 2 (where (X1, X2) and (Y1, Y2) are some bivariate random vectors) imply (Y1, Y2) ≤pd-4 (X1, X2) (in fact the conclusion of Paul [442] is stated as E max{X1, X2} ≤ E max{Y1, Y2}, but since EXi = EYi, i = 1, 2, the stated conclusion is the same as (Y1, Y2) ≤pd-4 (X1, X2)). Müller [414], however, noticed that in Paul [442] there was a subtle mistake which invalidated his Theorem 1. Müller [414] provided other conditions under which the conclusion above is valid.

9.H Complements

A good review of the theory of positive dependence orders is the survey by Scarsini and Shaked [496]. Section 2.2 in Joe [262] contains many of the results that are mentioned in Sections 9.A–9.F, as well as many examples and counterexamples.

Section 9.A: The PQD order is first defined in Yanagimoto and Okamoto [570]; it also can be found in Tchen [547]. The general closure property of the PQD order (Theorem 9.A.1) is taken from Kimeldorf and Sampson [295]. The definition of the PQD order for general n-dimensional vectors (n > 2) can be found in Joe [260]. The conditions under which Archimedean copulas are ordered in the PQD sense (Example 9.A.3) can be found in Joe [262]. Brown and Rinott [110] showed that some pairs of multivariate infinitely divisible distributions are PQD-ordered. The PQD comparisons of convolutions and of mixtures results (Theorems 9.A.6 and 9.A.7) are special cases of results of Belzunce and Semeraro [77]. The PQD ordering of random vectors with elliptically contoured densities (Example 9.A.8) follows from Theorem 5.1 of Das Gupta, Eaton, Olkin, Perlman, Savage, and Sobel [139]; see also Landsman and Tsanakas [331]. The results about the supermodular order (Section 9.A.4) are mostly taken from Meester and Shanthikumar [387] and from Shaked and Shanthikumar [517]; see also Joe [260] and Szekli, Disney, and Hur [545]. The closure results of the supermodular order given in Theorem 9.A.12, and the application to Markov chains given in Example 9.A.13, are taken from Li and Xu [350]. An extension of the result in Example 9.A.13 can be found in Kulik and Szekli [325]. The closure property of the order ≤sm under random sums (Theorem 9.A.14) can be found in Denuit, Genest, and Marceau [145]; it generalizes some results of Hu and Pan [238]. Extensions of Theorem 9.A.14 are given in Lillo, Pellerey, Semeraro, and Shaked [363], and in Kulik and Szekli [325].
The supermodular comparison of mixtures result (Theorem 9.A.15) is taken from Denuit and Müller [157]. The property that is described in Theorem 9.A.16 can be found in Bäuerle [58] or in Bäuerle and Rieder [61], and the property that is described in Theorem 9.A.18 can be found in Müller [411]. The inequality that is described in Example 9.A.17 is taken from Vanichpun and Makowski [554, 555]; they credit it to Bäuerle [58]. The fact that sums of components of supermodular ordered vectors are ordered according to ≤icx, described in (9.A.19), is taken from Müller [409]. The convex order comparison of random sums in Example 9.A.19 is a generalization of a result of O'Cinneide [439]. The result about the ordering of multivariate normal random vectors according to the ≤sm order (Example 9.A.20) can be found in Huffer [250]; see also Müller and Scarsini [416] and Block and Sampson [94, Section 2], though in the latter paper there is a mistake which is corrected in Müller and Scarsini [416]. An extension of the result in Example 9.A.20 to Kotz-type distributions is given in Ding and Zhang [168]. The bound on X, which is described in Theorem 9.A.21, can be found in Tchen [547]. A geometric proof of (9.A.20) is given in Kaas, Dhaene, Vyncke, Goovaerts, and Denuit [268] and in Hoedemakers, Beirlant, Goovaerts, and Dhaene [224]. The convex comparison of sums (Proposition 9.A.22) is taken from Kaas, Dhaene, and Goovaerts [267]; some related results and extensions can be found in Goovaerts and Kaas [213] and in Hoedemakers, Beirlant, Goovaerts, and Dhaene [224]. The comparison of a vector of associated random variables with its independence version (Theorem 9.A.23) can be found in Christofides and Vaggelatou [130]; the first part of this theorem strengthens a result in Shaked and Shanthikumar [517] which states the same conclusion, but under the CIS condition (defined in (6.B.11)) which is stronger than the weak positive association condition. The lower bound on X by the so-called “mutually exclusive” random variables (that is, that satisfy (9.A.23)), given in Theorem 9.A.24, is taken from Dhaene and Denuit [162]; see related results in Frostig [207] and in references therein. The sufficient condition by means of copulas, which imply the ≤dir-cx order (Theorem 9.A.25), can be found in Juri [266].
Theorem 3.1 and Corollaries 3.2 and 4.1 in Rüschendorf [486] are variants of Theorem 9.A.25. The model that is described in Example 9.A.26 is a special case of a model discussed in Bäuerle [57]; in fact, her Theorem 3.1 can be obtained from the stochastic inequality of Example 9.A.26 and the closure of the supermodular order under mixtures (Theorem 9.A.9(d)). Rüschendorf [486] studied various extensions of Example 9.A.26. The comparison of sampling plans which is given in Example 9.A.27 was obtained in Karlin [276], and noted by Frostig [206]. The comparison of multivariate Archimedean copulas (Example 9.A.28), as well as further similar comparisons, can be found in Wei and Hu [559]. If F and G of (9.A.3) are the distribution functions of bivariate vectors with integer-valued components, then the comparison $F \le_{\mathrm{PQD}} G$ is the same as a comparison of the partial sums of two matrices with nonnegative entries (which sum up to 1). Nguyen and Sampson [434] studied the geometry of such matrices. The PQD comparison can be used also to compare contingency tables that have the same row and column sums. Nguyen and Sampson [435] obtained some results regarding the number of such contingency tables that are more PQD than a given contingency table. Block, Chhetry, Fang, and Sampson [92] found necessary and sufficient conditions (by means of orders of permutations) for two bivariate empirical distributions to be ordered according to the PQD order. Further results in this vein are given in Metry and Sampson [392]. Examples of pairs of bivariate distributions that are PQD-ordered can be found in de la Horra and Ruiz-Rivas [227] and in Joe [261]. Bassan and Scarsini [55] characterized the PQD order by means of the usual stochastic ordering of some related stopping times. Ebrahimi [175] discussed negatively dependent distributions that are ordered according to the PQD order. Some positive dependence orders that are weaker than the PQD order were introduced in Rodríguez-Lallena and Úbeda-Flores [470]. Lu and Yi [366] gave a definition of an order that generalizes the bivariate PQD order to higher dimensions. However, this order does not have the desirable properties of being closed under mixtures and concatenations (this follows from the fact that parts (c) and (e) of Theorem 2.4 in Lu and Yi [366] may be incorrect).

Section 9.B: Most of the results in this section can be found in Colangelo, Scarsini, and Shaked [133].

Section 9.C: Most of the results in this section, about the LTD and RTI orders, are taken from Avérous and Dortet-Bernadet [25]. The relationship between the strong orthant ratio orders and the LTD and RTI orders (Theorem 9.C.7) can be found in Colangelo, Scarsini, and Shaked [133]; the counterexamples that are mentioned after Theorem 9.C.7 can also be found in that paper. The results about the PRD order are taken from Yanagimoto and Okamoto [570] and from Fang and Joe [192]. In addition to the characterizations (9.C.19)–(9.C.21) of the PRD order, the reader may find another characterization in Rüschendorf [484].
In addition to Examples 9.C.10–9.C.12, many other examples of pairs of random vectors that are PRD-ordered can be found in Fang and Joe [192]. Hollander, Proschan, and Sconing [225] briefly considered some LTD and RTI orders that are different from the ones in Section 9.C. Colangelo [132] studied the relationships among these orders and the LTD and RTI orders in Section 9.C, and Colangelo, Scarsini, and Shaked [133] studied the relationships among these orders and the orthant ratio orders. Block, Chhetry, Fang, and Sampson [92] found necessary and sufficient conditions (by means of orders of permutations) for two bivariate empirical distributions to be ordered according to the PRD order. Some variations of the PRD order are discussed in Capéraà and Genest [120] and in Fang and Joe [192].


Avérous, Genest, and Kochar [26] introduced an extension of the PRD order which compares bivariate random vectors that need not have the same univariate marginals. Their order is equivalent to the requirement that the corresponding copulas are ordered in the PRD order. Hollander, Proschan, and Sconing [225] briefly discussed the order according to which $(X_1, X_2)$ is smaller than $(Y_1, Y_2)$ if $G_{Y_2|Y_1}(x_2 \mid x_1) - F_{X_2|X_1}(x_2 \mid x_1)$ is increasing in $x_1$ for all $x_2$.

Section 9.D: Most of the material in this section is taken from Kimeldorf and Sampson [295]. The conditions under which Archimedean copulas are ordered in the PLRD sense (Example 9.D.5) can be found in Joe [262]. The comparison of two bivariate normal random vectors in the PLRD sense (Example 9.D.6) is taken from Genest and Verret [208]. Yanagimoto [569] introduced a collection of 16 orders based on the idea of (9.D.2). He did it by requiring (9.D.2) to hold for special choices of intervals $I_1$, $I_2$, $J_1$, and $J_2$. The PQD order is one of the 16 orders in the collection of Yanagimoto. Metry and Sampson [391] extended Yanagimoto’s idea and presented a more general approach for generating positive dependence orderings. That approach makes it fairly easy to study the properties of the resulting orders and the interrelationships among them. Yanagimoto [569] also introduced an order that is similar to the PLRD order, and which applies to random vectors of dimension $n \ge 2$. Kemperman [284] and Karlin and Rinott [278] suggested an order according to which the bivariate distribution F (with density f) is smaller than the bivariate distribution G (with density g) if

$$f(x_1, y_1)\, g(x_2, y_2) \ge f(x_1, y_2)\, g(x_2, y_1) \quad \text{whenever } x_1 \le x_2 \text{ and } y_1 \le y_2.$$

This order has not been studied in the literature as a positive dependence order. In fact, Kimeldorf and Sampson [295] have noticed that it does not satisfy some of the basic axioms that they introduced.

Section 9.E: The definition and many properties of associated random variables can be found in Esary, Proschan, and Walkup [184]. Most of the results described in this section are taken from Schriever [498] and from Fang and Joe [192]. In addition to Examples 9.E.4–9.E.6, many other examples of pairs of random vectors that are ordered by association can be found in Fang and Joe [192]. Some variations of the association order are also discussed in that paper. Block, Chhetry, Fang, and Sampson [92] found necessary and sufficient conditions (by means of orders of permutations) for two bivariate empirical distributions to be ordered according to the association order. The main result about the weak association order (Theorem 9.E.8) is extracted from Rüschendorf [486]; see also Yi and Tongyu [574].


Kimeldorf and Sampson [296] and Hollander, Proschan, and Sconing [225] discuss briefly an order according to which $(X_1, X_2)$ is smaller than $(Y_1, Y_2)$ if

$$\mathrm{Cov}(K(X_1, X_2), L(X_1, X_2)) \le \mathrm{Cov}(K(Y_1, Y_2), L(Y_1, Y_2))$$

for all increasing functions K and L for which the covariance is well defined. Kimeldorf and Sampson [296] showed that this order does not satisfy one of their axioms. This order can clearly be extended to the case in which the dimension is $n \ge 2$.

Section 9.F: Most of the results in this section are taken from Shaked [501] and from Rinott and Pollak [467]. One can prove Theorem 9.F.1 using the method of proof of Theorem 3.1 in Shaked [501]. Tong [550] has listed some examples of vectors X and Y that satisfy (9.F.5), and has shown some applications of this order. Rinott and Pollak [467] have essentially shown that if $(X_1, X_2) \le_{\mathrm{PDD}} (Y_1, Y_2)$, then some of the first-passage times of related Gaussian processes are ordered in the usual stochastic order.

Section 9.G: The results in this section are mostly taken from Shaked and Tong [523]. Many examples of pairs of exchangeable vectors that satisfy the orders $\le_{\mathrm{pd}\text{-}k}$, $k = 1, 2, 3, 4$, are listed in that paper. Further examples can be found in Shaked and Tong [522]. The relationship between the star order and the order $\le_{\mathrm{pd}\text{-}4}$ (Theorem 9.G.4) is taken from Barlow and Proschan [35]; a slightly stronger result can be found in Shaked [502]. Gupta and Richards [218] have given examples of pairs of multivariate Liouville distributions that are ordered according to $\le_{\mathrm{pd}\text{-}1}$ and therefore also according to $\le_{\mathrm{pd}\text{-}2}$ and $\le_{\mathrm{pd}\text{-}4}$. Shaked and Tong [523] have noted that, intuitively, exchangeable random vectors are “more positively dependent” if, and only if, they are “less dispersed.” Thus they suggested defining orderings according to which $(X_1, X_2, \ldots, X_n)$ is smaller than $(Y_1, Y_2, \ldots, Y_n)$ if

$$E\phi(X_1, X_2, \ldots, X_n) \ge E\phi(Y_1, Y_2, \ldots, Y_n)$$

for every $\phi$ which belongs to some properly chosen class of permutation symmetric functions. In addition to the classes defined in (9.G.1), (9.G.2) and (9.G.4) [there exists also a class under which the above inequality gives (9.G.5)], a natural choice of such a class is the class of all Schur-convex functions. Chang [124] considered some orders that are defined by the above inequality for several classes of permutation symmetric functions. His paper contains a rich bibliography regarding several stochastic majorization orders. Mosler [399, Section 7.6] introduced some notions of positive dependence orders that are based on volumes of central regions.
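The remark in the notes on Section 9.A, that for bivariate vectors with integer-valued components the comparison $F \le_{\mathrm{PQD}} G$ amounts to comparing the partial sums of two probability matrices, can be illustrated by a short Python sketch. The sketch assumes the standard formulation of the PQD order, namely that F and G have the same marginals and $F(x_1, x_2) \le G(x_1, x_2)$ for all $(x_1, x_2)$. The function names `joint_cdf`, `same_marginals`, and `is_pqd_smaller`, as well as the tolerance parameter `tol`, are illustrative choices and do not come from the text.

```python
from itertools import accumulate
from math import isclose


def joint_cdf(p):
    """Partial sums F[i][j] = sum of p[a][b] over a <= i and b <= j."""
    # Cumulate along each row, then down each column of the result.
    row_cumulated = [list(accumulate(row)) for row in p]
    col_cumulated = [list(accumulate(col)) for col in zip(*row_cumulated)]
    # Transpose back so that the result has the same orientation as p.
    return [list(row) for row in zip(*col_cumulated)]


def same_marginals(p, q, tol=1e-12):
    """Check that p and q have equal row sums and equal column sums."""
    rows_match = all(isclose(sum(a), sum(b), abs_tol=tol)
                     for a, b in zip(p, q))
    cols_match = all(isclose(sum(a), sum(b), abs_tol=tol)
                     for a, b in zip(zip(*p), zip(*q)))
    return rows_match and cols_match


def is_pqd_smaller(p, q, tol=1e-12):
    """Return True if the matrix p is smaller than q in the PQD sense
    (assumed definition: same marginals and pointwise smaller joint cdf)."""
    if len(p) != len(q) or any(len(a) != len(b) for a, b in zip(p, q)):
        raise ValueError("p and q must have the same shape")
    if not same_marginals(p, q, tol):
        return False
    f, g = joint_cdf(p), joint_cdf(q)
    return all(fv <= gv + tol
               for f_row, g_row in zip(f, g)
               for fv, gv in zip(f_row, g_row))


# Three 2 x 2 probability matrices with uniform marginals (hypothetical data).
independent = [[0.25, 0.25], [0.25, 0.25]]
comonotone = [[0.5, 0.0], [0.0, 0.5]]
countermonotone = [[0.0, 0.5], [0.5, 0.0]]

print(is_pqd_smaller(independent, comonotone))
print(is_pqd_smaller(countermonotone, independent))
```

Under the stated assumption, the independent table lies between the countermonotone and the comonotone tables, so both calls should report that the first argument is the smaller one.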

References

1. Aalen, O.O., Hoem, J.M.: Random time changes for multivariate counting processes. Scandinavian Actuarial Journal, 81–101 (1978)
2. Adell, J.A., Badía, F.G., de la Cal, J.: Beta-type operators preserve shape properties. Stochastic Processes and Their Applications 48, 1–8 (1993)
3. Adell, J.A., de la Cal, J.: Optimal Poisson approximation of uniform empirical processes. Stochastic Processes and Their Applications 64, 135–142 (1996)
4. Adell, J.A., Lekuona, A.: Taylor’s formula and preservation of generalized convexity for positive linear operators. Journal of Applied Probability 37, 765–777 (2000)
5. Adell, J.A., Perez-Palomares, A.: Stochastic orders in preservation properties by Bernstein-type operators. Advances in Applied Probability 31, 492–507 (1999)
6. Ahmadi, J., Arghami, N.R.: Some univariate stochastic orders on record values. Communications in Statistics—Theory and Methods 30, 69–74 (2001)
7. Ahmed, A.-H. N.: Preservation properties for the mean residual life ordering. Statistical Papers 29, 143–150 (1988)
8. Ahmed, A.N., Alzaid, A., Bartoszewicz, J., Kochar, S.C.: Dispersive and superadditive ordering. Advances in Applied Probability 18, 1019–1022 (1986)
9. Ahmed, A.N., Soliman, A.A., Khider, S.E.: On some partial ordering of interest in reliability. Microelectronics Reliability 36, 1337–1346 (1996)
10. Ahmed, A.N., Soliman, A.A., Khider, S.E.: Preservation results for ordered random variables, with applications to reliability theory. Microelectronics Reliability 37, 277–287 (1997)
11. Alzaid, A., Kim, J.S., Proschan, F.: Laplace ordering and its applications. Journal of Applied Probability 28, 116–130 (1991)
12. Alzaid, A.A.: Mean residual life ordering. Statistical Papers 29, 35–43 (1988)
13. Alzaid, A.A.: Length-biased orderings with applications. Probability in the Engineering and Informational Sciences 2, 329–341 (1988)
14. Alzaid, A.A., Proschan, F.: Dispersivity and stochastic majorization. Statistics and Probability Letters 13, 275–278 (1992)
15. Arcones, M.A., Kvam, P.H., Samaniego, F.J.: Nonparametric estimation of a distribution subject to a stochastic precedence constraint. Journal of the American Statistical Association 97, 170–182 (2002)
16. Argon, N.T., Andradóttir, S.: Partial pooling in tandem lines with cooperation and blocking. Queueing Systems 52, 5–30 (2006)
17. Arias-Nicolás, J.P., Fernández-Ponce, J.M., Luque-Calvo, P., Suárez-Llorens, A.: Multivariate dispersion order and the notion of copula applied to the multivariate t-distribution. Probability in the Engineering and Informational Sciences 19, 363–375 (2005)
18. Arjas, E.: A stochastic process approach to multivariate reliability systems: Notions based on conditional stochastic order. Mathematics of Operations Research 6, 263–276 (1981)
19. Arnold, B.C.: Majorization and the Lorenz Order: A Brief Introduction. Springer-Verlag, New York (1987)
20. Arnold, B.C.: Inequality measures for multivariate distributions. Metron 63, 317–327 (2005)
21. Arnold, B.C., Villasenor, J.A.: Lorenz ordering of order statistics and record values. In: Balakrishnan, N., Rao, C.R. (ed) Handbook of Statistics 16: Order Statistics: Theory and Methods. Elsevier, Amsterdam, 75–87 (1998)
22. Arrow, K.J.: Essays in the Theory of Risk-Bearing. North-Holland, New York (1974)
23. Asadi, M., Shanbhag, D.N.: Hazard measure and mean residual life orderings: A unified approach. In: Balakrishnan, N., Rao, C.R. (ed) Handbook of Statistics 20: Advances in Reliability. Elsevier, Amsterdam, 199–214 (2001)
24. Atakan, A.E.: Stochastic convexity in dynamic programming. Economic Theory 22, 447–455 (2003)
25. Avérous, J., Dortet-Bernadet, J.-L.: LTD and RTI dependence orderings. Canadian Journal of Statistics 28, 151–157 (2000)
26. Avérous, J., Genest, C., Kochar, S.C.: On the dependence structure of order statistics. Journal of Multivariate Analysis 94, 159–171 (2005)
27. Baccelli, F., Makowski, A.M.: Multi-dimensional stochastic ordering and associated random variables. Operations Research 37, 478–487 (1989)
28. Baccelli, F., Makowski, A.M.: Stochastic orders associated with the forward recurrence time of a renewal process. Technical Report. Department of Electrical Engineering, University of Maryland, College Park (1992)
29. Bagai, I., Kochar, S.C.: On tail-ordering and comparison of failure rates. Communications in Statistics—Theory and Methods 15, 1377–1388 (1986)
30. Baker, E.: Increasing risk and increasing informativeness: Equivalence theorems. Operations Research 54, 26–36 (2006)
31. Bapat, R.B., Kochar, S.C.: On likelihood-ratio ordering of order statistics. Linear Algebra and Its Applications 199, 281–291 (1994)
32. Barlow, R.E., Bartholomew, D.J., Bremner, J.M., Brunk, H.D.: Statistical Inference under Order Restrictions. Wiley, New York (1972)
33. Barlow, R.E., Campo, R.: Total time on test processes and applications to failure data analysis. In: Barlow, R.E., Fussel, R., Singpurwalla, N.D. (ed) Reliability and Fault Tree Analysis. SIAM, Philadelphia, 451–481 (1975)
34. Barlow, R.E., Doksum, K.A.: Isotonic tests for convex ordering. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability 1. University of California Press, Berkeley, 293–323 (1972)
35. Barlow, R.E., Proschan, F.: Inequalities for linear combinations of order statistics from restricted families. Annals of Mathematical Statistics 37, 1574–1592 (1966)
36. Barlow, R.E., Proschan, F.: Statistical Theory of Reliability and Life Testing, Probability Models. Holt, Rinehart, and Winston, New York (1975)
37. Bartoszewicz, J.: Moment inequalities for order statistics from ordered families of distributions. Metrika 32, 383–389 (1985)
38. Bartoszewicz, J.: Dispersive ordering and monotone failure rate distributions. Advances in Applied Probability 17, 472–474 (1985)
39. Bartoszewicz, J.: Dispersive ordering and the total time on test transformation. Statistics and Probability Letters 4, 285–288 (1986)
40. Bartoszewicz, J.: A note on dispersive ordering defined by hazard functions. Statistics and Probability Letters 6, 13–16 (1987)
41. Bartoszewicz, J.: Quantile inequalities for linear combinations of order statistics from ordered families of distributions. Applicationes Mathematicae 21, 575–589 (1993)
42. Bartoszewicz, J.: Stochastic order relations and the total time on test transform. Statistics and Probability Letters 22, 103–110 (1995)
43. Bartoszewicz, J.: Tail orderings and the total time on test transform. Applicationes Mathematicae 24, 77–86 (1996)
44. Bartoszewicz, J.: Dispersive functions and stochastic orders. Applicationes Mathematicae 24, 429–444 (1997)
45. Bartoszewicz, J.: Applications of a general composition theorem to the star order of distributions. Statistics and Probability Letters 38, 1–9 (1998)
46. Bartoszewicz, J.: Characterizations of the dispersive order of distributions by the Laplace transform. Statistics and Probability Letters 40, 23–29 (1998)
47. Bartoszewicz, J.: Characterizations of stochastic orders based on ratios of Laplace transforms. Statistics and Probab


To my wife Edith and to my children Tal, Shanna, and Lila
M.S.

To my wife Mellony and to my children Devin, Rajan, and Sohan
J.G.S.


Preface

Stochastic orders and inequalities have been used during the last 40 years, at an accelerated rate, in many diverse areas of probability and statistics. Such areas include reliability theory, queuing theory, survival analysis, biology, economics, insurance, actuarial science, operations research, and management science. The purpose of this book is to collect in one place essentially all that is known about these orders up to the present. In addition, the book illustrates some of the usefulness and applicability of these stochastic orders.

This book is a major extension of the first six chapters in Shaked and Shanthikumar [515]. The idea that led us to write those six chapters arose as follows. In our own research in reliability theory and operations research we have been using, for years, several notions of stochastic orders. Often we would encounter a result that we could easily (or not so easily) prove, but we could not tell whether it was known or new. Even when we were sure that a result was known, we would not know right away where it could be found. Also, sometimes we would prove a result for the purpose of an application, only to realize later that a stronger result (stronger than what we needed) had already been derived elsewhere. We also often have had difficulties giving a reference for one source that contained everything about stochastic orders that we needed in a particular paper. In order to avoid such difficulties we wrote the first six chapters in Shaked and Shanthikumar [515]. Since 1994 the theory of stochastic orders has grown significantly. We think that now is the time to put in one place essentially all that is known about these orders. This book is the result of this effort.

The simplest way of comparing two distribution functions is by the comparison of the associated means. However, such a comparison is based on only two single numbers (the means), and therefore it is often not very informative. In addition to this, the means sometimes do not exist. In many instances in applications one has more detailed information, for the purpose of comparison of two distribution functions, than just the two means. Several orders of distribution functions, that take into account various forms of possible knowledge about the two underlying distribution functions, are studied in Chapters 1 and 2.

When one wishes to compare two distribution functions that have the same mean (or that are centered about the same value), one is usually interested in the comparison of the dispersion of these distributions. The simplest way of doing it is by the comparison of the associated standard deviations. However, such a comparison, again, is based on only two single numbers, and therefore it is often not very informative. In addition to this, again, the standard deviations sometimes do not exist. Several orders of distribution functions, which take into account various forms of possible knowledge about the two underlying distribution functions (in addition to the fact that they are centered about the same value), are studied in Chapter 3.

Orders that can be used for the joint comparison of both the location and the dispersion of distribution functions are studied in Chapters 4 and 5. The analogous orders for multivariate distribution functions are studied in Chapters 6 and 7.

When one is interested in the comparison of a sequence of distribution functions, associated with the random variables $X_i$, $i = 1, 2, \ldots$, then one can use, of course, any of the orders described in Chapters 1–7 for the purpose of comparing any two of these distributions. However, the parameter i may now introduce some patterns that connect all the underlying distributions. For example, suppose not only that the random variables $X_i$, $i = 1, 2, \ldots$, increase stochastically in i, but also that the increase is sharper for larger i’s. Then the sequence $X_i$, $i = 1, 2, \ldots$, is stochastically increasing in a convex sense. Such notions of stochastic convexity and concavity are studied in Chapter 8.

Notions of positive dependence of two random variables $X_1$ and $X_2$ have been introduced in the literature in an effort to mathematically describe the property that “large (respectively, small) values of $X_1$ go together with large (respectively, small) values of $X_2$.” Many of these notions of positive dependence are defined by means of some comparison of the joint distribution of $X_1$ and $X_2$ with their distribution under the theoretical assumption that $X_1$ and $X_2$ are independent. Often such a comparison can be extended to general pairs of bivariate distributions with given marginals. This fact led researchers to introduce various notions of positive dependence orders. These orders are designed to compare the strength of the positive dependence of the two underlying bivariate distributions. Many of these orders can be further extended to comparisons of general multivariate distributions that have the same marginals. In Chapter 9 we describe these orders.

We have in mind a wide spectrum of readers and users of this book. On one hand, the text can be useful for those who are already familiar with many aspects of stochastic orders, but who are not aware of all the developments in this area. On the other hand, people who are not very familiar with stochastic orders, but who know something about them, can use this book for the purpose of studying or widening their knowledge and understanding of this important area.


We wish to thank Haijun Li, Asok K. Nanda, and Taizhong Hu for critical readings of several drafts of the manuscript. Their comments led to a substantial improvement in the presentation of some of the results in these chapters. We also thank Yigal Gerchak and Marco Scarsini for some illuminating suggestions. We thank our academic advisors John A. Buzacott (of J. G. S.) and Albert W. Marshall (of M. S.) who, years ago, introduced us to some aspects of the area of stochastic orders.

Tucson, Berkeley, August 16, 2006

Moshe Shaked J. George Shanthikumar

Contents

1

2

Univariate Stochastic Orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.A The Usual Stochastic Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.A.1 Deﬁnition and equivalent conditions . . . . . . . . . . . . . . . . . 1.A.2 A characterization by construction on the same probability space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.A.3 Closure properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.A.4 Further characterizations and properties . . . . . . . . . . . . . . 1.A.5 Some properties in reliability theory . . . . . . . . . . . . . . . . . 1.B The Hazard Rate Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.B.1 Deﬁnition and equivalent conditions . . . . . . . . . . . . . . . . . 1.B.2 The relation between the hazard rate and the usual stochastic orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.B.3 Closure properties and some characterizations . . . . . . . . . 1.B.4 Comparison of order statistics . . . . . . . . . . . . . . . . . . . . . . . 1.B.5 Some properties in reliability theory . . . . . . . . . . . . . . . . . 1.B.6 The reversed hazard order . . . . . . . . . . . . . . . . . . . . . . . . . . 1.C The Likelihood Ratio Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.C.1 Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.C.2 The relation between the likelihood ratio and the hazard and reversed hazard orders . . . . . . . . . . . . . . . . . . . 1.C.3 Some properties and characterizations . . . . . . . . . . . . . . . . 1.C.4 Shifted likelihood ratio orders . . . . . . . . . . . . . . . . . . . . . . . 1.D The Convolution Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.E Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mean Residual Life Orders . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . 2.A The Mean Residual Life Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.A.1 Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.A.2 The relation between the mean residual life and some other stochastic orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.A.3 Some closure properties . . . . . . . . . . . . . . . . . . . . . . . . . . . .

3 3 3 4 5 8 15 16 16 18 18 31 35 36 42 42 43 44 66 70 71 81 81 81 83 86

XII

Contents

2.A.4 A property in reliability theory . . . . . . . . . . . . . . . . . . . . . . 94 2.B The Harmonic Mean Residual Life Order . . . . . . . . . . . . . . . . . . . 94 2.B.1 Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 2.B.2 The relation between the harmonic mean residual life and some other stochastic orders . . . . . . . . . . . . . . . . . . . . 95 2.B.3 Some closure properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 2.B.4 Properties in reliability theory . . . . . . . . . . . . . . . . . . . . . . 105 2.C Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 3

Univariate Variability Orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 3.A The Convex Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 3.A.1 Deﬁnition and equivalent conditions . . . . . . . . . . . . . . . . . 109 3.A.2 Closure and other properties . . . . . . . . . . . . . . . . . . . . . . . . 119 3.A.3 Conditions that lead to the convex order . . . . . . . . . . . . . 133 3.A.4 Some properties in reliability theory . . . . . . . . . . . . . . . . . 138 3.A.5 The m-convex orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 3.B The Dispersive Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146 3.B.1 Deﬁnition and equivalent conditions . . . . . . . . . . . . . . . . . 146 3.B.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 3.C The Excess Wealth Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 3.C.1 Motivation and deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 3.C.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 3.D The Peakedness Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 3.D.1 Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 3.D.2 Some properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 3.E Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

4

Univariate Monotone Convex and Related Orders . . . . . . . . . 181 4.A The Monotone Convex and Monotone Concave Orders . . . . . . . 181 4.A.1 Deﬁnitions and equivalent conditions . . . . . . . . . . . . . . . . . 181 4.A.2 Closure properties and some characterizations . . . . . . . . . 185 4.A.3 Conditions that lead to the increasing convex and increasing concave orders . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 4.A.4 Further properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 4.A.5 Some properties in reliability theory . . . . . . . . . . . . . . . . . 203 4.A.6 The starshaped order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 4.A.7 Some related orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206 4.B Transform Orders: Convex, Star, and Superadditive Orders . . . 213 4.B.1 Deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213 4.B.2 Some properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 4.B.3 Some related orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 4.C Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227

Contents

XIII

5

The Laplace Transform and Related Orders . . . . . . . . . . . . . . . 233 5.A The Laplace Transform Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233 5.A.1 Deﬁnitions and equivalent conditions . . . . . . . . . . . . . . . . . 233 5.A.2 Closure and other properties . . . . . . . . . . . . . . . . . . . . . . . . 235 5.B Orders Based on Ratios of Laplace Transforms . . . . . . . . . . . . . . 245 5.B.1 Deﬁnitions and equivalent conditions . . . . . . . . . . . . . . . . . 245 5.B.2 Closure properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246 5.B.3 Relationship to other stochastic orders . . . . . . . . . . . . . . . 249 5.C Some Related Orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252 5.C.1 The factorial moments order . . . . . . . . . . . . . . . . . . . . . . . . 252 5.C.2 The moments order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255 5.C.3 The moment generating function order . . . . . . . . . . . . . . . 260 5.D Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261

6

Multivariate Stochastic Orders
  6.A Notations and Preliminaries
  6.B The Usual Multivariate Stochastic Order
    6.B.1 Definition and equivalent conditions
    6.B.2 A characterization by construction on the same probability space
    6.B.3 Conditions that lead to the multivariate usual stochastic order
    6.B.4 Closure properties
    6.B.5 Further properties
    6.B.6 A property in reliability theory
    6.B.7 Stochastic ordering of stochastic processes
  6.C The Cumulative Hazard Order
    6.C.1 Definition
    6.C.2 The relationship between the cumulative hazard order and the usual multivariate stochastic order
  6.D Multivariate Hazard Rate Orders
    6.D.1 Definitions and basic properties
    6.D.2 Preservation properties
    6.D.3 The dynamic multivariate hazard rate order
  6.E The Multivariate Likelihood Ratio Order
    6.E.1 Definition
    6.E.2 Some properties
    6.E.3 A property in reliability theory
  6.F The Multivariate Mean Residual Life Order
    6.F.1 Definition
    6.F.2 The relation between the multivariate mean residual life and the dynamic multivariate hazard rate orders
    6.F.3 A property in reliability theory
  6.G Other Multivariate Stochastic Orders
    6.G.1 The orthant orders
    6.G.2 The scaled order statistics orders
  6.H Complements

7 Multivariate Variability and Related Orders
  7.A The Monotone Convex and Monotone Concave Orders
    7.A.1 Definitions
    7.A.2 Closure properties
    7.A.3 Further properties
    7.A.4 Convex and concave ordering of stochastic processes
    7.A.5 The (m1, m2)-icx orders
    7.A.6 The symmetric convex order
    7.A.7 The componentwise convex order
    7.A.8 The directional convex and concave orders
    7.A.9 The orthant convex and concave orders
  7.B Multivariate Dispersion Orders
    7.B.1 A strong multivariate dispersion order
    7.B.2 A weak multivariate dispersion order
    7.B.3 Dispersive orders based on constructions
  7.C Multivariate Transform Orders: Convex, Star, and Superadditive Orders
  7.D The Multivariate Laplace Transform and Related Orders
    7.D.1 The multivariate Laplace transform order
    7.D.2 The multivariate factorial moments order
    7.D.3 The multivariate moments order
  7.E Complements

8 Stochastic Convexity and Concavity
  8.A Regular Stochastic Convexity
    8.A.1 Definitions
    8.A.2 Closure properties
    8.A.3 Stochastic m-convexity
  8.B Sample Path Convexity
    8.B.1 Definitions
    8.B.2 Closure properties
  8.C Convexity in the Usual Stochastic Order
    8.C.1 Definitions
    8.C.2 Closure properties
  8.D Strong Stochastic Convexity
    8.D.1 Definitions
    8.D.2 Closure properties
  8.E Stochastic Directional Convexity
    8.E.1 Definitions
    8.E.2 Closure properties
  8.F Complements

9 Positive Dependence Orders
  9.A The PQD and the Supermodular Orders
    9.A.1 Definition and basic properties: The bivariate case
    9.A.2 Closure properties
    9.A.3 The multivariate case
    9.A.4 The supermodular order
  9.B The Orthant Ratio Orders
    9.B.1 The (weak) orthant ratio orders
    9.B.2 The strong orthant ratio orders
  9.C The LTD, RTI, and PRD Orders
  9.D The PLRD Order
  9.E Association Orders
  9.F The PDD Order
  9.G Ordering Exchangeable Distributions
  9.H Complements

References
Author Index
Subject Index

Note

Throughout the book “increasing” means “nondecreasing” and “decreasing” means “nonincreasing.” Expectations are assumed to exist whenever they are written. The “inverse” of a monotone function (which is not strictly monotone) means the right continuous version of it, unless stated otherwise. For example, if $F$ is a distribution function, then the right continuous version of its inverse is $F^{-1}(u) = \sup\{x : F(x) \le u\}$, $u \in [0, 1]$.

The following aging notions will be encountered often throughout the text. Let $X$ be a random variable with distribution function $F$ and survival function $\bar F \equiv 1 - F$.

(i) The random variable $X$ (or its distribution) is said to be IFR [increasing failure rate] if $\bar F$ is logconcave. It is said to be DFR [decreasing failure rate] if $\bar F$ is logconvex.

(ii) The nonnegative random variable $X$ (or its distribution) is said to be IFRA [increasing failure rate average] if $-\log \bar F$ is starshaped; that is, if $-\log \bar F(t)/t$ is increasing in $t \ge 0$. It is said to be DFRA [decreasing failure rate average] if $-\log \bar F$ is antistarshaped; that is, if $-\log \bar F(t)/t$ is decreasing in $t \ge 0$.

(iii) The nonnegative random variable $X$ (or its distribution) is said to be NBU [new better than used] if $\bar F(s)\bar F(t) \ge \bar F(s + t)$ for all $s \ge 0$ and $t \ge 0$. It is said to be NWU [new worse than used] if $\bar F(s)\bar F(t) \le \bar F(s + t)$ for all $s \ge 0$ and $t \ge 0$.

(iv) The random variable $X$ (or its distribution) is said to be DMRL [decreasing mean residual life] if
$$\frac{\int_t^\infty \bar F(s)\,ds}{\bar F(t)}$$
is decreasing in $t$ over $\{t : \bar F(t) > 0\}$. It is said to be IMRL [increasing mean residual life] if the same ratio is increasing in $t$ over $\{t : \bar F(t) > 0\}$.

(v) The nonnegative random variable $X$ (or its distribution) is said to be NBUE [new better than used in expectation] if
$$\frac{\int_t^\infty \bar F(s)\,ds}{\bar F(t)} \le EX \quad \text{for all } t \ge 0.$$
It is said to be NWUE [new worse than used in expectation] if
$$\frac{\int_t^\infty \bar F(s)\,ds}{\bar F(t)} \ge EX \quad \text{for all } t \ge 0.$$

The majorization order will be used in some places in the text. Recall from Marshall and Olkin [383] that a vector $a = (a_1, a_2, \dots, a_n)$ is said to be smaller in the majorization order than the vector $b = (b_1, b_2, \dots, b_n)$ (denoted $a \prec b$) if $\sum_{i=1}^n a_i = \sum_{i=1}^n b_i$ and if $\sum_{i=1}^j a_{[i]} \le \sum_{i=1}^j b_{[i]}$ for $j = 1, 2, \dots, n - 1$, where $a_{[i]}$ [$b_{[i]}$] is the $i$th largest element of $a$ [$b$], $i = 1, 2, \dots, n$. An $n$-dimensional function $\phi$ is called Schur convex [concave] if $a \prec b \Longrightarrow \phi(a) \le [\ge]\ \phi(b)$.

The notation $\mathbb{N} \equiv \{\dots, -1, 0, 1, \dots\}$, $\mathbb{N}_+ \equiv \{0, 1, \dots\}$, and $\mathbb{N}_{++} \equiv \{1, 2, \dots\}$ will be used in this text.
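The definition of majorization above can be checked mechanically for finite vectors. The following Python sketch is only an illustration; the function name `is_majorized` and the numerical tolerance `tol` are assumptions introduced here and are not notation from the text.

```python
def is_majorized(a, b, tol=1e-12):
    """Return True if a is smaller than b in the majorization order (a ≺ b).

    Hypothetical helper: compares the totals, then the partial sums of the
    j largest elements for j = 1, ..., n - 1.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    a_desc = sorted(a, reverse=True)  # a_[1] >= a_[2] >= ... >= a_[n]
    b_desc = sorted(b, reverse=True)
    if abs(sum(a_desc) - sum(b_desc)) > tol:
        return False  # the totals must agree
    partial_a = partial_b = 0.0
    for x, y in zip(a_desc[:-1], b_desc[:-1]):
        partial_a += x
        partial_b += y
        if partial_a > partial_b + tol:
            return False  # a partial sum of a exceeds that of b
    return True
```

For instance, the constant vector (2, 2, 2) is majorized by (1, 2, 3), while the reverse relation fails.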

1 Univariate Stochastic Orders

In this chapter we study stochastic orders that compare the “location” or the “magnitude” of random variables. The most important and common orders that are considered in this chapter are the usual stochastic order $\le_{\rm st}$, the hazard rate order $\le_{\rm hr}$, and the likelihood ratio order $\le_{\rm lr}$. Some variations of these orders, and some related orders, are also examined in this chapter.

1.A The Usual Stochastic Order

1.A.1 Definition and equivalent conditions

Let $X$ and $Y$ be two random variables such that
$$P\{X > x\} \le P\{Y > x\} \quad \text{for all } x \in (-\infty, \infty). \tag{1.A.1}$$
Then $X$ is said to be smaller than $Y$ in the usual stochastic order (denoted by $X \le_{\rm st} Y$). Roughly speaking, (1.A.1) says that $X$ is less likely than $Y$ to take on large values, where “large” means any value greater than $x$, and that this is the case for all $x$’s. Note that (1.A.1) is the same as
$$P\{X \le x\} \ge P\{Y \le x\} \quad \text{for all } x \in (-\infty, \infty). \tag{1.A.2}$$
It is easy to verify (by noting that every closed interval is an infinite intersection of open intervals) that $X \le_{\rm st} Y$ if, and only if,
$$P\{X \ge x\} \le P\{Y \ge x\} \quad \text{for all } x \in (-\infty, \infty). \tag{1.A.3}$$
In fact, we can recast (1.A.1) and (1.A.3) in a seemingly more general, but actually an equivalent, way as follows:
$$P\{X \in U\} \le P\{Y \in U\} \quad \text{for all upper sets } U \subseteq (-\infty, \infty). \tag{1.A.4}$$
(In the univariate case, that is on the real line, a set $U$ is an upper set if, and only if, it is an open or a closed right half line.) In the univariate case the equivalence of (1.A.4) with (1.A.1) and (1.A.3) is trivial, but in Chapter 6 it will be seen that the generalizations of each of these three conditions to the multivariate case yield different definitions of stochastic orders. Still another way of rewriting (1.A.1) or (1.A.3) is the following:
$$E[I_U(X)] \le E[I_U(Y)] \quad \text{for all upper sets } U \subseteq (-\infty, \infty), \tag{1.A.5}$$

where $I_U$ denotes the indicator function of $U$. From (1.A.5) it follows that if $X \le_{\rm st} Y$, then
$$E\Big[\sum_{i=1}^m a_i I_{U_i}(X)\Big] - b \le E\Big[\sum_{i=1}^m a_i I_{U_i}(Y)\Big] - b \tag{1.A.6}$$
for all $a_i \ge 0$, $i = 1, 2, \dots, m$, $b \in (-\infty, \infty)$, and $m \ge 0$. Given an increasing function $\phi$, it is possible, for each $m$, to define a sequence of $U_i$’s, a sequence of $a_i$’s, and a $b$ (all of which may depend on $m$), such that as $m \to \infty$ then (1.A.6) converges to
$$E[\phi(X)] \le E[\phi(Y)], \tag{1.A.7}$$
provided the expectations exist. It follows that $X \le_{\rm st} Y$ if, and only if, (1.A.7) holds for all increasing functions $\phi$ for which the expectations exist.

The expressions $\int_x^\infty P\{X > y\}\,dy$ and $\int_x^\infty P\{Y > y\}\,dy$ are used extensively in Chapters 2, 3, and 4. It is of interest to note that $X \le_{\rm st} Y$ if, and only if,
$$\int_x^\infty P\{Y > y\}\,dy - \int_x^\infty P\{X > y\}\,dy \quad \text{is decreasing in } x \in (-\infty, \infty). \tag{1.A.8}$$

If $X$ and $Y$ are discrete random variables taking on values in $\mathbb{N}$, then we have the following. Let $p_i = P\{X = i\}$ and $q_i = P\{Y = i\}$, $i \in \mathbb{N}$. Then $X \le_{\rm st} Y$ if, and only if,
$$\sum_{j=-\infty}^{i} p_j \ge \sum_{j=-\infty}^{i} q_j, \quad i \in \mathbb{N},$$
or, equivalently, $X \le_{\rm st} Y$ if, and only if,
$$\sum_{j=i}^{\infty} p_j \le \sum_{j=i}^{\infty} q_j, \quad i \in \mathbb{N}.$$
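For integer-valued random variables with finite support, the tail-sum condition above can be verified directly. The sketch below is a minimal illustration, assuming each distribution is given as a dictionary mapping integers to probabilities; the names `upper_tail_sums` and `is_st_smaller` are hypothetical and do not come from the text.

```python
def upper_tail_sums(pmf, support):
    """Return P{X >= i} for each i in the (ascending) list `support`."""
    tails = []
    running = 0.0
    for i in reversed(support):
        running += pmf.get(i, 0.0)
        tails.append(running)
    return tails[::-1]


def is_st_smaller(p, q, tol=1e-12):
    """Check the tail-sum condition for X <=_st Y, with pmfs p (of X) and q (of Y).

    It suffices to compare the tails at the points of the combined support,
    because both tail functions are constant between consecutive points.
    """
    support = sorted(set(p) | set(q))
    tails_p = upper_tail_sums(p, support)
    tails_q = upper_tail_sums(q, support)
    return all(a <= b + tol for a, b in zip(tails_p, tails_q))
```

A fair coin on {0, 1} is stochastically smaller than a Binomial(2, 1/2) variable, whereas a variable uniform on {0, 2} and the constant 1 are not comparable in either direction.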

1.A.2 A characterization by construction on the same probability space

An important characterization of the usual stochastic order is the following theorem (here $=_{\rm st}$ denotes equality in law).

Theorem 1.A.1. Two random variables $X$ and $Y$ satisfy $X \le_{\rm st} Y$ if, and only if, there exist two random variables $\hat X$ and $\hat Y$, defined on the same probability space, such that
$$\hat X =_{\rm st} X, \tag{1.A.9}$$
$$\hat Y =_{\rm st} Y, \tag{1.A.10}$$
and
$$P\{\hat X \le \hat Y\} = 1. \tag{1.A.11}$$

Proof. Obviously (1.A.9), (1.A.10), and (1.A.11) imply that $X \le_{\rm st} Y$. In order to prove the necessity part of Theorem 1.A.1, let $F$ and $G$ be, respectively, the distribution functions of $X$ and $Y$, and let $F^{-1}$ and $G^{-1}$ be the corresponding right continuous inverses (see Note on page 1). Define $\hat X = F^{-1}(U)$ and $\hat Y = G^{-1}(U)$ where $U$ is a uniform $[0, 1]$ random variable. Then it is easy to see that $\hat X$ and $\hat Y$ satisfy (1.A.9) and (1.A.10). From (1.A.2) it is seen that (1.A.11) also holds.
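The construction in the proof can be imitated numerically. The following Python sketch assumes, purely for illustration, that $X$ and $Y$ are exponential with rates `rate_x` and `rate_y` (so that $F^{-1}(u) = -\log(1-u)/\text{rate}$); the function names and the seed are hypothetical choices made here.

```python
import math
import random


def exp_quantile(u, rate):
    """Inverse distribution function of an exponential variable with the given rate."""
    return -math.log1p(-u) / rate


def coupled_pairs(rate_x, rate_y, n, seed=0):
    """Draw n pairs (F^{-1}(U), G^{-1}(U)) that share the same uniform variable U."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.random()  # uniform on [0, 1)
        pairs.append((exp_quantile(u, rate_x), exp_quantile(u, rate_y)))
    return pairs
```

With `rate_x = 2.0` and `rate_y = 1.0` one has $X \le_{\rm st} Y$, and every simulated pair satisfies $\hat X \le \hat Y$, as (1.A.11) requires.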

Theorem 1.A.1 is a special case of a more general result that is stated in Section 6.B.2. From (1.A.2) and Theorem 1.A.1 it follows that the random variables $X$ and $Y$, with the respective distribution functions $F$ and $G$, satisfy $X \le_{\rm st} Y$ if, and only if,
$$F^{-1}(u) \le G^{-1}(u) \quad \text{for all } u \in (0, 1). \tag{1.A.12}$$
Another way of restating Theorem 1.A.1 is the following. We omit the obvious proof of it.

Theorem 1.A.2. Two random variables $X$ and $Y$ satisfy $X \le_{\rm st} Y$ if, and only if, there exist a random variable $Z$ and functions $\psi_1$ and $\psi_2$ such that $\psi_1(z) \le \psi_2(z)$ for all $z$ and $X =_{\rm st} \psi_1(Z)$ and $Y =_{\rm st} \psi_2(Z)$.

In some applications, when the random variables $X$ and $Y$ are such that $X \le_{\rm st} Y$, one may wish to construct a $\hat Y$ [$\hat X$] on the probability space on which $X$ [$Y$] is defined, such that $\hat Y =_{\rm st} Y$ and $P\{X \le \hat Y\} = 1$ [$\hat X =_{\rm st} X$ and $P\{\hat X \le Y\} = 1$]. This is always possible. Here we will show how this can be done when the distribution function $F$ [$G$] of $X$ [$Y$] is absolutely continuous. When this is the case, $F(X)$ [$G(Y)$] is uniformly distributed on $[0, 1]$, and therefore $\hat Y = G^{-1}(F(X))$ [$\hat X = F^{-1}(G(Y))$] is the desired construction $\hat Y$ [$\hat X$].

1.A.3 Closure properties

Using (1.A.1) through (1.A.11) it is easy to prove each of the following closure results. The following notation will be used: For any random variable $Z$ and an event $A$, let $[Z \mid A]$ denote any random variable that has as its distribution the conditional distribution of $Z$ given $A$.


Theorem 1.A.3. (a) If $X \le_{\rm st} Y$ and $g$ is any increasing [decreasing] function, then $g(X) \le_{\rm st} [\ge_{\rm st}]\ g(Y)$.

(b) Let $X_1, X_2, \dots, X_m$ be a set of independent random variables and let $Y_1, Y_2, \dots, Y_m$ be another set of independent random variables. If $X_i \le_{\rm st} Y_i$ for $i = 1, 2, \dots, m$, then, for any increasing function $\psi : \mathbb{R}^m \to \mathbb{R}$, one has
$$\psi(X_1, X_2, \dots, X_m) \le_{\rm st} \psi(Y_1, Y_2, \dots, Y_m).$$
In particular,
$$\sum_{j=1}^m X_j \le_{\rm st} \sum_{j=1}^m Y_j.$$
That is, the usual stochastic order is closed under convolutions.

(c) Let $\{X_j, j = 1, 2, \dots\}$ and $\{Y_j, j = 1, 2, \dots\}$ be two sequences of random variables such that $X_j \to_{\rm st} X$ and $Y_j \to_{\rm st} Y$ as $j \to \infty$, where “$\to_{\rm st}$” denotes convergence in distribution. If $X_j \le_{\rm st} Y_j$, $j = 1, 2, \dots$, then $X \le_{\rm st} Y$.

(d) Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta] \le_{\rm st} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Then $X \le_{\rm st} Y$. That is, the usual stochastic order is closed under mixtures.

In the next result and in the sequel we define $\sum_{j=1}^{0} a_j \equiv 0$ for any sequence $\{a_j, j = 1, 2, \dots\}$.

Theorem 1.A.4. Let $\{X_j, j = 1, 2, \dots\}$ be a sequence of nonnegative independent random variables, and let $M$ be a nonnegative integer-valued random variable which is independent of the $X_i$’s. Let $\{Y_j, j = 1, 2, \dots\}$ be another sequence of nonnegative independent random variables, and let $N$ be a nonnegative integer-valued random variable which is independent of the $Y_i$’s. If $X_i \le_{\rm st} Y_i$, $i = 1, 2, \dots$, and if $M \le_{\rm st} N$, then
$$\sum_{j=1}^{M} X_j \le_{\rm st} \sum_{j=1}^{N} Y_j.$$
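Theorem 1.A.4 can be illustrated exactly in a small discrete case. The sketch below assumes, for simplicity, that the summands within each sequence are identically distributed on $\{0, 1, \dots\}$ and that the counts $M$ and $N$ have finite support; all function names are hypothetical and chosen only for this example.

```python
def convolve(p, q):
    """Pmf of the sum of two independent variables with pmfs p and q on {0, 1, ...}."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def random_sum_pmf(count_pmf, summand_pmf):
    """Pmf of X_1 + ... + X_M, with P{M = m} = count_pmf[m] and i.i.d. summands."""
    total = []
    power = [1.0]  # pmf of the empty sum, which is identically 0
    for weight in count_pmf:
        total += [0.0] * (len(power) - len(total))
        for k, value in enumerate(power):
            total[k] += weight * value
        power = convolve(power, summand_pmf)
    return total


def st_leq(p, q, tol=1e-12):
    """Tail-sum check of p <=_st q for pmfs on {0, 1, ...}."""
    n = max(len(p), len(q))
    p = list(p) + [0.0] * (n - len(p))
    q = list(q) + [0.0] * (n - len(q))
    tail_p = tail_q = 0.0
    for i in reversed(range(n)):
        tail_p += p[i]
        tail_q += q[i]
        if tail_p > tail_q + tol:
            return False
    return True
```

Taking Bernoulli(0.3) and Bernoulli(0.5) summands with counts `[0.5, 0.5]` and `[0.2, 0.3, 0.5]`, the hypotheses $X_i \le_{\rm st} Y_i$ and $M \le_{\rm st} N$ hold, and the two random sums come out ordered in the same direction.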

Another related result is given next.

Theorem 1.A.5. Let $\{X_j, j = 1, 2, \dots\}$ be a sequence of nonnegative independent and identically distributed random variables, and let $M$ be a positive integer-valued random variable which is independent of the $X_i$’s. Let $\{Y_j, j = 1, 2, \dots\}$ be another sequence of independent and identically distributed random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$’s. Suppose that for some positive integer $K$ we have that
$$\sum_{j=1}^{K} X_j \le_{\rm st} [\ge_{\rm st}]\ Y_1$$
and
$$M \le_{\rm st} [\ge_{\rm st}]\ KN.$$
Then
$$\sum_{j=1}^{M} X_j \le_{\rm st} [\ge_{\rm st}]\ \sum_{j=1}^{N} Y_j.$$

Proof. The assumptions yield
$$\sum_{i=1}^{M} X_i \le_{\rm st} [\ge_{\rm st}]\ \sum_{i=1}^{KN} X_i = \sum_{i=1}^{N} \sum_{j=K(i-1)+1}^{Ki} X_j \le_{\rm st} [\ge_{\rm st}]\ \sum_{i=1}^{N} Y_i.$$

Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a subset of the real line $\mathbb{R}$. Let $X(\theta)$ denote a random variable with distribution function $G_\theta$. For any random variable $\Theta$ with support in $\mathcal{X}$, and with distribution function $F$, let us denote by $X(\Theta)$ a random variable with distribution function $H$ given by
$$H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.$$

The following result is a generalization of both parts (a) and (c) of Theorem 1.A.3.

Theorem 1.A.6. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\rm st} X(\Theta_i)$, $i = 1, 2$; that is, suppose that the distribution function of $Y_i$ is given by
$$H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$
If
$$X(\theta) \le_{\rm st} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{1.A.13}$$
and if
$$\Theta_1 \le_{\rm st} \Theta_2, \tag{1.A.14}$$
then
$$Y_1 \le_{\rm st} Y_2. \tag{1.A.15}$$

Proof. Note that, by (1.A.13), $P\{X(\theta) > y\}$ is increasing in $\theta$ for all $y$. Thus
$$P\{Y_1 > y\} = \int_{\mathcal{X}} P\{X(\theta) > y\}\,dF_1(\theta) \le \int_{\mathcal{X}} P\{X(\theta) > y\}\,dF_2(\theta) = P\{Y_2 > y\} \quad \text{for all } y,$$
where the inequality follows from (1.A.14) and (1.A.7). Thus (1.A.15) follows from (1.A.1).
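A finite version of Theorem 1.A.6 can be computed exactly. The sketch below assumes, as an illustration only, that $X(\theta)$ is Binomial$(n, \theta)$ (a family that is stochastically increasing in $\theta$) and that each mixing distribution is a dictionary mapping values of $\theta$ to weights; the function names are hypothetical.

```python
from math import comb


def binom_survival(n, theta, y):
    """P{X(theta) > y} for X(theta) ~ Binomial(n, theta)."""
    return sum(
        comb(n, k) * theta**k * (1 - theta) ** (n - k)
        for k in range(n + 1)
        if k > y
    )


def mixture_survival(n, mixing, y):
    """P{Y > y} where Y = X(Theta) and P{Theta = theta} = mixing[theta]."""
    return sum(weight * binom_survival(n, theta, y) for theta, weight in mixing.items())


def mixtures_ordered(n, mixing_1, mixing_2, tol=1e-12):
    """Check P{Y_1 > y} <= P{Y_2 > y} at every y in {0, ..., n - 1}."""
    return all(
        mixture_survival(n, mixing_1, y) <= mixture_survival(n, mixing_2, y) + tol
        for y in range(n)
    )
```

With `mixing_1 = {0.2: 0.5, 0.5: 0.5}` and `mixing_2 = {0.2: 0.25, 0.5: 0.25, 0.8: 0.5}` one has $\Theta_1 \le_{\rm st} \Theta_2$, and the check confirms $Y_1 \le_{\rm st} Y_2$ for $n = 5$.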


Note that, using the notation that is introduced below before Theorem 1.A.14, (1.A.13) can be rewritten as $\{X(\theta), \theta \in \mathcal{X}\} \in {\rm SI}$. The following example shows an application of Theorem 1.A.6 in the area of Bayesian imperfect repair; a related result is given in Example 1.B.16.

Example 1.A.7. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X} = (0, 1]$ and distribution functions $F_1$ and $F_2$, respectively. For some survival function $\bar K$, define $\bar G_\theta = \bar K^{1-\theta}$, $\theta \in (0, 1]$, and let $X(\theta)$ have the survival function $\bar K^{1-\theta}$. Note that (1.A.13) holds because $\bar K^{1-\theta}(y) \le \bar K^{1-\theta'}(y)$ for all $y$ whenever $0 < \theta \le \theta' \le 1$. Thus, if $\Theta_1 \le_{\rm st} \Theta_2$ then $Y_i$, with survival function $\bar H_i$ defined by
$$\bar H_i(y) = \int_0^1 \bar K^{1-\theta}(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2,$$
satisfy $Y_1 \le_{\rm st} Y_2$.

1.A.4 Further characterizations and properties

Clearly, if $X \le_{\rm st} Y$ then $EX \le EY$. However, as the following result shows, if two random variables are ordered in the usual stochastic order and have the same expected values, they must have the same distribution.

Theorem 1.A.8. If $X \le_{\rm st} Y$ and if $E[h(X)] = E[h(Y)]$ for some strictly increasing function $h$, then $X =_{\rm st} Y$.

Proof. First we prove the result when $h(x) = x$. Let $\hat X$ and $\hat Y$ be as in Theorem 1.A.1. If $P\{\hat X < \hat Y\} > 0$, then $EX = E\hat X < E\hat Y = EY$, a contradiction to the assumption $EX = EY$. Therefore $X =_{\rm st} \hat X = \hat Y =_{\rm st} Y$. Now let $h$ be some strictly increasing function. Observe that if $X \le_{\rm st} Y$, then $h(X) \le_{\rm st} h(Y)$ and therefore from the above result we have that $h(X) =_{\rm st} h(Y)$. The strict monotonicity of $h$ yields $X =_{\rm st} Y$.

Other results that give conditions, involving stochastic orders, which imply stochastic equalities, are given in Theorems 3.A.43, 3.A.60, 4.A.69, 5.A.15, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16.

As was mentioned above, if $X \le_{\rm st} Y$, then $EX \le EY$. It is easy to find counterexamples which show that the converse is false. However, $X \le_{\rm st} Y$ implies other moment inequalities (for example, $EX^3 \le EY^3$). Thus one may wonder whether $X \le_{\rm st} Y$ can be characterized by a collection of moment inequalities. Brockett and Kahane [109, Corollary 1] showed that there exist no finite number of moment inequalities which imply $X \le_{\rm st} Y$. In fact, they showed it for many other stochastic orders that are studied later in this book.

In order to state the next characterization we define the following class of bivariate functions:
$$\mathcal{G}_{\rm st} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) \text{ is increasing in } x \text{ and decreasing in } y\}.$$


Theorem 1.A.9. Let $X$ and $Y$ be independent random variables. Then $X \le_{\rm st} Y$ if, and only if,
$$\phi(X, Y) \le_{\rm st} \phi(Y, X) \quad \text{for all } \phi \in \mathcal{G}_{\rm st}. \tag{1.A.16}$$

Proof. Suppose that (1.A.16) holds. The function $\phi$ defined by $\phi(x, y) \equiv x$ belongs to $\mathcal{G}_{\rm st}$. Therefore $X \le_{\rm st} Y$. In order to prove the “only if” part, suppose that $X \le_{\rm st} Y$. Let $\phi \in \mathcal{G}_{\rm st}$ and define $\psi(x, y) = \phi(x, -y)$. Then $\psi$ is increasing on $\mathbb{R}^2$. Since $X$ and $Y$ are independent it follows that $X$ and $-Y$ are independent and also that $-X$ and $Y$ are independent. Since $X \le_{\rm st} Y$ it follows (for example, from Theorem 1.A.1) that $-Y \le_{\rm st} -X$. Therefore, by Theorem 1.A.3(b), we have $\psi(X, -Y) \le_{\rm st} \psi(Y, -X)$, that is, $\phi(X, Y) \le_{\rm st} \phi(Y, X)$.

The next result is a similar characterization. In order to state it we need the following notation: Let $\phi_1$ and $\phi_2$ be two bivariate functions. Denote $\Delta\phi_{21}(x, y) = \phi_2(x, y) - \phi_1(x, y)$. The proof of the following theorem is omitted.

Theorem 1.A.10. Let $X$ and $Y$ be two independent random variables. Then $X \le_{\rm st} Y$ if, and only if, $E\phi_1(X, Y) \le E\phi_2(X, Y)$ for all $\phi_1$ and $\phi_2$ which satisfy that, for each $y$, $\Delta\phi_{21}(x, y)$ decreases in $x$ on $\{x \le y\}$; for each $x$, $\Delta\phi_{21}(x, y)$ increases in $y$ on $\{y \ge x\}$; and $\Delta\phi_{21}(x, y) \ge -\Delta\phi_{21}(y, x)$ whenever $x \le y$.

Another similar characterization is given in Theorem 4.A.36.

Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, respectively. Let $\mathcal{M}(F, G)$ denote the Fréchet class of bivariate distributions with fixed marginals $F$ and $G$. Abusing notation we write $(\hat X, \hat Y) \in \mathcal{M}(F, G)$ to mean that the jointly distributed random variables $\hat X$ and $\hat Y$ have the marginal distribution functions $F$ and $G$, respectively. The Fortet-Mourier-Wasserstein distance between the finite mean random variables $X$ and $Y$ is defined by
$$d(X, Y) = \inf_{(\hat X, \hat Y) \in \mathcal{M}(F, G)} \{E|\hat Y - \hat X|\}. \tag{1.A.17}$$

Theorem 1.A.11. Let $X$ and $Y$ be two finite mean random variables such that $EX \le EY$. Then $X \le_{\rm st} Y$ if, and only if, $d(X, Y) = EY - EX$.

Proof. Suppose that $d(X, Y) = EY - EX$. The infimum in (1.A.17) is attained for some $(\hat X, \hat Y)$, and we have $E|\hat Y - \hat X| = E(\hat Y - \hat X)$. Therefore $P\{\hat X \le \hat Y\} = 1$, and from Theorem 1.A.1 it follows that $X \le_{\rm st} Y$.

Conversely, suppose that $X \le_{\rm st} Y$. Let $\hat X$ and $\hat Y$ be as in Theorem 1.A.1. Then, for any $(X', Y') \in \mathcal{M}(F, G)$ we have that $E|Y' - X'| \ge |EY' - EX'| = E\hat Y - E\hat X$. Therefore $d(X, Y) = EY - EX$.
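Theorem 1.A.11 can be tried out on empirical distributions. The sketch below relies on a standard fact that is not stated in the text: for two samples of equal size, each point carrying equal weight, the infimum in (1.A.17) is attained by pairing the sorted values. The function names are hypothetical.

```python
def wasserstein_distance(xs, ys):
    """d(X, Y) of (1.A.17) for two equal-size samples with equal weights.

    Assumption (not from the text): the optimal coupling on the real line
    pairs the i-th smallest x with the i-th smallest y.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return sum(abs(y - x) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


def mean(values):
    return sum(values) / len(values)
```

For `xs = [0, 1, 2]` and `ys = [1, 1, 4]` the sorted values are ordered pointwise, so the empirical distributions satisfy $X \le_{\rm st} Y$ and the distance equals $EY - EX = 1$. For `xs = [0, 3]` and `ys = [1, 2]` the means coincide but the distance is 1, so the two distributions are not ordered.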


A simple sufficient condition which implies the usual stochastic order is described next. The following notation will be used. Let $a(x)$ be defined on $I$, where $I$ is a subset of the real line. The number of sign changes of $a$ in $I$ is defined by
$$S^-(a) = \sup S^-[a(x_1), a(x_2), \dots, a(x_m)], \tag{1.A.18}$$
where $S^-(y_1, y_2, \dots, y_m)$ is the number of sign changes of the indicated sequence, zero terms being discarded, and the supremum in (1.A.18) is extended over all sets $x_1 < x_2 < \cdots < x_m$ such that $x_i \in I$ and $m < \infty$. The proof of the next theorem is simple and therefore it is omitted.

Theorem 1.A.12. Let $X$ and $Y$ be two random variables with (discrete or continuous) density functions $f$ and $g$, respectively. If
$$S^-(g - f) = 1 \quad \text{and the sign sequence is } -, +,$$
then $X \le_{\rm st} Y$.

Let $X_1$ be a nonnegative random variable with distribution function $F_1$ and survival function $\bar F_1 \equiv 1 - F_1$. Define the Laplace transform of $X_1$ by
$$\varphi_{X_1}(\lambda) = \int_0^\infty e^{-\lambda x}\,dF_1(x), \quad \lambda > 0,$$
and denote
$$a_\lambda^{X_1}(n) = \frac{(-1)^n}{n!} \frac{d^n}{d\lambda^n}\left[\frac{1 - \varphi_{X_1}(\lambda)}{\lambda}\right], \quad n \ge 0,\ \lambda > 0,$$
and
$$\alpha_\lambda^{X_1}(n) = \lambda^n a_\lambda^{X_1}(n - 1), \quad n \ge 1,\ \lambda > 0.$$
Similarly, for a nonnegative random variable $X_2$ with distribution function $F_2$ and survival function $\bar F_2 \equiv 1 - F_2$, define $\alpha_\lambda^{X_2}(n)$. It can be shown that $\alpha_\lambda^{X_1}$ and $\alpha_\lambda^{X_2}$ are discrete survival functions (see the proof of the next theorem); denote the corresponding discrete random variables by $N_\lambda(X_1)$ and $N_\lambda(X_2)$. The following result gives a Laplace transform characterization of the order $\le_{\rm st}$.

Theorem 1.A.13. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described above. Then
$$X_1 \le_{\rm st} X_2 \iff N_\lambda(X_1) \le_{\rm st} N_\lambda(X_2) \quad \text{for all } \lambda > 0.$$

Proof. First suppose that $X_1 \le_{\rm st} X_2$. Select a $\lambda > 0$. Let $Z_1, Z_2, \dots$ be independent exponential random variables with mean $1/\lambda$. It can be shown that $\alpha_\lambda^{X_1}(n) = P\{\sum_{i=1}^n Z_i \le X_1\}$ and that $\alpha_\lambda^{X_2}(n) = P\{\sum_{i=1}^n Z_i \le X_2\}$. It thus follows that $N_\lambda(X_1) \le_{\rm st} N_\lambda(X_2)$.

Now suppose that $N_\lambda(X_1) \le_{\rm st} N_\lambda(X_2)$ for all $\lambda > 0$. Select an $x > 0$. Thus $\alpha_{n/x}^{X_1}(n) \le \alpha_{n/x}^{X_2}(n)$. Letting $n \to \infty$, one obtains $\bar F_1(x) \le \bar F_2(x)$ for all continuity points $x$ of $F_1$ and $F_2$. Therefore, $X_1 \le_{\rm st} X_2$ by (1.A.1).
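As a worked illustration of the quantities in Theorem 1.A.13, suppose (an assumption made only for this example) that $X_k$ is exponential with rate $\mu_k$, $k = 1, 2$, and is independent of the $Z_i$'s; the symbols $\mu_k$ and $S_n$ are introduced here and are not notation from the text. The representation used in the proof then gives a closed form:

```latex
% Assumption: X_k is exponential with rate \mu_k, independent of Z_1, Z_2, \dots
% S_n = Z_1 + \cdots + Z_n, where each Z_i is exponential with mean 1/\lambda.
\begin{align*}
\alpha_\lambda^{X_k}(n)
  &= P\{S_n \le X_k\}
   = E\bigl[P\{X_k \ge S_n \mid S_n\}\bigr]
   = E\bigl[e^{-\mu_k S_n}\bigr] \\
  &= \bigl(E\bigl[e^{-\mu_k Z_1}\bigr]\bigr)^{n}
   = \Bigl(\frac{\lambda}{\lambda + \mu_k}\Bigr)^{n},
   \qquad n \ge 1,\ \lambda > 0 .
\end{align*}
```

In this case $\alpha_\lambda^{X_k}$ is a geometric tail. If $\mu_1 \ge \mu_2$, so that $X_1 \le_{\rm st} X_2$, then $\lambda/(\lambda + \mu_1) \le \lambda/(\lambda + \mu_2)$ and hence $\alpha_\lambda^{X_1}(n) \le \alpha_\lambda^{X_2}(n)$ for every $n$ and every $\lambda > 0$, in agreement with the theorem.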


The implication $\Longrightarrow$ in Theorem 1.A.13 can be generalized as follows. A family of random variables $\{Z(\theta), \theta \in \Theta\}$ ($\Theta$ is a subset of the real line) is said to be stochastically increasing in the usual stochastic order (denoted by $\{Z(\theta), \theta \in \Theta\} \in {\rm SI}$) if $Z(\theta) \le_{\rm st} Z(\theta')$ whenever $\theta \le \theta'$. Recall from Theorem 1.A.3(a) that if $X_1 \le_{\rm st} X_2$, then $g(X_1) \le_{\rm st} g(X_2)$ for any increasing function $g$. The following result gives a stochastic generalization of this fact.

Theorem 1.A.14. If $\{Z(\theta), \theta \in \Theta\} \in {\rm SI}$ and if $X_1 \le_{\rm st} X_2$, where $X_k$ and $Z(\theta)$ are independent for $k = 1, 2$ and $\theta \in \Theta$, then $Z(X_1) \le_{\rm st} Z(X_2)$.

Note that Theorem 1.A.14 is a restatement of Theorem 1.A.6.

Let $X$ be a random variable and denote by $X_{(-\infty,a]}$ the truncation of $X$ at $a$, that is, $X_{(-\infty,a]}$ has as its distribution the conditional distribution of $X$ given that $X \le a$. $X_{(a,\infty)}$ is similarly defined. It is simple to prove the following result. Results that are stronger than this are contained in Theorems 1.B.20, 1.B.55, and 1.C.27.

Theorem 1.A.15. Let $X$ be any random variable. Then $X_{(-\infty,a]}$ and $X_{(a,\infty)}$ are increasing in $a$ in the sense of the usual stochastic order.

An interesting example in which truncated random variables are compared is the following.

Example 1.A.16. Let $X^{(1)}, X^{(2)}, \dots, X^{(n)}$ be independent and identically distributed random variables. For a fixed $t$, let $X^{(1)}_{(t,\infty)}, X^{(2)}_{(t,\infty)}, \dots, X^{(n)}_{(t,\infty)}$ be the corresponding truncations, and assume that they are also independent and identically distributed. Then
$$\bigl[\max\{X^{(1)}, X^{(2)}, \dots, X^{(n)}\}\bigr]_{(t,\infty)} \le_{\rm st} \max\bigl\{X^{(1)}_{(t,\infty)}, X^{(2)}_{(t,\infty)}, \dots, X^{(n)}_{(t,\infty)}\bigr\},$$
where $[\max\{X^{(1)}, X^{(2)}, \dots, X^{(n)}\}]_{(t,\infty)}$ denotes the corresponding truncation of $\max\{X^{(1)}, X^{(2)}, \dots, X^{(n)}\}$. The proof consists of a straightforward verification of (1.A.2) for the compared random variables.

Let $\phi_1$ and $\phi_2$ be two functions that satisfy $\phi_1(x) \le \phi_2(x)$ for all $x \in \mathbb{R}$, and let $X$ be a random variable. Then, clearly, $\phi_1(X) \le \phi_2(X)$ almost surely. From Theorem 1.A.1 we thus obtain the following result.

Theorem 1.A.17. Let $X$ be a random variable and let $\phi_1$ and $\phi_2$ be two functions that satisfy $\phi_1(x) \le \phi_2(x)$ for all $x \in \mathbb{R}$. Then $\phi_1(X) \le_{\rm st} \phi_2(X)$. In particular, if $\phi$ is a function that satisfies $x \le [\ge]\ \phi(x)$ for all $x \in \mathbb{R}$, then $X \le_{\rm st} [\ge_{\rm st}]\ \phi(X)$.

Remark 1.A.18. The set of all distribution functions on $\mathbb{R}$ is a lattice with respect to the order $\le_{\rm st}$. That is, if $X$ and $Y$ are random variables with distributions $F$ and $G$, then there exist random variables $Z$ and $W$ such that $Z \le_{\rm st} X$, $Z \le_{\rm st} Y$, $W \ge_{\rm st} X$, and $W \ge_{\rm st} Y$. Explicitly, $Z$ has the survival function $\min\{\bar F, \bar G\}$ and $W$ has the survival function $\max\{\bar F, \bar G\}$.
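The lattice construction of Remark 1.A.18 is easy to compute for discrete distributions. The sketch below assumes each distribution is given as a dictionary mapping values to probabilities and evaluates the survival functions $P\{\cdot > x\}$ at the points of the combined support; the function names are hypothetical.

```python
def survival_values(pmf, grid):
    """P{X > x} at each grid point x, for a pmf given as {value: probability}."""
    return [sum(mass for value, mass in pmf.items() if value > x) for x in grid]


def meet_and_join(pmf_x, pmf_y):
    """Survival functions of Z (pointwise minimum) and W (pointwise maximum)."""
    grid = sorted(set(pmf_x) | set(pmf_y))
    sf_x = survival_values(pmf_x, grid)
    sf_y = survival_values(pmf_y, grid)
    meet = [min(a, b) for a, b in zip(sf_x, sf_y)]  # survival function of Z
    join = [max(a, b) for a, b in zip(sf_x, sf_y)]  # survival function of W
    return grid, meet, join
```

For $X$ uniform on $\{0, 2\}$ and $Y \equiv 1$, which are not comparable in the order $\le_{\rm st}$, the resulting $Z$ is uniform on $\{0, 1\}$ and $W$ is uniform on $\{1, 2\}$.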


The next four theorems give conditions under which the corresponding spacings are ordered according to the usual stochastic order. Let $X_1, X_2, \dots, X_m$ be any random variables with the corresponding order statistics $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$. Define the corresponding spacings by $U_{(i)} = X_{(i)} - X_{(i-1)}$, $i = 2, 3, \dots, m$. When the dependence on $m$ is to be emphasized, we will denote the spacings by $U_{(i:m)}$.

Theorem 1.A.19. Let $X_1, X_2, \dots, X_m, X_{m+1}$ be independent and identically distributed IFR (DFR) random variables. Then
$$(m - i + 1)U_{(i:m)} \ge_{\rm st} [\le_{\rm st}]\ (m - i)U_{(i+1:m)}, \quad i = 2, 3, \dots, m - 1,$$
and
$$(m - i + 2)U_{(i:m+1)} \ge_{\rm st} [\le_{\rm st}]\ (m - i + 1)U_{(i:m)}, \quad i = 2, 3, \dots, m.$$

The proof of Theorem 1.A.19 is not given here. A stronger version of the DFR part of Theorem 1.A.19 is given in Theorem 1.B.31. Some of the conclusions of Theorem 1.A.19 can be obtained under different conditions. These are stated in the next two theorems. Again, the proofs are not given. In the next two theorems we take $X_{(0)} \equiv 0$, and thus $U_{(1)} = X_{(1)}$. For the following theorem recall from page 1 the definition of Schur concavity.

Theorem 1.A.20. Let $X_1, X_2, \dots, X_m$ be nonnegative random variables with an absolutely continuous joint distribution function. If the joint density function of $X_1, X_2, \dots, X_m$ is Schur concave (Schur convex), then
$$(m - i + 1)U_{(i:m)} \ge_{\rm st} [\le_{\rm st}]\ (m - i)U_{(i+1:m)}, \quad i = 1, 2, \dots, m - 1.$$

Theorem 1.A.21. Let $X_1, X_2, \dots, X_m$ be independent exponential random variables with possibly different parameters. Then
$$(m - i + 1)U_{(i:m)} \le_{\rm st} (m - i)U_{(i+1:m)}, \quad i = 1, 2, \dots, m - 1.$$

Theorem 1.A.22. Let $X_1, X_2, \dots, X_m$ be independent and identically distributed random variables with a finite support, and with an increasing [decreasing] density function over that support. Then
$$U_{(i:m)} \ge_{\rm st} [\le_{\rm st}]\ U_{(i+1:m)}, \quad i = 2, 3, \dots, m - 1.$$

The proof of Theorem 1.A.22 uses the likelihood ratio order, and therefore it is deferred to Section 1.C, Remark 1.C.3. Note that any absolutely continuous DFR random variable has a decreasing density function. Thus we see that the assumption in the DFR part of Theorem 1.A.19 is stronger than the assumption in the decreasing part of Theorem 1.A.22, but the conclusion in the DFR part of Theorem 1.A.19 is stronger than the conclusion in the decreasing part of Theorem 1.A.22. It is of interest to compare Theorems 1.A.19–1.A.22 with Theorems 1.B.31 and 1.C.42.

From Theorem 1.A.1 it is obvious that if $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ are the order statistics corresponding to the random variables $X_1, X_2, \dots, X_m$, then $X_{(1)} \le_{\rm st} X_{(2)} \le_{\rm st} \cdots \le_{\rm st} X_{(m)}$.

Now let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ be the order statistics corresponding to the random variables $X_1, X_2, \dots, X_m$, and let $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(m)}$ be the order statistics corresponding to the random variables $Y_1, Y_2, \dots, Y_m$. As usual, for any distribution function $F$, we let $\bar F \equiv 1 - F$ denote the corresponding survival function.

Theorem 1.A.23. (a) Let $X_1, X_2, \dots, X_m$ be independent random variables with distribution functions $F_1, F_2, \dots, F_m$, respectively. Let $Y_1, Y_2, \dots, Y_m$ be independent and identically distributed random variables with a common distribution function $G$. Then $X_{(i)} \le_{\rm st} Y_{(i)}$ for all $i = 1, 2, \dots, m$ if, and only if,
$$\prod_{j=1}^m F_j(x) \ge G^m(x) \quad \text{for all } x;$$
that is, if, and only if, $X_{(m)} \le_{\rm st} Y_{(m)}$.

(b) Let $X_1, X_2, \dots, X_m$ be independent random variables with survival functions $\bar F_1, \bar F_2, \dots, \bar F_m$, respectively. Let $Y_1, Y_2, \dots, Y_m$ be independent and identically distributed random variables with a common survival function $\bar G$. Then $X_{(i)} \ge_{\rm st} Y_{(i)}$ for all $i = 1, 2, \dots, m$ if, and only if,
$$\prod_{j=1}^m \bar F_j(x) \ge \bar G^m(x) \quad \text{for all } x;$$
that is, if, and only if, $X_{(1)} \ge_{\rm st} Y_{(1)}$.

The proof of Theorem 1.A.23 is not given here. More comparisons of order statistics in the usual stochastic order can be found in Theorem 6.B.23 and in Corollary 6.B.24.

The following neat example compares a sum of independent heterogeneous exponential random variables with an Erlang random variable; it is of interest to compare it with Examples 1.B.5 and 1.C.49. We do not give the proof here.

Example 1.A.24. Let $X_i$ be an exponential random variable with mean $\lambda_i^{-1} > 0$, $i = 1, 2, \dots, m$, and assume that the $X_i$’s are independent. Let $Y_i$, $i = 1, 2, \dots, m$, be independent, identically distributed, exponential random variables with mean $\eta^{-1}$. Then
$$\sum_{i=1}^m X_i \ge_{\rm st} \sum_{i=1}^m Y_i \iff \sqrt[m]{\lambda_1 \lambda_2 \cdots \lambda_m} \le \eta.$$
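The condition of Example 1.A.24 can be probed numerically for $m = 2$. The sketch below uses the standard closed forms for the survival function of a sum of two independent exponentials with distinct rates and for an Erlang variable with two stages; these formulas, the function names, and the grid parameters are assumptions of this illustration rather than material from the text. A grid comparison is only a numerical check, not a proof.

```python
import math


def hypoexp_survival(t, lam1, lam2):
    """P{X1 + X2 > t} for independent exponentials with distinct rates lam1, lam2."""
    return (lam2 * math.exp(-lam1 * t) - lam1 * math.exp(-lam2 * t)) / (lam2 - lam1)


def erlang2_survival(t, eta):
    """P{Y1 + Y2 > t} for i.i.d. exponentials with rate eta."""
    return math.exp(-eta * t) * (1.0 + eta * t)


def sum_dominates_on_grid(lam1, lam2, eta, t_max=20.0, steps=2000, tol=1e-12):
    """Check P{X1 + X2 > t} >= P{Y1 + Y2 > t} at the grid points in (0, t_max]."""
    for k in range(1, steps + 1):
        t = t_max * k / steps
        if hypoexp_survival(t, lam1, lam2) < erlang2_survival(t, eta) - tol:
            return False
    return True
```

With rates 1 and 4 the geometric mean is 2; the grid check passes for $\eta = 2$ and fails for $\eta = 1.9$, consistent with the stated equivalence.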

The next example may be compared with Examples 1.B.6, 1.C.51, and 4.A.45.


Example 1.A.25. Let $X_i$ be a binomial random variable with parameters $n_i$ and $p_i$, $i = 1, 2, \ldots, m$, and assume that the $X_i$'s are independent. Let $Y$ be a binomial random variable with parameters $n$ and $p$ where $n = \sum_{i=1}^{m} n_i$. Then

$$\sum_{i=1}^{m} X_i \ge_{\mathrm{st}} Y \iff p \le \sqrt[n]{p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}},$$

and

$$\sum_{i=1}^{m} X_i \le_{\mathrm{st}} Y \iff 1 - p \le \sqrt[n]{(1 - p_1)^{n_1} (1 - p_2)^{n_2} \cdots (1 - p_m)^{n_m}}.$$
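The first equivalence in Example 1.A.25 can be checked exactly on a small case. The sketch below is only an illustration: the helper names and the parameter values ($m = 2$, $n_1 = n_2 = 1$, $p_1 = 0.2$, $p_2 = 0.8$) are assumptions chosen for the check, not taken from the text. It builds the distribution of $\sum X_i$ by convolution and compares survival functions with those of a binomial $Y$.

```python
from math import comb

def binom_pmf(n, p):
    """PMF of a Binomial(n, p) as a list indexed by k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """PMF of the sum of two independent nonnegative integer-valued variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def survival(pmf):
    """Return [P{Z >= k} for k = 0..len(pmf)-1]."""
    tail, acc = [], 0.0
    for mass in reversed(pmf):
        acc += mass
        tail.append(acc)
    return tail[::-1]

def st_dominates(pmf_big, pmf_small, tol=1e-12):
    """True if P{big >= k} >= P{small >= k} for every k, i.e. big >=st small."""
    return all(b >= s - tol
               for b, s in zip(survival(pmf_big), survival(pmf_small)))

# Illustrative parameters (assumed for this sketch).
ns, ps = [1, 1], [0.2, 0.8]
pmf_sum = [1.0]
for n_i, p_i in zip(ns, ps):
    pmf_sum = convolve(pmf_sum, binom_pmf(n_i, p_i))

n = sum(ns)
threshold = 1.0
for n_i, p_i in zip(ns, ps):
    threshold *= p_i ** (n_i / n)   # n-th root of p_1^{n_1} ... p_m^{n_m}

below = st_dominates(pmf_sum, binom_pmf(n, 0.35))  # p below the threshold
above = st_dominates(pmf_sum, binom_pmf(n, 0.45))  # p above the threshold
print(threshold, below, above)
```

For these values the threshold is the geometric mean $\sqrt{0.2 \cdot 0.8} = 0.4$, so the comparison holds for $p = 0.35$ and fails for $p = 0.45$.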

The following example gives necessary and sufficient conditions for the comparison of normal random variables; it is generalized in Example 6.B.29. See related results in Examples 3.A.51 and 4.A.46.

Example 1.A.26. Let $X$ be a normal random variable with mean $\mu_X$ and variance $\sigma_X^2$, and let $Y$ be a normal random variable with mean $\mu_Y$ and variance $\sigma_Y^2$. Then $X \le_{\mathrm{st}} Y$ if, and only if, $\mu_X \le \mu_Y$ and $\sigma_X^2 = \sigma_Y^2$.

Example 1.A.27. Let the random variable $X$ have a unimodal density, symmetric about 0. Then

$$(X + a)^2 \le_{\mathrm{st}} (X + b)^2 \quad \text{whenever } |a| \le |b|.$$

Example 1.A.28. Let $\boldsymbol{X}$ have a multivariate normal density with mean vector $\boldsymbol{0}$ and variance-covariance matrix $\Sigma_1$. Let $\boldsymbol{Y}$ have a multivariate normal density with mean vector $\boldsymbol{0}$ and variance-covariance matrix $\Sigma_1 + \Sigma_2$, where $\Sigma_2$ is a nonnegative definite matrix. Then

$$\|\boldsymbol{X}\|^2 \le_{\mathrm{st}} \|\boldsymbol{Y}\|^2,$$

where $\|\cdot\|$ denotes the Euclidean norm.

The next result involves the total time on test (TTT) transform and the observed TTT random variable. Let $F$ be the distribution function of a nonnegative random variable, and suppose, for simplicity, that 0 is the left endpoint of the support of $F$. The TTT transform associated with $F$ is defined by

$$H_F^{-1}(u) = \int_0^{F^{-1}(u)} \bar F(x)\,dx, \quad 0 \le u \le 1, \tag{1.A.19}$$

where $\bar F \equiv 1 - F$ is the survival function associated with $F$. The inverse, $H_F$, of the TTT transform is a distribution function. If the mean $\mu = \int_0^\infty x\,dF(x) = \int_0^\infty \bar F(x)\,dx$ is finite, then $H_F$ has support in $[0, \mu]$. If $X$ has the distribution function $F$, then let $X_{\mathrm{ttt}}$ be any random variable that has the distribution $H_F$. The random variable $X_{\mathrm{ttt}}$ is called the observed total time on test.

Theorem 1.A.29. Let $X$ and $Y$ be two nonnegative random variables. Then

$$X \le_{\mathrm{st}} Y \implies X_{\mathrm{ttt}} \le_{\mathrm{st}} Y_{\mathrm{ttt}}.$$

See related results in Theorems 3.B.1, 4.A.44, 4.B.8, 4.B.9, and 4.B.29.
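As a concrete instance of the TTT transform (1.A.19), consider an exponential lifetime with rate $\lambda$ (an assumption made here for illustration). A direct computation gives $H_F^{-1}(u) = \int_0^{-\log(1-u)/\lambda} e^{-\lambda x}\,dx = u/\lambda$, so $H_F$ is the uniform distribution on $(0, \mu)$ with $\mu = 1/\lambda$. The following sketch approximates the integral numerically; the function name `ttt_transform` and the rate value are hypothetical.

```python
import math

def ttt_transform(survival, quantile, u, steps=20000):
    """Approximate H_F^{-1}(u), the integral of the survival function over
    [0, F^{-1}(u)], with the composite trapezoidal rule."""
    upper = quantile(u)
    if upper == 0.0:
        return 0.0
    h = upper / steps
    total = 0.5 * (survival(0.0) + survival(upper))
    for k in range(1, steps):
        total += survival(k * h)
    return total * h

lam = 2.0  # illustrative rate, not from the text

def exp_survival(x):
    """Survival function of an exponential variable with rate lam."""
    return math.exp(-lam * x)

def exp_quantile(u):
    """Quantile function F^{-1}(u) of the same exponential variable."""
    return -math.log(1.0 - u) / lam

values = {u: ttt_transform(exp_survival, exp_quantile, u) for u in (0.1, 0.5, 0.9)}
for u, v in values.items():
    print(u, v, u / lam)  # numerical value next to the closed form u / lam
```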


1.A.5 Some properties in reliability theory

Recall from page 1 the definitions of the IFR, DFR, NBU, and NWU properties. The next result characterizes random variables that have these properties by means of the usual stochastic order. The statements in the next theorem follow at once from the definitions. Recall from Section 1.A.3 that for any random variable $Z$ and an event $A$ we denote by $[Z \mid A]$ any random variable that has as its distribution the conditional distribution of $Z$ given $A$.

Theorem 1.A.30. (a) The random variable $X$ is IFR [DFR] if, and only if, $[X - t \mid X > t] \ge_{\mathrm{st}} [\le_{\mathrm{st}}]\ [X - t' \mid X > t']$ whenever $t \le t'$.
(b) The nonnegative random variable $X$ is NBU [NWU] if, and only if, $X \ge_{\mathrm{st}} [\le_{\mathrm{st}}]\ [X - t \mid X > t]$ for all $t > 0$.

Note that if $X$ is the lifetime of a device, then $[X - t \mid X > t]$ is the residual life of such a device with age $t$. Theorem 1.A.30(a), for example, characterizes IFR and DFR random variables by the monotonicity of their residual lives with respect to the order $\le_{\mathrm{st}}$. Theorem 1.A.30 should be compared to Theorem 1.B.38, where a similar characterization is given. Some multivariate analogs of Theorem 1.A.30(a) are used in Section 6.B.6 to introduce some multivariate IFR notions.

For a nonnegative random variable $X$ with a finite mean, let $A_X$ denote the corresponding asymptotic equilibrium age. That is, if the distribution function of $X$ is $F$, then the distribution function $F_e$ of $A_X$ is defined by

$$F_e(x) = \frac{1}{EX} \int_0^x \bar F(y)\,dy, \quad x \ge 0, \tag{1.A.20}$$

where $\bar F \equiv 1 - F$ is the corresponding survival function. Recall from page 1 the definitions of the NBUE and the NWUE properties. The following result is immediate.

Theorem 1.A.31. The nonnegative random variable $X$ with finite mean is NBUE [NWUE] if, and only if, $X \ge_{\mathrm{st}} [\le_{\mathrm{st}}]\ A_X$.

Another characterization of NBUE random variables is the following. Recall from Section 1.A.4 the definition of the observed total time on test random variable $X_{\mathrm{ttt}}$.

Theorem 1.A.32. Let $X$ be a nonnegative random variable with finite mean $\mu$. Then $X$ is NBUE if, and only if, $X_{\mathrm{ttt}} \ge_{\mathrm{st}} U(0, \mu)$, where $U(0, \mu)$ denotes a uniform random variable on $(0, \mu)$.

Let $X$ be a nonnegative random variable with finite mean and distribution function $F$, and let $A_X$ be the corresponding asymptotic equilibrium age having the distribution function $F_e$ given in (1.A.20). The requirement

$$X \ge_{\mathrm{st}} [A_X - t \mid A_X > t] \quad \text{for all } t \ge 0, \tag{1.A.21}$$

has been used in the literature as a way to define an aging property of the lifetime $X$. It turns out that this aging property is equivalent to the new better than used in convex ordering (NBUC) notion that is defined in (4.A.31) in Chapter 4.
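The characterization in Theorem 1.A.31 can be illustrated numerically. The sketch below assumes a Uniform(0, 1) lifetime (an illustrative choice, not from the text), for which $\bar F(x) = 1 - x$ and, by (1.A.20), $\bar F_e(x) = (1 - x)^2$ on $[0, 1]$. It evaluates both survival functions on a grid and checks $\bar F \ge \bar F_e$, which is the statement $X \ge_{\mathrm{st}} A_X$; the helper names are hypothetical.

```python
def equilibrium_survival(survival, mean, x, steps=2000):
    """1 - F_e(x), where F_e(x) = (1/EX) * integral_0^x survival(y) dy,
    approximated with the composite trapezoidal rule."""
    if x == 0.0:
        return 1.0
    h = x / steps
    total = 0.5 * (survival(0.0) + survival(x))
    for k in range(1, steps):
        total += survival(k * h)
    return 1.0 - total * h / mean

def uniform_survival(x):
    """Survival function of a Uniform(0, 1) lifetime (illustrative choice)."""
    if x < 0.0:
        return 1.0
    return max(0.0, 1.0 - x)

grid = [k / 50 for k in range(51)]  # points in [0, 1]
pairs = [(uniform_survival(x), equilibrium_survival(uniform_survival, 0.5, x))
         for x in grid]

# X >=st A_X holds on the grid if the survival function of X dominates that of A_X.
is_nbue_on_grid = all(s >= se - 1e-9 for s, se in pairs)
print(is_nbue_on_grid)
```

A grid check of this kind only supports the inequality at the sampled points; it is not a proof.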

1.B The Hazard Rate Order

1.B.1 Definition and equivalent conditions

If $X$ is a random variable with an absolutely continuous distribution function $F$, then the hazard rate of $X$ at $t$ is defined as $r(t) = (d/dt)(-\log(1 - F(t)))$. The hazard rate can alternatively be expressed as

$$r(t) = \lim_{\Delta t \downarrow 0} \frac{P\{t < X \le t + \Delta t \mid X > t\}}{\Delta t} = \frac{f(t)}{\bar F(t)}, \quad t \in \mathbb{R}, \tag{1.B.1}$$

where $\bar F \equiv 1 - F$ is the survival function and $f$ is the corresponding density function. As can be seen from (1.B.1), the hazard rate $r(t)$ can be thought of as the intensity of failure of a device, with a random lifetime $X$, at time $t$. Clearly, the higher the hazard rate is, the smaller $X$ should be stochastically. This is the motivation for the order discussed in this section.

Let $X$ and $Y$ be two nonnegative random variables with absolutely continuous distribution functions and with hazard rate functions $r$ and $q$, respectively, such that

$$r(t) \ge q(t), \quad t \in \mathbb{R}. \tag{1.B.2}$$

Then $X$ is said to be smaller than $Y$ in the hazard rate order (denoted as $X \le_{\mathrm{hr}} Y$).

Although the hazard rate order is usually applied to random lifetimes (that is, nonnegative random variables), definition (1.B.2) may also be used to compare more general random variables. In fact, even the absolute continuity, which is required in (1.B.2), is not really needed. It is easy to verify that (1.B.2) holds if, and only if,

$$\frac{\bar G(t)}{\bar F(t)} \quad \text{increases in } t \in (-\infty, \max(u_X, u_Y)) \tag{1.B.3}$$

(here $a/0$ is taken to be equal to $\infty$ whenever $a > 0$). Here $F$ denotes the distribution function of $X$ and $G$ denotes the distribution function of $Y$, and $u_X$ and $u_Y$ denote the corresponding right endpoints of the supports of $X$ and of $Y$. Equivalently, (1.B.3) can be written as

$$\bar F(x)\bar G(y) \ge \bar F(y)\bar G(x) \quad \text{for all } x \le y. \tag{1.B.4}$$
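Condition (1.B.3) lends itself to a simple numerical check: evaluate the ratio of the two survival functions on a grid and test that it never decreases. The sketch below is an illustration under assumed inputs (two exponential survival functions with rates 2 and 1); the function name `is_hr_smaller` is hypothetical, and a grid check can only refute, not prove, the ordering.

```python
import math

def is_hr_smaller(sf_x, sf_y, grid, tol=1e-12):
    """Grid check of (1.B.3): sf_y(t) / sf_x(t) must be nondecreasing in t.
    A ratio a/0 with a > 0 is treated as +infinity, as in the text."""
    prev = -math.inf
    for t in grid:
        fx, gy = sf_x(t), sf_y(t)
        if fx == 0.0 and gy == 0.0:
            break  # beyond the right endpoints of both supports
        ratio = math.inf if fx == 0.0 else gy / fx
        if ratio < prev - tol:
            return False
        prev = ratio
    return True

def sf_fast(t):
    """Exponential survival function with hazard rate 2 (illustrative)."""
    return math.exp(-2.0 * t)

def sf_slow(t):
    """Exponential survival function with hazard rate 1 (illustrative)."""
    return math.exp(-1.0 * t)

grid = [0.05 * k for k in range(200)]
forward = is_hr_smaller(sf_fast, sf_slow, grid)   # higher hazard is hr-smaller
backward = is_hr_smaller(sf_slow, sf_fast, grid)  # the reverse ordering fails
print(forward, backward)
```

Here the ratio is $e^{t}$ in the first call and $e^{-t}$ in the second, in agreement with the comparison of the constant hazard rates via (1.B.2).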


Thus (1.B.3) or (1.B.4) can be used to define the order $X \le_{\mathrm{hr}} Y$ even if $X$ and/or $Y$ do not have absolutely continuous distributions. A useful further condition, which is equivalent to $X \le_{\mathrm{hr}} Y$ when $X$ and $Y$ have absolutely continuous distributions with densities $f$ and $g$, respectively, is the following:

$$\frac{f(x)}{\bar F(y)} \ge \frac{g(x)}{\bar G(y)} \quad \text{for all } x \le y. \tag{1.B.5}$$

Rewriting (1.B.4) as

$$\frac{\bar F(t + s)}{\bar F(t)} \le \frac{\bar G(t + s)}{\bar G(t)} \quad \text{for all } s \ge 0 \text{ and all } t,$$

it is seen that $X \le_{\mathrm{hr}} Y$ if, and only if,

$$P\{X - t > s \mid X > t\} \le P\{Y - t > s \mid Y > t\} \quad \text{for all } s \ge 0 \text{ and all } t; \tag{1.B.6}$$

that is, if, and only if, the residual lives of $X$ and $Y$ at time $t$ are ordered in the sense $\le_{\mathrm{st}}$ for all $t$. Equivalently, (1.B.6) can be written as

$$[X \mid X > t] \le_{\mathrm{st}} [Y \mid Y > t] \quad \text{for all } t. \tag{1.B.7}$$

Substituting $u = \bar F(t)$ (that is, $t = \bar F^{-1}(u)$) in (1.B.3) shows that $X \le_{\mathrm{hr}} Y$ if, and only if,

$$\frac{\bar G \bar F^{-1}(u)}{u} \ge \frac{\bar G \bar F^{-1}(v)}{v} \quad \text{for all } 0 < u \le v < 1.$$

Simple manipulations show that the latter condition is equivalent to

$$\frac{1 - F G^{-1}(1 - u)}{u} \le \frac{1 - F G^{-1}(1 - v)}{v} \quad \text{for all } 0 < u \le v < 1. \tag{1.B.8}$$

For discrete random variables that take on values in $\mathbb{N}$ the definition of $\le_{\mathrm{hr}}$ can be written in two different ways. Let $X$ and $Y$ be such random variables. We denote $X \le_{\mathrm{hr}} Y$ if

$$\frac{P\{X = n\}}{P\{X \ge n\}} \ge \frac{P\{Y = n\}}{P\{Y \ge n\}}, \quad n \in \mathbb{N}. \tag{1.B.9}$$

Equivalently, $X \le_{\mathrm{hr}} Y$ if

$$\frac{P\{X = n\}}{P\{X > n\}} \ge \frac{P\{Y = n\}}{P\{Y > n\}}, \quad n \in \mathbb{N}.$$

The discrete analog of (1.B.4) is that (1.B.9) holds if, and only if,

$$P\{X \ge n_1\} P\{Y \ge n_2\} \ge P\{X \ge n_2\} P\{Y \ge n_1\} \quad \text{for all } n_1 \le n_2. \tag{1.B.10}$$

In a similar manner (1.B.3) and (1.B.5) can be modified in the discrete case.

Unless stated otherwise, we consider only random variables with absolutely continuous distribution functions in the following sections.
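The two discrete formulations (1.B.9) and (1.B.10) can be compared on a small example. The sketch below is illustrative only: the two probability mass functions (those of Binomial(2, 0.3) and Binomial(2, 0.6), written out by hand) and the helper names are assumptions, and both variables are taken to live on the same finite support $\{0, 1, 2\}$.

```python
def tails(pmf):
    """[P{Z >= n} for n = 0..len(pmf)-1] for a PMF on {0, 1, ..., len(pmf)-1}."""
    out, acc = [], 0.0
    for mass in reversed(pmf):
        acc += mass
        out.append(acc)
    return out[::-1]

def hr_le_by_hazards(pmf_x, pmf_y, tol=1e-12):
    """Condition (1.B.9): the discrete hazard of X dominates that of Y
    at every point where both tails are positive."""
    tx, ty = tails(pmf_x), tails(pmf_y)
    return all(px / sx >= py / sy - tol
               for px, sx, py, sy in zip(pmf_x, tx, pmf_y, ty)
               if sx > 0 and sy > 0)

def hr_le_by_tails(pmf_x, pmf_y, tol=1e-12):
    """Condition (1.B.10): P{X>=n1} P{Y>=n2} >= P{X>=n2} P{Y>=n1} for n1 <= n2."""
    tx, ty = tails(pmf_x), tails(pmf_y)
    n = len(tx)
    return all(tx[i] * ty[j] >= tx[j] * ty[i] - tol
               for i in range(n) for j in range(i, n))

pmf_x = [0.49, 0.42, 0.09]  # Binomial(2, 0.3), illustrative
pmf_y = [0.16, 0.48, 0.36]  # Binomial(2, 0.6), illustrative

print(hr_le_by_hazards(pmf_x, pmf_y), hr_le_by_tails(pmf_x, pmf_y))
print(hr_le_by_hazards(pmf_y, pmf_x), hr_le_by_tails(pmf_y, pmf_x))
```

Both checks agree in each direction for this pair, as the equivalence of (1.B.9) and (1.B.10) requires.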


1.B.2 The relation between the hazard rate and the usual stochastic orders

By setting $x = -\infty$ in (1.B.4) (or $n_1 = -\infty$ in (1.B.10)), and then using (1.A.1), we obtain the following result.

Theorem 1.B.1. If $X$ and $Y$ are two random variables such that $X \le_{\mathrm{hr}} Y$, then $X \le_{\mathrm{st}} Y$.

1.B.3 Closure properties and some characterizations

Let $\phi$ be a strictly increasing function with inverse $\phi^{-1}$. If $X$ has the survival function $\bar F$, then $\phi(X)$ has the survival function $\bar F \phi^{-1}$. Similarly, if $Y$ has the survival function $\bar G$, then $\phi(Y)$ has the survival function $\bar G \phi^{-1}$. If $X \le_{\mathrm{hr}} Y$, then from (1.B.3) it follows that

$$\frac{\bar G \phi^{-1}(t)}{\bar F \phi^{-1}(t)} \quad \text{increases in } t \text{ over } \{t : \bar G \phi^{-1}(t) > 0\}.$$

We have thus shown an important special case of the next theorem. When $\phi$ is just increasing (rather than strictly increasing) the result is still true, but the above simple argument is no longer sufficient for its proof.

Theorem 1.B.2. If $X \le_{\mathrm{hr}} Y$, and if $\phi$ is any increasing function, then $\phi(X) \le_{\mathrm{hr}} \phi(Y)$.

In general, if $X_1 \le_{\mathrm{hr}} Y_1$ and $X_2 \le_{\mathrm{hr}} Y_2$, where $X_1$ and $X_2$ are independent random variables and $Y_1$ and $Y_2$ are also independent random variables, then it is not necessarily true that $X_1 + X_2 \le_{\mathrm{hr}} Y_1 + Y_2$. However, if these random variables are IFR, then it is true. This is shown in Theorem 1.B.4, but first we state and prove the following lemma, which is of independent interest.

Lemma 1.B.3. If the random variables $X$ and $Y$ are such that $X \le_{\mathrm{hr}} Y$ and if $Z$ is an IFR random variable independent of $X$ and $Y$, then

$$X + Z \le_{\mathrm{hr}} Y + Z. \tag{1.B.11}$$

Proof. Denote by $f_W$ and $\bar F_W$ the density function and the survival function of any random variable $W$. Note that, for $x \le y$,

$$\begin{aligned}
&\bar F_{X+Z}(x)\bar F_{Y+Z}(y) - \bar F_{X+Z}(y)\bar F_{Y+Z}(x) \\
&\quad= \int_v \int_{u \ge v} \bigl[ f_X(u)\bar F_Z(x - u) f_Y(v)\bar F_Z(y - v) + f_X(v)\bar F_Z(x - v) f_Y(u)\bar F_Z(y - u) \bigr]\,du\,dv \\
&\qquad- \int_v \int_{u \ge v} \bigl[ f_X(u)\bar F_Z(y - u) f_Y(v)\bar F_Z(x - v) + f_X(v)\bar F_Z(y - v) f_Y(u)\bar F_Z(x - u) \bigr]\,du\,dv \\
&\quad= \int_v \int_{u \ge v} \bigl[ \bar F_X(u) f_Y(v) - f_X(v)\bar F_Y(u) \bigr] \times \bigl[ \bar F_Z(y - v) f_Z(x - u) - f_Z(y - u)\bar F_Z(x - v) \bigr]\,du\,dv,
\end{aligned}$$

where the second equality is obtained by integration by parts with respect to $u$ and by collection of terms. Since $X \le_{\mathrm{hr}} Y$ it follows from (1.B.5) that the expression within the first set of brackets in the last integral is nonpositive. Since $Z$ is IFR it can be verified that the quantity in the second pair of brackets in the last integral is also nonpositive. Therefore, the integral is nonnegative. This proves (1.B.11).

The above proof is very similar to the proof that a convolution of two independent IFR random variables is IFR. In fact, this convolution result can be shown to be a consequence of Lemma 1.B.3; see Corollary 1.B.39 in Section 1.B.5.

Theorem 1.B.4. Let $(X_i, Y_i)$, $i = 1, 2, \ldots, m$, be independent pairs of random variables such that $X_i \le_{\mathrm{hr}} Y_i$, $i = 1, 2, \ldots, m$. If $X_i, Y_i$, $i = 1, 2, \ldots, m$, are all IFR, then

$$\sum_{i=1}^{m} X_i \le_{\mathrm{hr}} \sum_{i=1}^{m} Y_i.$$

Proof. Repeated application of (1.B.11), using the closure property of IFR under convolution, yields the desired result.

The following neat example compares a sum of independent heterogeneous exponential random variables with an Erlang random variable; it is of interest to compare it with Examples 1.A.24 and 1.C.49. We do not give the proof here.

Example 1.B.5. Let $X_i$ be an exponential random variable with mean $\lambda_i^{-1} > 0$, $i = 1, 2, \ldots, m$, and assume that the $X_i$'s are independent. Let $Y_i$, $i = 1, 2, \ldots, m$, be independent, identically distributed, exponential random variables with mean $\eta^{-1}$. Then

$$\sum_{i=1}^{m} X_i \ge_{\mathrm{hr}} \sum_{i=1}^{m} Y_i \iff \sqrt[m]{\lambda_1 \lambda_2 \cdots \lambda_m} \le \eta.$$


The next example may be compared with Examples 1.A.25, 1.C.51, and 4.A.45.

Example 1.B.6. Let $X_i$ be a binomial random variable with parameters $n_i$ and $p_i$, $i = 1, 2, \ldots, m$, and assume that the $X_i$'s are independent. Let $Y$ be a binomial random variable with parameters $n$ and $p$ where $n = \sum_{i=1}^{m} n_i$. Then

$$\sum_{i=1}^{m} X_i \ge_{\mathrm{hr}} Y \iff p \le \frac{n}{\sum_{i=1}^{m} (n_i / p_i)},$$

and

$$\sum_{i=1}^{m} X_i \le_{\mathrm{hr}} Y \iff 1 - p \le \frac{n}{\sum_{i=1}^{m} (n_i / (1 - p_i))}.$$
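The first equivalence in Example 1.B.6 can be checked exactly on a small case by way of the discrete condition (1.B.10). The sketch below is an illustration with assumed parameters ($m = 2$, $n_1 = n_2 = 1$, $p_1 = 0.2$, $p_2 = 0.8$) and hypothetical helper names; for these values the harmonic-mean threshold is $2/(1/0.2 + 1/0.8) = 0.32$.

```python
from math import comb

def binom_pmf(n, p):
    """PMF of a Binomial(n, p) as a list indexed by k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """PMF of the sum of two independent nonnegative integer-valued variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def tails(pmf):
    """[P{Z >= k} for k = 0..len(pmf)-1]."""
    out, acc = [], 0.0
    for mass in reversed(pmf):
        acc += mass
        out.append(acc)
    return out[::-1]

def hr_le(pmf_x, pmf_y, tol=1e-12):
    """X <=hr Y in the sense of (1.B.10), for PMFs on a common finite support."""
    tx, ty = tails(pmf_x), tails(pmf_y)
    return all(tx[i] * ty[j] >= tx[j] * ty[i] - tol
               for i in range(len(tx)) for j in range(i, len(tx)))

# Illustrative parameters (assumed for this sketch).
ns, ps = [1, 1], [0.2, 0.8]
pmf_sum = [1.0]
for n_i, p_i in zip(ns, ps):
    pmf_sum = convolve(pmf_sum, binom_pmf(n_i, p_i))

n = sum(ns)
harmonic = n / sum(n_i / p_i for n_i, p_i in zip(ns, ps))

hr_below = hr_le(binom_pmf(n, 0.30), pmf_sum)  # Y <=hr sum, p below threshold
hr_above = hr_le(binom_pmf(n, 0.35), pmf_sum)  # p above threshold
print(harmonic, hr_below, hr_above)
```

With $p = 0.35$ the hazard rate comparison fails even though $0.35$ lies below the geometric-mean threshold $0.4$ of Example 1.A.25, which illustrates that the hazard rate order is the more demanding of the two.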

A hazard rate order comparison of random sums is given in the following result.

Theorem 1.B.7. Let $\{X_i, i = 1, 2, \ldots\}$ be a sequence of nonnegative IFR independent random variables. Let $M$ and $N$ be two discrete positive integer-valued random variables such that $M \le_{\mathrm{hr}} N$ (in the sense of (1.B.9) or (1.B.10)), and assume that $M$ and $N$ are independent of the $X_i$'s. Then

$$\sum_{i=1}^{M} X_i \le_{\mathrm{hr}} \sum_{i=1}^{N} X_i.$$

The hazard rate order (unlike the usual stochastic order; see Theorem 1.A.3(d)) does not have the property of being simply closed under mixtures. However, under quite strong conditions the order $\le_{\mathrm{hr}}$ is closed under mixtures. This is shown in the next theorem.

Theorem 1.B.8. Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta] \le_{\mathrm{hr}} [Y \mid \Theta = \theta']$ for all $\theta$ and $\theta'$ in the support of $\Theta$. Then $X \le_{\mathrm{hr}} Y$.

Proof. Select a $\theta$ and a $\theta'$ in the support of $\Theta$. Let $\bar F(\cdot \mid \theta)$, $\bar G(\cdot \mid \theta)$, $\bar F(\cdot \mid \theta')$, and $\bar G(\cdot \mid \theta')$ be the survival functions of $[X \mid \Theta = \theta]$, $[Y \mid \Theta = \theta]$, $[X \mid \Theta = \theta']$, and $[Y \mid \Theta = \theta']$, respectively. For simplicity assume that these random variables have densities which we denote by $f(\cdot \mid \theta)$, $g(\cdot \mid \theta)$, $f(\cdot \mid \theta')$, and $g(\cdot \mid \theta')$, respectively. It is sufficient to show that for $\alpha \in (0, 1)$ we have

$$\frac{\alpha f(t \mid \theta) + (1 - \alpha) f(t \mid \theta')}{\alpha \bar F(t \mid \theta) + (1 - \alpha) \bar F(t \mid \theta')} \ge \frac{\alpha g(t \mid \theta) + (1 - \alpha) g(t \mid \theta')}{\alpha \bar G(t \mid \theta) + (1 - \alpha) \bar G(t \mid \theta')} \quad \text{for all } t \ge 0. \tag{1.B.12}$$

This is an inequality of the form

$$\frac{a + b}{c + d} \ge \frac{w + x}{y + z},$$

where all eight variables are nonnegative and by the assumptions of the theorem they satisfy

$$\frac{a}{c} \ge \frac{w}{y}, \quad \frac{a}{c} \ge \frac{x}{z}, \quad \frac{b}{d} \ge \frac{w}{y}, \quad \text{and} \quad \frac{b}{d} \ge \frac{x}{z}.$$

It is easy to verify that the latter four inequalities imply the former one, completing the proof of the theorem.

It should be pointed out, however, that mixtures of distributions which are ordered by the hazard rate order are ordered by the usual stochastic order. That is, if $X$, $Y$, and $\Theta$ are random variables such that $[X \mid \Theta = \theta] \le_{\mathrm{hr}} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$, then $X \le_{\mathrm{st}} Y$. This follows from a (conditional) application of Theorem 1.B.1, combined with the fact that the usual stochastic order is closed under mixtures (Theorem 1.A.3(d)).

In order to state the next characterization we define the following class of bivariate functions:

$$\mathcal{G}_{\mathrm{hr}} = \bigl\{ \phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) \text{ is increasing in } x, \text{ for each } y, \text{ on } \{x \ge y\}, \text{ and is decreasing in } y, \text{ for each } x, \text{ on } \{y \ge x\} \bigr\}.$$

Theorem 1.B.9. Let $X$ and $Y$ be independent random variables. Then $X \le_{\mathrm{hr}} Y$ if, and only if,

$$\phi(X, Y) \le_{\mathrm{st}} \phi(Y, X) \quad \text{for all } \phi \in \mathcal{G}_{\mathrm{hr}}. \tag{1.B.13}$$

Proof. Suppose that (1.B.13) holds. Select an $x$ and a $y$ such that $x \ge y$. Let $\phi(u, v) = I_{\{u \ge x, v \ge y\}}$, where $I_A$ denotes the indicator function of the set $A$. It is easy to see that $\phi(u, v)$ is increasing in $u$. In addition, for a fixed $u$ and $v$ such that $v \ge u$, we have that $\phi(u, v) = 1$ if $u \ge x$ and $\phi(u, v) = 0$ if $u < x$. Therefore, $\phi \in \mathcal{G}_{\mathrm{hr}}$. Hence,

$$\bar F(y)\bar G(x) = E\phi(Y, X) \ge E\phi(X, Y) = \bar F(x)\bar G(y) \quad \text{whenever } x \ge y,$$

where $\bar F$ and $\bar G$ are the survival functions of $X$ and $Y$, respectively. Therefore, by (1.B.4), $X \le_{\mathrm{hr}} Y$.

Conversely, assume that $X \le_{\mathrm{hr}} Y$. Let $\psi : \mathbb{R} \to \mathbb{R}$ be an increasing function and let $\phi \in \mathcal{G}_{\mathrm{hr}}$. Denote $a(x, y) = \psi(\phi(x, y)) - \psi(\phi(y, x))$. For simplicity assume that $a$ is differentiable and that $X$ and $Y$ have densities that we denote by $f$ and $g$, respectively (otherwise, approximation arguments can be used). Then

$$\begin{aligned}
E a(X, Y) &= \int_{y=-\infty}^{\infty} \int_{x \ge y} a(x, y) [f(x) g(y) - f(y) g(x)]\,dx\,dy \\
&= \int_{y=-\infty}^{\infty} \int_{x \ge y} \frac{\partial}{\partial x} a(x, y) \bigl[ \bar F(x) g(y) - f(y) \bar G(x) \bigr]\,dx\,dy \le 0,
\end{aligned}$$

where the second equality follows from integration by parts, and the inequality follows from $X \le_{\mathrm{hr}} Y$, the fact that $a(x, y)$ increases in $x$ for all $x \ge y$, and (1.B.5).

The next result is a similar characterization. It uses the notation of Theorem 1.A.10, and their comparison is of interest. The proof of the following theorem is omitted.

Theorem 1.B.10. Let $X$ and $Y$ be two independent random variables. Then $X \le_{\mathrm{hr}} Y$ if, and only if,

$$E\phi_1(X, Y) \le E\phi_2(X, Y)$$

for all $\phi_1$ and $\phi_2$ such that, for each $x$, $\Delta\phi_{21}(x, y)$ increases in $y$ on $\{y \ge x\}$, and such that $\Delta\phi_{21}(x, y) \ge -\Delta\phi_{21}(y, x)$ whenever $x \le y$.

A further similar characterization is given in Theorem 4.A.36. The next result describes another characterization of the order $\le_{\mathrm{hr}}$.

Theorem 1.B.11. Let $X$ and $Y$ be two, absolutely continuous or discrete, independent random variables. Then $X \le_{\mathrm{hr}} Y$ if, and only if,

$$[X \mid \min(X, Y) = z] \le_{\mathrm{hr}} [Y \mid \min(X, Y) = z] \quad \text{for all } z. \tag{1.B.14}$$

Also, $X \le_{\mathrm{hr}} Y$ if, and only if,

$$[X \mid \min(X, Y) = z] \le_{\mathrm{st}} [Y \mid \min(X, Y) = z] \quad \text{for all } z. \tag{1.B.15}$$

Proof. First suppose that $X$ and $Y$ are absolutely continuous. Denote the survival functions of $X$ and $Y$ by $\bar F$ and $\bar G$, respectively, and denote the corresponding density functions by $f$ and $g$. Then

$$P[X > x \mid \min(X, Y) = z] = \begin{cases} 1, & \text{if } x < z, \\[4pt] \dfrac{\bar F(x) g(z)}{f(z)\bar G(z) + g(z)\bar F(z)}, & \text{if } x \ge z, \end{cases} \tag{1.B.16}$$

and

$$P[Y > x \mid \min(X, Y) = z] = \begin{cases} 1, & \text{if } x < z, \\[4pt] \dfrac{\bar G(x) f(z)}{f(z)\bar G(z) + g(z)\bar F(z)}, & \text{if } x \ge z. \end{cases} \tag{1.B.17}$$

Therefore

$$\frac{P[Y > x \mid \min(X, Y) = z]}{P[X > x \mid \min(X, Y) = z]} = \begin{cases} 1, & \text{if } x < z, \\[4pt] \dfrac{\bar G(x)}{\bar F(x)} \cdot \dfrac{f(z)}{g(z)}, & \text{if } x \ge z. \end{cases} \tag{1.B.18}$$

If $X \le_{\mathrm{hr}} Y$, then $\frac{\bar G(z)}{\bar F(z)} \cdot \frac{f(z)}{g(z)} \ge 1$, and $\frac{\bar G(x)}{\bar F(x)}$ is increasing in $x$. Thus (1.B.18) is increasing in $x$, and (1.B.14) follows. Obviously (1.B.15) follows from (1.B.14).

Now suppose that (1.B.15) holds. Then from (1.B.16) and (1.B.17) we get that $\bar F(x) g(z) \le \bar G(x) f(z)$ for all $x \ge z$. Therefore $X \le_{\mathrm{hr}} Y$ by (1.B.5).

The proof when $X$ and $Y$ are discrete is similar.


Some related characterizations are given in the next result.

Theorem 1.B.12. Let $X$ and $Y$ be two independent random variables. The following conditions are equivalent:
(a) $X \le_{\mathrm{hr}} Y$.
(b) $E[\alpha(X)]E[\beta(Y)] \le E[\alpha(Y)]E[\beta(X)]$ for all functions $\alpha$ and $\beta$ for which the expectations exist and such that $\beta$ is nonnegative and $\alpha/\beta$ and $\beta$ are increasing.
(c) For any two increasing functions $a$ and $b$ such that $b$ is nonnegative, if $E[a(X)b(X)] = 0$, then $E[a(Y)b(Y)] \ge 0$.

Proof. Assume (a). Let $\alpha$ and $\beta$ be as in (b). Define $\phi_1(x, y) = \alpha(x)\beta(y)$ and $\phi_2(x, y) = \alpha(y)\beta(x)$. Then $\Delta\phi_{21}(x, y) = \phi_2(x, y) - \phi_1(x, y) = \beta(x)\beta(y) \cdot [\alpha(y)/\beta(y) - \alpha(x)/\beta(x)]$, which is increasing in $y$. Note that $\Delta\phi_{21}(x, y) + \Delta\phi_{21}(y, x) = 0$. Condition (b) now follows from Theorem 1.B.10.

Assume (b). By taking, for some $u \le v$, $\alpha(x) = I_{(v,\infty)}(x)$ and $\beta(x) = I_{(u,\infty)}(x)$ in (b) one obtains (1.B.4), from which (a) follows.

Assume (b). Let $a$ and $b$ be two increasing functions such that $b$ is nonnegative and such that $E[a(X)b(X)] = 0$. Define $\beta(x) = b(x)$ and $\alpha(x) = a(x)b(x)$. Substitution in (b) yields $E[a(Y)b(Y)] \ge 0$; that is, (c) holds.

Assume (c). Let $\alpha$ and $\beta$ be as in (b). Denote $c = E[\alpha(X)]/E[\beta(X)]$. Define $a(x) = \alpha(x)/\beta(x) - c$ and $b(x) = \beta(x)$. Then $E[a(X)b(X)] = 0$. So, by (c), $E[a(Y)b(Y)] \ge 0$. But the latter reduces to $E[\alpha(X)]E[\beta(Y)] \le E[\alpha(Y)]E[\beta(X)]$, and this establishes (b).

Example 1.B.13. Let $\{N(t), t \ge 0\}$ be a nonhomogeneous Poisson process with mean function $\Lambda$ (that is, $\Lambda(t) \equiv E[N(t)]$, $t \ge 0$). Let $T_1, T_2, \ldots$ be the successive epoch times, and let $X_n \equiv T_n - T_{n-1}$, $n = 1, 2, \ldots$ (where $T_0 \equiv 0$), be the corresponding inter-epoch times. The survival function of $T_n$ is given by

$$P\{T_n > t\} = \sum_{i=0}^{n-1} \frac{(\Lambda(t))^i}{i!}\, e^{-\Lambda(t)}, \quad t \ge 0,\ n = 1, 2, \ldots.$$

It is easy to verify that $P\{T_{n+1} > t\}/P\{T_n > t\}$ is increasing in $t \ge 0$, $n = 1, 2, \ldots$, and thus, by (1.B.3),

$$T_n \le_{\mathrm{hr}} T_{n+1}, \quad n = 1, 2, \ldots. \tag{1.B.19}$$
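The monotonicity behind (1.B.19) can be inspected numerically. The sketch below assumes the mean function $\Lambda(t) = t^2$ purely for illustration and uses hypothetical helper names; it evaluates the ratio $P\{T_{n+1} > t\}/P\{T_n > t\}$ on a grid and checks that it does not decrease.

```python
import math

def epoch_survival(n, mean_value):
    """P{T_n > t} for a Poisson process whose mean function equals
    mean_value at time t: sum_{i<n} mean_value^i / i! * exp(-mean_value)."""
    partial = sum(mean_value**i / math.factorial(i) for i in range(n))
    return math.exp(-mean_value) * partial

def mean_function(t):
    """Illustrative mean function Lambda(t) = t^2 (an assumption of this sketch)."""
    return t * t

grid = [0.1 * k for k in range(1, 60)]

def ratio_is_increasing(n, tol=1e-9):
    """Grid check that P{T_{n+1} > t} / P{T_n > t} is nondecreasing in t."""
    ratios = [epoch_survival(n + 1, mean_function(t)) / epoch_survival(n, mean_function(t))
              for t in grid]
    return all(b >= a - tol for a, b in zip(ratios, ratios[1:]))

checks = {n: ratio_is_increasing(n) for n in (1, 2, 5)}
print(checks)
```

For $n = 1$ the ratio reduces to $1 + \Lambda(t)$, which makes the monotonicity visible directly.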

A result that is stronger than (1.B.19) is given in Example 1.C.47. If we denote by $F_n$ the distribution function of $T_n$, then

$$\begin{aligned}
P\{X_{n+1} > t\} &= \int_0^\infty P\{T_{n+1} - T_n > t \mid T_n = u\}\,dF_n(u) \\
&= \int_0^\infty \exp\{-[\Lambda(t + u) - \Lambda(u)]\}\,dF_n(u) \\
&= E[\exp\{-[\Lambda(t + T_n) - \Lambda(T_n)]\}], \quad n = 0, 1, \ldots.
\end{aligned}$$

Fix $t_1 \le t_2$ and let $\alpha(x) \equiv \exp\{-[\Lambda(t_2 + x) - \Lambda(x)]\}$ and $\beta(x) \equiv \exp\{-[\Lambda(t_1 + x) - \Lambda(x)]\}$. Note that if $\Lambda$ is concave, then $\alpha(x)/\beta(x)$ is increasing. Thus, by Theorem 1.B.12(b), if $\Lambda$ is concave, then

$$\frac{P\{X_{n+1} > t_2\}}{P\{X_n > t_2\}} = \frac{E[\alpha(T_n)]}{E[\alpha(T_{n-1})]} \ge \frac{E[\beta(T_n)]}{E[\beta(T_{n-1})]} = \frac{P\{X_{n+1} > t_1\}}{P\{X_n > t_1\}}, \quad n = 1, 2, \ldots.$$

It follows, by (1.B.3), that

$$X_n \le_{\mathrm{hr}} X_{n+1}, \quad n = 1, 2, \ldots.$$

It can be shown in a similar manner that if $\Lambda$ is convex, then $X_n \ge_{\mathrm{hr}} X_{n+1}$, $n = 1, 2, \ldots$.

As another example of the use of Theorem 1.B.12 consider an increasing convex function $H$ such that $H(0) = 0$. Let $X$ and $Y$ be nonnegative random variables such that $X \le_{\mathrm{hr}} Y$. Then

$$\frac{E[H(X)]}{E[X]} \le \frac{E[H(Y)]}{E[Y]}.$$

Rather than using Theorem 1.B.12, one can also obtain the above inequality from (2.B.5) in Chapter 2, and from the fact that the hazard rate order implies the hmrl order (which is discussed there).

Other characterizations of the order $\le_{\mathrm{hr}}$ can be found in Theorems 2.A.6 and 5.A.22.

Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a subset of the real line. As in Section 1.A.3 let $X(\theta)$ denote a random variable with distribution function $G_\theta$. For any random variable $\Theta$ with support in $\mathcal{X}$, and with distribution function $F$, let us denote by $X(\Theta)$ a random variable with distribution function $H$ given by

$$H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.$$

The following result generalizes both Theorems 1.B.2 and 1.B.8, just as Theorem 1.A.6 generalized both parts (a) and (c) of Theorem 1.A.3.

Theorem 1.B.14. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, $i = 1, 2$, that is, suppose that the survival function of $Y_i$ is given by

$$\bar H_i(y) = \int_{\mathcal{X}} \bar G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

If

$$X(\theta) \le_{\mathrm{hr}} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{1.B.20}$$

and if

$$\Theta_1 \le_{\mathrm{hr}} \Theta_2, \tag{1.B.21}$$

then

$$Y_1 \le_{\mathrm{hr}} Y_2. \tag{1.B.22}$$


Proof. Assumption (1.B.20) means that $\bar G_\theta(y)$ is TP2 (totally positive of order 2) as a function of $\theta \in \mathcal{X}$ and of $y \in \mathbb{R}$ (that is, $\bar G_\theta(y)\bar G_{\theta'}(y') \ge \bar G_\theta(y')\bar G_{\theta'}(y)$ whenever $y \le y'$ and $\theta \le \theta'$). Assumption (1.B.21) means that $\bar F_i(\theta)$, as a function of $i \in \{1, 2\}$ and of $\theta \in \mathcal{X}$, is TP2. Also, from Theorem 1.B.1 it follows that $\bar G_\theta(y)$ is increasing in $\theta$. Therefore, by Theorem 2.1 of Joag-Dev, Kochar, and Proschan [259], $\bar H_i(y)$ is TP2 in $i \in \{1, 2\}$ and $y \in \mathbb{R}$. That gives (1.B.22).

The following example shows an interesting and useful application of Theorem 1.B.14.

Example 1.B.15. Let $\{X_n^i, n \ge 0\}$ be a Markov chain with state space $\{1, 2, \ldots, M\}$ ($M$ can be infinity) and transition matrix $P$, which starts from state $i$; that is, $X_0^i = i$. If $X_1^i \le_{\mathrm{hr}} X_1^{i'}$ for all $i \le i'$, then
(a) $I_1 \le_{\mathrm{hr}} I_2$ implies that $X_n^{I_1} \le_{\mathrm{hr}} X_n^{I_2}$ for all $n \ge 0$, and
(b) $X_n^1 \le_{\mathrm{hr}} X_{n'}^1$ whenever $n \le n'$.

In order to prove (a), first note that the result is trivial for $n = 0$. Suppose that the result is true for $n = k$. Define $Y(i) = X_1^i$. By the Markov property, we have $X_{k+1}^i =_{\mathrm{st}} Y(X_k^i)$ for all $i$. By induction, $X_k^{I_1} \le_{\mathrm{hr}} X_k^{I_2}$. In particular, $Y(X_k^i) \le_{\mathrm{hr}} Y(X_k^{i'})$ for all $i \le i'$. Therefore, from Theorem 1.B.14 we get

$$X_{k+1}^{I_1} =_{\mathrm{st}} Y(X_k^{I_1}) \le_{\mathrm{hr}} Y(X_k^{I_2}) =_{\mathrm{st}} X_{k+1}^{I_2}.$$

In order to prove (b), note that $X_0^1 = 1 \le_{\mathrm{hr}} X_1^1$. So, by (a) and the Markov property we have $X_n^1 \le_{\mathrm{hr}} X_n^{X_1^1} =_{\mathrm{st}} X_{n+1}^1$.

The following example shows an application of Theorem 1.B.14 in the area of Bayesian imperfect repair.

Example 1.B.16. Let $\Theta_1$ and $\Theta_2$ be two random variables as in Example 1.A.7. Let $G_\theta$, $X(\theta)$, $Y_1$, and $Y_2$ also be as in Example 1.A.7. Note that (1.B.20) holds because $\bar K^{1-\theta'}(y)/\bar K^{1-\theta}(y)$ is increasing in $y$ whenever $0 < \theta \le \theta' \le 1$. Thus, if $\Theta_1 \le_{\mathrm{hr}} \Theta_2$, then $Y_1 \le_{\mathrm{hr}} Y_2$.

It is of interest to compare Example 1.B.16 to Example 5.B.13 which deals with random minima and maxima.

The next example deals with the same proportional hazard model as in Example 1.B.16; however, for convenience we change the notation.

Example 1.B.17. Let $\Theta$ and $X$ be two nonnegative random variables with distribution functions $F$ and $G$, respectively. Let $Y$ have the survival function $\bar H$ defined as

$$\bar H(y) = \int_0^\infty \bar G^\theta(y)\,dF(\theta), \quad y \ge 0.$$

Suppose that $G$ is absolutely continuous with hazard rate function $r$. Then $H$ is also absolutely continuous, and we denote its hazard rate function by $q$. We will now show that if $E\Theta \le 1$, then $X \le_{\mathrm{hr}} Y$. In order to see it, write $\bar H(y) = M(\log \bar G(y))$, where $M$ is the moment generating function of $\Theta$. Differentiating $-\log \bar H(y)$ we obtain

$$\begin{aligned}
q(y) = -\frac{d}{dy} \log \bar H(y) &= r(y)\, \frac{M'(\log \bar G(y))}{M(\log \bar G(y))} = r(y)\, \frac{E\,\Theta e^{\Theta \log \bar G(y)}}{E\,e^{\Theta \log \bar G(y)}} \\
&\le r(y)\, \frac{E\Theta\, E\,e^{\Theta \log \bar G(y)}}{E\,e^{\Theta \log \bar G(y)}} = r(y) E\Theta \le r(y),
\end{aligned}$$

where the first inequality follows from Chebyshev's Inequality (that is, $\mathrm{Cov}(\Theta, e^{\Theta \log \bar G(y)}) \le 0$), and the second inequality follows from $E\Theta \le 1$. The stated result now follows from (1.B.2).

The following result gives a Laplace transform characterization of the order $\le_{\mathrm{hr}}$. It should be compared with Theorem 1.A.13.

Theorem 1.B.18. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

$$X_1 \le_{\mathrm{hr}} X_2 \iff N_\lambda(X_1) \le_{\mathrm{hr}} N_\lambda(X_2) \quad \text{for all } \lambda > 0,$$

where the notation $N_\lambda(X_1) \le_{\mathrm{hr}} N_\lambda(X_2)$ is in the sense of (1.B.9).

Proof. First suppose that $X_1 \le_{\mathrm{hr}} X_2$. Denote

$$\Gamma_\lambda(n, x) = \lambda e^{-\lambda x} \frac{(\lambda x)^{n-1}}{(n - 1)!}, \quad n \ge 1,\ x \ge 0.$$

Let $\alpha_\lambda^{X_1}(n) = P\{N_\lambda(X_1) \ge n\}$ and $\alpha_\lambda^{X_2}(n) = P\{N_\lambda(X_2) \ge n\}$ be as in the proof of Theorem 1.A.13. Then it can be verified that

$$\alpha_\lambda^{X_1}(n) = \int_0^\infty \Gamma_\lambda(n, x) \bar F_1(x)\,dx \quad \text{and} \quad \alpha_\lambda^{X_2}(n) = \int_0^\infty \Gamma_\lambda(n, x) \bar F_2(x)\,dx,$$

where $\bar F_1$ and $\bar F_2$ are the survival functions corresponding to $X_1$ and $X_2$. For $n_1 \le n_2$, some computation yields

$$\begin{aligned}
&\alpha_\lambda^{X_1}(n_1)\alpha_\lambda^{X_2}(n_2) - \alpha_\lambda^{X_1}(n_2)\alpha_\lambda^{X_2}(n_1) \\
&\quad= \int_{y=0}^{\infty} \int_{x=0}^{y} [\Gamma_\lambda(n_1, x)\Gamma_\lambda(n_2, y) - \Gamma_\lambda(n_1, y)\Gamma_\lambda(n_2, x)] \times \bigl[ \bar F_1(x)\bar F_2(y) - \bar F_1(y)\bar F_2(x) \bigr]\,dx\,dy.
\end{aligned}$$

It is not hard to verify that if $x \le y$ and $n_1 \le n_2$, then $[\Gamma_\lambda(n_1, x)\Gamma_\lambda(n_2, y) - \Gamma_\lambda(n_1, y)\Gamma_\lambda(n_2, x)] \ge 0$. Also, using (1.B.4) it is seen that $X_1 \le_{\mathrm{hr}} X_2$ implies $\bar F_1(x)\bar F_2(y) - \bar F_1(y)\bar F_2(x) \ge 0$ for $x \le y$. Thus, from (1.B.10) it is seen that $N_\lambda(X_1) \le_{\mathrm{hr}} N_\lambda(X_2)$.


Now suppose that $N_\lambda(X_1) \le_{\mathrm{hr}} N_\lambda(X_2)$ for every $\lambda > 0$. Define $c(n, \lambda) = \alpha_\lambda^{X_1}(n)/\alpha_\lambda^{X_2}(n)$. It can be shown that $c(n, \lambda)$ increases in $\lambda$ and decreases in $n$. Thus, $c(n, n/x) \ge c(n, n/y)$ whenever $x \le y$. Letting $n \to \infty$ shows that $\bar F_1(x)/\bar F_2(x) \ge \bar F_1(y)/\bar F_2(y)$ for all continuity points $x$ and $y$ of $F_1$ and $F_2$ such that $x \le y$. Thus, from (1.B.3) it is seen that $X_1 \le_{\mathrm{hr}} X_2$.

The implication $\Longrightarrow$ in Theorem 1.B.18 can be generalized in the same manner that Theorem 1.A.14 generalizes the implication $\Longrightarrow$ in Theorem 1.A.13. We will not state the result here since it is equivalent to Theorem 1.B.14. A related result is the following.

Theorem 1.B.19. Let $X_1, X_2, \ldots, X_m$, $\Theta_1$, and $\Theta_2$ be independent nonnegative random variables. Define

$$N_j(t) = \sum_{i=1}^{m} I_{[\Theta_j X_i]}(t), \quad t \ge 0,\ j = 1, 2,$$

where

$$I_{[\Theta_j X_i]}(t) = \begin{cases} 1 & \text{if } \Theta_j X_i > t, \\ 0 & \text{if } \Theta_j X_i \le t. \end{cases}$$

If $\Theta_1 \le_{\mathrm{hr}} \Theta_2$ then $N_1(t) \le_{\mathrm{hr}} N_2(t)$ in the sense of (1.B.9) for all $t \ge 0$.

The following easy-to-prove result strengthens Theorem 1.A.15. An even stronger result appears in Theorem 1.C.27.

Theorem 1.B.20. Let $X$ be any random variable. Then $X_{(-\infty,a]}$ and $X_{(a,\infty)}$ are increasing in $a$ in the sense of the hazard rate order.

In Theorem 1.A.17 it was seen that if $\phi$ is a function which satisfies $\phi(x) \le x$ for all $x \in \mathbb{R}$, then $\phi(X) \le_{\mathrm{st}} X$. The order $\le_{\mathrm{hr}}$ does not satisfy such a general property. However, we have the following easy-to-prove result.

Theorem 1.B.21. Let $X$ be a nonnegative IFR random variable, and let $a \le 1$ be a positive constant. Then $aX \le_{\mathrm{hr}} X$.

In fact, a necessary and sufficient condition for a nonnegative random variable $X$, with survival function $\bar F$, to satisfy $aX \le_{\mathrm{hr}} X$ for all $0 < a < 1$, is that $\log \bar F(e^x)$ is concave in $x \ge 0$.

In the next result it is shown that a random variable, whose distribution is the mixture of two distributions of hazard rate ordered random variables, is bounded from below and from above, in the hazard rate order sense, by these two random variables.

Theorem 1.B.22. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, respectively. Let $W$ be a random variable with the distribution function $pF + (1 - p)G$ for some $p \in (0, 1)$. If $X \le_{\mathrm{hr}} Y$, then $X \le_{\mathrm{hr}} W \le_{\mathrm{hr}} Y$.


Proof. Let $u_X$, $u_Y$, and $u_W$ denote the right endpoints of the supports of the corresponding random variables, and note that $\max(u_X, u_W) = \max(u_X, u_Y)$. Now, if $X \le_{\mathrm{hr}} Y$, then

$$\frac{p\bar F(t) + (1 - p)\bar G(t)}{\bar F(t)} = p + (1 - p)\frac{\bar G(t)}{\bar F(t)}$$

is increasing in $t \in (-\infty, \max(u_X, u_W))$. Therefore, by (1.B.3), $X \le_{\mathrm{hr}} W$. The proof that $W \le_{\mathrm{hr}} Y$ is similar.
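The sandwich property of Theorem 1.B.22 is easy to see on a simple mixture. The sketch below assumes, for illustration only, that $X$ and $Y$ are exponential with rates 2 and 1 and that $p = 0.3$; the hazard rate of the mixture $W$ is then computed in closed form from its density and survival function and compared with the two constant hazard rates via (1.B.2). All names are hypothetical.

```python
import math

RATE_X, RATE_Y, P = 2.0, 1.0, 0.3  # illustrative values, not from the text

def mixture_hazard(t):
    """Hazard rate of W, whose survival function is
    P * exp(-RATE_X * t) + (1 - P) * exp(-RATE_Y * t)."""
    sx, sy = math.exp(-RATE_X * t), math.exp(-RATE_Y * t)
    density = P * RATE_X * sx + (1 - P) * RATE_Y * sy
    survival = P * sx + (1 - P) * sy
    return density / survival

grid = [0.05 * k for k in range(400)]
hazards = [mixture_hazard(t) for t in grid]

# X <=hr W <=hr Y corresponds to RATE_X >= r_W(t) >= RATE_Y for all t.
sandwiched = all(RATE_Y - 1e-12 <= h <= RATE_X + 1e-12 for h in hazards)
print(sandwiched, hazards[0])
```

At $t = 0$ the mixture hazard equals $p \cdot 2 + (1 - p) \cdot 1 = 1.3$, and it decreases toward the smaller rate as $t$ grows.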

Example 1.B.23. For a nonnegative random variable $X$ with density function $f$, and for a nonnegative function $w$ such that $E[w(X)]$ exists, define $X^w$ as the random variable with the so-called weighted density function $f_w$ given by

$$f_w(x) = \frac{w(x) f(x)}{E[w(X)]}, \quad x \ge 0.$$

Similarly, for another nonnegative random variable $Y$ with density function $g$, such that $E[w(Y)]$ exists, define $Y^w$ as the random variable with the density function $g_w$ given by

$$g_w(x) = \frac{w(x) g(x)}{E[w(Y)]}, \quad x \ge 0.$$

We will show that if $w$ is increasing, then

$$X \le_{\mathrm{hr}} Y \implies X^w \le_{\mathrm{hr}} Y^w. \tag{1.B.23}$$

In order to do this, first note that the hazard rate function $r_{X^w}$ of $X^w$ is given by

$$r_{X^w}(x) = \frac{w(x) r_X(x)}{E[w(X) \mid X > x]}, \quad x \ge 0,$$

where $r_X$ is the hazard rate function of $X$. Similarly, the hazard rate function $r_{Y^w}$ of $Y^w$ is given by

$$r_{Y^w}(x) = \frac{w(x) r_Y(x)}{E[w(Y) \mid Y > x]}, \quad x \ge 0,$$

where $r_Y$ is the hazard rate function of $Y$. Now, from $X \le_{\mathrm{hr}} Y$ it follows that $[X \mid X > x] \le_{\mathrm{hr}} [Y \mid Y > x]$ for all $x \ge 0$. Next, using Theorem 1.B.2 and the monotonicity of $w$, we get that $[w(X) \mid X > x] \le_{\mathrm{hr}} [w(Y) \mid Y > x]$, and therefore, by Theorem 1.B.1, $E[w(X) \mid X > x] \le E[w(Y) \mid Y > x]$. Combining this inequality with $r_X \ge r_Y$, it is seen that $r_{X^w} \ge r_{Y^w}$.

The above random variables are also studied in Example 1.C.59. In particular, taking $w$ to be the identity function $w(x) = x$, we see from (1.B.23) that the hazard rate ordering of $X$ and $Y$ implies the hazard rate ordering of the corresponding spread (or length-biased) random variables. See Example 8.B.12 for another result involving spreads.


Analogous to the result in Remark 1.A.18, it can be shown that the set of all distribution functions on R ∪ {∞} is a lattice with respect to the order ≤hr . The following example may be compared to Examples 1.C.48, 2.A.22, 3.B.38, 4.B.14, 6.B.41, 6.D.8, 6.E.13, and 7.B.13. Example 1.B.24. Let X and Y be two absolutely continuous nonnegative random variables with survival functions F and G, respectively. Denote Λ1 = − log F , Λ2 = − log G, and λi = Λi , i = 1, 2. Consider two nonhomogeneous Poisson processes N1 = {N1 (t), t ≥ 0} and N2 = {N2 (t), t ≥ 0} with mean functions Λ1 and Λ2 (see Example 1.B.13), respectively. Let Ti,1 , Ti,2 , . . . be the successive epoch times of process Ni , i = 1, 2. Note that X =st T1,1 and Y =st T2,1 . It turns out that the hazard rate ordering of the ﬁrst two epoch times implies the hazard rate ordering of all the corresponding later epoch times; that is, it will be shown below that if X ≤hr Y , then T1,n ≤hr T2,n , n ≥ 1. The survival function F 1,n of T1,n is given by F 1,n (t) = P (T1,n > t) =

n−1 j=0

(Λ1 (t))j −Λ1 (t) = Γ n (Λ1 (t)), e j!

t ≥ 0, (1.B.24)

where Γ n is the survival function of the gamma distribution with scale parameter 1 and shape parameter n. The corresponding density function f1,n is given by f1,n (t) = γn (Λ1 (t))λ1 (t), t ≥ 0, where γn is the density function associated with Γ n . The corresponding hazard rate function rF1,n is given by rF1,n (t) = rΓn (Λ1 (t))λ1 (t),

t ≥ 0,

where rΓn is the hazard rate function associated with Γ n . Similarly, rF2,n (t) = rΓn (Λ2 (t))λ2 (t),

t ≥ 0.

If X ≤hr Y , then rF1,n (t) = rΓn (Λ1 (t))λ1 (t) ≥ rΓn (Λ2 (t))λ2 (t) = rF2,n (t),

t ≥ 0,

where the inequality follows from $\lambda_1(t) \ge \lambda_2(t)$, $\Lambda_1(t) \ge \Lambda_2(t)$, and the fact that the hazard rate function of the gamma distribution described above is increasing.

Now let Xi,n ≡ Ti,n − Ti,n−1, n ≥ 1 (where Ti,0 ≡ 0), be the inter-epoch times of the process Ni, i = 1, 2. Again, note that X =st X1,1 and Y =st X2,1. It turns out that, under some conditions, the hazard rate ordering of the two first inter-epoch times implies the hazard rate ordering of all the corresponding later inter-epoch times. Explicitly, it will be shown below that if X ≤hr Y, and if $\bar F$ and $\bar G$ are logconvex (that is, X and Y are DFR), and if
\[
\frac{\lambda_2(t)}{\lambda_1(t)} \ \text{is increasing in } t \ge 0, \tag{1.B.25}
\]

then X1,n ≤hr X2,n for each n ≥ 1. For the purpose of this proof let us denote $\bar F$ by $\bar F_1$, and $\bar G$ by $\bar F_2$. Let $\bar G_{i,n}$ denote the survival function of Xi,n, i = 1, 2. The stated result is obvious for n = 1, so let us fix an n ≥ 2. Then, from (7) in Baxter [62] we obtain
\[
\bar G_{i,n}(t) = \int_0^{\infty} \lambda_i(s)\, \frac{\Lambda_i^{n-2}(s)}{(n-2)!}\, \bar F_i(s+t)\, ds, \qquad t \ge 0,\ i \in \{1, 2\}. \tag{1.B.26}
\]

Condition (1.B.25) means that $\lambda_i(t)$ is TP2 (totally positive of order 2) in (i, t). Condition (1.B.25) also implies that $\Lambda_2(t)/\Lambda_1(t)$ is increasing in t ≥ 0, that is, $\Lambda_i(t)$ is TP2 in (i, t). Since a product of TP2 kernels is TP2 we get that
\[
\lambda_i(t)\, \frac{\Lambda_i^{n-2}(t)}{(n-2)!} \ \text{is TP}_2 \text{ in } (i, t).
\]
The assumption F1 ≤hr F2 implies that $\bar F_i(s+t)$ is TP2 in (i, s) and in (i, t). Finally, the logconvexity of $\bar F_1$ and of $\bar F_2$ means that $\bar F_i(s+t)$ is TP2 in (s, t). Thus, by Theorem 5.1 on page 123 of Karlin [275], we get that $\bar G_{i,n}(t)$ is TP2 in (i, t); that is, X1,n ≤hr X2,n.

The inequality X1,n ≤hr X2,n, n ≥ 1, can also be obtained under slightly weaker assumptions, namely, that X ≤hr Y, that (1.B.25) holds, and that either X or Y is DFR; see Hu and Zhuang [245].

Example 1.B.25. Let X1, X2, Y1, and Y2 be independent, nonnegative random variables such that X1 =st X2 and Y1 =st Y2. Denote by λX and λY the hazard rate functions of X1 and Y1, respectively. If X1 ≤hr Y1, and if λY/λX is decreasing on [0, 1), then
\[
\min\{\max(X_1, X_2), \max(Y_1, Y_2)\} \le_{\mathrm{hr}} \min\{\max(X_1, Y_1), \max(X_2, Y_2)\}.
\]
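The conclusion of Example 1.B.25 can be illustrated numerically. The following is a minimal sketch, assuming exponential lifetimes (so that the ratio λY/λX is constant and hence decreasing in the non-strict sense); the function names, the rates 2 and 1, and the grid are illustrative choices and not part of the example. It uses the fact that, for independent variables, the survival function of a maximum is one minus the product of the distribution functions, and checks that the ratio of the two survival functions is increasing, which is the survival-ratio form of the order ≤hr.

```python
import math


def survival_lhs(t, rate_x, rate_y):
    """Survival function of min{max(X1, X2), max(Y1, Y2)} at time t."""
    cdf_x = 1.0 - math.exp(-rate_x * t)  # common cdf of X1 and X2
    cdf_y = 1.0 - math.exp(-rate_y * t)  # common cdf of Y1 and Y2
    return (1.0 - cdf_x ** 2) * (1.0 - cdf_y ** 2)


def survival_rhs(t, rate_x, rate_y):
    """Survival function of min{max(X1, Y1), max(X2, Y2)} at time t."""
    cdf_x = 1.0 - math.exp(-rate_x * t)
    cdf_y = 1.0 - math.exp(-rate_y * t)
    return (1.0 - cdf_x * cdf_y) ** 2


def survival_ratio_is_nondecreasing(rate_x, rate_y, t_max=5.0, steps=500,
                                    rel_tol=1e-9):
    """Check on a grid that survival_rhs / survival_lhs does not decrease in t."""
    grid = [t_max * k / steps for k in range(1, steps + 1)]
    ratios = [survival_rhs(t, rate_x, rate_y) / survival_lhs(t, rate_x, rate_y)
              for t in grid]
    return all(later >= earlier * (1.0 - rel_tol)
               for earlier, later in zip(ratios, ratios[1:]))


# X1 has the larger hazard rate, so X1 <=hr Y1 (assumed illustrative rates).
print(survival_ratio_is_nondecreasing(rate_x=2.0, rate_y=1.0))
```

A grid check of this kind only supports the claim for the chosen distributions; it is not a substitute for the proof.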


1.B.4 Comparison of order statistics

Let X1, X2, . . . , Xm be random variables. As usual denote the corresponding order statistics by X(1) ≤ X(2) ≤ · · · ≤ X(m). When we want to emphasize the dependence on m, we denote the order statistics by X(1:m) ≤ X(2:m) ≤ · · · ≤ X(m:m). The following three theorems compare the order statistics in the hazard rate order.

Theorem 1.B.26. If X1, X2, . . . , Xm are independent random variables, then X(k) ≤hr X(k+1) for k = 1, 2, . . . , m − 1.

A relatively simple proof of Theorem 1.B.26 can be obtained using the likelihood ratio order which is discussed in the next section. Therefore the proof of this theorem will be given there in Remark 1.C.40. Theorem 12.5 in Cramer and Kamps [136] extends Theorem 1.B.26 to the so-called sequential order statistics. Further comparisons of order statistics are given in the next two theorems.

Theorem 1.B.27. Let X1, X2, . . . , Xm be independent random variables. If Xj ≤hr Xm for all j = 1, 2, . . . , m − 1, then X(k−1:m−1) ≤hr X(k:m) for k = 2, 3, . . . , m.

Theorem 1.B.28. If X1, X2, . . . , Xm are independent random variables, then X(k:m−1) ≥hr X(k:m) for k = 1, 2, . . . , m − 1.

From Theorem 1.B.28 (with k = 1) it follows that if X1, X2, . . . , Xm are independent random variables, then
\[
X_{(1:1)} \ge_{\mathrm{hr}} X_{(1:2)} \ge_{\mathrm{hr}} \cdots \ge_{\mathrm{hr}} X_{(1:m)}. \tag{1.B.27}
\]
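The chain (1.B.27) can be checked numerically for independent lifetimes. The sketch below assumes three Weibull lifetimes with arbitrarily chosen shapes and scales (hypothetical values, not from the text) and uses the survival-ratio form of the order ≤hr: the ratio of the survival function of the larger variable to that of the smaller one should be increasing. Under independence the survival function of X(1:k) is the product of the first k survival functions.

```python
import math


def weibull_survival(shape, scale):
    """Return the survival function t -> exp(-(t/scale)**shape)."""
    def survival(t):
        return math.exp(-((t / scale) ** shape))
    return survival


def min_survival(survivals, k):
    """Survival function of X(1:k) = min(X1, ..., Xk) for independent Xi."""
    def survival(t):
        return math.prod(s(t) for s in survivals[:k])
    return survival


def is_hr_smaller(surv_small, surv_large, grid, rel_tol=1e-12):
    """Grid check that surv_large / surv_small is nondecreasing."""
    ratios = [surv_large(t) / surv_small(t) for t in grid]
    return all(b >= a * (1.0 - rel_tol) for a, b in zip(ratios, ratios[1:]))


def chain_holds(survivals, grid):
    """Check X(1:k+1) <=hr X(1:k) for k = 1, ..., m - 1."""
    return all(
        is_hr_smaller(min_survival(survivals, k + 1),
                      min_survival(survivals, k), grid)
        for k in range(1, len(survivals))
    )


# Illustrative (assumed) parameters: one DFR, one exponential, one IFR lifetime.
lifetimes = [weibull_survival(0.5, 1.0),
             weibull_survival(1.0, 2.0),
             weibull_survival(2.5, 1.5)]
grid = [0.01 * k for k in range(1, 301)]
print(chain_holds(lifetimes, grid))
```

In the independent case the ratio reduces to the reciprocal of the survival function of the added variable, which is why no assumption on the individual distributions is needed.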

One may wonder what kind of results of this type hold without the independence assumption. Since X(1:1) ≥ X(1:2) ≥ · · · ≥ X(1:m) a.s., it follows from Theorem 1.A.1 that X(1:1) ≥st X(1:2) ≥st · · · ≥st X(1:m) holds without any (independence) assumption. However, a counterexample in the literature shows that (1.B.27) does not always hold. We now describe some conditions under which (1.B.27) holds.

Let X = (X1, X2, . . . , Xm) be a random vector with a partially differentiable survival function $\bar F$. The function $R = -\log \bar F$ is called the hazard function of X, and the vector $r_X$ of partial derivatives, defined by
\[
r_X(x) = \bigl(r_X^{(1)}(x), r_X^{(2)}(x), \ldots, r_X^{(m)}(x)\bigr)
= \Bigl(\frac{\partial}{\partial x_1} R(x), \frac{\partial}{\partial x_2} R(x), \ldots, \frac{\partial}{\partial x_m} R(x)\Bigr), \tag{1.B.28}
\]
for all $x \in \{x : \bar F(x) > 0\}$, is called the hazard gradient of X; see Johnson and Kotz [264] and Marshall [381]. Note that $r_X^{(i)}(x)$ can be interpreted as the conditional hazard rate of Xi evaluated at xi, given that Xj > xj for all j ≠ i. That is,
\[
r_X^{(i)}(x) = \frac{f_i(x_i \mid X_j > x_j,\ j \ne i)}{\bar F_i(x_i \mid X_j > x_j,\ j \ne i)},
\]
where $f_i(\cdot \mid X_j > x_j, j \ne i)$ and $\bar F_i(\cdot \mid X_j > x_j, j \ne i)$ are the conditional density and survival functions of Xi, given that Xj > xj for all j ≠ i. For convenience, here and below we set $r_X^{(i)}(x) = \infty$ for all $x \in \{x : \bar F(x) = 0\}$.

For any subset P ⊆ {1, 2, . . . , m} define
\[
Y_P = \min_{i \in P} X_i.
\]

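Denote
\[
1_P(i) = \begin{cases} 0 & \text{if } i \notin P, \\ 1 & \text{if } i \in P, \end{cases}
\qquad 1_P = (1_P(1), 1_P(2), \ldots, 1_P(m)), \qquad \text{and} \qquad 1_{P^c} = 1 - 1_P,
\]
where 1 = (1, 1, . . . , 1), and $P^c$ denotes the complement of P in {1, 2, . . . , m}. Also denote
\[
\infty \cdot 1_{P^c}(i) = \begin{cases} 0 & \text{if } i \notin P^c, \\ \infty & \text{if } i \in P^c, \end{cases}
\]
and $\infty \cdot 1_{P^c} = (\infty \cdot 1_{P^c}(1), \infty \cdot 1_{P^c}(2), \ldots, \infty \cdot 1_{P^c}(m))$. Then the survival function $\bar G_P$ of $Y_P$ can be expressed as
\[
\bar G_P(t) = \bar F(t \cdot 1_P - \infty \cdot 1_{P^c}), \qquad t \in R.
\]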

Theorem 1.B.29. Let (X1, X2, . . . , Xm) be a random vector with an absolutely continuous distribution function. Let P and Q be two subsets of {1, 2, . . . , m} such that P ⊂ Q. If
\[
r^{(i)}(t \cdot 1_P - \infty \cdot 1_{P^c}) \le r^{(i)}(t \cdot 1_Q - \infty \cdot 1_{Q^c}), \qquad t \in R,\ i \in P, \tag{1.B.29}
\]
then $Y_P \ge_{\mathrm{hr}} Y_Q$.

A sufficient condition for (1.B.29) is that
\[
r^{(i)}(x_1, x_2, \ldots, x_m) \ \text{is increasing in } x_j, \qquad j \ne i,\ i = 1, 2, \ldots, m.
\]
This is easily seen to be equivalent to the requirement that
\[
\frac{\bar F(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_m)}{\bar F(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_m)} \ \text{is decreasing in } x_j,\ j \ne i, \ \text{whenever } x_i \le x_i',\ i = 1, 2, \ldots, m. \tag{1.B.30}
\]
Condition (1.B.30) means that $\bar F$ is RR2 (reverse regular of order 2) in pairs; see Karlin [275]. In particular, it holds when X1, X2, . . . , Xm are independent. Karlin and Rinott [279] showed that some multivariate normal distributions,


as well as the Dirichlet distribution, are RR2 in pairs. So Theorem 1.B.29 applies to these distributions.

When (X1, X2, . . . , Xm) has an exchangeable distribution function, then the corresponding multivariate hazard function R is permutation symmetric. Therefore each $r^{(i)}$ can be expressed by means of $r^{(1)}$ as follows:
\[
r^{(i)}(x_1, x_2, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_m) = r^{(1)}(x_i, x_2, \ldots, x_{i-1}, x_1, x_{i+1}, \ldots, x_m), \qquad i = 2, 3, \ldots, m.
\]

Corollary 1.B.30. Let (X1, X2, . . . , Xm) be a random vector with an absolutely continuous exchangeable distribution function. If
\[
r^{(1)}(\underbrace{t, t, \ldots, t}_{i \text{ times}}, \underbrace{-\infty, -\infty, \ldots, -\infty}_{m-i \text{ times}}) \le r^{(1)}(\underbrace{t, t, \ldots, t}_{i+1 \text{ times}}, \underbrace{-\infty, -\infty, \ldots, -\infty}_{m-i-1 \text{ times}}), \qquad t \in R,\ i = 1, 2, \ldots, m - 1, \tag{1.B.31}
\]
then
\[
X_{(1:1)} \ge_{\mathrm{hr}} X_{(1:2)} \ge_{\mathrm{hr}} \cdots \ge_{\mathrm{hr}} X_{(1:m)}. \tag{1.B.32}
\]

If (1.B.31) is not imposed, then (1.B.32) need not be true; this follows from a counterexample in the literature.

The following result strengthens the DFR part of Theorem 1.A.19. Recall that the spacings that correspond to the random variables X1, X2, . . . , Xm are denoted by U(i) = X(i) − X(i−1), i = 2, 3, . . . , m, where the X(i)'s are the corresponding order statistics. When the dependence on m is to be emphasized, we will denote the spacings by U(i:m).

Theorem 1.B.31. Let X1, X2, . . . , Xm, Xm+1 be independent and identically distributed, absolutely continuous, DFR random variables. Then
\[
(m - i + 1)\,U_{(i:m)} \le_{\mathrm{hr}} (m - i)\,U_{(i+1:m)}, \qquad i = 2, 3, \ldots, m - 1, \tag{1.B.33}
\]
\[
(m - i + 2)\,U_{(i:m+1)} \le_{\mathrm{hr}} (m - i + 1)\,U_{(i:m)}, \qquad i = 2, 3, \ldots, m, \tag{1.B.34}
\]
and
\[
U_{(i:m)} \le_{\mathrm{hr}} U_{(i+1:m+1)}, \qquad i = 2, 3, \ldots, m. \tag{1.B.35}
\]
Note that (1.B.33)–(1.B.35) can be summarized as
\[
(m - j + 1)\,U_{(j:m)} \le_{\mathrm{hr}} (n - i + 1)\,U_{(i:n)} \qquad \text{whenever } i - j \ge \max\{0, n - m\}.
\]

Theorem 1.B.31 is a simple consequence of Theorem 1.C.45 below. It is of interest to compare Theorem 1.B.31 to Theorems 1.A.19 and 1.A.22. A comparison of such normalized spacings from two diﬀerent samples is described next. Here U(i:m) denotes, as before, the ith spacing that corresponds to the sample X1 , X2 , . . . , Xm , and V(j:n) denotes the jth spacing that corresponds to the sample Y1 , Y2 , . . . , Yn . It is of interest to compare the next result with Theorem 1.C.45.


Theorem 1.B.32. For positive integers m and n, let X1, X2, . . . , Xm be independent identically distributed random variables with an absolutely continuous common distribution function, and let Y1, Y2, . . . , Yn be independent identically distributed random variables with a possibly different absolutely continuous common distribution function. If X1 ≤hr Y1, and if either X1 or Y1 is DFR, then
\[
(m - j + 1)\,U_{(j:m)} \le_{\mathrm{hr}} (n - i + 1)\,V_{(i:n)} \qquad \text{whenever } i - j \ge \max\{0, n - m\}.
\]

The hazard rate order is closed under the operation of taking minima, as the next result shows.

Theorem 1.B.33. Let (Xi, Yi), i = 1, 2, . . . , m, be independent pairs of random variables such that Xi ≤hr Yi, i = 1, 2, . . . , m. Then
\[
\min\{X_1, X_2, \ldots, X_m\} \le_{\mathrm{hr}} \min\{Y_1, Y_2, \ldots, Y_m\}.
\]

Proof. Clearly, it is enough to show the result when m = 2. For simplicity assume that X1, X2, Y1, and Y2 have hazard rate functions r1, r2, q1, and q2, respectively. Then it is very easy to see that the hazard rate function of min{X1, X2} is r1 + r2 and the hazard rate function of min{Y1, Y2} is q1 + q2. By the assumptions of the theorem (see (1.B.2)) r1(t) ≥ q1(t) and r2(t) ≥ q2(t) for all t ≥ 0. Therefore r1(t) + r2(t) ≥ q1(t) + q2(t) for all t ≥ 0, that is, min{X1, X2} ≤hr min{Y1, Y2}.
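The two steps of the proof of Theorem 1.B.33 can be traced numerically. The following is a minimal sketch, assuming Weibull lifetimes with hypothetical parameters chosen so that Xi ≤hr Yi (same shape, smaller scale for Xi). It first compares a finite-difference hazard rate of min{X1, X2} with r1 + r2, and then checks r1 + r2 ≥ q1 + q2 on a grid.

```python
import math

# (shape, scale) pairs; assumed illustrative values with X_i <=hr Y_i.
X_PARAMS = [(2.0, 1.0), (0.5, 1.0)]
Y_PARAMS = [(2.0, 2.0), (0.5, 4.0)]


def weibull_survival(shape, scale):
    return lambda t: math.exp(-((t / scale) ** shape))


def weibull_hazard(shape, scale):
    return lambda t: (shape / scale) * (t / scale) ** (shape - 1.0)


def min_survival(params):
    """Survival function of the minimum of independent Weibull lifetimes."""
    survivals = [weibull_survival(k, s) for k, s in params]
    return lambda t: math.prod(surv(t) for surv in survivals)


def summed_hazard(params):
    """Sum of the individual hazard rate functions."""
    hazards = [weibull_hazard(k, s) for k, s in params]
    return lambda t: sum(h(t) for h in hazards)


def numerical_hazard(survival, t, h=1e-6):
    """Central-difference approximation of -(d/dt) log survival(t)."""
    return -(math.log(survival(t + h)) - math.log(survival(t - h))) / (2.0 * h)


def minima_are_hr_ordered(x_params, y_params, grid):
    """Grid check that the hazard rate of min X dominates that of min Y."""
    r, q = summed_hazard(x_params), summed_hazard(y_params)
    return all(r(t) >= q(t) for t in grid)


grid = [0.05 * k for k in range(1, 101)]
print(abs(numerical_hazard(min_survival(X_PARAMS), 0.7)
          - summed_hazard(X_PARAMS)(0.7)))
print(minima_are_hr_ordered(X_PARAMS, Y_PARAMS, grid))
```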

If the Xi's in Theorem 1.B.33 are identically distributed and if the Yi's in Theorem 1.B.33 are also identically distributed, then all order statistics (and not just the minima) corresponding to the Xi's and the Yi's can be compared in the hazard rate order. This is shown in the following result.

Theorem 1.B.34. Let (Xi, Yi), i = 1, 2, . . . , m, be independent pairs of absolutely continuous random variables such that Xi ≤hr Yi, i = 1, 2, . . . , m. Suppose that the Xi's are identically distributed and that the Yi's are identically distributed. Then
\[
X_{(k:m)} \le_{\mathrm{hr}} Y_{(k:m)}, \qquad k = 1, 2, \ldots, m. \tag{1.B.36}
\]

If the Xi's or the Yi's in Theorem 1.B.34 are not identically distributed, then the conclusion (1.B.36) need not hold. However, the following result, from Chapter 16 by Boland and Proschan in [515], gives conditions under which (1.B.36) holds.

Proposition 1.B.35. Let X1, X2, . . . , Xm [respectively, Y1, Y2, . . . , Ym] be m independent (not necessarily identically distributed) absolutely continuous random variables, all with support (a, b) for some a < b. If Xi ≤hr Yj for all i and j, then X(k:m) ≤hr Y(k:m), k = 1, 2, . . . , m.

A result which is stronger than Proposition 1.B.35, but which uses Proposition 1.B.35 in its proof, is the following.

Theorem 1.B.36. Let X1, X2, . . . , Xm be m independent (not necessarily identically distributed) random variables, and let Y1, Y2, . . . , Yn be other n independent (not necessarily identically distributed) random variables, all having absolutely continuous distributions with support (a, b) for some a < b. If Xi ≤hr Yj for all i and j, then
\[
X_{(j:m)} \le_{\mathrm{hr}} Y_{(i:n)} \qquad \text{whenever } i - j \ge \max\{0, n - m\}.
\]

The proof of Theorem 1.B.36 uses the likelihood ratio order which is discussed in the next section. Therefore the proof will be given in Remark 1.C.41.

The following example describes an interesting instance in which the two maxima are ordered in the hazard rate order. It may be compared with Example 3.B.32.

Example 1.B.37. Let Y1, Y2, . . . , Ym be independent exponential random variables with hazard rates λ1, λ2, . . . , λm, respectively. Let X1, X2, . . . , Xm be independent and identically distributed exponential random variables with hazard rate $\lambda = \sum_{i=1}^{m} \lambda_i / m$. Then
\[
X_{(m:m)} \le_{\mathrm{hr}} Y_{(m:m)}. \tag{1.B.37}
\]
Let Z1, Z2, . . . , Zm be independent and identically distributed exponential random variables with hazard rate $\tilde\lambda = \bigl(\prod_{i=1}^{m} \lambda_i\bigr)^{1/m}$. Then
\[
Z_{(m:m)} \le_{\mathrm{hr}} Y_{(m:m)}. \tag{1.B.38}
\]
In fact, from the arithmetic-geometric mean inequality ($\lambda \ge \tilde\lambda$) and Proposition 1.B.35, it follows that (1.B.38) implies (1.B.37).

1.B.5 Some properties in reliability theory

The order ≤hr can be trivially (but beneficially) used to characterize IFR random variables. The next result lists several such characterizations. Recall from Section 1.A.3 that for any random variable Z and an event A we denote by [Z | A] any random variable that has as its distribution the conditional distribution of Z given A.

Theorem 1.B.38. The random variable X is IFR [DFR] if, and only if, one of the following equivalent conditions holds (when the support of the distribution function of X is bounded, condition (iii) does not have a simple DFR analog):
(i) [X − t | X > t] ≥hr [≤hr] [X − t′ | X > t′] whenever t ≤ t′.
(ii) X ≥hr [≤hr] [X − t | X > t] for all t ≥ 0 (when X is a nonnegative random variable).
(iii) X + t ≤hr X + t′ whenever t ≤ t′.


Note that if X is the lifetime of a device, then [X − t | X > t] is the residual life of such a device with age t. Theorem 1.B.38(i), for example, characterizes IFR random variables by the monotonicity of their residual lives with respect to the order ≤hr. Some multivariate analogs of conditions (i) and (ii) of Theorem 1.B.38 are used in Section 6.D.3 to introduce a multivariate IFR notion.

Part (iii) of Theorem 1.B.38 can be used to prove the closure under convolution property of IFR random variables:

Corollary 1.B.39. Let X and Y be two independent IFR random variables. Then X + Y has an IFR distribution.

Proof. From Theorem 1.B.38(iii) it follows that X + t ≤hr X + t′ whenever t ≤ t′. Also, Y is independent of X + t and of X + t′ for all t and t′, respectively. From Lemma 1.B.3 it now follows that X + Y + t ≤hr X + Y + t′ whenever t ≤ t′. Thus, again from Theorem 1.B.38(iii), it follows that X + Y is IFR.

Recall from (1.A.20) that for a nonnegative random variable X with a finite mean we denote by $A_X$ the corresponding asymptotic equilibrium age. Recall from page 1 the definitions of the DMRL and the IMRL properties. The following result is immediate.

Theorem 1.B.40. The nonnegative random variable X with finite mean is DMRL [IMRL] if, and only if, X ≥hr [≤hr] $A_X$.

1.B.6 The reversed hazard order

If X is a random variable with an absolutely continuous distribution function F, then the reversed hazard rate of X at the point t is defined as $\tilde r(t) = (d/dt)(\log F(t))$. One interpretation of the reversed hazard rate at time t is the following. Suppose that X is nonnegative with distribution function F. Then X can be thought of as the lifetime of some device. Given that the device has already failed by time t, the probability that it survived up to time t − ε (for a small ε > 0) is approximately $\varepsilon \cdot \tilde r(t)$. Some of the results regarding the hazard rate order have analogs when the hazard rate is replaced by the reversed hazard rate.

Let X and Y be two random variables with absolutely continuous distribution functions and with reversed hazard rate functions $\tilde r$ and $\tilde q$, respectively, such that
\[
\tilde r(t) \le \tilde q(t), \qquad t \in R. \tag{1.B.39}
\]
Then X is said to be smaller than Y in the reversed hazard rate order (denoted as X ≤rh Y).

In fact, the absolute continuity, which is required in (1.B.39), is not really needed. It is easy to verify that (1.B.39) holds if, and only if,
\[
\frac{G(t)}{F(t)} \ \text{increases in } t \in (\min(l_X, l_Y), \infty) \tag{1.B.40}
\]

(here a/0 is taken to be equal to ∞ whenever a > 0). Here F denotes the distribution function of X and G denotes the distribution function of Y, and $l_X$ and $l_Y$ denote the corresponding left endpoints of the supports of X and of Y. Equivalently, (1.B.40) can be written as
\[
F(x)G(y) \ge F(y)G(x) \qquad \text{for all } x \le y. \tag{1.B.41}
\]
Thus (1.B.40) or (1.B.41) can be used to define the order X ≤rh Y even if X and/or Y do not have absolutely continuous distributions.

The analog of (1.B.5) for the reversed hazard order when X and Y have densities f and g, respectively, is that X ≤rh Y if, and only if,
\[
\frac{f(y)}{F(x)} \le \frac{g(y)}{G(x)} \qquad \text{for all } x \le y. \tag{1.B.42}
\]
Another condition that is equivalent to X ≤rh Y is
\[
\frac{GF^{-1}(u)}{u} \le \frac{GF^{-1}(v)}{v} \qquad \text{for all } 0 < u \le v < 1.
\]
Finally, another condition that is equivalent to X ≤rh Y is
\[
P\{X - t \le -s \mid X \le t\} \ge P\{Y - t \le -s \mid Y \le t\} \qquad \text{for all } s \ge 0 \text{ and all } t,
\]
or, equivalently,
\[
[X \mid X \le t] \le_{\mathrm{st}} [Y \mid Y \le t] \qquad \text{for all } t. \tag{1.B.43}
\]
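Conditions (1.B.39) and (1.B.41) can be compared on a simple pair of distributions. The sketch below assumes, purely for illustration, that X is uniform on (0, 1) and that Y is distributed as the maximum of two independent uniforms, so that F(t) = t and G(t) = t² on (0, 1); the function names and the grid are hypothetical choices. In this case G(t)/F(t) = t is increasing, and the reversed hazard rates are 1/t and 2/t.

```python
def cdf_x(t):
    """Distribution function of a uniform(0, 1) random variable."""
    return min(max(t, 0.0), 1.0)


def cdf_y(t):
    """Distribution function of the maximum of two independent uniforms."""
    clipped = min(max(t, 0.0), 1.0)
    return clipped * clipped


def satisfies_1B41(F, G, grid, tol=1e-12):
    """Grid check of F(x)G(y) >= F(y)G(x) for all x <= y, as in (1.B.41)."""
    return all(
        F(x) * G(y) >= F(y) * G(x) - tol
        for i, x in enumerate(grid)
        for y in grid[i:]
    )


def satisfies_1B39(rev_hazard_x, rev_hazard_y, grid):
    """Grid check of the pointwise reversed hazard comparison (1.B.39)."""
    return all(rev_hazard_x(t) <= rev_hazard_y(t) for t in grid)


grid = [0.01 * k for k in range(1, 100)]
print(satisfies_1B41(cdf_x, cdf_y, grid))
print(satisfies_1B39(lambda t: 1.0 / t, lambda t: 2.0 / t, grid))
```

Swapping the roles of the two distribution functions makes the check of (1.B.41) fail, as expected for a non-symmetric order.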

For discrete random variables X and Y that take on values in N, we denote X ≤rh Y if
\[
\frac{P\{X = n\}}{P\{X \le n\}} \le \frac{P\{Y = n\}}{P\{Y \le n\}}, \qquad n \in N. \tag{1.B.44}
\]
A useful relationship between the hazard rate and the reversed hazard rate orders is described in the following theorem.

Theorem 1.B.41. Let X and Y be two continuous random variables with supports $(l_X, u_X)$ and $(l_Y, u_Y)$, respectively. Then
X ≤hr Y =⇒ φ(X) ≥rh φ(Y)
for any continuous function φ which is strictly decreasing on $(l_X, u_Y)$. Also,
X ≤rh Y =⇒ φ(X) ≥hr φ(Y)
for any such function φ.

Using Theorem 1.B.41 it is easy to obtain the following analogs of results regarding the order ≤hr.


Theorem 1.B.42. If X and Y are two random variables such that X ≤rh Y, then X ≤st Y.

Theorem 1.B.43. If X ≤rh Y, and if φ is any increasing function, then φ(X) ≤rh φ(Y).

Lemma 1.B.44. If the random variables X and Y are such that X ≤rh Y, and if Z is a random variable independent of X and Y and has decreasing reversed hazard rate, then X + Z ≤rh Y + Z.

Theorem 1.B.45. Let (Xi, Yi), i = 1, 2, . . . , m, be independent pairs of random variables such that Xi ≤rh Yi, i = 1, 2, . . . , m. If Xi, Yi, i = 1, 2, . . . , m, all have decreasing reversed hazard rates, then
\[
\sum_{i=1}^{m} X_i \le_{\mathrm{rh}} \sum_{i=1}^{m} Y_i.
\]

Theorem 1.B.46. Let X, Y, and Θ be random variables such that [X | Θ = θ] ≤rh [Y | Θ = θ′] for all θ and θ′ in the support of Θ. Then X ≤rh Y.

In order to state a bivariate characterization result for the order ≤rh we define the following class of bivariate functions:
Grh = {φ : R2 → R : φ(x, y) is increasing in x, for each y, on {x ≤ y}, and is decreasing in y, for each x, on {y ≤ x}}.
The proof of the next result (Theorem 1.B.47) is similar to the proof of Theorem 1.B.9.

Theorem 1.B.47. Let X and Y be independent random variables. Then X ≤rh Y if, and only if,
\[
\phi(X, Y) \le_{\mathrm{st}} \phi(Y, X) \qquad \text{for all } \phi \in G_{\mathrm{rh}}.
\]

The next result uses the notation of Theorem 1.A.10.

Theorem 1.B.48. Let X and Y be two independent random variables. Then X ≤rh Y if, and only if,
Eφ1(X, Y) ≤ Eφ2(X, Y)
for all φ1 and φ2 such that, for each y, ∆φ21(x, y) decreases in x on {x ≤ y}, and such that ∆φ21(x, y) ≥ −∆φ21(y, x) whenever x ≤ y.

A further similar characterization is given in Theorem 4.A.36. The following result is an analog of Theorem 1.B.11.


Theorem 1.B.49. Let X and Y be two independent random variables. Then X ≤rh Y if, and only if,
\[
[X \mid \max(X, Y) = z] \le_{\mathrm{rh}} [Y \mid \max(X, Y) = z] \qquad \text{for all } z. \tag{1.B.45}
\]
Also, X ≤rh Y if, and only if,
\[
[X \mid \max(X, Y) = z] \le_{\mathrm{st}} [Y \mid \max(X, Y) = z] \qquad \text{for all } z. \tag{1.B.46}
\]

Proof. First suppose that X and Y are absolutely continuous. Denote the distribution functions of X and Y by F and G, respectively, and denote the corresponding density functions by f and g. Then
\[
P[X \le x \mid \max(X, Y) = z] =
\begin{cases}
\dfrac{F(x)g(z)}{f(z)G(z) + g(z)F(z)}, & \text{if } x \le z, \\[2ex]
1, & \text{if } x > z,
\end{cases} \tag{1.B.47}
\]
and
\[
P[Y \le x \mid \max(X, Y) = z] =
\begin{cases}
\dfrac{G(x)f(z)}{f(z)G(z) + g(z)F(z)}, & \text{if } x \le z, \\[2ex]
1, & \text{if } x > z.
\end{cases} \tag{1.B.48}
\]
Therefore
\[
\frac{P[Y \le x \mid \max(X, Y) = z]}{P[X \le x \mid \max(X, Y) = z]} =
\begin{cases}
\dfrac{G(x)}{F(x)} \cdot \dfrac{f(z)}{g(z)}, & \text{if } x \le z, \\[2ex]
1, & \text{if } x > z.
\end{cases} \tag{1.B.49}
\]
If X ≤rh Y, then $G(x)/F(x)$ is increasing in x, and $\frac{G(z)}{F(z)} \cdot \frac{f(z)}{g(z)} \le 1$. Thus (1.B.49) is increasing in x, and (1.B.45) follows.

Obviously (1.B.46) follows from (1.B.45).

Now suppose that (1.B.46) holds. Then from (1.B.47) and (1.B.48) we get that F(x)g(z) ≥ G(x)f(z) for all x ≤ z. Therefore X ≤rh Y by (1.B.42).

The proof when X and Y are discrete is similar.

The following result is an analog of Theorem 1.B.12.

Theorem 1.B.50. Let X and Y be two independent random variables. The following conditions are equivalent:
(a) X ≤rh Y.
(b) E[α(X)]E[β(Y)] ≥ E[α(Y)]E[β(X)] for all functions α and β for which the expectations exist and such that β is nonnegative and α/β and β are decreasing.
(c) For any increasing function a and a nonnegative decreasing function b, if E[a(Y)b(Y)] = 0, then E[a(X)b(X)] ≤ 0.

Example 1.B.51. Let X and Y be two random variables with support [c, d], where c < 0 < d, and suppose that E[Y] > 0. Let u be an increasing differentiable concave function, corresponding to the utility function of a risk-averse individual. Let $k_X$ be a value which maximizes $g_X(k) \equiv E[u(kX)]$, and similarly let $k_Y$ be a value which maximizes $g_Y(k) \equiv E[u(kY)]$. Theorem 1.B.50(c) can be used to prove that if X ≤rh Y, then $k_X \le k_Y$. In order to see it, first note that the result is trivial if $k_X = -\infty$ or if $k_Y = \infty$. Thus, let us assume that $k_X$ and $k_Y$ are finite. Note that then $k_X$ and $k_Y$ satisfy $E[X u'(k_X X)] = 0$ and $E[Y u'(k_Y Y)] = 0$, where u′ denotes the derivative of u. Also note that from the assumption E[Y] > 0 it follows that $k_Y > 0$. Without loss of generality let $k_Y = 1$. Thus $E[Y u'(Y)] = 0$, and using the concavity of u the assertion would follow if we show that $E[X u'(X)] \le 0$. But this follows from Theorem 1.B.50(c).

Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a subset of the real line. As in Section 1.A.3 let X(θ) denote a random variable with distribution function $G_\theta$. For any random variable Θ with support in $\mathcal{X}$, and with distribution function F, let us denote by X(Θ) a random variable with distribution function H given by
\[
H(y) = \int_{\mathcal{X}} G_\theta(y)\, dF(\theta), \qquad y \in R.
\]

The following result generalizes Theorem 1.B.43, just as Theorem 1.A.6 generalized Theorem 1.A.3(a). The proof of the next theorem is similar to the proof of Theorem 1.B.14 and is therefore omitted.

Theorem 1.B.52. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as above. Let Θ1 and Θ2 be two random variables with supports in $\mathcal{X}$ and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random variables such that Yi =st X(Θi), i = 1, 2; that is, suppose that the distribution function of Yi is given by
\[
H_i(y) = \int_{\mathcal{X}} G_\theta(y)\, dF_i(\theta), \qquad y \in R,\ i = 1, 2.
\]
If
\[
X(\theta) \le_{\mathrm{rh}} X(\theta') \qquad \text{whenever } \theta \le \theta',
\]
and if Θ1 ≤rh Θ2, then Y1 ≤rh Y2.

The following result, which is the “reversed hazard analog” of Theorem 1.B.18, gives a Laplace transform characterization of the order ≤rh.

Theorem 1.B.53. Let X1 and X2 be two nonnegative random variables, and let Nλ(X1) and Nλ(X2) be as described in Theorem 1.A.13. Then
\[
X_1 \le_{\mathrm{rh}} X_2 \iff N_\lambda(X_1) \le_{\mathrm{rh}} N_\lambda(X_2) \qquad \text{for all } \lambda > 0,
\]
where the notation Nλ(X1) ≤rh Nλ(X2) is in the sense of (1.B.44).


The implication =⇒ in Theorem 1.B.53 can be generalized in the same manner that Theorem 1.A.14 generalizes the implication =⇒ in Theorem 1.A.13. We will not state the result here since it is equivalent to Theorem 1.B.52.

The reversed hazard analog of Theorem 1.B.19 is the following.

Theorem 1.B.54. Let X1, X2, . . . , Xm, Θ1, and Θ2 be independent nonnegative random variables. Define Nj(t) for t ≥ 0 and j = 1, 2 as in Theorem 1.B.19. If Θ1 ≤rh Θ2, then N1(t) ≤rh N2(t) in the sense of (1.B.44) for all t ≥ 0.

The reversed hazard analog of Theorem 1.B.20 is the following.

Theorem 1.B.55. Let X be any random variable. Then $X_{(-\infty, a]}$ and $X_{(a, \infty)}$ are increasing in a in the sense of the reversed hazard order.

Analogous to the result in Remark 1.A.18, it can be shown that the set of all distribution functions on R ∪ {−∞} is a lattice with respect to the order ≤rh.

The reversed hazard analog of Theorem 1.B.26 is the following.

Theorem 1.B.56. If X1, X2, . . . , Xm are independent random variables, then X(k) ≤rh X(k+1) for k = 1, 2, . . . , m − 1.

The reversed hazard analog of Theorem 1.B.27 is the following.

Theorem 1.B.57. Let X1, X2, . . . , Xm be independent random variables. If Xm ≤rh Xj for all j = 1, 2, . . . , m − 1, then X(k:m−1) ≥rh X(k:m) for k = 1, 2, . . . , m − 1.

The reversed hazard analog of Theorem 1.B.28 is the following.

Theorem 1.B.58. If X1, X2, . . . , Xm are independent random variables, then X(k−1:m−1) ≤rh X(k:m) for k = 2, 3, . . . , m.

The reversed hazard analogs of Theorems 1.B.33, 1.B.34, and 1.B.36 are the following results.

Theorem 1.B.59. Let (Xi, Yi), i = 1, 2, . . . , m, be independent pairs of random variables such that Xi ≤rh Yi, i = 1, 2, . . . , m. Then
\[
\max\{X_1, X_2, \ldots, X_m\} \le_{\mathrm{rh}} \max\{Y_1, Y_2, \ldots, Y_m\}.
\]

Theorem 1.B.60. Let (Xi, Yi), i = 1, 2, . . . , m, be independent pairs of absolutely continuous random variables such that Xi ≤rh Yi, i = 1, 2, . . . , m. Suppose that the Xi's are identically distributed and that the Yi's are identically distributed. Then
\[
X_{(k:m)} \le_{\mathrm{rh}} Y_{(k:m)}, \qquad k = 1, 2, \ldots, m.
\]

Theorem 1.B.61. Let X1, X2, . . . , Xm be m independent (not necessarily identically distributed) random variables, and let Y1, Y2, . . . , Yn be other n independent (not necessarily identically distributed) random variables, all having absolutely continuous distributions with support (a, b) for some a < b. If Xi ≤rh Yj for all i and j, then
\[
X_{(j:m)} \le_{\mathrm{rh}} Y_{(i:n)} \qquad \text{whenever } i - j \ge \max\{0, n - m\}.
\]

Finally, the reversed hazard analog of Theorem 1.B.38 is the following.

Theorem 1.B.62. The random variable X with support (a, b), for some −∞ ≤ a < b ≤ ∞, has decreasing [increasing] reversed hazard rate if, and only if, one of the following equivalent conditions holds:
(i) [X − t | X < t] ≥rh [≤rh] [X − t′ | X < t′] whenever a < t ≤ t′ < b.
(ii) X ≤rh [≥rh] [X − t | X < t] for all t ∈ (a, b) (when X is a nonpositive random variable).
(iii) X + t ≤rh [≥rh] X + t′ whenever a < t ≤ t′ < b.

Corollary 1.B.63. Let X and Y be two independent random variables with decreasing reversed hazard rates. Then X + Y has a decreasing reversed hazard rate.

1.C The Likelihood Ratio Order

1.C.1 Definition

Let X and Y be continuous [discrete] random variables with densities [discrete densities] f and g, respectively, such that
\[
\frac{g(t)}{f(t)} \ \text{increases in } t \text{ over the union of the supports of } X \text{ and } Y \tag{1.C.1}
\]
(here a/0 is taken to be equal to ∞ whenever a > 0), or, equivalently,
\[
f(x)g(y) \ge f(y)g(x) \qquad \text{for all } x \le y. \tag{1.C.2}
\]
Then X is said to be smaller than Y in the likelihood ratio order (denoted by X ≤lr Y). By integrating (1.C.2) over x ∈ A and y ∈ B, where A and B are measurable sets in R, it is seen that (1.C.2) is equivalent to
\[
P\{X \in A\}P\{Y \in B\} \ge P\{X \in B\}P\{Y \in A\} \qquad \text{for all measurable sets } A \text{ and } B \text{ such that } A \le B, \tag{1.C.3}
\]
where A ≤ B means that x ∈ A and y ∈ B imply that x ≤ y. Note that condition (1.C.3) does not directly involve the underlying densities, and thus it applies uniformly to continuous distributions, or to discrete distributions, or even to mixed distributions.

At first glance (1.C.1) or (1.C.2) or (1.C.3) seem to be unintuitive technical conditions. However, it turns out that in many situations they are very easy to verify, and this is one of the major reasons for the usefulness and importance of the order ≤lr. It is also easy to verify by a simple differentiation (at least when X and Y have the same support) that
\[
X \le_{\mathrm{lr}} Y \iff GF^{-1} \text{ is convex}. \tag{1.C.4}
\]

Here F and G are the distribution functions of X and Y, respectively.

1.C.2 The relation between the likelihood ratio and the hazard and reversed hazard orders

Note that from (1.C.1) it follows (in the continuous case) that
\[
\int_{t=x}^{y} \int_{t'=y}^{\infty} f(t)g(t')\, dt'\, dt \ge \int_{t=y}^{\infty} \int_{t'=x}^{y} f(t)g(t')\, dt'\, dt \qquad \text{for all } x \le y,
\]
which, in turn, implies that
\[
\int_{x}^{\infty} f(t)\, dt \int_{y}^{\infty} g(t')\, dt' \ge \int_{x}^{\infty} g(t)\, dt \int_{y}^{\infty} f(t')\, dt' \qquad \text{for all } x \le y,
\]

that is, (1.B.4). We thus have shown a part of the following result. The other parts of the next theorem are proven similarly (recall that the discrete versions of the orders ≤hr and ≤rh are defined in (1.B.9) and (1.B.44), respectively).

Theorem 1.C.1. If X and Y are two continuous or discrete random variables such that X ≤lr Y, then X ≤hr Y and X ≤rh Y (and therefore X ≤st Y).

Remark 1.C.2. Neither of the orders ≤hr and ≤rh (even if both hold simultaneously) implies the order ≤lr. In order to see it let X be a uniform random variable over the set {1, 2, 3, 4} and let Y have the probabilities P{Y = 1} = .1, P{Y = 2} = .3, P{Y = 3} = .2, and P{Y = 4} = .4. Then it is not true that X ≤lr Y; however, in this case we have that X ≤hr Y and also that X ≤rh Y.

Remark 1.C.3. Using Theorem 1.C.1 we can now give a proof of Theorem 1.A.22. Let F and f denote, respectively, the distribution function and the density function of X1. Given X(i−1:m) = u and X(i+1:m) = v, the conditional density of U(i:m) at the point w is $\frac{f(u+w)}{F(v)-F(u)}$, 0 ≤ w ≤ v − u, and the conditional density of U(i+1:m) at the point w is $\frac{f(v-w)}{F(v)-F(u)}$, 0 ≤ w ≤ v − u. Since f is increasing [decreasing] it is seen that, conditionally, U(i:m) ≥lr [≤lr] U(i+1:m), and therefore, by Theorem 1.C.1, U(i:m) ≥st [≤st] U(i+1:m). Theorem 1.A.22 now follows from Theorem 1.A.3(d).
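The counterexample of Remark 1.C.2 is small enough to be verified by exact arithmetic. The sketch below is an illustration only: the function names are hypothetical, the discrete hazard rate comparison is assumed to take the form P{X = n}/P{X ≥ n} ≥ P{Y = n}/P{Y ≥ n} (the display (1.B.9) itself is not reproduced in this section), and the reversed hazard comparison follows (1.B.44).

```python
from fractions import Fraction

SUPPORT = [1, 2, 3, 4]
PMF_X = {n: Fraction(1, 4) for n in SUPPORT}
PMF_Y = {1: Fraction(1, 10), 2: Fraction(3, 10),
         3: Fraction(2, 10), 4: Fraction(4, 10)}


def is_lr_smaller(f, g):
    """Check f(x)g(y) >= f(y)g(x) for all x <= y, as in (1.C.2)."""
    return all(f[x] * g[y] >= f[y] * g[x]
               for x in SUPPORT for y in SUPPORT if x <= y)


def is_hr_smaller(f, g):
    """Discrete hazard rate comparison (assumed form of (1.B.9))."""
    def hazard(pmf, n):
        return pmf[n] / sum(pmf[k] for k in SUPPORT if k >= n)
    return all(hazard(f, n) >= hazard(g, n) for n in SUPPORT)


def is_rh_smaller(f, g):
    """Discrete reversed hazard rate comparison, as in (1.B.44)."""
    def reversed_hazard(pmf, n):
        return pmf[n] / sum(pmf[k] for k in SUPPORT if k <= n)
    return all(reversed_hazard(f, n) <= reversed_hazard(g, n) for n in SUPPORT)


print("lr:", is_lr_smaller(PMF_X, PMF_Y))
print("hr:", is_hr_smaller(PMF_X, PMF_Y))
print("rh:", is_rh_smaller(PMF_X, PMF_Y))
```

The likelihood ratio comparison fails at the pair (2, 3), where the ratio of the probabilities drops from 1.2 to 0.8, while both rate comparisons hold (with equality at some points).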


Although neither of the orders ≤hr and ≤rh implies the order ≤lr (see Remark 1.C.2), the following result gives a simple condition under which this is actually the case. The proof is immediate and is therefore omitted.

Theorem 1.C.4. Let X and Y be two random variables with distribution functions F and G, (discrete or continuous) hazard rate functions r and q, and (discrete or continuous) reversed hazard rate functions $\tilde r$ and $\tilde q$, respectively.
(a) If X ≤hr Y and if $q(t)/r(t)$ increases in t, then X ≤lr Y.
(b) If X ≤rh Y and if $\tilde q(t)/\tilde r(t)$ increases in t, then X ≤lr Y.

1.C.3 Some properties and characterizations

The usual stochastic order has the useful and important constructive property described in Theorem 1.A.1. There is no analogous property associated with the likelihood ratio order. Therefore it is of importance to understand better the relationship between the orders ≤st and ≤lr. We already know from Theorems 1.C.1 and 1.B.1 that the likelihood ratio order implies the usual stochastic order. The following result characterizes the likelihood ratio order by means of the order ≤st. It says that X ≤lr Y if, and only if, for any interval I, the conditional distribution of X, given that X ∈ I, is stochastically smaller than the conditional distribution of Y, given that Y ∈ I. As in Section 1.A.3, [Z | A] denotes any random variable that has as its distribution the conditional distribution of Z given A. It is of interest to contrast the next result with (1.B.7) and (1.B.43).

Theorem 1.C.5. The two random variables X and Y satisfy X ≤lr Y if, and only if,
\[
[X \mid a \le X \le b] \le_{\mathrm{st}} [Y \mid a \le Y \le b] \qquad \text{whenever } a \le b. \tag{1.C.5}
\]

Proof. Suppose that (1.C.5) holds. Select an a and a b such that a < b. Then
\[
\frac{P\{u \le X \le b\}}{P\{a \le X \le b\}} \le \frac{P\{u \le Y \le b\}}{P\{a \le Y \le b\}} \qquad \text{whenever } u \in [a, b].
\]
It follows then that
\[
\frac{P\{a \le X < u\}}{P\{u \le X \le b\}} \ge \frac{P\{a \le Y < u\}}{P\{u \le Y \le b\}} \qquad \text{whenever } u \in [a, b].
\]
That is,
\[
\frac{P\{a \le X < u\}}{P\{a \le Y < u\}} \ge \frac{P\{u \le X \le b\}}{P\{u \le Y \le b\}} \qquad \text{whenever } u \in [a, b].
\]
In particular, for u < b ≤ v,
\[
\frac{P\{u \le X < b\}}{P\{u \le Y < b\}} \ge \frac{P\{b \le X \le v\}}{P\{b \le Y \le v\}}.
\]
Therefore, when X and Y are continuous random variables,
\[
\frac{P\{a \le X < u\}}{P\{a \le Y < u\}} \ge \frac{P\{b \le X \le v\}}{P\{b \le Y \le v\}} \qquad \text{whenever } a < u \le b \le v.
\]
Now let a → u and b → v to obtain (1.C.2). The proof for discrete random variables is similar.

Conversely, suppose that X ≤lr Y; then clearly [X | a ≤ X ≤ b] ≤lr [Y | a ≤ Y ≤ b] whenever a < b (see also Theorem 1.C.6). From Theorems 1.C.1 and 1.B.1 we obtain (1.C.5).
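Both directions of Theorem 1.C.5 can be illustrated on small discrete examples. The following is a minimal sketch with hypothetical helper names: it checks (1.C.2) and the interval condition (1.C.5) exactly for two binomial distributions (an assumed illustrative pair that is ordered in the likelihood ratio order), and then for the pair of Remark 1.C.2, re-indexed here to the support {0, 1, 2, 3}, for which the likelihood ratio order fails.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Probability mass function of Binomial(n, p) as a list indexed by k."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def is_lr_smaller(f, g):
    """Check f(x)g(y) >= f(y)g(x) for all x <= y, as in (1.C.2)."""
    size = len(f)
    return all(f[x] * g[y] >= f[y] * g[x]
               for x in range(size) for y in range(x, size))


def conditional_st_holds(f, g):
    """Check (1.C.5) for every integer interval [a, b] of the support."""
    size = len(f)
    for a in range(size):
        for b in range(a, size):
            mass_f, mass_g = sum(f[a:b + 1]), sum(g[a:b + 1])
            if mass_f == 0 or mass_g == 0:
                continue  # conditioning event has probability zero
            for u in range(a, b + 1):
                tail_f = sum(f[u + 1:b + 1])  # P{u < X <= b}
                tail_g = sum(g[u + 1:b + 1])  # P{u < Y <= b}
                # Need P{X > u | a<=X<=b} <= P{Y > u | a<=Y<=b}.
                if tail_f * mass_g > tail_g * mass_f:
                    return False
    return True


small = binomial_pmf(4, Fraction(3, 10))
large = binomial_pmf(4, Fraction(6, 10))
uniform = [Fraction(1, 4)] * 4
other = [Fraction(1, 10), Fraction(3, 10), Fraction(2, 10), Fraction(4, 10)]

print(is_lr_smaller(small, large), conditional_st_holds(small, large))
print(is_lr_smaller(uniform, other), conditional_st_holds(uniform, other))
```

For the second pair the interval condition already fails on the two middle points of the support, which is where the ratio of the probabilities decreases.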

The likelihood ratio order is preserved under general truncations of the involved random variables. This is stated in the next theorem, the proof of which follows directly from (1.C.2).

Theorem 1.C.6. If X and Y are two random variables such that X ≤lr Y, then for any measurable set A ⊆ R we have [X | X ∈ A] ≤lr [Y | Y ∈ A].

By combining Theorems 1.C.5 and 1.C.6 it is seen that X ≤lr Y if, and only if,
\[
[X \mid X \in A] \le_{\mathrm{st}} [Y \mid Y \in A] \qquad \text{for all measurable sets } A \subseteq R. \tag{1.C.6}
\]
In fact, one can take (1.C.6) as the definition of the likelihood ratio order. The advantage of this approach is that it does not directly involve the underlying densities, and thus, similarly to condition (1.C.3), it applies uniformly to continuous distributions, or to discrete distributions, or even to mixed distributions.

Using the characterization (1.C.3), it is not hard to obtain the following result.

Theorem 1.C.7. Let {Xj, j = 1, 2, . . . } and {Yj, j = 1, 2, . . . } be two sequences of random variables such that Xj →st X and Yj →st Y as j → ∞. If Xj ≤lr Yj, j = 1, 2, . . ., then X ≤lr Y.

Let ψ be a strictly monotone increasing [decreasing] differentiable function with inverse $\psi^{-1}$. If X has the density function f, then ψ(X) has the density function $(f\psi^{-1})/(\psi'(\psi^{-1}))$. Similarly, if Y has the density function g, then ψ(Y) has the density function $(g\psi^{-1})/(\psi'(\psi^{-1}))$. If X ≤lr Y, then from (1.C.1) it follows that
\[
\frac{(f\psi^{-1})(u)/(\psi'(\psi^{-1}(u)))}{(g\psi^{-1})(u)/(\psi'(\psi^{-1}(u)))}
\]
decreases [increases] over the union of the supports of ψ(X) and ψ(Y). We have thus proved an important special case of Theorem 1.C.8 below. For discrete random variables the result is proven in a similar manner. When ψ is just monotone (rather than strictly monotone) the result is still true, but the preceding simple argument is no longer sufficient for its proof.

Theorem 1.C.8. If X ≤lr Y and ψ is any increasing [decreasing] function, then ψ(X) ≤lr [≥lr] ψ(Y).

If X1 ≤lr Y1 and X2 ≤lr Y2, where X1 and X2 are independent random variables, and Y1 and Y2 are also independent random variables, then it is not necessarily true that X1 + X2 ≤lr Y1 + Y2. However, if these random variables have logconcave densities, then it is true. In fact, a slightly stronger result is true:

Theorem 1.C.9. Let (Xi, Yi), i = 1, 2, ..., m, be independent pairs of random variables such that Xi ≤lr Yi, i = 1, 2, ..., m. If Xi, Yi, i = 1, 2, ..., m, all have (continuous or discrete) logconcave densities, except possibly one Xl and one Yk (l ≠ k), then

$$\sum_{i=1}^{m} X_i \le_{\mathrm{lr}} \sum_{i=1}^{m} Y_i.$$

Proof. Since a convolution of random variables with logconcave densities has a logconcave density, it is enough to show that if W1, W2, and Z are independent random variables such that W1 ≤lr W2, and Z has a logconcave density function, then W1 + Z ≤lr W2 + Z. We will give the proof for the continuous case; the proof for the discrete case is similar. Let $f_{W_i}$, $f_{W_i+Z}$, i = 1, 2, and $f_Z$ denote the density functions of the indicated random variables. Then

$$f_{W_i+Z}(t) = \int_{-\infty}^{\infty} f_Z(t-w)\, f_{W_i}(w)\, dw, \quad i = 1, 2,\ t \in \mathbb{R}.$$

The assumption W1 ≤lr W2 means that fWi (w), as a function of w and of i ∈ {1, 2}, is TP2 . The logconcavity of fZ means that fZ (t−w), as a function of t and of w, is TP2 . Therefore, by the basic composition formula (Karlin [275]) we see that fWi +Z (t) is TP2 in i ∈ {1, 2} and t; that is, W1 + Z ≤lr W2 + Z.
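The key step of the proof (adding an independent logconcave Z preserves ≤lr) can be checked exactly in the discrete case. The following is only a minimal sketch: the helper names `is_lr_leq`, `is_logconcave`, and `convolve`, and the three probability vectors, are illustrative assumptions and do not come from the text. The test for ≤lr uses the cross-product form of (1.C.2).

```python
from fractions import Fraction as Fr

def is_lr_leq(p, q):
    """Check p <=lr q for pmfs on {0, 1, ...}: p[i]*q[j] >= p[j]*q[i] whenever i < j."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return all(p[i] * q[j] >= p[j] * q[i]
               for i in range(n) for j in range(i + 1, n))

def is_logconcave(p):
    """Discrete logconcavity for a strictly positive pmf: p[k]^2 >= p[k-1]*p[k+1]."""
    return all(p[k] ** 2 >= p[k - 1] * p[k + 1] for k in range(1, len(p) - 1))

def convolve(p, q):
    """pmf of the sum of two independent variables with pmfs p and q."""
    out = [Fr(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Hypothetical example data (not from the text).
w1 = [Fr(5, 10), Fr(3, 10), Fr(2, 10)]
w2 = [Fr(2, 10), Fr(3, 10), Fr(5, 10)]   # w2/w1 is increasing, so W1 <=lr W2
z = [Fr(1, 4), Fr(1, 2), Fr(1, 4)]       # logconcave pmf of Z

print(is_lr_leq(convolve(w1, z), convolve(w2, z)))
```

Exact rational arithmetic is used so that the inequalities are not affected by rounding.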

Example 1.C.10. Consider m independent Bernoulli trials with probability pi of success in the ith trial. Let q(k, p) denote the probability of k successes, k = 0, 1, ..., m, where p = (p1, p2, ..., pm). Then q(k + 1, p)/q(k, p) is increasing in each pi for k = 0, 1, ..., m − 1. In order to see it, let Xi be a Bernoulli random variable with probability pi of success, i = 1, 2, ..., m, and assume that the Xi's are independent. Similarly, let Yi be a Bernoulli random variable with probability p′i of success, i = 1, 2, ..., m, and assume that the Yi's are independent. Obviously, the discrete density functions of the Xi's and of the Yi's are logconcave, and if p ≤ p′, then Xi ≤lr Yi, i = 1, 2, ..., m. The stated result thus follows from Theorem 1.C.9.

For nonnegative random variables, Theorem 1.C.9 can be generalized further by having more Yi's summed than Xi's. Under the assumptions of Theorem 1.C.9, one then obtains, for m ≤ n, that

$$\sum_{i=1}^{m} X_i \le_{\mathrm{lr}} \sum_{i=1}^{n} Y_i.$$

Of course, in this case, for m + 1 ≤ i ≤ n, the Yi's only need to have logconcave densities; they do not need corresponding Xi's to which they are comparable in the order ≤lr. One may expect that the latter inequality can be extended to the following one:

$$\sum_{i=1}^{M} X_i \le_{\mathrm{lr}} \sum_{i=1}^{N} Y_i,$$

where M and N are two discrete positive integer-valued random variables, independent of the Xi's and of the Yi's, respectively, such that M ≤lr N. Indeed this inequality is true under some additional assumptions on the distributions of the Xi's and the Yi's that will not be stated here. An important special case is the following theorem.

Theorem 1.C.11. Let {Xi, i = 1, 2, ...} be a sequence of nonnegative independent random variables with logconcave densities. Let M and N be two discrete positive integer-valued random variables such that M ≤lr N, and assume that M and N are independent of the Xi's. Then

$$\sum_{i=1}^{M} X_i \le_{\mathrm{lr}} \sum_{i=1}^{N} X_i.$$
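As an illustration of Theorem 1.C.11, the sketch below computes the exact probability mass functions of two random sums and tests them for ≤lr. It assumes, for simplicity, that the Xi's are identically distributed on {0, 1, 2} and that M and N are supported on {1, 2, 3}; the function names and the numerical values are hypothetical and are not taken from the text.

```python
from fractions import Fraction as Fr

def is_lr_leq(p, q):
    """Check p <=lr q for pmfs on {0, 1, ...}: p[i]*q[j] >= p[j]*q[i] whenever i < j."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return all(p[i] * q[j] >= p[j] * q[i]
               for i in range(n) for j in range(i + 1, n))

def convolve(p, q):
    """pmf of the sum of two independent variables with pmfs p and q."""
    out = [Fr(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def random_sum_pmf(x, count_pmf):
    """pmf of X_1 + ... + X_M for i.i.d. X_i with pmf x; count_pmf[n] = P{M = n}, count_pmf[0] = 0."""
    total = [Fr(0)] * ((len(count_pmf) - 1) * (len(x) - 1) + 1)
    partial = [Fr(1)]                      # pmf of the empty sum
    for n in range(1, len(count_pmf)):
        partial = convolve(partial, x)     # pmf of X_1 + ... + X_n
        for k, v in enumerate(partial):
            total[k] += count_pmf[n] * v
    return total

# Hypothetical example data (not from the text).
x = [Fr(1, 4), Fr(1, 2), Fr(1, 4)]              # logconcave pmf on {0, 1, 2}
m_pmf = [Fr(0), Fr(1, 2), Fr(3, 10), Fr(1, 5)]  # P{M = 1}, P{M = 2}, P{M = 3}
n_pmf = [Fr(0), Fr(1, 5), Fr(3, 10), Fr(1, 2)]  # M <=lr N

print(is_lr_leq(random_sum_pmf(x, m_pmf), random_sum_pmf(x, n_pmf)))
```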

In Pellerey [445] it is claimed that the conclusion of Theorem 1.C.11 holds even under the weaker assumption that M ≤hr N (in the sense of (1.B.9) or (1.B.10)). However, there is a mistake in [445] (see Pellerey [446]).

It is of interest to compare Theorem 1.C.11 to the following result, which combines uses of the likelihood ratio and the hazard [reversed hazard] rate orders.

Theorem 1.C.12. Let {Xi, i = 1, 2, ...} be a sequence of nonnegative independent random variables that are IFR [have decreasing reversed hazard rates]. Let M and N be two discrete positive integer-valued random variables such that M ≤lr N, and assume that M and N are independent of the Xi's. Then

$$\sum_{i=1}^{M} X_i \le_{\mathrm{hr}} [\le_{\mathrm{rh}}] \sum_{i=1}^{N} X_i.$$

Note that the hazard rate part of Theorem 1.C.12 is weaker than Theorem 1.B.7 because of Theorem 1.C.1. The hazard rate order can be characterized by means of the likelihood ratio order and the appropriate equilibrium age variables. Recall from (1.A.20) that for nonnegative random variables X and Y with ﬁnite means we denote by AX and AY the corresponding asymptotic equilibrium ages. The following result is immediate from (1.B.3) and (1.C.1).

Theorem 1.C.13. Let X and Y be two nonnegative random variables with finite positive means. Then X ≤hr Y if, and only if, AX ≤lr AY.

In light of Theorem 1.C.13 it is of interest to note that the order ≤lr can also be used to characterize the hazard rate order as is described in the next theorem. Let X and Y be two nonnegative random variables with finite means and suppose that X ≤st Y and that EX < EY. Let F and G be the distribution functions of X and of Y, respectively, and let $\overline{F} \equiv 1 - F$ and $\overline{G} \equiv 1 - G$ be the corresponding survival functions. Define the random variable ZX,Y as the random variable that has the density function h given by

$$h(z) = \frac{\overline{G}(z) - \overline{F}(z)}{EY - EX}, \quad z \ge 0. \tag{1.C.7}$$

Theorem 1.C.14. Let X and Y be two nonnegative random variables with finite means such that X ≤st Y and such that EY > EX > 0. Then

$$A_X \le_{\mathrm{lr}} Z_{X,Y} \iff A_Y \le_{\mathrm{lr}} Z_{X,Y} \iff X \le_{\mathrm{hr}} Y,$$

where ZX,Y has the density function given in (1.C.7).

Proof. Denote by fe the density function of AY, and by $\overline{F}$ and $\overline{G}$ the survival functions of X and of Y. Then, using (1.A.20), we obtain

$$\frac{h(x)}{f_e(x)} = \frac{EY}{EY - EX}\left(1 - \frac{\overline{F}(x)}{\overline{G}(x)}\right), \quad x \ge 0,$$

and the second stated equivalence follows from (1.C.1) and (1.B.3). The proof of the first equivalence is similar.
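Theorems 1.C.13 and 1.C.14 can be illustrated numerically with two exponential random variables. The sketch below is a grid check, not a proof. It assumes that the equilibrium age AX has density $\overline{F}(x)/EX$ (the usual reading of (1.A.20)), and all function names and parameter values are hypothetical choices made for this illustration.

```python
import math

def sf_x(t):
    """Survival function of X, exponential with rate 2 (mean 1/2)."""
    return math.exp(-2.0 * t)

def sf_y(t):
    """Survival function of Y, exponential with rate 1 (mean 1)."""
    return math.exp(-t)

MEAN_X, MEAN_Y = 0.5, 1.0

def a_x_density(t):
    """Assumed equilibrium-age density of X: survival function divided by the mean."""
    return sf_x(t) / MEAN_X

def a_y_density(t):
    return sf_y(t) / MEAN_Y

def h(z):
    """Density of Z_{X,Y} as in (1.C.7), written with survival functions."""
    return (sf_y(z) - sf_x(z)) / (MEAN_Y - MEAN_X)

def ratio_is_increasing(num, den, grid, tol=1e-12):
    """Grid check that num(t)/den(t) is nondecreasing in t."""
    values = [num(t) / den(t) for t in grid]
    return all(b >= a - tol for a, b in zip(values, values[1:]))

grid = [0.01 * k for k in range(1, 1001)]

print(ratio_is_increasing(sf_y, sf_x, grid))                # X <=hr Y
print(ratio_is_increasing(a_y_density, a_x_density, grid))  # A_X <=lr A_Y
print(ratio_is_increasing(h, a_x_density, grid))            # A_X <=lr Z
print(ratio_is_increasing(h, a_y_density, grid))            # A_Y <=lr Z
```

In this example the three ratios are $e^{t}/2$, $e^{t} - 1$, and $2(1 - e^{-t})$, so the monotonicity can also be seen by inspection.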

It is of interest to contrast Theorem 1.C.14 with Theorems 2.A.5 and 2.B.3.

The likelihood ratio order enjoys a closure under mixture property which is similar to the closure under mixture property of the hazard rate order stated in Theorem 1.B.8. This is stated next; the proof is similar to the proof of Theorem 1.B.8; we omit the details.

Theorem 1.C.15. Let X, Y, and Θ be random variables such that [X | Θ = θ] ≤lr [Y | Θ = θ′] for all θ and θ′ in the support of Θ. Then X ≤lr Y.

As a corollary of Theorem 1.C.15 we obtain the following result.

Corollary 1.C.16. Let N be a positive integer-valued random variable, and let Xi, i = 1, 2, ..., be random variables which are independent of N. Let Y be a random variable such that Xi ≤lr Y, i = 1, 2, .... Then XN ≤lr Y.

Consider now a family of (continuous or discrete) density functions {gθ, θ ∈ 𝒳} where 𝒳 is a subset of the real line. As in Section 1.A.3 let X(θ) denote a random variable with density function gθ. For any random variable Θ with support in 𝒳, and with distribution function F, let us denote by X(Θ) a random variable with density function h given by

$$h(y) = \int_{\mathcal{X}} g_\theta(y)\, dF(\theta), \quad y \in \mathbb{R}.$$

The following result generalizes both Theorems 1.C.8 and 1.C.15, just as Theorem 1.A.6 generalized parts (a) and (c) of Theorem 1.A.3.

Theorem 1.C.17. Consider a family of density functions {gθ, θ ∈ 𝒳} as above. Let Θ1 and Θ2 be two random variables with supports in 𝒳 and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random variables such that Yi =st X(Θi), i = 1, 2, that is, suppose that the density function of Yi is given by

$$h_i(y) = \int_{\mathcal{X}} g_\theta(y)\, dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

If

$$X(\theta) \le_{\mathrm{lr}} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{1.C.8}$$

and if

$$\Theta_1 \le_{\mathrm{lr}} \Theta_2, \tag{1.C.9}$$

then

$$Y_1 \le_{\mathrm{lr}} Y_2. \tag{1.C.10}$$

Proof. We give the proof under the assumption that Θ1 and Θ2 are absolutely continuous with density functions f1 and f2, respectively. The proof for the discrete case is similar. Assumption (1.C.8) means that gθ(y), as a function of θ and of y, is TP2. Assumption (1.C.9) means that fi(θ), as a function of i ∈ {1, 2} and of θ, is TP2. Therefore, by the basic composition formula (Karlin [275]) we see that hi(y) is TP2 in i ∈ {1, 2} and y. That gives (1.C.10).
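A discrete instance of Theorem 1.C.17 can be checked exactly. In the sketch below the family {gθ} is taken to be the binomial(3, θ) family, which satisfies (1.C.8) because the ratio of two binomial mass functions with the same number of trials is monotone in the argument. The mixing distributions, the function names, and the parameter values are hypothetical and serve only as an illustration.

```python
from fractions import Fraction as Fr
from math import comb

def is_lr_leq(p, q):
    """Check p <=lr q for pmfs on {0, 1, ...}: p[i]*q[j] >= p[j]*q[i] whenever i < j."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return all(p[i] * q[j] >= p[j] * q[i]
               for i in range(n) for j in range(i + 1, n))

def binomial_pmf(n, theta):
    """pmf of a binomial(n, theta) variable, playing the role of g_theta."""
    return [comb(n, k) * theta ** k * (1 - theta) ** (n - k) for k in range(n + 1)]

def mixture_pmf(n, thetas, weights):
    """pmf h_i of X(Theta_i) when Theta_i puts mass weights[r] on thetas[r]."""
    mix = [Fr(0)] * (n + 1)
    for theta, w in zip(thetas, weights):
        for k, v in enumerate(binomial_pmf(n, theta)):
            mix[k] += w * v
    return mix

# Hypothetical example data (not from the text); thetas are listed in increasing order.
thetas = [Fr(1, 5), Fr(1, 2), Fr(4, 5)]
f1 = [Fr(1, 2), Fr(3, 10), Fr(1, 5)]   # pmf of Theta_1 on thetas
f2 = [Fr(1, 5), Fr(3, 10), Fr(1, 2)]   # pmf of Theta_2 on thetas, Theta_1 <=lr Theta_2

h1 = mixture_pmf(3, thetas, f1)
h2 = mixture_pmf(3, thetas, f2)
print(is_lr_leq(h1, h2))
```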

A related result is the following; see also Theorems 1.B.19 and 1.B.54.

Theorem 1.C.18. Let X1, X2, ..., Xm, Θ1, and Θ2 be independent nonnegative random variables. Define Nj(t) for t ≥ 0 and j = 1, 2 as in Theorem 1.B.19. If Θ1 ≤lr Θ2, then N1(t) ≤lr N2(t) for all t ≥ 0.

The following example is an application of Theorem 1.C.17; it may be compared to Examples 1.A.7 and 1.B.16.

Example 1.C.19. Let Θ1 and Θ2 be two nonnegative random variables with distribution functions F1 and F2, respectively. Let G be some absolutely continuous distribution function, and let g be the corresponding density function. Denote by X(θ) a random variable with the distribution function $G^{\theta}$. Define Yi = X(Θi); that is, the distribution function Hi of Yi is given by

$$H_i(y) = \int_0^{\infty} G^{\theta}(y)\, dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

Note that the density function kθ of X(θ) is given by

$$k_\theta(y) = \theta g(y) G^{\theta-1}(y), \quad y \in \mathbb{R}.$$

It is easy to verify that (1.C.8) holds. Thus, by Theorem 1.C.17, if Θ1 ≤lr Θ2, then Y1 ≤lr Y2.

Now, denote by $\widetilde{X}(\theta)$ a random variable with the survival function $\overline{G}^{\theta}$, where $\overline{G} \equiv 1 - G$. Define $\widetilde{Y}_i = \widetilde{X}(\Theta_i)$; that is, the survival function $\overline{\widetilde{H}}_i$ of $\widetilde{Y}_i$ is given by

$$\overline{\widetilde{H}}_i(y) = \int_0^{\infty} \overline{G}^{\theta}(y)\, dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

Note that the density function $\tilde{k}_\theta$ of $\widetilde{X}(\theta)$ is given by

$$\tilde{k}_\theta(y) = \theta g(y) \overline{G}^{\theta-1}(y), \quad y \in \mathbb{R}.$$

It is easy to verify now that $\widetilde{X}(\theta) \ge_{\mathrm{lr}} \widetilde{X}(\theta')$ whenever θ ≤ θ′. Thus, by an obvious modification of Theorem 1.C.17, if Θ1 ≤lr Θ2, then $\widetilde{Y}_1 \ge_{\mathrm{lr}} \widetilde{Y}_2$.

In order to state a bivariate characterization result for the order ≤lr we define the following class of bivariate functions:

$$\mathcal{G}_{\mathrm{lr}} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) \le \phi(y, x) \text{ whenever } x \le y\}.$$

Theorem 1.C.20. Let X and Y be independent random variables. Then X ≤lr Y if, and only if,

$$\phi(X, Y) \le_{\mathrm{st}} \phi(Y, X) \quad \text{for all } \phi \in \mathcal{G}_{\mathrm{lr}}. \tag{1.C.11}$$

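Proof. We give the proof for the absolutely continuous case only; the proof for the discrete case is similar. Suppose that (1.C.11) holds. Select u, v, Δu > 0, and Δv > 0 such that u ≤ v. As before, let IA denote the indicator function of the set A, and define φ(x, y) = I{u−Δu ≤ y ≤ u, v ≤ x ≤ v+Δv}. Clearly, φ ∈ Glr. Hence

$$P\{v \le X \le v + \Delta v,\ u - \Delta u \le Y \le u\} = E\phi(X, Y) \le E\phi(Y, X) = P\{v \le Y \le v + \Delta v,\ u - \Delta u \le X \le u\}.$$

Dividing both sides by ΔuΔv and letting Δu → 0 and Δv → 0, we obtain (1.C.2), that is, X ≤lr Y.

Conversely, suppose that X ≤lr Y, and let f and g denote the density functions of X and of Y. Let φ ∈ Glr and let ψ be an increasing function. Then

$$E[\psi(\phi(Y, X)) - \psi(\phi(X, Y))] = \int_y \int_x [\psi(\phi(y, x)) - \psi(\phi(x, y))]\, f(x) g(y)\, dx\, dy$$

$$= \iint_{y \ge x} [\psi(\phi(y, x)) - \psi(\phi(x, y))]\,[f(x) g(y) - f(y) g(x)]\, dy\, dx \ge 0,$$

where the inequality holds because, for x ≤ y, both bracketed factors are nonnegative (the first since φ ∈ Glr and ψ is increasing, the second by (1.C.2)). Since ψ is an arbitrary increasing function, (1.C.11) follows.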

A typical application of Theorem 1.C.20 is shown in the proof of Theorem 6.B.15 in Chapter 6. Another typical application is the following result.

Theorem 1.C.21. Let X1, X2, ..., Xm be independent random variables such that X1 ≤lr X2 ≤lr ··· ≤lr Xm. Let a1, a2, ..., am be constants such that a1 ≤ a2 ≤ ··· ≤ am. Then

$$\sum_{i=1}^{m} a_{m-i+1} X_i \le_{\mathrm{st}} \sum_{i=1}^{m} a_{\pi_i} X_i \le_{\mathrm{st}} \sum_{i=1}^{m} a_i X_i,$$

where π = (π1, π2, ..., πm) denotes any permutation of (1, 2, ..., m).

Proof. We only give the proof when m = 2; the general case then can be obtained by pairwise interchanges. So, suppose that X1 ≤lr X2 and that a1 ≤ a2. Define φ by φ(x, y) = a1 y + a2 x. Then it is easy to verify that φ ∈ Glr. Thus, by Theorem 1.C.20, a1 X2 + a2 X1 ≤st a1 X1 + a2 X2.
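Theorem 1.C.21 can be checked by exhaustive enumeration for small discrete examples. The sketch below does this for m = 3 and every permutation of the coefficients. The three mass functions, the coefficients, and the helper names (`weighted_sum_dist`, `is_st_leq`) are hypothetical choices for illustration only.

```python
from fractions import Fraction as Fr
from itertools import permutations, product

def is_lr_leq(p, q):
    """Check p <=lr q for pmfs on {0, 1, ...}: p[i]*q[j] >= p[j]*q[i] whenever i < j."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return all(p[i] * q[j] >= p[j] * q[i]
               for i in range(n) for j in range(i + 1, n))

def weighted_sum_dist(pmfs, coeffs):
    """Exact distribution {value: probability} of sum(coeffs[i] * X_i), X_i independent."""
    dist = {}
    for xs in product(*(range(len(p)) for p in pmfs)):
        prob = Fr(1)
        for p, x in zip(pmfs, xs):
            prob *= p[x]
        value = sum(c * x for c, x in zip(coeffs, xs))
        dist[value] = dist.get(value, Fr(0)) + prob
    return dist

def is_st_leq(d1, d2):
    """Usual stochastic order: P{U > t} <= P{V > t} at every support point t."""
    points = sorted(set(d1) | set(d2))
    return all(
        sum(pr for v, pr in d1.items() if v > t) <= sum(pr for v, pr in d2.items() if v > t)
        for t in points
    )

# Hypothetical example data (not from the text): X1 <=lr X2 <=lr X3 on {0, 1, 2}.
pmfs = [
    [Fr(1, 2), Fr(3, 10), Fr(1, 5)],
    [Fr(1, 3), Fr(1, 3), Fr(1, 3)],
    [Fr(1, 5), Fr(3, 10), Fr(1, 2)],
]
a = (1, 2, 4)                                   # a1 <= a2 <= a3

lower = weighted_sum_dist(pmfs, a[::-1])        # sum of a_{m-i+1} X_i
upper = weighted_sum_dist(pmfs, a)              # sum of a_i X_i
ok = all(
    is_st_leq(lower, weighted_sum_dist(pmfs, perm))
    and is_st_leq(weighted_sum_dist(pmfs, perm), upper)
    for perm in permutations(a)
)
print(ok)
```

Comparing survival functions only at the support points is enough here because both survival functions are step functions that change only at those points.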

The next two results are characterizations similar to the one in Theorem 1.C.20. They use the notation of Theorem 1.A.10, and their comparison is of interest. The proofs of the following two theorems are omitted.

Theorem 1.C.22. Let X and Y be two independent random variables. Then X ≤lr Y if, and only if, Eφ1(X, Y) ≤ Eφ2(X, Y) for all functions φ1 and φ2 that satisfy Δφ21(x, y) ≥ 0 whenever x ≤ y, and Δφ21(x, y) ≥ −Δφ21(y, x) whenever x ≤ y.

Theorem 1.C.23. Let X and Y be two independent random variables. Then X ≤lr Y if, and only if, φ1(X, Y) ≤st φ2(X, Y) for all φ1 and φ2 that satisfy Δφ21(x, y) ≥ 0 whenever x ≤ y, and φ1(x, y) ≤ φ2(y, x) for all x and y (then, in particular, Δφ21(x, y) ≥ −Δφ21(y, x) whenever x ≤ y).

The next theorem gives a characterization of the likelihood ratio order in the spirit of Theorems 1.B.11 and 1.B.49.

Theorem 1.C.24. Let X and Y be two independent random variables. Then X ≤lr Y if, and only if,

$$[X \mid \min(X, Y) = z_1, \max(X, Y) = z_2] \le_{\mathrm{lr}} [Y \mid \min(X, Y) = z_1, \max(X, Y) = z_2] \quad \text{for all } z_1 \le z_2.$$

Proof. First suppose that X and Y are absolutely continuous with density functions f and g, respectively. Then

$$\begin{aligned}
P[X = z_1 \mid \min(X, Y) = z_1, \max(X, Y) = z_2]
&= 1 - P[X = z_2 \mid \min(X, Y) = z_1, \max(X, Y) = z_2] \\
&= P[Y = z_2 \mid \min(X, Y) = z_1, \max(X, Y) = z_2] \\
&= 1 - P[Y = z_1 \mid \min(X, Y) = z_1, \max(X, Y) = z_2] \\
&= \frac{f(z_1) g(z_2)}{f(z_1) g(z_2) + f(z_2) g(z_1)},
\end{aligned}$$

and the stated result follows. The proof when X and Y are discrete is similar.

Another similar characterization is given in Theorem 4.A.36. The following result gives a Laplace transform characterization of the order ≤lr . It should be compared with Theorems 1.A.13, 1.B.18, and 1.B.53. The proof is omitted. Theorem 1.C.25. Let X1 and X2 be two nonnegative random variables, and let Nλ (X1 ) and Nλ (X2 ) be as described in Theorem 1.A.13. Then X1 ≤lr X2 ⇐⇒ Nλ (X1 ) ≤lr Nλ (X2 )

for all λ > 0.

The implication =⇒ in Theorem 1.C.25 can be generalized in the same manner that Theorem 1.A.14 generalizes the implication =⇒ in Theorem 1.A.13. We will not state the result here since it is equivalent to Theorem 1.C.17.

Some interesting simple implications of the likelihood ratio order are described in the following theorem.

Theorem 1.C.26. Let X, Y, and Z be independent random variables. If X ≤lr Y, then

[X | X + Y = v] ≤lr [Y | X + Y = v] for all v,
[X | X + Z = v] ≤lr [Y | Y + Z = v] for all v, and
[Z | X + Z = v] ≥lr [Z | Y + Z = v] for all v.

Proof. We give only the proof of the first inequality; the proofs of the other two are similar. First suppose that X and Y are absolutely continuous with density functions f and g, respectively. Denote the density function of X + Y by h. Then the density function of [Y | X + Y = v] is given by $f(v - \cdot)\, g(\cdot)/h(v)$, and the density function of [X | X + Y = v] is given by $f(\cdot)\, g(v - \cdot)/h(v)$. It is now seen that the monotonicity of g/f implies the monotonicity of the ratio of the above two density functions. The proof when X and Y are discrete is similar.

The next, easily proven, result is stronger than Theorems 1.A.15, 1.B.20, and 1.B.55.

Theorem 1.C.27. Let X be any random variable. Then $X_{(-\infty,a]}$ and $X_{(a,\infty)}$ are increasing in a in the sense of the likelihood ratio order.

A similar setting in which the order ≤hr gives rise to the order ≤lr is described in the following result.

Theorem 1.C.28. Let X, Y, and T be random variables such that T is independent of (X, Y). If X ≤hr Y, then [T | T < X] ≤lr [T | T < Y].

Proof. For simplicity assume that T is absolutely continuous with density function fT. Let $\overline{F}_X$ and $\overline{F}_Y$ be the survival functions of X and Y. The density function of [T | T < X] is proportional to $f_T \overline{F}_X$ and the density function of [T | T < Y] is proportional to $f_T \overline{F}_Y$. The stated result now follows from (1.B.3).

An analog of the remark after Theorem 1.B.21 is the following result; its proof is straightforward.

Theorem 1.C.29. Let X be a nonnegative, absolutely continuous, random variable with the density function f. Then aX ≤lr X for all 0 < a < 1 if, and only if, $\log f(e^{x})$ is concave in x ≥ 0.

In the next result it is shown that a random variable, whose distribution is the mixture of two distributions of likelihood ratio ordered random variables, is bounded from below and from above, in the likelihood ratio order sense, by these two random variables.

Theorem 1.C.30. Let X and Y be two random variables with distribution functions F and G, respectively. Let W be a random variable with the distribution function pF + (1 − p)G for some p ∈ (0, 1). If X ≤lr Y, then X ≤lr W ≤lr Y.

Proof. Let A and B be two measurable sets such that A ≤ B; see (1.C.3). If X ≤lr Y, then

$$\begin{aligned}
P\{X \in A\} P\{W \in B\} &= P\{X \in A\}\bigl(p P\{X \in B\} + (1 - p) P\{Y \in B\}\bigr) \\
&\ge P\{X \in B\}\bigl(p P\{X \in A\} + (1 - p) P\{Y \in A\}\bigr) \\
&= P\{X \in B\} P\{W \in A\},
\end{aligned}$$

where the inequality follows from (1.C.3). Thus, by (1.C.3), X ≤lr W. The proof that W ≤lr Y is similar.
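Theorem 1.C.30 is easy to confirm on a small discrete example. The following sketch uses exact rational arithmetic; the helper names and the two mass functions are hypothetical and are chosen only for illustration.

```python
from fractions import Fraction as Fr

def is_lr_leq(p, q):
    """Check p <=lr q for pmfs on {0, 1, ...}: p[i]*q[j] >= p[j]*q[i] whenever i < j."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return all(p[i] * q[j] >= p[j] * q[i]
               for i in range(n) for j in range(i + 1, n))

def mix(f, g, p):
    """pmf of W, the mixture p*F + (1 - p)*G, for pmfs f and g of equal length."""
    return [p * a + (1 - p) * b for a, b in zip(f, g)]

# Hypothetical example data (not from the text).
f = [Fr(1, 2), Fr(3, 10), Fr(1, 5)]   # pmf of X
g = [Fr(1, 5), Fr(3, 10), Fr(1, 2)]   # pmf of Y, with X <=lr Y

for p in (Fr(1, 10), Fr(1, 2), Fr(9, 10)):
    w = mix(f, g, p)
    print(p, is_lr_leq(f, w) and is_lr_leq(w, g))
```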

Analogous to the result in Remark 1.A.18, it can be shown that some general sets of distribution functions on R are lattices with respect to the order ≤lr.

Let X1, X2, ..., Xm be random variables, and let X(k:m) denote the corresponding kth order statistic, k = 1, 2, ..., m.

Theorem 1.C.31. Let X1, X2, ..., Xm be m independent random variables, all with absolutely continuous distribution functions, all having the same support which is an interval of the real line, and all having differentiable densities.

(a) If X1 ≤lr X2 ≤lr ··· ≤lr Xm, then

$$X_{(k-1:m)} \le_{\mathrm{lr}} X_{(k:m)}, \quad 2 \le k \le m,$$

and

$$X_{(k-1:m-1)} \le_{\mathrm{lr}} X_{(k:m)}, \quad 2 \le k \le m.$$

(b) If X1 ≥lr X2 ≥lr ··· ≥lr Xm, then

$$X_{(k:m)} \le_{\mathrm{lr}} X_{(k:m-1)}, \quad 1 \le k \le m - 1.$$

A similar result for a finite population is the following. Consider a finite population of size N which is linearly ordered, and suppose, without loss of generality, that it can be represented as {1, 2, ..., N}. Here let X(1) ≤ X(2) ≤ ··· ≤ X(m) denote the order statistics corresponding to a simple random sample of size m from this population.

Theorem 1.C.32. Let X(1) ≤ X(2) ≤ ··· ≤ X(m) be defined as in the preceding paragraph. Then X(1) ≤lr X(2) ≤lr ··· ≤lr X(m).

Proof. For each k ∈ {1, 2, ..., m}, let fk denote the discrete density of X(k). Then

$$f_k(j) = \begin{cases} \dfrac{\binom{j-1}{k-1}\binom{N-j}{m-k}}{\binom{N}{m}}, & j = k, k + 1, \ldots, k + N - m; \\[2ex] 0, & \text{otherwise.} \end{cases}$$

Therefore, for k ∈ {1, 2, ..., m − 1}, we have

$$\frac{f_{k+1}(j)}{f_k(j)} = \begin{cases} 0, & j = k; \\[1ex] \dfrac{(m-k)(j-k)}{k(N-j-m+k+1)}, & j = k + 1, k + 2, \ldots, k + N - m; \\[2ex] \infty, & j = k + N - m + 1. \end{cases}$$

This is increasing in j, and therefore X(k) ≤lr X(k+1) .
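The formulas in the proof of Theorem 1.C.32 can be verified exactly for a specific population. The sketch below uses N = 10 and m = 4, which are arbitrary illustrative values; the function names are hypothetical.

```python
from fractions import Fraction as Fr
from math import comb

def is_lr_leq(p, q):
    """Check p <=lr q for pmfs on {0, 1, ...}: p[i]*q[j] >= p[j]*q[i] whenever i < j."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return all(p[i] * q[j] >= p[j] * q[i]
               for i in range(n) for j in range(i + 1, n))

def order_stat_pmf(k, m, N):
    """f_k(j) for j = 0, ..., N (index 0 is unused and set to 0).

    k-th order statistic of a simple random sample of size m from {1, ..., N}.
    math.comb returns 0 outside the support, which matches the 'otherwise' case.
    """
    return [Fr(0)] + [
        Fr(comb(j - 1, k - 1) * comb(N - j, m - k), comb(N, m))
        for j in range(1, N + 1)
    ]

N, m = 10, 4   # hypothetical population and sample sizes
pmfs = {k: order_stat_pmf(k, m, N) for k in range(1, m + 1)}

for k in range(1, m):
    print(k, is_lr_leq(pmfs[k], pmfs[k + 1]))
```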

Under some conditions the likelihood ratio order is closed under the formation of order statistics. As above, let X(j:m) denote the jth order statistic associated with the random variables X1, X2, ..., Xm, and let Y(i:n) denote similarly the ith order statistic associated with the random variables Y1, Y2, ..., Yn.

Theorem 1.C.33. Let X1, X2, ..., Xm be m independent random variables, and let Y1, Y2, ..., Yn be other n independent random variables. If

$$X_j \le_{\mathrm{lr}} Y_i \quad \text{for all } 1 \le j \le m \text{ and } 1 \le i \le n,$$

then

$$X_{(j:m)} \le_{\mathrm{lr}} Y_{(i:n)} \quad \text{whenever } j \le i \text{ and } m - j \ge n - i.$$

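Proof. First we give the proof when X1, X2, ..., Xm and Y1, Y2, ..., Yn all have absolutely continuous distribution functions. In this proof we use an idea of Chan, Proschan, and Sethuraman [123]. Let $f_j$, $F_j$, and $\overline{F}_j \equiv 1 - F_j$ denote the density, distribution, and survival functions of Xj. Similarly, let $g_i$, $G_i$, and $\overline{G}_i$ denote the density, distribution, and survival functions of Yi. The density functions of X(j:m) and Y(i:n) are given by

$$f_{X_{(j:m)}}(t) = \sum_{\pi} f_{\pi_1}(t) F_{\pi_2}(t) \cdots F_{\pi_j}(t)\, \overline{F}_{\pi_{j+1}}(t) \cdots \overline{F}_{\pi_m}(t),$$

and

$$g_{Y_{(i:n)}}(t) = \sum_{\sigma} g_{\sigma_1}(t) G_{\sigma_2}(t) \cdots G_{\sigma_i}(t)\, \overline{G}_{\sigma_{i+1}}(t) \cdots \overline{G}_{\sigma_n}(t),$$

where $\sum_{\pi}$ signifies the sum over all permutations π = (π1, π2, ..., πm) of (1, 2, ..., m), and $\sum_{\sigma}$ similarly denotes the sum over all permutations σ = (σ1, σ2, ..., σn) of (1, 2, ..., n). Write

$$\frac{g_{Y_{(i:n)}}(t)}{f_{X_{(j:m)}}(t)} = \frac{\sum_{\sigma} g_{\sigma_1}(t) G_{\sigma_2}(t) \cdots G_{\sigma_i}(t)\, \overline{G}_{\sigma_{i+1}}(t) \cdots \overline{G}_{\sigma_n}(t)}{\sum_{\pi} f_{\pi_1}(t) F_{\pi_2}(t) \cdots F_{\pi_j}(t)\, \overline{F}_{\pi_{j+1}}(t) \cdots \overline{F}_{\pi_m}(t)}. \tag{1.C.12}$$

Now, for any choice of a permutation π of (1, 2, ..., m) and a permutation σ of (1, 2, ..., n) we have

$$\begin{aligned}
&\frac{g_{\sigma_1}(t) G_{\sigma_2}(t) \cdots G_{\sigma_i}(t)\, \overline{G}_{\sigma_{i+1}}(t) \cdots \overline{G}_{\sigma_n}(t)}{f_{\pi_1}(t) F_{\pi_2}(t) \cdots F_{\pi_j}(t)\, \overline{F}_{\pi_{j+1}}(t) \cdots \overline{F}_{\pi_m}(t)} \\
&\quad = \frac{g_{\sigma_1}(t)}{f_{\pi_1}(t)} \times \frac{G_{\sigma_2}(t) \cdots G_{\sigma_j}(t)}{F_{\pi_2}(t) \cdots F_{\pi_j}(t)} \times \frac{\overline{G}_{\sigma_{i+1}}(t) \cdots \overline{G}_{\sigma_n}(t)}{\overline{F}_{\pi_{m-n+i+1}}(t) \cdots \overline{F}_{\pi_m}(t)} \times \frac{G_{\sigma_{j+1}}(t) \cdots G_{\sigma_i}(t)}{\overline{F}_{\pi_{j+1}}(t) \cdots \overline{F}_{\pi_{m-n+i}}(t)}.
\end{aligned}$$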

Since $X_{\pi_1} \le_{\mathrm{lr}} Y_{\sigma_1}$ we see from (1.C.1) that the first fraction above is increasing in t. From $X_{\pi_k} \le_{\mathrm{lr}} Y_{\sigma_k}$ and Theorem 1.C.1 it follows that $X_{\pi_k} \le_{\mathrm{rh}} Y_{\sigma_k}$; but that means that $G_{\sigma_k}(t)/F_{\pi_k}(t)$ is increasing in t, k = 2, ..., j, and therefore the second fraction above is increasing in t. Similarly, from $X_{\pi_{k+m-n}} \le_{\mathrm{lr}} Y_{\sigma_k}$ and Theorem 1.C.1 it also follows that $X_{\pi_{k+m-n}} \le_{\mathrm{hr}} Y_{\sigma_k}$; but that means that $\overline{G}_{\sigma_k}(t)/\overline{F}_{\pi_{k+m-n}}(t)$ is increasing in t, k = i + 1, ..., n, and therefore the third fraction above is increasing in t. The fourth fraction above obviously increases in t too, since its numerator is a product of distribution functions and its denominator is a product of survival functions, and thus the whole product increases in t.

Note that if a1, a2, ..., am and b1, b2, ..., bn are all nonnegative univariate functions, such that $a_j(t)/b_i(t)$ is increasing in t for all 1 ≤ j ≤ m and 1 ≤ i ≤ n, then $\sum_{j=1}^{m} a_j(t) / \sum_{i=1}^{n} b_i(t)$ is also increasing in t. It follows from this fact, and from (1.C.12), that $g_{Y_{(i:n)}}(t)/f_{X_{(j:m)}}(t)$ is increasing in t, and from (1.C.1) we obtain the stated result.

The result for the case when the random variables do not necessarily have absolutely continuous distribution functions follows from the above proof and the closure of the likelihood ratio order under weak convergence (Theorem 1.C.7).

Some of the results that are described in the following pages are stated in the literature (see Section 1.E) only for random variables with absolutely continuous distribution functions. However, by the closure of the likelihood ratio order under weak convergence (Theorem 1.C.7) these results are true also for random variables that do not necessarily have absolutely continuous distribution functions. As a corollary of Theorem 1.C.33 we obtain the following result. Corollary 1.C.34. Let X1 , X2 , . . . , Xm be m independent random variables and let Y1 , Y2 , . . . , Ym be other m independent random variables. If Xj ≤lr Yi , for all choices of i and j, then X(k) ≤lr Y(k) , k = 1, 2, . . . , m. Example 1.C.35. Let X and Y be two independent random variables. If X ≤lr Y , then min{X, Y } ≤lr Y and X ≤lr max{X, Y }. Example 1.C.36. Let X, Y , and Z be three independent random variables. If X ≤lr Y ≤lr Z, then min{X, Y } ≤lr min{Y, Z} and max{X, Y } ≤lr max{Y, Z}. By letting all the Xj ’s and Yi ’s in Theorem 1.C.33 be identically distributed we obtain the following result. Theorem 1.C.37. For positive integers m and n, let X1 , X2 , . . . , Xmax{m,n} be independent identically distributed random variables. Then X(j:m) ≤lr X(i:n)

whenever j ≤ i and m − j ≥ n − i.
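The index condition in Theorem 1.C.37 can be made concrete with a worked special case. The choice of the uniform distribution on (0, 1) is an assumption made here purely for illustration; in that case the order statistics have beta densities, and the likelihood ratio can be written down directly.

```latex
% Assumption for illustration: X_1, X_2, ... are i.i.d. uniform on (0,1),
% so that X_{(j:m)} has the beta density with parameters j and m-j+1.
\[
  f_{X_{(j:m)}}(t) \propto t^{\,j-1}(1-t)^{\,m-j},
  \qquad
  f_{X_{(i:n)}}(t) \propto t^{\,i-1}(1-t)^{\,n-i},
  \qquad 0 < t < 1 .
\]
\[
  \frac{f_{X_{(i:n)}}(t)}{f_{X_{(j:m)}}(t)}
  \propto t^{\,i-j}\,(1-t)^{-[(m-j)-(n-i)]} .
\]
% Both factors are nondecreasing in t on (0,1) exactly when
% i - j >= 0 and (m-j) - (n-i) >= 0, which is the condition
% "j <= i and m - j >= n - i" of the theorem.
```

In this special case the two exponents show why both inequalities on the indices are needed: the first controls the behavior near 0 and the second the behavior near 1.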

In particular, it follows from Theorem 1.C.37 that

$$X_1 \le_{\mathrm{lr}} X_{(m:m)}, \quad m = 2, 3, \ldots \tag{1.C.13}$$

and

$$X_1 \ge_{\mathrm{lr}} X_{(1:m)}, \quad m = 2, 3, \ldots. \tag{1.C.14}$$

Note that (1.C.13) and (1.C.14) can also be obtained by induction from Example 1.C.35. The following two corollaries of Theorem 1.C.37 can be compared to Theorems 1.B.27 and 1.B.28.

Corollary 1.C.38. Let X1, X2, ..., Xm be independent identically distributed random variables. Then X(k−1:m−1) ≤lr X(k:m) for k = 2, 3, ..., m.

Corollary 1.C.39. Let X1, X2, ..., Xm be independent identically distributed random variables. Then X(k:m−1) ≥lr X(k:m) for k = 1, 2, ..., m − 1.

Remark 1.C.40. The likelihood ratio order can be used to provide a proof of Theorem 1.B.26. Let X1, X2, ..., Xm be independent nonnegative random variables, and let X(1) ≤ X(2) ≤ ··· ≤ X(m) denote the corresponding order statistics. Fix s and t such that 0 ≤ s ≤ t. For j = 1, 2, ..., m, define Mj = 1 if Xj ≤ s, and Mj = 0 if Xj > s, and also define Nj = 1 if Xj ≤ t, and Nj = 0 if Xj > t. Denote $M = \sum_{j=1}^{m} M_j$ and $N = \sum_{j=1}^{m} N_j$. Note that, for j = 1, 2, ..., m, we have

$$P\{M < j\} = P\{X_{(j)} > s\} \quad \text{and} \quad P\{N < j\} = P\{X_{(j)} > t\}.$$

Since P{Mj = 1} = P{Xj ≤ s} ≤ P{Xj ≤ t} = P{Nj = 1} it is easily seen that Mj ≤lr Nj, j = 1, 2, ..., m. Also, obviously, Mj and Nj have logconcave discrete density functions. Thus, from Theorem 1.C.9 it is seen that M ≤lr N. Therefore, by Theorem 1.C.1, M ≤rh N. Thus, from (1.B.44), we get that

$$\frac{P\{N < j\}}{P\{M < j\}} \quad \text{is increasing in } j \ge 1.$$

Therefore, for k such that 1 ≤ k ≤ m − 1 we have

$$\frac{P\{X_{(k)} > t\}}{P\{X_{(k)} > s\}} = \frac{P\{N < k\}}{P\{M < k\}} \le \frac{P\{N < k + 1\}}{P\{M < k + 1\}} = \frac{P\{X_{(k+1)} > t\}}{P\{X_{(k+1)} > s\}}.$$

From (1.B.3) it thus follows that X(k) ≤hr X(k+1).

Remark 1.C.41. The likelihood ratio order can be used to provide a proof of Theorem 1.B.36. Let the Xi's and the Yj's be as in that theorem. Assume that Xi ≤hr Yj for all i, j. We first show that there exists a random variable Z with support (a, b) such that Xi ≤hr Z ≤hr Yj for all i, j. Let $r_{X_i}$ and $r_{Y_j}$ denote the hazard rate functions of the indicated random variables. From the assumption that Xi ≤hr Yj for all i, j it follows by (1.B.2) that

$$\min\{r_{X_1}(t), r_{X_2}(t), \ldots, r_{X_m}(t)\} \ge \max\{r_{Y_1}(t), r_{Y_2}(t), \ldots, r_{Y_n}(t)\}, \quad t \in (a, b).$$

Let q be a function which satisfies

$$\min\{r_{X_1}(t), r_{X_2}(t), \ldots, r_{X_m}(t)\} \ge q(t) \ge \max\{r_{Y_1}(t), r_{Y_2}(t), \ldots, r_{Y_n}(t)\}, \quad t \in (a, b);$$

for example, let $q(t) = \min\{r_{X_1}(t), r_{X_2}(t), \ldots, r_{X_m}(t)\}$. It can be shown that q is indeed a hazard rate function. Let Z be a random variable with the hazard rate function q. Then indeed Xi ≤hr Z ≤hr Yj for all i, j. Now, let Z1, Z2, ..., Zmax{m,n} be independent random variables which are distributed as Z. Then, for i ≤ j and m − i ≥ n − j we have

$$\begin{aligned}
X_{(i:m)} &\le_{\mathrm{hr}} Z_{(i:m)} && \text{(by Proposition 1.B.35)} \\
&\le_{\mathrm{lr}} Z_{(j:n)} && \text{(by Theorem 1.C.37)} \\
&\le_{\mathrm{hr}} Y_{(j:n)} && \text{(by Proposition 1.B.35),}
\end{aligned}$$

and Theorem 1.B.36 follows from the fact that the likelihood ratio order implies the hazard rate order.

Recall that for a collection X1, X2, ..., Xm of nonnegative random variables, the spacings are defined by U(i) ≡ X(i) − X(i−1), i = 1, 2, ..., m, where X(0) ≡ 0. The following result may be compared with Theorems 1.A.19, 1.A.21, and 1.B.31.

Theorem 1.C.42. Let X1, X2, ..., Xm be independent exponential random variables with possibly different parameters. Then

$$U_{(1)} \le_{\mathrm{lr}} \frac{m - i + 1}{m} \cdot U_{(i)}, \quad i = 1, 2, \ldots, m.$$

It is worth mentioning that Kochar and Kirmani [313] claimed that if X1, X2, ..., Xm are independent and identically distributed random variables with a common logconvex density, then U(i) ≤lr ((m − i)/(m − i + 1))U(i+1) for i = 1, 2, ..., m − 1. However, Misra and van der Meulen [396] showed via a counterexample that this is not correct.

For spacings that are not "normalized" we have the following results. We denote by U(i:m) = X(i:m) − X(i−1:m) the ith spacing that corresponds to a sample X1, X2, ..., Xm of size m.

Theorem 1.C.43. Let X1, X2, ..., Xm, Xm+1 be independent, identically distributed, nonnegative random variables with a common logconvex density. Then

$$U_{(i:m)} \le_{\mathrm{lr}} U_{(i+1:m)}, \quad 1 \le i \le m - 1,$$

$$U_{(i:m+1)} \le_{\mathrm{lr}} U_{(i:m)}, \quad 1 \le i \le m,$$

and

$$U_{(i:m)} \le_{\mathrm{lr}} U_{(i+1:m+1)}, \quad 1 \le i \le m.$$

Note that the three statements of the above theorem can be summarized as U(j:m) ≤lr U(i:n)

whenever i − j ≥ max{0, n − m}.

We also have the following result. Theorem 1.C.44. Let X1 , X2 , . . . , Xm , Xm+1 be independent, identically distributed, nonnegative random variables with a common logconcave density. Then U(i:m) ≥lr U(i+1:m+1) , 1 ≤ i ≤ m. A comparison of spacings from two diﬀerent samples, that is similar to Theorem 1.B.32, is described next. In fact, it will be argued after the next theorem that the next result strengthens Theorem 1.B.31. Here U(i:m) = X(i:m) − X(i−1:m) denotes, as before, the ith spacing that corresponds to the sample X1 , X2 , . . . , Xm , and V(j:n) denotes, similarly, the jth spacing that corresponds to the sample Y1 , Y2 , . . . , Yn . Other results which give related comparisons can be found in Theorem 4.B.17 and in Examples 6.B.25 and 6.E.15. Theorem 1.C.45. For positive integers m and n, let X1 , X2 , . . . , Xm be independent identically distributed random variables with an absolutely continuous common distribution function, and let Y1 , Y2 , . . . , Yn be independent identically distributed random variables with a possibly diﬀerent absolutely continuous common distribution function. If X1 ≤lr Y1 , and if either X1 or Y1 is DFR, then (m − j + 1)U(j:m) ≤hr (n − i + 1)V(i:n)

whenever i − j ≥ max{0, n − m}.

Taking X1 =st Y1 in Theorem 1.C.45 it is seen that Theorem 1.B.31 is a consequence of Theorem 1.C.45.

In the following example it is shown that, under the proper conditions, random minima and maxima are ordered in the likelihood ratio order sense; see related results in Examples 3.B.39, 4.B.16, 5.A.24 and 5.B.13.

Example 1.C.46. Let X1, X2, ... be a sequence of absolutely continuous nonnegative independent and identically distributed random variables with a common distribution function $F_{X_1}$ and a common density function $f_{X_1}$. Let N1 and N2 be two positive integer-valued random variables which are independent of the Xi's. Denote $X_{(1:N_j)} \equiv \min\{X_1, X_2, \ldots, X_{N_j}\}$ and $X_{(N_j:N_j)} \equiv \max\{X_1, X_2, \ldots, X_{N_j}\}$, j = 1, 2. Then the density function of $X_{(N_j:N_j)}$ is given by

$$f_{X_{(N_j:N_j)}}(x) = \sum_{n=1}^{\infty} n F_{X_1}^{n-1}(x) f_{X_1}(x) P\{N_j = n\}, \quad x \ge 0,\ j = 1, 2.$$

If N1 ≤lr N2, then P{Nj = n} is TP2 in n ≥ 1 and j ∈ {1, 2}. Also, $n F_{X_1}^{n-1}(x) f_{X_1}(x)$ is TP2 in n ≥ 1 and x ≥ 0. Therefore, by the Basic Composition Formula (Karlin [275]) it follows that $f_{X_{(N_j:N_j)}}(x)$ is TP2 in x ≥ 0 and j ∈ {1, 2}. That is, $X_{(N_1:N_1)} \le_{\mathrm{lr}} X_{(N_2:N_2)}$. In a similar fashion it can be shown also that $X_{(1:N_1)} \ge_{\mathrm{lr}} X_{(1:N_2)}$.

Example 1.C.47. Let {N(t), t ≥ 0} be a nonhomogeneous Poisson process with mean function Λ (that is, Λ(t) ≡ E[N(t)], t ≥ 0), and let T1, T2, ... be the successive epoch times. The survival function of Tn is given by

$$P\{T_n > t\} = \sum_{i=0}^{n-1} \frac{(\Lambda(t))^{i}}{i!} e^{-\Lambda(t)}, \quad t \ge 0,$$

and the density function of Tn is given by

$$f_n(t) = \lambda(t) \frac{(\Lambda(t))^{n-1}}{(n-1)!} e^{-\Lambda(t)}, \quad t \ge 0,$$

where $\lambda(t) \equiv \frac{d}{dt}\Lambda(t)$, n = 1, 2, .... It is easy to verify that $f_{n+1}(t)/f_n(t)$ is increasing in t ≥ 0, n = 1, 2, ..., and therefore

$$T_n \le_{\mathrm{lr}} T_{n+1}, \quad n = 1, 2, \ldots.$$
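The verification left to the reader in Example 1.C.47 is a one-line computation from the density formula for Tn, shown here as a worked step.

```latex
% Ratio of consecutive epoch-time densities in Example 1.C.47.
\[
  \frac{f_{n+1}(t)}{f_n(t)}
  = \frac{\lambda(t)\,(\Lambda(t))^{n}\,e^{-\Lambda(t)}/n!}
         {\lambda(t)\,(\Lambda(t))^{n-1}\,e^{-\Lambda(t)}/(n-1)!}
  = \frac{\Lambda(t)}{n},
  \qquad t \ge 0 .
\]
% The mean function \Lambda is nondecreasing because \lambda = \Lambda' \ge 0,
% so the ratio is nondecreasing in t, which is (1.C.1) for T_n and T_{n+1}.
```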

Theorem 2.6 on page 182 of Kamps [273] extends Example 1.C.47 (as it extends Theorem 1.C.45) to the so called generalized order statistics. A further extension is described in Franco, Ruiz, and Ruiz [205].

The following example may be compared to Examples 1.B.24, 2.A.22, 3.B.38, 4.B.14, 6.B.41, 6.D.8, 6.E.13, and 7.B.13.

Example 1.C.48. Let X and Y be two absolutely continuous nonnegative random variables with survival functions $\overline{F}$ and $\overline{G}$ and density functions f and g, respectively. Denote $\Lambda_1 = -\log \overline{F}$, $\Lambda_2 = -\log \overline{G}$, and $\lambda_i = \Lambda_i'$, i = 1, 2. Consider two nonhomogeneous Poisson processes N1 = {N1(t), t ≥ 0} and N2 = {N2(t), t ≥ 0} with mean functions Λ1 and Λ2 (see Example 1.B.13), respectively. Let Ti,1, Ti,2, ... be the successive epoch times of process Ni, i = 1, 2. Note that X =st T1,1 and Y =st T2,1.

It turns out that, under some conditions, the likelihood ratio ordering of the first two epoch times implies the likelihood ratio ordering of all the corresponding later epoch times. Explicitly, it will be shown below that if X ≤lr Y, and if

$$\frac{\Lambda_2(t)}{\Lambda_1(t)} \quad \text{is increasing in } t \ge 0, \tag{1.C.15}$$

then T1,n ≤lr T2,n, n ≥ 1. From (1.B.24) it is easy to see that the density function f1,n of T1,n is given by

$$f_{1,n}(t) = f(t) \frac{(\Lambda_1(t))^{n-1}}{(n-1)!}, \quad t \ge 0,\ n \ge 1,$$

and that the density function f2,n of T2,n is given by

$$f_{2,n}(t) = g(t) \frac{(\Lambda_2(t))^{n-1}}{(n-1)!}, \quad t \ge 0,\ n \ge 1.$$

Thus,

$$\frac{f_{2,n}(t)}{f_{1,n}(t)} = \frac{g(t)}{f(t)} \left(\frac{\Lambda_2(t)}{\Lambda_1(t)}\right)^{n-1}.$$

Now, if X ≤lr Y and (1.C.15) holds, then f2,n/f1,n is increasing and we obtain T1,n ≤lr T2,n.

Now let Xi,n ≡ Ti,n − Ti,n−1, n ≥ 1 (where Ti,0 ≡ 0), be the inter-epoch times of the process Ni, i = 1, 2. Again, note that X =st X1,1 and Y =st X2,1. It turns out that, under some conditions, the likelihood ratio ordering of the first two inter-epoch times implies the likelihood ratio ordering of all the corresponding later inter-epoch times. Explicitly, it will be shown below that if X ≤hr Y, if f and g are logconvex, and if (1.B.25) holds, then X1,n ≤lr X2,n for each n ≥ 1. First note that by Theorem 1.C.4 we have X ≤lr Y. For the purpose of the following proof we denote f by f1 and g by f2. Let gi,n denote the density function of Xi,n, i = 1, 2. The stated result is obvious for n = 1, so let us fix an n ≥ 2. From (1.B.26) we obtain

$$g_{i,n}(t) = \int_0^{\infty} \lambda_i(s) \frac{\Lambda_i^{n-2}(s)}{(n-2)!} f_i(s + t)\, ds, \quad t \ge 0,\ i = 1, 2.$$

As in Example 1.B.24, we have that

$$\lambda_i(t) \frac{\Lambda_i^{n-2}(t)}{(n-2)!} \quad \text{is TP}_2 \text{ in } (i, t).$$

The ordering $X \le_{\mathrm{lr}} Y$ implies that $f_i(s+t)$ is $\mathrm{TP}_2$ in $(i, s)$ and in $(i, t)$. Finally, the logconvexity of $f_1$ and of $f_2$ means that $f_i(s+t)$ is $\mathrm{TP}_2$ in $(s, t)$. Thus, by Theorem 5.1 on page 123 of Karlin [275], we get that $g_{i,n}(t)$ is $\mathrm{TP}_2$ in $(i, t)$; that is, $X_{1,n} \le_{\mathrm{lr}} X_{2,n}$.

The following neat example compares a sum of independent heterogeneous exponential random variables with an Erlang random variable; it is of interest to compare it with Examples 1.A.24 and 1.B.5. We do not give the proof here.

Example 1.C.49. Let $X_i$ be an exponential random variable with mean $\lambda_i^{-1} > 0$, $i = 1, 2, \ldots, m$, and assume that the $X_i$'s are independent. Let $Y_i$, $i = 1, 2, \ldots, m$, be independent, identically distributed, exponential random variables with mean $\eta^{-1}$. Then

$$\sum_{i=1}^m X_i \ge_{\mathrm{lr}} \sum_{i=1}^m Y_i \iff \frac{\lambda_1 + \lambda_2 + \cdots + \lambda_m}{m} \le \eta.$$
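The equivalence in Example 1.C.49 can be illustrated numerically. The following is a minimal sketch and not part of the original text: it assumes that the rates $\lambda_i$ are distinct (so that the density of the sum has the usual closed form for a sum of exponentials with distinct rates), it only inspects a finite grid, and the helper names `hypoexp_pdf`, `erlang_pdf`, and `ratio_is_increasing` are hypothetical. It checks whether the ratio of the density of $\sum X_i$ to the density of $\sum Y_i$ is increasing, which is what $\sum X_i \ge_{\mathrm{lr}} \sum Y_i$ requires.

```python
import math

def hypoexp_pdf(t, rates):
    """Density at t > 0 of a sum of independent exponentials with distinct rates."""
    total = 0.0
    for i, rate_i in enumerate(rates):
        weight = 1.0
        for j, rate_j in enumerate(rates):
            if j != i:
                weight *= rate_j / (rate_j - rate_i)
        total += weight * rate_i * math.exp(-rate_i * t)
    return total

def erlang_pdf(t, m, eta):
    """Density at t > 0 of a sum of m i.i.d. exponentials with rate eta."""
    return eta**m * t**(m - 1) * math.exp(-eta * t) / math.factorial(m - 1)

def ratio_is_increasing(rates, eta, grid, tol=1e-12):
    """Check on a finite grid that hypoexp_pdf / erlang_pdf is nondecreasing."""
    ratios = [hypoexp_pdf(t, rates) / erlang_pdf(t, len(rates), eta) for t in grid]
    return all(b >= a - tol for a, b in zip(ratios, ratios[1:]))

grid = [0.01 * k for k in range(1, 1001)]     # t in (0, 10]
rates = [1.0, 3.0]                            # arithmetic mean of the rates is 2
print(ratio_is_increasing(rates, 2.5, grid))  # mean <= eta, prints True
print(ratio_is_increasing(rates, 1.9, grid))  # mean > eta, prints False
```

For these two rates the ratio is proportional to $\sinh(t)\,e^{(\eta-2)t}/t$, so the grid check agrees with a direct calculation. A grid check of this kind can refute monotonicity but cannot prove it.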

A related example is the following. Recall from page 2 the definition of the majorization order $\prec$ among $n$-dimensional vectors. It is of interest to compare the example below with Example 3.B.34.


Example 1.C.50. Let $X_i$ be an exponential random variable with mean $\lambda_i^{-1} > 0$, $i = 1, 2, \ldots, m$, and let $Y_i$ be an exponential random variable with mean $\eta_i^{-1} > 0$, $i = 1, 2, \ldots, m$. If $(\lambda_1, \lambda_2, \ldots, \lambda_m) \succ (\eta_1, \eta_2, \ldots, \eta_m)$, then

$$\sum_{i=1}^m X_i \ge_{\mathrm{lr}} \sum_{i=1}^m Y_i.$$

The next example may be compared with Examples 1.A.25, 1.B.6, and 4.A.45.

Example 1.C.51. Let $X_i$ be a binomial random variable with parameters $n_i$ and $p_i$, $i = 1, 2, \ldots, m$, and assume that the $X_i$'s are independent. Let $Y$ be a binomial random variable with parameters $n$ and $p$, where $n = \sum_{i=1}^m n_i$. Then

$$\sum_{i=1}^m X_i \ge_{\mathrm{lr}} Y \iff p \le \frac{n}{\sum_{i=1}^m (n_i/p_i)},$$

and

$$\sum_{i=1}^m X_i \le_{\mathrm{lr}} Y \iff 1 - p \le \frac{n}{\sum_{i=1}^m (n_i/(1 - p_i))}.$$
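The two thresholds in Example 1.C.51 can be checked on a small case. The sketch below is illustrative only: the parameter values and the helper names `binom_pmf`, `convolve`, and `ratio_trend` are assumptions made for this example. It builds the mass function of $\sum X_i$ by convolution and classifies the ratio $P\{\sum X_i = k\}/P\{Y = k\}$ as increasing in $k$ (that is, $\sum X_i \ge_{\mathrm{lr}} Y$), decreasing (that is, $\sum X_i \le_{\mathrm{lr}} Y$), or neither.

```python
from math import comb

def binom_pmf(n, p):
    """Probability mass function of Binomial(n, p) as a list indexed by k."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """Mass function of the sum of two independent nonnegative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out

def ratio_trend(params, p):
    """Classify k -> P(sum X_i = k) / P(Y = k) as increasing, decreasing, or neither."""
    pmf_sum = [1.0]
    for n_i, p_i in params:
        pmf_sum = convolve(pmf_sum, binom_pmf(n_i, p_i))
    n = sum(n_i for n_i, _ in params)
    ratios = [s / y for s, y in zip(pmf_sum, binom_pmf(n, p))]
    pairs = list(zip(ratios, ratios[1:]))
    if all(b >= a for a, b in pairs):
        return "increasing"
    if all(b <= a for a, b in pairs):
        return "decreasing"
    return "neither"

params = [(2, 0.2), (3, 0.5)]            # (n_i, p_i), so n = 5
upper = 5 / (2 / 0.2 + 3 / 0.5)          # n / sum(n_i / p_i) = 0.3125
lower = 1 - 5 / (2 / 0.8 + 3 / 0.5)      # p must be at least this for <=_lr
for p in (0.30, 0.35, 0.45):
    print(p, ratio_trend(params, p))
```

With these values, $p = 0.30$ lies below the first threshold, $p = 0.45$ lies above the second, and $p = 0.35$ lies between them, so neither ordering holds there.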

The order $\le_{\mathrm{lr}}$ can be used to characterize random variables with logconcave densities. The next result lists several such characterizations. It shows that logconcavity can be interpreted as an aging notion in reliability theory by a correct use of the likelihood ratio ordering. This theorem may be compared to Theorem 1.B.38.

Theorem 1.C.52. The random variable $X$ has a logconcave density (that is, a Polya frequency of order 2 ($\mathrm{PF}_2$)) if, and only if, one of the following equivalent conditions holds:
(i) $[X - t \mid X > t] \ge_{\mathrm{lr}} [X - t' \mid X > t']$ whenever $t \le t'$.
(ii) $X \ge_{\mathrm{lr}} [X - t \mid X > t]$ for all $t \ge 0$ (when $X$ is a nonnegative random variable).
(iii) $X + t \le_{\mathrm{lr}} X + t'$ whenever $t \le t'$.

Random variables that satisfy (i) in Theorem 1.C.52 (and hence any of the conditions of that theorem) are said to have the ILR (increasing likelihood ratio) property; see Section 13.D.2 by Righter in [515]. A multivariate extension of parts (i) and (ii) of Theorem 1.C.52 is given in Section 6.E.3.

Another connection between logconcavity and the likelihood ratio order is illustrated in the next result. It is worthwhile to compare the following result with Theorem 6.B.9 in Section 6.B.3.

Theorem 1.C.53. Let $X_1, X_2, \ldots, X_m$ be independent random variables having logconcave density functions. Then

$$\Big[X_i \,\Big|\, \sum_{j=1}^m X_j = s\Big] \le_{\mathrm{lr}} \Big[X_i \,\Big|\, \sum_{j=1}^m X_j = s'\Big] \quad \text{whenever } s \le s',\ i = 1, 2, \ldots, m.$$

Proof. Since the convolution of logconcave density functions is logconcave, it is sufficient to prove the result for $m = 2$ and $i = 1$. Let $f_1$ and $f_2$ denote the density functions of $X_1$ and $X_2$, respectively. The conditional density of $X_1$, given $X_1 + X_2 = s$, is

$$f_{X_1 \mid X_1 + X_2 = s}(x_1) = \frac{f_1(x_1) f_2(s - x_1)}{\int f_1(u) f_2(s - u)\,du}.$$

Thus,

$$\frac{f_{X_1 \mid X_1 + X_2 = s'}(x_1)}{f_{X_1 \mid X_1 + X_2 = s}(x_1)} = \frac{f_2(s' - x_1) \int f_1(u) f_2(s - u)\,du}{f_2(s - x_1) \int f_1(u) f_2(s' - u)\,du}. \tag{1.C.16}$$

The logconcavity of $f_2$ implies that the expression in (1.C.16) increases in $x_1$ whenever $s \le s'$. By (1.C.1) the proof is complete.
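The role of logconcavity in this proof can be seen on a small numerical example. The sketch below is an illustration under stated assumptions, not part of the original argument: the only factor of (1.C.16) that depends on $x_1$ is $f_2(s' - x_1)/f_2(s - x_1)$, and the code evaluates it on a grid for one logconcave choice of $f_2$ (proportional to a gamma density with shape 2) and one logconvex choice (proportional to a Pareto-type density). The function names and the particular densities are hypothetical choices; normalizing constants are omitted because they cancel in the ratio.

```python
import math

def shift_ratio(f2, s, s_prime, x1):
    """The x1-dependent factor of (1.C.16): f2(s' - x1) / f2(s - x1)."""
    return f2(s_prime - x1) / f2(s - x1)

def is_increasing(values, tol=1e-12):
    """True if the sequence is nondecreasing up to a small tolerance."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))

def logconcave_f2(y):
    """Proportional to a gamma density with shape 2, which is logconcave."""
    return y * math.exp(-y)

def logconvex_f2(y):
    """Proportional to a Pareto-type density, which is logconvex."""
    return (1.0 + y) ** -2

s, s_prime = 2.0, 3.0
xs = [0.01 * k for k in range(1, 200)]   # x1 in (0, s)

print(is_increasing([shift_ratio(logconcave_f2, s, s_prime, x) for x in xs]))  # True
print(is_increasing([shift_ratio(logconvex_f2, s, s_prime, x) for x in xs]))   # False
```

In the logconcave case the factor equals $e^{-1}(3 - x_1)/(2 - x_1)$, which increases in $x_1$; in the logconvex case it equals $((3 - x_1)/(4 - x_1))^2$, which decreases.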

Theorems 1.C.52 and 1.C.53 have straightforward discrete analogs, which we do not state here. A few other properties of the order $\le_{\mathrm{lr}}$ can be found in Lemma 13.D.1 in Chapter 13 by Righter, and in (14.B.7) in Chapter 14 by Shanthikumar and Yao, in [515].

An interesting closure property of logconcave density functions is described in the following result.

Theorem 1.C.54. Let $X_1, X_2, \ldots, X_m$ be independent, identically distributed random variables with a common logconcave density function. Then the $i$th order statistic $X_{(i:m)}$ also has a logconcave density function, $1 \le i \le m$.

Proof. Let $f$, $F$, and $\overline{F}$ denote, respectively, the density, distribution, and survival function of $X_1$. Then the density function of $X_{(i:m)}$ is given by

$$f_{(i:m)}(x) = m \binom{m-1}{i-1} F^{i-1}(x)\, f(x)\, \overline{F}^{\,m-i}(x).$$

Since the logconcavity of $f$ implies the logconcavity of $F$ and of $\overline{F}$, it follows that $f_{(i:m)}$ is logconcave.
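As a quick illustration of Theorem 1.C.54, the following sketch evaluates the logarithm of the order-statistic density above for a standard exponential parent (whose density is log-linear, hence logconcave) and checks that its second differences on an equally spaced grid are nonpositive. The choice of parent distribution, the grid, the tolerance, and the helper names are assumptions made for this example.

```python
import math

def order_stat_log_pdf(x, i, m, pdf, cdf, sf):
    """Logarithm of m * C(m-1, i-1) * F(x)^(i-1) * f(x) * (1 - F(x))^(m-i)."""
    return (math.log(m * math.comb(m - 1, i - 1))
            + (i - 1) * math.log(cdf(x))
            + math.log(pdf(x))
            + (m - i) * math.log(sf(x)))

def is_logconcave_on_grid(log_pdf, grid, tol=1e-9):
    """Second differences of log_pdf on an equally spaced grid are <= tol."""
    values = [log_pdf(x) for x in grid]
    return all(values[k - 1] - 2.0 * values[k] + values[k + 1] <= tol
               for k in range(1, len(values) - 1))

def exp_pdf(x):
    return math.exp(-x)

def exp_cdf(x):
    return -math.expm1(-x)   # 1 - exp(-x), computed accurately near 0

def exp_sf(x):
    return math.exp(-x)

m = 5
grid = [0.05 * k for k in range(1, 101)]   # x in (0, 5]
results = [
    is_logconcave_on_grid(
        lambda x, i=i: order_stat_log_pdf(x, i, m, exp_pdf, exp_cdf, exp_sf), grid)
    for i in range(1, m + 1)
]
print(results)
```

For $i = 1$ the log-density is linear in $x$, so the tolerance only absorbs rounding error; for $i \ge 2$ the term $(i-1)\log(1 - e^{-x})$ makes it strictly concave.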

Misra and van der Meulen [396] showed the preservation of logconcavity and logconvexity from the parent density to the density of the corresponding spacings.

The likelihood ratio order can be used to characterize some aging notions in reliability theory. Recall from (1.A.20) that for a nonnegative random variable $X$ with a finite mean we denote by $A_X$ the corresponding asymptotic equilibrium age. Recall from page 1 the definitions of the IFR and the DFR properties. The following result is immediate. It is of interest to contrast it with Theorems 1.A.31 and 1.B.40.


Theorem 1.C.55. The nonnegative random variable $X$ with finite mean is IFR [DFR] if, and only if, $X \ge_{\mathrm{lr}} [\le_{\mathrm{lr}}]\ A_X$.

An interesting comparison of asymptotic equilibrium ages is described in the next example. Recall from page 1 the definition of the DMRL property.

Example 1.C.56. Let $X$ and $Y$ be two independent nonnegative DMRL random variables with survival functions $\overline{F}$ and $\overline{G}$, density functions $f$ and $g$, and asymptotic equilibrium ages $A_X$ and $A_Y$, respectively. Let $A_{\min\{X,Y\}}$ denote the asymptotic equilibrium age of $\min\{X, Y\}$. Then

$$\min\{A_X, A_Y\} \le_{\mathrm{lr}} A_{\min\{X,Y\}}.$$

In order to see this, assume, for simplicity, that the supports of $X$ and of $Y$ are $(0, \infty)$. Note that the density function of $\min\{A_X, A_Y\}$ is given by

$$f_{\min\{A_X, A_Y\}}(t) = (EX\,EY)^{-1} \left[\overline{F}(t) \int_t^\infty \overline{G}(x)\,dx + \overline{G}(t) \int_t^\infty \overline{F}(x)\,dx\right], \quad t \ge 0,$$

and the density function of $A_{\min\{X,Y\}}$ is given by

$$f_{A_{\min\{X,Y\}}}(t) = \big(E[\min\{X, Y\}]\big)^{-1}\, \overline{F}(t)\,\overline{G}(t), \quad t \ge 0.$$

Therefore

$$\frac{f_{A_{\min\{X,Y\}}}(t)}{f_{\min\{A_X, A_Y\}}(t)} = \frac{EX\,EY\,[m(t) + l(t)]^{-1}}{E[\min\{X, Y\}]}, \quad t \ge 0,$$

where $m$ and $l$ are the mean residual life functions of $X$ and of $Y$, given by $m(t) = E[X - t \mid X > t]$ and $l(t) = E[Y - t \mid Y > t]$, $t \ge 0$. The functions $m$ and $l$ are decreasing by the DMRL assumptions, and therefore $\min\{A_X, A_Y\} \le_{\mathrm{lr}} A_{\min\{X,Y\}}$ by (1.C.1).

In the following example it is shown that if $X$ is increasing in $\Theta$ in the likelihood ratio sense, then the posterior distribution of $\Theta$ is increasing in $X$ in the same sense.

Example 1.C.57. Let $X$ be a random variable whose distribution function depends on the real parameter $\Theta$. Denote the prior density function of $\Theta$ by $\pi$, and denote the posterior density function of $\Theta$, given $X = x$, by $\pi^*(\cdot \mid x)$. Also, denote the conditional density of $X$, given $\Theta = \theta$, by $f(\cdot \mid \theta)$, and denote the marginal density of $X$ by $g$. If $X$ is increasing in $\Theta$ in the likelihood ratio sense (that is, if $[X \mid \Theta = \theta] \le_{\mathrm{lr}} [X \mid \Theta = \theta']$ whenever $\theta \le \theta'$), then $\Theta$ is increasing in $X$ in the likelihood ratio sense (that is, $[\Theta \mid X = x] \le_{\mathrm{lr}} [\Theta \mid X = x']$ whenever $x \le x'$). The proof of this statement is easy by noting that

$$\pi^*(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{g(x)}.$$
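The statement of Example 1.C.57 can be illustrated with a discretized prior. The sketch below is only an illustration under assumptions that are not in the original text: the likelihood is taken to be normal with mean $\theta$ and unit variance (a family that is increasing in $\theta$ in the likelihood ratio sense), the prior is an arbitrary probability mass function on a finite grid of $\theta$ values, and the function names are hypothetical. It applies the Bayes formula above and checks that the ratio of the posterior given $x'$ to the posterior given $x$ is increasing in $\theta$ when $x \le x'$.

```python
import math

def normal_pdf(x, mean):
    """Density of a normal distribution with the given mean and unit variance."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def posterior(x, thetas, prior):
    """Posterior mass function of Theta given X = x, by the Bayes formula."""
    joint = [normal_pdf(x, theta) * weight for theta, weight in zip(thetas, prior)]
    marginal = sum(joint)                      # plays the role of g(x)
    return [value / marginal for value in joint]

def posterior_ratio_is_increasing(x_low, x_high, thetas, prior, tol=1e-12):
    """Check that posterior(x_high) / posterior(x_low) is nondecreasing in theta."""
    low = posterior(x_low, thetas, prior)
    high = posterior(x_high, thetas, prior)
    ratios = [h / l for h, l in zip(high, low)]
    return all(b >= a - tol for a, b in zip(ratios, ratios[1:]))

thetas = [-2.0 + 0.5 * k for k in range(9)]    # grid of parameter values
raw = [1, 3, 2, 5, 4, 1, 2, 6, 3]              # arbitrary unnormalized prior weights
prior = [w / sum(raw) for w in raw]

print(posterior_ratio_is_increasing(0.0, 1.5, thetas, prior))   # True
print(posterior_ratio_is_increasing(1.5, 0.0, thetas, prior))   # False
```

The prior cancels in the ratio of the two posteriors, which is why the conclusion does not depend on the particular weights chosen here.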


An extension of Example 1.C.57 to the multivariate likelihood ratio order is given in Example 6.E.16.

Example 1.C.58. Let $X$ be a random variable whose distribution function depends on the random parameter $\Theta_1$ or, in other circumstances, on the random parameter $\Theta_2$. Denote the prior density functions of $\Theta_1$ and $\Theta_2$ by $\pi_1$ and $\pi_2$, respectively, and denote the posterior density functions of $\Theta_1$ and $\Theta_2$, given $X = x$, by $\pi_1^*(\cdot \mid x)$ and $\pi_2^*(\cdot \mid x)$, respectively. Also, denote the conditional density of $X$, given $\Theta_1 = \theta$ or $\Theta_2 = \theta$, by $f(\cdot \mid \theta)$, and denote the marginal density of $X$ by $g_1$ or by $g_2$, according to whether $X$ depends on $\Theta_1$ or on $\Theta_2$. Then, for any $x$, we have that

$$\Theta_1 \le_{\mathrm{lr}} \Theta_2 \implies [\Theta_1 \mid X = x] \le_{\mathrm{lr}} [\Theta_2 \mid X = x].$$

The proof of this statement is easy by noting that

$$\pi_i^*(\theta \mid x) = \frac{f(x \mid \theta)\,\pi_i(\theta)}{g_i(x)}, \quad i = 1, 2.$$

Example 1.C.59. Recall from Example 1.B.23 that for a nonnegative random variable $X$ with density function $f$, and for a nonnegative function $w$ such that $E[w(X)]$ exists, we denote by $X^w$ the random variable with the weighted density function $f_w$ given by

$$f_w(x) = \frac{w(x) f(x)}{E[w(X)]}, \quad x \ge 0. \tag{1.C.17}$$

Similarly, for another nonnegative random variable $Y$ with density function $g$, such that $E[w(Y)]$ exists, we denote by $Y^w$ the random variable with the density function $g_w$ given by

$$g_w(x) = \frac{w(x) g(x)}{E[w(Y)]}, \quad x \ge 0. \tag{1.C.18}$$

It is then obvious that $X \le_{\mathrm{lr}} Y \implies X^w \le_{\mathrm{lr}} Y^w$.

Example 1.C.60. Let $X$ be a nonnegative random variable with density function $f$, and for a nonnegative function $w$ such that $E[w(X)]$ exists, let $X^w$ be the random variable with the weighted density function $f_w$ given in (1.C.17). It is then obvious that if $w$ is increasing [decreasing], then $X \le_{\mathrm{lr}} [\ge_{\mathrm{lr}}]\ X^w$. In particular, the inequality $X \le_{\mathrm{lr}} X^w$ holds when $X^w$ is the length-biased version of $X$; that is, when $w(x) = x$, $x \ge 0$.

Example 1.C.61. Let the random variable $X$ have a generalized skew normal distribution with parameters $n$ and $\lambda$, that is, suppose that its density function is given by

$$f(x; n, \lambda) = \frac{\Phi^n(\lambda x)\,\phi(x)}{C(n, \lambda)}, \quad x \in \mathbb{R},$$

where $\phi$ and $\Phi$ are, respectively, the density and distribution functions of a standard normal random variable, and $C(n, \lambda)$ is given by

$$C(n, \lambda) = \int_{-\infty}^{\infty} \Phi^n(\lambda x)\,\phi(x)\,dx.$$

Let Y have a generalized skew normal distribution with parameters n1 and λ. It is easy to see that if λ > [ x ≤lr Y for all x ∈ (lX , uX ).


Another shifted likelihood ratio stochastic order is defined next. Let $X$ and $Y$ be two absolutely continuous random variables with support $[0, \infty)$. Suppose that

$$X \le_{\mathrm{lr}} [Y - x \mid Y > x] \quad \text{for all } x \ge 0.$$

Then $X$ is said to be smaller than $Y$ in the down shifted likelihood ratio order (denoted as $X \le_{\mathrm{lr}\downarrow} Y$).

Note that in the above definition only nonnegative random variables are compared. This is because for the down shifted likelihood ratio order it is not possible to take an analog of (1.C.19), such as $X \le_{\mathrm{lr}} Y - x$, as a definition. The reason is that here, by taking $x$ very large, it is seen that practically no random variables would satisfy such an order relation. Note that in the definition above, the right-hand side $[Y - x \mid Y > x]$ can take on (as $x$ varies) any value in the right neighborhood of 0. Therefore the support of the compared random variables is restricted here to be $[0, \infty)$.

Let $f$ and $g$ denote the density functions of $X$ and $Y$, respectively. An analog of (1.C.20) is the following:

$$X \le_{\mathrm{lr}\downarrow} Y \iff \frac{g(t + x)}{f(t)} \text{ is increasing in } t \ge 0 \text{ for all } x \ge 0. \tag{1.C.21}$$
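Condition (1.C.21) lends itself to a direct numerical check. The following sketch is illustrative and relies on assumptions that are not in the text above: it inspects only finitely many shifts $x$ and grid points $t$, the helper names `pareto_pdf` and `down_shifted_lr_on_grid` are hypothetical, and the Pareto densities $\theta/(1+x)^{\theta+1}$ (the family treated in Example 1.C.76) are used as test input.

```python
def pareto_pdf(theta):
    """Return the density x -> theta / (1 + x)^(theta + 1) on [0, infinity)."""
    def pdf(x):
        return theta / (1.0 + x) ** (theta + 1.0)
    return pdf

def down_shifted_lr_on_grid(f, g, shifts, grid, tol=1e-12):
    """Check (1.C.21) on a grid: g(t + x) / f(t) nondecreasing in t for each shift x."""
    for x in shifts:
        ratios = [g(t + x) / f(t) for t in grid]
        if any(b < a - tol for a, b in zip(ratios, ratios[1:])):
            return False
    return True

grid = [0.05 * k for k in range(0, 401)]   # t in [0, 20]
shifts = [0.0, 0.5, 1.0, 5.0]              # a few values of x >= 0

# X with theta = 3 against Y with theta = 1: the condition holds on the grid.
print(down_shifted_lr_on_grid(pareto_pdf(3.0), pareto_pdf(1.0), shifts, grid))  # True
# Roles reversed: the condition already fails for the shift x = 0.
print(down_shifted_lr_on_grid(pareto_pdf(1.0), pareto_pdf(3.0), shifts, grid))  # False
```

In the first call the ratio is $(1+t)^4/(3(1+t+x)^2)$, whose logarithmic derivative $4/(1+t) - 2/(1+t+x)$ is positive, so the grid result matches the exact calculation.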

(A discrete version of the down shifted likelihood ratio order is defined and used in Section 6.B.3.) It is readily apparent that for nonnegative random variables with support $[0, \infty)$ we have

$$X \le_{\mathrm{lr}\downarrow} Y \implies X \le_{\mathrm{lr}} Y.$$

We describe now some further properties of the down shifted likelihood ratio order.

Theorem 1.C.70. Let $\{X_j, j = 1, 2, \ldots\}$ and $\{Y_j, j = 1, 2, \ldots\}$ be two sequences of random variables, with support $[0, \infty)$, such that $X_j \to_{\mathrm{st}} X$ and $Y_j \to_{\mathrm{st}} Y$ as $j \to \infty$. If $X_j \le_{\mathrm{lr}\downarrow} Y_j$, $j = 1, 2, \ldots$, then $X \le_{\mathrm{lr}\downarrow} Y$.

The following result is an analog of Theorem 1.C.63; however, it does not follow at once from Theorem 1.C.15. Its proof can be found in Lillo, Nanda, and Shaked [361].

Theorem 1.C.71. Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta]$ and $[Y \mid \Theta = \theta]$ are absolutely continuous and have the support $[0, \infty)$ for all $\theta$ in the support of $\Theta$. If $[X \mid \Theta = \theta] \le_{\mathrm{lr}\downarrow} [Y \mid \Theta = \theta']$ for all $\theta$ and $\theta'$ in the support of $\Theta$, then $X \le_{\mathrm{lr}\downarrow} Y$.

More properties are listed next.

Theorem 1.C.72. Let $X$ and $Y$ be two absolutely continuous random variables with support $[0, \infty)$. If $X$ or $Y$ or both have logconvex densities on $[0, \infty)$, and if $X \le_{\mathrm{lr}} Y$, then $X \le_{\mathrm{lr}\downarrow} Y$.


Theorem 1.C.73. Let $X$ and $Y$ be two absolutely continuous random variables with differentiable densities on their support $[0, \infty)$. Then $X \le_{\mathrm{lr}\downarrow} Y$ if, and only if, there exists a random variable $Z$ with a logconvex density on $[0, \infty)$ such that $X \le_{\mathrm{lr}} Z \le_{\mathrm{lr}} Y$.

Theorem 1.C.74. Let $X$ be an absolutely continuous random variable with support $[0, \infty)$ and density function $f$. Then $X \le_{\mathrm{lr}\downarrow} X$ if, and only if, $f$ is logconvex on $[0, \infty)$.

Theorem 1.C.75. Let $X$ and $Y$ be two absolutely continuous random variables with support $[0, \infty)$. If $X \le_{\mathrm{lr}\downarrow} Y$ and if $Y$ has a decreasing density function on $[0, \infty)$, then $\phi(X) \le_{\mathrm{lr}\downarrow} \phi(Y)$ for any strictly increasing twice differentiable convex function $\phi : [0, \infty) \to [0, \infty)$ (with first and second derivatives $\phi'$ and $\phi''$) such that $\phi''(x)/(\phi'(x))^2$ is decreasing.

Example 1.C.76. An interesting family of distribution functions, with associated random variables that are ordered in the down shifted likelihood ratio order, is the Pareto family. Explicitly, for $\theta \in (0, \infty)$, let $X_\theta$ be a random variable with density function $f_\theta$ defined by

$$f_\theta(x) = \frac{\theta}{(1 + x)^{\theta + 1}}, \quad x \ge 0.$$

Then, by verifying (1.C.21), it is easy to see that $X_{\theta_1} \le_{\mathrm{lr}\downarrow} X_{\theta_2}$ whenever $\theta_1 \ge \theta_2 > 0$.

Some results that compare order statistics in the shifted likelihood ratio orders are described next. Again, $X_{(j:m)}$ denotes the $j$th order statistic associated with the random variables $X_1, X_2, \ldots, X_m$, and $Y_{(i:n)}$ denotes the $i$th order statistic associated with the random variables $Y_1, Y_2, \ldots, Y_n$.

An analog of Theorem 1.C.33 for the order $\le_{\mathrm{lr}\uparrow}$ is the following result. Note that in the following theorem the assumption is stronger than the assumption in Theorem 1.C.33, but so is the conclusion.

Theorem 1.C.77. Let $X_1, X_2, \ldots, X_m$ be $m$ independent random variables, and let $Y_1, Y_2, \ldots, Y_n$ be other $n$ independent random variables, all having absolutely continuous distributions. If $X_j \le_{\mathrm{lr}\uparrow} Y_i$ for all $1 \le j \le m$ and $1 \le i \le n$, then

$$X_{(j:m)} \le_{\mathrm{lr}\uparrow} Y_{(i:n)} \quad \text{whenever } j \le i \text{ and } m - j \ge n - i.$$

Proof. Fix an x ≥ 0 and denote by (X − x)(j:m) the jth order statistic among the random variables X1 − x, X2 − x, . . . , Xm − x. By assumption we have Xj − x ≤lr↑ Yi for all 1 ≤ j ≤ m and 1 ≤ i ≤ n. Therefore from Theorem 1.C.33 we get (X − x)(j:m) ≤lr Y(i:n) whenever j ≤ i and m − j ≥ n − i. The stated result follows from the fact that (X − x)(j:m) = X(j:m) − x.

For the down shifted likelihood ratio order, the method of proof used in the proof of Theorem 1.C.33 only yields comparisons of minima as described in the following result.


Theorem 1.C.78. Let $X_1, X_2, \ldots, X_m$ be $m$ independent random variables, and let $Y_1, Y_2, \ldots, Y_n$ be other $n$ independent random variables, all having absolutely continuous distributions with support $[0, \infty)$. If $X_j \le_{\mathrm{lr}\downarrow} Y_i$ for all $1 \le j \le m$ and $1 \le i \le n$, then

$$X_{(1:m)} \le_{\mathrm{lr}\downarrow} Y_{(1:n)} \quad \text{whenever } m \ge n.$$

Now let $X_1, X_2, \ldots$ be independent and identically distributed random variables. Taking $Y_i =_{\mathrm{st}} X_j$ for all $i$ and $j$ in Theorems 1.C.77 and 1.C.78, and using Theorems 1.C.66 and 1.C.74, we obtain the following analogs of Theorem 1.C.37. Note that in the next theorem (unlike in Theorem 1.C.37) we assume logconcavity or logconvexity of the underlying density function, but the conclusion in part (a) of the next theorem is stronger than the conclusion in Theorem 1.C.37.

Theorem 1.C.79. (a) Let $X_1, X_2, \ldots$ be independent and identically distributed absolutely continuous random variables with an interval support. If the common density function is logconcave, then

$$X_{(j:m)} \le_{\mathrm{lr}\uparrow} X_{(i:n)} \quad \text{whenever } j \le i \text{ and } m - j \ge n - i.$$

(b) Let $X_1, X_2, \ldots$ be independent and identically distributed absolutely continuous random variables with support $[0, \infty)$. If the common density function is logconvex on $[0, \infty)$, then

$$X_{(1:m)} \le_{\mathrm{lr}\downarrow} X_{(1:n)} \quad \text{whenever } m \ge n.$$

1.D The Convolution Order

Let $X$ and $Y$ be two random variables such that

$$Y =_{\mathrm{st}} X + U, \tag{1.D.1}$$

where $U$ is a nonnegative random variable, independent of $X$. Then $X$ is said to be smaller than $Y$ in the convolution order (denoted as $X \le_{\mathrm{conv}} Y$). Obviously, the convolution order is a partial order. It is equivalent to the information order which is defined for statistical experiments when the underlying parameter is a location parameter.

The convolution order is obviously closed under increasing linear transformations. That is, for any $a \in \mathbb{R}$ and $b \ge 0$ we have

$$X \le_{\mathrm{conv}} Y \implies a + bX \le_{\mathrm{conv}} a + bY.$$

The convolution order is obviously also closed under convolutions. That is, let $X_1, X_2, \ldots, X_n$ be a set of independent random variables, and let $Y_1, Y_2, \ldots, Y_n$ be another set of independent random variables. Then

$$X_j \le_{\mathrm{conv}} Y_j,\ j = 1, 2, \ldots, n \implies \sum_{i=1}^n X_i \le_{\mathrm{conv}} \sum_{i=1}^n Y_i.$$

It is obvious from Theorem 1.A.2 and (1.D.1) that

$$X \le_{\mathrm{conv}} Y \implies X \le_{\mathrm{st}} Y.$$

For any nonnegative random variable $X$ we denote by $L_X$ its classical Laplace transform, that is,

$$L_X(s) = E[e^{-sX}], \quad s \ge 0.$$

Recall that a nonnegative function $\phi$ is a Laplace transform of a nonnegative measure on $(0, \infty)$ if, and only if, $\phi$ is completely monotone, that is, all the derivatives $\phi^{(n)}$ of $\phi$ exist, and they satisfy $(-1)^n \phi^{(n)}(x) \ge 0$ for all $x \ge 0$ and $n = 1, 2, \ldots$. It follows that for nonnegative random variables we have

$$X \le_{\mathrm{conv}} Y \iff \frac{L_Y(s)}{L_X(s)} \text{ is a completely monotone function in } s \ge 0. \tag{1.D.2}$$

Example 1.D.1. Let $X_i$ be an exponential random variable with mean $1/\lambda_i$, $i = 1, 2$. If $\lambda_1 > \lambda_2$, then $X_1 \le_{\mathrm{conv}} X_2$. To see this, note that the ratio of the Laplace transforms of $X_2$ and $X_1$ at $s$ is equal to $(\lambda_2/\lambda_1)((s + \lambda_1)/(s + \lambda_2))$, and it is easy to verify that this ratio is completely monotone. The result thus follows from (1.D.2).

Example 1.D.2. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed exponential random variables with mean $1/\lambda$ for some $\lambda > 0$. Denote the corresponding order statistics by $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$. Then

$$X_{(i)} \le_{\mathrm{conv}} X_{(j)} \quad \text{whenever } 1 \le i < j \le n.$$

To see this, note that $X_{(k+1)} =_{\mathrm{st}} X_{(k)} + Z_k$, where $Z_k$ is an exponential random variable with mean $((n - k)\lambda)^{-1}$ that is independent of $X_{(k)}$, $k = 1, 2, \ldots, n - 1$, and use the transitivity property of the order $\le_{\mathrm{conv}}$.
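The complete monotonicity claimed in Example 1.D.1 can be made concrete. A short calculation (an observation added here, not taken from the original text) shows that the ratio $(\lambda_2/\lambda_1)((s + \lambda_1)/(s + \lambda_2))$ equals $p + (1 - p)\,\lambda_2/(\lambda_2 + s)$ with $p = \lambda_2/\lambda_1$, which is the Laplace transform of a random variable $U$ that equals 0 with probability $p$ and is exponential with rate $\lambda_2$ otherwise. The sketch below checks this identity numerically; the function names are hypothetical.

```python
import math

def exp_laplace(rate, s):
    """Laplace transform E[exp(-s X)] of an exponential variable with the given rate."""
    return rate / (rate + s)

def transform_ratio(rate1, rate2, s):
    """L_{X2}(s) / L_{X1}(s) for exponential X1, X2 with rates rate1, rate2."""
    return exp_laplace(rate2, s) / exp_laplace(rate1, s)

def mixture_laplace(rate1, rate2, s):
    """Laplace transform of U: zero with prob. rate2/rate1, else exponential(rate2)."""
    p = rate2 / rate1
    return p + (1.0 - p) * exp_laplace(rate2, s)

rate1, rate2 = 3.0, 1.0                      # requires rate1 > rate2
for s in (0.0, 0.1, 1.0, 10.0, 100.0):
    lhs = transform_ratio(rate1, rate2, s)
    rhs = mixture_laplace(rate1, rate2, s)
    print(s, lhs, rhs, math.isclose(lhs, rhs, rel_tol=1e-12))
```

Since the mixture is a genuine nonnegative random variable when $\lambda_1 > \lambda_2$, this exhibits a $U$ for which $X_2 =_{\mathrm{st}} X_1 + U$ with $U$ independent of $X_1$, in line with (1.D.1).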

1.E Complements

Section 1.A: The usual stochastic order is used in many areas of application, but there is no single source where many of the basic results can all be found. Some standard references are the books of Lehmann [342], Marshall and Olkin [383], Ross [475], and Müller and Stoyan [419], where most of the results described in Section 1.A can be found. For example, Theorem 1.A.2 can be found in Marshall and Olkin [383]. The characterization of the usual stochastic order by the monotonicity described in


(1.A.8) is taken from Müller [407], whereas the characterization given in (1.A.12) can be found in Fellman [193]. The comparison of the random sums in Theorem 1.A.5 is motivated by ideas in Pellerey and Shaked [455]; it was communicated to us by Pellerey [444]. The application of the order $\le_{\mathrm{st}}$ in Bayesian imperfect repair (Example 1.A.7) is taken from Lim, Lu, and Park [364]. The result which gives conditions for stochastic equality (Theorem 1.A.8) can be found in Baccelli and Makowski [27] and in Scarsini and Shaked [494]. Lemma 2.1 of Costantini and Pasqualucci [135] with $n = 1$ is an interesting variation of Theorem 1.A.8. The bivariate characterizations in Theorems 1.A.9 and 1.A.10 are taken from Shanthikumar and Yao [532] and from Righter and Shanthikumar [466], respectively. The characterization of the order $\le_{\mathrm{st}}$ by means of the Fortet-Mourier-Wasserstein distance (Theorem 1.A.11) is taken from Adell and de la Cal [3]. The Laplace transform characterization of the order $\le_{\mathrm{st}}$ (Theorem 1.A.13) can be found in Kebir [281] and in Kan and Yi [274]. An extension of Theorem 1.A.13 to more general orders can be found in Nanda [422]. The closure of the order $\le_{\mathrm{st}}$ under a stochastically increasing family of random variables (Theorem 1.A.14) is taken from Shaked and Wong [524]. The condition for the usual stochastic order, given in Theorem 1.A.17, has been communicated to us by Gerchak and He [210]. The comparison of truncated maximum with truncations maximum (Example 1.A.16) can be found in Pellerey and Petakos [453]. The lattice property of the order $\le_{\mathrm{st}}$ (Remark 1.A.18) is given in Müller and Scarsini [418]. The four results that give the stochastic orderings of the spacings, Theorems 1.A.19–1.A.22, can be found in Barlow and Proschan [35], Ebrahimi and Spizzichino [178], Pledger and Proschan [458], and Joag-Dev [258], respectively. The stochastic comparison of order statistics of independent random variables with the order statistics of independent and identically distributed random variables (Theorem 1.A.23) is taken from Ma [371]; it generalizes some previous results in the literature. The stochastic comparison of a sum of independent heterogeneous exponential random variables with a proper Erlang random variable (Example 1.A.24) is taken from Bon and Păltănea [105], where more refined comparisons can also be found. The stochastic comparison of a sum of independent heterogeneous binomial random variables with a proper binomial random variable (Example 1.A.25) is taken from Boland, Singh, and Cukic [102]. The necessary and sufficient conditions for the comparison of normal random variables (Example 1.A.26) are taken from Müller [413]; an extension of this result to Kotz-type distributions is given in Ding and Zhang [168]. The stochastic comparisons of norms, in Examples 1.A.27 and 1.A.28, are taken from Lapidoth and Moser [333]. The TTT transform (1.A.19) is introduced in Barlow, Bartholomew, Bremner, and Brunk [32], and is further studied in Barlow and Doksum [34] and in Barlow and Campo [33]. The observed total time on test random variable $X_{\mathrm{ttt}}$ is defined and studied in Li and Shaked [356], where the implication in Theorem 1.A.29 can be found. The


characterizations of the NBUE and the NWUE aging notions by means of the usual stochastic order (Theorem 1.A.31) can be found in Whitt [565] and in Fagiuoli and Pellerey [187]. The other characterization, by means of the random variable $X_{\mathrm{ttt}}$ (Theorem 1.A.32), is taken from Li and Shaked [356]. The aging notion that is described in (1.A.21) is studied in Mugdadi and Ahmad [402]. Boland, Singh, and Cukic [103] studied an order, called the stochastic precedence order, according to which the random variable $X$ is smaller than the random variable $Y$ if $P\{X < Y\} \ge P\{Y < X\}$. If $X$ and $Y$ are independent, then $X \le_{\mathrm{st}} Y$ implies that $X$ is smaller than $Y$ in the stochastic precedence order.

Section 1.B: Many of the basic results regarding the hazard rate order can be found in Ross [475] and in Müller and Stoyan [419]. The characterization (1.B.8) can be found in Lehmann and Rojo [345]. The results regarding the preservation of the orders $\le_{\mathrm{hr}}$ and $\le_{\mathrm{rh}}$ under monotone increasing transformations (Theorems 1.B.2 and 1.B.43) can be found in Keilson and Sumita [283]. The closure under convolutions result (Theorem 1.B.4) and the bivariate characterization result (Theorem 1.B.9) are taken from Kijima [291] and Shanthikumar and Yao [532]. A special case of Lemma 1.B.3 can be found in Mukherjee and Chatterjee [403]. The hazard rate order comparison of a sum of independent heterogeneous exponential random variables with a proper Erlang random variable (Example 1.B.5) is taken from Bon and Păltănea [105], where more refined comparisons can also be found. The hazard rate order comparison of a sum of independent heterogeneous binomial random variables with a proper binomial random variable (Example 1.B.6) is taken from Boland, Singh, and Cukic [102]. The hazard rate order comparison of random sums (Theorem 1.B.7) can be found in Pellerey [445]; some related results are Theorem 7.2 of Kijima [291] and Proposition 2.2 of Kebir [282]. The closure under mixtures result (Theorem 1.B.8) can be found in Boland, El-Neweihi, and Proschan [97]; a generalization of it is contained in Nanda, Jain, and Singh [424]. The bivariate characterizations in Theorems 1.B.10 and 1.B.11 are taken from Righter and Shanthikumar [466] and from Cheng and Righter [128], respectively. The characterizations given in Theorem 1.B.12 can be found in Capéraà [118] and in Joag-Dev, Kochar, and Proschan [259]. The hazard rate ordering result regarding the inter-epoch times of a nonhomogeneous Poisson process (Example 1.B.13) is taken from Kochar [309] where other applications of Theorem 1.B.12 can also be found. The hazard rate ordering of the epoch times of a nonhomogeneous Poisson process (1.B.19) can be found in Baxter [62]. The closure property of the order $\le_{\mathrm{hr}}$ under hazard rate ordered mixtures (Theorem 1.B.14) is taken from Shaked and Wong [524]; a related result is Proposition 4.1 of Kebir [282]. The preservation of the order $\le_{\mathrm{hr}}$ under the formation of a proper Markov chain (Example 1.B.15) can essentially be found in Ross, Shanthikumar,


and Zhu [478]; they gave a version of this preservation result for the order $\le_{\mathrm{rh}}$. The application of the order $\le_{\mathrm{hr}}$ in Bayesian imperfect repair (Example 1.B.16) is inspired by Lim, Lu, and Park [364], but the result given here is stronger than their Theorem 4.1(iii). The hazard rate order comparison of a proportional hazard mixture with its parent distribution (Example 1.B.17) is taken from Gupta and Gupta [214]. The Laplace transform characterization of the order $\le_{\mathrm{hr}}$ (Theorem 1.B.18) can be found in Kebir [281] and in Kan and Yi [274]. An extension of Theorem 1.B.18 to more general orders can be found in Nanda [422]. The result about the inheritance of the order $\le_{\mathrm{hr}}$, from the mixing scales to the underlying counting processes (Theorem 1.B.19), is essentially taken from Ma [374]. The closure property which is given in Theorem 1.B.21 can be found in Kochar [305]; the necessary and sufficient condition, given after Theorem 1.B.21, is taken from Ma [374]. The result involving the hazard rate comparison of weighted random variables (Example 1.B.23) is taken from Nanda and Jain [423]; see also Bartoszewicz and Skolimowska [51]. The hazard rate comparison of epoch times of nonhomogeneous Poisson processes in Example 1.B.24 can be found in Ahmadi and Arghami [6] and in Belzunce, Lillo, Ruiz, and Shaked [69]; in the latter paper the result is extended to nonhomogeneous pure birth processes. The hazard rate order comparison of inter-epoch times of nonhomogeneous Poisson processes in Example 1.B.24 is taken from Belzunce, Lillo, Ruiz, and Shaked [69], who also obtained a similar result for the more general nonhomogeneous pure birth processes. The hazard rate order comparison of series systems of parallel systems (Example 1.B.25) can be found in Valdés and Zequeira [553]. The proof of Theorem 1.B.26 (given in Remark 1.C.40) is taken from Boland, Shaked, and Shanthikumar [101]. The hazard rate order comparisons of order statistics described in Theorems 1.B.27 and 1.B.28 can be found in Korwar [321]. The conditions that lead to the hazard rate ordering of minima (Theorem 1.B.29 and Corollary 1.B.30) are taken from Navarro and Shaked [430]. The two results that give the hazard rate orderings of the spacings (Theorem 1.B.31) can be found in Kochar and Kirmani [313] and in Khaledi and Kochar [285], whereas the comparison of spacings from two different samples (Theorem 1.B.32) is taken from Khaledi and Kochar [285]; further results can be found in Hu and Wei [240] and in Misra and van der Meulen [396]. The closure property under formations of order statistics (Theorem 1.B.34) is taken from Singh and Vijayasree [537]; see also Lynch, Mimmack, and Proschan [369]. Boland, El-Neweihi, and Proschan [97] show, by a counterexample, that the conclusion of Theorem 1.B.34 need not hold when the $X_i$'s or the $Y_i$'s are not identically distributed. Extensions of Theorem 1.B.34 can be found in Shaked and Shanthikumar [516], in Belzunce, Mercader, and Ruiz [70], and in Hu and Zhuang [247]. The general comparison result, given in Theorem 1.B.36, is taken from Boland, Hu, Shaked, and Shanthikumar [99]; see related results in Franco, Ruiz, and Ruiz [205] and in Hu and Zhuang [247]. The hazard


rate order comparisons of maxima of heterogeneous exponential random variables (Example 1.B.37) are taken from Dykstra, Kochar, and Rojo [174] and from Khaledi and Kochar [287]. The closure under convolution property of IFR random variables (Corollary 1.B.39) can be found, for example, in Barlow and Proschan [36, page 100]. The characterizations of the DMRL and the IMRL aging notions by means of the hazard rate order (Theorem 1.B.40) can be found in Brown [111, page 229], in Whitt [565], and in Fagiuoli and Pellerey [187]. The observation that essentially reduces the study of the reversed hazard rate order into the study of the hazard rate order (Theorem 1.B.41) is taken from Nanda and Shaked [428]. The bivariate characterization results for the reversed hazard order (Theorems 1.B.47 and 1.B.49) can be found in Shanthikumar, Yamazaki, and Sakasegawa [529] and in Cheng and Righter [128]. The application of the reversed hazard order in economics, described in Example 1.B.51, is taken from Eeckhoudt and Gollier [180]; further results in this vein can be found in Kijima and Ohnishi [293]. The closure property of the order $\le_{\mathrm{rh}}$ under reversed hazard rate ordered mixtures (Theorem 1.B.52) is taken from Shaked and Wong [524]; a related result is Proposition 4.1 of Kebir [282]. The Laplace transform characterization of the order $\le_{\mathrm{rh}}$ (Theorem 1.B.53) is taken from Kebir [281]. The result about the inheritance of the order $\le_{\mathrm{rh}}$, from the mixing scales to the underlying counting processes (Theorem 1.B.54), is essentially taken from Ma [374]. The results about the reversed hazard rate ordering of order statistics (Theorems 1.B.56 and 1.B.57), and the characterizations of the reversed hazard rate order given in Theorem 1.B.62, can be found in Block, Savits, and Singh [96], whereas the result described in Theorem 1.B.58 is taken from Hu and He [232]. The preservation of the order statistics in the sense of the order $\le_{\mathrm{rh}}$ (Theorem 1.B.60) can be found in Nanda, Jain, and Singh [426].

An order among nonnegative random variables, which is defined by stipulating the monotonicity of the ratio of the hazard rate functions (when they exist), is studied in Kalashnikov and Rachev [271], Sengupta and Deshpande [500], and Rowell and Siegrist [479]. Equivalently, if $\overline{F}$ and $\overline{G}$ are survival functions, and we denote $R_F = -\log \overline{F}$ and $R_G = -\log \overline{G}$, then the order mentioned above can be defined by requiring that the composition $R_F \circ R_G^{-1}$ be convex on $[0, \infty)$. The notion of the monotonicity of the ratio of hazard rate functions is used in Examples 1.B.24 (see (1.B.25)) and 1.B.25, as well as in Theorem 1.C.4. Sengupta and Deshpande [500] and Rowell and Siegrist [479] also studied the orders defined by stipulating that $R_F \circ R_G^{-1}$ be starshaped or superadditive.

Brown and Shanthikumar [112], Lillo, Nanda, and Shaked [361], Hu and Zhu [242], Di Crescenzo and Longobardi [165], and Belzunce, Ruiz, and Ruiz [74] have introduced and studied various shifted hazard and reversed hazard rate orders. Similar orders which extend the likelihood ratio order are studied in Section 1.C.4.


Section 1.C: Again, many of the basic results regarding the likelihood ratio order can be found in Ross [475] and in Müller and Stoyan [419]. Condition (1.C.3) is implicit in Block, Savits, and Shaked [95], and it is explicit in Müller [408]. The relation (1.C.4) is mentioned in Chan, Proschan, and Sethuraman [123]. The sufficient conditions for $X \le_{\mathrm{lr}} Y$, given in Theorem 1.C.4, have been noted in Belzunce, Lillo, Ruiz, and Shaked [69]. The closure property of the likelihood ratio order under conditioning (Theorem 1.C.5) is observed in Whitt [561]. Many variations of Theorem 1.C.5 with respect to general sample spaces can be found in Whitt [561] and in Rüschendorf [485]. The closure under limits property of the order $\le_{\mathrm{lr}}$ (Theorem 1.C.7) is taken from Müller [408]. The result regarding the preservation of the order $\le_{\mathrm{lr}}$ under monotone increasing transformations (Theorem 1.C.8) can be found in Keilson and Sumita [283]. The several closure under convolution results (Theorems 1.C.9, 1.C.11, and 1.C.12) as well as the bivariate characterization result (Theorem 1.C.20) are taken from Shanthikumar and Yao [532]; a related result is Proposition 2.4 of Kebir [282]. A special case of Theorem 1.C.9 can be found in Mukherjee and Chatterjee [403]. The result about the number of successes in independent trials (Example 1.C.10) is statement (7) in Samuels [488], who attributed it to Ghurye and Wallace. The characterization of the order $\le_{\mathrm{hr}}$ by means of the order $\le_{\mathrm{lr}}$, given in Theorem 1.C.14, is taken from Di Crescenzo [164]; a density of the form (1.C.7) can be found in Adell and Lekuona [4, page 773]. The likelihood ratio order comparison of a random random variable with a fixed random variable (Corollary 1.C.16) is a slight generalization of Problem B in Szekli [544, page 22]. The closure property of the order $\le_{\mathrm{lr}}$ under likelihood ratio ordered mixtures (Theorem 1.C.17) is an extension of a result in Kebir [282]. The result about the inheritance of the order $\le_{\mathrm{lr}}$, from the mixing scales to the underlying counting processes (Theorem 1.C.18), is taken from Ma [374]. Example 1.C.19 is inspired by Theorem 4.12 of Asadi and Shanbhag [23], but Example 1.C.19 has weaker assumptions ($\Theta_1$ and $\Theta_2$ need not be degenerate) and stronger conclusions ($Y_1$ and $Y_2$ are ordered in the likelihood ratio order, rather than in the hazard rate order) than the result of Asadi and Shanbhag [23]. The result in Theorem 1.C.21 is a special case of a result in Ross [475]. The bivariate characterizations in Theorems 1.C.22, 1.C.23, and 1.C.24 are taken from Righter and Shanthikumar [466] and from Chapter 13 by Righter in [515]. The Laplace transform characterization of the order $\le_{\mathrm{lr}}$ (Theorem 1.C.25) can be found in Kebir [281]. An extension of Theorem 1.C.25 to more general orders can be found in Nanda [422]. The conditional likelihood ratio orderings, described in Theorem 1.C.26, can be found in Ku and Niu [324] and in Chapter 14 by Shanthikumar and Yao in [515]. The setting in which the order $\le_{\mathrm{hr}}$ gives rise to the order $\le_{\mathrm{lr}}$, as described in Theorem 1.C.28, is essentially taken from Ross, Shanthikumar, and Zhu [478]; they gave a version of this result for the order $\le_{\mathrm{rh}}$. The necessary and sufficient condition for $aX \le_{\mathrm{lr}} X$ (Theorem 1.C.29) can be found in

Hu, Nanda, Xie, and Zhu [237]. The likelihood ratio order comparisons of the order statistics given in Theorem 1.C.31 are taken from Bapat and Kochar [31] and from Hu, Zhu, and Wei [243]; an extension of the ﬁrst part of Theorem 1.C.31(a) can be found in Ma [373]. The result about the likelihood ratio order comparison of order statistics of a simple random sample from a ﬁnite population (Theorem 1.C.32) can be found in Kochar and Korwar [315]. The general result which compares order statistics from two samples of diﬀerent size (Theorem 1.C.33) is taken from Lillo, Nanda, and Shaked [362]; see related results in Franco, Ruiz, and Ruiz [205] and in Hu and Zhuang [247]. Belzunce and Shaked [78] extended Theorem 1.C.33 to comparison of lifetimes of coherent systems in reliability theory; see also Belzunce, Franco, Ruiz, and Ruiz [66]. The closure property under formation of order statistics (Corollary 1.C.34) can be found in Chan, Proschan, and Sethuraman [123]; a special case of this result can be found in Singh and Vijayasree [537]. The likelihood ratio order comparison of the order statistics given in Theorem 1.C.37 is taken from Raqab and Amin [465]. Theorem 2.6 in Kamps [273, page 182] extends Theorem 1.C.37 to the so called generalized order statistics; see also Korwar [322] and Hu and Zhuang [247]. The special case of Theorem 1.C.37 when j = i, is extended in Nanda, Misra, Paul, and Singh [427] to the case when the sample sizes m and n are random. Nanda, Misra, Paul, and Singh [427] also extend the special case of Theorem 1.C.37 when m = n, to the case when the common sample size is random. The likelihood ratio order comparison of normalized spacings (Theorem 1.C.42) can be found in Kochar and Korwar [314], whereas the comparisons for nonnormalized spacings (Theorem 1.C.43) are special cases of results in Misra and van der Meulen [396] and in Hu and Zhuang [246, 248]. 
The comparison of spacings that correspond to random variables with logconcave density (Theorem 1.C.44) is a special case of a result of Hu and Zhuang [246, 248]. The comparison of spacings from two different samples (Theorem 1.C.45) is taken from Khaledi and Kochar [285]; an extension of this result can be found in Franco, Ruiz, and Ruiz [205], and a related result can be found in Belzunce, Mercader, and Ruiz [70]. The results about the likelihood ratio order comparisons of random minima and maxima (Example 1.C.46) are taken from Shaked and Wong [526]; see a related result in Bartoszewicz [49]. The result about the likelihood ratio comparison of the successive epochs of a nonhomogeneous Poisson process (Example 1.C.47) is given in Kochar [307, 309], where it is also shown that it implies the likelihood order comparison of successive record values of a sequence of independent and identically distributed random variables. The likelihood ratio comparisons of epoch and inter-epoch times of nonhomogeneous Poisson processes (Example 1.C.48) are taken from Belzunce, Lillo, Ruiz, and Shaked [69], who also extended them to comparisons of epoch and inter-epoch times of nonhomogeneous pure birth processes. The likelihood ratio order comparison of a sum of independent heterogeneous exponential random variables with a proper Erlang random variable (Example 1.C.49) is a combination of results from Boland, El-Neweihi, and Proschan [98] and from Bon and Păltănea [105], where more refined comparisons can also be found. For instance, the comparison in Example 1.C.50 is given in Boland, El-Neweihi, and Proschan [98]. The likelihood ratio order comparison of a sum of independent heterogeneous binomial random variables with a proper binomial random variable (Example 1.C.51) is taken from Boland, Singh, and Cukic [102]. An interpretation of logconcavity and logconvexity as aging notions can be found in Shaked and Shanthikumar [506], where the proof of parts (i) and (ii) of Theorem 1.C.52 can be found. A proof of (1.C.13) can also be found there. The likelihood ratio ordering of random variables conditioned on their sum (Theorem 1.C.53) is essentially Example 12 of Lehmann [343]. The closure property of logconcave densities under order statistics (Theorem 1.C.54) is a generalization of an observation in Li and Lu [355]. The characterizations of the IFR and the DFR aging notions by means of the likelihood ratio order (Theorem 1.C.55) can be found in Whitt [565]. The likelihood ratio order comparison of the asymptotic equilibrium ages, given in Example 1.C.56, is a special case of a result of Bon and Illayk [104]. The likelihood ratio monotonicity of the parameter in the observation, given the likelihood ratio monotonicity of the observation in the parameter (Example 1.C.57), can be found in Whitt [560], whereas the preservation of the likelihood ratio order of the priors by the posteriors (Example 1.C.58) is given as Remark 3.14 in Spizzichino [539]. The comparison of the weighted random variables (Example 1.C.59) can be found in Bartoszewicz and Skolimowska [51]. An extension of the implication in Example 1.C.59, when $X^w$ and $Y^w$ are the length-biased versions of X and of Y, respectively, is given in Hu and Zhuang [244].
An extension of the implication in Example 1.C.59 to multivariate weighted distributions can be found in Jain and Nanda [253]. The result in Example 1.C.60 is taken from Bartoszewicz and Skolimowska [51]; extensions of the inequality $X \le_{\rm lr} X^w$, when $X^w$ is the length-biased version of X, are given in Ross [476]. The ordering of generalized skew normal random variables (Example 1.C.61) is taken from Gupta and Gupta [215]. The up shifted likelihood ratio order is introduced in Shanthikumar and Yao [530]. The results described in Section 1.C.4 can mostly be found in Lillo, Nanda, and Shaked [361, 362]. An extension of Theorem 1.C.77 is given in Belzunce, Ruiz, and Ruiz [74]; see also Belzunce and Shaked [78]. Ramos Romero and Sordo Díaz [464] defined an order that is reminiscent of the order ≤lr↑ as defined in (1.C.19). According to their definition, the nonnegative random variable X is said to be smaller than the nonnegative random variable Y if aX ≤lr Y for every 0 < a < 1. Lehmann and Rojo [345] used the characterization (1.C.4) in order to define stochastic orders that are stronger than ≤lr. For example, let X and Y be two random variables with distribution functions F and G,

respectively, and consider the stipulation that, for a fixed k,

\[ \frac{d^n}{du^n}\, G F^{-1}(u) \ge 0 \quad \text{for all } 0 < u < 1 \text{ and all } n = 1, 2, \ldots, k. \]

If k ≥ 3, then X is stochastically smaller than Y in a sense that is stronger than ≤lr. The order ≤lr is obtained when k = 2. Lehmann and Rojo [345] showed, for example, that if X1, X2, ..., Xm are independent, identically distributed, then X1 is smaller than max{X1, X2, ..., Xm}, in the above sense, with k = m. Chang [126] considered four exponential random variables X1, X2, Y1, and Y2, with the corresponding rates λ1, λ2, µ1, and µ2, where X1 and X2 are independent, and Y1 and Y2 are independent. He obtained the necessary and sufficient conditions on λ1, λ2, µ1, and µ2, for each of the following results: (i) X1 + X2 ≤lr Y1 + Y2, (ii) X1 + X2 ≥lr Y1 + Y2, and (iii) X1 + X2 and Y1 + Y2 are not comparable in the likelihood ratio order.

Section 1.D: The discussion in this section follows Shaked and Suarez-Llorens [520]. Fagiuoli and Pellerey [185] have introduced an approach that describes a unified point of view regarding some of the orders studied in this chapter and some of the orders studied in Chapters 2, 3, and 4. This approach led Fagiuoli and Pellerey to introduce some families of new orders. Several properties of these orders were studied in Fagiuoli and Pellerey [185], in Nanda, Jain, and Singh [424, 425], and in Hu, Kundu, and Nanda [236]; see also Hesselager [221]. Another general approach that unifies some of the orders studied in this chapter and in Chapter 2 was introduced in Hu, Nanda, Xie, and Zhu [237]. Other orders that are related to the orders ≤st and ≤lr have been introduced and studied in Di Crescenzo [163]. Yanagimoto and Sibuya [571], Zijlstra and de Kroon [577], and Shanthikumar and Yao [532] extended the definitions of X ≤st Y, X ≤hr Y, and X ≤lr Y to jointly distributed random variables X and Y; see also Arcones, Kvam, and Samaniego [15]. Ebrahimi and Pellerey [177] have introduced a stochastic order based on a notion of uncertainty and studied its relationship to some of the orders studied in this chapter.

2 Mean Residual Life Orders

In this chapter we study two orders that are based on comparisons of functionals of mean residual lives. Like the orders in Chapter 1, the purpose of the orders here is to compare the “location” or the “magnitude” of random variables. Among other things, the relationship between the orders of Chapter 1 and the orders in this chapter will be analyzed.

2.A The Mean Residual Life Order

2.A.1 Definition

If X is a random variable with a survival function $\bar F$ and a finite mean µ, the mean residual life of X at t is defined as

\[ m(t) = \begin{cases} E[X - t \mid X > t], & \text{for } t < t^*; \\ 0, & \text{otherwise,} \end{cases} \tag{2.A.1} \]

where $t^* = \sup\{t : \bar F(t) > 0\}$. Note that if X is an almost surely positive random variable, then m(0) = µ. By the finiteness of µ we have that m(t) < ∞ for all t < ∞. However, it is possible that $m(\infty) \equiv \lim_{t \to \infty} m(t) = \infty$. A useful observation is that $m(t) = \big(\int_t^\infty \bar F(x)\,dx\big)/\bar F(t)$ when $t^* = \infty$.

Although in (2.A.1) there is no restriction on the support of X, the mean residual life function is usually of interest when X is a nonnegative random variable. In that case X can be thought of as a lifetime of a device and m(t) then expresses the conditional expected residual life of the device at time t given that the device is still alive at time t.

Clearly, m(t) ≥ 0, but not every nonnegative function is a mean residual life (mrl) function corresponding to some random variable. In fact, a function m is an mrl function of some nonnegative random variable with an absolutely continuous distribution function if, and only if, m satisfies the following properties:

(i) 0 ≤ m(t) < ∞ for all t ≥ 0,
(ii) m(0) > 0,
(iii) m is continuous,
(iv) m(t) + t is increasing on [0, ∞), and
(v) when there exists a t0 such that m(t0) = 0, then m(t) = 0 for all t ≥ t0. Otherwise, when there does not exist such a t0 with m(t0) = 0, then

\[ \int_0^\infty \frac{1}{m(t)}\,dt = \infty. \]
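As a quick numerical illustration of (2.A.1), the following sketch evaluates the ratio of the integrated survival function to the survival function for two textbook cases: an exponential lifetime, whose mrl function is constant, and a Uniform(0, 1) lifetime, whose mrl function is (1 − t)/2. The helper names (`integrate`, `mrl`, `exp_survival`, `unif_survival`) and the truncation point `upper` are assumptions of this sketch and are not part of the text.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def mrl(survival, t, upper):
    """m(t) = (integral of the survival function over [t, upper]) / survival(t).

    `upper` truncates the infinite range of integration; it is an assumption
    of this sketch and must be large enough for the tail to be negligible.
    """
    s = survival(t)
    if s <= 0.0:
        return 0.0  # convention of (2.A.1) for t >= t*
    return integrate(survival, t, upper) / s

def exp_survival(x, mean=2.0):
    """Survival function of an exponential lifetime with the given mean."""
    return math.exp(-x / mean) if x >= 0 else 1.0

def unif_survival(x):
    """Survival function of a Uniform(0, 1) lifetime."""
    return min(1.0, max(0.0, 1.0 - x))

# Exponential with mean 2: the mrl function is constant and equal to the mean.
exp_mrl = [mrl(exp_survival, t, upper=t + 80.0) for t in (0.0, 1.0, 5.0)]
# Uniform(0, 1): the mrl function is (1 - t) / 2 on [0, 1).
unif_mrl = [mrl(unif_survival, t, upper=1.0) for t in (0.0, 0.25, 0.5)]
print(exp_mrl, unif_mrl)
```

Both examples satisfy properties (i) through (v): in the uniform case m(t) + t = (1 + t)/2 is increasing and m vanishes from t0 = 1 on, while in the exponential case m never vanishes and the integral of 1/m diverges.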

Clearly, the smaller the mrl function is, the smaller X should be in some stochastic sense. This is the motivation for the order discussed in this section. Let X and Y be two random variables with mrl functions m and l, respectively, such that

\[ m(t) \le l(t) \quad \text{for all } t. \tag{2.A.2} \]

Then X is said to be smaller than Y in the mean residual life order (denoted as X ≤mrl Y). Analogously to (1.B.3), it can be shown that X ≤mrl Y if, and only if,

\[ \frac{\int_t^\infty \bar G(u)\,du}{\int_t^\infty \bar F(u)\,du} \quad \text{increases in } t \text{ over } \Big\{t : \int_t^\infty \bar F(u)\,du > 0\Big\}, \tag{2.A.3} \]

or equivalently, if, and only if,

\[ \bar G(t) \int_t^\infty \bar F(u)\,du \le \bar F(t) \int_t^\infty \bar G(u)\,du \quad \text{for all } t, \tag{2.A.4} \]

or equivalently, if, and only if,

\[ \frac{E[(Y - t)_+]}{E[(X - t)_+]} \quad \text{increases in } t \text{ over } \{t : E[(X - t)_+] > 0\}, \tag{2.A.5} \]

where, for any real number a, we let $a_+$ denote the positive part of a; that is, $a_+ = a$ if a ≥ 0 and $a_+ = 0$ if a < 0. Analogously to (1.B.5), we also have that X ≤mrl Y if, and only if,

\[ \frac{\bar F(s)}{\int_t^\infty \bar F(u)\,du} \ge \frac{\bar G(s)}{\int_t^\infty \bar G(u)\,du} \quad \text{for all } s \le t \tag{2.A.6} \]

such that the denominators are positive. It is worthwhile to note that Condition (2.A.5) uses the expectations $E[(X - t)_+]$ and $E[(Y - t)_+]$ as (3.A.5) in Chapter 3 and (4.A.4) in Chapter 4 do.

For discrete random variables that take on values in $\mathbb N_+$ the definition of ≤mrl should be modified. Let X be such a random variable with a finite mean µ. The mrl function of X at n is defined as

\[ m(n) = \begin{cases} E[X - n \mid X \ge n], & \text{for } n \le n^*; \\ 0, & \text{otherwise,} \end{cases} \]

where $n^* = \max\{n : P\{X \ge n\} > 0\}$. Note that for such a random variable m(0) = µ. By the finiteness of µ we have that m(n) < ∞ for n < ∞. Let X and Y be two such random variables with mrl functions m and l, respectively. We denote X ≤mrl Y if

\[ m(n) \le l(n) \quad \text{for all } n \ge 0. \tag{2.A.7} \]

The discrete analog of (2.A.3) is that (2.A.7) holds if, and only if,

\[ \frac{\sum_{j=n}^\infty P\{Y \ge j\}}{\sum_{j=n}^\infty P\{X \ge j\}} \quad \text{increases in } n \text{ over } \mathbb N_+ \cap \Big\{n : \sum_{j=n}^\infty P\{X \ge j\} > 0\Big\}. \]

The discrete analog of (2.A.4) is that (2.A.7) holds if, and only if,

\[ P\{Y \ge n\} \sum_{j=n+1}^\infty P\{X \ge j\} \le P\{X \ge n\} \sum_{j=n+1}^\infty P\{Y \ge j\} \quad \text{for all } n \ge 0. \]

The discrete analog of (2.A.6) is that X ≤mrl Y if, and only if,

\[ \frac{P\{X \ge m\}}{\sum_{j=n+1}^\infty P\{X \ge j\}} \ge \frac{P\{Y \ge m\}}{\sum_{j=n+1}^\infty P\{Y \ge j\}} \quad \text{for all } m \le n \]

such that the denominators are positive.

2.A.2 The relation between the mean residual life and some other stochastic orders

If X is a random variable with mrl function m and hazard rate function r, it is not hard to verify that

\[ m(t) = \int_t^{t^*} \exp\Big\{-\int_t^x r(u)\,du\Big\}\,dx, \quad \text{for } t < t^*. \tag{2.A.8} \]

Therefore, if Y is another random variable with mrl function l and hazard rate function q and (1.B.2) is satisfied, that is, X ≤hr Y, then X ≤mrl Y. We thus have proved the following result.

Theorem 2.A.1. If X and Y are two random variables such that X ≤hr Y, then X ≤mrl Y.

Neither of the orders ≤st and ≤mrl implies the other; counterexamples can be found in the literature. The next result, however, gives a condition under which X ≤mrl Y if, and only if, X ≤hr Y. Therefore, in particular, under that condition, X ≤mrl Y =⇒ X ≤st Y.

Theorem 2.A.2. Let X and Y be two random variables with mrl functions m and l, respectively. Suppose that m(t)/l(t) increases in t. Then, if X ≤mrl Y, then X ≤hr Y.

Proof. It is not hard to verify that m is differentiable over {t : P{X > t} > 0} and that if X has the hazard rate function r, then

\[ r(t) = \frac{m'(t) + 1}{m(t)}, \]

where m' denotes the derivative of m. Similarly, if Y has the hazard rate function q, then

\[ q(t) = \frac{l'(t) + 1}{l(t)}. \]

The monotonicity of m(t)/l(t), together with (2.A.2), implies that

\[ r(t) = \frac{m'(t)}{m(t)} + \frac{1}{m(t)} \ge \frac{l'(t)}{l(t)} + \frac{1}{l(t)} = q(t), \]

that is, X ≤hr Y.
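The identity r(t) = (m'(t) + 1)/m(t) used in the proof can be checked numerically on a family where both functions are available in closed form. The sketch below uses the Lomax (Pareto type II) distribution with survival function (1 + t/scale)^(−shape), shape > 1, for which m(t) = (scale + t)/(shape − 1) and r(t) = shape/(scale + t). This choice of example, the function names, and the finite-difference step `h` are assumptions of the sketch, not part of the text.

```python
def lomax_mrl(t, shape=3.0, scale=2.0):
    """mrl function of a Lomax lifetime (requires shape > 1)."""
    return (scale + t) / (shape - 1.0)

def lomax_hazard(t, shape=3.0, scale=2.0):
    """Hazard rate function of a Lomax lifetime."""
    return shape / (scale + t)

def hazard_from_mrl(m, t, h=1e-5):
    """r(t) = (m'(t) + 1) / m(t), with m' approximated by a central difference."""
    dm = (m(t + h) - m(t - h)) / (2.0 * h)
    return (dm + 1.0) / m(t)

grid = [0.5, 1.0, 4.0, 10.0]
recovered = [hazard_from_mrl(lomax_mrl, t) for t in grid]
exact = [lomax_hazard(t) for t in grid]

# Two Lomax laws with a common scale: the larger shape has the larger hazard
# rate, hence (Theorem 2.A.1) the smaller mrl function.
hr_ordered = all(lomax_hazard(t, shape=4.0) >= lomax_hazard(t, shape=3.0) for t in grid)
mrl_ordered = all(lomax_mrl(t, shape=4.0) <= lomax_mrl(t, shape=3.0) for t in grid)
print(recovered, exact, hr_ordered, mrl_ordered)
```

In this family the ratio of the two mrl functions is constant in t, so the hypothesis of Theorem 2.A.2 holds trivially and the two orders coincide.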

Under a condition that is weaker than the one in Theorem 2.A.2 one merely obtains that X ≤mrl Y implies that X ≤st Y. This is shown in the next result.

Theorem 2.A.3. Let X and Y be two nonnegative random variables with mrl functions m and l, respectively. Suppose that $\frac{m(t)}{l(t)} \ge \frac{m(0)}{l(0)}$ (that is, $\frac{m(t)}{l(t)} \ge \frac{EX}{EY}$ when X and Y are almost surely positive), t ≥ 0. If X ≤mrl Y, then X ≤st Y.

Proof. Let $\bar F$ be the survival function of X. It is not hard to verify that

\[ \bar F(t) = \frac{EX}{m(t)} \exp\Big\{-\int_0^t \frac{1}{m(x)}\,dx\Big\} \quad \text{over } \{t : P\{X > t\} > 0\}. \]

Similarly, the survival function of Y can be expressed as

\[ \bar G(t) = \frac{EY}{l(t)} \exp\Big\{-\int_0^t \frac{1}{l(x)}\,dx\Big\} \quad \text{over } \{t : P\{Y > t\} > 0\}. \]

Therefore, under the assumptions of the theorem, it is seen that $\bar G(t)/\bar F(t) \ge 1$.
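The inversion formula in the proof, which recovers the survival function from the mrl function, can be illustrated as follows. The sketch takes the mrl function (1 − t)/2 of a Uniform(0, 1) lifetime, which is almost surely positive so that m(0) = EX, and recovers the survival function 1 − t by numerical integration. The function names and the midpoint-rule discretization are assumptions of this sketch.

```python
import math

def survival_from_mrl(m, t, n=20000):
    """Survival function (m(0)/m(t)) * exp(-integral of 1/m over [0, t]).

    Uses m(0) = EX, which holds for an almost surely positive lifetime, and a
    composite midpoint rule for the integral.
    """
    h = t / n
    integral = h * sum(1.0 / m((i + 0.5) * h) for i in range(n))
    return m(0.0) / m(t) * math.exp(-integral)

def unif_mrl(t):
    """mrl function of a Uniform(0, 1) lifetime on [0, 1)."""
    return (1.0 - t) / 2.0

points = (0.1, 0.5, 0.9)
values = [survival_from_mrl(unif_mrl, t) for t in points]
print(values)  # close to 1 - t at each point
```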

The mean residual life order can be characterized by means of the hazard rate order and the appropriate equilibrium age variables. Recall from (1.A.20) that for nonnegative random variables X and Y with ﬁnite means we denote by AX and AY the corresponding asymptotic equilibrium ages. The following result follows at once from (1.B.3) and (2.A.3). It may be contrasted with Theorem 1.C.13. Theorem 2.A.4. For nonnegative random variables X and Y with ﬁnite means we have X ≤mrl Y if, and only if, AX ≤hr AY .

In the next theorem the order ≤mrl is characterized by ordering two related random variables in the sense of the hazard rate order. Let X and Y be two nonnegative random variables with finite means and suppose that X ≤st Y and that EX < EY. Let F and G be the distribution functions of X and of Y, respectively. Define the random variable $Z_{X,Y}$ as the random variable that has the density function h given by (1.C.7), as in Theorem 1.C.14; see also Theorem 2.B.3.

Theorem 2.A.5. Let X and Y be two nonnegative random variables with finite means such that X ≤st Y and such that EY > EX > 0. Then

\[ X \le_{\rm mrl} Y \iff A_Y \le_{\rm hr} Z_{X,Y} \iff A_X \le_{\rm hr} Z_{X,Y}, \]

where $Z_{X,Y}$ has the density function given in (1.C.7).

Proof. Denote by $\bar G_e$ and $\bar H$ the survival functions of $A_Y$ and $Z_{X,Y}$, respectively. Using (1.A.20) and (1.C.7) we compute

\[ \frac{\bar H(x)}{\bar G_e(x)} = \frac{EY}{EY - EX} \left[ 1 - \frac{\int_x^\infty \bar F(u)\,du}{\int_x^\infty \bar G(u)\,du} \right], \quad x \ge 0, \]

and the first stated equivalence follows from (2.A.3) and (1.B.3). The second equivalence is proven similarly.

Some characterizations of the hazard rate order by means of the order ≤mrl are given below. We denote by Exp(µ) any exponential random variable with mean µ.

Theorem 2.A.6. Let X and Y be two continuous nonnegative random variables. Then X ≤hr Y if, and only if,

\[ \min\{X, \mathrm{Exp}(\mu)\} \le_{\rm mrl} \min\{Y, \mathrm{Exp}(\mu)\} \quad \text{for all } \mu > 0. \]

The proof of Theorem 2.A.6 uses the Laplace transform order which is discussed in Chapter 5, and it will be given in Remark 5.A.23. Note that from Theorem 2.A.6 it follows, for continuous nonnegative random variables, that X ≤hr Y if, and only if, min{X, Z} ≤mrl min{Y, Z} for any nonnegative random variable Z which is independent of X and of Y. This is so because X ≤hr Y implies min{X, Z} ≤hr min{Y, Z} by Theorem 1.B.33, and the latter implies the above inequality by Theorem 2.A.1.

The proof of the next result is not given here.

Theorem 2.A.7. Let X and Y be two continuous nonnegative random variables. Then X ≤hr Y if, and only if,

\[ 1 - e^{-sX} \le_{\rm mrl} 1 - e^{-sY} \quad \text{for all } s > 0. \]

A characterization of the order ≤mrl , by means of the increasing convex order, is given in Theorem 4.A.24.

2.A.3 Some closure properties

In general, if X1 ≤mrl Y1 and X2 ≤mrl Y2, where X1 and X2 are independent random variables and Y1 and Y2 are also independent random variables, then it is not necessarily true that X1 + X2 ≤mrl Y1 + Y2. However, if these random variables are IFR, then it is true. This is shown in Theorem 2.A.9, but first we state and prove the following lemma, which is of independent interest.

Lemma 2.A.8. If the random variables X and Y are such that X ≤mrl Y and if Z is an IFR random variable which is independent of X and Y, then

\[ X + Z \le_{\rm mrl} Y + Z. \tag{2.A.9} \]

Proof. Denote by $f_W$ and $\bar F_W$ the density function and the survival function of any random variable W. Note that

\[ \int_{x=s}^\infty \bar F_{X+Z}(x)\,dx = \int_{-\infty}^\infty \bar F_X(u) \bar F_Z(s - u)\,du \quad \text{for all } s. \]

Now, for s ≤ t, compute

\[
\begin{aligned}
&\int_{x=s}^\infty \bar F_{X+Z}(x)\,dx \int_{y=t}^\infty \bar F_{Y+Z}(y)\,dy - \int_{x=t}^\infty \bar F_{X+Z}(x)\,dx \int_{y=s}^\infty \bar F_{Y+Z}(y)\,dy \\
&\quad = \int_v \int_{u \ge v} \big[ \bar F_X(u) \bar F_Z(s-u) \bar F_Y(v) \bar F_Z(t-v) + \bar F_X(v) \bar F_Z(s-v) \bar F_Y(u) \bar F_Z(t-u) \big]\,du\,dv \\
&\qquad - \int_v \int_{u \ge v} \big[ \bar F_X(u) \bar F_Z(t-u) \bar F_Y(v) \bar F_Z(s-v) + \bar F_X(v) \bar F_Z(t-v) \bar F_Y(u) \bar F_Z(s-u) \big]\,du\,dv \\
&\quad = \int_v \int_{u \ge v} \Big[ \int_{x=u}^\infty \bar F_X(x)\,dx \cdot \bar F_Y(v) - \int_{x=u}^\infty \bar F_Y(x)\,dx \cdot \bar F_X(v) \Big] \\
&\qquad \times \big[ f_Z(s-u) \bar F_Z(t-v) - f_Z(t-u) \bar F_Z(s-v) \big]\,du\,dv,
\end{aligned}
\]

where the second equality is obtained by integration by parts and by collection of terms. Since X ≤mrl Y it follows from (2.A.4) that the expression within the first set of brackets in the last integral is nonpositive. Since Z is IFR it can be verified that the quantity in the second pair of brackets in the last integral is also nonpositive. Therefore the integral is nonnegative. This proves (2.A.9).

Theorem 2.A.9. Let (Xi, Yi), i = 1, 2, ..., m, be independent pairs of random variables such that Xi ≤mrl Yi, i = 1, 2, ..., m. If Xi, Yi, i = 1, 2, ..., m, are all IFR, then

\[ \sum_{i=1}^m X_i \le_{\rm mrl} \sum_{i=1}^m Y_i. \]

Proof. Repeated application of (2.A.9), using the closure property of IFR under convolution, yields the desired result.

Another interesting lemma is stated next. Recall that a random variable X is said to be (or to have) decreasing mean residual life (DMRL) if m(t) is decreasing in t.

Lemma 2.A.10. If the random variables X and Y are such that X ≤hr Y and if Z is a DMRL random variable independent of X and Y, then X + Z ≤mrl Y + Z.

Proof. Integrating the identity in the proof of Lemma 1.B.3, we obtain that, for s ≤ t, one has

\[
\begin{aligned}
&\int_{x=s}^\infty \bar F_{X+Z}(x)\,dx \int_{y=t}^\infty \bar F_{Y+Z}(y)\,dy - \int_{x=t}^\infty \bar F_{X+Z}(x)\,dx \int_{y=s}^\infty \bar F_{Y+Z}(y)\,dy \\
&\quad = \int_v \int_{u \ge v} \big[ \bar F_X(u) f_Y(v) - f_X(v) \bar F_Y(u) \big] \\
&\qquad \times \Big[ \int_{y=t}^\infty \bar F_Z(y - v)\,dy \cdot \bar F_Z(s - u) - \int_{x=s}^\infty \bar F_Z(x - v)\,dx \cdot \bar F_Z(t - u) \Big]\,du\,dv.
\end{aligned}
\]

The result now follows from the assumptions.

It should be pointed out that a theorem such as Theorem 2.A.9 cannot be obtained from Lemma 2.A.10. The reason is that the inductive argument used to prove Theorem 2.A.9 does not have an analog based on Lemma 2.A.10.

Theorem 2.A.11. Let X be a DMRL random variable, and let Z be a nonnegative random variable independent of X. Then X ≤mrl X + Z.

Proof. Let $F_X$, $F_Z$, and $F_{X+Z}$ denote the distribution functions of the corresponding random variables, and let $\bar F_X$ and $\bar F_{X+Z}$ denote the corresponding survival functions. Then, for any $t \in \mathbb R$ we have

\[
\begin{aligned}
\bar F_X(t) \int_t^\infty \bar F_{X+Z}(u)\,du
&= \bar F_X(t) \int_t^\infty \int_0^\infty \bar F_X(u - z)\,dF_Z(z)\,du \\
&= \bar F_X(t) \int_0^\infty \int_t^\infty \bar F_X(u - z)\,du\,dF_Z(z) \\
&= \bar F_X(t) \int_0^\infty \int_{t-z}^\infty \bar F_X(u)\,du\,dF_Z(z) \\
&\ge \int_0^\infty \bar F_X(t - z) \int_t^\infty \bar F_X(u)\,du\,dF_Z(z) \\
&= \bar F_{X+Z}(t) \int_t^\infty \bar F_X(u)\,du,
\end{aligned}
\]

where the inequality follows from the assumption that X is DMRL. The stated result now follows from (2.A.4).

A mean residual life order comparison of random sums is given in the following result.

Theorem 2.A.12. Let {Xi, i = 1, 2, ...} be a sequence of independent and identically distributed nonnegative IFR random variables. Let M and N be two discrete positive integer-valued random variables such that M ≤mrl N (in the sense of (2.A.7)), and assume that M and N are independent of the Xi's. Then

\[ \sum_{i=1}^M X_i \le_{\rm mrl} \sum_{i=1}^N X_i. \]

The mean residual life order does not have the property of being simply closed under mixtures. However, under quite strong conditions the order ≤mrl is closed under mixtures. This is shown in the next theorem which may be compared with Theorem 1.B.8.

Theorem 2.A.13. Let X, Y, and Θ be random variables such that [X | Θ = θ] ≤mrl [Y | Θ = θ'] for all θ and θ' in the support of Θ. Then X ≤mrl Y.

Proof. The proof is similar to the proof of Theorem 1.B.8. Select a θ and a θ' in the support of Θ. Let $\bar F(\cdot \mid \theta)$, $\bar G(\cdot \mid \theta)$, $\bar F(\cdot \mid \theta')$, and $\bar G(\cdot \mid \theta')$ be the survival functions of [X | Θ = θ], [Y | Θ = θ], [X | Θ = θ'], and [Y | Θ = θ'], respectively. It is sufficient to show that for α ∈ (0, 1) we have

\[
\frac{\alpha \int_t^\infty \bar F(u \mid \theta)\,du + (1 - \alpha) \int_t^\infty \bar F(u \mid \theta')\,du}{\alpha \bar F(t \mid \theta) + (1 - \alpha) \bar F(t \mid \theta')}
\le
\frac{\alpha \int_t^\infty \bar G(u \mid \theta)\,du + (1 - \alpha) \int_t^\infty \bar G(u \mid \theta')\,du}{\alpha \bar G(t \mid \theta) + (1 - \alpha) \bar G(t \mid \theta')}
\quad \text{for all } t \ge 0.
\]

The proof of this inequality is similar to the proof of (1.B.12).

An analog of Theorem 1.B.12 exists for the order ≤mrl. This is stated next.

Theorem 2.A.14. Let X and Y be two nonnegative independent random variables. Then X ≤mrl Y if, and only if, for all functions α and β such that β is nonnegative and α/β and β are increasing, one has

\[ E[\alpha^*(X)]\,E[\beta^*(Y)] \le E[\alpha^*(Y)]\,E[\beta^*(X)], \]

provided the expectations exist, where

\[ \alpha^*(x) = \int_0^x \alpha(u)\,du \quad \text{and} \quad \beta^*(x) = \int_0^x \beta(u)\,du. \]

In particular, if X ≤mrl Y, then

\[ \frac{E[Y^n]}{E[X^n]} \quad \text{is increasing in } n. \tag{2.A.10} \]
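A small exact check of (2.A.10) is sketched below for X uniform on (0, 1) and Y exponential with mean 1. Here X ≤mrl Y holds directly from (2.A.2), because the mrl function of X never exceeds the constant mrl function 1 of Y on t ≥ 0 (and equals 1/2 − t against 1 − t for t < 0). The moments E[X^n] = 1/(n + 1) and E[Y^n] = n! are standard; the function names and the choice of this pair are assumptions of the sketch.

```python
from fractions import Fraction
from math import factorial

def uniform_moment(n):
    """E[X^n] for X uniform on (0, 1)."""
    return Fraction(1, n + 1)

def exponential_moment(n):
    """E[Y^n] for Y exponential with mean 1."""
    return Fraction(factorial(n))

# Ratio E[Y^n] / E[X^n] for n = 1, ..., 7; it equals (n + 1)! here.
ratios = [exponential_moment(n) / uniform_moment(n) for n in range(1, 8)]
is_increasing = all(a <= b for a, b in zip(ratios, ratios[1:]))
print(ratios, is_increasing)
```

Exact rational arithmetic is used so that the monotonicity check is not affected by rounding.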

Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal X\}$ where $\mathcal X$ is a subset of the real line. As in Sections 1.A.3 and 1.C.3 let X(θ) denote a random variable with distribution function $G_\theta$. For any random variable Θ with support in $\mathcal X$, and with distribution function F, let us denote by X(Θ) a random variable with distribution function H given by

\[ H(y) = \int_{\mathcal X} G_\theta(y)\,dF(\theta), \quad y \in \mathbb R. \]

The following result is comparable to Theorems 1.A.6, 1.B.14, 1.B.52, and 1.C.17.

Theorem 2.A.15. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal X\}$ as above. Let Θ1 and Θ2 be two random variables with supports in $\mathcal X$ and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random variables such that Yi =st X(Θi), i = 1, 2, that is, suppose that the distribution function of Yi is given by

\[ H_i(y) = \int_{\mathcal X} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb R,\ i = 1, 2. \]

If

\[ X(\theta) \le_{\rm mrl} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{2.A.11} \]

and if

\[ \Theta_1 \le_{\rm hr} \Theta_2, \tag{2.A.12} \]

then

\[ Y_1 \le_{\rm mrl} Y_2. \tag{2.A.13} \]

The proof of Theorem 2.A.15 uses the increasing convex order, and is therefore given in Remark 4.A.29 in Chapter 4.

A Laplace transform characterization of the order ≤mrl is given next; it may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, and 1.C.25.

Theorem 2.A.16. Let X1 and X2 be two nonnegative random variables, and let Nλ(X1) and Nλ(X2) be as described in Theorem 1.A.13. Then

\[ X_1 \le_{\rm mrl} X_2 \iff N_\lambda(X_1) \le_{\rm mrl} N_\lambda(X_2) \quad \text{for all } \lambda > 0, \]

where the notation Nλ(X1) ≤mrl Nλ(X2) is in the sense of (2.A.7).

Proof. We use the notation of Theorem 1.A.13. Denote the distribution and survival functions of $X_k$ by $F_k$ and $\bar F_k$, k = 1, 2. For k = 1, 2, note that $\alpha_\lambda^{X_k}(n)$ can be written as

\[
\alpha_\lambda^{X_k}(n) = \int_0^\infty \sum_{i=n}^\infty e^{-\lambda x} \frac{(\lambda x)^i}{i!}\,dF_k(x)
= \begin{cases} 1, & n = 0, \\[4pt] \displaystyle\int_0^\infty \lambda e^{-\lambda x} \frac{(\lambda x)^{n-1}}{(n-1)!} \bar F_k(x)\,dx, & n = 1, 2, \ldots. \end{cases} \tag{2.A.14}
\]

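Therefore

\[ P[N_\lambda(X_k) = n] = \int_0^\infty e^{-\lambda x} \frac{(\lambda x)^n}{n!}\,dF_k(x), \quad n = 0, 1, 2, \ldots. \tag{2.A.15} \]

From (2.A.15) it is seen that

\[ E[N_\lambda(X_k)] = \lambda E[X_k], \quad k = 1, 2, \tag{2.A.16} \]

provided the expectations exist.

First assume that X1 ≤mrl X2. For the sake of this proof replace temporarily the notation $\alpha_\lambda^{X_1}(n)$ and $\alpha_\lambda^{X_2}(n)$ by $\alpha_{\lambda,1}(n)$ and $\alpha_{\lambda,2}(n)$, respectively. We also denote E[X1] and E[X2] by µ1 and µ2, respectively. The proof of Nλ(X1) ≤mrl Nλ(X2) will consist of showing the following three inequalities:

\[ \frac{\sum_{n=0}^\infty \alpha_{\lambda,2}(n)}{\sum_{n=0}^\infty \alpha_{\lambda,1}(n)} \le \frac{\sum_{n=1}^\infty \alpha_{\lambda,2}(n)}{\sum_{n=1}^\infty \alpha_{\lambda,1}(n)}, \tag{2.A.17} \]

\[ \frac{\sum_{n=1}^\infty \alpha_{\lambda,2}(n)}{\sum_{n=1}^\infty \alpha_{\lambda,1}(n)} \le \frac{\sum_{n=2}^\infty \alpha_{\lambda,2}(n)}{\sum_{n=2}^\infty \alpha_{\lambda,1}(n)}, \tag{2.A.18} \]

and

\[ \sum_{n=m}^\infty \alpha_{\lambda,k}(n) \quad \text{is TP}_2 \text{ in } k \in \{1, 2\} \text{ and } m \ge 2. \tag{2.A.19} \]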

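In order to prove (2.A.17) note that from (2.A.16) it follows that

\[ \sum_{n=0}^\infty \alpha_{\lambda,k}(n) = 1 + \lambda\mu_k, \quad k = 1, 2, \]

and

\[ \sum_{n=1}^\infty \alpha_{\lambda,k}(n) = \lambda\mu_k, \quad k = 1, 2. \tag{2.A.20} \]

But since X1 ≤mrl X2 implies that µ1 ≤ µ2 it follows that

\[ \frac{1 + \lambda\mu_2}{1 + \lambda\mu_1} \le \frac{\lambda\mu_2}{\lambda\mu_1}, \]

and (2.A.17) is obtained. Next notice that (2.A.18) is equivalent to

\[ \frac{\alpha_{\lambda,2}(1)}{\alpha_{\lambda,1}(1)} \le \frac{\sum_{n=1}^\infty \alpha_{\lambda,2}(n)}{\sum_{n=1}^\infty \alpha_{\lambda,1}(n)}. \tag{2.A.21} \]

Since $\sum_{n=1}^\infty \alpha_{\lambda,k}(n) = \lambda\mu_k$, k = 1, 2, and

\[ \alpha_{\lambda,k}(1) = \int_0^\infty \lambda e^{-\lambda x} \bar F_k(x)\,dx = \lambda \Big[ \mu_k - \int_0^\infty \lambda e^{-\lambda x} \int_x^\infty \bar F_k(u)\,du\,dx \Big], \quad k = 1, 2, \]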

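it follows that (2.A.21) is the same as

\[ \mu_1 \int_0^\infty \lambda e^{-\lambda x} \int_x^\infty \bar F_2(u)\,du\,dx - \mu_2 \int_0^\infty \lambda e^{-\lambda x} \int_x^\infty \bar F_1(u)\,du\,dx \ge 0. \tag{2.A.22} \]

Rewriting the left-hand side of (2.A.22) we see that

\[
\begin{aligned}
&\int_0^\infty \lambda e^{-\lambda x} \Big[ \mu_1 \int_x^\infty \bar F_2(u)\,du - \mu_2 \int_x^\infty \bar F_1(u)\,du \Big]\,dx \\
&\quad = \int_0^\infty \lambda e^{-\lambda x} \Big[ \int_0^\infty \bar F_1(u)\,du \int_x^\infty \bar F_2(u)\,du - \int_0^\infty \bar F_2(u)\,du \int_x^\infty \bar F_1(u)\,du \Big]\,dx \\
&\quad \ge 0,
\end{aligned}
\]

where the inequality follows from the TP2-ness of $\int_x^\infty \bar F_k(u)\,du$ in k = 1, 2, and x ≥ 0 (see (2.A.3)). This proves (2.A.22), and hence (2.A.18).

Finally, in order to prove (2.A.19), notice, using a straightforward computation, that, for m ≥ 2,

\[ \sum_{n=m}^\infty \alpha_{\lambda,k}(n) = \int_0^\infty \lambda^2 e^{-\lambda x} \frac{(\lambda x)^{m-2}}{(m-2)!} \int_x^\infty \bar F_k(u)\,du\,dx. \tag{2.A.23} \]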

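By assumption, $\int_x^\infty \bar F_k(u)\,du$ is TP2 in k ∈ {1, 2} and x ≥ 0. Furthermore, $\lambda^2 e^{-\lambda x} \frac{(\lambda x)^{m-2}}{(m-2)!}$ is TP2 in m ≥ 2 and x ≥ 0. Thus, it follows that $\sum_{n=m}^\infty \alpha_{\lambda,k}(n)$ is TP2 in k ∈ {1, 2} and m ≥ 2, and this establishes (2.A.19).

Now suppose that Nλ(X1) ≤mrl Nλ(X2) for all λ > 0. Then

\[ \frac{\sum_{n=m}^\infty \alpha_{\lambda,1}(n)}{\alpha_{\lambda,1}(m)} \le \frac{\sum_{n=m}^\infty \alpha_{\lambda,2}(n)}{\alpha_{\lambda,2}(m)}, \quad m = 0, 1, 2, \ldots. \]

For m ≥ 2, by (2.A.23) and (2.A.14),

\[
\frac{\int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-2}}{(m-2)!} \big[ \int_u^\infty \bar F_1(x)\,dx \big]\,du}{\int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-1}}{(m-1)!} \bar F_1(u)\,du}
\le
\frac{\int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-2}}{(m-2)!} \big[ \int_u^\infty \bar F_2(x)\,dx \big]\,du}{\int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-1}}{(m-1)!} \bar F_2(u)\,du}. \tag{2.A.24}
\]

For a fixed y > 0, define λ = (m − 1)/y. Letting m → ∞ (λ → ∞), we have

\[ \int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-2}}{(m-2)!} \Big[ \int_u^\infty \bar F_k(x)\,dx \Big]\,du \to \int_y^\infty \bar F_k(x)\,dx, \]

and

\[ \int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-1}}{(m-1)!} \bar F_k(u)\,du \to \bar F_k(y), \quad k = 1, 2, \]

as long as y is a continuity point of $\bar F_1(x)$ and $\bar F_2(x)$. For such y's, (2.A.24) gives us

\[ \frac{\int_y^\infty \bar F_1(x)\,dx}{\bar F_1(y)} \le \frac{\int_y^\infty \bar F_2(x)\,dx}{\bar F_2(y)}. \]

It follows that X1 ≤mrl X2 since the set of continuity points of $\bar F_1(x)$ and $\bar F_2(x)$ is dense in the set of positive real numbers.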
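To make the objects in this proof concrete, the sketch below treats the special case in which X is exponential with mean µ. Integrating (2.A.15) against the exponential density gives a geometric law on {0, 1, ...}, so that α(n) = P{Nλ(X) ≥ n} = q^n with q = λµ/(1 + λµ); the discrete mrl function of (2.A.7) is then constant and equal to λµ, in agreement with (2.A.16). The function names, the truncation level `n_max`, and the parameter values are assumptions of this sketch.

```python
def tail_probs(lam, mean, n_max):
    """alpha(n) = P{N_lam(X) >= n}, n = 0..n_max, for exponential X.

    For exponential X with the given mean the mixed Poisson count is
    geometric on {0, 1, ...}, so alpha(n) = q**n with
    q = lam * mean / (1 + lam * mean).
    """
    q = lam * mean / (1.0 + lam * mean)
    return [q ** n for n in range(n_max + 1)]

def discrete_mrl(alpha, n):
    """m(n) = sum_{j >= n+1} alpha(j) / alpha(n), truncated at len(alpha)."""
    return sum(alpha[n + 1:]) / alpha[n]

lam = 1.5
alpha1 = tail_probs(lam, mean=1.0, n_max=2000)  # X1 exponential, mean 1
alpha2 = tail_probs(lam, mean=2.0, n_max=2000)  # X2 exponential, mean 2
m1 = [discrete_mrl(alpha1, n) for n in range(5)]
m2 = [discrete_mrl(alpha2, n) for n in range(5)]
print(m1, m2)  # approximately lam * 1.0 and lam * 2.0 at every n
```

Since X1 ≤mrl X2 for these two exponentials, the comparison m1(n) ≤ m2(n) for every n is the discrete ordering asserted by the theorem in this special case.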

An analog of Theorem 1.B.21 is the following result.

Theorem 2.A.17. Let X be a nonnegative DMRL random variable, and let a ≤ 1 be a positive constant. Then aX ≤mrl X.

Proof. It is easy to verify that the mean residual life function of aX is given by am(t/a), for all t, where m is the mean residual life function of X. Now

\[ a\,m\Big(\frac{t}{a}\Big) \le m\Big(\frac{t}{a}\Big) \le m(t) \quad \text{for all } t, \]

where the first inequality follows from a ∈ (0, 1] and the second inequality follows from the assumption that X is DMRL. The proof now follows from (2.A.2).
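The scaling formula a·m(t/a) from this proof can be tried on the Uniform(0, 1) lifetime, which is DMRL with m(t) = (1 − t)/2 on [0, 1). The following sketch checks the inequality a·m(t/a) ≤ m(t) on a grid of nonnegative t; the value a = 0.6, the grid, and the function names are assumptions made for illustration.

```python
def unif_mrl(t):
    """mrl of Uniform(0, 1) for t >= 0: (1 - t) / 2 for t < 1, else 0."""
    return (1.0 - t) / 2.0 if t < 1.0 else 0.0

def scaled_mrl(m, a, t):
    """mrl function of aX evaluated at t, namely a * m(t / a)."""
    return a * m(t / a)

a = 0.6
grid = [i / 100.0 for i in range(0, 120)]
# Small tolerance guards against floating-point rounding in the comparison.
holds = all(scaled_mrl(unif_mrl, a, t) <= unif_mrl(t) + 1e-12 for t in grid)
print(holds)
```

Here aX is uniform on (0, a), so its mrl function is (a − t)/2 on [0, a), which lies below (1 − t)/2 as the theorem predicts.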

In the next result it is shown that a random variable, whose distribution is the mixture of two distributions of mean residual life ordered random variables, is bounded from below and from above, in the mean residual life order sense, by these two random variables. Theorem 2.A.18. Let X and Y be two random variables with distribution functions F and G, respectively. Let W be a random variable with the distribution function pF + (1 − p)G for some p ∈ (0, 1). If X ≤mrl Y , then X ≤mrl W ≤mrl Y . The proof of Theorem 2.A.18 is similar to the proof of Theorem 1.B.22, but it uses (2.A.3) instead of (1.B.3). We omit the details. The following result is proven in Remark 4.A.25 of Section 4.A.3. Theorem 2.A.19. Let X and Y be two random variables. If X ≤mrl Y , then φ(X) ≤mrl φ(Y ) for every increasing convex function φ. Analogous to the result in Remark 1.A.18, it can be shown that the set of all distribution functions on R+ with ﬁnite means is a lattice with respect to the order ≤mrl . Let X1 , X2 , . . . , Xm be random variables, and let X(k:m) denote the corresponding kth order statistic, k = 1, 2, . . . , m. Theorem 2.A.20. Let X1 , X2 , . . . , Xm be m independent random variables. If Xi ≤mrl Xm , i = 1, 2, . . . , m − 1, then X(m−1:m−1) ≤mrl X(m:m) .

Let X1, X2, ..., Xm be nonnegative random variables and let $U_{(i:m)} = X_{(i:m)} - X_{(i-1:m)}$ denote the corresponding spacings, i = 1, 2, ..., m (where $U_{(1:m)} = X_{(1:m)}$). Similarly, let Y1, Y2, ..., Yn be nonnegative random variables and let $V_{(i:n)}$ denote the corresponding spacings, i = 1, 2, ..., n.

Theorem 2.A.21. For positive integers m and n, let X1, X2, ..., Xm be independent identically distributed nonnegative random variables, and let Y1, Y2, ..., Yn be other independent identically distributed nonnegative random variables. If X1 ≤mrl Y1, and if X1 is IMRL and Y1 is DMRL, then

\[ (m - j + 1)\,U_{(j:m)} \le_{\rm mrl} (n - i + 1)\,V_{(i:n)} \quad \text{for } j \le m \text{ and } i \le n. \]

The following example may be compared to Examples 1.B.24, 1.C.48, 3.B.38, 4.B.14, 6.B.41, 6.D.8, 6.E.13, and 7.B.13.

Example 2.A.22. Let X and Y be two absolutely continuous nonnegative random variables with survival functions $\bar F$ and $\bar G$ and density functions f and g, respectively. Denote $\Lambda_1 = -\log \bar F$, $\Lambda_2 = -\log \bar G$, and $\lambda_i = \Lambda_i'$, i = 1, 2. Consider two nonhomogeneous Poisson processes N1 = {N1(t), t ≥ 0} and N2 = {N2(t), t ≥ 0} with mean functions Λ1 and Λ2 (see Example 1.B.13), respectively. Let $T_{i,1}, T_{i,2}, \ldots$ be the successive epoch times of process $N_i$, and let $X_{i,n} \equiv T_{i,n} - T_{i,n-1}$, n ≥ 1 (where $T_{i,0} \equiv 0$), be the inter-epoch times of the process $N_i$, i = 1, 2. Note that $X =_{\rm st} X_{1,1}$ and $Y =_{\rm st} X_{2,1}$. It turns out that, under some conditions, the mean residual life ordering of the first two inter-epoch times implies the mean residual life ordering of all the corresponding later inter-epoch times. Explicitly, it will be shown below that if X ≤mrl Y, if X and Y are IMRL, and if (1.B.25) holds, then $X_{1,n} \le_{\rm mrl} X_{2,n}$ for each n ≥ 1.

For the purpose of this proof we denote $\bar F$ by $\bar F_1$ and $\bar G$ by $\bar F_2$. The stated result is obvious for n = 1. So let us fix n ≥ 2. The survival function $\bar G_{i,n}$ of $X_{i,n}$, i = 1, 2, is given in (1.B.26). From (2.A.3) it is seen that the stated result is equivalent to

\[ \int_t^\infty \bar G_{i,n}(x)\,dx \quad \text{is TP}_2 \text{ in } (i, t); \]

that is, to

\[ \int_{s=0}^\infty \lambda_i(s) \frac{\Lambda_i^{n-2}(s)}{(n-2)!} \int_{u=s+t}^\infty \bar F_i(u)\,du\,ds \quad \text{is TP}_2 \text{ in } (i, t). \tag{2.A.25} \]

Now, from Example 1.B.24 we know that (1.B.25) implies that $\lambda_i(s) \frac{\Lambda_i^{n-2}(s)}{(n-2)!}$ is TP2 in (i, s). The assumption F1 ≤mrl F2 means that

\[ \int_{u=s+t}^\infty \bar F_i(u)\,du \quad \text{is TP}_2 \text{ in } (i, s) \text{ and in } (i, t). \]

Finally, the assumption that $F_i$ is IMRL means that

\[ \int_{u=s+t}^\infty \bar F_i(u)\,du \quad \text{is TP}_2 \text{ in } (s, t). \]

u=s+t

Thus (2.A.25) follows from Theorem 5.1 on page 123 of Karlin [275].

2.A.4 A property in reliability theory

The order $\le_{\mathrm{mrl}}$ can be used to characterize DMRL random variables. As in Section 1.A.3, $[Z \mid A]$ denotes any random variable that has as its distribution the conditional distribution of $Z$ given $A$.

Theorem 2.A.23. The random variable $X$ is DMRL if, and only if, any one of the following equivalent conditions holds:
(i) $[X - t \mid X > t] \ge_{\mathrm{mrl}} [X - t' \mid X > t']$ whenever $t \le t'$.
(ii) $X \ge_{\mathrm{mrl}} [X - t \mid X > t]$ for all $t \ge 0$ (when $X$ is a nonnegative random variable).
(iii) $X + t \le_{\mathrm{mrl}} X + t'$ whenever $t \le t'$.

The proofs of all these statements are trivial and are thus omitted. Other characterizations of DMRL and IMRL random variables, by means of other stochastic orders, can be found in Theorems 2.B.17, 3.A.56, 3.C.13, and 4.A.51. A multivariate extension of parts (i) and (ii) of Theorem 2.A.23 is given in Section 6.F.3.

An interesting application of part (iii) of Theorem 2.A.23 is the following corollary. Its proof consists of a combination of Theorem 2.A.23(iii) with Lemma 2.A.8 (or, alternatively, a combination of Theorem 2.A.23(iii), Theorem 1.B.38(iii), and Lemma 2.A.10).

Corollary 2.A.24. Let $X$ be a DMRL random variable and let $Y$ be an IFR random variable. If $X$ and $Y$ are independent, then $X + Y$ is DMRL.

2.B The Harmonic Mean Residual Life Order

2.B.1 Definition

Let $X$ and $Y$ be two nonnegative random variables with mrl functions $m$ and $l$, respectively, and suppose that the harmonic averages of $m$ and $l$ are comparable as follows:

$$\left[\frac{1}{x}\int_0^x \frac{1}{m(u)}\,du\right]^{-1} \le \left[\frac{1}{x}\int_0^x \frac{1}{l(u)}\,du\right]^{-1} \quad \text{for all } x > 0. \tag{2.B.1}$$

Then X is said to be smaller than Y in the harmonic mean residual life order (denoted as X ≤hmrl Y ).


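Notice that

$$\frac{1}{m(u)} = \frac{\bar F(u)}{\int_u^\infty \bar F(v)\,dv} = -\frac{d}{du}\log \int_u^\infty \bar F(v)\,dv.$$

Therefore

$$\int_0^x \frac{1}{m(u)}\,du = \log \frac{EX}{\int_x^\infty \bar F(u)\,du}.$$

Similarly

$$\int_0^x \frac{1}{l(u)}\,du = \log \frac{EY}{\int_x^\infty \bar G(u)\,du}.$$

Thus it is seen that (2.B.1) holds if, and only if,

$$\frac{\int_x^\infty \bar F(u)\,du}{EX} \le \frac{\int_x^\infty \bar G(u)\,du}{EY} \quad \text{for all } x \ge 0. \tag{2.B.2}$$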
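The identity behind the passage from (2.B.1) to (2.B.2) can be checked numerically. The sketch below is only an illustration under an assumed example: it takes $X$ uniform on $(0, b)$, for which $m(u) = (b-u)/2$ and $\int_x^\infty \bar F(u)\,du = (b-x)^2/(2b)$. The function names and the midpoint-rule integration are choices made here and are not part of the text.

```python
import math

def mrl_uniform(b, u):
    """Mean residual life of Uniform(0, b) at a point u in [0, b)."""
    return (b - u) / 2.0

def tail_integral_uniform(b, x):
    """Integral of the survival function of Uniform(0, b) over (x, b)."""
    return (b - x) ** 2 / (2.0 * b)

def integral_inverse_mrl(b, x, steps=20000):
    """Midpoint-rule approximation of the integral of 1/m(u) over (0, x)."""
    h = x / steps
    return sum(h / mrl_uniform(b, (k + 0.5) * h) for k in range(steps))

def harmonic_average_mrl(b, x):
    """Left-hand side of (2.B.1) for Uniform(0, b)."""
    return x / integral_inverse_mrl(b, x)

# Check: int_0^x du/m(u) = log( EX / int_x^inf F_bar(u) du ) with EX = b/2.
b, x = 2.0, 0.7
lhs = integral_inverse_mrl(b, x)
rhs = math.log((b / 2.0) / tail_integral_uniform(b, x))
```

In this assumed example, Uniform(0, 1) is smaller than Uniform(0, 2) both in the sense of (2.B.1) and in the sense of (2.B.2), as the equivalence requires.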

For discrete random variables that take on values in $\mathbb{N}_+$ the definition of $\le_{\mathrm{hmrl}}$ should be modified. Let $X$ and $Y$ be two such random variables. We denote $X \le_{\mathrm{hmrl}} Y$ if

$$\frac{\sum_{j=n}^\infty P\{X \ge j\}}{E[X]} \le \frac{\sum_{j=n}^\infty P\{Y \ge j\}}{E[Y]}, \quad n = 1, 2, \dots. \tag{2.B.3}$$

2.B.2 The relation between the harmonic mean residual life and some other stochastic orders

Since the harmonic averages of $m$ and $l$ are increasing functionals of $m$ and $l$, respectively, it follows that

$$X \le_{\mathrm{mrl}} Y \implies X \le_{\mathrm{hmrl}} Y.$$

The order $\le_{\mathrm{hmrl}}$ is closely related to the order $\le_{\mathrm{icx}}$ which is studied in Section 4.A. The reader may find it helpful to browse over that section now, since some of the ideas that are explained there are used below. Note that both (2.B.2) and (2.B.3) are equivalent to

$$\frac{E[(X - t)_+]}{E[X]} \le \frac{E[(Y - t)_+]}{E[Y]} \quad \text{for all } t \ge 0, \tag{2.B.4}$$

and from (2.B.4) it follows that $X \le_{\mathrm{hmrl}} Y$ if, and only if,

$$\frac{E[\phi(X)]}{E[X]} \le \frac{E[\phi(Y)]}{E[Y]} \quad \text{for all increasing convex functions } \phi : [0, \infty) \to \mathbb{R}, \tag{2.B.5}$$

such that the expectations exist. It is worthwhile to note that condition (2.B.4) uses the expectations $E[(X - t)_+]$ and $E[(Y - t)_+]$ as (2.A.5), (3.A.5) in Chapter 3, and (4.A.4) in Chapter 4 do. In Chapter 4, where the order $\le_{\mathrm{icx}}$ is studied, we will use (2.B.4) in order to derive a relationship between the orders $\le_{\mathrm{hmrl}}$ and $\le_{\mathrm{icx}}$ (see Theorem 4.A.28).

Neither of the orders $\le_{\mathrm{st}}$ and $\le_{\mathrm{hmrl}}$ implies the other; counterexamples can be found in the literature. Letting $x \to 0$ in (2.B.1) we obtain $m(0) \le l(0)$, that is,

$$X \le_{\mathrm{hmrl}} Y \implies E[X \mid X > 0] \le E[Y \mid Y > 0].$$

Thus, when $X$ and $Y$ are positive almost surely, then

$$X \le_{\mathrm{hmrl}} Y \implies EX \le EY. \tag{2.B.6}$$
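For finite discrete distributions, condition (2.B.4) can be verified exactly. The following is a minimal sketch, assuming distributions are given as dictionaries mapping nonnegative values to probabilities; the helper names and the two example distributions are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def stop_loss(pmf, t):
    """E[(X - t)+] for a finite discrete distribution {value: probability}."""
    return sum(p * max(x - t, 0) for x, p in pmf.items())

def mean(pmf):
    """E[X] for a finite discrete distribution {value: probability}."""
    return sum(p * x for x, p in pmf.items())

def is_hmrl_smaller(pmf_x, pmf_y):
    """Check (2.B.4) exactly for nonnegative finite discrete distributions.

    Both normalized stop-loss transforms are piecewise linear in t with kinks
    only at support points, so comparing them at t = 0 and at every support
    point is enough.
    """
    grid = {0} | set(pmf_x) | set(pmf_y)
    ex, ey = mean(pmf_x), mean(pmf_y)
    return all(stop_loss(pmf_x, t) / ex <= stop_loss(pmf_y, t) / ey for t in grid)

# Hypothetical example: X uniform on {1, 2}, Y uniform on {1, 2, 3, 4}.
X = {k: Fraction(1, 2) for k in (1, 2)}
Y = {k: Fraction(1, 4) for k in (1, 2, 3, 4)}
```

Here both variables are almost surely positive, so the ordering of the means in (2.B.6) applies as well.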

If $EX = EY$, then

$$X \le_{\mathrm{hmrl}} Y \iff X \le_{\mathrm{cx}} Y, \tag{2.B.7}$$

where the order $\le_{\mathrm{cx}}$ is studied in Section 3.A (see (3.A.7)). Thus, from (3.A.4) it follows that if $X \le_{\mathrm{hmrl}} Y$ and $EX = EY$, then $\mathrm{Var}[X] \le \mathrm{Var}[Y]$. Under the proper condition, even if $X$ and $Y$ do not have the same mean, one can still get the variance inequality; this is shown in the next result.

Theorem 2.B.1. Let $X$ and $Y$ be two almost surely positive random variables with finite second moments. If $X \le_{\mathrm{hmrl}} Y$, and if $Y$ is NWUE, then $\mathrm{Var}[X] \le \mathrm{Var}[Y]$.

Proof. From (2.B.5) we get

$$\frac{E[X^2]}{E[X]} \le \frac{E[Y^2]}{E[Y]}. \tag{2.B.8}$$

From Barlow and Proschan [36, page 187] it is seen that $\mathrm{Var}[Y] \ge \{E[Y]\}^2$, since $Y$ is NWUE. Thus, using (2.B.6), we see that $\mathrm{Var}[Y] \ge E[Y]E[X]$. Therefore

$$\begin{aligned}
\mathrm{Var}[Y] &\ge \frac{E[X]}{E[Y]}\,\mathrm{Var}[Y] + \{E[Y] - E[X]\}E[X] \\
&= \frac{E[X]}{E[Y]} \cdot E[Y^2] - \{E[X]\}^2 \\
&\ge E[X^2] - \{E[X]\}^2 = \mathrm{Var}[X],
\end{aligned}$$

where the last inequality follows from (2.B.8).

The harmonic mean residual life order can be characterized by means of the usual stochastic order and the appropriate equilibrium age variables. Recall from (1.A.20) that for nonnegative random variables X and Y with ﬁnite means we denote by AX and AY the corresponding asymptotic equilibrium ages. The following result follows at once from (1.A.1) and (2.B.2). It may be contrasted with Theorems 1.C.13 and 2.A.4.


Theorem 2.B.2. For nonnegative random variables $X$ and $Y$ with finite means we have $X \le_{\mathrm{hmrl}} Y$ if, and only if, $A_X \le_{\mathrm{st}} A_Y$.

In the next theorem the order $\le_{\mathrm{hmrl}}$ is characterized by ordering two related random variables in the sense of the usual stochastic order. Let $X$ and $Y$ be two nonnegative random variables with finite means and suppose that $X \le_{\mathrm{st}} Y$ and that $EX < EY$. Let $F$ and $G$ be the distribution functions of $X$ and of $Y$, respectively. Define the random variable $Z_{X,Y}$ as the random variable that has the density function $h$ given by (1.C.7), as in Theorem 1.C.14; see also Theorem 2.A.5.

Theorem 2.B.3. Let $X$ and $Y$ be two nonnegative random variables with finite means such that $X \le_{\mathrm{st}} Y$ and such that $EY > EX > 0$. Then

$$X \le_{\mathrm{hmrl}} Y \iff A_Y \le_{\mathrm{st}} Z_{X,Y} \iff A_X \le_{\mathrm{st}} Z_{X,Y},$$

where $Z_{X,Y}$ has the density function given in (1.C.7).

Proof. It is easy to see that (here $\bar H$ is the survival function of $Z_{X,Y}$, $\bar G_e$ is the survival function of $A_Y$, and $\bar F_e$ is as in (1.A.20))

$$\bar H(x) - \bar G_e(x) = \frac{EX}{EY - EX}\left[\bar G_e(x) - \bar F_e(x)\right], \quad x \ge 0.$$

Thus the first stated equivalence follows from Theorem 2.B.2. The proof of the second equivalence is similar.
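The chain of comparisons in Theorems 2.B.2 and 2.B.3 can be illustrated on a simple pair. The sketch below assumes $X \sim$ Uniform(0, 1) and $Y \sim$ Uniform(0, 2), and it assumes that the density in (1.C.7), which is not reproduced in this section, is $h(x) = (\bar G(x) - \bar F(x))/(EY - EX)$; under that assumption the survival function of $Z_{X,Y}$ is $(EY\,\bar G_e - EX\,\bar F_e)/(EY - EX)$. All function names are hypothetical.

```python
def eq_survival_uniform(b, x):
    """Survival function of the equilibrium age of Uniform(0, b) at x >= 0."""
    return (1.0 - x / b) ** 2 if x < b else 0.0

def z_survival(x, bx=1.0, by=2.0):
    """Survival function of Z_{X,Y} for X ~ Uniform(0, bx), Y ~ Uniform(0, by).

    Assumes bx < by and the density (G_bar - F_bar) / (EY - EX) for Z_{X,Y}.
    """
    ex, ey = bx / 2.0, by / 2.0
    return (ey * eq_survival_uniform(by, x) - ex * eq_survival_uniform(bx, x)) / (ey - ex)

grid = [k / 8.0 for k in range(17)]  # evaluation points in [0, 2]

# A_X <=st A_Y <=st Z_{X,Y} means the survival functions are ordered pointwise.
chain_holds = all(
    eq_survival_uniform(1.0, x) <= eq_survival_uniform(2.0, x) <= z_survival(x)
    for x in grid
)
```

On this grid the survival functions of $A_X$, $A_Y$, and $Z_{X,Y}$ are ordered pointwise, which is what the two equivalences predict when $X \le_{\mathrm{hmrl}} Y$.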

The order $\le_{\mathrm{hmrl}}$ can characterize the order $\le_{\mathrm{mrl}}$ as follows.

Theorem 2.B.4. Let $X$ and $Y$ be two nonnegative random variables with finite means. Then $X \le_{\mathrm{mrl}} Y$ if, and only if, $[X - t \mid X > t] \le_{\mathrm{hmrl}} [Y - t \mid Y > t]$ for all $t \ge 0$.

The proof of Theorem 2.B.4 consists of applying (2.B.2) to $[X - t \mid X > t]$ and $[Y - t \mid Y > t]$, for each $t \ge 0$, and then showing that the resulting inequality is equivalent to (2.A.3). We omit the details.

2.B.3 Some closure properties

Under the proper conditions, the order $\le_{\mathrm{hmrl}}$ is closed under the operation of convolution. First we prove the following lemma. Recall that a nonnegative random variable $X$ with a finite mean is called NBUE (new better than used in expectation) if $E[X - t \mid X > t] \le E[X]$ for all $t > 0$. Note that a nonnegative NBUE random variable must be almost surely positive.

Lemma 2.B.5. If the two almost surely positive random variables $X$ and $Y$ are such that $X \le_{\mathrm{hmrl}} Y$, and if $Z$ is an NBUE nonnegative random variable independent of $X$ and $Y$, then $X + Z \le_{\mathrm{hmrl}} Y + Z$.


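Proof. Let $F$, $G$, and $H$ [$\bar F$, $\bar G$, and $\bar H$] be the distribution [survival] functions corresponding to $X$, $Y$, and $Z$, respectively. The corresponding equilibrium age distribution [survival] functions will be denoted by $F_e$, $G_e$, and $H_e$ [$\bar F_e$, $\bar G_e$, and $\bar H_e$]. Let $A_X$, $A_Y$, $A_Z$, $A_{X+Z}$, and $A_{Y+Z}$ denote the asymptotic equilibrium ages corresponding to $X$, $Y$, $Z$, $X + Z$, and $Y + Z$, respectively. Now compute

$$\begin{aligned}
P\{A_{X+Z} > t\} &= \frac{1}{E[X + Z]} \int_{v=t}^\infty P\{X + Z > v\}\,dv \\
&= \frac{1}{EX + EZ} \int_{v=t}^\infty \int_{u=0}^\infty \bar F(v - u)\,dH(u)\,dv \\
&= \frac{1}{EX + EZ} \int_{u=0}^\infty \int_{v=t}^\infty \bar F(v - u)\,dv\,dH(u) \\
&= \frac{1}{EX + EZ} \int_{u=0}^\infty \int_{v=t-u}^\infty \bar F(v)\,dv\,dH(u) \\
&= \frac{1}{EX + EZ} \left[ \int_{u=0}^t \int_{v=t-u}^\infty \bar F(v)\,dv\,dH(u) + \int_{u=t}^\infty \int_{v=0}^\infty \bar F(v)\,dv\,dH(u) + \int_{u=t}^\infty \int_{v=t-u}^0 dv\,dH(u) \right] \\
&= \frac{1}{EX + EZ} \left[ EX \int_0^t \bar F_e(t - u)\,dH(u) + EX \cdot \bar H(t) + \int_t^\infty \bar H(u)\,du \right] \\
&= \frac{1}{EX + EZ} \left[ EX \cdot P\{A_X + Z > t\} + EZ \cdot \bar H_e(t) \right],
\end{aligned}$$

where $A_X$ and $Z$ are taken to be independent in the above expression. Now, since $Z$ is NBUE we have that $Z \ge_{\mathrm{st}} A_Z$. Therefore

$$P\{A_X + Z > t\} \ge P\{A_X + A_Z > t\} \ge P\{A_Z > t\} = \bar H_e(t). \tag{2.B.9}$$

Now notice that

$$\begin{aligned}
P\{A_{X+Z} > t\} &= \frac{1}{EX + EZ} \left[ EX \cdot P\{A_X + Z > t\} + EZ \cdot \bar H_e(t) \right] \\
&\le \frac{1}{EY + EZ} \left[ EY \cdot P\{A_X + Z > t\} + EZ \cdot \bar H_e(t) \right] \\
&\le \frac{1}{EY + EZ} \left[ EY \cdot P\{A_Y + Z > t\} + EZ \cdot \bar H_e(t) \right] \\
&= P\{A_{Y+Z} > t\}
\end{aligned}$$

($A_Y$ and $Z$ are taken to be independent in the above), where the first inequality follows from (2.B.6) and (2.B.9), and the second inequality follows from Theorem 2.B.2. The result now follows from Theorem 2.B.2.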
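As a consistency check on the decomposition of $P\{A_{X+Z} > t\}$ obtained in the proof, consider the special case, chosen here purely for illustration and not taken from the text, in which $X$ and $Z$ are independent exponential random variables with mean 1. Then $A_X =_{\mathrm{st}} X$, $\bar H_e(t) = e^{-t}$, and $X + Z$ has survival function $(1 + v)e^{-v}$, so both sides can be computed in closed form:

$$\begin{aligned}
P\{A_{X+Z} > t\} &= \frac{1}{2}\int_t^\infty (1 + v)e^{-v}\,dv = \frac{1}{2}(2 + t)e^{-t}, \\
\frac{1}{EX + EZ}\left[ EX \cdot P\{A_X + Z > t\} + EZ \cdot \bar H_e(t) \right] &= \frac{1}{2}\left[ (1 + t)e^{-t} + e^{-t} \right] = \frac{1}{2}(2 + t)e^{-t}.
\end{aligned}$$

The two expressions agree for every $t \ge 0$, as the decomposition requires.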


Repeated application of Lemma 2.B.5, using the closure property of NBUE under convolution, and noting that every NBUE random variable is almost surely positive, yields the following result.

Theorem 2.B.6. Let $(X_i, Y_i)$, $i = 1, 2, \dots, m$, be independent pairs of nonnegative random variables such that $X_i \le_{\mathrm{hmrl}} Y_i$, $i = 1, 2, \dots, m$. If $X_i$, $Y_i$, $i = 1, 2, \dots, m$, are all NBUE, then

$$\sum_{i=1}^m X_i \le_{\mathrm{hmrl}} \sum_{i=1}^m Y_i.$$
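Closure under convolution can be checked exactly on a small discrete example. The sketch below is an illustration only: the two distributions (uniform on {1, 2} and uniform on {1, 2, 3, 4}, both NBUE and ordered in the hmrl sense) and the helper names are assumptions made here, and the comparison uses the stop-loss form (2.B.4) at the support points, where the piecewise linear transforms have their kinks.

```python
from fractions import Fraction
from itertools import product

def convolve(pmf_a, pmf_b):
    """Distribution of the sum of two independent finite discrete variables."""
    out = {}
    for (a, pa), (b, pb) in product(pmf_a.items(), pmf_b.items()):
        out[a + b] = out.get(a + b, 0) + pa * pb
    return out

def normalized_stop_loss(pmf, t):
    """E[(X - t)+] / E[X], the quantity compared in (2.B.4)."""
    mean = sum(p * x for x, p in pmf.items())
    return sum(p * max(x - t, 0) for x, p in pmf.items()) / mean

# Hypothetical NBUE summands with X <=hmrl Y.
X = {k: Fraction(1, 2) for k in (1, 2)}
Y = {k: Fraction(1, 4) for k in (1, 2, 3, 4)}

SX, SY = convolve(X, X), convolve(Y, Y)  # X1 + X2 and Y1 + Y2
grid = sorted({0} | set(SX) | set(SY))
ordered = all(normalized_stop_loss(SX, t) <= normalized_stop_loss(SY, t) for t in grid)
```

Exact rational arithmetic via `Fraction` avoids any floating-point ambiguity when the two sides of (2.B.4) are close.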

Using Theorem 2.B.6 we can prove the following result.

Theorem 2.B.7. Let $X_1, X_2, \dots$ and $Y_1, Y_2, \dots$ each be a sequence of NBUE nonnegative independent and identically distributed random variables such that $X_i \le_{\mathrm{hmrl}} Y_i$, $i = 1, 2, \dots$. Let $M$ and $N$ be integer-valued positive random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively, such that $M \le_{\mathrm{hmrl}} N$. Then

$$\sum_{j=1}^M X_j \le_{\mathrm{hmrl}} \sum_{j=1}^N Y_j.$$

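Proof. The proof here is similar to the proof of Theorem 4.A.9. The reader may wish to look at that proof before continuing to read the present proof. From Theorem 2.B.6 and (2.B.4) it is seen that

$$\frac{1}{mE[X_1]}\,E\left[\left(\sum_{i=1}^m X_i - u\right)_+\right] \le \frac{1}{mE[Y_1]}\,E\left[\left(\sum_{i=1}^m Y_i - u\right)_+\right]$$

(all the $X_i$'s have the same mean, and also all the $Y_i$'s have the same mean). Therefore

$$\frac{E\left[\left(\sum_{i=1}^m X_i - u\right)_+\right]}{E[X_1]} \le \frac{E\left[\left(\sum_{i=1}^m Y_i - u\right)_+\right]}{E[Y_1]} \quad \text{for all } u \ge 0,\ m = 1, 2, \dots.$$

Thus

$$\begin{aligned}
\frac{E\left[\left(\sum_{i=1}^M X_i - u\right)_+\right]}{E\left[\sum_{i=1}^M X_i\right]} &= \frac{\sum_{m=1}^\infty E\left[\left(\sum_{i=1}^m X_i - u\right)_+\right] P\{M = m\}}{E[M]E[X_1]} \\
&\le \frac{\sum_{m=1}^\infty E\left[\left(\sum_{i=1}^m Y_i - u\right)_+\right] P\{M = m\}}{E[M]E[Y_1]} \\
&= \frac{E\left[\left(\sum_{i=1}^M Y_i - u\right)_+\right]}{E\left[\sum_{i=1}^M Y_i\right]}.
\end{aligned}$$

Therefore (again by (2.B.4)) we have

$$\sum_{i=1}^M X_i \le_{\mathrm{hmrl}} \sum_{i=1}^M Y_i. \tag{2.B.10}$$

Now let $\phi$ be an increasing convex function and denote $g(n) \equiv E[\phi(Y_1 + Y_2 + \cdots + Y_n)]$. In the proof of Theorem 4.A.9 it is shown that $g(n)$ is increasing and convex in $n$. Therefore, since $M \le_{\mathrm{hmrl}} N$, we have that

$$\frac{E\left[\phi\left(\sum_{i=1}^M Y_i\right)\right]}{E[M]} \le \frac{E\left[\phi\left(\sum_{i=1}^N Y_i\right)\right]}{E[N]},$$

and since the $Y_i$'s have the same mean we have that

$$\frac{E\left[\phi\left(\sum_{i=1}^M Y_i\right)\right]}{E\left[\sum_{i=1}^M Y_i\right]} = \frac{E\left[\phi\left(\sum_{i=1}^M Y_i\right)\right]}{E[M]E[Y_1]} \le \frac{E\left[\phi\left(\sum_{i=1}^N Y_i\right)\right]}{E[N]E[Y_1]} = \frac{E\left[\phi\left(\sum_{i=1}^N Y_i\right)\right]}{E\left[\sum_{i=1}^N Y_i\right]}.$$

Thus we have that

$$\sum_{i=1}^M Y_i \le_{\mathrm{hmrl}} \sum_{i=1}^N Y_i. \tag{2.B.11}$$

The inequalities (2.B.10) and (2.B.11) yield the stated result.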

A result that is related to Theorem 2.B.7 is given next. It is of interest to compare it to Theorem 1.A.5.

Theorem 2.B.8. Let $\{X_j, j = 1, 2, \dots\}$ be a sequence of nonnegative independent and identically distributed NBUE random variables, and let $M$ be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j, j = 1, 2, \dots\}$ be another sequence of nonnegative independent and identically distributed NBUE random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$'s. Suppose that for some positive integer $K$ we have

$$\sum_{i=1}^K X_i \le_{\mathrm{hmrl}} [\ge_{\mathrm{hmrl}}]\ Y_1 \quad \text{and} \quad M \le_{\mathrm{hmrl}} [\ge_{\mathrm{hmrl}}]\ KN.$$

Then

$$\sum_{j=1}^M X_j \le_{\mathrm{hmrl}} [\ge_{\mathrm{hmrl}}]\ \sum_{j=1}^N Y_j.$$

We do not give a detailed proof of Theorem 2.B.8 here since it is similar to the proof of Theorem 4.A.12 in Section 4.A.1. In order to construct a proof of Theorem 2.B.8 from the proof of Theorem 4.A.12 one just uses the equivalence (2.B.7) and one replaces the application of Theorem 4.A.9 by an application of Theorem 2.B.7.

Two other similar theorems are the following. Their proofs are similar to the proofs of Theorems 4.A.13 and 4.A.14 in Section 4.A.1.

Theorem 2.B.9. Let $\{X_j, j = 1, 2, \dots\}$ be a sequence of nonnegative independent and identically distributed NBUE random variables, and let $M$ be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j, j = 1, 2, \dots\}$ be another sequence of nonnegative independent and identically distributed NBUE random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$'s. Also, let $\{N_j, j = 1, 2, \dots\}$ be a sequence of independent random variables that are distributed as $N$. If for some positive integer $K$ we have

$$\sum_{i=1}^K X_i \le_{\mathrm{hmrl}} Y_1 \quad \text{and} \quad M \le_{\mathrm{hmrl}} \sum_{i=1}^K N_i,$$

or if we have

$$KX_1 \le_{\mathrm{hmrl}} Y_1 \quad \text{and} \quad M \le_{\mathrm{hmrl}} KN,$$

or if we have

$$KX_1 \le_{\mathrm{hmrl}} Y_1 \quad \text{and} \quad M \le_{\mathrm{hmrl}} \sum_{i=1}^K N_i,$$

then

$$\sum_{j=1}^M X_j \le_{\mathrm{hmrl}} \sum_{j=1}^N Y_j.$$

Theorem 2.B.10. Let $\{X_j, j = 1, 2, \dots\}$ be a sequence of nonnegative independent and identically distributed NBUE random variables, and let $M$ be a positive integer-valued random variable which is independent of the $X_i$'s. Let $\{Y_j, j = 1, 2, \dots\}$ be another sequence of nonnegative independent and identically distributed NBUE random variables, and let $N$ be a positive integer-valued random variable which is independent of the $Y_i$'s. If for some positive integers $K_1$ and $K_2$, such that $K_1 \le K_2$, we have

$$\sum_{i=1}^{K_1} X_i \le_{\mathrm{hmrl}} \frac{K_1}{K_2}\,Y_1 \quad \text{and} \quad M \le_{\mathrm{hmrl}} K_2 N,$$

then

$$\sum_{j=1}^M X_j \le_{\mathrm{hmrl}} \sum_{j=1}^N Y_j.$$

The harmonic mean residual life order does not have the property of being simply closed under mixtures. However, under quite strong conditions the order $\le_{\mathrm{hmrl}}$ is closed under mixtures. This is shown in the next theorem which may be compared with Theorems 1.B.8 and 2.A.13.

Theorem 2.B.11. Let $X$ and $Y$ be nonnegative random variables, and let $\Theta$ be another random variable, such that $[X \mid \Theta = \theta] \le_{\mathrm{hmrl}} [Y \mid \Theta = \theta']$ for all $\theta$ and $\theta'$ in the support of $\Theta$. Then $X \le_{\mathrm{hmrl}} Y$.


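Proof. The proof is similar to the proof of Theorem 1.B.8. Select a $\theta$ and a $\theta'$ in the support of $\Theta$. Let $\bar F(\cdot \mid \theta)$, $\bar G(\cdot \mid \theta)$, $\bar F(\cdot \mid \theta')$, and $\bar G(\cdot \mid \theta')$ be the survival functions of $[X \mid \Theta = \theta]$, $[Y \mid \Theta = \theta]$, $[X \mid \Theta = \theta']$, and $[Y \mid \Theta = \theta']$, respectively. Let $E[X \mid \theta]$, $E[Y \mid \theta]$, $E[X \mid \theta']$, and $E[Y \mid \theta']$ be the corresponding expectations. By (2.B.2) it is sufficient to show that for $\alpha \in (0, 1)$ we have

$$\frac{\alpha \int_t^\infty \bar F(u \mid \theta)\,du + (1 - \alpha) \int_t^\infty \bar F(u \mid \theta')\,du}{\alpha E[X \mid \theta] + (1 - \alpha) E[X \mid \theta']} \le \frac{\alpha \int_t^\infty \bar G(u \mid \theta)\,du + (1 - \alpha) \int_t^\infty \bar G(u \mid \theta')\,du}{\alpha E[Y \mid \theta] + (1 - \alpha) E[Y \mid \theta']} \quad \text{for all } t \ge 0. \tag{2.B.12}$$

The proof of this inequality is similar to the proof of (1.B.12).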

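Another condition under which the order $\le_{\mathrm{hmrl}}$ is closed under mixtures is given in the following theorem.

Theorem 2.B.12. Let $X$ and $Y$ be nonnegative random variables, and let $\Theta$ be another random variable, such that $[X \mid \Theta = \theta] \le_{\mathrm{hmrl}} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Furthermore, assume that

$$\frac{E[Y \mid \Theta = \theta]}{E[X \mid \Theta = \theta]} = k \quad \text{(independent of } \theta\text{)}. \tag{2.B.13}$$

Then $X \le_{\mathrm{hmrl}} Y$.

Proof. As in the proof of Theorem 2.B.11, select a $\theta$ and a $\theta'$ in the support of $\Theta$. Let $\bar F(\cdot \mid \theta)$, $\bar G(\cdot \mid \theta)$, $\bar F(\cdot \mid \theta')$, and $\bar G(\cdot \mid \theta')$ be the survival functions of $[X \mid \Theta = \theta]$, $[Y \mid \Theta = \theta]$, $[X \mid \Theta = \theta']$, and $[Y \mid \Theta = \theta']$, respectively. Let $E[X \mid \theta]$, $E[Y \mid \theta]$, $E[X \mid \theta']$, and $E[Y \mid \theta']$ be the corresponding expectations. Let $\alpha \in (0, 1)$. Note that from (2.B.13) we obtain

$$\frac{\alpha E[Y \mid \theta] + (1 - \alpha) E[Y \mid \theta']}{\alpha E[X \mid \theta] + (1 - \alpha) E[X \mid \theta']} = k. \tag{2.B.14}$$

Also, from $[X \mid \Theta = \theta] \le_{\mathrm{hmrl}} [Y \mid \Theta = \theta]$, $[X \mid \Theta = \theta'] \le_{\mathrm{hmrl}} [Y \mid \Theta = \theta']$, and (2.B.13), we get, for $t \ge 0$, that

$$k \int_t^\infty \bar F(u \mid \theta)\,du \le \int_t^\infty \bar G(u \mid \theta)\,du \quad \text{and} \quad k \int_t^\infty \bar F(u \mid \theta')\,du \le \int_t^\infty \bar G(u \mid \theta')\,du,$$

and hence

$$k \left[ \alpha \int_t^\infty \bar F(u \mid \theta)\,du + (1 - \alpha) \int_t^\infty \bar F(u \mid \theta')\,du \right] \le \alpha \int_t^\infty \bar G(u \mid \theta)\,du + (1 - \alpha) \int_t^\infty \bar G(u \mid \theta')\,du.$$

From this inequality and (2.B.14) we obtain (2.B.12), and this completes the proof.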


Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a subset of the real line. As in Sections 1.A.3 and 1.C.3 let $X(\theta)$ denote a random variable with distribution function $G_\theta$. For any random variable $\Theta$ with support in $\mathcal{X}$, and with distribution function $F$, let us denote by $X(\Theta)$ a random variable with distribution function $H$ given by

$$H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.$$

The following result is comparable to Theorems 1.A.6, 1.B.14, 1.B.52, 1.C.17 and 2.A.15.

Theorem 2.B.13. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, $i = 1, 2$, that is, suppose that the distribution function of $Y_i$ is given by

$$H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.$$

If

$$X(\theta) \le_{\mathrm{hmrl}} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{2.B.15}$$

and if

$$\Theta_1 \le_{\mathrm{hr}} \Theta_2, \tag{2.B.16}$$

then

$$Y_1 \le_{\mathrm{hmrl}} Y_2. \tag{2.B.17}$$

The proof of Theorem 2.B.13 uses the increasing convex order, and is therefore given in Remark 4.A.29 in Chapter 4.

A Laplace transform characterization of the order $\le_{\mathrm{hmrl}}$ is given next; it may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, and 2.A.16.

Theorem 2.B.14. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

$$X_1 \le_{\mathrm{hmrl}} X_2 \iff N_\lambda(X_1) \le_{\mathrm{hmrl}} N_\lambda(X_2) \quad \text{for all } \lambda > 0,$$

where the notation $N_\lambda(X_1) \le_{\mathrm{hmrl}} N_\lambda(X_2)$ is in the sense of (2.B.3).

Proof. First assume that $X_1 \le_{\mathrm{hmrl}} X_2$. As in the proof of Theorem 2.A.16 we temporarily replace the notation $\alpha_\lambda^{X_1}(n)$ and $\alpha_\lambda^{X_2}(n)$ by $\alpha_{\lambda,1}(n)$ and $\alpha_{\lambda,2}(n)$, respectively. We also denote the survival function and the mean of $X_k$ by $\bar F_k$ and $\mu_k$, respectively, $k = 1, 2$. Let $m \ge 2$. Using (2.A.23) we have

$$\mu_1 \sum_{n=m}^\infty \bar P_2(n) - \mu_2 \sum_{n=m}^\infty \bar P_1(n) = \int_0^\infty \lambda^2 e^{-\lambda x}\,\frac{(\lambda x)^{m-2}}{(m-2)!} \left[ \mu_1 \int_x^\infty \bar F_2(u)\,du - \mu_2 \int_x^\infty \bar F_1(u)\,du \right] dx.$$

The integrand is nonnegative by the assumption of the theorem, and one direction of the proof is complete. The proof of the converse statement is similar to the proof of the converse of Theorem 2.A.16.

The following result gives necessary and sufficient conditions for two random variables to be equal in the sense of the order $\le_{\mathrm{hmrl}}$.

Theorem 2.B.15. Let $X$ and $Y$ be two nonnegative random variables with positive expectations, such that $EX \le EY$. Then $X =_{\mathrm{hmrl}} Y$ if, and only if, $X =_{\mathrm{st}} BY$ for some Bernoulli random variable $B$, independent of $Y$.

Proof. First assume that $X =_{\mathrm{st}} BY$ for some Bernoulli random variable $B$, independent of $Y$. Then

$$\frac{E[(X - t)_+]}{E[X]} = \frac{E[(BY - t)_+]}{E[BY]} = \frac{E[(Y - t)_+]\,P\{B = 1\}}{E[Y]\,P\{B = 1\}} = \frac{E[(Y - t)_+]}{E[Y]} \quad \text{for all } t \ge 0,$$

and thus $X =_{\mathrm{hmrl}} Y$ follows from (2.B.4). Conversely, suppose that $X =_{\mathrm{hmrl}} Y$. By (2.B.2) this means that

$$\frac{\int_t^\infty P\{X > u\}\,du}{EX} = \frac{\int_t^\infty P\{Y > u\}\,du}{EY} \quad \text{for all } t \ge 0,$$

which yields

$$P\{X > t\} = \frac{EX}{EY} \cdot P\{Y > t\}, \quad t \ge 0.$$

That is, $X =_{\mathrm{st}} BY$, where $B$ is a Bernoulli random variable such that $P\{B = 1\} = EX/EY$.
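The first half of the proof of Theorem 2.B.15 can be reproduced exactly on a small example. In the sketch below, $Y$ is uniform on {1, 2, 3, 4} and $P\{B = 1\} = 1/3$; both choices, and the helper name, are hypothetical and only illustrate the statement.

```python
from fractions import Fraction

def normalized_stop_loss(pmf, t):
    """E[(X - t)+] / E[X] for a finite discrete distribution {value: probability}."""
    mean = sum(p * x for x, p in pmf.items())
    return sum(p * max(x - t, 0) for x, p in pmf.items()) / mean

Y = {k: Fraction(1, 4) for k in (1, 2, 3, 4)}
q = Fraction(1, 3)  # P{B = 1}, a hypothetical choice

# Distribution of X = B * Y with B independent of Y.
X = {0: 1 - q}
X.update({y: q * p for y, p in Y.items()})

mean_x = sum(p * x for x, p in X.items())
mean_y = sum(p * y for y, p in Y.items())

# The transforms in (2.B.4) coincide although X and Y differ in distribution.
ts = [Fraction(k, 2) for k in range(0, 10)]
equal_transforms = all(normalized_stop_loss(X, t) == normalized_stop_loss(Y, t) for t in ts)
```

In this example the two normalized transforms coincide while $EX < EY$ strictly, so $X =_{\mathrm{hmrl}} Y$ does not force equal means when $X$ has an atom at zero.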

From the proof of Theorem 2.B.15 it is seen, in contrast to (2.B.6), that if $X \le_{\mathrm{hmrl}} Y$, then it does not necessarily follow that $EX \le EY$ (unless $X$ and $Y$ are positive almost surely).

In the next result it is shown that a random variable, whose distribution is the mixture of two distributions of harmonic mean residual life ordered random variables, is bounded from below and from above, in the harmonic mean residual life order sense, by these two random variables.

Theorem 2.B.16. Let $X$ and $Y$ be two nonnegative random variables with distribution functions $F$ and $G$, respectively. Let $W$ be a random variable with the distribution function $pF + (1 - p)G$ for some $p \in (0, 1)$. If $X \le_{\mathrm{hmrl}} Y$, then $X \le_{\mathrm{hmrl}} W \le_{\mathrm{hmrl}} Y$.

Proof. By assumption, (2.B.2) holds. Therefore

$$\frac{\int_x^\infty \bar F(u)\,du}{EX} \le \frac{p \int_x^\infty \bar F(u)\,du + (1 - p) \int_x^\infty \bar G(u)\,du}{p\,EX + (1 - p)\,EY} \le \frac{\int_x^\infty \bar G(u)\,du}{EY} \quad \text{for all } x \ge 0,$$

and the stated result follows from (2.B.2).
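The double inequality in this proof is an instance of the elementary mediant property of fractions. As a sketch of the omitted step, write (with shorthand introduced here only for this remark) $a = \int_x^\infty \bar F(u)\,du$, $b = EX$, $c = \int_x^\infty \bar G(u)\,du$, $d = EY$, and $D = pb + (1 - p)d$, so that (2.B.2) reads $a/b \le c/d$, that is, $bc - ad \ge 0$. Then

$$\frac{pa + (1 - p)c}{D} - \frac{a}{b} = \frac{(1 - p)(bc - ad)}{bD} \ge 0, \qquad \frac{c}{d} - \frac{pa + (1 - p)c}{D} = \frac{p\,(bc - ad)}{dD} \ge 0.$$

The middle fraction is the quantity in (2.B.2) for $W$, since the survival function of $W$ is $p\bar F + (1 - p)\bar G$ and $EW = p\,EX + (1 - p)\,EY$.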

2.B.4 Properties in reliability theory

The order $\le_{\mathrm{hmrl}}$ can be used to characterize DMRL random variables. As in Section 1.A.3, $[Z \mid A]$ denotes any random variable that has as its distribution the conditional distribution of $Z$ given $A$.

Theorem 2.B.17. The nonnegative random variable $X$ is DMRL if, and only if, $[X - t \mid X > t] \ge_{\mathrm{hmrl}} [X - t' \mid X > t']$ whenever $t' \ge t \ge 0$.

The proof is simple and thus omitted. Other characterizations of DMRL and IMRL random variables, by means of other stochastic orders, can be found in Theorems 2.A.23, 3.A.56, 3.C.13, and 4.A.51.

The order $\le_{\mathrm{hmrl}}$ can also be used to characterize NBUE random variables as follows.

Theorem 2.B.18. Let $X$ be a nonnegative random variable with a finite positive mean. Then the following assertions are equivalent:
(i) $X \le_{\mathrm{hmrl}} X + Y$ for any nonnegative random variable $Y$ with a finite positive mean, which is independent of $X$.
(ii) $X$ is NBUE.
(iii) $X + Y_1 \le_{\mathrm{hmrl}} X + Y_2$ whenever $Y_1$ and $Y_2$ are almost surely positive random variables with finite means, which are independent of $X$, such that $Y_1 \le_{\mathrm{hmrl}} Y_2$.

Proof. Suppose that (i) holds. Then, taking $Y =_{\mathrm{a.s.}} y$ for some $y > 0$, we get from (2.B.4) that

$$\frac{E[(X - t)_+]}{E[X]} \le \frac{E[(X + y - t)_+]}{E[X] + y}, \quad t \ge 0.$$

Upon rearrangement this gives

$$y\,E[(X - t)_+] \le E[X] \left\{ E[(X + y - t)_+] - E[(X - t)_+] \right\},$$

that is,

$$E[(X - t)_+] \le \frac{E[X]}{y} \int_{t-y}^t P\{X > u\}\,du, \quad t \ge 0.$$

Letting $y \to 0$ we obtain

$$E[(X - t)_+] \le E[X]\,P\{X > t\}, \quad t \ge 0;$$

that is, $X$ is NBUE.

The statement (ii) $\implies$ (iii) is Lemma 2.B.5.

Now assume that (iii) holds. Let $Y_1 =_{\mathrm{a.s.}} a$ and $Y_2 =_{\mathrm{a.s.}} y$, where $0 < a < y$. It is easy to verify (for instance, using (2.B.4)) that $Y_1 \le_{\mathrm{hmrl}} Y_2$. Hence (iii) gives $X + a \le_{\mathrm{hmrl}} X + y$; that is, by (2.B.4),

$$(E[X] + y)\,E[(X + a - t)_+] \le (E[X] + a)\,E[(X + y - t)_+], \quad t \ge 0.$$

Letting $a \to 0$ we obtain

$$(E[X] + y)\,E[(X - t)_+] \le E[X]\,E[(X + y - t)_+], \quad t \ge 0,\ y \ge 0.$$

Integrating both sides of the above inequality with respect to the distribution of $Y$ ($Y$ is any random variable as described in (i)) we obtain

$$(E[X] + E[Y])\,E[(X - t)_+] \le E[X]\,E[(X + Y - t)_+], \quad t \ge 0,$$

that is, by (2.B.4), we have $X \le_{\mathrm{hmrl}} X + Y$.

Another characterization of NBUE random variables by means of the usual stochastic order is given in Theorem 1.A.31.

2.C Complements

Section 2.A: Basic properties of the mrl function (which is also called the biometric function) can be found in Yang [572] and references therein. Some properties of the mrl functions are summarized in Shaked and Shanthikumar [513], where further references can be found. The counterexamples mentioned after Theorem 2.A.1 can also be found in that paper and further counterexamples can be found in Gupta and Kirmani [216] and in Alzaid [12]. The conditions under which the ≤mrl order implies the ≤hr and the ≤st orders (Theorems 2.A.2 and 2.A.3) are taken from Gupta and Kirmani [216]. The equivalence of the order ≤mrl and (2.A.3) can be found, for example, in Singh [536]. The characterization of the order ≤mrl which is given in Theorem 2.A.5 is taken from Di Crescenzo [164]. The characterizations of the order ≤hr by means of the order ≤mrl, given in Theorems 2.A.6 and 2.A.7, can be found in Belzunce, Gao, Hu, and Pellerey [67]. The closure under convolution results of the order ≤mrl in Section 2.A.3 were communicated to us by Pellerey [444]. A special case of Lemma 2.A.8 can be found in Mukherjee and Chatterjee [403]. Theorem 2.A.9 can be found in Pellerey [448] and Theorem 2.A.12 can be found in Fagiuoli and Pellerey [186]. The fact that a DMRL random variable increases in the order ≤mrl when a nonnegative random variable is added to it (Theorem 2.A.11) is a result that is slightly stronger than a result in Frostig [207]. The closure under mixtures result (Theorem 2.A.13) is taken from Nanda, Jain, and Singh [424]. The characterization of the mrl order that is given in Theorem 2.A.14 can be found in Joag-Dev, Kochar, and Proschan [259], whereas its special case given in (2.A.10) is taken from Fagiuoli and Pellerey [187]. Fagiuoli and Pellerey [187] have extended (2.A.10) to sums of mrl ordered random variables. The closure under mixtures property of the order ≤mrl (Theorem 2.A.15) is a special case of a result of Hu, Kundu, and Nanda [236], and it can also be found in Hu, Nanda, Xie, and Zhu [237]; see also Theorem 3.4 in Ahmed [7]. The Laplace transform characterization of the order ≤mrl (Theorem 2.A.16) is taken from Shaked and Wong [524]; see also Kan and Yi [274]. An extension of Theorem 2.A.16 to more general orders can be found in Nanda [422]. The mean residual life order comparisons of order statistics (Theorems 2.A.20 and 2.A.21) can be found in Hu, Zhu, and Wei [243] and in Hu and Wei [240]. The comparison of inter-epoch times of two nonhomogeneous Poisson processes in the sense of the mean residual life order (Example 2.A.22) is taken from Belzunce, Lillo, Ruiz, and Shaked [69]. The result that a convolution of an IFR and a DMRL random variables is DMRL (Corollary 2.A.24) can be found in Kopocinska and Kopocinski [320]. Nanda, Singh, Misra, and Paul [429] studied a notion of reversed residual lifetime, and introduced and studied a stochastic order based on it. An order which is related to the mean residual life order is introduced in Ebrahimi and Zahedi [179]. If $m$ and $l$ are the mrl functions of $X$ and $Y$, respectively, then the order is defined by requiring $\frac{d}{dt}(l(t) - m(t))$ to be monotone in $t$. Ebrahimi and Zahedi [179] show that this order implies the mean residual life order. In Kirmani [297] it is claimed that the spacings, from a sample of independent and identically distributed IMRL random variables, are ordered in the mean residual life order. However, the proof of Kirmani is erroneous; see Kirmani [298].
Section 2.B: The order ≤hmrl is studied, for example, in Deshpande, Singh, Bagai, and Jain [161] and in Heilmann and Schröter [219]. Baccelli and Makowski [28] call it the forward recurrence times stochastic order (see an additional comment on the paper of Baccelli and Makowski [28] in Section 4.C). The counterexamples mentioned after (2.B.5) can be found, for example, in Mi [394]. In fact, Gerchak and Golani [209] have noticed that the example given on page 489 of Wolff [567] shows that it is possible for both X ≤st Y and Y ≤hmrl X to hold simultaneously in the strict sense. The comparison of the expectations of ≤hmrl ordered random variables, described in (2.B.6), is a special case of a result of Nanda, Jain, and Singh [425]. The variance inequality (Theorem 2.B.1) can be found in Kirmani [297]. The characterization of the order ≤hmrl which is given in Theorem 2.B.3 is taken from Di Crescenzo [164]. The characterization of the order ≤mrl by means of the order ≤hmrl (Theorem 2.B.4) can be found in Hu, Kundu, and Nanda [236]. The preservation under convolution property of the order ≤hmrl (Theorem 2.B.6) is taken from Pellerey [448, 449] (the latter is a correction note), and the closure under random summations property of the order ≤hmrl (Theorem 2.B.7) is also taken from Pellerey [448, 449], though it is alluded to in Heilmann and Schröter [219]. These results (Theorems 2.B.6 and 2.B.7) can also be found in Baccelli and Makowski [28]. A slight extension of Theorem 2.B.6 is given in Lefèvre and Utev [340]. Theorems 2.B.8–2.B.10 have been communicated to us by Pellerey [447]. The closure under mixtures properties of the order ≤hmrl (Theorems 2.B.11 and 2.B.12) are taken from Nanda, Jain, and Singh [424] and from Lefèvre and Utev [340], respectively, whereas Theorem 2.B.13 is inspired by Ahmed, Soliman, and Khider [9]. The Laplace transform characterization of the order ≤hmrl (Theorem 2.B.14) is taken from Shaked and Wong [524]. An extension of Theorem 2.B.14 to more general orders can be found in Nanda [422]. The conditions under which X =hmrl Y (Theorem 2.B.15) can be found in Lefèvre and Utev [340]. The NBUE characterization, given in Theorem 2.B.18, is taken from Lefèvre and Utev [340].

3 Univariate Variability Orders

In this chapter we study stochastic orders that compare the “variability” or the “dispersion” of random variables. The most important and common orders that are studied in this chapter are the convex and the dispersive orders. We also study in this chapter the excess wealth order (which is also called the right spread order) which is found to be useful in an increasing number of applications. Various related orders are also examined in this chapter.

3.A The Convex Order

3.A.1 Definition and equivalent conditions

Let $X$ and $Y$ be two random variables such that

$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all convex functions } \phi : \mathbb{R} \to \mathbb{R}, \tag{3.A.1}$$

provided the expectations exist. Then X is said to be smaller than Y in the convex order (denoted as X ≤cx Y ). Roughly speaking, convex functions are functions that take on their (relatively) larger values over regions of the form (−∞, a) ∪ (b, ∞) for a < b. Therefore, if (3.A.1) holds, then Y is more likely to take on “extreme” values than X. That is, Y is “more variable” than X. It should be mentioned here that in (3.A.1) it is suﬃcient to consider only functions φ that are convex on the union of the supports of X and Y rather than over the whole real line; we will not keep repeating this point throughout this chapter. One can also deﬁne a concave order by requiring (3.A.1) to hold for all concave functions φ (denoted as X ≤cv Y ). However, X ≤cv Y if, and only if, Y ≤cx X. Therefore, it is not necessary to have a separate discussion for the concave order. Note that the functions φ1 and φ2 , deﬁned by φ1 (x) = x and φ2 (x) = −x, are both convex. Therefore, from (3.A.1) it easily follows that


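$$X \le_{\mathrm{cx}} Y \implies E[X] = E[Y], \tag{3.A.2}$$

provided the expectations exist. Later it will be helpful to observe that if $E[X] = E[Y]$, then

$$\int_{-\infty}^\infty \left[ F(u) - G(u) \right] du = \int_{-\infty}^\infty \left[ \bar F(u) - \bar G(u) \right] du = 0, \tag{3.A.3}$$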

provided the integrals exist, where $\bar F$ [$F$] and $\bar G$ [$G$] are the survival [distribution] functions of $X$ and $Y$, respectively. The function $\phi$, defined by $\phi(x) = x^2$, is convex. Therefore, from (3.A.1) and (3.A.2), it follows that

$$X \le_{\mathrm{cx}} Y \implies \mathrm{Var}[X] \le \mathrm{Var}[Y], \tag{3.A.4}$$

whenever $\mathrm{Var}(Y) < \infty$. For a fixed $a$, the function $\phi_a$, defined by $\phi_a(x) = (x - a)_+$, and the function $\varphi_a$, defined by $\varphi_a(x) = (a - x)_+$, are both convex. (The reader is encouraged to draw a sketch of $\phi_a$ and $\varphi_a$ since they are very handy in the analysis of the order $\le_{\mathrm{cx}}$ as well as in the analysis of the monotone convex and the monotone concave orders discussed in Chapter 4.) Therefore, if $X \le_{\mathrm{cx}} Y$, then

$$E[(X - a)_+] \le E[(Y - a)_+] \quad \text{for all } a \tag{3.A.5}$$

and

$$E[(a - X)_+] \le E[(a - Y)_+] \quad \text{for all } a, \tag{3.A.6}$$

provided the expectations exist. Alternatively, using a simple integration by parts, it is seen that (3.A.5) and (3.A.6) can be rewritten as

$$\int_x^\infty \bar F(u)\,du \le \int_x^\infty \bar G(u)\,du \quad \text{for all } x \tag{3.A.7}$$

and

$$\int_{-\infty}^x F(u)\,du \le \int_{-\infty}^x G(u)\,du \quad \text{for all } x, \tag{3.A.8}$$

provided the integrals exist. In fact, when $E[X] = E[Y]$, (3.A.7) is equivalent to $X \le_{\mathrm{cx}} Y$. To see this equivalence, note that every convex function can be approximated by (that is, is a limit of) positive linear combinations of the functions $\phi_a$, for various choices of $a$, and of the function $\phi(x) = -x$. By (3.A.7), $E[\phi_a(X)] \le E[\phi_a(Y)]$ for all $a$, and this fact, together with the equality of the means of $X$ and $Y$, implies (3.A.1). We thus have proved the first part of the following result. The other part is proven similarly.

Theorem 3.A.1. Let $X$ and $Y$ be two random variables such that $E[X] = E[Y]$. Then
(a) $X \le_{\mathrm{cx}} Y$ if, and only if, (3.A.7) holds.
(b) $X \le_{\mathrm{cx}} Y$ if, and only if, (3.A.8) holds.

By adding $a$ to both sides of the inequality in (3.A.5), it is seen that (3.A.5) can be rewritten as

$$E[\max\{X, a\}] \le E[\max\{Y, a\}] \quad \text{for all } a. \tag{3.A.9}$$

Thus, when E[X] = E[Y], (3.A.9) is equivalent to X ≤cx Y. In a similar manner (3.A.6) can be rewritten. The following theorem provides another characterization of the convex order.

Theorem 3.A.2. Let X and Y be two random variables such that E[X] = E[Y]. Then X ≤cx Y if, and only if,

\[ E|X-a| \le E|Y-a| \quad \text{for all } a \in \mathbb{R}. \tag{3.A.10} \]

Proof. Clearly, if X ≤cx Y, then (3.A.10) holds. So suppose that (3.A.10) holds. Without loss of generality it can be assumed that EX = EY = 0. A straightforward computation gives

\[ E|X-a| = a + 2\int_a^\infty \bar F(u)\,du = -a + 2\int_{-\infty}^a F(u)\,du. \tag{3.A.11} \]

The result now follows from (3.A.7) or (3.A.8).
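For finite discrete distributions the criteria (3.A.5) and (3.A.10) can be checked exactly. The following is a minimal sketch, not part of the original text: the helper names (`mean`, `stop_loss`, `abs_dev`, `is_cx_smaller`) and the two example distributions are illustrative assumptions. It relies on the fact that, for finitely supported X and Y with equal means, a ↦ E[(Y − a)+] − E[(X − a)+] is piecewise linear with kinks only at the atoms and vanishes outside the supports, so it suffices to test (3.A.5) at the atoms.

```python
from fractions import Fraction

def mean(dist):
    """Mean of a finite discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def stop_loss(dist, a):
    """E[(X - a)^+], the left-hand side of (3.A.5)."""
    return sum(max(x - a, 0) * p for x, p in dist.items())

def abs_dev(dist, a):
    """E|X - a|, the quantity compared in (3.A.10)."""
    return sum(abs(x - a) * p for x, p in dist.items())

def is_cx_smaller(dx, dy):
    """Check X <=cx Y: equal means plus (3.A.5) at every atom of X or Y."""
    if mean(dx) != mean(dy):
        return False
    kinks = set(dx) | set(dy)
    return all(stop_loss(dx, a) <= stop_loss(dy, a) for a in kinks)

# Hypothetical example: X uniform on {1, 2, 3}, Y uniform on {0, 2, 4}.
X = {v: Fraction(1, 3) for v in (1, 2, 3)}
Y = {v: Fraction(1, 3) for v in (0, 2, 4)}

print(is_cx_smaller(X, Y))  # X is less variable than Y
print(is_cx_smaller(Y, X))  # the reverse comparison fails
print([abs_dev(Y, a) - abs_dev(X, a) for a in range(0, 5)])  # nonnegative gaps
```

Exact rational arithmetic (`Fraction`) is used so that the equality of means and the boundary cases of the inequalities are not blurred by rounding.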

The function −E|X − ·| is called the potential of the probability measure of X. Similarly, −E|Y − ·| is the potential of the probability measure of Y . Thus, (3.A.10) can be written as −E|X − ·| ≥ −E|Y − ·| pointwise. Using this observation, we obtain from Chacon and Walsh [122] the following characterization. Theorem 3.A.3. Let X and Y be two random variables such that E[X] = E[Y ] = 0. Then X ≤cx Y if, and only if, for a standard Brownian motion from 0, {B(t), t ≥ 0}, there exist two stopping times T1 and T2 , such that T1 ≤ T2 almost surely, and X =st B(T1 ) and Y =st B(T2 ). An immediate consequence of (3.A.5) is shown next. Denote the supports of X and Y by supp(X) and supp(Y ). Let lX = inf{x : x ∈ supp(X)} and uX = sup{x : x ∈ supp(X)}. Deﬁne lY and uY similarly. Then we have that if X ≤cx Y , then lY ≤ lX and uY ≥ uX . As proof, suppose, for example, that uY < uX . Let a be such that uY < a < uX . Then E[(Y − a)+ ] = 0 < E[(X − a)+ ], in contradiction to (3.A.5). Therefore we must have uY ≥ uX . Similarly, using (3.A.6), it can be shown that lY ≤ lX . As a consequence we have that if X and Y are random variables whose supports are intervals, then X ≤cx Y =⇒ supp(X) ⊆ supp(Y ).

(3.A.12)

An important characterization of the convex order by construction on the same probability space is stated in the next theorem.

3 Univariate Variability Orders

Theorem 3.A.4. The random variables X and Y satisfy X ≤cx Y if, and only if, there exist two random variables $\hat X$ and $\hat Y$, defined on the same probability space, such that

\[ \hat X =_{\mathrm{st}} X, \qquad \hat Y =_{\mathrm{st}} Y, \]

and $\{\hat X, \hat Y\}$ is a martingale, that is,

\[ E[\hat Y \mid \hat X] = \hat X \quad \text{a.s.} \tag{3.A.13} \]

Furthermore, the random variables $\hat X$ and $\hat Y$ can be selected such that $[\hat Y \mid \hat X = x]$ is increasing in x in the usual stochastic order ≤st.

It is not easy to prove the constructive part of Theorem 3.A.4. However, it is easy to prove that if random variables $\hat X$ and $\hat Y$ as described in the theorem exist, then X ≤cx Y. Just note that if φ is a convex function, then by Jensen’s Inequality,

\[ E[\phi(X)] = E[\phi(\hat X)] = E\phi(E[\hat Y \mid \hat X]) \le E\{E[\phi(\hat Y) \mid \hat X]\} = E[\phi(\hat Y)] = E[\phi(Y)], \]

which is (3.A.1). Other characterizations of the convex order are described in the next theorem.

Theorem 3.A.5. Let X and Y be two random variables with distribution functions F and G, respectively, and with equal finite means. Then each of the following two statements is a necessary and sufficient condition for X ≤cx Y:

\[ \int_0^p F^{-1}(u)\,du \ge \int_0^p G^{-1}(u)\,du \quad \text{for all } p \in [0,1]; \tag{3.A.14} \]

and

\[ \int_p^1 F^{-1}(u)\,du \le \int_p^1 G^{-1}(u)\,du \quad \text{for all } p \in [0,1]. \tag{3.A.15} \]

Proof. Since $EX = \int_0^1 F^{-1}(u)\,du$ and $EY = \int_0^1 G^{-1}(u)\,du$, and since EX = EY, it follows that for any p ∈ [0, 1] the inequality

\[ \int_p^1 F^{-1}(u)\,du \le \int_p^1 G^{-1}(u)\,du \tag{3.A.16} \]

is equivalent to the inequality

\[ \int_0^p F^{-1}(u)\,du \ge \int_0^p G^{-1}(u)\,du. \tag{3.A.17} \]

It follows that (3.A.14) and (3.A.15) are equivalent. Thus, we just need to show that X ≤cx Y is equivalent to (3.A.14).

We only give the proof for the case when the distribution functions F and G of X and Y are continuous; the proof for the general case is similar, though notationally more complex. Without loss of generality, suppose that F and G are not identical. Since EX = EY, it follows that F and G must cross each other at least once. If either (3.A.7) or (3.A.14) holds, then, if there is a first time that F crosses G, it must cross it there from below. Similarly, if there is a last time that F crosses G, it also must cross it there from below. (Thus, if there is a finite number of crossings, then it must be odd.) Let (y0, p0), (y1, p1), and (y2, p2) be three consecutive crossing points as depicted in Figure 3.A.1. Note that (y0, p0) may be (−∞, 0) (we then adopt the convention that 0 · (−∞) ≡ 0), and that (y2, p2) may be (∞, 1) (we then adopt the convention that 0 · ∞ ≡ 0). Note that by the continuity assumption we have pi = F(yi) = G(yi), i = 0, 1, 2.

[Fig. 3.A.1. Typical segments of F and G when X ≤cx Y; horizontal axis y with crossing points y0, y1, y2, vertical axis p with levels p0, p1, p2.]

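Assume that X ≤cx Y. Then

\[ \int_{y_2}^\infty \bar F(x)\,dx \le \int_{y_2}^\infty \bar G(x)\,dx. \tag{3.A.18} \]

Thus

\[ \begin{aligned} \int_{p_2}^1 F^{-1}(u)\,du &= y_2(1-p_2) + \int_{y_2}^\infty \bar F(x)\,dx \\ &\le y_2(1-p_2) + \int_{y_2}^\infty \bar G(x)\,dx \qquad \text{(by (3.A.18))} \\ &= \int_{p_2}^1 G^{-1}(u)\,du. \end{aligned} \tag{3.A.19} \]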

Now, for u ∈ [p1, p2] we have that $F^{-1}(u) - G^{-1}(u) \le 0$ (see Figure 3.A.1). Thus $\int_p^1 (F^{-1}(u) - G^{-1}(u))\,du$ is increasing in p ∈ [p1, p2]. Therefore, from (3.A.19) we get that

\[ \int_p^1 F^{-1}(u)\,du \le \int_p^1 G^{-1}(u)\,du \quad \text{for } p \in [p_1, p_2]. \tag{3.A.20} \]

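From X ≤cx Y we also have

\[ \int_{-\infty}^{y_0} F(x)\,dx \le \int_{-\infty}^{y_0} G(x)\,dx. \tag{3.A.21} \]

Thus

\[ \begin{aligned} \int_0^{p_0} F^{-1}(u)\,du &= y_0 p_0 - \int_{-\infty}^{y_0} F(x)\,dx \\ &\ge y_0 p_0 - \int_{-\infty}^{y_0} G(x)\,dx \qquad \text{(by (3.A.21))} \\ &= \int_0^{p_0} G^{-1}(u)\,du. \end{aligned} \tag{3.A.22} \]

Now, for u ∈ [p0, p1] we have that $F^{-1}(u) - G^{-1}(u) \ge 0$ (see Figure 3.A.1). Thus $\int_0^p (F^{-1}(u) - G^{-1}(u))\,du$ is increasing in p ∈ [p0, p1]. Therefore, from (3.A.22) we get that

\[ \int_0^p F^{-1}(u)\,du \ge \int_0^p G^{-1}(u)\,du \quad \text{for } p \in [p_0, p_1]. \tag{3.A.23} \]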

Thus we see from (3.A.20) and (3.A.23) that for each p ∈ [0, 1] either (3.A.16) or (3.A.17) holds. Therefore, (3.A.14) (or, equivalently, (3.A.15)) holds.

Conversely, assume that (3.A.14) (or, equivalently, (3.A.15)) holds. Then

\[ \int_{p_2}^1 F^{-1}(u)\,du \le \int_{p_2}^1 G^{-1}(u)\,du. \tag{3.A.24} \]

Thus

\[ \begin{aligned} \int_{y_2}^\infty \bar F(x)\,dx &= \int_{p_2}^1 F^{-1}(u)\,du - y_2(1-p_2) \\ &\le \int_{p_2}^1 G^{-1}(u)\,du - y_2(1-p_2) \qquad \text{(by (3.A.24))} \\ &= \int_{y_2}^\infty \bar G(x)\,dx. \end{aligned} \tag{3.A.25} \]

Now, for x ∈ [y1, y2] we have that $\bar F(x) - \bar G(x) \le 0$ (see Figure 3.A.1). Thus $\int_y^\infty (\bar F(x) - \bar G(x))\,dx$ is increasing in y ∈ [y1, y2]. Therefore, from (3.A.25) we get that

\[ \int_y^\infty \bar F(x)\,dx \le \int_y^\infty \bar G(x)\,dx \quad \text{for } y \in [y_1, y_2]. \tag{3.A.26} \]

From (3.A.14) we also have

\[ \int_0^{p_0} F^{-1}(u)\,du \ge \int_0^{p_0} G^{-1}(u)\,du. \tag{3.A.27} \]

Thus

\[ \begin{aligned} \int_{-\infty}^{y_0} F(x)\,dx &= y_0 p_0 - \int_0^{p_0} F^{-1}(u)\,du \\ &\le y_0 p_0 - \int_0^{p_0} G^{-1}(u)\,du \qquad \text{(by (3.A.27))} \\ &= \int_{-\infty}^{y_0} G(x)\,dx. \end{aligned} \tag{3.A.28} \]

Now, for x ∈ [y0, y1] we have that F(x) − G(x) ≤ 0 (see Figure 3.A.1). Thus $\int_{-\infty}^y (F(x) - G(x))\,dx$ is decreasing in y ∈ [y0, y1]. Therefore, from (3.A.28) we get that

\[ \int_{-\infty}^y F(x)\,dx \le \int_{-\infty}^y G(x)\,dx \quad \text{for } y \in [y_0, y_1]. \tag{3.A.29} \]

Thus we see from (3.A.26) and (3.A.29) that for each y ∈ R either (3.A.7) or (3.A.8) holds. Therefore X ≤cx Y.
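Condition (3.A.14) becomes a comparison of partial sums when X and Y are uniform on n atoms each, since then $\int_0^{k/n} F^{-1}(u)\,du$ equals 1/n times the sum of the k smallest atoms. The sketch below is an illustration under that assumption only; the function name `cx_smaller_by_quantiles` and the sample values are hypothetical. Because both integrals are piecewise linear in p with breakpoints at k/n, checking the grid points is enough in this special case.

```python
from itertools import accumulate

def cx_smaller_by_quantiles(xs, ys):
    """Check (3.A.14) for two equally weighted samples.

    Assumes both samples have the same size and the same total
    (that is, equal means); otherwise the sketch refuses to answer.
    """
    if len(xs) != len(ys) or sum(xs) != sum(ys):
        raise ValueError("sketch assumes equal sample sizes and equal totals")
    low_x = accumulate(sorted(xs))  # n * integral of F^{-1} over [0, k/n]
    low_y = accumulate(sorted(ys))  # n * integral of G^{-1} over [0, k/n]
    return all(sx >= sy for sx, sy in zip(low_x, low_y))

print(cx_smaller_by_quantiles([1, 2, 3], [0, 2, 4]))  # lower partial sums of X dominate
print(cx_smaller_by_quantiles([0, 2, 4], [1, 2, 3]))  # fails in the other direction
```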

We now give a bivariate characterization result for the order ≤cx that is similar to the characterizations given in Theorems 1.A.9, 1.B.9, 1.B.47, and 1.C.20, for the orders ≤st, ≤hr, ≤rh, and ≤lr, respectively. We define the following class of bivariate functions:

\[ \mathcal{G}_{\mathrm{cx}} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x,y) - \phi(y,x) \text{ is convex in } x \text{ for all } y\}. \]

Theorem 3.A.6. Let X and Y be independent random variables. Then X ≤cx Y if, and only if,

\[ E[\phi(X,Y)] \le E[\phi(Y,X)] \quad \text{for all } \phi \in \mathcal{G}_{\mathrm{cx}}. \tag{3.A.30} \]

Proof. Suppose that (3.A.30) holds. Let ψ be a univariate convex function. Deﬁne φ(x, y) = ψ(x). Then φ ∈ Gcx and from (3.A.30) we see that X ≤cx Y . Conversely, suppose that X ≤cx Y . Let φ ∈ Gcx and let Yˆ be another random variable, independent of X and Y , such that Yˆ =st Y . Deﬁne ψ by ψ(x) ≡ E[φ(x, Yˆ ) − φ(Yˆ , x)]. From the independence of X and Yˆ it follows that ψ is convex. Therefore, since X ≤cx Y , it follows that E[φ(X, Y )] − E[φ(Y, X)] = E[ψ(X)] ≤ E[ψ(Y )] = 0.

Another characterization of the convex order, by means of the number of sign changes of two distribution functions, is given in Theorem 3.A.45 in Section 3.A.3.

Let X be a random variable with survival function $\bar F$, and let h : [0, 1] → [0, 1] be an increasing function that satisfies h(0) = 0 and h(1) = 1. Such a function h is called a probability transformation function. Consider the functional

\[ V_h(X) = -\int_{-\infty}^{\infty} x\,dh(\bar F(x)); \tag{3.A.31} \]

this functional is called the Yaari functional and it is of interest in economics.

Theorem 3.A.7. Let X and Y be two random variables with the same finite means. Then X ≤cx Y if, and only if,

\[ V_h(X) \le V_h(Y) \quad \text{for every convex probability transformation function } h. \]

As can be seen from (3.A.2), only random variables that have the same means can be compared by the order ≤cx. Often, however, we do not want a variability order to depend on the location of the involved distributions. Several ideas for using the order ≤cx to define a variability order that is independent of the locations of the underlying random variables X and Y have been suggested in the literature. When X and Y have finite means, one idea is to say that X is less variable than Y if

\[ [X - EX] \le_{\mathrm{cx}} [Y - EY]. \tag{3.A.32} \]

This is sometimes called the dilation order. When the random variables X and Y satisfy (3.A.32), we denote X ≤dil Y. For nonnegative random variables X and Y with finite means one can define X as less variable than Y if

\[ \frac{X}{EX} \le_{\mathrm{cx}} \frac{Y}{EY}. \tag{3.A.33} \]

This is sometimes called the Lorenz order. When the nonnegative random variables X and Y satisfy (3.A.33), we denote X ≤Lorenz Y. Bhattacharjee and Sethuraman [88] introduced a stochastic order, for nonnegative random variables with finite means, denoted by ≤hnbue. Kochar [306] showed that the orders ≤hnbue and ≤Lorenz are equivalent. The dilation order can be characterized as follows.

Theorem 3.A.8. Let X and Y be two random variables with distribution functions F and G, respectively, and with finite expectations. Then X ≤dil Y if, and only if,

\[ \frac{1}{1-p}\int_p^1 [F^{-1}(u) - G^{-1}(u)]\,du \le \int_0^1 [F^{-1}(u) - G^{-1}(u)]\,du \quad \text{for all } p \in [0,1). \tag{3.A.34} \]

Proof. Denote ∆ = EX − EY. Then the stochastic inequality X ≤dil Y can be rewritten as X − ∆ ≤cx Y. Denote by F∆ the distribution function of X − ∆, and note that from Theorem 3.A.5 we have that X − ∆ ≤cx Y if, and only if,

\[ \int_p^1 F_\Delta^{-1}(u)\,du \le \int_p^1 G^{-1}(u)\,du \quad \text{for all } p \in [0,1]. \tag{3.A.35} \]

Since F∆(x) = F(x + ∆) for all x ∈ R it follows that $F_\Delta^{-1}(u) = F^{-1}(u) - \Delta$ for all u ∈ [0, 1]. Therefore (3.A.35) is equivalent to

\[ \int_p^1 [F^{-1}(u) - \Delta]\,du \le \int_p^1 G^{-1}(u)\,du \quad \text{for all } p \in [0,1]; \]

that is,

\[ \int_p^1 [F^{-1}(u) - G^{-1}(u)]\,du \le \int_p^1 [EX - EY]\,du \quad \text{for all } p \in [0,1]; \]

that is,

\[ \frac{1}{1-p}\int_p^1 [F^{-1}(u) - G^{-1}(u)]\,du \le EX - EY \quad \text{for all } p \in [0,1). \tag{3.A.36} \]

Now, since $EX = \int_0^1 F^{-1}(u)\,du$ and $EY = \int_0^1 G^{-1}(u)\,du$ it is seen that (3.A.36) is equivalent to (3.A.34).

For each p ∈ (0, 1), the quantity $\int_0^1 [F^{-1}(u) - G^{-1}(u)]\,du$ on the right-hand side of (3.A.34) is a weighted average of $\frac{1}{1-p}\int_p^1 [F^{-1}(u) - G^{-1}(u)]\,du$ and of $\frac{1}{p}\int_0^p [F^{-1}(u) - G^{-1}(u)]\,du$. Thus, from Theorem 3.A.8 we obtain that X ≤dil Y if, and only if,

\[ \int_0^1 [F^{-1}(u) - G^{-1}(u)]\,du \le \frac{1}{p}\int_0^p [F^{-1}(u) - G^{-1}(u)]\,du \quad \text{for all } p \in (0,1]. \]

Also, X ≤dil Y if, and only if,

\[ \frac{1}{1-p}\int_p^1 [F^{-1}(u) - G^{-1}(u)]\,du \le \frac{1}{p}\int_0^p [F^{-1}(u) - G^{-1}(u)]\,du \quad \text{for all } p \in (0,1). \]

For p ∈ [0, 1], let us denote the p-quantiles of X and of Y by $x(p) = F^{-1}(p)$ and $y(p) = G^{-1}(p)$, respectively. As in Jewitt [256], we observe that $\int_0^p F^{-1}(u)\,du = \int_{-\infty}^{F^{-1}(p)} x\,dF(x) = pE[X \mid X \le x(p)]$. Similarly, $\int_p^1 F^{-1}(u)\,du = (1-p)E[X \mid X \ge x(p)]$, $\int_0^p G^{-1}(u)\,du = pE[Y \mid Y \le y(p)]$, and $\int_p^1 G^{-1}(u)\,du = (1-p)E[Y \mid Y \ge y(p)]$. Thus we see that each of the following three statements is a necessary and sufficient condition for X ≤dil Y:

\[ E[X \mid X \ge x(p)] - E[Y \mid Y \ge y(p)] \le EX - EY \quad \text{for all } p \in [0,1), \tag{3.A.37} \]

\[ E[X \mid X \le x(p)] - E[Y \mid Y \le y(p)] \ge EX - EY \quad \text{for all } p \in (0,1], \tag{3.A.38} \]

and

\[ E[X \mid X \ge x(p)] - E[Y \mid Y \ge y(p)] \le E[X \mid X \le x(p)] - E[Y \mid Y \le y(p)] \quad \text{for all } p \in (0,1). \]

Rewriting (3.A.37) and (3.A.38) we see that under the conditions of Theorem 3.A.8 we have that X ≤dil Y if, and only if,

\[ E[X - EX \mid X \ge x(p)] \le E[Y - EY \mid Y \ge y(p)] \quad \text{for all } p \in [0,1). \tag{3.A.39} \]

Also, X ≤dil Y if, and only if,

\[ E[X - EX \mid X \le x(p)] \ge E[Y - EY \mid Y \le y(p)] \quad \text{for all } p \in (0,1]. \tag{3.A.40} \]

When EX = EY we have that X ≤dil Y ⇐⇒ X ≤cx Y. Therefore, when EX = EY, the convex order can be characterized by noting that X ≤cx Y if, and only if,

\[ E[X \mid X \ge x(p)] \le E[Y \mid Y \ge y(p)] \quad \text{for all } p \in [0,1). \tag{3.A.41} \]

Also then, X ≤cx Y if, and only if,

\[ E[X \mid X \le x(p)] \ge E[Y \mid Y \le y(p)] \quad \text{for all } p \in (0,1]. \tag{3.A.42} \]

Another characterization of the dilation order is given next.

Theorem 3.A.9. Let X and Y be two random variables with finite means, and let the corresponding distribution functions be F and G, respectively. Then X ≤dil Y if, and only if,

\[ \int_0^1 \phi(p)[F^{-1}(p) - EX]\,dp \le \int_0^1 \phi(p)[G^{-1}(p) - EY]\,dp, \]

for any increasing function φ on [0, 1] for which the integrals above are well defined.

The Lorenz order is closely connected to the so-called Lorenz curve defined as follows. Let X be a nonnegative random variable with distribution function F. The Lorenz curve LX, corresponding to X, is defined as

\[ L_X(p) = \frac{\int_0^p F^{-1}(u)\,du}{\int_0^1 F^{-1}(u)\,du}, \quad p \in [0,1]. \tag{3.A.43} \]

The Lorenz curve is used in economics to measure the inequality of incomes. Let Y be another nonnegative random variable with distribution function G. The Lorenz curve LY, corresponding to Y, is defined analogously. The next theorem, which follows from Theorem 3.A.5, highlights the connection between the Lorenz curve and the Lorenz order.

Theorem 3.A.10. Let X and Y be two nonnegative random variables with equal means. Then X ≤Lorenz Y (or, equivalently, X ≤cx Y) if, and only if,

\[ L_X(p) \ge L_Y(p) \quad \text{for all } p \in [0,1]. \]
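As an illustration of (3.A.43), the Lorenz curve of an empirical distribution can be evaluated exactly on the grid p = k/n. The following sketch is an assumption-laden example rather than part of the text: the function names `lorenz_curve` and `lorenz_smaller` and the income samples are hypothetical, the samples are taken to be positive integers of equal size, and the curves are compared only at the grid points (which suffices here because both curves are piecewise linear with the same breakpoints).

```python
from fractions import Fraction

def lorenz_curve(sample):
    """Values L(k/n), k = 0..n, of the empirical Lorenz curve (3.A.43)."""
    xs = sorted(sample)
    total = sum(xs)
    curve, running = [Fraction(0)], 0
    for x in xs:
        running += x  # n * integral of F^{-1} over [0, k/n]
        curve.append(Fraction(running, total))
    return curve

def lorenz_smaller(sample_x, sample_y):
    """Grid check of L_X >= L_Y for two samples of the same size."""
    if len(sample_x) != len(sample_y):
        raise ValueError("sketch assumes samples of equal size")
    pairs = zip(lorenz_curve(sample_x), lorenz_curve(sample_y))
    return all(lx >= ly for lx, ly in pairs)

equal_incomes = [2, 2, 2, 2]    # hypothetical, perfectly equal incomes
unequal_incomes = [1, 1, 2, 4]  # hypothetical, same mean, more spread

print(lorenz_curve(unequal_incomes))
print(lorenz_smaller(equal_incomes, unequal_incomes))
```

In this example the two samples have the same mean, so by Theorem 3.A.10 the comparison of Lorenz curves is also a comparison in the convex order.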

Another related characterization of the Lorenz order is described next. Let Ψ be the set of all measurable mappings from R+ to [0, 1]. For any nonnegative random variable X with a finite mean define the Lorenz zonoid in $\mathbb{R}^2_+$ by

\[ L(X) = \left\{ \left( \int_0^\infty \psi(x)\,dF(x),\ \frac{1}{EX}\int_0^\infty x\psi(x)\,dF(x) \right) : \psi \in \Psi \right\}, \]

where F denotes the distribution function of X.

Theorem 3.A.11. Let X and Y be two nonnegative random variables with finite means. Then X ≤Lorenz Y ⇐⇒ L(X) ⊆ L(Y).

Ramos and Sordo [463] defined what they called a “second-order absolute Lorenz order” by requiring two random variables X and Y, with finite means and with distribution functions F and G, respectively, to satisfy

\[ \int_p^1 \int_0^u [F^{-1}(v) - EX]\,dv\,du \ge \int_p^1 \int_0^u [G^{-1}(v) - EY]\,dv\,du \quad \text{for all } p \in [0,1]. \]

3.A.2 Closure and other properties

Using (3.A.1) through (3.A.13) it is easy to prove each of the closure results in the first two parts of the following theorem. (Recall from Section 1.A.3 that for any random variable Z and any event A we denote by [Z | A] any random variable whose distribution is the conditional distribution of Z given A.)

Theorem 3.A.12. (a) Let X and Y be two random variables. Then X ≤cx Y ⇐⇒ −X ≤cx −Y.

(b) Let X, Y, and Θ be random variables such that [X | Θ = θ] ≤cx [Y | Θ = θ] for all θ in the support of Θ. Then X ≤cx Y. That is, the convex order is closed under mixtures.

(c) Let {Xj, j = 1, 2, . . . } and {Yj, j = 1, 2, . . . } be two sequences of random variables such that Xj →st X and Yj →st Y as j → ∞. Assume that

\[ E|X_j| \to E|X| \quad \text{and} \quad E|Y_j| \to E|Y| \quad \text{as } j \to \infty. \tag{3.A.44} \]

If Xj ≤cx Yj, j = 1, 2, . . ., then X ≤cx Y.

(d) Let X1, X2, . . . , Xm be a set of independent random variables and let Y1, Y2, . . . , Ym be another set of independent random variables. If Xi ≤cx Yi for i = 1, 2, . . . , m, then

\[ \sum_{j=1}^m X_j \le_{\mathrm{cx}} \sum_{j=1}^m Y_j. \]

That is, the convex order is closed under convolutions.

In order to prove part (c) of Theorem 3.A.12 we will use the characterization of the convex order given in Theorem 3.A.2. Without loss of generality it can be assumed that EXj = EYj = EX = EY = 0 for all j. From (3.A.11) we have that $E|X_j - a| = -a + 2\int_{-\infty}^a F_j(u)\,du$ for all a, where Fj denotes the distribution function of Xj. In particular, when a = 0, it is seen that $E|X_j| = 2\int_{-\infty}^0 F_j(u)\,du$. Therefore $E|X_j - a| = E|X_j| - a + 2\int_0^a F_j(u)\,du$. Using (3.A.44) it is seen that, as j → ∞, the latter expression converges to $E|X| - a + 2\int_0^a F(u)\,du = E|X - a|$, where F is the distribution function of X. That is, for all a, E|Xj − a| → E|X − a|, as j → ∞. Similarly, E|Yj − a| → E|Y − a|, as j → ∞. The result now follows from Theorem 3.A.2.

One way of proving part (d) of Theorem 3.A.12 is the following. Note that part (b) of Theorem 3.A.12 can be rephrased as follows: Let Z1, Z2, and Θ be independent random variables and let g be a bivariate function such that

\[ g(Z_1, \theta) \le_{\mathrm{cx}} g(Z_2, \theta) \quad \text{for all } \theta \text{ in the support of } \Theta. \tag{3.A.45} \]

Then g(Z1, Θ) ≤cx g(Z2, Θ). If Z1 and Z2 satisfy Z1 ≤cx Z2, then the function g, defined by g(z, θ) = z + θ, satisfies (3.A.45), since the order ≤cx is closed under shifts. Thus we have shown that if Z1 ≤cx Z2 and Θ is any random variable independent of Z1 and Z2, then

\[ Z_1 + \Theta \le_{\mathrm{cx}} Z_2 + \Theta. \tag{3.A.46} \]

Repeated applications of (3.A.46) yield part (d) of Theorem 3.A.12.

It should be pointed out, in contrast to part (a) of Theorem 3.A.12, that if X and Y are such that X ≤cx Y, it is not necessarily true that X ≤cx −Y also, even when EX = EY = 0. This can be seen easily from (3.A.12).
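The step (3.A.46) can be illustrated on finite discrete distributions by convolving probability mass functions exactly. The sketch below is only an illustration: the helper names (`convolve`, `stop_loss`, `is_cx_smaller`) and the three distributions `Z1`, `Z2`, `Theta` are hypothetical choices, and the convex-order test uses (3.A.5) at the atoms together with equality of means, which is sufficient for finitely supported distributions.

```python
from collections import defaultdict
from fractions import Fraction

def convolve(d1, d2):
    """Exact pmf of the sum of two independent finite discrete variables."""
    out = defaultdict(Fraction)
    for x, p in d1.items():
        for y, q in d2.items():
            out[x + y] += p * q
    return dict(out)

def mean(dist):
    return sum(x * p for x, p in dist.items())

def stop_loss(dist, a):
    """E[(X - a)^+] as in (3.A.5)."""
    return sum(max(x - a, 0) * p for x, p in dist.items())

def is_cx_smaller(dx, dy):
    """Equal means and (3.A.5) at all atoms; valid for finite supports."""
    if mean(dx) != mean(dy):
        return False
    return all(stop_loss(dx, a) <= stop_loss(dy, a) for a in set(dx) | set(dy))

half, quarter = Fraction(1, 2), Fraction(1, 4)
Z1 = {-1: half, 1: half}                    # hypothetical, mean 0
Z2 = {-2: quarter, 0: half, 2: quarter}     # hypothetical, mean 0, more spread
Theta = {0: half, 3: half}                  # hypothetical independent shift

print(is_cx_smaller(Z1, Z2))
print(is_cx_smaller(convolve(Z1, Theta), convolve(Z2, Theta)))
```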

Without condition (3.A.44) the conclusion of part (c) of Theorem 3.A.12 need not be true. For example, let the Xj’s be all uniformly distributed on [.5, 1.5]. And let the Yj’s be such that P{Yj = 0} = (j − 1)/j and P{Yj = j} = 1/j, j ≥ 2. Note that the distributions of the Yj’s converge to a distribution that is degenerate at 0. Here Xj ≤cx Yj, j = 2, 3, . . ., but it is not true that X ≤cx Y.

For nonnegative random variables, a “random sums” analog of Theorem 3.A.12(d) follows. We omit the proof (however, in Theorem 8.A.13 of Chapter 8 we give a proof of a special case of the following theorem).

Theorem 3.A.13. Let X1, X2, . . . and Y1, Y2, . . . each be a sequence of nonnegative independent random variables such that Xi ≤cx Yi, i = 1, 2, . . .. Let M and N be integer-valued positive random variables that are independent of the {Xi} and {Yi} sequences, respectively, such that M ≤cx N. If the Xi’s or the Yi’s are increasing in i in the convex order, then

\[ \sum_{j=1}^M X_j \le_{\mathrm{cx}} \sum_{j=1}^N Y_j. \]

A result that is related to Theorem 3.A.13 is given next. It is of interest to compare it to Theorems 1.A.5 and 2.B.8.

Theorem 3.A.14. Let {Xj, j = 1, 2, . . . } be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi’s. Let {Yj, j = 1, 2, . . . } be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the Yi’s. Suppose that for some positive integer K we have

\[ \sum_{i=1}^K X_i \le_{\mathrm{cx}} [\ge_{\mathrm{cx}}]\ Y_1 \]

and

\[ M \le_{\mathrm{cx}} [\ge_{\mathrm{cx}}]\ KN. \]

Then

\[ \sum_{j=1}^M X_j \le_{\mathrm{cx}} [\ge_{\mathrm{cx}}]\ \sum_{j=1}^N Y_j. \]

We do not give a detailed proof of Theorem 3.A.14 here since it is similar to the proof of Theorem 4.A.12 in Section 4.A.1. Two other similar theorems are the following. Their proofs are similar to the proofs of Theorems 4.A.13 and 4.A.14 in Section 4.A.1.

Theorem 3.A.15. Let {Xj, j = 1, 2, . . . } be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi’s. Let {Yj, j = 1, 2, . . . } be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the Yi’s. Also, let {Nj, j = 1, 2, . . . } be a sequence of independent random variables that are distributed as N. If for some positive integer K we have

\[ \sum_{i=1}^K X_i \le_{\mathrm{cx}} Y_1 \quad \text{and} \quad M \le_{\mathrm{cx}} \sum_{i=1}^K N_i, \]

or if we have

\[ KX_1 \le_{\mathrm{cx}} Y_1 \quad \text{and} \quad M \le_{\mathrm{cx}} KN, \]

or if we have

\[ KX_1 \le_{\mathrm{cx}} Y_1 \quad \text{and} \quad M \le_{\mathrm{cx}} \sum_{i=1}^K N_i, \]

then

\[ \sum_{j=1}^M X_j \le_{\mathrm{cx}} \sum_{j=1}^N Y_j. \]

Theorem 3.A.16. Let {Xj, j = 1, 2, . . . } be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi’s. Let {Yj, j = 1, 2, . . . } be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the Yi’s. If for some positive integers K1 and K2, such that K1 ≤ K2, we have

\[ \sum_{i=1}^{K_1} X_i \le_{\mathrm{cx}} \frac{K_1}{K_2} Y_1 \quad \text{and} \quad M \le_{\mathrm{cx}} K_2 N, \]

then

\[ \sum_{j=1}^M X_j \le_{\mathrm{cx}} \sum_{j=1}^N Y_j. \]

Another result which involves a comparison of random sums, with respect to the convex order, is given in Example 9.A.19. Theorem 3.A.12(d) can be generalized to situations in which the Xj ’s or the Yj ’s are not necessarily independent. For example, the result (7.A.13) in Chapter 7 is a generalization of Theorem 3.A.12(d). The next result is a trivial illustration of a case in which one of the independence assumptions is dropped.

Theorem 3.A.17. Let X be a random variable with a finite mean. Then X + EX ≤cx 2X.

Proof. Let X′ be an independent copy of X. Then, for any convex function φ for which the expectations below exist, one has

\[ E\phi(X + EX) = E\phi(E(X + X' \mid X)) \le E\phi(X + X') \le E\phi(2X), \]

where the first inequality follows from Jensen’s Inequality and the second inequality follows from Example 3.A.29 below (with n = 2).

Theorem 3.A.17 can also be easily proven using Theorem 3.A.4. The following result provides a generalization of Theorem 3.A.17; see a comment after Theorem 3.A.18. Recall from (3.A.32) the deﬁnition of the dilation order. Theorem 3.A.18. Let X be a random variable with a ﬁnite mean. Then X ≤dil aX

whenever a ≥ 1.

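Proof. Without loss of generality assume that EX = 0. Let φ be a convex function which, without loss of generality, can be assumed to satisfy φ(0) = 0. Then, for a ≥ 1 we have Eφ(X) ≤ E[aφ(X)] ≤ Eφ(aX), where the first inequality holds because Eφ(X) ≥ φ(EX) = φ(0) = 0 by Jensen’s Inequality, and the second holds because aφ(x) ≤ φ(ax) for a convex function φ with φ(0) = 0.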

From Theorem 3.A.18 it follows that X + (k − 1)EX ≤cx kX

whenever k ≥ 1,

which is, indeed, a generalization of Theorem 3.A.17. From Theorem 3.A.12(a) it is not hard to see that X ≤dil Y ⇐⇒ −X ≤dil −Y. Another property of the dilation and of the convex orders is described in the following theorem. Theorem 3.A.19. Let X1 and X2 (Y1 and Y2 ) be two independent copies of X (Y ), where X and Y have ﬁnite means. If X ≤dil Y , then X1 − X2 ≤dil Y1 − Y2 . If X ≤cx Y , then X1 − X2 ≤cx Y1 − Y2 . Proof. Using the fact that X ≤dil Y if, and only if, −X ≤dil −Y , and the fact that the dilation order is closed under convolutions (see Theorem 3.A.12(d)), the stated result follows. The proof of X ≤cx Y =⇒ X1 − X2 ≤cx Y1 − Y2 is similar (using Theorem 3.A.12(a) and (d)).
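Theorem 3.A.18 and its consequence X + (k − 1)EX ≤cx kX can be checked exactly for a finitely supported X. The sketch below is an illustration only: the helper names (`affine`, `is_dil_smaller`, and so on) and the distribution `X` are hypothetical, and the dilation order is tested by centering both variables as in (3.A.32) and then applying (3.A.5) at the atoms.

```python
from fractions import Fraction

def mean(dist):
    return sum(x * p for x, p in dist.items())

def stop_loss(dist, a):
    """E[(X - a)^+] as in (3.A.5)."""
    return sum(max(x - a, 0) * p for x, p in dist.items())

def is_cx_smaller(dx, dy):
    """Equal means and (3.A.5) at all atoms; valid for finite supports."""
    if mean(dx) != mean(dy):
        return False
    return all(stop_loss(dx, a) <= stop_loss(dy, a) for a in set(dx) | set(dy))

def affine(dist, scale, shift=0):
    """pmf of scale * X + shift (scale is assumed nonzero)."""
    return {scale * x + shift: p for x, p in dist.items()}

def is_dil_smaller(dx, dy):
    """X <=dil Y, that is, [X - EX] <=cx [Y - EY] as in (3.A.32)."""
    return is_cx_smaller(affine(dx, 1, -mean(dx)), affine(dy, 1, -mean(dy)))

# Hypothetical distribution with mean 5/4.
X = {0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 4)}
m = mean(X)

for k in (1, 2, 3):
    print(k, is_dil_smaller(X, affine(X, k)),
          is_cx_smaller(affine(X, 1, (k - 1) * m), affine(X, k)))
```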

An interesting comparison of sums of random variables in the convex order is the following result.

Theorem 3.A.20. Let X1, X2, . . . , Xn, and Z be random variables. Then

\[ X_1 + X_2 + \cdots + X_n \ge_{\mathrm{cx}} E[X_1 \mid Z] + E[X_2 \mid Z] + \cdots + E[X_n \mid Z], \]

provided the conditional expectations above exist.

Proof. Let φ be a convex function. By Jensen’s Inequality we have

\[ E\phi(X_1 + X_2 + \cdots + X_n) = E\,E[\phi(X_1 + X_2 + \cdots + X_n) \mid Z] \ge E\,\phi\big(E[X_1 \mid Z] + E[X_2 \mid Z] + \cdots + E[X_n \mid Z]\big), \]

and the stated result follows.

Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a convex subset (that is, an interval) of the real line or of N. As in Section 1.A.3 let X(θ) denote a random variable with distribution function Gθ. For any random variable Θ with support in $\mathcal{X}$, and with distribution function F, let us denote by X(Θ) a random variable with distribution function H given by

\[ H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}. \]

Theorem 3.A.21. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as above. Let Θ1 and Θ2 be two random variables with supports in $\mathcal{X}$ and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random variables such that Yi =st X(Θi), i = 1, 2; that is, suppose that the distribution function of Yi is given by

\[ H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2. \]

If for every convex function φ

\[ E[\phi(X(\theta))] \text{ is convex in } \theta, \tag{3.A.47} \]

and if Θ1 ≤cx Θ2, then Y1 ≤cx Y2.

The proof of Theorem 3.A.21 is similar to the proof of Theorem 4.A.18 below, and therefore we omit it. It is worth mentioning that condition (3.A.47) is the same as the condition $\{X(\theta), \theta \in \mathcal{X}\} \in$ SCX which is studied in Section 8.A of Chapter 8. The following corollary of Theorem 3.A.21 shows that the convex order is closed under products of nonnegative random variables. A variation of this corollary is given in Example 4.A.19.

Corollary 3.A.22. Let X1 and X2 be a pair of independent random variables, and let Y1 and Y2 be another pair of independent random variables. If Xi ≤cx Yi , i = 1, 2, then X1 X2 ≤cx Y1 Y2 . Proof. Using Theorem 3.A.21 twice we see that X1 X2 ≤cx Y1 X2 ≤cx Y1 Y2 , and the stated result follows from the transitivity property of the convex order.

An interesting variation of Theorem 3.A.21 is the following. Again, we omit the proof because it is similar to the proof of Theorem 4.A.18.

Theorem 3.A.23. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as described before Theorem 3.A.21. Let Θ1 and Θ2 be two random variables with supports in $\mathcal{X}$ and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random variables such that Yi =st X(Θi), i = 1, 2; that is, suppose that the distribution function of Yi is given by

\[ H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2. \]

If for every convex function φ

\[ E[\phi(X(\theta))] \text{ is increasing in } \theta, \]

and if Θ1 ≤st Θ2, then Y1 ≤cx Y2.

The next result indicates the “minimal” and the “maximal” random variables, with respect to the order ≤cx, when the support and the mean are given. The proof, using (3.A.7) or (3.A.8) for example, is trivial and is thus omitted.

Theorem 3.A.24. Let X be a random variable with mean EX. Denote the left [right] endpoint of the support of X by lX [uX] (see the paragraph preceding (3.A.12) for the exact definition of lX and uX). Let Z be a random variable such that P{Z = lX} = (uX − EX)/(uX − lX) and P{Z = uX} = (EX − lX)/(uX − lX). Then

\[ EX \le_{\mathrm{cx}} X \le_{\mathrm{cx}} Z, \tag{3.A.48} \]

where in (3.A.48) (and in (3.A.49)) EX denotes a random variable that takes on the value EX with probability 1.
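The two-point variable Z of Theorem 3.A.24 is easy to construct for a finitely supported X, and (3.A.48) can then be verified exactly. The following sketch is an illustration under stated assumptions: the function name `extremal_two_point` and the distribution `X` are hypothetical, every listed atom is assumed to have positive probability, and the convex order is tested through (3.A.5) at the atoms.

```python
from fractions import Fraction

def mean(dist):
    return sum(x * p for x, p in dist.items())

def stop_loss(dist, a):
    """E[(X - a)^+] as in (3.A.5)."""
    return sum(max(x - a, 0) * p for x, p in dist.items())

def is_cx_smaller(dx, dy):
    """Equal means and (3.A.5) at all atoms; valid for finite supports."""
    if mean(dx) != mean(dy):
        return False
    return all(stop_loss(dx, a) <= stop_loss(dy, a) for a in set(dx) | set(dy))

def extremal_two_point(dist):
    """The variable Z of Theorem 3.A.24: mass on the support endpoints only."""
    lo, hi = min(dist), max(dist)
    if lo == hi:
        return {lo: Fraction(1)}
    m = mean(dist)
    return {lo: Fraction(hi - m) / (hi - lo), hi: Fraction(m - lo) / (hi - lo)}

# Hypothetical distribution with support endpoints 0 and 4 and mean 5/4.
X = {0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 4)}
Z = extremal_two_point(X)
degenerate = {mean(X): Fraction(1)}  # the constant EX

print(Z)
print(is_cx_smaller(degenerate, X), is_cx_smaller(X, Z))
```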

Another result that indicates the “minimal” random variable, with respect to the order ≤cx , for some rich families of random variables when the mean is given is Theorem 3.A.46. It follows from the ﬁrst inequality of (3.A.48) and from the fact that for any two random variables U and V one has U ≤cx V ⇐⇒ V ≤cv U , that if X is a random variable with mean EX, then X ≤cv EX.

(3.A.49)

In analogy to Theorem 1.A.17 we have the following results. We omit the proof of Theorem 3.A.25; however, the necessity part of Theorem 3.A.25 is a special case of Theorem 3.A.26. In the next three theorems we assume that all the random variables that are considered have ﬁnite means. Theorem 3.A.25. Let X be a nonnegative random variable that is not degenerate at 0 and let g be a nonnegative function deﬁned on [0, ∞). If g(x) > 0 for all x > 0, and if g is increasing on [0, ∞), and if g(x)/x is decreasing [increasing] on (0, ∞), then g(X) ≤Lorenz [≥Lorenz ] X. For example, if X is a nonnegative random variable, then X + a ≤Lorenz X

whenever a > 0.

The proof of the next theorem follows from results in Chapter 4 (see Theorem 4.B.5 and the ﬁrst part of the proof of Theorem 4.B.4). Theorem 3.A.26. Let X be a nonnegative random variable that is not degenerate at 0, and let g and h be nonnegative increasing functions, deﬁned on [0, ∞), such that g(x) > 0 and h(x) > 0 for all x > 0. If h(x)/g(x) is increasing in x ∈ (0, ∞), then g(X) ≤Lorenz h(X). Using Theorem 3.A.25 it is not too hard to prove the following result. Theorem 3.A.27. Let X and Z be two independent nonnegative random variables that are not degenerate at 0 and let g be a nonnegative function deﬁned on [0, ∞)2 such that g(Z, X) is not degenerate at 0. If g(z, x)/x is increasing in x for every z, and if g(z, x) is increasing in x for every z, then X ≤Lorenz g(Z, X). The Lorenz order often implies the harmonic mean residual life order, as the following result shows. Theorem 3.A.28. Let X and Y be two nonnegative random variables with positive expectations. If X ≤Lorenz Y and if EX ≤ EY , then X ≤hmrl Y .

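Proof. For t ≥ 0 we have

\[ \frac{E[(X-t)^+]}{EX} \le \frac{E\big[\big(\frac{EX}{EY}\cdot Y - t\big)^+\big]}{EX} = E\Big[\Big(\frac{Y}{EY} - \frac{t}{EX}\Big)^+\Big] = \frac{E\big[\big(Y - \frac{EY}{EX}\cdot t\big)^+\big]}{EY} \le \frac{E[(Y-t)^+]}{EY}, \]

where the first inequality follows from X ≤Lorenz Y (that is, $X \le_{\mathrm{cx}} \frac{EX}{EY} Y$), and the second inequality follows from EX ≤ EY. The stated result now follows from (2.B.4).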

Let us now return to the characterization of the convex order given in Theorem 3.A.4. This characterization is sometimes useful for establishing the relation ≤cx between two random variables. The next example is a fine illustration of this procedure.

Example 3.A.29. Let X1, X2, . . . be independent and identically distributed random variables. Denote by $\bar X_n$ the sample mean of X1, X2, . . . , Xn. That is, $\bar X_n = (X_1 + X_2 + \cdots + X_n)/n$. It is well known that if the variances exist, then for every n ≥ 2 one has $\operatorname{Var}(\bar X_n) \le \operatorname{Var}(\bar X_{n-1})$. But more than that is true. In fact, if the expectation of X1 exists, then for each n ≥ 2 one has $\bar X_n \le_{\mathrm{cx}} \bar X_{n-1}$. In order to see it note that from the exchangeability of X1, X2, . . . , Xn it follows that $E[X_i \mid \bar X_n] = \bar X_n$ for all i ≤ n. Therefore $E[\bar X_{n-1} \mid \bar X_n] = \bar X_n$. That is, $\{\bar X_n, \bar X_{n-1}\}$ is a martingale. The result now follows from Theorem 3.A.4.

An extension of Example 3.A.29 to the multivariate case is given in Example 7.A.11. A result that is similar to Example 3.A.29 is the following (actually it is a generalization of Example 3.A.29 as will be argued below).

Theorem 3.A.30. Let X1, X2, . . . , Xn be independent and identically distributed random variables. Let φ1, φ2, . . . , φn be measurable real functions. Denote $\bar\phi = \frac{1}{n}\sum_{i=1}^n \phi_i$. Then

\[ \sum_{i=1}^n \bar\phi(X_i) \le_{\mathrm{cx}} \sum_{i=1}^n \phi_i(X_i). \]

The proof of Theorem 3.A.30 consists of verifying, using the exchangeability of the Xi’s, that

\[ E\Big[\frac{1}{n!}\sum_{\pi}\sum_{i=1}^n \phi_{\pi_i}(X_i) \,\Big|\, \sum_{i=1}^n \bar\phi(X_i)\Big] = \sum_{i=1}^n \bar\phi(X_i) \]

and that

\[ \frac{1}{n!}\sum_{\pi}\sum_{i=1}^n \phi_{\pi_i}(X_i) =_{\mathrm{st}} \sum_{i=1}^n \phi_i(X_i). \]

The desired result then follows from Theorem 3.A.4.

Corollary 3.A.31. Let X1, X2, . . . , Xn be independent and identically distributed random variables. Let a1, a2, . . . , an be real constants. Denote $\bar a = \frac{1}{n}\sum_{i=1}^n a_i$. Then

\[ \bar a \sum_{i=1}^n X_i \le_{\mathrm{cx}} \sum_{i=1}^n a_i X_i. \]

By taking ai = 1/(n − 1) for i = 1, 2, . . . , n − 1, and an = 0, it is easily seen that Example 3.A.29 is a special case of Corollary 3.A.31.

Example 3.A.32. Let m ≤ m′ be two positive integers, and let M and N be two Poisson random variables with means mλ and m′λ, respectively, for some λ > 0. Define X = mN and Y = m′M. Then, using Example 3.A.29, it can be shown that X ≤cx Y. This result can be extended to the case where m and m′ are not integers, by approximating m/m′ with rational numbers.

Two other simple results that follow from Theorem 3.A.4 are the following theorems.

Theorem 3.A.33. Let X and Y be independent random variables with finite means and suppose that EY = a. Then aX ≤cx YX.

Proof. Clearly, E[YX | X] = aX and the result now follows from Theorem 3.A.4. This result is also an immediate consequence of Corollary 3.A.22 if one takes there X1 = a almost surely, X2 = X, and Y1 = Y and Y2 = X.
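The comparison of sample means in Example 3.A.29 can be checked exactly when the Xi’s are Bernoulli, since the sample mean of n such variables is a binomial count divided by n. The sketch below is an illustration only: the Bernoulli assumption, the success probability 1/3, and the helper names (`sample_mean_dist`, `is_cx_smaller`) are hypothetical choices, and the convex order is tested via (3.A.5) at the atoms.

```python
from fractions import Fraction
from math import comb

def sample_mean_dist(n, p):
    """Exact pmf of the mean of n i.i.d. Bernoulli(p) variables."""
    return {Fraction(k, n): comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(n + 1)}

def mean(dist):
    return sum(x * q for x, q in dist.items())

def stop_loss(dist, a):
    """E[(X - a)^+] as in (3.A.5)."""
    return sum(max(x - a, 0) * q for x, q in dist.items())

def is_cx_smaller(dx, dy):
    """Equal means and (3.A.5) at all atoms; valid for finite supports."""
    if mean(dx) != mean(dy):
        return False
    return all(stop_loss(dx, a) <= stop_loss(dy, a) for a in set(dx) | set(dy))

p = Fraction(1, 3)  # hypothetical success probability
for n in range(2, 7):
    smaller = is_cx_smaller(sample_mean_dist(n, p), sample_mean_dist(n - 1, p))
    print(n, smaller)  # the mean of n observations versus the mean of n - 1
```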

Theorem 3.A.34. Let X and Y be independent random variables with finite means and suppose that EY = 0. Then X ≤cx X + Y.

Proof. Clearly, E[X + Y | X] = X and the result follows from Theorem 3.A.4. Another way of proving this result is to use Theorem 3.A.12(d).

Recall from (3.A.32) the deﬁnition of the dilation order. From Theorem 3.A.34 it follows that if X and Y are independent random variables with ﬁnite means, then X ≤dil X + Y. (3.A.50) Recall from page 2 the deﬁnition of the majorization order a ≺ b among n-dimensional vectors. The next result strengthens Corollary 3.A.31.

Theorem 3.A.35. Let X1, X2, . . . , Xn be exchangeable random variables. Let a = (a1, a2, . . . , an) and b = (b1, b2, . . . , bn) be two vectors of constants. If a ≺ b, then

\[ \sum_{i=1}^n a_i X_i \le_{\mathrm{cx}} \sum_{i=1}^n b_i X_i. \tag{3.A.51} \]

Proof. Below, for any constants a, b, c, and d the notation a ≤ [b, c] stands for a ≤ min{b, c}, and the notation [b, c] ≤ d stands for max{b, c} ≤ d. By a well-known property of the majorization order it suffices to prove the result only for n = 2. Let X1 and X2 be exchangeable random variables, and let a1, a2, b1, and b2 be four constants such that b1 ≤ a1 ≤ a2 ≤ b2 and a1 + a2 = b1 + b2. Denote X(1) = min{X1, X2} and X(2) = max{X1, X2}. Then, almost surely,

\[ b_1 X_{(2)} + b_2 X_{(1)} \le [a_1 X_{(1)} + a_2 X_{(2)},\ a_1 X_{(2)} + a_2 X_{(1)}] \le b_1 X_{(1)} + b_2 X_{(2)} \]

and

\[ a_1 X_{(2)} + a_2 X_{(1)} + a_1 X_{(1)} + a_2 X_{(2)} = b_1 X_{(2)} + b_2 X_{(1)} + b_1 X_{(1)} + b_2 X_{(2)}. \]

Hence for any convex function φ we have, almost surely,

\[ \phi(a_1 X_{(2)} + a_2 X_{(1)}) + \phi(a_1 X_{(1)} + a_2 X_{(2)}) \le \phi(b_1 X_{(2)} + b_2 X_{(1)}) + \phi(b_1 X_{(1)} + b_2 X_{(2)}). \]

Therefore,

\[ \begin{aligned} 2E\phi(a_1 X_1 + a_2 X_2) &= E[\phi(a_1 X_{(2)} + a_2 X_{(1)}) + \phi(a_1 X_{(1)} + a_2 X_{(2)})] \\ &\le E[\phi(b_1 X_{(2)} + b_2 X_{(1)}) + \phi(b_1 X_{(1)} + b_2 X_{(2)})] \\ &= 2E\phi(b_1 X_1 + b_2 X_2), \end{aligned} \]

and the stated result follows.
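Inequality (3.A.51) can be checked exactly when the Xi’s are independent and identically distributed (hence exchangeable) with a small finite support, by enumerating all outcomes. The sketch below is an illustration under that assumption; the helper names (`majorized_by`, `weighted_sum_dist`, `is_cx_smaller`), the weight vectors, and the marginal distribution are hypothetical. Majorization is taken in the usual sense: equal totals, and each partial sum of the decreasing rearrangement of a is at most the corresponding partial sum for b.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def majorized_by(a, b):
    """Return True when a is majorized by b (a ≺ b)."""
    if len(a) != len(b) or sum(a) != sum(b):
        return False
    total_a = total_b = 0
    for x, y in zip(sorted(a, reverse=True), sorted(b, reverse=True)):
        total_a, total_b = total_a + x, total_b + y
        if total_a > total_b:
            return False
    return True

def weighted_sum_dist(weights, marginal):
    """Exact pmf of sum_i w_i X_i for i.i.d. X_i with pmf `marginal`."""
    out = defaultdict(Fraction)
    for combo in product(marginal.items(), repeat=len(weights)):
        value = sum(w * x for w, (x, _) in zip(weights, combo))
        prob = Fraction(1)
        for _, q in combo:
            prob *= q
        out[value] += prob
    return dict(out)

def mean(dist):
    return sum(x * q for x, q in dist.items())

def stop_loss(dist, a):
    return sum(max(x - a, 0) * q for x, q in dist.items())

def is_cx_smaller(dx, dy):
    """Equal means and (3.A.5) at all atoms; valid for finite supports."""
    if mean(dx) != mean(dy):
        return False
    return all(stop_loss(dx, a) <= stop_loss(dy, a) for a in set(dx) | set(dy))

a, b = (2, 2, 2), (1, 2, 3)  # hypothetical weights with a ≺ b
marginal = {0: Fraction(1, 2), 1: Fraction(1, 4), 3: Fraction(1, 4)}

print(majorized_by(a, b))
print(is_cx_smaller(weighted_sum_dist(a, marginal), weighted_sum_dist(b, marginal)))
```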

A result that is related to Theorem 3.A.35 is Theorem 4.A.39. Another result that is related to Theorem 3.A.35 is Theorem 7.B.8 in Chapter 7 by Tong in [515]; the latter compares $\sum_{i=1}^{n} b_i X_i$ and $\sum_{i=1}^{n} a_i X_i$ in the sense of the peakedness order of Section 3.D, rather than in the sense of the order ≤cx.

From Theorem 3.A.35 it follows that if the Xi’s are exchangeable (in particular, if they are identically distributed), if ai ≥ 0, i = 1, 2, . . . , n, and $\sum_{i=1}^{n} a_i = 1$, and if X1 ≤cx Y for some random variable Y, then

\[ \sum_{i=1}^{n} a_i X_i \le_{\mathrm{cx}} Y. \tag{3.A.52} \]

The next result shows that (3.A.52) is true even if the Xi ’s are not exchangeable, but have any joint distribution.


3 Univariate Variability Orders

Theorem 3.A.36. Let X1, X2, . . . , Xn and Y be n + 1 random variables. If Xi ≤cx Y, i = 1, 2, . . . , n, then

\[ \sum_{i=1}^{n} a_i X_i \le_{\mathrm{cx}} Y, \]

whenever ai ≥ 0, i = 1, 2, . . . , n, and $\sum_{i=1}^{n} a_i = 1$.

Proof. Let φ be any convex function for which the expectations below exist. Then

\[ E\Big[\varphi\Big(\sum_{i=1}^{n} a_i X_i\Big)\Big] \le E\Big[\sum_{i=1}^{n} a_i \varphi(X_i)\Big] = \sum_{i=1}^{n} a_i E[\varphi(X_i)] \le \sum_{i=1}^{n} a_i E[\varphi(Y)] = E[\varphi(Y)], \]

where the first inequality follows from the convexity of φ, and the second inequality from Xi ≤cx Y, i = 1, 2, . . . , n.

Similar results are described in Theorems 5.A.14, 5.C.8, and 5.C.18.

An interesting result in which the coefficients in (3.A.51) are replaced by Bernoulli random variables is described next. Let Ip denote a Bernoulli random variable with probability of success p, that is, P{Ip = 1} = 1 − P{Ip = 0} = p.

Theorem 3.A.37. Let X1, X2, . . . , Xn be nonnegative exchangeable random variables, and let $I_{p_1}, I_{p_2}, \dots, I_{p_n}$ and $I_{q_1}, I_{q_2}, \dots, I_{q_n}$ be independent Bernoulli random variables that are independent of X1, X2, . . . , Xn. If p ≺ q, where p = (p1, p2, . . . , pn) and q = (q1, q2, . . . , qn), then

\[ \sum_{i=1}^{n} I_{p_i} X_i \ge_{\mathrm{cx}} \sum_{i=1}^{n} I_{q_i} X_i. \]

A result that is related to Theorem 3.A.37 is Theorem 4.A.38.

Example 3.A.38. If the Xi’s in Theorem 3.A.37 are all identically equal to 1, then we get that p ≺ q implies that

\[ \sum_{i=1}^{n} I_{p_i} \ge_{\mathrm{cx}} \sum_{i=1}^{n} I_{q_i}. \]

In particular,

\[ \sum_{i=1}^{n} I_{q_i} \le_{\mathrm{cx}} Y, \]

where Y is a binomial random variable having the parameters n and $\bar{q} = \sum_{i=1}^{n} q_i / n$.
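Example 3.A.38 lends itself to an exact numerical check. The sketch below is an illustration under stated assumptions rather than part of the original text: the probability vectors p and q are arbitrary choices with p ≺ q, the helper names are hypothetical, and the convex order between integer-valued variables is tested through equal means and the stop-loss transforms at the integers (which suffices because the transforms are linear between consecutive integers).

```python
from fractions import Fraction as Fr


def bernoulli_sum_pmf(probs):
    """pmf (as a list indexed by k) of a sum of independent Bernoulli(p_i) variables."""
    pmf = [Fr(1)]
    for p in probs:
        nxt = [Fr(0)] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)      # the new indicator equals 0
            nxt[k + 1] += mass * p        # the new indicator equals 1
        pmf = nxt
    return pmf


def stop_loss(pmf, t):
    """E(S - t)_+ for an integer-valued S with the given pmf."""
    return sum(mass * max(k - t, 0) for k, mass in enumerate(pmf))


def is_cx_smaller(pmf_x, pmf_y):
    """Equal means and ordered stop-loss transforms at all integer thresholds."""
    mean_x = sum(k * m for k, m in enumerate(pmf_x))
    mean_y = sum(k * m for k, m in enumerate(pmf_y))
    n = max(len(pmf_x), len(pmf_y))
    return mean_x == mean_y and all(
        stop_loss(pmf_x, t) <= stop_loss(pmf_y, t) for t in range(n))


# Illustrative vectors (assumptions): p is majorized by q, and both average to 1/2.
q = (Fr(1, 10), Fr(1, 2), Fr(9, 10))
p = (Fr(3, 10), Fr(1, 2), Fr(7, 10))
q_bar = sum(q) / len(q)

sum_q = bernoulli_sum_pmf(q)
sum_p = bernoulli_sum_pmf(p)
binomial = bernoulli_sum_pmf((q_bar,) * len(q))

print(is_cx_smaller(sum_q, sum_p), is_cx_smaller(sum_p, binomial))
```

With these inputs the three sums form a chain: the sum built from q is the least variable, the binomial sum the most variable.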


Conceptually it can be expected that if the random variables X1, X2, . . . , Xn are “more positively [negatively] associated” than the random variables Y1, Y2, . . . , Yn in some sense, but otherwise Xi =st Yi for each i, then $\sum_{i=1}^{n} X_i \ge_{\mathrm{cx}} [\le_{\mathrm{cx}}] \sum_{i=1}^{n} Y_i$. The following result is a formalization of this idea. Recall that random variables X1, X2, . . . , Xn are said to be positively associated if

\[ \mathrm{Cov}\big(h_1(X_1, X_2, \dots, X_n),\, h_2(X_1, X_2, \dots, X_n)\big) \ge 0 \tag{3.A.53} \]

for all increasing functions h1 and h2 for which the above covariance is defined. Similarly, X1, X2, . . . , Xn are said to be negatively associated if

\[ \mathrm{Cov}\big(h_1(X_{i_1}, X_{i_2}, \dots, X_{i_k}),\, h_2(X_{j_1}, X_{j_2}, \dots, X_{j_{n-k}})\big) \le 0 \tag{3.A.54} \]

for all choices of disjoint subsets {i1, i2, . . . , ik} and {j1, j2, . . . , jn−k} of {1, 2, . . . , n}, and for all increasing functions h1 and h2 for which the above covariance is defined.

Theorem 3.A.39. Let X1, X2, . . . , Xn be positively [negatively] associated random variables, and let Y1, Y2, . . . , Yn be independent random variables such that Xi =st Yi, i = 1, 2, . . . , n. Then

\[ \sum_{i=1}^{n} X_i \ge_{\mathrm{cx}} [\le_{\mathrm{cx}}] \sum_{i=1}^{n} Y_i. \]

Theorem 3.A.39 follows from Theorem 9.A.23 in Chapter 9; see a comment there after that theorem.

A Laplace transform characterization of the order ≤cx is stated next; it may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, 2.A.16, and 2.B.14. We do not give the proof of this characterization here since it follows easily from Theorem 4.A.21 in Chapter 4.

Theorem 3.A.40. Let X1 and X2 be two nonnegative random variables, and let Nλ(X1) and Nλ(X2) be as described in Theorem 1.A.13. Then

\[ X_1 \le_{\mathrm{cx}} X_2 \iff N_\lambda(X_1) \le_{\mathrm{cx}} N_\lambda(X_2) \quad \text{for all } \lambda > 0. \]

Example 3.A.41. Let Θ be a random variable whose realization, θ, is a parameter of interest. In the context of statistical inference the distribution function of Θ is called a prior distribution. Let X and Y be two random variables whose distribution functions depend on θ, that is, the conditional distribution of X given Θ = θ is, say, Fθ , and the conditional distribution of Y given Θ = θ is, say, Gθ . Let L(a, θ) be the loss incurred when Θ = θ and when the action a has been taken (a is a number in the action space A which is a compact subset of R). In the following discussion, every expected value that is mentioned is assumed to exist.


If X = x is observed, and action a is taken, then the expected loss is E[L(a, Θ) | X = x]. The minimal expected loss, given that X = x has been observed, is then

\[ \min_{a \in A} E[L(a, \Theta) \mid X = x]. \]

Therefore the expected minimal expected loss, for an experiment in which X is used for inference on θ, is

\[ E\Big[\min_{a \in A} E[L(a, \Theta) \mid X]\Big]. \]

Similarly, the expected minimal expected loss, for an experiment in which Y is used for inference on θ, is

\[ E\Big[\min_{a \in A} E[L(a, \Theta) \mid Y]\Big]. \]

We say that Y is more informative than X for Θ if

\[ E\Big[\min_{a \in A} E[L(a, \Theta) \mid X]\Big] \ge E\Big[\min_{a \in A} E[L(a, \Theta) \mid Y]\Big] \tag{3.A.55} \]

for any loss function L, and any action space A, for which the minima and the expected values above are well defined. Let U = E[Θ | X] and V = E[Θ | Y] be the posterior means in the corresponding experiments. Obviously,

\[ EU = E\big[E[\Theta \mid X]\big] = E\Theta = E\big[E[\Theta \mid Y]\big] = EV. \tag{3.A.56} \]

Take A = [0, 1] and consider the loss function Lc(a, θ) = a(θ − c), where c is some constant. Then

\[ \min_{a \in A} E[L_c(a, \Theta) \mid X] = \min_{a \in A} a\, E[\Theta - c \mid X] = \min\{0, U - c\} = -(c - U)_+, \]

and, similarly,

\[ \min_{a \in A} E[L_c(a, \Theta) \mid Y] = -(c - V)_+. \]

From (3.A.55) we get that E[(c − U)+] ≤ E[(c − V)+] for all c. Therefore, from (3.A.6) and (3.A.56) it follows that E[Θ | X] ≤cx E[Θ | Y].

The following result is an analog of Theorem 1.A.8; similar results are Theorems 3.A.59, 4.A.48, 4.A.69, 5.A.15, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16.


Theorem 3.A.42. Let X and Y be two random variables. Suppose that X ≤cx Y [X ≤cv Y] and that E[X²] = E[Y²], provided the expectations exist. Then X =st Y.

Proof. Denote the distribution functions of X and Y by F and G, respectively, and the corresponding survival functions by $\bar{F}$ and $\bar{G}$. Then

\[ E[Y^2] - E[X^2] = 2 \int_{u=-\infty}^{0} \int_{v=-\infty}^{u} \big(G(v) - F(v)\big)\, dv\, du + 2 \int_{u=0}^{\infty} \int_{v=u}^{\infty} \big(\bar{G}(v) - \bar{F}(v)\big)\, dv\, du. \]

By Theorem 3.A.1 both inner integrals are nonnegative. From E[X²] = E[Y²] we thus obtain $\int_{v=-\infty}^{u} F(v)\,dv = \int_{v=-\infty}^{u} G(v)\,dv$ for u ≤ 0, and $\int_{v=u}^{\infty} \bar{F}(v)\,dv = \int_{v=u}^{\infty} \bar{G}(v)\,dv$ for u ≥ 0. Differentiating these equalities we obtain F = G.

Theorem 3.A.42 can be strengthened as follows; we do not detail the proof here.

Theorem 3.A.43. Let X and Y be two random variables. Suppose that X ≤cx Y [X ≤cv Y] and that for some strictly convex function φ we have that E[φ(X)] = E[φ(Y)], provided the expectations exist. Then X =st Y.

Theorem 3.A.60 below is a generalization of Theorem 3.A.43.

3.A.3 Conditions that lead to the convex order

Once the relation X ≤cx Y has been established between the two random variables X and Y it can be of great use. However, given the two random variables and their distributions it is sometimes not clear how to verify that X ≤cx Y. In this section we point out several simple conditions that imply the convex order. Recall the notation S⁻(a) (defined in (1.A.18)) for the number of sign changes of the function a.

Theorem 3.A.44. Let X and Y be two random variables with equal means, density functions f and g, distribution functions F and G, and survival functions $\bar{F}$ and $\bar{G}$, respectively. Then X ≤cx Y if any of the following conditions hold:

\[ S^-(g - f) = 2 \quad \text{and the sign sequence is } +, -, +, \tag{3.A.57} \]

\[ S^-(\bar{F} - \bar{G}) = 1 \quad \text{and the sign sequence is } +, -, \tag{3.A.58} \]

\[ S^-(G - F) = 1 \quad \text{and the sign sequence is } +, -. \tag{3.A.59} \]

Proof. We will prove the result for the continuous case; the proof in the discrete case is similar. Suppose that S⁻(g − f) = 2 and that the sign sequence is +, −, +. Let a and b (a < b) be two of the crossing points, where the definition of a crossing point is self-explanatory. Denote I1 = (−∞, a], I2 = (a, b], and I3 = (b, ∞). Then g(x) − f(x) ≥ 0 on I1, g(x) − f(x) ≤ 0 on I2, and g(x) − f(x) ≥ 0 on I3. Therefore

\[ G(x) - F(x) = \int_{-\infty}^{x} [g(u) - f(u)]\, du \]

is increasing on I1, decreasing on I2, and increasing on I3. It is also clear that $\lim_{x \to -\infty} [G(x) - F(x)] = \lim_{x \to \infty} [G(x) - F(x)] = 0$. Combining all these observations shows that S⁻(G − F) = 1 and that the sign sequence is +, −.

Now suppose that S⁻(G − F) = 1 and that the sign sequence is +, −. Let c be a crossing point. Denote J1 = (−∞, c] and J2 = (c, ∞). Then G(x) − F(x) ≥ 0 on J1 and G(x) − F(x) ≤ 0 on J2. Clearly

\[ \lim_{x \to -\infty} \int_{-\infty}^{x} [G(u) - F(u)]\, du = 0 \]

and from the equality of the means (see (3.A.3)) it follows that

\[ \lim_{x \to \infty} \int_{-\infty}^{x} [G(u) - F(u)]\, du = 0. \]

Combining these observations shows that (3.A.8) holds. This proves that (3.A.57) and (3.A.59) imply X ≤cx Y. Note that $S^-(\bar{F} - \bar{G}) = S^-(G - F)$ with the same sign sequence. This observation, together with (3.A.59), shows that (3.A.58) implies X ≤cx Y.
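Condition (3.A.57) is easy to inspect numerically for discrete densities. The sketch below is illustrative only: it takes a Poisson density f and a Pascal (negative binomial) density g with a common mean, using parameter values chosen arbitrarily, and lists the sign sequence of g − f on a truncated range. The function names and the truncation point are assumptions made for the illustration, and a truncated scan is evidence for, not a proof of, the sign pattern.

```python
import math


def poisson_pmf(x, mean):
    return math.exp(-mean) * mean ** x / math.factorial(x)


def pascal_pmf(x, lam, alpha):
    """(alpha/(1+alpha))^lam (1/(1+alpha))^x Gamma(x+lam) / (Gamma(lam) x!)."""
    log_pmf = (math.lgamma(x + lam) - math.lgamma(lam) - math.lgamma(x + 1)
               + lam * math.log(alpha / (1 + alpha)) - x * math.log(1 + alpha))
    return math.exp(log_pmf)


def sign_sequence(values):
    """Signs of `values` with zeros dropped and repeated signs merged."""
    signs = []
    for v in values:
        if v != 0:
            s = "+" if v > 0 else "-"
            if not signs or signs[-1] != s:
                signs.append(s)
    return signs


lam, alpha = 2.0, 1.0          # illustrative values; the common mean is lam / alpha = 2
xs = range(41)                 # truncation point chosen for illustration
diff = [pascal_pmf(x, lam, alpha) - poisson_pmf(x, lam / alpha) for x in xs]
signs = sign_sequence(diff)
print(signs, "sign changes:", len(signs) - 1)
```

The number of sign changes is the length of the merged sign sequence minus one, which matches the way S⁻ is used in Theorem 3.A.44.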

The condition (3.A.58) (or, equivalently, (3.A.59)) is not only sufficient for X ≤cx Y, but, for nonnegative random variables, it can also characterize the convex order as the following theorem shows.

Theorem 3.A.45. Let X and Y be two nonnegative random variables with equal means. Then X ≤cx Y if, and only if, there exist random variables Z1, Z2, . . ., with distribution functions F1, F2, . . ., such that Z1 =st X, EZj = EY, j = 1, 2, . . ., Zj →st Y as j → ∞, and $S^-(\bar{F}_j - \bar{F}_{j+1}) = 1$ and the sign sequence is +, −, j = 1, 2, . . ..

If the random variables in Theorem 3.A.45 are not nonnegative then the sufficiency part of that theorem is not correct. This can be seen by noting that Example 1 of Müller [410] describes a sequence of distribution functions (say of the random variables Z1, Z2, . . .), and two other distribution functions (say of the random variables X and Y, which are not nonnegative), which satisfy all the conditions in Theorem 3.A.45, but such that X ≰cx Y. We thank Taizhong Hu for pointing out this fact to us.

In Theorem 3.A.24 we obtained the “minimal” random variable with respect to the order ≤cx when the support and the mean are given. Now, with the aid of Theorem 3.A.44, we can obtain the “minimal” random variables with respect to the order ≤cx for some rich families of random variables when the mean is given. This is shown in the next result.


Theorem 3.A.46. Let X be a nonnegative random variable with mean µ.
(a) Suppose that X has a density function that is decreasing on [0, ∞). Let Y be uniformly distributed over the interval [0, 2µ] (so that EY = µ). Then Y ≤cx X.
(b) Suppose that X has a density function that is decreasing and convex on [0, ∞). Let Z have the triangular distribution over the interval [0, 3µ] with density function

\[ f_Z(x) = \begin{cases} \dfrac{2}{3\mu} - \dfrac{2}{9\mu^2}\, x, & \text{if } 0 \le x \le 3\mu, \\ 0, & \text{otherwise} \end{cases} \]

(so that EZ = µ). Then Z ≤cx X.

Proof. In order to prove (a) let fX and fY denote the density functions of X and Y, respectively. It is easy to see, using the fact that EX = EY, that S⁻(fX − fY) = 2 and that the sign sequence is +, −, +. The result now follows from Theorem 3.A.44. The proof of (b) is similar.
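As an illustration of Theorem 3.A.46, take X to be exponential with mean µ, whose density is decreasing and convex on [0, ∞). The sketch below is not from the original text; it relies on closed forms for the stop-loss transform E(W − t)+ that were derived by hand for this illustration: (2µ − t)²/(4µ) for the uniform variable Y, µ(1 − t/(3µ))³ for the triangular variable Z, and µe^{−t/µ} for X. Since all three variables are nonnegative with mean µ, comparing these transforms for t ≥ 0 is a direct check of the convex order on the chosen grid.

```python
import math


def sl_uniform(t, mu):
    """E(Y - t)_+ for Y uniform on [0, 2*mu], t >= 0."""
    return (2 * mu - t) ** 2 / (4 * mu) if t < 2 * mu else 0.0


def sl_triangular(t, mu):
    """E(Z - t)_+ for Z with density (2/(3*mu)) * (1 - x/(3*mu)) on [0, 3*mu], t >= 0."""
    return mu * (1 - t / (3 * mu)) ** 3 if t < 3 * mu else 0.0


def sl_exponential(t, mu):
    """E(X - t)_+ for X exponential with mean mu, t >= 0."""
    return mu * math.exp(-t / mu)


mu = 1.5                                        # illustrative mean
grid = [0.01 * mu * i for i in range(1001)]     # thresholds from 0 to 10*mu
tol = 1e-12                                     # slack for floating-point rounding

uniform_below_exponential = all(
    sl_uniform(t, mu) <= sl_exponential(t, mu) + tol for t in grid)
triangular_below_exponential = all(
    sl_triangular(t, mu) <= sl_exponential(t, mu) + tol for t in grid)
uniform_below_triangular = all(
    sl_uniform(t, mu) <= sl_triangular(t, mu) + tol for t in grid)

print(uniform_below_exponential, triangular_below_exponential, uniform_below_triangular)
```

The third comparison reflects part (a) applied to Z itself, since the triangular density is decreasing on [0, ∞).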

Some illustrations of the applicability of Theorem 3.A.44 are shown in the following examples.

Example 3.A.47. The following statements can be proven by verifying, using the method in Shaked [502], that in each one of them the two random variables have the same mean, and that their densities satisfy (3.A.57).
(a) Let X and Y have, respectively, the Poisson and the Pascal distributions with the discrete densities f and g given by

\[ f(x) = e^{-\lambda/\alpha}\, \frac{(\lambda/\alpha)^x}{x!}, \quad x = 0, 1, \dots, \]

\[ g(x) = \Big(\frac{\alpha}{1+\alpha}\Big)^{\lambda} \Big(\frac{1}{1+\alpha}\Big)^{x} \frac{\Gamma(x+\lambda)}{\Gamma(\lambda)\, x!}, \quad x = 0, 1, \dots, \]

where α > 0 and λ > 0. Then X ≤cx Y.
(b) Let X and Y have, respectively, the exponential and the power distributions with the densities f and g given by

\[ f(x) = (\gamma - 1)\delta^{-1} \exp\{-(\gamma - 1)\delta^{-1} x\}, \quad x \ge 0, \]

\[ g(x) = (\gamma/\delta)(1 + x/\delta)^{-\gamma-1}, \quad x \ge 0, \]

where γ > 1 and δ > 0. Then X ≤cx Y.


(c) Let X and Y have, respectively, the binomial and the Polya distributions with the discrete densities f and g given by

\[ f(x) = \binom{n}{x} \Big(\frac{\alpha}{\alpha+\beta}\Big)^{x} \Big(\frac{\beta}{\alpha+\beta}\Big)^{n-x}, \quad x = 0, 1, \dots, n, \]

\[ g(x) = \binom{n}{x} \frac{\Gamma(\alpha+\beta)\,\Gamma(\alpha+x)\,\Gamma(\beta+n-x)}{\Gamma(\alpha)\,\Gamma(\beta)\,\Gamma(\alpha+\beta+n)}, \quad x = 0, 1, \dots, n, \]

where α > 0 and β > 0. Then X ≤cx Y.
(d) Let X and Y have, respectively, the discrete densities f and g given by

\[ f(x) = \Big(\frac{\beta-1}{\alpha+\beta-1}\Big)^{\lambda} \Big(\frac{\alpha}{\alpha+\beta-1}\Big)^{x} \frac{\Gamma(x+\lambda)}{\Gamma(\lambda)\, x!}, \quad x = 0, 1, \dots, \]

\[ g(x) = \frac{\Gamma(\alpha+\beta)\,\Gamma(\beta+\lambda)\,\Gamma(\lambda+x)\,\Gamma(\alpha+x)}{\Gamma(\alpha)\,\Gamma(\beta)\,\Gamma(\lambda)\,\Gamma(\alpha+\beta+\lambda+x)\, x!}, \quad x = 0, 1, \dots, \]

where α > 0, β > 1, and λ > 0. Then X ≤cx Y.

Example 3.A.48. Let X and Y be Bernoulli random variables with parameters p and q, respectively, where 0 < p ≤ q ≤ 1. Then

\[ \frac{X}{p} \ge_{\mathrm{cx}} \frac{Y}{q}. \]

This can be seen by easily verifying (3.A.59), where F and G there are the distribution functions of Y/q and X/p, respectively.

A further illustration of the applicability of Theorem 3.A.44 is shown in the following example.

Example 3.A.49. Let U(i:n) be the ith order statistic from a sample of n uniform [0, 1] random variables. By examination of the density functions of the normalized variables $\frac{n+1}{i} U_{(i:n)}$ it is possible to verify (3.A.57) and obtain the following results (see also Example 4.B.13):

U(i+1:n) ≤Lorenz U(i:n), for all i ≤ n − 1,
U(i:n) ≤Lorenz U(i:n+1), for all i ≤ n + 1,
U(n−i+1:n+1) ≤Lorenz U(n−i:n), for all i ≤ n,

and

U(n+2:2n+3) ≤Lorenz U(n+1:2n+1), for all n.

The last inequality may be described as “sample medians exhibit less variability as sample size increases.” Arnold and Villasenor [21], who derived the above results, give many other Lorenz order inequalities for order statistics and record values associated with various parametric families; see also Wilfling [566] and Kleiber [304].


Example 3.A.50. Let X(i:n) denote the ith order statistic in a sample of n independent and identically distributed random variables having the common distribution F, survival function $\bar{F}$, and density function f. Recall that a function φ : [0, ∞) → [0, ∞) is said to be regularly varying at ∞ with index ρ ∈ R if

\[ \lim_{x \to \infty} \frac{\varphi(tx)}{\varphi(x)} = t^{\rho}, \quad \text{for all } t \in [0, \infty). \]

The function φ is said to be regularly varying at −∞ with index ρ if φ(−x) is regularly varying at ∞ with index ρ. Finally, the function φ is said to be regularly varying at 0 with index ρ if φ(x⁻¹) is regularly varying at ∞ with index ρ.

For F with support (−∞, ∞) Kleiber [303] showed:
(a) If F is regularly varying at −∞ with index α < 0, and if f is monotone on (−∞, c] for some c, then X(j:m) ≤dil X(i:n) implies i ≤ j.
(b) If $\bar{F}$ is regularly varying at ∞ with index α < 0, and if f is monotone on [c, ∞) for some c, then X(j:m) ≤dil X(i:n) implies n − i ≤ m − j.

For F with support [0, ∞) Kleiber [303] also showed:
(c) If F is regularly varying at 0 with index α < 0, and if f is monotone on (0, c] for some c > 0, then X(j:m) ≤Lorenz X(i:n) implies i ≤ j.
(d) If $\bar{F}$ is regularly varying at ∞ with index α < 0, and if f is monotone on [c, ∞) for some c, then X(j:m) ≤Lorenz X(i:n) implies n − i ≤ m − j.

The following example gives necessary and sufficient conditions for the comparison of normal random variables; it is generalized in Example 7.A.13. See related results in Examples 1.A.26 and 4.A.46.

Example 3.A.51. Let X be a normal random variable with mean µX and variance $\sigma_X^2$, and let Y be a normal random variable with mean µY and variance $\sigma_Y^2$. Then X ≤cx Y if, and only if, µX = µY and $\sigma_X^2 \le \sigma_Y^2$.

Analogous to the result in Remark 1.A.18, it can be shown that the set of all distribution functions on R, with any fixed finite mean, is a lattice with respect to the order ≤cx.

Let X and Y be two random variables with densities f and g, respectively. Recall that supp(X) and supp(Y) denote the respective supports.
We say that X is uniformly less variable than Y (denoted as X ≤uv Y ) if supp(X) ⊆ supp(Y ) and the ratio f (x)/g(x) is unimodal over supp(Y ), where the mode is a supremum, but X and Y are not ordered by the usual stochastic order (see deﬁnition in Section 1.A). The relation ≤uv is not a transitive order. It is possible to have X ≤uv Y and Y ≤uv Z but not X ≤uv Z. However, it is useful as a simple condition which implies (3.A.57). The next theorem points out this relationship. The proof of the theorem is easy and is therefore omitted.


Theorem 3.A.52. Let X and Y be two random variables with densities f and g, respectively, such that supp(X) ⊆ supp(Y). Then X ≤uv Y if, and only if,

\[ S^-(g - cf) \le 2 \text{ whenever } c > 0, \text{ and in case of equality the sign sequence is } +, -, +. \tag{3.A.60} \]

From (3.A.60) and (3.A.57) we see that the order ≤uv is a sufficient condition for the order ≤cx provided the underlying random variables have equal means. This is formally stated in the next theorem.

Theorem 3.A.53. Let X and Y be two random variables with absolutely continuous distributions and equal means such that supp(X) ⊆ supp(Y). If X ≤uv Y, then X ≤cx Y.

A relation that is even stronger than ≤uv is defined next. Its usefulness is that it gives a simple sufficient condition for the order ≤uv and therefore for the order ≤cx. Again, let X and Y be two random variables with densities f and g, respectively. We say that X is logconcave relative to Y (denoted by X ≤lc Y) if f/g is logconcave. The relation ≤lc, unlike the relation ≤uv, is transitive, and it implies the relation ≤uv as the next result shows. Again, the proof is trivial and hence it is omitted.

Theorem 3.A.54. Let X and Y be two random variables with densities f and g, respectively, such that supp(X) ⊆ supp(Y) and S⁻(g − f) = 2. Then X ≤lc Y =⇒ X ≤uv Y.

3.A.4 Some properties in reliability theory

Recall from page 1 the definitions of NBUE and NWUE random variables. Such random variables are of interest in reliability theory. The next result shows that NBUE [NWUE] random variables are smaller [larger] than exponential random variables with the same means with respect to the convex order. We denote by Exp(µ) an exponential random variable with mean µ.

Theorem 3.A.55. If X is an NBUE [NWUE] random variable with mean µ, then

\[ X \le_{\mathrm{cx}} [\ge_{\mathrm{cx}}]\ \mathrm{Exp}(\mu), \tag{3.A.61} \]

or, equivalently,

\[ X \ge_{\mathrm{cv}} [\le_{\mathrm{cv}}]\ \mathrm{Exp}(\mu). \tag{3.A.62} \]

The proof consists of showing that if $\bar{F}$ is the survival function of X, then

\[ \int_{x}^{\infty} \bar{F}(u)\, du \le [\ge]\ \mu e^{-x/\mu}, \quad x \ge 0, \]

and the result then follows from (3.A.7). We omit the details.
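The inequality used in the proof of Theorem 3.A.55 can be illustrated with an Erlang distribution with two phases and mean µ, which is NBUE because its mean residual life never exceeds µ. The sketch below is an illustration and not part of the original text; the closed forms for the survival function and for its integrated tail were derived by hand for this particular distribution, and the function names are hypothetical.

```python
import math


def survival(x, mu):
    """Survival function of an Erlang(2) variable with mean mu (rate 2/mu per phase)."""
    r = 2.0 / mu
    return (1 + r * x) * math.exp(-r * x)


def integrated_tail(x, mu):
    """Closed form of the integral of survival(u, mu) over u in [x, infinity)."""
    return (mu + x) * math.exp(-2.0 * x / mu)


def mean_residual_life(x, mu):
    """E[X - x | X > x], computed as integrated tail divided by survival."""
    return integrated_tail(x, mu) / survival(x, mu)


mu = 2.0                                       # illustrative mean
grid = [0.05 * mu * i for i in range(201)]     # x from 0 to 10*mu
tol = 1e-12                                    # slack for floating-point rounding

# NBUE: the mean residual life at every age is at most the mean.
is_nbue_on_grid = all(mean_residual_life(x, mu) <= mu + tol for x in grid)

# Inequality from the proof of Theorem 3.A.55 (NBUE direction).
tail_dominated = all(
    integrated_tail(x, mu) <= mu * math.exp(-x / mu) + tol for x in grid)

print(is_nbue_on_grid, tail_dominated)
```

For this distribution the second check reduces to 1 + x/µ ≤ e^{x/µ}, so it holds for every x ≥ 0 and not only on the grid.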


Random variables that satisfy (3.A.61) are called harmonic new better [worse] than used in expectation (HNBUE [HNWUE]). Sometimes such random variables are defined by X ≤hmrl [≥hmrl] Exp(µ) rather than by (3.A.61), but by (2.B.7) these two definitions are the same.

Recall from page 1 the definition of IMRL and DMRL random variables. The following result characterizes such random variables by means of the dilation order defined in (3.A.32). Other characterizations of DMRL and IMRL random variables, by means of other stochastic orders, can be found in Theorems 2.A.23, 2.B.17, 3.C.13, and 4.A.51.

Theorem 3.A.56. The nonnegative random variable X is DMRL [IMRL] if, and only if, [X | X > t] ≥dil [≤dil] [X | X > t′] whenever t ≤ t′.

Two related results are stated next without proofs.

Theorem 3.A.57. Let X and Y be two random variables that have a common support of the form (0, ∞), and that have finite means. If X and/or Y is IMRL, and if X ≤mrl Y, then X ≤dil Y.

Theorem 3.A.58. Let X and Y be two random variables that have a common support of the form (0, ∞), and that have finite means. If X is NBUE and Y is NWUE, then X ≤mrl Y ⇐⇒ X ≤dil Y ⇐⇒ EX ≤ EY.

3.A.5 The m-convex orders

Let S be a subinterval of the real line. The subinterval S may be open, half-open, or closed, finite or infinite. Fix a positive integer m, and consider the class $\mathcal{M}^{S}_{m\text{-cx}}$ of all functions φ : S → R whose mth derivative $\varphi^{(m)}$ exists and satisfies $\varphi^{(m)}(x) \ge 0$, for all x ∈ S, or which are limits of sequences of functions whose mth derivative is continuous and nonnegative on S. Let X and Y be two random variables that take on values in S such that

\[ E[\varphi(X)] \le E[\varphi(Y)] \quad \text{for all functions } \varphi \in \mathcal{M}^{S}_{m\text{-cx}}, \tag{3.A.63} \]

provided the expectations exist. Then X is said to be smaller than Y in the m-convex order (denoted as $X \le^{S}_{m\text{-cx}} Y$). For random variables X and Y that take on values in N++ the definition of the m-cx order is similar; it can be found in Denuit and Lefèvre [146]. In a similar manner one can define the m-concave order and observe that

\[ X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} X \le^{S}_{m\text{-cv}} Y & \text{when } m \text{ is odd}, \\ Y \le^{S}_{m\text{-cv}} X & \text{when } m \text{ is even}. \end{cases} \]

It can be shown that


\[ X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \dots, m-1, \text{ and} \\ E(X - t)_+^{m-1} \le E(Y - t)_+^{m-1} \text{ for all } t \in S, \end{cases} \tag{3.A.64} \]

and also that

\[ X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \dots, m-1, \text{ and} \\ (-1)^m \big[E(t - Y)_+^{m-1} - E(t - X)_+^{m-1}\big] \ge 0 \text{ for all } t \in S. \end{cases} \tag{3.A.65} \]

Note that the order $\le^{S}_{1\text{-cx}}$ is just the order ≤st, and that the order $\le^{S}_{2\text{-cx}}$ is the order ≤cx. Menezes, Geiss, and Tressler [390] gave the following interpretation to the order $\le^{S}_{3\text{-cx}}$: if $X \le^{S}_{3\text{-cx}} Y$, then, of course, X and Y have the same mean and variance, but X then has smaller right-side risk than Y.

Let F and G be the distribution functions of X and Y, respectively. Denote $F^{[0]}(t) = F(t)$, and, for k ≥ 1, denote $F^{[k]}(t) = \int_{-\infty}^{t} F^{[k-1]}(x)\,dx$. Similarly, denote $\bar{F}^{[0]}(t) = \bar{F}(t)$, and, for k ≥ 1, denote $\bar{F}^{[k]}(t) = \int_{t}^{\infty} \bar{F}^{[k-1]}(x)\,dx$. Define $G^{[k]}$ and $\bar{G}^{[k]}$ in a similar manner. In the formulas below FX and FY stand for F and G, respectively. Using the identities

\[ F_Y^{[m-1]}(t) - F_X^{[m-1]}(t) = \frac{E(t - Y)_+^{m-1} - E(t - X)_+^{m-1}}{(m-1)!} \]

and

\[ \bar{F}_Y^{[m-1]}(t) - \bar{F}_X^{[m-1]}(t) = \frac{E(Y - t)_+^{m-1} - E(X - t)_+^{m-1}}{(m-1)!} \]

(which are easily proven by induction and Fubini’s Theorem) we obtain from (3.A.64) and (3.A.65) that

\[ X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \dots, m-1, \text{ and} \\ (-1)^m \big[F_Y^{[m-1]}(t) - F_X^{[m-1]}(t)\big] \ge 0 \text{ for all } t \in \mathbb{R}, \end{cases} \tag{3.A.66} \]

and also that

\[ X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \dots, m-1, \text{ and} \\ \bar{F}_Y^{[m-1]}(t) - \bar{F}_X^{[m-1]}(t) \ge 0 \text{ for all } t \in \mathbb{R}. \end{cases} \tag{3.A.67} \]

Using the identities

\[ m! \int_{t}^{\infty} \big[\bar{F}_Y^{[m-1]}(x) - \bar{F}_X^{[m-1]}(x)\big]\, dx = E(Y - t)_+^{m} - E(X - t)_+^{m} \]

and

\[ m! \int_{-\infty}^{t} \big[F_Y^{[m-1]}(x) - F_X^{[m-1]}(x)\big]\, dx = E(t - Y)_+^{m} - E(t - X)_+^{m}, \]

we obtain from (3.A.66) and (3.A.67) that

\[ X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \dots, m-1, \text{ and} \\ E(Y - t)_+^{m} - E(X - t)_+^{m} \text{ is decreasing in } t \in \mathbb{R}, \end{cases} \]

and also that

\[ X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} EX^k = EY^k,\ k = 1, 2, \dots, m-1, \text{ and} \\ (-1)^m \big[E(t - Y)_+^{m} - E(t - X)_+^{m}\big] \text{ is increasing in } t \in \mathbb{R}. \end{cases} \]

Fishburn [203] has reported some attempts at obtaining an analog of Theorem 3.A.4 for the 3-cx order.

From (3.A.63) it is seen that if $X \le^{S}_{m\text{-cx}} Y$, then

\[ EX^k \le EY^k \quad \text{for } k \ge m \text{ such that } k - m \text{ is even}. \]

If, moreover, X and Y are nonnegative, then

\[ EX^k \le EY^k \quad \text{for } k \ge m. \]

Motivated by Theorem 3.A.42 (see also Theorems 1.A.8, 4.A.48, 4.A.69, 5.A.15, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16) we have the following result.

Theorem 3.A.59. Let X and Y be two random variables that take on values in S. If $X \le^{S}_{m\text{-cx}} Y$, and if $E[X^m] = E[Y^m]$, then X =st Y.

Theorem 3.A.59 can be strengthened to the following result in a way that is analogous to the way in which Theorem 3.A.43 strengthened Theorem 3.A.42; we do not detail the proof here.

Theorem 3.A.60. Let X and Y be two random variables that take on values in S. If $X \le^{S}_{m\text{-cx}} Y$ and if E[φ(X)] = E[φ(Y)] for some $\varphi \in \mathcal{M}^{S}_{m\text{-cx}}$ which satisfies $\varphi^{(m)}(x) > 0$ for all x ∈ S, then X =st Y.

Note that Theorems 1.A.8, 3.A.43, and 3.A.59 are all special cases of Theorem 3.A.60.

A generalization of (3.A.12) is given in the next theorem. The notations lX, uX, lY, and uY are described before (3.A.12).

Theorem 3.A.61. Let X and Y be two random variables that take on values in S. If $X \le^{S}_{m\text{-cx}} Y$, then uX ≤ uY. Also, if m is even, then lX ≥ lY, and if m is odd, then lX ≤ lY.

Some closure properties of the order $\le^{S}_{m\text{-cx}}$ are given in the next theorem.

Theorem 3.A.62. (a) Let X and Y be two random variables that take on values in S. Then

\[ X \le^{S}_{m\text{-cx}} Y \iff \begin{cases} -X \le^{-S}_{m\text{-cx}} -Y & \text{when } m \text{ is even}, \\ -Y \le^{-S}_{m\text{-cx}} -X & \text{when } m \text{ is odd}. \end{cases} \]


(b) Let X, Y, and Θ be random variables such that $[X \mid \Theta = \theta] \le^{S}_{m\text{-cx}} [Y \mid \Theta = \theta]$ for all θ in the support of Θ. Then $X \le^{S}_{m\text{-cx}} Y$. That is, the m-convex order is closed under mixtures.
(c) If $X \le^{S}_{m\text{-cx}} Y$, then $cX \le^{cS}_{m\text{-cx}} cY$ whenever c > 0, where cS = {x ∈ R : x/c ∈ S}.
(d) If $X \le^{S}_{m\text{-cx}} Y$, then $cX \le^{cS}_{m\text{-cx}} cY$ whenever c < 0 and m is even, and $cY \le^{cS}_{m\text{-cx}} cX$ whenever c < 0 and m is odd.
(e) If $X \le^{S}_{m\text{-cx}} Y$, then $X + d \le^{S+d}_{m\text{-cx}} Y + d$ for all d ∈ R, where S + d = {x ∈ R : x − d ∈ S}; that is, the m-convex order is shift-invariant.
(f) Let {Xj, j = 1, 2, . . . } and {Yj, j = 1, 2, . . . } be two sequences of random variables that take on values in S, such that Xj →st X and Yj →st Y as j → ∞. Assume that $E(X)_+^{m-1}$ and $E(Y)_+^{m-1}$ are finite and that $E(X_j)_+^{m-1} \to E(X)_+^{m-1}$ and that $E(Y_j)_+^{m-1} \to E(Y)_+^{m-1}$ as j → ∞. If $X_i \le^{S}_{m\text{-cx}} Y_i$ for all integers i, then $X \le^{S}_{m\text{-cx}} Y$. That is, the m-convex order is closed under limits.
(g) Let X1, X2, . . . , Xn be a set of independent random variables and let Y1, Y2, . . . , Yn be another set of independent random variables, all taking on values in S. If $X_i \le^{S}_{m\text{-cx}} Y_i$ for i = 1, 2, . . . , n, then

\[ \sum_{j=1}^{n} X_j \le^{R}_{m\text{-cx}} \sum_{j=1}^{n} Y_j, \]

where R denotes the union of the supports of the distribution functions of the two sums. That is, the m-convex order is closed under convolutions.
(h) Let X1, X2, . . . be a set of independent random variables and let Y1, Y2, . . . be another set of independent random variables, all taking on values in S. If $X_i \le^{S}_{m\text{-cx}} Y_i$ for i = 1, 2, . . ., then, for any positive integer-valued random variable N which is independent of the Xi’s and of the Yj’s, one has

\[ \sum_{j=1}^{N} X_j \le^{\tilde{R}}_{m\text{-cx}} \sum_{j=1}^{N} Y_j, \]

where $\tilde{R}$ denotes the union of the supports of the distribution functions of the two compound sums.

Theorem 3.A.62(h) can be extended as follows.

Theorem 3.A.63. Let X1, X2, . . . be a set of independent random variables and let Y1, Y2, . . . be another set of independent random variables, all taking on values in S. Let N1 be an integer-valued random variable that is independent of the Xi’s, and let N2 be an integer-valued random variable that is independent of the Yi’s, both taking on values in Q. If $X_i \le^{S}_{m\text{-cx}} Y_i$ for i = 1, 2, . . ., and if $N_1 \le^{Q}_{m\text{-cx}} N_2$, then

\[ \sum_{j=1}^{N_1} X_j \le^{\tilde{R}}_{m\text{-cx}} \sum_{j=1}^{N_2} Y_j, \]


where $\tilde{R}$ denotes the union of the supports of the distribution functions of the two compound sums.

Theorem 3.A.19 can be extended as follows.

Theorem 3.A.64. Let X1 and X2 (Y1 and Y2) be two independent copies of X (Y). If $X \le^{S}_{2m\text{-cx}} Y$, then $X_1 - X_2 \le^{R}_{2m\text{-cx}} Y_1 - Y_2$, where R denotes the union of the supports of the distribution functions of the two differences.

The proof of Theorem 3.A.64 is similar to the proof of Theorem 3.A.19 (using Theorem 3.A.62(a) and (g)).

Recall from (1.A.20) that for nonnegative random variables X and Y with finite means, we denote by AX and AY the corresponding asymptotic equilibrium ages.

Theorem 3.A.65. Let X and Y be two nonnegative random variables. Then, for m ≥ 2 we have

\[ X \le^{[0,\infty)}_{m\text{-cx}} Y \iff A_X \le^{[0,\infty)}_{(m-1)\text{-cx}} A_Y. \]

In particular, X ≤cx Y ⇐⇒ AX ≤st AY.

We now describe a generalization of Theorem 3.A.44. Let Bm(S; µ1, µ2, . . . , µm−1) denote the class of all the random variables X whose distribution functions have support in S and which have the first m − 1 moments $EX^k = \mu_k$, k = 1, 2, . . . , m − 1.

Theorem 3.A.66. Let X and Y be two random variables in Bm(S; µ1, µ2, . . . , µm−1) with distribution functions F and G, respectively, and with density functions f and g, respectively.
(a) If S⁻(F − G) ≤ m − 1 and if the last sign of F − G is a +, then $X \le^{S}_{m\text{-cx}} Y$.
(b) If S⁻(f − g) ≤ m and if the last sign of g − f is a +, then $X \le^{S}_{m\text{-cx}} Y$.

The following example describes typical applications of Theorem 3.A.66.

Example 3.A.67. Let X have the Gamma density given by

\[ f_X(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}, \quad x > 0, \]

where α > 0 and β > 0 are constants, and let Y have the inverse Gaussian density given by

\[ f_Y(x) = \frac{\alpha x^{-3/2}}{\sqrt{2\pi\beta}} \exp\Big\{-\frac{(\alpha - \beta x)^2}{2\beta x}\Big\}, \quad x > 0, \]

where also here α > 0 and β > 0 are constants. Note that X and Y have the same mean α/β and the same second moment α(α + 1)/β². We claim that

\[ X \le^{[0,\infty)}_{3\text{-cx}} Y. \]

In order to see it, first note that without loss of generality we can take the means to be equal to 1, that is, β = α. Now, a straightforward computation yields

\[ \log \frac{f_X(x)}{f_Y(x)} = C + \Big(\alpha + \frac{1}{2}\Big) \log x + \frac{\alpha}{2x} - \frac{\alpha x}{2}, \quad x > 0, \]

where C is some constant. The first derivative of the above expression is a quadratic form in 1/x, which cannot have more than two zeroes, so the expression itself has no more than three sign changes. In addition, the above expression tends to −∞ as x → ∞. The stated result now follows from Theorem 3.A.66(b).

Let Z have the lognormal density given by

\[ f_Z(x) = \frac{1}{x \tau \sqrt{2\pi}} \exp\Big\{-\frac{(\log x - \nu)^2}{2\tau^2}\Big\}, \quad x > 0, \]

where τ > 0 and ν > 0 are constants. With the choice $\tau^2 = \log\big(1 + \frac{1}{\alpha}\big)$ and $\nu = \frac{1}{2} \log \frac{\alpha^3}{(\alpha+1)\beta^2}$ we have that X and Z have the same mean α/β and the same second moment α(α + 1)/β². We now claim that $X \le^{[0,\infty)}_{3\text{-cx}} Z$. In order to see it, again note that without loss of generality we can take the means to be equal to 1, that is, β = α. Now, a straightforward computation yields

\[ \log \frac{f_X(x)}{f_Z(x)} = C + \Big(\alpha + \frac{1}{2}\Big) \log x - \alpha x + \frac{\log^2 x}{2\tau^2}, \quad x > 0, \]

where C is some constant. Substituting u = log x, the above expression is seen to be the difference of a quadratic form in u and an exponential function, which cannot have more than three sign changes. In addition, the above expression tends to −∞ as x → ∞. The stated result again follows from Theorem 3.A.66(b).

Theorem 3.A.24 can be viewed as a result that gives the “minimal” and the “maximal” random variables with respect to the order $\le^{S}_{2\text{-cx}}$ when the (bounded) support and the mean are given. The following theorem gives the extrema with respect to the order $\le^{S}_{3\text{-cx}}$ when the first two moments are given. Here we take S = [a, b] for some finite a and b.

Theorem 3.A.68. Let X ∈ B3([a, b]; µ1, µ2). Consider the random variables $X^{(3)}_{\min}$ and $X^{(3)}_{\max}$ in B3([a, b]; µ1, µ2) defined by

\[ X^{(3)}_{\min} = \begin{cases} a & \text{with probability } \dfrac{\mu_2 - \mu_1^2}{(a - \mu_1)^2 + \mu_2 - \mu_1^2}, \\[2ex] \mu_1 + \dfrac{\mu_2 - \mu_1^2}{\mu_1 - a} & \text{with probability } \dfrac{(a - \mu_1)^2}{(a - \mu_1)^2 + \mu_2 - \mu_1^2}, \end{cases} \]

and

\[ X^{(3)}_{\max} = \begin{cases} \mu_1 - \dfrac{\mu_2 - \mu_1^2}{b - \mu_1} & \text{with probability } \dfrac{(b - \mu_1)^2}{(b - \mu_1)^2 + \mu_2 - \mu_1^2}, \\[2ex] b & \text{with probability } \dfrac{\mu_2 - \mu_1^2}{(b - \mu_1)^2 + \mu_2 - \mu_1^2}. \end{cases} \]

Then Xmin ≤3-cx X ≤3-cx Xmax . An eﬀective method for deriving the support points and the associated probabilities of the stochastic extrema in general (that is, for m’s other than 3) will be described next. For the purpose of somewhat simplifying the expressions below we take a = 0. Thus we describe how to obtain (m) (m) the support points and the associated probabilities of Xmin and Xmax in Bm ([0, b]; µ1 , µ2 , . . . , µm−1 ). (2k) If m is even, m = 2k, say, then the support of Xmin in B2k ([0, b]; µ1 , µ2 , . . . , µ2k−1 ) consists of k interior points x1 , x2 , . . . , xk , 0 < x1 < x2 < · · · < xk < b, which are the k distinct roots of the equation (denoting µ0 = 1) 1 x x2 · · · xk µ0 µ1 µ2 · · · µk µ1 µ2 µ3 · · · µk+1 = 0; .. .. .. . . .. . . . . . µk−1 µk µk+1 · · · µ2k−1 the corresponding probabilities p1 , p2 , . . . , pk are now found by solving p1 xj1 + p2 xj2 + · · · + pk xjk = µj ,

j = 0, 1, . . . , k − 1.

(2k)

The support of Xmax in B2k ([0, b]; µ1 , µ2 , . . . , µ2k−1 ) consists of the points 0, b, and k − 1 interior points x2 , x3 , . . . , xk , 0 < x2 < x3 < · · · < xk < b, which are the k − 1 distinct roots of the equation 1 x x2 ··· xk−1 µ2 − bµ1 µ3 − bµ2 µ − bµ · · · µ − bµ 4 3 k+1 k µ3 − bµ2 µ4 − bµ3 µ − bµ · · · µ − bµ 5 4 k+2 k+1 = 0; .. .. .. . . .. .. . . . µk − bµk−1 µk+1 − bµk µk+2 − bµk+1 · · · µ2k−1 − bµ2k−2 the corresponding probabilities p1 , p2 , . . . , pk+1 are now found by solving the Vandermonde system p1 + p2 + · · · + pk+1 = 1, p2 xj2 + p3 xj3 + · · · + pk xjk + pk+1 bj = µj , j = 1, 2, . . . , k. (2k+1)

When m is odd, m = 2k + 1, say, then the support of $X_{\min}^{(2k+1)}$ in $B_{2k+1}([0,b];\mu_1,\mu_2,\ldots,\mu_{2k})$ consists of 0 and k interior points $x_2, x_3, \ldots, x_{k+1}$, $0 < x_2 < x_3 < \cdots < x_{k+1} < b$, which are the k distinct roots of the equation

$$\begin{vmatrix} 1 & x & x^2 & \cdots & x^k \\ \mu_1 & \mu_2 & \mu_3 & \cdots & \mu_{k+1} \\ \mu_2 & \mu_3 & \mu_4 & \cdots & \mu_{k+2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mu_k & \mu_{k+1} & \mu_{k+2} & \cdots & \mu_{2k} \end{vmatrix} = 0;$$

the corresponding probabilities $p_1, p_2, \ldots, p_{k+1}$ are now found by solving

$$p_1 + p_2 + \cdots + p_{k+1} = 1, \qquad p_2 x_2^j + p_3 x_3^j + \cdots + p_{k+1} x_{k+1}^j = \mu_j, \quad j = 1, 2, \ldots, k.$$

The support of $X_{\max}^{(2k+1)}$ in $B_{2k+1}([0,b];\mu_1,\mu_2,\ldots,\mu_{2k})$ consists of the point b and k interior points $x_1, x_2, \ldots, x_k$, $0 < x_1 < x_2 < \cdots < x_k < b$, which are the k distinct roots of the equation

$$\begin{vmatrix} 1 & x & x^2 & \cdots & x^k \\ \mu_1-b & \mu_2-b\mu_1 & \mu_3-b\mu_2 & \cdots & \mu_{k+1}-b\mu_k \\ \mu_2-b\mu_1 & \mu_3-b\mu_2 & \mu_4-b\mu_3 & \cdots & \mu_{k+2}-b\mu_{k+1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mu_k-b\mu_{k-1} & \mu_{k+1}-b\mu_k & \mu_{k+2}-b\mu_{k+1} & \cdots & \mu_{2k}-b\mu_{2k-1} \end{vmatrix} = 0;$$

the corresponding probabilities $p_1, p_2, \ldots, p_{k+1}$ are now found by solving the Vandermonde system

$$p_1 x_1^j + p_2 x_2^j + \cdots + p_k x_k^j + p_{k+1} b^j = \mu_j, \quad j = 0, 1, \ldots, k.$$

Explicit descriptions for the distribution functions of $X_{\min}^{(m)}$ and $X_{\max}^{(m)}$, for values of m up to 5, are given in Tables 3.A.1 and 3.A.2, where in Table 3.A.2 we use the notation

$$\Delta = \big[(\mu_1-b)(\mu_4-b\mu_3)-(\mu_2-b\mu_1)(\mu_3-b\mu_2)\big]^2 - 4\big[(\mu_1-b)(\mu_3-b\mu_2)-(\mu_2-b\mu_1)^2\big] \times \big[(\mu_2-b\mu_1)(\mu_4-b\mu_3)-(\mu_3-b\mu_2)^2\big].$$

Denuit, De Vylder, and Lefèvre [142] obtained also the extrema with respect to the order $\le^{S}_{m\text{-cx}}$ when not only the first m − 1 moments and the support are given, but also when the density function of X is known to be unimodal with a known mode. Tables that are similar to Tables 3.A.1 and 3.A.2, but when the mode is known, are available in Denuit, Lefèvre, and Shaked [153, 154].
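The recipe for m = 4 can be checked numerically. The sketch below is only an illustration: it takes the first three moments of the uniform distribution on [0, 1] as an assumed input, solves the quadratic obtained by expanding the determinant for k = 2, and confirms that the resulting two-point distribution reproduces the prescribed moments. The function name `xmin4` is hypothetical.

```python
import math

def xmin4(mu1, mu2, mu3):
    """Two-point law of X_min^(4) on [0, b], given the first three moments.

    The support points are the roots of
    (mu2 - mu1^2) x^2 - (mu3 - mu1*mu2) x + (mu1*mu3 - mu2^2) = 0,
    which is the expanded determinant equation for k = 2.
    """
    a = mu2 - mu1 ** 2
    bq = mu3 - mu1 * mu2
    c = mu1 * mu3 - mu2 ** 2
    root = math.sqrt(bq ** 2 - 4 * a * c)
    r_plus = (bq + root) / (2 * a)
    r_minus = (bq - root) / (2 * a)
    # The mass at r_plus follows from matching the mean.
    p_plus = (mu1 - r_minus) / (r_plus - r_minus)
    return [(r_minus, 1 - p_plus), (r_plus, p_plus)]

# Assumed example input: moments of the uniform distribution on [0, 1].
law = xmin4(1 / 2, 1 / 3, 1 / 4)
# Moments of order 0..3 of the two-point law; they should match 1, 1/2, 1/3, 1/4.
moments = [sum(p * x ** j for x, p in law) for j in range(4)]
```

For this input the support points are 1/2 ± √3/6, each with mass 1/2.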

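Table 3.A.1. Probability distribution of $X_{\min}^{(m)} \in B_m([0,b];\mu_1,\mu_2,\ldots,\mu_{m-1})$

| m | Support point | Probability mass |
|---|---|---|
| 1 | $0$ | $1$ |
| 2 | $\mu_1$ | $1$ |
| 3 | $0$ | $\dfrac{\mu_2-\mu_1^2}{\mu_2}$ |
|   | $\dfrac{\mu_2}{\mu_1}$ | $\dfrac{\mu_1^2}{\mu_2}$ |
| 4 | $r_+ = \dfrac{\mu_3-\mu_1\mu_2+\sqrt{(\mu_3-\mu_1\mu_2)^2-4(\mu_2-\mu_1^2)(\mu_1\mu_3-\mu_2^2)}}{2(\mu_2-\mu_1^2)}$ | $\dfrac{\mu_1-r_-}{r_+-r_-}$ |
|   | $r_- = \dfrac{\mu_3-\mu_1\mu_2-\sqrt{(\mu_3-\mu_1\mu_2)^2-4(\mu_2-\mu_1^2)(\mu_1\mu_3-\mu_2^2)}}{2(\mu_2-\mu_1^2)}$ | $1-\dfrac{\mu_1-r_-}{r_+-r_-}$ |
| 5 | $0$ | $1-p_+-p_-$ |
|   | $t_+ = \dfrac{\mu_1\mu_4-\mu_2\mu_3+\sqrt{(\mu_1\mu_4-\mu_2\mu_3)^2-4(\mu_1\mu_3-\mu_2^2)(\mu_2\mu_4-\mu_3^2)}}{2(\mu_1\mu_3-\mu_2^2)}$ | $p_+ = \dfrac{\mu_2-t_-\mu_1}{t_+(t_+-t_-)}$ |
|   | $t_- = \dfrac{\mu_1\mu_4-\mu_2\mu_3-\sqrt{(\mu_1\mu_4-\mu_2\mu_3)^2-4(\mu_1\mu_3-\mu_2^2)(\mu_2\mu_4-\mu_3^2)}}{2(\mu_1\mu_3-\mu_2^2)}$ | $p_- = \dfrac{\mu_2-t_+\mu_1}{t_-(t_--t_+)}$ |

Table 3.A.2. Probability distribution of $X_{\max}^{(m)} \in B_m([0,b];\mu_1,\mu_2,\ldots,\mu_{m-1})$

| m | Support point | Probability mass |
|---|---|---|
| 1 | $b$ | $1$ |
| 2 | $0$ | $\dfrac{b-\mu_1}{b}$ |
|   | $b$ | $\dfrac{\mu_1}{b}$ |
| 3 | $\dfrac{b\mu_1-\mu_2}{b-\mu_1}$ | $\dfrac{(b-\mu_1)^2}{(b-\mu_1)^2+\mu_2-\mu_1^2}$ |
|   | $b$ | $\dfrac{\mu_2-\mu_1^2}{(b-\mu_1)^2+\mu_2-\mu_1^2}$ |
| 4 | $0$ | $1-p_1-p_2$ |
|   | $\dfrac{\mu_3-b\mu_2}{\mu_2-b\mu_1}$ | $p_1 = \dfrac{(\mu_2-b\mu_1)^3}{(\mu_3-b\mu_2)(\mu_3-2b\mu_2+b^2\mu_1)}$ |
|   | $b$ | $p_2 = \dfrac{\mu_1\mu_3-\mu_2^2}{b(\mu_3-2b\mu_2+b^2\mu_1)}$ |
| 5 | $z_+ = \dfrac{(\mu_1-b)(\mu_4-b\mu_3)-(\mu_2-b\mu_1)(\mu_3-b\mu_2)+\sqrt{\Delta}}{2\big((\mu_1-b)(\mu_3-b\mu_2)-(\mu_2-b\mu_1)^2\big)}$ | $q_+ = \dfrac{\mu_2-(b+z_-)\mu_1+bz_-}{(z_+-z_-)(z_+-b)}$ |
|   | $z_- = \dfrac{(\mu_1-b)(\mu_4-b\mu_3)-(\mu_2-b\mu_1)(\mu_3-b\mu_2)-\sqrt{\Delta}}{2\big((\mu_1-b)(\mu_3-b\mu_2)-(\mu_2-b\mu_1)^2\big)}$ | $q_- = \dfrac{\mu_2-(b+z_+)\mu_1+bz_+}{(z_--z_+)(z_--b)}$ |
|   | $b$ | $1-q_+-q_-$ |

In Table 3.A.2, $\Delta$ denotes the quantity introduced in the text for that table, namely the discriminant of the quadratic equation whose roots are $z_+$ and $z_-$.

3.B The Dispersive Order

3.B.1 Definition and equivalent conditions

Let X and Y be two random variables with distribution functions F and G, respectively. Let $F^{-1}$ and $G^{-1}$ be the right continuous inverses of F and G, respectively, and assume that

$$F^{-1}(\beta)-F^{-1}(\alpha) \le G^{-1}(\beta)-G^{-1}(\alpha) \quad \text{whenever } 0<\alpha\le\beta<1. \tag{3.B.1}$$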

Then X is said to be smaller than Y in the dispersive order (denoted as $X \le_{\rm disp} Y$). It is conceptually clear that the order $\le_{\rm disp}$ indeed corresponds to a comparison of X and Y by variability because it requires the difference between any two quantiles of X to be smaller than the difference between the corresponding quantiles of Y. It is clear from (3.B.1) that the order $\le_{\rm disp}$ is location-free. That is,

$$X \le_{\rm disp} Y \iff X + c \le_{\rm disp} Y \quad \text{for any real } c. \tag{3.B.2}$$

For a ﬁxed α, one can ﬁnd a c such that the inverse of the distribution of X + c, which is F −1 (·) + c, satisﬁes F −1 (α) + c = G−1 (α) = x0 , say. It follows then from (3.B.2) that F (x − c) ≥ G(x) for all x ≥ x0 . Similarly, it can be seen that F (x − c) ≤ G(x) for all x ≤ x0 . This is true for every α (c and x0 are determined by α). By varying α one can obtain any desired c of the form G−1 (α) − F −1 (α). In fact, it can be shown that X ≤disp Y if, and only if, S − (F (· − c) − G(·)) ≤ 1

for all c, with the sign sequence being −, + in the case of equality.

(3.B.3)

It is not hard to prove that condition (3.B.3) is equivalent to the following condition:

$$G(G^{-1}(\alpha)+c) \le F(F^{-1}(\alpha)+c) \quad \text{for all } \alpha\in(0,1) \text{ and } c>0, \tag{3.B.4}$$

or, equivalently,

$$G(G^{-1}(\alpha)-c) \ge F(F^{-1}(\alpha)-c) \quad \text{for all } \alpha\in(0,1) \text{ and } c>0. \tag{3.B.5}$$

Alternatively, (3.B.4) and (3.B.5) can be written as

$$(X-F^{-1}(\alpha))_+ \le_{\rm st} (Y-G^{-1}(\alpha))_+, \quad \alpha\in(0,1). \tag{3.B.6}$$

From (3.B.1) it is clear that $X \le_{\rm disp} Y$ if, and only if,

$$G^{-1}(\alpha)-F^{-1}(\alpha) \quad \text{increases in } \alpha\in(0,1), \tag{3.B.7}$$

or, equivalently, if, and only if,

$$\bar G^{-1}(\alpha)-\bar F^{-1}(\alpha) \quad \text{decreases in } \alpha\in(0,1), \tag{3.B.8}$$

where $\bar F \equiv 1-F$ and $\bar G \equiv 1-G$ are the survival functions associated with X and Y, respectively. Let $R \equiv -\log \bar F$ and $Q \equiv -\log \bar G$ denote the cumulative hazard functions of X and Y, respectively. Note that $R^{-1}(z) = \bar F^{-1}(e^{-z})$ and $Q^{-1}(z) = \bar G^{-1}(e^{-z})$. Thus from (3.B.8) we obtain that $X \le_{\rm disp} Y$ if, and only if,

$$Q^{-1}(z)-R^{-1}(z) \quad \text{increases in } z \ge 0. \tag{3.B.9}$$
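The quantile criterion (3.B.7) lends itself to a quick numerical check. The following is a minimal sketch, assuming two exponential distributions (rates 2 and 1) as the example and a finite grid in place of all of (0, 1), so it can refute but never fully prove the ordering. The helper names are hypothetical.

```python
import math

def exp_quantile(rate):
    """Quantile function of the exponential distribution with the given rate."""
    return lambda alpha: -math.log(1.0 - alpha) / rate

def disp_leq_on_grid(qf_x, qf_y, grid, tol=1e-12):
    """Check on a grid that G^{-1}(a) - F^{-1}(a) is nondecreasing, as in (3.B.7)."""
    diffs = [qf_y(a) - qf_x(a) for a in grid]
    return all(d2 >= d1 - tol for d1, d2 in zip(diffs, diffs[1:]))

grid = [k / 1000 for k in range(1, 1000)]
# X ~ Exp(2) has quantiles half as large as those of Y ~ Exp(1).
x_le_y = disp_leq_on_grid(exp_quantile(2.0), exp_quantile(1.0), grid)
# Swapping the roles makes the quantile difference decreasing.
y_le_x = disp_leq_on_grid(exp_quantile(1.0), exp_quantile(2.0), grid)
```

Here `x_le_y` is true and `y_le_x` is false, in line with the fact that the exponential distribution with the larger rate is the less dispersed one.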

Substituting α = F (x) in (3.B.7) we obtain that X ≤disp Y if, and only if, G−1 (F (x)) − x increases in x.

(3.B.10)

When X and Y have densities f and g, respectively, then X ≤disp Y if, and only if, g(G−1 (α)) ≤ f (F −1 (α)) for all α ∈ (0, 1); (3.B.11) this can be obtained at once by diﬀerentiation of (3.B.10) and a simple substitution. When X and Y have hazard rate functions r and q, then (3.B.11) can alternatively be recast as q(G−1 (α)) ≤ r(F −1 (α))

for all α ∈ (0, 1).

(3.B.12)
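Condition (3.B.11) compares the two densities at matching quantiles. As an illustrative sketch, the code below evaluates both sides on a grid for two normal distributions chosen here as an assumed example; it relies on `NormalDist` from the Python standard library, and the helper `density_quantile` is a hypothetical name.

```python
from statistics import NormalDist

def density_quantile(dist, alpha):
    """Density evaluated at the alpha-quantile, i.e. f(F^{-1}(alpha))."""
    return dist.pdf(dist.inv_cdf(alpha))

X = NormalDist(mu=0.0, sigma=1.0)
Y = NormalDist(mu=3.0, sigma=2.0)   # the location plays no role, only the scale
grid = [k / 200 for k in range(1, 200)]

# (3.B.11): g(G^{-1}(a)) <= f(F^{-1}(a)) on the grid suggests X <=_disp Y.
holds = all(density_quantile(Y, a) <= density_quantile(X, a) for a in grid)
# The reverse inequality fails, so Y is not smaller than X in this order.
reverse = all(density_quantile(X, a) <= density_quantile(Y, a) for a in grid)
```

The shift by 3 in the mean of Y does not affect the outcome, which reflects the location-free property (3.B.2).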

The dispersive order can be characterized also by comparing transformations of the random variables X and Y. For example, for continuous random variables X and Y we have that $X \le_{\rm disp} Y$ if, and only if,

$$Y =_{\rm st} \phi(X) \quad \text{for some } \phi \text{ which satisfies } \phi(x')-\phi(x) \ge x'-x \text{ whenever } x \le x'. \tag{3.B.13}$$

In order to prove it just let φ be $G^{-1}F$. When the φ in (3.B.13) is differentiable, the condition on φ there is the same as $\phi' \ge 1$, where $\phi'$ denotes the derivative of φ. An equivalent way of recasting (3.B.13) for continuous random variables X and Y is the following:

$$Y =_{\rm st} X + \psi(X) \quad \text{for some increasing function } \psi. \tag{3.B.14}$$

Condition (3.B.13) can also be rewritten as

$$X =_{\rm st} \varphi(Y) \quad \text{for some increasing } \varphi \text{ which satisfies } \varphi(x')-\varphi(x) \le x'-x \text{ whenever } x \le x'. \tag{3.B.15}$$

In fact, (3.B.15) characterizes X ≤disp Y even if X and Y are not continuous random variables. The next characterization of the dispersive order that we describe is by means of observed total time on test random variables (see Section 1.A.4). Let F be an absolutely continuous distribution function of a nonnegative random variable X, and suppose, for simplicity, that 0 is the left endpoint of the support of F . Let HF−1 be as deﬁned in (1.A.19), and let Xttt have the distribution function HF . Denote by hF the density function associated with HF . Then it is easy to see that

$$h_F\big(H_F^{-1}(u)\big) = \frac{f(F^{-1}(u))}{1-u}, \quad 0 \le u < 1, \tag{3.B.16}$$

where f is the density function associated with F. Similarly, if G is another absolutely continuous distribution function, then the density $h_G$, of the inverse of the TTT transform $H_G$ that is associated with G, satisfies

$$h_G\big(H_G^{-1}(u)\big) = \frac{g(G^{-1}(u))}{1-u}, \quad 0 \le u < 1, \tag{3.B.17}$$

where g is the density function associated with G. Let Y and Yttt have the distribution functions G and HG , respectively. From (3.B.11), (3.B.16), and (3.B.17) we obtain the following result. Theorem 3.B.1. Let X and Y be two nonnegative random variables with absolutely continuous distribution functions having 0 as the left endpoint of their supports. Then X ≤disp Y ⇐⇒ Xttt ≤disp Yttt . See related results in Theorems 1.A.29, 4.A.44, 4.B.8, 4.B.9, and 4.B.29. Next we mention a characterization by means of the so-called Q-addition (quantiles-addition). The random variable Y with distribution function G is said to be the Q-addition of the random variables X and Z, with corresponding distribution functions F and H, if G−1 (α) = F −1 (α) + H −1 (α) for all α ∈ (0, 1). If X and Y have distribution functions F and G, respectively, then by (3.B.1), X ≤disp Y if, and only if, H −1 (α) ≡ G−1 (α) − F −1 (α)

is increasing in α ∈ (0, 1).

That means that $H^{-1}$ is an inverse of a distribution function of a random variable Z, say. Thus we see that $X \le_{\rm disp} Y$ if, and only if, Y is a Q-addition of X and Z for some random variable Z.

Another characterization of the order $\le_{\rm disp}$ is given in the following theorem.

Theorem 3.B.2. Let X and Y be two random variables. Then $X \le_{\rm disp} Y$ if, and only if, for every increasing function φ and increasing concave function h such that φ and ψ(·) ≡ h(φ(·)) are integrable with respect to the distribution of Y, and for every real number c, we have that

$$E\phi(X-c) \ge E\phi(Y) \implies E\psi(X-c) \ge E\psi(Y).$$

It is worthwhile to mention that two twice differentiable functions φ and ψ satisfy ψ(·) ≡ h(φ(·)) for some increasing concave function h if, and only if, $\phi''/\phi' \ge \psi''/\psi'$ (see Pratt [459] or Arrow [22]).

Like the convex order (see Theorem 3.A.7), the dispersive order can be characterized by means of Yaari functionals $V_h$ defined in (3.A.31).


Theorem 3.B.3. Let X and Y be two random variables with the same ﬁnite means. Then X ≤disp Y if, and only if, Vh (X) ≤ Vh (Y )

for every probability transformation function h ≤ 1.

Before leaving this subsection we should mention an alternative way of comparing by dispersion random variables that are symmetric about 0. In such a case one may say (as an alternative to (3.B.1)) that X is less dispersed than Y if F −1 (α)−F −1 (1/2) ≤ [≥] G−1 (α)−G−1 (1/2) whenever α ≥ [≤] 1/2. If X and/or Y are not necessarily symmetric, then one can deﬁne an order that is weaker than ≤disp by requiring F −1 (α) − F −1 (1 − α) ≤ G−1 (α) − G−1 (1 − α),

α ∈ [1/2, 1];

see Townsend and Colonius [552]. If X and Y are positive random variables, then, as an alternative to (3.B.1), one can say that X is less dispersed than Y if log X ≤disp log Y . The latter condition is equivalent to log X ≤∗ log Y , where the order ≤∗ is deﬁned in Section 4.B (see Theorem 4.B.1). 3.B.2 Properties The dispersive order satisﬁes some desirable closure properties but does not satisfy some other desirable properties. For example, it is very easy to verify the following result (compare it to Theorem 3.A.18). Theorem 3.B.4. Let X be a random variable. Then X ≤disp aX

whenever a ≥ 1.

Theorem 3.B.4 can be generalized as follows. For two functions φ and ψ let us denote φ ≤disp ψ if φ(y) − φ(x) ≤ ψ(y) − ψ(x)

whenever x ≤ y.

(3.B.18)

Note that if φ and ψ are differentiable then φ ≤disp ψ if, and only if, φ′ ≤ ψ′, where φ′ and ψ′ are the derivatives of φ and ψ, respectively. Now let X be a random variable. Write ψ(X) = φ(X) + (ψ(X) − φ(X)). From (3.B.14) we obtain the following result. Theorem 3.B.5. Let X be a random variable. Then φ(X) ≤disp ψ(X)

whenever φ ≤disp ψ.

Another simple desirable property that is easily veriﬁed is the following theorem. Theorem 3.B.6. Let X and Y be two random variables. Then X ≤disp Y ⇐⇒ −X ≤disp −Y.

(3.B.19)


However, the dispersive order is not closed under convolutions. In fact, it is not even true in general that for any two independent random variables X and Y we have that X ≤disp X + Y . This observation follows from the next theorem, the proof of which is omitted. Theorem 3.B.7. The random variable X satisﬁes X ≤disp X + Y

for any random variable Y independent of X

if, and only if, X has a logconcave density.

A random variable Z is said to be dispersive if $X + Z \le_{\rm disp} Y + Z$ whenever $X \le_{\rm disp} Y$ and Z is independent of X and Y. From Theorem 3.B.7 it follows that every dispersive random variable must be strongly unimodal (that is, have a logconcave density). It turns out that strong unimodality is also a sufficient condition for dispersivity, as the next result shows. Again the proof is omitted.

Theorem 3.B.8. The random variable X is dispersive if, and only if, X has a logconcave density.

Other characterizations of random variables with logconcave densities are given in Theorem 1.C.52. From Theorem 3.B.8 we obtain, by iteration, the following result.

Theorem 3.B.9. Let $X_1, X_2, \ldots, X_n$ be a set of independent random variables, and let $Y_1, Y_2, \ldots, Y_n$ be another set of independent random variables. If the $X_i$'s and the $Y_i$'s have logconcave densities, and if $X_i \le_{\rm disp} Y_i$, i = 1, 2, ..., n, then

$$\sum_{i=1}^{n} X_i \le_{\rm disp} \sum_{i=1}^{n} Y_i.$$

The dispersive order is closed under increasing convex and decreasing concave transformations when the underlying random variables are ordered in the usual stochastic order. We have the following result. Theorem 3.B.10. Let X and Y be two random variables such that X ≤st Y . (a) If X ≤disp Y , then φ(X) ≤disp φ(Y )

for all increasing convex and all decreasing concave functions φ. (3.B.20)

(b) If X ≥disp Y , then φ(X) ≥disp φ(Y )

for all decreasing convex and all increasing concave functions φ. (3.B.21)

Proof. First we prove (3.B.20) when φ is increasing and convex. Let F and G denote the distribution functions of X and Y, respectively, and let $F^{-1}$ and $G^{-1}$ be the respective inverses. For simplicity suppose that F, G, and φ are differentiable with derivatives f, g, and φ′, respectively. The condition $X \le_{\rm st} Y$ implies that (see (1.A.12))

$$F^{-1}(\alpha) \le G^{-1}(\alpha) \quad \text{for all } \alpha\in(0,1).$$

Since φ is convex it follows that φ′ is increasing. Therefore

$$\phi'(F^{-1}(\alpha)) \le \phi'(G^{-1}(\alpha)) \quad \text{for all } \alpha\in(0,1). \tag{3.B.22}$$

The condition $X \le_{\rm disp} Y$ implies that (see (3.B.11))

$$g(G^{-1}(\alpha)) \le f(F^{-1}(\alpha)) \quad \text{for all } \alpha\in(0,1). \tag{3.B.23}$$

Since φ is increasing it follows that φ′ ≥ 0. Therefore, combining (3.B.22) and (3.B.23), we see that

$$g(G^{-1}(\alpha))\,\phi'(F^{-1}(\alpha)) \le f(F^{-1}(\alpha))\,\phi'(G^{-1}(\alpha)) \quad \text{for all } \alpha\in(0,1),$$

and, again from (3.B.11), it is seen that the latter inequality is equivalent to $\phi(X) \le_{\rm disp} \phi(Y)$. If φ is decreasing and concave, then −φ is increasing and convex. Therefore, from what we just proved it follows that $-\phi(X) \le_{\rm disp} -\phi(Y)$. From Theorem 3.B.6 we obtain that $\phi(X) \le_{\rm disp} \phi(Y)$. The proof of (3.B.21) is similar.
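The increasing convex case of (3.B.20) can be illustrated with closed-form quantile functions. The sketch below assumes X uniform on (0, 1), Y uniform on (0, 2), and φ = exp, all chosen only for this example; since φ is increasing, the quantile function of φ(X) is φ applied to the quantile function of X. The check uses criterion (3.B.7) on a finite grid.

```python
import math

def nondecreasing(values, tol=1e-12):
    """True if the sequence never drops by more than the tolerance."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))

# Quantile functions: X ~ U(0, 1) and Y ~ U(0, 2), so X <=_st Y and X <=_disp Y.
qx = lambda a: a
qy = lambda a: 2.0 * a
phi = math.exp  # increasing and convex

grid = [k / 500 for k in range(1, 500)]
# Criterion (3.B.7) for the original pair ...
before = nondecreasing([qy(a) - qx(a) for a in grid])
# ... and for the transformed pair phi(X), phi(Y).
after = nondecreasing([phi(qy(a)) - phi(qx(a)) for a in grid])
```

Both flags come out true: the quantile difference is α before the transformation and $e^{2\alpha}-e^{\alpha}$ after it, and both are increasing on (0, 1).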

Theorem 3.B.10 can be generalized in several ways. Here are two generalizations of the increasing convex part of (3.B.20).

Theorem 3.B.11. Let X and Y be two random variables such that $X \le_{\rm st} Y$.
(a) If $X \le_{\rm disp} Y$, then $\phi(X) \le_{\rm disp} \psi(Y)$ whenever $\phi \le_{\rm disp} \psi$ (in the sense of (3.B.18)) and φ or ψ is an increasing convex function.
(b) If $X \le_{\rm disp} Y$, then $\phi(X) \le_{\rm disp} \psi(Y)$ whenever φ and ψ are differentiable and their derivatives, φ′ and ψ′, respectively, satisfy φ′(x) ≤ ψ′(y) for all x ≤ y.

A relation similar to (3.B.20) can be used as a sufficient condition for $X \le_{\rm disp} Y$. The next result states such a condition. Note that in (3.B.24) the directions of the monotonicity in the convex and the concave cases are interchanged.

Theorem 3.B.12. Let X and Y be two random variables such that $X \le_{\rm st} Y$. If

$$\phi(X) \le_{\rm disp} \phi(Y) \quad \text{for some decreasing convex or increasing concave function } \phi, \tag{3.B.24}$$

then $X \le_{\rm disp} Y$.

The proof of Theorem 3.B.12 uses Theorem 3.B.10. If φ in (3.B.24) is increasing and concave, then φ−1 is increasing and convex. Since X ≤st Y it follows that φ(X) ≤st φ(Y ). Therefore, by Theorem 3.B.10, X = φ−1 (φ(X)) ≤disp φ−1 (φ(Y )) = Y . The proof for a decreasing and convex φ is similar. For random variables with equal left-end support points the assumption in Theorems 3.B.10 and 3.B.11 of the comparison of X and Y in the usual stochastic order need not be stated. This is because of the following observation. Here, for random variables X and Y , we denote the corresponding endpoints of their supports by lX , uX , lY , and uY as deﬁned before (3.A.12). Theorem 3.B.13. (a) If X and Y are random variables such that lX = lY > −∞, then X ≤disp Y =⇒ X ≤st Y. (b) If X and Y are random variables such that uX = uY < ∞, then X ≤disp Y =⇒ X ≥st Y. For example, if X and Y are nonnegative random variables such that lX = lY = 0, then Theorem 3.B.13(a) applies. A stronger version of this fact is described in Remark 4.B.35. The proof of Theorem 3.B.13(a) is based on the fact that if F and G are the distribution functions of X and Y , respectively, then F −1 (0) = lX = lY = G−1 (0). Therefore, from (3.B.1) one obtains that F −1 (β) ≤ G−1 (β) for all β ∈ (0, 1), that is, X ≤st Y by (1.A.12). The proof of Theorem 3.B.13(b) is similar. The following result can be shown using the same kind of argument. Theorem 3.B.14. If X and Y are random variables having the same ﬁnite support and satisfying X ≤disp Y , then they must have the same distribution. The next result is an analog of (3.A.12). We omit the proof. Theorem 3.B.15. Let X and Y be random variables whose supports are intervals. Then X ≤disp Y =⇒ µ{supp(X)} ≤ µ{supp(Y )}, where µ denotes the Lebesgue measure. Suppose that X and Y are two random variables with distributions F and G, respectively, such that X ≤disp Y . Then by taking c = 0 in (3.B.3) we see that (3.A.59) holds for the random variables X − EX and Y − EY . 
We thus have proved the following implication. Theorem 3.B.16. Let X and Y be two random variables with ﬁnite means. Then X ≤disp Y =⇒ X ≤dil Y.


A more reﬁned result can be obtained by combining (3.C.7) and (3.C.9) in Section 3.C below. From Theorem 3.B.16, (3.A.32), and (3.A.4) it follows that if X ≤disp Y , then Var(X) ≤ Var(Y ), (3.B.25) whenever Var(Y ) < ∞. From Theorem 3.B.7 it follows that

X ≤conv Y, and X has a logconcave density =⇒ X ≤disp Y.

(3.B.26)

In contrast to (3.B.19), if X ≤disp Y, it does not necessarily follow that X ≤disp −Y. In order to see it, let X be an exponential random variable with mean 1. Clearly X ≤disp X (in fact, this is the case for any random variable X). The distribution function of X is concave on [0, ∞), and the distribution function of −X is convex on (−∞, 0). Since the order ≤disp is preserved under shifts, it follows that X ≰disp −X.

Using an argument as in the proof of Theorem 3.A.44, we obtain the following sufficient condition for the dispersive order.

Theorem 3.B.17. Let X and Y be random variables with respective densities f and g. If

$$S^-(f(\cdot-c)-g(\cdot)) \le 2 \quad \text{for all } c, \text{ with the sign sequence being } -,+,- \text{ in the case of equality,} \tag{3.B.27}$$

then X ≤disp Y.

Another sufficient condition for X ≤disp Y is given next.

Theorem 3.B.18. Let X and Y be two absolutely continuous random variables with hazard rate functions (see (1.B.1)) r and q, respectively. If r(u) ≥ q(u + x)

for all u and x ≥ 0,

(3.B.28)

then X ≤disp Y . Proof. Let F and G denote the distribution functions of X and Y , respectively. Condition (3.B.28) implies that r(u) ≥ q(u); that is, X ≤hr Y . This, in turn, implies X ≤st Y , which, in turn, implies (1.A.12). Now, (3.B.28) therefore gives r(F −1 (α)) ≥ q(G−1 (α)) for all α ∈ (0, 1), which is equivalent to X ≤disp Y by (3.B.12).

Let X be a random variable and denote by $X_{(-\infty,a]}$ the truncation of X at a as defined in Section 1.A.4. One would expect $X_{(-\infty,a]}$ to increase in a in the sense of the dispersion order. This is not always the case, but it is the case if the distribution function F of X is logconcave; that is, if X has decreasing reverse hazard (see Section 1.B.6). This is shown in the next result, which is an analog of Theorem 1.A.15 for the dispersion order. The proof of the first part of the theorem consists of verifying that for α ≤ β the quantity $F^{-1}(\beta F(a)) - F^{-1}(\alpha F(a))$ increases in a when F is logconcave. The other parts of the theorem are proven similarly. The notation $X_{(a,b)}$ for a < b is self-explanatory.

Theorem 3.B.19. Let X be a random variable with distribution function F, survival function $\bar F$, and density f.
(a) If F is logconcave, then $X_{(-\infty,a]}$ increases in a in the sense of the dispersion order.
(b) If $\bar F$ is logconcave (that is, if X is IFR), then $X_{(a,\infty)}$ decreases in a in the sense of the dispersion order.
(c) If f is logconcave, then $X_{(a,b)}$ decreases in a (< b) and increases in b (> a) in the sense of the dispersion order.

Recall from page 1 the definitions of the IFR, DFR, NBU, NWU, DMRL and IMRL properties. The following theorems list some relations between the dispersion order and some other orders. The proofs are mostly straightforward and are not detailed here.

Theorem 3.B.20. Let X and Y be two nonnegative random variables.
(a) If $X \le_{\rm hr} Y$ and X or Y is DFR, then $X \le_{\rm disp} Y$.
(b) If $X \le_{\rm disp} Y$ and X or Y is IFR, then $X \le_{\rm hr} Y$.
(c) If X is NBU and Y is NWU, then $X \le_{\rm disp} Y \iff X \le_{\rm hr} Y$.

A version of parts (a) and (b) of Theorem 3.B.20, where $\le_{\rm hr}$ is replaced by $\le_{\rm rh}$, and DFR and IFR are replaced by monotonicity conditions on the reversed hazard rate function, can be found in Bartoszewicz [44].

Recall from (1.A.20) that for nonnegative random variables X and Y with finite means, we denote by $A_X$ and $A_Y$ the corresponding asymptotic equilibrium ages. The following result may be contrasted with Theorem 2.A.4.

Theorem 3.B.21. Let X and Y be two nonnegative random variables.
(a) If $X \le_{\rm mrl} Y$ and X or Y is IMRL, then $A_X \le_{\rm disp} A_Y$.
(b) If $A_X \le_{\rm disp} A_Y$ and X or Y is DMRL, then $X \le_{\rm mrl} Y$.
(c) If $X \le_{\rm disp} Y$ and X is DMRL and Y is IMRL, then $X \le_{\rm mrl} Y$.

Example 3.B.22. Let $X_1, X_2, \ldots, X_n$ be independent DFR random variables, and let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ be the corresponding order statistics. Then $X_{(1)}$ is DFR (since its hazard rate function is the sum of the hazard rate functions of the $X_i$'s). From Theorem 1.B.26 we see that $X_{(1)} \le_{\rm hr} X_{(i)}$, i = 2, 3, ..., n. Therefore, by Theorem 3.B.20(a), we have that

$$X_{(1)} \le_{\rm disp} X_{(i)}, \quad i = 2, 3, \ldots, n.$$
3.B The Dispersive Order

157

Example 3.B.23. Let X1 , X2 , . . . , Xm , Xm+1 be independent and identically distributed DFR random variables and let the corresponding spacings be denoted by U(i:m) as in Theorem 1.B.31. It is easy to see then that the spacings are DFR random variables (see Barlow and Proschan [35]). Then, from Theorems 1.B.31 and 3.B.20(a) we get (m − i + 1)U(i:m) ≤disp (m − i)U(i+1:m) , i = 2, 3, . . . , m − 1, (m − i + 2)U(i:m+1) ≤disp (m − i + 1)U(i:m) , i = 2, 3, . . . , m,

(3.B.29) (3.B.30)

and U(i:m) ≤disp U(i+1:m+1) ,

i = 2, 3, . . . , m.

(3.B.31)

Note that (3.B.29)–(3.B.31) can be summarized as (m − j + 1)U(j:m) ≤disp (n − i + 1)U(i:n)

whenever i − j ≥ max{0, n − m}.

The dispersive order can be used to characterize IFR and DFR random variables as the following result shows.

Theorem 3.B.24. Let X be a nonnegative random variable. Then X is IFR [DFR] if, and only if, [X − t | X > t] ≥disp [≤disp] [X − t′ | X > t′] whenever t ≤ t′.

Proof. If X is IFR, then, by Theorem 3.B.19(b), [X | X > t] is decreasing in t in the sense of the dispersive order. Since the dispersive order is preserved under shifts, it is seen that [X − t | X > t] is decreasing in t in the sense of the dispersive order. The proof of the DFR case is similar, though one first needs to prove a DFR version of Theorem 3.B.19(b). The converses of the above statements are consequences of Theorems 1.A.30(a) and 3.B.13(a).

Under some regularity conditions on the distribution function of X and on its support, but without the assumption of nonnegativity of X, we have a related characterization of the IFR and the DFR properties. We do not give the proof of this result here.

Theorem 3.B.25. Let X be a random variable with a continuous distribution function, and with support of the form (a, ∞), where a ≥ −∞ [respectively, a > −∞]. Then X is IFR [DFR] if, and only if, X ≥disp [≤disp] [X − t | X > t] for all t > a.

The next result states a preservation property of the order ≤disp which is useful in reliability theory as well as in nonparametric statistics. Let X and Y be two random variables. Let $X_{(1:n)} \le X_{(2:n)} \le \cdots \le X_{(n:n)}$ denote the order statistics from a sample $X_1, X_2, \ldots, X_n$ of independent and identically distributed random variables that have the same distribution as X. Similarly, let $Y_{(1:n)} \le Y_{(2:n)} \le \cdots \le Y_{(n:n)}$ denote the order statistics from another sample $Y_1, Y_2, \ldots, Y_n$ of independent and identically distributed random variables that have the same distribution as Y.

Theorem 3.B.26. Let X and Y be two random variables. If $X \le_{\rm disp} Y$, then $X_{(j:n)} \le_{\rm disp} Y_{(j:n)}$ for j = 1, 2, ..., n.

The proof follows at once from (3.B.10) and the fact that

$$G_{j:n}^{-1}F_{j:n} = G^{-1}F \quad \text{for } j = 1, 2, \ldots, n,$$

where F, $F_{j:n}$, G, and $G_{j:n}$ are the distribution functions of X, $X_{(j:n)}$, Y, and $Y_{(j:n)}$, respectively.

For the next result about comparison of order statistics we will need the following lemma.

Lemma 3.B.27. Let $E_{(j:m)}$ and $E_{(i:n)}$ denote the jth and the ith order statistics of samples from the exponential distribution with rate λ > 0 of sizes m and n, respectively. Then

$$E_{(j:m)} \le_{\rm disp} E_{(i:n)} \quad \text{whenever } i-j \ge \max\{0, n-m\}.$$

Proof. Write $E_{(j:m)} =_{\rm st} \sum_{k=1}^{j} E_{m-j+k}$, where $E_{m-j+k}$ is an exponential random variable with rate (m − j + k)λ, k = 1, 2, ..., j, and the $E_{m-j+k}$'s are independent. Similarly, write $E_{(i:n)} =_{\rm st} \sum_{k=1}^{i} E_{n-i+k}$, where $E_{n-i+k}$ is an exponential random variable with rate (n − i + k)λ, k = 1, 2, ..., i, and the $E_{n-i+k}$'s are independent. It is easy to check, for instance using Theorem 3.B.4, that $E_{m-j+k} \le_{\rm disp} E_{n-i+k}$ because m − j ≥ n − i. Since exponential random variables have logconcave densities, we obtain from Theorems 3.B.9 and 3.B.7, respectively, that

$$E_{(j:m)} =_{\rm st} \sum_{k=1}^{j} E_{m-j+k} \le_{\rm disp} \sum_{k=1}^{j} E_{n-i+k} \le_{\rm disp} \sum_{k=1}^{i} E_{n-i+k} =_{\rm st} E_{(i:n)}$$

because j ≤ i.
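A simple consequence of Lemma 3.B.27 can be verified exactly. By (3.B.25) the dispersive order implies the ordering of variances, and the sum representation used in the proof gives the variance of an exponential order statistic in closed form. The sketch below checks this necessary condition (not the full ordering) for all sample sizes up to 7, a range chosen arbitrarily for illustration; the function name is hypothetical.

```python
def var_exp_order_stat(j, m, rate=1.0):
    """Variance of the j-th order statistic of m i.i.d. exponentials.

    Uses the representation as a sum of independent exponentials with
    rates (m - j + k) * rate, k = 1, ..., j, whose variances add up.
    """
    return sum(1.0 / ((m - j + k) * rate) ** 2 for k in range(1, j + 1))

# Collect every index combination allowed by the lemma whose variances
# are ordered the wrong way; the list is expected to stay empty.
violations = [
    (j, m, i, n)
    for m in range(1, 8) for j in range(1, m + 1)
    for n in range(1, 8) for i in range(1, n + 1)
    if i - j >= max(0, n - m)
    and var_exp_order_stat(j, m) > var_exp_order_stat(i, n) + 1e-12
]
```

An empty `violations` list is consistent with the lemma, though it does not prove it, since equal or ordered variances are only a necessary condition for the dispersive order.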

Theorem 3.B.28. Let $X_{(j:m)}$ and $X_{(i:n)}$ denote the jth and the ith order statistics of samples from a DFR distribution F of sizes m and n, respectively. Then

$$X_{(j:m)} \le_{\rm disp} X_{(i:n)} \quad \text{whenever } i-j \ge \max\{0, n-m\}.$$

Proof. The distribution $F_{j:m}$ of $X_{(j:m)}$ can be expressed as $F_{j:m} = B_{j:m}F$, where $B_{j:m}$ is the beta distribution with parameters j and m − j + 1. Similarly, the distribution $F_{i:n}$ of $X_{(i:n)}$ can be expressed as $F_{i:n} = B_{i:n}F$. Now write

$$F_{j:m} = B_{j:m}GG^{-1}F = H_{j:m}G^{-1}F,$$

where G denotes the distribution function of an exponential random variable with mean 1, and $H_{j:m} = B_{j:m}G$. Note that $H_{j:m}$ is the distribution function of $E_{(j:m)}$ in Lemma 3.B.27. Similarly, write

$$F_{i:n} = H_{i:n}G^{-1}F,$$

and finally notice that

$$F_{i:n}^{-1}F_{j:m} = \psi H_{i:n}^{-1}H_{j:m}\psi^{-1},$$

where $\psi = F^{-1}G$. From Lemma 3.B.27 and (3.B.10) we see that $H_{i:n}^{-1}H_{j:m}(x) - x$ is increasing in x. The function ψ is strictly convex because F is DFR, and it satisfies ψ(0) = 0. Therefore, by a result of Bartoszewicz [40] it follows that $F_{i:n}^{-1}F_{j:m}(x) - x$ is increasing in x. The stated result now follows from (3.B.10).

As a corollary of Theorems 3.B.26 and 3.B.28 we get the following result. Theorem 3.B.29. Let X(j:m) and Y(i:n) denote the jth and the ith order statistics of samples from the distribution F and G of sizes m and n, respectively. If F or G is DFR, and if X ≤disp Y , then X(j:m) ≤disp Y(i:n)

whenever i − j ≥ max{0, n − m}.

Proof. If F is DFR, then X(j:m) ≤disp X(i:n) ≤disp Y(i:n) by Theorems 3.B.28 and 3.B.26, respectively. If G is DFR, then X(j:m) ≤disp Y(j:m) ≤disp Y(i:n) by Theorems 3.B.26 and 3.B.28, respectively.

It is of interest to compare Theorem 3.B.29 to the following example (which follows from Example 3.A.50 and Theorem 3.B.16).

Example 3.B.30. Let $X_{(i:n)}$ denote the ith order statistic in a sample of n independent and identically distributed random variables having the common distribution function F, survival function $\bar F$, and density function f. Recall the definition of regular variation from Example 3.A.50. For F with support (−∞, ∞) we have:
(a) If F is regularly varying at −∞ with index α < 0, and if f is monotone on (−∞, c] for some c, then $X_{(j:m)} \le_{\rm disp} X_{(i:n)}$ implies i ≤ j.
(b) If $\bar F$ is regularly varying at ∞ with index α < 0, and if f is monotone on [c, ∞) for some c, then $X_{(j:m)} \le_{\rm disp} X_{(i:n)}$ implies n − i ≤ m − j.

The dispersive order between X and Y implies the usual stochastic order between the corresponding spacings as the next result shows. In order to state it we use the following notation. Let $X_{(1:n)} \le X_{(2:n)} \le \cdots \le X_{(n:n)}$ and $Y_{(1:n)} \le Y_{(2:n)} \le \cdots \le Y_{(n:n)}$ be the order statistics as above. The corresponding spacings are defined by $U_{(i:n)} \equiv X_{(i:n)} - X_{(i-1:n)}$ and $V_{(i:n)} \equiv Y_{(i:n)} - Y_{(i-1:n)}$, i = 2, 3, ..., n. The proof of the next theorem is given in Example 6.B.25 in Chapter 6.

Theorem 3.B.31. Let X and Y be two random variables. If $X \le_{\rm disp} Y$, then $U_{(i:n)} \le_{\rm st} V_{(i:n)}$ for i = 2, 3, ..., n.

Theorem 2.7 on page 182 of Kamps [273] extends Theorem 3.B.31 to the spacings of the so-called generalized order statistics.

The following example describes an interesting instance in which the two maxima are ordered in the dispersive order. It may be compared with Example 1.B.37.

Example 3.B.32. Let $Y_1, Y_2, \ldots, Y_n$ be independent exponential random variables with hazard rates $\lambda_1, \lambda_2, \ldots, \lambda_n$, respectively. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed exponential random variables with hazard rate $\lambda = \sum_{i=1}^{n}\lambda_i/n$. Then

$$X_{(n:n)} \le_{\rm disp} Y_{(n:n)}. \tag{3.B.32}$$

Let $Z_1, Z_2, \ldots, Z_n$ be independent and identically distributed exponential random variables with hazard rate $\tilde\lambda = \big(\prod_{i=1}^{n}\lambda_i\big)^{1/n}$. Then

$$Z_{(n:n)} \le_{\rm disp} Y_{(n:n)}. \tag{3.B.33}$$

Note that from the arithmetic-geometric mean inequality ($\lambda \ge \tilde\lambda$) it follows that $X_1 \le_{\rm hr} Z_1$. Therefore, by Theorem 3.B.20(a), $X_1 \le_{\rm disp} Z_1$. Alternatively, we can see that $X_1 \le_{\rm disp} Z_1$ from Example 1.D.1 and (3.B.26). Hence, by Theorem 3.B.26, $X_{(n:n)} \le_{\rm disp} Z_{(n:n)}$. That is, (3.B.33) is actually a stronger result than (3.B.32).

Example 3.B.33. Let $Y_1, Y_2, \ldots, Y_n$ and $X_1, X_2, \ldots, X_n$ be as in Example 3.B.32. Denote the corresponding spacings by $U_{(i:n)} \equiv X_{(i:n)} - X_{(i-1:n)}$ and $V_{(i:n)} \equiv Y_{(i:n)} - Y_{(i-1:n)}$, i = 2, 3, ..., n. Then

$$U_{(i:n)} \le_{\rm disp} V_{(i:n)}, \quad i = 2, 3, \ldots, n.$$
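The comparisons (3.B.32) and (3.B.33) between maxima can be explored numerically. The sketch below is an illustration only: the hazard rates 0.5, 1 and 3 are an assumed example, the quantiles of each maximum are obtained by bisection on its distribution function, and the quantile differences are tested for monotonicity on a finite grid as in (3.B.7). All helper names are hypothetical.

```python
import math

def cdf_max(rates):
    """CDF of the maximum of independent exponentials with the given hazard rates."""
    return lambda x: math.prod(1.0 - math.exp(-r * x) for r in rates)

def quantile(cdf, alpha, hi=1e3, iters=200):
    """Invert a continuous increasing CDF on [0, hi] by bisection."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nondecreasing(values, tol=1e-9):
    """True if the sequence never drops by more than the tolerance."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))

rates = [0.5, 1.0, 3.0]                      # assumed heterogeneous hazard rates
n = len(rates)
arith = sum(rates) / n                       # common rate of the X_i
geom = math.prod(rates) ** (1.0 / n)         # common rate of the Z_i

grid = [k / 100 for k in range(1, 100)]
q_y = [quantile(cdf_max(rates), a) for a in grid]
q_x = [quantile(cdf_max([arith] * n), a) for a in grid]
q_z = [quantile(cdf_max([geom] * n), a) for a in grid]

x_le_y = nondecreasing([y - x for x, y in zip(q_x, q_y)])   # (3.B.32)
z_le_y = nondecreasing([y - z for z, y in zip(q_z, q_y)])   # (3.B.33)
x_le_z = nondecreasing([z - x for x, z in zip(q_x, q_z)])   # via Theorem 3.B.26
```

A grid check of this kind can only fail to contradict the stated orderings; it is not a substitute for the proofs cited in the text.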

A related example is the following. Recall from page 2 the definition of the majorization order ≺ among n-dimensional vectors. It is of interest to compare the example below with Example 1.C.50.

Example 3.B.34. Let $X_i$ be an exponential random variable with mean $\lambda_i^{-1} > 0$, i = 1, 2, ..., m, and let $Y_i$ be an exponential random variable with mean $\eta_i^{-1} > 0$, i = 1, 2, ..., m. If $(\lambda_1, \lambda_2, \ldots, \lambda_m) \succ (\eta_1, \eta_2, \ldots, \eta_m)$, then

$$\sum_{i=1}^{m} X_i \ge_{\rm disp} \sum_{i=1}^{m} Y_i.$$

Similar examples are the following.

Example 3.B.35. Let $X_i$ be a uniform random variable on $[0, \lambda_i^{-1}]$, i = 1, 2, ..., m, and let $Y_i$ be a uniform random variable on $[0, \eta_i^{-1}]$, i = 1, 2, ..., m. If $(\lambda_1, \lambda_2, \ldots, \lambda_m) \succ (\eta_1, \eta_2, \ldots, \eta_m)$, then

$$\sum_{i=1}^{m} X_i \ge_{\rm disp} \sum_{i=1}^{m} Y_i.$$

Example 3.B.36. Let $X_i$ be a Gamma random variable with density function $(1/\Gamma(\alpha))\lambda_i^{\alpha}x^{\alpha-1}e^{-\lambda_i x}$, x > 0, i = 1, 2, ..., m, and let $Y_i$ be a Gamma random variable with density function $(1/\Gamma(\alpha))\eta_i^{\alpha}x^{\alpha-1}e^{-\eta_i x}$, x > 0, i = 1, 2, ..., m. Here α ≥ 1, and the $\lambda_i$'s and the $\eta_i$'s are positive parameters. If $(\lambda_1, \lambda_2, \ldots, \lambda_m) \succ (\eta_1, \eta_2, \ldots, \eta_m)$, then

$$\sum_{i=1}^{m} X_i \ge_{\rm disp} \sum_{i=1}^{m} Y_i.$$
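The exponential case can be given a rough numerical sanity check. The sketch below assumes that the hypothesis of the example is that the rate vector of the $X_i$'s majorizes that of the $Y_i$'s, and it only verifies the variance ordering that the dispersive order implies by (3.B.25), not the dispersive order itself. The rate vectors and the helper names are hypothetical.

```python
def majorizes(a, b, tol=1e-12):
    """True if vector a majorizes vector b.

    Both vectors must have the same total, and every partial sum of the
    decreasingly sorted entries of a must dominate that of b.
    """
    if len(a) != len(b) or abs(sum(a) - sum(b)) > tol:
        return False
    partial_a = partial_b = 0.0
    for x, y in zip(sorted(a, reverse=True), sorted(b, reverse=True)):
        partial_a += x
        partial_b += y
        if partial_a < partial_b - tol:
            return False
    return True

def var_sum_exponentials(rates):
    """Variance of a sum of independent exponentials with the given rates."""
    return sum(1.0 / r ** 2 for r in rates)

lam = [1.0, 2.0, 6.0]   # assumed rates of the X_i (more spread out)
eta = [3.0, 3.0, 3.0]   # assumed rates of the Y_i (same total, less spread out)

lam_majorizes_eta = majorizes(lam, eta)
var_x = var_sum_exponentials(lam)
var_y = var_sum_exponentials(eta)
```

With these rates the more heterogeneous vector yields the larger variance, which agrees with the direction of the inequality in the example.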

The proof of the next example is omitted. Example 3.B.37. Let {N (t), t ≥ 0} be a nonhomogeneous Poisson process with mean function Λ (that is, Λ(t) ≡ E[N (t)], t ≥ 0), and let T1 , T2 , . . . be the successive epoch times. If Λ is strictly increasing and concave, then Tn ≤disp Tn+1 ,

n = 1, 2, . . . .

In the following example the idea of the proof of Theorem 3.B.26 is used. This example may be compared with Examples 1.B.24, 1.C.48, 2.A.22, 4.B.14, 6.B.41, 6.D.8, 6.E.13, and 7.B.13.

Example 3.B.38. Let X and Y be two absolutely continuous nonnegative random variables with survival functions $\bar F$ and $\bar G$, respectively. Denote $\Lambda_1 = -\log \bar F$ and $\Lambda_2 = -\log \bar G$. Consider two nonhomogeneous Poisson processes $N_1 = \{N_1(t), t \ge 0\}$ and $N_2 = \{N_2(t), t \ge 0\}$ with mean functions $\Lambda_1$ and $\Lambda_2$ (see Example 3.B.37), respectively. Let $T_{i,1}, T_{i,2}, \ldots$ be the successive epoch times of process $N_i$, i = 1, 2. Note that $X =_{\rm st} T_{1,1}$ and $Y =_{\rm st} T_{2,1}$. It turns out that the dispersive ordering of the first two epoch times implies the dispersive ordering of all the corresponding later epoch times; that is, it will be shown below that if $X \le_{\rm disp} Y$, then $T_{1,n} \le_{\rm disp} T_{2,n}$, n ≥ 1. In order to see it, fix an n ≥ 1, and denote by $F_{1,n}$ and $F_{2,n}$ the distribution functions of $T_{1,n}$ and $T_{2,n}$, respectively. Note from (1.B.24) that

$$F_{1,n}(t) = \psi_n(F(t)) \quad \text{and} \quad F_{2,n}(t) = \psi_n(G(t)),$$

where $\psi_n(u) \equiv \Gamma_n(-\log(1-u))$, u ∈ [0, 1]. Therefore,

$$F_{2,n}^{-1}(F_{1,n}(t)) - t = (\psi_n(G))^{-1}(\psi_n(F(t))) - t = G^{-1}(F(t)) - t, \quad t \ge 0.$$

Thus, from (3.B.10) it is seen that X ≤disp Y if, and only if, T1,n ≤disp T2,n . In the following example it is shown that, under the proper conditions, random minima and maxima are ordered in the dispersive order sense; see related results in Examples 1.C.46, 4.B.16, 5.A.24, and 5.B.13. Example 3.B.39. Let X1 , X2 , . . ., and Y1 , Y2 , . . ., each be a sequence of independent and identically distributed random variables with common distribution functions FX1 and FY1 , respectively, and common survival functions F X1 and F Y1 , respectively. Let N be a positive integer-valued random variable, independent of the Xi ’s and of the Yi ’s, with a Laplace transform LN .


Denote $X_{(1:N)} = \min\{X_1, X_2, \dots, X_N\}$, $X_{(N:N)} = \max\{X_1, X_2, \dots, X_N\}$, $Y_{(1:N)} = \min\{Y_1, Y_2, \dots, Y_N\}$, and $Y_{(N:N)} = \max\{Y_1, Y_2, \dots, Y_N\}$. The distribution functions of $X_{(N:N)}$ and $Y_{(N:N)}$ are given by
\[
F_{X_{(N:N)}}(x) = L_N(-\log F_{X_1}(x)), \qquad x \ge 0,
\]
and
\[
F_{Y_{(N:N)}}(x) = L_N(-\log F_{Y_1}(x)), \qquad x \ge 0 .
\]
If $X_1 \le_{\mathrm{disp}} Y_1$, then, for $0 < \alpha \le \beta < 1$, we compute
\[
\begin{aligned}
F_{X_{(N:N)}}^{-1}(\beta) - F_{X_{(N:N)}}^{-1}(\alpha)
&= F_{X_1}^{-1}\bigl(e^{-L_N^{-1}(\beta)}\bigr) - F_{X_1}^{-1}\bigl(e^{-L_N^{-1}(\alpha)}\bigr) \\
&\le F_{Y_1}^{-1}\bigl(e^{-L_N^{-1}(\beta)}\bigr) - F_{Y_1}^{-1}\bigl(e^{-L_N^{-1}(\alpha)}\bigr) \\
&= F_{Y_{(N:N)}}^{-1}(\beta) - F_{Y_{(N:N)}}^{-1}(\alpha) .
\end{aligned}
\]
Therefore $X_{(N:N)} \le_{\mathrm{disp}} Y_{(N:N)}$. Similarly it can be shown that if $X_1 \le_{\mathrm{disp}} Y_1$, then $X_{(1:N)} \le_{\mathrm{disp}} Y_{(1:N)}$.

Example 3.B.40. Let $X$ (respectively, $Y$) have the central $t$-distribution with $\nu_X$ (respectively, $\nu_Y$) degrees of freedom. If $\nu_X \le \nu_Y$, then $X \ge_{\mathrm{disp}} Y$.

Example 3.B.41. As in Example 1.C.59, for nonnegative absolutely continuous random variables $X$ and $Y$, let $X^w$ and $Y^w$ be the random variables with the weighted density functions $f_w$ and $g_w$ given in (1.C.17) and (1.C.18). Suppose that $X \le_{\mathrm{disp}} Y$. If $X$ is DFR, if $Y$ is IFR, and if $w$ is decreasing and convex, then $X^w \le_{\mathrm{disp}} Y^w$.

Analogous to the result in Remark 1.A.18, it can be shown that a certain quotient set of all distribution functions on $\mathbb{R}$ is a lattice with respect to the order $\le_{\mathrm{disp}}$.

A consequence of the order $\le_{\mathrm{disp}}$ is given in the next theorem. It is a motivation for a multivariate dispersion order that is described in Chapter 7.

Theorem 3.B.42. Let $X$ and $X'$ be two independent and identically distributed random variables and let $Y$ and $Y'$ be two other independent and identically distributed random variables. If $X \le_{\mathrm{disp}} Y$, then $|X - X'| \le_{\mathrm{st}} |Y - Y'|$, that is,
\[
P\{|X - X'| > z\} \le P\{|Y - Y'| > z\} \quad\text{for all } z \ge 0 . \tag{3.B.34}
\]

Proof. Denote the common distribution function of $X$ and $X'$ [respectively, $Y$ and $Y'$] by $F$ [$G$]. Select a $z \ge 0$. Then
\[
\begin{aligned}
P\{|X - X'| \le z\}
&= \int_{-\infty}^{\infty} [F(x+z) - F(x-z)]\,dF(x) \\
&= \int_0^1 \{F[F^{-1}(u) + z] - F[F^{-1}(u) - z]\}\,du \\
&\ge \int_0^1 \{G[G^{-1}(u) + z] - G[G^{-1}(u) - z]\}\,du \\
&= P\{|Y - Y'| \le z\},
\end{aligned}
\]
where the inequality is a consequence of (3.B.4) and (3.B.5). This proves (3.B.34).
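As an informal numerical illustration of Theorem 3.B.42, the following Python sketch evaluates the integral representation used in the proof for two exponential random variables. The exponential family, the rates 2 and 1, the function names, and the midpoint rule are all assumptions made for this example. It relies on the standard fact that, among exponential distributions, the larger rate gives the smaller variable in the dispersive order, and on the fact that $|X - X'|$ is again exponential with the same rate.

```python
import math

def exp_cdf(x, rate):
    """Distribution function of an exponential random variable with the given rate."""
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0

def exp_quantile(u, rate):
    """Inverse of exp_cdf on (0, 1)."""
    return -math.log(1.0 - u) / rate

def prob_abs_diff_le(z, rate, n=5000):
    """Midpoint-rule approximation of
    int_0^1 {F[F^{-1}(u) + z] - F[F^{-1}(u) - z]} du = P{|X - X'| <= z}."""
    total = 0.0
    for k in range(n):
        u = (k + 0.5) / n
        q = exp_quantile(u, rate)
        total += exp_cdf(q + z, rate) - exp_cdf(q - z, rate)
    return total / n

# Hypothetical rates: X has rate 2 and Y has rate 1, so X <=_disp Y.
RATE_X, RATE_Y = 2.0, 1.0
Z_GRID = [0.1 * k for k in range(1, 31)]

# (3.B.34) in the equivalent form P{|X - X'| <= z} >= P{|Y - Y'| <= z}.
theorem_holds = all(
    prob_abs_diff_le(z, RATE_X) >= prob_abs_diff_le(z, RATE_Y) for z in Z_GRID
)
```

The check is only carried out on a finite grid of $z$ values, so it illustrates the inequality rather than proving it.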

3.C The Excess Wealth Order

3.C.1 Motivation and definition

Let $X$ be a nonnegative random variable with distribution function $F$ and with a finite mean. Recall from (3.A.43) the definition of the Lorenz curve. The nonstandardized (or the generalized) Lorenz curve $\tilde L_X$, corresponding to $X$, is defined as
\[
\tilde L_X(p) = \int_0^p F^{-1}(u)\,du, \qquad p \in [0, 1].
\]
Note that the requirement that $X$ is nonnegative is not needed in order for $\tilde L_X$ to be well defined. Thus, in this section we will not assume the nonnegativity of the discussed random variables, unless stated otherwise.

For a nonnegative random variable $X$ with a finite mean, a transform that is closely related to the nonstandardized Lorenz curve is the transform $T_X$ defined as
\[
T_X(p) = \int_0^{F^{-1}(p)} \bar F(x)\,dx, \qquad p \in [0, 1],
\]
where $\bar F \equiv 1 - F$ denotes the survival function of $X$. The transform $T_X$ is called the TTT transform, and is denoted by $H_F^{-1}$ in (1.A.19). A third transform, which is related to the nonstandardized Lorenz curve and to the TTT transform, and which will be heavily used in this section, is the excess wealth transform $W_X$ defined as
\[
W_X(p) = \int_{F^{-1}(p)}^{\infty} \bar F(x)\,dx, \qquad p \in (0, 1].
\]

Note that it is not necessary for the random variable $X$ to be nonnegative in order for $W_X$ to be well defined; it is only required that $X$ has a finite mean. This useful property of the excess wealth transform is one of the main reasons for its applicability as a tool that defines a stochastic order.

The transforms $\tilde L_X$, $T_X$, and $W_X$, when $X$ is nonnegative with a finite mean, are depicted in Figure 3.C.1. For $p \in (0, 1)$ the value $\tilde L_X(p)$ is depicted as the area of the region $A$ in the figure. The value $T_X(p)$ is the area of $A \cup B$, and the value $W_X(p)$ is the area of $C$. Note that the area of $A \cup B \cup C$ is $EX$.

[Fig. 3.C.1. Depiction of $\tilde L_X(p)$, $T_X(p)$, and $W_X(p)$: the graph of $F$ against $x$, with the regions $A$, $B$, and $C$ determined by the level $p$ and the point $F^{-1}(p)$.]

The order which is determined by the pointwise comparison of the excess wealth transforms of two random variables is of interest in this section. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$. Assume that
\[
W_X(p) \equiv \int_{F^{-1}(p)}^{\infty} \bar F(x)\,dx \le \int_{G^{-1}(p)}^{\infty} \bar G(x)\,dx \equiv W_Y(p) \quad\text{for all } p \in (0, 1). \tag{3.C.1}
\]
Then $X$ is said to be smaller than $Y$ in the excess wealth order (denoted as $X \le_{\mathrm{ew}} Y$).

Note that since $F^{-1}(p) = \bar F^{-1}(1-p)$ and $G^{-1}(p) = \bar G^{-1}(1-p)$, we see that $X \le_{\mathrm{ew}} Y$ if, and only if,
\[
\int_{\bar F^{-1}(p)}^{\infty} \bar F(x)\,dx \le \int_{\bar G^{-1}(p)}^{\infty} \bar G(x)\,dx \quad\text{for all } p \in (0, 1).
\]
If we define $\Psi_X(y) = \int_y^{\infty} \bar F(x)\,dx$ and $\Psi_Y(y) = \int_y^{\infty} \bar G(x)\,dx$, $y \in \mathbb{R}$, then $X \le_{\mathrm{ew}} Y$ if, and only if,
\[
\Psi_Y^{-1}(z) - \Psi_X^{-1}(z) \quad\text{is decreasing in } z \ge 0 . \tag{3.C.2}
\]
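To make definition (3.C.1) concrete, the following Python sketch approximates the excess wealth transform by numerical integration and compares two exponential random variables. The function names, the truncation point `upper`, the rates 2 and 1, and the midpoint rule are assumptions of this illustration; for an exponential random variable with rate $\lambda$ a direct calculation gives $W_X(p) = (1-p)/\lambda$, which the sketch can be checked against.

```python
import math

def excess_wealth(quantile, survival, p, upper, n=5000):
    """Midpoint-rule approximation of W_X(p) = int_{F^{-1}(p)}^{upper} (1 - F(x)) dx.

    `upper` truncates the infinite range; it should lie beyond the point
    where the survival function is negligible.
    """
    lower = quantile(p)
    if upper <= lower:
        return 0.0
    h = (upper - lower) / n
    return h * sum(survival(lower + (k + 0.5) * h) for k in range(n))

def exponential(rate):
    """Return (quantile, survival) for an exponential distribution with the given rate."""
    return (lambda u: -math.log(1.0 - u) / rate,
            lambda x: math.exp(-rate * x) if x > 0.0 else 1.0)

# Hypothetical example: X has rate 2 and Y has rate 1.
qx, sx = exponential(2.0)
qy, sy = exponential(1.0)
P_GRID = [0.05 * k for k in range(1, 20)]
w_x = [excess_wealth(qx, sx, p, upper=60.0) for p in P_GRID]
w_y = [excess_wealth(qy, sy, p, upper=60.0) for p in P_GRID]

# Pointwise comparison (3.C.1) on the grid.
x_le_ew_y = all(a <= b for a, b in zip(w_x, w_y))
```

Passing the quantile and survival functions as arguments keeps the routine usable for other distributions with a finite mean, provided a suitable truncation point is supplied.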

In order to obtain another characterization of the $\le_{\mathrm{ew}}$ order, rewrite (3.C.1) as
\[
\int_p^1 \bigl[F^{-1}(u) - F^{-1}(p)\bigr]\,du \le \int_p^1 \bigl[G^{-1}(u) - G^{-1}(p)\bigr]\,du \tag{3.C.3}
\]
(this can be formally verified by Fubini's Theorem or, informally, by rewriting the area of the region $C$ in Figure 3.C.1 as the left-hand side above). It is thus seen that $X \le_{\mathrm{ew}} Y$ if, and only if,
\[
G^{-1}(p) - F^{-1}(p) \le \frac{1}{1-p} \int_p^1 \bigl[G^{-1}(u) - F^{-1}(u)\bigr]\,du, \qquad p \in (0, 1).
\]
By a straightforward differentiation it can be verified that the latter is equivalent to
\[
\frac{1}{1-p} \int_p^1 \bigl[G^{-1}(u) - F^{-1}(u)\bigr]\,du \quad\text{is increasing in } p \in (0, 1). \tag{3.C.4}
\]
Thus, $X \le_{\mathrm{ew}} Y$ if, and only if, (3.C.4) holds.
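The differentiation step behind (3.C.4) can be spelled out as follows. The shorthand $D$ and $h$ is introduced only for this sketch, and differentiability of $h$ at $p$ is assumed (it holds, for instance, at continuity points of $D$).

```latex
\[
D(u) = G^{-1}(u) - F^{-1}(u), \qquad
h(p) = \frac{1}{1-p}\int_p^1 D(u)\,du ,
\]
\[
h'(p) = \frac{1}{(1-p)^2}\int_p^1 D(u)\,du - \frac{D(p)}{1-p}
      = \frac{h(p) - D(p)}{1-p} .
\]
```

Hence $h'(p) \ge 0$ exactly when $D(p) \le h(p)$, which is the inequality displayed before (3.C.4).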

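Let $m_X$ and $m_Y$, defined by
\[
m_X(t) \equiv \frac{\int_t^{\infty} \bar F(x)\,dx}{\bar F(t)} \quad\text{and}\quad m_Y(t) \equiv \frac{\int_t^{\infty} \bar G(x)\,dx}{\bar G(t)}
\]
(for $t$'s for which the denominators are not 0), denote the mean residual life functions associated with $X$ and $Y$ (see (2.A.1)). Then it is seen that $X \le_{\mathrm{ew}} Y$ if, and only if,
\[
m_X(F^{-1}(p)) \le m_Y(G^{-1}(p)), \qquad p \in (0, 1). \tag{3.C.5}
\]
Also, $X \le_{\mathrm{ew}} Y$ if, and only if,
\[
m_X(\bar F^{-1}(p)) \le m_Y(\bar G^{-1}(p)), \qquad p \in (0, 1). \tag{3.C.6}
\]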
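As a worked illustration of (3.C.5), chosen here and not taken from the text, let $X$ be uniform on $[0, a]$ and let $Y$ be uniform on $[0, b]$, with $a, b > 0$.

```latex
\[
\bar F(x) = 1 - \frac{x}{a}, \qquad F^{-1}(p) = ap, \qquad
m_X(t) = \frac{\int_t^a (1 - x/a)\,dx}{1 - t/a}
       = \frac{(a-t)^2/(2a)}{(a-t)/a} = \frac{a-t}{2}, \quad 0 \le t < a ,
\]
\[
m_X(F^{-1}(p)) = \frac{a(1-p)}{2}, \qquad
m_Y(G^{-1}(p)) = \frac{b(1-p)}{2}, \qquad p \in (0,1).
\]
```

In this example (3.C.5) therefore holds if, and only if, $a \le b$; that is, for these two uniform distributions $X \le_{\mathrm{ew}} Y$ exactly when $a \le b$.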

Another characterization of the excess wealth order is given in Theorem 4.A.43.

Like the convex and the dispersive orders (see Theorems 3.A.7 and 3.B.3), the excess wealth order can be characterized by means of Yaari functionals $V_h$ defined in (3.A.31). Recall that an increasing function $h : [0, 1] \to [0, 1]$ is starshaped if $h(t)/t$ is increasing on $(0, 1]$.

Theorem 3.C.1. Let $X$ and $Y$ be two random variables with the same finite means. Then $X \le_{\mathrm{ew}} Y$ if, and only if,
\[
V_h(X) \le V_h(Y) \quad\text{for every starshaped probability transformation function } h .
\]

Jewitt [256] considered an order, called the location independent riskier order, that can be denoted by $\le_{\mathrm{lir}}$. It is shown in Fagiuoli, Pellerey, and Shaked [188] that $X \le_{\mathrm{lir}} Y \iff -X \le_{\mathrm{ew}} -Y$. Thus every result that holds for the order $\le_{\mathrm{ew}}$ can be reworded by means of the order $\le_{\mathrm{lir}}$.

3.C.2 Properties

It is easy to verify that the excess wealth order is location-independent. That is,
\[
X \le_{\mathrm{ew}} Y \implies X + a \le_{\mathrm{ew}} Y \quad\text{for any } a \in \mathbb{R} .
\]
From (3.C.4) and Theorem 3.A.8 we see that
\[
X \le_{\mathrm{ew}} Y \implies X \le_{\mathrm{dil}} Y . \tag{3.C.7}
\]


It follows that if $EX = EY$, then
\[
X \le_{\mathrm{ew}} Y \implies X \le_{\mathrm{cx}} Y . \tag{3.C.8}
\]
Shaked and Shanthikumar [518] showed that if $X \le_{\mathrm{cx}} Y$, then it does not necessarily follow that $X \le_{\mathrm{ew}} Y$. From (3.C.7), (3.A.32), and (3.A.4) it follows that
\[
X \le_{\mathrm{ew}} Y \implies \mathrm{Var}(X) \le \mathrm{Var}(Y),
\]
provided $\mathrm{Var}(Y) < \infty$. From (3.C.3) and (3.B.1) we see that for random variables with finite means, we have
\[
X \le_{\mathrm{disp}} Y \implies X \le_{\mathrm{ew}} Y . \tag{3.C.9}
\]

A characterization of the excess wealth order, which is similar to the characterization of the dispersive order given in Theorem 3.B.2, is given next.

Theorem 3.C.2. Let $X$ and $Y$ be two random variables. Then $X \le_{\mathrm{ew}} Y$ if, and only if, for all increasing convex functions $\phi$ and $h$ such that $\phi$ and $\psi(\cdot) \equiv h(\phi(\cdot))$ are integrable with respect to the distribution of $Y$, and for every real number $c$, we have that
\[
E\phi(X - c) \le E\phi(Y) \implies E\psi(X - c) \le E\psi(Y) .
\]
It is worthwhile to mention that two twice differentiable functions $\phi$ and $\psi$ satisfy $\psi(\cdot) \equiv h(\phi(\cdot))$ for some increasing convex function $h$ if, and only if, $\phi''/\phi' \le \psi''/\psi'$.

Another characterization of the excess wealth order is described in the following theorem. It is similar to the characterization of the convex order in Theorem 3.A.45. Below, for any random variable $Z$, the function $\Psi_Z$ is as defined before (3.C.2).

Theorem 3.C.3. Let $X$ and $Y$ be two random variables with equal means. Then $X \le_{\mathrm{ew}} Y$ if, and only if, there exist random variables $Z_1, Z_2, \dots$, with distribution functions $F_1, F_2, \dots$, such that $Z_1 =_{\mathrm{st}} X$, $EZ_j = EY$, $j = 1, 2, \dots$, $\Psi_{Z_j}(x) \to \Psi_Y(x)$ as $j \to \infty$ for all $x \in \mathbb{R}$, and, for any $c \ge 0$, it holds that
\[
S^{-}\bigl(\bar F_j(\cdot) - \bar F_{j+1}(\cdot - c)\bigr) = 1 \quad\text{and the sign sequence is } +, -, \qquad j = 1, 2, \dots .
\]

An important closure property of the excess wealth order is given next.

Theorem 3.C.4. Let $X$ and $Y$ be two continuous random variables with finite means. Then, for any increasing convex function $\phi$, we have
\[
X \le_{\mathrm{ew}} Y \implies \phi(X) \le_{\mathrm{ew}} \phi(Y) .
\]

In the next two results we describe some relationships between the orders $\le_{\mathrm{ew}}$ and $\le_{\mathrm{mrl}}$. We denote the left endpoint of the support of a random variable $X$ by $l_X$.


Theorem 3.C.5. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, respectively, with finite means, and with finite left endpoints $l_X$ and $l_Y$ such that $l_X \le l_Y$. If $X \le_{\mathrm{ew}} Y$, and if either $X$ or $Y$ or both are DMRL, then $X \le_{\mathrm{mrl}} Y$.

Proof. We only give the proof for the case when the distribution functions $F$ and $G$ of $X$ and $Y$ are continuous; the proof for the general case is similar, though notationally more complex. Let $(y_0, p_0)$, $(y_1, p_1)$, and $(y_2, p_2)$ be three consecutive points of crossing as in the proof of Theorem 3.A.5 (see Figure 3.A.1). Note that by the continuity assumption we have $p_i = F(y_i) = G(y_i)$, $i = 0, 1, 2$.

Suppose that $Y$ is DMRL. For $p \in [p_1, p_2]$ we have $F^{-1}(p) \le G^{-1}(p)$, and therefore, for such a $p$ we have
\[
m_X(F^{-1}(p)) \le m_Y(G^{-1}(p)) \le m_Y(F^{-1}(p)),
\]
where the first inequality follows from (3.C.5), and the second from the assumption that $Y$ is DMRL. Thus,
\[
m_X(y) \le m_Y(y) \quad\text{for } y \in [y_1, y_2]. \tag{3.C.10}
\]
If $X$ (rather than $Y$) is DMRL, then (3.C.10) follows from
\[
m_X(G^{-1}(p)) \le m_X(F^{-1}(p)) \le m_Y(G^{-1}(p)), \qquad p \in [p_1, p_2],
\]
where the first inequality follows from the assumption that $X$ is DMRL, and the second from (3.C.5).

Since $y_0 = F^{-1}(p_0) = G^{-1}(p_0)$, from $X \le_{\mathrm{ew}} Y$ we also have that
\[
\int_{y_0}^{\infty} \bar F(x)\,dx \le \int_{y_0}^{\infty} \bar G(x)\,dx . \tag{3.C.11}
\]
Now let $y \in (y_0, y_1)$. For $x \in (y_0, y)$ we have $\bar F(x) \ge \bar G(x)$. Therefore
\[
\int_{y_0}^{y} \bar F(x)\,dx \ge \int_{y_0}^{y} \bar G(x)\,dx . \tag{3.C.12}
\]

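Hence
\[
\begin{aligned}
\int_{y}^{\infty} \bar F(x)\,dx
&= \int_{y_0}^{\infty} \bar F(x)\,dx - \int_{y_0}^{y} \bar F(x)\,dx \\
&\le \int_{y_0}^{\infty} \bar G(x)\,dx - \int_{y_0}^{y} \bar F(x)\,dx && \text{[by (3.C.11)]} \\
&\le \int_{y_0}^{\infty} \bar G(x)\,dx - \int_{y_0}^{y} \bar G(x)\,dx && \text{[by (3.C.12)]} \\
&= \int_{y}^{\infty} \bar G(x)\,dx .
\end{aligned}
\]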


Therefore
\[
\int_{y}^{\infty} \bar F(x)\,dx \le \int_{y}^{\infty} \bar G(x)\,dx \quad\text{for all } y \in [y_0, y_1].
\]
But since $\bar F(y) \ge \bar G(y)$ for $y \in [y_0, y_1]$, we see that
\[
\frac{\int_{y}^{\infty} \bar F(x)\,dx}{\bar F(y)} \le \frac{\int_{y}^{\infty} \bar G(x)\,dx}{\bar F(y)} \le \frac{\int_{y}^{\infty} \bar G(x)\,dx}{\bar G(y)} .
\]
So
\[
m_X(y) \le m_Y(y) \quad\text{for } y \in [y_0, y_1]. \tag{3.C.13}
\]
That is, from (3.C.10) and (3.C.13) we have
\[
m_X(y) \le m_Y(y) \quad\text{for } y \in [y_0, y_2].
\]

In order to complete the proof we need to show that the interval $[l_X, \infty)$ is a union of segments $[y_0, y_2)$ as above. Suppose that a last point of crossing of $F$ and $G$ exists, and denote it by $(y_l, p_l)$. Denote $(y_0, p_0) = (y_{l-1}, p_{l-1})$, $(y_1, p_1) = (y_l, p_l)$, and $(y_2, p_2) = (\infty, 1)$, where $(y_{l-1}, p_{l-1})$ is the point of the next to the last crossing of $F$ and $G$. From the facts that $F^{-1}(p_1) = G^{-1}(p_1)$, and that $X \le_{\mathrm{ew}} Y$ implies $\int_{F^{-1}(p_1)}^{\infty} \bar F(x)\,dx \le \int_{G^{-1}(p_1)}^{\infty} \bar G(x)\,dx$, it follows that $F$ crosses $G$ from below at $(y_1, p_1)$, and therefore the interval $[y_0, \infty)$ is of the type described above.

Now suppose that a first point of crossing of $F$ and $G$ exists, and denote it by $(y_f, p_f)$. If $l_X < l_Y$, then at the first point of crossing, $F$ crosses $G$ from above. Thus, from the above proof it follows that $m_X(y) \le m_Y(y)$ for all $y \ge y_f$. The proof that $m_X(y) \le m_Y(y)$ also for $y < y_f$ is similar to the proof of (3.C.10). If $l_X = l_Y$, then consider two possible cases: (a) in the first point of crossing, $F$ crosses $G$ from above, and (b) in the first point of crossing, $F$ crosses $G$ from below. In case (a) we obtain $m_X(y) \le m_Y(y)$ for all $y$, as we obtained it above when $l_X < l_Y$. In case (b) denote $y_0 = \sup\{y : F(y) = G(y)\}$, $p_0 = F(y_0)$, $(y_1, p_1) = (y_f, p_f)$, and $(y_2, p_2) = (y_{f+1}, p_{f+1})$, where $(y_{f+1}, p_{f+1})$ is the point of the second crossing of $F$ and $G$. The interval $[y_0, y_2)$ is of the kind described above, and therefore $m_X(y) \le m_Y(y)$ for $y \in [y_0, y_2)$, and from this it also follows that $m_X(y) \le m_Y(y)$ for $y < y_0$.

Theorem 3.C.6. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, respectively, with finite means, and with finite left endpoints $l_X$ and $l_Y$ such that $l_X \le l_Y$. If $X \le_{\mathrm{mrl}} Y$, and if either $X$ or $Y$ or both are IMRL, then $X \le_{\mathrm{ew}} Y$.

Proof. Again, we only give the proof for the case when the distribution functions $F$ and $G$ of $X$ and $Y$ are continuous; the proof for the general case is similar, though notationally more complex. Let $(y_0, p_0)$, $(y_1, p_1)$, and $(y_2, p_2)$ be three consecutive points of crossing as in the proof of Theorem 3.A.5 (see Figure 3.A.1).


Suppose that $Y$ is IMRL. For $p \in [p_1, p_2]$ we have $F^{-1}(p) \le G^{-1}(p)$, and therefore, for such a $p$ we have
\[
m_X(F^{-1}(p)) \le m_Y(F^{-1}(p)) \le m_Y(G^{-1}(p)),
\]
where the first inequality follows from $X \le_{\mathrm{mrl}} Y$, and the second from the assumption that $Y$ is IMRL. Thus,
\[
m_X(F^{-1}(p)) \le m_Y(G^{-1}(p)) \quad\text{for } p \in [p_1, p_2]. \tag{3.C.14}
\]
If $X$ (rather than $Y$) is IMRL, then (3.C.14) follows from
\[
m_X(F^{-1}(p)) \le m_X(G^{-1}(p)) \le m_Y(G^{-1}(p)), \qquad p \in [p_1, p_2],
\]
where the first inequality follows from the assumption that $X$ is IMRL, and the second from $X \le_{\mathrm{mrl}} Y$.

Since $y_0 = F^{-1}(p_0) = G^{-1}(p_0)$, from $X \le_{\mathrm{mrl}} Y$ we also have that
\[
m_X(F^{-1}(p_0)) \le m_Y(G^{-1}(p_0)). \tag{3.C.15}
\]
Now let $p \in (p_0, p_1)$. Since $\bar F(x) \ge \bar G(x)$ for $x \in [y_0, y_1]$ we see that
\[
\int_{y_0}^{F^{-1}(p)} \bar F(x)\,dx \ge \int_{y_0}^{F^{-1}(p)} \bar G(x)\,dx \ge \int_{y_0}^{G^{-1}(p)} \bar G(x)\,dx, \tag{3.C.16}
\]
where the second inequality follows from $F^{-1}(p) \ge G^{-1}(p)$. Therefore
\[
\begin{aligned}
m_X(F^{-1}(p))
&= \frac{\int_{F^{-1}(p)}^{\infty} \bar F(x)\,dx}{1-p} \\
&= \frac{\int_{y_0}^{\infty} \bar F(x)\,dx - \int_{y_0}^{F^{-1}(p)} \bar F(x)\,dx}{1-p} \\
&\le \frac{\int_{y_0}^{\infty} \bar G(x)\,dx - \int_{y_0}^{G^{-1}(p)} \bar G(x)\,dx}{1-p} && \text{[by (3.C.15) and (3.C.16)]} \\
&= m_Y(G^{-1}(p)), \qquad \text{for } p \in [p_0, p_1].
\end{aligned}
\]
So, from the preceding inequality and from (3.C.14) we obtain
\[
m_X(F^{-1}(p)) \le m_Y(G^{-1}(p)) \quad\text{for } p \in [p_0, p_2].
\]

In order to complete the proof we need to show that the interval $[l_X, \infty)$ is a union of segments $[y_0, y_2)$ as above. Suppose that a last point of crossing of $F$ and $G$ exists, and denote it by $(y_l, p_l)$. Denote $(y_0, p_0) = (y_{l-1}, p_{l-1})$, $(y_1, p_1) = (y_l, p_l)$, and $(y_2, p_2) = (\infty, 1)$, where $(y_{l-1}, p_{l-1})$ is the point of the next to the last crossing of $F$ and $G$. From the facts that $F(y_1) = G(y_1)$, and that $X \le_{\mathrm{mrl}} Y$ implies $\int_{y_1}^{\infty} \bar F(x)\,dx \le \int_{y_1}^{\infty} \bar G(x)\,dx$, it follows that $F$ crosses $G$ from below at $(y_1, p_1)$, and therefore the interval $[y_0, \infty)$ is of the type described above.

Now suppose that a first point of crossing of $F$ and $G$ exists, and denote it by $(y_f, p_f)$. If $l_X < l_Y$, then at the first point of crossing, $F$ crosses $G$ from above. Thus, from the above proof it follows that $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for all $p \ge p_f$. The proof that $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ also for $p < p_f$ is similar to the proof of (3.C.14). If $l_X = l_Y$, then consider two possible cases: (a) in the first point of crossing, $F$ crosses $G$ from above, and (b) in the first point of crossing, $F$ crosses $G$ from below. In case (a) we obtain $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for all $p$, as we obtained it above when $l_X < l_Y$. In case (b) denote $y_0 = \sup\{y : F(y) = G(y)\}$, $p_0 = F(y_0)$, $(y_1, p_1) = (y_f, p_f)$, and $(y_2, p_2) = (y_{f+1}, p_{f+1})$, where $(y_{f+1}, p_{f+1})$ is the point of the second crossing of $F$ and $G$. The interval $[y_0, y_2)$ is of the kind described above, and therefore $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for $p \in [p_0, p_2)$, and from this it also follows that $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for $p \le p_0$.

In summary, we have shown that $m_X(F^{-1}(p)) \le m_Y(G^{-1}(p))$ for all $p \in (0, 1)$. Therefore $X \le_{\mathrm{ew}} Y$ by (3.C.5).

The following few results give conditions under which the order $\le_{\mathrm{ew}}$ is closed under convolutions.

Theorem 3.C.7. Let $(X_i, Y_i)$, $i = 1, 2, \dots, m$, be independent pairs of random variables such that $X_i \le_{\mathrm{ew}} Y_i$, $i = 1, 2, \dots, m$. If $X_i$, $Y_i$, $i = 1, 2, \dots, m$, all have (continuous or discrete) logconcave densities, except possibly one $X_l$ and one $Y_k$ ($l = k$), then
\[
\sum_{i=1}^m X_i \le_{\mathrm{ew}} \sum_{i=1}^m Y_i .
\]

In order to prove Theorem 3.C.7 one first proves that if $X$ and $Y$ are two random variables such that $X \le_{\mathrm{ew}} Y$, and if $Z$ is a random variable with logconcave density that is independent of $X$ and of $Y$, then $X + Z \le_{\mathrm{ew}} Y + Z$. The statement of the theorem can then be derived from the fact that a convolution of random variables with logconcave densities has a logconcave density. We do not give the details here.

The next two results are analogs of Theorems 3.B.7 and 3.B.8.

Theorem 3.C.8. The random variable $X$ satisfies
\[
X \le_{\mathrm{ew}} X + Y \quad\text{for any random variable } Y \text{ independent of } X
\]
if, and only if, $X$ is IFR.

Theorem 3.C.9. Let $Z$ be a random variable. Then
\[
X + Z \le_{\mathrm{ew}} Y + Z \quad\text{whenever } X \le_{\mathrm{disp}} Y \text{ and } Z \text{ is independent of } X \text{ and } Y
\]
if, and only if, $Z$ is IFR.


Since a convolution of IFR random variables is IFR (see Corollary 1.B.39), repeated application of Theorem 3.C.9 yields the following result, which is an analog of Theorem 3.B.9.

Theorem 3.C.10. Let $X_1, X_2, \dots, X_n$ be a set of independent random variables, and let $Y_1, Y_2, \dots, Y_n$ be another set of independent random variables. If the $X_i$'s and the $Y_i$'s are all IFR, and if $X_i \le_{\mathrm{disp}} Y_i$, $i = 1, 2, \dots, n$, then
\[
\sum_{i=1}^n X_i \le_{\mathrm{ew}} \sum_{i=1}^n Y_i .
\]

An interesting closure property of the order $\le_{\mathrm{ew}}$ is given next.

Theorem 3.C.11. Let $X_1, X_2, \dots$ be a collection of independent and identically distributed random variables, and let $Y_1, Y_2, \dots$ be another collection of independent and identically distributed random variables. Also, let $N$ be a positive, integer-valued, random variable, independent of the $X_i$'s and of the $Y_i$'s. If $X_1 \le_{\mathrm{ew}} Y_1$, then
\[
\max\{X_1, X_2, \dots, X_N\} \le_{\mathrm{ew}} \max\{Y_1, Y_2, \dots, Y_N\} .
\]

The following result may be compared to Theorem 3.B.31. By (3.C.9), we assume below less than is assumed in Theorem 3.B.31, but the conclusion is weaker. We use below the notation for spacings that was used in Theorem 3.B.31.

Theorem 3.C.12. Let $X$ and $Y$ be two random variables. If $X \le_{\mathrm{ew}} Y$, then
\[
EU_{(n-1:n)} \le EV_{(n-1:n)} \quad\text{for } n = 2, 3, \dots .
\]

The order $\le_{\mathrm{ew}}$ can be used to characterize DMRL and IMRL random variables. The following result may be compared with Theorems 2.A.23, 2.B.17, 3.A.56, and 4.A.51. As in Section 1.A.3, $[Z \mid A]$ denotes any random variable that has as its distribution the conditional distribution of $Z$ given $A$.

Theorem 3.C.13. Let $X$ be a continuous random variable with a finite left endpoint of its support $l_X$. Then $X$ is DMRL [IMRL] if, and only if, any one of the following equivalent conditions holds:
(i) $[X - t \mid X > t] \ge_{\mathrm{ew}} [\le_{\mathrm{ew}}]\ [X - t' \mid X > t']$ whenever $t' \ge t \ge l_X$.
(ii) $X \ge_{\mathrm{ew}} [\le_{\mathrm{ew}}]\ [X - t \mid X > t]$ for all $t \ge l_X$ (when $l_X = 0$).

The proof of this result is omitted.

3.D The Peakedness Order

3.D.1 Definition

In this section we discuss a variability order that applies to random variables with symmetric distribution functions. This is one of the oldest (if not the oldest) variability notions that can be found in the literature. It stochastically compares random variables according to their distance from their center of symmetry.

Let $X$ be a random variable with a distribution function that is symmetric about $\mu$, and let $Y$ be another random variable with a distribution function that is symmetric about $\nu$. Suppose that
\[
|X - \mu| \le_{\mathrm{st}} |Y - \nu| .
\]
Then $X$ is said to be smaller than $Y$ in the peakedness order (denoted by $X \le_{\mathrm{peak}} Y$). Note that, in the literature, often $X$ is said to be more peaked about $\mu$ than $Y$ about $\nu$ if $X \le_{\mathrm{peak}} Y$.

The following result is easy to prove.

Theorem 3.D.1. Let $X$ and $Y$ be two random variables with different distribution functions, but with the same mean. Suppose that the distribution functions $F$ and $G$, of $X$ and $Y$, respectively, are symmetric about the common mean. Then $X \le_{\mathrm{peak}} Y$ if, and only if,
\[
S^{-}(G - F) = 1 \quad\text{and the sign sequence is } +, -,
\]
where $S^{-}$ is defined in (1.A.18).

3.D.2 Some properties

The peakedness order satisfies some desirable closure properties. For example, it is easy to verify the following result.

Theorem 3.D.2. Let $X$ be a random variable with a symmetric distribution function. Then
\[
X \le_{\mathrm{peak}} aX \quad\text{whenever } a \ge 1 .
\]

The closure results in the next theorem can also be easily verified.

Theorem 3.D.3. (a) Let $X$, $Y$, and $\Theta$ be random variables such that the distribution functions of $[X \mid \Theta = \theta]$ are symmetric about some $\mu$ (which is independent of $\theta$) and the distribution functions of $[Y \mid \Theta = \theta]$ are symmetric about some $\nu$ (which is also independent of $\theta$) and such that $[X \mid \Theta = \theta] \le_{\mathrm{peak}} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Then $X \le_{\mathrm{peak}} Y$. That is, the peakedness order is closed under mixtures.
(b) Let $\{X_j, j = 1, 2, \dots\}$ and $\{Y_j, j = 1, 2, \dots\}$ be two sequences of random variables with symmetric distribution functions such that $X_j \to_{\mathrm{st}} X$ and $Y_j \to_{\mathrm{st}} Y$ as $j \to \infty$. If $X_j \le_{\mathrm{peak}} Y_j$, $j = 1, 2, \dots$, then $X \le_{\mathrm{peak}} Y$.

The peakedness order is also closed under convolutions of random variables that have unimodal symmetric distribution functions (that is, with mode at their center of symmetry). This is shown next.


Theorem 3.D.4. Let $X_1, X_2, \dots, X_n$ and $Y_1, Y_2, \dots, Y_n$ be two sets of independent random variables, all having distribution functions that are symmetric about possibly different centers, and all having unimodal densities with possibly some probability mass at their respective centers. If $X_i \le_{\mathrm{peak}} Y_i$ for $i = 1, 2, \dots, n$, then
\[
\sum_{i=1}^n X_i \le_{\mathrm{peak}} \sum_{i=1}^n Y_i .
\]
In particular, $\bar X \le_{\mathrm{peak}} \bar Y$, where $\bar X$ and $\bar Y$ denote the corresponding sample means.

Proof. Without loss of generality we may assume that all the centers of the $X_i$'s and of the $Y_i$'s are 0. First we prove the result for $n = 2$. Let $F_1$, $F_2$, $G_1$, and $G_2$ denote the distribution functions of $X_1$, $X_2$, $Y_1$, and $Y_2$, respectively. Select an $a > 0$. Then
\[
\begin{aligned}
P\{|X_1 + X_2| \le a\}
&= 2\int_0^{\infty} [F_1(x+a) - F_1(x-a)]\,dF_2(x) \\
&\ge 2\int_0^{\infty} [F_1(x+a) - F_1(x-a)]\,dG_2(x) \\
&= 2\int_0^{\infty} [G_2(x+a) - G_2(x-a)]\,dF_1(x) \\
&\ge 2\int_0^{\infty} [G_2(x+a) - G_2(x-a)]\,dG_1(x) \\
&= P\{|Y_1 + Y_2| \le a\},
\end{aligned}
\]
where the first inequality follows from the unimodality of $X_1$ (therefore, the integrand is decreasing in $x \ge 0$) and from $X_2 \le_{\mathrm{peak}} Y_2$, and the second inequality follows from the unimodality of $Y_2$ and from $X_1 \le_{\mathrm{peak}} Y_1$. This proves the result for $n = 2$. The general result can be obtained by a simple induction together with the observation that a sum of independent random variables, all having distribution functions that are symmetric about 0 and all having unimodal densities, also has a unimodal density symmetric about 0.
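The chain of inequalities in the proof can be traced numerically in a simple case. The following Python sketch is an illustration under assumptions made here, not part of the text: $X_1, X_2$ are taken to be uniform on $[-1, 1]$ and $Y_1, Y_2$ uniform on $[-2, 2]$ (so that $X_i \le_{\mathrm{peak}} Y_i$), the function names are hypothetical, and the integrals are approximated by the midpoint rule.

```python
def uniform_cdf(x, c):
    """Distribution function of a uniform random variable on [-c, c]."""
    return min(1.0, max(0.0, (x + c) / (2.0 * c)))

def prob_sum_within(a, c1, c2, n=4000):
    """Midpoint-rule approximation of
    2 * int_0^inf [F1(x + a) - F1(x - a)] dF2(x),
    where F1 is uniform on [-c1, c1] and F2 is uniform on [-c2, c2]."""
    h = c2 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += uniform_cdf(x + a, c1) - uniform_cdf(x - a, c1)
    # dF2(x) = dx / (2 * c2) on (0, c2), and 0 beyond c2.
    return 2.0 * total * h / (2.0 * c2)

# Hypothetical half-widths: X1, X2 uniform on [-1, 1]; Y1, Y2 uniform on [-2, 2].
A_GRID = [0.25 * k for k in range(1, 17)]

# First and last inequality of the chain; the small tolerance absorbs rounding
# where both sides equal 1.
chain_holds = all(
    prob_sum_within(a, 1.0, 1.0) >= prob_sum_within(a, 1.0, 2.0) - 1e-9
    and prob_sum_within(a, 1.0, 2.0) >= prob_sum_within(a, 2.0, 2.0) - 1e-9
    for a in A_GRID
)
```

Here `prob_sum_within(a, 1.0, 2.0)` plays the role of the middle terms of the chain: by the interchange used in the proof, it agrees with `prob_sum_within(a, 2.0, 1.0)`.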

If $X_1, X_2, \dots$ are independent and identically distributed random variables, then, for each $n$, we denote by $\bar X_n$ the sample mean of $X_1, X_2, \dots, X_n$. That is, $\bar X_n = (X_1 + X_2 + \cdots + X_n)/n$. In Example 3.A.29 it is shown that $\bar X_n \le_{\mathrm{cx}} \bar X_{n-1}$. The following result shows that a similar property holds for the peakedness order under an additional condition.

Theorem 3.D.5. If $X_1, X_2, \dots$ are independent and identically distributed random variables, having a common logconcave density function that is symmetric about a common value, then for each $n \ge 2$ one has
\[
\bar X_n \le_{\mathrm{peak}} \bar X_{n-1} .
\]


A relationship between the dispersive and the peakedness orders is described next.

Theorem 3.D.6. Let $X$ and $Y$ be two random variables having distribution functions that are symmetric about possibly different centers. If $X \le_{\mathrm{disp}} Y$, then $X \le_{\mathrm{peak}} Y$.
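A simple special case may help fix ideas. For normal random variables with standard deviations $\sigma_X \le \sigma_Y$ (and arbitrary means), the quantile functions differ by a factor of the standard deviations, so $X \le_{\mathrm{disp}} Y$, and Theorem 3.D.6 then gives $X \le_{\mathrm{peak}} Y$. The following Python sketch checks the defining inequality of the peakedness order on a grid; the normal family, the function names, and the grid are assumptions of this illustration.

```python
import math

def prob_within(z, sigma):
    """P{|X - mu| <= z} for a normal random variable X with mean mu
    and standard deviation sigma."""
    return math.erf(z / (sigma * math.sqrt(2.0)))

def peak_leq(sigma_x, sigma_y, z_grid):
    """Check |X - mu| <=_st |Y - nu| on a grid, that is,
    P{|X - mu| <= z} >= P{|Y - nu| <= z} for every z in z_grid."""
    return all(prob_within(z, sigma_x) >= prob_within(z, sigma_y) for z in z_grid)

Z_GRID = [0.05 * k for k in range(1, 101)]
smaller_sigma_is_more_peaked = peak_leq(1.0, 2.0, Z_GRID)
reverse_fails = not peak_leq(2.0, 1.0, Z_GRID)
```

Because the check depends only on the standard deviations, it also illustrates that the centers of symmetry play no role in the comparison.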

3.E Complements

Section 3.A: For historical reasons, the convex order is sometimes referred to as "dilation." However, in recent literature the order defined in (3.A.32) is often called the dilation order. Some standard references about the convex order are Ross [475] and Müller and Stoyan [419], where many of the results described in Section 3.A can be found. Another monograph that studies the convex order (under the guise of the Lorenz order) is Arnold [19], and many of the results in this section that deal directly with the Lorenz order can be found there. The proof of Theorem 3.A.2 is taken from Muñoz-Perez and Sanches-Gomez [421]; an alternative proof, using ideas from the area of comparison of experiments, can be found in Torgersen [551, page 369]. Result (3.A.12) is taken from Hickey [223]. The present version of the characterization of the convex order given in Theorem 3.A.4 is taken from Müller and Rüschendorf [415]. The characterization of the convex order in Theorem 3.A.5 can be found in Fagiuoli, Pellerey, and Shaked [188]; see also Levy and Kroll [346] and Ramos and Sordo [463]. The characterization of the convex order by means of Yaari functionals (Theorem 3.A.7) can be found in Chateauneuf, Cohen, and Meilijson [127]. The characterization of the dilation order, given in Theorem 3.A.8, is taken from Fagiuoli, Pellerey, and Shaked [188]. The characterization given in Theorem 3.A.9 can be found in Ramos and Sordo [463]. The characterization of the Lorenz order by means of the Lorenz zonoids (Theorem 3.A.11) is taken from Arnold [20]. The result about the convex ordering of random sums (Theorem 3.A.13) is a special case of a result of Jean-Marie and Liu [254]; the extensions of it when the underlying random variables are identically distributed (Theorems 3.A.14–3.A.16) are taken from Pellerey [450]. Theorem 3.A.17 and some related results can be found in Berger [79]. The property of the increase in the dilation order with an increase in the scale (Theorem 3.A.18) is taken from Hickey [223]. The result about the dilation ordering of two differences (Theorem 3.A.19) can be found in Kochar and Carrière [312]. The convex order lower bound on $\sum_{i=1}^n X_i$, given in Theorem 3.A.20, is taken from Vyncke, Goovaerts, De Schepper, Kaas, and Dhaene [557]. The property of inheritance of the convex order from the mixing random variables to the mixed ones (Theorem 3.A.21) can be found in Schweder [499]; its variation, Theorem 3.A.23, is taken from Kottas and Gelfand [323]. The property of the preservation of the convex


order under products of nonnegative random variables (Corollary 3.A.22) can be found in Whitt [562]. The Lorenz order comparison of $g(X)$ and $h(X)$ (Theorem 3.A.26) can be found in Wilfling [566]. The relationship between the orders $\le_{\mathrm{Lorenz}}$ and $\le_{\mathrm{hmrl}}$, given in Theorem 3.A.28, is taken from Lefèvre and Utev [340]. The result about the convex ordering of the sample means (Example 3.A.29) can be found in Marshall and Olkin [383, page 288]. Its generalizations (Theorem 3.A.30 and Corollary 3.A.31) are taken from Denuit and Vermandele [158] and from O'Cinneide [439]. The convex order comparison of scaled Poisson random variables (Example 3.A.32) is inspired by a result at the end of page 1078 in Bäuerle [60]. The convex order comparison in Theorem 3.A.33 can be found in O'Cinneide [439]. The closure property (3.A.50) of the dilation order can be found in Muñoz-Perez and Sanches-Gomez [421]. The majorization result (Theorem 3.A.35) is a special case of a result of Marshall and Proschan [384]; related results can be found in Ma [375]. The preservation of the convex order under linear convex combinations (Theorem 3.A.36) is taken from Pellerey [452]. The convex order comparison of sums of random variables with random coefficients (Theorem 3.A.37) can be found in Ma [375]. The particular case of it, given in Example 3.A.38, is a result of Karlin and Novikoff [277]; Marshall and Olkin [383, Section 15.E] obtained a generalization of this special case which is different from the result in Theorem 3.A.37. The convex order comparison of sums of positively [respectively, negatively] associated random variables, and independent random variables, given in Theorem 3.A.39, can be found in Denuit, Dhaene, and Ribas [143] [respectively, Shao [535]]; see also Boutsikas and Vaggelatou [107]. The Laplace transform characterization of the order $\le_{\mathrm{cx}}$ (Theorem 3.A.40) is taken from Shaked and Wong [524]; see also Kan and Yi [274]. The convex order comparison of posterior means, in the context of statistical experiments (Example 3.A.41), is essentially taken from Baker [30]. The condition for stochastic equality of $\le_{\mathrm{cx}}$-ordered random variables (Theorem 3.A.42) is a special case of a result by Denuit, Lefèvre, and Shaked [151], whereas its generalization (Theorem 3.A.43) has been motivated by a result in Bhattacharjee and Bhattacharya [87]; see also Huang and Lin [249]. The result that gives sufficient conditions for the convex order by means of the number of crossings of the underlying densities or distribution functions (Theorem 3.A.44) is taken from Shaked [502], but its origins may be found in Karlin and Novikoff [277], if not before. A proof of the characterization of the convex order by means of the number of crossings of two distribution functions (Theorem 3.A.45) can be found in Müller [407]; similar results are given in Borglin and Keiding [106]. The convex order comparison of normalized Bernoulli random variables (Example 3.A.48) can be found in Makowski [379]. The necessary and sufficient conditions for the comparison of normal random variables (Example 3.A.51) are taken from Müller [413]. The relations $\le_{\mathrm{uv}}$ and $\le_{\mathrm{lc}}$ were introduced in Whitt [564] as means to identify the order $\le_{\mathrm{cx}}$. The


characterization of DMRL and IMRL random variables by means of the convex order (Theorem 3.A.56) is taken from Belzunce, Candel, and Ruiz [64]. The relationships between the orders ≤dil and ≤mrl that are described in Theorems 3.A.57 and 3.A.58 can be found in Belzunce, Pellerey, Ruiz, and Shaked [72] where further related results can also be found. The results on the m-convex order (Section 3.A.5) are mostly taken from Denuit, Lef`evre, and Shaked [151]. The condition that implies the stochastic equality of ≤Sm-cx -ordered random variables (Theorem 3.A.60) is taken from Denuit, Lef`evre, and Shaked [152]. The method for deriving the distributions of the stochastic extrema in Bm ([0, b]; µ1 , µ2 , . . . , µm−1 ) is taken from Denuit, De Vylder, and Lef`evre [142]. The stochastic comparisons of the Gamma, inverse Gaussian, and lognormal random variables (Example 3.A.67) are taken from Kaas and Hesselager [270]. Tables 3.A.1 and 3.A.2 can be found in Denuit, Lef`evre, and Shaked [153, 154]. Theorem 3.A.63 can be found in Denuit, Lef`evre, and Utev [155]. The result about the 2m-cx ordering of two diﬀerences (Theorem 3.A.64) is taken from Bassan, Denuit, and Scarsini [52]. Denuit and Lef`evre [146], Denuit, Lef`evre, and Utev [156], and Denuit, Lef`evre, and Mesﬁoui [149] studied discrete analogs of the m-convex order; in particular they obtained some analogs of the results in Section 3.A.5 for arithmetic random variables, as well as some speciﬁc results for the discrete case. Denuit, Lef`evre, and Utev [155] extended the m-convex order to Tchebycheﬀ-type orders; see also Lynch [367]. Bhattacharjee [85] studied the order ≤cx under the restriction that the compared random variables are discrete. Metzger and R¨ uschendorf [393] studied variability orderings, which are related to ≤uv and ≤lc , deﬁned by requiring the ratio of the distribution functions F/G or of the survival functions F /G to be unimodal. 
For example, they showed that if X and Y are two random variables with distribution functions F and G, respectively, such that supp(X) ⊆ supp(Y), and if X ≤uv Y, then F/G is unimodal. They also considered the order defined by requiring the ratio of a shifted density to another density f(· + a)/g(·) to be unimodal for all a. This order is to be compared with the order ≤uv and also with the order ≤lr↑ studied in Section 1.C.4. Müller [412] considered an order that is defined by requiring (3.A.1) to hold for all so-called (a, b)-concave functions. Other related stochastic orders can be found in Müller [412] as well. An order which is related to the Lorenz order is studied in Zenga [576].

Section 3.B: Doksum [169] studied some properties of the dispersive order by stipulating (3.B.10) and calling it the “tail-order” (see Deshpande and Kochar [159] for further early references in which this order is studied). A basic paper on the dispersive order is Shaked [503] where many of the equivalent conditions described in Section 3.B.1 can be found. The conditions (3.B.3), (3.B.4), and (3.B.6) are taken, respectively, from Saunders [489], Hickey [223], and Muñoz-Perez [420]. Another characterization of the order ≤disp, which is related to (3.B.3), is given in Burger [115]. The observation (3.B.15) has been noted in Müller and Stoyan [419]. The characterization of the dispersive order by means of the observed total time on test random variables (Theorem 3.B.1) can be found in Bartoszewicz [42]; other related results can be found in Bartoszewicz [39, 42]. The notion of Q-addition was introduced in Muñoz-Perez [420]. The characterization of the dispersive order given in Theorem 3.B.2 is taken from Landsberger and Meilijson [330]. The characterization of the dispersive order by means of Yaari functionals (Theorem 3.B.3) can be found in Chateauneuf, Cohen, and Meilijson [127].

The properties described in Section 3.B.2 have been collected from many sources. The result of Theorem 3.B.7 can be found in Droste and Wefelmeyer [171]. Several versions of Theorem 3.B.8 can be found in Lewis and Thompson [347] and in Lynch, Mimmack, and Proschan [368]. Some versions of Theorem 3.B.10 can be found in Bartoszewicz [37] and in Rojo and He [472]. Some related results appear in Hickey [223]; for example, his Theorem 4 can be obtained from (3.B.21) applied to the decreasing convex case. Theorems 3.B.14 and 3.B.15 are also taken from that paper. The relationship between the orders ≤disp and ≤conv, given in (3.B.26), was noted in Shaked and Suarez-Llorens [520]. The sufficient condition for the dispersive order by means of comparison of shifted hazard rate functions (Theorem 3.B.18) can be found in Belzunce, Lillo, Ruiz, and Shaked [69]. Theorem 3.B.19 has been proved in Mailhot [377], whereas Theorem 3.B.20 combines results from Bartoszewicz [38, 40] and Bagai and Kochar [29].
The relationships between the orders ≤disp and ≤mrl , given in Theorem 3.B.21, can be found in Bartoszewicz [44]. The result about the dispersive ordering of order statistics of DFR random variables (Example 3.B.22) is taken from Kochar [308]; some other related results can also be found there. The results about the dispersive ordering of the spacings of DFR random variables (Example 3.B.23) are taken from Kochar and Kirmani [313] and from Khaledi and Kochar [285]; an extension of these results can be found in Belzunce, Hu, and Khaledi [68]. The characterizations of IFR and DFR random variables by means of the dispersive order (Theorems 3.B.24 and 3.B.25) have been derived by Belzunce, Candel, and Ruiz [64], and by Pellerey and Shaked [456]. The results on the dispersive order comparisons of order statistics and spacings (Theorems 3.B.26, 3.B.28, 3.B.29, and 3.B.31) can be found in Bartoszewicz [39], in Khaledi and Kochar [286], and in Oja [440], whereas Example 3.B.30 is mentioned in Kleiber [303]; related results can be found in Alzaid and Proschan [14], in Belzunce, Hu, and Khaledi [68], in Belzunce, Mercader, and Ruiz [70], and in Hu and Zhuang [247]. An extension of Theorem 3.B.26 to order statistics from samples with random size can be found in Nanda, Misra, Paul, and Singh [427].


The dispersive order comparisons of maxima of heterogeneous exponential random variables (Example 3.B.32) are taken from Dykstra, Kochar, and Rojo [174] and from Khaledi and Kochar [287], whereas the comparison of the spacings (Example 3.B.33) is taken from Kochar and Korwar [314]. The comparison of sums of heterogeneous exponential random variables (Example 3.B.34) can be found in Kochar and Ma [317]. The comparisons of sums of uniform and Gamma random variables (Examples 3.B.35 and 3.B.36) are slightly weaker than results that are given in Khaledi and Kochar [288, 289]. The result about the dispersive order comparison of the successive epochs of a nonhomogeneous Poisson process (Example 3.B.37) is given in Kochar [310], though it is stated by means of the dispersive order comparison of successive record values of a sequence of independent and identically distributed random variables with a common DFR distribution function. The dispersive order comparison of epoch times of nonhomogeneous Poisson processes (Example 3.B.38) can be found in Belzunce, Lillo, Ruiz, and Shaked [69] and in Yue and Cao [575]. The results about the dispersive order comparisons of random minima and maxima (Example 3.B.39) are taken from Shaked and Wong [526]; a simple proof of these results is given in Bartoszewicz [49]. The comparison of t-distributed random variables (Example 3.B.40) can be found in Arias-Nicolás, Fernández-Ponce, Luque-Calvo, and Suárez-Llorens [17], whereas the comparison of weighted random variables (Example 3.B.41) can be found in Bartoszewicz and Skolimowska [51]. Finally, the result of Theorem 3.B.42 has been derived by Giovagnoli and Wynn [211] in order to motivate a definition of multivariate dispersive order (see Section 7.B); Theorem 3.B.42 was also obtained by Kusum, Kochar, and Deshpande [327] who actually derived it for logarithms of positive random variables.
Fernández-Ponce and Suárez-Llorens [197] introduced a “weakly dispersive” order by requiring that, corresponding to every interval of length ε in the support of the “larger” variable, there exists an interval of the same length in the support of the “smaller” variable, such that the probability mass of the latter with respect to the distribution of the “smaller” variable is at least as large as the probability mass of the former with respect to the distribution of the “larger” variable. Belzunce, Hu, and Khaledi [68] studied an order, which they denoted by ≤disp-hr, that is stronger than the order ≤disp.

Condition (3.B.1) can be written as

\[
F^{-1}(\beta) - F^{-1}(\alpha) \le M\,[G^{-1}(\beta) - G^{-1}(\alpha)] \quad \text{whenever } 0 < \alpha < \beta < 1,
\]

where M = 1. Lehmann [344] considered this condition for other possible values of M in order to compare the tails of F and G. Burger [115] studied, among other things, the above condition (with M = 1), but only for α and β such that $0 < \alpha < G^{-1}(\mu) < \beta < 1$, where µ is some constant. Rojo [471] studied the above condition with M = ∞ in the sense

\[
\limsup_{u \to 1} \frac{F^{-1}(u)}{G^{-1}(u)} < \infty,
\]

and Bartoszewicz [43] obtained comparison results, with respect to the latter order, for the observed total time on test random variables Xttt and Yttt, with distribution functions as defined in (1.A.19).

Section 3.C: Most of the results, about the excess wealth order, that are described in this section are taken from Shaked and Shanthikumar [518], Fagiuoli, Pellerey, and Shaked [188], and Kochar, Li, and Shaked [316]. Fernandez-Ponce, Kochar, and Muñoz-Perez [195] also studied the excess wealth order by the name of the right spread order. The characterization of the excess wealth order given in (3.C.2) is taken from Chateauneuf, Cohen, and Meilijson [127]; the characterization of the excess wealth order by means of Yaari functionals (Theorem 3.C.1) can be found in that paper as well. The characterization of the excess wealth order given in Theorem 3.C.2 is a translation of the definition of the order ≤lir into the order ≤ew, which can be done by virtue of Lemma 3.1 of Fagiuoli, Pellerey, and Shaked [188]. The characterization of the excess wealth order by means of the number of crossings of two distribution functions (Theorem 3.C.3) can be obtained in a similar manner from a correction by Müller [410] of Theorem 1 in Landsberger and Meilijson [330]. The conditions for the preservation of the order ≤ew under convolutions (Theorems 3.C.7–3.C.10) can essentially all be found in Hu, Chen, and Yao [231]. The result about the preservation of the excess wealth order under random maxima (Theorem 3.C.11) is taken from Li and Zuo [358], and the result that compares the expected values of the extreme spacings (Theorem 3.C.12) is a special case of a result of Li [353]. The characterization of DMRL and IMRL random variables by the order ≤ew (Theorem 3.C.13) is taken from Belzunce [63]. Belzunce, Hu, and Khaledi [68] studied a stochastic order, denoted by ≤disp-mrl, which is stronger than the order ≤ew.

Section 3.D: The peakedness order was introduced by Birnbaum [90].
The characterization of this order, given in Theorem 3.D.1, was observed in Kottas and Gelfand [323]. Theorem 3.D.4 was essentially proven by Birnbaum [90]; the proof given here is adopted from Bickel and Lehmann [89]. The result about the monotonicity of the sample means in the sense of the peakedness order (Theorem 3.D.5) is given in Proschan [461]; an extension of Theorem 3.D.5 can be found in Ma [372]. The relationship between the dispersive and the peakedness orders, given in Theorem 3.D.6, was observed in Shaked [503].

4 Univariate Monotone Convex and Related Orders

In Chapter 1 we studied orders that compare random variables according to their “magnitude”. In Chapter 3 the studied orders compare random variables according to their “variability”. The orders that are discussed in this chapter compare random variables according to both their “location” and their “spread”. The most important and common orders that are studied in this chapter are the increasing convex and the increasing concave orders. Also the transform orders that are studied here, that is, the convex, the star, and the superadditive orders, are of interest in many theoretical and practical applications. In addition, some other related orders are investigated in this chapter as well.

4.A The Monotone Convex and Monotone Concave Orders

4.A.1 Definitions and equivalent conditions

Let X and Y be two random variables such that

\[
E[\phi(X)] \le E[\phi(Y)] \quad \text{for all increasing convex [concave] functions } \phi : \mathbb{R} \to \mathbb{R}, \tag{4.A.1}
\]

provided the expectations exist. Then X is said to be smaller than Y in the increasing convex [concave] order (denoted by $X \le_{\mathrm{icx}} Y$ [$X \le_{\mathrm{icv}} Y$]). Roughly speaking, if $X \le_{\mathrm{icx}} Y$, then X is both “smaller” and “less variable” than Y in some stochastic sense. Similarly, if $X \le_{\mathrm{icv}} Y$, then X is both “smaller” and “more variable” than Y in some stochastic sense.

One can also define a decreasing convex [concave] order by requiring (4.A.1) to hold for all decreasing convex [concave] functions φ (denoted by $X \le_{\mathrm{dcx}} [\le_{\mathrm{dcv}}]\, Y$). The terms “decreasing convex” and “decreasing concave” are counterintuitive in the sense that if X is smaller than Y in the sense of either of these two orders, then X is “larger” than Y in some stochastic sense. These orders can be easily characterized using the orders $\le_{\mathrm{icx}}$ and $\le_{\mathrm{icv}}$. Therefore, it is not necessary to have a separate discussion for these orders.

In analogy with Theorem 3.A.12(a), the orders $\le_{\mathrm{icx}}$ and $\le_{\mathrm{icv}}$ are related to each other as follows.

Theorem 4.A.1. Let X and Y be two random variables. Then

\[
X \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\; Y \iff -X \ge_{\mathrm{icv}} [\ge_{\mathrm{icx}}]\; -Y.
\]

The proof of Theorem 4.A.1 is based on the fact that φ(x) is increasing and convex in x if, and only if, −φ(−x) is increasing and concave in x. We omit the straightforward details.

Note that the function φ, defined by φ(x) = x, is increasing and is both convex and concave. Therefore, from (4.A.1) it follows that

\[
X \le_{\mathrm{icx}} Y \implies E[X] \le E[Y] \tag{4.A.2}
\]

and that

\[
X \le_{\mathrm{icv}} Y \implies E[X] \le E[Y], \tag{4.A.3}
\]

provided the expectations exist.

Let $\overline{F}$ [$F$] and $\overline{G}$ [$G$] be the survival [distribution] functions of X and Y, respectively. For a fixed a, the function $\phi_a$, defined by $\phi_a(x) = (x - a)_+$, is increasing and convex. Therefore, if $X \le_{\mathrm{icx}} Y$, then

\[
E[(X - a)_+] \le E[(Y - a)_+] \quad \text{for all } a, \tag{4.A.4}
\]

provided the expectations exist. Alternatively, using a simple integration by parts, it is seen that (4.A.4) can be rewritten as

\[
\int_x^{\infty} \overline{F}(u)\,du \le \int_x^{\infty} \overline{G}(u)\,du \quad \text{for all } x, \tag{4.A.5}
\]

provided the integrals exist.

For any real number a let $a_-$ denote the negative part of a, that is, $a_- = a$ if $a \le 0$ and $a_- = 0$ if $a > 0$. For a fixed a, the function $\zeta_a$, defined by $\zeta_a(x) = (x - a)_-$, is increasing and concave. Therefore, if $X \le_{\mathrm{icv}} Y$, then

\[
E[(X - a)_-] \le E[(Y - a)_-] \quad \text{for all } a, \tag{4.A.6}
\]

provided the expectations exist. Alternatively, again using a simple integration by parts, it is seen that (4.A.6) can be rewritten as

\[
\int_{-\infty}^{x} F(u)\,du \ge \int_{-\infty}^{x} G(u)\,du \quad \text{for all } x, \tag{4.A.7}
\]

provided the integrals exist.
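Condition (4.A.4) lends itself to a direct numerical check when both distributions are finite. The following is a minimal sketch, not part of the original text: the names `stop_loss` and `icx_leq` and the two example distributions are hypothetical, chosen only for illustration. It relies on the fact that, for finite distributions, both sides of (4.A.4) are piecewise linear in a with kinks only at the atoms, so comparing at the atoms is enough.

```python
from fractions import Fraction


def stop_loss(pmf, a):
    """E[(X - a)_+] for a finite distribution given as {value: probability}."""
    return sum(p * max(x - a, 0) for x, p in pmf.items())


def icx_leq(pmf_x, pmf_y):
    """Check (4.A.4) for two finite distributions.

    Both sides are piecewise linear in a, with kinks only at the atoms,
    so it suffices to compare them at the atoms of X and Y.
    """
    atoms = sorted(set(pmf_x) | set(pmf_y))
    return all(stop_loss(pmf_x, a) <= stop_loss(pmf_y, a) for a in atoms)


half = Fraction(1, 2)
pmf_x = {1: half, 2: half}  # hypothetical X, uniform on {1, 2}
pmf_y = {0: half, 4: half}  # hypothetical Y, uniform on {0, 4}

x_icx_y = icx_leq(pmf_x, pmf_y)  # (4.A.4) holds in this direction
y_icx_x = icx_leq(pmf_y, pmf_x)  # and fails in the other
```

In this example X and Y are not ordered in the usual stochastic order, since P{X > 1/2} = 1 > 1/2 = P{Y > 1/2}, yet (4.A.4) holds for every a.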


In fact (4.A.5) [(4.A.7)] is equivalent to $X \le_{\mathrm{icx}} Y$ [$X \le_{\mathrm{icv}} Y$]. To see it, note that every increasing convex [concave] function can be approximated by (that is, is a limit of) positive linear combinations of the functions $\phi_a$ [$\zeta_a$], for various choices of a. By (4.A.5), $E[\phi_a(X)] \le E[\phi_a(Y)]$ for all a, and this fact implies (4.A.1) in the increasing convex case. Similarly, by (4.A.7), $E[\zeta_a(X)] \le E[\zeta_a(Y)]$ for all a, and this fact implies (4.A.1) in the increasing concave case. We thus have proved the following result.

Theorem 4.A.2. Let X and Y be two random variables. Then $X \le_{\mathrm{icx}} Y$ [$X \le_{\mathrm{icv}} Y$] if, and only if, (4.A.5) [(4.A.7)] holds.

The next two results give further characterizations of the order $\le_{\mathrm{icx}}$. The first one is an analog of Theorem 3.A.5.

Theorem 4.A.3. Let X and Y be two random variables with distribution functions F and G, respectively. Then $X \le_{\mathrm{icx}} Y$ if, and only if,

\[
\int_p^1 F^{-1}(u)\,du \le \int_p^1 G^{-1}(u)\,du \quad \text{for all } p \in [0, 1].
\]
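For two samples of the same size n, each observation carrying weight 1/n, the integral in Theorem 4.A.3 evaluated at p = 1 − k/n is 1/n times the sum of the k largest observations, and both sides are linear in p between these grid points. Under that assumption the condition reduces to comparing partial sums of the decreasingly ordered samples. The sketch below illustrates this; the function names and the sample values are hypothetical.

```python
def upper_tail_sums(sample):
    """Sums of the k largest observations, for k = 1, ..., n."""
    sums, total = [], 0
    for value in sorted(sample, reverse=True):
        total += value
        sums.append(total)
    return sums


def icx_leq_equal_size(xs, ys):
    """Quantile-integral condition of Theorem 4.A.3 for two empirical
    distributions built from samples of equal size."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return all(sx <= sy
               for sx, sy in zip(upper_tail_sums(xs), upper_tail_sums(ys)))


xs = [1, 2, 1, 2]  # hypothetical sample from X
ys = [0, 4, 0, 4]  # hypothetical sample from Y

x_below_y = icx_leq_equal_size(xs, ys)
y_below_x = icx_leq_equal_size(ys, xs)
```

The restriction to equal sample sizes is a simplification made here so that the two quantile functions share the same grid; it is not required by the theorem.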

Theorem 4.A.4. Let X and Y be two random variables with distribution functions F and G, respectively. Then $X \le_{\mathrm{icx}} Y$ if, and only if,

\[
\int_0^1 F^{-1}(u)\,d\phi(u) \le \int_0^1 G^{-1}(u)\,d\phi(u)
\]

for all increasing convex functions $\phi : [0, 1] \to \mathbb{R}$.

Another necessary and sufficient condition for $X \le_{\mathrm{icx}} Y$ is the following:

\[
F^{-1}(p) + \frac{1}{1-p} \int_{F^{-1}(p)}^{\infty} \overline{F}(x)\,dx \le G^{-1}(p) + \frac{1}{1-p} \int_{G^{-1}(p)}^{\infty} \overline{G}(x)\,dx, \quad p \in (0, 1). \tag{4.A.8}
\]
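As a worked illustration of (4.A.8), under the assumption (made here only for the example) that X is exponential with rate µ, so that $\overline{F}(x) = e^{-\mu x}$ for $x \ge 0$ and $F^{-1}(p) = -\ln(1-p)/\mu$, each side of (4.A.8) can be computed in closed form:

\[
\int_{F^{-1}(p)}^{\infty} \overline{F}(x)\,dx = \frac{e^{-\mu F^{-1}(p)}}{\mu} = \frac{1-p}{\mu},
\qquad
F^{-1}(p) + \frac{1}{1-p} \int_{F^{-1}(p)}^{\infty} \overline{F}(x)\,dx = \frac{1 - \ln(1-p)}{\mu}.
\]

If Y is exponential with rate ν, the right-hand side of (4.A.8) is $(1 - \ln(1-p))/\nu$. Since $1 - \ln(1-p) > 0$ on (0, 1), condition (4.A.8) holds for all p in this case exactly when $1/\mu \le 1/\nu$, that is, when $EX \le EY$.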

Condition (4.A.8) may be compared with (3.C.1); see also Corollary 4.A.32.

An important characterization of the increasing convex and the increasing concave orders by construction on the same probability space is stated next.

Theorem 4.A.5. Two random variables X and Y satisfy $X \le_{\mathrm{icx}} Y$ [$X \le_{\mathrm{icv}} Y$] if, and only if, there exist two random variables $\hat{X}$ and $\hat{Y}$, defined on the same probability space, such that

\[
\hat{X} =_{\mathrm{st}} X, \qquad \hat{Y} =_{\mathrm{st}} Y,
\]

and $\{\hat{X}, \hat{Y}\}$ is a submartingale [$\{\hat{Y}, \hat{X}\}$ is a supermartingale], that is,

\[
E[\hat{Y} \mid \hat{X}] \ge \hat{X} \quad \bigl[E[\hat{X} \mid \hat{Y}] \le \hat{Y}\bigr] \quad \text{almost surely.} \tag{4.A.9}
\]

Furthermore, the random variables $\hat{X}$ and $\hat{Y}$ can be selected such that $[\hat{Y} \mid \hat{X} = x]$ [$[\hat{X} \mid \hat{Y} = x]$] is increasing in x in the usual stochastic order $\le_{\mathrm{st}}$.

The proof of this theorem is similar to the proof of Theorem 3.A.4. It is not easy to prove the constructive part of Theorem 4.A.5. However, it is easy to prove that if random variables $\hat{X}$ and $\hat{Y}$ as described in the theorem exist, then $X \le_{\mathrm{icx}} Y$ [$X \le_{\mathrm{icv}} Y$]. For example, if the first inequality in (4.A.9) holds and if φ is an increasing convex function, then, using Jensen’s Inequality,

\[
E[\phi(X)] = E[\phi(\hat{X})] \le E\{\phi(E[\hat{Y} \mid \hat{X}])\} \le E\{E[\phi(\hat{Y}) \mid \hat{X}]\} = E[\phi(\hat{Y})] = E[\phi(Y)],
\]

which is (4.A.1).

Theorem 4.A.6. (a) Two random variables X and Y satisfy $X \le_{\mathrm{icx}} Y$ if, and only if, there exists a random variable Z such that

\[
X \le_{\mathrm{st}} Z \le_{\mathrm{cx}} Y.
\]

(b) Two random variables X and Y satisfy $X \le_{\mathrm{icx}} Y$ if, and only if, there exists a random variable Z such that

\[
X \le_{\mathrm{cx}} Z \le_{\mathrm{st}} Y.
\]

(c) Two random variables X and Y satisfy $X \le_{\mathrm{icv}} Y$ if, and only if, there exists a random variable Z such that

\[
X \le_{\mathrm{cv}} Z \le_{\mathrm{st}} Y.
\]

(d) Two random variables X and Y satisfy $X \le_{\mathrm{icv}} Y$ if, and only if, there exists a random variable Z such that

\[
X \le_{\mathrm{st}} Z \le_{\mathrm{cv}} Y.
\]

Proof. First we prove part (a). It is obvious (see, for example, Theorem 4.A.34 below) that $X \le_{\mathrm{st}} Z \le_{\mathrm{cx}} Y \implies X \le_{\mathrm{icx}} Y$. So suppose that $X \le_{\mathrm{icx}} Y$. Let $\hat{X}$ and $\hat{Y}$ be defined on the same probability space, as in Theorem 4.A.5. Define $\hat{Z} = E[\hat{Y} \mid \hat{X}]$. It is seen that $E[\hat{Y} \mid \hat{Z}] = E[\hat{Y} \mid \hat{X}] = \hat{Z}$. Thus, by Theorem 3.A.4, $\hat{Z} \le_{\mathrm{cx}} \hat{Y}$. Also, by Theorem 4.A.5, $\hat{X} \le \hat{Z}$, and therefore, by Theorem 1.A.1, $\hat{X} \le_{\mathrm{st}} \hat{Z}$. Letting Z have the same distribution as $\hat{Z}$, we obtain the stated result.

Now we prove part (b). Again it is obvious that $X \le_{\mathrm{cx}} Z \le_{\mathrm{st}} Y \implies X \le_{\mathrm{icx}} Y$. So suppose that $X \le_{\mathrm{icx}} Y$. Let $\hat{X}$ and $\hat{Y}$ be defined on the same probability space, as in Theorem 4.A.5. Let $\hat{Z} = \hat{Y} + \hat{X} - E[\hat{Y} \mid \hat{X}]$. Then, by Theorem 4.A.5, $\hat{Z} \le \hat{Y}$, and therefore, by Theorem 1.A.1, $\hat{Z} \le_{\mathrm{st}} \hat{Y}$. Also, $E[\hat{Z} \mid \hat{X}] = \hat{X}$, and thus, by Theorem 3.A.4, $\hat{X} \le_{\mathrm{cx}} \hat{Z}$. Letting Z have the same distribution as $\hat{Z}$, we obtain the stated result.

Parts (c) and (d) can be proven similarly. Alternatively, using Theorem 4.A.1, part (c) can be obtained from part (a), and part (d) can be obtained from part (b).
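The construction in the proof of part (a) can be traced on a small example. The sketch below is an illustration only: the joint distribution `joint` of a submartingale pair and all helper names are hypothetical. It computes the conditional mean of the second coordinate given the first, and checks that this conditional mean dominates the first coordinate pointwise and is smaller than the second coordinate in the convex order. For finite distributions the convex order is tested here through equal means together with ordered stop-loss transforms at the atoms.

```python
from collections import defaultdict
from fractions import Fraction

quarter = Fraction(1, 4)
# Hypothetical joint pmf of a pair (x, y) with E[Y | X = x] >= x.
joint = {(0, -1): quarter, (0, 2): quarter, (2, 1): quarter, (2, 5): quarter}


def marginal(joint_pmf, index):
    """Marginal pmf of coordinate `index` (0 or 1) of the pair."""
    pmf = defaultdict(Fraction)
    for pair, p in joint_pmf.items():
        pmf[pair[index]] += p
    return dict(pmf)


def conditional_mean_of_y(joint_pmf):
    """Return {x: E[Y | X = x]}."""
    mass, weighted = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint_pmf.items():
        mass[x] += p
        weighted[x] += p * y
    return {x: weighted[x] / mass[x] for x in mass}


def mean(pmf):
    return sum(p * v for v, p in pmf.items())


def stop_loss(pmf, a):
    return sum(p * max(v - a, 0) for v, p in pmf.items())


def cx_leq(pmf_u, pmf_v):
    """Convex order for finite distributions: equal means and
    E[(U - a)_+] <= E[(V - a)_+] at every atom."""
    atoms = sorted(set(pmf_u) | set(pmf_v))
    return mean(pmf_u) == mean(pmf_v) and all(
        stop_loss(pmf_u, a) <= stop_loss(pmf_v, a) for a in atoms)


cond = conditional_mean_of_y(joint)
pmf_x, pmf_y = marginal(joint, 0), marginal(joint, 1)

# Distribution of Z = E[Y | X], a function of X.
pmf_z = defaultdict(Fraction)
for x, p in pmf_x.items():
    pmf_z[cond[x]] += p
pmf_z = dict(pmf_z)

x_below_z_pointwise = all(cond[x] >= x for x in pmf_x)  # gives X <=_st Z
z_cx_y = cx_leq(pmf_z, pmf_y)                           # Z <=_cx Y
```

In this example the second coordinate is not pointwise larger than the first (the pair (0, −1) has positive mass), so the intermediate variable Z is genuinely needed.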

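The following bivariate characterization of the orders $\le_{\mathrm{icx}}$ and $\le_{\mathrm{icv}}$ is analogous to Theorem 3.A.6. Its proof is similar to the proof of Theorem 3.A.6 and is therefore omitted. Define the following classes of bivariate functions:

\[
\mathcal{G}_{\mathrm{icx}} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) - \phi(y, x) \text{ is increasing and convex in } x \text{ for all } y\}
\]

and

\[
\mathcal{G}_{\mathrm{icv}} = \{\phi : \mathbb{R}^2 \to \mathbb{R} : \phi(x, y) - \phi(y, x) \text{ is increasing and concave in } x \text{ for all } y\}.
\]

Theorem 4.A.7. Let X and Y be independent random variables. Then $X \le_{\mathrm{icx}} Y$ [$X \le_{\mathrm{icv}} Y$] if, and only if,

\[
E[\phi(X, Y)] \le E[\phi(Y, X)] \quad \text{for all } \phi \in \mathcal{G}_{\mathrm{icx}}\ [\mathcal{G}_{\mathrm{icv}}].
\]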

Another characterization of the increasing convex order, by means of the number of sign changes of two distribution functions, is given in Theorem 4.A.23 below.

4.A.2 Closure properties and some characterizations

Using (4.A.1) through (4.A.9) it is easy to prove each of the closure results in the first two parts of the following theorem. The last two parts can be proven as in Theorem 3.A.12. (Recall from Section 1.A.3 that for any random variable Z and any event A we denote by $[Z \mid A]$ any random variable whose distribution is the conditional distribution of Z given A.)

Theorem 4.A.8. (a) If $X \le_{\mathrm{icx}} Y$ [$X \le_{\mathrm{icv}} Y$] and g is any increasing and convex [concave] function, then $g(X) \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\, g(Y)$.

(b) Let X, Y, and Θ be random variables such that $[X \mid \Theta = \theta] \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\, [Y \mid \Theta = \theta]$ for all θ in the support of Θ. Then $X \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\, Y$. That is, the increasing convex [concave] order is closed under mixtures.

(c) Let $\{X_j, j = 1, 2, \ldots\}$ and $\{Y_j, j = 1, 2, \ldots\}$ be two sequences of random variables such that $X_j \to_{\mathrm{st}} X$ and $Y_j \to_{\mathrm{st}} Y$ as $j \to \infty$. Assume that $EX_+$ [$EX_-$] and $EY_+$ [$EY_-$] are finite and that

\[
E(X_j)_+ \to EX_+ \ [E(X_j)_- \to EX_-] \quad \text{and} \quad E(Y_j)_+ \to EY_+ \ [E(Y_j)_- \to EY_-] \quad \text{as } j \to \infty. \tag{4.A.10}
\]

If $X_j \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\, Y_j$, $j = 1, 2, \ldots$, then $X \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\, Y$.


(d) Let $X_1, X_2, \ldots, X_m$ be a set of independent random variables and let $Y_1, Y_2, \ldots, Y_m$ be another set of independent random variables. If $X_i \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\, Y_i$ for $i = 1, 2, \ldots, m$, then

\[
\sum_{j=1}^{m} X_j \le_{\mathrm{icx}} [\le_{\mathrm{icv}}] \sum_{j=1}^{m} Y_j.
\]

That is, the increasing convex [concave] order is closed under convolutions.

In part (c), as in Theorem 3.A.12, the condition (4.A.10) is necessary; without it the conclusion of part (c) may not hold.

Part (d) of Theorem 4.A.8 can be strengthened as follows.

Theorem 4.A.9. Let $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ each be a sequence of nonnegative independent and identically distributed random variables such that $X_i \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\, Y_i$, $i = 1, 2, \ldots$. Let M and N be positive integer-valued random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively, such that $M \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\, N$. Then

\[
\sum_{j=1}^{M} X_j \le_{\mathrm{icx}} [\le_{\mathrm{icv}}] \sum_{j=1}^{N} Y_j.
\]

Proof. Let φ be an increasing convex [concave] function and denote $g(n) \equiv E[\phi(X_1 + X_2 + \cdots + X_n)]$. Clearly g(n) increases in n. Denote $S_n = X_1 + X_2 + \cdots + X_n$ for $n \ge 1$. Now,

\[
E[\phi(S_n + X_{n+1}) - \phi(S_n) \mid S_n = s] = E[\phi(s + X_{n+1}) - \phi(s)] = h(s), \text{ say}.
\]

Since φ is convex [concave] it follows that h(s) is increasing [decreasing] in s. Since $S_n$ is increasing in n in the usual stochastic order, it follows that $g(n + 1) - g(n) = E[h(S_n)]$ is increasing [decreasing] in n. That is, g(n) is increasing and convex [concave] in n. Therefore

\[
E\Bigl[\phi\Bigl(\sum_{i=1}^{M} X_i\Bigr)\Bigr] \le E\Bigl[\phi\Bigl(\sum_{i=1}^{N} X_i\Bigr)\Bigr],
\]

that is,

\[
\sum_{i=1}^{M} X_i \le_{\mathrm{icx}} [\le_{\mathrm{icv}}] \sum_{i=1}^{N} X_i. \tag{4.A.11}
\]

From Theorem 4.A.8 (b) and (d) it follows that

\[
\sum_{i=1}^{N} X_i \le_{\mathrm{icx}} [\le_{\mathrm{icv}}] \sum_{i=1}^{N} Y_i,
\]

and the proof is complete by the transitivity property of the order $\le_{\mathrm{icx}}$ [$\le_{\mathrm{icv}}$].
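Theorem 4.A.9 can be checked exactly on small discrete examples. The sketch below is an illustration under stated assumptions: all four distributions are hypothetical, and the helper names `convolve`, `random_sum`, `stop_loss`, and `icx_leq` are introduced here only for the example. It builds the distribution of each random sum by repeated convolution and tests the increasing convex order through stop-loss transforms at the atoms, which is sufficient for finite distributions.

```python
from collections import defaultdict
from fractions import Fraction


def convolve(pmf_a, pmf_b):
    """pmf of the sum of two independent finite random variables."""
    out = defaultdict(Fraction)
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] += pa * pb
    return dict(out)


def random_sum(pmf_count, pmf_term):
    """pmf of X_1 + ... + X_M, where M ~ pmf_count takes positive integer
    values and is independent of the i.i.d. terms X_i ~ pmf_term."""
    out = defaultdict(Fraction)
    partial = {0: Fraction(1)}  # distribution of the empty sum
    for n in range(1, max(pmf_count) + 1):
        partial = convolve(partial, pmf_term)  # now the n-fold convolution
        for s, p in partial.items():
            out[s] += pmf_count.get(n, 0) * p
    return {s: p for s, p in out.items() if p}


def stop_loss(pmf, a):
    return sum(p * max(v - a, 0) for v, p in pmf.items())


def icx_leq(pmf_u, pmf_v):
    atoms = sorted(set(pmf_u) | set(pmf_v))
    return all(stop_loss(pmf_u, a) <= stop_loss(pmf_v, a) for a in atoms)


half = Fraction(1, 2)
pmf_x = {0: half, 1: half}   # hypothetical X_i
pmf_y = {0: half, 2: half}   # hypothetical Y_i, with X_i <=_icx Y_i
pmf_m = {2: Fraction(1)}     # hypothetical M, identically 2
pmf_n = {1: half, 3: half}   # hypothetical N, with M <=_icx N

sum_x = random_sum(pmf_m, pmf_x)
sum_y = random_sum(pmf_n, pmf_y)
ordered = icx_leq(sum_x, sum_y)
```

Exact rational arithmetic is used so that the comparison is not affected by rounding; a simulation-based check would need a tolerance.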


A special case of Theorem 4.A.9 is stated, and proven in a different manner, in Chapter 8 (see Theorem 8.A.13).

Remark 4.A.10. If in Theorem 4.A.9 the $X_i$’s are only assumed to be increasing [decreasing] in i in the increasing convex [concave] order (rather than being identically distributed), or if the same is assumed about the $Y_i$’s, then the conclusion of the theorem is still true.

As a special case of the result mentioned in Remark 4.A.10 we obtain the following theorem.

Theorem 4.A.11. Let $\{X_i, i = 1, 2, \ldots\}$ be a sequence of nonnegative independent random variables such that $X_i \le_{\mathrm{st}} X_{i+1}$, $i = 1, 2, \ldots$. Let M and N be two discrete positive integer-valued random variables such that $M \le_{\mathrm{icx}} N$, and assume that M and N are independent of the $X_i$’s. Then

\[
\sum_{i=1}^{M} X_i \le_{\mathrm{icx}} \sum_{i=1}^{N} X_i.
\]

The following result follows easily from Theorem 4.A.9. It is of interest to compare it to Theorems 1.A.5, 2.B.8, and 3.A.14.

Theorem 4.A.12. Let $\{X_j, j = 1, 2, \ldots\}$ be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the $X_i$’s. Let $\{Y_j, j = 1, 2, \ldots\}$ be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the $Y_i$’s. Suppose that for some positive integer K we have

\[
\sum_{i=1}^{K} X_i \le_{\mathrm{icx}} [\ge_{\mathrm{icx}}, \le_{\mathrm{icv}}, \ge_{\mathrm{icv}}]\; Y_1,
\]

and

\[
M \le_{\mathrm{icx}} [\ge_{\mathrm{icx}}, \le_{\mathrm{icv}}, \ge_{\mathrm{icv}}]\; KN.
\]

Then

\[
\sum_{j=1}^{M} X_j \le_{\mathrm{icx}} [\ge_{\mathrm{icx}}, \le_{\mathrm{icv}}, \ge_{\mathrm{icv}}] \sum_{j=1}^{N} Y_j.
\]

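Proof. The assumptions yield

\[
\sum_{i=1}^{M} X_i \le_{\mathrm{icx}} [\ge_{\mathrm{icx}}, \le_{\mathrm{icv}}, \ge_{\mathrm{icv}}] \sum_{i=1}^{KN} X_i = \sum_{i=1}^{N} \sum_{j=K(i-1)+1}^{Ki} X_j \le_{\mathrm{icx}} [\ge_{\mathrm{icx}}, \le_{\mathrm{icv}}, \ge_{\mathrm{icv}}] \sum_{i=1}^{N} Y_i,
\]

where the inequalities follow from Theorem 4.A.9. This gives the stated result.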


Some results that are related to Theorem 4.A.12 are given in the next theorem.

Theorem 4.A.13. Let $\{X_j, j = 1, 2, \ldots\}$ be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the $X_i$’s. Let $\{Y_j, j = 1, 2, \ldots\}$ be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the $Y_i$’s. Also, let $\{N_j, j = 1, 2, \ldots\}$ be a sequence of independent random variables that are distributed as N. If for some positive integer K we have

\[
\sum_{i=1}^{K} X_i \le_{\mathrm{icx}} Y_1 \quad \text{and} \quad M \le_{\mathrm{icx}} \sum_{i=1}^{K} N_i, \tag{4.A.12}
\]

or if we have

\[
K X_1 \le_{\mathrm{icx}} Y_1 \quad \text{and} \quad M \le_{\mathrm{icx}} KN, \tag{4.A.13}
\]

or if we have

\[
K X_1 \le_{\mathrm{icx}} Y_1 \quad \text{and} \quad M \le_{\mathrm{icx}} \sum_{i=1}^{K} N_i, \tag{4.A.14}
\]

then

\[
\sum_{j=1}^{M} X_j \le_{\mathrm{icx}} \sum_{j=1}^{N} Y_j. \tag{4.A.15}
\]

Proof. Assume that (4.A.13) holds. Then

\[
\sum_{i=1}^{M} X_i \le_{\mathrm{icx}} \sum_{i=1}^{KN} X_i = \sum_{i=1}^{N} \sum_{j=K(i-1)+1}^{Ki} X_j \le_{\mathrm{cx}} \sum_{i=1}^{N} K X_i \le_{\mathrm{icx}} \sum_{i=1}^{N} Y_i,
\]

where the first and the third inequalities follow from Theorem 4.A.9, and the second inequality follows from Theorem 3.A.13 and Example 3.A.29. This gives (4.A.15).

Next note, using Example 3.A.29, that $\sum_{i=1}^{K} N_i \le_{\mathrm{icx}} KN$. Thus, by Theorem 4.A.12, the conditions in (4.A.12) imply (4.A.15), and, by (4.A.13), the conditions in (4.A.14) imply (4.A.15).

A slight generalization of the conditions in (4.A.12) is given in the next theorem.


Theorem 4.A.14. Let $\{X_j, j = 1, 2, \ldots\}$ be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the $X_i$’s. Let $\{Y_j, j = 1, 2, \ldots\}$ be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the $Y_i$’s. If for some positive integers $K_1$ and $K_2$, such that $K_1 \le K_2$, we have

\[
\sum_{i=1}^{K_1} X_i \le_{\mathrm{icx}} \frac{K_1}{K_2}\, Y_1 \quad \text{and} \quad M \le_{\mathrm{icx}} K_2 N,
\]

then

\[
\sum_{j=1}^{M} X_j \le_{\mathrm{icx}} \sum_{j=1}^{N} Y_j.
\]

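Proof. The first assumption and Example 3.A.29 yield

\[
K_1 \cdot \frac{\sum_{i=1}^{K_2} X_i}{K_2} \le_{\mathrm{cx}} K_1 \cdot \frac{\sum_{i=1}^{K_1} X_i}{K_1} \le_{\mathrm{icx}} \frac{K_1}{K_2}\, Y_1;
\]

that is, $\sum_{i=1}^{K_2} X_i \le_{\mathrm{icx}} Y_1$. The result now follows from Theorem 4.A.12.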

Parts (a) and (d) of Theorem 4.A.8 can be generalized as follows.

Theorem 4.A.15. Let $X_1, X_2, \ldots, X_m$ be a set of independent random variables and let $Y_1, Y_2, \ldots, Y_m$ be another set of independent random variables. If $X_i \le_{\mathrm{icx}} Y_i$ for $i = 1, 2, \ldots, m$, then

\[
g(X_1, X_2, \ldots, X_m) \le_{\mathrm{icx}} g(Y_1, Y_2, \ldots, Y_m) \tag{4.A.16}
\]

for every increasing and componentwise convex function g.

Proof. Without loss of generality we can assume that all the 2m random variables are independent because such an assumption does not affect the distributions of $g(X_1, X_2, \ldots, X_m)$ and $g(Y_1, Y_2, \ldots, Y_m)$. The proof is by induction on m. For m = 1 the result is just Theorem 4.A.8(a). Assume that (4.A.16) is true for vectors of size m − 1. Let g and φ be increasing and componentwise convex functions. Then

\[
\begin{aligned}
E[\phi(g(X_1, X_2, \ldots, X_m)) \mid X_1 = x] &= E[\phi(g(x, X_2, \ldots, X_m))] \\
&\le E[\phi(g(x, Y_2, \ldots, Y_m))] \\
&= E[\phi(g(X_1, Y_2, \ldots, Y_m)) \mid X_1 = x],
\end{aligned}
\]

where the equalities above follow from the independence assumption and the inequality follows from the induction hypothesis. Taking expectations with respect to $X_1$, we obtain

\[
E[\phi(g(X_1, X_2, \ldots, X_m))] \le E[\phi(g(X_1, Y_2, \ldots, Y_m))].
\]

Repeating the argument, but now conditioning on $Y_2, \ldots, Y_m$ and using (4.A.16) with m = 1, we see that

\[
E[\phi(g(X_1, Y_2, \ldots, Y_m))] \le E[\phi(g(Y_1, Y_2, \ldots, Y_m))],
\]

and this proves the result.

From Theorem 4.A.15 we obtain the following corollary.

Corollary 4.A.16. Let $X_1, X_2, \ldots, X_m$ be a set of independent random variables and let $Y_1, Y_2, \ldots, Y_m$ be another set of independent random variables. If $X_i \le_{\mathrm{icx}} Y_i$ for $i = 1, 2, \ldots, m$, then

\[
\max\{X_1, X_2, \ldots, X_m\} \le_{\mathrm{icx}} \max\{Y_1, Y_2, \ldots, Y_m\}.
\]

From Corollary 4.A.16 and Theorem 4.A.1 it is easy to see that if $X_1, X_2, \ldots, X_m$ are independent random variables, and if $Y_1, Y_2, \ldots, Y_m$ are independent random variables, and if $X_i \le_{\mathrm{icv}} Y_i$ for $i = 1, 2, \ldots, m$, then

\[
\min\{X_1, X_2, \ldots, X_m\} \le_{\mathrm{icv}} \min\{Y_1, Y_2, \ldots, Y_m\}.
\]

A comparison of maxima of two partial sums in the increasing convex order is given next. Recall from (3.A.54) the definition of negatively associated random variables.

Theorem 4.A.17. Let $X_1, X_2, \ldots, X_n$ be negatively associated random variables, and let $Y_1, Y_2, \ldots, Y_n$ be independent random variables such that $X_i =_{\mathrm{st}} Y_i$, $i = 1, 2, \ldots, n$. Then

\[
\max_{1 \le k \le n} \sum_{i=1}^{k} X_i \le_{\mathrm{icx}} \max_{1 \le k \le n} \sum_{i=1}^{k} Y_i.
\]

Theorem 4.A.17 follows from Theorem 9.A.23 in Chapter 9; see a comment there after that theorem.

Consider now a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ where $\mathcal{X}$ is a convex subset (that is, an interval) of the real line or of $\mathbb{N}$. As in Section 1.A.3 let X(θ) denote a random variable with distribution function $G_\theta$. For any random variable Θ with support in $\mathcal{X}$, and with distribution function F, let us denote by X(Θ) a random variable with distribution function H given by

\[
H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}.
\]

The following result generalizes Theorem 4.A.8(a), just as Theorem 1.A.6 generalized Theorem 1.A.3(a).


Theorem 4.A.18. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, i = 1, 2; that is, suppose that the distribution function of $Y_i$ is given by

\[
H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.
\]

If for every increasing convex [concave] function φ

\[
E[\phi(X(\theta))] \text{ is increasing and convex [concave] in } \theta, \tag{4.A.17}
\]

and if

\[
\Theta_1 \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\; \Theta_2, \tag{4.A.18}
\]

then

\[
Y_1 \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\; Y_2. \tag{4.A.19}
\]

Proof. Select an increasing convex [concave] function φ for which the expectations below exist, denote

\[
\psi(\theta) = E[\phi(X(\theta))], \quad \theta \in \mathcal{X},
\]

and notice that ψ is increasing and convex [concave] by (4.A.17). Then

\[
E[\phi(Y_1)] = E[\psi(\Theta_1)] \le E[\psi(\Theta_2)] = E[\phi(Y_2)],
\]

where the inequality follows from (4.A.18). This gives (4.A.19).
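Theorem 4.A.18 can be illustrated with a scale family. The sketch below assumes, purely for the example, that X(θ) = θU for a positive discrete random variable U; for this family E[φ(θU)] is increasing and convex in θ > 0 whenever φ is increasing and convex, so (4.A.17) holds. All distributions and helper names are hypothetical. The mixtures are computed exactly and compared through stop-loss transforms at the atoms.

```python
from collections import defaultdict
from fractions import Fraction


def mixture(pmf_theta, family):
    """pmf of X(Theta): mix the pmf family(theta) over theta ~ pmf_theta."""
    out = defaultdict(Fraction)
    for theta, weight in pmf_theta.items():
        for value, p in family(theta).items():
            out[value] += weight * p
    return dict(out)


def stop_loss(pmf, a):
    return sum(p * max(v - a, 0) for v, p in pmf.items())


def icx_leq(pmf_u, pmf_v):
    atoms = sorted(set(pmf_u) | set(pmf_v))
    return all(stop_loss(pmf_u, a) <= stop_loss(pmf_v, a) for a in atoms)


half = Fraction(1, 2)
pmf_u = {1: half, 2: half}        # hypothetical positive U


def scaled_u(theta):
    """pmf of X(theta) = theta * U (assumed scale family)."""
    return {theta * u: p for u, p in pmf_u.items()}


pmf_theta1 = {2: Fraction(1)}     # hypothetical Theta_1, identically 2
pmf_theta2 = {1: half, 3: half}   # hypothetical Theta_2, Theta_1 <=_icx Theta_2

pmf_y1 = mixture(pmf_theta1, scaled_u)
pmf_y2 = mixture(pmf_theta2, scaled_u)

premise = icx_leq(pmf_theta1, pmf_theta2)   # (4.A.18)
conclusion = icx_leq(pmf_y1, pmf_y2)        # (4.A.19)
```

Here the two mixtures have the same mean, so the example isolates the effect of the extra variability of the mixing variable.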

Note that (4.A.11) can be easily obtained from the result above. It is worth mentioning also that condition (4.A.17) is weaker than the condition $\{X(\theta), \theta \in \mathcal{X}\} \in$ SICX [SICV] which is studied in Section 8.A of Chapter 8. An extension of Theorem 4.A.18 is given as Theorem 4.A.65 below.

The following example illustrates the use of Theorem 4.A.18. It may be compared to Corollary 3.A.22.

Example 4.A.19. Let U, $\Theta_1$, and $\Theta_2$ be independent positive random variables. Define

\[
Y_1 = \frac{U}{\Theta_1} \quad \text{and} \quad Y_2 = \frac{U}{\Theta_2}.
\]

If $\Theta_1 \le_{\mathrm{icv}} [\le_{\mathrm{icx}}]\, \Theta_2$, then $Y_1 \ge_{\mathrm{icx}} [\ge_{\mathrm{icv}}]\, Y_2$. This can be proven by a simple application of Theorems 4.A.18 and 4.A.1.

An interesting variation of Theorem 4.A.18 is the following. Its proof is similar to the proof of Theorem 4.A.18 and is therefore omitted.


Theorem 4.A.20. Consider a family of distribution functions $\{G_\theta, \theta \in \mathcal{X}\}$ as described before Theorem 4.A.18. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, i = 1, 2; that is, suppose that the distribution function of $Y_i$ is given by

\[
H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2.
\]

If for every increasing convex [concave] function φ

\[
E[\phi(X(\theta))] \text{ is increasing in } \theta,
\]

and if $\Theta_1 \le_{\mathrm{st}} \Theta_2$, then $Y_1 \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\, Y_2$.

A Laplace transform characterization of the orders $\le_{\mathrm{icx}}$ and $\le_{\mathrm{icv}}$ is given next; it may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, 2.A.16, and 2.B.14.

Theorem 4.A.21. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

\[
X_1 \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\; X_2 \iff N_\lambda(X_1) \le_{\mathrm{icx}} [\le_{\mathrm{icv}}]\; N_\lambda(X_2) \quad \text{for all } \lambda > 0.
\]

Proof. First assume that X1 ≤icx [≤icv ] X2 . For k = 1, 2, denote the distribution function of Xk by Fk . Let φ be an increasing convex [concave] function. Without loss of generality assume that φ(0) = 0. Then, from (2.A.16) we have that ∞ ∞ (λx)n dFk (x), E[φ(Xk )] = φ(n)e−λx n! 0 n=1 and therefore it is seen that it suﬃces to show that g(x) ≡

∞

φ(n)e−λx

n=1

(λx)n n!

is increasing and convex [concave] in x. Now compute

g (x) =

∞

φ(n)λe

n=1 ∞

=λ

n=0

−λx

(λx)n−1 (λx)n − (n − 1)! n!

[φ(n + 1) − φ(n)]e−λx

(λx)n . n!

If we denote ∆φ (n) ≡ φ(n + 1) − φ(n), then it is seen that

4.A The Monotone Convex and Monotone Concave Orders

193

$$g'(x) = \lambda E\{\Delta_\phi[N(x)]\},$$

where {N(x), x ≥ 0} is a Poisson process with rate λ. Since ∆φ(n) ≥ 0, by the monotonicity of φ, it follows that g′(x) ≥ 0. Also, since ∆φ(n) ↑ [↓] n by the convexity [concavity] of φ, and since N(x) ↑st x, it follows that g′(x) ↑ [↓] x. Therefore g is increasing and convex [concave].

Now suppose that Nλ(X1) ≤icx Nλ(X2) for all λ > 0, that is, using the notation of the proof of Theorem 2.A.16,

$$\sum_{n=m}^\infty \bar\alpha_{\lambda,1}(n) \le \sum_{n=m}^\infty \bar\alpha_{\lambda,2}(n), \quad m = 0, 1, 2, \ldots.$$

Then for m ≥ 2, (2.A.23) yields

$$\int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-2}}{(m-2)!} \int_u^\infty \bar F_1(x)\,dx\,du \le \int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-2}}{(m-2)!} \int_u^\infty \bar F_2(x)\,dx\,du.$$

For any fixed y > 0 set λ = (m − 1)/y. It follows that as m → ∞ (then λ → ∞),

$$\int_0^\infty \lambda e^{-\lambda u} \frac{(\lambda u)^{m-2}}{(m-2)!} \int_u^\infty \bar F_k(x)\,dx\,du \to \int_y^\infty \bar F_k(x)\,dx, \quad k = 1, 2.$$

Therefore we obtain

$$\int_y^\infty \bar F_1(x)\,dx \le \int_y^\infty \bar F_2(x)\,dx, \quad y > 0,$$

that is, X1 ≤icx X2 (see (4.A.5)). The proof of the converse for the ≤icv order is similar.
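The key step of the proof, that Poisson smoothing of an increasing convex φ yields an increasing convex function of x, can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names, the truncation at 200 terms, the rate λ = 2, and the grid are all illustrative assumptions, and a finite-difference test on a grid is only a sanity check, not a proof.

```python
import math

def poisson_mix(phi, lam, x, terms=200):
    """Approximate g(x) = sum_{n>=1} phi(n) e^{-lam x} (lam x)^n / n! (series truncated)."""
    mean = lam * x
    weight = math.exp(-mean)      # Poisson probability of n = 0
    total = 0.0
    for n in range(1, terms):
        weight *= mean / n        # Poisson probability of n, from that of n - 1
        total += phi(n) * weight
    return total

def is_increasing_convex_on_grid(g, grid, tol=1e-9):
    """Check nonnegative first and second differences of g on an equally spaced grid."""
    values = [g(x) for x in grid]
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return all(d >= -tol for d in first) and all(d >= -tol for d in second)

square = lambda n: n * n                      # an increasing convex phi with phi(0) = 0
grid = [0.1 * k for k in range(51)]           # x in [0, 5]
print(is_increasing_convex_on_grid(lambda x: poisson_mix(square, 2.0, x), grid))
```

For φ(n) = n² the sum has the closed form λx + (λx)², which gives an independent check of the truncated series.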

The implication ⟹ in Theorem 4.A.21 can be generalized in the same manner that Theorem 1.A.14 generalizes the implication ⟹ in Theorem 1.A.13. We will not state the result here since it is equivalent to Theorem 4.A.18.

4.A.3 Conditions that lead to the increasing convex and increasing concave orders

Once the relation X ≤icx Y or the relation X ≤icv Y has been established between the two random variables X and Y, it can be of great use. However, given the two random variables and their distribution functions, it is sometimes not clear how to verify that X ≤icx Y or that X ≤icv Y. Parallel to the analysis in Section 3.A.3 we point out here some simple conditions that imply the increasing convex and the increasing concave orders.


Theorem 4.A.22. Let X and Y be two random variables with distribution functions F and G and survival functions $\bar F$ and $\bar G$, respectively, and with finite means such that EX ≤ EY.
(a) If $S^-(\bar F - \bar G) \le 1$ and the sign sequence is +, − [−, +] when equality holds, then X ≤icx Y [X ≤icv Y].
(b) If $S^-(G - F) \le 1$ and the sign sequence is +, − [−, +] when equality holds, then X ≤icx Y [X ≤icv Y].

The proof of this theorem is similar to the proof of Theorem 3.A.44 and is not detailed here.

The condition in part (a) (or, equivalently, in part (b)) of Theorem 4.A.22 is not only sufficient for X ≤icx Y, but, for nonnegative random variables, it can also characterize the increasing convex order in a similar manner in which (3.A.58) (or, equivalently, (3.A.59)) characterizes the convex order in Theorem 3.A.45. This is stated next.

Theorem 4.A.23. Let X and Y be two nonnegative random variables such that EX ≤ EY. Then X ≤icx [≤icv] Y if, and only if, there exist random variables Z1, Z2, . . ., with distribution functions F1, F2, . . ., such that Z1 =st X, EZj ≤ EY, j = 1, 2, . . ., Zj →st Y as j → ∞, EZj → EY as j → ∞, and $S^-(\bar F_j - \bar F_{j+1}) = 1$ and the sign sequence is +, − [−, +], j = 1, 2, . . ..

If the random variables in Theorem 4.A.23 are not nonnegative, then the sufficiency part of that theorem is not correct. This follows from the remark after Theorem 3.A.45.

An interesting characterization of the mean residual life order by means of the increasing convex order is the following result.

Theorem 4.A.24. Let X and Y be two random variables. Then X ≤mrl Y if, and only if,

$$[X - s \mid X > s] \le_{\mathrm{icx}} [Y - s \mid Y > s] \quad \text{for all } s. \tag{4.A.20}$$

Proof. Let $\bar F$ and $\bar G$ be the survival functions of X and Y, respectively. Condition (4.A.20) can be written as

$$\frac{\int_t^\infty \bar F(s+u)\,du}{\bar F(s)} \le \frac{\int_t^\infty \bar G(s+u)\,du}{\bar G(s)} \quad \text{for all } s \text{ and all } t \ge 0,$$

which is equivalent to X ≤mrl Y by (2.A.6).

Remark 4.A.25. Let φ be an increasing convex function. For any s let s′ be selected such that φ(s′) = s. Note that if (4.A.20) holds, then [X | X > s′] ≤icx [Y | Y > s′]. Therefore E[φ(X) | X > s′] ≤ E[φ(Y) | Y > s′], and therefore E[φ(X) − s | φ(X) > s] ≤ E[φ(Y) − s | φ(Y) > s]. Thus we have proven that if X ≤mrl Y, then φ(X) ≤mrl φ(Y) for every increasing convex function φ.


From Theorem 4.A.24 we see that if X ≤mrl Y, then [X | X > s] ≤icx [Y | Y > s] for all s. Letting s → −∞ we obtain from Theorem 4.A.8(c) the following result.

Theorem 4.A.26. Let X and Y be two random variables with finite means. If X ≤mrl Y, then X ≤icx Y.

An analog of Theorem 4.A.26 for the increasing concave order is the following result.

Theorem 4.A.27. Let X and Y be two random variables with finite means. If E[X | X ≤ x] ≤ E[Y | Y ≤ x] for all x ∈ R, then X ≤icv Y.

For positive random variables we have a result that is stronger than Theorem 4.A.26:

Theorem 4.A.28. Let X and Y be two almost surely positive random variables with finite means. If X ≤hmrl Y, then X ≤icx Y.

Proof. Let $\bar F$ and $\bar G$ be the survival functions of X and Y, respectively. From (2.B.4) (or, equivalently, from (2.B.2)) it follows that

$$\frac{\int_t^\infty \bar F(u)\,du}{EX} \le \frac{\int_t^\infty \bar G(u)\,du}{EY} \quad \text{for all } t \ge 0. \tag{4.A.21}$$

Since, for almost surely positive random variables, X ≤hmrl Y implies that EX ≤ EY (see (2.B.6)), it follows that (4.A.5) holds.
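To make the integrated-survival-function criterion used in these proofs concrete, here is a small sketch for two exponential random variables, where the integral of the survival function over (t, ∞) has the closed form e^{−rate·t}/rate. The rates 2 and 1, the grid, and the helper names are assumptions chosen for illustration. For nonnegative variables the values t < 0 add nothing beyond the comparison of means, which is the case t = 0.

```python
import math

def stop_loss_exponential(rate, t):
    """Integral of the survival function exp(-rate * u) over (t, infinity), for t >= 0."""
    return math.exp(-rate * t) / rate

def icx_leq_on_grid(stop_loss_x, stop_loss_y, grid, tol=1e-12):
    """Grid check of the integrated-survival inequality; a sanity check, not a proof."""
    return all(stop_loss_x(t) <= stop_loss_y(t) + tol for t in grid)

grid = [0.05 * k for k in range(401)]              # t in [0, 20]
x_sl = lambda t: stop_loss_exponential(2.0, t)     # X exponential with rate 2 (mean 1/2)
y_sl = lambda t: stop_loss_exponential(1.0, t)     # Y exponential with rate 1 (mean 1)
print(icx_leq_on_grid(x_sl, y_sl, grid), icx_leq_on_grid(y_sl, x_sl, grid))
```

Here the mean residual life functions are the constants 1/2 and 1, so the comparison in the first direction is also what Theorem 4.A.26 predicts.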

Remark 4.A.29. With the help of Theorem 4.A.28 we can now provide proofs for Theorems 2.A.15 and 2.B.13.

First we prove Theorem 2.A.15. From (2.A.3) it is seen that assumption (2.A.11) means that $\int_y^\infty \bar G_\theta(u)\,du$, as a function of θ and of y, is TP2, where $\bar G_\theta$ is the survival function associated with Gθ. Assumption (2.A.12) means that $\bar F_i(\theta)$, as a function of i ∈ {1, 2} and of θ, is TP2. From Theorem 4.A.28 and (4.A.5) it follows that $\int_y^\infty \bar G_\theta(u)\,du$ is increasing in θ. Therefore, by Theorem 2.1(i) of Lynch, Mimmack, and Proschan [369], $\int_{\mathcal{X}} \int_y^\infty \bar G_\theta(u)\,du\,dF_i(\theta)$ is TP2 in i ∈ {1, 2} and y. But

$$\int_{\mathcal{X}} \int_y^\infty \bar G_\theta(u)\,du\,dF_i(\theta) = \int_y^\infty \int_{\mathcal{X}} \bar G_\theta(u)\,dF_i(\theta)\,du,$$

and that, by (2.A.3), gives (2.A.13).

Next we prove Theorem 2.B.13. Fix an x > 0. From (2.B.2) it is seen that assumption (2.B.15) implies that $\int_y^\infty \bar G_\theta(u)\,du$ is TP2 in y ∈ {0, x} and θ, where $\bar G_\theta$ is the survival function associated with Gθ. Assumption (2.B.16) means that $\bar F_i(\theta)$, as a function of i ∈ {1, 2} and of θ, is TP2. From Theorem 4.A.28 and (4.A.5) it follows that $\int_y^\infty \bar G_\theta(u)\,du$ is increasing in θ. Therefore, by Theorem 2.1(i) of Lynch, Mimmack, and Proschan [369], $\int_{\mathcal{X}} \int_y^\infty \bar G_\theta(u)\,du\,dF_i(\theta)$ is TP2 in i ∈ {1, 2} and y ∈ {0, x}. But

$$\int_{\mathcal{X}} \int_y^\infty \bar G_\theta(u)\,du\,dF_i(\theta) = \int_y^\infty \int_{\mathcal{X}} \bar G_\theta(u)\,dF_i(\theta)\,du,$$

and this expression is TP2 in i ∈ {1, 2} and y ∈ {0, x} for all x > 0. Thus, by (2.B.2), we obtain (2.B.17).


Under quite weak conditions the order ≤dil implies the order ≤icx. This is shown in the next theorem. For any random variable Z, let $l_Z$ denote the left endpoint of the support of Z.

Theorem 4.A.30. Let X and Y be two random variables with finite means. If

$$l_X \le l_Y \tag{4.A.22}$$

and if X ≤dil Y, then X ≤icx Y.

Proof. Suppose that X ≤dil Y. Then

$$[X - EX] \le_{\mathrm{cx}} [Y - EY]. \tag{4.A.23}$$

Therefore, by (3.A.12) we get that supp(X − EX) ⊆ supp(Y − EY). Thus $l_Y - EY \le l_X - EX$. Hence,

$$EY - EX \ge l_Y - l_X. \tag{4.A.24}$$

Combining (4.A.22) with (4.A.24) it is seen that

$$EX \le EY. \tag{4.A.25}$$

From (4.A.23) it follows that

$$X \le_{\mathrm{cx}} Y - (EY - EX), \tag{4.A.26}$$

and from (4.A.25) it follows that

$$Y - (EY - EX) \le_{\mathrm{st}} Y. \tag{4.A.27}$$

Using Theorem 4.A.6(b) it is seen that, from (4.A.26) and (4.A.27), we obtain X ≤icx Y. It is also easy to obtain X ≤icx Y from (4.A.26) and (4.A.27) by noticing that the usual stochastic order and the convex order both imply the increasing convex order.

As a corollary of Theorem 4.A.30 we obtain the following result.

Corollary 4.A.31. Let X and Y be two nonnegative random variables with finite means, such that X has the support [0, ∞). If X ≤dil Y, then X ≤icx Y.

A corollary of Theorem 4.A.30 and of (3.C.7) is the following result.

Corollary 4.A.32. Let X and Y be two random variables with finite means. If $l_X \le l_Y$ and if X ≤ew Y, then X ≤icx Y.

The next result gives a simple condition that implies the increasing convex order between a given random variable and a scale transformation of another random variable. Let X1, X2, . . . be a sequence of independent and identically distributed nonnegative random variables with a common distribution


function F, and let Y1, Y2, . . . be another sequence of independent and identically distributed nonnegative random variables with a common distribution function G. Let $X_{(n)} \equiv \max\{X_1, X_2, \ldots, X_n\}$ be the nth order statistic of a sample of size n from the distribution F, n = 1, 2, . . .. Let $Y_{(n)}$ be similarly defined for n = 1, 2, . . .. Note that from Corollary 4.A.16 it follows that if X1 ≤icx Y1, then $X_{(n)} \le_{\mathrm{icx}} Y_{(n)}$ for all n = 1, 2, . . .. The following theorem is a weak converse of this observation. The proof is not given here.

Theorem 4.A.33. Let X1, X2, . . . be a sequence of independent and identically distributed nonnegative random variables and let Y1, Y2, . . . be another sequence of independent and identically distributed nonnegative random variables. If $E[X_{(n)}] \le E[Y_{(n)}]$ for all n = 1, 2, . . ., then X1 ≤icx κY1 for some constant κ ≥ 1 that is independent of the distributions of X1 and Y1. The constant κ can be taken to be equal to $2(1 - e^{-1})^{-1}$.

4.A.4 Further properties

Let X and Y be two random variables. If E[φ(X)] ≤ E[φ(Y)] for all increasing functions φ, then (4.A.1) definitely holds. If E[φ(X)] ≤ E[φ(Y)] for all convex [concave] functions φ, then (4.A.1) also holds. From (1.A.7) and (3.A.1) we thus obtain the following result. Note that in the conclusion of the second part of (b) in the next theorem the random variables X and Y are interchanged.

Theorem 4.A.34. Let X and Y be two random variables.
(a) If X ≤st Y, then X ≤icx Y and X ≤icv Y.
(b) If X ≤cx Y, then X ≤icx Y and Y ≤icv X.

Thus we see that indeed the increasing convex [concave] order has both properties of ordering by size and ordering by variability. One indication of the ordering by size property is (4.A.2) [(4.A.3)], that is, the ordering of the expected values (when they exist) that follows from the increasing convex [concave] order. It turns out that the ordering of the expected values is actually the only indication of the ordering by size property. If the two means are equal, then the monotone convex and the monotone concave orders reduce to the convex order of Section 3.A. This is stated formally in the following theorem.

Theorem 4.A.35. Let X and Y be two random variables with finite means.
(a) If X ≤icx Y and EX = EY, then X ≤cx Y.
(b) If X ≤icv Y and EX = EY, then Y ≤cx X.

Proof. If X ≤icx Y, then (4.A.5) (which is the same as (3.A.7)) holds. Part (a) now follows from Theorem 3.A.1(a). Part (b) is proven similarly using (4.A.7), (3.A.8), and Theorem 3.A.1(b).


The order ≤icx can be used to yield bivariate characterizations of the orders ≤st, ≤hr, ≤rh, and ≤lr (compare the following result to Theorems 1.A.10, 1.B.10, 1.B.48, 1.C.22, and 1.C.23). Let φ1 and φ2 be two bivariate functions and let ∆φ21(x, y) = φ2(x, y) − φ1(x, y). Consider the following set of conditions on φ1 and φ2:

(a) ∆φ21(x, y) ≥ −∆φ21(y, x) whenever x ≤ y.
(b) ∆φ21(x, y) ≥ 0 whenever x ≤ y.
(c) φ1(y, x) ≤ φ2(x, y) whenever x ≤ y.
(d) For each x, φ2(x, y) increases in y on {y ≥ x}.
(e) For each y, φ2(x, y) decreases in x on {x ≤ y}.
(f) For each x, ∆φ21(x, y) increases in y on {y ≥ x}.
(g) For each y, ∆φ21(x, y) decreases in x on {x ≤ y}.

The proof of the next theorem is omitted.

Theorem 4.A.36. Let X and Y be two independent random variables. Then
(i) X ≤st Y if, and only if,

$$\phi_1(X, Y) \le_{\mathrm{icx}} \phi_2(X, Y) \tag{4.A.28}$$

for all φ1 and φ2 satisfying (a), (b), (c), (d), (e), (f), and (g).
(ii) X ≤hr Y if, and only if, (4.A.28) holds for all φ1 and φ2 satisfying (a), (b), (c), (d), and (f).
(iii) X ≤rh Y if, and only if, (4.A.28) holds for all φ1 and φ2 satisfying (a), (b), (c), (e), and (g).
(iv) X ≤lr Y if, and only if, (4.A.28) holds for all φ1 and φ2 satisfying (a), (b), and (c).

A typical application of Theorem 4.A.36 is the following result (compare it to Theorem 1.C.21).

Theorem 4.A.37. Let X1, X2, . . . , Xm be independent random variables such that X1 ≤rh X2 ≤rh · · · ≤rh Xm. Let a1, a2, . . . , am be constants such that a1 ≤ a2 ≤ · · · ≤ am. Then

$$\sum_{i=1}^m a_{m-i+1} X_i \le_{\mathrm{icv}} \sum_{i=1}^m a_{\pi_i} X_i \le_{\mathrm{icv}} \sum_{i=1}^m a_i X_i,$$

where π = (π1, π2, . . . , πm) denotes any permutation of (1, 2, . . . , m).

Proof. We only give the proof when m = 2; the general case then can be obtained by pairwise interchanges. So, suppose that X1 ≤rh X2 and that a1 ≤ a2. Define φ1 and φ2 by φ1(x, y) = −a1x − a2y and φ2(x, y) = −a1y − a2x. Then it is easy to verify that (a), (b), (c), (e), and (g) above hold. Thus, by Theorem 4.A.36(iii), −a1X1 − a2X2 ≤icx −a1X2 − a2X1. By Theorem 4.A.1 this means a1X2 + a2X1 ≤icv a1X1 + a2X2.
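The case m = 2 of Theorem 4.A.37 can be verified exactly on a toy example. The sketch below is illustrative only: the two-point distributions, the weights a1 = 1 and a2 = 3, and the helper names are assumptions. The two laws are ordered in the likelihood ratio sense (the ratio of the probability functions, 0.4 and then 1.6, is increasing), which implies the reversed hazard rate order required by the theorem. The increasing concave order is tested through E[(c − X)+] ≥ E[(c − Y)+]; for finitely supported laws both sides are piecewise linear in c with kinks only at atoms, so checking c at the atoms suffices.

```python
from itertools import product

def weighted_sum_pmf(pmf1, pmf2, w1, w2):
    """Probability function of w1*X1 + w2*X2 for independent, finitely supported X1 and X2."""
    out = {}
    for (x1, p1), (x2, p2) in product(pmf1.items(), pmf2.items()):
        s = w1 * x1 + w2 * x2
        out[s] = out.get(s, 0.0) + p1 * p2
    return out

def expected_shortfall(pmf, c):
    """E[(c - X)_+] for a finitely supported X."""
    return sum(p * (c - x) for x, p in pmf.items() if x < c)

def icv_leq(pmf_x, pmf_y, tol=1e-12):
    """X <=_icv Y for finitely supported laws, checked at the atoms of both laws."""
    atoms = set(pmf_x) | set(pmf_y)
    return all(expected_shortfall(pmf_x, c) + tol >= expected_shortfall(pmf_y, c) for c in atoms)

pmf1 = {1: 0.5, 2: 0.5}          # law of X1
pmf2 = {1: 0.2, 2: 0.8}          # law of X2, larger than X1 in the likelihood ratio order
a1, a2 = 1, 3                    # a1 <= a2
reversed_weights = weighted_sum_pmf(pmf1, pmf2, a2, a1)   # a2*X1 + a1*X2
matched_weights = weighted_sum_pmf(pmf1, pmf2, a1, a2)    # a1*X1 + a2*X2
print(icv_leq(reversed_weights, matched_weights))
```

Pairing the larger weight with the larger random variable gives the larger sum in the ≤icv sense, in line with the statement of the theorem.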


In the next few results we denote by $I_p$ a Bernoulli random variable with probability of success p, that is, $P\{I_p = 1\} = 1 - P\{I_p = 0\} = p$. Recall from page 2 the definition of the majorization order ≺ among n-dimensional vectors. It is shown after the next theorem that it partially extends Theorem 3.A.37.

Theorem 4.A.38. Let X1, X2, . . . , Xn be independent nonnegative random variables, and let $I_{p_1}, I_{p_2}, \ldots, I_{p_n}$ and $I_{q_1}, I_{q_2}, \ldots, I_{q_n}$ be independent Bernoulli random variables that are independent of X1, X2, . . . , Xn. Suppose that
(i) 1 ≥ p1 ≥ p2 ≥ · · · ≥ pn and 1 ≥ q1 ≥ q2 ≥ · · · ≥ qn,
(ii) Xn ≤st Xn−1 ≤st · · · ≤st X1, and
(iii) p ≺ q.
Then

$$\sum_{i=1}^n I_{p_i} X_i \le_{\mathrm{icv}} \sum_{i=1}^n I_{q_i} X_i.$$

If X1, X2, . . . , Xn in Theorem 4.A.38 are identically distributed, then $E\left[\sum_{i=1}^n I_{p_i} X_i\right] = E\left[\sum_{i=1}^n I_{q_i} X_i\right]$ and therefore the conclusion in this theorem is $\sum_{i=1}^n I_{p_i} X_i \le_{\mathrm{cv}} \sum_{i=1}^n I_{q_i} X_i$; that is, $\sum_{i=1}^n I_{p_i} X_i \ge_{\mathrm{cx}} \sum_{i=1}^n I_{q_i} X_i$. This is the same as the conclusion of Theorem 3.A.37.

The following result partially extends Theorem 3.A.35.

Theorem 4.A.39. Let X1, X2, . . . , Xn be independent and identically distributed nonnegative random variables, and let $I_{p_1}, I_{p_2}, \ldots, I_{p_n}$ be independent Bernoulli random variables that are independent of X1, X2, . . . , Xn. Let a = (a1, a2, . . . , an) and b = (b1, b2, . . . , bn) be two vectors of constants. Suppose that
(i) 1 ≥ p1 ≥ p2 ≥ · · · ≥ pn,
(ii) a1 ≥ a2 ≥ · · · ≥ an and b1 ≥ b2 ≥ · · · ≥ bn, and
(iii) a ≺ b.
Then

$$\sum_{i=1}^n I_{p_i} a_i X_i \le_{\mathrm{icx}} \sum_{i=1}^n I_{p_i} b_i X_i.$$

A family of nonnegative random variables {X(θ), θ > 0} is said to have the semigroup property if, for all θ1 > 0 and θ2 > 0, one has X(θ1 + θ2) =st X(θ1) + X(θ2), where X(θ1) and X(θ2) are independent. As a corollary of Theorem 4.A.39 we obtain the following result.

Corollary 4.A.40. Let {X(θ), θ > 0} be a family of random variables with the semigroup property, and let $I_{p_1}, I_{p_2}, \ldots, I_{p_n}$ be independent Bernoulli random variables that are independent of {X(θ), θ > 0}. Let θ = (θ1, θ2, . . . , θn) and γ = (γ1, γ2, . . . , γn) be two vectors of constants. Suppose that
(i) 1 ≥ p1 ≥ p2 ≥ · · · ≥ pn,


(ii) θ1 ≥ θ2 ≥ · · · ≥ θn and γ1 ≥ γ2 ≥ · · · ≥ γn, and
(iii) θ ≺ γ.
Then

$$\sum_{i=1}^n I_{p_i} X(\theta_i) \le_{\mathrm{icx}} \sum_{i=1}^n I_{p_i} X(\gamma_i).$$

The following characterizations of the dilation order, by means of the order ≤icx, are similar to characterizations (3.A.39) and (3.A.40).

Theorem 4.A.41. Let X and Y be two random variables with distribution functions F and G, respectively, and with finite expectations. Then X ≤dil Y if, and only if, either of the following two statements holds:

$$[X - EX \mid X \ge F^{-1}(p)] \le_{\mathrm{icx}} [Y - EY \mid Y \ge G^{-1}(p)] \quad \text{for all } p \in [0, 1),$$

and

$$[X - EX \mid X \le F^{-1}(p)] \ge_{\mathrm{icx}} [Y - EY \mid Y \le G^{-1}(p)] \quad \text{for all } p \in [0, 1).$$

The following characterizations of the convex order, by means of the order ≤icx, are similar to characterizations (3.A.41) and (3.A.42). These characterizations follow at once from Theorem 4.A.41 and from (3.A.32).

Theorem 4.A.42. Let X and Y be two random variables with distribution functions F and G, respectively, and with equal finite means. Then X ≤cx Y if, and only if, either of the following two statements holds:

$$[X \mid X \ge F^{-1}(p)] \le_{\mathrm{icx}} [Y \mid Y \ge G^{-1}(p)] \quad \text{for all } p \in [0, 1),$$

and

$$[X \mid X \le F^{-1}(p)] \ge_{\mathrm{icx}} [Y \mid Y \le G^{-1}(p)] \quad \text{for all } p \in [0, 1).$$

In a manner similar to the characterization (3.B.6) of the dispersive order by the usual stochastic order, the increasing convex order can characterize the excess wealth order as follows.

Theorem 4.A.43. Let X and Y be two continuous random variables with distribution functions F and G, respectively. Then X ≤ew Y if, and only if,

$$(X - F^{-1}(\alpha))_+ \le_{\mathrm{icx}} (Y - G^{-1}(\alpha))_+, \quad \alpha \in (0, 1). \tag{4.A.29}$$

Proof. We give the proof under the assumption that F and G are strictly increasing; the more general proof can be found in the literature. First assume that (4.A.29) holds. Then, by (4.A.2) we get

$$E[(X - F^{-1}(\alpha))_+] \le E[(Y - G^{-1}(\alpha))_+], \quad \alpha \in (0, 1).$$

The latter inequality is easily seen to be equivalent to (3.C.5), and therefore X ≤ew Y.


In order to obtain the converse note that (4.A.29) is equivalent to

$$H(t, \alpha) \equiv \int_{t + G^{-1}(\alpha)}^\infty \bar G(x)\,dx - \int_{t + F^{-1}(\alpha)}^\infty \bar F(x)\,dx \ge 0, \quad (t, \alpha) \in [0, \infty) \times (0, 1).$$

Select an α ∈ (0, 1). Note that $\lim_{t \to \infty} H(t, \alpha) = 0$. If H(·, α) attains a minimum at t∗, since H(·, α) is continuous and differentiable, t∗ should satisfy $\left.\frac{\partial H(t, \alpha)}{\partial t}\right|_{t = t^*} = 0$. This equality holds if, and only if,

$$F(t^* + F^{-1}(\alpha)) = G(t^* + G^{-1}(\alpha)) = \beta, \text{ say.}$$

Since F and G are strictly increasing it is seen that $F^{-1}(\beta) = t^* + F^{-1}(\alpha)$ and $G^{-1}(\beta) = t^* + G^{-1}(\alpha)$. Therefore

$$H(t^*, \alpha) = \int_{G^{-1}(\beta)}^\infty \bar G(x)\,dx - \int_{F^{-1}(\beta)}^\infty \bar F(x)\,dx \ge 0,$$

where the inequality follows from X ≤ew Y.

Let X and Y be two nonnegative random variables with respective distribution functions F and G. Let $H_F^{-1}$ and $H_G^{-1}$ be the TTT transforms associated with F and G, respectively (see (1.A.19)), and let $H_F$ and $H_G$ be the respective inverses. Let Xttt and Yttt be random variables with distribution functions $H_F$ and $H_G$ (see Section 1.A.4).

Theorem 4.A.44. Let X and Y be two nonnegative random variables. Then X ≤icv Y ⟹ Xttt ≤icv Yttt.

See related results in Theorems 1.A.29, 3.B.1, 4.B.8, 4.B.9, and 4.B.29.

The next example may be compared with Examples 1.A.25, 1.B.6, and 1.C.51.

Example 4.A.45. Let Xi be a binomial random variable with parameters ni and pi, i = 1, 2, . . . , m, and assume that the Xi's are independent. Let Y be a binomial random variable with parameters n and p where $n = \sum_{i=1}^m n_i$. Then

$$\sum_{i=1}^m X_i \ge_{\mathrm{icx}} Y \iff p \le \sqrt[n]{p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}},$$

and

$$\sum_{i=1}^m X_i \le_{\mathrm{icx}} Y \iff p \ge \frac{\sum_{i=1}^m n_i p_i}{n}.$$
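The two thresholds in Example 4.A.45, a weighted geometric mean and a weighted arithmetic mean of the pi's, can be explored with exact binomial probabilities. The sketch below is an illustration under assumed parameters (n1 = 2, n2 = 3, p1 = 0.2, p2 = 0.6) with hypothetical helper names. For integer-valued variables E[(X − a)+] is piecewise linear in a with kinks at integers, and for a ≤ 0 the comparison reduces to the comparison of means, so the increasing convex order is checked at a = 0, 1, 2, . . . only.

```python
from math import comb

def binomial_pmf(n, p):
    """List of P{Y = k}, k = 0..n, for a binomial(n, p) random variable."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def convolve(pmf_a, pmf_b):
    """Probability function of the sum of two independent nonnegative integer variables."""
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out

def stop_loss(pmf, k):
    """E[(X - k)_+] for an integer-valued X given as a list of probabilities."""
    return sum(p * (value - k) for value, p in enumerate(pmf) if value > k)

def icx_leq(pmf_x, pmf_y, tol=1e-12):
    """X <=_icx Y for nonnegative integer-valued laws, checked at integer retention levels."""
    top = max(len(pmf_x), len(pmf_y))
    return all(stop_loss(pmf_x, k) <= stop_loss(pmf_y, k) + tol for k in range(top))

ns, ps = [2, 3], [0.2, 0.6]                  # assumed parameters of X1 and X2
n = sum(ns)
arithmetic = sum(ni * pi for ni, pi in zip(ns, ps)) / n
geometric = 1.0
for ni, pi in zip(ns, ps):
    geometric *= pi ** (ni / n)

total = [1.0]                                # law of X1 + X2, built by convolution
for ni, pi in zip(ns, ps):
    total = convolve(total, binomial_pmf(ni, pi))

print(icx_leq(total, binomial_pmf(n, arithmetic)))   # sum <=_icx Y at p = arithmetic mean
print(icx_leq(binomial_pmf(n, geometric), total))    # sum >=_icx Y at p = geometric mean
```

Both thresholds are boundary cases of the example, so each comparison holds with equality at one retention level (the means at p equal to the arithmetic mean, the top atom at p equal to the geometric mean); this is why a small tolerance is used.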

The following example gives necessary and suﬃcient conditions for the comparison of normal random variables; it is generalized in Example 7.A.13. See related results in Examples 1.A.26 and 3.A.51.


Example 4.A.46. Let X be a normal random variable with mean $\mu_X$ and variance $\sigma_X^2$, and let Y be a normal random variable with mean $\mu_Y$ and variance $\sigma_Y^2$. Then X ≤icx Y if, and only if, $\mu_X \le \mu_Y$ and $\sigma_X^2 \le \sigma_Y^2$.

Example 4.A.47. Let X1, X2, . . . , Xn be independent exponential random variables with distinct hazard rates λ1 > λ2 > · · · > λn > 0. Then $\frac{1}{n}\sum_{i=1}^n X_i \le_{\mathrm{icx}} X_n$.

Conditions for stochastic equality, for random variables that are ≤icx- or ≤icv-ordered, are given in the following result. This result may be compared to Theorems 1.A.8, 3.A.43, 3.A.60, 4.A.69, 5.A.15, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16.

Theorem 4.A.48. Let X and Y be two nonnegative random variables. Suppose that X ≤icx Y [X ≤icv Y] and that $E[X^r] = E[Y^r]$ for some r ∈ (1, ∞) [r ∈ (0, 1)], provided the expectations exist. Then X =st Y.

This result is a corollary of Theorem 4.A.69 below with p = 1. In fact, the following stronger result, which is an analog of Theorem 3.A.43, holds for the orders ≤icx and ≤icv.

Theorem 4.A.49. Let X and Y be two random variables. Suppose that X ≤icx [≤icv] Y and that for some increasing strictly convex [concave] function φ we have that E[φ(X)] = E[φ(Y)], provided the expectations exist. Then X =st Y.

Of course, in Theorem 4.A.49 we can replace "increasing strictly convex [concave] function" by "decreasing strictly concave [convex] function."

Theorem 4.A.50. Let X1, X2, . . . , Xn and Y1, Y2, . . . , Yn (n ≥ 2) be two collections of independent and identically distributed random variables. If X1 ≤icx Y1 and if E[max{X1, X2, . . . , Xn}] = E[max{Y1, Y2, . . . , Yn}], then X1 =st Y1.

Analogous to the result in Remark 1.A.18, it can be shown that the set of all distribution functions on R with finite means is a lattice with respect to the order ≤icx.

Meilijson and Nádas [389] have proved the following result which, for the sake of simplicity, we describe informally. Let X be a random variable with mean residual life function m (see, for example, (2.A.1)). Define H by H(x) = m(x) + x = E[X | X > x], for all x, and note that H is increasing. Denote $\tilde X = H(X)$. Then $\tilde X \ge_{\mathrm{st}} Y$ for every random variable Y which satisfies Y ≤icx X. In fact, Meilijson and Nádas [389] proved that $\tilde X$ is the least stochastic majorant in the sense that if another random variable Z also satisfies Z ≥st Y for every Y such that Y ≤icx X, then $\tilde X \le_{\mathrm{st}} Z$.
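Returning to the normal comparison of Example 4.A.46, the criterion E[(X − a)+] ≤ E[(Y − a)+] for all a can be evaluated in closed form, since for a normal random variable with mean µ and standard deviation σ one has E[(X − a)+] = σ[ϕ(z) − z(1 − Φ(z))] with z = (a − µ)/σ, where ϕ and Φ are the standard normal density and distribution function. The following sketch checks the inequality on a grid; the parameter values, the grid, and the function names are assumptions for illustration, and a grid check is a sanity check rather than a proof.

```python
import math

def normal_stop_loss(mu, sigma, a):
    """E[(X - a)_+] for X normal with mean mu and standard deviation sigma > 0."""
    z = (a - mu) / sigma
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2.0))      # 1 - Phi(z)
    return sigma * density - (a - mu) * upper_tail

def normal_icx_leq_on_grid(mu_x, sigma_x, mu_y, sigma_y, grid, tol=1e-12):
    """Grid check of E[(X - a)_+] <= E[(Y - a)_+]."""
    return all(
        normal_stop_loss(mu_x, sigma_x, a) <= normal_stop_loss(mu_y, sigma_y, a) + tol
        for a in grid
    )

grid = [-10.0 + 0.1 * k for k in range(201)]               # a in [-10, 10]
print(normal_icx_leq_on_grid(0.0, 1.0, 0.5, 2.0, grid))    # smaller mean, smaller variance
print(normal_icx_leq_on_grid(0.0, 2.0, 0.5, 1.0, grid))    # smaller mean, larger variance
```

In the second call the variance condition of the example fails, and the inequality breaks down for large a, where the heavier right tail of X dominates.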


4.A.5 Some properties in reliability theory

We have seen in Theorem 1.A.30 that a nonnegative random variable X is IFR [DFR] if, and only if, [X − t | X > t] ≥st [≤st] [X − t′ | X > t′] whenever t ≤ t′. A question of interest then is what one gets if, in the above condition, one replaces the order ≥st by the order ≥icx. It turns out that the order ≥icx can characterize another familiar aging notion in reliability theory. Recall from page 1 the definitions of DMRL and IMRL random variables. A combination of Theorems 2.A.23 and 4.A.24 provides a proof of the DMRL part of the next theorem. The proof of the IMRL part is similar.

Theorem 4.A.51. The nonnegative random variable X is DMRL [IMRL] if, and only if, [X − t | X > t] ≥icx [≤icx] [X − t′ | X > t′] whenever t ≤ t′.

Other characterizations of DMRL and IMRL random variables, by means of other stochastic orders, can be found in Theorems 2.A.23, 2.B.17, 3.A.56, and 3.C.13.

We will now describe a generalization of the sufficiency part of Theorem 4.A.51. For two independent random variables X and T, let $X_T$ denote a random variable that has the distribution of [X − T | X > T]. Note that $X_T$ is not the residual life of X given T.

Theorem 4.A.52. Let X, T1, and T2 be independent random variables. If T1 ≤rh T2, and if X is DMRL [IMRL], then $X_{T_1} \ge_{\mathrm{icx}} [\le_{\mathrm{icx}}]\ X_{T_2}$.

Proof. We will prove the DMRL part only. The proof of the IMRL part is similar. Let $\bar F$ denote the survival function of X, and let $\bar G_i$ denote the survival function of $X_{T_i}$, i = 1, 2. Then, for any fixed x we have

$$\int_x^\infty \bar G_2(y)\,dy - \int_x^\infty \bar G_1(y)\,dy = \frac{E[\bar F(T_1)]\, E\left[\int_x^\infty \bar F(T_2 + y)\,dy\right] - E[\bar F(T_2)]\, E\left[\int_x^\infty \bar F(T_1 + y)\,dy\right]}{E[\bar F(T_1)]\, E[\bar F(T_2)]}. \tag{4.A.30}$$

Define the functions α and β by $\alpha(t) = \int_x^\infty \bar F(t + y)\,dy$ and $\beta(t) = \bar F(t)$. Note that β is nonnegative and decreasing, and that α/β is decreasing because X is DMRL. Therefore, by Theorem 1.B.50(b), we see that the numerator in (4.A.30) is nonpositive for any x. It follows, by (4.A.5), that $X_{T_1} \ge_{\mathrm{icx}} X_{T_2}$.

Note that if the nonnegative random variable X is DMRL [IMRL], then, from Theorem 4.A.51 it follows that

$$X \ge_{\mathrm{icx}} [\le_{\mathrm{icx}}]\ [X - t \mid X > t] \quad \text{for all } t \ge 0. \tag{4.A.31}$$


Nonnegative random variables that satisfy (4.A.31) are called new better [worse] than used in convex ordering (NBUC [NWUC]) or new better [worse] than used in mean (NBUM [NWUM]). An equivalent definition of the NBUC notion, by means of the usual stochastic order, is given in (1.A.21). It is of interest to note that a nonnegative random variable X with survival function $\bar F$ is NBUC if, and only if,

$$\int_{x+t}^\infty \bar F(y)\,dy \le \frac{\bar F(t)}{1 - \bar F(t)} \int_x^{x+t} \bar F(y)\,dy \quad \text{for all } t \ge 0 \text{ and } x \ge 0. \tag{4.A.32}$$

It is worthwhile to point out that a nonnegative random variable X that satisfies (4.A.31), but with the increasing concave (rather than the increasing convex) order, is said to be NBU(2) [NWU(2)].

If a nonnegative random variable X satisfies

$$[X - t \mid X > t] \ge_{\mathrm{icv}} [\le_{\mathrm{icv}}]\ [X - t' \mid X > t'] \quad \text{whenever } t \le t', \tag{4.A.33}$$

then, in some places in the literature, the random variable X is said to have the IFR(2) [DFR(2)] property. However, Belzunce, Hu, and Khaledi [68] proved that the IFR(2) [DFR(2)] property is the same as the IFR [DFR] property. Thus they obtained the following characterization of the IFR [DFR] property.

Theorem 4.A.53. The nonnegative random variable X is IFR [DFR] if, and only if, (4.A.33) holds.

4.A.6 The starshaped order

A function φ : [0, ∞) → [0, ∞), which satisfies φ(0) = 0, is called starshaped if φ(x)/x is increasing in x on (0, ∞) (here we use the convention a/∞ = 0 for a > 0). Note that such a function is increasing. Note also that every increasing convex function φ on [0, ∞), such that φ(0) = 0, is starshaped. Let X and Y be two nonnegative random variables such that

$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all starshaped functions } \phi : [0, \infty) \to [0, \infty), \tag{4.A.34}$$

provided the expectations exist. Then X is said to be smaller than Y in the starshaped order (denoted by X ≤ss Y).

Theorem 4.A.54. Let X and Y be two nonnegative random variables with distribution functions F and G, respectively. Then X ≤ss Y if, and only if,

$$\int_y^\infty x\,dF(x) \le \int_y^\infty x\,dG(x), \quad y \ge 0. \tag{4.A.35}$$

Proof. The function $\phi_y$, defined by

$$\phi_y(x) = \begin{cases} 0, & x \le y, \\ x, & x > y, \end{cases}$$


is starshaped. Thus, (4.A.34) ⟹ (4.A.35). Conversely, let φ be a starshaped function. Then h(x) = φ(x)/x is increasing in x on (0, ∞). Approximate h by a sequence of increasing step functions $h_n$. Then (4.A.35) yields

$$\int_0^\infty x h_n(x)\,dF(x) \le \int_0^\infty x h_n(x)\,dG(x).$$

Letting n → ∞, we obtain (4.A.34).
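Criterion (4.A.35) is easy to evaluate for finitely supported distributions, because the tail partial expectation, taken here over (y, ∞) to match the function φy in the proof, is a right-continuous step function of y that changes only at atoms. The sketch below uses two assumed toy distributions and hypothetical helper names; it exhibits a pair that is ordered in the starshaped order although it is not ordered in the usual stochastic order.

```python
def tail_partial_expectation(pmf, y):
    """Sum of x * P{X = x} over atoms x > y, i.e. the integral of x dF(x) over (y, infinity)."""
    return sum(p * x for x, p in pmf.items() if x > y)

def ss_leq(pmf_x, pmf_y, tol=1e-12):
    """Check (4.A.35) at y = 0 and at every atom, which suffices for step functions."""
    points = {0} | set(pmf_x) | set(pmf_y)
    return all(
        tail_partial_expectation(pmf_x, y) <= tail_partial_expectation(pmf_y, y) + tol
        for y in points
    )

def st_leq(pmf_x, pmf_y, tol=1e-12):
    """Usual stochastic order for finitely supported laws: P{X > t} <= P{Y > t} at all atoms t."""
    survival = lambda pmf, t: sum(p for x, p in pmf.items() if x > t)
    points = set(pmf_x) | set(pmf_y)
    return all(survival(pmf_x, t) <= survival(pmf_y, t) + tol for t in points)

pmf_x = {1: 0.5, 2: 0.5}      # assumed law of X
pmf_y = {0: 0.4, 3: 0.6}      # assumed law of Y
print(ss_leq(pmf_x, pmf_y), st_leq(pmf_x, pmf_y))
```

The atom of Y at 0 rules out the usual stochastic order but contributes nothing to the tail partial expectations, which is why the starshaped comparison still holds.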

Theorem 4.A.54 shows that when the compared random variables have the same mean, then the starshaped order is equivalent to the usual stochastic ordering of the corresponding length-biased (or spread) random variables. Such random variables are studied in Examples 1.B.23, 1.C.59, 1.C.60, and 8.B.12.

Theorem 4.A.55. Let X and Y be two nonnegative random variables. Then X ≤st Y ⟹ X ≤ss Y ⟹ X ≤icx Y.

Proof. The first implication follows from the fact that a starshaped function φ, such that φ(0) = 0, is increasing. In order to prove the second implication, let φ be an increasing convex function. First suppose that φ(0) = 0. Then φ is starshaped and the inequality in (4.A.1) follows from X ≤ss Y. If φ(0) = a ≠ 0, then define $\tilde\phi(x) = \phi(x) - a$, x ≥ 0. The function $\tilde\phi$ is increasing convex, and it satisfies $\tilde\phi(0) = 0$. Thus, by the previous argument $E[\tilde\phi(X)] \le E[\tilde\phi(Y)]$; that is, E[φ(X)] − a ≤ E[φ(Y)] − a, and the inequality in (4.A.1) follows.

Some closure properties of the starshaped order are given in the next theorem.

Theorem 4.A.56. (a) If the nonnegative random variables X and Y are such that X ≤ss Y, and g is any starshaped function with g(0) = 0, then g(X) ≤ss g(Y). In particular, cX ≤ss cY for any c > 0.
(b) Let X, Y, and Θ be random variables such that [X | Θ = θ] ≤ss [Y | Θ = θ] for all θ in the support of Θ. Then X ≤ss Y. That is, the starshaped order is closed under mixtures.
(c) Let {Xj, j = 1, 2, . . . } and {Yj, j = 1, 2, . . . } be two sequences of nonnegative random variables such that Xj →st X and Yj →st Y as j → ∞. Assume that $EX^2$ and $EY^2$ are finite and that

$$\frac{EX_j^2}{EX_j} \to \frac{EX^2}{EX} \quad \text{and} \quad \frac{EY_j^2}{EY_j} \to \frac{EY^2}{EY} \quad \text{as } j \to \infty.$$

If Xj ≤ss Yj, j = 1, 2, . . ., then X ≤ss Y.

Theorem 4.A.57. Let X be a nonnegative random variable. Then $I_{[a,\infty)}(X) \le_{\mathrm{ss}} I_{[b,\infty)}(X)$ whenever b ≥ a ≥ 0, where $I_{[a,\infty)}$ and $I_{[b,\infty)}$ are the indicator functions of the indicated intervals.

The proof of Theorem 4.A.57 consists of verifying (4.A.35) in each of the cases y ≤ a, a < y ≤ b, and y > b.


4.A.7 Some related orders

Let X and Y be two random variables with survival functions $\bar F$ and $\bar G$, and distribution functions F and G, respectively. Let $F^{[k]}$, $\bar F^{[k]}$, $G^{[k]}$, and $\bar G^{[k]}$ be defined as in (3.A.66) and (3.A.67). The inequalities (4.A.5) and (4.A.7) can be generalized as follows: For a positive integer m suppose that

$$\bar F^{[m-1]}(x) \le \bar G^{[m-1]}(x) \quad \text{for all } x, \tag{4.A.36}$$

or that

$$F^{[m-1]}(x) \ge G^{[m-1]}(x) \quad \text{for all } x, \tag{4.A.37}$$

provided these integrals are finite (the integrals are finite if F and G have finite (m − 1)st moments). If (4.A.36) holds, then X is said to be smaller than Y in the m-icx order (denoted by X ≤m-icx Y). If it is known that X and Y take on values in $\mathbb{N}_{++}$, then the definition of the m-icx order can be modified, exploiting the special structure of $\mathbb{N}_{++}$; see Denuit and Lefèvre [146]. If (4.A.37) holds, then X is said to be smaller than Y in the m-icv order (denoted by X ≤m-icv Y).

It is seen from the definition that the orders ≤1-icx and ≤1-icv are equivalent to the order ≤st, the order ≤2-icx is equivalent to the order ≤icx, and the order ≤2-icv is equivalent to the order ≤icv.

The orders ≤m-icx and ≤m-icv have some properties that are similar to the properties of the orders ≤icx and ≤icv. For example, the extension of (4.A.4) is that X ≤m-icx Y if, and only if,

$$E[(X - a)_+]^{m-1} \le E[(Y - a)_+]^{m-1} \quad \text{for all } a. \tag{4.A.38}$$

The extension of (4.A.6) is that X ≤m-icv Y if, and only if,

$$E[(X - a)_-]^{m-1} \le E[(Y - a)_-]^{m-1} \quad \text{for all } a. \tag{4.A.39}$$
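Criterion (4.A.38) can be probed numerically for finitely supported distributions. The sketch below reads the expression in (4.A.38) as the expectation of the (m − 1)st power of (X − a)+, with the power 0 interpreted as the indicator of X > a so that m = 1 reduces to the comparison of survival functions; this reading, the toy distributions, the grid, and the helper names are all assumptions. A finite grid can refute the order but cannot establish it, so a positive answer is only suggestive.

```python
def truncated_power_moment(pmf, a, m):
    """E[((X - a)_+)^(m-1)], with the power 0 read as the indicator of X > a."""
    if m == 1:
        return sum(p for x, p in pmf.items() if x > a)
    return sum(p * (x - a) ** (m - 1) for x, p in pmf.items() if x > a)

def m_icx_leq_on_grid(pmf_x, pmf_y, m, grid, tol=1e-12):
    """Grid check of (4.A.38); necessary for X <=_{m-icx} Y, not sufficient."""
    return all(
        truncated_power_moment(pmf_x, a, m) <= truncated_power_moment(pmf_y, a, m) + tol
        for a in grid
    )

pmf_x = {1: 0.5, 2: 0.5}                       # assumed law of X
pmf_y = {0: 0.4, 3: 0.6}                       # assumed law of Y
grid = [-10.0 + 0.25 * k for k in range(61)]   # a in [-10, 5]
for m in (1, 2, 3):
    print(m, m_icx_leq_on_grid(pmf_x, pmf_y, m, grid))
```

On this example the check fails for m = 1 and passes for m = 2 and m = 3, which is consistent with the higher-degree orders being weaker than the usual stochastic order.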

The characterization (4.A.1) of the orders ≤icx and ≤icv has an analog for the orders ≤m-icx and ≤m-icv. We will not give the technical details here (see Section 4.C for a reference), but we just mention the following results. For m = 1, 2, . . ., let $M_{m\text{-icx}}$ be the set of all functions φ : R → R such that $\lim_{x \to -\infty} \phi(x)$ is finite, and whose first m − 1 derivatives, $\phi^{(1)}, \phi^{(2)}, \ldots, \phi^{(m-1)}$, exist, and are such that $\lim_{x \to -\infty} \phi^{(j)}(x) = 0$, j = 1, 2, . . . , m − 1, and $\phi^{(m-1)}$ is increasing. Let $\overline{M}_{m\text{-icx}}$ be the closure of $M_{m\text{-icx}}$ in the topology of weak convergence (that is, pointwise convergence in each continuity point of the limit). Let X and Y be two random variables and suppose that the support of each of them contains an interval of the form (−∞, a) for some a. Then X ≤m-icx Y if, and only if,

$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all functions } \phi \in \overline{M}_{m\text{-icx}}, \tag{4.A.40}$$

provided the expectations exist.


Next, for m = 1, 2, . . ., let $M_{m\text{-icv}}$ be the set of all functions φ : R → R such that $\lim_{x \to \infty} \phi(x)$ is finite, whose first m − 1 derivatives, $\phi^{(1)}, \phi^{(2)}, \ldots, \phi^{(m-1)}$, exist, and are such that $\lim_{x \to \infty} \phi^{(j)}(x) = 0$, j = 1, 2, . . . , m − 1, and $(-1)^{m-1}\phi^{(m-1)}$ is increasing. Let $\overline{M}_{m\text{-icv}}$ be the closure of $M_{m\text{-icv}}$ in the topology of weak convergence. Let X and Y be two random variables and suppose that the support of each of them contains an interval of the form (a, ∞) for some a. Then X ≤m-icv Y if, and only if,

$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all functions } \phi \in \overline{M}_{m\text{-icv}}, \tag{4.A.41}$$

provided the expectations exist.

Let us denote X ≤∞-icx [≤∞-icv] Y if

$$X \le_{m\text{-icx}} [\le_{m\text{-icv}}]\ Y \quad \text{for all positive integers } m. \tag{4.A.42}$$

A characterization of the order ≤∞-icv is given in Theorem 5.A.17.

It can be shown that if X and Y have finite (m − 1)st moments, then X ≤m-icx Y ⟹ E[X] ≤ E[Y] and X ≤m-icv Y ⟹ E[X] ≤ E[Y], provided the expectations exist. In fact we have the following more general result.

Theorem 4.A.58. Let X and Y be two random variables with finite first m − 1 moments. If X ≤m-icx Y [X ≤m-icv Y], then $EX^k < EY^k$ [$(-1)^{k+1} EX^k < (-1)^{k+1} EY^k$] for the smallest k for which $EX^k \ne EY^k$.

Some closure properties of the orders ≤m-icx and ≤m-icv are stated next. We omit the proof of the following theorem. Note, however, that parts (b) and (c) of the next theorem are easy to prove. The proof of part (a) uses the fact that if φ ∈ $M_{m\text{-icx}}$ [$M_{m\text{-icv}}$] then $\phi^{(j)}$ [$(-1)^j \phi^{(j)}$] is nonnegative and increasing [decreasing] for all j ∈ {1, 2, . . . , m − 1}, and therefore $M_{m\text{-icx}}$ and $M_{m\text{-icv}}$ are closed under compositions.

Theorem 4.A.59. (a) Let X and Y be two random variables and suppose that the support of each of them contains an interval of the form (−∞, a) [(a, ∞)] for some a. If X ≤m-icx [≤m-icv] Y and if g is any function in $M_{m\text{-icx}}$ [$M_{m\text{-icv}}$], then g(X) ≤m-icx [≤m-icv] g(Y).
(b) Let X, Y, and Θ be random variables such that, for all θ in the support of Θ, we have that [X | Θ = θ] ≤m-icx [≤m-icv] [Y | Θ = θ]. Then X ≤m-icx [≤m-icv] Y. That is, the m-icx [m-icv] order is closed under mixtures.
(c) Let {Xj, j = 1, 2, . . . } and {Yj, j = 1, 2, . . . } be two sequences of random variables such that Xj →st X and Yj →st Y as j → ∞. Assume that $E(X_+)^{m-1}$ and $E(Y_+)^{m-1}$ are finite and that

\[ E(X_j)_+^{m-1} \to EX_+^{m-1}\ \ [E(X_j)_-^{m-1} \to EX_-^{m-1}] \quad \text{and} \quad E(Y_j)_+^{m-1} \to EY_+^{m-1}\ \ [E(Y_j)_-^{m-1} \to EY_-^{m-1}] \quad \text{as } j \to \infty. \tag{4.A.43} \]

If $X_j \le_{m\text{-icx}}\ [\le_{m\text{-icv}}]\ Y_j$, $j = 1, 2, \ldots$, then $X \le_{m\text{-icx}}\ [\le_{m\text{-icv}}]\ Y$.

(d) Let $X_1, X_2, \ldots, X_l$ be a set of independent random variables and let $Y_1, Y_2, \ldots, Y_l$ be another set of independent random variables. If $X_i \le_{m\text{-icx}}\ [\le_{m\text{-icv}}]\ Y_i$ for $i = 1, 2, \ldots, l$, then

\[ \sum_{j=1}^{l} X_j \ \le_{m\text{-icx}}\ [\le_{m\text{-icv}}]\ \sum_{j=1}^{l} Y_j. \]

That is, the m-icx [m-icv] order is closed under convolutions.

In part (c), as in Theorem 3.A.12, the condition (4.A.43) is necessary; without it the conclusion of part (c) may not hold.

The following result, which extends the m-icx part of Theorem 4.A.59(d), is essentially the same as Theorem 8.A.29.

Theorem 4.A.60. Let $X_1, X_2, \ldots$ be a set of independent random variables and let $Y_1, Y_2, \ldots$ be another set of independent random variables. Let $N_1$ be an integer-valued random variable that is independent of the $X_i$'s, and let $N_2$ be an integer-valued random variable that is independent of the $Y_i$'s. If $X_i \le_{m\text{-icx}} Y_i$ for $i = 1, 2, \ldots$, and if $N_1 \le_{m\text{-icx}} N_2$, then

\[ \sum_{j=1}^{N_1} X_j \ \le_{m\text{-icx}}\ \sum_{j=1}^{N_2} Y_j. \]

For the orders ≤m-icx and ≤m-icv , the analog of Theorem 3.A.12(a) is the following. Theorem 4.A.61. Let X and Y be two random variables. Then X ≤m-icx [≤m-icv ] Y ⇐⇒ −X ≥m-icv [≥m-icx ] − Y. The proof of Theorem 4.A.61 easily follows from (4.A.36) and (4.A.37). It is not hard to verify the next statement. Theorem 4.A.62. Consider two random variables X and Y . If X ≤m1 -icx [≤m1 -icv ] Y , then X ≤m2 -icx [≤m2 -icv ] Y for all m2 ≥ m1 . Since the order ≤1-icx is the same as the order ≤st we see that X ≤st Y =⇒ X ≤m-icx Y and that X ≤st Y =⇒ X ≤m-icv Y. The following obvious relationships hold between the orders of Section 3.A.5 and the present orders:


X ≤Sm-cx Y =⇒ X ≤m-icx Y, and

X ≤Sm-cv Y =⇒ X ≤m-icv Y.

Sufficient conditions for $X \le_{m\text{-icx}} Y$ are given in the next result, which is related to Theorem 4.A.22. It is of interest to compare the next result with Theorem 3.A.66.

Theorem 4.A.63. Let $X$ and $Y$ be two nonnegative random variables with distribution functions $F$ and $G$, respectively, and with density functions $f$ and $g$, respectively, such that $E[X^i] = E[Y^i]$, $i = 1, 2, \ldots, m-2$, and $E[X^{m-1}] \le E[Y^{m-1}]$.
(a) If $S^-(F - G) \le m-1$ and if the last sign of $F - G$ is a $+$, then $X \le_{m\text{-icx}} Y$.
(b) If $S^-(f - g) \le m$ and if the last sign of $g - f$ is a $+$, then $X \le_{m\text{-icx}} Y$.

The following example describes a typical application of Theorem 4.A.63.

Example 4.A.64. Let the inverse Gaussian random variable $Y$, and the lognormal random variable $Z$, be as in Example 3.A.67; in particular they both have the mean $\alpha/\beta$ and the second moment $\alpha(\alpha+1)/\beta^2$. We claim that $Y \le_{4\text{-icx}} Z$. In order to see it, first note, as in Example 3.A.67, that without loss of generality we can take the means to be equal to 1, that is, $\beta = \alpha$. Now, a straightforward computation yields

\[ \log\frac{f_Y(x)}{f_Z(x)} = C + \frac{\log^2 x}{2\tau^2} - \frac{\alpha x}{2} - \frac{\alpha}{2x}, \quad x > 0, \]

where $C$ is some constant. Substituting $u = \log x$, the second derivative of the above expression is seen to have two sign changes. Therefore the expression itself has at most four sign changes. We also have here, by a lengthy computation (see Kaas and Hesselager [270]), that $E[Y^3] < E[Z^3]$. The stated result now follows from Theorem 4.A.63(b).

In fact, it can be shown that if $X$, $Y$, and $Z$ are, respectively, Gamma, inverse Gaussian, and lognormal random variables (with parameters that are different from the ones in Example 3.A.67), such that $E[X] = E[Y] = E[Z]$ and $E[X^2] \le E[Y^2] \le E[Z^2]$, then $X \le_{3\text{-icx}} Y$, $X \le_{3\text{-icx}} Z$, and $Y \le_{4\text{-icx}} Z$. Some comparisons of Gamma, inverse Gaussian, lognormal, and Birnbaum-Saunders random variables in the $\le_{3\text{-icv}}$ sense were derived by Klar [300].

Consider now a family of distribution functions $\{G_\theta, \theta \in \mathbb{R}\}$. As in Section 1.A.3 let $X(\theta)$ denote a random variable with distribution function $G_\theta$. For any random variable $\Theta$ with support $\mathbb{R}$, and with distribution function $F$, let us denote by $X(\Theta)$ a random variable with distribution function $H$ given by

\[ H(y) = \int_{\mathcal{X}} G_\theta(y)\,dF(\theta), \quad y \in \mathbb{R}. \]

The following result generalizes Theorem 4.A.8(a), just as Theorem 1.A.6 generalized Theorem 1.A.3(a). Its proof is similar to the proof of Theorem 4.A.18, using the fact that $M_{m\text{-icx}}$ and $M_{m\text{-icv}}$ are closed under compositions. We omit the details.

Theorem 4.A.65. Consider a family of distribution functions $\{G_\theta, \theta \in \mathbb{R}\}$ as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with support $\mathbb{R}$ and distribution functions $F_1$ and $F_2$, respectively. Let $Y_1$ and $Y_2$ be two random variables such that $Y_i =_{\mathrm{st}} X(\Theta_i)$, $i = 1, 2$; that is, suppose that the distribution function of $Y_i$ is given by

\[ H_i(y) = \int_{\mathcal{X}} G_\theta(y)\,dF_i(\theta), \quad y \in \mathbb{R},\ i = 1, 2. \]

If $\psi_\phi$, defined by $\psi_\phi(\theta) \equiv E[\phi(X(\theta))]$, is in $M_{m\text{-icx}}$ [$M_{m\text{-icv}}$] whenever $\phi \in M_{m\text{-icx}}$ [$\phi \in M_{m\text{-icv}}$], and if $\Theta_1 \le_{m\text{-icx}}\ [\le_{m\text{-icv}}]\ \Theta_2$, then $Y_1 \le_{m\text{-icx}}\ [\le_{m\text{-icv}}]\ Y_2$.

For example, the family $\{G_\theta, \theta \ge 0\}$ of the Poisson distributions (or, in fact, every family of distribution functions whose associated density functions $\{g_\theta, \theta \in \mathbb{R}\}$ satisfy that $g_\theta(x)$ is totally positive of order $m$; see Karlin [275]) satisfies the condition in Theorem 4.A.65 that $\psi_\phi$ is in $M_{m\text{-icx}}$ [$M_{m\text{-icv}}$] whenever $\phi \in M_{m\text{-icx}}$ [$\phi \in M_{m\text{-icv}}$].

A Laplace transform characterization of the orders $\le_{m\text{-icx}}$ and $\le_{m\text{-icv}}$ is given next; it may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, 2.A.16, 2.B.14, and 4.A.21. Before stating it we make a few observations. First, note that the random variables $X_1$ and $X_2$ in the theorem below have the support $[0, \infty)$. Then the characterizations (4.A.40) and (4.A.41) are still valid provided the test functions $\phi$ in (4.A.40) satisfy $\phi^{(j)}(0) = 0$ (rather than $\lim_{x\to-\infty}\phi^{(j)}(x) = 0$), $j = 1, 2, \ldots, m-1$. Next, note that the random variables $N_\lambda(X_1)$ and $N_\lambda(X_2)$ in the theorem below are discrete with support $\mathbb{N}_+$. There are several ways of defining the orders $\le_{m\text{-icx}}$ and $\le_{m\text{-icv}}$ for such random variables. One possible way is by the requirement (4.A.36) or (4.A.37) (or, equivalently, by (4.A.38) or (4.A.39)). Another possible way is by replacing the integrals in (4.A.36) or (4.A.37) by sums. In the theorem below we adopt a definition that is a discrete analog of (4.A.40) and (4.A.41).

For $m = 1, 2, \ldots$, let $K_{m\text{-icx}}$ be the set of functions $\phi : \mathbb{N}_+ \to \mathbb{R}$ such that $\Delta_\phi^{(j)}(0) = 0$, $j = 0, 1, \ldots, m-1$ (where $\Delta_\phi^{(0)}(n) \equiv \phi(n)$ and $\Delta_\phi^{(j)}(n) = \Delta_\phi^{(j-1)}(n+1) - \Delta_\phi^{(j-1)}(n)$, $j = 1, 2, \ldots$), and such that $\Delta_\phi^{(m-1)}(n)$ is increasing on $\mathbb{N}_+$. For the discrete random variables $M_1$ and $M_2$ denote $M_1 \le_{m\text{-icx}} M_2$ if $E[\phi(M_1)] \le E[\phi(M_2)]$ for all functions $\phi \in K_{m\text{-icx}}$. Similarly, let $K_{m\text{-icv}}$ be the set of functions $\phi : \mathbb{N}_+ \to \mathbb{R}$ such that $\lim_{n\to\infty}\Delta_\phi^{(j)}(n) = 0$, $j = 0, 1, \ldots, m-1$, and such that $(-1)^{m-1}\Delta_\phi^{(m-1)}(n)$ is increasing on $\mathbb{N}_+$. For the discrete random variables $M_1$ and $M_2$ denote $M_1 \le_{m\text{-icv}} M_2$ if $E[\phi(M_1)] \le E[\phi(M_2)]$ for all functions $\phi \in K_{m\text{-icv}}$.

Theorem 4.A.66. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

\[ X_1 \le_{m\text{-icx}}\ [\le_{m\text{-icv}}]\ X_2 \iff N_\lambda(X_1) \le_{m\text{-icx}}\ [\le_{m\text{-icv}}]\ N_\lambda(X_2) \quad \text{for all } \lambda > 0. \]
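The iterated forward differences that define the discrete class of test functions for the m-icx order can be tabulated directly. The sketch below is only an illustration: the helper names `forward_differences` and `in_K_icx` are hypothetical, and the membership test inspects a finite range of n, so a `True` answer is evidence on that range rather than a proof.

```python
from math import comb

def forward_differences(phi, order, n_max):
    """Return the tables of Delta^(0), ..., Delta^(order) for n = 0, ..., n_max."""
    # Delta^(0)(n) = phi(n); tabulate far enough that the highest-order
    # difference is still available at n = n_max.
    table = [[phi(n) for n in range(n_max + order + 1)]]
    for _ in range(order):
        prev = table[-1]
        table.append([prev[n + 1] - prev[n] for n in range(len(prev) - 1)])
    return [row[: n_max + 1] for row in table]

def in_K_icx(phi, m, n_max=50):
    """Finite-range check of the two conditions in the definition of K_{m-icx}."""
    diffs = forward_differences(phi, m - 1, n_max)
    # Condition 1: Delta^(j)(0) = 0 for j = 0, ..., m-1.
    vanish_at_zero = all(d[0] == 0 for d in diffs)
    # Condition 2: Delta^(m-1) is increasing (nondecreasing) in n.
    top = diffs[m - 1]
    increasing = all(top[n] <= top[n + 1] for n in range(n_max))
    return vanish_at_zero and increasing

m = 3
print(in_K_icx(lambda n: comb(n, m), m))  # binomial coefficient C(n, 3)
print(in_K_icx(lambda n: n, m))           # fails: the first difference at 0 is 1
```

The binomial coefficient C(n, m) is a convenient test function here because each forward difference lowers its lower index by one, so all differences up to order m-1 vanish at 0.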

The proof of this theorem is similar to the proof of Theorem 4.A.21 and is therefore omitted.

Another family of orders that are related to the $\le_{\mathrm{cx}}$, $\le_{\mathrm{icx}}$, and $\le_{\mathrm{icv}}$ orders can be defined by a generalization of (4.A.5) and (4.A.7) that is different from the generalization that is described in (4.A.36) and (4.A.37). Let $X$ and $Y$ be two nonnegative random variables with distribution functions $F$ and $G$, and survival functions $\bar{F}$ and $\bar{G}$, respectively. Let $p > 0$ and suppose that $E[X^p]$ and $E[Y^p]$ exist. If

\[ \int_x^\infty u^{p-1}\bar{F}(u)\,du \le \int_x^\infty u^{p-1}\bar{G}(u)\,du \ \text{ for all } x, \quad \text{and} \quad E[X^p] = E[Y^p], \]

then $X$ is said to be smaller than $Y$ in pth order (denoted by $X \le_p Y$). If

\[ \int_x^\infty u^{p-1}\bar{F}(u)\,du \le \int_x^\infty u^{p-1}\bar{G}(u)\,du \ \text{ for all } x, \]

then $X$ is said to be smaller than $Y$ in p+ order (denoted by $X \le_{p+} Y$). Finally, if

\[ \int_0^x u^{p-1}F(u)\,du \ge \int_0^x u^{p-1}G(u)\,du \ \text{ for all } x, \]

then $X$ is said to be smaller than $Y$ in p− order (denoted by $X \le_{p-} Y$). It is not hard to verify that for nonnegative random variables $X$ and $Y$ we have

\[ X \le_p Y \iff X^p \le_{\mathrm{cx}} Y^p, \tag{4.A.44} \]

\[ X \le_{p+} Y \iff X^p \le_{\mathrm{icx}} Y^p, \]

and

\[ X \le_{p-} Y \iff X^p \le_{\mathrm{icv}} Y^p. \tag{4.A.45} \]
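The integral condition defining the p+ order can be checked numerically for a concrete pair. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the integrals are truncated at a finite upper limit and approximated by a midpoint rule, and the pair Exp(rate 2) and Exp(rate 1) is chosen only because it is ordered in the usual stochastic order, so the p+ comparison must hold for every p > 0.

```python
import math

def tail_integral(survival, p, x, upper=60.0, steps=4000):
    """Midpoint-rule approximation of the integral of u^(p-1) * survival(u) over [x, upper]."""
    if x >= upper:
        return 0.0
    h = (upper - x) / steps
    total = 0.0
    for k in range(steps):
        u = x + (k + 0.5) * h
        total += u ** (p - 1) * survival(u)
    return total * h

def smaller_in_p_plus(surv_x, surv_y, p, grid, tol=1e-9):
    """Check the defining inequality of the p+ order on a finite grid of x values."""
    return all(
        tail_integral(surv_x, p, x) <= tail_integral(surv_y, p, x) + tol
        for x in grid
    )

surv_x = lambda u: math.exp(-2.0 * u)  # survival function of Exp(rate 2)
surv_y = lambda u: math.exp(-u)        # survival function of Exp(rate 1)
grid = [0.25 * k for k in range(41)]   # x = 0, 0.25, ..., 10

print(smaller_in_p_plus(surv_x, surv_y, p=2.5, grid=grid))
print(smaller_in_p_plus(surv_y, surv_x, p=2.5, grid=grid))
```

Because both integrals are evaluated at the same nodes, a pointwise inequality between the survival functions carries over exactly to the approximations, which keeps the grid check free of spurious failures for this pair.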

It is seen at once that $X \le_p Y \Longrightarrow X \le_{p+} Y$, and that $X \le_p Y \Longrightarrow Y \le_{p-} X$. Notice that, for $p = m$, the order $\le_{p+}$ [$\le_{p-}$] is not the same as the order $\le_{m\text{-icx}}$ [$\le_{m\text{-icv}}$]. In fact, $X \le_{m+} Y$ if, and only if,

\[ E[(X^m - a)_+] \le E[(Y^m - a)_+] \quad \text{for all } a \]

(compare this to (4.A.38)), and $X \le_{m-} Y$ if, and only if,

\[ E[(X^m - a)_-] \le E[(Y^m - a)_-] \quad \text{for all } a \]

(compare this to (4.A.39)). It is easy to verify that the orders $\le_p$, $\le_{p+}$, and $\le_{p-}$ are closed under mixtures. They are also closed under limits in distribution provided a condition on convergence of moments, which is an obvious modification of (4.A.10) (similar to (4.A.43)), holds. The following result points out some interrelationships among these orders.

Theorem 4.A.67. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{p+}\ [\le_{p-}]\ Y$, then $X \le_{q+}\ [\le_{q-}]\ Y$ whenever $q \ge p$ [$q \le p$].

A relationship to the order $\le_*$ is given next (the order $\le_*$ is defined in Section 4.B below).

Theorem 4.A.68. Let $X$ and $Y$ be two nonnegative random variables that have finite pth moments and that are not degenerate at 0. If $X \le_* Y$ and if $E[X^p] = E[Y^p]$, then $X \le_p Y$.

A simple proof of Theorem 4.A.68 will be given in Remark 4.B.24. Motivated by the result of Theorem 1.A.8 (see also Theorems 3.A.43, 3.A.60, 4.A.48, 5.A.15, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16), the following results have been derived.

Theorem 4.A.69. Let $X$ and $Y$ be two nonnegative random variables. Suppose that $X \le_{p+} Y$ [$X \ge_{p-} Y$] and that $E[X^r] = E[Y^r]$ for some $r \in (p, \infty)$ [$r \in (0, p)$], provided the expectations exist. Then $X =_{\mathrm{st}} Y$.

Theorem 4.A.70. Let $X$ and $Y$ be two nonnegative random variables with finite means and distribution functions $F$ and $G$, respectively. If $X \le_p Y$ and if

\[ \int_0^1 \bigl(F^{-1}(t)\bigr)^r\,d\phi(t) = \int_0^1 \bigl(G^{-1}(t)\bigr)^r\,d\phi(t) \]

for some $r \ge p$ and some increasing and strictly convex function $\phi : [0, 1] \to \mathbb{R}$, then $X =_{\mathrm{st}} Y$.

We end this section by mentioning still another sequence of orders that is based on iterated integrals. If $F$ is a distribution function, then let $F^{-1}$ denote the inverse of $F$ (see page 1). Denote recursively

\[ F_1^{-1}(p) = F^{-1}(p), \quad p \in [0, 1], \]

and

\[ F_n^{-1}(p) = \int_p^1 F_{n-1}^{-1}(u)\,du, \quad p \in [0, 1], \tag{4.A.46} \]

for $n = 2, 3, \ldots$. Similarly define $G_n^{-1}$ for a distribution function $G$. For any positive integer $m$, if the distribution functions $F$ and $G$, of the random variables $X$ and $Y$, satisfy

\[ F_m^{-1}(p) \le G_m^{-1}(p) \quad \text{for } p \in [0, 1], \]

then we denote $X \le_m^{-1} Y$. It is easy to see that $X \le_1^{-1} Y \iff X \le_{\mathrm{st}} Y$. Also, if $EX = EY$, then, by Theorem 3.A.5 we see that $X \le_2^{-1} Y \iff X \le_{\mathrm{cx}} Y$. From (4.A.46) we obtain at once the following result.

Theorem 4.A.71. Let $X$ and $Y$ be two random variables. If $X \le_{m_1}^{-1} Y$, then $X \le_{m_2}^{-1} Y$ for all $m_2 \ge m_1$.

A necessary condition for $X \le_m^{-1} Y$ is given in the next result.

Theorem 4.A.72. Let $X$ and $Y$ be two random variables. If $X \le_m^{-1} Y$, then

\[ E[\max\{X_1, X_2, \ldots, X_k\}] \le E[\max\{Y_1, Y_2, \ldots, Y_k\}], \quad k \ge m-1, \]

where the $X_i$'s [$Y_i$'s] are independent random variables, all distributed according to the distribution of $X$ [$Y$].

Proof. Let $F$ and $G$ denote the distribution functions of $X$ and $Y$, respectively. A straightforward computation yields

\[ F_m^{-1}(0) = \frac{1}{(m-1)!}\,E[\max\{X_1, X_2, \ldots, X_{m-1}\}] \quad \text{and} \quad G_m^{-1}(0) = \frac{1}{(m-1)!}\,E[\max\{Y_1, Y_2, \ldots, Y_{m-1}\}]. \]

Therefore $E[\max\{X_1, X_2, \ldots, X_{m-1}\}] \le E[\max\{Y_1, Y_2, \ldots, Y_{m-1}\}]$. The inequality for $k > m-1$ now follows from Theorem 4.A.71.
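The recursion (4.A.46) is easy to explore numerically. The following sketch tabulates the iterated inverse on a uniform grid of p values with a cumulative trapezoid rule; the function name `iterated_inverse` and the choice of the uniform[0, 1] and uniform[0, 2] quantile functions are illustrative assumptions, picked because the iterated integrals then have simple closed forms to compare against.

```python
def iterated_inverse(quantile, n, steps=2000):
    """Tabulate F_n^{-1}(p) for p = k / steps, k = 0, ..., steps, via (4.A.46)."""
    h = 1.0 / steps
    values = [quantile(k * h) for k in range(steps + 1)]  # F_1^{-1} on the grid
    for _ in range(n - 1):
        integrated = [0.0] * (steps + 1)
        # Accumulate the trapezoid rule from p = 1 down to p = 0, so that
        # integrated[k] approximates the integral of `values` over [p_k, 1].
        for k in range(steps - 1, -1, -1):
            integrated[k] = integrated[k + 1] + 0.5 * h * (values[k] + values[k + 1])
        values = integrated
    return values

uniform_01 = lambda u: u        # quantile function of uniform[0, 1]
uniform_02 = lambda u: 2.0 * u  # quantile function of uniform[0, 2]

f2 = iterated_inverse(uniform_01, 2)
f3 = iterated_inverse(uniform_01, 3)
g3 = iterated_inverse(uniform_02, 3)

print(f2[0])  # the integral of the quantile function over [0, 1], i.e. the mean
print(f3[0])  # approximately 1/3 for the uniform[0, 1] quantile function
print(all(a <= b + 1e-12 for a, b in zip(f3, g3)))  # pointwise comparison on the grid
```

For the uniform[0, 1] case the closed form is $F_3^{-1}(p) = 1/3 - p/2 + p^3/6$, which gives a quick sanity check on the numerical table.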

4.B Transform Orders: The Convex, Star, and Superadditive Orders

4.B.1 Definitions

Let $X$ and $Y$ be two nonnegative random variables with distribution functions $F$ and $G$, respectively. Suppose that the support of $X$ is an interval (finite or infinite).

We say that $X$ is smaller than $Y$ in the convex transform order (denoted as $X \le_c Y$) if $G^{-1}F(x)$ is convex in $x$ on the support of $F$. We say that $X$ is smaller than $Y$ in the star order (denoted by $X \le_* Y$) if $G^{-1}F(x)$ is starshaped in $x$ (that is, if $G^{-1}F(x)/x$ increases in $x \ge 0$). It is easily seen that $X \le_* Y$ if, and only if,

\[ \frac{G^{-1}(u)}{F^{-1}(u)} \ \text{ is increasing in } u \in (0, 1). \tag{4.B.1} \]

Also, recalling the definition of the number of sign changes in (1.A.18), it is easily seen that $X \le_* Y$ if, and only if, for all $b > 0$ we have that

\[ S^-(F(\cdot) - G(b\,\cdot)) \le 1, \tag{4.B.2} \]

and the sign sequence is $-, +$ if a crossing occurs. We say that $X$ is smaller than $Y$ in the superadditive order (denoted by $X \le_{\mathrm{su}} Y$) if $G^{-1}F(x)$ is superadditive in $x$ (that is, if $G^{-1}F(x + y) \ge G^{-1}F(x) + G^{-1}F(y)$ for all $x \ge 0$ and $y \ge 0$).

4.B.2 Some properties

Every nonnegative function that vanishes at 0, and that is increasing and convex on $[0, \infty)$, is also starshaped on $[0, \infty)$. Furthermore, every nonnegative function that vanishes at 0, and that is increasing and starshaped on $[0, \infty)$, is also superadditive on $[0, \infty)$. Therefore, for any two nonnegative random variables $X$ and $Y$ we have

\[ X \le_c Y \Longrightarrow X \le_* Y \quad \text{and} \quad X \le_* Y \Longrightarrow X \le_{\mathrm{su}} Y. \tag{4.B.3} \]

The star order is related to the dispersion order as follows:

Theorem 4.B.1. Let $X$ and $Y$ be two nonnegative random variables. Then

\[ X \le_* Y \iff \log X \le_{\mathrm{disp}} \log Y. \tag{4.B.4} \]

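Proof. The relation $X \le_* Y$ holds if, and only if, $G^{-1}F(x)/x$ is increasing in $x \ge 0$; that is, if, and only if, $\log G^{-1}F(x) - \log x = \log G^{-1}F(e^{\log x}) - \log x$ is increasing in $x$. Since $t \mapsto \log G^{-1}F(e^t)$ is the transformation that carries the distribution of $\log X$ into that of $\log Y$, the result now follows from (3.B.10).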
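Characterization (4.B.1) reduces the star order to the monotonicity of a ratio of quantile functions, which can be inspected on a grid. The sketch below is illustrative only: the helper name is hypothetical, the check covers finitely many values of u, and the pair uniform[0, 1] versus standard exponential is an assumed example. By (4.B.4) the same computation also checks that the difference of the log-quantiles is increasing.

```python
import math

def quantile_ratio_is_increasing(f_inv, g_inv, steps=999, tol=1e-12):
    """Grid check of (4.B.1): G^{-1}(u) / F^{-1}(u) is increasing in u on (0, 1)."""
    ratios = []
    for k in range(1, steps + 1):
        u = k / (steps + 1)          # interior grid point, never 0 or 1
        ratios.append(g_inv(u) / f_inv(u))
    return all(a <= b + tol for a, b in zip(ratios, ratios[1:]))

uniform_inv = lambda u: u                   # quantile function of uniform[0, 1]
exp_inv = lambda u: -math.log1p(-u)         # quantile function of Exp(1)

print(quantile_ratio_is_increasing(uniform_inv, exp_inv))  # uniform vs exponential
print(quantile_ratio_is_increasing(exp_inv, uniform_inv))  # the reverse comparison
```

For this pair the ratio is $-\log(1-u)/u = 1 + u/2 + u^2/3 + \cdots$, which is increasing in u, so the first check succeeds and the reverse one fails.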

An equivalent way of writing (4.B.4) is the following. For any two nonnegative random variables $X$ and $Y$,

\[ X \le_{\mathrm{disp}} Y \iff e^X \le_* e^Y. \]

Under an obvious restriction, the superadditive (and hence also the star and the convex transform) order implies the dispersion order as is shown in the next theorem.

Theorem 4.B.2. Let $X$ and $Y$ be two nonnegative random variables such that $X \le_{\mathrm{st}} Y$. If $X \le_{\mathrm{su}} Y$, then $X \le_{\mathrm{disp}} Y$.

Proof. Let $F$ and $G$ denote the distribution functions of $X$ and $Y$, respectively, and let $S_F$ denote the support of $F$. Let $x$ and $y$ be two values in $S_F$. Then

\[ G^{-1}F(x + y) - (x + y) \ge G^{-1}F(x) + G^{-1}F(y) - (x + y) \ge G^{-1}F(y) - y, \]

where the first inequality follows from $X \le_{\mathrm{su}} Y$ and the second inequality follows from $F(x) \ge G(x)$, which gives $G^{-1}F(x) \ge x$. Thus $G^{-1}F(x) - x$ is increasing in $x$. Now, from (3.B.10), we obtain $X \le_{\mathrm{disp}} Y$.

The condition $X \le_{\mathrm{st}} Y$ is clearly needed because without it it is impossible that $X \le_{\mathrm{disp}} Y$ (see Theorem 3.B.13). The condition $X \le_{\mathrm{su}} Y$ by itself (in fact, even the condition $X \le_* Y$) does not necessarily imply that $X \le_{\mathrm{st}} Y$.

Theorem 4.B.2, together with (4.B.3), implies that if $X$ and $Y$ are two nonnegative random variables with finite means such that $X \le_{\mathrm{st}} Y$ and if $X \le_{\mathrm{su}} Y$ (and therefore if $X \le_c Y$ or if $X \le_* Y$), then (see Theorem 3.B.16) $[X - EX] \le_{\mathrm{cx}} [Y - EY]$, and in particular, $\mathrm{Var}(X) \le \mathrm{Var}(Y)$.

Another condition, under which $X \le_{\mathrm{su}} Y$ implies $X \le_{\mathrm{disp}} Y$, is given in the next theorem.

Theorem 4.B.3. Let $X$ and $Y$ be two nonnegative random variables with distributions $F$ and $G$, respectively, such that $\lim_{x\to 0}(G^{-1}F(x)/x) \ge 1$. If $X \le_{\mathrm{su}} Y$, then $X \le_{\mathrm{disp}} Y$. In particular, if $F$ and $G$ are absolutely continuous with $F(0) = G(0) = 0$ and their corresponding density functions $f$ and $g$ are such that $f(0) \ge g(0) > 0$, then $X \le_{\mathrm{su}} Y$ implies $X \le_{\mathrm{disp}} Y$.

The relationship between the orders $\le_*$ and $\le_{\mathrm{icx}}$ is described in the next theorem.

Theorem 4.B.4. Let $X$ and $Y$ be two nonnegative random variables such that $EX \le EY$. If $X \le_* Y$, then $X \le_{\mathrm{icx}} Y$.

Proof. First we show that $X \le_* Y \Longrightarrow X \le_{\mathrm{Lorenz}} Y$. To this end we can assume temporarily, without loss of generality, since both orders are scale invariant, that $EX = EY = 1$. Let $F$ and $G$ denote the distribution functions of $X$ and of $Y$, respectively. If $F \equiv G$, then the result is trivial. Thus assume $F \not\equiv G$. From (4.B.2) (with $b = 1$), and from the fact that $EX = EY$, it follows that $S^-(G - F) = 1$, and that the sign sequence is $+, -$. Thus, from (3.A.59) we obtain $X \le_{\mathrm{Lorenz}} Y$. (Another proof of $X \le_* Y \Longrightarrow X \le_{\mathrm{Lorenz}} Y$ can be found in Section 4.B.3.) Now suppose that $X \le_{\mathrm{Lorenz}} Y$ and that $EX \le EY$. Then

\[ X \le_{\mathrm{cx}} \frac{EX}{EY}\cdot Y \le_{\mathrm{st}} Y. \]

Thus we see from Theorem 4.A.6(b) that $X \le_{\mathrm{icx}} Y$.

The following theorem describes a star order comparison of two functions of the same random variable.

Theorem 4.B.5. Let $X$ be a nonnegative random variable that is not degenerate at 0, and let $g$ and $h$ be nonnegative increasing functions, defined on $[0, \infty)$, such that $g(x) > 0$ and $h(x) > 0$ for all $x > 0$. If $h(x)/g(x)$ is increasing in $x \in (0, \infty)$, then $g(X) \le_* h(X)$.

Proof. Denote by $F$ the distribution function of $X$. From the assumption that $h(x)/g(x)$ is increasing in $x \in (0, \infty)$ it follows that

\[ \frac{h(F^{-1}(u))}{g(F^{-1}(u))} \ \text{ is increasing in } u \in (0, 1). \]

Therefore, denoting by $F_g$ and $F_h$ the distribution functions of $g(X)$ and of $h(X)$, we have that

\[ \frac{F_h^{-1}(u)}{F_g^{-1}(u)} \ \text{ is increasing in } u \in (0, 1). \]

Thus $g(X) \le_* h(X)$ by (4.B.1).

For example, if $X$ is a nonnegative random variable, then

\[ X + a \le_* X \quad \text{whenever } a > 0. \]

An interesting property of the order $\le_*$ is given in the next theorem.

Theorem 4.B.6. Let $X$ and $Y$ be positive random variables. If $X \le_* Y$, then $X^p \le_* Y^p$ for any $p \ne 0$. In particular, $1/X \le_* 1/Y$.

Proof. Let $F$ and $G$ be the distribution functions of $X$ and $Y$, respectively. First consider the case where $p > 0$. Then the distribution functions $\tilde{F}$ and $\tilde{G}$ of $X^p$ and $Y^p$, respectively, are given by

\[ \tilde{F}(x) = F(x^{1/p}) \quad \text{and} \quad \tilde{G}(x) = G(x^{1/p}), \quad x \ge 0. \]

Noting that $\tilde{G}^{-1}(\tilde{F}(x)) = \bigl(G^{-1}(F(x^{1/p}))\bigr)^p$ we compute

\[ \frac{\tilde{G}^{-1}(\tilde{F}(x))}{x} = \frac{\bigl(G^{-1}(F(x^{1/p}))\bigr)^p}{x} = \left(\frac{G^{-1}(F(y))}{y}\right)^p, \]

where $y = x^{1/p}$. From the assumption $X \le_* Y$ it is seen that the right-hand side of the above equation is increasing in $y \ge 0$, and therefore the left-hand side of that equation is increasing in $x \ge 0$.

Now, in order to complete the proof it is only necessary to prove that $1/X \le_* 1/Y$. Let now $\tilde{F}$ and $\tilde{G}$ denote the distribution functions of $1/X$ and $1/Y$, respectively. These are given by

\[ \tilde{F}(x) = \bar{F}(1/x) \quad \text{and} \quad \tilde{G}(x) = \bar{G}(1/x), \quad x \ge 0, \]

where $\bar{F} \equiv 1 - F$ and $\bar{G} \equiv 1 - G$. Noting that $\tilde{G}^{-1} = 1/\bar{G}^{-1}$ and that $\bar{G}^{-1}\bar{F} = G^{-1}F$, we compute

\[ \frac{\tilde{G}^{-1}(\tilde{F}(x))}{x} = \frac{1/x}{G^{-1}(F(1/x))} = \frac{1}{G^{-1}(F(1/x))\,x}. \]

From the assumption $X \le_* Y$ it is seen that the latter expression is increasing in $x \ge 0$.
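Theorem 4.B.6 can be illustrated with the quantile-ratio criterion (4.B.1) applied to the transformed variables. The sketch below is an illustration, not a proof: the helper names are hypothetical, the pair uniform(0, 1) versus standard exponential is an assumed example, and monotonicity is checked on a finite grid. For p < 0 the map x ↦ x^p is decreasing, so the u-quantile of X^p is obtained from the (1 - u)-quantile of X.

```python
import math

def ratio_is_increasing(f_inv, g_inv, steps=499, tol=1e-12):
    """Grid check that g_inv(u) / f_inv(u) is increasing in u on (0, 1)."""
    ratios = [g_inv(k / (steps + 1)) / f_inv(k / (steps + 1)) for k in range(1, steps + 1)]
    return all(a <= b + tol for a, b in zip(ratios, ratios[1:]))

def power_quantile(q_inv, p):
    """Quantile function of X**p, given the quantile function q_inv of a positive X."""
    if p > 0:
        return lambda u: q_inv(u) ** p
    # A decreasing transformation reverses the order of the quantiles.
    return lambda u: q_inv(1.0 - u) ** p

uniform_inv = lambda u: u               # uniform(0, 1)
exp_inv = lambda u: -math.log1p(-u)     # Exp(1)

for p in (0.5, 2.0, -1.0, -3.0):
    print(p, ratio_is_increasing(power_quantile(uniform_inv, p),
                                 power_quantile(exp_inv, p)))
```

Each line printed corresponds to one exponent p; the comparison of the uniform and exponential variables in the star order is preserved in every case, as the theorem asserts.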

Example 4.B.7. Let $X$ and $Y$ be two positive random variables, and let $E_1$ be a mean 1 exponential random variable which is independent of both $X$ and $Y$. Define $\tilde{X} = E_1/X$ and $\tilde{Y} = E_1/Y$; that is, the distributions of both $\tilde{X}$ and $\tilde{Y}$ are scale mixtures of exponential distributions. Then

\[ X \le_* Y \Longrightarrow \tilde{X} \le_* \tilde{Y}. \]

The proof is obtained by showing that $X \le_* Y \Longrightarrow 1/\tilde{X} \le_* 1/\tilde{Y}$, and then using Theorem 4.B.6. We omit the details. See Remarks 5.A.2 and 5.B.1 for similar results.

A characterization of the order $\le_c$ by means of the observed total time on test random variables (see Section 1.A.4) is given next. Let $X$ and $Y$ be two random variables with absolutely continuous distribution functions $F$ and $G$, respectively. Suppose that 0 is the left endpoint of the supports of $X$ and $Y$. Let $H_F^{-1}$ and $H_G^{-1}$ be the TTT transforms associated with $F$ and $G$, respectively (see (1.A.19)), and let $H_F$ and $H_G$ be the respective inverses. Let $X_{\mathrm{ttt}}$ and $Y_{\mathrm{ttt}}$ be random variables with distribution functions $H_F$ and $H_G$.

Theorem 4.B.8. Let $X$ and $Y$ be two nonnegative random variables with absolutely continuous distribution functions having 0 as the left endpoint of their supports. Then

\[ X \le_c Y \iff X_{\mathrm{ttt}} \le_c Y_{\mathrm{ttt}}. \]

Proof. Note that $X \le_c Y$ if, and only if,

\[ \frac{f(F^{-1}(u))}{g(G^{-1}(u))} \ \text{ is increasing in } u \in [0, 1], \tag{4.B.5} \]

where $f$ and $g$ are the densities associated with $F$ and $G$. From (4.B.5), (3.B.16), and (3.B.17) it is seen that $X \le_c Y$ if, and only if, the ratio $h_F(H_F^{-1}(u))/h_G(H_G^{-1}(u))$ is increasing in $u \in [0, 1]$, where $h_F$ and $h_G$ are the density functions associated with $H_F$ and $H_G$, respectively. Thus, again by (4.B.5), we obtain the stated result.

A related result is the following.

Theorem 4.B.9. Let $X$ and $Y$ be two nonnegative random variables with absolutely continuous distribution functions having 0 as the left endpoint of their supports. If $X \le_* Y$, then $X_{\mathrm{ttt}} \le_* Y_{\mathrm{ttt}}$.

See related results in Theorems 1.A.29, 3.B.1, 4.A.44, and 4.B.29.

The following characterization of the order $\le_*$ is similar to the characterization of the order $\le_{\mathrm{hr}}$ in Theorem 1.B.12.

Theorem 4.B.10. Let $X$ and $Y$ be two random variables with continuous distribution functions $F$ and $G$, respectively, with common support $[0, \infty)$. The following conditions are equivalent:
(a) $X \le_* Y$.
(b) For all functions $\alpha$ and $\beta$, such that $\alpha$ is nonnegative and $\alpha$ and $\alpha/\beta$ are decreasing, and such that $\int_0^1 \alpha(u)\,dF^{-1}(u) < \infty$, $\int_0^1 \alpha(u)\,dG^{-1}(u) < \infty$, $0 \ne \int_0^1 \beta(u)\,dF^{-1}(u) < \infty$, and $0 \ne \int_0^1 \beta(u)\,dG^{-1}(u) < \infty$, we have

\[ \frac{\int_0^1 \alpha(u)\,dG^{-1}(u)}{\int_0^1 \beta(u)\,dG^{-1}(u)} \le \frac{\int_0^1 \alpha(u)\,dF^{-1}(u)}{\int_0^1 \beta(u)\,dF^{-1}(u)}. \]

(c) For any two increasing functions $a$ and $b$ such that $b$ is nonnegative, if $\int_0^1 a(u)b(u)\,dF^{-1}(u) = 0$, then $\int_0^1 a(u)b(u)\,dG^{-1}(u) \le 0$.

The orders $\le_c$, $\le_*$, and $\le_{\mathrm{su}}$ can be used to characterize, respectively, IFR, IFRA, and NBU random variables as follows.

Theorem 4.B.11. Let Exp denote any exponential random variable (no matter what its mean is). Let $X$ be a nonnegative random variable. Then

\[ X \text{ is IFR} \iff X \le_c \mathrm{Exp}, \qquad X \text{ is IFRA} \iff X \le_* \mathrm{Exp}, \qquad \text{and} \qquad X \text{ is NBU} \iff X \le_{\mathrm{su}} \mathrm{Exp}. \]

The theorem follows at once from the definitions and the observation that a random variable is IFR [IFRA, NBU] if, and only if, the negative of the logarithm of its survival function is convex [starshaped, superadditive] on $(0, \infty)$.

The claim in the next example is easy to prove.

Example 4.B.12. Let $X$ be a nonnegative random variable with an absolutely continuous distribution function. Then $X$ has a decreasing density if, and only if, $U \le_c X$, where $U$ is a uniform[0, 1] random variable.
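The observation behind Theorem 4.B.11 (IFR, IFRA, and NBU correspond to convexity, starshapedness, and superadditivity of the negative log survival function) lends itself to a quick grid check. The sketch below is illustrative: the function name `classify` is hypothetical, the grid is assumed to be equally spaced, and passing a grid check is only a necessary condition for the corresponding property. The two Weibull cumulative hazards are assumed examples.

```python
import math

def classify(cum_hazard, xs, tol=1e-12):
    """Grid checks on Lambda(x) = -log(survival(x)); xs must be equally spaced and positive."""
    lam = [cum_hazard(x) for x in xs]
    n = len(xs)
    # Convexity via nonnegative second differences (IFR).
    convex = all(lam[i - 1] + lam[i + 1] - 2.0 * lam[i] >= -tol for i in range(1, n - 1))
    # Starshapedness: Lambda(x) / x is increasing (IFRA).
    slopes = [l / x for l, x in zip(lam, xs)]
    starshaped = all(a <= b + tol for a, b in zip(slopes, slopes[1:]))
    # Superadditivity: Lambda(x + y) >= Lambda(x) + Lambda(y) (NBU).
    superadditive = all(
        cum_hazard(x + y) >= cum_hazard(x) + cum_hazard(y) - tol
        for x in xs for y in xs
    )
    return {"IFR": convex, "IFRA": starshaped, "NBU": superadditive}

xs = [0.1 * k for k in range(1, 41)]
print(classify(lambda x: x * x, xs))  # Weibull with shape 2: Lambda(x) = x^2
print(classify(math.sqrt, xs))        # Weibull with shape 1/2: Lambda(x) = sqrt(x)
```

With an exponential reference variable, $G^{-1}F(x)$ is a positive multiple of $\Lambda(x)$, which is why properties of $\Lambda$ alone decide the three comparisons in the theorem.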


Example 4.B.13. Let $U_{(j:m)}$ and $U_{(i:n)}$ denote the jth and the ith order statistics of samples from the uniform distribution on $[0, 1]$ of sizes $m$ and $n$, respectively. Then

\[ U_{(j:m)} \le_* U_{(i:n)} \quad \text{whenever } i - j \ge \max\{0, n - m\}. \]

This follows from Lemma 3.B.27 and (4.B.4), and from the fact that if $U$ is a uniform random variable on $[0, 1]$, then $-\log(1 - U)$ is a standard exponential random variable. It is worthwhile to mention that the above inequality, together with Theorem 4.B.4, yields the first three inequalities in Example 3.A.49.

The following example may be compared with Examples 1.B.24, 1.C.48, 2.A.22, 3.B.38, 6.B.41, 6.D.8, and 6.E.13.

Example 4.B.14. Let $X$ and $Y$ be two absolutely continuous nonnegative random variables with survival functions $\bar{F}$ and $\bar{G}$, respectively. Denote $\Lambda_1 = -\log \bar{F}$ and $\Lambda_2 = -\log \bar{G}$. Consider two nonhomogeneous Poisson processes $N_1 = \{N_1(t), t \ge 0\}$ and $N_2 = \{N_2(t), t \ge 0\}$ with mean functions $\Lambda_1$ and $\Lambda_2$ (see Example 3.B.37), respectively. Let $T_{i,1}, T_{i,2}, \ldots$ be the successive epoch times of process $N_i$, $i = 1, 2$. Note that $X =_{\mathrm{st}} T_{1,1}$ and $Y =_{\mathrm{st}} T_{2,1}$. It turns out that any of the three transform orderings of the first epoch times of the two processes implies the same ordering of all the corresponding later epoch times; that is, if $X \le_c\ [\le_*, \le_{\mathrm{su}}]\ Y$, then $T_{1,n} \le_c\ [\le_*, \le_{\mathrm{su}}]\ T_{2,n}$, $n \ge 1$. The proof of this fact is similar to the proof in Example 3.B.38, and is therefore omitted.

Similar to the orders $\le_{\mathrm{st}}$, $\le_{\mathrm{hr}}$, and $\le_{\mathrm{lr}}$ (see Theorems 1.B.34, 1.C.33, and 6.B.23), the orders $\le_c$, $\le_*$, and $\le_{\mathrm{su}}$ are also preserved under the formation of order statistics. This is shown in the next result.

Theorem 4.B.15. Let $(X_i, Y_i)$, $i = 1, 2, \ldots, m$, be independent pairs of random variables such that $X_i \le_c\ [\le_*, \le_{\mathrm{su}}]\ Y_i$, $i = 1, 2, \ldots, m$. Denote the corresponding order statistics by $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ and $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(m)}$. Suppose that the $X_i$'s are identically distributed and that the $Y_i$'s are identically distributed. Then

\[ X_{(k)} \le_c\ [\le_*, \le_{\mathrm{su}}]\ Y_{(k)}, \quad k = 1, 2, \ldots, m. \tag{4.B.6} \]

Proof. Let $F$ [$G$] denote the common distribution function of the $X_i$'s [$Y_i$'s] and let $F_{(k)}$ [$G_{(k)}$] denote the distribution function of $X_{(k)}$ [$Y_{(k)}$]. Then it is well known that

\[ F_{(k)}(x) = \frac{m!}{(k-1)!\,(m-k)!}\int_0^{F(x)} u^{k-1}(1-u)^{m-k}\,du \]

and, similarly,

\[ G_{(k)}(x) = \frac{m!}{(k-1)!\,(m-k)!}\int_0^{G(x)} u^{k-1}(1-u)^{m-k}\,du. \]

Thus, $G_{(k)}^{-1}F_{(k)} = G^{-1}F$, and (4.B.6) follows from the assumptions of the theorem.
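The proof rests on the fact that the distribution function of the kth order statistic is one fixed increasing function of F(x), the same for both samples, so it cancels in the composition. A small numerical sketch of that fact follows; the function names are hypothetical, and the integral form is approximated by a midpoint rule, so agreement is only up to discretization error.

```python
from math import comb, factorial

def order_stat_cdf_binomial(p, k, m):
    """P(at least k of m i.i.d. observations are <= x), written in terms of p = F(x)."""
    return sum(comb(m, j) * p ** j * (1.0 - p) ** (m - j) for j in range(k, m + 1))

def order_stat_cdf_integral(p, k, m, steps=20000):
    """The incomplete-beta form used in the proof, via a midpoint rule on [0, p]."""
    const = factorial(m) / (factorial(k - 1) * factorial(m - k))
    h = p / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += u ** (k - 1) * (1.0 - u) ** (m - k)
    return const * total * h

k, m = 2, 5
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, order_stat_cdf_binomial(p, k, m), order_stat_cdf_integral(p, k, m))
```

Both expressions depend on x only through p = F(x), so replacing F by G changes nothing but the argument; this is the step that yields the identity between the two compositions in the proof.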

In the following example it is shown that, under the proper conditions, random minima and maxima are ordered in the convex transform, star, and superadditive order senses; see related results in Examples 1.C.46, 3.B.39, 5.A.24, and 5.B.13.

Example 4.B.16. Let $X_1, X_2, \ldots$, and $Y_1, Y_2, \ldots$, each be a sequence of independent and identically distributed random variables. Let $N$ be a positive integer-valued random variable, independent of the $X_i$'s and of the $Y_i$'s. Denote $X_{(1:N)} = \min\{X_1, X_2, \ldots, X_N\}$, $X_{(N:N)} = \max\{X_1, X_2, \ldots, X_N\}$, $Y_{(1:N)} = \min\{Y_1, Y_2, \ldots, Y_N\}$, and $Y_{(N:N)} = \max\{Y_1, Y_2, \ldots, Y_N\}$. It can be shown that if $X_1 \le_c\ [\le_*, \le_{\mathrm{su}}]\ Y_1$, then $X_{(1:N)} \le_c\ [\le_*, \le_{\mathrm{su}}]\ Y_{(1:N)}$ and $X_{(N:N)} \le_c\ [\le_*, \le_{\mathrm{su}}]\ Y_{(N:N)}$.

The convex transform order between $X$ and $Y$ implies the usual stochastic order between ratios of the corresponding spacings as the next result shows; related results can be found in Theorem 1.C.45, and in Examples 6.B.25 and 6.E.15. In the next result we use the following notation. Let $X_{(1:n)} \le X_{(2:n)} \le \cdots \le X_{(n:n)}$ and $Y_{(1:n)} \le Y_{(2:n)} \le \cdots \le Y_{(n:n)}$ be the order statistics corresponding to samples $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$; each consists of independent, identically distributed random variables, where the $X_i$'s have the same distribution as $X$, and the $Y_i$'s have the same distribution as $Y$. The corresponding spacings are defined by $U_{(i:n)} \equiv X_{(i:n)} - X_{(i-1:n)}$ and $V_{(i:n)} \equiv Y_{(i:n)} - Y_{(i-1:n)}$, $i = 2, 3, \ldots, n$.

Theorem 4.B.17. Let $X$ and $Y$ be two random variables. If $X \le_c Y$, then

\[ \frac{U_{(j:n)}}{U_{(i:n)}} \le_{\mathrm{st}} \frac{V_{(j:n)}}{V_{(i:n)}} \quad \text{for } 2 \le i \le j \le n. \]

Proof. First note that from the convexity of $G^{-1}F$ we obtain

\[ \frac{G^{-1}F(x_2) - G^{-1}F(x_1)}{x_2 - x_1} \le \frac{G^{-1}F(x_4) - G^{-1}F(x_3)}{x_4 - x_3} \]

whenever $x_1 \le [x_2, x_3] \le x_4$, where $x_1 \le [x_2, x_3] \le x_4$ denotes $x_1 \le x_2 \le x_4$ and $x_1 \le x_3 \le x_4$. Thus, for $2 \le i \le j \le n$,

\[
\begin{aligned}
P\left\{\frac{U_{(j:n)}}{U_{(i:n)}} > z\right\}
&= P\left\{\frac{X_{(j:n)} - X_{(j-1:n)}}{X_{(i:n)} - X_{(i-1:n)}} > z\right\} \\
&\le P\left\{\frac{G^{-1}F(X_{(j:n)}) - G^{-1}F(X_{(j-1:n)})}{G^{-1}F(X_{(i:n)}) - G^{-1}F(X_{(i-1:n)})} > z\right\} \\
&= P\left\{\frac{V_{(j:n)}}{V_{(i:n)}} > z\right\},
\end{aligned}
\]

where the last equality follows from the observation that the joint distribution of $G^{-1}F(X_{(i:n)})$, $G^{-1}F(X_{(i-1:n)})$, $G^{-1}F(X_{(j:n)})$, and $G^{-1}F(X_{(j-1:n)})$ is the same as the joint distribution of $Y_{(i:n)}$, $Y_{(i-1:n)}$, $Y_{(j:n)}$, and $Y_{(j-1:n)}$.

Under a weaker assumption than the one in Theorem 4.B.17 we have the following results.

Theorem 4.B.18. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$, respectively, such that $F(0) = G(0) = 0$. Let $0 \le p \le q$. If $X \le_* Y$, then
(a) $E[X_{(i:n)}^p]/E[Y_{(i:n)}^q]$ is decreasing in $i$,
(b) $E[X_{(i:n)}^p]/E[Y_{(i:n)}^q]$ is increasing in $n$, and
(c) $E[X_{(n-i:n)}^p]/E[Y_{(n-i:n)}^q]$ is decreasing in $n$,
provided the expectations exist.

The notation in Theorem 4.B.17 is used in the next result.

Theorem 4.B.19. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_* Y$, then $E[U_{(i:n)}] \le E[V_{(i:n)}]$, $i = 2, 3, \ldots, n$.

4.B.3 Some related orders

In this subsection we consider random variables $X$ and $Y$ with distribution functions $F$ and $G$, respectively, and with supports of the form $[0, a)$, $a > 0$ ($a$ can be infinity). We assume throughout this subsection that $X$ and $Y$ have finite means. Denote the mrl functions (see (2.A.1)) that are associated with $X$ and $Y$, by $m$ and $l$, respectively. The random variable $X$ is said to be smaller than $Y$ in the DMRL order (denoted by $X \le_{\mathrm{dmrl}} Y$) if

\[ \frac{l(G^{-1}(u))}{m(F^{-1}(u))} \ \text{ is increasing in } u \in [0, 1]. \tag{4.B.7} \]

Note that (4.B.7) is the same as the condition

\[ \frac{\frac{1}{EY}\int_{G^{-1}(u)}^{\infty}\bar{G}(x)\,dx}{\frac{1}{EX}\int_{F^{-1}(u)}^{\infty}\bar{F}(x)\,dx} \ \text{ is increasing in } u \in [0, 1], \tag{4.B.8} \]

where $\bar{F}$ and $\bar{G}$ are the survival functions associated with $F$ and $G$, respectively. Condition (4.B.8) can be written equivalently as

\[ \frac{EY - H_G^{-1}(u)}{EX - H_F^{-1}(u)} \ \text{ is increasing in } u \in [0, 1], \]

where $H_F^{-1}$ and $H_G^{-1}$ are the TTT transforms (see (1.A.19)) that are associated with $F$ and $G$, respectively.

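Theorem 4.B.20. Let $X$ and $Y$ be two random variables, each with support of the form $[0, a)$. If $X \le_c Y$, then $X \le_{\mathrm{dmrl}} Y$.

Proof. Let the equilibrium survival functions associated with $F$ and $G$ be defined as

\[ \bar{F}_e(x) = \int_x^\infty \frac{\bar{F}(t)}{EX}\,dt \quad \text{and} \quad \bar{G}_e(x) = \int_x^\infty \frac{\bar{G}(t)}{EY}\,dt. \]

Let $\alpha(x) \equiv G^{-1}(F(x)) = \bar{G}^{-1}(\bar{F}(x))$ and let

\[ \gamma(u) \equiv \bar{F}_e\,\alpha^{-1}\,\bar{G}_e^{-1}(u), \quad u \in [0, 1]. \]

For simplicity suppose that $\alpha$ and $\gamma$ are differentiable. A lengthy straightforward computation gives

\[ \gamma'(u) = \frac{EY}{EX}\cdot\frac{d}{dx}\alpha^{-1}(x)\Big|_{x = \bar{G}_e^{-1}(u)}. \tag{4.B.9} \]

By assumption, $\alpha$ is convex, so $\alpha^{-1}$ is concave and its derivative is decreasing; since $\bar{G}_e^{-1}(u)$ is decreasing in $u$, it follows from (4.B.9) that $\gamma'$ is increasing, that is, $\gamma$ is convex, and therefore $\gamma$ is starshaped. That is,

\[ \frac{\bar{F}_e F^{-1} G\,\bar{G}_e^{-1}(u)}{u} \ \text{ is increasing in } u \in [0, 1]. \]

Equivalently,

\[ \frac{\bar{F}_e F^{-1}(u)}{\bar{G}_e G^{-1}(u)} \ \text{ is decreasing in } u \in [0, 1], \]

and (4.B.8) is obtained.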

The random variable $X$ is said to be smaller than $Y$ in the NBUE order (denoted by $X \le_{\mathrm{nbue}} Y$) if

\[ \frac{m(F^{-1}(u))}{l(G^{-1}(u))} \le \frac{EX}{EY} \quad \text{for all } u \in [0, 1]. \tag{4.B.10} \]

Note that (4.B.10) is the same as the condition

\[ \frac{1}{EX}\int_{F^{-1}(u)}^{\infty}\bar{F}(x)\,dx \le \frac{1}{EY}\int_{G^{-1}(u)}^{\infty}\bar{G}(x)\,dx \quad \text{for all } u \in [0, 1]. \tag{4.B.11} \]

Condition (4.B.11) can be written equivalently as

\[ \frac{H_F^{-1}(u)}{EX} \ge \frac{H_G^{-1}(u)}{EY} \quad \text{for all } u \in [0, 1]. \]
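The scaled-TTT form of the NBUE order is straightforward to evaluate numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the TTT transform is taken to be the integral of the survival function from 0 to the u-quantile (the form used in this subsection), the integral is approximated by a midpoint rule, and the means are supplied by hand. The pair uniform[0, 1] versus Exp(1) is an assumed example.

```python
import math

def ttt_transform(survival, quantile, u, steps=2000):
    """Midpoint-rule approximation of the integral of survival(x) over [0, quantile(u)]."""
    upper = quantile(u)
    h = upper / steps
    return h * sum(survival((i + 0.5) * h) for i in range(steps))

def smaller_in_nbue(x, y, grid, tol=1e-9):
    """x and y are (survival, quantile, mean) triples; grid check of the scaled-TTT inequality."""
    sx, qx, mean_x = x
    sy, qy, mean_y = y
    return all(
        ttt_transform(sx, qx, u) / mean_x >= ttt_transform(sy, qy, u) / mean_y - tol
        for u in grid
    )

uniform = (lambda t: max(0.0, 1.0 - t), lambda u: u, 0.5)
exponential = (lambda t: math.exp(-t), lambda u: -math.log1p(-u), 1.0)
grid = [k / 100 for k in range(1, 100)]

print(smaller_in_nbue(uniform, exponential, grid))  # uniform compared with exponential
print(smaller_in_nbue(exponential, uniform, grid))  # the reverse comparison
```

For this pair the scaled transforms are $2u - u^2$ for the uniform and $u$ for the exponential, so the first comparison holds on all of (0, 1) and the reverse one fails.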

From (4.B.11) and (3.C.1) it follows that if $EX = EY$, then $X \le_{\mathrm{nbue}} Y \iff X \le_{\mathrm{ew}} Y$. In other words, for nonnegative random variables $X$ and $Y$ we have

\[ X \le_{\mathrm{nbue}} Y \iff \frac{X}{EX} \le_{\mathrm{ew}} \frac{Y}{EY}. \tag{4.B.12} \]

Without the condition that $EX = EY$ the orders $\le_{\mathrm{nbue}}$ and $\le_{\mathrm{ew}}$ are distinct (see Kochar, Li, and Shaked [316]).

The following result is immediate from (4.B.7) and (4.B.10).

Theorem 4.B.21. Let $X$ and $Y$ be two random variables, each with support of the form $[0, a)$. If $X \le_{\mathrm{dmrl}} Y$, then $X \le_{\mathrm{nbue}} Y$.

In the following two theorems some further relationships among some orders are proven.

Theorem 4.B.22. Let $X$ and $Y$ be two random variables, each with support of the form $[0, a)$. If $X \le_* Y$, then $X \le_{\mathrm{nbue}} Y$.

Proof. If $X \le_* Y$, then, from Theorem 4.B.9, we have that $H_G^{-1}(u)/H_F^{-1}(u)$ is increasing in $u \in [0, 1]$ (see (4.B.1)). Therefore,

\[ \frac{H_G^{-1}(u)}{H_F^{-1}(u)} \le \frac{H_G^{-1}(1)}{H_F^{-1}(1)} = \frac{EY}{EX}. \]

Recall from Theorem 3.B.16 that for random variables with the same means the dispersion order implies the convex order. Thus, from Theorem 4.B.2 it follows that for nonnegative random variables $X$ and $Y$ with finite means, such that $X(EX)^{-1} \le_{\mathrm{st}} Y(EY)^{-1}$, we have that the star order implies the Lorenz order. However, a stronger result is true: one can obtain the Lorenz order without assuming any usual stochastic comparison associated with $X$ and $Y$. This follows from Theorem 4.B.22 and the next result.

Theorem 4.B.23. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{\mathrm{nbue}} Y$, then $X \le_{\mathrm{Lorenz}} Y$.

Proof. The proof follows at once from (4.B.11) and (3.C.8).

A summary of the implications among orders that were mentioned so far in this section is given in the following chart.

\[
\begin{array}{ccccc}
X \le_c Y & \Rightarrow & X \le_* Y & \Rightarrow & X \le_{\mathrm{su}} Y \\
\Downarrow & & \Downarrow & & \\
X \le_{\mathrm{dmrl}} Y & \Rightarrow & X \le_{\mathrm{nbue}} Y & \Rightarrow & X \le_{\mathrm{Lorenz}} Y
\end{array}
\]

Remark 4.B.24. Using the above facts, we provide here a simple proof of Theorem 4.A.68. Recall from Theorem 4.B.6 that for any p > 0 we have that X ≤∗ Y if, and only if, X p ≤∗ Y p . Thus, from (4.A.44) and from Theorems 4.B.22 and 4.B.23 it is seen that if X ≤∗ Y , then X p ≤Lorenz Y p . This observation, again with the aid of (4.A.44), proves Theorem 4.A.68. The orders ≤dmrl and ≤nbue can be used to characterize, respectively, DMRL and NBUE random variables as follows. Theorem 4.B.25. Let Exp denote any exponential random variable (no matter what its mean is). Let X be a nonnegative random variable. Then X is DMRL ⇐⇒ X ≤dmrl Exp, and X is NBUE ⇐⇒ X ≤nbue Exp. The theorem follows at once from the deﬁnitions and the observation that the mrl function of an exponential random variable is a constant. Recall from (1.A.19) the deﬁnition of the TTT transform. We will now introduce and discuss an order that is deﬁned through a comparison of TTT transforms. Let X and Y be two nonnegative random variables with distribution functions F and G, respectively. If

$$\int_0^{F^{-1}(u)} \overline{F}(x)\,dx \le \int_0^{G^{-1}(u)} \overline{G}(x)\,dx \quad \text{for all } u \in (0,1), \tag{4.B.13}$$

then X is said to be smaller than Y in the TTT order (denoted by X ≤ttt Y). A simple sufficient condition for the order ≤ttt is the usual stochastic order:

$$X \le_{\mathrm{st}} Y \implies X \le_{\mathrm{ttt}} Y. \tag{4.B.14}$$

In order to verify (4.B.14) one may just notice that if X ≤st Y, then F^{-1}(u) ≤ G^{-1}(u) for all u ∈ (0, 1) (see (1.A.12)). By letting u → 1 in (4.B.13) it is seen that

$$X \le_{\mathrm{ttt}} Y \implies EX \le EY. \tag{4.B.15}$$
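As an illustration of (4.B.13) and (4.B.14), the following Python sketch approximates the TTT transform by a midpoint rule and compares two exponential distributions on a finite grid of u values. The helper names (`ttt_transform`, `exponential`, `is_ttt_smaller`), the grid, and the tolerance are assumptions made for this illustration; a grid check of this kind can suggest, but not prove, that the order holds.

```python
import math

def ttt_transform(survival, quantile, u, steps=2000):
    """Midpoint-rule approximation of the integral of survival(x) over [0, quantile(u)]."""
    upper = quantile(u)
    width = upper / steps
    return width * sum(survival((k + 0.5) * width) for k in range(steps))

def exponential(rate):
    """Return (survival function, quantile function) of an exponential law with the given rate."""
    survival = lambda x: math.exp(-rate * x)
    quantile = lambda u: -math.log(1.0 - u) / rate
    return survival, quantile

def is_ttt_smaller(dist_x, dist_y, grid_size=99, tol=1e-9):
    """Check inequality (4.B.13) on the grid u = 1/(n+1), ..., n/(n+1) (numerical check only)."""
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    return all(
        ttt_transform(*dist_x, u) <= ttt_transform(*dist_y, u) + tol
        for u in grid
    )

X = exponential(2.0)  # mean 1/2
Y = exponential(1.0)  # mean 1, so X <=st Y
print(is_ttt_smaller(X, Y), is_ttt_smaller(Y, X))
```

For an exponential random variable with rate λ the integral in (4.B.13) equals u/λ, so in this example the comparison reduces to u/2 ≤ u.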

From (4.B.11) and (4.B.13) it follows that if EX = EY, then X ≤ttt Y ⟺ X ≥nbue Y. In other words, for nonnegative random variables X and Y we have

$$X \ge_{\mathrm{nbue}} Y \iff \frac{X}{EX} \le_{\mathrm{ttt}} \frac{Y}{EY};$$

see a similar relation in (4.B.12). It is easy to see that for any two nonnegative random variables X and Y we have

$$X \le_{\mathrm{ttt}} Y \implies aX \le_{\mathrm{ttt}} aY \quad \text{for any } a > 0.$$

An important closure property of the order ≤ttt, analogous to Theorem 3.C.4, is given next.


Theorem 4.B.26. Let X and Y be two finite mean continuous nonnegative random variables with interval supports, and with 0 being the common left endpoint of the supports. Then, for any increasing concave function φ, such that φ(0) = 0, we have X ≤ttt Y ⟹ φ(X) ≤ttt φ(Y).

As a corollary we obtain an analog of (3.C.8):

Corollary 4.B.27. Let X and Y be two finite mean continuous nonnegative random variables with interval supports, and with 0 being the common left endpoint of the supports. Then X ≤ttt Y ⟹ X ≤icv Y.

Proof. Suppose that X ≤ttt Y. Let φ be an increasing concave function defined on [0, ∞). Define φ̃(·) = φ(·) − φ(0), so that φ̃(0) = 0. From Theorem 4.B.26 we obtain φ̃(X) ≤ttt φ̃(Y). Hence from (4.B.15) we get E[φ̃(X)] ≤ E[φ̃(Y)], and this reduces to E[φ(X)] ≤ E[φ(Y)], provided the expectations exist.

An interesting closure property of the order ≤ttt, analogous to Theorem 3.C.11, is given next.

Theorem 4.B.28. Let X1, X2, . . . be a collection of independent and identically distributed random variables, and let Y1, Y2, . . . be another collection of independent and identically distributed random variables. Also, let N be a positive, integer-valued, random variable, independent of the Xi’s and of the Yi’s. If X1 and Y1 are nonnegative, and if X1 ≤ttt Y1, then min{X1, X2, . . . , XN} ≤ttt min{Y1, Y2, . . . , YN}.

Some interesting connections between the order ≤ttt and observed total time on test random variables are given in the next theorem. Let X and Y be two nonnegative random variables. Recall from Section 1.A.4 the definition of the observed total time on test random variables Xttt and Yttt.

Theorem 4.B.29. Let X and Y be two nonnegative random variables. Then Xttt ≤st Yttt ⟺ X ≤ttt Y and X ≤ttt Y ⟹ Xttt ≤ttt Yttt.

Some related results can be found in Theorems 1.A.29, 3.B.1, 4.A.44, 4.B.8, and 4.B.9.

The following example describes comparisons of random variables that arise in the model of imperfect repair, and as the lifetimes of series systems.


Example 4.B.30. Let X be a nonnegative random variable with survival function F̄. For θ > 0, let X(θ) denote a random variable with the survival function (F̄)^θ. Similarly, if Y is a nonnegative random variable with the survival function Ḡ, then denote by Y(θ) a random variable with survival function (Ḡ)^θ. Suppose that both X and Y have 0 as the left endpoint of their supports.
(a) If θ > 1, then X ≤ttt Y ⟹ X(θ) ≤ttt Y(θ).
(b) If θ < 1, then X(θ) ≤ttt Y(θ) ⟹ X ≤ttt Y.

A generalization of the TTT order is described next. This generalization contains as special cases the orders ≤st, ≤lir, and ≤ttt. Let H denote the set of all functions h such that h(u) > 0 for u ∈ (0, 1), and h(u) = 0 for u ∉ [0, 1]. For h ∈ H, if

$$\int_{-\infty}^{F^{-1}(p)} h(F(x))\,dx \le \int_{-\infty}^{G^{-1}(p)} h(G(x))\,dx, \quad p \in (0,1),$$

then we say that X is smaller than Y in the generalized total time on test transform order with respect to h. We denote this by X ≤ttt^(h) Y.

Example 4.B.31. Let X and Y be random variables with the same left endpoint of support a > −∞. Let h be a constant function on [0, 1]; that is, h(u) = c, u ∈ [0, 1], for some c > 0, and h(u) = 0 otherwise. Then X ≤ttt^(h) Y if, and only if, F^{-1}(p) ≤ G^{-1}(p), p ∈ (0, 1); that is (by (1.A.12)), if, and only if, X ≤st Y.

Example 4.B.32. Let h(u) = u, u ∈ [0, 1], and h(u) = 0 otherwise. Then X ≤ttt^(h) Y if, and only if,

$$\int_{-\infty}^{F^{-1}(p)} F(x)\,dx \le \int_{-\infty}^{G^{-1}(p)} G(x)\,dx, \quad p \in (0,1);$$

that is, if, and only if, X ≤lir Y; the order ≤lir is defined in Section 3.C.1.

Example 4.B.33. Let X and Y be nonnegative random variables with 0 being the left endpoint of their supports. Let h(u) = 1 − u, u ∈ [0, 1], and h(u) = 0 otherwise. Then X ≤ttt^(h) Y if, and only if,

$$\int_0^{F^{-1}(p)} \overline{F}(x)\,dx \le \int_0^{G^{-1}(p)} \overline{G}(x)\,dx, \quad p \in (0,1);$$

that is, if, and only if, X ≤ttt Y.
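The three special cases in Examples 4.B.31 to 4.B.33 can be explored with one routine that takes the weight function h as a parameter. The sketch below is only illustrative: the names `generalized_ttt`, `WEIGHTS`, and `is_smaller` are hypothetical, the distributions are exponential, and the integral is started at the common left endpoint 0 of the supports rather than at −∞ (an assumption of the sketch).

```python
import math

def generalized_ttt(h, cdf, quantile, p, lower=0.0, steps=2000):
    """Midpoint-rule approximation of the integral of h(cdf(x)) over [lower, quantile(p)]."""
    upper = quantile(p)
    width = (upper - lower) / steps
    return width * sum(h(cdf(lower + (k + 0.5) * width)) for k in range(steps))

def exponential(rate):
    """Return (distribution function, quantile function) of an exponential law."""
    cdf = lambda x: 1.0 - math.exp(-rate * x)
    quantile = lambda p: -math.log(1.0 - p) / rate
    return cdf, quantile

WEIGHTS = {
    "st": lambda u: 1.0,       # Example 4.B.31 with c = 1
    "lir": lambda u: u,        # Example 4.B.32
    "ttt": lambda u: 1.0 - u,  # Example 4.B.33
}

def is_smaller(h, dist_x, dist_y, grid_size=99, tol=1e-9):
    """Check the defining inequality on a finite grid of p values (numerical check only)."""
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    return all(
        generalized_ttt(h, *dist_x, p) <= generalized_ttt(h, *dist_y, p) + tol
        for p in grid
    )

X = exponential(2.0)
Y = exponential(1.0)
for name, h in WEIGHTS.items():
    print(name, is_smaller(h, X, Y))
```

For exponential distributions all three integrals scale like 1/rate, which is why the comparison goes the same way for each weight in this example.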

The next result describes a relationship among the orders ≤ttt^(h) for different h’s.


Theorem 4.B.34. Let X and Y be two random variables with continuous distribution functions, having 0 as the left endpoint of their supports. Let h1, h2 ∈ H. Suppose that h2(u)/h1(u) is decreasing on (0, 1). Then

$$X \le_{\mathrm{ttt}}^{(h_1)} Y \implies X \le_{\mathrm{ttt}}^{(h_2)} Y.$$

Remark 4.B.35. In Theorem 4.B.34 let h1(u) = u and h2(u) = c for some constant c > 0, u ∈ [0, 1]. Then by Theorem 4.B.34, X ≤ttt^(h1) Y ⟹ X ≤ttt^(h2) Y; that is, by Examples 4.B.31 and 4.B.32,

$$X \le_{\mathrm{lir}} Y \implies X \le_{\mathrm{st}} Y \tag{4.B.16}$$

when X and Y are two random variables with continuous distribution functions, having 0 as the left endpoint of their supports. Recall from Theorem 3.B.13(a) that if X and Y have 0 as the left endpoint of their supports, then X ≤disp Y ⟹ X ≤st Y. It is not hard to see that X ≤disp Y ⟹ X ≤lir Y. Thus (4.B.16) strengthens Theorem 3.B.13(a) when X and Y have 0 as the left endpoint of their supports.

Some relationships between the usual stochastic order ≤st and the orders ≤ttt^(h) are given next.

Theorem 4.B.36. Let X and Y be two nonnegative random variables with continuous distribution functions, having 0 as the left endpoint of their supports. Let h ∈ H.
(a) If h is decreasing on [0, 1], then X ≤st Y ⟹ X ≤ttt^(h) Y.
(b) If h is increasing on [0, 1], then X ≤ttt^(h) Y ⟹ X ≤st Y.

A relationship between the order ≤icv and some orders ≤ttt^(h) is described next.

Theorem 4.B.37. Let X and Y be two random variables with continuous distribution functions, and supports [0, a) and [0, b), respectively, for some finite or infinite constants a and b. Let h ∈ H be decreasing on [0, 1]. Then

$$X \le_{\mathrm{ttt}}^{(h)} Y \implies X \le_{\mathrm{icv}} Y.$$

4.C Complements

Section 4.A: Some standard references for the monotone convex and concave orders are Ross [475] and Müller and Stoyan [419], where many of the results that are described in Section 4.A can be found. The characterizations of the order ≤icx by means of the quantile functions (Theorems 4.A.3 and 4.A.4) are taken from Sordo and Ramos [538]. The condition (4.A.8) is studied in Hürlimann [251]; there it is called the RaC (risk-adjusted capital) order. The present version of the characterizations of the orders ≤icx and ≤icv, given in Theorem 4.A.5, is taken from Müller and Rüschendorf [415]. The two characterizations of the order ≤icx, given in Theorem 4.A.6, can be found in Makowski [378]; an alternative proof of these results is given in Müller [407]. The result that gives the closure under random convolutions property of the monotone convex and concave orders (Theorem 4.A.9) and its proof are taken from Ross and Schechner [477]. Extensions of Theorem 4.A.9 are given in Jean-Marie and Liu [254]; for example, the results mentioned in Remark 4.A.10 can be found there. Theorem 4.A.11 can be found in Fagiuoli and Pellerey [186]. The comparisons of the random sums in Theorems 4.A.12–4.A.14 are motivated by ideas in Pellerey and Shaked [455]; they can be found in Pellerey [450]. The result that gives the closure under general convex increasing transformations property of the increasing convex order (Theorem 4.A.15) and its proof can be found in Ross [475]. The ordering of the maxima in the sense of ≤icx (Corollary 4.A.16) is implicit in Theorem 9 of Li, Li, and Jing [354]. The increasing convex order comparison of maxima of partial sums (Theorem 4.A.17) is taken from Shao [535]; see also Bulinski and Suquet [114]. The icx and icv comparisons of ratios (Example 4.A.19) are restatements of results of Pellerey and Semeraro [454]. The result that gives the closure under mixtures property of the increasing convex and concave orders (Theorem 4.A.20) has been motivated by a result of Ahmed, Soliman, and Khider [10]. The Laplace transform characterization of the orders ≤icx and ≤icv (Theorem 4.A.21) is essentially taken from Ross and Schechner [477] and from Shaked and Wong [524].
A proof of the characterization of the increasing convex order by means of the number of crossings of two distribution functions (Theorem 4.A.23) can be found in Müller [407]. The characterization of the order ≤mrl by the order ≤icx (Theorem 4.A.24) is taken from Brown and Shanthikumar [112]. The closure property of the order ≤mrl given in Remark 4.A.25 is also taken from Brown and Shanthikumar [112]. The sufficient condition for the increasing concave order in Theorem 4.A.27 is given on page 484 of Landsberger and Meilijson [329]. The fact that the order ≤hmrl implies the order ≤icx (Theorem 4.A.28) can be found in Fagiuoli and Pellerey [185]. The relationship between the orders ≤dil and ≤icx that is described in Theorem 4.A.30 and in Corollary 4.A.31 can be found in Belzunce, Pellerey, Ruiz, and Shaked [72]. The relationship between the orders ≤ew and ≤icx that is described in Corollary 4.A.32 can be found in Fagiuoli, Pellerey, and Shaked [188]; in Kochar, Li, and Shaked [316] it is shown that Corollary 4.A.32 can be easily obtained from Theorem 3.C.4. The result about the expected values of the extremes and the increasing convex order (Theorem 4.A.33) is taken from Downey and Maier [170]. The bivariate characterizations of the orders ≤st, ≤hr, and ≤lr in Theorem 4.A.36 are taken from Righter and Shanthikumar [466]; its application (Theorem 4.A.37) is taken from Kijima and Ohnishi [292]. The increasing convex and concave comparisons of linear functions of random variables with random coefficients, whose parameters are comparable in the majorization order (Theorems 4.A.38 and 4.A.39 and Corollary 4.A.40), are taken from Denuit and Frostig [144]; further results of this type can be found there. The characterizations of the dilation and the icx orders by means of the increasing convex order (Theorems 4.A.41 and 4.A.42) are taken from Sordo and Ramos [538]. The characterization of the excess wealth order by means of the increasing convex order (Theorem 4.A.43) can be found in Belzunce [63]. The inheritance of the icv order by the observed total time on test random variables (Theorem 4.A.44) is given in Li and Shaked [356]. The icx order comparisons of a sum of independent heterogeneous binomial random variables with a proper binomial random variable (Example 4.A.45) are taken from Boland, Singh, and Cukic [102]. The necessary and sufficient conditions for the comparison of normal random variables (Example 4.A.46) are taken from Müller [413]. The icx comparison of the average of exponential random variables with the largest among them (Example 4.A.47) can be found in Argon and Andradóttir [16]. The condition for stochastic equality in the icx case of Theorem 4.A.48 can be found in Bhattacharjee and Bhattacharya [87]; the condition for stochastic equality in the icv case of Theorem 4.A.48 follows from the above condition and from Theorem 4.A.1. The condition for stochastic equality in Theorem 4.A.50 is taken from Sordo and Ramos [538]. The characterization of the DMRL and IMRL aging notions by means of the increasing convex order (Theorem 4.A.51) can be found in Cao and Wang [117], who also defined and studied the classes of NBUC and NWUC random variables. The terminology of NBUM and NWUM is due to Bergmann [81].
The characterization of NBUC random variables, given in (4.A.32), is taken from Belzunce, Ortega, and Ruiz [71]. The notions of NBU(2) and NWU(2) are defined in Deshpande, Kochar, and Singh [160]. The extension of the sufficiency condition in Theorem 4.A.51, given in Theorem 4.A.52, is taken from Li and Zuo [359]. Most of the results about the starshaped order (Section 4.A.6) can be found in Alzaid [13]. Most of the results on the orders ≤m-icx and ≤m-icv (Section 4.A.7) are taken from Rolski [473]; see also Mukherjee and Chatterjee [404], Fishburn and Lavalle [204], Wang and Young [558], Cheng and Pai [129], and references therein. Lefèvre and Utev [339] studied some stochastic orders among discrete random variables by replacing the integrals in (4.A.36) and in (4.A.37) by summations. Fishburn and Lavalle [204] also studied discrete analogs of the ≤m-icv orders. Thorlund-Petersen [549] characterized the ≤3-icv comparison of arithmetic random variables. The definition of the order ≤∞-icv can be found in Thistle [548] or in Fishburn and Lavalle [204] and in other references that are given in the latter paper. The moment inequalities that are given in Theorem 4.A.58 are also taken from Fishburn and Lavalle [204]; see further references there, and see also Carletti and Pellerey [121]. Theorem 4.A.60 can be found in Denuit, Lefèvre, and Utev [155]. The sufficient conditions for the m-icx order, in terms of sign changes (Theorem 4.A.63), are taken from Kaas and Hesselager [270]; the stochastic comparisons of the Gamma, inverse Gaussian, and lognormal random variables (Example 4.A.64) can also be found there. A variation of Theorem 4.A.65 can be found in Hesselager [222]. Some results that are related to Theorem 4.A.66 have been derived in Denuit [140]. Fishburn [201, 202] and Stoyan [540, page 22] extended the orders ≤m-icx and ≤m-icv by allowing m to be any positive number (that is, not necessarily an integer). They did it by letting the m in (4.A.38) and in (4.A.39) be any number greater than 0. Shaked and Wong [524] considered orders defined by requiring the test functions φ in (4.A.40) [respectively, (4.A.41)] to satisfy that φ^(j) [respectively, (−1)^j φ^(j)] is increasing, j = 0, 1, . . . , m − 1. Denuit, Lefèvre, and Shaked [151] studied the orders defined by requiring (4.A.38) and (4.A.39) to hold as well as E(X − a)^i ≤ E(Y − a)^i, i = 1, 2, . . . , m − 1, where a is the left endpoint of the support of the underlying random variables, and a is assumed to be finite. The results about the orders ≤p, ≤p+, and ≤p− (Theorems 4.A.67–4.A.69) are taken from Bhattacharjee and Sethuraman [88], Bhattacharjee [83], Li and Zhu [351], and Jun [265]. Note that the order that we denote by ≤p− is not the same as, but is a modification of, an order discussed by these authors. Some generalizations of Theorem 4.A.69 can be found in Cai and Wu [116]. The condition for stochastic equality in Theorem 4.A.70 is taken from Sordo and Ramos [538]. The discussion involving the orders ≤^{-1}_m is motivated by Muliere and Scarsini [406]; extensions of these orders are developed in Wang and Young [558] and in Maccheroni, Muliere, and Zoli [376].
Bhattacharjee [85] studied the order ≤icx under the restriction that the compared random variables are discrete. Baccelli and Makowski [28] denote X ≤FR-st Y whenever (4.A.21) holds (that is, X ≤FR-st Y ⟺ X ≤hmrl Y). They also define the orders ≤FR-cx and ≤FR-icx in a similar manner, and they study many closure properties of the orders ≤FR-st, ≤FR-cx, and ≤FR-icx. The order ≤FR-icx is a “hybrid” of the orders ≤hmrl (see (2.B.2)) and ≤3-icx (see (4.A.36)). It is defined by saying that the nonnegative random variables X and Y satisfy X ≤FR-icx Y if (here F̄ and Ḡ denote the survival functions of X and Y, respectively)

$$\frac{\int_x^\infty \int_{x_2}^\infty \overline{F}(x_1)\,dx_1\,dx_2}{EX} \le \frac{\int_x^\infty \int_{x_2}^\infty \overline{G}(x_1)\,dx_1\,dx_2}{EY} \quad \text{for all } x \ge 0.$$

Clearly, if EX = EY, then X ≤FR-icx Y if, and only if, X ≤2-icx Y. The order ≤FR-cx is defined by saying that the nonnegative random variables X and Y satisfy X ≤FR-cx Y if X ≤FR-icx Y and if E[X^2]/E[X] = E[Y^2]/E[Y].


Section 4.B: A good reference about the convex transform, star, and superadditive orders is Barlow and Proschan [36], where further references can be found. Many of the results given in this section can be found there. Another basic reference about the convex transform order is van Zwet [578]. The result about the relation of the star order and the dispersive order (Theorem 4.B.1) is implicit in Shaked [503], whereas the results about the relation of the superadditive order and the dispersive order (Theorems 4.B.2 and 4.B.3) can be found in Ahmed, Alzaid, Bartoszewicz, and Kochar [8]. The relationship between the star order and the icx order (Theorem 4.B.4) is taken from Szekli [544, page 23]; the idea of the first part of the proof of Theorem 4.B.4 is adopted from Arnold and Villasenor [21]. The property of the star order given in Theorem 4.B.6, when p = −1, can be found in Taillie [546]; Rivest [469] has obtained it for a general p ≠ 0. The comparison of the exponential mixtures with respect to the order ≤∗, given in Example 4.B.7, is taken from Bartoszewicz [50]. The characterization of the order ≤c by means of observed total time on test random variables (Theorem 4.B.8) can be found in Barlow and Doksum [34]. The proof of the implication that is given in Theorem 4.B.9 can be found in Bartoszewicz [42, 45]. An interesting study of the relationship between the convex transform, star, and superadditive orders and some variability orders can be found in Metzger and Rüschendorf [393]. A characterization of the star order, by means of the monotonicity in k of the ratio of the quantile functions of the corresponding order statistics X(k) and Y(k) (see (4.B.6)), is given in Bartoszewicz [41]. The characterization of the star order given in Theorem 4.B.10 is taken from Bartoszewicz [45]. The star ordering of order statistics from uniform distribution (Example 4.B.13) can be found in Jeon, Kochar, and Park [255].
The three transform orderings of the epoch times of two nonhomogeneous Poisson processes (Example 4.B.14) are given in Gupta and Kirmani [217]. The result about the preservation of the convex transform, star, and superadditive orders under formation of order statistics (Theorem 4.B.15) is a special case of a result in Belzunce, Mercader, and Ruiz [70]. The results about the convex transform, star, and superadditive order comparisons of random minima and maxima (Example 4.B.16) are taken from Bartoszewicz [49]. An extension of Theorem 4.B.15 to order statistics from samples with a random size can be found in Nanda, Misra, Paul, and Singh [427]. This extension of Nanda, Misra, Paul, and Singh [427] also extends the results in Example 4.B.16. The fact that the convex transform order implies the usual stochastic order among ratios of spacings (Theorem 4.B.17) can be found in Oja [440]. The result about the monotonicity of the ratios of expected values of the order statistics which is implied by the order ≤∗ (Theorem 4.B.18) is given in Bartoszewicz [45]; see also Barlow and Proschan [35]. The inequalities between the expected values of spacings from different samples (Theorem 4.B.19) are taken from Paul and Gutierrez [443]. The discussion of the DMRL and the NBUE orders in Section 4.B.3 follows the work of Kochar and Wiens [319] and of Kochar [306], although some of the proofs here are different; see also Belzunce, Candel, and Ruiz [65] and Fernandez-Ponce, Kochar, and Muñoz-Perez [195]. The discussion of the TTT order in Section 4.B.3 follows the work of Kochar, Li, and Shaked [316]. The result about the preservation of the TTT order under random minima (Theorem 4.B.28) is taken from Li and Zuo [358]. The connections between the order ≤ttt and observed total time on test random variables (Theorem 4.B.29) can be found in Li and Shaked [356]. The comparisons of random variables of interest in reliability theory, given in Example 4.B.30, are taken from Li and Shaked [357]. The generalization ≤ttt^(h) of the TTT order has been introduced and studied in Li and Shaked [357]. The definitions of the orders ≤c, ≤∗, and ≤su, given in Section 4.B.1, are proper when the comparisons apply to distributions of nonnegative random variables. Van Zwet [578], Lawrence [334], and Loh [365] study modifications of these orders which apply to symmetric distributions.

5 The Laplace Transform and Related Orders

The most important common order that is studied in this chapter is the Laplace transform order. Like the orders that were discussed in Chapter 4, the Laplace transform order compares random variables according to both their “location” and their “spread”. Two other useful orders, based on ratios of Laplace transforms, are also discussed in this chapter. In addition, some other related orders are investigated in this chapter as well.

5.A The Laplace Transform Order

5.A.1 Definitions and equivalent conditions

The relations X ≤st Y, X ≤cx Y, X ≤icx Y, and X ≤icv Y, as well as many others, are defined by requiring E[φ(X)] ≤ E[φ(Y)] to hold for all functions φ in some class of functions. For example, the class of functions which corresponds to the usual stochastic order is the class of all increasing functions. The order that is discussed in this section corresponds to the class of functions φ of the form φ(x) = −e^{−sx} where s is a positive number. More explicitly, let X and Y be two nonnegative random variables such that

$$E[\exp\{-sX\}] \ge E[\exp\{-sY\}] \quad \text{for all } s > 0. \tag{5.A.1}$$

Then X is said to be smaller than Y in the Laplace transform order (denoted by X ≤Lt Y).

Throughout this section we consider only nonnegative random variables. For a nonnegative random variable X with distribution function F and survival function F̄ ≡ 1 − F, denote by

$$f^*(s) = \int_0^\infty e^{-sx}\,dF(x) \quad \text{and} \quad \overline{F}^*(s) = \int_0^\infty e^{-sx}\,\overline{F}(x)\,dx$$

the Laplace-Stieltjes transform of F (or the Laplace transform of X) and the Laplace transform of F̄, respectively. Then it is easy to verify that

$$\overline{F}^*(s) = s^{-1}\bigl(1 - f^*(s)\bigr) \quad \text{for all } s > 0. \tag{5.A.2}$$
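The identity (5.A.2) can be checked by interchanging the order of integration. The following derivation is a sketch: it writes the survival function as an integral with respect to F, uses Tonelli's theorem for the nonnegative integrand, and uses the fact that F puts total mass 1 on [0, ∞) because X is nonnegative.

```latex
\begin{aligned}
\overline{F}^*(s)
  &= \int_0^\infty e^{-sx}\,\overline{F}(x)\,dx
   = \int_0^\infty e^{-sx} \int_{(x,\infty)} dF(y)\,dx \\
  &= \int_{[0,\infty)} \left( \int_0^{y} e^{-sx}\,dx \right) dF(y)
   = \int_{[0,\infty)} \frac{1 - e^{-sy}}{s}\,dF(y) \\
  &= s^{-1}\bigl(1 - f^*(s)\bigr), \qquad s > 0.
\end{aligned}
```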

Using (5.A.2), the following result is easy to verify.

Theorem 5.A.1. Let X and Y be two nonnegative random variables with survival functions F̄ and Ḡ, respectively. Then X ≤Lt Y if, and only if,

$$\int_0^\infty e^{-sx}\,\overline{F}(x)\,dx \le \int_0^\infty e^{-sx}\,\overline{G}(x)\,dx \quad \text{for all } s > 0. \tag{5.A.3}$$

Note that (5.A.3) can be written as

$$E\min\{X, E_s\} \le E\min\{Y, E_s\} \quad \text{for all } s > 0, \tag{5.A.4}$$

where E_s is an exponential random variable with mean 1/s, which is independent of X and of Y.

Using (5.A.2) it is also easy to verify the statement that is given in the following remark.

Remark 5.A.2. Let X and Y be two positive random variables, and let E_1 be a mean 1 exponential random variable which is independent of both X and Y. Define X̃ = E_1/X and Ỹ = E_1/Y; that is, the distributions of both X̃ and Ỹ are scale mixtures of exponential distributions. Then X ≤Lt Y ⟺ Ỹ ≤st X̃. See similar results in Example 4.B.7 and in Remark 5.B.1.

If X ≤Lt Y, then (1 − E[exp{−sX}])/s ≤ (1 − E[exp{−sY}])/s for all s > 0. Letting s ↓ 0 it is seen that

$$X \le_{\mathrm{Lt}} Y \implies EX \le EY, \tag{5.A.5}$$

provided the expectations exist.

A function φ : [0, ∞) → R is said to be completely monotone if all its derivatives φ^(n) exist and satisfy φ^(0)(x) ≡ φ(x) ≥ 0, φ^(1)(x) ≤ 0, φ^(2)(x) ≥ 0, . . .; that is, φ is completely monotone if (−1)^n φ^(n)(x) ≥ 0 for all x > 0 and n = 0, 1, 2, . . .. It is well known that φ is completely monotone if, and only if, there exists a measure µ on (0, ∞) such that

$$\phi(x) = \int_0^\infty e^{-xu}\,\mu(du).$$

Therefore, if X ≤Lt Y and φ is completely monotone, then

$$E[\phi(X)] = E\int_0^\infty e^{-Xu}\,\mu(du) = \int_0^\infty E[e^{-Xu}]\,\mu(du) \ge \int_0^\infty E[e^{-Yu}]\,\mu(du) = E[\phi(Y)],$$

provided the expectations exist. The function φ, which is defined by φ(x) = exp{−sx}, is completely monotone for each s > 0. We thus have proven the following characterization of the order ≤Lt.


Theorem 5.A.3. Let X and Y be two nonnegative random variables. Then X ≤Lt Y if, and only if,

$$E[\phi(X)] \ge E[\phi(Y)] \tag{5.A.6}$$

for all completely monotone functions φ, provided the expectations exist.

A similar result is the following.

Theorem 5.A.4. Let X and Y be two nonnegative random variables. Then X ≤Lt Y if, and only if, E[φ(X)] ≤ E[φ(Y)] for all differentiable functions φ on [0, ∞) with a completely monotone derivative, provided the expectations exist.

Next we characterize the order ≤Lt by a function of the respective moments. In order to do that we notice that if X is a nonnegative random variable with survival function F̄ such that all its moments exist, then

$$\int_0^\infty e^{-sx}\,\overline{F}(x)\,dx = \sum_{i=0}^\infty \frac{(-s)^i}{i!} \int_0^\infty x^i\,\overline{F}(x)\,dx = \sum_{i=0}^\infty \frac{(-s)^i}{i!}\,\frac{EX^{i+1}}{i+1}.$$

Using this fact and Theorem 5.A.1, the proof of the next theorem is apparent.

Theorem 5.A.5. Let X and Y be nonnegative random variables that possess moments µi and νi, respectively, i = 1, 2, . . . . Then X ≤Lt Y if, and only if,

$$\sum_{i=0}^\infty \frac{(-s)^i}{(i+1)!}\,\mu_{i+1} \le \sum_{i=0}^\infty \frac{(-s)^i}{(i+1)!}\,\nu_{i+1} \quad \text{for all } s > 0.$$
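The definition (5.A.1), the identity (5.A.2), and the limit leading to (5.A.5) can be tried out on distributions whose Laplace transforms are known in closed form. The Python sketch below uses an exponential and a gamma variable for this purpose; the function names and the finite grid of s values are assumptions of the illustration, and a check on a grid is evidence rather than a proof that the order holds.

```python
def lt_exponential(rate):
    """Laplace transform s -> E[exp(-sX)] of an exponential variable with the given rate."""
    return lambda s: rate / (rate + s)

def lt_gamma(shape, rate):
    """Laplace transform of a gamma variable with the given shape and rate."""
    return lambda s: (rate / (rate + s)) ** shape

def is_lt_smaller(lt_x, lt_y, s_grid, tol=1e-12):
    """Check inequality (5.A.1) on a finite grid of s values."""
    return all(lt_x(s) >= lt_y(s) - tol for s in s_grid)

def survival_transform(lt, s):
    """Laplace transform of the survival function, computed through (5.A.2)."""
    return (1.0 - lt(s)) / s

S_GRID = [10.0 ** (k / 10.0) for k in range(-30, 31)]  # s from 1e-3 to 1e3

X = lt_exponential(1.0)  # mean 1
Y = lt_gamma(2.0, 1.0)   # mean 2: sum of two independent mean-1 exponentials, so X <=st Y

print(is_lt_smaller(X, Y, S_GRID), is_lt_smaller(Y, X, S_GRID))
# For small s the quantity in (5.A.2) approaches the mean, as in the argument for (5.A.5).
print(survival_transform(X, 1e-6), survival_transform(Y, 1e-6))
```

For the exponential variable with rate 1, `survival_transform(X, s)` equals 1/(1 + s), which is also E min{X, E_s} in (5.A.4), since the minimum of two independent exponentials is exponential with the summed rate.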

A Laplace transform characterization of the order ≤Lt is stated next. It may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, 2.A.16, 2.B.14, and 4.A.21. We omit its proof.

Theorem 5.A.6. Let X1 and X2 be two nonnegative random variables, and let Nλ(X1) and Nλ(X2) be as described in Theorem 1.A.13. Then

$$X_1 \le_{\mathrm{Lt}} X_2 \iff N_\lambda(X_1) \le_{\mathrm{Lt}} N_\lambda(X_2) \quad \text{for all } \lambda > 0.$$

5.A.2 Closure and other properties

Using (5.A.1) and (5.A.6) it is easy to prove each of the closure results in the following theorem. The first part of the theorem follows from the observation that if φ is a completely monotone function and g is a positive function with a completely monotone derivative, then φ(g) is completely monotone. Comments about the proof of the last part are given after the statement of the theorem. (Recall from Section 1.A.3 that for any random variable Z and any event A we denote by [Z | A] any random variable whose distribution is the conditional distribution of Z given A.)


Theorem 5.A.7.
(a) If X ≤Lt Y and g is any positive function with a completely monotone derivative, then g(X) ≤Lt g(Y).
(b) Let X, Y, and Θ be random variables such that [X | Θ = θ] ≤Lt [Y | Θ = θ] for all θ in the support of Θ. Then X ≤Lt Y. That is, the Laplace transform order is closed under mixtures.
(c) Let {Xj, j = 1, 2, . . . } and {Yj, j = 1, 2, . . . } be two sequences of random variables such that Xj →st X and Yj →st Y as j → ∞. If Xj ≤Lt Yj, j = 1, 2, . . ., then X ≤Lt Y.
(d) Let X1, X2, . . . , Xm be a set of independent random variables and let Y1, Y2, . . . , Ym be another set of independent random variables. If Xi ≤Lt Yi for i = 1, 2, . . . , m, then g(X1, X2, . . . , Xm) ≤Lt g(Y1, Y2, . . . , Ym) for all nonnegative functions g on [0, ∞)^m such that (∂/∂xi)g(x1, x2, . . . , xm) is completely monotone in xi, i = 1, 2, . . . , m. In particular, the Laplace transform order is closed under convolutions.

The proof of Theorem 5.A.7(d) is very similar to the proof of Theorem 4.A.15. The basic difference is that one should use Theorem 5.A.7(a) rather than Theorem 4.A.8(a) in the first step of the inductive argument.

Another closure property of the order ≤Lt is described in the following theorem.

Theorem 5.A.8. Let X1, X2, . . . and Y1, Y2, . . . each be a sequence of nonnegative independent random variables, and let M and N be integer-valued positive random variables that are independent of the {Xi} and the {Yi} sequences, respectively. Suppose that there exists a nonnegative random variable Z such that Xi ≤Lt Z ≤Lt Yj for all i and j. If M ≤Lt N, then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{Lt}} \sum_{j=1}^{N} Y_j.$$

Proof. Note that for all s > 0 we have

$$\begin{aligned}
E\exp\Bigl\{-s\sum_{j=1}^{M} X_j\Bigr\}
&= \sum_{n=1}^{\infty} P\{M = n\} \prod_{j=1}^{n} E[\exp\{-sX_j\}] \\
&\ge \sum_{n=1}^{\infty} P\{M = n\}\,(E[\exp\{-sZ\}])^n \\
&= \sum_{n=1}^{\infty} P\{M = n\} \exp\{-n(-\log E[\exp\{-sZ\}])\} \\
&\ge \sum_{n=1}^{\infty} P\{N = n\} \exp\{-n(-\log E[\exp\{-sZ\}])\} \\
&= \sum_{n=1}^{\infty} P\{N = n\}\,(E[\exp\{-sZ\}])^n \\
&\ge \sum_{n=1}^{\infty} P\{N = n\} \prod_{j=1}^{n} E[\exp\{-sY_j\}] \\
&= E\exp\Bigl\{-s\sum_{j=1}^{N} Y_j\Bigr\},
\end{aligned}$$

where the first and the last equalities follow from the independence of M and N of the {Xi} and the {Yi} sequences, the first and the last inequalities follow from Xi ≤Lt Z ≤Lt Yj for all i and j, and the middle inequality follows from M ≤Lt N. The stated result now follows.
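The first equality in the proof, which conditions on the number of summands, can be evaluated numerically when the summands are independent and identically distributed. The following Python sketch does this for geometric counts and exponential summands; the truncation level `n_max`, the parameter values, and the function names are assumptions of the illustration, and Z may be taken to be distributed as Y1 in this setting.

```python
def geometric_pmf(p):
    """P{M = n} = p (1 - p)^(n - 1) for n = 1, 2, ..."""
    return lambda n: p * (1.0 - p) ** (n - 1)

def lt_exponential(rate):
    """Laplace transform s -> E[exp(-sX)] of an exponential variable with the given rate."""
    return lambda s: rate / (rate + s)

def lt_random_sum(pmf, lt, s, n_max=200):
    """Truncated series sum_n P{M = n} (E[exp(-sX)])^n for i.i.d. summands independent of M."""
    base = lt(s)
    return sum(pmf(n) * base ** n for n in range(1, n_max + 1))

S_GRID = [0.01, 0.1, 1.0, 10.0, 100.0]

# M ~ geometric(0.5) and N ~ geometric(0.25) on {1, 2, ...}: M <=st N, hence M <=Lt N.
# X_j ~ exponential(rate 2) and Y_j ~ exponential(rate 1): X_j <=st Y_j, hence X_j <=Lt Y_j.
lt_sum_x = lambda s: lt_random_sum(geometric_pmf(0.5), lt_exponential(2.0), s)
lt_sum_y = lambda s: lt_random_sum(geometric_pmf(0.25), lt_exponential(1.0), s)

print(all(lt_sum_x(s) >= lt_sum_y(s) for s in S_GRID))
```

A geometric sum of independent exponentials with rate λ is again exponential, with rate pλ, so the two transforms here are 1/(1 + s) and 0.25/(0.25 + s); the truncated series reproduce these values up to rounding.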

As a corollary of Theorem 5.A.8 we obtain the next result, which is an analog of Theorem 4.A.9. It is worthwhile to point out that Theorem 7.D.7, which is proven in Section 7.D.1, is a more general result than the following theorem.

Theorem 5.A.9. Let X1, X2, . . . and Y1, Y2, . . . each be a sequence of nonnegative independent and identically distributed random variables such that Xi ≤Lt Yi, i = 1, 2, . . .. Let M and N be integer-valued positive random variables that are independent of the {Xi} and the {Yi} sequences, respectively, such that M ≤Lt N. Then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{Lt}} \sum_{j=1}^{N} Y_j.$$

A result that is related to Theorem 5.A.9 is given next. It is of interest to compare it to Theorems 1.A.5, 2.B.8, 3.A.14, and 4.A.12.

Theorem 5.A.10. Let {Xj, j = 1, 2, . . . } be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi’s. Let {Yj, j = 1, 2, . . . } be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the Yi’s. Suppose that for some positive integer K we have

$$\sum_{i=1}^{K} X_i \le_{\mathrm{Lt}} [\ge_{\mathrm{Lt}}]\; Y_1 \quad \text{and} \quad M \le_{\mathrm{Lt}} [\ge_{\mathrm{Lt}}]\; KN.$$

Then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{Lt}} [\ge_{\mathrm{Lt}}]\; \sum_{j=1}^{N} Y_j.$$


We do not give a detailed proof of Theorem 5.A.10 here since it is similar to the proof of Theorem 4.A.12 in Section 4.A.1. Two other similar theorems are the following. Their proofs are similar to the proofs of Theorems 4.A.13 and 4.A.14 in Section 4.A.1.

Theorem 5.A.11. Let {Xj, j = 1, 2, . . . } be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi’s. Let {Yj, j = 1, 2, . . . } be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the Yi’s. Also, let {Nj, j = 1, 2, . . . } be a sequence of independent random variables that are distributed as N. If for some positive integer K we have

$$\sum_{i=1}^{K} X_i \le_{\mathrm{Lt}} Y_1 \quad \text{and} \quad M \le_{\mathrm{Lt}} \sum_{i=1}^{K} N_i,$$

or if we have

$$KX_1 \le_{\mathrm{Lt}} Y_1 \quad \text{and} \quad M \le_{\mathrm{Lt}} KN,$$

or if we have

$$KX_1 \le_{\mathrm{Lt}} Y_1 \quad \text{and} \quad M \le_{\mathrm{Lt}} \sum_{i=1}^{K} N_i,$$

then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{Lt}} \sum_{j=1}^{N} Y_j.$$

Theorem 5.A.12. Let {Xj, j = 1, 2, . . . } be a sequence of nonnegative independent and identically distributed random variables, and let M be a positive integer-valued random variable which is independent of the Xi’s. Let {Yj, j = 1, 2, . . . } be another sequence of independent and identically distributed random variables, and let N be a positive integer-valued random variable which is independent of the Yi’s. If for some positive integers K1 and K2, such that K1 ≤ K2, we have

$$\sum_{i=1}^{K_1} X_i \le_{\mathrm{Lt}} \frac{K_1}{K_2}\,Y_1 \quad \text{and} \quad M \le_{\mathrm{Lt}} K_2 N,$$

then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{Lt}} \sum_{j=1}^{N} Y_j.$$


Recall from page 2 the definition of the majorization order a ≺ b among n-dimensional vectors.

Theorem 5.A.13. Let X1, X2, . . . , Xm be independent nonnegative random variables. Let a1 ≥ a2 ≥ · · · ≥ am ≥ 0 and b1 ≥ b2 ≥ · · · ≥ bm ≥ 0 be constants such that a ≺ b. If X1 ≤rh X2 ≤rh · · · ≤rh Xm, then

$$\sum_{i=1}^{m} a_i X_i \le_{\mathrm{Lt}} \sum_{i=1}^{m} a_{m-i+1} X_i \quad \text{and} \quad \sum_{i=1}^{m} b_i X_i \le_{\mathrm{Lt}} \sum_{i=1}^{m} a_i X_i.$$

Proof. By Theorem 5.A.7(d) the order ≤Lt is closed under convolutions. Thus, it suffices to prove the stated results for m = 2. Select an s ≥ 0. In Theorem 1.B.50(b), take α(x) = e^{−a1 s x} and β(x) = e^{−a2 s x} to obtain

$$E[\exp\{-s(a_1X_1 + a_2X_2)\}] \ge E[\exp\{-s(a_1X_2 + a_2X_1)\}], \quad s \ge 0;$$

that is, a1X1 + a2X2 ≤Lt a1X2 + a2X1. In order to prove the second statement, take α(x) = e^{−a2 s x} and β(x) = e^{−b2 s x} in Theorem 1.B.50(b) to obtain

$$\frac{E[\exp\{-b_2X_2s\}]}{E[\exp\{-a_2X_2s\}]} \ge \frac{E[\exp\{-b_2X_1s\}]}{E[\exp\{-a_2X_1s\}]}, \quad s \ge 0. \tag{5.A.7}$$

Also, by Theorem 3.A.35 we have a1X1 + a2X1* ≤cx b1X1 + b2X1*, where X1* is an independent copy of X1. Therefore, a1X1 + a2X1* ≥Lt b1X1 + b2X1*, and hence,

$$\frac{E[\exp\{-b_2X_1s\}]}{E[\exp\{-a_2X_1s\}]} = \frac{E[\exp\{-b_2X_1^*s\}]}{E[\exp\{-a_2X_1^*s\}]} \ge \frac{E[\exp\{-a_1X_1s\}]}{E[\exp\{-b_1X_1s\}]}, \quad s \ge 0. \tag{5.A.8}$$

Combining (5.A.7) and (5.A.8) we obtain b1X1 + b2X2 ≤Lt a1X1 + a2X2.

The Laplace transform order is closed under linear convex combinations as the following theorem shows. This result is an analog of Theorem 3.A.36, and its proof is similar to the proof of that theorem; therefore the proof is omitted. Similar results are Theorems 5.C.8 and 5.C.18.

Theorem 5.A.14. Let $X_1, X_2, \dots, X_n$ and $Y$ be $n + 1$ random variables. If $X_i \ge_{\mathrm{Lt}} Y$, $i = 1, 2, \dots, n$, then

$$\sum_{i=1}^{n} a_i X_i \ge_{\mathrm{Lt}} Y,$$

whenever $a_i \ge 0$, $i = 1, 2, \dots, n$, and $\sum_{i=1}^{n} a_i = 1$.


A result that is similar to Theorems 1.A.8, 3.A.43, 3.A.60, 4.A.69, 6.B.19, 6.G.12, 6.G.13, and 7.A.14–7.A.16, is the following.

Theorem 5.A.15. Let $X$ and $Y$ be two nonnegative random variables. Suppose that $X \le_{\mathrm{Lt}} Y$ and that $E[X^{\alpha}] = E[Y^{\alpha}]$ for some $\alpha < 0$ or for some $\alpha \in (0, 1)$, provided the expectations exist. Then $X =_{\mathrm{st}} Y$.

The function $\phi$ defined by $\phi(x) = \exp\{-sx\}$ is decreasing and convex for each $s > 0$. Therefore $-\phi$ is increasing and concave. We thus obtain the next result.

Theorem 5.A.16. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{\mathrm{icv}} Y$, then $X \le_{\mathrm{Lt}} Y$. In particular, if $X \le_{\mathrm{st}} Y$, then $X \le_{\mathrm{Lt}} Y$.

In fact, from (4.A.41) it follows that if $X \le_{m\text{-icv}} Y$, for any $m$, then $X \le_{\mathrm{Lt}} Y$. For random variables with finite supports we have the following characterization of the Laplace transform order by means of the orders $\le_{m\text{-icv}}$ that were studied in Section 4.A.7.

Theorem 5.A.17. Let $X$ and $Y$ be two random variables with finite supports. Then $X \le_{\mathrm{Lt}} Y$ if, and only if, $X \le_{\infty\text{-icv}} Y$ (where $\le_{\infty\text{-icv}}$ is defined in (4.A.42)).

Another strengthening of Theorem 5.A.16 is stated and proven next. Recall from Section 4.A.7 the definition of the order $\le_{p-}$.

Theorem 5.A.18. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{p-} Y$ for some $p \le 1$, then $X \le_{\mathrm{Lt}} Y$.

Proof. Recall from (4.A.45) that if $X \le_{p-} Y$, then $X^p \le_{\mathrm{icv}} Y^p$. Select an $s > 0$. Define $\phi(x) \equiv e^{-sx}$ and let

$$h(x) \equiv \phi\big(x^{1/p}\big) = e^{-s x^{1/p}}.$$

It is easy to verify that the function $h$ is decreasing and convex, and therefore $-h$ is increasing and concave. From the fact that $X^p \le_{\mathrm{icv}} Y^p$ it follows that $-E[h(X^p)] \le -E[h(Y^p)]$, or, equivalently, that

$$E\big[e^{-sX}\big] \ge E\big[e^{-sY}\big].$$

Since the latter inequality holds for all $s > 0$ it follows that $X \le_{\mathrm{Lt}} Y$.
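The defining inequality of the Laplace transform order, $E[e^{-sX}] \ge E[e^{-sY}]$ for all $s > 0$, can be explored numerically for random variables with finite support. The following is only a sketch: the helper names `laplace_transform` and `is_lt_smaller` and the two probability mass functions are illustrative assumptions, and a check on a finite grid of $s$ values can suggest, but never prove, that the order holds.

```python
import math

def laplace_transform(pmf, s):
    """E[exp(-s X)] for a finite pmf given as {value: probability}."""
    return sum(p * math.exp(-s * x) for x, p in pmf.items())

def is_lt_smaller(pmf_x, pmf_y, grid, tol=1e-12):
    """Grid check of X <=_Lt Y, i.e. E[exp(-sX)] >= E[exp(-sY)] at every grid point."""
    return all(
        laplace_transform(pmf_x, s) >= laplace_transform(pmf_y, s) - tol
        for s in grid
    )

# Hypothetical example: Y is distributed as X + 1, so X <=_st Y,
# and Theorem 5.A.16 then gives X <=_Lt Y.
pmf_x = {0: 0.5, 1: 0.3, 4: 0.2}
pmf_y = {x + 1: p for x, p in pmf_x.items()}
grid = [0.01 * k for k in range(1, 1001)]  # s in (0, 10]

lt_holds = is_lt_smaller(pmf_x, pmf_y, grid)
reverse_holds = is_lt_smaller(pmf_y, pmf_x, grid)
```

In this example the grid check is expected to agree with Theorem 5.A.16 in one direction and to fail in the other, since the transform of $Y$ is $e^{-s}$ times that of $X$.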

Closure properties of an order under the operation of taking minima are of importance in reliability theory. The next result gives conditions under which the order ≤Lt is closed under this operation. We do not give the proof here.


Theorem 5.A.19. Let the independent nonnegative random variables $X_1, X_2, \dots, X_m, Y_1, Y_2, \dots, Y_m$ have the survival functions $\bar F_1, \bar F_2, \dots, \bar F_m, \bar G_1, \bar G_2, \dots, \bar G_m$, respectively. If $X_i \le_{\mathrm{Lt}} Y_i$, $i = 1, 2, \dots, m$, and $\bar F_i$ and $\bar G_i$, $i = 1, 2, \dots, m$, are completely monotone, then

$$\min\{X_1, X_2, \dots, X_m\} \le_{\mathrm{Lt}} \min\{Y_1, Y_2, \dots, Y_m\}.$$

Remark 5.A.20. Let $\{X, X_1, X_2, \dots\}$ be a set of nonnegative independent and identically distributed random variables, and let $\{Y, Y_1, Y_2, \dots\}$ be another set of nonnegative independent and identically distributed random variables. Denote by $X_{(i:n)}$ the $i$th order statistic in a sample of size $n$ from $\{X_1, X_2, \dots\}$, and denote by $Y_{(i:n)}$ the $i$th order statistic in a sample of size $n$ from $\{Y_1, Y_2, \dots\}$. If $X \le_{\mathrm{disp}} Y$, then, by Theorem 3.B.31, for $2 \le i \le n$ we have $X_{(i:n)} - X_{(i-1:n)} \le_{\mathrm{st}} Y_{(i:n)} - Y_{(i-1:n)}$, and therefore, by Theorem 5.A.16, we have

$$X_{(i:n)} - X_{(i-1:n)} \le_{\mathrm{Lt}} Y_{(i:n)} - Y_{(i-1:n)}, \quad 2 \le i \le n. \tag{5.A.9}$$

Bartoszewicz [46] proved a similar result. He showed that if $X \le_{\mathrm{disp}} Y$, and if the $X_i$'s and the $Y_i$'s are independent, then

$$X_{(i:n)} + Y_{(i-1:n)} \le_{\mathrm{Lt}} X_{(i-1:n)} + Y_{(i:n)}, \quad 2 \le i \le n. \tag{5.A.10}$$

This is different from (5.A.9) because $X_{(i-1:n)}$ and $X_{(i:n)}$ (and $Y_{(i-1:n)}$ and $Y_{(i:n)}$) in (5.A.9) have a particular joint distribution, whereas (5.A.10) involves only the marginal distributions of $X_{(i-1:n)}$ and $X_{(i:n)}$ (and of $Y_{(i-1:n)}$ and $Y_{(i:n)}$). Bartoszewicz [46] also proved that if $X \le_{\mathrm{disp}} Y$, and if the $X_i$'s and the $Y_i$'s are independent, then

$$X_{(n+1-i:n+1)} + Y_{(n-i:n)} \le_{\mathrm{Lt}} X_{(n-i:n)} + Y_{(n+1-i:n+1)}, \quad 0 \le i \le n - 1,$$

and

$$X_{(i:n)} + Y_{(i:n+1)} \le_{\mathrm{Lt}} X_{(i:n+1)} + Y_{(i:n)}, \quad 1 \le i \le n.$$

In reliability theory, motivated by (3.A.62) and Theorem 5.A.16, one may consider the class of nonnegative random variables $X$ which satisfy $X \ge_{\mathrm{Lt}} [\le_{\mathrm{Lt}}] \operatorname{Exp}(\mu)$ or, equivalently,

$$\int_0^{\infty} e^{-su} P\{X > u\}\,du \ge [\le] \frac{\mu}{1 + s\mu} \quad \text{for } s \ge 0, \tag{5.A.11}$$

where $\mu$ is the mean of $X$. Such random variables have interesting aging properties. From Theorems 3.A.55 and 5.A.16 it is seen that if $X$ is NBUE [NWUE], then $X$ satisfies (5.A.11). Some researchers studied random variables $X$ which satisfy


$X \ge_{\mathrm{Lt}} \operatorname{Gamma}(\alpha, \beta)$, where $\operatorname{Gamma}(\alpha, \beta)$ denotes a Gamma random variable with shape parameter $\alpha$ and scale parameter $\beta$, which has the same mean as $X$. See Klar [300], Hu and Lin [228], and references therein.

Let $X$ be a nonnegative random variable with a finite mean. Recall the definition of the asymptotic equilibrium age $A_X$ whose distribution function is given in (1.A.20). Let $Y$ be another nonnegative random variable with the corresponding asymptotic equilibrium age $A_Y$. From (5.A.3) it is seen at once that if $EX = EY$, then

$$X \le_{\mathrm{Lt}} Y \iff A_X \ge_{\mathrm{Lt}} A_Y. \tag{5.A.12}$$

The next result indicates the "minimal" and the "maximal" random variables, with respect to the order $\le_{\mathrm{Lt}}$, when the mean and the variance are given. It is worthwhile to contrast it with Theorem 3.A.24.

Theorem 5.A.21. Let $Y$ be a nonnegative random variable with mean $\mu$ and variance $\sigma^2$. Let $X$ be a random variable such that $P\{X = 0\} = 1 - P\{X = (\mu^2 + \sigma^2)/\mu\} = \sigma^2/(\mu^2 + \sigma^2)$ (so that $EX = \mu$ and $\operatorname{Var}(X) = \sigma^2$) and let $Z$ be a random variable degenerate at $\mu$. Then

$$X \le_{\mathrm{Lt}} Y \le_{\mathrm{Lt}} Z. \tag{5.A.13}$$

Proof. The right-side inequality in (5.A.13) follows at once from Jensen's Inequality. Let $\bar F$ and $\bar G$ be, respectively, the survival functions of $X$ and $Y$. In order to obtain the left-side inequality in (5.A.13) we will show that

$$\int_0^{\infty} e^{-sx} \bar F(x)\,dx \le \int_0^{\infty} e^{-sx} \bar G(x)\,dx \quad \text{for all } s \ge 0. \tag{5.A.14}$$

The result will then follow from Theorem 5.A.1. Define the functions $\alpha$ and $\beta$ on $(0, \infty)$ by $\alpha(x) = \bar F(x)/\mu$ and $\beta(x) = \bar G(x)/\mu$. It is easy to see that both $\alpha$ and $\beta$ are density functions with a common mean $(\mu^2 + \sigma^2)/2\mu$. In fact, $\alpha$ is the density function of the uniform distribution over the interval $[0, (\mu^2 + \sigma^2)/\mu)$, whereas $\beta$ is a density which is decreasing on $[0, \infty)$. From Theorem 3.A.46 it now follows that

$$\int_0^{\infty} \phi(x) \frac{\bar F(x)}{\mu}\,dx \le \int_0^{\infty} \phi(x) \frac{\bar G(x)}{\mu}\,dx$$

for all convex functions $\phi$, and in particular (5.A.14) holds.
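As a quick sanity check of (5.A.13), one can compare the three Laplace transforms in closed form for a specific $Y$. The sketch below assumes, purely for illustration, that $Y$ is exponential with mean $\mu = 1$ (so that $\sigma^2 = \mu^2 = 1$); the function names are hypothetical. Since $X \le_{\mathrm{Lt}} Y \le_{\mathrm{Lt}} Z$ means $E[e^{-sX}] \ge E[e^{-sY}] \ge E[e^{-sZ}]$, the transforms should appear in decreasing order.

```python
import math

# Assumption for illustration: Y ~ exponential with mean mu, hence variance mu**2.
mu = 1.0
var = mu ** 2

atom = (mu ** 2 + var) / mu      # the positive atom of X, (mu^2 + sigma^2)/mu
p_zero = var / (mu ** 2 + var)   # P{X = 0} = sigma^2 / (mu^2 + sigma^2)

def lt_x(s):
    """Laplace transform of the two-point random variable X of Theorem 5.A.21."""
    return p_zero + (1.0 - p_zero) * math.exp(-s * atom)

def lt_y(s):
    """Laplace transform of an exponential random variable with mean mu."""
    return 1.0 / (1.0 + s * mu)

def lt_z(s):
    """Laplace transform of the random variable degenerate at mu."""
    return math.exp(-s * mu)

grid = [0.05 * k for k in range(0, 401)]  # s in [0, 20]
tol = 1e-12                               # slack for floating-point rounding
bounds_hold = all(
    lt_x(s) + tol >= lt_y(s) and lt_y(s) + tol >= lt_z(s) for s in grid
)

# X matches the prescribed mean and variance.
mean_x = (1.0 - p_zero) * atom
var_x = (1.0 - p_zero) * atom ** 2 - mean_x ** 2
```

The margin between the transforms of $X$ and $Y$ is very small near $s = 0$, which reflects the fact that both variables share the same first two moments.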

A characterization of the hazard rate order, by means of the Laplace transform order, is described in the following theorem. Recall from Section 1.A.3 that for any random variable $Z$ and an event $A$ we denote by $[Z \mid A]$ any random variable that has as its distribution the conditional distribution of $Z$ given $A$.


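Theorem 5.A.22. Let $X$ and $Y$ be two continuous random variables with right support endpoints $u_X$ and $u_Y$, respectively. Then $X \le_{\mathrm{hr}} Y$ if, and only if,

$$[X - t \mid X > t] \le_{\mathrm{Lt}} [Y - t \mid Y > t] \quad \text{for all } t < \min\{u_X, u_Y\}. \tag{5.A.15}$$

Proof. The fact that $X \le_{\mathrm{hr}} Y$ implies (5.A.15) follows from (1.B.6) and Theorem 5.A.16. In order to prove the converse, let us assume, for simplicity, that $u_X = u_Y = \infty$. Denote by $\bar F$ and $\bar G$ the survival functions of $X$ and $Y$, respectively, and by $F$ and $G$ the corresponding distribution functions. Now note that

$$[X - t \mid X > t] \le_{\mathrm{Lt}} [Y - t \mid Y > t] \quad \text{for all } t$$

$$\iff \int_0^{\infty} e^{-su} \frac{\bar F(u + t)}{\bar F(t)}\,du \le \int_0^{\infty} e^{-su} \frac{\bar G(u + t)}{\bar G(t)}\,du \quad \text{for all } t \text{ and } s > 0$$

$$\iff \frac{\int_t^{\infty} e^{-su} \bar G(u)\,du}{\int_t^{\infty} e^{-su} \bar F(u)\,du} \ge \frac{\bar G(t)}{\bar F(t)} \quad \text{for all } t \text{ and } s > 0$$

$$\iff \frac{\int_t^{\infty} e^{-su} \bar G(u)\,du}{\int_t^{\infty} e^{-su} \bar F(u)\,du} \ \text{is increasing in } t \text{ for all } s > 0 \tag{5.A.16}$$

$$\iff \frac{\frac{1}{s} e^{-st}\big[\bar G(t) - e^{st} \int_t^{\infty} e^{-su}\,dG(u)\big]}{\frac{1}{s} e^{-st}\big[\bar F(t) - e^{st} \int_t^{\infty} e^{-su}\,dF(u)\big]} \ \text{is increasing in } t \text{ for all } s > 0$$

$$\iff \frac{\bar G(t) - e^{st} \int_t^{\infty} e^{-su}\,dG(u)}{\bar F(t) - e^{st} \int_t^{\infty} e^{-su}\,dF(u)} \ \text{is increasing in } t \text{ for all } s > 0, \tag{5.A.17}$$

where the second from last equivalence follows by integration by parts. Using the Dominated Convergence Theorem, it is not hard to see that $\lim_{s \to \infty} e^{st} \int_t^{\infty} e^{-su}\,dF(u) = \lim_{s \to \infty} e^{st} \int_t^{\infty} e^{-su}\,dG(u) = 0$. Therefore, letting $s \to \infty$ in (5.A.17) we obtain that $\bar G(t)/\bar F(t)$ is increasing in $t$; that is, $X \le_{\mathrm{hr}} Y$.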

Remark 5.A.23. The equivalence of $X \le_{\mathrm{hr}} Y$ and (5.A.16), together with (2.A.3), yield a proof of Theorem 2.A.6.

In the following example it is shown, under a proper condition which is stated by means of the Laplace transform order, that random minima and maxima are ordered in the usual stochastic order sense; see related results in Examples 1.C.46, 3.B.39, 4.B.16, and 5.B.13.

Example 5.A.24. Let $X_1, X_2, \dots$ be a sequence of nonnegative independent and identically distributed random variables with a common distribution function $F_{X_1}$ and a common survival function $\bar F_{X_1}$. Let $N_1$ and $N_2$ be two positive integer-valued random variables, which are independent of the $X_i$'s, and which have the Laplace transforms $L_{N_1}$ and $L_{N_2}$. Denote $X_{(1:N_j)} \equiv \min\{X_1, X_2, \dots, X_{N_j}\}$ and $X_{(N_j:N_j)} \equiv \max\{X_1, X_2, \dots, X_{N_j}\}$, $j = 1, 2$. Then the survival function of $X_{(1:N_j)}$ is given by

$$\bar F_{X_{(1:N_j)}}(x) = L_{N_j}\big(-\log \bar F_{X_1}(x)\big), \quad j = 1, 2.$$

It is thus seen that if $N_1 \le_{\mathrm{Lt}} N_2$, then $X_{(1:N_1)} \ge_{\mathrm{st}} X_{(1:N_2)}$. In a similar manner it can be shown that if $N_1 \le_{\mathrm{Lt}} N_2$, then also $X_{(N_1:N_1)} \le_{\mathrm{st}} X_{(N_2:N_2)}$.

An example with a similar spirit is the following.

Example 5.A.25. Consider a compound Poisson process with rate $\lambda$ and distribution $\phi$. Suppose that this process is the (random) hazard rate function of a random variable $X$. Then the survival function $\bar F$ of $X$ is given by

$$\bar F(t) = \exp\Big\{-\int_0^t \lambda[1 - L_{\phi}(s)]\,ds\Big\}, \quad t \ge 0, \tag{5.A.18}$$

where $L_{\phi}$ is the Laplace transform of $\phi$ (see Kebir [280, page 873]). Similarly let $Y$ have the survival function $\bar G$ given by

$$\bar G(t) = \exp\Big\{-\int_0^t \lambda[1 - L_{\varphi}(s)]\,ds\Big\}, \quad t \ge 0, \tag{5.A.19}$$

where $\varphi$ is a distribution function, and where $L_{\varphi}$ is the Laplace transform of $\varphi$. It is now seen that if the random variable associated with $\phi$ is larger, in the Laplace transform order, than the random variable associated with $\varphi$, then $\bar G(t)/\bar F(t)$ is increasing in $t \ge 0$; that is (see (1.B.3)), $X \le_{\mathrm{hr}} Y$. A variation of this result is given in Example 5.B.14.

When $X$ is a nonnegative integer-valued random variable, then it is customary and convenient to analyze it using its probability generating function $E[t^X]$, $t \in (0, 1)$, rather than its Laplace transform $E[e^{-sX}]$, $s \ge 0$. This fact suggests the following definition. Let $X$ and $Y$ be two nonnegative integer-valued random variables such that

$$E[t^X] \ge E[t^Y] \quad \text{for all } t \in (0, 1). \tag{5.A.20}$$

Then $X$ is said to be smaller than $Y$ in the probability generating function order (denoted as $X \le_{\mathrm{pgf}} Y$). It is not hard to verify the following relation which holds for any nonnegative integer-valued random variable $X$:

$$\sum_{j=1}^{\infty} t^j P\{X \ge j\} = t \sum_{j=0}^{\infty} t^j P\{X > j\} = \frac{t\big(1 - E[t^X]\big)}{1 - t} \quad \text{for all } t \in (0, 1).$$

We thus obtain the following analog of Theorem 5.A.1.

Theorem 5.A.26. Let $X$ and $Y$ be two nonnegative integer-valued random variables. Then $X \le_{\mathrm{pgf}} Y$ if, and only if,

$$\sum_{j=1}^{\infty} t^j P\{X \ge j\} \le \sum_{j=1}^{\infty} t^j P\{Y \ge j\} \quad \text{for all } t \in (0, 1).$$

It is easy to see that (5.A.20) holds if, and only if, (5.A.1) holds. That is,

$$X \le_{\mathrm{pgf}} Y \iff X \le_{\mathrm{Lt}} Y.$$
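The relation between the weighted tail sum and the probability generating function used for Theorem 5.A.26 can be checked exactly with rational arithmetic. The sketch below is an illustration only: the probability mass function, the value of $t$, and the helper names `pgf` and `tail_sum` are assumptions chosen for the example.

```python
from fractions import Fraction

def pgf(pmf, t):
    """E[t^X] for a pmf given as a list, where pmf[k] = P{X = k}."""
    return sum(p * t ** k for k, p in enumerate(pmf))

def tail_sum(pmf, t):
    """sum_{j >= 1} t^j P{X >= j}; the sum is finite because the support is finite."""
    total = Fraction(0)
    for j in range(1, len(pmf)):
        total += t ** j * sum(pmf[j:])
    return total

# Hypothetical pmf on {0, 1, 2, 3} and an arbitrary t in (0, 1).
pmf = [Fraction(1, 10), Fraction(2, 5), Fraction(3, 10), Fraction(1, 5)]
t = Fraction(1, 3)

lhs = tail_sum(pmf, t)
rhs = t * (1 - pgf(pmf, t)) / (1 - t)
```

Because `Fraction` arithmetic is exact, `lhs` and `rhs` can be compared with `==` rather than with a floating-point tolerance.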


5.B Orders Based on Ratios of Laplace Transforms

5.B.1 Definitions and equivalent conditions

In this section, for a nonnegative random variable $X$ with distribution function $F$ and survival function $\bar F \equiv 1 - F$, let us denote by

$$L_X(s) = \int_0^{\infty} e^{-sx}\,dF(x) \quad\text{and}\quad L_X^*(s) = \int_0^{\infty} e^{-sx} \bar F(x)\,dx$$

the Laplace-Stieltjes transform of $F$ (or the Laplace transform of $X$) and the Laplace transform of $\bar F$, respectively. If $Y$ is another nonnegative random variable, we similarly define $L_Y$ and $L_Y^*$. If

$$\frac{L_Y(s)}{L_X(s)} \ \text{is decreasing in } s > 0, \tag{5.B.1}$$

then $X$ is said to be smaller than $Y$ in the Laplace transform ratio order (denoted by $X \le_{\mathrm{Lt\text{-}r}} Y$). If

$$\frac{1 - L_Y(s)}{1 - L_X(s)} \ \text{is decreasing in } s > 0, \tag{5.B.2}$$

then $X$ is said to be smaller than $Y$ in the reverse Laplace transform ratio order (denoted by $X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y$). Since $L_X^*(s) = s^{-1}(1 - L_X(s))$ and $L_Y^*(s) = s^{-1}(1 - L_Y(s))$ for all $s > 0$, it follows that

$$X \le_{\mathrm{Lt\text{-}r}} Y \iff \frac{1 - s L_Y^*(s)}{1 - s L_X^*(s)} \ \text{is decreasing in } s > 0,$$

and that

$$X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y \iff \frac{L_Y^*(s)}{L_X^*(s)} \ \text{is decreasing in } s > 0.$$

Using (5.A.2) it is easy to verify the statements that are given in the following remark.

Remark 5.B.1. Let $X$ and $Y$ be two positive random variables, and let $E_1$ be a mean 1 exponential random variable which is independent of both $X$ and $Y$. Define $\tilde X = E_1/X$ and $\tilde Y = E_1/Y$; that is, the distributions of both $\tilde X$ and $\tilde Y$ are scale mixtures of exponential distributions. Then

$$X \le_{\mathrm{Lt\text{-}r}} Y \iff \tilde Y \le_{\mathrm{hr}} \tilde X \quad\text{and}\quad X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y \iff \tilde Y \le_{\mathrm{rh}} \tilde X.$$

See similar results in Example 4.B.7 and in Remark 5.A.2.
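Conditions (5.B.1) and (5.B.2) are easy to probe numerically when the Laplace transforms are available in closed form. The following sketch assumes, for illustration only, that $X$ and $Y$ are exponential with rates 2 and 1, whose transforms are $2/(2+s)$ and $1/(1+s)$; the helpers `lt_exp` and `is_decreasing` are hypothetical names, and monotonicity is only tested on a finite grid.

```python
def lt_exp(rate):
    """Laplace transform s -> rate / (rate + s) of an exponential with the given rate."""
    return lambda s: rate / (rate + s)

def is_decreasing(f, grid, tol=1e-12):
    """Check that f is (weakly) decreasing along an increasing grid."""
    vals = [f(s) for s in grid]
    return all(a >= b - tol for a, b in zip(vals, vals[1:]))

lx, ly = lt_exp(2.0), lt_exp(1.0)          # X ~ Exp(rate 2), Y ~ Exp(rate 1)
grid = [0.01 * k for k in range(1, 2001)]  # s in (0, 20]

# (5.B.1): L_Y / L_X decreasing  <=>  X <=_{Lt-r} Y
lt_ratio_ok = is_decreasing(lambda s: ly(s) / lx(s), grid)
# (5.B.2): (1 - L_Y) / (1 - L_X) decreasing  <=>  X <=_{r-Lt-r} Y
rev_ratio_ok = is_decreasing(lambda s: (1 - ly(s)) / (1 - lx(s)), grid)
# With the roles of X and Y swapped, (5.B.1) should fail.
swapped_ok = is_decreasing(lambda s: lx(s) / ly(s), grid)
```

Here the two ratios simplify to $(2+s)/(2(1+s))$ and $(2+s)/(1+s)$, both decreasing in $s$, so both orders hold for this pair in the stated direction only.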


The next theorem characterizes the orders $\le_{\mathrm{Lt\text{-}r}}$ and $\le_{\mathrm{r\text{-}Lt\text{-}r}}$ by means of functions of the respective moments. The characterization is an analog of the characterization of the Laplace transform order given in Theorem 5.A.5.

Theorem 5.B.2. Let $X$ and $Y$ be nonnegative random variables that possess moments $\mu_i$ and $\nu_i$, respectively, $i = 1, 2, \dots$ ($\mu_0 = \nu_0 = 1$). Then

(a) $X \le_{\mathrm{Lt\text{-}r}} Y$ if, and only if,

$$\frac{\sum_{i=0}^{\infty} \frac{(-s)^i}{i!} \nu_i}{\sum_{i=0}^{\infty} \frac{(-s)^i}{i!} \mu_i} \ \text{is decreasing in } s > 0.$$

(b) $X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y$ if, and only if,

$$\frac{\sum_{i=1}^{\infty} \frac{(-s)^i}{i!} \nu_i}{\sum_{i=1}^{\infty} \frac{(-s)^i}{i!} \mu_i} \ \text{is decreasing in } s > 0.$$

Proof. By writing $e^{-st} = \sum_{i=0}^{\infty} \frac{(-s)^i}{i!} t^i$, the result follows easily from the definitions.

5.B.2 Closure properties

We list below some preservation properties of the orders $\le_{\mathrm{Lt\text{-}r}}$ and $\le_{\mathrm{r\text{-}Lt\text{-}r}}$. Below, for any nonnegative random variable $Z$, we will denote by $L_Z$ the Laplace transform of $Z$.

Theorem 5.B.3. Let $X_1, X_2, \dots$ be independent, identically distributed nonnegative random variables, and let $N_1$ and $N_2$ be positive integer-valued random variables which are independent of the $X_i$'s. Then

$$N_1 \le_{\mathrm{Lt\text{-}r}} [\le_{\mathrm{r\text{-}Lt\text{-}r}}]\ N_2 \implies \sum_{i=1}^{N_1} X_i \le_{\mathrm{Lt\text{-}r}} [\le_{\mathrm{r\text{-}Lt\text{-}r}}] \sum_{i=1}^{N_2} X_i.$$

Proof. For $j = 1, 2$, we have

$$L_{X_1 + X_2 + \cdots + X_{N_j}}(s) = \sum_{i=1}^{\infty} P\{N_j = i\}\, L_{X_1 + X_2 + \cdots + X_i}(s) = \sum_{i=1}^{\infty} P\{N_j = i\}\, L_{X_1}^i(s) = L_{N_j}\big(-\log L_{X_1}(s)\big).$$

The stated results now follow from the assumptions.
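The composition formula $L_{X_1 + \cdots + X_{N}}(s) = L_{N}(-\log L_{X_1}(s))$ in the proof above can be confirmed on a small example by computing the distribution of the random sum exhaustively. This is a sketch under assumed inputs: the two probability mass functions, the value of $s$, and the helper names are all illustrative.

```python
import math
from itertools import product

def lt(pmf, s):
    """E[exp(-s X)] for a finite pmf given as {value: probability}."""
    return sum(p * math.exp(-s * x) for x, p in pmf.items())

def random_sum_pmf(pmf_n, pmf_x):
    """Exact pmf of X_1 + ... + X_N (N independent of the i.i.d. X_i) by enumeration."""
    out = {}
    for n, pn in pmf_n.items():
        # Enumerate every outcome (x_1, ..., x_n) together with its probability.
        for outcome in product(pmf_x.items(), repeat=n):
            total = sum(x for x, _ in outcome)
            prob = pn * math.prod(p for _, p in outcome)
            out[total] = out.get(total, 0.0) + prob
    return out

# Hypothetical distributions: N on {1, 2, 3}, X on {0, 1, 3}.
pmf_n = {1: 0.5, 2: 0.3, 3: 0.2}
pmf_x = {0: 0.25, 1: 0.5, 3: 0.25}
s = 0.7

direct = lt(random_sum_pmf(pmf_n, pmf_x), s)        # transform of the random sum
composed = lt(pmf_n, -math.log(lt(pmf_x, s)))       # L_N(-log L_X(s))
```

The enumeration grows exponentially in the largest value of $N$, so this approach is only meant for tiny illustrative cases.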

If the Xi ’s are not assumed to be identically distributed, then stronger assumptions on the relationship between N1 and N2 yield the same conclusion. This is shown in the next two theorems.


Theorem 5.B.4. Let $X_1, X_2, \dots$ be independent nonnegative random variables, and let $N_1$ and $N_2$ be positive integer-valued random variables which are independent of the $X_i$'s. If $N_1 \le_{\mathrm{rh}} N_2$, then

$$\sum_{i=1}^{N_1} X_i \le_{\mathrm{Lt\text{-}r}} \sum_{i=1}^{N_2} X_i.$$

Proof. For $j = 1, 2$, we have

$$L_{X_1 + X_2 + \cdots + X_{N_j}}(s) = \sum_{i=1}^{\infty} P\{N_j = i\} \prod_{k=1}^{i} L_{X_k}(s).$$

For $0 < s_1 < s_2$ we need to show that

$$\Big(\sum_{m=1}^{\infty} P\{N_1 = m\} \prod_{k=1}^{m} L_{X_k}(s_1)\Big)\Big(\sum_{n=1}^{\infty} P\{N_2 = n\} \prod_{k=1}^{n} L_{X_k}(s_2)\Big) - \Big(\sum_{m=1}^{\infty} P\{N_2 = m\} \prod_{k=1}^{m} L_{X_k}(s_1)\Big)\Big(\sum_{n=1}^{\infty} P\{N_1 = n\} \prod_{k=1}^{n} L_{X_k}(s_2)\Big) \le 0.$$

This follows from the remark after Theorem 2.1 of Joag-Dev, Kochar, and Proschan [259] by noting that

$$\big(g_1(m), g_2(m)\big) \equiv \Big(\prod_{k=1}^{m} L_{X_k}(s_2), \prod_{k=1}^{m} L_{X_k}(s_1)\Big)$$

is a pair of what Joag-Dev, Kochar, and Proschan [259] call DP2 functions of $m$, whenever $0 < s_1 < s_2$, and $g_1(m)$ is decreasing in $m$.

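Theorem 5.B.5. Let $X_1, X_2, \dots$ be independent nonnegative random variables, and let $N_1$ and $N_2$ be positive integer-valued random variables which are independent of the $X_i$'s. If $N_1 \le_{\mathrm{hr}} N_2$, and if $X_i \le_{\mathrm{r\text{-}Lt\text{-}r}} X_{i+1}$, then

$$\sum_{i=1}^{N_1} X_i \le_{\mathrm{r\text{-}Lt\text{-}r}} \sum_{i=1}^{N_2} X_i.$$

Proof. For $j = 1, 2$, we have

$$1 - L_{X_1 + X_2 + \cdots + X_{N_j}}(s) = \sum_{m=1}^{\infty} P\{N_j = m\}\Big(1 - \prod_{k=1}^{m} L_{X_k}(s)\Big) = \sum_{m=0}^{\infty} P\{N_j > m\} \prod_{k=1}^{m} L_{X_k}(s)\big(1 - L_{X_{m+1}}(s)\big),$$

where $\prod_{k=1}^{0} L_{X_k}(s) \equiv 1$. So for $0 < s_1 < s_2$ we have that

$$\big[1 - L_{X_1 + \cdots + X_{N_1}}(s_1)\big]\big[1 - L_{X_1 + \cdots + X_{N_2}}(s_2)\big] - \big[1 - L_{X_1 + \cdots + X_{N_2}}(s_1)\big]\big[1 - L_{X_1 + \cdots + X_{N_1}}(s_2)\big]$$

$$= \sum_{m=1}^{\infty} \sum_{n=0}^{m-1} \big[P\{N_1 > m\} P\{N_2 > n\} - P\{N_2 > m\} P\{N_1 > n\}\big] \times \prod_{k=1}^{n} L_{X_k}(s_1) \prod_{k=1}^{n} L_{X_k}(s_2)$$

$$\quad \times \Big[\prod_{k=n+1}^{m} L_{X_k}(s_1)\big(1 - L_{X_{m+1}}(s_1)\big)\big(1 - L_{X_{n+1}}(s_2)\big) - \prod_{k=n+1}^{m} L_{X_k}(s_2)\big(1 - L_{X_{m+1}}(s_2)\big)\big(1 - L_{X_{n+1}}(s_1)\big)\Big]$$

$$\le 0.$$

The last inequality follows since $N_1 \le_{\mathrm{hr}} N_2$ implies that

$$P\{N_1 > m\} P\{N_2 > n\} - P\{N_2 > m\} P\{N_1 > n\} \le 0 \quad \text{for } m > n,$$

and $X_i \le_{\mathrm{r\text{-}Lt\text{-}r}} X_{i+1}$ implies that

$$\big(1 - L_{X_{m+1}}(s_1)\big)\big(1 - L_{X_{n+1}}(s_2)\big) - \big(1 - L_{X_{m+1}}(s_2)\big)\big(1 - L_{X_{n+1}}(s_1)\big) \ge 0 \quad \text{for } m > n.$$

The stated result now follows.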

Some other preservation results are given in the following theorems.

Theorem 5.B.6. Let $X_1, X_2, \dots, X_n$ be a set of independent nonnegative random variables and let $Y_1, Y_2, \dots, Y_n$ be another set of independent nonnegative random variables. If $X_j \le_{\mathrm{Lt\text{-}r}} Y_j$, $j = 1, 2, \dots, n$, then

$$X_1 + X_2 + \cdots + X_n \le_{\mathrm{Lt\text{-}r}} Y_1 + Y_2 + \cdots + Y_n.$$

Proof. Since $L_{X_1 + X_2 + \cdots + X_n}(s) = \prod_{i=1}^{n} L_{X_i}(s)$, we see that if $L_{Y_j}(s)/L_{X_j}(s)$ is decreasing in $s$, $j = 1, 2, \dots, n$, then $L_{Y_1 + Y_2 + \cdots + Y_n}(s)/L_{X_1 + X_2 + \cdots + X_n}(s)$ is also decreasing in $s$.

As a special case of Theorem 5.B.6 we see that if $X$ and $Y$ are nonnegative independent random variables, then

$$X \le_{\mathrm{Lt\text{-}r}} X + Y. \tag{5.B.3}$$

Theorem 5.B.7. Let {Xj } and {Yj } be two sequences of random variables such that Xj →st X and Yj →st Y as j → ∞. If Xj ≤Lt-r [≤r-Lt-r ] Yj , j = 1, 2, . . ., then X ≤Lt-r [≤r-Lt-r ] Y .


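Theorem 5.B.8. Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta] \le_{\mathrm{Lt\text{-}r}} [\le_{\mathrm{r\text{-}Lt\text{-}r}}]\ [Y \mid \Theta = \theta']$ for all $\theta$ and $\theta'$ in the support of $\Theta$. Then $X \le_{\mathrm{Lt\text{-}r}} [\le_{\mathrm{r\text{-}Lt\text{-}r}}]\ Y$.

Proof. We only give the proof for the $\le_{\mathrm{Lt\text{-}r}}$ order. The proof for the order $\le_{\mathrm{r\text{-}Lt\text{-}r}}$ is similar. Note that

$$\frac{L_X(s)}{L_Y(s)} = \frac{E_{\Theta} L_{[X \mid \Theta]}(s)}{E_{\Theta} L_{[Y \mid \Theta]}(s)}.$$

It can be verified that $\frac{d}{ds} \frac{L_X(s)}{L_Y(s)} \ge 0$ if $\frac{d}{ds} \frac{L_{[X \mid \theta]}(s)}{L_{[Y \mid \theta']}(s)} \ge 0$ for all $\theta$ and $\theta'$ in the support of $\Theta$.

In the next result it is shown that a random variable, whose distribution is the mixture of two distributions of $\le_{\mathrm{Lt\text{-}r}}$ [$\le_{\mathrm{r\text{-}Lt\text{-}r}}$] ordered random variables, is bounded from below and from above, in the $\le_{\mathrm{Lt\text{-}r}}$ [$\le_{\mathrm{r\text{-}Lt\text{-}r}}$] order sense, by these two random variables.

Theorem 5.B.9. Let $X$ and $Y$ be two nonnegative random variables with distribution functions $F$ and $G$, respectively. Let $W$ be a random variable with the distribution function $pF + (1 - p)G$ for some $p \in (0, 1)$. If $X \le_{\mathrm{Lt\text{-}r}} [\le_{\mathrm{r\text{-}Lt\text{-}r}}]\ Y$, then $X \le_{\mathrm{Lt\text{-}r}} [\le_{\mathrm{r\text{-}Lt\text{-}r}}]\ W \le_{\mathrm{Lt\text{-}r}} [\le_{\mathrm{r\text{-}Lt\text{-}r}}]\ Y$.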

The proof of Theorem 5.B.9 is similar to the proof of Theorem 1.B.22, but it uses (5.B.1) [(5.B.2)] instead of (1.B.3). We omit the details.

5.B.3 Relationship to other stochastic orders

In this subsection we describe some relationships between the Laplace ratio orders and some other stochastic orders. We also mention some known counterimplications.

Theorem 5.B.10. Let $X$ and $Y$ be positive random variables. Then

$$X \le_{\mathrm{Lt\text{-}r}} Y \implies X \le_{\mathrm{Lt}} Y \quad\text{and}\quad X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y \implies X \le_{\mathrm{Lt}} Y.$$

Proof. Denote $L_X(\infty) = \lim_{s \to \infty} L_X(s)$ and $L_Y(\infty) = \lim_{s \to \infty} L_Y(s)$. Since $L_X(0) = L_Y(0) = 1$ and $L_X(\infty) = L_Y(\infty) = 0$, we see that if $X \le_{\mathrm{Lt\text{-}r}} Y$, then

$$\frac{L_Y(s)}{L_X(s)} \le \frac{L_Y(0)}{L_X(0)} = 1,$$

and if $X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y$, then

$$\frac{1 - L_Y(s)}{1 - L_X(s)} \ge \frac{1 - L_Y(\infty)}{1 - L_X(\infty)} = 1.$$

This proves the stated results.


As a corollary of Theorem 5.B.10 and (5.A.5) we see that

$$X \le_{\mathrm{Lt\text{-}r}} Y \implies EX \le EY,$$

and that

$$X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y \implies EX \le EY,$$

provided the expectations exist. The proof of the next theorem will not be given here.

Theorem 5.B.11. Let $X$ and $Y$ be nonnegative absolutely continuous or integer-valued random variables. Then

$$X \le_{\mathrm{rh}} Y \implies X \le_{\mathrm{Lt\text{-}r}} Y \quad\text{and}\quad X \le_{\mathrm{hr}} Y \implies X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y.$$

The following result gives a relationship between the orders $\le_{\mathrm{mrl}}$ and $\le_{\mathrm{Lt\text{-}r}}$.

Theorem 5.B.12. Let $X$ and $Y$ be two nonnegative absolutely continuous random variables that possess all moments and with bounded support $[0, b]$. If $X \le_{\mathrm{mrl}} Y$, then $b - Y \le_{\mathrm{Lt\text{-}r}} b - X$.

Proof. Denote $g(1, n) = E[X^n]$ and $g(2, n) = E[Y^n]$. Since $X \le_{\mathrm{mrl}} Y$ it follows from (2.A.10) that $g(i, n)$ is totally positive of order 2 in $i = 1, 2$, and in $n \ge 0$. Therefore, by the Basic Composition Formula (Karlin [275]) we have that

$$h(i, s) \equiv \sum_{n=0}^{\infty} g(i, n) \frac{s^n}{n!}$$

is totally positive of order 2 in $i = 1, 2$, and in $s \ge 0$. That is,

$$\frac{h(2, s)}{h(1, s)} = \frac{E e^{sY}}{E e^{sX}} \ \text{is increasing in } s \ge 0. \tag{5.B.4}$$

It is easy to verify that (5.B.4) implies $b - Y \le_{\mathrm{Lt\text{-}r}} b - X$.

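Counterexamples in the literature show that for nonnegative integer-valued random variables $X$ and $Y$ we have

$$X \le_{\mathrm{hr}} Y \not\Longrightarrow X \le_{\mathrm{Lt\text{-}r}} Y \not\Longrightarrow X \le_{\mathrm{icv}} Y$$

and

$$X \le_{\mathrm{rh}} Y \not\Longrightarrow X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y \not\Longrightarrow X \le_{\mathrm{icv}} Y.$$

It is of interest to compare the above counterimplications, and the implications given in Theorems 5.B.10 and 5.B.11, with the implication $X \le_{\mathrm{icv}} Y \implies X \le_{\mathrm{Lt}} Y$ given in Theorem 5.A.16.

From the above counterimplications it follows that for nonnegative integer-valued random variables $X$ and $Y$ we have

$$X \le_{\mathrm{Lt\text{-}r}} Y \not\Longrightarrow X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y \quad\text{and}\quad X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y \not\Longrightarrow X \le_{\mathrm{Lt\text{-}r}} Y.$$

Counterexamples in the literature also show that for nonnegative integer-valued random variables $X$ and $Y$ we have

$$X \le_{\mathrm{Lt\text{-}r}} Y \not\Longrightarrow X \le_{\mathrm{icx}} Y \quad\text{and}\quad X \le_{\mathrm{r\text{-}Lt\text{-}r}} Y \not\Longrightarrow X \le_{\mathrm{icx}} Y.$$

From (1.D.2) it follows that for nonnegative random variables,

$$X \le_{\mathrm{conv}} Y \implies X \le_{\mathrm{Lt\text{-}r}} Y. \tag{5.B.5}$$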

Example 5.B.13. The Laplace ratio orders are useful for the purpose of stochastically comparing random minima and maxima. Let $X_1, X_2, \dots$ be a sequence of nonnegative independent and identically distributed random variables. Let $N_1$ and $N_2$ be two positive integer-valued random variables which are independent of the $X_i$'s. Denote $X_{(1:N_j)} \equiv \min\{X_1, X_2, \dots, X_{N_j}\}$ and $X_{(N_j:N_j)} \equiv \max\{X_1, X_2, \dots, X_{N_j}\}$, $j = 1, 2$. Let the common distribution function, and the common survival function, of the $X_i$'s be denoted, respectively, by $F_{X_1}$ and $\bar F_{X_1}$, and let $F_{X_{(N_j:N_j)}}$ and $\bar F_{X_{(1:N_j)}}$ denote, respectively, the distribution function of $X_{(N_j:N_j)}$ and the survival function of $X_{(1:N_j)}$, $j = 1, 2$. Note that

$$F_{X_{(N_j:N_j)}}(x) = \sum_{n=1}^{\infty} F_{X_1}^n(x)\, p_{N_j}(n) = L_{N_j}\big(-\log F_{X_1}(x)\big), \quad j = 1, 2,$$

and that

$$\bar F_{X_{(1:N_j)}}(x) = \sum_{n=1}^{\infty} \bar F_{X_1}^n(x)\, p_{N_j}(n) = L_{N_j}\big(-\log \bar F_{X_1}(x)\big), \quad j = 1, 2.$$

Thus,

$$\frac{\bar F_{X_{(1:N_2)}}(x)}{\bar F_{X_{(1:N_1)}}(x)} = \frac{L_{N_2}\big(-\log \bar F_{X_1}(x)\big)}{L_{N_1}\big(-\log \bar F_{X_1}(x)\big)}.$$

Therefore

$$N_1 \le_{\mathrm{Lt\text{-}r}} N_2 \implies X_{(1:N_1)} \ge_{\mathrm{hr}} X_{(1:N_2)}.$$

In a similar manner it can be shown that

$$N_1 \le_{\mathrm{Lt\text{-}r}} N_2 \implies X_{(N_1:N_1)} \le_{\mathrm{rh}} X_{(N_2:N_2)},$$


N1 ≤r-Lt-r N2 =⇒ X(1:N1 ) ≥rh X(1:N2 ) , and that N1 ≤r-Lt-r N2 =⇒ X(N1 :N1 ) ≤hr X(N2 :N2 ) . From Theorem 5.B.11 and the above implications it follows that N1 ≤rh N2 =⇒ X(1:N1 ) ≥hr X(1:N2 ) , N1 ≤rh N2 =⇒ X(N1 :N1 ) ≤rh X(N2 :N2 ) , N1 ≤hr N2 =⇒ X(1:N1 ) ≥rh X(1:N2 ) ,

(5.B.6) (5.B.7)

and that

$$N_1 \le_{\mathrm{hr}} N_2 \implies X_{(N_1:N_1)} \le_{\mathrm{hr}} X_{(N_2:N_2)}.$$

See related results in Examples 1.C.46, 3.B.39, 4.B.16, and 5.A.24.

The following example is a variation of Example 5.A.25: under a stronger condition (the order $\le_{\mathrm{Lt\text{-}r}}$ is stronger than the order $\le_{\mathrm{Lt}}$) we obtain a stronger conclusion.

Example 5.B.14. As in Example 5.A.25, let $X$ have a compound Poisson process, with rate $\lambda$ and distribution $\phi$, as its (random) hazard rate function. The survival function of $X$ is given in (5.A.18), and it follows that its density function $f$ is given by

$$f(t) = \lambda[1 - L_{\phi}(t)] \exp\Big\{-\int_0^t \lambda[1 - L_{\phi}(s)]\,ds\Big\}, \quad t \ge 0.$$

Similarly let $Y$ have a compound Poisson process, with rate $\lambda$ and distribution $\varphi$, as its (random) hazard rate function. Its survival function is given in (5.A.19), and its density function $g$ is given by

$$g(t) = \lambda[1 - L_{\varphi}(t)] \exp\Big\{-\int_0^t \lambda[1 - L_{\varphi}(s)]\,ds\Big\}, \quad t \ge 0.$$

It is now seen that if the random variable associated with $\phi$ is larger, in the reverse Laplace transform ratio order (and hence, by Theorem 5.B.10, also larger in the Laplace transform order), than the random variable associated with $\varphi$, then $g(t)/f(t)$ is increasing in $t \ge 0$; that is, $X \le_{\mathrm{lr}} Y$.

5.C Some Related Orders

5.C.1 The factorial moments order

The factorial moments of a random variable $X$ are $\mu_i = E[X(X - 1) \cdots (X - i + 1)]$, $i = 1, 2, \dots$. They are particularly useful when $X$ is a nonnegative integer-valued random variable, since they can be easily obtained from the probability generating function of $X$ by repeated differentiation. Throughout this subsection we consider only nonnegative integer-valued random variables. The $i$th factorial moment of such a random variable $X$ can also be written as $\mu_i = i!\,E\binom{X}{i}$, where $\binom{x}{i}$ is defined as 0 when $i > x$. Let $X$ and $Y$ be two nonnegative integer-valued random variables such that

$$E\binom{X}{i} \le E\binom{Y}{i} \quad \text{for all } i \in \mathbb{N}_{++}. \tag{5.C.1}$$

Then $X$ is said to be smaller than $Y$ in the factorial moments order (denoted by $X \le_{\mathrm{fm}} Y$).

For a real function $\phi$ defined on $\mathbb{N}_+$, define $\Delta^0 \phi(x) = \phi(x)$, and $\Delta^j \phi(x) = \Delta^{j-1} \phi(x + 1) - \Delta^{j-1} \phi(x)$, $x \in \mathbb{N}_+$, $j = 1, 2, \dots$. It can be shown that for every $\phi : \mathbb{N}_+ \to [0, \infty)$, one has

$$\phi(x) = \sum_{j=0}^{\infty} \Delta^j \phi(0) \binom{x}{j}, \quad x \in \mathbb{N}_+. \tag{5.C.2}$$

The following characterization of the order $\le_{\mathrm{fm}}$ is a direct consequence of (5.C.2).

Theorem 5.C.1. Let $X$ and $Y$ be two nonnegative integer-valued random variables. Then $X \le_{\mathrm{fm}} Y$ if, and only if,

$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all } \phi \text{ such that } \Delta^j \phi(0) \ge 0,\ j \in \mathbb{N}_+.$$
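The expansion (5.C.2) is the Newton forward-difference formula, and it can be verified with exact integer arithmetic. The sketch below uses an arbitrary test function and hypothetical helper names; since $\binom{x}{j} = 0$ for $j > x$, only the differences up to order $x$ are needed to evaluate the sum at $x$.

```python
from math import comb

def forward_differences_at_zero(phi, order):
    """Return [Delta^0 phi(0), Delta^1 phi(0), ..., Delta^order phi(0)]."""
    row = [phi(x) for x in range(order + 1)]
    diffs = []
    for _ in range(order + 1):
        diffs.append(row[0])
        # Next row of the difference table: Delta applied to the current row.
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs

def newton_expansion(phi, x):
    """Evaluate sum_j Delta^j phi(0) * C(x, j); terms with j > x vanish."""
    diffs = forward_differences_at_zero(phi, x)
    return sum(d * comb(x, j) for j, d in enumerate(diffs))

def phi(x):
    """Hypothetical nonnegative test function on the nonnegative integers."""
    return x ** 3 + 2 * x

values_match = all(newton_expansion(phi, x) == phi(x) for x in range(12))
diffs = forward_differences_at_zero(phi, 5)
```

For this particular $\phi$ all forward differences at 0 are nonnegative, so it belongs to the class of functions appearing in Theorem 5.C.1.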

It is easy to see that

$$X \le_{\mathrm{fm}} Y \implies EX \le EY.$$

Some closure properties of the order $\le_{\mathrm{fm}}$ are given in the next theorem.

Theorem 5.C.2. (a) Let $X$ and $Y$ be two nonnegative integer-valued random variables. If $X \le_{\mathrm{fm}} Y$, then $X + k \le_{\mathrm{fm}} Y + k$ for every $k \in \mathbb{N}_+$.
(b) Let $X$ and $Y$ be two nonnegative integer-valued random variables. If $X \le_{\mathrm{fm}} Y$, then $kX \le_{\mathrm{fm}} kY$ for every $k \in \mathbb{N}_+$.
(c) Let $X_1, X_2, \dots, X_m$ be a set of independent nonnegative integer-valued random variables. Let $Y_1, Y_2, \dots, Y_m$ be another set of independent nonnegative integer-valued random variables. If $X_i \le_{\mathrm{fm}} Y_i$, $i = 1, 2, \dots, m$, then

$$\sum_{i=1}^{m} X_i \le_{\mathrm{fm}} \sum_{i=1}^{m} Y_i.$$

Proof. It is enough to prove part (a) for $k = 1$; the proof can then be completed by induction. But for $k = 1$ the desired result follows directly from the identity

$$\binom{x + 1}{i + 1} = \binom{x}{i + 1} + \binom{x}{i}, \quad i, x \in \mathbb{N}_+.$$

A lengthy straightforward calculation yields

$$\Delta^j \binom{kx}{i}\bigg|_{x=0} \ge 0, \quad i, j, k \in \mathbb{N}_+.$$

Part (b) then follows. Finally, in order to prove part (c) it is enough to consider the case $m = 2$. This case follows immediately from the identity

$$\binom{x_1 + x_2}{i} = \sum_{j=0}^{i} \binom{x_1}{j} \binom{x_2}{i - j}, \quad x_1, x_2, i \in \mathbb{N}_+.$$

The next result shows that under some conditions the order $\le_{\mathrm{fm}}$ is closed under formation of random sums. We do not give the proof here.

Theorem 5.C.3. Let $X_1, X_2, \dots$ and $Y_1, Y_2, \dots$ each be a sequence of nonnegative independent integer-valued random variables such that $X_i \le_{\mathrm{fm}} Y_i$, $i = 1, 2, \dots$. Let $M$ and $N$ be integer-valued nonnegative random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively, such that $M \le_{\mathrm{icx}} N$. If the $X_i$'s or the $Y_i$'s are identically distributed, then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{fm}} \sum_{j=1}^{N} Y_j.$$

Select a positive integer $i$ and consider the real function $\phi$ defined on $\{i + 1, i + 2, \dots\}$ by $\phi(x) = \binom{x}{i}$. A straightforward computation yields that $\phi(x) + \phi(x + 2) \ge 2\phi(x + 1)$ for $x \in \{i + 1, i + 2, \dots\}$. That is, the function $\phi$ is convex on $\{i + 1, i + 2, \dots\}$. Thus we have proven the following result.

Theorem 5.C.4. Let $X$ and $Y$ be two nonnegative integer-valued random variables. If $X \le_{\mathrm{icx}} Y$, then $X \le_{\mathrm{fm}} Y$. In particular, if $X \le_{\mathrm{st}} Y$, then $X \le_{\mathrm{fm}} Y$.

A relationship between the orders $\le_{\mathrm{fm}}$ and $\le_{\mathrm{pgf}}$ is given in the next result.

Theorem 5.C.5. Let $X$ and $Y$ be two nonnegative integer-valued random variables with bounded support $\{0, 1, 2, \dots, b\}$. If $X \le_{\mathrm{fm}} Y$, then $b - Y \le_{\mathrm{pgf}} b - X$.

Proof. For $a \ge 1$ define $M_X(a) = E[a^X]$ and $M_Y(a) = E[a^Y]$. Note that the $i$th derivative of $M_X$ [$M_Y$] at 1 is $M_X^{(i)}(1) = E[X(X - 1) \cdots (X - i + 1)]$ [$M_Y^{(i)}(1) = E[Y(Y - 1) \cdots (Y - i + 1)]$]. Expanding $M_X$ and $M_Y$ about 1, using the finiteness of the support for convergence, it is seen that

$$M_X(a) = \sum_{i=0}^{\infty} \frac{M_X^{(i)}(1)}{i!} (a - 1)^i = \sum_{i=0}^{\infty} \frac{E[X(X - 1) \cdots (X - i + 1)]}{i!} (a - 1)^i \le \sum_{i=0}^{\infty} \frac{E[Y(Y - 1) \cdots (Y - i + 1)]}{i!} (a - 1)^i = M_Y(a),$$

where the inequality follows from the assumption that $X \le_{\mathrm{fm}} Y$ and from the fact that $a \ge 1$. Thus,

$$E[a^X] \le E[a^Y] \quad \text{for all } a \ge 1. \tag{5.C.3}$$

Now it is easy to verify that (5.C.3) implies that $b - Y \le_{\mathrm{pgf}} b - X$.

5.C.2 The moments order

Consider now two general (that is, not necessarily integer-valued) nonnegative random variables $X$ and $Y$ such that

$$E[X^i] \le E[Y^i] \quad \text{for all } i \in \mathbb{N}_{++}.$$

Then $X$ is said to be smaller than $Y$ in the moments order (denoted by $X \le_{\mathrm{mom}} Y$). Thus $X \le_{\mathrm{mom}} Y$ if, and only if,

$$E[\phi(X)] \le E[\phi(Y)] \tag{5.C.4}$$

for all polynomials $\phi$ with nonnegative coefficients. In fact, $X \le_{\mathrm{mom}} Y$ if, and only if, (5.C.4) holds for all absolutely monotone functions $\phi$ of the form $\phi(x) = \sum_{k=0}^{\infty} a_k x^k$, where the $a_k$'s are nonnegative, provided the expectations exist. Clearly,

$$X \le_{\mathrm{mom}} Y \implies EX \le EY.$$

Some closure properties of the order $\le_{\mathrm{mom}}$ are given in the next theorem. Its proof is similar to the proof of Theorem 5.C.2 (except that it is simpler) and is thus omitted.

Theorem 5.C.6. (a) Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{\mathrm{mom}} Y$, then $X + k \le_{\mathrm{mom}} Y + k$ for every $k \ge 0$.
(b) Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{\mathrm{mom}} Y$, then $kX \le_{\mathrm{mom}} kY$ for every $k \ge 0$.
(c) Let $X_1, X_2, \dots, X_m$ be a set of independent nonnegative random variables. Let $Y_1, Y_2, \dots, Y_m$ be another set of independent nonnegative random variables. If $X_i \le_{\mathrm{mom}} Y_i$, $i = 1, 2, \dots, m$, then

$$\sum_{i=1}^{m} X_i \le_{\mathrm{mom}} \sum_{i=1}^{m} Y_i.$$

The next result shows that under some conditions the order ≤mom is closed under formation of random sums. We do not give the proof here.


Theorem 5.C.7. Let $X_1, X_2, \dots$ and $Y_1, Y_2, \dots$ each be a sequence of nonnegative independent random variables such that $X_i \le_{\mathrm{mom}} Y_i$, $i = 1, 2, \dots$. Let $M$ and $N$ be integer-valued nonnegative random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively, such that $M \le_{\mathrm{icx}} N$. If the $X_i$'s or the $Y_i$'s are identically distributed, then

$$\sum_{j=1}^{M} X_j \le_{\mathrm{mom}} \sum_{j=1}^{N} Y_j.$$

The moments order is closed under linear convex combinations as the following theorem shows. This result is an analog of Theorems 3.A.36 and 5.A.14. Its proof is similar to the proof of Theorem 3.A.36 and is therefore omitted. A similar result is Theorem 5.C.18.

Theorem 5.C.8. Let $X_1, X_2, \dots, X_n$ and $Y$ be $n + 1$ random variables. If $X_i \le_{\mathrm{mom}} Y$, $i = 1, 2, \dots, n$, then

$$\sum_{i=1}^{n} a_i X_i \le_{\mathrm{mom}} Y,$$

whenever $a_i \ge 0$, $i = 1, 2, \dots, n$, and $\sum_{i=1}^{n} a_i = 1$.

The following result gives a relationship between the orders $\le_{\mathrm{fm}}$ and $\le_{\mathrm{mom}}$.

Theorem 5.C.9. Let $X$ and $Y$ be two nonnegative integer-valued random variables. If $X \le_{\mathrm{fm}} Y$, then $X \le_{\mathrm{mom}} Y$. In particular, if $X \le_{\mathrm{icx}} Y$ (or if $X \le_{\mathrm{st}} Y$), then $X \le_{\mathrm{mom}} Y$.

Proof. Denote $x^{[i]} = x(x - 1) \cdots (x - i + 1)$. The result will follow once we have shown that

$$x^i = \sum_{k=1}^{i} \alpha_k^{(i)} x^{[k]}, \quad i = 1, 2, \dots, \tag{5.C.5}$$

where the $\alpha_k^{(i)}$'s are some nonnegative constants. The expression (5.C.5) can be found on page 4 of Johnson and Kotz [263]. The $\alpha_k^{(i)}$'s in (5.C.5) are the Stirling numbers of the second kind, which are known to be positive.
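The expansion (5.C.5) of a power in terms of falling factorials can be checked directly. The sketch below computes the Stirling numbers of the second kind with the standard recurrence $S(i, k) = k\,S(i-1, k) + S(i-1, k-1)$; the function names are illustrative, and the recurrence is a well-known identity rather than something taken from the text above.

```python
def stirling2(i, k):
    """Stirling number of the second kind S(i, k), via the standard recurrence."""
    if i == k:
        return 1
    if k == 0 or k > i:
        return 0
    return k * stirling2(i - 1, k) + stirling2(i - 1, k - 1)

def falling_factorial(x, k):
    """x^{[k]} = x (x - 1) ... (x - k + 1)."""
    out = 1
    for j in range(k):
        out *= x - j
    return out

def power_via_falling(x, i):
    """Right-hand side of (5.C.5): sum_{k=1}^{i} S(i, k) x^{[k]}."""
    return sum(stirling2(i, k) * falling_factorial(x, k) for k in range(1, i + 1))

identity_holds = all(
    power_via_falling(x, i) == x ** i for x in range(0, 10) for i in range(1, 7)
)
row4 = [stirling2(4, k) for k in range(1, 5)]  # coefficients for x^4
```

For example, `row4` gives the coefficients in $x^4 = x^{[1]} + 7x^{[2]} + 6x^{[3]} + x^{[4]}$, all of which are positive, in line with the proof of Theorem 5.C.9.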

In order to obtain a Laplace transform characterization of the order ≤mom we ﬁrst prove the following result. Theorem 5.C.10. Let X1 and X2 be two nonnegative random variables, and let Nλ (X1 ) and Nλ (X2 ) be as described in Theorem 1.A.13. Then X1 ≤mom X2 =⇒ Nλ (X1 ) ≤fm Nλ (X2 )

for all λ > 0.

5.C Some Related Orders

257

Proof. For $k = 1, 2$, let $F_k$ denote the distribution function of $X_k$. By (2.A.15) we have

$$\begin{aligned}
E[N_\lambda(X_k)(N_\lambda(X_k)-1)\cdots(N_\lambda(X_k)-i+1)]
&= \sum_{n=0}^{\infty} n(n-1)\cdots(n-i+1) \int_0^{\infty} e^{-\lambda x}\frac{(\lambda x)^n}{n!}\,dF_k(x) \\
&= \int_0^{\infty} \Big[\sum_{n=0}^{\infty} n(n-1)\cdots(n-i+1)\, e^{-\lambda x}\frac{(\lambda x)^n}{n!}\Big]\,dF_k(x).
\end{aligned}$$

It is not difficult to verify that the $i$th factorial moment of a Poisson random variable with mean $\lambda x$ is given by

$$\sum_{n=0}^{\infty} n(n-1)\cdots(n-i+1)\, e^{-\lambda x}\frac{(\lambda x)^n}{n!} = (\lambda x)^i.$$

Therefore

$$\begin{aligned}
E[N_\lambda(X_1)(N_\lambda(X_1)-1)\cdots(N_\lambda(X_1)-i+1)]
&= \lambda^i \int_0^{\infty} x^i\,dF_1(x) \le \lambda^i \int_0^{\infty} x^i\,dF_2(x) \\
&= E[N_\lambda(X_2)(N_\lambda(X_2)-1)\cdots(N_\lambda(X_2)-i+1)],
\end{aligned}$$

where the inequality follows from $X_1 \le_{\rm mom} X_2$. Thus $N_\lambda(X_1) \le_{\rm fm} N_\lambda(X_2)$.
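The Poisson identity used in this proof is easy to confirm numerically. The sketch below is only illustrative: it truncates the Poisson series at a fixed number of terms (an assumption that is harmless for moderate means), and the function names and the three-point distribution in the usage note are hypothetical.

```python
from math import exp


def poisson_factorial_moment(mu, i, terms=200):
    """Truncated series for E[N (N - 1) ... (N - i + 1)], N ~ Poisson(mu)."""
    pmf = exp(-mu)  # P{N = 0}
    total = 0.0
    for n in range(terms):
        if n > 0:
            pmf *= mu / n  # P{N = n} from P{N = n - 1}
        falling = 1
        for j in range(i):
            falling *= n - j  # vanishes whenever n < i
        total += falling * pmf
    return total


def mixed_factorial_moment(lam, support, probs, i):
    """i-th factorial moment of N_lambda(X) for a finitely supported X."""
    return sum(p * poisson_factorial_moment(lam * x, i)
               for x, p in zip(support, probs))
```

For instance, with `support = [0.5, 1.0, 2.0]`, `probs = [0.2, 0.5, 0.3]`, `lam = 1.5`, and `i = 2`, the mixed factorial moment should agree, up to truncation and rounding, with $\lambda^2 E[X^2] = 2.25 \times 1.75$.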

A Laplace transform characterization of the order ≤mom is given next. It may be compared to Theorems 1.A.13, 1.B.18, 1.B.53, 1.C.25, 2.A.16, 2.B.14, 4.A.21, and 5.A.6.

Theorem 5.C.11. Let $X_1$ and $X_2$ be two nonnegative random variables, and let $N_\lambda(X_1)$ and $N_\lambda(X_2)$ be as described in Theorem 1.A.13. Then

$$X_1 \le_{\rm mom} X_2 \iff N_\lambda(X_1) \le_{\rm mom} N_\lambda(X_2) \quad \text{for all } \lambda > 0.$$

Proof. If $X_1 \le_{\rm mom} X_2$, then from Theorem 5.C.10 we get that $N_\lambda(X_1) \le_{\rm fm} N_\lambda(X_2)$, and from Theorem 5.C.9 we get that $N_\lambda(X_1) \le_{\rm mom} N_\lambda(X_2)$. Now suppose that $N_\lambda(X_1) \le_{\rm mom} N_\lambda(X_2)$ for all $\lambda > 0$. Then $E(N_\lambda(X_1))^i \le E(N_\lambda(X_2))^i$, $i = 1, 2, \dots$. In particular, $E[N_\lambda(X_1)] \le E[N_\lambda(X_2)]$; therefore, by (2.A.16), $E[X_1] = E[N_\lambda(X_1)]/\lambda \le E[N_\lambda(X_2)]/\lambda = E[X_2]$. Let the induction hypothesis be

$$E[X_1^i] \le E[X_2^i], \quad i = 1, 2, \dots, m.$$


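Now observe the following. From (2.A.15) it is seen that

$$\begin{aligned}
E(N_\lambda(X_k))^{m+1}
&= \sum_{n=0}^{\infty} n^{m+1} \int_0^{\infty} e^{-\lambda x}\frac{(\lambda x)^n}{n!}\,dF_k(x) \\
&= \int_0^{\infty} \Big[\sum_{n=0}^{\infty} n^{m+1} e^{-\lambda x}\frac{(\lambda x)^n}{n!}\Big]\,dF_k(x), \quad k = 1, 2.
\end{aligned}$$

The quantity $\sum_{n=0}^{\infty} n^{m+1} e^{-\lambda x}(\lambda x)^n/n!$ is the $(m+1)$st moment of a Poisson random variable with mean $\lambda x$. It is not difficult to verify that

$$\sum_{n=0}^{\infty} n^{m+1} e^{-\lambda x}\frac{(\lambda x)^n}{n!} = a_{m+1}(\lambda x)^{m+1} + a_m(\lambda x)^m + \cdots + a_1(\lambda x) + a_0,$$

where $a_0 = 0$ and $a_j > 0$, $j = 1, 2, \dots, m+1$ (the $a_j$'s are Stirling numbers of the second kind). Therefore

$$E(N_\lambda(X_k))^{m+1} = \sum_{j=0}^{m+1} a_j \lambda^j \int_0^{\infty} x^j\,dF_k(x), \quad k = 1, 2.$$

We know that $E(N_\lambda(X_1))^{m+1} \le E(N_\lambda(X_2))^{m+1}$ and therefore

$$\sum_{j=0}^{m+1} a_j \lambda^j \int_0^{\infty} x^j\,dF_1(x) \le \sum_{j=0}^{m+1} a_j \lambda^j \int_0^{\infty} x^j\,dF_2(x)$$

for these coefficients $a_0, a_1, \dots, a_{m+1}$ and all $\lambda > 0$. The $j = 0$ terms cancel, so the inequality can be rewritten as

$$a_{m+1}\lambda^{m+1}\big(E X_1^{m+1} - E X_2^{m+1}\big) \le \sum_{j=1}^{m} a_j \lambda^j \big(E X_2^j - E X_1^j\big).$$

The right-hand side is nonnegative by the induction hypothesis, and it is a polynomial in $\lambda$ of degree at most $m$. If $E X_1^{m+1} - E X_2^{m+1} > 0$, then, by choosing sufficiently large $\lambda$, the left-hand side would be greater than the right-hand side, a contradiction. Thus we must have $E X_1^{m+1} - E X_2^{m+1} \le 0$. The result now follows by induction.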
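The polynomial form of the Poisson moments used in this proof can be illustrated as follows. This is a sketch under an assumption not stated in the text: it uses the known recurrence $T_{n+1}(\mu) = \mu\,(T_n(\mu) + T_n'(\mu))$ for the moment polynomials $T_n(\mu) = E[N^n]$ of a Poisson variable $N$ with mean $\mu$. The function names are illustrative.

```python
from math import exp


def poisson_moment_coeffs(m):
    """Coefficients [a_0, ..., a_m] with E[N^m] = sum_j a_j mu^j, N ~ Poisson(mu)."""
    coeffs = [1]  # E[N^0] = 1
    for _ in range(m):
        nxt = [0] * (len(coeffs) + 1)
        for j, c in enumerate(coeffs):
            nxt[j + 1] += c      # contribution of mu * T_n(mu)
            nxt[j] += j * c      # contribution of mu * T_n'(mu)
        coeffs = nxt
    return coeffs


def poisson_moment_numeric(mu, m, terms=200):
    """Truncated series for E[N^m], used only as an independent check."""
    pmf, total = exp(-mu), 0.0
    for n in range(terms):
        if n > 0:
            pmf *= mu / n
        total += n ** m * pmf
    return total


def poisson_moment_poly(mu, m):
    """Evaluate the moment polynomial at mu."""
    return sum(a * mu ** j for j, a in enumerate(poisson_moment_coeffs(m)))
```

For $m \ge 1$ the constant coefficient is zero and the remaining ones are positive, which is what the final step of the proof relies on.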

The next result describes a relationship between the orders ≤mom and ≤r-Lt-r; we omit its proof.

Theorem 5.C.12. Let $X$ and $Y$ be two nonnegative random variables. Then

$$X \le_{\rm r\text{-}Lt\text{-}r} Y \Longrightarrow \frac{1}{Y} \le_{\rm mom} \frac{1}{X}.$$


Finally we mention a related order. Let $X$ and $Y$ be two nonnegative random variables such that

$$\frac{E[Y^n]}{E[X^n]} \ \text{is increasing in } n \in \mathbb{N}_+, \tag{5.C.6}$$

where, by convention, $E[X^0] = E[Y^0] = 1$. Then $X$ is said to be smaller than $Y$ in the moments ratio order (denoted as $X \le_{\rm mom\text{-}r} Y$).

From (5.C.6) it is seen that $E[Y^n]/E[X^n] \ge E[Y^0]/E[X^0] = 1$. Thus we see that

$$X \le_{\rm mom\text{-}r} Y \Longrightarrow X \le_{\rm mom} Y. \tag{5.C.7}$$

From (2.A.10) it is seen that $X \le_{\rm mrl} Y \Longrightarrow X \le_{\rm mom\text{-}r} Y$. Therefore, by Theorem 2.A.1, we also have that

$$X \le_{\rm hr} Y \Longrightarrow X \le_{\rm mom\text{-}r} Y. \tag{5.C.8}$$
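As a small illustration of (5.C.6), consider two exponential random variables. This example is not from the text; it assumes the standard formula $E[X^n] = n!/\mu^n$ for an exponential variable with rate $\mu$, and the function names are illustrative.

```python
from math import factorial


def exp_moment(rate, n):
    """n-th moment of an exponential random variable with the given rate."""
    return factorial(n) / rate ** n


def moment_ratios(rate_x, rate_y, n_max):
    """E[Y^n] / E[X^n] for n = 0, 1, ..., n_max, as in (5.C.6)."""
    return [exp_moment(rate_y, n) / exp_moment(rate_x, n)
            for n in range(n_max + 1)]


# X has rate 2 and Y has rate 1, so X has the larger (constant) hazard rate.
ratios = moment_ratios(rate_x=2.0, rate_y=1.0, n_max=8)
is_increasing = all(a <= b for a, b in zip(ratios, ratios[1:]))
```

Here the ratio equals $(\mu_X/\mu_Y)^n$, which is increasing in $n$ whenever $\mu_X \ge \mu_Y$, in agreement with (5.C.8).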

In the proof of Theorem 5.B.12 it is essentially shown that for nonnegative random variables $X$ and $Y$ with bounded support $[0, b]$ we have

$$X \le_{\rm mom\text{-}r} Y \Longrightarrow b - Y \le_{\rm Lt\text{-}r} b - X.$$

This may be contrasted with (5.C.9) below (recall that $X \le_{\rm Lt\text{-}r} Y \Longrightarrow X \le_{\rm Lt} Y$; see Theorem 5.B.10). The following result is obvious.

Theorem 5.C.13. Let $X$ and $Y$ be two nonnegative random variables. If $X \le_{\rm mom\text{-}r} Y$, then $kX \le_{\rm mom\text{-}r} kY$ for every $k \ge 0$.

The next result describes a relationship between the orders ≤mom-r and ≤Lt-r; we omit its proof.

Theorem 5.C.14. Let $X$ and $Y$ be two nonnegative random variables. Then

$$X \le_{\rm Lt\text{-}r} Y \Longrightarrow \frac{1}{Y} \le_{\rm mom\text{-}r} \frac{1}{X}.$$

From (5.C.7) and Theorem 5.C.14 it is seen that if $X$ and $Y$ are nonnegative random variables, then

$$X \le_{\rm Lt\text{-}r} Y \Longrightarrow \frac{1}{Y} \le_{\rm mom} \frac{1}{X}.$$


5.C.3 The moment generating function order

Let $X$ and $Y$ be two nonnegative random variables such that $E e^{s_0 Y} < \infty$ for some $s_0 > 0$, and

$$E e^{sX} \le E e^{sY} \quad \text{for all } s > 0.$$

Then $X$ is said to be smaller than $Y$ in the moment generating function order (denoted by $X \le_{\rm mgf} Y$). A simple integration by parts shows that $X \le_{\rm mgf} Y$ if, and only if,

$$\int_0^{\infty} e^{sx}\,\bar F(x)\,dx \le \int_0^{\infty} e^{sx}\,\bar G(x)\,dx \quad \text{for all } s > 0,$$

where $\bar F$ and $\bar G$ are the survival functions of $X$ and of $Y$, respectively. The following theorem is an analog of Theorem 5.A.5; its proof is similar to the proof of that result.

Theorem 5.C.15. Let $X$ and $Y$ be two nonnegative random variables. Then $X \le_{\rm mgf} Y$ if, and only if,

$$\sum_{i=0}^{\infty} \frac{s^i}{(i+1)!}\, E X^{i+1} \le \sum_{i=0}^{\infty} \frac{s^i}{(i+1)!}\, E Y^{i+1} \quad \text{for all } s > 0.$$

It follows from Theorem 5.C.15 that

$$X \le_{\rm mom} Y \Longrightarrow X \le_{\rm mgf} Y.$$

Some closure properties of the order ≤mgf are given below (recall from Section 1.A.3 that for any random variable $Z$ and any event $A$ we denote by $[Z \mid A]$ any random variable whose distribution is the conditional distribution of $Z$ given $A$).

Theorem 5.C.16. Let $X$ and $Y$ be two nonnegative random variables.
(a) If $X \le_{\rm mgf} Y$, then $X + k \le_{\rm mgf} Y + k$ for every $k > 0$.
(b) If $X \le_{\rm mgf} Y$, then $kX \le_{\rm mgf} kY$ for every $k > 0$.
(c) Let $X$, $Y$, and $\Theta$ be random variables such that $[X \mid \Theta = \theta] \le_{\rm mgf} [Y \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Then $X \le_{\rm mgf} Y$. That is, the moment generating function order is closed under mixtures.
(d) Let $X_1, X_2, \dots, X_m$ be a set of independent random variables and let $Y_1, Y_2, \dots, Y_m$ be another set of independent random variables. If $X_i \le_{\rm mgf} Y_i$ for $i = 1, 2, \dots, m$, then

$$\sum_{i=1}^{m} X_i \le_{\rm mgf} \sum_{i=1}^{m} Y_i;$$

that is, the moment generating function order is closed under convolutions.


The next result is an analog of Theorems 5.A.9 and 5.C.7.

Theorem 5.C.17. Let $X_1, X_2, \dots$ and $Y_1, Y_2, \dots$ each be a sequence of nonnegative independent and identically distributed random variables such that $X_i \le_{\rm mgf} Y_i$, $i = 1, 2, \dots$. Let $M$ and $N$ be integer-valued nonnegative random variables that are independent of the $\{X_i\}$ and the $\{Y_i\}$ sequences, respectively, such that $M \le_{\rm mgf} N$. Then

$$\sum_{j=1}^{M} X_j \le_{\rm mgf} \sum_{j=1}^{N} Y_j.$$

The following result is an analog of Theorem 3.A.36; similar results are Theorems 5.A.14 and 5.C.8.

Theorem 5.C.18. Let $X_1, X_2, \dots, X_n$ and $Y$ be $n + 1$ random variables. If $X_i \le_{\rm mgf} Y$, $i = 1, 2, \dots, n$, then

$$\sum_{i=1}^{n} a_i X_i \le_{\rm mgf} Y,$$

whenever $a_i \ge 0$, $i = 1, 2, \dots, n$, and $\sum_{i=1}^{n} a_i = 1$.

The next result is an analog of Theorem 5.C.5; it describes a relationship between the orders ≤mgf and ≤Lt.

Theorem 5.C.19. Let $X$ and $Y$ be two nonnegative random variables with bounded support $[0, b]$. Then $X \le_{\rm mgf} Y$ if, and only if, $b - Y \le_{\rm Lt} b - X$.

In particular, for random variables as in Theorem 5.C.19,

$$X \le_{\rm mom} Y \Longrightarrow b - Y \le_{\rm Lt} b - X. \tag{5.C.9}$$

5.D Complements

Section 5.A: We used three main sources in order to collect the results regarding the Laplace transform order. These are Stoyan [540, Section 1.8], Kim and Proschan [294], and Alzaid, Kim, and Proschan [11]. The characterization (5.A.4) is taken from Denuit [141]. The characterization of the order ≤Lt in terms of exponential mixtures, given in Remark 5.A.2, is taken from Bartoszewicz [50]. The characterization described in Theorem 5.A.4 can be found in Bhattacharjee [84]. The Laplace transform characterization of the order ≤Lt given in Theorem 5.A.6 is essentially taken from Alzaid, Kim, and Proschan [11]. Some further characterizations of the Laplace transform order by means of infinitely divisible distributions are given in Bartoszewicz [48]. The closure property of the order ≤Lt under random sums (Theorem 5.A.8) is taken from Bhattacharjee [86]. The extensions of the closure property of the order ≤Lt under random sums (Theorems 5.A.10–5.A.12) can be found in Pellerey [450]. The majorization result (Theorem 5.A.13) is taken from Ma [375]. The result which gives the closure of the Laplace transform order under linear convex combinations (Theorem 5.A.14) can be found in Pellerey [452]. The condition which implies stochastic equality (in Theorem 5.A.15) is a combination of results in Cai and Wu [116] and in Bhattacharjee [84], where some generalizations of this condition can also be found. The characterization of the Laplace transform order by means of the order ≤∞-icv (Theorem 5.A.17) is taken from Thistle [548]; see also Fishburn and Lavalle [204] and further references in that paper. The implication of the order ≤Lt from the order ≤p− (Theorem 5.A.18) is essentially taken from Bhattacharjee [83]; see also Cai and Wu [116]. The closure property of the order ≤Lt under the operation of taking minima (Theorem 5.A.19) is taken from Alzaid, Kim, and Proschan [11]. Alzaid, Kim, and Proschan [11] also have a version of Theorem 5.A.19 which gives conditions under which the order ≤Lt is closed under the operation of taking maxima; however, their condition must be wrong, since it postulates that the $F_i$'s and the $G_i$'s (of Theorem 5.A.19) are completely monotone, but these functions are increasing, whereas all completely monotone functions must be decreasing. Looking over their proof it is seen that a sufficient condition, for the closure of the order ≤Lt under the operation of taking maxima, is that $e^{-tx} F_i(x)$ and $e^{-tx} G_i(x)$ be completely monotone in $x$ for each $t \ge 0$, $i = 1, 2, \dots, m$. We are not aware of any study of the latter condition. The class of random lifetimes, defined by (5.A.11), is studied in Klefsjö [302]. The equivalence of the Laplace transform ordering of nonnegative random variables with equal means, and their corresponding asymptotic equilibrium ages, given in (5.A.12), is taken from Denuit [141].
The lower bound in (5.A.13), in the sense of ≤Lt, when the mean and the variance are given, can be found in Stoyan [540, page 23], who credited it to Rolski. The characterization of the order ≤hr by means of the order ≤Lt (Theorem 5.A.22) is given in Belzunce, Gao, Hu, and Pellerey [67]. The results about the stochastic comparisons of random minima and maxima (Example 5.A.24) are taken from Shaked and Wong [526]. The hazard rate order comparison of two nonnegative random variables with random hazard rate functions (Example 5.A.25) is a special case of Theorem 3 of Di Crescenzo and Pellerey [166].

Section 5.B: Most of the results in this section can be found in Shaked and Wong [525]. The characterizations of the orders ≤Lt-r and ≤r-Lt-r in terms of exponential mixtures, given in Remark 5.B.1, are taken from Bartoszewicz [50]. Di Crescenzo and Shaked [167] used (5.B.3) in order to obtain Laplace transform ratio order comparisons of many pairs of random variables. The relationship between the orders ≤mrl and ≤Lt-r (Theorem 5.B.12) is essentially proven in Fagiuoli and Pellerey [187]. The relationship between the orders ≤Lt-r and ≤conv, given in (5.B.5), was noted in Shaked and Suarez-Llorens [520]. Extensions of the implications (5.B.6) and (5.B.7) to order statistics other than the minimum can be found in Nanda, Misra, Paul, and Singh [427]. The likelihood ratio order comparison of two nonnegative random variables with random hazard rate functions (Example 5.B.14) is essentially Remark 3 of Di Crescenzo and Pellerey [166].

Section 5.C: Many of the results in this section are taken from Lefèvre and Picard [338]. A discussion about other related orders can also be found in Lefèvre and Picard [338]. The closure properties of the order ≤fm (Theorem 5.C.2), as well as the simple proof of Theorem 5.C.9, have been communicated to us by Lefèvre [335]. The results that give the closure under random convolutions property of the factorial moments order (Theorem 5.C.3) and of the moments order (Theorem 5.C.7) are taken from Jean-Marie and Liu [254]. Lefèvre and Utev [339] have noticed that for finite random variables with support $\{0, 1, \dots, b\}$ the discrete versions of the orders ≤m-icx, $m = 2, 3, \dots, b$ (see Section 4.A.7), together with some conditions on the factorial moments, imply the order ≤fm; thus they generalized Theorem 5.C.5. The result which gives the closure of the moments order under linear convex combinations (Theorem 5.C.8) can be found in Pellerey [452]. The Laplace transform characterizations of the order ≤mom (Theorems 5.C.10 and 5.C.11) are taken from Shaked and Wong [524]. The relationship between the orders ≤mom and ≤r-Lt-r (Theorem 5.C.12) can be found in Bartoszewicz [47]. The moments ratio order has been introduced by Whitt [565] who has also obtained the implications (5.C.7) and (5.C.8). The relationship between the orders ≤mom-r and ≤Lt-r (Theorem 5.C.14) can be found in Bartoszewicz [47]. The moment generating function order is called the exponential order in Kaas, Heerwaarden, and Goovaerts [269]. Most of the results in Section 5.C.3 can be found in Klar and Müller [301].
The result that gives the closure of the order ≤mgf under linear convex combinations (Theorem 5.C.18) is taken from Li [352].

6 Multivariate Stochastic Orders

In this chapter we describe various extensions, of the univariate stochastic orders in Chapters 1 and 2, to the multivariate case. The most important common orders that are studied in this chapter are the multivariate stochastic orders ≤st and ≤lr . Multivariate extensions of the orders ≤hr and ≤mrl are also studied in this chapter. Also, we review here further analogs of the univariate order ≤st , such as the upper and lower orthants orders. In addition, some other related orders are investigated in this chapter as well.

6.A Notations and Preliminaries

In this chapter we will be concerned with random vectors that take on values in $\mathbb{R}^n \equiv (-\infty, \infty)^n$. When we say that the random vectors are nonnegative we mean that they take on values in $\mathbb{R}^n_+ = [0, \infty)^n$. Elements in $\mathbb{R}^n$ will be denoted by $x$, $y$, and so forth, or, more explicitly, as $x = (x_1, x_2, \dots, x_n)$, $y = (y_1, y_2, \dots, y_n)$, and so on. The space $\mathbb{R}^n$ is endowed with the usual componentwise partial order, which is defined as follows. Let $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n)$ be two vectors in $\mathbb{R}^n$; then we denote $x \le y$ if $x_i \le y_i$ for $i = 1, 2, \dots, n$. Let $x$ be a vector in $\mathbb{R}^n$ and let $I = \{i_1, i_2, \dots, i_k\} \subseteq \{1, 2, \dots, n\}$; then we denote

$$x_I = (x_{i_1}, x_{i_2}, \dots, x_{i_k}). \tag{6.A.1}$$

For a random vector $X$ that takes on values in $\mathbb{R}^n$, the interpretation of $X_I$ is similar. The complement of $I$ in $\{1, 2, \dots, n\}$ is denoted by $\bar I \equiv \{1, 2, \dots, n\} - I$. The vector of ones will be denoted by $e$, that is, $e = (1, 1, \dots, 1)$. The dimension of $e$ may vary from one formula to another, but it is always possible to determine it from the expression in which it appears. For example, if we write $x_I \ge te$, then it is obvious that the dimension of $e$ is $|I|$, that is, the cardinality of $I$.
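The notation of this section can be mirrored in a few lines of code. This is only a sketch: the helper names `leq`, `subvector`, and `complement` are illustrative, indices are kept 1-based to match the text, and listing the components of $x_I$ in increasing index order is an assumption.

```python
def leq(x, y):
    """Componentwise partial order x <= y on vectors of equal length."""
    return len(x) == len(y) and all(a <= b for a, b in zip(x, y))


def subvector(x, index_set):
    """x_I of (6.A.1); index_set holds 1-based indices, listed in increasing order."""
    return tuple(x[i - 1] for i in sorted(index_set))


def complement(index_set, n):
    """Complement of I in {1, 2, ..., n}."""
    return set(range(1, n + 1)) - set(index_set)


def ones(k):
    """The vector e = (1, 1, ..., 1) of dimension k."""
    return (1,) * k
```

For example, `leq((1, 2), (0, 5))` is false and `leq((0, 5), (1, 2))` is false as well, which shows that the order is only partial.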


Let $\phi$ be a univariate or a multivariate function with domain in $\mathbb{R}^n$. If $\phi(x) \le [\ge]\ \phi(y)$ whenever $x \le y$, then we say that the function $\phi$ is increasing [decreasing]. A set $U \subseteq \mathbb{R}^n$ is called increasing or upper [decreasing or lower] if $y \in U$ whenever $y \ge [\le]\ x$ and $x \in U$. If $U$ is Borel measurable, then it is increasing [decreasing] if, and only if, its indicator function $I_U$ is increasing [decreasing]. In this chapter, and later in the book, when we consider increasing and decreasing sets, they are implicitly assumed to be Borel measurable.

6.B The Usual Multivariate Stochastic Order

6.B.1 Definition and equivalent conditions

Let $X$ and $Y$ be two random vectors such that

$$P\{X \in U\} \le P\{Y \in U\} \quad \text{for all upper sets } U \subseteq \mathbb{R}^n. \tag{6.B.1}$$

Then $X$ is said to be smaller than $Y$ in the usual stochastic order (denoted by $X \le_{\rm st} Y$). Roughly speaking, (6.B.1) says that $X$ is less likely than $Y$ to take on large values, where "large" means any value in an increasing set $U$, for any increasing set $U$. Another way of rewriting (6.B.1) is the following:

$$E[I_U(X)] \le E[I_U(Y)] \quad \text{for all upper sets } U \subseteq \mathbb{R}^n, \tag{6.B.2}$$

where $I_U$ denotes the indicator function of $U$. From (6.B.2) it follows that if $X \le_{\rm st} Y$, then

$$E\Big[\sum_{i=1}^{m} a_i I_{U_i}(X) - b\Big] \le E\Big[\sum_{i=1}^{m} a_i I_{U_i}(Y) - b\Big] \tag{6.B.3}$$

for all upper sets $U_i$, all $a_i \ge 0$, $i = 1, 2, \dots, m$, $b \in \mathbb{R}$, and $m \ge 0$. Given an increasing real function $\phi$ on $\mathbb{R}^n$, it is possible, for each $m$, to define a sequence of $U_i$'s, a sequence of $a_i$'s, and a $b$ (all of which may depend on $m$), such that as $m \to \infty$, (6.B.3) converges to

$$E[\phi(X)] \le E[\phi(Y)], \tag{6.B.4}$$

provided the expectations exist. It follows that $X \le_{\rm st} Y$ if, and only if, (6.B.4) holds for all increasing functions $\phi$ for which the expectations exist.

6.B.2 A characterization by construction on the same probability space

As in the univariate case, the usual multivariate stochastic order can be characterized as follows:


Theorem 6.B.1. The random vectors $X$ and $Y$ satisfy $X \le_{\rm st} Y$ if, and only if, there exist two random vectors $\hat X$ and $\hat Y$, defined on the same probability space, such that

$$\hat X =_{\rm st} X, \tag{6.B.5}$$

$$\hat Y =_{\rm st} Y, \tag{6.B.6}$$

and

$$P\{\hat X \le \hat Y\} = 1. \tag{6.B.7}$$

Obviously, if (6.B.5)–(6.B.7) hold, then $X \le_{\rm st} Y$. We will not give the proof of Theorem 6.B.1 here; however, in the next subsection we point out an important special case in which the construction of $\hat X$ and of $\hat Y$ can be described explicitly. As in the univariate case (see Theorem 1.A.2) Theorem 6.B.1 can be restated as follows.

Theorem 6.B.2. The $n$-dimensional random vectors $X$ and $Y$ satisfy $X \le_{\rm st} Y$ if, and only if, there exist a random variable $Z$ and $\mathbb{R}^n$-valued functions $\psi_1$ and $\psi_2$ such that $\psi_1(z) \le \psi_2(z)$ for all $z \in \mathbb{R}$, and $X =_{\rm st} \psi_1(Z)$ and $Y =_{\rm st} \psi_2(Z)$.

In light of Theorem 6.B.1, the following question arises. Let $\{X(\theta), \theta \in \Theta\}$ be a collection of $n$-dimensional random vectors indexed by $\theta$, where $\Theta$ is a subset of $\mathbb{R}^m$ for some $m$ (see the beginning of Chapter 8 for a discussion about the meaning of this notation). Suppose that $X(\theta) \le_{\rm st} X(\theta')$ whenever $\theta \le \theta'$; that is, that $X(\theta)$ is stochastically increasing in $\theta$. Is it possible then to construct, on some probability space, a family $\{\hat X(\theta), \theta \in \Theta\}$ such that $\hat X(\theta) =_{\rm st} X(\theta)$ for all $\theta \in \Theta$, and such that $P\{\hat X(\theta) \le \hat X(\theta')\} = 1$ whenever $\theta \le \theta'$? It turns out that if $\Theta \subseteq \mathbb{R}$ (that is, $m = 1$) the answer is in the affirmative. However, when $m \ge 2$ this need not be the case; see Fill and Machida [200] for a counterexample and a further discussion.

6.B.3 Conditions that lead to the multivariate usual stochastic order

The first basic result described in this subsection gives sufficient conditions for the usual multivariate stochastic order by means of the usual univariate stochastic order. The proof is based on the well-known standard construction: Suppose that we are given a distribution of a random vector $X = (X_1, X_2, \dots, X_n)$ and we want to construct a random vector $\hat X = (\hat X_1, \hat X_2, \dots, \hat X_n)$ such that $\hat X =_{\rm st} X$. The interest in such constructions is in simulation theory as well as in other areas of applications. In order to do it let $U_1, U_2, \dots, U_n$ be independent uniform[0, 1] random variables and define

$$\hat X_1 = \inf\{x_1 : P\{X_1 \le x_1\} \ge U_1\},$$

$$\hat X_k = \inf\{x_k : P\{X_k \le x_k \mid X_1 = \hat X_1, \dots, X_{k-1} = \hat X_{k-1}\} \ge U_k\}, \quad k = 2, 3, \dots, n.$$

Then $\hat X =_{\rm st} X$.

The conditions given in the next result are natural for a construction of $\hat X$ and $\hat Y$, as needed in Theorem 6.B.1, using the standard construction. The result then follows from Theorem 6.B.1. Recall from Section 1.A.3 that for any random vector $Z$ and an event $A$ we denote by $[Z \mid A]$ any random vector that has as its distribution the conditional distribution of $Z$ given $A$.

Theorem 6.B.3. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If

$$X_1 \le_{\rm st} Y_1, \tag{6.B.8}$$

$$[X_2 \mid X_1 = x_1] \le_{\rm st} [Y_2 \mid Y_1 = y_1] \quad \text{whenever } x_1 \le y_1, \tag{6.B.9}$$

and in general, for $i = 2, 3, \dots, n$,

$$[X_i \mid X_1 = x_1, \dots, X_{i-1} = x_{i-1}] \le_{\rm st} [Y_i \mid Y_1 = y_1, \dots, Y_{i-1} = y_{i-1}] \quad \text{whenever } x_j \le y_j,\ j = 1, 2, \dots, i-1, \tag{6.B.10}$$

then $X \le_{\rm st} Y$.

Proof. First we construct $\hat X_1$ and $\hat Y_1$ on some probability space as described, for example, in Section 1.A.2. This is possible by (6.B.8). Any possible realization $(x_1, y_1)$ of $(\hat X_1, \hat Y_1)$ must satisfy $x_1 \le y_1$. Conditioned on every such possible realization $(x_1, y_1)$ we next construct $\hat X_2$ and $\hat Y_2$ on the same probability space again as described, for example, in Section 1.A.2. This, again, is possible by (6.B.9). We thus have constructed so far $(\hat X_1, \hat X_2)$ and $(\hat Y_1, \hat Y_2)$. Any possible realization $((x_1, x_2), (y_1, y_2))$ of $((\hat X_1, \hat X_2), (\hat Y_1, \hat Y_2))$ must satisfy $x_j \le y_j$, $j = 1, 2$. Therefore, conditioned on every such possible realization $((x_1, x_2), (y_1, y_2))$ we next can construct $\hat X_3$ and $\hat Y_3$ on the same probability space and so on. Continuing this procedure we finally arrive at random vectors $\hat X$ and $\hat Y$, which satisfy (6.B.7). By the standard construction they also satisfy (6.B.5) and (6.B.6). Therefore $X \le_{\rm st} Y$ by Theorem 6.B.1.
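The standard construction and the coupling used in the proof of Theorem 6.B.3 can be sketched for a pair of bivariate distributions on $\{0, 1\}^2$. Everything specific here is an assumption made for illustration: the two laws `LAW_X` and `LAW_Y` are hypothetical and were chosen to satisfy (6.B.8) and (6.B.9), and driving both vectors by the same uniform variables is one convenient way to realize the construction of Section 1.A.2 at each step.

```python
import random


def quantile(pmf, u):
    """inf{x : P{X <= x} >= u} for a finite pmf given as {value: probability}."""
    cumulative = 0.0
    values = sorted(pmf)
    for x in values:
        cumulative += pmf[x]
        if cumulative >= u:
            return x
    return values[-1]  # guards against floating-point round-off


def standard_construction(law, u):
    """Build (X1_hat, X2_hat) from the uniforms u = (u1, u2)."""
    marginal1, conditional2 = law
    x1 = quantile(marginal1, u[0])
    x2 = quantile(conditional2[x1], u[1])
    return (x1, x2)


# Hypothetical laws: (pmf of first coordinate, pmf of second given the first).
LAW_X = ({0: 0.7, 1: 0.3}, {0: {0: 0.8, 1: 0.2}, 1: {0: 0.5, 1: 0.5}})
LAW_Y = ({0: 0.4, 1: 0.6}, {0: {0: 0.6, 1: 0.4}, 1: {0: 0.3, 1: 0.7}})


def coupled_pair(rng):
    """Construct X_hat and Y_hat from the same uniforms."""
    u = (rng.random(), rng.random())
    return standard_construction(LAW_X, u), standard_construction(LAW_Y, u)


rng = random.Random(2007)
samples = [coupled_pair(rng) for _ in range(5000)]
always_ordered = all(x[0] <= y[0] and x[1] <= y[1] for x, y in samples)
```

In this sketch every sampled pair satisfies $\hat X \le \hat Y$ componentwise, as (6.B.7) requires, while each vector keeps its own distribution.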

Conditions (6.B.8)–(6.B.10) can be used to define a new stochastic order. More explicitly, if $X$ and $Y$ satisfy (6.B.8)–(6.B.10), then $X$ is said to be smaller than $Y$ in the strong stochastic order (denoted by $X \le_{\rm sst} Y$). Theorem 6.B.3 simply says that $X \le_{\rm sst} Y \Longrightarrow X \le_{\rm st} Y$. The order ≤sst is not an order in the usual sense; see Remark 6.B.5 below.

Suppose that $X = (X_1, X_2, \dots, X_n)$ satisfies, for $i = 2, 3, \dots, n$, that

$$[X_i \mid X_1 = x_1, \dots, X_{i-1} = x_{i-1}] \le_{\rm st} [X_i \mid X_1 = x_1', \dots, X_{i-1} = x_{i-1}'] \quad \text{whenever } x_j \le x_j',\ j = 1, 2, \dots, i-1. \tag{6.B.11}$$

Then $X$ is said to be conditionally increasing in sequence (CIS). It is easy to see that if $X$ is CIS and if

$$[X_i \mid X_1 = x_1, \dots, X_{i-1} = x_{i-1}] \le_{\rm st} [Y_i \mid Y_1 = x_1, \dots, Y_{i-1} = x_{i-1}] \quad \text{for all } x_j,\ j = 1, 2, \dots, i-1, \tag{6.B.12}$$

then (6.B.10) holds. Similarly, if $Y$ is CIS and (6.B.12) holds, then (6.B.10) holds. We thus have proved the following result.

Theorem 6.B.4. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If either $X$ or $Y$ is CIS and (6.B.8) and (6.B.12) hold, then $X \le_{\rm st} Y$.

Remark 6.B.5. The order ≤sst is not an order in the usual sense. In fact, it is obvious that $X \le_{\rm sst} X \iff X$ is CIS.

Remark 6.B.6. Let $(U_1, U_2)$ be a bivariate random vector with uniform[0, 1] margins, and with an absolutely continuous distribution function $F$. Then, as can easily be verified, $(U_1, U_2)$ is CIS if, and only if, $F(u_1, u_2)$ is a concave function of $u_1 \in [0, 1]$ for any $u_2 \in [0, 1]$.

A random vector $X = (X_1, X_2, \dots, X_n)$ is said to be weak conditionally increasing in sequence (WCIS) if, for $i = 2, 3, \dots, n$, we have

$$[(X_i, \dots, X_n) \mid X_1 = x_1, \dots, X_{i-2} = x_{i-2}, X_{i-1} = x_{i-1}] \le_{\rm st} [(X_i, \dots, X_n) \mid X_1 = x_1, \dots, X_{i-2} = x_{i-2}, X_{i-1} = x_{i-1}']$$

for all $x_j$, $j = 1, 2, \dots, i-2$, and $x_{i-1} \le x_{i-1}'$. It can be shown that if a random vector is CIS, then it is WCIS. The next result thus strengthens Theorem 6.B.4. We do not give its proof here.

Theorem 6.B.7. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If either $X$ or $Y$ is WCIS and (6.B.8) and (6.B.12) hold, then $X \le_{\rm st} Y$.

The second basic result of this subsection is a multivariate analog of the univariate implication $X \le_{\rm lr} Y \Longrightarrow X \le_{\rm st} Y$ (the latter follows from Theorems 1.C.1 and 1.B.1). (Another multivariate analog is given in Theorem 6.E.8.)
Recall the deﬁnition of association given in (3.A.53). Association, along with the notions of CIS and WCIS, are concepts that indicate positive dependence among the random variables X1 , X2 , . . . , Xn .


Theorem 6.B.8. Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors with density functions $f$ and $g$, respectively. If $X$ is associated, and if $g(x)/f(x)$ is increasing in $x$, then $X \le_{\rm st} Y$.

Proof. Let $\phi$ be an increasing function for which $E[\phi(Y)]$ exists. Then

$$\begin{aligned}
E[\phi(Y)] &= \int \phi(y) g(y)\,dy = \int \phi(y)\,\frac{g(y)}{f(y)}\,f(y)\,dy \\
&\ge \int \phi(y) f(y)\,dy \int \frac{g(y)}{f(y)}\,f(y)\,dy = E[\phi(X)],
\end{aligned}$$

where the inequality follows from (3.A.53) and from the monotonicity of $\phi(x)$ and of $g(x)/f(x)$ in $x$. The stated result now follows from (6.B.4).

In order to motivate the third basic result of this subsection, consider $m$ independent random variables $X_1, X_2, \dots, X_m$ and an increasing $m$-dimensional function $\phi$. It seems reasonable to expect that $[(X_1, X_2, \dots, X_m) \mid \phi(X_1, X_2, \dots, X_m) = s]$ is stochastically increasing in $s$. This is not always true, but the next result indicates an important instance in which this is the case. We omit the proof.

Theorem 6.B.9. Let $X_1, X_2, \dots, X_m$ be independent random variables, each with a logconcave density (that is, Polya frequency of order 2 (PF$_2$); see Theorem 1.C.52). Then

$$\Big[(X_1, X_2, \dots, X_m) \,\Big|\, \sum_{i=1}^{m} X_i = s\Big] \le_{\rm st} \Big[(X_1, X_2, \dots, X_m) \,\Big|\, \sum_{i=1}^{m} X_i = s'\Big] \quad \text{whenever } s \le s'.$$

A variation of Theorem 6.B.9 is stated next. In stating the conditions of Theorem 6.B.10 below we use the discrete analog of the univariate down shifted likelihood ratio order (see Section 1.C.4). Explicitly, let $X$ and $Y$ be univariate discrete random variables, each with support $\mathbb{N}_+$. Then we denote $X \le_{\rm lr\downarrow} Y$ if

$$\frac{P\{Y = m + l\}}{P\{X = m\}} \ \text{is increasing in } m \ge 0 \text{ for all } l \ge 0. \tag{6.B.13}$$

Note that (6.B.13) is a discrete analog of (1.C.21).

Theorem 6.B.10. Let $X_1, X_2, \dots, X_m$ be independent random variables, each with support $\mathbb{N}_+$. Denote $S_i = \sum_{j=1}^{i} X_j$, $i = 1, 2, \dots, m$. If

$$X_i \le_{\rm lr\downarrow} S_i, \quad i = 2, 3, \dots, m,$$

and if

$$S_i \le_{\rm lr\downarrow} S_{i+1}, \quad i = 1, 2, \dots, m - 1,$$

then

$$\Big[(X_1, X_2, \dots, X_m) \,\Big|\, \sum_{i=1}^{m} X_i = s\Big] \le_{\rm st} \Big[(X_1, X_2, \dots, X_m) \,\Big|\, \sum_{i=1}^{m} X_i = s'\Big] \quad \text{whenever } s \le s' \in \mathbb{N}_+.$$

In Theorem 6.B.9, the function $\phi$ which is mentioned just before that theorem is $\phi(x_1, x_2, \dots, x_m) = \sum_{i=1}^{m} x_i$. Another case of interest is when $\phi(x_1, x_2, \dots, x_m) = x_{(i)}$, for some $i \in \{1, 2, \dots, m\}$, where $x_{(i)}$ is the $i$th smallest $x_j$. In fact we have the following result, whose proof we do not give. Note that it is not necessary to assume logconcavity in the next theorem.

Theorem 6.B.11. Let $X_1, X_2, \dots, X_m$ be independent and identically distributed random variables with a continuous distribution function. Let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ denote the corresponding order statistics. Let $1 \le r \le m$. Then
(a) for $1 \le k_1 < k_2 < \cdots < k_r \le m$, one has that $[(X_1, X_2, \dots, X_m) \mid X_{(k_1)} = s_1, X_{(k_2)} = s_2, \dots, X_{(k_r)} = s_r]$ is stochastically increasing in $s_1 \le s_2 \le \cdots \le s_r$;
(b) for $s_1 \le s_2 \le \cdots \le s_r$, one has that $[(X_1, X_2, \dots, X_m) \mid X_{(k_1)} = s_1, X_{(k_2)} = s_2, \dots, X_{(k_r)} = s_r]$ is stochastically decreasing in $1 \le k_1 < k_2 < \cdots < k_r \le m$.

A related result is given in the following theorem.

Theorem 6.B.12. Let $X_1, X_2, \dots, X_m$ be independent and identically distributed random variables with a continuous distribution function. Let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ denote the corresponding order statistics. Let $1 \le r \le m$. Then for $1 \le k \le m$, and $s \in \mathbb{R}$, one has that $[(X_1, X_2, \dots, X_m) \mid X_{(k-1)} < s < X_{(k)}]$ is stochastically increasing in $s$, and is stochastically decreasing in $k$.

Another result that is related to Theorem 6.B.11 is the following.

Theorem 6.B.13. Let $X_1, X_2, \dots, X_m$ be independent exponential random variables with possibly different parameters. Let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ denote the corresponding order statistics. Then $[(X_{(1)}, X_{(2)}, \dots, X_{(m)}) \mid X_{(1)} = s_1]$ is stochastically increasing in $s_1$.

The proof of Theorem 6.B.13 uses ideas involving the total hazard construction which is described in Section 6.C.2. Therefore we defer the proof of this theorem to Remark 6.C.2.
For the next result we need the definition of a copula. Let $F$ be an $n$-dimensional distribution function with univariate marginal distribution functions $F_1, F_2, \dots, F_n$. Then there exists an $n$-dimensional distribution function $C$, with uniform[0, 1] marginal distributions, such that

$$F(x_1, x_2, \dots, x_n) = C(F_1(x_1), F_2(x_2), \dots, F_n(x_n)), \quad (x_1, x_2, \dots, x_n) \in \mathbb{R}^n. \tag{6.B.14}$$

The function $C$ is a copula associated with $F$. If $F$ is continuous, then $C$ is unique and can be obtained by

$$C(u_1, u_2, \dots, u_n) = F(F_1^{-1}(u_1), F_2^{-1}(u_2), \dots, F_n^{-1}(u_n)), \quad (u_1, u_2, \dots, u_n) \in [0, 1]^n; \tag{6.B.15}$$

see, for example, Nelsen [431]. Note that if $(U_1, U_2, \dots, U_n)$ has the distribution function $C$, then from (6.B.15) it follows that

$$(F_1^{-1}(U_1), F_2^{-1}(U_2), \dots, F_n^{-1}(U_n)) =_{\rm st} (X_1, X_2, \dots, X_n). \tag{6.B.16}$$

Theorem 6.B.14. Let the random vectors $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ have a common copula. If $X_i \le_{\rm st} Y_i$, $i = 1, 2, \dots, n$, then $X \le_{\rm st} Y$.

Proof. We only give the proof for the continuous case. Let $C$ be the common copula, and let $(U_1, U_2, \dots, U_n)$ be distributed according to $C$. Furthermore, let $F_i$ and $G_i$ denote the univariate distribution functions of $X_i$ and $Y_i$, respectively, $i = 1, 2, \dots, n$. From $X_i \le_{\rm st} Y_i$ and (1.A.12) we get $F_i^{-1}(u_i) \le G_i^{-1}(u_i)$ for all $u_i \in [0, 1]$, $i = 1, 2, \dots, n$. Hence

$$(F_1^{-1}(U_1), F_2^{-1}(U_2), \dots, F_n^{-1}(U_n)) \le_{\rm a.s.} (G_1^{-1}(U_1), G_2^{-1}(U_2), \dots, G_n^{-1}(U_n)).$$

The stated result now follows from (6.B.16).
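The coupling in this proof can be sketched with a concrete copula and concrete marginals. All specifics are assumptions made for illustration only: the copula is an equal mixture of the comonotone and the independence copulas, the marginals are exponential (so that $F^{-1}(u) = -\ln(1-u)/\mu$ for rate $\mu$), and the rates are hypothetical values chosen so that $X_i \le_{\rm st} Y_i$.

```python
import math
import random


def exp_quantile(rate, u):
    """Inverse distribution function of an exponential variable, u in [0, 1)."""
    return -math.log(1.0 - u) / rate


def sample_copula(rng):
    """Draw (U1, U2): with probability 1/2 set U2 = U1, otherwise independent."""
    u1 = rng.random()
    u2 = u1 if rng.random() < 0.5 else rng.random()
    return u1, u2


def coupled_vectors(rates_x, rates_y, rng):
    """Apply both sets of quantile functions to the same copula sample."""
    u = sample_copula(rng)
    x = tuple(exp_quantile(r, ui) for r, ui in zip(rates_x, u))
    y = tuple(exp_quantile(r, ui) for r, ui in zip(rates_y, u))
    return x, y


rng = random.Random(0)
pairs = [coupled_vectors((2.0, 3.0), (1.0, 1.5), rng) for _ in range(2000)]
componentwise_ordered = all(a <= b for x, y in pairs for a, b in zip(x, y))
```

Because both vectors are functions of the same $(U_1, U_2)$, the componentwise inequality holds for every sample, which is the almost sure ordering used in the proof.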

Theorem 6.B.14 may be compared with Theorem 7.A.38.

An interesting result, which gives conditions under which one can stochastically compare vectors of partial sums of independent random variables, is stated next.

Theorem 6.B.15. Let $\{Z_i\}_{i=1}^{n}$ be a sequence of independent random variables. If $Z_1 \le_{\rm lr} Z_2 \le_{\rm lr} \cdots \le_{\rm lr} Z_n$ then

$$\Big(Z_1, Z_1 + Z_2, \dots, \sum_{i=1}^{n} Z_i\Big) \le_{\rm st} \Big(Z_{\pi_1}, Z_{\pi_1} + Z_{\pi_2}, \dots, \sum_{i=1}^{n} Z_{\pi_i}\Big) \le_{\rm st} \Big(Z_n, Z_n + Z_{n-1}, \dots, \sum_{i=1}^{n} Z_i\Big),$$

for every permutation $(\pi_1, \pi_2, \dots, \pi_n)$ of $(1, 2, \dots, n)$.


In particular it follows from Theorem 6.B.15 that if the random variables $X$ and $Y$ are such that $X \le_{\rm lr} Y$, then

$$(X, X + Y) \le_{\rm st} (Y, X + Y). \tag{6.B.17}$$

Conclusion (6.B.17) does not necessarily follow from merely assuming that $X \le_{\rm st} Y$. This can be shown by a counterexample. The proof of (6.B.17) can be obtained from Theorem 1.C.20 as follows. Let $\psi$ be a bivariate increasing function. Then the function $\phi$, defined by $\phi(x, y) = \psi(x, x + y)$, belongs to $\mathcal{G}_{\rm lr}$. Therefore, from (1.C.11) one sees that $\psi(X, X + Y) \le_{\rm st} \psi(Y, X + Y)$ and this gives (6.B.17). The proof of Theorem 6.B.15 uses the same idea together with a conditioning argument.

6.B.4 Closure properties

Using (6.B.1) through (6.B.7) it is easy to prove each of the following closure results (note that parts (a) and (c) are special cases of part (b) in the next theorem).

Theorem 6.B.16.
(a) Let $X$ and $Y$ be two $n$-dimensional random vectors. If $X \le_{\rm st} Y$ and $g : \mathbb{R}^n \to \mathbb{R}^k$ is any $k$-dimensional increasing [decreasing] function, for any positive integer $k$, then the $k$-dimensional vectors $g(X)$ and $g(Y)$ satisfy $g(X) \le_{\rm st} [\ge_{\rm st}]\ g(Y)$.
(b) Let $X_1, X_2, \dots, X_m$ be a set of independent random vectors where the dimension of $X_i$ is $k_i$, $i = 1, 2, \dots, m$. Let $Y_1, Y_2, \dots, Y_m$ be another set of independent random vectors where the dimension of $Y_i$ is $k_i$, $i = 1, 2, \dots, m$. Denote $k = k_1 + k_2 + \cdots + k_m$. If $X_i \le_{\rm st} Y_i$ for $i = 1, 2, \dots, m$, then, for any increasing function $\psi : \mathbb{R}^k \to \mathbb{R}$, one has $\psi(X_1, X_2, \dots, X_m) \le_{\rm st} \psi(Y_1, Y_2, \dots, Y_m)$. That is, the usual multivariate stochastic order is closed under conjunctions. In particular, the usual multivariate stochastic order is closed under convolutions.
(c) Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If $X \le_{\rm st} Y$, then $X_I \le_{\rm st} Y_I$ for each $I \subseteq \{1, 2, \dots, n\}$. That is, the usual multivariate stochastic order is closed under marginalization.
(d) Let $\{X_j, j = 1, 2, \dots\}$ and $\{Y_j, j = 1, 2, \dots\}$ be two sequences of random vectors such that $X_j \to_{\rm st} X$ and $Y_j \to_{\rm st} Y$ as $j \to \infty$, where $\to_{\rm st}$ denotes convergence in distribution.
If X j ≤st Y j , j = 1, 2, . . ., then X ≤st Y . (e) Let X, Y , and Θ be random vectors such that [X Θ = θ] ≤st [Y Θ = θ] for all θ in the support of Θ. Then X ≤st Y . That is, the usual stochastic order is closed under mixtures. In (6.B.1) the random vectors X and Y can be taken to be of countable inﬁnite dimension; that is, each of X and Y may correspond to an inﬁnite


sequence of random variables. In such a case, if (6.B.1) holds for all upper sets in R∞, then we still say that X is smaller than Y in the usual stochastic order (denoted as X ≤st Y). A generalization of this idea is described in Section 6.B.7. The inequality (6.B.4), as well as Theorem 6.B.1, are still valid when X and Y have countable infinite dimension. We thus get the following result which involves multivariate random sums. Below, an empty sum is understood to be 0.

Theorem 6.B.17. Let X1, X2, . . . , Xm be m countably infinite vectors of nonnegative random variables, and let Y1, Y2, . . . , Ym be other m such vectors. Let M = (M1, M2, . . . , Mm) and N = (N1, N2, . . . , Nm) be two vectors of nonnegative integers such that M is independent of X1, X2, . . . , Xm, and N is independent of Y1, Y2, . . . , Ym. Denote by Xj,i [Yj,i] the ith element of Xj [Yj]. If (X1, X2, . . . , Xm) ≤st (Y1, Y2, . . . , Ym), and if M ≤st N, then

\[
\Bigl(\sum_{i=1}^{M_1} X_{1,i},\ \sum_{i=1}^{M_2} X_{2,i},\ \dots,\ \sum_{i=1}^{M_m} X_{m,i}\Bigr)
\le_{st}
\Bigl(\sum_{i=1}^{N_1} Y_{1,i},\ \sum_{i=1}^{N_2} Y_{2,i},\ \dots,\ \sum_{i=1}^{N_m} Y_{m,i}\Bigr).
\]

Consider now n families of univariate distribution functions {G^{(i)}_θ, θ ∈ 𝒳_i} where 𝒳_i is a subset of the real line R, i = 1, 2, . . . , n. Let X_i(θ) denote a random variable with distribution function G^{(i)}_θ, i = 1, 2, . . . , n. Let Θ = (Θ1, Θ2, . . . , Θn) be a random vector with support in ∏_{i=1}^{n} 𝒳_i, and with distribution function F. Consider the n-dimensional distribution function H given by

\[
H(y_1, y_2, \dots, y_n) = \int_{\mathcal{X}_1}\int_{\mathcal{X}_2}\cdots\int_{\mathcal{X}_n} \prod_{i=1}^{n} G^{(i)}_{\theta_i}(y_i)\, dF(\theta_1, \theta_2, \dots, \theta_n), \quad (y_1, y_2, \dots, y_n) \in \mathbb{R}^n. \tag{6.B.18}
\]

The following result is a generalization of Theorem 6.B.16(e), and is a multivariate extension of Theorem 1.A.6; see Theorems 6.G.8, 7.A.37, 9.A.7, and 9.A.15 for related results.

Theorem 6.B.18. Let {G^{(i)}_θ, θ ∈ 𝒳_i}, i = 1, 2, . . . , n, be n families of univariate distribution functions as above. Let Θ1 and Θ2 be two random vectors with supports in ∏_{i=1}^{n} 𝒳_i and distribution functions F1 and F2, respectively. Let Y1 and Y2 be two random vectors with distribution functions H1 and H2 given by

\[
H_j(y_1, y_2, \dots, y_n) = \int_{\mathcal{X}_1}\int_{\mathcal{X}_2}\cdots\int_{\mathcal{X}_n} \prod_{i=1}^{n} G^{(i)}_{\theta_i}(y_i)\, dF_j(\theta_1, \theta_2, \dots, \theta_n), \quad (y_1, y_2, \dots, y_n) \in \mathbb{R}^n,\ j = 1, 2.
\]

If

\[
X_i(\theta) \le_{st} X_i(\theta') \quad \text{whenever } \theta \le \theta',\ i = 1, 2, \dots, n,
\]

and if Θ1 ≤st Θ2, then Y1 ≤st Y2.

6.B.5 Further properties

Clearly if X ≤st Y, then EX ≤ EY. However, similar to the univariate case, if two random vectors are ordered in the usual multivariate stochastic order and have the same expected values, then they must have the same distribution. This is shown in the following result, which is a multivariate generalization of Theorem 1.A.8. Similar results are given in Theorems 3.A.43, 3.A.60, 4.A.69, 5.A.15, 6.G.12, and 7.A.14–7.A.16.

Theorem 6.B.19. Let X = (X1, X2, . . . , Xm) and Y = (Y1, Y2, . . . , Ym) be two random vectors. If X ≤st Y and if E[hi(Xi)] = E[hi(Yi)] for some strictly increasing function hi, i = 1, 2, . . . , m, then X =st Y.

We will not give the complete proof of Theorem 6.B.19 here, but we will show a simple argument that proves it when X and Y are nonnegative random vectors. From the assumption X ≤st Y and from Theorem 6.B.16(c) it follows that Xi ≤st Yi. Since E[hi(Xi)] = E[hi(Yi)] it follows from Theorem 1.A.8 that Xi =st Yi, and thus, in particular, EXi = EYi for i = 1, 2, . . . , m. Therefore

\[
E\Bigl[\sum_{i=1}^{m} \alpha_i X_i\Bigr] = \sum_{i=1}^{m} \alpha_i E[X_i] = \sum_{i=1}^{m} \alpha_i E[Y_i] = E\Bigl[\sum_{i=1}^{m} \alpha_i Y_i\Bigr]
\]

whenever αi ≥ 0, i = 1, 2, . . . , m. Also, from X ≤st Y it follows that

\[
\sum_{i=1}^{m} \alpha_i X_i \le_{st} \sum_{i=1}^{m} \alpha_i Y_i \quad \text{whenever } \alpha_i \ge 0,\ i = 1, 2, \dots, m.
\]

Therefore, again by Theorem 1.A.8, we have that

\[
\sum_{i=1}^{m} \alpha_i X_i =_{st} \sum_{i=1}^{m} \alpha_i Y_i \quad \text{whenever } \alpha_i \ge 0,\ i = 1, 2, \dots, m.
\]

Thus

\[
E\exp\Bigl\{-\sum_{i=1}^{m} \alpha_i X_i\Bigr\} = E\exp\Bigl\{-\sum_{i=1}^{m} \alpha_i Y_i\Bigr\}
\]

whenever αi ≥ 0, i = 1, 2, . . . , m. From the unicity property of the Laplace transform we obtain X =st Y.


A straightforward analog of Theorem 1.A.15 is in general not true in the multivariate case. That is, if X is any random vector and if U1 and U2 are any increasing sets such that U1 ⊇ U2, then it is not necessarily true that [X | X ∈ U1] ≤st [X | X ∈ U2]; some positive dependence property must be imposed on X in order for this result to hold. We do not give the details here.

Recall from (6.B.4) that X = (X1, X2, . . . , Xm) ≤st Y = (Y1, Y2, . . . , Ym) if, and only if, E[φ(X)] ≤ E[φ(Y)] for all increasing functions φ, and that (6.B.2) says that X ≤st Y if, and only if, E[φ(X)] ≤ E[φ(Y)] for all increasing indicator functions φ. When m = 2 we have a further similar characterization of the multivariate order ≤st, as is stated next. The proof is omitted.

Theorem 6.B.20. Let (X1, X2) and (Y1, Y2) be two random vectors. Then (X1, X2) ≤st (Y1, Y2) if, and only if, φ1(X1) + φ2(X2) ≤st φ1(Y1) + φ2(Y2) for all increasing functions φ1 and φ2.

A random vector (X1, X2, . . . , Xm) or its distribution is said to be permutation symmetric or exchangeable if (X1, X2, . . . , Xm) =st (Xπ1, Xπ2, . . . , Xπm) for every permutation π of (1, 2, . . . , m). A set U is said to be symmetric if (x1, x2, . . . , xm) ∈ U =⇒ (xπ1, xπ2, . . . , xπm) ∈ U for every permutation π of (1, 2, . . . , m). For permutation symmetric random vectors the result in the following theorem holds. The proof uses symmetry arguments and is omitted.

Theorem 6.B.21. Let X = (X1, X2, . . . , Xm) and Y = (Y1, Y2, . . . , Ym) be two permutation symmetric random vectors. Then X ≤st Y if, and only if, P{X ∈ U} ≤ P{Y ∈ U} for all symmetric upper sets U ⊆ Rm.

In the next result we obtain a comparison of order statistics with respect to ≤st, but first we need a lemma. Let z1, z2, . . . be a sequence of constants or of random variables. Denote by z(i:m) the ith smallest value among the first m zi's.

Lemma 6.B.22. For any sequence of constants z1, z2, . . . the following inequalities hold:

\[
z_{(i:m)} \le z_{(i+1:m)}, \quad 1 \le i \le m - 1. \tag{6.B.19}
\]
\[
z_{(i:m+1)} \le z_{(i:m)}, \quad 1 \le i \le m. \tag{6.B.20}
\]
\[
z_{(i:m)} \le z_{(i+1:m+1)}, \quad 1 \le i \le m. \tag{6.B.21}
\]


Proof. The proof of (6.B.19) is obvious from the deﬁnition of the z(i:m) ’s. The proof of (6.B.20) is also quite simple—just note that if zm+1 ≤ z(i:m) , then z(i:m+1) ≤ z(i:m) , whereas if zm+1 > z(i:m) , then z(i:m+1) = z(i:m) . Finally, in order to prove (6.B.21), note that if zm+1 ≤ z(i:m) , then z(i+1:m+1) = z(i:m) , whereas if zm+1 > z(i:m) , then z(i:m) ≤ z(i+1:m+1) .
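The three inequalities of Lemma 6.B.22 are easy to check numerically. The following Python sketch is only an illustration, not part of the original argument: the helper names `order_stat` and `lemma_holds` are hypothetical, and the test sequences are arbitrary seeded integers (ties included).

```python
import random


def order_stat(z, i, m):
    """Return z_(i:m): the i-th smallest value among the first m entries of z (1-based i)."""
    return sorted(z[:m])[i - 1]


def lemma_holds(z):
    """Check (6.B.19), (6.B.20) and (6.B.21) for every admissible i and m on the sequence z."""
    n = len(z)
    for m in range(1, n + 1):
        # (6.B.19): z_(i:m) <= z_(i+1:m), 1 <= i <= m - 1
        for i in range(1, m):
            if order_stat(z, i, m) > order_stat(z, i + 1, m):
                return False
        if m + 1 <= n:
            for i in range(1, m + 1):
                # (6.B.20): z_(i:m+1) <= z_(i:m), 1 <= i <= m
                if order_stat(z, i, m + 1) > order_stat(z, i, m):
                    return False
                # (6.B.21): z_(i:m) <= z_(i+1:m+1), 1 <= i <= m
                if order_stat(z, i, m) > order_stat(z, i + 1, m + 1):
                    return False
    return True


rng = random.Random(0)  # fixed seed so the check is reproducible
sequences = [[rng.randint(-5, 5) for _ in range(8)] for _ in range(200)]
all_ok = all(lemma_holds(z) for z in sequences)
```

Here `all_ok` should evaluate to `True`, since the lemma holds for every real sequence.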

Theorem 6.B.23. Let {X1, X2, . . . } and {Y1, Y2, . . . } be two sequences of random variables such that

\[
(X_1, X_2, \dots, X_k) \le_{st} (Y_1, Y_2, \dots, Y_k), \quad k \ge 1. \tag{6.B.22}
\]

Then

\[
X_{(i:m)} \le_{st} Y_{(j:n)} \quad \text{whenever } i \le j \text{ and } m - i \ge n - j. \tag{6.B.23}
\]

Proof. First note that from (6.B.22) it follows that

\[
X_{(i:m)} \le_{st} Y_{(i:m)}, \quad 1 \le i \le m. \tag{6.B.24}
\]

Now, if m ≥ n, then

\[
\begin{aligned}
X_{(i:m)} &\le_{a.s.} X_{(i:n)} && \text{(by (6.B.20) and } m \ge n\text{)} \\
&\le_{st} Y_{(i:n)} && \text{(by (6.B.24))} \\
&\le_{a.s.} Y_{(j:n)} && \text{(by (6.B.19) and } i \le j\text{)}.
\end{aligned}
\]

And if m < n, then

\[
\begin{aligned}
X_{(i:m)} &\le_{st} Y_{(i:m)} && \text{(by (6.B.24))} \\
&\le_{a.s.} Y_{(i+n-m:n)} && \text{(by (6.B.21) and } m < n\text{)} \\
&\le_{a.s.} Y_{(j:n)} && \text{(by (6.B.19) and } j \ge i + n - m\text{)}.
\end{aligned}
\]

Since the almost sure relation ≤a.s. implies the relation ≤st, we obtain (6.B.23) from the above inequalities.

If in Theorem 6.B.23 we take Yi = Xi , i = 1, 2, . . ., then obviously (6.B.22) holds. Thus we obtain the following corollary. Corollary 6.B.24. Let {X1 , X2 , . . . } be a sequence of (not necessarily independent) random variables. Then X(i:m) ≤st X(j:n)

whenever i ≤ j and m − i ≥ n − j.
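When Yi = Xi, every step in the proof of Theorem 6.B.23 is an almost sure inequality, so the comparison in Corollary 6.B.24 holds path by path. The Python sketch below checks this on dependent sequences (seeded random walks). The function names are hypothetical and the example is an illustration under these assumptions, not a proof.

```python
import random


def order_stat(x, i, m):
    """Return x_(i:m): the i-th smallest value among the first m entries of x (1-based i)."""
    return sorted(x[:m])[i - 1]


def admissible(i, m, j, n):
    """Index condition of Corollary 6.B.24: i <= j and m - i >= n - j."""
    return 1 <= i <= m and 1 <= j <= n and i <= j and m - i >= n - j


def corollary_holds_pathwise(x):
    """Check x_(i:m) <= x_(j:n) for all admissible (i, m, j, n) on one realized path x."""
    size = len(x)
    for m in range(1, size + 1):
        for n in range(1, size + 1):
            for i in range(1, m + 1):
                for j in range(1, n + 1):
                    if admissible(i, m, j, n) and order_stat(x, i, m) > order_stat(x, j, n):
                        return False
    return True


def random_walk(length, rng):
    """A dependent sequence: partial sums of seeded Gaussian steps."""
    path, position = [], 0.0
    for _ in range(length):
        position += rng.gauss(0.0, 1.0)
        path.append(position)
    return path


rng = random.Random(42)
paths_ok = all(corollary_holds_pathwise(random_walk(6, rng)) for _ in range(100))
```

The walks are deliberately not independent, which matches the "not necessarily independent" wording of the corollary.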

The next example shows that if two random variables are ordered in the dispersive order, then the corresponding vectors of spacings are ordered in the usual stochastic order. Related results can be found in Theorems 1.C.45 and 4.B.17, and in Example 6.E.15.


Example 6.B.25. Let X and Y be two random variables. Let X(1) ≤ X(2) ≤ · · · ≤ X(n) denote the order statistics from a sample X1, X2, . . . , Xn of independent and identically distributed random variables that have the same distribution as X. Similarly, let Y(1) ≤ Y(2) ≤ · · · ≤ Y(n) denote the order statistics from another sample Y1, Y2, . . . , Yn of independent and identically distributed random variables that have the same distribution as Y. The corresponding spacings are defined by U(i) ≡ X(i) − X(i−1) and V(i) ≡ Y(i) − Y(i−1), i = 2, 3, . . . , n. Denote U = (U(2), U(3), . . . , U(n)) and V = (V(2), V(3), . . . , V(n)). We will now show that if X ≤disp Y, then U ≤st V. Let F and G denote the distribution functions of X and Y, respectively. Define Ŷ(i) = G^{-1}(F(X(i))), i = 1, 2, . . . , n, and V̂(i) = Ŷ(i) − Ŷ(i−1), i = 2, 3, . . . , n. Clearly, (V(2), V(3), . . . , V(n)) =st (V̂(2), V̂(3), . . . , V̂(n)). Furthermore, from (3.B.10) we have that V̂(i) = G^{-1}(F(X(i))) − G^{-1}(F(X(i−1))) ≥ X(i) − X(i−1) = U(i) a.s., i = 2, 3, . . . , n. Thus, it follows from Theorem 6.B.1 that U ≤st V. In particular, from Theorem 6.B.16(c) we get that U(i) ≤st V(i) for i = 2, 3, . . . , n, and this proves Theorem 3.B.31.

For the next two examples recall from page 2 the definition of the majorization order a ≺ b among n-dimensional vectors.

Example 6.B.26. Let X1, X2, . . . , Xn, Y1, Y2, . . . , Yn be independent Gamma random variables where Xi has the density function fi defined by

\[
f_i(x) = \frac{\lambda_i^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\lambda_i x}, \quad x \ge 0,
\]

where α > 0 and λi > 0, i = 1, 2, . . . , n, and Yi has the density function gi defined by

\[
g_i(x) = \frac{\mu_i^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\mu_i x}, \quad x \ge 0,
\]

where α > 0 is as above, and µi > 0, i = 1, 2, . . . , n. Denote the corresponding order statistics by X(1) ≤ X(2) ≤ · · · ≤ X(n) and Y(1) ≤ Y(2) ≤ · · · ≤ Y(n). Suppose that (λ1, λ2, . . . , λn) ≺ (µ1, µ2, . . . , µn). If α ≤ 1, then (X(1), X(2), . . . , X(n)) ≤st (Y(1), Y(2), . . . , Y(n)), and if α ≥ 1, then

X(1) ≥st Y(1) and X(n) ≤st Y(n).

In particular, by taking α = 1, it is seen that the above inequalities hold for heterogeneous exponential random variables.
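For the exponential case (α = 1) the extreme order statistics have closed-form distribution functions, so the conclusions of Example 6.B.26 can be inspected numerically. The Python sketch below is an illustration under stated assumptions: `is_majorized` implements the standard partial-sum definition of a ≺ b (assumed to agree with the definition recalled from page 2), and the rate vectors (2, 2) and (1, 3) are arbitrary illustrative choices.

```python
import math


def is_majorized(a, b, tol=1e-12):
    """True if a ≺ b: equal totals, and each partial sum of the decreasingly
    sorted entries of a is at most the corresponding partial sum for b."""
    if len(a) != len(b):
        return False
    partial_a = partial_b = 0.0
    for x, y in zip(sorted(a, reverse=True), sorted(b, reverse=True)):
        partial_a += x
        partial_b += y
        if partial_a > partial_b + tol:
            return False
    return abs(partial_a - partial_b) <= tol


def cdf_max(rates, x):
    """P{X_(n) <= x} for independent exponentials with the given rates."""
    return math.prod(1.0 - math.exp(-r * x) for r in rates)


def cdf_min(rates, x):
    """P{X_(1) <= x}: the minimum is exponential with rate sum(rates)."""
    return 1.0 - math.exp(-sum(rates) * x)


lam = [2.0, 2.0]  # rates of X_1, X_2 (illustrative)
mu = [1.0, 3.0]   # rates of Y_1, Y_2 (illustrative), with lam ≺ mu
grid = [0.05 * k for k in range(1, 200)]

# X_(n) <=st Y_(n) means the distribution function of X_(n) lies above that of Y_(n).
max_ordered = all(cdf_max(lam, x) >= cdf_max(mu, x) for x in grid)
# With alpha = 1 both one-sided conclusions apply to the minimum, so X_(1) =st Y_(1).
min_equal = all(abs(cdf_min(lam, x) - cdf_min(mu, x)) < 1e-12 for x in grid)
```

The equality for the minimum reflects the fact that majorization forces the two rate vectors to have the same sum.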


Example 6.B.27. Let X1, X2, . . . , Xn, Y1, Y2, . . . , Yn be independent Weibull random variables where Xi has the survival function F̄i defined by

\[
\bar F_i(x) = e^{-(\lambda_i x)^{\alpha}}, \quad x \ge 0,
\]

where α > 0 and λi > 0, i = 1, 2, . . . , n, and Yi has the survival function Ḡi defined by

\[
\bar G_i(x) = e^{-(\mu_i x)^{\alpha}}, \quad x \ge 0,
\]

where α > 0 is as above, and µi > 0, i = 1, 2, . . . , n. Denote the corresponding order statistics by X(1) ≤ X(2) ≤ · · · ≤ X(n) and Y(1) ≤ Y(2) ≤ · · · ≤ Y(n). Suppose that (λ1, λ2, . . . , λn) ≺ (µ1, µ2, . . . , µn). If α ≤ 1, then (X(1), X(2), . . . , X(n)) ≤st (Y(1), Y(2), . . . , Y(n)). Again, by taking α = 1, it is seen that the above inequalities hold for heterogeneous exponential random variables.

Example 6.B.28. Let X = (X1, X2, . . . , Xm) and Y = (Y1, Y2, . . . , Ym) be infinitely divisible random vectors with Lévy measures νX and νY, respectively; that is, νX and νY satisfy \(\int_{\mathbb{R}^m} (1 \wedge |x|)\,\nu_X(dx) < \infty\) and \(\int_{\mathbb{R}^m} (1 \wedge |y|)\,\nu_Y(dy) < \infty\), and the characteristic functions of X and of Y can be written in the form

\[
\varphi_X(t) = \exp\Bigl\{\int_{\mathbb{R}^m \setminus \{0\}} \bigl(e^{i(t \cdot x)} - 1\bigr)\,\nu_X(dx) + i(t \cdot b_X)\Bigr\}
\]

and

\[
\varphi_Y(t) = \exp\Bigl\{\int_{\mathbb{R}^m \setminus \{0\}} \bigl(e^{i(t \cdot y)} - 1\bigr)\,\nu_Y(dy) + i(t \cdot b_Y)\Bigr\},
\]

respectively, for some bX, bY ∈ Rm. Assume that νX and νY are concentrated on [0, ∞)m. If νX(U) ≤ νY(U) for all Borel measurable upper sets U in Rm, and if bX ≤ bY, then X ≤st Y.

The following example gives necessary and sufficient conditions for the comparison of multivariate normal random vectors. See Examples 6.G.11, 7.A.13, 7.A.26, 7.A.39, 7.B.5, and 9.A.20 for related results.

Example 6.B.29. Let X be a multivariate normal random vector with mean vector µX and variance-covariance matrix ΣX, and let Y be a multivariate normal random vector with mean vector µY and variance-covariance matrix ΣY. Then X ≤st Y if, and only if, µX ≤ µY and ΣX = ΣY.

6.B.6 A property in reliability theory

In this subsection we show how the multivariate order ≤st can be used as a tool for the purpose of defining aging properties for components whose lifetimes are not necessarily independent. The notions and notations introduced in this subsection will also be used in the rest of this chapter.


Let T = (T1, T2, . . . , Tm) be a nonnegative random vector with an absolutely continuous distribution function. In this subsection it is helpful to think about T1, T2, . . . , Tm as the lifetimes of m components 1, 2, . . . , m that make up some system. Suppose that an observer observes the system continuously in time and records the failure times and the identities of the components that fail as time passes. Thus, a typical "history" that the observer has observed by time t ≥ 0 is of the form

\[
h_t = \{T_I = t_I,\ T_{\bar I} > t e\}, \quad 0e \le t_I \le t e,\ I \subseteq \{1, 2, \dots, m\}. \tag{6.B.25}
\]

In (6.B.25) I is the set of components that have already failed by time t (with failure times tI) and Ī, the complement of I, is the set of components that are still alive at time t. Let

\[
h_s = \{T_J = s_J,\ T_{\bar J} > s e\}, \quad 0e \le s_J \le s e,\ J \subseteq \{1, 2, \dots, m\}, \tag{6.B.26}
\]

be another history. If t ≤ s and the histories ht and hs are such that each component that failed in ht also failed in hs, and, for components that failed in both histories, the failures in hs are earlier than the failures in ht, then we say that the history ht is less severe or "more pleasant" than the history hs and we denote it by ht ≤ hs. Note that if ht and hs are as in (6.B.25) and (6.B.26), then ht ≤ hs if, and only if, I ⊆ J and sI ≤ tI.

For every vector a = (a1, a2, . . . , am) denote by a+ the vector a+ = ((a1)+, (a2)+, . . . , (am)+). Recalling Theorem 1.A.30 we can define a nonnegative random vector T as multivariate IFR if for t ≤ s we have

\[
[(T - te)_+ \mid h_t] \ge_{st} [(T - se)_+ \mid h_s] \quad \text{whenever } h_t \le h_s. \tag{6.B.27}
\]

Another possibility is to call the nonnegative random vector T multivariate IFR if for t ≤ s we have

\[
[(T - te)_+ \mid h_t] \ge_{st} [(T - se)_+ \mid h_s] \quad \text{whenever } h_t \text{ and } h_s \text{ coincide on } [0, t). \tag{6.B.28}
\]

These two different definitions of multivariate IFR have some desirable properties. For example, a vector consisting of independent IFR random variables is multivariate IFR according to either one of these two definitions. However, perhaps the most important feature of these kinds of definitions is their intuitive interpretation. In the univariate case these two definitions coincide with the usual univariate definition of IFR. Further notions of multivariate IFR are studied in Section 6.D.3.

6.B.7 Stochastic ordering of stochastic processes

In Section 1.A.1 we saw how to define the usual stochastic order between two univariate random variables. In Section 6.B.1 we saw how this comparison can


be defined for two multivariate random vectors. The next level of generalization, then, is the stochastic comparison of two stochastic processes. In fact, several levels of generalization can be studied. The stochastic processes can be univariate (if their common state space S is a subset of R). Or they can be multivariate (if their common state space S is a subset of Rm for some m). Or, more generally, the common state space S can be any general space, according to the requirements of the particular application in which the order is to be used. In this subsection we consider only the case in which the random processes are univariate. Section 6.H contains some references for the more general results.

Let {X(t), t ∈ T} and {Y(t), t ∈ T} be two stochastic processes with state space S ⊆ R and time parameter space T (usually T = [0, ∞) or T = N+). Suppose that, for all choices of an integer m and t1 < t2 < · · · < tm in T, it holds that

(X(t1), X(t2), . . . , X(tm)) ≤st (Y(t1), Y(t2), . . . , Y(tm)),

where here ≤st is in the sense of Section 6.B.1. Then {X(t), t ∈ T} is said to be smaller than {Y(t), t ∈ T} in the usual stochastic order (denoted by {X(t), t ∈ T} ≤st {Y(t), t ∈ T}). It can be shown that {X(t), t ∈ T} ≤st {Y(t), t ∈ T} if, and only if,

\[
E\{g(\{X(t), t \in T\})\} \le E\{g(\{Y(t), t \in T\})\}, \tag{6.B.29}
\]

for every increasing functional g for which the expectations in (6.B.29) exist (a functional g is called increasing if g({x(t), t ∈ T}) ≤ g({y(t), t ∈ T}) whenever x(t) ≤ y(t), t ∈ T). An analog of (6.B.1) can also be stated and proved, but it is not included here. However, we do state the following important property of the order ≤st, which is a generalization of Theorem 6.B.1.

Theorem 6.B.30. The random processes {X(t), t ∈ T} and {Y(t), t ∈ T} satisfy {X(t), t ∈ T} ≤st {Y(t), t ∈ T} if, and only if, there exist two random processes {X̂(t), t ∈ T} and {Ŷ(t), t ∈ T}, defined on the same probability space, such that

{X̂(t), t ∈ T} =st {X(t), t ∈ T},

{Ŷ(t), t ∈ T} =st {Y(t), t ∈ T},

and

P{X̂(t) ≤ Ŷ(t), t ∈ T} = 1.

For discrete-time processes (T = N+), an analog of Theorem 6.B.3 is given in Theorem 6.B.31. The proof of it is the same as the proof of Theorem 6.B.3, except that Theorem 6.B.30 is applied at the end of the proof rather than Theorem 6.B.1.


Theorem 6.B.31. Let {X(n), n ∈ N+} = {X(0), X(1), X(2), . . . } and {Y(n), n ∈ N+} = {Y(0), Y(1), Y(2), . . . } be two discrete-time stochastic processes. If X(0) ≤st Y(0), and if

[X(i) | X(1) = x1, . . . , X(i − 1) = xi−1] ≤st [Y(i) | Y(1) = y1, . . . , Y(i − 1) = yi−1]

whenever xj ≤ yj, j = 1, 2, . . . , i − 1, i = 1, 2, 3, . . . , then {X(n), n ∈ N+} ≤st {Y(n), n ∈ N+}.

Theorems 6.B.2 and 6.B.4 also have straightforward analogs that we do not state here. The order ≤st for stochastic processes is closed under operations similar to those described in Theorem 6.B.16. In particular,

{X(t), t ∈ T} ≤st {Y(t), t ∈ T} =⇒ {g({X(t), t ∈ T})} ≤st {g({Y(t), t ∈ T})}

for all increasing functionals g. The order is also closed under mixtures.

To see an important application of these ideas, consider two discrete-time homogeneous Markov processes {X1(n), n ∈ N+} and {X2(n), n ∈ N+} with a common state space S ⊆ R. Denote YX1(x) =st [X1(n + 1) | X1(n) = x] and YX2(x) =st [X2(n + 1) | X2(n) = x], x ∈ S. The proof of the next result follows directly from Theorem 6.B.31.

Theorem 6.B.32. Let {X1(n), n ∈ N+} and {X2(n), n ∈ N+} be two Markov processes as described above. Suppose that X1(0) ≤st X2(0) and that

YX1(x) ≤st YX2(x′) whenever x ≤ x′.

Then {X1(n), n ∈ N+} ≤st {X2(n), n ∈ N+}.

A variation of Theorem 6.B.32 for Markov chains (that is, discrete-time homogeneous Markov processes with state space in N) is given next. Recall that a Markov chain is called skip-free positive if it does not have positive jumps of magnitude more than one. For a Markov chain {X(n), n ∈ N+} with state space S ⊆ N we denote YX(i) =st [X(n + 1) | X(n) = i], i ∈ S. The proof of the following result is obtained by a straightforward construction of the two underlying Markov chains on the same probability space, and then using Theorem 6.B.30.

Theorem 6.B.33. Let {X1(n), n ∈ N+} and {X2(n), n ∈ N+} be two Markov chains. Suppose that X1(0) ≤st X2(0), that

YX1(i) ≤st YX2(i) for all i,

and

YX2(i) ≥ i for all i, (6.B.30)

and that {X1(n), n ∈ N+} is skip-free positive. Then {X1(n), n ∈ N+} ≤st {X2(n), n ∈ N+}.
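The construction of two Markov chains on the same probability space can be sketched in code. The Python example below illustrates the condition of Theorem 6.B.32 by driving two birth and death chains on {0, 1, 2, . . . } with one common uniform variable per step (an inverse-transform, or quantile, coupling). Everything specific here is an assumption made for illustration: the up-probabilities `p1` and `p2`, the reflection at 0, and the function names. The chosen probabilities satisfy p1(x) ≤ p2(x′) whenever x ≤ x′, which gives YX1(x) ≤st YX2(x′) for these chains.

```python
import random


def p1(i):
    """Hypothetical up-probability of chain 1 in state i (increasing in i)."""
    return min(0.2 + 0.05 * i, 0.6)


def p2(i):
    """Hypothetical up-probability of chain 2 in state i; p1(x) <= p2(y) whenever x <= y."""
    return min(0.4 + 0.05 * i, 0.9)


def step(state, up_prob, u):
    """Quantile update: move up when u falls in the top up_prob fraction of [0, 1),
    otherwise move down (reflected at 0)."""
    return state + 1 if u >= 1.0 - up_prob else max(state - 1, 0)


def coupled_paths(x0, y0, n_steps, rng):
    """Run both chains with the same uniform variable at every step."""
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(n_steps):
        u = rng.random()
        x, y = step(x, p1(x), u), step(y, p2(y), u)
        path.append((x, y))
    return path


rng = random.Random(1)
path = coupled_paths(0, 0, 500, rng)
ordered_everywhere = all(x <= y for x, y in path)
```

Under this coupling the first chain never overtakes the second, which is the almost sure comparison that Theorem 6.B.30 turns into the order ≤st between the two processes.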


The discrete-time homogeneous Markov process {X(n), n ∈ N+} is said to be stochastically monotone if YX(x) =st [X(n + 1) | X(n) = x] is stochastically increasing in x ∈ S. Note that stochastic monotonicity is a different condition than the almost sure monotonicity condition (6.B.30); neither of these implies the other. Denote by {X^(x)(n), n ∈ N+} the process {X(n), n ∈ N+} under the condition that X(0) = x. The following result is a direct consequence of Theorem 6.B.32.

Theorem 6.B.34. Let {X(n), n ∈ N+} be a discrete-time homogeneous Markov process that is stochastically monotone. Then

\[
\{X^{(x)}(n), n \in N_+\} \le_{st} \{X^{(x')}(n), n \in N_+\} \tag{6.B.31}
\]

whenever x ≤ x′.

For example, a discrete-time birth and death chain (with state space N) with birth probabilities P{X(n + 1) = i + 1 | X(n) = i} = pi and death probabilities P{X(n + 1) = i − 1 | X(n) = i} = 1 − pi, i ∈ N, is stochastically monotone if pi increases in i ∈ N. Hence it satisfies (6.B.31).

If two processes {X(t), t ∈ T} and {Y(t), t ∈ T} satisfy {X(t), t ∈ T} ≤st {Y(t), t ∈ T}, then, by Theorem 6.B.30, the first passage times TX(a) ≡ inf{t : X(t) > a} and TY(a) ≡ inf{t : Y(t) > a} (where inf ∅ = ∞) satisfy TX(a) ≥st TY(a) for all a. The reverse implication need not be true. By removing (6.B.30) from Theorem 6.B.33 we obtain the following result. Its proof consists of a proper construction of the two underlying Markov chains on the same probability space, and then using Theorem 6.B.30.

Theorem 6.B.35. Let {X1(n), n ∈ N+} and {X2(n), n ∈ N+} be two Markov chains. Suppose that X1(0) ≤st X2(0), that

YX1(i) ≤st YX2(i) for all i,

and that {X1(n), n ∈ N+} is skip-free positive. Then TX1(a) ≥st TX2(a) for all a.

Suppose now that the two processes that we want to compare are point processes that, for distinction, we denote by {K(t), t ≥ 0} and {N(t), t ≥ 0}. That is, for each t ≥ 0, K(t) and N(t) are the numbers of jumps that the corresponding processes have experienced over the time interval (0, t]. In addition to the possible relationship {K(t), t ≥ 0} ≤st {N(t), t ≥ 0} between these processes, we will consider also two other stronger possible relationships.

For any positive integer m, let B1, B2, . . . , Bm be bounded Borel sets of [0, ∞). Let K(Bi) and N(Bi) denote the number of jumps of the corresponding processes over the set Bi, i = 1, 2, . . . , m. Suppose that, for all choices of an integer m and bounded Borel sets B1, B2, . . . , Bm, it holds that

(K(B1), K(B2), . . . , K(Bm)) ≤st (N(B1), N(B2), . . . , N(Bm)).


Then {K(t), t ≥ 0} is said to be smaller than {N(t), t ≥ 0} in the usual stochastic order over N (denoted by {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0}). (Here N denotes the space of integer-valued Radon measures.) The usual stochastic order over N gives a "global" comparison of the point processes {K(t), t ≥ 0} and {N(t), t ≥ 0}.

Let X1, X2, . . . be the sequence of interpoint distances of the process {K(t), t ≥ 0}, and let Y1, Y2, . . . be the sequence of interpoint distances of the process {N(t), t ≥ 0}. We assume that the Xi's and the Yi's are almost surely positive. Also we assume that the processes are nonexplosive in the sense that \(\lim_{n\to\infty} \sum_{i=1}^{n} X_i = \infty\) and \(\lim_{n\to\infty} \sum_{i=1}^{n} Y_i = \infty\) almost surely. Suppose that, for all choices of an integer m and indices i1, i2, . . . , im, it holds that

(Xi1, Xi2, . . . , Xim) ≥st (Yi1, Yi2, . . . , Yim).

Then {K(t), t ≥ 0} is said to be smaller than {N(t), t ≥ 0} in the usual stochastic order over R∞ (denoted by {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0}). The usual stochastic order over R∞ gives a "local" comparison of the point processes {K(t), t ≥ 0} and {N(t), t ≥ 0}.

Analogs of (6.B.29) can be stated and proven for the orders ≤st-N and ≤st-∞. Also, "almost sure" constructions, that are analogs of Theorem 6.B.30, can be shown for these orders. We do not give the technical details here. We note, however, that in such constructions the counterparts K̂ = {K̂(t), t ≥ 0} and N̂ = {N̂(t), t ≥ 0} of {K(t), t ≥ 0} and {N(t), t ≥ 0}, respectively, satisfy the following properties: The relationship {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0} means that K̂ is a thinning of N̂. The relationship {K(t), t ≥ 0} ≤st {N(t), t ≥ 0} means that N̂ has a.s. earlier and more numerous points than K̂ before each time instant t. The relationship {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0} means that the corresponding interpoint distances are shorter for N̂ than for K̂ a.s. From this it is immediate that

{K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0} =⇒ {K(t), t ≥ 0} ≤st {N(t), t ≥ 0},

and that

{K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0} =⇒ {K(t), t ≥ 0} ≤st {N(t), t ≥ 0}. (6.B.32)

It can be shown that, in general, {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0} does not imply {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0}, and also that {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0} does not imply {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0}. For renewal processes we have the following results.

Theorem 6.B.36. Consider two nondelayed renewal processes {K(t), t ≥ 0} and {N(t), t ≥ 0} with generic interpoint distances X and Y, respectively. The following three statements are equivalent.
(i) Y ≤st X,
(ii) {K(t), t ≥ 0} ≤st {N(t), t ≥ 0},


(iii) {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0}.

Proof. Note that from the independence of the interpoint distances it follows that (i)⇐⇒(iii). From (6.B.32) it follows that (iii)=⇒(ii). The implication (ii)=⇒(i) is obvious.

Theorem 6.B.37. Consider two nondelayed renewal processes {K(t), t ≥ 0} and {N(t), t ≥ 0} with generic interpoint distances X and Y, respectively. Let rX and rY denote the hazard rate functions corresponding to X and Y, respectively. If

rX(t) ≤ rY(s) for all 0 ≤ s ≤ t, (6.B.33)

then {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0}.

Theorem 6.B.37 can be easily proven using the fact, mentioned above, that K̂ is a thinning of N̂. We do not give a detailed proof of it here. Note that (6.B.33) holds if Y ≤hr X and if X is DFR or if Y is DFR.

The proofs of the next two theorems are similar to the proofs of Theorems 6.B.36 and 6.B.37, respectively.

Theorem 6.B.38. Consider two delayed renewal processes {K^d(t), t ≥ 0} and {N^d(t), t ≥ 0}, with the corresponding delays X^d and Y^d and with the same interrenewal distribution after the delay. The following statements are equivalent.
(i) Y^d ≤st X^d,
(ii) {K^d(t), t ≥ 0} ≤st {N^d(t), t ≥ 0},
(iii) {K^d(t), t ≥ 0} ≤st-∞ {N^d(t), t ≥ 0}.

Theorem 6.B.39. Consider two delayed renewal processes {K^d(t), t ≥ 0} and {N^d(t), t ≥ 0}, with the corresponding delays X^d and Y^d and with the same interrenewal distribution after the delay. Let r_{X^d} denote the hazard rate function corresponding to X^d. If Y^d ≤hr X^d and if

r_{X^d}(t) ≤ r(s) for all 0 ≤ s ≤ t, (6.B.34)

where r is the hazard rate function associated with the common interrenewal distribution function, then {K^d(t), t ≥ 0} ≤st-N {N^d(t), t ≥ 0}.

Note that (6.B.34) holds, for example, if X ≤hr X^d, and if X is DFR or if X^d is DFR. Finally we give conditions for two nonhomogeneous Poisson processes to be ordered according to the above orders.

Theorem 6.B.40. Let {K(t), t ≥ 0} and {N (t), t ≥ 0} be two nonhomogeneous Poisson processes with mean functions MK and MN , respectively, and with intensity functions λK and λN , respectively.


(i) If MK(t) ≤ MN(t), t ≥ 0, then {K(t), t ≥ 0} ≤st {N(t), t ≥ 0}.
(ii) If λK(t) ≤ λN(t), t ≥ 0, then {K(t), t ≥ 0} ≤st-N {N(t), t ≥ 0}.
(iii) If M_K^{-1}(MN(t)) − t is increasing in t ≥ 0, then {K(t), t ≥ 0} ≤st-∞ {N(t), t ≥ 0}.

In the following example, parts (i) and (iii) of Theorem 6.B.40 are restated in the terminology of Examples 1.B.24, 1.C.48, 2.A.22, 3.B.38, 4.B.14, 6.D.8, 6.E.13, and 7.B.13.

Example 6.B.41. Let X and Y be two absolutely continuous nonnegative random variables with survival functions F̄ and Ḡ, respectively. Denote Λ1 = − log F̄ and Λ2 = − log Ḡ. Consider two nonhomogeneous Poisson processes N1 = {N1(t), t ≥ 0} and N2 = {N2(t), t ≥ 0} with mean functions Λ1 and Λ2 (see Example 1.B.13), respectively. Let Ti,1, Ti,2, . . . be the successive epoch times of process Ni, i = 1, 2. Note that X =st T1,1 and Y =st T2,1. It turns out that the usual stochastic ordering of the first epoch times of the two processes implies the multivariate usual stochastic ordering of all the corresponding later epoch times. Explicitly, part (i) of Theorem 6.B.40 says that if X ≤st Y, then (T1,1, T1,2, . . . , T1,n) ≤st (T2,1, T2,2, . . . , T2,n), n ≥ 1. Now let Xi,n ≡ Ti,n − Ti,n−1, n ≥ 1 (where Ti,0 ≡ 0), be the inter-epoch times of the process Ni, i = 1, 2. Part (iii) of Theorem 6.B.40 says that if X ≤disp Y, then (X1,1, X1,2, . . . , X1,n) ≤st (X2,1, X2,2, . . . , X2,n), n ≥ 1.
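The two conclusions of Example 6.B.41 can be made concrete with a time-change coupling. This rests on a standard fact that the text does not state: if Γ1 < Γ2 < · · · are the epochs of a unit-rate Poisson process, then Λ^{-1}(Γn) are the epochs of a nonhomogeneous Poisson process with mean function Λ. The Python sketch below assumes the illustrative mean functions Λ1(t) = t + t² and Λ2(t) = t. With this choice X ≤st Y because Λ1 ≥ Λ2, and X ≤disp Y because Λ2^{-1}(Λ1(t)) − t = t² is increasing. All function names are hypothetical.

```python
import math
import random


def unit_rate_epochs(n, rng):
    """First n epochs of a unit-rate Poisson process (partial sums of Exp(1) gaps)."""
    epochs, total = [], 0.0
    for _ in range(n):
        total += rng.expovariate(1.0)
        epochs.append(total)
    return epochs


def inv_lambda1(y):
    """Inverse of the assumed mean function Lambda_1(t) = t + t**2 on t >= 0."""
    return (-1.0 + math.sqrt(1.0 + 4.0 * y)) / 2.0


def inv_lambda2(y):
    """Inverse of the assumed mean function Lambda_2(t) = t."""
    return y


def gaps(epochs):
    """Inter-epoch times, with the convention that the epoch before the first is 0."""
    return [b - a for a, b in zip([0.0] + epochs[:-1], epochs)]


rng = random.Random(3)
gamma = unit_rate_epochs(50, rng)
t1 = [inv_lambda1(g) for g in gamma]  # epochs of N1, coupled to the same gamma
t2 = [inv_lambda2(g) for g in gamma]  # epochs of N2

epochs_ordered = all(a <= b for a, b in zip(t1, t2))
gaps_ordered = all(a <= b + 1e-9 for a, b in zip(gaps(t1), gaps(t2)))
```

Both comparisons hold on every path of this coupling, which is the kind of almost sure construction that yields the ≤st statements through Theorem 6.B.1.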

6.C The Cumulative Hazard Order

6.C.1 Definition

Let T = (T1, T2, . . . , Tm) be a nonnegative random vector with an absolutely continuous distribution function. In this section, as in Section 6.B.6, it is helpful to think about T1, T2, . . . , Tm as the lifetimes of m components 1, 2, . . . , m that make up some system. Consider a typical "history" of T at time t ≥ 0, which is of the form (see (6.B.25))

\[
h_t = \{T_I = t_I,\ T_{\bar I} > t e\}, \quad 0e \le t_I \le t e,\ I \subseteq \{1, 2, \dots, m\}. \tag{6.C.1}
\]

Given the history ht in (6.C.1), let i ∈ Ī be a component that is still alive at time t. Its multivariate conditional hazard rate, at time t, is defined as follows:

\[
\lambda_{i|I}(t \mid t_I) = \lim_{\Delta t \downarrow 0} \frac{1}{\Delta t}\, P\{t < T_i \le t + \Delta t \mid T_I = t_I,\ T_{\bar I} > t e\}, \tag{6.C.2}
\]

where, of course, 0e ≤ tI ≤ te, and I ⊆ {1, 2, . . . , m}. As long as the item is alive it accumulates hazard at the rate of λ_{i|I}(t | tI) at time t. If I = {i1, i2, . . . , ik} and


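t_{i1} ≤ t_{i2} ≤ · · · ≤ t_{ik}, then the cumulative hazard of component i ∈ Ī at time t is

\[
\begin{aligned}
\Psi_{i|i_1,i_2,\dots,i_k}(t \mid t_{i_1}, t_{i_2}, \dots, t_{i_k})
&= \int_0^{t_{i_1}} \lambda_{i|\emptyset}(u \mid t_\emptyset)\,du
+ \sum_{j=2}^{k} \int_{t_{i_{j-1}}}^{t_{i_j}} \lambda_{i|i_1,i_2,\dots,i_{j-1}}(u \mid t_{i_1}, t_{i_2}, \dots, t_{i_{j-1}})\,du \\
&\quad + \int_{t_{i_k}}^{t} \lambda_{i|i_1,i_2,\dots,i_k}(u \mid t_{i_1}, t_{i_2}, \dots, t_{i_k})\,du.
\end{aligned} \tag{6.C.3}
\]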
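Formula (6.C.3) adds up the hazard accumulated on each interval between successive failures. The Python sketch below evaluates it in a deliberately simple special case that is an assumption of this illustration, not a model from the text: a hypothetical load-sharing system in which the conditional hazard rate of a surviving component is constant between failures and depends only on how many components have failed so far.

```python
def cumulative_hazard(t, failure_times, rate_given_failed):
    """Evaluate (6.C.3) for a component still alive at time t.

    failure_times     -- failure times of the components that failed by time t
    rate_given_failed -- hypothetical model: rate_given_failed(k) is the constant
                         conditional hazard rate while exactly k components have failed
    """
    times = sorted(failure_times)
    if times and times[-1] > t:
        raise ValueError("all recorded failures must occur by time t")
    total, start = 0.0, 0.0
    # Interval [start, end) is the period during which exactly k components have failed.
    for k, end in enumerate(times):
        total += rate_given_failed(k) * (end - start)
        start = end
    # Last term of (6.C.3): from the most recent failure up to time t.
    total += rate_given_failed(len(times)) * (t - start)
    return total


def load_sharing_rate(k):
    """Illustrative rate: each failure adds one unit of hazard to the survivors."""
    return 1.0 + k


psi = cumulative_hazard(4.0, [1.0, 2.5], load_sharing_rate)
```

With failures at times 1.0 and 2.5 the three terms are 1 × 1.0, 2 × 1.5, and 3 × 1.5, so `psi` equals 8.5.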

Let S = (S1, S2, . . . , Sm) be another nonnegative random vector with an absolutely continuous distribution function and with cumulative hazard functions Φ_{·|·}(· | ·), which are defined analogously to the Ψ's in (6.C.3). Select two integers j and l such that j ≤ l ≤ m. Let t1, t2, . . . , tj and s1, . . . , sj, . . . , sl be such that 0 ≤ t1 ≤ t2 ≤ · · · ≤ tj, and 0 ≤ si ≤ ti, i = 1, 2, . . . , j, and si ≥ 0, i = j + 1, . . . , l. Let s_{k1} ≤ s_{k2} ≤ · · · ≤ s_{kl} be the ordered si's. If for any integer α > l we have

\[
\Phi_{\alpha|k_1,k_2,\dots,k_l}(u \mid s_{k_1}, s_{k_2}, \dots, s_{k_l}) \ge \Psi_{\alpha|1,2,\dots,j}(u \mid t_1, t_2, \dots, t_j) \tag{6.C.4}
\]

whenever u ≥ max{tj, sj+1, sj+2, . . . , sl}, and if the same holds with 1, 2, . . . , l replaced by π1, π2, . . . , πl for every permutation π of (1, 2, . . . , m), then S is said to be smaller than T in the cumulative hazard order (denoted as S ≤ch T).

The order ≤ch is not an order in the usual sense; a comment, similar to the comment in Remark 6.B.5, applies to this order too. Explicitly, X ≤ch X means that X has the positive dependence property of "supporting lifetimes" discussed in Norros [437] and in Shaked and Shanthikumar [511].

Condition (6.C.4) simply states that at any time u the cumulative hazard of Sα is larger than the cumulative hazard of Tα whenever the history of the components corresponding to S is more "severe" than the history of the components corresponding to T. Thus (6.C.4) can be written as (see Section 6.B.6 for the definition of histories and for the definition of their comparison)

\[
\Phi_\alpha(h_u) \ge \Psi_\alpha(h'_u) \quad \text{whenever } h_u \ge h'_u,
\]

where α denotes a component that has not failed by time u in the history hu. In the univariate case (that is, m = 1) condition (6.C.4) simply says that − log P{S1 > u} ≥ − log P{T1 > u}. Therefore, in the univariate case S1 ≤ch T1 ⇐⇒ S1 ≤st T1. Thus, if the components of S are independent, and if the components of T are independent, then S ≤ch T ⇐⇒ S ≤st T. In the general multivariate case the two orders are not equivalent, but it will be shown below that if S ≤ch T, then S ≤st T.


6.C.2 The relationship between the cumulative hazard order and the usual multivariate stochastic order

The total hazard accumulated by the failure time $T_i$, given that $T_i$ was the time of the $k$th failure and that the previous failure times were $T_{j_1}, T_{j_2}, \ldots, T_{j_{k-1}}$, is $\Psi_{i|j_1,j_2,\ldots,j_{k-1}}(T_i \mid T_{j_1}, T_{j_2}, \ldots, T_{j_{k-1}})$. It can be shown that the total hazards accumulated by the failure times $T_i$ are independent standard (that is, mean one) exponential random variables. This fact motivates the following total hazard construction, which is of independent interest; we use it here to show that if $S \le_{\rm ch} T$, then $S \le_{\rm st} T$.

The idea of the construction is as follows. The components accumulate hazard as long as they are alive, with the rates given in (6.C.2). Each one of them dies when its accumulated hazard crosses a random threshold. The random thresholds are independent standard exponential random variables. Thus, by continuously comparing the accumulated hazards to the independent exponential random thresholds, it is possible to determine the times at which the accumulated hazards cross the respective thresholds, and these times have the desired distribution. From this heuristic description it is seen that the multivariate conditional cumulative hazard functions, given in (6.C.3), determine the distribution of the generated random variables. This, indeed, is well known.

Let $T = (T_1, T_2, \ldots, T_m)$ be a nonnegative random vector with an absolutely continuous distribution function. Given the functions $\Psi_{\cdot|\cdot}(\cdot \mid \cdot)$ that are associated with $T$, as described in (6.C.3), we now describe how to generate a random vector $\hat T = (\hat T_1, \hat T_2, \ldots, \hat T_m)$ such that $\hat T =_{\rm st} T$. Let $X_1, X_2, \ldots, X_m$ be independent standard exponential random variables. The total hazard construction will be described in $m$ steps.

Step 1. In this step we determine the identity $i_1$ of the component that fails first and its time of failure $\hat T_{i_1}$. This is determined by

$$\hat T_{i_1} = \min\{\tilde T_1, \tilde T_2, \ldots, \tilde T_m\},$$

where

$$\tilde T_j = \min\{t \ge 0 : \Psi_{j|\emptyset}(t \mid \emptyset) \ge X_j\}, \qquad j = 1, 2, \ldots, m,$$

and $i_1$ is the index of the smallest $\tilde T_j$.

Step k ($k = 2, 3, \ldots, m$). Suppose that Steps $1, 2, \ldots, k-1$ have already yielded $\hat T_{i_1}, \hat T_{i_2}, \ldots, \hat T_{i_{k-1}}$. Let $I = \{i_1, i_2, \ldots, i_{k-1}\}$ and denote its complement by $\bar I = \{j_1, j_2, \ldots, j_{m-k+1}\}$. In this step we determine the identity $i_k$ of the component that is the $k$th one to fail and its failure time $\hat T_{i_k}$. This is determined by

$$\hat T_{i_k} = \min\{\tilde T_{j_1}, \tilde T_{j_2}, \ldots, \tilde T_{j_{m-k+1}}\},$$

where here, for $j \in \bar I$,

$$\tilde T_j = \min\{t \ge \hat T_{i_{k-1}} : \Psi_{j|i_1,i_2,\ldots,i_{k-1}}(t \mid \hat T_{i_1}, \hat T_{i_2}, \ldots, \hat T_{i_{k-1}}) \ge X_j\},$$

and $i_k$ is the index of the smallest $\tilde T_j$, $j \in \bar I$.

It can be shown that indeed $\hat T =_{\rm st} T$.

Let $S = (S_1, S_2, \ldots, S_m)$ be another nonnegative random vector with an absolutely continuous distribution function and multivariate conditional cumulative hazard functions $\Phi_{\cdot|\cdot}(\cdot \mid \cdot)$. Using the same independent standard exponential random variables $X_1, X_2, \ldots, X_m$, construct $\hat S = (\hat S_1, \hat S_2, \ldots, \hat S_m)$ using the total hazard construction described above. Thus $\hat S$ and $\hat T$ are constructed on the same probability space and they satisfy $\hat T =_{\rm st} T$ and $\hat S =_{\rm st} S$. Also, if (6.C.4) holds, that is, if $S \le_{\rm ch} T$, then it is clear that $P\{\hat S \le \hat T\} = 1$. Thus, from Theorem 6.B.1, we see that we have proved the following theorem.

Theorem 6.C.1. Let $S$ and $T$ be two nonnegative random vectors with absolutely continuous distribution functions. If $S \le_{\rm ch} T$, then $S \le_{\rm st} T$.

It is worth mentioning that the total hazard construction is theoretically and practically different from the standard construction discussed in Section 6.B.3. In the standard construction the uniform random variables $U_1, U_2, \ldots, U_n$, which are used to generate the desired $\hat T_1, \hat T_2, \ldots, \hat T_n$, can be used sequentially; that is, $U_i$ can be used to generate $\hat T_i$ once $\hat T_1, \hat T_2, \ldots, \hat T_{i-1}$ have already been generated, $i = 1, 2, \ldots, n$. On the other hand, in the total hazard construction, the exponential random variables $X_1, X_2, \ldots, X_m$ are all used simultaneously in the generation of each $\hat T_i$.

Remark 6.C.2. Looking at Step 1 of the total hazard construction it is seen that it can be split into two substeps. First the value of the first order statistic, $\hat T_{(1)}$ say, of the $\hat T_j$'s is determined, and then the identity (index) of $\hat T_{(1)}$ is selected. Similarly, Step k can be split into two substeps. Suppose now that $T = (T_1, T_2, \ldots, T_m)$ is a vector of independent exponential random variables with possibly different parameters. Then $\hat T = (\hat T_1, \hat T_2, \ldots, \hat T_m)$ is also such a vector. Furthermore, $\hat T_{(1)}$ is also an exponential random variable. If it is known that $\hat T_{(1)} = s_1$ say, and if the identity of the smallest $\hat T_j$ is also known, then, conditionally, the residual lives of the remaining $m - 1$ components are independent exponential random variables, and they do not depend on $s_1$. If the identity of the smallest $\hat T_j$ is not known, then the conditional distribution of the residual lives of the remaining $m - 1$ components is a mixture of distributions of independent exponential random variables, and it still does not depend on $s_1$ (notice that the probabilities of the mixture do not depend on $s_1$). Therefore the conditional distribution of $(\hat T_{(2)} - s_1, \hat T_{(3)} - s_1, \ldots, \hat T_{(m)} - s_1)$, given $\hat T_{(1)} = s_1$, does not depend on $s_1$. It follows that $[(\hat T_{(1)}, \hat T_{(2)}, \ldots, \hat T_{(m)}) \mid \hat T_{(1)} = s_1]$ is stochastically increasing in $s_1$. Since $T =_{\rm st} \hat T$ we obtain a proof of Theorem 6.B.13.
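The total hazard construction of Section 6.C.2 can be sketched in code for a simplified setting. The sketch below is only an illustration under a strong assumption that is not part of the text: the conditional hazard rate of a surviving component is constant between failures and depends only on the set of components that have already failed (a load-sharing type model). The function `total_hazard_construction` and the rate functions `lam` and `eta` are hypothetical names introduced here.

```python
import random

def total_hazard_construction(rate, m, thresholds):
    """Failure times of m components from exponential thresholds.

    rate(j, failed) is the (assumed constant) hazard rate of the alive
    component j, given the frozenset `failed` of components already dead.
    thresholds[j] plays the role of the standard exponential variable X_j.
    """
    acc = [0.0] * m              # hazard accumulated so far by each component
    alive = set(range(m))
    failed = []
    times = [None] * m
    now = 0.0
    while alive:
        fs = frozenset(failed)
        # Time until each alive component's accumulated hazard hits its threshold.
        cross = {j: (thresholds[j] - acc[j]) / rate(j, fs) for j in alive}
        nxt = min(cross, key=cross.get)   # identity of the next component to fail
        dt = cross[nxt]
        for j in alive:                   # everyone alive keeps accumulating hazard
            acc[j] += rate(j, fs) * dt
        now += dt
        times[nxt] = now
        alive.remove(nxt)
        failed.append(nxt)
    return times

# Hypothetical rates: both increase with the number of failures, and eta >= lam.
def lam(j, failed):
    return 1.0 + 0.5 * len(failed)

def eta(j, failed):
    return 2.0 + 1.0 * len(failed)

if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.expovariate(1.0) for _ in range(4)]   # shared thresholds X_1..X_m
    s_hat = total_hazard_construction(eta, 4, xs)
    t_hat = total_hazard_construction(lam, 4, xs)
    print(s_hat, t_hat)
```

Feeding the same thresholds to both rate functions mirrors the coupling used to prove Theorem 6.C.1: with the larger rates `eta`, every component crosses its threshold no later than with `lam`.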


6.D Multivariate Hazard Rate Orders

6.D.1 Definitions and basic properties

The following notation will be used below. For any two real numbers $x$ and $y$ we denote $x \vee y = \max\{x, y\}$ and $x \wedge y = \min\{x, y\}$. If $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ are two vectors in $\mathbb{R}^n$, then we denote $x \vee y = (x_1 \vee y_1, x_2 \vee y_2, \ldots, x_n \vee y_n)$ and $x \wedge y = (x_1 \wedge y_1, x_2 \wedge y_2, \ldots, x_n \wedge y_n)$.

Let $X = (X_1, X_2, \ldots, X_n)$ and $Y = (Y_1, Y_2, \ldots, Y_n)$ be two random vectors with respective survival functions $F$ and $G$ defined by $F(x) = P\{X > x\}$ and $G(x) = P\{Y > x\}$, $x \in \mathbb{R}^n$. We say that $X$ is smaller than $Y$ in the multivariate hazard rate order (denoted by $X \le_{\rm hr} Y$) if

$$F(x)G(y) \le F(x \wedge y)G(x \vee y) \quad \text{for every } x \text{ and } y \text{ in } \mathbb{R}^n. \tag{6.D.1}$$

We say that $X$ is smaller than $Y$ in the weak multivariate hazard rate order (denoted by $X \le_{\rm whr} Y$) if

$$\frac{G(x)}{F(x)} \quad \text{is increasing in } x \in \{x : G(x) > 0\}, \tag{6.D.2}$$

where in (6.D.2) we use the convention $a/0 \equiv \infty$ whenever $a > 0$. Note that (6.D.2) can be written equivalently as

$$F(y)G(x) \le F(x)G(y) \quad \text{whenever } x \le y. \tag{6.D.3}$$

Thus, from (6.D.1) and (6.D.3) it follows that

$$X \le_{\rm hr} Y \implies X \le_{\rm whr} Y. \tag{6.D.4}$$
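Conditions (6.D.1) and (6.D.3) can be checked numerically on a finite grid. The following is a minimal sketch, not a proof: it only tests the inequalities at the supplied points, and the helper names (`hr_holds`, `whr_holds`) as well as the choice of independent exponential components are assumptions made for the illustration.

```python
import math
from itertools import product

def vmin(x, y):
    """Componentwise minimum x ∧ y."""
    return tuple(min(a, b) for a, b in zip(x, y))

def vmax(x, y):
    """Componentwise maximum x ∨ y."""
    return tuple(max(a, b) for a, b in zip(x, y))

def hr_holds(F, G, pts, tol=1e-12):
    """Check (6.D.1) for all pairs of grid points."""
    return all(F(x) * G(y) <= F(vmin(x, y)) * G(vmax(x, y)) + tol
               for x in pts for y in pts)

def whr_holds(F, G, pts, tol=1e-12):
    """Check (6.D.3) for all grid pairs with x <= y."""
    return all(F(y) * G(x) <= F(x) * G(y) + tol
               for x in pts for y in pts
               if all(a <= b for a, b in zip(x, y)))

# Survival functions (on the nonnegative orthant) of vectors with independent
# exponential components: X has rates (2, 3), Y has the smaller rates (1, 1.5).
F = lambda x: math.exp(-2.0 * x[0] - 3.0 * x[1])
G = lambda x: math.exp(-1.0 * x[0] - 1.5 * x[1])

pts = list(product([0.0, 0.5, 1.0, 2.0], repeat=2))
print(hr_holds(F, G, pts), whr_holds(F, G, pts), hr_holds(G, F, pts))
```

On this grid the check is consistent with (6.D.4): the pair that passes `hr_holds` also passes `whr_holds`, while reversing the roles of `F` and `G` fails.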

Note that from (6.D.3) it follows that if $y \in \{x : G(x) = 0\}$, then $y \in \{x : F(x) = 0\}$. That is, if $X \le_{\rm whr} Y$, then $\{x : F(x) > 0\} \subseteq \{x : G(x) > 0\}$.

It can be shown that the implication (6.D.4) is strict. However, when at least one of the survival functions of $X$ and of $Y$ is MTP2 (recall from Karlin and Rinott [278] that a function $K : \mathbb{R}^n \to \mathbb{R}_+$ is said to be multivariate totally positive of order 2 (MTP2) if $K(x)K(y) \le K(x \wedge y)K(x \vee y)$ for all $x, y \in \mathbb{R}^n$), then, under some regularity conditions, the orders $\le_{\rm hr}$ and $\le_{\rm whr}$ are equivalent. This is shown next. Recall that a set $S \subseteq \mathbb{R}^n$ is called a lattice if for all $x, y$ in $S$ we have that $x \wedge y$ and $x \vee y$ are in $S$.

Theorem 6.D.1. Let $X$ and $Y$ be two random vectors with respective survival functions $F$ and $G$, and with a common support $S$ which is a lattice. If $F$ and/or $G$ are/is MTP2, then

$$X \le_{\rm whr} Y \implies X \le_{\rm hr} Y. \tag{6.D.5}$$

Proof. Note that the left-hand side of the implication (6.D.5) implies

$$F(x \vee y)G(y) \le F(y)G(x \vee y), \qquad x, y \in \mathbb{R}^n,$$

and that the MTP2-ness of $F$ implies

$$F(x)F(y) \le F(x \wedge y)F(x \vee y), \qquad x, y \in \mathbb{R}^n.$$

Multiplication of these two inequalities yields

$$F(x \vee y)G(y)F(x)F(y) \le F(y)G(x \vee y)F(x \wedge y)F(x \vee y).$$

Now, from the assumption that $S$ is a lattice it follows that if $F(x)G(y) > 0$, then $F(y)$ and $F(x \vee y)$ are positive. Canceling these we obtain that (6.D.1) holds in this case. If $F(x)G(y) = 0$, then (6.D.1) obviously holds too. Therefore $X \le_{\rm hr} Y$. In a similar manner the implication (6.D.5) can be shown when $G$ is MTP2.

The order $\le_{\rm hr}$ is not an order in the usual sense (that is, it is not reflexive) because from (6.D.1) it follows that $X \le_{\rm hr} X \iff P\{X > x\}$ is MTP2.

Consider now a random vector $X = (X_1, X_2, \ldots, X_n)$ with a partially differentiable survival function $F$. Let $r_X = (r_X^{(1)}, r_X^{(2)}, \ldots, r_X^{(n)})$ be its hazard gradient as defined in (1.B.28). Let $Y$ be another $n$-dimensional random vector with hazard gradient $r_Y = (r_Y^{(1)}, r_Y^{(2)}, \ldots, r_Y^{(n)})$. The following result, which can be obtained by differentiation of (6.D.2), justifies the terminology "hazard rate order" for the orders that were introduced in (6.D.1) and (6.D.2).

Theorem 6.D.2. Let $X$ and $Y$ be $n$-dimensional random vectors with hazard gradients $r_X$ and $r_Y$, respectively. Then $X \le_{\rm whr} Y$ if, and only if,

$$r_X^{(i)}(x) \ge r_Y^{(i)}(x), \qquad i = 1, 2, \ldots, n, \quad x \in \mathbb{R}^n.$$
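The differentiation step behind Theorem 6.D.2 can be written out as follows. This is a sketch that assumes the hazard gradient in (1.B.28) has the usual form $r_X^{(i)}(x) = -\partial \log F(x)/\partial x_i$, and that both survival functions are positive and partially differentiable at $x$.

```latex
% Assumption: r_X^{(i)}(x) = -\partial \log F(x) / \partial x_i (and similarly for Y),
% where F and G are the survival functions of X and Y.
\[
\frac{\partial}{\partial x_i} \log \frac{G(x)}{F(x)}
  = \frac{\partial}{\partial x_i} \log G(x) - \frac{\partial}{\partial x_i} \log F(x)
  = r_X^{(i)}(x) - r_Y^{(i)}(x),
  \qquad i = 1, 2, \ldots, n .
\]
```

Under these assumptions, $G(x)/F(x)$ is increasing in each coordinate, as (6.D.2) requires, exactly when every one of these partial derivatives is nonnegative, which is the componentwise inequality of the theorem.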

A useful inequality is described next; we omit its proof. Theorem 6.D.3. Let X = (X1 , X2 , . . . , Xn ) be a random vector, and let X I = (Y1 , Y2 , . . . , Yn ) be a vector of independent random variables such that Xi =st Yi , i = 1, 2, . . . , n. If the survival function of X is MTP2 , then X I ≤hr X. The relation X ≤hr Y does not necessarily imply X ≤st Y , where ≤st denotes the usual multivariate stochastic order discussed in Section 6.B. However, a generalization of the univariate Theorem 1.B.1 is given in (6.G.10) in Section 6.G.1. Theorem 6.G.9 is a multivariate generalization of (1.B.7).


6.D.2 Preservation properties

The orders $\le_{\rm hr}$ and $\le_{\rm whr}$ are closed under some common operations.

Theorem 6.D.4. (a) Let $(X_1, X_2, \ldots, X_n)$ and $(Y_1, Y_2, \ldots, Y_n)$ be two $n$-dimensional random vectors. If $(X_1, X_2, \ldots, X_n) \le_{\rm hr} [\le_{\rm whr}]\ (Y_1, Y_2, \ldots, Y_n)$, then $(g_1(X_1), g_2(X_2), \ldots, g_n(X_n)) \le_{\rm hr} [\le_{\rm whr}]\ (g_1(Y_1), g_2(Y_2), \ldots, g_n(Y_n))$ whenever $g_i : \mathbb{R} \to \mathbb{R}$ is an increasing function, $i = 1, 2, \ldots, n$.
(b) Let $X_1, X_2, \ldots, X_m$ be a set of independent random vectors where the dimension of $X_i$ is $k_i$, $i = 1, 2, \ldots, m$. Let $Y_1, Y_2, \ldots, Y_m$ be another set of independent random vectors where the dimension of $Y_i$ is $k_i$, $i = 1, 2, \ldots, m$. If $X_i \le_{\rm hr} [\le_{\rm whr}]\ Y_i$ for $i = 1, 2, \ldots, m$, then $(X_1, X_2, \ldots, X_m) \le_{\rm hr} [\le_{\rm whr}]\ (Y_1, Y_2, \ldots, Y_m)$. That is, the multivariate hazard rate orders are closed under conjunctions.
(c) Let $X = (X_1, X_2, \ldots, X_n)$ and $Y = (Y_1, Y_2, \ldots, Y_n)$ be two $n$-dimensional random vectors. If $X \le_{\rm hr} [\le_{\rm whr}]\ Y$, then $X_I \le_{\rm hr} [\le_{\rm whr}]\ Y_I$ for each $I \subseteq \{1, 2, \ldots, n\}$. That is, the multivariate hazard rate orders are closed under marginalization.
(d) Let $\{X_j, j = 1, 2, \ldots\}$ and $\{Y_j, j = 1, 2, \ldots\}$ be two sequences of random vectors such that $X_j \to_{\rm st} X$ and $Y_j \to_{\rm st} Y$ as $j \to \infty$, where $\to_{\rm st}$ denotes convergence in distribution. If $X_j \le_{\rm hr} [\le_{\rm whr}]\ Y_j$, $j = 1, 2, \ldots$, then $X \le_{\rm hr} [\le_{\rm whr}]\ Y$.

We will now describe some preservation properties of the multivariate hazard rate orders under random compositions. Let $F_\theta$, $\theta \in \mathcal{X}$, be a family of $n$-dimensional survival functions, where $\mathcal{X}$ is a subset of the real line. Let $X(\theta)$ denote a random vector with survival function $F_\theta$. For any random variable $\Theta$ with support in $\mathcal{X}$, and with distribution function $H$, let us denote by $X(\Theta)$ a random vector with survival function $G$ given by

$$G(x) = \int_{\mathcal{X}} F_\theta(x)\,dH(\theta), \qquad x \in \mathbb{R}^n.$$

Theorem 6.D.5. Let $F_\theta$, $\theta \in \mathcal{X}$, be a family of $n$-dimensional survival functions as above. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $H_1$ and $H_2$, respectively. Let $Y_1$ and $Y_2$ be two random vectors such that $Y_i =_{\rm st} X(\Theta_i)$, $i = 1, 2$; that is, suppose that the survival function of $Y_i$ is given by

$$G_i(x) = \int_{\mathcal{X}} F_\theta(x)\,dH_i(\theta), \qquad x \in \mathbb{R}^n, \quad i = 1, 2.$$

If

$$X(\theta) \le_{\rm whr} X(\theta') \quad \text{whenever } \theta \le \theta', \tag{6.D.6}$$

and if $\Theta_1$ and $\Theta_2$ are ordered in the univariate hazard rate order; that is, if

$$\Theta_1 \le_{\rm hr} \Theta_2, \tag{6.D.7}$$

then

$$Y_1 \le_{\rm whr} Y_2. \tag{6.D.8}$$

Proof. Assumption (6.D.6) means that for each $j \in \{1, 2, \ldots, n\}$, the function $F_\theta(x_1, x_2, \ldots, x_n)$ is TP2 (totally positive of order 2; that is, bivariate MTP2) as a function of $\theta \in \mathcal{X}$ and of $x_j \in \mathbb{R}$. Assumption (6.D.7) means that $\bar H_i(\theta)$, the survival function $1 - H_i(\theta)$ of $\Theta_i$, is TP2 as a function of $i \in \{1, 2\}$ and of $\theta \in \mathcal{X}$. Therefore, by Theorem 2.1 of Joag-Dev, Kochar, and Proschan [259], $G_i(x_1, x_2, \ldots, x_n)$ is TP2 in $i \in \{1, 2\}$ and in $x_j \in \mathbb{R}$, $j = 1, 2, \ldots, n$. That is,

$$\frac{G_2(x_1, x_2, \ldots, x_n)}{G_1(x_1, x_2, \ldots, x_n)} \quad \text{is increasing in } x_j, \qquad j = 1, 2, \ldots, n.$$

By (6.D.2), this yields the stated result.
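Theorem 6.D.5 can be illustrated on a small discrete mixture. In the sketch below everything concrete is an assumption chosen for the illustration: the family $F_\theta(x) = \exp\{-(x_1 + x_2)/\theta\}$ on the nonnegative orthant, the three parameter values, and the two mixing distributions `P1` and `P2` (for which $P\{\Theta_2 \ge k\}/P\{\Theta_1 \ge k\}$ is increasing in $k$). The check is made on a finite grid only.

```python
import math
from itertools import product

THETAS = (1.0, 2.0, 3.0)
P1 = (0.5, 0.3, 0.2)   # probability function of Theta_1 on THETAS
P2 = (0.2, 0.3, 0.5)   # probability function of Theta_2 on THETAS

def F_theta(theta, x):
    """Survival function of two independent exponentials with mean theta (x >= 0)."""
    return math.exp(-sum(x) / theta)

def mixture_survival(pmf, x):
    """G_i(x) = sum over theta of F_theta(x) * P{Theta_i = theta}."""
    return sum(p * F_theta(th, x) for p, th in zip(pmf, THETAS))

def whr_holds(F, G, pts, tol=1e-12):
    """Check F(y)G(x) <= F(x)G(y) for all grid pairs with x <= y, as in (6.D.3)."""
    return all(F(y) * G(x) <= F(x) * G(y) + tol
               for x in pts for y in pts
               if all(a <= b for a, b in zip(x, y)))

pts = list(product([0.0, 0.5, 1.0, 2.0], repeat=2))
G1 = lambda x: mixture_survival(P1, x)
G2 = lambda x: mixture_survival(P2, x)
print(whr_holds(G1, G2, pts), whr_holds(G2, G1, pts))
```

Here $F_{\theta'}/F_\theta$ is increasing in each coordinate when $\theta \le \theta'$, so (6.D.6) holds for this family, and the grid check of $Y_1 \le_{\rm whr} Y_2$ agrees with (6.D.8).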

In the case where $Y_1$ and $Y_2$ in Theorem 6.D.5 are vectors of conditionally independent random variables, the conclusion (6.D.8) can be strengthened. For this purpose, consider $n$ families of univariate survival functions $F_{j,\theta}$, $\theta \in \mathcal{X}$, $j = 1, 2, \ldots, n$, where $\mathcal{X}$ is a subset of the real line. Let $X_j(\theta)$ denote a univariate random variable with survival function $F_{j,\theta}$. For any random variable $\Theta$ with support in $\mathcal{X}$, and with distribution function $H$, let $X_j(\Theta)$ denote a univariate random variable with survival function given by

$$\int_{\mathcal{X}} F_{j,\theta}(x)\,dH(\theta), \qquad x \in \mathbb{R}, \quad j = 1, 2, \ldots, n.$$

Theorem 6.D.6. Let $F_{j,\theta}$, $\theta \in \mathcal{X}$, be $n$ families of univariate survival functions as above, $j = 1, 2, \ldots, n$. Assume that for each $j = 1, 2, \ldots, n$, the univariate supports corresponding to all the $F_{j,\theta}$'s are identical, $\mathcal{Y}_j$, say. Let $\Theta_1$ and $\Theta_2$ be two random variables with supports in $\mathcal{X}$ and distribution functions $H_1$ and $H_2$, respectively. Let $Y_1 = (Y_{11}, Y_{12}, \ldots, Y_{1n})$ and $Y_2 = (Y_{21}, Y_{22}, \ldots, Y_{2n})$ be two vectors of conditionally independent random variables such that $Y_{ij} =_{\rm st} X_j(\Theta_i)$, $i = 1, 2$, $j = 1, 2, \ldots, n$; that is, suppose that the survival function of $Y_i$ is given by

$$G_i(x_1, x_2, \ldots, x_n) = \int_{\mathcal{X}} \prod_{j=1}^{n} F_{j,\theta}(x_j)\,dH_i(\theta), \qquad (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n, \quad i = 1, 2. \tag{6.D.9}$$

If

$$X_j(\theta) \le_{\rm hr} X_j(\theta') \quad \text{whenever } \theta \le \theta', \qquad j = 1, 2, \ldots, n, \tag{6.D.10}$$

and if $\Theta_1 \le_{\rm hr} \Theta_2$, then $Y_1 \le_{\rm hr} Y_2$.

Proof. Let $\theta \le \theta'$. From assumption (6.D.10), from the conditional independence of the $X_j(\theta)$'s, and from the conditional independence of the $X_j(\theta')$'s, it follows by Theorem 6.D.4(b) that

$$(X_1(\theta), X_2(\theta), \ldots, X_n(\theta)) \le_{\rm hr} (X_1(\theta'), X_2(\theta'), \ldots, X_n(\theta')) \quad \text{whenever } \theta \le \theta'.$$

Therefore, by Theorem 6.D.5 we get

$$Y_1 \le_{\rm whr} Y_2. \tag{6.D.11}$$

Next, it is easy to verify that $G_i$ in (6.D.9) is TP2 in each pair of its variables when the other variables are held fixed, $i = 1, 2$. Therefore $G_i$ is MTP2, $i = 1, 2$. Furthermore, from the assumption that for $j = 1, 2, \ldots, n$, all the $F_{j,\theta}$'s have a corresponding univariate common support $\mathcal{Y}_j$, it follows that $Y_1$ and $Y_2$ have a common support which is a lattice. The stated result now follows from (6.D.11) and Theorem 6.D.1.

An interesting property of the order $\le_{\rm whr}$, for nonnegative random vectors, is given next; see Theorem 6.G.15 for a related result.

Theorem 6.D.7. Let $X = (X_1, X_2, \ldots, X_n)$ and $Y = (Y_1, Y_2, \ldots, Y_n)$ be two nonnegative random vectors. If $X \le_{\rm whr} Y$, then

$$\min\{a_1 X_1, \ldots, a_n X_n\} \le_{\rm hr} \min\{a_1 Y_1, \ldots, a_n Y_n\} \quad \text{whenever } a_i > 0,\ i = 1, 2, \ldots, n. \tag{6.D.12}$$

6.D.3 The dynamic multivariate hazard rate order

Let $T = (T_1, T_2, \ldots, T_m)$ be a nonnegative random vector with an absolutely continuous distribution function. Denote the multivariate conditional hazard rate functions of $T$ by $\lambda_{\cdot|\cdot}(\cdot \mid \cdot)$ as defined in (6.C.2). Clearly, the higher the multivariate conditional hazard rate functions are, the smaller $T$ should be stochastically. This is the motivation for the order discussed in this subsection.

Let $S = (S_1, S_2, \ldots, S_m)$ be another nonnegative random vector with an absolutely continuous distribution function. Denote its multivariate conditional hazard rate functions by $\eta_{\cdot|\cdot}(\cdot \mid \cdot)$, where the $\eta$'s are defined analogously to the $\lambda$'s in (6.C.2). Suppose that

$$\eta_{i|I \cup J}(u \mid s_I, s_J) \ge \lambda_{i|I}(u \mid t_I) \quad \text{whenever } J \cap I = \emptyset,\ s_I \le t_I \le ue, \text{ and } s_J \le ue, \tag{6.D.13}$$

where $i \in \overline{I \cup J}$, the complement of $I \cup J$. Then $S$ is said to be smaller than $T$ in the dynamic multivariate hazard rate order (denoted as $S \le_{\rm dyn\text{-}hr} T$).

The order $\le_{\rm dyn\text{-}hr}$ is not an order in the usual sense; a comment, similar to the comment in Remark 6.B.5, applies to this order too. Explicitly, $X \le_{\rm dyn\text{-}hr} X$ means that $X$ has the positive dependence property of "hazard rate increasing upon failures" discussed in Shaked and Shanthikumar [511].

Note that (6.D.13) can be written as (see Section 6.B.6 for the definition of histories and for the definition of their comparison)

$$\eta_i(h_u) \ge \lambda_i(h'_u) \quad \text{whenever } h_u \ge h'_u,$$

where $i$ denotes a component that has not failed by time $u$ in the history $h_u$.

The following example illustrates how the dynamic multivariate hazard rate order can be verified. This example may be compared with Examples 1.B.24, 1.C.48, 2.A.22, 3.B.38, 6.B.41, 6.E.13, and 7.B.13.

Example 6.D.8. Let $X$ and $Y$ be two absolutely continuous nonnegative random variables with survival functions $F$ and $G$, respectively. Denote $\Lambda_1 = -\log F$, $\Lambda_2 = -\log G$, and $\lambda_i = \Lambda_i'$, $i = 1, 2$. Consider two nonhomogeneous Poisson processes $N_1 = \{N_1(t), t \ge 0\}$ and $N_2 = \{N_2(t), t \ge 0\}$ with mean functions $\Lambda_1$ and $\Lambda_2$ (see Example 1.B.13), respectively. Let $T_{i,1}, T_{i,2}, \ldots$ be the successive epoch times of process $N_i$, $i = 1, 2$. Note that $X =_{\rm st} T_{1,1}$ and $Y =_{\rm st} T_{2,1}$. It turns out that the univariate hazard rate ordering of the first two epoch times implies the dynamic multivariate hazard rate ordering of the corresponding vectors of the later epoch times. Explicitly, it will be shown below that if $X \le_{\rm hr} Y$, then $(T_{1,1}, T_{1,2}, \ldots, T_{1,n}) \le_{\rm dyn\text{-}hr} (T_{2,1}, T_{2,2}, \ldots, T_{2,n})$ for each $n \ge 1$.

Fix an $n \ge 1$. Let $\eta_{\cdot|\cdot}(\cdot \mid \cdot)$ be the multivariate conditional hazard rate functions associated with $(T_{1,1}, T_{1,2}, \ldots, T_{1,n})$ and let $\zeta_{\cdot|\cdot}(\cdot \mid \cdot)$ be the multivariate conditional hazard rate functions associated with $(T_{2,1}, T_{2,2}, \ldots, T_{2,n})$. First let us obtain an explicit expression for $\zeta_{i|I}(u \mid t_I)$ under the restrictions on $t$ and $u$ in (6.D.13). Since $T_{2,1} \le T_{2,2} \le \cdots \le T_{2,n}$ a.s., it follows that $t_I$ in (6.D.13) can be a realization ("history") of observations up to time $u$ only if $I$ is of the form $I = \{1, 2, \ldots, m\}$ for some $m \ge 1$, or $I = \emptyset$ (that is, $m = 0$). Then we have

$$\zeta_{i|I}(u \mid t_I) = \begin{cases} \lambda_2(u), & \text{if } i = m + 1; \\ 0, & \text{if } i > m + 1; \end{cases} \qquad \text{where } I = \{1, 2, \ldots, m\}.$$

Next, let us obtain an explicit expression for $\eta_{i|I \cup J}(u \mid s_{I \cup J})$ under the restrictions on $s$, $t$, and $u$ in (6.D.13). Since $T_{1,1} \le T_{1,2} \le \cdots \le T_{1,n}$ a.s., we see that when $I = \{1, 2, \ldots, m\}$, then $s_{I \cup J}$ in (6.D.13) can be a realization of observations up to time $u$ only if $J$ is of the form $J = \{m + 1, m + 2, \ldots, k\}$ for some $k \ge m + 1$, or $J = \emptyset$ (that is, $k = m$). Then we have

$$\eta_{i|I \cup J}(u \mid s_{I \cup J}) = \begin{cases} \lambda_1(u), & \text{if } i = k + 1; \\ 0, & \text{if } i > k + 1; \end{cases}$$

where $I = \{1, 2, \ldots, m\}$ and $J = \{m + 1, m + 2, \ldots, k\}$.

Suppose that $X \le_{\rm hr} Y$. Since $i$ in (6.D.13) must satisfy $i \in \overline{I \cup J}$ (that is, $i > k$), we see that if $k > m$, then

$$\eta_{i|I \cup J}(u \mid s_{I \cup J}) = \lambda_1(u) \ge 0 = \zeta_{i|I}(u \mid t_I) \quad \text{if } i = k + 1;$$
$$\eta_{i|I \cup J}(u \mid s_{I \cup J}) = 0 = \zeta_{i|I}(u \mid t_I) \quad \text{if } i > k + 1;$$

so (6.D.13) holds with $\zeta_{\cdot|\cdot}(\cdot \mid \cdot)$ replacing $\lambda_{\cdot|\cdot}(\cdot \mid \cdot)$. If $k = m$ (that is, $J = \emptyset$), then, using $X \le_{\rm hr} Y$, we get

$$\eta_{i|I \cup J}(u \mid s_{I \cup J}) = \lambda_1(u) \ge \lambda_2(u) = \zeta_{i|I}(u \mid t_I) \quad \text{if } i = k + 1;$$
$$\eta_{i|I \cup J}(u \mid s_{I \cup J}) = 0 = \zeta_{i|I}(u \mid t_I) \quad \text{if } i > k + 1;$$

so (6.D.13), with $\zeta_{\cdot|\cdot}(\cdot \mid \cdot)$ replacing $\lambda_{\cdot|\cdot}(\cdot \mid \cdot)$, holds in this case too. Thus $(T_{1,1}, T_{1,2}, \ldots, T_{1,n}) \le_{\rm dyn\text{-}hr} (T_{2,1}, T_{2,2}, \ldots, T_{2,n})$.

It should be noted that in Example 1.B.24 it was shown that if $X \le_{\rm hr} Y$, then we have the univariate stochastic inequality $T_{1,n} \le_{\rm hr} T_{2,n}$ for each $n \ge 1$. This stochastic inequality does not follow from the above result because the dynamic multivariate hazard rate order is not closed under marginalization.

In the univariate case ($m = 1$) condition (6.D.13) reduces to (1.B.2) [with a different notation]. We have already seen that in the univariate case $S_1 \le_{\rm hr} T_1 \implies S_1 \le_{\rm st} T_1$. This is also true in the general dynamic multivariate case. In order to see it, note that if (6.D.13) holds, then (6.C.4) holds, where in (6.C.4) the functions $\Psi$ are defined by means of the functions $\lambda$ as in (6.C.3) and the functions $\Phi$ are analogously defined by means of the functions $\eta$. We thus have proven the following result.

Theorem 6.D.9. If $S$ and $T$ are two nonnegative random vectors such that $S \le_{\rm dyn\text{-}hr} T$, then $S \le_{\rm ch} T$.

Let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ be the order statistics corresponding to a sample of independent and identically distributed nonnegative random variables $X_1, X_2, \ldots, X_n$. Similarly, let $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}$ be the order statistics corresponding to a sample of independent and identically distributed nonnegative random variables $Y_1, Y_2, \ldots, Y_n$. In the next result, the vectors of order statistics are compared in the order $\le_{\rm dyn\text{-}hr}$; it may be compared with Theorems 6.E.12, 7.B.4, and 7.B.12. The proof of the next result is similar to the proof of the main result in Example 6.D.8.

Theorem 6.D.10. Let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ and $Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)}$ be order statistics as described above. If $X_1 \le_{\rm hr} Y_1$, then $(X_{(1)}, X_{(2)}, \ldots, X_{(n)}) \le_{\rm dyn\text{-}hr} (Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)})$.

We will now see a property of the order $\le_{\rm dyn\text{-}hr}$ in reliability theory. Recall from Section 1.B.5 that a nonnegative random variable $T$ is IFR if, and only if, either one of the following equivalent conditions holds:

$$[T - t \mid T > t] \ge_{\rm hr} [T - t' \mid T > t'] \quad \text{whenever } t \le t', \tag{6.D.14}$$

$$T \ge_{\rm hr} [T - t \mid T > t] \quad \text{for all } t \ge 0. \tag{6.D.15}$$

With the dynamic multivariate analog of the order $\ge_{\rm hr}$, one can generalize (6.D.14) and (6.D.15) to the multivariate case, thus introducing notions of multivariate IFR distributions. This can be done in several ways. Below we show that various generalizations of (6.D.14) and (6.D.15) actually yield the same notion of multivariate IFR.

Let $T$ be a nonnegative random vector. Recall from Section 6.B.6 the definition, the notation $h_t$, and the comparison of histories associated with $T$. One possible multivariate analog of (6.D.14) is to require $T$ to satisfy, for $t \le s$ and histories $h_t$ and $h_s$,

$$[(T - te)_+ \mid h_t] \ge_{\rm dyn\text{-}hr} [(T - se)_+ \mid h_s] \quad \text{whenever } h_t \le h_s. \tag{6.D.16}$$

Still another possible multivariate analog of (6.D.14) is to require $T$ to satisfy, for $t \le s$,

$$[(T - te)_+ \mid h_t] \ge_{\rm dyn\text{-}hr} [(T - se)_+ \mid h_s] \quad \text{whenever } h_t \text{ and } h_s \text{ coincide on } [0, t). \tag{6.D.17}$$

An analog of (6.D.15) is to require $T$ to satisfy (6.D.16) or (6.D.17) with $t = 0$; that is,

$$T \ge_{\rm dyn\text{-}hr} [(T - se)_+ \mid h_s] \quad \text{for any history } h_s,\ s \ge 0. \tag{6.D.18}$$

It turns out that these three conditions are equivalent. If we say that the nonnegative random vector $T$ is multivariate IFR if it satisfies (6.D.16), then we have the following result, the proof of which can be found elsewhere.

Theorem 6.D.11. Let $T$ be a nonnegative random vector. The following three statements are equivalent.
(i) $T$ is multivariate IFR.
(ii) $T$ satisfies (6.D.17).
(iii) $T$ satisfies (6.D.18).

Note that if $T$ is multivariate IFR in the sense of Theorem 6.D.11, it is also multivariate IFR in the sense of both (6.B.27) and (6.B.28).


6.E The Multivariate Likelihood Ratio Order

6.E.1 Definition

A multivariate analog of the univariate order $\le_{\rm lr}$ from Section 1.C will be introduced in this subsection. This order is sometimes also called the TP2 order.

Let $X = (X_1, X_2, \ldots, X_n)$ and $Y = (Y_1, Y_2, \ldots, Y_n)$ be two $n$-dimensional random vectors with absolutely continuous [or discrete] distribution functions and let $f$ and $g$ denote their [continuous or discrete] density functions, respectively. Suppose that

$$f(x)g(y) \le f(x \wedge y)g(x \vee y) \quad \text{for every } x \text{ and } y \text{ in } \mathbb{R}^n. \tag{6.E.1}$$

Then $X$ is said to be smaller than $Y$ in the multivariate likelihood ratio order (denoted as $X \le_{\rm lr} Y$). Indeed, in the univariate case ($n = 1$), (6.E.1) reduces to (1.C.2). The order $\le_{\rm lr}$ is not an order in the usual sense; a comment, similar to the comment in Remark 6.B.5, applies to this order too. Explicitly, $X \le_{\rm lr} X$ means that $X$ has the positive dependence property of "multivariate TP2" discussed in Karlin and Rinott [278] and in Whitt [563]; see its definition in Example 6.E.16 below.

In the slightly more general case, when $X$ and $Y$ are nonnegative, some of the $X_i$'s may be identically zero and the joint distribution of the rest is absolutely continuous or discrete. Suppose that $X_1, X_2, \ldots, X_m$ are those that are identically zero for some $0 < m < n$. Let $f$ now denote the joint density of $(X_{m+1}, X_{m+2}, \ldots, X_n)$. In that case we denote $X \le_{\rm lr} Y$ if

$$f(x)g(y) \le f(x \wedge (y_{m+1}, y_{m+2}, \ldots, y_n)) \times g((y_1, y_2, \ldots, y_m), x \vee (y_{m+1}, y_{m+2}, \ldots, y_n)) \tag{6.E.2}$$

for every $x = (x_{m+1}, x_{m+2}, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$.

At first glance (6.E.1) and (6.E.2) seem to be unintuitive technical conditions. However, it turns out that in many situations they are very easy to verify, and this is one of the major reasons for the usefulness and importance of the order $\le_{\rm lr}$.

Another possible analog of (1.C.2) is to require that $f(y)g(x) \le f(x)g(y)$ whenever $x \le y$. However, this does not yield an intuitive notion; see Remark 6.E.10.

6.E.2 Some properties

The multivariate likelihood ratio order is preserved under conditioning on any rectangular set $A$ (that is, $A$ of the form $A = A_1 \times A_2 \times \cdots \times A_n$ where $A_i \subseteq \mathbb{R}$, $i = 1, 2, \ldots, n$). This is shown in the next result. The proof is quite trivial and is omitted.

Theorem 6.E.1. If $X$ and $Y$ are two $n$-dimensional random vectors such that $X \le_{\rm lr} Y$, then, for any measurable rectangular set $A \subseteq \mathbb{R}^n$, we have that $[X \mid X \in A] \le_{\rm lr} [Y \mid Y \in A]$.

The above theorem can be generalized as follows. For $A, B \subseteq \mathbb{R}^n$ we denote $A \vee B = \{x \vee y : x \in A, y \in B\}$ and $A \wedge B = \{x \wedge y : x \in A, y \in B\}$.

Theorem 6.E.2. Let $A, B \subseteq \mathbb{R}^n$ satisfy $A \vee B \subseteq B$ and $A \wedge B \subseteq A$. If $X$ and $Y$ are two $n$-dimensional random vectors such that $X \le_{\rm lr} Y$, then $[X \mid X \in A] \le_{\rm lr} [Y \mid Y \in B]$.

Proof. Let $f$ and $g$ denote the density functions of $X$ and $Y$, respectively. For any set $C$, let $I_C$ denote its indicator function. The assumptions imply

$$I_A(x)I_B(y) \le I_A(x \wedge y)I_B(x \vee y)$$

and

$$f(x)g(y) \le f(x \wedge y)g(x \vee y).$$

Therefore

$$\frac{f(x)I_A(x)}{P\{X \in A\}} \cdot \frac{g(y)I_B(y)}{P\{Y \in B\}} \le \frac{f(x \wedge y)I_A(x \wedge y)}{P\{X \in A\}} \cdot \frac{g(x \vee y)I_B(x \vee y)}{P\{Y \in B\}}.$$
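Condition (6.E.1) and the conditioning result of Theorem 6.E.1 can be checked directly for discrete densities on a small lattice. The sketch below is an illustration only: the probability vectors `p` and `q`, the product form of `f` and `g`, and the rectangle `A` are assumptions made here, and the helper names are hypothetical.

```python
from itertools import product

def vmin(x, y):
    """Componentwise minimum x ∧ y."""
    return tuple(min(a, b) for a, b in zip(x, y))

def vmax(x, y):
    """Componentwise maximum x ∨ y."""
    return tuple(max(a, b) for a, b in zip(x, y))

def lr_holds(f, g, pts, tol=1e-12):
    """Check f(x)g(y) <= f(x ∧ y)g(x ∨ y) for all pairs of lattice points."""
    return all(f(x) * g(y) <= f(vmin(x, y)) * g(vmax(x, y)) + tol
               for x in pts for y in pts)

def condition_on(f, rect, pts):
    """Discrete density of [X | X in rect]; rect is one set per coordinate."""
    inside = lambda x: all(xi in Ai for xi, Ai in zip(x, rect))
    mass = sum(f(x) for x in pts if inside(x))
    return lambda x: f(x) / mass if inside(x) else 0.0

# Univariate probability functions on {0, 1, 2} with q/p increasing (p <=lr q).
p = (0.5, 0.3, 0.2)
q = (0.2, 0.3, 0.5)
pts = list(product(range(3), repeat=2))
f = lambda x: p[x[0]] * p[x[1]]   # two independent coordinates, each with law p
g = lambda x: q[x[0]] * q[x[1]]   # two independent coordinates, each with law q

A = ({1, 2}, {0, 1})              # a rectangular set A_1 x A_2
fA = condition_on(f, A, pts)
gA = condition_on(g, A, pts)
print(lr_holds(f, g, pts), lr_holds(fA, gA, pts), lr_holds(g, f, pts))
```

Because the lattice is finite, `lr_holds` is an exhaustive check here rather than a sampled one; for continuous densities the same function would only test the chosen points.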

The following result shows that the order ≤lr is preserved under strictly monotone transformations of each individual coordinate of the underlying random vectors. The proof follows the lines of the proof of Theorem 1.C.8 and is omitted. Theorem 6.E.3. Let ψi be any increasing function, i = 1, 2, . . . , n. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two n-dimensional random vectors. If X ≤lr Y , then (ψ1 (X1 ), ψ2 (X2 ), . . . , ψn (Xn )) ≤lr (ψ1 (Y1 ), ψ2 (Y2 ), . . . , ψn (Yn )). The order ≤lr is closed under marginalization and under conjunctions as the following result shows. The ﬁrst part of the theorem can easily be proven from the deﬁnitions. The proof of the second part uses ideas from the theory of total positivity and is not given here. Theorem 6.E.4. (a) Let X 1 , X 2 , . . . , X m be a set of independent random vectors where the dimension of X i is ki , i = 1, 2, . . . , m. Let Y 1 , Y 2 , . . . , Y m be another set of independent random vectors where the dimension of Y i is ki , i = 1, 2, . . . , m. If X i ≤lr Y i for i = 1, 2, . . . , m, then (X 1 , X 2 , . . . , X m ) ≤lr (Y 1 , Y 2 , . . . , Y m ). That is, the multivariate likelihood ratio order is closed under conjunctions. (b) Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two n-dimensional random vectors. If X ≤lr Y , then X I ≤lr Y I for each I ⊆ {1, 2, . . . , n}. That is, the multivariate likelihood ratio order is closed under marginalization.


A result which shows the preservation of the order $\le_{\rm lr}$ under random summations is stated next. The proof is based on standard arguments from the theory of total positivity, and is omitted.

Theorem 6.E.5. Let $X_1, X_2, \ldots, X_m$ be $m$ countably infinite vectors of independent nonnegative random variables. Assume that $X_1, X_2, \ldots, X_m$ are independent. Let $M = (M_1, M_2, \ldots, M_m)$ and $N = (N_1, N_2, \ldots, N_m)$ be two vectors of nonnegative integers which are independent of $X_1, X_2, \ldots, X_m$. Denote by $X_{j,i}$ the $i$th element of $X_j$. If $X_{j,i}$ has a logconcave density function for all $j = 1, 2, \ldots, m$ and $i \ge 1$, and if $M \le_{\rm lr} N$, then

$$\Big(\sum_{i=1}^{M_1} X_{1,i}, \sum_{i=1}^{M_2} X_{2,i}, \ldots, \sum_{i=1}^{M_m} X_{m,i}\Big) \le_{\rm lr} \Big(\sum_{i=1}^{N_1} X_{1,i}, \sum_{i=1}^{N_2} X_{2,i}, \ldots, \sum_{i=1}^{N_m} X_{m,i}\Big).$$

In the univariate case the likelihood ratio order implies the hazard rate order. It turns out that this is also the case in the multivariate case as the following two results show.

Theorem 6.E.6. If $X$ and $Y$ are two $n$-dimensional random vectors such that $X \le_{\rm lr} Y$, then $X \le_{\rm hr} Y$.

Proof. This result follows from Theorem 2.4 in Karlin and Rinott [278] with the MTP2 kernel $K$ defined by $K(x, u) = \prod_{i=1}^{n} 1_{(x_i, \infty)}(u_i)$.
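The role of the kernel $K$ in the proof of Theorem 6.E.6 can be made explicit. The identity below is a sketch for the absolutely continuous case; it only records how the survival functions $F$ and $G$ of Section 6.D are obtained from the densities $f$ and $g$ through $K$, and it does not restate the cited theorem of Karlin and Rinott.

```latex
% F and G are the survival functions of X and Y; f and g are their densities.
\[
F(x) = P\{X > x\}
     = \int_{\mathbb{R}^n} \prod_{i=1}^{n} 1_{(x_i,\infty)}(u_i)\, f(u)\, du
     = \int_{\mathbb{R}^n} K(x,u)\, f(u)\, du ,
\qquad
G(x) = \int_{\mathbb{R}^n} K(x,u)\, g(u)\, du .
\]
```

Each factor $1_{(x_i,\infty)}(u_i)$ is TP2 in the pair $(x_i, u_i)$, which is why $K$ is MTP2 in $(x, u)$. The inequality (6.E.1) for the densities is then transferred through $K$ to the inequality (6.D.1) for the survival functions.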

Theorem 6.E.7. If $X$ and $Y$ are two nonnegative $n$-dimensional random vectors such that $X \le_{\rm lr} Y$, then $X \le_{\rm dyn\text{-}hr} Y$.

Proof. First suppose that $X > 0e$ a.s. Split $\{1, 2, \ldots, n\}$ into three mutually exclusive sets, $I$, $J$, and $L$ (so that $L = \overline{I \cup J}$). Select $x_I$, $x_J$, $y_I$, and $t$ such that $x_I \le y_I \le te$ and $x_J \le te$. Denote the densities of $(X_I, X_J, X_L)$ and of $(Y_I, Y_J, Y_L)$ by $\tilde f$ and $\tilde g$, respectively. The density of $[X_L \mid X_I = x_I, X_J = x_J]$, with argument $x_L$, is then $\tilde f(x_I, x_J, x_L)/\tilde f_{I,J}(x_I, x_J)$, where $\tilde f_{I,J}$ is the marginal density of $(X_I, X_J)$. The density of $[Y_L \mid Y_I = y_I, Y_J > te]$, with argument $y_L$, is then

$$\frac{\int_{y_J > te} \tilde g(y_I, y_J, y_L)\,dy_J}{\int_{y_J > te} \tilde g_{I,J}(y_I, y_J)\,dy_J},$$

where $\tilde g_{I,J}$ is the marginal density of $(Y_I, Y_J)$. Now select a $y_J > te$. Since $y_J > te$ and $x_J \le te$ it follows that $x_J \le y_J$. Also $x_I \le y_I$. Therefore, from the assumption that $X \le_{\rm lr} Y$ it follows that

$$\tilde f(x_I, x_J, x_L)\,\tilde g(y_I, y_J, y_L) \le \tilde f(x_I, x_J, x_L \wedge y_L)\,\tilde g(y_I, y_J, x_L \vee y_L). \tag{6.E.3}$$

Integration of (6.E.3) over the region $\{y_J : y_J > te\}$ yields

$$\tilde f(x_I, x_J, x_L) \int_{y_J > te} \tilde g(y_I, y_J, y_L)\,dy_J \le \tilde f(x_I, x_J, x_L \wedge y_L) \int_{y_J > te} \tilde g(y_I, y_J, x_L \vee y_L)\,dy_J,$$

which, in turn, yields

$$\frac{\tilde f(x_I, x_J, x_L)}{\tilde f_{I,J}(x_I, x_J)} \times \frac{\int_{y_J > te} \tilde g(y_I, y_J, y_L)\,dy_J}{\int_{y_J > te} \tilde g_{I,J}(y_I, y_J)\,dy_J} \le \frac{\tilde f(x_I, x_J, x_L \wedge y_L)}{\tilde f_{I,J}(x_I, x_J)} \times \frac{\int_{y_J > te} \tilde g(y_I, y_J, x_L \vee y_L)\,dy_J}{\int_{y_J > te} \tilde g_{I,J}(y_I, y_J)\,dy_J}.$$

That is, we have shown so far that

$$[X_L \mid X_I = x_I, X_J = x_J] \le_{\rm lr} [Y_L \mid Y_I = y_I, Y_J > te]. \tag{6.E.4}$$

From Theorems 6.E.1 and 6.E.3 it now follows that

$$[X_L - te \mid X_I = x_I, X_J = x_J, X_L > te] \le_{\rm lr} [Y_L - te \mid Y_I = y_I, Y_J > te, Y_L > te],$$

and from Theorem 6.E.4(b) it follows that, for $k \in L$, we have

$$[X_k - t \mid X_I = x_I, X_J = x_J, X_L > te] \le_{\rm lr} [Y_k - t \mid Y_I = y_I, Y_J > te, Y_L > te], \tag{6.E.5}$$

where here $\le_{\rm lr}$ denotes the univariate likelihood ratio order discussed in Section 1.C. From (6.E.5) it follows that the density of $[X_k - t \mid X_I = x_I, X_J = x_J, X_L > te]$ at zero is larger than the density of $[Y_k - t \mid Y_I = y_I, Y_J > te, Y_L > te]$ at zero. But the density of $[X_k - t \mid X_I = x_I, X_J = x_J, X_L > te]$ at zero is $\eta_{k|I \cup J}(t \mid x_I, x_J)$ and the density of $[Y_k - t \mid Y_I = y_I, Y_J > te, Y_L > te]$ at zero is $\lambda_{k|I}(t \mid y_I)$, where $\eta_{\cdot|\cdot}(\cdot \mid \cdot)$ and $\lambda_{\cdot|\cdot}(\cdot \mid \cdot)$ denote the multivariate conditional hazard rate functions of $X$ and $Y$, respectively. We thus have shown that $X$ and $Y$ satisfy (6.D.13), and this completes the proof of the theorem when $X > 0e$ a.s. If $X$ has some components that are identically zero a.s., then the above arguments still apply after some simple modifications.

A combination of Theorems 6.C.1, 6.D.9, and 6.E.7 shows that for nonnegative random vectors $X$ and $Y$ one has $X \le_{\rm lr} Y \implies X \le_{\rm st} Y$. But this is true in general, as is stated in the next result, the proof of which we omit.

Theorem 6.E.8. If $X$ and $Y$ are two $n$-dimensional random vectors such that $X \le_{\rm lr} Y$, then $X \le_{\rm st} Y$.

Remark 6.E.9. A combination of Theorems 6.E.1 and 6.E.8 shows that

$$X \le_{\rm lr} Y \implies [X \mid X \in A] \le_{\rm st} [Y \mid Y \in A] \quad \text{for all measurable rectangular sets } A \subseteq \mathbb{R}^n. \tag{6.E.6}$$

The conclusion in (6.E.6) is a generalization of (1.C.6). However, the characterization of the order $\le_{\rm lr}$ in the univariate case, given in (1.C.6), does not generalize to the multivariate case. That is, $X \le_{\rm lr} Y$ does not necessarily imply that $[X \mid X \in A] \le_{\rm st} [Y \mid Y \in A]$ for all measurable sets $A \subseteq \mathbb{R}^n$.

Remark 6.E.10. Let $X$ and $Y$ be two $n$-dimensional random vectors with (continuous or discrete) density functions $f$ and $g$, respectively. If it is only assumed that $f(y)g(x) \le f(x)g(y)$ whenever $x \le y$ (rather than (6.E.1)), then it is not necessarily true that $X \le_{\rm st} Y$; counterexamples can be found in the literature. Note, however, that, under some additional conditions, the monotonicity of $g(x)/f(x)$ in $x$ implies that $X \le_{\rm st} Y$; see, for example, Theorem 6.B.8.

A result that may be viewed as a generalization of Theorems 1.C.9 and 1.C.52 is stated next.

Theorem 6.E.11. Let $X$ be an $n$-dimensional random vector.
(a) $X \le_{\rm lr} X + a$ for all $a \ge 0$ if, and only if, $X$ has independent components with logconcave density functions.
(b) If $X$ has independent components with logconcave density functions, then $X \le_{\rm lr} X + Y$ for any random vector $Y \ge 0$ independent of $X$.

In the next result, vectors of order statistics are compared in the multivariate order $\le_{\rm lr}$. The result may be compared with Theorems 6.D.10, 7.B.4, and 7.B.12.

Theorem 6.E.12. Let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ and $Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)}$ be order statistics as in Theorem 6.D.10. If $X_1 \le_{\rm lr} Y_1$, then $(X_{(1)}, X_{(2)}, \ldots, X_{(n)}) \le_{\rm lr} (Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)})$.

The following example may be compared with Examples 1.B.24, 1.C.48, 2.A.22, 3.B.38, 6.B.41, 6.D.8, and 7.B.13.

Example 6.E.13. Let $X$ and $Y$ be two absolutely continuous nonnegative random variables with survival functions $F$ and $G$, and density functions $f$ and $g$, respectively. Denote $\Lambda_1 = -\log F$, $\Lambda_2 = -\log G$, and $\lambda_i = \Lambda_i'$, $i = 1, 2$. Consider two nonhomogeneous Poisson processes $N_1 = \{N_1(t), t \ge 0\}$ and $N_2 = \{N_2(t), t \ge 0\}$ with mean functions $\Lambda_1$ and $\Lambda_2$ (see Example 1.B.13), respectively. Let $T_{i,1}, T_{i,2}, \ldots$ be the successive epoch times of process $N_i$, $i = 1, 2$. Note that $X =_{\rm st} T_{1,1}$ and $Y =_{\rm st} T_{2,1}$. It turns out that, under some conditions, the univariate likelihood ratio ordering of the first two epoch times implies the multivariate likelihood ratio ordering of the corresponding vectors of the later epoch times. Explicitly, it will be shown below that if $X \le_{\rm hr} Y$, and if (1.B.25) holds, then $(T_{1,1}, T_{1,2}, \ldots, T_{1,n}) \le_{\rm lr} (T_{2,1}, T_{2,2}, \ldots, T_{2,n})$ for each $n \ge 1$. (Note that the condition $X \le_{\rm hr} Y$, together with (1.B.25), is stronger than merely assuming $X \le_{\rm lr} Y$; see Theorem 1.C.4.)


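As is mentioned above, the stated result is true for n = 1. So let n ≥ 2. The density functions of (Ti,1, Ti,2, . . . , Ti,n), i = 1, 2, are given by

\[ h_{1,n}(x_1, x_2, \dots, x_n) = \lambda_1(x_1)\lambda_1(x_2)\cdots\lambda_1(x_{n-1})\, f(x_n) \quad \text{for } x_1 \le x_2 \le \cdots \le x_n, \]

and

\[ h_{2,n}(x_1, x_2, \dots, x_n) = \lambda_2(x_1)\lambda_2(x_2)\cdots\lambda_2(x_{n-1})\, g(x_n) \quad \text{for } x_1 \le x_2 \le \cdots \le x_n. \]

Consider now (x1, x2, . . . , xn) and (y1, y2, . . . , yn) such that x1 ≤ x2 ≤ · · · ≤ xn and y1 ≤ y2 ≤ · · · ≤ yn. We want to prove that

\[
\begin{aligned}
&\lambda_1(x_1 \wedge y_1)\lambda_1(x_2 \wedge y_2)\cdots\lambda_1(x_{n-1} \wedge y_{n-1})\, f(x_n \wedge y_n) \\
&\qquad \times\, \lambda_2(x_1 \vee y_1)\lambda_2(x_2 \vee y_2)\cdots\lambda_2(x_{n-1} \vee y_{n-1})\, g(x_n \vee y_n) \\
&\quad \ge \lambda_1(x_1)\lambda_1(x_2)\cdots\lambda_1(x_{n-1})\, f(x_n) \times \lambda_2(y_1)\lambda_2(y_2)\cdots\lambda_2(y_{n-1})\, g(y_n).
\end{aligned}
\tag{6.E.7}
\]

Let E = {i ≤ n − 1 : xi ≥ yi}. The factors with i ∉ E are the same on both sides, so (6.E.7) reduces to

\[ \Big(\prod_{i \in E} \lambda_1(y_i)\lambda_2(x_i)\Big) f(x_n \wedge y_n)\, g(x_n \vee y_n) \ \ge\ \Big(\prod_{i \in E} \lambda_1(x_i)\lambda_2(y_i)\Big) f(x_n)\, g(y_n), \]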

and this follows from (1.B.25) and X ≤lr Y . From the above result, and the closure of the likelihood ratio order under marginalization (Theorem 6.E.4(b)), it follows that if X ≤hr Y , and if (1.B.25) holds, then T1,n ≤lr T2,n , n ≥ 1. However, a stronger result is given in Example 1.C.48—this is so because the conditions X ≤hr Y and (1.B.25), together, imply the conditions X ≤lr Y and (1.C.15). Now let Xi,n ≡ Ti,n − Ti,n−1 , n ≥ 1 (where Ti,0 ≡ 0), be the inter-epoch times of the process Ni , i = 1, 2. Again, note that X =st X1,1 and Y =st X2,1 . It turns out that, under some conditions, the univariate likelihood ratio ordering of the ﬁrst two inter-epoch times implies the multivariate likelihood ratio ordering of the corresponding vectors of the later inter-epoch times. Explicitly, if X ≤hr Y , and if f and/or g are logconvex, and if λ1 and/or λ2 are logconvex, and if (1.B.25) holds, then (X1,1 , X1,2 , . . . , X1,n ) ≤lr (X2,1 , X2,2 , . . . , X2,n ) for each n ≥ 1. The proof of this statement will not be detailed here. From the above result, and the closure of the likelihood ratio order under marginalization (Theorem 6.E.4(b)), it follows that if X ≤hr Y , and if f and/or g are logconvex, and if λ1 and/or λ2 are logconvex, and if (1.B.25) holds, then X1,n ≤lr X2,n , n ≥ 1. This is a diﬀerent set of conditions for the last stochastic inequality than the set of conditions in Example 1.C.48.


Example 6.E.14. Recall that the spacings that correspond to the nonnegative random variables X1, X2, . . . , Xn are denoted by U(i) = X(i) − X(i−1), i = 1, 2, . . . , n, where the X(i)'s are the corresponding order statistics (here we take X(0) ≡ 0). The normalized spacings are defined by D(i) = (n − i + 1)U(i), i = 1, 2, . . . , n. Now, let D(1), D(2), . . . , D(n) be the normalized spacings associated with exponential random variables X1, X2, . . . , Xn, where Xi has the hazard rate λi, i = 1, 2, . . . , n. Let D*(1), D*(2), . . . , D*(n) be the normalized spacings associated with a sample of n independent and identically distributed exponential random variables that have the hazard rate (1/n) ∑_{i=1}^n λi. Then

(D*(1), D*(2), . . . , D*(n)) ≤lr (D(1), D(2), . . . , D(n)).
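The normalized spacings of Example 6.E.14 are easy to compute from a sample. The sketch below uses the standard normalization D(i) = (n − i + 1)U(i) with X(0) ≡ 0; the function name and the sample are hypothetical and serve only as an illustration.

```python
def normalized_spacings(sample):
    """Return [D_(1), ..., D_(n)] with D_(i) = (n - i + 1) * (X_(i) - X_(i-1)), X_(0) = 0."""
    order_stats = sorted(sample)
    n = len(order_stats)
    previous = 0.0
    result = []
    for i, x in enumerate(order_stats, start=1):
        result.append((n - i + 1) * (x - previous))  # weight n-i+1 times the i-th spacing
        previous = x
    return result

sample = [3, 1, 2]
d = normalized_spacings(sample)
print(d, sum(d))
```

With this normalization the spacings add up to the sample total, since ∑ (n − i + 1)(X(i) − X(i−1)) = ∑ X(i); the example reproduces this identity.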

The following example is similar to Example 6.B.25 except that under a different assumption we obtain a stronger conclusion. Other results which give related comparisons can be found in Theorems 1.C.45 and 4.B.17.

Example 6.E.15. Let X and Y be two random variables. Let X(1) ≤ X(2) ≤ · · · ≤ X(n) denote the order statistics from a sample X1, X2, . . . , Xn of independent and identically distributed random variables that have the same distribution as X. Similarly, let Y(1) ≤ Y(2) ≤ · · · ≤ Y(n) denote the order statistics from another sample Y1, Y2, . . . , Yn of independent and identically distributed random variables that have the same distribution as Y. The corresponding spacings are defined by U(i) ≡ X(i) − X(i−1) and V(i) ≡ Y(i) − Y(i−1), i = 2, 3, . . . , n. Denote U = (U(2), U(3), . . . , U(n)) and V = (V(2), V(3), . . . , V(n)). Kochar [311] has shown that if X ≤lr Y, and if either X or Y has a logconvex density, then U ≤lr V.

The next example extends Example 1.C.57 to the multivariate likelihood ratio order.

Example 6.E.16. Let X be an n-dimensional random vector whose distribution function depends on the m-dimensional parameter Θ. Denote the prior density function of Θ by π(·), and denote the conditional density of X, given Θ = θ, by f(· | θ). Suppose that the m-dimensional density function of Θ is MTP2 (multivariate totally positive of order 2), that is, suppose that Θ ≤lr Θ, or, equivalently (see (6.E.1)), that π(θ)π(θ′) ≤ π(θ ∧ θ′)π(θ ∨ θ′) for every θ and θ′ in Rm. If, in addition, f(x | θ) is ((m + n)-dimensional) MTP2, then Θ is increasing in X in the likelihood ratio sense (that is, [Θ | X = x] ≤lr [Θ | X = x′] whenever x ≤ x′). The proof of this statement is similar to the proof of the statement in Example 1.C.57 and is omitted.

6.E.3 A property in reliability theory

In Theorem 1.C.52 it was shown that a nonnegative random variable T has a logconcave density if, and only if, either one of the following equivalent conditions holds:

[T − t | T > t] ≥lr [T − t′ | T > t′] whenever t ≤ t′, (6.E.8)

T ≥lr [T − t | T > t] for all t ≥ 0. (6.E.9)

We commented there that logconcavity can thus be interpreted as an aging notion in reliability theory. Having a multivariate analog of the order ≥lr one can generalize (6.E.8) and (6.E.9) to the multivariate case, thus introducing notions which can be considered as multivariate analogs of distributions with logconcave densities. This can be done in several ways. In this subsection we show that various generalizations of (6.E.8) and (6.E.9) actually yield the same notion of multivariate PF2 distributions.

Let T be a nonnegative random vector. Recall from Section 6.B.6 the definition, the notation ht, and the comparison of histories associated with T. One possible multivariate analog of (6.E.8) is to require T to satisfy, for t ≤ s and histories ht and h′s,

[(T − te)+ | ht] ≥lr [(T − se)+ | h′s] whenever ht ≤ h′s. (6.E.10)

Still another possible multivariate analog of (6.E.8) is to require T to satisfy, for t ≤ s,

[(T − te)+ | ht] ≥lr [(T − se)+ | h′s] whenever ht and h′s coincide on [0, t). (6.E.11)

An analog of (6.E.9) is to require T to satisfy (6.E.10) or (6.E.11) with t = 0, that is,

T ≥lr [(T − se)+ | hs] for any history hs, s ≥ 0. (6.E.12)

It turns out that these three conditions are equivalent. If we say that the nonnegative random vector T is multivariate PF2 if it satisfies (6.E.10), then we have the following result, the proof of which is similar to the proof of Theorem 6.D.11.

Theorem 6.E.17. Let T be a nonnegative random vector. The following three statements are equivalent.
(i) T is multivariate PF2.
(ii) T satisfies (6.E.11).
(iii) T satisfies (6.E.12).

6.F The Multivariate Mean Residual Life Order

6.F.1 Definition

Let T = (T1, T2, . . . , Tm) be a nonnegative random vector with a finite mean vector. Consider a typical history of T at time t ≥ 0, which is of the form (see (6.B.25))

\[ h_t = \{T_I = t_I,\ T_{\bar I} > te\}, \qquad 0e \le t_I \le te, \quad I \subseteq \{1, 2, \dots, m\}, \tag{6.F.1} \]

where Ī denotes the complement of I in {1, 2, . . . , m}.

Given the history ht as in (6.F.1), let i ∈ Ī be a component that is still alive at time t. Its multivariate mean residual life, at time t, is defined as follows:

mi|I(t | tI) = E[Ti − t | T_I = tI, T_Ī > te], (6.F.2)

where, of course, 0e ≤ tI ≤ te and I ⊆ {1, 2, . . . , m}. Clearly, the smaller the mrl function is, the smaller T should be in some stochastic sense. This is the motivation for the order discussed in this section.

Let S be another nonnegative random vector with a finite mean vector. Denote its multivariate mean residual life functions by l·|·(· | ·), where the l's are defined analogously to the m's in (6.F.2). Suppose that

li|I∪J(u | sI, sJ) ≤ mi|I(u | tI) whenever J ∩ I = ∅, sI ≤ tI ≤ ue, and sJ ≤ ue, (6.F.3)

where i ∉ I ∪ J. Then S is said to be smaller than T in the multivariate mean residual life order (denoted as S ≤mrl T).

The order ≤mrl is not an order in the usual sense; a comment, similar to the comment in Remark 6.B.5, applies to this order too. Explicitly, X ≤mrl X means that X has the positive dependence property of "mrl decreasing upon failure" discussed in Shaked and Shanthikumar [513]. Note that (6.F.3) can be written as

li(hu) ≤ mi(h′u) whenever hu ≥ h′u,

where i denotes a component that has not failed by time u in the history hu. In the univariate case (m = 1) condition (6.F.3) reduces to (2.A.2) [with a different notation]. We have already seen that in the univariate case S1 ≤hr T1 =⇒ S1 ≤mrl T1. This is also true in the general multivariate case as will be shown in the next subsection.

6.F.2 The relation between the multivariate mean residual life and the dynamic multivariate hazard rate orders

Theorem 6.F.1. If S and T are two nonnegative random vectors with finite mean vectors such that S ≤dyn-hr T, then S ≤mrl T.

Proof. Select a t > 0 and two histories ht and h′t such that ht ≤ h′t. It is not hard to verify that if S ≤dyn-hr T, then [(S − te)+ | h′t] ≤dyn-hr [(T − te)+ | ht]. From Theorems 6.D.9 and 6.C.1 it is seen that if [(S − te)+ | h′t] ≤dyn-hr [(T − te)+ | ht], then [(S − te)+ | h′t] ≤st [(T − te)+ | ht]. Therefore, for a component i, which is still alive at time t in history h′t, we have li(h′t) = E[Si − t | h′t] ≤ E[Ti − t | ht] = mi(ht), that is, S ≤mrl T.
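In the univariate case (m = 1) the mean residual life function can be computed directly, which gives a quick way to see the implication S1 ≤hr T1 =⇒ S1 ≤mrl T1 at work. The sketch below treats two discrete lifetimes chosen purely for illustration (S uniform on {1, 2, 3}, T uniform on {2, 3, 4}); for this pair the ratio P{T > t}/P{S > t} is increasing in t, so S ≤hr T. The function name is hypothetical.

```python
from fractions import Fraction

def mrl(pmf, t):
    """Mean residual life E[T - t | T > t] of a discrete lifetime; None if P{T > t} = 0."""
    tail = sum(p for x, p in pmf.items() if x > t)
    if tail == 0:
        return None
    return sum((x - t) * p for x, p in pmf.items() if x > t) / tail

third = Fraction(1, 3)
S = {1: third, 2: third, 3: third}   # illustrative lifetime with the larger hazard
T = {2: third, 3: third, 4: third}   # illustrative lifetime with the smaller hazard
for t in range(0, 3):
    print(t, mrl(S, t), mrl(T, t))
```

At every time point at which both conditional expectations are defined, the mean residual life of S does not exceed that of T, in agreement with the univariate implication.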


6.F.3 A property in reliability theory

Recall from Section 2.A.4 that a nonnegative random variable T with a finite mean is DMRL if, and only if, either one of the following equivalent conditions holds:

[T − t | T > t] ≥mrl [T − t′ | T > t′] whenever t ≤ t′, (6.F.4)

T ≥mrl [T − t | T > t] for all t ≥ 0. (6.F.5)

With the multivariate analog of the order ≥mrl one can generalize (6.F.4) and (6.F.5) to the multivariate case, thus introducing notions of multivariate DMRL distributions. This can be done in several ways. In this subsection we show that various generalizations of (6.F.4) and (6.F.5) actually yield the same notion of multivariate DMRL.

Let T be a nonnegative random vector with a finite mean vector. A possible multivariate analog of (6.F.4) is to require, for t ≤ s and histories ht and h′s, that T satisfies

[(T − te)+ | ht] ≥mrl [(T − se)+ | h′s] whenever ht ≤ h′s. (6.F.6)

Still another possible multivariate analog of (6.F.4) is to require, for t ≤ s, that T satisfies

[(T − te)+ | ht] ≥mrl [(T − se)+ | h′s] whenever ht and h′s coincide on [0, t). (6.F.7)

An analog of (6.F.5) is to require that T satisfies (6.F.6) or (6.F.7) with t = 0, that is,

T ≥mrl [(T − se)+ | hs] for any history hs, s ≥ 0. (6.F.8)

It turns out that these three conditions are equivalent. If we say that the nonnegative random vector T is multivariate DMRL if it satisfies (6.F.6), then we have the following result, the proof of which is similar to the proof of Theorem 6.D.11 and is omitted.

Theorem 6.F.2. Let T be a nonnegative random vector with a finite mean vector. The following three statements are equivalent.
(i) T is multivariate DMRL.
(ii) T satisfies (6.F.7).
(iii) T satisfies (6.F.8).

6.G Other Multivariate Stochastic Orders

6.G.1 The orthant orders

The usual multivariate stochastic order, discussed in Section 6.B, is a possible multivariate generalization of (1.A.4) or (1.A.7). In this section we discuss


a few other possible generalizations of the univariate order ≤st which are straightforward analogs of (1.A.1) and of (1.A.2). These generalizations yield orders that are strictly weaker than the usual multivariate stochastic order.

For a random vector X = (X1, X2, . . . , Xn) with distribution function F, let F̄ be the multivariate survival function of X, that is,

F̄(x1, x2, . . . , xn) ≡ P{X1 > x1, X2 > x2, . . . , Xn > xn} for all x.

Let Y be another n-dimensional random vector with distribution function G and survival function Ḡ. If

F̄(x1, x2, . . . , xn) ≤ Ḡ(x1, x2, . . . , xn) for all x, (6.G.1)

then we say that X is smaller than Y in the upper orthant order (denoted by X ≤uo Y). If

F(x1, x2, . . . , xn) ≥ G(x1, x2, . . . , xn) for all x, (6.G.2)

then we say that X is smaller than Y in the lower orthant order (denoted by X ≤lo Y). The reason for this terminology is that sets of the form {x : x1 > a1, x2 > a2, . . . , xn > an}, for some fixed a, are called upper orthants, and sets of the form {x : x1 ≤ a1, x2 ≤ a2, . . . , xn ≤ an}, for some fixed a, are called lower orthants. Note that (6.G.1) can be written as

E[I_U(X)] ≤ E[I_U(Y)] for all upper orthants U. (6.G.3)

Similarly, (6.G.2) can be written as

E[I_L(X)] ≥ E[I_L(Y)] for all lower orthants L. (6.G.4)
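For distributions with finite support the two orthant orders can be checked directly from the definitions: ≤uo compares the survival functions P{X1 > x1, . . . , Xn > xn}, and ≤lo compares the distribution functions P{X1 ≤ x1, . . . , Xn ≤ xn}. The following sketch does this on a finite grid of thresholds; the function names and the two bivariate distributions (independent fair bits versus two identical fair bits) are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from itertools import product

def survival(pmf, a):
    """P{X_1 > a_1, ..., X_n > a_n}."""
    return sum(p for x, p in pmf.items() if all(xi > ai for xi, ai in zip(x, a)))

def cdf(pmf, a):
    """P{X_1 <= a_1, ..., X_n <= a_n}."""
    return sum(p for x, p in pmf.items() if all(xi <= ai for xi, ai in zip(x, a)))

def thresholds(f, g):
    """Grid on which the step functions of f and g take all of their values."""
    points = list(f) + list(g)
    axes = []
    for i in range(len(points[0])):
        values = {x[i] for x in points}
        axes.append(sorted(values | {min(values) - 1}))  # one point below the support
    return list(product(*axes))

def uo_leq(f, g):
    """X <=uo Y: survival function of X is pointwise below that of Y."""
    return all(survival(f, a) <= survival(g, a) for a in thresholds(f, g))

def lo_leq(f, g):
    """X <=lo Y: distribution function of X is pointwise above that of Y."""
    return all(cdf(f, a) >= cdf(g, a) for a in thresholds(f, g))

quarter, half = Fraction(1, 4), Fraction(1, 2)
independent = {(0, 0): quarter, (0, 1): quarter, (1, 0): quarter, (1, 1): quarter}
comonotone = {(0, 0): half, (1, 1): half}
print(uo_leq(independent, comonotone), lo_leq(comonotone, independent))
```

For this pair the independent vector is smaller in ≤uo while the comonotone vector is smaller in ≤lo, so the two orthant orders rank the pair in opposite directions. Since the usual stochastic order implies both orthant orders in the same direction, neither vector is smaller than the other in ≤st.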

Let ψ be an n-variate function of the form

\[ \psi(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} g_i(x_i), \qquad (x_1, x_2, \dots, x_n) \in \mathbb{R}^n, \]

where the gi's are univariate nonnegative increasing functions. Every such function can be approximated by positive linear combinations of indicator functions of upper orthants. Therefore, using (6.G.3), we obtain the first part of the next theorem. The other part can be obtained similarly using (6.G.4).

Theorem 6.G.1. Let X and Y be two n-dimensional random vectors. Then
(a) X ≤uo Y if, and only if,

\[ E\Big[\prod_{i=1}^{n} g_i(X_i)\Big] \le E\Big[\prod_{i=1}^{n} g_i(Y_i)\Big] \tag{6.G.5} \]

for every collection {g1, g2, . . . , gn} of univariate nonnegative increasing functions.


(b) X ≤lo Y if, and only if,

\[ E\Big[\prod_{i=1}^{n} h_i(X_i)\Big] \ge E\Big[\prod_{i=1}^{n} h_i(Y_i)\Big] \tag{6.G.6} \]

for every collection {h1, h2, . . . , hn} of univariate nonnegative decreasing functions.

For a real n-variate function g, the multivariate difference operator ∆ is defined by

\[ \Delta_x^y g = \sum_{(\epsilon_1, \epsilon_2, \dots, \epsilon_n) \in \{0,1\}^n} (-1)^{\sum_{i=1}^{n} \epsilon_i}\, g\big(\epsilon_1 x_1 + (1 - \epsilon_1) y_1, \dots, \epsilon_n x_n + (1 - \epsilon_n) y_n\big), \]

where x and y are elements of Rn. The function g is called ∆-monotone if

\[ \Delta_x^y g \ge 0 \quad \text{whenever } x \le y. \]
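The difference operator is a signed sum of the values of g over the 2^n corners of the box with lower corner x and upper corner y, where the sign of a corner is (−1) raised to the number of coordinates taken from x. A direct transcription, with a hypothetical function name, is sketched below.

```python
from itertools import product

def delta(g, x, y):
    """Multivariate difference of g over the box with lower corner x and upper corner y."""
    total = 0
    for eps in product((0, 1), repeat=len(x)):
        # Coordinate i of the corner is x_i when eps_i = 1 and y_i when eps_i = 0.
        corner = [e * xi + (1 - e) * yi for e, xi, yi in zip(eps, x, y)]
        total += (-1) ** sum(eps) * g(*corner)
    return total

# For g(a, b) = a * b the difference equals (y1 - x1) * (y2 - x2).
print(delta(lambda a, b: a * b, (1, 2), (4, 6)))
```

For n = 1 the operator reduces to g(y) − g(x), so ∆-monotonicity is ordinary monotonicity; for the product function in the example the difference is the area of the box, which is nonnegative whenever x ≤ y.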

Let M be the set of all n-variate functions that are ∆-monotone in any of their k coordinates when the other n − k coordinates are held fixed, 1 ≤ k ≤ n. It can be shown that if ψ ∈ M and X ≤uo Y, then E[ψ(X)] ≤ E[ψ(Y)]. Every distribution function is a member of M. Thus we have proven the first part of the following theorem. The other part can be shown similarly.

Theorem 6.G.2. Let X and Y be two n-dimensional random vectors. Then
(a) X ≤uo Y if, and only if,

E[ψ(X)] ≤ E[ψ(Y)] for every distribution function ψ. (6.G.7)

(b) X ≤lo Y if, and only if,

E[ψ(X)] ≥ E[ψ(Y)] for every survival function ψ. (6.G.8)

It is clear, for example from Theorem 6.G.2, that

X ≤st Y =⇒ X ≤uo Y and X ≤lo Y .

(6.G.9)

Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two random vectors. Note that if X ≤uo Y, or if X ≤lo Y, then Xi ≤st Yi, i = 1, 2, . . . , n. It follows that

X ≤uo Y =⇒ EX ≤ EY, and X ≤lo Y =⇒ EX ≤ EY.

The following closure properties of the orthant orders can be easily veriﬁed using (6.G.1)–(6.G.4).


Theorem 6.G.3.
(a) Let (X1, X2, . . . , Xn) and (Y1, Y2, . . . , Yn) be two n-dimensional random vectors. If (X1, X2, . . . , Xn) ≤uo [≤lo] (Y1, Y2, . . . , Yn), then (g1(X1), g2(X2), . . . , gn(Xn)) ≤uo [≤lo] (g1(Y1), g2(Y2), . . . , gn(Yn)) whenever gi : R → R is an increasing function, i = 1, 2, . . . , n.
(b) Let X1, X2, . . . , Xm be a set of independent random vectors where the dimension of Xi is ki, i = 1, 2, . . . , m. Let Y1, Y2, . . . , Ym be another set of independent random vectors where the dimension of Yi is ki, i = 1, 2, . . . , m. If Xi ≤uo [≤lo] Yi for i = 1, 2, . . . , m, then (X1, X2, . . . , Xm) ≤uo [≤lo] (Y1, Y2, . . . , Ym). That is, the orthant orders are closed under conjunctions.
(c) Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two n-dimensional random vectors. If X ≤uo [≤lo] Y, then XI ≤uo [≤lo] YI for each I ⊆ {1, 2, . . . , n}. That is, the orthant orders are closed under marginalization.
(d) Let {Xj, j = 1, 2, . . . } and {Yj, j = 1, 2, . . . } be two sequences of random vectors such that Xj →st X and Yj →st Y as j → ∞, where →st denotes convergence in distribution. If Xj ≤uo [≤lo] Yj, j = 1, 2, . . ., then X ≤uo [≤lo] Y.
(e) Let X, Y, and Θ be random vectors such that [X | Θ = θ] ≤uo [≤lo] [Y | Θ = θ] for all θ in the support of Θ. Then X ≤uo [≤lo] Y. That is, the orthant orders are closed under mixtures.

From parts (a) and (e) of Theorem 6.G.3 we obtain the following corollary.

Corollary 6.G.4. Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two random vectors such that X ≤uo [≤lo] Y, and let Z be an m-dimensional random vector which is independent of X and Y. Then

(h1(X1, Z), h2(X2, Z), . . . , hn(Xn, Z)) ≤uo [≤lo] (h1(Y1, Z), h2(Y2, Z), . . . , hn(Yn, Z)),

whenever hi(x, z), i = 1, 2, . . . , n, are increasing in x for every z.
By applying Corollary 6.G.4 twice (letting Z there be an n-dimensional random vector, and letting each hi depend only on its first argument and on the ith component of the second argument, i = 1, 2, . . . , n), we get the following result. A strengthening of the following result is Theorem 6.G.18 below.

Theorem 6.G.5. Let X, Y, Z, and W be n-dimensional random vectors such that X and Z are independent and Y and W are independent. Let ci : [0, ∞)^2 → [0, ∞) be a continuous increasing function, i = 1, 2, . . . , n. If X ≤uo [≤lo] Y and Z ≤uo [≤lo] W, then

(c1(X1, Z1), c2(X2, Z2), . . . , cn(Xn, Zn)) ≤uo [≤lo] (c1(Y1, W1), c2(Y2, W2), . . . , cn(Yn, Wn)).


Example 6.G.6. Consider an n-dimensional Markov chain {Xk = (Xk,1, . . . , Xk,n), k ≥ 0} defined by X0 = (0, . . . , 0) and

\[ X_{k+1} = \big(g_1(X_{k,1}, U^1_{k,1}, \dots, U^m_{k,1}), \dots, g_n(X_{k,n}, U^1_{k,n}, \dots, U^m_{k,n})\big), \qquad k \ge 0, \]

where, for each 1 ≤ l ≤ m, the random vectors U^l_k = (U^l_{k,1}, . . . , U^l_{k,n}), k = 1, 2, . . ., are independent and identically distributed, and the gi's are some deterministic (m + 1)-dimensional functions. Consider another n-dimensional Markov chain {Yk = (Yk,1, . . . , Yk,n), k ≥ 0} similarly defined by Y0 = (0, . . . , 0) and

\[ Y_{k+1} = \big(g_1(Y_{k,1}, V^1_{k,1}, \dots, V^m_{k,1}), \dots, g_n(Y_{k,n}, V^1_{k,n}, \dots, V^m_{k,n})\big), \qquad k \ge 0, \]

where, for each 1 ≤ l ≤ m, the random vectors V^l_k = (V^l_{k,1}, . . . , V^l_{k,n}), k = 1, 2, . . ., are independent and identically distributed. If the gi's are increasing in their m + 1 arguments, if U^l = {U^l_k, k ≥ 0}, l = 1, . . . , m, are independent, if V^l = {V^l_k, k ≥ 0}, l = 1, . . . , m, are independent, and if U^l_k ≤uo [≤lo] V^l_k, l = 1, . . . , m, k ≥ 0, then, for each k ≥ 0 we have

(X0, . . . , Xk) ≤uo [≤lo] (Y0, . . . , Yk).

The proof uses Theorem 6.G.5, Corollary 6.G.4, and Theorem 6.G.3(b). We omit the details.

Another preservation property of the orthant orders is described in the next theorem. In the following theorem we define ∑_{j=1}^{0} xj ≡ 0 for any sequence {xj, j = 1, 2, . . . }. Similar results are Theorems 9.A.6 and 9.A.14.

Theorem 6.G.7. Let Xj = (Xj,1, Xj,2, . . . , Xj,m), j = 1, 2, . . ., be a sequence of nonnegative random vectors, and let M = (M1, M2, . . . , Mm) and N = (N1, N2, . . . , Nm) be two vectors of nonnegative integer-valued random variables. Assume that both M and N are independent of the Xj's. If M ≤uo [≤lo] N, then

\[ \Big(\sum_{j=1}^{M_1} X_{j,1},\ \sum_{j=1}^{M_2} X_{j,2},\ \dots,\ \sum_{j=1}^{M_m} X_{j,m}\Big) \ \le_{\mathrm{uo}}\ [\le_{\mathrm{lo}}]\ \Big(\sum_{j=1}^{N_1} X_{j,1},\ \sum_{j=1}^{N_2} X_{j,2},\ \dots,\ \sum_{j=1}^{N_m} X_{j,m}\Big). \]

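Proof. We only give the proof for the upper orthant order; the proof for the lower orthant order is similar. For t = (t1, t2, . . . , tm) we have

\[
\begin{aligned}
P\Big\{\bigcap_{i=1}^{m}\Big\{\sum_{j=1}^{M_i} X_{j,i} > t_i\Big\}\Big\}
&= \sum_{n_1=0}^{\infty}\sum_{n_2=0}^{\infty}\cdots\sum_{n_m=0}^{\infty}
P\Big\{\bigcap_{i=1}^{m}\Big\{\sum_{j=1}^{n_i} X_{j,i} \le t_i < \sum_{j=1}^{n_i+1} X_{j,i}\Big\}\Big\}
\times P\{M > (n_1, n_2, \dots, n_m)\} \\
&\le \sum_{n_1=0}^{\infty}\sum_{n_2=0}^{\infty}\cdots\sum_{n_m=0}^{\infty}
P\Big\{\bigcap_{i=1}^{m}\Big\{\sum_{j=1}^{n_i} X_{j,i} \le t_i < \sum_{j=1}^{n_i+1} X_{j,i}\Big\}\Big\}
\times P\{N > (n_1, n_2, \dots, n_m)\} \\
&= P\Big\{\bigcap_{i=1}^{m}\Big\{\sum_{j=1}^{N_i} X_{j,i} > t_i\Big\}\Big\}.
\end{aligned}
\]

[. . .] x] ≤uo [Y | Y > x] for all x ∈ Rn, for which these conditional random vectors are well defined.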


It follows from Theorem 6.G.9 that

X ≤whr Y =⇒ X ≤uo Y; (6.G.10)

this is a multivariate generalization of Theorem 1.B.1.

An interesting relationship between the order ≤uo and the orders ≤Sm-cx and ≤m-icx (defined in Sections 3.A.5 and 4.A.7, respectively) is given in the next theorem.

Theorem 6.G.10. Let X = (X1, X2, . . . , Xm) and Y = (Y1, Y2, . . . , Ym) be random vectors such that the (m − 1)st moment exists for each Xi and Yi, i = 1, 2, . . . , m.
(a) If X ≤uo Y, and if

\[ E\Big[\Big(\sum_{i=1}^{m} X_i\Big)^{k}\Big] = E\Big[\Big(\sum_{i=1}^{m} Y_i\Big)^{k}\Big], \qquad k = 1, 2, \dots, m-1, \]

then \(\sum_{i=1}^{m} X_i \le^{S}_{m\text{-cx}} \sum_{i=1}^{m} Y_i\), where S is the assumed common support of \(\sum_{i=1}^{m} X_i\) and of \(\sum_{i=1}^{m} Y_i\), and S is also assumed to be an interval.
(b) If X ≤uo Y, and if X and Y are nonnegative, then \(\sum_{i=1}^{m} X_i \le_{m\text{-icx}} \sum_{i=1}^{m} Y_i\).

It is of interest to compare Theorem 6.G.10 with Theorem 7.A.30 and with implication (9.A.19).

The following example gives sufficient conditions for the comparison of multivariate normal random vectors. See Examples 6.B.29, 7.A.13, 7.A.26, 7.A.39, 7.B.5, and 9.A.20 for related results.

Example 6.G.11. Let X be a multivariate normal random vector with mean vector µX and variance-covariance matrix Σ, and let Y be a multivariate normal random vector with mean vector µY and variance-covariance matrix Σ + D, where D is a matrix with zero diagonal elements such that Σ + D is nonnegative definite. If µX ≤ µY and D ≥ 0, then X ≤uo Y.

The following results give conditions that ensure stochastic equality; see Theorems 1.A.8, 3.A.43, 3.A.60, 4.A.69, 5.A.15, 6.B.19, and 7.A.14–7.A.16 for similar results. First, in the bivariate case (n = 2) we have the following result; its proof is not given here since it is a special case of Theorem 6.G.13.

Theorem 6.G.12. Let X = (X1, X2) and Y = (Y1, Y2) be two bivariate random vectors. If X1 =st Y1, X2 =st Y2, X ≤uo Y, and X ≤lo Y, then X =st Y.

Note that when n = 2, Theorem 6.B.19 is a special case of Theorem 6.G.12, as can be seen from (6.G.9). If n ≥ 3, then the conclusion of Theorem 6.G.12 need not hold. The following theorem gives conditions under which the conclusion X =st Y holds.

Theorem 6.G.13. Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two random vectors with distribution and survival functions F, F̄, G, and Ḡ,


respectively. If the m-dimensional marginals of X and Y are equal (m ≤ n − 1) and if X ≤uo Y, that is,

F̄(x) ≤ Ḡ(x) for all x ∈ Rn, (6.G.11)

and if

(−1)^n F(x) ≥ (−1)^n G(x) for all x ∈ Rn, (6.G.12)

then X =st Y.

Proof. Write

\[
\begin{aligned}
\bar F(x) &= 1 - \sum_{i} P\{X_i \le x_i\} + \sum_{i<j} P\{X_i \le x_i, X_j \le x_j\} - \cdots + (-1)^n F(x) \\
&\ge 1 - \sum_{i} P\{Y_i \le x_i\} + \sum_{i<j} P\{Y_i \le x_i, Y_j \le x_j\} - \cdots + (-1)^n G(x) \\
&= \bar G(x), \qquad x \in \mathbb{R}^n,
\end{aligned}
\]

where the equality of the m-dimensional marginals and also assumption (6.G.12) were used. Thus we get that for each x ∈ Rn, F̄(x) ≥ Ḡ(x). This, together with (6.G.11), yields the stated result.
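The inclusion–exclusion expansion used in the proof can be verified numerically for a small discrete distribution. The sketch below computes the survival function both directly and as the alternating sum over all subsets of coordinates (the empty subset contributes 1 and the full subset contributes (−1)^n F(x)); the trivariate distribution and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def joint_cdf_on(pmf, idx, x):
    """P{X_i <= x_i for all i in idx}; equals 1 when idx is empty."""
    return sum(p for pt, p in pmf.items() if all(pt[i] <= x[i] for i in idx))

def survival_direct(pmf, x):
    """P{X_1 > x_1, ..., X_n > x_n} computed from the definition."""
    return sum(p for pt, p in pmf.items() if all(c > xi for c, xi in zip(pt, x)))

def survival_inclusion_exclusion(pmf, x):
    """Alternating sum of marginal distribution functions over all index subsets."""
    n = len(x)
    return sum((-1) ** k * joint_cdf_on(pmf, idx, x)
               for k in range(n + 1)
               for idx in combinations(range(n), k))

q = Fraction(1, 4)
pmf = {(0, 0, 0): q, (1, 0, 1): q, (0, 1, 1): q, (1, 1, 1): q}
grid = list(product((-1, 0, 1), repeat=3))
print(all(survival_direct(pmf, x) == survival_inclusion_exclusion(pmf, x) for x in grid))
```

Only the last term of the expansion involves the full n-dimensional distribution function; all other terms depend on lower-dimensional marginals, which is why equality of those marginals together with (6.G.12) yields the inequality in the proof.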

An interesting relationship between the orders ≤lo and ≤Lt (see Section 5.A) is revealed in the following theorem.

Theorem 6.G.14. Let X and Y be two nonnegative random vectors. If (X1, X2, . . . , Xn) ≤lo (Y1, Y2, . . . , Yn), then

\[ \sum_{i=1}^{n} a_i X_i \le_{\mathrm{Lt}} \sum_{i=1}^{n} a_i Y_i \qquad \text{whenever } a_i \ge 0,\ i = 1, 2, \dots, n. \]

Proof. Select an s ≥ 0 and ai ≥ 0, i = 1, 2, . . . , n. The function gi defined by gi(x) = exp{−ai s x} is decreasing and nonnegative. Therefore, from (6.G.6), we obtain that

\[ E\exp\Big\{-s\sum_{i=1}^{n} a_i X_i\Big\} \ge E\exp\Big\{-s\sum_{i=1}^{n} a_i Y_i\Big\} \qquad \text{for all } s \ge 0, \]

and this yields the stated result.
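A numerical check of the Laplace transform inequality in the proof can be sketched as follows. The two bivariate distributions are illustrative assumptions: a vector with two identical fair bits is smaller in ≤lo than a vector with two independent fair bits, because its distribution function is pointwise at least as large. The function name and the weights are hypothetical.

```python
from math import exp

def laplace(pmf, a, s):
    """E exp(-s * sum_i a_i X_i) for a discrete random vector with the given pmf."""
    return sum(p * exp(-s * sum(ai * xi for ai, xi in zip(a, x)))
               for x, p in pmf.items())

comonotone = {(0, 0): 0.5, (1, 1): 0.5}                                   # smaller in <=lo
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}    # larger in <=lo
weights = (1.0, 2.0)
for s in (0.0, 0.5, 1.0, 3.0):
    print(s, laplace(comonotone, weights, s), laplace(independent, weights, s))
```

For this pair the difference of the two transforms equals (1 − e^{−s a1})(1 − e^{−s a2})/4, which is nonnegative for every s ≥ 0, in line with the conclusion of Theorem 6.G.14.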

6.G.2 The scaled order statistics orders

Consider now nonnegative random vectors X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn). For any z = (z1, z2, . . . , zn) denote by z(k) = (z1, z2, . . . , zn)(k) the kth smallest zi in {z1, z2, . . . , zn}. Thus, for a random vector Z = (Z1, Z2, . . . , Zn), the kth order statistic of Z1, Z2, . . . , Zn is Z(k) = (Z1, Z2, . . . , Zn)(k). In particular, Z(1) = min{Z1, Z2, . . . , Zn} and Z(n) = max{Z1, Z2, . . . , Zn}. The next result describes the orders ≤uo and ≤lo in a new fashion when the underlying random vectors are nonnegative (see Theorem 6.D.7 for a related result).


Theorem 6.G.15. Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two nonnegative random vectors. Then
(a) X ≤uo Y if, and only if,

min{a1 X1, . . . , an Xn} ≤st min{a1 Y1, . . . , an Yn} (6.G.13)

whenever ai > 0, i = 1, 2, . . . , n.
(b) X ≤lo Y if, and only if,

max{a1 X1, . . . , an Xn} ≤st max{a1 Y1, . . . , an Yn} (6.G.14)

whenever ai > 0, i = 1, 2, . . . , n.

Proof. Condition (6.G.13) is the same as

\[ \bar F\Big(\frac{t}{a_1}, \frac{t}{a_2}, \dots, \frac{t}{a_n}\Big) \le \bar G\Big(\frac{t}{a_1}, \frac{t}{a_2}, \dots, \frac{t}{a_n}\Big) \]

whenever t ≥ 0, ai > 0, i = 1, 2, . . . , n, which is the same as

F̄(t1, t2, . . . , tn) ≤ Ḡ(t1, t2, . . . , tn) (6.G.15)

whenever ti > 0, i = 1, 2, . . . , n. Using standard limiting arguments it is seen that (6.G.15) is the same as X ≤uo Y. This proves (a). The proof of (b) is similar.
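The first step of the proof rests on the identity P{min(a1 X1, . . . , an Xn) > t} = P{X1 > t/a1, . . . , Xn > t/an} for positive weights. The sketch below confirms it with exact arithmetic for a small discrete distribution; the distribution, the weights, and the function names are assumptions made only for the illustration.

```python
from fractions import Fraction

def survival(pmf, x):
    """P{X_1 > x_1, ..., X_n > x_n}."""
    return sum(p for pt, p in pmf.items() if all(c > xi for c, xi in zip(pt, x)))

def prob_scaled_min_exceeds(pmf, a, t):
    """P{min(a_1 X_1, ..., a_n X_n) > t}."""
    return sum(p for pt, p in pmf.items()
               if min(ai * c for ai, c in zip(a, pt)) > t)

third = Fraction(1, 3)
pmf = {(1, 2): third, (2, 1): third, (3, 3): third}   # illustrative nonnegative vector
a = (2, 1)                                            # illustrative positive weights
for t in (0, 1, 2, Fraction(5, 2), 3, 6):
    scaled_point = tuple(Fraction(t) / ai for ai in a)
    print(t, prob_scaled_min_exceeds(pmf, a, t), survival(pmf, scaled_point))
```

Because the identity holds for every choice of positive weights, comparing the scaled minima of X and Y in ≤st for all weights amounts to comparing the two survival functions at all points with positive coordinates.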

Theorem 6.G.15 suggests the following class of orders which contains the orders ≤uo and ≤lo as special cases. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two nonnegative random vectors. Suppose that (a1 X1 , a2 X2 , . . . , an Xn )(k) ≤st (a1 Y1 , a2 Y2 , . . . , an Yn )(k)

(6.G.16)

whenever ai > 0, i = 1, 2, . . . , n. Then we say that X is smaller than Y in the kth scaled order statistic order (denoted by X ≤(k) Y), k = 1, 2, . . . , n. So X ≤uo Y ⇐⇒ X ≤(1) Y and X ≤lo Y ⇐⇒ X ≤(n) Y.

The next theorem identifies a rich class of functions ψ such that E[ψ(X)] ≤ E[ψ(Y)] whenever X ≤(k) Y. First we need to introduce some notation. For m ∈ {1, 2, . . . , n} let Am be the set of all subsets of {1, 2, . . . , n} of size m. As in Section 6.A, for I = {i1, i2, . . . , im} ∈ Am and a vector x = (x1, x2, . . . , xn), we denote xI = (xi1, xi2, . . . , xim). Let M1,n denote the class of all distribution functions corresponding to nonnegative finite measures on Rn+. For x ∈ Rn+, I ∈ Am, and ψ ∈ M1,n, we denote

\[ \tilde\psi(x_I, \infty e) = \lim_{x_{\bar I} \to \infty e} \psi(x_1, x_2, \dots, x_n). \]

For k ∈ {1, 2, . . . , n} let Mk,n be the class of functions φ : Rn+ → R of the form

\[ \phi(x_1, x_2, \dots, x_n) = \sum_{m=n-k+1}^{n} (-1)^{m-n+k-1} \binom{m-1}{n-k} \sum_{I \in A_m} \tilde\psi(x_I, \infty e), \]

for some ψ ∈ M1,n, where \(\sum_{I \in A_m}\) denotes the sum over all the \(\binom{n}{m}\) elements of Am. Note that for k = 1 the two definitions of M1,n coincide. The proof of the next result is not given here; it can be found elsewhere.

Theorem 6.G.16. Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two nonnegative random vectors. Then X ≤(k) Y if, and only if, E[φ(X)] ≤ E[φ(Y)] for every φ ∈ Mk,n for which the expectations exist.

Note that both parts of Theorem 6.G.2 are special cases of Theorem 6.G.16.

The orders ≤(k) are closed under general monotone increasing transformations as the following theorem shows. The proof is easy and is omitted.

Theorem 6.G.17. Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two nonnegative random vectors. Let bi : R+ → R+ be a right continuous increasing function, i = 1, 2, . . . , n. If X ≤(k) Y, then

(b1(X1), b2(X2), . . . , bn(Xn)) ≤(k) (b1(Y1), b2(Y2), . . . , bn(Yn)).

The orders ≤(k) also satisfy the following general closure property, the proof of which can be found elsewhere and is omitted.

Theorem 6.G.18. Let X, Y, Z, and W be n-dimensional nonnegative random vectors such that X and Z are independent, and Y and W are independent. Let ci : R+^2 → R+ be a right continuous increasing function, i = 1, 2, . . . , n. If X ≤(k) Y and Z ≤(k) W, then

(c1(X1, Z1), c2(X2, Z2), . . . , cn(Xn, Zn)) ≤(k) (c1(Y1, W1), c2(Y2, W2), . . . , cn(Yn, Wn)).

From Theorem 6.G.18 we obtain the following two results as corollaries.

Theorem 6.G.19. Let X, Y , Z, and W be n-dimensional nonnegative random vectors such that X and Z are independent and Y and W are independent. If X ≤(k) Y and Z ≤(k) W , then X + Z ≤(k) Y + W ; that is, the orders ≤(k) are closed under convolutions.


Theorem 6.G.20. Let X, Y , Z, and W be n-dimensional nonnegative random vectors such that X and Z are independent and Y and W are independent. If X ≤(k) Y and Z ≤(k) W , then (min(X1 , Z1 ), min(X2 , Z2 ), . . . , min(Xn , Zn )) ≤(k) (min(Y1 , W1 ), min(Y2 , W2 ), . . . , min(Yn , Wn )) and (max(X1 , Z1 ), max(X2 , Z2 ), . . . , max(Xn , Zn )) ≤(k) (max(Y1 , W1 ), max(Y2 , W2 ), . . . , max(Yn , Wn )). The next result states a closure under marginalization property. In its statement X (i) denotes (X1 , . . . , Xi−1 , Xi+1 , . . . , Xn ) and Y (i) denotes (Y1 , . . . , Yi−1 , Yi+1 , . . . , Yn ), i = 1, 2, . . . , n. Theorem 6.G.21. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two nonnegative random vectors. Suppose that X ≤(k) Y . (a) If 1 < k ≤ n, then X (i) ≤(k−1) Y (i) . (b) If X and Y are positive with probability one and if 1 ≤ k ≤ n − 1, then X (i) ≤(k) Y (i) . It is clear from (6.G.16) that X ≤st Y =⇒ X ≤(k) Y . Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two random nonnegative vectors. By letting k − 1 of the ai ’s in (6.G.16) go to 0, and by letting n − k of the other ai ’s be ∞, it is seen that if X ≤(k) Y , then Xi ≤st Yi , i = 1, 2, . . . , n (this fact can also be obtained from Theorem 6.G.21). It follows that X ≤(k) Y =⇒ EX ≤ EY .
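The alternating coefficients (−1)^{m−n+k−1} C(m − 1, n − k) in the definition of the class Mk,n earlier in this subsection can be examined numerically. Suppose, as an illustrative assumption, that ψ is the distribution function of a unit mass at a point a, so that the limit of ψ over the coordinates outside I is the indicator that xi ≥ ai for all i ∈ I. If exactly j coordinates of x satisfy xi ≥ ai, then the inner sum over Am equals C(j, m), and the sketch below evaluates the resulting φ. The function names are hypothetical.

```python
from math import comb

def coefficient(n, k, m):
    """Coefficient of the inner sum over subsets of size m in the definition of M_{k,n}."""
    return (-1) ** (m - n + k - 1) * comb(m - 1, n - k)

def phi_on_indicators(n, k, j):
    """phi(x) for a point-mass psi when exactly j coordinates satisfy x_i >= a_i."""
    return sum(coefficient(n, k, m) * comb(j, m) for m in range(n - k + 1, n + 1))

n = 4
for k in range(1, n + 1):
    print(k, [phi_on_indicators(n, k, j) for j in range(n + 1)])
```

In each row the value is 1 when at least n − k + 1 coordinates satisfy xi ≥ ai and 0 otherwise. This agrees with (6.G.16): for positive ai, the kth smallest of the ratios xi/ai is at least 1 exactly when at least n − k + 1 of the ratios are at least 1.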

6.H Complements

Section 6.B: Many of the results described in Section 6.B can be found, or are alluded to, in Marshall and Olkin [383]. For example, the result given in Theorem 6.B.2 can be found there. Some studies of so-called integral stochastic orders, which have as their starting point relations such as (6.B.4) or (6.G.7), can be found in Marshall [382], in Mosler and Scarsini [400], in Müller [408], and in Dubra, Maccheroni, and Ok [172]. A proof


of the fact that the usual stochastic order is equivalent to an almost sure construction (Theorem 6.B.1) can be found in Kamae, Krengel, and O’Brien [272], where this result is obtained for spaces that are more general than Rn. Theorem 6.B.3 was obtained originally in Veinott [556], but various versions of it appear elsewhere and it is often rediscovered; Shanthikumar [527] has identified a condition that is weaker than (6.B.8)–(6.B.10) and which still implies X ≤st Y. A standard reference for notions of positive dependence such as association and CIS is Barlow and Proschan [36]. The condition under which CIS random vectors are stochastically ordered (Theorem 6.B.4) can be found, for example, in Langberg [332]. An extension of Theorem 6.B.4 can be found in Shanthikumar [527]. The notation ≤sst and the result in Remark 6.B.5 are taken from Li, Scarsini, and Shaked [348]. The characterization of the CIS notion for bivariate distribution functions with uniform[0, 1] margins (Remark 6.B.6) is taken from Nelsen [431, Corollary 5.2.11], where this result is derived in the context of copulas. The notion of positive dependence WCIS is introduced in Cohen and Sackrowitz [131], from which Theorem 6.B.7 is taken. The fact that association, together with the monotonicity of the ratio of the densities, implies the multivariate usual stochastic order (Theorem 6.B.8), is essentially proved in Proposition 2.6 of Perlman and Olkin [457]. The stochastic monotonicity of a random vector conditioned on the sum of its elements (Theorem 6.B.9) is taken from Efron [181], who credited it to Karlin; extensions of it can be found in Shanthikumar [527] as well as in Efron [181]. This theorem is put into the context of queuing theory in Daduna and Szekli [137]. The result which gives conditions, by means of the univariate down shifted likelihood ratio order, under which a random vector is stochastically increasing in its given sum (Theorem 6.B.10) can be found in Liggett [360].
The results that involve the stochastic monotonicity of a random vector conditioned on some of its order statistics (Theorems 6.B.11 and 6.B.12) are taken from Block, Bueno, Savits, and Shaked [91] and from Shanthikumar [527]; related results can be found in Bueno [113] and in Joag-Dev [257]. The stochastic monotonicity of the order statistics, of heterogeneous exponential random variables, in the ﬁrst order statistic (Theorem 6.B.13), is a strengthening of a result of Kochar and Korwar [314]; its conclusion also holds if it is merely assumed that X1 , X2 , . . . , Xm have proportional hazard functions (rather than having exponential distributions). The stochastic comparison of random vectors with a common copula (Theorem 6.B.14) can be found in Scarsini [491]; an extension of it is given in Li, Scarsini, and Shaked [348]. The result on the comparison of the vector of partial sums (Theorem 6.B.15) is taken from Boland, Proschan, and Tong [100], where the counterexample, mentioned after the theorem, can also be found. Some extensions of this result are given in Shaked, Shanthikumar, and Tong [519]. The result which compares random sums (Theorem 6.B.17) is taken from Pellerey [451], whereas the comparison of mixtures result (Theorem 6.B.18) is taken from Denuit


and Müller [157]. The conditions for stochastic equality (Theorem 6.B.19) can be found in Baccelli and Makowski [27]. The proof of Theorem 6.B.19 that is given in Section 6.B.5 follows the ideas of Scarsini and Shaked [494]. Lemma 2.1 of Costantini and Pasqualucci [135] is an interesting variation of Theorem 6.B.19. The characterizations of the usual stochastic order given in Theorems 6.B.20 and 6.B.21 are taken from Scarsini and Shaked [495]. The comparisons of order statistics, given in Theorem 6.B.23 and Corollary 6.B.24, can be found in Mi and Shaked [395]. These comparisons extend some results of Nanda and Shaked [428] and of Belzunce, Franco, Ruiz, and Ruiz [66, Corollary 3.2]; see a related result in Belzunce, Mercader, and Ruiz [70]. The result that is given in Example 6.B.25 is stated in Bartoszewicz [39], but without a detailed proof; an extension of it is given in Belzunce, Mercader, and Ruiz [70]. The usual stochastic order comparisons of vectors of order statistics of Gamma and Weibull random variables with different scale parameters (Examples 6.B.26 and 6.B.27) are taken from Hu [229] and from Sun and Zhang [543]. Several other examples of this kind can be found in Hu [229], and a general method for identifying such examples can be found in Hu [230]. The conditions under which infinitely divisible random vectors are comparable in the usual multivariate stochastic order (Example 6.B.28) can be found in Samorodnitsky and Taqqu [487]; see also Braverman [108] who has mistakenly confused the usual stochastic order with the upper orthant order. The necessary and sufficient conditions for the comparison of multivariate normal random vectors (Example 6.B.29) can be found in Müller [413]; extensions of this result to Kotz-type distributions are given in Ding and Zhang [168].
The multivariate IFR notions described in Section 6.B.6 are taken from Shaked and Shanthikumar [512]; however, the notion corresponding to (6.B.28) is equivalent to a multivariate IFR notion of Arjas [18]. General results concerning the usual stochastic comparison of stochastic processes (that is, results that are more general than Theorems 6.B.30 and 6.B.31) can be found in Kamae, Krengel, and O’Brien [272]; see also Block, Langberg, and Savits [93] and Rolski and Szekli [474]. Versions of the results regarding the usual stochastic comparison of Markov chains (Theorems 6.B.32 and 6.B.34) can be found in Stoyan [540, Chapter 4]. The comparison of Markov chains, one of which is skip-free positive (Theorem 6.B.35), is taken from Ferreira and Pacheco [199]; they obtained stronger results than Theorem 6.B.33 although they use a different terminology than the one used in this theorem. The discussion about the stochastic orders of point processes is based on Shaked and Szekli [521] and Szekli [544], although the definition of the orders ≤st and ≤st-N for point processes can be found already in Ebrahimi [176]; see related results in Schöttl [497]. Kulik and Szekli [325] extended these orders to k-variate point processes. The statements about the stochastic comparisons of the epoch and interepoch times of two nonhomogeneous Poisson processes (Example 6.B.41) are taken from Belzunce, Lillo, Ruiz, and Shaked [69].


Section 6.C: The development in this section follows the works of Norros [436, 437] and of Shaked and Shanthikumar [504]. A result that is similar to Theorem 6.C.1, but that gives conditions under which two point processes are stochastically ordered, can be found in Kwieciński and Szekli [328]. The fact (which is mentioned in Section 6.C.2) that the cumulative hazards of the components, by the time that they fail, are independent standard exponential random variables, follows from more general results of Aalen and Hoem [1, Section 4.5], Kurtz [326, Theorem 6.19(b)], and Jacobsen [252, Proposition 2.2.11].

Section 6.D: The development in Sections 6.D.1 and 6.D.2 follows the work of Hu, Khaledi, and Shaked [235], although the definition of the order ≤whr (with a different name), and its characterization by means of the hazard gradients (Theorem 6.D.2) can be found in Jain and Nanda [253]. In Hu, Khaledi, and Shaked [235] it is claimed that (6.D.12) in Theorem 6.D.7 is equivalent to X ≤whr Y , but this is erroneous, as was communicated to us by Antonio Colangelo. An order that is stronger than the order ≤whr is mentioned in Collet, López, and Martínez [134]. The development in Section 6.D.3 follows the work of Shaked and Shanthikumar [505]. The dynamic multivariate hazard rate order comparison of the epoch times of two nonhomogeneous Poisson processes (Example 6.D.8) is taken from Belzunce, Lillo, Ruiz, and Shaked [69]. The comparison, in the dynamic hazard rate order, of vectors of order statistics (Theorem 6.D.10), can be found in Belzunce, Ruiz, and Ruiz [75]; an extension of it is given in Belzunce, Mercader, and Ruiz [70]. The multivariate IFR notions described in Section 6.D.3 are taken from Shaked and Shanthikumar [512]; some related notions and results can be found in Bassan and Spizzichino [56].

Section 6.E: The multivariate likelihood ratio order (though using a different terminology) is studied in Karlin and Rinott [278] and in Whitt [563].
The preservation under conditioning result (Theorem 6.E.2) can be found in Rinott and Scarsini [468]. The result which shows a preservation property of the order ≤lr under random summations (Theorem 6.E.5) is taken from Pellerey [451]. The result about the relationship between the multivariate likelihood ratio and the multivariate hazard rate order (Theorem 6.E.6) is taken from Hu, Khaledi, and Shaked [235]. The relationship between the multivariate likelihood ratio and the dynamic multivariate hazard rate order (Theorem 6.E.7) can be found in Shaked and Shanthikumar [511], whereas the notion of multivariate PF2 distributions is taken from Shaked and Shanthikumar [512]. Theorem 6.E.8 has been proved in the literature in various generalities; see, for example, Holley [226] or Preston [460]. For a proof of the present statement of Theorem 6.E.8 see Karlin and Rinott [278]. The implication (6.E.6) can be found in Whitt [563]. Shanthikumar and Koo [528] studied an order which is deﬁned as in (6.E.6), except that rather than requiring A there to be a rectangular set, they require the right-hand side of (6.E.6) to hold for all planar regions


A. The statement in Remark 6.E.9 that (1.C.6) does not generalize to the multivariate case, follows from Rüschendorf [485, Theorem 8]. The order mentioned in Remark 6.E.10 is studied in Whitt [563], where other orders, related to the multivariate likelihood ratio order, are also studied. One of the counterexamples, mentioned in Remark 6.E.10, can be found in Whitt [563]. Other counterexamples can be found in Lehmann [341]; in that paper it is also claimed that Theorem 6.B.2 is wrong, but that claim is based on erroneous examples. The conditions for the monotonicity of the order ≤lr , given in Theorem 6.E.11, are taken from Rinott and Scarsini [468]. The comparison, in the multivariate likelihood ratio order, of vectors of order statistics (Theorem 6.E.12), can be found in Belzunce, Ruiz, and Ruiz [75]; an extension of it is given in Belzunce, Mercader, and Ruiz [70]. The multivariate likelihood ratio comparisons of epoch and inter-epoch times of nonhomogeneous Poisson processes (Example 6.E.13) can be found in Belzunce, Lillo, Ruiz, and Shaked [69]; in that paper these results are also extended to nonhomogeneous pure birth processes. The likelihood ratio order comparison of the vectors of the normalized spacings associated with exponential random variables (Example 6.E.14) is taken from Kochar and Rojo [318]. The result about the likelihood ratio ordering of the posterior distributions (Example 6.E.16) can be found in Fahmy, Pereira, Proschan, and Shaked [189]; see also Purcaru and Denuit [462, Proposition 5.1]. A modification of Example 6.E.16 is Theorem 3.61 of Spizzichino [539]. The proof of the equivalence of the various multivariate PF2 notions (Theorem 6.E.17) is given in Shaked and Shanthikumar [512].

Section 6.F: The development in this section follows the work of Shaked and Shanthikumar [513]. A notion that is related to the multivariate DMRL concept in Section 6.F.3 can be found in Bassan, Kochar, and Spizzichino [53].
Section 6.G: The orthant orders, which are already mentioned in Marshall and Olkin [383], have been studied further by several authors. Some of the results in Section 6.G.1 can be found in Tchen [547], Rüschendorf [481], and Mosler [401]. Several extensions of these orders can be found in Bergmann [82]. The closure results of the orthant orders given in Theorem 6.G.5, and the application to Markov chains given in Example 6.G.6, are taken from Li and Xu [350]. The result about the preservation of the orthant orders under random sums (Theorem 6.G.7) is taken from Wong [568]; this result also appeared in Denuit, Genest, and Marceau [145], and in Pellerey [451] there is an equivalent result with an alternative proof. The comparison of mixtures result (Theorem 6.G.8) can be found in Denuit and Müller [157]. The relationship between the orders ≤uo and ≤whr , given in (6.G.10), can be found in Hu, Khaledi, and Shaked [235]. The relationship between the order ≤uo and the orders ≤Sm-cx and ≤m-icx (Theorem 6.G.10) is taken from Boutsikas and Vaggelatou [107]. The sufficient conditions for the comparison of multivariate normal random vectors (Example 6.G.11) can be found in Müller [413]. Theorem 6.G.13 is taken


from Scarsini and Shaked [494], whereas Theorem 6.G.14 is adopted from Baccelli and Makowski [27]. Dyckerhoﬀ and Mosler [173] introduced some relatively easy conditions for verifying X ≤uo Y or X ≤lo Y when X and Y have ﬁnite discrete supports. The development in Section 6.G.2 follows the work of Scarsini and Shaked [493]. Hennessy [220] considered the order which is deﬁned by taking all the ai ’s in (6.G.16) to be equal to 1; he obtained for this order a result which is analogous to Theorem 6.G.16. A generalization of the order ≤uo is mentioned and studied in Daduna and Szekli [138].

7 Multivariate Variability and Related Orders

In this chapter we describe various extensions, of the univariate variability orders in Chapters 3 and 4, to the multivariate case. The most important common orders that are studied in this chapter are the increasing and the directional convex and concave orders. Multivariate extensions of the order ≤disp are also studied in this chapter. Some multivariate extensions of the transform orders, and of the Laplace transform order, are investigated in this chapter as well.

7.A The Monotone Convex and Monotone Concave Orders

7.A.1 Definitions

The multivariate orders ≤icx and ≤icv are defined in a similar fashion to their univariate counterparts discussed in Section 4.A. Let X and Y be two n-dimensional random vectors such that

\[ E[\phi(X)] \le E[\phi(Y)] \quad \text{for all increasing convex [concave] functions } \phi : \mathbb{R}^n \to \mathbb{R}, \tag{7.A.1} \]

provided the expectations exist. Then X is said to be smaller than Y in the increasing convex [concave] order (denoted by X ≤icx Y [X ≤icv Y ]). One can also define a decreasing convex [concave] order by requiring (7.A.1) to hold for all decreasing convex [concave] functions φ. But the terms “decreasing convex” and “decreasing concave” orders are counterintuitive because if X is smaller than Y in the sense of either of these two orders, then X is “larger” than Y in some stochastic sense. These orders can easily be characterized using the orders ≤icx and ≤icv . It is therefore not necessary to have a separate discussion about these orders.
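One way to make the last remark concrete is sketched below. It uses only the definition (7.A.1); the label ≤dcx for the decreasing convex order is a hypothetical notation introduced here for illustration and is not used elsewhere in the text.

```latex
% \le_{\mathrm{dcx}} is an illustrative label for the decreasing convex order.
\begin{align*}
X \le_{\mathrm{dcx}} Y
&\iff E[\phi(X)] \le E[\phi(Y)] \ \text{for all decreasing convex } \phi \\
&\iff E[\psi(-X)] \le E[\psi(-Y)] \ \text{for all increasing convex } \psi
      \qquad (\psi(x) = \phi(-x)) \\
&\iff -X \le_{\mathrm{icx}} -Y \\
&\iff Y \le_{\mathrm{icv}} X
      \qquad (\text{since } x \mapsto -\psi(-x) \text{ is increasing concave}).
\end{align*}
```

The last line shows in what sense X is "larger" than Y when X is smaller in the decreasing convex order.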


For any i, i = 1, 2, . . . , n, the function φi , defined by φi (x) = φi (x1 , x2 , . . . , xn ) = xi , is increasing and is both convex and concave. Therefore, from (7.A.1) it easily follows that

\[ X \le_{\mathrm{icx}} Y \implies E[X] \le E[Y] \tag{7.A.2} \]

and that

\[ X \le_{\mathrm{icv}} Y \implies E[X] \le E[Y], \tag{7.A.3} \]

provided the expectations exist. If the two n-dimensional random vectors X and Y are such that

\[ E[\phi(X)] \le E[\phi(Y)] \quad \text{for all convex functions } \phi : \mathbb{R}^n \to \mathbb{R}, \tag{7.A.4} \]

provided the expectations exist, then X is said to be smaller than Y in the convex order (denoted by X ≤cx Y ). For any i, i = 1, 2, . . . , n, the function φi , defined as above, and the function ψi , defined by ψi (x) = ψi (x1 , x2 , . . . , xn ) = −xi , are both convex. Therefore, from (7.A.4) it follows that

\[ X \le_{\mathrm{cx}} Y \implies E[X] = E[Y], \tag{7.A.5} \]

provided the expectations exist.

The multivariate convex order can be characterized by construction on the same probability space as the univariate convex order (see Theorem 3.A.4). This is stated next.

Theorem 7.A.1. The random vectors X and Y satisfy X ≤cx Y if, and only if, there exist two random vectors X̂ and Ŷ, defined on the same probability space, such that

\[ \hat{X} =_{\mathrm{st}} X, \tag{7.A.6} \]

\[ \hat{Y} =_{\mathrm{st}} Y, \tag{7.A.7} \]

and {X̂, Ŷ} is a martingale, that is,

\[ E[\hat{Y} \mid \hat{X}] = \hat{X} \quad \text{a.s.} \tag{7.A.8} \]

Similarly, the multivariate extension of Theorem 4.A.5 is the following.

Theorem 7.A.2. Two random vectors X and Y satisfy X ≤icx Y [X ≤icv Y ] if, and only if, there exist two random vectors X̂ and Ŷ, defined on the same probability space, such that

\[ \hat{X} =_{\mathrm{st}} X, \qquad \hat{Y} =_{\mathrm{st}} Y, \]

and {X̂, Ŷ} is a submartingale [{Ŷ, X̂} is a supermartingale], that is,

\[ E[\hat{Y} \mid \hat{X}] \ge \hat{X} \quad \big[\, E[\hat{X} \mid \hat{Y}] \le \hat{Y} \,\big] \quad \text{a.s.} \]
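The martingale construction of Theorem 7.A.1 can be illustrated on a small discrete example. The sketch below is not taken from the text: the joint distribution, the helper names, and the single convex test function are all assumptions chosen for illustration. It builds a pair in which the second vector equals the first plus a symmetric step, checks the martingale condition (7.A.8) exactly with rational arithmetic, and then evaluates both sides of (7.A.4) for one convex φ.

```python
from fractions import Fraction as F

def martingale_coupling(x_points, spread):
    """Hypothetical joint pmf {(x, y): prob}: X_hat is uniform on x_points and,
    given X_hat = x, Y_hat is x + spread or x - spread with probability 1/2 each."""
    joint = {}
    p = F(1, 2 * len(x_points))
    for x in x_points:
        up = tuple(a + b for a, b in zip(x, spread))
        down = tuple(a - b for a, b in zip(x, spread))
        joint[(x, up)] = joint.get((x, up), 0) + p
        joint[(x, down)] = joint.get((x, down), 0) + p
    return joint

def conditional_mean_of_y(joint, x):
    """E[Y_hat | X_hat = x], computed exactly from the joint pmf."""
    mass = sum(p for (xx, _), p in joint.items() if xx == x)
    return tuple(
        sum(p * y[k] for (xx, y), p in joint.items() if xx == x) / mass
        for k in range(len(x))
    )

def expectation(joint, phi, which):
    """E[phi(X_hat)] if which == 0, E[phi(Y_hat)] if which == 1."""
    return sum(p * phi(pair[which]) for pair, p in joint.items())

x_points = [(F(0), F(0)), (F(1), F(2))]
joint = martingale_coupling(x_points, spread=(F(1), F(1)))

# Condition (7.A.8): E[Y_hat | X_hat] = X_hat on every atom of X_hat.
is_martingale = all(conditional_mean_of_y(joint, x) == x for x in x_points)

# One convex test function (a max-norm plus a squared linear form).
phi = lambda v: max(abs(v[0]), abs(v[1])) + (v[0] + v[1]) ** 2
lhs = expectation(joint, phi, 0)   # E[phi(X_hat)]
rhs = expectation(joint, phi, 1)   # E[phi(Y_hat)]
```

A single test function can only illustrate the inequality; the theorem asserts it for every convex φ for which the expectations exist.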


The next theorem is a multivariate analog of Theorem 4.A.6. Its proof is similar to the proof of Theorem 4.A.6, and is therefore omitted.

Theorem 7.A.3. (a) Two random vectors X and Y satisfy X ≤icx Y if, and only if, there exists a random vector Z such that X ≤st Z ≤cx Y .
(b) Two random vectors X and Y satisfy X ≤icx Y if, and only if, there exists a random vector Z such that X ≤cx Z ≤st Y .

The next result is similar to a result of Veinott that can be found in Section 6.B.3. Veinott’s result deals with the multivariate usual stochastic order (rather than the convex order) and does not assume independence of either the Xj ’s or the Yj ’s. The convex order, however, is harder to work with than the usual stochastic order, and the result below requires the Yj ’s to be independent.

Theorem 7.A.4. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two n-dimensional random vectors. If Y1 , Y2 , . . . , Yn are independent, and if

\[ X_1 \le_{\mathrm{cx}} Y_1, \tag{7.A.9} \]

\[ [X_2 \mid X_1 = x_1] \le_{\mathrm{cx}} Y_2 \quad \text{for all } x_1, \tag{7.A.10} \]

and, in general, for i = 2, 3, . . . , n,

\[ [X_i \mid X_1 = x_1, \dots, X_{i-1} = x_{i-1}] \le_{\mathrm{cx}} Y_i \quad \text{for all } x_j,\ j = 1, 2, \dots, i-1, \tag{7.A.11} \]

then

\[ X \le_{\mathrm{cx}} Y. \tag{7.A.12} \]

The proof consists of constructing X̂ and Ŷ on the same probability space such that (7.A.6)–(7.A.8) hold. This can be done by first constructing independent Ŷ1 , Ŷ2 , . . . , Ŷn such that Ŷ =st Y . To construct the X̂i ’s, note that by Theorem 3.A.4 (using (7.A.9)) it is possible to construct an X̂1 on the same probability space such that E[Ŷ1 | X̂1 ] = X̂1 a.s. Next, given X̂1 = x1 , it is possible to construct, again using Theorem 3.A.4 and (7.A.10), an X̂2 on the same probability space such that E[Ŷ2 | X̂1 , X̂2 ] = X̂2 a.s. Continuing this way, using Theorem 3.A.4 and (7.A.11), the vector X̂ is constructed. The vectors X̂ and Ŷ satisfy the conditions of Theorem 7.A.1, and thus (7.A.12) follows.

Note that under the conditions of Theorem 7.A.4 one has

\[ \sum_{j=1}^{n} X_j \le_{\mathrm{cx}} \sum_{j=1}^{n} Y_j. \tag{7.A.13} \]

This inequality gives a stronger result than Theorem 3.A.12(d).
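The passage from (7.A.12) to (7.A.13) is not spelled out above; one standard way to see it, sketched here under the sole assumption that the expectations involved exist, is to compose a univariate convex function with the sum map.

```latex
\begin{align*}
&\text{Let } \psi : \mathbb{R} \to \mathbb{R} \text{ be convex and put }
  \phi(x_1, \dots, x_n) = \psi\Big(\sum_{j=1}^{n} x_j\Big). \\
&\text{Then } \phi \text{ is convex on } \mathbb{R}^n
  \text{ (a convex function of a linear map), so by (7.A.4) and (7.A.12)} \\
&E\Big[\psi\Big(\sum_{j=1}^{n} X_j\Big)\Big] = E[\phi(X)] \le E[\phi(Y)]
  = E\Big[\psi\Big(\sum_{j=1}^{n} Y_j\Big)\Big].
\end{align*}
```

Since ψ is an arbitrary univariate convex function, this is the univariate convex order between the two sums.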


7.A.2 Closure properties

The proofs of the following closure properties are similar to the univariate counterparts and are omitted.

Theorem 7.A.5. (a) Let X and Y be n-dimensional random vectors. If X ≤icx Y [X ≤icv Y ] and g : R^n → R^m is any increasing convex [concave] function, then g(X) ≤icx [≤icv ] g(Y ).
(b) Let X, Y , and Θ be random vectors such that [X | Θ = θ] ≤icx [≤icv ] [Y | Θ = θ] for all θ in the support of Θ. Then X ≤icx [≤icv ] Y . That is, the increasing convex [concave] order is closed under mixtures.
(c) Let {X_j , j = 1, 2, . . . } and {Y_j , j = 1, 2, . . . } be two sequences of random vectors such that X_j →st X and Y_j →st Y as j → ∞. Assume that EX_j → EX and that EY_j → EY as j → ∞. If X_j ≤cx [≤icx , ≤icv ] Y_j , j = 1, 2, . . ., then X ≤cx [≤icx , ≤icv ] Y .
(d) Let X_1 , X_2 , . . . , X_m be a set of independent random vectors and let Y_1 , Y_2 , . . . , Y_m be another set of independent random vectors. If X_i ≤icx [≤icv ] Y_i for i = 1, 2, . . . , m, then

\[ \sum_{j=1}^{m} X_j \le_{\mathrm{icx}} [\le_{\mathrm{icv}}] \sum_{j=1}^{m} Y_j. \]

That is, the increasing convex [concave] order is closed under convolutions.

Parts (a) and (d) of Theorem 7.A.5 can be generalized as follows.

Theorem 7.A.6. Let X_1 , X_2 , . . . , X_m be a set of independent random vectors, let Y_1 , Y_2 , . . . , Y_m be another set of independent random vectors, and assume that X_i and Y_i have the same dimension, i = 1, 2, . . . , m. If X_i ≤icx Y_i for i = 1, 2, . . . , m, then g(X_1 , X_2 , . . . , X_m ) ≤icx g(Y_1 , Y_2 , . . . , Y_m ) for every function g of a proper dimension that is increasing and convex in each argument.

A generalization of Theorem 7.A.5(d) is the following result which deals with vectors of random partial sums of random variables.

Theorem 7.A.7. Let {Xi } and {Yi } each be a sequence of independent random variables. Also, let {Mi } and {Ni } each be a sequence of independent positive integer-valued random variables, and suppose that the Xi ’s and the Mi ’s are independent and also that the Yi ’s and the Ni ’s are independent. Let

\[ \tilde{M}_j = \sum_{i=1}^{j} M_i, \quad \tilde{N}_j = \sum_{i=1}^{j} N_i, \quad U_j = \sum_{i=1}^{\tilde{M}_j} X_i, \quad V_j = \sum_{i=1}^{\tilde{N}_j} Y_i, \quad j = 1, 2, \dots, m. \]

If

\[ Y_i \ge 0 \ \text{a.s.}, \qquad M_i \le_{\mathrm{st}} N_i, \qquad X_i \le_{\mathrm{icx}} Y_i, \qquad i = 1, 2, \dots, \tag{7.A.14} \]

then

\[ (U_1, U_2, \dots, U_m) \le_{\mathrm{icx}} (V_1, V_2, \dots, V_m). \tag{7.A.15} \]

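Proof. According to Theorems 1.A.1 and 4.A.5 there exist sequences of random variables {X̂i }, {Ŷi }, {M̂i }, and {N̂i } such that

\[ \hat{X}_i =_{\mathrm{st}} X_i, \quad \hat{Y}_i =_{\mathrm{st}} Y_i, \quad \hat{M}_i =_{\mathrm{st}} M_i, \quad \hat{N}_i =_{\mathrm{st}} N_i, \quad i = 1, 2, \dots, \]

and

\[ \hat{M}_i \le \hat{N}_i \ \text{a.s.}, \qquad \hat{X}_i \le E[\hat{Y}_i \mid \hat{X}_i] \ \text{a.s.}, \quad i = 1, 2, \dots. \]

Define

\[ \hat{\tilde{M}}_j = \sum_{i=1}^{j} \hat{M}_i, \quad \hat{\tilde{N}}_j = \sum_{i=1}^{j} \hat{N}_i, \quad \hat{U}_j = \sum_{i=1}^{\hat{\tilde{M}}_j} \hat{X}_i, \quad \hat{V}_j = \sum_{i=1}^{\hat{\tilde{N}}_j} \hat{Y}_i, \quad j = 1, 2, \dots, m. \]

From (7.A.14) it is seen that

\[ \hat{U}_j = \sum_{i=1}^{\hat{\tilde{M}}_j} \hat{X}_i \le E\Big[ \sum_{i=1}^{\hat{\tilde{N}}_j} \hat{Y}_i \,\Big|\, \{\hat{X}_k\} \Big] = E\big[ \hat{V}_j \mid \{\hat{X}_k\} \big] \quad \text{a.s.}, \quad j = 1, 2, \dots, m. \]

Let φ be an increasing convex real m-dimensional function. Then

\[ E[\phi(\hat{U}_1, \hat{U}_2, \dots, \hat{U}_m)] \le E\big[\phi\big(E[(\hat{V}_1, \hat{V}_2, \dots, \hat{V}_m) \mid \{\hat{X}_k\}]\big)\big] \le E\big[E[\phi(\hat{V}_1, \hat{V}_2, \dots, \hat{V}_m) \mid \{\hat{X}_k\}]\big] = E[\phi(\hat{V}_1, \hat{V}_2, \dots, \hat{V}_m)], \]

where the second inequality follows from Jensen’s Inequality. Since (Û1 , Û2 , . . . , Ûm ) =st (U1 , U2 , . . . , Um ) and (V̂1 , V̂2 , . . . , V̂m ) =st (V1 , V2 , . . . , Vm ) we obtain (7.A.15).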
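The random partial sums in Theorem 7.A.7 are easy to misread in flattened notation, so the following sketch computes them for one realization. The function name and the sample values are hypothetical; the code only illustrates how the vector (U1 , . . . , Um ) is assembled from realized summands and realized counts, and how nonnegative summands together with pathwise smaller counts give componentwise smaller partial sums. It does not verify the order (7.A.15) itself.

```python
from itertools import accumulate

def random_partial_sums(xs, ms):
    """For realized summands xs = (x_1, x_2, ...) and counts ms = (m_1, ..., m_m),
    return (u_1, ..., u_m) with u_j = x_1 + ... + x_{m_1 + ... + m_j}."""
    cumulative_counts = list(accumulate(ms))      # realized cumulative counts
    if cumulative_counts and cumulative_counts[-1] > len(xs):
        raise ValueError("not enough summands for the requested counts")
    prefix = [0] + list(accumulate(xs))           # prefix[k] = x_1 + ... + x_k
    return [prefix[k] for k in cumulative_counts]

# Nonnegative summands, and counts with m_i <= n_i for each i.
ys = [2, 1, 4, 3, 5, 1]
u = random_partial_sums(ys, ms=[1, 2, 1])   # cumulative counts (1, 3, 4)
v = random_partial_sums(ys, ms=[2, 2, 2])   # cumulative counts (2, 4, 6)
dominated = all(a <= b for a, b in zip(u, v))
```

With negative summands the componentwise domination can fail, which is one way to see why the nonnegativity condition appears among the hypotheses.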

Let X_1 , X_2 , . . . , X_m be m countably infinite vectors of independent nonnegative random variables, and let M = (M1 , M2 , . . . , Mm ) and N = (N1 , N2 , . . . , Nm ) be two vectors of nonnegative integers which are independent of the X_i ’s. Denote by Xj,i the ith element of X_j . From Theorems 2.3 and 2.4 of Pellerey [451] it seems that if M ≤cx [≤icx ] N , then

\[ \Big( \sum_{i=1}^{M_1} X_{1,i}, \sum_{i=1}^{M_2} X_{2,i}, \dots, \sum_{i=1}^{M_m} X_{m,i} \Big) \le_{\mathrm{cx}} [\le_{\mathrm{icx}}] \Big( \sum_{i=1}^{N_1} X_{1,i}, \sum_{i=1}^{N_2} X_{2,i}, \dots, \sum_{i=1}^{N_m} X_{m,i} \Big). \]

However, the proofs given in that paper yield somewhat different results; see Theorem 7.A.36 for the details.

The following two results can easily be proven using Theorem 7.A.1.


Theorem 7.A.8. Let X1 , X2 , . . . , Xm be a set of independent random variables and let Y1 , Y2 , . . . , Ym be another set of independent random variables. If Xi ≤cx Yi for i = 1, 2, . . . , m, then (X1 , X2 , . . . , Xm ) ≤cx (Y1 , Y2 , . . . , Ym ).

A result that is slightly stronger than Theorem 7.A.8 is given in Theorem 7.A.24.

Theorem 7.A.9. Let the random vector X and the nonnegative random variable U be independent. If E[U ] = 1, then X ≤cx U X.

From Theorem 3.B.15 [Theorem 4.B.23] and Theorem 7.A.8 we obtain the following result.

Theorem 7.A.10. Let X1 , X2 , . . . , Xm be a set of nonnegative independent random variables, let Y1 , Y2 , . . . , Ym be another set of nonnegative independent random variables, and assume that EXi = EYi , i = 1, 2, . . . , m. If Xi ≤disp [≤nbue ] Yi for i = 1, 2, . . . , m, then (X1 , X2 , . . . , Xm ) ≤cx (Y1 , Y2 , . . . , Ym ).

An application of Theorem 7.A.1 is illustrated in the following example (which is, in fact, an extension of Example 3.A.29).

Example 7.A.11. Let X_1 , X_2 , . . . be independent and identically distributed m-dimensional random vectors. Denote by X̄_n the sample mean of X_1 , X_2 , . . . , X_n . That is, X̄_n = (X_1 + X_2 + · · · + X_n )/n. If the expectation of X_1 exists, then for any choice of positive integers n ≤ n′ one has X̄_n′ ≤cx X̄_n . In order to see it note that by the symmetry of X_1 , X_2 , . . . , X_n′ it follows that E[X_i | X̄_n′ ] = X̄_n′ for all i ≤ n′. Therefore E[X̄_n | X̄_n′ ] = X̄_n′ . That is, {X̄_n′ , X̄_n } is a martingale. The result now follows from Theorem 7.A.1.

7.A.3 Further properties

Let X and Y be random vectors. If E[φ(X)] ≤ E[φ(Y )] for all increasing functions φ, then (7.A.1) obviously holds. Thus we obtain the following result.

Theorem 7.A.12. Let X and Y be two random vectors. If X ≤st Y , then X ≤icx Y and X ≤icv Y .

The following example gives sufficient conditions for the ≤icx comparison, and necessary and sufficient conditions for the ≤cx comparison, of multivariate normal random vectors. See Examples 6.B.29, 6.G.11, 7.A.26, 7.A.39, 7.B.5, and 9.A.20 for related results.


Example 7.A.13. Let X be a multivariate normal random vector with mean vector µX and variance-covariance matrix ΣX , and let Y be a multivariate normal random vector with mean vector µY and variance-covariance matrix ΣY .
(a) If µX ≤ µY and if ΣY − ΣX is positive semidefinite, then X ≤icx Y .
(b) X ≤cx Y if, and only if, µX = µY and ΣY − ΣX is positive semidefinite.

Using Theorem 4.A.48 we can obtain conditions under which two nonnegative random vectors, that are comparable in the ≤icx or in the ≤icv orders, have the same distribution; related results are Theorems 1.A.8, 3.A.43, 3.A.60, 4.A.69, 5.A.15, 6.B.19, 6.G.12, and 6.G.13.

Theorem 7.A.14. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two nonnegative random vectors.
(a) If X ≤icx Y , and if E[Xi Xj ] = E[Yi Yj ] for all i and j, then X =st Y .
(b) If X ≤icv Y , and if EX = EY , and if E[Xi Xj ] = E[Yi Yj ] for all i and j, then X =st Y .

Proof. First we prove (a). From the assumption that X ≤icx Y it follows that $\sum_{i=1}^{n} a_i X_i \le_{\mathrm{icx}} \sum_{i=1}^{n} a_i Y_i$ for all ai ≥ 0, i = 1, 2, . . . , n. Also

\[ E\Big[\Big(\sum_{i=1}^{n} a_i X_i\Big)^2\Big] = \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j E[X_i X_j] = \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j E[Y_i Y_j] = E\Big[\Big(\sum_{i=1}^{n} a_i Y_i\Big)^2\Big]. \]

It then follows, from Theorem 4.A.48, that $\sum_{i=1}^{n} a_i X_i =_{\mathrm{st}} \sum_{i=1}^{n} a_i Y_i$ for all ai ≥ 0, i = 1, 2, . . . , n. Thus we have that $E[\exp\{-\sum_{i=1}^{n} a_i X_i\}] = E[\exp\{-\sum_{i=1}^{n} a_i Y_i\}]$ for all ai ≥ 0, i = 1, 2, . . . , n. From the unicity property of the Laplace transform we obtain X =st Y . The proof of part (b) follows from part (a) and from the observation that if $\sum_{i=1}^{n} a_i X_i \le_{\mathrm{icv}} \sum_{i=1}^{n} a_i Y_i$ and if $E[\sum_{i=1}^{n} a_i X_i] = E[\sum_{i=1}^{n} a_i Y_i]$, then $\sum_{i=1}^{n} a_i X_i \ge_{\mathrm{icx}} \sum_{i=1}^{n} a_i Y_i$.

In a similar manner, using now Theorem 3.A.42 rather than Theorem 4.A.48, we can obtain conditions under which two (not necessarily nonnegative) random vectors, that are comparable in the ≤cx order, have the same distribution.

Theorem 7.A.15. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two (not necessarily nonnegative) random vectors. If X ≤cx Y , and if Var(Xi ) = Var(Yi ), i = 1, 2, . . . , n, then X =st Y .

Proof. From the assumption that X ≤cx Y it follows that for i ≠ j we have

\[ a_i^2 E X_i^2 + a_j^2 E X_j^2 + 2 a_i a_j E[X_i X_j] = E(a_i X_i + a_j X_j)^2 \le E(a_i Y_i + a_j Y_j)^2 = a_i^2 E Y_i^2 + a_j^2 E Y_j^2 + 2 a_i a_j E[Y_i Y_j], \]

where ai and aj are any constants. Since, by assumption (and because EXi = EYi by (7.A.5)), $E X_i^2 = E Y_i^2$ and $E X_j^2 = E Y_j^2$, we have that ai aj E[Xi Xj ] ≤ ai aj E[Yi Yj ]. Since ai and aj are arbitrary, we see that E[Xi Xj ] = E[Yi Yj ]. Now, again from the assumption that X ≤cx Y it follows that $\sum_{i=1}^{n} a_i X_i \le_{\mathrm{cx}} \sum_{i=1}^{n} a_i Y_i$ for all ai , i = 1, 2, . . . , n. As in the proof of Theorem 7.A.14 we can show that $E(\sum_{i=1}^{n} a_i X_i)^2 = E(\sum_{i=1}^{n} a_i Y_i)^2$. It then follows, from Theorem 3.A.42, that $\sum_{i=1}^{n} a_i X_i =_{\mathrm{st}} \sum_{i=1}^{n} a_i Y_i$ for all ai , i = 1, 2, . . . , n. Therefore the characteristic functions of X and of Y are identical. This implies that X =st Y .
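The step in the proof of Theorem 7.A.15 that passes from an inequality valid for arbitrary constants to an equality of mixed moments can be unpacked by choosing two specific sign patterns. The following is only a sketch of that substitution, with i ≠ j fixed.

```latex
\begin{align*}
a_i = 1,\ a_j = 1 &: \quad E[X_i X_j] \le E[Y_i Y_j], \\
a_i = 1,\ a_j = -1 &: \quad -E[X_i X_j] \le -E[Y_i Y_j], \\
\text{hence} &\quad E[X_i X_j] = E[Y_i Y_j], \qquad i \ne j .
\end{align*}
```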

An interesting application of the orthant order in the context of the increasing convex and concave orders is given in the following result. The proof, which can be found elsewhere (see Section 7.D), is not given here.

Theorem 7.A.16. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two random vectors. Suppose that X ≤lo Y [respectively, X ≤uo Y ] and that

\[ -\infty < E[\phi_i(X_i)] = E[\phi_i(Y_i)] < \infty, \quad i = 1, 2, \dots, n, \]

for some nonnegative strictly increasing convex functions φi , i = 1, 2, . . . , n. If X and Y are comparable in the order ≤icx [respectively, ≤icv ], then X =st Y .

Two orders related to the multivariate monotone convex order are discussed in Sections 7.A.6 and 7.A.7 below.

7.A.4 Convex and concave ordering of stochastic processes

In Section 6.B.7 we showed that some of the results regarding the usual stochastic ordering of random vectors can be extended to the usual stochastic ordering of stochastic processes. It turns out that some of the results regarding the monotone convex and concave orderings of random vectors can also be extended to the analogous orderings of stochastic processes. In this subsection we describe a basic result that formally states that two stochastic processes are comparable in the sense of any of these orders if, and only if, any finite dimensional marginals of them are comparable in the same sense.

Let {X(n), n ∈ N++ } and {Y (n), n ∈ N++ } be two discrete-time stochastic processes with state space R. If, for all choices of a positive integer m, it holds that

\[ (X(1), X(2), \dots, X(m)) \le_{\mathrm{cx}} [\le_{\mathrm{icx}}, \le_{\mathrm{icv}}] \ (Y(1), Y(2), \dots, Y(m)), \]

then {X(n), n ∈ N++ } is said to be smaller than {Y (n), n ∈ N++ } in the convex [increasing convex, increasing concave] order (denoted by {X(n), n ∈ N++ } ≤cx [≤icx , ≤icv ] {Y (n), n ∈ N++ }). Below, a functional g is called convex [concave] if

\[ g(\{\alpha x(n) + (1-\alpha) y(n),\ n \in \mathbb{N}_{++}\}) \le [\ge] \ \alpha\, g(\{x(n),\ n \in \mathbb{N}_{++}\}) + (1-\alpha)\, g(\{y(n),\ n \in \mathbb{N}_{++}\}) \]

for all α ∈ [0, 1] and {x(n), n ∈ N++ } and {y(n), n ∈ N++ }.


Theorem 7.A.17. Let {X(n), n ∈ N++ } and {Y (n), n ∈ N++ } be two discrete-time stochastic processes with state space R. Then {X(n), n ∈ N++ } ≤cx [≤icx , ≤icv ] {Y (n), n ∈ N++ } if, and only if,

\[ E\{g(\{X(n),\ n \in \mathbb{N}_{++}\})\} \le E\{g(\{Y(n),\ n \in \mathbb{N}_{++}\})\} \tag{7.A.16} \]

for every continuous (with respect to the product topology in R^∞ ) convex [increasing convex, increasing concave] functional g for which the expectations in (7.A.16) exist.

Notice that the assumption of continuity with respect to the product topology is quite restrictive, but, as far as we know, it is the best result available.

7.A.5 The (m1 , m2 )-icx orders

The multivariate ≤icx order can be extended in a manner similar to the way in which the univariate order ≤m-icx in Section 4.A.7 extends the univariate ≤icx order. Only the bivariate extension will be described here.

Let (X1 , X2 ) and (Y1 , Y2 ) be two random vectors with a common support I × J, where I and J are finite, or half infinite, or infinite intervals in R. If E[φ(X1 , X2 )] ≤ E[φ(Y1 , Y2 )] for all (m1 + m2 )-differentiable functions φ such that

\[ \frac{\partial^{k_1+k_2}}{\partial x_1^{k_1} \partial x_2^{k_2}} \phi(x_1, x_2) \ge 0 \ \text{on } I \times J \ \text{whenever } 0 \le k_1 \le m_1,\ 0 \le k_2 \le m_2, \text{ and } k_1 + k_2 \ge 1, \]

then (X1 , X2 ) is said to be smaller than (Y1 , Y2 ) in the (m1 , m2 )-icx order (denoted by $(X_1, X_2) \le^{I \times J}_{(m_1,m_2)\text{-icx}} (Y_1, Y_2)$). If E[φ(X1 , X2 )] ≤ E[φ(Y1 , Y2 )] for all (m1 + m2 )-differentiable functions φ such that

\[ (-1)^{k_1+k_2+1} \frac{\partial^{k_1+k_2}}{\partial x_1^{k_1} \partial x_2^{k_2}} \phi(x_1, x_2) \ge 0 \ \text{on } I \times J \ \text{whenever } 0 \le k_1 \le m_1,\ 0 \le k_2 \le m_2, \text{ and } k_1 + k_2 \ge 1, \]

then (X1 , X2 ) is said to be smaller than (Y1 , Y2 ) in the (m1 , m2 )-icv order (denoted by $(X_1, X_2) \le^{I \times J}_{(m_1,m_2)\text{-icv}} (Y_1, Y_2)$).

The (m1 , m2 )-icx and the (m1 , m2 )-icv orders are related as follows:

\[ (X_1, X_2) \le^{[a_1,b_1] \times [a_2,b_2]}_{(m_1,m_2)\text{-icv}} (Y_1, Y_2) \iff (b_1 - Y_1, b_2 - Y_2) \le^{[0,b_1-a_1] \times [0,b_2-a_2]}_{(m_1,m_2)\text{-icx}} (b_1 - X_1, b_2 - X_2), \]

and

\[ (X_1, X_2) \le^{\mathbb{R}^2}_{(m_1,m_2)\text{-icv}} (Y_1, Y_2) \iff -(Y_1, Y_2) \le^{\mathbb{R}^2}_{(m_1,m_2)\text{-icx}} -(X_1, X_2). \]

Thus it suffices for most purposes to focus on the (m1 , m2 )-icx order only. Note that the orders $\le^{\mathbb{R}^2}_{(1,1)\text{-icx}}$ and $\le^{\mathbb{R}^2}_{(1,1)\text{-icv}}$ are the orders ≤uo and ≤lo (see Section 6.G.1). The orders $\le^{\mathbb{R}^2}_{(2,2)\text{-icx}}$ and $\le^{\mathbb{R}^2}_{(2,2)\text{-icv}}$ are the orders ≤uo-cx and ≤lo-cv which are discussed in Section 7.A.9 below. Also, the order $\le^{\mathbb{R}^2}_{(m,m)\text{-icv}}$ is the order ≤2m which is discussed in Section 7.A.9.

Some closure properties of the (m1 , m2 )-icx order are given in the next theorem. Some of the results below are stated for simplicity only for the case in which I = J = [0, ∞), but they can be rewritten for the general case.
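Since the (1, 1)-icx order on R² is identified above with the upper orthant order ≤uo, a comparison in that special case can be checked directly for finitely supported bivariate distributions. The sketch below assumes the usual survival-function description of ≤uo from Section 6.G.1, namely P(X1 > a, X2 > b) ≤ P(Y1 > a, Y2 > b) for all (a, b); the function names and the two example distributions are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def survival(pmf, a, b):
    """P(X1 > a, X2 > b) for a finite bivariate pmf {(x1, x2): prob}."""
    return sum(p for (x1, x2), p in pmf.items() if x1 > a and x2 > b)

def is_upper_orthant_smaller(pmf_x, pmf_y):
    """Check the survival-function inequality on a grid that captures every
    value of the two step functions: all support coordinates, plus one
    threshold strictly below the smallest coordinate in each direction."""
    grids = []
    for k in (0, 1):
        pts = sorted({pt[k] for pt in pmf_x} | {pt[k] for pt in pmf_y})
        grids.append([pts[0] - 1] + pts)
    return all(survival(pmf_x, a, b) <= survival(pmf_y, a, b)
               for a, b in product(*grids))

half, quarter = F(1, 2), F(1, 4)
# Two fair 0/1 coordinates: independent versus perfectly positively dependent.
independent = {(0, 0): quarter, (0, 1): quarter, (1, 0): quarter, (1, 1): quarter}
comonotone = {(0, 0): half, (1, 1): half}

forward = is_upper_orthant_smaller(independent, comonotone)
backward = is_upper_orthant_smaller(comonotone, independent)
```

Both distributions have the same margins, so the comparison reflects only the difference in dependence between the two coordinates.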


Theorem 7.A.18. (a) Let (X1 , X2 ) and (Y1 , Y2 ) be two random vectors with a common support I × J. Let K and L be two intervals in R, and let φ1 : I → K and φ2 : J → L be two univariate functions with nonnegative first m1 and m2 derivatives, respectively. If $(X_1, X_2) \le^{I \times J}_{(m_1,m_2)\text{-icx}} (Y_1, Y_2)$, then $(\phi_1(X_1), \phi_2(X_2)) \le^{K \times L}_{(m_1,m_2)\text{-icx}} (\phi_1(Y_1), \phi_2(Y_2))$.
(b) Let (X1 , X2 ), (Y1 , Y2 ), and Θ be random vectors such that $[(X_1, X_2) \mid \Theta = \theta] \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} [(Y_1, Y_2) \mid \Theta = \theta]$ for all θ in the support of Θ. Then $(X_1, X_2) \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} (Y_1, Y_2)$. That is, the (m1 , m2 )-icx order is closed under mixtures.
(c) Let {(X11 , X12 ), (X21 , X22 ), . . . } be a sequence of independent random vectors and let {(Y11 , Y12 ), (Y21 , Y22 ), . . . } be another sequence of independent random vectors. Furthermore, let N be a positive integer-valued random variable which is independent of the above random vectors. If $(X_{j1}, X_{j2}) \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} (Y_{j1}, Y_{j2})$ for j = 1, 2, . . ., then

\[ \sum_{j=1}^{N} (X_{j1}, X_{j2}) \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} \sum_{j=1}^{N} (Y_{j1}, Y_{j2}). \]

In particular, the (m1 , m2 )-icx order is closed under convolutions.

Part (c) of Theorem 7.A.18 can be used, for example, to prove (9.A.11) in Chapter 9.

The bivariate (m1 , m2 )-icx orders imply some interesting results on their univariate components.

Theorem 7.A.19. Let (X1 , X2 ) and (Y1 , Y2 ) be two random vectors with a common support [0, ∞)². Let φ be a bivariate function which satisfies

\[ \frac{\partial^{k_1+k_2}}{\partial x_1^{k_1} \partial x_2^{k_2}} \phi(x_1, x_2) \ge 0 \ \text{on } [0, \infty)^2 \ \text{whenever } 0 \le k_1 \le m_1,\ 0 \le k_2 \le m_2, \text{ and } k_1 + k_2 \ge 1. \]

If $(X_1, X_2) \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} (Y_1, Y_2)$, then φ(X1 , X2 ) ≤(m1 +m2 )-icx φ(Y1 , Y2 ).

This result can be used, for example, to prove the second inequality in Theorem 9.A.18 in Chapter 9.

Theorem 7.A.20. Let (X1 , X2 ) and (Y1 , Y2 ) be two random vectors, of independent components, with a common support [0, ∞)². Then

\[ (X_1, X_2) \le^{[0,\infty)^2}_{(m_1,m_2)\text{-icx}} (Y_1, Y_2) \iff X_1 \le_{m_1\text{-icx}} Y_1 \ \text{and} \ X_2 \le_{m_2\text{-icx}} Y_2. \]

7.A.6 The symmetric convex order

Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two random vectors. When X and Y have exchangeable (that is, permutation symmetric) distribution functions, it is of interest to consider orders defined by the condition

Eφ(X) ≤ Eφ(Y) for all functions in a certain class of (permutation) symmetric functions. One such order is defined as follows. Suppose that X and Y are such that

Eφ(X) ≤ Eφ(Y) for all symmetric convex functions φ : R^n → R,

provided the expectations exist. Then X is said to be smaller than Y in the symmetric convex order (denoted as X ≤symcx Y). The following relationship between the orders ≤cx and ≤symcx is obvious.

Theorem 7.A.21. Let X and Y be two random vectors. If X ≤cx Y, then X ≤symcx Y.

A further discussion regarding the order ≤symcx can be found in Chapter 7 by Tong in [515].

7.A.7 The componentwise convex order

Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors. Suppose that X and Y are such that

Eφ(X) ≤ Eφ(Y) for all [increasing] functions φ : R^n → R that are convex in each argument when the other arguments are held fixed,

provided the expectations exist. Then X is said to be smaller than Y in the [increasing] componentwise convex order (denoted by X ≤ccx [≤iccx] Y). The following relationship between the orders ≤ccx [≤iccx] and ≤cx [≤icx] is obvious.

Theorem 7.A.22. Let X and Y be two random vectors. If X ≤ccx [≤iccx] Y, then X ≤cx [≤icx] Y.

The functions φ1(x1, x2, ..., xn) = xi xj and φ2(x1, x2, ..., xn) = −xi xj are both componentwise convex, 1 ≤ i < j ≤ n. The next result thus follows from Theorem 7.A.22 and (7.A.5).

Theorem 7.A.23. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors. If X ≤ccx Y, then Cov(Xi, Xj) = Cov(Yi, Yj), 1 ≤ i < j ≤ n.

Theorem 7.A.24. Let X1, X2, ..., Xm be a set of independent random variables and let Y1, Y2, ..., Ym be another set of independent random variables. If Xi ≤cx [≤icx] Yi for i = 1, 2, ..., m, then (X1, X2, ..., Xm) ≤ccx [≤iccx] (Y1, Y2, ..., Ym).

Proof. The parenthetical statement follows at once from Theorem 4.A.15. The proof of the other statement is similar to the proof of that theorem. As in there, we can assume, without loss of generality, that all the 2m random variables are independent. The proof is by induction on m. For m = 1 the result is obvious. Assume that the stated result is true for vectors of size m − 1. Let φ be a componentwise convex function. Then
\[ E[\phi(X_1, X_2, \dots, X_m) \mid X_1 = x] = E[\phi(x, X_2, \dots, X_m)] \le E[\phi(x, Y_2, \dots, Y_m)] = E[\phi(X_1, Y_2, \dots, Y_m) \mid X_1 = x], \]
where the equalities above follow from the independence assumption and the inequality follows from the induction hypothesis. Taking expectations with respect to X1, we obtain
\[ E[\phi(X_1, X_2, \dots, X_m)] \le E[\phi(X_1, Y_2, \dots, Y_m)]. \]
Repeating the argument, but now conditioning on Y2, ..., Ym and using X1 ≤cx Y1, we see that
\[ E[\phi(X_1, Y_2, \dots, Y_m)] \le E[\phi(Y_1, Y_2, \dots, Y_m)], \]
and this proves the result.
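The ≤ccx case of Theorem 7.A.24 can be checked exactly on a small discrete example. The sketch below is only an illustration under assumed inputs: the laws `X_LAW` and `Y_LAW`, the helper `expect`, and the four test functions are hypothetical choices made here, not objects from the text. Each X_i is a point mass at 0 and each Y_i is ±1 with probability 1/2, so X_i ≤cx Y_i by Jensen's inequality.

```python
from fractions import Fraction
from itertools import product

# Hypothetical discrete laws, written as {value: probability}.
# X_i is a point mass at 0 and Y_i is -1 or +1 with probability 1/2 each,
# so E[Y_i] = E[X_i] = 0 and X_i <=cx Y_i by Jensen's inequality.
X_LAW = {0: Fraction(1)}
Y_LAW = {-1: Fraction(1, 2), 1: Fraction(1, 2)}


def expect(phi, laws):
    """Exact E[phi(V_1, ..., V_m)] for independent V_i with the given laws."""
    total = Fraction(0)
    for combo in product(*(law.items() for law in laws)):
        prob = Fraction(1)
        for _, p in combo:
            prob *= p
        total += prob * phi(*(v for v, _ in combo))
    return total


# Test functions on R^2 that are convex in each argument separately.
COMPONENTWISE_CONVEX = {
    "x1*x2": lambda x1, x2: x1 * x2,
    "-x1*x2": lambda x1, x2: -x1 * x2,
    "x1^2 + x2^2 - 3*x1*x2": lambda x1, x2: x1**2 + x2**2 - 3 * x1 * x2,
    "max(x1, 0) * x2^2": lambda x1, x2: max(x1, 0) * x2**2,
}

# For each test function: (E phi(X_1, X_2), E phi(Y_1, Y_2)).
results = {
    name: (expect(phi, [X_LAW, X_LAW]), expect(phi, [Y_LAW, Y_LAW]))
    for name, phi in COMPONENTWISE_CONVEX.items()
}

for name, (e_x, e_y) in results.items():
    print(f"{name}: E phi(X) = {e_x}, E phi(Y) = {e_y}, ordered: {e_x <= e_y}")
```

The first two test functions are the ones used before Theorem 7.A.23; both expectations coincide for them, which is the covariance equality of that theorem in this toy case. A finite check of this kind illustrates the statement but does not prove it.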

It is not hard to show that if X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) satisfy conditions (7.A.9)–(7.A.11) of Theorem 7.A.4, and if Y1, Y2, ..., Yn are independent, then, in fact, X ≤ccx Y. This observation provides an alternative proof for the ≤ccx case of Theorem 7.A.24.

The following result may be compared with Theorem 6.B.17.

Theorem 7.A.25. Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$ be m countably infinite vectors of independent nonnegative random variables. Let M = (M1, M2, ..., Mm) and N = (N1, N2, ..., Nm) be two vectors of nonnegative integers which are independent of $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$. Denote by $X_{j,i}$ the ith element of $\mathbf{X}_j$. If $X_{j,i} \le_{\mathrm{cx}} [\le_{\mathrm{icx}}]\ X_{j,i+1}$ for j = 1, 2, ..., m, and i ≥ 1, and if M ≤ccx [≤iccx] N, then
\[ \Bigl( \sum_{i=1}^{M_1} X_{1,i},\ \sum_{i=1}^{M_2} X_{2,i},\ \dots,\ \sum_{i=1}^{M_m} X_{m,i} \Bigr) \le_{\mathrm{ccx}} [\le_{\mathrm{iccx}}] \Bigl( \sum_{i=1}^{N_1} X_{1,i},\ \sum_{i=1}^{N_2} X_{2,i},\ \dots,\ \sum_{i=1}^{N_m} X_{m,i} \Bigr). \]

The following example gives sufficient conditions for the comparison of multivariate normal random vectors. See Examples 6.B.29, 6.G.11, 7.A.13, 7.A.39, 7.B.5, and 9.A.20 for related results.

Example 7.A.26. Let X be a multivariate normal random vector with mean vector 0 and variance-covariance matrix Σ, and let Y be a multivariate normal random vector with mean vector 0 and variance-covariance matrix Σ + D, where D is a nonnegative diagonal matrix. Then X ≤ccx Y.

7.A.8 The directional convex and concave orders

Let ≤ denote the coordinatewise ordering in R^n. For x, y, z ∈ R^n we use the notation [x, y] ≤ z as a shorthand for x ≤ z and y ≤ z. Also, the notation z ≤ [x, y] stands for z ≤ x and z ≤ y. A function φ : R^n → R is said to be directionally convex [concave] if for any $x_i \in \mathbb{R}^n$, i = 1, 2, 3, 4, such that $x_1 \le [x_2, x_3] \le x_4$ and $x_1 + x_4 = x_2 + x_3$, one has
\[ \phi(x_2) + \phi(x_3) \le [\ge]\ \phi(x_1) + \phi(x_4). \tag{7.A.17} \]
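Inequality (7.A.17) is easy to test numerically. The following sketch checks it on a small integer grid in R^2 and, for contrast, also tests ordinary midpoint convexity on the same grid. The function names, the grid bounds, and the three example functions are assumptions made for this illustration; a finite grid check can refute a property but cannot prove it.

```python
from fractions import Fraction
from itertools import product


def satisfies_7A17_on_grid(phi, lo=-2, hi=2):
    """Check phi(x2) + phi(x3) <= phi(x1) + phi(x4) for all integer points
    of [lo, hi]^2 with x1 <= [x2, x3] <= x4 and x1 + x4 = x2 + x3."""
    points = list(product(range(lo, hi + 1), repeat=2))
    for x1, x4 in product(points, repeat=2):
        if any(a > b for a, b in zip(x1, x4)):
            continue  # need x1 <= x4 coordinatewise
        # x2 ranges over the box [x1, x4]; x3 is then forced by x1 + x4 = x2 + x3.
        for x2 in product(*(range(a, b + 1) for a, b in zip(x1, x4))):
            x3 = tuple(a + b - c for a, b, c in zip(x1, x4, x2))
            if phi(*x2) + phi(*x3) > phi(*x1) + phi(*x4):
                return False
    return True


def midpoint_convex_on_grid(phi, lo=-2, hi=2):
    """Check phi((a + b) / 2) <= (phi(a) + phi(b)) / 2 for all grid points a, b."""
    points = list(product(range(lo, hi + 1), repeat=2))
    for a, b in product(points, repeat=2):
        mid = tuple(Fraction(s + t, 2) for s, t in zip(a, b))
        if phi(*mid) > Fraction(phi(*a) + phi(*b), 2):
            return False
    return True


EXAMPLES = {
    "x*y": lambda x, y: x * y,
    "(x - y)^2": lambda x, y: (x - y) ** 2,
    "x^2 + y^2 + x*y": lambda x, y: x * x + y * y + x * y,
}

# For each example: (passes (7.A.17) on the grid, passes midpoint convexity on the grid).
report = {
    name: (satisfies_7A17_on_grid(phi), midpoint_convex_on_grid(phi))
    for name, phi in EXAMPLES.items()
}

for name, (dir_cx, cx) in report.items():
    print(f"{name}: (7.A.17) holds on grid = {dir_cx}, midpoint convex on grid = {cx}")
```

In this sketch the product x·y passes (7.A.17) but fails midpoint convexity, while (x − y)^2 is convex but violates (7.A.17), for instance at x1 = (0, 0), x2 = (1, 0), x3 = (0, 1), x4 = (1, 1).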

A function φ : R^n → R^m is called directionally convex [concave] if the coordinate functions φi, i = 1, 2, ..., m, defined by φ(x) = (φ1(x), φ2(x), ..., φm(x)), are directionally convex [concave]. Directional convexity neither implies, nor is implied by, conventional convexity. However, a univariate function is directionally convex [concave] if, and only if, it is convex [concave].

A function φ : R^n → R is said to be supermodular [submodular] if for any x, y ∈ R^n it satisfies φ(x) + φ(y) ≤ [≥] φ(x ∧ y) + φ(x ∨ y), where the operators ∧ and ∨ denote coordinatewise minimum and maximum, respectively. If φ : R^n → R has second partial derivatives, then it is supermodular if, and only if, $\frac{\partial^2}{\partial x_i\,\partial x_j}\phi \ge 0$ for all i ≠ j. Many examples of supermodular functions can be found in Marshall and Olkin [383, Chapter 6].

Proposition 7.A.27. The following statements are equivalent:
(a) The function φ is directionally convex [concave].
(b) The function φ is supermodular [submodular] and coordinatewise convex [concave].
(c) For any x1, x2, y ∈ R^n, such that x1 ≤ x2 and y ≥ 0, one has φ(x1 + y) − φ(x1) ≤ [≥] φ(x2 + y) − φ(x2).

If φ is twice differentiable, then it is directionally convex [concave] if, and only if, all its second derivatives are nonnegative [nonpositive]. Another useful property of directionally convex [concave] functions is stated next.

Proposition 7.A.28. (a) If ψ : R^m → R^k is increasing and directionally convex [concave] and φ : R^n → R^m is increasing and directionally convex [concave], then the composition ψ(φ) is increasing and directionally convex [concave]. In particular, if ψ : R → R is increasing and convex [concave] and φ : R^n → R is increasing and directionally convex [concave], then the composition ψ(φ) is increasing and directionally convex [concave].
(b) If ψ : R^m → R^k is increasing and directionally convex [concave] and φ : R^n → R^m is decreasing and directionally convex [concave], then the composition ψ(φ) is decreasing and directionally convex [concave]. In particular, if ψ : R → R is increasing and convex [concave] and φ : R^n → R is decreasing and directionally convex [concave], then the composition ψ(φ) is decreasing and directionally convex [concave].

Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors. Suppose that X and Y are such that

Eφ(X) ≤ Eφ(Y) for all [increasing] functions φ : R^n → R that are directionally convex,

provided the expectations exist. Then X is said to be smaller than Y in the [increasing] directionally convex order (denoted by X ≤dir-cx [≤idir-cx] Y). The orders ≤dir-cv and ≤idir-cv are defined similarly.

The following relationships among the orders ≤dir-cx [≤idir-cx] and ≤ccx [≤iccx] follow from Proposition 7.A.27. The last assertion in the next theorem follows from the observation that −φ is directionally concave if, and only if, φ is directionally convex.

Theorem 7.A.29. Let X and Y be two random vectors. If X ≤ccx [≤iccx] Y, then X ≤dir-cx [≤idir-cx] Y. Also, if X ≤dir-cx Y, then X ≤idir-cx Y and X ≥dir-cv Y.

From Proposition 7.A.28 we obtain the following result (which may be compared with Theorems 6.G.10 and 9.A.16).

Theorem 7.A.30. Let X and Y be two n-dimensional random vectors. If X ≤idir-cx Y, then φ(X) ≤idir-cx φ(Y) for any increasing and directionally convex function φ : R^n → R^m. In particular, φ(X) ≤icx φ(Y) for any increasing and directionally convex function φ : R^n → R.

Theorem 7.A.31. Let $\{\mathbf{X}_j, j = 1, 2, \dots\}$ and $\{\mathbf{Y}_j, j = 1, 2, \dots\}$ be two sequences of random vectors such that $\mathbf{X}_j \to_{\mathrm{st}} \mathbf{X}$ and $\mathbf{Y}_j \to_{\mathrm{st}} \mathbf{Y}$ as j → ∞. Assume that $E\mathbf{X}_j \to E\mathbf{X}$ and that $E\mathbf{Y}_j \to E\mathbf{Y}$ as j → ∞. If $\mathbf{X}_j \le_{\text{dir-cx}} \mathbf{Y}_j$, j = 1, 2, ..., then X ≤dir-cx Y.

From Theorems 7.A.24 and 7.A.29 we immediately obtain the next result.

Theorem 7.A.32. Let X1, X2, ..., Xm be a set of independent random variables and let Y1, Y2, ..., Ym be another set of independent random variables. If Xi ≤cx [≤icx] Yi for i = 1, 2, ..., m, then (X1, X2, ..., Xm) ≤dir-cx [≤idir-cx] (Y1, Y2, ..., Ym).

A stronger result than the ≤cx and ≤dir-cx part of Theorem 7.A.32 is Theorem 7.A.38 below. Also, the ≤icx and ≤idir-cx part of Theorem 7.A.32 still holds if it is merely assumed that (Y1, Y2, ..., Ym) is CIS (as defined in (6.B.11)) rather than assuming that it consists of independent components.

The following result (which is a generalization of Theorem 7.A.32) shows that the directionally convex orders are closed under conjunctions.

Theorem 7.A.33. Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$ be a set of independent random vectors where the dimension of $\mathbf{X}_i$ is $k_i$, i = 1, 2, ..., m. Let $\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m$ be another set of independent random vectors where the dimension of $\mathbf{Y}_i$ is $k_i$, i = 1, 2, ..., m. If $\mathbf{X}_i \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \mathbf{Y}_i$ for i = 1, 2, ..., m, then
\[ (\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m) \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ (\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m). \]

Proof. It is enough to show that if $\mathbf{X}_1$ and $\mathbf{Y}_1$ are of the same dimension $k_1$, and if Z is another random vector, of dimension k, which is independent of $\mathbf{X}_1$ and $\mathbf{Y}_1$, and if $\mathbf{X}_1 \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \mathbf{Y}_1$, then $(\mathbf{X}_1, Z) \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ (\mathbf{Y}_1, Z)$. The rest of the proof can then be obtained by induction and pairwise interchanges. So let φ be a (k1 + k)-dimensional [increasing] directionally convex function. Note that φ(x, z) is [increasing] directionally convex in x for any z, where the dimensions of x and z are k1 and k, respectively. Thus from $\mathbf{X}_1 \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \mathbf{Y}_1$ and the independence assumption we obtain
\[ E\phi(\mathbf{X}_1, Z) = E\bigl[E[\phi(\mathbf{X}_1, Z) \mid Z]\bigr] \le E\bigl[E[\phi(\mathbf{Y}_1, Z) \mid Z]\bigr] = E\phi(\mathbf{Y}_1, Z), \]
and the proof is complete.

The next result shows that the directionally convex orders are closed under convolutions.

Theorem 7.A.34. Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$ be a set of independent random vectors and let $\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m$ be another set of independent random vectors, all of the same dimension k. If $\mathbf{X}_i \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \mathbf{Y}_i$ for i = 1, 2, ..., m, then
\[ \sum_{i=1}^{m} \mathbf{X}_i \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \sum_{i=1}^{m} \mathbf{Y}_i. \]

Proof. Let φ : R^k → R be any [increasing] directionally convex function. Then the function ψ : R^{km} → R, defined by $\psi(x_1, x_2, \dots, x_m) = \phi\bigl(\sum_{i=1}^{m} x_i\bigr)$, is an [increasing] directionally convex function. The stated result now follows from Theorem 7.A.33. (The idir-cx part also follows directly from Theorems 7.A.30 and 7.A.33.)

A continuous analog of Theorem 7.A.34 (where the sums are replaced by integrals) is the following result.

Theorem 7.A.35. Let $\{X(t)\}_{t \in \mathbb{R}^d}$ and $\{Y(t)\}_{t \in \mathbb{R}^d}$ be two R-valued random fields which are a.s. Riemann-integrable. Suppose that (X(t1), X(t2), ..., X(tk)) ≤idir-cx (Y(t1), Y(t2), ..., Y(tk)) for all t1, t2, ..., tk ∈ R^d, k = 1, 2, .... Then
\[ \Bigl( \int_{B_1} X(t)\,dt,\ \int_{B_2} X(t)\,dt,\ \dots,\ \int_{B_k} X(t)\,dt \Bigr) \le_{\text{idir-cx}} \Bigl( \int_{B_1} Y(t)\,dt,\ \int_{B_2} Y(t)\,dt,\ \dots,\ \int_{B_k} Y(t)\,dt \Bigr) \]
for any disjoint bounded Borel-measurable sets B1, B2, ..., Bk in R^d, k = 1, 2, ....

The following result may be compared with Theorem 7.A.25.

Theorem 7.A.36. Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$ be m countably infinite vectors of independent nonnegative random variables. Let M = (M1, M2, ..., Mm) and N = (N1, N2, ..., Nm) be two vectors of nonnegative integers which are independent of $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$. Denote by $X_{j,i}$ the ith element of $\mathbf{X}_j$. If $X_{j,i} \le_{\mathrm{cx}} [\le_{\mathrm{icx}}]\ X_{j,i+1}$ for j = 1, 2, ..., m, and i ≥ 1, and if M ≤dir-cx [≤idir-cx] N, then
\[ \Bigl( \sum_{i=1}^{M_1} X_{1,i},\ \sum_{i=1}^{M_2} X_{2,i},\ \dots,\ \sum_{i=1}^{M_m} X_{m,i} \Bigr) \le_{\text{dir-cx}} [\le_{\text{idir-cx}}] \Bigl( \sum_{i=1}^{N_1} X_{1,i},\ \sum_{i=1}^{N_2} X_{2,i},\ \dots,\ \sum_{i=1}^{N_m} X_{m,i} \Bigr). \]

Consider now, as in Section 6.B.4, n families of univariate distribution functions $\{G^{(i)}_\theta, \theta \in \mathcal{X}_i\}$ where $\mathcal{X}_i$ is a subset of the real line R, i = 1, 2, ..., n. Let Xi(θ) denote a random variable with distribution function $G^{(i)}_\theta$, i = 1, 2, ..., n. Below we give a result which provides comparisons of two random vectors, with distribution functions of the form (6.B.18), in the [increasing] directionally convex order. The following result is a multivariate extension of Theorems 3.A.21 and 4.A.18; see Theorems 6.B.17, 6.G.8, 9.A.7, and 9.A.15 for related results.

Theorem 7.A.37. Let $\{G^{(i)}_\theta, \theta \in \mathcal{X}_i\}$, i = 1, 2, ..., n, be n families of univariate distribution functions as above. Let $\boldsymbol{\Theta}_1$ and $\boldsymbol{\Theta}_2$ be two random vectors with supports in $\prod_{i=1}^{n} \mathcal{X}_i$ and distribution functions F1 and F2, respectively. Let $\mathbf{Y}_1$ and $\mathbf{Y}_2$ be two random vectors with distribution functions H1 and H2 given by
\[ H_j(y_1, y_2, \dots, y_n) = \int_{\mathcal{X}_1} \int_{\mathcal{X}_2} \cdots \int_{\mathcal{X}_n} \prod_{i=1}^{n} G^{(i)}_{\theta_i}(y_i)\, dF_j(\theta_1, \theta_2, \dots, \theta_n), \quad (y_1, y_2, \dots, y_n) \in \mathbb{R}^n,\ j = 1, 2. \]
If for every [increasing] convex function φ,

E[φ(Xi(θ))] is [increasing] convex in θ, i = 1, 2, ..., n,

and if $\boldsymbol{\Theta}_1 \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \boldsymbol{\Theta}_2$, then $\mathbf{Y}_1 \le_{\text{dir-cx}} [\le_{\text{idir-cx}}]\ \mathbf{Y}_2$.

The following result compares, with respect to ≤dir-cx, two random vectors with the same dependence structure. Recall the definition of CIS given in (6.B.11). If every permutation of the coordinates of a random vector is CIS, then the vector is said to be conditionally increasing (CI). Recall also the definition of a copula, given in (6.B.14).

Theorem 7.A.38. Let the random vectors X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) have a common copula that is CI. If Xi ≤cx Yi, i = 1, 2, ..., n, then X ≤dir-cx Y.

Theorem 7.A.38 may be compared with Theorems 6.B.14 and 7.A.32. A result that is stronger than Theorem 7.A.38 is Theorem 9.A.25 in Section 9.A.

The following example gives necessary and sufficient conditions for the comparison of multivariate normal random vectors. See Examples 6.B.29, 6.G.11, 7.A.13, 7.B.5, and 9.A.20 for related results.

Example 7.A.39. Let X be a multivariate normal random vector with mean vector $\mu_X$ and variance-covariance matrix $\Sigma_X$, and let Y be a multivariate normal random vector with mean vector $\mu_Y$ and variance-covariance matrix $\Sigma_Y$. Then X ≤dir-cx Y if, and only if, $\mu_X = \mu_Y$ and $\Sigma_X \le \Sigma_Y$.

It is worth mentioning that the result in Example 7.A.26 implies the sufficiency part in Example 7.A.39.

In closing this subsection it is worthwhile to mention that a stochastic order, which is defined by requiring Eφ(X) ≤ Eφ(Y) to hold for all supermodular [rather than supermodular and componentwise convex, that is, directionally convex] functions φ, is studied in Section 9.A.4.

7.A.9 The orthant convex and concave orders

Analogous to the orthant orders studied in Section 6.G.1, one can introduce and study orthant convex and concave orders. This is done in this subsection.

Let X = (X1, X2, ..., Xn) be a random vector with distribution function F and multivariate survival function $\bar{F}$ (see the exact definition of a multivariate survival function in Section 6.G.1). Let Y be another n-dimensional random vector with distribution function G and survival function $\bar{G}$. If
\[ \int_{x_1}^{\infty} \int_{x_2}^{\infty} \cdots \int_{x_n}^{\infty} \bar{F}(u_1, u_2, \dots, u_n)\, du_1\, du_2 \cdots du_n \le \int_{x_1}^{\infty} \int_{x_2}^{\infty} \cdots \int_{x_n}^{\infty} \bar{G}(u_1, u_2, \dots, u_n)\, du_1\, du_2 \cdots du_n \quad \text{for all } x, \]

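then we say that X is smaller than Y in the upper orthant-convex order (denoted by X ≤uo-cx Y). If
\[ \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \cdots \int_{-\infty}^{x_n} F(u_1, u_2, \dots, u_n)\, du_1\, du_2 \cdots du_n \ge \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \cdots \int_{-\infty}^{x_n} G(u_1, u_2, \dots, u_n)\, du_1\, du_2 \cdots du_n \quad \text{for all } x, \]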

then we say that X is smaller than Y in the lower orthant-concave order (denoted by X ≤lo-cv Y).

In analogy with Theorem 6.G.1 it is not hard to obtain the following characterizations of the orders ≤uo-cx and ≤lo-cv.

Theorem 7.A.40. Let X and Y be two n-dimensional random vectors. Then
(a) X ≤uo-cx Y if, and only if,
\[ E\Bigl[ \prod_{i=1}^{n} g_i(X_i) \Bigr] \le E\Bigl[ \prod_{i=1}^{n} g_i(Y_i) \Bigr] \]
for every collection {g1, g2, ..., gn} of univariate nonnegative increasing convex functions.
(b) X ≤lo-cv Y if, and only if,
\[ E\Bigl[ \prod_{i=1}^{n} h_i(X_i) \Bigr] \le E\Bigl[ \prod_{i=1}^{n} h_i(Y_i) \Bigr] \]
for every collection {h1, h2, ..., hn} of univariate nonnegative increasing functions such that hi is concave on the union of the supports of Xi and Yi, i = 1, 2, ..., n.

From Theorem 7.A.40 it is easy to obtain the following result which is an extension of the fact that if the random variables X and Y satisfy X ≤icx Y, then φ(X) ≤icx φ(Y) for all real increasing convex functions φ on R (see Theorem 4.A.15).

Theorem 7.A.41. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors.
(a) If X ≤uo-cx Y, then (φ1(X1), φ2(X2), ..., φn(Xn)) ≤uo-cx (φ1(Y1), φ2(Y2), ..., φn(Yn)) whenever φ1, φ2, ..., φn are increasing convex functions.
(b) If X ≤lo-cv Y, then (φ1(X1), φ2(X2), ..., φn(Xn)) ≤lo-cv (φ1(Y1), φ2(Y2), ..., φn(Yn)) whenever φ1, φ2, ..., φn are increasing concave functions.

From Theorems 7.A.40 and 6.G.1 it follows that X ≤uo Y ⟹ X ≤uo-cx Y and that X ≤lo Y ⟹ X ≤lo-cv Y.

The following result may be compared with Theorems 7.A.25 and 7.A.36.

Theorem 7.A.42. Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$ be m countably infinite vectors of independent nonnegative random variables. Let M = (M1, M2, ..., Mm) and N = (N1, N2, ..., Nm) be two vectors of nonnegative integers which are independent of $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$. Denote by $X_{j,i}$ the ith element of $\mathbf{X}_j$. If $X_{j,i} \le_{\mathrm{icx}} [\ge_{\mathrm{icv}}]\ X_{j,i+1}$ for j = 1, 2, ..., m, and i ≥ 1, and if M ≤uo-cx [≤lo-cv] N, then
\[ \Bigl( \sum_{i=1}^{M_1} X_{1,i},\ \sum_{i=1}^{M_2} X_{2,i},\ \dots,\ \sum_{i=1}^{M_m} X_{m,i} \Bigr) \le_{\text{uo-cx}} [\le_{\text{lo-cv}}] \Bigl( \sum_{i=1}^{N_1} X_{1,i},\ \sum_{i=1}^{N_2} X_{2,i},\ \dots,\ \sum_{i=1}^{N_m} X_{m,i} \Bigr). \]

Consider now the function φ : R^n → R which is defined by $\phi(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} g_i(x_i)$, where each gi : R → R is increasing and convex [concave]. It is easy to verify that φ is increasing and directionally convex [concave]. Thus, from Theorem 7.A.40 we obtain that X ≤idir-cx Y ⟹ X ≤uo-cx Y and X ≤idir-cv Y ⟹ X ≤lo-cv Y. It is worth mentioning that the supermodular order, studied in Section 9.A.4, implies the orders ≤uo, ≤lo, and ≤idir-cx, mentioned above.

We now describe a multivariate extension of the univariate order ≤m-icv (see Section 4.A.7). A special case of this extension is the order $\le^{\mathbb{R}^2}_{(m,m)\text{-icv}}$ which is discussed in Section 7.A.5. A similar extension of the univariate order ≤m-icx can also be defined and studied.

For x ∈ R^n, let L(x) = {y : y ≤ x}. For an n-dimensional distribution function F define
\[ F_1(x) = F(x) \quad \text{and} \quad F_m(x) = \int_{L(x)} F_{m-1}(u)\, du. \]
For n-dimensional distribution functions F and G denote
\[ F \le^{n}_{m} G \iff F_m(x) \ge G_m(x) \quad \text{for all } x \in \mathbb{R}^n. \]

When m = 1 the above order is equivalent to the lower orthant order defined in (6.G.2). When m = 2 the above order is a multivariate (left-sided) analog of (4.A.5). If X and Y have the distribution functions F and G, respectively, then (as can be easily seen by taking m = 2) the relationship $F \le^{n}_{2} G$ is the same as X ≤lo-cv Y. Also, the order $\le^{2}_{m}$ is the order $\le^{\mathbb{R}^2}_{(m,m)\text{-icv}}$ which is discussed in Section 7.A.5.

For any n-dimensional distribution function F, its (n − 1)-dimensional marginal distribution functions are defined by
\[ F^{(i)}(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) = F(x_1, \dots, x_{i-1}, \infty, x_{i+1}, \dots, x_n), \quad i = 1, 2, \dots, n. \]
The next result shows that the order $\le^{n}_{m}$ is preserved under marginalization. Before stating it we need the following definition. The distribution function F is said to be margin-regular for m > 1 and i ≤ n if for each $x^{(i)} = (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$ for which $F^{(i)}(x^{(i)}) < \infty$, there is an $x_i \in \mathbb{R}$ such that $F(x_1, x_2, \dots, x_n) < \infty$.

Theorem 7.A.43. For n > 1, m > 1, and i ≤ n, let F and G be two n-dimensional distribution functions such that $F \le^{n}_{m} G$ and F is margin-regular for m and i. Then $F^{(i)} \le^{n-1}_{m} G^{(i)}$.

7.B Multivariate Dispersion Orders

Different characterizations of the univariate order ≤disp give rise to different multivariate dispersive orders. In this section we describe some such orders.

7.B.1 A strong multivariate dispersion order

Recall from (3.B.13) that for univariate random variables we have that X ≤disp Y if, and only if, Y =st φ(X) for some φ that satisfies φ(x′) − φ(x) ≥ x′ − x whenever x ≤ x′. An extension of this definition of the univariate dispersion order gives the multivariate dispersion order that is discussed in this subsection. A function φ : R^n → R^n is called an expansion if
\[ \|\phi(x) - \phi(x')\| \ge \|x - x'\| \quad \text{for all } x \text{ and } x' \text{ in } \mathbb{R}^n. \]
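For a linear map φ(x) = Ax the expansion condition reduces to ‖Ad‖ ≥ ‖d‖ for every direction d, which holds exactly when the symmetric matrix AᵀA − I has no negative eigenvalue. The sketch below checks this for 2 × 2 matrices using only the standard library; the helper names and the example matrices are assumptions chosen for illustration.

```python
import math


def min_eig_sym2(a, b, c):
    """Smallest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    return (a + c) / 2 - math.hypot((a - c) / 2, b)


def is_linear_expansion(A, tol=1e-12):
    """True if x -> A x on R^2 satisfies ||A x - A x'|| >= ||x - x'|| for all x, x'."""
    (p, q), (r, s) = A
    # Entries of A^T A - I (the columns of A are (p, r) and (q, s)).
    a = p * p + r * r - 1
    b = p * q + r * s
    c = q * q + s * s - 1
    return min_eig_sym2(a, b, c) >= -tol


theta = math.pi / 6
IDENTITY = ((1.0, 0.0), (0.0, 1.0))
SCALED_ROTATION = (  # rotation by 30 degrees followed by scaling by 2
    (2 * math.cos(theta), -2 * math.sin(theta)),
    (2 * math.sin(theta), 2 * math.cos(theta)),
)
SHEAR = ((1.0, 1.0), (0.0, 1.0))       # determinant 1, yet it shrinks some directions
STRETCH_SHRINK = ((2.0, 0.0), (0.0, 0.5))

for name, A in [("identity", IDENTITY), ("scaled rotation", SCALED_ROTATION),
                ("shear", SHEAR), ("stretch/shrink", STRETCH_SHRINK)]:
    print(f"{name}: expansion = {is_linear_expansion(A)}")
```

The shear example is worth noting: it preserves area, but it is not an expansion, because some pairs of points are moved closer together.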

Let X and Y be two n-dimensional random vectors. Suppose that
\[ Y =_{\mathrm{st}} \phi(X) \quad \text{for some expansion } \phi. \tag{7.B.1} \]
Then we say that X is less than Y in the strong multivariate dispersive order (denoted by X ≤SD Y). Let $J_\phi(x)$ denote the Jacobian matrix of φ at x, that is,
\[ J_\phi(x) = \Bigl( \frac{\partial \phi_i}{\partial x_j} \Bigr). \]
It is useful to note that φ is an expansion if, and only if, $J_\phi^{T}(x) J_\phi(x) - I$ is nonnegative definite, where I is the identity matrix; see Giovagnoli and Wynn [211].

It is very easy to show that the strong multivariate dispersion order ≤SD is closed under conjunctions as the following result states.

Theorem 7.B.1. Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$ be a set of independent random vectors where the dimension of $\mathbf{X}_i$ is $k_i$, i = 1, 2, ..., m. Let $\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m$ be another set of independent random vectors where the dimension of $\mathbf{Y}_i$ is $k_i$, i = 1, 2, ..., m. If $\mathbf{X}_i \le_{\mathrm{SD}} \mathbf{Y}_i$ for i = 1, 2, ..., m, then
\[ (\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m) \le_{\mathrm{SD}} (\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m). \]

The strong multivariate dispersion order ≤SD also satisfies the following closure property, the proof of which is omitted.

Theorem 7.B.2. Let X and Y be two n-dimensional random vectors. Let A be an n × n matrix such that for any orthogonal matrix Γ there exists an orthogonal matrix $\tilde{\Gamma}$ such that $\Gamma A \tilde{\Gamma} = A$. If X ≤SD Y, then AX ≤SD AY.

The following result compares, with respect to the order ≤SD, two random vectors with the same dependence structure. Recall the definition of a copula, given in (6.B.14).

Theorem 7.B.3. Let the random vectors X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) have a common copula. If Xi ≤disp Yi, i = 1, 2, ..., n, then X ≤SD Y.

An interesting application of Theorem 7.B.3 is the following result which may be compared with Theorems 6.D.10, 6.E.12, and 7.B.12.

Theorem 7.B.4. Let X(1), X(2), ..., X(n) and Y(1), Y(2), ..., Y(n) be order statistics as in Theorem 6.D.10. If X1 ≤disp Y1, then (X(1), X(2), ..., X(n)) ≤SD (Y(1), Y(2), ..., Y(n)).

Proof. The vectors (X(1), X(2), ..., X(n)) and (Y(1), Y(2), ..., Y(n)) have the same copula. By Theorem 3.B.26, X1 ≤disp Y1 implies that X(i) ≤disp Y(i), i = 1, 2, ..., n. The stated result now follows from Theorem 7.B.3.

An interesting example in which the order ≤SD arises naturally is the following. See also Examples 6.B.29, 6.G.11, 7.A.13, 7.A.26, 7.A.39, and 9.A.20.

Example 7.B.5. Let X = (X1, X2, ..., Xn) be a multivariate normal random vector with mean vector $\mu_1$, and let Y = (Y1, Y2, ..., Yn) be a multivariate normal random vector with mean vector $\mu_2$. If X and Y have the same correlation matrix, and if Var(Xi) ≤ Var(Yi), i = 1, 2, ..., n, then X ≤SD Y. This can be seen from Theorem 7.B.3 by noting that X and Y have the same copula, and that Var(Xi) ≤ Var(Yi) implies Xi ≤disp Yi, i = 1, 2, ..., n.

Arias-Nicolás, Fernández-Ponce, Luque-Calvo, and Suárez-Llorens [17] and Fernández-Ponce and Rodríguez-Griñolo [196] compared, respectively, some multivariate t and Wishart random vectors with respect to the order ≤SD.

According to Oja [441], an n-dimensional random vector Y is said to be more scattered than another n-dimensional random vector X (denoted as X ≤∆ Y) if Y =st φ(X) for some function φ : R^n → R^n that has the property that
\[ \Delta(\phi(x_1), \phi(x_2), \dots, \phi(x_{n+1})) \ge \Delta(x_1, x_2, \dots, x_{n+1}) \tag{7.B.2} \]
for all $\{x_1, x_2, \dots, x_{n+1}\} \subset \mathbb{R}^n$, where $\Delta(x_1, x_2, \dots, x_{n+1})$ is the volume of the simplex with vertices $x_1, x_2, \dots, x_{n+1}$. It is useful to note that a function φ satisfies (7.B.2) for all $\{x_1, x_2, \dots, x_{n+1}\} \subset \mathbb{R}^n$ if, and only if, the determinant of the Jacobian matrix of φ satisfies
\[ |\mathrm{Det}(J_\phi(x))| \ge 1 \quad \text{for all } x \in \mathbb{R}^n. \]
The order ≤∆, as the order ≤SD, is a multivariate extension of the characterization (3.B.13) of the univariate order ≤disp. We have the following relationship between the orders ≤∆ and ≤SD: X ≤SD Y ⟹ X ≤∆ Y.

Fernández-Ponce and Suárez-Llorens [198] introduced a multivariate dispersion order that is even stronger than ≤SD. They did it by essentially requiring (7.B.1) to hold for a particular expansion φ which is a multivariate analog of the univariate function $\phi = G^{-1}F$ in (3.B.13) in Section 3.B.

7.B.2 A weak multivariate dispersion order

The property (3.B.34) of the univariate dispersive order has an obvious multivariate analog, which is used in this subsection in order to define a multivariate dispersion order.

Let X and Y be two n-dimensional random vectors. Let X′ and Y′ be such that X′ =st X and Y′ =st Y and such that X and X′ are independent and Y and Y′ are independent. Suppose that
\[ \|X - X'\| \le_{\mathrm{st}} \|Y - Y'\|, \]
where ‖·‖ is the Euclidean norm and ≤st is the usual univariate stochastic order discussed in Section 1.A. Then we say that X is smaller than Y in the multivariate dispersion order (denoted by X ≤D Y).

The multivariate dispersion order ≤D has the desirable property that the traces of the corresponding covariance matrices are ordered as expected. This multivariate analog of (3.B.25) is shown in the next theorem.

Theorem 7.B.6. Let X and Y be two n-dimensional random vectors. If X ≤D Y, then
\[ \mathrm{tr}(\mathrm{Cov}(X)) \le \mathrm{tr}(\mathrm{Cov}(Y)). \tag{7.B.3} \]

Proof. Let X′ and Y′ be such that X′ =st X and Y′ =st Y and such that X and X′ are independent and Y and Y′ are independent. Then $\mathrm{Cov}(X) = \tfrac{1}{2} E[(X - X')(X - X')^{T}]$ and $\mathrm{Cov}(Y) = \tfrac{1}{2} E[(Y - Y')(Y - Y')^{T}]$. Therefore
\[ \mathrm{tr}(\mathrm{Cov}(X)) = \tfrac{1}{2} E\,\mathrm{tr}\bigl[(X - X')(X - X')^{T}\bigr] = \tfrac{1}{2} E\|X - X'\|^{2} \le \tfrac{1}{2} E\|Y - Y'\|^{2} = \mathrm{tr}(\mathrm{Cov}(Y)), \]
and (7.B.3) is obtained.
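The identity tr(Cov(X)) = ½ E‖X − X′‖², on which the proof rests, can be verified exactly for a finite discrete distribution. The sketch below does so with rational arithmetic; the law `LAW`, the helper names, and the scaling example are hypothetical choices made for illustration only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical discrete law on R^2, written as a list of (point, probability).
LAW = [((0, 0), Fraction(1, 2)), ((2, 0), Fraction(1, 4)), ((1, 3), Fraction(1, 4))]


def trace_cov(law):
    """Exact trace of the covariance matrix, i.e. the sum of coordinate variances."""
    n = len(law[0][0])
    mean = [sum(p * x[k] for x, p in law) for k in range(n)]
    return sum(p * (x[k] - mean[k]) ** 2 for x, p in law for k in range(n))


def half_mean_sq_distance(law):
    """Exact (1/2) E||X - X'||^2 for independent X and X' with the given law."""
    return Fraction(1, 2) * sum(
        p * q * sum((a - b) ** 2 for a, b in zip(x, y))
        for (x, p), (y, q) in product(law, repeat=2)
    )


def scale(law, c):
    """Law of cX when X has the given law."""
    return [(tuple(c * v for v in x), p) for x, p in law]


print("tr Cov(X)           =", trace_cov(LAW))
print("(1/2) E||X - X'||^2 =", half_mean_sq_distance(LAW))
print("tr Cov(2X)          =", trace_cov(scale(LAW, 2)))
```

With Y =st 2X one has ‖Y − Y′‖ =st 2‖X − X′‖, so X ≤D Y, and the last printed value shows the corresponding increase of the trace, in agreement with (7.B.3).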

The multivariate dispersion order ≤D is location-free and rotation-free as the next result shows. The proof is simple and is omitted.

Theorem 7.B.7. Let X and Y be two n-dimensional random vectors. If X ≤D Y, then ΓX + a ≤D ΛY + b, for all orthogonal matrices Γ and Λ and for all vectors a and b.

The multivariate dispersion order ≤D is also closed under conjunctions as the following result states.

Theorem 7.B.8. Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$ be a set of independent random vectors where the dimension of $\mathbf{X}_i$ is $k_i$, i = 1, 2, ..., m. Let $\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m$ be another set of independent random vectors where the dimension of $\mathbf{Y}_i$ is $k_i$, i = 1, 2, ..., m. If $\mathbf{X}_i \le_{D} \mathbf{Y}_i$ for i = 1, 2, ..., m, then
\[ (\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m) \le_{D} (\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m). \]

Proof. It is sufficient to prove the result when m = 2. Let $\mathbf{X}'_1$, $\mathbf{X}'_2$, $\mathbf{Y}'_1$, and $\mathbf{Y}'_2$ be such that
\[ \mathbf{X}'_1 =_{\mathrm{st}} \mathbf{X}_1, \quad \mathbf{X}'_2 =_{\mathrm{st}} \mathbf{X}_2, \quad \mathbf{Y}'_1 =_{\mathrm{st}} \mathbf{Y}_1, \quad \text{and} \quad \mathbf{Y}'_2 =_{\mathrm{st}} \mathbf{Y}_2. \]
Let
\[ X = (\mathbf{X}_1, \mathbf{X}_2), \quad X' = (\mathbf{X}'_1, \mathbf{X}'_2), \quad Y = (\mathbf{Y}_1, \mathbf{Y}_2), \quad \text{and} \quad Y' = (\mathbf{Y}'_1, \mathbf{Y}'_2). \]
Then
\[ \|X - X'\|^{2} = \|\mathbf{X}_1 - \mathbf{X}'_1\|^{2} + \|\mathbf{X}_2 - \mathbf{X}'_2\|^{2} \le_{\mathrm{st}} \|\mathbf{Y}_1 - \mathbf{Y}'_1\|^{2} + \|\mathbf{Y}_2 - \mathbf{Y}'_2\|^{2} = \|Y - Y'\|^{2}. \]
That is, X ≤D Y.

By construction on the same probability space (see Section 6.B.2), it is easy to prove the following result.

Theorem 7.B.9. Let X and Y be two n-dimensional random vectors. Then X ≤SD Y ⟹ X ≤D Y.

7.B.3 Dispersive orders based on constructions

The standard construction of an n-dimensional random vector X = (X1, X2, ..., Xn), from a vector (U1, U2, ..., Un) of independent uniform[0, 1] random variables, was described in Section 6.B.3. Here we first describe explicitly the function that transforms (U1, U2, ..., Un) into (X1, X2, ..., Xn). Let F be the distribution function of X. Denote by $F_1(\cdot)$ the marginal distribution function of X1, and denote by $F_{i+1|1,2,\dots,i}(\cdot \mid x_1, x_2, \dots, x_i)$ the conditional distribution function of $X_{i+1}$ given that X1 = x1, X2 = x2, ..., Xi = xi, i = 1, 2, ..., n − 1. The inverse of F1 will be denoted by $F_1^{-1}(\cdot)$ and the inverse of $F_{i+1|1,2,\dots,i}(\cdot \mid x_1, x_2, \dots, x_i)$ will be denoted by $F^{-1}_{i+1|1,2,\dots,i}(\cdot \mid x_1, x_2, \dots, x_i)$ for every (x1, x2, ..., xi) in the support of (X1, X2, ..., Xi), i = 1, 2, ..., n − 1. For (u1, u2, ..., un) ∈ (0, 1)^n denote
\[ x_1 = F_1^{-1}(u_1), \tag{7.B.4} \]
and, by induction,
\[ x_i = F^{-1}_{i|1,2,\dots,i-1}(u_i \mid x_1, x_2, \dots, x_{i-1}), \quad i = 2, 3, \dots, n. \tag{7.B.5} \]
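The recursion (7.B.4)–(7.B.5) can be sketched for a concrete bivariate law. The example below assumes, purely for illustration, that X1 is standard exponential and that, given X1 = x1, X2 is exponential with rate 1 + x1; neither this law nor the function name `standard_construction` comes from the text. Both inverse distribution functions are available in closed form, so the transformation is a two-line function.

```python
import math
import random


def standard_construction(u1, u2):
    """Map (u1, u2) in [0, 1)^2 to (x1, x2) for the assumed bivariate law.

    Assumed law (hypothetical): X1 ~ Exp(1), and X2 | X1 = x1 ~ Exp(rate 1 + x1).
    """
    x1 = -math.log(1.0 - u1)               # (7.B.4): x1 = F_1^{-1}(u1)
    x2 = -math.log(1.0 - u2) / (1.0 + x1)  # (7.B.5): x2 = F_{2|1}^{-1}(u2 | x1)
    return x1, x2


# Feed independent uniforms through the construction to sample (X1, X2).
rng = random.Random(0)
sample = [standard_construction(rng.random(), rng.random()) for _ in range(20000)]
mean_x1 = sum(x1 for x1, _ in sample) / len(sample)

print("image of (0.5, 0.5):", standard_construction(0.5, 0.5))
print("sample mean of X1 (the exact mean is 1):", round(mean_x1, 3))
```

Note that the second coordinate depends on u1 through x1, which is how the construction reproduces the dependence between X1 and X2.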

Denote the transformation (u1, u2, ..., un) ↦ (x1, x2, ..., xn) described in (7.B.4) and (7.B.5) by $\Psi^{*}_{F} : (0, 1)^n \to \mathbb{R}^n$. It is well known that $\Psi^{*}_{F}(U_1, U_2, \dots, U_n) =_{\mathrm{st}} (X_1, X_2, \dots, X_n)$. Let Y = (Y1, Y2, ..., Yn) be another random vector with distribution function G, and denote the corresponding transformation by $\Psi^{*}_{G}$. Note that $\Psi^{*}_{F}$ and $\Psi^{*}_{G}$ can be thought of as "inverses" of F and of G, respectively. The following order is a multivariate extension of the characterization (3.B.7) of the univariate order ≤disp. Suppose that
\[ \Psi^{*}_{G}(u) - \Psi^{*}_{F}(u) \quad \text{is increasing in } u \in (0, 1)^n. \]
Then X is said to be smaller than Y in the multivariate dispersion order (denoted by X ≤disp Y).

It is easy to prove that the order ≤disp is closed under conjunctions as the following result states.

Theorem 7.B.10. Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$ be a set of independent random vectors where the dimension of $\mathbf{X}_i$ is $k_i$, i = 1, 2, ..., m. Let $\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m$ be another set of independent random vectors where the dimension of $\mathbf{Y}_i$ is $k_i$, i = 1, 2, ..., m. If $\mathbf{X}_i \le_{\mathrm{disp}} \mathbf{Y}_i$ for i = 1, 2, ..., m, then
\[ (\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m) \le_{\mathrm{disp}} (\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m). \]
In particular, if the random variables X1, X2, ..., Xn and Y1, Y2, ..., Yn are independent and satisfy Xi ≤disp Yi, i = 1, 2, ..., n, then (X1, X2, ..., Xn) ≤disp (Y1, Y2, ..., Yn).

A useful property of the multivariate order ≤disp is given next. Recall from Section 6.B.3 the definition of a CIS random vector, and recall from Section 7.A.8 the definition of directionally convex functions. The proof of the following result is not given here.

Theorem 7.B.11. Let X and Y be two nonnegative CIS random vectors. If X ≤disp Y, then

Var[φ(X)] ≤ Var[φ(Y)] for all increasing directionally convex functions φ.

In particular, if (X1, X2, ..., Xn) ≤disp (Y1, Y2, ..., Yn), then Var[X1 + X2 + ··· + Xn] ≤ Var[Y1 + Y2 + ··· + Yn].

The following result may be compared with Theorems 6.D.10, 6.E.12, and 7.B.4.

Theorem 7.B.12. Let X(1), X(2), ..., X(n) and Y(1), Y(2), ..., Y(n) be order statistics as in Theorem 6.D.10. If X1 ≤disp Y1, then (X(1), X(2), ..., X(n)) ≤disp (Y(1), Y(2), ..., Y(n)).

The following example may be compared with Examples 1.B.24, 1.C.48, 2.A.22, 3.B.38, 6.B.41, 6.D.8, and 6.E.13.

Example 7.B.13. Let X and Y be two absolutely continuous nonnegative random variables with survival functions $\bar{F}$ and $\bar{G}$, respectively. Denote $\Lambda_1 = -\log \bar{F}$ and $\Lambda_2 = -\log \bar{G}$. Consider two nonhomogeneous Poisson processes N1 = {N1(t), t ≥ 0} and N2 = {N2(t), t ≥ 0} with mean functions Λ1 and Λ2 (see Example 1.B.13), respectively. Let $T_{i,1}, T_{i,2}, \dots$ be the successive epoch times of process Ni, i = 1, 2. Note that $X =_{\mathrm{st}} T_{1,1}$ and $Y =_{\mathrm{st}} T_{2,1}$. If X ≤disp Y, then $(T_{1,1}, T_{1,2}, \dots, T_{1,n}) \le_{\mathrm{disp}} (T_{2,1}, T_{2,2}, \dots, T_{2,n})$ for each n ≥ 1.


The total hazard construction of a nonnegative n-dimensional random vector T = (T1 , T2 , . . . , Tn ) with distribution function F , from a vector (X1 , X2 , . . . , Xn ) of independent standard exponential random variables, was described in Section 6.C.2. The construction deﬁnes a transformation of (X1 , X2 , . . . , Xn ) to Tˆ = (Tˆ1 , Tˆ2 , . . . , Tˆn ) such that T =st Tˆ . Denote this transformation from [0, ∞)n to [0, ∞)n by R∗F . Thus R∗F (X1 , X2 , . . . , Xn ) =st (T1 , T2 , . . . , Tn ). Let S = (S1 , S2 , . . . , Sn ) be another nonnegative random vector with distribution function G, and denote the corresponding transformation by R∗G . Note that R∗F and R∗G can be thought of as “inverses” of the “total hazards” − log F and − log G, respectively. The following order is a multivariate extension of the characterization (3.B.9) of the univariate order ≤disp . Suppose that R∗G (x) − R∗F (x) is increasing in x ∈ [0, ∞)n . Then T is said to be smaller than S in the dynamic multivariate dispersion order (denoted by T ≤dyn-disp S). The order ≤dyn-disp is closed under conjunctions as the following, easy to prove, result states. Theorem 7.B.14. Let T 1 , T 2 , . . . , T m be a set of independent random vectors where the dimension of T i is ki , i = 1, 2, . . . , m. Let S 1 , S 2 , . . . , S m be another set of independent random vectors where the dimension of S i is ki , i = 1, 2, . . . , m. If T i ≤dyn-disp S i for i = 1, 2, . . . , m, then (T 1 , T 2 , . . . , T m ) ≤dyn-disp (S 1 , S 2 , . . . , S m ). A version of Theorem 7.B.11 holds for the order ≤dyn-disp , and is given next. Recall from Section 6.C.1 that a nonnegative random vector T has the positive dependence property of “supporting lifetimes” if T ≤ch T . The proof of the following result is not given here. Theorem 7.B.15. Let T and S be two nonnegative random vectors with the supporting lifetimes property. If T ≤dyn-disp S, then Var[φ(T )] ≤ Var[φ(S)]

for all increasing directionally convex functions φ.

7.C Multivariate Transform Orders: Convex, Star, and Superadditive Orders In this section we review some extensions of the univariate orders ≤c , ≤∗ , and ≤su , which were studied in Section 4.B. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two nonnegative random vectors with survival functions F and G, respectively. Denote

\[
F_i(x) = \frac{F(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n)}{F(x_1, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n)}, \qquad x \ge 0,
\]

and

\[
G_i(x) = \frac{G(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n)}{G(x_1, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n)}, \qquad x \ge 0.
\]

For any (x1 , x2 , . . . , xn ) ≥ 0 and for any i = 1, 2, . . . , n, let ui be the solution of Gi (x1 , . . . , xi−1 , ui , xi+1 , . . . , xn ) = F i (x1 , . . . , xi−1 , xi , xi+1 , . . . , xn ). If, for every i = 1, 2, . . . , n and every (x1 , . . . , xi−1 , xi+1 , . . . , xn ), we have that ui is convex in xi , then X is said to be smaller than Y in the multivariate convex transform order (denoted as X ≤mc Y ). If, for every i = 1, 2, . . . , n and every (x1 , . . . , xi−1 , xi+1 , . . . , xn ), we have that ui is starshaped in xi , then X is said to be smaller than Y in the multivariate star order (denoted as X ≤m∗ Y ). Finally, if, for every i = 1, 2, . . . , n and every (x1 , . . . , xi−1 , xi+1 , . . . , xn ), we have that ui is superadditive in xi , then X is said to be smaller than Y in the multivariate superadditive order (denoted as X ≤msu Y ). Obviously, X ≤mc Y =⇒ X ≤m∗ Y =⇒ X ≤msu Y . The above three orders are partial orders in the sense that each of them is transitive and reﬂexive. They are also closed under marginalization: Theorem 7.C.1. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two nonnegative n-dimensional random vectors. If X ≤mc [≤m∗ , ≤msu ] Y , then X I ≤mc [≤m∗ , ≤msu ] Y I for each I ⊆ {1, 2, . . . , n}. In analogy with Theorem 4.B.11, the above three orders can be used to deﬁne multivariate notions of the IFR, IFRA, and NBU aging notions.

7.D The Multivariate Laplace Transform and Related Orders The orders we studied in Section 5.A have multivariate extensions, which we will brieﬂy review in this section. 7.D.1 The multivariate Laplace transform order Extending (5.A.1), we have the following deﬁnition of the multivariate Laplace transform order. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two nonnegative n-dimensional random vectors such that

\[
E\Bigl[\exp\Bigl\{-\sum_{i=1}^{n} s_i X_i\Bigr\}\Bigr] \ge E\Bigl[\exp\Bigl\{-\sum_{i=1}^{n} s_i Y_i\Bigr\}\Bigr] \quad \text{for all } s > 0. \tag{7.D.1}
\]

Then X is said to be smaller than Y in the Laplace transform order (denoted as X ≤Lt Y ). Throughout this section we consider only nonnegative random vectors. As in the univariate case (see Theorem 5.A.7), the multivariate order ≤Lt is closed under mixtures, limits in distribution, and convolutions. We do not formally state and prove these closure properties here. The following property of the multivariate Laplace transform order can be veriﬁed easily. Recall the notation X I and Y I from (6.A.1). Theorem 7.D.1. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two nonnegative n-dimensional random vectors. If X ≤Lt Y , then X I ≤Lt Y I for each I ⊆ {1, 2, . . . , n}. That is, the multivariate Laplace transform order is closed under marginalization. From Theorem 7.D.1 and (5.A.5) we see that X ≤Lt Y =⇒ E[Xi ] ≤ E[Yi ],

i = 1, 2, . . . , n,

(7.D.2)

provided the expectations exist. The following property is also easy to verify.

Theorem 7.D.2. Let X 1 , X 2 , . . . , X m be a set of independent random vectors where the dimension of X i is ki , i = 1, 2, . . . , m. Let Y 1 , Y 2 , . . . , Y m be another set of independent random vectors where the dimension of Y i is ki , i = 1, 2, . . . , m. If X i ≤Lt Y i for i = 1, 2, . . . , m, then (X 1 , X 2 , . . . , X m ) ≤Lt (Y 1 , Y 2 , . . . , Y m ). That is, the multivariate Laplace transform order is closed under conjunctions.

Another closure property of the multivariate Laplace transform order is given in Theorem 7.D.7.

Theorem 7.D.3. Let X and Y be two nonnegative random vectors. If X ≤lo Y or X ≤icv Y or X ≥dir-cx Y , then X ≤Lt Y . In particular, if X ≤st Y , then X ≤Lt Y .

Proof. The function hi , defined by hi (x) = exp{−si x}, is nonnegative and decreasing for each si > 0, i = 1, 2, . . . , n. Therefore X ≤lo Y =⇒ X ≤Lt Y by (6.G.6) and (7.D.1). The implication X ≤icv Y =⇒ X ≤Lt Y follows from the fact that the function φ, defined by φ(x) = exp{−(s1 x1 + s2 x2 + · · · + sn xn )}, is decreasing and convex for each s > 0 (and therefore −φ is increasing and concave). Finally, the implication X ≥dir-cx Y =⇒ X ≤Lt Y follows from the fact that the function φ above is directionally convex for each s > 0.
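As a numerical illustration of (7.D.1) and (7.D.2), the following sketch evaluates the multivariate Laplace transform in closed form for vectors of independent exponential components. The choice of exponential components, the rates, and the function names are assumptions made only for this illustration; a check on a finite grid of s is evidence for X ≤Lt Y, not a proof.

```python
import itertools
import math


def laplace_indep_exponential(rates, s):
    """E exp{-sum_i s_i X_i} when the X_i are independent exponentials
    with the given rates (an illustrative assumption, not from the text)."""
    return math.prod(r / (r + si) for r, si in zip(rates, s))


def is_smaller_in_lt_on_grid(rates_x, rates_y, grid):
    """Check inequality (7.D.1) at every point s of a finite grid."""
    n = len(rates_x)
    return all(
        laplace_indep_exponential(rates_x, s)
        >= laplace_indep_exponential(rates_y, s)
        for s in itertools.product(grid, repeat=n)
    )


rates_x = (2.0, 3.0)   # X_1, X_2 have means 1/2 and 1/3
rates_y = (1.0, 1.5)   # Y_1, Y_2 have means 1 and 2/3
grid = [0.1 * k for k in range(1, 51)]

lt_holds = is_smaller_in_lt_on_grid(rates_x, rates_y, grid)

# Implication (7.D.2): the componentwise means are ordered as well.
means_ordered = all(1 / rx <= 1 / ry for rx, ry in zip(rates_x, rates_y))
```

Here each Xi is stochastically smaller than Yi, so the outcome agrees with the last statement of Theorem 7.D.3.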


The following result is a multivariate analog of the right side of (5.A.13). It can be obtained from Jensen's Inequality.

Theorem 7.D.4. Let X be a nonnegative random vector with mean vector (µ1 , µ2 , . . . , µn ). Let Z be a random vector degenerate at (µ1 , µ2 , . . . , µn ). Then X ≤Lt Z.

The next result follows easily from (7.D.1).

Theorem 7.D.5. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two nonnegative n-dimensional random vectors. If X ≤Lt Y , then

\[
\sum_{i=1}^{n} a_i X_i \le_{\mathrm{Lt}} \sum_{i=1}^{n} a_i Y_i \quad \text{whenever } a_i \ge 0,\ i = 1, 2, \ldots, n.
\]

A multivariate analog of Theorem 5.A.3 is the following result. Its proof is similar to the proof of Theorem 5.A.3 and is therefore omitted.

Theorem 7.D.6. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two nonnegative n-dimensional random vectors. Then X ≤Lt Y if, and only if,

\[
E\Bigl[\prod_{i=1}^{n} \phi_i(X_i)\Bigr] \ge E\Bigl[\prod_{i=1}^{n} \phi_i(Y_i)\Bigr]
\]

for all completely monotone functions φi , i = 1, 2, . . . , n, provided the expectations exist.

When X and Y are vectors of nonnegative integer-valued random variables, it is customary and convenient to work with their probability generating functions, rather than with their Laplace transforms. This suggests the following definition. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two vectors of nonnegative integer-valued random variables such that

\[
E\Bigl[\prod_{i=1}^{n} t_i^{X_i}\Bigr] \ge E\Bigl[\prod_{i=1}^{n} t_i^{Y_i}\Bigr] \quad \text{for all } t \in (0, 1)^n. \tag{7.D.3}
\]

Then X is said to be smaller than Y in the multivariate probability generating function order (denoted by X ≤pgf Y ). It is easy to see that (7.D.3) holds if, and only if, (7.D.1) holds. That is, X ≤pgf Y ⇐⇒ X ≤Lt Y . A preservation property of the Laplace transform order is described in the next theorem. It is a multivariate extension of Theorem 5.A.9.


Theorem 7.D.7. For i = 1, 2, . . . , m, let {Xj,i , j = 1, 2, . . . } be a sequence of nonnegative identically distributed random vectors, and assume that all the Xj,i 's are mutually independent. Let M = (M1 , M2 , . . . , Mm ) and N = (N1 , N2 , . . . , Nm ) be two vectors of nonnegative integer-valued random variables. Assume that both M and N are independent of the Xj,i 's. If M ≤pgf N , then

\[
\Bigl(\sum_{j=1}^{M_1} X_{j,1}, \sum_{j=1}^{M_2} X_{j,2}, \ldots, \sum_{j=1}^{M_m} X_{j,m}\Bigr) \le_{\mathrm{Lt}} \Bigl(\sum_{j=1}^{N_1} X_{j,1}, \sum_{j=1}^{N_2} X_{j,2}, \ldots, \sum_{j=1}^{N_m} X_{j,m}\Bigr).
\]

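Proof. For fixed (n1 , n2 , . . . , nm ) and fixed bi > 0, i = 1, 2, . . . , m, we compute

\[
E\Bigl[e^{-\sum_{i=1}^{m} b_i \sum_{j=1}^{n_i} X_{j,i}}\Bigr] = \prod_{i=1}^{m} E\Bigl[e^{-b_i \sum_{j=1}^{n_i} X_{j,i}}\Bigr] = \prod_{i=1}^{m} \bigl[L_{X_{1,i}}(b_i)\bigr]^{n_i},
\]

where LX1,i denotes the Laplace transform of X1,i , i = 1, 2, . . . , m. Therefore

\[
E\Bigl[e^{-\sum_{i=1}^{m} b_i \sum_{j=1}^{M_i} X_{j,i}}\Bigr] = E\Bigl[\prod_{i=1}^{m} \bigl[L_{X_{1,i}}(b_i)\bigr]^{M_i}\Bigr] \ge E\Bigl[\prod_{i=1}^{m} \bigl[L_{X_{1,i}}(b_i)\bigr]^{N_i}\Bigr] = E\Bigl[e^{-\sum_{i=1}^{m} b_i \sum_{j=1}^{N_i} X_{j,i}}\Bigr].
\]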
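The conditioning identity used in this proof can be illustrated numerically. The sketch below assumes, purely for illustration, that the counts are independent Poisson random variables and that the summands are exponential; neither assumption is part of Theorem 7.D.7, and all function names are hypothetical. Under these assumptions the Laplace transform of the vector of random sums equals the probability generating function of the counts evaluated at the Laplace transforms of the summands.

```python
import itertools
import math


def pgf_indep_poisson(means, t):
    """E prod_i t_i^{M_i} for independent Poisson counts (assumed model)."""
    return math.exp(sum(lam * (ti - 1.0) for lam, ti in zip(means, t)))


def laplace_exponential(rate, b):
    """Laplace transform of an exponential summand with the given rate."""
    return rate / (rate + b)


def laplace_random_sums(count_means, summand_rates, b):
    """Laplace transform of the vector of random sums, obtained by
    conditioning on the counts as in the proof above."""
    t = [laplace_exponential(r, bi) for r, bi in zip(summand_rates, b)]
    return pgf_indep_poisson(count_means, t)


def laplace_random_sum_series(count_mean, rate, b, terms=200):
    """One-dimensional cross-check: sum_k P{M = k} L(b)^k, truncated."""
    L = laplace_exponential(rate, b)
    total, pmf = 0.0, math.exp(-count_mean)   # pmf starts at P{M = 0}
    for k in range(terms):
        total += pmf * L ** k
        pmf *= count_mean / (k + 1)           # P{M = k + 1} from P{M = k}
    return total


means_m, means_n = (1.0, 2.0), (1.5, 2.5)     # componentwise smaller Poisson means
rates = (1.0, 2.0)

t_grid = [0.1 * k for k in range(1, 10)]
pgf_ordered = all(
    pgf_indep_poisson(means_m, t) >= pgf_indep_poisson(means_n, t)
    for t in itertools.product(t_grid, repeat=2)
)

b_grid = [0.5 * k for k in range(1, 11)]
lt_ordered = all(
    laplace_random_sums(means_m, rates, b) >= laplace_random_sums(means_n, rates, b)
    for b in itertools.product(b_grid, repeat=2)
)

series_gap = abs(
    laplace_random_sum_series(2.0, 1.0, 0.7)
    - laplace_random_sums((2.0,), (1.0,), (0.7,))
)
```

The grid checks only sample finitely many arguments, so they illustrate the statement rather than verify it.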

7.D.2 The multivariate factorial moments order

Let X and Y be two vectors of nonnegative integer-valued random variables such that

\[
E\Bigl[\prod_{i=1}^{n} \binom{X_i}{j_i}\Bigr] \le E\Bigl[\prod_{i=1}^{n} \binom{Y_i}{j_i}\Bigr] \quad \text{for all } j_i \in \mathbb{N}_+,\ i = 1, 2, \ldots, n. \tag{7.D.4}
\]

Then X is said to be smaller than Y in the factorial moments order (denoted by X ≤fm Y ). It is easy to see that X ≤fm Y =⇒ EX ≤ EY . The proofs of the following three results are similar to the proofs of Theorems 5.C.2, 5.C.4, and 5.C.5, respectively. We omit the straightforward details.

Theorem 7.D.8. (a) Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two vectors of nonnegative integer-valued random variables. If X ≤fm Y , then X + k ≤fm Y + k for every k ∈ Nn+ .


(b) Let (X1 , X2 , . . . , Xn ) and (Y1 , Y2 , . . . , Yn ) be two vectors of nonnegative integer-valued random variables. If (X1 , X2 , . . . , Xn ) ≤fm (Y1 , Y2 , . . . , Yn ), then (k1 X1 , k2 X2 , . . . , kn Xn ) ≤fm (k1 Y1 , k2 Y2 , . . . , kn Yn ) for every (k1 , k2 , . . . , kn ) ∈ Nn+ .

(c) Let X 1 , X 2 , . . . , X m be a set of independent n-dimensional vectors of nonnegative integer-valued random variables. Let Y 1 , Y 2 , . . . , Y m be another set of independent n-dimensional vectors of nonnegative integer-valued random variables. If X i ≤fm Y i , i = 1, 2, . . . , m, then

\[
\sum_{i=1}^{m} \boldsymbol{X}_i \le_{\mathrm{fm}} \sum_{i=1}^{m} \boldsymbol{Y}_i.
\]

Theorem 7.D.9. Let X and Y be two vectors of nonnegative integer-valued random variables. If X ≤icx Y , then X ≤fm Y . In particular, if X ≤st Y , then X ≤fm Y .

Theorem 7.D.10. Let X and Y be two vectors of nonnegative integer-valued random variables with bounded support {0, 1, . . . , b1 } × {0, 1, . . . , b2 } × · · · × {0, 1, . . . , bn }. If X ≤fm Y , then b − Y ≤pgf b − X.

7.D.3 The multivariate moments order

Consider now two vectors, of general (that is, not necessarily integer-valued) nonnegative random variables, X and Y such that

\[
E\Bigl[\prod_{i=1}^{n} X_i^{j_i}\Bigr] \le E\Bigl[\prod_{i=1}^{n} Y_i^{j_i}\Bigr] \quad \text{for all } j_i \in \mathbb{N}_+,\ i = 1, 2, \ldots, n.
\]

Then X is said to be smaller than Y in the moments order (denoted as X ≤mom Y ). Clearly, X ≤mom Y =⇒ EX ≤ EY . The following three results are analogs of Theorems 5.C.6, 5.C.9, and 5.C.19. We omit the straightforward proofs. Theorem 7.D.11. (a) Let X and Y be two vectors of nonnegative random variables. If X ≤mom Y , then X + k ≤mom Y + k for every k ≥ 0. (b) Let (X1 , X2 , . . . , Xn ) and (Y1 , Y2 , . . . , Yn ) be two vectors of nonnegative random variables. If (X1 , X2 , . . . , Xn ) ≤mom (Y1 , Y2 , . . . , Yn ), then (k1 X1 , k2 X2 , . . . , kn Xn ) ≤mom (k1 Y1 , k2 Y2 , . . . , kn Yn ) for every (k1 , k2 , . . . , kn ) ≥ 0. (c) Let X 1 , X 2 , . . . , X m be a set of independent n-dimensional vectors of nonnegative random variables. Let Y 1 , Y 2 , . . . , Y m be another set of independent n-dimensional vectors of nonnegative random variables. If X i ≤mom Y i , i = 1, 2, . . . , m, then

\[
\sum_{i=1}^{m} \boldsymbol{X}_i \le_{\mathrm{mom}} \sum_{i=1}^{m} \boldsymbol{Y}_i.
\]

Theorem 7.D.12. Let X and Y be two vectors of nonnegative integer-valued random variables. If X ≤fm Y , then X ≤mom Y . In particular, if X ≤icx Y (or if X ≤st Y ), then X ≤mom Y .

Theorem 7.D.13. Let X and Y be two vectors of nonnegative random variables with bounded support [0, b1 ] × [0, b2 ] × · · · × [0, bn ]. If X ≤mom Y , then b − Y ≤Lt b − X.

The ≤uo-cx order implies the multivariate moments order as it is described in the following result. This result follows at once from Theorem 7.A.40.

Theorem 7.D.14. Let X and Y be two vectors of nonnegative random variables. If X ≤uo-cx Y , then X ≤mom Y .
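The defining inequalities of the orders ≤fm and ≤mom can be evaluated exactly when the support is finite. The following sketch assumes, only for illustration, vectors with independent binomial components; the parameters and function names are hypothetical. Since each component of X is stochastically smaller than the corresponding component of Y and the components are independent, both sets of inequalities are expected to hold for the indices that are checked.

```python
import itertools
import math


def binomial_pmf(n, p):
    """Probability mass function of a binomial(n, p) variable on {0, ..., n}."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def product_expectation(pmfs, funcs):
    """E prod_i f_i(X_i) for independent components with the given pmfs."""
    return math.prod(
        sum(prob * f(k) for k, prob in enumerate(pmf))
        for pmf, f in zip(pmfs, funcs)
    )


def factorial_moment(pmfs, j):
    """Left or right side of (7.D.4) for the index vector j."""
    return product_expectation(pmfs, [lambda k, ji=ji: math.comb(k, ji) for ji in j])


def power_moment(pmfs, j):
    """E prod_i X_i^{j_i}, the quantity compared by the moments order."""
    return product_expectation(pmfs, [lambda k, ji=ji: k**ji for ji in j])


pmfs_x = [binomial_pmf(4, 0.3), binomial_pmf(5, 0.2)]
pmfs_y = [binomial_pmf(4, 0.5), binomial_pmf(5, 0.4)]

indices = list(itertools.product(range(5), repeat=2))
tol = 1e-12  # guards against floating-point rounding only

fm_ordered = all(
    factorial_moment(pmfs_x, j) <= factorial_moment(pmfs_y, j) + tol for j in indices
)
mom_ordered = all(
    power_moment(pmfs_x, j) <= power_moment(pmfs_y, j) + tol for j in indices
)
```

Only finitely many index vectors are inspected here, so the output illustrates Theorem 7.D.12 on this example rather than establishing either order.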

7.E Complements

Section 7.A: The proofs of Theorems 7.A.1 and 7.A.2 can be derived from results of Strassen [541]; see, for instance, Rüschendorf [482]. Elton and Hill [183] derived a constructive proof of Theorem 7.A.1. Further references regarding these theorems and several variations of them can be found in Elton and Hill [182]. Most of the other results in this section are easy to derive. The first characterization of the order ≤icx , given in Theorem 7.A.3, can be found in Müller and Stoyan [419]. The result about the convex order comparison of two sums (7.A.13) is taken from Berger [79]. The comparison of vectors of random partial sums of random variables (Theorem 7.A.7) is taken from Jean-Marie and Liu [254]. Theorems 7.A.8 and 7.A.9 can be found in Arnold [19]. Results similar to the conclusions of Theorem 7.A.10 can be found in Alzaid and Proschan [14]. The convex order comparison of multivariate means (Example 7.A.11) is a variation of Lemma 1 of Bäuerle [59]. The necessary (and sufficient) conditions for the comparison of multivariate normal random vectors (Example 7.A.13) can be found in Müller [413]; some variations of the results in this example are given in Ding and Zhang [168]. The conditions which yield the stochastic equality of X and Y (Theorems 7.A.14 and 7.A.15) are taken from Li and Zhu [351] and from Scarsini [492], whereas Theorem 7.A.16 is taken from Baccelli and Makowski [27]. Some orders that are weaker than the multivariate convex order are studied in Mosler [399, Chapter 8]; for example, he studies the order defined by Eφ(a1 X1 + a2 X2 + · · · + an Xn ) ≤ Eφ(a1 Y1 + a2 Y2 + · · · + an Yn ) for all univariate convex functions φ and constants a1 , a2 , . . . , an for which the expectations exist. Fernández and Molchanov [194] studied related orders. The material in Section 7.A.5 follows Denuit, Lefèvre, and Mesfioui [148]; a version of the (m1 , m2 )-icx order for discrete random vectors is studied


in Denuit, Lefèvre, and Mesfioui [150]. A more general version of Theorem 7.A.17 can be found in Bassan and Scarsini [54]. The order ≤symcx is defined and studied in Marshall and Olkin [383, page 282]. The fact that random vectors that are comparable in the order ≤ccx must have the same covariance matrix (Theorem 7.A.23) can be found in Müller and Stoyan [419]. The "preservation property" of the convex order under independence (Theorem 7.A.24) can be found in Müller and Scarsini [417]. The results which compare random sums (Theorem 7.A.25) are taken from Pellerey [451]. The result about the ordering of multivariate normal random vectors according to the ≤ccx order (Example 7.A.26) is taken from Block and Sampson [94, Section 3]. The notion of directionally convex functions is studied in Shaked and Shanthikumar [509], though Fan and Lorentz [190], Marshall and Olkin [383, page 157], and Rüschendorf [483] mentioned such functions earlier. Most of the results about the directionally convex order (Section 7.A.8) are taken from Chang, Chao, Pinedo, and Shanthikumar [125] and from Meester and Shanthikumar [387]. The closure under limits property of the directionally convex order (Theorem 7.A.31) can be found in Müller and Stoyan [419]. The comparison of integrals result (Theorem 7.A.35) is taken from Miyoshi [397]. The results which compare random sums (Theorem 7.A.36) are corrected versions of Theorem 2.3 and a part of Theorem 2.4 of Pellerey [451]. The comparison of mixtures result (Theorem 7.A.37) can be found in Denuit and Müller [157], whereas the comparison of vectors with the same dependence structure (Theorem 7.A.38) can be found in Müller and Scarsini [417]. The necessary and sufficient conditions for the comparison of multivariate normal random vectors (Example 7.A.39) are taken from Müller [413]; an extension of this result to Kotz-type distributions is given in Ding and Zhang [168].
A discussion about the order ≤uo-cx can be found in Bergmann [82], where other orders, related to several unimodality notions, are also studied; the characterization given in Theorem 7.A.40(a) is taken from that paper. The preservation property of the order ≤uo-cx given in Theorem 7.A.41(a) can be found in Bergmann [80]. The results which compare random sums (Theorem 7.A.42) are taken from Pellerey [451]. Dyckerhoff and Mosler [173] introduced some relatively easy conditions for verifying X ≤uo-cx Y or X ≤lo-cv Y when X and Y have finite discrete supports. The material about the orders ≤nm is taken from O'Brien and Scarsini [438]. Scarsini [490] has studied the order ≤2m in some detail; in particular, he has identified a class U of functions such that (X1 , X2 ) ≤2m (Y1 , Y2 ) if, and only if, E[φ(X1 , X2 )] ≤ E[φ(Y1 , Y2 )] for all φ ∈ U. Müller [412] studied stochastic orders that are defined by requiring (7.A.1) to hold for all quasiconcave or increasing quasiconcave functions.


Arnold [20], building on previous ideas, introduced a multivariate Lorenz order that is based on the characterization of the univariate Lorenz order given in Theorem 3.A.11.

Section 7.B: The development in Sections 7.B.1 and 7.B.2 follows the work of Giovagnoli and Wynn [211]. The comparison of vectors with the same dependence structure (Theorem 7.B.3) can be found in Arias-Nicolás, Fernández-Ponce, Luque-Calvo, and Suárez-Llorens [17]. The conditions under which normal random vectors can be compared with respect to the order ≤SD (Example 7.B.5) are taken from Arias-Nicolás, Fernández-Ponce, Luque-Calvo, and Suárez-Llorens [17]. The comparison, in the order ≤SD , of vectors of order statistics (Theorem 7.B.4), has been communicated to us by Suárez-Llorens [542]. The orders that are studied in Section 7.B.3 were introduced in Shaked and Shanthikumar [518]; the properties of these orders, given in Theorems 7.B.11 and 7.B.15, can be found in that paper. The comparison, in the multivariate dispersive order, of vectors of order statistics (Theorem 7.B.12), can be found in Belzunce, Ruiz, and Ruiz [75]; an extension of it is given in Belzunce, Mercader, and Ruiz [70]. The result that compares vectors of epoch times of nonhomogeneous Poisson processes (Example 7.B.13) is taken from Belzunce and Ruiz [73]; an extension of it is given in Belzunce, Mercader, and Ruiz [70]. Khaledi and Kochar [290] and Belzunce, Ruiz, and Suárez-Llorens [76] introduced and studied multivariate dispersive orders that are generalizations, respectively, of characterizations (3.B.12) and (3.B.13) of the univariate order ≤disp .

Section 7.C: The multivariate transform orders in this section were introduced and studied in Roy [480].

Section 7.D: A basic paper on the multivariate Laplace transform order is Denuit [141], where many of the results in Section 7.D.1 can be found.
The result about the preservation of the multivariate Laplace transform order under random sums (Theorem 7.D.7) is taken from Wong [568]; see also Pellerey [451]. The multivariate factorial moment order is studied in Lefèvre and Picard [337], where Theorems 7.D.9 and 7.D.10 can be found. That paper also mentions and studies the multivariate moments order. The closure properties of the multivariate order ≤fm (Theorem 7.D.8) have been communicated to us by Lefèvre [335].

8 Stochastic Convexity and Concavity

In this chapter we study stochastic monotonicities of parametric families of distributions with respect to various stochastic orders. We have already encountered stochastic monotonicities earlier in this book. For example, condition (1.A.13) in Theorem 1.A.6, condition (3.A.47) in Theorem 3.A.21, and condition (4.A.17) in Theorem 4.A.18 describe such monotonicities. In this chapter a systematic study of such stochastic monotonicities is given. Various notions of stochastic convexity and concavity are reviewed. A multivariate extension of the notion of stochastic convexity, namely, stochastic directional convexity, is investigated in this chapter as well. Let {Pθ , θ ∈ Θ} be a family of univariate distributions. Throughout this chapter Θ is a convex set (that is, an interval) of the real line R or of the set N+ . Let X(θ) denote a random variable with distribution Pθ . It is convenient and intuitive to replace the notation {Pθ , θ ∈ Θ} by {X(θ), θ ∈ Θ}, which we do throughout this chapter. Note that when we write {X(θ), θ ∈ Θ} we do not assume (and often we are not concerned with) any dependence (or independence) properties among the X(θ)’s. We are only interested in the “marginal distributions” {Pθ , θ ∈ Θ} of {X(θ), θ ∈ Θ} even when in some circumstances {X(θ), θ ∈ Θ} is a well-deﬁned stochastic process. Note also that X(θ) does not mean that X is a function of θ; it only indicates that the distribution of X(θ) is Pθ .

8.A Regular Stochastic Convexity We start our discussion with the weakest notion of stochastic convexity and concavity and show its usefulness by a list of examples. Then, in the following sections, we introduce stronger notions which provide a systematic way of verifying the weak notion of this section.


8.A.1 Definitions

In the following definitions SI, SCX, SCV, SICX, SIL, SD, SDCV, and so forth, stand, respectively, for stochastically increasing, stochastically convex, stochastically concave, stochastically increasing and convex, stochastically increasing and linear, stochastically decreasing, stochastically decreasing and concave, and so forth. Let {X(θ), θ ∈ Θ} be a set of random variables. Denote

(a) {X(θ), θ ∈ Θ} ∈ SI [or SD] if Eφ(X(θ)) is increasing [or decreasing] in θ for all increasing functions φ,
(b) {X(θ), θ ∈ Θ} ∈ SCX [or SCV] if Eφ(X(θ)) is convex [or concave] in θ for all convex [or concave] functions φ,
(c) {X(θ), θ ∈ Θ} ∈ SICX [or SICV] if {X(θ), θ ∈ Θ} ∈ SI and Eφ(X(θ)) is increasing convex [or concave] in θ for all increasing convex [or concave] functions φ,
(d) {X(θ), θ ∈ Θ} ∈ SDCX [or SDCV] if {X(θ), θ ∈ Θ} ∈ SD and Eφ(X(θ)) is decreasing convex [or concave] in θ for all increasing convex [or concave] functions φ,
(e) {X(θ), θ ∈ Θ} ∈ SIL if {X(θ), θ ∈ Θ} ∈ SI and Eφ(X(θ)) is increasing convex in θ for all increasing convex functions φ, and is increasing concave in θ for all increasing concave functions φ,
(f) {X(θ), θ ∈ Θ} ∈ SDL if {X(θ), θ ∈ Θ} ∈ SD and Eφ(X(θ)) is decreasing convex in θ for all increasing convex functions φ, and is decreasing concave in θ for all increasing concave functions φ.

Note that {X(θ), θ ∈ Θ} ∈ SIL ⇐⇒ {X(θ), θ ∈ Θ} ∈ SICX ∩ SICV and {X(θ), θ ∈ Θ} ∈ SDL ⇐⇒ {X(θ), θ ∈ Θ} ∈ SDCX ∩ SDCV. Also, since a function is convex if, and only if, its negative is concave, we see that {X(θ), θ ∈ Θ} ∈ SCX ⇐⇒ {X(θ), θ ∈ Θ} ∈ SCV.

Example 8.A.1. Let X(µ, σ) be a normal random variable with mean µ and standard deviation σ. Then, for each σ > 0, one has {X(µ, σ), µ ∈ R} ∈ SIL. This follows from Example 8.D.4 and Theorem 8.D.11 below.

Example 8.A.2. Let X(λ) be a Poisson random variable with mean λ. Then {X(λ), λ ∈ [0, ∞)} ∈ SIL. This follows from Example 8.A.7 below.
Equivalently, Example 8.A.2 shows that a homogeneous Poisson process {K(t), t ≥ 0} is SIL.
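The SIL property of the Poisson family in Example 8.A.2 can be probed numerically. The sketch below is an illustration only: it uses the test functions max{x − a, 0} (increasing convex) and min{x − a, 0} (increasing concave), a truncated series for the expectation, and a finite equally spaced grid of values of λ, all of which are choices made here and not part of the text. The function names are hypothetical.

```python
import math


def poisson_expectation(phi, lam, terms=200):
    """Approximate E phi(X) for X ~ Poisson(lam) by the first `terms` terms."""
    total = 0.0
    pmf = math.exp(-lam)          # P{X = 0}
    for k in range(terms):
        total += pmf * phi(k)
        pmf *= lam / (k + 1)      # P{X = k + 1} from P{X = k}
    return total


def differences(values):
    """Successive differences of a sequence sampled on an equally spaced grid."""
    return [b - a for a, b in zip(values, values[1:])]


def check_sil_on_grid(a, lams, tol=1e-9):
    """Check monotonicity, convexity, and concavity in lam on the grid."""
    convex_vals = [poisson_expectation(lambda x: max(x - a, 0.0), lam) for lam in lams]
    concave_vals = [poisson_expectation(lambda x: min(x - a, 0.0), lam) for lam in lams]
    increasing = all(
        d >= -tol for d in differences(convex_vals) + differences(concave_vals)
    )
    convex = all(d >= -tol for d in differences(differences(convex_vals)))
    concave = all(d <= tol for d in differences(differences(concave_vals)))
    return increasing and convex and concave


lams = [0.25 * i for i in range(41)]          # lam ranges over [0, 10]
sil_checks = {a: check_sil_on_grid(a, lams) for a in (0.5, 2.0, 5.0)}
```

The tolerance only absorbs floating-point rounding; the truncation at 200 terms is harmless for the values of λ used here.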


Lynch [370] has found conditions under which a stationary renewal process {K(t), t ≥ 0} is SCX. Explicitly, let X2 , X3 , . . . be independent and identically distributed interrenewal times with a distribution function F . Let the time until the first renewal, X1 , have the equilibrium distribution function G given by

\[
G(x) = \frac{\int_0^x \bar F(u)\,du}{E X_2}, \qquad x \ge 0,
\]

where F̄ = 1 − F denotes the survival function of X2 . Lynch [370] has shown that if X2 has a logconcave density function, then {K(t), t ∈ [0, ∞)} ∈ SCX.

Example 8.A.3. Let X(n, p) be a binomial random variable with mean np and variance np(1 − p). Then, for each p ∈ (0, 1), one has {X(n, p), n ∈ N++ } ∈ SIL and, for each n ∈ N++ , one has {X(n, p), p ∈ (0, 1)} ∈ SIL. These follow from Example 8.B.3 and Theorem 8.B.9 below.

Example 8.A.4. Let Y (n), n = 1, 2, . . ., be a sequence of nonnegative independent and identically distributed random variables with mean 1. For µ > 0 define \( X(\mu, n) = \mu \sum_{k=1}^{n} Y(k) \), n ∈ N++ . Then, for each n ∈ N++ , one has {X(µ, n), µ ∈ [0, ∞)} ∈ SIL and, for each µ > 0, one has {X(µ, n), n ∈ N++ } ∈ SIL. The first result follows from Example 8.D.5 and Theorem 8.D.11 below. The second result follows from Example 8.B.4 and Theorem 8.B.9 below.

Specifically, when Y (n) in Example 8.A.4 is an exponential random variable we have the following example.

Example 8.A.5. Let X(µ, n) be an Erlang-n random variable with mean nµ and variance nµ² . Then, for each n ∈ N++ , one has {X(µ, n), µ ∈ [0, ∞)} ∈ SIL and, for each µ > 0, one has {X(µ, n), n ∈ N++ } ∈ SIL.

By taking n = 1 in Example 8.A.4 we obtain the following result.

Example 8.A.6. Let Y be a nonnegative random variable. For µ > 0 define X(µ) = µY . Then {X(µ), µ ∈ [0, ∞)} ∈ SIL.

Example 8.A.7. Suppose that Θ is [0, ∞) or N++ . The family of nonnegative random variables {X(θ), θ ∈ Θ} is said to have the semigroup property if, for all θ1 and θ2 in Θ, one has

\[
X(\theta_1 + \theta_2) =_{\mathrm{st}} X(\theta_1) + X(\theta_2), \tag{8.A.1}
\]

where X(θ1 ) and X(θ2 ) in (8.A.1) are independent. Note that {X(λ), λ ∈ [0, ∞)} of Example 8.A.2 has the semigroup property. Also, for each µ > 0, it is seen that {X(µ, n), n ∈ N++ } of Example 8.A.4 has the semigroup property. If {X(θ), θ ∈ Θ} has the semigroup property, then {X(θ), θ ∈ Θ} ∈ SIL. This result follows from Example 8.B.7 and Theorem 8.B.9 below.

Example 8.A.8. The beta distribution with parameters α > 0 and β > 0 is the one that has the density function defined as

\[
f_{\alpha,\beta}(x) = \frac{1}{B(\alpha, \beta)}\, x^{\alpha-1}(1 - x)^{\beta-1}, \qquad 0 < x < 1,
\]

where \( B(\alpha, \beta) \equiv \int_0^1 x^{\alpha-1}(1 - x)^{\beta-1}\,dx \). The beta distribution of the second kind with parameters α > 0 and β > 0 is the one that has the density function defined as

\[
g_{\alpha,\beta}(x) = \frac{1}{B(\alpha, \beta)}\, \frac{x^{\alpha-1}}{(1 + x)^{\alpha+\beta}}, \qquad x > 0.
\]

Fix a t > 0. Adell, Badía, and de la Cal [2] proved the following results:

(a) If X(θ) has the density function \( f_{t\theta,\,t(1-\theta)} \), θ ∈ (0, 1), then {X(θ), θ ∈ (0, 1)} ∈ SICX.
(b) If Y (θ) has the density function \( f_{t\theta+1,\,t(1-\theta)+1} \), θ ∈ (0, 1), then {Y (θ), θ ∈ (0, 1)} ∈ SICX.
(c) If Z(θ) has the density function \( g_{t\theta,\,t} \), θ > 0, then {Z(θ), θ > 0} ∈ SICX.

For a random variable Y , let FY and F̄Y denote its distribution and survival functions, respectively. Similarly, for a random variable X(θ), let FX (·, θ) and F̄X (·, θ) denote the corresponding distribution and survival functions. Since the class of functions fa (x) = max{x − a, 0} [min{x − a, 0}] for all a ∈ R generates all the increasing and convex [concave] functions, and since \( E(\max\{X - a, 0\}) = \int_a^{\infty} \bar F_X(x)\,dx \) [\( E(\min\{X - a, 0\}) = -\int_{-\infty}^{a} F_X(x)\,dx \)] (see Section 4.A.1), we have the following equivalences.

Theorem 8.A.9. (a) {X(θ), θ ∈ Θ} ∈ SICX [SICV] if, and only if, {X(θ), θ ∈ Θ} ∈ SI and \( \int_x^{\infty} \bar F_X(y, \theta)\,dy \) [\( \int_{-\infty}^{x} F_X(y, \theta)\,dy \)] is increasing [decreasing] convex in θ for all x, and
(b) {X(θ), θ ∈ Θ} ∈ SDCX [SDCV] if, and only if, {X(θ), θ ∈ Θ} ∈ SD and \( \int_x^{\infty} \bar F_X(y, \theta)\,dy \) [\( \int_{-\infty}^{x} F_X(y, \theta)\,dy \)] is decreasing [increasing] convex in θ for all x.

For discrete random variables we have the following analog of Theorem 8.A.9.

Theorem 8.A.10. Suppose that for each θ ∈ Θ, the support of X(θ) is in N. Then
(a) {X(θ), θ ∈ Θ} ∈ SICX [SICV] if, and only if, {X(θ), θ ∈ Θ} ∈ SI and \( \sum_{l=k}^{\infty} P\{X(\theta) \ge l\} \) [\( \sum_{l=-\infty}^{k} P\{X(\theta) \le l\} \)] is increasing [decreasing] convex in θ for all k ∈ N, and
(b) {X(θ), θ ∈ Θ} ∈ SDCX [SDCV] if, and only if, {X(θ), θ ∈ Θ} ∈ SD and \( \sum_{l=k}^{\infty} P\{X(\theta) \ge l\} \) [\( \sum_{l=-\infty}^{k} P\{X(\theta) \le l\} \)] is decreasing [increasing] convex in θ for all k ∈ N.

Recall the following identity which holds for any random variable Z with mean EZ:

\[
EZ = -\int_{-\infty}^{0} F(u)\,du + \int_{0}^{\infty} \bar F(u)\,du, \tag{8.A.2}
\]

where F and F̄ are the distribution function and the survival function of Z, respectively. From Theorem 8.A.9 and (8.A.2) we thus obtain the next result.


Theorem 8.A.11. Suppose that EX(θ) is a linear function of θ.
(a) If {X(θ), θ ∈ Θ} ∈ SICX [SICV], then {X(θ), θ ∈ Θ} ∈ SICV [SICX], and therefore {X(θ), θ ∈ Θ} ∈ SIL.
(b) If {X(θ), θ ∈ Θ} ∈ SDCX [SDCV], then {X(θ), θ ∈ Θ} ∈ SDCV [SDCX], and therefore {X(θ), θ ∈ Θ} ∈ SDL.

From Example 8.A.6 it follows that if X(θ) is uniformly distributed on [0, θ], then {X(θ), θ ∈ [0, ∞)} ∈ SIL. However, in order to obtain the discrete analog of this result we need to proceed along a different route, as in the next example.

Example 8.A.12. Let X(n) be uniformly distributed on {0, 1, . . . , n − 1}. Then {X(n), n ∈ N+ } ∈ SIL. To see this, first note that EX(n) is a linear function of n. Thus, by Theorem 8.A.11 it is sufficient to show that {X(n), n ∈ N+ } ∈ SICV. Clearly, {X(n), n ∈ N+ } ∈ SI. Now we compute

\[
\sum_{l=0}^{k} P\{X(n) \le l\} = \frac{1}{n} \cdot \frac{(k + 1)k}{2}.
\]

This is a decreasing convex function of n. Thus the stated result follows from Theorem 8.A.10(a).

We will now present an application of these notions in establishing a stochastic inequality.

Theorem 8.A.13. Let {Yk , k ∈ N++ } be a sequence of independent and identically distributed nonnegative random variables independent of the two nonnegative discrete random variables M and N . Then
(a) M ≤icx [≤icv ] N =⇒ \( \sum_{k=1}^{M} Y_k \le_{\mathrm{icx}} [\le_{\mathrm{icv}}] \sum_{k=1}^{N} Y_k \), and
(b) M ≤cx N =⇒ \( \sum_{k=1}^{M} Y_k \le_{\mathrm{cx}} \sum_{k=1}^{N} Y_k \).

Proof. Let φ be an increasing and convex [concave] function and define \( \psi(n) = E\phi\bigl(\sum_{k=1}^{n} Y_k\bigr) \). Then ψ is an increasing and convex [concave] function (see Example 8.A.4). Therefore M ≤icx [≤icv ] N implies that

\[
E\phi\Bigl(\sum_{k=1}^{M} Y_k\Bigr) = E\psi(M) \le E\psi(N) = E\phi\Bigl(\sum_{k=1}^{N} Y_k\Bigr).
\]

This establishes part (a). When M ≤cx N one has \( E\bigl[\sum_{k=1}^{M} Y_k\bigr] = E\bigl[\sum_{k=1}^{N} Y_k\bigr] \) (see Theorem 4.A.35). This observation combined with part (a) completes the proof for part (b).
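The criterion of Theorem 8.A.10(a) that was applied in Example 8.A.12 can also be checked by direct computation. The sketch below evaluates the cumulative sums of the distribution function of the discrete uniform family from its definition, using exact rational arithmetic, and tests whether they are decreasing and convex in n over a finite range. The ranges of k and n, and the function names, are assumptions made for this illustration.

```python
from fractions import Fraction


def cdf_uniform(n, l):
    """P{X(n) <= l} for X(n) uniformly distributed on {0, 1, ..., n - 1}."""
    if l < 0:
        return Fraction(0)
    return min(Fraction(l + 1, n), Fraction(1))


def cumulative_cdf(n, k):
    """sum_{l=0}^{k} P{X(n) <= l}, computed term by term."""
    return sum(cdf_uniform(n, l) for l in range(k + 1))


def is_decreasing_convex(values):
    """First differences nonpositive and second differences nonnegative."""
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return all(d <= 0 for d in first) and all(d >= 0 for d in second)


n_values = range(1, 41)
sicv_criterion = all(
    is_decreasing_convex([cumulative_cdf(n, k) for n in n_values])
    for k in range(15)
)

# EX(n) = (n - 1) / 2 is linear in n, as required by Theorem 8.A.11.
means = [sum(Fraction(j, n) for j in range(n)) for n in n_values]
mean_is_linear = all(m == Fraction(n - 1, 2) for m, n in zip(means, n_values))
```

Because the arithmetic is exact, no tolerance is needed; the check still covers only the finite ranges chosen above.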

A stronger result than Theorem 8.A.13(b) is stated as Theorem 3.A.13 in Chapter 3. A stronger result than Theorem 8.A.13(a) is stated as Theorem 4.A.9 in Chapter 4. Theorem 8.A.13 can also be obtained from Theorem 4.A.18. In fact, we next restate Theorems 3.A.21 and 4.A.18 in terms of the terminology of this section (the assumption in Theorem 8.A.14(a) below is slightly stronger than the assumption in Theorem 4.A.18; see a comment after Theorem 4.A.18).


Theorem 8.A.14. Let {X(θ), θ ∈ X } be a collection of random variables, and let Θ1 and Θ2 be two X -valued random variables that are independent of {X(θ), θ ∈ X }.
(a) If {X(θ), θ ∈ X } ∈ SICX [SICV] and if Θ1 ≤icx [≤icv ] Θ2 , then X(Θ1 ) ≤icx [≤icv ] X(Θ2 ).
(b) If {X(θ), θ ∈ X } ∈ SCX and if Θ1 ≤cx Θ2 , then X(Θ1 ) ≤cx X(Θ2 ).

8.A.2 Closure properties

Closure properties of the notions that were introduced in Section 8.A.1 serve as the basis for studying the convexity and concavity properties of the performance measures of stochastic systems. In this subsection we describe some of these closure properties.

Theorem 8.A.15. Suppose that {X(θ), θ ∈ Θ} and {Y (θ), θ ∈ Θ} are two collections of random variables such that X(θ) and Y (θ) are independent for each θ. If {X(θ), θ ∈ Θ} ∈ SICX [or SICV] and {Y (θ), θ ∈ Θ} ∈ SICX [or SICV], then {X(θ) + Y (θ), θ ∈ Θ} ∈ SICX [or SICV].

Proof. We prove the convex case only. The concave case can be similarly proven. Let θi ∈ Θ, i = 1, 2, 3, 4, be such that θ1 ≤ θ2 = θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3 . The stochastic monotonicity of X(θ) and Y (θ) can be used to construct four random variables X̂1 , X̂4 , Ŷ1 , and Ŷ4 such that X̂i =st X(θi ), Ŷi =st Y (θi ), i = 1, 4, X̂1 ≤ X̂4 a.s., and Ŷ1 ≤ Ŷ4 a.s. (see Theorem 1.A.1). Furthermore (X̂1 , X̂4 ) and (Ŷ1 , Ŷ4 ) can be constructed so that they are independent. Let I1 and I2 be independent random variables, independent of X̂1 , X̂4 , Ŷ1 , and Ŷ4 , such that P {I1 = 0} = P {I1 = 1} = P {I2 = 0} = P {I2 = 1} = 1/2. Define X̂2 = (1 − I1 )X̂1 + I1 X̂4 , X̂3 = I1 X̂1 + (1 − I1 )X̂4 , Ŷ2 = (1 − I2 )Ŷ1 + I2 Ŷ4 , and Ŷ3 = I2 Ŷ1 + (1 − I2 )Ŷ4 . It is then not hard to see that X̂2 =st X̂3 , Ŷ2 =st Ŷ3 ,

\[
(\hat X_1, \hat Y_1) \le [(\hat X_2, \hat Y_2), (\hat X_3, \hat Y_3)] \le (\hat X_4, \hat Y_4) \quad \text{a.s.}
\]

(where, for any four numbers a, b, c, and d, the notation a ≤ [b, c] ≤ d means a ≤ min{b, c} and max{b, c} ≤ d), and

\[
(\hat X_1 + \hat Y_1) + (\hat X_4 + \hat Y_4) = (\hat X_2 + \hat Y_2) + (\hat X_3 + \hat Y_3) \quad \text{a.s.}
\]

Then, for any increasing convex function φ, one has

\[
E\phi(\hat X_1 + \hat Y_1) + E\phi(\hat X_4 + \hat Y_4) \ge E\phi(\hat X_2 + \hat Y_2) + E\phi(\hat X_3 + \hat Y_3).
\]

Observe that X̂2 ≥icx X(θ2 ) and Ŷ2 ≥icx Y (θ2 ). So by the preservation of the order ≥icx under convolution (see Theorem 4.A.8) it follows that X̂2 + Ŷ2 ≥icx X(θ2 ) + Y (θ2 ). That is, for any increasing convex function φ, one has

\[
E\phi(\hat X_2 + \hat Y_2) \ge E\phi(X(\theta_2) + Y(\theta_2)).
\]

8.A Regular Stochastic Convexity


Similarly,

$$E\phi(\hat{X}_3 + \hat{Y}_3) \ge E\phi(X(\theta_3) + Y(\theta_3)).$$

Therefore,

$$E\phi(X(\theta_1) + Y(\theta_1)) + E\phi(X(\theta_4) + Y(\theta_4)) \ge E\phi(X(\theta_2) + Y(\theta_2)) + E\phi(X(\theta_3) + Y(\theta_3)).$$

Combining this with the preservation of stochastic monotonicity under convolution (see Theorem 1.A.3), one has {X(θ) + Y(θ), θ ∈ Θ} ∈ SICX.

A combination of Example 8.A.4 and Theorem 8.A.15 yields the following generalization of Example 8.A.4 which will be used later.

Example 8.A.16. Let Y(n), n = 1, 2, . . . , be a sequence of nonnegative independent and identically distributed random variables with mean 1, and let Z be a random variable which is independent of the Y(n)'s. For µ > 0 define $X(\mu, n) = Z + \mu \sum_{k=1}^{n} Y(k)$, n ∈ N++. Then, for each n ∈ N++, one has {X(µ, n), µ ∈ R+} ∈ SIL and, for each µ > 0, one has {X(µ, n), n ∈ N++} ∈ SIL.

Theorem 8.A.17. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random variables, where Λ ⊆ R is a convex set, and let {Y(λ), λ ∈ Λ} be another family of random variables. Suppose that X(θ) and Y(λ) are independent for any choice of θ ∈ Θ and λ ∈ Λ.
(a) If {X(θ), θ ∈ Θ} ∈ SICX [SICV, SIL] and {Y(λ), λ ∈ Λ} ∈ SICX [SICV, SIL], then {Y(X(θ)), θ ∈ Θ} ∈ SICX [SICV, SIL].
(b) If {X(θ), θ ∈ Θ} ∈ SDCX [SDCV, SDL] and {Y(λ), λ ∈ Λ} ∈ SICX [SICV, SIL], then {Y(X(θ)), θ ∈ Θ} ∈ SDCX [SDCV, SDL].

Proof. We will prove the increasing convex case only. The other cases can be proven similarly. Using the construction in the proof of Theorem 1.A.1 for the usual stochastic order, it is easily verified that {Y(X(θ)), θ ∈ Θ} ∈ SI. Let φ be an increasing and convex function. Consider

$$E\phi(Y(X(\theta))) = E\psi(X(\theta)), \qquad (8.A.3)$$

where ψ(λ) = Eφ(Y (λ)). Since {Y (λ), λ ∈ Λ} ∈ SICX, we see that ψ is an increasing and convex function. Therefore, since {X(θ), θ ∈ Θ} ∈ SICX, one sees from (8.A.3) that Eφ(Y (X(θ))) is increasing and convex in θ. Therefore {Y (X(θ)), θ ∈ Θ} ∈ SICX.

Example 8.A.18. Let Y (n), n = 1, 2, . . . , be a sequence of nonnegative independent and identically distributed random variables as in Example 8.A.4, but here, since we are interested only in convexity properties with respect to n, we let the common mean of the Y (n)’s be a ﬁxed µ > 0. Denote


$X(n) = \sum_{k=1}^{n} Y(k)$, n ∈ N++, and let $\tilde{X}(n)$ be the forward recurrence time associated with X(n), that is, let $\tilde{X}(n)$ have the survival function given by

$$P\{\tilde{X}(n) > x\} = \frac{\int_x^\infty P\{X(n) > u\}\,du}{n\mu}, \quad x \ge 0, \ n \in N_{++}.$$

Then {$\tilde{X}(n)$, n ∈ N++} ∈ SIL. This follows, by Examples 8.A.12 and 8.A.16, and by Theorem 8.A.17, from the relation (proven below)

$$\tilde{X}(n) =_{st} \tilde{Y} + \sum_{k=1}^{U(n)} Y(k), \qquad (8.A.4)$$

where U(n) is a random variable which is uniformly distributed on {0, 1, . . . , n − 1}, and $\tilde{Y}$ is the forward recurrence time associated with Y(1), that is,

$$P\{\tilde{Y} > x\} = \frac{\int_x^\infty P\{Y(1) > u\}\,du}{\mu}, \quad x \ge 0.$$

The relation (8.A.4) can be proven as follows: Consider n independent renewal processes {Ni(t), t ≥ 0}, i = 1, 2, . . . , n, all with interrenewal times that are distributed as Y(1), and consider the renewal process {N(t), t ≥ 0} with interrenewal intervals which are the sums of the corresponding interrenewal intervals of the n independent renewal processes {Ni(t), t ≥ 0}, i = 1, 2, . . . , n. That is, the interrenewal times that are associated with {N(t), t ≥ 0} are distributed as X(n). Select a t > 0 and consider the associated forward recurrence time in the process {N(t), t ≥ 0}. Clearly the value t falls in an interrenewal interval which is the sum of the n interrenewal intervals corresponding to {N1(t), t ≥ 0}, {N2(t), t ≥ 0}, . . . , {Nn(t), t ≥ 0}. With probability 1/n, t falls in the interrenewal interval corresponding to the process {Ni(t), t ≥ 0}, i = 1, 2, . . . , n. Let U(n) + 1 be the index of the process in whose interrenewal interval t falls. Then U(n) is uniformly distributed on {0, 1, . . . , n − 1}. If t falls in an interval corresponding to {Ni(t), t ≥ 0} (that is, when U(n) = i − 1), then its forward recurrence time is $\tilde{Y} + \sum_{k=i+1}^{n} Y(k) =_{st} \tilde{Y} + \sum_{k=1}^{n-i} Y(k)$. Unconditioning with respect to the value i of U(n) + 1 we obtain

$$\tilde{X}(n) =_{st} \tilde{Y} + \sum_{k=1}^{n-U(n)-1} Y(k) =_{st} \tilde{Y} + \sum_{k=1}^{U(n)} Y(k),$$

and the proof of (8.A.4) is complete. In Example 8.B.12 of Section 8.B the reader may find a related result.

Let {X(n), n ∈ N+} be a Markov chain with state space S (S = [0, ∞) or N+). Let Y(x) and Z(x) denote generic random variables representing [X(n + 1) | X(n) = x] and [X(n + 1) − x | X(n) = x], respectively (recall that, for a random variable U and an event A, we denote by [U | A] any random variable whose distribution is the conditional distribution of U given A). Note that Y(x) =st x + Z(x), x ∈ S.
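As a sanity check on relation (8.A.4), the following minimal sketch compares the means of its two sides in the special case where the Y(k) are exponential with mean µ (an assumption made here only for illustration, so that X(n) is Erlang-n). It uses the standard fact that the forward recurrence time of a nonnegative variable X has mean E[X²]/(2E[X]). The function names are hypothetical, and agreement of means is of course only a necessary condition, not a proof.

```python
from fractions import Fraction

def forward_recurrence_mean(mean, second_moment):
    """Mean of the forward recurrence time: E[X^2] / (2 E[X])."""
    return second_moment / (2 * mean)

def lhs_mean(n, mu):
    # Assumption: Y(k) exponential with mean mu, so X(n) is Erlang-n
    # with mean n*mu and variance n*mu^2.
    mean = n * mu
    second_moment = n * mu**2 + mean**2
    return forward_recurrence_mean(mean, second_moment)

def rhs_mean(n, mu):
    # Y~ is the forward recurrence time of Y(1); E[Y(1)^2] = 2 mu^2.
    y_tilde_mean = forward_recurrence_mean(mu, 2 * mu**2)
    # U(n) is uniform on {0, ..., n-1} and independent of the Y(k)'s,
    # so the random sum has mean E[U(n)] * mu.
    u_mean = Fraction(n - 1, 2)
    return y_tilde_mean + u_mean * mu

mu = Fraction(3, 2)
for n in range(1, 8):
    assert lhs_mean(n, mu) == rhs_mean(n, mu)
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation.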


Theorem 8.A.19. Suppose that X(0) = 0 a.s. If {Z(x), x ∈ S} ∈ SD and Z(x) ≥ 0 a.s. for each x ∈ S, then {X(n), n ∈ N+} ∈ SICV.

Proof. Since Z(x) ≥ 0 a.s. we have Y(x) ≥ x a.s., and therefore X(n) is a.s. increasing in n. For any increasing and concave function φ we have that φ(x + y) − φ(y) is increasing in x and decreasing in y. Therefore, since {Z(y), y ∈ S} ∈ SD, we see that Eφ(Z(y) + y) − φ(y) is decreasing in y. Since X(n) is a.s. increasing in n, we have

Eφ(Z(X(n + 1)) + X(n + 1)) − Eφ(X(n + 1)) ≤ Eφ(Z(X(n)) + X(n)) − Eφ(X(n)).

Noting that X(n + 1) =st Z(X(n)) + X(n), from the above inequality one obtains

Eφ(X(n + 2)) + Eφ(X(n)) ≤ Eφ(X(n + 1)) + Eφ(X(n + 1)).

That is, {X(n), n ∈ N+} ∈ SICV.

Let X(n) be the historical record value of a sequence of independent and identically distributed random variables {Dn , n ∈ N++ }. That is, X(n) = max{X(n − 1), Dn } = max{X(0), D1 , D2 , . . . , Dn }, n ∈ N++ . Theorem 8.A.20. If X(0) = 0 a.s., then {X(n), n ∈ N+ } ∈ SICV. Proof. We apply Theorem 8.A.19. Here Y (x) =st max{Dn , x} and Z(x) =st max{Dn − x, 0}. Clearly, {Z(x), x ≥ 0} satisﬁes the conditions of Theorem 8.A.19.
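To illustrate Theorem 8.A.20 on a concrete case, the sketch below takes the Dn to be uniform on (0, 1), an assumption chosen only because the moments are available in closed form. For this case E X(n) = n/(n + 1) and E√X(n) = 2n/(2n + 1), and both sequences should be increasing and concave in n, as the SICV property requires for the increasing concave functions φ(x) = x and φ(x) = √x. The helper names are hypothetical.

```python
from fractions import Fraction

def mean_record(n):
    """E X(n) for X(n) = max{0, D_1, ..., D_n}, D_i uniform on (0, 1)."""
    return Fraction(n, n + 1)

def mean_sqrt_record(n):
    """E sqrt(X(n)): integral of sqrt(x) * n * x^(n-1) over (0, 1)."""
    return Fraction(2 * n, 2 * n + 1)

def is_increasing_concave(seq):
    """Check monotonicity and nonpositive second differences."""
    increasing = all(b >= a for a, b in zip(seq, seq[1:]))
    concave = all(a + c <= 2 * b for a, b, c in zip(seq, seq[1:], seq[2:]))
    return increasing and concave

ns = range(0, 30)
identity_ok = is_increasing_concave([mean_record(n) for n in ns])
sqrt_ok = is_increasing_concave([mean_sqrt_record(n) for n in ns])
```

Checking two test functions does not establish SICV, which quantifies over all increasing concave φ; it only confirms that the conclusion is consistent in this example.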

8.A.3 Stochastic m-convexity

Let S be a subinterval of the real line. Recall from Section 3.A.5 the class $\mathcal{M}^{S}_{m\text{-cx}}$ of all functions φ : S → R whose mth derivative $\phi^{(m)}$ exists and satisfies $\phi^{(m)}(x) \ge 0$, for all x ∈ S, or which are limits of sequences of functions whose mth derivative is continuous and nonnegative on S, m = 1, 2, . . .. A function φ : S → R is said to be m-increasing convex if $\phi \in \bigcap_{k=1}^{m} \mathcal{M}^{S}_{k\text{-cx}}$. A set of random variables {X(θ), θ ∈ Θ} (Θ is a subinterval of the real line) is said to be stochastically m-increasing convex if Eφ(X(θ)) is m-increasing convex in θ whenever φ is m-increasing convex. If Θ is a subinterval of N++, then the definition of stochastic m-increasing convexity is similar; we do not give the details here, as they can be found in Denuit, Lefèvre, and Utev [155]. The proofs of most of the following examples, as well as many other examples, can be found in Denuit, Lefèvre, and Utev [155].

Example 8.A.21. Let X(λ) be a Poisson random variable with mean λ. Then {X(λ), λ ∈ [0, ∞)} is stochastically m-increasing convex for each m ∈ N++.


Example 8.A.22. Let X(n, p) be a binomial random variable with mean np and variance np(1 − p). Then, for each p ∈ (0, 1), one has that {X(n, p), n ∈ N++} is stochastically m-increasing convex, and for each n ∈ N++, one has that {X(n, p), p ∈ (0, 1)} is stochastically m-increasing convex, for each m ∈ N++.

Example 8.A.23. Let Y(n), n = 1, 2, . . ., be a sequence of nonnegative independent and identically distributed random variables with mean 1. For µ > 0 define $X(\mu, n) = \mu \sum_{k=1}^{n} Y(k)$, n ∈ N++. Then, for each n ∈ N++, one has that {X(µ, n), µ ∈ [0, ∞)} is stochastically m-increasing convex, and for each µ > 0, one has that {X(µ, n), n ∈ N++} is stochastically m-increasing convex, for each m ∈ N++.

Specifically, when Y(n) in Example 8.A.23 is an exponential random variable we have the following example.

Example 8.A.24. Let X(µ, n) be an Erlang-n random variable with mean nµ and variance $n\mu^2$. Then, for each n ∈ N++, one has that {X(µ, n), µ ∈ [0, ∞)} is stochastically m-increasing convex, and for each µ > 0, one has that {X(µ, n), n ∈ N++} is stochastically m-increasing convex, for each m ∈ N++.

By taking n = 1 in Example 8.A.23 we obtain the following result.

Example 8.A.25. Let Y be a nonnegative random variable. For µ > 0 define X(µ) = µY. Then {X(µ), µ ∈ [0, ∞)} is stochastically m-increasing convex for each m ∈ N++.

When the set of random variables is parametrized by a location parameter then we have:

Example 8.A.26. Let Y be a real random variable. For µ > 0 define X(µ) = Y + µ. Then {X(µ), µ ∈ [0, ∞)} is stochastically m-increasing convex for each m ∈ N++.

Another example of interest is the following.

Example 8.A.27. Let X(n) be uniformly distributed on {0, 1, . . . , n − 1}. Then {X(n), n ∈ N+} is stochastically m-increasing convex for each m ∈ N++.

Since the composition of two m-increasing functions is m-increasing, we obtain the following closure properties of stochastic m-convexity.

Theorem 8.A.28.
(a) Let ϕ : S → R be an m-increasing convex function. If {X(θ), θ ∈ Θ} is stochastically m-increasing convex, then {ϕ(X(θ)), θ ∈ Θ} is also stochastically m-increasing convex. (b) Let ϑ : Θ → Θ be an m-increasing convex function. If {X(θ), θ ∈ Θ} is stochastically m-increasing convex, then {X(ϑ(θ)), θ ∈ Θ} is also stochastically m-increasing convex. From Theorem 8.A.28(a) and Example 8.A.23 we obtain the following result.

8.B Sample Path Convexity


Theorem 8.A.29. Let {Yn, n ≥ 1} be a sequence of nonnegative, independent and identically distributed random variables. Let {N(θ), θ ∈ Θ} be a set of nonnegative integer-valued random variables, independent of the Yn's. Define $X(\theta) = \sum_{n=1}^{N(\theta)} Y_n$. If {N(θ), θ ∈ Θ} is stochastically m-increasing convex, then {X(θ), θ ∈ Θ} is stochastically m-increasing convex.

8.B Sample Path Convexity

Sample path convexity is one powerful tool that can be used for the purpose of obtaining the regular convexity notions presented in Section 8.A. Two other related tools will be described in Sections 8.C and 8.D.

8.B.1 Definitions

Consider a family {X(θ), θ ∈ Θ} of random variables. Let θi ∈ Θ, i = 1, 2, 3, 4, be any four values such that θ1 ≤ θ2 ≤ θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3. If there exist four random variables $\hat{X}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 2, 3, 4, and
(a) (i) $\max[\hat{X}_2, \hat{X}_3] \le \hat{X}_4$ a.s. and (ii) $\hat{X}_2 + \hat{X}_3 \le \hat{X}_1 + \hat{X}_4$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and convex in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SICX(sp));
(b) (i) $\hat{X}_1 \le \min[\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \le \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and concave in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SICV(sp));
(c) (i) $\hat{X}_1 \ge \max[\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \ge \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and convex in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SDCX(sp));
(d) (i) $\hat{X}_4 \le \min[\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \le \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and concave in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SDCV(sp));
(e) (i) $\max[\hat{X}_2, \hat{X}_3] \le \hat{X}_4$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 = \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and linear in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SIL(sp));
(f) (i) $\hat{X}_1 \ge \max[\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 = \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and linear in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SDL(sp)).

Although Condition (i) in these definitions requires stochastic monotonicity in $X_i$, i = 1, 2, 3, 4, we do not require the construction of $\hat{X}_i$, i = 2, 3, to satisfy any a.s. monotonicity property (that is, we do not require that either $\hat{X}_2 \ge \hat{X}_3$ a.s. or $\hat{X}_2 \le \hat{X}_3$ a.s. be satisfied).

Example 8.B.1. Let X(µ, σ) be a normal random variable with mean µ and standard deviation σ. Then, for each σ > 0, one has {X(µ, σ), µ ∈ R} ∈ SIL(sp). This follows from Example 8.D.4 and Theorem 8.D.11 below.


Example 8.B.2. Let X(λ) be a Poisson random variable with mean λ. Then {X(λ), λ ∈ R+} ∈ SIL(sp). This follows from Example 8.B.7 below.

Example 8.B.3. Let X(n, p) be a binomial random variable with mean np and variance np(1 − p). Then, for each p ∈ (0, 1), one has {X(n, p), n ∈ N++} ∈ SIL(sp) and, for each n ∈ N++, one has {X(n, p), p ∈ (0, 1)} ∈ SIL(sp). The first result follows from Example 8.B.4 below. In order to prove the second result, first note that X(n, p) =st X1(p) + X2(p) + · · · + Xn(p), where Xj(p), j = 1, 2, . . . , n, are independent and identically distributed Bernoulli random variables with P{Xj(p) = 1} = p. We will show that

$$\{X_1(p),\ p \in (0, 1)\} \in \text{SIL(sp)}. \qquad (8.B.1)$$

The second result above then follows from Theorem 8.B.10 below. To prove (8.B.1) let pi, i = 1, 2, 3, 4, be such that 0 < p1 ≤ p2 ≤ p3 ≤ p4 < 1 and p1 + p4 = p2 + p3. Let U be a uniform (0, 1) random variable. Let $I_A$ denote the indicator function of A. Define

$$\hat{X}_1 = I_{\{U \le p_1\}}, \qquad \hat{X}_2 = I_{\{U \le p_2\}},$$
$$\hat{X}_3 = I_{\{U \le p_1\}} + I_{\{p_2 \le U \le p_4\}}, \qquad \hat{X}_4 = I_{\{U \le p_4\}}.$$

Then $\hat{X}_i =_{st} X_1(p_i)$, i = 1, 2, 3, 4, and $\hat{X}_i$, i = 1, 2, 3, 4, satisfy the conditions given in the definitions of SICX(sp) and SICV(sp). This proves (8.B.1).

Example 8.B.4. Let Y(n), n = 1, 2, . . . , be a sequence of nonnegative independent and identically distributed random variables with mean 1. For µ > 0 define $X(\mu, n) = \mu \sum_{k=1}^{n} Y(k)$, n ∈ N++. Then, for each n ∈ N++, one has {X(µ, n), µ ∈ R+} ∈ SIL(sp) and, for each µ > 0, one has {X(µ, n), n ∈ N++} ∈ SIL(sp). The first result follows from Example 8.D.5 and Theorem 8.D.11 below. The second result follows from Example 8.B.7 below.

Specifically, when Y(n) in Example 8.B.4 is an exponential random variable we have the following example.

Example 8.B.5. Let X(µ, n) be an Erlang-n random variable with mean nµ and variance $n\mu^2$. Then, for each n ∈ N++, one has {X(µ, n), µ ∈ R+} ∈ SIL(sp) and, for each µ > 0, one has {X(µ, n), n ∈ N++} ∈ SIL(sp).

By taking n = 1 in Example 8.B.4 we obtain the following result.

Example 8.B.6. Let Y be a nonnegative random variable. For µ > 0 define X(µ) = µY. Then {X(µ), µ ∈ R+} ∈ SIL(sp).

Example 8.B.7. If {X(θ), θ ∈ Θ} has the semigroup property (see Example 8.A.7), then {X(θ), θ ∈ Θ} ∈ SIL(sp). In order to see it let θi ∈ Θ, i = 1, 2, 3, 4, be such that θ1 ≤ θ2 ≤ θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3. Let Zi, i = 1, 2, 3, 4, be independent random variables such that


Z1 =st X(θ1), Z2 =st X(θ2 − θ1), Z3 =st X(θ3 − θ2), and Z4 =st X(θ4 − θ3), where, by convention, X(0) ≡ 0. Define

$$\hat{X}_1 = Z_1, \quad \hat{X}_2 = Z_1 + Z_2, \quad \hat{X}_3 = Z_1 + Z_3 + Z_4, \quad \text{and} \quad \hat{X}_4 = Z_1 + Z_2 + Z_3 + Z_4.$$

Then $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 2, 3, 4, and $\hat{X}_i$, i = 1, 2, 3, 4, satisfy the conditions given in the definitions of SICX(sp) and SICV(sp). This proves the result stated above.

The following theorem is obvious. A more general result is proven in Theorem 8.B.13 (see Corollary 8.B.14).

Theorem 8.B.8.
(a) If {X(θ), θ ∈ Θ} ∈ SICX(sp) [or SICV(sp)] and if φ is an increasing convex [or concave] function, then {φ(X(θ)), θ ∈ Θ} ∈ SICX(sp) [or SICV(sp)].
(b) If {X(θ), θ ∈ Θ} ∈ SDCX(sp) [or SDCV(sp)] and if φ is an increasing convex [or concave] function, then {φ(X(θ)), θ ∈ Θ} ∈ SDCX(sp) [or SDCV(sp)].

Theorem 8.B.8 shows that the sample path notions imply the regular notions of stochastic convexity/concavity. Counterexamples can be constructed to show that the reverse need not be true. We have the following results.

Theorem 8.B.9.
SICX(sp) =⇒ SICX,
SICV(sp) =⇒ SICV,
SDCX(sp) =⇒ SDCX,
SDCV(sp) =⇒ SDCV.
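The two couplings used in Examples 8.B.3 and 8.B.7 can be checked pathwise with a short simulation. The sketch below is illustrative only: the function names are hypothetical, and the gamma family with unit scale is used as an assumed instance of a family with the semigroup property in its shape parameter. On every sample point it tests conditions (e)(i) and (e)(ii) of the SIL(sp) definition.

```python
import random

def bernoulli_coupling(u, p1, p2, p3, p4):
    """Coupling of Example 8.B.3, driven by a single uniform draw u."""
    x1 = int(u <= p1)
    x2 = int(u <= p2)
    x3 = int(u <= p1) + int(p2 < u <= p4)  # the two events are disjoint
    x4 = int(u <= p4)
    return x1, x2, x3, x4

def semigroup_coupling(z1, z2, z3, z4):
    """Coupling of Example 8.B.7 from independent increments Z_1..Z_4."""
    return z1, z1 + z2, z1 + z3 + z4, z1 + z2 + z3 + z4

def satisfies_sil_sp(x1, x2, x3, x4, tol=1e-9):
    """max[x2, x3] <= x4 and x1 + x4 == x2 + x3 (up to rounding)."""
    return max(x2, x3) <= x4 + tol and abs((x1 + x4) - (x2 + x3)) <= tol

rng = random.Random(0)

p = (0.1, 0.3, 0.5, 0.7)  # p1 + p4 == p2 + p3
bern_ok = all(
    satisfies_sil_sp(*bernoulli_coupling(rng.random(), *p))
    for _ in range(10_000)
)

theta = (1.0, 1.5, 2.5, 3.0)  # theta1 + theta4 == theta2 + theta3

def draw_gamma_coupling():
    # Assumption: gamma shapes add under independent summation (unit scale).
    z1 = rng.gammavariate(theta[0], 1.0)
    z2 = rng.gammavariate(theta[1] - theta[0], 1.0)
    z3 = rng.gammavariate(theta[2] - theta[1], 1.0)
    z4 = rng.gammavariate(theta[3] - theta[2], 1.0)
    return semigroup_coupling(z1, z2, z3, z4)

semi_ok = all(satisfies_sil_sp(*draw_gamma_coupling()) for _ in range(10_000))
```

Because the conditions hold on every sample point by construction, the check does not depend on the random seed; the simulation only exercises the algebra of the two constructions.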


8.B.2 Closure properties

In this section we present some closure properties of the sample path convexity notions.

Theorem 8.B.10. Let {X(θ), θ ∈ Θ} and {Y(θ), θ ∈ Θ} be two families of random variables such that for each θ ∈ Θ, X(θ) and Y(θ) are independent. Then
(a) {X(θ), θ ∈ Θ} ∈ SICX(sp) and {Y(θ), θ ∈ Θ} ∈ SICX(sp) =⇒ {X(θ) + Y(θ), θ ∈ Θ} ∈ SICX(sp),
(b) {X(θ), θ ∈ Θ} ∈ SICV(sp) and {Y(θ), θ ∈ Θ} ∈ SICV(sp) =⇒ {X(θ) + Y(θ), θ ∈ Θ} ∈ SICV(sp),
(c) {X(θ), θ ∈ Θ} ∈ SDCX(sp) and {Y(θ), θ ∈ Θ} ∈ SDCX(sp) =⇒ {X(θ) + Y(θ), θ ∈ Θ} ∈ SDCX(sp), and
(d) {X(θ), θ ∈ Θ} ∈ SDCV(sp) and {Y(θ), θ ∈ Θ} ∈ SDCV(sp) =⇒ {X(θ) + Y(θ), θ ∈ Θ} ∈ SDCV(sp).

Proof. We will prove part (a) only, since the other parts can be similarly proven. Let θi ∈ Θ, i = 1, 2, 3, 4, be any four values such that θ1 ≤ θ2 ≤ θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3. From the definition of SICX(sp) one sees that there exist eight random variables $\hat{X}_i$, $\hat{Y}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{X}_i =_{st} X(\theta_i)$, $\hat{Y}_i =_{st} Y(\theta_i)$, i = 1, 2, 3, 4, and

$$\max[\hat{X}_2, \hat{X}_3] \le \hat{X}_4 \ \text{a.s.}, \qquad \max[\hat{Y}_2, \hat{Y}_3] \le \hat{Y}_4 \ \text{a.s.},$$
$$\hat{X}_2 + \hat{X}_3 \le \hat{X}_1 + \hat{X}_4 \ \text{a.s.}, \qquad \hat{Y}_2 + \hat{Y}_3 \le \hat{Y}_1 + \hat{Y}_4 \ \text{a.s.},$$

and $\hat{X}_i$ and $\hat{Y}_i$ are independent, i = 1, 2, 3, 4. Let $\hat{Z}_i = \hat{X}_i + \hat{Y}_i$, i = 1, 2, 3, 4. Then $\hat{Z}_i =_{st} X(\theta_i) + Y(\theta_i)$, i = 1, 2, 3, 4, and

$$\max[\hat{Z}_2, \hat{Z}_3] \le \hat{Z}_4 \ \text{a.s.} \quad \text{and} \quad \hat{Z}_2 + \hat{Z}_3 \le \hat{Z}_1 + \hat{Z}_4 \ \text{a.s.}$$

Therefore, {X(θ) + Y (θ), θ ∈ Θ} ∈ SICX(sp).

A combination of Example 8.B.4 and Theorem 8.B.10 yields the following generalization of Example 8.B.4 which will be used later.

Example 8.B.11. Let Y(n), n = 1, 2, . . . , be a sequence of nonnegative independent and identically distributed random variables with mean 1, and let Z be a random variable which is independent of the Y(n)'s. For µ > 0 define $X(\mu, n) = Z + \mu \sum_{k=1}^{n} Y(k)$, n ∈ N++. Then, for each n ∈ N++, one has {X(µ, n), µ ∈ R+} ∈ SIL(sp), and, for each µ > 0, one has {X(µ, n), n ∈ N++} ∈ SIL(sp).

Example 8.B.12. Let Y(n), n = 1, 2, . . . , be a sequence of nonnegative independent and identically distributed random variables with a common mean µ > 0, as in Example 8.A.18. Let Y* be the spread of the renewal process generated by the Y(n)'s; that is, if f is the density function of Y(1), then the density function of Y* is (1/µ)xf(x). Denote $X(n) = \sum_{k=1}^{n} Y(k)$, n ∈ N++, and let X*(n) be the spread corresponding to X(n). Then {X*(n), n ∈ N++} ∈ SIL(sp). This follows, by Example 8.B.11, from the relation

$$X^*(n) =_{st} Y^* + \sum_{k=1}^{n-1} Y(k). \qquad (8.B.2)$$

The relation (8.B.2) can be proven as follows: Consider n independent renewal processes {Ni(t), t ≥ 0}, i = 1, 2, . . . , n, all with interrenewal times that are distributed as Y(1), and consider the renewal process {N(t), t ≥ 0} with interrenewal intervals which are the sums of the corresponding interrenewal intervals of the n independent renewal processes {Ni(t), t ≥ 0}, i = 1, 2, . . . , n. That is, the interrenewal times that are associated with {N(t), t ≥ 0} are distributed as X(n). Select a t > 0 and consider the spread corresponding to the process {N(t), t ≥ 0}. Clearly the value t falls in an interrenewal interval which is the sum of the n interrenewal intervals corresponding to {N1(t), t ≥ 0}, {N2(t), t ≥ 0}, . . . , {Nn(t), t ≥ 0}. With probability 1/n, t falls in the interrenewal interval corresponding to the process {Ni(t), t ≥ 0}, i = 1, 2, . . . , n. Let U(n) be the index of the process in whose interrenewal interval t falls. Then U(n) is uniformly distributed on {1, 2, . . . , n}. If t falls in an interval corresponding to {Ni(t), t ≥ 0} (that is, when U(n) = i), then its spread is $Y^* + \sum_{k \ne i} Y(k) =_{st} Y^* + \sum_{k=1}^{n-1} Y(k)$. Note that the distribution of the spread is independent of i. Therefore, by unconditioning with respect to the value i of U(n) we obtain (8.B.2).

Theorem 8.B.13. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random variables, where Λ ⊂ R is a convex set. Also, let {Y(λ), λ ∈ Λ} be another family of random variables. Suppose that X(θ) and Y(λ) are independent for any choice of θ ∈ Θ and λ ∈ Λ.
(a) If {X(θ), θ ∈ Θ} ∈ SICX(sp) [SICV(sp)] and {Y(λ), λ ∈ Λ} ∈ SICX(sp) [SICV(sp)], then {Y(X(θ)), θ ∈ Θ} ∈ SICX(sp) [SICV(sp)].
(b) If {X(θ), θ ∈ Θ} ∈ SDCX(sp) [SDCV(sp)] and {Y(λ), λ ∈ Λ} ∈ SICX(sp) [SICV(sp)], then {Y(X(θ)), θ ∈ Θ} ∈ SDCX(sp) [SDCV(sp)].

Proof. We will prove the convex case of part (a) only, as the proofs of the other cases are similar.
Let θi ∈ Θ, i = 1, 2, 3, 4, be any four values such that θ1 ≤ θ2 ≤ θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3. Since {X(θ), θ ∈ Θ} ∈ SICX(sp), there exist four random variables $\hat{X}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 2, 3, 4, and

$$[\hat{X}_2, \hat{X}_3] \le \hat{X}_4 \ \text{a.s.} \quad \text{and} \quad \hat{X}_2 + \hat{X}_3 \le \hat{X}_1 + \hat{X}_4 \ \text{a.s.} \qquad (8.B.3)$$

Let

$$X_2^* = \min[\hat{X}_4,\ \hat{X}_1 + \hat{X}_4 - \hat{X}_3],$$

and

$$X_1^* = X_2^* + \hat{X}_3 - \hat{X}_4.$$

Clearly, $X_1^*$ and $X_2^*$ ∈ Λ a.s., and

$$X_1^* \le \hat{X}_1 \quad \text{and} \quad X_2^* \ge \hat{X}_2. \qquad (8.B.4)$$

Also,

$$[X_2^*, \hat{X}_3] \le \hat{X}_4 \quad \text{and} \quad X_1^* + \hat{X}_4 = X_2^* + \hat{X}_3.$$

Therefore, since {Y(λ), λ ∈ Λ} ∈ SICX(sp), there exist four random variables $Z_1^*$, $Z_2^*$, $\hat{Z}_3$, and $\hat{Z}_4$, defined on a common probability space, such that $Z_1^* =_{st} Y(X_1^*)$, $Z_2^* =_{st} Y(X_2^*)$, $\hat{Z}_3 =_{st} Y(\hat{X}_3)$, $\hat{Z}_4 =_{st} Y(\hat{X}_4)$, and

$$[Z_2^*, \hat{Z}_3] \le \hat{Z}_4 \ \text{a.s.} \quad \text{and} \quad Z_2^* + \hat{Z}_3 \le Z_1^* + \hat{Z}_4 \ \text{a.s.} \qquad (8.B.5)$$

Since Y(λ) is stochastically increasing in λ, from (8.B.4) it is seen that there exist random variables $\hat{Z}_i$, i = 1, 2, such that $\hat{Z}_i =_{st} Y(\hat{X}_i)$, i = 1, 2, and

$$Z_1^* \le \hat{Z}_1 \quad \text{and} \quad Z_2^* \ge \hat{Z}_2.$$

Then from (8.B.5) one sees that

$$[\hat{Z}_2, \hat{Z}_3] \le \hat{Z}_4 \ \text{a.s.} \quad \text{and} \quad \hat{Z}_2 + \hat{Z}_3 \le \hat{Z}_1 + \hat{Z}_4 \ \text{a.s.}$$

The proof is completed by observing that Y (X(θi )) =st Zˆi , i = 1, 2, 3, 4.

By letting {Y(λ), λ ∈ Λ} of Theorem 8.B.13 be deterministic (we denote it then as a real function φ : Λ → R) we obtain the following corollary.

Corollary 8.B.14. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random variables, where Λ ⊂ R is a convex set, and let φ be a real function on Λ.
(a) If {X(θ), θ ∈ Θ} ∈ SICX(sp) [or SICV(sp)] and φ is increasing and convex [or concave], then {φ(X(θ)), θ ∈ Θ} ∈ SICX(sp) [or SICV(sp)].
(b) If {X(θ), θ ∈ Θ} ∈ SDCX(sp) [or SDCV(sp)] and φ is increasing and convex [or concave], then {φ(X(θ)), θ ∈ Θ} ∈ SDCX(sp) [or SDCV(sp)].

By letting {X(θ), θ ∈ Θ} of Theorem 8.B.13 be deterministic (we denote it then as a real function φ : Θ → Λ) we obtain the following corollary.

Corollary 8.B.15. Let {Y(λ), λ ∈ Λ} be a family of real-valued random variables, where Λ ⊂ R is a convex set, and let φ be a Λ-valued function on Θ, where Θ ⊂ R is a convex set.
(a) If {Y(λ), λ ∈ Λ} ∈ SICX(sp) [or SICV(sp)] and φ is increasing and convex [or concave], then {Y(φ(θ)), θ ∈ Θ} ∈ SICX(sp) [or SICV(sp)].
(b) If {Y(λ), λ ∈ Λ} ∈ SDCX(sp) [or SDCV(sp)] and φ is increasing and convex [or concave], then {Y(φ(θ)), θ ∈ Θ} ∈ SDCX(sp) [or SDCV(sp)].

Let {X(n), n ∈ N+} be a Markov chain with state space S (S = R+ or N+). Let Y(x) =st [X(n + 1) | X(n) = x] and Z(x) = Y(x) − x, x ∈ S.


Theorem 8.B.16. Suppose X(0) = x0 a.s. If Z(x) ≥ 0 a.s. for each x ∈ S and {Z(x), x ∈ S} ∈ SI, then {X(n), n ∈ N+} ∈ SICX(sp).

Proof. Since Z(x) ≥ 0 a.s., for n1 ≤ n2 we have X(n1) ≤ X(n2) a.s. Let n3 and n4 be such that n1 ≤ n2 ≤ n3 ≤ n4 and n1 + n4 = n2 + n3. Define m = n4 − n2 = n3 − n1 and $Z^{(m)}(x) =_{st} [X(m) - x \mid X(0) = x]$. Since Z(x) is stochastically increasing in x, using sample path construction (as in the proof of Theorem 6.B.3 when it applies to Theorem 6.B.34 through Theorems 6.B.32 and 6.B.31), it can be established that $Z^{(m)}(x)$ is also stochastically increasing in x. Then there exist two random vectors $(\hat{X}_1, \hat{Z}_1)$ and $(\hat{X}_2, \hat{Z}_2)$ defined on a common probability space such that $(\hat{X}_i, \hat{Z}_i) =_{st} (X(n_i), Z^{(m)}(X(n_i)))$, i = 1, 2, and

$$(\hat{X}_1, \hat{Z}_1) \le (\hat{X}_2, \hat{Z}_2) \ \text{a.s.} \qquad (8.B.6)$$

Set

$$\hat{X}_3 = \hat{X}_1 + \hat{Z}_1 \quad \text{and} \quad \hat{X}_4 = \hat{X}_2 + \hat{Z}_2.$$

Since $Z^{(m)}(x) \ge 0$ a.s., from (8.B.6) it follows that

$$\max[\hat{X}_2, \hat{X}_3] \le \hat{X}_4 \quad \text{and} \quad \hat{X}_1 + \hat{X}_4 \ge \hat{X}_2 + \hat{X}_3.$$

The proof is now completed by noting that $X(n_i) =_{st} \hat{X}_i$, i = 1, 2, 3, 4.

Next consider a Galton-Watson branching process {X(n), n ∈ N+} in discrete time. Let Di, i = 1, 2, . . ., be independent and identically distributed random variables such that Di has the same distribution as the number of offspring of an ancestor. Then, for this process, $Y(x) =_{st} \sum_{i=1}^{x} D_i$, x ∈ N+.

Theorem 8.B.17. Suppose Di ≥ 1 a.s. and P{Di > 1} > 0. If X(0) ≥ 1 a.s., then {X(n), n ∈ N+} ∈ SICX(sp).

Proof. First, condition on X(0) = x0. Since $Z(x) = Y(x) - x =_{st} \sum_{i=1}^{x} (D_i - 1)$ and Di ≥ 1 a.s., one sees that Z(x) ≥ 0 a.s. Also, it is easily seen that {Z(x), x ∈ N+} ∈ SI. Then, conditioned on X(0) = x0, the result of Theorem 8.B.17 follows immediately from Theorem 8.B.16. From the definition of sample path convexity, it is clear that by unconditioning with respect to X(0), the sample path convexity of {X(n), n ∈ N+} is preserved.

Now consider a nonhomogeneous Poisson process {N (t), t ≥ 0} with mean value function M (t) = EN (t). To avoid trivialities we assume that M is strictly increasing. Denote by Rn the nth epoch time of this process. Theorem 8.B.18. If M is concave [convex ], then {Rn , n ∈ N++ } ∈ SICX(sp) [SICV(sp)]. Proof. Let {K(t), t ≥ 0} be a Poisson process with rate 1, and let Tn denote the nth epoch time of this process. By Example 8.B.4 we have {Tn , n ∈ N++ } ∈ SIL(sp). Now,


{Rn , n ∈ N++ } =st {M −1 (Tn ), n ∈ N++ }. Since M is increasing and concave [convex] it follows that M −1 is increasing and convex [concave]. The result now follows from Corollary 8.B.14.

Theorem 8.B.19. If M is convex [concave], then {N (t), t ∈ [0, ∞)} ∈ SICX(sp) [SICV(sp)]. Proof. Let {K(t), t ≥ 0} be a Poisson process with rate 1. By Example 8.B.2 we have {K(t), t ∈ [0, ∞)} ∈ SIL(sp). Now, {N (t), t ∈ [0, ∞)} =st {K(M (t)), t ∈ [0, ∞)}. The result now follows from Corollary 8.B.15.
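The argument of Theorem 8.B.18 can be traced on a concrete case. The sketch below assumes the convex mean value function M(t) = t² (an illustrative choice, not taken from the text), so that M⁻¹ is the increasing concave square root and the theorem predicts {Rn} ∈ SICV(sp). It builds the SIL(sp) coupling of the unit-rate epochs Tn from independent Erlang blocks, as in Example 8.B.7, maps it through M⁻¹, and checks conditions (b)(i) and (b)(ii) on each sample point. All function names are hypothetical.

```python
import math
import random

def unit_rate_epochs(rng, n1, n2, n3, n4):
    """SIL(sp) coupling of T_{n1}, ..., T_{n4} from independent Erlang blocks."""
    def erlang(k):
        # Sum of k independent unit-rate exponentials (0.0 when k == 0).
        return sum((rng.expovariate(1.0) for _ in range(k)), 0.0)
    z1, z2, z3, z4 = erlang(n1), erlang(n2 - n1), erlang(n3 - n2), erlang(n4 - n3)
    return z1, z1 + z2, z1 + z3 + z4, z1 + z2 + z3 + z4

def satisfies_sicv_sp(x1, x2, x3, x4, tol=1e-9):
    """x1 <= min[x2, x3] and x1 + x4 <= x2 + x3 (up to rounding)."""
    return x1 <= min(x2, x3) + tol and x1 + x4 <= x2 + x3 + tol

# Assumption: M(t) = t^2, so M^{-1}(s) = sqrt(s) is increasing and concave.
inverse_mean_value = math.sqrt

rng = random.Random(1)
n = (2, 3, 6, 7)  # n1 + n4 == n2 + n3
epochs_ok = all(
    satisfies_sicv_sp(*map(inverse_mean_value, unit_rate_epochs(rng, *n)))
    for _ in range(5_000)
)
```

The pathwise inequality holds because a concave increasing map applied to a coupling with equal sums and ordered extremes can only decrease the sum of the outer pair relative to the inner pair.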

8.C Convexity in the Usual Stochastic Order

In some applications it is hard to find the construction needed to establish the sample path convexity of Section 8.B. Then the stochastic convexity notions of this section may be useful.

8.C.1 Definitions

Let {X(θ), θ ∈ Θ} be a family of random variables with survival functions $\overline{F}(x, \theta) = P\{X(\theta) > x\}$, θ ∈ Θ. The family {X(θ), θ ∈ Θ} is said to be stochastically increasing [decreasing] and convex [concave, linear] in the sense of the usual stochastic ordering if Eφ(X(θ)) is increasing [decreasing] and convex [concave, linear] in θ for all increasing functions φ. We denote this by {X(θ), θ ∈ Θ} ∈ SICX(st) [SICV(st), SIL(st), SDCX(st), SDCV(st), SDL(st)]. It is easy to see the following characterization.

Theorem 8.C.1. The family {X(θ), θ ∈ Θ} satisfies {X(θ), θ ∈ Θ} ∈ SICX(st) [SICV(st), SDCX(st), SDCV(st)] if, and only if, $\overline{F}(x, \theta)$ is increasing and convex [increasing and concave, decreasing and convex, decreasing and concave] in θ for each fixed x.

The following are other characterizations of these notions.

Theorem 8.C.2. The family {X(θ), θ ∈ Θ} satisfies {X(θ), θ ∈ Θ} ∈ SICX(st) [SICV(st), SDCX(st), SDCV(st)] if, and only if, for any θi ∈ Θ, i = 1, 2, 3, 4, such that θ1 ≤ θ2 ≤ θ3 ≤ θ4 and θ1 + θ4 = θ2 + θ3, there exist four random variables $\hat{X}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 2, 3, 4, and $\hat{X}_1 \le [\le, \ge, \ge]\ \hat{X}_4$ a.s., $\min\{\hat{X}_1, \hat{X}_4\} \ge [\le, \ge, \le]\ \min\{\hat{X}_2, \hat{X}_3\}$ a.s., $\max\{\hat{X}_1, \hat{X}_4\} \ge [\le, \ge, \le]\ \max\{\hat{X}_2, \hat{X}_3\}$ a.s., and hence $\hat{X}_1 + \hat{X}_4 \ge [\le, \ge, \le]\ \hat{X}_2 + \hat{X}_3$ a.s.


Proof. We prove the increasing convex case only since the other cases can be proven similarly. Since X(θ) is stochastically increasing in θ there exist, on a common probability space, random variables $\hat{X}_1$ and $\hat{X}_4$ such that $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 4, and $\hat{X}_1 \le \hat{X}_4$ a.s. Let U be a uniform random variable on (0, 1) and define

$$X_2^* = I\Big\{U \le \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1}\Big\}\hat{X}_4 + \Big(1 - I\Big\{U \le \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1}\Big\}\Big)\hat{X}_1,$$

and

$$X_3^* = I\Big\{U \le \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1}\Big\}\hat{X}_1 + \Big(1 - I\Big\{U \le \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1}\Big\}\Big)\hat{X}_4.$$

Then

$$\min[X_2^*, X_3^*] = \min[\hat{X}_1, \hat{X}_4] \qquad (8.C.1)$$

and

$$\max[X_2^*, X_3^*] = \max[\hat{X}_1, \hat{X}_4]. \qquad (8.C.2)$$

Also note that

$$P\{X_2^* > x\} = \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1}\overline{F}(x, \theta_4) + \frac{\theta_4 - \theta_2}{\theta_4 - \theta_1}\overline{F}(x, \theta_1),$$

and

$$P\{X_3^* > x\} = \frac{\theta_2 - \theta_1}{\theta_4 - \theta_1}\overline{F}(x, \theta_1) + \frac{\theta_4 - \theta_2}{\theta_4 - \theta_1}\overline{F}(x, \theta_4).$$

Since $\overline{F}(x, \theta)$ is increasing and convex in θ, it is then obvious that

$$X_2^* \ge_{st} X(\theta_2) \quad \text{and} \quad X_3^* \ge_{st} X(\theta_3).$$

Therefore, there exist $\hat{X}_2$ and $\hat{X}_3$ such that $\hat{X}_i =_{st} X(\theta_i)$, i = 2, 3, and $X_i^* \ge \hat{X}_i$, i = 2, 3. Then, from (8.C.1) and (8.C.2), one sees that

$$\min[\hat{X}_2, \hat{X}_3] \le \min[\hat{X}_1, \hat{X}_4] \quad \text{and} \quad \max[\hat{X}_2, \hat{X}_3] \le \max[\hat{X}_1, \hat{X}_4].$$

The proof is now completed by observing that $X(\theta_i) =_{st} \hat{X}_i$, i = 1, 2, 3, 4.

Example 8.C.3. Let X(p) be a geometric random variable with mean 1/(1 − p). Then {X(p), p ∈ (0, 1)} ∈ SICX(st).

Example 8.C.4. Let X(λ) be an exponential random variable with mean 1/λ. Then {X(λ), λ ∈ (0, ∞)} ∈ SDCX(st).

It is evident from Theorems 8.C.2 and 8.B.9 that one has the following results.

Theorem 8.C.5.
SICX(st) =⇒ SICX(sp) =⇒ SICX,
SICV(st) =⇒ SICV(sp) =⇒ SICV,
SDCX(st) =⇒ SDCX(sp) =⇒ SDCX,
SDCV(st) =⇒ SDCV(sp) =⇒ SDCV.

Observing that {ψ(θ), θ ∈ Θ} ∈ SICX(sp) for any increasing convex function ψ, and that it is not SICX(st), it is clear that the implications in Theorem 8.C.5 are strict.
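Examples 8.C.3 and 8.C.4 can be checked numerically through the survival-function characterization of Theorem 8.C.1. The sketch below assumes the geometric variable is supported on {1, 2, . . .}, so that P{X(p) > k} = p^k (consistent with the stated mean 1/(1 − p)), and uses P{X(λ) > x} = e^{−λx} for the exponential case. Monotonicity and convexity in the parameter are tested by first and second differences on an equally spaced grid; the helper names are hypothetical and the grid check is only a finite approximation of the property.

```python
import math

def is_monotone(values, increasing=True, tol=1e-12):
    """First differences all nonnegative (or all nonpositive)."""
    pairs = zip(values, values[1:])
    if increasing:
        return all(b - a >= -tol for a, b in pairs)
    return all(b - a <= tol for a, b in pairs)

def is_convex(values, tol=1e-12):
    """Second differences nonnegative on an equally spaced grid."""
    return all(a - 2 * b + c >= -tol
               for a, b, c in zip(values, values[1:], values[2:]))

def geometric_survival(k, p):
    # Assumption: support {1, 2, ...}, so P{X(p) > k} = p**k.
    return p ** k

def exponential_survival(x, lam):
    return math.exp(-lam * x)

p_grid = [i / 20 for i in range(1, 20)]
lam_grid = [i / 4 for i in range(1, 41)]

geometric_sicx = all(
    is_monotone([geometric_survival(k, p) for p in p_grid])
    and is_convex([geometric_survival(k, p) for p in p_grid])
    for k in range(0, 15)
)
exponential_sdcx = all(
    is_monotone([exponential_survival(x, lam) for lam in lam_grid], increasing=False)
    and is_convex([exponential_survival(x, lam) for lam in lam_grid])
    for x in (0.0, 0.5, 1.0, 3.0)
)
```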


8.C.2 Closure properties

Unlike the two previous notions, stochastic convexity in the usual stochastic ordering does not have many closure properties. For example, there are no counterparts to Theorems 8.A.15 and 8.A.17 or Theorems 8.B.10 and 8.B.13 for this stochastic convexity notion. Instead, we present some specialized closure properties under random summation.

Theorem 8.C.6. Let {N(θ), θ ∈ Θ} be a family of discrete random variables on N+, let {X(n), n = 1, 2, . . . } be a sequence of independent and identically distributed nonnegative random variables, and let X(0) = 0. Suppose that {N(θ), θ ∈ Θ} and {X(n), n ∈ N+} are mutually independent. Set $Y(\theta) = \sum_{n=0}^{N(\theta)} X(n)$, θ ∈ Θ. If {N(θ), θ ∈ Θ} ∈ SICX(st) [SICV(st), SDCX(st), SDCV(st)], then {Y(θ), θ ∈ Θ} ∈ SICX(st) [SICV(st), SDCX(st), SDCV(st)].

Proof. Consider the case {N(θ), θ ∈ Θ} ∈ SICX(st). The other three cases can be similarly proven. From Theorem 8.C.2 one knows that for any θi ∈ Θ, i = 1, 2, 3, 4, such that θ1 ≤ θ2 ≤ θ3 ≤ θ4, and θ1 + θ4 = θ2 + θ3, there exist four random variables $\hat{N}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{N}_i =_{st} N(\theta_i)$, i = 1, 2, 3, 4, and

$$\hat{N}_4 \ge \hat{N}_1 \quad \text{a.s.}, \qquad (8.C.3)$$
$$\min\{\hat{N}_1, \hat{N}_4\} \ge \min\{\hat{N}_2, \hat{N}_3\} \quad \text{a.s.}, \qquad (8.C.4)$$
$$\max\{\hat{N}_1, \hat{N}_4\} \ge \max\{\hat{N}_2, \hat{N}_3\} \quad \text{a.s., and hence} \qquad (8.C.5)$$
$$\hat{N}_1 + \hat{N}_4 \ge \hat{N}_2 + \hat{N}_3 \quad \text{a.s.} \qquad (8.C.6)$$

Define $\hat{Y}_i = \sum_{n=0}^{\hat{N}_i} X(n)$, i = 1, 2, 3, 4. Then, clearly, $\hat{Y}_i =_{st} Y(\theta_i)$, i = 1, 2, 3, 4. Furthermore, from (8.C.3)–(8.C.6), one sees that

$$\hat{Y}_4 \ge \hat{Y}_1 \quad \text{a.s.}, \qquad (8.C.7)$$
$$\min\{\hat{Y}_1, \hat{Y}_4\} \ge \min\{\hat{Y}_2, \hat{Y}_3\} \quad \text{a.s.}, \qquad (8.C.8)$$
$$\max\{\hat{Y}_1, \hat{Y}_4\} \ge \max\{\hat{Y}_2, \hat{Y}_3\} \quad \text{a.s., and hence} \qquad (8.C.9)$$
$$\hat{Y}_1 + \hat{Y}_4 \ge \hat{Y}_2 + \hat{Y}_3 \quad \text{a.s.} \qquad (8.C.10)$$

Theorem 8.C.6 then follows from Theorem 8.C.2.

Theorem 8.C.7. Consider {X(θ), θ ∈ Θ} and {Y (θ), θ ∈ Θ} and suppose that, for each θ, X(θ) and Y (θ) are independent. Deﬁne V (θ) = max{X(θ), Y (θ)} and W (θ) = min{X(θ), Y (θ)}. (i) If {X(θ), θ ∈ Θ} ∈ SICX(st) [SDCX(st)] and {Y (θ), θ ∈ Θ} ∈ SICX(st) [SDCX(st)], then {W (θ), θ ∈ Θ} ∈ SICX(st) [SDCX(st)].

8.D Strong Stochastic Convexity


(ii) If {X(θ), θ ∈ Θ} ∈ SICV(st) [SDCV(st)] and {Y (θ), θ ∈ Θ} ∈ SICV(st) [SDCV(st)], then {V (θ), θ ∈ Θ} ∈ SICV(st) [SDCV(st)]. Proof. The stated results follow immediately from the observations that (i) the survival function of W (θ) at x is equal to P {X(θ) > x}P {Y (θ) > x}, (ii) the survival function of V (θ) at x is equal to 1 − (1 − P {X(θ) > x})(1 − P {Y (θ) > x}), and from Theorem 8.C.1.

Consider the imperfect repair model. A new item with an absolutely continuous survival function $\bar{F}$ undergoes an imperfect repair each time it fails before it is scrapped. With probability p the repair is unsuccessful and the item is scrapped. With probability 1 − p the repair is successful and minimal, that is, after a successful repair at time t the item is as good as a working item at age t. It is well known that if X(p) denotes the time to scrap, then the survival function of X(p) is $\bar{F}^p$. Thus, the following result is apparent.

Theorem 8.C.8. Let $\bar{F}$ be an absolutely continuous survival function such that $\bar{F}(0) = 1$. Then {X(p), p ∈ (0, 1)} ∈ SDCX(st).
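The reason the result is apparent is that, for a fixed value a of the base survival function in (0, 1], the map p ↦ a^p is decreasing and convex. The following sketch checks this on an equally spaced grid of p values by means of first and second differences; the function names and the grid are illustrative assumptions, not part of the original text.

```python
def survival_scrap_time(base_survival, p):
    """Survival probability of the time to scrap, given the base survival value."""
    return base_survival ** p


def is_decreasing_convex(values, tol=1e-12):
    """Check first differences <= 0 and second differences >= 0.

    The values are assumed to be taken on an equally spaced grid.
    """
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return all(d <= tol for d in first) and all(d >= -tol for d in second)


def check_grid(base_values, grid_size=19):
    """Check every base survival value over p = 1/20, 2/20, ..., 19/20."""
    ps = [i / (grid_size + 1) for i in range(1, grid_size + 1)]
    return all(
        is_decreasing_convex([survival_scrap_time(a, p) for p in ps])
        for a in base_values
    )
```

A call such as `check_grid([0.05, 0.5, 0.95, 1.0])` covers several base survival values at once.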

8.D Strong Stochastic Convexity

Another notion which is sometimes useful in verifying the sample path convexity of Section 8.B is described in this section.

8.D.1 Definitions

Let {X(θ), θ ∈ Θ} be a family of random variables. The family {X(θ), θ ∈ Θ} is said to be stochastically [increasing, decreasing] and convex [concave, linear] almost everywhere if there exist $\{\hat{X}(\theta), \theta \in \Theta\}$ such that $\hat{X}(\theta) =_{st} X(\theta)$ for each θ ∈ Θ and $\hat{X}(\theta)$ is [increasing, decreasing] and convex [concave, linear] in θ. We denote this by {X(θ), θ ∈ Θ} ∈ SCX(ae) [SCV(ae), SL(ae), SICX(ae), SICV(ae), SIL(ae), SDCX(ae), SDCV(ae), SDL(ae)].

Although it appears that the definition of strong stochastic convexity/concavity is restrictive, several families of random variables do satisfy the conditions of this class of convexity/concavity. This is shown in the next theorem and in the corollaries and examples which follow it.

Theorem 8.D.1. Suppose that X(θ) = φ(θ, Z), where φ is a real-valued deterministic function, and Z is a random vector. If φ is convex [concave, linear, increasing convex, increasing concave, increasing linear, decreasing convex, decreasing concave, decreasing linear] in θ ∈ Θ, then {X(θ), θ ∈ Θ} ∈ SCX(ae) [SCV(ae), SL(ae), SICX(ae), SICV(ae), SIL(ae), SDCX(ae), SDCV(ae), SDL(ae)].


Corollary 8.D.2. Suppose that X(θ) = Z + ψ(θ), where ψ is a real-valued deterministic function, and Z is a random variable. If ψ is convex [concave, linear, increasing convex, increasing concave, increasing linear, decreasing convex, decreasing concave, decreasing linear] in θ ∈ Θ, then {X(θ), θ ∈ Θ} ∈ SCX(ae) [SCV(ae), SL(ae), SICX(ae), SICV(ae), SIL(ae), SDCX(ae), SDCV(ae), SDL(ae)].

Corollary 8.D.3. Suppose that X(θ) = Z · ψ(θ), where ψ is a real-valued deterministic function, and Z is a nonnegative random variable. If ψ is convex [concave, linear, increasing convex, increasing concave, increasing linear, decreasing convex, decreasing concave, decreasing linear] in θ ∈ Θ, then {X(θ), θ ∈ Θ} ∈ SCX(ae) [SCV(ae), SL(ae), SICX(ae), SICV(ae), SIL(ae), SDCX(ae), SDCV(ae), SDL(ae)].

Example 8.D.4. Let X(µ, σ) be a normal random variable with mean µ and standard deviation σ. Since for a unit normal random variable N(0, 1), we have $\hat{X}(\mu, \sigma) = \mu + \sigma N(0, 1) =_{st} X(\mu, \sigma)$, µ ∈ R, σ ∈ R+, we see that, for each σ > 0, {X(µ, σ), µ ∈ R} ∈ SIL(ae), and, for each µ ∈ R, {X(µ, σ), σ ∈ R+} ∈ SL(ae).

Similarly one can prove the result in the next example.

Example 8.D.5. Let Y(n), n = 1, 2, . . . , be a sequence of nonnegative independent and identically distributed random variables with mean 1. For µ > 0 define $X(\mu) = \mu \sum_{k=1}^{n} Y(k)$, n ∈ N+. Then, for each n ∈ N++, one has {X(µ), µ ∈ R+} ∈ SIL(ae).

Specifically, when Y(n) in Example 8.D.5 is an exponential random variable we have the following example.

Example 8.D.6. Let X(µ, n) be an Erlang-n random variable with mean nµ and variance $n\mu^2$. Then, for each n ∈ N++, one has {X(µ, n), µ ∈ R+} ∈ SIL(ae).

By taking n = 1 in Example 8.D.5 we obtain the following result.

Example 8.D.7. Let Y be a nonnegative random variable. For µ > 0 define X(µ) = µY. Then {X(µ), µ ∈ R+} ∈ SIL(ae).

The following generalization of Example 8.D.5 is easily observed.

Example 8.D.8. Let Y(n), n = 1, 2, . . . , be a sequence of nonnegative independent and identically distributed random variables with mean 1, and let Z be a random variable which is independent of the Y(n)'s. For µ > 0 define $X(\mu) = Z + \mu \sum_{k=1}^{n} Y(k)$, n ∈ N++. Then, for each n ∈ N++, one has {X(µ), µ ∈ R+} ∈ SIL(ae).
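The constructions in these examples all reuse a single random draw for every parameter value. As an illustration only, the sketch below builds one sample path µ ↦ µ(Y(1) + · · · + Y(n)) with exponential Y(k) (the Erlang case) and checks that it is increasing and linear in µ on an equally spaced grid. The function names and the tolerance are assumptions made for the sketch.

```python
import random


def erlang_coupling(mus, n, rng):
    """One sample path mu -> mu * (Y(1) + ... + Y(n)) from a single shared draw."""
    total = sum(rng.expovariate(1.0) for _ in range(n))  # Y(k): exponential, mean 1
    return [mu * total for mu in mus]


def is_increasing_linear(values, tol=1e-9):
    """Check first differences >= 0 and second differences == 0 (up to tol).

    The values are assumed to be taken on an equally spaced grid.
    """
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return all(d >= -tol for d in first) and all(abs(d) <= tol for d in second)
```

For instance, `is_increasing_linear(erlang_coupling([0.5, 1.0, 1.5, 2.0], 3, random.Random(1)))` checks one path; each coordinate of the path has the Erlang-3 law with the corresponding scale µ, because scaling an Erlang variable by µ gives an Erlang variable with mean nµ.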


Another sufficient condition (in addition to Theorem 8.D.1 and Corollaries 8.D.2 and 8.D.3) for strong convexity and concavity is described next. Let {X(θ), θ ∈ Θ} be a family of random variables, and let $F_\theta$ denote the distribution function of X(θ). If U is a uniform[0, 1] random variable, then $F_\theta^{-1}(U) =_{st} X(\theta)$. The following result follows at once from this observation.

Theorem 8.D.9. If $F_\theta^{-1}(u)$ is convex [concave, linear, increasing convex, increasing concave, increasing linear, decreasing convex, decreasing concave, decreasing linear] in θ ∈ Θ, for all u ∈ (0, 1), then {X(θ), θ ∈ Θ} ∈ SCX(ae) [SCV(ae), SL(ae), SICX(ae), SICV(ae), SIL(ae), SDCX(ae), SDCV(ae), SDL(ae)].

A sufficient condition for strong convexity and concavity, which is stated on $F_\theta$ (rather than on $F_\theta^{-1}$ as in Theorem 8.D.9), is described next. Recall the definition of supermodular and submodular functions given in Section 7.A.8.

Theorem 8.D.10. Let {X(θ), θ ∈ Θ} be a family of random variables, and suppose that all the partial second derivatives of $F_\theta(x)$ exist.
(a) If $F_\theta(x)$ is concave and strictly increasing in x, and is decreasing and concave in θ, and if $F_\theta(x)$ is submodular in (x, θ), then {X(θ), θ ∈ Θ} ∈ SICX(ae).
(b) If $F_\theta(x)$ is convex and strictly increasing in x, and is decreasing and convex in θ, and if $F_\theta(x)$ is supermodular in (x, θ), then {X(θ), θ ∈ Θ} ∈ SICV(ae).
(c) If $F_\theta(x)$ is concave and strictly increasing in x, and is increasing and concave in θ, and if $F_\theta(x)$ is supermodular in (x, θ), then {X(θ), θ ∈ Θ} ∈ SDCX(ae).
(d) If $F_\theta(x)$ is convex and strictly increasing in x, and is increasing and convex in θ, and if $F_\theta(x)$ is submodular in (x, θ), then {X(θ), θ ∈ Θ} ∈ SDCV(ae).

Proof. Only the proof of part (a) is given; the proofs of the other parts are similar. Let U be a uniform[0, 1] random variable and define $\hat{X}(\theta)$ by

$$F_\theta(\hat{X}(\theta)) = U. \tag{8.D.1}$$

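Differentiating (8.D.1) for a fixed value of U, we obtain

$$\frac{\partial}{\partial x}F \cdot \frac{\partial}{\partial \theta}\hat{X} + \frac{\partial}{\partial \theta}F = 0, \tag{8.D.2}$$

and

$$\frac{\partial}{\partial x}F \cdot \frac{\partial^2}{\partial \theta^2}\hat{X} + \frac{\partial^2}{\partial x^2}F \cdot \left(\frac{\partial}{\partial \theta}\hat{X}\right)^2 + 2\,\frac{\partial^2}{\partial x\,\partial \theta}F \cdot \frac{\partial}{\partial \theta}\hat{X} + \frac{\partial^2}{\partial \theta^2}F = 0. \tag{8.D.3}$$

The conditions stated in part (a) can be written as

$$\frac{\partial}{\partial x}F > 0, \quad \frac{\partial}{\partial \theta}F \le 0, \quad \frac{\partial^2}{\partial x^2}F \le 0, \quad \frac{\partial^2}{\partial \theta^2}F \le 0, \quad \text{and} \quad \frac{\partial^2}{\partial x\,\partial \theta}F \le 0. \tag{8.D.4}$$

From (8.D.2), (8.D.3), and (8.D.4) it is seen that $\frac{\partial}{\partial \theta}\hat{X} \ge 0$ and $\frac{\partial^2}{\partial \theta^2}\hat{X} \ge 0$, that is, {X(θ), θ ∈ Θ} ∈ SICX(ae).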
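The quantile construction behind Theorem 8.D.9 can be tried out on a concrete family. The sketch below is an illustration under an assumption that is not in the text: X(θ) is taken to be exponential with mean θ², θ > 0, so that the quantile function is −θ² log(1 − u), which is increasing and convex in θ for every u in (0, 1). The code checks this through first and second differences on an equally spaced grid; all names are hypothetical.

```python
import math


def quantile(theta, u):
    """Quantile function of an exponential law with mean theta**2 (assumed family)."""
    return -theta ** 2 * math.log1p(-u)


def increasing_convex_in_theta(u, thetas, tol=1e-12):
    """Check that theta -> quantile(theta, u) has nonnegative first and
    second differences on an equally spaced grid of theta values."""
    values = [quantile(t, u) for t in thetas]
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return all(d >= -tol for d in first) and all(d >= -tol for d in second)


def check_family(us, thetas):
    """Apply the check at every level u; the same U then couples all theta."""
    return all(increasing_convex_in_theta(u, thetas) for u in us)
```

Since the same uniform variable U is plugged into every quantile function, a property of θ ↦ quantile(θ, u) that holds for each u holds along every sample path of the coupled family.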

The following theorem is easily verified.

Theorem 8.D.11.
SICX(ae) =⇒ SICX(sp) =⇒ SICX,
SICV(ae) =⇒ SICV(sp) =⇒ SICV,
SDCX(ae) =⇒ SDCX(sp) =⇒ SDCX,
SDCV(ae) =⇒ SDCV(sp) =⇒ SDCV.
These are strict implications.

It can be verified that the stochastic convexity in the usual stochastic order neither implies nor is implied by the strong stochastic convexity.

8.D.2 Closure properties

In this subsection we present some closure properties of the strong convexity notions. These results follow trivially from the closure properties of deterministic functions. Thus we will not give the proofs here.

Theorem 8.D.12. Let {X(θ), θ ∈ Θ} and {Y(θ), θ ∈ Θ} be two families of random variables such that for each θ ∈ Θ, X(θ) and Y(θ) are independent.
(a) {X(θ), θ ∈ Θ} ∈ SICX(ae) and {Y(θ), θ ∈ Θ} ∈ SICX(ae) imply that {f(X(θ), Y(θ)), θ ∈ Θ} ∈ SICX(ae) for any increasing and convex function f.
(b) {X(θ), θ ∈ Θ} ∈ SICV(ae) and {Y(θ), θ ∈ Θ} ∈ SICV(ae) imply that {f(X(θ), Y(θ)), θ ∈ Θ} ∈ SICV(ae) for any increasing and concave function f.
(c) {X(θ), θ ∈ Θ} ∈ SDCX(ae) and {Y(θ), θ ∈ Θ} ∈ SDCX(ae) imply that {f(X(θ), Y(θ)), θ ∈ Θ} ∈ SDCX(ae) for any increasing and convex function f.
(d) {X(θ), θ ∈ Θ} ∈ SDCV(ae) and {Y(θ), θ ∈ Θ} ∈ SDCV(ae) imply that {f(X(θ), Y(θ)), θ ∈ Θ} ∈ SDCV(ae) for any increasing and concave function f.

Theorem 8.D.13. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random variables, where Λ ⊂ R is a convex set. Also, let {Y(λ), λ ∈ Λ} be another family of random variables. Suppose that X(θ) and Y(λ) are independent for any choice of θ ∈ Θ and λ ∈ Λ.
(a) If {X(θ), θ ∈ Θ} ∈ SICX(ae) [SICV(ae)] and {Y(λ), λ ∈ Λ} ∈ SICX(ae) [SICV(ae)], then {Y(X(θ)), θ ∈ Θ} ∈ SICX(ae) [SICV(ae)].
(b) If {X(θ), θ ∈ Θ} ∈ SDCX(ae) [SDCV(ae)] and {Y(λ), λ ∈ Λ} ∈ SICX(ae) [SICV(ae)], then {Y(X(θ)), θ ∈ Θ} ∈ SDCX(ae) [SDCV(ae)].


8.E Stochastic Directional Convexity 8.E.1 Deﬁnitions In Sections 8.A–8.D of this chapter, the parameter space Θ, of the families of random variables {X(θ), θ ∈ Θ} that we studied, was a subset of the real line R. However, in some applications the parameter space is multidimensional, that is, Θ is a subset of Rm for some positive integer m ≥ 2. In this section we study such families of random variables or vectors. In such cases one is interested in convexity [concavity] properties with respect to the vector θ = (θ1 , θ2 , . . . , θm ). Rather than studying convexity [concavity] properties of {X(θ), θ ∈ Θ}, we will study here directional convexity [concavity] properties of such families of random variables or vectors. The reader may recall the deﬁnition of directional convexity [concavity] given in (7.A.17) of Section 7.A.8. Below Θ will always be a sublattice of Rm . Let {X(θ), θ ∈ Θ} be a family of random vectors. The family {X(θ), θ ∈ Θ} is said to be (a) stochastically increasing and directionally convex [concave] if {X(θ), θ ∈ Θ} ∈ SI and if Eφ(X(θ)) is directionally convex [concave] in θ for any increasing directionally convex [concave] function φ. We denote it by {X(θ), θ ∈ Θ} ∈ SI-DIR-CX [SI-DIR-CV]; (b) stochastically increasing and directionally linear if {X(θ), θ ∈ Θ} ∈ SI-DIR-CX ∩ SI-DIR-CV. We denote it by {X(θ), θ ∈ Θ} ∈ SI-DIR-L; (c) stochastically decreasing and directionally convex [concave] if {X(θ), θ ∈ Θ} ∈ SD and if Eφ(X(θ)) is directionally convex [concave] in θ for any increasing directionally convex [concave] function φ. We denote it by {X(θ), θ ∈ Θ} ∈ SD-DIR-CX [SD-DIR-CV]; (d) stochastically decreasing and directionally linear if {X(θ), θ ∈ Θ} ∈ SD-DIR-CX ∩ SD-DIR-CV. We denote it by {X(θ), θ ∈ Θ} ∈ SD-DIR-L. 
In particular, if X(θ) is a univariate random variable for all θ ∈ Θ, then {X(θ), θ ∈ Θ} ∈ SI-DIR-CX [SI-DIR-CV] if, and only if, {X(θ), θ ∈ Θ} ∈ SI and Eφ(X(θ)) is directionally convex [concave] in θ for any increasing convex [concave] function φ. Similarly, {X(θ), θ ∈ Θ} ∈ SD-DIR-CX [SD-DIR-CV] if, and only if, {X(θ), θ ∈ Θ} ∈ SD and Eφ(X(θ)) is directionally convex [concave] in θ for any increasing convex [concave] function φ. If both the parameter and the random variables are univariate, then the notions of SI-DIR-CX, SI-DIR-CV, SI-DIR-L, SD-DIR-CX, SD-DIR-CV, and SD-DIR-L, reduce to the notions of SICX, SICV, SIL, SDCX, SDCV, and SDL, respectively.

In order to define stochastic directional convexity [concavity] in the sample path sense let {X(θ), θ ∈ Θ} be a family of random vectors as above. Let $\theta_i \in \Theta$, i = 1, 2, 3, 4, be any four vectors such that $\theta_1 \le [\theta_2, \theta_3] \le \theta_4$ and $\theta_1 + \theta_4 = \theta_2 + \theta_3$. If there exist four random variables $\hat{X}_i$, i = 1, 2, 3, 4, defined on a common probability space, such that $\hat{X}_i =_{st} X(\theta_i)$, i = 1, 2, 3, 4, and
(a) (i) $[\hat{X}_2, \hat{X}_3] \le \hat{X}_4$ a.s. and (ii) $\hat{X}_2 + \hat{X}_3 \le \hat{X}_1 + \hat{X}_4$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and directionally convex in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SI-DIR-CX(sp));
(b) (i) $\hat{X}_1 \le [\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \le \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and directionally concave in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SI-DIR-CV(sp));
(c) (i) $\hat{X}_1 \ge [\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \ge \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and directionally convex in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SD-DIR-CX(sp));
(d) (i) $\hat{X}_4 \le [\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 \le \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and directionally concave in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SD-DIR-CV(sp));
(e) (i) $[\hat{X}_2, \hat{X}_3] \le \hat{X}_4$ a.s. and (ii) $\hat{X}_2 + \hat{X}_3 = \hat{X}_1 + \hat{X}_4$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically increasing and directionally linear in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SI-DIR-L(sp));
(f) (i) $\hat{X}_1 \ge [\hat{X}_2, \hat{X}_3]$ a.s. and (ii) $\hat{X}_1 + \hat{X}_4 = \hat{X}_2 + \hat{X}_3$ a.s., then {X(θ), θ ∈ Θ} is said to be stochastically decreasing and directionally linear in the sample path sense (denoted by {X(θ), θ ∈ Θ} ∈ SD-DIR-L(sp)).

If both the parameter and the random variables are univariate, then the notions of SI-DIR-CX(sp), SI-DIR-CV(sp), SI-DIR-L(sp), SD-DIR-CX(sp), SD-DIR-CV(sp), and SD-DIR-L(sp), reduce to the notions of SICX(sp), SICV(sp), SIL(sp), SDCX(sp), SDCV(sp), and SDL(sp), respectively.

8.E.2 Closure properties

The following two results are extensions of Theorems 8.A.17 and 8.B.13 to the stochastic directional convexity setting. The proof of Theorem 8.E.1 is similar to the proof of Theorem 8.A.17, using Proposition 7.A.28.
The proof of Theorem 8.E.2 is similar to the proof of Theorem 8.B.13, where the minimum in (8.B.3) is performed coordinatewise. Theorem 8.E.1. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random vectors, and let {Y (λ), λ ∈ Λ} be another family of random vectors. Suppose that X(θ) and Y (λ) are independent for any choice of θ ∈ Θ and λ ∈ Λ. (a) If {X(θ), θ ∈ Θ} ∈ SI-DIR-CX [SI-DIR-CV, SI-DIR-L] and {Y (λ), λ ∈ Λ} ∈ SI-DIR-CX [SI-DIR-CV, SI-DIR-L], then {Y (X(θ)), θ ∈ Θ} ∈ SI-DIR-CX [SI-DIR-CV, SI-DIR-L]. (b) If {X(θ), θ ∈ Θ} ∈ SD-DIR-CX [SD-DIR-CV, SD-DIR-L] and {Y (λ), λ ∈ Λ} ∈ SI-DIR-CX [SI-DIR-CV, SI-DIR-L], then {Y (X(θ)), θ ∈ Θ} ∈ SD-DIR-CX [SD-DIR-CV, SD-DIR-L]. Theorem 8.E.2. Let {X(θ), θ ∈ Θ} be a family of Λ-valued random vectors, and let {Y (λ), λ ∈ Λ} be another family of random vectors. Suppose that X(θ) and Y (λ) are independent for any choice of θ ∈ Θ and λ ∈ Λ.


(a) If {X(θ), θ ∈ Θ} ∈ SI-DIR-CX(sp) [SI-DIR-CV(sp), SI-DIR-L(sp)] and {Y(λ), λ ∈ Λ} ∈ SI-DIR-CX(sp) [SI-DIR-CV(sp), SI-DIR-L(sp)], then {Y(X(θ)), θ ∈ Θ} ∈ SI-DIR-CX(sp) [SI-DIR-CV(sp), SI-DIR-L(sp)].
(b) If {X(θ), θ ∈ Θ} ∈ SD-DIR-CX(sp) [SD-DIR-CV(sp), SD-DIR-L(sp)] and {Y(λ), λ ∈ Λ} ∈ SI-DIR-CX(sp) [SI-DIR-CV(sp), SI-DIR-L(sp)], then {Y(X(θ)), θ ∈ Θ} ∈ SD-DIR-CX(sp) [SD-DIR-CV(sp), SD-DIR-L(sp)].

From Theorem 8.E.2 it is easy to verify the following results.

Theorem 8.E.3.
SI-DIR-CX(sp) =⇒ SI-DIR-CX,
SI-DIR-CV(sp) =⇒ SI-DIR-CV,
SD-DIR-CX(sp) =⇒ SD-DIR-CX,
SD-DIR-CV(sp) =⇒ SD-DIR-CV.

The next results will be stated only for the increasing convex cases; however, they have versions that apply to the decreasing convex, the increasing concave, and the decreasing concave cases. By combining independent SI-DIR-CX [SI-DIR-CX(sp)] families of random vectors, one obtains a new SI-DIR-CX [SI-DIR-CX(sp)] family of random vectors.

Theorem 8.E.4. Let $\{X_i(\theta_i), \theta_i \in \Theta_i\}$ ∈ SI-DIR-CX [SI-DIR-CX(sp)], i = 1, 2, . . . , m, be mutually independent collections of random vectors. Define $X(\theta) = (X_1(\theta_1), X_2(\theta_2), \ldots, X_m(\theta_m))$. Then $\{X(\theta), \theta \in \times_{i=1}^{m} \Theta_i\}$ ∈ SI-DIR-CX [SI-DIR-CX(sp)].

The (sp) part of Theorem 8.E.4 can be proven by observing that, by independence, the constructions required by the definition of the SI-DIR-CX(sp) notion can be done coordinatewise. The other part of Theorem 8.E.4 can be verified by noticing that an m-variate directionally convex function is also directionally convex in any subset of the m coordinates, and again using the independence assumption.

As a special case of Theorem 8.E.4 it is seen that if the families of random variables $\{X_i(\theta_i), \theta_i \in \Theta_i\}$ ∈ SICX [SICX(sp)], i = 1, 2, . . . , m, then $\{(X_1(\theta_1), X_2(\theta_2), \ldots, X_m(\theta_m)), (\theta_1, \theta_2, \ldots, \theta_m) \in \times_{i=1}^{m} \Theta_i\}$ ∈ SI-DIR-CX [SI-DIR-CX(sp)]. A version of Theorem 8.E.4, in which some or all of the parameters are the same, can also be stated and proven.
For example, if the families of random variables $\{X_i(\theta), \theta \in \Theta\}$ ∈ SICX [SICX(sp)], i = 1, 2, . . . , m, then $\{(X_1(\theta), X_2(\theta), \ldots, X_m(\theta)), \theta \in \Theta\}$ ∈ SI-DIR-CX [SI-DIR-CX(sp)] (here all the parameters are the same).

Example 8.E.5. Recall from Example 8.B.4 that if Y(n), n = 1, 2, . . . , are nonnegative independent and identically distributed random variables, then $\{\sum_{k=1}^{n} Y(k), n \in \mathbb{N}_{++}\}$ ∈ SIL(sp). Now, let $\{Y_i(n), n = 1, 2, \ldots\}$, i = 1, 2, . . . , m, be independent sequences of nonnegative independent and identically distributed random variables. Then

$$\left\{\left(\sum_{k=1}^{n_1} Y_1(k), \sum_{k=1}^{n_2} Y_2(k), \ldots, \sum_{k=1}^{n_m} Y_m(k)\right), (n_1, n_2, \ldots, n_m) \in \mathbb{N}_{++}^m\right\} \in \text{SI-DIR-CX(sp)}.$$

Similar examples can be constructed from the other examples in Section 8.B.

The following result illustrates the use of Theorems 8.E.1 and 8.E.2. For each θ ∈ Θ (where Θ is a convex subset of R or N) let {X(n, θ), n ∈ N+} be a Markov chain with state space S (S = [0, ∞) or N+). Let $Y(x, \theta) =_{st} [X(n + 1, \theta) \mid X(n, \theta) = x]$, x ∈ S.

Theorem 8.E.6. Suppose that {Y(x, θ), (x, θ) ∈ S × Θ} ∈ SI-DIR-CX [SI-DIR-CV, SI-DIR-CX(sp), SI-DIR-CV(sp)]. If {X(0, θ), θ ∈ Θ} ∈ SICX [SICV, SICX(sp), SICV(sp)], then {X(n, θ), θ ∈ Θ} ∈ SICX [SICV, SICX(sp), SICV(sp)] for each n ∈ N+.

Proof. As an induction hypothesis assume that for some n we have

$$\{X(n, \theta), \theta \in \Theta\} \in \text{SICX [SICV, SICX(sp), SICV(sp)]}. \tag{8.E.1}$$

Note that

$$X(n + 1, \theta) =_{st} Y(X(n, \theta), \theta). \tag{8.E.2}$$

Now, from (8.E.1), (8.E.2), and from a straightforward extension of Theorem 8.E.2(a) (for the (sp) cases) [or of Theorem 8.E.1(a) (for the other cases)], one obtains that {X(n + 1, θ), θ ∈ Θ} ∈ SICX [SICV, SICX(sp), SICV(sp)].

8.F Complements

Section 8.A: The notion of (regular) stochastic convexity/concavity is introduced in Shaked and Shanthikumar [508]. However, the condition {X(θ), θ ∈ Θ} ∈ SCX was encountered earlier by Schweder [499] who described it by saying that {X(θ), θ ∈ Θ} is “convexly parametrized.” The basic closure properties (Theorems 8.A.15 and 8.A.17) are established in Shaked and Shanthikumar [508]. As an example of the use of these results, we note that Theorem 1(b) of Lefèvre and Malice [336] can be obtained from a combination of Example 8.A.3 with Theorems 8.A.15 and 8.A.17. A slightly weaker version of the example regarding the forward recurrence times (Example 8.A.18) can be found in Makowski and Philips [380]. Temporal convexity of Markov processes (Theorems 8.A.19 and 8.A.20) is studied in Shaked and Shanthikumar [507, 509], Shanthikumar and Yao [534], and Li and Shaked [349]. Extensions of these notions to random vectors can be found in Chang, Chao, Pinedo, and Shanthikumar [125], and to arbitrary random variables can be found in Meester [385] and in Meester and Shanthikumar [388]. A study of regular stochastic convexity by means of operators is developed in Adell and Perez-Palomares [5]. The results about stochastic m-convexity (Section 8.A.3) are mostly taken from Denuit, Lefèvre and Utev [155]. The stochastic m-increasing convexity of a family with a location parameter (Example 8.A.26) can be found in Denuit and Lefèvre [147]. “Derivatives” of stochastically convex and m-convex processes are introduced and studied in Adell and Lekuona [4].

Section 8.B: The notion of (sample path) stochastic convexity/concavity is introduced in Shaked and Shanthikumar [508]. A generalization of the notion of the semigroup property can be found in Shaked, Shanthikumar, and Tong [519]; Example 8.B.7 is a special case of a result due to them. The closure properties (Theorems 8.B.10 and 8.B.13) are established in Shaked and Shanthikumar [508]. The relation (8.B.2) between the spreads can be found in Goldstein and Rinott [212]. Temporal sample path convexity of Markov processes (Theorems 8.B.16 and 8.B.17) is studied in Shaked and Shanthikumar [508, 509]. Extensions of these notions to random vectors can be found in Chang, Chao, Pinedo, and Shanthikumar [125], and to arbitrary random variables can be found in Meester [385] and in Meester and Shanthikumar [388]. Theorem 8.B.18 is essentially proved in Kirmani and Gupta [299].

Section 8.C: Stochastic convexity/concavity in the usual stochastic ordering is introduced in Shaked and Shanthikumar [510]. Theorem 8.C.6 is established in Shaked and Shanthikumar [514], and Theorem 8.C.7 is established in Shaked and Shanthikumar [510]. Some variations of the stochastic convexity notions in Section 8.C, and also of the notions in Section 8.A, can be found in Atakan [24].

Section 8.D: The notion of strong stochastic convexity (in a different form) is introduced in Shanthikumar and Yao [531, 533]. The definition presented here is given in Meester and Shanthikumar [386].

Section 8.E: The notion of multivariate stochastic directional convexity is introduced in Meester and Shanthikumar [387]. Most of the results of this section are taken from that paper. The results yielding the parametric stochastic convexity and concavity of Markov processes (Theorem 8.E.6) can be found in Shaked and Shanthikumar [509]. Yao [573] introduced notions of stochastic supermodularity and submodularity that are weaker than the notions of stochastic directional convexity and concavity, respectively.

9 Positive Dependence Orders

Notions of positive dependence of two random variables X1 and X2 have been introduced in the literature in an effort to mathematically describe the property that “large (respectively, small) values of X1 tend to go together with large (respectively, small) values of X2.” Many of the notions of positive dependence are defined by means of some comparison of the joint distribution of X1 and X2 with their distribution under the theoretical assumption that X1 and X2 are independent. Often such a comparison can be extended to general pairs of bivariate distributions with given marginals. This fact led researchers to introduce various notions of positive dependence orders. These orders are designed to compare the strength of the positive dependence of the two underlying bivariate distributions. In this chapter we describe some such notions.

In many sections of this chapter we first describe a positive dependence order which compares two bivariate random vectors (or distributions). When the order can be extended to general n-dimensional (n > 2) random vectors, we will describe the extension in a later part of that section.

Most of the orders that we describe in this chapter are defined on the Fréchet class M(F1, F2) of bivariate distributions with fixed marginals F1 and F2. The upper bound of this class is the distribution defined by min{F1(x1), F2(x2)} (whose probability mass is concentrated on the set {(x1, x2) : F1(x1) = F2(x2)}). The lower bound of this class is the distribution defined by max{F1(x1) + F2(x2) − 1, 0} (whose probability mass is concentrated on the set {(x1, x2) : F1(x1) + F2(x2) = 1}).

9.A The PQD and the Supermodular Orders 9.A.1 Deﬁnition and basic properties: The bivariate case Let the random vector (X1 , X2 ) have the distribution function F , and let F1 and F2 denote, respectively, the marginal distributions of X1 and X2 .


Lehmann [343] defined (X1, X2) (or F) to be positive quadrant dependent (PQD) if

$$F(x_1, x_2) \ge F_1(x_1)F_2(x_2) \quad \text{for all } x_1 \text{ and } x_2. \tag{9.A.1}$$

Note that (9.A.1) can be rewritten as

$$F(x_1, x_2) \ge F^I(x_1, x_2) \quad \text{for all } x_1 \text{ and } x_2, \tag{9.A.2}$$

where $F^I(x_1, x_2) \equiv F_1(x_1)F_2(x_2)$ for all x1 and x2. This characterization of the PQD notion leads naturally to the definition of the PQD order that is described next.

For a random vector (X1, X2) with distribution function F, let $\bar{F}$ be the bivariate survival function of (X1, X2), that is, $\bar{F}(x_1, x_2) \equiv P\{X_1 > x_1, X_2 > x_2\}$ for all x1 and x2. Let (Y1, Y2) be another bivariate random vector with distribution function G and survival function $\bar{G}$. Suppose that F and G have the same univariate marginals; that is, suppose that both belong to M(F1, F2) for some univariate distribution functions F1 and F2. If

$$F(x_1, x_2) \le G(x_1, x_2) \quad \text{for all } x_1 \text{ and } x_2, \tag{9.A.3}$$

then we say that (X1, X2) is smaller than (Y1, Y2) in the PQD order (denoted by (X1, X2) ≤PQD (Y1, Y2)). Sometimes it will be useful to write this as F ≤PQD G. Using the assumption that F and G have the same univariate marginals, it is easy to see that (9.A.3) is equivalent to

$$\bar{F}(x_1, x_2) \le \bar{G}(x_1, x_2) \quad \text{for all } x_1 \text{ and } x_2.$$

Note that for random vectors (X1, X2) and (Y1, Y2), with distribution functions in M(F1, F2), we have (X1, X2) ≤PQD (Y1, Y2) ⇐⇒ (X1, X2) ≤uo (Y1, Y2) and (X1, X2) ≤PQD (Y1, Y2) ⇐⇒ (X1, X2) ≥lo (Y1, Y2); see (6.G.1) and (6.G.2) in Section 6.G.1. The reader should notice, however, that in (6.G.1) and (6.G.2) it is not required that (X1, X2) and (Y1, Y2) have the same marginals. Therefore, whereas the upper and lower orthant orders measure the size (or the location) of the underlying random vectors, the PQD order measures the amount of positive dependence of the underlying random vectors.

From (9.A.2) it is seen that F is PQD if, and only if, $F^I \le_{PQD} F$.

By Hoeffding's Lemma (see Lehmann [343, page 1139]) we see that if (X1, X2) and (Y1, Y2) have distributions F and G in M(F1, F2), then

$$\mathrm{Cov}(X_1, X_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [F(x_1, x_2) - F_1(x_1)F_2(x_2)]\,dx_1\,dx_2$$

and

$$\mathrm{Cov}(Y_1, Y_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [G(x_1, x_2) - F_1(x_1)F_2(x_2)]\,dx_1\,dx_2,$$

provided the covariances are well defined. It thus follows from (9.A.3) that if (X1, X2) ≤PQD (Y1, Y2), then

$$\mathrm{Cov}(X_1, X_2) \le \mathrm{Cov}(Y_1, Y_2), \tag{9.A.4}$$

and therefore, since Var(Xi) = Var(Yi), i = 1, 2, we have that $\rho_{X_1,X_2} \le \rho_{Y_1,Y_2}$, where $\rho_{X_1,X_2}$ and $\rho_{Y_1,Y_2}$ denote the correlation coefficients associated with (X1, X2) and (Y1, Y2), respectively, provided the underlying variances are well defined. Yanagimoto and Okamoto [570] have shown that some other correlation measures, such as Kendall's τ, Spearman's ρ, and Blomquist's q, are preserved under the PQD order. The inequality (9.A.4), and the monotonicity of other correlation measures under the PQD order, can also be obtained as corollaries from (9.A.17) below.

Let (X1, X2) and (Y1, Y2) be random vectors with distribution functions F and G. If (X1, X2) ≤PQD (Y1, Y2), then

$$\bar{F}(x_1, x_2) \le \bar{G}(x_1, x_2) \quad \text{for all } x_1 \text{ and } x_2,$$

and

$$P\{X_1 > x_1, X_2 \le x_2\} \ge P\{Y_1 > x_1, Y_2 \le x_2\} \quad \text{for all } x_1 \text{ and } x_2.$$

Therefore

$$P\{X_2 > x_2 \mid X_1 > x_1\} \le P\{Y_2 > x_2 \mid Y_1 > x_1\} \quad \text{for all } x_1 \text{ and } x_2,$$

and

$$P\{X_2 \le x_2 \mid X_1 > x_1\} \ge P\{Y_2 \le x_2 \mid Y_1 > x_1\} \quad \text{for all } x_1 \text{ and } x_2.$$

Thus, for all x1 we have

$$\begin{aligned}
E[X_2 \mid X_1 > x_1] &= -\int_{-\infty}^{0} P\{X_2 \le x_2 \mid X_1 > x_1\}\,dx_2 + \int_{0}^{\infty} P\{X_2 > x_2 \mid X_1 > x_1\}\,dx_2 \\
&\le -\int_{-\infty}^{0} P\{Y_2 \le x_2 \mid Y_1 > x_1\}\,dx_2 + \int_{0}^{\infty} P\{Y_2 > x_2 \mid Y_1 > x_1\}\,dx_2 \\
&= E[Y_2 \mid Y_1 > x_1].
\end{aligned}$$

For random vectors (X1, X2) and (Y1, Y2) with distribution functions in M(F1, F2), the condition

$$E[X_2 \mid X_1 > x_1] \le E[Y_2 \mid Y_1 > x_1] \quad \text{for all } x_1 \tag{9.A.5}$$

can be used to define a positive dependence stochastic order. Such an order is discussed in Muliere and Petrone [405]. We see that if (X1, X2) ≤PQD (Y1, Y2), then (9.A.5) holds.

Let FL and FU denote the Fréchet lower and upper bounds in the class M(F1, F2). Then, for every distribution F ∈ M(F1, F2) we have

$$F_L \le_{PQD} F \le_{PQD} F_U. \tag{9.A.6}$$
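The bounds in (9.A.6) can be checked pointwise on a grid. The sketch below is an illustration that assumes uniform[0, 1] marginals, so that the arguments u and v play the roles of F1(x1) and F2(x2); it compares the Fréchet lower bound, the independence distribution, and the Fréchet upper bound through the defining inequality (9.A.3). The function names are hypothetical.

```python
def frechet_lower(u, v):
    """Frechet lower bound max{u + v - 1, 0} for uniform[0, 1] marginals."""
    return max(u + v - 1.0, 0.0)


def frechet_upper(u, v):
    """Frechet upper bound min{u, v} for uniform[0, 1] marginals."""
    return min(u, v)


def independence(u, v):
    """Distribution function of two independent uniform[0, 1] variables."""
    return u * v


def pqd_leq(f, g, grid, tol=1e-12):
    """Check f(u, v) <= g(u, v) at every grid point, as in (9.A.3)."""
    return all(f(u, v) <= g(u, v) + tol for u in grid for v in grid)


GRID = [i / 10 for i in range(11)]
```

With this grid, `pqd_leq(frechet_lower, independence, GRID)` and `pqd_leq(independence, frechet_upper, GRID)` both hold, which is the special case of (9.A.6) in which F is the independence distribution. A grid check is of course only a necessary condition for the order.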

9.A.2 Closure properties

A powerful closure property of the PQD order is given in the next theorem.

Theorem 9.A.1. Suppose that the four random vectors (X1, X2), (Y1, Y2), (U1, U2), and (V1, V2) satisfy

$$(X_1, X_2) \le_{PQD} (Y_1, Y_2) \quad \text{and} \quad (U_1, U_2) \le_{PQD} (V_1, V_2), \tag{9.A.7}$$

and suppose that (X1, X2) and (U1, U2) are independent, and also that (Y1, Y2) and (V1, V2) are independent. Then

$$(\phi(X_1, U_1), \psi(X_2, U_2)) \le_{PQD} (\phi(Y_1, V_1), \psi(Y_2, V_2)), \tag{9.A.8}$$

for all increasing functions φ and ψ.

Proof. From the monotonicity of φ and ψ it follows that the set {(u1, u2) : φ(x1, u1) ≤ a1, ψ(x2, u2) ≤ a2} is a lower quadrant for all x1, x2, a1, and a2. Therefore, for all a1 and a2 we have

$$\begin{aligned}
P\{\phi(X_1, U_1) \le a_1, \psi(X_2, U_2) \le a_2\} &= \int P\{\phi(X_1, u_1) \le a_1, \psi(X_2, u_2) \le a_2\}\,dH(u_1, u_2) \\
&\le \int P\{\phi(Y_1, u_1) \le a_1, \psi(Y_2, u_2) \le a_2\}\,dH(u_1, u_2) \\
&= P\{\phi(Y_1, U_1) \le a_1, \psi(Y_2, U_2) \le a_2\},
\end{aligned}$$

where H is the distribution function of (U1, U2). Thus,

$$(\phi(X_1, U_1), \psi(X_2, U_2)) \le_{PQD} (\phi(Y_1, U_1), \psi(Y_2, U_2)), \tag{9.A.9}$$

for all increasing functions φ and ψ. In a similar manner one can show that

$$(\phi(Y_1, U_1), \psi(Y_2, U_2)) \le_{PQD} (\phi(Y_1, V_1), \psi(Y_2, V_2)), \tag{9.A.10}$$

for all increasing functions φ and ψ. From (9.A.9) and (9.A.10) one obtains (9.A.8).


In particular, if (9.A.7) holds, then

$$(X_1 + U_1, X_2 + U_2) \le_{PQD} (Y_1 + V_1, Y_2 + V_2), \tag{9.A.11}$$

that is, the PQD order is closed under convolutions. From Theorem 9.A.1 it also follows that

$$(X_1, X_2) \le_{PQD} (Y_1, Y_2) \Longrightarrow (\phi(X_1), \psi(X_2)) \le_{PQD} (\phi(Y_1), \psi(Y_2)),$$

for all increasing functions φ and ψ.

The closure properties that are stated in the next theorem are easy to verify.

Theorem 9.A.2. (a) Let $\{(X_1^{(j)}, X_2^{(j)}), j = 1, 2, \ldots\}$ and $\{(Y_1^{(j)}, Y_2^{(j)}), j = 1, 2, \ldots\}$ be two sequences of random vectors such that $(X_1^{(j)}, X_2^{(j)}) \to_{st} (X_1, X_2)$ and $(Y_1^{(j)}, Y_2^{(j)}) \to_{st} (Y_1, Y_2)$ as j → ∞, where $\to_{st}$ denotes convergence in distribution. If $(X_1^{(j)}, X_2^{(j)}) \le_{PQD} (Y_1^{(j)}, Y_2^{(j)})$, j = 1, 2, . . ., then (X1, X2) ≤PQD (Y1, Y2).
(b) Let (X1, X2), (Y1, Y2), and Θ be random vectors such that $[(X_1, X_2) \mid \Theta = \theta] \le_{PQD} [(Y_1, Y_2) \mid \Theta = \theta]$ for all θ in the support of Θ. Then (X1, X2) ≤PQD (Y1, Y2). That is, the PQD order is closed under mixtures.

Fang, Hu, and Joe [191] applied the idea of the PQD order to stationary Markov chains and showed that, if the process is stochastically increasing, then dependence (in the sense of the PQD order) is decreasing with the lag, namely, if {X1, X2, . . . } is a Markov chain and Xi is distributed according to F and if (X1, Xn) is distributed according to $F_{1n}$, n = 2, 3, . . ., then

$$F_{12} \ge_{PQD} F_{13} \ge_{PQD} \cdots \ge_{PQD} F_{1n} \ge_{PQD} \cdots \ge_{PQD} F^{(2)}, \tag{9.A.12}$$

where $F^{(2)}(x, y) = F(x)F(y)$. See also Remark 9.A.29 below.

Another example is the following.

Example 9.A.3. Let φ and ψ be two Laplace transforms of positive random variables. Then F and G, defined by

$$F(x_1, x_2) = \phi(\phi^{-1}(x_1) + \phi^{-1}(x_2)), \quad (x_1, x_2) \in [0, 1]^2,$$

and

$$G(y_1, y_2) = \psi(\psi^{-1}(y_1) + \psi^{-1}(y_2)), \quad (y_1, y_2) \in [0, 1]^2,$$

are bivariate distribution functions with uniform[0, 1] marginals (such F and G are called Archimedean copulas). Let (X1, X2) and (Y1, Y2) be distributed according to F and G, respectively. Then (X1, X2) ≤PQD (Y1, Y2) if, and only if, $\psi^{-1}\phi$ is superadditive (that is, $\psi^{-1}\phi(x + y) \ge \psi^{-1}\phi(x) + \psi^{-1}\phi(y)$ for all x, y ≥ 0). Also, if $\phi^{-1}\psi$ has a completely monotone derivative, then (X1, X2) ≤PQD (Y1, Y2).
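A concrete instance of this example can be explored numerically. The sketch below is an illustration, not part of the original text: it assumes the Laplace transform φ(s) = (1 + s)^(−1/α) of a gamma random variable, which generates the Clayton family of Archimedean copulas, and compares the parameters α = 1 and α = 3. In that case ψ⁻¹φ(s) = (1 + s)³ − 1. The code checks the superadditivity condition and the pointwise inequality F ≤ G on a grid; the names and grids are hypothetical choices.

```python
def clayton_laplace(alpha):
    """Return (phi, phi_inv) for the Laplace transform phi(s) = (1 + s)**(-1/alpha)."""
    def phi(s):
        return (1.0 + s) ** (-1.0 / alpha)

    def phi_inv(t):
        return t ** (-alpha) - 1.0

    return phi, phi_inv


def archimedean(phi, phi_inv):
    """Bivariate distribution function phi(phi_inv(u) + phi_inv(v)) on (0, 1]^2."""
    return lambda u, v: phi(phi_inv(u) + phi_inv(v))


def is_superadditive(f, points, tol=1e-9):
    """Check f(x + y) >= f(x) + f(y) for all x, y among the given points."""
    return all(f(x + y) >= f(x) + f(y) - tol for x in points for y in points)


def pointwise_leq(f, g, grid, tol=1e-12):
    """Check f(u, v) <= g(u, v) at every grid point."""
    return all(f(u, v) <= g(u, v) + tol for u in grid for v in grid)


phi, phi_inv = clayton_laplace(1.0)   # generates F
psi, psi_inv = clayton_laplace(3.0)   # generates G
F = archimedean(phi, phi_inv)
G = archimedean(psi, psi_inv)
GRID = [i / 20 for i in range(1, 21)]          # avoids u = 0, where phi_inv blows up
POINTS = [0.25 * i for i in range(21)]         # x, y in [0, 5]
```

Here `is_superadditive(lambda s: psi_inv(phi(s)), POINTS)` and `pointwise_leq(F, G, GRID)` both hold, in agreement with the equivalence stated in the example; reversing the roles of F and G makes the pointwise check fail.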


9.A.3 The multivariate case

Let X = (X1, X2, . . . , Xn) be a random vector with distribution function F and survival function $\bar{F}$. Let Y = (Y1, Y2, . . . , Yn) be another random vector with distribution function G and survival function $\bar{G}$. If

$$F(x) \le G(x) \quad \text{for all } x, \tag{9.A.13}$$

and

$$\bar{F}(x) \le \bar{G}(x) \quad \text{for all } x, \tag{9.A.14}$$

then we say that X is smaller than Y in the PQD order (denoted by X ≤PQD Y). From (9.A.13) and (9.A.14) it follows that only random vectors with the same univariate marginals can be compared in the PQD order. From (9.A.13) and (9.A.14) it follows that

$$X \le_{PQD} Y \iff X \le_{uo} Y \text{ and } X \ge_{lo} Y. \tag{9.A.15}$$

An extension of Theorem 9.A.1 to the general multivariate case is the following. The proof of Theorem 9.A.4 is a straightforward extension of the proof of Theorem 9.A.1, and therefore it is omitted.

Theorem 9.A.4. Suppose that the four random vectors X = (X1, X2, . . . , Xn), Y = (Y1, Y2, . . . , Yn), U = (U1, U2, . . . , Un), and V = (V1, V2, . . . , Vn) satisfy

$$X \le_{PQD} Y \quad \text{and} \quad U \le_{PQD} V, \tag{9.A.16}$$

and suppose that X and U are independent, and also that Y and V are independent. Then

$$(\phi_1(X_1, U_1), \phi_2(X_2, U_2), \ldots, \phi_n(X_n, U_n)) \le_{PQD} (\phi_1(Y_1, V_1), \phi_2(Y_2, V_2), \ldots, \phi_n(Y_n, V_n)),$$

for all increasing functions $\phi_i$, i = 1, 2, . . . , n.

In particular, if (9.A.16) holds, then X + U ≤PQD Y + V, that is, the PQD order is closed under convolutions. Also, from Theorem 9.A.4 it follows that

$$(X_1, X_2, \ldots, X_n) \le_{PQD} (Y_1, Y_2, \ldots, Y_n) \Longrightarrow (\phi_1(X_1), \phi_2(X_2), \ldots, \phi_n(X_n)) \le_{PQD} (\phi_1(Y_1), \phi_2(Y_2), \ldots, \phi_n(Y_n)),$$

for all increasing functions $\phi_i$, i = 1, 2, . . . , n.

The closure properties that are stated in the next theorem are easy to verify.


Theorem 9.A.5.
(a) Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$ be a set of independent random vectors where the dimension of $\mathbf{X}_i$ is $k_i$, i = 1, 2, ..., m. Let $\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m$ be another set of independent random vectors where the dimension of $\mathbf{Y}_i$ is $k_i$, i = 1, 2, ..., m. If $\mathbf{X}_i \le_{\mathrm{PQD}} \mathbf{Y}_i$ for i = 1, 2, ..., m, then $(\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m) \le_{\mathrm{PQD}} (\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m)$. That is, the PQD order is closed under conjunctions.
(b) Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two n-dimensional random vectors. If X ≤PQD Y, then $X_I \le_{\mathrm{PQD}} Y_I$ for each I ⊆ {1, 2, ..., n}. That is, the PQD order is closed under marginalization.
(c) Let $\{\mathbf{X}_j, j = 1, 2, \dots\}$ and $\{\mathbf{Y}_j, j = 1, 2, \dots\}$ be two sequences of random vectors such that $\mathbf{X}_j \to_{\mathrm{st}} X$ and $\mathbf{Y}_j \to_{\mathrm{st}} Y$ as j → ∞, where →st denotes convergence in distribution. If $\mathbf{X}_j \le_{\mathrm{PQD}} \mathbf{Y}_j$, j = 1, 2, ..., then X ≤PQD Y.
(d) Let X, Y, and Θ be random vectors such that $[X \mid \Theta = \theta] \le_{\mathrm{PQD}} [Y \mid \Theta = \theta]$ for all θ in the support of Θ. Then X ≤PQD Y. That is, the PQD order is closed under mixtures.

From Theorem 9.A.5(b) and (9.A.4) it follows that if (X1, X2, ..., Xn) ≤PQD (Y1, Y2, ..., Yn), then, for all $i_1 \ne i_2$, we have that

$$\mathrm{Cov}(X_{i_1}, X_{i_2}) \le \mathrm{Cov}(Y_{i_1}, Y_{i_2}).$$

Since the univariate marginals of X and Y are equal, it also follows that

$$\rho_{X_{i_1}, X_{i_2}} \le \rho_{Y_{i_1}, Y_{i_2}},$$

where $\rho_{X_{i_1}, X_{i_2}}$ and $\rho_{Y_{i_1}, Y_{i_2}}$ denote the correlation coefficients associated with $(X_{i_1}, X_{i_2})$ and $(Y_{i_1}, Y_{i_2})$, respectively, provided the underlying variances are well defined. Joe [260] has shown that some multivariate versions of the correlation measures Kendall's τ, Spearman's ρ, and Blomquist's q, are monotone with respect to the PQD order.

Another preservation property of the PQD order is described in the next theorem. In the following theorem we define $\sum_{j=1}^{0} x_j \equiv 0$ for any sequence {xj, j = 1, 2, ...}. Similar results are Theorems 6.G.7 and 9.A.15.

Theorem 9.A.6. Let $\mathbf{X}_j = (X_{j,1}, X_{j,2}, \dots, X_{j,m})$, j = 1, 2, ..., be a sequence of nonnegative random vectors, and let M = (M1, M2, ..., Mm) and N = (N1, N2, ..., Nm) be two vectors of nonnegative integer-valued random variables. Assume that both M and N are independent of the $\mathbf{X}_j$'s. If M ≤PQD N, then

$$\Big(\sum_{j=1}^{M_1} X_{j,1}, \sum_{j=1}^{M_2} X_{j,2}, \dots, \sum_{j=1}^{M_m} X_{j,m}\Big) \le_{\mathrm{PQD}} \Big(\sum_{j=1}^{N_1} X_{j,1}, \sum_{j=1}^{N_2} X_{j,2}, \dots, \sum_{j=1}^{N_m} X_{j,m}\Big).$$


Consider now, as in Section 6.B.4, n families of univariate distribution functions $\{G_\theta^{(i)}, \theta \in \mathcal{X}_i\}$ where $\mathcal{X}_i$ is a subset of the real line R, i = 1, 2, ..., n. Let $X_i(\theta)$ denote a random variable with distribution function $G_\theta^{(i)}$, i = 1, 2, ..., n. Below we give a result which provides comparisons of two random vectors, with distribution functions of the form (6.B.18), in the PQD order. The following result is obtained easily from Theorem 6.G.8; see Theorems 6.B.17, 7.A.37, and 9.A.15 for related results.

Theorem 9.A.7. Let $\{G_\theta^{(i)}, \theta \in \mathcal{X}_i\}$, i = 1, 2, ..., n, be n families of univariate distribution functions as above. Let $\boldsymbol{\Theta}_1$ and $\boldsymbol{\Theta}_2$ be two random vectors with supports in $\mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_n$ and distribution functions $F_1$ and $F_2$, respectively. Let $\mathbf{Y}_1$ and $\mathbf{Y}_2$ be two random vectors with distribution functions $H_1$ and $H_2$ given by

$$H_j(y_1, y_2, \dots, y_n) = \int_{\mathcal{X}_1} \int_{\mathcal{X}_2} \cdots \int_{\mathcal{X}_n} \prod_{i=1}^{n} G_{\theta_i}^{(i)}(y_i)\, dF_j(\theta_1, \theta_2, \dots, \theta_n), \quad (y_1, y_2, \dots, y_n) \in \mathbb{R}^n,\ j = 1, 2.$$

If

$$X_i(\theta) \le_{\mathrm{st}} X_i(\theta') \ \text{ whenever } \theta \le \theta', \quad i = 1, 2, \dots, n,$$

and if $\boldsymbol{\Theta}_1 \le_{\mathrm{PQD}} \boldsymbol{\Theta}_2$, then $\mathbf{Y}_1 \le_{\mathrm{PQD}} \mathbf{Y}_2$.

Example 9.A.8. Let X be an n-dimensional random vector with a density function f of the form

$$f(x) = |\Sigma|^{-1/2}\, g(x \Sigma^{-1} x^{\top}),$$

where Σ = (σij) is a positive definite n × n matrix, and g satisfies $\int_0^\infty r^{n-1} g(r^2)\, dr < \infty$. Such density functions are called elliptically contoured. Let Y be an n-dimensional random vector with a density function h of the form

$$h(x) = |\Lambda|^{-1/2}\, g(x \Lambda^{-1} x^{\top}),$$

where Λ = (λij) is a positive definite n × n matrix. If σii = λii, i = 1, 2, ..., n, and σij ≤ λij, 1 ≤ i < j ≤ n, then X ≤PQD Y. In particular, multivariate normal random vectors with mean 0 and the same variances are ordered in the PQD order if their covariances are pointwise ordered.
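The final remark of Example 9.A.8 can be spot-checked in the bivariate standard normal case, where the classical orthant formula P{X1 ≤ 0, X2 ≤ 0} = 1/4 + arcsin(ρ)/(2π) is available in closed form. The sketch below only evaluates (9.A.13) at the single point x = (0, 0), so it illustrates the monotonicity in the correlation ρ rather than proving the PQD comparison; the function name and the list of correlations are choices made here for illustration.

```python
import math

def lower_orthant_prob_at_origin(rho):
    """P{X1 <= 0, X2 <= 0} for a standard bivariate normal with correlation rho."""
    return 0.25 + math.asin(rho) / (2.0 * math.pi)

# Same unit variances, increasing covariances (hypothetical values).
rhos = [-0.9, -0.5, 0.0, 0.3, 0.8]
probs = [lower_orthant_prob_at_origin(r) for r in rhos]

for r, p in zip(rhos, probs):
    print(f"rho = {r:+.1f}  ->  F(0, 0) = {p:.4f}")
```

The printed probabilities increase with ρ, in line with the distribution function being pointwise larger for the vector with the larger covariance.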


9.A.4 The supermodular order

The supermodular order, which is described in this subsection, is a sufficient condition that implies the PQD order, but it is also of independent interest. Recall from Section 7.A.8 that a function $\phi : \mathbb{R}^n \to \mathbb{R}$ is said to be supermodular if for any $x, y \in \mathbb{R}^n$ it satisfies

$$\phi(x) + \phi(y) \le \phi(x \wedge y) + \phi(x \vee y),$$

where the operators ∧ and ∨ denote coordinatewise minimum and maximum, respectively. Note that if $\phi : \mathbb{R}^n \to \mathbb{R}$ is supermodular, then the function ψ, defined by ψ(x1, x2, ..., xn) = φ(g1(x1), g2(x2), ..., gn(xn)), is also supermodular, whenever gi : R → R, i = 1, 2, ..., n, are all increasing or are all decreasing.

Let X and Y be two n-dimensional random vectors such that

$$E[\phi(X)] \le E[\phi(Y)] \quad \text{for all supermodular functions } \phi : \mathbb{R}^n \to \mathbb{R},$$

provided the expectations exist. Then X is said to be smaller than Y in the supermodular order (denoted by X ≤sm Y). Since the functions $\phi_x = I_{\{y : y > x\}}$ and $\psi_x = I_{\{y : y \le x\}}$ are supermodular for each fixed x, it is immediate that

$$X \le_{\mathrm{sm}} Y \implies X \le_{\mathrm{PQD}} Y. \tag{9.A.17}$$
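The supermodularity of the two indicator functions can be confirmed by brute force on a finite grid. The following is a small sketch under that restriction (a grid check is evidence, not a proof); the helper names `is_supermodular_on_grid`, `upper_orthant_indicator`, and `lower_orthant_indicator` are introduced here and do not come from the text.

```python
from itertools import product

def is_supermodular_on_grid(phi, axes, tol=1e-12):
    """Check phi(x) + phi(y) <= phi(x ^ y) + phi(x v y) for all grid points x, y."""
    points = list(product(*axes))
    for x in points:
        for y in points:
            meet = tuple(min(a, b) for a, b in zip(x, y))  # coordinatewise minimum
            join = tuple(max(a, b) for a, b in zip(x, y))  # coordinatewise maximum
            if phi(x) + phi(y) > phi(meet) + phi(join) + tol:
                return False
    return True

def upper_orthant_indicator(x0):
    """Indicator of the set {y : y > x0}."""
    return lambda y: 1.0 if all(b > a for a, b in zip(x0, y)) else 0.0

def lower_orthant_indicator(x0):
    """Indicator of the set {y : y <= x0}."""
    return lambda y: 1.0 if all(b <= a for a, b in zip(x0, y)) else 0.0

axes3 = [range(-2, 3)] * 3
axes2 = [range(-2, 3)] * 2

print(is_supermodular_on_grid(upper_orthant_indicator((0, 0, 0)), axes3))
print(is_supermodular_on_grid(lower_orthant_indicator((0, 0, 0)), axes3))
print(is_supermodular_on_grid(lambda x: x[0] * x[1], axes2))
print(is_supermodular_on_grid(lambda x: -x[0] * x[1], axes2))
```

The product x1·x2 is included as a familiar supermodular function, and its negative as a function that fails the inequality (take x = (0, 1) and y = (1, 0)).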

These implications also follow from Theorem 6.G.2 and (9.A.15) since every n-dimensional (n ≥ 2) distribution function, and any n-dimensional survival function, are supermodular functions. In fact, when n = 2 we have that

$$(X_1, X_2) \le_{\mathrm{sm}} (Y_1, Y_2) \iff (X_1, X_2) \le_{\mathrm{PQD}} (Y_1, Y_2); \tag{9.A.18}$$

see, for example, Tchen [547]. From (9.A.17) it is seen that if X ≤sm Y, then X and Y must have the same univariate marginals. Some closure properties of the supermodular order are described in the next theorem.

Theorem 9.A.9.
(a) Let (X1, X2, ..., Xn) and (Y1, Y2, ..., Yn) be two n-dimensional random vectors. If (X1, X2, ..., Xn) ≤sm (Y1, Y2, ..., Yn), then

$$(g_1(X_1), g_2(X_2), \dots, g_n(X_n)) \le_{\mathrm{sm}} (g_1(Y_1), g_2(Y_2), \dots, g_n(Y_n))$$

whenever gi : R → R, i = 1, 2, ..., n, are all increasing or are all decreasing.
(b) Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m$ be a set of independent random vectors where the dimension of $\mathbf{X}_i$ is $k_i$, i = 1, 2, ..., m. Let $\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m$ be another set of independent random vectors where the dimension of $\mathbf{Y}_i$ is $k_i$, i = 1, 2, ..., m. If $\mathbf{X}_i \le_{\mathrm{sm}} \mathbf{Y}_i$ for i = 1, 2, ..., m, then $(\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_m) \le_{\mathrm{sm}} (\mathbf{Y}_1, \mathbf{Y}_2, \dots, \mathbf{Y}_m)$. That is, the supermodular order is closed under conjunctions.
(c) Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two n-dimensional random vectors. If X ≤sm Y, then $X_I \le_{\mathrm{sm}} Y_I$ for each I ⊆ {1, 2, ..., n}. That is, the supermodular order is closed under marginalization.
(d) Let X, Y, and Θ be random vectors such that $[X \mid \Theta = \theta] \le_{\mathrm{sm}} [Y \mid \Theta = \theta]$ for all θ in the support of Θ. Then X ≤sm Y. That is, the supermodular order is closed under mixtures.
(e) Let $\{\mathbf{X}_j, j = 1, 2, \dots\}$ and $\{\mathbf{Y}_j, j = 1, 2, \dots\}$ be two sequences of random vectors such that $\mathbf{X}_j \to_{\mathrm{st}} X$ and $\mathbf{Y}_j \to_{\mathrm{st}} Y$ as j → ∞, where →st denotes convergence in distribution. If $\mathbf{X}_j \le_{\mathrm{sm}} \mathbf{Y}_j$, j = 1, 2, ..., then X ≤sm Y.

Proof. Part (a) follows from the fact that a composition of a supermodular function with coordinatewise functions, that are all increasing or are all decreasing, is a supermodular function. In order to see part (b) let $\mathbf{X}_1$ and $\mathbf{X}_2$ be two independent random vectors, and let $\mathbf{Y}_1$ and $\mathbf{Y}_2$ be two other independent random vectors. Suppose that $\mathbf{X}_1 \le_{\mathrm{sm}} \mathbf{Y}_1$ and that $\mathbf{X}_2 \le_{\mathrm{sm}} \mathbf{Y}_2$. Then, for any supermodular function φ (of the proper dimension) we have that

$$E\phi(\mathbf{X}_1, \mathbf{X}_2) = E\big\{E[\phi(\mathbf{X}_1, \mathbf{X}_2) \mid \mathbf{X}_2]\big\} \le E\big\{E[\phi(\mathbf{Y}_1, \mathbf{X}_2) \mid \mathbf{X}_2]\big\} = E\phi(\mathbf{Y}_1, \mathbf{X}_2) \le E\phi(\mathbf{Y}_1, \mathbf{Y}_2),$$

where the first inequality follows from the fact that $\phi(x_1, x_2)$ is supermodular in $x_1$ when $x_2$ is fixed, and the second inequality follows in a similar manner. Part (b) of Theorem 9.A.9 follows from the above by induction. Parts (c) and (d) are easy to prove. A proof of part (e) can be found in Müller and Scarsini [416].

From parts (a) and (d) of Theorem 9.A.9 we obtain the following corollary. Corollary 9.A.10. Let X = (X1 , X2 , . . . , Xn ) and Y = (Y1 , Y2 , . . . , Yn ) be two random vectors such that X ≤sm Y , and let Z be an m-dimensional random vector which is independent of X and Y . Then (h1 (X1 , Z), h2 (X2 , Z), . . . , hn (Xn , Z)) ≤sm (h1 (Y1 , Z), h2 (Y2 , Z), . . . , hn (Yn , Z)), whenever hi (x, z), i = 1, 2, . . . , n, are all increasing or are all decreasing in x for every z. Example 9.A.11. Let X and Y be two n-dimensional random vectors such that X ≤sm Y , and let Z be an n-dimensional random vector which is independent of X and Y . Then from Corollary 9.A.10 it follows that X ∧ Z ≤sm Y ∧ Z, and that X + Z ≤sm Y + Z.
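The way parts (a) and (d) of Theorem 9.A.9 combine to give Corollary 9.A.10 can be spelled out as follows. This is only a sketch of the conditioning argument; the symbol z, standing for a generic value of Z, is notation introduced here.

```latex
% Sketch: Corollary 9.A.10 from Theorem 9.A.9(a) and (d).
\begin{align*}
&\text{Fix } z \text{ in the support of } Z.
  \text{ Since } Z \text{ is independent of } X \text{ and of } Y,\\
&\qquad [X \mid Z = z] =_{\mathrm{st}} X \le_{\mathrm{sm}} Y =_{\mathrm{st}} [Y \mid Z = z].\\
&\text{Theorem 9.A.9(a), applied with } g_i(\cdot) = h_i(\cdot, z),\ i = 1, \dots, n, \text{ gives}\\
&\qquad \big[(h_1(X_1, Z), \dots, h_n(X_n, Z)) \mid Z = z\big]
  \le_{\mathrm{sm}}
  \big[(h_1(Y_1, Z), \dots, h_n(Y_n, Z)) \mid Z = z\big].\\
&\text{Theorem 9.A.9(d), applied with } \Theta = Z, \text{ then removes the conditioning.}
\end{align*}
```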


By applying Corollary 9.A.10 twice (letting Z there be an n-dimensional random vector, and letting each hi depend only on its first argument and on the ith component of the second argument, i = 1, 2, ..., n), we get the following result.

Theorem 9.A.12. Let X, Y, Z, and W be n-dimensional random vectors such that X and Z are independent and Y and W are independent. Let $c_i : [0, \infty)^2 \to [0, \infty)$ be a continuous increasing function, i = 1, 2, ..., n. If X ≤sm Y and Z ≤sm W, then

$$(c_1(X_1, Z_1), c_2(X_2, Z_2), \dots, c_n(X_n, Z_n)) \le_{\mathrm{sm}} (c_1(Y_1, W_1), c_2(Y_2, W_2), \dots, c_n(Y_n, W_n)).$$

Example 9.A.13. Let {X k = (Xk,1, ..., Xk,n), k ≥ 0} and {Y k = (Yk,1, ..., Yk,n), k ≥ 0} be two Markov chains as described in Example 6.G.6. If the gi's are increasing in their m + 1 arguments, if U l = {U lk , k ≥ 0}, l = 1, ..., m, are independent, if V l = {V lk , k ≥ 0}, l = 1, ..., m, are independent, and if U lk ≤sm V lk , l = 1, ..., m, k ≥ 0, then, for each k ≥ 0 we have (X 0, ..., X k) ≤sm (Y 0, ..., Y k). The proof uses Theorem 9.A.12, Corollary 9.A.10, and Theorem 9.A.9(b). We omit the details.

Another preservation property of the supermodular order is described in the next theorem. In the following theorem we define $\sum_{j=1}^{0} x_j \equiv 0$ for any sequence {xj, j = 1, 2, ...}. Similar results are Theorems 6.G.7 and 9.A.6.

Theorem 9.A.14. Let $\mathbf{X}_j = (X_{j,1}, X_{j,2}, \dots, X_{j,m})$, j = 1, 2, ..., be a sequence of nonnegative random vectors, and let M = (M1, M2, ..., Mm) and N = (N1, N2, ..., Nm) be two vectors of nonnegative integer-valued random variables. Assume that both M and N are independent of the $\mathbf{X}_j$'s. If M ≤sm N, then

$$\Big(\sum_{j=1}^{M_1} X_{j,1}, \sum_{j=1}^{M_2} X_{j,2}, \dots, \sum_{j=1}^{M_m} X_{j,m}\Big) \le_{\mathrm{sm}} \Big(\sum_{j=1}^{N_1} X_{j,1}, \sum_{j=1}^{N_2} X_{j,2}, \dots, \sum_{j=1}^{N_m} X_{j,m}\Big).$$

Proof. Let φ be a supermodular function. Conditioning on the possible realizations of $(\mathbf{X}_1, \mathbf{X}_2, \dots)$ we can write

$$E\,\phi\Big(\sum_{j=1}^{M_1} X_{j,1}, \dots, \sum_{j=1}^{M_m} X_{j,m}\Big) = E\Big\{E\Big[\phi\Big(\sum_{j=1}^{M_1} X_{j,1}, \dots, \sum_{j=1}^{M_m} X_{j,m}\Big) \,\Big|\, (\mathbf{X}_1, \mathbf{X}_2, \dots)\Big]\Big\}.$$

Now, it is easy to see that for any realization $(\mathbf{x}_1, \mathbf{x}_2, \dots)$ of $(\mathbf{X}_1, \mathbf{X}_2, \dots)$, the function ψ, defined by

$$\psi(n_1, n_2, \dots, n_m) = \phi\Big(\sum_{j=1}^{n_1} x_{j,1}, \sum_{j=1}^{n_2} x_{j,2}, \dots, \sum_{j=1}^{n_m} x_{j,m}\Big),$$

is supermodular. Therefore, since M ≤sm N, we have that

$$E\Big[\phi\Big(\sum_{j=1}^{M_1} X_{j,1}, \dots, \sum_{j=1}^{M_m} X_{j,m}\Big) \,\Big|\, (\mathbf{X}_1, \mathbf{X}_2, \dots) = (\mathbf{x}_1, \mathbf{x}_2, \dots)\Big] \le E\Big[\phi\Big(\sum_{j=1}^{N_1} X_{j,1}, \dots, \sum_{j=1}^{N_m} X_{j,m}\Big) \,\Big|\, (\mathbf{X}_1, \mathbf{X}_2, \dots) = (\mathbf{x}_1, \mathbf{x}_2, \dots)\Big],$$

and thus

$$E\,\phi\Big(\sum_{j=1}^{M_1} X_{j,1}, \dots, \sum_{j=1}^{M_m} X_{j,m}\Big) \le E\Big\{E\Big[\phi\Big(\sum_{j=1}^{N_1} X_{j,1}, \dots, \sum_{j=1}^{N_m} X_{j,m}\Big) \,\Big|\, (\mathbf{X}_1, \mathbf{X}_2, \dots)\Big]\Big\} = E\,\phi\Big(\sum_{j=1}^{N_1} X_{j,1}, \dots, \sum_{j=1}^{N_m} X_{j,m}\Big).$$

Consider now, as in Section 6.B.4, n families of univariate distribution functions $\{G_\theta^{(i)}, \theta \in \mathcal{X}_i\}$ where $\mathcal{X}_i$ is a subset of the real line R, i = 1, 2, ..., n. Let $X_i(\theta)$ denote a random variable with distribution function $G_\theta^{(i)}$, i = 1, 2, ..., n. Below we give a result which provides comparisons of two random vectors, with distribution functions of the form (6.B.18), in the supermodular order. The following result is a generalization of Theorem 9.A.9(d); see Theorems 6.B.17, 6.G.8, 7.A.37, and 9.A.7 for related results.

Theorem 9.A.15. Let $\{G_\theta^{(i)}, \theta \in \mathcal{X}_i\}$, i = 1, 2, ..., n, be n families of univariate distribution functions as above. Let $\boldsymbol{\Theta}_1$ and $\boldsymbol{\Theta}_2$ be two random vectors with supports in $\mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_n$ and distribution functions $F_1$ and $F_2$, respectively. Let $\mathbf{Y}_1$ and $\mathbf{Y}_2$ be two random vectors with distribution functions $H_1$ and $H_2$ given by

$$H_j(y_1, y_2, \dots, y_n) = \int_{\mathcal{X}_1} \int_{\mathcal{X}_2} \cdots \int_{\mathcal{X}_n} \prod_{i=1}^{n} G_{\theta_i}^{(i)}(y_i)\, dF_j(\theta_1, \theta_2, \dots, \theta_n), \quad (y_1, y_2, \dots, y_n) \in \mathbb{R}^n,\ j = 1, 2.$$

If

$$X_i(\theta) \le_{\mathrm{st}} X_i(\theta') \ \text{ whenever } \theta \le \theta', \quad i = 1, 2, \dots, n,$$

and if $\boldsymbol{\Theta}_1 \le_{\mathrm{sm}} \boldsymbol{\Theta}_2$, then $\mathbf{Y}_1 \le_{\mathrm{sm}} \mathbf{Y}_2$.

Before stating the next result, it is worthwhile to mention that from Proposition 7.A.27 it follows that

$$X \le_{\mathrm{sm}} Y \implies X \le_{\mathrm{dir\text{-}cx}} Y.$$

The following result may be compared with Theorems 6.G.10 and 7.A.30.

Theorem 9.A.16. Let X and Y be two random vectors. If X ≤sm Y, then

$$\phi(X) \le_{\mathrm{icx}} \phi(Y)$$

for any increasing supermodular function $\phi : \mathbb{R}^n \to \mathbb{R}$.

A consequence of Theorem 9.A.16, that is useful in queuing theory, is described in the following example.

Example 9.A.17. Let $\{A_i\}_{i=0}^{\infty}$ be a sequence of random variables, and let c be some constant. Define inductively

$$Q_0 = q; \qquad Q_{i+1} = [Q_i + A_i - c]^+, \quad i = 0, 1, 2, \dots,$$

for some fixed q. Similarly, let $\{A'_i\}_{i=0}^{\infty}$ be another sequence of random variables, and define inductively

$$Q'_0 = q; \qquad Q'_{i+1} = [Q'_i + A'_i - c]^+, \quad i = 0, 1, 2, \dots.$$
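The inductive definition above is the familiar Lindley-type recursion and is easy to evaluate along a sample path. A minimal sketch follows; the function name `lindley_queue` and the numerical inputs are hypothetical, and the recursion is started at index 0 so that Q1 is determined by Q0 and A0.

```python
def lindley_queue(arrivals, c, q0=0.0):
    """Return [Q_0, Q_1, ..., Q_k] with Q_{i+1} = max(Q_i + A_i - c, 0).

    arrivals : iterable of realized values A_0, A_1, ..., A_{k-1}
    c        : the constant subtracted at each step
    q0       : the fixed initial value q
    """
    path = [q0]
    for a in arrivals:
        path.append(max(path[-1] + a - c, 0.0))
    return path

# One realized input sequence (made-up numbers for illustration).
print(lindley_queue([3, 0, 0, 5], c=2, q0=1))
```

For fixed q and c, each Qi is a deterministic function of (A0, ..., A(i−1)), which is the form in which Theorem 9.A.16 is applied in this example.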

If $(A_0, A_1, \dots, A_i) \le_{\mathrm{sm}} (A'_0, A'_1, \dots, A'_i)$ for all i = 1, 2, ..., then $Q_i \le_{\mathrm{icx}} Q'_i$ for all i = 1, 2, .... In fact, the above result holds even if $Q_0$ and $Q'_0$ are random variables satisfying $Q_0 \le_{\mathrm{icx}} Q'_0$.

As a particular case of Theorem 9.A.16 we have that

$$(X_1, X_2, \dots, X_n) \le_{\mathrm{sm}} (Y_1, Y_2, \dots, Y_n) \implies \sum_{i=1}^{n} X_i \le_{\mathrm{cx}} \sum_{i=1}^{n} Y_i \tag{9.A.19}$$

(since X ≤sm Y =⇒ EX = EY). A related result is the following. It shows that the larger in the supermodular order a random vector is, the "closer" are its coordinates in the proper stochastic sense.

Theorem 9.A.18. Let (X1, X2) and (Y1, Y2) be two random vectors. If (X1, X2) ≤sm (Y1, Y2) (that is, (X1, X2) ≤PQD (Y1, Y2); see (9.A.18)), then

$$Y_1 - Y_2 \le_{\mathrm{cx}} X_1 - X_2.$$

Proof. Let φ be a univariate convex function. Then the function ψ, defined by ψ(x1, x2) = −φ(x1 − x2), is easily seen to be supermodular. Thus Eφ(Y1 − Y2) ≤ Eφ(X1 − X2). This proves the inequality.
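The supermodularity of ψ that the proof relies on can be verified directly. The following is a short sketch of that step; the points a ≤ a′ and b ≤ b′ are notation introduced here, and only non-comparable pairs need checking since the supermodular inequality is an equality when x ≤ y or y ≤ x.

```latex
% Take x = (a, b') and y = (a', b) with a \le a' and b \le b',
% so that x \wedge y = (a, b) and x \vee y = (a', b').
\begin{align*}
\psi(x) + \psi(y) \le \psi(x \wedge y) + \psi(x \vee y)
  &\iff \phi(a - b') + \phi(a' - b) \ge \phi(a - b) + \phi(a' - b').
\end{align*}
% The four arguments satisfy
\begin{align*}
a - b' \le \min\{a - b,\ a' - b'\} &\le \max\{a - b,\ a' - b'\} \le a' - b,\\
(a - b') + (a' - b) &= (a - b) + (a' - b'),
\end{align*}
% so the inner pair lies between the outer pair and has the same sum;
% the right-hand inequality is then a standard consequence of the convexity of \phi.
```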


A consequence of Theorem 9.A.14 and (9.A.19) is described in the following example.

Example 9.A.19. Let X1, X2, ... and Y1, Y2, ... be two sequences of random variables. Let N1 and N2 be two independent and identically distributed positive integer-valued random variables independent of the Xi's and of the Yi's. Then

$$\sum_{i=1}^{N_1} X_i + \sum_{i=1}^{N_2} Y_i \le_{\mathrm{cx}} \sum_{i=1}^{N_1} (X_i + Y_i).$$

In order to see it, note that (N1, N2) ≤sm (N1, N1), and use Theorem 9.A.14 and (9.A.19). This proof was communicated to us by Taizhong Hu.

An interesting example in which the supermodular order arises naturally is the following. See also Examples 6.B.29, 6.G.11, 7.A.13, 7.A.26, 7.A.39, and 7.B.5.

Example 9.A.20. Let X be a multivariate normal random vector with mean vector 0 and variance-covariance matrix Σ, and let Y be a multivariate normal random vector with mean vector 0 and variance-covariance matrix Σ + D, where D is a matrix with zero diagonal elements such that Σ + D is nonnegative definite. Then X ≤sm Y if, and only if, all the entries of D are nonnegative.

The supermodular order can be used to bound some quite general random vectors. This is shown in the next three theorems. The proofs of these theorems are omitted. Theorem 9.A.21 can be considered to be an extension of the right-hand side of (9.A.6).

Theorem 9.A.21. Let X = (X1, X2, ..., Xn) be a random vector and let $F_{X_i}$ be the marginal distribution of Xi, i = 1, 2, ..., n. Then, for a uniform[0, 1] random variable U we have that

$$X \le_{\mathrm{sm}} \big(F_{X_1}^{-1}(U), F_{X_2}^{-1}(U), \dots, F_{X_n}^{-1}(U)\big),$$

and therefore

$$X \le_{\mathrm{PQD}} \big(F_{X_1}^{-1}(U), F_{X_2}^{-1}(U), \dots, F_{X_n}^{-1}(U)\big).$$

In particular, if the Xi's in Theorem 9.A.21, marginally, have the same (univariate) distribution function, then X ≤sm (X1, X1, ..., X1), and therefore X ≤PQD (X1, X1, ..., X1). Combining (9.A.19) and Theorem 9.A.21 it is seen, using the notation of Theorem 9.A.21, that

$$X_1 + X_2 + \cdots + X_n \le_{\mathrm{cx}} F_{X_1}^{-1}(U) + F_{X_2}^{-1}(U) + \cdots + F_{X_n}^{-1}(U). \tag{9.A.20}$$

A more detailed result is described next. Let X1, X2, ..., Xn, Z, and U be random variables, where U has the uniform[0, 1] distribution. Let $F_{X_i|Z}^{-1}(U)$ denote the random variable $g_i(U, Z)$, where $g_i$ is defined by $g_i(u, z) = F_{X_i|Z=z}^{-1}(u)$, i = 1, 2, ..., n.

Proposition 9.A.22. Let X = (X1, X2, ..., Xn) be a random vector, and let $F_{X_i}$ be the marginal distribution of Xi, i = 1, 2, ..., n. Let Z and U be two other random variables, such that U has a uniform[0, 1] distribution, and is independent of Z. Then

$$X_1 + X_2 + \cdots + X_n \le_{\mathrm{cx}} F_{X_1|Z}^{-1}(U) + F_{X_2|Z}^{-1}(U) + \cdots + F_{X_n|Z}^{-1}(U) \le_{\mathrm{cx}} F_{X_1}^{-1}(U) + F_{X_2}^{-1}(U) + \cdots + F_{X_n}^{-1}(U). \tag{9.A.21}$$

Proof. From (9.A.20) it is seen that for any convex function φ we have (below $F_Z$ denotes the distribution function of Z)

$$\begin{aligned}
E[\phi(X_1 + \cdots + X_n)] &= \int_{-\infty}^{\infty} E[\phi(X_1 + \cdots + X_n) \mid Z = z]\, dF_Z(z) \\
&\le \int_{-\infty}^{\infty} E\big[\phi\big(F_{X_1|Z=z}^{-1}(U) + \cdots + F_{X_n|Z=z}^{-1}(U)\big) \mid Z = z\big]\, dF_Z(z) \\
&= E\big[\phi\big(F_{X_1|Z}^{-1}(U) + \cdots + F_{X_n|Z}^{-1}(U)\big)\big],
\end{aligned}$$

and the first inequality in (9.A.21) follows.

Next, note that the random vector $\big(F_{X_1|Z}^{-1}(U), F_{X_2|Z}^{-1}(U), \dots, F_{X_n|Z}^{-1}(U)\big)$ has the same marginals as (X1, X2, ..., Xn) because

$$\begin{aligned}
P(X_i \le x) &= \int_{-\infty}^{\infty} P(X_i \le x \mid Z = z)\, dF_Z(z) \\
&= \int_{-\infty}^{\infty} P\big(F_{X_i|Z=z}^{-1}(U) \le x\big)\, dF_Z(z) \\
&= P\big(F_{X_i|Z}^{-1}(U) \le x\big), \quad -\infty \le x \le \infty,\ i = 1, 2, \dots, n,
\end{aligned}$$

and the second inequality in (9.A.21) therefore follows from (9.A.20).
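The comonotone bound (9.A.20) can be illustrated exactly for finite distributions. The sketch below is an illustration under stated assumptions, not a construction from the text: marginals are dictionaries {value: probability} with exact `Fraction` weights, the quantile function is taken as inf{x : F(x) ≥ u}, and the convex order is tested through equal means plus stop-loss transforms E(S − t)+ evaluated at the atoms, which suffices for finite supports because the stop-loss transforms are piecewise linear with kinks only at atoms. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def quantile(marginal, u):
    """inf{x : F(x) >= u} for a finite marginal {value: probability}, 0 < u <= 1."""
    acc = Fraction(0)
    for x in sorted(marginal):
        acc += marginal[x]
        if acc >= u:
            return x
    raise ValueError("u must lie in (0, 1]")

def comonotone_sum_pmf(marginals):
    """Distribution of F_1^{-1}(U) + ... + F_n^{-1}(U) for U uniform on [0, 1]."""
    cuts = set()
    for m in marginals:
        acc = Fraction(0)
        for x in sorted(m):
            acc += m[x]
            cuts.add(acc)
    pmf, prev = {}, Fraction(0)
    for u in sorted(cuts):
        # All quantile functions are constant on (prev, u].
        s = sum(quantile(m, u) for m in marginals)
        pmf[s] = pmf.get(s, Fraction(0)) + (u - prev)
        prev = u
    return pmf

def sum_pmf(joint):
    """Distribution of X_1 + ... + X_n for a joint pmf {point: probability}."""
    pmf = {}
    for pt, p in joint.items():
        pmf[sum(pt)] = pmf.get(sum(pt), Fraction(0)) + p
    return pmf

def stop_loss(pmf, t):
    return sum(p * max(s - t, 0) for s, p in pmf.items())

def is_cx_smaller(pmf_s, pmf_t):
    mean_s = sum(s * p for s, p in pmf_s.items())
    mean_t = sum(s * p for s, p in pmf_t.items())
    if mean_s != mean_t:
        return False
    atoms = set(pmf_s) | set(pmf_t)
    return all(stop_loss(pmf_s, t) <= stop_loss(pmf_t, t) for t in atoms)

# Hypothetical marginals; the joint law used here is the independent coupling.
m1 = {0: Fraction(1, 2), 1: Fraction(1, 2)}
m2 = {0: Fraction(1, 3), 2: Fraction(2, 3)}
independent_joint = {(a, b): m1[a] * m2[b] for a, b in product(m1, m2)}

s_indep = sum_pmf(independent_joint)
s_comon = comonotone_sum_pmf([m1, m2])
print(is_cx_smaller(s_indep, s_comon))
```

Any other joint law with the same two marginals could be substituted for `independent_joint`; by (9.A.20) the comparison with the comonotone sum should again hold.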

The next result has been motivated by the desire to generalize and unify Theorems 3.A.34 and 4.A.17. Recall the definition of negative association in (3.A.54). If the inequality (3.A.54) is reversed, that is, if the random variables X1, X2, ..., Xn satisfy

$$\mathrm{Cov}\big(h_1(X_{i_1}, X_{i_2}, \dots, X_{i_k}),\ h_2(X_{j_1}, X_{j_2}, \dots, X_{j_{n-k}})\big) \ge 0 \tag{9.A.22}$$

for all choices of disjoint subsets $\{i_1, i_2, \dots, i_k\}$ and $\{j_1, j_2, \dots, j_{n-k}\}$ of {1, 2, ..., n}, and for all increasing functions h1 and h2 for which the above covariance is defined, then X1, X2, ..., Xn are said to be weakly positively associated.

Theorem 9.A.23. Let X = (X1, X2, ..., Xn) be a random vector, and let Y = (Y1, Y2, ..., Yn) be a vector of independent random variables such that, marginally, Xi =st Yi, i = 1, 2, ..., n.
(a) If X1, X2, ..., Xn are weakly positively associated, then X ≥sm Y.
(b) If X1, X2, ..., Xn are negatively associated, then X ≤sm Y.

A result that is stronger than Theorem 9.A.23 is given in Section 9.E below; see details in Remark 9.E.9. Combining Theorem 9.A.23 with Theorem 9.A.16 (and using the fact that positive association implies weak positive association) one obtains Theorems 3.A.34 and 4.A.17 (for the latter, note that the function $\phi(x) = \max_{1 \le k \le n} \sum_{i=1}^{k} x_i$ is increasing and supermodular).

Theorem 9.A.24. Let X = (X1, X2, ..., Xn) be a vector of nonnegative random variables, and let Fi denote the marginal distribution of Xi, i = 1, 2, ..., n. Suppose that

$$\sum_{i=1}^{n} \bar F_i(0) \le 1.$$

Then there exists a unique random vector Y = (Y1, Y2, ..., Yn) with marginal distributions Fi, i = 1, 2, ..., n, such that

$$P\{Y_i > 0,\ Y_j > 0\} = 0 \quad \text{for all } i \ne j, \tag{9.A.23}$$

and this Y satisfies Y ≤sm X.

The following result strengthens Theorem 7.A.38; the terminology that is used there is also used in the theorem below.

Theorem 9.A.25. Let the random vectors X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) have the respective copulas $C_X$ and $C_Y$. Let $U_X$ and $U_Y$ be distributed according to $C_X$ and $C_Y$. If Xi ≤cx Yi, i = 1, 2, ..., n, if $U_X \le_{\mathrm{sm}} U_Y$, and if $U_Y$ is CI, then X ≤dir-cx Y.

Example 9.A.26. Let Z1, Z2, ..., Zn be a collection of independent and identically distributed random variables, let U1, U2, ..., Un be another collection of independent and identically distributed random variables, and let V be still another random variable that is independent of the Ui's. Consider the random vectors Y and X defined as

$$(Y_1, Y_2, \dots, Y_n) = (g_1(Z_1), g_2(Z_2), \dots, g_n(Z_n)),$$
$$(X_1, X_2, \dots, X_n) = (\tilde g_1(U_1, V), \tilde g_2(U_2, V), \dots, \tilde g_n(U_n, V)),$$

where $g_i : \mathbb{R} \to \mathbb{R}$ and $\tilde g_i : \mathbb{R}^2 \to \mathbb{R}$ are measurable functions that satisfy

$$g_i(Z_i) =_{\mathrm{st}} \tilde g_i(U_i, V), \quad i = 1, 2, \dots, n.$$

If $\tilde g_i$ is increasing in its second variable, i = 1, 2, ..., n, then it is known that for fixed values u1, u2, ..., un of U1, U2, ..., Un we have that $\tilde g_1(u_1, V), \tilde g_2(u_2, V), \dots, \tilde g_n(u_n, V)$ are weakly positively associated. Thus, for a supermodular function $\phi : \mathbb{R}^n \to \mathbb{R}$ we have (here V1, V2, ..., Vn are independent copies of V)

$$\begin{aligned}
E\phi(X_1, X_2, \dots, X_n) &= E\big\{E\big[\phi(\tilde g_1(U_1, V), \tilde g_2(U_2, V), \dots, \tilde g_n(U_n, V)) \mid U_1, U_2, \dots, U_n\big]\big\} \\
&\ge E\big\{E\big[\phi(\tilde g_1(U_1, V_1), \tilde g_2(U_2, V_2), \dots, \tilde g_n(U_n, V_n)) \mid U_1, U_2, \dots, U_n\big]\big\} \\
&= E\phi(Y_1, Y_2, \dots, Y_n),
\end{aligned}$$

where the inequality follows from Theorem 9.A.23. Thus Y ≤sm X.

Example 9.A.27. Let Ω = {a1, a2, ..., aN} be a finite population. Let X1, X2, ..., Xn be a sample without replacement of size n ≤ N from Ω; that is,

$$P\{(X_1, X_2, \dots, X_n) = (x_1, x_2, \dots, x_n)\} = \frac{1}{N(N-1)\cdots(N-n+1)}, \quad (x_1, x_2, \dots, x_n) \in \Omega^n,$$

provided all the xi's comprise different elements of Ω. Let Y1, Y2, ..., Yn be a sample with replacement of size n from Ω; that is,

$$P\{(Y_1, Y_2, \dots, Y_n) = (x_1, x_2, \dots, x_n)\} = \frac{1}{N^n}, \quad (x_1, x_2, \dots, x_n) \in \Omega^n.$$

Then (X1, X2, ..., Xn) ≤sm (Y1, Y2, ..., Yn).

Example 9.A.28. Let φ and ψ be two Laplace transforms of positive random variables. Then F and G, defined by

$$F(x_1, x_2, \dots, x_n) = \phi\big(\phi^{-1}(x_1) + \phi^{-1}(x_2) + \cdots + \phi^{-1}(x_n)\big), \quad (x_1, x_2, \dots, x_n) \in [0, 1]^n,$$

and

$$G(y_1, y_2, \dots, y_n) = \psi\big(\psi^{-1}(y_1) + \psi^{-1}(y_2) + \cdots + \psi^{-1}(y_n)\big), \quad (y_1, y_2, \dots, y_n) \in [0, 1]^n,$$

are multivariate distribution functions with uniform[0, 1] marginals (see Example 9.A.3). Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be distributed according to F and G, respectively. If $\phi^{-1}\psi$ has a completely monotone derivative, then X ≤sm Y.

Hu, Xie, and Ruan [241] described various sets of conditions under which two multivariate Bernoulli random vectors are ordered with respect to the supermodular order.
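As a concrete instance of Example 9.A.28, one may take the Laplace transform φ_θ(t) = (1 + t)^(−1/θ) of a gamma random variable, which generates the Clayton family of copulas. This particular family is an assumption made here for illustration; the example in the text is stated for general Laplace transforms. With φ = φ_θ1 and ψ = φ_θ2, the composition φ⁻¹ψ(t) = (1 + t)^(θ1/θ2) − 1 has a completely monotone derivative when θ1 ≤ θ2. The sketch below checks only the necessary consequence F ≤ G of X ≤sm Y, on a grid in dimension three.

```python
from itertools import product

def clayton_generator(theta):
    """Return (phi, phi_inv) for phi(t) = (1 + t) ** (-1 / theta), theta > 0."""
    phi = lambda t: (1.0 + t) ** (-1.0 / theta)
    phi_inv = lambda x: x ** (-theta) - 1.0
    return phi, phi_inv

def archimedean_cdf(phi, phi_inv, xs):
    """F(x_1, ..., x_n) = phi(phi_inv(x_1) + ... + phi_inv(x_n)) on (0, 1]^n."""
    return phi(sum(phi_inv(x) for x in xs))

phi, phi_inv = clayton_generator(0.5)   # plays the role of phi in the example
psi, psi_inv = clayton_generator(2.0)   # plays the role of psi in the example

grid = [k / 10 for k in range(1, 10)]
pointwise_ordered = all(
    archimedean_cdf(phi, phi_inv, x) <= archimedean_cdf(psi, psi_inv, x) + 1e-12
    for x in product(grid, repeat=3)
)
print(pointwise_ordered)
```

The grid avoids the value 0, where the inverse generator is infinite; at such points both distribution functions equal 0.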


Remark 9.A.29. Hu and Pan [239] elegantly extended (9.A.12) to the supermodular order. They also identiﬁed conditions under which any n corresponding values of two stationary Markov chains are comparable in the order ≤sm . See also Miyoshi and Rolski [398].

9.B The Orthant Ratio Orders

Some multivariate stochastic orders, that compare the "location" or "magnitude" of two random vectors, may be thought of as stochastic orders of positive dependence if the compared random vectors have the same univariate marginal distributions. For example, in the bivariate case, when this is the situation, the orthant orders ≤uo and ≤lo (see Section 6.G.1) become the order ≤PQD, or, equivalently (see (9.A.18)), the order ≤sm. On the other hand, some multivariate location orders do not give anything meaningful once the marginals are held fixed. For instance, the usual multivariate stochastic order ≤st can order two random vectors, with marginals that are stochastically equal, only if they have the same distributions (see Theorem 6.B.19). In this section we study, among other things, some stochastic orders of positive dependence that arise when the underlying random vectors are ordered with respect to some multivariate hazard rate stochastic orders that were discussed in Section 6.D, and have the same univariate marginal distributions.

9.B.1 The (weak) orthant ratio orders

Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors with respective distribution functions $F$ and $G$, and with survival functions $\bar F$ and $\bar G$. We suppose that F and G belong to the same Fréchet class; that is, have the same univariate marginals. We say that X is smaller than Y in the lower orthant decreasing ratio order (denoted by X ≤lodr Y or F ≤lodr G) if

$$F(y)G(x) \ge F(x)G(y) \quad \text{whenever } x \le y. \tag{9.B.1}$$

This is equivalent to

$$\frac{G(x)}{F(x)} \ \text{ is decreasing in } x \in \{x : G(x) > 0\}, \tag{9.B.2}$$

where in (9.B.2) we use the convention a/0 ≡ ∞ whenever a > 0. Note that (9.B.2) can be written equivalently as

$$\frac{F(x - u)}{F(x)} \le \frac{G(x - u)}{G(x)}, \quad u \ge 0,\ x \in \{x : F(x) > 0\} \cap \{x : G(x) > 0\}, \tag{9.B.3}$$

and it is also equivalent to

$$[X - x \mid X \le x] \ge_{\mathrm{lo}} [Y - x \mid Y \le x], \quad x \in \{x : F(x) > 0\} \cap \{x : G(x) > 0\}. \tag{9.B.4}$$

Note that from (9.B.2) it follows that {x : F(x) > 0} ⊆ {x : G(x) > 0}. Thus, in (9.B.3) and (9.B.4) we can formally replace the expression {x : F(x) > 0} ∩ {x : G(x) > 0} by the simpler expression {x : F(x) > 0}.

We say that X is smaller than Y in the upper orthant increasing ratio order (denoted by X ≤uoir Y or F ≤uoir G) if

$$\bar F(y)\bar G(x) \le \bar F(x)\bar G(y) \quad \text{whenever } x \le y. \tag{9.B.5}$$

This is equivalent to

$$\frac{\bar G(x)}{\bar F(x)} \ \text{ is increasing in } x \in \{x : \bar G(x) > 0\},$$

where here, again, we use the convention a/0 ≡ ∞ whenever a > 0. Note that the above can be written equivalently as

$$\frac{\bar F(x + u)}{\bar F(x)} \le \frac{\bar G(x + u)}{\bar G(x)}, \quad u \ge 0,\ x \in \{x : \bar F(x) > 0\} \cap \{x : \bar G(x) > 0\}, \tag{9.B.6}$$

and it is also equivalent to

$$[X - x \mid X > x] \le_{\mathrm{uo}} [Y - x \mid Y > x], \quad x \in \{x : \bar F(x) > 0\} \cap \{x : \bar G(x) > 0\}. \tag{9.B.7}$$

Formally the expression $\{x : \bar F(x) > 0\} \cap \{x : \bar G(x) > 0\}$ in (9.B.6) and (9.B.7) can be replaced by the simpler expression $\{x : \bar F(x) > 0\}$. We note that if X and Y have the same marginals, then X ≤uoir Y if, and only if, X ≤whr Y; see (6.D.2).

The two orders ≤lodr and ≤uoir are closely related, as is indicated in the next result.

Theorem 9.B.1. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors in the same Fréchet class.
(a) If X ≤lodr Y, then (φ1(X1), φ2(X2), ..., φn(Xn)) ≤uoir (φ1(Y1), φ2(Y2), ..., φn(Yn)) for any decreasing functions φ1, φ2, ..., φn. Conversely, if (φ1(X1), φ2(X2), ..., φn(Xn)) ≤uoir (φ1(Y1), φ2(Y2), ..., φn(Yn)) for some strictly decreasing functions φ1, φ2, ..., φn, then X ≤lodr Y.
(b) If X ≤uoir Y, then (φ1(X1), φ2(X2), ..., φn(Xn)) ≤lodr (φ1(Y1), φ2(Y2), ..., φn(Yn)) for any decreasing functions φ1, φ2, ..., φn. Conversely, if (φ1(X1), φ2(X2), ..., φn(Xn)) ≤lodr (φ1(Y1), φ2(Y2), ..., φn(Yn)) for some strictly decreasing functions φ1, φ2, ..., φn, then X ≤uoir Y.

The next result is similar to Theorem 9.B.1, but it involves increasing, rather than decreasing, functions. It shows that the orders ≤lodr and ≤uoir are closed under componentwise increasing transformations.
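Definitions (9.B.1) and (9.B.5) lend themselves to a direct numerical check when the two distribution functions are known in closed form. The sketch below compares, on a grid, the bivariate Fréchet lower bound with uniform[0, 1] marginals, W(u, v) = max(u + v − 1, 0), against the independence copula uv; this pair is chosen here for illustration, and is an instance of the comparison asserted in Example 9.B.3 below. The function names and the tolerance are assumptions of the sketch.

```python
def frechet_lower(u, v):
    """Distribution function of the Frechet lower bound with uniform marginals."""
    return max(u + v - 1.0, 0.0)

def independence(u, v):
    """Distribution function of two independent uniform[0, 1] variables."""
    return u * v

def survival_of(cdf):
    """Bivariate survival function, valid for uniform[0, 1] marginals."""
    return lambda u, v: 1.0 - u - v + cdf(u, v)

def ordered_pairs(grid):
    pts = [(a, b) for a in grid for b in grid]
    return [(x, y) for x in pts for y in pts if x[0] <= y[0] and x[1] <= y[1]]

def is_lodr_smaller(F, G, grid, tol=1e-12):
    """Check (9.B.1): F(y) G(x) >= F(x) G(y) whenever x <= y."""
    return all(F(*y) * G(*x) >= F(*x) * G(*y) - tol for x, y in ordered_pairs(grid))

def is_uoir_smaller(F, G, grid, tol=1e-12):
    """Check (9.B.5) for the survival functions derived from F and G."""
    Fb, Gb = survival_of(F), survival_of(G)
    return all(Fb(*y) * Gb(*x) <= Fb(*x) * Gb(*y) + tol for x, y in ordered_pairs(grid))

grid = [k / 10 for k in range(11)]
print(is_lodr_smaller(frechet_lower, independence, grid))
print(is_uoir_smaller(frechet_lower, independence, grid))
print(is_lodr_smaller(independence, frechet_lower, grid))
```

A grid check of this kind can refute an ordering but cannot establish it; here it is consistent with the lower bound being smaller than independence in both orders, and shows that the reverse comparison in ≤lodr fails.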


Theorem 9.B.2. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors in the same Fréchet class.
(a) If X ≤lodr Y, then (φ1(X1), φ2(X2), ..., φn(Xn)) ≤lodr (φ1(Y1), φ2(Y2), ..., φn(Yn)) for any increasing functions φ1, φ2, ..., φn. Conversely, if (φ1(X1), φ2(X2), ..., φn(Xn)) ≤lodr (φ1(Y1), φ2(Y2), ..., φn(Yn)) for some strictly increasing functions φ1, φ2, ..., φn, then X ≤lodr Y.
(b) If X ≤uoir Y, then (φ1(X1), φ2(X2), ..., φn(Xn)) ≤uoir (φ1(Y1), φ2(Y2), ..., φn(Yn)) for any increasing functions φ1, φ2, ..., φn. Conversely, if (φ1(X1), φ2(X2), ..., φn(Xn)) ≤uoir (φ1(Y1), φ2(Y2), ..., φn(Yn)) for some strictly increasing functions φ1, φ2, ..., φn, then X ≤uoir Y.

Since the order ≤uoir is equivalent to the order ≤whr when the compared random vectors have the same marginals, it follows from Theorem 6.D.4 that the order ≤uoir is closed under conjunctions, marginalization, and convergence in distribution. Using Theorem 9.B.1 it is seen that also the order ≤lodr is closed under these operations.

If X ≤lodr Y, then from (9.B.4) it follows that $[X \mid X \le x] \ge_{\mathrm{lo}} [Y \mid Y \le x]$ for all relevant x. Letting x → ∞ it is seen that (9.A.13) holds (with F and G being the distribution functions of X and Y, respectively). Similarly, if X ≤uoir Y, then (9.A.14) holds. Thus we have that

$$X \le_{\mathrm{lodr}} Y \ \text{ and } \ X \le_{\mathrm{uoir}} Y \implies X \le_{\mathrm{PQD}} Y.$$

Example 9.B.3. Recall from page 387 the definition of the Fréchet class M(F1, F2) and the Fréchet lower bound in that class which we denote here by $F^-$. Suppose that (X1, X2) has a distribution function F in M(F1, F2). Then $F^- \le_{\mathrm{lodr}} F$ and $F^- \le_{\mathrm{uoir}} F$.

Example 9.B.4. Let X and Y be two n-dimensional random vectors with Marshall-Olkin exponential distributions F and G with the survival functions given, for x ≥ 0, by

$$\bar F(x) = \exp\Big\{-\sum_{i=1}^{n} \lambda_i x_i - \sum_{1 \le i_1 < i_2 \le n} \lambda_{i_1 i_2}(x_{i_1} \vee x_{i_2}) - \cdots - \lambda_{12\cdots n}(x_1 \vee x_2 \vee \cdots \vee x_n)\Big\},$$

and

$$\bar G(x) = \exp\Big\{-\sum_{i=1}^{n} \theta_i x_i - \sum_{1 \le i_1 < i_2 \le n} \theta_{i_1 i_2}(x_{i_1} \vee x_{i_2}) - \cdots - \theta_{12\cdots n}(x_1 \vee x_2 \vee \cdots \vee x_n)\Big\},$$

where the λ's and the θ's are positive constants. Denote $\nu_A = \lambda_A - \theta_A$, A ⊆ {1, 2, ..., n}. Then X ≤uoir Y if, and only if,

$$\begin{aligned}
&\nu_i \ge 0, && i \in \{1, 2, \dots, n\},\\
&\nu_{i_1} + \nu_{i_1 i_2} \ge 0, && \{i_1, i_2\} \subseteq \{1, 2, \dots, n\},\\
&\nu_{i_1} + \nu_{i_1 i_2} + \nu_{i_1 i_3} + \nu_{i_1 i_2 i_3} \ge 0, && \{i_1, i_2, i_3\} \subseteq \{1, 2, \dots, n\},\\
&\qquad \vdots
\end{aligned}$$

and

$$\sum_{\substack{A \ni i \\ A \subseteq \{1, 2, \dots, n\}}} \nu_A = 0.$$

9.B.2 The strong orthant ratio orders

Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors with respective distribution functions $F$ and $G$, and with survival functions $\bar F$ and $\bar G$. As in Section 9.B.1, we suppose that F and G belong to the same Fréchet class; that is, have the same univariate marginals. We say that X is smaller than Y in the strong lower orthant decreasing ratio order (denoted by X ≤slodr Y or F ≤slodr G) if

$$F(x)G(y) \le F(x \vee y)G(y \wedge x), \quad x, y \in \mathbb{R}^n. \tag{9.B.8}$$

We say that X is smaller than Y in the strong upper orthant increasing ratio order (denoted by X ≤suoir Y or F ≤suoir G) if

$$\bar F(x)\bar G(y) \le \bar F(x \wedge y)\bar G(y \vee x), \quad x, y \in \mathbb{R}^n. \tag{9.B.9}$$

We note that if X and Y have the same marginals, then X ≤suoir Y if, and only if, X ≤hr Y; see (6.D.1). By choosing x ≤ y in (9.B.8) we get (9.B.1), and by choosing x ≥ y in (9.B.9) we get (9.B.5), that is,

$$X \le_{\mathrm{slodr}} Y \implies X \le_{\mathrm{lodr}} Y \quad \text{and} \quad X \le_{\mathrm{suoir}} Y \implies X \le_{\mathrm{uoir}} Y. \tag{9.B.10}$$

Thus the orders ≤slodr and ≤suoir are often useful as a tool to identify random vectors that are ordered with respect to the orders ≤lodr and ≤uoir.

The two orders ≤slodr and ≤suoir are closely related, and are preserved under componentwise increasing transformations, as is indicated in the next analog of Theorems 9.B.1 and 9.B.2.

Theorem 9.B.5. Let X = (X1, X2, ..., Xn) and Y = (Y1, Y2, ..., Yn) be two random vectors in the same Fréchet class.
(a) If X ≤slodr Y, then (φ1(X1), φ2(X2), ..., φn(Xn)) ≤suoir (φ1(Y1), φ2(Y2), ..., φn(Yn)) for any decreasing functions φ1, φ2, ..., φn. On the other hand, if (φ1(X1), φ2(X2), ..., φn(Xn)) ≤suoir (φ1(Y1), φ2(Y2), ..., φn(Yn)) for some strictly decreasing functions φ1, φ2, ..., φn, then X ≤slodr Y.
(b) If X ≤suoir Y, then (φ1(X1), φ2(X2), ..., φn(Xn)) ≤slodr (φ1(Y1), φ2(Y2), ..., φn(Yn)) for any decreasing functions φ1, φ2, ..., φn. On the other hand, if (φ1(X1), φ2(X2), ..., φn(Xn)) ≤slodr (φ1(Y1), φ2(Y2), ..., φn(Yn)) for some strictly decreasing functions φ1, φ2, ..., φn, then X ≤suoir Y.
(c) If X ≤slodr Y, then (φ1(X1), φ2(X2), ..., φn(Xn)) ≤slodr (φ1(Y1), φ2(Y2), ..., φn(Yn)) for any increasing functions φ1, φ2, ..., φn. On the other hand, if (φ1(X1), φ2(X2), ..., φn(Xn)) ≤slodr (φ1(Y1), φ2(Y2), ..., φn(Yn)) for some strictly increasing functions φ1, φ2, ..., φn, then X ≤slodr Y.
(d) If X ≤suoir Y, then (φ1(X1), φ2(X2), ..., φn(Xn)) ≤suoir (φ1(Y1), φ2(Y2), ..., φn(Yn)) for any increasing functions φ1, φ2, ..., φn. On the other hand, if (φ1(X1), φ2(X2), ..., φn(Xn)) ≤suoir (φ1(Y1), φ2(Y2), ..., φn(Yn)) for some strictly increasing functions φ1, φ2, ..., φn, then X ≤suoir Y.

Since the order ≤suoir is equivalent to the order ≤hr when the compared random vectors have the same marginals, it follows from Theorem 6.D.4 that the order ≤suoir is closed under conjunctions, marginalization, and convergence in distribution. Using Theorem 9.B.5 it is seen that also the order ≤slodr is closed under these operations.

The converses of the implications in (9.B.10) are not true in general. However, under an additional assumption they are valid; these are given in the following theorem.

Theorem 9.B.6. Let X and Y be two random vectors in the same Fréchet class with respective distribution functions $F$ and $G$, and respective survival functions $\bar F$ and $\bar G$.
(a) If $F$ and/or $G$ are/is MTP2, then X ≤lodr Y =⇒ X ≤slodr Y.
(b) If $\bar F$ and/or $\bar G$ are/is MTP2, then X ≤uoir Y =⇒ X ≤suoir Y.

Part (b) of the above theorem is similar to Theorem 6.D.1. However, it turns out that since the compared random vectors are in the same Fréchet class, it is not needed, in Theorem 9.B.6(b), that they have a common support which is a lattice.

9.C The LTD, RTI, and PRD Orders

For any random vector $(X_1, X_2)$ with distribution function $F \in \mathcal{M}(F_1, F_2)$ (see page 387 for the definition of $\mathcal{M}(F_1, F_2)$) we define the conditional distribution function $F^L_{x_1}$ by

$$F^L_{x_1}(x_2) = P\{X_2 \le x_2 \mid X_1 \le x_1\} \tag{9.C.1}$$

for all $x_1$ for which this conditional distribution is well defined. Barlow and Proschan [36] defined $F$ (or $X_1$ and $X_2$) to be left tail decreasing (LTD) if

$$F^L_{x_1}(x_2) \ge F^L_{x_1'}(x_2) \quad \text{for all } x_1 \le x_1' \text{ and } x_2,$$

or, equivalently, if

$$(F^L_{x_1})^{-1}(u) \le (F^L_{x_1'})^{-1}(u) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.2}$$
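The LTD condition can be checked numerically for a concrete distribution. The sketch below is only an illustration under stated assumptions: it uses the copula $C(u, v) = uv\{1 + \alpha(1-u)(1-v)\}$ with uniform(0, 1) marginals as a hypothetical test case (this family is not part of the definition above), and the function names `left_tail_cdf` and `is_ltd_on_grid` are made up for the example. It evaluates $F^L_{x_1}(x_2) = C(x_1, x_2)/x_1$ on a grid and tests the defining inequality.

```python
# Assumed test case: C(u, v) = u*v*(1 + alpha*(1-u)*(1-v)), uniform(0,1) marginals.
# All names here are hypothetical and chosen only for this illustration.

def left_tail_cdf(alpha, x1, x2):
    """F^L_{x1}(x2) = P{X2 <= x2 | X1 <= x1} = C(x1, x2) / x1, for 0 < x1 <= 1."""
    joint = x1 * x2 * (1.0 + alpha * (1.0 - x1) * (1.0 - x2))
    return joint / x1


def is_ltd_on_grid(alpha, steps=20, tol=1e-12):
    """Check F^L_{x1}(x2) >= F^L_{x1'}(x2) for all grid points with x1 <= x1'."""
    grid = [i / steps for i in range(1, steps + 1)]
    for i, x1 in enumerate(grid):
        for x1_prime in grid[i:]:
            for x2 in grid:
                lower_cond = left_tail_cdf(alpha, x1, x2)
                upper_cond = left_tail_cdf(alpha, x1_prime, x2)
                if lower_cond < upper_cond - tol:
                    return False
    return True


ltd_positive = is_ltd_on_grid(0.6)    # positive alpha: LTD holds on the grid
ltd_negative = is_ltd_on_grid(-0.6)   # negative alpha: LTD fails on the grid
```

For this assumed family $F^L_{x_1}(x_2) = x_2\{1 + \alpha(1 - x_1)(1 - x_2)\}$, which is decreasing in $x_1$ exactly when $\alpha \ge 0$, so the grid check agrees with the closed form.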

Note that when $(F^L_{x_1})^{-1}(u)$ is continuous in $u$ for all $x_1$, then (9.C.2) can be equivalently written as

$$F^L_{x_1'}\big((F^L_{x_1})^{-1}(u)\big) \le u \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.3}$$

This notion leads to the following definition.

Let $(X_1, X_2)$ be a bivariate random vector with distribution function $F \in \mathcal{M}(F_1, F_2)$, and let $(Y_1, Y_2)$ be another bivariate random vector with distribution function $G \in \mathcal{M}(F_1, F_2)$. Suppose that for any $x_1 \le x_1'$ we have

$$(F^L_{x_1})^{-1}(u) \le (F^L_{x_1'})^{-1}(v) \implies (G^L_{x_1})^{-1}(u) \le (G^L_{x_1'})^{-1}(v) \quad \text{for all } u, v \in [0, 1]. \tag{9.C.4}$$

Then we say that $(X_1, X_2)$ is smaller than $(Y_1, Y_2)$ in the LTD order (denoted by $(X_1, X_2) \le_{\mathrm{LTD}} (Y_1, Y_2)$ or $F \le_{\mathrm{LTD}} G$). Note that (9.C.4) can be equivalently written as

$$G^L_{x_1'}\big((G^L_{x_1})^{-1}(u)\big) \le F^L_{x_1'}\big((F^L_{x_1})^{-1}(u)\big) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.5}$$

It can be shown that if $F^L_{x_1}(x_2)$ and $G^L_{x_1}(x_2)$ are continuous in $x_2$ for all $x_1$, then $(X_1, X_2) \le_{\mathrm{LTD}} (Y_1, Y_2)$ if, and only if, for any $x_1 \le x_1'$,

$$F^L_{x_1}(x_2) \ge G^L_{x_1}(x_2') \implies F^L_{x_1'}(x_2) \ge G^L_{x_1'}(x_2') \quad \text{for any } x_2 \text{ and } x_2'. \tag{9.C.6}$$

Note that (9.C.6) can be equivalently written as

$$(G^L_{x_1})^{-1}\big(F^L_{x_1}(x_2)\big) \le (G^L_{x_1'})^{-1}\big(F^L_{x_1'}(x_2)\big) \quad \text{for all } x_1 \le x_1' \text{ and } x_2,$$

that is, $(G^L_{x_1})^{-1}\big(F^L_{x_1}(x_2)\big)$ is increasing in $x_1$ for all $x_2$.

In the continuous case, it is immediate from (9.C.3) and (9.C.5) that $F$ is LTD if, and only if, $F^I \le_{\mathrm{LTD}} F$, where $F^I$ is defined in Section 9.A, but this is true also when $F$ is not continuous.

Theorem 9.C.1. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors with distribution functions $F, G \in \mathcal{M}(F_1, F_2)$, such that $F^L_{x_1}(x_2)$ and $G^L_{x_1}(x_2)$ are continuous in $x_2$ for all $x_1$. Then

$$(X_1, X_2) \le_{\mathrm{LTD}} (Y_1, Y_2) \implies (X_1, X_2) \le_{\mathrm{PQD}} (Y_1, Y_2).$$

Proof. Since $F$ and $G$ have the same marginals, we see from (9.C.6) that $(X_1, X_2) \le_{\mathrm{LTD}} (Y_1, Y_2)$ if, and only if,

$$F(x_1, x_2) \ge G(x_1, x_2') \implies F(x_1', x_2) \ge G(x_1', x_2') \quad \text{for any } x_2, x_2', \text{ and } x_1 \le x_1'. \tag{9.C.7}$$

If $(X_1, X_2) \le_{\mathrm{PQD}} (Y_1, Y_2)$ did not hold, then there would exist a point $(x_0, y_0)$ such that $F(x_0, y_0) > G(x_0, y_0)$. Let $y < y_0$ be such that $F(x_0, y_0) > F(x_0, y) > G(x_0, y_0)$. Since $F_2(y) < F_2(y_0)$, one can then find an $x$ such that $x > x_0$ and $F(x, y) < G(x, y_0)$. But then $F(x_0, y) > G(x_0, y_0)$ and $F(x, y) < G(x, y_0)$ contradict (9.C.7).

The LTD order is not symmetric in the sense that $(X_1, X_2) \le_{\mathrm{LTD}} (Y_1, Y_2)$ does not necessarily imply that $(X_2, X_1) \le_{\mathrm{LTD}} (Y_2, Y_1)$. However, it satisfies the following property of closure under monotone transformations.

Theorem 9.C.2. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors with distribution functions in the same Fréchet class. If $(X_1, X_2) \le_{\mathrm{LTD}} (Y_1, Y_2)$, then $(\phi(X_1), \psi(X_2)) \le_{\mathrm{LTD}} (\phi(Y_1), \psi(Y_2))$ for all increasing functions $\phi$ and $\psi$.

Example 9.C.3. Let $\phi_\theta(t) \equiv (1 - t^\theta)^{1/\theta}$, $t \in [0, 1]$, $\theta \in (0, 1)$. Then the function $C_{\phi_\theta}$, defined as

$$C_{\phi_\theta}(x, y) = \phi_\theta^{-1}\{\phi_\theta(x) + \phi_\theta(y)\}, \quad x, y \in [0, 1],$$

is a bivariate distribution function with uniform[0, 1] marginals (it is a particular Archimedean copula). If $\theta_1 \le \theta_2$, then $C_{\phi_{\theta_2}} \le_{\mathrm{LTD}} C_{\phi_{\theta_1}}$.

An order that is similar to the LTD order, but which is based on conditioning on right tails, rather than on left tails, is described next. For any random vector $(X_1, X_2)$ with distribution function $F \in \mathcal{M}(F_1, F_2)$ we define the conditional distribution function $F^R_{x_1}$ by

$$F^R_{x_1}(x_2) = P\{X_2 \le x_2 \mid X_1 > x_1\} \tag{9.C.8}$$

for all $x_1$ for which this conditional distribution is well defined. Barlow and Proschan [36] defined $F$ (or $X_1$ and $X_2$) to be right tail increasing (RTI) if

$$F^R_{x_1}(x_2) \ge F^R_{x_1'}(x_2) \quad \text{for all } x_1 \le x_1' \text{ and } x_2,$$

or, equivalently, if

$$(F^R_{x_1})^{-1}(u) \le (F^R_{x_1'})^{-1}(u) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.9}$$

When $(F^R_{x_1})^{-1}(u)$ is continuous in $u$ for all $x_1$, then (9.C.9) can be written as

$$F^R_{x_1'}\big((F^R_{x_1})^{-1}(u)\big) \le u \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.10}$$

This notion leads to the following definition.

Let $(X_1, X_2)$ be a bivariate random vector with distribution function $F \in \mathcal{M}(F_1, F_2)$, and let $(Y_1, Y_2)$ be another bivariate random vector with distribution function $G \in \mathcal{M}(F_1, F_2)$. Suppose that for any $x_1 \le x_1'$ we have

$$(F^R_{x_1})^{-1}(u) \le (F^R_{x_1'})^{-1}(v) \implies (G^R_{x_1})^{-1}(u) \le (G^R_{x_1'})^{-1}(v) \quad \text{for all } u, v \in [0, 1]. \tag{9.C.11}$$

Then we say that $(X_1, X_2)$ is smaller than $(Y_1, Y_2)$ in the RTI order (denoted by $(X_1, X_2) \le_{\mathrm{RTI}} (Y_1, Y_2)$ or $F \le_{\mathrm{RTI}} G$). In analogy to (9.C.5) we note that (9.C.11) can be written as

$$G^R_{x_1'}\big((G^R_{x_1})^{-1}(u)\big) \le F^R_{x_1'}\big((F^R_{x_1})^{-1}(u)\big) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.12}$$

It can be shown that if $F^R_{x_1}(x_2)$ and $G^R_{x_1}(x_2)$ are continuous in $x_2$ for all $x_1$, then $(X_1, X_2) \le_{\mathrm{RTI}} (Y_1, Y_2)$ if, and only if, for any $x_1 \le x_1'$,

$$F^R_{x_1}(x_2) \ge G^R_{x_1}(x_2') \implies F^R_{x_1'}(x_2) \ge G^R_{x_1'}(x_2') \quad \text{for any } x_2 \text{ and } x_2'. \tag{9.C.13}$$

Note that (9.C.13) can be written as

$$(G^R_{x_1})^{-1}\big(F^R_{x_1}(x_2)\big) \le (G^R_{x_1'})^{-1}\big(F^R_{x_1'}(x_2)\big) \quad \text{for all } x_1 \le x_1' \text{ and } x_2,$$

that is, $(G^R_{x_1})^{-1}\big(F^R_{x_1}(x_2)\big)$ is increasing in $x_1$ for all $x_2$.

In the continuous case, it is immediate from (9.C.10) and (9.C.12) that $F$ is RTI if, and only if, $F^I \le_{\mathrm{RTI}} F$, where $F^I$ is defined in Section 9.A, but this is true also when $F$ is not continuous.

The following result is an analog of Theorem 9.C.1; its proof is similar to the proof of that theorem, and is therefore omitted.

Theorem 9.C.4. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors with distribution functions $F, G \in \mathcal{M}(F_1, F_2)$, such that $F^R_{x_1}(x_2)$ and $G^R_{x_1}(x_2)$ are continuous in $x_2$ for all $x_1$. Then

$$(X_1, X_2) \le_{\mathrm{RTI}} (Y_1, Y_2) \implies (X_1, X_2) \le_{\mathrm{PQD}} (Y_1, Y_2).$$

The RTI order is not symmetric in the sense that $(X_1, X_2) \le_{\mathrm{RTI}} (Y_1, Y_2)$ does not necessarily imply that $(X_2, X_1) \le_{\mathrm{RTI}} (Y_2, Y_1)$. However, it satisfies the following property of closure under monotone transformations.

Theorem 9.C.5. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors with distribution functions in the same Fréchet class. If $(X_1, X_2) \le_{\mathrm{RTI}} (Y_1, Y_2)$, then $(\phi(X_1), \psi(X_2)) \le_{\mathrm{RTI}} (\phi(Y_1), \psi(Y_2))$ for all increasing functions $\phi$ and $\psi$.

The LTD and RTI orders are related to each other as follows.

Theorem 9.C.6. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors in the same Fréchet class.

(a) If $(X_1, X_2) \le_{\mathrm{LTD}} (Y_1, Y_2)$, then $(\phi_1(X_1), \phi_2(X_2)) \le_{\mathrm{RTI}} (\phi_1(Y_1), \phi_2(Y_2))$ for any decreasing functions $\phi_1$ and $\phi_2$. Conversely, if $(\phi_1(X_1), \phi_2(X_2)) \le_{\mathrm{RTI}} (\phi_1(Y_1), \phi_2(Y_2))$ for some strictly decreasing functions $\phi_1$ and $\phi_2$, then $(X_1, X_2) \le_{\mathrm{LTD}} (Y_1, Y_2)$.

(b) If $(X_1, X_2) \le_{\mathrm{RTI}} (Y_1, Y_2)$, then $(\phi_1(X_1), \phi_2(X_2)) \le_{\mathrm{LTD}} (\phi_1(Y_1), \phi_2(Y_2))$ for any decreasing functions $\phi_1$ and $\phi_2$. Conversely, if $(\phi_1(X_1), \phi_2(X_2)) \le_{\mathrm{LTD}} (\phi_1(Y_1), \phi_2(Y_2))$ for some strictly decreasing functions $\phi_1$ and $\phi_2$, then $(X_1, X_2) \le_{\mathrm{RTI}} (Y_1, Y_2)$.

The orders $\le_{\mathrm{slodr}}$ and $\le_{\mathrm{suoir}}$ imply the LTD and RTI orders under some regularity conditions. This is shown in the next result.

Theorem 9.C.7. Let $F$ and $G$ be in the Fréchet class $\mathcal{M}(F_1, F_2)$. Assume that, for every $x$, the conditional distributions $F^L_x$ and $F^R_x$ (see (9.C.1) and (9.C.8)) are strictly increasing and continuous on their supports. Then

$$F \le_{\mathrm{slodr}} G \implies F \le_{\mathrm{LTD}} G \quad \text{and} \quad F \le_{\mathrm{suoir}} G \implies F \le_{\mathrm{RTI}} G.$$

Proof. It is enough to prove the first implication; the other implication then follows from Theorems 9.B.5 and 9.C.6. By (9.C.7), we need to show that for $x \le x'$, and for any $y, y'$, it holds that

$$F(x, y) \ge G(x, y') \implies F(x', y) \ge G(x', y'). \tag{9.C.14}$$

Now assume that $F \le_{\mathrm{slodr}} G$. So, for $x \le x'$ and $y' \le y$ we have

$$F(x, y)\,G(x', y') \le F(x', y)\,G(x, y'). \tag{9.C.15}$$

In the bivariate case, $F \le_{\mathrm{slodr}} G$ implies that $F \le_{\mathrm{PQD}} G$. So the left-hand side inequality in (9.C.14) can hold only for $y' \le y$. If it does hold, then (9.C.15) implies the inequality on the right-hand side of (9.C.14).

In light of Theorem 9.C.7 it is of interest to note that the (weak) orthant ratio orders $\le_{\mathrm{lodr}}$ and $\le_{\mathrm{uoir}}$ do not imply the orders $\le_{\mathrm{LTD}}$ and $\le_{\mathrm{RTI}}$, respectively. Counterexamples can be found in the literature.

An order that is of the same type as the LTD and RTI orders is the one that we study next. For any random vector $(X_1, X_2)$, with distribution function $F \in \mathcal{M}(F_1, F_2)$, let $F_{x_1}$ denote the conditional distribution of $X_2$ given that $X_1 = x_1$. Lehmann [343] defined $F$ (or $X_1$ and $X_2$) to be positive regression dependent (PRD) if $X_2$ is stochastically increasing in $X_1$, that is, if

$$F_{x_1}(x_2) \ge F_{x_1'}(x_2) \quad \text{for all } x_1 \le x_1' \text{ and } x_2,$$

or, equivalently, if

$$F_{x_1}^{-1}(u) \le F_{x_1'}^{-1}(u) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.16}$$

Note that when $F_{x_1}^{-1}(u)$ is continuous in $u$ for all $x_1$, then (9.C.16) can be written as

$$F_{x_1'}\big(F_{x_1}^{-1}(u)\big) \le u \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.17}$$

This notion leads to the following definition.

Let $(X_1, X_2)$ be a bivariate random vector with distribution function $F \in \mathcal{M}(F_1, F_2)$, and let $(Y_1, Y_2)$ be another bivariate random vector with distribution function $G \in \mathcal{M}(F_1, F_2)$. Suppose that for any $x_1 \le x_1'$ we have

$$F_{x_1}^{-1}(u) \le F_{x_1'}^{-1}(v) \implies G_{x_1}^{-1}(u) \le G_{x_1'}^{-1}(v) \quad \text{for all } u, v \in [0, 1]. \tag{9.C.18}$$

Then we say that $(X_1, X_2)$ is smaller than $(Y_1, Y_2)$ in the PRD order (denoted by $(X_1, X_2) \le_{\mathrm{PRD}} (Y_1, Y_2)$ or $F \le_{\mathrm{PRD}} G$). Note that (9.C.18) can be written as

$$G_{x_1'}\big(G_{x_1}^{-1}(u)\big) \le F_{x_1'}\big(F_{x_1}^{-1}(u)\big) \quad \text{for all } x_1 \le x_1' \text{ and } u \in [0, 1]. \tag{9.C.19}$$

It can be shown that if $F_{x_1}(x_2)$ and $G_{x_1}(x_2)$ are continuous in $x_2$ for all $x_1$, then $(X_1, X_2) \le_{\mathrm{PRD}} (Y_1, Y_2)$ if, and only if, for any $x_1 \le x_1'$,

$$F_{x_1}(x_2) \ge G_{x_1}(x_2') \implies F_{x_1'}(x_2) \ge G_{x_1'}(x_2') \quad \text{for any } x_2 \text{ and } x_2'. \tag{9.C.20}$$

Note that (9.C.20) can be written as

$$G_{x_1}^{-1}\big(F_{x_1}(x_2)\big) \le G_{x_1'}^{-1}\big(F_{x_1'}(x_2)\big) \quad \text{for all } x_1 \le x_1' \text{ and } x_2, \tag{9.C.21}$$

that is, $G_{x_1}^{-1}\big(F_{x_1}(x_2)\big)$ is increasing in $x_1$ for all $x_2$.

In the continuous case, it is immediate from (9.C.17) and (9.C.19) that $F$ is PRD if, and only if, $F^I \le_{\mathrm{PRD}} F$, where $F^I$ is defined in Section 9.A, but this is true also when $F$ is not continuous.

The next result shows the relationship between the PRD, LTD, and RTI orders. We do not give its proof here.

Theorem 9.C.8. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors with absolutely continuous distribution functions $F, G \in \mathcal{M}(F_1, F_2)$. Then

$$(X_1, X_2) \le_{\mathrm{PRD}} (Y_1, Y_2) \implies (X_1, X_2) \le_{\mathrm{LTD}} (Y_1, Y_2)$$

and

$$(X_1, X_2) \le_{\mathrm{PRD}} (Y_1, Y_2) \implies (X_1, X_2) \le_{\mathrm{RTI}} (Y_1, Y_2).$$

The PRD order is not symmetric in the sense that $(X_1, X_2) \le_{\mathrm{PRD}} (Y_1, Y_2)$ does not necessarily imply that $(X_2, X_1) \le_{\mathrm{PRD}} (Y_2, Y_1)$. However, it satisfies the following property of closure under monotone transformations.

Theorem 9.C.9. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors. If $(X_1, X_2) \le_{\mathrm{PRD}} (Y_1, Y_2)$, then $(\phi(X_1), \psi(X_2)) \le_{\mathrm{PRD}} (\phi(Y_1), \psi(Y_2))$ for all increasing functions $\phi$ and $\psi$.

Example 9.C.10. Let $U$ and $V$ be any independent random variables, each having a continuous distribution. Define

$$X = U, \qquad Y_\rho = \rho U + (1 - \rho^2)^{1/2} V, \quad \text{for } -1 \le \rho \le 1.$$

Then $(X, Y_{\rho_1}) \le_{\mathrm{PRD}} (X, Y_{\rho_2})$ whenever $\rho_1 \le \rho_2$. A bivariate normal distribution is a particular case of this example when $U$ and $V$ are normally distributed.
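The bivariate normal case of Example 9.C.10 gives a simple way to see (9.C.21) at work. The sketch below assumes that $U$ and $V$ are standard normal, so that the conditional distribution of $Y_\rho$ given $X = x_1$ is normal with mean $\rho x_1$ and variance $1 - \rho^2$; the helper names (`cond_cdf`, `cond_quantile`, `prd_transform`, `increasing_in_x1`) are hypothetical. It evaluates $G_{x_1}^{-1}(F_{x_1}(x_2))$ on a grid and checks whether it is increasing in $x_1$.

```python
from statistics import NormalDist


def cond_cdf(rho, x1, x2):
    """P{Y_rho <= x2 | X = x1} for a standard bivariate normal pair (assumed case)."""
    return NormalDist(mu=rho * x1, sigma=(1.0 - rho ** 2) ** 0.5).cdf(x2)


def cond_quantile(rho, x1, u):
    """Inverse of cond_cdf in its last argument, for 0 < u < 1."""
    return NormalDist(mu=rho * x1, sigma=(1.0 - rho ** 2) ** 0.5).inv_cdf(u)


def prd_transform(rho_f, rho_g, x1, x2):
    """The quantity G_{x1}^{-1}(F_{x1}(x2)) appearing in (9.C.21)."""
    return cond_quantile(rho_g, x1, cond_cdf(rho_f, x1, x2))


def increasing_in_x1(rho_f, rho_g, tol=1e-6):
    """Grid check that prd_transform is nondecreasing in x1 for several x2."""
    x1_grid = [-2.0 + 0.25 * i for i in range(17)]
    for x2 in (-1.0, 0.0, 1.0):
        values = [prd_transform(rho_f, rho_g, x1, x2) for x1 in x1_grid]
        if any(later < earlier - tol for earlier, later in zip(values, values[1:])):
            return False
    return True


ordered = increasing_in_x1(0.2, 0.8)     # rho_1 <= rho_2
reversed_ = increasing_in_x1(0.8, 0.2)   # rho_1 > rho_2
```

In this normal case the transform is linear in $x_1$ with slope $\rho_2 - \rho_1 (1-\rho_2^2)^{1/2}/(1-\rho_1^2)^{1/2}$, which is nonnegative exactly when $\rho_1 \le \rho_2$, in agreement with the example.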

Example 9.C.11. Let $U$ and $V$ be any independent random variables, each having a continuous distribution. Define

$$X = U, \qquad Y_\alpha = \alpha U + V, \quad \text{for } -\infty \le \alpha \le \infty.$$

Then $(X, Y_{\alpha_1}) \le_{\mathrm{PRD}} (X, Y_{\alpha_2})$ whenever $\alpha_1 \le \alpha_2$.

Example 9.C.12. Let $U$ and $V$ be any independent random variables, each having a continuous distribution, such that $U$ is distributed on $(0, 1)$, while $V$ is nonnegative. Define

$$X = U, \qquad Y_\alpha = (1 + \alpha U)V, \quad \text{for } \alpha \ge -1.$$

Then $(X, Y_{\alpha_1}) \le_{\mathrm{PRD}} (X, Y_{\alpha_2})$ whenever $\alpha_1 \le \alpha_2$.

9.D The PLRD Order

Let the random variables $X_1$ and $X_2$ have the joint distribution $F$. For any two intervals $I_1$ and $I_2$ of the real line, let us denote $I_1 \le I_2$ if $x_1 \in I_1$ and $x_2 \in I_2$ imply that $x_1 \le x_2$. For any two intervals $I$ and $J$ of the real line denote $F(I, J) \equiv P\{X_1 \in I, X_2 \in J\}$. Block, Savits, and Shaked [95] essentially defined $F$ (or $X_1$ and $X_2$) to be positive likelihood ratio dependent if

$$F(I_1, J_1)F(I_2, J_2) \ge F(I_1, J_2)F(I_2, J_1), \quad \text{whenever } I_1 \le I_2 \text{ and } J_1 \le J_2. \tag{9.D.1}$$

In fact, Block, Savits, and Shaked [95] called $F$ totally positive of order 2 (TP$_2$) if (9.D.1) holds. When $F$ has a (continuous or discrete) density $f$, then (9.D.1) is equivalent to the condition that $f$ is TP$_2$, that is,

$$f(x_1, y_1)f(x_2, y_2) \ge f(x_1, y_2)f(x_2, y_1), \quad \text{whenever } x_1 \le x_2 \text{ and } y_1 \le y_2.$$

Then (9.D.1) is the same as the condition for the positive dependence notion that Lehmann [343] called positive likelihood ratio dependence (PLRD). This notion leads naturally to the order that is described below.

Let $(X_1, X_2)$ be a bivariate random vector with distribution function $F \in \mathcal{M}(F_1, F_2)$, and let $(Y_1, Y_2)$ be another bivariate random vector with distribution function $G \in \mathcal{M}(F_1, F_2)$. Suppose that

$$F(I_1, J_1)F(I_2, J_2)G(I_1, J_2)G(I_2, J_1) \le F(I_1, J_2)F(I_2, J_1)G(I_1, J_1)G(I_2, J_2), \quad \text{whenever } I_1 \le I_2 \text{ and } J_1 \le J_2, \tag{9.D.2}$$

where the generic notation $G(I, J)$ is obvious. Then we say that $(X_1, X_2)$ is smaller than $(Y_1, Y_2)$ in the PLRD order (denoted by $(X_1, X_2) \le_{\mathrm{PLRD}} (Y_1, Y_2)$ or $F \le_{\mathrm{PLRD}} G$). Since only random vectors with the same univariate marginals can be compared in the PLRD order, we will implicitly assume this fact throughout this section.

When $F$ and $G$ have (continuous or discrete) densities $f$ and $g$, then (9.D.2) is equivalent to

$$f(x_1, y_1)f(x_2, y_2)g(x_1, y_2)g(x_2, y_1) \le f(x_1, y_2)f(x_2, y_1)g(x_1, y_1)g(x_2, y_2), \quad \text{whenever } x_1 \le x_2 \text{ and } y_1 \le y_2.$$

If $\frac{\partial^2}{\partial x\,\partial y} f$ and $\frac{\partial^2}{\partial x\,\partial y} g$ exist, then (9.D.2) is equivalent to

$$f^2 \Delta g - g^2 \Delta f \ge 0,$$

where

$$\Delta f \equiv f\,\frac{\partial^2 f}{\partial x\,\partial y} - \frac{\partial f}{\partial x} \cdot \frac{\partial f}{\partial y} \quad \text{and} \quad \Delta g \equiv g\,\frac{\partial^2 g}{\partial x\,\partial y} - \frac{\partial g}{\partial x} \cdot \frac{\partial g}{\partial y}.$$
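As a worked instance of the differential criterion, the following sketch takes, purely as an assumed test case, the densities $f = 1 + \alpha_1 s$ and $g = 1 + \alpha_2 s$ on the unit square, with $s \equiv (1 - 2x)(1 - 2y)$ and $|\alpha_1|, |\alpha_2| \le 1$ (this is the family of Example 9.D.4 with uniform(0, 1) marginals). The computation for $g$ is identical to the one shown for $f$.

```latex
\begin{align*}
f(x,y) &= 1 + \alpha_1 (1-2x)(1-2y), \qquad s \equiv (1-2x)(1-2y),\\
\frac{\partial f}{\partial x} &= -2\alpha_1 (1-2y), \qquad
\frac{\partial f}{\partial y} = -2\alpha_1 (1-2x), \qquad
\frac{\partial^2 f}{\partial x\,\partial y} = 4\alpha_1,\\
\Delta f &= 4\alpha_1 (1+\alpha_1 s) - 4\alpha_1^2 s = 4\alpha_1,
\qquad \Delta g = 4\alpha_2,\\
f^2 \Delta g - g^2 \Delta f
  &= 4\left[\alpha_2 (1+\alpha_1 s)^2 - \alpha_1 (1+\alpha_2 s)^2\right]
   = 4(\alpha_2 - \alpha_1)\left(1 - \alpha_1 \alpha_2 s^2\right).
\end{align*}
```

Since $|\alpha_1 \alpha_2 s^2| \le 1$ on the unit square, the last factor is nonnegative, so under these assumptions the criterion holds whenever $\alpha_1 \le \alpha_2$.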

Obviously $F$ is PLRD if, and only if, $F^I \le_{\mathrm{PLRD}} F$, where $F^I$ is defined in Section 9.A.

Theorem 9.D.1. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors with distribution functions $F, G \in \mathcal{M}(F_1, F_2)$. Then

$$(X_1, X_2) \le_{\mathrm{PLRD}} (Y_1, Y_2) \implies (X_1, X_2) \le_{\mathrm{PQD}} (Y_1, Y_2).$$

Proof. Assume $(X_1, X_2) \le_{\mathrm{PLRD}} (Y_1, Y_2)$ and suppose that $(X_1, X_2) \not\le_{\mathrm{PQD}} (Y_1, Y_2)$. Then

$$F(x, y) > G(x, y) \tag{9.D.3}$$

for some $(x, y)$. Let $I_1 = (-\infty, x]$, $I_2 = (x, \infty)$, $J_1 = (-\infty, y]$ and $J_2 = (y, \infty)$. Then from (9.D.3), and from the fact that $F$ and $G$ have the same marginals, it follows that $F(I_1, J_1) > G(I_1, J_1)$, $F(I_2, J_2) > G(I_2, J_2)$, $G(I_1, J_2) > F(I_1, J_2)$ and $G(I_2, J_1) > F(I_2, J_1)$. Multiplying these four inequalities we obtain a contradiction to (9.D.2).

We do not know whether $(X_1, X_2) \le_{\mathrm{PLRD}} (Y_1, Y_2) \implies (X_1, X_2) \le_{\mathrm{PRD}} (Y_1, Y_2)$.

The following closure properties of the PLRD order are easy to prove.

Theorem 9.D.2. (a) Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors such that $(X_1, X_2) \le_{\mathrm{PLRD}} (Y_1, Y_2)$. Then $(\phi(X_1), \psi(X_2)) \le_{\mathrm{PLRD}} (\phi(Y_1), \psi(Y_2))$ for all increasing functions $\phi$ and $\psi$.

(b) Let $\{(X_1^{(j)}, X_2^{(j)}), j = 1, 2, \dots\}$ and $\{(Y_1^{(j)}, Y_2^{(j)}), j = 1, 2, \dots\}$ be two sequences of random vectors such that $(X_1^{(j)}, X_2^{(j)}) \to_{\mathrm{st}} (X_1, X_2)$ and $(Y_1^{(j)}, Y_2^{(j)}) \to_{\mathrm{st}} (Y_1, Y_2)$ as $j \to \infty$, where $\to_{\mathrm{st}}$ denotes convergence in distribution. If $(X_1^{(j)}, X_2^{(j)}) \le_{\mathrm{PLRD}} (Y_1^{(j)}, Y_2^{(j)})$, $j = 1, 2, \dots$, then $(X_1, X_2) \le_{\mathrm{PLRD}} (Y_1, Y_2)$.

Let $F_L$ and $F_U$ denote the Fréchet lower and upper bounds in the class $\mathcal{M}(F_1, F_2)$. Since $F_L$ assigns all its mass to some decreasing curve in $\mathbb{R}^2$, and $F_U$ assigns all its mass to some increasing curve in $\mathbb{R}^2$, it follows that for every distribution $F \in \mathcal{M}(F_1, F_2)$ we have

$$F_L \le_{\mathrm{PLRD}} F \le_{\mathrm{PLRD}} F_U.$$

By Theorem 9.D.1, this is a stronger result than (9.A.6).

The proof of the next result is similar to the proof of Theorem 9.D.1 and is therefore omitted.

Theorem 9.D.3. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors such that $(X_1, X_2) \le_{\mathrm{PLRD}} (Y_1, Y_2)$ and $(X_1, X_2) \ge_{\mathrm{PLRD}} (Y_1, Y_2)$. Then $(X_1, X_2) =_{\mathrm{st}} (Y_1, Y_2)$.

Example 9.D.4. Let $H$ and $K$ be two continuous univariate distribution functions. For $-1 \le \alpha \le 1$, define the following distribution function

$$F_\alpha(x, y) = H(x)K(y)\{1 + \alpha[1 - H(x)][1 - K(y)]\}, \quad \text{for all } x \text{ and } y.$$

Then $F_{\alpha_1} \le_{\mathrm{PLRD}} F_{\alpha_2}$ whenever $\alpha_1 \le \alpha_2$.

Example 9.D.5. Let $\phi$ and $\psi$ be two Laplace transforms of positive random variables and let the random vectors $(X_1, X_2)$ and $(Y_1, Y_2)$ be distributed according to $F$ and $G$ as in Example 9.A.3. If $\phi^{-1}\psi$ has a completely monotone derivative, then $(X_1, X_2) \le_{\mathrm{PLRD}} (Y_1, Y_2)$.

Example 9.D.6. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be bivariate normal random vectors with the same marginals, and with correlation coefficients $\rho_X$ and $\rho_Y$, respectively. If $\rho_X \le \rho_Y$, then $(X_1, X_2) \le_{\mathrm{PLRD}} (Y_1, Y_2)$.
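The density form of (9.D.2) lends itself to a direct grid check. The sketch below does this for the family of Example 9.D.4 under the simplifying assumption that $H$ and $K$ are uniform(0, 1), in which case $F_\alpha$ has density $1 + \alpha(1-2x)(1-2y)$ on the unit square. The function names are hypothetical, and a finite grid can only fail to find a violation; it does not prove the ordering.

```python
from itertools import combinations


def fgm_density(alpha, x, y):
    """Density of F_alpha when H and K are uniform(0,1) (assumed special case)."""
    return 1.0 + alpha * (1.0 - 2.0 * x) * (1.0 - 2.0 * y)


def plrd_holds_on_grid(alpha_f, alpha_g, steps=10, tol=1e-12):
    """Check the density form of (9.D.2) for all grid points x1 < x2, y1 < y2."""
    grid = [(i + 0.5) / steps for i in range(steps)]
    ordered_pairs = list(combinations(grid, 2))  # each pair is (low, high)
    for x1, x2 in ordered_pairs:
        for y1, y2 in ordered_pairs:
            lhs = (fgm_density(alpha_f, x1, y1) * fgm_density(alpha_f, x2, y2)
                   * fgm_density(alpha_g, x1, y2) * fgm_density(alpha_g, x2, y1))
            rhs = (fgm_density(alpha_f, x1, y2) * fgm_density(alpha_f, x2, y1)
                   * fgm_density(alpha_g, x1, y1) * fgm_density(alpha_g, x2, y2))
            if lhs > rhs + tol:
                return False
    return True


smaller_to_larger = plrd_holds_on_grid(-0.5, 0.7)  # alpha_1 <= alpha_2
larger_to_smaller = plrd_holds_on_grid(0.7, -0.5)  # alpha_1 > alpha_2
```

With these parameter values the check succeeds in the direction $\alpha_1 \le \alpha_2$ and finds violations in the reverse direction, consistent with Example 9.D.4.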

9.E Association Orders

The random variables $X_1$ and $X_2$ are said to be associated if $\mathrm{Cov}(K(X_1, X_2), L(X_1, X_2)) \ge 0$ for all increasing functions $K$ and $L$ for which the covariance is well defined (see (3.A.53)). This notion leads to the order that is described below.

Let $(X_1, X_2)$ be a bivariate random vector with distribution function $F \in \mathcal{M}(F_1, F_2)$, and let $(Y_1, Y_2)$ be another bivariate random vector with distribution function $G \in \mathcal{M}(F_1, F_2)$. Suppose that

$$(Y_1, Y_2) =_{\mathrm{st}} (K(X_1, X_2), L(X_1, X_2)), \tag{9.E.1}$$

for some increasing functions $K$ and $L$ which satisfy

$$K(x_1, y_1) < K(x_2, y_2),\ L(x_1, y_1) > L(x_2, y_2) \implies x_1 < x_2,\ y_1 > y_2. \tag{9.E.2}$$

Then we say that $(X_1, X_2)$ is smaller than $(Y_1, Y_2)$ in the association order (denoted by $(X_1, X_2) \le_{\mathrm{assoc}} (Y_1, Y_2)$ or $F \le_{\mathrm{assoc}} G$). Since only random vectors with the same univariate marginals are compared in the association order, we will implicitly assume this fact throughout this section.

The restriction (9.E.2) on the functions $K$ and $L$ is for the purpose of making the association order applicable in situations which are not symmetric in the $X_1$ and $X_2$ variables. [In case (9.E.2) is dropped, $(X_1, X_2) \ge_{\mathrm{assoc}} (X_2, X_1) \ge_{\mathrm{assoc}} (X_1, X_2)$.] If $K$ and $L$ are partially differentiable increasing functions, then (9.E.2) is equivalent to

$$\frac{\partial}{\partial x} K(x, y) \cdot \frac{\partial}{\partial y} L(x, y) \ge \frac{\partial}{\partial y} K(x, y) \cdot \frac{\partial}{\partial x} L(x, y) \quad \text{for all } x \text{ and } y.$$
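For linear maps the restriction (9.E.2) and its derivative form are easy to compare. The sketch below assumes, only for illustration, $K(x, y) = ax + by$ and $L(x, y) = cx + dy$ with nonnegative coefficients, so that the derivative condition reads $ad \ge bc$; the function name `violates_restriction` is hypothetical. It searches a finite grid of point pairs for a counterexample to the implication in (9.E.2).

```python
def violates_restriction(K, L, step=0.5, span=4):
    """Return True if some pair of grid points breaks the implication (9.E.2)."""
    points = [(i * step, j * step)
              for i in range(-span, span + 1)
              for j in range(-span, span + 1)]
    for x1, y1 in points:
        for x2, y2 in points:
            premise = K(x1, y1) < K(x2, y2) and L(x1, y1) > L(x2, y2)
            conclusion = x1 < x2 and y1 > y2
            if premise and not conclusion:
                return True
    return False


# a*d = 4 >= b*c = 1: the derivative condition holds.
good_pair = violates_restriction(lambda x, y: 2 * x + y, lambda x, y: x + 2 * y)

# a*d = 1 < b*c = 4: the derivative condition fails.
bad_pair = violates_restriction(lambda x, y: x + 2 * y, lambda x, y: 2 * x + y)
```

In the second case the points $(0, 0)$ and $(-1, 1)$ already break (9.E.2): $K$ increases and $L$ decreases while the first coordinate decreases.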

From the fact that increasing functions of independent random variables are associated, it follows that if $F^I \le_{\mathrm{assoc}} F$, then $F$ is the distribution function of associated random variables, where $F^I$ is defined in Section 9.A.

The following closure property is easy to prove.

Theorem 9.E.1. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors. If $(X_1, X_2) \le_{\mathrm{assoc}} (Y_1, Y_2)$, then $(\phi(X_1), \psi(X_2)) \le_{\mathrm{assoc}} (\phi(Y_1), \psi(Y_2))$ for all strictly increasing functions $\phi$ and $\psi$.

The relationship between the association and the PQD orders is described in the next result.

Theorem 9.E.2. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors with distribution functions $F, G \in \mathcal{M}(F_1, F_2)$. Then

$$(X_1, X_2) \le_{\mathrm{assoc}} (Y_1, Y_2) \implies (X_1, X_2) \le_{\mathrm{PQD}} (Y_1, Y_2).$$

Proof. Denote by F and G the distribution functions of (X1 , X2 ) and (Y1 , Y2 ), respectively. By assumption, (Y1 , Y2 ) =st (K(X1 , X2 ), L(X1 , X2 )) where K and L are increasing and satisfy (9.E.2). Fix a pair (x1 , x2 ). First suppose that K(x1 , x2 ) ≤ x1 and that L(x1 , x2 ) ≤ x2 . Then P {Y1 ≤ x1 , Y2 ≤ x2 } ≥ P {Y1 ≤ K(x1 , x2 ), Y2 ≤ L(x1 , x2 )} = P {K(X1 , X2 ) ≤ K(x1 , x2 ), L(X1 , X2 ) ≤ L(x1 , x2 )} ≥ P {X1 ≤ x1 , X2 ≤ x2 }, where the second inequality follows from the increasingness of K and of L. Thus (9.A.3) holds in this case. Next suppose that K(x1 , x2 ) ≤ x1 and that L(x1 , x2 ) > x2 . Then P {Y1 > x1 , Y2 < x2 } ≤ P {Y1 > K(x1 , x2 ), Y2 < L(x1 , x2 )} = P {K(X1 , X2 ) > K(x1 , x2 ), L(X1 , X2 ) < L(x1 , x2 )} ≤ P {X1 > x1 , X2 < x2 }, where the second inequality follows from (9.E.2). From the fact that (X1 , X2 ) and (Y1 , Y2 ) have the same univariate marginals it is seen that (9.A.3) holds in this case too. For the remaining two cases the inequality (9.A.3) follows in a similar way.

The relationship between the association and the PRD orders is described next.

Theorem 9.E.3. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors with distribution functions $F, G \in \mathcal{M}(F_1, F_2)$ such that $F_{X_2|X_1}(x_2 \mid x_1)$ and $G_{Y_2|Y_1}(x_2 \mid x_1)$ are continuous in $x_2$ for all $x_1$. Then

$$(X_1, X_2) \le_{\mathrm{PRD}} (Y_1, Y_2) \implies (X_1, X_2) \le_{\mathrm{assoc}} (Y_1, Y_2).$$

Proof. Suppose that $(X_1, X_2) \le_{\mathrm{PRD}} (Y_1, Y_2)$. Define $K$ and $L$ by $K(x_1, x_2) \equiv x_1$ and $L(x_1, x_2) \equiv G^{-1}_{Y_2|Y_1}\big(F_{X_2|X_1}(x_2 \mid x_1) \,\big|\, x_1\big)$. Obviously $K$ is an increasing function. Also, obviously $L(x_1, x_2)$ is increasing in $x_2$. Furthermore, from (9.C.21) it is seen that $L(x_1, x_2)$ is also increasing in $x_1$, and that (9.E.2) holds. Now note that since $X_1 =_{\mathrm{st}} Y_1$, we have, using the continuity assumptions stated, that $(Y_1, Y_2) =_{\mathrm{st}} (K(X_1, X_2), L(X_1, X_2))$. That is, $(X_1, X_2)$ and $(Y_1, Y_2)$ satisfy (9.E.1) and (9.E.2).

Example 9.E.4. Let $U$ and $V$ be any independent random variables. Define

$$X_\alpha = (1 - \alpha)U + \alpha V, \qquad Y = U, \quad \text{for } \alpha \in [0, 1].$$

Then $(X_{\alpha_1}, Y) \le_{\mathrm{assoc}} (X_{\alpha_2}, Y)$ whenever $\alpha_1 \le \alpha_2$.

Example 9.E.5. Let $U$ and $V$ be any independent random variables. Define

$$X_\alpha = (1 - \alpha)U + \alpha V, \qquad Y = \alpha U + (1 - \alpha)V, \quad \text{for } \alpha \in [0, \tfrac{1}{2}].$$

Then $(X_{\alpha_1}, Y) \le_{\mathrm{assoc}} (X_{\alpha_2}, Y)$ whenever $\alpha_1 \le \alpha_2$.

Example 9.E.6. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ have bivariate normal distributions with correlation coefficients $\rho_1$ and $\rho_2$, respectively. Then $(X_1, X_2) \le_{\mathrm{assoc}} (Y_1, Y_2)$ if, and only if, $-1 \le \rho_1 \le \rho_2 \le 1$.

Capéraà, Fougères, and Genest [119] introduced an order that is related to the association order. In order to define it we first need to introduce some notation. Let $(X_1, X_2)$ be a random vector with a continuous distribution function $F \in \mathcal{M}(F_1, F_2)$. Define $V_F \equiv F(X_1, X_2)$, and let $K_F$ denote the distribution function of $V_F$. For example, if the distribution function of $(X_1, X_2)$ is the Fréchet upper bound $F_U \in \mathcal{M}(F_1, F_2)$ (see page 387), then $K_{F_U}(v) = v$, $v \in [0, 1]$. If the distribution function of $(X_1, X_2)$ is the Fréchet lower bound $F_L \in \mathcal{M}(F_1, F_2)$, then $K_{F_L}(v) = 1$, $v \in [0, 1]$. Finally, if $X_1$ and $X_2$ are independent, with distribution function $F^I \in \mathcal{M}(F_1, F_2)$, then $K_{F^I}(v) = v - v \log v$, $v \in [0, 1]$. These facts suggest the following order.

Let $(X_1, X_2)$ and $(Y_1, Y_2)$ be two random vectors with continuous distribution functions $F, G \in \mathcal{M}(F_1, F_2)$. Suppose that

$$K_F(v) \ge K_G(v), \quad \text{for all } v \in [0, 1].$$

Then we say that $(X_1, X_2)$ is smaller than $(Y_1, Y_2)$ in the Capéraà-Fougères-Genest order (denoted by $(X_1, X_2) \le_{\mathrm{CFG}} (Y_1, Y_2)$ or $F \le_{\mathrm{CFG}} G$).

Capéraà, Fougères, and Genest [119] showed that for every continuous distribution function $F \in \mathcal{M}(F_1, F_2)$ we have

$$F_L \le_{\mathrm{CFG}} F \le_{\mathrm{CFG}} F_U.$$

They also proved, under some regularity conditions, that

$$(X_1, X_2) \le_{\mathrm{assoc}} (Y_1, Y_2) \implies (X_1, X_2) \le_{\mathrm{CFG}} (Y_1, Y_2).$$

However, Capéraà, Fougères, and Genest [119] showed that $\le_{\mathrm{CFG}}\ \not\Longrightarrow\ \le_{\mathrm{PQD}}$, whereas Nelsen, Quesada-Molina, Rodríguez-Lallena, and Úbeda-Flores [433] showed that $\le_{\mathrm{PQD}}\ \not\Longrightarrow\ \le_{\mathrm{CFG}}$.

Nelsen, Quesada-Molina, Rodríguez-Lallena, and Úbeda-Flores [432] introduced some generalizations of the order $\le_{\mathrm{CFG}}$.

Another related order of interest is based on the notion of weak positive association which is defined in (9.A.22). Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two random vectors that have the same univariate marginals, and that satisfy

$$\mathrm{Cov}\big(h_1(X_{i_1}, X_{i_2}, \dots, X_{i_k}),\ h_2(X_{j_1}, X_{j_2}, \dots, X_{j_{n-k}})\big) \le \mathrm{Cov}\big(h_1(Y_{i_1}, Y_{i_2}, \dots, Y_{i_k}),\ h_2(Y_{j_1}, Y_{j_2}, \dots, Y_{j_{n-k}})\big)$$

for all choices of disjoint subsets $\{i_1, i_2, \dots, i_k\}$ and $\{j_1, j_2, \dots, j_{n-k}\}$ of $\{1, 2, \dots, n\}$, and for all increasing functions $h_1$ and $h_2$ for which the above covariances are defined. Then $X$ is said to be smaller than $Y$ in the weak association order (denoted by $X \le_{\text{w-assoc}} Y$).

Some closure properties of the weak association order are described in the next theorem.

Theorem 9.E.7. (a) Let $(X_1, X_2, \dots, X_n)$ and $(Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If $(X_1, X_2, \dots, X_n) \le_{\text{w-assoc}} (Y_1, Y_2, \dots, Y_n)$, then

$$(g_1(X_1), g_2(X_2), \dots, g_n(X_n)) \le_{\text{w-assoc}} (g_1(Y_1), g_2(Y_2), \dots, g_n(Y_n))$$

whenever $g_i : \mathbb{R} \to \mathbb{R}$, $i = 1, 2, \dots, n$, are all increasing.

(b) Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two $n$-dimensional random vectors. If $X \le_{\text{w-assoc}} Y$, then $X_I \le_{\text{w-assoc}} Y_I$ for each $I \subseteq \{1, 2, \dots, n\}$. That is, the weak association order is closed under marginalization.

An important and useful property of the weak association order is the following.

Theorem 9.E.8. Let $X$ and $Y$ be two random vectors with the same univariate marginals. Then

$$X \le_{\text{w-assoc}} Y \implies X \le_{\mathrm{sm}} Y.$$

Remark 9.E.9. Note that if $X = (X_1, X_2, \dots, X_n)$ is a vector of weakly positively associated random variables, as defined in (9.A.22), and if $Y = (Y_1, Y_2, \dots, Y_n)$ is a vector of independent random variables such that, marginally, $X_i =_{\mathrm{st}} Y_i$, $i = 1, 2, \dots, n$, then $X \ge_{\text{w-assoc}} Y$. Similarly, if $X$ is a vector of negatively associated random variables, as defined in (3.A.54), and if $Y$ is a vector of independent random variables such that, marginally, $X_i =_{\mathrm{st}} Y_i$, $i = 1, 2, \dots, n$, then $X \le_{\text{w-assoc}} Y$. Thus it is seen that Theorem 9.E.7 is a stronger result than Theorem 9.A.23.
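Returning to the function $K_F$ used in the Capéraà-Fougères-Genest order earlier in this section, the closed form $K_{F^I}(v) = v - v \log v$ for the independent case can be checked numerically. The sketch below rests on the observation that, with continuous marginals, $V_{F^I} = F_1(X_1)F_2(X_2)$ is a product of two independent uniform(0, 1) variables; the function name and the midpoint-rule discretization are choices made for this illustration only.

```python
import math


def kendall_cdf_independent(v, cells=20000):
    """Approximate P{U1 * U2 <= v} for independent uniform(0,1) U1 and U2.

    Conditioning on U1 = u gives P{U2 <= v / u} = min(1, v / u), which is
    averaged over u with a midpoint rule.
    """
    total = 0.0
    for i in range(cells):
        u = (i + 0.5) / cells
        total += min(1.0, v / u)
    return total / cells


def kendall_cdf_closed_form(v):
    """The closed form v - v log v quoted in the text, for 0 < v <= 1."""
    return v - v * math.log(v)


max_gap = max(abs(kendall_cdf_independent(v) - kendall_cdf_closed_form(v))
              for v in (0.1, 0.5, 0.9))
```

The numerical and closed-form values agree to within the discretization error. Since $v - v \log v$ lies between $v$ and $1$ on $(0, 1]$, this also illustrates $F_L \le_{\mathrm{CFG}} F^I \le_{\mathrm{CFG}} F_U$ for the independent case.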

9.F The PDD Order

Let the random variables $X_1$ and $X_2$ have the symmetric (or exchangeable, or interchangeable) joint distribution $F$. Shaked [501] defines $F$ (or $X_1$ and $X_2$) to be positive definite dependent (PDD) if $F$ is a positive definite kernel on $S \times S$, where $S$ is the support of $X_1$ (and therefore, by symmetry, $S$ is also the support of $X_2$). Shaked [501] has shown that $X_1$ and $X_2$ are PDD if, and only if,

$$\mathrm{Cov}(\phi(X_1), \phi(X_2)) \ge 0 \quad \text{for every real function } \phi, \tag{9.F.1}$$

provided the covariance is well defined. This notion naturally leads to the order that is defined below.

Let $(X_1, X_2)$ be a bivariate random vector with distribution function $F \in \mathcal{M}^{(s)}(\hat F)$, where $\mathcal{M}^{(s)}(\hat F)$ is the class of all the bivariate symmetric distributions with univariate marginals $\hat F$. Let $(Y_1, Y_2)$ be another bivariate random vector with distribution function $G \in \mathcal{M}^{(s)}(\hat F)$. Suppose that

$$\mathrm{Cov}(\phi(X_1), \phi(X_2)) \le \mathrm{Cov}(\phi(Y_1), \phi(Y_2)) \quad \text{for every real function } \phi, \tag{9.F.2}$$

provided the covariances are well defined. Then we say that $(X_1, X_2)$ is smaller than $(Y_1, Y_2)$ in the PDD order (denoted by $(X_1, X_2) \le_{\mathrm{PDD}} (Y_1, Y_2)$ or $F \le_{\mathrm{PDD}} G$). Since only symmetric random vectors with the same univariate marginals are compared in the PDD order, we will implicitly assume this fact throughout this section.

Since $E\phi(X_1) = E\phi(X_2) = E\phi(Y_1) = E\phi(Y_2)$ for every real function $\phi$, it follows that $(X_1, X_2) \le_{\mathrm{PDD}} (Y_1, Y_2)$ if, and only if,

$$E\phi(X_1)\phi(X_2) \le E\phi(Y_1)\phi(Y_2) \quad \text{for every real function } \phi, \tag{9.F.3}$$

provided the expectations exist. Thus, if $(X_1, X_2) \le_{\mathrm{PDD}} (Y_1, Y_2)$, then

$$P\{X_1 \in A, X_2 \in A\} \le P\{Y_1 \in A, Y_2 \in A\}$$

for all Borel-measurable sets $A$ in $\mathbb{R}$.

Another characterization of the PDD order is given in the next theorem.

Theorem 9.F.1. Let $F$ and $G$ be two symmetric bivariate distributions in $\mathcal{M}^{(s)}(\hat F)$. Then $F \le_{\mathrm{PDD}} G$ if, and only if, $G(x, y) - F(x, y)$ is a positive definite kernel.

From (9.F.1) and (9.F.3) it is easily seen that $F$ is PDD if, and only if, $F^I \le_{\mathrm{PDD}} F$, where $F^I$ is defined in Section 9.A.

A powerful closure property of the PDD order is described in the next theorem.

Theorem 9.F.2. Suppose that the four random vectors $(X_1, X_2)$, $(Y_1, Y_2)$, $(U_1, U_2)$ and $(V_1, V_2)$ satisfy

$$(X_1, X_2) \le_{\mathrm{PDD}} (Y_1, Y_2) \quad \text{and} \quad (U_1, U_2) \le_{\mathrm{PDD}} (V_1, V_2), \tag{9.F.4}$$

and suppose that $(X_1, X_2)$ and $(U_1, U_2)$ are independent, and also that $(Y_1, Y_2)$ and $(V_1, V_2)$ are independent. Then

$$(\phi(X_1, U_1), \phi(X_2, U_2)) \le_{\mathrm{PDD}} (\phi(Y_1, V_1), \phi(Y_2, V_2)),$$

for every increasing function $\phi$.

In particular, if (9.F.4) holds, then the PDD order is closed under convolutions, that is, $(X_1 + U_1, X_2 + U_2) \le_{\mathrm{PDD}} (Y_1 + V_1, Y_2 + V_2)$.

Using (9.F.3) it is easy to verify the following closure properties.

Theorem 9.F.3. (a) Let $\{(X_1^{(j)}, X_2^{(j)}), j = 1, 2, \dots\}$ and $\{(Y_1^{(j)}, Y_2^{(j)}), j = 1, 2, \dots\}$ be two sequences of random vectors such that $(X_1^{(j)}, X_2^{(j)}) \to_{\mathrm{st}} (X_1, X_2)$ and $(Y_1^{(j)}, Y_2^{(j)}) \to_{\mathrm{st}} (Y_1, Y_2)$ as $j \to \infty$, where $\to_{\mathrm{st}}$ denotes convergence in distribution. If $(X_1^{(j)}, X_2^{(j)}) \le_{\mathrm{PDD}} (Y_1^{(j)}, Y_2^{(j)})$, $j = 1, 2, \dots$, then $(X_1, X_2) \le_{\mathrm{PDD}} (Y_1, Y_2)$.

(b) Let $(X_1, X_2)$, $(Y_1, Y_2)$, and $\Theta$ be random vectors such that $[(X_1, X_2) \mid \Theta = \theta] \le_{\mathrm{PDD}} [(Y_1, Y_2) \mid \Theta = \theta]$ for all $\theta$ in the support of $\Theta$. Then $(X_1, X_2) \le_{\mathrm{PDD}} (Y_1, Y_2)$. That is, the PDD order is closed under mixtures.

Example 9.F.4. Let $(X_1, X_2)$ and $(Y_1, Y_2)$ have exchangeable bivariate normal distributions with common marginals and correlation coefficients $\rho_1$ and $\rho_2$, respectively. If $0 \le \rho_1 \le \rho_2 \le 1$, then $(X_1, X_2) \le_{\mathrm{PDD}} (Y_1, Y_2)$.

If $(X_1, X_2)$ and $(Y_1, Y_2)$ have distributions $F$ and $G$ which are not symmetric, but still have the same common marginals (that is, $X_1$, $X_2$, $Y_1$, and $Y_2$ are all identically distributed), then the PDD order can still be defined on the symmetrizations $\tilde F(x, y) = \tfrac{1}{2}[F(x, y) + F(y, x)]$ and $\tilde G(x, y) = \tfrac{1}{2}[G(x, y) + G(y, x)]$ of $F$ and $G$.

Hu and Joe [234] applied the idea of the PDD order to stationary reversible Markov chains $\{X_1, X_2, \dots\}$. They showed for such chains that, if $X_1$ and $X_2$ are PDD (in the sense (9.F.1)), then dependence (in the sense of the PDD order) is decreasing with the lag, namely,

$$F_{12} \ge_{\mathrm{PDD}} F_{13} \ge_{\mathrm{PDD}} \cdots \ge_{\mathrm{PDD}} F_{1n} \ge_{\mathrm{PDD}} \cdots \ge_{\mathrm{PDD}} F^{(2)},$$

where the $F_{1j}$'s and $F^{(2)}$ are as defined in (9.A.12).

An $n$-variate extension of the PDD order for the case when $n \ge 2$ is suggested by (9.F.3). Explicitly, let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ have distribution functions with common marginals. Then we can say that $X$ is less positively dependent than $Y$ if

$$E\prod_{i=1}^{n} \phi(X_i) \le E\prod_{i=1}^{n} \phi(Y_i) \quad \text{for every nonnegative real function } \phi. \tag{9.F.5}$$

Note that for this definition it is not required that $X$ and $Y$ have exchangeable distribution functions; it is only required that $X$ and $Y$ have the same common marginals. One reason for the usefulness of inequality (9.F.5) is that it implies that

$$P\{X_1 \in A, X_2 \in A, \dots, X_n \in A\} \le P\{Y_1 \in A, Y_2 \in A, \dots, Y_n \in A\}$$

for all Borel-measurable sets $A$ in $\mathbb{R}$.
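On a finite support, condition (9.F.3) becomes a statement about a quadratic form, which makes the PDD order easy to probe numerically. The sketch below is a hypothetical two-point example: both vectors take values in $\{0, 1\}$ with $P\{X_i = 0\} = p$, and the symmetric joint pmf is indexed by $a = P\{X_1 = 0, X_2 = 0\}$. The function names and the random sampling of $\phi$ are choices made for this illustration; sampling can refute the ordering but cannot prove it.

```python
import random


def joint_pmf(p, a):
    """Symmetric pmf on {0,1}^2 with P{X1=0} = P{X2=0} = p and P{X1=0, X2=0} = a."""
    return [[a, p - a], [p - a, 1.0 - 2.0 * p + a]]


def expected_product(pmf, phi):
    """E[phi(X1) * phi(X2)], where phi is given by its values (phi(0), phi(1))."""
    return sum(pmf[i][j] * phi[i] * phi[j] for i in range(2) for j in range(2))


def pdd_smaller(pmf_x, pmf_y, trials=1000, seed=0, tol=1e-12):
    """Test (9.F.3) on randomly drawn functions phi; False means a violation was found."""
    rng = random.Random(seed)
    for _ in range(trials):
        phi = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        if expected_product(pmf_x, phi) > expected_product(pmf_y, phi) + tol:
            return False
    return True


independent = joint_pmf(0.5, 0.25)   # a = p^2: independence
dependent = joint_pmf(0.5, 0.40)     # larger a: more mass on the diagonal

forward = pdd_smaller(independent, dependent)
backward = pdd_smaller(dependent, independent)
```

In this example the difference of the two expectations equals $(a_Y - a_X)\{\phi(0) - \phi(1)\}^2$, so the ordering holds exactly when $a_X \le a_Y$. Equivalently, the difference of the two pmf matrices is a nonnegative multiple of a positive semidefinite matrix, a finite-support analogue of the characterization in Theorem 9.F.1.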

9.G Ordering Exchangeable Distributions

Let $X = (X_1, X_2, \dots, X_n)$ and $Y = (Y_1, Y_2, \dots, Y_n)$ be two random vectors with exchangeable distributions. Let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ and $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}$ be the corresponding order statistics. Intuitively, if $Y$ is "more positively dependent" than $X$ (or, alternatively, $Y$ is "less dispersed" than $X$), then we can expect the $Y_i$'s to "hang together" more than the $X_i$'s. For example, we can expect quantities such as $X_{(n)} - X_{(1)}$ or $X_{(n)} + X_{(n-1)} - X_{(2)} - X_{(1)}$ to be stochastically larger than $Y_{(n)} - Y_{(1)}$ or $Y_{(n)} + Y_{(n-1)} - Y_{(2)} - Y_{(1)}$. This observation naturally leads to the following definitions.

Let $X$ and $Y$ be two $n$-dimensional random vectors with exchangeable distribution functions and with the same common marginals. We will write $X \le_{\text{pd-1}} Y$ if

$$\sum_{i=1}^{n} c_i X_{(i)} \ge_{\mathrm{st}} \sum_{i=1}^{n} c_i Y_{(i)} \quad \text{whenever } \sum_{i=1}^{n} c_i = 0. \tag{9.G.1}$$

When the interest is in the unordered components of the random vectors, then the following definition is useful. We will write $X \le_{\text{pd-2}} Y$ if

$$\sum_{i=1}^{n} c_i X_i \ge_{\mathrm{st}} \sum_{i=1}^{n} c_i Y_i \quad \text{whenever } \sum_{i=1}^{n} c_i = 0. \tag{9.G.2}$$

Recall from page 2 the deﬁnition of the majorization order a ≺ b among n-dimensional vectors. For any random variable W , let FW denote the distribution function of W . We will write X ≤pd-3 Y if (FX(1) (x), FX(2) (x), . . . , FX(n) (x)) (FY(1) (x), FY(2) (x), . . . , FY(n) (x))

for all x. (9.G.3)

It is easy to verify that (9.G.3) is equivalent to

$$\big(E\phi(X_{(1)}), E\phi(X_{(2)}), \dots, E\phi(X_{(n)})\big) \succ \big(E\phi(Y_{(1)}), E\phi(Y_{(2)}), \dots, E\phi(Y_{(n)})\big)$$

for all monotone functions φ for which the expectations exist. A further insight into the meaning of (9.G.3) can be obtained by rewriting it as the set of inequalities

$$E\Big[\sum_{i=1}^{j} I_{(-\infty,x]}(X_{(i)})\Big] \ \ge\ E\Big[\sum_{i=1}^{j} I_{(-\infty,x]}(Y_{(i)})\Big], \quad \text{for } j = 1, 2, \dots, n, \text{ and all } x, \tag{9.G.4}$$

with equality holding for j = n. That is, for each j, the expected value of the number of order statistics which are less than or equal to x among the first j ordered Xi’s is at least as large as the corresponding expected value based on the ordered Yi’s.

When one is concerned only with the expectations of the order statistics, then the following stochastic order is useful. We will write X ≤pd-4 Y if

$$\big(EX_{(1)}, EX_{(2)}, \dots, EX_{(n)}\big) \succ \big(EY_{(1)}, EY_{(2)}, \dots, EY_{(n)}\big). \tag{9.G.5}$$
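For finitely supported vectors, conditions (9.G.3) and (9.G.5) can be checked exactly. The following is a minimal sketch, not taken from the text: it reads both conditions as "the vector built from X majorizes the vector built from Y", and it assumes that a distribution is given as a dictionary mapping outcome tuples to probabilities. The function names (`majorizes`, `pd3_leq`, `pd4_leq`) and the two toy distributions are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def majorizes(a, b):
    """Return True if a majorizes b: equal totals, and every partial sum of
    the decreasingly sorted a is at least the matching partial sum of b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    a_desc = sorted(a, reverse=True)
    b_desc = sorted(b, reverse=True)
    if sum(a_desc) != sum(b_desc):
        return False
    partial_a = partial_b = 0
    for x, y in zip(a_desc, b_desc):
        partial_a += x
        partial_b += y
        if partial_a < partial_b:
            return False
    return True


def expected_order_statistics(dist):
    """Return [E X_(1), ..., E X_(n)] for dist = {outcome tuple: probability}."""
    n = len(next(iter(dist)))
    means = [Fraction(0)] * n
    for outcome, prob in dist.items():
        for i, value in enumerate(sorted(outcome)):
            means[i] += prob * value
    return means


def order_statistic_cdfs(dist, x):
    """Return [F_{X_(1)}(x), ..., F_{X_(n)}(x)]."""
    n = len(next(iter(dist)))
    cdfs = [Fraction(0)] * n
    for outcome, prob in dist.items():
        for i, value in enumerate(sorted(outcome)):
            if value <= x:
                cdfs[i] += prob
    return cdfs


def pd4_leq(dist_x, dist_y):
    """Check (9.G.5), read as: expected order statistics of X majorize those of Y."""
    return majorizes(expected_order_statistics(dist_x),
                     expected_order_statistics(dist_y))


def pd3_leq(dist_x, dist_y):
    """Check (9.G.3). The CDFs are step functions that jump only at support
    points, so it is enough to test x at every support point."""
    support = {v for d in (dist_x, dist_y) for outcome in d for v in outcome}
    return all(majorizes(order_statistic_cdfs(dist_x, x),
                         order_statistic_cdfs(dist_y, x))
               for x in support)


# Illustrative data (an assumption): two exchangeable vectors with fair
# Bernoulli marginals, one independent and one with identical components.
independent = {outcome: Fraction(1, 4) for outcome in product((0, 1), repeat=2)}
comonotone = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(pd3_leq(independent, comonotone), pd4_leq(independent, comonotone))
print(pd3_leq(comonotone, independent), pd4_leq(comonotone, independent))
```

In this toy case the independent pair is below the comonotone pair in both orders and not conversely, which matches the intuition that the larger vector is the one whose components "hang together" more. Exact rational arithmetic is used so that the equal-totals requirement of majorization is not disturbed by rounding.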

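The next result describes some interrelationships among the orders ≤pd-k, k = 1, 2, 3, 4.

Theorem 9.G.1. Let X and Y be two n-dimensional random vectors with exchangeable distribution functions and with the same common marginals. Then

$$\begin{array}{ccccc}
 & & X \le_{\text{pd-1}} Y & \Longrightarrow & X \le_{\text{pd-2}} Y \\
 & & \Downarrow & & \\
X \le_{\text{pd-3}} Y & \Longrightarrow & X \le_{\text{pd-4}} Y & &
\end{array}$$

Proof. First suppose that X ≤pd-1 Y. Let π = (π1, π2, . . . , πn) denote a permutation of {1, 2, . . . , n}, and let $\sum_{\pi}$ denote a summation over all such permutations. Then, by exchangeability, for any real z, and whenever $\sum_{i=1}^{n} c_i = 0$, we have

$$\begin{aligned}
P\Big\{\sum_{i=1}^{n} c_i X_i > z\Big\}
&= \frac{1}{n!}\sum_{\pi} P\Big\{\sum_{i=1}^{n} c_i X_i > z \,\Big|\, X_{\pi_1} \le X_{\pi_2} \le \cdots \le X_{\pi_n}\Big\} \\
&= \frac{1}{n!}\sum_{\pi} P\Big\{\sum_{i=1}^{n} c_{\pi_i} X_{(i)} > z\Big\} \\
&\ge \frac{1}{n!}\sum_{\pi} P\Big\{\sum_{i=1}^{n} c_{\pi_i} Y_{(i)} > z\Big\} \\
&= P\Big\{\sum_{i=1}^{n} c_i Y_i > z\Big\},
\end{aligned}$$

and (9.G.2) follows.

If we denote $a_i = EX_{(i)}$ and $b_i = EY_{(i)}$, i = 1, 2, . . . , n, then from (9.G.1), applied with $c_i = 1$, $c_{i-1} = -1$, and all other coefficients equal to zero, it follows that $a_i - a_{i-1} \ge b_i - b_{i-1}$, i = 2, 3, . . . , n. Also, $\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} b_i$. Now it is easily seen that $a \succ b$, and thus (9.G.5) holds.

The proof of X ≤pd-3 Y ⇒ X ≤pd-4 Y is easy (see, for example, Marshall and Olkin [383, page 350]).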
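The step "it is easily seen" in the proof above, from the difference inequalities to majorization of a over b, can be spelled out as follows. This is a sketch of one possible argument, not the authors' own; the symbol $d_i$ is introduced here only for convenience, and the argument uses the fact that $a_1 \le \cdots \le a_n$ and $b_1 \le \cdots \le b_n$, since these are expectations of order statistics.

```latex
% Notation (an assumption of this sketch): d_i = a_i - b_i, i = 1, ..., n.
% The difference inequalities between consecutive indices say that d is nondecreasing:
\[
  d_1 \le d_2 \le \cdots \le d_n, \qquad \sum_{i=1}^{n} d_i = 0 .
\]
% Because a and b are nondecreasing, the k largest entries of each vector are
% its last k entries. The average of the last k terms of a nondecreasing
% sequence is at least the overall average, so for k = 1, ..., n:
\[
  \sum_{i=n-k+1}^{n} a_i - \sum_{i=n-k+1}^{n} b_i
  = \sum_{i=n-k+1}^{n} d_i
  \;\ge\; \frac{k}{n} \sum_{i=1}^{n} d_i = 0 ,
\]
% with equality for k = n. These are exactly the partial-sum conditions
% stating that a majorizes b.
```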

Some closure properties of the above orders are described in the following theorem.

Theorem 9.G.2.
(a) For j = 1, 2, . . . , let X^(j) and Y^(j) be two random vectors with exchangeable distribution functions and with the same common marginals such that X^(j) →st X and Y^(j) →st Y as j → ∞, where →st denotes convergence in distribution. If X^(j) ≤pd-k Y^(j), j = 1, 2, . . . , then X ≤pd-k Y, k = 1, 2, 3.
(b) Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two n-dimensional random vectors with exchangeable distribution functions and with the same common marginals. If X ≤pd-k Y, then $X_I$ ≤pd-k $Y_I$ for each I ⊆ {1, 2, . . . , n}. That is, the ≤pd-k order is closed under marginalization, k = 1, 2, 3, 4.
(c) Let (X1, X2, . . . , Xn) and (Y1, Y2, . . . , Yn) be as in part (b). If (X1, X2, . . . , Xn) ≤pd-k (Y1, Y2, . . . , Yn), then (aX1 + b, aX2 + b, . . . , aXn + b) ≤pd-k (aY1 + b, aY2 + b, . . . , aYn + b) for any constants a and b, k = 1, 2, 3, 4.
(d) Let X and Y be as in part (b), and let Θ be another random vector. If [X | Θ = θ] ≤pd-k [Y | Θ = θ] for all θ in the support of Θ, then X ≤pd-k Y. That is, the ≤pd-k order is closed under mixtures, k = 1, 2, 3, 4.

In the bivariate case we have the following relationship.

Theorem 9.G.3. Let (X1, X2) and (Y1, Y2) be two random vectors with exchangeable distribution functions with common marginals. Then

(X1, X2) ≤PDD (Y1, Y2) ⟹ (X1, X2) ≤pd-3 (Y1, Y2).

Proof. Suppose that (X1, X2) ≤PDD (Y1, Y2). Then, for any real z we have

$$\begin{aligned}
F_{X_{(1)}}(z) &= 1 - P\{\min(X_1, X_2) > z\} = 1 - E\big[I_{(z,\infty)}(X_1)\, I_{(z,\infty)}(X_2)\big] \\
&\ge 1 - E\big[I_{(z,\infty)}(Y_1)\, I_{(z,\infty)}(Y_2)\big] = F_{Y_{(1)}}(z),
\end{aligned}$$

where the inequality follows from (9.F.3). Now, since $F_{X_{(1)}}(z) + F_{X_{(2)}}(z) = F_{Y_{(1)}}(z) + F_{Y_{(2)}}(z)$, it follows that $(F_{X_{(1)}}(z), F_{X_{(2)}}(z)) \succ (F_{Y_{(1)}}(z), F_{Y_{(2)}}(z))$, which is (9.G.3).
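The proof of Theorem 9.G.3 relies on the sum of the distribution functions of the minimum and the maximum being the same for both vectors. This holds because that sum equals the sum of the two marginal distribution functions (an inclusion-exclusion identity), and the marginals are common. The sketch below checks the identity numerically on two toy bivariate distributions; the dictionary representation, the function names, and the example data are assumptions made for illustration, not part of the text.

```python
from fractions import Fraction


def min_max_cdfs(dist, z):
    """Return (F_{X_(1)}(z), F_{X_(2)}(z)) for dist = {(x1, x2): probability}."""
    f_min = sum((p for (x1, x2), p in dist.items() if min(x1, x2) <= z), Fraction(0))
    f_max = sum((p for (x1, x2), p in dist.items() if max(x1, x2) <= z), Fraction(0))
    return f_min, f_max


def marginal_cdf_sum(dist, z):
    """Return F_{X_1}(z) + F_{X_2}(z), which depends on the marginals only."""
    f1 = sum((p for (x1, _), p in dist.items() if x1 <= z), Fraction(0))
    f2 = sum((p for (_, x2), p in dist.items() if x2 <= z), Fraction(0))
    return f1 + f2


# Illustrative data (an assumption): fair Bernoulli marginals in both cases.
quarter = Fraction(1, 4)
independent = {(0, 0): quarter, (0, 1): quarter, (1, 0): quarter, (1, 1): quarter}
comonotone = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

for name, dist in (("independent", independent), ("comonotone", comonotone)):
    f_min, f_max = min_max_cdfs(dist, 0)
    print(name, f_min, f_max, f_min + f_max == marginal_cdf_sum(dist, 0))
```

For both toy distributions the two order-statistic distribution functions at z = 0 sum to the same marginal total, while the distribution function of the minimum is larger for the independent pair. Together these two facts give the two-dimensional majorization used at the end of the proof.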

A relationship between the star order (see Section 4.B) and the order ≤pd-4 is described next.

Theorem 9.G.4. Let X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) be two vectors, each consisting of independent and identically distributed nonnegative random variables. If X1 ≤∗ Y1, and if EX1 = EY1, then X ≤pd-4 Y.

Example 9.G.5. If X1, X2, . . . , Xn are conditionally independent and identically distributed (then they are exchangeable), and if Y1, Y2, . . . , Yn are independent and identically distributed, and if all the Xi’s and Yi’s have the same marginal distribution, then X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn) satisfy X ≤pd-3 Y and, of course, also X ≤pd-4 Y; this is shown in Shaked and Tong [523]. Hu and Hu [233] have shown that if X1, X2, . . . , Xn have some other properties of positive or negative dependence, and if Y1, Y2, . . . , Yn are independent, and if Xi =st Yi for i = 1, 2, . . . , n, then the above (that is, X ≤pd-3 Y and X ≤pd-4 Y) also hold. Ebrahimi and Spizzichino [178] obtained conditions on the expected values of the order statistics that are associated with X = (X1, X2, . . . , Xn) and Y = (Y1, Y2, . . . , Yn), under which X ≤pd-4 Y.


Paul [442] gave conditions under which Xi ≤cx Yi, i = 1, 2 (where (X1, X2) and (Y1, Y2) are some bivariate random vectors) imply (Y1, Y2) ≤pd-4 (X1, X2) (in fact the conclusion of Paul [442] is stated as E max{X1, X2} ≤ E max{Y1, Y2}, but since EXi = EYi, i = 1, 2, the stated conclusion is the same as (Y1, Y2) ≤pd-4 (X1, X2)). Müller [414], however, noticed that in Paul [442] there was a subtle mistake which invalidated his Theorem 1. Müller [414] provided other conditions under which the conclusion above is valid.

9.H Complements

A good review of the theory of positive dependence orders is the survey by Scarsini and Shaked [496]. Section 2.2 in Joe [262] contains many of the results that are mentioned in Sections 9.A–9.F, as well as many examples and counterexamples.

Section 9.A: The PQD order is first defined in Yanagimoto and Okamoto [570]; it also can be found in Tchen [547]. The general closure property of the PQD order (Theorem 9.A.1) is taken from Kimeldorf and Sampson [295]. The definition of the PQD order for general n-dimensional vectors (n > 2) can be found in Joe [260]. The conditions under which Archimedean copulas are ordered in the PQD sense (Example 9.A.3) can be found in Joe [262]. Brown and Rinott [110] showed that some pairs of multivariate infinitely divisible distributions are PQD-ordered. The PQD comparisons of convolutions and of mixtures results (Theorems 9.A.6 and 9.A.7) are special cases of results of Belzunce and Semeraro [77]. The PQD ordering of random vectors with elliptically contoured densities (Example 9.A.8) follows from Theorem 5.1 of Das Gupta, Eaton, Olkin, Perlman, Savage, and Sobel [139]; see also Landsman and Tsanakas [331].

The results about the supermodular order (Section 9.A.4) are mostly taken from Meester and Shanthikumar [387] and from Shaked and Shanthikumar [517]; see also Joe [260] and Szekli, Disney, and Hur [545]. The closure results of the supermodular order given in Theorem 9.A.12, and the application to Markov chains given in Example 9.A.13, are taken from Li and Xu [350]. An extension of the result in Example 9.A.13 can be found in Kulik and Szekli [325]. The closure property of the order ≤sm under random sums (Theorem 9.A.14) can be found in Denuit, Genest, and Marceau [145]; it generalizes some results of Hu and Pan [238]. Extensions of Theorem 9.A.14 are given in Lillo, Pellerey, Semeraro, and Shaked [363], and in Kulik and Szekli [325].
The supermodular comparison of mixtures result (Theorem 9.A.15) is taken from Denuit and Müller [157]. The property that is described in Theorem 9.A.16 can be found in Bäuerle [58] or in Bäuerle and Rieder [61], and the property that is described in Theorem 9.A.18 can be found in Müller [411]. The inequality that is described in Example 9.A.17 is taken from Vanichpun and Makowski [554, 555]; they credit it to Bäuerle [58]. The fact that sums of components of supermodular ordered vectors are ordered according to ≤icx, described in (9.A.19), is taken from Müller [409]. The convex order comparison of random sums in Example 9.A.19 is a generalization of a result of O’Cinneide [439]. The result about the ordering of multivariate normal random vectors according to the ≤sm order (Example 9.A.20) can be found in Huffer [250]; see also Müller and Scarsini [416] and Block and Sampson [94, Section 2], though in the latter paper there is a mistake which is corrected in Müller and Scarsini [416]. An extension of the result in Example 9.A.20 to Kotz-type distributions is given in Ding and Zhang [168]. The bound on X, which is described in Theorem 9.A.21, can be found in Tchen [547]. A geometric proof of (9.A.20) is given in Kaas, Dhaene, Vyncke, Goovaerts, and Denuit [268] and in Hoedemakers, Beirlant, Goovaerts, and Dhaene [224]. The convex comparison of sums (Proposition 9.A.22) is taken from Kaas, Dhaene, and Goovaerts [267]; some related results and extensions can be found in Goovaerts and Kaas [213] and in Hoedemakers, Beirlant, Goovaerts, and Dhaene [224]. The comparison of a vector of associated random variables with its independence version (Theorem 9.A.23) can be found in Christofides and Vaggelatou [130]; the first part of this theorem strengthens a result in Shaked and Shanthikumar [517] which states the same conclusion, but under the CIS condition (defined in (6.B.11)) which is stronger than the weak positive association condition. The lower bound on X by the so-called “mutually exclusive” random variables (that is, that satisfy (9.A.23)), given in Theorem 9.A.24, is taken from Dhaene and Denuit [162]; see related results in Frostig [207] and in references therein. The sufficient condition by means of copulas, which imply the ≤dir-cx order (Theorem 9.A.25), can be found in Juri [266].
Theorem 3.1 and Corollaries 3.2 and 4.1 in Rüschendorf [486] are variants of Theorem 9.A.25. The model that is described in Example 9.A.26 is a special case of a model discussed in Bäuerle [57]; in fact, her Theorem 3.1 can be obtained from the stochastic inequality of Example 9.A.26 and the closure of the supermodular order under mixtures (Theorem 9.A.9(d)). Rüschendorf [486] studied various extensions of Example 9.A.26. The comparison of sampling plans which is given in Example 9.A.27 was obtained in Karlin [276], and noted by Frostig [206]. The comparison of multivariate Archimedean copulas (Example 9.A.28), as well as further similar comparisons, can be found in Wei and Hu [559].

If F and G of (9.A.3) are the distribution functions of bivariate vectors with integer-valued components, then the comparison F ≤PQD G is the same as a comparison of the partial sums of two matrices with nonnegative entries (which sum up to 1). Nguyen and Sampson [434] studied the geometry of such matrices. The PQD comparison can be used also to compare contingency tables that have the same row and column sums. Nguyen and Sampson [435] obtained some results regarding the number of such contingency tables that are more PQD than a given contingency table. Block, Chhetry, Fang, and Sampson [92] found necessary and sufficient conditions (by means of orders of permutations) for two bivariate empirical distributions to be ordered according to the PQD order. Further results in this vein are given in Metry and Sampson [392].

Examples of pairs of bivariate distributions that are PQD-ordered can be found in de la Horra and Ruiz-Rivas [227] and in Joe [261]. Bassan and Scarsini [55] characterized the PQD order by means of the usual stochastic ordering of some related stopping times. Ebrahimi [175] discussed negatively dependent distributions that are ordered according to the PQD order. Some positive dependence orders that are weaker than the PQD order were introduced in Rodríguez-Lallena and Úbeda-Flores [470]. Lu and Yi [366] gave a definition of an order that generalizes the bivariate PQD order to higher dimensions. However, this order does not have the desirable properties of being closed under mixtures and concatenations (this follows from the fact that parts (c) and (e) of Theorem 2.4 in Lu and Yi [366] may be incorrect).

Section 9.B: Most of the results in this section can be found in Colangelo, Scarsini, and Shaked [133].

Section 9.C: Most of the results in this section, about the LTD and RTI orders, are taken from Averous and Dortet-Bernadet [25]. The relationship between the strong orthant ratio orders and the LTD and RTI orders (Theorem 9.C.7) can be found in Colangelo, Scarsini, and Shaked [133]; the counterexamples that are mentioned after Theorem 9.C.7 can also be found in that paper. The results about the PRD order are taken from Yanagimoto and Okamoto [570] and from Fang and Joe [192]. In addition to the characterizations (9.C.19)–(9.C.21) of the PRD order, the reader may find another characterization in Rüschendorf [484].
In addition to Examples 9.C.10–9.C.12, many other examples of pairs of random vectors that are PRD-ordered can be found in Fang and Joe [192]. Hollander, Proschan, and Sconing [225] briefly considered some LTD and RTI orders that are different than the ones in Section 9.C. Colangelo [132] studied the relationships among these orders and the LTD and RTI orders in Section 9.C, and Colangelo, Scarsini, and Shaked [133] studied the relationships among these orders and the orthant ratio orders. Block, Chhetry, Fang, and Sampson [92] found necessary and sufficient conditions (by means of orders of permutations) for two bivariate empirical distributions to be ordered according to the PRD order. Some variations of the PRD order are discussed in Capéraà and Genest [120] and in Fang and Joe [192].


Avérous, Genest, and Kochar [26] introduced an extension of the PRD order which compares bivariate random vectors that need not have the same univariate marginals. Their order is equivalent to the requirement that the corresponding copulas are ordered in the PRD order. Hollander, Proschan, and Sconing [225] briefly discussed the order according to which (X1, X2) is smaller than (Y1, Y2) if $G_{Y_2|Y_1}(x_2 \mid x_1) - F_{X_2|X_1}(x_2 \mid x_1)$ is increasing in x1 for all x2.

Section 9.D: Most of the material in this section is taken from Kimeldorf and Sampson [295]. The conditions under which Archimedean copulas are ordered in the PLRD sense (Example 9.D.5) can be found in Joe [262]. The comparison of two bivariate normal random vectors in the PLRD sense (Example 9.D.6) is taken from Genest and Verret [208]. Yanagimoto [569] introduced a collection of 16 orders based on the idea of (9.D.2). He did it by requiring (9.D.2) to hold for special choices of intervals I1, I2, J1, and J2. The PQD order is one of the 16 orders in the collection of Yanagimoto. Metry and Sampson [391] extended Yanagimoto’s idea and presented a more general approach for generating positive dependence orderings. That approach makes it fairly easy to study the properties of the resulting orders and the interrelationships among them. Yanagimoto [569] also introduced an order that is similar to the PLRD order, and which applies to random vectors of dimension n ≥ 2. Kemperman [284] and Karlin and Rinott [278] suggested an order according to which the bivariate distribution F (with density f) is smaller than the bivariate distribution G (with density g) if

$$f(x_1, y_1)\, g(x_2, y_2) \ \ge\ f(x_1, y_2)\, g(x_2, y_1) \quad \text{whenever } x_1 \le x_2 \text{ and } y_1 \le y_2.$$

This order has not been studied in the literature as a positive dependence order. In fact, Kimeldorf and Sampson [295] have noticed that it does not satisfy some of the basic axioms that they introduced.

Section 9.E: The definition and many properties of associated random variables can be found in Esary, Proschan, and Walkup [184]. Most of the results described in this section are taken from Schriever [498] and from Fang and Joe [192]. In addition to Examples 9.E.4–9.E.6, many other examples of pairs of random vectors that are ordered by association can be found in Fang and Joe [192]. Some variations of the association order are also discussed in that paper. Block, Chhetry, Fang, and Sampson [92] found necessary and sufficient conditions (by means of orders of permutations) for two bivariate empirical distributions to be ordered according to the association order. The main result about the weak association order (Theorem 9.E.8) is extracted from Rüschendorf [486]; see also Yi and Tongyu [574].


Kimeldorf and Sampson [296] and Hollander, Proschan, and Sconing [225] discuss briefly an order according to which (X1, X2) is smaller than (Y1, Y2) if Cov(K(X1, X2), L(X1, X2)) ≤ Cov(K(Y1, Y2), L(Y1, Y2)), for all increasing functions K and L for which the covariance is well defined. Kimeldorf and Sampson [296] showed that this order does not satisfy one of their axioms. This order can clearly be extended to the case in which the dimension is n ≥ 2.

Section 9.F: Most of the results in this section are taken from Shaked [501] and from Rinott and Pollak [467]. One can prove Theorem 9.F.1 using the method of proof of Theorem 3.1 in Shaked [501]. Tong [550] has listed some examples of vectors X and Y that satisfy (9.F.5), and has shown some applications of this order. Rinott and Pollak [467] have essentially shown that if (X1, X2) ≤PDD (Y1, Y2), then some of the first-passage times of related Gaussian processes are ordered in the usual stochastic order.

Section 9.G: The results in this section are mostly taken from Shaked and Tong [523]. Many examples of pairs of exchangeable vectors that satisfy the orders ≤pd-k, k = 1, 2, 3, 4, are listed in that paper. Further examples can be found in Shaked and Tong [522]. The relationship between the star order and the order ≤pd-4 (Theorem 9.G.4) is taken from Barlow and Proschan [35]; a slightly stronger result can be found in Shaked [502]. Gupta and Richards [218] have given examples of pairs of multivariate Liouville distributions that are ordered according to ≤pd-1 and therefore also according to ≤pd-2 and ≤pd-4. Shaked and Tong [523] have noted that, intuitively, exchangeable random vectors are “more positively dependent” if, and only if, they are “less dispersed.” Thus they suggested defining orderings according to which (X1, X2, . . . , Xn) is smaller than (Y1, Y2, . . . , Yn) if Eφ(X1, X2, . . . , Xn) ≥ Eφ(Y1, Y2, . . . , Yn), for every φ which belongs to some properly chosen class of permutation symmetric functions. In addition to the classes defined in (9.G.1), (9.G.2) and (9.G.4) [there exists also a class under which the above inequality gives (9.G.5)], a natural choice of such a class is the class of all Schur-convex functions. Chang [124] considered some orders that are defined by the above inequality for several classes of permutation symmetric functions. His paper contains a rich bibliography regarding several stochastic majorization orders. Mosler [399, Section 7.6] introduced some notions of positive dependence orders that are based on volumes of central regions.

References

1. Aalen, O.O., Hoem, J.M.: Random time changes for multivariate counting processes. Scandinavian Actuarial Journal, 81–101 (1978)
2. Adell, J.A., Badía, F.G., de la Cal, J.: Beta-type operators preserve shape properties. Stochastic Processes and Their Applications 48, 1–8 (1993)
3. Adell, J.A., de la Cal, J.: Optimal Poisson approximation of uniform empirical processes. Stochastic Processes and Their Applications 64, 135–142 (1996)
4. Adell, J.A., Lekuona, A.: Taylor’s formula and preservation of generalized convexity for positive linear operators. Journal of Applied Probability 37, 765–777 (2000)
5. Adell, J.A., Perez-Palomares, A.: Stochastic orders in preservation properties by Bernstein-type operators. Advances in Applied Probability 31, 492–507 (1999)
6. Ahmadi, J., Arghami, N.R.: Some univariate stochastic orders on record values. Communications in Statistics—Theory and Methods 30, 69–74 (2001)
7. Ahmed, A.-H. N.: Preservation properties for the mean residual life ordering. Statistical Papers 29, 143–150 (1988)
8. Ahmed, A.N., Alzaid, A., Bartoszewicz, J., Kochar, S.C.: Dispersive and superadditive ordering. Advances in Applied Probability 18, 1019–1022 (1986)
9. Ahmed, A.N., Soliman, A.A., Khider, S.E.: On some partial ordering of interest in reliability. Microelectronics Reliability 36, 1337–1346 (1996)
10. Ahmed, A.N., Soliman, A.A., Khider, S.E.: Preservation results for ordered random variables, with applications to reliability theory. Microelectronics Reliability 37, 277–287 (1997)
11. Alzaid, A., Kim, J.S., Proschan, F.: Laplace ordering and its applications. Journal of Applied Probability 28, 116–130 (1991)
12. Alzaid, A.A.: Mean residual life ordering. Statistical Papers 29, 35–43 (1988)
13. Alzaid, A.A.: Length-biased orderings with applications. Probability in the Engineering and Informational Sciences 2, 329–341 (1988)
14. Alzaid, A.A., Proschan, F.: Dispersivity and stochastic majorization. Statistics and Probability Letters 13, 275–278 (1992)
15. Arcones, M.A., Kvam, P.H., Samaniego, F.J.: Nonparametric estimation of a distribution subject to a stochastic precedence constraint. Journal of the American Statistical Association 97, 170–182 (2002)
16. Argon, N.T., Andradóttir, S.: Partial pooling in tandem lines with cooperation and blocking. Queueing Systems 52, 5–30 (2006)
17. Arias-Nicolás, J.P., Fernández-Ponce, J.M., Luque-Calvo, P., Suárez-Llorens, A.: Multivariate dispersion order and the notion of copula applied to the multivariate t-distribution. Probability in the Engineering and Informational Sciences 19, 363–375 (2005)
18. Arjas, E.: A stochastic process approach to multivariate reliability systems: Notions based on conditional stochastic order. Mathematics of Operations Research 6, 263–276 (1981)
19. Arnold, B.C.: Majorization and the Lorenz Order: A Brief Introduction. Springer-Verlag, New York (1987)
20. Arnold, B.C.: Inequality measures for multivariate distributions. Metron 63, 317–327 (2005)
21. Arnold, B.C., Villasenor, J.A.: Lorenz ordering of order statistics and record values. In: Balakrishnan, N., Rao, C.R. (ed) Handbook of Statistics 16: Order Statistics: Theory and Methods. Elsevier, Amsterdam, 75–87 (1998)
22. Arrow, K.J.: Essays in the Theory of Risk-Bearing. North-Holland, New York (1974)
23. Asadi, M., Shanbhag, D.N.: Hazard measure and mean residual life orderings: A unified approach. In: Balakrishnan, N., Rao, C.R. (ed) Handbook of Statistics 20: Advances in Reliability. Elsevier, Amsterdam, 199–214 (2001)
24. Atakan, A.E.: Stochastic convexity in dynamic programming. Economic Theory 22, 447–455 (2003)
25. Averous, J., Dortet-Bernadet, J.-L.: LTD and RTI dependence orderings. Canadian Journal of Statistics 28, 151–157 (2000)
26. Avérous, J., Genest, C., Kochar, S.C.: On the dependence structure of order statistics. Journal of Multivariate Analysis 94, 159–171 (2005)
27. Baccelli, F., Makowski, A.M.: Multi-dimensional stochastic ordering and associated random variables. Operations Research 37, 478–487 (1989)
28. Baccelli, F., Makowski, A.M.: Stochastic orders associated with the forward recurrence time of a renewal process. Technical Report. Department of Electrical Engineering, University of Maryland, College Park (1992)
29. Bagai, I., Kochar, S.C.: On tail-ordering and comparison of failure rates. Communications in Statistics—Theory and Methods 15, 1377–1388 (1986)
30. Baker, E.: Increasing risk and increasing informativeness: Equivalence theorems. Operations Research 54, 26–36 (2006)
31. Bapat, R.B., Kochar, S.C.: On likelihood-ratio ordering of order statistics. Linear Algebra and Its Applications 199, 281–291 (1994)
32. Barlow, R.E., Bartholomew, D.J., Bremner, J.M., Brunk, H.D.: Statistical Inference under Order Restrictions. Wiley, New York (1972)
33. Barlow, R.E., Campo, R.: Total time on test processes and applications to failure data analysis. In: Barlow, R.E., Fussel, R., Singpurwalla, N.D. (ed) Reliability and Fault Tree Analysis. SIAM, Philadelphia, 451–481 (1975)
34. Barlow, R.E., Doksum, K.A.: Isotonic tests for convex ordering. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability 1. University of California Press, Berkeley, 293–323 (1972)
35. Barlow, R.E., Proschan, F.: Inequalities for linear combinations of order statistics from restricted families. Annals of Mathematical Statistics 37, 1574–1592 (1966)
36. Barlow, R.E., Proschan, F.: Statistical Theory of Reliability and Life Testing, Probability Models. Holt, Rinehart, and Winston, New York (1975)
37. Bartoszewicz, J.: Moment inequalities for order statistics from ordered families of distributions. Metrika 32, 383–389 (1985)
38. Bartoszewicz, J.: Dispersive ordering and monotone failure rate distributions. Advances in Applied Probability 17, 472–474 (1985)
39. Bartoszewicz, J.: Dispersive ordering and the total time on test transformation. Statistics and Probability Letters 4, 285–288 (1986)
40. Bartoszewicz, J.: A note on dispersive ordering defined by hazard functions. Statistics and Probability Letters 6, 13–16 (1987)
41. Bartoszewicz, J.: Quantile inequalities for linear combinations of order statistics from ordered families of distributions. Applicationes Mathematicae 21, 575–589 (1993)
42. Bartoszewicz, J.: Stochastic order relations and the total time on test transform. Statistics and Probability Letters 22, 103–110 (1995)
43. Bartoszewicz, J.: Tail orderings and the total time on test transform. Applicationes Mathematicae 24, 77–86 (1996)
44. Bartoszewicz, J.: Dispersive functions and stochastic orders. Applicationes Mathematicae 24, 429–444 (1997)
45. Bartoszewicz, J.: Applications of a general composition theorem to the star order of distributions. Statistics and Probability Letters 38, 1–9 (1998)
46. Bartoszewicz, J.: Characterizations of the dispersive order of distributions by the Laplace transform. Statistics and Probability Letters 40, 23–29 (1998)
47. Bartoszewicz, J.: Characterizations of stochastic orders based on ratios of Laplace transforms. Statistics and Probab