Asymptotic Methods in the Theory of Stochastic Differential Equations
A. V. SKOROKHOD


Volume 78


TRANSLATIONS OF
MATHEMATICAL MONOGRAPHS








American Mathematical Society





American Mathematical Society · Providence · Rhode Island
A. V. Skorokhod, Asimptoticheskie metody teorii stokhasticheskikh differentsial'nykh uravnenii, «Nauka», Moscow, 1987

Translated from the Russian by H. H. McFaden
Translation edited by Ben Silver

1980 Mathematics Subject Classification (1985 Revision). Primary 60-02, 60H10, 60J60; Secondary 60H15, 60J25, 28D10, 34F05, 47D07, 47A35, 60J75, 35R60, 34K20.

ABSTRACT. The topics in this monograph are ergodic theory for Markov processes and for solutions of stochastic differential equations, stochastic differential equations containing a small parameter, and stability theory for solutions of systems of stochastic differential equations. The main part of the material is presented for the first time. The book is intended for specialists in the theory of random processes and its applications. Bibliography: 66 titles.

Library of Congress Cataloging-in-Publication Data
Skorokhod, A. V. (Anatolii Vladimirovich), 1930-
Asymptotic methods in the theory of stochastic differential equations.
(Translations of mathematical monographs; v. 78)
Translation of: Asimptoticheskie metody teorii stokhasticheskikh differentsial'nykh uravnenii.
Includes bibliographical references.
1. Stochastic differential equations. 2. Asymptotic expansions. I. Title. II. Series.
QA274.23.S5313 1989 519.2 89-17698
ISBN 0-8218-4531-4

Copyright © 1989 American Mathematical Society. All rights reserved.
Translation authorized by the All-Union Agency for Authors' Rights, Moscow.
The American Mathematical Society retains all rights except those granted to the United States Government.
Printed in the United States of America.
Information on Copying and Reprinting can be found at the back of this volume.
The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
This publication was typeset using AMS-TeX, the American Mathematical Society's TeX macro system.
Contents

Foreword
List of Notation
Introduction

CHAPTER I. Ergodic theorems
§1. General ergodic theorems
  1.1. Ergodic theorems for semigroups of measure-preserving transformations
  1.2. Homogeneous Markov processes. Invariant measures and ergodic theorems
  1.3. Harris recurrence
§2. Densities for transition probabilities and resolvents for Markov solutions of stochastic differential equations
  2.1. Nondegenerate diffusion processes
  2.2. Diffusion processes with degenerate diffusion
  2.3. Processes with jumps
§3. Ergodic theorems for one-dimensional stochastic equations
  3.1. Diffusion processes on the line
  3.2. Diffusion processes on an interval
  3.3. Processes with reflection at the boundary
§4. Ergodic theorems for solutions of stochastic equations in $R^d$
  4.1. Invariant measures for processes on compact spaces
  4.2. Locally compact spaces
  4.3. Solutions of stochastic equations in $R^d$

CHAPTER II. Asymptotic behavior of systems of stochastic equations containing a small parameter
§1. Equations with a small right-hand side
  1.1. A general theorem on convergence to a diffusion process
  1.2. Ordinary differential equations with a random right-hand side
  1.3. A theorem on integral continuity with respect to a parameter for diffusion processes
  1.4. Stochastic equations with small diffusion
§2. Processes with rapid switching
  2.1. Processes with a discrete component
  2.2. An ergodic theorem for jump processes
  2.3. An estimate for a process with a discrete component
  2.4. A limit theorem for processes with rapidly varying discrete component
  2.5. Dynamical systems with rapid switching
§3. Averaging over variables for systems of stochastic differential equations
  3.1. A general theorem on averaging
  3.2. A diffusion process under the influence of a rapid dynamical system in the presence of feedback
  3.3. A dynamical system under the influence of a rapid diffusion process. Neutral case
  3.4. A dynamical system under the influence of a rapid diffusion process. Neutral case, large times

CHAPTER III. Stability. Linear systems
§1. Stability of sample paths of homogeneous Markov processes
  1.1. Definition
  1.2. A Feller process on a compact metric space
  1.3. Stability and instability of one-dimensional continuous processes
  1.4. Stability and instability of Feller processes in a locally compact space
§2. Linear equations in $R^d$ and the stochastic semigroups connected with them. Stability
  2.1. Linear equations
  2.2. Operator equations. Representation of solutions
  2.3. Commutative case
  2.4. Homogeneous case. Invariant subspaces
  2.5. Mean square stability
  2.6. Stability with probability 1
  2.7. p-Stability
§3. Stability of solutions of stochastic differential equations
  3.1. Stability and instability in first approximation
  3.2. Diffusion equations with homogeneous coefficients

CHAPTER IV. Linear stochastic equations in Hilbert space. Stochastic semigroups. Stability
§1. Linear equations with bounded coefficients
  1.1. General equations in Hilbert space
  1.2. Linear equations
  1.3. Linear stochastic equations in Hilbert space
  1.4. Stochastic Hilbert-Schmidt semigroups
§2. Strong stochastic semigroups with second moments
  2.1. Strong and weak random operators
  2.2. Processes with independent increments that are continuous in $\|\cdot\|_s$
  2.3. A stochastic differential equation
  2.4. Second-order stochastic semigroups of bounded variation
  2.5. Stochastic equations of diffusion type with constant coefficients
§3. Stability
  3.1. Examples of stable and unstable infinite-dimensional systems
  3.2. Stability in the mean square

Bibliography
Foreword

The 1982 book on stochastic differential equations written jointly by the author and Iosif Il'ich Gikhman did not include a number of areas in this theory that are important for applications. Therefore, we decided to write a book that would bring together material relating to applied areas in the theory of stochastic equations. We intended to treat equations in infinite-dimensional spaces, in particular, infinite systems of stochastic equations; the theory of linear equations in infinite-dimensional spaces and the semigroups connected with them, in particular, stochastic partial differential equations of evolution type; equations for conditionally Markov processes and the equations of nonlinear filtration connected with them; and the asymptotic behavior of solutions of stochastic equations, including ergodic theory, the method of averaging, and the theory of stability.

The plan of the book was discussed for a fairly long time, and we convinced ourselves at last that it was impossible to present all these topics in a single book. We then decided to treat the last topic. This choice was made under the influence of the interests of Iosif Il'ich, who, as a student of Nikolai Nikolaevich Bogolyubov, had directed much attention to the study of the asymptotic behavior of systems undergoing random perturbations.

A serious illness did not permit Iosif Il'ich to work on this book. He is no longer with us, but the book is published. It would certainly have been different if he had taken part in its writing: he had a better feeling for the "physical" aspects of mathematical theories and could convey this in his expositions, thus giving them more substance. Moreover, he knew far more than was written in his (and others') works. While recognizing how far this book was from what we had envisioned, I wrote it nevertheless, hoping at least by the choice of topic to pay homage to the shining memory of my teacher and friend.

A. Skorokhod
List of Notation

$R$: the real line.
$R_+$: the set of nonnegative numbers.
$a \wedge b$ and $a \vee b$: the smaller and larger of the respective numbers $a, b \in R$.
$R^d$: the $d$-dimensional Euclidean space.
$|x|$: the absolute value of a number $x \in R$ or the norm of a vector $x \in X$, where $X$ is a Euclidean space.
$(x, y)$: the inner product in a Euclidean space.
$X \times Y$: the Cartesian product of sets $X$ and $Y$.
$(x, y)$: an element of $X \times Y$; $x \in X$, $y \in Y$.
$\mathfrak{B}_X$, $\mathfrak{B}(X)$: the $\sigma$-algebra of Borel subsets of a metric space $X$.
$m_S$: Lebesgue measure on a set $S$.
$\mathfrak{A} \otimes \mathfrak{B}$: the product of $\sigma$-algebras $\mathfrak{A}$ and $\mathfrak{B}$.
$\bigvee_\alpha \mathfrak{A}_\alpha$: the smallest $\sigma$-algebra containing all the $\mathfrak{A}_\alpha$.
$\sigma(\xi_\alpha, \alpha \in A)$: the $\sigma$-algebra generated by the variables $\{\xi_\alpha, \alpha \in A\}$.
$L(X, Y)$: the linear space of linear operators from a linear space $X$ to a linear space $Y$.
$\|A\|$: the norm of a linear operator $A \in L(X, Y)$.
$A^*$: the operator adjoint to $A$ ($A^* \in L(Y, X)$).
$\{e_k\}$: an orthonormal basis in a Euclidean space $X$.
$\operatorname{tr} A = \sum_{k=1}^{d} (A e_k, e_k)$.
$x \circ y \in L(X, X)$: defined by $(x \circ y) z = (x, z) y$, where $X$ is a Euclidean space.
$\varphi'(x)$: the function in $L(X, Y)$ defined for $\varphi \colon X \to Y$ by the equality $\varphi'(x) y = \frac{d}{dt} \varphi(x + t y) \big|_{t=0}$, $x, y \in X$, $t \in R$.
$\|\varphi\| = \sup_x |\varphi(x)|$.
$C_X$: the space of continuous functions on $X$.
Introduction

Asymptotic problems for stochastic differential equations arose and were solved simultaneously with the very beginnings of the theory of such equations, because the founder of this theory, I. I. Gikhman, was considering first and foremost problems on asymptotic behavior, and he constructed the equations themselves partly in order to be able to pose and solve these problems rigorously. In this he, as a student of N. N. Bogolyubov, was continuing the traditions of the new direction developed in the 1930s by N. M. Krylov and Bogolyubov in investigations on nonlinear mechanics: the study of systems subject to the action of random perturbations. A cycle of papers by Krylov and Bogolyubov [1]-[5] was devoted to these investigations. They established, in particular, ergodic theorems for Markov processes with a phase space of a very general form. Special mention should be made of [1], in which a study was made of the behavior of a system subject to the action of a rapidly variable random force that becomes a "white noise" in the limit. It is this paper that served as an impetus for the creation by Gikhman of the theory of stochastic differential equations. In [1]-[5] various approaches were considered to the rigorous definition of a dynamical system subject to the action of a random force of "white noise" type, as well as the definition of a stochastic differential equation in a random field of forces with independent values, and results were obtained on the asymptotic behavior of the system when the field varies (for example, when impulse actions become continuous actions). (Itô used the convenient concept of a stochastic integral to construct a stochastic equation in [1] and [2]; this form of the equation is more accepted at present.)
We indicate two directions in the asymptotic investigation of systems with random actions: 1) investigation of the behavior of systems as $t \to \infty$, and 2) investigation of systems depending on a small parameter as this parameter tends to zero. The mixed problem also belongs here: investigation of a system as a parameter tends to zero and $t$ tends to infinity simultaneously.

The main systems considered are those describable by Markov processes that are, in turn, solutions of stochastic differential equations. However, many of the results are simpler to formulate and prove for Markov processes, and even for processes of a more general form. It is often considerations of convenience that dictate the choice of the form of a system. We remark also that, in addition to problems on the behavior of a system, new problems connected with the study of the asymptotic behavior of distributions (transition probabilities) arise for stochastic systems.

In considering the asymptotic behavior of a system as $t \to \infty$ we are primarily interested in a definite "stabilization" of the system. This term can be used to characterize any regularity that manifests itself in the behavior of the system. The crudest type of such stabilization is boundedness in probability. Under fairly natural assumptions about the probabilistic properties of the system, boundedness in probability implies ergodicity; this property characterizes more precisely the behavior of the system on the whole unbounded interval of variation. Even when the system is not bounded in probability, it can fail to diverge to infinity but instead return to a neighborhood of the original state with probability 1. Then it has an infinite invariant measure, and we can judge the qualitative behavior of the system on the basis of exact quantitative laws.

Although ergodic theory (including ergodic theory for Markov processes) is very well developed, some questions connected with this theory, as well as some results relating specifically to solutions of stochastic equations, are appearing here for the first time in a monograph. Shurenkov's book [1] contains the most complete reflection of the state of ergodic theory for Markov processes, along with a detailed bibliography.
Questions involving (asymptotic) stability of a system in a neighborhood of an equilibrium state or involving instability of the system arise naturally in the study of the behavior of systems on an infinite interval. Under very general assumptions, stability implies asymptotic stability for stochastic systems, and instability with positive probability implies instability with probability 1. Linear systems for which the point 0 is the only equilibrium point are of special interest. Such systems are either stable or unstable. In the latter case the system either diverges to infinity, or oscillates and hence has an invariant measure.

Gikhman founded the theory of stability for solutions of stochastic differential equations in [6] and [7], and then Khas'minskii developed it further in [1]-[5]. We note that the study of stability of linear systems is closely connected with the study of products of independent identically distributed matrices (about this see Bellman, Kesten, and Furstenberg (see Furstenberg [1]), Tutubalin [1], and Sazonov and Tutubalin [1]). We mention also results of Kulinich [1] that have not appeared in a book: for recurrent processes he found conditions for the existence of a limit distribution for a solution of a stochastic equation under a suitable normalization.

Carrying results relating to stochastic equations in finite-dimensional spaces over to the infinite-dimensional case is far from trivial. Although the form of stochastic equation proposed by Gikhman is insensitive to a change in the dimension of the space, the more natural form based on the Itô integral needed a certain reinterpretation (Daletskii [1], [2]). The study of linear systems led to the concept of a stochastic semigroup (Skorokhod [1], [2], [4], and Butsan [1]). Mean-square stability of solutions of linear equations involves stability of certain semigroups, now nonrandom, in the Banach space of linear operators acting in a Hilbert space. There is a fairly complete exposition of the theory of stability of such semigroups in Daletskii and Krein's book [1].

A small parameter in the equation has the effect that some terms in the equation become large in comparison with others, and since a stochastic differential equation contains four different terms (the differential of the unknown solution, the drift, the diffusion, and the jumps), we obtain different problems with a small parameter by placing the small parameter as a coefficient of different groups of terms. Most natural is the problem when the system is determined by an ordinary differential equation with a small random perturbation. Then under a mixing condition for the process on the right-hand side it behaves like a solution of a stochastic equation of diffusion type on large time intervals.
Another class of problems is connected with the presence of rapidly varying components in the system. If these components have ergodic properties, then their effect on the remaining components is "averaged", i.e., for the latter a closed equation is obtained whose coefficients are the coefficients of the original equation, averaged with respect to an ergodic distribution. These kinds of theorems generalize the Bogolyubov method of averaging to random systems. Gikhman and Khas'minskii occupied themselves with the justification of the Bogolyubov method of averaging in various degrees of generality in the case of stochastic equations (see also Stratonovich [1], [2], V. V. Sarafyan [1], and Sarafyan and Skorokhod [1]). We remark that for finite Markov chains and semi-Markov processes such a method of averaging was developed by Korolyuk and Turbin [1] (see also Turbin [1]) as a method of asymptotic phase amalgamation. 
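The averaging statement above can be written schematically as follows. The notation here (slow component $x^{\varepsilon}$, rapidly varying ergodic component $y$, drift $a$, ergodic distribution $\rho$) is introduced only for illustration and is an assumption, not the author's formulation; the precise hypotheses are the subject of Chapter II.

```latex
% Schematic form of an averaging result (illustrative notation, assumed):
% a slow component driven by a rapidly varying ergodic component y(t/\varepsilon)
\[
  \frac{dx^{\varepsilon}(t)}{dt} = a\bigl(x^{\varepsilon}(t),\, y(t/\varepsilon)\bigr),
  \qquad \varepsilon \to 0 .
\]
% Coefficient averaged with respect to the ergodic distribution \rho of y,
% and the closed equation for the limit of the slow component:
\[
  \bar a(x) = \int a(x, y)\,\rho(dy),
  \qquad
  \frac{d\bar x(t)}{dt} = \bar a\bigl(\bar x(t)\bigr).
\]
```

In this sketch the fast component enters the limit equation only through the averaged coefficient $\bar a$.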
A special place is occupied by the class of problems on the behavior of a dynamical system under the influence of a small diffusion. They have been investigated by Venttsel' and Freidlin [1] (see also Venttsel' [1], and Sarafyan [1]), and relate to the determination of an asymptotic expression for the probability of unlikely events (large deviations) such as, for example, the system reaching the boundary of a domain whose interior contains a point of stable equilibrium, due to a small diffusion or a transition of the system from one stable state to another.
CHAPTER I
Ergodic Theorems

§1. General ergodic theorems

Ergodic theorems combine two sorts of theorems: on the one hand, theorems on the existence with probability 1 of limits of means of the form
\[ \frac{1}{t} \int_0^t f(\xi(s))\,ds \tag{1} \]
as $t \to \infty$, where $\xi(s)$ is a random process and $f$ a measurable numerical function on the phase space of the process, and, on the other hand, theorems on the existence of limits for transition probabilities $P(t, x, A)$ of homogeneous Markov processes or of their means
\[ \frac{1}{t} \int_0^t P(s, x, A)\,ds \tag{2} \]
as $t \to \infty$, and the cases when these limits do not depend on the initial state $\xi(0) = x$ of the process are of special interest. In this chapter we consider ergodic theorems of both forms for homogeneous Markov processes that are solutions of stochastic differential equations with time-independent coefficients.

1.1. Ergodic theorems for semigroups of measure-preserving transformations. General ergodic theorems are usually formulated according to the following scheme. Some measurable space $(X, \mathfrak{B})$ is considered, and on it are given a semigroup of measurable transformations $S_t x$, $t > 0$, and a measure $m$ on $\mathfrak{B}$ ($\sigma$-finite in general), with the transformations of the semigroup leaving $m$ unchanged: for all $t > 0$ and $A \in \mathfrak{B}$
\[ m(S_t^{-1}(A)) = m(A). \tag{3} \]
The last relation is equivalent to the following: if $L_1(m)$ is the space of $\mathfrak{B}$-measurable $m$-integrable functions, then for all $f \in L_1(m)$
\[ \int f(S_t x)\,m(dx) = \int f(x)\,m(dx). \tag{4} \]
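As a minimal illustration of this scheme, one may take the rotation flow on the unit interval. The choice of space, measure, and flow below is an assumption made only for the example and does not come from the text.

```latex
% Illustrative example (assumed): X = [0,1), Borel sets, m = Lebesgue measure.
\[ S_t x = (x + t) \bmod 1, \qquad t \ge 0 . \]
% Semigroup property:
\[ S_{t+s} x = (x + t + s) \bmod 1 = S_t S_s x . \]
% Relation (3) for an interval A = [a,b) \subset [0,1):
\[ S_t^{-1}(A) = \{ x : (x + t) \bmod 1 \in [a,b) \},
   \qquad m\bigl(S_t^{-1}(A)\bigr) = b - a = m(A) . \]
% Relation (4), with f extended 1-periodically:
\[ \int_0^1 f\bigl((x + t) \bmod 1\bigr)\,dx
   = \int_t^{1+t} f(y \bmod 1)\,dy
   = \int_0^1 f(y)\,dy . \]
```

Since intervals generate the Borel sets, checking (3) on intervals suffices in this example.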
Saying that $S_t x$ is a semigroup of transformations means that $S_{t+s} x = S_t S_s x$ ($S_t$ need not be thought of as a linear operator). It is assumed that $S_t x$ is a measurable function with respect to $\mathfrak{B}_{R_+} \otimes \mathfrak{B}$ ($\mathfrak{B}_{R_+}$ is the Borel $\sigma$-algebra on $R_+$).

Let us consider the asymptotic behavior of the quantity
\[ \int_0^t f(S_u x)\,du \tag{5} \]
as $t \to \infty$. We present one of the main ergodic theorems on the behavior of quantities of the form (5).

THEOREM 1 (BIRKHOFF). Suppose that $f, g \in L_1(m)$, $g > 0$, and $\int_0^\infty g(S_u x)\,du = +\infty$ almost everywhere with respect to the measure $m$. Then the limit
\[ \lim_{t \to \infty} \left( \int_0^t f(S_u x)\,du \Big/ \int_0^t g(S_u x)\,du \right) \tag{6} \]
exists with probability 1. If this limit is denoted by $f_g^*(x)$, then $f_g^*(S_h x) = f_g^*(x)$ almost everywhere with respect to $m$ for all $h > 0$. Further,
\[ \int f_g^*(x) g(x)\,m(dx) = \int f(x)\,m(dx). \tag{7} \]

A proof of this theorem will be given below. It is based on the ergodic theorem for the case of discrete time. In this case we can consider a single transformation $S$ of $(X, \mathfrak{B})$ into $(X, \mathfrak{B})$ that preserves a $\sigma$-finite measure $m$.

THEOREM 1* (BIRKHOFF). Suppose that $f, g \in L_1(m)$, $g > 0$, and $\sum_{k=0}^{\infty} g(S^k x) = +\infty$ almost everywhere with respect to the measure $m$. Then the limit
\[ \lim_{n \to \infty} \left( \sum_{k=0}^{n-1} f(S^k x) \Big/ \sum_{k=0}^{n-1} g(S^k x) \right) \tag{8} \]
exists almost everywhere with respect to $m$. If this limit is equal to $f_g^*(x)$, then $f_g^*(S x) = f_g^*(x)$ almost everywhere with respect to $m$, and
\[ \int f_g^*(x) g(x)\,m(dx) = \int f(x)\,m(dx). \]

There is usually a proof of Theorem 1* in probability texts, and we omit it.

PROOF OF THEOREM 1. It can obviously be assumed that $f \ge 0$ and $f \le g$ (otherwise, $g + f$ can be taken as $g$). Let $f_1(x) = \int_0^1 f(S_u x)\,du$.
Since $f(S_u x)$ is measurable with respect to $\mathfrak{B}_{R_+} \otimes \mathfrak{B}$, while $f_1(x)$ is $\mathfrak{B}$-measurable and
\[ \int_0^1 \int f(S_u x)\,m(dx)\,du = \int_0^1 \int f(x)\,m(dx)\,du = \int f(x)\,m(dx), \]
it follows that
\[ \int f_1(x)\,m(dx) = \int \int_0^1 f(S_u x)\,du\,m(dx) = \int f(x)\,m(dx). \]
Note that $S_1$ preserves the measure $m$ and $S_1^n = S_n$. Therefore,
\[ \sum_{k=0}^{n-1} f_1(S_1^k x) = \sum_{k=0}^{n-1} \int_0^1 f(S_{k+u} x)\,du = \int_0^n f(S_u x)\,du. \]
Similarly, if $g_1(x) = \int_0^1 g(S_u x)\,du$, then $g_1(x) > 0$, $\sum_k g_1(S_1^k x) = +\infty$ for $m$-almost all $x$, and $\int g_1(x)\,m(dx) < \infty$. On the basis of Theorem 1*, the limit
\[ \lim_{n \to \infty} \left( \int_0^n f(S_u x)\,du \Big/ \int_0^n g(S_u x)\,du \right) \tag{9} \]
exists for almost all $x$. We observe that for $n_t \le t < n_t + 1$
\[ \int_0^{n_t} f(S_u x)\,du \Big/ \int_0^{n_t + 1} g(S_u x)\,du \;\le\; \int_0^t f(S_u x)\,du \Big/ \int_0^t g(S_u x)\,du \;\le\; \int_0^{n_t + 1} f(S_u x)\,du \Big/ \int_0^{n_t} g(S_u x)\,du. \]
Therefore, it suffices to prove that
\[ \lim_{n \to \infty} \left( \int_n^{n+1} g(S_u x)\,du \Big/ \int_0^n g(S_u x)\,du \right) = 0 \tag{10} \]
almost everywhere with respect to $m$. But, taking $g(S_1 x)$ as $f(x)$, we get from (9) that
\[ \lim_{n \to \infty} \left( \int_1^{n+1} g(S_u x)\,du \Big/ \int_0^n g(S_u x)\,du \right) \]
exists almost everywhere with respect to $m$, and since
\[ \lim_{n \to \infty} \left( \int_0^1 g(S_u x)\,du \Big/ \int_0^n g(S_u x)\,du \right) = 0 \]
almost everywhere with respect to $m$, the limit
\[ \lim_{n \to \infty} \left( \int_0^{n+1} g(S_u x)\,du \Big/ \int_0^n g(S_u x)\,du \right) \]
also exists; hence so does the limit on the left-hand side of (10). But
\[ \int \int_n^{n+1} g(S_u x)\,du\,m(dx) = \int g(x)\,m(dx). \]
Thus, the sequence $\int_n^{n+1} g(S_u x)\,du$ is bounded with respect to $m$, and hence the ratio after the limit sign on the left-hand side of (10) converges to zero in the measure $m$ (the denominator converges to infinity). This implies (10), and it is proved that the limit (6) exists. Therefore,
\[ f_g^*(S_h x) = \lim_{t \to \infty} \left( \int_h^{t+h} f(S_u x)\,du \Big/ \int_h^{t+h} g(S_u x)\,du \right) = \lim_{t \to \infty} \left( \int_0^{t+h} f(S_u x)\,du \Big/ \int_0^{t+h} g(S_u x)\,du \right) = f_g^*(x) \]
almost everywhere with respect to $m$. Since
\[ \int g_1(x) f_g^*(x)\,m(dx) = \int \left\{ \int_0^1 g(S_u x)\,du \right\} f_g^*(x)\,m(dx) = \int_0^1 \int g(S_u x) f_g^*(S_u x)\,m(dx)\,du = \int g(x) f_g^*(x)\,m(dx), \]
Theorem 1* gives us that
\[ \int g(x) f_g^*(x)\,m(dx) = \int f_1(x)\,m(dx) = \int f(x)\,m(dx). \qquad \square \]

REMARK. If $m(X) < \infty$, then the function $g(x) = 1$ can be taken as $g$. Consequently, in this case we have, for all $f \in L_1(m)$ and $m$-almost all $x$, the existence of the limit
\[ \lim_{t \to \infty} \frac{1}{t} \int_0^t f(S_u x)\,du = f^*(x), \tag{11} \]
where $f^*(S_h x) = f^*(x)$ for all $h > 0$ and $m$-almost all $x$, and
\[ \int f^*(x)\,m(dx) = \int f(x)\,m(dx). \tag{12} \]
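The limit (11) can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions, not part of the text: it uses the rotation flow $S_t x = (x + t) \bmod 1$ on $[0, 1)$ with Lebesgue measure, the test function $f(x) = \cos^2(2\pi x)$, and a midpoint rule for the time integral. The names `rotation_flow` and `time_average` are hypothetical helpers introduced here.

```python
import math

def rotation_flow(t, x):
    """Assumed example flow: S_t x = (x + t) mod 1 on [0, 1)."""
    return (x + t) % 1.0

def time_average(f, x, t, steps=100_000):
    """Midpoint-rule approximation of (1/t) * integral_0^t f(S_u x) du."""
    h = t / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h          # midpoint of the k-th subinterval
        total += f(rotation_flow(u, x))
    return total / steps

def f(x):
    """Test function; its integral over [0, 1) with Lebesgue measure is 1/2."""
    return math.cos(2.0 * math.pi * x) ** 2

# Time averages along one orbit for growing t, to be compared with 1/2.
for t in (1.3, 10.3, 100.3):
    print(t, time_average(f, 0.3, t))
```

For this flow the deviation of the time average from the space average is of order $1/t$, so the printed values approach $1/2$ as $t$ grows.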
We prove that also
\[ \lim_{t \to \infty} \int \left| \frac{1}{t} \int_0^t f(S_u x)\,du - f^*(x) \right| m(dx) = 0. \tag{13} \]
It obviously suffices to confine oneself to the case $f \ge 0$. Let $f_N(x) = f(x) \wedge N$, $f^N(x) = f(x) - f_N(x)$, and
\[ (f_N)^*(x) = \lim_{t \to \infty} \frac{1}{t} \int_0^t f_N(S_u x)\,du, \qquad (f^N)^*(x) = \lim_{t \to \infty} \frac{1}{t} \int_0^t f^N(S_u x)\,du \]
(the limits in the sense of convergence almost everywhere with respect to $m$). Then $\frac{1}{t} \int_0^t f_N(S_u x)\,du \le N$, and, by the Lebesgue theorem,
\[ \lim_{t \to \infty} \int \left| \frac{1}{t} \int_0^t f_N(S_u x)\,du - (f_N)^*(x) \right| m(dx) = 0, \]
while in view of Fatou's lemma we can write
\[ \limsup_{t \to \infty} \int \left| \frac{1}{t} \int_0^t f^N(S_u x)\,du - (f^N)^*(x) \right| m(dx) \le 2 \int f^N(x)\,m(dx). \]
The right-hand side of the last inequality can be made arbitrarily small by suitably choosing $N$. This proves (13). Accordingly, the variant of Birkhoff's theorem for a finite measure $m$ is established.

THEOREM 2. Suppose that $m$ is a finite measure and $f \in L_1(m)$, and let $\mathfrak{I}$ be the smallest $\sigma$-algebra of sets in $\mathfrak{B}$ with respect to which all the functions $g(x)$ with $g(x) = g(S_h x)$ for $m$-almost all $x$ and for any $h > 0$ are measurable. Then the limit (11) exists almost everywhere with respect to $m$, $f^*(x)$ is $\mathfrak{I}$-measurable and belongs to $L_1(m)$, and the equalities (12) and (13) hold.

REMARK. Let $A \in \mathfrak{I}$. Then $I_A(S_u x) = I_A(x)$ almost everywhere with respect to $m$, and hence
\[ \lim_{t \to \infty} \frac{1}{t} \int_0^t f(S_u x) I_A(S_u x)\,du = I_A(x) \lim_{t \to \infty} \frac{1}{t} \int_0^t f(S_u x)\,du = I_A(x) f^*(x). \]
On the basis of (12),
\[ \int_A f^*(x)\,m(dx) = \int_A f(x)\,m(dx). \tag{14} \]
The relation (14) determines $f^*(x)$ uniquely to within sets of $m$-measure zero. Indeed, if $\tilde f(x)$ is $\mathfrak{I}$-measurable and for all $A \in \mathfrak{I}$
\[ \int_A \tilde f(x)\,m(dx) = \int_A f^*(x)\,m(dx), \tag{15} \]
then $\tilde f = f^*$ almost everywhere with respect to $m$.

DEFINITION. A semigroup of transformations $S_u$ of the space $(X, \mathfrak{B})$ preserving the measure $m$ on $\mathfrak{B}$ is called a metrically transitive semigroup if $m(A) = 0$ or $m(X \setminus A) = 0$ for all $A \in \mathfrak{I}$.

The sets in $\mathfrak{I}$ are said to be invariant. Metric transitivity means that every invariant set coincides either with the whole space or with the empty set (to within sets of $m$-measure 0).

REMARK. If the transformation semigroup in Theorems 1, 1*, and 2 is metrically transitive, then the limits (6), (8), and (11) are constants.

1.2. Homogeneous Markov processes. Invariant measures and ergodic theorems. Let $(X, \mathfrak{B})$ be a measurable space, the phase space of the process. We consider a space $\Omega$ of measurable functions from $R_+$ to $X$ that is translation invariant: if $x(t) \in \Omega$, then $x_h(t) = x(t + h) \in \Omega$ for all $h > 0$. In $\Omega$ we single out some $\sigma$-algebra $\mathscr{F}$ of subsets and a flow of sub-$\sigma$-algebras $\mathscr{F}_t$ such that:
1) $\mathscr{F}_t \subset \mathscr{F}_s$ for $t < s$;
2) $\mathscr{F}_\infty = \bigvee_t \mathscr{F}_t = \mathscr{F}$;
3) $\{x(\cdot)\colon x(t) \in B\} \in \mathscr{F}_t$ for all $B \in \mathfrak{B}$; and
4) the subset $\{(s, x(\cdot))\colon s \in A,\ x(s) \in B\}$ of $[0, t] \times \Omega$ is in $\mathfrak{B}_{[0,t]} \otimes \mathscr{F}_t$ for all $A \in \mathfrak{B}_{[0,t]}$ and $B \in \mathfrak{B}$ ($\mathfrak{B}_{[0,t]}$ is the $\sigma$-algebra of Borel subsets of $[0, t]$).

Let $P_x$, $x \in X$, be a family of probability measures on $\Omega$ satisfying the following conditions:
a) $P_x(C)$ is $\mathfrak{B}$-measurable with respect to $x$ for $C \in \mathscr{F}$;
b)
\[ P_x(x(t + h) \in B \mid \mathscr{F}_t) = P_{x(t)}(x(h) \in B), \qquad t, h \in R_+, \tag{16} \]
almost everywhere with respect to $P_x$; and
c) $P_x(x(0) = x) = 1$.

The collection of these objects is called a Markov process with phase space $(X, \mathfrak{B})$, space $\Omega$ of sample paths, flow of $\sigma$-algebras $\mathscr{F}_t$, and family of probability measures $P_x$. We denote it by $\{\Omega, \mathscr{F}_t, P_x\}$. The main characteristic of the process is its transition probability
\[ P(t, x, A) = P_x(x(t) \in A). \tag{17} \]

The condition 4) means that a Markov process is progressively measurable. Denote by $\mathscr{F}_t^*$ the smallest $\sigma$-algebra containing the sets $\{x(\cdot)\colon x(s) \in B\}$, $s \le t$, $B \in \mathfrak{B}$. Obviously, $\mathscr{F}_t^*$ is also a flow of $\sigma$-algebras, $\mathscr{F}_t^* \subset \mathscr{F}_t$, and, since the right-hand side of (16) is $\mathscr{F}_t^*$-measurable,
\[ P_x(x(t + h) \in B \mid \mathscr{F}_t^*) = P_{x(t)}(x(h) \in B) \tag{18} \]
almost everywhere with respect to $P_x$. The measure $P_x$ on $\mathscr{F}_t^*$ is uniquely determined by the transition probability $P(s, x, A)$ for $s \le t$. Let us consider the set
\[ \{x(\cdot)\colon x(t + h_1) \in B_1, \ldots, x(t + h_k) \in B_k\}, \tag{19} \]
where $B_1, \ldots, B_k \in \mathfrak{B}$ and $0 < h_1 < \cdots < h_k$. Then, on the basis of (16),
\begin{align*}
P_x(&x(t + h_1) \in B_1, \ldots, x(t + h_k) \in B_k \mid \mathscr{F}_t) \\
&= E_x\bigl( P_x(x(t + h_1) \in B_1, \ldots, x(t + h_k) \in B_k \mid \mathscr{F}_{t + h_{k-1}}) \bigm| \mathscr{F}_t \bigr) \\
&= E_x\bigl( P_{x(t + h_{k-1})}(x(h_k - h_{k-1}) \in B_k)\, I_{B_1}(x(t + h_1)) \cdots I_{B_{k-1}}(x(t + h_{k-1})) \bigm| \mathscr{F}_t \bigr) \\
&= E_x\bigl( P(h_k - h_{k-1}, x(t + h_{k-1}), B_k)\, I_{B_1}(x(t + h_1)) \cdots I_{B_{k-1}}(x(t + h_{k-1})) \bigm| \mathscr{F}_t \bigr) \\
&= \int_{B_1} P(h_1, x(t), dy_1) \int_{B_2} P(h_2 - h_1, y_1, dy_2) \cdots \int_{B_k} P(h_k - h_{k-1}, y_{k-1}, dy_k) \\
&= P_{x(t)}(x(h_1) \in B_1, \ldots, x(h_k) \in B_k)
\end{align*}
with $P_x$-probability 1. The last relation can be written as follows. We introduce the operation $\theta_t$: $\theta_t x(s) = x(s + t)$ of translation of functions and sets. If $C \in \mathscr{F}$, then $\theta_t^{-1} C = \{x(\cdot)\colon \theta_t x(\cdot) \in C\}$. If $C = \{x(\cdot)\colon x(h_i) \in B_i,\ i = 1, 2, \ldots, k\}$, then (19) is $\theta_t^{-1} C$, and for the given $C$ we have established the equality
\[ P_x(\theta_t^{-1} C \mid \mathscr{F}_t) = P_{x(t)}(C) \tag{20} \]
with probability 1. This relation clearly extends to all $C \in \mathscr{F}^* = \bigvee_t \mathscr{F}_t^*$ (here $\theta_t^{-1} C \in \mathscr{F}^*$), as well as to the completion of this $\sigma$-algebra with respect to $P_x$. We shall assume the following condition.

CONDITION U. $\theta_t^{-1} C \in \mathscr{F}$ and relation (20) is valid for all $C \in \mathscr{F}$ and $t > 0$.

The latter relation is fulfilled if $\mathscr{F}$ lies in the completion of $\mathscr{F}^*$ with respect to $P_x$. If the condition 4) holds when $\mathscr{F}_t$ is replaced by the completion of $\mathscr{F}_t^*$ with respect to $P_x$, then $\mathscr{F}_t^*$ can be taken as $\mathscr{F}_t$. Thus, Condition U can be replaced by a condition expressible in terms of $\mathscr{F}_s^*$, $s \le t$.

The following three semigroups are connected with a Markov process:
1) the semigroup of translations $\theta_t$ on $\Omega$;
2) the semigroup defined by
\[ T_t f(x) = \int f(y) P(t, x, dy) = E_x f(x(t)) \tag{21} \]
in the space of all bounded $\mathfrak{B}$-measurable functions $f$ (the fact that this is a semigroup follows from the equality
\[ E_x f(x(t + s)) = E_x E(f(x(t + s)) \mid \mathscr{F}_t) = E_x E_{x(t)} f(x(s)) = E_x T_s f(x(t)) = T_t T_s f(x); \]
here $E_x$ is the expectation with respect to the measure $P_x$); and
3) the semigroup $\mu T_t$ acting in the space of finite measures on $\mathfrak{B}$,
\[ \mu T_t(A) = \int P(t, x, A)\,\mu(dx) = E_\mu I_A(x(t)). \tag{22} \]
If $\mu$ is a probability distribution, then $\mu T_t$ is the distribution of $x(t)$ under the condition that $x(0)$ has distribution $\mu$; $E_\mu$ is the expectation with respect to the measure
\[ P_\mu(C) = \int P_x(C)\,\mu(dx) \tag{23} \]
(we use this notation also in the case of measures that are not probability measures). The measure $\mu T_t$ is defined even when $\mu$ is $\sigma$-finite, but is itself not necessarily $\sigma$-finite.

DEFINITION. A $\sigma$-finite measure $\mu$ is said to be invariant if $\mu T_t = \mu$ for all $t > 0$.

LEMMA 1. Let $\mu$ be an invariant measure. Then $P_\mu(C)$ is a $\sigma$-finite measure, and the translation semigroup preserves this measure.

PROOF. Let $t > 0$ be arbitrary, and suppose that $X = \bigcup_k B_k$, $B_k \cap B_j = \emptyset$, $k \ne j$, and $\mu(B_k) < \infty$. Then
\[ P_\mu(C) = \sum_k P_\mu(C \cap \{x(\cdot)\colon x(t) \in B_k\}), \]
and $P_\mu(C \cap \{x(\cdot)\colon x(t) \in B_k\})$ is a finite measure as a function of $C$, since
\[ P_\mu(C \cap \{x(\cdot)\colon x(t) \in B_k\}) \le \mu T_t(B_k) = \mu(B_k) < \infty. \]
Further,
\begin{align*}
P_\mu(\theta_t^{-1} C) &= \int P_x(\theta_t^{-1} C)\,\mu(dx) = \int \bigl( E_x P_x(\theta_t^{-1} C \mid \mathscr{F}_t) \bigr)\,\mu(dx) \\
&= \int E_x P_{x(t)}(C)\,\mu(dx) = \int \mu(dx) \int P(t, x, dy) P_y(C) = \int \mu(dy) P_y(C) = P_\mu(C). \qquad \square
\end{align*}

If $\xi(x(\cdot))$ is an $\mathscr{F}$-measurable function on $\Omega$, then $\theta_u \xi = \xi(\theta_u x(\cdot))$. By Condition U, $\theta_u \xi$ is also $\mathscr{F}$-measurable. We reformulate Theorems 1 and 2 for the case of a Markov process.
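The semigroups (21) and (22) and the notion of an invariant measure can be made concrete on a finite phase space. The sketch below is an illustration under stated assumptions, not an example from the text: it takes a two-state continuous-time Markov chain with jump rates `a` (from state 0 to 1) and `b` (from 1 to 0), for which the transition probabilities have a closed form. All function names are hypothetical.

```python
import math

def transition_matrix(a, b, t):
    """P(t, x, {y}) for the assumed two-state chain with rates a (0 -> 1), b (1 -> 0)."""
    c = a + b
    e = math.exp(-c * t)
    return [[(b + a * e) / c, (a - a * e) / c],
            [(b - b * e) / c, (a + b * e) / c]]

def apply_to_function(p, f):
    """T_t f(x) = sum_y f(y) P(t, x, {y}), the finite analogue of (21)."""
    return [sum(p[x][y] * f[y] for y in range(2)) for x in range(2)]

def apply_to_measure(mu, p):
    """(mu T_t)({y}) = sum_x mu({x}) P(t, x, {y}), the finite analogue of (22)."""
    return [sum(mu[x] * p[x][y] for x in range(2)) for y in range(2)]

def matmul(p, q):
    """Product of two 2x2 matrices, used to check P(t + s) = P(t) P(s)."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b = 1.0, 2.0
mu = [b / (a + b), a / (a + b)]             # candidate invariant measure
p_t = transition_matrix(a, b, 0.7)
print(apply_to_measure(mu, p_t))            # compare with mu
print(apply_to_function(p_t, [1.0, 0.0]))   # x -> P(t, x, {0})
```

In this example the measure `mu` satisfies $\mu T_t = \mu$ for every $t$, which is the defining property of an invariant measure, and `matmul` lets one check the semigroup property of the transition matrices.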
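THEOREM 3. Suppose that $\mu$ is an invariant $\sigma$-finite measure for a Markov process $\{\Omega, \mathscr{F}_t, P_x\}$, $\xi$ and $\eta$ are $\mathscr{F}$-measurable variables, $\eta > 0$, $E_\mu |\xi| < \infty$, $E_\mu \eta < \infty$, and $P_\mu\{\int_0^\infty \theta_u \eta\,du < \infty\} = 0$. Then the limit
\[ \lim_{t \to \infty} \left( \int_0^t \theta_u \xi\,du \Big/ \int_0^t \theta_u \eta\,du \right) = \bar\xi \tag{24} \]
exists almost everywhere with respect to the measure $P_\mu$, the variable $\bar\xi$ has the property that $P_\mu(\theta_u \bar\xi \ne \bar\xi) = 0$ for all $u > 0$, and $E_\mu \bar\xi \eta = E_\mu \xi$. If $\mu$ is finite, then
\[ \lim_{t \to \infty} \frac{1}{t} \int_0^t \theta_u \xi\,du = \bar\xi \tag{24$'$} \]
almost everywhere with respect to $P_\mu$, and $\bar\xi$ satisfies the additional condition $E_\mu \bar\xi = E_\mu \xi$. What is more, if $\mathfrak{I}$ is the $\sigma$-algebra of subsets $C \in \mathscr{F}$ with $P_\mu((C \setminus \theta_u^{-1} C) \cup (\theta_u^{-1} C \setminus C)) = 0$ for all $u > 0$, then for all $C \in \mathfrak{I}$
\[ E_\mu \bar\xi I_C = E_\mu \xi I_C. \tag{25} \]
In particular, if $P_\mu$ is trivial on $\mathfrak{I}$, then the limit in (24) is constant (nonrandom).

There is interest in the construction of the $\sigma$-algebra $\mathfrak{I}$ and in conditions under which $P_\mu$ is trivial on it, i.e., $P_\mu(C) = 0$ or $P_\mu(\Omega \setminus C) = 0$ for all $C \in \mathfrak{I}$. It is natural to denote by $E_\mu(\xi \mid \mathfrak{I})$ an $\mathfrak{I}$-measurable variable $\bar\xi$ for which (25) holds for all $C \in \mathfrak{I}$. The next theorem describes the $\sigma$-algebra $\mathfrak{I}$.

THEOREM 4. Suppose that $\bar\xi$ is an $\mathfrak{I}$-measurable bounded variable. Then:
a) $E_x \bar\xi = g(x)$ is a $\mathfrak{B}$-measurable function satisfying $T_t g(x) = g(x)$ almost everywhere with respect to the measure $\mu$ for all $t > 0$;
b) $g(x(t))$ is a martingale with respect to the flow $\mathscr{F}_t$ and the measure $P_\mu$;
c) $\bar\xi = \lim_{t \to \infty} g(x(t))$ almost everywhere with respect to $P_\mu$.

PROOF. We have that $P_\mu(\bar\xi \ne \theta_t \bar\xi) = 0$. Hence,
\begin{align*}
0 = E_\mu |\bar\xi - \theta_t \bar\xi| &= \int \mu(dx)\, E_x |\bar\xi - \theta_t \bar\xi| \ge \int \mu(dx)\, |E_x \bar\xi - E_x \theta_t \bar\xi| \\
&= \int \mu(dx)\, |g(x) - E_x E(\theta_t \bar\xi \mid \mathscr{F}_t)| = \int \mu(dx)\, |g(x) - E_x E_{x(t)} \bar\xi| \\
&= \int \mu(dx)\, |g(x) - E_x g(x(t))| = \int |g(x) - T_t g(x)|\,\mu(dx).
\end{align*}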
Assertion a) is proved. It was shown in the chain of equalities that $g(x(t)) = E_x(\bar\xi \mid \mathscr{F}_t)$ almost everywhere with respect to $P_\mu$. This at once gives us b) and c). $\square$

We introduce invariant sets in the phase space of the process. Let $\mu$ be an invariant measure. A set $B \in \mathfrak{B}$ is said to be $\mu$-invariant for the process $\{\Omega, \mathscr{F}_t, P_x\}$ if $P(t, x, B) = 1$ for $\mu$-almost all $x \in B$.

REMARK. A measurable set $B$ is said to be invariant if $P(t, x, B) = 1$ for $x \in B$. Let us show that for every $\mu$-invariant set $B$ there exists an invariant set $B' \subset B$ such that $\mu(B \setminus B') = 0$. For a given $n$ we construct a sequence of sets $B_n^{(m)}$, $m = 1, 2, \ldots$, as follows:
\[ B_n^{(1)} = B, \qquad B_n^{(m)} = \left\{ x\colon x \in B_n^{(m-1)},\ \int_0^n P(t, x, B_n^{(m-1)})\,dt = n \right\}. \]
Then $\mu(B_n^{(1)}) = 1$, and if $\mu(B_n^{(m-1)}) = 1$, then
\begin{align*}
1 = \mu(B_n^{(m-1)}) &= \int_{B_n^{(m)}} \mu(dx)\, \frac{1}{n} \int_0^n P(t, x, B_n^{(m-1)})\,dt + \int_{B_n^{(m-1)} \setminus B_n^{(m)}} \mu(dx)\, \frac{1}{n} \int_0^n P(t, x, B_n^{(m-1)})\,dt \\
&= \mu(B_n^{(m)}) + \int_{B_n^{(m-1)} \setminus B_n^{(m)}} \mu(dx)\, \frac{1}{n} \int_0^n P(t, x, B_n^{(m-1)})\,dt,
\end{align*}
\[ 0 = \int_{B_n^{(m-1)} \setminus B_n^{(m)}} \left[ 1 - \frac{1}{n} \int_0^n P(t, x, B_n^{(m-1)})\,dt \right] \mu(dx). \]
Since the expression in square brackets is positive, it follows that $\mu(B_n^{(m-1)} \setminus B_n^{(m)}) = 0$ and $\mu(B_n^{(m)}) = 1$. Setting $B_n = \bigcap_m B_n^{(m)}$, we have that $\mu(B_n) = 1$ and
\[ \frac{1}{n} \int_0^n P(t, x, B_n)\,dt = 1 \quad \text{for } x \in B_n. \]
If $B' = \bigcap_n B_n$, then $\mu(B \setminus B') = 0$, and for all $n$
\[ \frac{1}{n} \int_0^n P(t, x, B')\,dt = 1 \quad \text{for } x \in B'. \]
Hence $P(t, x, B') = 1$ for almost all $t$ with respect to Lebesgue measure, for all $x \in B'$. We show that $B'$ is then invariant:
$$tP(t, x, X\setminus B') = \int_0^t\!\!\int P(s, x, dy)P(t-s, y, X\setminus B')\,ds$$
$$= \int_0^t ds\int_{X\setminus B'} P(s, x, dy)P(t-s, y, X\setminus B') + \int_0^t ds\int_{B'} P(s, x, dy)P(t-s, y, X\setminus B')$$
$$= \int_0^t ds\int_{B'} P(s, x, dy)P(t-s, y, X\setminus B') = 0,$$
since $P(t-s, y, X\setminus B') = 0$ for almost all $s$ with respect to Lebesgue measure.

THEOREM 5. If $B$ is a $\mu$-invariant set of finite $\mu$-measure, then $X\setminus B$ is also $\mu$-invariant. Further, $I_B(x(0))$ is a $\mathcal{J}$-measurable variable. If $C \in \mathcal{J}$ and $P_\mu(C) < \infty$, then there is a $\mu$-invariant set $B$ such that $P_\mu\{I_C \ne I_B(x(0))\} = 0$.

PROOF. Suppose that $\mu(B) < \infty$ and $B$ is invariant. Then
$$\mu(B) = \int\mu(dx)P(t, x, B) = \int_B\mu(dx)P(t, x, B) + \int_{X\setminus B}\mu(dx)P(t, x, B) = \mu(B) + \int_{X\setminus B}\mu(dx)P(t, x, B).$$
Hence, $P(t, x, X\setminus B) = 1 - P(t, x, B) = 1$ for almost all $x \in X\setminus B$. Therefore,
$$P_\mu(I_B(x(0)) \ne I_B(x(t))) = P_\mu(x(0) \in B,\ x(t) \in X\setminus B) + P_\mu(x(0) \in X\setminus B,\ x(t) \in B)$$
$$= \int_B\mu(dx)P(t, x, X\setminus B) + \int_{X\setminus B}\mu(dx)P(t, x, B) = 0.$$
This means that $\theta_tI_B(x(0)) = I_B(x(t)) = I_B(x(0))$ almost everywhere with respect to $P_\mu$, i.e., $I_B(x(0))$ is $\mathcal{J}$-measurable.

If $C \in \mathcal{J}$ and $P_\mu(C) < \infty$, we set $\varphi(x) = P_x(C)$. Then $I_C = \lim_{t\to\infty}\varphi(x(t))$ almost everywhere with respect to $P_\mu$, by Theorem 4. For $0 < \alpha < \beta < 1$ let $A_\alpha^\beta = \{x\colon \alpha < \varphi(x) < \beta\}$. Then $\lim_{t\to\infty} P(t, x, A_\alpha^\beta) = 0$ for $\mu$-almost all $x$, because $I_{A_\alpha^\beta}(x(t)) \to 0$ almost everywhere with respect to $P_\mu$ and is bounded by the function $\varphi(x(t))/\alpha$, for which $E_\mu\varphi(x(t))/\alpha < \infty$, so that $E_\mu I_{A_\alpha^\beta}(x(t)) \to 0$. Since
$$\mu(A_\alpha^\beta) = \int\mu(dx)P(t, x, A_\alpha^\beta) \to 0,$$
it follows that $\mu(A_\alpha^\beta) = 0$. Thus, $\mu(\{x\colon 0 < \varphi(x) < 1\}) = 0$, and the measure $\mu$ is concentrated on the sets $B_1 = \{x\colon \varphi(x) = 1\}$ and $B_0 = \{x\colon \varphi(x) = 0\}$. Note that
$$\mu(B_1) = \int\mu(dx)\varphi(x) = \int\mu(dx)P_x(C) = P_\mu(C) < \infty.$$
Since the function $\varphi(x)$ coincides with $I_{B_1}(x)$ almost everywhere with respect to $\mu$, we can use assertion a) in Theorem 4 to write
$$I_{B_1}(x) = \varphi(x) = E_x\varphi(x(t)) = E_xI_{B_1}(x(t)) = P(t, x, B_1)$$
for $\mu$-almost all $x$. This means that $B_1$ is a $\mu$-invariant set. If we set $B = B_1$, then the second assertion of the theorem holds. $\square$

COROLLARY. Consider the $\sigma$-algebra generated by the $\mu$-invariant sets. The measure $P_\mu$ is trivial on $\mathcal{J}$ if and only if $\mu$ is trivial on this $\sigma$-algebra, i.e., for every $\mu$-invariant set $B$ either $\mu(B) = 0$ or $\mu(B) = \mu(X)$.

REMARK. We give a condition for the measure $P_\mu$ to be nontrivial on $\mathcal{J}$: $P_\mu$ is nontrivial on $\mathcal{J}$ if and only if the $\sigma$-algebra of $\mu$-invariant sets contains two disjoint sets of positive $\mu$-measure. Indeed, in this case there exist $C_1$ and $C_2$ in $\mathcal{J}$ such that $P_\mu(C_k) > 0$, $k = 1, 2$. If $\varphi_k(x) = P_x(C_k)$, then $\varphi_k(x_t) \to I_{C_k}$ in the measure $P_\mu$. We see as in the proof of Theorem 5 that $\mu(\{x\colon \alpha < \varphi_k(x) < \beta\}) = 0$ for $0 < \alpha < \beta < 1$. Therefore, $\varphi_k(x)$ coincides almost everywhere with respect to $\mu$ with the indicator function of a set $A_k$, $k = 1, 2$. The set $A_k$ is $\mu$-invariant: since $I_{A_k}(x) = E_xI_{A_k}(x_t)$ (for $\mu$-almost all $x$), it follows that $P(t, x, A_k) = 1$ for $\mu$-almost all $x \in A_k$.

DEFINITION. A finite invariant measure $\mu$ is said to be ergodic if it is trivial on the $\sigma$-algebra of $\mu$-invariant sets.

We consider ergodic theorems for transition probabilities.

THEOREM 6. Suppose that $\mu$ is a finite invariant measure. Then for $\mu$-almost all $x$ and for $A \in \mathfrak{B}$ the limit
$$\lim_{t\to\infty}\frac1t\int_0^t P(u, x, A)\,du = Q(x, A)$$
exists, the function $Q(x, A)$ is measurable in $x$ with respect to the $\sigma$-algebra of $\mu$-invariant sets, and for all $B$ in this $\sigma$-algebra
$$\int_B Q(x, A)\,\mu(dx) = \mu(A \cap B), \tag{26}$$
i.e., $Q(x, A)$ is the conditional probability of $A$ with respect to $\mu$ relative to the $\sigma$-algebra of $\mu$-invariant sets.
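A finite-state, discrete-time analogue may help to make the statement of Theorem 6 concrete. The sketch below is only an illustration under stated assumptions: the four-state transition matrix `P`, the weights 0.3 and 0.7, and the sets `A` and `B` are hypothetical, and the time integral $\frac1t\int_0^t P(u,x,A)\,du$ is replaced by a Cesàro average of matrix powers. The chain has two closed classes, so the invariant sets are $\{0,1\}$ and $\{2,3\}$, and the limit $Q(x,\cdot)$ should be the stationary law of the class containing $x$.

```python
import numpy as np

# Hypothetical chain with two closed classes {0, 1} and {2, 3} (an assumption for illustration).
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.2, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.9],
    [0.0, 0.0, 0.6, 0.4],
])

def cesaro_average(P, n):
    """Return (1/n) * sum_{k=0}^{n-1} P^k, a discrete stand-in for (1/t) * int_0^t P(u, x, .) du."""
    acc = np.zeros_like(P)
    Pk = np.eye(P.shape[0])
    for _ in range(n):
        acc += Pk
        Pk = Pk @ P
    return acc / n

Q = cesaro_average(P, 20000)

# Stationary laws of the two classes, and one (non-ergodic) invariant probability measure mu.
pi_first = np.array([2 / 7, 5 / 7])
pi_second = np.array([0.4, 0.6])
mu = np.concatenate([0.3 * pi_first, 0.7 * pi_second])

# Discrete form of relation (26): sum_{x in B} mu(x) Q(x, A) = mu(A ∩ B) for an invariant set B.
A = [1, 2]
B = [0, 1]
lhs = sum(mu[x] * Q[x, A].sum() for x in B)
rhs = sum(mu[x] for x in A if x in B)
```

Here `lhs` and `rhs` agree up to the $O(1/n)$ error of the Cesàro average, which is the discrete counterpart of (26) for this particular choice of sets.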
PROOF. Suppose that
$$\lim_{t\to\infty}\frac1t\int_0^t I_A(x(u))\,du = \eta(A)$$
almost everywhere with respect to $P_\mu$. By Theorem 4, $\eta(A)$ is a $\mathcal{J}$-measurable variable, and hence $\eta(A) = Q(x(0), A)$ by Theorem 5, where $Q(x, A) = E_x\eta(A)$. If $B$ is $\mu$-invariant, then $I_B(x(0))$ is $\mathcal{J}$-measurable, and this is true for all $B$ in the $\sigma$-algebra of $\mu$-invariant sets. Hence,
$$E_\mu\lim_{t\to\infty}\frac1t\int_0^t I_A(x(u))I_B(x(u))\,du = \lim_{t\to\infty}\frac1t\int_0^t\!\!\int P(u, x, A \cap B)\,\mu(dx)\,du = \mu(A \cap B)$$
$$= E_\mu\eta(A)I_B(x(0)) = E_\mu I_B(x(0))E(\eta(A) \mid x(0)) = \int_B Q(x, A)\,\mu(dx).$$
The relation (26) is established. Finally, since
$$1 = P_\mu\left(\lim_{t\to\infty}\frac1t\int_0^t I_A(x(u))\,du = \eta(A)\right) = \int P_x\left(\lim_{t\to\infty}\frac1t\int_0^t I_A(x(u))\,du = \eta(A)\right)\mu(dx),$$
(27) holds for $\mu$-almost all $x$ with $P_x$-probability 1. If $E_x$ is taken on both sides of (27) and is carried under the limit sign on the left-hand side, then we get a proof of the theorem. $\square$

REMARK. If the invariant finite measure $\mu$ is trivial on the $\sigma$-algebra of $\mu$-invariant sets, then
$$\lim_{t\to\infty}\frac1t\int_0^t P(u, x, A)\,du = \frac{\mu(A)}{\mu(X)} \tag{27}$$
for $\mu$-almost all $x$.

THEOREM 7. Suppose that $\mu$ is an invariant measure for the process $\{\Omega, \mathcal{F}_t, P_x\}$, $\mu$ is trivial on the $\sigma$-algebra of $\mu$-invariant sets, $f, g \in L_1(\mu)$, $g > 0$, and $\int_0^\infty T_ug(x)\,du = +\infty$ for $\mu$-almost all $x$. Then
$$\lim_{t\to\infty}\left(\int_0^t T_uf(x)\,du \Big/ \int_0^t T_ug(x)\,du\right) = \int f(x)\,\mu(dx) \Big/ \int g(x)\,\mu(dx) \tag{28}$$
for $\mu$-almost all $x$.

PROOF. We use the following result (see Neveu [1], Proposition V.6.4): if $Tf(x) = \int f(y)P(x, dy)$, where $P(x, dy)$ is a transition probability in
the phase space $(X, \mathfrak{B})$, $\mu$ is invariant for $P$, $f$ and $g$ are as in the theorem, and $\sum_k T^kg(x) = +\infty$ for $\mu$-almost all $x$, then
$$\lim_{n\to\infty}\sum_{k\le n}T^kf(x) \Big/ \sum_{k\le n}T^kg(x) = \int f(x)\,\mu(dx) \Big/ \int g(x)\,\mu(dx). \tag{29}$$
Applying this assertion to the functions $f_1(x) = \int_0^1 T_uf(x)\,du$ and $g_1(x) = \int_0^1 T_ug(x)\,du$, we get that
$$\lim_{n\to\infty}\left(\int_0^n T_uf(x)\,du \Big/ \int_0^n T_ug(x)\,du\right) = \int f_1(x)\,\mu(dx) \Big/ \int g_1(x)\,\mu(dx)$$
$$= \iint\mu(dx)\int_0^1 P(u, x, dy)f(y)\,du \Big/ \iint\mu(dx)\int_0^1 P(u, x, dy)g(y)\,du = \int f(x)\,\mu(dx) \Big/ \int g(x)\,\mu(dx).$$
Further, it can be assumed without loss of generality that $f > 0$ and $g > f$. For $t \in [n, n+1[$
$$\left|\int_0^t T_uf(x)\,du \Big/ \int_0^t T_ug(x)\,du - \int_0^n T_uf(x)\,du \Big/ \int_0^n T_ug(x)\,du\right| \le \int_n^{n+1}T_ug(x)\,du \Big/ \int_0^n T_ug(x)\,du.$$
The last expression tends to zero, since, by (29) with $T = T_1$,
$$\lim_{n\to\infty}\left(\sum_{k=1}^{n}T^kg_1(x) \Big/ \sum_{k=0}^{n-1}T^kg_1(x)\right) = \lim_{n\to\infty}\left(\sum_{k=0}^{n-1}T^k(Tg_1)(x) \Big/ \sum_{k=0}^{n-1}T^kg_1(x)\right)$$
$$= \int Tg_1(x)\,\mu(dx) \Big/ \int g_1(x)\,\mu(dx) = \int Tg(x)\,\mu(dx) \Big/ \int g(x)\,\mu(dx) = 1,$$
and hence
$$\lim_{n\to\infty}\left(T^ng_1(x) \Big/ \sum_{k=0}^{n-1}T^kg_1(x)\right) = 0. \qquad\square$$
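The discrete-time ratio limit (29) used in this proof can be checked numerically in the simplest setting. The following is a minimal sketch, assuming a hypothetical irreducible three-state transition matrix `P` and arbitrary test functions `f` and `g` with `g > 0`; none of these values come from the text. On a finite state space $T^kf(x)$ is just the $x$-th entry of $P^kf$, so both sides of (29) can be computed directly.

```python
import numpy as np

# Hypothetical irreducible chain and test functions (assumptions for illustration only).
P = np.array([
    [0.9, 0.1, 0.0],
    [0.3, 0.4, 0.3],
    [0.0, 0.5, 0.5],
])
f = np.array([1.0, -2.0, 3.0])
g = np.array([1.0, 2.0, 0.5])

def stationary(P):
    """Solve mu P = mu, sum(mu) = 1 in the least-squares sense."""
    n = P.shape[0]
    M = np.vstack([P.T - np.eye(n), np.ones(n)])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    mu, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return mu

def ratio_of_sums(P, f, g, n):
    """Return the vector of ratios sum_{k<=n} T^k f(x) / sum_{k<=n} T^k g(x), one entry per state x."""
    Tf, Tg = f.copy(), g.copy()
    sum_f, sum_g = np.zeros_like(f), np.zeros_like(g)
    for _ in range(n + 1):
        sum_f += Tf
        sum_g += Tg
        Tf, Tg = P @ Tf, P @ Tg
    return sum_f / sum_g

mu = stationary(P)
limit = (mu @ f) / (mu @ g)          # right-hand side of (29)
ratios = ratio_of_sums(P, f, g, 50000)  # left-hand side of (29) for every starting state
```

For every starting state the ratio approaches the same value `limit`, which reflects the fact that the right-hand side of (29) does not depend on $x$.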
1.3. Harris recurrence. We first consider a Markov chain with phase space $(X, \mathfrak{B})$ and one-step transition probability $P(x, B)$. The $n$-step transition probability is $P_n(x, B)$. Denote by $P_x$ the measure
$$P_x\left(\bigcap_{k=0}^n\{x_k \in B_k\}\right) = I_{B_0}(x)\int_{B_1}P(x, dy_1)\cdots\int_{B_n}P(y_{n-1}, dy_n)$$
in the space $\Omega$ of all sequences $(x_0, x_1, \ldots)$, $x_k \in X$. The chain is said to be recurrent (in the Harris sense) with respect to a $\sigma$-finite measure $\nu$ if for every set $C \in \mathfrak{B}$ with $\nu(C) > 0$ and all $x \in X$
$$P_x\left(\bigcup_{k=0}^\infty\{x_k \in C\}\right) = 1,$$
which is equivalent to the following:
$$P_x\left(\sum_k I_C(x_k) = +\infty\right) = 1.$$
If the chain is recurrent, then it has a unique invariant measure $\mu$, and there is a function $g(x)$ such that $\int g(x)\,\mu(dx) < \infty$ and $P_x\{\sum_n g(x_n) = +\infty\} = 1$ for all $x$. To prove this fact we need some auxiliary propositions.

LEMMA 2. For $0 < \lambda < 1$ let $Q_\lambda(x, A) = \sum_{k=1}^\infty\lambda^kP_k(x, A)$. Then there exist a probability measure $\pi(A)$ on $\mathfrak{B}$ and a positive $\mathfrak{B}$-measurable function $g(x)$ such that
$$Q_\lambda(x, A) \ge g(x)\pi(A).$$

PROOF. Obviously, the equality $Q_\lambda(x, C) = 0$ implies that $P_x\{I_C(x_k) = 0\} = 1$ for all $k$, which is impossible for $\nu(C) > 0$. Hence, $\nu$ is absolutely continuous with respect to $Q_\lambda(x, \cdot)$. If $f(x, y)$ is the density of $\nu$ with respect to $Q_{\lambda/2}(x, \cdot)$ ($f(x, y)$ can be chosen to be $\mathfrak{B}\otimes\mathfrak{B}$-measurable for a countably generated $\mathfrak{B}$), then
$$Q_{\lambda/2}(x, A) \ge \int_A\frac{1}{f(x, y)}\,\nu(dy).$$
Since
$$\int Q_{\lambda/2}(x, dy)Q_{\lambda/2}(y, A) = \sum_{k,n=1}^\infty\left(\frac\lambda2\right)^k\left(\frac\lambda2\right)^nP_{k+n}(x, A) = \sum_{k=2}^\infty\frac{k-1}{2^k}\lambda^kP_k(x, A) < Q_\lambda(x, A),$$
it follows that
$$Q_\lambda(x, A) \ge \int_A\left[\int\frac{1}{f(x, y)}\,\frac{1}{f(y, z)}\,\nu(dy)\right]\nu(dz).$$
16 I. ERGODIC THEOREMS The measure 1/ can be assumed to be a probability measure. If the mea- surable functions k l (y) and k 2 (x) are chosen so that 1/ ( {x: f (x, y) > k 1 (y) }) < 1 14, 1/ ( {y: f (x, y) > k 2 (x) }) < 1 14, then 1 11 1 1 1/(dy) > 1/(dy) x f(x,y) f(y, z) - {y: f(x,y)kl(X),f(y,z)k2(Z)} k l (x)k 2 (z) 1 > 2k 1 (x)k 2 (z) ' Therefore, 1 ( 1 QA(x,A) > 2k 1 (x) J A k 2 (z) v(dz). 0 COROLLARY. There exists a   -measurable function g(x,y) such that g(x)n(A) = i g(x,y)P*(x,dy), (30) where (1 - A.)QA(x,A) = P*(x,A) is the transition probability for some Markov chain in (X, ). We denote this chain by (x o ' x, x;, . . .), the corresponding transition probabilities by P(x, A), and the measure on Q by P. LEMMA 3. The chain (x o ' x,. . .) is representable in the form Xo, x lll ' x 1l2 ' . . . , where 111, 112 - 11., · · · , 11n - 11n 1 are mutually independent, indepen- dent of Xk (k = 0, 1,...), and geometrically distributed with parameter A.: P(11n - 11n-l = k) = (1-A.)A. k , k = 1,2,.... It is recurrent with respect to the same measure 1/ as (xo, XI, . . . ); the invariant measures for the two chains coincide. PROOF. The first assertion follows from the form of P*(x,A). The recurrence with respect to 1/ can be seen from the equality LIA(x;) = L'nIA(X n ), k n and 'n = E:=o I{llm=n}. The variables 'n are independent and take the values 0 with probability A. and 1 with probability I-A.. It is easy to see that the series E 'nIA(x n ) and E IA(x n ) converge or diverge simultaneously. The fact that every invariant measure for {Xk} is invariant also for {x;} follows from the formula for P*(x,A). Using the equalities P* ( A ) = ( I_A. ) n (m-l)(m-2)...(m-n+l) ).mp ( A ) n x, L...J ( n _ I ) ! m x, , m=n 
1. GENERAL ERGODIC THEOREMS 17 we can see that the equalities J P(x, A)p(dx) = p(A) for all n imply JPm(x,A)p(dx) = p(A). 0 We introduce the stopping time 1" as follows: let 8 1 , 8 2 , ... be inde- pendent uniformly distributed variables on [0, 1] that are independent of xo, xi, · . · , and let 1" = min{n > 1, g(X;_I'X;) > 8n}. LEMMA 4. P {1" < oo} = 1, P{1" = n,x; E A} = n(A)P(1" = n), PROOF. We have P{1" = l,x; E A} = P{g(x,xi) > 8 1 ,xi E A} = i g(x,ydP*(x, dyd = g(x)n(A), P;{ 1" = 2,x; E A} = !! (1 - g(x,Yd)g(YhY2)P*(x,dYdP*(Yh dY2) = ! (1- g(x,Yd)g(ydP*(x,dydn(A), P;{ 1" = n, x; E A} = f...! (1 - g(x,yd)... (1 - g(x,Yn-d)g(Yn-d x P*(x, dYl) X . . . x P*(x, dYn-l )n(A). AE, n > 1, XEX. In particular, the last relation implies that P(1" = n) = E(I- g(x,xi))... g(X;-I) = E(I- g(x,xi))... g(X;_I'X;), P(1" > n) = E(1 - g(x,xi))(1 - g(xi,x;))... (1 - g(X;_l'X;)) < Eexp { - tg(X;-I'X;) } . k=l To prove the lemma it remains to show that for all x E X Px { f g(Xk-l'X;) = +oo } = 1. (31) k=l It can be assumed without loss of generality that g(x,y) < c (otherwise, consider the function g(x,y) A c). We take the sequence n Zn = L[g(xZ-l,xZ) - g(xZ- l )], k=l 
18 I. ERGODIC THEOREMS where g(x) = J g(x,y)P*(x,dy); Zn is a martingale with respect to the flow  generated by the variables x k ' k < n. For a > 0 and b > 0 let t' = min[n: Zn ft [-a,b]] vmin [ n: tg(Xk) > a+b ] . k=1 In view of recurrence, E;x' g(Xk) = +00; therefore, t' < 00. Since t' is a stopping time, EZT = O. Therefore, o < (b + C)P{ZT' > b} - aP{zT' < -a} + bP { ZTI E [-a,b], t g(Xk) > a + b } · k=1 The events {ZT' > b} and {ZT' E [a,b]} n {E=1 g(Xk) > a + b} imply the , events {E=1 g(xk-l' x;) > b}. Therefore, o < (b + c)P { t g(Xk-l,Xk) > b } k=1 -a (l-P{ g(Xk-l,Xk) > b}), P { t g(Xk_pXk) > b } > a +: + c ' k=l and, passing to the limit as a --+ 00, we see that for all b > 0 P { f g(Xk-l,Xk) > b } = 1. 0 k=l THEOREM 8. If a Markov chain {x n } is recurrent with respect to a (J- finite measure v on the measurable space (X,) with countably generated a-algebra, then it has an invariant (J-finite measure J.l majorizing v, and the chain is recurrent with respect to this measure. PROOF. Let t be the stopping time constructed in Lemma 4, and let 00 J.l(A) = L P;{x k E A, 1" > k}. k=O 
 1. GENERAL ERGODIC THEOREMS 19 Then j.l majorizes n: #(A) = n(A) + L n(dx) / (1 - g(x,yd)P*(x,dyd k . . . / (1 - g(Yk-2, Yk- d)P* (Yk-2, dYk-l) x L(1-g(Yk-I>Yk))P*(Yk-l>d Y k )+.... The fact that j.l is a-finite follows from the equality / g(x)#(dx) = / n(dx) [g(X) +  /... /(1- g(x,Yd) . . . (1 - g(y k- I> Y k )) g (y k ) ] X P* (x, d y 1) . . . P* (y k - 1 , d Y k) + . . . = / n(dx)P*(. < 00) = 1. Further, / #(dx)P*(x, A) = L / P;{x; E dy,. > k}P*(y, A) kO = L P;(Xk+l E A, , > k) kO (we have used the fact that the event {, > k} is in 9k). Therefore, / #(dx)P*(x, A) = L P; (Xk+ 1 E A,. > k + 1) kO + LP;(Xk+l EA,,=k+ 1). kO On the basis of Lemma 4, L n(A)P;(, = k + 1) = n(A) = P;(xo E A" > 0). kO It is proved that j.l is invariant. We show that the chain x k is j.l-recurrent. For this we construct a sequence of times 'k, where '1 = , and if 'k = m, then 'k+l = min{n > m: g(X;_I'X;) > en} 
20 I. ERGODIC THEOREMS ((In is the same as in the definition of ,). Then X;k has distribution n(A) for k > 1, and the X;k are mutually independent. The events {Tl IA(x;) > 0 } are mutually independent and have the same positive probability when p(A) > O. Therefore, infinitely many of these events occur, i.e., {x;} is p-recurrent. We conclude on the basis of Lemma 3 that the chain {xn} is also p-recurrent, and p is its invariant measure. 0 REMARK 1. Since 00 'k+l-l ,-1 L IA(x;) = L L IA(x;) + L IA(x), n=O k n='k n=O it follows that the condition p(A) = 0, which implies that all the events (31) have Px-probability 0 for k > 1, gives us that E IA(x;) < 00. Thus, f.J. is the maximal measure with respect to which the chain is recurrent. REMARK 2. Suppose that {xn} is a Markov chain with transition proba- bility P(x, A) and a-finite invariant measure p, A E , p(A) > 0, p(A) < 00, and Px(En IA(x n ) = +00) = 1. Denote by 'k the kth time the chain hits the set A: '1 = inf[n > l,x n E A], and 'k = inf[n > 'k-bXn E A]. All t these times are finite. Then the sequence Yn = x'n' Yo = Xo E A, is a homo- geneous Markov chain with transition probability Q(x, C) = Px{X('I) E C} and invariant measure PA (C) = p(A n C). In the proof we need only the invariance of the measure PA (C) for QA(X, C). Since 00 QA(X,C) = LPx{Xk ft A, 1 < k < n,x n E CnA}, n=1 it follows that f JlA(dx)QA(X, C) = f Jl(dy)Q(x, C) - f Jl(dy) t\A P(y, dx) x P(Xk ft A, 1 < k < n, X n E C n A) 00 = f Jl(dY)LP y (xk EA ,l < k<n,xnECnA) k=1 00 - L f Jl(dy)Py(Xk ft A, 1 < k < n,x n E C n A) k=2 = f Jl(dy)Py{xn E C n A} = Jl(C n A) = JlA(C). 0 
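The statement of Remark 2 can be verified directly for a finite chain. The sketch below is an illustration under stated assumptions: the four-state matrix `P` and the set `A` are hypothetical, and the kernel of the chain observed at its successive visits to $A$ is computed by the standard block formula $P_{AA} + P_{AA^c}(I - P_{A^cA^c})^{-1}P_{A^cA}$ (a direct step into $A$, or an excursion through the complement followed by a return), which sums the series over $n$ appearing in the remark.

```python
import numpy as np

# Hypothetical irreducible chain and target set (assumptions for illustration only).
P = np.array([
    [0.1, 0.6, 0.3, 0.0],
    [0.4, 0.2, 0.2, 0.2],
    [0.0, 0.5, 0.1, 0.4],
    [0.3, 0.0, 0.3, 0.4],
])
A = [0, 1]

def stationary(P):
    """Solve mu P = mu, sum(mu) = 1 in the least-squares sense."""
    n = P.shape[0]
    M = np.vstack([P.T - np.eye(n), np.ones(n)])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    mu, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return mu

def induced_kernel(P, A):
    """Kernel Q_A(x, C) = P_x{x(tau_1) in C} of the chain watched only at its visits to A."""
    n = P.shape[0]
    Ac = [i for i in range(n) if i not in A]
    P_AA = P[np.ix_(A, A)]
    P_AAc = P[np.ix_(A, Ac)]
    P_AcAc = P[np.ix_(Ac, Ac)]
    P_AcA = P[np.ix_(Ac, A)]
    # Sum over the excursion lengths spent outside A before the first return.
    return P_AA + P_AAc @ np.linalg.solve(np.eye(len(Ac)) - P_AcAc, P_AcA)

mu = stationary(P)
Q_A = induced_kernel(P, A)
mu_A = mu[A]  # restriction mu_A(C) = mu(A ∩ C), written as a vector on A
```

The rows of `Q_A` sum to 1 because the returns to `A` are almost surely finite, and `mu_A @ Q_A` reproduces `mu_A`, which is the invariance asserted in the remark.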
LEMMA 5. Under the conditions of Theorem 8 the following assertions are valid:
1) An invariant measure is unique to within a factor.
2) If $f$ and $g$ are measurable, $\int(|f(x)| + g(x))\,\mu(dx) < \infty$, and $g(x) > 0$, then for all $x$ the limit
$$l(f, g) = \lim_{n\to\infty}\sum_{k=1}^nf(x_k) \Big/ \sum_{k=1}^ng(x_k)$$
exists with $P_x$-probability 1.

PROOF. Suppose that the chain is recurrent with respect to $\mu$ (the existence of such measures $\mu$ follows from Theorem 8), and let $L$ be the set of $x$ such that the limit $l(f, g)$ exists with $P_x$-probability 1. Then $\mu(L) > 0$, and hence $P_x\{\tau_L < \infty\} = 1$ for all $x$, where $\tau_L$ is the hitting time for $L$. Therefore, assertion 2) is valid. Obviously, the $\sigma$-algebra $\mathcal{J}$ is trivial, and hence every invariant measure is ergodic. If $\nu$ is such a measure, and $A$ and $B$ are such that $\mu(A) + \mu(B) + \nu(A) + \nu(B) < \infty$ and $\mu(A)\mu(B)\nu(A)\nu(B) > 0$, then
$$l(I_A, I_B) = \frac{\mu(A)}{\mu(B)} = \frac{\nu(A)}{\nu(B)}$$
with $P_x$-probability 1 for all $x$. $\square$

The next theorem enables one to carry over the results from the discrete case to the continuous case.

THEOREM 9. Suppose that $\{\Omega, \mathcal{F}_t, P_x\}$ is a homogeneous Markov process with transition probability $P(t, x, A)$ in a measurable phase space $(X, \mathfrak{B})$ with countably generated $\sigma$-algebra $\mathfrak{B}$. Let
$$Q_\lambda(x, A) = \lambda\int_0^\infty e^{-\lambda t}P(t, x, A)\,dt.$$
If a Markov chain in $(X, \mathfrak{B})$ with transition probability $Q_\lambda(x, A)$ is recurrent with respect to some measure $\nu$, then there exists an invariant $\sigma$-finite measure $\mu$ for the Markov process majorizing it, and there exists a positive function $g(x)$ such that $\int g(x)\,\mu(dx) < \infty$ and $P_x(\int_0^\infty g(x_t)\,dt = +\infty) = 1$ for all $x$. If $\mu$ is finite, then the limit in (24') exists with $P_x$-probability 1 for all $x$.

PROOF. Let $\mu$ be the invariant measure constructed in Theorem 8 for a chain with transition probability $Q_\lambda(x, A)$. As a Markov chain with this transition probability we can take the sequence $x_n = x(\zeta_n)$, where $\zeta_n = \xi_1 + \cdots + \xi_n$, and $\{\xi_k\}$ is a sequence of random variables that are
mutually independent, independent of the Markov process, and have the exponential distribution $P\{\xi_k > t\} = e^{-\lambda t}$; $x_0 = x(0)$. Then
$$P_x\{x_1 \in A\} = P_x\{x(\zeta_1) \in A\} = Q_\lambda(x, A).$$
If $Q_\lambda^n(x, A)$ denotes the $n$-step transition probability, then
$$Q_\lambda^n(x, A) = \frac{\lambda^n}{(n-1)!}\int_0^\infty t^{n-1}e^{-\lambda t}P(t, x, A)\,dt.$$
If for all $n$
$$\int\mu(dx)\,Q_\lambda^n(x, A) = \mu(A),$$
then for every polynomial $g(t)$ in $t$
$$\int_0^\infty e^{-\lambda t}g(t)\int\mu(dx)P(t, x, A)\,dt = \mu(A)\int_0^\infty e^{-\lambda t}g(t)\,dt.$$
From this, $\int\mu(dx)P(t, x, A) = \mu(A)$ for almost all $t > 0$. Since $\mathfrak{B}$ is countably generated, this equality holds simultaneously for almost all $t$ and all $A$. But then
$$\int\mu(dx)P(t, x, A) = \frac1t\int_0^t\!\!\iint\mu(dx)P(s, x, dy)P(t-s, y, A)\,ds = \frac1t\int_0^t\!\!\int\mu(dy)P(t-s, y, A)\,ds = \mu(A)$$
for any $t > 0$, i.e., $\mu$ is an invariant measure for the Markov process.

Let $g(x)$ be a positive function with $\int g(x)\,\mu(dx) < \infty$. We show that
$$P_x\left\{\int_0^\infty g(x(t))\,dt = +\infty\right\} = 1.$$
The function $g_\lambda(x) = \int g(y)\,Q_\lambda(x, dy)$ is also positive, and, because the chain is recurrent,
$$P_x\left\{\sum_n g_\lambda(x(\zeta_n)) = +\infty\right\} = 1.$$
Since
$$\int_0^\infty g(x(t))\,dt \ge \sum_n\int_{\zeta_n}^{\zeta_{n+1}}e^{-\lambda(u-\zeta_n)}g(x(u))\,du,$$
it suffices to prove that the series on the right-hand side diverges. Note that the sequence
$$z_n = \sum_{k<n}\left[\lambda\int_{\zeta_k}^{\zeta_{k+1}}e^{-\lambda(u-\zeta_k)}g(x(u))\,du - g_\lambda(x(\zeta_k))\right]$$
is a martingale with respect to the flow $\mathcal{F}_{\zeta_n}$. It can be assumed that $\|g\| < \infty$. Then
$$\lambda\int_{\zeta_k}^{\zeta_{k+1}}e^{-\lambda(u-\zeta_k)}g(x(u))\,du \le \|g\|\lambda\int_0^\infty e^{-\lambda s}\,ds = \|g\|.$$
Therefore, as in the proof of Lemma 4, we find for the stopping time
$$\tau' = \min[n\colon z_n \notin [-a, b]] \vee \min\Big[n\colon \sum_{k\le n}g_\lambda(x(\zeta_k)) > a + b\Big]$$
that
$$P_x\left\{\sum_{k=0}^\infty\lambda\int_{\zeta_k}^{\zeta_{k+1}}e^{-\lambda(u-\zeta_k)}g(x(u))\,du > b\right\} \ge \frac{a}{a+b+c}.$$
This inequality yields what was required. $\square$

§2. Densities for transition probabilities and resolvents for Markov solutions of stochastic differential equations

We consider Markov processes formed by solutions of homogeneous stochastic differential equations of the following form:
$$dx(t) = a(x(t))\,dt + B(x(t))\,dw(t) + \int f_1(x(t), \theta)\,\mu_1(dt \times d\theta) + \int f_2(x(t), \theta)\,\nu_2(dt \times d\theta), \tag{32}$$
where $x(t)$ is a process with values in $R^d$, $a(x)\colon R^d \to R^d$, $B(x)\colon R^d \to L(R^d)$, $f_i(x, \theta)\colon X \times \Theta \to R^d$, $\Theta$ is a measurable space, $w(t)$ is a Wiener process in $R^d$, $\nu_i(dt \times d\theta)$ is a Poisson measure on $R_+ \times \Theta$, $E\nu_i(dt \times d\theta) = dt \cdot m_i(d\theta)$, $m_1$ is a $\sigma$-finite measure on $\Theta$, $\mu_1 = \nu_1 - E\nu_1$, and $m_2$ is a finite measure on $\Theta$. The coefficients $a$ and $B$ are locally bounded Borel functions, the $f_i$ are jointly measurable in $(x, \theta)$, and $\int|f_1(x, \theta)|^2\,m_1(d\theta)$ is locally bounded. It is assumed in addition that the coefficients in (32) are such that the solution of the equation is weakly unique (see Gikhman-Skorokhod [2], Chapter 6, §1). In this case $x(t)$ is a Markov process with transition probability $P(t, x, A) = P(\xi_x(t) \in A)$, where $\xi_x(t)$ is the solution of (32) with initial condition $\xi_x(0) = x$. We remark that in the case when the solution of (32) is strongly unique it can be constructed on the probability space on which the Wiener process $w(t)$ and the measures $\nu_k$ are given, and the $\sigma$-algebras generated by $\{w(s), \nu_k(ds \times d\theta),\ k = 1, 2,\ s \le t\}$ appear as the $\sigma$-algebras $\mathcal{F}_t$.

As follows from §1.3, in investigating the ergodic properties for Markov processes we must look for a measure $\nu$ such that a chain with transition
probability
$$Q_\lambda(x, A) = \lambda\int_0^\infty e^{-\lambda t}P(t, x, A)\,dt$$
is recurrent with respect to $\nu$. If $x(t)$ were a process with independent increments, then Lebesgue measure would be invariant for it. The process $x(t)$ is locally spatially homogeneous, and though Lebesgue measure is no longer invariant, the process can be recurrent with respect to this measure. To investigate recurrence with respect to Lebesgue measure we study absolute continuity of $Q_\lambda$ with respect to this measure (if it is singular, then Lebesgue measure cannot supply recurrence to a chain with such a transition probability). This question is also considered in the present section.

2.1. Nondegenerate diffusion processes. We consider a process $x(t)$ solving the equation
$$dx(t) = a(x(t))\,dt + B(x(t))\,dw(t) \tag{33}$$
with measurable coefficients. The functions $a(x)$, $B(x)$, and $B^{-1}(x)$ are assumed to be locally integrable. In this case (33) has a weak solution that is weakly unique and is a Markov random process, and the function $T_tf(x) = \int f(y)P(t, x, dy)$ is continuous in $x$ for every bounded continuous function $f$ if the linear boundedness condition
$$|a(x)| + \|B(x)\| \le C(1 + |x|) \tag{34}$$
holds with a constant $C$. These results are found, for example, in Gikhman-Skorokhod [2] (Chapter 6, §3).

It is known that the measure corresponding to the solution of equation (33) with a given initial condition on some finite interval $[0, T]$ is equivalent to the same measure for the solution of the equation with $a = 0$. Therefore, the transition probability densities with respect to Lebesgue measure exist simultaneously for $a \ne 0$ and $a = 0$, and we can restrict our attention to the case $a = 0$.

THEOREM 10. Suppose that $B(x)$ and $B^{-1}(x)$ are locally bounded and $\|B(x)\| \le C(1 + |x|)$. Then for all $\lambda > 0$ and $x \in R^d$ the measure $Q_\lambda(x, A)$ is absolutely continuous with respect to Lebesgue measure. The measures $Q_\lambda(x, A)$ and $Q_\lambda(y, A)$ are equivalent for any $x, y \in R^d$.

PROOF.
We use the following result of Krylov ([1], Chapter 2, §2, Lemma 8). For $r > 0$ suppose that $\alpha_r, \beta_r > 0$ are such that for all
$x \in S_r = \{x\colon |x| < r\}$
$$\alpha_r|z|^2 \le |B(x)z|^2 \le \beta_r|z|^2,$$
and let $\tau_r$ be the exit time of the process from the sphere $S_r$. Then there exists a constant $q_t$ depending on $r$, $\alpha_r$, and $\beta_r$ such that for all $x$ and every measurable function $f(s, x)$ on $[0, t] \times S_r$ with
$$\int_0^t\!\!\int_{|y|\le r}|f(s, y)|^{d+1}\,ds\,dy < \infty,$$
we have the inequality
$$E_x\int_0^{t\wedge\tau_r}f(s, x(s))\,ds \le q_t\left(\int_0^t\!\!\int_{|y|\le r}|f(s, y)|^{d+1}\,ds\,dy\right)^{1/(d+1)}. \tag{35}$$
Let $g_r(t, x, A) = \int_0^tP_x\{x(s) \in A,\ \tau_r > s\}\,ds$, and let $f(y)$ be such that $\int_{|y|\le r}|f(y)|^{d+1}\,dy < \infty$. Then
$$\int_{|y|\le r}f(y)\,g_r(t, x, dy) \le q_t'\left(\int_{|y|\le r}|f(y)|^{d+1}\,dy\right)^{1/(d+1)},$$
where $q_t'$ does not depend on $f$. Therefore,
$$\int_{|y|\le r}f(y)\,g_r(t, x, dy)$$
is a continuous functional on the space $L_{d+1}(S_r)$ of functions with $(d+1)$st power integrable on $S_r$ (with respect to Lebesgue measure). It can be represented in the form
$$\int_{|y|\le r}f(y)\varphi_r(t, x, y)\,dy,$$
where $\varphi_r(t, x, \cdot) \in L_{(d+1)/d}(S_r)$, i.e., for Borel sets $A \subset S_r$
$$g_r(t, x, A) = \int_A\varphi_r(t, x, y)\,dy.$$
It is easy to see that $g_r(t, x, A)$ is an increasing function of $r$; therefore, $\varphi_r(t, x, y)$ is also an increasing function of $r$ almost everywhere. Thus, the limit
$$\lim_{r\to\infty}\varphi_r(t, x, y) = \varphi(t, x, y)$$
exists, and
$$\int_0^tP_x(x(s) \in A)\,ds = \lim_{r\to\infty}g_r(t, x, A) = \lim_{r\to\infty}\int_A\varphi_r(t, x, y)\,dy = \int_A\varphi(t, x, y)\,dy.$$
This implies that $Q_\lambda(x, A)$ is absolutely continuous with respect to Lebesgue measure.

Let $A$ be a bounded Borel set, and $f_n$ a sequence of uniformly bounded continuous functions such that
$$\lim_{n\to\infty}\int|I_A(y) - f_n(y)|\,dy = 0.$$
Then
$$\left|\int_0^tP(s, x, A)\,ds - \int_0^t\!\!\int P(s, x, dy)f_n(y)\,ds\right| \le E_x\int_0^t|I_A(x(s)) - f_n(x(s))|\,ds$$
$$\le E_x\int_0^{t\wedge\tau_r}|I_A(x(s)) - f_n(x(s))|\,ds + t\Big(1 + \sup_n\|f_n\|\Big)P_x\{\tau_r < t\}$$
$$\le q_t\left(t\int_{|y|\le r}|I_A(y) - f_n(y)|^{d+1}\,dy\right)^{1/(d+1)} + t\Big(1 + \sup_n\|f_n\|\Big)P_x\{\tau_r < t\}.$$
Therefore, for all $r_1$
$$\limsup_{n\to\infty}\sup_{|x|\le r_1}\left|\int_0^tP(s, x, A)\,ds - \int_0^tT_sf_n(x)\,ds\right| \le Ct\sup_{|x|\le r_1}P_x\{\tau_r < t\}.$$
But
$$P_x\{\tau_r < t\} = P_x\Big\{\sup_{s\le t}|x(s)| > r\Big\} \le \frac{E_x|x(t)|^2}{r^2},$$
and the right-hand side tends to zero uniformly with respect to $|x| \le r_1$ as $r \to \infty$. Thus, $\int_0^tP(s, x, A)\,ds$ is a locally uniform limit of continuous functions, and hence is also continuous.

Assume that $Q_\lambda(\bar x, A) = 0$ for some $\bar x$. Then $\int_0^tP(s, \bar x, A)\,ds = 0$ for all $t > 0$. This implies that
$$\int P(h, \bar x, dy)\int_0^tP(s, y, A)\,ds = \int_h^{t+h}P(s, \bar x, A)\,ds = 0$$
for all $t > 0$ and $h > 0$, and hence
$$\int_0^{t_1}\!\!\int P(u, \bar x, dy)\,du\int_0^tP(s, y, A)\,ds = 0$$
for all $t > 0$ and $t_1 > 0$. Therefore,
$$\int_0^{t_1}\!\!\int P(u, \bar x, dy)\,Q_\lambda(y, A)\,du = 0. \tag{36}$$
We show that $Q_\lambda(x, G) > 0$ for all $x$ for every open set $G$. Since
$$Q_\lambda(x, G) = \lambda\int_0^\infty e^{-\lambda t}\,d_t\!\int_0^tP(s, x, G)\,ds = \lambda^2\int_0^\infty e^{-\lambda t}\int_0^tP(s, x, G)\,ds\,dt$$
is a continuous function of $x$, it follows that the set
$$F = \left\{x\colon \int_0^\infty P(u, x, G)\,du = 0\right\}$$
is closed. If $G_1 = R^d\setminus F$, then $P(t, x, G_1) = 0$ for all $x \in F$ and $t > 0$, because
$$0 = \int_t^\infty P(u, x, G)\,du = \int_{G_1}P(t, x, dy)\int_0^\infty P(u, y, G)\,du.$$
Therefore, there exists an open ball $S$ with boundary intersecting $F$ such that $S \subset G_1$. Hence, $P(t, x_0, S) = 0$ for all $t > 0$ and some $x_0$ lying on the boundary of $S$. Using the law of the iterated logarithm, we can get that for some $c_1, c_2 > 0$
$$P_{x_0}\left\{\varlimsup_{t\to0}\frac{|x(t) - x_0|}{\sqrt{2t\ln\ln t^{-1}}} \le c_1\right\} = 1,$$
$$P_{x_0}\left\{\varlimsup_{t\to0}\frac{(x(t) - x_0, a)}{\sqrt{2t\ln\ln t^{-1}}} \ge c_2\right\} = 1,$$
where $a = x_1 - x_0$, $x_1$ being the center of $S$. Therefore, there is an infinite sequence of points $t_k \to 0$ such that $x(t_k) \in S$ with $P_{x_0}$-probability 1. This contradicts the assumption that there exist an open set $G$ and a point $x$ such that $Q_\lambda(x, G) = 0$. In particular, $\int Q_\lambda(x, dy)\varphi(y) > 0$ for every continuous function $\varphi > 0$. If $Q_\lambda(x, A)$ is not identically equal to zero, then, since the set of $x$ with $Q_\lambda(x, A) > 0$ is open,
$$\int_0^{t_1}\!\!\int P(s, \bar x, dy)\,ds\,Q_\lambda(y, A) > 0$$
for sufficiently large $t_1$; but this contradicts (36). Hence, $Q_\lambda(x, A)$ is either positive for all $x$ or identically equal to zero. The theorem is proved. $\square$

2.2. Diffusion processes with degenerate diffusion. We consider solutions of the equation
$$dx(t) = a(t, x(t))\,dt + B(t, x(t))\,dw(t), \tag{37}$$
where $a(t, x)$ and $B(t, x)$ are continuous, locally bounded, and continuously differentiable, and satisfy inequality (34) for some $C > 0$. We are interested in conditions for the existence of a density for the transition probability. The following general fact will be used.
LEMMA 6. Let $X$ be a separable Hilbert space, $L$ an $m$-dimensional subspace of $X$, and $\mu$ a probability measure on $X$ such that all $a \in L$ are admissible translations for $\mu$, i.e., the measure $\mu_a$ determined by the equality $\int f(x)\,\mu_a(dx) = \int f(x + a)\,\mu(dx)$ is absolutely continuous with respect to $\mu$. The density of $\mu_a$ with respect to $\mu$ is denoted by
$$\frac{d\mu_a}{d\mu}(x) = \rho(a, x).$$
Further, suppose that $\Phi(x)$ is a mapping of $X$ to $L$ that is continuous and continuously differentiable in the directions of $L$ in the measure $\mu$ (the derivative in the measure $\mu$ in the direction $a$ is defined as the limit
$$\lim_{\lambda\to0}\frac1\lambda[\Phi(x + \lambda a) - \Phi(x)] = \Phi'(x)a$$
in the measure $\mu$), and let $\Phi'_L(x)$ be the operator from $L$ to $L$ that is the derivative of $\Phi$ along $L$. Denote by $\nu$ the image of $\mu$ on $L$ under the mapping $\Phi$. Assume the following conditions hold:
a) The set $S(x, y)$ of $g \in L$ with $\Phi(x + g) = y$ is at most countable for $\mu$-almost all $x$ and for all $y \in L$.
b) $|\det\Phi'_L(x)| > 0$ for $\mu$-almost all $x$.
c) $\int_L\rho(u, x)\,du < \infty$ for $\mu$-almost all $x$.
Then the measure $\nu$ has a density with respect to the Lebesgue measure $dx$ on $L$. This density is given by
$$p_\nu(y) = \frac{d\nu}{dy}(y) = \int\sum_{g\in S(x,y)}\frac{1}{|\det\Phi'_L(x + g)|\int_L\rho(u, x + g)\,du}\,\mu(dx). \tag{38}$$
See Skorokhod's book [5] (§27, Theorem 1) for a proof in the case $m = 1$; the proof in the general case is analogous.

REMARK. Suppose that the measure $\mu$ has a dense linear manifold $X_0$ of admissible translations in $X$, the mapping $\Phi(x)$ is differentiable along directions in $X_0$ in the measure $\mu$, and the derivative along any finite-dimensional subspace of $X_0$ is continuous in the measure $\mu$.
If $\Phi'_{X_0}(x)$, which is a linear operator from $X_0$ to $L$, maps $X_0$ onto $L$ for $\mu$-almost all $x$, and $\int_N\rho(u, x)\,du < \infty$ for $\mu$-almost all $x$ for any finite-dimensional subspace $N \subset X_0$, then there exist a partition $X = \bigcup_kU_k$ of $X$ into finitely many measurable parts and $m$-dimensional subspaces $L_k \subset X_0$ such that for all $y \in L$ the set $\{x\colon \Phi(x) = y\} \cap U_k$ projects in one-to-one fashion
on $X \ominus L_k$ (the orthogonal complement of $L_k$), and $\Phi'_{L_k}(x)$ is a nonsingular operator from $L_k$ onto $L$. If $g_k(x, y)$, $x \in U_k$, denotes the point in $\{x\colon \Phi(x) = y\} \cap U_k$ with the same projection as $x$, then
$$p_\nu(y) = \sum_k\int_{\tilde U_k(y)}\frac{1}{|\det\Phi'_{L_k}(g_k(x, y))|\int_{L_k}\rho(u, g_k(x, y))\,du}\,\mu(dx), \tag{39}$$
where $\tilde U_k(y)$ is the set of those $x$ whose projections on $X \ominus L_k$ coincide with the projection of $\{x\colon \Phi(x) = y\} \cap U_k$.

If $\mu$ is a Gaussian measure on $X$ with mean 0 and correlation operator $B$ ($B$ is a trace-class operator), then the set $X_0$ of admissible translations coincides with $B^{1/2}X$; if $a \in X_0$, then
$$\rho(a, x) = \exp\{(B^{-1/2}a, B^{-1/2}x) - \tfrac12|B^{-1/2}a|^2\}.$$
Let $N$ be an $m$-dimensional subspace with orthonormal basis $a_1, \ldots, a_m$. Then
$$\rho\Big(\sum_kt_ka_k, x\Big) = \exp\Big\{\sum_kt_k(B^{-1/2}a_k, B^{-1/2}x) - \frac12\sum_{k,l=1}^mt_kt_l(B^{-1/2}a_k, B^{-1/2}a_l)\Big\}.$$
It can be assumed without loss of generality that the $a_k$ are such that $(B^{-1/2}a_k, B^{-1/2}a_l) = 0$ for $k \ne l$. Then
$$\int\rho\Big(\sum_kt_ka_k, x\Big)\,dt_1\cdots dt_m = \int\cdots\int\exp\Big\{\sum_kt_k(B^{-1/2}a_k, B^{-1/2}x) - \frac12\sum_{k=1}^mt_k^2(B^{-1/2}a_k, B^{-1/2}a_k)\Big\}\,dt_1\cdots dt_m$$
$$= \prod_{k=1}^m\sqrt{\frac{2\pi}{(B^{-1/2}a_k, B^{-1/2}a_k)}}\;\exp\Big\{\frac12\sum_{k=1}^m\frac{(B^{-1/2}a_k, B^{-1/2}x)^2}{(B^{-1/2}a_k, B^{-1/2}a_k)}\Big\}.$$
A Wiener process $w(t)$ with values in $R^d$, $t \in [0, T]$, can be regarded as a Gaussian variable in the Hilbert space $L_2([0, T], R^d)$ of functions $a(t)$ with values in $R^d$ that are square-integrable on $[0, T]$. If $\mu$ is the Gaussian measure corresponding to this variable, then $X_0$ consists of the functions
$a(t)$ with derivative $a'(t)$ satisfying $\int_0^T|a'(t)|^2\,dt < \infty$, and
$$\rho(a, x) = \exp\left\{\int_0^T(a'(t), dx(t)) - \frac12\int_0^T|a'(t)|^2\,dt\right\}.$$
The solution of (37) for a particular initial value $x_0$ is a function of $w(t)$, and we can consider the mapping of $L_2([0, T], R^d)$ to $R^d$ given by the equality $\Phi_t(w(\cdot)) = x(t)$. We find the derivative $\Phi'_t(w(\cdot))a(\cdot)$, where $a(\cdot)$ is an admissible translation. We have that $\Phi_t(w(\cdot) + \lambda a(\cdot)) = x_\lambda(t)$, where $x_\lambda(t)$ is a solution of the equation
$$dx_\lambda(t) = a(t, x_\lambda(t))\,dt + B(t, x_\lambda(t))[dw(t) + \lambda a'(t)\,dt]$$
$$= [a(t, x_\lambda(t)) + \lambda B(t, x_\lambda(t))a'(t)]\,dt + B(t, x_\lambda(t))\,dw(t). \tag{40}$$
The coefficients in (40) are differentiable with respect to $x$ and $\lambda$; therefore (see Gikhman-Skorokhod [2], p. 263), $\partial x_\lambda(t)/\partial\lambda$ exists and satisfies the equation
$$d\,\frac{\partial}{\partial\lambda}x_\lambda(t) = \left[a'_x(t, x_\lambda(t))\frac{\partial x_\lambda(t)}{\partial\lambda} + \lambda B'_x(t, x_\lambda(t))a'(t)\frac{\partial x_\lambda(t)}{\partial\lambda} + B(t, x_\lambda(t))a'(t)\right]dt + B'_x(t, x_\lambda(t))\,dw(t)\,\frac{\partial x_\lambda(t)}{\partial\lambda}.$$
Let $\Phi'_t(w(\cdot))a(\cdot) = y(t)$. Then $y(t)$ satisfies
$$dy(t) = [a'_x(t, x(t))y(t) + B(t, x(t))a'(t)]\,dt + B'_x(t, x(t))\,dw(t)\,y(t). \tag{41}$$
Let $a'_x(t, x(t)) = A(t)$, $[B'_x(t, x(t))\,dw(t)] = dB(t)$, $B(t, x(t))a'(t) = z(t)$. Equation (41) can be rewritten in the form
$$dy(t) = A(t)y(t)\,dt + dB(t)y(t) + z(t)\,dt. \tag{42}$$
This is a nonhomogeneous linear equation (see Chapter III, §2). A solution of it can be written as follows. Let $Z_t$ be an operator-valued process in $L(R^d)$ that is the solution of the equation
$$dZ_t = A(t)Z_t\,dt + dB(t)Z_t$$
with initial condition $Z_0 = I$. Then (since $y(0) = 0$)
$$y(t) = Z_t\int_0^tZ_s^{-1}B(s, x(s))a'(s)\,ds.$$
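The representation of $y(t)$ can be checked on a scalar example where everything is explicit. The sketch below is only an illustration under stated assumptions: the coefficients $a(x) = mx$ and $B(x) = \sigma x$ (geometric Brownian motion), the shift $a(t) = \sin t$, and all numerical parameters are hypothetical choices, not taken from the text. For these coefficients $x_\lambda(t) = x_0\exp\{(m - \sigma^2/2)t + \sigma(w(t) + \lambda a(t))\}$, $A(t) = m$, $dB(t) = \sigma\,dw(t)$, and $Z_t = x(t)/x_0$, so the derivative in $\lambda$ at $\lambda = 0$ can be compared with $Z_t\int_0^tZ_s^{-1}B(x(s))a'(s)\,ds$ along one simulated Wiener path.

```python
import numpy as np

# Hypothetical parameters (assumptions for illustration only).
rng = np.random.default_rng(0)
T, n = 1.0, 1000
m, sigma, x0 = 0.3, 0.5, 1.2

t = np.linspace(0.0, T, n + 1)
dw = rng.normal(0.0, np.sqrt(T / n), n)
w = np.concatenate([[0.0], np.cumsum(dw)])  # one Wiener path on the grid

shift = np.sin(t)        # admissible translation a(t), with a(0) = 0
shift_prime = np.cos(t)  # its derivative a'(t)

def x_shifted(lam):
    """Explicit solution of dx = m x dt + sigma x (dw + lam a'(t) dt), x(0) = x0."""
    return x0 * np.exp((m - 0.5 * sigma**2) * t + sigma * (w + lam * shift))

x = x_shifted(0.0)

# Derivative of x_lambda(t) in lambda at lambda = 0, by a central finite difference.
h = 1e-6
y_fd = (x_shifted(h) - x_shifted(-h)) / (2 * h)

# Variation-of-constants formula: Z solves dZ = m Z dt + sigma Z dw, Z_0 = 1.
Z = x / x0
integrand = sigma * x * shift_prime / Z  # Z_s^{-1} B(x(s)) a'(s)
steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)  # trapezoid rule
integral = np.concatenate([[0.0], np.cumsum(steps)])
y_formula = Z * integral
```

In this example both `y_fd` and `y_formula` approximate $\sigma a(t)x(t)$, and they agree up to the discretization error of the trapezoid rule and the finite difference.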
The operator $Z_t$ is invertible. Therefore, the dimension of the space of vectors $y(t)$ as $a'(s)$ runs through $L_2([0, t], R^d)$ is the same as the dimension of the space of vectors
$$\int_0^tZ_s^{-1}B(s, x(s))a'(s)\,ds. \tag{43}$$
Assume that this dimension is less than $d$ for some $t > 0$. Denote the space of vectors of the form (43) by $H_t$. Obviously, $H_t$ is generated by the vectors $Z_s^{-1}B(s, x(s))a$, where $a \in R^d$ and $s \le t$, and $H_t$ is an increasing function that can have discontinuities only when the dimension of $H_t$ changes, i.e., the number of discontinuities does not exceed $d$. Let $\tau_1 > 0$ be the first discontinuity of $H_t$ after the point 0. Then $H_{0+} = H_t$ for $t < \tau_1$. Since $H_{0+}$ is nonrandom, for $H_{0+} \ne R^d$ there is a nonrandom vector $v \in R^d$ such that $(v, Z_s^{-1}B(s, x(s))a) = 0$ for $s < \tau_1$ for all $a \in R^d$. But then
$$B^*(s, x(s))Z_s^{*-1}v = 0, \qquad s < \tau_1. \tag{44}$$
Thus, the following general theorem is valid.

THEOREM 11. For the transition probability of a Markov process determined by the stochastic differential equation (37) to have a positive density it suffices that the set $\{s > t\colon |B^*(s, x(s))Z_s^{*-1}v| > 0\}$ have $t$ as a limit point with positive $P_{x,t}$-probability for all $x \in R^d$, $t > 0$, and $v \in R^d$. Here $P_{x,u}$ is the distribution of the solution $x(s)$ of (37) on $[u, \infty[$ with the initial condition $x(u) = x$, and $Z_s$ is the solution of (42).

Let us consider in more detail the homogeneous case. We rewrite (37) in the form
$$dx(t) = a(x(t))\,dt + \sum_{k=1}^db_k(x(t))\,dw_k(t), \tag{45}$$
where the $w_k(t)$, $k = 1, \ldots, d$, are one-dimensional Wiener processes, $a(x)$ and $b_1(x), \ldots, b_d(x)$ are continuously differentiable functions from $R^d$ to $R^d$, and
$$|a(x)| + \sum_{k=1}^d|b_k(x)| \le c(1 + |x|).$$
The equation for $Z_s$ has the form
$$dZ_s = a'(x(s))Z_s\,ds + \sum_{k=1}^db'_k(x(s))Z_s\,dw_k(s)$$
($a'$ and $b_k'$ denote the derivatives with respect to $x$). The equation for $Z_s^*$ is obtained by passing to the adjoint operators in the last equation:
$$dZ_s^* = Z_s^* a'^*(x(s))\,ds + \sum_{k=1}^{d} Z_s^* b_k'^*(x(s))\,dw_k(s).$$
Using the Itô formula, we get that
$$dZ_s^{-1*} = -\left(a'^*(x(s))\,ds + \sum_{k=1}^{d} b_k'^*(x(s))\,dw_k(s)\right)Z_s^{-1*} + \sum_{k=1}^{d} [b_k'^*(x(s))]^2 Z_s^{-1*}\,ds.$$
Let $Z_s^{-1*} = U_s$. Then $U_s$ satisfies the equation
$$dU_s = \left\{\left(\sum_{k=1}^{d} [b_k'^*(x(s))]^2 - a'^*(x(s))\right)ds - \sum_{k=1}^{d} b_k'^*(x(s))\,dw_k(s)\right\}U_s. \tag{46}$$
The condition (44) is transformed as follows. Suppose that $\{e_k, k = 1, \ldots, d\}$ is an orthonormal basis in $R^d$, and let the Wiener process $w(t)$ in $R^d$ be given by $w(t) = \sum w_k(t)e_k$. In this case if $B$ is the diffusion operator in (37), then under the transformation of (37) to (45) we must have $b_k = Be_k$. It follows from (44) that
$$(B^*(x(s))U_s v, e_k) = (U_s v, b_k(x(s))) = 0 \quad \text{for } k = 1, 2, \ldots, d,\ s < \tau_1. \tag{47}$$
Using Theorem 11, we now establish sufficient conditions for the existence of a transition probability density for a solution of (45).

THEOREM 12. Suppose that for all $x$ the functions $b_k(x)$ are twice continuously differentiable, and the subspace spanned by the vectors
$$\{b_1(x), \ldots, b_d(x),\ c_1(x), \ldots, c_d(x),\ c_{12}(x), \ldots, c_{d-1,d}(x)\}, \tag{48}$$
where
$$c_k(x) = \sum_{r=1}^{d}\left[(b_r'(x))^2 b_k(x) - b_r'(x)b_k'(x)b_r(x) + \tfrac{1}{2}b_k''(x)[b_r(x), b_r(x)]\right] + b_k'(x)a(x) - a'(x)b_k(x),$$
$$b''(x)[a_1, a_2] = \left.\frac{\partial^2}{\partial t\,\partial s}\,b(x + ta_1 + sa_2)\right|_{t=0,\ s=0},$$
$$c_{kl}(x) = b_k'(x)b_l(x) - b_l'(x)b_k(x),$$
coincides with $R^d$. Then the transition probability for a solution of (45) has a positive density.

PROOF. A density does not exist if for some nonzero $v \in R^d$ and some $\tau > 0$ we have that $(U_s v, b_k(x(s))) = 0$ for $s < \tau$, $k = 1, \ldots, d$. Differentiating the last relation with respect to $s$, we see from the Itô formula that
$$\begin{aligned}
&\left(\left\{\sum_{r=1}^{d}[b_r'^*(x(s))]^2 - a'^*(x(s))\right\}U_s v,\ b_k(x(s))\right)ds - \sum_{r=1}^{d}\left(b_r'^*(x(s))U_s v,\ b_k(x(s))\right)dw_r(s) \\
&\quad + \left(U_s v,\ b_k'(x(s))a(x(s))\right)ds + \sum_{r=1}^{d}\left(U_s v,\ b_k'(x(s))b_r(x(s))\right)dw_r(s) \\
&\quad + \frac{1}{2}\sum_{r=1}^{d}\left(U_s v,\ b_k''(x(s))[b_r(x(s)), b_r(x(s))]\right)ds - \sum_{r=1}^{d}\left(b_r'^*(x(s))U_s v,\ b_k'(x(s))b_r(x(s))\right)ds = 0.
\end{aligned}$$
This implies the equalities
$$\left(U_s v,\ \sum_{r=1}^{d}\left[(b_r'(x(s)))^2 b_k(x(s)) - b_r'(x(s))b_k'(x(s))b_r(x(s)) + \tfrac{1}{2}b_k''(x(s))[b_r(x(s)), b_r(x(s))]\right] + b_k'(x(s))a(x(s)) - a'(x(s))b_k(x(s))\right) = 0, \quad k = 1, 2, \ldots, d,$$
$$\left(U_s v,\ b_k'(x(s))b_r(x(s)) - b_r'(x(s))b_k(x(s))\right) = 0 \tag{49}$$
(we have used the fact that $\alpha(s)\,ds + \sum \beta_k(s)\,dw_k(s) = 0$ implies $\alpha(s) = 0$ and $\beta_k(s) = 0$, $k = 1, \ldots, d$, for almost all $s$). Passage to the limit as $s \to 0$ gives us that $(v, c_k(x)) = 0$ and $(v, c_{kl}(x)) = 0$, $k, l = 1, 2, \ldots, d$. And we find from (47) that $(v, b_k(x)) = 0$, $k = 1, \ldots, d$. But under the assumptions of the theorem there is no nonzero vector $v \in R^d$ for which all these equalities hold. $\square$

If the coefficients $a_k$ and $b_k$ are smoother, then a stronger result can be obtained. It is based on the following lemma.

LEMMA 7. Let $c(x)$ be a twice continuously differentiable function for which there exists a $\tau > 0$ such that $(U_s v, c(x(s))) = 0$ when $s < \tau$. Then
the equalities
$$(U_s v, r(x(s))) = 0, \qquad (U_s v, r_k(x(s))) = 0, \quad k = 1, \ldots, d,$$
hold for $s < \tau$, where
$$r(x) = \sum_{l=1}^{d}\left[(b_l'(x))^2 c(x) - b_l'(x)c'(x)b_l(x) + \tfrac{1}{2}c''(x)[b_l(x), b_l(x)]\right] - a'(x)c(x) + c'(x)a(x), \qquad r_k(x) = c'(x)b_k(x) - b_k'(x)c(x). \tag{50}$$

PROOF. Applying the Itô formula to the equality $(U_s v, c(x(s))) = 0$, we get that
$$\begin{aligned}
0 ={}& \left(\left[\sum_{l=1}^{d}(b_l'^*(x(s)))^2 - a'^*(x(s))\right]U_s v,\ c(x(s))\right)ds - \sum_{l=1}^{d}\left(b_l'^*(x(s))U_s v,\ c(x(s))\right)dw_l(s) \\
& + \left(U_s v,\ c'(x(s))a(x(s))\right)ds + \left(U_s v,\ c'(x(s))\sum_{l=1}^{d}b_l(x(s))\,dw_l(s)\right) \\
& + \frac{1}{2}\sum_{l=1}^{d}\left(U_s v,\ c''(x(s))[b_l(x(s)), b_l(x(s))]\right)ds - \sum_{l=1}^{d}\left(b_l'^*(x(s))U_s v,\ c'(x(s))b_l(x(s))\right)ds.
\end{aligned}$$
Gathering the coefficients of $ds$ and $dw_l$, $l = 1, \ldots, d$, and equating them to zero, we obtain a proof of the lemma. $\square$

THEOREM 13.(1) Suppose that the functions $a(x)$ and $b_1(x), \ldots, b_d(x)$ have continuous derivatives up to order $m < \infty$. Denote by $N(x)$ the smallest subspace of $R^d$ containing all the vectors $b_1(x), \ldots, b_d(x)$ and, together with $c(x)$, where $c(y)$ is any twice continuously differentiable function, the vectors $r(x)$ and $r_1(x), \ldots, r_d(x)$ defined by (50). If $N(x) = R^d$ for all $x \in R^d$, then a solution of (45) has a positive transition probability density.

(1) See Malliavin [1].

PROOF. Denote by $F$ the smallest class of functions from $R^d$ to $R^d$ containing the functions $b_1(x), \ldots, b_d(x)$ and, together with any twice continuously differentiable function $c(x)$, the functions $r(x)$ and $r_1(x), \ldots, r_d(x)$ given by (50). We need to show that there is no nonzero $v \in R^d$ such that for some $\tau > 0$ we have that $(U_s v, b_k(x(s))) = 0$ for $s < \tau$, $k = 1, \ldots, d$. If there were such a $v$, then for every function $c(x) \in F$ we would have
that $(U_s v, c(x(s))) = 0$ for $s < \tau$, by Lemma 7. Passing to the limit as $s \to 0$, we get that $(v, c(x)) = 0$ for $c(x) \in F$, i.e., $(v, c) = 0$ for $c \in N(x)$. Therefore, $v$ must be equal to zero. $\square$

2.3. Processes with jumps. We first consider solutions of the equation
$$dx(t) = a(t, x(t))\,dt + B(t, x(t))\,dw(t) + \int f_2(t, x(t), \theta)\,\nu_2(dt \times d\theta) \tag{51}$$
(this is an equation of the form (32) for $f_1 = 0$).

THEOREM 14. Suppose that $a(t, x)$ and $B(t, x)$ are such that equation (33) has a weakly unique solution for which a transition probability density exists. Then a solution of (51) also exists, is weakly unique, and is thus a Markov process. The transition probability for this process also has a density. If the transition probability density for (33) is positive, then the same holds for (51).

PROOF. Denote by $P^0(s, x, t, A)$ the transition probability for the solution of (33), and by $P(s, x, t, A)$ the transition probability for the solution of (51) (the fact that this solution is weakly unique and hence a Markov process follows from the construction of the solution; see Gikhman-Skorokhod [2], p. 232). These probabilities are then connected by the relation
$$P(s, x, t, A) = e^{-(t-s)m_2(\Theta)}P^0(s, x, t, A) + \int_s^t \iint P(s, x, u, dy)\,e^{-(t-u)m_2(\Theta)}\,m_2(d\theta)\,du\,P^0(u, y + f_2(u, y, \theta), t, A). \tag{52}$$
To prove this we introduce the random variables $\{\theta_k, \tau_k\}$, where $\theta_k \in \Theta$ and $\tau_k \in R_+$ are such that
$$\nu_2([s, t[ \times C) = \sum_n I_{\{\sum_{k=1}^{n}\tau_k \in [s, t[\}}\,I_{\{\theta_n \in C\}}.$$
The pairs $\{\theta_k, \tau_k\}$ are independent, and
$$P\{\theta_k \in C,\ \tau_k > u\} = \exp\{-m_2(\Theta)u\}\,\frac{m_2(C)}{m_2(\Theta)}.$$
Moreover, these variables do not depend on the process $x^0(t)$ that solves (33). It obviously suffices to consider the case $s = 0$. Using the facts that $x(t)$ and $x^0(t)$ coincide on $[0, \tau_1[$, $\tau_1$ is a stopping time, and
$$x(\tau_1) = x(\tau_1-) + f_2(\tau_1, x(\tau_1-), \theta_1),$$
we get the relation
$$\begin{aligned}
P(0, x, t, A) &= E_{0,x}I_A(x(t)) = E_{0,x}I_A(x(t))I_{\{\tau_1 > t\}} + E_{0,x}I_A(x(t))I_{\{\tau_1 \le t\}} \\
&= E_{0,x}I_A(x^0(t))I_{\{\tau_1 > t\}} + E_{0,x}I_{\{\tau_1 \le t\}}E\left(I_A(x(t)) \mid \mathcal{F}_{\tau_1}\right) \\
&= \exp\{-tm_2(\Theta)\}P^0(0, x, t, A) + \int_0^t \iint \exp\{-um_2(\Theta)\}\,m_2(d\theta)\,du\,P^0(0, x, u, dy)\,P(u, y + f_2(u, y, \theta), t, A).
\end{aligned}$$
Similarly,
$$P(s, x, t, A) = \exp\{-(t-s)m_2(\Theta)\}P^0(s, x, t, A) + \int_s^t \iint \exp\{-(u-s)m_2(\Theta)\}\,P^0(s, x, u, dy)\,m_2(d\theta)\,du\,P(u, y + f_2(u, y, \theta), t, A). \tag{53}$$
We introduce the kernel $Q(s, x, u, dy, d\theta)$ with the help of the equality
$$\iint g(y, \theta)\,Q(s, x, u, dy, d\theta) = \iint g(y + f_2(u, y, \theta), \theta)\,P^0(s, x, u, dy)\,\exp\{-(u-s)m_2(\Theta)\}\,m_2(d\theta).$$
Then
$$P(s, x, t, A) = \exp\{-(t-s)m_2(\Theta)\}P^0(s, x, t, A) + \int_s^t \iint Q(s, x, u, dy, d\theta)\,P(u, y, t, A)\,du. \tag{54}$$
From (54) we obtain a representation of $P(s, x, t, A)$ in the form of a series:
$$\begin{aligned}
P(s, x, t, A) ={}& \exp\{-(t-s)m_2(\Theta)\}P^0(s, x, t, A) + \int_s^t \iint Q(s, x, u, dy, d\theta)\,\exp\{-(t-u)m_2(\Theta)\}\,P^0(u, y, t, A)\,du + \cdots \\
& + \int_{s<u_1<u_2<\cdots<u_n<t} \iint \cdots \iint Q(s, x, u_1, dy_1, d\theta_1)\,Q(u_1, y_1, u_2, dy_2, d\theta_2) \cdots Q(u_{n-1}, y_{n-1}, u_n, dy_n, d\theta_n) \\
& \qquad \times \exp\{-(t-u_n)m_2(\Theta)\}\,P^0(u_n, y_n, t, A)\,du_1 \cdots du_n + \cdots; \tag{55}
\end{aligned}$$
convergence of the series follows from the fact that the general term in the preceding formula can be estimated by the quantity
$$\frac{(t-s)^n (m_2(\Theta))^n}{n!}\exp\{-m_2(\Theta)(t-s)\}.$$
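The bound on the general term of (55) is the Poisson probability of exactly $n$ jumps on $[s, t]$ when the jump intensity is $m_2(\Theta)$, so the majorizing series sums to 1. The following is a minimal numerical sketch of that observation; the function names and the sample values of the intensity and the elapsed time are illustrative assumptions, not taken from the text.

```python
import math


def series_term_bound(n, intensity, elapsed):
    """Bound on the n-th term of (55): ((t - s) * m2)^n / n! * exp(-m2 * (t - s)).

    `intensity` plays the role of m_2(Theta) and `elapsed` the role of t - s;
    both are hypothetical sample inputs.
    """
    x = intensity * elapsed
    return math.exp(-x) * x ** n / math.factorial(n)


def partial_sum(n_terms, intensity, elapsed):
    """Sum of the first n_terms bounds (n = 0, 1, ..., n_terms - 1)."""
    return sum(series_term_bound(n, intensity, elapsed) for n in range(n_terms))


if __name__ == "__main__":
    # Sample values only: m_2(Theta) = 3, t - s = 2.
    print(partial_sum(60, 3.0, 2.0))
```

Since every term of (55) is nonnegative and dominated by a summable sequence, the series converges for any finite intensity and any finite time interval.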
Rewriting the general term in (55) in the form
$$\begin{aligned}
\int_{s<u_1<\cdots<u_n<t} \iint \cdots \iint{}& \exp\{-(u_1 - s)m_2(\Theta)\}\,P^0(s, x, u_1, dy_1) \\
& \times \exp\{-(u_2 - u_1)m_2(\Theta)\}\,P^0(u_1, y_1 + f_2(u_1, y_1, \theta_1), u_2, dy_2) \cdots \\
& \times \exp\{-(u_n - u_{n-1})m_2(\Theta)\}\,P^0(u_{n-1}, y_{n-1} + f_2(u_{n-1}, y_{n-1}, \theta_{n-1}), u_n, dy_n) \\
& \times \exp\{-(t - u_n)m_2(\Theta)\}\,P^0(u_n, y_n + f_2(u_n, y_n, \theta_n), t, A)\,m_2(d\theta_1) \cdots m_2(d\theta_n)\,du_1 \cdots du_n,
\end{aligned}$$
we obtain (52). The latter implies that $P(s, x, t, A) = 0$ if the Lebesgue measure of $A$ is equal to zero, since by assumption we then have that $P^0(s, x, t, A) = 0$ for all $s < t$ and $x \in R^d$. Moreover, (52) gives us an equation for the densities: if $p^0$ and $p$ are the respective densities for $P^0$ and $P$, then
$$p(s, x, t, y) = \exp\{-(t-s)m_2(\Theta)\}\,p^0(s, x, t, y) + \int_s^t \iint p(s, x, u, z)\,\exp\{-(t-u)m_2(\Theta)\}\,p^0(u, z + f_2(u, z, \theta), t, y)\,dz\,m_2(d\theta)\,du. \tag{56}$$
This implies that $p(s, x, t, y)$ is positive when $p^0(s, x, t, y)$ is. $\square$

Let us now consider a homogeneous equation of the form
$$dx(t) = a(x(t))\,dt + \sum_{k=1}^{d} b_k(x(t))\,dw_k(t) + \int_\Theta f_1(x(t), \theta)\,\mu_1(dt \times d\theta). \tag{57}$$
The same approach as used in 2.2 is applicable to it.

LEMMA 8. Assume the following conditions hold:

1) $a'(x)$ and $b_k'(x)$, $k = 1, \ldots, d$, exist and are continuous.

2) $f_1(x, \theta)$ is differentiable with respect to $x$ in the mean square with respect to the measure $m_1(d\theta)$, and if $f_1'(x, \theta)$ is this derivative, then
$$\sup_{x, \theta}\|f_1'(x, \theta)\| < 1 \quad \text{and} \quad \lim_{y \to x}\int |f_1'(x, \theta) - f_1'(y, \theta)|^2\,m_1(d\theta) = 0.$$

3) For some $c > 0$
$$|a(x)|^2 + \sum_{k=1}^{d} |b_k(x)|^2 + \int |f_1(x, \theta)|^2\,m_1(d\theta) < c(1 + |x|^2).$$
In this case if $x_\lambda(t)$ is a solution of the equation
$$dx_\lambda(t) = a(x_\lambda(t))\,dt + \sum_{k=1}^{d} b_k(x_\lambda(t))[dw_k(t) + \lambda\alpha_k(t)\,dt] + \int_\Theta f_1(x_\lambda(t), \theta)\,\mu_1(dt \times d\theta), \tag{58}$$
where the $\alpha_k(t)$ are locally square-integrable numerical functions, then the limit
$$y(t) = \lim_{\lambda \to 0}\frac{1}{\lambda}[x_\lambda(t) - x(t)]$$
exists in probability. Let $Z_t$ be a function with values in $L(R^d)$ that satisfies the stochastic equation
$$dZ_t = a'(x(t))Z_t\,dt + \sum_{k=1}^{d} b_k'(x(t))Z_t\,dw_k(t) + \int_\Theta f_1'(x(t), \theta)Z_t\,\mu_1(dt \times d\theta) \tag{59}$$
with the initial condition $Z_0 = I$. Then $Z_t$ is an invertible process, $Z_t^{-1}$ is locally bounded, and the following representation is valid:
$$y(t) = Z_t \int_0^t Z_s^{-1}\sum_{k=1}^{d}\alpha_k(s)b_k(x(s))\,ds. \tag{60}$$

PROOF. Under the conditions of the lemma (58) has a derivative with respect to the parameter $\lambda$, and $\partial x_\lambda(t)/\partial\lambda$ satisfies
$$d\,\frac{\partial}{\partial\lambda}x_\lambda(t) = a'(x_\lambda(t))\frac{\partial}{\partial\lambda}x_\lambda(t)\,dt + \sum_{k=1}^{d} b_k'(x_\lambda(t))\frac{\partial}{\partial\lambda}x_\lambda(t)\,dw_k(t) + \sum_{k=1}^{d} b_k(x_\lambda(t))\alpha_k(t)\,dt + \int_\Theta f_1'(x_\lambda(t), \theta)\frac{\partial}{\partial\lambda}x_\lambda(t)\,\mu_1(dt \times d\theta)$$
(see Gikhman-Skorokhod [2], p. 263). Since $y(t) = \partial x_\lambda(t)/\partial\lambda\,|_{\lambda=0}$, it follows that
$$dy(t) = a'(x(t))y(t)\,dt + \sum_{k=1}^{d} b_k'(x(t))y(t)\,dw_k(t) + \int_\Theta f_1'(x(t), \theta)y(t)\,\mu_1(dt \times d\theta) + \sum_{k=1}^{d}\alpha_k(t)b_k(x(t))\,dt.$$
The existence of a solution of (59) and its invertibility, as well as the local boundedness of $Z_t^{-1}$, follow from §2 in Chapter III. We get a simple check if we set $y(t) = Z_t u_t$; then
$$Z_t\,du_t = \sum_{k=1}^{d}\alpha_k(t)b_k(x(t))\,dt.$$
Therefore, (60) is valid. $\square$

THEOREM 15. Suppose that the conditions of Lemma 8 are valid and the second derivative $b_k''(x)$ exists and is continuous. For the transition probability of a solution of (57) to have a positive density it suffices that the following conditions hold: for all $x \in R^d$ there is no nonzero $v \in R^d$ such that
$$(v, b_k(x)) = 0, \quad (v, r_k(x)) = 0, \quad (v, r_{kl}(x)) = 0, \quad \int |(v, g_k(x, \theta))|\,m_1(d\theta) = 0, \qquad k, l = 1, 2, \ldots, d,$$
where
$$\begin{aligned}
r_k(x) ={}& -a'(x)b_k(x) + b_k'(x)a(x) + \sum_{l=1}^{d}\left([b_l'(x)]^2 b_k(x) - b_l'(x)b_k'(x)b_l(x) + \tfrac{1}{2}b_k''(x)[b_l(x), b_l(x)]\right) \\
& + \int [I + f_1'(x, \theta)]^{-1}\left((b_k(x + f_1(x, \theta)) - b_k(x)) - b_k'(x)f_1(x, \theta)\right)m_1(d\theta),
\end{aligned}$$
$$r_{lk}(x) = b_k'(x)b_l(x) - b_l'(x)b_k(x),$$
$$g_k(x, \theta) = [I + f_1'(x, \theta)]^{-1}\left(b_k(x + f_1(x, \theta)) - b_k(x)\right).$$

PROOF. It can be assumed without loss of generality that $\Theta = R^d$ and $m_1(d\theta) = m_1(dx)$ is a measure on $R^d$ for which $\int |x|^2\,m_1(dx) < \infty$. Let $\{e_1, \ldots, e_d\}$ be an orthonormal basis in $R^d$. We consider the process
$$\xi(t) = \sum_k e_k w_k(t) + \int_0^t \int x\,\mu_1(ds \times dx)$$
in $R^d$, where $x(t)$ is a function of $\xi(s)$, $s < t$. If $\xi(s)$ is regarded as a random element in $L_2([0, T], R^d)$, then the measure corresponding to it has the same admissible directions as the measure corresponding to the Wiener process, and the density $\rho(\alpha, x)$ has the same expression in terms of the Wiener process. The process $y(t)$ given by (60) is the derivative
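of the function $\Phi_t(\xi) = x(t)$ along the direction $\alpha(t) = \sum_k \int_0^t \alpha_k(s)\,ds\,e_k$. Therefore, as in the proof of Theorem 12 it suffices to establish that there do not exist a stopping time $\tau > 0$ and a nonzero $v \in R^d$ such that
$$(v, Z_s^{-1}b_k(x(s))) = 0, \qquad k = 1, \ldots, d,$$
for $s < \tau$. Using the Itô formula, we can write
$$dZ_s^{-1} = -Z_s^{-1}\left(a'(x(s))\,ds + \sum_{l=1}^{d} b_l'(x(s))\,dw_l(s) - \sum_{l=1}^{d}[b_l'(x(s))]^2\,ds\right) + Z_s^{-1}\int\left[(I + f_1'(x(s), \theta))^{-1} - I\right]\mu_1(ds \times d\theta), \tag{61}$$
$$\begin{aligned}
d\,Z_s^{-1}b_k(x(s)) ={}& -Z_s^{-1}\left(a'(x(s))\,ds + \sum_{l=1}^{d} b_l'(x(s))\,dw_l(s) - \sum_{l=1}^{d}[b_l'(x(s))]^2\,ds\right)b_k(x(s)) \\
& + Z_s^{-1}\int\left([I + f_1'(x(s), \theta)]^{-1} - I\right)\mu_1(ds \times d\theta)\,b_k(x(s)) \\
& + Z_s^{-1}\left(b_k'(x(s))\left[a(x(s))\,ds + \sum_{l=1}^{d} b_l(x(s))\,dw_l(s)\right] + \frac{1}{2}\sum_{l=1}^{d} b_k''(x(s))[b_l(x(s)), b_l(x(s))]\,ds\right) \\
& - Z_s^{-1}\sum_{l=1}^{d}\left[b_l'(x(s))b_k'(x(s))b_l(x(s))\right]ds \\
& + Z_s^{-1}\int\left[(I + f_1'(x(s), \theta))^{-1}\left(b_k(x(s) + f_1(x(s), \theta)) - b_k(x(s))\right)\right]\mu_1(d\theta \times ds) \\
& + Z_s^{-1}\int\left((I + f_1'(x(s), \theta))^{-1}\left(b_k(x(s) + f_1(x(s), \theta)) - b_k(x(s)) - b_k'(x(s))f_1(x(s), \theta)\right)m_1(d\theta)\right)ds. \tag{62}
\end{aligned}$$
Suppose that there exist a stopping time $\tau$ and a $v \in R^d$ such that
$$(v, Z_s^{-1}b_k(x(s))) = 0 \quad \text{for } s < \tau,\ k = 1, \ldots, d.$$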
Gathering coefficients, we get that for $s < \tau$
$$(v, Z_s^{-1}r_k(x(s)))\,ds + \sum_l (v, Z_s^{-1}r_{kl}(x(s)))\,dw_l(s) + \int (v, Z_s^{-1}g_k(x(s), \theta))\,\mu_1(d\theta \times ds) = 0.$$
If for all $t < \tau$
$$0 = \int_0^t (v, Z_s^{-1}r_k(x(s)))\,ds + \sum_{l=1}^{d}\int_0^t (v, Z_s^{-1}r_{kl}(x(s)))\,dw_l(s) + \int_0^t\int (v, Z_s^{-1}g_k(x(s), \theta))\,\mu_1(d\theta \times ds),$$
then
$$\int_0^{\tau-}\int |(v, Z_s^{-1}g_k(x(s), \theta))|^2\,\nu_1(d\theta \times ds) = 0,$$
and from the last equality
$$\int_0^{\tau-}\int |(v, Z_s^{-1}g_k(x(s), \theta))|^2\,m_1(d\theta)\,ds = 0.$$
Therefore,
$$\int_0^{\tau-}\int |(v, Z_s^{-1}g_k(x(s), \theta))|\,m_1(d\theta)\,ds = 0.$$
Thus, the third integral in (62) is equal to zero. Then the first two integrals are also equal to zero, and hence
$$(v, Z_s^{-1}r_k(x(s))) = 0, \qquad (v, Z_s^{-1}r_{kl}(x(s))) = 0, \qquad \int |(v, Z_s^{-1}g_k(x(s), \theta))|\,m_1(d\theta) = 0.$$
Passing to the limit as $s \to 0$ in (61) and the last equalities, we get that for the given $v \in R^d$
$$\begin{aligned}
(v, b_k(x)) &= 0, & k &= 1, \ldots, d, \\
(v, r_k(x)) &= 0, & k &= 1, \ldots, d, \\
(v, r_{kl}(x)) &= 0, & k, l &= 1, \ldots, d, \\
\int |(v, g_k(x, \theta))|\,m_1(d\theta) &= 0, & k &= 1, \ldots, d,
\end{aligned}$$
and this contradicts a condition of the theorem.

§3. Ergodic theorems for one-dimensional stochastic equations

The phase space is ordered in the case of one-dimensional stochastic equations. In this case the attainment of individual points can have positive probability for continuous processes, a circumstance which enables
one to obtain ergodic theorems by a method simpler than those discussed in §1. This method does not admit generalization to the multidimensional case.

3.1. Diffusion processes on the line. We consider a homogeneous stochastic equation of the form
$$dx(t) = a(x(t))\,dt + b(x(t))\,dw(t), \tag{63}$$
where $a(x)$ and $b(x)$ are measurable functions from $R$ to $R$. It will be assumed that the functions $a(x)$, $b(x)$, and $1/b(x)$ are locally bounded. Let
$$h(x) = \int_0^x \exp\left\{-2\int_0^z \frac{a(y)}{b^2(y)}\,dy\right\}dz, \qquad h^{-1}(x) = f(x), \qquad \tilde b(x) = h'(f(x))\,b(f(x)),$$
$$r_1 = \int_0^{-\infty}\exp\left\{-2\int_0^z \frac{a(y)}{b^2(y)}\,dy\right\}dz, \qquad r_2 = \int_0^{\infty}\exp\left\{-2\int_0^z \frac{a(y)}{b^2(y)}\,dy\right\}dz, \tag{64}$$
$$0 > r_1 \ge -\infty, \qquad 0 < r_2 \le +\infty.$$

LEMMA 9. Let $x(t)$ be a solution of equation (63) on the interval $[0, \tau[$, where $\tau$ is a stopping time (in speaking of a solution of an equation we assume that there exists a flow $(\mathcal{F}_t)_{t \ge 0}$ of $\sigma$-algebras to which $x(t)$ and $w(t)$ are adapted, where $w(t)$ is a Wiener process with respect to $(\mathcal{F}_t)$). Then the process $\tilde x(t) = h(x(t))$ is a solution of the equation
$$d\tilde x(t) = \tilde b(\tilde x(t))\,dw(t) \tag{65}$$
on $[0, \tau[$ and $\tilde x(s) \in\ ]r_1, r_2[$ for $s < \tau$.

PROOF. It follows from the result of Krylov cited in the proof of Theorem 10 that for every $r$ and every Borel set $A$
$$E\int_0^{t \wedge \tau_r} I_A(x(s))\,ds,$$
where $\tau_r$ is the first exit time of a solution of (63) from the interval $(-r, r)$, is absolutely continuous with respect to Lebesgue measure. Choose a sequence of continuous functions $a_n(x)$ and $b_n(x)$ such that $a_n(x) \to a(x)$, $b_n(x) \to b(x)$, and $a_n(x)/b_n^2(x) \to a(x)/b^2(x)$ for almost all $x$, and the quantities
$$|a_n(x) - a(x)| + |b_n(x) - b(x)| + \left|\frac{a_n(x)}{b_n^2(x)} - \frac{a(x)}{b^2(x)}\right|$$
are bounded. Then the function
$$h_n(x) = \int_0^x \exp\left\{-2\int_0^z \frac{a_n(y)}{b_n^2(y)}\,dy\right\}dz$$
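is twice continuously differentiable. Using the Itô formula, we have that
$$\begin{aligned}
dh_n(x(t)) &= h_n'(x(t))[a(x(t))\,dt + b(x(t))\,dw(t)] + \frac{1}{2}h_n''(x(t))\,b^2(x(t))\,dt \\
&= h_n'(x(t))\left[a(x(t)) - \frac{a_n(x(t))}{b_n^2(x(t))}\,b^2(x(t))\right]dt + h_n'(x(t))\,b(x(t))\,dw(t),
\end{aligned}$$
$$h_n(x(t \wedge \tau_r)) = h_n(x(0)) + \int_0^{t \wedge \tau_r} h_n'(x(s))\left[a(x(s)) - \frac{a_n(x(s))}{b_n^2(x(s))}\,b^2(x(s))\right]ds + \int_0^{t \wedge \tau_r} h_n'(x(s))\,b(x(s))\,dw(s).$$
Therefore,
$$E\left|h_n(x(t \wedge \tau_r)) - h_n(x_0) - \int_0^{t \wedge \tau_r} h_n'(x(s))\,b(x(s))\,dw(s)\right| \le C_1 E\int_0^{t \wedge \tau_r}\left|\frac{a_n(x(s))}{b_n^2(x(s))} - \frac{a(x(s))}{b^2(x(s))}\right|ds, \tag{66}$$
where $C_1 = \sup_{n,\,|x| \le r} h_n'(x)\,b^2(x)$. The right-hand side of (66) tends to zero. Hence, for all $r$
$$h(x(t \wedge \tau_r)) - h(x(0)) = \int_0^{t \wedge \tau_r} h'(x(s))\,b(x(s))\,dw(s),$$
i.e.,
$$\tilde x(t \wedge \tau_r) - \tilde x(0) = \int_0^{t \wedge \tau_r}\tilde b(\tilde x(s))\,dw(s),$$
and this is equivalent to (65). We have that $\tilde x(s) \in\ ]r_1, r_2[$ for $s < \tau$, since $\tilde x(s) = r_1$ for $x(s) = -\infty$, and $\tilde x(s) = r_2$ for $x(s) = +\infty$. Therefore, only solutions of (65) will be considered in what follows. $\square$

LEMMA 10. Let $r_1 < \alpha < \beta < r_2$. Denote by $\tau_{[\alpha,\beta]}$ the first exit time of $\tilde x(t)$ from the interval $(\alpha, \beta)$. Then for $x \in\ ]\alpha, \beta[$
$$E_x\tau_{[\alpha,\beta]} = \frac{x - \alpha}{\beta - \alpha}\int_x^\beta \frac{2(\beta - z)}{\tilde b^2(z)}\,dz + \frac{\beta - x}{\beta - \alpha}\int_\alpha^x \frac{2(z - \alpha)}{\tilde b^2(z)}\,dz. \tag{67}$$
With the expression on the right-hand side of (67) denoted by $v(x)$,
$$E_x\tau_{[\alpha,\beta]}I_{\{\tilde x(\tau_{[\alpha,\beta]}) = \alpha\}} = v(x)\frac{\beta - x}{\beta - \alpha} + \frac{1}{\beta - \alpha}\Phi(x) - \Phi(\beta)\frac{x - \alpha}{(\beta - \alpha)^2}, \tag{68}$$
$$E_x\tau_{[\alpha,\beta]}I_{\{\tilde x(\tau_{[\alpha,\beta]}) = \beta\}} = v(x)\frac{x - \alpha}{\beta - \alpha} + \Phi(\beta)\frac{x - \alpha}{(\beta - \alpha)^2} - \frac{1}{\beta - \alpha}\Phi(x), \tag{69}$$
$$\Phi(x) = 2\int_\alpha^x (x - z)\,\tilde b^2(z)\,v'(z)\,dz.$$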
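Formula (67) can be evaluated by ordinary quadrature. The sketch below is only an illustration under stated assumptions: the helper names `integrate` and `mean_exit_time` and the sample coefficients are hypothetical, and composite Simpson quadrature is a convenience choice that is not part of the text. For $\tilde b \equiv 1$ (a Wiener process) the right-hand side of (67) reduces to $(x - \alpha)(\beta - x)$, which gives a simple sanity check.

```python
def integrate(func, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; `steps` must be even."""
    if hi == lo:
        return 0.0
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        # Odd nodes get weight 4, even interior nodes get weight 2.
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def mean_exit_time(x, alpha, beta, b_tilde):
    """Right-hand side of (67) for a given diffusion coefficient b_tilde."""
    upper = integrate(lambda z: 2 * (beta - z) / b_tilde(z) ** 2, x, beta)
    lower = integrate(lambda z: 2 * (z - alpha) / b_tilde(z) ** 2, alpha, x)
    return ((x - alpha) * upper + (beta - x) * lower) / (beta - alpha)


if __name__ == "__main__":
    # Wiener process on (0, 1) started at 0.3: compare with (x - alpha)(beta - x).
    print(mean_exit_time(0.3, 0.0, 1.0, lambda z: 1.0), 0.3 * 0.7)
```

Differentiating (67) twice shows that $v$ satisfies $\tfrac{1}{2}\tilde b^2(x)v''(x) = -1$ with $v(\alpha) = v(\beta) = 0$, so a finite-difference second derivative of `mean_exit_time` offers a second check for a nonconstant coefficient.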
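PROOF. Let $\tilde b_n(x)$ be a sequence of continuous functions such that for all $t > 0$
$$\lim_{n \to \infty} E_x\int_0^{t \wedge \tau_{[\alpha,\beta]}}\left|\frac{1}{\tilde b_n^2(\tilde x(s))} - \frac{1}{\tilde b^2(\tilde x(s))}\right|ds = 0$$
(the existence of such a sequence was established in Lemma 9). Define
$$u_n(x) = 2\int_\alpha^x (x - y)\,\tilde b_n^{-2}(y)\,dy + \left(1 - 2\int_\alpha^\beta (\beta - y)\,\tilde b_n^{-2}(y)\,dy\right)\frac{x - \alpha}{\beta - \alpha}. \tag{70}$$
Then $u_n(x)$ is a twice continuously differentiable function, $u_n(\alpha) = 0$, $u_n(\beta) = 1$, and $u_n''(y) = 2\tilde b_n^{-2}(y)$. By the Itô formula,
$$du_n(\tilde x(t)) = u_n'(\tilde x(t))\,\tilde b(\tilde x(t))\,dw(t) + \frac{1}{2}u_n''(\tilde x(t))\,\tilde b^2(\tilde x(t))\,dt = u_n'(\tilde x(t))\,\tilde b(\tilde x(t))\,dw(t) + \frac{\tilde b^2(\tilde x(t))}{\tilde b_n^2(\tilde x(t))}\,dt.$$
Therefore, under the assumption that $\tilde x(0) = x \in\ ]\alpha, \beta[$, we have that for all $t > 0$
$$u_n(\tilde x(t \wedge \tau_{[\alpha,\beta]})) - u_n(x) = \int_0^{t \wedge \tau_{[\alpha,\beta]}}\tilde b^2(\tilde x(s))\,\tilde b_n^{-2}(\tilde x(s))\,ds + \int_0^{t \wedge \tau_{[\alpha,\beta]}} u_n'(\tilde x(s))\,\tilde b(\tilde x(s))\,dw(s).$$
If $u(x)$ is the function given by the right-hand side of (70) with $\tilde b$ substituted for $\tilde b_n$, then, passing to the limit as $n \to \infty$, we get that
$$u(\tilde x(t \wedge \tau_{[\alpha,\beta]})) - u(x) = t \wedge \tau_{[\alpha,\beta]} + \int_0^{t \wedge \tau_{[\alpha,\beta]}} u'(\tilde x(s))\,\tilde b(\tilde x(s))\,dw(s). \tag{71}$$
Hence,
$$E_x\,t \wedge \tau_{[\alpha,\beta]} = E_x u(\tilde x(t \wedge \tau_{[\alpha,\beta]})) - u(x).$$
It is clear from this relation that the left-hand side is bounded uniformly with respect to $t$, and hence $E_x\tau_{[\alpha,\beta]} < \infty$. Passing to the limit as $t \to \infty$, we get that
$$E_x\tau_{[\alpha,\beta]} = P_x\{\tilde x(\tau_{[\alpha,\beta]}) = \beta\} - u(x).$$
Since $\tilde x(t)$ is a martingale and $\tilde x(t \wedge \tau_{[\alpha,\beta]})$ is a uniformly integrable martingale, it follows that
$$x = E_x\tilde x(\tau_{[\alpha,\beta]}) = \beta P_x\{\tilde x(\tau_{[\alpha,\beta]}) = \beta\} + \alpha\left(1 - P_x\{\tilde x(\tau_{[\alpha,\beta]}) = \beta\}\right),$$
so that
$$P_x\{\tilde x(\tau_{[\alpha,\beta]}) = \beta\} = \frac{x - \alpha}{\beta - \alpha}.$$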
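Therefore,
$$E_x\tau_{[\alpha,\beta]} = 2\,\frac{x - \alpha}{\beta - \alpha}\int_\alpha^\beta (\beta - y)\,\tilde b^{-2}(y)\,dy - 2\int_\alpha^x (x - y)\,\tilde b^{-2}(y)\,dy,$$
which implies (67). If the right-hand side of (67) is denoted by $v(x)$, then (71) gives us that
$$v(x) = \tau_{[\alpha,\beta]} - \int_0^{\tau_{[\alpha,\beta]}} v'(\tilde x(s))\,\tilde b(\tilde x(s))\,dw(s).$$
Therefore,
$$(\tilde x(\tau_{[\alpha,\beta]}) - \alpha)\,\tau_{[\alpha,\beta]} = v(x)(\tilde x(\tau_{[\alpha,\beta]}) - \alpha) + (\tilde x(\tau_{[\alpha,\beta]}) - \alpha)\int_0^{\tau_{[\alpha,\beta]}} v'(\tilde x(s))\,\tilde b(\tilde x(s))\,dw(s),$$
$$E_x(\tilde x(\tau_{[\alpha,\beta]}) - \alpha)\,\tau_{[\alpha,\beta]} = v(x)(x - \alpha) + E_x\int_0^{\tau_{[\alpha,\beta]}} v'(\tilde x(s))\,\tilde b^2(\tilde x(s))\,ds.$$
Let
$$\Phi(x) = 2\int_\alpha^x (x - z)\,\tilde b^2(z)\,v'(z)\,dz.$$
Then $\Phi''(x) = 2\tilde b^2(x)v'(x)$, and
$$\Phi(\tilde x(\tau_{[\alpha,\beta]})) - \Phi(x) = \int_0^{\tau_{[\alpha,\beta]}} v'(\tilde x(s))\,\tilde b^2(\tilde x(s))\,ds + \int_0^{\tau_{[\alpha,\beta]}}\Phi'(\tilde x(s))\,\tilde b(\tilde x(s))\,dw(s).$$
From this we get
$$E_x\int_0^{\tau_{[\alpha,\beta]}} v'(\tilde x(s))\,\tilde b^2(\tilde x(s))\,ds = E_x[\Phi(\tilde x(\tau_{[\alpha,\beta]})) - \Phi(x)] = -\Phi(x) + \Phi(\beta)\frac{x - \alpha}{\beta - \alpha}. \tag{72}$$
We now find from (72) that
$$E_x\tau_{[\alpha,\beta]}I_{\{\tilde x(\tau_{[\alpha,\beta]}) = \beta\}} = v(x)\frac{x - \alpha}{\beta - \alpha} + \Phi(\beta)\frac{x - \alpha}{(\beta - \alpha)^2} - \frac{1}{\beta - \alpha}\Phi(x).$$
Formula (68) is obtained from this; (69) follows from the preceding two formulas. $\square$

COROLLARY. Denote by $\tau_\alpha$ the time when the process $\tilde x(t)$ first hits the point $\alpha$. Let $r_1 = -\infty$ and $\int_{-\infty}^0 \tilde b^{-2}(z)\,dz < \infty$. Then for all $x < \beta$
$$E_x\tau_\beta = 2\int_x^\beta (\beta - z)\,\tilde b^{-2}(z)\,dz + 2(\beta - x)\int_{-\infty}^x \tilde b^{-2}(z)\,dz. \tag{73}$$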
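Similarly, if $r_2 = +\infty$ and $\int_0^\infty \tilde b^{-2}(z)\,dz < \infty$, then for all $x > \alpha$
$$E_x\tau_\alpha = \int_\alpha^x 2(z - \alpha)\,\tilde b^{-2}(z)\,dz + (x - \alpha)\int_x^\infty 2\tilde b^{-2}(z)\,dz. \tag{74}$$
These two formulas are obtained from (67) by passing to the limit as $\alpha \to -\infty$ or $\beta \to +\infty$.

THEOREM 16. Suppose that $r_1 = -\infty$, $r_2 = +\infty$, and $\int_{-\infty}^{\infty}\tilde b^{-2}(z)\,dz < \infty$. Then the process $\tilde x(t)$ is ergodic with ergodic distribution
$$\pi(A) = k\int_A \tilde b^{-2}(z)\,dz, \qquad k = \left(\int \tilde b^{-2}(z)\,dz\right)^{-1}.$$

PROOF. If $r_1 = -\infty$ and $r_2 = +\infty$, then $\tilde x(t)$ is a bounded process for all $t > 0$. For any $x$ and $y$ we have that $E_x\tau_y < \infty$ and $E_y\tau_x < \infty$. Suppose that $\tilde x(0) = x \ne y$. Denote by $\zeta_1$ the time of first return to $x$ after hitting $y$, and let $\zeta_2 = \theta_{\zeta_1}\zeta_1$, i.e., $\zeta_2$ is the time interval between the first time $x$ is hit after $y$ has been visited and the second such hit, $\zeta_n = \theta_{\sum_1^{n-1}\zeta_k}\zeta_1$, etc. The variables $\zeta_k$ are independent and identically distributed; they are stopping times, and $\tilde x\left(\sum_1^n \zeta_k\right) = x$. Moreover, $E_x\zeta_1 = E_x\tau_y + E_y\tau_x < \infty$. Let $f$ be a bounded measurable function. Then
$$\int_0^{\sum_1^n \zeta_k} f(\tilde x(s))\,ds = \sum_{j=1}^{n}\int_{\sum_{k=1}^{j-1}\zeta_k}^{\sum_{k=1}^{j}\zeta_k} f(\tilde x(s))\,ds.$$
The variables
$$\eta_j = \int_{\sum_{k=1}^{j-1}\zeta_k}^{\sum_{k=1}^{j}\zeta_k} f(\tilde x(s))\,ds$$
are mutually independent, since
$$P\left\{\eta_j < a \,\Big|\, \mathcal{F}_{\sum_1^{j-1}\zeta_k}\right\} = P\left\{\theta_{\sum_1^{j-1}\zeta_k}\eta_1 < a \,\Big|\, \mathcal{F}_{\sum_1^{j-1}\zeta_k}\right\} = P\{\eta_1 < a\}.$$
Moreover, $E|\eta_1| \le \|f\|E_x\zeta_1 < \infty$. Therefore, by the strong law of large numbers,
$$\lim_{n \to \infty}\frac{1}{n}\sum_{j=1}^{n}\eta_j = E\eta_1, \qquad \lim_{n \to \infty}\frac{1}{n}\sum_{j=1}^{n}\zeta_j = E\zeta_1$$
with probability 1. Let $\nu_t$ be such that $\zeta_1 + \cdots + \zeta_{\nu_t} \le t < \zeta_1 + \cdots + \zeta_{\nu_t + 1}$. Then for $f \ge 0$
$$\left(\sum_{j=1}^{\nu_t}\zeta_j\right)^{-1}\sum_{j=1}^{\nu_t + 1}\eta_j \ \ge\ \frac{1}{t}\int_0^t f(\tilde x(s))\,ds \ \ge\ \left(\sum_{i=1}^{\nu_t + 1}\zeta_i\right)^{-1}\sum_{j=1}^{\nu_t}\eta_j.$$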
We have that $\nu_t \to \infty$ as $t \to \infty$. Hence, the extreme left-hand and right-hand sides in the last inequalities tend to $(E_x\zeta_1)^{-1}E_x\eta_1$ with probability 1. This establishes that for bounded nonnegative $f$
$$\lim_{t \to \infty}\frac{1}{t}\int_0^t f(\tilde x(s))\,ds = E_x\eta_1 / E_x\zeta_1 \tag{75}$$
with $P_x$-probability 1. If $z$ is any point, then, since $P_z\{\tau_x < \infty\} = 1$,
$$\begin{aligned}
P_z\left\{\lim_{t \to \infty}\frac{1}{t}\int_0^t f(\tilde x(s))\,ds = a\right\}
&= P_z\left\{\lim_{t \to \infty}\frac{1}{t}\left(\int_0^{\tau_x} f(\tilde x(s))\,ds + \int_{\tau_x}^{t + \tau_x} f(\tilde x(s))\,ds\right) = a\right\} \\
&= P_z\left\{\lim_{t \to \infty}\frac{1}{t}\int_{\tau_x}^{t + \tau_x} f(\tilde x(s))\,ds = a\right\}
= P_x\left\{\lim_{t \to \infty}\frac{1}{t}\int_0^t f(\tilde x(s))\,ds = a\right\}.
\end{aligned}$$
Hence, the limit on the left-hand side exists with $P_z$-probability 1 for any $z$, and it does not depend on $z$. This limit can obviously be represented in the form $\int f(y)\,\pi(dy)$, where $\pi$ is a probability measure ($\pi$ is nonnegative and $\pi(R) = 1$).

Let $g(x)$ be a twice continuously differentiable compactly supported function. Then, by the Itô formula,
$$g(\tilde x(t)) - g(\tilde x(0)) = \int_0^t g'(\tilde x(s))\,\tilde b(\tilde x(s))\,dw(s) + \frac{1}{2}\int_0^t g''(\tilde x(s))\,\tilde b^2(\tilde x(s))\,ds.$$
Obviously,
$$\lim_{t \to \infty}\frac{1}{t}[g(\tilde x(t)) - g(\tilde x(0))] = 0$$
for all $\omega$. Using the boundedness of $g'(x)\tilde b(x)$, we see that
$$\lim_{t \to \infty}\frac{1}{t}\int_0^t g'(\tilde x(s))\,\tilde b(\tilde x(s))\,dw(s) = 0.$$
Consequently, for every twice continuously differentiable compactly supported function $g(x)$,
$$\lim_{t \to \infty}\frac{1}{t}\int_0^t g''(\tilde x(s))\,\tilde b^2(\tilde x(s))\,ds = 0.$$
From this we get
$$\int \tilde b^2(x)\,g''(x)\,\pi(dx) = 0.$$
If $\rho(dx)$ denotes the measure on the line defined by
$$\rho(A) = \int_A \tilde b^2(x)\,\pi(dx),$$
then $\int g''(x)\,\rho(dx) = 0$ for every compactly supported twice continuously differentiable function $g(x)$.

A compactly supported continuous function $\varphi(x)$ is the second derivative of a compactly supported function if and only if $\int \varphi(x)\,dx = 0$ and $\int x\varphi(x)\,dx = 0$. If $\varphi_1(x)$ and $\varphi_2(x)$ are two arbitrary compactly supported functions such that
$$\int \varphi_1(x)\,dx = 1, \quad \int x\varphi_1(x)\,dx = 0, \quad \int \varphi_2(x)\,dx = 0, \quad \int x\varphi_2(x)\,dx = 1,$$
then for every continuous compactly supported function $\varphi(x)$ the function
$$\psi(x) = \varphi(x) - \int \varphi(y)\,dy\ \varphi_1(x) - \int y\varphi(y)\,dy\ \varphi_2(x)$$
satisfies the conditions $\int \psi(x)\,dx = 0$ and $\int \psi(x)x\,dx = 0$. Hence,
$$0 = \int \psi(x)\,\rho(dx) = \int \varphi(x)\,\rho(dx) - \int (k + ly)\varphi(y)\,dy,$$
where $k = \int \varphi_1(x)\,\rho(dx)$ and $l = \int \varphi_2(x)\,\rho(dx)$. Hence,
$$\int \varphi(x)\,\rho(dx) = \int (k + ly)\varphi(y)\,dy.$$
Therefore, the measure $\rho(dx)$ is absolutely continuous with respect to the Lebesgue measure, with density $k + lx$. Since it must be nonnegative, it follows that $l = 0$, and $\rho(dx) = k\,dx$. Hence, $\pi(dx) = k\tilde b^{-2}(x)\,dx$, and the value of $k$ is determined by the condition $\int \pi(dx) = 1$. $\square$

THEOREM 17. Under the conditions of Theorem 16,
$$\lim_{t \to \infty} E_x f(\tilde x(t)) = k\int f(y)\,\tilde b^{-2}(y)\,dy \tag{76}$$
for all $x$ and all bounded continuous functions $f(y)$.

PROOF. Let $\zeta_1$ be the same as in the proof of Theorem 16. Then
$$\begin{aligned}
E_x f(\tilde x(t)) &= E_x I_{\{\zeta_1 > t\}}f(\tilde x(t)) + E_x I_{\{\zeta_1 \le t\}}E\left(f(\tilde x(t)) \mid \mathcal{F}_{\zeta_1}\right) \\
&= E_x I_{\{\zeta_1 > t\}}f(\tilde x(t)) + E_x I_{\{\zeta_1 \le t\}}E_{\tilde x(\zeta_1)}\left(f(\tilde x(t - \zeta_1))\right).
\end{aligned}$$
If $E_x f(\tilde x(t)) = g(t)$ and $E_x I_{\{\zeta_1 > t\}}f(\tilde x(t)) = h(t)$, then $g(t)$ satisfies the renewal equation
$$g(t) = h(t) + \int_0^t P_x\{\zeta_1 \in ds\}\,g(t - s). \tag{77}$$
It is easy to see that the distribution of $\zeta_1$ does not have atoms (if $P_x\{\tau_y = s\} > 0$, then $P_x\{\tilde x(s) = y\} > 0$; but $\tilde x(s)$ has a continuous distribution). Further,
$$\sum_n \sup_{n \le t \le n+1}|h(t)| \le \sum_n \|f\|P_x\{\zeta_1 > n\} < \infty,$$
since $E_x\zeta_1 < \infty$. Let $f$ be a twice continuously differentiable compactly supported function. Then
$$\begin{aligned}
|h(t + h) - h(t)| &\le \|f\|P_x\{\zeta_1 \in [t, t + h]\} + \left|E_x[f(\tilde x(t + h)) - f(\tilde x(t))]I_{\{\zeta_1 > t\}}\right| \\
&\le \|f\|P_x\{\zeta_1 \in [t, t + h]\} + \left|E_x\int_t^{t+h} f''(\tilde x(s))\,\tilde b^2(\tilde x(s))\,ds\,I_{\{\zeta_1 > t\}}\right| \\
&\le \|f\|P_x\{\zeta_1 \in [t, t + h]\} + E_x\int_t^{t+h}|f''(\tilde x(s))\,\tilde b^2(\tilde x(s))|\,ds\,I_{\{\zeta_1 > t\}}.
\end{aligned}$$
It follows from this inequality that
$$\sup_{kh \le s_1 < s_2 \le (k+1)h}|h(s_2) - h(s_1)| \le \|f\|P_x\{\zeta_1 \in [kh, (k+1)h]\} + E_x\int_{kh}^{(k+1)h}|f''(\tilde x(s))\,\tilde b^2(\tilde x(s))|\,ds\,I_{\{\zeta_1 > kh\}},$$
so that
$$\sum_k \sup_{kh \le s_1 < s_2 \le (k+1)h}|h(s_2) - h(s_1)| \le \|f\| + \|f''\tilde b^2\|E_x(\zeta_1 + h).$$
Hence, $h(t)$ is directly Riemann integrable (see Feller [1], §XI.1), and the limit $\lim_{t \to \infty}E_x(f(\tilde x(t)))$ exists and is finite for a solution of (77). It follows from Theorem 16 that this limit coincides with
$$\lim_{t \to \infty}\frac{1}{t}\int_0^t E_x(f(\tilde x(s)))\,ds = \int f(y)\,\pi(dy).$$
This is valid for every twice continuously differentiable compactly supported function $f(x)$, which yields a proof of the theorem. $\square$

The next theorem refines Theorem 16.
THEOREM 18. Suppose that $r_1 = -\infty$, $r_2 = +\infty$, and $\tilde b^{-2}(y)$ is a locally bounded function. Then $P_x\{\tau_y < \infty\} = 1$ for all $x$ and $y$, and the measure
$$\pi(A) = \int_A \tilde b^{-2}(y)\,dy \tag{78}$$
is the unique $\sigma$-finite invariant measure.

PROOF. We note first that, by Lemma 10,
$$P_x\{\tau_{[y,c]} < \infty\} = 1 \quad \text{and} \quad P_x\{\tilde x(\tau_{[y,c]}) = y\} = \frac{c - x}{c - y}$$
for all $y < x < c$. Since $\tau_y \ge \tau_{[y,c]}$ and $\tau_y = \tau_{[y,c]}$ when $\tilde x(\tau_{[y,c]}) = y$, it follows that $P_x\{\tau_y < \infty\} \ge (c - x)/(c - y)$. Passing to the limit as $c \to +\infty$, we see that $P_x\{\tau_y < \infty\} = 1$ for $y < x$. Similarly, considering $\tau_{[c,y]}$, where $c < x < y$, we see that $P_x\{\tau_y < \infty\} = 1$ also for $x < y$.

Let $f(x)$ be a nonnegative continuous compactly supported function. Then for all $x$ and $y$
$$E_x\int_0^{\tau_y} f(\tilde x(s))\,ds < \infty.$$
Indeed, let
$$\Phi_y^{(n)}(x) = \int_y^x 2(x - z)\,\tilde b_n^{-2}(z)f(z)\,dz, \qquad \Phi_y(x) = \int_y^x 2(x - z)\,\tilde b^{-2}(z)f(z)\,dz,$$
where $\tilde b_n$ is a sequence of positive continuous functions such that
$$\lim_{n \to \infty}\int |\tilde b_n^{-2}(y) - \tilde b^{-2}(y)|\,dy = 0.$$
On the basis of the Itô formula,
$$\Phi_y^{(n)}(\tilde x(t)) - \Phi_y^{(n)}(\tilde x(0)) = \int_0^t \tilde b^2(\tilde x(s))\,\tilde b_n^{-2}(\tilde x(s))f(\tilde x(s))\,ds + \int_0^t \Phi_y^{(n)\prime}(\tilde x(s))\,\tilde b(\tilde x(s))\,dw(s).$$
Passing to the limit as $n \to \infty$, we get that
$$\Phi_y(\tilde x(t)) - \Phi_y(\tilde x(0)) = \int_0^t f(\tilde x(s))\,ds + 2\int_0^t\left[\int_y^{\tilde x(s)}\tilde b^{-2}(z)f(z)\,dz\right]\tilde b(\tilde x(s))\,dw(s).$$
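We substitute $\tau_{[y,c]}$ in place of $t$ in this relation (assume that $y < x < c$) and take the expectation:
$$E_x\Phi_y(\tilde x(\tau_{[y,c]})) - \Phi_y(x) = E_x\int_0^{\tau_{[y,c]}} f(\tilde x(s))\,ds.$$
Therefore,
$$E_x\int_0^{\tau_{[y,c]}} f(\tilde x(s))\,ds = \frac{x - y}{c - y}\Phi_y(c) - \Phi_y(x) = \frac{x - y}{c - y}\int_y^c 2(c - z)\,\tilde b^{-2}(z)f(z)\,dz - \int_y^x 2(x - z)\,\tilde b^{-2}(z)f(z)\,dz.$$
Passing to the limit as $c \to +\infty$, we find that for $y < x$
$$\begin{aligned}
E_x\int_0^{\tau_y} f(\tilde x(s))\,ds &= (x - y)\int_y^\infty 2\tilde b^{-2}(z)f(z)\,dz - \int_y^x 2(x - z)\,\tilde b^{-2}(z)f(z)\,dz \\
&= (x - y)\int_x^\infty 2\tilde b^{-2}(z)f(z)\,dz + \int_y^x 2(z - y)\,\tilde b^{-2}(z)f(z)\,dz \\
&= 2\int_y^\infty [(x - y) \wedge (z - y)]\,\tilde b^{-2}(z)f(z)\,dz.
\end{aligned}$$
Similarly, for $x < y$
$$E_x\int_0^{\tau_y} f(\tilde x(s))\,ds = 2\int_{-\infty}^y [(y - x) \wedge (y - z)]\,\tilde b^{-2}(z)f(z)\,dz.$$
Let $\zeta_1$ be as in the proof of Theorem 16 (assume for definiteness that $x < y$). Then
$$\begin{aligned}
E_x\int_0^{\zeta_1} f(\tilde x(s))\,ds &= E_x\int_0^{\tau_y} f(\tilde x(s))\,ds + E_y\int_0^{\tau_x} f(\tilde x(s))\,ds \\
&= 2\int_{-\infty}^y [(y - x) \wedge (y - z)]\,\tilde b^{-2}(z)f(z)\,dz + 2\int_x^\infty [(y - x) \wedge (z - x)]\,\tilde b^{-2}(z)f(z)\,dz \\
&= 2(y - x)\int_{-\infty}^{\infty}\tilde b^{-2}(z)f(z)\,dz,
\end{aligned}$$
$$E_x\int_0^{\zeta_1} f(\tilde x(s))\,ds = 2(y - x)\int_{-\infty}^{\infty}\tilde b^{-2}(z)f(z)\,dz. \tag{79}$$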
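The last step leading to (79) rests on a pointwise identity: for $x < y$, the kernel $(y - x) \wedge (y - z)$ on $z \le y$ and the kernel $(y - x) \wedge (z - x)$ on $z \ge x$ add up to the constant $y - x$ for every $z$. A minimal sketch of this check follows; the function name and the sample points are hypothetical and serve only as an illustration.

```python
def occupation_kernel(x, y, z):
    """Sum of the two kernels in the display preceding (79), assuming x < y.

    The first summand belongs to the passage from x to y (integral over z <= y),
    the second to the return from y to x (integral over z >= x).
    """
    first = min(y - x, y - z) if z <= y else 0.0
    second = min(y - x, z - x) if z >= x else 0.0
    return first + second


if __name__ == "__main__":
    x, y = -0.5, 1.25  # sample points with x < y
    for z in (-3.0, -0.5, 0.0, 1.0, 1.25, 4.0):
        # Each value should equal y - x = 1.75 up to rounding.
        print(z, occupation_kernel(x, y, z))
```

Because the combined kernel does not depend on $z$, it factors out of the integral, which is why the right-hand side of (79) is $2(y - x)$ times the integral of $\tilde b^{-2}f$ over the whole line.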
By passing to the limit this formula can be extended to any measurable functions such that the right-hand side of (79) is defined.

If $g(x)$ is an arbitrary positive measurable function with $\int g(y)\,\tilde b^{-2}(y)\,dy < \infty$, and $f(y)$ is such that $\int |f(y)|\,\tilde b^{-2}(y)\,dy < \infty$, then with probability 1
$$\lim_{t \to \infty}\frac{\int_0^t f(\tilde x(s))\,ds}{\int_0^t g(\tilde x(s))\,ds} = \frac{\int f(y)\,\tilde b^{-2}(y)\,dy}{\int g(y)\,\tilde b^{-2}(y)\,dy}. \tag{80}$$
Indeed, it suffices to consider the case when $f \ge 0$. If $\nu_t$ is the index such that $\sum_1^{\nu_t}\zeta_i \le t < \sum_1^{\nu_t + 1}\zeta_i$, then since
$$\int_{\sum_1^k \zeta_i}^{\sum_1^{k+1}\zeta_i} f(\tilde x(s))\,ds \quad \text{and} \quad \int_{\sum_1^k \zeta_i}^{\sum_1^{k+1}\zeta_i} g(\tilde x(s))\,ds$$
are independent identically distributed variables with finite expectation, we have that with probability 1
$$\lim_{t \to \infty}\frac{1}{\nu_t}\sum_{k=1}^{\nu_t}\int_{\sum_1^k \zeta_i}^{\sum_1^{k+1}\zeta_i} f(\tilde x(s))\,ds = 2(y - x)\int f(z)\,\tilde b^{-2}(z)\,dz,$$
$$\lim_{t \to \infty}\frac{1}{\nu_t}\sum_{k=1}^{\nu_t}\int_{\sum_1^k \zeta_i}^{\sum_1^{k+1}\zeta_i} g(\tilde x(s))\,ds = 2(y - x)\int g(z)\,\tilde b^{-2}(z)\,dz,$$
$$\lim_{t \to \infty}\frac{\nu_t}{\nu_t + 1} = 1.$$
This gives us (80). Since there are no invariant subsets for the process, every invariant measure is ergodic. It follows from (80) that the only possible (to within proportionality) invariant measure is defined by (78). Let
$$Q_\lambda(x, A) = \lambda\int_0^\infty E_x I_A(\tilde x(s))\,e^{-\lambda s}\,ds.$$
We show that a Markov chain with the indicated transition probability is recurrent (in the Harris sense) with respect to Lebesgue measure. Let $\theta_1, \theta_2, \ldots$ be a sequence of independent identically distributed random variables independent of the process $\tilde x(t)$ and such that $P(\theta_k > t) = e^{-\lambda t}$. Then the sequence $\tilde x\left(\sum_1^n \theta_k\right)$ is a Markov chain with transition probability $Q_\lambda(x, A)$. We show that for all $A$ of positive Lebesgue measure
$$\sum_n I_A\left(\tilde x\left(\sum_{k=1}^{n}\theta_k\right)\right) = +\infty$$
with $P_x$-probability 1. Let us write this sum as follows:
$$\sum_k \sum_n I_A\left(\tilde x\left(\sum_{i=1}^{n}\theta_i\right)\right)I_{\left\{\sum_1^k \zeta_i \le \sum_1^n \theta_i < \sum_1^{k+1}\zeta_i\right\}}. \tag{81}$$
The terms in the last sum are independent and identically distributed for different $k$:
$$P_x\left\{\sum_n I_A\left(\tilde x\left(\sum_{i=1}^{n}\theta_i\right)\right)I_{\left\{\sum_1^n \theta_i < \zeta_1\right\}} \ge 1\right\} \ge P_x\{\tilde x(\theta_1) \in A,\ \theta_1 \le \zeta_1\} = \lambda E_x\int_0^{\zeta_1} e^{-\lambda s}I_A(\tilde x(s))\,ds > 0.$$
If the right-hand side were equal to zero, we would have
$$E_x\int_0^{\zeta_1} I_A(\tilde x(s))\,ds = 2(y - x)\int_A \tilde b^{-2}(z)\,dz = 0.$$
This implies that the series in (81) diverges. We can now use Theorem 8, which implies the existence and uniqueness of an invariant measure. $\square$

3.2. Diffusion processes on an interval. We consider equation (63) on some interval $]c_1, c_2[$, with $a(x)$ and $b(x)$ measurable in the interval, $b(x) > 0$, and $a(x)$, $b(x)$, and $b^{-1}(x)$ bounded on $[\alpha, \beta]$ for any $c_1 < \alpha < \beta < c_2$. This condition can fail to hold in neighborhoods of $c_1$ and $c_2$. A solution of the equation exists up to the time $\tau = \sup_{[\alpha,\beta] \subset\, ]c_1, c_2[}\tau_{[\alpha,\beta]}$. By the same methods as in 3.1 we transform it to equation (65) on $]r_1, r_2[$, where
$$r_1 = \int_c^{c_1}\exp\left\{-2\int_c^z a(y)\,b^{-2}(y)\,dy\right\}dz, \qquad r_2 = \int_c^{c_2}\exp\left\{-2\int_c^z a(y)\,b^{-2}(y)\,dy\right\}dz,$$
$$h(x) = \int_c^x \exp\left\{-2\int_c^z a(y)\,b^{-2}(y)\,dy\right\}dz,$$
where $c$ is some point of $]c_1, c_2[$. If $r_1 = -\infty$ and $r_2 = +\infty$, then we arrive at the case considered in the preceding subsection. Therefore, we dwell here on the case when $]r_1, r_2[$ is a half-line or a finite interval. We are only interested in the case when the time $\tau$ for the existence of a solution is $+\infty$. By passing to the limit it is easy to establish with the help of Lemma 10 that $E_x\tau$ is finite if and only if $\int_{r_1}^{r_2}\tilde b^{-2}(y)\,dy < \infty$.

Let $r_1 > -\infty$ and $r_2 = +\infty$. Then the solution of the equation can hit the point $r_1$ in a finite amount of time. This will be the case, for instance,
when $\int_{r_1}^c \tilde b^{-2}(y)\,dy < \infty$ for some $c > r_1$. Indeed, then $\tau_{[r_1,c]}$ is finite with probability 1, and
$$P_x\{\tilde x(\tau_{[r_1,c]}) = r_1\} = \frac{c - x}{c - r_1} \quad \text{for } r_1 < x < c.$$
Therefore, if $P_x\{\tau = +\infty\} = 1$, then $\int_{r_1}^c \tilde b^{-2}(y)\,dy = +\infty$. In exactly the same way, if $r_1 > -\infty$, $r_2 < +\infty$, and $\tau$ is infinite, then
$$\int_{r_1}^c \tilde b^{-2}(y)\,dy = \int_c^{r_2}\tilde b^{-2}(y)\,dy = +\infty.$$

THEOREM 19. If $\tilde x(t)$ is a solution of (65) and $-\infty < r_1 < r_2 \le +\infty$, then a solution exists for all $t$ if and only if
$$\int_{r_1}^c \tilde b^{-2}(y)\,dy = +\infty, \qquad r_2 + \int_c^{r_2}\tilde b^{-2}(y)\,dy = +\infty$$
for $c \in\ ]r_1, r_2[$. Under the latter conditions ($r_1 < c < r_2$)
$$P_x\left\{\lim_{t \to \infty}\tilde x(t) = r_1\right\} = 1 \tag{82}$$
when $r_2 = +\infty$, and
$$P_x\left\{\lim_{t \to \infty}\tilde x(t) = r_1\right\} = \frac{r_2 - x}{r_2 - r_1}, \qquad P_x\left\{\lim_{t \to \infty}\tilde x(t) = r_2\right\} = \frac{x - r_1}{r_2 - r_1}$$
when $r_2 < +\infty$.

PROOF. Suppose that $\int_{r_1}^c \tilde b^{-2}(y)\,dy = +\infty$. Assume that
$$P_x\{\tilde x(\tau_{[r_1,c]}) = r_1,\ \tau_{[r_1,c]} < t\} > q > 0$$
for some $x \in\ ]r_1, c[$ and $t > 0$. Then obviously
$$P_y\{\tilde x(\tau_{[r_1,c]}) = r_1,\ \tau_{[r_1,c]} < t\} > q$$
for all $r_1 < y \le x$. Further,
$$\tau_{[r_1,c]} = \tau_{[x,c]} + I_{\{\tilde x(\tau_{[x,c]}) = x\}}\,\theta_{\tau_{[x,c]}}\tau_{[r_1,c]}$$
for $x < y < c$. Since
$$P_y\{\tau_{[x,c]} < t_1\} \ge 1 - \frac{1}{t_1}E_y\tau_{[x,c]},$$
the probability on the left-hand side can be made arbitrarily close to 1, uniformly with respect to $y \in [x, c]$. Therefore, for some $q > 0$
$$P_y\{\tau_{[r_1,c]} < t + t_1\} > q, \quad y \in\ ]r_1, c[, \qquad P_y\{\tau_{[r_1,c]} > t + t_1\} < 1 - q, \quad y \in\ ]r_1, c[.$$
Then

$$P_y\{\tau_{[r_1,c]} > 2(t+t_1)\} = E_y I_{\{\tau_{[r_1,c]} > t+t_1\}}\,\theta_{t+t_1} I_{\{\tau_{[r_1,c]} > t+t_1\}} = E_y I_{\{\tau_{[r_1,c]} > t+t_1\}} P_{x(t+t_1)}\{\tau_{[r_1,c]} > t+t_1\} \le (1-q)^2,$$

$$P_y\{\tau_{[r_1,c]} > n(t+t_1)\} \le (1-q)^n,$$

$$E_y \tau_{[r_1,c]} \le \sum_{n=0}^{\infty} (1-q)^n (t+t_1) = \frac{t+t_1}{q} < +\infty,$$

but this contradicts the divergence of the integral $\int_{r_1}^{c} \tilde b^{-2}(y)\,dy$. Thus, the process does not hit the point $r_1$ in a finite amount of time. We establish similarly that for $r_2 < +\infty$ it is impossible to hit $r_2$ in a finite amount of time. Therefore, under the conditions of the theorem the process $x(t)$ is defined for all $t > 0$.

Obviously, $x(t)$ is a martingale bounded below by $r_1$. Therefore, the limit $\lim_{t\to\infty} x(t)$ exists with probability 1. This limit cannot be an interior point of $]r_1, r_2[$, as follows from the fact that for every closed bounded interval $[\alpha, \beta]$ interior to $]r_1, r_2[$ the exit time from $[\alpha, \beta]$ is finite. If $r_2 = +\infty$, then for $l > r_1$

$$P_x\Big\{\sup_t x(t) - r_1 > l - r_1\Big\} \le \frac{x - r_1}{l - r_1},$$

i.e., $x(t)$ is bounded with probability 1, and $\lim_{t\to\infty} x(t) = r_1$. But if $r_2 < \infty$, then $x(t)$ is a uniformly integrable martingale, and

$$E_x \lim_{t\to\infty} x(t) = x = P_x\Big\{\lim_{t\to\infty} x(t) = r_1\Big\} r_1 + P_x\Big\{\lim_{t\to\infty} x(t) = r_2\Big\} r_2.$$

This implies (82). □

REMARK. As we see, under the conditions of Theorem 19 there does not exist an invariant measure on $]r_1, r_2[$ for the process $x(t)$.

3.3. Processes with reflection at the boundary. We first consider a process on the half-line $\mathbf{R}_+$ satisfying the equation (see Gikhman-Skorokhod [2], Chapter 6, §3, (34))

$$dx(t) = a(x(t))\,dt + b(x(t))\,dw(t) + d\zeta(t), \tag{83}$$

where $a(x)$ and $b(x)$ are bounded and measurable on $\mathbf{R}_+$, $b(x) > 0$, and $a(x)$, $b(x)$, and $b^{-1}(x)$ are locally bounded. The process $\zeta(t)$ is continuous, nondecreasing, and adapted to the flow on which $x(t)$ and $w(t)$ are defined, and it has as points of increase only the set $\{t: x(t) = 0\}$; we assume that this set has Lebesgue measure 0, and $x(t) \ge 0$ for all $t \ge 0$.
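The role of the reflection term in (83) can be seen in the simplest case $a = 0$, $b = 1$, where it is given by an explicit map applied to the unreflected path. The following is a minimal sketch on a discretized path, not taken from the text: the function names `reflect_at_zero` and `random_walk`, the step size, and the random-walk approximation of $w(t)$ are all assumptions made for illustration.

```python
import random

def reflect_at_zero(y):
    """Discrete reflection map for a path y with y[0] >= 0.

    Returns (x, zeta) with x = y + zeta >= 0, zeta nondecreasing,
    and zeta increasing only at indices where x == 0.
    """
    x, zeta = [], []
    running = 0.0  # equals max(0, max_{k <= n} (-y[k]))
    for value in y:
        running = max(running, -value)
        zeta.append(running)
        x.append(value + running)
    return x, zeta

def random_walk(n_steps, dt, start=0.5, seed=0):
    """Gaussian random walk standing in for start + w(t) on a grid of step dt."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, dt ** 0.5))
    return path

y = random_walk(2000, 1e-3)
x, zeta = reflect_at_zero(y)
```

On the grid, `zeta` plays the part of $\zeta(t)$: it is nondecreasing, and it grows only at indices where the reflected path `x` sits at the boundary point 0.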
Using the same arguments as in Lemma 9, we can transform the equation to the form

$$d\tilde x(t) = \tilde b(\tilde x(t))\,dw(t) + d\tilde\zeta(t), \tag{84}$$
where $\tilde\zeta(t) = \int_0^t h'(x(s))\,d\zeta(s)$, $\tilde x(t) = h(x(t))$, and $h$ and $\tilde b$ are defined by (64). We assume that $r_2 = +\infty$.

THEOREM 20. Suppose that $\tilde b(x)$ and $\tilde b^{-1}(x)$ are locally bounded. The unique invariant $\sigma$-finite measure for $\tilde x(t)$ is the measure

$$\pi(A) = \int_A \tilde b^{-2}(y)\,dy, \qquad A \in \mathfrak{B}_{\mathbf{R}_+}.$$

PROOF. Let $x_1(t)$ be a solution of the stochastic differential equation

$$dx_1(t) = \tilde b_1(x_1(t))\,dw_1(t),$$

where $\tilde b_1(x) = \tilde b(|x|)$, $\tilde b(x)$ is defined on $\mathbf{R}_+$, and $w_1(t)$ is a Wiener process. Then the process $|x_1(t)|$ satisfies

$$d(|x_1(t)|) = \tilde b_1(|x_1(t)|)\,\operatorname{sgn} x_1(t)\,dw_1(t) + d\zeta(t),$$

where $\zeta(t)$ is a nondecreasing process whose points of increase can only be the zeros of the process $x_1(t)$, and $\int_0^t \operatorname{sgn} x_1(s)\,dw_1(s) = w(t)$ is also a Wiener process. Hence, a solution of (84) can be written as $|x_1(t)|$. If $f(x)$ is a symmetric continuous compactly supported function on $\mathbf{R}$, then for $x \in \mathbf{R}$

$$E_x f(x_1(t)) = E_{|x|} f(\tilde x(t)),$$

$$\int_{-\infty}^{\infty} \tilde b_1^{-2}(x) f(x)\,dx = \int_{-\infty}^{\infty} \tilde b_1^{-2}(x) E_x f(x_1(t))\,dx = \int_{-\infty}^{\infty} E_{|x|} f(\tilde x(t))\,\tilde b_1^{-2}(x)\,dx = 2\int_0^{\infty} E_x f(\tilde x(t))\,\tilde b^{-2}(x)\,dx = 2\int_0^{\infty} \tilde b^{-2}(x) f(x)\,dx$$

(we have used Theorem 18). The same theorem gives us that the invariant measure is unique. □

REMARK. If $\int_0^{\infty} \tilde b^{-2}(x)\,dx < \infty$, then it follows from Theorem 17 that

$$\lim_{t\to\infty} E_x f(\tilde x(t)) = \int \tilde b^{-2}(y) f(y)\,dy \left(\int \tilde b^{-2}(y)\,dy\right)^{-1}$$

for every continuous bounded function $f$ on $\mathbf{R}_+$.

We consider now an equation for a process on a finite interval with instantaneous reflection at the endpoints. Assume from the start that $a = 0$ and the equation has the form

$$dx(t) = \tilde b(x(t))\,dw(t) + d\zeta_1(t) - d\zeta_2(t), \tag{85}$$
where $\tilde b(x)$ is bounded and measurable on $[0, c]$, $\tilde b(x) > 0$, $\tilde b^{-1}(x)$ is also bounded, and $\zeta_1(t)$ and $\zeta_2(t)$ are increasing continuous processes adapted to the flow of $\sigma$-algebras on which the Wiener process $w(t)$ and the continuous process $x(t)$ are given. Further, $x(t) \in [0, c]$ for all $t$, and the sets $\{t: x(t) = 0\}$ and $\{t: x(t) = c\}$ have Lebesgue measure 0.

A solution of (85) can be constructed as follows. Let $\tilde b_1(x)$ be defined for $x \in \mathbf{R}$ by the equalities $\tilde b_1(x) = \tilde b(x)$ for $x \in [0, c]$, $\tilde b_1(x) = \tilde b(2c - x)$ for $x \in [c, 2c]$, and $\tilde b_1(x + 2c) = \tilde b_1(x)$ for all $x$. Let $l(x) = x$ for $x \in [0, c]$ and $l(x) = c - |c - x|$ for $x \in [c, 2c]$, and extend it to $l(x)$ for $x \in \mathbf{R}$ as a periodic function with period $2c$. If $x_1(t)$ is a solution of the equation

$$dx_1(t) = \tilde b_1(x_1(t))\,dw_1(t),$$

then $l(x_1(t))$ is a solution of an equation of the form (85) with $w(t) = \int_0^t l'(x_1(s))\,dw_1(s)$. Then it follows from Theorems 17 and 18 that the only invariant measure for $x(t)$ is

$$\pi(A) = \int_{A \cap [0,c]} \tilde b^{-2}(y)\,dy.$$

This measure is clearly finite. Therefore, the ergodic theorem is valid; in particular,

$$\lim_{t\to\infty} E_x f(x(t)) = \int_0^c f(y)\,\tilde b^{-2}(y)\,dy \left(\int_0^c \tilde b^{-2}(y)\,dy\right)^{-1}. \tag{86}$$

4. Ergodic theorems for solutions of stochastic equations in $\mathbf{R}^d$

The ergodic behavior of a Markov process (see §1) is connected with the set of invariant $\sigma$-finite measures for this process. Ergodicity is equivalent to uniqueness and finiteness for such a measure. In this section we study invariant measures for solutions of stochastic differential equations in $\mathbf{R}^d$. Under the assumption that the equation has a weakly unique solution (see Gikhman-Skorokhod [2], p. 571) the transition probability $P(t, x, A)$ for these solutions has the following property: for all $f \in C_{\mathbf{R}^d}$

$$T_t f(x) = \int f(y) P(t, x, dy) \in C_{\mathbf{R}^d}.$$
(87)

Therefore, in the first place we study invariant measures for homogeneous Markov processes satisfying the condition just formulated, i.e., Feller processes, and we consider processes on compact spaces separately.
4.1. Invariant measures for processes on compact spaces. Let $X$ be a compact metric space, and $C_X$ the space of continuous functions. We consider a homogeneous Markov process in $X$ with transition probability $P(t, x, A)$ such that (87) holds. Such Markov processes can arise as solutions of stochastic differential equations whose solutions lie on bounded closed surfaces, or for solutions of equations in a bounded region with reflection at the boundary.

THEOREM 21. There exist finite invariant measures for a Markov process. The collection $M_I$ of all invariant probability measures is a closed convex set (in the weak convergence). If $\widehat M_I$ is the set of extremal points of $M_I$, then it coincides with the set of ergodic measures, and every measure $\pi \in M_I$ can be represented as

$$\pi = \int_{\widehat M_I} \alpha\,\nu(d\alpha),$$

where $\alpha \in \widehat M_I$ and $\nu$ is a probability measure on $\widehat M_I$. For each measure $\alpha \in \widehat M_I$ there is a measurable invariant set $A_\alpha \subset X$ such that $\alpha(A_\alpha) = 1$ and $A_\alpha \cap A_{\alpha'} = \emptyset$ for $\alpha \ne \alpha'$.

PROOF. The set of probability measures on $X$ is a compact metrizable space in the topology of weak convergence. Let $\mu(dx)$ be an arbitrary probability measure on $X$, and let $T_n \to \infty$. The sequence of measures

$$\mu_n(A) = \frac{1}{T_n} \int_0^{T_n} \int \mu(dx) P(t, x, A)\,dt$$

is compact; therefore, it has a weakly convergent subsequence. We can assume without loss of generality that $\mu_n$ converges to some measure $\pi$. Let us show that $\pi$ is invariant. For every $f \in C_X$

$$\iint \pi(dx) P(t, x, dy) f(y) = \lim_{n\to\infty} \iint \mu_n(dx) P(t, x, dy) f(y)$$

$$= \lim_{n\to\infty} \frac{1}{T_n} \int_0^{T_n} \iint \mu(dx) P(s + t, x, dy) f(y)\,ds$$

$$= \lim_{n\to\infty} \frac{1}{T_n} \int_t^{t + T_n} \iint \mu(dx) P(s, x, dy) f(y)\,ds$$

$$= \lim_{n\to\infty} \left[\frac{1}{T_n} \int_0^{T_n} \iint \mu(dx) P(s, x, dy) f(y)\,ds + O\Big(\frac{1}{T_n}\Big)\right]$$

$$= \lim_{n\to\infty} \int \mu_n(dy) f(y) = \int \pi(dy) f(y).$$

Thus $\pi$ is invariant. The convexity of the set of invariant measures is obvious. Closedness follows from the Feller property of the transition
probability: if $\pi_n \in M_I$ converges weakly to $\pi$, then, since $T_t f \in C_X$ for $f \in C_X$, we have that

$$\int f(x)\,\pi(dx) = \lim_{n\to\infty} \int f(x)\,\pi_n(dx) = \lim_{n\to\infty} \int \pi_n(dx)\,T_t f(x) = \int \pi(dx)\,T_t f(x).$$

Recall that a measure $\pi \in M_I$ is said to be extremal if there do not exist $0 < \lambda < 1$ and $\pi_1, \pi_2 \in M_I$, $\pi_1 \ne \pi_2$, such that $\pi = \lambda\pi_1 + (1 - \lambda)\pi_2$. The fact that a compact convex subset of a linear space has extremal points and all the elements of this set are representable by integrals over the set of such points follows from the Krein-Mil'man theorem.

Let $\alpha$ be an extremal measure in $M_I$. If it were not ergodic, then there would be an invariant set $F$ such that $0 < \alpha(F) < 1$ and $\alpha(G) = \alpha(G \cap F) + \alpha(G \setminus F)$, and both measures on the right-hand side are nonzero and invariant, so that, setting

$$\alpha'(G) = \frac{1}{\alpha(F)}\,\alpha(G \cap F), \qquad \alpha''(G) = \frac{1}{1 - \alpha(F)}\,\alpha(G \setminus F),$$

we get two invariant probability measures such that $\alpha = \alpha(F)\alpha' + (1 - \alpha(F))\alpha''$, contradicting the assumption that $\alpha$ is extremal. The measure $\alpha$ is hence trivial on the $\sigma$-algebra generated by the invariant sets; therefore, it is ergodic.

Let $f_k(x)$ be a countable dense sequence in $C_X$. On the basis of Theorem 6,

$$\lim_{t\to\infty} \frac{1}{t} \int_0^t \int P(s, x, dy) f_k(y)\,ds = \int f_k(y)\,\alpha(dy)$$

for almost all $x$ with respect to the measure $\alpha \in \widehat M_I$. Let

$$A_\alpha = \bigcap_k \Big\{x: \lim_{t\to\infty} \frac{1}{t} \int_0^t \int P(s, x, dy) f_k(y)\,ds = \int f_k(y)\,\alpha(dy)\Big\}.$$

Then $\alpha(A_\alpha) = 1$, and $A_\alpha \cap A_{\alpha'} = \emptyset$ for $\alpha \ne \alpha' \in \widehat M_I$, because for at least one $k$

$$\int f_k(y)\,\alpha(dy) \ne \int f_k(y)\,\alpha'(dy). \quad □$$

REMARK 1. If an ergodic measure $\nu$ is unique, then for all $f \in C_X$

$$\lim_{t\to\infty} \sup_x \left|\frac{1}{t} \int_0^t \int f(y) P(s, x, dy)\,ds - \int f(y)\,\nu(dy)\right| = 0,$$

since otherwise the family of measures $t^{-1} \int_0^t P(s, x, \cdot)\,ds$ would have a limit point other than $\nu$ as $t \to \infty$.
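The averaging construction in the proof of Theorem 21 can be tried out on a finite state space, where every transition matrix is trivially Feller. The sketch below is an illustration only and replaces continuous time by discrete steps; the matrix `P`, the starting measure `mu`, and the function names are assumptions. The chain has two closed classes, $\{0, 1\}$ and $\{2\}$, and one transient state, so there are exactly two extremal invariant measures, with disjoint supports.

```python
def step(mu, P):
    """Return the row vector mu P (the distribution after one step)."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

def cesaro_average(mu, P, n_steps):
    """Return (1/n) * sum_{k=0}^{n-1} mu P^k, a discrete analogue of mu_n."""
    total = [0.0] * len(mu)
    current = list(mu)
    for _ in range(n_steps):
        total = [t + c for t, c in zip(total, current)]
        current = step(current, P)
    return [t / n_steps for t in total]

# Closed classes {0, 1} and {2}; state 3 is transient.
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.2, 0.8, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.3, 0.0, 0.3, 0.4],
]
mu = [0.0, 0.0, 0.0, 1.0]   # start in the transient state
n_steps = 5000
mu_n = cesaro_average(mu, P, n_steps)

# mu_n P - mu_n = (mu P^n - mu) / n, so the defect is at most 2 / n in l1 norm.
residual = sum(abs(a - b) for a, b in zip(step(mu_n, P), mu_n))
```

Here `mu_n` is close to the mixture $\tfrac12\alpha_1 + \tfrac12\alpha_2$ of the two ergodic measures $\alpha_1 = (2/7, 5/7, 0, 0)$ and $\alpha_2 = (0, 0, 1, 0)$, the weights being the absorption probabilities from state 3.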
Let $\alpha(dx)$ be a measure in $M_I$, and $F$ the smallest closed set with $\alpha(F) = 1$. We show that $P(t, x, F) = 1$ for all $x \in F$. We have that

$$1 = \alpha(F) = \int \alpha(dy) P(t, y, F),$$

i.e., $P(t, y, F) = 1$ for almost all $y$ with respect to the measure $\alpha(dy)$. But $\alpha(U) > 0$ for every open set $U$ with $U \cap F \ne \emptyset$ (by the construction of $F$). Hence, $P(t, y, F) = 1$ for a dense subset of $F$. If $y_n \to y$ and $P(t, y_n, F) = 1$, then, since the measures $P(t, y_n, dx)$ converge weakly to the measure $P(t, y, dx)$,

$$\int P(t, y, dx)\,\varphi(x) = \lim_{n\to\infty} \int P(t, y_n, dx)\,\varphi(x) \ge \lim_{n\to\infty} P(t, y_n, F) = 1$$

for any $\varphi \in C_X$ with $\varphi \ge I_F$. Hence,

$$P(t, y, F) = \inf_{\varphi \ge I_F} \int P(t, y, dx)\,\varphi(x) \ge 1,$$

i.e., the set of $x$ with $P(t, x, F) = 1$ is closed. This implies that $P(t, x, F) = 1$ for all $x \in F$.

REMARK 2. Denote by $S(x)$, $x \in X$, the smallest closed set such that $P(t, x, S(x)) = 1$ for all $t > 0$. Obviously, $x \in S(x)$. If $U$ is an open set in the complement of $S(x)$, then $P(t, x, U) = 0$ for all $t > 0$. Clearly, $S(x)$ is an invariant set, and $S(y) \subset S(x)$ for $y \in S(x)$.

DEFINITION. A process is said to be topologically weakly recurrent if $S(y) = S(x)$ for $y \in S(x)$.

Topological weak recurrence has the following meaning: if $x$ and $y$ are points in $X$ and for every neighborhood $U_1$ of $y$ there is a $t$ with $P(t, x, U_1) > 0$, then for each neighborhood $U_2$ of $x$ there is an $s$ with $P(s, y, U_2) > 0$. If a process is topologically weakly recurrent, then for any $x$ and $y$ there are only two possibilities: either $S(x) \cap S(y) = \emptyset$, or $S(x) = S(y)$.

Suppose that a process is topologically weakly recurrent. Then each of the sets $S(x)$ can be regarded as the phase space of the process, and it no longer contains closed invariant subsets.

DEFINITION. A Feller Markov process is said to be irreducible if it does not have closed invariant subsets different from the whole space.

Let us consider the question of Harris recurrence for an irreducible process on a compact space. We need the following auxiliary assertion.

LEMMA 11.
Let $p(x, y)$ be a measurable function on $X \times X$, and $\lambda(dy)$ a finite measure on $\mathfrak{B}$, with

a) $\int p(x, y)\,\lambda(dy) = 1$;
b) $\int p(x, y)\,\varphi(y)\,\lambda(dy) \in C_X$ if $\varphi \in C_X$.

Then the following assertions are true.

1) The function $p(x, y)$ is integrable with respect to $\lambda(dy)$, uniformly with respect to $x$.

2) $\int \psi(y)\,p(x, y)\,\lambda(dy) \in C_X$ for all bounded measurable functions $\psi$.

PROOF. 1) It suffices to show that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for any closed set $F$ with $\lambda(F) < \delta$

$$\int_F p(x, y)\,\lambda(dy) < \varepsilon \quad \forall x \in X.$$

Assume the opposite. Suppose that $\lambda(F_n) \to 0$ but $\int_{F_n} p(x_n, y)\,\lambda(dy) \ge \varepsilon$. It can be assumed without loss of generality that $F_n \downarrow$ and $x_n \to \bar x$, and hence

$$\int_{F_m} p(x_n, y)\,\lambda(dy) \ge \varepsilon$$

for $n > m$. It follows from b) that for the closed set $F_m$

$$\varepsilon \le \lim_{n\to\infty} \int_{F_m} p(x_n, y)\,\lambda(dy) \le \int_{F_m} p(\bar x, y)\,\lambda(dy),$$

and hence the right-hand side does not tend to zero as $m \to \infty$, even though $\lambda(F_m) \to 0$. We have arrived at a contradiction.

2) Suppose that $\varphi_n(y)$ is a sequence of functions in $C_X$ with $\sup_n \|\varphi_n\| < \infty$ such that

$$\lim_{n\to\infty} \int |\varphi_n(y) - \psi(y)|\,\lambda(dy) = 0.$$

Then

$$\left|\int p(x, y)\,\varphi_n(y)\,\lambda(dy) - \int p(x, y)\,\psi(y)\,\lambda(dy)\right| \le 2 \int_{\{y:\, |\varphi_n(y) - \psi(y)| > \varepsilon\}} p(x, y)\,\lambda(dy) + \varepsilon. \tag{88}$$

Since

$$\lambda(\{y: |\varphi_n(y) - \psi(y)| > \varepsilon\}) \le \frac{1}{\varepsilon} \int |\varphi_n(y) - \psi(y)|\,\lambda(dy) \to 0,$$

the first term on the right-hand side of (88) tends to zero uniformly with respect to $x$, by part 1). Hence, $\int \psi(y)\,p(x, y)\,\lambda(dy)$ is a uniform limit of continuous functions. □

REMARK. If $X$ is a locally compact space and the conditions of Lemma 11 hold, then $p(x, y)$ is integrable, uniformly with respect to $x$ in any compact set $K$, and assertion 2) is also valid.
THEOREM 22. Suppose that a continuous process has a probability measure $\pi(dy)$ with support dense in $X$ such that for some $\lambda > 0$ the transition probability

$$Q_\lambda(x, A) = \lambda \int_0^\infty e^{-\lambda t} P(t, x, A)\,dt$$

is absolutely continuous with respect to $\pi(dy)$. Then a Markov chain with transition probability $Q_\lambda(x, A)$ is Harris-recurrent with respect to some measure $\pi'$ that is absolutely continuous with respect to $\pi$.

PROOF. Since $\int Q_\lambda(x, dy)\,\varphi(y) \in C_X$ for $\varphi \in C_X$, Lemma 11 gives us that $Q_\lambda(x, E)$ is continuous in $x$ for all measurable sets $E$. The set $\{x: Q_\lambda(x, E) > 0\}$ is open, and the set $\{x: Q_\lambda(x, E) = 0\}$ is closed and invariant. Therefore, either $Q_\lambda(x, E) = 0$ for all $x$, or $Q_\lambda(x, E) > 0$ for all $x$. Let us take $Q_\lambda(\bar x, A)$ as $\pi'$, where $\bar x$ is a particular point. It follows from what was proved above that $Q_\lambda(x, A)$ is equivalent to $\pi'$ for all $x \in X$. Suppose that $\pi'(E) > 0$. Then $Q_\lambda(x, E) > 0$ for all $x$, and, since $Q_\lambda(x, E)$ is continuous in $x$ and $X$ is compact,

$$\inf_{x \in X} Q_\lambda(x, E) = \rho > 0.$$

Let $\eta_k$ be a homogeneous Markov chain in $X$ with transition probability $P(x, A) = Q_\lambda(x, A)$, and let $P_x$ be the probability constructed from $P(x, A)$. Denote by $\nu_E$ the first time $\eta_k$ hits the set $E$. To prove the theorem it suffices to show that $P_x\{\nu_E < \infty\} = 1$ for all $x \in X$. But

$$P_x\{\nu_E > n\} = P_x\{\eta_1 \notin E, \ldots, \eta_n \notin E\} = \int_{X \setminus E} P_x\{\eta_1 \notin E, \ldots, \eta_{n-2} \notin E,\ \eta_{n-1} \in dy\}\,P(y, X \setminus E)$$

$$\le (1 - \rho)\,P_x\{\eta_1 \notin E, \ldots, \eta_{n-1} \notin E\} \le (1 - \rho)^n.$$

This inequality proves the theorem. □

REMARK. Under the conditions of the theorem

$$\lim_{t\to\infty} \sup_x \left|\frac{1}{t} \int_0^t T_s f(x)\,ds - \int f(x)\,\mu(dx)\right| = 0$$

for every bounded measurable function $f$, where $\mu$ is the unique invariant measure for the process. The existence of an invariant measure follows from Theorem 21; its uniqueness follows from Theorem 8, the corollary to that theorem, and the absolute continuity of every invariant measure with respect to $\pi$; and Theorem 2 gives us that

$$\lim_{t\to\infty} \int |h_t(x)|\,\mu(dx) = 0,$$
where

$$h_t(x) = \frac{1}{t} \int_0^t T_s f(x)\,ds - \int f(x)\,\mu(dx).$$

The functions $h_t(x)$ are bounded. Obviously, the invariant measure is absolutely continuous with respect to $\pi$, and hence $\lim_{t\to\infty} \int |h_t(x)|\,\pi(dx) = 0$. It follows from Lemma 11 that

$$\lim_{t\to\infty} \sup_x \int |h_t(y)|\,Q_\lambda(x, dy) = 0.$$

Hence,

$$\lim_{t\to\infty} \sup_x \left|\int Q_\lambda(x, dy)\,\frac{1}{t} \int_0^t T_s f(y)\,ds - \int f(x)\,\mu(dx)\right| = 0.$$

But

$$\left|\frac{1}{t} \int_0^t T_s f(x)\,ds - \int Q_\lambda(x, dy)\,\frac{1}{t} \int_0^t T_s f(y)\,ds\right|$$

$$= \left|\frac{1}{t} \int_0^t T_s f(x)\,ds - \frac{\lambda}{t} \int_0^\infty e^{-\lambda u}\,du \int_0^t T_{u+s} f(x)\,ds\right|$$

$$= \left|\frac{1}{t} \int_0^t T_s f(x)\,ds - \frac{\lambda}{t} \int_0^\infty T_v f(x) \int_0^{t \wedge v} e^{-\lambda(v - s)}\,ds\,dv\right|$$

$$= \left|\frac{1}{t} \int_0^t T_s f(x)\,ds - \frac{1}{t} \int_0^\infty \big(e^{\lambda(v \wedge t)} - 1\big) e^{-\lambda v}\,T_v f(x)\,dv\right|$$

$$\le \|f\| \left(\frac{1}{t} \int_0^t e^{-\lambda v}\,dv + \frac{1}{t} \int_t^\infty \big(e^{-\lambda v} + e^{-\lambda(v - t)}\big)\,dv\right) \le \frac{2\|f\|}{\lambda t}.$$

The assertion of the remark is a consequence of this estimate.

4.2. Locally compact spaces. We consider homogeneous processes in a locally compact phase space $X$. Let $C_0$ be the space of continuous functions $\varphi(x)$ such that $\lim_{x\to\infty} \varphi(x) = 0$. It will be assumed that the transition probability $P(t, x, A)$ has the following regularity property: for all $\varphi \in C_0$

$$T_t \varphi(x) = \int P(t, x, dy)\,\varphi(y) \in C_0.$$

We say that a sequence of finite measures $\mu_n$ on $X$ is $C_0$-convergent to a measure $\mu$ if for all $f \in C_0$

$$\lim_{n\to\infty} \int f(x)\,\mu_n(dx) = \int f(x)\,\mu(dx).$$
If $\mu_n$ is $C_0$-convergent to $\mu$, then $\mu(X) \le \liminf_{n\to\infty} \mu_n(X)$, and a $C_0$-limit of probability measures is not necessarily a probability measure. For a $C_0$-limit of probability measures also to be a probability measure it is necessary (and sufficient) for the sequence of measures $\mu_n$ to be weakly compact (in $X$).

We consider measures of the form

$$\mu_t(A) = \frac{1}{t} \int_0^t \int \nu(dx) P(s, x, A)\,ds.$$

Every $C_0$-limit measure for $\mu_t$ is invariant for the process. Indeed, if $\mu$ is the $C_0$-limit of a sequence $\mu_{t_n}$, then for $f \in C_0$

$$\int f(x)\,\mu(dx) = \lim_{n\to\infty} \frac{1}{t_n} \int_0^{t_n} \int \nu(dx)\,T_s f(x)\,ds = \lim_{n\to\infty} \frac{1}{t_n} \int_0^{t_n} \int \nu(dx)\,T_{s+h} f(x)\,ds = \int T_h f(x)\,\mu(dx).$$

DEFINITION. A Markov process with transition probability $P(t, x, A)$ is said to be bounded in probability if the family

$$\Big\{\nu_t(A) = \int \nu(dx) P(t, x, A),\ t > 0\Big\}$$

of measures is weakly compact for any finite measure $\nu$ on $X$.

Obviously, boundedness in probability implies the compactness of the family

$$\Big\{\mu_t = \frac{1}{t} \int_0^t \nu_s\,ds,\ t > 0\Big\}$$

of measures, which, in turn, ensures the existence of invariant probability measures, and hence ergodic measures. We present a condition for Harris recurrence for Markov processes in a locally compact space.

THEOREM 23. Suppose that a process $\{\Omega, \mathfrak{F}_t, P_x\}$ in a locally compact space $X$ satisfies the following conditions:

a) it is irreducible;

b) it is bounded in probability; and

c) there exists a probability measure $\pi(dy)$ with closed support $X$ such that for some $\lambda$ the transition probability $Q_\lambda(x, dy)$ is absolutely continuous with respect to $\pi$.

Then:

1) There exists an invariant measure $\pi'$ absolutely continuous with respect to $\pi$ for which the process is Harris-recurrent.

2) For all $x \in X$ and all bounded $E \in \mathfrak{B}$

$$\lim_{t\to\infty} \frac{1}{t} \int_0^t P(s, x, E)\,ds = \pi'(E). \tag{89}$$
PROOF. Using Lemma 11 and the remark after it, we see that for every bounded Borel set $E$ the function $Q_\lambda(x, E)$ is continuous in $x$, and it belongs to $C_0$. As in the proof of Theorem 22, we see that the measures $Q_\lambda(x, \cdot)$ are equivalent for different $x$. If $\nu$ is a measure to which they are all equivalent, then $\inf_{x \in K} Q_\lambda(x, E) > 0$ if $E \in \mathfrak{B}$ is a bounded set with $\nu(E) > 0$, for any compact set $K$. We show that a Markov chain with transition probability $Q_\lambda(x, E)$ is recurrent with respect to the measure $\nu$. Denote by $\tau_E$ the first time the chain hits $E$. It suffices to show that $P_x\{\tau_E < \infty\} = 1$ for all $x$. It follows from condition b) that for every $\varepsilon > 0$ there is a compact set $K$ such that

$$P_x\Big\{\bigcap_n \bigcup_{k > n} \{X_k \in K\}\Big\} \ge 1 - \varepsilon$$

(here $X_k$ is a Markov chain with transition probability $Q_\lambda(x, E)$, and $P_x$ is the probability corresponding to it). Let $k_1, k_2, \ldots$ be a (finite or infinite) sequence of stopping times such that $X_{k_i} \in K$. Since

$$P_x\{X_{k_i + 1} \in E \mid X_{k_i}\} \ge \rho > 0,$$

where $\rho$ is a particular number, it follows that $\tau_E < \infty$ if the sequence $\{k_i\}$ is infinite. Hence, $P_x\{\tau_E < \infty\} \ge 1 - \varepsilon$, and $\varepsilon$ is arbitrary (does not depend on $E$). Recurrence with respect to $\nu$ is proved. By construction, the transition probability is absolutely continuous with respect to $\nu$, and hence the unique invariant measure is absolutely continuous with respect to $\nu$. The chain is recurrent also with respect to this measure, and condition b) implies that it is finite. The proof of assertion 2) is contained in the remark after Theorem 22. □

REMARK. Obviously, recurrence with respect to a measure with support $X$ implies irreducibility of the process. We show that under condition c) the existence of a finite invariant measure with closed support $X$ implies also condition b). Note that $P(t, x, E)$ is absolutely continuous with respect to $\pi'(E)$.
Indeed, if $\pi'(E) = 0$, then

$$P(t, x, E) = \frac{1}{t} \int_0^t \int P(t - s, x, dy) P(s, y, E)\,ds = 0,$$

because $\int_0^\infty e^{-\lambda s} P(s, y, E)\,ds = 0$ for all $y$, so that $P(s, y, E) = 0$ for almost all $s$. Denote the density of $P(h, x, E)$ with respect to $\pi'(E)$ by $p(h, x, y)$.
Then for $t > h$

$$P(t, x, E) = \int p(h, x, y) P(t - h, y, E)\,\pi'(dy) \le \int_{\{p(h,x,y) > c\}} p(h, x, y)\,\pi'(dy) + c \int P(t - h, y, E)\,\pi'(dy)$$

$$= \int_{\{p(h,x,y) > c\}} p(h, x, y)\,\pi'(dy) + c\,\pi'(E).$$

It follows from Lemma 11 (and the remark after it) that $p(h, x, y)$ is integrable uniformly for $x$ in any compact set $K$. Therefore, for any compact sets $K$ and $K_\varepsilon$,

$$\sup_{x \in K,\ t > 0} P(t, x, X \setminus K_\varepsilon) \le \sup_{x \in K} \int I_{\{p(h,x,y) > c\}}\,p(h, x, y)\,\pi'(dy) + c\,\pi'(X \setminus K_\varepsilon).$$

Choosing $c$ such that the first term on the right-hand side is less than $\varepsilon/2$ and then choosing the compact set $K_\varepsilon$ such that $c\,\pi'(X \setminus K_\varepsilon) < \varepsilon/2$, we see that for every $\varepsilon > 0$ and every compact set $K$ there exists a compact set $K_\varepsilon$ such that $P(t, x, X \setminus K_\varepsilon) < \varepsilon$ for all $x \in K$ and $t > 0$. This clearly implies that the Markov process is bounded in probability.

4.3. Solutions of stochastic equations in $\mathbf{R}^d$. We shall be interested in conditions under which the conditions of Theorem 23 are valid for solutions of time-homogeneous stochastic differential equations in $\mathbf{R}^d$. In §2 we studied conditions under which the transition probability (or the time-integrated transition probability) has a density with respect to Lebesgue measure. Therefore, it remains for us to determine conditions under which a Markov process solving an equation is irreducible or bounded in probability. Sufficient conditions for the validity of these assertions are presented below.

THEOREM 24. Suppose that $x(t)$ is a solution of the equation

$$dx(t) = a(x(t))\,dt + B(x(t))\,dw(t) + \int f_1(x(t), \theta)\,\tilde\mu_1(dt \times d\theta), \tag{90}$$

where the coefficients $a(x)$ and $B(x)$ are continuous, while $f_1(x, \theta)$ is continuous with respect to $x$ in $L_2(m_1(d\theta))$, and assume that conditions for the existence and weak uniqueness of a solution are satisfied. Denote by $N(x)$ the linear subspace $B(x)\mathbf{R}^d$, by $S(x)$ the set of vectors $z$ such that $m_1(\{\theta: |z - f_1(x, \theta)| < \varepsilon\}) > 0$ for all $\varepsilon > 0$, and by $D(x)$ the smallest set containing $S(x)$ and, with each point $y \in D(x)$, the vectors $y + z$ for all $z \in S(y)$.
Then the Markov process x(t) is irreducible if the 
4. SOLUTIONS OF STOCHASTIC EQUATIONS IN R d 67 algebraic sum N(x) n V +S(x) n V of sets is dense in V for any point x E X and any ball U about O. PROOF. Let F(x) be the smallest closed set such that P(t,x,F(x)) = 1 for all t > 0; F(x) is an invariant set. To prove the theorem it is necessary to show that F(x) = X for all x E X. If this is not so and S c X\F(x) is an open ball whose boundary contains points of F(x), then P(t,y,S) = 0 for all t > 0 and y E F(x). Let Y ES' n F(x), where S' is the boundary of S. It can be assumed without loss of generality that y = O. Let c denote the center of S. We show that for any ball VI about ayE S(x) there exist arbitrarily small t > 0 such that P{t,x,V l } = P{C;x(t) E VI} > 0, where C;x(t) is the solution of the equation C;x(t) = x + 1 1 a(C;x(t)) ds + 1 1 B(C;x(t)) dw(s) + 1 1  fi (C;x(S), O)Jl.l (ds x dO). Suppose that there exists a  > 0 such that P(t, x, VI) = 0 for t < . Let r be the radius of VI. It follows from the conditions on fi that there exist an e < r/2 and a subset C l , with m(C l ) < 00, such that ml(C l n{8: Ifi(z,8)-yl < }) > !ml(C I n{8: Ifi(x,8)-yl < }) for Ix - zl < e. Let , be the first jump time of the process J1.1 ([0, t] X C l ), and C;(t) the solution of the equation C;(t) = x + 1 1 a(c;(s)) ds + 1 1 B(C;(s)) dw(s) + t f fi (c;(s), O)Jl.l (ds x dO) - t ! fi (c;(s), 8)ml (d8) ds; hJc h  C;(t) does not depend on the measure J1.1 ([0, t] X C l n d8), C;(t) = C;x(t) for t < " and C;x(') = C;(,) + f(c;(,), 8'); here 8' is a point such that {L ! (8)J1.1 (ds x d8) = (8) - ,ml (C l ). J o C 1 Denote by,' the first exit time of C;(t) from the ball {z: Ix - zl < r/2}; then P{C;x(') E VI,' < } > P{, <  < ,'; If(c;(,), 8') - yl < } > !ml(C I n {8: If(x,8) - yl < })P(, < )P(,' >) > 0 
68 I. ERGODIC THEOREMS for any  > O. But P(US<d{C;X(S) E VI}) = 0 under our assumption. We have arrived at a contradIction. It follows at once that for any x, any y E D(x), and any ball about y there exist arbitrarily small t such that P(t, x, VI) > O. If P(t, 0, S) = 0 for all t > 0, where S is the ball about c of radius Icl, then D(O) nS = 0. Since c belongs to the closure of N(O) + D(O), we have that c E N(O). We now observe that + J a(c;o(s)) ds --+ 0 in probability,  It f ii (C;o(s), 0)I{lf(o(s),8)I>e},u1 (dO x ds) -+ 0 in probability for any e > 0 because f In/i(o(s),8)I>e}ml (dO) <  f Iii (C;o (s), 0)IInfi(o(s),8)I>e}ml (dO) < :2 f Iii (C;o (s), 0)1 2m l (dO) < 00, and hence It f ii (C;o(s), 0)In/i(o(s),8)1>e}VI (dO) = 0 for sufficiently small t, and 1 {I f d o 10 Iii (C;o(s))II nfi ('o(s),8)1>e} m.1 (dO) ds = O( 0). Further, 1 {lAd f .n 10 ii (C;o (s), 0)In/i(o(s),8)I:::;e},u1 (dO x ds) is a martingale with characteristic  ltA6 f Iii (C;o(s), 0) 1 2 I{lfi (o(s),8)1:::;e} m 1 (d 0) ds, which tends to zero as e --+ O. Hence, for every p > 0 and  > 0, lim sup P {  t ii (C;o(s), O)I{lfi (o(s),8)I:::;e},u1 (d 0 x ds) > P } = 0, £-+0 Id u 10 and, therefore,  It f ii (C;o (s), O),ul (dO x ds) -+ 0 in probability as t --+ O. I t is easy to see that 1 (I o 10 B(c;o(s)) dw(s) 
has a limiting normal distribution coinciding with that of $B(0)w(1)$. Hence,

$$\lim_{t\to 0} P\{\xi_0(t) \in S\} = \lim_{t\to 0} P\Big\{\frac{1}{\sqrt t}\,\xi_0(t) \in S_{|c|/\sqrt t}\Big(\frac{c}{\sqrt t}\Big)\Big\} = \lim_{t\to 0} P\Big\{\frac{1}{\sqrt t}\,\xi_0(t) \in \{x: (c, x) > 0\}\Big\} = P\{(B(0)w(1), c) > 0\} = \frac12,$$

since $B(0)w(1)$ is a Gaussian variable with mean 0 that is not orthogonal to $c$ with probability 1 (here $S_r(z)$ is the ball of radius $r$ about $z$). We have arrived at a contradiction. □

Let us now consider conditions for boundedness in probability of a Markov process. We first establish the following auxiliary fact.

LEMMA 12. Suppose that a process is irreducible and has a transition probability density. Then it is bounded in probability if and only if there exists a continuous function $\psi(x) \ge 0$ such that $\psi(x) \to \infty$ as $|x| \to \infty$ and

$$\sup_{t > 0,\ |x| \le c} \int P(t, x, dy)\,\psi(y) < \infty$$

for all $c > 0$.

PROOF. The sufficiency of the condition follows from the inequality

$$\sup_{t > 0,\ |x| \le c} P(t, x, \{y: |y| \ge r\}) \le \frac{1}{\inf_{|y| \ge r} \psi(y)} \sup_{t > 0,\ |x| \le c} \int P(t, x, dy)\,\psi(y)$$

and the fact that the right-hand side of this inequality tends to zero as $r \to \infty$. To prove the necessity we choose a sequence $r_n \uparrow \infty$ such that

$$\sup_{t > 0,\ |x| \le r_n} P(t, x, \{y: |y| > r_{n+1}\}) \le 2^{-n}.$$

This is possible because the process is bounded in probability in view of the remark after Theorem 23. Let $\psi(x) = g(|x|)$, where $g(s)$, $s \in \mathbf{R}_+$, is a nonnegative continuous function such that $g(r_n) = n$. Then for $|x| \le r_k$

$$\int \psi(y) P(t, x, dy) = \sum_{n=k}^{\infty} \int_{r_n \le |y| \le r_{n+1}} P(t, x, dy)\,\psi(y) + \int_{|y| \le r_k} P(t, x, dy)\,\psi(y)$$

$$\le k + \sum_{n=k}^{\infty} g(r_{n+1})\,P(t, x, \{|y| > r_n\}) \le k + \sum_{n=k}^{\infty} \frac{n+1}{2^n}. \quad □$$
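The sufficiency half of Lemma 12 rests on a Chebyshev-type estimate, which can be checked directly on a discrete distribution. The sketch below is illustrative only: the function `psi`, the sample points, and the weights are hypothetical choices, and a single fixed distribution stands in for the family $P(t, x, \cdot)$.

```python
import math

def psi(y):
    """Hypothetical test function: continuous, nonnegative, increasing in |y|,
    and tending to infinity as |y| -> infinity."""
    return math.log(1.0 + abs(y))

def tail_probability(points, weights, r):
    """P{|y| >= r} for the discrete law with the given atoms and weights."""
    return sum(w for y, w in zip(points, weights) if abs(y) >= r)

def chebyshev_type_bound(points, weights, r):
    """E psi / inf_{|y| >= r} psi; the infimum equals psi(r) since psi is
    increasing in |y|."""
    mean_psi = sum(w * psi(y) for y, w in zip(points, weights))
    return mean_psi / psi(r)

points = [-8.0, -1.0, 0.0, 2.0, 5.0, 40.0]
weights = [0.05, 0.3, 0.3, 0.2, 0.1, 0.05]
radii = [0.5, 1.0, 3.0, 10.0, 100.0]

ok = [
    tail_probability(points, weights, r) <= chebyshev_type_bound(points, weights, r)
    for r in radii
]
```

Because the bound depends on the distribution only through the mean of `psi`, a uniform bound on that mean over a family of distributions gives a uniform bound on the tails, which is the weak compactness used in the lemma.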
We consider the processes that are solutions of equation (90). Denote by $A$ the operator defined on the twice continuously differentiable functions by

$$A\varphi(x) = (a(x), \varphi'(x)) + \tfrac12 \operatorname{tr} B(x)B^*(x)\varphi''(x) + \int \big[\varphi(x + f_1(x, \theta)) - \varphi(x) - (\varphi'(x), f_1(x, \theta))\big]\,m_1(d\theta).$$

THEOREM 25. Assume conditions for the existence and weak uniqueness of a solution of (90). The solution of the equation is bounded in probability if there exists a nonnegative twice continuously differentiable function $\varphi(x)$ such that $\varphi(x) \to \infty$ as $|x| \to \infty$, $E_x\varphi(x(t))$ and $E_x|A\varphi(x(t))|$ are locally bounded, and

$$A\varphi(x) \le b - c\,\varphi(x) \tag{91}$$

for some $b > 0$ and $c > 0$.

PROOF. Using the Itô formula, we have that

$$\varphi(x(t \wedge \tau)) - \varphi(x(0)) = \int_0^{t \wedge \tau} A\varphi(x(s))\,ds + \int_0^{t \wedge \tau} (B^*(x(s))\varphi'(x(s)), dw(s)) + \int_0^{t \wedge \tau} \int \big(\varphi(x(s) + f_1(x(s), \theta)) - \varphi(x(s))\big)\,\tilde\mu_1(ds \times d\theta).$$

Let $\tau_r$ be the first exit time of the process from the ball $S_r(0)$ of radius $r$ about 0. Then

$$E_x\varphi(x(t \wedge \tau_r)) = E_x \int_0^{t \wedge \tau_r} A\varphi(x(s))\,ds + \varphi(x).$$

Passing to the limit as $r \to \infty$, we get that

$$E_x\varphi(x(t)) = \int_0^t E_x A\varphi(x(s))\,ds + \varphi(x),$$

which implies that

$$\frac{d}{dt} E_x\varphi(x(t)) = E_x A\varphi(x(t)) \le b - c\,E_x\varphi(x(t)).$$

Hence,

$$\frac{d}{dt}\,e^{ct} E_x\varphi(x(t)) \le b\,e^{ct}, \qquad e^{ct} E_x\varphi(x(t)) - \varphi(x) \le \frac{b}{c}(e^{ct} - 1), \qquad E_x\varphi(x(t)) \le \frac{b}{c} + \varphi(x)\,e^{-ct}. \quad □$$
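As a concrete check of Theorem 25, one can take a one-dimensional equation without jumps, $dx = -\theta x\,dt + \sigma\,dw$, and $\varphi(x) = x^2$; then $A\varphi(x) = \sigma^2 - 2\theta x^2$, so (91) holds with $b = \sigma^2$ and $c = 2\theta$. This example is not from the text: the equation, the parameter values, and the function names below are assumptions chosen because the second moment is known in closed form.

```python
import math

def second_moment(x0, t, theta, sigma):
    """Exact E_x[x(t)^2] for dx = -theta * x dt + sigma dw with x(0) = x0.

    It solves m'(t) = sigma^2 - 2 * theta * m(t), m(0) = x0^2.
    """
    decay = math.exp(-2.0 * theta * t)
    return x0 * x0 * decay + sigma * sigma / (2.0 * theta) * (1.0 - decay)

def lyapunov_bound(x0, t, b, c):
    """Right-hand side b / c + phi(x) * exp(-c t) with phi(x) = x^2."""
    return b / c + x0 * x0 * math.exp(-c * t)

theta, sigma = 0.7, 1.3
b, c = sigma ** 2, 2.0 * theta   # A phi = sigma^2 - 2 theta x^2 = b - c phi

checks = [
    second_moment(x0, t, theta, sigma) <= lyapunov_bound(x0, t, b, c) + 1e-12
    for x0 in (-3.0, 0.0, 0.5, 10.0)
    for t in (0.0, 0.1, 1.0, 25.0)
]
```

In this example the bound is off by exactly $(b/c)\,e^{-ct}$, so it becomes sharp as $t \to \infty$, where both sides tend to $b/c = \sigma^2/(2\theta)$.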
REMARK. Inequality (91) holds if $A\varphi \le g(\varphi)$, where $g$ is an upwards convex function such that $g(0) > 0$ and $\lim_{\varphi\to\infty} g(\varphi) < 0$. Indeed, $g(\varphi) \le b - c\varphi$, where $b = \sup_{\varphi > 0} g(\varphi)$, $c = -g'(m)$, and $m$ is such that $g(m) = 0$.

We give conditions for a Markov chain with transition probability

$$Q(x, A) = \lambda \int_0^\infty e^{-\lambda t} P(t, x, A)\,dt$$

to be bounded in probability.

THEOREM 26. Suppose that there exists a continuous function $\varphi \ge 0$ such that $\varphi(x) \to \infty$ as $|x| \to \infty$, and

$$Q\varphi(x) = \int Q(x, dy)\,\varphi(y) \le b + c\,\varphi(x),$$

where $b > 0$ and $c < 1$. In this case if

$$Q_1(x, A) = Q(x, A), \ \ldots, \ Q_n(x, A) = \int Q_{n-1}(x, dy)\,Q(y, A),$$

then for all $x$

$$\sup_n \int Q_n(x, dy)\,\varphi(y) < \infty.$$

PROOF. We use induction to establish the inequality

$$Q_n\varphi(x) = \int Q_n(x, dy)\,\varphi(y) \le b\,\frac{1 - c^n}{1 - c} + c^n\varphi(x). \tag{92}$$

Indeed, this is valid for $n = 1$ by assumption. If it holds for some $n$, then

$$Q_{n+1}\varphi(x) \le b\,\frac{1 - c^n}{1 - c} + c^n(b + c\,\varphi(x)) = b\,\frac{1 - c^{n+1}}{1 - c} + c^{n+1}\varphi(x). \quad □$$

REMARK. The condition of the theorem holds if

$$\limsup_{x\to\infty} \frac{Q\varphi(x)}{\varphi(x)} < 1.$$

We now investigate Harris recurrence for solutions of (90).

THEOREM 27. Suppose that the solution of (90) has a transition probability density and the process is irreducible. Then one of the following statements holds:

a) $E_x \int_0^\infty \varphi(x(t))\,dt < \infty$ for all $x$ and all compactly supported functions $\varphi$, or

b) for all $x$ and all compactly supported nonzero functions $\varphi \ge 0$

$$P_x\Big\{\int_0^\infty \varphi(x(t))\,dt = +\infty\Big\} = 1,$$
and the process is Harris recurrent with respect to some measure majorized by Lebesgue measure.

PROOF. Assume that for some compactly supported function $\varphi \ge 0$ and some $x$

$$P_x\Big\{\int_0^\infty \varphi(x(t))\,dt < \infty\Big\} > 0.$$

Then on a set of positive Lebesgue measure the function

$$g(y) = P_y\Big\{\int_0^\infty \varphi(x(t))\,dt < \infty\Big\}$$

is positive, since

$$g(x) = P_x\Big\{\int_0^\infty \varphi(x(t))\,dt < \infty\Big\} = P_x\Big\{\int_s^\infty \varphi(x(t))\,dt < \infty\Big\} = E_x I_{\{\int_s^\infty \varphi(x(t))\,dt < \infty\}}$$

$$= E_x E\big(I_{\{\int_s^\infty \varphi(x(t))\,dt < \infty\}} \mid x(s)\big) = E_x g(x(s)) = \int g(y) P(s, x, dy).$$

It follows from Lemma 11 that $g(x)$ is continuous. Hence, the set $\{x: g(x) = 0\}$ is closed. It is clearly invariant. By irreducibility of the process, either this set is empty or it coincides with the whole space. Assume that $g(x) > 0$ for all $x$. Denote by $K$ the support of $\varphi$, and by $\tau_K$ the hitting time for $K$. We introduce the function

$$f(x) = E_x \exp\Big\{-\int_0^\infty \varphi(x(t))\,dt\Big\}.$$

If

$$f_h(x) = E_x \exp\Big\{-\int_h^\infty \varphi(x(t))\,dt\Big\},$$

then

$$f_h(x) = E_x E\Big(\exp\Big\{-\int_h^\infty \varphi(x(t))\,dt\Big\} \,\Big|\, x(h)\Big) = \int P(h, x, dy) f(y),$$

and $f_h(x)$ is continuous. Finally,

$$f_h(x) \ge f(x) \ge e^{-h\|\varphi\|} f_h(x),$$

and hence $f(x)$ is continuous, being a uniform limit of continuous functions. Further,

$$f(x) = E_x \exp\Big\{-\int_{\tau_K}^\infty \varphi(x(t))\,dt\Big\} = E_x\big[f(x(\tau_K)) I_{\{\tau_K < \infty\}} + I_{\{\tau_K = \infty\}}\big];$$
therefore,

$$\inf_x f(x) = \inf_{x \in K} f(x) = \rho > 0$$

(since $K$ is compact). For all $x$ and sufficiently large $c > 0$

$$P_x\Big\{\int_0^\infty \varphi(x(t))\,dt > c\Big\} = P_x\Big\{\exp\Big\{-\int_0^\infty \varphi(x(t))\,dt\Big\} < e^{-c}\Big\} = P_x\Big\{1 - \exp\Big\{-\int_0^\infty \varphi(x(t))\,dt\Big\} > 1 - e^{-c}\Big\}$$

$$\le (1 - e^{-c})^{-1} E_x\Big(1 - \exp\Big\{-\int_0^\infty \varphi(x(t))\,dt\Big\}\Big) \le (1 - e^{-c})^{-1}(1 - \rho) = p < 1.$$

Then for $k > 1$, with the notation $\tau = \inf\{s: \int_0^s \varphi(x(t))\,dt \ge (k - 1)c\}$, we get that

$$P_x\Big\{\int_0^\infty \varphi(x(t))\,dt > kc\Big\} = P_x\Big\{\tau < \infty,\ \int_\tau^\infty \varphi(x(t))\,dt > c\Big\} = E_x I_{\{\tau < \infty\}} P_{x(\tau)}\Big\{\int_0^\infty \varphi(x(t))\,dt > c\Big\}$$

$$\le p\,P_x\Big\{\int_0^\infty \varphi(x(t))\,dt > (k - 1)c\Big\}.$$

Hence,

$$P_x\Big\{\int_0^\infty \varphi(x(t))\,dt > kc\Big\} \le p^k,$$

and

$$E_x \int_0^\infty \varphi(x(t))\,dt \le \sum_{k=1}^\infty k c\,p^{k-1} \le \frac{c}{(1 - p)^2}.$$

We show that in this case

$$\sup_x E_x \int_0^\infty \varphi_1(x(t))\,dt < \infty$$

for every compactly supported function $\varphi_1(x)$. Indeed, $R_\lambda\varphi(x)$ is an everywhere positive continuous function. Therefore, it suffices to show that

$$\sup_x E_x \int_0^\infty R_\lambda\varphi(x(t))\,dt < \infty.$$
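However,

$$E_x \int_0^\infty R_\lambda\varphi(x(t))\,dt = E_x \int_0^\infty E_{x(t)} \int_0^\infty e^{-\lambda s}\varphi(x(s))\,ds\,dt = E_x \int_0^\infty E\Big(\int_t^\infty e^{-\lambda(s - t)}\varphi(x(s))\,ds \,\Big|\, x(t)\Big)\,dt$$

$$= E_x \int_0^\infty dt \int_t^\infty e^{-\lambda(s - t)}\varphi(x(s))\,ds = E_x \frac{1}{\lambda} \int_0^\infty (1 - e^{-\lambda s})\varphi(x(s))\,ds \le \frac{1}{\lambda} E_x \int_0^\infty \varphi(x(s))\,ds.$$

Assertion a) is proved. If there do not exist a compactly supported positive function $\varphi$ and a point $x$ such that $P_x\{\int_0^\infty \varphi(x(t))\,dt = +\infty\} < 1$, then assertion b) holds. It is established similarly that one of the following assertions holds for a Markov chain $\{X_k\}$ with transition probability $Q_\lambda(x, A) = \lambda \int_0^\infty e^{-\lambda t} P(t, x, A)\,dt$ (the corresponding probability is denoted by $\tilde P_x$, and the expectation by $\tilde E_x$):

a') $\tilde E_x \sum_{k=1}^\infty \varphi(X_k) < \infty$ for all $x$ and all compactly supported functions $\varphi$, or

b') for all $x$ and every nonzero compactly supported function $\varphi \ge 0$

$$\tilde P_x\Big\{\sum_{k} \varphi(X_k) = +\infty\Big\} = 1.$$

Note that

$$\tilde E_x \sum_{k=1}^\infty \varphi(X_k) = E_x \sum_{k=1}^\infty \varphi(x(\theta_1 + \cdots + \theta_k)),$$

where $\theta_1, \theta_2, \ldots$ are independent identically distributed variables that are independent of $x(t)$ and have probability density $\lambda e^{-\lambda t} I_{\{t > 0\}}$. We have that

$$E_x \int_0^\infty \varphi(x(t))\,dt = E_x \sum_{k=0}^\infty \int I_{\{\sum_{i=1}^k \theta_i \le t < \sum_{i=1}^{k+1} \theta_i\}}\,\varphi(x(t))\,dt,$$

$$E_x \int_0^{\theta_1} \varphi(x(t))\,dt = E_x \int_0^\infty I_{\{t \le \theta_1\}}\varphi(x(t))\,dt = E_x \int_0^\infty e^{-\lambda t}\varphi(x(t))\,dt = R_\lambda\varphi(x).$$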
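Hence,
\[
E\Big(\int I_{\{\sum_{i=1}^{k}\theta_i \le t < \sum_{i=1}^{k+1}\theta_i\}}\,\varphi(x(t))\,dt \,\Big|\, x(s),\ s \le \sum_{i=1}^{k}\theta_i\Big) = R_\lambda\varphi\Big(x\Big(\sum_{i=1}^{k}\theta_i\Big)\Big),
\]
\[
E_x \int_0^\infty \varphi(x(t))\,dt = R_\lambda\varphi(x) + E_x \sum_{k=1}^\infty R_\lambda\varphi\Big(x\Big(\sum_{i=1}^{k}\theta_i\Big)\Big).
\]
On the other hand,
\[
E_x \varphi(x(\theta_1)) = \lambda \int_0^\infty e^{-\lambda t} E_x\varphi(x(t))\,dt = \lambda R_\lambda\varphi(x);
\]
hence
\[
E_x \sum_{k=1}^\infty \varphi(x(\theta_1 + \dots + \theta_k)) = \lambda R_\lambda\varphi(x) + \lambda E_x \sum_{k=1}^\infty R_\lambda\varphi\Big(x\Big(\sum_{i=1}^{k}\theta_i\Big)\Big)
\]
and
\[
E_x \int_0^\infty \varphi(x(t))\,dt = \frac{1}{\lambda} E_x \sum_{k=1}^\infty \varphi(X_k),
\]
\[
P_x\Big\{\int_0^\infty \varphi(x(t))\,dt = +\infty\Big\} = P_x\Big\{\sum_{k=1}^\infty \varphi(X_k) = +\infty\Big\}.
\]
If b) holds, then b') holds; therefore, the sequence $X_k$ hits any open set infinitely many times with $P_x$-probability 1. From this, as in the proof of Theorem 23, we establish Harris recurrence of the Markov chain $\{X_k\}$, and hence of the process $x(t)$. □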
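The identity between the occupation integral of the process and the sum over the resolvent chain can be checked numerically in a toy setting. The following is only a sketch under stated assumptions: the state space is replaced by two points, the process is a killed (hence transient) continuous-time Markov chain with a hypothetical generator `Q`, and the values of `lam` and `phi` are arbitrary. In that setting $\int_0^\infty P_t\varphi\,dt = (-Q)^{-1}\varphi$, $R_\lambda = (\lambda I - Q)^{-1}$, and the chain has kernel $\lambda R_\lambda$.

```python
# Sketch: check  int_0^inf P_t(phi) dt = (1/lam) * sum_{k>=1} (lam * R_lam)^k phi
# on a two-state killed Markov chain. Q, lam and phi are illustrative assumptions.

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_vec(a, v):
    """Product of a 2x2 matrix with a vector of length 2."""
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

Q = [[-2.0, 1.0], [1.0, -3.0]]  # row sums are negative: killing makes the chain transient
lam = 1.5
phi = [1.0, 0.25]

# Occupation integral: int_0^inf exp(tQ) phi dt = (-Q)^{-1} phi.
green = inverse_2x2([[-q for q in row] for row in Q])
occupation = mat_vec(green, phi)

# Resolvent R_lam = (lam*I - Q)^{-1} and transition kernel lam * R_lam of the chain X_k.
R = inverse_2x2([[lam - Q[0][0], -Q[0][1]], [-Q[1][0], lam - Q[1][1]]])
kernel = [[lam * r for r in row] for row in R]

def chain_sum(n_terms):
    """Partial sum (1/lam) * sum_{k=1}^{n_terms} kernel^k phi."""
    total = [0.0, 0.0]
    power = kernel
    for _ in range(n_terms):
        v = mat_vec(power, phi)
        total = [total[i] + v[i] for i in range(2)]
        power = mat_mul(power, kernel)
    return [t / lam for t in total]
```

Calling `chain_sum(200)` and comparing with `occupation` shows the two sides agreeing to rounding error; the series converges geometrically because the killed kernel has spectral radius below 1.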
CHAPTER II

Asymptotic Behavior of Systems of Stochastic Equations Containing a Small Parameter

§1. Equations with a small right-hand side

We investigate equations of the form
\[
dx = \varepsilon A_\varepsilon(x, dt), \tag{1}
\]
where the right-hand side can be either an ordinary or a stochastic differential with random coefficients depending on the unknown function $x(t)$. Since the right-hand side is small, $x(t)$ differs little from $x(0)$ on finite time intervals. We are interested in time intervals on which $x(t)$ differs essentially from its value at zero (for example, intervals of the order $O(\varepsilon^{-1})$ or $O(\varepsilon^{-2})$), and in the asymptotic behavior of a solution on these intervals as $\varepsilon \to 0$.

1.1. A general theorem on convergence to a diffusion process. We use a variant of a limit theorem on convergence of a sequence of processes to a solution of a stochastic differential equation (see Gikhman-Skorokhod [2], Chapter 5, §3, Theorem 9).

Random processes $\xi_n(t)$ on $[0, T]$ with values in $R^d$ will be considered. The sequence $\xi_n(t)$ is said to converge in distribution to a process $\xi(t)$ if all the finite-dimensional distributions of $\xi_n(t)$ converge to the corresponding finite-dimensional distributions of $\xi(t)$. Conditions are given below for convergence in distribution to a process $x(t)$ that is a solution of the stochastic differential equation
\[
dx(t) = a(t, x(t))\,dt + B(t, x(t))\,dw(t), \tag{2}
\]
where $a(t,x)$ is a continuous function from $[0, T] \times R^d$ to $R^d$, and $B(t,x)$ is a continuous function from $[0, T] \times R^d$ to $L(R^d)$. It is assumed that these functions are such that a solution of (2) exists and is weakly unique. The latter condition holds if, for example,
\[
|a(t,x)| + \|B(t,x)\| \le k(1 + |x|)
\]
and $B(t, x)$ is an invertible operator (see Gikhman-Skorokhod [2], Chapter 6, §3, Theorem 4).

THEOREM 1. Suppose that the following conditions hold for the sequence of random processes $\xi_n(t)$:

a) The distributions of $\xi_n(0)$ converge to the distribution of some random variable $\xi$.

b) There exists a set $D$ of twice continuously differentiable compactly supported functions from $R^d$ to $R$ that is dense in the space $C_0$ of functions in $C$ tending to zero at infinity and is such that for all $0 \le t_1 < \dots < t_{m+1} \le t < t + h \le T$ and $\varphi_1, \dots, \varphi_{m+1} \in D$
\[
\varlimsup_{n\to\infty} \big|E\varphi_1(\xi_n(t_1)) \cdots \varphi_m(\xi_n(t_m))\,[\varphi_{m+1}(\xi_n(t + h)) - \varphi_{m+1}(\xi_n(t)) - hL_t\varphi_{m+1}(\xi_n(t))]\big| = o(h) \tag{3}
\]
uniformly with respect to $t \in [t_{m+1}, T - h]$, where
\[
L_t\varphi(x) = (\varphi'(x), a(t,x)) + \tfrac12 \operatorname{tr} B^*(t,x)\varphi''(x)B(t,x). \tag{4}
\]
Then the sequence $\xi_n(t)$ converges in distribution to the solution of (2) with initial condition $x(0)$ whose distribution coincides with that of $\xi$.

PROOF. We use the theorem mentioned above. Its proof shows that the following two assertions hold for the sequence $\xi_n(t)$ under the conditions a) and b):

1) $\displaystyle \lim_{r\to\infty} \varlimsup_{n\to\infty} \sup_{t\in[0,T]} P\{|\xi_n(t)| > r\} = 0.$

2) For every $\varepsilon > 0$
\[
\lim_{h\to 0} \varlimsup_{n\to\infty} \sup_{0\le t_1<t_2\le t_1+h\le T} P\{|\xi_n(t_2) - \xi_n(t_1)| > \varepsilon\} = 0.
\]
This means that the sequence $\xi_n(t)$ is compact in distribution. Let $\bar\xi(t)$ be a process to whose distributions the finite-dimensional distributions of some subsequence $\xi_{n_k}(t)$ converge. Then it follows from (3) that for $t_1 < \dots < t_m < t_{m+1} \le t < t + h \le T$ and $\varphi_1, \dots, \varphi_{m+1} \in D$
\[
E\varphi_1(\bar\xi(t_1)) \cdots \varphi_m(\bar\xi(t_m))\,[\varphi_{m+1}(\bar\xi(t + h)) - \varphi_{m+1}(\bar\xi(t)) - hL_t\varphi_{m+1}(\bar\xi(t))] = o(h) \tag{5}
\]
uniformly with respect to $t \in [t_{m+1}, T]$.

It follows from assertion 2) that $\bar\xi(s)$ is a stochastically continuous process. Therefore, since $L_t\varphi_{m+1}(\bar\xi(t))$ is stochastically continuous and
bounded by a nonrandom constant,
\[
\lim_{h\to 0} h \sum_{k < (u-s)/h} L_{s+kh}\varphi_{m+1}(\bar\xi(s + kh)) = \int_s^u L_t\varphi_{m+1}(\bar\xi(t))\,dt \tag{6}
\]
for $0 \le s < u \le T$ (there is a proof of this fact in, for example, Gikhman-Skorokhod [1], Vol. 1, Chapter V, §4). Therefore, it follows from (5) and (6) that for any $\varphi_1, \dots, \varphi_{m+1} \in D$ and $0 \le t_1 < \dots < t_m < t_{m+1} < t_{m+2}$
\[
E\varphi_1(\bar\xi(t_1)) \cdots \varphi_m(\bar\xi(t_m)) \Big[\varphi_{m+1}(\bar\xi(t_{m+2})) - \varphi_{m+1}(\bar\xi(t_{m+1})) - \int_{t_{m+1}}^{t_{m+2}} L_s\varphi_{m+1}(\bar\xi(s))\,ds\Big] = 0. \tag{7}
\]
Obviously, by passing to the limit the relation (7) can be extended to any twice continuously differentiable compactly supported function $\varphi_{m+1}(x) = h(x)$. If it is rewritten in the form
\[
E\Phi(\bar\xi(t_1), \dots, \bar\xi(t_m)) \Big[h(\bar\xi(t_{m+2})) - h(\bar\xi(t_{m+1})) - \int_{t_{m+1}}^{t_{m+2}} L_s h(\bar\xi(s))\,ds\Big] = 0, \tag{8}
\]
then this rewritten relation holds on the smallest linear space of functions $\Phi(x_1, \dots, x_m)$ from $(R^d)^m$ to $R$ that is closed under bounded pointwise convergence and contains the functions of the form $\Phi(x_1, \dots, x_m) = \varphi_1(x_1) \cdots \varphi_m(x_m)$, where $\varphi_k \in D$. Hence, this linear space contains all the functions of the form $f_1(x_1) \cdots f_m(x_m)$, where $f_k \in C$, and with them all continuous functions. It follows from (8) that
\[
h(\bar\xi(t)) - \int_0^t L_s h(\bar\xi(s))\,ds
\]
is a martingale with respect to the flow of $\sigma$-algebras generated by $\bar\xi(t)$ for any twice continuously differentiable compactly supported function $h$. But then Corollary 2 of Theorem 9 in Chapter 5, §3 of Gikhman-Skorokhod [2] gives us that $\bar\xi(t)$ is a solution of an equation of the form (2) with some Wiener process $w(t)$. By the assumption of weak uniqueness of the solution, the distributions of $\bar\xi(t)$ are uniquely determined. Therefore, the weakly compact family of finite-dimensional distributions of the processes $\xi_n(t)$ has a unique limit point. □

REMARK. Instead of (3), in condition b) it is sometimes more convenient to use the following: there exists a sequence $h_n \to 0$ such that
\[
\varlimsup_{n\to\infty} \sup_{t\in[t_{m+1},T]} \frac{1}{h_n} \big|E\varphi_1(\xi_n(t_1)) \cdots \varphi_m(\xi_n(t_m))\,\big(\varphi_{m+1}(\xi_n(t + h_n)) - \varphi_{m+1}(\xi_n(t)) - h_n L_t\varphi_{m+1}(\xi_n(t))\big)\big| = 0. \tag{9}
\]
Indeed, if this holds, then there is a $K$ such that for $\varphi \in D$
\[
|E\varphi(\xi_n(t + h_n)) - E\varphi(\xi_n(t))| \le Kh_n.
\]
This implies that for every $h$
\[
\varlimsup_{n\to\infty} |E\varphi(\xi_n(t + h)) - E\varphi(\xi_n(t))| \le Kh, \tag{10}
\]
and so assertions 1) and 2) of the proof of the theorem are valid. Moreover, denoting the quantity after the lim on the left-hand side of (9) by $\varepsilon_n$, we have that
\[
\Big|E\varphi_1(\xi_n(t_1)) \cdots \varphi_m(\xi_n(t_m)) \Big(\varphi_{m+1}(\xi_n(t + lh_n)) - \varphi_{m+1}(\xi_n(t)) - \sum_{i=0}^{l-1} h_n L_{t+ih_n}\varphi_{m+1}(\xi_n(t + ih_n))\Big)\Big| \le \varepsilon_n l h_n.
\]
It is now easy to obtain (7) from this.

1.2. Ordinary differential equations with random right-hand side. We consider equations of the form
\[
\frac{dx_\varepsilon}{dt} = \varepsilon a_\varepsilon(t, x_\varepsilon(t)), \tag{10'}
\]
where $a_\varepsilon(t, x)$ is an $R^d$-valued random vector field on $R_+ \times R^d$ for each $\varepsilon > 0$, and the field $a_\varepsilon(t, x)$ remains bounded "on the average" as $\varepsilon \to 0$ (a more precise formulation of what this means is given below), and we investigate the asymptotic behavior of a solution as $\varepsilon \to 0$ for large $t$. We are interested in the case when $a_\varepsilon(t,x)$ is asymptotically ergodic for fixed $x$ as $\varepsilon \to 0$ and $x_\varepsilon(t)$ behaves like a diffusion process for large $t$.

To investigate the nature of the results possible here we consider what is in a certain sense the simplest case, when the variables $t$, $\omega$, and $x$ separate in the random field: $a_\varepsilon(t, x) = a(x)\eta_\varepsilon(t)$, where $a(x)$ is no longer a random function, and $\eta_\varepsilon(t)$ is a stationary (or asymptotically stationary) process. In this case equation (10') takes the form
\[
\frac{dx_\varepsilon}{dt} = \varepsilon a(x_\varepsilon(t))\eta_\varepsilon(t). \tag{11}
\]
We solve this equation for a fixed initial condition $x_\varepsilon(0) = x_0$. The main idea in the investigation of the asymptotic behavior of $x_\varepsilon(t)$ for large $t$ can be described roughly as follows. We introduce a new process $\tilde x_\varepsilon(t) = x_\varepsilon(\lambda_\varepsilon t)$, where $\lambda_\varepsilon \to \infty$ as $\varepsilon \to 0$; $\lambda_\varepsilon$ will be chosen below. Then
\[
\frac{d\tilde x_\varepsilon(t)}{dt} = \varepsilon\lambda_\varepsilon a(\tilde x_\varepsilon(t))\eta_\varepsilon(\lambda_\varepsilon t) = a(\tilde x_\varepsilon(t))\tilde\eta_\varepsilon(t),
\]
where $\tilde\eta_\varepsilon(t) = \varepsilon\lambda_\varepsilon\eta_\varepsilon(\lambda_\varepsilon t)$. Assume that $\tilde\eta_\varepsilon(t) = \gamma_\varepsilon + \zeta_\varepsilon(t)$, where $\gamma_\varepsilon$ is a constant bounded as $\varepsilon \to 0$, and $\zeta_\varepsilon(t)$ is a white noise process.
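Then it is natural to expect that $\tilde x_\varepsilon(t)$ is close in distribution to the solution of some stochastic equation. If it is assumed that $\frac{1}{\rho_\varepsilon}\int_0^t \zeta_\varepsilon(s)\,ds$ converges in distribution to a Wiener process $w(t)$, then we can use the following considerations to write the approximate stochastic equation that must be satisfied by $\tilde x_\varepsilon(t)$. It will be assumed that $a(x)$ is continuously differentiable. Then for a twice continuously differentiable function $\varphi$
\[
\begin{aligned}
\varphi(\tilde x_\varepsilon(t + h)) - \varphi(\tilde x_\varepsilon(t))
&= \int_t^{t+h} (\varphi'(\tilde x_\varepsilon(s)), a(\tilde x_\varepsilon(s)))\,\tilde\eta_\varepsilon(s)\,ds \\
&= (\varphi'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t))) \Big[\gamma_\varepsilon h + \int_t^{t+h} \zeta_\varepsilon(s)\,ds\Big] \\
&\quad + \int_t^{t+h} \big[(\varphi'(\tilde x_\varepsilon(s)), a(\tilde x_\varepsilon(s))) - (\varphi'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t)))\big]\,\tilde\eta_\varepsilon(s)\,ds \\
&= (\varphi'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t))) \Big(\gamma_\varepsilon h + \int_t^{t+h} \zeta_\varepsilon(s)\,ds\Big) \\
&\quad + \int_t^{t+h}\!\!\int_t^s \big([(\varphi'(\tilde x_\varepsilon(u)), a(\tilde x_\varepsilon(u)))]', a(\tilde x_\varepsilon(u))\big)\,\tilde\eta_\varepsilon(u)\tilde\eta_\varepsilon(s)\,du\,ds \\
&= (\varphi'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t))) \Big[\gamma_\varepsilon h + \int_t^{t+h} \zeta_\varepsilon(s)\,ds\Big] \\
&\quad + \big([(\varphi'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t)))]', a(\tilde x_\varepsilon(t))\big) \int_t^{t+h}\!\!\int_t^s \tilde\eta_\varepsilon(u)\tilde\eta_\varepsilon(s)\,du\,ds + \delta_h,
\end{aligned}
\]
where $\delta_h$ is a variable of higher order of smallness. Here $[(\varphi', a)]'$ denotes the derivative with respect to $x$ of the function $(\varphi'(x), a(x))$. Since
\[
\int_t^{t+h}\!\!\int_t^s \tilde\eta_\varepsilon(u)\tilde\eta_\varepsilon(s)\,du\,ds = \frac12 \Big(\int_t^{t+h} \tilde\eta_\varepsilon(u)\,du\Big)^2,
\]
the fact that $\int_t^{t+h} \tilde\eta_\varepsilon(u)\,du$ is asymptotically independent of $\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m)$ for $t_1 < \dots < t_m \le t$ gives us that for every bounded continuous function $\Phi(x_1, \dots, x_m)$
\[
E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m)) \Big\{\varphi(\tilde x_\varepsilon(t + h)) - \varphi(\tilde x_\varepsilon(t)) - h\Big[\gamma_\varepsilon(\varphi'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t))) + \frac{\rho_\varepsilon^2}{2}\big([(\varphi'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t)))]', a(\tilde x_\varepsilon(t))\big)\Big]\Big\} = o(h). \tag{12}
\]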
Using Theorem 1, we get that, as $\gamma_\varepsilon \to \gamma$ and $\rho_\varepsilon \to \rho$, the process $\tilde x_\varepsilon(t)$ converges in distribution to a process $x(t)$ that solves the stochastic equation
\[
dx(t) = a_1(x(t))\,dt + \rho a(x(t))\,d\zeta(t), \tag{13}
\]
where $\zeta(t)$ is a one-dimensional Wiener process, and
\[
a_1(x) = \gamma a(x) + \frac{\rho^2}{2} a'(x)a(x). \tag{13'}
\]
For a rigorous justification of the result we have obtained we must estimate $\delta_h$. Moreover, it is desirable to formulate more precisely the conditions ensuring the possibility of getting a formula of the type (12).

We now consider in detail the case when the process $\eta_\varepsilon(t)$ in (11) does not depend on $\varepsilon$ and is a stationary ergodic process. We need the following definition.

DEFINITION. Let $\eta(t)$, $t \in R$, be a stationary process, and let $\mathcal F_t$ and $\mathcal F^t$ be the $\sigma$-algebras generated by $\eta(s)$ ($s \le t$) and $\eta(s)$ ($s \ge t$), respectively. The process $\eta(t)$ satisfies the mixing condition if, for any $t_n < s_n$ with $s_n - t_n \to \infty$ and any events $A_n \in \mathcal F_{t_n}$ and $B_n \in \mathcal F^{s_n}$,
\[
\lim_{n\to\infty} \big(P(A_n \cap B_n) - P(A_n)P(B_n)\big) = 0. \tag{14}
\]

LEMMA 1. Suppose that the process $\eta(t)$ satisfies the mixing condition, $s_n - t_n \to \infty$, $\xi_n$ is a sequence of (jointly) bounded $\mathcal F_{t_n}$-measurable variables, and $\eta_n$ is a sequence of $\mathcal F^{s_n}$-measurable and uniformly integrable variables. Then
\[
\lim_{n\to\infty} (E\xi_n\eta_n - E\xi_n E\eta_n) = 0. \tag{15}
\]

PROOF. It follows from (14) that (15) is valid if $\xi_n$ and $\eta_n$ are indicator functions. Therefore, (15) holds for linear combinations of indicator functions, as well as for the variables that are uniform limits of such functions. Thus, (15) is valid for jointly bounded variables $\xi_n$ and $\eta_n$. Let $\eta_n^c = \eta_n I_{\{|\eta_n| \le c\}}$. Then
\[
\varlimsup_{n\to\infty} |E\xi_n\eta_n - E\xi_n E\eta_n| \le \varlimsup_{n\to\infty} |E\xi_n\eta_n^c - E\xi_n E\eta_n^c| + \varlimsup_{n\to\infty} |E\xi_n\eta_n - E\xi_n\eta_n^c| + \varlimsup_{n\to\infty} |E\xi_n E\eta_n - E\xi_n E\eta_n^c|.
\]
The first term is equal to zero because $\xi_n$ and $\eta_n^c$ are uniformly bounded, and the other two can be made arbitrarily small by suitably choosing $c$, because the $\eta_n$ are uniformly integrable. □
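The drift correction in (13') can be checked by hand in the linear scalar case $a(x) = x$. There the rescaled equation is solved exactly by $\tilde x_\varepsilon(t) = x_0\exp\{\int_0^t \tilde\eta_\varepsilon(s)\,ds\}$, so if $\int_0^t \tilde\eta_\varepsilon(s)\,ds$ is close to $\gamma t + \rho\zeta(t)$ the limit should be $x_0\exp\{\gamma t + \rho\zeta(t)\}$, and by the Itô formula this process satisfies (13) with $a_1(x) = (\gamma + \rho^2/2)x$. The sketch below (an illustration only; the parameter values, the seed, and the function name are assumptions, and the Euler-Maruyama scheme is a standard discretization not used in the text) integrates (13) with this drift and compares the result with the exponential formula along the same Brownian path.

```python
import math
import random

def euler_maruyama_linear(x0, gamma, rho, t_end, n_steps, seed=0):
    """Euler-Maruyama for dx = (gamma + rho^2/2) x dt + rho x dzeta.

    Returns the terminal value x(t_end) and the terminal value zeta(t_end)
    of the driving Brownian path, so the exact solution can be evaluated
    on the same path.
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    drift = gamma + 0.5 * rho * rho  # a_1(x) / x from (13') with a(x) = x
    x, zeta = x0, 0.0
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        x += drift * x * dt + rho * x * dz
        zeta += dz
    return x, zeta

# Illustrative parameter values (assumptions, not from the text).
x0, gamma, rho, t_end = 1.0, 0.3, 0.5, 1.0
x_num, zeta_T = euler_maruyama_linear(x0, gamma, rho, t_end, n_steps=4000)
x_exact = x0 * math.exp(gamma * t_end + rho * zeta_T)
rel_err = abs(x_num - x_exact) / x_exact
```

Dropping the term `0.5 * rho * rho` from `drift` makes the two values separate by a factor of about $e^{-\rho^2 t/2}$, which is exactly the correction that $\frac{\rho^2}{2}a'(x)a(x)$ supplies.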
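REMARK. If $\xi_n$ and $\eta_n$ are $\mathcal F_{t_n}$- and $\mathcal F^{s_n}$-measurable, $s_n - t_n \to \infty$, and $\sup_n (E\xi_n^2 + E\eta_n^2) < \infty$, then (15) holds. Indeed, with $\xi_n^c = \xi_n I_{\{|\xi_n| \le c\}}$ defined in the same way as $\eta_n^c$,
\[
\varlimsup_{n\to\infty} |E\xi_n\eta_n - E\xi_n E\eta_n| \le \varlimsup_{n\to\infty} |E\xi_n^c\eta_n^c - E\xi_n^c E\eta_n^c| + \varlimsup_{n\to\infty} \big(|E\xi_n(\eta_n - \eta_n^c)| + |E\eta_n^c(\xi_n - \xi_n^c)| + |E\xi_n E\eta_n - E\xi_n^c E\eta_n^c|\big),
\]
and all the terms on the right-hand side tend to zero as $c \to \infty$.

THEOREM 2. Assume the following conditions hold:

1) $a(x)$ has continuous derivatives $a'(x)$ and $a''(x)$, and for some $k$
\[
|a(x)| + |a'(x)a(x)| \le k(1 + |x|).
\]
2) The process $\eta(t)$ is stationary with mean 0 and satisfies the mixing condition and the conditions

a) $E|\eta(t)|^4 < \infty$,

b) the correlation function $r(t)$ of the process is such that $\int |r(t)|\,dt < \infty$,

c) $\displaystyle \varlimsup_{T\to\infty} \frac{1}{T^2} E\Big(\int_0^T \eta(t)\,dt\Big)^4 < \infty$,

d) $\displaystyle \lim_{T\to\infty} E\Big|\int_T^{2T} E(\eta(t) \mid \mathcal F_0)\,dt\Big| = 0$.

3) The equation (13) has a weakly unique solution.

In this case if $x_\varepsilon(t)$ is a solution of (11) with $\eta_\varepsilon(t) = \eta(t)$ and initial condition $x_0$, then the process $\tilde x_\varepsilon(t) = x_\varepsilon(t/\varepsilon^2)$ converges in distribution to the process $x(t)$ that is the solution of (13) with initial condition $x(0) = x_0$, where $a_1(x)$ is defined by (13') with $\gamma = 0$ and $\rho^2 = \int r(t)\,dt$.

PROOF. If $\tilde\eta_\varepsilon(t) = \varepsilon^{-1}\eta(\varepsilon^{-2}t)$, then the equation for $\tilde x_\varepsilon(t)$ will have the form
\[
\frac{d}{dt}\tilde x_\varepsilon(t) = a(\tilde x_\varepsilon(t))\tilde\eta_\varepsilon(t).
\]
Repeating the computations given before the theorem, we see that for a thrice continuously differentiable compactly supported function $\varphi(x)$
\[
\varphi(\tilde x_\varepsilon(t + h)) - \varphi(\tilde x_\varepsilon(t)) = (\varphi'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t))) \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds + \big([(\varphi'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t)))]', a(\tilde x_\varepsilon(t))\big) \int_t^{t+h}\!\!\int_t^s \tilde\eta_\varepsilon(u)\tilde\eta_\varepsilon(s)\,du\,ds + \delta_h,
\]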
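where
\[
\delta_h = \int_t^{t+h}\!\!\int_t^s [\psi(\tilde x_\varepsilon(u)) - \psi(\tilde x_\varepsilon(t))]\,\tilde\eta_\varepsilon(u)\tilde\eta_\varepsilon(s)\,du\,ds, \qquad \psi(x) = \big([(\varphi'(x), a(x))]', a(x)\big).
\]
It will be assumed that $h$ varies with $\varepsilon$ in such a way that $h/\varepsilon^2 \to \infty$ and $h/\varepsilon \to 0$. It follows from the conditions on $a(x)$ and $\varphi(x)$ that $\psi(x)$ is continuously differentiable. Therefore,
\[
\begin{aligned}
\delta_h &= \int_t^{t+h}\!\!\int_t^s\!\!\int_t^u (\psi'(\tilde x_\varepsilon(v)), a(\tilde x_\varepsilon(v)))\,\tilde\eta_\varepsilon(v)\,dv\,\tilde\eta_\varepsilon(u)\tilde\eta_\varepsilon(s)\,du\,ds \\
&= \frac12 \int_t^{t+h} \Big(\int_v^{t+h} \tilde\eta_\varepsilon(s)\,ds\Big)^2 (\psi'(\tilde x_\varepsilon(v)), a(\tilde x_\varepsilon(v)))\,\tilde\eta_\varepsilon(v)\,dv.
\end{aligned}
\]
If $|(\psi'(x), a(x))| \le 2c$, then
\[
|\delta_h| \le c \int_t^{t+h} |\tilde\eta_\varepsilon(v)| \Big(\int_v^{t+h} \tilde\eta_\varepsilon(s)\,ds\Big)^2 dv.
\]
Suppose now that $\Phi(x_1, \dots, x_m)$ is a bounded continuous function, and $0 \le t_1 < \dots < t_m < t_{m+1} \le t < t + h$. Then for some $c_1$ and $c_2$
\[
\begin{aligned}
|E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\delta_h|
&\le c_1 E \int_t^{t+h} |\tilde\eta_\varepsilon(v)| \Big(\int_v^{t+h} \tilde\eta_\varepsilon(s)\,ds\Big)^2 dv \\
&= c_1\varepsilon^3 E \int_0^{h/\varepsilon^2} |\eta(v)| \Big(\int_v^{h/\varepsilon^2} \eta(s)\,ds\Big)^2 dv \\
&\le c_1\varepsilon^3 \int_0^{h/\varepsilon^2} \sqrt{E\eta^2(v)\, E\Big(\int_v^{h/\varepsilon^2} \eta(s)\,ds\Big)^4}\,dv \\
&\le c_2 h \cdot \frac{h}{\varepsilon} = o(h)
\end{aligned}
\]
(we have used condition 2c)). Let $\psi_1(x) = (\varphi'(x), a(x))$. Then
\[
\begin{aligned}
\psi_1(\tilde x_\varepsilon(t)) \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds
&= \psi_1(\tilde x_\varepsilon(t - h)) \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds + \int_{t-h}^t \psi(\tilde x_\varepsilon(u))\tilde\eta_\varepsilon(u)\,du \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds \\
&= \psi_1(\tilde x_\varepsilon(t - h)) \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds + \psi(\tilde x_\varepsilon(t - h)) \int_{t-h}^t \tilde\eta_\varepsilon(u)\,du \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds \\
&\quad + \int_{t-h}^t\!\!\int_{t-h}^u (\psi'(\tilde x_\varepsilon(v)), a(\tilde x_\varepsilon(v)))\,\tilde\eta_\varepsilon(v)\,dv\,\tilde\eta_\varepsilon(u)\,du \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds.
\end{aligned}
\]
For $t - h > t_m$
\[
\begin{aligned}
\frac1h \Big|E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi_1(\tilde x_\varepsilon(t - h)) \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds\Big|
&\le \frac{c_4}{h} E\Big|\int_t^{t+h} E\big(\tilde\eta_\varepsilon(s) \mid \mathcal F_{(t-h)/\varepsilon^2}\big)\,ds\Big| \\
&= c_4 \frac{\varepsilon}{h} E\Big|\int_{t/\varepsilon^2}^{(t+h)/\varepsilon^2} E\big(\eta(s) \mid \mathcal F_{(t-h)/\varepsilon^2}\big)\,ds\Big| \\
&= \frac{\varepsilon}{h} c_4 E\Big|\int_{h/\varepsilon^2}^{2h/\varepsilon^2} E(\eta(s) \mid \mathcal F_0)\,ds\Big|.
\end{aligned}
\]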
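For $T < h/\varepsilon^2$
\[
\begin{aligned}
&\frac1h E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t - h)) \int_{t-h}^t \tilde\eta_\varepsilon(u)\,du \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds \\
&\quad = \frac1h E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t - h)) \int_{t-h}^{t-h+\varepsilon^2T} \tilde\eta_\varepsilon(u)\,du \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds \\
&\qquad + \frac1h E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t - h)) \int_{t-h+\varepsilon^2T}^{t} \tilde\eta_\varepsilon(u)\,du \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds \\
&\quad = E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t - h))\,\frac{\varepsilon^2}{h} \int_0^T \eta\Big(\frac{t-h}{\varepsilon^2} + u\Big)du \int_{h/\varepsilon^2}^{2h/\varepsilon^2} E\Big(\eta\Big(s + \frac{t-h}{\varepsilon^2}\Big) \,\Big|\, \mathcal F_{(t-h)/\varepsilon^2+T}\Big)ds \\
&\qquad + E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t - h))\,\frac{\varepsilon^2}{h} \int_T^{h/\varepsilon^2} \eta\Big(\frac{t-h}{\varepsilon^2} + u\Big)du \int_{h/\varepsilon^2}^{2h/\varepsilon^2} \eta\Big(\frac{t-h}{\varepsilon^2} + s\Big)ds.
\end{aligned}
\]
By the condition of the theorem, the variable
\[
\frac{\varepsilon^2}{h} \int_T^{h/\varepsilon^2} \eta(s)\,ds \int_{h/\varepsilon^2}^{2h/\varepsilon^2} \eta(u)\,du \le \frac{\varepsilon^2}{2h} \Big\{\Big(\int_T^{h/\varepsilon^2} \eta(s)\,ds\Big)^2 + \Big(\int_{h/\varepsilon^2}^{2h/\varepsilon^2} \eta(u)\,du\Big)^2\Big\}
\]
is uniformly integrable. Therefore, if $T \to \infty$ as $\varepsilon \to 0$, then, by Lemma 1,
\[
\begin{aligned}
&\lim_{\varepsilon\to0} E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t - h))\,\frac{\varepsilon^2}{h} \int_T^{h/\varepsilon^2} \eta\Big(\frac{t-h}{\varepsilon^2} + u\Big)du \int_{h/\varepsilon^2}^{2h/\varepsilon^2} \eta\Big(\frac{t-h}{\varepsilon^2} + s\Big)ds \\
&\quad = \lim_{\varepsilon\to0} E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t - h)) \times \frac{\varepsilon^2}{h} E \int_T^{h/\varepsilon^2} \eta\Big(\frac{t-h}{\varepsilon^2} + u\Big)du \int_{h/\varepsilon^2}^{2h/\varepsilon^2} \eta\Big(\frac{t-h}{\varepsilon^2} + s\Big)ds \\
&\quad = \lim_{\varepsilon\to0} E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t - h))\,\frac{\varepsilon^2}{h} \int_T^{h/\varepsilon^2}\!\!\int_{h/\varepsilon^2}^{2h/\varepsilon^2} r(s - u)\,ds\,du = 0,
\end{aligned}
\]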
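because
\[
\lim_{\varepsilon\to0} \frac{\varepsilon^2}{h} \Big|\int_T^{h/\varepsilon^2}\!\!\int_{h/\varepsilon^2}^{2h/\varepsilon^2} r(s - u)\,ds\,du\Big| \le \lim_{\varepsilon\to0} \frac{\varepsilon^2}{h} \int_0^{2h/\varepsilon^2} v|r(v)|\,dv = 0.
\]
Moreover, on the basis of the remark after Lemma 1,
\[
\begin{aligned}
&\lim_{\varepsilon\to0} E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t - h))\,\frac{\varepsilon^2}{h} \int_0^T \eta\Big(\frac{t-h}{\varepsilon^2} + u\Big)du \int_{h/\varepsilon^2}^{2h/\varepsilon^2} \eta\Big(\frac{t-h}{\varepsilon^2} + s\Big)ds \\
&\quad = \lim_{\varepsilon\to0} E\Big[\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t - h))\,\frac{1}{\sqrt T} \int_0^T \eta\Big(\frac{t-h}{\varepsilon^2} + u\Big)du\Big] \cdot E\Big[\frac{\sqrt T\varepsilon^2}{h} \int_{h/\varepsilon^2}^{2h/\varepsilon^2} \eta\Big(\frac{t-h}{\varepsilon^2} + s\Big)ds\Big] = 0,
\end{aligned}
\]
since $\sqrt T\varepsilon^2/h \le \varepsilon/\sqrt h$, and the required conditions are satisfied. Hence,
\[
E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m)) \Big[\varphi(\tilde x_\varepsilon(t + h)) - \varphi(\tilde x_\varepsilon(t)) - \frac12 \psi(\tilde x_\varepsilon(t)) \Big(\int_t^{t+h} \tilde\eta_\varepsilon(u)\,du\Big)^2\Big] = o(h).
\]
We now consider
\[
\begin{aligned}
&E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t)) \Big(\int_t^{t+h} \tilde\eta_\varepsilon(u)\,du\Big)^2 \\
&\quad = E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t))\,\varepsilon^2 \Big(\int_{t/\varepsilon^2}^{t/\varepsilon^2+h/\varepsilon^2} \eta(u)\,du\Big)^2 \\
&\quad = E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t)) \Big[\varepsilon^2 \Big(\int_{t/\varepsilon^2}^{t/\varepsilon^2+T} \eta(u)\,du\Big)^2 \\
&\qquad + 2\varepsilon^2 \int_{t/\varepsilon^2}^{t/\varepsilon^2+T} \eta(u)\,du \int_{t/\varepsilon^2+T}^{t/\varepsilon^2+h/\varepsilon^2} \eta(s)\,ds + \varepsilon^2 \Big(\int_{t/\varepsilon^2+T}^{t/\varepsilon^2+h/\varepsilon^2} \eta(s)\,ds\Big)^2\Big].
\end{aligned}
\]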
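Let $T \to \infty$ and $T\varepsilon^2/h \to 0$. Then
\[
\begin{aligned}
E\frac{\varepsilon^2}{h} \Big(\int_{t/\varepsilon^2}^{t/\varepsilon^2+T} \eta(u)\,du\Big)^2
&= E\frac{\varepsilon^2}{h} \Big(\int_0^T \eta(u)\,du\Big)^2 = \frac{\varepsilon^2}{h} \int_0^T\!\!\int_0^T r(u - s)\,du\,ds \\
&= \frac{\varepsilon^2}{h} \int_{-T}^{T} (T - |u|)\,r(u)\,du \le \frac{\varepsilon^2}{h}\,T \int_{-\infty}^{\infty} |r(t)|\,dt \to 0,
\end{aligned}
\]
\[
\begin{aligned}
E\Big|\frac{\varepsilon^2}{h} \int_{t/\varepsilon^2}^{t/\varepsilon^2+T} \eta(u)\,du \int_{t/\varepsilon^2+T}^{t/\varepsilon^2+h/\varepsilon^2} \eta(s)\,ds\Big|
&\le \sqrt{\frac1T E\Big(\int_0^T \eta(u)\,du\Big)^2 \cdot \frac{T\varepsilon^4}{h^2} E\Big(\int_0^{h/\varepsilon^2-T} \eta(u)\,du\Big)^2} \\
&= \sqrt{\frac{T\varepsilon^2}{h}} \sqrt{\frac1T E\Big(\int_0^T \eta(u)\,du\Big)^2 \cdot \frac{\varepsilon^2}{h} E\Big(\int_0^{h/\varepsilon^2-T} \eta(s)\,ds\Big)^2} \\
&\le \sqrt{\frac{\varepsilon^2T}{h}} \int_{-\infty}^{\infty} |r(t)|\,dt \to 0.
\end{aligned}
\]
Finally, as in the estimation of $\delta_h$, we find that
\[
E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m)) \int_{t-h}^t\!\!\int_{t-h}^u (\psi'(\tilde x_\varepsilon(v)), a(\tilde x_\varepsilon(v)))\,\tilde\eta_\varepsilon(v)\,dv\,\tilde\eta_\varepsilon(u)\,du \int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds = o(h).
\]
Thus, if $h$ and $T$ satisfy the indicated conditions, then
\[
\Big|E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m)) \Big[\varphi(\tilde x_\varepsilon(t + h)) - \varphi(\tilde x_\varepsilon(t)) - \frac12 \psi(\tilde x_\varepsilon(t)) \Big(\int_t^{t+h} \tilde\eta_\varepsilon(u)\,du\Big)^2\Big]\Big| \le c_3\varepsilon E\Big|\int_{h/\varepsilon^2}^{2h/\varepsilon^2} E(\eta(s) \mid \mathcal F_0)\,ds\Big| + o(h).
\]
Let $g(t) = E\big|\int_t^{2t} E(\eta(s) \mid \mathcal F_0)\,ds\big|$. By condition 2d) of the theorem, $g(t) \to 0$ as $t \to \infty$. Let $h/\varepsilon = \theta$. Then
\[
\varepsilon E\Big|\int_{h/\varepsilon^2}^{2h/\varepsilon^2} E(\eta(s) \mid \mathcal F_0)\,ds\Big| = h\,\frac{1}{\theta}\,g\Big(\frac{\theta}{\varepsilon}\Big).
\]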
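We show that it is possible to choose $\theta$ dependent on $\varepsilon$ in such a way that $\theta \to 0$ and $\frac1\theta g(\theta/\varepsilon) \to 0$ as $\varepsilon \to 0$. Let the sequence $t_n$ be such that $n^2 g(t/n) < 1$ for $t \ge t_n$ and $t_n < t_{n+1}$ (such a sequence exists because $g(t) \to 0$). In this case if $\theta = 1/n$ for $1/t_{n+1} < \varepsilon \le 1/t_n$, then $(1/\theta)g(\theta/\varepsilon) < 1/n$ for $1/t_{n+1} < \varepsilon \le 1/t_n$, and
\[
E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m)) \Big[\varphi(\tilde x_\varepsilon(t + h)) - \varphi(\tilde x_\varepsilon(t)) - \frac12 \psi(\tilde x_\varepsilon(t))\,\varepsilon^2 \Big(\int_{t/\varepsilon^2}^{t/\varepsilon^2+h/\varepsilon^2} \eta(u)\,du\Big)^2\Big] = o(h).
\]
Since the variable
\[
\frac{\varepsilon^2}{h} \Big(\int_{t/\varepsilon^2+T}^{t/\varepsilon^2+h/\varepsilon^2} \eta(u)\,du\Big)^2
\]
is uniformly integrable, we get
\[
\lim_{\varepsilon\to0} E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m))\,\psi(\tilde x_\varepsilon(t)) \Big[\frac{\varepsilon^2}{h} \Big(\int_{t/\varepsilon^2+T}^{t/\varepsilon^2+h/\varepsilon^2} \eta(u)\,du\Big)^2 - \frac{\varepsilon^2}{h} E\Big(\int_{t/\varepsilon^2+T}^{t/\varepsilon^2+h/\varepsilon^2} \eta(u)\,du\Big)^2\Big] = 0
\]
on the basis of Lemma 1. The relation
\[
\lim_{\varepsilon\to0} \frac{\varepsilon^2}{h} E\Big(\int_{t/\varepsilon^2+T}^{t/\varepsilon^2+h/\varepsilon^2} \eta(u)\,du\Big)^2 = \int_{-\infty}^{\infty} r(t)\,dt
\]
implies that
\[
\lim_{\varepsilon\to0} \frac1h E\Phi(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m)) \Big[\varphi(\tilde x_\varepsilon(t + h)) - \varphi(\tilde x_\varepsilon(t)) - \frac{\rho^2 h}{2} \psi(\tilde x_\varepsilon(t))\Big] = 0.
\]
It remains to use Theorem 1 and the remark after it. □

REMARK 1. The assertion of the theorem remains true for a solution of (11) if $a(x)$ satisfies condition 1) and the following conditions hold for $\eta_\varepsilon(t)$:

a) $\sup_{\varepsilon,t} E|\eta_\varepsilon(t)|^4 < \infty$;

b) uniformly with respect to $t$,
\[
\lim_{T\to\infty,\ \varepsilon\to0} \frac1T E\Big(\int_t^{t+T} \eta_\varepsilon(s)\,ds\Big)^2 = \rho^2;
\]
c) $\displaystyle \varlimsup_{T\to\infty} \frac{1}{T^2} \sup_{\varepsilon,t} E\Big(\int_t^{t+T} \eta_\varepsilon(s)\,ds\Big)^4 < \infty$;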
d) $\displaystyle \lim_{T\to\infty,\ \varepsilon\to0} \sup_t E\Big|\int_{t+T}^{t+2T} E(\eta_\varepsilon(s) \mid \mathcal F_t(\varepsilon))\,ds\Big| = 0$, where $\mathcal F_t(\varepsilon)$ is the $\sigma$-algebra generated by $\eta_\varepsilon(s)$, $s \le t$; and

e) if $\mathcal F^s(\varepsilon)$ is the $\sigma$-algebra generated by $\eta_\varepsilon(u)$, $u \ge s$, then for $A_t(\varepsilon) \in \mathcal F_t(\varepsilon)$ and $B_s(\varepsilon) \in \mathcal F^s(\varepsilon)$
\[
\lim_{\varepsilon\to0,\ s-t\to\infty} \big(P(A_t(\varepsilon) \cap B_s(\varepsilon)) - P(A_t(\varepsilon))P(B_s(\varepsilon))\big) = 0
\]
(the mixing condition is uniform with respect to $\varepsilon$). The proof is analogous to that of Theorem 2.

REMARK 2. Theorem 2 extends trivially to equations of the form
\[
\frac{dx_\varepsilon(t)}{dt} = \varepsilon \sum_{k=1}^{l} a_k(x_\varepsilon(t))\eta_k(t), \qquad x_\varepsilon(0) = x_0, \tag{16}
\]
where the $a_k(x)$ are functions satisfying condition 1) of the theorem, and, moreover, $|a_i'(x)a_j(x)| \le k(1 + |x|)$, $i, j \le l$, while $(\eta_1(t), \dots, \eta_l(t))$ is an $l$-dimensional stationary process with mean 0 satisfying the mixing condition and with components $\eta_k(t)$ each satisfying condition 2). Let $\rho_{kj} = \int E\eta_k(t)\eta_j(0)\,dt$. Then the process $\tilde x_\varepsilon(t) = x_\varepsilon(t/\varepsilon^2)$ converges in distribution to the solution of the stochastic differential equation
\[
dx(t) = \frac12 \sum_{k,j=1}^{l} a_k'(x(t))a_j(x(t))\rho_{kj}\,dt + \sum_{j=1}^{l} a_j(x(t))\,dw_j(t) \tag{17}
\]
with the initial condition $x(0) = x_0$, where $w_1(t), \dots, w_l(t)$ are one-dimensional Wiener processes with $Ew_i(t)w_j(t) = \rho_{ij}t$.

In Theorem 2 and Remark 2 after it we consider equations of the form (10') in which $\varepsilon$ enters as a factor on the right-hand side (Remark 1 shows that this assumption is not fundamental), and in which the dependence on the stationary process (the random stationary dependence on $t$) is linear. A more general equation of this form is given below (it can be regarded as a generalization of (16) to the case $l = \infty$).

Let $\Phi$ be a linear topological space, $\Phi^*$ the space of linear functionals on $\Phi$, and $\Phi_2^*$ the space of bilinear functionals. We consider a process $\eta(t)$ with values in $\Phi$. Let $X$ be a linear space. Denote by $L(\Phi, X)$ the space of continuous linear mappings from $\Phi$ to $X$, and by $L_2(\Phi, X)$ the space of bilinear mappings from $\Phi^2$ to $X$.
We consider the equation
\[
\frac{dx_\varepsilon(t)}{dt} = \varepsilon a(x_\varepsilon(t))[\eta(t)], \tag{18}
\]
where $a(x)[\cdot]$ is an element of $L(\Phi, R^d)$ for each $x$. The arguments in $\Phi$ of elements in $L(\Phi, R^d)$ and $L_2(\Phi, R^d)$ will be written in square brackets after the symbol for the element. If $a(x)[\cdot]$ is differentiable with respect to $x$, then $a'(x)a(x)[\cdot,\cdot]$ denotes the element in $L_2(\Phi, R^d)$ with
\[
a'(x)a(x)[\varphi_1, \varphi_2] = \lim_{h\to0} \frac1h \big(a(x + ha(x)\varphi_2)[\varphi_1] - a(x)[\varphi_1]\big).
\]
We define the correlation operator of a process $\eta(t)$ with values in $\Phi$:
\[
R_{s,t}(C) = EC[\eta(s), \eta(t)], \qquad C \in \Phi_2^*.
\]
If the process is stationary, then $R_{s,t}(C) = R_{t-s}(C)$.

THEOREM 3. Let $x_\varepsilon(t)$ be the solution of (18) with initial condition $x_\varepsilon(0) = x_0$, and suppose that $a(x)$ and $\eta(t)$ satisfy the following conditions:

1) $a(x)$ is a function from $R^d$ to $L(\Phi, R^d)$ that is continuous and twice continuously differentiable with respect to $x$, and for $z \in R^d$
\[
\Big|\int_0^\infty E(a'(x)a(x)[\eta(0), \eta(s)], z)\,ds\Big| \le k|z|(1 + |x|), \qquad E(a(x)[\eta(0)], z)^2 \le k^2|z|^2(1 + |x|)^2,
\]
where $k$ is a constant.

2) $\eta(t)$ is a stationary process satisfying the mixing condition and the conditions

a) $E|\varphi^*(\eta(t))|^4 < \infty$ and $E\varphi^*(\eta(t)) = 0$ for all $\varphi^* \in \Phi^*$, and for every compact set $F \subset \Phi^*$
\[
E \sup_{\varphi^*\in F} |\varphi^*(\eta(t))|^2 < \infty,
\]
b) $\int |R_t(C)|\,dt < \infty$ for all $C \in \Phi_2^*$,

c) for any compact set $C^* \subset \Phi_2^*$
\[
\varlimsup_{T\to\infty} \frac{1}{T^2} E \sup_{C\in C^*} \Big|\int_0^T\!\!\int_0^s C[\eta(s), \eta(u)]\,du\,ds\Big|^2 < \infty,
\]
d) for every compact set $F \subset \Phi^*$
\[
\lim_{T\to\infty} E \sup_{\varphi^*\in F} \Big|\int_T^{2T} E(\varphi^*(\eta(t)) \mid \mathcal F_0)\,dt\Big| = 0.
\]
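3) The solution of the stochastic differential equation
\[
dx(t) = a(x(t))\,dt + B(x(t))\,dw(t), \qquad x(0) = x_0, \tag{19}
\]
where $w(t)$ is a Wiener process in $R^d$, and where the coefficients $a(x): R^d \to R^d$ and $B(x): R^d \to L(R^d)$ satisfy for $z \in R^d$ the relations
\[
(a(x), z) = \int_0^\infty E(a'(x)a(x)[\eta(0), \eta(s)], z)\,ds, \qquad (B(x)z, z) = \int R_t(C_z(x))\,dt
\]
with $C_z(x)[\varphi_1, \varphi_2] = (a(x)\varphi_1, z)(a(x)\varphi_2, z)$, is weakly unique.

Then the processes $\tilde x_\varepsilon(t) = x_\varepsilon(t/\varepsilon^2)$ converge in distribution to the process $x(t)$ as $\varepsilon \to 0$.

PROOF. The proof is analogous to that of Theorem 2; therefore we sketch only its main points, dwelling in more detail on the places where the particulars of the infinite-dimensional case enter. Setting $\eta(s/\varepsilon^2)/\varepsilon = \tilde\eta_\varepsilon(s)$, we have for a twice continuously differentiable compactly supported function $g(x)$ that
\[
\begin{aligned}
g(\tilde x_\varepsilon(t + h)) - g(\tilde x_\varepsilon(t))
&= \int_t^{t+h} (g'(\tilde x_\varepsilon(s)), a(\tilde x_\varepsilon(s))[\tilde\eta_\varepsilon(s)])\,ds \\
&= \Big(g'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t))\Big[\int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds\Big]\Big) \\
&\quad + \int_t^{t+h}\!\!\int_t^s (g'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t)))'\,a(\tilde x_\varepsilon(t))[\tilde\eta_\varepsilon(s), \tilde\eta_\varepsilon(u)]\,du\,ds + \delta_h^\varepsilon
\end{aligned}
\]
(the prime denotes differentiation with respect to $x$), where the variable $\delta_h^\varepsilon$ is given by
\[
\delta_h^\varepsilon = \int_t^{t+h}\!\!\int_t^s\!\!\int_t^u \Big(\big((g'(\tilde x_\varepsilon(v)), a(\tilde x_\varepsilon(v)))'\,a(\tilde x_\varepsilon(v))\big)'[\tilde\eta_\varepsilon(s), \tilde\eta_\varepsilon(u)],\ a(\tilde x_\varepsilon(v))[\tilde\eta_\varepsilon(v)]\Big)\,dv\,du\,ds.
\]
Let us show that $E|\delta_h^\varepsilon| = o(h)$ (as in Theorem 2, $h/\varepsilon^2 \to \infty$ and $h/\varepsilon \to 0$). We can assume that $t = 0$, and after a change of variables we get
\[
E|\delta_h^\varepsilon| \le \varepsilon^3 E\Big|\int_0^{h/\varepsilon^2} dv \Big(\int_v^{h/\varepsilon^2} ds \int_v^s du\,B(v)[\eta(u), \eta(s)],\ a(x_\varepsilon(v))[\eta(v)]\Big)\Big|,
\]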
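where $B(v)$ is a function with values in $L_2(\Phi, L(R^d))$. Hence,
\[
E|\delta_h^\varepsilon| \le \varepsilon^3 \int_0^{h/\varepsilon^2} dv \Big(E\Big\|\int_v^{h/\varepsilon^2} ds \int_v^s du\,B(v)[\eta(u), \eta(s)]\Big\|^2\Big)^{1/2} \times \big(E|I_{\{g(x_\varepsilon(v))\ne0\}}\,a(x_\varepsilon(v))[\eta(v)]|^2\big)^{1/2}.
\]
It is easy to see from the form of $B(v)$ (recall that $g$ has compact support) that the set of possible values of $((g'(x), a(x))'a(x))'$ is compact in $L_2(\Phi, L(R^d))$ as the continuous image of the support of $g(x)$, and the set $I_{\{g(x)\ne0\}}a(x)$ is compact in $L(\Phi, R^d)$. Therefore, on the basis of 2a) and 2c),
\[
E|I_{\{g(x_\varepsilon(v))\ne0\}}\,a(x_\varepsilon(v))[\eta(v)]|^2 = E\sum_{k=1}^{d} I_{\{g(x_\varepsilon(v))\ne0\}}\,(a(x_\varepsilon(v))[\eta(v)], e_k)^2 \le dE \sup_{\varphi^*\in F^*} (\varphi^*(\eta(v)))^2 \le c_1^2,
\]
\[
\begin{aligned}
E\Big\|\int_v^{h/\varepsilon^2} ds \int_v^s du\,B(v)[\eta(u), \eta(s)]\Big\|^2
&\le \sum_{k,i=1}^{d} E\Big(\int_v^{h/\varepsilon^2} ds \int_v^s du\,(B(v)[\eta(u), \eta(s)]e_k, e_i)\Big)^2 \\
&\le d^2 E \sup_{C\in C^*} \Big(\int_v^{h/\varepsilon^2} ds \int_v^s du\,C[\eta(u), \eta(s)]\Big)^2 \le c_2^2 (h/\varepsilon^2)^2
\end{aligned}
\]
for some $c_1$ and $c_2$. Here $\{e_1, \dots, e_d\}$ is a basis in $R^d$, and $F^*$ and $C^*$ are compact sets in $\Phi^*$ and $\Phi_2^*$. Therefore,
\[
E|\delta_h^\varepsilon| = O(h^2/\varepsilon) = o(h). \tag{20}
\]
To prove that for every function $G(x_1, \dots, x_m)$ and for all $t_1 < t_2 < \dots < t_m \le t$
\[
EG(\tilde x_\varepsilon(t_1), \dots, \tilde x_\varepsilon(t_m)) \Big(g'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t))\Big[\int_t^{t+h} \tilde\eta_\varepsilon(s)\,ds\Big]\Big) = o(h), \tag{21}
\]
it is necessary to use the following variant of Lemma 1: if $\varphi_n^*$ is a sequence of $\mathcal F_{t_n}$-measurable elements in $\Phi^*$ with values in a particular compact set $F^* \subset \Phi^*$, then for $s_n - t_n \to \infty$ and $T_n \to \infty$
\[
\lim_{n\to\infty} E\varphi_n^*\Big(\frac{1}{\sqrt{T_n}} \int_{s_n}^{s_n+T_n} \eta(s)\,ds\Big) = 0. \tag{22}
\]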
A uniform approximation of $\varphi_n^*$ by a finite-valued random variable can be used in the proof of (22). It is possible to get (21) from (22) as in the proof of Theorem 2. Finally, in the proof of the relation
\[
\begin{aligned}
&\int_t^{t+h}\!\!\int_t^s (g'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t)))'\,a(\tilde x_\varepsilon(t))[\tilde\eta_\varepsilon(s), \tilde\eta_\varepsilon(u)]\,du\,ds \\
&\quad = \varepsilon^2 \int_0^{h/\varepsilon^2}\!\!\int_0^s (g'(\tilde x_\varepsilon(t)), a(\tilde x_\varepsilon(t)))'\,a(\tilde x_\varepsilon(t))[\eta(s), \eta(u)]\,du\,ds \\
&\quad = h\,\frac{\varepsilon^2}{h} E \int_0^{h/\varepsilon^2}\!\!\int_0^s (g'(x), a(x))'\,a(x)[\eta(s), \eta(u)]\,du\,ds\,\Big|_{x=\tilde x_\varepsilon(t)} + o(h)
\end{aligned} \tag{23}
\]
we need the following variant of Lemma 1: if $C_n(\omega)[\cdot,\cdot]$ is a sequence of $\Phi_2^*$-valued $\mathcal F_{t_n}$-measurable variables taking values in a compact set $C^* \subset \Phi_2^*$ and if $s_n - t_n \to \infty$ and $T_n \to \infty$, then
\[
\lim_{n\to\infty} \frac{1}{T_n} \Big(E \int_{s_n}^{s_n+T_n}\!\!\int_{s_n}^s C_n(\omega)[\eta(s), \eta(u)]\,du\,ds - E\Big(E \int_{s_n}^{s_n+T_n}\!\!\int_{s_n}^s C[\eta(s), \eta(u)]\,du\,ds\Big)\Big|_{C=C_n(\omega)}\Big) = 0.
\]
The proof of the theorem follows from (20), (22), and (23).

Finally, we consider a generalization of Theorem 2 to equations of the form (10'). It will be more convenient for us to formulate this result for a more special form of equation, namely
\[
\frac{dx_\varepsilon(t)}{dt} = \varepsilon a^0(t, x_\varepsilon(t)) + \varepsilon^2 a_\varepsilon^1(t, x_\varepsilon(t)), \qquad x_\varepsilon(0) = x_0. \tag{24}
\]
The random field $a^0(t, x)$ can be regarded as a $C(X)$-valued function of $t$ ($C(X)$ is the space of continuous functions from $X$ to $X$).

THEOREM 4. Suppose that $x_\varepsilon(t)$ is the solution of (24), whose coefficients satisfy the following conditions:

1) $a^0(t, x)$ is a stationary $C(X)$-valued process (as a function of $t$) satisfying the mixing condition and such that

a) $Ea^0(t,x) = 0$, $E|a^0(t,x)|^4 < \infty$, and
\[
E \sup_{x\in K} |a^0(t, x)|^2 < \infty
\]
for every compact set $K \subset R^d$,

b) for $x_k \in X$ and $y_k \in X$, $k = 1, 2$,
\[
\int_{-\infty}^{\infty} |E(a^0(t, x_1), y_1)(a^0(t + s, x_2), y_2)|\,ds < \infty,
\]
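c) for every compact set $K \subset R^d$ and for $z_1, z_2, z_3, z_4 \in R^d$
\[
\begin{aligned}
\varlimsup_{T\to\infty} \frac{1}{T^2} E \sup_{x\in K} \Big\{
&\Big(\int_0^T\!\!\int_0^s (a^0(s,x), z_1)(a^0(u,x), z_2)\,du\,ds\Big)^2 \\
&+ \Big(\int_0^T\!\!\int_0^s \Big(\frac{\partial}{\partial x}a^0(s,x)z_1, z_2\Big)(a^0(u,x), z_3)\,du\,ds\Big)^2 \\
&+ \Big(\int_0^T\!\!\int_0^s \Big(\frac{\partial}{\partial x}a^0(u,x)z_1, z_2\Big)(a^0(s,x), z_3)\,du\,ds\Big)^2 \\
&+ \Big(\int_0^T\!\!\int_0^s \Big(\frac{\partial}{\partial x}a^0(u,x)z_1, z_2\Big)\Big(\frac{\partial}{\partial x}a^0(s,x)z_3, z_4\Big)\,du\,ds\Big)^2\Big\} < \infty,
\end{aligned}
\]
d) for every compact set $K \subset R^d$
\[
\lim_{T\to\infty} E \sup_{x\in K} \Big|\int_T^{2T} E(a^0(t, x) \mid \mathcal F_0)\,dt\Big| = 0.
\]
2) Uniformly with respect to $t \in R$ and $x$ in any compact subset $K$ of $R^d$,
\[
\lim_{\varepsilon\to0} \frac{1}{T_\varepsilon} \int_t^{t+T_\varepsilon} a_\varepsilon^1(s, x)\,ds = a_1(x),
\]
where $a_1(x)$ is a continuous function, whenever $T_\varepsilon \to \infty$ in such a way that $\varepsilon^2T_\varepsilon \to 0$, and the quantities
\[
\frac{1}{T_\varepsilon} \sup_{x\in K} \Big|\int_t^{t+T_\varepsilon} a_\varepsilon^1(s, x)\,ds\Big|
\]
are uniformly integrable.

3) For some $k$
\[
|a_1(x)| + |a_0(x)| \le k(1 + |x|), \qquad |B(x)| \le k^2(1 + |x|)^2,
\]
where
\[
a_0(x) = \int_{-\infty}^{\infty} E\frac{\partial}{\partial x}a^0(s, x)\,a^0(0, x)\,ds,
\]
and the symmetric nonnegative operator $B(x) \in L(R^d)$ is determined from the equality
\[
(B(x)z, z) = \int_{-\infty}^{\infty} E(a^0(0, x), z)(a^0(t, x), z)\,dt.
\]
4) The solution of the stochastic differential equation
\[
dx(t) = [a_1(x(t)) + a_0(x(t))]\,dt + B^{1/2}(x(t))\,dw(t), \qquad x(0) = x_0, \tag{25}
\]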
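where $w(t)$ is a Wiener process in $R^d$, is weakly unique.

Then the process $\tilde x_\varepsilon(t) = x_\varepsilon(t/\varepsilon^2)$ converges in distribution to the process $x(t)$.

PROOF. Let $\varphi$ be a thrice continuously differentiable compactly supported function. Then
\[
\begin{aligned}
\varphi(\tilde x_\varepsilon(t + h)) - \varphi(\tilde x_\varepsilon(t))
&= \frac1\varepsilon \int_t^{t+h} \Big(\varphi'(\tilde x_\varepsilon(t)), a^0\Big(\frac{s}{\varepsilon^2}, \tilde x_\varepsilon(t)\Big)\Big)\,ds \\
&\quad + \frac{1}{\varepsilon^2} \int_t^{t+h}\!\!\int_t^s \Big(\Big(\varphi'(\tilde x_\varepsilon(t)), a^0\Big(\frac{s}{\varepsilon^2}, \tilde x_\varepsilon(t)\Big)\Big)', a^0\Big(\frac{u}{\varepsilon^2}, \tilde x_\varepsilon(t)\Big)\Big)\,du\,ds \\
&\quad + \varepsilon^2 \int_{t/\varepsilon^2}^{t/\varepsilon^2+h/\varepsilon^2} (\varphi'(\tilde x_\varepsilon(\varepsilon^2s)), a_\varepsilon^1(s, \tilde x_\varepsilon(\varepsilon^2s)))\,ds + \delta_h,
\end{aligned}
\]
where
\[
\begin{aligned}
\delta_h = \varepsilon^3 \int_{t/\varepsilon^2}^{t/\varepsilon^2+h/\varepsilon^2} dv \Big(a^0(v, x_\varepsilon(v)),\ \iint_{v<u<s<t/\varepsilon^2+h/\varepsilon^2} \Big[
& c_1\big(x_\varepsilon(v), a^0(u, x_\varepsilon(v)), a^0(s, x_\varepsilon(v))\big) \\
&+ c_2\Big(x_\varepsilon(v), \frac{\partial}{\partial x}a^0(u, x_\varepsilon(v)), a^0(s, x_\varepsilon(v))\Big) \\
&+ c_3\Big(x_\varepsilon(v), a^0(u, x_\varepsilon(v)), \frac{\partial}{\partial x}a^0(s, x_\varepsilon(v))\Big) \\
&+ c_4\Big(x_\varepsilon(v), \frac{\partial}{\partial x}a^0(u, x_\varepsilon(v)), \frac{\partial}{\partial x}a^0(s, x_\varepsilon(v))\Big)\Big]\,du\,ds\Big),
\end{aligned}
\]
and the $c_k(x, \cdot, \cdot)$ ($k = 1, 2, 3, 4$), which are bilinear functions from $R^d \times R^d$, $L(R^d) \times R^d$, $R^d \times L(R^d)$, and $L(R^d) \times L(R^d)$, respectively, to $R^d$, are continuous and compactly supported with respect to $x$. Using the conditions 1a) and 1b), we see that $E|\delta_h| = o(h)$. In the proof of the equalities
\[
E\frac1\varepsilon \int_t^{t+h} \Big(\varphi'(\tilde x_\varepsilon(t)), a^0\Big(\frac{s}{\varepsilon^2}, \tilde x_\varepsilon(t)\Big)\Big)\,ds = o(h),
\]
\[
E\frac{1}{\varepsilon^2} \int_t^{t+h}\!\!\int_t^s \Big(\Big(\varphi'(\tilde x_\varepsilon(t)), a^0\Big(\frac{s}{\varepsilon^2}, \tilde x_\varepsilon(t)\Big)\Big)', a^0\Big(\frac{u}{\varepsilon^2}, \tilde x_\varepsilon(t)\Big)\Big)\,du\,ds = EL\varphi(\tilde x_\varepsilon(t))\,h + o(h),
\]
where $L\varphi(x) = \frac12 \operatorname{tr} B(x)\varphi''(x) + (\varphi'(x), a_0(x))$, we use the following consequence of Lemma 1: if $T_1, T_2 \to \infty$, $t_\varepsilon$ is arbitrary, and $c_1(x, \cdot)$ and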
$c_2(x, \cdot, \cdot)$ are arbitrary linear or bilinear forms continuous with respect to $x$, then
\[
\lim_{\varepsilon\to0} E\Big[c_1\Big(x_\varepsilon(t_\varepsilon), \frac{1}{\sqrt{T_2}} \int_{t_\varepsilon+T_1}^{t_\varepsilon+T_1+T_2} a^0(s, x_\varepsilon(t_\varepsilon))\,ds\Big)\Big] = 0,
\]
\[
\lim_{\varepsilon\to0} \Big[E\frac{1}{T_2} \int_{t_\varepsilon+T_1}^{t_\varepsilon+T_1+T_2}\!\!\int_{t_\varepsilon+T_1}^{s} c_2\big(x_\varepsilon(t_\varepsilon), a^0(s, x_\varepsilon(t_\varepsilon)), a^0(u, x_\varepsilon(t_\varepsilon))\big)\,du\,ds - E\Big(\frac{1}{T_2} \int_{t_\varepsilon+T_1}^{t_\varepsilon+T_1+T_2}\!\!\int_{t_\varepsilon+T_1}^{s} Ec_2\big(x, a^0(s, x), a^0(u, x)\big)\,du\,ds\Big)\Big|_{x=x_\varepsilon(t_\varepsilon)}\Big] = 0.
\]
Finally, it follows from condition 2) that
\[
\varepsilon^2 \int_{t/\varepsilon^2}^{t/\varepsilon^2+h/\varepsilon^2} (\varphi'(\tilde x_\varepsilon(\varepsilon^2s)), a_\varepsilon^1(s, \tilde x_\varepsilon(\varepsilon^2s)))\,ds = h(\varphi'(\tilde x_\varepsilon(t)), a_1(\tilde x_\varepsilon(t))) + o(h).
\]
The rest of the proof of this theorem repeats that of Theorems 2 and 3.

1.3. A theorem on integral continuity with respect to a parameter for diffusion processes. There is a general theorem on integral continuity with respect to a parameter for solutions of stochastic differential equations in the Gikhman-Skorokhod book [2] (Chapter 5, §4). A variant of this theorem is given here for diffusion stochastic differential equations. Since we do not impose on the coefficients the Lipschitz-type conditions imposed in the theorem cited, the theorem formulated here does not follow from the former theorem. Therefore, it is presented with a proof.

THEOREM 5. Suppose that $\zeta_\varepsilon(t)$ is a solution of the stochastic differential equation in $R^d$
\[
d\zeta_\varepsilon(t) = a_\varepsilon(t, \zeta_\varepsilon(t))\,dt + B_\varepsilon(t, \zeta_\varepsilon(t))\,dw(t), \qquad \zeta_\varepsilon(0) = \zeta_0 \tag{26}
\]
($w(t)$ is a Wiener process in $R^d$), with coefficients satisfying the following conditions:

a) $a_\varepsilon(t, x)$ and $B_\varepsilon(t, x)$ are measurable functions from $R_+ \times R^d$ to $R^d$ and $L(R^d)$, respectively, are continuous in $x$ uniformly with respect to $\varepsilon$ and $t \le c$ for $|x| \le c$, where $c$ is arbitrary, and for some $k$
\[
|a_\varepsilon(t, x)| + \|B_\varepsilon(t, x)\| \le k(1 + |x|). \tag{27}
\]
b) There exist $\bar a(t,x)$, $\bar B(t,x)$, and $h_\varepsilon \to 0$ as $\varepsilon \to 0$ such that, uniformly on each compact set $K \subset R_+ \times R^d$,
\[
\int_t^{t+h_\varepsilon} [a_\varepsilon(s, x) - \bar a(t, x)]\,ds = o(h_\varepsilon), \qquad \int_t^{t+h_\varepsilon} [B_\varepsilon(s,x)B_\varepsilon^*(s,x) - \bar B(t,x)\bar B^*(t,x)]\,ds = o(h_\varepsilon), \qquad (t,x) \in K.
\]
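c) The stochastic differential equation
\[
d\bar\zeta(t) = \bar a(t, \bar\zeta(t))\,dt + \bar B(t, \bar\zeta(t))\,dw(t), \qquad \bar\zeta(0) = \zeta_0, \tag{28}
\]
has a weakly unique solution.

Then $\zeta_\varepsilon(t)$ converges in distribution to $\bar\zeta(t)$ as $\varepsilon \to 0$.

PROOF. We again use Theorem 1 and the remark after it. On the basis of the Itô formula, for every twice continuously differentiable compactly supported function $\varphi(x)$, $t_1 < t_2 < \dots < t_m < t_{m+1} \le t$, and any bounded measurable $\Phi(x_1, \dots, x_m)$ we have
\[
\begin{aligned}
&E\Phi(\zeta_\varepsilon(t_1), \dots, \zeta_\varepsilon(t_m))\,[\varphi(\zeta_\varepsilon(t + h_\varepsilon)) - \varphi(\zeta_\varepsilon(t))] \\
&\quad = E\Phi(\zeta_\varepsilon(t_1), \dots, \zeta_\varepsilon(t_m)) \int_t^{t+h_\varepsilon} \Big[(\varphi'(\zeta_\varepsilon(s)), a_\varepsilon(s, \zeta_\varepsilon(s))) + \frac12 \operatorname{tr} \varphi''(\zeta_\varepsilon(s))B_\varepsilon(s, \zeta_\varepsilon(s))B_\varepsilon^*(s, \zeta_\varepsilon(s))\Big]\,ds \\
&\quad = E\Phi(\zeta_\varepsilon(t_1), \dots, \zeta_\varepsilon(t_m)) \Big[(\varphi'(\zeta_\varepsilon(t)), \bar a(t, \zeta_\varepsilon(t))) + \frac12 \operatorname{tr} \varphi''(\zeta_\varepsilon(t))\bar B(t, \zeta_\varepsilon(t))\bar B^*(t, \zeta_\varepsilon(t))\Big]\,h_\varepsilon + \delta_\varepsilon^1 + \delta_\varepsilon^2,
\end{aligned}
\]
where
\[
\begin{aligned}
\delta_\varepsilon^1 = E\Phi(\zeta_\varepsilon(t_1), \dots, \zeta_\varepsilon(t_m)) \Big\{
&\Big(\varphi'(\zeta_\varepsilon(t)), \int_t^{t+h_\varepsilon} [a_\varepsilon(s, \zeta_\varepsilon(t)) - \bar a(t, \zeta_\varepsilon(t))]\,ds\Big) \\
&+ \frac12 \operatorname{tr} \varphi''(\zeta_\varepsilon(t)) \int_t^{t+h_\varepsilon} [B_\varepsilon(s, \zeta_\varepsilon(t))B_\varepsilon^*(s, \zeta_\varepsilon(t)) - \bar B(t, \zeta_\varepsilon(t))\bar B^*(t, \zeta_\varepsilon(t))]\,ds\Big\},
\end{aligned}
\]
\[
\begin{aligned}
\delta_\varepsilon^2 = E\Phi(\zeta_\varepsilon(t_1), \dots, \zeta_\varepsilon(t_m)) \Big\{
&\int_t^{t+h_\varepsilon} [(\varphi'(\zeta_\varepsilon(s)), a_\varepsilon(s, \zeta_\varepsilon(s))) - (\varphi'(\zeta_\varepsilon(t)), a_\varepsilon(s, \zeta_\varepsilon(t)))]\,ds \\
&+ \frac12 \int_t^{t+h_\varepsilon} [\operatorname{tr} \varphi''(\zeta_\varepsilon(s))B_\varepsilon(s, \zeta_\varepsilon(s))B_\varepsilon^*(s, \zeta_\varepsilon(s)) - \operatorname{tr} \varphi''(\zeta_\varepsilon(t))B_\varepsilon(s, \zeta_\varepsilon(t))B_\varepsilon^*(s, \zeta_\varepsilon(t))]\,ds\Big\}.
\end{aligned}
\]
The fact that $\delta_\varepsilon^1 = o(h_\varepsilon)$ follows from condition b) of the theorem, because $\varphi'(\zeta_\varepsilon(t))$ is nonzero only on some compact set. Since the functions
\[
(\varphi'(x), a_\varepsilon(s, x)) \quad\text{and}\quad \operatorname{tr} \varphi''(x)B_\varepsilon(s, x)B_\varepsilon^*(s, x)
\]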
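are continuous uniformly with respect to $\varepsilon > 0$, $s \le t + h_\varepsilon$, and $x$, to prove the equality $\delta_\varepsilon^{2} = o(h_\varepsilon)$ it suffices to prove that $\xi_\varepsilon(t)$ is stochastically continuous, uniformly with respect to $\varepsilon > 0$. It follows from Remark 1, Chapter 5, §2 of the Gikhman-Skorokhod book [2] that there exists a constant $l_T$, depending only on $k$ and $T$, such that
\[ \mathsf E\Bigl(\sup_{t \le T} |\xi_\varepsilon(t)|^2 \Bigm| \mathfrak F_0^\varepsilon\Bigr) \le l_T (1 + |\xi_0|^2). \]
(Here $\mathfrak F_t^\varepsilon$ is the flow of $\sigma$-algebras generated by $\xi_\varepsilon(t)$.) Then for $t + h \le T$
\[ \mathsf E\bigl(|\xi_\varepsilon(t+h) - \xi_\varepsilon(t)|^2 \bigm| \mathfrak F_0^\varepsilon\bigr) \le \mathsf E\Bigl(\int_t^{t+h} \bigl[2(a_\varepsilon(s,\xi_\varepsilon(s)), \xi_\varepsilon(s) - \xi_\varepsilon(t)) + \operatorname{tr} B_\varepsilon(s,\xi_\varepsilon(s)) B_\varepsilon^*(s,\xi_\varepsilon(s))\bigr]\,ds \Bigm| \mathfrak F_0^\varepsilon\Bigr) \le h\,l_T'(1 + |\xi_0|^2), \]
and $l_T'$ also depends only on $k$ and $T$. The last inequality implies that the stochastic continuity of $\xi_\varepsilon(t)$ is uniform with respect to $\varepsilon > 0$, and hence that
\[ \mathsf E\,\Phi(\xi_\varepsilon(t_1),\dots,\xi_\varepsilon(t_m)) \Bigl[\varphi(\xi_\varepsilon(t+h_\varepsilon)) - \varphi(\xi_\varepsilon(t)) - h_\varepsilon \Bigl\{(\varphi'(\xi_\varepsilon(t)), \bar a(t,\xi_\varepsilon(t))) + \tfrac12 \operatorname{tr}\varphi''(\xi_\varepsilon(t)) \bar B(t,\xi_\varepsilon(t)) \bar B^*(t,\xi_\varepsilon(t))\Bigr\}\Bigr] = o(h_\varepsilon). \]
It remains to use the remark after Theorem 1. $\square$

1.4. Stochastic equations with small diffusion. Let us first consider stochastic equations that are easily transformable by a time change to an equation with finite coefficients. We use the following fact.

LEMMA 2. Let $x(t)$ be the solution of the equation
\[ dx(t) = a(t, x(t))\,dt + B(t, x(t))\,dw(t), \qquad x(0) = x_0. \]
Then $\tilde x(t) = x(\lambda t)$ $(\lambda > 0)$ is a solution of the stochastic equation
\[ d\tilde x(t) = \tilde a(t, \tilde x(t))\,dt + \tilde B(t, \tilde x(t))\,d\tilde w(t), \]
where $\tilde a(t,x) = \lambda a(\lambda t, x)$, $\tilde B(t,x) = \sqrt{\lambda}\, B(\lambda t, x)$, $\tilde w(t) = w(\lambda t)/\sqrt{\lambda}$, and $\tilde w(t)$ is also a Wiener process in $R^d$ (like $w(t)$).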
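The time change in Lemma 2 can be checked at the level of the Euler scheme. The sketch below is only an illustration under stated assumptions: the scalar coefficients `a` and `B`, the step size, and the function name `euler_gap` are hypothetical choices, not taken from the text. It drives the original equation with Gaussian increments of $w$ over steps of length `dt`, drives the rescaled equation with the increments of $w(\lambda t)/\sqrt{\lambda}$ over steps of length `dt / lam`, and returns the largest gap between the two discrete paths. Since the factors $\lambda$ and $\sqrt{\lambda}$ cancel step by step, the gap should be at the level of rounding error.

```python
import math
import random


def a(t, x):
    # Hypothetical drift with linear growth in x.
    return -x + math.sin(t)


def B(t, x):
    # Hypothetical bounded diffusion coefficient (scalar case, d = 1).
    return 0.5 + 0.1 * math.cos(x)


def euler_gap(lam=4.0, n_steps=2000, dt=1e-3, x0=1.0, seed=0):
    """Largest gap between the Euler path of x at times k*dt and the Euler
    path of the rescaled process at times k*dt/lam, driven by the same noise."""
    rng = random.Random(seed)
    x = x0            # original clock
    x_tilde = x0      # rescaled clock
    dt_tilde = dt / lam
    max_gap = 0.0
    for k in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # increment of w over [k*dt, (k+1)*dt]
        dw_tilde = dw / math.sqrt(lam)       # matching increment of w(lam*t)/sqrt(lam)
        t = k * dt
        t_tilde = k * dt_tilde
        x = x + a(t, x) * dt + B(t, x) * dw
        a_tilde = lam * a(lam * t_tilde, x_tilde)
        B_tilde = math.sqrt(lam) * B(lam * t_tilde, x_tilde)
        x_tilde = x_tilde + a_tilde * dt_tilde + B_tilde * dw_tilde
        max_gap = max(max_gap, abs(x - x_tilde))
    return max_gap
```

Calling `euler_gap()` with different values of `lam` and `seed` gives a quick sanity check of the scaling of the coefficients; it does not replace the proof, which works with the stochastic integrals directly.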
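PROOF. For $s < t$
\[
\begin{aligned}
\tilde x(t) - \tilde x(s) = x(\lambda t) - x(\lambda s) &= \int_{\lambda s}^{\lambda t} a(u, x(u))\,du + \int_{\lambda s}^{\lambda t} B(u, x(u))\,dw(u) \\
&= \int_s^t \lambda a(\lambda u, x(\lambda u))\,du + \int_s^t B(\lambda u, x(\lambda u))\,dw(\lambda u) \\
&= \int_s^t \tilde a(u, \tilde x(u))\,du + \int_s^t \tilde B(u, \tilde x(u))\,d\tilde w(u). \qquad \square
\end{aligned}
\]

THEOREM 6. Let $x_\varepsilon(t)$ be the solution of the stochastic equation
\[ dx_\varepsilon(t) = \varepsilon^2 a_\varepsilon(t, x_\varepsilon(t))\,dt + \varepsilon B_\varepsilon(t, x_\varepsilon(t))\,dw(t), \qquad x_\varepsilon(0) = x_0, \]
where the coefficients satisfy the following conditions:

a) $a_\varepsilon(t,x)$ and $B_\varepsilon(t,x)$ are jointly continuous in the variables $t$ and $x$ and are continuous in $x$ uniformly with respect to $\varepsilon > 0$, $t > 0$, and $|x| \le c$, where $c > 0$ is arbitrary, and (27) holds for some $k > 0$.

b) There exist $h_\varepsilon \to 0$ as $\varepsilon \to 0$ such that the limits
\[ \lim_{\varepsilon\to 0} \frac{\varepsilon^2}{h_\varepsilon} \int_{t/\varepsilon^2}^{t/\varepsilon^2 + h_\varepsilon/\varepsilon^2} a_\varepsilon(s,x)\,ds = \bar a(t,x), \qquad \lim_{\varepsilon\to 0} \frac{\varepsilon^2}{h_\varepsilon} \int_{t/\varepsilon^2}^{t/\varepsilon^2 + h_\varepsilon/\varepsilon^2} B_\varepsilon(s,x) B_\varepsilon^*(s,x)\,ds = \bar B^2(t,x) \tag{29} \]
exist uniformly with respect to $|x| \le c$ and $t \le c$, for any $c > 0$.

c) Equation (28) has a weakly unique solution.

Then the processes $\tilde x_\varepsilon(t) = x_\varepsilon(t/\varepsilon^2)$ converge in distribution to the solution of (28) with initial condition $\bar\xi(0) = x_0$.

PROOF. Making the substitution $\tilde x_\varepsilon(t) = x_\varepsilon(t/\varepsilon^2)$ in the equation for $x_\varepsilon(t)$ and using Lemma 2 with $\lambda = \varepsilon^{-2}$, we get that
\[ d\tilde x_\varepsilon(t) = a_\varepsilon(t/\varepsilon^2, \tilde x_\varepsilon(t))\,dt + B_\varepsilon(t/\varepsilon^2, \tilde x_\varepsilon(t))\,dw_\varepsilon(t), \tag{30} \]
where $w_\varepsilon(t) = \varepsilon\, w(t/\varepsilon^2)$ is a Wiener process in $R^d$. It is easy to verify that the conditions of Theorem 5 are satisfied for equation (30), and the proof follows from that theorem. $\square$

COROLLARY. Suppose that $a_\varepsilon(t,x) = a(t,x)$, $B_\varepsilon(t,x) = B(t,x)$, the limits
\[ \lim_{T\to\infty} \frac1T \int_t^{t+T} a(s,x)\,ds = \bar a(x), \qquad \lim_{T\to\infty} \frac1T \int_t^{t+T} B(s,x) B^*(s,x)\,ds = \bar B^2(x) \]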
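exist uniformly with respect to $|x| \le c$, and the stochastic equation
\[ d\bar x(t) = \bar a(\bar x(t))\,dt + \bar B(\bar x(t))\,dw(t), \qquad \bar x(0) = x_0, \tag{31} \]
has a weakly unique solution. Then the processes $\tilde x_\varepsilon(t)$ converge in distribution to $\bar x(t)$.

We now consider equations of the form
\[ dx_\varepsilon(t) = a(x_\varepsilon(t))\,dt + \varepsilon B(x_\varepsilon(t))\,dw(t), \qquad x_\varepsilon(0) = x_0, \tag{32} \]
where $a(x)$ is a sufficiently smooth function. For small $\varepsilon$ the process $x_\varepsilon(t)$ differs little from the trajectory $u(t, x_0)$, where $u(t,x)$ is the solution of the ordinary differential equation
\[ \frac{d}{dt} u(t,x) = a(u(t,x)), \qquad u(0,x) = x. \tag{33} \]
Therefore, $u(-t, x_\varepsilon(t))$ differs little from $x_0$. Under the assumption that $B(x_\varepsilon(t))$ is a bounded variable, the diffusion component in (32) begins to have an influence on the solution of the equation only when an amount of time of order $\varepsilon^{-2}$ has passed. This implies that an equation with "finite" (not tending to zero as $\varepsilon \to 0$) coefficients will be obtained for the process $\tilde y_\varepsilon(t) = u(-t/\varepsilon^2, x_\varepsilon(t/\varepsilon^2))$. The process $x_\varepsilon(t)$ can be expressed in terms of $\tilde y_\varepsilon(t)$ by the formula
\[ x_\varepsilon(t) = u(t, \tilde y_\varepsilon(\varepsilon^2 t)). \tag{34} \]

We find the equation for $\tilde y_\varepsilon(t)$. Let $y_\varepsilon(t) = u(-t, x_\varepsilon(t))$. Using the Itô formula and the equality
\[ \frac{\partial}{\partial t} u(t,x) = \frac{\partial}{\partial x} u(t,x)\, a(x), \]
which follows from the flow property $u(t, u(s,x)) = u(t+s, x)$ by differentiating in $s$ at $s = 0$, we find that
\[
\begin{aligned}
dy_\varepsilon(t) = {}& \Bigl[-\frac{\partial}{\partial t} u(-t, x_\varepsilon(t)) + \frac{\partial}{\partial x} u(-t, x_\varepsilon(t))\, a(x_\varepsilon(t))\Bigr]\,dt \\
&+ \frac{\varepsilon^2}{2} \operatorname{tr}\Bigl[\frac{\partial^2}{\partial x^2} u(-t, x_\varepsilon(t))\, B(x_\varepsilon(t)) B^*(x_\varepsilon(t))\Bigr]\,dt + \varepsilon \frac{\partial}{\partial x} u(-t, x_\varepsilon(t))\, B(x_\varepsilon(t))\,dw(t),
\end{aligned}
\]
where the expression in the first square brackets vanishes.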
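Hence,
\[
\begin{aligned}
dy_\varepsilon(t) &= \frac{\varepsilon^2}{2}\, a^1(t, y_\varepsilon(t))\,dt + \varepsilon B^1(t, y_\varepsilon(t))\,dw(t), \qquad y_\varepsilon(0) = x_0, \\
a^1(t,y) &= \operatorname{tr}\Bigl[\frac{\partial^2}{\partial x^2} u(-t, u(t,y))\, B(u(t,y)) B^*(u(t,y))\Bigr], \\
B^1(t,y) &= \frac{\partial}{\partial x} u(-t, u(t,y))\, B(u(t,y)),
\end{aligned} \tag{35}
\]
and, by Lemma 2 with $\lambda = \varepsilon^{-2}$,
\[ d\tilde y_\varepsilon(t) = \tfrac12\, a^1(t/\varepsilon^2, \tilde y_\varepsilon(t))\,dt + B^1(t/\varepsilon^2, \tilde y_\varepsilon(t))\,dw_\varepsilon(t), \qquad \tilde y_\varepsilon(0) = x_0. \tag{36} \]

We transform the expressions for $a^1(t,y)$ and $B^1(t,y)$. It follows from the equality $y = u(-t, u(t,y))$ that
\[ I = \frac{\partial}{\partial x} u(-t, u(t,y))\, \frac{\partial}{\partial y} u(t,y), \qquad \frac{\partial}{\partial x} u(-t, u(t,y)) = \Bigl(\frac{\partial}{\partial y} u(t,y)\Bigr)^{-1} \tag{37} \]
(the right-hand side is an invertible operator). Thus,
\[ B^1(t,y) = \Bigl(\frac{\partial}{\partial y} u(t,y)\Bigr)^{-1} B(u(t,y)). \tag{38} \]
Differentiating (37) once more, we get
\[ \frac{\partial^2}{\partial x^2} u(-t, u(t,y))[z_1, z_2] = -\Bigl(\frac{\partial}{\partial y} u(t,y)\Bigr)^{-1} \frac{\partial^2}{\partial y^2} u(t,y) \Bigl[\Bigl(\frac{\partial}{\partial y} u(t,y)\Bigr)^{-1} z_1,\ \Bigl(\frac{\partial}{\partial y} u(t,y)\Bigr)^{-1} z_2\Bigr]. \]
Therefore,
\[ a^1(t,y) = -\sum_{k=1}^{d} \Bigl(\frac{\partial}{\partial y} u(t,y)\Bigr)^{-1} \frac{\partial^2}{\partial y^2} u(t,y)\,[B^1(t,y) e_k,\ B^1(t,y) e_k], \tag{39} \]
where $\{e_k,\ k = 1,\dots,d\}$ is an orthonormal basis in $R^d$. The equations for $\partial u(t,x)/\partial x$ and $\partial^2 u(t,x)/\partial x^2$ can be obtained by differentiating (33). Theorem 6 can be applied to (36). The possibility of doing this is connected with the properties of $u(t,x)$.

2. Processes with rapid switching

In this section we consider two-component Markov processes $(x(t); y(t))$ in the phase space $X \times Y$, where $X$ is a finite-dimensional Euclidean space, $Y$ is a space with the discrete topology, $x(t) \in X$, and $y(t) \in Y$. It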
is assumed that $y(t)$ is a step process, i.e., it is piecewise constant, and finitely many jumps (changes of state) take place in any finite amount of time. Such processes are called processes with a discrete component. See Gikhman-Skorokhod [1] (Chapter 5, §2, Theorem 2) for the general definition of such processes and their main properties. We consider the case when the process $x(t)$ satisfies a diffusion stochastic differential equation with coefficients depending on $y(t)$, the intensity of the jumps of the process $y(t)$ grows in inverse proportion to $\varepsilon$ as $\varepsilon \to 0$, and $y(t)$ is an exponentially ergodic process for fixed $\varepsilon$. Then the limit process $x(t)$ turns out simply to be a diffusion process with coefficients obtained from those of the pre-limit process by a certain averaging with respect to an ergodic distribution. More precise formulations will be given below.

2.1. Processes with a discrete component. Let $X$ be a topological space, and $Y$ a space with the discrete topology. We consider a homogeneous right-continuous strongly Markov process $(x(t); y(t))$ such that $y(t)$ is a step function. The process is called a Markov process with discrete component; $y(t)$ is the discrete component, and $x(t)$ is the phase component. The transition probability for such a process is determined by a collection of operators $A_y$ (here $A_y$ is the generating operator for the process $x(t)$ on the interval $[0, \tau]$, where $\tau$ is the first exit time of the component $y(t)$ from the initial state) and by the probability $Q(\bar x, \bar y, dx \times dy)$ of transition from the point $x(\tau-) = \bar x$, $y(\tau-) = \bar y$ to the set $dx \times dy$ at the jump time $\tau$. We are interested in the more concrete class of processes such that $x(t)$ is a diffusion process in $R^d$ on the interval $[0, \tau[$, and $x(\tau-) = x(\tau)$.
Such processes are more conveniently specified by $Q$ in the form
\[ Q(\bar x, \bar y, dy) = \mathsf P\{y(\tau) \in dy \mid x(\tau) = x(\tau-) = \bar x,\ y(\tau-) = \bar y\} \]
and by coefficients $a(x,y)$, $B(x,y)$, and $c(x,y)$ defined and measurable on $X \times Y$ with values in $R^d$, $L(R^d)$, and $R_+$, respectively. Further, on $[0, \tau[$
\[ dx(t) = a(x(t), y(t))\,dt + B(x(t), y(t))\,dw(t), \tag{40} \]
and
\[ \mathsf P\{\tau > t \mid \mathfrak F_t\} = \exp\Bigl\{-\int_0^t c(x(s), y(s))\,ds\Bigr\}. \]
Here $\mathfrak F_t$ is the $\sigma$-algebra generated by $(x(s); y(s))$ for $s \le t$. It will be assumed that $Y$ is an additive group. Let $\Theta$ be a measurable space with a $\sigma$-finite measure $m(d\theta)$, and let $\nu(d\theta \times dt)$ be a Poisson measure on $\Theta \times R_+$ such that $\mathsf E\,\nu(d\theta \times dt) = m(d\theta)\,dt$. It is possible to construct a
jointly measurable function $f(x,y,\theta)$ from $R^d \times Y \times \Theta$ to $Y$ such that
\[ m(\{\theta : f(x,y,\theta) \ne 0\}) = c(x,y), \qquad m(\{\theta : f(x,y,\theta) \in B\}) = Q(x,y,B)\,c(x,y), \quad B \in \mathfrak B_Y,\ 0 \notin B \]
(see Gikhman-Skorokhod [2], pp. 226-227). Then, adding to (40) the equation
\[ dy(t) = \int_\Theta f(x(t), y(t), \theta)\,\nu(d\theta \times dt), \tag{41} \]
we get a system of equations for the process $(x(t); y(t))$.

We define a family of operators $\Pi_x$ acting on the space $B(Y)$ of all $\mathfrak B_Y$-measurable bounded real-valued functions $g(y)$ with norm $\|g\| = \sup_y |g(y)|$ according to the formula
\[ \Pi_x g(y) = -c(x,y)\,g(y) + c(x,y) \int g(z)\,Q(x,y,dz) \tag{42} \]
($\Pi_x$ depends on $x \in R^d$ as a parameter). In what follows the following conditions are assumed:

1) $c(x,y) > 0$.

2) $\Pi_x g$ is continuous in $x$ in the $B(Y)$-norm for all $g \in B(Y)$.

3) The functions $a(x,y)$ and $B(x,y)$ are jointly measurable functions of their variables, they are continuous in $x$ uniformly with respect to $y$, and
\[ \sup_{x,y} \frac{|a(x,y)| + \|B(x,y)\|}{1 + |x|} < \infty. \]

These conditions ensure the existence of a solution of the system (40), (41). It will be assumed further that the following condition holds:

4) The solution of the system (40), (41) is weakly unique, and hence is a homogeneous Markov process.

Denote by $\mathsf P_{s;(x;y)}$ and $\mathsf E_{s;(x;y)}$ the probability and expectation for the solution of the system (40), (41) on $[s, \infty[$ with initial condition $(x(s); y(s)) = (x; y)$. As usual, the solutions are assumed to be right-continuous.

LEMMA 3. Let $\varphi(x,y)$ be a bounded function from $R^d \times Y$ to $R$ that is $\mathfrak B_{R^d} \otimes \mathfrak B_Y$-measurable and satisfies the condition that $\varphi(x,y)$ is a compactly supported twice continuously differentiable function of $x$ for all $y \in Y$. For a twice continuously differentiable function $\varphi$ of $x$ let
\[ L_y \varphi(x) = (a(x,y), \varphi'(x)) + \tfrac12 \operatorname{tr} \varphi''(x) B(x,y) B^*(x,y). \tag{43} \]
2. PROCESSES WITH RAPID SWITCHING 105 Then for t > s Es;(x;y)tp(x(t),y(t)) = Es;(x;y) it [Ly(u) tp(X(U), y(u)) + IIx(u)tp(X(U),y(u))] du . (44) (the operator Ly is applied to (x,y) as afunction of x, while n x is applied to it as a function of y). PROOF. Let s < 'l'1 < ... < 'l'v < t be all the times when y(u) has a jump. Applying the Ito formula to (x(u), z) on the intervals [s, 'l'1[, ]'l'I, 'l'2[, . . . , ]'l'v, t], we have that (LI tp(x('rd, z) - tp(x(s), z) = is Ly(u)tp(x(u), z) du + iT (tp' (x(u), z), B(x(u),y(u)) dw(u)), l Lk+1  (x ( 'l' k + 1 ), z) -  (x ( 'l' k ), z) = Ly (u)  (x ( u ), z) d U Lk l Lk+1 + ('(x(u), z),B(x(u),y(u)) dw(u)), Lk tp(x(t), z) - tp(x( tv), z) = t Ly(u)tp(X(U), z) du J LV + 1 (tp'(x(u), z),B(x(u), z) dw(u)). Substituting z = y in the first equation, z = y( 'l'k) in the second, and z = Y('l'v) in the third and adding them over k from 1 to v-I, we get (x(t),y(t)) - (x,y) = it Ly(u)tp(x(u),y(u)) du + it (tp'(x(u),y(u)), B(x(u),y(u)) dw(u)) v + L[(X('l'k),Y('l'k)) - (X('l'k),Y('l'k-))] 1 = it Ly(u)tp(x(u), y(u)) du + it (tp' (x( u),y(u)), B(x(u),y(u)) dw(u)) + it [tp(X(U)'Y(U) + j(x(u),y(u), 0)) - tp(x(u),y(u))]m(dO) du + it [tp(X(U)'y + j(x(u),y(u), 0)) - tp(x(u),y(u))].u(dO x du), 
where $\mu(d\theta \times dt) = \nu(d\theta \times dt) - m(d\theta)\,dt$ is a martingale measure. Taking the expectation and considering that
\[ \int_\Theta [\varphi(x, y + f(x,y,\theta)) - \varphi(x,y)]\,m(d\theta) = \Pi_x \varphi(x,y), \]
we get (44).

2.2. An ergodic theorem for jump processes. We consider homogeneous Markov jump processes in the space $(Y, \mathfrak B_Y)$, i.e., processes $y(t)$ with transition probability $P(t,y,B)$ satisfying the following condition: the limits
\[ \lim_{t \downarrow 0} \frac1t P(t,y,B) = Q(y,B) \]
exist for all $B \in \mathfrak B_Y$ such that $y \notin B$, and $Q(y, Y \setminus \{y\}) = \lambda(y)$ is a bounded function. If we extend the definition of $Q$ by the equality $Q(y, \{y\}) = 0$, then $Q(y,B)$ is a finite measure on $\mathfrak B_Y$ that is measurable with respect to $y$. The generating operator of the semigroup of operators $P_t$ corresponding to the process in the space $B(Y)$ of all bounded $\mathfrak B_Y$-measurable functions has the form
\[ \Pi\varphi(y) = -\lambda(y)\varphi(y) + \int \varphi(z)\,Q(y,dz). \tag{45} \]
Denote by $M(Y)$ the space of all countably additive functions $\rho(dy)$ on $\mathfrak B_Y$ of bounded variation. We consider the semigroup on measures
\[ \rho P_t(B) = \int \rho(dy)\,P(t,y,B). \]
If
\[ \rho\Pi(B) = -\int_B \lambda(y)\,\rho(dy) + \int \rho(dy)\,Q(y,B), \tag{46} \]
then $\Pi$ is the generating operator of $P_t$ on $M(Y)$. (We denote semigroups and generating operators by a single letter, but operators are applied to measures from the right, while they are applied to functions from the left; this is analogous to the action of matrices on rows and columns.)

Assume that there exists a stationary distribution for the process $y(t)$, i.e., a probability measure $\rho$ such that
\[ \rho(B) = \int \rho(dy)\,P(t,y,B) = \rho P_t(B). \]
Then it is obvious that $\rho\Pi = 0$. We introduce an operator $R$ acting in the spaces $B(Y)$ and $M(Y)$ by the formulas
\[ R\varphi(x) = \int \varphi(y)\,\rho(dy), \qquad \nu R(B) = \nu(Y)\,\rho(B). \]
The operator $R$ carries all functions into constants and all measures into measures proportional to $\rho$. It is clearly a projection operator: $R^2 = R$, and $P_t R = R P_t = R$, $\Pi R = R\Pi = 0$.
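For a finite state space the operators above are matrices, and the identities for $R$ can be checked directly. The following sketch assumes a hypothetical three-state generator `PI` (rows sum to zero, off-diagonal entries nonnegative); the helper names are illustrative. The stationary distribution is found by iterating the uniformised one-step kernel $I + \Pi/\Lambda$, which has the same stationary distribution as the jump process, and $R$ is the matrix whose rows all equal $\rho$, so that it acts on functions (columns) from the left and on measures (rows) from the right.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def max_abs(A):
    """Largest absolute value of an entry of A."""
    return max(abs(v) for row in A for v in row)


def stationary(Pi, iters=2000):
    """Stationary row vector rho with rho * Pi = 0, by power iteration."""
    n = len(Pi)
    # Twice the largest jump rate keeps the diagonal of P at least 1/2,
    # so the iteration cannot oscillate.
    big = 2.0 * max(-Pi[i][i] for i in range(n))
    P = [[(1.0 if i == j else 0.0) + Pi[i][j] / big for j in range(n)]
         for i in range(n)]
    rho = [1.0 / n] * n
    for _ in range(iters):
        rho = [sum(rho[i] * P[i][j] for i in range(n)) for j in range(n)]
    return rho


# Hypothetical generator of a three-state jump process.
PI = [[-2.0, 1.0, 1.0],
      [3.0, -4.0, 1.0],
      [1.0, 2.0, -3.0]]

rho = stationary(PI)
R = [rho[:] for _ in range(len(PI))]   # every row of R is rho

# Residuals of the three identities R^2 = R, Pi R = 0, R Pi = 0.
proj_gap = max_abs([[x - y for x, y in zip(r1, r2)]
                    for r1, r2 in zip(matmul(R, R), R)])
left_gap = max_abs(matmul(PI, R))
right_gap = max_abs(matmul(R, PI))
```

Here `matmul(PI, R)` vanishes because the rows of a generator sum to zero, and `matmul(R, PI)` vanishes because $\rho\Pi = 0$; these are the matrix forms of $\Pi R = R\Pi = 0$.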
LEMMA 4. Suppose that for some $c > 0$ the operator $A_c = \Pi + c(I - R)$ is the generating operator for some contraction semigroup. Then
\[ \|P_t - R\| \le 2e^{-ct}, \qquad \|P_t - (1 - e^{-ct})R\| \le e^{-ct}. \]

PROOF. Let $S_t = e^{tA_c}$. Then $\|S_t\| \le 1$, and since the operators $\Pi$ and $(I - R)$ commute, it follows that
\[ e^{tA_c} = e^{t\Pi} e^{tc(I-R)} = P_t\, e^{ct} e^{-ctR}. \]
Since $R$ is a projection operator, we can write
\[ e^{-tcR} = I + \sum_{k=1}^\infty \frac{(-tc)^k}{k!} R^k = I + \sum_{k=1}^\infty \frac{(-tc)^k}{k!} R = I - R + \sum_{k=0}^\infty \frac{(-tc)^k}{k!} R = I - R(1 - e^{-ct}). \]
Hence,
\[ S_t = e^{ct} P_t (I - R) + R = e^{ct}(P_t - R) + R, \qquad P_t - R = e^{-ct}(S_t - R). \]
The lemma follows from this relation. $\square$

REMARK 1. The operator $A_c$ in Lemma 4 has the form
\[ A_c f(y) = -(\lambda(y) - c) f(y) + \int f(z)\,[Q(y,dz) - c\rho(dz)]. \]
The conditions of the lemma will be satisfied if $\lambda(y) > c$ and $Q(y,dz) - c\rho(dz)$ is a nonnegative measure. Denote by $q(y,z)$ the density of $Q(y,dz)$ with respect to $\rho(dz)$ (we have in mind the density of the absolutely continuous component). Then under this condition $q(y,z) \ge c$ for all $y \in Y$ and almost all $z$ with respect to the measure $\rho(dz)$.

REMARK 2. Suppose that for some $\tau > 0$
\[ \|P_\tau - R\| \le r < 1. \tag{47} \]
Then there exist $c_1, c_2 > 0$ such that
\[ \|P_t - R\| \le c_1 e^{-c_2 t}. \tag{48} \]
Indeed,
\[ (P_t - R)(P_s - R) = P_{t+s} - R P_s - P_t R + R^2 = P_{t+s} - R. \]
Hence
\[ P_{n\tau} - R = (P_\tau - R)^n, \qquad P_{n\tau+s} - R = (P_\tau - R)^n (P_s - R). \]
108 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER If t = nT + s, where 0 < S < T, then II p/-RII < ,nil Ps -RII < 2,n = 2exp {-nln  } < 2 exp { - t  S In  } , and (47) holds with Cl = 2/r and C2 = In(l/r). REMARK 3. Suppose that for some T and c E (0, 1) P, -cR = (1 - c)Q, (49) where Q is the transition probability operator of the Markov process in ( Y, PAy ). Then II P n , -RII < 2(1 - c)n, and hence (47) holds for some Cl > 0 and C2 > O. Indeed, it follows from (49) that QR = RQ = R. Hence, P, -R = (1 - c)(Q - R), (Q - R)n = Qn - R, P n , -R = (P, -R)n = (1 - c)n(Qn - R). The remark follows from this. Let Y be a finite set, m the number of points in Y, and A.-I (y) Q(y, {z} ) = q(y, z) the transition probability for the imbedded chain. Suppose that ql(Y,Z) = q(y,z), and qn(Y,z) = E uEy q(y,U)qn-l(U,Z) (n > 1) is the n-step transition 1>robability of the imbedded chain. If all the states com- municate, then m Lqn(Y,Z) > o. n=1 Suppose that J = miny,z E:=1 qn(Y, z) is a positive number. Let A.O = miny A.(Y) and A. = max y A.(Y). Then PT(y, {z}) > q(y, z) iT ).(y)e-A(Y)Se-A(Y)(T-S) ds m-l + . . . + L 1 . .. ( L q(y, yJ) n=l SI+S2+."+ S n<' JyIEY,...,YnEY ...qn(Yn,Z)A.(Y)A.(Yl)...A.(Yn) x exp{ -A.(Y)S - A.(YI )SI - . . . - A.(Yn)Sn - A.(Z)(T - SI - Sn)} ds dS I dS 2 ... dS n J - > ,A.oe- A '(1 A T)m. m. 
Hence, setting c = JADe).. 1m!, Pl(Y,{Z}) -cp(z) > 0 and condition (49) holds. In (48) we can take $c_1 = 2/(1 - c)$ and $c_2 = \ln(1/(1 - c))$. In the finite-dimensional case this enables us to get uniform estimates in terms of $m$, $\delta$, $\lambda_0$, and $\bar\lambda$ for the rate of convergence to an ergodic distribution.

REMARK 4. Suppose that (48) holds. Then there exists a $c > 0$ such that for every function $f \in B(Y)$ and all $T > 0$
\[ \sup_y \Bigl| \mathsf E_y \frac1T \int_0^T f(y(s))\,ds - \int f(z)\,\rho(dz) \Bigr| \le \frac{c}{T}\,\|f\|. \tag{50} \]
Indeed,
\[ \int f(z)\,\rho(dz) = Rf, \qquad \mathsf E_y \frac1T \int_0^T f(y(s))\,ds = \frac1T \int_0^T P_s f(y)\,ds. \]
Hence,
\[
\begin{aligned}
\sup_y \Bigl| \mathsf E_y \frac1T \int_0^T f(y(s))\,ds - \int f(z)\,\rho(dz) \Bigr| &\le \sup_y \Bigl| \frac1T \int_0^T (P_s - R) f(y)\,ds \Bigr| \le \frac1T \int_0^T \|P_s - R\| \cdot \|f\|\,ds \\
&\le \frac{c_1}{T} \int_0^T e^{-c_2 s}\,ds \cdot \|f\| = \frac{c_1}{c_2 T}\,(1 - e^{-c_2 T})\,\|f\|,
\end{aligned}
\]
so that (50) holds with $c = c_1/c_2$.

DEFINITION. Let $\{\Pi^\alpha,\ \alpha \in A\}$ be some family of generating operators of the form (45) ($A$ is some set). A family of Markov processes with these generating operators is said to be uniformly ergodic if for every $\alpha \in A$ a Markov process with generating operator $\Pi^\alpha$ is ergodic and, with $\rho_\alpha$ denoting the corresponding ergodic distribution, there exists a constant $c$ such that for $\alpha \in A$, $f \in B(Y)$, and $T > 0$
\[ \Bigl| \mathsf E_y^\alpha \frac1T \int_0^T f(y(s))\,ds - \int f(z)\,\rho_\alpha(dz) \Bigr| \le \frac{c}{T}\,\|f\|, \tag{51} \]
where $\mathsf E^\alpha$ is the expectation for a process with generating operator $\Pi^\alpha$.

Effective conditions for uniform ergodicity of a family of processes in terms of $\Pi^\alpha$ or $P_\tau^\alpha$ for some fixed $\tau > 0$ can be given on the basis of Lemma 4 and the remarks after it.
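The bound of the form (50) can be illustrated on the simplest example, a two-state jump process. This is a sketch under stated assumptions: the rates `alpha` (jump $0 \to 1$) and `beta` (jump $1 \to 0$), the test function `f`, and the function names are hypothetical. For such a process $P_s = R + e^{-(\alpha+\beta)s}(I - R)$, so the exponential estimate holds with $c_1 = 2$, $c_2 = \alpha + \beta$, and the constant in the averaged bound can be taken as $c = c_1/c_2$. The time average is computed by a midpoint rule rather than in closed form, to keep the check independent of the final formula.

```python
import math


def time_average_gap(alpha, beta, f, y, T, n=20000):
    """|(1/T) * integral_0^T P_s f(y) ds - R f| for a two-state jump process.

    alpha is the rate of the jump 0 -> 1, beta the rate of the jump 1 -> 0;
    f is a pair (f(0), f(1)) and y is the starting state.
    """
    gamma = alpha + beta
    rho = (beta / gamma, alpha / gamma)        # stationary distribution
    Rf = rho[0] * f[0] + rho[1] * f[1]         # R f is a constant
    h = T / n
    total = 0.0
    for i in range(n):                         # midpoint rule in s
        s = (i + 0.5) * h
        Ps_f = Rf + math.exp(-gamma * s) * (f[y] - Rf)
        total += Ps_f * h
    return abs(total / T - Rf)


def averaged_bound(alpha, beta, f, T):
    """Right-hand side (c / T) * ||f|| with c = c1 / c2 = 2 / (alpha + beta)."""
    norm_f = max(abs(f[0]), abs(f[1]))
    return 2.0 / ((alpha + beta) * T) * norm_f
```

For any rates and any horizon `T` the value of `time_average_gap` should not exceed `averaged_bound`; a family of such processes with $\alpha + \beta$ bounded away from zero is then uniformly ergodic in the sense of the definition above.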
110 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER LEMMA 5. Suppose that a family of Markov processes with generating operators {no, a E A} is uniformly ergodic and Illlo - llpll < J. Then Ipo - ppl < CIVJ, where Cl depends only on c in (51), and Ipo - ppl is the variation of the difference of measures. PROOF. Let P be the transition probability operator for the process with generating operator llo. Then d d p _ p ds P = P llo, ds P t - s - -llp P t - s , :s P pf-s = P(ITa - TIp) pf-s, P - P = 10 1 :s (P pf-s) ds = 10 1  (ITa - TIp) pf-s ds" Hence, II P - P II < tJ, 1 {T a p JT T 10 (PI -PI )dt < T" Therefore, f I(y) Po. (dy) - f I(y)pp(dy) < ( l5J +  ) 11/11. Choosing T = J-l/2, we get what is needed. 0 2.3. An estimate for a process with a discrete component. We consider the system (40), (41), with the following conditions: 5) There exists a constant such that for all y E Y and x E Rd la(x,y)1 + IIB(x,y)1I < k(1 + Ixl). 6) There exists an increasing upwards convex function 'I' on R+ such that '1'(0) = 0, 'I'(lx) < l'¥(x) for I> 1, and for Xl,X2 E Rd IIllxI - llX211 < 'I'(I X I - x212). LEMMA 6. Suppose that II and II 1 are the generating operators of Markov jump processes in (Y, PAy), the operator II has the form (45), and III has the sameform if A. and Q are replaced by A.l and Ql. In this case, ifllll-lllll < J, then there exists a transition kernel Q (y, d z) such that Q (y, B) < Q(y, B) A Ql (y, B), B E PAy, Q(y, Y) - Q (y, Y) < J, Ql (y, Y) - Q (y, Y) < J. PROOF. Using the decomposition of a measure into the absolutely con- tinuous component and the singular component, we can write Q(y,B) = Q'(y,B) + Q"(y,B), Ql(y,B) = Q(y,B) + Q'(y,B), 
2. PROCESSES WITH RAPID SWITCHING 111 where Q' and Q are equivalent measures, and Q" and Q' are orthogonal to them and to each other. Using the fact that the a-algebra !By is countably generated, we can choose Q', Q", Q, and Q' to be measurable with respect to y, i.e., they are also transition kernels. There exists a measurable density q' (y, z) such that Q(y,B) = l q'(y,z)Q'(y,dz). N ow let Q (y,B) = l (q'(y, z)" l)Q'(y,dz). Suppose that for a given y the sets C l ,. . ., C 5 are disjoint, Uk C k = Y, and they satisfy the following conditions: Q (y, C l U C 5 ) = 0; q'(y, z) < 1 for z E C 2 ; q'(y, z) = 1 for z E C 3 ; q'(y, z) > 1 for z E C 4 ; and Q"(y, Y\C l ) = Q'(y, Y\C 5 ) = O. Then -, - o < Q(y, Y) - Q(y, Y) = Q' (y, C l ) + Q(y, C 2 ) - Q(y, C 2 ) = Q(y,C l ) - Ql(Y,C l ) + Q(y,C 2 ) - Ql(Y,C 2 ) = f Q(y,dz)Icluc2(z) - f QI(y,dz)Icluc2 = llIc 1 uc 2 - ll1 I C 1 UC 2 < IIll - llll1 < J. Similarly, o < Ql (y, Y) - Q (y, Y) = lllIc4ucs - llIc 4 uc s < IIlll - nil < J. 0 LEMMA 7. Suppose that nand III satisfy the conditions of Lemma 6, {8, } is a measurable space with a a-finite measure m(d8) without atoms, and the function f(y, 8) from Y x 8 to Y is such that m( {8: f(y, 8) E B}) = Q(y,B), o ft B. (52) Then it is possible to construct a function fi (y, 8) from Y x 8 to Y such that m ( { 8: f (y, 8) # O} U {8: fi (y, 8) # O} \ { 8: fi (y, 8) # f (y, 8) }) < 2J ( 53) and m( {8: fi (y, 8) E B}) = Ql (y, B), o ft B. (54) PROOF. Suppose that C l ,..., C 5 are the same as in the proof of Lemma 6, and the r i are defined by r j = {8: f (y, 8) E C j }, i = 1, 2, 3, 4. Let fi (y, 8) = 0 for 8 E r l , and fi (y, 8) = f(y,8) for 8 E r 3 U r4. Let mr2(d8Iz) be the conditional distribution of the measure m on r 2 with 
112 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER respect to the fibering generated by the sets {(J:f(y,(J) = z}, z E C2. It is possible for each z E C 2 to choose a set Az C {(J: f(y, (J) = z} such that U ZEC2 Az is measurable and mr2(Azlz)lmr2(r2Iz) = q'(y, z). Then let fi (y, (J) = f(y, (J) on UZ EC2 Az, and fi (y, z) = 0 for (J E r 2 \ UZ EC2 Az. We extend the definition of fi (y, (J) in such a way that (54) is satisfied. Then f(y, (J) = fi (y, (J) # 0 for (J E (U zEC 2 Az) u r 3 u r 4. Further { (J: f (y, (J) # O} u {(J: fi (y, (J) # O} \ { (J: fi (y, (J) = f (y, (J) } = ({ (J: f (y, (J) # O} \ { (J: fi (y, (J) = f (y, (J) # O}) u ( { (J: fi (y, (J) # O} \ { (J: fi (y, (J) = f (y, (J) # O}). We have that m ( { (J: f (y , (J) # O} \ { (J: fi (y, (J) = f (y, (J) # O} ) = m( {(J: f(y, (J) # O}) - m( {(J: fi (y, (J) = f(y, (J) # O}) = Q(y, Y) - m( U Az) - m(r 3 ) - m(r 4 ) zEC 2 =Q(y,Y)- ( q'(y,z)Q(y,dz)- ( Q(y,dz) J JyU = Q(y, Y) - Q (y, Y) < J ( Q is defined in Lemma 6). Analogously, m ( { (J: fi (y, (J) # O}) - m ( f(J: fi (y, (J) = f (y, (J) # O}) = Ql(Y, Y) - Q (y, Y) < J. 0 THEOREM 7. Assume conditions 5) and 6) hold. Denote by (x(t);y(t)) the solution of the system dx(t) = a(x(t),y(t)) dt + B(x(t),y(t)) dw(t), dy(t) = fa f(xo,y(t), O)v(dO x dt) (55) with initial condition x(O) = Xo, 9(0) = Yo (xo and Yo are not random). Further, let (x(t),y(t)) be the solution of the system (40), (41) with the same initial condition. For any bounded !!ARd x !!Ay-measurable function g(x,y), E 1 h g(x(s),y(s)) ds - E 1 h g(x(s),y(s)) ds < 1'l'(h)h 2 I1gl1 (56) for all h, where I depends on Xo and k (the constant in condition 5)), and SUPxoEK I < 00 for every compact set K C Rd. 
2. PROCESSES WITH RAPID SWITCHING 113 PROOF. Since the distribution of the pair (x(t);y(t)) does not depend on the choice of the function f(x, y, (J) satisfying m( {(J: f(x,y, (J) E B}) = c(x, y)Q(x, y, B), BE/!Ay,OftB, we can use Lemma 7 and condition 5) to choose this function so that m ( { (J: f (xo, y, (J) # O} U {(J: f (x, y, (J) # O} \ { (J: f (xo, y, (J) = f (x, y, (J) } ) < 211nxo - nxll < 2'¥(lx - xoI 2 ). Denote by Cx,y the set in  appearing as the argument of m in the pre- ceding inequality. Obviously, the processes (x(t);y(t)) and (x(t);y(t)) co- incide as long as the jumps of the processes y(t) and y(t) coincide, and they coincide if at the time s of a jump of y(t) f(x(s), y(s-), (Js) = fi (x(s), y(s-), (Js), where ((Js, s) is a point of concentration of the measure v(d(J x dt) on the line t = s (it exists, because s is a jump point). Let C(t) = it  IcX(S).)'(S)v(d£J x ds) and let 'r be the first jump time of C(t). Then (x(t);y(t)) = (x(t);y(t)) for t < 'r. Hence, P {i h g(x(t),y(t)) dt -I i h g(x(t),y(t)) dt} < P{ 1" < h} < P{C(h) > I} < EC(h) = E i h f IcX(S).)'(s)m(d£J)ds = i h Em(Cx(s),y(s»)ds < 2 i h E'I'(lx(s) - x(O)1 2 ) ds < lh'l'(h). We have used the fact that for every compact set K C Rd there exists a constant 11 dependent on k such that Ex('¥(lx(t) -x(0)1 2 )) < f. t for x E K, and hence Ex'¥(lx(t) - x(0)1 2 ) < ,¥(Exlx(t) - x(0)1 2 ) < '¥(f.t) < (11 + 1)'¥(t). Observe now that E (i h f(x(s),y(s)) ds - i h f(x(s),y(s)) ds ) < 2hllfll P {i h f(x(s),y(s)) ds -I i h f(x(s),y(s)) ds } . This gives us (56). 0 
2.4. A limit theorem for processes with rapidly varying discrete component. We consider a system of the form (40), (41) dependent on a small parameter $\varepsilon$, and we investigate its behavior as $\varepsilon \to 0$. This system has the form
\[
\begin{aligned}
dx_\varepsilon(t) &= a(x_\varepsilon(t), y_\varepsilon(t))\,dt + B(x_\varepsilon(t), y_\varepsilon(t))\,dw(t), \\
dy_\varepsilon(t) &= \int_\Theta f(x_\varepsilon(t), y_\varepsilon(t), \theta)\,\nu_\varepsilon(d\theta \times dt).
\end{aligned} \tag{57}
\]
Here $a$, $B$, and $f$ are the same as in (40), (41), and the $\varepsilon$ in the equation appears only in the measure $\nu_\varepsilon(d\theta \times dt)$ on $\Theta \times R_+$, namely,
\[ \mathsf E\,\nu_\varepsilon(d\theta \times dt) = \frac1\varepsilon\, m(d\theta)\,dt. \]
Assume that the system (57) has a weakly unique solution, which is thus a Markov process. Let $L_y$ and $\Pi_x$ be defined by (43) and (42), respectively. Then the generating operator $A_\varepsilon$ of the Markov process solving (57) has the following form on functions $\varphi(x,y)$ twice continuously differentiable with respect to $x$:
\[ A_\varepsilon \varphi(x,y) = L_y \varphi(x,y) + \frac1\varepsilon\, \Pi_x \varphi(x,y). \]
This means that the average number of jumps of the discrete component per unit of time is proportional to $1/\varepsilon$.

We need a condition on the generating operators $\Pi_x$ (for a fixed $x$ this is the generating operator of a jump process in $Y$):

7) The family of Markov processes in $Y$ with generating operators $\Pi_x$ is uniformly ergodic; if $\rho_x(dz)$ is an ergodic distribution for the process with generating operator $\Pi_x$, and $P^x(t,y,dz)$ is the transition probability for this process, then
\[ \Bigl| \frac1T \int_0^T \!\!\int f(z)\,P^x(t,y,dz)\,dt - \int f(z)\,\rho_x(dz) \Bigr| \le \frac{c}{T}\,\|f\|. \]

THEOREM 8. Assume conditions 1)-7) hold. Then the processes $x_\varepsilon(t)$, where $(x_\varepsilon(t); y_\varepsilon(t))$ is the solution of (57) with initial condition $x_\varepsilon(0) = x_0$, $y_\varepsilon(0) = y_0$, converge in distribution to the process $\bar x(t)$ that is the solution of the equation
\[ d\bar x(t) = \bar a(\bar x(t))\,dt + \bar B(\bar x(t))\,dw(t) \]
with initial condition $\bar x(0) = x_0$, where
\[ \bar a(x) = \int a(x,y)\,\rho_x(dy), \qquad \bar B(x) = \Bigl(\int B(x,y) B^*(x,y)\,\rho_x(dy)\Bigr)^{1/2} \]
(here the nonnegative square root of a symmetric nonnegative operator is understood).
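The averaged coefficients in Theorem 8 are easy to compute when $Y$ has two states and $x$ is scalar. The sketch below is a minimal illustration under those assumptions; the coefficient functions `a`, `B`, the switching rates `rates`, and all other names are hypothetical. For each fixed $x$ the ergodic distribution of the two-state process with rates $c(x,0)$ and $c(x,1)$ is $\rho_x = (c(x,1), c(x,0))/(c(x,0) + c(x,1))$, the drift is averaged directly, and the diffusion coefficient is the square root of the averaged square (in dimension one the operator square root reduces to the ordinary one).

```python
import math


def stationary_two_state(c0, c1):
    """Ergodic distribution of a two-state jump process that leaves
    state 0 at rate c0 and state 1 at rate c1."""
    total = c0 + c1
    return (c1 / total, c0 / total)


def averaged_coefficients(x, a, B, rates):
    """Return the averaged drift and diffusion coefficient at the point x."""
    c0, c1 = rates(x)
    rho = stationary_two_state(c0, c1)
    a_bar = sum(a(x, y) * rho[y] for y in (0, 1))
    B_bar = math.sqrt(sum(B(x, y) ** 2 * rho[y] for y in (0, 1)))
    return a_bar, B_bar


# Hypothetical coefficients: the drift and the noise level both switch with y.
def a(x, y):
    return -x if y == 0 else 1.0


def B(x, y):
    return 1.0 if y == 0 else 2.0


def rates(x):
    # Positive rates depending continuously on x.
    return (1.0 + x * x, 2.0)
```

Note that $\bar B(x)$ is not the average of $B(x,y)$: at a point where $\rho_x = (1/2, 1/2)$ the example gives $\sqrt{(1 + 4)/2}$ rather than $3/2$, which is why the theorem averages $B B^*$ and only then takes the square root.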
2. PROCESSES WITH RAPID SWITCHING 115 PROOF. We again use Theorem 1 and the remark after it. The fact that (xe(t);Ye(t)) is a homogeneous Markov process means that it suffices to prove that for a twice continuously differentiable compactly supported function rp(x) on Rd .1- 11m - h Ex,y[rp(xe(he)) - rp(x) - heLrp(xe(x))] = 0 (58) eO e heO uniformly with respect to Ixi < rand Y E Y, for any r > 0, where Lrp(x) = (rp' (x), a(x)) + ! tr rp" (x)B 2 (x). (59) Indeed, if <I>(Xl,. . . , x m ) is bounded, and tl < . . . < t m < t, then [ j t+ht ] EcI>(xe(td,. · · , xe(tm)) tp(xe(t + he)) - tp(xe(t)) - t Ltp(xe(s)) ds = E<I>(X e (tl),. · ., xe(tm))Ext(t),Ye(t) X [tp(Xe(h)) - tp(xe(O)) -l h £ Ltp(xe(s)) dS] = <I>(X e (tl),... ,xe(tm))I{lxe(t)Ir}Ext(t),Yt(t) [ (1 X tp(xe(h)) - tp(xe(O)) - 10 Ltp(xe(s)) ds J + cI>(xe(td,... , xe(tm)) (ht X I{lx.(t)l>r} Ex£(t),y£(t) 10 (g(Xe(S),Ye(S)) - Ltp(xe(s))) ds, where g(X, y) = (rp' (x), a(x, y)) + ! tr rp" (x)B(x,y)B* (x, y), because on the basis of the It6 formula (he tp(xe(he)) - tp(Xe(O)) = 10 g(x(s),y(s)) ds + l h tp' (xe(s)), B(xe(s), Ye(S)) dw(s). Hence, 1 [ j t +ht ] he EcI>(xe(td,..., xe(tm)) tp(Xe(t + he)) - tp(xe(t)) - t Ltp(xe(s)) ds < 11<1>11 sup h I Ex,y [ tp(Xe(h e )) - tp(X) - (h£ Ltp(xe(s)) dS ] Ixlr,y e 10 + 11<I>11(IILrpll + Ilgl!) P{lxe(t)1 > r}. 
116 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER The first term on the right-hand side tends to zero as e --+ 0 if (58) holds, and the second can be made arbitrarily small by suitably choosing r. Since Ex,y [tp(Xe(h e )) - tp(xe(O)) _1 he g(xe(s),Ye(s)) dS] = 0, to prove the theorem it suffices to show that lim h I Ex,y ( {he g(X(S), Ye(S)) ds - (he g(Xe(S)) dS ) = 0 eO e 10 10 uniformly with respect to Ixl < rand Y, where g(x) = Ltp(x) = f g(x,y)px(dy). Denote by ( xe (t); ye (t)) the solution of the system (60) d xe (t) = a( xe (t),Ye(t)) dt + B( xe (t), ye (t)) dw(t), d ye (t) = Ie f(x, ye (t), O)ve(dO x dt) with initial conditions xe (O) = x and ye (O) = y. On the basis of Theore 7 there exists for every r > 0 an I such that for Ixl < r {hI; (hI; I 2 Ex,y 10 g(Xe(S),Ye(S)) ds - E 10 g( xe (s), ye (s)) ds < e'P(h e ) . he .llgll. In exactly the same way we prove that Ex,y (he g(Xe(S)) ds _ (he g( Xe (S)) ds < £hi'P(he)llgli. 10 10 e We now choose he such that elh e --+ 0 and he\P(he)le --+ O. Then {hI; (he Ex,y 10 g(xe(s)) ds - E 10 g( xe (s)) ds {he (he + Ex,y 10 g(Xe(S),Ye(S)) ds - E 10 g( xe (s), ye (s)) ds = o(he) uniformly with respect to Y E Y and Ixl < r. Thus, the proof of (60) reduces to showing that lim h I ( E {he g( Xe (S), ye (s))ds - E (he g(Xe(S))dS ) = 0 (61) eO e 10 10 uniformly with respect to Y E Y and Ixl < r. Note that in view of con- dition 3) the function g(x,y) is uniformly continuous with respect to x, 
2. PROCESSES WITH RAPID SWITCHING 117 uniformly with respect to y. Therefore, lim h I E r\g( Xe (S), Ye (S)) - g(x, Ye (s))]ds eO e 10 < E lim h I rhelg( Xe (S), Ye (S)) - g(x, ye (s))1 ds = O. (62) eO e 10 Obviously, ye (es) is a Markov jump process with generating operator lim Eyg(ye(et)) - g(y) = dim Eyg(Ye(t)) - g(y) = IIxg(y). tO t tO t Thus, ye (et) can be regarded as a Markov jump process not dependent on e and having generating operator IIx. Using condition 7), we can write e 1 he Eg(x, ye (s))ds - e 1 he Eg( xe (s))ds e {hefe 1 (he - he 10 Eg(x, Ye (es)) ds - he 10 Eg( xe (s)) ds < / g(x,y)px(dy) - :e 1 hde / PX(s,x,dz)g(x,dz) ds + e 1 he IEx,yg(xe(s)) - g(x)1 ds < C h e Ilgll + sup I Ex,yg(xe(s)) - g(x)l. e s  he The right-hand side tends to zero uniformly with respect to x on each compact set by the choice of he, the uniform continuity of g(x), and the estimate Exl xe (s) - xe (O)1 < l(x)s, where l(x) is a locally bounded function. The proof of (61) is concluded by using (62). 0 2.5. Dynamical systems with rapid switching. We consider the partic- ular case of the system (57) when B(x,y) = O. In order that the solution of the system by unique (it is easy to see that under this assumption weak uniqueness is equivalent to strong uniqueness, since the solution between jumps of the process y(t) is the solution of a first-order equation) it suffices that the function a(x,y) satisfy a Lipschitz condition with respect to x. If this condition holds, then it follows from Theorem 8 that the process xe(t) converges to a nonrandom function x(t) that is the solution of the equation dx(t) _ _ ( _ ( )) dt - a x t , x(O) = Xo, 
118 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER where Xo is the nonrandom initial value for xe(t) (here a(x) is the same as in Theorem 8). More interesting is the case when a(x) = O. Then x(t) coincides with the initial value, i.e., xe(t) --+ Xo for all t > o. We shall study the nature of this convergence. Thus, we have the system d Xe (t) dt = a(xe(t),Ye(t)), dYe(t) = Ie f(xe(t),Ye(t), O)ve(dO x dt), (63) where the function f(x,y, 0) and the measure V e are the same as in 2.4. The following condition is assumed: 8) a(x,y) is bounded, jointly measurable, and continuous in x, a(x,y) exists and satisfies a Lipschitz condition in x with constant independent of y, f a(x,y)px(dy) = 0 for all x E Rd, and condition 6) holds for the operators IIx with the function 'I/(s) = cv'S, where c is a constant. We consider the expression Ex,y{O(xe(h)) for a thrice continuously dif- ferentiable compactly supported function {O: h Ex,ytp(x(h)) = tp(x) + Ex,y 1 (tp' (xe(s)), a(xe(s), Ye(S))) ds = tp(X) + Ex,y 1 h (tp'(x), a(xe(s),Ye(S))) ds + Ex,y 1 h 1 5 (tp"(xe(u))a(xe(u),Ye(u)),a(xe(s),Ye(s)))duds = tp(x) + Ex,y 1 h (tp'(x),a(x,Ye(S))) ds + Ex,y 1 h 1 5 (tp' (x), a' (xe(s), Ye(s))a(xe(u),Ye(U))) du ds + Ex,y 1 h 1 5 ( tp" (x e ( U ))a( Xe (u), Ye (u)), a(x e (s), Ye (s))) duds. Note that, since a, a, and {O satisfy a Lipschitz.condition in x and a(x,y) is bounded, so that Ixe(h) - xe(O)1 = O(h), it follows that if we replace xe(u) and xe(s) by x in the double integrals, then we get an error of order 
2. PROCESSES WITH RAPID SWITCHING 119 h 3 . Therefore, Ex,ytp(x(h)) = tp(x) + Ex,y 1 h (tp'(x),a(x,Ye(s))) ds (64) + Ex,y 1 h (tp'(X), 1 5 a'(X,Ye(s))a(x,Ye(U)) du ) ds + Ex,y 1 h 1 5 (tp" (x)a(x,Ye(U)), a(x, Ye(S))) du ds + O(h 3 ). We now transform the expressions containing double integrals, replacing Ye(s) by Ye (s) (these processes were introduced in the proof of Theorem 8). Since IXe(h) - xe(O)1 < kh, where k is a constant, and condition 6) holds with the function cVS, we can write (see the proof of Theorem 7) p {l h (tp'(X), 1 5 a'(X,Ye(s))a(x,Ye(U)) dU) ds =l-1 h (tp'(X), 1 5 a(X' Ye (s))a(x' Ye (U))dU) dS} < CI 2 , where Cl is a constant. Therefore, Ex,y 1 h (tp'(X), 1 5 a(X,Ye(s))a(x,Ye(U)) dU) ds - Ex,y 1 h (tp'(X), 1 5 a(X' Ye (s))a(x' Ye (U))dU) ds < o( 4 ). Further, Ex,y 1 h (tp'(X), 1 5 a'(X, Ye (s))a(x, Ye (U))dU) ds = e 2 E x ,y 1 h / e (tp'(X), 1 5 a'(x, ye (es))a(x, ye (eu)) dU) ds. Using the fact that ye (es) is a Markov jump process with transition prob- abili ty p x (s , Y, d z ), we can rewrite this in the form e21h/e (tp'(X), 1 5 II a'(x,z2)a(x,zdPX(u,y, dZ d) x PX(s - u, zl,dz 2 ) duds = e2 1 h / e I (tp'(X), [l h / e - u I a'(X,Z2)PX(S,ZI,dZ 2 )dS] )a(X,Zd x P(u,y,dz l ) du. 
120 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER For what follows we need uniform exponential ergodicity for a family of Markov processes with transition probabilities Px (t, ZI, d Z2). The follow- ing condition is assumed: 9) For each r > 0 there exists a constant c, such that for all g E B(Y) and t > 0 f P"(t,y,dz)g(z) - f px(dz)g(z) < c;le-crtllgll for Ixl < r and all y E Y. If condition 9) holds, then a function RX (y, B) is defined that is Rd  y-measurable, countably additive with respect to B E y, of uniformly bounded variation with res.pect to Ixl < rand y E Y for any r > 0, and such that f RX(y,dz)g(z) = 1 00 f g(z)[P"(t,y,dz) - Px(dz)]dt. Under the assumption of condition 9), for large T we can write I T f gl(ZI) [I T - U f g2(Z2)P"(S, zJ,dz 2 ) dS] PX(u,y,dzddu = I T f gl(zd [I T - U f g2(Z2)[P"(s,zJ,dz 2 ) - Px(dZ 2 )]] ds x [PX(u,y,dz l ) - Px(dz l )]du + I T f gl (zd (I T - U ds f g2(Z2)[P"(S, Zl, d Z2) - px(d Z2)]) x Px(dz l ) du + I T f gl(zd [I T - U f g2(Z2)Px(dZ 2 )dS] X (PX(u,y,dz l ) - Px(dz l ))du + I T f gl(zd [I T - U f g2(Z2)Px(dZ 2 )dS] px(dzddu = ! gl(zd f g2(Z2)R X (zJ,dz 2 )R X (y,dz 1 ) + T f f gl (Zdg2(Z2)R X (zJ, dZ 2 )Px(dzd + I T f gl(zd(T-u) f g2(Z2)Px(dz 2 )(P"(u,y,dzd-px(dzd)du + 2 f gl(zdPx(dz 1 ) f g2(Z2)Px(dz 2 ) + O(Te- CrT + 1) (65) 
2. PROCESSES WITH RAPID SWITCHING 121 (here 0(.) estimates the error arising when the integrals with Px - Px are replaced by integrals with infinite limits). We use the computations to transform the right-hand side of (65). In the case T = hie we assume that hie --+ 00 and .f a(x, zl)Px(dz l ) = 0, and since 1 T I g,(zd(T - u) I g2(Z2)Px(dz 2 )[PX(u,y,dzd - px(dzd] du = T II g,(Zdg2(Z2)R X (y, zdPx(dzd + 0(1), we rewrite the right-hand side of (65). in the form e 2 (II (tp' (x), a' (x, z2)a(x, zd)R X (y, d zdRx (z" d Z2) +  II (tp'(x), a'(x, z2)a(x, zd)[RX(z" dZ 2 )Px(dzd + R X (y,dzdpx(dz 2 )] + 0(1)) = eh I I (tp' (x), a' (x, z2)a(x, Z2)) x [R X (ZI, dZ 2 )Px(dz l ) + RX(y, dZ l )Px(dz 2 )] + 0(e 2 ). Similarly, Ex,y 1 h 1 5 (tp" (x)a(x, Ye(U)), a(x, Ye(S))) du ds = eh II (tp"(x)a(x, zd, a(x, z2))px(d zdRX(z" dz 2 ) + o( e2 + 4 ). Consequently, (64) can be rewritten as Ex,ytp(xe(h)) =tp(x) + eh I I [(tp"(x)a(x, zd, a(x, Z2)) (qJ'(x), a'(x, z2)a(x, ZI))] x RX(zI, dZ 2 )Px(dz l ) +eh(tp'(X), (I a'(x,Z2)Px(dZ 2 )) x (a(x,zdRX(Y,dZd)) + Ex,y 1 h (tp'(x), a(x,Ye(s))) ds +0(h3+ 4 +e 2 ). (66) 
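We apply this equality to the function $\varphi(v) = |v - x|^2$, $v \in R^d$. Since $\varphi'(x) = 0$, we get that
\[ E_{x,y}|x_\varepsilon(h) - x|^2 = O(\varepsilon h + \varepsilon^2 + h^3 + h^4/\varepsilon). \]
By assumption, $h/\varepsilon \to \infty$. Assume also that $h^3/\varepsilon^2 \to 0$. Then
\[ E_{x,y}|x_\varepsilon(h) - x|^2 = O(\varepsilon h). \tag{67} \]
The last relation shows that the variable $\varepsilon^{-1/2}(x_\varepsilon(t) - x)$ can have a limit distribution as $\varepsilon \to 0$. We demonstrate this.

THEOREM 9. Let $(x_\varepsilon(t); y_\varepsilon(t))$ be the solution of the system (63) with initial values $x_\varepsilon(0) = x$ and $y_\varepsilon(0) = y$, and assume conditions 5)-9) hold. Then, uniformly with respect to $|x| \le r$, $y \in Y$ ($r > 0$ arbitrary), the process $\zeta^\varepsilon_{x,y}(t) = (x_\varepsilon(t) - x)/\sqrt{\varepsilon}$ converges in distribution to a process $\zeta_x(t)$ that is a homogeneous Gaussian process with independent increments in $R^d$ and satisfies $E\zeta_x(t) = 0$ and
\[ E(\zeta_x(t), v)^2 = t \iint (v, a(x,z_1))(v, a(x,z_2))\, R^x(z_1, dz_2)\,\rho_x(dz_1) = t(B_x v, v) \]
for $v \in R^d$. The uniformity of the convergence means that for every continuous bounded function $\Phi(x_1,\dots,x_m)$ on $(R^d)^m$
\[ \lim_{\varepsilon\to 0}\ \sup_{|x|\le r,\ y\in Y} \bigl| E\Phi(\zeta^\varepsilon_{x,y}(t_1),\dots,\zeta^\varepsilon_{x,y}(t_m)) - E\Phi(\zeta_x(t_1),\dots,\zeta_x(t_m)) \bigr| = 0 \qquad (0 < t_1 < \dots < t_m). \]

PROOF. On the basis of Theorem 7, condition 8), and (67),
\[ E_{x,y}\int_0^h a(x, y_\varepsilon(s))\,ds = E_{x,y}\int_0^h a(x, \tilde y_\varepsilon(s))\,ds + O\Bigl(\frac{h^{5/2}}{\sqrt{\varepsilon}}\Bigr) = \varepsilon\int_0^{h/\varepsilon}\!\!\int P^x(s,y,dz)\,a(x,z)\,ds + O\Bigl(\frac{h^{5/2}}{\sqrt{\varepsilon}}\Bigr) \]
\[ = \varepsilon\int_0^{h/\varepsilon}\!\!\int \bigl(P^x(s,y,dz) - \rho_x(dz)\bigr)a(x,z)\,ds + O\Bigl(\frac{h^{5/2}}{\sqrt{\varepsilon}}\Bigr) = \varepsilon\int a(x,z)\,R^x(y,dz) + O\Bigl(\frac{h^{5/2}}{\sqrt{\varepsilon}} + \varepsilon e^{-c_r h/\varepsilon}\Bigr). \]
Let $0 < t_1 < \dots < t_m < t - h < t < t + h$. Define
\[ \bar a(x,y) = \int a(x,z)\,R^x(y,dz). \]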
2. PROCESSES WITH RAPID SWITCHING 123 It is easy to see that f a (x,y)px(dy) = O. Therefore E<I>(xe(td,..., xe(tm)) 1 h a(xe(t),Ye(t + s)) ds ( h 5 / 2 ) = 0 Vi + ee-c,h/e + eE<I>(xe(td,..., xe(tm)) x Ext(t-h),Yt(t-h) a (x e (t), Ye (t)) ( h 5 / 2 ) = 0 Vi + ee-c\h/e + O(eEx<(t-h),y<(t-h)l a (xe(t),Ye(t)) - a(xe(t - h),Ye(t))1) + eE<I>(x e (tl),... ,xe(tm))Exe(t-h),Ye(t-h) a (xe(t - h),Ye(t)). It follows from condition 8) and Lemma 5 that l a (x,y) - a (x2,y)1 < kllxl-X211/2 for Ix;! < r, where k l is a constant dependent on r. Therefore, Exe(t-h),Yt(t-h)l a (xe(t),Ye(t)) - a (xe(t - h),Ye(t))1 < 0((eh)I/4 + Exe(t-h),Ydt-h)[I{lxe(t-h)l>r} + I{lxdt)l>r}]). Further, Exe(t-h),Ye(t-h) a (xe(t - h),Ye(t)) = Ex,y a (x,Ye(h)) , x=xt;{t-h) y=yt;{t-h) Ex,y a (x,Ye(h)) = Ex,y a (x, ye (h)) + o( h2 ) = o(e-C,h/e +  ) (we have used the fact that f a (x,y)px(dy) = 0). It will be assumed that h has been chosen so that h 2 Ie --+ 0, and e-c,h/e = o(e). Then h E<I>(xe(td,..., xe(tm)) 1 a(xe(s),Ye(S)) ds _ ( 5/4 1/4 h 5 / 4 2 - 0 e h + Vi + e + eEx<(t-h),y<(t-h) x (I{!x<(t-h)l>r} + I{lx<(t)l>r}) ). (68) Choose h = e'Y with)' < 1 such that h 5 / 2 e- l / 2 = o(hVi). Then e 5 / 4 h 1 / 4 + h 5 / 2 e- 1 / 2 + e 2 = o(hVi). Suppose that {O(x) is a thrice continuously dif- ferentiable function with support in the ball {x: Ixl < rj2}, g(x) = 1 for 
124 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER Ixi < 2r, Ig(x)1 < 1, g(x) = 0 for Ixl > 3r, and g(x) is continuous. Then [ ( xe(t + h) - X ) ( xe(t) - X )] EcI>(xe(tI), · · · , xe(tm)) tp .[i - tp .[i = E<I>(X e (tl),... ,xe(tm))g(xe(t - h)) x [ ( Xe(t+h) -X ) _ xe(t) -X ]  .[i .[i + E<I>(X e (tl),... ,x e (t m ))(1 - g(xe(t - h))) x [tp ( Xe (t ) - X ) _ tp ( Xe (- X ) ] . The second term can be estimated by the quantity 4l/cI>lI.lltpll ( p {(I - g(xe(t - h)))tp ( Xe(t) - X ) I- o} +p{(1-g(Xe(t-h)))tp( Xe(-X ) I-O}). If Ixl < r, then ((v - x)/.[i) is nonzero for Ix - vi < .[ir/2. Hence, if e < 1, then ( Xe (t + h) - x ) = 0  .[i when Ixe(t + h)1 > r. On the other hand, (1 - g(xe(t - h))) = 0 for IXe(t - h)1 < 2r. Hence, P {(1- g(xe(t - h)))tp( xe(t) - X ) = o} < P {Ixe(t + h) - xe(t - h)1 > r/2} 4 < ,2 Elxe(t + h) - Xe(t - h)12 = O(eh). 
2. PROCESSES WITH RAPID SWITCHING 125 The second probability is estimated similarly. On the basis of equations (66) and (68), E <1>( Xe ( t 1 ), . . . , Xe ( t m ) ) g (X e (t - h)) [  ( c;,y (t + h)) -  ( c;;,y ( t) ) ] = E<I>(X e (tl),... ,xe(tm))g(xe(t - h)) X Ex(t),y(t) (eh f f  tp" (c;,y(t))a(xe (t), zd, a(xe(t), Z2) x Pxe(t)(d zdRXe(t) (Zl, d Z2)) + O( .;8h + h 3 + 4 ; +e 2 + eh + eEg(xe(t - h)) x Ill" (x e (t)) I (I{lx e (t-h)I>3r} + I{lxe(t)l>3r})) = hE<I>(x e (tl),... ,xe(tm))g(xe(t - h)) x tr  1/ ( c;,y ( t) ) B xt (t) + 0 ( h) ( 69) (we have applied (66) to ((v - x)/Vi) as a function of v), and the o(h) on the right-hand side of (69) is uniform with respect to y E Y and Ixi < r. Since BXt(t) --+ Bx as e --+ 0 because Bx is continuous in x, what is required follows from Theorem 1 and the remark after it. 0 Relation (67) gives us that Ex,y xe(  ) -x 2 = O(h). Therefore, it might be expected that the process xe(t/e) also has a limit distribution. Let us show that this is indeed so under certain additional assumptions. We need the following condition on TIx: 10) dTIx/dx and d2TIx/dx2 exist and are bounded operators on B(Y). The notation llxg(y) = f Qx(y,dz)g(z), :x llxg(y) = IQ:(y,dZ)g(Z), :;2 llxg (y) = f Q(y,dz)g(z) will be used for these operators. For a fixed y the quantities Qx, Q, and Q are countably additive functions of bounded variation (on Il/y). THEOREM 10. Suppose that (xe(t);Ye(t)) is a solution of the system (63), conditions 5)-10) hold, and the stochastic equation in Rd dx(t) = a(x(t)) dt + B(x(t))dw(t), (70) 
126 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER where a(x) = al (x) + a2(x), al (x) = I I a'(x, z2)a(x, zdpx(d zdRX(Zb d Z2), a2(x) = III px(dz)RX(z,dzd(Q(zl>dz2),a(x,z)) a (x,z2), and B(x) is a symmetric nonnegative operator with (ij2(x)v,v) = 211(a(x,zd,v)(a(X,Z2),V)RX(ZbdZ2)Px(dzd for v E Rd, has a weakly unique solution. Then the process xe(t) = xe(te- l ) converges in distribution to the solution of(70) with initial condition x(O) = Xo, where xe(O) = Xo, uniformly with respect to Yo = Ye(O). PROOF. Assume that h depends on e in such a way that hIe --+ 0 and hle 2 --+ 00 as e --+ O. On the basis of (66), Ex,ytp (Xe (  ) ) = tp(x) + h [  tr tp" jj2(x) + (tp' (x), al (x)) + I a'(x, z) a (x,y)px(dZ)] + Ex,y 1 h / e (tp'(x), a(x,Ye(s))) ds + o( : + : + e 2 ). (71 ) Consider the expression Ex,y l' (tp'(x), a(x,Ye(S))) ds = (tp'(X), l' Ex,ya(x,Ye(S)) dS), h "'r- " - - e Using Theorem 7, we find that Ex,y l' a(x,Ye(S)) ds = Ex,y l' a(x, ye (s)) ds + o( :2 JU) = e a (x,y) + O( h:2 ) + O(ee-crh/e\ Assume that e- h /e 2 = o(h). Then (' ( h 5 / 2 ) Ex,y 10 a(x,Ye(S)) ds = e a (x,y) + 0 83 + o(h). (72) 
2. PROCESSES WITH RAPID SWITCHING 127 We use this preliminary estimate to get a sharper one. Let Ts and Ts be .- two semigroups with generating operators A and A. Then Tsg - Tsg = l s Tu(A - A) Ts-ug duo Using this formula, we get that Ex,y ' a(x,Ye(s)) ds - Ex,y ' a(x, ye (s)) ds 1 {' (S = e E 10 10 [Qxe(u)(Ye(u),dz) - Qx(Ye(u),dz)] x I px eu ,Z,dz\)a(X,ZddUdS (the operators are applied to the function a for a fixed x). Expanding llxt(u) - llx by the Taylor formula, we can write the expression in the last equality on the right in the form  Ex,y l' l s I(Q(Ye(U),dz),Xe(U) -x) x I px e  U , z,dz\ )a(x, zd duds + o( Ex,y  ' s Ixe(u) - xl 2 x II px eu ,Z,dz\)a(X,ZddUdS ) = <1>1 + 0(<1>2). We have that <1>2 = 0(1' l s ue-C,(S-U)/eduds) = o( (  )), <1>\ = Ex,y ' I (Q(Ye(U), d z), xe(u) - x) a (x, z) du + o(' Ex,ylxe(u) - xle-c,(,-u)/e du ). 
128 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER 7: Ex,ylxe(u) _ xle- c ,(7:-u)/edu = 7: O( ViU)e- c1 (7:-u)/edu = O(eVh) = o(h). Thus, Ex,y 7: a(x,Ye(s)) ds = e a (x,y) + Ex,y 7: f (Q(Ye(u),dz),xe(u) - x) a (x, z) du + o(h) = e a (x,y) + <1>3 + o(h). We show that in the expression for <1>3 we can replace Xe and Ye by Xe and Y e , with an error of the order o(h). Let a > O. Then on the basis of Theorem 7 <1>3 = Ex,y 7: f(Q( Ye (U),dz), Xe (U) -x) a (x,z)du + Ex,y 7: (Q(Ye(u), d z), xe(u) - x)I{lxe(u)-xl>a} a (x, z) du - Ex,y 7: (Q( y e (u), d z, xe (u) - x)I{l:xe(u)-xl>a} a (x, z) du +o( T2 a) = Ex,y 7: f(Q( Ye (U),dz), Xe (U) -x) a (x,z)du 0( t 2 ..fh t h ) + a+ 2 ' e a t2..fh th _ ( (  ) 3/2  ) (  ) 3/2 (  ) a+ 2 -h 2 a+ 2 < h 2 a+ 2 · e a e ea e a Suppose that a = e 1/3 and (h/e2)3/2eI/3 --+ O. Then <1>3 = Ex,y 7: f ( Q( Ye (u), dz), u a( xe (s), ye (s)) ds ) a (x, z) du + o(h) = Ex,y 7: ! ( Q( Ye (u), dz), u a(x, ye (s)) ds ) a (x, z) du + 0(t 2 Vh) + o(h) 
2. PROCESSES WITH RAPID SWITCHING 129 = Ex,y l' l' II (Q(z\,dz)PX ( US , y£ (s),dzl)du,a(X, y£ (S))) x a (x, z) ds + o(h) + o( h:2 ) = Ex,y l' l' II ( Q(z\,dz)px(dzddu,a(x, y£ (s)) ) a (x, z) ds ( (' 1 00 h 5 / 2 ) +0 10 , e- C ,(U-S)/£duds+ 7 +o(h) + Ex,y l' 1 00 II ( Q(z\,dz) [ px ( U  S , y £(s),dZ I ) - pAd Zd] du, a(x, y£ (S))) a (x, z) ds = Ex,y l' (I Q(ZI,dz)pAdzd, l u a(x, Y£ (S))ds) a (X,z)dU + E X , y8 1' (II Q(Z\,dZ)RX( Y £(S),dzd,a(x, y£ (s))) a (x, z)ds + o(h) + o( h:2 ) = 81: [!! (Q(zt. d z)Px(d zd, a (x, y)) a (x, Z)] +  Ex,y 1'/£ II(Q(Zt.dZ)RX( Y£ (8S),dZd,a(x, Y£ (8s))) a (x,z)dS ( h 5 / 2 ) +0 7 +o(h). Since the process Y e (es) is uniformly ergodic, lim 8 Ex,y r/£ If Q(ZI, dz)RX( y e(es), dz l ), a(x, ye (es)) a (x, z) ds eO l' J 0 = a2(X) uniformly with respect to x for Ixl < r, where r > 0 is arbitrary. 
130 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER We now let h depend on e in such a way that for all c > 0 ( h2 h3 h 3/2 ( h ) 312 e 2 ) lim - + - + - + - e 1/3 + _e- chle = o. 8-+0 e 3 e 5 e 2 e 2 h Then <1>3 = ha2(X) + h II(Q(ZbdZ)PAdzd, a (x,y)) a (x, z) + o(h). By (71) and (72), Ex,ytp (Xe (  ) ) = tp(X) + h [  trtp"(x)i12(x) + (tp'(x), a(x))] +h(tp'(X), I a(x,z)pAdz) a (x,y) + II (Q(Zb d z)Px(dzd, a (x,y)) a (x, Z)) + e a (x,y) + o(h). Let 0 < tl < . . . < t m < t - h < t < t + h. Then E<I>(Xe(  ),...,xe C: )) x (tp ( Xe C : h ) ) - tp ( Xe ( ) ) - hLtp ( Xe (  ) ) ) = o(h) + E<I> ( Xe (  ), · · · , Xe C: ) ) X Exe((t-h)/e),Ye((t-h)/e) [ h ( tp' ( Xe ( ) ), b ( Xe (  ) , Ye ( ) ) ) + 8 (tp' ( Xe () ), a ( Xe (), Ye () ) ) ] , (73) where b (x,y) = I a(x, z)px(dz) a (x,y) + II (Q(Zb d z)Px(dzd, a (x,y)) a (x, z). 
2. PROCESSES WITH RAPID SWITCHING 131 We have that Exe«t-h)/e),Ye((t-h)/e) (tp' ( Xe () ), b ( Xe (). Ye () ) ) = Exe((t-h)/e),Ye((t-h)/e) ( tp' ( Xe ( t  h ) ), b ( Xe ( t  h ) , Ye ( ) ) ) + 0(1) ( - ( ( h ))) = Ex Y '(x), b X,Ye - + 0(1), , e X=Xt;«t-h)/e),Y=Ye«t-h)/e) Ex,y b (X,Ye (  ) ) = o(  Vh) + Ex,y b (X, Ye (  ) ) = I b (x, z)Px(dz) + 0(1) = 0(1), because J b (x,z)Px(dz) = O. Further, 8E xe ((t-h)/e),Ye((t-h)/e) (tp' ( Xe G ) ), a ( Xe ( ), Ye () ) ) = 8E xe ((t-h)/e),Ye((t-h)/e) (tp' ( Xe ( t  h ) ), a ( Xe ( t  h ), Ye () ) ) + O(eVh) =eExy ( '(x), a ( x'Ye ( h ))) +o(h). , e x=x«t-h)/e),y=y«t-h)/e) Since J a (x, z)Px(dz) = 0, we get that 8Ex,y(tp'(x), a (x'Ye(  )) ) = 8Ex,y(tp'(x), a (x' Ye (  ))) + Ex,y T II (Q(Ye(s), dz), xe(s) - x) px ( T  s , Z, dz 1 ) X (tp'(x), a (x, ZI)) ds + o( Ex,y8 h/e IXe(s) - xI 2 dS) 
132 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER ( 2 h 2 ) = 0 ee- C ,h/8 + 8" + Ex,y l' II (Q(Y8(S),dZ), 1 s a (x 8 (U)'Y8(U))dU ) xPX C's ,Z,dZl)(tp'(X), a (x,zd)dS = Ex,y l' 1 s II(Q( Y8 (S),dz),a( X8 (U)' Y8 (U))dU)PX ( TS ,Z,dZl) x ('(x), a (x,zl))ds ( (h/e2 2 r; ) + 0 10 s Jes e- h / 82 +S/ 8 ds + o(h) = Ex,y l' 1 s II(Q( Y8 (S),dz),a(X' Y8 (U))dU)PX ( TS ,z,dzl) x ('(x), a (x,zl))ds ( ( {T{S { h es } ) h5/2 ) + 0 10 10 vueexp - c, 2 ds + 82 + o(h) = l' 1 s IIII Px (  ,y,dZ2)(Q(Z3,dz),a(X,Z2)) ( s - u ) xPX e ,z2,dz 3 x px ( T  S , z, d z 1 ) ( tp' ( X ), a (x, z d) dud s + 0 ( h ) = l' lS IIII PX(dZ2)(Q(Z3,dz),a(X,Z2))PX C  U ,Z2,dZ 3 ) ( ! - s ) xpX" e ,z,dz l ('(x), a (x,zl))duds+o(h) + o(l' 1 s exp { - c, u +; - S } dUdS) = o(h) + 0(e 2 ) + l' 1 s IIII pAdz2)(Q(Z3,dz),a(x,z2))Px(dz3) x Px ( T s ,Z,dZl)(tp'(X), a (x,zd)dUdS + 0(1' !os e- c ,(,-U)/8 dUdS) = o(h) + 0(e 2 ) + 0 + 0(e 2 ) = o(h). 
This proves that $o(h)$ stands on the right-hand side of (73). It remains to use Theorem 1 and the remark after it. □

EXAMPLE. Let $dx(t)/dt = a(x(t), y(t))$, where $y(t)$ is a jump process with a finite set of states denoted by $1,\dots,m$ and $(x(t); y(t))$ is a homogeneous Markov process. In this case $Q_x$ is given by a matrix
\[ Q_x = (q_{ij}(x))_{i,j=1,\dots,m}, \qquad q_{ii}(x) \le 0, \quad q_{ij}(x) \ge 0,\ i \ne j, \qquad \sum_{j=1}^m q_{ij}(x) = 0. \]
It will be assumed that the functions $q_{ij}(x)$ are twice continuously differentiable. For condition 9) to hold it suffices that for every $r > 0$ there exist an $l$ such that
\[ \min_{1\le i,j\le m} \pi^{(l)}_{ij}(x) > 0 \]
for $|x| \le r$, where the $\pi^{(l)}_{ij}(x)$ are the elements of the matrix $[\Pi(x)]^{l}$, and $\pi_{ij}(x) = \pi^{(1)}_{ij}(x) = -q_{ij}(x)/q_{ii}(x)$ for $j \ne i$ and $\pi_{ii}(x) = 0$ if $q_{ii}(x) \ne 0$, but if $q_{ii}(x) = 0$, then $\pi_{ii}(x) = 1$ and $\pi_{ij}(x) = 0$ for $i \ne j$. In this case the distribution $\rho_x(dz)$ is given by the tuple $p_1(x),\dots,p_m(x)$ that is the unique solution of the system
\[ -q_{ii}(x)p_i(x) = \sum_{j\ne i} q_{ji}(x)p_j(x), \qquad \sum_i p_i(x) = 1, \]
and $p_1(x),\dots,p_m(x)$ are also twice continuously differentiable functions; the function $a(x,y)$ is given by the tuple of functions $a_1(x),\dots,a_m(x)$. It is assumed that these functions are twice continuously differentiable and that $\sum_i a_i(x)p_i(x) = 0$. Denote by $r_{ij}(x)$ the unique solution of the system of equations
\[ \sum_k q_{ik}(x)\,r_{kj}(x) = p_j(x) - \delta_{ij}, \qquad \sum_k p_k(x)\,r_{kj}(x) = 0. \]
Suppose now that $(x_\varepsilon(t); y_\varepsilon(t))$ is a homogeneous Markov process in $R^d \times \{1,\dots,m\}$ such that
\[ \lim_{t\to 0} E_{x,y}\frac{1}{t}\bigl(\varphi(x_\varepsilon(t), y_\varepsilon(t)) - \varphi(x,y)\bigr) = (\varphi'_x(x,y), a_y(x)) + \frac{1}{\varepsilon}\sum_{j=1}^m q_{yj}(x)\,\varphi(x,j). \]
Then the process $x_\varepsilon(t/\varepsilon)$ converges weakly in distribution to the diffusion process in $R^d$ satisfying (70), where
\[ (\hat B^2(x)v, v) = 2\sum_{i,j}(a_i(x), v)(a_j(x), v)\,p_i(x)\,r_{ij}(x), \]
\[ \hat a(x) = \sum_{i,j} p_i(x)\,r_{ij}(x)\Bigl[a_j'(x)a_i(x) + \sum_{k,l}(q'_{jk}(x), a_i(x))\,r_{kl}(x)\,a_l(x)\Bigr]. \]
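The quantities in the example above can be computed directly for a small chain. The following is a minimal sketch, assuming a scalar slow variable ($d = 1$), rates that do not depend on $x$, and an irreducible chain; the function names, the two-state rate matrix, and the velocities are hypothetical choices made here for illustration. It uses the standard identity $R = (\Pi - Q)^{-1} - \Pi$, where every row of $\Pi$ equals the stationary vector $p$; this is one way to solve the system for $r_{ij}$, not necessarily the one the author had in mind.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(v) for v in A[i]] + [float(v) for v in B[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def stationary(Q):
    """Row vector p with p Q = 0 and sum(p) = 1 (chain assumed irreducible)."""
    m = len(Q)
    A = [[Q[j][i] for j in range(m)] for i in range(m)]  # transpose of Q
    A[-1] = [1.0] * m  # replace one redundant equation by sum(p) = 1
    b = [[0.0] for _ in range(m)]
    b[-1] = [1.0]
    return [row[0] for row in solve(A, b)]


def potential(Q, p):
    """r_ij = int_0^inf (P_ij(t) - p_j) dt, computed as (Pi - Q)^{-1} - Pi."""
    m = len(Q)
    pi_minus_q = [[p[j] - Q[i][j] for j in range(m)] for i in range(m)]
    identity = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    inv = solve(pi_minus_q, identity)
    return [[inv[i][j] - p[j] for j in range(m)] for i in range(m)]


def diffusion_coefficient(a, p, r):
    """Scalar case d = 1: B^2 = 2 * sum_{i,j} a_i a_j p_i r_ij."""
    m = len(a)
    return 2.0 * sum(a[i] * a[j] * p[i] * r[i][j]
                     for i in range(m) for j in range(m))


Q = [[-2.0, 2.0], [1.0, -1.0]]  # hypothetical switching rates
a = [2.0, -1.0]                 # hypothetical velocities, chosen so sum_i a_i p_i = 0
p = stationary(Q)
r = potential(Q, p)
B2 = diffusion_coefficient(a, p, r)
print(p, r, B2)
```

For a two-state chain with rates $\alpha$ (from 1 to 2) and $\beta$ (from 2 to 1) the potential has the closed form $R = (I - \Pi)/(\alpha + \beta)$, which gives a convenient hand check of the numerical output.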
3. Averaging over variables for systems of stochastic differential equations

We consider systems of stochastic differential equations containing rapidly varying components, and we find conditions under which the influence of these components on the remaining ones is averaged in such a way that in the limit the nonrapid components satisfy a certain stochastic equation with "averaged" coefficients.

Let $X$ and $Y$ be two finite-dimensional Euclidean spaces, with $X$ the phase space of the nonrapid components and $Y$ the phase space of the rapid components. Consider the system
\[ \begin{aligned} dx_\varepsilon(t) &= a(x_\varepsilon(t), y_\varepsilon(t))\,dt + B(x_\varepsilon(t), y_\varepsilon(t))\,dw(t), \\ dy_\varepsilon(t) &= \frac{1}{\varepsilon}\,a_1(x_\varepsilon(t), y_\varepsilon(t))\,dt + \frac{1}{\sqrt{\varepsilon}}\,B_1(x_\varepsilon(t), y_\varepsilon(t))\,dw_1(t), \end{aligned} \tag{74} \]
where $x_\varepsilon(t) \in X$, $y_\varepsilon(t) \in Y$, $a$, $a_1$, $B$, and $B_1$ are functions from $X \times Y$ to $X$, $Y$, $L(X)$, and $L(Y)$, respectively, $w(t)$ is a Wiener process in $X$, $w_1(t)$ is a Wiener process in $Y$, and the pair $(w(t), w_1(t))$ is a process with independent increments in $X \times Y$. Along with system (74) we consider the system with $B_1 = 0$:
\[ \begin{aligned} dx_\varepsilon(t) &= a(x_\varepsilon(t), y_\varepsilon(t))\,dt + B(x_\varepsilon(t), y_\varepsilon(t))\,dw(t), \\ \frac{dy_\varepsilon(t)}{dt} &= \frac{1}{\varepsilon}\,a_1(x_\varepsilon(t), y_\varepsilon(t)). \end{aligned} \tag{74'} \]
The symbol $E_{x,y}$ ($E_x$, $E_y$) will always denote the expectation under the assumption that the solution (it can be denoted differently) satisfies the initial condition $(x(0), y(0)) = (x, y)$ ($x(0) = x$, $y(0) = y$).

We are interested in the question of when $x_\varepsilon(t)$ converges weakly in distribution as $\varepsilon \to 0$ to a solution of an "autonomous" (not dependent on $y$) stochastic equation of the form
\[ dx(t) = \bar a(x(t))\,dt + \bar B(x(t))\,dw(t). \tag{75} \]
Special attention is given to dynamical systems that are subject to the action of rapidly varying perturbations (they are described by system (74) with $B = 0$). The case of greatest interest is that when $\bar a(x) = 0$ in (75) (the neutral case). Here $\bar B = 0$ automatically. Then the nontrivial limit of $x_\varepsilon(t/\varepsilon)$ is now a solution of (75) with $\bar B \ne 0$.

3.1. A general theorem on averaging.
We first consider the simple case when $a_1$ and $B_1$ do not depend on $x$. Then the process $y_\varepsilon(t)$ is a solution of a stochastic equation. If this equation has a weakly unique solution, then $y_\varepsilon(t)$ is a homogeneous Markov process, and the distribution of $y_\varepsilon(\varepsilon t)$
coincides with the distribution of a process $y(t)$ that is a solution of the stochastic equation
\[ dy(t) = a_1(y(t))\,dt + B_1(y(t))\,dw(t). \tag{76} \]
If $\tilde y(t) = y_\varepsilon(\varepsilon t)$, then $\tilde y$ depends on $\varepsilon$, but its distribution coincides with that of $y(t)$ and does not depend on $\varepsilon$. To investigate the asymptotic behavior of $x_\varepsilon(t)$ we use, as before, Theorem 1 and the remark after it. The following condition is assumed in this section:

1) The functions $a$, $a_1$, $B$, and $B_1$ are jointly continuous in their variables, and the system (74) has a weakly unique solution (consequently, the solution of this system is a homogeneous Markov process).

Since for $t_1 < t_2 < \dots < t_k < t < t + h$, $\Phi \in C_{X^k}$, and $\varphi \in C_X$
\[ E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))[\varphi(x_\varepsilon(t+h)) - \varphi(x_\varepsilon(t))] = E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))\,E_{x_\varepsilon(t),y_\varepsilon(t)}[\varphi(x_\varepsilon(h)) - \varphi(x_\varepsilon(t))], \tag{77} \]
to use Theorem 1 and the remark after it we must consider the asymptotic behavior of the expression $E_{x,y}[\varphi(x_\varepsilon(h)) - \varphi(x)]$. If $\varphi \in C_X^{(2)}$ is compactly supported, then
\[ E_{x,y}[\varphi(x_\varepsilon(h)) - \varphi(x)] = E_{x,y}\int_0^h \Bigl[\bigl(\varphi'(x_\varepsilon(s)), a(x_\varepsilon(s), \tilde y(s/\varepsilon))\bigr) + \tfrac12 \operatorname{tr} \varphi''(x_\varepsilon(s))\,B(x_\varepsilon(s), \tilde y(s/\varepsilon))\,B^*(x_\varepsilon(s), \tilde y(s/\varepsilon))\Bigr]\,ds. \tag{78} \]
It will be assumed that $h \to 0$ as $\varepsilon \to 0$. The first natural assumption is that $x_\varepsilon(s)$ is stochastically continuous uniformly with respect to $\varepsilon$, and the expression
\[ L_{x,y}\varphi(x) = (\varphi'(x), a(x,y)) + \tfrac12 \operatorname{tr} \varphi''(x)\,B(x,y)\,B^*(x,y) \tag{79} \]
is continuous in $x$ uniformly with respect to $y$. Then
\[ E_{x,y}[\varphi(x_\varepsilon(h)) - \varphi(x)] = E_{x,y}\int_0^h L_{x,\tilde y(s/\varepsilon)}\varphi(x)\,ds + o(h), \tag{80} \]
where $o(h)$ is uniform with respect to $x$, if for all $\delta > 0$ and $r > 0$
\[ \lim_{h\to 0}\ \sup_{s\le h}\ \sup_{|x|\le r,\ y} E_{x,y} I_{\{|x_\varepsilon(s) - x| > \delta\}} = 0. \]
The main term on the right-hand side of (80) is transformed as follows:
\[ h\,E_{x,y}\Bigl(\frac{\varepsilon}{h}\int_0^{h/\varepsilon} \Psi(x, \tilde y(s))\,ds\Bigr), \tag{81} \]
where $\Psi(x,y) = L_{x,y}\varphi(x)$. Hence,
\[ E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))[\varphi(x_\varepsilon(t+h)) - \varphi(x_\varepsilon(t))] = h\,E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))\,\frac{\varepsilon}{h}\int_{t/\varepsilon}^{t/\varepsilon + h/\varepsilon} \Psi(x_\varepsilon(t), \tilde y(s))\,ds + o(h). \]
Suppose now that $h$ is connected with $\varepsilon$ in such a way that $h/\varepsilon \to \infty$ as $\varepsilon \to 0$ (but $h \to 0$ together with $\varepsilon$). If $x_\varepsilon(t)$ is bounded in probability, $\Psi(x,y)$ is a bounded function, and for all $x$ the limit
\[ \lim_{T\to\infty} \frac{1}{T}\int_s^{s+T} \Psi(x, \tilde y(u))\,du = g(x) \tag{82} \]
for the means exists uniformly with respect to $s$, then
\[ E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))[\varphi(x_\varepsilon(t+h)) - \varphi(x_\varepsilon(t))] = h\,E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))\,g(x_\varepsilon(t)) + o(h). \]
The remark after Theorem 1 can now be used. We impose on $a$ and $B$ the following condition:

2) $\displaystyle \sup_{x,y} \frac{|a(x,y)| + \|B(x,y)\|}{1 + |x|} < \infty.$

LEMMA 8. If conditions 1) and 2) hold, then: a) there exists a $k$ such that
\[ E_{x,y}|x_\varepsilon(t) - x|^2 \le kt(1 + |x|^2)e^{kt}; \tag{83} \]
and b) for every compactly supported function $\varphi \in C_X^{(2)}$ the function $L_{x,y}\varphi(x)$ is continuous and bounded.

PROOF. On the basis of the Itô formula,
\[ E_{x,y}|x_\varepsilon(t) - x|^2 = E_{x,y}\int_0^t \bigl[2(x_\varepsilon(s) - x, a(x_\varepsilon(s), y_\varepsilon(s))) + \operatorname{tr} B(x_\varepsilon(s), y_\varepsilon(s))\,B^*(x_\varepsilon(s), y_\varepsilon(s))\bigr]\,ds \le c\,E_{x,y}\int_0^t (1 + |x_\varepsilon(s)|^2)\,ds \tag{84} \]
(we have used condition 2), and $c$ is some constant). This inequality implies that for some $c_1$
\[ E_{x,y}(1 + |x_\varepsilon(t)|^2) \le c_1\Bigl(1 + |x|^2 + E_{x,y}\int_0^t (1 + |x_\varepsilon(s)|^2)\,ds\Bigr). \]
From this,
\[ E_{x,y}(1 + |x_\varepsilon(t)|^2) \le c_1(1 + |x|^2)e^{c_1 t}. \]
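The passage from the integral inequality to the exponential bound is an application of Gronwall's lemma. The step can be sketched as follows, with $u(t)$ and $A$ as shorthand introduced here only for this remark (they are not the author's notation). Set $u(t) = E_{x,y}(1 + |x_\varepsilon(t)|^2)$ and $A = 1 + |x|^2$. Then
\[ u(t) \le c_1 A + c_1\int_0^t u(s)\,ds \quad\Longrightarrow\quad u(t) \le c_1 A\,e^{c_1 t}, \]
and inserting this into the right-hand side of (84) gives
\[ E_{x,y}|x_\varepsilon(t) - x|^2 \le c\int_0^t u(s)\,ds \le c\,A\,(e^{c_1 t} - 1) \le c\,c_1\,A\,t\,e^{c_1 t}, \]
where the last inequality uses $e^{v} - 1 \le v e^{v}$ for $v \ge 0$. This has the form (83) with, for instance, $k = \max(c\,c_1, c_1)$.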
Substituting this estimate into the right-hand side of (84), we get (83) for some $k$. Assertion b) follows from the fact that $\varphi$ is compactly supported and from conditions 1) and 2). □

COROLLARY. For any $r > 0$ the processes $x_\varepsilon(t)$ are locally bounded functions with probability 1, uniformly with respect to $\varepsilon > 0$ and $|x| \le r$, i.e., for all $T > 0$
\[ \lim_{c\to\infty}\ \sup_{\varepsilon > 0,\ |x|\le r,\ y} P_{x,y}\Bigl\{\sup_{t\le T}|x_\varepsilon(t)| > c\Bigr\} = 0. \]
Indeed, since
\[ \sup_{t\le T}|x_\varepsilon(t)| \le \int_0^T |a(x_\varepsilon(t), y_\varepsilon(t))|\,dt + \sup_{t\le T}\Bigl|\int_0^t B(x_\varepsilon(s), y_\varepsilon(s))\,dw(s)\Bigr|, \]
it follows that
\[ P_{x,y}\Bigl\{\sup_{t\le T}|x_\varepsilon(t)| > c\Bigr\} \le \frac{2}{c}\,E_x\int_0^T |a(x_\varepsilon(t), y_\varepsilon(t))|\,dt + \frac{16}{c^2}\,E_x\int_0^T \operatorname{tr} B(x_\varepsilon(s), y_\varepsilon(s))\,B^*(x_\varepsilon(s), y_\varepsilon(s))\,ds = \frac{1}{c}\,O\Bigl(\int_0^T E(1 + |x_\varepsilon(s)|^2)\,ds\Bigr). \]
We have used martingale inequalities (Gikhman-Skorokhod [2], Chapter 1, §2); therefore, in view of Lemma 8
\[ P_{x,y}\Bigl\{\sup_{t\le T}|x_\varepsilon(t)| > c\Bigr\} \le \frac{k(T)(1 + |x|^2)}{c}, \]
where $k(T)$ is a constant.

LEMMA 9. Let $y(s)$ be a random process such that for a given function $f$ there is an $a$ for which
\[ \lim_{T\to\infty} E\Bigl|\frac{1}{T}\int_0^T f(y(s))\,ds - a\Bigr| = 0. \]
Then there is an $h(T)$ such that $h(T) \downarrow 0$, $Th(T) \to \infty$, and
\[ \lim_{T\to\infty} E\Bigl|\frac{1}{Th(T)}\int_T^{T + Th(T)} f(y(s))\,ds - a\Bigr| = 0. \]

PROOF. Let
\[ \sup_{u\ge T} E\Bigl|\frac{1}{u}\int_0^u f(y(s))\,ds - a\Bigr| = \alpha(T). \]
Then, writing $h = h(T)$,
\[ E\Bigl|\frac{1}{Th}\int_T^{T+Th} f(y(s))\,ds - a\Bigr| = E\Bigl|\frac{1}{Th}\int_0^{T+Th} f(y(s))\,ds - \frac{1}{Th}\int_0^T f(y(s))\,ds - a\Bigr| \]
\[ = E\Bigl|\frac{T+Th}{Th}\cdot\frac{1}{T+Th}\int_0^{T+Th} f(y(s))\,ds - \frac{T}{Th}\cdot\frac{1}{T}\int_0^T f(y(s))\,ds - a\Bigr| \le \frac{T+Th}{Th}\,\alpha(T+Th) + \frac{T}{Th}\,\alpha(T) \le \frac{3}{h(T)}\,\alpha(T) \]
(we choose $h(T) \le 1$ and use the fact that $\alpha(T)$ is monotonically decreasing). For the statement of the lemma to hold it suffices that
\[ \lim_{T\to\infty} \alpha(T)\,h^{-1}(T) = 0. \qquad □ \]

REMARK. If $\alpha(T) = O(1/T)$, then the assertion of the lemma holds for any function $h(T)$ such that $Th(T) \to \infty$.

We can now prove the following theorem.

THEOREM 11. For the system (74) suppose that $a_1(x,y) = a_1(y)$, $B_1(x,y) = B_1(y)$, conditions 1) and 2) hold, and a solution of (76) is ergodic: for any initial value $y(0) = y$ and $f \in C_Y$
\[ \lim_{T\to\infty} \frac{1}{T}\int_0^T f(y(s))\,ds = \int f(y)\,\rho(dy) \]
with probability 1, where $\rho(dy)$ is a probability measure on $Y$ (an ergodic distribution). Let
\[ \bar a(x) = \int a(x,y)\,\rho(dy), \qquad \bar B^2(x) = \int B(x,y)\,B^*(x,y)\,\rho(dy) \]
and suppose that these functions are such that the solution of (75) is weakly unique. Then the process $x_\varepsilon(t)$, where $(x_\varepsilon(t), y_\varepsilon(t))$ is the solution of (74) with initial values $x_\varepsilon(0) = x(0)$, $y_\varepsilon(0) = y(0)$ (independent of $\varepsilon$ and of the processes $w(t)$ and $w_1(t)$), converges in distribution to the process $x(t)$ that is the solution of (75) with the same initial value $x(0)$.

PROOF. It follows from Lemma 9 that for each compactly supported function $\varphi \in C_X^{(2)}$ there exists $h(\varepsilon) \to 0$ as $\varepsilon \to 0$ such that $h(\varepsilon)/\varepsilon \to \infty$
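and
\[ \lim_{\varepsilon\to 0} \frac{\varepsilon}{h(\varepsilon)}\int_{t/\varepsilon}^{t/\varepsilon + h(\varepsilon)/\varepsilon} L_{x,\tilde y(s)}\varphi(x)\,ds = \int L_{x,y}\varphi(x)\,\rho(dy) = \int \bigl[(a(x,y), \varphi'(x)) + \tfrac12 \operatorname{tr} \varphi''(x)\,B(x,y)\,B^*(x,y)\bigr]\,\rho(dy) = (\bar a(x), \varphi'(x)) + \tfrac12 \operatorname{tr} \varphi''(x)\,\bar B^2(x). \tag{85} \]
It is easy to see from the continuity of $a(x,y)$ and $B(x,y)$ and from condition 2) that
\[ \lim_{T\to\infty} E\Bigl[\Bigl|\frac{1}{T}\int_0^T a(x, y(s))\,ds - \bar a(x)\Bigr| + \Bigl\|\frac{1}{T}\int_0^T B(x, y(s))\,B^*(x, y(s))\,ds - \bar B^2(x)\Bigr\|\Bigr] = 0 \]
locally uniformly with respect to $x$. Hence, $h(\varepsilon)$ can be chosen so that (85) holds uniformly with respect to $x$ in each finite region. Then for $0 < t_1 < \dots < t_k < t$
\[ E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))[\varphi(x_\varepsilon(t + h(\varepsilon))) - \varphi(x_\varepsilon(t)) - h(\varepsilon)L\varphi(x_\varepsilon(t))] = o(h(\varepsilon)) \]
uniformly with respect to $t_1 < \dots < t_k < t$ in each finite region if $\Phi$ is bounded and $\varphi$ is a compactly supported function in $C_X^{(2)}$, where
\[ L\varphi = (\bar a(x), \varphi'(x)) + \tfrac12 \operatorname{tr} \varphi''(x)\,\bar B^2(x). \]
It remains to use Theorem 1 and the remark after it. □

We now consider the system (74) in the general case. It is natural to expect that on small intervals of length $h \to 0$, where $x_\varepsilon(t)$ differs little from the initial value, $y_\varepsilon(t)$ will differ little from the solution of the equation
\[ d\tilde y_\varepsilon(t) = \frac{1}{\varepsilon}\,a_1(x, \tilde y_\varepsilon(t))\,dt + \frac{1}{\sqrt{\varepsilon}}\,B_1(x, \tilde y_\varepsilon(t))\,dw_1(t), \tag{86} \]
where $x$ is its initial value. Equation (86), regarded for fixed $x$, does not depend on the first equation in (74), and since $\tilde y_\varepsilon(t)$ is close to $y_\varepsilon(t)$, we can substitute it in the first equation in place of $y_\varepsilon(t)$. The solution obtained for the equation
\[ d\tilde x_\varepsilon(t) = a(\tilde x_\varepsilon(t), \tilde y_\varepsilon(t))\,dt + B(\tilde x_\varepsilon(t), \tilde y_\varepsilon(t))\,dw(t) \]
is also close to $x_\varepsilon(t)$. If the process $\tilde y_\varepsilon(t)$ is ergodic (for all $x$) with ergodic distribution $\rho_x(dy)$, then Theorem 11 gives a basis for expecting that $x_\varepsilon(t)$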
will converge in distribution to a solution of (75) with coefficients
\[ \bar a(x) = \int a(x,y)\,\rho_x(dy), \qquad \bar B^2(x) = \int B(x,y)\,B^*(x,y)\,\rho_x(dy). \tag{87} \]
We find conditions under which these arguments are justified. Consider the question of closeness of the distributions of $y_\varepsilon(t)$ and $\tilde y_\varepsilon(t)$. For this we observe that the distribution of $\tilde y_\varepsilon(t)$ coincides with the distribution of $y^x(t/\varepsilon)$, where $y^x(t)$ is a solution of the equation
\[ dy^x(t) = a_1(x, y^x(t))\,dt + B_1(x, y^x(t))\,dw(t). \tag{88} \]
It will be assumed that the following condition holds for the coefficients $a_1(x,y)$ and $B_1(x,y)$:

3) For all $x \in X$ equation (88) has a weakly unique solution, and for all $r > 0$
\[ \sup_{|x|\le r}\ \sup_y\ (1 + |y|)^{-1}\bigl(|a_1(x,y)| + \|B_1(x,y)\|\bigr) < \infty, \]
\[ \sup_{|y|\le r}\ \sup_{x,\tilde x}\ \bigl(|a_1(x,y) - a_1(\tilde x,y)| + \|B_1(x,y)B_1^*(x,y) - B_1(\tilde x,y)B_1^*(\tilde x,y)\|\bigr)\,|x - \tilde x|^{-1} < \infty. \]

LEMMA 10. Suppose that conditions 1)-3) hold, and $(x_\varepsilon(t), y_\varepsilon(t))$ is the solution of the system (74) with initial conditions $x_\varepsilon(0) = x$ and $y_\varepsilon(0) = y$. Then for every $r$ the family $\{y_\varepsilon(\varepsilon t),\ |x| \le r,\ |y| \le r\}$ of processes is compact in distribution, and $y_\varepsilon(\varepsilon t)$ converges in distribution to the process $y^x(t)$ that is the solution of (88) with initial condition $y^x(0) = y$, uniformly with respect to $|x| \le r$ and $|y| \le r$.

PROOF. Let $\tau_c = \inf\{t : |x_\varepsilon(t) - x| > c\}$. It follows from the corollary to Lemma 8 that for all $t > 0$
\[ \lim_{c\to\infty}\ \sup_{|x|\le r}\ \sup_y\ \sup_\varepsilon P_{x,y}\{\tau_c < t\} = 0. \]
To prove the compactness in distribution of the processes $y_\varepsilon(\varepsilon s)$ on $[0, T]$ it suffices to prove that they are compact on $[0, T \wedge \tau_c]$ for all $c > 0$. It follows from condition 3) that for some $l$ (it depends on $T$, $r$, and $c$)
\[ E_{x,y}|y_\varepsilon(\varepsilon t) - y|^2 I_{\{\tau_c > t\}} \le l\Bigl(\int_0^t E_{x,y}|y_\varepsilon(\varepsilon s) - y|^2 I_{\{\tau_c > s\}}\,ds + t\Bigr) \]
for $|x| \le r$, and this implies that
\[ E_{x,y}|y_\varepsilon(\varepsilon t) - y|^2 I_{\{\tau_c > t\}} \le l\,t\,e^{lt}. \]
This inequality gives us that the processes $y_\varepsilon(\varepsilon(t \wedge \tau_c))$ are compact, and hence so are the $y_\varepsilon(\varepsilon t)$.
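To prove convergence in distribution of $y_\varepsilon(\varepsilon t)$ to $y^x(t)$ we use Theorem 1. For $\psi \in C_Y^{(2)}$ let
\[ L^{(1)}_{x,y}\psi = (\psi'(y), a_1(x,y)) + \tfrac12 \operatorname{tr} \psi''(y)\,B_1(x,y)\,B_1^*(x,y). \]
For a compactly supported $\psi \in C_Y^{(2)}$, a bounded continuous function $\Psi(y_1,\dots,y_k)$, and $0 < t_1 < \dots < t_k < t < t + h$ we have that
\[ E\Psi(y_\varepsilon(\varepsilon t_1),\dots,y_\varepsilon(\varepsilon t_k))[\psi(y_\varepsilon(\varepsilon t + \varepsilon h)) - \psi(y_\varepsilon(\varepsilon t))] \]
\[ = E\Psi(y_\varepsilon(\varepsilon t_1),\dots,y_\varepsilon(\varepsilon t_k))\int_{\varepsilon t}^{\varepsilon t + \varepsilon h} \frac{1}{\varepsilon}\Bigl[(\psi'(y_\varepsilon(u)), a_1(x_\varepsilon(u), y_\varepsilon(u))) + \tfrac12 \operatorname{tr} \psi''(y_\varepsilon(u))\,B_1(x_\varepsilon(u), y_\varepsilon(u))\,B_1^*(x_\varepsilon(u), y_\varepsilon(u))\Bigr]\,du \]
\[ = E\Psi(y_\varepsilon(\varepsilon t_1),\dots,y_\varepsilon(\varepsilon t_k))\,\frac{1}{\varepsilon}\int_{\varepsilon t}^{\varepsilon t + \varepsilon h} L^{(1)}_{x_\varepsilon(u),y_\varepsilon(u)}\psi\,du \]
\[ = E\Psi(y_\varepsilon(\varepsilon t_1),\dots,y_\varepsilon(\varepsilon t_k))\int_t^{t+h} L^{(1)}_{x,y_\varepsilon(\varepsilon u)}\psi\,du + O\Bigl(\int_{\varepsilon t}^{\varepsilon t + \varepsilon h} E_{x,y}\frac{1}{\varepsilon}\bigl|L^{(1)}_{x_\varepsilon(u),y_\varepsilon(u)}\psi - L^{(1)}_{x,y_\varepsilon(u)}\psi\bigr|\,du\Bigr) \]
\[ = E\Psi(y_\varepsilon(\varepsilon t_1),\dots,y_\varepsilon(\varepsilon t_k))\int_t^{t+h} L^{(1)}_{x,y_\varepsilon(\varepsilon u)}\psi\,du + O\Bigl(\frac{1}{\varepsilon}\int_{\varepsilon t}^{\varepsilon t + \varepsilon h} E_{x,y}|x_\varepsilon(u) - x|^2\,du\Bigr). \]
The last term is $o(h)$ in view of Lemma 8. The lemma follows from Theorem 1. □

COROLLARY. Let $g(x,y) \in C_{X\times Y}$. Then for every $t$ and $r > 0$
\[ \lim_{\varepsilon\to 0}\ \sup_{|x|\le r,\ |y|\le r} \Bigl|E\int_0^t g(x, y_\varepsilon(\varepsilon s))\,ds - E\int_0^t g(x, y^x(s))\,ds\Bigr| = 0. \]
This assertion follows from the uniform weak convergence of the corresponding distributions.

We next require the condition of locally uniform ergodicity for the process $y^x(t)$:

4) For all $x \in X$ the process $y^x(t)$ is ergodic with ergodic distribution $\rho_x(dz)$, and for all $f \in C_Y$ and all $r > 0$
\[ \lim_{T\to\infty}\ \sup_{|x|\le r,\ |y|\le r} E_{x,y}\Bigl|\frac{1}{T}\int_0^T f(y^x(t))\,dt - \int f(z)\,\rho_x(dz)\Bigr| = 0. \tag{89} \]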
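Condition 4) can be observed numerically on a simple frozen fast equation. The following is a minimal sketch, assuming the hypothetical coefficients $a_1(x,y) = x - y$ and $B_1(x,y) = \sigma$ in (88), for which the ergodic distribution $\rho_x$ is Gaussian with mean $x$ and variance $\sigma^2/2$; the function name, step size, horizon, and seed are illustrative choices made here, and the Euler-Maruyama scheme only approximates the time average in (89).

```python
import math
import random


def time_average(f, x, sigma=1.0, n_steps=200_000, dt=0.01, y0=0.0, seed=0):
    """Approximate (1/T) * int_0^T f(y(t)) dt, T = n_steps * dt, for the
    frozen fast equation dy = (x - y) dt + sigma dw (hypothetical a_1, B_1)."""
    rng = random.Random(seed)
    sq = math.sqrt(dt)
    y = y0
    total = 0.0
    for _ in range(n_steps):
        total += f(y) * dt
        # one Euler-Maruyama step with the slow variable x held fixed
        y += (x - y) * dt + sigma * sq * rng.gauss(0.0, 1.0)
    return total / (n_steps * dt)


x = 0.5
sigma = 1.0
avg_first = time_average(lambda y: y, x, sigma)
avg_second = time_average(lambda y: y * y, x, sigma)
# Ergodic means under rho_x = N(x, sigma^2 / 2)
exact_first = x
exact_second = x * x + sigma * sigma / 2.0
print(avg_first, exact_first)
print(avg_second, exact_second)
```

Repeating the run for several values of $x$ and $y_0$ in a bounded set gives a rough impression of the uniformity required in (89); the sketch does not verify the condition, it only illustrates it for one fast equation.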
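If this condition holds, then for any family $\{f_\alpha(y)\}$, $\sup_{\alpha,y}|f_\alpha(y)| < \infty$, $f_\alpha \in C_Y$, such that $\{f_\alpha(y)e^{-|y|}\}$ is compact in $C_Y$ the equality (89) holds for $f_\alpha(y)$ uniformly with respect to $\alpha$.

We now prove the main theorem of this part.

THEOREM 12. Assume conditions 1)-4) and suppose that, for any initial conditions $x_\varepsilon(0) = x$ and $y_\varepsilon(0) = y$, the process $y_\varepsilon(t)$ is bounded in probability uniformly with respect to $\varepsilon$. Then the process $x_\varepsilon(t)$ converges in distribution to the process $x(t)$ that is the solution with initial condition $x(0) = x_\varepsilon(0) = x$ of equation (75) with coefficients defined by (87).

PROOF. For a bounded continuous function $\Phi(x_1,\dots,x_k)$ and a compactly supported $\varphi \in C_X^{(2)}$ we find for $t_1 < t_2 < \dots < t_k < t < t + h$ that
\[ E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))[\varphi(x_\varepsilon(t+h)) - \varphi(x_\varepsilon(t))] = E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))\,E_{x_\varepsilon(t),y_\varepsilon(t)}\int_0^h L_{x_\varepsilon(t),y_\varepsilon(u)}\varphi(x_\varepsilon(t))\,du + o(h). \]
Let
\[ f_{h,\varepsilon}(x,y) = E_{x,y}\int_0^h L_{x,y_\varepsilon(u)}\varphi(x)\,du = \varepsilon\,E_{x,y}\int_0^{h/\varepsilon} L_{x,y_\varepsilon(\varepsilon u)}\varphi(x)\,du. \]
On the basis of the corollary to Lemma 10, for all $t > 0$ and $r > 0$
\[ \lim_{\varepsilon\to 0}\ \sup_{|x|\le r,\ |y|\le r} \Bigl|E_{x,y}\int_0^t L_{x,y_\varepsilon(\varepsilon u)}\varphi(x)\,du - E_y\int_0^t L_{x,y^x(s)}\varphi(x)\,ds\Bigr| = 0. \]
Therefore, $r_\varepsilon \to \infty$ and $t_\varepsilon \to \infty$ can be chosen so that
\[ \lim_{\varepsilon\to 0}\ \sup_{|x|\le r_\varepsilon,\ |y|\le r_\varepsilon} \Bigl|E_{x,y}\int_0^{t_\varepsilon} L_{x,y_\varepsilon(\varepsilon u)}\varphi(x)\,du - E_y\int_0^{t_\varepsilon} L_{x,y^x(s)}\varphi(x)\,ds\Bigr| = 0. \]
On the other hand, it follows from condition 4) that we can choose $r_T \to \infty$ so that
\[ \lim_{T\to\infty}\ \sup_{|x|\le r_T,\ |y|\le r_T} \Bigl|E_y\frac{1}{T}\int_0^T L_{x,y^x(s)}\varphi(x)\,ds - \int L_{x,z}\varphi(x)\,\rho_x(dz)\Bigr| = 0. \]
Choosing $t_\varepsilon \to \infty$ so that $\varepsilon t_\varepsilon \to 0$ ($t_\varepsilon$ and $r_\varepsilon$ can be chosen to be arbitrarily slowly increasing), and $r_\varepsilon$ so that $r_\varepsilon \le r_{t_\varepsilon}$, we have that
\[ \lim_{\varepsilon\to 0}\ \sup_{|x|\le r_\varepsilon,\ |y|\le r_\varepsilon} \frac{1}{t_\varepsilon}\Bigl|E_y\int_0^{t_\varepsilon} L_{x,y^x(s)}\varphi(x)\,ds - t_\varepsilon\int L_{x,z}\varphi(x)\,\rho_x(dz)\Bigr| = 0. \]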
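Now let $t_\varepsilon = h/\varepsilon$. Then
\[ f_{h,\varepsilon}(x,y) = \varepsilon\,E_{x,y}\int_0^{t_\varepsilon} L_{x,y_\varepsilon(\varepsilon u)}\varphi(x)\,du = h\int L_{x,z}\varphi(x)\,\rho_x(dz) \]
\[ \quad + h\Bigl[\frac{1}{t_\varepsilon}\Bigl(E_y\int_0^{t_\varepsilon} L_{x,y^x(s)}\varphi(x)\,ds - t_\varepsilon\int L_{x,z}\varphi(x)\,\rho_x(dz)\Bigr) + \frac{1}{t_\varepsilon}\Bigl(E_{x,y}\int_0^{t_\varepsilon} L_{x,y_\varepsilon(\varepsilon s)}\varphi(x)\,ds - E_y\int_0^{t_\varepsilon} L_{x,y^x(s)}\varphi(x)\,ds\Bigr)\Bigr] \]
\[ = h\,L_x\varphi(x) + h\,\theta_\varepsilon(x,y), \]
where $L_x\varphi = (\bar a(x), \varphi'(x)) + \tfrac12 \operatorname{tr} \varphi''(x)\,\bar B^2(x)$, and the $\theta_\varepsilon(x,y)$ are collectively bounded functions such that
\[ \lim_{\varepsilon\to 0}\ \sup_{|x|\le r_\varepsilon,\ |y|\le r_\varepsilon} |\theta_\varepsilon(x,y)| = 0. \]
Since $x_\varepsilon(t)$ and $y_\varepsilon(t)$ are bounded in probability,
\[ \lim_{\varepsilon\to 0} E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))\,\theta_\varepsilon(x_\varepsilon(t), y_\varepsilon(t)) = 0 \]
uniformly with respect to $t_1 < t_2 < \dots < t_k < t < T$ for any $T$. Hence, for the indicated choice of $h$,
\[ E\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))[\varphi(x_\varepsilon(t+h)) - \varphi(x_\varepsilon(t)) - h\,L_{x_\varepsilon(t)}\varphi(x_\varepsilon(t))] = o(h). \]
It remains to use Theorem 1 and the remark after it. □

REMARK. The following condition suffices for the process $y_\varepsilon(t)$ to be bounded in probability: Suppose that conditions 1)-3) hold and for every $r > 0$ there exist a $\lambda_r > 0$ and a twice continuously differentiable function $\psi_r(y)\colon Y \to R$, $\psi_r(y) \to +\infty$ as $|y| \to \infty$, such that
\[ \sup_{|x|\le r}\ \sup_y\ \bigl[L^{(1)}_{x,y}\psi_r(y) + \lambda_r\psi_r(y)\bigr] < \infty. \tag{90} \]
Indeed, if this condition holds and $\tau_r = \inf\{t : |x_\varepsilon(t)| > r\}$, then
\[ E\psi_r(y_\varepsilon(t \wedge \tau_r)) = E\psi_r(y_\varepsilon(0)) + E\int_0^{t\wedge\tau_r} \frac{1}{\varepsilon}\,L^{(1)}_{x_\varepsilon(s),y_\varepsilon(s)}\psi_r(y_\varepsilon(s))\,ds \le -\frac{\lambda_r}{\varepsilon}\int_0^t E\psi_r(y_\varepsilon(s \wedge \tau_r))\,ds + c_r, \]
where $c_r$ is a constant (we have used (90) and the boundedness of $\psi_r$ from below). It follows from the last inequality that $\sup_t E\psi_r(y_\varepsilon(t \wedge \tau_r)) \le q_r$, where $q_r < \infty$. Hence,
\[ P\{|y_\varepsilon(t)| > c\} \le P\{\tau_r \le t\} + q_r/\psi_r(c) \]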
and the expression on the right-hand side can be made arbitrarily small by suitably choosing $r$ and $c$.

EXAMPLE. Let $Y$ be a one-dimensional space, and let
\[
L^{(1)}_{x,y}\psi = a_1(x,y)\,\frac{d\psi}{dy}(y) + \frac12 B_1^2(x,y)\,\frac{d^2\psi}{dy^2}.
\]
Define
\[
V(x,y) = \exp\Bigl\{\int_0^y \frac{2a_1(x,z)}{B_1^2(x,z)}\,dz\Bigr\}.
\]
The condition
\[
\int_{-\infty}^{\infty} V(x,y)\,\frac{1}{B_1^2(x,y)}\,dy < \infty
\]
is a condition for the ergodicity of the process $y^x(t)$ (see §3 in Chapter I), and if condition 4) holds, then $\sup_{|x|\le r} c(x) < \infty$ for all $r > 0$, where
\[
c(x) = \int_{-\infty}^{\infty} V(x,y)\,\frac{1}{B_1^2(x,y)}\,dy.
\]
Here the ergodic distribution has density with respect to Lebesgue measure given by
\[
\rho(x,y) = \frac{1}{c(x)B_1^2(x,y)}\,V(x,y).
\]
Therefore,
\[
\bar a(x) = \int a(x,y)\,\frac{V(x,y)}{c(x)B_1^2(x,y)}\,dy, \qquad
\bar B^2(x) = \int B(x,y)B^*(x,y)\,\frac{V(x,y)}{c(x)B_1^2(x,y)}\,dy.
\]

3.2. A diffusion process under the influence of a rapid dynamical system in the presence of feedback. A diffusion process can be given by a stochastic differential equation in $X$. The influence of a dynamical system means that the coefficients of the equation depend also on the point $y \in Y$, where $Y$ is the phase space of the system, and the state $y(t)$ of the influencing system at time $t$ is substituted for $y$ in the equation; the fact that this is a dynamical system means that $y(t)$ is a solution of a first-order equation with coefficient depending on the state of the diffusion process $x(t)$ (feedback). Finally, the fact that the dynamical system is rapid means that the coefficient of the equation determining the system is proportional to $1/\varepsilon$ ($\varepsilon$ a small parameter). Thus, we shall consider the system (74').
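As a numerical illustration of the one-dimensional example that precedes subsection 3.2, the following Python sketch evaluates $V$, $c(x)$, the ergodic density, and the averaged drift $\bar a(x)$ by quadrature. The coefficients $a_1(x,y) = -(1+x^2)y$, $B_1 \equiv 1$, $a(x,y) = -x + y^2$, the truncation of the $y$-axis to a finite interval, and all function names are assumptions made only for this illustration. For this choice the ergodic law is Gaussian with variance $1/(2(1+x^2))$, so the computed value can be compared with $\bar a(x) = -x + 1/(2(1+x^2))$.

```python
import math

def make_grid(half_width, n_points):
    """Symmetric grid on [-half_width, half_width]; n_points must be odd so that 0 is a node."""
    if n_points % 2 == 0:
        raise ValueError("n_points must be odd")
    step = 2.0 * half_width / (n_points - 1)
    return [-half_width + i * step for i in range(n_points)]

def trapezoid(values, step):
    """Composite trapezoid rule on an equally spaced grid."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def invariant_density(a1, b1, x, grid):
    """Density V / (c * B1^2), with V = exp(int_0^y 2 a1 / B1^2 dz), for frozen x."""
    step = grid[1] - grid[0]
    mid = len(grid) // 2                       # index of the node y = 0
    h = [2.0 * a1(x, y) / b1(x, y) ** 2 for y in grid]
    log_v = [0.0] * len(grid)
    for i in range(mid + 1, len(grid)):        # integrate to the right of 0
        log_v[i] = log_v[i - 1] + 0.5 * step * (h[i - 1] + h[i])
    for i in range(mid - 1, -1, -1):           # integrate to the left of 0
        log_v[i] = log_v[i + 1] - 0.5 * step * (h[i] + h[i + 1])
    weights = [math.exp(lv) / b1(x, y) ** 2 for lv, y in zip(log_v, grid)]
    c = trapezoid(weights, step)               # normalising constant c(x)
    return [w / c for w in weights]

def averaged(coeff, density, x, grid):
    """Average of coeff(x, .) against the ergodic density."""
    step = grid[1] - grid[0]
    return trapezoid([coeff(x, y) * p for y, p in zip(grid, density)], step)
```

For instance, with `grid = make_grid(6.0, 2401)` and the assumed coefficients, `averaged(lambda x, y: -x + y * y, invariant_density(a1, b1, 1.0, grid), 1.0, grid)` should be close to $-0.75$. The truncation interval must be wide enough for the density to be negligible at its ends.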
The specific nature of this case lies in the facts that, first, in the ergodic case the sample path $y^x(t)$ is dense in the support of the measure $\rho_x(dy)$, and so there is no reason to expect that the process $y_\varepsilon(t)$ will be bounded in probability, and second, condition 4) on locally uniform ergodicity is also too restrictive, because for fixed $x$ the limit of the time averages exists only for almost all initial conditions $y(0) = y$ (with respect to the ergodic distribution). On
the other hand, those assertions not based on condition 4) are certainly valid. The following condition will be imposed on the function $a_1(x,y)$ when the system (74') is considered:

5) The derivative $a_1'(x,y) = \partial a_1(x,y)/\partial x$ exists and is continuous and bounded, and $a_1(x,y)$ satisfies a Lipschitz condition locally in $x$: for all $r$ there exists an $l_r$ such that for $|x| \le r$ and $|\tilde x| \le r$
\[
|a_1(x,y) - a_1(\tilde x,\tilde y)| \le l_r(|x - \tilde x| + |y - \tilde y|).
\]
To clarify the situation we consider first the case when $a_1(x,y) = a_1(y)$ does not depend on $x$. As in the preceding part, the formula
\[
E\,\Phi(x_\varepsilon(t_1),\dots,x_\varepsilon(t_k))\,[\varphi(x_\varepsilon(t+h)) - \varphi(x_\varepsilon(t)) - f_{h,\varepsilon}(x_\varepsilon(t),y_\varepsilon(t))] = o(h) \tag{91}
\]
remains valid, where
\[
f_{h,\varepsilon}(x,y) = \varepsilon\,E_{x,y}\int_0^{h/\varepsilon} L_{x,y_\varepsilon(\varepsilon u)}\varphi(x)\,du.
\]
If $y(t)$ is a solution of the equation $dy(t)/dt = a_1(y(t))$, then $y_\varepsilon(\varepsilon u) = y(u)$. Assume that there is a measure $\rho(dy)$ such that for $f \in C_Y$ and $\rho(dy)$-almost all $y(0) = y$
\[
\lim_{T\to\infty} \frac1T\int_0^T f(y(t))\,dt = \int f(y)\,\rho(dy),
\]
i.e., the ergodic theorem holds for the dynamical system. Then for all $x$ and $\rho(dy)$-almost all $y$
\[
f_{h,\varepsilon}(x,y) \sim h\,\bar L_x\varphi(x).
\]
For this relation to be used in (91) the distribution of $y_\varepsilon(t) = y(t/\varepsilon)$ must be absolutely continuous (uniformly with respect to $\varepsilon$) with respect to $\rho(dy)$. Since $\rho$ is an invariant distribution for the dynamical system, this condition will be satisfied if the distribution of $y_\varepsilon(0) = y(0)$ does not depend on $\varepsilon$ and has bounded density with respect to $\rho(dy)$. The assertion of Theorem 11 is valid in this case.

Note that in this situation we can apply Theorem 11 directly if $y(0) = y$ is chosen so that the ergodic theorem holds for $y(t)$. But this approach is not applicable when there is feedback, since the exceptional set for which the assertion of the ergodic theorem fails then varies with $x$. Therefore, the approach based on a random choice of the initial value $y_\varepsilon(0)$ is more natural here. We first establish how to determine the distribution of $y_\varepsilon(t)$ from that of $y_\varepsilon(0)$.
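The ergodic-theorem assumption used above can be checked numerically in a simple case. The sketch below is only an illustration under assumptions that are not taken from the text: $Y$ is the circle $[0,1)$, the velocity is $a_1(y) = 2 + \sin 2\pi y$, and the test function is $f(y) = \sin 2\pi y$. For a strictly positive velocity on the circle the invariant density is proportional to $1/a_1(y)$, and the code compares a long time average along one path with the corresponding space average.

```python
import math

def a1(y):
    """Assumed velocity field on the circle [0, 1); strictly positive."""
    return 2.0 + math.sin(2.0 * math.pi * y)

def f(y):
    """Assumed 1-periodic test function."""
    return math.sin(2.0 * math.pi * y)

def time_average(y0, horizon, n_steps):
    """(1/T) * int_0^T f(y(t)) dt along dy/dt = a1(y), using RK4 and the trapezoid rule."""
    dt = horizon / n_steps
    y = y0
    total = 0.0
    for _ in range(n_steps):
        k1 = a1(y)
        k2 = a1(y + 0.5 * dt * k1)
        k3 = a1(y + 0.5 * dt * k2)
        k4 = a1(y + dt * k3)
        y_next = y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        total += 0.5 * dt * (f(y) + f(y_next))
        y = y_next
    return total / horizon

def space_average(n_cells):
    """int f(y) rho(dy) with invariant density rho(y) = c / a1(y), by the midpoint rule."""
    nodes = [(i + 0.5) / n_cells for i in range(n_cells)]
    norm = sum(1.0 / a1(y) for y in nodes)
    return sum(f(y) / a1(y) for y in nodes) / norm
```

Calling `time_average(0.3, 500.0, 50000)` and `space_average(2000)` should give nearly equal values; for this particular field the space average equals $\sqrt3 - 2$. The example is uniquely ergodic, so it does not exhibit the exceptional sets of initial points discussed in the text.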
LEMMA 11. Suppose that $b(t,y)$ is a jointly continuous function from $R \times Y$ to $Y$ such that $b'_y(t,y)$ exists and is a continuous bounded function. Denote by $y(t,y)$ the solution of the equation
\[
\frac{dy(t,y)}{dt} = b(t,y(t,y))
\]
with initial condition $y(0,y) = y$. Further, let
\[
\frac{D(y(t,y))}{D(y)} = \det y'_y(t,y)
\]
be the Jacobian of the transformation $y(t,\cdot)\colon Y \to Y$. Then
\[
\frac{D(y(t,y))}{D(y)} = \exp\Bigl\{\int_0^t \operatorname{tr} b'_y(s,y(s,y))\,ds\Bigr\}.
\]
See, for example, Arnol'd's book [1] for a proof (Russian p. 61).

COROLLARY 1. Consider $y(t,\eta)$, where $\eta$ is a random variable in $Y$ with distribution density $g(y) = g_0(y)$. This is the solution of the equation $dy(t)/dt = b(t,y(t))$ with random initial value $\eta$. Then $y(t,\eta)$ also has distribution density $g_t(y)$, with
\[
g_t(y(t,y)) = g(y)\exp\Bigl\{-\int_0^t \operatorname{tr} b'_y(s,y(s,y))\,ds\Bigr\}. \tag{92}
\]
Indeed, if $y^{-1}(t,y)$ is the inverse function (with respect to $y$) of $y(t,y)$, then for $f \in C_Y$
\[
\begin{aligned}
\int f(y)\,g(y^{-1}(t,y))&\exp\Bigl\{-\int_0^t \operatorname{tr} b'_y(s,y(s,y^{-1}(t,y)))\,ds\Bigr\}\,dy \\
&= \int f(y(t,y))\,g(y)\exp\Bigl\{-\int_0^t \operatorname{tr} b'_y(s,y(s,y))\,ds\Bigr\}\frac{D(y(t,y))}{D(y)}\,dy \\
&= \int f(y(t,y))\,g(y)\,dy = E f(y(t,\eta)) = \int f(y)\,g_t(y)\,dy
\end{aligned}
\]
(the substitution $y \to y(t,y)$ was made in the first integral). The last equality is equivalent to (92).

COROLLARY 2. Suppose that conditions 1)-3) and 5) hold and $x_\varepsilon(t), y_\varepsilon(t)$ is the solution of (74'). If $y_\varepsilon(0)$ has distribution density $g(y)$, and $y_\varepsilon(0)$ is independent of the Wiener process $w(t)$ (appearing in an equation of (74')), then the variable $y_\varepsilon(t)$ has distribution density $g_\varepsilon(t,y)$, and
\[
g_\varepsilon(t,y) = E\,g(y_\varepsilon^{-1}(t,y))\exp\Bigl\{-\frac1\varepsilon\int_0^t \operatorname{tr} a'_{1y}(x_\varepsilon(s),y_\varepsilon(s,y_\varepsilon^{-1}(t,y)))\,ds\Bigr\}. \tag{93}
\]
Let us now consider the solution of the equation $dy/dt = b(y(t))$. If $g(y)$ is the density of the invariant measure, then on the basis of (92)
\[
g(y(t,y)) = g(y)\exp\Bigl\{-\int_0^t \operatorname{tr} b'(y(s,y))\,ds\Bigr\}.
\]
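The Jacobian formula of Lemma 11 can be checked numerically. The following Python sketch is a minimal illustration under assumptions that are not part of the text: $Y = R^2$, the field $b(t,y) = (y_2,\ -\sin y_1 - 0.3\tanh y_2)$ (a damped pendulum with bounded derivative in $y$), a hand-written RK4 integrator, and central differences for the Jacobian matrix. It compares $\det y'_y(t,y)$ with $\exp\{\int_0^t \operatorname{tr} b'_y\,ds\}$, where the integral is accumulated as an extra state variable.

```python
import math

def field(t, y):
    """Assumed b(t, y) on R^2: a damped pendulum with bounded derivative in y."""
    return (y[1], -math.sin(y[0]) - 0.3 * math.tanh(y[1]))

def trace_jacobian(t, y):
    """tr b'_y(t, y): d b1/d y1 = 0 and d b2/d y2 = -0.3 * (1 - tanh(y2)^2)."""
    return -0.3 * (1.0 - math.tanh(y[1]) ** 2)

def rhs(t, state):
    """Augmented system (y1, y2, z) with dz/dt = tr b'_y(t, y(t))."""
    y = (state[0], state[1])
    d1, d2 = field(t, y)
    return (d1, d2, trace_jacobian(t, y))

def flow(y0, t_end, n_steps):
    """RK4 for the augmented system; returns (y1(t), y2(t), int_0^t tr b'_y ds)."""
    dt = t_end / n_steps
    s = (y0[0], y0[1], 0.0)
    t = 0.0
    for _ in range(n_steps):
        k1 = rhs(t, s)
        k2 = rhs(t + 0.5 * dt, tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
        k3 = rhs(t + 0.5 * dt, tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
        k4 = rhs(t + dt, tuple(si + dt * ki for si, ki in zip(s, k3)))
        s = tuple(si + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
        t += dt
    return s

def jacobian_det(y0, t_end, n_steps, h=1e-5):
    """det of d y(t, y0) / d y0, by central differences in each initial coordinate."""
    cols = []
    for i in range(2):
        plus, minus = list(y0), list(y0)
        plus[i] += h
        minus[i] -= h
        fp = flow(plus, t_end, n_steps)
        fm = flow(minus, t_end, n_steps)
        # column i holds the derivatives of (y1(t), y2(t)) in the i-th initial coordinate
        cols.append(((fp[0] - fm[0]) / (2.0 * h), (fp[1] - fm[1]) / (2.0 * h)))
    return cols[0][0] * cols[1][1] - cols[1][0] * cols[0][1]
```

With `y0 = (1.0, 0.5)`, the values `jacobian_det(y0, 2.0, 2000)` and `math.exp(flow(y0, 2.0, 2000)[2])` should agree to within the discretization error. By (92), dividing an initial density value $g(y)$ by that exponential gives the density $g_t$ at the transported point $y(t,y)$.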
Assume that $g(y) > 0$ and the derivative $g'(y)$ exists and is continuous. Then from the preceding equality we get that
\[
(g'(y(t,y)),\,b(y(t,y))) = -g(y(t,y))\operatorname{tr} b'(y(t,y)).
\]
If for all $y$ the sample path $y(t,y)$ is dense in $Y$, then the last relation is equivalent to the equation for the stationary density,
\[
(g'(y),b(y)) + g(y)\operatorname{tr} b'(y) = \operatorname{tr}(g(y)b(y))' = 0. \tag{94}
\]
This implies that if we have the two equations
\[
\frac{dy}{dt} = b(y(t)), \qquad \frac{dy}{dt} = b_1(y(t))
\]
and $b_1(y) = \lambda(y)b(y)$, where $\lambda(y) > 0$ and $b(y)$ and $b_1(y)$ are differentiable functions, and $g(y)$ is the stationary density for the first equation, then the function $g(y)/\lambda(y) = g_1(y)$ is the stationary density for the second equation.

Let us consider the solution of the equation
\[
\frac{dy^x(t,y)}{dt} = a_1(x,y^x(t,y)), \qquad y^x(0,y) = y \tag{95}
\]
for fixed $x$. We introduce the following condition:

6) For all $x \in X$ equation (94) has a unique positive and continuous stationary density $g(x,y)$, the derivative $g'_y(x,y)$ exists and is continuous, for all $r > 0$ there exists an $l_r$ such that for $|x| \le r$ and $|x_1| \le r$
\[
\sup_y \left|\frac{g(x,y)}{g(x_1,y)} - 1\right| \le l_r|x - x_1|,
\]
and for all $f \in C_Y$
\[
\lim_{T\to\infty}\ \sup_{|x|\le r} \int \left|\frac1T\int_0^T f(y^x(z,s))\,ds - \int f(y)g(x,y)\,dy\right| g(x,z)\,dz = 0,
\]
where $y^x(z,s)$ is the solution of (95) with initial condition $y^x(z,0) = z$.

The last condition can be called local uniform ergodicity with respect to $x$.

We need a result on random time change in a stochastic differential equation. Let $\psi(x,y) > 0$ be a measurable locally bounded function. The variable $\tau_t^\varepsilon$ is determined by
\[
t = \int_0^{\tau_t^\varepsilon} \psi(x_\varepsilon(s),y_\varepsilon(s))\,ds,
\]
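where $x_\varepsilon(s), y_\varepsilon(s)$ is the solution of the system (74) with $B_1 = 0$. Let $\tilde x_\varepsilon(t) = x_\varepsilon(\tau_t^\varepsilon)$ and $\tilde y_\varepsilon(t) = y_\varepsilon(\tau_t^\varepsilon)$. Then $\tilde x_\varepsilon(t)$ and $\tilde y_\varepsilon(t)$ satisfy the system of equations
\[
\begin{aligned}
d\tilde x_\varepsilon(t) &= \frac{1}{\psi(\tilde x_\varepsilon(t),\tilde y_\varepsilon(t))}\,a(\tilde x_\varepsilon(t),\tilde y_\varepsilon(t))\,dt
 + \frac{1}{\sqrt{\psi(\tilde x_\varepsilon(t),\tilde y_\varepsilon(t))}}\,B(\tilde x_\varepsilon(t),\tilde y_\varepsilon(t))\,d\tilde w_\varepsilon(t), \\
\frac{d\tilde y_\varepsilon(t)}{dt} &= \frac{1}{\varepsilon\,\psi(\tilde x_\varepsilon(t),\tilde y_\varepsilon(t))}\,a_1(\tilde x_\varepsilon(t),\tilde y_\varepsilon(t)),
\end{aligned} \tag{96}
\]
where
\[
\tilde w_\varepsilon(t) = \int_0^{\tau_t^\varepsilon} \sqrt{\psi(x_\varepsilon(s),y_\varepsilon(s))}\,dw(s)
\]
is also a Wiener process, and if $w(t)$ is adapted to the flow $\mathcal F_t$ with respect to which $x_\varepsilon(t)$ and $y_\varepsilon(t)$ are measurable, then $\tilde w_\varepsilon(t)$ is adapted to $\tilde{\mathcal F}_t^\varepsilon = \mathcal F_{\tau_t^\varepsilon}$ (on this see Gikhman-Skorokhod [1], Vol. III, Russian p. 276, English p. 208).

If condition 5) holds, then $\psi(x,y)$ can be chosen so that the stationary density for the solution of the equation
\[
\frac{dy}{dt} = \frac{1}{\psi(x,y(t))}\,a_1(x,y(t)) \tag{97}
\]
does not depend on $x$. We can take $\psi(x,y) = g(x_1,y)/g(x,y)$, where $x_1$ is a fixed value. Then $\bar g(y) = g(x_1,y)$.

LEMMA 12. If conditions 5) and 6) hold and $\psi(x,y) = \bar g(y)/g(x,y)$, where $\bar g(y)$ is a positive continuously differentiable density, and $\tilde x_\varepsilon(t), \tilde y_\varepsilon(t)$ is the solution of the system (96) for which $\tilde y_\varepsilon(0)$ has distribution density $\bar g(y)$, then $\tilde y_\varepsilon(t)$ has distribution density $\bar g(y)$ for all $t$.

PROOF. Denote the coefficients of (96) by $\tilde a$, $\tilde B$, and $\tilde a_1$, respectively. We use (93). On the basis of (94)
\[
\operatorname{tr}\tilde a'_{1y}(x,y) = -\frac{(\bar g'_y(y),\tilde a_1(x,y))}{\bar g(y)},
\]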
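and so
\[
\begin{aligned}
-\frac1\varepsilon\int_0^t \operatorname{tr}\tilde a'_{1y}\bigl(\tilde x_\varepsilon(s),\tilde y_\varepsilon(s,\tilde y_\varepsilon^{-1}(t,y))\bigr)\,ds
&= \frac1\varepsilon\int_0^t \frac{\bigl(\bar g'_y(\tilde y_\varepsilon(s,\tilde y_\varepsilon^{-1}(t,y))),\ \tilde a_1(\tilde x_\varepsilon(s),\tilde y_\varepsilon(s,\tilde y_\varepsilon^{-1}(t,y)))\bigr)}{\bar g(\tilde y_\varepsilon(s,\tilde y_\varepsilon^{-1}(t,y)))}\,ds \\
&= \int_0^t d_s \ln \bar g(\tilde y_\varepsilon(s,\tilde y_\varepsilon^{-1}(t,y)))
 = \ln\frac{\bar g(\tilde y_\varepsilon(t,\tilde y_\varepsilon^{-1}(t,y)))}{\bar g(\tilde y_\varepsilon^{-1}(t,y))}
 = \ln\frac{\bar g(y)}{\bar g(\tilde y_\varepsilon^{-1}(t,y))}.
\end{aligned}
\]
Substituting this in (93), we get what is required. $\square$

Consider the solution of (96) with the value of $\psi(x,y)$ chosen in Lemma 12. Let $\tilde x_\varepsilon(0) = x_0$ be fixed, and suppose that $\tilde y_\varepsilon(0)$ has density $g_0(y)$ such that $g_0(y) = O(\bar g(y))$. Then, arguing precisely as in the beginning of this subsection, we see that $\tilde x_\varepsilon(t)$ converges in distribution to a process $\tilde x(t)$ that is the solution of the stochastic equation
\[
d\tilde x(t) = \tilde a(\tilde x(t))\,dt + \tilde B(\tilde x(t))\,dw(t)
\]
with initial condition $\tilde x(0) = x_0$, where
\[
\tilde a(x) = \int \tilde a(x,y)\bar g(y)\,dy, \qquad \tilde B^2(x) = \int \tilde B(x,y)\tilde B^*(x,y)\bar g(y)\,dy.
\]
Substituting the values $\tilde a(x,y)$ and $\tilde B(x,y)$, we get that
\[
\tilde a(x) = \int \frac{g(x,y)}{\bar g(y)}\,a(x,y)\bar g(y)\,dy = \int a(x,y)\,\rho_x(dy) = \bar a(x), \qquad \tilde B^2(x) = \bar B^2(x).
\]
Thus, the distribution of $\tilde x(t)$ coincides with that of the solution $\bar x(t)$ of (75) (which is weakly unique by assumption).

Let us show that $x_\varepsilon(t)$ also converges in distribution to the same process. To do this we study the behavior of $\tau_t^\varepsilon$ for the choice of $\psi(x,y)$ indicated in Lemma 12. Differentiating the equality
\[
t = \int_0^{\tau_t^\varepsilon} \psi(x_\varepsilon(u),y_\varepsilon(u))\,du,
\]
we get that
\[
1 = (\tau_t^\varepsilon)'\,\psi(x_\varepsilon(\tau_t^\varepsilon),y_\varepsilon(\tau_t^\varepsilon)) = (\tau_t^\varepsilon)'\,\psi(\tilde x_\varepsilon(t),\tilde y_\varepsilon(t)).
\]
From this,
\[
\tau_t^\varepsilon = \int_0^t \frac{ds}{\psi(\tilde x_\varepsilon(s),\tilde y_\varepsilon(s))} = \int_0^t \frac{g(\tilde x_\varepsilon(s),\tilde y_\varepsilon(s))}{\bar g(\tilde y_\varepsilon(s))}\,ds.
\]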
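The effect of the time change with $\psi(x,y) = g(x_1,y)/g(x,y)$ can be illustrated in the simplest setting. The Python sketch below rests on assumptions that do not come from the text: $Y$ is the circle $[0,1)$, the velocity is the hypothetical $a_1(x,y) = 2 + \tanh(x)\sin 2\pi y$, and densities are represented on a midpoint grid. In one dimension equation (94) reads $(g b)' = 0$, so the stationary density of a positive velocity $b$ is proportional to $1/b$. The code checks that the stationary density of the time-changed equation (97) coincides with $g(x_1,\cdot)$ for every frozen $x$.

```python
import math

N_CELLS = 400
NODES = [(i + 0.5) / N_CELLS for i in range(N_CELLS)]   # cell midpoints on the circle [0, 1)

def a1(x, y):
    """Hypothetical velocity on the circle; positive for every x because |tanh(x)| < 1."""
    return 2.0 + math.tanh(x) * math.sin(2.0 * math.pi * y)

def normalise(weights):
    """Scale positive weights so that their midpoint-rule integral over [0, 1) equals 1."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]

def stationary_density(velocity):
    """In one dimension (g b)' = 0 forces g = const / b; return g on NODES."""
    return normalise([1.0 / velocity(y) for y in NODES])

def g(x):
    """g(x, .) for dy/dt = a1(x, y) with x frozen."""
    return stationary_density(lambda y: a1(x, y))

def time_changed_density(x, x1):
    """Stationary density of dy/dt = a1(x, y) / psi(x, y), psi = g(x1, y) / g(x, y)."""
    gx, gx1 = g(x), g(x1)
    velocities = [a1(x, y) * gx_i / gx1_i for y, gx_i, gx1_i in zip(NODES, gx, gx1)]
    return normalise([1.0 / v for v in velocities])
```

For any `x`, `time_changed_density(x, 0.0)` should reproduce `g(0.0)` up to rounding, while `g(x)` itself changes with `x`. This is the property that allows the density $\bar g$ in Lemma 12 to be taken independent of $x$.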
150 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER LEMMA 13. Suppose that the conditions of Lemma 12 hold and the sys- tem (96) has initial conditions xe(O) = Xo, Ye(O), where the latter has density go(y) such that go(y) < cg(y) for some c > O. For any  E C y and t > 0 lim t tp(Ye(s)) ds = t f qJ(y)g(y) dy (98) e-+O 10 in the sense of convergence in probability. - PROOF. Using the boundedness in probability of Ye(s) (for all r > 0 and s > 0, P{IYe(s)1 > r} < YI>' g(y) dy), we see that it suffices to prove the lemma for compactly supported functions in C y . The possibility of uniformly approximating such functions by compactly supported functions in cV) enables us to reduce the proof to functions  satisfying a Lipschitz condition. Suppose that  is such a function,  --+ 0, I e --+ 00, t I  = n is an integer, and write t n.1 - 1 1  r tp(Ye(S)) ds = L tp(yX.(klJ.)(Ye(kll),sje)) ds 10 k=O 0 -1  _ + L r (tp(Ye(S + kll)) - tpW:«klJ.)(Ye(kll),sje))) ds, k=O 10 where yX(z,s) is a solution of the equation dyX(z,s) _ - ( -X ( )) d - al x,y z,s , s . Assume that xe(O) = x. We estimate the difference between Ye(s) and yX(Ye(O),sle). Let' = inf{t: IXe(t)1 > c}. Then, on the basis of the corollary to Lemma 8, yX(z,O) = z. lim supP{' < t} = 0 c-+oo e for all t > O. It follows from conditions 5) and 6) that for the indicated ,..., choice of ",(x,y) and for some I, the function al(x,y) satisfies the condi- tion ,..., lal (x,y) - al ( x , y )1 < 1,(lx - x l + Iy - y l) for lxi, I x l < r. Therefore, for some I IYe(s) - yX (sle)II 1 {' > t} =  1 5 [al(.xe(u),Ye(U)) du - al(x,yX(uje))] IW<u}du < I  (1 5 Ye(U) - yX(uje) IW>u} du ) + 1 5 Ixe(u) - xl duo 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 151 Hence, Ye(S) - yX (  S) <  foS Ixe(u) - xl du. exp {s}. Since  satisfies a Lipschitz condition, n& - 1 .6. ( ( ,..., ( 1 ) ) ) t; 10 tp(Ye(s + k)) - tp yXe(kA) Ye(k), t/ dy IW>A} I n& -1 1 .6. i s { I } < .J. L IXe(u + k) - xe(k)1 duds. exp -s e k=O 0 0 e I { I } n& - 1 1 .6. < .J.. exp - L IXe(u + k) - xe(k)1 du e e k=O 0 for some 11. Since Elxe(u + k) - xe(k)1 < V Elxe(u + k) - xe(kW = O(), it follows that t n& - 1 .6. ( ,..., ( 1 ) ) E 10 tp(Ye(s)) ds - t; 10 tp yXe(kA) Ye(k), t/ ds < P{-r < t} + E lot tp(Ye(s))ds n& - 1 .6. ( ,..., ( 1 ) ) - t; 10 tp yXe(kA) Ye(k), e s ds I{t} < P{ -r < t} + h  exp {  } n.6.3/2 - - e e < P{ -r < t} + lt 3/2 exp {  } . We choose  and c to depend on e in such a way that the expression on the right-hand side tends to zero even though I e --+ 00 (take  = e In In( 1 Ie)). Using the condition of local uniform ergodicity with respect to x (see condition 6)) in connection with the process yX(z,s), we can assert that for all r > 0 sup f T l (T tp(yX(z,s)) ds - f (y)g(y) dy g(z) dz < c5(r, T), Ixlr 10 where the c5 (r, T) are collectively bounded and c5 (r, T) ---+ 0 as T ---+ 00. Let E E 1 loA tp (yX'e(kA) (Ye(k),s/e) ) ds - t f tp(y),i(y) dy = De. 
152 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER Then De < I E(Io A tp(yX«kAJ(Ye(kd),Sje)) dS-d f tp(y)g(Y)d Y ) ( nl1- 1 ) X I{I;e(kL1)1,} + 0 E L M{I;e(kL1)I>'} · k=O Define tp*(Y) = tp(y) - f tp(Z)g(Z) dz, 0, Ixl < r12, g,(X) = 1, Ixl > r, 21xl/r - 1, rl2 < x < r. Then nl1 - 1 nl1 - 1 L dI{I(kAJI>r} < L gr(Xe(kd))d k=O k=O 1 t nl1-1 1 (k+l)L1 = g,(Xe(S)) ds + L (g,(Xe(S)) - g,(xe(k))) ds. o k=O kL1 Since g, satisfies a Lipschitz condition with constant 21r and Elxe(s) - xe(kL\) I = O(ls - kll/2), it follows that nl1 - 1 t E L M{lx«kAJI>r} < E r gr(Xe(S)) ds + O( J'X). k=O 10 Further, E (loA tp* (yX«kAJ (Ye(kd),  S ) ) dS) I{lx«kAJI:5r} - f f P(Ye(kd) E dz) (L1/e X P{Xe(kd) E dxjYe(kd) = z}e 10 tp*(yX(z,s)) dsI{lxl:5r} < cd f f g(z)dzP{xe(kd) E dxjYe(kd) = z} e (L1/e X d 10 tp*(YX(z,s)) ds I{lxl:5r} < cL\J (r, I e). 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 153 We have used the fact that the inequality P{Ye(O) E dz} < cg(z) dz and the invariance of g(z) for Ye(s) give us that P{Ye(s) E dz} < cg(z)dz. Thus, De < cto(r,!:!.je) + O(.JX) + E I t g,(xe(s)) ds. Since xe(s) is bounded in probability, limt -+o De can be made arbitrarily small by suitably choosing r. 0 REMARK. If <I> c C y is a bounded set of functions such that the set {  (y) exp{ -Iy I},  E <I>} is compact in C y, then under the conditions of the lemma the convergence in (98) is uniform with respect to  E <1>. LEMMA 14. Suppose that f(x,y) E C xxy . Then under the conditions of Lemma 13 the distribution of the variable I t f(xe(s),Ye(s)) ds as e --+ 0 converges to the distribution of the variable I t (! f(xe(s), z)g(z) d z ) ds. PROOF. Using the uniform (with respect to e) stochastic continuity of xe(s) and the boundedness in probability of Ye(s), we can see that 1 t n-l 1 (k+ 1 )Int lim sup E f(xe(s),Ye(s)) ds - L f(xe(ktln),Ye(s)) ds = o. n-+oo e 0 k=O ktln Further, for every n l (k+l)lnt t ! lim sup f(x,Ye(s)) ds - - f(x, z)g(z) dz = 0 e-+O Ixl r ktln n in view of Lemma 13 and the remark after it; hence l (k+l)t l n t ! lim E f(xe(ktln), Ye(s)) ds - - f(xe(ktl n), z)g(z) d z eO ktln n x I{I;t(ktln)lr} = O. Therefore, using the boundedness in probability of xe(ktln), we can see that for all n n-l 1 (k+ 1 )tln lim E L f(xe(ktln),Ye(s)) ds eO kt l n k=O n-l -  L! f(xe(ktjn), Z)g(Z) dz = O. k=O 
154 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER It remains to observe that lim supE t j f(xe(s),z)g(Z)dZdS noo e J o n-l -  L j f(xe(ktfn), z)g(z) dz = O. 0 k=O By Lemma 14, the distribution of 'l' converges to the distribution of the variable 1/ j g(:,Z) g(Z)dzdt = 1/ j g(xe(s),z)dzdt = t, since f g(x, z) dz = 1. Recall that g(x, z) is the stationary density for yX(t). But then 'l' --+ t in probability as e --+ O. Since 'l' is a strictly monotone function (by condition 5), ",(x,y) is bounded and bounded away from zero), it follows that 'l' --+ t uniformly in probability in each finite interval, i.e., limP { SUP I'l' - sl > J } = 0 eO st for all J > O. We observe now that for p > 0 and J > 0 P{lxe(s) - xe(s)1 > p} = P{lxe(s) - xe('l')1 > p} < P{I'l' - sl > J} + P { SUP Ixe(s) - xe(u)1 > P } uE[s-,s+] < P{ls - 'l'1 > J} + 2P { SUP Ixe(s - J) - xe(u)1 > PI2 } . uE[s-,s+] Hence, supP{lxe(s) - xe(s)1 > p} st < P { SUP Is - 'l'1 > J } + 2 sup P { SUP IXe(u) - xe(s)1 > P12 } . st s<u2 s<u2 However, xe(u) -xe(s) = [U a(Ye(V), xe(v)) dv + [U B(Ye(v),xe(v))dw(v), l s+2 sup Ixe(u) - xe(s)1 < la(Ye(v), xe(v))1 dv sus+2 s + sup [ S U B(Ye(v),xe(v)) dw(v) . sus+2 J_t 
On the basis of martingale inequalities (see Gikhman-Skorokhod [2], Chapter 1, §2)
\[
E\sup_{s\le u\le s+2\delta}\left|\int_s^u B(y_\varepsilon(v),x_\varepsilon(v))\,dw(v)\right|^2 \le 4E\int_s^{s+2\delta} \operatorname{tr} B(y_\varepsilon(v),x_\varepsilon(v))B^*(y_\varepsilon(v),x_\varepsilon(v))\,dv,
\]
\[
E\left(\int_s^{s+2\delta} |a(y_\varepsilon(v),x_\varepsilon(v))|\,dv\right)^2 \le 2\delta\,E\int_s^{s+2\delta} |a(y_\varepsilon(v),x_\varepsilon(v))|^2\,dv.
\]
Therefore, by using condition 2) and Lemma 8 it can be seen that there exists a constant $l$ (it can depend on $x_0$ and $t$, but not on $\varepsilon$ nor $\delta$) such that for $s \le t$
\[
E\sup_{s\le u\le s+2\delta} |x_\varepsilon(u) - x_\varepsilon(s)|^2 \le l\delta.
\]
Hence,
\[
\sup_{s\le t} P\{|x_\varepsilon(s) - \tilde x_\varepsilon(s)| > \rho\} \le P\Bigl\{\sup_{s\le t}|\tau_s^\varepsilon - s| > \delta\Bigr\} + \frac{4l\delta}{\rho^2},
\]
\[
\lim_{\varepsilon\to 0}\ \sup_{s\le t} P\{|x_\varepsilon(s) - \tilde x_\varepsilon(s)| > \rho\} = 0.
\]
Thus, the following theorem has been proved.

THEOREM 13. Suppose that for the system (74') conditions 1)-3), 5), and 6) hold, and the derivative $(\partial/\partial x)a_1(x,y)$ exists and is bounded and continuous. Denote by $x_\varepsilon(t), y_\varepsilon(t)$ the solution of the system with the initial conditions $x_\varepsilon(0) = x_0$ (nonrandom) and $y_\varepsilon(0) = y(0)$, where $y(0)$ has a distribution density $g(y)$ satisfying the inequality $g(y) \le c\,g(x_0,y)$ ($c$ is a constant, and $g(x,y)$ is the function in condition 6)). Then the process $x_\varepsilon(t)$ converges in distribution to the process $\bar x(t)$ that is the solution of (75) with coefficients given by (87), where $\rho_x(dy) = g(x,y)\,dy$.

We single out the special case when $B(x,y) = 0$. Then we have a system of two connected dynamical systems, of which one is rapid,
\[
\frac{dx_\varepsilon}{dt} = a(x_\varepsilon(t),y_\varepsilon(t)), \qquad \frac{dy_\varepsilon}{dt} = \frac1\varepsilon\,a_1(x_\varepsilon(t),y_\varepsilon(t)). \tag{99}
\]
THEOREM 14. Suppose that a) the functions $a(x,y)$ and $a_1(x,y)$ are continuous and satisfy a Lipschitz condition with respect to $x$ uniformly with respect to $y$, the derivative $a'_{1y}(x,y)$ exists and is continuous and bounded, and the system (99) has a unique solution for all $\varepsilon > 0$; and b) for fixed $x$ the solution of the equation $dy^x(t)/dt = a_1(x,y^x(t))$ is ergodic with an ergodic probability measure $\rho_x(dy)$ that has density $g(x,y)$ with respect to
Lebesgue measure, and for any $r > 0$ there exists an $l_r$ such that for $|x| \le r$ and $|x_1| \le r$
\[
\sup_y \left|\frac{g(x_1,y)}{g(x,y)} - 1\right| \le l_r|x - x_1|.
\]
If $x_\varepsilon(t), y_\varepsilon(t)$ is the solution of (99) with initial conditions $x_0, y_0$, where $x_0$ is nonrandom, $y_0$ has distribution density $g_0(y)$, and $g_0(y) \le c\,g(x_0,y)$ for some $c > 0$, then $x_\varepsilon(t)$ tends uniformly as $\varepsilon \to 0$ to the function $\bar x(t)$ that is the solution of the equation
\[
d\bar x(t) = \bar a(\bar x(t))\,dt
\]
with initial condition $\bar x(0) = x_0$, where $\bar a(x) = \int a(x,y)g(x,y)\,dy$.

3.3. A dynamical system under the influence of a rapid diffusion process. Neutral case. We consider a system of the form (74) with $B = 0$. Theorems 12 and 13 are applicable for such systems, but we are interested in the case when $\bar a(x) = 0$ ($\bar B(x) = 0$, since $B = 0$). Under this assumption the limit dynamical system does not leave the initial position in a finite amount of time. Therefore, nontrivial results in the investigation of $x_\varepsilon(t)$ can be obtained either by examining $x_\varepsilon(t)$ "under a microscope", i.e., by studying the character of the deviation of $x_\varepsilon(t)$ from $x_\varepsilon(0)$ with a suitable normalization for finite times, or by considering the process for large times in which it is able to essentially leave the initial state. In the first case the process $\alpha_\varepsilon(x_\varepsilon(t) - x_\varepsilon(0))$ is studied, where $\alpha_\varepsilon \to \infty$ is chosen so that the limit distribution exists. In the second case the process $x_\varepsilon(\beta_\varepsilon t)$ is studied, where $\beta_\varepsilon \to \infty$ and is chosen from the same considerations.

Let us take the first problem. We investigate the system of equations
\[
\begin{aligned}
\frac{dx_\varepsilon(t)}{dt} &= a(x_\varepsilon(t),y_\varepsilon(t)), \\
dy_\varepsilon(t) &= \frac1\varepsilon\,a_1(x_\varepsilon(t),y_\varepsilon(t))\,dt + \frac{1}{\sqrt\varepsilon}\,B_1(x_\varepsilon(t),y_\varepsilon(t))\,dw(t).
\end{aligned} \tag{100}
\]
For the time being we impose the conditions 1)-4) on the coefficients of the equation and assume the following:

7) $\bar a(x) = \int a(x,y)\,\rho_x(dy) = 0$, $a'_x(x,y)$ exists and is continuous, and for all $r$ there exists an $l_r$ such that
\[
|a'_x(x,y) - a'_x(\tilde x,y)| \le l_r|x - \tilde x|
\]
for $|x|, |\tilde x| \le r$.
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 157 Let  E C<}). Then tp(xt(t)) - tp(xt(O)) = lot (tp' (xt(s)), a(xt(s), Yt(s))) ds = lot (tp' (x), a(x, Yt(s))) ds + lot Io s [( tp" (xt(u))a(xt(u), Yt(s)), a(xt(u), Yt(u))) + (qJ' (x e ( u)), a(xe( u), Ye(s) )a(x e ( u), Ye( u)))] du. It is easy to get from condition 2) that xe(t) has the estimate Ixe(t) - xe(O)1 < 1(1 + Ixe(O)l)te lt (101) for some I. LEMMA 15. Suppose that condition 7) holds. Then for rp E c<}) tp(xt(t)) - tp(x) - lot (tp'(x), a(x,Yt(s))) ds - lot Io s [(tp"(x)a(x,Yt(s)), a(x,Yt(u))) + ('(x), a(x,Ye(s))a(x,Ye(u)))] du ds < ct 3 , where xe(t) is the solution of(100) with initial condition xe(O) = x, and for any r the constant c can be chosen to be the same for alllxl < r, t < r, and  satisfying I  I, I 'I, I "1, I "'I < r. The proof follows from the fact that under the indicated restrictions on x and t we have that Ixe(t)1 < rl in view of (101), where rl depends only on r. The function ("(x)a(x,y), a(x,Yl)) + ('(x)a(x,y)a(x'Yl)) satisfies a Lipschitz condition in x for Ixl < rl, uniformly with respect to Y and Yl, and lot Io s Ixt(u) - xl du = 0(t3). 0 To clear up how the process xe(t) behaves in a neighborhood of the ini- tial value we again consider the elementary situation when the coefficients in the second equation in (100) do not depend on x. If al(x,y) = al(Y) and B l (x,y) = 8 1 (y), then the distribution of the process Ye(t) coincides with that of the process y(tle), where y(t) is the solution of the equation dy(t) = al (y(s)) ds + B l (y(s)) dw(s) 
158 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER with the same initial condition. Assume that y(t) is exponentially ergodic: for some k and c(y) I Py{y(t) E A} - p(A)1 < c(y)e- kt , where p(A) is an ergodic distribution, c(y) is a locally bounded function, and the function a(x,y) is such that sup c(y)la(x,y)1 < 00. y,lxlr Then the following kernel is defined: Q(y,A) = 10 00 [Py{y(t) E A} - p(A)]dt, Ex,y 10/ (tp'(x), a(x,y£(s))) ds = Ey 10/ (tp'(x), a(x,y(s/e))) ds =e 10//£ /(tp'(x),a(x,z))py{y(S) Edz}ds. Using the fact that f a(x, z)p(dz) = 0, we can write Ex,y lo\tp'(X), a(x,y£(s))) ds = e(tp'(x), a (x,y)) + O(exp{ -kt/e} )lltp'lI, where a (x,y) = f a(x, z)Q(y,dz), and 0(.) is uniform with respect to Ixl < r and with respect to y. Further, Ex,y 10/ Io s (tp" (x)a(x,y£(s)), a(x, y£(u))) du ds = Ey 10/ (tp"(X) 10/ E(a(X,y(s/e))/y(u/e))dS,a(X,y£(u/e))) du = Eye21o//£ (tp"(X) Io//£-u/£ py(u) {y(s) E dz}a(x, z) dS,a(X,y(u))) du = e 2 10//£ Ey(tp"(x) a (x,y(u)),a(x,y(u)))du +0 ( lItp"lle 2 (/£ 1 00 e-k(t-S)dSdU ) J 0 t/e-u/e = te / (tp"(x) a (x, z), a(x, z))p(dz) + O(lItp"lle 2 ). Finally, Ex,y 10/ Io s (tp' (x), a(x, y£(s))a(x,y£(u))) du ds = O(lItp'llt 2 ). 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 159 Suppose now that xe(t) = (xe(t) - x)/Vi, where x = xe(O). For any <I> E Cx m ,  E C'}), and 0 < tl < ... < t m < t < t + h we get, setting f/I(z) = ((z - x)1 Vi), that E<I>(X e (tl), . . . , xe(t m ))[ (xe( t + h)) - (xe(t))] = E <I> ( X e ( t 1 ), · · · , X e ( t m ) ) E xe( t) ,ye( t) [ f/I ( Xe ( h )) - f/I ( Xe ( 0) ) ] = E<I>(X e (tl),... ,xe(tm))[e(f/I'(xe(t)), a (xe(t),Ye(t)))] + eh / (If 1" (xe(t)) a (xe(t), z), a(xe(t), z) )p(d z) + 0(e 2 1If/1"ll + h 2 11f/1'II + exp{ -khle} + Ilf/I"'(z)llh 3 ). Note that ' ( ) _  , ( z - X ) " ( ) _! " z - x f/I z - Vi'P Vi ' f/I Z - e  Vi' Therefore, f/I"'(Z) = 0(e- 3 / 2 ). E<I>(X e (tl), . . . , xe(t m ))[ (xe(t + h)) - (xe(t))] = ViE<I>(x e (tl), . . . , xe(tm))(' (x e ( t)), a (xe(t), Ye( t))) + hE<I>(xe(tl), · · · , xe(tm)) / (tp" (xe(t) ) a (xe(t), z), a(x e (t), z)) p( d z) + O( e + h 2 e- l / 2 + h 3 e- 3 / 2 + ..;e exp{ -kh Ie} ). Since xe(t) --+ x in probability as e --+ 0, and the functions a (x, z) and a(x, z) are continuous, it follows that E<I>(Xe(tl), · · · , xe(tm)) / (tp" (xe(t)) a (xe(t), z), a(xe(t), z ))p( d z) '" E<I>(Xe(tl), . . . , xe(tm)) / (tp" (X e (t)) a (x, z), a(x, z)) p( d z). We next choose h such that hie --+ 00, h 2 1e 3 / 2 --+ 0, and Vie-kh/elh --+ O. Consider the expression ..;eE<I>( xe (t 1), · . . , Xe (t m)) ( ' (X e (t)), a (X e (t), Ye (t))) = ..;eE<I>(X e (tl),... ,Xe(tm))('(Xe(t - h)), a (xe(t - h),Ye(t))) + ..;eO((Elxe(t) - Xe(t - h)12)1/2 + h). Assume that t m < t - h. Then E( a (xe(t - h),Ye(t))lxe(t - h),Ye(t - h)) = / a (xe(t - h), z) Py<(t-h) (y(h/e) E dz) = O(e-kh/e), 
160 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER because f a (x, z)p(dz) = O. Further, E(xe(h) - x,xe(h) - x) = 2E foh (a(x£(s),y£(S)), fos a(x£(u),y£(u)) du ) ds = O(h 3 ) + 2E foh (a(x,y£(S)), fos a(x,y£(u)) dU) ds = O(h 3 ) + 2e 2 foh E( a(x,y(u)), lh/£ a(x,y(s)) dS) du = O(h 3 + eh). Therefore, Elxe(t) - xe(t - h)12 = O(h + h 3 Ie). This establishes that y'£E<I>(X e (tl), . . . , xe( tm))(' (xe( t)), a (x e ( t), Ye( t))) = 0(..fiJi + h 3 / 2 + y'£e-kh/e), and hence E<I>(X e (tl),... ,Xe(tm))[(Xe(t + h)) - (Xe(t)) - hI(Xe(t))] = o(h) if I(z) = f("(z) a (x,y),a(x,y))p(dy). Using Theorem 1 and the remark after it, we see that the processes xe(t) converge in distribution as e --+ 0 to a homogeneous Gaussian process x(t) with independent increments sucH that Ex(t) = 0, E(x(t), z)2 = 2 f ( a (x,y), z)(a(x,y), z)p(dy). We now proceed to the general case. Along with (100) we consider the system d x e ( t) ( _ ( ) _ ( ) ) dt = a Xe t 'Y e t , d y£ (t) = .!.al ( x , y£ (t)) dt +  Bl ( x , y£ (t)) dWI (t), (102) e ye where x E X is fixed. It is natural to expect that if xe(O) = xe (O) = x and Ye(O) = ye (O), then on small time intervals the functions Ye(t) and Ye (t) differ little. On the other hand, ye (t) coincides in distribution with y X (tle), where y X is the solution of (88) (for x = x ). Our goal is to replace the process Ye(t) by y X (tle) in the expression for (xe(t)) - (x). Assume the following condition: 8) The functions a(x,y), B l (x,y), B(x,y), and al(x,y) are twice con- tinuously differentiable with respect to their variables and have bounded derivatives. 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 161 Denote by 1tf(x,y) the semigroup for the Markov process (xe(t),Ye(t)) that is the solution of system (100). Then Trf E ci!y for f E ci!y (for fixed t), and the derivatives of 1tf with respect to x and Y up to second order are bounded uniformly with respect to t on each finite interval. The generating operator of the semigroup It on ci! y is defined and has the form Af = :t 1/flt=o = (fl:(x,y), a(x,y)) + !(.t;(x, y), at (x, y)) + 2 1 tr .t;(x, y)B t (x, y)Bj(x, y). e e Proofs of these assertions are contained in Dynkin's book [1] (Chapter 5, 5). Analogous assertions hold also for the semigroup Ttf(x,y) for the Markov process ( xe (t), ye (t)) that is the solution of system (102). Fur- ther, the generating operator A of the semigroup Tr on cf y has the form .If = (fl:(x, y), a(x, y)) + !(fl:(x, y), at ( x , y)) e + ;e tr .t;(x,y)Bt ( x ,y)Bj( x ,y). We use the formula 1/g - Trg = lot Ts(A - A)Tr-sgds, (103) which is valid if Tug belongs to the domain of the operators A and A (the formula follows from the fact that the integrand is -is rsTt-s). In particular, this formula is valid for g E cf y. In this case it can be rewritten as Ex,yg(xe(t), Ye(t)) - Ex,yg( xe (t), Y e (t)) =! t Ex,y[(Ly - Ly )Ex,yg( xe (t - s)' Ye (t - s))] ds. (104) e J 0 x=xe(s) y=Yt(S) Here Ly g (x, y) = (g; (x, y), a 1 (x, y)) + ! tr g;y (x, y) B 1 (x, y) B i (x, y), and Ly g(x,y) = (g;(x,y),al( x ,y)) + !trg;y(x,y)B l ( x ,y)Bi( x ,y). Using (104) and the fact that - '1/- [(Ly - Ly)g(x,Y)]x=xt(t),y=ye(t) = O((llgyll + IIgyyll)lxe(t) - xl), 
162 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER we can write E:x,y 1 1 (tp'(x), a(x,Yt(S))) ds - E:x,y 1 1 (tp'(x), a(x, yt (s))) ds = O(lItp'111 1  1 5 E:X,ylxt(u) - x l dU) = O(lItp'lIt 3 Ie). (105) This gives us that E:x,y 1 1 1 5 (tp"( x )a( x ,Yt(U)), a( x ,Yt(S))) du ds = E:x,y 1 1 (tp" ( x )a( x , Yt(U)), Ex£(u),y£(u) 1 1 - u a( x , Yt(S)) ds ) du = E:x,y 1 1 (tp"( x )a( x ,Yt(U))Ex£(u),y£(U) x t- u a( x ,y X ' (sle)) ds ) du + 0(11"llt4 Ie). J 0 X'=xt(u) We impose the following condition of uniform exponential ergodicity on the process yX ( t) : 9) For each r there exist a k(r) and a locally bounded function c,(y) such that I Py{yX(t) E A} - Px(A)1 < c,(y) exp{ -k(r)t} for Ixl < r, and for all r sup g,(y)la(x,y)1 < 00. y,lxl' Let RX(y, A) = 1 00 (Py{yx(t) E A} - Px(A)) dt, a (x,y) = / a(x,z)RX(y,dz). Then Ex',y' 1 1 - u a(x,yX(sle)) ds (t-u = O(l x - x'l(t - u)) + Ex',y' 10 a(x',y X ' (els)) ds = O(l x - x'l(t - u) + e a (x',y')) + O(ee-k(\x'I)(t-u)/t) 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 163 uniformly with respect to y and Ix'i < r. Consequently, E:x,y lot Io s (tp" ( x )a( x ,Yt(u)), a( x , Yt(s))) du ds = eE:x,y lo\tp"( x )a( x ,Yt(U)), a (xt(u),Yt(u))) du + o( E:x,y lot (t - u)lxt(u) - x l du + e + t: )iltp"ll = eE:x,y lot (tp"( x )a( x ,Yt(u)), a ( x ,Yt(u))) du + 0(11"II(t3 + t 4 Ie + e 2 + et 2 )) = eE:x,y lot (tp"( x )a( x ,y X (uje)), a( x ,y X (uje))) du + 0(11"II(t3 + t 4 1e + e 2 + et 2 )) + 0(11"llt3) (106) (we have used an estimate of the form (105)). It is now possible to use the computations and estimates performed for the system (100) with al and B l independent of x. They give us that for tl < t2 < . .. < t m < t-h < t < t+h E<I>(X e (tl),..., xe(t m )){ (xe(t + h)) - (xe(t)) - I(xe(t))} = o(h), where Ltp(z) = / (tp" (z) a ( x , y), a( x , Y))Px(dy). This proves the following theorem. THEOREM 15. Suppose that the coefficients of system (100) satisfy condi- tions 8) and 9), and xe(t), Ye(t) is the solution of( 100) with initial conditions xe(O) = X, Ye(O) = yo. Then the processes xe(t) = (xe(t) - x )IVi converge in distribution to a homogeneous Gaussian process x(t) with independent increments such that Ex(t) = 0 and 2E(x(t), z)2 = 2 / ( a ( x ,y), z)(a( x ,y), z)Px(dy) = (jj( x )z, z). (107) 3.4. A dynamical system under the influence of a rapid diffusion process. Neutral case, large times. For small times xe(t) behaves like x + Vix(t), where x is the initial value of the process and x(t) is a homogeneous Gaussian process with independent increments, and hence the diffusion of 
164 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER Xe(t) has order e. Therefore, it is natural to expect that the process moves a finite distance away from the initial state in a time of order lie. Let us study the limit behavior of the process xe(tle) as e --+ O. With this goal we consider Ex,y(xe(hle)), where  E C) and  is a compactly supported function. Using the above computations, we can write Ex,ytp(xe(  )) = tp(x) + 1 h /e(tp'(X), a(x,Ye(s))) ds (108) + 1 h /e l s (tp" (x)a(x, Ye(u)), a(x, Ye(s))) du ds + {h/e r (tp' (x), a(x, Ye(s))a(x, Ye( u))) du ds + O h: ' 10 10 e and the 0 on the right-hand side is uniform in x and Y if condition 7) holds. Let us study the asymptotic behavior of each term on the right- hand side of (108). Assume that e 2 I h --+ 0 and e I h --+ 00. The connection between e and h will be determined more precisely below. Our goal is to single out the terms of order h on the right-hand side of (108). As in 3.3, Ye(s) must be replaced by y:(s) in computing these terms. It is simplest to estimate the third term on the right-hand side of (107). On the basis of (106) we can write {hie (S Ex,y 10 10 (tp"(x)a(x,Ye(U)), a(x,Ye(S))) du ds {hie = eEx,y 10 (tp"(x)a(x, Ye(U)), a (x,Ye(U))) du ( h 3 h4 h 2 ) + 0 - + - + e 2 + - e 3 e 5 e (hie = eE y 10 (tp"(x)a(x,yX(uje)), a (x,yX(uje))) du ( h 3 h4 h 2 ) + 0 - + - + e 2 + - e 3 e 5 e hle 2 = e21 Ey(tp"(x)a(x,yX(u)), a (x,yX(u))) du ( e 2 h h 2 h 3 ) + hO - + - + - + - . h e e 3 e 5 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 165 Using condition 9), we get that for Ixi < r e 2 (h/e 2 h 10 Ey(tp"(x)a(x,yX(u)), a (x,yX(u))) du = f (tp"(x)a(x, z), a (x, z))Px(dz) + o(  Cr(Y) 'akf)' e-k(r)h/t). Since  is a compactly supported function, there is an r such that this estimate holds for all x, and in view of condition 9) Ex,y i h / t is (tp" (x)a(x,Yt(U)), a(x, Yt(S))) du ds h -2 = 2 tr B (X)"(X) + O(J I (e, h)) for some k l , where (B2(X)z, z) is determined by (107), and h 2 h 3 h 4 e 2 { h } J l (e h)=e 2 +-+-+-+-exp -k l - . , e e 3 e 5 h e 2 Consider the fourth integral in (108). Using (105), we first replace Ye(s) by y XI (s Ie) (Xl is a variable): A4 = Ex,y i h / t is (tp'(x), a(x,Yt(s))a(x,Yt(U))) du ds = Ex,y i h / t (tp'(X), i h / t - u Ex(u),Y(U)a ( x, yx.(u) (  s ) ) ds x a(x,Yt(U))) du ds + o( : ). For this substitution it is necessary that the following condition hold along with condition 8): 8') a(x,y) is twice continuously differentiable with respect to y, and its derivatives are bounded for Ixl < r, where r > 0 is arbitrary. Then for a compactly supported function  E cf) the term 0(.) in the last equality is uniform with respect to X and y. We introduce the following notation: QX(y,s,B) = Py{yX(s) E B}, RX(y,B) = ioo[fZ(y,s,B) - Px(B)]ds, 
166 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER RX(y, B) is defined if condition 9) holds. We have that {hle-u 10 Ex' ,y,a' (x,y X ' (s Ie)) ds = (h Ie - u) V(x') - {"O [Qx' (y', s Ie, d z) - Px' (d z )]a(x', z) ds J hle-u l hle - u - , , , , X' 'x' +eV(x,y)+ 0 EX',y,[aAx,y (sle))-aAx,y (sle))]ds, where V(x') = f a(x',z)px,(dz), V (x',y) = f a(x',z)RX'(y,dz). Suppose that the following condition holds along with 9): 9') sup Ila(x,Y)llcr(Y) < 00 for r > o. Ixlr,y Then roo [QX' (y,sle, dz) - px,(dz)]a(x', z) = O(ee-kl(hft-U)/t). J hle-u Moreover, I I la(x',yX (sle)) - a(x,yX (sle))1 < clx' - xl. Hence, (hie A4 = Ex,y 10 (tp'(x), V(xt(u))a(x,Yt(u)))(hle - u) du (hie + eEx,y 10 (tp'(x), V (xt(u), Yt(u))a(x,Yt(U))) du ( h3 h 4 ) + 0 e 2 + - + - e 3 e 5 (we have used the facts that  has compact support and Ixe(u)-xl = O(u)). Since V and V satisfy a Lipschitz condition in x, we can replace xe(u) by x in the integrals representing A 4 , and O(i h / t ( : - u) UdU) = 0( :: ) in the first integral, while o(e i h / t UdU) =0( 2 ) 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 167 in the second. Hence, A4 = Ex,y { 1 h / t (tp'(X), V(x)a(x,Yt(U))) (  - U ) du {hie } + e 10 (tp'(x), V (X,Yt(u)))a(x,Yt(U)) du ( h2 h 2 h 4 ) + 0 e 2 + - + - + - . e e 3 e 5 Under condition 8') we can replace Ye(u) by yX(sle) in both integrals, with error O(1 h / t  1 u SdS(  - u) dU) = o( ; ) for the first integral, and 0(h 3 Ie 3 ) for the second. Accordingly, we have obtained an expression for A4 in terms of yX(s) : { {hie A4 = Ey 10 (tp'(x), V(x),a(x,yX(sje)))(hje -s)ds {hie } + e 10 (tp'(x), V (x,yX(sje))a(x,yX(sje))) + O(c5 I (e,h)). Since f a(x, z)pz(d z) = 0, it follows that Ey 1 h / t (tp'(X), v(x)a(x,yx(  s))) (  )dS = 1 h / t (tp'(X), V(x) f [Qx(y,  S,dZ) - Px(dZ)] a(x, Z)) (  -S)dS hi e 2 ( ) = h 1 tp'(x), V(x) f [QX(y,s, dz) - px(dz)]a(x, z) ds + o( 1 h / t se-klsftds) = h('(x), V(x) a (x,y)) + 0(e 2 + he-klhle2). 
168 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER Further, eE y 1 h / t (tp'(X), V (X,yx (  S) )a (X,yx () ) ) ds = h (tp'(X), / V (x, z)a(x, Z)Px(dZ)) +e 1 h / t (tp'(X), / V (x,z)a(x,z)[QX(y,s,dz) - Px(dZ)])dS = h('(x), a l (x)) + 0(e 2 ), where a l (x) = f V(x, z)a(x, z)Px(dz). Finally, for A4 we have A4 = h('(x), a l (x)) + h('(x), V(x)a(x,y)) + O(l (e, h)) + o(h). (109) We now proceed to a study of the second term on the right-hand side of (108): (hie A 2 = Ex,y 10 (tp'(x), a(x,Yt(S))) ds. Using (104), we can write (using the solution of (102) with x = x ) A2 = E y {1 h / t (tp'(x),a(x,yx(  s))) ds +  1 h / t 1 s [ ( al (x', y') - al (x, y'), o' Ey' (tp' (x), a(x, Yt (s - U)))) + tr ((B 1 (x', y')Bj(x' ,y') - Bl (x, y')Bj(x, y')) o2 Ey' (tp' (x), a(x, ye (s - U))) )] dUdS } . x' =Xe(U),y' =Ye(U) Here xe(t), Ye(t) is the solution of (100). Expanding al (x', y') by the Taylor formula at the point x (al is twice continuously differentiable by condition 8)), we have that 8 al (x' ,y') - al (x, y') = oX al (x,y')(x' - x) + O(lx' - x1 2 ) (8a1/8x is a linear operator from X to Y). There is an analogous repre- sentation for B l Bi. Let g(s,x,y) = Eya(x,yX(s)) = / QX(y,s, dz)a(x, z). 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 169 Then EX',y,a(x, yt (s)) = g(sle, x,y') (this expression really does not depend on x'). Assume that g;(s,x,y) and g;y(s,x,y) are bounded and continuous for Ixl < r, where r > 0 is arbitrary. Then 1 In hlt In s 1 In hlt In s ( h4 ) - IXt(u) -xl 2 du = - 0(u 2 )du = 0 5" · eo 0 eo 0 e Therefore, with an error at most 0(t5 1 (e, h)) we can confine ourselves to the first term in the Taylor expansions for the differences involving al and BIBi in the expression for A 2 . We consider the expression 1 {hit {S ( 8a e 10 10 Ex,y a; (X,Ye(U))(Xe(U) - x), 8 a , (g((s - u)je,x,y')tp'(x)) ) duds y y'=Ye(U) 1 {hit {S ( 8a l (U = e 10 10 Ex,y ax (X,Ye(U)) 10 a(xe(v),Ye(V)) dv, aa. C  U ,X,Ye(U)) tp'(X)) du ds 1 {hit {S ( 8a l (U = e 10 10 Ex,y ax (X,Ye(U)) 10 a(x,Ye(V)) dv, . ( S  U , x,Ye(u)) tp'(X)) du ds + o( ; ) 1 {hit {S ( (U 8a* = e 10 10 Ex,y 10 a(x,Ye(V)) dv, Ex.(v),y.(v) a (X,Ye(U - v)) 8 g * ( S-U ) ) ( h4 ) x ay e ,X,Ye(U - v) tp'(x) duds + 0 es = A 21 - Here 8 g* 18 y is the operator from X to Y that is the adjoint of 8 g I 8 y. Suppose that (818x)al(x,y) and (818y)g(s,x,y) have derivatives with respect to y up to second order that are continuous and bounded for 
170 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER Ixl < r, where r is arbitrary. Then in the last integral Ye(u) can be re- placed by yX' (ule), with error of order  In hle In s 3 _ h 5 2 U du - 7. e 0 0 e Therefore, the preceding chain of equalities can be extended as follows: _ 1 {hie {S {U ( 8ai ( X' ( u - V ) ) A2l - e 10 10 Ex,y 10 a(x,Ye(V)), Eye(v) ox X,Y ---e- 8 g * ( S-U , ( U-V )) ) X _ 8 ,x,yX '(x) dvduds y e e x'=Xe(V) ( h 4 h5 ) +0 -+- e 5 e 7 1 {hie {S ( {U ! ( u - V ) = e 10 10 Ex,y 10 a(x,Ye(V)), QXe(V) Ye(V), e ' dz 8a* 8 g* ( s - u ) ) X o (x, z) aye ' x, z tp'(x) dv du ds ( h4 h 5 ) +0 es+er · Assume now that 8 _ oy QX(y,s,A) = O(cr(y)e k(r)s) for Ixl < r, where cr(y)a(x,y) is bounded for Ixl < r. Then 8g* ! 8Qx oy (s, x, y) = oy (y, s, d z)a(x, z), and roo 8 g* {OO 8 ! 8 10 oy (s,x,y)ds = 10 oy (y,s,dz)a(x,z) = Oya (x,y). 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 171 The last equality for A 21 can be rewritten, changing the order of integration (the parenthesis indicate the previous integrand): 1 l hlt l hlt 1 hlt-a A21 = - dv du ds(.) e 0 v U 1 l hlt l hlt 1 00 1 l hlt 1 00 1 00 = - dv du ds - - dv du O(e-k(r)Slt)ds e 0 0 u e 0 hit hlt-u {hit {hit ( = 10 dv 1v duEx,y a(x,Ye(V)), f ( u - v ) Ga. G a . ) QX.(v) Ye( V), e ' d z o (x, z) a z (x, z)tp' (x) ( h 4 h5 ) + 0 - + - + e 2 e 5 e 7 = foh/e dv 10 00 duEx,y (a(x,Ye(v)), f [QX'(V) (Ye(V), u  v , dz ) ] Ga. G a . ) - Px.(v)(d z) o (x, z) a z (x, z)tp'(x) {hit {hit ( + 10 dv 1v duEx,y a(x,Ye(V)), f Ga. G a . ) Px.(v)(d z) o (x, z) a z (x, z)tp'(x) l hlt 1 00 ( h4 h5 ) - dv O(e-k(r)(u-v)lt) du + 0 - + 7 + e 2 o hit e 5 e {hit ( f oa.(x z) = e 10 dvEx,y a(x,Ye(V)) Rx.(v)(Ye(v),dz) lOX' G a . , ) x GZ (x,z),(x) + foh/e (  - v ) Ex,y (a(x,Ye(V)), f Px.(v) (dz) Ga. G a . ) x o (x, z) a z (x, z), tp'(x) dv ( h 4 h5 2 ) +0 es+er+e · 
172 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER Assume that J Px(dz)g(z) satisfies a Lipschitz condition in x for Ixl < r and g E C?>. Then 1 h / e (  -V)Ex,y(a(x,Ye(V)), f aa* a a * ) Px.(v)(d z) o (x, z) a z (x, z), tp' (x) dv {hie ( h ) ( h3 ) = 10 t - v Ex,y(a(x,Ye(v)),A(x)tp'(x)) dv + 0 t3 ' f aa* a a * A(x) = o (x, z) oz (x, z)Px(dz). Using an estimate of type (105), we get that {hie ( h ) 10 t - V Ex,y(a(x,Ye(v)),A(x)tp'(x)) dv = 1 h / e (  -v )Ex,y(a(x,yx(  v) ),A(X)tp'(X)) dv 1 {hie ( h ) + e 10 t - V O(v 2 ) dv = 1 h / e (  -v)QX(Y'  V,dz)(a(X,Z),A(X)tp'(X))dV+O( ; ) =  1 00 f QX(y,  v,dz )(a(X,Z),A(X)tp'(X))dV _ {h/e vO(e-k(r)v/e)dv _ h roo O(e-k(r)v/edv) + O ( h 4 ) J o e Jhle e 5 = h( a (x,y),A(x)'(x)) + 0(e 2 ) + o(h). Assume now that the function f R x' ( , d ) a ai ( ) a a* ( ) - ( ' , ) y, z ax X,z az X,z -A x,y,x satisfies a Lipschitz condition in x' for Ix'i < r, with a constant propor- tional to cr(y'), and is twice continuously differentiable with respect to y' 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 173 with bounded derivatives. Then (hie e 10 Ex,y(a(x, Ye( V)), A(x e ( V), Ye( V), x)tp' (x)) dx {hie (hie = e 10 Ex,y(a(x,Ye(V)), A(x,Ye(V),X)tp'(x)) dv + e 10 O(v) dv = e fah/e Ex,y (a (x,yx (  V ) ),A (x,yx (  V ),x ) tp'(x) )dV ( {hie h2 ) + 0 10 v 2 dv + e {hie ! = e 10 (a(x, z),A(x, z,x)tp'(x))Px(dz) l hle ( h3 h 2 ) + e O(e-k(r)vle) dv + 0 - + - o e 3 e =h(tp'(X),! A*(X,Z,x)a(x,Z)px(dZ)) +O( : + 2 +e 2 ). Finally, A21 = h(tp'(X),! A*(X,Z,x)a(x,Z)px(dZ)) ( h2 h 3 ) + h( a (x,y),A(x)tp'(x)) + 0 e 2 + t + t3 + o(h). Suppose now that 1 {hie {S { 8 A22 = e 10 10 Ex,y tr oX (B 1 (Xe(U),Ye(U)) X B(xe(u),Ye(U)))(Xe(U) - X)} X { 8 o2 ( g ( .!.(S-U),x,y' ) ,'(X) )} duds. y e y'=Ye(U) If B(x) is a function from X to L(Y), then (818x)B(x) is a linear function from X to L(Y) for fixed x. Assume that 8 2 Oy 2 QX (y,S,A) = O(cr(y)e-k(r)s), where C r is as before, and for all a and b the function ! R X ' (y', dz) tr { :X B1 (x, z)B(x, z)a} ::2 ( a (x, z), b) = (B(x',y',x)a,b), a,b EX, 
174 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER satisfies a Lipschitz condition in x' for Ix'i < r, with a constant propor- tional to lal.lblcr(y'), and moreover, it is twice continuously differentiable with respect to y' and has bounded derivatives for Ixl < r (r arbitrary). Then, repeating all the computations used for A 21 , we find the representa- tion A22 = h !(B(X,Z,X)a(x,z),tp'(X))Px(dz) ( h2 h 3 h 4 h 5 ) + h(C(x) a (x,y), qJ'(x)) + 0 e 2 + - + 3" + 5" + 7 + o(h) e e e e (110) for A 22 . Here the operator C(x) in L(X) is determined by the following equality for a, b EX: (C(x)a,b) = ! tr{ :x (B1(X,Z)Bi(x,z))a} ::2 ( a (X,z),b)px(dz). Finally, for Ixl < r {hit Ey 10 (tp'(x),a(x,yX(sje))) ds (hit = 10 (tp'(x),a(x,z)[QX(y,sje,dz) - px(dz)]) = e(qJ'(x), a (x,y)) + O(ee-hk(r)lt). Thus, since qJ has compact support, we can assert that for some k l Ex,ytp (xe (  ) ) = qJ(x) + e(qJ'(x), a (x,y)) + h(qJ'(x), C(x) a (x,y)) + h( tp' (x), III (x)) + h ! (B(x, z, x)a(x, z), tp'(x) )pAd z) + h ! (A*(x, z, x)a(x, z), tp' (x))Px(d z) + h( a (x, y), A(x)tp' (x)) _ ( h2 h 3 h 4 h 5 e 2 2 + h tr(qJ"(x)B 2 (x)) + 0 e 2 + - + - + - + - + _e-klhlt e e 3 e 5 e 7 h +ee- k1h /e 2 ) +o(h) (111) uniformly with respect to x, y. The right-hand side of this is representable in the form qJ(x) + hLqJ(x) + (eqJ'(x) + hC*(x)qJ'(x) + hA(x)qJ'(x), a (x,y)) + 0(c5 2 (e, h)) + o(h), (112) 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 175 where h 2 h 3 h 4 h 5 ( e2 ) 2 d (e, h) = e 2 + - + - + - + - + - + e e- k1h / t t e e 3 e 5 e 7 h ' - 1 -2 Lrp(x) = (a(x), rp'(x)) + 2 tr(B (X)rp"(X)), and a(x) is determined by the equality (a(x),z) = f(a(x,Yda(x'Y2),Z)Px(dzdRX(YJ,dY2) f ( 8a* 8 a * ) + Px(dYdR X (YJ,dY2) o (X,Y2) OY2 (x,Y2)z,a(x,yd + f Px(dYdRX(YJ,dY2)tr{ :x (Bl(X,Y2)Bj(X,Y2))a(x,Yd} 8 2 - x 8 2 (a (x, Y2 ), z). ( 113 ) Y2 If we choose h = e 2 - P , where 0 < P < 1/4, then d2(e, h) = h[e P + e- P + e l - 2P + e l - 4P + e- kl / t ' le 2 - 2P ] = o(h). For what follows we need an estimate of Ex,y (a (Xe ( : ) ,Ye ( : ) ), b (Xe ( : ) ) ), where b(x) is a sufficiently smooth function from X to X. LEMMA 16. Assume the conditions listed above (those used in the deriva- tion of (111)), and suppose that b(x) is continuously differentiable and its derivative satisfies a Lipschitz condition. Then Ex,y ( a ( Xe ( : ). Ye ( : ) ), b ( Xe ( : ) ) ) = 0 ( 8 + h + :: + :: + :: ). 
176 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER PROOF. We have that cI» = Ex,y ( a (Xe(h), Ye(h)), b (xe (  ) ) ) = Ex,y( a (x'Ye(  ) ),b(X)) + Ex,y l h / e ( a (Xe(S)'Ye(  ) )a(Xe(S),Ye(S)),b(Xe(S))) ds + Ex.Y l h / e ( a (xe (s), Ye (  ) ), b(Xe(S)a(Xe(S)'Ye(S)))) ds = Ex,y( a (x'Ye(  ) ),b(X)) + Ex,y l h / e [(a'x(X'Ye(  ) )a(X,Ye(S)),b(X)) + ( a (X,Ye (  ) ), b(X)a(X'Ye(S))) ] ds + o( : ) = Ex,y l h / e [ ( ( Ex.(s),y.(s) a (x,Ye (  - S ) ) ) a(x, Ye(s)), b(X)) + ( Ex.(s),y.(S)  ( x, Ye (  - S ) ), b(x )a(x, Ye(S))) ] ds + o( : ) + Ex.Y ( a (X,Ye (  ) ), b(X)). If Ye(hle - s) is replaced by yX' ((hle - s)) under the sign Ex',y" then the error is of order 0(h 3 Ie 4 ). Hence, cI» = Ex,y ( a ( X, Ye (  ) ), b(X)) + Ex,y l h / e f QX.(s) (Ye(S), h 2 8S , d z ) x [( a (x, z)a(x,Ye(s)), b(x)) + ( a (x, z), b(x)a(x,Yt(s)))] ds ( h2 h 3 ) + 0 e 2 + e 4 · Using the fact that f QX(y',s,dz)g(z) satisfies a Lipschitz condition in x, we can replace QXe(s) by QX with an error of order o(l h / e Ixe(s) -X1dS) = o( : ). Further, f QX (yI, h 2 8S , dz )a(x, z) = O(e-kl(h-es)/e\ 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 177 After integration with respect to s we get a quantity of order O(e). Hence, = Ex,y (a (x,ye( : ) ),b(X)) (hit f + Ex,y 10 px(dz)[( a (x, z)a(x,Ye(S)), b(x)) ( h2 h 3 ) + ( a (x, z), b(x)a(x,Yt(S)))] ds + 0 e + t2 + e 4 = Ex,y (a (X,Ye ( : ) ), b(X)) + Ex,y 1 h / e (T(x)a(x,Ye(S)), b(x)) ds ( h2 h 3 ) + 0 e + e 2 + e 4 ' where T(x) = f a (x, z)Px(dz). We have used the equality f a(x, z)Px(dz) = o. The second term has the form (hit Ex,y 10 (a(x,Ye(S)), b l (x)) ds, where b l (x) = T*(x)b(x). The estimates obtained in computing A2 give us that (hit Ex,y 10 (a(x,Ye(S)), b l (x)) ds = 0(8 + h) + o(h). Further, using a procedure analogous to that in the computation of A2, we get that Ex,y (a (x,ye( : ) ),b(X)) = Ex,y( a (x,yx(  ) ).b(X)) {hit 1 { ( 8 (S + Ex,y 10 "8 ox al(X,Ye(S)) 10 a(x,Ye(U)) du, 8 ( h - es ) ) oy gl x, 8 2 ' Ye(s) +  tr { :x (BI(x,Ye(s))Bj(x,Ye(s))) l s a(x,Ye(u)) dU} { 82 ( h-es )}} ( h3 ) x oy 2 g1 x, 8 2 ,Ye(S) ds+O 8 4 · 
178 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER Here gl(X,S,y) = Ey( a (x,yX(s)),b(x)). We have that Ex a ( x, yX (  ) ) = O( e- k !h/e 2 ), 1 {hit {S ( 8 8" Ex,y 10 10 Ex.(u),y.(u) ax al (x, Ye(S - u) )a(x, Ye( u)), 8 ( h - es ) ) ay g1 x, 8 2 ,Ye(S - u) duds 1 {hit {S ( 8 ( ( S-u )) = 8 Ex,y 10 10 Ex.(u),y.(u) ax al x,Y x 8 a(x,y'), 8 ( h-es ( s-u ))) ( h4 ) _ 8 gl x, 2 ,yX duds+O 6" y e e y'=Yt(U) e 1 {hit {S f ( s u ) ( 8 = 8 Ex,y 10 10 cy.(u) Ye(U),  ,dz aX al(X,Z)a(x,Ye(u)), 8 ( h-es )) ( h4 ) ay g1 x, 8 2 'z duds+O 8 6 1 {hit {S f ( s - u ) = 8" Ex,y 10 10 CY Ye(U), 8 ,dz ( 8 8 ( h - es ) ) x ax a1(x,z)a(x,Ye(U)), ay g1 x, 8 2 ,z duds ( h 4 h 3 ) +0 -+- e 6 e 4 1 {hit {S f ( 8 = 8" Ex,y 10 10 Px(dz) ax a1(x, z)a(x,Ye(U)), 8 ( h - es ) ) ay g1 x, 8 2 ' z duds 1 {hit {S ( { ( s u h es ) }) + 8" 10 10 0 exp - k 1 8 + 8 2 duds ( h 4 h 3 ) +0 -+- e 6 e 4 =  Ex,y 1 h / e l s f PAdz)( :x al(X,Z)a(x,yx(  u)), 8 ( h - es ) ) ay g1 x, 8 2 ' z duds ( h 4 ) ( h4 h3 ) + 0 e 6 + O( e) + 0 e 6 + e 4 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 179 1 ( {hit {S { h - es } ) = eO 10 10 e-k1u/e exp - k 1 8 2 duds ( h3 h 4 ) +0 e+-+- e 4 e 6 ( ( h3 h4 ) = 0 e + e 4 + t6 ; here we used the fact that 8 ( { h - es }) Oy g1=0 exp -k 1 8 2 · Similarly, 1 {hit {S { 8 } eEx,y 10 10 tr ox (B 1 (X,Ye(s))Bj(x,Ye(s)))a(x,Ye(U)) 8 2 ( h - es ) ( h3 h 4 ) X oy 2 g1 x, 8 2 ,Ye(s) duds = 0 8+ 84 + t6 · Combining all the estimates, we get what is required. 0 We can now formulate and prove a theorem on the limit behavior of the process Xt (t Ie) as e --+ O. For convenience we collect all the conditions imposed in the intermediate estimates and computations. These condi- tions naturally break up into two groups: smoothness conditions on the coefficients of the system, and conditions on convergence to an ergodic distribution for the processes yX (t). THEOREM 16. Assume the following conditions holdfor the system (100): 1) a(x, y) satisfies sup la(x,y)I(1 + IxD- l < 00. x,y 2) T n d . . "" " " d '" . d . .I. j e erzvatzves ax, a y , a xx , a xy , a yy , an a xyy exzst an are contznuous and bounded for Ixl < rand y E Y, where r > 0 is arbitrary. 3) The functions al(x,y) and B l (x,y)Bi(x,y) are twice continuously differentiable with respect to x and y, and are four times continuously dif ferentiable with respect to y, and all these derivatives are boundedfor Ixl < r and y E Y, where r > 0 is arbitrary. 4) For all x the solution yX(t) of equation (88) is ergodic with ergodic distribution Px(dy) for which f a(x,y)px(dy) = O. 5) If g(y) E C?), then f Px(dy)g(y) satisfies a local Lipschitz condition. 6) If QX (y, t, d z) is the transition probability for the process yX (t), then for every r there exist a k(r) and a c,(y) such that for Ixl < r IQX(y,t,B) - Px(B)1 < c,(y)exp{-k(r)t} 
180 II. STOCHASTIC EQUATIONS WITH A SMALL PARAMETER and, for every g E C?) with Ig;l, Ilg;11 < r, 8 f 82 f oy QX(y,t,dz)g(z) + oy2 QX(y,t,dz)g(z) < c,(y) exp{ -k(r)t}, f QX(y, t, dz)g(z) - f QX' (y, t, d z)g(z) < c,(y)lx - x'l, and the function c, (y) is such that sup (la(x,y)1 + Ila(x,y)ll)c,(y) < 00. y,lxl' Ifxe(t), Ye(t) is the solution of system (100) with initial conditions xe(O) = Xo and Ye(O) = Yo, then the processes xe(tle) converge in distribution to the process x(t) that is the solution of (75) with initial condition x(O) = Xo, B(x) is determined by (107), and a(x) is determined by (113). PROOF. We use Theorem 1 and the remark after it. Let h = e 2 - P , where o < P < 1/4. Then on the basis of (111) we have for tl < t2 < ... < t m < t - h < t < t + h, <I> E Cxm, and rp E cf) with compact support that E ( Xe (  ), · · · , Xe C; ) ) x [tp ( Xe C : h ) ) - tp ( Xe ( ) ) - hLtp ( Xe ( ) ) ] = E ( Xe (  ), . . . , Xe ( t; ) ) Ex.(t/e),y.(,/e) x [tp (Xe (  ) ) - tp(xe(O)) - hLtp (x e (0)) ] = o(h) + E(Xe(  )'...'Xe C; )) X (8tp'(Xe()) +hC.(Xe() )tp'(Xe()) + hA ( Xe (  ) ) tp' ( Xe (  ) ), a ( Xe (  ) , Ye ( ) ) ) 
3. AVERAGING OVER VARIABLES FOR SYSTEMS OF EQUATIONS 181 = o(h) + E<I>(Xe(  ),...,xe C: )) X Ex.W-h)/e),y.W-h)/e) (ell" ( Xe ( : ) ) + hC. ( Xe ( : ) ) tp' (Xe ( : ) ) + hA ( Xe ( : ) ) tp' ( Xe ( : ) ), a (Xe ( : ), Ye ( : ) ) ) = o(h) + E<I>(Xe C; )'...'Xe C: )) X(e+h)o ( e+h+ h : + h; + h: ) =o(h). 0 e e e EXAMPLE. We consider the system of one-dimensional equations d Xe (t) dt = a(xe(t), Ye(t)), 1 1 dYe(t) = --al (Xe(t))Ye(t) dt + .  bI (Xe(t)) dw(t), B vB where at, b i E c1 4 ), al > J > 0, b i > J > 0, yX (t) is the solution of the linear equation dyX (t) = -al (x)y x (t) dt + b i (x) dw(t), and yX (t) has a normally distributed transition probability Q" (y, t, d z) = n (ye- al (x)t, : (1 - e-a\(x)t) + y 2 e- al (x)t(l - e- al (x)t), Z ) d z. Here n (a, b 2 , z) is the density of the normal distribution with mean a and variance b 2 . The ergodic distribution is also normal: ( bf(x) ) Px(dz) = n 0, aI(x) 'z . Therefore, ( b2(x) ) IQ" (y, t, d z) - Px(d z)1 = 0 (1 + y2) a (x) e-adx)t Px(d z). Let a(x,y) be a function in cj;) such that a(x,y)(l + IYI6) is bounded. Define a(x) = !a(x,y)px(dY) and a(x,y) = a(x,y) -a(x). Then conditions 5) and 6) hold, and hence the theorem is valid. 
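The ergodic behaviour used in the example can be checked numerically for the frozen linear equation $dy^x(t) = -a_1(x)y^x(t)\,dt + b_1(x)\,dw(t)$. The following is a minimal sketch, not part of the original text. It relies on the standard fact that this Ornstein-Uhlenbeck process has a Gaussian transition law with mean $ye^{-a_1 t}$ and variance $\frac{b_1^2}{2a_1}(1 - e^{-2a_1 t})$, so that its ergodic distribution is normal with mean $0$ and variance $b_1^2/(2a_1)$. The function names, the numerical values of $a_1$ and $b_1$, the step size, and the seed are all hypothetical choices made for illustration.

```python
import math
import random


def stationary_variance(a1, b1):
    """Variance of the ergodic law of dy = -a1*y dt + b1 dw (a1 > 0)."""
    return b1 * b1 / (2.0 * a1)


def transition_moments(y, t, a1, b1):
    """Mean and variance of y(t) given y(0) = y; the law is Gaussian."""
    decay = math.exp(-a1 * t)
    return y * decay, stationary_variance(a1, b1) * (1.0 - decay * decay)


def time_average(func, y0, a1, b1, step=0.1, n_steps=100_000, seed=0):
    """Average func along one path, sampled exactly on a grid of width step."""
    rng = random.Random(seed)
    y, total = y0, 0.0
    for _ in range(n_steps):
        mean, var = transition_moments(y, step, a1, b1)
        y = rng.gauss(mean, math.sqrt(var))
        total += func(y)
    return total / n_steps


if __name__ == "__main__":
    a1, b1 = 1.0, 1.0  # hypothetical frozen values of a_1(x) and b_1(x)
    print("stationary variance:", stationary_variance(a1, b1))
    print("time average of y^2:", time_average(lambda y: y * y, 3.0, a1, b1))
```

Because the path is sampled from the exact transition law, the only error in the time average is statistical. The same routine can be used to approximate an average $\int a(x,y)\rho_x(dy)$ by passing `func = lambda y: a(x, y)` for a fixed $x$.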
CHAPTER III

Stability. Linear Systems

§1. Stability of sample paths of homogeneous Markov processes

1.1. Definition. In the most interesting cases, solutions of stochastic differential equations are sample paths of homogeneous Markov processes, and hence the study of the stability of the latter is certainly of interest. We are interested in the behavior of a Markov process in a neighborhood of a stationary point on an infinite time interval. Let $X$ be the phase space of a process $x(t)$, $x_t$ its sample paths, $E_x$ and $P_x$ the expectation and probability when the initial position of the process is $x$, and $A$ its generating operator. A point $\bar{x}$ is said to be stationary if $A\varphi(\bar{x}) = 0$ for all $\varphi \in D_A$, where $D_A$ is the domain of the generating operator. A point is stationary if $P_{\bar{x}}\{x_t = \bar{x}\} = 1$ for all $t > 0$ (such points are said to be absorbing); however, this condition is not necessarily satisfied for a stationary point. For processes given by the homogeneous stochastic equation

$$dx(t) = a(x(t))\,dt + B(x(t))\,dw(t) + \int f(x(t), \theta)\,\mu(d\theta \times dt),$$

a point is stationary if $a(\bar{x}) = 0$, $B(\bar{x}) = 0$, and $f(\bar{x}, \theta) = 0$.

We use the following notation. Suppose that $X$ is a metric space with metric $r(x,y)$.

DEFINITIONS. 1. A stationary point $\bar{x}$ is said to be stable if for every $\varepsilon > 0$ and $\rho > 0$ there is a $\delta > 0$ such that

$$\sup_{r(x,\bar{x}) < \delta} P_x\Big\{\sup_t r(x_t, \bar{x}) > \rho\Big\} < \varepsilon.$$

2. A stationary point $\bar{x}$ is said to be asymptotically stable if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that

$$\inf_{r(x,\bar{x}) < \delta} P_x\Big\{\lim_{t\to\infty} r(x_t, \bar{x}) = 0\Big\} > 1 - \varepsilon.$$
3. A stationary point $\bar{x}$ is said to be asymptotically stable in the large if for all $x \in X$

$$P_x\Big\{\lim_{t\to\infty} r(x_t, \bar{x}) = 0\Big\} = 1.$$

Note that Definitions 1 and 2 are natural carry-overs of well-known definitions in the theory of differential equations to the case of Markov processes. In the theory of differential equations there is a very broad class of equations for which stability holds, but not asymptotic stability (undamped oscillations about a stable equilibrium point for a mechanical system). For Markov processes such a situation is rather exceptional. This is shown by the following theorem.

THEOREM 1. Suppose that $\bar{x}$ is a stationary point, and that, for any closed bounded set $F \subset X\setminus\{\bar{x}\}$, any open set $U$, and sufficiently large $T$,

$$\inf_{x\in F} P_x\{\tau_U < T\} > 0, \qquad \tau_U = \inf[t: x_t \in U].$$

In this case if $\bar{x}$ is stable, it is asymptotically stable.

PROOF. Let $S_{r_1,r_2} = \{x: r_2 \le r(x,\bar{x}) \le r_1\}$. This is a closed set. Define $U_{r_1,r_2} = X\setminus S_{r_1,r_2}$. By an assumption of the theorem, there exist a $T > 0$ and an $\alpha > 0$ such that

$$P_x\{\tau_{U_{r_1,r_2}} > T\} \le 1 - \alpha, \qquad x \in S_{r_1,r_2}.$$

Then

$$P_x\{\tau_{U_{r_1,r_2}} > 2T\} \le E_x I_{\{\tau_{U_{r_1,r_2}} > T\}} P_{x(T)}\{\tau_{U_{r_1,r_2}} > T\} \le (1-\alpha)^2,$$

$$P_x\{\tau_{U_{r_1,r_2}} > nT\} \le (1-\alpha)^n, \qquad E_x \tau_{U_{r_1,r_2}} \le T/\alpha^2, \qquad x \in S_{r_1,r_2}.$$

Note that

$$P_x\{r(x(\tau_{U_{r_1,r_2}}), \bar{x}) > r_2\} \le P_x\Big\{\sup_t r(x_t,\bar{x}) > r_1\Big\}.$$

Writing $U_{r_2} = \{x: r(x,\bar{x}) < r_2\}$, we have that

$$P_x\{\tau_{U_{r_2}} < \infty\} \ge 1 - P_x\Big\{\sup_t r(x_t,\bar{x}) > r_1\Big\}.$$

If $r_1 > r_2 > \cdots$ and $r_n \to 0$, then the events $\{\tau_{U_{r_n}} < \infty\}$ are decreasing, and for all $n$

$$P_x\{\tau_{U_{r_n}} < \infty\} \ge 1 - P_x\Big\{\sup_t r(x_t,\bar{x}) > r_1\Big\}.$$
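Hence,

$$P_x\Big(\bigcap_n \{\tau_{U_{r_n}} < \infty\}\Big) \ge 1 - P_x\Big\{\sup_t r(x_t,\bar{x}) > r_1\Big\},$$

$$\inf_{r(x,\bar{x}) \le \delta} P_x\Big(\bigcap_n \{\tau_{U_{r_n}} < \infty\}\Big) \ge 1 - \sup_{r(x,\bar{x}) \le \delta} P_x\Big\{\sup_t r(x_t,\bar{x}) > r_1\Big\}.$$

Suppose that, for a given $\varepsilon > 0$ and $r_1$, $\delta > 0$ is chosen so that

$$\sup_{r(x,\bar{x}) \le \delta} P_x\Big\{\sup_t r(x_t,\bar{x}) > r_1\Big\} < \frac{\varepsilon}{2},$$

and the $r_n$ ($n > 1$) are chosen so that

$$\sup_{r(x,\bar{x}) \le r_{n+1}} P_x\Big\{\sup_t r(x_t,\bar{x}) > r_n\Big\} < \varepsilon \cdot 2^{-(n+1)}.$$

Then

$$P_x\Big\{\lim_{t\to\infty} r(x_t,\bar{x}) = 0\Big\} \ge P_x\Big\{\bigcap_n \Big[\{\tau_{U_{r_{n+1}}} < \infty\} \cap \Big\{\sup_{t > \tau_{U_{r_{n+1}}}} r(x_t,\bar{x}) \le r_n\Big\}\Big]\Big\}$$

$$\ge P_x\Big\{\bigcap_n \{\tau_{U_{r_n}} < \infty\}\Big\} - \sum_n P_x\Big\{\{\tau_{U_{r_{n+1}}} < \infty\} \cap \Big\{\sup_{t > \tau_{U_{r_{n+1}}}} r(x_t,\bar{x}) > r_n\Big\}\Big\}$$

$$\ge 1 - \frac{\varepsilon}{2} - \sum_{n=1}^{\infty} E_x P_{x(\tau_{U_{r_{n+1}}})}\Big\{\sup_t r(x_t,\bar{x}) > r_n\Big\} \ge 1 - \frac{\varepsilon}{2} - \sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n+1}} \ge 1 - \varepsilon$$

if $r(x,\bar{x}) < \delta$. □

REMARK 1. Let $X$ be a locally compact space, and let $E_x f(x_t) \in C_X$ for all $f \in C_X$, i.e., the process is a Feller process. Then for the condition of the theorem to hold it suffices that for all $x \ne \bar{x}$ and any open set $U$ we have that $P_x\{x_t \in U\} > 0$ for some $t > 0$. Indeed, there is a $\varphi \in C_X$ with support in $U$ such that $E_x\varphi(x(t)) > 0$, and hence there exist a $\delta > 0$ and a neighborhood $S(x)$ of $x$ such that $E_y\varphi(x_t) > \delta$ for $y \in S(x)$, and thus $P_y\{x_t \in U\} > \delta$ for $y \in S(x)$. Using the compactness of $F$, we can find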
a finite covering of $F$, $F \subset \bigcup_{k=1}^{m} S(x_k)$, such that for each $k$ there exist $t_k$ and $\delta_k$ for which

$$P_y\{x_{t_k} \in U\} > \delta_k, \qquad y \in S(x_k), \quad k = 1,\ldots,m.$$

Therefore,

$$P_x\Big\{\tau_U \le \max_{k\le m} t_k\Big\} \ge \min_{k\le m} \delta_k, \qquad x \in F.$$

REMARK 2. It actually suffices that the condition of the theorem holds for $F$ and $U$ lying in some neighborhood of the stationary point.

We consider examples of unstable stationary points.

EXAMPLE 1. Let $w(t)$ be a one-dimensional Wiener process, and let $x(t) = 1/(1 + w^2(t))$. We extend the definition of $x(t)$ as follows: if $x(0) = 0$, then $x(t) = 0$ for all $t > 0$. Using the Itô formula, we have that

$$dx(t) = -\frac{2w(t)\,dw(t)}{(1 + w^2(t))^2} + \frac{3w^2(t) - 1}{(1 + w^2(t))^3}\,dt.$$

Since

$$\frac{w^2(t)}{1 + w^2(t)} = 1 - x(t), \qquad \frac{|w(t)|}{\sqrt{1 + w^2(t)}} = \sqrt{1 - x(t)},$$

$$\frac{3w^2(t) - 1}{(1 + w^2(t))^3} = x^2(t)[3(1 - x(t)) - x(t)] = x^2(t)(3 - 4x(t)),$$

$x(t)$ satisfies the stochastic differential equation

$$dx(t) = 2x^{3/2}(t)\sqrt{1 - x(t)}\,d\tilde{w}(t) + x^2(t)(3 - 4x(t))\,dt,$$

where $\tilde{w}(t) = -\int_0^t \operatorname{sgn} w(s)\,dw(s)$ is also a Wiener process, $x(t)$ is a Markov process, and the point $0$ is stationary for it. It is obvious from the form of $x(t)$ that this point is not stable, while $x(t) \to 0$ in probability as $t \to \infty$, since

$$E\frac{1}{1 + w^2(t)} \le \frac{1}{\sqrt{2\pi t}}\int \frac{dx}{1 + x^2} = \frac{\pi}{\sqrt{2\pi t}} \to 0.$$

This example shows that the convergence of $x_t$ in probability to a stationary point does not imply stability.

EXAMPLE 2. Suppose that $w(t)$ is again a one-dimensional Wiener process, and $\tau_t$ is determined by the equality

$$t = \int_0^{\tau_t} \frac{ds}{g(w(s))},$$

where $g(x)$ is an even bounded continuous function that is positive for $x \ne 0$, $g(0) = 0$, and

$$\liminf_{|x|\to\infty} g(x) > 0, \qquad \int_{-\delta}^{\delta} \frac{dx}{g(x)} < \infty.$$

The conditions on $g(x)$ ensure that $\tau_t$ is defined for all $t$ ($\int_0^\infty ds/g(w(s)) = +\infty$), and $\tau_t \to \infty$ as $t \to \infty$ ($P\{\int_0^t ds/g(w(s)) < \infty\} = 1$ for all $t$). Let $x_t = w(\tau_t)$. Then $x_t$ satisfies the stochastic differential equation

$$dx_t = \sqrt{g(x_t)}\,d\tilde{w}_t,$$
where

$$\tilde{w}(t) = \int_0^{\tau_t} \frac{dw(s)}{\sqrt{g(w(s))}}$$

is a Wiener process. Since $g(0) = 0$, it follows that $0$ is a stationary point. But this point is not even absorbing: the process hits this point in a finite amount of time and instantly leaves it.

1.2. A Feller process on a compact metric space. Let $X$ be a compact metric space, and suppose that the homogeneous Markov process $x_t$ is right-continuous, $E_x f(x_t) = T_t f(x) \in C_X$ for all $t > 0$ and $f \in C_X$, and $\|T_t f - f\| \to 0$ as $t \to 0$ (the last assumption holds if $T_t f(x) \to f(x)$ as $t \to 0$, i.e., if the process is stochastically continuous). Let $\bar{x}$ be the unique stationary point for the process. It turns out that if it is stable, then under natural assumptions it is asymptotically stable in the large. This is a consequence of the following assertion.

THEOREM 2. Suppose that $\bar{x}$ is the unique stationary point of a Feller process such that there is no closed invariant subset not containing $\bar{x}$. If $\bar{x}$ is stable, then it is asymptotically stable in the large, and for every $\rho > 0$

$$\lim_{T\to\infty} \sup_{x\in X} P_x\Big\{\sup_{t\ge T} r(x_t,\bar{x}) > \rho\Big\} = 0. \tag{1}$$

PROOF. It follows from the stability of $\bar{x}$ that for every $\varepsilon > 0$ and $\rho > 0$ there exists a $\delta > 0$ such that

$$P_x\Big\{\sup_{t\ge 0} r(x_t,\bar{x}) > \rho\Big\} < \varepsilon$$

for $r(x,\bar{x}) < \delta$. Let $U_\delta = \{x: r(x,\bar{x}) < \delta\}$. Denote by $F$ the set of $x$ such that $P_x\{x_t \in U_\delta\} = 0$, $t \in R_+$. The set $F$ is invariant; therefore its closure is also invariant (see Chapter I, 4.1). Hence, $F$ is empty. Using Remark 1 and Theorem 1, we see that $E_x\tau_U$ is a bounded variable, where $\tau_U$ is the first time the process hits $U_\delta$ (if $x_0 \in U_\delta$, then let $\tau_U = 0$). Therefore,

$$P_x\Big\{\sup_{t\ge T} r(x_t,\bar{x}) > \rho\Big\} \le P_x\Big\{\sup_{t\ge\tau_U} r(x_t,\bar{x}) > \rho,\ \tau_U < T\Big\} + P_x\{\tau_U \ge T\}$$

$$\le E_x P_{x(\tau_U)}\Big\{\sup_t r(x_t,\bar{x}) > \rho\Big\} + \frac{E_x\tau_U}{T} \le \varepsilon + \frac{C}{T},$$

where $C = \sup_x E_x\tau_U$. Since $\sup_{t\ge T} r(x_t,\bar{x})$ decreases as $T$ increases, it follows that

$$P_x\Big\{\lim_{T\to\infty}\sup_{t\ge T} r(x_t,\bar{x}) > \rho\Big\} = \lim_{T\to\infty} P_x\Big\{\sup_{t\ge T} r(x_t,\bar{x}) > \rho\Big\} \le \varepsilon.$$
Hence,

$$P_x\Big\{\lim_{T\to\infty}\sup_{t\ge T} r(x_t,\bar{x}) > \rho\Big\} = 0$$

for all $\rho > 0$, and

$$P_x\Big\{\lim_{T\to\infty}\sup_{t\ge T} r(x_t,\bar{x}) = 0\Big\} = P_x\Big\{\lim_{t\to\infty} r(x_t,\bar{x}) = 0\Big\} = 1. \qquad \square$$

We say that a process is irreducible away from a stationary point if there are no closed invariant subsets not containing this point. For such processes the concepts of stability, asymptotic stability, and asymptotic stability in the large coincide.

Conditions will be found under which the process hits a stationary point in a finite amount of time. As is easy to see from the definition, $P_{\bar{x}}\{x_t = \bar{x}\} = 1$ in the case of stability, i.e., a stable point is absorbing. Obviously, the set $F$ of $x$ such that $P_x\{x_t = \bar{x}\} = 0$ for all $t$ is invariant. It will be assumed that the function $P_x\{x_t = \bar{x}\} = P(t,x,\{\bar{x}\})$ is continuous in $x$. Let $F_0 = \bigcap_k \{x: P(t_k,x,\{\bar{x}\}) = 0\}$, where $t_k \uparrow +\infty$ (the function $P(t,x,\{\bar{x}\})$ is nondecreasing with respect to $t$, since $\bar{x}$ is an absorbing point). As an intersection of closed sets, $F_0$ is closed, and it does not contain $\bar{x}$. Therefore, by our assumption about irreducibility, $F_0 = \emptyset$. By using the compactness of $X$, the continuity of the function $P(t,x,\{\bar{x}\})$ with respect to $x$, and its monotonicity in $t$, we can easily see that $P(t,x,\{\bar{x}\}) \ge \alpha > 0$ for all $x$ when $t$ is sufficiently large. But then

$$P(t,x,X\setminus\{\bar{x}\}) \le 1 - \alpha,$$

$$P(2t,x,X\setminus\{\bar{x}\}) = \int_{X\setminus\{\bar{x}\}} P(t,x,dy)P(t,y,X\setminus\{\bar{x}\}) \le (1-\alpha)^2,$$

$$P(nt,x,X\setminus\{\bar{x}\}) \le (1-\alpha)^n.$$

Hence, there exists a $\rho > 0$ such that $P(t,x,X\setminus\{\bar{x}\}) \le e^{-\rho t}/\rho$. Denote by $\tau_{\bar{x}}$ the first time the point $\bar{x}$ is hit. Then

$$\tau_{\bar{x}} = \int_0^\infty I_{\{x_t \ne \bar{x}\}}\,dt, \qquad E_x\tau_{\bar{x}} = \int_0^\infty P(t,x,X\setminus\{\bar{x}\})\,dt,$$

and since the integrand has an integrable majorant $e^{-\rho t}/\rho$ and is continuous in $x$, $\psi(x) = E_x\tau_{\bar{x}}$ is continuous in $x$. Further, $\psi(\bar{x}) = 0$. We consider the Markov process obtained from $x_t$ by stopping it at the time $\tau_{\bar{x}}$. Its phase space is $X\setminus\{\bar{x}\}$. Denote by $E_x^*$ and $P_x^*$ the expectation and probability for the terminating process under the condition that the initial point
is $x$; the corresponding semigroup and generating operator are denoted by $T_t^*$ and $A^*$. If $\varphi(x)$ has a limit as $x \to \bar{x}$, then

$$T_t^*\varphi(x) = T_t\varphi(x) - \lim_{x\to\bar{x}}\varphi(x)\,P(t,x,\{\bar{x}\}).$$

This function is continuous if $\varphi$ is. Regarding $X\setminus\{\bar{x}\}$ as a locally compact space with the point $\bar{x}$ at infinity, we see that the semigroup $T_t^*$ corresponds to a regular process (see Gikhman and Skorokhod [1], Vol. III, Russian p. 170, English pp. 124-125). Note that $A^*\varphi = A\varphi$ for $x \in X\setminus\{\bar{x}\}$ if $\varphi \in D_A$. We show that $\psi \in D_{A^*}$. Indeed,

$$T_h^*\psi(x) = E_x^*\psi(x_h) = E_x\int_h^\infty I_{\{x_s \ne \bar{x}\}}\,ds = \int_h^\infty P(t,x,X\setminus\{\bar{x}\})\,dt,$$

$$\lim_{h\to 0}\frac{1}{h}\big(T_h^*\psi(x) - \psi(x)\big) = -\lim_{h\to 0}\frac{1}{h}\int_0^h P(t,x,X\setminus\{\bar{x}\})\,dt = -I_{X\setminus\{\bar{x}\}}(x).$$

Suppose that there exists a continuous function $\psi(x)$ such that $A^*\psi(x) = -1$ for $x \ne \bar{x}$, and $\psi(x) > 0$. Then for $\tau_{U_\delta}$

$$E_x\psi(x_{\tau_{U_\delta}}) - \psi(x) = E_x\int_0^{\tau_{U_\delta}} A^*\psi(x_s)\,ds = -E_x\tau_{U_\delta},$$

$$E_x\tau_{U_\delta} = \psi(x) - E_x\psi(x(\tau_{U_\delta})) \le \psi(x).$$

Since $\lim_{\delta\to 0}\tau_{U_\delta} = \tau_{\bar{x}}$, it follows that $E_x\tau_{\bar{x}} \le \psi(x)$. Thus, the following assertion has been proved.

THEOREM 3. Suppose that the condition in Theorem 2 holds, and, moreover, the function $P(t,x,\{\bar{x}\})$ is continuous in $x$ for all $t$. For $\tau_{\bar{x}}$ to be finite with $P_x$-probability 1 for $x \in X$ it is necessary and sufficient that there exist a function $\psi(x) \in D_{A^*}$ such that

$$A^*\psi(x) = -1 \quad\text{for } x \ne \bar{x}. \tag{2}$$

REMARK. We consider the weak generating operator $\tilde{A}$ of the process $x_t$, which is defined as follows: $\varphi \in D_{\tilde{A}}$ if $\varphi$ is continuous and bounded, and there exists a bounded function $g(x)$ such that

$$T_t\varphi(x) - \varphi(x) = \int_0^t T_s g(x)\,ds$$

for $t > 0$ and $x \in X$. In this case let $g(x) = \tilde{A}\varphi(x)$. If $\varphi \in D_{\tilde{A}}$, then

$$\varphi(x_t) - \varphi(x) - \int_0^t g(x_s)\,ds$$

is a martingale, and for every stopping time $\tau$ with $E_x\tau < \infty$ the Dynkin formula is valid (see Dynkin [1], formula (5.8)):

$$E_x\varphi(x_\tau) - \varphi(x) = E_x\int_0^\tau \tilde{A}\varphi(x_s)\,ds, \tag{3}$$
and hence (2) holds if $\psi \in D_{\tilde{A}}$, $\tilde{A}\psi = -1$ for $x \ne \bar{x}$, and $A^*$ is replaced by $\tilde{A}$. Thus, the existence of a $\psi \in D_{\tilde{A}}$ with $\tilde{A}\psi = -1$ for $x \ne \bar{x}$ and $\psi > 0$ is a sufficient condition for $E_x\tau_{\bar{x}} < \infty$.

We find a condition for the stability of a stationary point $\bar{x}$ under the assumption that the process is irreducible in $X\setminus\{\bar{x}\}$.

LEMMA 1. If $\bar{x}$ is stable, then for every closed set $F$ not containing $\bar{x}$

$$\sup_x \int_0^\infty P(t,x,F)\,dt < \infty.$$

PROOF. Since $r(x_t,\bar{x}) \to 0$ and $\bar{x} \notin F$, it follows that $I_F(x_t) = 0$ for sufficiently large $t$. Hence,

$$\int_0^\infty I_F(x_t)\,dt < \infty.$$

We choose $\delta > 0$ such that for $r(x,\bar{x}) < \delta$

$$P_x\Big\{\sup_t r(x_t,\bar{x}) > \rho\Big\} < \varepsilon,$$

where $\rho < r(\bar{x},F)$. Since the event

$$\Big\{\sup_{t > \tau_{U_\delta}} r(x_t,\bar{x}) < \rho\Big\}$$

implies the event

$$\Big\{\int_{\tau_{U_\delta}}^\infty I_F(x(s))\,ds = 0\Big\},$$

it follows that $\int_0^\infty I_F(x_s)\,ds \le \tau_{U_\delta}$ when this event holds. Hence,

$$P_x\Big\{\int_0^\infty I_F(x_s)\,ds > c\Big\} \le P_x\Big\{\sup_{t > \tau_{U_\delta}} r(x_t,\bar{x}) > \rho\Big\} + P_x\{\tau_{U_\delta} > c\} \le \varepsilon + \sup_x E_x\tau_{U_\delta}/c.$$

Thus, for sufficiently large $c$

$$\sup_x P_x\Big\{\int_0^\infty I_F(x_s)\,ds > c\Big\} \le \frac{1}{2}$$

(we use the fact that $\sup_x E_x\tau_{U_\delta} < \infty$ for all $\delta > 0$, as follows from Theorem 1 and the remark after it). Consequently, if $\tau$ is the first time when
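$\int_0^\tau I_F(x_s)\,ds = c$, then

$$P_x\Big\{\int_0^\infty I_F(x_s)\,ds > 2c\Big\} = P_x\Big\{\tau < \infty,\ \int_\tau^\infty I_F(x_s)\,ds > c\Big\} = E_x I_{\{\tau < \infty\}} P_{x_\tau}\Big\{\int_0^\infty I_F(x_s)\,ds > c\Big\} \le \frac{1}{4},$$

$$P_x\Big\{\int_0^\infty I_F(x_s)\,ds > nc\Big\} \le \frac{1}{2^n}, \qquad E_x\int_0^\infty I_F(x_s)\,ds \le 4c. \qquad \square$$

LEMMA 2. Suppose that $f(x)$ is a continuous function, and $f(x) = 0$ for $r(x,\bar{x}) < \delta$, where $\delta > 0$. Then the integral $\int_0^\infty T_t f(x)\,dt$ is defined and is a continuous function of $x$.

PROOF. Convergence of the integral follows from Lemma 1. We show that under the conditions of Lemma 1

$$\lim_{t\to\infty}\sup_x \int_t^\infty P(s,x,F)\,ds = 0. \tag{4}$$

Indeed, let $\sup_x \int_0^\infty P(s,x,F)\,ds = c_1$. Then

$$\int_t^\infty P(s,x,F)\,ds = E_x\int_t^\infty I_F(x_s)\,ds = E_x E_{x_t}\int_0^\infty I_F(x_s)\,ds$$

$$\le E_x I_{\{r(x_t,\bar{x}) > \delta_1\}} E_{x_t}\int_0^\infty I_F(x_s)\,ds + E_x I_{\{r(x_t,\bar{x}) \le \delta_1\}} E_{x_t}\int_0^\infty I_F(x_s)\,ds$$

$$\le c_1 P_x\{r(x_t,\bar{x}) > \delta_1\} + \sup_{r(y,\bar{x}) \le \delta_1} E_y\int_0^\infty I_F(x_s)\,ds.$$

Denote by $\tau_F$ the first time the process hits the set $F$. Then $\int_0^{\tau_F} I_F(x_s)\,ds = 0$. Hence,

$$E_y\int_0^\infty I_F(x_s)\,ds = E_y\int_{\tau_F}^\infty I_F(x_s)\,ds \le E_y I_{\{\tau_F < \infty\}} c_1 \le c_1 P_y\Big\{\sup_{s>0} r(x_s,\bar{x}) \ge r(\bar{x},F)\Big\}.$$

We finally get the inequality

$$\sup_x \int_t^\infty P(s,x,F)\,ds \le c_1\Big(\sup_x P_x\{r(x_t,\bar{x}) > \delta_1\} + \sup_{r(y,\bar{x}) \le \delta_1} P_y\Big\{\sup_{s>0} r(x_s,\bar{x}) \ge r(\bar{x},F)\Big\}\Big).$$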
The first term on the right-hand side tends to zero as $t \to \infty$ for any $\delta_1$, and the second can be made arbitrarily small by suitably choosing $\delta_1$. This establishes (4). However,

$$\left| \int_0^\infty T_s f(x)\,ds - \int_0^T T_s f(x)\,ds \right| \le \|f\| \int_T^\infty P(s, x, F_0)\,ds,$$

where $F_0 = \operatorname{supp} f$ and $\bar x \notin F_0$. Since $\int_0^T T_s f(x)\,ds \in C_X$, the last estimate and (4) give us the proof. □

COROLLARY. If $\bar x$ is stable, then there exists a function $f \in C_X$ such that $f(x) > 0$ for $x \ne \bar x$ and $\int_0^\infty T_s f(x)\,ds \in C_X$.

Indeed, choose a sequence $\delta_k \downarrow 0$ and suppose that $f_k \in C_X$, $f_k(x) \ge 0$, $f_k(x) = 1$ for $r(x, \bar x) \ge \delta_k$, and $f_k(x) = 0$ for $r(x, \bar x) \le \delta_{k+1}$. Then $\int_0^\infty T_s f_k(x)\,ds \in C_X$, by Lemma 2. Therefore, we can choose a sequence $a_k > 0$ such that

$$\sum_k a_k \|f_k\| < \infty, \qquad \sum_k a_k \left\| \int_0^\infty T_s f_k(x)\,ds \right\| < \infty,$$

and take $f(x) = \sum_k a_k f_k(x)$.

THEOREM 4. When the process is irreducible in $X \setminus \{\bar x\}$, the stationary point $\bar x$ is stable if and only if there exists a function $g \in D_A$ such that $g(x) > 0$ for $x \ne \bar x$, $g(\bar x) = 0$, and $Ag(x) \le 0$.

PROOF. Let $\bar x$ be a stable point. On the basis of the corollary to Lemma 2, there exists a function $f(x)$ such that $f(x) > 0$ for $x \ne \bar x$ and $\int_0^\infty T_s f(x)\,ds \in C_X$. Let $g(x) = \int_0^\infty T_s f(x)\,ds$. Since $\bar x$ is an absorbing point, $T_s f(\bar x) = f(\bar x)$, and hence $T_s f(\bar x) = 0$ because $g(\bar x)$ is finite. But then $g(\bar x) = 0$. It is easy to see that $Ag(x) = -f(x) < 0$ for $x \ne \bar x$. The necessity is proved.

Sufficiency. Let $g(x)$ be a function satisfying the conditions of the theorem. Then

$$E_x g(x_t) = g(x) + E_x \int_0^t Ag(x_s)\,ds \le g(x).$$

Therefore, $g(x)$ is an excessive function, and $g(x_t)$ is a bounded nonnegative supermartingale. Hence,

$$P_x\Big\{ \sup_t g(x_t) > \lambda \Big\} \le g(x)/\lambda.$$
Since $g(\bar x) = 0$, $g(x)$ is continuous, and $g(x) > 0$ for $x \ne \bar x$, for all $\delta > 0$ we have that $\inf_{r(x, \bar x) \ge \delta} g(x) = \rho > 0$, and

$$P_x\Big\{ \sup_t r(x_t, \bar x) > \delta \Big\} \le P_x\Big\{ \sup_t g(x_t) \ge \rho \Big\} \le g(x)/\rho,$$

$$\sup_{r(x, \bar x) \le \delta_1} P_x\Big\{ \sup_t r(x_t, \bar x) > \delta \Big\} \le \frac{1}{\rho} \sup_{r(x, \bar x) \le \delta_1} g(x). \tag{5}$$

For any $\delta > 0$ the right-hand side of (5) can be made less than $\varepsilon$ by suitably choosing $\delta_1$. □

REMARK. In the proof of the sufficiency of the condition for stability of a stationary point $\bar x$ we used only the fact that $g(x)$ is continuous, $g(x) > 0$ for $x \ne \bar x$, $g(\bar x) = 0$, and $g(x_t)$ is a supermartingale.

DEFINITION. A continuous nonnegative function $g(x)$ is called a Lyapunov function for the process $x_t$ at the point $\bar x$ if $\bar x$ is the only zero of $g(x)$, and for all $x$

$$\varlimsup_{t \downarrow 0} \frac{1}{t} [T_t g(x) - g(x)] \le 0, \qquad \sup_{x,\, t > 0} \frac{1}{t} [T_t g(x) - g(x)] < \infty.$$

It follows from Theorem 4 that under the conditions of Theorem 4 a Lyapunov function exists.

THEOREM 5. If at a stationary absorbing point $\bar x$ there exists a Lyapunov function for $x_t$, then $\bar x$ is stable.

PROOF. We prove first that if $g(x)$ is a Lyapunov function at $\bar x$, then $T_t g(x) \le g(x)$ for all $x \in X$ and $t > 0$. The function $T_t g(x)$ is continuous with respect to $t$. Let us show that for all $x$ and $t$

$$\varlimsup_{h \downarrow 0} \frac{1}{h} [T_{t+h} g(x) - T_t g(x)] \le 0. \tag{6}$$

The expression after the limit sign has the form

$$\frac{1}{h} \int P(t, x, dy) [T_h g(y) - g(y)] \le \frac{1}{h} \int P(t, x, dy) ([T_h g(y) - g(y)] \vee 0).$$

The function $\frac{1}{h}([T_h g(y) - g(y)] \vee 0)$ is nonnegative, is bounded by a constant, i.e.,

$$\sup_{y,\, h > 0} \frac{1}{h} [T_h g(y) - g(y)] < \infty,$$

and tends to zero as $h \to 0$, because

$$\varlimsup_{h \downarrow 0} \frac{1}{h} (T_h g(y) - g(y)) \le 0.$$
194 III. STABILITY. LINEAR SYSTEMS Hence, lim h I f P(t, X, dy)([Thg(y) - g(y)] V 0) = o. hO This implies (6). If A(t) is a continuous function such that -1 lim h [A(t + h) - A(t)] < 0 h!O for all t, then A(t) is nonincreasing. We get from the condition Trg(x) < g(x) that g(x t ) is a bounded nonnegative supermartingale. The rest of the proof is the same as in Theorem 4. 0 We consider unstable stationary points. DEFINITION. If for a stationary point x there exist an a > 0 and a J > 0 such that Px {spr(XhX) > tJ } > a for all x, then x is said to be unstable. THEOREM 6. If a stationary point x is unstable, P(t,x, {x}) = 0 for all t > 0 and x # x, and the process Xt is irreducible in X\{x}, then there exists a J o > 0 such that Px {spr(XhX) > tJ o } = 1. PROOF. Let J be such that J > J o > J 1 > ... in the definition. Denote by !n the exit time from the 'Set S6 n ,60 = {x: I n < r(x,x) < J o }. It is finite for any I n < J o . Suppose that I n with even n > 2 has been chosen so that for r(x,x) > I n - 1 Px { r(x c v ,x) > In } > 2 1 . (7) 6n-1 Here U e = {x: r(x,x) < e}, and Cu is the first time U is hit. Choose J 1 less than J o . The remaining I n with odd indices are chosen so that for x E S6 n - 2 ,6 n - 1 PX{r(xTn,x) > Jo} > al2 (8) (we assume that the J k with indices k < n have already been chosen). Relation (7) can be satisfied, since the process r- 1 (xr,x) is bounded on each finite time interval and sup Px { supr-I(Xh X ) >  } r(x,x)6n-1 tT Un can be made less than 1/4 for every T by suitably choosing I n . On the other hand, E X Cu 6 is bounded for r(x,x) > I n - 1 , and hence T can be n-I 
1. STABILITY OF SAMPLE PATHS 195 chosen so that for all x P X {CU 6 > T} < 1/4. n-I Then Px{r(xcu ,x) < n} < Px { supr-1(xr,x) > ; } + P X {C U 6 > T} < 2 1 . 6n-1 tT Un n-I We now show that n can be chosen (n odd) so that (8) holds. Denote by C the first time the set {x: r( x, x) > } is hit. By a condition of the theorem, Px{C < +oo} > a for all x. Let g(A), A > 0, be a continuous function such that g(A) = 0 for A < o, g(A) = 1 for A > , and 0 < g(A) < 1. Then the function Exg(sUPtTr(xr,x)) is continuous in x for all T (see Gikhman and Skorokhod [1], Vol. I, Russian p. 508, English p. 431). If Exg(SUPtT r(xr,x)) > P, then P { supr(xr,x) > o } > p. tT Since lim Exg ( supr(xr,x) ) > Px { supr(Xt,X) >  } > a Too tT t for all x E S6 n - 2 6 n - l , for every x and e > 0 there exist a neighborhood U x of x and a Tx > 0 such that Py { SUP r(xt,x) > o } > a - e for Y E U x - yTx Using the compactness of S6 n - 2 ,6 n - l , we see that there is a T such that for all x E S6 n - 2 ,6 n - 1 Px { SUP r(xr, x) > o } = Px{C o < T} > a - e, tT where CO is the first time the set {x: r(x,x) > o} is hit. Obviously, Px{r(x(!n),x) > o} = Px{!n = Co} = Px{C o < C U 6 n } > P X ( {Co < T} n {C u 6n > T}) > Px{C o < T} - P X {C U 6 n < T} > a - e - Px { supr-1(xr,x) > ; } . tT Un If we choose e = a/4 and n so that sup Px { supr-1(xr,x) > : } < a 4 , r(x,.t)6n-1 t T Un 
196 III. STABILITY. LINEAR SYSTEMS then (8) holds. Observe now that for even n > 0 Px { sUP r(xr,x) > Jolcu -o } 'U n -2 tCun n-2 > Ex[ {r(xCU n _2' x) > I n - 1 }1CUn-2 -0] x P x , { SUP r(xr,x) > O } U n -2 r <t<r "U n -2 - -"Un a 1 a >-.-=- -22 4. Denote by An the event on the left-hand side after the sign of the condi- tional probability, and by A n the opposite event. Then Px (5 A2 k) = Ex n IA < Ex Cg 2 I Ak ) I A4 k = Ex ( 2 rr n - 2 IA ) E( IA Icu ) 2k 4n cS 4n -2 k=1 2n-2 < ( 1 - a ) E rr 1- < ( 1 _ a ) n . - 4 X t A 2k - 4 k=1 Hence, Px {spr(XhX) < J o } = Px { n A 2k } = nl Px { n A2k } = O. D k=1 k=1 1.3. Stability and instability of one-dimensional continuous processes. We consider a one-dimensional continuous process on the interval (a, P) under the assumption that all points of the interval are regular. The last point means that Px{'Y < oo} > 0 for any x,y E (a,p), where 'y denotes the first time the process hits the point y. It will be assumed that the process stops when it leaves the interval (a, P). Such a process is com- pletely determined by two functions: m(x) and n(x). The function m(x) is a strictly increasing continuous harmonic function determined, to within a factor and an additive constant, by the equality m(x) - m(al) Px{x(T[a.,pd) = PI} = m(Pd _ m(ad ' (9) where Ct < al < PI < P and '[al,PI] is the first exit time from the interval (Ctl,Pl) for x E (ai, PI). This function is completely determined by its 
values at two points. The function $n(x)$ is such that

$$E_x \tau_{[\alpha_1, \beta_1]} = n(x) - n(\alpha_1) \frac{m(\beta_1) - m(x)}{m(\beta_1) - m(\alpha_1)} - n(\beta_1) \frac{m(x) - m(\alpha_1)}{m(\beta_1) - m(\alpha_1)} \tag{10}$$

for $\alpha < \alpha_1 < \beta_1 < \beta$ and $x \in [\alpha_1, \beta_1]$. It is determined to within a term of the form $c\,m(x)$ or by its values at two points. The fact that the right-hand side of (10) is nonnegative implies that $n(x) = \lambda(m(x))$, where $\lambda$ is convex upwards. (Regarding continuous Markov processes on the line see Gikhman and Skorokhod [2], Chapter 5, §4, proof of Theorem 5.) The generating operator $Af$ is defined on functions $f$ that are differentiable with respect to $m(x)$ and such that $df/dm$ has a derivative with respect to $\lambda'(m(x))$, and

$$Af(x) = -\frac{d}{d\lambda'(m(x))} \frac{df(x)}{dm(x)}$$

(see Dynkin [1], Paragraph 15.13). We remark that $m(x_t)$ is a continuous martingale.

Let $m(\alpha+) = -\infty$. Then $\alpha$ cannot be a stable stationary point. Indeed, for any $\delta \in (\alpha, \beta)$ and $\alpha < y < \delta$

$$P_x\Big\{ \sup_t x_t \ge \delta \Big\} \ge P_x\{ x(\tau_{[y, \delta]}) = \delta \} = \frac{m(x) - m(y)}{m(\delta) - m(y)}$$

for $y < x < \delta$. Passing to the limit as $y \to \alpha$, we get that

$$P_x\Big\{ \sup_t x_t \ge \delta \Big\} = 1$$

for all $\delta \in (\alpha, \beta)$. Thus, if $\alpha$ is a stationary point, then it is unstable when $m(\alpha+) = -\infty$.

Suppose now that $m(\alpha+) > -\infty$. We can assume that $m(\alpha+) = 0$. Then $m(x) \ge 0$, $m(x) > 0$ for $x > \alpha$, and the function $m(x) \wedge c$ is a superharmonic function for $c > 0$ (if $m(x_t)$ is a martingale, then $m(x_t) \wedge c$ is a supermartingale). Therefore, the point $\alpha$ is stable on the basis of Theorem 4. If here the function $n(x)$ is bounded at the point $\alpha$, $n(\alpha+) > -\infty$, then it can be chosen so that $n(\alpha+) = 0$ and $n(x) > 0$ in a neighborhood of $\alpha$ (we can add to $n(x)$ a constant and the function $k\,m(x)$, where $k > 0$). Then $An(x) = -1$, and $P_x\{\zeta_\alpha < \infty\} = 1$ on the basis of Theorem 3. Accordingly, we have proved the following theorem.

THEOREM 7. Suppose that $x(t)$ is a continuous homogeneous process on the interval $(\alpha, \beta)$ such that relations (9) and (10) hold.
Then: 1) if $m(\alpha+) > -\infty$, then the point $\alpha$ is stable; 2) if, further, $n(\alpha+) > -\infty$, then $P_x\{\zeta_\alpha < \infty\} \to 1$ as $x \to \alpha$; and 3) if $m(\alpha+) = -\infty$, then $\alpha$ is unstable.
We apply this theorem to a one-dimensional diffusion process on $(\alpha, \beta)$. Let $a(x)$ be the drift coefficient and $b(x)$ the diffusion coefficient, and suppose that $a(\alpha) = 0$, $b(\alpha) = 0$, and $a(x)$ and $b(x)$ are continuous. It will be assumed that $\alpha$ is in the domain of definition and is an absorbing point. The process is considered up to the exit time from $[\alpha, \beta[$. The generating operator is defined on twice continuously differentiable functions $f$ by the equality

$$Af(x) = a(x) f'(x) + \tfrac{1}{2} b(x) f''(x).$$

For the harmonic function $m(x)$ we have the equation

$$a(x) m'(x) + \tfrac{1}{2} b(x) m''(x) = 0,$$

which implies that

$$m(x) = \int_\gamma^x \exp\left\{ -\int_\gamma^y \frac{2a(z)}{b(z)}\,dz \right\} dy, \qquad \gamma \in (\alpha, \beta),$$

to within a multiplicative constant and an additive constant.

COROLLARY. Let

$$u(x) = \exp\left\{ -\int_\gamma^x \frac{2a(z)}{b(z)}\,dz \right\}.$$

If $\int_\alpha^\delta u(x)\,dx = +\infty$ for some $\delta > \alpha$, then $\alpha$ is unstable, and if $\int_\alpha^\delta u(x)\,dx < \infty$, then $\alpha$ is stable.

The function $n(x)$ satisfies the equation

$$a(x) n'(x) + \tfrac{1}{2} b(x) n''(x) = -1.$$

From this,

$$n(x) = -2 \int_\gamma^x u(y) \int_\gamma^y \frac{dz}{b(z) u(z)}\,dy$$

(to within a term of the form $c_1 m(x) + c_2$). Therefore, $P_x\{\zeta_\alpha < \infty\} \to 1$ as $x \to \alpha$ if and only if for some $\gamma \in (\alpha, \beta)$

$$\int_\alpha^\gamma u(y) \int_y^\gamma \frac{dz}{b(z) u(z)}\,dy < \infty.$$

1.4. Stability and instability of Feller processes in a locally compact space. Let $X$ be a locally compact space, and $C_X$ the space of continuous functions tending to zero at infinity. It will be assumed that the process is regular, i.e., $T_t f(x) = E_x f(x_t) \in C_X$ for all $f \in C_X$, and $\bar x \in X$ is an isolated absorbing point. Moreover, the following condition holds.

A. There exists a bounded neighborhood $U$ of $\bar x$ such that: 1) $U$ does not contain other stationary points; 2) if $\tau_U$ is the first exit time from the set $U$, then $\tau_U$ has a continuous distribution with respect to the measure $P_x$ for all $x \in U$; and 3) $E_x e^{-\lambda \tau_U}$ is a continuous function for some $\lambda > 0$.
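As an illustration of the corollary on one-dimensional diffusions in 1.3, the following Python sketch evaluates $\int_\varepsilon^\delta u(x)\,dx$ for the coefficients $a(x) = \mu x$, $b(x) = \sigma^2 x^2$ on $(0, \infty)$, with $\alpha = 0$ and $\gamma = 1$. These coefficients, the parameters `mu` and `sigma`, and all function names are assumptions chosen for the example and do not come from the text. In this case $u(x) = x^{-p}$ with $p = 2\mu/\sigma^2$, so the integral near $0$ is finite exactly when $p < 1$.

```python
import math


def scale_density(x, mu, sigma):
    """u(x) = exp(-int_1^x 2a(z)/b(z) dz) for a(z) = mu*z, b(z) = sigma^2*z^2.

    The inner integral equals p*log(x) with p = 2*mu/sigma^2, so u(x) = x**(-p).
    """
    p = 2.0 * mu / sigma**2
    return x ** (-p)


def scale_integral(eps, delta, mu, sigma):
    """Closed form of int_eps^delta u(x) dx for the density above."""
    p = 2.0 * mu / sigma**2
    if math.isclose(p, 1.0):
        return math.log(delta / eps)
    return (delta ** (1.0 - p) - eps ** (1.0 - p)) / (1.0 - p)


def zero_is_stable(mu, sigma):
    """Criterion of the corollary: int_0^delta u dx is finite iff p < 1."""
    return 2.0 * mu / sigma**2 < 1.0
```

Letting `eps` tend to zero in `scale_integral` shows the dichotomy numerically: the value stays bounded for $p < 1$ and grows without bound for $p \ge 1$. Under these assumed coefficients the criterion reduces to $\mu < \sigma^2/2$.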
l. STABILITY OF SAMPLE PATHS 199 LEMMA 4. If conditions A2) and A3) hold, then a Markov process termi- nating at the time !u is a Feller process. PROOF. We must prove that if f is a continuous nonnegative function on U, then Exf(xt)I{Tu>t} is a continuous function. Since Xt is a Feller process, the measures J.lx on the space D[o,oo[(X) of right-continuous X- valued functions on [O,oo[ with limits from the left corresponding to the Markov process with initial value x depend continuously on the parameter x EX. On D[o,oo[(X) we define the function !u(x(.)) = sup[t: x(s) E U \Is < t]. Suppose that X n --+ Xo and n(t, w) (n = 0, 1,...) is a sequence of X-valued processes on some probability space {Q,3T, P} such that P{n(t, w) E D[o,oo[ (X)} = 1 and n (t, w) --+ o ( t, w) in the topology of D[o,oo[ (X), and the distribution of n(t, w) in D[o,oo[(X) coincides with J.lx n (the possibil- ity of constructing such a sequence of processes is proved in Skorokhod's book [5], Chapter 1, 6. It is easy to see that !u(o(t, w)) < lim !U(n(t, w)), since infts r(o(t, w), X\ U) > 0 for all s < !u(o(t, w)), and hence lim infr(n(t, w), X\ U) > O. noo ts By condition A3), for some A. > 0 E exp{ -A.!u(o(t, w))} = lim E exp{ -A.!n(n(t, w))}, noo but lim exp{ -A. !U(n (t, w))} < exp{ -A. !u(o(t, w))}. noo This is possible (see Lemma 5 below) only if exp{ -A.!U(n(t, w))} --+ exp{ -A.!u(o(t, w))} in probability, i.e., !U(n(t, w)) --+ !u(o(t, w)) in probability. Since P{ !u(o(., w)) = t} = 0 by condition A2), and I{Tu(c;(o,w»<t} converges to I{Tu(c;o(.,w»<t} if !u(o(t, w)) # t, it follows that I{Tu(c;n(.,w»<t}f(n(t, w)) --+ I{Tu(c;o(o,w»<t}f(o(t, w)) in probability, and hence, lim Exnf(x(t))I{Tu<t} = Exof(xt)I{Tu<t}, noo J+ Ef(n(t, w))I{Tu(C;n(o,W))<t} = Ef(o(t, w))I{Tu(c;o(o,w))<t}. D 
200 III. STABILITY. LINEAR SYSTEMS LEMMA 5. Suppose that 0 < n < 1, lim n < , and En --+ E. Then n --+  in probability. PROOF. It is easy to see that I{C;n>c;+e} --+ 0 as n --+ 00 for all e > 0; therefore, n A  > n - I{c;n>c;+e} - e, En A  > En - e - P{n >  + e}, lim En A  > lim En - e, lim En A  > E n noo noo in view of the arbitrariness of e. The variables  - n A  --+ 0 are nonneg- ative, and E( - n A) --+ O. Hence,  - n A  --+ 0, and P{n <  - e} --+ 0 for every e > O. 0 We extend U by a point 8, taking sets U\F with FeU an arbitrary closed set as neighborhoods of 8. (In other words, we collapse all points in X\U to a single point.) Then U u {8} = fj is compact. We now construct a nonterminating Feller process in fj that coincides with the original process up to the time tu. Let v(dy) be an arbitrary continuous probability distribution on U, and let a > O. We define a semigroup 1;* on Cfj by the equation 1;* f(x) = Exf(xt)I{Tu>t} + f(()) It e-a(t-S)p x { TU E ds} + ( Px{ TU _E ds}v(dy)e- au 1;*-s_uf(y) du, J O<s+u<t f E Cu. (11 ) This equation can be solved as follows. Integrating both sides of (11) with respect to v(dx), for the function Af(t) = J 1;* f(x)v(dx) we get the equation ).f(t) = tpf(t) + It ).f(t - s)'II(ds), ( 12) where tpf(t) = / v(dx) [Exf(Xt)I{Tu>t} + f(()) It Px{TU E dS}e-a(t-S)] , / g(s) 'II (ds) = / / g(s + u) / v(dX)Px{TU E ds}ae- au duo From this, ).f(t) = tp f(t) + f: / tpf(t - s)'IIn(ds). n=1 
1. STABILITY OF SAMPLE PATHS 201 Here 'IIn(ds) is the n-fold convolution of the measure'll. We now define T* f(x) by Tr* f(x) = Exf(xt)I{Tu>t} + f(O) I t Px{'ru E ds}e-a(t-S) + f Px{'ruEds}e-au).f(t-u-s)du. (13) Jo<u+st It follows from the-proof of Lemma 4 that  f(t) is continuous; therefore, so is Af(t). Therefore, again using the continuity of Px{tv < t} with respect to t and x, we see that Tr* f(x) is also continuous with respect to t and x. The process x* (t) can be described as follows: up to the time tv it coincides with Xr, at the time tv it hits the state fJ and is in that state an exponential amount of time with parameter a, and then with probability v(dx) it hits the region dx E U and behaves again like Xt until leaving U; x*(t) is a homogeneous nonterminating Feller process on the compact set U. Obviously, x is stable for the process Xt if and only if it is stable for the process x;, and it is unstable for Xt if and only if it is unstable for x; . The process x; is irreducible in fj - {x}, and hence all the assertions in  1.2 are valid for it. A function f(x) is said to be superharmonic in the neighborhood U for the process x if it is bounded below (but can take the value +00) and Exf(x,) < f(x), XE U, for any stopping time' < tv. If f(x) is a superharmonic function in the neighborhood U, then f(xTul\t) is a supermartingale. THEOREM 8. A point x is stable for a process Xt satisfying condition A if and only if there exists a bounded continuous function f(x) that is superharmonic in the neighborhood U such that f(x) = 0, f(x) > 0 for x # x, and infxv f(x) > o. PROOF. Necessity. If x is stable for Xr, then it is stable also for x;, and by Theorem 4 there exists a continuous function ](x) on fj such that ](x) = 0, ](x) > 0 for x =F x, and ](x) is superharmonic for the process x;. Let f(x) = ](x) for x E U, and f(x) = ](fJ) for x  U. Then - f(xt) = f(x;) for t < tv, and - - Exf(xtI\Tu) = E x f(x:I\ T u) < f(x) = f(x) for x E U. 
Hence, $f(x_{t \wedge \tau_U})$ is a supermartingale with respect to the measure $P_x$ for all $x \in U$, i.e., $f(x)$ is superharmonic in $U$.
Sufficiency. If $f$ is superharmonic in $U$, then $f(x_{t \wedge \tau_U})$ is a nonnegative supermartingale, and

$$P_x\Big\{ \sup_t f(x_{t \wedge \tau_U}) \ge a \Big\} \le f(x)/a.$$

If $a$ is chosen so that $r(\bar x, y) < \delta$ for $f(y) < a$, where $\{y : r(\bar x, y) \le \delta\} \subset U$, then

$$P_x\Big\{ \sup_t r(x_{t \wedge \tau_U}, \bar x) < \delta \Big\} \ge 1 - f(x)/a.$$

But $\tau_U = +\infty$ for $\sup_t r(x_{t \wedge \tau_U}, \bar x) < \delta$, and

$$\sup_t r(x_{t \wedge \tau_U}, \bar x) = \sup_t r(x_t, \bar x).$$

Hence,

$$\inf_{r(x, \bar x) \le \varepsilon} P_x\Big\{ \sup_t r(x_t, \bar x) < \delta \Big\} \ge 1 - \frac{1}{a} \sup_{r(x, \bar x) \le \varepsilon} f(x),$$

and the right-hand side can be made arbitrarily close to 1 by suitably choosing $\varepsilon > 0$. □

REMARK 1. Condition A was not used in the proof of the sufficiency of the condition in the theorem.

REMARK 2. A continuous function $f(x)$ is said to be $\lambda$-superharmonic in a neighborhood $U$ for the process $x_t$ if for any stopping time $\zeta \le \tau_U$

$$E_x f(x_\zeta) e^{\lambda \zeta} \le f(x).$$

If for some $\lambda > 0$ there exists a $\lambda$-superharmonic function $f(x)$ in the neighborhood $U$ such that $f(\bar x) = 0$ and $f(x) > 0$ for $x \ne \bar x$, then the process $x_t$ is asymptotically stable at $\bar x$. Indeed, the process $f(x_t) e^{\lambda t}$ is a supermartingale on $[0, \tau_U[$, and $P_x\{\tau_U = +\infty\}$ can be made arbitrarily close to 1 by choosing $x$ sufficiently close to $\bar x$. But

$$P_x\Big\{ \lim_{t \to \infty} f(x_t) = 0 \Big\} \ge P_x\{\tau_U = +\infty\}.$$

We now investigate conditions for instability of a stationary point $\bar x$. For this we use a different extension of the process $x_t$ from the interval $[0, \tau_U[$ in the space $\tilde U$. It will be assumed that $\theta$ is an absorbing point. Define the process $\tilde x_t = x_t$ for $t < \tau_U$ and $\tilde x_t = \theta$ for $t \ge \tau_U$. The corresponding semigroup on $C_{\tilde U}$ is denoted by $\tilde T_t$: for $f \in C_{\tilde U}$

$$\tilde T_t f(x) = E_x f(x_t) I_{\{\tau_U > t\}} + f(\theta) P_x\{\tau_U \le t\}, \quad x \ne \theta, \qquad \tilde T_t f(\theta) = f(\theta).$$

The fact that this is a Feller process follows from Lemma 4.

We need one more condition on the neighborhood $U$.

B. If $F \subset U$ is a closed invariant set for the process $x_t$, then $F = \{\bar x\}$, i.e., $\{\bar x\}$ is the unique closed invariant set in $U$.
LEMMA 6. If conditions A and B hold and $\bar x$ is an unstable point, then $P_x\{\tau_U < \infty\} = 1$ for all $x \in U \setminus \{\bar x\}$.

PROOF. It follows from condition A3) that the set

$$\{x : P_x\{\tau_U = +\infty\} = 1\} = \{x : E_x e^{-\lambda \tau_U} = 0\}$$

is closed. Obviously, this is an invariant set. Hence, $P_x\{\tau_U < \infty\} > 0$ for $x \in U \setminus \{\bar x\}$ (we have used condition B). Further, if $\bar x$ is unstable for the process $x_t$, then it is unstable also for the process $\tilde x_t$. Therefore, on the basis of Theorem 6 there is a $\delta_0$ such that

$$P_x\Big\{ \sup_t r(\tilde x_t, \bar x) \ge \delta_0 \Big\} = 1$$

for $x \ne \bar x$. Let $F = \tilde U \cap \{x : r(x, \bar x) \ge \delta_0\}$. It can be assumed that $\delta_0$ is small enough that $\{x : r(x, \bar x) \le \delta_0\} \subset U$. Then $F$ is a nonempty closed set and

$$\inf_{x \in F} E_x e^{-\lambda \tau_U} = \alpha > 0.$$

Choose $c > 0$ such that

$$\sup_{x \in F} P_x\{\tau_U > c\} = \sup_{x \in F} P_x\{1 - e^{-\lambda \tau_U} > 1 - e^{-\lambda c}\} \le (1 - \alpha)/(1 - e^{-\lambda c}) = \rho < 1.$$

We introduce a sequence of stopping times: $\tau_0$ is the first time $\tilde x_t$ hits $F$; if $\tau_0 + c < \tau_U$, then $\tau_1$ is the first time $F$ is hit on the interval $[\tau_0 + c, \infty[$ ($\tau_1 = \tau_0 + c$ if $x_{\tau_0 + c} \in F$); if $\tau_k$ has already been defined and $\tau_k + c < \tau_U$, then $\tau_{k+1}$ is the first time $F$ is hit on the interval $[\tau_k + c, \infty[$, and so on. If $\tau_k < \infty$, then

$$P(\tau_k + c < \tau_U \mid \mathscr{F}_{\tau_k}) = P_{x_{\tau_k}}\{\tau_U > c\},$$

since $x_{\tau_k} \in F$. Note that $\tau_k < \infty$ if $\tau_{k-1} + c < \tau_U$, because either $\tau_k = \tau_{k-1} + c$ or $x_{\tau_{k-1} + c} \notin F$, and $P_x\{\sup_t r(\tilde x_t, \bar x) \ge \delta_0\} = 1$ for $r(x, \bar x) < \delta_0$, i.e., $\tilde x_t$ hits $F$ in a finite amount of time with probability 1. We set $\tau_k = \tau_U$ for $\tau_{k-1} + c \ge \tau_U$. Thus, the $\tau_k$ are defined and finite for all $k$. We have that

$$P_x\{\tau_{k+1} < \tau_U\} \le P_x\{\tau_k + c < \tau_U\} = E_x P_{x_{\tau_k}}\{\tau_U > c\} I_{\{\tau_U > \tau_k\}} \le \rho P_x\{\tau_U > \tau_k\} \le \rho^{k+1}.$$

Hence,

$$P_x\Big( \bigcup_k \{\tau_k = \tau_U\} \Big) = 1. \qquad □$$

To derive instability conditions we need an auxiliary proposition of analytic character.
LEMMA 7. Suppose that $X$ is locally compact, and let $g_n(x)$ be a sequence of functions in $C_X$ satisfying the conditions 1) $g_n(x) \ge 0$; 2) $g_n(x) \ge g_{n+1}(x)$; 3) $\lim_{n \to \infty} g_n(x) = 0$. Then there exists a sequence $c_n > 0$ such that $\sum c_n = +\infty$ and $\sum_n c_n g_n(x) < \infty$ for all $x \in X$, and the sum of this series is continuous.

PROOF. Let $F_m$ be a sequence of compact sets such that $F_m \subset F_{m+1}$ and $\bigcup F_m = X$. Let $a_{n,m} = \sup_{x \in F_m} g_n(x)$. By Dini's theorem, $\lim_{n \to \infty} a_{n,m} = 0$ for all $m$. We choose a nondecreasing sequence of positive integers $m_n$ tending to infinity such that $\lim a_{n,m_n} = 0$. Then there is a sequence $c_n > 0$ such that $\sum c_n = +\infty$ and $\sum c_n a_{n,m_n} < \infty$. For $x \in F_k$

$$\sum_n c_n g_n(x) \le \sum_n c_n a_{n,k} I_{\{m_n < k\}} + \sum_n c_n a_{n,k} I_{\{m_n \ge k\}}.$$

The first sum contains finitely many terms. It follows from the definition that $a_{n,k} \le a_{n,m}$ for $m \ge k$. Hence,

$$\sum_n c_n a_{n,k} I_{\{m_n \ge k\}} \le \sum_n c_n a_{n,m_n} < \infty,$$

and $\sum c_n g_n(x)$ converges uniformly on $F_k$. □

THEOREM 9. Let $\bar x$ be a stationary point, and let $U$ be a bounded neighborhood such that conditions A and B hold. The point $\bar x$ is unstable if and only if there exists a function $g(x)$ that is bounded and continuous on $X \setminus \{\bar x\}$, is equal to zero outside $U$, is positive and superharmonic in $U \setminus \{\bar x\}$, and satisfies the condition $\lim_{x \to \bar x} g(x) = +\infty$.

PROOF. Sufficiency. If $g(x)$ is such a function, then $g(x_{t \wedge \tau_U})$ is a nonnegative supermartingale, and the limit relation

$$\lim_{t \to \infty} g(x_{t \wedge \tau_U}) = \lim_{t \to \infty} \tilde g(\tilde x_t)$$

is valid, where $\tilde g(x)$ is the function on $\tilde U$ such that $\tilde g(x) = g(x)$ for $x \in U$ and $\tilde g(\theta) = 0$; we use the fact that $g(x)$ is constant on $X \setminus U$. Since the limit on the right is finite, it is equal to 0 with probability 1. Since $\inf_{r(x, \bar x) \le \delta} g(x) > 0$ for those $\delta > 0$ for which $\{x : r(x, \bar x) \le \delta\} \subset U$, the relation $\lim_{t \to \infty} \tilde g(\tilde x_t) = 0$ implies that

$$\sup_t r(x_t, \bar x) = \sup_t r(\tilde x_t, \bar x) \ge \delta.$$

The sufficiency is proved.

Necessity. Suppose that $\bar x$ is unstable. Then in view of conditions A and B, Theorem 6, and the remark after it, $P_x\{\tau_U < \infty\} = 1$ for all $x \ne \bar x$.
Let $r_\lambda(x) = 1 - E_x \exp\{-\lambda \tau_U\}$. This function has the following properties: a) it is continuous and nonnegative; b) $r_\lambda(\bar x) = 1$, since $\bar x$ is an absorbing point; c) $r_\lambda(x) = 0$ for $x \in X \setminus U$, and $r_\lambda(x) > 0$ for $x \in U$; d) $r_\lambda(x)$ is superharmonic in $U$; and e) $\lim_{\lambda \downarrow 0} r_\lambda(x) = 0$ if $x \ne \bar x$. Property a) follows from condition A and Lemma 4, and properties b) and c) are obvious. We prove d). Let $\zeta$ be a stopping time with $\zeta \le \tau_U$. Then for $x \in U \setminus \{\bar x\}$

$$E_x r_\lambda(x_\zeta) = 1 - E_x E_{x_\zeta} \exp\{-\lambda \tau_U\} = 1 - E_x E(\exp\{-\lambda(\tau_U - \zeta)\} \mid \mathscr{F}_\zeta) = 1 - E_x \exp\{-\lambda(\tau_U - \zeta)\} \le 1 - E_x \exp\{-\lambda \tau_U\} = r_\lambda(x).$$

Property e) follows from the relation

$$\lim_{\lambda \downarrow 0} E_x \exp\{-\lambda \tau_U\} = P_x\{\tau_U < \infty\} = 1.$$

Let $\lambda_n \downarrow 0$. Then the functions $\tilde r_{\lambda_n}(x)$ defined on $\tilde U \setminus \{\bar x\}$ by the equalities $\tilde r_{\lambda_n}(x) = r_{\lambda_n}(x)$, $x \ne \theta$, and $\tilde r_{\lambda_n}(\theta) = 0$ satisfy the conditions of Lemma 7. Hence, there exist $c_n > 0$ such that $\sum c_n = +\infty$, and $\sum_n c_n \tilde r_{\lambda_n}(x)$ converges in $\tilde U \setminus \{\bar x\}$ and is a continuous function. Let

$$g(x) = \sum_n c_n r_{\lambda_n}(x).$$

This function is continuous on $X \setminus \{\bar x\}$, equal to zero outside $U$, and positive on $U \setminus \{\bar x\}$. It is superharmonic on $U$ as the sum of a convergent series of nonnegative superharmonic functions. The fact that $g(x) \to +\infty$ as $x \to \bar x$ follows from the fact that the partial sums $\sum_{n=1}^N c_n r_{\lambda_n}(x)$ are continuous, and

$$\sum_{n=1}^N c_n r_{\lambda_n}(\bar x) = \sum_{n=1}^N c_n \uparrow \infty \quad \text{as } N \to \infty. \qquad □$$

§2. Linear equations in $R^d$ and the stochastic semigroups connected with them. Stability

2.1. Linear equations. A general linear stochastic equation in $R^d$ is obtained from a general stochastic differential equation under the assumption that its coefficients depend linearly on the unknown random function. Here we consider only the Markov case and equations containing stochastic differentials with respect to Wiener processes and Poisson measures with independent values.

The simplest example of a linear stochastic equation is the equation of a harmonic oscillator when there are fluctuations of the frequency.
The equations in phase space have the form $dx_1 = x_2\,dt$ and $dx_2 = -a x_1\,dt$ in the absence of fluctuations. If $a$ has fluctuations of white noise type, then
it is natural to replace this system of equations by the system of stochastic differential equations

$$dx_1 = x_2\,dt, \qquad dx_2 = -a x_1\,dt + \delta x_1\,dw(t),$$

which can be written in matrix form as follows:

$$d\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \left[ \begin{pmatrix} 0 & 1 \\ -a & 0 \end{pmatrix} dt + \begin{pmatrix} 0 & 0 \\ \delta & 0 \end{pmatrix} dw(t) \right] \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}. \tag{14}$$

We introduce an operator-valued function with independent increments in $R^2$ that has a matrix in the natural basis given by

$$Y(t) = t \begin{pmatrix} 0 & 1 \\ -a & 0 \end{pmatrix} + w(t) \begin{pmatrix} 0 & 0 \\ \delta & 0 \end{pmatrix},$$

and we let $x(t)$ be the vector with coordinates $x_1(t)$ and $x_2(t)$ (in the same basis). Then (14) can be written in the following form:

$$dx(t) = dY(t)\,x(t). \tag{15}$$

It turns out that a broad class of linear stochastic differential equations in $R^d$ lead to an equation of the form (15), where $Y(t)$ is an operator process with independent increments.

Let $Y(t)$ be a stochastically continuous process with independent increments in the space $L(R^d)$ of linear operators from $R^d$ to $R^d$. Since $L(R^d)$ is a finite-dimensional Euclidean space, it follows from the general form of a stochastically continuous process with independent increments in such a space that

$$Y(t) = A(t) + Y_0(t) + \int_0^t \!\! \int U\,[\nu(ds \times dU) - I_{\{\|U\| \le 1\}}\,n(ds \times dU)], \tag{16}$$

where $A(t)$ is a continuous $L(R^d)$-valued function, $Y_0(t)$ is a continuous $L(R^d)$-valued process with Gaussian independent increments, and $\nu(ds \times dU)$ is a Poisson measure with independent values on $R_+ \times L(R^d)$ such that $n(ds \times dU) = E\nu(ds \times dU)$,

$$\int \|U\|^2 (1 + \|U\|^2)^{-1}\,n([0, t] \times dU) < \infty$$

for all $t$, and the expression on the left-hand side is continuous in $t$ ($\|U\|$ is the norm of the operator $U$ in the Euclidean norm of $R^d$).

In order that (15) can be written with $Y(t)$ having the representation (16) it is necessary only that $A(t)$ be a function of bounded variation (the stochastic differentials obtained as a result of the operation inverse to stochastic integration are defined for $Y_0(t)$ and the integral term). Provided that (15) makes sense, the existence and uniqueness of a solution
of this equation and the fact that the solution is a Markov process follow from known theorems for stochastic differential equations (Gikhman and Skorokhod [2], Chapter 4, §§1 and 2). Therefore, we are interested mainly in questions connected with representation of solutions and with the asymptotic behavior of them on an infinite time interval.

We now consider a general linear stochastic equation in the locally infinitely divisible case (Gikhman and Skorokhod [2], Chapter 4, §1). It has the following form:

$$dx(t) = a(t, x(t))\,dt + B(t, x(t))\,dw(t) + \int f_1(t, x(t), \theta)\,\mu_1(dt \times d\theta) + \int f_2(t, x(t), \theta)\,\nu_2(dt \times d\theta), \tag{17}$$

where $w(t)$ is a Wiener process in some Euclidean space $H$, $\nu_1$ and $\nu_2$ are independent Poisson measures with independent values on $R_+ \times \Theta$ ($\Theta$ some measurable space), $\nu_2([0, t] \times \Theta) < \infty$, $\mu_1(dt \times d\theta) = \nu_1(dt \times d\theta) - E\nu_1(dt \times d\theta)$, $a(t, x)$ is a function from $R_+ \times R^d$ to $R^d$ linear in $x$, $B(t, x) \in L(H, R^d)$ (the linear space of operators from $H$ to $R^d$) is linear in $x$, and $f_i(t, x, \theta)$ is a function from $R_+ \times R^d \times \Theta$ to $R^d$ that is linear in $x$. Thus, $a(t, x) = A(t)x$, where $A(t)$ is a function from $R_+$ to $L(R^d)$, and $f_i(t, x, \theta) = F_i(t, \theta)x$, where $F_i(t, \theta)$ is a function from $R_+ \times \Theta$ to $L(R^d)$. Finally, we define in $L(R^d)$ a Gaussian process $Y_0(t)$ with independent increments by means of the equality

$$Y_0(t)x = \int_0^t B(s, x)\,dw(s)$$

for all $x \in R^d$ (the right-hand side belongs to $R^d$ and is linearly dependent on $x$; therefore, it can be represented as written on the left-hand side). Let

$$Y(t) = \int_0^t A(s)\,ds + Y_0(t) + \int_0^t \!\! \int F_1(s, \theta)\,\mu_1(ds \times d\theta) + \int_0^t \!\! \int F_2(s, \theta)\,\nu_2(ds \times d\theta). \tag{18}$$

Equations (15) and (17) are equivalent for such a $Y(t)$. Since (16) represents any stochastically continuous process with independent increments, (18) can be represented in this form. The only difference is the differentiability of $A(t)$ and of the second moments of $Y_0(t)$ in this case.
For the locally infinitely divisible case the measure $n(ds \times B)$ will also be absolutely continuous with respect to Lebesgue measure for a fixed Borel set $B \subset L(R^d)$: $n(ds \times B) = n(s, B)\,ds$.
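To make the passage from the oscillator system (14) to the form (15) concrete, here is a minimal Python sketch of an Euler scheme in which each step applies $x \leftarrow (I + \Delta Y)x$ with $\Delta Y = A\,\Delta t + B\,\Delta w$. The discretization, the function name `euler_linear_sde`, and the parameter names `a`, `delta`, `n_steps`, `seed` are assumptions made for illustration; the text itself does not discuss numerical schemes.

```python
import math
import random


def euler_linear_sde(x0, a, delta, t_end, n_steps, seed=0):
    """Euler scheme for dx1 = x2 dt, dx2 = -a*x1 dt + delta*x1 dw(t).

    Each step applies x <- (I + dY) x, where dY = A*dt + B*dw with
    A = [[0, 1], [-a, 0]] and B = [[0, 0], [delta, 0]].
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    x1, x2 = x0
    for _ in range(n_steps):
        # Wiener increment over one step: Gaussian with variance dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        # The right-hand side uses the old (x1, x2) in both components.
        x1, x2 = x1 + x2 * dt, x2 - a * x1 * dt + delta * x1 * dw
    return x1, x2
```

Because every step is a linear map driven by the same increments, two runs with the same `seed` and initial vectors $x$ and $y$ add up to the run started from $x + y$, which mirrors the linearity of the solution in its initial value. With `delta = 0` the scheme reduces to the explicit Euler method for the deterministic oscillator.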
We write the Kolmogorov equation for equation (15) with $Y(t)$ of the form (16) under the assumption that its characteristics are smooth. Let $\xi_{s,x}(t)$ denote the solution of (15) for $t > s$ satisfying the initial condition $\xi_{s,x}(s) = x$. If $\varphi(x) \in C^{(2)}(R^d)$ (the space of twice continuously differentiable functions on $R^d$ that are bounded together with their derivatives up to second order), then the function $V(s, x) = E\varphi(\xi_{s,x}(t))$ satisfies for $s < t$ the equation

$$\frac{\partial V(s, x)}{\partial s} + (V'_x(s, x), A(s)x) + \frac{1}{2} (Q_s(V''_{xx}(s, x))x, x) + \int [V(s, x + Ux) - V(s, x) - (V'_x(s, x), Ux) I_{\{\|U\| \le 1\}}]\,n(s, dU) = 0, \tag{19}$$

where

$$Q_s(C) = \frac{d}{ds} E\,Y_0^*(s)\,C\,Y_0(s)$$

for $C \in L(R^d)$.

The linearity of the equation enables us to get linear equations for the moment functions of the solution. These equations can be obtained from (19), whose form implies that if the initial value of $V(t, x)$ is a polynomial in $x$ of degree at most $r$, then a solution of (19) can be sought in the form of a polynomial with coefficients dependent on $s$, and it is possible to get a linear system of ordinary differential equations for them. We obtain them in a more natural way with the help of the Itô formula.

Assume that for a positive integer $r \ge 2$

$$\int_0^t \!\! \int \|U\|^r\,n(s, dU)\,ds < \infty.$$

Then the process $Y(s)$ (assume that $Y_0(0) = 0$) defined by (16) has moments up to the $r$th order on $[0, t]$.

LEMMA 8. If $Y(s)$ has moments up to the $r$th order on $[0, t]$, $x(s)$ is a solution of (15), and $E|x(0)|^r < \infty$, then $E|x(s)|^r$ is uniformly bounded and continuous with respect to $s$ for $s \le t$.

PROOF. We write $Y(s)$ in the form

$$Y(s) = \int_0^s A(u)\,du + \tilde Y(s), \qquad \tilde Y(s) = Y_0(s) + \int_0^s \!\! \int U\,[\nu(du \times dU) - n(du \times dU)];$$

$\tilde Y(s)$ is a martingale with moments of order $r$. See Gikhman and Skorokhod [2] (Chapter 4, §1, Theorem 3 and Remark 5) for a proof
2. LINEAR EQUATIONS AND STOCHASTIC SEMIGROUPS 209 of the lemma when r = 2. Assume that r > 3. Then from estimates of the moments of martingales (Gikhman and Skorokhod [2], Chapter 3, 4, Theorem 9) we can see that there exists a continuous increasing function A,(s) such that for s < t E r f(u) dY(u) , < (S sup E(lf(u)I' + 1) d)'(v) (20) 10 10 uv for every Rd-valued function f(s) adapted to the flow of a-algebras g; generated by the process . Let xo(s) = x(O) and xn(s) = J dY(U)X n -l(U) + xo(O). Using (20), we see that the functions {On(s) = supus Elxn(u)I' satisfy for some A and B the relations tpn(S) < A + B l s tpn-l (u) d)'(u), tpo(s) < A. Hence, {On(s) < A exp{BA,(s)} (this can be verified by induction). The inequality Elxn(s + h) - xn(s)I' < A[exp{BA,(s + h)} - exp{BA,(s)}] is established similarly. Since the xn(s) are successive approximations of the solution (15) and converge to this solution, we conclude the proof of the lemma by taking the limit with respect to n. 0 Let m,(t, ZI,..., z,) = E(x(t), ZI)(X(t), Z2)... (x(t), z,). On the basis of the Ito formula, E d(x(t), ZI )(x(t), Z2) . . . (x(t), z,) ,.", = E[(A(t)x(t), ZI)(X(t), Z2)... (x(t), z,) ,.", + (x(t), zl)(A(t)x(t), Z2)... (x(t), z,) ,.", +... + (x(t), ZI)(X(t), Z2)... (A(t)x(t), z,)] dt + E[(d Yo(t)x(t), ZI )(d Yo(t)x(t), Z2) . . . (x(t), z,) + . . . + (d Yo(t)x(t), ZI )(x(t), Z2) . . . (d Yo(t)x(t), z,)] + ... + E f [(x(t) + Ux(t), zd(x(t) + Ux(t), Z2) ... (x(t) + Ux(t), z,) - (x(t), Zl )(x(t), Z2) . . . (x(t), z,) - (Ux(t), ZI)(X(t), Z2)... (x(t), z,) - ... - (x(t), Zl )(x(t), Z2) .. . (Ux(t), z, )]h(s, dU). 
210 III. STABILITY. LINEAR SYSTEMS We have that ""J ""J E(A(t)x(t), ZI) . . . (X(t), Z,) = E(x(t), A* (t)ZI) . . . (X(t), Z,) ""J = m,(t,A*(t)ZI,...,Z,), E(x(t) + Ux(t), ZI)... (x(t) + Ux(t), z,) = m,(t, ZI + U* ZI,. .., z, + U* Z,), E(Ux(t), ZI)... (x(t), z,) = m,(t, U* ZI,..., Z,). Further, E(d Yo(t)x(t), ZI )(d Yo(t), x(t), Z2) . . . (x(t), z,) = EE[(d Yo(t)x(t), ZI )(d Yo(t)x(t), Z2) . . . (x(t), z,) / d Yo(t)] = Em,(t, d Yo (t)ZI, d Y o * (t)Z2, . . . , Z,). We introduce the operator Q; acting on a bilinear form l(zl, Z2) according to the formula [Q; 1](ZI, Z2) dt = El(d Yo (t)ZI, d Y o *(t)Z2). Let [Q;(Zi, zj)m,] be the result of the action of this operator on m,(t, ZI,..., z,), regarded as a bilinear form in Zi and Zj (i # j). Then dm,(t, ZI,..., z,) dt , = L m,(t, ZI,... ,A*(t)Zi'...' z,) + L[Q7(Zi, zj)m,](t, ZI,..., z,) i=1 i<j + ![m,(t,ZI + U.zJ,...,z,+ U.z,) - m,(t,zJ,...,z,) - Lm,(t,zl,...,U*Zi,Zi,...,z,)]n(t,dU). (21) I To define an r-linear form it suffices to define its coefficients m,(t, ei l , · . · , eir)' where {el, . . . , ed} is a basis in Rd, and the ii, . . . , i, are arbitrary sequences of numbers 1,..., d. It is possible to obtain a system of ordinary differen- tial equations for these coefficients from (21). We get such a system for 
$r = 2$:

$$\frac{\partial m_2(t,z_1,z_2)}{\partial t} = m_2(t,z_1,\tilde A^*(t)z_2) + m_2(t,\tilde A^*(t)z_1,z_2) + [Q_t^* m_2](t,z_1,z_2) + \int m_2(t,U^*z_1,U^*z_2)\,\Pi(t,dU), \tag{22}$$

$$m_2(t,z_1,\tilde A^*(t)z_2) = \sum_{i,j,k}(z_1,e_i)(z_2,e_j)(\tilde A^*(t)e_j,e_k)\,m_2(t,e_i,e_k),$$

$$\begin{aligned}
[Q_t^* m_2](t,z_1,z_2)\,dt &= \sum_{i,j}(z_1,e_i)(z_2,e_j)\,E\,m_2(t,dY_0^*(t)e_i,dY_0^*(t)e_j) \\
&= \sum_{i,j,k,l}(z_1,e_i)(z_2,e_j)\,m_2(t,e_k,e_l)\,E(dY_0^*(t)e_i,e_k)(dY_0^*(t)e_j,e_l),
\end{aligned}$$

$$\begin{aligned}
\int m_2(t,U^*z_1,U^*z_2)\,\Pi(t,dU) &= \sum_{i,j} m_2(t,e_i,e_j)\int(U^*z_1,e_i)(U^*z_2,e_j)\,\Pi(t,dU) \\
&= \sum_{i,j,k,l}(z_1,e_k)(z_2,e_l)\,m_2(t,e_i,e_j)\int(Ue_i,e_k)(Ue_j,e_l)\,\Pi(t,dU).
\end{aligned}$$

Substituting these expressions in the right-hand side of (22), and then setting $z_1 = e_p$ and $z_2 = e_q$, we get the following system of differential equations:

$$\frac{d}{dt}m_2(t,e_p,e_q) = \sum_k(\tilde A^*(t)e_q,e_k)\,m_2(t,e_p,e_k) + \sum_{i,j}C_{pqij}(t)\,m_2(t,e_i,e_j),$$

where

$$C_{pqij}(t) = E(dY_0(t)e_p,e_i)(dY_0(t)e_q,e_j) + \int(U^*e_i,e_p)(U^*e_j,e_q)\,\Pi(t,dU).$$

2.2. Operator equations. Representation of solutions. Let $\xi_{s,x}(t)$ be a solution of (15) for $t > s$ with the initial condition $\xi_{s,x}(s) = x$. By the linearity of the equation, $\xi_{s,x}(t)$ is linear in $x$. Therefore, there exists a random linear operator $U_t^s$ on $R^d$ (a random variable with values in $L(R^d)$) such that $\xi_{s,x}(t) = U_t^s x$. The matrix of this random operator in the basis $\{e_1,\ldots,e_d\}$ has the form $\|(\xi_{s,e_i}(t),e_j)\|_{i,j=1,\ldots,d}$. Denote by $\mathscr{F}_t^s$ the σ-algebra generated by the variables $Y(u) - Y(s)$ for $u \in [s,t]$. Obviously, $U_t^s$ is an $\mathscr{F}_t^s$-measurable variable. Let $s < t < u$. Then

$$U_u^s x = \xi_{s,x}(u) = \xi_{t,\xi_{s,x}(t)}(u) = U_u^t\,\xi_{s,x}(t) = U_u^t U_t^s x,$$
which implies that $U_u^s = U_u^t U_t^s$. Finally, $U_t^s$ is stochastically continuous in $t$, and $U_t^s \to I$ in probability as $t \downarrow s$. The quantity $U_t^s$ is the fundamental matrix of the linear stochastic equation (15).

DEFINITION. A family of random operators $\{U_t^s,\ 0 \le s \le t < \infty\}$ on $R^d$ is called a stochastic semigroup if for all $0 \le s < t < \infty$ there are σ-algebras $\mathscr{F}_t^s$ such that:
a) $\mathscr{F}_t^s \supset \mathscr{F}_v^u$ for $0 \le s \le u < v \le t$;
b) the σ-algebras $\mathscr{F}_s^0$ and $\mathscr{F}_t^s$ are independent for $0 < s < t$;
c) $U_t^s$ is measurable with respect to $\mathscr{F}_t^s$;
d) $U_u^s = U_u^t U_t^s$ for $s < t < u$;
e) $U_t^s \to I$ in probability as $s \uparrow t$ or $t \downarrow s$.

Thus, associated with every linear stochastic differential equation of the form (15) is a stochastic semigroup $U_t^s$ (the fundamental matrix for the equation). The operator-valued function $U_t^s$ itself also satisfies the linear operator equation

$$d_t U_t^s = dY(t)\,U_t^s, \qquad U_s^s = I, \quad t > s. \tag{23}$$

To see this it suffices to apply both sides of (23) to an arbitrary vector $x \in R^d$ and take into account that $U_t^s x = \xi_{s,x}(t)$, while $\xi_{s,x}(t)$ satisfies equation (15). The study of (15) with all possible initial conditions is equivalent to the study of the operator equation (23).

In addition to equation (23) for operator-valued functions we can also consider the linear operator equation

$$d_t V_t^s = V_t^s\,dY(t), \qquad V_s^s = I, \quad t > s, \tag{23'}$$

where $Y(t)$ is again an operator-valued process with independent increments for which the stochastic differential is defined. Such an equation is obtained by passing to the adjoint operators in (23). The solution of (23') has properties a)-c) and e), while property d) for it is replaced by the following property:
d') $V_u^s = V_t^s V_u^t$ for $s < t < u$.
A family of operators for which these properties hold (with d) replaced by d')) is called a right stochastic semigroup.

We find a representation of the solution of (23).
Suppose that $Y(t)$ is representable as follows: $Y(t) = M(t) + Z(t) + Y_1(t)$, where $M(t)$ is a continuous nonrandom function of bounded variation, $Z(t)$ is a martingale with $E\|Z(t)\|^2 < \infty$ for $t > 0$, $Y_1(t)$ is a stochastically continuous step process with independent increments, and $Z(t)$ and $Y_1(t)$
are independent processes (all the processes take values in $L(R^d)$). This representation is possible if in the representation (16) for $Y(t)$ the function $A(t)$ has bounded variation ((23) makes sense only in this case).

Let us first find a solution of the equation

$$d_t \tilde U_t^s = [dM(t) + dZ(t)]\,\tilde U_t^s, \qquad \tilde U_s^s = I, \quad t > s. \tag{24}$$

We introduce a nonrandom $L(R^d)$-valued function $Q_t^s$ that satisfies for $t > s$ the equation

$$Q_t^s = I - \int_s^t Q_u^s\,dM(u). \tag{25}$$

Such a function can be given by the series

$$Q_t^s = I - \int_s^t dM(u_1) + \cdots + (-1)^n \int_{s<u_1<u_2<\cdots<u_n<t} dM(u_1)\cdots dM(u_{n-1})\,dM(u_n) + \cdots.$$

The convergence of the series is a consequence of the following estimate: if $\lambda(t) = \operatorname{var}_{[0,t]} M(u)$ and $\lambda(t)$ is a continuous increasing function, then

$$\Big\|\int_{s<u_1<\cdots<u_n<t} dM(u_1)\cdots dM(u_n)\Big\| \le \frac{[\lambda(t)-\lambda(s)]^n}{n!}$$

(the estimate is easily obtained by induction). In particular, it follows from this estimate that

$$\|Q_t^s - I\| \le \sum_{n=1}^\infty \frac{(\lambda(t)-\lambda(s))^n}{n!} = \exp\{\lambda(t)-\lambda(s)\} - 1.$$

Therefore, $Q_t^s$ is an invertible operator for sufficiently small $t - s$. Note that $Q_u^s = Q_t^s Q_u^t$ holds for $s < t < u$. This follows from the relations

$$Q_u^s - Q_t^s Q_u^t = -\int_t^u [Q_v^s - Q_t^s Q_v^t]\,dM(v) \quad (u > t), \qquad \|Q_u^s - Q_t^s Q_u^t\| \le \int_t^u \|Q_v^s - Q_t^s Q_v^t\|\,d\lambda(v)$$

and Gronwall's inequality. Hence,

$$Q_t^s = Q_{u_1}^{u_0} Q_{u_2}^{u_1} \cdots Q_{u_n}^{u_{n-1}}$$

for $s = u_0 < u_1 < \cdots < u_n = t$, and by choosing $\max(u_{k+1} - u_k)$ sufficiently small we see that $Q_t^s$ is invertible for all $s < t$.

Now let $\tilde V_t^s = Q_t^0\,\tilde U_t^s\,[Q_s^0]^{-1}$. Since it follows from (25) that $dQ_t^0 = -Q_t^0\,dM(t)$, we conclude that

$$\begin{aligned}
d\tilde V_t^s &= -Q_t^0\,dM(t)\,\tilde U_t^s [Q_s^0]^{-1} + Q_t^0\,d\tilde U_t^s\,[Q_s^0]^{-1} \\
&= -Q_t^0\,dM(t)\,\tilde U_t^s [Q_s^0]^{-1} + Q_t^0 [dM(t) + dZ(t)]\,\tilde U_t^s [Q_s^0]^{-1} \\
&= Q_t^0\,dZ(t)\,\tilde U_t^s [Q_s^0]^{-1} = Q_t^0\,dZ(t)\,[Q_t^0]^{-1}\,\tilde V_t^s = d\tilde Z(t)\,\tilde V_t^s,
\end{aligned}$$
where

$$\tilde Z(t) = \int_0^t Q_u^0\,dZ(u)\,[Q_u^0]^{-1} \tag{26}$$

is also a martingale in $L(R^d)$, with $E\|\tilde Z(t)\|^2 < \infty$. For example, the integral (26) with respect to an operator-valued martingale can be defined with the help of the equalities

$$(\tilde Z(t)e_i,e_j) = \int_0^t (dZ(u)[Q_u^0]^{-1}e_i, [Q_u^0]^*e_j) = \int_0^t \sum_{k=1}^d (dZ^*(u)e_k, [Q_u^0]^{-1}e_i)(e_k,[Q_u^0]^*e_j).$$

In particular, they imply that $E\|\tilde Z(t)\|^2 \le \sum_{i,j} E(\tilde Z(t)e_i,e_j)^2 < \infty$. Thus, $\tilde V_t^s$ satisfies the following differential equation:

$$d_t \tilde V_t^s = d\tilde Z(t)\,\tilde V_t^s, \qquad \tilde V_s^s = I. \tag{27}$$

The solution of (27) can be written with the help of the series

$$\tilde V_t^s = I + W_t^s(1) + \cdots + W_t^s(n) + \cdots, \tag{28}$$

where

$$W_t^s(n) = \int_s^t d\tilde Z(u)\,W_u^s(n-1), \quad n \ge 1,$$

with the assumption that $W_t^s(0) = I$. Convergence of the series in (28) follows from the following estimates. Let $\tilde B(t) = E\tilde Z_t^* \tilde Z_t$. It is easy to see that $\tilde B(t)$ is a symmetric nonnegative operator, and $\tilde B(t) - \tilde B(s) \ge 0$ for $t > s$. Therefore, $\operatorname{tr}\tilde B(t)$ is a continuous monotone function. Then for any operator-valued function $F(u)$ that is measurable with respect to the flow $\{\mathscr{F}_t^s\}_{t\ge s}$ generated by the variables $\{\tilde Z_u - \tilde Z_s,\ u \in [s,t]\}$ and such that $\|EF^*(u)F(u)\|$ is bounded we have the inequality

$$\begin{aligned}
\Big\|E\Big(\int_s^t d\tilde Z(u)F(u)\Big)^{\!*}\int_s^t d\tilde Z(u)F(u)\Big\|
&= \sup_{|x|\le 1} E\Big|\int_s^t d\tilde Z(u)F(u)x\Big|^2 \\
&= \sup_{|x|\le 1}\int_s^t E(d\tilde B(u)F(u)x,F(u)x) \le \int_s^t \|EF^*(u)F(u)\|\,d\operatorname{tr}\tilde B(u).
\end{aligned}$$
Using this, we conclude by induction that

$$\|E(W_t^s(n))^*(W_t^s(n))\| \le \frac{(\operatorname{tr}\tilde B(t) - \operatorname{tr}\tilde B(s))^n}{n!}.$$

It follows from this estimate that the series $x + W_t^s(1)x + \cdots + W_t^s(n)x + \cdots$ converges strongly in the mean square, uniformly on each finite interval. By the definition of $W_t^s(n)$,

$$\begin{aligned}
x + W_t^s(1)x + \cdots + W_t^s(n)x + \cdots
&= x + \int_s^t d\tilde Z_u\,x + \cdots + \int_s^t d\tilde Z_u\,W_u^s(n-1)x + \cdots \\
&= x + \int_s^t d\tilde Z_u\,\big(x + W_u^s(1)x + \cdots + W_u^s(n-1)x + \cdots\big);
\end{aligned}$$

therefore, the right-hand side of (28) is a solution of (27). The solution of (24) can now be written as follows:

$$\tilde U_t^s = [Q_t^0]^{-1}\tilde V_t^s Q_s^0 = [Q_t^0]^{-1}Q_s^0 + [Q_t^0]^{-1}W_t^s(1)Q_s^0 + \cdots + [Q_t^0]^{-1}W_t^s(n)Q_s^0 + \cdots. \tag{29}$$

We now show how to express the solution of (15) in terms of the function $\tilde U_t^s$. Let $\tau_1 < \tau_2 < \cdots$ be the jump points of the process $Y_1(s)$, and define $\Delta Y_1(\tau_k) = Y_1(\tau_k + 0) - Y_1(\tau_k - 0)$. The variables $\{\tau_1, \Delta Y_1(\tau_1), \tau_2, \Delta Y_1(\tau_2), \ldots\}$ are jointly independent of the process $Z(t)$, and hence of $\tilde U_t^s$. The process $Y_1(t)$ is constant on each interval $[\tau_k, \tau_{k+1}[$. Consequently, $U_s^{\tau_{k-1}} = \tilde U_s^{\tau_{k-1}}$ for $s \in [\tau_{k-1}, \tau_k[$. Further,

$$U_{\tau_k}^{\tau_{k-1}} = \tilde U_{\tau_k}^{\tau_{k-1}} + \Delta Y_1(\tau_k)\,\tilde U_{\tau_k}^{\tau_{k-1}} = (I + \Delta Y_1(\tau_k))\,\tilde U_{\tau_k}^{\tau_{k-1}}.$$

We have used the fact that $\tilde U_s^{\tau_{k-1}}$ is continuous at the point $\tau_k$ with probability 1, and the predictable projection of $U_s^{\tau_{k-1}}$ at the point $s = \tau_k$ coincides with $\tilde U_{\tau_k}^{\tau_{k-1}}$ (it is the predictable projection at the jump time that is used in the stochastic differential equation). If $\tau_j < s < \tau_{j+1}$, then $U_t^s = \tilde U_t^s$ for $s < t < \tau_{j+1}$, and

$$U_{\tau_{j+1}}^s = (I + \Delta Y_1(\tau_{j+1}))\,\tilde U_{\tau_{j+1}}^s.$$

Therefore,

$$U_t^s = \tilde U_t^{\tau_k}(I + \Delta Y_1(\tau_k))\,\tilde U_{\tau_k}^{\tau_{k-1}}(I + \Delta Y_1(\tau_{k-1}))\cdots(I + \Delta Y_1(\tau_{j+1}))\,\tilde U_{\tau_{j+1}}^s \tag{30}$$

for $\tau_j < s < \tau_{j+1} < \cdots < \tau_k < t < \tau_{k+1}$. It can be verified directly that the right-hand side of (30) is a solution of (15) for the indicated representation of the process $Y(t)$.
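The product structure of (30) can be illustrated numerically. The following is a minimal sketch under simplifying assumptions that are not in the text: the martingale part is dropped ($Z \equiv 0$) and $M(t) = tA$ for a constant matrix $A$, so that the continuous flow between jumps is the matrix exponential $e^{A(t-s)}$. The names `expm_taylor`, `fundamental_matrix`, and the sample jump data are hypothetical and serve only as an illustration.

```python
import numpy as np

def expm_taylor(a, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small norms)."""
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for n in range(1, terms):
        term = term @ a / n
        result = result + term
    return result

def fundamental_matrix(s, t, a, jump_times, jump_sizes):
    """Product in the spirit of (30): continuous flows interleaved with (I + jump).

    Assumption (not from the text): Z = 0 and M(t) = t * a, so the flow
    between consecutive jump times is exp(a * elapsed_time).
    """
    d = a.shape[0]
    u = np.eye(d)
    left = s
    # Sort by jump time only, so that the matrices themselves are never compared.
    jumps = sorted(zip(jump_times, jump_sizes), key=lambda pair: pair[0])
    for tau, dy in jumps:
        if s < tau <= t:  # jumps falling in ]s, t]
            u = (np.eye(d) + dy) @ expm_taylor(a * (tau - left)) @ u
            left = tau
    return expm_taylor(a * (t - left)) @ u

# Hypothetical sample data.
a = np.array([[0.0, 1.0], [-1.0, -0.5]])
jump_times = [0.5, 1.0, 1.3]
jump_sizes = [np.array([[0.2, 0.0], [0.1, -0.3]]),
              np.array([[0.0, 0.4], [0.0, 0.1]]),
              np.array([[-0.1, 0.0], [0.3, 0.2]])]

u_02 = fundamental_matrix(0.0, 2.0, a, jump_times, jump_sizes)
u_01 = fundamental_matrix(0.0, 1.0, a, jump_times, jump_sizes)
u_12 = fundamental_matrix(1.0, 2.0, a, jump_times, jump_sizes)
print(np.allclose(u_02, u_12 @ u_01))
```

The final line compares $U_2^0$ with $U_2^1 U_1^0$, that is, property d) of a stochastic semigroup along one fixed realization of the jumps.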
We transform (30) into a more convenient form that contains neither the points
$\tau_j$ nor the jumps $\Delta Y_1(\tau_j)$. Multiplying out the parentheses on the right-hand side of (30), we have that

$$U_t^s = \tilde U_t^s + \sum_{\tau_j \in ]s,t]} \tilde U_t^{\tau_j}\,\Delta Y_1(\tau_j)\,\tilde U_{\tau_j}^s + \cdots + \sum_{s<\tau_{j_1}<\cdots<\tau_{j_l}\le t} \tilde U_t^{\tau_{j_l}}\,\Delta Y_1(\tau_{j_l})\,\tilde U_{\tau_{j_l}}^{\tau_{j_{l-1}}}\cdots\Delta Y_1(\tau_{j_1})\,\tilde U_{\tau_{j_1}}^s + \cdots. \tag{31}$$

We use the fact that $\tilde U_t^s$, as a solution of the linear equation (24), satisfies the following multiplicative property: $\tilde U_u^t \tilde U_t^s = \tilde U_u^s$ for $s < t < u$. This property is preserved if random variables independent of $\tilde U$ are taken as $s$, $t$, and $u$. The right-hand side of (31) has a finite number of terms, because the sums $\sum_{s<\tau_{j_1}<\cdots<\tau_{j_l}\le t}$ are defined only when at least $l$ of the points $\tau_j$ fall in $]s,t]$. The $l$-fold Stieltjes integrals

$$\int_{s<u_1<\cdots<u_l\le t} \tilde U_t^{u_l}\,dY_1(u_l)\,\tilde U_{u_l}^{u_{l-1}}\,dY_1(u_{l-1})\cdots dY_1(u_1)\,\tilde U_{u_1}^s$$

are defined for the step process $Y_1(u)$. In the case when $l$ is greater than the number of jumps of $Y_1(u)$ on $]s,t]$ this integral is equal to zero (at least one $dY_1(u_k)$ is 0), and in the opposite case it is equal to the $l$-fold sum on the right-hand side of (31). Thus,

$$U_t^s = \tilde U_t^s + \sum_{l=1}^\infty \int_{s<u_1<\cdots<u_l\le t} \tilde U_t^{u_l}\,dY_1(u_l)\,\tilde U_{u_l}^{u_{l-1}}\,dY_1(u_{l-1})\cdots dY_1(u_1)\,\tilde U_{u_1}^s. \tag{32}$$

Here the sum on the right-hand side actually has only finitely many nonzero terms.

REMARK. For a stochastically continuous process with independent increments to admit a representation (16) in which $A(t)$ has locally bounded variation it is necessary and sufficient that its characteristic function have the form

$$E\exp\{i\operatorname{tr}Y^*(t)Z\} = \exp\{\Psi(t,Z)\}, \qquad Z \in L(R^d),$$

where $\Psi(t,Z)$ has locally bounded variation for all $Z \in L(R^d)$ (we regard $L(R^d)$ as a Euclidean space with the inner product $(Z_1,Z_2) = \operatorname{tr}Z_1^*Z_2$, so that the characteristic function of the operator-valued process is defined by the expression on the left-hand side). If the process $Y(t)$ admits the representation (16), then

$$\Psi(t,Z) = i\operatorname{tr}A^*(t)Z - \tfrac12 Q(t,Z) + \int\big[e^{i\operatorname{tr}U^*Z} - 1 - i\operatorname{tr}U^*Z\,I_{\{\|U\|\le 1\}}\big]\,\Pi([0,t]\times dU),$$
where $Q(t,Z)$ is a continuous increasing function of $t$ that is quadratic in $Z$. The second and third terms on the right-hand side of the equality defining $\Psi(t,Z)$ have bounded variation, and hence $\Psi(t,Z)$ has bounded variation if and only if $\operatorname{tr}A^*(t)Z$ has bounded variation for all $Z$, i.e., if and only if $A(t)$ has bounded variation.

Consider the variables $\zeta_t^s = \det U_t^s$. They obviously form a one-dimensional stochastic semigroup if we take into account that a linear operator on $R^1$ is the operator of multiplication by a number. We find a stochastic differential equation for $\zeta_t^s$ under the assumption that $U_t^s$ satisfies the locally infinitely divisible equation

$$dU_t^s = A(t)U_t^s\,dt + \sum_{k=1}^m B_k(t)U_t^s\,dw_k(t) + \int F_1(t,\theta)U_t^s\,\tilde\nu_1(dt\times d\theta) + \int F_2(t,\theta)U_t^s\,\nu_2(dt\times d\theta), \tag{33}$$

where $E\nu_k(dt\times d\theta) = \Pi_k(t,d\theta)\,dt$, $k = 1,2$, $A(t)$ and $B_k(t)$ are continuous functions with values in $L(R^d)$, the $w_k$ are independent Wiener processes in $R$, and the measures $\Pi_k(t,d\theta)$ and the functions $F_k(t,\theta)$ with values in $L(R^d)$ are such that $\sup_{t,\theta}\|F_1(t,\theta)\| \le 1$, $\|F_1(t,\theta)\|^2$ is integrable uniformly in $t$ with respect to the measure $\Pi_1(t,d\theta)$, $\sup_t \Pi_2(t,\Theta) < \infty$, and

$$\lim_{c\to\infty}\sup_t \Pi_2(t,\{\theta: \|F_2(t,\theta)\| > c\}) = 0.$$

LEMMA 9. For $s < t$ the process $\zeta_t^s$ satisfies the stochastic differential equation

$$d\zeta_t^s = \alpha(t)\zeta_t^s\,dt + \sum_{k=1}^m \beta_k(t)\zeta_t^s\,dw_k(t) + \int_\Theta[\det(I + F_1(t,\theta)) - 1]\,\zeta_t^s\,\tilde\nu_1(dt\times d\theta) + \int_\Theta[\det(I + F_2(t,\theta)) - 1]\,\zeta_t^s\,\nu_2(dt\times d\theta), \tag{34}$$

where

$$\alpha(t) = \operatorname{tr}A(t) + \frac12\sum_{k=1}^m\big[(\operatorname{tr}B_k(t))^2 - \operatorname{tr}B_k^2(t)\big] + \int_\Theta[\det(I + F_1(t,\theta)) - 1 - \operatorname{tr}F_1(t,\theta)]\,\Pi_1(t,d\theta), \qquad \beta_k(t) = \operatorname{tr}B_k(t). \tag{35}$$

PROOF. Since $\det U$ is an analytic (polynomial) function of the elements of the matrix of the operator $U$, we can use the Itô formula. Therefore,
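$\zeta_t^s$ has a stochastic differential of the form

$$d\zeta_t^s = \tilde\alpha(t)\,dt + \sum_{k=1}^m \tilde\beta_k(t)\,dw_k(t) + \int_\Theta \varphi_1(t,\theta)\,\tilde\nu_1(dt\times d\theta) + \int_\Theta \varphi_2(t,\theta)\,\nu_2(dt\times d\theta).$$

To determine the coefficients $\tilde\alpha$, $\tilde\beta_k$, and $\varphi_k$ we can use the relations

$$\zeta_{t+\Delta t}^s = \zeta_{t+\Delta t}^t\,\zeta_t^s,$$

$$\zeta_{t+\Delta t}^t = \det\Big(I + \int_t^{t+\Delta t}A(v)U_v^t\,dv + \sum_{k=1}^m\int_t^{t+\Delta t}B_k(v)U_v^t\,dw_k(v) + \int_t^{t+\Delta t}\!\!\int_\Theta F_1(v,\theta)U_v^t\,\tilde\nu_1(dv\times d\theta) + \int_t^{t+\Delta t}\!\!\int_\Theta F_2(v,\theta)U_v^t\,\nu_2(dv\times d\theta)\Big).$$

If we now use the equality

$$\det(I + \varepsilon B) = 1 + \varepsilon\operatorname{tr}B + \frac{\varepsilon^2}{2}\big((\operatorname{tr}B)^2 - \operatorname{tr}B^2\big) + O(\varepsilon^3),$$

we get (34) when $\int\|F_1(t,\theta)\|\,\Pi_1(t,d\theta)$ is bounded. The general case is obtained by passing to the limit.

COROLLARY. Suppose that for all $t$

$$\Pi_2(t,\{\theta: \det(I + F_2(t,\theta)) = 0\}) = 0. \tag{36}$$

Then $\zeta_t^s$ is representable as follows:

$$\begin{aligned}
\zeta_t^s = (-1)^{\nu(t)-\nu(s)}\exp\Big\{&\int_s^t\Big[\alpha(u) - \frac12\sum_{k=1}^m\beta_k^2(u) + \int_\Theta[1 + \ln\det(I + F_1(u,\theta)) - \det(I + F_1(u,\theta))]\,\Pi_1(u,d\theta)\Big]du \\
&+ \sum_{k=1}^m\int_s^t\beta_k(u)\,dw_k(u) + \int_s^t\!\!\int_\Theta\ln\det(I + F_1(u,\theta))\,\tilde\nu_1(du\times d\theta) \\
&+ \int_s^t\!\!\int_\Theta\ln|\det(I + F_2(u,\theta))|\,\nu_2(du\times d\theta)\Big\},
\end{aligned} \tag{37}$$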
where

$$\nu(t) - \nu(s) = \int_s^t\!\!\int_\Theta I_{\{\det(I+F_2(u,\theta))<0\}}\,\nu_2(du\times d\theta).$$

This follows from the fact that $\zeta_t^s$ is a solution of a linear equation. In particular, $\zeta_t^s$ is nonzero for all $s$ and $t$ with probability 1 under condition (36).

We present a condition under which a one-dimensional stochastic semigroup does not vanish.

THEOREM 8. Suppose that $\zeta_t^s$ is a numerical random function defined for $0 \le s < t$ that satisfies the following conditions:
a) The variables $\zeta_{s_1}^{s_0}, \ldots, \zeta_{s_k}^{s_{k-1}}$ are independent for $0 \le s_0 < s_1 < \cdots < s_k$.
b) $\zeta_t^s\zeta_u^t = \zeta_u^s$ for $s < t < u$.
c) $\zeta_t^s \to 1$ in probability if $s \uparrow t$ or $t \downarrow s$.
Then the following assertions are true:
1) $\zeta_t^s$ has a modification for which the limits $\zeta_{t+}^s$, $\zeta_{t-}^s$, $\zeta_t^{s+}$, and $\zeta_t^{s-}$ exist.
2) $\zeta_t^s$ is nonzero with probability 1 if and only if either
A) for all $t$

$$P\Big\{\inf_{s\le t}|\zeta_s^0| = 0\Big\} = 0,$$

or
B) for all $t$

$$\lim_{\max\Delta t_k\to 0}\sum_{k=0}^{n-1}P\{\zeta_{t_{k+1}}^{t_k} = 0\} = 0, \qquad 0 = t_0 < t_1 < \cdots < t_n = t.$$

PROOF. It follows from c) that for every $u > 0$ and $\varepsilon > 0$

$$\lim_{h\to 0}\ \sup_{0\le s<t<s+h\le u}P\{|\zeta_t^s - 1| > \varepsilon\} = 0.$$

Let $q(s,t) = P\{\zeta_t^s \ne 0\}$. By a) and b), $q(s,v) = q(s,t)\,q(t,v)$ for $s < t < v$. Since (37) gives us that

$$\lim_{h\to 0}\ \sup_{0\le s<t<s+h\le u}q(s,t) = 1,$$

it follows that $q(s,t) > 0$ for all $s < t$, and hence

$$q(s,t) = \exp\{-(\lambda(t) - \lambda(s))\}, \tag{38}$$
220 III. STABILITY. LINEAR SYSTEMS where A(t) = In(ljq(O, t)) is an increasing function (its continuity follows from (37)). Let v(t) = lim L I { r k / 2n =O } = lim vn(t). noo ':J(k+I)/2n k<2 n t Obviously, v(t) is a nondecreasing integer-valued random process. For all dyadic rationals t = r j2 m , v(t) is nondecreasing with respect to n, and for n > m P {v n ( ;m ) = 0 } = q ( 0, ;m ) , P {v n ( ;m ) = i } = L q (0, ;m ) iI (1 - q ( k j 2 1 , ; ) ) · O<k l <...<kir.2n-m J=1 Passing to the limit with respect to n, we see that v(t) is a Poisson process with mean A(t). - Let us now consider the new semigroup Ct: - U k / 2n C:(n) = (C(k+l)/2 n )I{c k / 2n #O}' s.2n <k<t.2 n (k+I)/2 n where 0 0 = 1. , - - C S = lim CS(n). t noo t (39) We show that this limit exists in the sense of convergence in pbability. To do this we observe that if sand t are dyadic rationals, then Cf(n) = Ct for sufficiently large n when v(t) - v(s) = O. _Therefore, if'l'l < ... < 'l'n are the jump times of the process v(t), then Cf(n) has a limit coinciding with Cf on each of the intervals ('l'k, 'l'k+l) ('l'o = 0). To prove the existence of the limit (39) it is necessary to establish the existence of the limits as s ! 'l'k and t i 'l'k+l. For example, we show the existence of the limit '(n) = C- = lim C / 2n, noo "In 
2. LINEAR EQUATIONS AND STOCHASTIC SEMIGROUPS 221 . k/2 n where 'fin = Inf{k: C(k+l)/2 n = O}. For m > n we have P{ICm/2n - Cn/2ml > e} = P{I'n/2nl-11 - ':;:I > t} L P {v (  ) = 0, 1'£/2nl-ll - 'fl;:-n+j)/2 m l > t, k 0j2m-n v ( k-2:n+j ) -v (  ) =0, v ( k - 2m-;m+ j + 1 ) _ v ( k . 2:n + j ) = 1 } < L P {v (  ) = 0, I '£/2 n I > a } P {v ( k 2: 1 ) - v ( ;m ) = 1 } k { ( k ) } + L L P v 2 n = 0 k 0j<2n-m x P {II - 'fl;:-n+ j)/2 m I > : ' v ( k - 2:n + j ) - v (  ) = 0 } x P {v ( k - 2m-;m+ j + 1 ) _ v ( k - 2:n + j ) = 1 } < sup p{IC2/2 n l > a}A.(u) k2n u + su P { II - 'fl;m-n+j)/2 n I >  } A. ( u + 2 1n ) k2nu,0J2m-n a + 2P{'rl > u} < supP{IC?1 > a}A.(u) tu + sup P { I':+h-ll> } A. ( u+ 2 1 ) +2P{'rl>U}. tu,h 1/2n a n Hence, - 0 0 0 lim P{IC 1In / 2 n - C 1Im / 2 ml > e} < 2P{'rl > u} +A.(u)supP{IC t I> a}. m>n,moo tu The last expression can be made arbitrarily small by suitably choosing u and a. The existence of the limits at the points 'rk is proved. If s < 'rk < t th 'is rs rTk- r T /+ . . . < 'r I <, en  t =  t -  T k + ...  t · We now consider the stochastic semigroups sgn1f: and 11f:I. It is easy to see, as in calculating zeros, that sgn 1f: = (-1 ) {;7(t)-v(s)} , where v(t) is a Poisson process with independent increments. Therefore, sgn 1f: is 
a process without discontinuities of the second kind. Since $|\tilde\zeta_t^s| \ne 0$, it follows that $|\tilde\zeta_t^s| = \exp\{\xi(t) - \xi(s)\}$, where $\xi(t) = \ln|\tilde\zeta_t^0|$ is a stochastically continuous process with independent increments and so has a modification without discontinuities of the second kind. This implies that $\tilde\zeta_t^s$ has a modification without discontinuities of the second kind, and since

$$\zeta_t^s = \tilde\zeta_t^s\,I_{\bigcap_k\{\tau_k\notin\,]s,t]\}},$$

assertion 1) is proved. It is easy to see that assertion 2A) follows from the proof that $\zeta_{\eta_m/2^m}^{\eta_n/2^n} \to 1$ in probability as $n \to \infty$ ($n < m$), and therefore $P\{\zeta_{\tau_1-}^0 \ne 0\} = 1$. In exactly the same way, $P\{\zeta_{\tau_{k+1}-}^{\tau_k+} \ne 0\} = 1$ for all $k$. Finally, it is easy to see that for $0 = t_{n0} < t_{n1} < \cdots < t_{nn} = t$ we have

$$\lim_{\max\Delta t_{nk}\to 0}\sum_{k=0}^{n-1}P\{\zeta_{t_{n,k+1}}^{t_{nk}} = 0\} = \lambda(t).$$

The Poisson process $\nu(t)$ is equal to zero with probability 1 if and only if $\lambda(t) = 0$. □

2.3. The commutative case. We can express $U_t^s$ in terms of $Y(t)$ more simply in the case when the increments of $Y(t)$ commute. Let $K$ be a commutative algebra of operators in $L(R^d)$ that is a closed subspace of $L(R^d)$ (obviously, the closure of a commutative algebra is also a commutative algebra). Assume that $Y(t) - Y(s) \in K$ with probability 1 for all $s < t$. Since $Y(\tau+) - Y(\tau) \in K$ for any point $\tau$ of discontinuity, if we decompose the process into the sum of a continuous process $Z_0(t)$, a process $Z_1(t)$ with jumps less than 1, and a process $Z_2(t)$ with jumps not less than 1, we can assert that each of them (assume that $Y(0) = 0$) belongs to $K$. Therefore, $EZ_0(t)$ and $EZ_1(t)$ belong to $K$ (these expectations exist, and $K$ is a convex set).
Thus, it can be assumed that $Y(t)$ has the following form:

$$Y(t) = M(t) + Z_0(t) + Z_1(t) + Z_2(t),$$

where $M(t)$ is a function of bounded variation, $Z_0(t)$ is a continuous martingale, $Z_1(t)$ is a martingale with jumps less than 1 that does not have a continuous component, and $Z_2(t)$ is a step process with independent increments (all the functions take values in $K$), and $Z_0(t)$, $Z_1(t)$, and $Z_2(t)$ are independent random processes. We consider the solution of the equations

$$\begin{aligned}
d_tU_0(s,t) &= (dM(t) + dZ_0(t))\,U_0(s,t), & t > s,\quad U_0(s,s) &= I,\\
d_tU_k(s,t) &= dZ_k(t)\,U_k(s,t), & t > s,\quad U_k(s,s) &= I,\quad k = 1,2.
\end{aligned} \tag{40}$$
It is easy to see that $U_1(s,t)$ and $U_2(s,t)$ belong to $K$, since $U_1(s,t)$ can be written as a series of multiple integrals with respect to $Z_1$ that belong to $K$, while $U_2(s,t)$ is a product of factors of the form $(I + \Delta Z_2(\tau_j))$, where the $\tau_j$ are the jump points of $Z_2(t)$, and $\Delta Z_2(\tau_j)$ are their values (they belong to $K$). We look for the function $U_0(s,t)$ in the form

$$U_0(s,t) = \exp\{Z_0(t) - Z_0(s) + M_1(t) - M_1(s)\},$$

where $M_1(t)$ is a function of bounded variation with values in $K$. Note that for analytic functions of operators with values in $K$ the Itô formula can be used in precisely the same way as for numerical functions. This can be seen by expanding the functions in series. Therefore,

$$d_tU_0(s,t) = \exp\{Z_0(t) - Z_0(s) + M_1(t) - M_1(s)\}\times\big[dZ_0(t) + dM_1(t) + \tfrac12\,dEZ_0^2(t)\big].$$

If the first of the equations in (40) is satisfied (with commutativity of the factors taken into account), then $M_1(t) + \tfrac12EZ_0^2(t) = M(t)$. Setting $EZ_0^2(t) = V(t)$, we get

$$U_0(s,t) = \exp\{Z_0(t) - Z_0(s) + M(t) - \tfrac12V(t) - M(s) + \tfrac12V(s)\}. \tag{41}$$

The quantity $U_0(s,t)$ also belongs to $K$. Hence, $U_0(s,t)$, $U_1(s,t)$, and $U_2(s,t)$ commute. Therefore,

$$\begin{aligned}
d[U_0(s,t)U_1(s,t)U_2(s,t)] &= dU_0(s,t)\,U_1(s,t)U_2(s,t) + U_0(s,t)\,dU_1(s,t)\,U_2(s,t) + U_0(s,t)U_1(s,t)\,dU_2(s,t) \\
&= [dM(t) + dZ_0(t) + dZ_1(t) + dZ_2(t)]\,U_0(s,t)U_1(s,t)U_2(s,t)
\end{aligned}$$

(we have used commutativity and the fact that only $dU_0$ contains the differential of the continuous martingale, and $Z_1$ and $Z_2$ do not have common jumps). Thus,

$$U_t^s = U_0(s,t)\,U_1(s,t)\,U_2(s,t).$$

Let us find a representation for $U_1(s,t)$. The martingale $Z_1(t)$ can be represented in the form

$$Z_1(t) = \int_0^t\!\!\int_{\|U\|<1}U\,[\nu(d\theta\times dU) - \Pi(d\theta\times dU)],$$
where the measure $\nu$ is given on $R_+\times K$ in our case. We look for $U_1(s,t)$ in the form

$$U_1(s,t) = \exp\Big\{\int_s^t\!\!\int_{\|U\|<1}\psi(U)\,[\nu(d\theta\times dU) - \Pi(d\theta\times dU)] + M_2(t) - M_2(s)\Big\}, \tag{42}$$

where $\psi(U)$ is a function from $K$ to $K$, and $M_2(t)$ is a function of bounded variation with values in $K$. On the basis of the Itô formula,

$$\begin{aligned}
d_tU_1(s,t) = {}&\exp\Big\{\int_s^t\!\!\int_{\|U\|<1}\psi(U)\,[\nu(d\theta\times dU) - \Pi(d\theta\times dU)] + M_2(t) - M_2(s)\Big\} \\
&\times\Big[dM_2(t) + \int_{\|U\|<1}[e^{\psi(U)} - I - \psi(U)]\,\Pi(dt\times dU) + \int_{\|U\|<1}[e^{\psi(U)} - I]\,(\nu(dt\times dU) - \Pi(dt\times dU))\Big].
\end{aligned}$$

If the equation for $U_1$ in (40) is to hold, then

$$U = e^{\psi(U)} - I, \qquad M_2(t) + \int_0^t\!\!\int_{\|U\|<1}[e^{\psi(U)} - I - \psi(U)]\,\Pi(ds\times dU) = 0.$$

Since $\ln(I + U)$ is defined for $\|U\| < 1$, it follows that $\psi(U) = \ln(I + U)$, and

$$U_1(s,t) = \exp\Big\{\int_s^t\!\!\int_{\|U\|<1}\ln(I + U)\,[\nu(d\theta\times dU) - \Pi(d\theta\times dU)] - \int_s^t\!\!\int_{\|U\|<1}(U - \ln(I + U))\,\Pi(d\theta\times dU)\Big\}. \tag{43}$$

Finally, we write a representation for $U_2(s,t)$. It follows from the preceding point that

$$U_2(s,t) = \prod_{\tau_j\in]s,t]}(I + \Delta Z_j),$$
where $\tau_1 < \tau_2 < \cdots$ are the jump times of the process $Z_2(t)$, and $\Delta Z_j$ are their values. Thus, we obtain

$$\begin{aligned}
U_t^s = \prod_{\tau_j\in]s,t]}(I + \Delta Z_j)\,\exp\Big\{&M(t) - M(s) - \tfrac12V(t) + \tfrac12V(s) + Z_0(t) - Z_0(s) \\
&+ \int_s^t\!\!\int_{\|U\|<1}(\ln(I + U) - U)\,\Pi(d\theta\times dU) \\
&+ \int_s^t\!\!\int_{\|U\|<1}\ln(I + U)\,[\nu(d\theta\times dU) - \Pi(d\theta\times dU)]\Big\}.
\end{aligned} \tag{44}$$

2.4. The homogeneous case. Invariant subspaces. We consider equation (15) under the assumption that the process $Y(t)$ is a homogeneous process with independent increments. Then it is representable as follows:

$$Y(t) = tA_1 + \sum_{k=1}^r w_k(t)B_k + \int_0^t\!\!\int_{L(R^d)}U\,[\nu(ds\times dU) - I_{\{\|U\|<1\}}\,m(dU)\,ds], \tag{45}$$

where $A_1, B_1, \ldots, B_r \in L(R^d)$, $w_1(t), \ldots, w_r(t)$ are independent Wiener processes, $\nu(ds\times dU)$ is a Poisson measure on $R_+\times L(R^d)$ such that $E\nu(ds\times dU) = ds\,m(dU)$, and $m(dU)$ is a measure on $L(R^d)$ such that

$$\int\|U\|^2(1 + \|U\|^2)^{-1}\,m(dU) < \infty.$$

If $U_t^s$ is a stochastic semigroup that is the solution of (23) for the indicated $Y(t)$, then in addition to properties a)-e) it will also have the following homogeneity property:
f) The distribution of $U_{t+h}^t$ does not depend on $t$.
Such stochastic semigroups will be called homogeneous in what follows.

If $E\|Y(t)\|^2 < \infty$, then $Y(t)$ can be written in the form

$$Y(t) = tA + \sum_{k=1}^r w_k(t)B_k + \int_0^t\!\!\int_{L(R^d)}U\,[\nu(ds\times dU) - m(dU)\,ds]. \tag{46}$$

In this case $E\|U_t^s\|^2 < \infty$. We associate moment semigroups with such a homogeneous stochastic semigroup. Let

$$E_t = EU_t^0, \qquad V_t(C) = EU_t^{0*}CU_t^0, \quad C \in L(R^d);$$

$E_t$ is an operator in $L(R^d)$, and $V_t(\cdot)$ is a linear function from $L(R^d)$ to $L(R^d)$, i.e., $V_t(\cdot) \in L(L(R^d))$. Since

$$E_{t+h} = EU_{t+h}^tU_t^0 = EU_{t+h}^t\,EU_t^0 = E_hE_t$$
in view of the independence and homogeneity of $U_{t+h}^t$ and $U_t^0$, it follows that $E_t$ (as a function of $t$) is a homogeneous semigroup of operators on $R^d$. The relation

$$U_t^0 - I = \int_0^t dY_s\,U_s^0 = \int_0^t AU_s^0\,ds + \int_0^t dY_1(s)\,U_s^0,$$

where $Y_1(t) = Y(t) - tA$ is a martingale, gives us upon taking the expectation that

$$E_t - I = \int_0^t AE_s\,ds.$$

Thus, $A$ is the generating operator of the semigroup $E_t$.

We find the generating operator of the semigroup $V_t(\cdot)$. Applying the Itô formula to $(CU_t^0x, U_t^0y)$, we have that

$$\begin{aligned}
d(CU_t^0x,U_t^0y) = {}&\Big[(CAU_t^0x,U_t^0y) + (CU_t^0x,AU_t^0y) + \sum_{k=1}^r(CB_kU_t^0x,B_kU_t^0y)\Big]dt \\
&+ \sum_{k=1}^r[(CB_kU_t^0x,U_t^0y) + (CU_t^0x,B_kU_t^0y)]\,dw_k(t) \\
&+ \int_{L(R^d)}[(CUU_t^0x,U_t^0y) + (CU_t^0x,UU_t^0y)]\,(\nu(dt\times dU) - dt\,m(dU)) \\
&+ \int_{L(R^d)}(CUU_t^0x,UU_t^0y)\,m(dU)\,dt.
\end{aligned}$$

Therefore,

$$\begin{aligned}
d(V_t(C)x,y) &= E\Big[(CAU_t^0x,U_t^0y) + (CU_t^0x,AU_t^0y) + \sum_{k=1}^r(CB_kU_t^0x,B_kU_t^0y) + \int_{L(R^d)}(CUU_t^0x,UU_t^0y)\,m(dU)\Big]dt \\
&= \Big(V_t\Big(CA + A^*C + \sum_{k=1}^rB_k^*CB_k + \int_{L(R^d)}U^*CU\,m(dU)\Big)x,y\Big)\,dt.
\end{aligned}$$

Let

$$Q(C) = CA + A^*C + \sum_{k=1}^rB_k^*CB_k + \int_{L(R^d)}U^*CU\,m(dU). \tag{47}$$
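Since $Q$ in (47) is a linear map on $L(R^d)$, it can be written as a $d^2\times d^2$ matrix acting on the vectorized operator $C$. The following is a minimal sketch under assumptions that are not in the text: all matrices are real (so the adjoint is the transpose), and the measure $m(dU)$ is a finite sum of atoms given as hypothetical pairs `(weight, U)`. The function names `q_apply` and `q_matrix` are illustrative only.

```python
import numpy as np

def q_apply(c, a, bs, jumps):
    """Q(C) = C A + A* C + sum_k B_k* C B_k + sum_j m_j U_j* C U_j, following (47).

    Assumption: real matrices, and m(dU) is a finite sum of atoms (weight, U).
    """
    out = c @ a + a.T @ c
    for b in bs:
        out = out + b.T @ c @ b
    for weight, u in jumps:
        out = out + weight * (u.T @ c @ u)
    return out

def q_matrix(a, bs, jumps):
    """Matrix of Q on row-major vec(C), using vec(X C Y) = kron(X, Y.T) vec(C)."""
    d = a.shape[0]
    eye = np.eye(d)
    q = np.kron(eye, a.T) + np.kron(a.T, eye)  # C A  and  A* C
    for b in bs:
        q = q + np.kron(b.T, b.T)              # B* C B
    for weight, u in jumps:
        q = q + weight * np.kron(u.T, u.T)     # U* C U, weighted by m
    return q

# Hypothetical sample data.
rng = np.random.default_rng(1)
a = rng.normal(size=(3, 3))
bs = [rng.normal(size=(3, 3)) for _ in range(2)]
jumps = [(0.7, rng.normal(size=(3, 3)))]
c = rng.normal(size=(3, 3))

q = q_matrix(a, bs, jumps)
print(np.allclose(q @ c.ravel(), q_apply(c, a, bs, jumps).ravel()))
```

With the matrix form available, the spectrum of $Q$ can be inspected with `np.linalg.eigvals(q)`.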
Then

$$\frac{d}{dt}V_t(C) = V_t(Q(C)) \tag{48}$$

and hence $Q(C)$, as an operator on $L(R^d)$, is the generating operator of the semigroup $V_t(C)$. The semigroup $E_t$ is called the moment semigroup, and the semigroup $V_t(\cdot)$ is called the second moment semigroup of the stochastic semigroup $U_t^s$.

Let $N$ be a subspace of $R^d$; it is said to be invariant for the stochastic semigroup $U_t^s$ if $U_t^sx \in N$ with probability 1 for all $x \in N$ and $s < t$.

THEOREM 9. For the homogeneous stochastic semigroup $U_t^s$ that solves equation (23) the subspace $N$ is invariant if and only if $[Y(t) - Y(s)]x \in N$ for all $x \in N$.

PROOF. The sufficiency follows from the fact that the process $Y(t)$ can be regarded as an operator-valued process with values in $L(N)$, and then the solution of (23) will belong to $L(N)$, i.e., it will carry vectors in $N$ into $N$.

To prove the necessity we first establish the formula

$$Y(t) = \lim_{n\to\infty}\sum_{k<nt}\big(U_{(k+1)/n}^{k/n} - I\big) \tag{49}$$

(the limit is in the sense of convergence in probability). We have that

$$U_{(k+1)/n}^{k/n} - I - \Big(Y\Big(\frac{k+1}{n}\Big) - Y\Big(\frac{k}{n}\Big)\Big) = \int_{k/n}^{(k+1)/n}dY(s)\,[U_s^{k/n} - I].$$

Therefore,

$$Y(t) - \sum_{k<nt}\big(U_{(k+1)/n}^{k/n} - I\big) = Y(t) - Y\Big(\frac{l_t}{n}\Big) - \sum_{k=0}^{l_t-1}\int_{k/n}^{(k+1)/n}dY(s)\,[U_s^{k/n} - I],$$

where $l_t$ is the largest integer for which $l_t < nt$. Since $l_t/n \to t$ as $n \to \infty$, it follows that $Y(t) - Y(l_t/n) \to 0$ in probability. The sum of the integrals on the right-hand side can be written as an integral $\int_0^{t+\delta}dY(s)\,\Phi_n(s)$, where $\delta > 1/n$ and $\Phi_n(s) \to 0$ in probability, which also converges to zero in probability. We have established (49). Hence,

$$Y(t)x = \lim_{n\to\infty}\sum_{k<nt}\big(U_{(k+1)/n}^{k/n} - I\big)x \in N$$

for all $x \in N$. □

Let $N$ be an invariant subspace for the homogeneous stochastic semigroup $U_t^s$. The semigroup is said to be irreducible on this subspace if there is no nontrivial subspace of $N$ that is invariant for the semigroup. We show how to construct invariant subspaces for a semigroup. Denote by $N_x$ the smallest
linear subspace such that $P\{U_t^0x \in N_x\} = 1$ for all $t > 0$. We show that $N_x$ is invariant for the stochastic semigroup $U_t^0$. Indeed,

$$\begin{aligned}
1 = P\{U_{t+s}^0x \in N_x\} &= P\{U_{t+s}^tU_t^0x \in N_x\} \\
&= \int P\{U_{t+s}^tz \in N_x\}\,P\{U_t^0x \in dz\} = \int P\{U_s^0z \in N_x\}\,P\{U_t^0x \in dz\}.
\end{aligned}$$

Hence, $P\{U_s^0z \in N_x\} = 1$ for almost all $z$ with respect to the measure $P\{U_t^0x \in dz\}$. Let $K_{t,x}$ be the closed support of this measure, and $N_{t,x}$ the linear span of $K_{t,x}$. Then $P\{U_s^0z \in N_x\} = 1$ for all $s > 0$ when $z \in N_{t,x}$. But $N_x$ coincides with the linear span of $\bigcup_{t>0}N_{t,x}$, and hence $P\{U_s^0z \in N_x\} = 1$ for all $z \in N_x$. The invariance of $N_x$ is established. Obviously, $N_x \subset N$ when $x \in N$, for every invariant subspace $N$ ($N_x$ is the smallest invariant subspace containing $x$). Hence, $N_z \subset N_x$ for $z \in N_x$. The invariant subspace $N$ is irreducible if and only if $N = N_x$ for all nonzero $x \in N$.

We describe invariant subspaces in terms of the characteristics of the process $Y(t)$ given by (45).

THEOREM 10. $N$ is an invariant subspace for the stochastic semigroup $U_t^s$ that solves (23) with the process $Y(t)$ given by (45) if and only if $A_1, B_1, \ldots, B_r \in K(N)$, where $K(N)$ is the ring of operators carrying $N$ into $N$, and $m(L(R^d)\setminus K(N)) = 0$.

PROOF. It follows from Theorem 9 that if $N$ is invariant, then $P\{Y(t) \in K(N)\} = 1$. Hence, for any discontinuity, $Y(t+)x - Y(t-)x \in N$ with probability 1 for $x \in N$, i.e., the jumps of the process $Y(t)$ belong to $K(N)$ with probability 1. This implies that the measure $m(dU)$ is concentrated on $K(N)$. But then

$$\int_0^t\!\!\int_{L(R^d)}U\,[\nu(ds\times dU) - I_{\{\|U\|<1\}}\,m(dU)\,ds]$$

belongs to $K(N)$ with probability 1, and hence

$$P\Big\{tA_1 + \sum_{k=1}^r w_k(t)B_k \in K(N)\Big\} = 1.$$

Since $K(N)$ is a convex set, it follows that

$$tA_1 = E\Big(tA_1 + \sum_{k=1}^r w_k(t)B_k\Big) \in K(N), \qquad tB_j = E\,w_j(t)\Big(tA_1 + \sum_{k=1}^r w_k(t)B_k\Big) \in K(N), \quad j = 1,\ldots,r.$$
This proves that the conditions of the theorem are necessary. That they are sufficient follows from the fact that under the conditions of the theorem $Y(t) \in K(N)$ with probability 1, i.e., $Y(t)x \in N$ for $x \in N$ with probability 1. □

The presence of an invariant subspace enables us to lower the dimension of our system of stochastic linear equations. Let $N_1$ be an invariant subspace for $U_t^s$, $N_2$ its orthogonal complement, and $Q_1$ and $Q_2$ the operators of (orthogonal) projection onto $N_1$ and $N_2$, respectively. Then the operator functions

$$U_t^s(i) = Q_iU_t^sQ_i, \quad i = 1,2,$$

are also homogeneous stochastic semigroups satisfying

$$d_tU_t^s(i) = dY_i(t)\,U_t^s(i), \qquad U_s^s(i) = Q_i, \qquad Y_i(t) = Q_iY(t)Q_i, \quad t > s,\ i = 1,2. \tag{50}$$

Indeed,

$$dQ_1U_t^sQ_1 = Q_1\,dU_t^s\,Q_1 = Q_1\,dY(t)\,U_t^sQ_1 = Q_1\,dY(t)\,Q_1[Q_1U_t^sQ_1],$$

since $U_t^sQ_1 = U_t^sQ_1^2 = Q_1U_t^sQ_1 = Q_1^2U_t^sQ_1$. This establishes (50) for $i = 1$. Further,

$$\begin{aligned}
dQ_2U_t^sQ_2 &= Q_2\,dY(t)\,U_t^sQ_2 = Q_2\,dY(t)\,(Q_1 + Q_2)U_t^sQ_2 \\
&= Q_2\,dY(t)\,Q_1U_t^sQ_2 + Q_2\,dY(t)\,Q_2[Q_2U_t^sQ_2] = dY_2(t)\,U_t^s(2),
\end{aligned}$$

because $Q_2\,dY(t)\,Q_1 = Q_2Q_1\,dY(t)\,Q_1 = 0$. Each of the equations (50) has dimension smaller than the original.

How can $U_t^s$ be expressed in terms of $U_t^s(1)$ and $U_t^s(2)$? We have that

$$U_t^s = (Q_1 + Q_2)U_t^s(Q_1 + Q_2) = U_t^s(1) + U_t^s(2) + Q_1U_t^sQ_2$$

($Q_2U_t^sQ_1 = Q_2Q_1U_t^sQ_1 = 0$). Let $Q_1U_t^sQ_2 = U_t^s(1,2)$. For $U_t^s(1,2)$ we get the equation

$$dU_t^s(1,2) = Q_1\,dY(t)\,U_t^sQ_2 = Q_1\,dY(t)\,[Q_1 + Q_2]U_t^sQ_2 = dY_1(t)\,U_t^s(1,2) + Q_1\,dY(t)\,U_t^s(2).$$

Thus, the equation for $U_t^s(1,2)$ is a nonhomogeneous equation of the form (50) with $i = 1$. The initial condition is $U_s^s(1,2) = 0$. The solution of this equation can be expressed in terms of $U_t^s(i)$. This is more conveniently done by using the properties of the stochastic semigroup itself.
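The reduction above can be checked on a discretized example. The following is a minimal sketch under assumptions that are not in the text: $Y(t) = tA + w(t)B$ has no jumps, $N_1$ is spanned by the first $p$ basis vectors, both $A$ and $B$ are block upper triangular (so they carry $N_1$ into itself), and $U_t^0$ is approximated by an Euler product of factors $(I + \Delta Y)$. All names and sample sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, steps, dt = 3, 1, 200, 0.005

def block_upper(m, p):
    """Zero the lower-left block so that span(e_1..e_p) is mapped into itself."""
    m = m.copy()
    m[p:, :p] = 0.0
    return m

a = block_upper(rng.normal(size=(d, d)), p)
b = block_upper(rng.normal(size=(d, d)), p)
q1 = np.diag([1.0] * p + [0.0] * (d - p))  # projection onto N_1
q2 = np.eye(d) - q1                        # projection onto N_2

u = np.eye(d)   # Euler approximation of U_t^0
u1 = q1.copy()  # Euler approximation of U_t^0(1), started at Q_1 as in (50)
for _ in range(steps):
    dy = a * dt + b * rng.normal(scale=np.sqrt(dt))  # increment of Y
    u = (np.eye(d) + dy) @ u
    u1 = u1 + (q1 @ dy @ q1) @ u1                    # dU(1) = dY_1 U(1)

u12 = q1 @ u @ q2  # the off-diagonal part U(1,2)
print(np.allclose(q2 @ u @ q1, 0.0))  # N_1 stays invariant
print(np.allclose(q1 @ u @ q1, u1))   # U(1) solves the reduced equation
```

In this discretization the two identities hold up to rounding error, because a product of block upper triangular matrices is block upper triangular and its diagonal blocks are the products of the diagonal blocks.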
230 III. STABILITY. LINEAR SYSTEMS Let S = So < SI < . . . < Sn = t. Then US ( 1 2 ) = Q US Q = Q uSn-l U Sn - 2 ... uso Q t' 1 t 2 1 Sn Sn-l Sl 2 = QI U %nn-l(QI +Q2)un12(QI +Q2)...(QI +Q2)UoQ2. After using the equality Q2UfQl = Q2Ql UIQl = 0 for S < t, we find that n-l U S ( 1 2 ) = "'"'" Q uSn-l Q uSn-2 Q ... Q Us; Q U;-l Q ... uso Q t' L..J 1 Sn 1 Sn-l 1 1 S;+l 2 S, 2 Sl 2 ;=0 n-I = L ut;+I(I)QIUIQ2U(2) ;=0 n-l = L ut;+I(I)Ql(U1 - I)Q2U(2). ;=0. Using the same estimates as in the derivation of (48), we can see that U,s(1,2) = nl.!. L U,(k+I)/n(l)QI [y ( k; 1 ) - Y ( : )] Q 2U k/n(2). ns<k<nt (51 ) Let us show how to transform (51) under the assumption that the op- erators UI ( 1) are invertible. We have that U,s(1,2) = U,s(l) nl.!. L (U{k+I)/n(1))-1 ns<k<nt X QI [Y ( k ; 1 ) - Y ( : ) ] Q2 Uk/n (2) = U,S(l) J.!. L (Uk/n(l))-I (I - QI [Y ( k: 1 ) - Y ( : )] ns<k<nt XQI [Y ( k: 1 ) - Y ( : )] Q 2U k/n(2)) = Us (1) lim "'"'" (US (1))-1 t noo L..J kin ns<k<nt X QI [Y ( k : 1 ) - Y ( : ) ] Q2 Uk/n (2) +U,s(l)nl.!. L (Uk/n(1))-IQI [y( k;l ) -y( : )] ns<k<nt X QI [y ( k: 1 ) - y ( : )] Q 2U k/n(2) 
$$= U_t^s(1)\int_s^t(U_v^s(1))^{-1}Q_1\,dY(v)\,Q_2U_v^s(2) + U_t^s(1)\int_s^t(U_v^s(1))^{-1}\Big[\sum_k Q_1B_kQ_1B_kQ_2\,dv\Big] + \int Q_1UQ_1UQ_2\,\nu(dv\times dU).$$

We show how to reduce the study of a stochastic semigroup to the study of it on the invariant subspaces on which it is irreducible.

LEMMA 10. Let $N$ be an invariant subspace, and $N_1$ a proper invariant subspace of $N$ with maximal dimension. Then the stochastic semigroup $QU_t^sQ$ is irreducible on $N \ominus N_1$ (the orthogonal complement of $N_1$ in $N$), where $Q$ is the operator of projection onto $N \ominus N_1$.

PROOF. Assume that there exists a proper invariant subspace $L \subset N \ominus N_1$ for the semigroup $QU_t^sQ$. Denote by $P$ the operator of projection onto $N$. Then for $x_1 \in N_1$ and $x_2 \in L$

$$U_t^s(x_1 + x_2) = U_t^sx_1 + U_t^sQx_2 = U_t^sx_1 + (P - Q)U_t^sQx_2 + QU_t^sQx_2.$$

The first term belongs to $N_1$, since $x_1 \in N_1$ and $N_1$ is invariant. The second term belongs to $N_1$, since $P - Q$ is the operator of projection onto this subspace. The third term belongs to $L$ by assumption. Hence, $N_1 + L$ is a proper invariant subspace of $N$, and its dimension is greater than that of $N_1$, which contradicts the assumption. □

We now construct a chain $R^d = N_0 \supset N_1 \supset \cdots \supset N_r$ of invariant subspaces such that each subspace is a proper subspace of the preceding one and has maximal possible dimension. Let $Q_i$ be the operator of projection onto $N_i$. Then the stochastic semigroup $[Q_i - Q_{i+1}]U_t^s[Q_i - Q_{i+1}]$ is irreducible on the subspace $N_i \ominus N_{i+1}$ (this follows from the lemma).

2.5. Mean-square stability. Let $U_t^s$ be the homogeneous stochastic semigroup that is the solution of (23). It is said to be mean-square stable on an element $x$ if $E|U_t^0x|^2 \to 0$ as $t \to \infty$. Clearly, the set of $x \in R^d$ on which a stochastic semigroup is stable forms a linear subspace of $R^d$. We show that this subspace is invariant.

THEOREM 11. Let $N$ be the linear subspace of those $x$ on which the semigroup $U_t^0$ is mean-square stable. Then $N$ is an invariant subspace.

PROOF.
Let $N_{s,x}$ be the smallest linear subspace on which the distribution of the vector $U_s^0x$ is concentrated. To prove the theorem it suffices to show that $N_{s,x}\subset N$ for all $s>0$ and $x\in N$. It follows from the definition
of $N_{s,x}$ that $E(U_s^0x,y)^2>0$ for all nonzero $y\in N_{s,x}$. Hence, there exists an $a>0$ such that $E(U_s^0x,y)^2\ge a|y|^2$. Therefore,
$$E|U_{s+t}^0x|^2=E|U_{s+t}^sU_s^0x|^2=\int_{z\in N_{s,x}}E|U_{s+t}^sz|^2\,P\{U_s^0x\in dz\}=\int_{z\in N_{s,x}}E|U_t^0z|^2\,P\{U_s^0x\in dz\}.$$
We consider the quadratic form $E|U_t^0z|^2$ on the subspace $N_{s,x}$. If $\lambda(t)$ is the maximal eigenvalue of this form, and $e(t)$ is a unit eigenvector corresponding to it, then
$$(z,e(t))^2\lambda(t)\le E|U_t^0z|^2\le|z|^2\lambda(t).$$
Hence,
$$E|U_{s+t}^0x|^2=\int_{z\in N_{s,x}}E|U_t^0z|^2\,P\{U_s^0x\in dz\}\ge\lambda(t)\int(z,e(t))^2\,P\{U_s^0x\in dz\}\ge\lambda(t)\,a\,|e(t)|^2=a\lambda(t).$$
This implies that $\lambda(t)\to0$ as $t\to\infty$, i.e., $E|U_t^0z|^2\to0$ for $z\in N_{s,x}$, and so $N_{s,x}\subset N$. □

The homogeneous stochastic semigroup $U_t^0$ is said to be mean-square stable on some invariant subspace $N$ if $E|U_t^0x|^2\to0$ for all $x\in N$.

THEOREM 12. The homogeneous stochastic semigroup $U_t^s$ is mean-square stable if and only if one of the following conditions holds:

a) All the eigenvalues of the linear operator $Q(\cdot)$ from $L_s(R^d)$ to $L_s(R^d)$ have negative real parts ($L_s(R^d)$ is the space of symmetric operators in $L(R^d)$).

b) There exists a symmetric strictly positive operator $C$ such that $Q(C)<0$.

PROOF. We prove condition a). Let $V_t(C)$ be the second moment semigroup of the semigroup $U_t^s$, i.e., $(V_t(C)x,x)=E(CU_t^0x,U_t^0x)$. Obviously, mean-square stability is equivalent to the condition that $\lim_{t\to\infty}V_t(C)=0$ for all $C\in L_s(R^d)$, and $V_t(C)$ is a solution of the linear equation with constant coefficients (48). As is known from the theory of ordinary differential equations, the solution of this equation is stable if and only if all the eigenvalues of the linear operator on the right-hand side have negative real parts. Further, the inequality
$$\|V_t(C)\|\le ce^{-\delta t}\|C\|,\qquad C\in L_s(R^d),\tag{52}$$
holds for some $\delta>0$ and $c>0$.

Note that $V_t(C)$ depends monotonically on $C$ for $C\in L_s(R^d)$, since $V_t(C)\ge0$ for $C\ge0$ (the inequality is in the sense of inequality for symmetric operators). For mean-square stability it suffices that $\lim_{t\to\infty}V_t(C)=0$ for some positive operator $C$. Indeed, $\delta|x|^2\le(Cx,x)$ for some $\delta>0$ for such an operator, and hence
$$E|U_t^0x|^2\le\frac1\delta E(CU_t^0x,U_t^0x)=\frac1\delta(V_t(C)x,x).$$
Suppose that $C>0$ and $Q(C)<0$. Then there is a $\gamma>0$ such that $Q(C)\le-\gamma C$. By (48),
$$\frac{d}{dt}(V_t(C)x,x)=(V_t(Q(C))x,x)\le-\gamma(V_t(C)x,x),$$
and hence $(V_t(C)x,x)\le e^{-\gamma t}(Cx,x)$. This proves that b) is sufficient.

Let $V_t(C)\to0$. It follows from (52) that for all $C\in L_s(R^d)$ the integral
$$R(C)=\int_0^\infty V_t(C)\,dt$$
exists, $R(C)$ is a linear operator from $L_s(R^d)$ to $L_s(R^d)$, and $R(C)>0$ for $C>0$. The last remark follows from the fact that $V_t(C)$ is continuous and $V_0(C)=C>0$, hence $V_t(C)>0$ for all sufficiently small $t$. Further,
$$Q(R(C))=\lim_{s\to\infty}Q\Big(\int_0^sV_t(C)\,dt\Big)=\lim_{s\to\infty}\int_0^sQ(V_t(C))\,dt.$$
However,
$$Q(V_t(C))=\lim_{h\to0}\frac1h\big[V_h(V_t(C))-V_t(C)\big]=\frac{d}{dt}V_t(C).$$
Therefore,
$$Q(R(C))=\lim_{s\to\infty}\int_0^s\frac{d}{dt}V_t(C)\,dt=\lim_{s\to\infty}(V_s(C)-C)=-C.$$
Thus, for any $C_1>0$ the positive operator $C=R(C_1)$ is such that $Q(C)=-C_1<0$. The necessity of condition b) is proved. □

COROLLARY. For mean-square stability it suffices that the operator
$$Q_1=A+A^*+\sum_{k=1}^rB_k^*B_k+\int U^*U\,m(dU)$$
be negative-definite.

This follows from the fact that $Q_1=Q(I)$ and from assertion b).
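Condition b) of Theorem 12 can be tested directly once a candidate operator $C$ is given. The following is a minimal sketch, not part of the original text. It assumes there is no jump part ($m=0$), so that $Q(C)=A^*C+CA+\sum_kB_k^*CB_k$, which is the form suggested by $Q_1=Q(I)$. The function names and the numerical matrices are hypothetical, chosen only for illustration.

```python
# Sketch: check condition b) of Theorem 12 for a given candidate C.
# Assumption (not from the text): no jump term, so
#   Q(C) = A^T C + C A + sum_k B_k^T C B_k.

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def q_operator(A, Bs, C):
    """Apply Q to a symmetric matrix C (diffusion part only)."""
    out = add(matmul(transpose(A), C), matmul(C, A))
    for B in Bs:
        out = add(out, matmul(matmul(transpose(B), C), B))
    return out

def is_positive_definite(S, tol=1e-12):
    """Attempt a Cholesky factorization; it succeeds iff S is positive definite."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                L[i][i] = s ** 0.5
            else:
                L[i][j] = s / L[j][j]
    return True

def certifies_mean_square_stability(A, Bs, C):
    """True if C > 0 and Q(C) < 0, i.e. C witnesses condition b)."""
    neg_q = [[-v for v in row] for row in q_operator(A, Bs, C)]
    return is_positive_definite(C) and is_positive_definite(neg_q)

if __name__ == "__main__":
    A = [[-0.25, 0.0], [0.0, -1.0]]   # hypothetical drift matrix
    B = [[0.0, 0.0], [1.0, 0.0]]      # hypothetical diffusion matrix
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(certifies_mean_square_stability(A, [B], identity))
    print(certifies_mean_square_stability(A, [B], [[4.0, 0.0], [0.0, 1.0]]))
```

With these illustrative matrices the identity does not certify stability, because $Q(I)=Q_1$ is not negative-definite, while $C=\operatorname{diag}(4,1)$ does. This shows that the Corollary gives a sufficient condition only.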
EXAMPLE 1. Consider the semigroup on $R^2$ that is the solution of (15) with $Y(t)$ defined by (46), $\nu=0$, and $r=1$; the operators $A$ and $B$ have in some basis the matrices
$$A=\begin{pmatrix}a_1&0\\0&a_2\end{pmatrix},\qquad B=\begin{pmatrix}0&0\\1&0\end{pmatrix}.$$
Then
$$Q_1=\begin{pmatrix}2a_1+1&0\\0&2a_2\end{pmatrix}.$$
We have that $Q_1<0$ if and only if $2a_1+1<0$ and $a_2<0$. Let
$$C=\begin{pmatrix}c_1&c_2\\c_2&c_3\end{pmatrix},\qquad Q(C)=\begin{pmatrix}2a_1c_1+c_3&(a_1+a_2)c_2\\(a_1+a_2)c_2&2a_2c_3\end{pmatrix};$$
then $Q(C)<0$ if $2a_2c_3<0$ and $(2a_1c_1+c_3)(2a_2c_3)-(a_1+a_2)^2c_2^2>0$. The condition $C>0$ is equivalent to the following: $c_1,c_3>0$ and $c_1c_3-c_2^2>0$. Let $a_1,a_2<0$ and $2a_1+1>0$. Choose $C$ such that $Q(C)<0$. For this it is necessary only that
$$4a_1a_2c_1c_3+2a_2c_3^2-(a_1+a_2)^2c_2^2=4a_1a_2(c_1c_3-c_2^2)+2a_2c_3^2-(a_1-a_2)^2c_2^2>0.$$
For given $a_1$, $a_2$, $c_2$, and $c_3>0$ this can always be achieved by choosing $c_1$ sufficiently large; therefore, our semigroup is mean-square stable, but the condition $Q_1<0$ does not hold.

EXAMPLE 2. Consider the homogeneous stochastic semigroup that is the solution of (23) with
$$Y(t)=tA+\sum_{k=1}^rw_k(t)B_k$$
under the assumption that $A,A^*,B_1,\dots,B_r,B_1^*,\dots,B_r^*$ commute. As follows from (44),
$$U_t^0=\exp\Big\{t\Big(A-\frac12\sum_{k=1}^rB_k^2\Big)+\sum_{k=1}^rw_k(t)B_k\Big\},$$
$$(U_t^0)^*U_t^0=\exp\Big\{t\Big(A+A^*-\frac12\sum_{k=1}^r(B_k^{*2}+B_k^2)\Big)+\sum_{k=1}^rw_k(t)(B_k^*+B_k)\Big\}.$$
After using the formula
$$Ee^{w(t)B}=e^{tB^2/2},$$
where $w(t)$ is a Wiener process in $R^1$ (this formula can be obtained by expanding the exponential in a series), the independence of the $w_k(t)$, and the commutativity of the operators under the sign of the exponential, we find that
$$E(U_t^0)^*U_t^0=\exp\Big\{t\Big(A+A^*-\frac12\sum_{k=1}^r(B_k^{*2}+B_k^2)\Big)+\frac12t\sum_{k=1}^r(B_k^*+B_k)^2\Big\}=\exp\Big\{t\Big(A+A^*+\sum_{k=1}^rB_k^*B_k\Big)\Big\}=\exp\{tQ_1\}.$$
Thus, in this case the condition $Q_1<0$ is necessary and sufficient for stability.

The study of mean-square stability can be reduced to the study of such stability in invariant subspaces in which the stochastic semigroup is irreducible. For this it is possible to use the construction given after Lemma 10, along with the following fact.
LEMMA 11. Let $N_1\subset N\subset R^d$ be two invariant subspaces for the stochastic semigroup $U_t^s$, and let $Q_1$ and $Q$ be the operators of projection onto the subspaces $N_1$ and $N$, respectively. The stochastic semigroup is mean-square stable on $N$ if and only if it is stable on $N_1$ and the stochastic semigroup $(Q-Q_1)U_t^s(Q-Q_1)$ is stable on $N\ominus N_1$.

PROOF. The necessity is obvious. In view of the formula
$$QU_t^sQ=Q_1U_t^sQ_1+Q_1U_t^s(Q-Q_1)+(Q-Q_1)U_t^s(Q-Q_1)$$
(it follows from the invariance of $N_1$ that $(Q-Q_1)U_t^sQ_1=0$), to prove sufficiency it suffices to prove that for all $x$
$$\lim_{t\to\infty}E|Q_1U_t^0(Q-Q_1)x|^2=0.$$
The mean-square stability of the stochastic semigroups $Q_1U_t^sQ_1$ and $(Q-Q_1)U_t^s(Q-Q_1)$ implies that there are constants $c>0$ and $\alpha>0$ such that
$$E|Q_1U_t^0Q_1x|^2\le ce^{-\alpha t}|x|^2,\qquad E|(Q-Q_1)U_t^0(Q-Q_1)x|^2\le ce^{-\alpha t}|x|^2.$$
Let $Q_2=Q-Q_1$. Since $Q_2U_t^sQ_1=0$, for $0<t_1<t_2<t_3$ we have
$$\begin{aligned}
Q_1U_{t_3}^0Q_2&=Q_1U_{t_3}^{t_2}U_{t_2}^{t_1}U_{t_1}^0Q_2=Q_1U_{t_3}^{t_2}(Q_1+Q_2)U_{t_2}^{t_1}(Q_1+Q_2)U_{t_1}^0Q_2\\
&=Q_1U_{t_3}^{t_2}Q_2U_{t_2}^{t_1}Q_2U_{t_1}^0Q_2+Q_1U_{t_3}^{t_2}Q_1U_{t_2}^{t_1}Q_2U_{t_1}^0Q_2+Q_1U_{t_3}^{t_2}Q_1U_{t_2}^{t_1}Q_1U_{t_1}^0Q_2\\
&=Q_1U_{t_3}^{t_2}Q_2U_{t_2}^0Q_2+Q_1U_{t_3}^{t_2}Q_1U_{t_2}^{t_1}Q_2U_{t_1}^0Q_2+Q_1U_{t_3}^{t_1}Q_1U_{t_1}^0Q_2.
\end{aligned}$$
Similarly, for $0=t_0<t_1<\dots<t_n=t$
$$Q_1U_t^0Q_2=\sum_{i=1}^nQ_1U_t^{t_i}Q_1U_{t_i}^{t_{i-1}}Q_2U_{t_{i-1}}^0Q_2.$$
Let $t_i=i$, $i<n$, $n-1<t\le n$. Then
$$Q_1U_t^0Q_2=\sum_{i=1}^nQ_1U_t^{t_i}Q_1U_{t_i}^{t_{i-1}}Q_2U_{t_{i-1}}^0Q_2,$$
$$\begin{aligned}
E|Q_1U_t^0Q_2x|^2&\le n\sum_{i=1}^nE|Q_1U_t^{t_i}Q_1U_{t_i}^{t_{i-1}}Q_2U_{t_{i-1}}^0Q_2x|^2\\
&\le n\sum_{i=1}^nce^{-\alpha(t-t_i)}E|Q_1U_{t_i}^{t_{i-1}}Q_2U_{t_{i-1}}^0Q_2x|^2\\
&\le n\sum_{i=1}^nce^{-\alpha(t-t_i)}c_1E|Q_2U_{t_{i-1}}^0Q_2x|^2\\
&\le n\sum_{i=1}^nc^2c_1e^{-\alpha(t-t_i)-\alpha t_{i-1}}|x|^2\le c^2c_1n^2e^{-(n-2)\alpha}|x|^2.
\end{aligned}$$
Here $c_1$ is such that $E|U_t^0x|^2\le c_1|x|^2$ for $t\le1$. The right-hand side tends to zero as $n\to\infty$. □

REMARK. Let $N$ be an invariant subspace for the stochastic semigroup $U_t^s$ on which the latter is irreducible. For mean-square stability on this subspace it suffices that $E|U_t^0x|^2\to0$ for some $x\ne0$, $x\in N$. This follows from the fact that the semigroup is stable on the invariant subspace $N_x\subset N$, and $N_x=N$ under our assumptions.

2.6. Stability with probability 1. A stochastic semigroup $U_t^s$ is said to be stable with probability 1 on an element $x$ if
$$P\Big\{\lim_{t\to\infty}U_t^0x=0\Big\}=1.$$
We consider only homogeneous stochastic semigroups. Since for every $s>0$
$$1=P\Big\{\lim_{t\to\infty}U_t^0x=0\Big\}=P\Big\{\lim_{t\to\infty}U_{t+s}^0x=0\Big\}=P\Big\{\lim_{t\to\infty}U_{t+s}^sU_s^0x=0\Big\}=\int P\Big\{\lim_{t\to\infty}U_t^0z=0\Big\}\,P\{U_s^0x\in dz\},$$
stability of the stochastic semigroup with probability 1 on an element $x$ implies its stability on all elements $z\in N_x$, where $N_x$ is the smallest linear subspace on which the distribution of $U_s^0x$ is concentrated for all $s>0$ (obviously, the collection of elements on which the semigroup is stable with probability 1 forms a linear subspace). Since $N_x$ is invariant for the semigroup $U_t^s$, it is possible to define stability with probability 1 on an invariant subspace; in particular, the semigroup is said to be stable with probability 1 if it is stable on the whole space.

The stochastic semigroup $U_t^0$ is said to be unstable if for all nonzero $x$
$$P\Big\{\lim_{t\to\infty}|U_t^0x|=+\infty\Big\}=1.$$
Since
$$P\Big\{\lim_{t\to\infty}|U_t^0x|=+\infty\Big\}=\int P\Big\{\lim_{t\to\infty}|U_t^0z|=+\infty\Big\}\,P\{U_s^0x\in dz\},$$
the set of $x$ for which it is unstable is invariant.

THEOREM 13. Suppose that the semigroup $U_t^s$ is mean-square stable. Then it is stable with probability 1.

PROOF. Suppose that the semigroup satisfies equation (23) with a process $Y(t)$ for which $EY(t)=tA$. Then $Y(t)-tA$ is a martingale, and hence
$$U_t^0-\int_0^tAU_s^0\,ds=I+\int_0^td[Y(s)-As]\,U_s^0$$
is a martingale. Since
$$E\Big|U_t^0x-\int_0^tAU_s^0x\,ds\Big|^2\le2E|U_t^0x|^2+2\int_0^t\int_0^tE(AU_s^0x,AU_v^0x)\,ds\,dv\le2(V_t(I)x,x)+2\|A\|^2\int_0^t\int_0^t\sqrt{E|U_s^0x|^2\,E|U_v^0x|^2}\,ds\,dv$$
and for some $c>0$ and $\alpha>0$
$$E|U_t^0x|^2\le ce^{-\alpha t}$$
in view of mean-square stability, the quantity $E|U_t^0x-\int_0^tAU_s^0x\,ds|^2$ is bounded as $t\to\infty$. Hence, the martingale
$$U_t^0-\int_0^tAU_s^0\,ds$$
has a limit with probability 1. The inequality
$$\Big\|\int_0^\infty AU_s^0\,ds\Big\|\le\|A\|\int_0^\infty\|U_s^0\|\,ds$$
and the finiteness of the expectation of the variable on the right-hand side ($E\|U_s^0\|\le\sqrt{cd}\,e^{-\alpha s/2}$) give us that the limit relation
$$\lim_{t\to\infty}\int_0^tAU_s^0\,ds=\int_0^\infty AU_s^0\,ds$$
is valid with probability 1. Hence, $U_t^0$ has a limit with probability 1, and the limit must coincide with the limit in probability, hence with the mean-square limit. □

We give some examples that clear up the possible relations between the two forms of stability.

EXAMPLE 3. Suppose that $U_t^s$ is a solution of the equation
$$dU_t^0=AU_t^0\,dt+BU_t^0\,dw(t),$$
where $A,B\in L(R^d)$, $(Bx,x)=0$, and $A^*+A+B^*B=\lambda I$. Then
$$d(U_t^0x,U_t^0x)=(AU_t^0x,U_t^0x)\,dt+(U_t^0x,AU_t^0x)\,dt+(BU_t^0x,BU_t^0x)\,dt=\lambda|U_t^0x|^2\,dt$$
(since $(BU_t^0x,U_t^0x)=0$). Hence $|U_t^0x|^2=e^{\lambda t}|x|^2$ is nonrandom, and stability with probability 1 holds only if there is mean-square stability.

EXAMPLE 4. We consider the same equation as in the preceding example, and assume that $A$ and $B$ commute. Then
$$U_t^0=\exp\{t(A-B^2/2)+w(t)B\}.$$
For stability with probability 1 it suffices that $\exp\{t(A-B^2/2)\}\to0$.
Indeed, in this case $\|\exp\{t(A-B^2/2)\}\|\le ce^{-\delta t}$ for some $\delta>0$ and $c>0$, and hence
$$\|U_t^0\|\le ce^{-\delta t}e^{|w(t)|\,\|B\|}=c\exp\{-t(\delta-|w(t)|\,\|B\|/t)\}.$$
Since $|w(t)|/t\to0$, $\|U_t^0\|\to0$ with probability 1. On the other hand,
$$E(U_t^0)^2=\exp\{2t(A+B^2/2)\},$$
and hence mean-square stability implies that $\exp\{t(A+B^2/2)\}\to0$. Therefore, stability with probability 1 and mean-square stability differ for such a stochastic semigroup.

For the further study of stability with probability 1 it will be convenient for us to consider equations not in operator form but in vector form. Let $x(t)$ be a solution of equation (15), where $Y(t)$ has the form
$$Y(t)=tA_1+\sum_{k=1}^rw_k(t)B_k+\int_0^t\int U\,[\nu(ds\times dU)-I_{\{\|U\|\le c\}}m(dU)\,ds],\tag{53}$$
$c<1$, and all the remaining variables are the same as in (45). Let $y(t)=|x(t)|^{-1}x(t)$. This process is defined as long as $|x(t)|\ne0$. On the basis of the Itô formula (writing $x=x(t)$),
$$\begin{aligned}
dy(t)={}&\Big[|x|^{-1}A_1x-|x|^{-3}(x,A_1x)x-\frac12|x|^{-3}\sum_{k=1}^r(B_kx,B_kx)x+\frac32|x|^{-5}\sum_{k=1}^r(B_kx,x)^2x-|x|^{-3}\sum_{k=1}^r(x,B_kx)B_kx\\
&\quad+\int\big[|x+Ux|^{-1}(x+Ux)-|x|^{-1}x-|x|^{-1}Ux+|x|^{-3}(x,Ux)x\big]I_{\{\|U\|\le c\}}\,m(dU)\Big]dt\\
&+\sum_{k=1}^r\big(|x|^{-1}B_kx-|x|^{-3}(B_kx,x)x\big)\,dw_k(t)\\
&+\int\big\{|x+Ux|^{-1}(x+Ux)-|x|^{-1}x\big\}\,[\nu(dt\times dU)-I_{\{\|U\|\le c\}}m(dU)\,dt].
\end{aligned}$$
Therefore, $y(t)$ satisfies the following stochastic differential equation:
$$dy(t)=a(y(t))\,dt+\sum_{k=1}^rb_k(y(t))\,dw_k(t)+\int f(y(t),U)\,[\nu(dt\times dU)-I_{\{\|U\|\le c\}}m(dU)\,dt],\tag{54}$$
where
$$\begin{aligned}
a(y)={}&A_1y-(y,A_1y)y-\frac12\sum_{k=1}^r(B_ky,B_ky)y+\frac32\sum_{k=1}^r(B_ky,y)^2y-\sum_{k=1}^r(B_ky,y)B_ky\\
&+\int\big[|y+Uy|^{-1}(y+Uy)-y-Uy+(y,Uy)y\big]I_{\{\|U\|\le c\}}\,m(dU)
\end{aligned}$$
is a function from $R^d$ to $R^d$,
$$b_k(y)=B_ky-(B_ky,y)y,\qquad k=1,\dots,r,$$
are also functions from $R^d$ to $R^d$, and
$$f(y,U)=|y+Uy|^{-1}(y+Uy)-y$$
is a function from $R^d\times L(R^d)$ to $R^d$. For (54) to make sense it suffices that $|y+Uy|>0$ almost everywhere with respect to the measure $m(dU)$ for all $y\ne0$. By using the Itô formula it is easy to see that for $|y_0|=1$ a solution $y(t)$ of (54) satisfies the condition $|y(t)|=1$ for all $t>0$. Therefore, under our assumption (54) has a solution for all $t$. What is more, it is unique, since for $\|U\|\le c$ the function $f(y,U)$ satisfies a local Lipschitz condition with respect to $y$; $a(y),b_1(y),\dots,b_r(y)$ also satisfy such a condition, hence Theorem 1 in Chapter 4, §1 of Gikhman and Skorokhod [2] can be used. Therefore, $y(t)$ is a homogeneous Markov process on the sphere $S$ of unit radius about zero in $R^d$.

We now apply the Itô formula to the function $r(t)=|x(t)|$ (writing $x=x(t)$):
$$\begin{aligned}
dr(t)={}&|x|^{-1}(x,A_1x)\,dt-\frac12|x|^{-3}\sum_{k=1}^r(x,B_kx)^2\,dt+\frac12|x|^{-1}\sum_{k=1}^r|B_kx|^2\,dt\\
&+\int_{\|U\|\le c}\Big(|x+Ux|-|x|-\frac{(Ux,x)}{|x|}\Big)m(dU)\,dt+|x|^{-1}\sum_{k=1}^r(x,B_kx)\,dw_k(t)\\
&+\int\big(|x+Ux|-|x|\big)\,[\nu(dt\times dU)-I_{\{\|U\|\le c\}}m(dU)\,dt]\\
={}&r(t)\Big[\varphi(y(t))\,dt+\sum_{k=1}^r\varphi_k(y(t))\,dw_k(t)+\int g(y(t),U)\,\big(\nu(dt\times dU)-I_{\{\|U\|\le c\}}m(dU)\,dt\big)\Big],
\end{aligned}$$
where
$$\varphi(y)=(y,A_1y)-\frac12\sum_{k=1}^r(y,B_ky)^2+\frac12\sum_{k=1}^r|B_ky|^2+\int_{\|U\|\le c}\big(|y+Uy|-1-(y,Uy)\big)\,m(dU),$$
$$\varphi_k(y)=(y,B_ky),\qquad g(y,U)=|y+Uy|-1.$$
Since $r(t)>0$, it follows that
$$\begin{aligned}
d\ln r(t)={}&\varphi(y(t))\,dt-\frac12\sum_{k=1}^r\varphi_k^2(y(t))\,dt+\int_{\|U\|\le c}\big[\ln|y(t)+Uy(t)|+1-|y(t)+Uy(t)|\big]\,m(dU)\,dt\\
&+\sum_{k=1}^r\varphi_k(y(t))\,dw_k(t)+\int\ln|y(t)+Uy(t)|\,[\nu(dt\times dU)-I_{\{\|U\|\le c\}}m(dU)\,dt].
\end{aligned}$$
Hence,
$$r(t)=r(0)\exp\Big\{\int_0^tg(y(s))\,ds+\sum_{k=1}^r\int_0^t\varphi_k(y(s))\,dw_k(s)+\int_0^t\int\ln|y(s)+Uy(s)|\,[\nu(ds\times dU)-I_{\{\|U\|\le c\}}m(dU)\,ds]\Big\},\tag{55}$$
where
$$g(y)=(y,A_1y)-\sum_{k=1}^r(y,B_ky)^2+\frac12\sum_{k=1}^r|B_ky|^2+\int_{\|U\|\le c}\big[\ln|y+Uy|-(y,Uy)\big]\,m(dU).\tag{56}$$
For $r(t)$ to tend to zero with probability 1 it is necessary and sufficient that the argument of $\exp$ in (55) tend to $-\infty$ with probability 1. We use these considerations for proving the next theorem.

THEOREM 14. Assume the following conditions:

a) The operator $I+U$ is invertible almost everywhere with respect to the measure $m(dU)$ and for some $\delta>1$
$$\int_{\|U\|>c}\big(\big|\ln\|(I+U)^{-1}\|\big|^\delta+\big|\ln\|I+U\|\big|^\delta\big)\,m(dU)<\infty.$$
b) With probability 1
$$\varlimsup_{t\to\infty}\frac1t\int_0^tg_1(y(s))\,ds<0,$$
where
$$g_1(y)=g(y)+\int_{\|U\|>c}\ln|(I+U)y|\,m(dU).$$
Under these conditions, if $x/|x|=y(0)$, then
$$P\Big\{\lim_{t\to\infty}|U_t^0x|=0\Big\}=1.$$
PROOF. Under assumption a) the exponent of the exponential in (55) is representable in the form
$$\begin{aligned}
t\Big\{&\frac1t\int_0^tg_1(y(s))\,ds+\frac1t\int_0^t\int_{\|U\|\le c}\ln|y(s)+Uy(s)|\,[\nu(ds\times dU)-m(dU)\,ds]\\
&+\frac1t\sum_{k=1}^r\int_0^t\varphi_k(y(s))\,dw_k(s)+\frac1t\int_0^t\int_{\|U\|>c}\ln|y(s)+Uy(s)|\,[\nu(ds\times dU)-m(dU)\,ds]\Big\}.
\end{aligned}\tag{57}$$
Let
$$\eta(t)=\int_0^t\int_{\|U\|\le c}\ln|y(s)+Uy(s)|\,[\nu(ds\times dU)-m(dU)\,ds]+\sum_{k=1}^r\int_0^t\varphi_k(y(s))\,dw_k(s).$$
This is a square-integrable martingale with characteristic
$$\langle\eta,\eta\rangle_t=\int_0^t\Big(\sum_{k=1}^r\varphi_k^2(y(s))+\int_{\|U\|\le c}\ln^2|y(s)+Uy(s)|\,m(dU)\Big)ds\le at,$$
where $a$ is a constant. Therefore,
$$P\Big\{\sup_{0\le t\le2^n}|\eta(t)|>\frac1n\cdot2^n\Big\}\le\frac{n^2}{2^{2n}}E\langle\eta,\eta\rangle_{2^n}=O(n^2\cdot2^{-n}),$$
$$\sum_nP\Big\{\sup_{0\le t\le2^n}|\eta(t)|>\frac1n\cdot2^n\Big\}<\infty,$$
and since
$$\sup_{2^{n-1}\le t\le2^n}\frac1t|\eta(t)|\le\frac1{2^{n-1}}\sup_{0\le t\le2^n}|\eta(t)|,$$
the Borel-Cantelli theorem gives us that $|\eta(t)|/t\le2/\log_2t$ for all sufficiently large $t$. Hence,
$$P\Big\{\lim_{t\to\infty}\frac1t\eta(t)=0\Big\}=1.$$
We now consider the martingale
$$\eta_1(t)=\int_0^t\int_{\|U\|>c}\ln|y(s)+Uy(s)|\,[\nu(ds\times dU)-m(dU)\,ds].$$
It follows from condition a) that $E|\eta_1(t)|^\delta<\infty$, and hence $|\eta_1(t)|^\delta$ is a submartingale:
$$P\Big\{\sup_{t\le2^n}|\eta_1(t)|>2^nn^{-1}\Big\}=P\Big\{\sup_{t\le2^n}|\eta_1(t)|^\delta>2^{n\delta}n^{-\delta}\Big\}\le n^\delta2^{-n\delta}E|\eta_1(2^n)|^\delta.$$
Using the fact that for some $a$
$$E\Big|\int_t^{t+1}\int_{\|U\|>c}\ln|y(s)+Uy(s)|\,[\nu(ds\times dU)-m(dU)\,ds]\Big|^\delta\le a,$$
and also the inequality
$$|a+b|^\delta\le|a|^\delta+\delta|a|^\delta b/a+\gamma|b|^\delta,$$
where $a,b\in R$ and $\gamma$ is a constant, we find that
$$E|\eta_1(k+1)|^\delta\le E|\eta_1(k)|^\delta+\delta E\Big[|\eta_1(k)|^\delta\,\frac{\eta_1(k+1)-\eta_1(k)}{\eta_1(k)}\Big]+\gamma E|\eta_1(k+1)-\eta_1(k)|^\delta\le E|\eta_1(k)|^\delta+\gamma a\le(k+1)\gamma a$$
(the middle term vanishes because $\eta_1$ is a martingale). Hence,
$$P\Big\{\sup_{t\le2^n}|\eta_1(t)|>2^nn^{-1}\Big\}\le n^\delta\cdot2^{-n\delta}a\gamma2^n=O(n^\delta\cdot2^{n(1-\delta)}).$$
This implies that
$$P\Big\{\lim_{t\to\infty}\frac1t\eta_1(t)=0\Big\}=1.$$
Using condition b), we see that (57) tends with probability 1 to $-\infty$. □
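The role of the time average of $g$ in Theorem 14 can be illustrated numerically. The following is a rough sketch under stated assumptions, not the author's construction. It takes a pure diffusion (no jump part, so $g_1=g$) and uses an Euler-Maruyama step with renormalization to approximate $y(t)$ while accumulating $\ln r(t)$. The function names, step size, horizon, and matrices are all hypothetical. The test case uses $A=aI$ and $B=bJ$ with $J$ a rotation generator, for which (56) gives the constant value $g(y)=a+b^2/2$.

```python
import math
import random

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def g(A, Bs, y):
    """g(y) from (56) with the jump integral dropped (assumption: m = 0)."""
    val = dot(y, matvec(A, y))
    for B in Bs:
        By = matvec(B, y)
        val += 0.5 * dot(By, By) - dot(y, By) ** 2
    return val

def growth_rate_estimate(A, Bs, y0, T=50.0, dt=1e-3, seed=0):
    """Estimate (1/T) ln(r(T)/r(0)) for dx = A x dt + sum_k B_k x dw_k.

    Euler-Maruyama is applied to the unit vector y; the logarithm of the
    norm after each step is accumulated and y is renormalized, so the
    state stays on the unit sphere as in the text.
    """
    rng = random.Random(seed)
    n0 = math.sqrt(dot(y0, y0))
    y = [c / n0 for c in y0]
    steps = int(round(T / dt))
    sqdt = math.sqrt(dt)
    log_growth = 0.0
    for _ in range(steps):
        x = [yi + dt * ai for yi, ai in zip(y, matvec(A, y))]
        for B in Bs:
            dw = rng.gauss(0.0, sqdt)          # Wiener increment
            x = [xi + dw * bi for xi, bi in zip(x, matvec(B, y))]
        norm = math.sqrt(dot(x, x))
        log_growth += math.log(norm)
        y = [c / norm for c in x]
    return log_growth / (steps * dt)

if __name__ == "__main__":
    A = [[-1.0, 0.0], [0.0, -1.0]]   # a = -1 (illustrative)
    J = [[0.0, -1.0], [1.0, 0.0]]    # b = 1, rotation generator (illustrative)
    print(g(A, [J], [0.6, 0.8]))
    print(growth_rate_estimate(A, [J], [1.0, 0.0]))
```

In this example the estimate should be close to $g\equiv-1/2$, so the exponent in (55) decreases linearly and $r(t)\to0$. For a general pair $(A,B_k)$ the function $g$ is not constant, and the estimate approximates its average along the trajectory of $y(t)$.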
2. LINEAR EQUATIONS AND STOCHASTIC SEMIGROUPS 243 COROLLARY 1. Suppose that the stochastic semigroup Up is irreducible, and n(dy) is an ergodic distribution for the process y(t) (y(t) is a Feller process with compact phase space; hence y(t) has an ergodic distribution). If f gl(y)n(dy) < 00, then the semigroup Up is stable. Indeed, for n(dy)-almost all y(O) lim .!. t gl (y(s)) ds = f gl (y)n(dy) < 00. too t 10 Therefore, there exists an x =I- 0 such that P{lim IUpxl = O} = 1. Since the stochastic semigroup is irreducible, it is stable. COROLLARY 2. Suppose that the ergodic measure n(dz) for the process y(t) is unique. Then, by Lemma 5 in Chapter I, for any y(O) = Y lim .!. t gl (y(s)) ds = f gl (z)n(d z). too t 10 If f gl(z)n(dz) > 0, then for any x =I- 0 P { lim lutOxl = +oo } = 1, too i.e., the stochastic semigroup is unstable in this case. REMARK. Suppose that the ergodic measure n(dz) for the process y(t) is unique, and i oo f g(z)Py(y(t) E dz) dt < 00 for every function g( z) with f Ig(z)ln(dz) < 00, f g(z)n(dz) = O. In this case if f gl(z)n(dz) = 0, then the function Q(y) = i oo Eygl(y(t)) dt is defined, and Q(y(t)) - Q(y(O)) -I t gl(Y(s)) ds is a martingale. Obviously, the jumps of this martingale are Q(y(s)) - Q(y(s-)) and take place at jump points of y(s); therefore, Q(y(t)) - Q(y(O)) - it gl (y(s)) ds - it f[Q(y(S) + j(y(s), U)) - Q(y(s))] [v(ds x dU) - m(dU) ds] 
244 III. STABILITY. LINEAR SYSTEMS is a continuous martingale, and hence is representable in the form  I t V'k(y(S)) dWk(S). This establishes that the exponent of the exponential in (55) is Q(Y(O)) - Q(y(t)) +  I t (tpk(Y(S)) - V'k(y(S))) dWk(S) + I t f [In Iy(s) + Uy(s)1 + Q(y(s)) - Q(y(s) + f(y(s), U))](v(ds x dU) - m(dU) ds) = Q(y(O)) - Q(y(t)) + 'o(t) + '1 (t), where 'o(t) is a continuous martingale and '1 (t) is ajump process. Further, r {t ('0, 'o)r =  10 (tpk(Y(S)) - V'k(y(S)))2 ds, ('1> 'I)r = I t f [In Iy(s) + Uy(s)1 + Q(y(s)) - Q(y(s) + f(y(s), U))]2 m (dU) ds. We assume that sup f In2(y + Uy)m(dU) < 00. lyl=1 (58) If o < f f (t.(tpk(y) - V'k(y))) 2 + [In Iy + Uyl + Q(y) - Q(y + f(y, U))]2 xm(dU)1l(dy), then for the martingale 'o(t) + '1 (t) the characteristic tends to infinity, while condition (58) ensures that the variable '1 (-r) - '1 (-r-) is uniformly bounded with respect to all stopping times ,. Suppose that a < 0 < band , is the first exit time of the martingale 'o(t) + '1 (t) from the strip [a, b]. Then 0= E[,o(') + '1(')] < aP{,o(') + '1(') < a} + b + EI'I(') - '1(,-)1, b 1 P{,o(') + '1(') < a} < -- + -EI'l(') - '1(,-)1. a a Since the right-hand side tends to zero as a --+ -00, p { SP['o(t) + 'I (t)] > b } = I 
2. LINEAR EQUATIONS AND STOCHASTIC SEMIGROUPS 245 for all b. Therefore, p { lim r(t) = +oo } = 1, too i.e., the stochastic semigroup Up x is not stable. Suppose that the process y(t) has a transition probability density. We consider conditions under which the ergodic distribution n(dy) for the process y(t) is unique. As follows from Theorem 22 in Chapter I, this holds if the support of the measure n coincides with S. Let us consider some properties of the support Sx of the ergodic measure n. Let y E Sx be a nonisolated point. Denote by Ly the set of z such that IYn -Anzl = O(An) for some sequences An --+ 0 and Yn E Sx. The facts that 1) Sx is the set of essential states of the process (and hence the attainability of a neighborhood of Yn from the point Y implies the attainability of Y from Yn) and 2) y(t) depends continuously on the intial state (and hence the attainability of a neighborhood of Yn from Y implies the attainability of a neighborhood Yn +v from the point Y+v, where v is sufficiently small) give us the following properties: I. Ly contains - z if it contains z. II. Ly contains ZI + Z2 if it contains ZI and Z2. III. Ly contains az for all a E R if it contains z. IV. Ly is a linear subspace of Rd. V. Ly depends continuously on Y, and the dimension of Ly is the same on each connected component of the set Sx. VI. Sx = S if and only if the dimension of Ly is equal to d - 1 for at least one point. Indeed, Ly = {z: (y, z) = O} if S = Sx. Suppose that Yo is such that LyO = {z: (Yo, z) = O}. Denote by F the connected component of Sx containing Yo. If F ¥- S, then there is a point y E F which can be touched by a sphere not containing points of Sx. Obviously, we cannot have that Ly = {z: (y, z) = O} at y, since Ly does not contain a vector directed to the center of the contacting sphere. But for all Y E F the dimension of Ly is equal to d - 1. We have obtained a contradiction. VII. Denote by Ly the smallest subspace containing Y and Ly. 
Then assertion VI is equivalent to th! following: y = Rd for some y E SXo VIII. If Y E Sx, then Bky E Ly and Ay E Ly. Indeed, if either one of these relations fails to hold, then from the point y the process x(t) can in an arbitrarily small time proceed with positive probability in a direction not belonging to Ly, and this is impossible. IX. If y E Sx, then each of the curves z(t) = etAY/letAYI and Zk(t) = e tBk /le tBk yl, k = 1,..., r, t E R, also lies in Sx. 
246 III. STABILITY. LINEAR SYSTEMS Such curves were considered by Babchuk and Kulinich [1] in connection with the study of invariant sets for solutions of linear stochastic equations. Our assertion follows from the fact that these curves have the property that the tangents to them at each point z belong to Lz. X. Denote by ( the set of operators C E L(Rd) such that e tC y Ile tC yl E Sx for all t E R if y E Sx. Then ( contains the operators A, B 1 ,.. . , B" and it contains the commutator C 1 C 2 - C 2 C 1 = [C 1 , C 2 ] of each pair of operators C 1 and C 2 in it. The first assertion follows from IX. The second is a consequence of the following considerations. For all h > 0 e hC1 ehC2e-hCI e- hC2 y IIe hC1 ehC2e-hCI e- hC2 yl E Sx, however, ehC1 ehC2e-hCI e- hC2 = (I + hC 1 + !h2Cr)(I + hC2 + !h 2 C}) x (I - hC 1 + ! h2C r)(I - hC 2 + !h 2 C}) + O(h 3 ) = I + hC 1 + !h2Cr + hC 2 + !h 2 C} - hC 1 + !h2Cr - hC 2 + !h 2 C} + h 2 C 1 C 2 - h 2 C 1 C 2 - h 2 C 2 C 1 - h 2 C} + h 2 C 1 C 2 - h2Cr + O(h 3 ) t = I + h 2 ( C 1 C 2 - C 2 C 1 ) + O(h 3 ). Consequently, lim(ehclehc2e-hcle-hc2)[t/h2] = e t [C.,c 2 ] hO (here [.] is the integer part of a number). Therefore, e t [C.,c 2 ]y Ile t [C 1 ,C 2 ]yl E Sx. The next theorem follows from properties I-X. THEOREM 15. Let ( be the smallest linear collection of operators contain- ing the operators A and Bl,... ,B, and such that [C 1 , C 2 ] E (ifC 1 , C 2 E (. An ergodic distribution for the process y(t) is unique if {Cy, C E (} = Rd for all y # o. 2.7. p-stability. DEFINITION. Let p > O. The stochastic semigroup Up is said to be p-stable if lim EIUoxl P = 0 too t (59) for all x. 
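Definition (59) can be made concrete in the simplest scalar case. The sketch below is an illustration under assumptions that are not in the text: it takes $d=1$, no jumps, and the equation $dx=ax\,dt+bx\,dw(t)$, whose solution is $x(t)=x\exp\{(a-b^2/2)t+bw(t)\}$. Using $Ee^{\theta w(t)}=e^{\theta^2t/2}$ one gets $E|x(t)|^p=|x|^p\exp\{p(a+(p-1)b^2/2)t\}$, so $p$-stability reduces to the sign of one number. The function names are hypothetical.

```python
def moment_exponent(a, b, p):
    """Exponential rate of E|x(t)|^p for dx = a x dt + b x dw (scalar case)."""
    return p * (a + 0.5 * (p - 1.0) * b * b)

def is_p_stable(a, b, p):
    """True if E|x(t)|^p -> 0 as t -> infinity, i.e. (59) holds."""
    return moment_exponent(a, b, p) < 0.0

if __name__ == "__main__":
    a, b = 0.25, 1.0                 # illustrative coefficients
    for p in (0.25, 0.5, 1.0, 2.0):
        print(p, moment_exponent(a, b, p), is_p_stable(a, b, p))
```

With these illustrative coefficients $a-b^2/2<0$, so $x(t)\to0$ with probability 1, yet the second moment grows. The moments of order $p<1-2a/b^2=1/2$ decay, which shows that $p$-stability depends on $p$.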
2. LINEAR EQUATIONS AND STOCHASTIC SEMIGROUPS 247 REMARK. This definition differs from the generally accepted one (see, for example, Khas'minskii [5], Chapter V, 7). One speaks of asymptotic stability when the indicated properties hold. As shown in  1, stability and asymptotic stability coincide under very general conditions for stochastic systems. Therefore, here we consider only asymptotic p-stability (instabil- ity), omitting for brevity whe word "asymptotic". Obviously, the set of x such that (59) holds forms an invariant linear space (this can be established just as for p = 2). Therefore, if the stochastic semigroup Up is irreducible, then it is p-stable if (59) holds for at least one x "# o. DEFINITION. The stochastic semigroup Up is said to be exponentially p-stable if there exists an a > 0 such that EIUtOxl P < e-atlxl P fa. The next theorem establishes a connection between p-stability, expo- nential p-stability, and stability with probability 1. THEOREM 16. Suppose that Up is a solution of equation (23), where r Y(t) = tAl + L wk(t)B k + ( U[v(ds x dU) - m(dU) ds], k=1 JIIUII<c v is the same as in (45), and c < 1. Then the following assertions are equivalent: 1) Up is stable with probability 1. 2) Up is p-stable for some p > o. 3) Up is exponentially p-stable (p can be the same as in 2)). PROOF. We show that assertion 2) follows from 1). Let el,. . . , ed be a basis in Rd. Since P{SUPt IUPekl > c} can be made arbitrarily small by suitably choosing c, and p {Sp IUtOxl > c} < P {t l(x,ek)1 sP IUtOekl > c} < tp { suPIUtOe kl >  } k=1 t for Ix I < 1, it follows that sup P { sUPIUtOXI > 2 Y } < 2 1 Ixll t 
248 III. STABILITY. LINEAR SYSTEMS for some y > O. Then for any P sup P { SUP IUtOxl > 2 Y P } < 2 1 . Ixlp t By assumption, II U:- - I II < c < 1. Let , be the first time when I U x I > PI > p. Then IU_xl < PI, IIU:-II < 1 + c, and hence IUxl < PI (1 + c). Therefore, sup P { SUPIUtOXI > 2 Y pl(1 +c) } Ixl:::; 1 t < sup P{supIlUt'Uxll > 2 Y pl(1 +c)} Ixl:::;1 = sup E P { SUPIUtTUXI > 2 Y pl(1 +C)I91; } I{T<OO} Ixl:::; 1 t>T < sup P { suP1UtOYI > 2 Y pl(1 +C) } sup Ex{' < oo} lylPI(1+c) t Ixl1 < 2 1 sup P { SUP IUtOxl > PI } . Ixl<1 t If PI = 2 kY (1 + c)k-l in this inequality, then sup P { SUPIUtOXI > 2(k+l)Y(1 +C)k } Ixl:::;1 t < 2 1 sup P { suPtUtOXI > 2 kY (1 +C)k-l } Ixl:::;1  t < 2 1 k sup P { SUPIUtOXIY > 2 Y } < 2 Ll o Ixl:::;1 t Therefore, E sup IUtOxl P < 1 + f: P { SUP I UtOxl > 2 kY (1 + C)k } t k=O t X [2(k+l)Y (1 + c)k+l]p 00 < 1 + 2 Y (1 + c)P L 2- k (l-yp-p log2(I+c)) < 00, k=O provided that p(y + log2(1 + c)) < 1. Since IUpxl P --+ 0 in probability and E SUPt I Up xl P < 00, Lebesgue's theorem gives us that limtoo EI Up xl PI = 0, o < PI < p. Hence, 2) follows from 1). We show that 2) implies 3). Obviously, by 2), lim sup EIUtOxl P = O. too Ix I:::; 1 
Choose $s$ such that $\sup_{|x|\le1}E|U_s^0x|^p<\tfrac12$, and hence
$$\sup_{|x|\le\rho}E|U_s^0x|^p<\tfrac12\rho^p.$$
Then for $t>s$
$$\begin{aligned}
\sup_{|x|\le1}E|U_t^0x|^p&=\sup_{|x|\le1}E|U_t^{t-s}U_{t-s}^0x|^p=\sup_{|x|\le1}EE\big(|U_t^{t-s}(U_{t-s}^0x)|^p\mid\mathcal F_{t-s}\big)\\
&=\sup_{|x|\le1}E\big(E|U_s^0y|^p\big|_{y=U_{t-s}^0x}\big)\le\sup_{|x|\le1}E\big(\tfrac12|U_{t-s}^0x|^p\big).
\end{aligned}$$
Hence, for $ns\le t<ns+s$
$$\sup_{|x|\le1}E|U_t^0x|^p\le2^{-n}\sup_{|x|\le1}E|U_{t-ns}^0x|^p\le2\sup_{|x|\le1,\,u\le s}E|U_u^0x|^p\,e^{-(n+1)\ln2}\le2\sup_{|x|\le1,\,u\le s}E|U_u^0x|^p\,e^{-(t/s)\ln2}.$$
This implies the exponential $p$-stability of $U_t^0$.

Finally, we show that 3) implies 1). Let
$$v(x)=E\int_0^\infty|U_t^0x|^p\,dt$$
(if 3) holds, then $v(x)$ is defined and continuous). Obviously, $v(x)>0$ for $|x|\ne0$, $v(0)=0$, and $v(x)$ is a superharmonic function for the process $x(t)=U_t^0x$. Therefore, $v(U_t^0x)$ is a supermartingale, and $\lim_{t\to\infty}v(U_t^0x)$ exists with probability 1. But $v(x)\le c_1|x|^p$ for some $c_1>0$; hence $Ev(U_t^0x)\to0$. Therefore, $v(U_t^0x)\to0$ with probability 1, and hence $U_t^0x\to0$ with probability 1. □

DEFINITION. The stochastic semigroup $U_t^0$ is said to be $p$-unstable if for all $x\ne0$
$$\lim_{t\to\infty}E|U_t^0x|^{-p}=0,$$
and it is exponentially $p$-unstable if for some $\alpha>0$
$$E|U_t^0x|^{-p}\le\frac1\alpha e^{-\alpha t}|x|^{-p}.$$
THEOREM 17. If $U_t^0$ is as in Theorem 16 and the process $y(t)=|U_t^0x|^{-1}U_t^0x$ has a unique ergodic distribution, then the following assertions are equivalent:

1) $P\{\lim_{t\to\infty}|U_t^0x|=+\infty\}=1$ for all $x\ne0$.

2) $U_t^0$ is $p$-unstable for some $p>0$.

3) $U_t^0$ is exponentially $p$-unstable ($p$ can be taken as in 2)).

PROOF. The proof is analogous to that of Theorem 16; therefore, we dwell on the points where they differ. Using the representation (55) for
250 III. STABILITY. LINEAR SYSTEMS our case, we see that under the assumptions of the theorem we have f g(y)n(dy) > 0 (n is a unique measure). From this and the fact that 1 {t ( UOx ) ! t 10 Eg ,U:oxi ds -+ g(Y)1C(dy) uniformly in x (this follows from the uniqueness of the ergodic distribu- tion and the remark after Theorem 21 in 4 of Chapter I) it follows easily that lim P { SUP IUtOxl-l > C } = 0 coo t uniformly with respect to Ixl > 1, if only assertion 1) holds. If y > 0 is chosen so that sup P { SUP IUtOxl-l > 2 Y } < 2 1 , Ixll t then, considering that I U:- x I > (1 - c) Ix I, we can establish that sup P { SUP IUtOxl-l > 2 kY (1 _ C)-k+l } < ( ! ) k Ixll t 2 and hence sup EsuplUtOxl- P < 00 Ixll t for ( 1 ) -1 P < Y + log2 "1 - c . This implies that 2) holds if 1) holds. The fact that 2) implies 3) can be proved as in Theorem 16. Finally, if 3) holds, then the function v(x) = fooo EIUpxl- P dt is superharmonic, and cllxl- P > v(x) > c2lxl- P for some Cl and C2. This gives us that the limit limtoo v(x(t)) exists with probability 1; since Ev(x(t)) --+ 0, this limit is equal to zero, but then limtoo IUpxl- P = 0, i.e., 1) holds. 0 We present some sufficient conditions for p-stability and p-instability. THEOREM 18. Let Up be a solution of the equation r dUtO = AUtO dt + LBkUtO dWk(t), k=l where {wk(t),k = 1,...,r} are independent one-dimensional Wiener processes. Then the following assertions are true: a) If r 1 r (Ax, x) - L(Bk X ,X)2 + 2 L I B k X l 2 < 0 k=l k=l 
for all $x$ with $|x|=1$, then $U_t^0$ is $p$-stable for some $p>0$.

b) If
$$(Ax,x)-\sum_{k=1}^r(B_kx,x)^2+\frac12\sum_{k=1}^r|B_kx|^2>0$$
for all $x$ with $|x|=1$, then the stochastic semigroup $U_t^0$ is $p$-unstable for some $p>0$.

PROOF. Using the Itô formula for the function $|x|^\alpha$ ($\alpha$ can be positive or negative), we can write
$$\begin{aligned}
d|x(t)|^\alpha={}&\Big[\alpha|x(t)|^{\alpha-2}(Ax(t),x(t))+\frac{\alpha(\alpha-2)}2|x(t)|^{\alpha-4}\sum_{k=1}^r(B_kx(t),x(t))^2+\frac\alpha2|x(t)|^{\alpha-2}\sum_{k=1}^r(B_kx(t),B_kx(t))\Big]dt\\
&+\alpha|x(t)|^{\alpha-2}\sum_{k=1}^r(B_kx(t),x(t))\,dw_k(t).
\end{aligned}$$
If condition a) holds, then, choosing $\alpha>0$ such that
$$(Ax,x)+\frac{\alpha-2}2\sum_{k=1}^r(B_kx,x)^2+\frac12\sum_{k=1}^r|B_kx|^2<-\alpha$$
for $|x|=1$, we have that
$$\frac{d}{dt}E|x(t)|^\alpha\le-\alpha^2E|x(t)|^\alpha,\qquad E|x(t)|^\alpha\le|x|^\alpha e^{-\alpha^2t}.$$
If b) holds, then we choose $\alpha<0$ such that
$$(Ax,x)+\frac{\alpha-2}2\sum_{k=1}^r(B_kx,x)^2+\frac12\sum_{k=1}^r|B_kx|^2>-\alpha$$
for $|x|=1$. Then
$$\frac{d}{dt}E|x(t)|^\alpha\le-\alpha^2E|x(t)|^\alpha,\qquad E|x(t)|^\alpha\le|x|^\alpha e^{-\alpha^2t}.\ \square$$

3. Stability of solutions of stochastic differential equations

3.1. Stability and instability in first approximation. Consider the homogeneous stochastic differential equation
$$dx(t)=a(x(t))\,dt+\sum_{k=1}^rb_k(x(t))\,dw_k(t)+\int f(x(t),\theta)\,\mu(d\theta\times dt)\tag{60}$$
in $R^d$, where $w_k$ and $\mu$ are as in the preceding section. Assume that zero is a stationary point for the equation, i.e., $a(0)=0$, $b_1(0)=\dots=b_r(0)=0$,
and $f(0,\theta)=0$. If the coefficients of the equation are differentiable at 0, and $A$, $B_k$, and $F(\theta)$ are the derivatives of the functions $a(x)$, $b_k(x)$, and $f(x,\theta)$ at 0 (these are linear operators), then
$$a(x)=Ax+a_0(x),\qquad b_k(x)=B_kx+b_k^0(x),\qquad f(x,\theta)=F(\theta)x+f_0(x,\theta),\tag{61}$$
where $a_0(x)$, $b_k^0(x)$, and $f_0(x,\theta)$ are small in comparison with $|x|$ in a neighborhood of 0. It is natural to single out the "principal" part of the equation in a neighborhood of 0, namely, the linear equation
$$dx(t)=Ax(t)\,dt+\sum_{k=1}^rB_kx(t)\,dw_k(t)+\int F(\theta)x(t)\,\mu(d\theta\times dt).\tag{62}$$
Its solution is called the first approximation. The assertions about stability (instability) in first approximation are formulated as follows: if the solution of (62) is stable (unstable), then so is the solution of (60). This assertion is valid under certain assumptions. We formulate them.

1. The functions $a(x)$, $b_k(x)$, $k=1,\dots,r$, and $f(x,\theta)$ satisfy the Lipschitz condition
$$|a(x)-a(y)|^2+\sum_{k=1}^r|b_k(x)-b_k(y)|^2+\int|f(x,\theta)-f(y,\theta)|^2\,m(d\theta)\le l|x-y|^2.$$

2. The representations (61) are valid, where
$$|a_0(x)|+\sum_{k=1}^r|b_k^0(x)|+\Big[\int|f_0(x,\theta)|^2\,m(d\theta)\Big]^{1/2}+\sup_\theta|f_0(x,\theta)|\le\varepsilon|x|+|x|^2,$$
and $\varepsilon$ is sufficiently small.

3. $\sup_\theta\|F(\theta)\|<1$.

4. Let $y(t)=x(t)/|x(t)|$; this is a homogeneous Markov process on the sphere, and it has a unique ergodic distribution $\rho(dz)$.

5. If $c=\int q_1(z)\,\rho(dz)$, where
$$q_1(y)=(y,Ay)-\sum_{k=1}^r(y,B_ky)^2+\frac12\sum_{k=1}^r|B_ky|^2+\int\big[\ln|y+F(\theta)y|-(y,F(\theta)y)\big]\,m(d\theta),$$
then $c\ne0$.
THEOREM 19. Assume conditions 1-5. There exists an $\varepsilon_0>0$, depending only on $A$, $B_k$, $k=1,\dots,r$, $F(\theta)$, and $m(d\theta)$, such that for $\varepsilon<\varepsilon_0$ the solution of (60) is stable if the solution of (62) is stable with probability 1, and it is unstable if the solution of (62) is unstable.

PROOF. Consider first the case $c<0$. Then the solution of (62) is stable (see Theorem 14 and Corollary 1). Therefore, by Theorem 16 there exists a $p>0$ such that $x(t)$ is asymptotically $p$-stable, i.e., for some $\alpha>0$
$$E_x|x(t)|^p\le|x|^pe^{-\alpha t}/\alpha.$$
We use (55) to represent $|x(t)|$. This implies that for all real $\gamma$ there exists a $c_\gamma<\infty$ such that
$$E_x|x(t)|^\gamma\le|x|^\gamma\exp\{c_\gamma t\}.\tag{63}$$
For this it suffices to prove that
$$E\exp\Big\{\int_0^tg(s)\,dw_k(s)\Big\}\le\exp\{c_1t\}$$
if $g$ is a bounded adapted function, and, as easily follows from the Itô formula, for $|g|\le k$
$$E\exp\Big\{\int_0^tg(s)\,dw_k(s)\Big\}\le\exp\{k^2t/2\},$$
and
$$E\exp\Big\{\int_0^t\int\varphi(s,\theta)\,\mu(ds\times d\theta)\Big\}\le\exp\{c_2t\},\tag{64}$$
provided only that $|\varphi(s,\theta)|\le k$ and $\int|\varphi(s,\theta)|^2\,m(d\theta)\le k$ for all $s$, where the constant $c_2$ depends solely on $k$. It suffices to establish (64) for step functions $\varphi(s,\theta)$. However,
$$\begin{aligned}
E\Big(\exp\Big\{\int_{t_i}^{t_{i+1}}\int\varphi(t_i,\theta)\,\mu(ds\times d\theta)\Big\}\,\Big|\,\mathcal F_{t_i}\Big)&=\exp\Big\{(t_{i+1}-t_i)\int\big[e^{\varphi(t_i,\theta)}-1-\varphi(t_i,\theta)\big]\,m(d\theta)\Big\}\\
&\le\exp\Big\{\tfrac12e^k(t_{i+1}-t_i)\int\varphi^2(t_i,\theta)\,m(d\theta)\Big\}\le\exp\{\tfrac12ke^k(t_{i+1}-t_i)\}.
\end{aligned}$$
We have used the fact that $\mu(ds\times d\theta)$ does not depend on $\mathcal F_{t_i}$, along with the inequality $|e^\lambda-1-\lambda|\le e^k\lambda^2/2$ for $|\lambda|\le k$. It is clear from this that (64) holds with $c_2=ke^k/2$ for step functions. Hence, it is valid also for all functions $\varphi(s,\theta)$.
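The elementary inequality $|e^\lambda-1-\lambda|\le e^k\lambda^2/2$ for $|\lambda|\le k$, on which the constant $c_2=ke^k/2$ rests, is easy to check on a grid. The sketch below is only a numerical sanity check, not a proof. The grid size, the tolerance, and the function names are assumptions made for illustration.

```python
import math

def max_violation(k, n=2001):
    """Largest value of |e^x - 1 - x| - e^k x^2 / 2 over a grid on [-k, k].

    A value that is zero up to rounding means no violation was found.
    """
    worst = -math.inf
    for i in range(n):
        lam = -k + 2.0 * k * i / (n - 1)
        lhs = abs(math.expm1(lam) - lam)      # expm1 avoids cancellation near 0
        rhs = math.exp(k) * lam * lam / 2.0
        worst = max(worst, lhs - rhs)
    return worst

def c2(k):
    """The constant c_2 = k e^k / 2 from the estimate (64)."""
    return k * math.exp(k) / 2.0

if __name__ == "__main__":
    for k in (0.1, 1.0, 5.0):
        print(k, max_violation(k), c2(k))
```

The bound is attained only at $\lambda=0$, so the reported maximum should be zero up to floating-point rounding for every $k>0$.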
Denote by $U_t^0$ the solution of the operator stochastic differential equation
\[ dU_t^0 = AU_t^0\,dt + \sum_{k=1}^{r} B_k U_t^0\,dw_k(t) + \int F(\theta)U_t^0\,\mu(d\theta\times dt), \qquad U_0^0 = I. \tag{65} \]
Then $\bar x(t) = U_t^0\bar x(0)$. For some $t_0 > 0$ let
\[ v(x) = E_x\int_0^{t_0}|\bar x(s)|^p\,ds = E\int_0^{t_0}|U_s^0 x|^p\,ds. \tag{66} \]
Then
\[ v'(x) = E\int_0^{t_0} p\,|U_s^0 x|^{p-2}\,(U_s^0)^{*}U_s^0 x\,ds, \]
\[ v''(x) = E\int_0^{t_0}\Bigl(p\,|U_s^0 x|^{p-2}(U_s^0)^{*}U_s^0 + p(p-2)\,|U_s^0 x|^{p-4}\,(U_s^0)^{*}U_s^0 x\circ (U_s^0)^{*}U_s^0 x\Bigr)\,ds. \]
In view of (63) the derivatives $v'$ and $v''$ are defined for $|x| \ne 0$, and
\[ k_1|x|^p \le v(x) \le k_2|x|^p, \qquad |v'(x)| \le k_3(t_0)|x|^{p-1}, \qquad \|v''(x)\| \le k_4(t_0)|x|^{p-2}. \tag{67} \]
We can take
\[ k_2 = \frac{1}{\alpha^2}, \qquad k_3(t_0) = \frac{p}{c_{p-1}}\exp\{c_{p-1}t_0\}, \qquad k_4(t_0) \le \frac{p^2}{c_{p-2}}\exp\{c_{p-2}t_0\} \]
(use (63) to estimate the derivatives); $k_1 > 0$ is a certain constant,
\[ k_1 = \inf_{|y|=1} E\int_0^{t_0}|U_s^0 y|^p\,ds, \]
and the fact that it is positive follows from the continuity of $E\int_0^{t_0}|U_s^0 x|^p\,ds$ and the fact that this function is positive for $|x| \ne 0$.

We consider the integro-differential operators
\[ \bar L\varphi(x) = (Ax,\varphi'(x)) + \frac12\sum_{k=1}^{r}(\varphi''(x)B_k x, B_k x) + \int\bigl[\varphi(x+F(\theta)x) - \varphi(x) - (\varphi'(x),F(\theta)x)\bigr]\,m(d\theta), \]
\[ L\varphi(x) = \bar L\varphi(x) + L_0\varphi(x), \]
\[ L_0\varphi(x) = (a_0(x),\varphi'(x)) + \frac12\sum_{k=1}^{r}(\varphi''(x)b_k^0(x), b_k^0(x)) + \int\bigl[\varphi(x+f_0(x,\theta)) - \varphi(x) - (\varphi'(x),f_0(x,\theta))\bigr]\,m(d\theta). \]
The operator $\bar L$ is the generating operator on twice continuously differentiable functions for the process $\bar x(t)$, and $L$ is the generating operator on twice continuously differentiable functions for the process $x(t)$. If $\varphi \in C_b^2$, then, denoting by $\bar T_t$ the semigroup corresponding to the process $\bar x(t)$, we have that
\[ \bar L\,E\int_0^{t_0}\varphi(U_s^0 x)\,ds = \bar L\int_0^{t_0}\bar T_s\varphi(x)\,ds = \bar T_{t_0}\varphi(x) - \varphi(x). \]
It is easy to see by passing to the limit that
\[ \bar L v(x) = E|U_{t_0}^0 x|^p - |x|^p \le -|x|^p\Bigl(1 - \frac{1}{\alpha}e^{-\alpha t_0}\Bigr). \]
Choose $t_0$ so that $\bar L v(x) \le -|x|^p/2$. Now
\[ L_0 v(x) = (a_0(x),v'(x)) + \frac12\sum_{k=1}^{r}(v''(x)b_k^0(x), b_k^0(x)) + \int\bigl(v(x+f_0(x,\theta)) - v(x) - (v'(x),f_0(x,\theta))\bigr)\,m(d\theta). \]
Using condition 2, we find that
\[
\begin{aligned}
|L_0 v(x)| &\le (\varepsilon|x| + |x|^2)\,|v'(x)| + (\varepsilon|x| + |x|^2)^2\,\|v''(x)\| \\
&\le |x|^p(\varepsilon + |x|)\,k_3(t_0) + |x|^p(\varepsilon + |x|)^2\,k_4(t_0)
\end{aligned}
\]
(in view of the estimates (67) for the derivatives). Hence,
\[ Lv(x) \le -|x|^p\bigl(\tfrac12 - \varepsilon k_3(t_0) - \varepsilon^2 k_4(t_0) - k_5(|x| + |x|^2)\bigr), \]
where $k_5$ is a constant. Let $\varepsilon_0$ be such that $1/2 - \varepsilon k_3(t_0) - \varepsilon^2 k_4(t_0) > 1/4$ for $\varepsilon < \varepsilon_0$. Then there exists a $\delta > 0$ such that $Lv(x) \le -|x|^p/4$ for $|x| < \delta$. Therefore, $Lv(x) \le -kv(x)$ for some $k$, and the (asymptotic) stability of the point 0 for the process $x(t)$ follows from Theorem 8 and the remarks after it.

Now let $c > 0$. On the basis of Corollary 2 we can conclude from Theorem 14 that for all $x$
\[ P_x\Bigl\{\lim_{t\to\infty}|\bar x(t)| = +\infty\Bigr\} = 1. \]
Hence, by Theorem 17, $E_x|\bar x(t)|^{-p} \le (|x|^{-p}/\alpha)\,e^{-\alpha t}$ for some $p > 0$ and $\alpha > 0$. Let
\[ z(x) = E_x\int_0^{t_0}|\bar x(s)|^{-p}\,ds = \int_0^{t_0}E|U_s^0 x|^{-p}\,ds. \]
Using the same arguments as in the derivation of (67), we can see that there exist $l_1$, $l_2$, $l_3(t_0)$, and $l_4(t_0)$ such that
\[ l_1|x|^{-p} \le |z(x)| \le l_2|x|^{-p}, \qquad |z'(x)| \le l_3(t_0)|x|^{-p-1}, \qquad \|z''(x)\| \le l_4(t_0)|x|^{-p-2}. \]
Therefore,
\[ |L_0 z(x)| \le (\varepsilon + |x|)\,|x|^{-p}\,l_3(t_0) + (\varepsilon + |x|)^2\,|x|^{-p}\,l_4(t_0). \tag{68} \]
Further,
\[ \bar L z(x) = -|x|^{-p} + E|U_{t_0}^0 x|^{-p} \le -|x|^{-p}\bigl(1 - e^{-\alpha t_0}/\alpha\bigr) \le -|x|^{-p}/2 \]
(if $t_0$ is chosen as earlier). Therefore, there is a $\delta > 0$ such that for $|x| < \delta$
\[ Lz(x) \le -l_5 z(x), \qquad l_5 > 0. \]
Thus, $z(x)$ is a $\lambda$-supermartingale for $0 < \lambda < l_5$. Hence, the limit
\[ \lim_{t\to\infty} z(x(t\wedge\tau_U))\,e^{\lambda(t\wedge\tau_U)} \]
exists and is finite, i.e., $P_x\{\tau_U < \infty\} = 1$ for all $x \in U$, because $|x(t)| < \delta$ for $t < \tau_U$, and $|x(\tau_U)| \le \delta(1 + c_1)$, where
\[ c_1 = \sup_{|x|\le\delta,\ \theta}|f(x,\theta)|. \]
This proves that 0 is unstable.

EXAMPLE 1. We consider the one-dimensional stochastic equation
\[ dx(t) = a(x_t)\,dt + b(x_t)\,dw(t) + \int f(\theta,x_t)\,\mu(d\theta\times dt). \tag{69} \]
Here $a(x)$ and $b(x)$ are differentiable functions of $x$, $|f(\theta,x)| \le c|x|$, where $c < 1$, and the limit
\[ \lim_{x\to 0}\frac{f(\theta,x)}{x} = \bar f(\theta) \]
exists for $m(d\theta)$-almost all $\theta \in \Theta$. Assume also that $a(0) = b(0) = 0$ and
\[ \bar a = \lim_{x\to 0}\frac{a(x)}{x}, \qquad \bar b = \lim_{x\to 0}\frac{b(x)}{x}. \]
The main part of the equation has the form
\[ d\bar x_t = \bar a\,\bar x_t\,dt + \bar b\,\bar x_t\,dw(t) + \int \bar f(\theta)\,\bar x_t\,\mu(d\theta\times dt). \]
The solution of this equation is
\[ \bar x_t = x_0\exp\Bigl\{\Bigl(\bar a - \frac{\bar b^2}{2} + \int\bigl[\ln(1+\bar f(\theta)) - \bar f(\theta)\bigr]\,m(d\theta)\Bigr)t + \bar b\,w(t) + \int_0^t\!\!\int\ln(1+\bar f(\theta))\,\mu(d\theta\times dt)\Bigr\}, \]
\[ c = \bar a - \frac{\bar b^2}{2} + \int\bigl(\ln(1+\bar f(\theta)) - \bar f(\theta)\bigr)\,m(d\theta). \]
The process $\bar x_t$ is stable for $c < 0$ and unstable for $c > 0$.

EXAMPLE 2. Suppose that the process $x_t$ is continuous and satisfies (69) with $f = 0$. We consider the case $c = \bar a - \bar b^2/2 = 0$. Let $x\cdot 2a(x)/b^2(x) - 1 = \alpha(x)$. This function tends to zero as $x \to 0$. As follows from results in §1.3, $x(t)$ is stable at 0 if and only if for $\delta > 0$
\[ \infty > \int_0^\delta \exp\Bigl\{\int_x^\delta\frac{2a(z)}{b^2(z)}\,dz\Bigr\}\,dx = \int_0^\delta\frac{\delta}{x}\exp\Bigl\{\int_x^\delta\frac{\alpha(z)}{z}\,dz\Bigr\}\,dx. \]
The function $\chi(x) = \exp\{\int_x^\delta(\alpha(z)/z)\,dz\}$ is slowly varying, and a condition for stability at 0 from the right is that
\[ \int_0^\delta\frac{1}{z}\,\chi(z)\,dz < \infty. \]
If $a(x)$ and $b(x)$ are twice continuously differentiable functions, then $\alpha(x)$ has a derivative, and hence the limit $\lim_{x\to 0}(\alpha(x)/x) = \alpha$ exists. Then the limit $\lim_{z\to 0}\chi(z) = \beta \ne 0$ exists, and the point 0 is unstable.

EXAMPLE 3. We consider the solution of the equation
\[ dx_t = a x_t\,dt + b x_t\,dw(t) + q x_t\,d\nu_t, \]
where $\nu_t$ is a homogeneous Poisson process with jumps 1 and with mean value $mt$. Then
\[ dx_t = a x_t\,dt + q m x_t\,dt + b x_t\,dw(t) + q x_t\,d\tilde\nu_t, \qquad \tilde\nu_t = \nu_t - mt, \]
\[ c = a + qm - \frac{b^2}{2} + [\ln(1+q) - q]\,m = a - \frac{b^2}{2} + m\ln(1+q). \]
If $a - b^2/2 + m\ln(1+q) < 0$, i.e., $q < e^{(b^2-2a)/2m} - 1$, then the solution is stable, and the solution is unstable for $q > \exp\{(b^2-2a)/2m\} - 1$.

Let us consider diffusion processes that are solutions of stochastic differential equations of the form
\[ dx_t = a(x_t)\,dt + \sum_{k=1}^{r} b_k(x_t)\,dw_k(t). \tag{70} \]
Assume the relation holds along with conditions 1 and 2 with $f = 0$. We are interested in sufficient conditions for stability and instability in first approximation (more precisely, with respect to equation (62) with $F = 0$). This has to do with the fact that it is not possible to effectively compute $c$ in condition 5 (in particular, it is not possible to effectively determine an ergodic distribution for the process $y(t)$).

THEOREM 20. Let
\[ Q(x) = (Ax,x) - \sum_{k=1}^{r}(B_k x,x)^2 + \frac12\sum_{k=1}^{r}|B_k x|^2, \qquad c_1 = \sup_{|x|=1}Q(x), \qquad c_2 = \inf_{|x|=1}Q(x). \]
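Then there exists an $\varepsilon_0 > 0$ such that under condition 2 with $\varepsilon < \varepsilon_0$ the point 0 is stable for the solution of (70) when $c_1 < 0$, and it is unstable when $c_2 > 0$.

PROOF. We have that
\[
\begin{aligned}
d|x_t|^p = {}&\Bigl[p|x_t|^{p-2}(a(x_t),x_t) + \frac12 p(p-2)|x_t|^{p-4}\sum_{k=1}^{r}(b_k(x_t),x_t)^2 + \frac12 p|x_t|^{p-2}\sum_{k=1}^{r}|b_k(x_t)|^2\Bigr]dt \\
&+ p|x_t|^{p-2}\sum_{k=1}^{r}(b_k(x_t),x_t)\,dw_k(t).
\end{aligned}
\]
Since
\[
\begin{aligned}
&\frac{(a(x),x)}{|x|^2} + \Bigl(\frac p2 - 1\Bigr)\sum_{k=1}^{r}\frac{(b_k(x),x)^2}{|x|^4} + \frac12\sum_{k=1}^{r}\frac{(b_k(x),b_k(x))}{|x|^2} \\
&\quad= Q\Bigl(\frac{x}{|x|}\Bigr) + \frac{(a_0(x),x)}{|x|^2} - \sum_{k=1}^{r}\frac{2(B_k x,x)(b_k^0(x),x) + (b_k^0(x),x)^2}{|x|^4} \\
&\qquad+ \frac12\sum_{k=1}^{r}\frac{2(B_k x,b_k^0(x)) + (b_k^0(x),b_k^0(x))}{|x|^2} + \frac p2\sum_{k=1}^{r}\frac{(b_k(x),x)^2}{|x|^4},
\end{aligned}
\]
we see by using condition 2 that for some $l_1$
\[ \Bigl|\frac{(a(x),x)}{|x|^2} + \Bigl(\frac p2 - 1\Bigr)\sum_{k=1}^{r}\frac{(b_k(x),x)^2}{|x|^4} + \frac12\sum_{k=1}^{r}\frac{(b_k(x),b_k(x))}{|x|^2} - Q\Bigl(\frac{x}{|x|}\Bigr)\Bigr| \le l_1(\varepsilon + p + |x| + |x|^2). \]
Now choose $\varepsilon_0 > 0$, $p > 0$, and $\delta > 0$ such that $l_1(\varepsilon_0 + p + \delta + \delta^2) < -c_1$ if $c_1 < 0$, and $l_1(\varepsilon_0 + p + \delta + \delta^2) < c_2$ for $c_2 > 0$. Let $\tau_\delta$ be the first exit time of the process $x_t$ from the ball of radius $\delta$ about 0. Then for $t < \tau_\delta$ and $\varepsilon < \varepsilon_0$
\[ d|x_t|^p = \bigl[pQ(x_t/|x_t|)\,|x_t|^p + p\,l_1(\varepsilon + p + \delta + \delta^2)\,\theta(t,x_t)\,|x_t|^p\bigr]dt + p|x_t|^{p-2}\sum_{k=1}^{r}(b_k(x_t),x_t)\,dw_k(t), \]
where $|\theta(t,x)| \le 1$. Hence,
\[
\begin{aligned}
E_x|x_{t\wedge\tau_\delta}|^p - |x|^p &\le E_x\int_0^{t\wedge\tau_\delta}p\Bigl[Q\Bigl(\frac{x_s}{|x_s|}\Bigr) + l_1(\varepsilon + p + \delta + \delta^2)\Bigr]|x_s|^p\,ds \\
&\le p\bigl(c_1 + l_1(\varepsilon + p + \delta + \delta^2)\bigr)\,E_x\int_0^{t\wedge\tau_\delta}|x_s|^p\,ds.
\end{aligned}
\]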
3. STABILITY OF SOLUTIONS 259 Letting P(CI + ll(e + p + J + J2)) = 10 < 0, we see that IXtl\'reSIP is an 1 0 - supermartingale, and hence lim I x I Pelo(tI\'reS) = I x I Pelo(tI\'reS) tl\'reS 'reS' t'reS Elx'ru I P e1o(tI\'reS) < IxIP. This implies that x(t) is stable (what is more, asymptotically stable). We establish similarly that, for C2 > 0, eo, p > 0, and J > 0 such that 10 = P(C2 -11(eO + p + J + J2)) > 0, Ix(t 1\ 'l'J)I- P exp{lo(t 1\ 'l'J)} is a semimartingale. Therefore, Px{ 'l'J < oo} = 1 for all x with Ixl < J. 0 3.2. Diffusion equations with homogeneous coefficients. We consider equations of the form (70) under the following assumptions about the coefficients a(x) and b 1 (x),..., b,(x): 1) There exist a > 0 and a 1 > 0, . . . , a, > 0 such that a(Ax) = AQa(xIA), bk(Ax) = AQkbk(x/A) for all A > O. 2) sup I  I ( Ia(x) - a(y)1 + t Ibk(x) - bk(y)l ) < 00. Ixl=I,lyl=1 X Y k=1 3) inf ( la(x)1 + t I bk(X)I ) > O. Ixl=1 k =1 It follows from 1) that the point x = 0 is stationary for equation (70), and from 3) that this is a unique point. We are interested in conditions for its stability and instability. Let y(t) = x(t)/lx(t)l. This is a process on the unit sphere satisfying the equation dy(t) = {lx(t)la-la(y(t)) +  Ix(t)1 2 a k -2[3(b k (y(t)),y(t))y(t) -2(y(t), b k (y(t)))b k (y(t)) - (bk(y(t)), bk(y(t)))]} dt , + L Ix(t)I Q k- 1 bk(y(t)) dWk(t). k=1 (71 ) 
This equation is obtained with the help of the Itô formula and relations of the form $a(x(t)) = |x(t)|^{\alpha}a(y(t))$. We can also write an equation for $|x(t)|$ with coefficients depending on $y(t)$:
\[
\begin{aligned}
d|x(t)| = {}&\Bigl\{|x(t)|^{\alpha}(a(y(t)),y(t)) + \frac12\sum_{k=1}^{r}|x(t)|^{-1+2\alpha_k}\bigl(|b_k(y(t))|^2 - (b_k(y(t)),y(t))^2\bigr)\Bigr\}dt \\
&+ \sum_{k=1}^{r}|x(t)|^{\alpha_k}(b_k(y(t)),y(t))\,dw_k(t).
\end{aligned}
\tag{72}
\]
Obviously, the principal role in the study of the behavior of the process $x(t)$ in a neighborhood of 0 must be played by the terms containing $|x|$ to the smallest powers (the terms containing $dt$ and the terms containing stochastic differentials must be treated separately). Therefore, we first consider the case when $\alpha_1 = \dots = \alpha_r = \beta$. Then (72) can be rewritten as follows:
\[ d|x(t)| = \bigl\{|x(t)|^{\alpha}c_1(y(t)) + |x(t)|^{2\beta-1}c_2(y(t))\bigr\}dt + |x(t)|^{\beta}c_3(y(t))\,dw(t), \tag{73} \]
where $w(t)$ is a one-dimensional Wiener process, and the $c_k(y)$, $k = 1,2,3$, are determined from (72). We first present deeper conditions for stability and instability for a solution of (70), conditions that take into account only upper and lower estimates for $c_1(y)$, $c_2(y)$, and $c_3(y)$.

THEOREM 21. Suppose that conditions 1)-3) hold and $\alpha_1 = \dots = \alpha_r = \beta$. Let
\[ a_1(y) = (a(y),y) + \frac12\sum_{k=1}^{r}|b_k(y)|^2 - \sum_{k=1}^{r}(b_k(y),y)^2, \qquad a_2(y) = a_1(y) - (a(y),y). \]
The solution of (70) is stable if one of the following conditions holds:

1) $\sup_{|y|=1}(a(y),y) < 0$ for $\alpha < 2\beta - 1$;

2) $\sup_{|y|=1}a_1(y) < 0$ for $\alpha = 2\beta - 1$;

3) $\sup_{|y|=1}a_2(y) < 0$ for $\alpha > 2\beta - 1$.
3. STABILITY OF SOLUTIONS 261 It is unstable if one of the following conditions holds: inf (a(y),y) > 0; Iyl=l inf al (y) > 0; lyl=1 inf a2(y) > o. Iyl=l PROOF. We find from (73) that for A > 0 dlx(t) IA = Alx(t)IA-l {Ix(t) la CI (y(t)) + Ix(t) 1 2P - 1 C2 (y(t))} dt A ( A - 1 ) + 2 Ix(t)l'1.- 2 Ix(t)1 2P d(y(t)) dt + Alx(t) IA-1Ix(t) I P C3(y(t)) dw (t). 1') a < 2p - 1, 2') a=2p-l, 3')a>2p-l, Let UJ = {x: Ixl < c5}, and let c5 be the first exit time from the neighbor- hood UJ. For Ixl < c5 (tAf tS Ix(t 1\ 'r6)1J. = Ix(O)IJ. +). 10 Ix(SW- 1 - aA (2 P -l) g(lx(s)l,y(s)) ds (tAf tS + ). 10 Ix(s) IJ.+P-l C3 (y(s)) dw(s), where A(a(y),y) + AlxI 2P - 1 - a a2(y) for a < 2p - 1, Aal(y) fora=2p-l, g(x,y) = ).a2(Y) + ).lxl a + I - 2p (a(y),y) + !A2Ixl(2P-l-a)VOcj(y) for a > 2p - 1. Therefore, there always exist c5 > 0 and A > 0 such that g(x,y) < 0 for Ixl < c5 and Iyl = 1, provided that one of conditions 1)-3) holds. There- fore, Ix(t 1\ J)IA is a supermartingale, and hence Ixl A is a superharmonic function on UJ that satisfies the conditions of Theorem 8, and the point 0 is stable. As in the proof of stability, we establish that the function lxi-A is also superharmonic for sufficiently small A > O. Theorem 9 can then be used. REMARK. Let al = ... = a q < a q +l < ... < a" where q < r. Let p = al = ... = a q , and let al (y) and a2(y) be computed in the same way as al (y) and a2(y) with r replaced by q in the formulas for comput- ing the latter. Then the assertions of Theorem 21 remain valid if al(y) and a2(y) are replaced by al (y) and a2(Y), respectively, in the formula- tions. This follows from the fact that under conditions 1 )-3) the function Ixl A is superharmonic for sufficiently small A > 0 in the neighborhood UJ 
262 III. STABILITY. LINEAR SYSTEMS for sufficiently small J > 0, but if conditions 1')-3') hold, then lxi-A. is superharmonic. For illustration we consider the case when condition 2) holds: dlx(t)1 1 = [).a 1 (y(t)) +  ).2 t.(bk(y(t)), y(t))2 A. ' ] + 2 kllx(t)12<>k-2P(lbk(y(t))12 - (2 - )')(b k (y(t)),y(t))2) , X Ix(t)IA.+a-l dt + A. L Ix(t)IA.+a k -l(b k (y(t)),y(t)) dWk(t). k=1 For IxlA. to be a superharmonic function on UJ it suffices to choose A. > 0 and 0 < J < 1 such that 1 ' J2a -2p , sup al(y) + 2 ). L sup (b k (y),y)2 + ; L Ib k (y)1 2 > O. lyl=1 k=1 lyl=1 k=l This is possible, because sUPIYI=1 al(y) < 0 by assumption. It is possible to use the ergodic properties of the process y(t) for a more thorough study of stability and instability conditions. To do this it is necessary to make a random time change in equation (70) with coefticients satisfying conditions 1)-3). As before, we assume that P = al = ... = a q < aq+l < ... < a" q < r. Let y = a A (2P - 1), and let i be determined by t = 1'C 1 Ix(sW- 1 ds (1't = t for y = 1). Then the process x(t) = x(1't) satisfies , dx(t) = a(v(t))lx(t)la+I- Y dt + L Ix(t)la k +(I-y)/2b k (v(t)) dWk(Y), (74) k=1 where y(t) = x(t)/lx(t)1 = Y(t). The process y(t) satisfies the stochastic equation dy(t) = {IX(tW- Y a (y(t)) dt + E Ix(t)1 2 <>k- 1 -Y[3(b k (y(t)),y(t))2 y (t) -2(y(t), b k (y(t)))b k (y(t)) - (bk(y(t)), bk(y(t)))]} dt r + L Ix(t)la k -(1+Y)/2b k (V(t)) dWk(t). k=l (75) 
3. STABILITY OF SOLUTIONS 263 The variable Ix(t)1 appears in (75) only with nonnegative exponents, and there are terms in which Ix(t)1 appears with a zero exponent. These are the principal terms in a neighborhood of x = O. Depending on the relations between a and p, the equation containing only the principal terms has the form q dy(t) = a(y(t)) dt + L bk(y(t)) dWk(t), k=1 (76) where a(y) = a(y) and bk(y) = 0, k = 1,..., q, for a < p; q a(y) = a(y) + L[3(b k (y),y)2 y - 2(y, bk(y))bk(y) -lb k (y)1 2 ], k=1 - bk(y) = bk(y), k = 1,...,q, for a = p; and q a(y) = L[3(b k (y),y)2 y - 2(y, bk(y))bk(y) - Ib k (y)1 2 ], k=1 - bk(y) = bk(y), k = 1,...,q, fora> p. LEMMA 12. The point 0 is stable (unstable) for the solution of(74) if and only if it is stable (unstable) for the solution of (70) under the assumption that the coefficients a(x) and bk(x) satisfy conditions 1)-3). The proof follows from the fact that the superharmonic functions for the solutions of these equations coincide, and Theorems 8 and 9 give necessary and sufficient conditions for stability (instability). We now consider an equation of the form (74) under the assumption that all the terms are principal. This equation can be written in the form r dx(t) = Ix(t)la(x(t)) dt + Ix(t)1 L Ok(X(t)) dWk(t) (77) k=l (here it can turn out that either a = 0 or some Ok = 0, and the func- tions a and Ok are homogeneous of degree zero, i.e., they depend only on 
264 III. STABILITY. LINEAR SYSTEMS x(t)/lx(t)1 = y(t)). The process y(t) satisfies the equation dy(t) = {a(y(t)) + [3(bk(y(t)),y(t))2y(t) - 2(y(t), bk(y(t))) xbk(y(t)) - Ib k (y(t)) 12y(t)] } dt r + L bk(y(t)) dWk(t), k=1 (78) and Ix(t)1 2 satisfies dlx(t)1 2 = Ix(t)1 2 (2(a(y(t)),y(t)) + E 1 b k (y(t))1 2 ) dt r + 2Ix(t)1 2 L(bk(y(t)), y(t)) dw(t). (79) k=1 This implies the following representation for Ix(t)1 2 : { {t [ 1 r _ Ix(t)1 2 = Ix(0)1 2 exp 2 10 (a(y(s)),y(s)) + 2 tr Ib k (y(s))1 2 (80) - (bk(Y(:S))'Y(S))2] ds + (bk(Y(S))'Y(S)) dWk(S) } . THEOREM 22. Consider an equation of the form (74), where y = a 1\ (2a 1 - 1) 1\ . . . 1\ (2a r - 1). Let a(x) = a(x)I{y=o:}, bk(x) = b k (x)I{y=2o: k -l}, k = 1,..., r, and let x(t) and y(t) be solutions of(77) and (78). Assume the following for the process y(t) : 1) there exists a unique ergodic distribution p(dy); and 2) the coefficients a(y) and bk(y) are twice continuously differentiable on the unit sphere. Let c = f [(a(y),y) +   Ib k (y)1 2 - (bk(y),y)2] p(dy). 1) If c < 0, then the point 0 is stable for the process x(t) that solves (74). 2) If c > 0, then 0 is unstable for x(t). 
3. STABILITY OF SOLUTIONS 265 PROOF. We start by proving the first assertion. It follows from the representation (80) that Ix(t)1 2 = Ix(Q)1 2 exp {t (c + [  I t IfI(P(S)) ds - f lfI(y)p(dy) -  lt IfIk(P(S))dWk(S)])}, where 'I/(y) and '1/1 (y), . . . , 'I/,(y) are continuous functions on the unit sphere. By the ergodic theorem, . 1 lo t f 11m - 'l/CV(s)) ds = 'I/(y)p(dy) too t 0 with probability 1. Moreover, with probability 1 lim !. f 'l/kCV(S)) dWk(s) = 0, too t k = 1,...,r (this was established in the proof of Theorem 14). Thus, for c < 0 P { lim Ix(t)1 = O } = 1, too and for c > 0 p { lim Ix(t)I- 1 = O } = 1. too From this, as in the proof of Theorem 16, we establish that in the case c < 0 there exists a p > 0 such that for some q > 0 Exlx(t)IP < IxlPe- qt jq, and for c > 0 there exists a p > 0 such that Exlx(t)I-P < IxlPe- qt jq for some q > O. Let c < 0 and p > 0 be chosen as indicated above. Define v(x) = Ex I t Ix(s)IP ds. Then 1 o < v(x) < 2"lxIP, q lim Exv(x(t)) - v(x) = -lxjP + Exlx(t)IP < -lxlP + Ixl P e- qto . too t q 
266 III. STABILITY. LINEAR SYSTEMS It follows from the representation (80) that Elx(t)lm is locally bounded with respect to t for all mER. Using the representation v(x) = Ixl P l to Ex exp {p I t IfI(Y(S)) ds + E I t Plflk(Y(S)) dWk(s) } dt, where the functions VI and VII,..., VI, are twice continuously differentiable on the unit sphere, we see that v(x) is twice continuously differentiable for x # 0, and its derivatives satisfy the inequalities Iv'(x)1 < c(to)lxI P - 1 , Iv"(x)1 < c(to)lxI P - 2 , c(to) a constant. Choose to so that 1 - e- qto /q > 1/2. We show that v(x) is superharmonic in the neighborhood U J for some J > O. Indeed, let , L tp(x) = Ixl(a(x), tp'(x)) +  L IxI 2 (tp"(X)b k (x), bk(x)), k=1 L(x) = Ixlo:+ 1 - y ( a (  ) '(X) ) + .!. t IxI2ak-t-y Ixl ' 2 k=1 X (tpll(X)b k C;I ) ,b k C;I )) · Then L(x) = L(x) + IxIPI(x), where , Ltp(x) = Ixl(a(x), tp'(x)) +  L IxI 2 (tp"(X)b k (x), bk(x)), k=1 ,..,.. the functions a and bk(x), k = 1,..., r, are continuous and locally bounded, and p is the smallest positive number among the numbers a, :-y, 2al - 2y, 2a2 - 2y, 2a, - 2y; furthermore, a = 0 if a - y = 0 and b k = 0 if ak = (y + 1) /2. Since L is the generating operator for the process x(t), it follows that - . 1 1 Lv(x) = 11m -(Exv(Xt) - v(x)) < - 2 1x1P. tO t Therefore, Lv(x) < -  Ixl P + Ixlt+P(a(x), v' (x)) , 1 ,..,..,..,.. + 2 1x12+P L(v"(x)bk(x), bk(x)). k=l 
3. STABILITY OF SOLUTIONS 267 ,.." There exists a constant Cl depending on c(to) and sUPlxll,k Ibk(x)1 such that 1(7i(x), v' (x))1 < i IxI P - 1 , r L(v"(x)bk(x),bk(x)) < ci!xI P - 2 k=1 for Ix! < 1. Hence, Lv(x) < -!lxIP + ctlxl P + P < -!lxIP(1 - 2ctlxI P ). Therefore, if J > 0 is such that 2CIJP < 1/2, then Lv(x) < -!lxIP for x E U. But L is the generating operator for the process x(t), and hence v(x) is a superharmonic function for x(t) in the neighborhood U J . The stability of 0 follows from Theorem 8. Let c > O. We consider the function (to Z(z) = Ex 10 Ix(t)I- P dt, where p > 0 is such that Exlx(t)I-P < e- qt Iq, q > O. Then for sufficiently large to the function Z(x) is superharmonic for the process x(t). What is more, L Z(x) < Ixl- P 12. Again using the estimates IZ'(x)1 < C2!X!-p-l, IZ"(X)I < ct!xl- p - 2 , we see that Z(x) is superharmonic for x(t) in the neighborhood U J for sufficiently small J. The instability of 0 for x(t) follows from Theorem 9. REMARK. The theorem remains valid if x(t) is as before, L is the gener- ating operator for x(t), and the generating operator of the diffusion process x(t) has the form Lip = L ip + p(x) [(7i(X), 1p'(x))lxl + 12 E(lpll(X)bk(X), bk(X))] , ,.." where the functions a and b k are bounded and continuous, and p(x) --+ 0 as x --+ O. The proof is again based on the fact that v(x) (Z(x)) is a superharmonic function for x(t). So far we have considered only diffusion processes. The fact of the matter is that a random time change in processes with a Poisson mea- 
268 III. STABILITY. LINEAR SYSTEMS sure changes the form of the original stochastic equation. Therefore, we dwell only on such equations with stochastic integrals with respect to a Poisson measure when the principal terms have degree 1 from the start. Accordingly, suppose that x(t) satisfies the stochastic differential equation _ _ ( _ ( x(t) ) r - ( X(t) ) ) dx(t) = Ix(t)1 a Ix(t)1 + £; b k Ix(t)1 dt f - ( X(t) ) + f (), Ix(t)1 Jl(d(} x dt), (81 ) p,(dO x dt) is a.centered Poisson measure, Ep,2(dO x dt) = m(dO) dt, the functions a(y) and bk(y) are continuous functions on the unit sphere, sup If(O,y)1 < 1, 8,lyl= 1 and lim f If(O,Yl) - f(O,Y2)1 2 m(dO) = O. YIY Then y(t) = x(t)/lx(t)1 satisfies the equation dy(t) = {a(y(t)) + E[3(b k (y(t)),y(t))2 y (t) - 2(y(t), bk(y(t)))bk(y(t)) -lb k (y(t))1 2 Y(t)]} dt r + L bkCp(t)) dWk(t) k=1 f ( y(t) + f(O,y(t)) - - ) + Iy(t) + f((},y(t))1 - y(t) - f((},y(t)) m(d(}) dt f ( y(t) + f(O,y(t)) - ) + Iy(t) + f((},y(t))1 - y(t) Jl(d(} x dt). (82) 
3. STABILITY OF SOLUTIONS 269 Finally, Ix(t)1 2 can be expressed in terms of ly(t)1 by the formula Ix(t)1 2 = Ix(0)1 2 exp { r 2(a(y(s)),y(s)) + t Ib k (y(s))1 2 10 k=1 + f[ln(l + 2(y(s),j(O,y(s))) + Ij(O,y(s))1 2 ) - 2(y(s), f( 0, y(s)) )]ds r r (t - 2 L(b k (y(s)),y(S))2 + 2 L 10 (b k (y(s)), y(s)) dWk(S) k= 1 k= 1 + I t f In(l + 2(y(s),j(O,y(s))) + Ij(O,y(s))1 2 ).u(dO x dS)} · (83) THEOREM 23. Assume the following conditions hold: 1) The functions a(y) and b k (y) are twice continuously differentiable on the unit sphere, and J( 0, x) is twice continuously differentiable with respect to x as an element of the space L2(m). 2) The Markov process y(t) that is the solution of (82) has a unique ergodic distribution p(dy). 3) If c = f [2(a(y),y) + E(lb k (Y)1 2 - 2(b k (y),y)2) + f(ln(l + 2(y,](O,y))) +1](O,Y)1 2 ) - 2(Y,](O,y))m(dO)] p(dy), then c # O. Consider the solution x(t) of equation (60) with coefficients satisfying the following conditions: 4) The solution is weakly unique. 5) There exists a function p(x), p(x) --+ 0 as x --+ 0, such that a(x) - Ixla CI ) + t bk(x) - Ixlb k CI ) k=1 [ ] 1/2 + f Ij(O,x) -lxl/(O,x)1 2 m(dO) < p(x)lxl. Then the point 0 is stable for x(t) if c < 0, and unstable if c > O. PROOF. The proof is obtained by a simple modification of the proofs of Theorems 22 and 19. Let c < O. Then it follows from (83) that Px{x(t) --+ 
0} = 1 for all $x$. From this and Theorem 16 we can establish that $E|\tilde x(t)|^p \le q^{-1}\exp\{-qt\}$ for some $p > 0$ and $q > 0$. Therefore, for sufficiently large $t_0$ the function
\[ V(x) = E_x\int_0^{t_0}|\tilde x(t)|^p\,dt = |x|^p\int_0^{t_0}E_x\exp\Bigl\{\frac p2\,\Phi_t(\tilde y(\cdot))\Bigr\}\,dt, \]
where $\Phi_t(\tilde y(\cdot))$ is the expression in the exponent on the right-hand side of (83), is superharmonic for $\tilde x(t)$. Moreover, it has derivatives up to second order satisfying the same estimates as in Theorems 19 and 22. This and condition 5) imply that $V(x)$ is a superharmonic function for $x(t)$ in the neighborhood $U_\delta$ if $\delta > 0$ is sufficiently small. The case $c > 0$ is treated analogously. $\square$
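The explicit threshold of Example 3 in this section lends itself to a quick numerical check. The sketch below evaluates the constant $c = a - b^2/2 + m\ln(1+q)$ of that example and the critical jump coefficient $q^{*} = \exp\{(b^2-2a)/2m\} - 1$ at which $c$ changes sign. The function names `growth_exponent` and `critical_jump_size` are hypothetical, and the restrictions $q > -1$ and $m > 0$ are assumptions made here so that the logarithm and the division are defined.

```python
import math


def growth_exponent(a, b, q, m):
    """Return c = a - b^2/2 + m*ln(1 + q) from Example 3.

    Assumptions: q > -1 (so that ln(1 + q) is defined) and m > 0.
    """
    if q <= -1.0:
        raise ValueError("q must exceed -1 so that ln(1 + q) is defined")
    if m <= 0.0:
        raise ValueError("the Poisson intensity m must be positive")
    return a - 0.5 * b * b + m * math.log1p(q)


def critical_jump_size(a, b, m):
    """Return q* = exp((b^2 - 2a) / (2m)) - 1, the value of q at which c = 0."""
    if m <= 0.0:
        raise ValueError("the Poisson intensity m must be positive")
    return math.expm1((b * b - 2.0 * a) / (2.0 * m))
```

For instance, with the illustrative values $a = 0.5$, $b = 2$, $m = 1.5$, the call `critical_jump_size(0.5, 2.0, 1.5)` returns $q^{*}$, and `growth_exponent` is negative for $q$ slightly below $q^{*}$ (the stable regime of the example) and positive slightly above it.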
CHAPTER IV

Linear Stochastic Equations in Hilbert Space. Stochastic Semigroups. Stability

§1. Linear equations with bounded coefficients

We extend the results of §2 in Chapter III to equations in Hilbert space.

1.1. General equations in Hilbert space. Let $X$ be a separable Hilbert space with inner product $(x,y)$ and norm $|x|$. Suppose that $a(t,x)$ and $b_k(t,x)$, $k = 1,2,\dots$, are functions from $R_+\times X$ to $X$, $f_k(t,x,\theta)$, $k = 1,2$, are functions from $R_+\times X\times\Theta$ to $X$, where $(\Theta,\mathfrak B)$ is a measurable space, $(w_k(t),\ k = 1,2,\dots)$ is a countable collection of independent Wiener processes, and $\nu_k(d\theta\times dt)$, $k = 1,2$, are Poisson measures with independent values on the measurable space $(\Theta\times R_+,\ \mathfrak B\times\mathfrak B_{R_+})$ such that $E\nu_k(d\theta\times dt) = m_k(d\theta)\,dt$, where $m_2(d\theta)$ is a finite measure on $\mathfrak B$, and $m_1(d\theta)$ is a $\sigma$-finite measure on $\mathfrak B$. We consider the stochastic equation
\[
\begin{aligned}
dx(t) = {}&a(t,x(t))\,dt + \sum_{k=1}^{\infty}b_k(t,x(t))\,dw_k(t) \\
&+ \int f_1(t,x(t),\theta)\,\mu_1(d\theta\times dt) + \int f_2(t,x(t),\theta)\,\nu_2(d\theta\times dt),
\end{aligned}
\tag{1}
\]
where $x(t)$ is an unknown $X$-valued random function, and $\mu_1(d\theta\times dt) = \nu_1(d\theta\times dt) - m_1(d\theta)\,dt$. Equation (1) can be solved for a given initial value $x(0)$ independent of $\{w_k,\ k = 1,2,\dots\}$, $\nu_1$, and $\nu_2$. A solution of the equation is understood to be a random function $x(t)$ such that: 1) if $\mathcal F_t$ is the smallest $\sigma$-algebra with respect to which $x(s)$, $w_k(s)$, $k = 1,2,\dots$, and $\nu_i(d\theta\times ds)$, $i = 1,2$, $s \le t$, are measurable, then the collection $\{w_k(t+h) - w_k(t),\ \nu_i(C\times[t,t+h]);\ h > 0,\ k = 1,2,\dots,\ i = 1,2;\ C \in \mathfrak B\}$ of variables is independent of $\mathcal F_t$ (in other words, the Wiener processes and the Poisson
measures $\nu_i$ are adapted to the flow $\mathcal F_t$); 2) the stochastic integrals of the differentials on the right-hand side of (1) exist; 3) the series of stochastic integrals with respect to $dw_k$ converges in the sense of convergence in probability; and 4) $x(t) - x(0)$ coincides with the sum of the stochastic integrals of the right-hand side of (1) on the interval $[0,t]$ for all $t \in R_+$.

We present the following lemma in order to clarify what conditions are needed for convergence of the series of stochastic integrals.

LEMMA 1. Let $b_k \in X$, and let $\xi_k$ be independent Gaussian variables with mean zero and variance one. Then $\sum\xi_k b_k$ converges in probability if and only if $\sum|b_k|^2 < \infty$.

PROOF. Let $\eta$ be a Gaussian random variable in $X$ with correlation operator $B$. Then (see Gikhman and Skorokhod [1], Vol. 1, Russian p. 417, English p. 351),
\[ E e^{-|\eta|^2} = \prod_{k=1}^{\infty}(1 + 2\beta_k)^{-1/2}, \]
where the $\beta_k$ are eigenvalues of the operator $B$,
\[ \prod_{k=1}^{\infty}(1 + 2\beta_k) = \bigl(E e^{-|\eta|^2}\bigr)^{-2}, \]
\[ \operatorname{tr}B \le \frac12\Bigl[\bigl(E e^{-|\eta|^2}\bigr)^{-2} - 1\Bigr] \le \frac12\Bigl(\frac{e^{2\varepsilon}}{\bigl(P\{|\eta|^2 \le \varepsilon\}\bigr)^{2}} - 1\Bigr). \]
Since $\operatorname{tr}B = E|\eta|^2$, it follows that
\[ E|\eta|^2 \le \frac12\Bigl(\frac{e^{2\varepsilon}}{\bigl(P\{|\eta|^2 \le \varepsilon\}\bigr)^{2}} - 1\Bigr). \]
This implies that a sequence $\eta_n$ of Gaussian variables in $X$ converges in probability if and only if $E|\eta_n - \eta_m|^2 \to 0$. But for $n < m$
\[ E\Bigl|\sum_{k=n}^{m}\xi_k b_k\Bigr|^2 = E\sum_{k,i=n}^{m}\xi_k\xi_i(b_k,b_i) = \sum_{k=n}^{m}|b_k|^2, \]
and the series converges under the condition of the lemma if and only if the last expression tends to zero as $n, m \to \infty$. $\square$

We now present conditions for the existence and uniqueness of a solution of (1), conditions that amount to a natural generalization of the "classical" conditions for the finite-dimensional case.
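The criterion of Lemma 1 can be illustrated by evaluating the block sums $\sum_{k=n}^{m}|b_k|^2$, which by the proof equal $E|\sum_{k=n}^{m}\xi_k b_k|^2$. The sketch below is only an illustration under assumptions made here: the vectors are taken to be $b_k = e_k/k$ or $b_k = e_k/\sqrt{k}$ for an orthonormal sequence $e_k$, so that only the squared norms $|b_k|^2$ are needed, and the function name `tail_energy` is hypothetical.

```python
def tail_energy(sq_norm, n, m):
    """Return sum_{k=n}^{m} |b_k|^2, where sq_norm(k) gives |b_k|^2.

    By the computation in the proof of Lemma 1 this equals
    E|sum_{k=n}^{m} xi_k b_k|^2 for independent standard Gaussian xi_k.
    """
    return sum(sq_norm(k) for k in range(n, m + 1))


# Assumed example 1: |b_k|^2 = 1/k^2, a summable sequence.
summable_block = tail_energy(lambda k: 1.0 / k**2, 1000, 2000)

# Assumed example 2: |b_k|^2 = 1/k, a non-summable sequence.
divergent_block = tail_energy(lambda k: 1.0 / k, 1000, 2000)
```

In the first case the block between $n = 1000$ and $m = 2000$ is already small and tends to zero as $n, m \to \infty$, so the series converges in probability; in the second case the block stays close to $\ln 2$ for $m = 2n$, so the Cauchy condition fails.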
THEOREM 1. Suppose that the coefficients $a(t,x)$, $b_k(t,x)$, and $f_i(t,x,\theta)$ satisfy the following conditions:

1) They are jointly measurable.

2) For all $t \in R_+$ there exists a $k_t$ such that for $s \le t$
\[ |a(s,x)|^2 + \sum_{k=1}^{\infty}|b_k(s,x)|^2 + \int|f_1(s,x,\theta)|^2\,m_1(d\theta) \le k_t(1 + |x|^2). \]

3) For all $t \in R_+$ there exists an $l_t$ such that for $s \le t$
\[ |a(s,x) - a(s,y)|^2 + \sum_{k=1}^{\infty}|b_k(s,x) - b_k(s,y)|^2 + \int|f_1(s,x,\theta) - f_1(s,y,\theta)|^2\,m_1(d\theta) \le l_t|x - y|^2. \]

Then equation (1) has a unique solution satisfying the initial condition $x(0) = x_0$, where $x_0$ is independent of $\{w_k,\ \nu_i,\ k = 1,2,\dots,\ i = 1,2\}$. This solution can be chosen not to have discontinuities of the second kind and to be right-continuous. If $f_2 = 0$ and $E|x_0|^2 < \infty$, then $E|x(t)|^2$ is a continuous function of $t$.

PROOF. Let us first consider the case when $f_2 = 0$. We prove uniqueness. If $x_1(t)$ and $x_2(t)$ are two solutions of (1) without discontinuities of the second kind and $\tau_N = \inf[t:\ |x_1(t)| + |x_2(t)| > N]$, then
\[
\begin{aligned}
x_1(t) - x_2(t) = {}&\int_0^t[a(s,x_1(s)) - a(s,x_2(s))]\,ds + \sum_{k=1}^{\infty}\int_0^t[b_k(s,x_1(s)) - b_k(s,x_2(s))]\,dw_k(s) \\
&+ \int_0^t\!\!\int[f_1(s,x_1(s),\theta) - f_1(s,x_2(s),\theta)]\,\mu_1(d\theta\times ds).
\end{aligned}
\]
Using condition 3), we find that
\[
\begin{aligned}
E|x_1(t) - x_2(t)|^2 I_{\{\tau_N > t\}}
\le{}& 3E\Bigl|\int_0^{t\wedge\tau_N}[a(s,x_1(s)) - a(s,x_2(s))]\,ds\Bigr|^2 \\
&+ 3E\Bigl|\sum_{k=1}^{\infty}\int_0^{t\wedge\tau_N}[b_k(s,x_1(s)) - b_k(s,x_2(s))]\,dw_k(s)\Bigr|^2 \\
&+ 3E\Bigl|\int_0^{t\wedge\tau_N}\!\!\int[f_1(s,x_1(s),\theta) - f_1(s,x_2(s),\theta)]\,\mu_1(d\theta\times ds)\Bigr|^2 \\
\le{}& (3t+3)\,E\int_0^{t\wedge\tau_N}\Bigl[|a(s,x_1(s)) - a(s,x_2(s))|^2 + \sum_{k=1}^{\infty}|b_k(s,x_1(s)) - b_k(s,x_2(s))|^2 \\
&\qquad+ \int|f_1(s,x_1(s),\theta) - f_1(s,x_2(s),\theta)|^2\,m_1(d\theta)\Bigr]ds \\
\le{}& (3t+3)\,l_t\,E\int_0^{t\wedge\tau_N}|x_1(s) - x_2(s)|^2\,ds \\
={}& (3t+3)\,l_t\int_0^t E|x_1(s) - x_2(s)|^2 I_{\{\tau_N > s\}}\,ds.
\end{aligned}
\]
This implies that $x_1(t) = x_2(t)$ when $t < \tau_N$, for all $N$. Since $\tau_N \to \infty$ as $N \to \infty$, uniqueness (under the assumption that $f_2 = 0$) is established.

The existence of a solution can be proved by the method of successive approximations. Suppose first that $E|x_0|^2 < \infty$. Let $x_0(t) = x_0$, and for $n > 0$ let
\[
\begin{aligned}
x_n(t) = {}&x_0 + \int_0^t a(s,x_{n-1}(s))\,ds + \sum_{k=1}^{\infty}\int_0^t b_k(s,x_{n-1}(s))\,dw_k(s) \\
&+ \int_0^t\!\!\int f_1(s,x_{n-1}(s),\theta)\,\mu_1(d\theta\times ds).
\end{aligned}
\tag{2}
\]
We show that all the $x_n(t)$ are defined. If $x_{n-1}(s)$ is $\mathcal F_s$-adapted, where $\mathcal F_s$ is generated by $x_0$, the increments of $w_k$ on $[0,s]$, and the values of the measure $\nu_1$ on $\Theta\times[0,s]$, and if $E|x_{n-1}(s)|^2$ is a continuous function, then all the stochastic integrals on the right-hand side of (2) are defined (we use
condition 2)). Further,
\[
\begin{aligned}
E\Bigl|\sum_{k=1}^{\infty}\int_0^t b_k(s,x_{n-1}(s))\,dw_k(s)\Bigr|^2
&= \sum_{k=1}^{\infty}\int_0^t E|b_k(s,x_{n-1}(s))|^2\,ds \\
&= \int_0^t E\sum_{k=1}^{\infty}|b_k(s,x_{n-1}(s))|^2\,ds \le k_t\int_0^t E(1 + |x_{n-1}(s)|^2)\,ds,
\end{aligned}
\]
and thus $E|x_n(t)|^2$ is finite. Obviously, $x_n(t)$ is $\mathcal F_t$-adapted. Since $x_0(t)$ is $\mathcal F_t$-adapted and $E|x_0(t)|^2 = E|x_0|^2$, induction gives us that the $x_n(t)$ are defined and $\mathcal F_t$-adapted, and $E|x_n(t)|^2$ is locally bounded. Using condition 3), we establish that for some $\bar k_t$
\[ E|x_{n+1}(t) - x_n(t)|^2 \le \bar k_t\int_0^t E|x_n(s) - x_{n-1}(s)|^2\,ds \le \frac{(\bar k_t t)^n}{n!}. \]
Therefore, the series $x_0 + \sum_{k=1}^{\infty}[x_k(t) - x_{k-1}(t)]$ converges in probability to some process $x(t)$. Note that
\[
\begin{aligned}
\sup_{s\le t}|x_{n+1}(s) - x_n(s)|^2
\le{}& 3t\int_0^t|a(s,x_n(s)) - a(s,x_{n-1}(s))|^2\,ds \\
&+ 3\sup_{s\le t}\Bigl|\sum_{k=1}^{\infty}\int_0^s\bigl(b_k(u,x_n(u)) - b_k(u,x_{n-1}(u))\bigr)\,dw_k(u)\Bigr|^2 \\
&+ 3\sup_{s\le t}\Bigl|\int_0^s\!\!\int[f_1(u,x_n(u),\theta) - f_1(u,x_{n-1}(u),\theta)]\,\mu_1(d\theta\times du)\Bigr|^2.
\end{aligned}
\]
Using the martingale inequalities, we get that
\[
\begin{aligned}
E\sup_{s\le t}|x_{n+1}(s) - x_n(s)|^2
\le{}& 3t\int_0^t E|a(s,x_n(s)) - a(s,x_{n-1}(s))|^2\,ds \\
&+ 12\sum_{k=1}^{\infty}\int_0^t E|b_k(s,x_n(s)) - b_k(s,x_{n-1}(s))|^2\,ds \\
&+ 12\int_0^t\!\!\int E|f_1(s,x_n(s),\theta) - f_1(s,x_{n-1}(s),\theta)|^2\,m_1(d\theta)\,ds \\
\le{}& (12 + 3t)\,l_t\int_0^t E|x_n(s) - x_{n-1}(s)|^2\,ds \le c_t\frac{(\bar k_t t)^n}{n!}.
\end{aligned}
\]
This inequality implies that $x_0 + \sum_{k=1}^{\infty}[x_k(t) - x_{k-1}(t)]$ converges with probability 1 uniformly on each finite interval: for $q < 1$
\[ P\Bigl\{\sup_{s\le t}|x_{n+1}(s) - x_n(s)| > q^n\Bigr\} \le c_t\frac{(\bar k_t t)^n}{n!\,q^{2n}}, \qquad \sum_{n=1}^{\infty}\Bigl(\frac{\bar k_t t}{q^2}\Bigr)^{n}\frac{1}{n!} < \infty. \]
The fact that the process $x(t) = x_0 + \sum_{k=1}^{\infty}(x_k(t) - x_{k-1}(t))$ satisfies (1) with $f_2 = 0$ can be verified by passing to the limit in (2). The fact that $x(t)$ does not have discontinuities of the second kind follows from the fact that the $x_n(t)$ have this property. It need only be verified that the sum of stochastic integrals
\[ \sum_{k=1}^{\infty}\int_0^t b_k(s,x_{n-1}(s))\,dw_k(s) \]
is continuous if $x_{n-1}(s)$ does not have discontinuities of the second kind. Let $\zeta_c = \inf\{t:\ |x_{n-1}(t)| > c\}$. By assumption, $\zeta_c \to \infty$ as $c \to \infty$ in view of the boundedness of $|x_{n-1}(s)|$; therefore, it suffices to prove that the sum of the series
\[ \sum_{k=1}^{\infty}\int_0^t b_k(s,x_{n-1}(s))\,I_{\{s<\zeta_c\}}\,dw_k(s) = \xi_n(t) \]
is continuous in $t$. The quantity $|\xi_n(t)|^2$ is a submartingale with the representation
\[ |\xi_n(t)|^2 = \int_0^t\sum_{k=1}^{\infty}|b_k(s,x_{n-1}(s))|^2 I_{\{s<\zeta_c\}}\,ds + 2\int_0^t\sum_{k=1}^{\infty}(\xi_n(s),b_k(s,x_{n-1}(s)))\,I_{\{s<\zeta_c\}}\,dw_k(s). \]
Since
\[ \sum_{k=1}^{\infty}|b_k(s,x_{n-1}(s))|^2 I_{\{s<\zeta_c\}} \le k_s(1 + |x_{n-1}(s)|^2)\,I_{\{s<\zeta_c\}} \le k_s(1 + c^2), \]
there exists for each $T > 0$ a constant $c_T$ such that $E|\xi_n(t)|^2 \le c_T t$ and
\[ |\xi_n(t)|^2 \le c_T t + 2\int_0^t\sum_{k=1}^{\infty}(\xi_n(s),b_k(s,x_{n-1}(s)))\,I_{\{s<\zeta_c\}}\,dw_k(s). \]
In precisely the same way, for $h > 0$ and $t + h \le T$ we have that
\[ |\xi_n(t+h) - \xi_n(t)|^2 \le c_T h + 2\sum_{k=1}^{\infty}\int_t^{t+h}(\xi_n(s) - \xi_n(t),b_k(s,x_{n-1}(s)))\,I_{\{s<\zeta_c\}}\,dw_k(s). \]
Hence, for $t,\ t + h \le T$
\[
\begin{aligned}
E|\xi_n(t+h) - \xi_n(t)|^4
&\le 2c_T^2h^2 + 8E\int_t^{t+h}\sum_{k=1}^{\infty}(\xi_n(s) - \xi_n(t),b_k(s,x_{n-1}(s)))^2 I_{\{s<\zeta_c\}}\,ds \\
&\le 2c_T^2h^2 + 8\int_t^{t+h}E|\xi_n(s) - \xi_n(t)|^2\,c_T s\,ds \\
&\le 2c_T^2h^2 + 8c_T^2\int_t^{t+h}(s - t)\,s\,ds \le 2c_T^2h^2 + 4Tc_T^2h^2 + o(h^2)
\end{aligned}
\]
(we have used the inequality $E|\xi_n(t+h) - \xi_n(t)|^2 \le c_T h$). The continuity of $\xi_n(t)$ follows from the theorem of Kolmogorov (Gikhman and Skorokhod [1], Vol. 1, Russian p. 235, English p. 191).

To prove the existence of a solution without the assumption that $E|x_0|^2$ exists we consider a sequence of functions $\chi_N(x)$ from $X$ to $X$ such that $\chi_N(x) = x$ for $|x| \le N$ and $\chi_N(x) = z_N$ for $|x| > N$, where $|z_N| > N$. Let
\[
\begin{aligned}
x^N(t) = {}&\chi_N(x_0) + \int_0^t a(s,x^N(s))\,ds + \sum_{k=1}^{\infty}\int_0^t b_k(s,x^N(s))\,dw_k(s) \\
&+ \int_0^t\!\!\int f_1(s,x^N(s),\theta)\,\mu_1(d\theta\times ds).
\end{aligned}
\]
Then
\[ E|x^N(t) - x^{N_1}(t)|^2 I_{\{x^N(0) = x^{N_1}(0)\}} \le c_t\int_0^t E|x^N(s) - x^{N_1}(s)|^2\,ds, \]
and hence $E|x^N(t) - x^{N_1}(t)|^2 I_{\{x^N(0) = x^{N_1}(0)\}} = 0$. Therefore,
\[ P\{x^N(t) \ne x^{N_1}(t)\} \le P\{x^N(0) \ne x^{N_1}(0)\}, \]
and $\lim_{N\to\infty}x^N(t)$ exists. This limit is obviously a solution of (1) with $f_2 = 0$.

We now remark that for any stopping time $\tau$ with respect to some flow $\tilde{\mathcal F}_t$ to which the Wiener processes $w_k(t)$ and the Poisson measures $\nu_i(d\theta\times dt)$ are adapted we can consider the solution of (1) with $f_2 = 0$ on $[\tau,\infty[$ that satisfies an initial condition $x_\tau$ measurable with respect to the $\sigma$-algebra $\tilde{\mathcal F}_\tau$. As for the case $\tau = 0$, it is possible to establish the existence and uniqueness of a solution of the equation. Thus, the theorem is proved for the case $f_2 = 0$.

Suppose that $f_2 \ne 0$. Since the Poisson measure $\nu_2$ is finite on each set $[0,t]\times\Theta$, there exist a sequence $\{\tau_k,\ k = 1,2,\dots\}$ of stopping times and
a sequence $\{\theta_k,\ k=1,2,\dots\}$ of random elements in the measurable space $(\Theta,\mathfrak{B})$ such that $\nu_2$ is concentrated on the sequence $\{(\tau_k,\theta_k),\ k=1,2,\dots\}$ of points. Further, the $\theta_k$ are independent of $\{\tau_i\}$, and $P\{\theta_k\in C\}=m_2(C)/m_2(\Theta)$, $C\in\mathfrak{B}$, while $\tau_k=\eta_1+\dots+\eta_k$, where the $\eta_i$ are independent identically distributed variables with $P\{\eta_i>t\}=\exp\{-tm_2(\Theta)\}$. The integral of the last term in (1) has the form
$$\int_0^t\!\!\int f_2(s,x(s),\theta)\,\nu_2(d\theta\times ds)=\sum_{k}f_2(\tau_k,x(\tau_k-),\theta_k)I_{\{\tau_k\le t\}}.$$
Equation (1) can be rewritten on each interval $[\tau_k,\tau_{k+1}[$, $k=0,1,\dots$, where it is assumed that $\tau_0=0$, in the form
$$x(t)-x(\tau_k)=\int_{\tau_k}^ta(s,x(s))\,ds+\sum_{i=1}^{\infty}\int_{\tau_k}^tb_i(s,x(s))\,dw_i(s)+\int_{\tau_k}^t\!\!\int f_1(s,x(s),\theta)\,\mu_1(d\theta\times ds).\tag{3}$$
By what has been proved, the solution of it is unique; therefore, it follows from the equality
$$x(\tau_{k+1})=x(\tau_{k+1}-)+f_2(\tau_{k+1},x(\tau_{k+1}-),\theta_{k+1})$$
that the solution of (3) is unique on the interval $[\tau_k,\tau_{k+1}]$. This implies that the solution of (1) is unique.

We prove existence. The solution will be constructed successively on the intervals $[\tau_k,\tau_{k+1}]$. Let $x_0(t)$ be the solution of the equation
$$x_0(t)-x_0=\int_0^ta(s,x_0(s))\,ds+\sum_{i=1}^{\infty}\int_0^tb_i(s,x_0(s))\,dw_i(s)+\int_0^t\!\!\int f_1(s,x_0(s),\theta)\,\mu_1(d\theta\times ds),$$
which exists by what was proved, and which does not have discontinuities of the second kind. The variables $(\tau_1,\theta_1)$ are independent of $x_0(t)$; therefore, $x_0(t)$ is continuous at the point $\tau_1$ with probability 1. Let $x(t)=x_0(t)$ for $t<\tau_1$, and let $x(\tau_1)=x(\tau_1-)+f_2(\tau_1,x(\tau_1-),\theta_1)$. Assume that $x(t)$ has already been constructed on $[0,\tau_k]$. Denote by $x_k(t)$ the solution of (3) for all $t>\tau_k$. Then $x_k(t)$ is independent of $(\tau_{k+1},\theta_{k+1})$. Let $x(t)=x_k(t)$ for $t\in[\tau_k,\tau_{k+1}[$, and
$$x(\tau_{k+1})=x_k(\tau_{k+1})+f_2(\tau_{k+1},x_k(\tau_{k+1}),\theta_{k+1});$$
further, $x(\tau_{k+1}-)=x_k(\tau_{k+1})$, by the continuity of $x_k(t)$ at the point $\tau_{k+1}$.
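The piecewise construction above (solve between consecutive jump times, then add the jump) can be sketched numerically. The following is only an illustration under strong simplifying assumptions that are not in the text: a scalar state, no Wiener or compensated Poisson terms, drift $a(s,x)=ax$, and jump term $f_2(s,x,\theta)=cx$. The names `jump_times`, `solve_by_interlacing`, `a`, `c`, and `rate` (standing for $m_2(\Theta)$) are hypothetical.

```python
import math
import random


def jump_times(rate, horizon, rng):
    """Jump times tau_k = eta_1 + ... + eta_k in [0, horizon], eta_i ~ Exp(rate)."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return times
        times.append(t)


def solve_by_interlacing(x0, a, c, taus, horizon):
    """Build x(t) piece by piece: flow between jumps, then add the jump.

    Toy scalar model (an assumption, not the book's setting):
    dx = a*x dt between jumps, and x(tau) = x(tau-) + c*x(tau-) at a jump.
    """
    x, t_prev = x0, 0.0
    for tau in taus:
        x_left = x * math.exp(a * (tau - t_prev))  # x(tau-): value of the continuous piece
        x = x_left + c * x_left                    # x(tau) = x(tau-) + f2(tau, x(tau-), theta)
        t_prev = tau
    return x * math.exp(a * (horizon - t_prev))    # last piece, no jump at the horizon


rng = random.Random(0)
taus = jump_times(rate=3.0, horizon=2.0, rng=rng)
x_T = solve_by_interlacing(x0=1.0, a=-0.4, c=0.25, taus=taus, horizon=2.0)

# For this toy model the pieces can be multiplied out in closed form.
closed_form = math.exp(-0.4 * 2.0) * (1.25 ** len(taus))
print(len(taus), x_T, closed_form)
```

In this toy case the interlaced value agrees with the product formula $x_0e^{aT}(1+c)^{N}$, where $N$ is the number of jumps before $T$; in the general setting of the text each continuous piece would itself be a solution of (3).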
Let $t\in[\tau_l,\tau_{l+1}[$. Then, by construction,
$$x(t)=x(\tau_l)+\int_{\tau_l}^ta(s,x(s))\,ds+\sum_{j=1}^{\infty}\int_{\tau_l}^tb_j(s,x(s))\,dw_j(s)+\int_{\tau_l}^t\!\!\int f_1(s,x(s),\theta)\,\mu_1(d\theta\times ds)$$
$$=x(\tau_{l-1})+\int_{\tau_{l-1}}^{\tau_l}a(s,x(s))\,ds+\sum_{j=1}^{\infty}\int_{\tau_{l-1}}^{\tau_l}b_j(s,x(s))\,dw_j(s)+\int_{\tau_{l-1}}^{\tau_l}\!\!\int f_1(s,x(s),\theta)\,\mu_1(d\theta\times ds)+f_2(\tau_l,x(\tau_l-),\theta_l)+x(t)-x(\tau_l)$$
$$=x(\tau_{l-1})+\int_{\tau_{l-1}}^{t}a(s,x(s))\,ds+\sum_{i=1}^{\infty}\int_{\tau_{l-1}}^{t}b_i(s,x(s))\,dw_i(s)+\int_{\tau_{l-1}}^{t}\!\!\int f_1(s,x(s),\theta)\,\mu_1(d\theta\times ds)+f_2(\tau_l,x(\tau_l-),\theta_l)$$
$$=x_0+\int_0^ta(s,x(s))\,ds+\sum_{j=1}^{\infty}\int_0^tb_j(s,x(s))\,dw_j(s)+\int_0^t\!\!\int f_1(s,x(s),\theta)\,\mu_1(d\theta\times ds)+\sum_{k\le l}f_2(\tau_k,x(\tau_k-),\theta_k).$$
This means that $x(t)$ satisfies the equation. Obviously, $x(t)$ does not have discontinuities of the second kind.

1.2. Linear equations. The general linear equation is obtained from (1) under the assumption that the coefficients depend linearly on $x$. Therefore, the coefficients must be linear operators from $X$ to $X$:
$$a(t,x)=A(t)x,\qquad b_k(t,x)=B_k(t)x,\qquad f_i(t,x,\theta)=F_i(t,\theta)x,$$
where $A(t)$, $B_k(t)$, and $F_i(t,\theta)$ are functions from $R_+$ and $R_+\times\Theta$ to $L(X)$. The equation itself has the form
$$dx(t)=A(t)x(t)\,dt+\sum_{k=1}^{\infty}B_k(t)x(t)\,dw_k(t)+\int F_1(t,\theta)x(t)\,\mu_1(d\theta\times dt)+\int F_2(t,\theta)x(t)\,\nu_2(d\theta\times dt).\tag{4}$$
The functions $A(t)$, $B_k(t)$, and $F_i(t,\theta)$ must be jointly weakly measurable (this will imply strong measurability), and conditions 2) and 3) of Theorem 1 can be combined into one for them: for all $t\in R_+$ there exists an $l_t$
such that
$$|A(s)x|^2+\sum_{k=1}^{\infty}|B_k(s)x|^2+\int|F_1(s,\theta)x|^2\,m_1(d\theta)\le l_t|x|^2.$$
The same condition is equivalent to local boundedness of the operator function
$$Q(s)=A^*(s)A(s)+\sum_{k=1}^{\infty}B_k^*(s)B_k(s)+\int F_1^*(s,\theta)F_1(s,\theta)\,m_1(d\theta).$$
Thus, if the coefficients of equation (4) are measurable and the function $Q(s)$ is locally bounded, then (4) has a unique solution for each initial condition $x_0$ independent of the Wiener processes $w_k$ and the Poisson measures $\nu_i$.

Denote by $\zeta(t,x_0)$ the solution of (4) with the initial value $x_0$. It is easy to see that for any $x_1,x_2\in X$ and $\alpha_1,\alpha_2\in R$
$$P\{\zeta(t,\alpha_1x_1+\alpha_2x_2)=\alpha_1\zeta(t,x_1)+\alpha_2\zeta(t,x_2)\}=1,\tag{5}$$
i.e., the solution depends linearly on the initial condition (it can also be random). We remark that property (5) does not permit us to claim (as in the case of equations in a finite-dimensional space) that $\zeta(t,x)=V_tx$, where $V_t$ is a random operator (a random element of $L(X)$). To see that this is not necessarily so, we consider an example.

EXAMPLE. Let $\{e_k\}$ be an orthonormal basis in $X$, and $P_k$ the operator of projection onto $e_k$. Consider the equation
$$dx(t)=\sum_{k=1}^{\infty}P_kx(t)\,dw_k(t).\tag{6}$$
For this equation
$$Q(s)=\sum_{k=1}^{\infty}P_k^*P_k=\sum_{k=1}^{\infty}P_k=I,$$
where $I$ is the identity operator. Hence, (6) has a unique solution for every initial condition: for all $m$ and $n$
$$d(\zeta(t,e_m),e_n)=\sum_{k=1}^{\infty}(P_k\zeta(t,e_m),e_n)\,dw_k(t)=\sum_{k=1}^{\infty}(\zeta(t,e_m),P_ke_n)\,dw_k(t)=(\zeta(t,e_m),e_n)\,dw_n(t).$$
Since $(\zeta(0,e_m),e_n)=\delta_{m,n}$, it follows that $(\zeta(t,e_m),e_n)=0$ for $n\ne m$, and
$$(\zeta(t,e_m),e_m)=\exp\{w_m(t)-t/2\}.$$
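The closed form $\exp\{w_m(t)-t/2\}$ for a single coordinate of the solution of (6) can be checked against a direct discretization. The sketch below is an illustration only: it uses an Euler scheme (not a method from the text) for the scalar equation $dz=z\,dw$, and then samples $w_m(t)\sim N(0,t)$ to show how the largest coefficient $\exp\{2w_m(t)-t\}$ among the first $M$ coordinates behaves as $M$ grows. The function names and seeds are hypothetical choices.

```python
import math
import random


def euler_mode(n_steps, t_end, rng):
    """Euler scheme for dz = z dw, z(0) = 1; returns (z(t_end), w(t_end))."""
    dt = t_end / n_steps
    sd = math.sqrt(dt)
    z, w = 1.0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)   # Wiener increment over one step
        z += z * dw
        w += dw
    return z, w


def largest_coefficient(num_modes, t, rng):
    """max over m <= num_modes of exp{2 w_m(t) - t}, with w_m(t) ~ N(0, t) independent."""
    return max(math.exp(2.0 * rng.gauss(0.0, math.sqrt(t)) - t)
               for _ in range(num_modes))


t_end = 1.0
z_num, w_end = euler_mode(20000, t_end, random.Random(1))
z_exact = math.exp(w_end - t_end / 2.0)          # closed form on the same path
rel_err = abs(z_num - z_exact) / z_exact

# Same seed, so the first 10 modes are a prefix of the first 100000 modes.
small = largest_coefficient(10, t_end, random.Random(2))
large = largest_coefficient(100000, t_end, random.Random(2))
print(rel_err, small, large)
```

The maximum over more coordinates can only grow, which is the numerical shadow of the fact that $\sup_m w_m(t)=+\infty$ with probability 1.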
Hence:
$$\zeta(t,x)=\sum_m\zeta(t,e_m)(x,e_m)=\sum_m(x,e_m)\exp\{w_m(t)-t/2\}e_m,$$
$$(\zeta(t,x),\zeta(t,x))=\sum_m(x,e_m)^2\exp\{2w_m(t)-t\}.$$
If there were a random operator $U_t$ such that $\zeta(t,x)=U_tx$, then for all $m$
$$|\zeta(t,e_m)|^2=|U_te_m|^2\le\|U_t\|^2,$$
i.e.,
$$\sup_m\exp\{2w_m(t)-t\}\le\|U_t\|^2,$$
but
$$P\Big\{\sup_mw_m(t)=+\infty\Big\}=1-\lim_{c\to\infty}\prod_mP\{w_m(t)<c\}=1.$$
The solution of (6) is not representable in the form $U_tx_0$ with $U_t$ a random operator.

We present some facts about solutions of (4) that are analogous to the facts established for equations in a finite-dimensional space (see §2 of Chapter III).

I. Let $\zeta_1(s,x,t)$ be the solution of (4) on $[s,\infty[$ satisfying the initial condition $\zeta_1(s,x,s)=x$ under the assumption that $F_2=0$. Then the solution of (4) with the initial condition $x(0)=x_0$ can be written as follows: if $(\tau_k,\theta_k)$ is a sequence of pairs of stopping times and points in $\Theta$ on which the measure $\nu_2$ is concentrated (as indicated in the proof of Theorem 1), then for $\tau_l\le t<\tau_{l+1}$
$$x(t)=\zeta_1(\tau_l,x(\tau_l),t),$$
$$x(\tau_l)=(I+F_2(\tau_l,\theta_l))\zeta_1(\tau_{l-1},x(\tau_{l-1}),\tau_l),\quad l>1,$$
$$x(\tau_1)=(I+F_2(\tau_1,\theta_1))\zeta_1(0,x_0,\tau_1).$$
In what follows we consider only equations of the form (4) with $F_2=0$.

II. Let $Z_t$ be a function with values in $L(X)$ satisfying the differential equation
$$dZ_t/dt=-Z_tA_t,\qquad Z_0=I.$$
If $A_t$ is a measurable locally bounded function, then $Z_t$ is a norm-continuous function, and since $Z_0=I$, it follows that $Z_t$ is an invertible operator for all $t>0$. Let $y(t)=Z_tx(t)$. Then $y(t)$ satisfies the stochastic differential equation
$$dy(t)=\sum_{k=1}^{\infty}\widetilde B_k(t)y(t)\,dw_k(t)+\int\widetilde F_1(t,\theta)y(t)\,\mu_1(d\theta\times dt),\tag{7}$$
where
$$\widetilde B_k(t)=Z_tB_k(t)(Z_t)^{-1},\qquad\widetilde F_1(t,\theta)=Z_tF_1(t,\theta)(Z_t)^{-1}.$$
These functions are also measurable and locally bounded. The equation (7) is convenient in that it contains only martingale terms, and its solution is a martingale. The coefficients of (7) also satisfy the condition
$$\sup_{s\le t}\Big\|\sum_{k=1}^{\infty}\widetilde B_k^*(s)\widetilde B_k(s)+\int\widetilde F_1^*(s,\theta)\widetilde F_1(s,\theta)\,m_1(d\theta)\Big\|<\infty,\qquad t>0.\tag{8}$$

III. Assume that the functions $\widetilde F_1(t,\theta)$ in (7) satisfy the condition that there exists an increasing numerical function $k_t$ such that $|\widetilde F_1(t,\theta)x|^2\le k_t|x|^2$. Then the solution of (7) with the nonrandom initial condition $x_0$ has all moments. Indeed, on the basis of the Itô formula
$$d(y(t),y(t))=2\sum_{k=1}^{\infty}(y(t),\widetilde B_k(t)y(t))\,dw_k(t)+\int[2(\widetilde F_1(t,\theta)y(t),y(t))+|\widetilde F_1(t,\theta)y(t)|^2]\,\mu_1(d\theta\times dt)$$
$$+\Big(\sum_{k=1}^{\infty}|\widetilde B_k(t)y(t)|^2+\int|\widetilde F_1(t,\theta)y(t)|^2\,m_1(d\theta)\Big)dt,$$
and hence for all positive integers $m$
$$d(y(t),y(t))^m=m(y(t),y(t))^{m-1}\Big[2\sum_{k=1}^{\infty}(y(t),\widetilde B_k(t)y(t))\,dw_k(t)+\Big(\sum_{k=1}^{\infty}|\widetilde B_k(t)y(t)|^2+\int|\widetilde F_1(t,\theta)y(t)|^2\,m_1(d\theta)\Big)dt\Big]$$
$$+\frac{m(m-1)}{2}\,4(y(t),y(t))^{m-2}\sum_{k=1}^{\infty}(y(t),\widetilde B_k(t)y(t))^2\,dt$$
$$+\int\big[(y(t)+\widetilde F_1(t,\theta)y(t),y(t)+\widetilde F_1(t,\theta)y(t))^m-(y(t),y(t))^m-m(y(t),y(t))^{m-1}\big(2(y(t),\widetilde F_1(t,\theta)y(t))+|\widetilde F_1(t,\theta)y(t)|^2\big)\big]\,m_1(d\theta)\,dt$$
$$+\int\big[(y(t)+\widetilde F_1(t,\theta)y(t),y(t)+\widetilde F_1(t,\theta)y(t))^m-(y(t),y(t))^m\big]\,\mu_1(d\theta\times dt).$$
If $\tau_c=\inf[t: |y(t)|^2>c]$, then by using the boundedness of the jumps of $(y(t),y(t))$ for $|y(t)|^2\le c$ we see that $E|y(t\wedge\tau_c)|^{2m}<\infty$. The preceding relation implies the inequality
$$E|y(t\wedge\tau_c)|^{2m}\le|x_0|^{2m}+E\int_0^{t\wedge\tau_c}\Big[m|y(s)|^{2m-2}\Big(\sum_{k=1}^{\infty}|\widetilde B_k(s)y(s)|^2+\int|\widetilde F_1(s,\theta)y(s)|^2\,m_1(d\theta)\Big)+2m(m-1)|y(s)|^{2m-4}\sum_{k=1}^{\infty}(y(s),\widetilde B_k(s)y(s))^2$$
$$+\int\big[|y(s)+\widetilde F_1(s,\theta)y(s)|^{2m}-|y(s)|^{2m}-m|y(s)|^{2m-2}\big(2(y(s),\widetilde F_1(s,\theta)y(s))+|\widetilde F_1(s,\theta)y(s)|^2\big)\big]\,m_1(d\theta)\Big]ds.$$
Using the inequality
$$(y(s),\widetilde B_k(s)y(s))^2\le|y(s)|^2|\widetilde B_k(s)y(s)|^2,$$
condition (8), and the boundedness of $\widetilde F_1(t,\theta)$, we can get that for all $t$ there exists an $h_t$ such that for $s\le t$
$$E|y(s\wedge\tau_c)|^{2m}\le|x_0|^{2m}+h_tE\int_0^{s\wedge\tau_c}|y(u)|^{2m}\,du.$$
This gives us that $E|y(s\wedge\tau_c)|^{2m}$ is bounded uniformly with respect to $c$. Since $\tau_c\to\infty$ as $c\to\infty$, it follows that $E|y(s)|^{2m}<\infty$.

IV. We consider the solution of (7) by the successive approximations
$$y_0(t)=x_0,$$
$$y_n(t)=x_0+\sum_{k=1}^{\infty}\int_0^t\widetilde B_k(s)y_{n-1}(s)\,dw_k(s)+\int_0^t\!\!\int\widetilde F_1(s,\theta)y_{n-1}(s)\,\mu_1(d\theta\times ds).$$
Then
$$y_1(t)-y_0(t)=\sum_{k=1}^{\infty}\int_0^t\widetilde B_k(s)x_0\,dw_k(s)+\int_0^t\!\!\int\widetilde F_1(s,\theta)x_0\,\mu_1(d\theta\times ds)=W_1(t,x_0),$$
$$y_2(t)-y_1(t)=\sum_{k=1}^{\infty}\int_0^t\widetilde B_k(s)(y_1(s)-y_0(s))\,dw_k(s)+\int_0^t\!\!\int\widetilde F_1(s,\theta)(y_1(s)-y_0(s))\,\mu_1(d\theta\times ds)$$
$$=\sum_{k=1}^{\infty}\int_0^t\widetilde B_k(s)W_1(s,x_0)\,dw_k(s)+\int_0^t\!\!\int\widetilde F_1(s,\theta)W_1(s,x_0)\,\mu_1(d\theta\times ds)=W_2(t,x_0),$$
$$y_n(t)-y_{n-1}(t)=\sum_{k=1}^{\infty}\int_0^t\widetilde B_k(s)W_{n-1}(s,x_0)\,dw_k(s)+\int_0^t\!\!\int\widetilde F_1(s,\theta)W_{n-1}(s,x_0)\,\mu_1(d\theta\times ds)=W_n(t,x_0).\tag{9}$$
It is clear from the construction that the $W_n(t,x_0)$ are $n$-fold stochastic integrals and can be determined successively by the second equality in (9) if it is assumed that $W_0(t,x_0)=x_0$. Thus, the formula
$$y(t)=x_0+\sum_{n=1}^{\infty}W_n(t,x_0)\tag{10}$$
holds for the solution of (7); the convergence of the series in (10) (in the mean square and with probability 1) was established in the proof of Theorem 1.

V. It is easy to see by induction that for $n<m$ and for any $z,y\in X$
$$E(W_n(t,x_0),z)(W_m(t,x_0),y)=0.$$
Therefore, the correlation operator $R(t)$ of the process $y(t)$, which satisfies
$$(R(t)z,y)=E(y(t),z)(y(t),y),$$
is determined by
$$(R(t)z,y)=(x_0,z)(x_0,y)+\sum_{n=1}^{\infty}E(W_n(t,x_0),z)(W_n(t,x_0),y).$$
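The series (10) and the orthogonality of its terms can be made concrete in the simplest scalar case. The sketch below assumes (this is not the setting of the text) a one-dimensional state, a single Wiener process, a constant coefficient $b$, and no jump term, so that (7) reads $dy=by\,dw$. In that case the $n$-fold iterated integrals are known in closed form through the probabilists' Hermite polynomials, $W_n(t,x_0)=x_0\,b^n t^{n/2}\,He_n(w(t)/\sqrt t)/n!$, the series sums to $x_0\exp\{bw(t)-b^2t/2\}$, and $EW_n^2=x_0^2(b^2t)^n/n!$. The function names are hypothetical.

```python
import math


def hermite_he(n_max, x):
    """Probabilists' Hermite polynomials He_0..He_{n_max} at x, via the
    recurrence He_{k+1}(x) = x*He_k(x) - k*He_{k-1}(x)."""
    values = [1.0, x]
    for k in range(1, n_max):
        values.append(x * values[k] - k * values[k - 1])
    return values[: n_max + 1]


def chaos_terms(x0, b, t, w, n_max):
    """W_n(t, x0), n = 0..n_max, for the scalar equation dy = b*y dw given w(t) = w."""
    he = hermite_he(n_max, w / math.sqrt(t))
    return [x0 * (b * math.sqrt(t)) ** n * he[n] / math.factorial(n)
            for n in range(n_max + 1)]


x0, b, t, w = 2.0, 0.7, 1.5, -0.3
terms = chaos_terms(x0, b, t, w, n_max=30)
series_value = sum(terms)                                # x0 + sum_{n>=1} W_n(t, x0)
closed_form = x0 * math.exp(b * w - b * b * t / 2.0)

# Second moments of the orthogonal terms: E W_n^2 = x0^2 (b^2 t)^n / n!.
second_moments = [x0 ** 2 * (b * b * t) ** n / math.factorial(n) for n in range(31)]
correlation = sum(second_moments)                        # scalar analogue of R(t)
print(series_value, closed_form, correlation)
```

In this scalar case the sum of second moments equals $x_0^2e^{b^2t}=Ey(t)^2$, which is the one-dimensional counterpart of the formula for $(R(t)z,y)$.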
We have that
$$E(W_n(t,x_0),z)(W_n(t,x_0),y)=\sum_{k=1}^{\infty}\int_0^tE(W_{n-1}(s,x_0),\widetilde B_k^*(s)z)(W_{n-1}(s,x_0),\widetilde B_k^*(s)y)\,ds$$
$$+\int_0^t\!\!\int E(W_{n-1}(s,x_0),\widetilde F_1^*(s,\theta)z)(W_{n-1}(s,x_0),\widetilde F_1^*(s,\theta)y)\,m_1(d\theta)\,ds.$$
Let
$$(Q_n(t,x_0)z,y)=E(W_n(t,x_0),z)(W_n(t,x_0),y).$$
Then we have the recursion relation
$$(Q_n(t,x_0)z,y)=\sum_{k=1}^{\infty}\int_0^t(Q_{n-1}(s,x_0)\widetilde B_k^*(s)z,\widetilde B_k^*(s)y)\,ds+\int_0^t\!\!\int(Q_{n-1}(s,x_0)\widetilde F_1^*(s,\theta)z,\widetilde F_1^*(s,\theta)y)\,m_1(d\theta)\,ds.$$
We introduce a linear function $S$ defined on strongly continuous functions $Q(t)$ on $R_+$ taking values in the space $L_+(X)$ of symmetric nonnegative operators in $L(X)$:
$$(S[Q](t)z,y)=\sum_{k=1}^{\infty}\int_0^t(Q(s)\widetilde B_k^*(s)z,\widetilde B_k^*(s)y)\,ds+\int_0^t\!\!\int(Q(s)\widetilde F_1^*(s,\theta)z,\widetilde F_1^*(s,\theta)y)\,m_1(d\theta)\,ds.$$
Then
$$Q_n(t,x_0)=S^n[x_0\circ x_0](t),\qquad R(t)=\sum_{n=0}^{\infty}S^n[x_0\circ x_0](t),$$
where $x_0\circ x_0$ denotes the operator $z\mapsto(x_0,z)x_0$ and $S^0[x_0\circ x_0]=x_0\circ x_0$.

1.3. Linear stochastic equations in Hilbert space. We single out a certain subclass of linear stochastic differential equations of the form (4) whose solutions can be represented in the form $x(t)=U_tx_0$, where $U_t$ is a bounded linear (random) operator. Let $H(X)$ be the space of linear operators in $L(X)$ that are Hilbert-Schmidt operators: $C\in H(X)$ if $\operatorname{tr}C^*C<\infty$. The space $H(X)$ is a Hilbert space with the inner product
$$(C_1,C_2)=\operatorname{tr}C_2^*C_1,\qquad C_1,C_2\in H(X).$$
Note that $F(C)=AC$ and $F_1(C)=CA$ are bounded linear operators from $H(X)$ to itself for every $A\in L(X)$.
THEOREM 2. Assume the following conditions hold for the coefficients of equation (4):

1) They are measurable and the function $\operatorname{tr}Q(t)$ is locally bounded, where
$$Q(t)=A^*(t)A(t)+\sum_{k=1}^{\infty}B_k^*(t)B_k(t)+\int F_1^*(t,\theta)F_1(t,\theta)\,m_1(d\theta).$$

2) $F_2(t,\theta)\in H(X)$ for all $t$ and almost all $\theta$ with respect to the measure $m_2(d\theta)$.

Then the solution of (4) with initial condition $x(0)=x_0$ is representable in the form $x(t)=U_tx_0$, where $U_t$ is a bounded random operator, and $P\{U_t-I\in H(X)\}=1$.

PROOF. Let us consider the expression
$$Z_t=\int_0^tA(s)\,ds+\sum_{k=1}^{\infty}\int_0^tB_k(s)\,dw_k(s)+\int_0^t\!\!\int F_1(s,\theta)\,\mu_1(d\theta\times ds)+\int_0^t\!\!\int F_2(s,\theta)\,\nu_2(d\theta\times ds).\tag{11}$$
Regarding the stochastic integrals as integrals of $H(X)$-valued functions, i.e., functions with values in a Hilbert space, we see that they all exist and are processes in $H(X)$ with independent increments. We see that the series of Gaussian variables converges with probability 1 in $H(X)$. In the proof of Lemma 1 it was shown that the following conditions are equivalent for a sequence of Gaussian variables $\eta_n$ with values in a Hilbert space: a) $\eta_n\to0$ in probability, and b) $E|\eta_n|^2\to0$. But for $n<m$
$$E\Big(\sum_{k=n}^{m}\int_0^tB_k(s)\,dw_k(s),\ \sum_{k=n}^{m}\int_0^tB_k(s)\,dw_k(s)\Big)=\int_0^t\sum_{k=n}^{m}\operatorname{tr}B_k^*(s)B_k(s)\,ds.$$
Therefore, convergence of the series of stochastic integrals follows from condition 1).

We consider the stochastic differential equation
$$dY_t=dZ_t(Y_t+I)\tag{12}$$
for a process $Y_t$ in $H(X)$. This equation can be rewritten in the form (1):
$$dY_t=\widetilde A(t,Y_t)\,dt+\sum_{k=1}^{\infty}\widetilde B_k(t,Y_t)\,dw_k(t)+\int\widetilde F_1(t,Y_t,\theta)\,\mu_1(d\theta\times dt)+\int\widetilde F_2(t,Y_t,\theta)\,\nu_2(d\theta\times dt),\tag{13}$$
where $\widetilde A(t,Y)$, $\widetilde B_k(t,Y)$, and $\widetilde F_i(t,Y,\theta)$ are functions from $R_+\times H(X)$ and $R_+\times\Theta\times H(X)$ to $H(X)$ defined by
$$\widetilde A(t,Y)=A(t)(I+Y),\qquad\widetilde B_k(t,Y)=B_k(t)(I+Y),\qquad\widetilde F_i(t,Y,\theta)=F_i(t,\theta)(I+Y).$$
They have the necessary measurability properties. Further,
$$(\widetilde A(t,Y),\widetilde A(t,Y))+\sum_{k=1}^{\infty}(\widetilde B_k(t,Y),\widetilde B_k(t,Y))+\int(\widetilde F_1(t,Y,\theta),\widetilde F_1(t,Y,\theta))\,m_1(d\theta)$$
$$=\operatorname{tr}(I+Y)^*\Big[A^*(t)A(t)+\sum_{k=1}^{\infty}B_k^*(t)B_k(t)+\int F_1^*(t,\theta)F_1(t,\theta)\,m_1(d\theta)\Big](I+Y)$$
$$=\operatorname{tr}(I+Y)^*Q(t)(I+Y)\le\operatorname{tr}Q(t)+2(Y,Q(t))+\|Q(t)\|(Y,Y)$$
$$\le\operatorname{tr}Q(t)+(Q(t),Q(t))+(Y,Y)+\|Q(t)\|(Y,Y).$$
This means that the coefficients in (13) satisfy condition 2) of Theorem 1. Finally,
$$(\widetilde A(t,Y_1)-\widetilde A(t,Y_2),\widetilde A(t,Y_1)-\widetilde A(t,Y_2))+\sum_{k=1}^{\infty}(\widetilde B_k(t,Y_1)-\widetilde B_k(t,Y_2),\widetilde B_k(t,Y_1)-\widetilde B_k(t,Y_2))$$
$$+\int(\widetilde F_1(t,Y_1,\theta)-\widetilde F_1(t,Y_2,\theta),\widetilde F_1(t,Y_1,\theta)-\widetilde F_1(t,Y_2,\theta))\,m_1(d\theta)$$
$$=\operatorname{tr}(Y_1-Y_2)^*Q(t)(Y_1-Y_2)\le\|Q(t)\|(Y_1-Y_2,Y_1-Y_2);$$
hence condition 3) of Theorem 1 holds. Therefore, there exists a unique solution of (13). Setting $U_t=I+Y_t$, we thus have that
$$U_t=I+\int_0^tA(s)U_s\,ds+\sum_{k=1}^{\infty}\int_0^tB_k(s)U_s\,dw_k(s)+\int_0^t\!\!\int F_1(s,\theta)U_s\,\mu_1(d\theta\times ds)+\int_0^t\!\!\int F_2(s,\theta)U_s\,\nu_2(d\theta\times ds).$$
Applying this relation to the element $x_0\in X$, we see that $U_tx_0=x(t)$ is a solution of (4). □

We now consider (12), where $Z_t$ is a stochastically continuous process in $H(X)$ with independent increments. Such a process (see Gikhman and
Skorokhod [1], Vol. 2, Russian p. 401, English p. 270) can be represented as follows:
$$Z_t=\widetilde A(t)+Z_t^0+Z_t^1,\tag{14}$$
where $\widetilde A(t)$ is a continuous nonrandom function, $Z_t^0$ is a martingale with bounded jumps, $Z_t^1$ is a jump process, and $Z_t^0$ and $Z_t^1$ are mutually independent stochastically continuous processes with independent increments. Since $Z_t^1$ has finitely many jumps on each interval $[0,t]$, equation (12) can be solved between the jumps of $Z_t^1$, but if $\tau$ is a jump time of $Z_t^1$ and a solution of (12) has been constructed for $t<\tau$, then
$$Y_\tau-Y_{\tau-}=(Z_\tau^1-Z_{\tau-}^1)(Y_{\tau-}+I)$$
(we assume that $Y_t$ and $Z_t$ are right-continuous). Therefore, it suffices to consider (12) for the case when $Z_t=\widetilde A(t)+Z_t^0$. For (12) to make sense it is necessary that $\widetilde A(t)$ have bounded variation (in $H(X)$). We consider the equation
$$dY_t=(d\widetilde A_t+dZ_t^0)(Y_t+I),$$
where $\widetilde A(t)$ is a continuous function of bounded variation, and $Z_t^0$ is a stochastically continuous martingale with bounded jumps. Denote by $\widetilde V_t$ the solution of the equation
$$d\widetilde V_t=-\widetilde V_t\,d\widetilde A_t,\qquad\widetilde V_0=I.$$
The solution of this equation can be written as a series
$$\widetilde V_t=I+\sum_{n=1}^{\infty}(-1)^n\int\!\cdots\!\int_{0<s_1<\dots<s_n<t}d\widetilde A_{s_1}\cdots d\widetilde A_{s_n}.$$
The function $\widetilde V_t-I$ is also continuous and has bounded variation in $H(X)$. Let $\widetilde Y_t=\widetilde V_tY_t+\widetilde V_t-I$. Then $\widetilde Y_t$ satisfies the stochastic equation
$$d\widetilde Y_t=d\widetilde V_t\,Y_t+\widetilde V_t\,dY_t+d\widetilde V_t=-\widetilde V_t\,d\widetilde A_t\,Y_t+\widetilde V_t(d\widetilde A_t+dZ_t^0)(Y_t+I)-\widetilde V_t\,d\widetilde A_t=\widetilde V_t\,dZ_t^0(Y_t+I).$$
Note that $\widetilde V_t$ is an invertible operator. Indeed, if $V_t$ is the solution of the equation
$$dV_t=d\widetilde A_t\,V_t,\qquad V_0=I,$$
then
$$d(\widetilde V_tV_t)=(d\widetilde V_t)V_t+\widetilde V_t\,dV_t=-\widetilde V_t\,d\widetilde A_t\,V_t+\widetilde V_t\,d\widetilde A_t\,V_t=0,\qquad\widetilde V_tV_t=I.$$
Therefore, $V_t=(\widetilde V_t)^{-1}$ and $Y_t$ can be expressed in terms of $\widetilde Y_t$ by the formula
$$Y_t=\widetilde V_t^{-1}\widetilde Y_t+\widetilde V_t^{-1}-I.$$
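The relation between the two ordered solutions, one built by multiplying new factors on the right and the other on the left, can be illustrated with small matrices. The sketch below is only an approximation under assumptions not made in the text: $X$ is replaced by $R^2$, the function of bounded variation is taken absolutely continuous with a hypothetical density `rate(t)`, and the equations $d\widetilde V_t=-\widetilde V_t\,d\widetilde A_t$ and $dV_t=d\widetilde A_t\,V_t$ are replaced by Euler products.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rate(t):
    """Hypothetical density of the bounded-variation function (non-commuting in t)."""
    return [[0.0, 1.0], [-t, 0.0]]


def ordered_products(t_end, n_steps):
    """Euler products for dV~ = -V~ dA (new factor on the right)
    and dV = dA V (new factor on the left)."""
    dt = t_end / n_steps
    v_tilde = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(n_steps):
        a = rate(k * dt)
        minus = [[(1.0 if i == j else 0.0) - a[i][j] * dt for j in range(2)]
                 for i in range(2)]
        plus = [[(1.0 if i == j else 0.0) + a[i][j] * dt for j in range(2)]
                for i in range(2)]
        v_tilde = matmul(v_tilde, minus)   # V~_{t+dt} = V~_t (I - dA)
        v = matmul(plus, v)                # V_{t+dt}  = (I + dA) V_t
    return v_tilde, v


v_tilde, v = ordered_products(1.0, 20000)
product = matmul(v_tilde, v)
defect = max(abs(product[i][j] - (1.0 if i == j else 0.0))
             for i in range(2) for j in range(2))
print(defect)
```

Each step changes the product $\widetilde V_tV_t$ only by a term of order $dt^2$, so the defect from the identity shrinks as the step is refined; this mirrors the exact cancellation $d(\widetilde V_tV_t)=0$.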
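Finally, we get the equation
$$d\widetilde Y_t=d\widetilde Z_t^0(\widetilde Y_t+I)\tag{15}$$
for $\widetilde Y_t$, where $\widetilde Z_t^0=\int_0^t\widetilde V_s\,dZ_s^0\,\widetilde V_s^{-1}$. The solution of (15) can be written as a series of multiple stochastic integrals
$$\widetilde Y_t=\widetilde Z_t^0+\sum_{n=2}^{\infty}\int\!\cdots\!\int_{0<s_1<\dots<s_n<t}d\widetilde Z_{s_n}^0\cdots d\widetilde Z_{s_1}^0.\tag{16}$$
The stochastic integrals in (16) have the form
$$\int_0^td\widetilde Z_s^0\,F(s),\tag{17}$$
where $F(s)$ is an $H(X)$-valued measurable random function adapted to some flow $\{\mathscr{F}_s\}$ with respect to which $\widetilde Z_s^0$ is a process with independent increments ($\{\mathscr{F}_s\}$ can be generated by the martingale $\widetilde Z_s^0$ itself), and the function $F(s)$ can itself be given by stochastic integrals of the same form. Let $\lambda_s=E(\widetilde Z_s^0,\widetilde Z_s^0)$. This is a continuous increasing function, and
$$\lambda_t-\lambda_s=E(\widetilde Z_t^0-\widetilde Z_s^0,\widetilde Z_t^0-\widetilde Z_s^0)\quad\text{for }s<t.$$
The integral (17) is defined as a mean-square limit of integrals of step functions for all $F(s)$ such that
$$\int_0^tE(F(s),F(s))\,d\lambda_s<\infty\tag{18}$$
and, further,
$$E\Big(\int_0^td\widetilde Z_s^0F(s),\int_0^td\widetilde Z_s^0F(s)\Big)\le\int_0^tE(F(s),F(s))\,d\lambda_s.\tag{19}$$
For step functions inequality (19) is a consequence of the following: for $s_1<s_2$
$$E((\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0)F_{s_1},(\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0)F_{s_1})=E\operatorname{tr}F_{s_1}^*(\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0)^*(\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0)F_{s_1}$$
$$=E\operatorname{tr}(\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0)^*(\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0)F_{s_1}F_{s_1}^*\le E\operatorname{tr}(\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0)^*(\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0)\operatorname{tr}F_{s_1}F_{s_1}^*$$
$$=E\operatorname{tr}(\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0)^*(\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0)\,E\operatorname{tr}F_{s_1}F_{s_1}^*=(\lambda_{s_2}-\lambda_{s_1})E(F_{s_1},F_{s_1})$$
(we have used the fact that $\widetilde Z_{s_2}^0-\widetilde Z_{s_1}^0$ is independent of the $\sigma$-algebra $\mathscr{F}_{s_1}$, with respect to which $F_{s_1}$ is measurable).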
From (19) we get the following estimate for the terms of the series (16):
$$E\Big(\int\!\cdots\!\int_{0<s_1<\dots<s_n<t}d\widetilde Z_{s_n}^0\cdots d\widetilde Z_{s_1}^0,\ \int\!\cdots\!\int_{0<s_1<\dots<s_n<t}d\widetilde Z_{s_n}^0\cdots d\widetilde Z_{s_1}^0\Big)\le\frac{\lambda^n(t)}{n!}.$$
This implies that the series in (16) converges, uniformly on each finite interval. The last point is a consequence of the following estimate for $H(X)$-valued martingales: if $W_t$ is a separable martingale in $H(X)$, then
$$P\Big\{\sup_{s\le t}(W_s,W_s)>\varepsilon\Big\}\le\frac1\varepsilon E(W_t,W_t).$$
Combining all these assertions, we obtain the following theorem.

THEOREM 3. Suppose that $Z_t$, $t\in R_+$, is a stochastically continuous process in $H(X)$ admitting the representation (14), with $\widetilde A(t)$ a function of bounded variation. Then (12) has a unique solution in $H(X)$.

REMARK. Obviously, (12) also has a unique solution on $[s,\infty[$. For $t>s$ let $Y_t^s$ be the solution of (12) on $[s,\infty[$ such that $Y_s^s=0$. We consider the family of operators $U_t^s=I+Y_t^s$. Denote by $\mathscr{F}_t^s$ the $\sigma$-algebra generated by the variables $Z_u-Z_s$ for $u\in[s,t]$. Then the $U_t^s$ satisfy the following conditions:

1) $U_t^s$ is an $\mathscr{F}_t^s$-measurable variable, and hence the variables $U_{s_1}^0,U_{s_2}^{s_1},\dots,U_{s_n}^{s_{n-1}}$ are independent for $0<s_1<s_2<\dots<s_n$.

2) $U_v^s=U_v^tU_t^s$ with probability 1 for $s<t<v$. (Indeed, the relations
$$d_v(U_v^t-I)=dZ_vU_v^t,\qquad d_v(U_v^tU_t^s-I)=[d_v(U_v^t-I)]U_t^s=dZ_vU_v^tU_t^s$$
hold for $v>t$; hence $U_v^s-I$ and $U_v^tU_t^s-I$, as functions of $v>t$, satisfy the same equation (12), and the values of $U_v^s-I$ and $U_v^tU_t^s-I$ coincide for $v=t$; the uniqueness of the solution of (12) and the fact that the initial values of the solutions coincide at the point $v=t$ yield what is required.)

3) If $0\le s<t\le v$, $t-s\to0$, and $v$ is fixed, then $U_t^s-I\to0$ in probability in $H(X)$.

Indeed, the probability that the process $Z_t^1$ does not have jumps on $[s,t]$ tends to 1 as $t-s\to0$. Therefore, it suffices to consider the case when $Z_t^1=0$. Then $E(Y_t^s,Y_t^s)$ is bounded for $s,t\in[0,v]$. We use this and the
estimate
$$E(Y_t^s,Y_t^s)\le2E\Big(\int_s^td\widetilde A_u\,Y_u^s,\int_s^td\widetilde A_u\,Y_u^s\Big)+2E\Big(\int_s^tdZ_u^0\,Y_u^s,\int_s^tdZ_u^0\,Y_u^s\Big)\le2\Big[\Big(\operatorname*{var}_{[s,t]}\widetilde A_u\Big)^2+\lambda(t)-\lambda(s)\Big]\sup_uE(Y_u^s,Y_u^s).$$
This implies that $Y_t^s\to0$ in probability.

1.4. Stochastic Hilbert-Schmidt semigroups. Denote by $G(X)$ the semigroup of linear operators in $L(X)$ of the form $I+Y$, where $Y\in H(X)$. We regard $G(X)$ as a metric space with the metric
$$\rho(U_1,U_2)=(U_1-U_2,U_1-U_2)^{1/2}.$$
It is separable and complete. For $U\in G(X)$ let $N(U)=1+\rho(U,I)$. Then $N(U)$ satisfies the following inequality: $N(U_1U_2)\le N(U_1)N(U_2)$ if $U_1,U_2\in G(X)$. Moreover, for $\rho(U,I)<1$ the operator $U$ is invertible, and $U^{-1}\in G(X)$. It can be seen that
$$N(U^{-1})\le1/(2-N(U)).$$
We shall consider a two-parameter family $\{U_t^s,\ 0\le s\le t<\infty\}$ of random elements in $G(X)$ satisfying conditions 1)-3) in §1.3, with condition 1) fulfilled for some family $\{\mathscr{F}_t^s,\ 0\le s<t<\infty\}$ of $\sigma$-algebras such that 1a) $\mathscr{F}_t^s\subset\mathscr{F}_v^u$ for $[s,t]\subset[u,v]$, and 1b) the $\sigma$-algebras $\mathscr{F}_s^0$ and $\mathscr{F}_t^s$ are independent for $0<s<t$. The collection $\{U_t^s\}$ is called a (left) stochastic semigroup. If instead of condition 2) we have

2') $U_v^s=U_t^sU_v^t$ with probability 1 for $s<t<v$,

then the stochastic semigroup is said to be a right stochastic semigroup. The operation of taking adjoints carries right semigroups into left semigroups, and conversely.

It was shown in §1.3 that the solutions of (12) generate a left stochastic semigroup. It turns out that under a natural restriction every stochastic semigroup satisfies a linear stochastic differential equation with some process with independent increments. We require the additional condition:

4) $U_t^s$ is a semimartingale as a function of $t$ (more precisely, $U_t^s-I$ is a semimartingale in $H(X)$, i.e., it is representable as the sum of a martingale and a function of bounded variation).

It will be assumed that $U_{t+}^s=U_t^s$ for all $t$; the limit $U_{t+}^s$ exists because $U_t^s$ is a semimartingale.
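The two inequalities for $N(U)=1+\rho(U,I)$ stated at the start of §1.4 are easy to check in finite dimensions, where the Hilbert-Schmidt norm is the Frobenius norm. The sketch below assumes $X=R^2$ and uses two arbitrarily chosen matrices `u1` and `u2` with $\rho(U,I)<1$; the helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def hs_norm(m):
    """Hilbert-Schmidt (Frobenius) norm of a 2x2 matrix."""
    return math.sqrt(sum(x * x for row in m for x in row))


def n_func(u):
    """N(U) = 1 + rho(U, I), with rho the Hilbert-Schmidt distance."""
    diff = [[u[i][j] - (1.0 if i == j else 0.0) for j in range(2)]
            for i in range(2)]
    return 1.0 + hs_norm(diff)


def inverse(u):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    det = u[0][0] * u[1][1] - u[0][1] * u[1][0]
    return [[u[1][1] / det, -u[0][1] / det],
            [-u[1][0] / det, u[0][0] / det]]


u1 = [[1.2, -0.3], [0.1, 0.9]]     # rho(u1, I) = sqrt(0.15) < 1
u2 = [[0.8, 0.25], [-0.2, 1.1]]    # rho(u2, I) = sqrt(0.1525) < 1

lhs_product = n_func(matmul(u1, u2))
rhs_product = n_func(u1) * n_func(u2)
lhs_inverse = n_func(inverse(u1))
rhs_inverse = 1.0 / (2.0 - n_func(u1))
print(lhs_product, rhs_product, lhs_inverse, rhs_inverse)
```

Both bounds follow from $\|Y_1Y_2\|\le\|Y_1\|\,\|Y_2\|$ for Hilbert-Schmidt norms: writing $U_i=I+Y_i$ gives $U_1U_2-I=Y_1+Y_2+Y_1Y_2$, and the Neumann series gives $\|U^{-1}-I\|\le\|Y\|/(1-\|Y\|)$.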
292 IV. LINEAR STOCHASTIC EQUATIONS IN HILBERT SPACE THEOREM 4. If conditions 1)-4) hold, then there exists a stochastically continuous process Zt with independent increments that is a semimartingale (this is equivalent to saying that the function A t in the representation (14) for Zt has bounded variation) such that Yrs = Uf - I satisfies for t > s equation (12) with initial condition Y; = O. PROOF. Being a semimartingale, Uf does not have discontinuities of the second kind in G(X) as a function of t. Using the fact that ufo is invertible with probability arbitrarily close to 1 if t - to is sufficiently small, we see that Uf does not have discontinuities of the second kind as a function of s: for to < s < t Ui = UfO(U;O)-1 (the right-hand side does not have discontinuities of the second kind on the set {UfO is invertible}, because the invertibility of Uf ufo and the indepen- dence of the factors implies that each is invertible). For every e > 0 there exists a sequence of stopping times 'l'k' 'l' > 'l' < . . . < 'l'k --+ 00, such that p ( U;; , I) < e for s  {'l', . . . }, and p ( U;; , I) > e for s = 'l'k' k = 1, 2, . . . . The quantity U;; - I will be called the jump of the semigroup at the point s. The times 'l'k are all the times when the stochastic semigroup has jumps exceeding e. We define a random measure v on the a-algebra of Borel sets in R+ x H(X): v(B) = L IB(s, U;; - I), with the summation over the points of discontinuity of the stochastic semi- group; v([O,t] x {Y: (Y) > e}) < 00 for every e > 0 and t > 0, and it is a stochastically continuous Poisson measure with independent values. Further, if s < u < v < t and C c H(X) is a Borel set lying at a posi- tive distance from 0, then v([u, v] x C) is measurable with respect to grs. We construct a family {Uf(e), 0 < s < t} of random elements of G(H) as follows. Let 'l'k be the stopping times indicated above. For 0 < s < t let l l TTS ( ) TTt[ U ti-I u s l./ t e = l./ t t l . .. 
t l , 1- k- (20) where s < 'l'k < ... < 'l'i < t are all the points of the sequence {'l'} in ]s, t]. If none of the points 'l'k fall in ]s, t], then we regard the product on the right-hand side as coinciding with Uf; more precisely, U;e = Uf, and k the remaining factors are equal to I (this can be explained as follows: in constructing Uf ( e) we throwaway the jumps of Uf exceeding e in norm; if there are no such jumps in some interval, then the semigroup is left unchanged). We note that the process v([O;t] x {Y: (Y, Y) > e}) = e(t) is a Poisson process for which the jumps coincide with the times 'l'k' and 
l. LINEAR EQUATIONS WITH BOUNDED COEFFICIENTS 293 e(t) - e(s) is g;s-measurable. Therefore, the variables on the right-hand side of (20) are also g;s-measurable. Thus, n-l U; _ = nli. L( U:+k ':s - 1)1 g.(s+k ':S )-.(s)=o, + UI Ig.(t)_.(s)=O}, k=O e(s+(k+l) t-;;s -e(s)=I} t e . L s+k I.=! U t / _ = I + 11m U .t- I { J: ( s+ ( k-l ) I.=! ) -J: ( s ) =o k+1 noo S+}n '='t n ,=,e. ' o k <j  n e (s+ * (t - s)) =e (s+ * (t - s) )e (s)+ 1, e(s+(j+ 1) t-;;s )=t(s)+2} Hence, Ul(e) is g;s-measurable. It follows from the construction that U(e) = U(e)Ul(e) for s < t < v. Moreover, since P{Ul(e) ¥= Uf} = P{e(t) - e(s) > O} and the right-hand side tends to zero as t - s --+ 0, it follows that Ul (e) --+ I as t - s --+ 0, t < v. Therefore, Ul ( e) is a stochastic semigroup in G(H). It also does not have discontinuities of the second kind, and its jumps do not exceed ..;e in norm. The following lemma is needed. LEMMA 2. The stochastic semigroup Ul (e) has uniformly bounded mo- ments of any order on each finite interval. PROOF. Obviously, it suffices to prove the existence for each v > 0 of an h such that for each r > 0 sup E(N(Ut(e)))' < 00. O<s<t:5 v t-sh Choose J > 0 and a > 0 (their values will be made precise later). Let h be such that for 0 < s < t < v and t - s < h P{p(Ut(e),I) > J} < a, . = inf [t > s: N(UI(e)) > .  ] . Then P{-r < s+h} < a+P{-r < s+h,N(UtS(e)) < 1 +J} < a + EI{t:5 s + h }P{N(U[t+h}(e)U;(e)) < 1 + J}. Since N(Ui(e)) > (1 + J)j(1 - J), it follows that N(Us'+h(e)) > 1 + J, as otherwise, we would have N(U;(e)) = N((Ust+h)-1 U:+ h ) < 2  : c5 -    . Therefore, P{N(Ust+h(e)U;(e)) < 1 + JIg;S} < P{N(Ust+h(e)) < 1 + JIg;S} < a. 
294 IV. LINEAR STOCHASTIC EQUATIONS IN HILBERT SPACE We have proved that { I+J } P sup N(Ut(e)) > 1 _ J < 2a. tE[s,s+h] Next, using the fact that N(Ut(e)) < N(U;-(e)U;_(e)) <   (1 +e) (N(UtS) < (1 +J)j(I-J) for t < -r), we find from the same considerations that for A> (1 +J)(1 +e)j(I-J) P { SUP N(Ut(e)) > A } tE[s,s+h] = EI{ts+h}P { SUP N(Utt(e)UtS(e)) > AIS1;s } tE[t,s+h] { t A( 1 - J) } < EI{T:5s+h}P sup N(U t (e)) > (1 £5)(1 ) 19; tE[t,s+h] + + e { s A( 1 - J) } < supP sup N(U t (e)) > (1 £5)(1 £5) 2a. sv tE[s,s+h] + + From this, SUPP { sup N(UtS(e)) > ( (I+:+e) ) k } « 2a)k. s tE[s,+h] Therefore, for s < v and t - s < h E(N(Ut(e)))' < 1 + f: ( (1 + : + e) rr (2a)k-1 k=l = C 1 + :  + e) )' f: ( 2a (1 + :  + e) ) kr . k=O Choose J > 0 and a > 0 such that 2 (1 +J)'(1 +e)' 1 a (1 _ J)' <. 0 We return to the proof of the theorem. Let EUl(e) = Ef(e). The operators Ef(e) belong to G(X). They have the following properties: a) E(e) = E(e)Ef(e) for s < t < v, because E U ( e) U t S ( e) = E U ( e) E U t S ( e). 
1. LINEAR EQUATIONS WITH BOUNDED COEFFICIENTS 295 b) E;(e) = I, and E(e) is continuous with respect to s and v (the last point follows from the stochastic continuity of UtS(e) and the uniform boundedness of the second moment of Ut(e)). c) Ef(e) is invertible for all s < t (this follows from a) and the fact that p(Ef(e), I) --+ 0 as t - s --+ 0). Let U t S ( e) = (E? ( e ) ) - I U t S ( e ) E ( e). It is easy to verify that Ut(e) is a stochastic semigroup in G(X) having all moments. Moreover, Ut(e) is a martingale. Let e < 1. Then p(Ut(e),I) < 1, and Ut(e) is invertible for sufficiently small t - s. Therefore, Ut(e) is also invertible. We consider the H(X)- valued process defined by the stochastic integral Zt(e) = lt d v U2(e)(U2(e))-I. (21) The existence of the stochastic integral follows from the inequality N((U;;(e))-I) < 1/(1 - e), because N((U;;(e))-I) < 1/(1 - e), and (Ut(e))-l is a right stochastic semigroup with bounded jumps; hence it has all moments. We note that Zt(e) - Zs(e) = it d v U2(e)(U2(e))-1 = it d v U;(e)U2(e)( UsO(e))-1 (U;(e))-I = it d v U; ( e )( U; ( e )) -I , i.e., Zt(e) - Zs(e) is s-measurable. Therefore, Zt(e) is a process with independent increments. The following equation for Ut(e) follows from (21): U:(e) = it dZv(e)UtS(e). Note that U t S (e) = E?(e) Ut(e)(E(e) )-1. By assumption, Ut(e) is a semimartingale (as a function of t), and Ut(e) is a martingale; therefore, Ep(e) is a function of bounded variation: dUt(e) = dE?(e)UtS(e)(E(e))-1 + E?(e) dZte UtS(e)(E(e))-1 = dEp(e)(E?(e) )-1 Ul(e) + Ep(e) dZt(E?(e))-I. 
296 IV. LINEAR STOCHASTIC EQUATIONS IN HILBERT SPACE Let A (t) = fot dE2(e)(E2(e))-I, Z(e) = fot E2(e) dZ(E2(e))-I. Then dUt(e) = d( A (t) + Zp)Ut(e). Let Zt = A (t) + Z + ! Yv([O, t] x dY). Since Zt - Zs is measurable with respect to g;s, Zt is a process with inde- pendent increments. We consider the expression I + it dZvU:. (22) If, is the first time after s that p(U;,I) > e, then dZ v = d A (v) + dZ and U = U(e) for v < ,. Hence, (22) coincides with Ul for t < ,. Let 'k = t. Then (22) can be written in the form e U:t_ + (Z - Z_)U_ = (I + Z - Z1:t_)U_ = U;k- U_ = U;t. k k k k k k k k k k Similarly, I + it dZvU:) = utI<  for t < 'k+l. This proves that Ut is the solution of (12) with the process Zt. 0 2. Strong stochastic semigroups with second moments For the solutions of linear stochastic equations to be representable in the form x(t) = UtXo it is not necessary that the random operator U be bounded, but it is necessary that it can be applied to the elements of X. Here we consider some natural generalizations of the concept of a random operator. 2.1. Strong and weak random operators. The usual random operator is a mapping of a probability space {Q,sr, P} into L(X) that is measur- able with respect to the a-algebra go generated by the sets {A E L(X): (Ax,y) < a}, where a E Rand x,y E X are arbitrary. In contrast to the a-algebra g(L(X)) of all Borel sets, go is countably generated. Let Z(w) be a random operator. Denote by Q(X) the linear space of X- valued variables on {Q,sr, P}, equipped with the topology of convergence 
in probability. The random operator $Z(\omega)$ gives rise to a continuous linear mapping of $X$ into $\Omega(X)$.

Let $U$ be a continuous mapping of $X$ into $\Omega(X)$, i.e., associated with each $x\in X$ is a random $X$-valued variable $Ux$, and:

1) $P\{U(\alpha x+\beta y)=\alpha U(x)+\beta U(y)\}=1$ for all $\alpha,\beta\in R$ and $x,y\in X$;

2) $Ux$ is bounded in probability for $|x|\le1$.

Then $U$ is said to be a strong random operator on $X$. In particular, the mapping $Ux=Z(\omega)x$ from $X$ to $\Omega(X)$ satisfies these conditions if $Z(\omega)$ is a random operator. In this case we identify $U$ and $Z(\omega)$: $Z(\omega)=U$.

For an example of a strong random operator that is not a random operator, consider an operator $U$ with $Ue_k=\xi_ke_k$ for some orthonormal basis $\{e_k\}$, where the $\xi_k$ are independent normal $(0,1)$ variables. Then $U(x)=\sum_k\xi_k(x,e_k)e_k$ is defined for all $x$, since $\sum_k\xi_k^2(x,e_k)^2<\infty$ with probability 1 for all $x\in X$. On the other hand, $P\{\sup_k|\xi_k|=+\infty\}=1$, so that $|(Ux,x)|$ is not bounded by a random variable.

We denote the space of random operators by $L(\Omega,X)$, and the space of strong random operators by $L_s(\Omega,X)$. If $U\in L_s(\Omega,X)$, then for all $x,y\in X$ the random variable $(Ux,y)$ is defined, and it has the properties:

3) for all $\alpha,\beta\in R$ and $x,y,z\in X$
$$P\{(Ux,\alpha y+\beta z)=\alpha(Ux,y)+\beta(Ux,z)\}=P\{(U(\alpha x+\beta y),z)=\alpha(Ux,z)+\beta(Uy,z)\}=1;$$

4) $(Ux,y)$ is bounded in probability for $|x|\le1$ and $|y|\le1$.

Let $(Ux,y)$: $\Omega\times X\times X\to R$ be a mapping such that conditions 3) and 4) hold. Then $U$ is said to be a weak random operator. The space of weak random operators is denoted by $L_w(\Omega,X)$. If a weak random operator is generated by some strong random operator, then these two operators will be identified. Therefore,
$$L(\Omega,X)\subset L_s(\Omega,X)\subset L_w(\Omega,X).$$
We show that in the second case there is also strict inclusion. Let $\xi_k$ be a sequence of independent normal $(0,1)$-variables. Define
$$(Ux,y)=\sum_{k=1}^{\infty}\xi_k(x,e_1)(y,e_k).$$
The series on the right-hand side converges with probability 1, since
$$\sum_{k=1}^{\infty}E(\xi_k(x,e_1)(y,e_k))^2=(x,e_1)^2\sum_{k=1}^{\infty}(y,e_k)^2=(x,e_1)^2|y|^2<\infty.$$
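The diagonal example $Ue_k=\xi_ke_k$ given earlier in this subsection can be explored numerically: for a fixed $x$ the squared norm $\sum_k\xi_k^2(x,e_k)^2$ settles down as more coordinates are included, while $\max_{k\le n}|\xi_k|$, a lower bound for any would-be operator norm, keeps growing with $n$. The sketch below is a finite truncation with assumptions of its own: the vector $x$ has coordinates $(x,e_k)=1/k$, and the seed and truncation levels are arbitrary.

```python
import random

rng = random.Random(3)
num_modes = 100000
xi = [rng.gauss(0.0, 1.0) for _ in range(num_modes)]   # xi_k ~ N(0, 1), independent
x = [1.0 / (k + 1) for k in range(num_modes)]          # assumed coordinates (x, e_k) = 1/k


def norm_sq_ux(n):
    """|U x|^2 truncated to the first n coordinates: sum of xi_k^2 (x, e_k)^2."""
    return sum((xi[k] * x[k]) ** 2 for k in range(n))


def running_sup(n):
    """max over k <= n of |xi_k|."""
    return max(abs(v) for v in xi[:n])


levels = (100, 10000, 100000)
partial_norms = [norm_sq_ux(n) for n in levels]
sups = [running_sup(n) for n in levels]
print(partial_norms, sups)
```

The partial norms change very little between the last two truncation levels, whereas the running maximum of $|\xi_k|$ is unbounded in the limit (it grows roughly like $\sqrt{2\log n}$ for independent standard normal variables).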
But $Ux=\sum_{k=1}^{\infty}\xi_k(x,e_1)e_k$ is not a random variable in $X$, because $\sum_k\xi_k^2=+\infty$ with probability 1.

Thus, a random operator is an operator in $L(X)$ for each $\omega$; for a strong operator only its value on each $x\in X$ is given for each $\omega$, while for a weak operator only the inner product of the value of the operator on an arbitrary $x\in X$ with any other element $y\in X$ is given.

The most typical representative of the space $L_w(\Omega,X)$ is a Gaussian operator white noise $W$, for which
$$(Wx,y)=\sum_{i,j}\xi_{ij}(x,e_i)(y,e_j),$$
where the $\xi_{ij}$ are independent normal $(0,1)$-variables. The value of $W$ on any $x$ is given by a Gaussian white noise in $X$, and $Wx$ and $Wy$ are independent if $(x,y)=0$.

If $U\in L_w(\Omega,X)$, then the operator $U^*$ acting according to the formula $(U^*x,y)=(Uy,x)$ is defined and is in $L_w(\Omega,X)$. It is called the weak operator adjoint to the weak operator $U$. The operation of taking the adjoint can lead out of the space $L_s(\Omega,X)$, as shown by the following example: if
$$(Ux,y)=\sum_{k=1}^{\infty}\xi_k(x,e_1)(y,e_k),$$
where the $\xi_k$ are independent and normal $(0,1)$, then
$$U^*y=\Big[\sum_{k=1}^{\infty}\xi_k(y,e_k)\Big]e_1$$
is a strong random operator, while $U$ is not.

The naturalness of considering spaces of strong and weak operators is seen from the following assertions (they are all proved in Skorokhod's book [2], Chapter 1).

I. Let $U_n$ be a sequence of operators in $L_s(\Omega,X)$ such that for all $x\in X$ the limit $Ux=\lim U_nx$ exists in the sense of convergence in probability. Then $U\in L_s(\Omega,X)$.

II. Suppose that the sequence $U_n\in L_w(\Omega,X)$ is such that for all $x,y\in X$ the limit $(Ux,y)=\lim_{n\to\infty}(U_nx,y)$ exists in the sense of convergence in probability. Then $U\in L_w(\Omega,X)$.

The convergence of random operators in assertion I is said to be strong, and that in assertion II to be weak. We note that $L(\Omega,X)$ is dense in $L_s(\Omega,X)$ in the strong convergence, and in $L_w(\Omega,X)$ in the weak convergence. To see this we need the following assertions.
III. Let $U \in L_w(\Omega, X)$. Then $U \in L_s(\Omega, X)$ if and only if for all $x \in X$
\[ P\Bigl\{ \sum_{k} (Ux,e_k)^2 < \infty \Bigr\} = 1. \]

IV. Let $U \in L_w(\Omega, X)$. Then $U \in L(\Omega, X)$ if
\[ P\Bigl\{ \sum_{k,l} (Ue_l,e_k)^2 < \infty \Bigr\} = 1. \]

Let $U \in L_s(\Omega, X)$. We consider the operator $UP_n$, where $P_n$ is the operator of projection onto the subspace generated by the vectors $e_1, \dots, e_n$. This product is defined as a weak operator by means of the equality
\[ (UP_n x, y) = \sum_{k=1}^{n} (Ue_k, y)(x, e_k). \]
According to assertions III and IV this is an operator in $L(\Omega, X)$, since with probability 1
\[ \sum_{k,i} (UP_n e_i, e_k)^2 = \sum_{i=1}^{n} \sum_{k=1}^{\infty} (Ue_i, e_k)^2 < \infty. \]
Because $P_n x \to x$ in $X$, it follows that $UP_n x \to Ux$ in probability for all $x$.

Suppose now that $U \in L_w(\Omega, X)$. We define the operator $P_n U P_n$ as a weak operator by the equality $(P_n U P_n x, y) = (U P_n x, P_n y)$. Then
\[ \sum_{k,i} (P_n U P_n e_i, e_k)^2 = \sum_{k,i \le n} (Ue_i, e_k)^2 < \infty. \]
Consequently, $P_n U P_n \in L(\Omega, X)$. Moreover, $(P_n U P_n x, y) = (U P_n x, P_n y) \to (Ux, y)$ in probability, since $P_n x \to x$ and $P_n y \to y$.

We consider moments of random operators.

V. If $E(Ux,y)$ is defined for all $x, y \in X$ and for $U \in L_w(\Omega, X)$, then it is a bounded bilinear form, and hence there exists an operator $A \in L(X)$ such that $E(Ux,y) = (Ax,y)$.

We write $A = EU$ and call this operator the first moment of the random operator. For a $U \in L_s(\Omega, X)$ the operator $U^*U \in L_w(\Omega, X)$ is determined by the equality $(U^*Ux, y) = (Ux, Uy)$. If $EU^*U$ is defined, then this operator is called the second moment of the strong random operator, and it is a
nonnegative symmetric operator. If it is defined, then the second moment form $V(C) = EU^*CU$ is defined, where the weak operator $U^*CU$ is determined by the equality $(U^*CUx, y) = (CUx, Uy)$, and
\[ E|(CUx, Uy)| \le \|C\| \sqrt{E|Ux|^2 \, E|Uy|^2} \le \|C\| \cdot \|EU^*U\| \, |x| \, |y|. \]

Denote by $L_s^{(2)}(\Omega, X)$ the space of $U \in L_s(\Omega, X)$ such that $EU^*U$ is defined. We introduce in $L_s^{(2)}(\Omega, X)$ the following convergence: $U_n \to U$ if $U_n x \to Ux$ in the mean square for all $x$, i.e.,
\[ \lim_{n\to\infty} E|U_n x - Ux|^2 = 0, \qquad x \in X. \]
A sequence is a Cauchy sequence in $L_s^{(2)}(\Omega, X)$ if
\[ \lim_{n,m\to\infty} E|U_n x - U_m x|^2 = 0. \]
It is easy to see that $L_s^{(2)}(\Omega, X)$ is complete in this convergence. A stronger topology can be introduced in $L_s^{(2)}(\Omega, X)$ by means of the norm
\[ \|U\|_s^2 = \sup_{|x| \le 1} E|Ux|^2 = \|EU^*U\|. \tag{23} \]
In this norm $L_s^{(2)}(\Omega, X)$ is a nonseparable Banach space (it contains $L(X)$ as a subset, and there the norm coincides with the usual operator norm).

2.2. Processes with independent increments that are continuous in $\|\cdot\|_s$. We consider a process $Z_t$, $t \in R_+$, with values in $L_s^{(2)}(\Omega, X)$. It is called a process with independent increments if $Z_t x$ is a process with independent increments for each $x \in X$. It will be assumed that the process has the following continuity property:
\[ \lim_{s \to t} \|Z_t - Z_s\|_s = 0. \tag{24} \]
Two moment functions are associated with $Z_t$. Let $A_t = EZ_t$; the existence of $EZ_t$ follows from that of $E|Z_t x|^2$ for all $x$. Condition (24) implies the continuity of $A_t$ in the operator norm:
\[ \|A_t - A_s\| = \|EZ_t - EZ_s\| = \sup_{|x| \le 1} |E(Z_t - Z_s)x| \le \sup_{|x| \le 1} \bigl( E|(Z_t - Z_s)x|^2 \bigr)^{1/2} = \|Z_t - Z_s\|_s. \]
The second moment function is defined by
\[ B_t(C) = EZ_t^* C Z_t; \]
$B_t(C)$ is a bounded linear operator from $L(X)$ to itself for $t \in R_+$. Denote by $L_+(X)$ the cone of nonnegative operators. Then $B_t(\cdot)$ carries $L_+(X)$ into itself: for $C \in L_+(X)$
\[ (B_t(C)x, y) = E(CZ_t x, Z_t y) = E(CZ_t y, Z_t x) = (B_t(C)y, x), \qquad (B_t(C)x, x) = E(CZ_t x, Z_t x) \ge 0. \]
The process $Z_t - A_t = \bar Z_t$ is a martingale (this means that $\bar Z_t x$ is an $X$-valued martingale for every $x \in X$). Let $\bar B_t(C)$ be the second moment form for $\bar Z_t$:
\[ \bar B_t(C) = E\bar Z_t^* C \bar Z_t = E(Z_t - A_t)^* C (Z_t - A_t) = B_t(C) - A_t^* C A_t. \]
The functions $B_t(\cdot)$ and $\bar B_t(\cdot)$ are continuous with respect to $t$ in the norm of $L(L(X))$. Indeed, for $C \in L(X)$
\[ \begin{aligned} \|B_t(C) - B_s(C)\| &= \sup_{|x| \le 1,\, |y| \le 1} |(B_t(C)x, y) - (B_s(C)x, y)| \\ &= \sup_{|x| \le 1,\, |y| \le 1} |E(CZ_t x, Z_t y) - E(CZ_s x, Z_s y)| \\ &\le \sup_{|x| \le 1,\, |y| \le 1} \bigl[ E|(CZ_t x, Z_t y - Z_s y)| + E|(CZ_t x - CZ_s x, Z_s y)| \bigr] \\ &\le \|C\| \bigl[ \|Z_t\|_s \cdot \|Z_t - Z_s\|_s + \|Z_s\|_s \cdot \|Z_t - Z_s\|_s \bigr]. \end{aligned} \]
Moreover,
\[ \|A_t^* C A_t - A_s^* C A_s\| \le \|C\| \, \|A_t\| \cdot \|A_t - A_s\| + \|C\| \cdot \|A_s\| \cdot \|A_t - A_s\|. \]
Hence, $\bar B_t(C)$ is continuous in the norm of $L(L(X))$.

The function $\bar B_t(C)$ is monotonically increasing in $t$ for $C \in L_+(X)$ (in the sense of the order in the cone $L_+(X)$). Indeed, for $s > t$
\[ ([\bar B_s(C) - \bar B_t(C)]x, x) = E(C\bar Z_s x, \bar Z_s x) - E(C\bar Z_t x, \bar Z_t x) = E(C(\bar Z_s - \bar Z_t)x, (\bar Z_s - \bar Z_t)x) \ge 0, \]
because $\bar Z_t$ is a martingale.

We construct a stochastic integral of operator-valued functions with respect to the martingale $\bar Z_t$ in the norm $\|\cdot\|_s$. Note that for all $C \in L_+(X)$ and $s < t$
\[ \bar B_t(C) - \bar B_s(C) \le \|C\| \bigl( \bar B_t(I) - \bar B_s(I) \bigr). \tag{25} \]
We now define the integral
\[ \int_0^t \bar B_{ds}(C_s) = \lim_{\max \Delta s_k \to 0} \sum_k \bigl[ \bar B_{s_{k+1}}(C_{s_k}) - \bar B_{s_k}(C_{s_k}) \bigr], \tag{26} \]
where the function $C_s$ takes values in $L_+(X)$, and $0 = s_0 < s_1 < \dots < s_n = t$. The limit on the right-hand side of (26) is defined, for example, for functions $C_s$ that are norm continuous in $L(X)$, and the integral satisfies the inequality
\[ \int_0^t \bar B_{ds}(C_s) \le \int_0^t \|C_s\| \, d\bar B_s(I) \le \sup_{s \le t} \|C_s\| \, \bar B_t(I) \tag{27} \]
in $L_+(X)$. (These assertions are easy to prove with the help of inequality (25).) This implies that the integral (26) exists also for piecewise continuous bounded functions.

We use the integral with respect to $\bar B_s$ for constructing the stochastic integral with respect to $\bar Z_t$. Denote by $\mathscr{F}_t$ the $\sigma$-algebra generated by $\{(Z_s x, y),\ s \le t;\ x, y \in X\}$. We consider functions $\Phi(t)$ taking values in $L_s^{(2)}(\Omega, X)$ and adapted to the flow $\mathscr{F}_t$. Our goal is to construct the integral $\int \Phi(s) \, d\bar Z_s$. Assume that $\Phi(s)$ is a step function. Then it is natural to set
\[ \int_0^t \Phi(s) \, d\bar Z_s = \sum_{k=0}^{n-1} \Phi(s_k) [\bar Z_{s_{k+1}} - \bar Z_{s_k}], \tag{28} \]
where $0 = s_0 < s_1 < \dots < s_n = t$, and $\Phi(s) = \Phi(s_k)$ for $s_k \le s < s_{k+1}$. The product $\Phi(s_k)[\bar Z_{s_{k+1}} - \bar Z_{s_k}]$ is defined as the product of two independent random operators ($\bar Z_{s_{k+1}} - \bar Z_{s_k}$ is independent of $\Phi(s_k)$, because it is independent of $\mathscr{F}_{s_k}$, and $\Phi(s_k)$ is $\mathscr{F}_{s_k}$-measurable) as follows.

Let $\Phi$ and $Z$ be two independent strong operators. Then $\Phi x$ is a stochastically continuous function of $x$, and hence it can be assumed to be measurable with respect to $\mathscr{F} \otimes \mathscr{B}_X$. But then any $X$-valued random variable can be substituted for $x$ in $\Phi x$; in particular, $Zx$ can be substituted. This defines $\Phi Z x$.

Let $\Phi, Z \in L_s^{(2)}(\Omega, X)$, $B_\Phi(C) = E\Phi^* C \Phi$, and $B_Z(C) = EZ^* C Z$. Then
\[ E(C\Phi Zx, \Phi Zy) = E\,E\bigl( (C\Phi Zx, \Phi Zy) \mid Zx, Zy \bigr) = E(B_\Phi(C) Zx, Zy) = (B_Z(B_\Phi(C))x, y). \]
Thus, in this case both $\Phi Z \in L_s^{(2)}(\Omega, X)$ and $B_{\Phi Z}(C) = B_Z(B_\Phi(C))$; in particular, $E(\Phi Z)^*(\Phi Z) = B_Z(E\Phi^*\Phi)$.
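The composition rule for second moment forms of independent operators can be checked exactly in a finite-dimensional model. The sketch below is only an illustration under assumptions that are not in the text: $X = R^2$, real matrices (so the adjoint is the transpose), and two independent random matrices with finite support whose values and probabilities are chosen arbitrarily. The helper names `moment_form` and `product_law` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Adjoint of a real matrix."""
    return [list(row) for row in zip(*a)]


def moment_form(dist, c):
    """Second moment form E U* C U of a finitely supported random matrix U.

    dist is a list of (probability, matrix) pairs.
    """
    n = len(c)
    total = [[Fraction(0)] * n for _ in range(n)]
    for p, u in dist:
        term = matmul(transpose(u), matmul(c, u))
        total = [[total[i][j] + p * term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def product_law(phi_dist, z_dist):
    """Law of the product Phi Z when Phi and Z are independent."""
    return [(p * q, matmul(phi, z))
            for (p, phi), (q, z) in product(phi_dist, z_dist)]


# Arbitrary illustrative laws (assumptions, not taken from the text).
half, third = Fraction(1, 2), Fraction(1, 3)
phi_dist = [(half, [[1, 2], [0, 1]]), (half, [[0, -1], [3, 1]])]
z_dist = [(third, [[2, 0], [1, 1]]), (2 * third, [[1, -1], [0, 4]])]
C = [[2, 1], [1, 3]]
I2 = [[1, 0], [0, 1]]

lhs = moment_form(product_law(phi_dist, z_dist), C)   # form of Phi Z at C
rhs = moment_form(z_dist, moment_form(phi_dist, C))   # B_Z(B_Phi(C))
print(lhs == rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding; taking `C = I2` gives the special case $E(\Phi Z)^*(\Phi Z) = B_Z(E\Phi^*\Phi)$.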
Using the fact that $\bar Z_s$ is a martingale, along with the preceding formula, we find that
\[ \begin{aligned} E \Bigl( \sum_{k=0}^{n-1} \Phi(s_k)[\bar Z_{s_{k+1}} - \bar Z_{s_k}] \Bigr)^* \Bigl( \sum_{k=0}^{n-1} \Phi(s_k)[\bar Z_{s_{k+1}} - \bar Z_{s_k}] \Bigr) &= \sum_{k=0}^{n-1} \bigl( \bar B_{s_{k+1}}(E\Phi^*(s_k)\Phi(s_k)) - \bar B_{s_k}(E\Phi^*(s_k)\Phi(s_k)) \bigr) \\ &= \int_0^t \bar B_{ds}(E\Phi^*(s)\Phi(s)) \end{aligned} \]
(the integral on the right-hand side exists as the integral of a piecewise continuous function).

Suppose now that $\Phi_n(s)$ is a sequence of step functions such that
\[ \sup_s \| E[\Phi_n(s) - \Phi(s)]^* [\Phi_n(s) - \Phi(s)] \| \to 0. \]
Then
\[ \lim_{n,m\to\infty} \sup_s \| E[\Phi_n(s) - \Phi_m(s)]^* [\Phi_n(s) - \Phi_m(s)] \| = 0, \]
and so
\[ \begin{aligned} \Bigl\| \int_0^t \Phi_n(s) \, d\bar Z_s - \int_0^t \Phi_m(s) \, d\bar Z_s \Bigr\|_s^2 &= \Bigl\| \int_0^t [\Phi_n(s) - \Phi_m(s)] \, d\bar Z_s \Bigr\|_s^2 \\ &= \Bigl\| \int_0^t \bar B_{ds}\bigl( E[\Phi_n(s) - \Phi_m(s)]^* [\Phi_n(s) - \Phi_m(s)] \bigr) \Bigr\| \\ &\le \sup_{s \le t} \| E[\Phi_n(s) - \Phi_m(s)]^* [\Phi_n(s) - \Phi_m(s)] \| \cdot \|\bar B_t(I)\|. \end{aligned} \]
Consequently, the sequence of random variables $\int_0^t \Phi_n(s) \, d\bar Z(s)$ converges in $L_s^{(2)}(\Omega, X)$ to some random variable, call it $\int_0^t \Phi(s) \, d\bar Z(s)$, in the same space. This implies, in particular, the existence of the integral for a function $\Phi(s)$ that is continuous with respect to $s$ in $L_s^{(2)}(\Omega, X)$. The equality
\[ E \Bigl[ \int_0^t \Phi(s) \, d\bar Z(s) \Bigr]^* \int_0^t \Phi(s) \, d\bar Z(s) = \int_0^t \bar B_{ds}(E\Phi^*(s)\Phi(s)) \tag{29} \]
is clearly preserved for the stochastic integral when we pass to the limit.

Let $A(t)$, $t \ge 0$, be a function with values in $L(X)$. It is said to be of strongly bounded variation if: 1) $A(t)x$, $x \in X$, has bounded variation as
a function of $t$ on each finite interval, and 2) for all $t > 0$ there exists a function $\gamma_t(h)$, $h > 0$, such that $\gamma_t(h) \downarrow 0$ as $h \downarrow 0$, and for $0 \le u < u + h \le t$
\[ \operatorname*{var}_{u \le s \le u+h} A(s)x \le \gamma_t(h) |x|. \]

Denote by $F$ the set of functions $\Phi(s)$, $s \ge 0$, with values in $L_s^{(2)}(\Omega, X)$ for which there exists a sequence $\Phi_n(s)$ of step functions with values in $L_s^{(2)}(\Omega, X)$ such that for all $t > 0$
\[ \lim_{n\to\infty} \sup_{s \le t} \| E[\Phi_n(s) - \Phi(s)]^* [\Phi_n(s) - \Phi(s)] \| = 0. \tag{30} \]
We define the integral
\[ \int_0^t \Phi(s) \, dA(s) = \lim_{n\to\infty} \sum_{k=0}^{n-1} \Phi_n(s_k) [A(s_{k+1}) - A(s_k)], \tag{31} \]
where $0 = s_0 < s_1 < \dots < s_n = t$, and $\Phi_n(s) = \Phi_n(s_k)$ for $s_k \le s < s_{k+1}$; the limit in (31) is understood as the limit in $L_s^{(2)}(\Omega, X)$. The existence of this limit follows from the estimate
\[ E \Bigl| \sum_{k=0}^{n-1} \Phi_n(s_k) [A(s_{k+1}) - A(s_k)] x \Bigr| \le \sup_{s \le t} \| E\Phi_n^*(s)\Phi_n(s) \|^{1/2} \operatorname*{var}_{0 \le s \le t} A(s)x \le \sup_{s \le t} \| E\Phi_n^*(s)\Phi_n(s) \|^{1/2} \gamma_t(t) |x|, \tag{32} \]
which implies, in particular, that
\[ E \Bigl| \int_0^t [\Phi_n(s) - \Phi_m(s)] \, dA(s) x \Bigr| \le \sup_{s \le t} \| E(\Phi_n(s) - \Phi_m(s))^* (\Phi_n(s) - \Phi_m(s)) \|^{1/2} \gamma_t(t) |x|, \]
and hence the variables in (31) to the right of the limit sign form a Cauchy sequence in $L_s^{(2)}(\Omega, X)$. Using estimates of the form (32) and passing to the limit, we see that
\[ \begin{aligned} E \Bigl| \int_u^{u+h} \Phi(s) \, dA(s) x \Bigr| &\le \sup_{u \le s \le u+h} \| E\Phi^*(s)\Phi(s) \|^{1/2} \gamma_t(h) |x|, \\ \Bigl\| E \Bigl( \int_u^{u+h} \Phi(s) \, dA(s) \Bigr)^* \Bigl( \int_u^{u+h} \Phi(s) \, dA(s) \Bigr) \Bigr\| &\le \sup_{u \le s \le u+h} \| E\Phi^*(s)\Phi(s) \| \, \gamma_t^2(h). \end{aligned} \tag{33} \]
2.3. A stochastic differential equation. We consider a stochastic linear differential equation of the form
\[ dU_t = U_t \, dZ_t, \tag{34} \]
where $Z_t = A(t) + \bar Z_t$ is a process with independent increments and with values in $L_s^{(2)}(\Omega, X)$ for which $A(t)$ is a nonrandom function with values in $L(X)$, $\bar Z_t$ is a martingale with values in $L_s^{(2)}(\Omega, X)$, and $A(t)$ and $\bar Z_t$ satisfy the conditions of 2.2. Therefore, it is possible to construct a stochastic integral with respect to the process $Z_t$ as the sum of the integrals with respect to $A(t)$ and $\bar Z(t)$. The main condition imposed on $Z_t$ is included in the following:

1) $A(t)$ has strongly bounded variation, and the function $\bar B_t(I)$ is norm continuous in $L(X)$.

It follows from this condition that for every function $\Phi$ satisfying (30)
\[ \Bigl\| E \Bigl( \int_u^{u+h} \Phi(s) \, dZ(s) \Bigr)^* \Bigl( \int_u^{u+h} \Phi(s) \, dZ(s) \Bigr) \Bigr\| \le \lambda(h) \sup_{u \le s \le u+h} \| E\Phi^*(s)\Phi(s) \|. \tag{35} \]

THEOREM 5. If assumption 1) holds, then (34) has a unique solution that has the initial condition $U_0 = I$, belongs to $L_s^{(2)}(\Omega, X)$, and satisfies $\sup_{s \le t} \| EU_s^* U_s \| < \infty$ for all $t > 0$; moreover, for the functions $E_t = EU_t$ and $V_t(C) = EU_t^* C U_t$
\[ \operatorname*{var}_{t \le s \le t+h} E_s x \le \rho_t(h) |x|, \qquad \operatorname*{var}_{t \le s \le t+h} (EU_s^* C U_s x, x) \le \|C\| \rho_t(h) |x|^2, \tag{36} \]
where $\rho_t(h)$ is an increasing function of $t$ and $h$ such that $\rho_t(0) = 0$.

PROOF. Uniqueness. If $\tilde U_t$ is another solution of (34), then
\[ U_t - \tilde U_t = \int_0^t [U_s - \tilde U_s] \, dZ_s, \]
and, on the basis of condition 1),
\[ \| E(U_t - \tilde U_t)^* (U_t - \tilde U_t) \| \le \lambda(t) \sup_{s \le t} \| E(U_s - \tilde U_s)^* (U_s - \tilde U_s) \|. \]
Let $h$ be such that $\lambda(h) < 1$. Then
\[ \sup_{t \le h} \| E(U_t - \tilde U_t)^* (U_t - \tilde U_t) \| \le \lambda(h) \sup_{t \le h} \sup_{s \le t} \| E(U_s - \tilde U_s)^* (U_s - \tilde U_s) \|. \]
Hence,
\[ \sup_{t \le h} \| E(U_t - \tilde U_t)^* (U_t - \tilde U_t) \| \le \lambda(h) \sup_{t \le h} \| E(U_t - \tilde U_t)^* (U_t - \tilde U_t) \|, \]
which implies that $U_t = \tilde U_t$ for $t \le h$. Analogously, starting from the equality
\[ U_t - \tilde U_t = \int_h^t (U_s - \tilde U_s) \, dZ_s, \]
which is valid for $t > h$, we see that $U_t = \tilde U_t$ for $t \le 2h$, and so on.

Existence. We consider the iterated integrals
\[ W_n(t) = \int_0^t W_{n-1}(s) \, dZ_s, \qquad W_1(t) = Z_t. \]
Assume that $W_{n-1}(t)$ is defined and
\[ \sup_{s \le t} \| EW_{n-1}^*(s) W_{n-1}(s) \| < \infty. \]
In this case if the process $W_n(t)$ is defined, then
\[ \begin{aligned} \sup_{s \le t} \| EW_n^*(s) W_n(s) \| &\le \sup_{s \le t} \Bigl\| E \Bigl( \int_0^s W_{n-1}(u) \, dZ_u \Bigr)^* \int_0^s W_{n-1}(u) \, dZ_u \Bigr\| \\ &\le \sup_{s \le t} \lambda(s) \cdot \sup_{u \le s} \| EW_{n-1}^*(u) W_{n-1}(u) \| \\ &\le \lambda(t) \sup_{s \le t} \| EW_{n-1}^*(s) W_{n-1}(s) \|, \end{aligned} \]
\[ \| E(W_n^*(t+h) - W_n^*(t))(W_n(t+h) - W_n(t)) \| \le \lambda(h) \sup_{s \le t+h} \| EW_{n-1}^*(s) W_{n-1}(s) \|. \tag{37} \]
Therefore, the process $W_n(t)$ is continuous in $L_s^{(2)}(\Omega, X)$, and hence the process $W_{n+1}(t)$ is defined. Since $W_1(t)$ is continuous in $L_s^{(2)}(\Omega, X)$, $W_2(t)$ is defined and continuous in $L_s^{(2)}(\Omega, X)$, and so on. The preceding estimates imply
\[ \sup_{s \le t} \| EW_n^*(s) W_n(s) \| \le \lambda^{n-1}(t) \sup_{s \le t} \| EZ_s^* Z_s \| \le \lambda^n(t) \tag{38} \]
(it follows from (35) that $\| EZ_s^* Z_s \| \le \lambda(s)$; this inequality is obtained if we set $\Phi = I$).

Let $h$ be such that $\lambda(h) < 1$. Then the series $\sum W_n(t)x$ converges in $L_s^{(2)}(\Omega, X)$ for $t \le h$. Since $\| EW_n^*(t) W_n(t) \|^{1/2} \le \lambda^{n/2}(h)$ and $\lambda(h) < 1$, the series determines an operator in $L_s^{(2)}(\Omega, X)$. Let
\[ U_t = I + \sum_{n=1}^{\infty} W_n(t). \]
Using (37) and (38), we can see that for $0 \le t < t + \delta \le h$
\[ \| E(U_{t+\delta} - U_t)^* (U_{t+\delta} - U_t) \|^{1/2} \le \frac{\lambda^{1/2}(\delta)}{1 - \lambda^{1/2}(h)}; \]
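therefore, the integral
\[ \int_0^t U_s \, dZ_s = Z_t + \sum_{n=1}^{\infty} \int_0^t W_n(s) \, dZ_s \]
is defined. It follows from the last equality that $U_t$ satisfies (34) for $t \le h$. Analogously, for every $k$ we can determine the solution $U_t^{(k)}$ of the equation
\[ U_t^{(k)} - I = \int_{kh}^t U_s^{(k)} \, dZ_s, \qquad kh \le t \le (k+1)h. \]
Setting $U_t = U_h^{(0)} U_{2h}^{(1)} \cdots U_{kh}^{(k-1)} U_t^{(k)}$ for $t \in [kh, (k+1)h]$, we get the solution of the equation for all $t > 0$. Since the operators $U_h^{(0)}, \dots, U_{kh}^{(k-1)}, U_t^{(k)}$ are independent, it follows that
\[ EU_t^* U_t = E\, U_t^{(k)*} \cdots U_h^{(0)*} U_h^{(0)} \cdots U_t^{(k)} = B_t^{(k)} \bigl( B_{kh}^{(k-1)} ( \cdots ( B_h^{(0)}(I) ) \cdots ) \bigr), \]
where $B_t^{(i)}(C) = EU_t^{(i)*} C U_t^{(i)}$ for $t \in [ih, (i+1)h]$; hence, $U_t \in L_s^{(2)}(\Omega, X)$, and $\| EU_t^* U_t \|$ is locally bounded. Further,
\[ E_t x = x + \int_0^t E_s \, dA(s) x, \]
and therefore,
\[ \operatorname*{var}_{0 \le s \le t} E_s x \le \sup_{0 \le s \le t} \|E_s\| \operatorname*{var}_{0 \le s \le t} A(s)x \le \sup_{0 \le s \le t} \|E_s\| \, \gamma_0(t) |x|, \]
\[ \begin{aligned} \operatorname*{var}_{0 \le s \le t} (EU_s^* C U_s x, x) &= \operatorname*{var}_{0 \le s \le t} E \Bigl( C \Bigl[ x + \int_0^s U_u \, dZ_u x \Bigr], x + \int_0^s U_u \, dZ_u x \Bigr) \\ &= \operatorname*{var}_{0 \le s \le t} \Bigl[ (Cx, x) + \Bigl( Cx, \int_0^s E_u \, dA_u x \Bigr) + \Bigl( C \int_0^s E_u \, dA_u x, x \Bigr) + E \Bigl( C \int_0^s U_u \, dZ_u x, \int_0^s U_u \, dZ_u x \Bigr) \Bigr]. \end{aligned} \]
Since the first of the inequalities (36) has already been established, it suffices to establish that the function
\[ E \Bigl( C \int_0^t U_s \, dZ_s x, \int_0^t U_s \, dZ_s x \Bigr) \]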
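is of bounded variation. Let $t = t_0 < t_1 < \dots < t_n = t + h$. Then
\[ \begin{aligned} &\sum_k \Bigl| E \Bigl( C \int_0^{t_k} U_s \, dZ_s x, \int_0^{t_k} U_s \, dZ_s x \Bigr) - E \Bigl( C \int_0^{t_{k-1}} U_s \, dZ_s x, \int_0^{t_{k-1}} U_s \, dZ_s x \Bigr) \Bigr| \\ &\quad \le \sum_k \Bigl\{ \Bigl| E \Bigl( C \int_{t_{k-1}}^{t_k} U_s \, dZ_s x, \int_{t_{k-1}}^{t_k} U_s \, dZ_s x \Bigr) \Bigr| + \Bigl| E \Bigl( C \int_0^{t_{k-1}} U_s \, dZ_s x, \int_{t_{k-1}}^{t_k} U_s \, dZ_s x \Bigr) \Bigr| + \Bigl| E \Bigl( C \int_{t_{k-1}}^{t_k} U_s \, dZ_s x, \int_0^{t_{k-1}} U_s \, dZ_s x \Bigr) \Bigr| \Bigr\} \\ &\quad \le \|C\| \sum_k 2 \Bigl[ \Bigl( \int_{t_{k-1}}^{t_k} \bar B_{ds}(EU_s^* U_s) x, x \Bigr) + E \Bigl| \int_{t_{k-1}}^{t_k} U_s \, dA(s) x \Bigr|^2 + E \Bigl| \int_{t_{k-1}}^{t_k} U_s \, dA(s) x \Bigr| \, \Bigl| \int_0^{t_{k-1}} U_s \, dZ_s x \Bigr| \Bigr] \\ &\quad \le 2 \|C\| \sup_{s \le t+h} \| EU_s^* U_s \| \bigl\{ \| \bar B_{t+h}(I) - \bar B_t(I) \| + \gamma_t^2(h) + \gamma_0(t) \gamma_t(h) \bigr\} |x|^2. \end{aligned} \]
Relation (36) is proved. $\square$

REMARK 1. Let $Z_t$ be a homogeneous process with independent increments. Then $A(t) = tA$ and $\bar B_t(C) = tB(C)$, where $A \in L(X)$ and $B(\cdot) \in L(L(X))$. Therefore, condition 1) is obviously satisfied, and equation (34) has a solution. Denote by $U_t^s$ the solution of (34) for $t > s$ with initial condition $U_s^s = I$. Then $U_t^s$ is a homogeneous stochastic semigroup of operators in $L_s^{(2)}(\Omega, X)$ that has the following continuity condition: $\| E(U_t^s - I)^* (U_t^s - I) \| \to 0$ as $t \downarrow s$.

REMARK 2. If $U_t^s$ is the solution of (34) for $t > s$ with the initial condition $U_s^s = I$, then for all $s$ there is a unique solution in $L_s^{(2)}(\Omega, X)$, and the family $\{U_t^s,\ t \ge s \ge 0\}$ of operators forms a stochastic semigroup of operators in $L_s^{(2)}(\Omega, X)$ having the following property: there exists a function $\rho_t(h)$ that is increasing in $t$, satisfies $\rho_t(h) \downarrow 0$ as $h \to 0$, and is such that
\[ \operatorname*{var}_{s \le t \le s+h} EU_t^s x \le \rho_{s+h}(h) |x|, \qquad \operatorname*{var}_{s \le t \le s+h} E(CU_t^s x, U_t^s x) \le \|C\| \rho_{s+h}(h) |x|^2. \tag{39} \]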
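The stochastic semigroups in $L_s^{(2)}(\Omega, X)$ for which (39) holds will be called second-order stochastic semigroups of bounded variation.

2.4. Second-order stochastic semigroups of bounded variation. Let $\{U_t^s,\ 0 \le s \le t\}$ be a second-order semigroup of bounded variation. Define $E_t^s = EU_t^s$. Since
\[ \| E_t^s - I \| \le \sup_{|x| \le 1} \operatorname*{var}_{s \le u \le t} EU_u^s x \le \rho_t(t - s), \]
the nonrandom semigroup $\{E_t^s,\ 0 \le s \le t\}$ of operators in $L(X)$ is norm continuous. Let $\tilde U_t^s = E_s^0 U_t^s (E_t^0)^{-1}$ (the existence of the operator inverse to $E_t^0$ follows from norm continuity and the representation $E_t^0 = E_{t_1}^{0} E_{t_2}^{t_1} \cdots E_{t_k}^{t_{k-1}}$, since it is possible to choose $t_1, \dots, t_k$ such that all the operators $E_{t_i}^{t_{i-1}}$ have inverses). Obviously, $\tilde U_t^0$ is a martingale. Let
\[ \tilde V_t^s(C) = E\tilde U_t^{s*} C \tilde U_t^s = (E_t^{0*})^{-1} \, E\bigl[ U_t^{s*} E_s^{0*} C E_s^0 U_t^s \bigr] (E_t^0)^{-1} = (E_t^{0*})^{-1} V_t^s ( E_s^{0*} C E_s^0 ) (E_t^0)^{-1}, \]
where $V_t^s(C) = EU_t^{s*} C U_t^s$. Since
\[ \| V_t^s(C) - C \| \le \sup_{|x| \le 1} \operatorname*{var}_{s \le u \le t} E(CU_u^s x, U_u^s x) \le \|C\| \rho_t(t - s), \]
$V_t^s(C)$ is norm continuous as a function of $s$ and $t$. Hence, $\tilde V_t^s(C)$ is also norm continuous.

Let $0 = t_0 < t_1 < \dots < t_n = t$. We consider the variables
\[ \tilde Z_n(t) = \sum_{k=0}^{n-1} [\tilde U_{t_{k+1}}^{t_k} - I]. \]
It will be shown that the limit of the variables $\tilde Z_n(t)$ exists in $L_s^{(2)}(\Omega, X)$ as $\max \Delta t_k \to 0$. We first estimate the quantity
\[ E \Bigl( \tilde U_t^s - I - \sum_{i=0}^{m-1} [\tilde U_{s_{i+1}}^{s_i} - I] \Bigr)^* \Bigl( \tilde U_t^s - I - \sum_{i=0}^{m-1} [\tilde U_{s_{i+1}}^{s_i} - I] \Bigr), \]
where $s = s_0 < s_1 < \dots < s_m = t$. We have that
\[ \sum_{i=0}^{m-1} \tilde U_{s_i}^{s} (\tilde U_{s_{i+1}}^{s_i} - I) - \sum_{i=0}^{m-1} (\tilde U_{s_{i+1}}^{s_i} - I) = \sum_{i=0}^{m-1} (\tilde U_{s_i}^{s} - I)(\tilde U_{s_{i+1}}^{s_i} - I), \]
\[ \begin{aligned} E \sum_i (\tilde U_{s_{i+1}}^{s_i} - I)^* (\tilde U_{s_i}^{s} - I)^* (\tilde U_{s_i}^{s} - I)(\tilde U_{s_{i+1}}^{s_i} - I) &= \sum_i E (\tilde U_{s_{i+1}}^{s_i} - I)^* \bigl[ E(\tilde U_{s_i}^{s} - I)^* (\tilde U_{s_i}^{s} - I) \bigr] (\tilde U_{s_{i+1}}^{s_i} - I) \\ &= \sum_{i=0}^{m-1} \bigl[ \tilde V_{s_{i+1}}^{s_i} ( \tilde V_{s_i}^{s}(I) - I ) - \tilde V_{s_i}^{s}(I) + I \bigr]. \end{aligned} \]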
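Note that $\tilde V_t^s(C) - C = E(\tilde U_t^s - I)^* C (\tilde U_t^s - I) \ge 0$ for $C \ge 0$, and it depends monotonically on $C$; therefore,
\[ \tilde V_{s_{i+1}}^{s_i} ( \tilde V_{s_i}^{s}(I) - I ) - \tilde V_{s_i}^{s}(I) + I \le ( \tilde V_{s_{i+1}}^{s_i}(I) - I ) \, \| \tilde V_{s_i}^{s}(I) - I \|. \]
Further,
\[ \tilde V_{t+h}^{s}(I) - \tilde V_t^{s}(I) = \tilde V_{t+h}^{t} ( \tilde V_t^{s}(I) ) - \tilde V_t^{s}(I) \ge 0, \]
and hence
\[ \sum_{i=0}^{m-1} \bigl[ \tilde V_{s_{i+1}}^{s_i} ( \tilde V_{s_i}^{s}(I) - I ) - \tilde V_{s_i}^{s}(I) + I \bigr] \le \sup_i \| \tilde V_{s_i}^{s}(I) - I \| \sum_{i=0}^{m-1} ( \tilde V_{s_{i+1}}^{s_i}(I) - I ) = \| \tilde V_t^{s}(I) - I \| \sum_{i=0}^{m-1} ( \tilde V_{s_{i+1}}^{s_i}(I) - I ). \]
Let
\[ \tilde Z_m(t) = \sum_{k=0}^{n-1} \sum_{i=0}^{l_k - 1} \bigl[ \tilde U_{s_{i+1}^{(k)}}^{s_i^{(k)}} - I \bigr], \]
where $t_k = s_0^{(k)} < \dots < s_{l_k}^{(k)} = t_{k+1}$. Then, by what we have proved,
\[ \begin{aligned} E(\tilde Z_m(t) - \tilde Z_n(t))^* (\tilde Z_m(t) - \tilde Z_n(t)) &\le \sum_{k=0}^{n-1} \| \tilde V_{t_{k+1}}^{t_k}(I) - I \| \sum_{i=0}^{l_k - 1} \bigl( \tilde V_{s_{i+1}^{(k)}}^{s_i^{(k)}}(I) - I \bigr) \\ &\le \sup_k \| \tilde V_{t_{k+1}}^{t_k}(I) - I \| \sum_{k=0}^{n-1} \sum_{i=0}^{l_k - 1} \bigl( \tilde V_{s_{i+1}^{(k)}}^{s_i^{(k)}}(I) - I \bigr) \\ &\le \sup_k \| \tilde V_{t_{k+1}}^{t_k}(I) - I \| \sum_{k=0}^{n-1} \sum_{i=0}^{l_k - 1} \bigl( \tilde V_{s_{i+1}^{(k)}}^{s_i^{(k)}} ( \tilde V_{s_i^{(k)}}^{0}(I) ) - \tilde V_{s_i^{(k)}}^{0}(I) \bigr) \end{aligned} \]
(we have used the monotonicity of $\tilde V_{s_{i+1}}^{s_i}(C) - C$ and the fact that $\tilde V_{s_i^{(k)}}^{0}(I) \ge I$). Since
\[ \tilde V_{s_{i+1}^{(k)}}^{s_i^{(k)}} ( \tilde V_{s_i^{(k)}}^{0}(I) ) = \tilde V_{s_{i+1}^{(k)}}^{0}(I), \]
we get finally that
\[ E(\tilde Z_m(t) - \tilde Z_n(t))^* (\tilde Z_m(t) - \tilde Z_n(t)) \le \sup_k \| \tilde V_{t_{k+1}}^{t_k}(I) - I \| \cdot \| \tilde V_t^{0}(I) \|. \]
If $0 = t_0 < \dots < t_n = t$ and $0 = s_0 < s_1 < \dots < s_m = t$ are two arbitrary partitions of $[0, t]$, and $\tilde Z_n(t)$ and $\tilde Z_m(t)$ are constructed from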
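these partitions, then, taking a partition $0 = u_0 < u_1 < \dots < u_p = t$ whose points contain both the $t_i$ and the $s_j$, $i = 1, \dots, n$, $j = 1, \dots, m$, and estimating the differences $\tilde Z_n(t) - \tilde Z_p(t)$ and $\tilde Z_m(t) - \tilde Z_p(t)$ according to the preceding formula, where
\[ \tilde Z_p(t) = \sum_{i=0}^{p-1} [\tilde U_{u_{i+1}}^{u_i} - I], \]
we find that
\[ \| E(\tilde Z_n(t) - \tilde Z_m(t))^* (\tilde Z_n(t) - \tilde Z_m(t)) \| \le 2 \sup_{|s - r| \le h,\ s \le t} \| \tilde V_s^{r}(I) - I \| \cdot \| \tilde V_t^{0}(I) \|, \]
where $h = \max_{k,j} [\, |s_k - s_{k-1}|, |t_j - t_{j-1}| \,]$. Therefore, for all $t$ the limit
\[ \tilde Z(t) = \lim_{n\to\infty} \sum_{t_{nk} < t} [\tilde U_{t_{nk+1}}^{t_{nk}} - I] \tag{40} \]
exists in $L_s^{(2)}(\Omega, X)$, where $0 = t_{n0} < t_{n1} < \dots < t_{nk} < \dots$, $\lim_{k\to\infty} t_{nk} = \infty$, and $\lim_{n\to\infty} \max_k [t_{nk+1} - t_{nk}] = 0$. Obviously, $\tilde Z(t)$ is a process with independent increments and a martingale with values in $L_s^{(2)}(\Omega, X)$. Let $\tilde B_t = E\tilde Z_t^* \tilde Z_t$. Then
\[ \begin{aligned} \tilde B_{t+h} - \tilde B_t &= \lim_{n\to\infty} E \Bigl( \sum_{t \le t_{nk} < t+h} [\tilde U_{t_{nk+1}}^{t_{nk}} - I] \Bigr)^* \Bigl( \sum_{t \le t_{nk} < t+h} [\tilde U_{t_{nk+1}}^{t_{nk}} - I] \Bigr) \\ &= \lim_{n\to\infty} \sum_{t \le t_{nk} < t+h} E [\tilde U_{t_{nk+1}}^{t_{nk}} - I]^* [\tilde U_{t_{nk+1}}^{t_{nk}} - I] \\ &= \lim_{n\to\infty} \sum_{t \le t_{nk} < t+h} [\tilde V_{t_{nk+1}}^{t_{nk}}(I) - I] \\ &\le \lim_{n\to\infty} \sum_{t \le t_{nk} < t+h} \bigl[ \tilde V_{t_{nk+1}}^{t_{nk}} ( \tilde V_{t_{nk}}^{0}(I) ) - \tilde V_{t_{nk}}^{0}(I) \bigr] \\ &= \tilde V_{t+h}^{0}(I) - \tilde V_t^{0}(I). \end{aligned} \]
Consequently,
\[ E(\tilde Z_{t+h} - \tilde Z_t)^* (\tilde Z_{t+h} - \tilde Z_t) = \tilde B_{t+h} - \tilde B_t \le \tilde V_{t+h}^{0}(I) - \tilde V_t^{0}(I), \]
and $\tilde Z_t$ is a process that is continuous in $L_s^{(2)}(\Omega, X)$.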
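Let $t_{nk} = kt/2^n$. We have that
\[ \begin{aligned} \tilde U_t^0 - I &= \sum_{k=0}^{2^n - 1} \tilde U_{t_{nk}}^0 [\tilde U_{t_{nk+1}}^{t_{nk}} - I] \\ &= \sum_{i=0}^{2^m - 1} \tilde U_{it/2^m}^0 \sum_{k = i \cdot 2^{n-m}}^{(i+1) \cdot 2^{n-m} - 1} [\tilde U_{t_{nk+1}}^{t_{nk}} - I] + \sum_{i=0}^{2^m - 1} \sum_{k = i \cdot 2^{n-m}}^{(i+1) \cdot 2^{n-m} - 1} [\tilde U_{t_{nk}}^0 - \tilde U_{it/2^m}^0][\tilde U_{t_{nk+1}}^{t_{nk}} - I]. \end{aligned} \tag{41} \]
The limit of the first sum on the right-hand side as $n \to \infty$ is equal to
\[ \sum_{i=0}^{2^m - 1} \tilde U_{it/2^m}^0 \Bigl[ \tilde Z\Bigl( \frac{(i+1)t}{2^m} \Bigr) - \tilde Z\Bigl( \frac{it}{2^m} \Bigr) \Bigr]. \tag{42} \]
Further,
\[ \begin{aligned} &E \Bigl( \sum_{i=0}^{2^m - 1} \sum_{k = i \cdot 2^{n-m}}^{(i+1) \cdot 2^{n-m} - 1} [\tilde U_{t_{nk}}^0 - \tilde U_{it/2^m}^0][\tilde U_{t_{nk+1}}^{t_{nk}} - I] \Bigr)^* \Bigl( \sum_{i=0}^{2^m - 1} \sum_{k = i \cdot 2^{n-m}}^{(i+1) \cdot 2^{n-m} - 1} [\tilde U_{t_{nk}}^0 - \tilde U_{it/2^m}^0][\tilde U_{t_{nk+1}}^{t_{nk}} - I] \Bigr) \\ &\quad = \sum_{i=0}^{2^m - 1} \sum_{k = i \cdot 2^{n-m}}^{(i+1) \cdot 2^{n-m} - 1} \bigl[ \tilde V_{t_{nk+1}}^{t_{nk}} ( \tilde V_{t_{nk}}^0(I) - \tilde V_{it/2^m}^0(I) ) - \tilde V_{t_{nk}}^0(I) + \tilde V_{it/2^m}^0(I) \bigr] \\ &\quad \le \sup_{i,k} \| \tilde V_{t_{nk}}^0(I) - \tilde V_{it/2^m}^0(I) \| \, ( \tilde V_t^0(I) - I ). \end{aligned} \]
This estimate implies that the second sum on the right-hand side of (41) tends to zero in $L_s^{(2)}(\Omega, X)$ as $n, m \to \infty$. By using the continuity of $\tilde U_t^0$ with respect to $t$ in $L_s^{(2)}(\Omega, X)$, it is easy to see that (42) tends in $L_s^{(2)}(\Omega, X)$ to the integral $\int_0^t \tilde U_s^0 \, d\tilde Z_s$. Thus,
\[ \tilde U_t^0 - I = \int_0^t \tilde U_s^0 \, d\tilde Z_s. \]
The following equality is established analogously: for $s < t$
\[ \tilde U_t^s - I = \int_s^t \tilde U_v^s \, d\tilde Z_v. \tag{43} \]
We now find an expression for the stochastic semigroup $U_t^s$ in terms of $E_t^0$ and $\tilde Z_t$. The connection between $U_t^s$ and $\tilde U_t^s$ gives us that
\[ U_t^s = (E_s^0)^{-1} \tilde U_t^s E_t^0 = (E_s^0)^{-1} \Bigl[ I + \int_s^t E_s^0 U_v^s (E_v^0)^{-1} \, d\tilde Z_v \Bigr] E_t^0. \tag{44} \]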
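The normalization $\tilde U_t^s = E_s^0 U_t^s (E_t^0)^{-1}$ and the telescoping identity in the first line of (41) can be illustrated in a discrete-time toy model. The sketch below rests on assumptions that are not in the text: $X = R^2$, three time steps, and independent identically distributed one-step factors taking two values with probability $1/2$; the matrices `drift` and `noise` and all helper names are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import product


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def inverse2(m):
    """Inverse of an invertible 2x2 matrix with rational entries."""
    det = Fr(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


I2 = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
ZERO = [[Fr(0), Fr(0)], [Fr(0), Fr(0)]]
drift = [[Fr(1, 10), Fr(1, 5)], [Fr(0), Fr(-1, 10)]]   # assumed mean increment
noise = [[Fr(0), Fr(1, 2)], [Fr(1, 3), Fr(0)]]         # assumed centered part
N = 3
mean_step = add(I2, drift)   # expectation of a one-step factor

# means[k] plays the role of E^0_{t_k}: product of k one-step expectations.
means = [I2]
for _ in range(N):
    means.append(matmul(means[-1], mean_step))


def normalized_factor(k, sign):
    """One-step factor of the normalized semigroup between t_k and t_{k+1}."""
    factor = add(mean_step, scale(sign, noise))
    return matmul(matmul(means[k], factor), inverse2(means[k + 1]))


# Martingale property: every normalized factor has expectation I.
mean_ok = all(
    scale(Fr(1, 2), add(normalized_factor(k, 1), normalized_factor(k, -1))) == I2
    for k in range(N))

# Telescoping identity, checked on every sample path.
telescoping_ok = True
for signs in product((1, -1), repeat=N):
    u, total = I2, ZERO
    for k, s in enumerate(signs):
        f = normalized_factor(k, s)
        total = add(total, matmul(u, add(f, scale(-1, I2))))
        u = matmul(u, f)
    telescoping_ok = telescoping_ok and add(u, scale(-1, I2)) == total

print(mean_ok, telescoping_ok)
```

Factors are multiplied on the right, matching the convention $U_t^0 = U_s^0 U_t^s$ implied by equation (34); rational arithmetic keeps both checks exact.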
For what follows we need some auxiliary facts relating to the stochastic integrals constructed in this section.

Fact 1. Let $\Phi \in F$ and $C \in L(X)$. Then
\[ C \int_s^t \Phi(v) \, d\tilde Z_v = \int_s^t C\Phi(v) \, d\tilde Z_v. \]
This equality is valid for step functions $\Phi(v)$; the general case is obtained by passing to the limit.

Fact 2. For $\Phi \in F$ let $B(t)$ be a function of strongly bounded variation with values in $L(X)$. Then
\[ \int_s^t \Phi(v) \, d\tilde Z_v \, B(t) = \int_s^t \Bigl[ \int_s^u \Phi(v) \, d\tilde Z_v \Bigr] dB(u) + \int_s^t \Phi(v) \, d \Bigl[ \int_s^v (d\tilde Z_u) B(u) \Bigr]. \tag{45} \]
The integral $\int_s^v (d\tilde Z_u) B(u)$ for a continuous function with values in $L(X)$ is defined as the limit of the integral sums
\[ \sum_{k=0}^{n-1} [\tilde Z_{u_{k+1}} - \tilde Z_{u_k}] B(u_k) \]
as $\max_k (u_{k+1} - u_k) \to 0$, where $s = u_0 < u_1 < \dots < u_n = v$; the existence of the limit follows from the continuity of $B(u)$ and the estimate
\[ E \Bigl( \sum_{k=0}^{n-1} (\tilde Z_{u_{k+1}} - \tilde Z_{u_k}) B(u_k) \Bigr)^* \Bigl( \sum_{k=0}^{n-1} (\tilde Z_{u_{k+1}} - \tilde Z_{u_k}) B(u_k) \Bigr) \le \sup_u \|B(u)\|^2 [\tilde V_v^s(I) - I]. \]
Further, the integral $\int_s^v (d\tilde Z_u) B(u)$ is a process with independent increments and with values in $L_s^{(2)}(\Omega, X)$. Formula (45) is a form of integration by parts. For a proof it suffices to consider step functions $\Phi(v)$ and numbers $s < t$ such that $\Phi$ is constant on $[s, t[$. Then it reduces to
\[ \Phi(s) [\tilde Z_t - \tilde Z_s] B(t) = \int_s^t \Phi(s) [\tilde Z_u - \tilde Z_s] \, dB(u) + \Phi(s) \int_s^t d\tilde Z_u \, B(u). \]
Obviously, this equality is valid for $\Phi = I$, $\tilde Z_s = 0$, and $s = 0$; it then reduces to
\[ \tilde Z_t B(t) = \int_0^t \tilde Z_u \, dB(u) + \int_0^t d\tilde Z_u \, B(u). \]
The latter is obtained by passing to the limit in the equality
\[ \tilde Z_t B(t) = \sum_{k=1}^{n} [\tilde Z_{t_k} B(t_k) - \tilde Z_{t_{k-1}} B(t_{k-1})] = \sum_{k=1}^{n} [\tilde Z_{t_k} - \tilde Z_{t_{k-1}}] B(t_{k-1}) + \sum_{k=1}^{n} \tilde Z_{t_k} [B(t_k) - B(t_{k-1})]. \]

Fact 3. Suppose that $\Phi \in F$, and the $B_k(t)$, $k = 1, 2$, are continuous functions with values in $L(X)$. Then
\[ \int_s^t \Phi(u) B_1(u) \, d \int_s^u d\tilde Z_v \, B_2(v) = \int_s^t \Phi(u) \, d \int_s^u B_1(v) \, d\tilde Z_v \, B_2(v). \tag{46} \]
The integral on the right-hand side is defined as the limit of the integral sums of the form
\[ \sum_k \Phi(u_k) B_1(u_k) [\tilde Z_{u_{k+1}} - \tilde Z_{u_k}] B_2(u_k); \]
the existence of this integral is easily proved by using the equality
\[ \begin{aligned} &E \Bigl( \sum_k \Phi(u_k) B_1(u_k) [\tilde Z_{u_{k+1}} - \tilde Z_{u_k}] B_2(u_k) \Bigr)^* \sum_k \Phi(u_k) B_1(u_k) [\tilde Z_{u_{k+1}} - \tilde Z_{u_k}] B_2(u_k) \\ &\quad = \sum_k B_2^*(u_k) \, \hat V_{u_{k+1}}^{u_k} \bigl( B_1^*(u_k) E\Phi^*(u_k)\Phi(u_k) B_1(u_k) \bigr) B_2(u_k) \\ &\quad \le \sup_u \| B_2^*(u) B_2(u) \| \, \| B_1^*(u) B_1(u) \| \, \| E\Phi^*(u)\Phi(u) \| \, ( \tilde V_t^s(I) - I ). \end{aligned} \]
Equality (46), like (45), need be proved only for constant $\Phi$. Then it reduces to
\[ \int_s^t B_1(u) \, d \int_s^u d\tilde Z_v \, B_2(v) = \int_s^t B_1(u) \, d\tilde Z_u \, B_2(u), \]
which is obvious for constants $B_1$ and $B_2$; therefore, it is true for piecewise constant $B_k(u)$, and can be extended to all continuous $B_k(u)$ by passing to the limit.

Fact 4. If $B_k(t)$ satisfies the conditions of Fact 3, then $\int B_1(u) \, d\tilde Z_u \, B_2(u)$ is a martingale with independent increments. Let $\hat V_t^s(C) = E(\tilde Z_t - \tilde Z_s)^* C (\tilde Z_t - \tilde Z_s)$. Then
\[ E \Bigl( \int_s^t B_1(u) \, d\tilde Z_u \, B_2(u) \Bigr)^* C \int_s^t B_1(u) \, d\tilde Z_u \, B_2(u) = \int_s^t B_2^*(u) \, \hat V_{du} ( B_1^*(u) C B_1(u) ) B_2(u), \tag{47} \]
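where the integral on the right-hand side is defined as the limit of the integral sums of the form
\[ \sum_{k=0}^{n-1} B_2^*(t_k) \, \hat V_{t_{k+1}}^{t_k} ( B_1^*(t_k) C B_1(t_k) ) B_2(t_k) \]
as $\max (t_{k+1} - t_k) \to 0$, where $s = t_0 < t_1 < \dots < t_n = t$. The existence of the integral for continuous $B_k(t)$ follows from the inequality
\[ \Bigl\| \sum_{k=0}^{n-1} B_2^*(t_k) \, \hat V_{t_{k+1}}^{t_k} ( B_1^*(t_k) C B_1(t_k) ) B_2(t_k) \Bigr\| \le \|C\| \sup_u [\, \|B_1(u)\|^2 \|B_2(u)\|^2 \,] \, \| \hat V_t^s(I) \|. \]
Relation (47) can be verified immediately for piecewise constant functions $B_k(u)$, and it can be obtained for continuous functions by passing to the limit. By (47),
\[ \Bigl\| E \Bigl( \int_s^t B_1(u) \, d\tilde Z_u \, B_2(u) \Bigr)^* C \int_s^t B_1(u) \, d\tilde Z_u \, B_2(u) \Bigr\| \le \|C\| \sup_{u \le t} [\, \|B_1(u)\|^2 \|B_2(u)\|^2 \,] \, \| \hat V_t^s \|. \]

Fact 5. The function
\[ A(t) = \int_0^t (E_u^0)^{-1} \, dE_u^0 \]
has strongly bounded variation. This follows from the inequality
\[ \operatorname*{var}_{s \le u \le t} A(u)x \le \sup_{u \le t} \| (E_u^0)^{-1} \| \operatorname*{var}_{s \le u \le t} E_u^0 x. \]

We return to (44). Using Facts 1 and 2, we can write
\[ \begin{aligned} (E_s^0)^{-1} \Bigl[ \int_s^t E_s^0 U_v^s (E_v^0)^{-1} \, d\tilde Z_v \Bigr] E_t^0 &= \Bigl[ \int_s^t U_v^s (E_v^0)^{-1} \, d\tilde Z_v \Bigr] E_t^0 \\ &= \int_s^t \Bigl[ \int_s^u U_v^s (E_v^0)^{-1} \, d\tilde Z_v \Bigr] dE_u^0 + \int_s^t U_v^s (E_v^0)^{-1} \, d \Bigl[ \int_s^v d\tilde Z_u \, E_u^0 \Bigr]. \end{aligned} \]
But on the basis of (44),
\[ \int_s^u U_v^s (E_v^0)^{-1} \, d\tilde Z_v = U_u^s (E_u^0)^{-1} - (E_s^0)^{-1}, \]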
and, by Fact 3,
\[ \int_s^t U_v^s (E_v^0)^{-1} \, d \Bigl[ \int_s^v d\tilde Z_u \, E_u^0 \Bigr] = \int_s^t U_v^s \, d \int_s^v (E_u^0)^{-1} \, d\tilde Z_u \, E_u^0. \]
Hence,
\[ \begin{aligned} (E_s^0)^{-1} \Bigl[ \int_s^t E_s^0 U_v^s (E_v^0)^{-1} \, d\tilde Z_v \Bigr] E_t^0 &= \int_s^t U_u^s (E_u^0)^{-1} \, dE_u^0 - \int_s^t (E_s^0)^{-1} \, dE_u^0 + \int_s^t U_v^s \, d\bar Z_v \\ &= \int_s^t U_u^s \, dA(u) + I - (E_s^0)^{-1} E_t^0 + \int_s^t U_u^s \, d\bar Z_u, \end{aligned} \tag{48} \]
where
\[ \bar Z_t = \int_0^t (E_u^0)^{-1} \, d\tilde Z_u \, E_u^0. \]
Finally, we get that
\[ U_t^s = I + \int_s^t U_v^s \, dZ_v, \qquad Z_t = A(t) + \bar Z_t, \qquad A(t) = \int_0^t (E_u^0)^{-1} \, dE_u^0. \tag{49} \]
The process $Z_t$ satisfies condition 1) in view of Facts 4 and 5.

THEOREM 6. 1. For every second-order stochastic semigroup $U_t^s$ of bounded variation there exists a second-order process $Z_t$ of bounded variation with independent increments such that (49) holds.

2. For every homogeneous stochastic semigroup $U_t^s$ that is continuous in $L_s^{(2)}(\Omega, X)$ there exists a homogeneous process with independent increments that is continuous in $L_s^{(2)}(\Omega, X)$.

PROOF. Assertion 1 was established above. We prove assertion 2. Note that in this case $E_t = E_t^0$ forms a uniformly continuous one-parameter semigroup, since $E_t^s = E_{t-s}$ and
\[ \| E_t - I \| \le \| E(U_t^0 - I)^* (U_t^0 - I) \|^{1/2}. \]
Therefore, $E_t = \exp\{tA\}$, where $A \in L(X)$, and hence $E_t$ has uniformly bounded variation, and thus strongly bounded variation. Further, if $V_t(C) = E(U_t^0)^* C U_t^0$, then $V_t(C)$ forms a one-parameter semigroup of operators from $L(X)$ into itself, and
\[ \begin{aligned} \| V_t(C) - C \| &= \| EU_t^{0*} C U_t^0 - C \| \\ &= \| E(U_t^{0*} - I) C (U_t^0 - I) + (e^{tA^*} - I) C + C (e^{tA} - I) \| \\ &\le \|C\| \bigl[ \| E(U_t^{0*} - I)(U_t^0 - I) \| + \| e^{tA^*} - I \| + \| e^{tA} - I \| \bigr], \end{aligned} \]
so that this semigroup is uniformly continuous in $L(X)$. This implies that there exists a bounded operator $B$ from $L(X)$ to $L(X)$ such that $V_t(C) = \exp\{tB\}(C)$; therefore, $(V_t(C)x, x)$ also has bounded variation, and
\[ \operatorname*{var}_{s \le t} (V_s(C)x, x) \le (e^{t\|B\|} - 1) \|C\| \, |x|^2. \]
Hence, the stochastic semigroup $U_t^s$ is second-order and of bounded variation. To see that the process $Z_t$ is homogeneous in this case we can use the formula
\[ Z_t = \lim_{n\to\infty} \sum_{k=0}^{n-1} [U_{(k+1)t/n}^{kt/n} - I] \tag{50} \]
(the limit is in the sense of convergence in $L_s^{(2)}(\Omega, X)$). Indeed,
\[ A(t) = \int_0^t e^{-vA} \, de^{vA} = tA, \]
\[ U_t^s - I - Z_t + Z_s = \int_s^t [U_v^s - I] \, dZ_v = \int_s^t [U_v^s - I] \, d\bar Z_v + \int_s^t [U_v^s - I] A \, dv = Y_t^s + W_t^s. \]
To prove (50) it suffices to show that the variables
\[ Y_n = \sum_{k=0}^{n-1} Y_{(k+1)t/n}^{kt/n}, \qquad W_n = \sum_{k=0}^{n-1} W_{(k+1)t/n}^{kt/n} \]
converge to zero in $L_s^{(2)}(\Omega, X)$. Using the independence of the $Y_{(k+1)t/n}^{kt/n}$ and setting $\bar V_t(C) = E\bar Z_t^* C \bar Z_t$, we get that
\[ \begin{aligned} \| EY_n^* Y_n \| &= \Bigl\| \sum_{k=0}^{n-1} E\, Y_{(k+1)t/n}^{kt/n\,*} \, Y_{(k+1)t/n}^{kt/n} \Bigr\| \\ &= \Bigl\| \sum_{k=0}^{n-1} \int_{kt/n}^{(k+1)t/n} \bar V_{ds} \bigl( E(U_s^{kt/n} - I)^* (U_s^{kt/n} - I) \bigr) \Bigr\| \\ &= \Bigl\| \sum_{k=0}^{n-1} \int_{kt/n}^{(k+1)t/n} \bar V_{ds} \bigl( V_{s - kt/n}(I) + I - e^{A^*(s - kt/n)} - e^{(s - kt/n)A} \bigr) \Bigr\| \\ &\le \sup_{h \le t/n} \| V_h(I) + I - e^{hA^*} - e^{hA} \| \, \| \bar V_t(I) \|, \end{aligned} \]
and the right-hand side tends to zero. The variables $W_{(k+1)t/n}^{kt/n}$ are also independent,
\[ \| EW_n \| = \Bigl\| \sum_{k=0}^{n-1} \int_{kt/n}^{(k+1)t/n} [e^{(s - kt/n)A} - I] A \, ds \Bigr\| \le n \int_0^{t/n} (e^{s\|A\|} - 1) \|A\| \, ds \le t \|A\| (e^{(t/n)\|A\|} - 1) \to 0 \]
as $n \to \infty$, and
\[ \begin{aligned} \| E(W_n - EW_n)^* (W_n - EW_n) \| &= \Bigl\| \sum_{k=0}^{n-1} E \Bigl( \int_{kt/n}^{(k+1)t/n} (U_s^{kt/n} - E_{s - kt/n}) A \, ds \Bigr)^* \Bigl( \int_{kt/n}^{(k+1)t/n} (U_s^{kt/n} - E_{s - kt/n}) A \, ds \Bigr) \Bigr\| \\ &\le n \|A\|^2 \cdot \Bigl\| E \Bigl[ \int_0^{t/n} (U_s^0 - E_s) \, ds \Bigr]^* \int_0^{t/n} (U_s^0 - E_s) \, ds \Bigr\| \\ &= n \|A\|^2 \cdot \Bigl\| \int_0^{t/n} \int_0^s E(U_s^0 - E_s)^* (U_v^0 - E_v) \, dv \, ds + \int_0^{t/n} \int_0^s E(U_v^0 - E_v)^* (U_s^0 - E_s) \, dv \, ds \Bigr\| \\ &\le n \|A\|^2 \int_0^{t/n} \int_0^s \| E_{s-v}^* V_v(I) - E_s^* E_v + V_v(I) E_{s-v} - E_v^* E_s \| \, dv \, ds = o\Bigl( \frac{1}{n} \Bigr). \end{aligned} \]

2.5. Stochastic equations of diffusion type with constant coefficients. We consider the stochastic differential equation
\[ dX_t = X_t A \, dt + \sum_{k=1}^{\infty} X_t B_k \, dw_k(t), \tag{51} \]
where $A$ and $B_k$ are in general unbounded linear operators defined on some dense subset $D$ of $X$, for $x \in D$
\[ \sum_{k=1}^{\infty} |B_k x|^2 < \infty, \]
and the $w_k(t)$ are independent Wiener processes. This equation can be written as $dX_t = X_t A \, dt + X_t \, d\zeta_t$, where $\zeta_t$ is a homogeneous operator-valued process with Gaussian independent increments that is defined on
the dense set $D$. The process $\zeta_t$ can then be represented by a series of independent Wiener processes, so that (51) is a general linear equation with a homogeneous Gaussian process with independent increments. We are interested in the case when (51) has a solution having a second moment. Note that a solution of this equation is taken to be a strong operator-valued process $X_t$ such that for $x \in D$ the function $X_t x$ has stochastic differential coinciding with the result of applying the right-hand side of (51) to $x$ (this result is defined for $x \in D$ and a strong operator $X_t$).

If (51) has a unique solution, then with it we can associate a homogeneous strong stochastic semigroup $X_t^s$ (the solution of (51) for $t > s$ with the initial condition $X_s^s = I$). If the solution has a second moment, then $V_t(C) = EX_t^{0*} C X_t^0$ is a semigroup of bounded linear operators on $L(X)$ with the following property: for all $x, y \in D$ and $C \in L(X)$ the limit relation
\[ \lim_{t \downarrow 0} \frac{1}{t} [(V_t(C)x, y) - (Cx, y)] = (Q(C)x, y) \]
holds, where
\[ Q(C) = A^* C + CA + \sum_{k=1}^{\infty} B_k^* C B_k. \tag{52} \]
We remark that the boundedness of $V_t(C)$ in a neighborhood of zero and the fact that $(V_t(C)x, y) \to (Cx, y)$ for $x, y \in D$ imply that $V_t(C)$ is weakly continuous with respect to $t$ at 0. Therefore, it follows from the general theory of semigroups that on a certain set $\mathscr{D} \subset L(X)$ the weak limit
\[ \lim_{t \downarrow 0} \frac{1}{t} [V_t(C) - C] = \bar Q(C), \qquad C \in \mathscr{D}, \]
exists, and $\bar Q(C) \in L(X)$. Obviously, $Q(C) = \bar Q(C)$ for $C \in \mathscr{D}$. As a generating operator, $\bar Q(C)$ is closed on $\mathscr{D}$. The operator $Q(C)$ is weakly closed on the domain where $Q(C) \in L(X)$. Therefore, $\mathscr{D}$ coincides with the set of $C$ such that $Q(C) \in L(X)$ and $Q(C) = \bar Q(C)$.

Thus, we establish a necessary condition for the existence of a solution of (51) having a locally bounded second moment: for this it is necessary that $Q(C)$ be the generating operator of some weakly continuous semigroup on $L(X)$. We show that in this case a solution of (51) having locally bounded second moments is unique. The local boundedness of $\| V_t(\cdot) \| = \| V_t(I) \|$, which follows from the inequality
\[ |E(CX_t x, X_t y)| \le \|C\| \sqrt{E|X_t x|^2 \, E|X_t y|^2} \le \|C\| \cdot \| EX_t^* X_t \| \, |x| \, |y|, \]
implies that $\| V_t(\cdot) \| = O(e^{at})$ for some $a > 0$.
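For bounded coefficients in finite dimensions the semigroup generated by $Q$ can be written as $V_t = \exp\{tQ\}$, and the limit relation defining $Q$ can be checked numerically. The sketch below is an illustration under assumptions that are not in the text: $X = R^2$, real matrices, only two nonzero $B_k$, and arbitrarily chosen entries for `A` and `B_LIST`; the truncated Taylor series stands in for the exponential.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


# Hypothetical bounded coefficients (assumptions, not taken from the text).
A = [[-1.0, 0.5], [0.0, -2.0]]
B_LIST = [[[0.3, 0.0], [0.1, 0.2]], [[0.0, 0.4], [0.0, 0.0]]]


def Q(c):
    """Q(C) = A* C + C A + sum_k B_k* C B_k, as in (52), for real matrices."""
    out = add(matmul(transpose(A), c), matmul(c, A))
    for b in B_LIST:
        out = add(out, matmul(transpose(b), matmul(c, b)))
    return out


def V(t, c, terms=40):
    """exp(tQ) applied to C, computed by a truncated Taylor series."""
    out, term = c, c
    for k in range(1, terms):
        term = scale(t / k, Q(term))
        out = add(out, term)
    return out


C = [[2.0, 1.0], [1.0, 3.0]]

# Difference quotient (V_t(C) - C) / t for small t versus Q(C).
t_small = 1e-6
quotient = scale(1.0 / t_small, add(V(t_small, C), scale(-1.0, C)))
generator_error = max_abs_diff(quotient, Q(C))

# Semigroup property V_{s+t} = V_s V_t.
semigroup_error = max_abs_diff(V(0.3, V(0.5, C)), V(0.8, C))

print(generator_error, semigroup_error)
```

The generator error is of order `t_small`, and the semigroup error is at the level of rounding, which is what the limit relation and the semigroup property predict in this bounded setting.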
Let $X_t$ and $\tilde X_t$ be two solutions of (51) such that $\| EX_t^* X_t \| + \| E\tilde X_t^* \tilde X_t \| = O(e^{at})$. Define $R_t = E(X_t - \tilde X_t)^* (X_t - \tilde X_t)$. Then for $x, y \in D$
\[ \frac{d}{dt} (R_t x, y) = (Q(R_t)x, y). \]
Since $R_t = O(e^{at})$, it follows that $\int_0^{\infty} e^{-\lambda t} R_t \, dt$ is defined for $\lambda > a$, and
\[ \int_0^{\infty} \frac{d}{dt} (R_t x, y) e^{-\lambda t} \, dt = \Bigl( Q \Bigl( \int_0^{\infty} e^{-\lambda t} R_t \, dt \Bigr) x, y \Bigr), \]
which implies that
\[ \Bigl( \lambda \Bigl[ \int_0^{\infty} R_t e^{-\lambda t} \, dt \Bigr] x, y \Bigr) = \Bigl( Q \Bigl( \int_0^{\infty} e^{-\lambda t} R_t \, dt \Bigr) x, y \Bigr). \]
Let $\int_0^{\infty} R_t e^{-\lambda t} \, dt = C_\lambda \in L(X)$. Obviously,
\[ \frac{d}{dt} V_t(C_\lambda) = V_t(Q(C_\lambda)) = \lambda V_t(C_\lambda), \]
and $V_t(C_\lambda) = C_\lambda e^{\lambda t}$. If $C_\lambda \ne 0$, then this contradicts the fact that $\| V_t(\cdot) \| = O(e^{at})$, because $\lambda > a$.

THEOREM 7. For (51) to have a unique solution with locally bounded second moment it is necessary (and, for uniqueness, sufficient) that there exist a semigroup on $L(X)$ with generating operator $Q(C)$ given by (52).

We now consider sufficient conditions for the existence of a solution of (51) that has locally bounded second moments.

LEMMA 2. Suppose that the coefficients of (51) are bounded operators, and $B_k = 0$ for $k > m$. If
\[ 2(Ax, x) + \sum_{k=1}^{m} |B_k x|^2 \le a |x|^2, \]
then the solution of (51) with the initial condition $X_0 = I$ satisfies $EX_t^* X_t \le e^{at} I$.

PROOF. Under our assumptions the operator $Q(C)$ is bounded, and hence $\| V_t(I) - I - tQ(I) \| = o(t)$. Therefore, for every $\varepsilon > 0$ there is a $\delta > 0$ such that for $t < \delta$
\[ (V_t(I)x, x) \le (x, x) + t [(Q(I)x, x) + \varepsilon (x, x)] \le (x, x)(1 + t(a + \varepsilon)) \le e^{(a + \varepsilon)t} (x, x). \]
Using the monotonicity of $V_t(C)$, we get that
\[ V_{2t}(I) = V_t(V_t(I)) \le V_t(e^{(a+\varepsilon)t} I) \le e^{(a+\varepsilon)t} V_t(I) \le e^{(a+\varepsilon)2t} I, \]
\[ V_{nt}(I) = V_{(n-1)t}(V_t(I)) \le V_{(n-1)t}(e^{(a+\varepsilon)t} I) \le e^{(a+\varepsilon)t} V_{(n-1)t}(I) \le e^{(a+\varepsilon)nt} I. \]
Therefore, $V_s(I) \le e^{(a+\varepsilon)s} I$ for any $s > 0$ and any $\varepsilon > 0$. Passing to the limit as $\varepsilon \downarrow 0$, we get a proof of the lemma.

THEOREM 8. Suppose that for some $a > 0$
\[ 2(Ax, x) + \sum_{k=1}^{\infty} |B_k x|^2 \le a |x|^2 \]
for all $x \in D$. Then (51) has a solution with the initial condition $X_0 = I$, and it satisfies $\| EX_t^* X_t \| \le e^{at}$.

PROOF. Let $X_t^{n,m}$ be a solution of the equation
\[ dX_t^{n,m} = X_t^{n,m} A_n \, dt + \sum_{k=1}^{m} X_t^{n,m} B_k^n \, dw_k(t), \qquad X_0^{n,m} = I, \]
where $A_n = P_n A P_n$, $B_k^n = B_k P_n$, $P_n$ is a sequence of projection operators that increases monotonically to $I$, and $P_n x \in D$ for all $x$. The operators $A_n$ and $B_k^n$ are bounded, and
\[ 2(A_n x, x) + \sum_{k=1}^{m} |B_k^n x|^2 = 2(AP_n x, P_n x) + \sum_{k=1}^{m} |B_k P_n x|^2 \le a |P_n x|^2 \le a |x|^2. \]
Therefore, $\| EX_t^{n,m*} X_t^{n,m} \| \le e^{at}$ on the basis of the lemma. Let $D = \bigcup_n P_n X$. For $x \in D$
\[ \begin{aligned} E|X_{t_2}^{n,m} x - X_{t_1}^{n,m} x|^2 &\le 2E \Bigl| \int_{t_1}^{t_2} X_s^{n,m} A_n x \, ds \Bigr|^2 + 2E \int_{t_1}^{t_2} \sum_k |X_s^{n,m} B_k^n x|^2 \, ds \\ &\le 2(t_2 - t_1) \int_{t_1}^{t_2} E|X_s^{n,m} A_n x|^2 \, ds + 2 \sum_k \int_{t_1}^{t_2} E|X_s^{n,m} B_k^n x|^2 \, ds \\ &\le 2 \Bigl( |Ax|^2 + \sum_k |B_k x|^2 \Bigr) \int_{t_1}^{t_2} e^{as} \, ds \end{aligned} \]
if $t_2 - t_1 \le 1$ and $n$ is sufficiently large. Therefore, the finite-dimensional distributions of the processes $(X_t^{n,m} x, y)$, $x \in D$, $y \in D_1$, are weakly compact, where $D_1$ is some countable dense subset of $D$. Let $n_k$ and $m_k$ be
chosen so that $n_k, m_k \to \infty$ and the finite-dimensional distributions of the processes

$$\{(X_t^{n_k, m_k} x, y);\ x, y \in D_1,\ w_k(t),\ k = 1, 2, \ldots\}$$

converge to those of the processes

$$\{(Z_t x, y);\ x, y \in D_1,\ w_k(t),\ k = 1, 2, \ldots\}.$$

Then

$$(Z_t x, y) = \int_0^t (Z_s A x, y)\,ds + \sum_{k=1}^{\infty} \int_0^t (Z_s B_k x, y)\,dw_k(s)$$

(because of the assumptions about $D_1$, $A_n x = Ax$ for sufficiently large $n$, and it can be assumed without loss of generality that $Ax$ and $B_k x$ are in $D_1$ for $x \in D_1$ and that $D_1$ is a linear space over the field of rational numbers). The expression $(Z_s x, y)$ is a bilinear form on $D_1$ satisfying the condition

$$E(Z_s x, y)^2 \le e^{as}|x|^2 \cdot |y|^2. \tag{53}$$

Therefore, it can be extended by continuity (in the mean square) to $X$, and inequality (53) is preserved. For the present, $(Z_s x, y)$ only denotes our bilinear form. Setting

$$X_t x = \sum_{k=1}^{\infty} (Z_t x, e_k) e_k,$$

where $\{e_k\}$ is a basis in $D_1$ such that $P_n e_k = 0$ for all sufficiently large $k$, we get a solution of (51). Convergence of the series follows from the fact that

$$E\sum_{k=1}^{\infty} (Z_s x, e_k)^2 \le \lim_{n \to \infty} E\sum_{k=1}^{\infty} (X_s^{n,m} x, e_k)^2 = \lim_{n \to \infty} E|X_s^{n,m} x|^2 \le |x|^2 e^{as}. \qquad \square$$

3. Stability

3.1. Examples of stable and unstable infinite systems. The infinite dimensionality of the phase space of a linear system essentially affects the character of the asymptotic behavior of the system. We present examples that clarify the possible deviations from what has been established in the finite-dimensional case.

EXAMPLE 1. In the finite-dimensional case the set of initial values on which a homogeneous stochastic semigroup is stable is a closed invariant subspace (see §2 in Chapter III). In the infinite-dimensional case this is not necessarily so. We consider a nonrandom semigroup acting in $X$ as follows. Let $\{e_k\}$ be a basis in $X$, let $U_s^{s+t} = U_0^t$, and suppose that $U_0^t$ is given on the basis elements for $n = 1, 2, \ldots$ by the system of differential equations

$$\frac{d}{dt} U_0^t e_{2n-1} = -\frac{1}{n^4} U_0^t e_{2n-1},$$

$$\frac{d}{dt} U_0^t e_{2n} = -\frac{1}{n^4} U_0^t e_{2n} + \frac{1}{n} U_0^t e_{2n-1}.$$
This system clearly decomposes into a countable collection of second-order systems, and, by using the initial condition $U_0^0 = I$, we can write the solution

$$U_0^t e_{2n-1} = \exp\{-t/n^4\}\,e_{2n-1}, \qquad U_0^t e_{2n} = \exp\{-t/n^4\}\,[t e_{2n-1}/n + e_{2n}].$$

Obviously, $U_0^t - I$ is a Hilbert-Schmidt operator ($dU_0^t/dt = A U_0^t$, where $A$ is a Hilbert-Schmidt operator). The semigroup $U_0^t$ is stable (since it is nonrandom, we can consider any form of stability: with probability 1, in the mean square, $p$-stability) on all basis elements, hence also on linear combinations of them, which form a dense subset of $X$. We show that it is unstable on some $x$. Let $x = \sum_k e_k/k$. Then

$$U_0^t x = \sum_{k=1}^{\infty} \frac{1}{k} U_0^t e_k = \sum_{k=1}^{\infty} \exp\Big\{-\frac{t}{k^4}\Big\}\Big(\frac{1}{2k-1} e_{2k-1} + \frac{t}{2k^2} e_{2k-1} + \frac{1}{2k} e_{2k}\Big),$$

$$|U_0^t x|^2 = \sum_{k=1}^{\infty} \Big(\frac{1}{4k^2} + \Big(\frac{1}{2k-1} + \frac{t}{2k^2}\Big)^2\Big) e^{-2t/k^4} \ge \frac{t^2}{4} \sum_{k=1}^{\infty} \frac{e^{-2t/k^4}}{k^4}.$$

For $t = m^4$, keeping only the term with $k = m$, we have that $|U_0^t x|^2 \ge m^8 e^{-2m^4/m^4}/(4m^4) = m^4 e^{-2}/4$. Hence $\lim_{t \to \infty} |U_0^t x| = +\infty$.

In the finite-dimensional case stability of a semigroup for all $x$ implies that it is uniformly stable with respect to $x$, i.e., $\|U_0^t\|$ tends to zero (with probability 1 or in the mean square, depending on the nature of stability of the semigroup). In the infinite-dimensional case $\|U_0^t\|$ may not even be defined (for strong semigroups). But even if, for example, the semigroup is such that $\|U_0^t\|$ exists, stability of the semigroup on each element does not imply that the norm tends to zero.

EXAMPLE 2. We again consider a nonrandom semigroup $U_0^t$ such that $U_0^t e_k = \exp\{-\lambda_k t\}\,e_k$, where $\{e_k\}$ is a basis, $\lambda_k > 0$, and $\lambda_k \to 0$. Then

$$U_0^t x = \sum_{k=1}^{\infty} \exp\{-\lambda_k t\}(x, e_k) e_k, \qquad |U_0^t x|^2 = \sum_{k=1}^{\infty} e^{-2\lambda_k t}(x, e_k)^2.$$

The series on the right-hand side converges uniformly with respect to $t$, since $\sum_k (x, e_k)^2 < \infty$. Hence, $\lim_{t \to \infty} |U_0^t x|^2 = 0$. On the other hand, $|U_0^t e_i|^2 = e^{-2\lambda_i t}$ and $|U_0^{t_i} e_i|^2 = e^{-2} \not\to 0$ if $t_i = 1/\lambda_i$, $t_i \to \infty$. Thus,

$$\sup_{|x| \le 1} E|U_0^t x|^2 \not\to 0,$$

although $E|U_0^t x|^2 \to 0$ for all $x$.
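The behavior in Examples 1 and 2 can be checked numerically. The following is only a minimal sketch under stated assumptions: the basis is taken to be orthonormal, the series are truncated to finitely many terms, and for Example 2 the particular choices $\lambda_k = 1/k$ and $(x, e_k) = 1/k$ are illustrative assumptions that do not come from the text. All function names are hypothetical. For Example 1 the code applies the semigroup block by block and compares $|U_0^t x|^2$ at $t = m^4$ with the lower bound $m^4 e^{-2}/4$ given by the single term $k = m$.

```python
import math


def example1_apply(t, a, b, n):
    """Action of U_0^t on the block a*e_{2n-1} + b*e_{2n} (Example 1).

    Returns the new coefficients of e_{2n-1} and e_{2n}.
    """
    decay = math.exp(-t / n**4)
    return decay * (a + b * t / n), decay * b


def example1_norm_sq(t, blocks=2000):
    """|U_0^t x|^2 for x = sum_j e_j / j, truncated to the first `blocks` blocks.

    Every term is nonnegative, so the truncated sum is a lower bound
    for the full series.
    """
    total = 0.0
    for n in range(1, blocks + 1):
        odd, even = example1_apply(t, 1.0 / (2 * n - 1), 1.0 / (2 * n), n)
        total += odd * odd + even * even
    return total


def example1_basis_norm_sq(t, n):
    """|U_0^t e_{2n}|^2, which tends to 0 as t grows for each fixed n."""
    odd, even = example1_apply(t, 0.0, 1.0, n)
    return odd * odd + even * even


def example2_norm_sq(t, coeffs, rates):
    """|U_0^t x|^2 = sum_k exp(-2*lambda_k*t) * (x, e_k)^2 (Example 2)."""
    return sum(math.exp(-2.0 * lam * t) * c * c for c, lam in zip(coeffs, rates))


def example2_sup_basis_norm_sq(t, rates):
    """sup_k |U_0^t e_k|^2 over the retained modes."""
    return max(math.exp(-2.0 * lam * t) for lam in rates)


if __name__ == "__main__":
    # Example 1: growth along t = m^4 versus the bound m^4 * e^{-2} / 4.
    for m in (2, 4, 6):
        t = float(m**4)
        print(m, example1_norm_sq(t), m**4 * math.exp(-2.0) / 4.0)

    # Example 2 with assumed lambda_k = 1/k and assumed (x, e_k) = 1/k.
    rates = [1.0 / k for k in range(1, 2001)]
    coeffs = rates
    for t in (0.0, 200.0, 2000.0):
        print(t, example2_norm_sq(t, coeffs, rates),
              example2_sup_basis_norm_sq(t, rates))
```

In the Example 2 part, the norm on the fixed vector decays, while at $t = 1/\lambda_K$ the supremum over the first $K$ basis vectors still equals $e^{-2}$; this is the finite-mode shadow of the failure of uniform stability.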
In §1 of Chapter III it was proved that in the case when the phase space is locally compact and there is a stationary point, asymptotic stability (convergence of a sample path to the stationary point) for a process irreducible away from the stationary point is implied by stability, i.e., the condition that the sample path of the process is in an arbitrarily small neighborhood of the stationary point if the initial point is chosen sufficiently close to the stationary point. The next example shows that for strong semigroups in a Hilbert space this is not so.

EXAMPLE 3. Suppose that $H$ is the space $l_2$ of real sequences $\{x_k, k \ge 1\}$ with $\sum x_k^2 < \infty$. In $H$ we consider the system of stochastic differential equations

$$dx_n(t) = \alpha_n x_n(t)\,dt + \beta_n x_n(t)\,dw_n(t) - 2x_n(t)\,\nu_n^-(dt) + (x_{n-1}(t) - x_n(t))\,\nu(dt) + \gamma_n x_{n+1}(t)\,\nu_n^+(dt), \tag{54}$$

$n \ge 1$ ($x_{n-1}$ and $\nu_{n-1}$ are regarded as equal to zero for $n = 1$), where $\alpha_n$ and $\beta_n$ are nonzero constants, $\gamma_n > 0$, $\{w_n(t), n = 1, 2, \ldots\}$ are independent Wiener processes, $\nu(t)$ and $\nu_n^-(t)$, $\nu_n^+(t)$, $n = 1, 2, \ldots$, are Poisson processes that are independent of each other and of $\{w_n(t), n = 1, 2, \ldots\}$, $E\nu(t) = \lambda t$, and $E\nu_n^{\pm}(t) = \lambda_n^{\pm} t$. The values of the constants $\alpha_n$, $\beta_n$, and $\lambda_n^{\pm}$ will be made more precise later.

We describe the behavior of system (54). As long as the processes $\nu_n^-(t)$, $\nu(t)$, and $\nu_n^+(t)$ do not have jumps, $x_n(t)$ is a solution of the one-dimensional diffusion equation, and hence

$$x_n(t) = x_n(0)\exp\{\beta_n w_n(t) + (\alpha_n - \beta_n^2/2)t\}. \tag{55}$$

The influence of a jump in the process $\nu_n^-(t)$ is the simplest to take into account: $x_n(t)$ changes sign at this time. At a jump time $\tau$ of $\nu(t)$ the process $x_1(t)$ vanishes, and all the remaining $x_i(t)$ become equal to $x_{i-1}(t)$. In other words,

$$(x_1(\tau), x_2(\tau), \ldots) = (0, x_1(\tau-), x_2(\tau-), \ldots),$$

i.e., all sequences are shifted to the right. This does not change the norm of the solution. If the process $\nu_k^+(t)$ has a jump at time $\tau^*$, then

$$x_k(\tau^*) = x_k(\tau^*-) + \gamma_k x_{k+1}(\tau^*-).$$
Therefore, if the process $\nu(t)$ has $k$ jumps up to the time $t$, and $\sum_i \nu_i^+(t) = 0$, then $x_i(t) = 0$ for $i \le k$. We remark that $|x_n(t)|$ satisfies the same system of equations as $x_n(t)$, except that $\lambda_n^- = 0$, and hence $\nu_n^-(t) = 0$, in the system for $|x_n(t)|$.

Assume that $\sum |\alpha_n| < \infty$, $\sum |\beta_n| < \infty$, $\sum \gamma_n < \infty$, and $\sum \lambda_n^+ < \infty$. Then $\sum_n \nu_n^+(t) + \nu(t) < \infty$, and the sum on the left-hand side is a Poisson process with parameter $\lambda + \sum \lambda_n^+$. Suppose that the jumps of the total process on $[0, t]$ took place at the times $0 < \tau_1 < \cdots < \tau_k < t$. Then

$$|x_n(\tau_1-)| = |x_n(0)|\exp\{\beta_n w_n(\tau_1) + (\alpha_n - \beta_n^2/2)\tau_1\}.$$

If $\tau_1$ is a jump time for the process $\nu(t)$, then

$$|x_1(\tau_1)| = 0, \qquad |x_{n+1}(\tau_1)| = |x_n(\tau_1-)|.$$

If $\tau_1$ is a jump time for the process $\nu_{n_1}^+(t)$, then

$$|x_{n_1}(\tau_1)| = |x_{n_1}(\tau_1-)| + \gamma_{n_1}|x_{n_1+1}(\tau_1-)|, \qquad |x_n(\tau_1)| = |x_n(\tau_1-)| \ \text{ for } n \ne n_1.$$
Therefore,

$$\sum_n |x_n(\tau_1)|^2 \le \sum_n |x_n(\tau_1-)|^2 + \sum_n \big(2\gamma_n |x_n(\tau_1-)||x_{n+1}(\tau_1-)| + \gamma_n^2 |x_{n+1}(\tau_1-)|^2\big) I_{\{\nu_n^+(\tau_1) - \nu_n^+(\tau_1-) = 1\}}$$

$$\le \sum_n |x_n(\tau_1-)|^2 + \sum_n (\gamma_n + \gamma_n^2) I_{\{\nu_n^+(\tau_1) - \nu_n^+(\tau_1-) = 1\}} \sum_n |x_n(\tau_1-)|^2$$

$$\le \sum_n |x_n(\tau_1-)|^2 \exp\Big\{c \sum_n \gamma_n[\nu_n^+(\tau_1) - \nu_n^+(\tau_1-)]\Big\}, \tag{56}$$

where $c$ is such that $\gamma_n < c$. Let

$$\eta_t = \sum_n \Big[2\beta_n \sup_{0 \le u \le s \le t} |w_n(u) - w_n(s)| + (2\alpha_n - \beta_n^2)t\Big].$$

Then, for any $u < s \le t$,

$$\exp\{2\beta_n(w_n(s) - w_n(u)) + (2\alpha_n - \beta_n^2)(s - u)\} \le e^{\eta_t}.$$

Therefore,

$$\sum_n |x_n(\tau_1)|^2 \le \sum_n |x_n(0)|^2 \exp\Big\{\eta_t + c\sum_n \gamma_n \nu_n^+(\tau_1)\Big\}.$$

Similarly,

$$\sum_n |x_n(\tau_2)|^2 \le e^{\eta_t} \sum_n |x_n(\tau_1)|^2 \exp\Big\{c\sum_n \gamma_n(\nu_n^+(\tau_2) - \nu_n^+(\tau_1))\Big\} \le e^{2\eta_t} \sum_n |x_n(0)|^2 \exp\Big\{c\sum_n \gamma_n \nu_n^+(\tau_2)\Big\},$$

$$\sum_n |x_n(t)|^2 \le \sum_n |x_n(0)|^2 \exp\Big\{(k + 1)\eta_t + c\sum_n \gamma_n \nu_n^+(t)\Big\}.$$

This shows that under our assumptions a solution of (54) gives rise to a stochastic semigroup of bounded random operators.

We show that the process generated by a solution of (54) is irreducible away from the stationary point 0. If at least one of the $x_k(0)$ is nonzero, then with positive probability all the $\{x_n(h), n \le m\}$ will be nonzero, for any $h > 0$ and $m$ (for this it is necessary that the process $\nu(s)$ have $m - k$ jumps on $[0, h]$, and then the processes $\nu_{m-1}^+(s), \ldots, \nu_1^+(s)$ have one jump each, and the jumps are arranged in the same time order as the processes are written: first a jump for the process $\nu_{m-1}^+$, then for the process $\nu_{m-2}^+$, and so on; we assume that $k < m$). Using (55) and the fact that with positive probability $\nu(s)$ and $\nu_n^+(s)$ ($n \le m$) do not have jumps on $[h, t]$, while the $\nu_n^-(s)$ ($n \le m$) have a given number of jumps, we see that the point $(x_1(t), \ldots, x_m(t))$ in $R^m$ hits any ball with positive probability.

We remark also that if $\nu(t) = 0$ and $\nu_m^+(t) = 0$, then $\{x_1(t), \ldots, x_m(t)\}$ and $\{x_{m+1}(t), x_{m+2}(t), \ldots\}$ are independent collections of variables. With positive probability, $\nu_i^-(t) = \nu_i^+(t) = 0$ for all $i > m$. Then $x_n(t)$ can be expressed according to (55), and

$$\sum_{n>m} x_n^2(t) = \sum_{n>m} x_n^2(0)\exp\{2\beta_n w_n(t) + (2\alpha_n - \beta_n^2)t\}$$
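takes arbitrarily small values with positive probability. Therefore, for any point $\bar x = \{\bar x_n\} \in H$,

$$P\Big\{\sum_n (x_n(t) - \bar x_n)^2 < \varepsilon\Big\} \ge P\Big\{\sum_{n \le m} (x_n(t) - \bar x_n)^2 < \frac{\varepsilon}{2},\ \nu(t) = 0,\ \nu_m^+(t) = 0\Big\} \times P\Big\{\sum_{n > m} (x_n(t) - \bar x_n)^2 < \frac{\varepsilon}{2},\ \sum_{n > m} (\nu_n^-(t) + \nu_n^+(t)) = 0\Big\}$$

$$\ge P\Big\{\sum_{n \le m} (x_n(t) - \bar x_n)^2 < \frac{\varepsilon}{2},\ \nu(t) = 0,\ \nu_m^+(t) = 0\Big\} \times P\Big\{2\sum_{n > m} (x_n^2(t) + \bar x_n^2) < \frac{\varepsilon}{2},\ \sum_{n > m} (\nu_n^-(t) + \nu_n^+(t)) = 0\Big\}.$$

Choosing $m$ such that $\sum_{n > m} \bar x_n^2 < \varepsilon/4$, we see that both probabilities on the right-hand side are positive.

Using the Itô formula, we get that

$$E\,dx_n^2 = E\big\{(2\alpha_n + \beta_n^2)x_n^2\,dt + 2\beta_n x_n^2\,dw_n(t) + [2x_n(x_{n-1} - x_n) + (x_n - x_{n-1})^2]\,d\nu(t) + (2\gamma_n x_n x_{n+1} + \gamma_n^2 x_{n+1}^2)\,d\nu_n^+(t)\big\}$$

$$= E\big[(2\alpha_n + \beta_n^2)x_n^2 + (x_{n-1}^2 - x_n^2)\lambda + (2\gamma_n x_n x_{n+1} + \gamma_n^2 x_{n+1}^2)\lambda_n^+\big]\,dt$$

$$\le E\big[(2\alpha_n + \beta_n^2)x_n^2 + \lambda(x_{n-1}^2 - x_n^2) + (\gamma_n x_n^2 + (\gamma_n + \gamma_n^2)x_{n+1}^2)\lambda_n^+\big]\,dt,$$

$$dE\sum_n x_n^2 \le \sum_n E\big[(2\alpha_n + \beta_n^2 + \lambda_n^+\gamma_n)x_n^2 + \lambda(x_{n-1}^2 - x_n^2) + \lambda_n^+(\gamma_n + \gamma_n^2)x_{n+1}^2\big]\,dt$$

$$= E\sum_n \big(2\alpha_n + \beta_n^2 + \lambda_n^+\gamma_n + \lambda_{n-1}^+(\gamma_{n-1} + \gamma_{n-1}^2)\big)x_n^2\,dt.$$

Suppose that $2\alpha_n + \beta_n^2 + \lambda_n^+\gamma_n + \lambda_{n-1}^+(\gamma_{n-1} + \gamma_{n-1}^2) \le 0$ for all $n$. Then $E\sum_n dx_n^2 \le 0$, and hence $E\sum_n x_n^2(t) \le \sum_n x_n^2(0)$. This implies that a solution of system (54) is stable (in the mean square, but not asymptotically). Since $\sum_n x_n^2(t)$ is a supermartingale, this variable is bounded.

We show that there is an initial condition such that $P\{\lim_{t \to \infty} \sum_n |x_n(t)|^2 > 0\}$ can be made arbitrarily close to 1. Suppose that $x_n(0) = 0$ for $n \ne m$, and $x_m(0) = 1$. Denote by $\tau_1, \tau_1 + \tau_2, \ldots$ the jump times of the process $\nu(t)$. Then

$$P\{\nu_m^+(\tau_1) = 0\} = E e^{-\lambda_m^+ \tau_1} = \frac{\lambda}{\lambda + \lambda_m^+},$$

$$P\{\nu_{m+1}^+(\tau_1 + \tau_2) - \nu_{m+1}^+(\tau_1) = 0\} = \frac{\lambda}{\lambda + \lambda_{m+1}^+},$$

$$P\{\nu_{m+i}^+(\tau_1 + \cdots + \tau_{i+1}) - \nu_{m+i}^+(\tau_1 + \cdots + \tau_i) = 0\} = \frac{\lambda}{\lambda + \lambda_{m+i}^+}.$$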
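Hence (the empty sum $\sum_{j=1}^{0} \tau_j$ is taken to be 0),

$$P\Big\{\bigcap_{i=0}^{\infty} \{\nu_{m+i}^+(\tau_1 + \cdots + \tau_{i+1}) - \nu_{m+i}^+(\tau_1 + \cdots + \tau_i) = 0\}\Big\} = \prod_{i=0}^{\infty} \Big(1 + \frac{\lambda_{m+i}^+}{\lambda}\Big)^{-1} \tag{57}$$

can be made arbitrarily close to 1 by choosing a suitable $m$. Suppose that the event after the probability sign on the left-hand side of (57) takes place. Then for $t \in \big[\sum_{j=1}^{i} \tau_j, \sum_{j=1}^{i+1} \tau_j\big]$ we have that $x_n(t) = 0$ for $n \ne m + i$, and

$$|x_{m+i}(t)| = \exp\Big\{\sum_{n=m}^{m+i-1} \Big[\alpha_n \tau_{n-m+1} + \beta_n\Big(w_n\Big(\sum_{j=1}^{n-m+1} \tau_j\Big) - w_n\Big(\sum_{j=1}^{n-m} \tau_j\Big)\Big)\Big] + \alpha_{m+i}\Big(t - \sum_{j=1}^{i} \tau_j\Big) + \beta_{m+i}\Big(w_{m+i}(t) - w_{m+i}\Big(\sum_{j=1}^{i} \tau_j\Big)\Big)\Big\}.$$

Setting

$$\eta_n = \sup_{s \in [\sum_{j=1}^{n-m} \tau_j,\ \sum_{j=1}^{n-m+1} \tau_j]} \Big|w_n(s) - w_n\Big(\sum_{j=1}^{n-m} \tau_j\Big)\Big|,$$

we have that

$$\sum_n |x_n(t)|^2 \ge \exp\Big\{-2\sum_{n \ge m} |\alpha_n| \tau_{n-m+1} - 2\sum_{n \ge m} |\beta_n| \eta_n\Big\} > 0$$

(the convergence of the series in the exponent follows from the fact that the variables $\tau_k$ and $\eta_k$ are independent and identically distributed, while $\sum |\alpha_n| < \infty$ and $\sum |\beta_n| < \infty$).

3.2. Stability in the mean square. We consider strong homogeneous stochastic semigroups $U_s^t$ with second moments. Denote by $\mathcal{E}_t = E U_0^t$ and $\mathcal{V}_t(C) = E (U_0^t)^* C U_0^t$ the semigroups of first and second moments. One of the following continuity properties will be assumed for a stochastic semigroup:

a) $U_0^t$ is strongly continuous in the mean square if for all $x \in X$

$$\lim_{t \downarrow 0} E|U_0^t x - x|^2 = 0.$$

b) $U_0^t$ is uniformly continuous in the mean square if

$$\lim_{t \downarrow 0} \sup_{|x| \le 1} E|U_0^t x - x|^2 = 0.$$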
Condition a) is satisfied, for example, by solutions of the stochastic equation (51). Condition b) is equivalent to

$$\lim_{t \downarrow 0} \|\mathcal{V}_t(I) - I\| = 0.$$

Denote by $Q(C)$ the generating operator of the semigroup $\mathcal{V}_t(C)$:

$$Q(C) = \lim_{t \downarrow 0} \frac{1}{t}[\mathcal{V}_t(C) - C],$$

where the limit of operators is understood in the weak sense. In particular, if the stochastic semigroup is generated by equation (51), then $Q(C)$ is given by (52). If $C$ is in the domain of the generating operator $Q$, then

$$\frac{d}{dt}\mathcal{V}_t(C) = Q(\mathcal{V}_t(C)) = \mathcal{V}_t(Q(C)). \tag{58}$$

As Example 3 shows, stability and asymptotic stability in the mean square are different.

DEFINITION. A stochastic semigroup $U_0^t$ is said to be uniformly asymptotically stable in the mean square if

$$\lim_{t \to \infty} \sup_{|x| \le 1} E|U_0^t x|^2 = 0.$$

We have $E|U_0^t x|^2 = (\mathcal{V}_t(I)x, x)$ (see §2), and hence uniform asymptotic stability in the mean square is equivalent to the condition that $\|\mathcal{V}_t(I)\| \to 0$ as $t \to \infty$. Since $\mathcal{V}_{t+s}(I) = \mathcal{V}_t(\mathcal{V}_s(I))$ and

$$\|\mathcal{V}_{t+s}(I)\| \le \|\mathcal{V}_t(\|\mathcal{V}_s(I)\| I)\| \le \|\mathcal{V}_s(I)\|\,\|\mathcal{V}_t(I)\|,$$

uniform asymptotic stability in the mean square implies exponential stability in the mean square, i.e., the existence of an $a > 0$ such that

$$\|\mathcal{V}_t(I)\| \le \frac{1}{a}\exp\{-at\}. \tag{59}$$

THEOREM 9. Suppose that the stochastic semigroup satisfies condition b). The following assertions are equivalent:

1) $\lim_{t \to \infty} \|\mathcal{V}_t(I)\| = 0$.

2) There exists a positive operator $C$ with bounded inverse such that $Q(C) \le -aC$, $a > 0$.

3) $\int_0^\infty (\mathcal{V}_t(I)x, x)\,dt < \infty$ for all $x \in X$.

If one of these conditions holds, then

$$P\Big\{\lim_{t \to \infty} |U_0^t x| = 0\Big\} = 1.$$

PROOF. 1) implies 3) in view of (59). If 2) holds, then

$$\frac{d}{dt}\mathcal{V}_t(C) = \mathcal{V}_t(Q(C)) \le -a\mathcal{V}_t(C),$$
and hence $\mathcal{V}_t(C) \le e^{-at}C$, $\|\mathcal{V}_t(C)\| \le e^{-at}\|C\|$. Since $\delta I \le C$ for some $\delta > 0$, it follows that

$$\|\mathcal{V}_t(I)\| \le \frac{1}{\delta}\|\mathcal{V}_t(C)\| \le \frac{\|C\|}{\delta} e^{-at}.$$

Hence, 2) implies 1). It remains to show that 3) implies 2). Let

$$C = \int_0^\infty \mathcal{V}_t(I)\,dt.$$

Obviously, $C$ is a nonnegative symmetric operator. It follows from b) that there exists a $\delta > 0$ such that $\|\mathcal{V}_t(I) - I\| < 1/2$, and hence $\mathcal{V}_t(I) \ge \frac{1}{2}I$, for $t < \delta$. Therefore,

$$C \ge \int_0^\delta \mathcal{V}_t(I)\,dt \ge \frac{\delta}{2} I.$$

Further, by (59),

$$Q(C) = \int_0^\infty Q(\mathcal{V}_t(I))\,dt = \int_0^\infty \frac{d}{dt}\mathcal{V}_t(I)\,dt = -I.$$

Hence, $Q(C) = -I \le -C/\|C\|$.

Suppose that 2) holds. Then $e^{at}(C U_0^t x, U_0^t x)$ is a supermartingale: if $\mathcal{F}_t$ is the flow of $\sigma$-algebras generated by $U_0^t$, then

$$E\big[e^{a(t+h)}(C U_t^{t+h} U_0^t x, U_t^{t+h} U_0^t x) \mid \mathcal{F}_t\big] = e^{a(t+h)}(\mathcal{V}_h(C) U_0^t x, U_0^t x)$$

$$= e^{at}(C U_0^t x, U_0^t x) + e^{at}\int_0^h \frac{d}{du}\big[e^{au}(\mathcal{V}_u(C) U_0^t x, U_0^t x)\big]\,du$$

$$= e^{at}(C U_0^t x, U_0^t x) + e^{at}\int_0^h \big([a\mathcal{V}_u(C) + \mathcal{V}_u(Q(C))] U_0^t x, U_0^t x\big) e^{au}\,du$$

$$= e^{at}(C U_0^t x, U_0^t x) + e^{at}\int_0^h \big(\mathcal{V}_u(aC + Q(C)) U_0^t x, U_0^t x\big) e^{au}\,du \le e^{at}(C U_0^t x, U_0^t x)$$

(we have used the inequality $aC + Q(C) \le 0$). Therefore, $e^{at}(C U_0^t x, U_0^t x)$ is bounded, and $(C U_0^t x, U_0^t x) = O(e^{-at})$. Since $C \ge \delta I$ for some $\delta > 0$, it follows that $|U_0^t x|^2 = O(e^{-at})$. □

Below we give an example showing that it can happen that $\lim_{t \to \infty} E|U_0^t x|^2 = 0$ for all $x \in X$, and at the same time $\|\mathcal{V}_t(I)\| = 1$ for all $t > 0$.
EXAMPLE 4. Let $\{e_i\}$ be a basis in $X$, and suppose that the semigroup of operators $U_s^t$ is determined by the equalities

$$U_s^t e_i = \exp\{\beta_i[w_i(t) - w_i(s)] + (\alpha_i - \tfrac{1}{2}\beta_i^2)(t - s)\}\,e_i,$$

where the $w_i(t)$ are independent Wiener processes, and $\alpha_i$ and $\beta_i$ are constants. Then

$$E|U_s^t e_i|^2 = \exp\{\gamma_i(t - s)\}, \qquad \gamma_i = 2\alpha_i + \beta_i^2,$$

$$E U_s^t e_i = e^{\alpha_i(t-s)}\,e_i, \qquad E|U_0^t x - x|^2 = \sum_i (e^{\gamma_i t} - 2e^{\alpha_i t} + 1)(x, e_i)^2.$$

Condition a) holds if and only if for some $h > 0$ the quantities

$$\sup_{t \le h} (e^{\gamma_i t} - 2e^{\alpha_i t} + 1)$$

are bounded (with respect to $i$), and b) holds if and only if

$$\lim_{t \to 0} \sup_i |e^{\gamma_i t} - 2e^{\alpha_i t} + 1| = 0.$$

It is easy to see that if the first condition holds, then $\sup_i \alpha_i < \infty$ and $\sup_i |\beta_i| < \infty$, and if the second holds, then $\sup_i |\alpha_i| < \infty$ and $\sup_i |\beta_i| < \infty$. We have that

$$(\mathcal{V}_t(I)x, x) = \sum_i e^{\gamma_i t}(x, e_i)^2,$$

and if $\gamma_i < 0$ for all $i$, then

$$\lim_{t \to \infty} (\mathcal{V}_t(I)x, x) = 0.$$

If this condition holds, but $\lim_i \gamma_i = 0$, then $\|\mathcal{V}_t(I)\| = 1$.

REMARK. If condition 3) in Theorem 9 holds, then, as established in the proof of the theorem, $\mathcal{V}_t(C) \le e^{-at}C$ for $C = \int_0^\infty \mathcal{V}_t(I)\,dt$, for some $a > 0$. If the operator $C$ has a bounded inverse, then condition 1) holds. Obviously, $Cx \ne 0$ for $x \ne 0$ and $C^{-1}$ is defined on a dense set in any case. Let $\tilde U_0^t x = C^{1/2} U_0^t C^{-1/2} x$ ($C$ is a symmetric operator and $C^{1/2}$ is the nonnegative square root). Then $\tilde U_0^t x$ is defined on a dense set. Since

$$E(\tilde U_0^t x, \tilde U_0^t x) = E(C U_0^t C^{-1/2} x, U_0^t C^{-1/2} x) = (\mathcal{V}_t(C) C^{-1/2} x, C^{-1/2} x) \le e^{-at}(C C^{-1/2} x, C^{-1/2} x) = e^{-at}(x, x),$$

$\tilde U_0^t x$ extends to $X$ in the mean square. It is easy to verify that $\tilde U_s^t = C^{1/2} U_s^t C^{-1/2}$ is a stochastic semigroup satisfying condition a) if the stochastic semigroup $U_s^t$ satisfies this condition, and the semigroup $\tilde U_s^t$ is now uniformly asymptotically stable in the mean square.

It can happen that there is an unbounded positive operator $C$ such that $E(C U_0^t x, U_0^t x) < \infty$ for some $x \in X$. It is natural to understand the expression under the expectation sign as

$$\lim_{n \to \infty} E(C_n U_0^t x, U_0^t x),$$
where $C_n \in L_+(X)$, $C_n \uparrow C$. If the indicated limit exists on a dense set of values $x$, then we denote it by $(\mathcal{V}_t(C)x, x)$, and there really exists a nonnegative, symmetric, densely defined operator $\mathcal{V}_t(C)$ such that

$$(\mathcal{V}_t(C)x, x) = E(C U_0^t x, U_0^t x)$$

for all $x$ in its domain. If the $\mathcal{V}_t(C)$ are defined for all $t > 0$ and have a common domain, and, moreover, $\mathcal{V}_s(\mathcal{V}_t(C))$ is defined, then it can be seen by passing to the limit from bounded $C$ that the semigroup property $\mathcal{V}_s(\mathcal{V}_t(C)) = \mathcal{V}_{t+s}(C)$ is valid. If the limit

$$\lim_{t \downarrow 0} \frac{1}{t}([\mathcal{V}_t(C) - C]x, y)$$

exists for $x, y \in S$, where $S$ is a dense set on which the operators $\mathcal{V}_t(C)$ are defined, then this limit is representable in the form $(Rx, y)$, where $R$ is a symmetric operator with dense domain. Let $R = Q(C)$.

We remark that for such an "extended" generating operator of the semigroup $\mathcal{V}_t(C)$ equation (58) also holds if it is understood in the weak sense: instead of the operators in (58) it is necessary to consider the corresponding bilinear forms on elements in $S$.

The fact that there always exist unbounded positive operators $C$ such that $\mathcal{V}_t(C)$ is defined follows from the next lemma.

LEMMA 3. For any bounded set $\{x_k, k = 1, 2, \ldots\}$ and any stochastic semigroup satisfying condition a) there is a positive unbounded operator $C$ such that $(\mathcal{V}_t(C)x_k, x_k) < \infty$ for all $k$ and $t > 0$.

PROOF. Let $P_n \uparrow I$ be a sequence of finite-dimensional projection operators, and suppose that $\lambda_k > 0$ and $\sum \lambda_k < \infty$. Define

$$\varphi_n(t) = \sum_k \lambda_k (\mathcal{V}_t(I - P_n)x_k, x_k).$$

The functions $\varphi_n(t)$ are continuous and nonnegative, $\varphi_n(t) \ge \varphi_{n+1}(t)$, and $\varphi_n(t) \to 0$ for all $t$. Therefore, $\varphi_n(t)$ tends to zero uniformly on each bounded set. Let $a_{nm} = \sup_{t \le m} \varphi_n(t)$, and let $n_m$ be an increasing sequence such that $a_{n_m m} \le 1/m^2$. Define

$$C = \sum_m \sqrt{m}\,(I - P_{n_m}). \tag{60}$$
Obviously, $C$ is an unbounded operator, and for $t \le l$

$$(\mathcal{V}_t(C)x_k, x_k) = \sum_m \sqrt{m}\,(\mathcal{V}_t(I - P_{n_m})x_k, x_k) \le \frac{1}{\lambda_k}\sum_m \sum_j \sqrt{m}\,\lambda_j (\mathcal{V}_t(I - P_{n_m})x_j, x_j)$$

$$= \frac{1}{\lambda_k}\sum_m \sqrt{m}\,\varphi_{n_m}(t) \le \frac{1}{\lambda_k}\sum_{m=1}^{l} \sqrt{m}\,\varphi_{n_m}(t) + \frac{1}{\lambda_k}\sum_{m>l} \sqrt{m}\,a_{n_m m} \le \frac{1}{\lambda_k}\Big(\sum_{m=1}^{l} \sqrt{m}\,\varphi_{n_m}(t) + \sum_{m>l} m^{-3/2}\Big). \qquad \square$$

REMARK 1. If $\lim_{t \to \infty} (\mathcal{V}_t(I)x_k, x_k) = 0$ for all $k$, then $\lambda_k$ can be chosen so that

$$\lim_{n \to \infty} a_n = \lim_{n \to \infty} \sup_t \varphi_n(t) = 0,$$

and then the operator $C$ defined by (60) satisfies

$$\sup_t (\mathcal{V}_t(C)x_k, x_k) < \infty$$

for all $k$ when the $n_m$ are such that $a_{n_m} \le 1/m^2$.

REMARK 2. It can be assumed that $P_1$ is the zero operator. Then $C$ is invertible, and the inverse $C^{-1}$ is compact.

We consider conditions for asymptotic stability in the mean square. If

$$\lim_{t \to \infty} (\mathcal{V}_t(I)x, x) = 0 \quad \text{for all } x \in X,$$

then

$$\sup_{|x| \le 1} \sup_t (\mathcal{V}_t(I)x, x) < \infty,$$

and hence

$$\sup_t \|\mathcal{V}_t(C)\| \le k\|C\|,$$

where $k$ is a constant. Therefore, a necessary condition for asymptotic stability in the mean square is stability in the mean square, i.e., the existence of a constant $k$ such that $E|U_0^t x|^2 \le k|x|^2$ for all $t > 0$.