Текст
                    A Practical Introduction


PROBABILITY AND STOCHASTTCS SERIES Edited by Richard Durrett and Mark Pinsky Linear Stochastic Control Systems, Guanrong Chen, Goong Chen, and Shia-Hsun Hsu Advances in Queiteing: Theory, Methods, and Open Problems, Jewgeni H. Dshalalow Stoclmstics Calculus: A Practical Introduction, Richard Durrette Chaos Expansion, Multiple Weiner-Ito Integrals and Applications, Christian Houdre and Victor Perez-Abreu Wliite Noise Distribution Theory, Hui-Hsiung Kuo Topics in Contemporary Probability, J. Laurie Snell
Probability and Stochastics Series A Practical Introduction icharci Duire CRC Press Boca Raton New York London Tokyo
m0 7,m J *S» FY Of ^¾ Tim Pletscher: Dawn Boyd: Susie Carlisle: Arline Massey: Paul Gottehrer: Kevin Luong: Sheri Schwartz: Acquiring Editor Cover Designer Marketing Manager Associate Marketing Manager Assistant Managing Editor, EDP Pre-Press Manufacturing Assistant Library of Congress Cataloging-in-Publication Data Catalog record is available from the Library of Congress This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher. CRC Press, Inc.'s consent does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press for such copying. Direct all inquiries to CRC Press, Inc., 2000 Corporate Blvd., N.W., Boca Raton, Florida 33431. © 1996 by CRC Press, Inc. No claim to original U.S. Government works International Standard Book Number 0-8493-8071-5 Printed in the United States of America 1234567890 Printed on acid-free paper
Preface This book is the re-incarnation of my first: took Browriian Motion and Martingales in Analysis, which was published by Wadswotth in-1984. For more than a decade I have used Chapters 1, 2, 8, ancl-9 of that'book to give "reading courses" to graduate students who have completed, the first year graduate probability course and were interested in learning more about processes that move continuously in space and time. Taking the advice from biology that "form follows function" I have taken that material on stochastic integration, stochastic differential equations, Brownian motion and its relation to partial differential equations to be the core of this book (Chapters 1-5). To this I have added other practically important topics: one dimensional diffusions, semigroups and generators, Harris chains, and weak convergence. I have struggled with this material for almost twenty years. I now think that I understand most of it, so to help you master it in less time, I have tried to explain it as simply and clearly as I can. My students' motivations for learning this material have been diverse: some have wanted to apply ideas from probability to analysis or differential geometry, others have gone on to do research on diffusion processes or stochastic partial differential equations, some have been interested in applications of these ideas to finance, or to problems in operations research. My motivation for writing this book, like that for Probability Theory and Examples, was to simplify my life as a teacher by bringing together in one place useful material that is scattered in a variety of sources. An old joke says that "if you copy from one book that is plagiarism, but if you copy from ten books that is scholarship." From that viewpoint this is a scholarly book. Its main contributors for the various subjects are (a) stochastic integration and differential equations: Chung and Williams (1990), Ikeda and Watanabe (1981), Karatzas and Shreve (1991), Protter (1990), Revuz and Yor (1991), Rogers and Williams (1987), Stroock and Varadhan (1979); (b) partial differential equations: Folland (1976), Friedman (1964), (1975), Port and Stone (1978), Chung and Zhao (1995); (c) one dimensional diffusions: Karlin and Taylor (1981); (d) semi-groups and generators: Dynkin (1965), Revuz and Yor (1991); (e) weak convergence: Billingsley (1968), Ethier and Kurtz (1986), Stroock and Varadhan (1979). If you bought all those books you would spend more than $1000 but for a fraction of that cost you can have this book, the intellectual equivalent of the ginzu knife.
vi Preface Shutting off the laugh-track and turning on the violins, the road from this book's first publication in 1984 to its rebirth in 1996 has been a long and winding one. In the second half of the 80's I accumulated an embarrassingly long list of typos from the first edition. Some time at the beginning of the 90's I talked to the editor who brought my first three books into the world, John Kimmel, about preparing a second edition. However, after the work was done, the second edition was personally killed by Bill Roberts, the President of Brooks/Cole. At the end of 1992 I entered into a contract with Wayne Yuhasz at CRC Press to produce this book. In the first few months of 1993, June Meyerman typed most of the book into TeX. In the Fall Semester of 1993 I taught a course from this material and began to organize it into the current form. By the summer of 1994 I thought I was almost done. At this point I had the good (and bad) fortune of having Nora Guertler, a student from Lyon, visit for two months. When she was through making an average of six corrections per page, it was clear that the book was far from finished. During the 1994-95 academic year most of my time was devoted to preparing the second edition of my first year graduate textbook Probability: Theory and Examples. After that experience my brain cells could not bear to work on another book for another several months, but toward the end of 1995 they decided "it is now or never." The delightful Cornell tradition of a long winter break, which for me stretched from early December to late January, provided just enough time to finally finish the book. I am grateful to my students who have read various versions of this book and also made numerous comments: Don Allers, Hassan Allouba, Robert Bat- tig, Marty Hill, Min-jeong Kang, Susan Lee, Gang Ma, and Nikhil Shah. Earlier in the process, before I started writing, Heike Dengler, David Lando and I spent a semester reading Protter (1990) and Jacod and Shiryaev (1987), an enterprise which contributed greatly to my education. The ancient history of the revision process has unfortunately been lost. At the time of the proposed second edition, I transferred a number of lists of typos to my copy of the book, but I have no record of the people who supplied the lists. I remember getting a number of corrections from Mike Brennan and Ruth Williams, and it is impossible to forget the story of Robin Pemantle who took Brownian Motion, Martingales and Analysis as his only math book for a yearlong trek through the South Seas and later showed me his fully annotated copy. However, I must apologize to others whose contributions were recorded but whose names were lost. Flame me at rtdl@cornell.edu and I'll have something to say about you in the next edition. Rick Durrett
About the Author Rick Durrett received his Ph.D. in Operations research from Stanford University in 1976. He taught in the Mathematics Department at UCLA for nine years before becoming a Professor of Mathematics at Cornell University. He was a Sloan Fellow 1981-83, Guggenheim Fellow 1988-89, and spoke at the International Congress of Math in Kyoto 1990. Durrett is the author of a graduate textbook, Probability: Theory and Examples, and an undergraduate one, The Essentials of Probability. He has written almost 100 papers with a total of 38 co-authors and seen 19 students complete their Ph.D.'s under his direction. His recent research focuses on the applications of stochastic spatial models to various problems in biology.
1. Brownian Motion 1.1 Definition and Construction 1 1.2 Markov Property, Blumenthal's 0-1 Law 7 1.3 Stopping Times, Strong Markov Property 18 1.4 First Formulas 26 2. Stochastic Integration 2.1 Integrands: Predictable Processes 33 2.2 Integrators: Continuous Local Martingales 37 2.3 Variance and Covariance Processes 42 2.4 Integration w.r.t. Bounded Martingales 52 2.5 The Kunita-Watanabe Inequality 59 2.6 Integration w.r.t. Local Martingales 63 2.7 Change of Variables, Ito's Formula 68 2.8 Integration w.r.t. Semimartingales 70 2.9 Associative Law 74 2.10 Functions of Several Semimartingales 76 Chapter Summary 79 2.11 Meyer-Tanaka Formula, Local Time 82 2.12 Girsanov's Formula 90 3. Brownian Motion, II 3.1 Recurrence and Transience 95 3.2 Occupation Times 100 3.3 Exit Times 105 3.4 Change of Time, Levy's Theorem 111 3.5 Burkholder Davis Gundy Inequalities 116 3.6 Martingales Adapted to Brownian Filtrations 119 4. Partial Differential Equations A. Parabolic Equations 4.1 The Heat Equation 126 4.2 The Inhomogeneous Equation 130 4.3 The Feynman-Kac Formula 137 B. Elliptic Equations 4.4 The Dirichlet Problem 143 4.5 Poisson's Equation 151 4.6 The Schrodinger Equation 156 C. Applications to Brownian Motion 4.7 Exit Distributions for the Ball 164 4.8 Occupation Times for the Ball 167 4.9 Laplace Transforms, Arcsine Law 170
5. Stochastic Differential Equations 5.1 Examples 177 5.2 Ito's Approach 183 5.3 Extension 190 5.4 Weak Solutions 196 5.5 Change of Measure 202 5.6 Change of Time 207 6. One Dimensional Diffusions 6.1 Construction 211 6.2 Feller's Test 214 6.3 Recurrence and Transience 219 6.4 Green's Functions 222 6.5 Boundary Behavior 229 6.6 Applications to Higher Dimensions 234 7. Diffusions as Markov Processes 7.1 Semigroups and Generators 245 7.2 Examples 250 7.3 Transition Probabilities 255 7.4 Harris Chains 258 7.5 Convergence Theorems 268 8. Weak Convergence 8.1 In Metric Spaces 271 8.2 Prokhorov's Theorems 276 8.3 The Space C 282 8.4 Skorohod's Existence Theorem for SDE 285 8.5 Donsker's Theorem 287 8.6 The Space D 293 8.7 Convergence to Diffusions 296 8.8 Examples 305 Solutions to Exercises 311 References 335 Index 339
1 Brownian Motion 1.1. Definition and Construction In this section we will define Brownian motion and construct it. This event, like the birth of a child, is messy and painful, but after a while we will be able to have fun with our new arrival. We begin by reducing the definition of a d-dimensional Brownian motion with a general starting point to that of a one dimensional Brownian motion starting at 0. The first two statements, (1.1) and (1.2), are part of our definition. (1.1) Translation invariance. {Bt — Bo,t > 0} is independent of Bq and has the same distribution as a Brownian motion with B0 = 0. (1.2) Independence of coordinates. If B0 = 0 then {B], t > 0}..., {Bf, t > 0} are independent one dimensional Brownian motions starting at 0. Now we define a one dimensional Brownian motion starting at 0 to be a process Bt, t > 0 taking values in R that has the following properties: (a) If t0 <ti< ... < tn then £(*(,),#(*i) - B(t0), ...B(tn) - B(tn^) are independent. (b) If s, t > 0 then P(B(s +t)- B(s) £A)= I (27rf)-1/2 exp(-z2/2i) dx Ja (c) With probability one, Bq = 0 and t —+ Bt is continuous. (a) says that Bt has independent increments, (b) says that the increment B(s +1) — B(s) has a normal distribution with mean vector 0 and variance t. (c) is self-explanatory. The reader should note that above we have sometimes
2 Chapter 1 Brownian Motion written Bs and sometimes written B(s), a practice we will continue in what follows. An immediate consequence of the definition that will be useful many times is: (1.3) Scaling relation. If Bo = 0 then for any t > 0, {BsUs>0}i{tl'2BS!s>0} To be precise, the two families of random variables have the same finite dimensional distributions, i.e., if s\ < ... < s„ then (ft,« ft.«) = (''\ ''\) In view of (1.2) it suffices to prove this for a one dimensional Brownian motion. To check this when n = 1, we note that W2 times a normal with mean 0 and variance s is a normal with mean 0 and variance st. The result for n > 1 follows from independent increments. A second equivalent definition of one dimensional Brownian motion starting from Bo = 0, which we will occasionally find useful, is that Bt, t > 0, is a real valued process satisfying (a') Bt- is a Gaussian process (i.e., all its finite dimensional distributions are multivariate normal), (b') EB, = 0, EBsBt = sAt = min{s,t}, (c) With probability one, t —+ Bt is continuous. It is easy to see that (a) and (b) imply (a'). To get (b') from (a) and (b) suppose s < t and write EB,Bt = E{B2S) + E(B,(Bt - Bs)) = s + EBsE(Bt - Bs) = s The converse is even easier, (a') and (b') specify the finite dimensional distributions of Bt, which by the last calculation must agree with the ones defined in (a) and (b). The first question that must be addressed in any treatment of Brownian motion is, "Is there a process with these properties?" The answer is ICYes," of course, or this book would not exist. For pedagogical reasons we will pursue an approach that leads to a dead end and then retreat a little to rectify the difficulty. In view of our definition we can restrict our attention to a one dimensional Brownian motion starting from a fixed jgR. We could take x = 0 but will not for reasons that become clear in the remark after (1.5).
Section 1.1 Definition and Construction 3 For each 0 < ti < ... < tn define a measure on R" by m—li xmj JAx Ja„ m=1 where x0 = x, to = 0, Pt(a, b) = (2^)-^2 exp(-(6 - a)2/2t) and A{ G H the Borel subsets of R. From the formula above it is easy to see that for fixed x the family y. is a consistent set of finite dimensional distributions (f.d.d.'s), that is, if {si,... sn_i} C {ti, ...tn} and tj ¢{^1,.. .sn_i} then /fx,jx,...*n_i (^i x • - • x j4„_i) = tiXitu,„in(Ai x-xAj^ixRxAjX-x An„{) This is clear when j = n. To check the equality when 1 < j < n, it is enough to show that j Ptj-ts-t (X,V)Ptj+x-ts{v, 2) dy = ptj+l_tj._x(a:,z) By translation invariance, we can without loss of generality assume x = 0, but all this says in that case is the sum of independent normals with mean 0 and variances tj — i,-_i and ty+i — tj has a normal distribution with mean 0 and variance tJ+1 — ty_i. With the consistency of f.d.d.'s verified we get our first construction of Brownian motion: (1.4) Theorem. Let fi0 = {functions w : [0,oo) -+ R} and T0 be the cr-field generated by the finite dimensional sets {w : w(tj) € -4,- for 1 < i < n} where A{ £ It. For each x G R, there is a unique probability measure vx on (fi0, T^) so that vx{w : w(0) = a;} = 1 and when 0 < ti • ■ ■ < tn (*) vs({w : u(ti) G A{}) = Hx,tu...u{M x • • • x An) This follows from a generalization of Kolmogorov's extension theorem. We will not bother with the details since at this point we are at the dead end referred to above. If C = {w : t —+ w(t) is continuous} then C ¢ T0, that is, C is not a measurable set. The easiest way of proving C $ T0 is to do Exercise 1.1. A € T0 if and only if there is a sequence of times t\, t2, ... G [0,oo) and a B G ft*1-2-} (the infinite product cr-field TI x TI x ■ ■ ■) so that A = {w : (w(ti),w(t2),.--) G B). In words, all events in T0 depend on only countably many coordinates.
4 Chapter 1 Brownian Motion The above problem is easy to solve. Let Q2 = {m2~n : m, n > 0} be the dyadic rationals. If Qq = {w : Q2 -+ R} and Tq is the cr-field generated by the finite dimensional sets, then enumerating the rationals gi, g2, • • • an<i applying Kolmogorov's extension theorem shows that we can construct a probability vx on (fig,.Fg) so that i>x{u : w(0) = x} = 1 and (*) in (1.4) holds when the U £ Q2. To extend the definition from Q2 to [0,oo) we will show: (1.5) Theorem. Let T < oo and x G R. vx assigns probability one to paths w : Q2 —+ R that are uniformly continuous on Q2 fl [0,T]. Remark. It will take quite a bit of work to prove (1.5). Before taking on that task, we. will attend to the last measure theoretic detail: we tidy things up by moving our probability measures to (C,C) where C = {continuous w : [0,oo) —+ R} and C is the cr-field generated by the coordinate maps t —nj(t). To do this, we observe that the map tp that takes a uniformly continuous point in fig to its extension in C is measurable, and we set Px = vx of1. Our construction guarantees that Bt (w) = wt has the right finite dimensional distributions for t € Q2. Continuity of paths and a simple limiting argument shows that this is true when t £ [0,oo). As mentioned earlier the generalization to d > 1 is straightforward since the coordinates are independent. In this generality C = {continuous w : [0,oo) -+ Rd} and C is the cr-field generated by the coordinate maps t —+ w(t). The reader should note that the result of our construction is one set of random variables Bt(u) = w(t), and a family of probability measures Px, x € Rd, so that under Px, Bt is a Brownian motion with PX(B0 = x) = 1. It is enough to construct the Brownian motion starting from an initial point x, since if we want a Brownian motion starting from an initial measure y. (i.e., have P^Bq & A) = n(A)) we simply set P„ (A) = j n(dx)Px(A) Proof of (1.5) By (1.1) and (1.3), we can without loss of generality suppose B0 = 0 and prove the result for T = 1. In this case, part (b) of the definition and the scaling relation (1.3) imply EodBt - 53|4) = £(,15,-,14 = C(t - sf where C = i?!-^4 < oo. From the last observation we get the desired uniform continuity by using a result due to Kolmogorov. In this proof, we do not use the independent increments property of Brownian motion; the only thing we
Section 1.1 Definition and Construction 5 use is the moment condition. In Section 2.11 we will need this result when the Xt take values in a space S with metric p\ so, we will go ahead and prove it in that generality. (1.6) Theorem. Suppose that Ep(X1,Xi)^ < K\t- s\1+a where a, /? > 0. If 7 < a//? then with probability one there is a contant C(w) so that p(Xq,Xr)<C\q-rp foraUg,r€Q2n[0,l] Proof Let 7 < a//?, r\ > 0, I„ = {(i,j) : 0 < i < j < 2",0 < j - i < 2""} and G„ = {P(X(j2-n),X(i2-n)) < ((j - i)2~")T for all (i,j) G /„}. Since af3P(\Y\>a)<E\Y\l3 we have P(°n)< E (0'-02-n)-^^(X0-2-n),X(i2-))" («",i)6/n («"j")6In by our assumption. Now the number of (i,j) G In is < 2n2n'7, so p(Gn) < K • S"2"" • (2n"2~n)-^+1+a = K2~nX where A = (1 — 77)(1 + a — f3y) -(1 + »7). Since 7 < a//?, we can pick r\ small enough so that A > 0. To complete the proof now we will show (1.7) Lemma. Let A = 3 • 2^-^/(1 ~ 2_7)- On #v = ^n=NGn we have />(X8,Xr)<A|g-rr forg,rGQ2n[0,l] with \q-r\< 2-^-^ (1.6) follows easily from (1.7): 00 00 K9~NX P(HeN) < £ P(Gcn) < # £ 2-A = yz^z n=N n=N This shows p(Xq,Xr) < A\q — r|7 for |g — r\ < <5(w) and implies that we have p(Xq,Xr) < C(u)\q - r[r for q,r G [0,1]. Proof of (1.7) Let q, r G Q2 n [0,1] with 0 < r - q < 2-0-*)". Pick m > N so that 2-(m+l)(l-ij) < r _ < 2-"»(1-ij)
6 Chapter 1 Brownian Motion and write r = j2~m + 2~r(1} + • • • + 2~rW g = i2~m - 2~qW 2-q^ where m < r(l) < ■ • • < r(£) and m < g(l) < • • • < q(k). Now 0 < r — g < 2-m(i-^)j so y _ ^ < 2mi and it follows that on HN (a) /)(X(i2-ra), X(j2-m)) < ((2m")2-my On ifjv it follows from the triangle inequality that k oo (b) P(xq, x(i2-m)) < Y^C2~q{h)y < X) (2_7)A ^ C72_7m A=l h=m where C7 = 1/(1 — 2-7) > 1. Repeating the last computation shows (c) P(Xr,X(j2-m))<C^2-^m Combining (a)-(c) gives p(Xq,Xr) < 3^2-^^-^ < 3(7^-^ - g|T since 2-m(1-") < 21-"|r.— g|. This completes the proof of (1.7) and hence of (1.6) and (1.5) D The scaling relation (1.2) implies E\Bt - B3\2m = Cm\t - s\m where Cm = E\Bx\2m So using (1.6) with /? = 2m and a = m — 1 and then letting m —► oo gives: (1.8) Theorem. Brownian paths are Holder continuous with exponent j for any j < 1/2. It is easy to show: (1.9) Theorem. With probability one, Brownian paths are not Lipschitz continuous (and hence not differentiate) at any point. Proof Let An = {w : there is an s £ [0,1] so that \Bt — Bs\ < C\t — s\ when \t - s\ < 3/n}. For 1 < k < n - 2 let (K^)- n,„=max< B[^^)-B< *+i 1 n Gn = { at least one YkiU is < 5C/n} :j = 0,1,2
Section 1.2 Markov Property, Blumenthal's 0-1 Law 7 We claim that An C Gn. To prove this we consider s = 1, which we claim is the worst possibility. In this case we want to conclude that Yn_2i„ < 5C/n. For this we observe that j = 0 is the worst case, but even then |£(„-3)/„ " £(n-2)/„| < |£(„_3)/„ ~B1\+\B1- £(„-2)/„| < C ^ + £) Using An C Gn and the scaling relation (1.2) gives P(A.) < P(Gn) <nP (|J31/n| < ^-\ since exp(—x2/2) < 1. Letting n —+ oo shows P(An) —+ 0. Noticing n —+ j4„ is increasing shows P(An) = 0 for all n and completes the proof. D Exercise 1.2. Show by considering k increments instead of 3 that if 7 > 1/2+ 1/k then with probability 1, Brownian paths are not Holder continuous with exponent 7 at any point of [0,1]. The next result is more evidence that Bt — Bs w y/t — s. Exercise 1.3. Let Am,„ = B(tm2-n) - B(t(m-l)2~n). Compute \m<2" and use the Borel-Cantelli lemma to conclude that ^m<2n A^, n —+ t a.s. as n —+ 00. Remark. The last result is true if we consider a sequence of partitions ni C n2 C ... with mesh -+ 0. See Freedman (1970) p.42-46. The true quadratic variation, defined as the sup over all partitions, is 00 for Brownian motion. 1.2. Markov Property, Blumenthal's 0-1 Law Intuitively the Markov property says "given the present state, Bs, any other information about what happened before time s is irrelevant for predicting what happens after time s."
8 Chapter 1 Brownian Motion Since Brownian motion is translation invariant, the Markov property can be more simply stated as: "if s > 0 then Bi+S — Bs, t > 0 is a Brownian motion that is independent of what happened before time s." This should not be surprising: the independent increments property ((a) in the definition) implies that if si < S2 ■ ■ ■ < sm = s and 0 < ti... < tn then (Bil+1 -BS,...BU+S -Bs) is independent of (BSl,...BSn) The major obstacle in proving the Markov property for Brownian motion then is to introduce the measure theory necessary to state and prove the result. The first step in doing this is to explain what we mean by "what happened before time s." The technical name for what we are about to define is a filtration, a fancy term for an increasing collection of cr-fields, Ta> i.e., if s < t then •?\j C Tf Since we want Bs £ Tat i.e., Bs is measurable with respect to Ts, the first thing that comes to mind is T's = <r(Br :r<s) For technical reasons, it is convenient to replace T° by The fields Tf are nicer because they are right continuous. That is, nt>sT+ = nt>3 (nu>1;Fu0) = nu>^„° = Tf In words the T% allow us an "infinitesimal peek at the future," i.e., A G Tf if it is in T°+c for any e > 0. If Bt is a Brownian motion and / is any measurable function with f(u) > 0 when u > 0 the random variable limsup (Bt -Bs)/f(t-s) U> is measurable with respect to T+ but not T°. Exercises 2.9 and 2.10 consider what happens when we take f(u) — y/u and f(u) = \/wloglog(l/w). However, as we will see in (2.6), there are no interesting examples of sets that are in Ff but not in F°. The two cr-fields are the same (up to null sets). To state the Markov property we need some notation. First, recall that we have a family of measures Px, x € Rd, on (C, C) so that under Px, Bt(w) = w(t) is a Brownian motion with B0 = x. In order to define "what happens after time
Section 1:2 Markov Property, Blumenthal's 0-1 Law 9 s," it is convenient to define the shift transformations 9S : C —► C, for s > 0 by (6sw)(t)=w(s+t) forf>0 In words, we cut off the part of the path before time s and then shift the path so that time s becomes time 0. To prepare for the next result, we note that if Y : C —* R is C measurable then Y o6s is a function of the future after time s. To see this, consider the simple example Y(w) = f(w(t)). In this case Y o 0, = f(0su(t)) = f(w(s +1)) = f(B1+i) Likewise, if Y(w) = /(wtx,.. -wln) then Yo0s=f(Bs+tl,...Bs+tn) (2.1) The Markov property. If s > 0 and Y is bounded and C measurable then for all x £ Rd Ex{YoQs\T+) = EB,Y where the right hand side is <p(y) = EyY evaluated at y = B(s). Explanation. In words, this says that the conditional expectation of Y o 0, given T+ is just the expected value of Y for a Brownian motion starting at Bs. To explain why this implies "given the present state, Bs, any other information about what happened before time s is irrelevant for predicting what happens after time s," we begin by recalling (see (1.1) in Chapter 5 of Durrett (1995)) that if g C T and E{Z\T) £ G then E{Z\T) = E{Z\Q). Applying this with T = T+ and Q = cr(Bs) we have EX(Y o 0,\Tf) = EX(Y o6,\Bs) If we recall that X = E{Z\T) is our best guess at Z given the information in F (in the sense of minimizing E(Z — X)2 over X € Fs, see (1.4) in Chapter 4 of Durrett (1995)) then we see that our best guess at Y o ds given Bs is the same as our best guess given Ff, i.e., any other information in T* is irrelevant for predicting Y o9s. Proof By the definition of conditional expectation, what we need to show is (MP) Ex{Yo6s;A) = Ex{EB,Y;A) for all A £ JF+
10 Chapter 1 Brownian Motion We will begin by proving the result for a special class of Y's and a special class of A's. Then we will use the tt — A theorem and the monotone class theorem to extend to the general case. Suppose ^)= 33 /™K*m)) l<m<n where 0 < t\ < ... <tn and the fm : Rd —+ R are bounded and measurable. Let 0 < h< 11, let 0 < si... < sk < s + h, and let A = {w : w(sy) G Aj, 1 < j < k} where Aj £ Tid for 1 < j < k. We will call these A's the finite dimensional or f.d. sets in .?7+A. Prom the definition of Brownian motion it follows that if 0 = u0 < u\ < ...ut then the joint density of (BUl ,...BUt) is given by t Px(BUl = yi,...BUe=yt) = JJpU£_Ui_x(y,-i,y,) 1=1 where yQ = x. From this it follows that t (yo.yi)ffi(yi) • •• / dylpUt_Ut_l(yt-Uyl)gi(yt) Applying this result with the u,- given by s\,.. .,sk,s + h,s + ti,.. .s +tn and the obvious choices for the gi we have EX(Y o 0,-A) = Ex [ JJ UABsj) ■ ln*(B,+h) ■ JJ fM+im) U = l m=l /R-* where JAx JAk ■ / dyp1+h_Sk(xk,y)ip(y,h) <p(y,h)= / dyiPu-h(y,yi)h(yi)--- I ^ynPi„-i„_i(yn-i,yn)/„(y„) Using the formula for Ex 11/= i 9i{BUi) again we have
Section 1.2 Markov Property, Blumenthal's 0-1 Law 11 (MP-1) EX(Y o 0,; A) = Ex(<p(B1+h, A); A) for all f.d. sets A in T°i+h To extend the class of j4's to all of ^"/+A, we use (2.2) The tt — A theorem. Let A be a collection of subsets of Q that contains fi and is closed under intersection (i.e., if A, B &A then A n B G .4)- Let £ be a collection of sets that satisfy (i) if A,B G g and A D B then A - B G £ (ii) If j4„ G £ and j4„ | .4, then A eg. If Acg then the cr-field generated by A, (r(A) C £• Proof See (2.1) in the Appendix of Durrett (1995). (MP-2) EX(Y oO,;A) = Ex(<p(Bs+h, A); A) for all A G J?+fc Proof Fix Y and let ^ be the collection of sets A for which the desired equality is true. A simple subtraction shows that (i) in (2.2) holds, while the monotone convergence theorem shows that (ii) holds. If we let A be the collection of finite dimensional sets in T°+h then we have shown A C g so T°+h = c(A) C g which is the desired conclusion. D Our next step is (MP-3) EX(Y o 0S; A) = Ex(<p(Bs, 0); A) for all A G Tf Proof Since Tf C -F/+A for any h > 0, all we need to do is let h -+ 0 in (MP-2). It is easy to see that ip(y\) = /1(2/1) / dy2pt2-tl(2/1,2/2)/2(½) ••• I ^2/nPi„-i„_x(2/n-l,2/n)/n(2/n) is bounded and measurable. Using the dominated convergence theorem shows that if h —+ 0 and x^—^x then <p(xh,h)= dyipil_h(xh,yi)ijj(yi)->-'p(x,0) Using (MP-2) and the bounded convergence theorem now gives the desired result. D
12 Chapter 1 Browniaji Motion (MP-3) shows that (MP) holds for Y = ni<m<„ /m(w(im)) when the fm are bounded and measurable. To extend to general "bounded measurable Y we will use (2.3) Monotone class theorem. Let A be a collection of subsets of Q that contains Q and is closed under intersection (i.e., if A, B G A then ifl5G .4). Let Ti be a vector space of real valued functions on fi satisfying (i) JfAeA,lA€ H. (ii) If 0 < /„ G Ti and /„|/,a bounded function, then f eTi. Then Ti contains all the bounded functions on Q that are measurable with respect to cr(A). Proof See (1.4) of Chapter 5 of Durrett (1995). To get (2.1) from (2.3), fix an A G T+ and let Ti = the collection of bounded functions Y for which (MP) holds. Ti clearly is a vector space satisfying (ii). Let A be the collection of sets of the form {w : u(tj) G Aj, 1 < j < n} where Aj G Tid. The special case treated above shows that if A G A then 1,4 G Ti. This shows (i) holds and the desired conclusion follows from (2.3). D The next seven exercises give typical applications of the Markov property. Perhaps the simplest example is Exercise 2.1. Let 0 < s < t. If / : Rd -+ R is bounded and measurable Ex{f{Bt)\Ts) = EBJ{Bt^) Exercise 2.2. Take f(x) = xt and f(x) = hxj with i ^ j in Exercise 2.1 to conclude that B\ and B\B\ are martingales if i ^ j. The next two exercises prepare for calculations in Section 1.4. Exercise 2.3. Let TQ = inf{s > 0 : B, = 0} and let R = M{t > 1 : Bt = 0}. R is for right or return. Use the Markov property at time 1 to get (2.4) PX(R > 1 +1) = J pi(x, y)Py(T0 > t) dy Exercise 2.4. Let TQ = inf{s > 0 : B, = 0} and let L = sup{f < 1 : Bt = 0}. L is for left or last. Use the Markov property at time 0 < t < 1 to conclude (2.5) P0(L <t)= [ Pt(0,y)Py(To >l-t)dy
Section 1.2 Markov Property, Blumenthal's 0-1 Law 13 Exercise 2.5. Let G be an open set and let T = inf{£ : Bt £ G). Let K be a closed subset of G and suppose that for all x G K we have PX(T > 1, Bt G K) > a, then for all integers n > 1 and x G K we have Px(r > n,Bt G K) > a". The next two exercises prepare for calculations in Chapter 4. Exercise 2.6. Let 0 < s <t. If A : R x Rd —+ R is bounded and measurable Ex (J h(r,Br)dr ?\ = J' h(r,Br)dr + Eb(s) / h(s + u,Bu)du Jo Exercise 2.7. Let 0 < s < t. If / : Rd -+ R and h : R x Rd -+ R are bounded and measurable then Ex(f(Bt)exp(J h(r,Br)dA t\ = exp (J h(r, Br) dA EBl //(5,-,) exp (J h(s + u, Bu) du\ \ The reader will see many other applications of the Markov property below, so we turn our attention now to a "triviality" that has surprising consequences. Since it follows (see, e.g., (1.1) in Chapter 5 of Durrett (1995)) that EX(Y o 0,\Tf) = Ex(Yo et\Tt) Prom the last equation it is a short step to (2.6) Theorem. If Z G C is bounded then for all s > 0 and x G Rd, Ex{Z\Tf) = Ex{Z\rs). Proof By the monotone class theorem, (2.3), it suffices to prove the result when n Z = J]; fm(B(tm)) m=l
14 Chapter 1 Browniaji Motion and the fm are bounded and measurable. In this case Z = X(Y 0¾) where X G F° and Y G C, so using a property of conditional expectation (see, e.g., (1.3) in Chapter 4 of Durrett (1995)) and the Markov property (2.1) gives Ex{Z\Tf) = XEX(Y 0 0, \F+) = XEB,Y G Tf and the proof is complete. D If we let Z G F+ then (2.6) implies Z = EX(Z\T°) G T°, so the two cr-fields are the same up to null sets. At first glance, this conclusion is not exciting. The fun starts when we take s = 0 in (2.6) to get (2.7) Blumenthal's 0-1 law. If A G ^"0+ then for all x G Rd, Px(A)e {0,1} Proof Using (i) the fact that A G T$, (ii) (2.6), (iii) T% = <r(J30) is trivial under Ps, and (iv) if Q is trivial E{X\Q) = EX gives 1A = EX(1A\F+) = EX(1A\FS) = PM) P* a.s. This shows that the indicator function 1,4 is a.s. equal to the number PX(A) and it follows that PX(A) G {0,1}. D In words, the last result says that the germ field, J7^, is trivial. This result is very useful in studying the local behavior of Brownian paths. Until further notice we will restrict our attention to one dimensional Brownian motion. (2.8) Theorem. If r = inf{i > 0 : Bt > 0} then P0(t = 0) = 1. Proof Pq(t <t)> Po(Bi > 0) = 1/2 since the normal distribution is symmetric about 0. Letting t J. 0 we conclude P0(t = 0) = limP0(t <t)> 1/2 so it follows from (2.7) that Pq(t = 0) = 1. D Once Brownian motion must hit (0,oo) immediately starting from 0, it must also hit (—00,0) immediately. Since t —+ Bt is continuous, this forces: (2.9) Theorem. If TQ = M{t > 0 : Bt = 0} then PQ(TQ = 0) = 1.
Section 1,2 Markov Property, Blumenthal's 0-1 Law 15 Combining (2.8) and (2.9) with the Markov property you can prove Exercise 2.8. If a < 6 then with probability one there is a local maximum of Bt in (a, 6). Since with probability one this holds for all rational a < b, the local maxima of Brownian motion are dense. Another typical application of (2.7) is Exercise 2.9. Let f(t) be a function with f(t) > 0 for all t > 0. Use (2.7) to conclude that limsup^o B{t)/f{t) = c Pq a.s. where c G [0, oo] is a constant. In the next exercise we will see that c = oo when f(t) = tll2. The law of the iterated logarithm (see Section 7,9 of Durrett (1995)) shows that c = 21/2 when /(*) = (* log log(l/t))1/2. Exercise 2.10. Show that limsuptxo-BM/t1/2 = oo P0 a.s., so with probability one Brownian paths are not Holder continuous of order 1/2 at 0. Remark. Let W7 (w) be the set of times at which the path w G C is Holder continuous of order 7, (1,6) shows that P(7i-y = [0,oo)) = 1 for 7 < 1/2. Exercise 1.2 shows that P(7i-y = 0) = 1 for 7 > 1/2. The last exercise shows P(t € W1/2) = 0 for each t, but B. Davis (1983) has shown P(7ii/2 # 0) = 1. Comic Relief. There is a wonderful way of expressing the complexity of Brownian paths that I learned from Wilfrid Kendall. "If you run Brownian motion in two dimensions for a positive amount of time, it will write your name." Of course, on top of your name it will write everybody else's name, as well as all the works of Shakespeare, several pornographic novels, and a lot of nonsense. Thinking of the function g as our signature we can make a precise statement as follows: (2.10) Theorem. Let g : [0,1] -+ Rd be a continuous function with g(0) = 0, let e > 0 and let tn J. 0. Then Pq almost surely, sup 0<8<1 Proof In view of Blumenthal's 0-1 law, (2.7), and the scaling relation (1.3), we can prove this if we can show that (*) Po ( sup \B(0) - g(0)\ < e ) > 0 for any e > 0 \0<8<1 J B(0tn) -m < e for infinitely many n
16 Chapter 1 Brownian Motion This was easy for me to believe, but not so easy for me to prove when students asked me to fill in the details. The first step is to treat the dead man's signature g(x) = 0. Exercise 2.11. Show that if e > 0 and t < oo then P0 (sup sup |-Bj|<e) >0 \ i 0<s<t J In doing this you may find Exercise 2.5 helpful. In (5.4) of Chapter 5 we will get the general result from this one by change of measure. Can the reader find a simple direct proof of (*)? With our discussion of Blumenthal's 0-1 law complete, the distinction between Tf and T° is no longer important, so we will make one final improvement in our cr-fields and remove the superscripts. Let K = {A ; A C B with PX(B) = 0} ?* = <r{T?UMx) F, = nxr? J\fx are the null sets and Ff are the completed cr-fields for Px. Since we do not want the filtration to depend on the initial state we take the intersection of all the completed cr-fields. This technicality will be mentioned at one point in the next section but can otherwise be ignored. (2.7) concerns the behavior of Bt as t -+ 0. By using a trick we can use this result to get information about the behavior as t —+ oo. (2.11) Theorem. If Bt is a Brownian motion starting at 0 then so is the process defined by X0 = 0 and Xt = t B(l/t) for t > 0. Proof By (1,2) it suffices to prove the result in one dimension. We begin by observing that the strong law of large numbers implies ^-+033^-+0, so X has continuous paths and we only have to check that X has the right f.d.d.'s. By the second definition of Brownian motion, it suffices to show that (i) if 0 < ti < ... < tn then (X(ti), ... X(tn)) has a multivariate normal distribution with mean 0 (which is obvious) and (ii) if s < t then E{XsXt) = stE(B(l/s)B(l/t)) = s D
Section 1.2 Markov Property, Blumenthal's 0-1 Law 17 (2.11) allows us to relate the behavior of B>t as t —+ oo to the behavior as t —► 0. Combining this idea with Blumenthal's 0-1 law leads to a very useful result. Let T[ = <r(Bs : s >t) = the future after time t T = nt>o T't = the tail cr-field (2.12) Theorem. If A G T then either PX(A) = 0 or Px(j1) = 1. Remark. Notice that this is stronger than the conclusion of Blumenthal's 0-1 law (2.7). The examples A = {w : w(0) G B} show that for A in the germ cr-field T$ the value of PX{A) may depend on x. Proof Since the tail cr-field of B is the same as the germ cr-field for X, it follows that P0(A) G {0,1}. To improve this to the conclusion given observe that A G T[, so 1,4 can be written as 1b o 9\. Applying the Markov property, (2.1), gives P*(A) = EX(1B o ex) = EX(EX(1B o fcl^i)) = Ex{EBllB) = j(2w)-d/2 exp(-|y - x\*/2)Ps(B) dy Taking x = 0 we see that if Pq(A) = 0 then Py(B) = 0 for a.e. y with respect to Lebesgue measure, and using the formula again shows PX(A) = 0 for all x. To handle the case Pq(A) = 1 observe that Ae G T and Pq(Ac) = 0, so the last result implies Px(Ae) = 0 for all x. Q The next result is a typical application of (2.12). The argument here is a close relative of the one for (2.8). (2.13) Theorem. Let Bt be a one dimensional Brownian motion and let A = nn{Bt = 0 for some t>n}. Then PX(A) = 1 for all x. In words, one dimensional Brownian motion is recurrent. It will return to 0 "infinitely often," i.e., there is a sequence of times tn f oo so that Bta = 0. We have to be careful with the interpretation of the phrase in quotes since starting from 0, Bt will return to 0 infinitely many times by time e > 0. Proof We begin by noting that under Px, Bi/y/t has a normal distribution with mean x/y/i and variance 1, so if we use \ '° denote a standard normal, Px(Bt < 0) = PziBt/y/i < 0) = P(X < -x/y/t)
18 Chapter 1 Browniaji Motion and lim^oo Px{Bt < 0) = 1/2. If we let T0 = inf{t : Bt = 0} then the last result and the fact that Brownian paths are continuous implies that for all x > 0 PX(TQ < oo) > 1/2 Symmetry implies that the last result holds for x < 0, while (2.9) (or the Markov property) covers the last case x = Q. Combining the fact that Px(Tq < oo) > 1/2 for all x with the Markov property shows that Px(Bt = 0 for some t > n) = Ex(PBn (To < oo)) > 1/2 Letting n —► oo it follows that PX(A) > 1/2 but A G T so (2.12) implies Px(Bt = 0 i.o.) = 1. D 1.3. Stopping Times, Strong Markov Property We call a random variable S taking values in [0, oo] a stopping time if for all t > 0, {S < t} £ Tt. To bring this definition to life think of Bt as giving the price of a stock and S as the time we choose to sell it. Then the decision to sell before time t should be measurable with respect to the information known at time t. In the last definition we have made a choice between {S < t) and {S < t). This makes a big difference in discrete time but none in continuous time (for a right continuous filtration Tt) '■ If {S<t} e Tt then {S < t} = U„{S < t - 1/n} G Tx. If {S < t} G Tt then {S<t} = n„{S < t + 1/n} £ Tt. The first conclusion requires only that t —+ Tt is increasing. The second relies on the fact that t —+ Tt is right continuous. (3.2) and (3.3) below show that when checking something is a stopping time it is nice to know that the two definitions are equivalent. (3.1) Theorem. If G is an open set and T = inf{f > 0 : Bt e G) then T is a stopping time. Proof Since G is open and t -+ Bt is continuous {T < t} = Uq<t{Bq G G} where the union is over all rational q, so {T < t) G T%. Here, we need to use the rationals so we end up with a countable union. D (3.2) Theorem. If Tn is a sequence of stopping times and Tn J. T then T is a stopping time.
Section 1.3 Stopping Times, Strong Markov Property 19 Proof {T<t} = U„{T„ <t). a (3.3) Theorem. If Tn is a sequence of stopping times and Tn\T then T is a stopping time. Proof {T <t} = D„{Tn <t}. a (3.4) Theorem. If K is a closed set and T = inf {* >Q:Bt£K} then T is a stopping time. Proof Let D(x,r) = {y : \x - y\ < r}, let G„ = U{D(z, 1/n) : x G K}, and let Tn = ini{t > 0 : Bt G G„}. Since G„ is open, it follows from (3.1) that T is a stopping time. I claim that as n f oo, Tn | T. To prove this notice that T > Tn for all n, so lim Tn <_T. To prove T < lim T„ we can suppose that Tn T t < oo. Since B(T„) G G„ for all n and B(T„) -+ B{t), it follows that 5(f) G -K and T < t. D Remark. As the reader might guess the hitting time of a Borel set A, Ta = inf{t : Bt G A}, is a stopping time. However, this turns out to be a difficult result to prove and is not true unless the filtration is completed as we did at the end of the last section. Hunt was the first to prove this. The reader can find a discussion of this result in Section 10 of Chapter 1 of Blumenthal and Getoor (1968) or in Chapter 3 of Dellacherie and Meyer (1978). We will not worry about that result here since (3.1) and (3.4), or in a pinch the next result, will be adequate for all the hitting times we will consider. Exercise 3.1. Suppose A is an Fa, i.e., a countable union of closed sets. Show that TA = inf{t: Bt G A} is a stopping time. Exercise 3.2. Let S be a stopping time and let S„ = ([2nS] + l)/2n where [x] = the largest integer < x. That is, S„ = (m + 1)2-" if m2~" <S<{m+ 1)2-" In words, we stop at'the first time of the form k2~n after S (i.e., > S). Prom the verbal description it should be clear that S„ is a stopping time. Prove that it is. Exercise 3.3. If S and T are stopping times, then SAT = min{S,T}, SVT = max{S,T}, and S + T are also stopping times. In particular, if t > 0, then S At, SV t, and S + t are stopping times.
20 Chapter 1 Browniaji Motion Exercise 3.4. Let Tn be a sequence of stopping times. Show that supT„, infTn, limsupTn, liminfT„ are stopping times n n n n Our next goal is to state and prove the strong Markov property. To do this, we need to generalize two definitions from Section 1.2. Given a nonnegative random variable S(w) we define the random shift 0s which "cuts off the part of w before S(w) and then shifts the path so that time S(w) becomes time 0." (*sw)(t) - | A on {S = oo} Here A is an extra point we add to C to cover the case in which the whole path gets shifted away. Some authors like to adopt the convention that all functions have /(A) = 0 to take care of the second case. However, we will usually explicitly restrict our attention to {S < oo} so that the second half of the definition will not come into play. The second quantity Ts, "the information known at time S," is a little more subtle. We could have defined T, = n£>o<r(B1A(,+£))i > 0) so by analogy we could set Ts = n£>o <T(5M(s+£),f > 0) The definition we will now give is less transparent but easier to work with. Fs = {A : A n {S < t} G Ft for all t > 0} In words, this makes the reasonable demand that the part of A that lies in {S < t} should be measurable with respect to the information available at time t. Again we have made a choice between < t and < t but as in the case of stopping times, this makes no difference and it is useful to know that the two definitions are equivalent. Exercise 3.5. When Tx is right continuous, the definition of Ts is unchanged if we replace {S < t} by {S < t}. For practice with the definition of Ts do
Section 1.3 Stopping Times, Strong Markov Property 21 Exercise 3.6. Let S be a stopping time and let A G Ts- Show that R= \ ., is a stopping time L oo on Ae Exercise 3.7. Let S and T be stopping times. (i) {S < t}, {S > t}, {S = t} are in Ts. (ii) {S < T}, {S > T}, and {S = T) are in Ts (and in TT). Two properties of Ts that will be useful below are: (3.5) Theorem. If S < T are stopping times then Ts C Tt- Proof If A G Ts then A n {T < t} = (,4 n {S < t}) n {T < t} g ^. D (3.6) Theorem. If Tn | T are stopping times then TT = nnT(Tn). Proof (3.5) implies T(Tn) D Tt for all n. To prove the other inclusion, let A £ DT(Tn). Since A D {T„ < t} G ^i and T„ J. T, it follows that A D {T < t} e T, so A £TT. D The last result and Exercises 3.2 and 3.7 allow us to prove something that is obvious from the verbal definition. Exercise 3.8. Bs £ Ts, i-e., the value of Bs is measurable with respect to the information known at time S! To prove this let Sn = ([2nS] + l)/2n be the stopping times defined in Exercise 3.2. Show B(Sn) G Tsn then let n -+ oo and use (3.6). The next result goes in the opposite direction from (3.6). Here Qn\ Q means n —+ Qn is increasing and Q = (r(Gn)- Exercise 3.9. Let S < oo and Tn be stopping times and suppose that Tn | oo as n | oo. Show that Tsat* ^ Ts as n 1 oo. We are now ready to state the strong Markov property, which says that the Markov property holds at stopping times. (3.7) Strong Markov property. Let (s,w) -+ Y(s,u) be bounded and HxC measurable. If S is a stopping time then for all x G R-d EX(YS o 9S \TS) = EB(S)Ys on {S < oo}
22 Chapter 1 Brownian Motion where the right-hand side is <p(y,t) = EyYt evaluated at y = B(S), t = S. Remark. In most applications the function that we apply to the shifted path will not depend on s but this flexibility is important in Example 3.3. The verbal description of this equation is much like that of the ordinary Markov property: "the conditional expectation of Y o 9s given J^ is just the expected value of Ys for a Brownian motion starting at Bs." Proof We first prove the result under the assumption that there is a sequence of times tn | oo, so that PX(S < oo) = J2 PX(S = tn). In this case we simply break things down according to the value of S, apply the Markov property and put the pieces back together. If we let Z„ = Ytll(u) and A € Ts then oo EX(YS o 9S; An{S<oo}) = J2 M^n ° 6U; A n {S = tn}) n = l Now if a e TSl A n {S = tn} = (An{S< tn}) - (A n {S < tn-i}) e ^(t„), so it follows from the Markov property that the above sum is oo = Y,Ex(EB{in)Zn;Ar\ {S = *„}) = Ex(EB{s)Ys;An{S < ex.}) n = l To prove the result in general we let Sn = ([2"5] + l)/2n where [a;] = the largest integer < x. In Exercise 3.2 you showed that S„ is a stopping time. To be able to let n —+ oo we restrict our attention to Y's of the form n (*) y,H = Ms) J] /mK*m)) m=l where 0 < t\ < ... < tn and /o, • ■ ■ ,/n are real valued, bounded and continuous. If/ is bounded and continuous then the dominated convergence theorem implies that ¢-+/ dypt(x,y)f(y) is continuous. From this and induction it follows that <p(x,s) = EXY, = f0(s) / dy1ptl(x,y1)f(y1) ■ dyn Ptn-tn-l(yn-i,yn)f(yn)
Section 1.3 Stopping Times, Strong Markov Property 23 is bounded and continuous. Having assembled the necessary ingredients we can now complete the proof. Let A € Ts- Since S < Sn, (3.5) implies A £ ^"(S„). Applying the special case of (3.7) proved above to Sn and observing that {Sn < oo} = {S < oo} gives Ex(YSn o 0Sn; A n {S < oo}) = Ex(<p(B(Sn), Sn) ; A n {S < oo}) Now as n - oo, Sn J S, B(Sn) - B(S), p(B(S,),S,) - f(B(S),S) and ^s„°0s„-Ys°0s, so the bounded convergence theorem implies that (3.7) holds when Y has the form given in (*). To complete the proof now we use the monotone class theorem, (2.3). Let Ti be the collection of bounded functions for which (3.7) holds. Clearly Ti is a vector space that satisfies (ii) if Yn £W are nonnegative and increase to a bounded Y, then 7 gW. To check (i) now, let A be the collection of sets of the form {w : u(tj) £ Gj} where Gj is an open set. If G is open the function 1g is a decreasing limit of the continuous functions/i(z) = (1 — k dist(z, G))+, where dist(z,G) is the distance from x to G, so if A £ A then 1,4 € Ti. This shows (i) holds and the desired conclusion follows from (2.3). D Example 3.1. Zeros of Brownian motion. Consider one dimensional Brow- nian motion, let Rt = inf{« > t ; Bu = 0} and let T0 = inf{« > 0 : Bu = 0}. Now (2.13) implies Px(Rt < oo) = 1, so B(Rt) = 0 and the strong Markov property and (2.9) imply PX(T0 o 9Rt > Q\FRl) = P0(To > 0) = 0 Taking the expected value of the last equation we see that PX(T0 o 6Rt > 0 for some rational t) = 0 Prom this it follows that with probability one, if a point u £ Z(w) = {t : Bt{u) = 0} is isolated on the left (i.e., there is a rational t < u so that (t, u) n Z(u) = 0) then it is a decreasing limit of points in Z(u). This shows that the closed set Z(w) has no isolated points and hence must be uncountable. For the last step see Hewitt and Stromberg (1969), page 72. If we let |2(w)| denote the Lebesgue measure of Z(w) then Fubini's theorem implies Ex(\Z(u)n[0,T]\)= [ Px(Bt = 0)dt = 0 Jo
24 Chapter 1 Browni&n Motion So Z(w) is a set of measure zero. D Example 3.2. Let G be an open set, x G G, let T = ini{t : Bt ¢. G}, and suppose PX(T < oo) = 1. Let A C dG, the boundary of G, and let u(x) = Px(Bt G A). I claim that if we let 6 > 0 be chosen so that D(x, 8) = {y : \y - x\ < 6} C G and let S = inf {t > 0 : Bt ¢. D(x, <5)}, then u(x) = Exu(Bs) Intuition Since D(x, 6) C G, Bt cannot exit G without first exiting D(x,6) at Bs- When Bs = y, the probability of exiting G in A is u(y) independent of how Bt got to y. Proof To prove the desired formula, we will apply the strong Markov property, (3.7), to Y = 1(bt6j4) To check that this leads to the right formula, we observe that since D(x, 6) C G, we have Bt°0s = Bt and l(BTeA)°6s = 1(53-6.4)- In words, w and the shifted path #sw must exit G at the same place. Since S < T and we have supposed PX(T < oo) = 1, it follows that PX(S < oo) = 1. Using the strong Markov property now gives Ex(l(BTeA) ° Qs\Fs) = EB(S)l(BTeA) = u(Bs) Using the definition of u, l^g^) oQs = l(BTe^)i and taking expected value of the previous display, we have u{x) = Exl^BreA-) = Ex (l(BT6^) ° ds) = EXEX (1(Bt6j4) °9s\Fs)=Exu(Bs) D Exercise 3.10. Let G, T, D(x,6) and S be as above, but now suppose that EyT < oo for all y G G. Let g be a bounded function and let u(y) = Ey(fJ g(B3) ds). Show that for x G G u(x) = Exlj g(B,)ds + u(Bs) J Our third application shows why we want to allow the function Y that we apply to the shifted path to depend on the stopping time S.
Section 1.3 Stopping Times, Strong Markov Property 25 Example 3.3. Reflection principle. Let Bt be a one dimensional Brownian motion, let a > 0 and let Ta = inf{t : Bt = a}. Then (3.8) P0(Ta <t) = 2P0(Bt > a) Intuitive proof We observe that if B, hits a at some time s < t then the strong Markov property implies that Bt — B(Ta) is independent of what happened before time Ta. The symmetry of the normal distribution and Pa(Bt = a) = 0 then imply (3.9) P0(Ta <t,Bt>a) = \p0{Ta < t) Multiplying by 2, then using {Bt > a} C {Ta < t} we have Po(Ta <t) = 2PQ(Ta <t,Bt>a) = 2P0(Bt > a) Proof To make the intuitive proof rigorous we only have to prove (3.9). To extract this from the strong Markov property (3.7), we let YJu) = (1 '^s<t,u(t-s)> a 1.0 otherwise We do this so that if we let S = inf {s < t : B, = a) with inf 0 = oo then Ys(6su) = l(B1>a) on {S<oo} = {Ta < t} If we let <p(x, s) = EXY3 the strong Markov property implies Eo(Ys o 6s\Fs) = <P(BS,S) on {S < oo} = {Ta < t} Now Bs = a on {S < oo} and <p(a, s) = 1/2 if s < t, so P0(Ta <t,Bt>a) = £(,(1/2 ;Ta<t) which proves (3.9). Exercise 3.11. Generalize the proof of (3.9) to conclude that if u < v < a then (3.10) P0(Ta <t,u<Bt<v) = P0(2a-v<Bt<2a-u)
26 Chapter 1 Brownian Motion To explain our interest in (3.10), let Mt = maxo<j<t53 to rewrite it as P0(Mt > a,u< Bt <v) = P0(2a - v < Bt < 2a - u) Letting the interval (u, v) shrink to x we see that P0(Mt >a,Bt = x) = P0(Bt = 2a-x) = _L=e-(2a-x^2( V27d Differentiating with respect to a now we get the joint density (3.11) P0(M, = a,Bt = x) = 2(2^)e-(2a-x)V2t v27rt3 1.4. First Formulas We have two interrelated aims in this section: the first to understand the behavior of the hitting times Ta = inf {t: Bt = a} for a one dimensional Brownian motion Bt; the second to study the behavior of Brownian motion in the upper half space H = {x £ Hd : xj > 0}. We begin with Ta and observe that the reflection principle (3.8) implies /•oo Po(Ta <t) = 2P0(Bt >a) = 2 (2^)-^2 exp(-z2/2i) dx Ja Here, and until further notice, we are dealing with a one dimensional Brownian motion. To find the probability density of Ta, we change variables x = fi/2a/si/2j dx _ _fi/2a/2s3/2c/s t0 get /o (2^)-^2 exp(-a2/2s) (-f1/2a/2s3/2) ds = I (2vsz)-1/2a exp(-a2/2s) ds Jo Using the last formula we can compute the distribution of L = sup{£ < 1 : Bt = 0} and R = inf {t > 1 : Bt = 0}, completing work we started in Exercises 2.3 and 2.4. By (2.5) if 0 < s < 1 then /OO ps(0,x)Px(TQ> l-s)dx ■oo *00 aOO = 2/ (27rs)-1/2exp(-z2/2s) / (2^)-^2 x exp(-x2/2r)drdx J0 Jl-s = -/ (sr3)_1/2 / xexp(-x2(r + s)/2rs)dxdr * J\-s Jo I l-OO = -1 (sr3)-1/2rs/(r + s)dr T Vl-J
Section 1.4 First Formulas 27 Our next step is to let t = s/(r + s) to convert the integral over r G [1 — s, oo) into one over t £ [0,s]. dt = —s/(r + s)2dr, so to make the calculations easier we first rewrite the integral as I f°° Wl-j '( . ^2\1/2 (r + s) \ s dr r s J (r + s)2 Changing variables as indicated above and then again with t = u2 to get Po(L<s) = - f\t(l-t))-l/2dt (4.2) * J° r = -/ (1 - u2)-1'2 du=- arcsin(V^) i" Jo ir The reader should note that, contrary to intuition, the density function of L = the last 0 before time 1, PQ(L = t) = - f\t(l -1))-1'2 for 0 < t < 1 i" Jo is symmetric about 1/2 and blows up near 0 and 1. This is one of two arcsine laws associated with Brownian motion. We will encounter the other one in Section 4.9. The computation for R is much easier and is left to the reader. Exercise 4.1. Show that the probability density for R is given by Po(R = 1 +1) = 1/(^/2(1 + t)) for t > 0 Notation. In the last two displays and in what follows we will often write P(T = t) = f(t) as short hand for T has density function f(t). As our next application of (4.1) we will compute the distribution of BT where r = inf{t : Bt ¢ H} and H = {z : zj > 0} and, of course, Bt is a d dimensional Brownian motion. Since the exit time depends only on the last coordinate, it is independent of the first d — 1 coordinates and we can compute the distribution of BT by writing for x,0 & Rd_1, y G R P(*,y)(Br = (0, 0)) = / ds P(s,y)(r = s)(2„s)-(d-V/2e-l*-Bf>2> Jo r°° "(2*) Wo
28 Chapter 1 Brownian Motion by (4.1) with a = y. Changing variables s = (\x — 9\2 + y2)/2t gives y f° -{\*-o? + y2) ., ( * \ ^2 so we have (4.3) P(x,y)(Br = (6,0)) = (|g_^y2)d/2r^22) where T(a) = fQ ya~1e~y dy is the usual gamma function. When d = 2 and x = 0, probabilists should recognize this as a Cauchy distribution. At first glance, the fact that BT has a Cauchy distribution might be surprising, but a moment's thought reveals that this must be true. To explain this, we begin by looking at the behavior of the hitting times {Ta,a > 0} as the level a varies. (4.4) Theorem. Under Po, {Ta,a > 0} has stationary independent increments. Proof The first step is to notice that if 0 < a < b then Tbo9Ta=Tb-Ta, so if/ is bounded and measurable, the strong Markov property (3.7) and translation invariance imply EQ (f(Tb - Ta) \TTa) = E0 (f(Tb) o 9Ta \TTa) = Eaf(Tb) = EQf(Tb„a) The desired result now follows from the next lemma which will be useful later. (4.5) Lemma. Suppose Zt is adapted to Qt and that for each bounded measurable function / E(f(Zt - Z,)\G,) = Ef(Zt-,) Then Zt has stationary independent increments. Proof Let tQ <ti... <tn, and let /,-, 1 < i < n be bounded and measurable. Conditioning on T%n_x and using the hypothesis we have (n-1 > II MZU - ^-1 ) • E(fn(ZU - ^„_x)|^„_x) = E ( J] f{(Zu - Zt^)) EfiZt^t^) vi=l
Section 1.4 First Formulas 29 By induction it follows that e (f[fi(zu - z^yj = n £/*(&,_„_,) which implies the desired conclusion. D The scaling relation (1.2) implies (4.6) Ta = a2T, So consulting Section 2.7 of Durrett (1995) we see that Ta has a stable law with index a = 1/2 and skewness parameter k = 1 (since Ta > 0). To explain the appearance of the Cauchy distribution in the hitting locations, let ra = inf{£ : Bf = a} (where B"l is the second component of a two dimensional Brownian motion) and observe that another application of the strong Markov property implies (4.7) Theorem. Under Pq, {C, = B(t,),s > 0} has stationary independent increments. Exercise 4.2. Prove (4.7). The scaling relation (1.3) and an obvious symmetry imply (4.8) C, = ad Cs = -Cs so again consulting Section 2.7 of Durrett (1995) we can conclude that C, has the symmetric stable distribution with index a = 1 and hence must be Cauchy. To give a direct derivation of the last fact let <pe{s) = E(exp(idC,)). Then (4.7) and (4.8) imply ipg(s)ipB(t) = <pB(s + t), <Pg(s) = ipg,(l), <pB(s) = ip_g(s) The fact that C, = — C, implies that <pg(s) is real. Since 6 -+ <pg(l) is continuous, the second equation implies s —+ <pg(s) = fg,(l) is continuous, and a simple argument (see Exercise 4.3) shows that for each 6,<pg(s) = exp(cgs). The last two equations imply that c, = sci and c-g = eg so eg = —n\0\ for some k, so C, has a Cauchy distribution. Since the arguments above apply equally well to (B],cBt), where c is any constant, we cannot determine k with this argument.
30 Chapter 1 Brown'mn Motion Exercise 4.3. Suppose <p(s) is real valued continuous and satisfies <p(s)<p(t) = <p(s + t), <p(0) = 1. Let i[>(s) = ln(y?(s)) to get the additive equation: ij>(s) + ip(t) = ip(s +1). Use the equation to conclude that ip(m2~n) = m2~n^>(l) for all integers m,n > 0, and then use continuity to extend this to ip(t) = tip(l).. Exercise 4.4. Adapt the argument for <fe(s) = E(exp(i0C3)) to show £'o(exp(—ATa)) = exp(—anV\) The representation of the Cauchy process given above allows us to see that its sample paths a —+ Ta and s —+ C, are very bad. Exercise 4.5. If u < v then Po(a -+ Ta is discontinuous in (u,v)) = 1. Exercise 4.6. If u < v then Pq(s —+ C, is discontinuous in (u,v)) = 1. Hint. By independent increments the probabilities in Exercises 4.5 and 4.6 only depend on v — u but then scaling implies that they do not depend on the size of the interval. The discussion above has focused on how Bt leaves H. The rest of the section is devoted to studying where Bt goes before it leaves H. We begin with the case d = 1. (4.9) Theorem. If x,y > 0, then Px(Bt = y,TQ > t) = pt(x,y) - pt(x,-y) where pt(x,y) = (27rf)-1/2e-(!/-x)2/21 Proof The proof is a simple extension of the argument we used in Section 1.3 to prove that P0(Ta < t) = 2P0(Bt > a). Let / > 0 with f(x) = 0 when x < 0. Clearly Ex(f(Bt);TQ >t) = Exf(Bt) - Ex(f(Bt);TQ < t) If we let f(x) = f(—x), then it follows from the strong Markov property and symmetry of Brownian motion that Ex{f(Bt);TQ <t) = Ex[Eof(Bt-To);TQ < t] = Ex[EQf(Bt-T0);To<t] = Ex[f(Bt);T0 <t] = Ex(f(Bt))
Section 1.4 First Formulas 31 since f(y) = 0 for y > 0. Combining this with the first equality shows Ex(f(Bt);TQ >t) = Exf(Bt) - Exf(Bt) = I {Pt(x, y) - pt(x, -y))f(y)dy D The last formula generalizes easily to d > 2. (4.10) Theorem. Let r = inf{f : B? = 0}. If x,y e H, P*(Bt = y,T > t) = pt(x,y) - pt(x,y) where y = (yi,..., yd-\, -yd)- Exercise 4.7. Prove (4.10).
2 Stochastic Integration In this chapter we will define our stochastic integral It = fQ HsdX,. To motivate the developments, think of X, as being the price of a stock at time s and H, as the number of shares we hold, which may be negative (selling short). The integral It then represents the net profits at time t, relative to our wealth at time 0. To check this note that the infinitesimal rate of change of the integral dlt = Ht dXt = the rate of change of the stock times the number of shares we hold. In the first section we will introduce the integrands Hs, the "predictable processes," a mathematical version of the notion that the number of shares held must be based on the past behavior of the stock and not on the future performance. In the second section, we will introduce our integrators Xs, the "continuous local martingales." Intuitively, martingales are fair games, while the "local" refers to the fact that we reduce the integrability requirements to admit a wider class of examples. We restrict our attention to the case of martingales with continuous paths t —+ Xt to have a simpler theory. 2.1. Integrands: Predictable Processes To motivate the class of integrands we consider, we will discuss integration w.r.t. discrete time martingales. Here, we will assume that the reader is familiar with the basics of martingale theory, as taught for example in Chapter 4 of Durrett (1995). However, we will occasionally present results whose proofs can be found there. Let Xn,n > 0, be a martingale w.r.t. Tn. If H„, n > 1, is any process, we can define n (H ■ X)n = 2__, Hm(Xm — Xm-i) m=l To motivate the last formula and the restriction we are about to place on the Hm, we will consider a concrete example. Let £1,^21--- be independent with P(& = 1) = P(£i = -1) = 1/2, and let Xn = &+• ■ •+£„. Xn is the symmetric simple random walk and is a martingale with respect to Tn = cr(£i £„).
34 Chapter 2 Stochastic Integration If we consider a person flipping a fair coin and betting $1 on heads each time then Xn gives their net winnings at time n. Suppose now that the person bets an amount Hm on heads at time m (with Hm < 0 interpreted as a bet of — Hm on tails). I claim that (H ■ X)n gives her net winnings at time n. To check this note that if Hm > 0 our gambler wins her bet at time m and increases her fortune by Hm if and only if Xm — Xm-\ = 1. The gambling interpretation of the stochastic integral suggests that it is natural to let the amount bet at time n depend on the outcomes of the first n — 1 flips but not on the flip we are betting on, or on later flips. A process Hn that has Hn £ Tn-\ for all n > 1 (here Tq = {0,0}, the trivial cr-field) is said to be predictable since its value at time n can be predicted (with certainty) at time n — 1. The next result shows that we cannot make money by gambling on a fair game. (1.1) Theorem. Let Xn be a martingale. If Hn is predictable and each Hn is bounded, then (H ■ X)n is a martingale. Proof It is easy to check that (H ■ X)n £ Tn. The boundedness of the H„ implies E\(H ■ X)n\ < oo for each n. With this established, we can compute conditional expectations to conclude E((H ■ X)n+1 \Tn) = (H ■ X)n + E(Hn+1(Xn+1 - Xn)\Fn) = (HX)n + Hn+iE(Xn+i — Xn \Tn) = (H ■ X)n since Hn+i G Fn and E(Xn+i - X„|.Fn) = 0. D The last theorem can be interpreted as: you can't make money by gambling on a fair game. This conclusion does not hold if we only assume that Hn is optional, that is, Hn £ ^"n, since then we can base our bet on the outcome of the coin we are betting on. Example 1.1. If Xn is the symmetric simple random walk considered above and Hn = £„ then n (H.X)n = y£,U-U=n m = l since ^ = 1. D In continuous time, we still want the metatheorem "you can't make money gambling on a fair game" to hold, i.e., we want our integrals to be martingales. However, since the present (t) and past (< t) are not separated, the definition of the class of allowable integrands is more subtle. We will begin by considering a simple example that indicates one problem that must be dealt with.
Section 2.1 Integrands: Predictable Processes 35 Example 1.2. Let (Q,F, P) be a probability space on which there is defined a random variable T with P(T < t) = t for 0 < t < 1 and an independent random variable £ with P($ = 1) = P{£ = -1) = 1/2. Let Y _ /0 t <T and let Tt = <r{X, : s <t). In words we wait until time T and then flip a coin. Xt is a martingale with respect to T%. However, if we define the stochastic integral It = JQ X,dX, to be the ordinary Lebesgue-Stieltjes integral then y1= f x,dx,=xT.t = t2 = i Jo To check this note that the measure dX, corresponds to a mass of size £ at T and hence the integral is £ times the value there. Noting now that Y0 = 0 while Y\ = 1 we see that It is not a martingale. D The problem with the last example is the same as the problem with the one in discrete time — our bet can depend on the outcome of the event we are betting on. Again there is a gambling interpretation that illustrates what is wrong. Consider the game of roulette. After the wheel is spun and the ball is rolled, people can bet at any time before (<) the ball comes to rest but not after (>). One way of guaranteeing that our bet be made strictly before T, i.e., a sufficient condition, is to require that the amount of money we have bet at time t is left continuous. This implies, for instance, that we cannot react instantaneously to take advantage of a jump in the process we are betting on. The simplest left continuous integrand we can imagine is made by picking a < b, C £ Ta, and setting (*) H(s,w) = C(u)l(a>b](t) In words, we buy C(w) shares of stock at time a based on our knowledge then, i.e., C G 7a- We hold them to time 6 and then sell them all. Clearly, the examples in (*) should be allowable integrands; one should be able to add two or more of these, and take limits. To encompass the possibilities in the previous sentence, we let II be the smallest cr-field containing all sets of the form (a, 6] X A where A G Ta. The II we have just defined is called the predictable cr-field. We will demand that our integrands H are measurable w.r.t. II. In the previous paragraph, we took the "bottom-up" approach to the definition of II. That is, we started with some simple examples and then extended the class of integrands by taking sums and limits. For the rest of the section, we will take a "top-down" approach. We will start with some natural requirements
36 Chapter 2 Stochastic Integration and then add more until we arrive at II. The descending path is more confusing since it involves four definitions that only differ in subtle ways. However, the reader need not study this material in detail. We will only use the predictable cr-field. The other definitions are included only to allow the reader to make the connection with other treatments and to indirectly make the point that the measure theoretic questions associated with continuous time processes can be quite difficult. Let ?jbea right continuous filtration. The first and most intuitive concept of "depending on the past behavior of the stock and not on the future performance" is the following: H(s, w) is said to be adapted if for each t we have Ht G Tf We encountered this notion in our discussion of the Markov property in Chapter 1. In words, it says that the value at time t can be determined from the information we have at time t. The definition above, while intuitive, is not strong enough. We are dealing with a function defined on a product space [0,oo) x fl, so we need to worry about measurability as a function of the two variables. H is said to be progressively measurable if for each t the mapping (s,w) —+ H(s,u) from [0,t] to R is % x T% measurable. This is a reasonable definition. However, the "modern" approach is to use a slightly different definition that gives us a slightly smaller cr-field. Let A be the cr-field of subsets of [0,oo) x Q that is generated by the adapted processes that are right continuous and have left limits, i.e., the smallest cr-field which makes all of these processes measurable. A process H is said to be optional if H(s,w) is measurable w.r.t. A. According to Dellacherie and Meyer (1978) page 122, the optional cr-field is contained in the progressive cr-field and in the case of the natural filtration of Brownian motion the inclusion is strict. The subtle distinction between the last two cr-fields is not important for us. The only purpose here for the last two definitions is to prepare for the next one. Let n' be the cr-field of subsets of [0,00) X Q that is generated by the left continuous adapted processes. A process H is said to be predictable if#(fl,w)€lT. As we will now show, the new definition of predictable is the same as the old one. We have added the ' only to make the next statement and proof possible.
Section 2.2 Integrators: Continuous Local Martingales 37 (1.2) Theorem. II = II'. Proof Since all the processes used to define II are left continuous, we have II C II'. To argue the other inclusion, let H(s, w) be adapted and left continuous andletfP(s,w) = #(m2-",w)form2-n < s < (m + l)2~n. Clearly H" ell'. Further, since H is left continuous, Hn(s,w) —+ H(s,u) as n —+ oo. D The distinction between the optional and predictable cr-fields is not important for Brownian motion since in that case the two cr-fields coincide. Our last fact is that, in general, II C A. Exercise 1.1. Show that if H(s,u) = l(aib](s)lA(u) where i G /„, then H is the limit of a sequence of optional processes; therefore, H is optional and ncA. 2.2. Integrators: Continuous Local Martingales In Section 2.1 we described the class of integrands that we will consider: the predictable processes. In this section, we will describe our integrators: continuous local martingales. Continuous, of course, means that for almost every w, the sample path s —+ X,(w) is continuous. To define local martingale we need some notation. If T is a nonnegative random variable and Yj is any process we define VT _ YTM on{T>0} 1 ~ \ 0 on {T = 0} (2.1) Definition. Xt is said to be a local martingale (w.r.t. {Tt,t > 0}) if there are stopping times Tn | oo so that Xt " is a martingale (w.r.t. {TAxn '■ t > 0}). The stopping times Tn are said to reduce X. We need to set Xf = 0 on {T = 0} to deal with the fact that Xo need not be integrable. In most of our concrete examples Xq is a constant and we can take Ti > 0 a.s. However, the more general definition is convenient in a number of situations: for example (i) below and the definition of the variance process in (3.1). In the same way we can define local submartingale, locally bounded, locally of bounded variation, etc. In general, we say that a process Y is locally A if there is a sequence of stopping time Tn | oo so that the stopped process YtT has property A. (Of course, strictly speaking this means we should say locally a submartingale but we will continue to use the other term.)
38 Chapter 2 Stochastic Integration Why local martingales? There are several reasons for working with local martingales rather than with martingales. (i) This frees us from worrying about integrability. For example, let Xt be a martingale with continuous paths, and let <p be a convex function. Then <p{Xt) is always a local submartingale (see Exercise 2.3). However, we can conclude that <p(Xt) is a submartingale only if we know E\<p(Xt)\ < oo for each t, a fact that may be either difficult to check or false in some cases. (ii) Often we will deal with processes defined on a random time interval [0, r). If r < oo, then the concept of martingale is meaningless, since for large t Xt is not defined on the whole space. However, it is trivial to define a local martingale: there are stopping times Tn | r so that ... (iii) Since most of our theorems will be proved by introducing stopping times Tn to reduce the problem to a question about nice martingales, the proofs are no harder for local martingales defined on a random time interval than for ordinary martingales. Reason (iii) is more than just a feeling. There is a construction that makes it almost a theorem. Let Xt be a local martingale defined on [0,r) and let T„ | r be a sequence of stopping times that reduces X. Let T0 = 0, suppose Ti > 0 a.s., and for k > 1 let m _ / * - (* - 1) r*-i + (* - 1) < * < It + (* - 1) TW -\Tk Tk + (k-l)<t<Tk + k To understand this definition it is useful to write it out for k = 1,2,3: 7(0 = { In words, the time change expands [0, r) onto [0, oo) by waiting one unit of time each time a Tn is encountered. Of course, strictly speaking j compresses [0, oo) onto [0,r) and this is what allows X^ to be defined for all t > 0. The reason for our fascination with the time change can be explained by: t Ti t-l T2 t-2 n [0,¾ CTi.Ti + l] [7\ + l,T2 + l] [T2 + l,T2 + 2] [T2 + 2,Ta + 2] p3 + 2,T3 + 3]
Section 2.2 Integrators: Continuous Local Martingales 39 (2.2) Theorem. {X^t), ^"7(i), t > 0} is a martingale. First we need to show: (2.3) The Optional Stopping Theorem. Let X be a continuous local martingale. If S < T are stopping times and Xtm is a uniformly integrable martingale thenE{XT\Fs) = Xs. Proof (7.4) in Chapter 4 of Durrett (1995) shows that if L < M are stopping times and Yjv/An is a uniformly integrable martingale w.r.t. Qn then E(Ym\Gl) = Yl To extend the result from discrete to continuous time let Sn = ([2n5] + l)/2n. Applying the discrete time result to the uniformly integrable martingale Ym = XTAm2-» with L = 2nSn and M = oo we see that E(XT\fSa) = XSa Letting n —+ oo and using the dominated convergence theorem for conditional expectations ((5.9) in Chapter 4 of Durrett (1995)) the result follows. D Proof of (2.2) Let n = [t] + 1. Since j(t) < Tn An, using the optional stopping theorem, (2.2), gives X^) = E(XTnAn l-^-r(i))- Taking conditional expectation with respect to ^"7(3) we get E(xt(t) I^"t(j)) = E(XT„An\^,)) = Xf(i) proving the desired result. D The next result is an example of the simplifications that come from assuming local martingales are continuous. (2.4) Theorem. If X is a continuous local martingale, we can always take the sequence which reduces X to be Tn = inf{£ : \Xi\ > n} or any other sequence T^ < Tn that has T^ | oo as n | oo. Proof Let Sn be a sequence that reduces X. If s < t, then applying the optional stopping theorem to X?n at times r = s A T^ and t A T^ gives E(X(t AT^A S„)l(s„>o)|^(s AT^ A Sn)) = X(s AT^ A S„)l(s„>o)
40 Chapter 2 Stochastic Integration Multiplying by l(Tj,>o) 6^oC T{a A T'm A S„) we get E(X(tAT^ASn)l(Sn>QiTL>Q)\F(sAT^ASn)) = X(s AT^ AS„)l(sB>o,:rm>o) As n T oo, T(s A T^ A Sn) T ^(s A 7^), and X(r AT^A S„)l(sn>o,Ti.>o) - X(r AI£,)1(tj.>o) for all r > 0, and \X(r AT^ AS„)|l(s„>o,T' >o) < "*> s0 it follows from the dominated convergence theorem for conditional expectations (see (5.9) in Chapter 4 of Durrett (1995)) that E(X(t A T^)l(T!n>Q)\F(s A !£)) = X(s A ^)1(^0) proving the desired result. D In our definition of local martingale in (2.1), we assumed that Xt " is a martingale (w.r.t. {FtATn,t > 0})- We did this with the proof of (2.4) in mind. The next exercise shows that we get the same definition if we assume xf- is a martingale (w.r.t. {Tt,t > 0}). Exercise 2.1. Let S be a stopping time. Then Xf is a martingale w.r.t. FtAS,t > 0, if and only if it is a martingale w.r.t. Tu * > 0. In the first version of this book I said "You should think of a local martingale as something that would be a martingale if it had E\Xt \ < oo." As many people have pointed out to me, there is a famous example which shows that this is WRONG. Example 2.1. Let Bt be a Brownian motion in dimension d > 3. In Section 3.1 we will see that Xt = l/\Bt\d~2 is a local martingale. Now £S|X,|* = [(2wt)-d'2e-h-*r'2i\y\-(d-Vp dy < oo if and only if p < d/(d — 2) since (i) there is no trouble with the convergence for large y and (ii) ignoring (2irt)~d/2e~\y~x\ l2t which is bounded near 0 and changing to polar coordinates we have 1 l r-(d-2)prd-l dr<QQ 0
Section 2.2 Integrators: Continuous Local Martingales 41 if and only if -{d- 2)p + {d-l)> -1, i.e., p < d/(d-2). To see that Xt is not a martingale we begin by noting that (Bt — x) =<j tll2(B\ — x) under Px so as t -+ oo, |5(| -+ oo and Xt -+ 0 in probability. To show that ExXt —+ 0 we observe that for any R < oo ExXt < {2irt)-d'2 ( M-(d-2> dy + iT^ J\y\<R The integral is convergent so limsup^^, ExXt < R~(d~2\ Since R is arbitrary ExXt —+ 0. Since ExXq > 0 it follows that EsXt is not constant and hence Xt is not a martingale. D An even worse counterexample is provided by Exercise 2.2. In Section 3.1 we will learn that if B% is a two dimensional Brownian motion then Xt = log \Bt\ is a local martingale. Show J5|.Xt|p < oo for any p < oo but lim<_,oo EXt = oo so X is not a martingale. The last example may leave the reader with the impression that no amount of integrability will guarantee that a local martingale is an honest martingale, but as the next very useful result shows, this is a false impression. (2.5) Theorem. If Xt is a local submartingale and E ( sup \X, | ) < oo for each t then Xt is a submartingale. Proof Clearly our assumption implies E\Xt\<oo for each t so all we have to check is that ^(^1^,) > X3. To do this, we note that if Tn is a sequence of stopping times that reduces X (xf-|^jATn)>^- E\ then we let n —+ oo and apply the dominated convergence theorem for conditional expectations. See (5.9) in Chapter 4 of Durrett (1995). D As usual, multiplying by —1 shows that the last result holds for supermartin- gales, and once it is true for super- and sub- it is true for martingales (by applying the two other results). The following special case will be important: (2.6) Corollary. A bounded local martingale is a martingale.
42 Chapter 2 Stochastic Integration Here and in what follows when we say that a process Xt is bounded we mean there is a constant M so that with probability 1, \Xt\ < M for all t > 0. In some circumstances we will need a convergence theorem for local martingales. The following simple result is sufficient for our needs. (2.7) Theorem. Let Xt be a local martingale on [0, r). If E[ sup |X,| ] <oo \0<3<T J then XT = limtir -^ exists a.s. and EX0 = EXT. Proof Let j be the time scale of (2.2) and let Yt = X^ny Yt is a martingale. The upcrossing inequality and the proof of the martingale convergence theorem given in Section 4.2 of Durrett (1995) generalize in a straightforward way to continuous time and allow us to conclude that Y^ = lim<foo ^i exists a.s. and Xr = lim^r Xt exists a.s. To prove the last conclusion we note that EYq = EYt then use the a.s. convergence and the dominated convergence theorem to conclude EX0 = EY0 - EY^ = EXr. D Exercise 2.3. Let Xt be a continuous local martingale and let <p be a convex function. Then <p(Xt) is a local submartingale. Exercise 2.4. Let X be a continuous local martingale and let R < oo be a stopping time. Then Y, = Xr+3 is a local martingale with respect to Q3 = Exercise 2.5. Show that if X is a continuous local martingale with Xt > 0 and EXo < oo then Xt is a supermartingale. 2.3. Variance and Covariance Processes The next definition may look mysterious but it is very important. (3.1) Theorem. If Xt is a continuous local martingale, then we define the variance process (X)t to be the unique continuous predictable increasing process At that has Aq = 0 and makes X? — At a local martingale. To warm up for the proof of (3.1) we will prove a result in discrete time. (3.2) Theorem. Suppose Xn is a martingale with EX% < oo for all n. Then there is a unique predictable process An with Aq = 0 so that X„ — An is a martingale. Furthermore, n —» An is increasing.
Section 2.3 Variance and Covariance Processes 43 Proof Let Aq = 0 and define for n > 1 An = An.! + E{Xl\Tn-l) - X2n_x Prom the definition, it is immediate that An G J-n-\, An is increasing (since Xn is a submartingale), and E(Xl-An\Tn-l) = E(Xl\Tn-l)-An = Xn — i An— 1 so A has the desired properties. To see that A is unique, observe that if B is another process with desired properties, then An — Bn is a martingale and An-Bn £ Tn-\- Therefore An — Bn = E(An — Bn l^n-i) = An-i — Bn-i and it follows by induction that An — Bn = A0 — B0 = 0. D The key to the uniqueness may be summarized as "any predictable discrete time martingale is constant." Since Brownian motion is a predictable martingale, the last statement fails in continuous time and we need another assumption. (3.3) Theorem. Any continuous local martingale Xt that is predictable and locally of bounded variation is constant (in time). Proof By subtracting Xq, we may suppose that Xq = 0 and prove that X is identically 0. Let Vt be the variation of X on [0,i]. We leave it to the reader to check Exercise 3.1. If X is continuous and locally of bounded variation then t —* Vt is continuous. Define S = inf{s : V, > K). Since t < S implies \Xt\ < K, (2.4) implies that the stopped process Mt = X(t A S) is a bounded martingale. Now if s < t using well known properties of conditional expectation (see e.g., (1.3) and other results in Chapter 4 of Durrett (1995)) we have E((Mt - M,f\T,) = E(M?\F3) - 2M,E{Mt\T,) + M? {'' = E(M?\F,) - M? = E(M? - M?\T,) an equation that we will refer to as the orthogonality of martingale increments since the key to its validity is the fact that E(M3(Mt — M,)^,) = 0 (and hence E(M3(Mt - M,)) = 0).
44 Chapter 2 Stochastic Integration Letting 0 = tQ <ti... <tn = the a. subdivision of [0, t], the orthogonality of martingale increments, the inequality J2ial < (IZi la«l)suPj lajl> an(^ tne fact that VtAS < K imply n = EYJ{Mim-Mim_1)2 m=l < E [VtAs sup \Mtm - Mtr <X£,sup|Mlm-Mlm_l| .-,,) If we take a sequence of partitions A„ = {0 = tg < *i • • • < ^k(n) = *} m wmch the mesh |A„| = supm \t^ — £JJ,-il —* 0 then continuity implies supm |M*n — Mtn_ | —» 0. Since the supm < 2/^ the bounded convergence theorem implies £'supm |Mt» - Mtn_J -♦ 0. This shows EM? = 0 so Mt = 0 a.s. In the last conclusion t is arbitrary, so with probability one we have Mt = 0 for all rational t. The desired result now follows from continuity. D With (3.3) established we can do the Proof of uniqueness in (3.1) Suppose At and A\ are two processes with the desired properties. Then At — A\ is a continuous local martingale that is locally of bounded variation and hence must be constant. Since A§ — Aq = 0 it follows that At = A't for all t. D Proof of existence in (3.1) when Xt is a bounded martingale The existence of A is considerably more complicated than its uniqueness. To inspire the reader for the battle to come, we would like to point out that (i) we are about to prove the special case that we need of the celebrated Doob-Meyer decomposition and (ii) the construction will provide some useful information, e.g., (3.6). Though (3.1) and (3.8) are quite important, the details of their proofs are not, so the squeamish reader can skip ahead to the boldfaced "End of Existence Proof" five pages ahead. Given a partition A = {0 = t0 < t\ < t2 ■ ■.} with lim„ tn = oo, we let k(t) = sup{fc : tk < t} be the index of the last point before time t. (To avoid confusion later, we emphasize now that k(t) is a number not a random variable.) We next define for a process X, an approximate quadratic variation by k(t) Qt(x) = Y),xik - Xik_S~ + (Xt - xikll)f k=\
Section 2.3 Variance and Covariance Processes 45 If the reader recalls what we are trying to define (see (3.1)) then the reason for the definition is explained by (a) Lemma. If Xt is a bounded continuous martingale then X? — Qf(X) is a martingale. Proof To prove this, we first note that reasoning as in the proof of (3.4) one can conclude that if r < s < t then E ((Xt - Xrf\T,) = E ((Xt - X,Y\T,) - 2(X, - Xr)E(Xt - X, \Ta) + (X, - XT)2 = E{{Xt-X,)2\T,)+{X,-Xr)2 since E(Xi-X3\T3)=0. Writing Qf = Qf + (Qf - Qf) and working out the difference, Qt - Qt = £(¾ - ***-i)2 + (¾ - X**(o)2 k=\ -^(^-¾^)2+ (^-^,.,)2 k=l = P^fc(.)+i _ ^<fc(.)) ~~ (X' ~ X*H.)) + Yl (***-***-i)2+(**-**x,,)2 k=k(i)+2 The next step is to take conditional expectation with respect to Ts and use the first formula with r = t^,) and t = ijt(4)+i. To neaten up the result, define a sequence un, k(s) — 1 < n < k(t) + 1 by letting uk^_1 = s, u,- = i,- for Ks) < * < K0> an(i uk(t)+i = *• E{X*-Q?{X)\T,) t(0+i = EiX?\r,)-QtiX)-E\ J2 (*««-*««-i) I t(*)+i = X2-Q?(X) T, T>
46 Chapter 2 Stochastic Integration where the second equality follows from (3.4). D Looking at (a) and noticing that Qf(X) is increasing except for the last term, the reader can probably leap to the conclusion that we will construct the process At desired in (3.1) by taking a sequence of partitions with mesh going to 0 and proving that the limit exists. Lemma (c) is the heart (of darkness) of the proof. To prepare for its proof we note that using (a), taking expected value and using (3.4) gives E(Q?(X) - Q?{X)\T,) = E(X? - X]\Ta) U =E((Xt-X3)2\T3) To inspire you for the proof of (c), we promise that once (c) is established the rest of the proof of (3.1) is routine. (c) Lemma. Let Xt be a bounded continuous martingale. Fix r > 0 and let A„ be a sequence of partitions 0 = ij} < i" ... < i£ = r of [0,r] with mesh |A„| = sup,. |t£ - tl_x\ -* 0. Then Q?"(X) converges" to a limit in L2. Proof If A and A' are two partitions, we call AA' the partition obtained by taking all the points in A and in A'. If we apply (a) twice and take differences we see that Yt = QApO-QA'(X) is a martingale. By definition Yi2-QAA'(Y) is a martingale, so E(Q?(X) - Q?(X))2 = EY? = EQ?A'{Y) For reasons that will become clear in a moment, we will drop the argument from QA when it is X and we will drop the r when we want to refer to the process t —» QA. Since (3.5) (a + 6)2 < 2a2 + 262 for any real numbers a and 6 (since 2a2 + 262 - (a + 6)2 = (a - 6)2 > 0) we have QrAA'00 < 2QAA'(QA) + 2QAA'(QA') Combining the last two results about Q we see that it is enough to prove that (d) if |A| + |A'| -* 0 then SQAA'(QA) -* 0. To do this let sk £ A A' and tj £ A so that tj < sfc < sk+i < tj+\. Recalling the definition of QA (X) and doing a little arithmetic we have = (X'k+i - X'k)(X'k+i + X*k - 2¾)
Section 2.3 Variance and Covariance Processes 47 QAA'(QA) < QrAA'(X)sup(X,fc+1 +X,fc -2X<.(fc))2 1/2 4 f Summing the squares gives QAA'(QA)<c where j(k) — sup{j : tj < Sk}. By the Cauchy-Schwarz inequality E(QAA'(QA)) < {EQA*'(XW ji? sup(X,fc+1 +X3k - 2Xt.w) Whenever |A| + |A'| —» 0 the second factor goes to 0, since X is bounded and continuous, so it is enough to prove (e) If \Xt\<M for all t then EQAA'(X)2 < 12M4. The value 12 is not important but it does make it clear that the bound does not depend on r, A, or A'. 2 \2 | Proof of (e) If T = AA' is the partition 0 = s0 < si... < sn = r then Qjrpo2 =(J2(x,m-x,m_1y \m=l / (e-1) =E(X,m-XJm_1)4 m=l + 2 £(X,m - XJm_1)2(Qj'(X) - Qj*m(X)) m=l To bound the first term on the right-hand side we note that if |X*| < M for all t then some arithmetic and (3.4) imply n n m=l m=l (e-2) =*m*eYx! -x2 x m=l < 4M2£X2 < 4M4 To bound the second term we note that (X,m — X,m_1)2 £ TSm then use (b) and |Xr - X,m \ < 1M to get E((X3m -XJm_1)2{Qj'(X)-Qj,m(X)}|^m) /e 3) ' = (X'» - Xim_1)2S({Qr(X) - QL(X)}|^J = (X,m -X^.J^Xr - Xim)2|^J <(2M)2(Xim-Xim_1)2
48 Chapter 2 Stochastic Integration Taking expected value in (e.3), summing over m, and using (3.4) as before, we get < a Enf2(X3m-X3m_1)\Qrr(X)-Q^JX))<4M'EJ2^L-^L-1 Ve-4^ m=l m=l < 4M2EX2 < 4M4 Plugging this bound and (e.2) into (e.l) proves (e) which completes the proof of (d) and hence of (c). • Q It only remains to put the pieces together. Let A„ be the partition with points k2~nr with 0 < k < 2". Since Qfm — Qfu is a martingale (by (a)), using the L2 maximal inequality (see e.g., (4.3) in Chapter 4 of Durrett (1995)) gives (f) E (sup |Qf» - Qf -12) < 4£|Q?» - Q^ |2 From this and (c) it follows that (g) There is a subsequence so that Qt n(fc) —» a limit At uniformly on [0,r]. Proof Since Qfu converges in L2, it is a Cauchy sequence in L2 and we can pick an increasing sequence n(k) so that for m > n(k) E\Q?--Q?"w\2<2-k Using (f) and Chebyshev's inequality it follows that P (sup \Qf "(fc+1) - Qf "(fc)| > 1/k2) < k*2~k Since the right-hand side is summable, the Borel-Cantelli lemma implies the inequality on the right only fails finitely many times and the desired result follows. D By taking subsequences of subsequences and using a diagonal argument we can get uniform convergence on [0,N] for all N. Since each Qf is continuous, At is as well. Because of the last term (Xt — Xk(t))2, Qt is not increasing. However, if m > n then k —* Qj2-„ is increasing, so k —* -dj^—r is increasing. Since n is arbitrary and t —* At is continuous it follows that t —* At is increasing.
Section 2.3 Variance and Covariance Processes 49 The last detail for the case in which X is a bounded martingale is to check that M2 — At is a martingale. This follows from the next result (with p = 2). (3.6) Lemma. Suppose that for each n, Z" is a martingale with respect to Tu and that for each t, Z" —» Zt in LP where p > 1. Then Zt is a martingale. Proof The martingale property implies that if s < t then E(Zt\T3) = Z". The right-hand side converges to Z3 in LP. To check that the left-hand side converges to E(Zt\Ts) we note that linearity properties of and Jensen's inequality for conditional expectation (see e.g., (1.1a) and (Lid) in Chapter 4 of Durrett (1995)) imply E\E(Z?\F,) - E(Zt\T,)Y = E\E{Z? - Z,\T,yP < EE(\Z? - Zt\p\T,) = E\Z? - Zt\P — 0 Taking limits in LP it follows that E{Zt\T3) = Z, and the proof is complete. D Proof of existence in (3.1) when X is a local martingale Our first step in extending the result to local martingales is to prove a result about the quadratic variation of a stopped martingale. Once one remembers the definition of Y^ given at the beginning of Section 2.2, the next result should be obvious: the quadratic variation does not increase after the process stops. (3.7) Lemma. Let X be a bounded martingale, and T be a stopping time. Then {XT) = {X)T. Proof By the optional stopping theorem (XT)2 — (X)T is a local martingale so the result follows from uniqueness. D Let Tn be a sequence of stopping times increasing to oo so that Y" = XTn ■ 1(t„>o) is a bounded martingale By the previous result there is a unique continuous predictable increasing process An so that (Y/1)2 - A? is a martingale. By (3.7) A? = A?+i for t < Tn so we can unambiguously define (X)t = A" for t < Tn. Clearly, (X)t is continuous, predictable, and increasing. The definition implies X$.nM ■ 1(t„>o) — {^Oi^a* is a martingale so X? — (X)t is a local martingale. D End of Existence Proof Finally, we have the following extension of (c) that will be useful later.
50 Chapter 2 Stochastic Integration (3.8) Theorem. Let Xt be a continuous local martingale. For every t and sequence of subdivisions A„ of [0,i] with mesh |A„| ->0we have sup \Qf - (X) - (X)31 — 0 in probability Proof Let S, e > 0. We can find a stopping time S so that Xf is a bounded martingale and P(S <t) < 5. It is clear that QA(X) and QA(XS) coincide on [0,5]. From the definition and (3.7) it follows that (X) and (Xs) are equal on [0,S]. So we have P (sup \QA(X) - (X), | > c) < 6 + P (sup \Qf (Xs) - (Xs), \ > e) Since Xs is a bounded martingale (c) implies that the last term goes to 0 as |A|-*0. D (3.9) Definition. If X and Y are two continuous local martingales, we let (X,Y)t = \((X + Y)t-(X-Y)t) Remark. If X and Y are random variables with mean zero, cov(X, Y) = EXY = \(E(X + Yf - E(X - Y)2) = |(var(X + Y) - var(X - Y)) so it is natural, I think, to call (X, Y)t the covariance of X and Y. Given the definitions of (X)t and (X, Y)t the reader should not be surprised that if for a given partition A = {0 = to < *i < *2 • • •} with limn tn = oo, we let k(t) = sup{fc :tk<t} and define Qt(X,Y) = Y^Xtk ~ Xt^XYtt - Ytk_J k=\ + (Xt - Xtk(l))(Yt - Ytk(l)) then we have (3.10) Theorem. Let X and Y be continuous local martingales. For every t and sequence of partitions An of [0,t] with mesh |An| ->0we have sup \Qf "(X, Y) - (X, Y)31 — 0 in probability 3<t
Section 2.3 Variance and Co variance Processes 51 Proof Since Qf "(X, Y) = i (Q?»(X + Y) - Qf »(X - y)) this follows immediately from the definition of (X, V) and (3.8). D Our next result is useful for computing (X, Y)t. (3.11) Theorem. Suppose X* and Y are continuous local martingales. (X, Y)% is the unique continuous predictable process At that is locally of bounded variation, has Aq = 0, and makes XtYt — At a local martingale. Proof From the definition, it is easy to see that XtYt - (X, Y)t = \[(Xt + Ytf - (X + Y)t - {{Xt - Yt)2 - (X - Y)t}] is a local martingale. To prove uniqueness, observe that if At and A't are two processes with the desired properties, then At — A't = (XtYt — A't) — (XtYt — At) is a continuous local martingale that is locally of bounded variation and hence = 0 by (3.3). D From (3.11) we get several useful properties of the covariance. In what follows, X, Y and Z are continuous local martingales. First since XtYt = YtXt, (X,Y)t = (Y,X)t —-"■"" Exercise 3.2. (X + Y, Z)t = (X, Z)t + (Y, Z)t. Exercise 3.3. (X - X0, Z)t = (X, Z)t. Exercise 3.4. If a, 6 are real numbers then (aX, bY)t = ab(X,Y)t. Taking a = 6, X = Y, and noting (X,X)< = (X)t it follows that (aX)t = a2(X)t. Exercise 3.5. (XT,YT) = (X,Y)T It is also true that (X,YT) = (X,Y)T Since we are lazy we will wait until Section 2.5 to prove this. We invite the reader to derive this directly from the definitions. The next two exercises prepare for the third which will be important in what follows:
52 Chapter 2 Stochastic Integration Exercise 3.6. If Xt is a bounded martingale then X? — (X)t is a uniformly integrable martingale. Exercise 3.7. If X is a bounded martingale and S is a stopping time then Yt = Xs+t—Xs is a martingale with respect to Ts+t and (Y)t = (X)s+t—(X)s- Exercise 3.8. If S < T are stopping times and (X)s = (X)t, then X is constant on [S, T]. Exercise 3.9. Conversely, if S < T are stopping times and X is constant on [S,T]then<X)s=POT. 2.4. Integration w.r.t. Bounded Martingales In this section we will explain how to integrate predictable processes w.r.t. bounded continuous martingales. Once we establish the Kunita-Watanabe inequality in Section 2.5, we will be able to treat a general continuous local martingale in Section 2.6. As in the case of the Lebesgue integral (i) we will begin with the simplest functions and then take limits to get the general case, and (ii) the sequence of steps used in denning the integral will also be useful in proving results later on. Step 1: Basic Integrands We say H(s,u) is a basic predictable process if H (s, w) = l(a6](s)C(w) where C G Ta. Let IIo = the set of basic predictable processes. If H = l(a,&]C and X is continuous, then it is clear that we should define jH,dX, = C(w)(Xi(w) -X„(w)) Here we restrict our attention to integrating over [0,oo), since once we know how to do that then we can define the integral over [0,t] by (H -X)t = J H, dX, = J H,lm(s)dX3 To extend our integral we will need to keep proving versions of the next three results. Under suitable assumptions on H, K, X and Y (a) (H ■ X)t is a continuous local martingale (b) ((H + K)-X)t = (H-X)t + (K-X)u (H-(X+Y))t = (H-X)t + (H-Y)t
Section 2.4 Integration w.r.t. Bounded Martingales 53 (c) (H -X,K. Y)t = /o H,K, d(X, Y), In this step we will only prove a version of (a). The other two results will make their first appearance in Step 2. Before plunging into the details recall that at the end of Section 2.2 we defined Ht to be bounded if there was an M so that with probability one Ht < M for all t > 0. (4.1.a) Theorem. If X is a continuous martingale and H £ bU0 = {H £ Uq : H is bounded}, then (H ■ X)t is a continuous martingale. Proof (0 0<t<a (H ■ X)t = \ C(Xt - Xa) a<t<b { C(Xb -Xa) b<t< CO From this it is clear that (H-X)t is continuous, (H-X)t £ Ft and E\(H-X)t\ < oo. Since (H -X)t is constant for t ¢. [a, b] we can check the martingale property by considering only a < s < t < b. (See Exercise 4.1 below.) In this case, E{{H-X)t\T,) - (H-X)3 = E((H-X)t - {H-X),\T,) = E(C(Xt - XS)\TS) = CE(Xt-X,)\Ts) = 0 where the last two equalities follow from C £ T, and E{Xt\J:3) —Xs. D Exercise 4.1. If Yt is constant for t ¢ [a, 6] and E{Yt\Ta) = Y3 when a < s < t <b then Yt is a martingale. Step 2: Simple Integrands We say H(s,w) is a simple predictable process and write H £ IIi if if can be written as the sum of a finite number of basic predictable processes. It is easy to see that if H £ IIi then H can be written as m where t0 < ti < ... < tm and Cj £ Tti-i ■ ^n tnis case we ^et /m HtdX^Y^CiiXu-Xu.J ••—1
54 Chapter 2 Stochastic Integration The representation in the definition of H is not unique, since one may subdivide the intervals into two or more pieces, but it is easy to see that the right-hand side does not depend on the representation of H chosen. (4.2.b) Theorem. Suppose X and Y are continuous martingales. If H, K € IIi then ((H + K) ■ X)t = (H ■ X)t + (K ■ X)t {H-{X+Y))t = {H-X)t + (H-Y)t Proof Let H1 = H and H2 — K. By subdividing the intervals in the defin- tions we can suppose H{ — Ya=.\ l(ti-.i,ti](s)C| for j = 1,2. In this case it is easy to see that each side of the first identity is equal to m YlP\ + (%){Xu-Xu-x) If H3 = Ylhzi l(<i_i,<.](s)Ci then each side of the second identity is m £=1 Since the sum of a finite number of continuous martingales is a continuous martingale, it follows from (4.1.a) and (4.2.b) that we have (4.2.a) Theorem. If X is a continuous martingale and H € 6IIi = {H € IIi : H is bounded}, then (H ■ X)t is a continuous martingale. Turning to our third result (4.2.c) Theorem. If X and Y are bounded continuous martingales and H, K € 6IIi, then (H-X,K-Y)t= I H,K,d(X,Y)3 Jo Consequently, E((H -X)t(K -Y)t) = E f* H3K3d(X,Y), and E(H -X)2 = E j H2d(X), Jo Remark. Here and in what follows, integrals with respect to (X, Y)3 are Lebesgue-Stieltjes integrals. That is, since s —* (X, Y)3 is locally of bounded
Section 2.4 Integration w.r.t. Bounded Martingales 55 variation it defines a cr-finite signed measure, so we fix an w and then integrate with respect to that measure. In the case under consideration, the integrand is piecewise constant so the integral is trivial. Proof To prove these results it suffices to show that Zt=(H-X)t(K -Y)t- J H3K3d(X, Y)3 Jo is a martingale, for then the first result follows from (3.11), the second by taking expected values, and the third formula follows from the second by taking H = K and X = Y. To prove Zt is a martingale we begin by noting that ((H1 + H2) ■ X)t(K ■ Y)t = (H1 ■ X)t(K ■ Y)t + (H2 ■ X)t(K -Y)t f (H} + H2)K3d(X,Y)3 = f H]K3d(X,Y)3+ f H2K3d{X,Y)3 Jo Jo Jo The last two observations imply that if the result holds for (H1, K) and (H2, K), it holds for (Hl + H2, K). Similarly, if the result holds for (H, K1) and (H, K2), it holds for (H, K1 + K2). In view of the results in the last paragraph, we can now prove the result by establishing it in the case H = l(a>6]C, K = l(cd]JD, and we can furthermore assume that (i) 6 < c or (ii) a = c, b = d. Case i. In this case /„ H3K3d(X,Y}3 = 0, so we need to show that (H-X)t(K- Y)t is a martingale. To prove this we observe that if J = C(Xb — Xa)Dl^Ci^ then (H ■ X)t(K ■ Y)t = (J ■ Y)t is a martingale by (4.1.a) Case ii. In this case {0 s <a CD {(X3 - Xa)(Y3 - Ya) - ((X, Y)3 - (X, Y)a)} a<s<b CD {(Xb -Xa)(Yb -Ya) - ((X,Y)b - (X,Y)a)} s > b so it suffices to check the martingale property for a < s < t < b. To do this we note Zt-Z, = CDiXtYt - X3Y3 - Xa(yt - Y3) - Ya(Xt -X3) -((X.Y^-fX.Y),)} Taking expected value and noting Xa G T3, E(Yt — Y3\T3) = 0 we have E(Zt - Z3\T3) = CDE{XtYt - {X,Y)t - {X3Y3 - (X,Y)3}\F3) = 0 This completes the proof of Case ii and finishes the proof of (4.2.c). D
56 Chapter 2 Stochastic Integration Step 3: Square Integrable Integrands, Bounded Martingales Let II2PO be the set of all predictable processes H that have \\H\\X = (eJH?d(X)?) 2<cx> Exercise 4.2. \\H\\x is a norm. Remark. If we define a measure on the predictable cr-field by lt(Ax(a,t]) = E((X)t-(X)t;A) when A £ Ts then II2PO = L2(U,n). fj. is called the Doleans measure after C. Doleans-Dade. For a proof that this is a measure see Chung and Williams (1990), pages 52-53. Let M.2 be the set of all martingales adapted to {Ft,t > 0} that have ||X||2 = (sup EX2)1'2 < 00 t In the proof of (4.6) we will see that M2 is isomorphic to I/2(Jr00). Exercise 4.3. X e M2 if and only if EX% < 00 and E(X)O0 < 00. The following is the key to our next extension (4.4) Isometry Property. If X is a bounded martingale and H G 6H1, then \\H-X\\2 = \\H\\X. Remark. We skipped over the number (4.3) since (4.3.a) and (4.3.b) will be proved later in this step. Proof Recalling the definitions and using the last formula in (4.2.c) gives \\H\\2X = EJH2d(X)3=snpEjtH2d(X)1 = supE(H-X)2 = \\H-X\\2 a t The first step in extending the integral to U2(X) is to prove that 6II1 is dense in H.2(X). Later we need a slight strengthening of this result so we go ahead and prove the stronger form.
Section 2.4 Integration w.r.t. Bounded Martingales 57 (4.5) Lemma. If for 1 < i < k we have X* G M2 and H G n2(Xx"), then there is a sequence Hn G 6¾ with \\Hn - H\\X' -* 0 for 1 < i < k. Proof Since X{ £ M2, E(X% < oo and it follows that 6IIi C n2(X*)- Let Tit be the collection of predictable G that vanish on (t, oo) for which the conclusion holds. Clearly, if r < s < t and A G TT then G = l(r,i]l.A €E Tit. Suppose now that 0 < Gn G Tit and G„ | G with G bounded. The dominated convergence theorem implies ||G- Gn\\7Xi = E J{G, - GlfdiX'), - 0 as n —» oo, so we can pick n so that the last difference is < e2 for 1 < i < k. Since G„ eTit, we can find a sequence Hn-m G &ni so that ||ifn'm-Gn||x' -♦ 0 for 1 < i < k and then pick m so that ||fP'm - G"||x-' < e for 1 < i < k. The triangle inequality implies that ||ifn'm — G||x>' < 2e for 1 < i < k. Since e is arbitrary it follows that G ETit. Using the monotone class theorem now ((2.3) in Chapter 1) it follows that Tit contains all bounded predictable processes that vanish on (t, oo). Now if K G n2pf') for 1 < i < k and we define It'" = J^l|A-|<„l[o,n] then the dominated convergence theorem implies that \\Kn — K\\x< —* 0. Since Kn is bounded and vanishes on (n, oo), another use of the triangle inequality implies K is a limit offPG&IIi. D (4.6) Theorem. M2 is complete. Proof Standard martingale convergence theorems imply that if X G M2, then as t —» oo, Xt converges almost surely and in L2 to a limit Xoo with EX^ = supt EX2, and the martingale can be recovered from Xoo by Xt = EiXcolTt). Let ^"oo = 0-(^,t > 0). Since IM = \imXt G ^"oo, the observation a"bove shows that X —» X^ maps M2 one-to-one into L2{Too). On the other hand, if Y G i2(^oo), then Yt = E{Y\Tt) is a martingale with Yt -* Y as t —» oo, and Jensen's inequality shows that EY2 = E{E{Y\Tt)2) < E{E{Y2\Tt)) = EY2 so Yt G M.2. Combining this with the previous observation shows that X —* Xoo is an isometry from M2 onto I/2(Jr0o) and proves (4.6). D With (4.5) and (4.6) established, we can give the Definition of the integral for n^X). To define H -X when X is a bounded continuous martingale and H G n2pO, let Hn G bill so that \\Hn-H\\x -* 0.
58 Chapter 2 Stochastic Integration Since \\Hn — Hm\\x ->Oasm,n-nx), (Hn ■ X) is a Cauchy sequence in M2 and by (4.6) must converge to a limit in M2, which we define to be the integral (H -X). To see that the limit is independent of the sequence of approximations chosen suppose \\Hn - H\\x -* 0 and \\Hn - H\\x -* 0 with #" • X -* Y and Hn ■ X —* Y. Form a third approximating sequence by setting Hn — Hn if n is odd, Hn = Hniin is even. \\Hn - H\\x -» 0 so we have Hn-X-+Z. Looking at the even and odd subsequences of Hn it follows that Y = Z = Y so the limit is unique. In words, the last argument shows that if the limit exists for ANY sequence of approximations, it must be unique. Note that as a corollary of the construction we get (4.3.a) Theorem. If X is a bounded continuous martingale and H G IL2PO then H ■ X G M2 and is continuous. Proof The fact that H-X G M2 is automatic from the definition. If H" G 6IIi have \\Hn — H\\x -* 0 then (4.2.a) implies (Hn ■ X)t is continuous. Since \\(Hn ■ X) — (H ■ X)\\2 —» 0, using Chebyshev and the L2 maximal inequality we get that P (sup \(Hn ■ X)t - (# • X)*| > e) < e-2£sup \{Hn -X)t - (H ■ X)t\2 < e~2 ■ 4||(if" • X) - (H ■ X)\\2 -+ 0 Since this holds for any e > 0 we say "(Hn -X)t converges uniformly to (H -X)t in probability." By passing to a subsequence we can have sap\(Hn-X)t-(H-X)t\-*0 a.s. t and it follows that t —» (H ■ X)t is continuous. D (4.3.b) Theorem. If X is abounded continuous martingale, and H, K G Il2(X) then H + K G H2(X) and ((H + It') • X)t = (H ■ X)t + (K ■ X)t Proof The triangle inequality for the norm || -||x implies that H+K G Il2(X). Let Hn,Kn G 6ni with \\Hn - H\\x -» 0 and \\Kn - K\\x -* 0. The triangle inequality implies \\{Hn + Kn) - (H + K)\\x -* 0. (4.2.b) implies ((Hn + Kn) ■ X)t = (Hn ■ X)t + (Kn ■ X)t Now let n -* oo and use the fact that (Gn ■ X)t -* (G ■ X)t in M2 when G=H,K,H + K. □
Section 2.5 The Kunita-Watanabe Inequality 59 The proofs of the remaining two equalities (H-(X + Y))t = (H-X)t + (H-Y)t (H-X,K-Y)t= I H,K,d(X,Y), Jo will be given in the next section. To develop the results there we will need to extend the isometry property from 6II1 to TL2(X). To do this it is useful to recall some of your undergraduate analysis. Exercise 4.4. If || • || is a norm and \\xn — x\\ -* 0 then ||zn|| -» ||x||. Exercise 4.5. If X is a bounded martingale and H € n2(X) then \\H -X\\2 — WWx- 2.5. The Kunita-Watanabe Inequality In this section we will prove an inequality due to Kunita and Watanabe and apply this result to extend the formula given in Section 2.4 for the covariance of two stochastic integrals. (5.1) Theorem. If X and Y are local martingales and H and K are two measurable processes, then almost surely »00 / foa \ 1/2 / »00 \ 1/2 jf \H,K3\d\(X,Y)l < [l H?d(X)3) [Jq K3sd(X),) where d\(X, Y)\3 stands for dV, where V, is the total variation of r —* (X,Y)r on [0,s]. Remai-ks. (i) If X = Y and d(X), = s then d(Y), = d\(X,Y)\, = s and (5.1) reduces to the Cauchy-Schwarz inequality, (ii) Notice that H and K are not assumed to be predictable. We assume only that H(s,u) and K(s,u) ate measurable with respect to Ti x T where Ii, is the Borel subsets of R. The reason we can attain this level of generality is that the notion of martingale does not enter into the proof after the first step. Proof Step 1: Observe that if a < t, (X+XY,X+XY)t > (X+XY,X+XY)3. If we let (M, N)*s = (M, N)t - (M, N)3, then 0 < {X + XY, X + \Y)t - (X + XY, X + AY), = (X,X)\-2X(X,Y)l + X2(Y,Y)3
60 Chapter 2 Stochastic Integration for all s, t, and A. If we fix s and t and throw away a countable number of sets of measure 0 then with probability one the last inequality will hold for all rational A. Now a quadratic ax2 + 6a; + c that is nonnegative at all the rationals and not identically 0 has at most one real root (i.e., 62 — 4ac < 0), so we have that Step 2: Let 0 = io < *i < ••• < tn be an increasing sequence of times, let hi,ki, 1 < iI < n, be random variables, and define simple measurable processes n #(-,^) = ^(001(1.--^)(-) :=1 n #(-,^) = 5^)1(,...,,,..)(-) i=i Prom the definition of the integral, the result of Step 1, and the Cauchy-Schwarz inequality, it follows that / HJ<3d(X,Y)3 Jo ^glMilKx.ytfU £=1 < ;£>i ((^)^,¾^17^ / n \!/2 / „ \l/2 proving that for simple measurable processes: Jo H,Ksd(X,Y)s\< ^jf H?d(X),j ^ K?d(Y),j which is (5.1) with the absolute values outside the integral. Step 3: Let M be a large number and let T - inf{i : {X)t or (Y), > M}. By the monotone convergence theorem, it suffices to prove (5.1) when H = K = 0 for s > T and \HS\, \K,\ < M for s < T. Having restricted our attention to [0,T], (X) and (Y) are finite measures; so using the bounded convergence theorem, we see that (5.2) holds for bounded measurable processes. To improve (5.2) to (5.1) (and complete the proof), let J, be a measurable process taking values in {—1,1} such that ( |d(x,y),|= ( J,d(x,Y), Jo Jo
Section 2.5 The Kunita-Wa.tana.be Inequality 61 To see that this exists, note that J, is the Radon-Nikodym derivative of d(X, Y)s with respect to d\(X,Y)3\. Now apply (5.2) to H, = \H,\ and K, = J3\K3\. □ With (5.1) established, we are now ready to take care of unfinished business from Section 4. (5.3) Theorem. If H G U2(X) D U2{Y) then H £ Il2{X + Y) and (H ■ (X + Y))t = (H ■ X)t + (H ■ Y)t Proof First note that Exercise 3.2 implies (X + Y)i = (X)i + (Y)i + 2(X,Y)i and the Kunita-Watanabe inequality, (5.1), implies \(X,Y)\t < (™r>01/2 < «X)t + (Y)t)/2 since 2a6 < a2 + 62, i.e., 0 < (a — 6)2. So we have (X + Y)t<2((X)t + (Y)t) (which should remind the reader of (z + y)2 < 2(z2 + y2)) and it follows that H G 112(^ + Y). To prove the formula now we note that by (4.5) we can find Hn in 6IIi so that \\Hn - H\\z -+ 0 for Z = X,Y,X + Y. (4.2.b) implies that (Hn ■ (X + Y))t = (Hn ■ X)t + (Hn ■ Y)t Now let n -» 00 and use {Hn ■ Z)t -»{H ■ Z)t for Z = X,Y,X+Y. D (5.4) Theorem. If X, Y are bounded continuous martingales, H € U2(X), K e II2(Y) then (H -X,K- Y)t = f H,K, d(X, Y), Jo Proof By (3.11), it suffices to show that (t) Zt = (H- X)t(K .Y)t- J H3K,d(X, Y), Jo is a martingale. Let Hn and Kn be sequences of elements of 6II1 that converge to H and K in TL2(X) and Il2(y), respectively, and let Z" be the quantity that results when Hn and Kn replace H and K in (f). By (4.2.c), Z? is a
62 Chapter 2 Stochastic Integration martingale. By (3.6), we can complete the proof by showing Z3 —* Z3 in L1. The triangle inequality and (4.3.b) imply that Esup\(Hn -X)t(Kn .Y)t - (H-X)t(K -Y)t\ t <Esup\((Hn-H)-X)t(Kn.Y)t\ t + Esuv\(H ■ X)t((Kn - K) -Y)t\ t To bound the first term, we use (i) the sup of the product is smaller than the product of the sup's, (ii) the Cauchy-Schwarz inequality, (iii) the L2 maximal inequality and (iv) the isometry property in Exercise 4.5: < £{sup \((Hn - H) • X),|sup \(Kn ■ Y)t\] t t < {£(sup \((Hn - H) ■ X),|2)£(sup \(Kn ■ Y)^2)}1/2 t t < 4||(fP - H) ■ X\\2\\Kn -Y\\2 = 4\\Hn ~ H\\x\\Kn\\Y - 0 as n -* oo, since \\Hn - H\\x -* 0 and \\Kn\\Y -* \\K\\y < oo. A similar estimate shows E(siiP\(H-X)t((Kn-K)-Y)t\\-*0 so we have shown {Hn ■ X)3{Kn ■ Y)3 -»{H ■ X)3{K ■ Y)3 in L1. To estimate the second term in Z" — Zt, we again begin by putting the absolute values inside, replacing the measure by its variation, and then using the triangle inequality I I H?Ky(X,Y)3- I H3K3d(X,Y)3 I Jo Jo < [ \Hn3K?-H3K3\d\(X,Y)\3 Jo < /V;--^||ICM<X,Y)|,+ f\H,\\K?-Ks\d\(X,Y)\, Jo Jo Using the Kunita-Watanabe inequality, (5.1), the first term is ft1 \1/2/ t* n1/2 < (jf |fl? - H,\*d(X),) [JQ \K7MYh
Section 2.6 Integration w.r.t. Local Martingales 63 Taking expected values and using the Cauchy-Schwarz inequality E f \H? - H,\\I<:\d\(X,Y)\3 Jo < (eJ* \h:- H3\2d(x)^j (eJ* ijci2^),) -o as n -* oo, since ||if" - H\\x -* 0 and \\Kn\\Y -* \\K\\y < oo. Combining this with a similar estimate on /J \H3\ \K3 - K3\d\(X,Y}\3 it follows that f H?K?d(X,Y)*-+ I H3K3d(X,Y)3 Jo Jo in L1. We have now shown Z3 —* Z3 in L1 and the desired result follows from (3.6). D Remark. In some treatments (see e.g., Revuz and Yor (1991) p. 130), if M G M2 and H g II2(M), H-M is DEFINED to be the unique L e M2 with L0 = 0 so that (L,N) = H -(M,N) for all JV G.M2. Exercise 5.1. Prove the uniqueness in the last claim. (5.4) allows us to do easily some things that we would have had to work to prove in Section 2.3. Let X and Y be continuous local martingales, and suppose X0 = Y0 — 0. The last assumption entails no loss of generality. See Exercise 3.3. Exercise 5.2. (X,YT) = (X,Y)T. Exercise 5.3. Xj{Yt —Y^) is a local martingale. 2.6. Integration w.r.t. Local Martingales In this section we will extend our integral so that the integrators can be continuous local martingales and so that the integrands are in H3(X) =(h: f H2d(X)3 < oo a.s. for all t > o|
64 Chapter 2 Stochastic Integration We will argue in Section 3.4 that this is the largest possible class of integrands. To extend the integral from bounded martingales to local martingales we begin by proving an obvious fact. (6.1) Theorem. Suppose X is a bounded continuous martingale, H, K G n2(X) and H3 = K3 for s < T where T is a stopping time. Then (H ■ X)3 = (K ■ X)3 for s < T. Proof Write H3 = H\ + H2 and K3 = It',1 + K2 where H) = K] = fU(,<T) = A',1(,<T) Clearly, {H1 ■ X)t = (K1 ■ Y)t for all t. Since H2 = K? = 0 for s < T, (5.4) implies (H2 ■ X)3 = (K2 -X)3 = 0 for s < T and it follows from Exercise 3.8 that (H2 ■ X)t = (It'2 • X)t = 0 for t < T. Combining this with the first result and using (4.3.b) gives the desired conclusion. D Exercise 6.1. Extend the proof of (6.1) to show that if X is a bounded continuous martingale, H, K € II2PO and H3 = K3 for S < s < T where S,T are stopping times then (H-X)3 - (H-X)s = (It -X)3 - (K -X)s for S < s < T. To extend the integral now, let Sn be a sequence of stopping times with S„ < Tn = inf-ff : \Xt\ >nor / H;d(X)3 > n} and Sn | 00. Let H3 = Hjl(,<sn)> and observe that if m < n, (6.1) implies that (Hm ■ X)t = (Hn ■ X)t for i < Sm, so we can define H ■ X by setting {H ■ X)t = (Hn ■ X)t for t < Sn. To" complete the definition we have to show that if Rn and Sn are two sequences of stopping times < Tn with R„ | 00 and Sn | 00 then we end up with the same (H ■ X)t. Let Hj be the stopped version of the process defined at the beginning of Section 2.2, let Qn = Rn A S„ and note that (6.1) implies (HR" ■ X)3 = (Hs" .-X)3 for s < Qn Since Qn | 00, it follows that (H-X)t is independent of the sequence of stopping times chosen. The uniqueness result and (6.1) imply (6.2) Theorem. If X is a continuous local martingale and H € ^(X) then HT-X = (H-X)T = H-XT = HT-XT
Section 2.6 Integration w.r.t. Local Martingales 65 In words if we set the integrand = 0 after time T or stop the martingale at time T or do both, then this just stops the integral at T. Our next task is to generalize our abc's to integrands in II3. (6.3) Theorem. If X is a continuous local martingale and H G II3(X), then (H ■ X)t is a continuous local martingale. Proof By stopping at T„ = inf {i : /J H?d(X)s > n or \Xt\ > n}, it suffices to show that if X is a bounded martingale and H € H.2(X), then (H ■ X)t is a continuous martingale but this follows from (4.3.a). D (6.4) Theorem. Let X and Y be continuous local martingales. If H, K € H3(X) then H + Ke H3(X) and {{H + K) ■ X)t = (H ■ X)t + (K ■ X), If He H3(X) n n3(Y) then H G n3(X + Y) and (H-(X + Y))t = (H-X)t + (H-Y)t Proof To prove the first formula we note that the triangle inequality for the norm || • ||x implies H + K € n3(X). Stopping at Tn = inf{t : \Xt\, J E] d(X)„ or J Kj d(Y)3 > n] reduces the result to (4.3.b) For the second formula we note that the argument in (5.3) shows that H € II3(X + Y) and by stopping we can reduce the result to (5.3). D After seeing the last two proofs the reader can undoubtedly improve (5.4) to (6.5) Theorem. If X, Y are continuous local martingales, H £ II3(X) and K e n3(Y) then (H-X,K.Y)t= I H3K3d(X,Y)3 Jo Using Exercise 3.2, we can generalize (6.5) to sums of stochastic integrals. (6.6) Theorem. Let X = J2T=i & 'X' and Y = Ej=iI<} "yi where the *l and YJ are continuous local martingales. If H* € II3(X') and K.i G n3(YJ), then {X,Y)t = Y,f Hl3K{d{X\Y!)3
66 Chapter 2 Stochastic Integration We leave it to the reader to make the final extension of the isometry property: Exercise 6.2. If X is a continuous local martingale and H € TL2(X) then H-XeM2aad\\H-X\\2 = \\H\\x. We will conclude this section by computing some stochastic integrals. The first formula should be obvious, but for completeness must be derived from the definitions. Exercise 6.3. Let X be a continuous local martingale. Let S < T < oo be stopping times, let C(w) be bounded with C(w) G Ts, and define H, = C for S < s < T and 0 otherwise. Then H £ E3(X) and I H, dX, = C(XT - Xs) For the rest of this section, A„ = {0 = tj < *i < • • • < *£(„) = *} iS a sequence of partitions of [0,t] with mesh |A„| = sup,- \t" — t"_x| —» 0. Our theme here is that if the integrand Ht is continuous then the integral is a limit of Riemann sums, provided we evaluate the integrand at the left endpoint of the intervals. (6.7) Theorem. If X is a continuous local martingale and t —* Ht is continuous then as n —» oo Y^Ht?{X(t?+1) - X(t?)} -* I H, dX, in probability Proof First note that H € U3(X) so the integral exists. By stopping at T = inf{s : s, (X)3, or \H,\ > M} and making H constant after time T to preserve continuity, we can suppose (X)t < M and \H,\ < M for all s. Let H? = Ht? for s G (*",*"+i], H? = Ht for s > t. Since s -* H3 is continuous on [0, t] we have for each w that sup | #,"-#,1—0 s The desired result now follows from the following lemma which will be useful later. (6.8) Lemma. Let Xt be a continuous local martingale with (X)t < M for all t. If Hn is a sequence of predictable processes with \Hn\ < M for all s, w and with sup, \H? - H, | -♦ 0 in probability then (Hn . X)-* (H ■ X) in M2.
Section 2.6 Integration w.r.t. Local Martingales 67 Proof Exercise 6.2 (or 4.4) implies \\(Hn - H) ■ X\\l = \\Hn - H\& < MS sup \H? - #,|2 ^ 0 s as n —» oo by the bounded convergence theorem. D The rest of the section is devoted to investigating what happens when we evaluate at the right end point or the center of each interval. For simplicity we will only consider the special case in which H, = 2XS. Exercise 6.4. If X is a continuous local martingale then ( 2X1dX,=X?-XQ2-(X)t Jo Exercise 6.5. Show that if X is a continuous local martingale and we evaluate at the right end point then X>X,r+i{X(tf+1) - X(t?)} - f 2X, dX, + 2(X)t = X? - Xl + (X)t i J° Comparing the last two exercises you might jump to the conclusion that if we evaluate at the midpoint we get an integral that performs like the one in calculus. However, we will not ask you to prove this in general. Exercise 6.6. Show that if Bt is a one dimensional Brownian motion starting at 0 then 2"-l J2 2B((Jfc + l/2)2-"0{B((Jfc + 1)2—0 - B(Jb2-"0} converges in probability to fQ 2BS dB, + t = Bf. This approach to integration, called the Stratonovich integral, is more convenient than Ito's approach in certain circumstances (e.g., diffusion processes on manifolds) but we will not discuss it further here. Changing topics we have one more consequence of (6.7) that we will need in Chapter 5. Exercise 6.7. Suppose h : [0, oo) -» R is continuous. Show that /0 h, dB, has a normal distribution with mean 0 and variance /0 h/; ds.
68 Chapter 2 Stochastic Integration 2.7. Change of Variables, Ito's Formula This section is devoted to a proof of our first version of Ito's fromula, (7.1). In Section 2.10, we will prove a bigger and better version, (10.2), with a slicker proof, but the mundane proof given here has the advantage of explaining where the term with /" comes from. (7.1) Theorem. Suppose X is a continuous local martingale and / has two continuous derivatives. Then with probability 1, for all t > 0 f(Xt) - f(X0) = f f'{X.)dX. + | f f"(X,)d(xh Remark. If At is a continuous function that is locally of bounded variation, and / has a continuous derivative then (see Exercise 7.2 below) (7.2) f(At)-f(Ao)= I f'(As)dA, Jo As the reader will see in the proof of (7.1), the second term comes from the fact that local martingale paths have quadratic variation (X)t, while the 1/2 in front of it comes from expanding / in a Taylor series. Proof By stopping at Tm = inf{< : \Xt\ or (X)t > M}, it suffices to prove the result when \Xt\ and (X)t < M. From calculus we know that for any a and 6, there is a c(a, 6) in between a and 6 such that (7.3) /(6) - /(a) = (6 - a)f(a) + |(6 - aff"(c(a, 6)) Let t be a fixed positive number. Consider a-sequence A„ of partitions 0 = fS < <? ... < <tn = t with mesh |A„| -* 0. From (7.3) it follows that f{Xt) - f(XQ) = £/(¾) - f(Xtf) i (7.4) = z2f'i.xt?)(xn+i ~xn) i i wheretf(w) = /"(c(Xtr,X,? )).
Section 2.7 Change of Variables, Ito's Formula 69 Comparing (7.4) with (7.1), it becomes clear that we want to show (7.5) £f'(Xt?)(Xt?+i - X,?) - f f(X,)dX, (7-6) \ X>M(**?+1 " ^?)2 -*\J* f"{X,)d(X)s in probability asn-^oo. (7.5) follows from (6.8). To prove (7.6), we let G? = tf («) = /"(c(X«»,X«, ) when a € (t?,t?+1], G? = /"(X,) for a > t and let so that and what we want to show is (7.7) f(%dA\-* ff"{Xs)d{X)s Jo Jo To do this we begin by observing that the uniform continuity of /" implies that as n —* oo we have G" —* f"(XsAt) uniformly in s, while (3.8) implies that A" converges in probability to (X)s. Now by taking subsequences we can suppose that with probability 1, A"M converges weakly to (X)sM. In other words, if we fix w and regard s —* A"M and s —* (X)sAt as distribution functions, then the associated measures converge weakly. Having done this, we can fix w and deduce (7.7) from the following simple result: (7.8) Lemma. If (i) measures \in on [0,t] converge weakly to fj.co, a finite measure, and (ii) gn is a sequence of functions with \gn\ < M that have the property that whenever sn £ [0,t] —* s we have gn(sn) —* g(s), then as n —* oo / ffn^n -* gdp Proof By letting /4i(j4) = Mn(^)/Mn([0)<])) we can assume that all the \in are probability measures. A standard construction (see (2.1) in Chapter 2 of Durrett (1995)) shows that there is a sequence of random variables Xn with distribution \in so that Xn —* Xoo a.s. as n —* oo The convergence of #„ to 5
70 Chapter 2 Stochastic Integration implies gn(Xn) —» g(Xco), so the result follows from the bounded convergence theorem. D (7.8) is the last piece in the proof of (7.1). Tracing back through the proof, we see that (7.8) implies (7.7), which in turn completes the proof of (7.6). So adding (7.5) and using (7.4) gives that for each t, f(Xt) - f(X0) = J f'(X,)dX, + i y* f"(X,)d(X). a.s. Since each side of the formula is a continuous function of t, it follows that with probability 1 the equality holds for all t > 0, the statement made in (7.1). D (7.9) Remark. In Chapter 3 we will need to use Ito's formula for complex valued /. The extension is trivial. Write / = u + iv, apply Ito's formula to u and v, multiply the v formula by i and add the two. Exercise 7.1. Let At be a continuous process with total variation \A\t < M for all t. If Kn is a sequence of measurable processes with \Kn(s, w)| < M for all s,w and sup, \K? - K,\ — 0 then suPl |(JiTn - K) ■ A)t\ — 0. Exercise 7.2. Use the fact that /(6) — /(a) = /'(c(a, 6))(6 — a) for some c between a and 6, and Exercise 7.1 to prove (7.2). 2.8. Integration w.r.t. Semimartingales X is said to be a continuous semimartingale if Xt can be written as Mt + At, where Mt is a continuous local martingale and At is a continuous adapted process that is locally of bounded variation. A nice feature of continuous semimartingales that is unheard of for their more general counterparts is (8.1) Theorem. Let Xt be a continuous semimartingale. If the (continuous) processes Mt and At are chosen so that Aq = 0, then the decomposition Xt = Mt + At is unique. Proof If Mi + A't is another decomposition, then At — A't = M[ — Mt is a continuous local martingale and locally of bounded variation, so by (3.3) At—A't is constant and hence = 0. D Remark. In what follows we will sometimes write "Let Xt = Mt + At be a continuous semimartingale" to specify that the decomposition of Xt consists of the local martingale Mt and the process At that is locally of bounded variation.
Section 2.8 Integration w.r.t. Semimaxtinga.les 71 Exercise 8.1. Show that if Xt = Mt + At and X't = M[ + A\ are continuous semimartingales then Xt + X[ = (Mt + M't) + (At + A't) is a continuous semimartingale. In this section we will extend the class of integrators for our stochastic integral from continuous local martingales to continuous semimartingales. There are three reasons for doing this. (i) If X is a continuous local martingale and / is C2 then Ito's formula shows us that f(Xt) is always a semimartingale but it is not a local martingale unless f"(x) = 0 for all x. In the next section we will prove a generalization of Ito's formula which implies that if X is a continuous semimartingale and / is C2 then f{Xt) is again a semimartingale. (ii) It can be argued that any "reasonable integrator" is a semimartingale. To explain this, we begin by defining an easy integrand to be a process of the form n :=0 where 0 = T0 < T\ < ■ • • < Tn+\ are stopping times, and the Hi € Tt{ have \Hi\ < oo a.s. Let bHeii be the collection of bounded easy predictable processes that vanish on (t, oo) equipped with the uniform norm pr||u=sup|ff,(«)| For easy integrands we define the integral as n :=1 Finally, let L° be the collection of all random variables topologized by convergence in probability, which comes from the metric ||J\r||0 = E(\X\/(1 + \X\)). See Exercise 6.4 in Chapter 1 of Durrett (1995). A result proved independently by Bichteler and Dellacherie states (8.2) Theorem. If H -* (H ■ X) is continuous from 6ne>1 -* L° for all t then X is a semimartingale. Since any extension of the integral from blleii will involve taking limits, we need our integral to be a continuous function on the simple integrands (in some sense). We have placed a very strong topology on the domain and a weak
72 Chapter 2 Stochastic Integration topology on the range, so this is a fairly weak requirement. Protter (1990) takes (8.2) as the definition of semimartingale. In his approach one gets some very deep results without much effort but then one must sweat to prove that a semimartingale (defined as a good integrator) is a semimartingale (as defined above). (iii) Last but not least, it is easy to extend the integral from local martingales to semimartingales if we replace Ilz(X) by a slightly smaller class of integrands that does not depend on X. Getting back to business, let Xt = Mt+Ai be a continuous semimartingale. We say H G /611 = the set of locally bounded predictable processes, if there is a sequence of stopping times Tn j oo so that \H(s, w)| < n for s < Tn. If H £ /611 we can define (H.A)t(u)= f Hs{uj)dA,{w) Jo as a Lebesgue-Stieltjes integral (which exists for a.e. w). To integrate with respect to the local martingale Mt we note that f nHU(M)s<n2(M)Ta<oo Jo and Tn —* oo. So /611 C n3(M), we can define (H ■ M)t, and let (H ■ X)t = (H ■ M)t + (H ■ A)t since by the uniqueness of the decomposition this is an unambiguous definition. From the definition above, it follows immediately that we have (8.3) Theorem. If X is a continuous semimartingale and H £ /611, then (H-X)t is a continuous semimartingale. The second of our abc's is also easy. We name it in honor of the similar relationships between addition and multiplication. (8.4) Distributive Laws. Suppose X and Y are continuous semimartingales, and H, K G Ibtt. Then ((H + K) ■ X)t = (H ■ X)t + (K ■ X)t (H.(X + Y))t = (H.X)t + (H.Y)t
Section 2.8 Integration w.r.t. Semimartingales 73 Proof This follows easily from the definition of the integral with respect to a semimartingale, the result for local martingales given in (6.4), and the fact that the results are true when X and Y are locally of bounded variation. In proving the second result we also need Exercise 8.1. D The rest of this section is devoted to defining the covariance for semimartin- gales and proving the third of our abc's. (8.5) Definition. If X = M + A and X' = M' + A' are continuous semimartin- gales we define the covariance (X, X')t = {M,M')t. To explain this we recall the approximate quadratic variation Qf-(X) defined in Section 2.3, and note (8.6) Theorem. Suppose X = M + A and X' = M' + A' are continuous semimartingales. If A„ is a sequence of partitions of [0, t] with mesh |A„| —» 0 then Qf »(X,X') -» (M,M')t in probability Proof Since Qf"(X,X') = Q?«(M,M') + Q?o(A,M') + Q^(X,A') it suffices to show that if Yt is a continuous process and Vt is continuous and locally of bounded variation then Qfn{Y, V) —* 0 almost surely. To do this we note that Q?"(Y,V) < \V\tsup \YtT -Yt?| i As n —» oo the second term —» 0 almost surely, while the first one has \V\t < oo and the desired result follows. D (8.7) Theorem. Suppose X' and YJ are continuous semimartingales H',K^ G /MI, X = ££ 1 W ■ X\ Y = £"=1 Ki ■ Y'\ then (X,Y)t = J2 fHJKjdiX',^), Proof Let M' and JVJ be the local martingale parts of X' and YJ, let M = YT=i Ri ■ Mi> and N = E"=i Rj ■ Nj- Tt follows from our definition that (X,Y)t = (M,N)U so using (6.6) and then (M\Nj)t = (X^Y'h gives the desired result. D
74 Chapter 2 Stochastic Integration 2.9. Associative Law Let Xt be a continuous semimartingale. In some computations, it is useful to write the integral relationship Yt= [ K,dX, Jo as the formal equation dYt = KtdXt where the dYt and dXt are fictitious objects known as "stochastic differentials." A good example is the derivation of the following formula, which (for obvious reasons) we call the associative law: (*) H-(K-X) = (HK) ■ X Proof using stochastic differentials d(H ■ Y)t = HtdYt. Letting Yt = (K ■ X)t and observing dYt = KtdXt gives d(H ■ (K ■ X))t = Ht d(K ■ X)t = HtKt dXt = d((HK) ■ X)t D The above proof is not rigorous, but the computation is useful because it tells us what the answer should be. Once we know the answer, it is routine to verify by checking that it holds for basic predictable processes and then following the extension process we used for defining the integral to conclude that it holds in general. (9.1) Lemma. Let X be any process. If H,K € III, then (*) holds. Proof Since the formula is linear in H and in K separately, we can without loss of generality suppose H = l(a,6]C an(i K = l(c,d]-D an<i further that either (i) 6 < c or (ii) a = c, 6 = d. In case (i), both sides of the equation are = 0 and hence equal. In case (ii), (0 0<t <a (K-X)t = {D(Xt-Xa) a<t<b [D(Xb-Xa) b<t <oo so „ rO 0<t<a (H • (K ■ X))t = { CD(Xt - Xa) a<t<b {CD(Xb-Xa) b<t<oo and it follows that (H ■ (K • X))t = ((HK) -X)t. D
Section 2.9 Associative Law 75 To extend to more general integrands, we will take limits. First, however, we need a technical result. (9.2) Lemma. Suppose fi is a signed measure with finite total variation, h and k are bounded, and let ^([0,<]) = fQ k,fi(ds). Then / h, u(ds) = / h,k, n(ds) Jo Jo Proof of (9.2) The Radon-Nikodym derivative dv/dp = k, so the desired identity is just the well known fact that J hsv(ds)= / hs — n(ds) See for example Exercise 8.7 in the Appendix of Durrett (1995). D To simplify the extension to n2(X), we will take the limit for K and then for if. (9.3) Lemma. Let X be a continuous local martingale. If H £ bill and K e n2(X), then (*) holds. Proof Let Kn G 611! such that Kn —► K in Tl2(X). Since H is bounded, HKn -* HK in n2(A'), and it follows from Exercise 6.2 that (HKn) ■ X -* (HK) ■ X in M2- To deal with the left side of (*), we observe that using (6.4) twice, Exercise 6.2, (6.5) with (9.2), and the fact that Hs is bounded we have Mi/ - (X - ^) - i/ - (JC" - ^)||| = ||i/ - ((A' - Jv") - ^)||| y«co = E H?d((K-Kn)-X), Jo i»CO = Ej H?(K,-K?)2d(X), <C\\K-Kn\& so as n -» oo, H ■ (Kn ■ X) -* H • (Jv • X) in M2 and the result follows. D (9.4) Lemma. Let X be a continuous local martingale. If H £ n2(Jv -X) and K e IL2(X), then (*) holds.
76 Chapter 2 Stochastic Integration Proof Let Hn g MIi such that Hn -+ H in n2(J<T -X). (6.4) and Exercise 6.2 imply \\Hn -(K.X)-H- (K -X)\\2 = \\(H" - H) ■ (K ■ X)\\2 = \\H" - H\\K.X so Hn • (Jv -X) -* H ■ (K ■ X) in M2. To deal with the right side of (*), we observe that using the definition of the norm then (6.5) with (9.2) f°° \\HnK - HK\\2X = E J {H? - Hs)2K2d(X)s = l|tf°-#llix-o so applying Exercise 6.2 we have HnK ■ X —* HK ■ X in M2. D (9.5) Lemma. Let X be a continuous local martingale. If K G Il3(Y) and H e n3(JC • X) then (*) holds. Proof Let Tn = inf{< : /J K*d{X)t or /J H~d(K ■ X), > n}. Let Hf- = H'h><T„) and Kjn = Ksl(,<Kay (9-4) implies that ^T".(A'T".X) = (^T"A'T»)-X Using (6.2) now it follows that (H ■ (K ■ X))t = (HK ■ X)t for t < Tn, and letting n —* oo gives the desired result. D (9.6) Associative Law. Suppose X is a continuous semimartingale. If H, K G /6n then H ■ (K ■ X) = HK ■ X. Proof In view of (9.5) and the definition of the integral it suffices to prove the result when Xt = At is continuous adapted and locally of bounded variation. However, by stopping the first time \Ha\, \KS\ or the variation of A on [0,s] exceeds n the statement reduces to something covered by (9.2). D 2.10. Functions of Several Semimartingales In this section, we will prove a version of Ito's formula for functions of several semimartingales. The key to the proof is the following (10.1) Integration by Parts. If X and Y are continuous semimartingales then XtYt - XQYo = [ Y,dX,+ f XsdYs + (X,Y)t Jo Jo
Section 2.10 Functions of Several Semimartingales 77 Proof Let An = {<"} be a sequence of sub divisions of [0,t] with mesh |A„| going to 0. We can write XtYt - XQYQ = 2^(^i?+1 - ^")(^i+i - **?) + £(¾ - K?)Yt? + £**?(*?+1 - Yt?) i i (8.6) implies that £(x„+1 - xti)orti+l - Yti) -+ {x,Y)t i in probability, while application of (6.7) to each of the last two sums implies that they converge to fQ YsdXs and to fQ XsdYs. D When Yt is locally of bounded variation then (X,Y)t = 0 and this reduces to the ordinary integration by parts formula of Lebesgue-Stieltjes integration. f Y,dX, = XtYt - X0Y0 - f X,dY, Jo Jo Note that the right hand side makes sense in a path-by-path sense. We are now ready to prove the following generalization of (7.1): (10.2) Ito's formula. Let / : Rd — R and 0 < c < d. If X},...,Xf are continuous semimartingales and X£+1,..., Xf are locally of bounded variation then f(Xt) - f(X0) = jrf Dd{Xs)dX\ + \ Y, f Dijnx.WX'.Xi). provided the derivatives Djf, 1 < j < d and Aj/, 1 < i,j, < c exist and are continuous. Remark. At this point saving a derivative or two may seem like pinching pennies but when we get to Chapter 4 and apply this result to u(t,Bt) we will be very happy to not have to check that the partial derivatives utt and uttXi exist. Proof By stopping, it suffices to prove the result when |Xj|,(X')t < M for all i, t. Since any function / satisfying the hypotheses can be approximated by
78 Chapter 2 Stochastic Integration polynomials gn in such a way that gn,Dign, 1 <i < d and Ajffn, 1 < hj < c converge to /, A/, and Aj/ uniformly on [—M,M]d, it suffices to prove the result when / is a polynomial. For then (6.8) and Exercise 7.2 allow us to pass from the result for gn to that for /. To prove the result for a polynomial, it suffices by linearity to prove the result when / is a monomial xklxk* ■ • • xkn where &i, &2,..., &„ G {1,..., d). (The reader should note that ki,... ,kn are superscripts, not powers, e.g., our monomial might be x1xix1x1x2.) If n = 1 and k\ = k, then (10.2) says that Xk-Xk = f JO ldX? which is trivially true. To prove the result for a general monomial, we use induction. Let Yt = Yim=i Xt be a monomial for which (10.2) holds, and let Zt = X^n+1). Applying (10.1) gives YtZt-Y0Z0 = J Z,dY,+ J YtdZt + (Y,Z)t Jo Jo Applying (10.2) to Y gives (10.3) z ,,j:<» J0 I m=1 d(Xk^,Xk^)s so using the associative law (9.6), rt rt /n+l ZsdYS=j: n^(m J° ,<n J° I »-i 1 /•* / "+1 dXk& \ J d(XkV\XkW)s By definition, pYsdZs = f (n xk^ ] ^^+1)
Chapter Summary 79 To evaluate the third term (Y, Z)t, we observe that by (10.3) and the formula for the covariance of stochastic integrals, (8.7), Adding the last three equalities gives YtZt - YoZo = E /| ff X>(m) I dxsW 2t*WJoWj ) (Notice that for each i in the sum for (Y, Z)t, there are two terms i = i, j = n+1, and i = n + 1, j = i in the last sum.) The proof is complete. D Chapter Summary With the completion of the proof of the multivariate form of Ito's formula, we have enough stochastic calculus to carry us through most of the applications in the book, so we pause for a moment to recall what we've just done. In Section 2.1, we introduced the integrands Hs, the predictable processes and in Section 2.2 the (first) integrators Xt, the continuous local martingales. In Section 2.3 we defined the variance process associated with a local martingale, and more generally the covariance (X, Y)t of two local martingales. Theorem (3.11) tells us that the covariance could have been defined as the unique continuous predictable process At that is locally of bounded variation, has A0 = 0, and makes XtYt — At is a martingale. The next result gives a pathwise interpretation of (X, Y)t: (3.10) Quadratic Variation. Suppose X and Y are continuous local martingales. If A„ is a sequence of partitions of [0, t] with mesh |A„| —» 0 then in £.(*** - Xtk_J ■ (Ytk - Y^J - (X,Y)t in probability t=i Taking X = Y we see that (X)t is the "quadratic variation" of the path up to time t.
80 Chapter 2 Stochastic Integration In Section 2.4-2.5, we denned the integral, beginning with bounded continuous martingales and considering a sequence of progressively more general integrands. Ho l(0li](fl)C(w) with Ce?a III finite sum of elements of IIo n2 (X) H with E /0°° H; d{X)s < oo H3(X) H with /o H; d(X), < oo a.s. for each t A key role in this development was played by the (5.1) Kunita-Watanabe Inequality. If X and Y are local martingales and H and K are two measurable processes, then almost surely yoo / pea \ 1/2 / pea \ 1/2 JQ \HsKs\d\{X,Y)\s < (JQ HU(X)S) (Jq KU(X)s) where d\(X, Y)\, stands for dV, where V, is the total variation of r —» (X,Y)r on [0,s]. In Section 2.6, we generalized the integrators X, to continuous local martingales and, in Section 2.8, to continuous semimartingales which are the sum (in a unique way) of a continuous local martingale M, and a continuous process A, that is locally of bounded variation. Further, we defined the covari- ance of two semimartingales Xt = Mt + At and X't = M[ + A't to be that of their martingale parts, i.e., (M,M')t, and generalized the quadratic variation interpretation in (3.10), see (8.6). To be able to integrate with respect to semimartingales we had to restrict the integrands to the class of locally bounded predictable processes, £611, but in this generality the integral has many nice properties. (8.3) Closure under Integration. If X is a continuous semimartingale and H £ IbU, then (H ■ X)t is a continuous semimartingale. (8.4) Distributive Laws. Let X and Y be continuous semimartingales, and H,K eibll. ((H + K) ■ X)t = (H ■ X)t + (K ■ X)t (H-(X + Y))t = (H-X)t + (H-Y)t
Chapter Summary 81 (8.7) Covariance of Stochastic Integrals. Suppose X' and YJ are continuous semimartingales H\I<.i € IbU, X = YT=i H{ -X\Y = £"=1 K> • Y-», then <x,Y), = W*ffjff2d<x',YJ), M J° (9.6) Associative Law. Let X be a continuous semimartingale. If H,K € /611 then H -{K-X) = HK-X. As the reader probably remembers, the first three results were only obtained after several improvements Integrands = nx n2 n3 /611 a. Closure (4.2.a) (4.3.a) (6.3) (8.3) b. Distributive Laws (4.2.b) (4.3.b), (5.3) (6.4) (8.4) c. Covariance Formula (4.2.c) (5.4) (6.5), (6.6) (8.6), (8.7) In Section 2.7, we proved a change of variables formula. This led to: (10.1) Integration by Parts. Let X and Y be continuous semimartingales. XtYt -XQYo = j Y,dX,+ j X,dY, + (X,Y)t Jo Jo which in turn led to a bigger and better change of variables formula. (10.2) Ito's formula. Let / : Rd -* R and 0 < c < d. liX],...,Xf are continuous semimartingales and X£+1,..., Xf are locally of bounded variation £(J° 2i<U<'Jo provided the derivatives Djf, 1 < j < d and Aj/. 1 < i, j, < c exist and are continuous. Coming Attractions Section 2.11 is devoted to a proof of the Meyer-Tanaka formula, which extends the version of Ito's formula given in (7.1) to / that are a difference of two convex functions. This leads in a nice way to the definition of an occupation time density (or "local time") L\ for a continuous semimartingale that is jointly continuous in t and a.
82 Chapter 2 Stochastic Integration The developments described in the previous paragraph are used only in (9.3) of Chapter 4 and in Example 2.1 of Chapter 5. The material in Section 2.12 concerning Girsanov's formula is not used until Section 5.5. We would like to suggest that the reader proceed now to the beginning of Chapter 3. 2.11. The Meyer-Tanaka Formula, Local Time In this section we will prove an extension of Ito's formula due to Meyer (1976) that has its roots in work of Tanaka (1963). See (11.4). Loosely speaking it says that Ito's formula is valid if / is a difference of two convex functions. Since such functions are differentiable except at a countable set and have a second derivative that is a nice signed measure, this is a modest generalization. However, this is the best one can do in general: if Bt is a standard Brownian motion and Xt = f(Bt) is a semimartingale then / must be the difference of two convex functions. See Cinlar, Jacod, Protter, and Sharpe (1980). The developments in this section follow those in Sections 4 and 5 of Chapter IV of Protter (1990) but things are simpler here since we have no jumps. Our first step is (11.1) Theorem. Let / : R -+ R be convex and let X be a continuous semimartingale. Then f(X) is a semimartingale and f(Xi)-f(XQ)= [ f'(Xt)dXt+Kt Jo Here f'(x) = lim/ijo(/(a0 — f{x — h))/h is the left derivative which exists at all x by convexity and K% is a continuous adapted increasing process. Proof Let Xt = Mt + At be the decomposition of the semimartingale. By stopping we can suppose without loss of generality that \Xt\ < N, \A\t < N and (M)t < N for all t. Let g be a C°° function with compact support in [—oo,0] and having fg(s)ds = 1. Let (a) fn(x) = Jf(x+l)g(y)dy Then /„ is convex and C°°. Using Ito's formula we have (b) fn(Xt) - fn(XQ) = Jq f'n(X>) dX' + \JQ fn(X') dW> Our strategy for the proof will be to let n -+ oo in (b) to get the desired formula. Since a convex function is Lipschitz continuous on each bounded
Section 2.11 The Meyer-Tanaka Formula, Local Time 83 interval, it is easy to see that /n(z) —» f(x) uniformly on compact sets. Since \Xt\ <N, it follows that (c) fn(Xt)-+f(Xt) uniformly in t To deal with the first term on the right, we differentiate (a) to get (d)' f'n{*) = jf'(x+l)g(y)dyU'(x) as n — oo. Now if I? = /J fn(Xs)dMs and It = /J f'(X,)dMs then the I/2 maximal inequality for martingales and the isometry property of stochastic integrals (Exercise 6.2), imply E (sup (I? - It)A < 4sup E(I? - Itf = AE {fn{Xs)-f\Xs)fd{M)s^Q Jo by the bounded convergence theorem. (Recall (M)^ < N.) By passing to a subsequence we can improve the last conclusion to (e) sup (J"fc — It)2 —* 0 almost surely t Let Jt" = fQf'n{Xs)dAs and Jt = fQf'(Xs)dAs. In this case it follows from the bounded convergence theorem that for almost every w ,oo (f) sup |J," - Jt\ < / \fn(Xs) - f(X,)\ dA, -+ 0 t Jo To take the limit of the third and final term K" = | Jo fn(X*) d(X)s we note that (b) implies (g) K?" = fnk (Xt) ~ fnk (Xo) ~ I f'nk (X, ) dX, JO Combining (c), (e), and (f) we see that almost surely the right hand side of (g) converges to a limit uniformly in t, so the left hand side does also. Since Jt'"fc is continuous adapted and increasing the limit is also. D If we fix X then for a given / we call K the increasing process associated with /. The increasing process associated with |z — a\ is called the local
84 Chapter 2 Stochastic Integration time at a, and is denoted by L". To prove the first property that justifies this name, (11.3), we need the following preliminary (11.2) Lemma. The increasing process associated with (z — a)+ or (z — a)- is (1/2)L?. Proof Let f\(x) = (x — a)+, /2(2) = (x — a)-, and let K\ be the increasing process associated with /;. We begin by observing /1+/2 = \x—a\, so K}+K$ = L°. Our second claim is that since /1 — /2 = x — a we have K] — K"l = 0. To see this let g(x) = x — a which has g'(x) = 1 and note Xt — X0 = fQ 1 cW3 so the associated K% must be = 0. D (11.3) Theorem. L\ only increases when Xt = a or to be precise, if we let £a be the measure with distribution function t —* L", then £a is supported by {i: Xt = a}. Proof Intuitively, |.Xt — a\ is a local martingale except when Xt = a so L? is constant when Xt ^ a. To prove (11.3), however, it is easier to use the alternative definitions in (11.2). Let S <T be stopping times so that [S, T] C {t : Xt < a}. Applying (11.1) to f(x) = (x — a)+ we have {Xt - a)+ - (Xs -*)+ = Js l{x,>a} dXt + \{LaT - L%) The left-hand side and the integral on the right-hand side vanish so Ly = Las. Since this holds when S = q with q any rational and T = inf {t > S : Xt > a — 1/n) for any n, it follows that la({t : Xt < a}) = 0. A similar argument using (x — a)~ instead of (x — a)+ shows £a({t : Xt > a}) = 0 and the proof is complete. D We are now ready for our main result. (11.4) Meyer-Tanaka formula. Let X be a continuous semimartingale. Let / be the difference of two convex functions, /' be the left derivative of / and suppose /" = fj. in the sense of distribution (i.e., fj. has distribution function /'). Then f(Xt) - f(X0) = J f'(Xs) dX, + l-jn{da)Lat Proof Since / is the difference of two convex functions and the formula is linear in /, we can suppose without loss of generality that / is convex. By
Section 2.11 The Meyer-TanaA-a Formula, Local Time 85 stopping we can suppose \Xt\ < N and (X)t < N for all t. Having done this we can let 1 fN 9(^)=-J /*(da)|z-a| Since (/ - g)" = 0 on [-N, N] it follows that f(x) - g(x) = a + bx for |z| < N. Since the result is trivial for linear functions, it suffices now to prove the result for g, which is almost trivial. Differentiating the definition of g we have 1 fN a' (*) = J J V(da) sign(z - a) Starting with the definition of the local time \Xt - a\ - \XQ - a\ = J sign(X, - a) dX, + L\ Jo and then integrating 1/2 J_N fj.(da) gives giXt) - g(XQ) = I f li(da) f sign(X, - a) dX, t J-N JO 1 fN + -J^{da)Lat (11.5) and (11.6) below will justify interchanging the two integrals in the first term on the right-hand side to give g(Xt) - g(X0) = [ g'(Xs) dXs + \ j ^da)Lat Jo l J-N Since L° = 0 for \a\ > N (recall \Xt\ < N and use (11.3)) this proves the result for g and completes the proof of (11.4). D For the next two results let S be a set, let S be a cr-field of subsets of S, and let fj.be a. finite measure on (S, S). For our purposes it would be enough to take S = R, but it is just as easy to treat a general set. The first step in justifying the interchange of the order of integration is to deal with a measurability issue: if we have a family H?(w) of integrands for a 6 S then we can define the integrals fQ H° dXs to be a measurable function of (a, t,w). (11.5) Lemma. Let X be a continuous semimartingale with Xq = 0 and let H?(w) = H(a,t,u) be bounded and S x II measurable. Then there is a
86 Chapter 2 Stochastic Integration Z(a,t,w) G S x II such that for /^-almost every a Z(a,t,u) is a continuous version of fQ H° dXs. Proof Let Xt = Mt + At be the decomposition of the semimartingale. By stopping we can suppose that \A\t < N, \Mt\ < N and (X)t = (M)t < N for alii. Let Ti be the collection of bounded H G S x II for which the conclusion holds. We will check the assumptions of the Monotone Class Theorem, (2.3) in Chapter 1. Clearly 7i is a vector space. To check (i) of the MCT suppose H(a,t,u) = f(a)K(t,u) where K(t,ui) G 611 and /(a) G bS. In this case f Hf dX, = f(a) f K, dX, Jo Jo so fK G Ti. Taking / and K to be indicator functions of sets in S and II respectively, we have checked (i) of the MCT with A = {Ax B : A & S, B & II}. To check (ii) of the MCT let 0 < Hn G V. with Hn | H and H bounded. The I/2 maximal inequality for martingales and the isometry property of the stochastic integral imply E Tsup \{Hn'a ■ M)t - (Ha ■ M)i|M < 4sup E\(Hn<" ■ M)t - (Ha ■ M)t\2 = 4E f{H?>a - Hasf d(M), — 0 By passing to a subsequence we have for fj. almost every a sup |(#"*'" • M)t - (Ha ■ M)t\ — 0 t To deal with the bounded variation part we note that £sup \(Hn'a ■ A)t - (Ha ■ A)t\ <E f° \H?'a - H?\ d\A\t -+ 0 t Jo as n —* oo by the bounded convergence theorem. Combining the last two results completes the proof. D (11.6) Fubiiii's Theorem. Let X be a continuous semimartingale, let Hf G b(S x II), and let Z" G S x II be a continuous version of /0 Hf dX, for /^-almost every a. Then Yt = /s Z$ fi(da) is a version of H ■ X where Ht = fs H° fi(da). Less formally, f f H?dXsti(da)= f f H?fi(da)dXs Js Jo Jo Js
Section 2.11 The Meyer-Tanaka Formula, Local Time 87 Proof Let Xt = M% + At be the decomposition of the semimartingale. By stopping we can suppose that \A\t < N, \Mt\ < N and (X)t = (M)t < N for all t. As noted in the proof of (11.5), the result in the special case X = A is just Fubini's theorem from measure theory, so we will suppose for the rest of the proof that X = M. Let 7i be the collection of bounded H G S x II for which the conclusion holds. Again, we will check the assumptions of the Monotone Class Theorem, (2.3) in Chapter 1. Clearly V. is a vector space. To check (i) of the MCT, suppose H(a,t,u) = f(a)K(t,u) where K(t,w) G 611 and f(a) G 65. In this case Z? = f(a)(K ■ X)t so J Z* ii{da) = (K ■ X)t J f(a) ^(da) = ({Jsf(a)n(da)K}.X)t = (H-X)t so fK G V.. Taking / and K to be indicator functions of sets in S and II respectively, we have checked (i) of the MCT with A = {A x B : A G S, B £ II}. To check (ii) of the MCT let 0 < H" G V. with Hn \ H and H bounded. Letting ||p|| be the total mass of fj. applying Jensen's inequality to the probability measure M0/IMI. then multiplying each side by \\fi\\ we have ^(^supl^-^l^))2 <E ( sup \Z?-a-Z?\2 n(da) Js t Using Fubini's theorem, the Ir maximal inequality, and the isometry property the right-hand side = [ Esup\Z?-a-Z?\2fi(da) Js t < 4 J sup£'|^n'a - Z?\2n(da) Js t = \ (H?-a-H?)2d(X)s)n(da)^0 by using the bounded convergence theorem three times. Putting the absolute values inside the integral we have \ 2 -//Of Dunded convert Le integral we r E(sup\ f Z?-afM(da) - f Z?fi(da) < ^( f sup \Z^a - Z?\n(da)y -+ 0
88 Chapter 2 Stochastic Integration Taking H" = /s H"'" fi(do) and passing to a subsequence nt we have that with probability one (H"k -X)t = Js Z"k,a n(da) converges uniformly to fs Z? y.(da). The last detail is to check that Hn ■ X converges to H ■ X in M.2, for then it follows that fs Zf n(da) = (H ■ X)t. In view of the isometry property, we can complete the last detail by showing that \\Hn — H\\x —» 0 but this is routine. ^£f (//r.>w-/s™)V), <E I" I (»,"■■ - Htf Ada) d(X), -. 0 ^0 Js by using the bounded convergence theorem three times. D We can now complete the identification of L\ as the local time at a. (11.7) Theorem. Let X be a continuous semimartingale with local time L°. If g is a bounded Borel measurable function then /OO j-t Latg{a)da= / g{Xs)d(X), -oo JO Proof- Suppose first that g is continuous and let / G C2 with /" = g. In this case comparing (11.4) with Ito's formula proves the identity in question. Since the identity holds for any continuous function, using the Monotone Class Theorem proves the result for a bounded measurable g. Q When X is a Brownian motion the identity in (11.7) becomes /OO ft Latg{a)da= / g(X,)ds -oo JO From this, we see that L° can be thought of as the time spent at a up to time t, or to be precise it is a density function for the occupation time measure Vt(A) = S*lA(Xt)ds. There are many ways of defining local time for Brownian motion. The one we have given above is famous for making it easy to prove that there is a version in which L\ is a jointly continuous function of a and t. It is somewhat remarkable that this property is true for a continuous local martingale and it is not any more difficult to prove the result in this generality. (11.8) Theorem. Let X be a continuous local martingale. There is a version of the L° in which (a, t) —» L% is continuous.
Section 2.11 The Meyer-Tanaka Formula, Local Time 89 Example 11.1. (11.8) is not true for Xt = \Bt\, which is a semimartingale by (11.1). Introducing subscripts to indicate the process we are referring to, we clearly have Lx{i) = LaB(t) + Lga{i) for a > 0 and Lax(t) = 0 for a < 0. Since L%(t) =£ 0 it follows that a —* Lx(t) is discontinuous at a = 0. One can complain that this example is trivial since 0 is an end point of the set of possible values. A related example with 0 in the middle is provided by skew Brownian motion. Start with \Bt\ and flip coins with probability p > 1/2 of + and 1 — p of — to assign signs to excursions of Bt, i.e., maximal time intervals (a, 6) on which |i?<| > 0. If we call the resulting process Yt then L$(t) -* 2pL%(t) as a | 0 and L\r{i) -* 2(1 - p)L°B(t) as a | 0. For details see Walsh (1978). Proof As usual, by stopping we can suppose that \Xt\ < N and (X)t < N for all t. By (11.2) \h\ = (Xt - a)+ - (X0 - a)+ - j£ l{x,>a} dXs It is clear that (a, t) —* (Xt — a)+ — (X0 — a)+ is continuous, so we only have to investigate the joint continuity of the stochastic integral JO l{x,>a} dXs Fix a time T < oo and regard a —* I" as a mapping from R to C([0, T], R), the real valued functions continuous on [0,T], which we equip with the sup norm 11/11 — sup0<1<T |/(s)|. In view of Kolmogorov's continuity theorem ((1.6) in Chapter 1) it suffices to show that for some a, /? > 0 we have E\\Ia - Ib\f < C\a - b\1+a Suppose without loss of generality that a < b. One of the Burkholder Davis Gundy inequalities ((5.1) in Chapter 3) implies that 2 E\\I? - J6||4 < CE (£ l{a<x,<6} d(X), To bound the right-hand side we note that using (11.7) and then the Cauchy- Schwarz inequality and Fubini's theorem: = e( f I%dx\ <(b-a)E J (L^fdx rb = (b-a) E(L^)2dx<{b-a)2 sup E(L^)2 Ja »6(a,6)
90 Chapter 2 Stochastic Integration To bound the sup, we recall the definition of if.. LXT = \XT - x\ - \XQ - x\ - [ sgn(X, - x) dX, Jo which with the trivial inequalities (a+6)2 < 2a2+2&2 and \y— x\— \z—x\ < \y—z\ implies that E{LxTf < 2E(XT - X0)2 + IE I f sign(X, - x) dX, J < 2(2iV)2 + 2E({X)T) < 8(N2 + N) by the isometry property and the bounds we have imposed on \Xi\ and (X)i by stopping. This gives us what we need to check the inequality in Kolmogorov's continuity theorem and completes the proof. D Remark. The proof above almost works for semimartingales Xt = Mt + At. If we define I? = fQ l{x,>a] dM, and J? = fQ l{x,>a} dA, then the argument above shows that (a, t) —» 1° is continuous, so we will get the desired conclusion if we adopt assumptions that imply (a, t) —» J° is continuous. 2.12. Girsanov's Formula In this section, we will show that the collection of semimartingales and the definition of the stochastic integral are not affected by a locally equivalent change of measure. For concreteness, we will work on the canonical probability space (C,C), with T% the filtration generated by the coordinate maps Xt(u) = ut. Two measures Q and P defined on a filtration Tt are said to be locally equivalent if for each t their restrictions to Tty Qt and Pt are equivalent, i.e., mutually absolutely continuous. In this case we let at = dQt/dPt. The reasons for our interest in this quantity will become clear as the story unfolds. (12.1) Lemma. Yt is a (local) martingale/Q if and only if cctYt is a (local) martingale/P. Proof The parentheses are meant to indicate that the statement is true if the two locals are removed. Let Yt be a martingale/Q, s < t and A £ Ts. Now if Z £ Tu then (i) / Z dP = f Z dPu and (ii) fZatdPt = f Z dQt. So using (i), (ii), the fact that Y is a martingale/Q, (ii), and (i) we have / aiYtdP = f atYtdPt = [ YtdQt J A J A J A = [ Y,dQ, = f a.Y.dP. = [ a,YsdP J A JA J A
Section 2.12 Girsanov's Formula. 91 This shows atYt is a martingale/P. If Y is a local martingale/Q then there is a sequence of stopping times Tn j oo so that ltAT„ is a martingale/Q and hence atYtATn is a martingale/P. The optional stopping theorem implies that atATnYtATa is a martingale/P and it follows that atYt is a local martingale/P. To prove the converse, observe that (a) interchanging the roles of P and Q and applying the last result shows that if /¾ = dPt/dQt, and Zt is a (local) martingale/P, then /3tZt is a (local) martingale/Q and (b) /¾ = a^1, so letting Zt — atYt we have the desired result. D Since 1 is a martingale/Q we have (12.2) Corollary, a* = dQt/dPt is a martingale/P. There is a converse of (12.2) which will be useful in constructing examples. (12.3) Lemma. Given &t a nonnegative martingale/P there is a unique locally equivalent probability measure Q so that dQt/dPt = a*. Proof The last equation in the lemma defines the restriction of Q to Tt for any t. To see that this defines a unique measure on C, let f i < tn ... < tn, Ai be Borel subsets of Itd, and let B = {w : w(i,) G A{ for 1 < i < n}. Define the finite dimensional distributions of a measure Q by setting Q(B) = J atdP whenever t > tn- The martingale property of at implies that the finite dimensional distributions are consistent so we have defined a unique measure on (C,C). a We are ready to prove the main result of the section. (12.4) Girsanov's formula. If A' is a local martingale/P and we let At = fQ a~1d(a,X)s, then Xt — At is a local martingale/Q. Proof Although the formula for A looks a little strange, it is easy to see that it must be the right answer. If we suppose that A is locally of b.v. and has AQ = 0, then integrating by parts, i.e., using (10.1), and noting {a,A)t = 0 since A has bounded variation gives (12.5) ^(^-,40-^0^0= f (Xs-As)da,+ f a,dX,- f a^A^^X^ Jo Jo Jo
92 Chapter 2 Stochastic Integration At this point, we need the assumption made in Section 2.2 that our filtration only admits continuous martingales, so there is a continuous version of a to which we can apply our integration-by-parts formula. Since as and Xs are local martingales/P, the first two terms on the right in (12.5) are local martingales/P. In view of (12.1), if we want Xt — At to be a local martingale/Q, we need to choose A so that the sum of the third and fourth terms = 0, that is, / asdA, = (a,X)t Jo Prom the last equation and the associative law (9.6), it is clear that it is necessary and sufficient that At= [ a^dfaX), Jo The last detail remaining is to prove that the integral that defines At exists. Let Tn = inf {t : at < n-1}. If t < T„, then fUe^k^f «<„,*>i.<=o Jo . ai Jo by the Kunita-Watanabe inequality. So if T = limn_0o Tn then At is well defined for t <T. The optional stopping theorem implies that EaiATn = Eat, so noting a*ATn = a* on T„ > f we have E{at;Tn<t) = E{aTa;Tn<t) Since at > 0 is continuous and Tn < T, it follows that E(at;T<t) < E(at;T„ <t) = E(aTn;Tn <t) < l/n Letting n —» oo, we see that at = 0 a.s. on {T < t), so Qt(T<t) = E(at;T<t) = 0 But Pt is equivalent to Qt, so 0 = Pt(T <t) = P(T < t). Since t is arbitrary P(T < oo) = 0, and the proof is complete. D (12.4) shows that the collection of semimartingales is not affected by change of measure. Our next goal is to show that if X is a semimartingale/P, Q is locally equivalent to P, and H £ £611, then the integral (H ■ X)t is the same under P and Q. The first step is
Section 2.12 Girsanov's Formula. 93 (12.6) Theorem. The quadratic variation (X)t and hence the covariance (X, Y)i is the same under P and Q. Proof The second conclusion follows from the first and (3.9). To prove the first we recall (8.6) shows that if A„ = {0 = ig < *" ■ ■ ■ < *£ = 0 is a sequence of partitions of [0,i] with mesh |A„| —» 0 then Dx*f+x-x*f)2-<x>* in probability for any semimartiugale/P. Since convergence in probability is not affected by locally equivalent change of measure, the desired conclusion follows. D (12.7) Theorem. If H € ^611 then (H ■ X)t is the same under P and Q. Proof The value is clearly the same for simple integrands. Let M and N be the local martingale parts and A and B be the locally bounded variation parts of X under P and Q respectively. Note that (12.6) implies (M)t = {N)t. Let Tn = mi{t:(M)i,\A\iOT\B\t>n} If H G ^&II and Ht = 0 for t > Tn then by (4.5) we can find simple Hm so that \\Hm-H\\M,\\Hm-H\\N-+Q and ^1^-^1^1 + 151)^0 These conditions allow us to pass to the limit in the equality (Hm ■ (M + A))t = (Hm ■ (N + B))t (the left-hand side being computed under P and the right under Q) to conclude that for any H G £611 the integrals under P and Q agree up to time Tn. Since n is arbitrary and Tn —* oo the proof is complete. D
3 Brownian Motion, II In this chapter we will use Ito's formula to deepen our understanding of Brownian motion or, more generally, continuous local martingales. 3.1. Recurrence and Transience If Bt is a d-dimensional Brownian motion and B\ is the ith component then B\ is a martingale with (B1)t = t. If i ^ j Exercise 2.2 in Chapter 1 tells us that B\B\ is a martingale, so (3.11) in Chapter 2 implies (B',BJ)t = 0. Using this information in Ito's formula we see that if / : Rd —* R is C2 then Writing V/ = (£>i/,..., Ddf) for the gradient of /, and A/ = Ya=\ Daf for the Laplacian of /, we can write the last equation more neatly as (1.1) f{Bt) - /0¾) = J V/(53) -dB, + ^J &f(B,)ds Here, the dot in the first term stands for the inner product of two vectors and the precise meaning of that is given in the previous equation. Functions with A/ = 0 are called harmonic. (1.1) shows that if we compose a harmonic function with Brownian motion the result is a local martingale. The next result (and judicious choices of harmonic functions) is the key to deriving properties of Brownian motion from Ito's formula. (1.2) Theorem. Let G be a bounded open set and r = inf{i : Bt_ ¢. G). If / G C2 and A/ = 0 in G, and / is continuous on the closure of G, G, then for x G G we have f(x) = Exf(Br). Proof Our first step is to prove that Px(t < oo) = 1. Let K = sup{|z — y\ : x,y E G} be the diameter of G. If x G G and \B\ — x\ > K then B\ (fc G and r < 1. Thus Px(r < 1) > P«(|£i -x\> K) = PQ{\BX\ >K) = eK>0
96 Chapter 3 Browniaji Motion, H This shows supx Px(t > k) < (1 — e/c)k holds when k = 1. To prove the last result by induction on k we observe that if pt(x,y) is the transition probability for Brownian motion, the Markov property implies P*(t > k) < [ Pl(x,y)Py(T >k-l)dy Jg < (1 - eKf-1Px(B1 G G) < (1 - eKf where the last two inequalities follow from the induction assumption and the fact that \Bi — x\ > K implies B\ ¢ G. The last result implies that Px(t < oo) = 1 for all x G G and moreover that (1.3) sup ExTp < oo for all 0 < p < oo X To get the last conclusion recall that (see e.g., (5.7) in Chapter 1 of Durrett (1995)) EXTP= / ptp-1Px(T>t)dt Jo When A/ = 0 in G, (1.1) implies that_/(i?t) is a local martingale on [0,r). We have assumed that / is continuous on G and G is bounded, so / is bounded and if we apply the time change j defined in Section 2.2, Xt = /(57(tj) is a bounded martingale (with respect to Qt = ^"7(i))- Being a bounded martingale Xt converges almost surely to a limit X^ which has Xt = Ex(Xoo\Qt) and hence ExXt = ExX^. Since r < oo and / is continuous on G, .X^, = f(BT). Taking t = 0 it follows that f(x) = EXXQ = ExX^ = Exf(Br). D In the rest of this section we will use (1.2) to prove some results concerning the range of Brownian motion {Bt : t > 0}. We start with the one-dimensional case. (1.4) Theorem. Let a < x < b and T = inf{i: Bt £ (a, 6)}. 6 — x x — a PX(BT = a) = —■ PX(BT = b) = —■ b — a b — a Proof f(x) = (6 — x)/(b — a) has /" = 0 in (a, 6), is continuous on [a, 6], and has f(a) = 1, /(6) = 0, so (1.2) implies f(x) = Exf(BT) = PX(BT = a). D Exercise 1.1 Deduce the last result by noting that Bt is a martingale and using the optional stopping theorem at time T. Let Tx = inf{i : Bt = x}. Prom (1.4), it follows immediately that
Section 3.1 Recurrence and Transience 97 (1.5) Theorem. For all x and y, Px(Ty < oo) = 1. Proof Since Px(Ty < oo) = Px-y(To < oo), it suffices to prove the result when y = 0. A little reflection (pun intended) shows we can also suppose x > 0. Now using (1.4) PX(T0 < TMx) = (M - 1)/M, and the right-hand side approaches 1 as M —» oo. D It is trivial to improve (1.5) to conclude that (1.6) Theorem. For any s < oo, Px(Bt = y for some t > s) = 1. Proof By the Markov property, Px(Bt = y for some t > s) = Ex(PB{s)(Ty < oo)) = 1 D The conclusion of (1.6) implies (argue by contradiction) that for any y with probability 1 there is a sequence of times tn f oo (which will depend on the outcome u) so that Bin = y, a conclusion we will hereafter abbreviate as "Bt = y infinitely often" or "Bt = y i.o." In the terminology of the theory of Markov processes, what we have shown is that one-dimensional Brownian motion is recurrent. Exercise 1.2 Use (1.6) to conclude limsupi?i = oo liminf Bt = — oo In order to study Brownian motion in d > 2, we need to find some appropriate harmonic functions. In view of the spherical symmetry of Brownian motion, an obvious way to do this is to let <p(x) = /(|z|2) and try to pick / : R —» R so that A<p = 0. We use \x\2 = x\ + ■ ■ ■ x\ rather than \x\ since it is easier to differentiate: A-/(M2) = /'(M2)2*< Atf(M2) = r(M2)4s2+ 2/^12) Therefore, for A<p = 0 we need 0 = ^/^)4^ + 2/^12)} = 4|x|2/"(|x|2) + 2d/'(|a:|2)
98 Chapter 3 Brownian Motion, II Letting y = \x\2, we can write the above as 4y/"(y) + 2df'(y) = 0 or, if y > 0, Taking f'(y) = Cy~d'2 guarantees Aip = 0 for x ^ 0, so by choosing C appropriately we can let ..,_.>,_ flog\x| d=2 We are now ready to use (1.2) in d > 2. Let Sr = inf{i : \Bt\ = r} and r < R. Since <p has A<p = 0 in G = {z : r < \x\ < R}, and is continuous on G, (1.2) implies <p(x) = EM^r) = <f(r)P*(Sr < SR) + <p(R)(l - Px(Sr < SR)) where <p(r) is short for the value of <p(x) on {x : \x\ = r}. Solving now gives <"> '■<*<*) = ^^ In c/ = 2, the last formula says If we fix r and let -R —» oo in (1.8), the right-hand side goes to 1. So Px{Sr < oo) = 1 for any x and any r > 0 and repeating the proof of (1.6) shows that (1.9) Theorem. Two-dimensional Brownian motion is recurrent in the sense that if G is any open set, then Px(Bt G G i.o.) = 1. If we fix R, let r -♦ 0 in (1.8), and let SQ = inf{i > 0 : Bt = 0}, then for x^O Px(So < SR) < lim Px(Sr <SR) = 0 Since this holds for all R and since the continuity of Brownian paths implies Sr t oo as R | oo, we have Px(So < oo) = 0 for all x £ 0. To extend the last result to x = 0 we note that the Markov property implies Po(Bi = 0 for some t>e) = Eo[PB<(T0 < oo)] = 0
Section 3.1 Recurrence and Transience 99 for all e > 0, so Po(Bt = 0 for some t > 0) = 0, and thanks to our definition of S0 = inf{i > 0 : Bt = 0}, we have (1.10) PX(S0 < oo) = 0 for all x Thus, in d > 2 Brownian motion will not hit 0 at a positive time even if it starts there. Exercise 1.3 Use the continuity of the Brownian path and Px(So = oo) = 1 to conclude that if x ^ 0 then Px(Sr tooasr|0)=l. For d > 3, formula (1.7) says J?2-d _ l-12-d a-11) Msr<sR) = KR2_Jrid There is no point in fixing R and letting r —* 0, here. The fact that two dimensional Brownian motion does not hit points implies that three dimensional Brownian motion does not hit points and indeed will not hit the line {x : x\ = z2 = 0}. If we fix r and let R —* oo in (1.11) we get (1.12) Px(Sr < oo) = (r/|z|)d-2 < 1 if |x| > r From the last result it follows easily that for d > 3, Brownian motion is transient, i.e. it does not return infinitely often to any bounded set. (1.13) Theorem. As t —* oo, \Bt\ —» oo a.s. Proof Let An = {\Bt\ > n1!2 for all t > Sn} and note that Sn < oo by (1.3). The strong Markov property implies P*iK) = Ex(PB(Sn)(Snl/, < oo)) = (n^/nY-2 - 0 as n —» oo. Now limsup An = n^=1 U^L^ An has P(limsup An) > limsup P(An) = 1 So infinitely often the Brownian path never returns to {x : \x\ < n1'2} after time Sn and this implies the desired result. D Dvoretsky and Erdos (1951) have proved the following result about how fast Brownian motion goes to oo in d > 3.
100 Chapter 3 Brownian Motion, II (1.14) Theorem. Suppose g(t) is positive and decreasing. Then Po(\Bt\ < g(t)y/i i.o. as t | oo) = 1 or 0 according as f°° g(t)d~2/t dt = oo or < oo. Here the absence of the lower limit implies that we are only concerned with the behavior of the integral "near oo." A little calculus shows that /oo i_1 log-" t dt = oo or < oo according as a < 1 or a > 1, so Bt goes to oo faster than ^/(\ogt)a^d~2 for any a > 1. Note that in view of the Brownian scaling relationship Bt =d t^^Bx we could not sensibly expect escape at a faster rate than y/i. The last result shows that the escape rate is not much slower. Review. At this point, we have derived the basic facts about the recurrence and transience of Brownian motion. What we have found is that (i) Px(\Bt\ < 1 for some t > 0) = 1 if and only if d < 2 (ii) Px(Bt = 0 for some t > 0) = 0 in d > 2. The reader should observe that these facts can be traced to properties of what we have called <p, the (unique up to linear transformations) spherically symmetric function that has A<p(x) = 0 for all x £ 0, that is : (\x\ d=l ip{x)= l\og\x\ d = 2 [\x\2-d d>3 and the features relevant for (i) and (ii) above are (i) <p(x) —* oo as |x| —» oo if and only if d < 2 (ii) |y(z)| —>-ooasz—>-0inc/>2. 3.2. Occupation Times Let D = 5(0, r) = {y : \y\ < r} the ball of radius r centered at 0. In Section 3.1, we learned that Bt will return to D i.o. in d < 2 but not in d > 3. In this
Section 3.2 Occupation Times 101 section we will investigate the occupation time fQ lo(Bt) dt and show that for any x (2.1) Px (/0°° 1D (Bt) dt = oo) = 1 in d < 2 (2.2) Ex /0°° lD (Bt) dt < oo in d > 3 Proof of (2.1) Let TQ = 0 and G = B(0,2r). For k > 1, let S^in^^Tfc.! :Bt£D) Tk = inf{i >Si:5,eG} Writing r for Ti and using the strong Markov property, we get for k > 1 PAjk lo{Bt)dt>s TSk\= PB{sk) (f lD{Bt)dt>s)= H(s) From this and (4.5) in Chapter 1 it follows that rTk f "1d(B Jsk t)dt are i.i.d. Since these random variables have positive mean it follows from the strong law of large numbers that '" s " = oo a.s. J/-°° ■• /.jfc 1/3(Bt) dt > Jim J2 lD(Bt) dt = proving the desired result. D Proof of (2.2) If / is a nonnegative function, then Fubini's theorem implies ^0° y»0O ^OO p Ex / f(Bt) dt = / Exf(Bt) dt= / Pt(x, y)f(y) dy dt Jo Jo Jo J = // Pi(x>y)dtf(y)dy where pt(x,y) = (27rf)-d/2e-lx-!/l l2t is the transition density for Brownian motion. As t —* oo, pt(x,y) ~ (2wt)~dl'z, so if d < 2 then fpt(x,y)dt = oo.
102 Chapter 3 Brownian Motion, II When d > 3, changing variables t = \x — y\2/2s gives JfOO *0O -I I *(*.»)* = / (5SF75«H-|,,2'<« - f° ( s \d'2 -. ( \x~y\2\ j r2.3) ~L\*\*-v\3) e \ ** )ds Jo = tzy\2~d r°° 2 -^-^l*-y| 27rd/2 *■ (.2 ~ •*•) l_ ,,|2-d 27rd/2 where T(a) = /0 s"-^-*^ is the usual gamma function. If we define JfOO pt(x, y) dt o then in d > 3, G(x, y) < oo for x ^ y, and (2.4) E* J°° f{Bt) dt = J G(x, y)f(y) dy To complete the proof of (2.2) now we observe that taking f = 1D with D = 5(0, r) and changing to polar coordinates f G(0, y) dy = f 3d-1 Cds*-d ds = ^r2 < Jd Jo * oo To extend the last conclusion to x ^ 0 observe that applying the strong Markov property at the exit time r from 5(0, \x\) for a Brownian motion starting at 0 and using the rotational symmetry we have JfOO *00 yOO ' lD(Bs)ds = E0 lD{B,)ds<Eo lD(B,)da D 0 Jr Jo We call G(x,y) the potential kernel, because G(-,y) is the electrostatic potential of a unit charge at y. See Chapter 3 of Port and Stone (1978) for more on this. In d < 2, fQ pt(x, y) dt = oo so we have to take another approach to define a useful G: G(x,y) = / (j>t(x,y)-at) Jo dt
Section 3.2 Occupation Times 103 where the at are constants we will choose to make the integral converge (at least when x ^ y). To see why this modified definition might be useful note that if f f(y) dy = 0 then (assuming we can use Pubini's theorem) J G(x, y)f(y) dy= J Exf(Bt) dt When d = 1, we let at = Pt(0,0). With this choice, G(x,y)=-J=jQ (e-(y-*?/"-l)t-^dt and the integral converges, since the integrand is < 0 and (y— z)2/2f3/2 as t —* oo. Changing variables t = (y — x)2/2u gives 1/2 (2.5) ^ h Js 2 V71" ./o since Jo Jo 2 The computation is almost the same for d = 2. The only thing that changes is the choice of at. If we try at = p<(0,0) again, then for x ^ y the integrand ~— f-1asf—>-0 and the integral diverges, so we let at = p<(0,ei) where ei = (1,0). With this choice of at, we get 1 f°° ( y1/21 \ , 1 r00 G{x,y) = ±- (e-l-vl2/2< - e"1/21)r^ to Jo = J_ /°° ( f1'2 to 7o \J|x-t/ 1 f°° , , /"i/;" i , = — / dse~' I t~ldt 2*" Vo J|x-y|2/2j = 27 (/°° ds e~0 (~l0g(|a: " y|2)) = Vl0g(|a:" y|) (2.6) - v-"a/* V ' ,00 rl/2s
104 Chapter 3 Brownian Motion, II To sum up, the potential kernels are given by ( (T(d/2 - 1)/2^12) ■ \x - y|2-d d > 3 (2.7) G(x, y) = \ (-1/^) - logflz - y\) d=2 {-l.\x-y\ d=l The reader should note that in each case G(x,y) = C<p(\x — y\) where <p is the harmonic function we used in Section 3.1. This is, of course, no accident. x —* G(x,0) is obviously spherically symmetric and, as we will see, satisfies AG(z, 0) = 0 for x ^ 0, so the results above imply G(x, 0) = A + B<p(\x\). The formulas above correspond to A = 0, which is nice and simple. But what about the weird looking B's? What is special about them? The answer is simple: They are chosen to make |AG(x, 0) = — <50 (a point mass at 0) in the distributional sense. It is easy to see that this happens in d = 1. In that case <p(x) = \x\ then lf , f 1 x > 0 so if B = — 1 then B<p"(x) = —260. More sophisticated readers can check that this is also true in d > 2. (See F. John (1982), pages 96-97.) To explain why we want to define the potential kernels, we return to Brow- nian motion in the half space, first considered in Section 1.4. Let H = {y : yd > 0}, r = inf{i : Bt(£H}, and let y= (yi,---,yd-i,-yd) be the reflection of y through the plane {y £ Rd : yd = 0} (2.8) Theorem. If x € H, f > 0 has compact support, and {z : f(x) > 0} C H then Ex (j'fiB^dt} = JG(x,y)f(y)dy- JG(x,y)f(y)dy Proof Using Fubini's theorem which is justified since / > 0, then using (4.9) from Chapter 1 and the fact that {x : f(x) > 0} C H we have Ex f f{Bt) dt = [°° Ex{f{Bt); r>t)dt Jo Jo = / {Pt(x,y) -pt{x,y))f(y)dydt Jo Jh = / (pt{x,y)-at)f(y)dydt Jo Jh - / (pt(x,y)-at)f(y)dydt Jo Jh = j G{x,y)f{y)dy- JG(x,y)f(y) dy
Section 3.3 Exit Times 105 The compact support of/ and the formulas for G imply that f \G(x,y)f(y)\dy and f\G(x,y)f(y)\dy are finite, so the last two equalities are valid. D The proof given above simplifies considerably in the case d > 3; however, part of the point of the proof above is that, with the definition we have chosen for G in the recurrent case, the formulas and proofs can be the same for all d. Let Gff(x,y) = G(x,y) — G(x,y). We think of Ga{x,y) as the "expected occupation time (density) at y for a Brownian motion starting at x and killed when it leaves H." The rationale for this interpretation is that (for suitable /) Ex (JqT f(Bt)dt\ = J GH(x, y)f(y) dy With this interpretation for G# introduced, we invite the reader to pause for a minute and imagine what y —» Gff(x,y) looks like in one dimension. If you don't already know the answer, you will probably not guess the behavior as y —» oo. So much for small talk. The computation is easier than guessing the answer: G(x,y) = -\x-y\ so Gff(x,y) = — \x — y\ + \x + y\. Separating things into cases, we see that r , x _ f ~(x -y) + (x + y) = 2y when 0 < y < x LrH^X,y)~\-(y-x) + (x + y) = 2x when x < y so we can write (2.9) GH(x,y) = 2(xAy) foralla;,y>0 It is somewhat surprising that y —* Gj{(x,y) is constant = 2a; for y > x, that is, all points y > x have the same expected occupation time! 3.3. Exit Times In this section we investigate the moments of the exit times r = inf {t : Bt ¢ G] for various open sets. We begin with G = {x : \x\ < r} in which case r = Sr in the notation of Section 3.1. (3.1) Theorem. If |x| < r then ExSr = (r2 - \x\2)/d Proof The key is the observation that \Bi\2-dt = ^{(B>)2-t}
106 Chapter 3 Brownian Motion, II being the sum of d martingales is a martingale. Using the optional stopping theorem at the bounded stopping time Sr A t we have \x\2 = Ex{\BSrM\2-(SrAt)d} (1.3) tells us that ExSr < oo, and we have |-BsrA*|2 < r2, so letting t —* oo and using the dominated convergence theorem gives \x\2 = Ex (r2 — Srd) which implies the desired result. D Exercise 3.1. Let a, b > 0 and T = inf{i : Bt <£ (-a, 6)}. Show E0T = ab. To get more formulas like (3.1) we need more martingales. Applying Ito's formula, (10.2) in Chapter 2, with X] = Xt, a continuous local martingale, and X2 = (X)t we obtain f(Xu (X)t) - /(Xo.O) = f D.fiX,, (X),)dX, Jo (3.2) + f D2f(X„(X)s)d(X)s Jo + \J Dnf(X„(X)s)d(X)s From (3.2) we see that if (|Dn + D2)f = 0, then f(Xt, (X)t) is a local martingale. Examples of such functions are f(x,y) = x, x2- y, x3 - 3zy, z4 - 6x2y + 3y2... or to expose the pattern fn(x,y)= J2 Cn,mXn-2mym 0<m<[n/2] where [n/2] denotes the largest integer < n/2, c„i0 = 1 and for 0 < m < [n/2] we pick 7:cn,m(n - 2m)(n - 2m - 1) = -(m + l)c„|m+1 so that Dxx/2 of the mth term is cancelled by Dy of the (m + l)th. (Dxx of the [n/2]th term is 0.) The first two of our functions give us nothing new (Xt and X2 — (X)t are local martingales), but after that we get some new local martingales: Xf-3Xt(X)u X?-6X2(X)t + 3(X)2, ...
Section 3.3 Exit Times 107 These local martingales are useful for computing expectations for one dimensional Brownian motion. (3.3) Theorem. Let ra = inf{i : \Bt\ > a}. Then (i) E0ra = a2, (ii) E0r2 = 5a4/3. The dependence of the moments on a is easy to explain: the Brownian scaling relationship Bct =d cxl2Bt implies that ra/a2 =d t\. Proof (i) follows from (3.1), so we will only prove (ii). To do this let Xt = B? — 6B?t + 3f2 and Tn <nbe stopping times so that Tn f oo and XiAxn 1S a martingale. Since Tn < n the optional stopping theorem implies 0 = £o{£?aATn - 65MTn (ra A Tn) + 3(ra A T„)2} Now |-BTaATn| < a, so using (1.3) and the dominated convergence theorem we can let n —* oo to conclude 0 = a4 — 6a2J5ora + 3^0^. Using (i) and rearranging gives (ii). D Exercise 3.2 Find a, b, c so that Bf—aBft+bB?t2—ct3 is a (local) martingale and use this to compute Eqt%. Our next result is a special case of the Burkholder Davis Gundy inequalities, (5.1), but is needed to prove (4.4) which is a key step in our proof of (5.1). (3.4) Theorem. If Xt is a continuous local martingale with Xq = 0 then EUupXf) ^381^(^)^ Proof First suppose that \Xt\ and (X)t are < M for all t. In this case (2.5) in Chapter 2 implies X? — 6Xf(X)t + 3(X)2 is a martingale, so its expectation is 0. Rearranging and using the Cauchy-Schwarz inequality EX? + 3E(X)2 = 6S(X2(X)t) < Q(EX*)l'2{E(X)2)ll2 Using the L4 maximal inequality ((4.3) in Chapter 4 of Durrett (1995)) and the fact that (4/3)4 < 3.1605 < 19/6 we have / \ 1/2 SsupX? < (4/3)4^4 < 19 EsupX? ) (E(X)2)ll2 3<t \ 3<i J
108 Chapter 3 Brownian Motion, II Since \X, \ < M for all s we can divide each side by E(sups<i Xf)1/2 then square to get E (supxA < 381(^(^)2)^2 The last inequality holds for a general martingale if we replace t by Tn = inf {t : t,\Xt\, or (X)t > n}. Using that conclusion, letting n —* oo and using the monotone convergence theorem, we have the desired result. D If we notice that f(x, y) = exp(x — y/2) satisfies (^JDn + D2)f = 0, then we get another useful result. (3.5) The Exponential Local Martingale. If X is a continuous local martingale, then £(X)t = exp(Xt — ^(X)t) is a local martingale. If we let Yt = exp(Xt - f (X)t), then (3.2) says that (3.6) Yt-YQ= I Y,dX, Jo or, in stochastic differential notation, that dYt = YtdX,. This property gives Yt the right to be called the martingale exponential of Yt. As in the case of the ordinary differential equation f'(t) = f(t)a(t) /(0) = 1 which for a given continuous function a(t) has unique solution /(f) = exp(-At), where At = JQ a(s) ds. It is possible to prove (under suitable assumptions) that Z is the only solution of (3.6). See Doleans-Dade (1970) for details. The exponential local martingale will play an important role in Section 5.3. Then (and now) it will be useful to know when the exponential local martingale is a martingale. The next result is not very sophisticated (see (1.14) and (1.15) in Chapter VIII of Revuz and Yor (1991) for better results) but is enough for our purposes. (3.7) Theorem. Suppose Xt is a continuous local martingale with (X)t < Mt and Xq = 0. Then Yt = exp(Xt — ^(X)t) is a martingale. Proof Let Zt = exp(2Xt — ^(2X)t), which is a local martingale by (3.5). Now, Y2 = exp(2Xi - (X)t) = Zt exp((X)0
Section 3.3 Exit Times 109 So if Tn is a sequence of times that reduces Zt, the L2 maximal inequality applied to Y1ATn gives E (sup Y32ATn) < 4£Y<2ATn < 4eMtE{Zu<rn) = AeMt Letting n | oo and using the monotone convergence theorem we have 4em > E (sup yA > (^ sup \Y,\\ by Jensen's inequality. Using (2.5) in Chapter 2 now, we see that Yt is a. martingale. D Remark. It follows from the last proof that if Xt is a continuous local martingale with X0 = 0 and (X)t < M for all t then Yt = exp(Xt - |(^)i) is a martingale in M.2. Letting 0 £ R and setting Xt = QB% in (3.6), where B% is a one dimensional Brownian motion, gives us a family of martingales exp(9Bt — d2t/2). These martingales are useful for computing the distribution of hitting times associated with Brownian motion. (3.8) Theorem. Let Ta = inf{i : Bt = a}. Then for a > 0 and A > 0 E0exp(-\Ta) = e-aV2X Remark. If you are good at inverting Laplace transforms, you can use (3.8) to prove (4.1) in Chapter 1: Po(Ta <t)= f (27rs3)1/2ae_a2/2J ds Jo Proof P0(Ta < oo) = 1 by (1.5). Let Xt = exp(^5t - 6H/2) and S„ < n be stopping times so that Sn | oo and XiAsn is a martingale. Since Sn < n, the optional stopping theorem implies 1 = E0 exT>(eBTaASa - d2(Ta A S„)/2) If 6 > 0 the right-hand side is < exp(^a) so letting n —» oo and using the bounded convergence theorem we have 1 = £'oexp(^a — d2Ta/2). Taking 6 = V2A now gives the desired result. D
110 Chapter 3 Browni&n Motion, II (3.9) Theorem. Let ra = inf{i : \Bt\ > a}. Then for a > 0 and A > 0, £0exp(-Ara) = 2e~aV2T/(l + e~2a^) Proof Let V'a(A) = E0 exp(—ATa). Applying the strong Markov property at time ra (and dropping the subscript a to make the formula easier to typeset) gives EQ exp(-ATa) = £'o(exp(-Ar);JBr = a) + S0(exp(-Ar)T/>2a(A);JBr = -a) Symmetry dictates that (r, BT) =d (r, —BT). Since BT £ {—a, a} it follows that r and BT- are independent, and we have V-a(A) = ^(1 + tMA))£0 exp(-Ar) Using the expression for V'a(A) given in (3.8) now gives the desired result. D Another consequence of (3.8) is a bit of calculus that will come in handy in Section 7.2. (3.10) Theorem. If B > 0 then Proof Changing variables t = l/2s, dt = —ds/2s2 the integral above 2s 2 _B/2s ds_ 5F 2*2 = / Jo V26 Jo y/2ws3 Using (3.8) now with a = Vd and A = z2 and consulting the remark after (3.8) for the density function of Ta we see that the last expression is equal to the right-hand side of (3.10). D The exponential martingale can also be used to study a one dimensional Brownian motion Bt plus drift. Let Zt = aBt + fit where a > 0 and fj. is real. Xt = Zt — fit is a martingale with (X)t = oH so (3.7) implies that exp(0(Zt - fit) - e2cr2t/2)
Section 3.4 Change of Time, Levy's Theorem 111 is a martingale. If 6 = -2/*/cr2 then -0/x - 02cr2/2 = 0 and exp(-(2/x/<r2)Zt) is a local martingale. Repeating the proof of (3.8) one gets Exercise 3.3. Let T_a = inf{i : Zt = -a}. If a, \i > 0 then P0(71a < oo) = exp(-2a/*/(T2) 3.4. Change of Time, Levy's Theorem In this section we will prove Levy's characterization of Brownian motion (4.1) and use it to show that every continuous local martingale is a time change of Brownian motion. (4.1) Theorem. If Xt is a continuous local martingale with X0 = 0 and (X)t = t, then Xt is a one dimensional Brownian motion. Proof By (4.5) in Chapter 1 it suffices to show (4.2) Lemma. For any s and t, Xs+t — Xs is independent of Ts and has a normal distribution with mean 0 and variance t. Proof of (4.2) Applying the complex version of Ito's formula, (7.9) in Chapter 2, to X'r = Xs+r - X, and f(x) = eiBx, we get e"Xl -l = ie f eiBX* dX'u--, f eiBX* du Jo 2 J0 Let T'T = Ts+T and let A € T, = T'Q. The first term on the right, which we will call Yt, is a local martingale with respect to T[. To get rid of that term let Tn t oo be a sequence of stopping times that reduces Yt, replace t by t A Tn, and integrate over A. The definition of conditional expectation implies E(YiATa;A) = E(E(YtATJ^);A) = E(YQ;A) = 0 since Yq = 0. So we have E(eitx"*»; A) - P(A) = 0 - —£ / ei8X« du; A
112 Chapter 3 Brownian Motion, II Since |e,8x| = 1, letting n —* oo and using the bounded convergence theorem gives E(eiBX>;A)-P(A) = -jE([ eiBX* du;A\ = ~T JE (e<BK>A)du by Fubini's theorem. (The integrand is bounded and the two measures are finite.) Writing j(t) = E(eiBX';A), the last equality says j(t)-P(A) = -j£j(u)du Since we know that \j(s)\ < 1, it follows that \j(t) - j(u)\ < \t - u\02/2, so j is continuous and we can differentiate the last equation to conclude j is differentiable with -92 2 02, Together with j(0) = P(A), this shows that j(t) = P(A)e~B i'2, or E(eiBX>;A)= [ e'^'2 dP Since this holds for all A E^o'it follows that (4.3) E{eiBX'^) = e-BHl2 or in words, the conditional characteristic function of X't is that of the normal distribution with mean 0 and variance t. To get from this to (4.2) we first take expected values of both sides to conclude that X[ has a normal distribution with mean 0 and variance t. The fact that the conditional characteristic function is a constant suggests that Xi is independent of T'Q. To turn this intuition into a proof let g be a C1 function with compact support, and let <p(6)= [eiBxg(x)dx be its Fourier transform. We have assumed more than enough to conclude that <p is integrable and hence 9{x)=^jeiB*<p{-6)dx
Section 3.4 Change of Time, Levy's Theorem 113 Multiplying each side of (4.3) by ip(—9) and integrating we see that E{g{X't)\J:'Q) is a constant and hence E(g(Xt)\Po) = Eg{X>) A monotone class argument now shows that the last conclusion is true for any bounded measurable g. Taking g = 1b and integrating the last equality over A € ^¾ we have P(A)P(X[ G B) = [ E{lB{X't)\T'Q) dP = P{X't e B) Ja by the definition of conditional expectation, and we have proved the desired independence. D Exercise 4.1. Suppose X\, 1 < i < d are continuous local martingales with XQ = 0 and (X'fX')* = (* iff = i 10 otherwise then Xt = (Xl,..., Xf) is a d-dimensional Brownian motion. An immediate consequence of (4.1) is: (4.4) Theorem. Every continuous local martingale with X0 = 0 and having (^)00 = 00 is a time change of Brownian motion. To be precise if we let y(u) = inf{t : (X)t > u) then Bu = X7(u) is a Brownian motion and Xt = -B(x), • Proof Since j{(X)t) = t the second equality is an immediate consequence of the first. To prove that we note Exercise 3.8 of Chapter 2 implies that u —* Bu is continuous, so it suffices to show that Bu and B^ — u are local martingales. (4..5) Lemma. Bu, ^"7(u), u > 0 is a local martingale. Proof of (4.5) Let Tn = inf{i : \Xt\ > n}. The optional stopping theorem implies that if u < v then E(Xf(v)ATn\^f(u)) = ^7(u)AT„ where we have used Exercise 2.1 in Chapter 2 to replace ^"7(u)ATn by ^7(u)- To let n —>• 00 we observe that the L? maximal inequality, the fact that X^v-^Aj,^ — (X)7(t;wTn is a martingale, and the definition of j(v) imply EsupX^v)ATn<4snpEX^v)ATn = 4 sup E{X)^v)ATn < Av
114 Chapter 3 Browniaji Motion, II The last result and the dominated convergence theorem imply that asn-+ oo, X^(f)ATn —* Xy(i) in I/2 for t = u,v. Since conditional expectation is a contraction in L2 it follows that E(X^v^Ta\T^u)) -* E(X^v-)\Jr^uy) in L2 and the proof is complete. D To complete the proof of (4.4) now it remains to show (4.6) Lemma. B2 — u, ^"7(u), u > 0 is a local martingale. Proof of (4.6) As in the proof of (4.5), the optional stopping theorem implies that if u < v then E(Xl(v)ATn ~ (X)f(v)ATn \?-,{u)) = ^7(u)ATn - (X)f(u)ATn To let n —* oo we observe that using (a + 6)2 < 2a2 + 2b2, (3.4), and the definition of j(y) then £sup (x2(u)ATn - P07(u)ATn)2 < 2SsupX74(u)ATn + 2E(X)2{V) <CE(X)2(v)<Cv2 The proof can now be completed as in (4.5) by using the dominated convergence theorem, and the fact that conditional expectation is a contraction in L2. O Our next goal is to extend (4.4) to Xt with P^X)^ < 00) > 0. In this case ^7(u) is a Brownian motion run for an amount of time (^)00. The first step in making this precise is to prove (4.7) Lemma. lim1Too Xt exists almost surely on {(X)^ < 00}. Proof Let T„ = inf{i : (X)t > n}. (3.7) in Chapter 2 implies that (XT»)t = (X)tATn < n Using this with Exercise 4.3 in Chapter 2 we get XiATn £ M2 so lim^oo XiAxn exists almost surely and in L2. The last statement shows limt_oo Xt exists almost surely on {Tn = 00} D {{X)^ < "}• Letting n —» 00 now gives (4.7). D To prove the promised extension of (4.4) now, let j(u) = inf{i : (X)t > u) when u < {X)oo, let .X^, = lim^oo Xt on {(^)oo < 00}, let Bt be a Brownian motion which is independent of {Xt, t > 0}, and let V - f Xi(.") u < (X)°° u~\X00+B{u-(X)00) u>{X)00
Section 4.3 Change of Time, Levy's Theorem 115 (4.8) Theorem. Y is a Brownian motion. Proof By (4.1) it suffices to show that Yu and Y* — u are local martingales with respect to the filtration <r(Yt :t<u). This holds on [0, (^)oo] for reasons indicated in the proof of (4.4). It holds on [(^)oo, oo) because B is a Brownian motion independent of X. D The reason for our interest in (4.8) is that it leads to a converse of (4.7). (4.9) Theorem. The following sets are equal almost surely: C = {limt_oo Xt exists } B = {supt \X%\ < oo} A = {PQoc < oo} B+ = {supt Xt < oo} Proof Clearly, C C B C B+. In Section 3.1 we showed that Brownian motion has limsup1_0o Bt = oo. This and (4.8) implies that A" C -B+, or B+ C A. Finally (4.7) shows that A C C. D The result in (4.8) can be used to justify the assertion we made at the beginning of Sections 2.6 that D^X) is the largest possible class of integrands. Suppose H G II and let T = sup{i : /„* H?d(X)s < oo}. (H -X)t can be defined for t < T and has (H-X)t= [ H*d(X), Jo so on {(H ■ X)t = oo} D {T < oo} we have limsup(if • X)t = oo liminf(if • X)t = — oo ITT tfT and there is no reasonable way to continue to define (H ■ X)t for t>T. Convergence is not the only property of local martingales that can be studied using (4.9). Almost any almost-sure property concerning the Brownian path can be translated into a corresponding result for local martingales. This immediately gives us a number of theorems about the behavior of paths of local martingales. We will state only: (4.10) Law of the Iterated Logarithm. Let L(t) = -^/22 log log f for t > e. Then on {{-^Qoo = oo}, limsupXt/L((X)t) = 1 a.s. t — 03
116 Chapter 3 Brownian Motion, II Proof This follows from (4.9) and the result for Brownian motion proved in Section 7.9 of Durrett (1995). D Finally we have a distributional result that can be derived by time change. Exercise 4.2. Suppose h : [0,oo) —» R is measurable and locally bounded. Use (4.4) to generalize Exercise 6.7 in Chapter 2 and conclude that Xt = / hs dB, is normal with mean 0 and variance / h2 ds Jo Jo 3.5. Burkholder Davis Gundy Inequalities Let Xt be a local martingale with X0 = 0 and let X* = sup3<1 \XS\. This section is devoted to a proof of the following inequalities. (5.1) Theorem. For any 0 < p < oo there are constants 0 < c, C < oo so that cE{xft12 < E(x;y < ce{x)pJ2 Remark. This result should be contrasted with the Lp maximal inequality for martingales that only holds for 1 < p < oo p p £TOp<(jri) E\xt\ Levy's Theorem, (4.4), tells us that any continuous local martingale is a time change of Brownian motion Bt, so it suffices to let B* = supi<a \Ba\ and show that (5.2) Theorem. For any 0 < p < oo there are constants 0 < c, C < oo so that for any stopping time r cEtvI2 < E(B;)P < CEt?'2 Proof of (5.2) The key is the following pair of odd looking inequalities. (5.3) Lemma. Let /? > 1 and 8 > 0. Then for any A > 0 (a) p(b; > /?a, r1'2 < s\) < w^wp(b; > X)
Section 3.5 Burkholder Davis Gundy Inequalities 117 (b) P(t^ > p\,B; < SX) < ^Pir1'2 > A) Remark. The inequalities above are called "good A" inequalities, although the reason for the name is obscured by our formulation (which is from Burkholder (1973)). The name "good A" comes from the fact that early versions of this and similar inequalities (see Theorems 3.1 and 4.1 in Burkholder, Gundy, and Silverstein (1971)) were formulated as P(f > A) < CPiKP{g > A) for all A that satisfy P{g > A) < KP{g > /?A). Here /?, K > 1. Proof It is enough to prove the result for bounded r for if the result holds for r A n for all n, it also holds for r. Let Si =inf{i:|5(iAr)| > A} $2=^:15(2 At) | >/?A} T = inf {t : (t A t)1'2 > <5A} Since B* > /?A implies S\ < S2 < r and r1/2 < <5A implies T = oo we have P(B*T > /?A, t1'2 < 8\) < P(\B(t A S2 A T) - B(t A Si A T)\ > (/? - 1)A) < (/?- 1)-2\-2E{(B(t AS2AT)- B(r A Sx AT))2} where the second inequality is due to Chebyshev. Now if Ri < R2 are bounded stopping times then £{£(#05(¾)} = E{B(R1)E(B(R2)\^Rl)} = E B{R{f So.we have E(B(R2) - 5(¾))2 = EB(R2)2 - 2EB(Ri)B(R2) + EB{R{)2 = EB(R2)2 - EB{Rxf = E(R2 - Rx) since B2 — t is a martingale. Resuming our first computation we find = (/? - l)-2A-2£{(r A S2 A T) - (r A Si A T)} <(/?-l)-2A-2(5A)2P(Si<oo) = {/3-l)-262P{B;> A) since TAt < (SX)2 proving (a).
118 Chapter 3 Brownian Motion, II To prove (b) we interchange the roles of B(t A r) and (t A r)1/2 in the first set of definitions and let Si=inf{i: (rAi)1/2>A} S2 = inf{i: (r A i)1/2 > /?A} T=mf{t:\B{rAt)\>8\} Reversing the roles of B, and s1/2 in the proof, it is easy to check that P(r1/2 > /?A, B*T < 6\) < P((r A S2 A T) - (r A Sx A T) > (/32 - 1)A2) < (/?2 - l)-1A"2S{(r AS2AT)-(rAS1A T)} and using the stopping time result mentioned in the proof of (a) it follows that the above is = (/?2 - l)-1A"2S{JB(r A S2 A T? - B(r A Sx A T)2} < (/?2 - 1)-1A"2(5A)2P(S1 < oo) <(/?2-l)-152P(r1/2>A) since \B(t AT)\< (5A)2 proving (b). □ It will take one more lemma to extract (5.2) from (5.3). First, we need a definition. A function <p is said to be moderately increasing if <p is a nondecreasing function with <p(0) = 0 and if there is a constant K so that <p(2\) < K<p(\). It is easy to see that <p(x) = xp is moderately increasing (K = 2P) but <p(x) = eax — 1 is not for any a > 0. To complete the proof of (5.2) now it suffices to show (take /? = 2 in (5.3) and note 1/3 < 1) (5.4) Lemma. If X,Y > 0 satisfy P{X > 2A,Y < «5A) < 62P(X > A) for all 6 > 0 and <p is a moderately increasing function, then there is a constant C that only depends on the growth rate K so that E<p{X) < CE<p(Y) Proof It is enough to prove the result for bounded <p for if the result holds for <p A n for all n > 1, it also holds for <p. Now <p is the distribution function of a measure on [0, oo) that has f MA)= / hh>X)d<p(X) o Jo
Section 3.6 Martingales Adapted to Browniaji Filtraiions 119 Replacing h by a nonnegative random variable Z, taking expectations and using Fubini's theorem gives (5.5) E<p(Z)= / P(Z>\)dip(\) Jo Prom our assumption it follows that P(X > 2A) = P{X > 2A, Y < 6\) + P(X > 2A,Y > <5A) < S2P(X > A) + P(Y > <5A) Integrating d<p(\) and using (5.5) with Z = X/2,X,Y/6 E<p(X/2) < 62E<p(X) + E<p(Y/6) Pick 6 so that K62 < 1 and then pick N > 0 so that 2N > 5_1. Prom the growth condition and the monotonicity of <p, it follows that E<p{Y/6) < KNE<p{Y) Combining this with the previous inequality and using the growth condition again gives E<p(X) < KE<p(X/2) < K62E<p(X) + KN+iE<p{Y) Solving for E<p(X) now gives kn+i Ef(X) < YZTKPE,p(y) D 381 Revisited. To compare with (3.4), suppose p = 4. In this case K = 24 = 16. Taking 8 = 1/8 to make 1 — K82 = 3/4 and N = 3, we get a constant which is 164-4/3 = 87,381.333... Of course we really don't care about the value, just that positive finite constants 0 < c, C < 00 exist in (5.1). 3.6. Martingales Adapted to Brownian Filtrations Let {Bt, t > 0} be the filtration generated by a d-dimensional Brownian motion Bt with B0 = 0, defined on some probability space (fi, J7, P). In this section we will show (i) all local martingales adapted to {Bt, t > 0} are continuous and
120 Chapter 3 Brownian Motion, II (ii) every random variable X £ L2(Q,Boo,P) can be written as a stochastic integral. (6.1) Theorem. All local martingales adapted to {Bt, t > 0} are continuous. Proof Let Xt be a local martingale adapted to {Bt,t > 0} and let Tn < n be a sequence of stopping times that reduces X. It suffices to show that for each n, X(t ATn) is continuous, or in other words, it is enough to show that the result holds for martingales of the form Yt = E(Y\Bt) where Y € Bn. We build up to this level of generality in three steps. Step 1. Let Y = f(Bn) where / is a bounded continuous function. If t > n, then Yt = f(Bn), so t —» Yt is trivially continuous for t > n. If t < n, the Markov property implies Yt = E(Y\Bt) = h(n — t,Bt) where It is easy to see that h(s,x) is a continuous function on (0,oo) x R, so Yt is continuous for t < n. To check continuity at t = n, observe that changing variables y = x + Zy/s, dy = sdl2 dz gives h{s' x) = J J^F-e~^'2f^x + z^ dz so the dominated convergence theorem implies that as t | n, h(n— t,Bt) —* f(Bn). Step 2. Let Y = /i(5(x)/2(5ta) where t\ < t2 < n and /1,/2 are bounded and continuous. If t > t\, then Yt = h(Btl)E(f2(Bt3)\Bt) so the argument from step 1 implies that Yt is continuous on \t\, 00). On the other hand, if t < t1} then Y< = E(Ytl\Bt) and Ytl = fi(Btl)E(MBta)\Btl) = fi(Btl)g(Btl) where is a bounded continuous function, so Yt = E{h{Btl)g{Btl)\Bt) for * < *i
Section 3.6 Martingales Adapted to Brownian Filtrations 121 and it follows from step 1 that Yt is continuous on [0,ii]. Repeating the argument above and using induction, it follows that the result holds if Y = f\{Btl) • • • fk(Btk) where t\ < t2 < ■ ■ ■ < i* < n and /i,..., /* are bounded continuous functions. Step 3. Let Y € Bn with E\Y| < oo. It follows from a standard application of the monotone class theorem that for any e > 0, there is a random variable Xc of the form considered in step 2 that has E\Xe — Y\ < e. Now \E{X'\Bt) - E{Y\Bt)\ < E(\X< - Y\\Bt) and if we let Zt = E{\X€ — Y\\Bt), it follows from Doob's inequality (see e.g., (4.2) in Chapter 4 of Durrett (1995)) that \P (sup Zt > A J < EZn = E\XC - Y| < e Now Xe(t) = E(Xe\Bt) is continuous, so letting £-+0we see that for a.e. (j,Yt(w) is a uniform limit of continuous functions, so Yt is continuous. □ Remark. I would like to thank Michael Sharp e for telling me about the proof given above. We now turn to our second goal. Let Bt = (Bj,..., Bf) with Bq = 0 and let {B\,t > 0} be the filtrations generated by the coordinates. (6.2) Theorem. For any X e i^fi,£«,,-?) there are unique H{ G n2(-B*') with d /.OO X = EX + J2J HldBi Proof We follow Section V.3 of Revuz and Yor (1991). The uniqueness is immediate since the isometry property, Exercise 6.2 of Chapter 2, implies there is only one way to write the zero random variable, i.e., using if'(s,w) = 0. To prove the existence of H3, the first step is to reduce to one dimension by proving: (6.3) Lemma. Let X e L^n.^.P) with EX = 0, and let X{ = SpqBJJ. Then X = X1 + --- + Xd. Proof Geometrically, see (1.4) in Chapter 4 of Durrett (1995), X{ is the projection of X onto Lf the subspace of mean zero random variables measurable
122 Chapter 3 Browniaji Motion, II with respect to B^. If Y € L2 and Z € Lj then Y and Z are independent so EYZ = EY ■ EZ = 0. The last result shows that the spaces L2 and L) are orthogonal. Since L2, ...,1/^ together span all the elements in L2 with mean 0, (6.7) follows. □ Proof in one dimension Let X be the set of integrands which can be written as J2]=i ^jl(i,--i,f,-] wnere the Ay and Sj are (nonrandom!) real numbers and 0 = s0 < si < ... < sn. For any integrand H £l C IbCB), we can define the stochastic integral Y* = /0 H3 dB3, and use the isometry property, Exercise 6.2 of Chapter 2, to conclude Y* € .M2. Using (3.7) and the remark after its proof, we can further define the martingale exponential of the integral, Zt = £(Y)t and conclude that Zt G M2. Ito's formula, see (3.6), implies that Z% — Zq = S*Z,dYt. Recalling Yt = (H ■ B)t and using the associative law, (9.6) in Chapter 2, we have i: Zi-Z0= / Z3H3dB3 Jo Y0 = 0 and <Y)0 = 0, so Z0 = 1. Since Zt € A42 we have .EZi = 1 = Z0 and it follows that JO i.e., the desired representation holds for each of the random variables in the set J = {€(H-B)t :H£l,t>Q}. To complete the proof of (6.2) now, it suffices to show (6.4) Lemma. If W G L2 has E(ZW) = 0 for all Z g J then W = 0. (6.5) Lemma. Let Q be the collection of L2 random variables X that can be written as EX + /0°° H3 dB3. Then Q is a closed subspace of L2. Proof of (6.4) We will show that the measure W dP (i.e., the measure \i with dfx/dP = W) is the zero measure. To do this, it suffices to show that WdP is the zero measure on the oxfield <r{Bil,...,Bia) for any finite sequence 0 = t0 < ti < i2 < • • • < tn- Let Xj, 1 < j < n be real numbers and z be a complex number. The function <p(z) = E is easily seen to be analytic in C, i.e., <p(z) is represented by an absolutely convergent power series. By the assumed orthogonality, we have <p(x) = 0 for
Section 3.6 Martingales Adapted to Brownian Filtrations 123 all real x, so complex variable theory tells us that tp must vanish identically. In particular <p(i) = 0, when i = y/—l. The last equality implies that the image of W dP under the map u - (Btl H-AH.., Bu («) - Bu_, («)) is the zero measure since its Fourier transform vanishes. This shows that W dP vanishes on <r(Btl — Bto,...,Btn — Btti_x) = a(Btl,...,Btn) and the proof of (6.4) is complete. □ Proof of (6.5) It is clear that Q is a subspace. Let X„ £ Q with (6.6) Xn-EXn= / H?dB, Jo and suppose that Xn —» X in L2. Using an elementary fact about the variance, then (6.6), and the isometry property of the stochastic integral, Exercise 6.2 in Chapter 2, we have E(Xn - Xmf - (EXn - EXmf = E{(Xn - Xm) - (EXn - EXm)}2 = E (H? - H?)3 ds Jo Since Xn -» X in L2 implies E(Xn - Xm)2 -» 0 and EXn -» EX, it follows that ||#m - Hn\\B — 0. The completeness of 112(5), see the remark at the beginning of Step 3 in Section 2.4, implies that there is a predictable H so that \\Hn — H\\b —* 0. Another use of the isometry property implies that Hm ■ B converges to H ■ B in M2. Taking limits in (6.6) gives the representation for X. This completes the proof of (6.5) and thus of (6.2). □
4 Partial Differential Equations A. Parabolic Equations In the first third of this chapter, we will show how Brownian motion can be used to construct (classical) solutions of the following equations: 1. ut = -Au ut = -Au + g ut = -Aw + cu in (0, oo) x Rd subject to the boundary condition: u is continuous at each point of {0} x Rd and u(0,x) = f(x) for x G Rd. Here, d2Uj d2ud Au = -r-r- -\ h dx\ dx and by a classical solution, we mean one that has enough derivatives for the equation to make sense. That is, u g C1,2, the functions that have one continuous derivative with respect to t and two with respect to each of the z,-. The continuity in u in the boundary condition is needed to establish a connection between the equation which holds in (0,oo) x Rd and u(0,x) = f(x) which holds on {0} x Rd. Note that the boundary condition cannot possibly hold unless / : Rd —» R is continuous. We will see that the solutions to these equations are (under suitable assumptions) given by Ex f(Bt) Ex(f(Bt) + J git-s^^ds^ Ex (f(Bt) exp (J c(t - s, B,) ds^
126 Chapter 4 Partial Differential Equations In words, the solutions may be described as follows: (i) To solve the heat equation, run a Brownian motion and let u(t, x) = Exf(Bt). (ii) To introduce a term g(x) add the integral of g along the path. (iii) To introduce cu, multiply f(Bt) by m* = exp(/0 c(t — s,B,)ds) before taking expected values. Here, we think of the Brownian particle as having mass 1 at time 0 and changing mass according to m'3 = c(t — s, B,)ms, and when we take expected values, we take the particle's mass into account. An alternative interpretation when c < 0 is that exp(/0 c(t — s, B3) ds) is the probability the particle survives until time t, or —c(r, x) gives the killing rate when the particle is at x at time r. In the first three sections of this chapter, we will say more about why the expressions we have written above solve the indicated equations. In order to bring out the similarities and differences between these equations and their elliptic counterparts discussed in Sections 4.4 to 4.6, we have adopted a rather robotic style. Formulas (m.2) through (m.6) and their proofs have been developed in parallel in the first six sections, and at the end of most sections we discuss what happens when something becomes unbounded. 4.1. The Heat Equation In this section, we will consider the following equation: (1.1a) u e C1'2 and ut = |Aw in (0, oo) x Rd. (1.1b) u is continuous at each point of {0} x Rd and u(0,x) = f(x). This equation derives its name, the heat equation, from the fact that if the units of measurement are chosen suitably then the solution u(t,x) gives the temperature at the point x G Rd at time t when the temperature profile at time 0 is given by f(x). The first step in solving (1.1), as it will be six times below, is to find a local martingale. (1.2) Theorem. If u satisfies (1.1a), then M, = u(t—s, B3) is a local martingale on [0,i)- Proof Applying Ito's formula, (10.2) in Chapter 2, to u(x0,... ,Xd) with
Section 4.1 The Heat Equation 127 X° = t - s and X\ = B\ for 1 < i < d gives u(t-s,B,)-u(t,B0) = I -ut(t-r,Br)dr Jo + I Vu(t-r,Br)-dBr Jo + | J Au{t-r,Br)dr To check this note that dX° = —dr and X° has bounded variation, while the X). with 1 <i <d are independent Brownian motions, so 10 otherwise (1.2) follows easily from the Ito's formula equation since — ut + |Aw = 0 and the second term on the right-hand side is a local martingale. D Our next step is to prove a uniqueness theorem. (1.3) Theorem. If there is a solution of (1.1) that is bounded then it must be v(t,x) = Exf(Bt) Here = means that the last equation defines v. We will always use u for a generic solution of the equation and v for our special solution. Proof If we now assume that u is bounded, then M3,0 < s < t, is a bounded martingale. The martingale convergence theorem implies that Mt = lim M, exists a.s. If u satisfies (1.1b), this limit must be f(Bt). Since M3 is uniformly integrable, it follows that u{t, x) = ExMq = ExMt = v(t, x) D Now that (1.3) has told us what the solution must be, the next logical step is to find conditions under which v is a solution. It is (and always will be) easy to show that if v is smooth enough then it is a classical solution. (1.4) Theorem. Suppose / is bounded. If v £ C1,2 then it satisfies (1.1a).
128 Chapter 4 Partial Differentia.! Equations Proof The Markov property implies, see Exercise 2.1 in Chapter 1, that Ex{f(Bt)\T,) = EBM(f(Bt-,)) = v(t - s, B,) The left-hand side is a martingale, so the right-hand side is also. If v £ C1,2, then repeating the calculation in the proof of (1.2) shows that t' 1 v(t-s,B,)-v{t,B0) = / (-^ + -A^)(i-r,JBr)dr Jo * + a local martingale The left-hand side is a local martingale, so the integral on the right-hand side is also. However, the integral is continuous and locally of bounded variation, so by (3.3) in Chapter 2 it must be = 0 almost surely. Since vt and Av are continuous, it follows that — vt + %Av = 0. For if it were ^ 0 at some point (t, x), then it would be ^ 0 on an open neighborhood of that point, and, hence, with positive probability the integral would not be = 0, a contradiction. D It is easy to give conditions that imply that v satisfies (1.1b). In order to keep the exposition simple, we first consider the situation when / is bounded. (1.5) Theorem. If / is bounded and continuous, then v satisfies (1.1b). Proof (Bt — Bo) = i^2iV, where N has a normal distribution with mean 0 and variance 1, so if tn —>• 0 and xn —* x, the bounded convergence theorem implies that v(tn, xn) = Ef(xn + ti'2N) - f(x) D The final step in showing that v is a solution is to find conditions that guarantee that it is smooth. In this case, the computations are not very difficult. (1.6) Theorem. If / is bounded, then v £ C1,2 and hence satisfies (1.1a). Proof By definition, v{t,x) = Exf(Bt)= fPi(x,y)f(y)dy
Section 4.1 The Heat Equation 129 where pt(x,y) = (2^)-^-^-^/^. Writing D{ = d/dx{ and Dt = d/dt, a little calculus gives DiPt{x,y)=~{Xi~yi)pt{x,y) DiiVt{x,y)={xi~yf~%t{x,y) DijPt(x,y)={xi-yifj-yj)pt(x,y) i # j ntMx,y)=(^ + £gP)Pt(x,y) If / is bounded, then it is easy to see that for or = i, ij, or t / \Dapt(x,y)f(y)\dy < oo and is continuous in Rd, so (1.6) follows from the next result on differentiating under the integral sign. This result is lengthy to state but short to prove since we assume everything we need for the proof to work. Nonetheless we will see that this result is useful. (1.7) Lemma, Let (S, S, m) be a cr-finite measure space, and g : S —* R be measurable. Suppose that for x G G an open subset of Rd and some h0 > 0 we have: (a) u(x) = /s K(x,y)g(y)m(dy) where K and dK/dxi :GxS-+R are measurable functions with (b) K{x* + ha.y) - K{x\y) = ft §£-.(x* + ee{,y)de for \h\ < hQ and y G S (c) u{(x) = fs §*r.(x,y)g(y) m(dy) is continuous at x* and (d) /s f*l |f£(s' + 0e{, y)g(y)\ d8 m(dy) < oo. Then du/dx{ exists at x* and equals «,•(«*). Proof Using the definition of u in (a), then (b) and Fubini's theorem, which is justified for \h\ < h0 by (d), we have «(*-+ he{) - u(x') = J(K(x* + heuy) - K(x\y))g(y) m(dy) = 11 7r(*'+teiMv)m{dy)dO JO Js °xi
130 Chapter 4 Partial Differential Equations Dividing by h and letting h —» 0 the desired result follows from (c). □ Later we will need a result about differentiating sums. Taking S = Z with S = all subsets of S, and fj. is counting measure in (1.7), then setting g = 1 and /n(z) = K(x, n) gives the following. Note that in (a) and (c) of (1.7) it is implicit that the integrals exist, so here in (a) and (c) we assume that the sums converge absolutely. (1.8) Lemma. Suppose that for x £ G an open subset of Rd and some h0 > 0 we have: (a)u(x)=J2nfn(x) where /„ and dfn /dx{ : G -» R, n € Z ate measurable functions with (b) /„(*' + Ac,-) - /„(*•) = /* §£(*• + 0e{) d9 for \h\ <hQandneZ (c) u{(x) = J2n ~i£r(x) 1S continuous at x* and (d) £„ fX \e£(x* + Ba) d9 < <x>. Then du/dx{ exists at x* and equals u{(x*). Unbounded f. For some applications, the assumption that / is bounded is too restrictive. To see what type of unbounded / we can allow, we observe that, at the bare minimum, we need .EX|/(.B<)| < oo for all t. Since a condition that guarantees this for locally bounded / is (*) \x\~2 l°g+ \f{x)\ ~~*■ 0 as x —* oo Replacing the bounded convergence theorem in (1.5) and (1.6) by the dominated convergence theorem, it is not hard to show: (1.9) Theorem. If/ is continuous and satisfies (*), then v satisfies (1.1). 4.2. The Inhomogeneous Equation In this section, we will consider what happens when we add a function g(t, x) to the equation we considered in the last section. That is, we will study (2.1a) u e C1'2 and ut = |Aw + g in (0, oo) x Kd
Section 4.2 The Inhomogeneous Equation 131 (2.1b) u is continuous at each point of {0} x Rd and u(0, x) = f{x). We observed in Section 4.1 that (2.1b) cannot hold unless / is continuous. Here g = ut — \&u so the equation in (2.1a) cannot hold with u £ C1,2 unless g(t, x) is continuous. Our first step in treating the new equation is to observe that if u\ is a solution of the equation with / = /0 and g = 0 which we studied in the last section, and u2 is a solution of the equation with / = 0 and g = go then u\ + u2 is a solution of the equation with f = f0 and g = go so we can restrict our attention to the case /=0. Having made this simplification, we will now study the equation above by following' the procedure used in the last section. The first step is to find an associated local martingale. (2.2) Theorem. If u satisfies (2.1a), then M3=u(t-s,B,) + I g(t-r,Br)dr Jo is a local martingale on [0,2). Proof Applying Ito's formula as in the proof of (1.2) gives I" 1 u{t-s,B3)-u{t,BQ)= / {-ut + -Au)(t-r,Br)dr + I Vu(t-r,Br)-dBr Jo which proves (2.2), since — ut + \Au = —g and the second term on the right- hand side is a local martingale. D • Again the next step is a uniqueness result. (2.3) Theorem. Suppose g is bounded. If there is a solution of (2.1) that is bounded on [0,T] x Rd for any T < oo, it must be v(t,x) = Ex(J g(t-s,B3)ds\ Proof Under the assumptions on g and u, Ms, 0 < s < t, defined in (2.2) is a bounded martingale and u(0,x) = 0 so Mt = limM, = / g(t — s, B3) ds *T< Jo
132 Chapter 4 Partial Differentia.! Equations Since M3 is uniformly integrable, u(t, x) = ExMq = ExMt = v(t, x) □ Again, it is easy to show that if v is smooth enough it is a solution. (2.4) Theorem. Suppose g is bounded and continuous. If v £ C1,2, then it satisfies (2.1a) in (0, oo) x Rd. Proof Using the Markov property, see Exercise 2.6 in Chapter 1, gives Ex (J g{t-r,Br)dr F3j = J g(t- r, Br) dr + EB. (J g(t-s- u, Bu) du\ = ( g(t-r,Br)dr + v(t-s,B3) Jo The left-hand side is a martingale, so the right-hand side is also. If v £ C1,2, then repeating the calculation in the proof of (2.2) shows that v(t-s,B3)-v(t,BQ)+ f g(t-r,Br)dr Jo f3 1 = I {-vt + -Av + g){t-r,Br)dr + a local martingale The left-hand side is a local martingale, so the integral on the right-hand side is also. Since the integral is continuous and locally of bounded variation, (3.3) in Chapter 2 implies it must be = 0 almost surely. Again this implies that (—vt + jAv + g) = 0, for our assumptions imply that this quantity is continuous and if it were ^ 0 at some point (i, z) we would have a contradiction. D The next step is to give a condition that guarantees that v satisfies (2.1b). As in the last section, we will begin by considering what happens when everything is bounded. (2.5) Theorem. If g is bounded, then v satisfies (2.1b). Proof If \g\ < M, then as t -* 0 \v(t, x)\ <EX I \g(t - s, B3)\ ds<Mt-+Q D Jo
Section 4.2 The Inhomogeneous Equation 133 The final step in showing that v is a solution is to check that v £ C1,2. Since the calculations necessary to establish these properties are quite tedious, we will content ourselves to state what the results are and do enough of the proofs to indicate why they are true. If you get dazed and confused and bail out before the end, the Take home message is: it is not enough to assume g is continuous to have v g C1,2. We must assume g to be Holder continuous locally in t. That is, for any N < oo there are constants C,a £ (0, oo), which may depend on N, such that \g(t, x) — g(t, y)\ < C\x — y\a whenever t < N. The reason for this assumption can be found in the proof of (2.6c) and again in (2.6d). The first step in showing v G C1,2 is to assume g is bounded and use Fubini's theorem to conclude v(t,x) = / ds p,(x,y)g(t-s,y)dy where p3(x,y) = {2irs)~dl2e~\y~x\ I2'. The expression we have just written for v above is what Friedman (1964) would call a volume potential and would write as V(x,t)= J J Z(x,t;t,T)g(t,r)dtdT To translate between notations, set % = 0,JD = Rd, Z(x,t;t,T) = pt-T(x,t) and change variables s = t — r,y = £. Because of their importance for the parametrix method, the differentiability properties of volume potentials are well known. The results we will state are just Theorems 2 to 5 in Chapter 1 ofFriedman (1964), so the reader who is interested in knowing the whole story can find the missing details there. (2.6a) Theorem. If g is a bounded measurable function, then v(t, x) is continuous on (0, oo) x Rd. Proof Use the bounded convergence theorem. D (2.6b) Theorem. There is a constant C so that if \g\ < M is measurable, then the partial derivatives D{V = dv/dx{ have \D{v\ < CMt1!2, are continuous, and are given by D{V =// Dip,(x,y)g(t - s, y) dy ds
134 Chapter 4 Partial Differential Equations Proof Using the formula for Dip, from (1.6), the right-hand side is - jT J(2^s)-^^^le-^-y\2^g{t - s,y)dyds = - f^Ex{(Xi-Bi)g(t-s,B3)] Jo s Although the last formula looks suspicious because we are integrating s-1 near 0, everything is really all right. If |^| < M, then Ex\(x{ - B\)g{t - a, B3)\ < MEx\x{ - B\\ = CMs1'2 so we have f —Ex\{x{ - B\)g{t - a,B,)\ < 2CMt1'2 < oo Jo s Using our result on differentiating under the integral sign, (1.7), it follows that the partial derivatives D{V exist, are continuous, and have the indicated form. □ Things get even worse when we take second derivatives. (2.6c) Theorem. Suppose that g is bounded and Holder continuous locally in t. Then the partial derivatives D{jv = d2v/dxidxj are continuous, and DijV = DijP*(x, y)g(t - s, y) dy ds Proof Suppose for simplicity that i = j. Consulting (1.6) for the formula for A": Pi, the right-hand side is fQ yWd/2 (<*-%>'-) e-l—la/*„(t - s, y) dy ds '{xj-BW-a' l'M(! s2 ,g(t-s,B3)\ds This time, however, Ex\(x{ - B\)2 -s\ = s^oK^j)2 - 1| so {xt-BJf-a i: dsEx 00
Section 4.2 The Inhomogeneous Equation 135 We can overcome this problem if g is Holder continuous at x, because the fact that Ex(x{ — B\)2 = s allows us to write Ex (xt-Btf-a g(t-s,B,) Ex {xi-Biy-a {g(t-s,B3)-g(t-s,x)} Using the Holder continuity of g one can show the above is < Cs~l+al2, so its integral from s = 0 to t converges absolutely, and with a little work (2.6c) follows. (See Friedman (1964), pages 10-12, for more details.) D The last detail now is: (2.6d) Theorem. Let g be as in (2.6c). Then dv/dt exists, and dv (t, *) = g(t, x)+ dr —pt-r(x,y)g(r, y) dy Proof To take the derivative w.r.t. t, we rewrite v as v(t,x)= / pt-r(x,y)g(r,y)dydr Differentiating the right-hand side w.r.t. t gives two terms. Differentiating the upper limit of the integral gives g(t, x). Differentiating the integrand and using a formula from (1.6) gives = Jo J {2*)dl\t - r)(<*+2)/2 e^ V 2(i-r) + / /(2.(, - 0)-"2±=§ exp (¾) ,(r, „) dydr exP 1 ~^7I—!T~ ) 9(r, y) dy dr -d l* dr J JL-Exg(r,Bt-r) + J d* Ex(\x-Bt-r\2g(r,Bt-r)) In the second integral, we can use the fact that Ex(\x — 5t_r|2) = C(t — r) to cancel one of the t — r's and make the second expression like the first, but even if we do this,
136 Chapter 4 Partial Differential Equations This is the difficulty that we experienced in the proof of (2.6c), and the remedy is the same: We can save the day if g is Holder continuous locally in t. For further details, see pages 12-13 of Friedman (1964). D Unbounded g. To see what type of unbounded g's can be allowed, we will restrict our attention to the temporally homogeneous case g(t,x) = h(x). At the bare minimum, we need Ex I \h(B,)\ds < oo Jo and if we want (2.1b) to hold, we need to know that if tn j 0 and xn —» x, then Ex_ [nh(Bs)da-*Q ?x„ [Uh(B,)t Jo If we put absolute values inside the integral and strengthen the last result to uniform convergence for x £ Rd, then we get a definition that is essentially due to Kato (1973). A function h is said to be in Kd if lim sup £* ( I \h(B,)\dsj = 0 By Fubini's theorem, we can write the above as (*) lim sup / kt(x,y)\h(y)\dy = 0 'i0 X J where h(x,y) = f{2ns)-dl2e-\x-^l2'ds Jo By considering the asymptotic behavior of kt(x, y) as t —» 0 and \x—y\2/t —» c, we can cast this condition in a more analytical form as (**) lim sup / <p(\x - y\)\h(y)\dy = 0 «10 X J\x-y\<a where fr-(d-2) d>3 <f(r)= < -logr d=2 U d=l The equivalence of (*) and (**) is Theorem 4.5 of Aizenman and Simon (1982) or Theorem 3.6 of Chung and Zhao (1995). Section 3.1 of the latter
Section 4.3 The Feynman-Ka.c Formula. 137 source can be consulted for a wealth of information about these spaces. We will content ourselves here with an Example 2.1. Let h(z) = k(\z\) where k(r) = r~p for r < 1 and 0 for r > 1. We will now show that (2.7) Theorem, h g Kd if p < 2. The special case p = 1 is the Coulomb potential which is important in physics. Proof If a < 1 then changing to polar coordinates and noticing the integral is largest when x = 0 we have sup I <p(\x - y\)\h(y)\ dy<Cd f V»(r)fc(r)r''-1 dr x J\x-y\<a JO If we replace k(r) by r~p and <p(r) by r~(d~2\ which holds in d ^ 2, then the above = Cd f r1_p dr — 0 Jo when p < 2. In d = 2 when p < 2 we get = Cd I r1-plogrc/r-»-0 D Jo Remark. We will have more to say about these spaces at the end of the next section. For the developments there, we will also need the space Kd°c, which is denned in the obvious way: / £ Kldc if for every R < oo, the spatially truncated function /l(|x|<ii) € Kd- 4.3. The Feynman-Kac Formula In this section, we will consider what happens when we add cu to the right-hand side of the heat equation. That is, we will study (3.1a) u G C1'2 and ut = f Au+ cu in (0, oo) x Rd. (3.1b) u is continuous at each point of {0} x Rd and w(0,z) = f(x). If c(t, x) < 0, then this equation describes heat flow with cooling. That is, u(t, x) gives the temperature at the point x £ Rd at time t, when the heat at x at time t dissipates at the rate —c(t, x).
138 Chapter 4 Partial Differential Equations The first step, as usual, is to find a local martingale. (3.2) Theorem. Let c* = f* c(t — r,Br) dr. If u satisfies (3.1a), then M, = u(t — s, B3) exp(c*) is a local martingale on [0,2). Proof Applying Ito's formula with X° = t - s, A'* = B\ for 1 < i < d, and Xd+i _ ci gives tnat u{t — s, B,) exp(c*) - u(t,BQ) = I -ui(t-r,Br)exp(cir)dr + f exp(c*)Vti(f- r,Br) ■ dBr Jo Jo + I u(t- r,Br) exp(c*) del + 2 / Au^ ~r'Br) exP(cr) dr since we have , . . , (X{ Xj) =•[* nl<i = J<d \ 0 otherwise Using del = c(i — r,Br) dr and rearranging, the right-hand side is = / ( -«t + cw + -Aw J (i - r, 5r) exp(c*) dr + I exp(cl)Vu(t-r,Br)-dBr Jo which proves (3.2); since — ut + cu + \Au = 0 and the second term is a local martingale. D The next step, again, is a uniqueness result. (3.3) Theorem. Suppose that c is bounded. If there is a solution of (3.1) that is bounded on [0,T] x Rd for any T < 00, it must be v{t,x) = Ex{f{Bi)exp{cii)} Proof Under our assumptions on c and u, M3, 0 < s < t, is a bounded martingale and Mt = lim.,_t Ms = /(-Bt)exp(cj). Since M, is uniformly integrable it follows that u(t, x) = ExMo = ExMt = v(t, x) D
Section 4.3 The Peynman-Ivac Formula 139 As before, it is easy to show that if v is smooth enough it is a solution. (3.4) Theorem. Suppose / is bounded and that c is bounded and continuous. If v G C1'2, then it satisfies (3.1a). Proof The Markov property implies, see Exercise 2.7 in Chapter 1 and take h(r,x) = c(t — r,x), that Ex(f(Bt)exp(c\)\F3) = exp(cl)EB.(f(Bt-3)exp(c\ys)) = exv(c\)v(t-s,B3) The left-hand side is a martingale, so the right-hand side is also. If v £ C1'2, then repeating the calculation in the proof of (3.2) shows that v(t — s,i?,)exp(c*) — v(t, B0) [* 1 = / (—vt +cv+ -Av)(t — r, Br) exp(cl) dr + a local martingale The left-hand side is a local martingale, so the integral on the right-hand side is also. Since the integral is continuous and locally of bounded variation, (3.3) in Chapter 2 implies that it must be = 0 almost surely. Again this implies that (—vt +cv +1Av)(t — r,Br) exp(c*r) = 0 for our assumptions imply this quantity is continuous and if it were ^ 0 at some point we would have a contradiction. D The next step is to give a condition that guarantees that v satisfies (3.1b). As before, we begin by considering what happens when everything is bounded. (3.5) Theorem. If c is bounded and / is bounded and continuous, then v satisfies (3.1b). Proof If \c\ < M, then e~Mt < exp(c*) < eMt, so exp(c*) -♦ 1 as t -* 0. Letting ||/||oo = supx |/(z)| this result implies \EXexV(c\)f(Bt) - Exf(Bt)\ < 11/11«, Ex\exp(c\) - 1| - 0 (1.5) implies that (i, x) —* Exf(Bt) is continuous at each point of {0} x Rd and the desired result follows. D This brings us to the problem of determining when v is smooth enough to be a solution.
140 Chapter 4 Partial Differential Equations (3.6) Theorem. Suppose that / is bounded and Holder continuous. If c is bounded and Holder continuous locally in t, then v £ C1,2 and, hence, satisfies (3.1a). Proof To solve the problem in this case, we use a trick to reduce our result to the previous case. We begin by observing that <4= I c(t-r,Br)dr Jo is continuous and locally of bounded variation. So Ito's formula, (10.2) in Chapter 2, implies that if h G C1 then h(c\)-h(cl)= fh'{c\)dc\ Jo Taking h(x) = e~x we have exp(—c\) — 1 = — / exp(—c\)c(t — s, B3) ds Jo Multiplying by — exp(cj) gives exp(c^) — 1 = / c(t — s,B3)exp(c\ — c\)ds Jo Plugging in the definitions of c\ and c\ we have exp ( I c(t- r, Br) dr J = 1 + J c(t - s, B3) exp ( f c(t - r,Br) drj ds Multiplying by f{Bt), taking expected values, and using Fubini's theorem, which is justified since everything is bounded, gives v(t, x) = Exf(Bt) + J Ex L(t - s, Bs) exp (J c(t - r, Br) drj f(Bt) J ds Conditioning on Ts, noticing c(t — s, Bs) £ Ts, and using the Markov property as in Exercise 2.7 of Chapter 1, Ex f c(t -s,Bs) exp (J c(t - r, Br) drj f(Bt) F, J = c(t - s, B,)v(t - s, B,)
Section 4.3 The Feynman-Kac Formula. 141 Taking the expected value of the last equation and plugging into the previous one, we have v{t,x) = Exf{Bt)+ J Ex{c(t-s,B3)v(t-s,Bs)}ds (*) Jo = n(t,x) + v2{t,x) The first term on the right, vi(t,x), is C1,2 by (1.6). The second term, V2(t, x), is of the form considered in the last section with g(r, x) = c(r, x)v(r, x). If we start with the trivial observation that if c and / are bounded, then v is bounded on [0,JV] x Rd, and apply (2.6b), we see that \v2(t,x) — V2(t,y)\ < Cn\x — y\ whenever t < N To get a similar estimate for v\ let Bt be a Brownian motion starting at 0 and observe that since / is Holder continuous \vi(t, x) - Vl(t, y)\ = \E{f(x + Bt) - f(y + Bt)}\ < E\f(x + Bt) - f(y + Bt)\ < C\x - y\<* Combining the last two estimates we see that v(r, x) is Holder continuous locally in t. Since c and v ate bounded and we have supposed that c(r,x) is Holder continuous locally in t the triangle inequality implies g(r, x) is Holder continuous locally in t. Using (2.6c) and (2.6d) now, we have v2{t, x) G C1,2. D Unbounded c. As in the last section, we can generalize the results above to unbounded c's, but for simplicity we restrict our attention to the temporally homogeneous case c(t, x) = h(x). Given the formula (*) above, which expresses v as a volume potential, it is perhaps not too surprising that the appropriate assumption is c G Kd- The key to working in this generality is what Simon (1982) calls (3.7) Khasminskii's lemma. Let j>0bea function on Rd with a = snpEx(f g{B3)ds\ <1 Then sup Ex exp ( J g(Bs) ds j < (1 - a)-1 Proof The Markov property and the nonnegativity of g imply that sup Ex [... j dSl... ds„ g(B3l)... g(B3J < an X J J0<Sl< — <3n<t
142 Chapter 4 Partial Differential Equations Since the integrand is symmetric in (s\,..., sn) we have /•••/ dSl...dsng{B3l)...g{B3J = ±-f ... f dSl...dsng(B3l)...g(B3J Summing on n now gives the desired formula. D Prom the last result, it should be clear why assuming h £ Kd is natural in this context. This condition guarantees that supW/ \h(B3)\ds) ^0 and, hence, that supExexp ( f \h{B3)\ds\ -»1 With these two results in hand, we can proceed with developing the theory much as we did in the case of bounded coefficients. B. Elliptic Equations In the next three sections of this chapter, we show how Brownian motion can be used to construct classical solutions of the following equations: 0 = 0 = 0 = 1. 2Au ^u + g —Aw + cu in an open set G subject to the boundary condition: at each point of dG, u is continuous and u= f. We will see that the solutions to these equations are (under suitable assumptions) given by Exf{BT) Ex(j(Br) + J\{B3)ds Ex(f{Br)exV(j\{B3)dsX\
Section 4.4 The Dirichlet Problem 143 where r = inf{i > 0 : Bt ¢. G}. To see the similarities to the solutions given for the equations in Part A of this chapter think of those solutions in terms of space-time Brownian motions B3 = (t — s, B3) run until time f = inf{s : B3 ¢. (0, oo) x Rd}. Of course, f = t. 4.4. The Dirichlet Problem In this section, we will consider the Dirichlet problem. That is, we will study (4.1a) u e C2 and Aw = 0 in G. (4.1b) At each point of dG, u is continuous and u = /. To see what this means, note that if we let h(t, x) = u(x), then h satisfies the heat equation dh !a. — = -Ah dt 2 Thus, u is an equilibrium temperature distribution when dG is held at a fixed temperature profile /. As in the first three sections, the first step in solving (4.1) is to find a local martingale. (4.2) Theorem. Let r = inf{i > 0 : Bt ¢. G). If u satisfies (4.1a), then Mi = u(Bt) is a local martingale on [0, r). Proof Applying Ito's formula gives u{Bt) - u(BQ) = f Vu(B3) -dB3 + \ f Au(B3)ds Jo * Jo for t < t. This proves (4.2), since Aw = 0 in G and the first term is a local martingale on [0,r). D The second step is a uniqueness theorem. For this we introduce (f) Px(t < oo) = 1 for all x which (4.7b) below will show is necessary for uniqueness. Note that (1.3) in Chapter 3 implies bounded sets G satisfy (f). (4.3) Theorem. Suppose G satisfies (f). If there is a solution of (4.1) that is bounded, it must be !<*) = Exf(Br)
144 Chapter 4 Partial Differential Equations Proof If u is bounded and satisfies (4.1) then M3,0 < s < r, is a bounded local martingale. Using (2.7) in Chapter 2, (f), and (4.1b), we have Mr=\\mM3 =f(Br) u(x) = EXMQ = ExMr = v(x) D (f) is not needed for the existence of solutions, but to drop (f) we need to modify the definition to take the possibility of r = oo into account. Let v(x) = Ex (f(Br)l{r<co)) As in the first three sections, it is easy to show: (4.4) Theorem. Let G be any open set and suppose / is bounded. If v £ C2, then it satisfies (4.1a). Proof The Markov property implies that on {r > s} Ex{f(Br)l{r<oo)\T3)=v(B3) The left-hand side is a local martingale on [0,r), so v(B3) is also. If v. £ C2, then repeating the calculation in the proof of (4.2) shows that, for s £ [0,r), 1 F v(B3) — v(B0) = - / Av(Br) dr + a local martingale 2 jo The left-hand side is a local martingale on [0,r), so the integral on the right- hand side is also. However, the integral is continuous and locally of bounded variation, so by (3.3) in Chapter 2 it must be = 0. Since Av is continuous in G, it follows that Av = 0 in G. For if Av ^ 0 at some point we would have a contradiction. D Up to this point, everything has been the same as in Section 4.1. Differences appear when we consider the boundary condition (4.1b), since it is no longer sufficient for / to be bounded and continuous. The open set G must satisfy a regularity condition. Definition. A point y £ dG is said to be a regular point if Pv(t = 0) = 1. (4.5) Theorem. Let G be any open set. Suppose / is bounded and continuous and y is a regular point of dG. If xn £ G and xn —» y, then v(xn) —» f(y).
Section 4.4 The Dirichlet Problem 145 Proof The first step is to show (4.5a) Lemma. If t > 0, then x —* Px(t < t) is lower semicontinuous. That is, if aj„ —» x, then liminf Px (r<t)> PJt < t) n—*co Proof By the Markov property PX{B, e Gc for some s g (c,t]) = J pe(x,y)Py(r <t-e)dy Since y —» Py (r < t — e) is bounded and measurable and pe(x,y) = (2ve)-d/2e-^-^'2e it follows from the dominated convergence theorem that x -* PX(B3 e Gc for some s £ (e,t\) is continuous for each e > 0. Letting e J. 0 shows that x —* Px(t < t) is an increasing limit of continuous functions and, hence, lower semicontinuous. D If y is regular for G and t > 0, then Py(r < t) — 1, so it follows from (4.5a) that if xn —* y, then lim inf PXn (r < t) > 1 for alU > 0 n—*co With this established, it is easy to complete the proof. Since / is bounded and continuous, it suffices to show that if D(y, 6) = {x :\x — y\ < 6}, then (4.5b) Lemma. If y is regular for G and xn —» y, then for all 6 > 0 Px„(t<oo1Bt €£»(»,«))-*! Proof Let e > 0 and pick t so small that Po( sup |B,|>f) <e Since PXn(r < t) —» 1 as a;n —» y, it follows from the choices above that liminfPs„(T<oo1BT G D(y,<5)) > liminf PXn (r<t, sup \B, - xn\ < £) n-*co n-*co ^ 0<j<i */ > liminf Px (r < t) - P0 ( sup \Bt\ > -) > 1 - e
146 Chapter 4 Partial Differentia.! Equations Since e was arbitrary, this proves (4.5b) and, hence, (4.5). D (4.5) shows that if every point of dG is regular, then v(x) will satisfy the boundary condition (4.1b) for any bounded continuous /. The next two exercises develop a converse to this result. The first identifies a trivial case. Exercise 4.1. If G C R- is open then each point of dG is regular. Exercise 4.2. Let G be an open set in Rd with d > 2 and let y £ dG have Py(r = 0) < 1. Let / be a continuous function with f(y) = 1 and f(z) < 1 for all z ^ y. Show that there is a sequence of points xn —* y such that liminf„_co v{xn) < 1. Prom the discussion above, we see that for v to satisfy (4.1b) it is sufficient (and almost necessary) that each point of dG is a regular point. This situation raises two questions: Do irregular points exist? What are sufficient conditions for a point to be regular? In order to answer the first question we will give two examples. Example 4.1. Punctured Disc. Let d > 2 and let G = D — {0}, where D = {x : \x\ < 1}. If we let T0 = ini{t > 0 : Bt = 0}, then PQ{T0 = oo) = 1 by (1.10) in Chapter 3, so 0 is not a regular point of dD. One can get an example with a bigger boundary in d > 3 by looking at G = D — K where K = {x : x\ = X2 = 0}. In d = 3 this is a ball minus a line through the origin. Example 4.2. Lebesgue's Thorn. Let d > 3 and let G = (-1, l)d - U1<„<0O{[2-",2-"+1] x [-^, a^-1} where the n = oo term is the single point {0}. Claim. If an J. 0 sufficiently fast, then 0 is not a regular point of dG. Proof ^0((^,-8^3) = (0,0) for some t > 0) = 0, so with probability 1, a Brownian motion Bt starting at 0 will not hit In = {x : Xl e [2-n, 2-n+1], x2 = x3 = ... = xd = 0} Since Bt is transient in d > 3 it follows that for a.e. w the distance between {B3 : 0 < s < 00} and J„ is positive. From the last observation, it follows
Section 4.4 The Dirichlet Problem 147 immediately that if we let Tn = inf{f : Bt G [2_n, 2"n+1] x [a„, a„]d_1} and pick a„ small enough, then P0(Tn < oo) < 3-". Now £~=1 3"n = 3_1(3/2) = 1/2, so if we let r = inf{f > 0 : Bt £ G) and a = inf{f > 0 : Bt £ (-1, l)d}, then we have 1 Po(r <<r)<J2 po(Tn < oo) < - n = l Thus Po(r > 0) > P0(t = <t) > 1/2 and 0 is an irregular point. D The last two examples show that if Gc is too small near y, then y may be irregular. The next result shows that if Gc is not too small near y, then y is regular. (4.5c) Cone Condition. If there is a cone V having vertex y and an r > 0 such that V D D(y,r) C Gc, then y is a regular point. Proof The first thing to do is to define a cone with vertex y, pointing in direction v, with opening a as follows: V(y, v, a) = {x : x = y + d(v + z) where 6 £ (0, oo), z ±v, and \z\ < a} Now that we have denned a cone, the rest is easy. Since the normal distribution is spherically symmetric, ^(-¾ eV(y,v,a)) = ea>0 where ea is a constant that depends only on the opening a. Let r > 0 be such that V(y, v, a) D D(y, r) C Ge. The continuity of Brownian paths implies limPy (sup|£,-»|>r) =0 Combining the last two results with a trivial inequality we have ea < liminf Py(Bt G Gc) < limP„(r <t)<Py(r = 0) — no y — txo s * and it follows from Blumenthal's zero-one law that Pv(t = 0) = 1. D (4.5c) is sufficient for most cases. Exercise 4.3. Let G = {x : g(x) < 0} where g is a C1 function with Vg(y) ^ 0 for each y £ dG. Show that each point of dG is regular.
148 Chapter 4 Partial Differential Equations However, in a pinch the following generalization can be useful. Exercise 4.4. Define a flat cone V(y, v, a) to be the intersection of V(y, v, a) with a d — 1 dimensional hyperplane that contains the line {y + 6v : 6 g R-}. Show that (4.5c) remains true if "cone" is replaced by "flat cone." When d = 2 this says that if we can find a line segment ending at z that lies in the complement then z is regular. For example this implies that each point of the boundary of the slit disc G = D — {x : xx > 0, z2 = 0} is regular. Having completed our discussion of the boundary condition, we now turn our attention to determining when v is smooth. As in Section 4.1, this is true under minimal assumptions on /. (4.6) Theorem. Let G be any open set. If / is bounded, then v g C100 and, hence, satisfies (4.1a). Proof Let x £ G and pick 6 > 0 so that D(x, 6) C G. If we let a = inf {t : Bi £ D(x,6)}, then the strong Markov property implies that (for more details see Example 3.2 in Chapter 1) v(x) = Ex (/(5r)l(r<oo)) = Ex(v(Ba)) = f v(y) w(dy) JdD(x,&) where tt is surface measure on D(x, 6) normalized to be a probability measure. The last result is the "averaging property" of harmonic functions. Now a simple analytical argument takes over to show v £ C°°. (4.6a) Lemma. Let G be an open set and h be a bounded function which has the averaging property h(x)= [ h(y)w(dy) J8D(x,S) when x eG and 8 < 8X where Sx > 0. Then h G C°° and Ah = 0 in G. Proof Let i^bea nonnegative infinitely differentiable function that vanishes on [<52,oo) but is not = 0. By repeatedly applying (1.7) it is routine to show that g(x)= [ il>(\y-x\2)h(y)dyeC™ JD(x,S)
Section 4.4 The Dirichlet Problem 149 Note that we use \y — x\2 = ^(y; — z,-)2 rather than \y — x\ which is not smooth at 0. Making a simple change of variables, then looking at things in polar coordinates, and using the averaging property we have g(x)= f ^j(\z\2)h(x + z)dz Jd(o,s) = C [ drrd-li!>{r2)( f h(x + z) ir(dz)) JO \JdD(0,r) J = cl [ drrd-1i>{r2)\ h(x) So h is a constant times g and hence C°°. To conclude now that Ah = 0 we note that the multivariate version of Taylor's theorem implies that if \y — x\ < r then h(y) = h(x) + J2(Vi ~ Xi)Dih(x) i + X^y*' ~ xNyi ~ xi)DnKx) + e(y.x) 0' where \e(y, x)\ < C3r3. Integrating over D(x,r) w.r.t. dy/\D(0, r)\ and using the averaging property we have h(x) = h(x) + 0 + C2Ah(x)ra + 0(r3) Subtracting h(x) from each side, dividing by r2, and letting r —» 0 it follows that Ah(x) = 0. D Unbounded G. As in the previous three sections, our last topic is to discuss what happens when something becomes unbounded. This time we will focus on G and ignore /. Combining (4.3), (4.5), and (4.6) we have: (4.7a) Theorem. Suppose that / is bounded and continuous and that each point of dG is regular. If for all x £ G, Px(t < oo) = 1, then v is the unique bounded solution of (4.1). To see that there might be other unbounded solutions consider G = (0, oo), /(0) = 0, and note that u(x) = ex is a solution. Conversely, we have (4.7b) Theorem. Suppose that / is bounded and continuous and that each point of dG is regular. If for some x £ G, Px(r < oo) < 1, then the solution of (4.1) is not unique.
150 Chapter 4 Partial Differential Equations Proof Since h(x) = Px(t = oo) has the averaging property given in (4.6a), it is C°° and has Ah = 0 in G. Since each point y £ dG is regular, a trivial comparison and (4.5a) implies limsup Px(t = oo) < limsup Px(r > 1) < Py(r > 1) = 0 x—*y x—*y The last two observations show that h is a solution of (4.1) with / = 0, which completes the proof. Q By working a little harder, we can show that adding ocPx(t = oo) to v(x) is the only way to produce new bounded solutions. (4.7c) Theorem. Suppose that / is bounded and continuous and that each point of dG is regular. If u is bounded and satisfies (4.1) in G, then there is a constant a such that u{x) = Ex(f(Br); t < oo) + aPx(r = oo) We will warm up for this by proving the following special case in which G = Rd. (4.7d) Theorem. If u is bounded and harmonic in Rd then u is constant. Proof (4.2) above and (2.6) in Chapter 2 imply that u{Bt) is a bounded martingale. So the martingale convergence theorem implies that as i —» oo, u(Bt) —+ Uoo- Since C/M is measurable with respect to the tail cr-field, it follows from (2.12) in Chapter 1 that Px(a < Uoo < b) is either = 0 or = 1 for any a < b. The last result implies that there is a constant c independent of x so that Px(Uoo = c) = 1. Taking expected values it follows that u(x) = ExUoo = c. Q Proof of (4.7c) Prom the proof of (4.7d) we see that u{Bt) is a bounded local martingale on [0, r) so UT = lim^r u(Bt). On {r < oo}, we have Ur = f(BT) so what we need to show is that there is a constant a independent of the starting point Bq so that UT = a on {r = oo}. Intuitively, this is a consequence of the triviality of the asymptotic cr-field, but the fact that 0 < Px(r = oo) < 1 makes it difficult to extract this from (2.12) in Chapter 1. To get around the difficulty identified in the previous paragraph, we will extend u to the whole space. The two steps in doing this are to (a) Let h{x) = u(x) - Ex(f(Br);r < oo). (4.6) and (4.5) imply that h is bounded and satisfies (4.1) with boundary function / = 0. (b) Let M = \\h\loo and look at w(x) = fh(x) + MPx(r= co) xeG K ' 10 xgG
Section 4.5 Poisson's Equation 151 To complete the proof now, we will show in four steps that w(x) = f3Px(r = oo), from which the desired result follows immediately. (i) When restricted to G, w satisfies (4.1) with boundary function / = 0. The proof of (4.7b) implies that MPx(t < oo) satisfies (4.1) with boundary function /=0. Combining this with (a) the desired result follows. (ii) w > 0. To do this we use the optional stopping theorem on the martingale h(Bt) at time tAt and note h(Br) = 0 to get h(x) = Ex(h(Bt);r > t) > -MPx(r > t) Letting t —* oo proves h(x) > —MPx(r = oo). (iii) w(Bt) is a submartingale. Because of the Markov property it is enough to show that for all x and t we have w(x) < Exw(Bt). Since w > 0 this is trivial if x ¢. G. To prove this for x £ G, we note that (i) implies Wt = w{Bt) is a bounded local martingale on [0, r) and WT = 0 on {r < oo}, so using the optional stopping theorem w(x) = Ex(WrM) = Ex(w(Bt); r > t) < Ex(w(Bt)) (iv) There is a constant /? so that w(x) = &Px{t = oo). Since w is a bounded submartingale it follows that as t —*■ oo, w(Bt) converges to a limit Woo- The argument in (4.7d) implies there is a constant /? so that Px{Woo = /?) = 1 for all x. Letting t —» oo in w(x) = Ex(w(Bi);r>t) arid using the bounded convergence gives (iv). D 4.5. Poisson's Equation In this section, we will see what happens when we add a function of x to the equation considered in the last section. That is, we will study: (5.1a) ueC2 and |Aw = -g in G. (5.1b) At each point of dG, u is continuous and u = 0. As in Section 4.2, we can add a solution of (4.1) to replace u = 0 in (5.1b) by u = /. As always, the first step in solving (5.1) is to find a local martingale.
152 Chapter 4 Partial Differential Equations (5.2) Theorem. Let r = inf{f > 0 : Bt ¢. G). If u satisfies (5.1a), then Aft = u(.Bt) + f g(B,)ds Jo is a local martingale on [0,r). Proof Applying Ito's formula as we did in the last section gives u(Bt) - u(B0) = f Vu(B3) -dB, + \ f Au(B,)ds Jo * Jo for t < t: This proves (5.2), since |Aw = —g and the first term on the right- hand side is a local martingale on [0,r). D The next step is to prove a uniqueness result. (5.3) Theorem. Suppose that G and g are bounded. If there is a solution of (5.1) that is bounded, it must be v(x) = Ex (J g(Bt)dtj Proof - If u satisfies (5.1a) then Mt denned in (5.2) is a local martingale on [0, r). If G is bounded, then (1.3) in Chapter 3 implies Exr < oo for all x £ G. If u and g are bounded then for t < r |M,|<|M|M+ry|oo Since the right-hand side is integrable, (2.7) in Chapter 2 and (5.1b) imply Mr = limMt = I g(Bt)dt u(x) = EXMQ = Ex(Mr) = v(x) a As usual, it is easy to show (5.4) Theorem. Suppose that G is bounded and g is continuous. If v G C2, then it satisfies (5.1a). Proof The Markov property implies that on {r > s}, Ex(JTg(Bi)dt T^j = J g(Bt)dt + v(B3)
Section 4.5 Poisson's Equation 153 The left-hand side is a local martingale on [0, r), so the right-hand side is also. If u G C2, then repeating the calculation in the proof of (5.2) shows that for «€ [O.t), v(B3) - v(Bo) + £ g(Br) dr = £ (±Av + g) (Br)dr + a local martingale The left-hand side is a local martingale on [0, r), so the integral on the right- hand side is also. However, the integral is continuous and locally of bounded variation, so by (3.3) in Chapter 2 it must be = 0. Since %Av + g is continuous in G, it follows that it is = 0 in G, for if it were ^ 0 at some point then we would have a contradiction. D After the extensive discussion in the last section, the conditions needed to guarantee that the boundary conditions hold should come as no surprise. (5.5) Theorem. Suppose that G and g are bounded. Let y be a regular point of dG. If xn £ G and xn —» y, then v(xn) —» 0. Proof We begin by observing: (i) It follows from (4.5a) that if e > 0, then PXn(r > e) -* 0. (ii) If G is bounded, then (1.3) in Chapter 3 implies C = supx Exr < oo and, hence, |H|co < C||ff||co < oo. Let e > 0. Beginning with some elementary inequalities then using the Markov property we have K*„)| < EXn HT ' \g(Bs)\ ds^j + EXn ( £ g(Bs) ds ; r > e) < e|M|oo + EIn(\v(Be)\; r>e)< e||ff||M + ||t;||ooP«b(t > e) Letting n —» oo, and using (i) and (ii) proves (5.5) since e is arbitrary. D Last, but not least, we come to the question of smoothness. For these developments we will assume that g is denned on Rd not just in G and has compact support. Recall we are supposing G is bounded and notice that the values of g on Gc are irrelevant for (5.1), so there will be no loss of generality if we later want to suppose that f g{x)dx = 0. We begin with the case d > 3, because in this case (2.2) in Chapter 3 implies ,co w(x) = Ex / \g(Bt)\ dt < Jo
154 Chapter 4 Partial Differentia.! Equations and, moreover, is a bounded function of x. This means we can define ,co w(x) = Ex / g(Bt)dt Jo use the strong Markov property to conclude w(x) = Ex f g(Bt)dt + Exw(Br) Jo and change notation to get (*) v(x) = w(x) - Exw(Br) (4.6) tells us that the second term is C°° in G, so to verify that ngC^we need only prove that w is, a task that is made simple by the fact that (2.4) and (2.3) in Chapter 3 gives the following explicit formula w(x) = Cd \x- y\2~dg(y) dy The first derivative is easy. (5.6a) Theorem. If g is bounded and has compact support, then w is C1 and there is a constant C which only depends on d so that IA-«<*)| < cy|M I Jh dy y <oo d-\ {g*0} \x ~ V\ Proof We will content ourselves to show that the expression we get by differentiating under the integral sign converges and leave it to the reader to apply (1.7) to make the argument rigorous. Now dax - vrd = (^) (x> - «)a)~*/a2(*« - 2/0 So differentiating under the integral sign DiW{x) = Cd(2 -d)J ^|f^ g(y) dy The integral on the right-hand side is convergent since /|^*"hsiM-L,i^<o°
Section 4.5 Poisson's Equation 155 As in Section 4.2, trouble starts when we consider second derivatives. If i^ j, then DtJ \x - y\2~d = (2 - d)(-d)\x - y\-d-\Xi - yi)(Xj - Vj) In this case, the estimate used above leads to \DIJ\x-y\*-d\<C\x-y\-d which is (just barely) not locally integrable. As in Section 4.2, if g is Holder continuous of order a, we can get an extra \x — y\a to save the day. The details are tedious, so we will content ourselves to state the result. (5.6b) Theorem. If g is Holder continuous, then w is C2. The reader can find a proof either in Port and Stone (1978), pages 116-117, or in Gilbarg and Trudinger (1977), pages 53-55. Combining (*) with (5.6b) gives (5.6) Theorem. Suppose that G is bounded. If g is Holder continuous, then v £ C2 and hence satisfies (5.1a). The last result settles the question of smoothness in d > 3. To extend the result to d < 2, we need to find a substitute for (*). To do this, we let w(x) = / G(x,y)g(y)dy where G is the potential kernel defined in (2.7) of Chapter 3, that is, G(xv)=l-^og(\x-y\) d=2 U{'V) \-\x-y\ d=l G was defined as / Jo {Pt(x,y)-at}dt where the at were chosen to make the integral converge. So if f gdx = 0, we see that I G(x,y)g(y)dy= lim Ex f g(Bt)dt J T->co JQ Using this interpretation of w, we can easily show that (*) holds, so again our problem is reduced to proving that w is C2, which is a problem in calculus. Once all the computations are done, we find that (5.6) holds in d < 2 and that in d = 1, it is sufficient to assume that g is continuous. The reader can find
156 Chapter 4 Partial Differential Equations details for d = 1 in (4.5) of Chapter 6, and for d > 2 in either of the sources cited above. 4.6. The Schrodinger Equation In this section, we will consider what happens when we add cu to the left-hand side of the equation considered in Section 4.4. That is, we will study (6.1a) ueC2 and |Aw + cu = 0 in G. (6.1b) At each point of dG, u is continuous and u = f. As always, the first step in solving (6.1) is to find a local martingale. (6.2) Theorem. Let r = inf{f > 0 : Bt ¢. G). If u satisfies (6.1a) then Mt = u(Bt)exp ( I c(B,)dsj is a local martingale on [0,r). Proof Let c% = /0 c(B3)ds. Applying Ito's formula gives u(Bt)exp(ct) - u(B0) = / exp(c,)Vu(i?,) • dB3 + / u(B3) exp(c,) dc, Jo Jo 1 I* + -/ Au(Ba)exp(ca)ds * Jo for t < t. This proves (6.2), since dc, = c(B3) ds, |A« + cu = 0, and the first term on the right-hand side is a local martingale on [0,r). D At this point, the reader might expect that the next step, as it has been five times before, is to assume that everything is bounded and conclude that if there is a solution of (6.1) that is bounded, it must be v(x) = Ex(f(BT)exp(cT)) We will not do this, however, because the following simple example shows that this result is false. Example 6.1. Let d = 1, G = (—a, a), c = j, and /=1. The equation we are considering is —u" + ju = 0 w(a) = u(—a) = 1
Section 4.6 The Schrodinger Equation 157 The general solution is A cos bx + B sin bx, where b = y/2-f. So if we want the boundary condition to be satisfied we must have 1 = A cos 6a + B sin 6a 1 = A cos(—6a) + B sin(—6a) = A cos 6a — B sin 6a Adding the two equations and then subtracting them it follows that 2 = 2 A cos 6a 0 = IB sin 6a From this we see that B = 0 always works and we may or may not be able to solve for A. If cos 6a = 0 then there is no solution. If cos 6a -^ 0 then u(x) = cos bx/ cos 6a is a solution. We will see later (in Example 9.1) that if a6 < w/2 then v(x) = cos bx/ cos 6a However this cannot possibly hold for a6 > tt/2 since v(x) > 0 while the right- hand side is < 0 for some values of x. O We will see below (again in Example 9.1) that the trouble with the last example is that if a6 > w/2 then c = 7 is too large, or to be precise, if we let v(x) = Exexp( c(B,)ds then w = 00 in (—a, a). The rest of this section is devoted to showing that if w =£ 00, then "everything is fine." The development will require several stages. The first step is to show (6.3a) Lemma. Let 0 > 0. There is a p. > 0 so that if H is an open set with Lebesgue measure \H\ < \i and tr = inf{t > 0 : Bt ¢ H) then sup Ex (exp(0Tff)) < 2 Proof Pick 7 > 0 so that eB^ < 4/3. Clearly, Fx(-TH >1)S]H (2*7)^ e dy S (2*T)d/2 " 4
158 Chapter 4 Partial Differentia.! Equations if we pick p. so that /*/(2tf-7)d/2 < 1/4. Using the Markov property as in the proof of (1.2) in Chapter 3 we can conclude that Px{rH > kj) = Ex(PBy(TH >(k- 1)T); rff > t) < \supPy(TH>(k-l)7) So it follows by induction that for all integers k > 0 we have sup Px (¾ > kj) < -£ Since ex is increasing, and e87 < 4/3 by assumption, we have CO Ex exp(^rff) < J2eM9jk)Px((k - 1)? < Th < k7) k=i <^r(^\ J_-±f_L_! i " i^i V3/ 4fc-i 3 £^ 3*-1 " 3 ' 1 - I Careful readers may have noticed that we left rff = 0 out of the expected value. However, by the Blumenthal 0-1 law either Px{tjj = 0) = 0 in which case our computation is valid, or Px{th = 0) = 1 in which case Ex exp(flrff) = 1. D Let c* = supx \c(x)\. By (6.3a), we can pick ro so small that if Tr = inf{t: \Bt - B0\ > r} and r < r0, then Ex exp(c*Tr) < 2 for all x. (6.3b) Lemma. Let 25 < r0. If D(x,28) C G and y G J5(s,5), then ™(y) < 2d+2iy(a;) The reason for our interest in this is that it shows w(x) < oo implies w(y) < oo foryeD(x,8). Proof If D(y, r) C G, and r < ro then the strong Markov property implies w(y) = Ey[exp(cTr)w(B(Tr))] < Ey[exp(c*Tr)w(B(Tr))] = £y[exp(cTr)] i w(z)Ti(dz)<2 I w(z)Ti(dz) JdD(y,r) JdD(y,r) where n is surface measure on dD(y, r) normalized to be a probability measure, since the exit time, Tr, and exit location, B(Tr), are independent.
Section 4.6 The Schrodinger Equation 159 If 5 < ro and D(y,6) C G, multiplying the last inequality by rd_1 and integrating from 0 to 5 gives 6d If —w{y)<2- — / w{z)dz d <*i JD(y,S) where <rd is the surface area of {x : \x\ = 1}. Rearranging we have r 8d (*) / w(z)dz>2-1—w(y) JD(y,S) ^o where C0 = d/(Td is a constant that depends only on d. Repeating the first argument in the proof with y = x and using the fact that ctt > —c*Tr gives w(x) = Ex[exp(cTr)w(B(Tr))] > Ex[exv(-c*Tr)w(B(Tr))] = Es[exp(-c*Tr)] J w(z)w(dz) Since 1/x is convex, Jensen's inequality implies Es[exp(-c*Tr)] > l/£s[exp(cTr)] > 1/2 Combining the last two displays, multiplying by rd_1, and integrating from 0 to 25 we get the lower bound (25) (z)> 2"1-— f w(z)dz ffd Jd(x,2S) d Rearranging and using D(x, 25) D D(y, 5), w > 0 we have C t (**) w{x)>2-lj-~ I w(z)dz where again C0 = d/ffd- Combining (**) and (*) it follows that w(x)>2-1-^-d f w(z)dz >2'1^-'rl^wiy) = 'r(d+2)w(y) D (6.3b) and a simple covering argument lead to
160 Chapter 4 Partial Differential Equations (6.3c) Theorem. Let G be a connected open set. If w ^ oo then w(x) < oo for all x £ G Proof Prom (6.3b), we see that if w(x) < oo, 25 < ro, and D(x, 25) C G, then w < oo on D(x,6). From this result, it follows that Go = {x : w(x) < oo} is an open subset of G. To argue now that Go is also closed (when considered as a subset of G) we observe that if 25 < ro, D(y, 35) c G, and we have xn £ Go with xn —» y £ G then for n sufficiently large, y £ D(xn, 5) and D(xn, 25) C G, so w(y) < oo. D Before we proceed to the uniqueness result, we want to strengthen the last conclusion. (6.3d) Theorem. Let G be a connected open set with finite Lebesgue measure, \G\ < oo. If w ^ oo then sup w(x) < oo x Proof Let K C G be compact so that \G— K\ < p. the constant in (6.3a) for 6 = c*. For each x £ K we can pick a 5X so that 25x < ro and D{x, 25x) C G. The open sets D(x, 5X) cover K so there is a finite subcover D(x{, 5Xj), 1 < i < I. Clearly, sup w(xi) < oo (6.3b) implies w(y) < 2d+2w(x{) for y £ D(xit 5Xj), so M = sup w(y) < oo If y £ H = G — K, then Ey(exp(c*TH)) < 2 by (6.3a) so using the strong Markov property w(y) = Ey (exp(cr„); th = r) + Ey (exp(cTH)w(BTH); BTH £ K) < 2 + MEy (exp(cTH); BTH £ K) < 2 + 2M D With (6.3d) established, we are now more than ready to prove our uniqueness result. To simplify the statements of the results that follow we will now list the assumptions that we will make for the rest of the section. (Al) G is a bounded connected open set. (A2) / and c are bounded and continuous.
Section 4.6 The Schrodinger Equation 161 (A3) w ¢. oo. (6.3) Theorem. If there is a solution of (6.1) that is bounded, it must be v{x) = Ex{f{Br)exV{cr)) Proof (6.2) implies that M, = u(B3) exp(cj) is a local martingale on [0,r). Since /, c, and u are bounded, letting s f r A t and using the bounded convergence theorem gives u(x) = Ex(f(Br)exp(cry,T<t)+Ex(u(Bt)exp(ct);T>t) Since / is bounded and w(x) = £'xexp(cr) < oo, the dominated convergence theorem implies that as t —» oo, the first term converges to Ex(f(Br) exp(cT)). To show that the second term —» 0, we begin with the observation that since {r > t} £ Tt, the definition of conditional expectation and the Markov property imply Ex{u{Bt) exp(cr); r > t) = Ex(Ex(u(Bt) exp(cr)|^); r>t) = Ex(u(Bt) exv(Ci)w(Bt); r>t) Now we claim that for all y £ G w(y) > exv{-c*)Py{r < 1) > e > 0 The first inequality is trivial. The last two follow easily from (Al). See the first display in the proof of (1.2) in Chapter 3. Replacing w(Bi) by e, Ex(\u(Bt)\exv(cty,T > t) < r1Ex(\u(Bt)\exp(cr);T> t) < e-1||w||co£'x(exp(cr); r > t) — 0 as t —» oo, by the dominated convergence theorem since w(x) = Ex exp(cr) < oo and Px(t < oo) = 1. Going back to the first equation in the proof, we have shown u(x) = v(x) and the proof is complete. D This completes our consideration of uniqueness. The next stage in our program, fortunately, is as easy as it always has been. Recall that here and in what follows we are assuming (A1)-(A3). (6.4) Theorem. If v £ C2, then it satisfies (6.1a) in G.
162 Chapter 4 Partial Differentia.! Equations Proof The Markov property implies that on {r > s}, Ex{exv{cT)f{Br)\T,) = exp(Ci)£B,(exp(cr)/(5r)) = exp(c,)v(Bs) The left-hand side is a local martingale on [0, r), so the right-hand side is also. If v G C2, then repeating the calculation in the proof of (6.2) shows that for «€[0,1-), v(B3) exp(c,) - v(BQ) =/( 2 Av + cv) (Br) exp(cr)dr + a local martingale The left-hand side is a local martingale on [0, r), so the integral on the right- hand side is also. However, the integral is continuous and locally of bounded variation, so by (3.3) in Chapter 2 it must be = 0. Since v G C2 and c is continuous, it follows that ^Av + cv = 0, for if it were ^ 0 at some point then we would have a contradiction. D Having proved (6.4), the next step is to consider the boundary condition. As in the last two sections, we need the boundary to be regular. (6.5) Theorem, v satisfies (6.1b) at each regular point oidG. Proof Let y be a regular point of dG. We showed in (4.5a) and (4.5b) that if xn -* y, then PXn(r < 6) -* 1 and PXn{Br £ D{y,6)) -* 1 for all 6 > 0. Since c is bounded and / is bounded and continuous, the bounded convergence theorem implies that EXn(exv(cr)f(Br);T< l)-/(y) To control the contribution from the rest of the space, we observe that if \c\ < M then using the Markov property and the boundedness of w established in (6.3d) we have EXn(exv(cr)f(Br);T > 1) < eM\\f\\00EXn(w(B1);r> 1) <eM||/||oo||U;||ooPx„(r>l)-0 D This brings us finally to the problem of determining when v is smooth enough to be a solution. We use the same trick used in Section 4.3 to reduce
Section 4.6 The Schrodinger Equation 163 to the previous case. We begin with the identity established there, which holds for all t and Brownian paths w and, hence, holds when t = r(w) exp ( I c(B,)ds] =1+/ c(B,)exp ( j c(Br)drj ds Multiplying by f(Br) and taking expected values gives v(x) = Exf(Br) + J Ex (c(B3) exp (J c(Br)drj f(Br)l(3<r)J ds Conditioning on Ts and using the Markov property, we can write the above as v(x) = Exf{Br) + / Ex(c{B3)v(Bs); r>s)ds Jo = Vi(x) + Vn(x) The first term, vi(x), is C00 by (4.6). The second term is M*) = E* (J c(Bs)v(Bs)ds) so if we let #(z) = c(x)v(x) then we can apply results from the last section. If c and / are bounded and w ^ oo, then v is bounded by (6.3d), so it follows from results in the last section that vn is C1 and has a bounded derivative. Since ui G C°° and G is bounded, it follows that v is C1 and has abounded derivative. If c is Holder continuous, then g(x) = c(x)v(x) is Holder continuous, and we can use (5.6b) from the last section to conclude v2 G C2 and hence (6.6) Theorem. If in addition to (A1)-(A3), c is Holder continuous, then tgC2 and, hence, satisfies (6.1a). C. Applications to Brownian Motion In the next three sections we will use the p.d.e. results proved in the last three to derive some formulas for Brownian motion. The first two sections are closely related but the third can be read independently.
164 Chapter 4 Partial Differential Equations 4.7. Exit Distributions for the Ball In this section, we will use results for the Dirichlet problem proved in Section 4.4 to find the exit distributions for D = {x : \x\ < 1}. Our main result is (7.1) Theorem. If / is bounded and measurable, then (*) Exf(Br)= [ I^Ml/O,)^) JdD \x-y\ where -k is surface measure on dD normalized to be a probability measure. Proof An application of the monotone class theorem shows that if (*) holds for / G C°° it is valid for bounded measurable /. In view of (4.3), we can prove (*) for / G C°° by showing that if ky(x) = (1 - |z|2)/|z - y\d and v(x) = J SaDh{*)f{yWy) ze-D f(x) xedD then v solves the Dirichlet problem (4.1): (7.2a) In D, v G C2 and Av = 0. (7.2b) At each point of dD, v is continuous and v(x) = f(x). The first, somewhat painful, step in doing this is to show (7.3) Lemma. If y G dD then Aky = 0 in D. Proof To warm up for this, we observe A \x - y\P = Di (J2(XJ - yjY-f2 = p\x - yr\Xi - yi) i so taking p = — d we have Diky{x) = (-2,0 - ^p + (1- M2) - -^4. Differentiating again and using our fact with p = — (d+ 2) gives DUky{x) = (-2) ■ r-^-r, + 2 ■ (-2,,-) -^--^1 \d+2 (d(d+2)(xi-yi)* d_ { ' Nl \*-y\d+i \*-y\°
Section 4.7 Exit Distributions for the Ball 165 Summing the last expression on i gives a i / x — 2d . , I2I2 — x ■ y Aky(x) = - + Ad ■ Y irfxf y \x-y\d \x-y\d+2 { ' Nll*-y|d+2 l*-y|d+2 _ -2d\x-y\2 4d(\x\2 -x-y) (1-^)-2d \x-y\d+2 \x-y\d+2 + \x-y\*+2 If we replace the 1 by |y|2, the expression collapses AM«) = |^5+J H*-»I2+2H2-2*. ^,+ 1^-1^) = 0 since \x - y\2 = (x - y) ■ (x - y) = \x\2 -2x-y + \y\2. Inside the open set D, v is a linear combination of the ky. So bringing the differentiation under the integral (and leaving it to the reader to justify this using (1.7)) gives Av(x)= f w(dy)f(y) Aky(x)=0 JdD JdD Thus, v satisfies (7.2a). To check (7.2b), the first step is to show 1 - i*i' 'w=If^*(*)s1 This is just "calculus," but we prefer to use a soft noncomputational approach instead. We begin by observing that 1(0) = 1, J is invariant under rotations, and A J = 0 in D. (For the last conclusion apply the result for Av with /=1.) To conclude 1=1 now, let x G D with \x\ = r < 1, and let r = inf{t : \Bt\ > r). Applying (1.2) of Chapter 3 with G = 13(0, r) now shows 1(0) = EQI(Br) = I(x) where the second equality follows from invariance under rotations. To show that v(x) —* f(y) asj-»t/£ dD, we observe that if 2 7^ y, u( ^ i-H2 0
166 Chapter 4 Partial Differentia.! Equations From the last calculation it is clear that if 5 > 0, then the convergence is uniform for z G B0{6) = dD- D{y, 6). Thus, if we let B^S) = dD- B0(6), then f kz(x)ir(dz) -+0 and f kz(x)ir(dz) < 1 JB0(S) Jb^s) since I(x) = 1. To prove that v(x) —» f(y) now we observe H*)-M\ = \J h(x)f(z)w(dz) - f(y) J kz(x)w(dz) <2||/||oo/ kz(x)ir(dz)+ sup 1/(^)-/(2/)1 •AB0(4) z6B!(4) The first term —» 0 as a; —» y, while the sup in the second is small if 6 is. This shows that (7.2b) holds and completes the proof of (7.1). D The derivation given above is a little unsatisfying, since it starts with the answer and then verifies it, but it is simpler than messing around with Kelvin's transformations (see pages 100-103 in Port and Stone (1978)). It also has the merit of explaining why ky(x) is the probability density of exiting at y: ky is a nonnegative harmonic function that has ky (0) = 1 and ky (x) —+ 0 when x —» z G dD and z ^y. Exercise 7.1. The point of this exercise is to apply the reasoning of this section to r = inf{f :Bt<£H} where H = {(x, y) : x £ Rd_1, y > 0}. For 6 G Iid_1 let h8(x>v)= (\x-o\* + y*y/* where Cd is chosen so that f d6hg(0,1) = 1 and let u(x,y) = Jd9hB(x,y)f(d,0) where / is bounded and continuous. (a) Show that Ahg = 0 in if and use (1.7) to conclude Aw = 0 in if. (b) Show I(x,y) = fd6hg(x,y) = 1. (c) Show that if xn —» x, yn —* 0 then u(xn, yn) —* f(x, 0). (d) Conclude that E{x>y)f(BT) = fd0he(x,y)f(0,0).
Section 4.8 Occupation Times for the Ball 167 4.8. Occupation Times for the Ball In the last section we considered how Bt leaves D = {x : \x\ < 1}. In this section we will investigate how it spends its time before it leaves. Let r = inf {t: Bt $ D} and let G(x,y) be the potential kernel denned in (2.7) of Chapter 3. That is, \-\x-y\ d=l G(x,y)={ -llog\x-y\ d = 2 {Cd\x-y\2-d d>3 where Cd = T(d/2 - 1)/2^2. (8.1) Theorem. If g is bounded and measurable then Ex J g(Bt) dt = Jgd(x, y)g(y) dy where GD(x,y) = G(x,y)-Jj^^G(z,y)7r(dz) Proof Combine (*) in Section 4.5 with (7.1). D We think of Gi)(x,y) as the expected amount of time a Brownian motion starting at x spends at y before exiting G. To be precise, if A C G then the expected amount of time Bt spends in A before exiting G is JAGo(x,y)dy. Our task in this section is to compute GD(x,y). In d = 1, where D = (— 1,1), (8.1) tells us that GD(x,y) = G(x,y)-^±±G(l,y) - i^G(-l,y) = -\*-y\ + ^1(i-y) + ^(y + i) = -\x-y\ + 1-xy Considering the two cases x > y and x < y leads to (8.2) GD(x,y)-^{1_y){1 + x) _1^x<y<1 Geometrically, if we fix y then x —* Gi)(x,y) is determined by the conditions that it is linear on [0,y] and on [y, 1] with Go(0,y) = Gx)(l,y) = 0, and GD(y,y) = l-y2.
168 Chapter 4 Partial Differential Equations In d > 2, (8.1) works well when y = 0 for then G(z,0) is constant on {\z\ = 1} and the expression in (8.1) reduces to (8.3) 0^-,0) = 0(.,0)-^,0) = (^1^ ddl\ To get an explicit formula for Gd (x, y) in d > 2 for y ^ 0 we will cheat and look up the answer. This weakness of character has the advantage of making the point that Go(x,y) is nothing more than the Green's function for D with Dirichlet boundary conditions. Folland (1976), page 109, defines the Green's function for D to be the function K (x, y) on Dx D determined by the following properties: (A) For each y £ D, K(-,y) — G(-,y) is C2 and harmonic in D. (B) For each y G D, if xn -* x G dD, K(xn, y) -* 0. Remark. For convenience, we have changed Folland's notation to conform to ours and interchanged the roles of x and y. This interchange makes no difference, since the Green's function is symmetric, that is, K(x,y) = K(y,x). (See Folland (1976), page 110.) (8.4) Lemma. The occupation time density Gq is equal to the Green's function K denned above. Proof Using (8.1) and (7.1) we have GD(x,z) - G(x,z) = - f i—!£JlG(», z)ir(dy) = -ExG(Br,y) JdD \x ~ v\ This is harmonic in D by (4.6). (4.5) implies if xn —* x G dD, then EXnG(Br,y)^G(x,y) D Having made the connection in (8.4), we can now find G# by "guessing and verifying" functions with properties (A) and (B). To "guess," we turn to page 123 in Folland (1976) and find (8.5) Theorem. In d > 3, if 0 < |y| < 1 then GD(x,y) = G(x,y)-\y\2-dG(x,y/\y\2)
Section 4.0 Lap/ace Transforms, Arcsine Law 169 Proof To check (A) and (B), we observe that if we let Ax denote the Laplacian acting in the x variable for fixed y then (*) AxG(z,y) = 0 forz^y (a) Since y/|y|2 ¢. D, (*) implies that the second term is harmonic for x £ D. (b) Let x £ dD. Clearly, the right-hand side is continuous at x. To see that it vanishes, we note G(x,y)-\y\"~dG(x,y/\y\2) = Cd Cd |*-y|d-2 Cd \y\ d-2 \y\* cd \x-y\d~2 Myl-yM-T-2 The last equality follows from a fact useful for the next proof: (8.6) Lemma. If \x\ = 1 then |z|y| — y|y|-1| = \x — y\. Proof Using \z\2 = z ■ z and then |z|2 = 1 we have \x\y\-y\y\-l\2 = \x\2\y\2-2x-y + l = |y|2-2z-y+|z|2 = |z-y|2 -(d-2) = 0 D D Again turning to page 123 of Folland (1976) we "guess" (8.7) Theorem. In d = 2 if 0 < |y| < 1 then GD{x,2/) = — (In \x-y\- In \x\y\ - y|y|_11) Proof Again we need to check (A) and (B). (a) GD{x,y)-G{x,y) = ^(\n I2/I2 + ln|y| Again the first term is harmonic by (*) since y/|y|2 ¢ D. The second term does not depend on x and hence is trivially harmonic, (b) Let x £ dD. Clearly, the right-hand side is continuous at x. The fact that it vanishes follows immediately from (8.6). D
170 Chapter 4 Partial Differential Equations 4.9. Laplace Transforms, Arcsine Law In this section we will apply results from Section 4.6 to do three things: (a) Complete the discussion of Example 6.1. (b) Prove a remarkable observation of Ciesielski and Taylor (1962) that for Brownian motion starting at 0, the distribution of the exit time from D = {x : \x\ < 1} in dimension d is the same as that of the total occupation time of D in dimension d + 2. (c) Prove Levy's arcsine law for the occupation time of [0, oo). The third topic is independent of the first two. Example 9.1. Take d = 1, G = (—a, a), c(x) = j > 0, and / = 1 in the problem considered in Section 4.6. (9.1) Theorem. Let r = inf{f : Bt <£ (-a, a)}. If 0 < 7 < ir/8ar, then cos(a^/2j) If 7 > ;r2/8a2 then Exe^T = 00. Proof By Example 6.1, u(x) = cos(z -(/27)/ cos(a -(/27) is a nonnegative solution of -u" + -fu = 0 u(—a) = u(a) = 1 To check that w ^ 00, we observe that (6.2) implies that Mt = w(5t)e7< is a local martingale on [0,r). If we let Tn be a sequence of stopping times that reduces Mt, then u(x) = Ex («(BT„An)e*T»A">) Letting n —* 00 and noting Tn A n f r, it follows from Fatou's lemma that u(x) > Exe^r The last equation implies w ^ 00, so (6.3) implies the result for 7 < ;r2/8a2. To see that w(x) = 00 when 7 > Tr~/8a2, suppose not. In this case (6.3) implies v(x) = Ex(f(Br)exp(cr)) is the unique solution of our equation but
Section 4.9 La.pla.ce Transforms, Arcsine Law 171 computations in Example 6.1 imply that in this case there is no nonnegative solution. D Exercise 9.1. Show that if/? > 0 then , n x cosh(z J2P) Ex exp(-/?r) = ) vJ-L K } coshCav^ Since cos(z) = (e'z + e~'z)/2 this is what we get if we set 7 = —/? in (9.1). Example 9.2. Our second topic is the observation of Ciesielski and Taylor (1962). The proof of the general result, see Getoor and Sharpe (1979), sections 5 and 8, or Knight (1981), pages 88-89, requires more than a little familiarity with Bessel functions, so we will only show that the distribution of the exit time from (—1,1) in one dimension starting from 0 is the same as that of the total occupation time of D = {x : \x\ < 1} in three dimensions starting from the origin. Exercise 9.1 gives the Laplace transform of the exit time from (—1,1) so we need only compute v(x) = Ex exp (-P J lD{Bt)dt Of course, we only have to compute t)(0) but as in Example 9.1, and Exercise 9.1, we will do this by using a differential equation to compute v(x) for all x. Several properties of v are immediately obvious (in any dimension d > 3): (i) Spherical symmetry implies v(x) = /(|z|). (ii) Using the strong Markov property at the hitting time of D and (1.12) from Chapter 3 gives for r > 1 (a) /(r) = r2"d/(l) + (1- r2"d) -1=1 + r2"d ■ (/(1) - 1) (iii) The strong Markov property and (6.3) imply that v is the unique solution °f 1 -At; — fiv = 0 for x G D v(x) = /(1) for xedD To express the last equation in terms of / we note that if v(x) = f(\x\) then DiV(x) = f'(\x\)^ ^(-) = ^(1-1)(^-^)+/-(1-1)]¾
172 Chapter 4 Partial Differentia.! Equations Summing over i gives |A,=|ni,i)+^/'(N) Multiplying by 2 we see that / satisfies (b) /» + —/'(r)-2/?/(r) = 0 for r < 1 r To tie equations (a) and (b) together, we note that by applying the reasoning in (iii) to D(0, q) with q > 1 and using the proof of (6.6) (which refers us to (5.6a)) we can conclude (iv) v is C1 and hence f'(r) is continuous at r = 1. Facts (ii)-(iv) give us the information we need to solve for / at least in principle: (b) is a second order ordinary differential equation, so if we specify /(0) = C and /'(0) = 0, the latter from (i), there is a unique solution jc on [0,1]. Given /c(l), (") gives us the solution on [l,oo). We then pick C so that (iv) holds. To carry out our plan, we begin with the "well known" fact that the only solutions to.(b) which stay bounded near 0 have the form cr1-(d/2)I^d/2)-i(rVW) where ^ (z/2)"+»» m=0 v ' is one of the happy families of Bessel functions. Letting a = y/2]3 and recalling T(z) = (x — l)r(z — 1), we see that when d = 3 and hence v = 1/2 the solution has a simple form v ' ^ (2m+1)! ar K ' m=0 v ' Having "found" the solution we want, we can now forget about its origins and simply check that it works. Differentiating we have /'(r) = ^cosh(ar) - ^sinh(ar) r ar- tut \ C(a)a • x.i ^ 2C(a) 2C(a) .,,, f"(r) = snih(ar) 5-1 cosh(a7-) -\ ^ sinh(ar) r r~ ar /"('■) + -f'(r) = ^^sinh(ar) = a2f(r) r r
Section 4.9 Lap/ace Transforms, Arcsine Law 173 which shows that (b) holds. To complete the solution, we have to pick C(a) to make / G C1. Using formulas for f'(r) for r < 1 from the previous display, for r > 1 from (a), and recalling d = 3, we have /'(1-) = C(a) cosh(a) - 0(0.)^1 sinh(a) /'(1+) = -{/(1) - 1} = 1 - 0(0)^1 sinh(a) Solving gives C(a) = l/cosh(a). Since a = yfifi this matches the result in Exercise 9.1 when x = 0. D Example 9.3. Our third and final topic is Kac's (1951) derivation of (9.2) Levy's arcsine law. Let Ht = \{s € [0,t] : Bs > 0}|. If 0 < B < 1 then P,{Ht <et)=\ f J?_ = I arcsin(V^) Remark. The reader should note that by scaling the distribution of Ht/t does not depend on t, so the fraction of time in [0,t] that a Brownian gambler is ahead does not converge to 1/2 in probability as t —> oo. Indeed r = 1/2 is the value for which the probability density l/TTy/r(l — r) is smallest. Proof For the last equality see the other arcsine law, (4.2) in Chapter 1. To prove the first we start with (9.3) Lemma. Let c(x) = —a — /?l[0jOO)(z) with a, /? > 0. Suppose v is bounded, C1, and satisfies -At; + cv = — 1 for all x^O. Then I dt e-atExe-pH' o Proof Our assumptions about v imply that v"(x) = —2(1 + c(x)v(x)) dx in the sense of distribution. So two results from Chapter 2, the Meyer-Tanaka formula, (11.4), and (11.7) imply v(Bi)-v(BQ)= f v'(Bs)dBs- [ (l + c(Bs)v(B,))ds Jo Jo
174 Chapter 4 Partial Differential Equations Letting ct = J0 c(Bs)ds and using the integration by parts formula, (10.1) in Chapter 2, with Xt = v(Bt) and Yt = exp(ct), which is locally of bounded variation, we have v(Bi)exp(ci)-v(BQ)= f exv(cs)v'(Bs)dBs Jo - f exv(cs){l + c(Bs)v{Bs)}ds Jo + / v(Bs)exp(cs)dcs Jo So Mi = v(Bi) exp(cj) + fQ exp(c5) ds is a local martingale. Since v is bounded and ct < 0, Mt is bounded. As t —* oo, exp(ct) < e~at —* 0, so using the martingale and bounded convergence theorems gives f°° v(x) = EXMQ = ExMoo = Ex ex-p(cs)ds Jo Plugging in the definition of c(x) now and using Fubini's theorem leads to the formula given above. D (9.3) tells us that we want to find a bounded C1 function v with (a + P)v = -v" + 1 x > 0 av = -v" + 1 x < 0 To solve the equation jv = ^v" + 1 we write v = vq + v\ where V\(x) = 1/j and note that 1 „ lvo =2¾ so we have vq(x) = Ce±xv^. Picking the signs to keep v bounded, we have .)={Ae-'yfiW> + -L- x>0 |5exv^ + l x<0 To find A and B we note that we want v to be C1 so A+—— = B+- a + p a -Ay/2(a + /3) = BV2^
Section 4.9 Laplace Transforms, Arcsine Law 175 It is a little tedious to solve these equations but it is easy to check that (a + p)y/El ccyfcT+p gives a solution. What we are interested in is 1 1 (9.4) w(0) = A + cc + P ^a(a + l3) To go backwards from here to the answer written in (9.2), we warm up by changing variables x = y/2yi, dx = (l/2)\/2j/tdt to get r^r--"1^* This identity and Fubini's theorem imply -(a+p)s /-oo -a(t-s) 1 1 1 /" e-(a+P)' r°° e-aV y/ec + P y/a~ tt J0 y/s Js t- Jo * Jo y/s(t - s) dtds s dsdt From (9.3), (9.4), and the uniqueness of Laplace transforms it follows that 1 /"* e'P' E0 exp(-0Ht) = - / , t ds dt T JO \/s(t - S) and invoking the uniqueness again we have the desired result. D
5 Stochastic Differential Equations 5.1. Examples In this chapter we will be interested in solving stochastic differential equations (or SDE's) that are written informally as dX, = b(X,)da + <r(Xt)dBt or more formally, as an integral equation: (*) Xt = X0 + f b(Xs) ds+ f cr(Xs).dB3 Jo Jo We will give a precise formulation of (*) at the beginning of Section 5.2. In (*), Bt is an m dimensional Brownian motion, a is a d x m matrix, and to make the matrix operations work out right we will think of X, b and B as column vectors. That is, they are matrices of size d x 1, d x 1, and m x 1 respectively. Writing out the coordinates (*) says that for 1 < i < d Xl = X'Q+ / b;(Xs)ds + J2 / <rij(X3)dBi Jo -=1 Jo Note that the number m of Brownian motions used to "drive" the equation may be more or less than d. This generalization does not cause any additional difficulty, is useful in some situations (see Example 1.5), and will help us distinguish between which is d x d and <rTa which ismxm. Here <rT denotes the transpose, of the matrix a. To explain what we have in mind when we write down (*) and why we want to solve it, we will now consider some examples. In some of the later examples we will use the term diffusion process. This is customarily defined to be a "strong Markov process with continuous paths." Ilowever, in this section,
178 Chapter 5 Stochastic Differential Equations the term will be used to mean "a family of solutions of the SDE, one for each starting point." (4.6) will show that under suitable conditions the family of solutions defines a diffusion process. Example 1.1. Exponential Brownian Motion. Let Xt = Xq exp([it + <rBt) where Xq is a real number and Bt is a standard one dimensional Brownian motion. Using Ito's formula (1.1) Xt = X0 + f iiX, ds+ f <tXs dBs+\ f o2Xs ds Jo Jo * Jo so Xt solves (*) with 6(z) = (/* + (cr2/2))z and <x(x) = ax. Exponential Brownian motion is often used as a simple model for stock prices because it stays nonnegative and fluctuations in the price are proportional to the price of the stock. To explain the last phrase, we return to the general equation (*) and suppose that 6 and a are continuous. If Xq = z then stopping at t ATn for suitable stopping times Tn f oo, taking expected values in (*) and letting n-+oowe have EX\ = x( +E I bi(Xs) ds Jo So if 6,- is continuous Because of the last equation, 6 is called the infinitesimal drift. To give the meaning of the other coefficient we note that if a(x) = cr(z)crT(z) then the formula for the covariance of two stochastic integrals, (8.7) in Chapter 2, implies (X\Xl)t = J2J cr!k(Xs)crjk(Xs)ds = J aij(Xs)ds Using the integration by parts formula X\X{ = zV + I X{ dX\ + f X\ dX{ + (X\Xi)t Jo Jo Substituting dXks = bk(X,)ds + J2<rkj(Xs) dBl i
Section 5.1 Examples 179 using the associative law, and taking expected values /•* . E(X\X{) = zV + E / X'MX.)ds Jo + E f Xibj(Xs)ds + E f a{j{Xs)ds Jo Jo If ay is also continuous, differentiating gives jt E(X'tXi)\t=o = xH;(x) + x%(x) + aij(x) Using EX\ = x' + E fQ bi(Xs) ds and differentiating again we get jt (EX\EX{) \t=Q = xHiix) + x%(x) Subtracting the last two equations gives jt [e{X\x{) - EX\EX{) [=q = aij(x) justifying the name infinitesimal covariance. When d = 1, we call a(x) the infinitesimal variance and <r{x) = y/a(x) the infinitesimal standard deviation. In Example 1.1, a(x) = ax is proportional to the stock price x. The infinitesimal drift 6(z) = (fi + (cr2/2))z is also proportional to x. Note, however, that in addition to the drift fj. built in to the exponential there is a drift cr2z/2 that comes from the Brownian motion. More precisely it comes from the fact that ex is convex and hence exp(cr51) is a submartingale. Example 1.2. The Bessel Processes. Let Wt be a d-dimensional Brownian motion with d > 1, and Xt = \Wt\. Differentiating gives ^ii 1 2z; „ . . 1 z? ... d— 1 So using Ito's formula
180 Chapter 5 Stochastic Differential Equations We will use Bt to denote the first term on the right-hand side, since Bt is a local martingale and (B)t = t, i.e., Bt is a Brownian motion by Levy's theorem, (4.1) in Chapter 3. Changing notation now we have (1.2) Xt - XQ = Bt + J ^ ds so (*) holds with b(x) = (d- l)/2|z| and <r(x) = 1. Example 1.3. The Ornstein Uhlenbeck Process. Let Bt be a one dimensional Brownian motion and consider (1.3) dXt = -aXt dt + o- dBt which describes one component of the velocity of a Brownian particle which is slowed by friction (i.e., experiences an acceleration —a times its velocity Xt). This is one case where we can find the solution explicitly. (1.4) Xt = e~ai (xq + J ea'adB^\ To check this informally, we note that differentiating the first term in the product gives ~aXtdt and differentiating the second gives adBt. To check this formally, we use Ito's formula with f{u, v) = e-auv Ut = t Vt=XQ+ f ea'<rdBs Jo to conclude Xt-X0= f -aX,ds+ f e-a'{ea'a)dBs Jo Jo which is (1.3) in integral form. From the representation in (1.4) we get the following useful fact (1.5) Theorem. When Xq = x, the distribution of Xt is normal with mean xe~ai and variance a2 fQ e~2ar dr. Proof To see that the integral has a normal distribution with mean 0 and the indicated variance we use Exercise 6.7 in Chapter 2. Adding e~atXo = e~atx now we have the desired result. D Example 1.4. The Kalman-Bucy Filter. Consider dYt = Vtdt + adWt ^ ' ' dVt = -cVtdt + <rdBt
Section 5.1 Examples 181 Here Vj is an Ornstein-Uhlenbeck velocity process, and Yt is an observation of the position process subject to observation error described by a dWt, where W is a Brownian motion independent of B. The fundamental problem here is: how does one best estimate the present velocity Vt from the observation {Ys : s < t}? This is a problem of important practical significance but not one that we will treat here. See for example, Rogers and Williams (1987), p. 327-329, or 0ksendal (1992), Chapter VI. Example 1.5. Brownian Motion on the Circle. Let Bt be a one dimensional Brownian motion, let Xt = cos Bt and Yt = sin Bt. Since X? + Y? = 1, (Xt,Yt) always stays on the unit circle. Ito's formula implies that dXt = -Yt dBt - \xt dt (1.7) \ dYt = Xt dBt - 2y< dt To write this in the form (*) we take •<■■»>-(:$) «■■»>=(7) Note that a is always a unit vector tangent to the circle but we need the drift b(x,y), which is perpendicular to the circle and points inward, to keep (Xt, Yt) from flying off. Example 1.6. Feller's Branching Diffusion. This Xt > 0 and satisfies (1.8) dXt = pXt dt + <Ty/XtdBt To explain where this equation comes from, we recall that in a branching process, each individual in generation m has an independent and identically distributed number of offspring in generation m + 1. Consider a sequence of branching processes {Z^,m > 0} in which the probability of k children is p£ and suppose (Al) the mean of pi is 1+ (/?„/n) with /?„ -* /? (A2) the variance of p£ is o\ with <rn —» a2 > 0 (A3) for any 5 > 0, £t>*„*2l>Z-0. Since the individuals in generation 0 have an independent and identically distributed number of children £(^1^=1^) = 1^.(1 + 0 var (Z? \Z£ = nx) = nx ■ al
182 Chapter 5 Stochastic Differential Equations So if we let X? = Z£ni]/n then E(x?,n-x\xn{Q) = x)=xl3n.± var (X?,n\x»{0) = x)=x<%-± The last two equations say that the infinitesimal mean and variance of the rescaled process X" are z/?n and xo\. These quantities obviously converge to those of the process Xt in (1-8). In Example 8.2 of Chapter 8, we will show that under (A1)-(A3) the processes {X",t > 0} converge weakly to {Xt,t > 0}. Example 1.7. Wright-Fisher Diffusion. This Xt G [0,1] and satisfies (1.9). dXt = (-0¾ + /?(1 - Xt)) dt + y/Xt{\ - Xt) dBt The motivation for this model comes from genetics, but to avoid the details of that subject we will suppose instead we have an urn with n balls in it that may be labelled A or a. To build up the urn at time m + 1 we sample with replacement from the urn at time m. However, with probability a/n we ignore the draw and place an a in, and with probability /?/n we ignore the draw and place an A in. In genetics terms the last two events correspond to mutations. Let Zm be the number of A's in the urn at time m. Now when the fraction of j4's in the urn at time 0 is x, the probability of drawing an A on a given trial is pn = x.(l--)+(l-x).£ V n/ n Since we are sampling with replacement, E(Z?\ZZ = nx) = npn var (Z? \ZZ = nx) = npn(l - pn) If we let Xtn = Z£ni]/n then E (X?,n - x | X"(0) = x)= {-ax + /?(1 -x)}-± var (x»/n| X"(0) = x) = p„(l - pn) ■ ± Now j)„-4iasn-+ooso again the infinitesimal mean and variance of X" converge to those of Xt in (1.9) but the rigorous connection has to wait until Example 8.3 of Chapter 8. This is just one of many of the diffusions that can
Section 5.2 Ito's Approach 183 arise from genetics. See Karlin and Taylor (1981), vol. II, pages 176-191 for some of the others. 5.2. Ito's Approach We begin this section by giving some essential definitions and introducing a counterexample that will explain the need for some of the formalities. We will then proceed to our main business: the first existence and uniqueness result for SDE, which was proved by K. Ito long before the subject became entangled in the complications of the general theory of processes. To finally become precise, a solution to our SDE (*) is a triple (Xt, Bt, Tf) where Xt and Bt are continuous processes adapted to a filtration Tf so that (i) Bt is a Brownian motion with respect to Tf, i-e., for each s and t, Bt+S —Bt is independent of Tf and is normally distributed with mean 0 and variance s. (ii) Xt satisfies (*) Xt=XQ+ f b(Xs) ds + f cr(Xs) dBs Jo Jo We will always assume that a and 6 are measurable and locally bounded so the integral exists. JC Ft It is easy to see that if the SDE holds for some Tf it will hold for Tt ' the filtration generated by (Xt,Bt). When Xt is adapted to Tf the filtration generated by the Brownian motion, then we can take T% = Tf and X% is called a strong solution. It is difficult to imagine how Xt could satisfy (*) without being adapted to Tf but there is a famous example due to H. Tanaka showing that this can happen. Example 2.1. Let Wt be a one dimensional Brownian motion with Wo = 0 and let Bt= [ sga(Ws)dWs Jo where sgn(a;) = 1 if x > 0, sgn(a;) = —1 if x < 0. Since Bt is a local martingale with (B)t = t, (4.2) in Chapter 3 implies that Bt is a Brownian motion with respect to T™, the filtration generated by Wt. Since sgn(z)2 = 1, the associative law implies that f sga(W3)dB3= f dW. = Wt Jo Jo So W is a solution of (*) with 6 = 0, <r(x) = sgn(z), and Tf = T™ = Tt ' ■
184 Chapter 5 Stochastic Differential Equations To see that W is a not a strong solution we apply the Meyer-Tanaka formula, (11.4) in Chapter 2, with Xt = Wt and f(x) = \x\ to get \Wt\-L°t= f sgn(Ws)dWs=Bt Jo where L° is the local time of \Wt\ at 0. The occupation time density formula in (11.7) of Chapter 2, and the continuity of the local time established in (11-8) of Chapter 2, imply that YeJolUw.l<^ds=leJ_^da^L0i as e —» 0 So L° and hence Bt = \Wt\ — L° is measurable with respect to the filtration generated by \Wt\, t > 0. However, Wt is clearly not adapted to the filtration generated by \Wt\. O As usual when we have an equation, we are interested in having a unique solution. For SDE there are several notions of uniqueness. Our first, pathwise uniqueness holds if whenever Xt and X't are solutions of (*) with Xq = X'Q = x driven by the same Brownian motion Bt, then with probability one Xt = X't for all t >0. Here, the nitrations T% and T[ are allowed to be different. (2.1) Theorem. Pathwise uniqueness does not hold in Example 2.1 Proof We observed in Example 2.1 that W was a solution. To show that —W is also a solution, we note that sgn(—z) = — sgn(z) except when x = 0, in which case the left side is 1 and the right side is —1. So f sgn(-W,)dBs =- f sgn(Ws) dB, + 2 f 1{W,=0} dBs Jo Jo Jo To prove that the second integral is 0 (and hence the right-hand side is = — Wt) recall that Wt is a Brownian motion. Example 3.1 in Chapter 2 implies that {s : W, = 0} has measure 0, and hence E (J l{tv,=0} dBs J = E J l{tv,=0} ds = E\{s < t : Ws = 0}| = 0 D Turning to positive results:
Section 5.2 Ito's Approach 185 (2.2) Theorem. If for all i, j, x, and y we have |cr,-j(as) — ff,'j(y)| < K\x — y\ and \bi(x) — 6,-(y)| < K\x — y\, then the stochastic differential equation (*) Xt = x+ f cr(Xs)dBs + f b(Xs)ds Jo Jo has a strong solution and pathwise uniqueness holds. Proof We construct the solution by a successive approximation scheme that reduces to the classical Picard iteration method for solving ordinary differential equations in the special case a = 0. We begin by introducing the notation Xt=X0+ f tr{Xt) dBs + f b(Xt)ds Jo Jo to describe one iteration. In words, we use the process {Xs,t > 0} as input for the coefficients a and 6, and the output is the process {Xt,t > 0}. To solve (*) we begin with the initial guess X° = x and define the successive guess by iteration: X?+i = X? for n > 0 The rest of this section is devoted to proving that this procedure constructs a strong solution and that pathwise uniqueness holds. To help the reader digest the argument we have divided it into sections. One of the claims is easy. It is clear by induction that X" is adapted to Tf. The basic estimate which is the key to the proof is: (2.3) Lemma. Let r be a stopping time with respect to T%, let T < oo and let 5 = (47^+16^2)^2. Then E ( sup 1¾ - Zt A < 2E\Yo - Zo\2 + BE f \Yt - Zs \2ds \0<t<TAr J Jo Proof The inequality is trivial if the right-hand side is infinite so we can suppose that each term on the right is finite. To begin we recall that (a+b)2 < 2a2 + 262. Using this twice (a + 6 + c)2 < 2a2 + 2(6 + c)2 < 2a2 + 462 + 4c2 So the left-hand side of (2.3) is < 2E\YQ - Zo |2 + 4£ sup / <r{Ys) - <r{Zs) dBa , s 0<i<T I Jo (a) I rtAT + AE sup / b{Y,)-b{Za)ds 0<t<T \Jo
186 Chapter 5 Stochastic Differential Equations To bound the third term in (a), we observe that the Cauchy-Schwarz inequality implies that a Mr \2 ptAr .iAr bi{Yt)-bi{Zt)da\ <J (bi(Y3)-bi(Zs))2ds. J Ids Taking sup0<1<T and using d (b) sup |w(t)|2 < J2 SUP «?(*) 0<t<T i^iQ<t<T with the Lipschitz continuity assumption gives I riAT 2 4E sup / b(Ys)-b(Zs)ds 0<t<T\Jo <4EJ2 sup / 6,-(^)-6,(^)^ (c) . d yTAr <4T-EJ2 (bi(Ys)-bi(Zs))2ds JfTAT ' |Y, -Z,|2cfc 0 To bound the second term in (a), let C{ be the ith row of <r, let 1-tAT K= / {<Ti{Ys)-<n{zs))-dBs Jo and let r„ = inf{£: \M\ \ > n} A r. The L2 maximal inequality and the formula for the variance of a stochastic integral imply that rTAT E sup (Ml)2 < AE (A4ArJ2 < AE / |<r,-(Y,) - <r,-(Z,)|2 ds 0<i<TAr„ JO Letting n —» oo, then using (b) and Lipschitz continuity we have I ftAT 2 AE sup / (<r(Y,)-(r(Z, ))<*#, 0<1<T|J0 <4EJ2 sup / (<n(Ys) - 0-,-(Z,)) • dB. (d) ^o<,<t|70 d [TAr <16EJ2 \<r{(Y.)-<rtiZ.)\2da /■Tat <16d2K2E / \YS-Za\2ds Jo
Section 5.2 Ifco's Approach 187 Combining inequalities (a), (c), and (d) proves (2.3). D Convergence of the sequence of approximations. Let 'U —1|2 An(t) = E[ sup IXf-Xr't \0<s<t and observe that since X° = X°_1( (2.3) implies that for n > 1 (2.4) An+1(D<B f An(s)ds Jo To get started we have to estimate |Ai(s)|. Our earlier inequality for squares implies that for norms \a + b\2 < 2|a|2 + 2|6|2. So we have \X}-XW<2(\<r(x)Bs\2 + \b(x)s\2) Using the fact that sup \<x{x)Bs\=t1'2 sup Ka:)B,| 0<s<t 0<i<l and that the right-hand side has finite expected value we have Ai(*)<C(f-M2) Combining this with (2.4) and using induction we have (tn 2tn+1 (2.5) A„(f) < Bn~lC r- + n! (n + 1)! From (2.5) we easily get the existence of a limit. Chebyshev's inequality shows that P ( sup |X|* - X?-^ > 2"n ) < 22nA„(T) \0<1<T J (2.5) implies the right-hand side is summable, so the Borel-Cantelli gives P ( sup |X|* -X?-^ > 2"" i.o. ] = 0 \0<1<T J Since ^n 2"" < oo it follows that with probability 1, X? -* a limit Xf° uniformly on [0, T].
188 Chapter 5 Stochastic Differential Equations The limit is a solution. We begin by showing that convergence also occurs uniformly in L2. (2.6) Lemma. For all 0 < m < n < oo, e( sup \X?-X?A < I V A^T)1'2) Proof Let ||Z||2 = {EZ2)1!2. If n < oo, then it follows from monotonicity of expected value and the triangle inequality for the norm || ■ H2 that sup |xr-x?| < <0<s<T »2 J2 sup ix.'-x*-1! i=m+l°^<T < £ II sup |X* -X*-i\\\ = £ Ak(Tf'2 Letting n —» 00 and using Fatou's lemma proves the result for n = 00. D To see that Xf is a solution, we let Y* = Xt" and Zt = Xf in (2.3) to get E ( sup IX?-*-1 - Xf|2 ] < BE f |X,n - Xf |2 ds \0<1<T / JO <bt-e( sup |x;-xf°|2) -♦ 0 \0<s<T J by (2.6), so X°° = limX^1 = Xf°. Uniqueness. The key to the proof is (2.7) Gronwall's Inequality. Suppose <p(t) < A + B f* <p(s)ds for alU > 0 and <p(t) is continuous. Then <p(i) < AeBi. Proof If we let ip(t) =(A + e)eBt then ij>'{t) = BiJ>(t) so i[>(t) = A + e + B I i[>(s) ds Jo Let r = inf{£ : <p(t) > i[>(t)}. Since i[) and <p are continuous, it follows that if r < 00 then <p(r) = i>(r), but this is impossible since iK- -) = .4 + 6 + 5 I il>(s)ds>A + B I <p(s) ds > <p(t) Jo Jo
Section 5.2 Ito's Approach 189 The last contradiction implies that ip(t) < (A + e)eBt, but e > 0 is arbitrary, so the desired result follows. Q To prove pathwise uniqueness now, let {yt,Bt,T\) and (Zt, Bt, T%) be two solutions of (*). Replacing the T\ by Tt = f\ V T% we can suppose T\ = 3^. To prepare for developments in the next section, we will use a more general formulation than we need now. (2.8) Lemma. Let Yt and Zt be continuous processes adapted to Tt with YQ = ZQ = x and let r(R) = inf{f : \Yt\ or \Zt\ > R}. Suppose that Bt is Brownian motion w.r.t. Tt and that Yt and Zt satisfy (*) on [0,t(R)]. Then Yt = Zt on [0,7-(1¾)]. Proof Let f(t) = E[ sup \YS-ZS\2 \0<s<tAr(R) i 'ds Since Yt and Zt are solutions, (2.3) implies rtAr(R) <p(t)<BE \YS-ZS\2 Jo <BE f \Y3Ar - Z3Ar\2 ds < B f <p(s)ds Jo Jo so (2.7) holds with A = 0 and it follows that tp = 0. D The last result implies that two solutions must agree until the first time one of them leaves the ball of radius R. (Of course this implies that the two solutions must exit at the same time.) Since solutions are by definition continuous functions from [0,oo) —» Rd, we must have r(R) f oo as i2 f oo and the desired result follows. D Temporal Inhomogeneity. In some applications it is important to allow the coefficients 6 and a to depend on time, that is, to consider (**) Xt = X0 + f b(s, Xs) ds + f cr(s, Xs) dBt Jo Jo In many cases it is straightforward to generalize proofs from (*) to (**). For example, by repeating the argument above with some minor modifications one can show
190 Chapter 5 Stochastic Differential Equations (2.9) Theorem. If for all i, j, x, y, and T < oo there is a constant Kt so that for t < T, |(ry(t,0)| < KT, Mt, 0)| < KT, \<rii(t,x)-<rii(t,y)\<KT\x-y\ and \bt(t,x)-bi(tty)\< KT\x-y\ then the stochastic differential equation (**) has a strong solution and pathwise uniqueness holds. Comparing with (2.2) one sees that the new result simply assumes Lipschitz continuity locally uniformly in t but needs new conditions: |ff;j(t,0)| < Kt and |6j(i, 0) | < Kt that are automatic in the temporally homogeneous case. The dependence of the coefficients on t usually does not introduce any new difficulties but it does make the statements and proofs uglier. (Here it would be impolite to point to Karatzas and Shreve (1991) and Stroock and Varadhan (1979).) So with the exception of (5.1) below, where we need to prove the result for time-dependent coefficients, we will restrict our attention to the temporally homogeneous case. 5.3. Extensions In this section we will (a) extend Ito's result to coefficients that are only locally Lipschitz-, (b) prove a result for one dimensional SDE due to Yamada and Watanabe, and (c) introduce examples to show that the results in (2.2) and (3.3) are fairly sharp. a. Locally Lipschitz Coefficients In this subsection we will extend (2.2) in the following way. (3.1) Theorem. Suppose (i) for any n < oo we have |o-y(a:) - (r0(y)| < Kn\x - y\ |6,-(a:) - bi(y)\ < Kn\x - y\ when \x\, \y\ < n and (ii) there is a constant A < oo and a function <p(x) > 0 so that if Xt is a solution of (*) then e~At<p(Xt) is a local supermartingale. Then (*) has a strong solution and pathwise uniqueness holds. (3.2) Theorem. Let a = ooT and suppose d £{2^(2) + ^(2))^5(1 + 1^2) i=i
Section 5.3 Extensions 191 then (ii) in (3.1) holds with A = B and <p(x) = 1 + \x\2. We begin with the easier proof. Proof of (3.2) Using Ito's formula with X? = t and f(x0,..., xd) = e-Bxo<p(xi,...,xd) we have e-Aif(Xt) - <p(XQ) = -B f e-B'<p(Xs)ds Jo £ f e-Bs2X\dX\ + \J2 f e~B'2 4(^), fei Jo l .=1 Jo d = local martingale + J* e~B> l-Bf(Xs) + J2 WW) + a«(X.)} J ^ Our assumption implies that the last term is < 0 so e~Ai<p(Xt) is a local supermartingale. D Proof of (3.1) Let R < oo, and introduce <rR and bR with (a) aR(x) = a(x) and bR(x) = b(x) for \x\<R (b) 0^(¾) = 0 and bR(x) = 0 for |x| > 2R (c) ^(x) - ^(y)\ < K'\x - y\, \bR(x) - bR(y)\ < K'\x - y\ For an explicit construction do Exercise 3.1. Suppose \h(x) — h(y)\ < C\x — y\ when \x\ < R. Let h{x) = ^^^--h{Rxl\x\) for R < \x\ < 2R R and h(x) = 0 for \x\ > 2R. Show that h is Lipschitz continuous on Rd with constant C - 2Ci + R~ 1\h(0)\. Fix a Brownian motion and let X" be the unique strong solution of (*„) X, = XQ + [ bn(X,) ds+ ( crn(X,) dB, Jo Jo
192 Chapter 5 Stochastic Differentia.! Equations Let Tn = inf{i : \Xt\ > n}. (2.8) implies that if m < n then X? = X? for t < Tm ■ From this it follows that we can define a process X£° so that Xf = X? for i < r„ and Xf° will be a solution of (*) for t < Too = lim^oo Tn. To complete the proof now it suffices to show that Too = oo a.s. If X0 = x then using the optional stopping theorem at time tATn, which is valid since the local supermartingale is bounded before that time, we have <p(x) > E (e-^AT"V(X1AT„)) > e~AiP(Tn < t) inf <p(y) Rearranging gives P(Tn < t) < eM<p{x) I inf <p(y) Since we have supposed <p(y) —» oo as y —» oo it follows that P(Tn < t) —» 0 as n —* oo which proves the desired result. D In words, what we have shown is that for locally Lipschitz coefficients the solution is unique up to Tn the exit time from the ball of radius n. If Tn j oo we get a-solution defined for all time. If not then the solution reaches infinity in finite time and we say that an explosion has occurred. Intuitively (3.2) says there is no explosion if, when \x\ is large, the part of the drift that points out, 6(z) • z/|z|, is smaller than C\x\ and the variance of each component is smaller than C|z|2. We do not need conditions on the off diagonal elements of a,j since the Kunita-Watanabe inequality implies |(X*',X^)1|2<(X£)1(X^)1 Note that since our condition concerns a sum, a drift toward the origin can compensate for a larger variance. Example 3.1. Consider d = 1 and suppose b(x) = —x3. Then (3.2) holds if a(x) < 5(1 + x2) + 2z4. The next two examples show that when considered separately the conditions on 6 and a are close to the best possible. Example 3.2. Consider d = 1 and let 0-(1) = 0, 6(z) = (l+|z|)4 with<5>l
Section 5.3 Extensions 193 Suppose Xq = 0. 6 > 0 so t —* Xt is strictly increasing. Let Tn = inf{£ : Xt = n}. While Xt G (n - l,n) its velocity b(Xt) > ns so Tn - Tn_x < n~s. Letting Too = limT„ and summing we have Too < IZ^Li n~4 < oo if 5 > 1. We like the last proof since it only depends on the asymptotic behavior of the coefficients. One can also solve the equation explicitly, which has the advantage of giving the exact explosion time. Let Yt = 1 + Xt. dYt = Y/ dt so if we guess Yt = (1 — at)~p then dYi/dt= ap (1 - <rf)P+! Setting a = 1/p and then p = 1/(5 — 1) to have p + 1 = 5p we get a solution that explodes at time 1/a = p = 1/(5 — 1). Example 3.3. Suppose 6 = 0, and a = (1 + |z|)4/. (2.2) implies there is no explosion when 5 < 1. Later we will show (see Example 6.2 and (6.3)) that in d > 3 explosion occurs for 5 > 1 in d < 2 no explosion for any 5 < oo b. Yamada and Watanabe in d = 1 In one dimension one can get by with less smoothness in a. (3.3) Theorem. Suppose that (i) there is a strictly increasing continuous function p with |<r(as) — a(y)\ < p(\x — y\) where /)(0) = 0 and for all e > 0 / Jo p 2(u)du = oo (ii) there is a strictly increasing and concave function k with \b(x) — b(y)\ < k(\x — y\) where k(0) = 0 and for all e > 0 I k x (u) du = 00 0 Then pathwise uniqueness holds. Remark. Note that there is no mention here of existence of solutions. That is taken care of by a result in Section 8.4. Proof The first step is to define a sequence <pn of smooth approximations of \x\. Let an I 0 be defined by a0 = 1 and f p ~(u) du = n
194 Chapter 5 Stochastic Differential Equations Let 0 < V'n(w) < 2/) 2(u)/n be a continuous function with support in (an, a„_i) and fH|.~l ij)n(u)du = 1 f Jo. Since the upper bound in the definition of t/>„ integrates to 2 over (an, an_i) such a function exists. Let /■1*1 /•</ Vn(a:)= / ^2// dzij)n{z) Jo Jo By definition y?„(z) = <pn(—x), and y^,(a:) = J"* ^„(u) d«, for x > 0, so {= 0 for |i| < a„ <1 for a„ < |x| < a„_! = 1 for a„_i < |x| and it follows that <pn(x) | |a;| as n | oo. Suppose now that Xj and X} are two solutions with X\ = Xq = a; driven by the same Brownian motion, and let At = X} — X?. At = f{<r{X)) - <r(X2)} dB, + [\b(Xl) - b(X?)} ds Jo Jo so Ito's formula gives <pn(At) = jf ^n(A.)dA. + \ Jq rt(A,)d(A), = fp'n(A.){<r{X])-<r{X?)}dB. Jo (a) + f<?n{A.){b{Xl)-b{X*)}da Jo = IQ(t) + h(t) + h(t) Io(t) is a local martingale so there are stopping times Tm j oo with (b) EIo(tATm) = 0 To deal with I2 in (a) we note that <p'^(x) = i/>n(|a:|) < 2p~2(\x\)/n when |a;| G (an, an_i) and is 0 otherwise. So using (i) (c) W0l<^3C!flM^D4<i
Section 5.3 Extensions 195 For h in (a) we observe |^(z)| < 1, so using (ii) and the concavity of k we have E\h(t)\< f E\b(Xl)-b(Xj)\ds (d) Jo < / EK(\A,\)ds< / K(E\As\)ds Jo Jo Combining (a)-(d) we have t /"* E<pn(AtATm) <-+ n(E\At\)ds n Jo Letting m —» oo and using Fatou's lemma, then letting n —» oo and using the monotone convergence theorem (recall <pn(x) > 0 and <pn(x) T \x\) we have E\At\< f K(E\A,\)ds Jo To finish the proof now we prove a slight improvement of Gronwall's inequality. (3.4) Lemma. Suppose f(t) < JQ k(/(s)) ds where / is continuous, k(x) > 0 is increasing for x > 0, and has JQ K-1(x)dx = oo for any e > 0. Then f(t) = 0. Proof Let g(t) be the solution of <?'(*) = K(g(t)) with g(0) = e. It follows from the proof of (2.7) that f(t) < g(t). To show that g is small when e is small note that g is strictly increasing and let h be its inverse. Since g(h(x)) = x, we have h'(x) = l/g'(h(x)) = 1/K.(g(h(x)) = 1/k(x) and hence if a < b h(b)-h(a)= f 1/K(x)dx Ja In words the right-hand side gives the amount time it takes g to climb from a to 6. Taking a = e and 6 = 6 > 0, the assumed divergence of the integral implies that the amount of time to climb to 6 approaches oo as e —» 0 and the desired result follows. D c. Examples We will now give examples to show that for bounded coefficients the combination of (2.2) and (3.3) is fairly sharp. In the first case the counterexamples will have to wait until Section 5.6.
196 Chapter 5 Stochastic Differential Equations Example 3.4. Consider b(x) = 0 and <r{x) = \x\s A 1. (2.2) and (3.3) imply that ., . . , ,, , (S> 1/2 ind= 1 pathwise uniqueness holds for < . — ' t<5>linc/>l Example 6.1 will show pathwise uniqueness fails if \ F ' in ,— v U < 1 in d > 1 Example 3.5. Consider <x(x) = 0, b(x) = \x\s A 1, and X0 = 0. If 8 > 1 then (2.2) implies that there is a unique solution: X, = 0. However, if 6 < 1 there are others. To find one, we guess Xt = Ctp. Then dX, = Cpsf-1 ds b(Xs) = CssP* so to make dXa = b(Xa) ds we set (p-l) = P8 i.e., p =1/(1 -S) Cp = Cs i.e.,C = p-1/(1-') If S < 1 then p > 0 and we have created a second solution starting at 0. Once we have two solutions we can construct infinitely many. Let a > 0 and let Xi ~ { C(t - t< a a)P t>a Exercise 3.2. The main reason for interest in (3.3) is the weakening of Lips- chitz continuity of a. However, the condition on 6 is also an improvement, and further sharpens the line between theorems and counterexamples, (a) Show that (ii) in (3.3) holds when C > 0 and / \ _ f Cz log(l/z) for z < e 2 K{X) ~ \ C(e~2 + z) for z > e"2 (b) Consider g(t) = exp(—l/tp) with p > 0 and show that g'(t) = ij>p(g(t)) where V-P(y) = P2/{log(l/y)}(p+1)/p- 5.4. Weak Solutions Intuitively, a strong solution corresponds to solving the SDE for a given Brow- nian motion, while in producing a weak solution we are allowed to construct
Section 5.4 Weak Solutions 197 the Brownian motion and the solution at the same time. In signal processing applications one must deal with the noisy signal that one is given, so strong solutions are required. However, if the aim is to construct diffusion processes then there is nothing wrong with a weak solution. Now if we do not insist on staying on the original space (Q, J7, P) then we can use the map w —» (-Xt(w), Bt(w)) to move from the original space (Q, T) to (C x C, C x C) where we can use the filtration generated by the coordinates wi(s), w2(s), s < t. Thus a weak solution is completely specified by giving the joint distribution (Xt,Bt). Reflecting our interest in constructing the process Xt and sneaking in some notation we will need in a minute, we say that there is uniqueness in distribution if whenever (X, B) and (X',B') are solutions of SDE(6, cr) with XQ = X'0=x then X and X' have the same distribution, i.e., P(X e A) = P(X' G A) for all A € C. Here a solution of SDE(6, a) is something that satisfies (*) in the sense defined in Section 5.2. Example 4.1. Since sgn(z)2 = 1, any solution to the equation in Example 2.1 is a local martingale with (X)t = t. So Levy's theorem, (4.1) in Chapter 3, implies that Xt is a Brownian motion. The last result asserts that uniqueness in distribution holds, but (2.1) shows that pathwise uniqueness fails. The next result, also due to Yamada and Watanabe (1971), shows that the other implication is true. (4.1) Theorem. If pathwise uniqueness holds then there is uniqueness in distribution. Remark. Pathwise uniqueness also implies that every solution is a strong solution. However that proof is more complicated and the conclusion is not important for our purposes so we refer the reader to Revuz and Yor (1991), p. 340-341. Proof Suppose (X1,!?1) and (X2,B2) are two solutions with Xq = X$ = x. (As remarked above the solutions are specified by giving the joint distributions (X\ £')•) Since (C,C) is a complete separable metric space, we can find regular conditional distributions Q'(w, A) defined for w € C and A € C (see Section 4.1 in Durrett (1995)) so that P{XieA\Bi) = Qi{Bi,A) Let Pq be the measure on (C, C) for the standard Brownian motion and define a measure w on (C3, C3) by 71-(^0 x Ai x A2) = / dPo(uQ)Qi(u0, Ai)Q2(uo,A2)
198 Chapter 5 Stochastic Differential Equations If we let Y/(w0, wi, W2) = w,(t) then it is clear that {X\B1) = (Y1,Y°) and (X2,B2) = (Y2,Y°) Take A2 = C 01 Ai = C respectively in the definition. Thus, we have two solutions of the equation driven by the same Brownian motion. Pathwise uniqueness implies Y1 = Y2 with probability one. (i) follows since X1 = Y1 = Y2 = X2 D Let a(x) = <t<tt(x) and recall from Section 5.1 that Jo We say that X is a solution to the martingale problem for 6 and a, or simply X solves MP(6, a), if for each i and j, X\- j bi(X,)ds and X\X{- f aij(Xa)ds Jo Jo are local martingales. The second condition is, of course, equivalent to <X',X')*= [ ay(X.)<h Jo To be precise in the definition above we need to specify a filtration Tt for the local martingales. In posing the martingale problem we are only interested in the process Xt, so we can assume without loss of generality that the underlying space is (C,C) with Xt(w) = w* and the filtration is the one generated by Xt- In the formulation above 6 and a = aaT are the basic data for the martingale problem. This presents us with the problem of obtaining a from a. To solve this problem we begin by introducing some properties of a. Since (X',Xi)t = (XJ,X')t, a is symmetric, that is, atj = aj{. Also, if 2 £ Hd then ( V i,j i,j,k k \ i / So a is nonnegative definite and linear algebra provides us with the following useful information. (4.2) Lemma. For any symmetric nonnegative definite matrix a, we can find an orthogonal matrix U (i.e., its rows are perpendicular vectors of length 1)
Section 5.4 Weak Solutions 199 and a diagonal matrix A with diagonal entries Ai > A2 ... > Ad > 0 so that a = UTAU. a is invertible if and only if Aj > 0. (4.2)tells us a = t/^Af/, so if we want a = <x<xT we can take <x = UTvA, where a/A is the diagonal matrix with entries \/A7. (4.2) tells us how to find the square root of a given a. The usual algorithms for finding U are such that if we apply them to a measurable a(x) then the resulting square root <r(x) will also, be measurable. It takes more sophistication to start with a smooth a(x) and construct a smooth ff(z). As in the case of (4.2), we will content ourselves to just state the result. For detailed proofs of (4.3) and (4.4) see Stroock and Varadhan (1979), Section 5.2. (4.3) Theorem. If BT a(x)Q > a\6\2 and \\a(x) - a(y)\\ < C\x - y\ for all x,y then Ha1/2^) - a1/2(2/)|| < (Cfia1/2)^ - y\ for all x,y. When a can degenerate we need to suppose more smoothness to get a Lipschitz continuous square root. (4.4) Theorem. Suppose atj £ C2 and max \6TDna(x)6\ < C\6\2 then 11^/2(^) - a1/2^)!! < (1(20)1^ - y\ for all x,y. To see why we need a £ C2 to get a Lipschitz a1/2, consider d = 1 and a(x) = \x\x. In this case <r(x) = \x\x/2 is Lipschitz continuous at 0 if and only if A > 2. Returning to probability theory, our first step is to make the connection between solutions to the martingale problem and solutions of the SDE. (4.5) Theorem. Let X be a solution to MP(6,a) and a a measurable square root of a. Then there is a Brownian motion Bt, possibly defined on an enlargement of the original probability space, so that (X, B) solves SDE(6, a). Proof If a is invertible at each X this is easy. LetYJ = Xi-fibi(X,)ds and let (a) B1i = Y,[' <rTjHX.)*Y! From the last two definitions and the associative law it is immediate that (b) / o-(X.) dB, = Yt - Yo = Xt - X0 - f b(X,)ds Jo Jo
200 Chapter 5 Stochastic Differential Equations so (*) holds. The B\ are local martingales with (B',B^)t = 5,-jt so Exercise 4.1 in Chapter 3 implies that B is a Brownian motion. To deal with the general case in one dimension, we let jj = /1 if<r(X.)#0 10 otherwise let J, = 1 — Is, let W, be an independent Brownian motion (to define this we may need to enlarge the probability space) and let dYs + / J.dW, <a'> A = l'^)jy' + / The two integrals are well defined since their variance processes are < t. Bt is a local martingale with (B)t = t so (4.1) in Chapter 3 implies that Bt is a Brownian motion. To extract the SDE now, we observe that the associative law and the fact that Js<x{Xs) = 0 imply f cr(X,) dB, = f I, dY, Jo Jo = Yt-Y0- [ Jo J,dYs In view of (b) the proof will be complete when we show fQ J, dY, = 0. To do this we note that E^J J,dY,} =eJ^ J*d(Y)a = E f J*cr(Xa)ds = 0 Jo To handle higher dimensions we let U(x) be the orthogonal matrices constructed in (4.2) and let Zt= t UT(X,)dY, Jo Introducing Kronecker's 5y = 1 if i = j and 0 otherwise, we have (Zi,Z^)t= [ Y,Vilk{X.)akJi{X.)UTj{X.)da= I %A,-(X3)dS Jo kl Jo
Section 5.4 Weak Solutions 201 Our result for one dimension implies that the Z\ may be realized as stochastic integrals with respect to independent Brownian motions. Tracing back through the definitions gives the desired representation. For more details see Ikeda and Watanabe (1981), p. 89-91, Revuz and Yor (1991), p. 190-191, or Karatzas and Shreve (1991), p. 315-317. □ (4.5) shows that there is a 1-1 correspondence between distributions of solutions of the SDE and solutions of the martingale problem, which by definition are distributions on (C, C). Thus there is uniqueness in distribution for the SDE if and only if the martingale problem has a unique solution. Our final topic is an important reason to be interested in uniqueness. (4.6) Theorem. Suppose a and 6 are locally bounded functions and MP(6,a) has a unique solution. Then the strong Markov property holds. Proof For each x £ Rd let Px be the probability measure on (C,C) that gives the distribution of the unique solution starting from Xq = x. If you talk fast the proof is easy. If T is a stopping time then the conditional distribution of {Xt+s , s > 0} given Tt is a solution of the martingale problem starting from Xt and so by uniqueness it must be Px(T)- Thus, if we let Qt be the random shift defined in Section 1.3, then for any bounded measurable Y : C —* R we have the strong Markov property (4.7) E{YoeT\TT) = Ex{T)Y To begin to turn the last paragraph into a proof, fix a starting point Xq = x0. To simplify this rather complicated argument, we will follow the standard practice (see Stroock and Varadhan (1979), p. 145-146, Karatzas and Shreve (1991), p. 321-322, or Rogers and Williams (1987), p. 162-163) of giving the details only for the case in which the coefficients are bounded and hence M\° = X\- ( bi{Xa)ds Jo M? =X\X{- J ai5{X,)ds Jo are martingales. We have added an extra index in the first case so we can refer to all the martingales at once by saying "the M\3." Consider a bounded stopping time T and let Q(w,A) be a regular conditional distribution for {Xr+t, t > 0} given Tt ■ We want to claim that (4.8) Lemma. For PXo a.e. w, all the M%+t - M% = M\3 o eT
202 Chapter 5 Stochastic Differential Equations are martingales under Q(w, •). To prove this we consider rational s < t and B € T, and note that the optional stopping theorem implies that if A £ Tt then EXo ({(M/J' - MY)1B) o 6T) 1A) = 0 Letting B run through a countable collection that is closed under intersection and generates T, it follows that PXo a.s., the expected value of (M\3 —M'J)1b is 0, which implies (4.8). Using uniqueness now gives (4.7) and completes the proof. D 5.5. Change of Measure Our next step is to use Girsanov's formula from Section 2.12 to solve martingale problems or more precisely to change the drift in an existing solution. To accomodate Example 5.2 below we must consider the case in which the added drift 6 depends on time as well as space. (5.1) Theorem. Suppose Xt is a solution of MP(/?,a), which, for concreteness and without loss of generality, we suppose is defined on the canonical probability space (C, C, P) with Tt the filtration generated by Xt(w) = wt. Suppose a-1(x) exists for all x, and that b(s, z) is measurable and has \bTa~1b(s, x)\ < M. We can define a probability measure Q locally equivalent to P, so that under Q, Xt is a solution to MP(/? + 6, a). Proof Let Xt=Xt- JJj P(X,)ds, let c(s,x) = a-^x^s,!) and let Yt= [ c(s,X,)-dX, Jo which exists since JO y = f cTac(s,Xs)ds= f bTa-1b(s,Xs)ds<Mt Jo Jo by our assumption that \bTa~1b(s,x)\ < M. Letting Qt and Pt be the restrictions of Q and P to Tt we define our Radon-Nikodym derivative to be dQ, dPi £ = tt, = exp(Y,-!<Y),)
Section 5.5 Change of Measure 203 Since (Y)t < Mt, (3.7) in Chapter 3 implies that at is a martingale, and invoking (12.3) in Chapter 2 we have denned Q. To apply Girsanov's formula ((12.4) from Chapter 2), we have to compute A3t = /0 aj1 d(a,X])s. Ito's formula and the associative law imply that at -1= ( asdYs= f asc(s,Xa)-dX, Jo Jo So by the formula for the covariance of stochastic integrals (a,Xi)t=J2J a3c,(S,X3)rf{X*',^')3 = / a,(cTa)j(s,X,)ds = / a, bj(s,X,)ds Jo JQ since c(s,x) = a~i(x)b(s,x). It follows that A{= J a^d{a,Xl),= f bj(s,Xs)ds Jo Jo Using Girsanov's formula now, it follows that X{ - I bj(s,X.)ds = X{ - ! friX,) + bs(s,Xs)ds Jo Jo is a local martingale/Q. This is half of the desired conclusion but the other half is easy. We note that under P (Xi,Xi)t= [ aij(Xa)ds Jo and (12.6) in Chapter 2 implies that the covariance of X' and X3 is the same under Q. Q Exercise 5.1. To check your understanding of the formulas, consider the case in which b does not depend on s, start with the Q constructed above, change to a measure R to remove the added drift 6, and show that dRt/dQt = l/a* so R=P. Example 5.1. Suppose Xt is a solution of MP(0,1), i.e., a standard Brownian motion and consider the special situation in which 6(z) = Vf7(z). In this case
204 Chapter 5 Stochastic Differential Equations recalling the definition of Yt in the proof of (5.1) and applying Ito's formula to U(Xt) we have Yt = [ VU(X,) ■ X, = U(Xt)- U(XQ) -\ ( AU(Xs)ds Jo * J0 Thus, we can get rid of the stochastic integral in the definition of the Radon- Nikodym derivative: dQ dPt £ = exp (u(Xt)- U(X0) - | J AU(X,) + \VU(X,)\2ds\ In the special case of the Ornstein-Uhlenbeck process, 6(z) = — 2ax, where a is a positive real number, we have U(x) = — a\x\2 AU(x) = — 2da, and the expression above simplifies to dQi =exp (-a\Xt\2 + a\X0\2 + dat - f 2a2\X,\2 ds dPt In the trivial case of constant drift, i.e., b(x) = n = Vf7(z) we have U(x) = fi-x and AU(x) = 0 so dQ, dPt }± = 'exp (».Xt- ii-X0-±\n\*t) It is comforting to note that when Xq = x, the one dimensional density functions satisfy Qt(Xt = y) = exp(fM-y-n-x- ^t\2J Pt(Xt = y) = (27rf)-d/2exp(-|y-x- nt\2/2t) as it should be since under Q, Xt has the same distribution as B% + fit. O (5.1) deals with existence of solutions, our next result with uniqueness. (5.2) Theorem. Under the hypotheses of (5.1) there is a 1-1 correspondence between solutions of MP(/?, a) and MP(/? + 6, a). Proof Let Pi and P2 be solutions of MP(/?, a). By change of measure we can produce solutions Q\ and Q2 of MP(/? + 6, a). We claim that if Pi ^ P2 then Q\ i1 Q2- To prove this we consider two cases: Case 1. Suppose Px and P2 are n°t locally equivalent measures. Then since Qi is locally equivalent to P,-, Q\ and Q2 are not locally equivalent and cannot be equal.
Section 5.5 Change of Measure 205 Case 2. Suppose Pi and P2 are locally equivalent. In the definition of ait (Y)t and Yt are independent of P{ by (12.6) and (12.7) in Chapter 2, so dQx/dPx = dQ2/dP2 and Qi # Q2. D One can considerably relax the boundedness condition in (5.1). The next result will cover many examples. If it does not cover yours, note that the main ingredients of the proof are that the conditions of (5.1) hold locally, and we know that solutions of MP(6,a) do not explode. (5.3) Theorem. Suppose X is a solution of MP(0,a) constructed on (C,C). Suppose that a_1(x) exists, bTa~1b(x) is locally bounded and measurable, and the quadratic growth condition from (3.2) holds. That is, d ^2{2Xibi(x) + Oit(x)} < A(l + |z|2) : = 1 Then there is a 1-1 correpondence between solutions to MP(6, a) and MP(0, a). Remark. Combining (5.3) with (3.1) and (4.1) we see that if in addition to the conditions in (5.3) we have a = <t<tt where a is locally Lipschitz, then MP(6, a) has a unique solution. By using (3.3) instead of (3.1) in the previous sentence we can get a better result in one dimension. Proof Let Z and a be as in the proof of (5.1) and let Tn = inf{t: \Xt\ > n}. From the proof of (5.1) it follows that a(t A Tn) is a martingale. So if we let P/1 be the restriction of P to FiATn and let dQ^/dP? = a(t A Tn) then under Q", Xt is a solution of MP(6, a) up to time Tn. The quadratic growth condition implies that solutions of MP(6, a) do not explode. We will now show that in the absence of explosions, cct is a martingale. First, to show at is integrable, we note that Fatou's lemma implies (a) Eat < liminf J5(a(i A Tn)) = 1 n—*oo where E denotes expected value with respect to P. Next we note that E\at - atATJ = E(\at - a(Tn)\;Tn < t) 1 < E(a(Tn);Tn <t) + Eat - E(at;Tn > t) The absence of explosions implies that asn->oo (c) E(a(Tn);Tn <t) = Qn(Tn < t) - 0
206 Chapter 5 Stochastic Differential Equations This and the fact that a(t A Tn) is a martingale implies (d) E(at; Tn>t) = l- E(a(Tn);Tn < t) - 1 as n —* oo. (b), (c), and (d) imply that a(t A Tn) —* at in L1. It follows (see e.g., (5.6) in Chapter 4 of Durrett (1995)) that a(tATn) = E(at\ftATn) and from this that a,, 0 < s < t is a martingale. The rest of the proof is the same as that of (5.1) and (5.2). D Our last chore in this section is to complete the proof of (2.10) in Chapter 1 by showing (5.4) Theorem. Let Bt be a Brownian motion starting at 0, let g be a continuous function with g(0) = 0, and let e > 0. Then P[ sup \Bt-g(t)\<e) >0 \0<1<1 J Proof We can find a C1 function h with /i(0) = 0 and \h(s) - g(s)\ < e/2 for s < t, so we can suppose without loss of generality that g £ C1 and let 63 = ff'(s). If we let Yt = fQ bs dB, and define Qt by dQ_ dPi u^(jyB,-\jy,UB, then it follows from (5.1) that under Q, Bt—g(t) is a standard Brownian motion. Let G = {\B, — g(s)\ < e for s < t}. Exercise 2.11 in Chapter 1 implies that Qt(G) > 0. It follows from the Cauchy-Schwarz inequality that To complete the proof now we let E denote expectation with respect to Pt and observe that if |631 < M for s < t then = exp (['\b,f ds\ <tM'<
Section 5.6 Change of Time 207 since exp (fQ 263 dB, — (1/2) JjJ |263|2 ds) is a martingale and hence has expected value 1. For this step it is important to remember that b, = g'(s) is a non-random vector. □ 5.6. Change of Time In the last section we learned that we could alter the 6 in the SDE by changing the measure. However, we could not alter a with this technique because the quadratic variation is not affected by absolutely continuous change of measure. In this section we will introduce a new technique, change of time, that will allow us to alter the diffusion coefficient. To prepare for developments in Section 6.1 we will allow for the possibility that our processes are denned on a random time interval. (6.1) Theorem. Let Xt be a solution of MP(6, a) for i < ¢. That is, for each i and j, X\- f bi(Xs)ds and X\X{ - j aij(Xs)ds Jo Jo are local martingales on [0,C). Let g be a positive function and suppose that cr* = / g(Xs) ds < oo for alii < C Jo Define the inverse of a by js = inf{i : 0* > s or i > ¢) and let Y, = X(j,) for s < £ = (T£. Then Y, is a solution of MP(b/g, a/g) for s < £. Proof We begin by noting that Xt = Y{at), then in the second step change variables r = <rs, dr = g(Xs) ds, X, = -^7(r) = Yr, to get X\- f bi(X,)ds = Yit- j bi(Ya,)ds Jo Jo = y;'(- f b{(Yr)/g(Yr)dr Jo Changing variables i = ju gives Yj - /" bi(Yr)/g(Yr) dr = X^ - P" b{(Xs) ds Jo Jo Since the ju are stopping times, the right-hand side is a local martingale on [0, £) and we have checked one of the two conditions to be a solution of MP(b/g, a/g).
210 ChapterS Stochastic Differential Equations Then MP(0,a) is well posed, i.e., uniqueness in distribution holds and there is no explosion. Proof We start with Xt = Bt, a Brownian motion, which solves MP(0,1) and take g(x) = /1(1)-1. Since g{Bt) < e^1 for t < TR = inf{f : \Bt\ > R} we have fQ g(Bs)ds < 00 for any t. On the other hand since Brownian motion is recurrent, so applying (6.1) we have a solution of MP(0,a). Uniqueness follows from (6.2) so the proof is complete. D Combining (6.3) with (5.2) we can construct solutions of MP(6,a) in one dimension for very general 6 and a. (6.4) Theorem. Consider dimension d = 1 and suppose (i) 0 < eR < a(y) <CR<oo when |y| < R (ii) 6(z) is locally bounded (iii) x ■ b{x) + a(x) < A(l + x2) Then MP(6,a) is well posed. I
6 One Dimensional Diffusions 6.1. Construction In this section we will take ah approach to constructing solutions of (*) dXt = b(Xt) dt + a(Xi) dBt that is special to one dimension, but that will allow us to obtain a very detailed understanding of this case. To obtain results with a minimum of fuss and in a generality that encompasses most applications we will assume: (ID) 6 and a are continuous and a{x) = ff2(z) > 0 for all x The purpose of this section is to answer the following questions: Does a solution exist when (ID) holds? Is it unique? Does it explode? Our approach may be outlined as follows: (i) We define a function <p so that if Xt is a solution of the SDE on [0, £) then Yt = <p(Xt) is a local martingale on [0,£). (ii) Our Yt has (Y)t = /0 h(Ys) ds so construct Yt to be a solution of MP(0, h) by time changing a Brownian motion. (iii) We define Xt = <p~1(Yt) and check that Xt is a solution of the SDE. To begin to carry out our plan, suppose that Xt is a solution of MP(6, a) for t < £. If / G C2, Ito's formula implies that for t < £ (1.1) f(Xt) - f(X0) = J' f\Xt) dX, + i jf f"(X.)d(X). = local mart. + / Lf(X,)ds Jo
212 Chapter 6 One Dimensional Diffusions where (1.2) Lf(x)=±a(x)f"(x) + b(x)f'(x) and a(x) = (^(x). From this we see that f(Xt) is a local martingale on [0,£) if and only if Lf(x) = 0. Setting Lf = 0 and noticing (ID) implies a(x) > 0 gives a first order differential equation for /' (1.3) (/')' = — /' a Solving this equation we find, and it follows that Any of these functions can be called the natural scale. We will usually take A = 0 and B = 1 to get (L4) «*> = /^(1-¾)^ However, in some situations it will be convenient to replace the 0's at the lower limits by some other point. Note that our assumption (ID) implies <p is C2 so our use of Ito's formula in (1.1) is justified. Since Yt = <p{Xt) is a local martingale, results in Section 3.4 imply that it is a time change of Brownian motion. To find the time change function we note that (1.1) and the formula for the variance of a stochastic integral imply (1.5) where <Y),= f <p'(X,fd(X), Jo = f <p'(X,)2a(X,)ds= f h(Y,)ds Jo Jo h(y) = {^'(^"Hy))}2'1^-1^)) > 0 is continuous So if we let rt = inf{s : (Y)s > t} then Wt = Yt, is a Brownian motion run for an amount of time (Y)f.
Section 6.1 Construction 213 To construct solutions of MP(6, a) we will reverse the calculations above: we will use a time change of Brownian motion to construct a Yt which solves MP(0, h) and then let Xt = <p-l{Yt). With Examples 1.6 and 1.7 of Chapter 5 in mind we generalize our set-up to allow the coefficients 6 and a to be denned on an open interval (or,/?) with a < 0 < /?. Since <p'(x) > 0 for all x, the image of (a, /?) under <p is an open interval (£, r) with —oo <£<0<r<oo. Letting Wt be a Brownian motion, C = inf{t : Wt ¢ {£, r)}, g = 1/h, <rt = g(Ws) ds for t < C and js = inf{t : <rt > s or t > Q Jo Using (6.1) in Chapter 5, we see that Y, = W(js) is a solution of MP(0, h) for s <£ = <?<:■ To define Xt now, we let "ij> be the inverse of <p and let Xt = i>(Yt). To check that X solves MP(6, a) until it exits from (a, /?) at time £, we differentiate <p(ip(x)) = z and rearrange, then differentiate again to get p'(V>(z)) Using the first equality and <p"(y) = —(2&(y)/a(y))y/(y), which follows from (1.3), we have v ^) {<pm*w aw*)) These calculations show ij) £ C2 so Ito's formula, (1.1), and (1.5) imply that for t < £ t{Yt) - iftYo) = J 4>'(Ys)dY, + i J ^"(y.)A(y.) rf* For the second term we note that combining the formulas for ■$>" and h gives (recall ij) = (p~l) \#\v)Kv) = KHv)) To deal with the first term observe that Yt is a solution of MP(0, h) up to time £ so by (4.5) in Chapter 5 there is a Brownian motion B, with dY, = 1//1( Y3) dB,. Since ^'(y)\/h(y) = c(t/>(i/)), letting Xt = i>(Yt) we have for t < £ X«-X0= / <r{Xa)dBa+ f b(X,)ds Jo Jo
214 Chapter 6 One Dimensional Diffusions The last computation shows that if Yt is a solution of MP(0, h) then Xt = i>(Yt) is a solution of MP(6,a), while (1.1) shows that if Xt is a solution of MP(6, a) then Yt = <p(Xt) is a solution of MP(0, h). This establishes there is a 1-1 correspondence between solutions of MP(0, h) and MP(6,a). Using (6.2) of Chapter 5 now, we see that (1.6) Theorem. Consider (C,C) and let Xt(u) = ut. Let a < /3 and r^a,p) — \xd{Xt{u) ¢ (a,/?)}. Under (ID), uniqueness in distribution holds for MP(6, a) on [0,7-(,,,0)). Using the uniqueness result with (4.6) of Chapter 5 we get (1.7) Theorem. For each x £ (a,/?) let Px be the law of Xt, t < T^a,p) from (1.6). If we set Xt = A for t > T(a,p)) where A is the cemetery state of Section 1.3, then the resulting process has the strong Markov property. 6.2. Feller's Test In the previous section we showed that under (ID) there were unique solutions to MP(6,a) on [0,T(a,p)) where T(a,p) = inf{t : Xt ¢ (a,/?)}. The next result gives necessary and sufficient conditions for no explosions, i.e., T(a,p) = °° a-s- Let Ty = inf {t :Xt = y} for y G (a, /?) Ta = limT„ and Tp = limTv vie y P vW y In stating (2.1) we have without loss of generality supposed 0 £ («,/?)■ If you are confronted by an (a,/?) that does not have this property pick your favorite j in the interval and translate the system by —j. Changing variables in the integrals we see that (2.1) holds when all of the 0's are replaced by 7's. (2.1) Feller's test. Let <p(x) be the natural scale denned by (1.4) and let m(x) = l/(<p'(x)a(x)). (a) Px(Tp < To) is positive for some (all) x £ (0,/?) if and only if I dx m{x) (<p(/3) — <p(x)) < 00 Jo (b) Px(Ta < Tq) is positive for some (all) x € (a, 0) if and only if 0 dx m(x) (<p(x) — <p(a)) < 00 L
Section 6.2 Feller's Test 215 (c) If both integrals are infinite then Px(t"(«,/3) = oo) = 1 for all x £ (a,/?). Remarks. If <p((3) = oo then the first integrand is = oo and the integral is oo. Similarly the second integral is oo if <p(a) = —oo. The statement in (a) means that the following are equivalent (al) Px(Tp < TQ) > 0 for some x G (0,/3) (a2) J* dx m{x) (p(/J) - <p(x)) < oo (a3) Px(7> < To) > 0 for all x g (0,/3) Proof The key to the proof of our sufficient condition for no explosions given in (3.1) and (3.2) of Chapter 5 was the fact that if A is sufficiently large St = (1 + \Xt\2)e~Ai is a supermartingale To get the optimal result on explosions we have to replace 1+ \x\2 by a function that is tailor-made for the process. A natural choice that gives up nothing is a function g > 0 so that e~tg(Xt) is a local martingale To find such a g it is convenient to look at things on the natural scale, i.e., let Yt = <p(Xt). Let £ = limX|Qy?(z) and let r = limxf/j <p(x). We will find a C2 function f(x) so that / is decreasing on (£, 0), / is increasing on (0, r) and e_</(^i) is a local martingale Denouement. Once we have / the conclusion of the argument is easy, so to explain our motivations we begin with the end. Let f (I) = \im f(y) /(r) = lim/(y) (2.6) and the calculations at the end of the proof will show that (2.2a) The integral in (a) is finite if and only if f(r) < oo. (2.2b) The integral in (b) is finite if and only if f(£) < oo. Once these are established the conclusions of (2.1) follow easily. Let fy=mi{t:Yt = y} fory€(£,r) fr = \imfv and ft = limTL f(l,r)=flAfr
216 Chapter 6 One Dimensional Diffusions Proof of (c) Suppose f{€) = f(r) = oo. Let an < 0 < 6„ be chosen so that /(a„) = /(&„) = n and let r„ = fan A fhn. Since e~'f(Ys) is bounded before time r„ At we can apply the optional stopping theorem at that time to conclude that if y G (an, bn), then ei f(v) Py(rn < t) < V' -* 0 as n -* oo n Letting i-+oowe conclude 0 = Py(f(tr) < oo) = -^-1(1,)(^(0,/3) < °°)- Proof of (a) Let 0 < y < r, let 6„ | r with 6X > y, and let r„ = T0 A r&„. If /(r) = 00, applying the optional stopping theorem at time t„ A t, we conclude that ei f(v) Py(Tbn <T0At)< -jfff -» 0 as n -* 00 Letting i-+oowe conclude 0 = Py(fr < fQ) = Pv-i^y)(Tp < T0). This shows that if (a2) is false then (al) is false, i.e., (al) implies (a2). If /(r) < 00, applying the optional stopping theorem at time r„ (which is justified since e_< f(Yt) < f(bn) for t < r„) we conclude that l<f(y) = Es{e-T-f(YTn)) < l + f(r)Ey (e-^»;f6„< To) Rearranging gives Noting Tr = Tb < °° is impossible, and letting n-+oowe have Ey(e-^;fbn < To) | Ey(e-^;fr < %) which is a contradiction, unless 0 < Py(fr < T0) = P^-i^Tp < T0). This shows that (a2) implies (a3). (a3) implies (al) is trivial so the proof of (a) is complete. Proof of (b) is identical to that of (a). The search for f. To find a function / so that e~tf(Yt) is a local martingale, we apply Ito's formula to f(x\, z2) = e-Xl/(z2) with X\ = t, X% = Yt, and recall Yt is a local martingale with variance process given by (1.5), so e~7W " f(Yo) = J {-e-f{Y.) + \e~' f"(Y,)h(Y,)}ds + local mart.
Section 6.2 Feller's Test 217 so we want (2.3) ±h(x)f"(x)-f(x) = Q To solve this equation we let /o = 1 and define for n > 1 (2.4) Ux) = £dy£d!v^ (1.8) in Chapter 4 implies that if we set f(x) = J2™=ofn(x) an(i oo (2.5) Y^ sup |/n(aOI < oo for any R < oo then f"(x) = En=ofn^)- Using f'^x) = 2fn_1(x)/h(x) for n > 1 and f{,'(x) = 0 it follows that n*) = £«*) = £ ^ = ¾^ n=0 n=l v ' v ' i.e., / satisfies (2.3). To prove (2.5), we begin with some simple properties of the /„. (2.6a) /„>0 (2.6b) /„ is convex with /^(0) = 0 (2.6c) /„ is increasing on (0, r) and decreasing on {1,0) Proof Since h(z) > 0 (2.6a) follows easily by induction from the definition in (2.4). Using (2.6a) in the definition, (2.6b) follows. (2.6c) is an immediate consequence of (2.6b). D Our next step is to use induction to show (2.7) Lemma. fn(x) < (^(z))"/"!, (2.5) holds, and l + /i</<exp(/!) Proof Using (2.6c) and (2.6a) the second and third conclusions are immediate consquences of the first one. The inequality /n(z) < (fi(x))n/n\ is obvious if
218 Chapter 6 One Dimensional Diffusions n = 1. Using (i) the definition of/„ in (2.4), (ii) (2.6c), (iii) the definition of /x and /o, (iv) the result for n — 1, and (v) doing a little calculus, we have yx py /„(*)= / dy dz Jo Jo h(z) y , 2 yx yt/ < / d»/„-i(») / ^ JO Jo A(z) dyfn-i(3f)f[(jf) Xn-1 To complete the proof now we only have to prove (2.2a) and (2.2b). In view of (2.7) it suffices to prove these results with / replaced by fx. Changing variables z = <p(v), dz = <p'(v) dv then y = <p(u), dy = <p'(u) du we have yx yt/ 9 /!(*)= dy dz Jo Jo <p>W(z)y-aW(z)) -X yi = / «W JO Jo i^(y) 2 (ill • y/(u)a(u) y1^(x) ru 2 Jo Jo <p'{v)a(v) Letting x|r and using Pubini's theorem /i(r)=/ dv—7TTT\ du<p'(u) J0 <p'(v)a(v) Jv = 2/ dt;m(t;)(p(/?)-p(t;)) Jo A similar argument shows that the second expression in (2.1) is (except for a factor of 2) /x (£) and the proof is complete. D To see what (2.1) says in a concrete case we consider Example 2.1. Let a(x) = 1, b(x) = (1 + |a;|)5/2 where 8 > 0. When 6 < 1 the coefficients are Lipschitz continuous, so there is no explosion. We will now use (2.1) to reprove that result and show that explosion occurs when 5 > 1. When y < 0, <p'(y) > 1, so <p(— oo) = —oo and the second integral in (2.1) is oo.
Section 6.3 Recurrence and Transience 219 To evaluate the first integral we note that if y > 0 • W = exp (J -(1 + ,f *) = exp (-"+^';' + 1) so <p(oo) < oo. To use (2.1) now, we note that a(y) = 1 so m(y) = l/f'(y) so there is no explosion if and only if r, f(i+vy+s-i\ r°°, (-(i + uy+* + i\ oo = / dv exp -*—- / dw exp —-—-• Jo V 1 + 5 yy„ V\ 1 + 6 J yoo y JO Jv du exp ( _ = / "yX ^eXH 1 + 5 ) where in the last step we have changed variables x = u+1, y = v + 1 to get rid of the l's. The next lemma estimates the last expression and shows that it is < oo if 6 > 1 = oo if 0 < 6 < 1 (2.8) Lemma. If 6 > 0, f°° (yl+s -x1+s\ V I dXeXP{ (1 + 6) )^1 ^^°° Proof Changing variables x = y + zy~s we have f°° /y1+s-x1+s\ _c f°° ( ry+zy~s \ J, dX "* { 1 + 6 ) = y~ I d2eXp {- J, W dw) X ~~S Since z < fy+zy ws dw < zy~s(y + zy~s)s -+ z as y —* oo, the desired result follows from the dominated convergence theorem. D 6.3. Recurrence and Transience The natural scale, which was used in Section 6.1 to construct one dimensional diffusions, can also be used to study their recurrence and transience. Let Xt be a solution of MP(6, a) and suppose
220 Chapter 6 One Dimensional Diffusions (ID) 6 and <r are continuous with <r(x) > 0 for all x Let a(x) = ff2(z), and let <p be the natural scale denned by Let Ty = inf{t > 0 : Xt = y} and let r = Ta A Tb. The first detail is to show (3.1) Lemma. If a < x < b then Px(t < oo) = 1. Proof We saw in Section 6.1 that Yt = <p(Xi) is a solution of MP(0, h) where h given by (1.5) is positive and continuous. Thus (6.1) in Chapter 5 implies that Y can be constructed as a time change of Brownian motion, Bt: Y, = B(js) where j, = inf{t: <rt > s} and crt= ds/h(Bs) Jo Since Brownian motion will exit (<p(a), <p(b)) with probability one, Yt will exit (<p(a),<p(b)) with probability one, and Xt will exit (a, 6) with probability one. D With (3.1) established we can study the recurrence and transience as we did for Brownian motion. f(XtAr) is a uniformly bounded martingale, so the optional stopping theorem implies <p{x) = Ex<p(Xr) = f(a)Px(Ta < Tb) + <p(b){l - Px(Ta < Tb)} and solving we have <p(b) - tp(a) <p{b) - tp(a) Letting y(oo) = lirri6_oo <f{b) and <p(— oo) = lima-.-oo <p(a) (the limits exist since <p is strictly increasing) we have (3.3) Theorem. Suppose a < x < b. Px(Ta < oo) = 1 if and only if <p(oo) = oo. Px(Tb < oo) = 1 if and only if <p(— oo) = —oo. In one dimension we say that X is recurrent if Px(Ty < oo) = 1 for all y. Prom (3.3) it follows that
Section 6.3 Recurrence and Transience 221 (3.4) Corollary. X is recurrent if and only if <p(H) = R. This conclusion should not be surprising. Yt = <p{Xt) is a local martingale so it is a time change of a Brownian motion run for a random amount of time and if <p(H) 7^ R that random amount of time must be finite. To understand the dividing line between recurrence and transience we consider some examples. We begin by noting that the natural scale only depends on 6 and a through the ratio 26/a. Exercise 3.1. (i) Show that if b(y) < 0 for y > y0 then PX(T0 < oo) = 1 for all x > 0. (ii) Show that if b(y)/a(y) > e > 0 for y > yQ then PX(TQ < oo) < 1 for all x > 0. Example 3.1. The exercise identifies the interesting case as being 6(y) > 0 with b(y)/a(y) —* 0 as y —* oo. Suppose that for x > 0 26(z)/a(z) = C(l + x)~r where C, r > 0 If r > 1 then I Jo -C(l + z)~r dz = -K > -oo so <p'(y) > e~K for all y > 0 and <p(oo) = oo. If r < 1 I " -C(l + z)~r dz = --$- {(1 + y)1-' - 1} o I" and it follows that P(°o) = / Jo exP (-j3; {(i + ^)1-^1}) dy<™ In the borderline case r = 1 the outcome depends on the value of C. r -C(l + z)-1dz = -C\n(l + y) I" Jo /o so <p(oo)= J°°(l+y)-cdy |< oo if C > 1 oo if C < 1 D To check the last calculation note that Example 1.2 in Chapter 5 shows that the radial part of d dimensional Brownian motion has 6(r)/a(r) = (d — l)/2r, while Section 3.1 shows that Brownian motion is recurrent in d < 2 and transient in d > 2, which corresponds to C < 1 and to C > 1 respectively. Sticklers for
222 Chapter 6 One Dimensional Diffusions detail may complain that we do not have any Brownian motions denned for dimensions 2 < d < 3. However, this analysis suggests that if we did then they would be transient. Exercise 3.2. Suppose a and 6 satisfy (ID) and are periodic with period 1, i.e., a(y+1) = a(y) and b(y+1) = b(y) for all y. Find a necessary and sufficient condition for Xt to be recurrent. 6.4. Green's Functions Suppose Xt is a solution of MP(6,a), where 6 and a satisfy (ID). Let a < b be real numbers, which you should not confuse with the coefficients, let D = (a, 6), and let r = inf{£ : Xt ¢ (a, 6)}. Our goals in this section, which are accomplished in (4.5) and (4.4) below, are to show that (i) if g is bounded and measurable then Ex J g(Xs) ds= J GD(x, y)g(y) dy and (ii) to give a formula for the Green's function Gi)(x,y). As in the discussion of (8.1) of Chapter 4, we think of Gd(x, y) as the expected amount of time spent at y before the process exits D when it starts at x. The rigorous meaning of the last sentence is given by (i). The analysis here is similar to that in Section 4.5 but instead of the Lapla- cian, A, we are concerned with Lf(x) = ±a(x)f"(x) + b(x)f'(x) The first step is to generalize (1.3) from Chapter 3. (4.1) Lemma. supx6(a6) Exr < oo. Proof Let <p be the natural scale defined in (1.4) and consider Yj = <p(Xt). By (1.5) Yt is a local martingale with (Y)t = fQ h(Ys)ds where h(y) is positive and continuous. To estimate r = inf{t: Yt ¢ (y(a), f(b))} we let v = (<p(a) + <p(b))/2 be the midpoint of the interval, I = (ip(b) — ip(a))/2 be half the length of the interval, p = M{h(y) : y G (<p(a), <p(b))} > 0, f(x)=(f--(x-ur-)/p,
Section 6.4 Green's Functions 223 and note that / > 0 in (<p(a), <p(b)) with f"(x) = -2//). Ito's formula implies f(Yt) - /(¾) = local mart. - f 2&1 ds Our choice of p implies h(y) > p, so /(It) -M is a local supermartingale on [0,r). Using the optional stopping theorem at time rn An where rn = inf{t : Yt ¢ (y(a) + 1/n, y?(6) — 1/n)} we have /(*) > ^/(^An) + Ex(rn A n) Since 0 < f(x) < C~ j'p for x € [^(a), p(6)] we have £2//) > Ex(rn A n). Now, let n —» oo and use the monotone convergence theorem. D (4.2) Theorem. Suppose </ is bounded. If there is a function v with (i) v £ C2, Lv = — g in (a, 6) (ii) v is continuous at a and 6 with v(a) = v(b) = 0 then *(*) = £„ I' g(X,)ds Jo Proof Let Mt = v{Xt) + f* g(Xs) ds. (i) and (1.1) imply that for t < r v(Xt) — v(Xq) = local mart. — / g(Xs)ds Jo so Mt is a local martingale on [0, r). If w and g are bounded then for t < r |M,|<iM|oo + ry|oo (4.1) implies that the right-hand side is an integrable random variable so (2.7) in Chapter 2 and (ii) imply Mr = limMt= I g(Bt)dt *Tr JQ v{x) = EXMQ = EXMT = Ex f g(Bt) dt a Jo Solving (4.2). The next two pages will be devoted to deriving the solution to the equation in (4.2). Readers who are content to guess and verify the answer can skip to (4.4) now. To solve the equation it is convenient to introduce m^ = <p'(x)a(x)
224 Chapter 6 One Dimensional Diffusions where <p(x) is the natural scale and note that y } 2m(x) dx \<p'(x)dx) 2 dx0- 2 \ <p'(x) J dx JK> since the definition of the natural scale implies <p"(x)/<p'(x) = — 2b(x)/a(x). m is called the (density function of the) speed measure, though as we will see in Example 4.1 the rate of movement of the process near x is inversely proportional to m(x)l To simplify notation and to facilitate comparison with our source for this material, Karlin and Taylor (1981), Vol. II, we let s(x) = <p'(x) be the derivative of the natural scale. To solve equation Lf = — g now, we use (4.3) to write 1 dv\ „ . . . . = -2m(x)g(x) dx \s(x) dx and integrate to conclude that for some constant /? 1 dv s(y) dy = 13-2 J dzm(z)g{z) J a Multiplying by s(y) on each side, integrating y from a to x, and recalling that v(a) = 0 and s = <p' we have (a) v{x) = /3{<p(x) - <p(a)) -2 dys(y) / dz m{z)g(z) •J a " a In order to have v(b) = 0 we must have P=m-9{a)fadyS{v)[dZm{z)9{z) Plugging the formula for /? into (a) and writing u(x)=^-f]=Px(Tb<Ta) vw-vw we have (b) r*> ry v(x) = 2u(x) dys(y) dzm(z)g(z) J a J a -2 dys(y) dzm(z)g(z) •J a " a
Section 6.4 Green's Functions 225 Breaking the first f dy into an integral over [a, x] and one over [z, 6] we have v(x) = 2(u(x) - 1) / dys(y) dzm(z)g(z) •J a " a (c) + 2u(x) / dys(y) / dzm(z)g(z) J x J a Recalling s(y) = <p'(y) we have rb u(x) jf dy s(y) = ^j .¾ " (9(b) - <p{x)) = (1- u(x)) J' dy s(y) Multiplying the last identity by 2 Ja dz m(z)g(z)) we have rb rx px px 2u(x) dys(y) dz m(z)g(z) = 2(1 - u(x)) / dys(y) dzm(z)g(z) Jx Ja Ja Ja Using this in (c) we can break off part of the second term and cancel with the first one to end up with (d) rx rx v(x) = 2(1 - u(x)) / dys(y) dzm{z)g(z) J a Jv r» ry + 2u(x) dys(y) dzm(z)g(z) J X " X Using Fubini's theorem now gives (e) v(x) = 2(1 - u(x)) / dzm(z)g(z) dys(y) Ja Ja + 2u(x) / dzm(z)g(z) / dys(y) If we define G(x, z) to be 2 y(6)-y(a) '{<p{h) ~ ^x)) m{z) ^ * ~ * 2 y(6)-y((a) '{<f{z) ~ ^ m(z) ^ * ~ * then we have (f) v(x)= [ G(x,z)g(z)dz Ja
226 Chapter 6 One Dimensional Diffusions Combining the calculations above with (4.2) shows (4.5) Theorem. If g is bounded and measurable then Ex [ g(Xs)ds= [ G(x,z)g(z)dz J0 Ja Proof By the monotone class theorem it suffices to prove the result when g is continuous. To do this it suffices to show that the v denned in (f) satisfies the hypotheses of (4.2). To check (ii), we note that as x —» a or x —» 6, G(x, z) —» 0, and G is bounded so the bounded convergence theorem implies v(x) —» 0. To check that v £ C2 we note that y(6)~y(a) v(x) = (<p(b) - <p{x)) j\<p{z) - <p(a)) m(z)g(z) dz + (<P(*) ~ f(a)) / (<p(b) - <f(z)) m(z)g(z) dz Jx Differentiating and noting that the terms from the limits of the integrals cancel we have . y(6)~y(a) v'(x) = -<p'(x) j\<p{z) - p(a)) m(z)g(z) dz + <f'(x) / (<P(b) ~ <f(z)) ™(%(2) dz Jx Differentiating again we have <p{b) - <p(a) v"(x) = -<p"(x) f (<p(z)-<p(a))m(z)g(z)dz Ja + <P"{*) j (<f(i>) ~ <f(z)) m(z)9(z) dz Jx - f'{x)(<p(b) - <p(a))m(x)g(x) This shows v G C2. Multiplying the last two equations by 6(z) and a(x)/2, adding, then using L<p = 0 and the definition of m(x) we have so Lv = —g. This shows that v satisfies (i) in (4.2) and the result follows. D
Section 6.4 Green's Functions 227 Example 4.1. Speed measure? If X has no drift (e.g., Brownian motion or any diffusion on its natural scale) then <p(x) = x and (4.4) becomes ^,/ \ _ / 2m(z)(6 — z)(x — a)/(b — a) when z > x t*l*. z) - | 2m(z)(6 - x)(z - a)/(b - a) when z < x Taking g = 1, a = x — h, and b = x + h the last expression becomes Cl \ — / m(2)(a; + /i — 2) when z>x \ m(z)(z — x+h) when z <x Letting T(a,b) = inf{t : Xt ¢ (a, 6)}, and using (4.5) with j=lwe have rx+h (4.6) •E'xT'Cx-A.x+A) =/ (¢ + /1-2) m(z) c/z Jx + / (z — s + h)m(z)dz ~ m(s)/i2 Jx-A as h —» 0. Thus m(a;) gives us the time that .X"* takes to exit a small interval centered at 2, or to be precise, the ratio of the time for Xt to that for Brownian motion. For another interpretation of m note that (11.7) in Chapter 2 implies that if L% is the local time at x up to time t then for a bounded measurable g [ Lxtg(x)dx= [ «K*<M*}t= / g(Xs)a(Xs)ds J Jo Jo When Xt has no drift <p'(x) = 1 so m(x) = l/a(x) and taking g(x) = f(x)m(x) we have Im{x)L*f{x)dx= i f{Xs)ds Thus multiplying the local times by m(x) converts them into occupation times. This and the previous interpretation suggest that it would be natural to call m(x)dx the occupation measure. However,,it is too late to try to change the original name, speed measure. D Example 4.2. Second Proof of Feller's test. Let Tx = inf{t : Xt = 2}, Ta = limxx«rx and T^a,b) = TaATb. We content ourselves to establish a variant of (b) in (2.1). The first of two steps is (4.7) Lemma. Let 0 < 6 < /?. Then (<p(z) — <p(a)) m(z) dz < 00 I J a
228 Chapter 6 One Dimensional Diffusions if and only if infa<a<o Po(Ta < Tb) > 0 and supQ<a<0 EQ(Ta A Tb) < oo. Proof (3.2) implies <p(b) - <p(a) so infQ<a<0 Po(Ta < Tb) > 0 if and only if <p(a) > -oo. Taking g = 1 in (4.5) and using (4.4) we have (4.8) + 2lM^)[{*{z)-*{a))m{z)dZ The first integral always stays bounded as a J. a. So Eqt^^ stays bounded as a —» a if and only if y(or) > —oo and / (<p(z) - <p(a))m(z) dz < J a oo D The-two conditions in (4.7) are equivalent to Po(Ta < Tb) > 0 and Eo(TaA Tb) < oo. To complete the proof now we will show (4.9) Lemma. If PQ(Ta <Tb)>0 then E0(Ta A Tb) < oo. Proof P0(Ta < Tb) > 0 implies P0(Ta < oo) > 0. Pick M large enough so that P0(Ta < M) > e > 0 and that P0(?6 < M) > e > 0. By considering the first time the process starting from 0 hits x and using the strong Markov property, it follows that if 0 < x < b then P0(Tb <t) = E0(Px(Tb < t - TX);TX < t) < Px(Tb < t)P0(Tx <t)< Px(Tb < t) A similar argument shows Px(Ta <t)> PQ(Ta <t) for a < x < 0 Combining the last two results shows that Px(Ta A Tb < M) > e > 0 for all a < x < b
Section 6.5 Boundary Behavior 229 Letting r = Ta ATj and repeating the proof of (1.2) in Chapter 3 now, we have Px(t > kM) = Ex(PXm(t > (k - 1)M); r > M) <(l-e) sup Py(r> (k-l)M) a<y<b It follows by induction that for all integers k > 1 sup P(t > kM) < (1 - ef x6(a,6) Using (5.7) in Chapter 1 of Durrett (1995) now, we get an upper bound on all the moments of the exit time r. D 6.5. Boundary Behavior In Section 6.1 when we constructed a diffusion on a half line or a bounded interval, we gave up when the process reached the boundary and we said that it had exploded. In this section, we will identify situations in which we can extend the life of the process. To explain how we might continue we begin with a trivial example. Example 5.1. Reflecting boundary at 0. Suppose b(x) = 0, and <r(x) is positive and continuous on [0,oo). To define a solution to dXt = a(Xt) dBt with a "reflecting boundary at 0," we extend a to R by setting a(—x) = cr(x), let Yi be a solution of dYt = ff(Yt) dBt on R and then let Xt = |lt|. To use the last recipe in general we start with Xt a solution of MP(6,a) and let <p be its natural scale. Yt = f(Xt) solves MP(0,/i) where h(y) = ^(9-1 (y))}2a(<p-\y)) To see if we can start the process Yt at 0 we extend h to the negative half-line by setting h(—y) = h(y) and let Zt be a solution of MP(0,/i) on R. If we let rh(y) = l//i(|y|) be the speed measure for Zt, which is on its natural scale, we can use (4.6) and the symmetry m(—y) = fh(y) to conclude 1 fc 2^07-(-^)= / {e-y)m(y)dy Changing variables y = <p(x), dy = <p'(x) dx, e = <p(6) the above fs 1 Jo <p\x)a{x)
230 Chapter 6 One Dimensional Diffusions Now, (i) recalling that the speed measure for Xt is m(x) = l/<p'(x)a(x), and changing notation <p'(z) = s(z), and (ii) using Fubini's theorem, the above = /1/ s(z) dz J m(x) dx = / / m(a;) dxs(z) dz Introducing M as the antiderivative of m to bring out the analogy with the condition in Feller's test (2.1), we have (5.1) M-«,o = 2 / (M(z)-M(0))s(z)dz Jo To see that this means that the process cannot escape from 0, we note that repeating the proof of (4.9) shows (5.2) Lemma. If P0(t(_£,£) < oo) > 0 then £roT(_££) < oo. Proof Pick M so that Pq{t^_£j£) < M) = S > 0. Using the strong Markov property as in the proof of (4.9) we first get sup -Px(t(_£i£) > M) < 1 — 6 *e(-e,0 and then conclude that for each integer k > 1 sup Px (r(_ £j£) > kM) < (1 - S)k x£(-£,0 from which the desired result follows. □ Consider a diffusion on (0,r) where r < oo, let q G (0,r), and let 1= /V(2)-^,(0)) m(2)d2 Jo J = / (M(z) - M(0)) s(z) dz Jo Feller's test implies that when I < oo we can get IN to the boundary point. The analysis above shows that when J < oo we can get OUT from the boundary point.
Section 6.5 Boundary Behavior 231 This leads to four possible combinations, which were named by Feller as follows I J name < oo < oo regular < oo = oo absorbing = oo < oo entrance = oo = oo natural The second case is called absorbing because we can get in to 0 but cannot get out. The third is called an entrance boundary because we cannot get to 0 but we can start the process there. Finally, in the fourth case the process can neither get to nor start at 0, so it is reasonable to exclude 0 from the state space. We will now give examples of the various possibilities. Example 5.2. Feller's branching diffusion. Let dXt = /3Xt dt + <ry/TtdBt Of course we want to suppose a > 0 but we will also suppose /? > 0 since the calculations are somewhat different in the cases /? = 0 and /? < 0. See Exercise 5.1 below. Using the formula in (1.4), the natural scale is "w = i "*■*{[ ^tiz)d» -f Jo e-2^/* dy = -(1 - e-W*'* /o 2/? which maps [0,oo) onto [0,cr2/2/?). The speed measure is m& = ^TTTT = e2/,s/*3/(^) (p'{x)a{x) To investigate the boundary at 0, we note that I = [ m(x)(<p(x) - <p(0)) dx= [ -j- (e2^la2 - l) dx < oo JO Jo *PX ^ ' since the integrand converges to 1/cr2 as x —* 0. To calculate J we note that m(x) ~ l/(<r2x) as x —» 0 so M(0) = —oo and J = oo. The combination I < oo and J = oo says that the process can get into 0 but not get out, so 0 is an absorbing boundary. To investigate the boundary at oo, we note that /oo yoo -I m(x)(ip(oo) - <p(x)) dx = J ^dx = 00
232 Chapter 6 One Dimensional Diffusions To calculate J we note that when /? > 0, m(x) > l/(cr2z) so M(oo) = oo and J = oo. The combination I = oo and J = oo says that the process cannot get into or out of oo, so oo is a natural boundary. Exercise 5.1. Show that 0 is an absorbing boundary and oo is a natural boundary when (a) /? = 0, (b) /? < 0. Example 5.3. Bessel process. Consider dXt = ^-dt + dBt Here 7 > —1 is the index of the Bessel process. To explain the restriction on 7 and to prepare for the results we will derive, note that Example 1.2 in Chapter 5 shows that the radial part of d dimensional Brownian motion is a Bessel process with j = (d — 1). The natural scale is <p(x) - \ exp f- / j/zdz) dy [*_-,, f In x if 7 = 1 = k " ^=((^-1)/(1-7) *7#1 From the last computation we see that if 7 > 1 then <p(Q) = —00 and I = 00. To handle — 1 < 7 < 1 we observe that the speed measure So taking q = 1 in the definition of I ri 2i-7 I = / 27 dz < 00 Jo 1-7 To compute J we observe that for any 7 > —1, M(z) = z7+1 /(7 + 1) and /•1 27+i J = / -2 7 dz < 00 Jo 7+1 Combining the last two conclusions we see that 0 is an entrance boundary if 7€[l,oo) regular boundary if 76(-1,1)
Section 6.5 Boundary Behavior 233 The first conclusion is reasonable since in d > 2 we can start Brownian motion at 0 but then it does not hit 0 at positive times. The second conclusion can be thought of as saying that in dimensions d < 2 Brownian motion will hit 0. Exercise 5.2. Show that 0 is an absorbing boundary if j < —1. We leave it to the reader to ponder the meaning of Brownian motion in dimension d < 0. Example 5.4. Power noise. Consider dXt = Xf dBt on (0, oo). The natural scale is <p(x) = x and the speed measure is m(x) — l/(<p'(x)a(x)) = x~2S so oo if 5 < 1 oo if 5 > 1 7-|.■-»*-{< When 6 > 1/2, M(0) = -oo and hence J = oo. When 8 < 1/2 - C — rl 2l-24 J = I . 2S dz < oo Combining the last two conclusions we see that the boundary point 0 is natural if 5 £ [l,oo) absorbing if 6 G [1/2,1) regular if 6 £ (0,1/2) The fact that we can start at 0 if and only if 6 < 1/2 is suggested by (3.3) and Example 6.1 in Chapter 5. The new information here is that the solution can reach 0 if and only if 5 < 1. For another proof of that result, recall the time change recipe in Example 6.1 of Chapter 5 for constructing solutions of the equation above and do Exercise 5.3. Consider hs= i °i*rMi(i».<i)dfl and Hl= f V2V.>i Jo Jo \ds Show that (a) for any 6 > 0, P\{H'S < oo) = 1. (b) E^Hs < oo when 6 < 1. (c) Pi(Hg = oo) = 1 when 6 >l. Exercise 5.4. Show that the boundary at oo in Example 5.4 is natural if 6 < 1 and entrance if 8 > 1.
234 Chapter 6 One Dimensional Diffusions Remark. The fact that we cannot reach oo is expected. (5.3) in Chapter 5 implies that these processes do not explode for any 5. The second conclusion is surprising at first glance since the process starting at oo is a time change of a Brownian motion starting from oo. However the function we use in (5.1) of Chapter 5 is g(x) = x~2S so (2.9) in Chapter 3 implies EM I g(Bs)l(B.<M)ds= I x~2&-Ixdx which stays bounded as M —» oo if and only if 6 > 1. Exercise 5.5. Wright-Fisher diffusion. Recall the definition given in Example 1.7 in Chapter 5. Show that the boundary point 0 is absorbing if /? = 0 regular if /? G (0,1/2) entrance if /? > 1/2 Hint: simplify calculations by first considering the case a = 0 and then arguing that the value of a is not important. 6.6. Applications to Higher Dimensions In this section we will use the technique of comparing a multidimensional diffusion with a one dimensional one to obtain sufficient conditions (a) for no explosion, (b) for recurrence and transience, and (c) for diffusions to hit points or not. Let Sr = inf{t : \Xt\ = r} and 5«, = limr_oo5y. Throughout this section we will suppose that Xt is a solution to MP(6, a) on [0,5^,), and that the coefficients satisfy (i) 6 is measurable and locally bounded (ii) a is continuous and nondegenerate for each x, i.e., Y^y'anix)^ > 0 when y ^ 0 >,i The first detail is to show that Xt will eventually exit from any ball. (6.1) Theorem. Let Sr = inf{t : \Xt\ = r}. Then there is a constant C < oo so that ExSr < C for all x with \x\ < r. Proof Assumptions (i) and (ii) imply that we can pick an a large so that if \x\ < r ^aii(x)>l+\b{(x)\
Section 6.6 Applications to Higher Dimensions 235 for all x with \x\ < r. Let h(x) = ^,-=1 cosh(az,-). Using Ito's formula we have rt d h{Xt) - h{X0) = f y)asmh{aXi)bi(Xs)ds + local mart. 1 Jo ,-=i Noting that cosh(y) > 1 and ey e-y cosh(y) > max [-^,-^-) >\ sinh(y)| our choice of a implies 0 a cosh(aXi)au(Xs) + asinh(aXj>f(X,) > acosh(aXi) (|a«(X,) - |6,-(X,)|) > acosh(aX;') > a so h(Xt) — tad is a local submartingale on [0, Sr). Using the optional stopping theorem at time Sr A t we have 0 < h(x) < Ex{h(XSrM) ~ ad(Sr A t)} Since h(XsrAt) < rfcosh(ar) it follows that Ex(Sr At)< cosh(ar)/a Letting t —» oo gives the desired result. D Remark. Tracing back through the proof shows C = cosh(ar)/a where a is chosen so that |alV(z) > 1 + |6,-(a:)| To see that the last estimate is fairly sharp consider a special case Example 6.1. Suppose d = 1, a(x) = 1, and b(x) = — /?sgn(z) with /? > 0. The natural scale has <p(x) = J\>»dy=±{eV-l)
1 = e-2/J|x| 236 Chapter 6 One Dimensional Diffusions for x > 0 and <p(—x) = — <p(x), so the speed measure <p>(x) Using (4.8) now with 6 = r, a = —r, x = 0 and noting that o f{r) ~ ¥>(0) _ 2y(0) ~ y(-r) _ x p(r) - p(-r) p(r) - <p(-r) we have £°r<-">=[ h{'2" "^ '""dz The two integrals are equal so £0^) = 1^(6^-)-1)^ To compare with (6.1) now, note that when a(x) = 1 and b(x) = — /?sgn(z), the remark after the proof says we can pick a = 2/? + 2 so the inequality in (6.1) is pT . cosh((2/?+2)r) a. Explosions With the little detail of the finiteness of ExSr out of the way we now introduce a comparison between a multidimensional diffusion and a one dimensional one, which is due to Khasminskii (1960). We begin by introducing three pairs of definitions that may look strange but are tailor made for computations in (d), (e), and (a) of the proof of (6.2). a(x) = 2x a(x)x and /?(z) = yj2z,-6,(z) + ag(x) 1=1 v(r) = sup { : \x\2 = r > and p(r) = sup {a(x) : \x\2 = r} I a(z) J n(r) = p(r)i/(r) and 6(r) = \/2p(r)
Section 6.6 Applications to Higher Dimensions 237 Our plan is to compare Rt = \Xt\2 with dZt = n(Zt) dt + 6{Zt) dBt on (0, oo) So we introduce the natural scale <p(x) and speed measure m(x) for Zt: <p(x) = / s(y) dy m(x) = l/(s(x)02(x)) In view of Feller's test, (2.1), the next result says that if Zt can't reach oo in finite time then Xt does not explode. (6.2) Theorem. There is no explosion if dx m(x)((p(oo) — <p(x)) = oo I Proof To prove the result we will let Rt = \Xt\2 and find a function g with g(r) —* oo as r —» oo so that e~ig(Rt) is a supermartingale while Rt > 1. To see that this is enough, let Sr = inf{t : Rt = r} and use the optional stopping theorem at time S\ A Sn A t to conclude that for \x\ > 1 ff(M2) > c-*j(n2)Ps(S„ < Sx At) Letting 5«, = linv^oo Sr, then n —» oo and i->oowe have Px(Soo < Si) — 0. To define the function #, we follow the proof of (2.1). Let Yt = <p{Zt), use the construction there to produce a function / so that e~tf{Yt) is a local martingale, then take g = / o <p where <p is the natural scale of Zt ■ It follows from our assumption and (2.2a) that g(r) —* oo as r —» oo. Since ^ G C2 it follows from Ito's formula that e~VZ«) -ff(Zo) = -/ e-'ff(Z.)<h Jo + e-'g'(Zs)n(Zs)ds + localma.it. Jo + \j\->g"{Z1)6\Z1)ds
238 Chapter 6 One Dimensional Diffusions Since e~ig{Zt) is a local martingale it follows that \e\z)g"{z) + n{z)g\z) = g{z) Using the definition of 6 and n now we can rewrite this as (a) P(z)g"(z) + v{z)p{z)g'{z) = g(z) Before plunging into the proof we need one more property which follows from (2.6c) and the fact that our natural scale is an increasing function with <p(l) = 0: (b) g'(r) > 0 for r > 1 A little calculus gives Dig(\*\2) = g'(\x\2)2x{ (c) Dijg(\X\') = g"(\x\0-)4xiXj i#j Diig(\X?) = <A|z|2)4z? + g'(\Xf)2 Using Ito's formula now we have e-tg(Rt)-g(Ro)= [ -e-'g{Rs)ds Jo + Y] I e~'g'{Rs)2Xibi{Xs) ds + local mart. . Jo + 111^ e-'g"(R,)4XixiatJ(Xt) ds + \J2J e-'g'(Rs)2a!i(Xs)ds Using the functions a(r) and /?(r) introduced above we can write the last equation compactly as e~tg(Ri) — g(Ro) = local mart. (d) + f e~' {-9(R,) + P(Xt)g'(Rs) + a(Xs)g"(Rs)} ds Jo To bound the integral when Rs > 1 we use (i) the definition of v and (b), then (ii) the equation in (a), the fact that g > 0, and the definition of p -g(R<) + {§§<? W + 9"(R,)} «(*.) (e) < -g(R,) + MR,)g'(R,) + g"(R,)}a(X,) £-^)+{S)}^)£0
Section 6.6 Applications to Higher Dimensions 239 Combining (d) and (e) shows that e~ig(Rt) is a local supermartingale while Rt>l and completes the proof. D b. Recurrence and Transience The next two results give sufficient conditions for transience and recurrence in d > 1. In this part we use the formulation of Meyers and Serrin (1960). Let -, \ x , \ x j, \ 2i • 6(i) + tr (a(i)) \x\ \x\ a(x) where tr (a(z)) = £\a,-,-(z) is the trace of a. We call de(x) the effective dimension at x because (6.3) and (6.4) will show that there will be recurrence or transience if the dimension is < 2 or > 2 in a neighborhood of infinity. To formulate a result that allows de(x) to approach 2 as \x\ —» oo we need a definition: S(t) is said to be a Dini function if and only if In the next two results we will apply the definition to S(t) = exp(- f~ds) To see what the conditions say note that S(t) = (logt)~p is a Dini function if p > 1 but not if p < 1 and this corresponds to e(s) = p/logs. (6.3) Theorem. Suppose de(x) > 2(1 + e(|z|)) for |x| > R and 8(t) is a Dini function, then Xt is transient. To be precise, if \x\ > R then Px(Sr < oo) < 1. Here Sr = inf{t : \Xt\ = r}, and being transient includes the possibility of explosion. (6.4) Theorem. Suppose de(x) < 2(l+e(|z|)) for \x\ > R and 6(t) is not a Dini function, then Xt is recurrent. To be precise, if \x\ > R then Px(Sr < oo) = 1. Proof of (6.3) The first step is to let
240 Chapter 6 One Dimensional Diffusions which has ¢^) = =1^(- £ '-&dt\ <o and hence satisfies r<p"(r) + (1 + e(r))<p'(r) = 0 Letting -Rt = \Xi\2 and using Ito's formula we get the following which we will use four times below: (6.5) Lemma. If rg"(r) + j(r)g'(r) = 0 for r G (u2, v2) and g'(\x\2) ■ {de{x) - 27(|z|2)) < 0 when \x\ £ (u,v), then g(Rt) is a local supermartingale while u < \Xt\ < v. Proof Using Ito's formula with (c) from the proof of (6.2) g(Rt) - g(R0) = Y] [ g'(R,)2X'MXs) ds + local mart. + \j2g''(RMxixiaij(xs)ds 'i i = local mart. + j 2a{Xs) (Rsg"{R3) + g\R,)^y^) ds Using our two assumptions on g, we get for u < \X%\ < v R,g"(Rs) + <7W^4P = g'(R>) (^1 - 7(¾)) < 0 which proves the result. D (6.5) implies that <p(Rt) is a local supermartingale while Rs > R2. If R < r < s < oo then using the optional stopping theorem at time r = Sr A S, ((6.1) implies that Px(t < oo) = 1) we see that <p(\x\2) > Ex<p(Rr) > f(r2)Px(Sr < Ss) Rearranging gives Px(Sr < Ss) < ^)^- < 1
Section 6.6 Applications to Higher Dimensions 241 The upper bound is independent of s and (6.3) follows. □ Proof of (6.4) This time we let *»-jr-(-jr?-)* s Since i>'(r) = —<p'{r) we again have rip"{r) + (1 + e(r) )tf'(r) = ° but this time ij)'{r) > 0. Since we have assumed de(x) < 2(l + e(|z|)) for \x\ > R it follows from (6.5) that i[>(Rt) is a local supermartingale while .¾ > R2. If R < r < s < oo using the optional stopping theorem at time r = Sr A S, ((6.1) implies that Px(r < oo) = 1) we see that V-(|*|2) > Ex^{Rr) = ^(r2)Px(Sr < St) + 4>(s2){l - P(Sr < S,)} Rearranging gives Pis <S)>^2)-^N2) V(s-) - ip(r~) Letting s —► oo and noting that i[>(s2) —» oo since 5(f) is not a Dini function gives the desired result. D c. Hitting Points The proofs of (6.3) and (6.4) generalize easily to investigate whether multidimensional diffusions hit 0. Let T0 = ini{t > 0 : Xt = 0}. (6.6) Theorem. If de(x) > 2(1 - e(|z|)) for \x\ < rj and then PX(T0 < oo) = 0. (6.7) Theorem. If de(x) < 2(1 - e(|z|)) for |x| < tj and leXP(-/ 1 e(z) , \ dy -^-1 dz) — < oo y then PX(T0 < oo) > 0.
242 Chapter 6 One Dimensional Diffusions We have changed the sign of e to make the proofs more closely parallel the previous pair. Another explanation of the change is that Brownian motion is recurrent in d < 2 hits points in d < 2 is transient in d > 2 doesn't hit points in d > 2 so the boundary is now a little below 2 dimensions rather than a little above. Proof of (6.6) Let *>-jf-(-jfW which has ^0-) = =^(-^1^) <o ^) = 1^)^(-/1^) and hence satisfies r<p"(r) +(1- e{r))<p'(r) = 0 Letting Rt = \Xt\2 and using (6.5) shows <p(Rt) is a local supermartingale while Rt £ (0,rf). Ii0<r<s<7] using the optional stopping theorem at time r = Sr A Ss we see that <P(\*\2) > EMRt) > f(r2)P*(Sr < St) Rearranging gives P*(Sr < Ss) < <p(\*\2)/<p(r2) Letting r —» 0 and noting <p(r2) —» oo gives PX(T0 <Sn) = 0 for 0 < |x| < t) To replace Sn by oo in the last equation let (To = 0 and for n > 1 let r„ = inf{i > cr„_i : \Xt\ = 7)} (rn=mf{t>Tn:\Xt\ = 7)/2} The strong Markov property implies that the <rn — rn are i.i.d. so <rn —» oo as n —» oo. It is impossible for the process to hit 0 in [rn, crn] and hitting 0 in [<?n) Tn+i] has probability 0 so we have PX(TQ < oo) = 0 for 0 < |x| < 7)
Section 6.6 Applications to Higher Dimensions 243 Using (3.2) and the Markov property now extends the conclusion to \x\ > tj. D Proof of (6.7) This time we let Since i>'(r) = — <p'{r) we have ij)'{r) > 0 and nl>"{r) + (1- e(r))V-'(r) = 0 Letting -Rt = \Xt\2 and using (6.5) shows i>(Rt) is a local supermartingale while -Ri G (0,7j2). If 0 < r < s < 7} using the optional stopping theorem at time r = Sr A Ss we see that V-(|x|2) > E^Rr) = Hr2)Px(Sr < Ss) + tK*2){1 - Px(Sr < S,)} Rearranging gives p(s <s)>^2)-w2) V>(s2) - V>(r2) Letting r —» 0 and noting ^(r2) —» 0 gives Px(To < oo) > 0 for |x| < i} Using (3.2) and the Markov property now extends the conclusion to \x\ >r).U (6.8) Corollary. If (i) d > 3 or (ii) d = 2 and a is Holder continuous then PX(T0 < oo) = 0 for all ¢. Proof Referring to (4.2) in Chapter 5, we can write a(0) = UTAU where A is a diagonal matrix with entries A;. Let T be the diagonal matrix with entries 1/t/AJ and let V = UTT. The formula for the covariance of stochastic integrals implies that Xt = VXt has a(0) = VTa(0)V = I So we can without loss of generality suppose that a(0) = I. In d > 3, the continuity of a implies de(x) —* d as z —» 0 so we can take e(r) = 0 and the desired result follows from (6.6). If d = 2 and a is Holder continuous with exponent 5 we can take e(r) = Cr4. To extract the desired result from (6.6) we note that
244 Chapter 6 One Dimensional Diffusions since the exponential in the integrand converges to a positive limit as y —* 0. D It follows from (6.7) that a two dimensional diffusion can hit 0. Example 6.1. Suppose b(x) = 0, a(0) = I, and for 0 < \x\ < 1/e let a(x) be the matrix with eigenvectors vi = -r~(xi,x2) ^2 = 7-7 (-22,21) \x\ \x\ and associated eigenvalues Ai = 1 A2 = l-2p/log|z| These definitions are chosen so that de(x) = 2 — 2p/log(|z|) and hence we can take e(r) = p/log(r). Plugging into the integral r1'' ( y1/6 Pdz \ dy y1/6 , , „ 1X dy / exp - / —■ — =/ exp(plog logy ) — Jo V Jy zlos2/ y Jo V y1/e dy = / y <oo ifp> 1 Jo y|logy|p Remark. The problem of whether or not diffusions can hit points has been investigated in a different guise by Gilbarg and Serrin (1956), see page 315.
Diffusions as Markov Processes In this chapter we will assume that the martingale problem MP(6,a) has a unique solution and hence gives rise to a strong Markov process. As the title says, we will be interested in diffusions as Markov processes, in particular, in their asymptotic behavior as t —* oo. 7.1. Semigroups and Generators The results in this section hold for any Markov process, Xt. For any bounded measurable /, and t > 0, let Ttf(x) = Exf(Xt) The Markov property implies that Extf[Xt+t)\Tt) = ExJ(Xt) = Ttf(Xt) So taking expected values gives the semigroup property (1.1) Tt+tf{x) = T.(Ttf)(x) for s,t>0. Introducing the norm ||/|| = sup \f(x)\ and L°° = {/ : ||/|| < oo} we see that Tt is a contraction semigroup on L°°, i.e., (1-2) ||Ti/|| < ll/H Following Dynkin (1965), but using slightly different notation, we will define the domain of T to be V(T) = {/ G L°° : \\Ttf - f\\ - 0 as t - 0} As the next result shows, this is a reasonable choice. (1.3) Theorem. V{T) is a vector space, i.e., if fltf2 € V(T) and cx,c2 G R, ci/i + c2/2 G V(T). V(T) is closed. If / G V{T) then T,/ G V(T) and s -* Tsf is continuous.
246 Chapter 7 Diffusions as Markov Processes Remark. Here and in what follows, limits are always taken in, and hence continuity is defined for the uniform norm || ■ ||. Proof Since Tt(ci/i + C2/2) = c\Ttf\ + 02^/2, the first conclusion follows from the triangle inequality. To prove V(T) is closed, let /„ G V(T) with ll/n - /|| -* 0. Let c> 0. If we pick n large ||/„ - /|| < e/3. /„ G V(T) so \\Ttfn — /n|| < e/3 for t < t0. Using the triangle inequality and the contraction property (1.2) now, it follows that for t <to \\Ttf - /|| < \\Ttf - Ttfn\\ + \\Ttfn -fn\\ + \\fn - /|| <ll/-/n|| + e/3 + e/3<e To prove the last claim in the theorem we note that the semigroup and contraction properties, (1.1) and (1.2), imply that if s < t \\Ttf - Ttf\\ = ||r,(7U/ - /)|| < ||7W - /|| D Remark. While V(T) is a reasonable choice it is certainly not the only one. A common alternative is define the domain to be Co, the continous functions which converge to 0 at 00 equipped with the sup norm. When the semi-group Tt maps C0 into itself, it is called a Feller semi-group. We will not follow this approach since this assumption does not hold, for example, in d = 1 when 00 is an entrance boundary. The infinitesimal generator of a semigroup is defined by HO h Its domain V(A) is the set of / for which the limit exists, that is, the / for which there is a g so that Thf-f g 0 as h -*0 Before delving into properties of generators, we pause to identify some operators that cannot be generators. (1.4) Theorem. If / G V(A) and f(x0) > f(y) for all y then Af(x0) < 0. Proof Since x0 is a maximum, Ttf(x0) — f(x0) < 0 so Af(x0) < 0. D
Section 7.1 Semigroups and Generators 247 Example 1.1. The property in (1.4) looks innocent but it eliminates a number of operators. For example, when k > 3 is an integer we cannot have Af = f(k\ the fcth derivative. To prove this, we begin by noting that gk(x) = 1 — x2 + xk has a local maximum at 0. From this it follows that we can pick a small 6 > 0 and define a C°° function /jt > 0 that agrees with (¾ on (—5,6) and has a strict global maximum there. Since Afk(0) = kl > 0, this contradicts (1.4). D Returning to properties of actual generators, we have (1.5) Theorem. V(A) is dense in V(T). If / G V(A) then Af G V(T), TJ G V(A), and -Ttf = ATtf = %Af Tif-f= i ZAfds Jo Remark. Here we are differentiating and integrating functions that take values in a Banach space, i.e., that map s G [0,oo) into L°°. Most of the proofs from calculus apply if you replace the absolute value by the norm. Readers who want help justifying our computations can consult Dynkin (1965), Volume 1, pages 19-22, or Ethier and Kurtz (1986), pages 8-10. Proof Let / G V(T) and ga = fQa Tsf ds. To explain the motivation for the last definition, we note that |^-/| < \ fa\\Tsf -/|| ds^O II a II a J0 asa-^0 since / G V(T). Now Th ga = f T3+hfds = ga+ f TJds- f TJds J0 Ja Jo so we have Thga 9a_{Taf_f) <\[ Pi/-r0/||d- + lJQh\\Tsf-f\\ds^0 This shows that ga G V{A) and Aga = Taf — f. Since the first calculation shows ga/a G V(T) and converges to / as a —* 0 the first claim follows.
248 Chapter 7 Diffusions as Markov Processes The fact that Af G V(T) follows from (1.3) which implies (Thf - f)/h G V(T) and V(T) is closed. To prove that Ttf G V(A) we note that the contraction property and the fact that / G V{A) imply ThTtf-TJ -TAf Tt(2£f£-Af < Tnf-f -,4/-0 The last equality shows that ATtf = TAf and that the right derivative of Ttf is TAf. To see that the left derivative exists and has the same value we note that Tf - T-hf -TAf Tt_h{Thl_l_Af -Af + \\Tt-h(Af-ThAf)\\ Tnf-f + \\Af-ThAf\\^0 as h -* 0 since / G V(A) and Af G V(T). The final equation in (1.5) is obtained by integrating the one above it. D To make the connection between generators and martingale problems we will now prove (1.6) Theorem. If / G V{A) then for any probability measure fi M{ = f(Xt) - f(XQ) - f Af(Xs) ds Jo is a Pf, martingale with respect to the filtration T% generated by Xt- Proof Since / and Af are bounded, M{ is integrable for each t. Since M/ is Ts measurable £p(M/|^) = Ml + £p (f(Xt) - f(Xt) - J Af(Xr) dr T^j By the Markov property the second term on the right-hand side is = EX, (f(Xt-s) ~ f(Xo) - Jq ' Af(Xr) dr^j
Section 7.1 Semigroups and Generators 249 But for any y, it follows from the definition of Tr and Fubini's theorem that Ey(f{Xt-.)-f(X0)-J ' Af{Xr)dr) = 2}_,/(y) - f(y) - I ' TrAf(y) dr Jo which is 0 by (1.5) D Our final topic is an important concept for some developments but will play a minor role here. For each A > 0 define the resolvent operator for bounded measurable / by Uxf(x)= e-XiTtf(x)dt = Ex e-Xtf(Xt)dt Jo Jo Clearly \\U\f\\ < ||/[|/A. A more subtle fact is that (1.7) Theorem. Ux maps V(T) 1-1 onto V(A). Its inverse is (A - A). Proof Let A^f = (Thf — f)/h. Plugging in the definition of Ux and changing variables s = t + h and s = (we have AhUxf = fiJ e-Xi(Tt+hf-Ttf)dt / e-x'T,fds-- / e~XsTsfds Jh " Jo h To conclude that Uxf £ V(A) and AUxf = Wxf - / we note that \Uxf-f = \j e-x'T3fds--J fds and compare the last two displays to conclude (using ||T</|| < ||/||) that eAA_l 1 /-°° /•* \\AhUxf-(\Uxf-f)\\< / e-*'||/||<fa + A/ e-x'\\f\\ds Jh Jo + \ J\l - e-^)||/|| ds+±£ \\T.f - /|| ds -+ 0 as h -+ 0. The formula for AUxf implies that (A - A)Uxf = /• Thus Ux is 1-1 and its inverse is A — A.
250 Chapter 7 Diffusions as Markov Processes To prove that U\ is onto, let / £ ~D(A) and note (1.5) implies f°° Ux(\f-Af)= / e-XtTt(Xf-Af)dt Jo = J \e-XtTtfdt-J e-Xt-Ttfdt Integrating by parts jf e-x<fTtfdt=(e-x<Ttf)\~ + jQ Xe'^TJ dt Combining the last two displays shows U\(\f — Af) = / and the proof is complete. D 7.2. Examples In this section we will consider two families of Markov processes and compute their generators. Example 2.1. Purejuinp Markov processes. Let (S,S) be a measurable space and let Q(x, A) :Sx5->[0, oo) be a transition kernel. That is, (i) for each A G S, x —» Q(x, A) is measurable (ii) for each x £ S, A —* Q(x,A) is a finite measure (iii)Q(z, {*}) = () Intuitively Q(x, A) gives the rate at which our Markov chains makes jumps from x into A and we exclude jumps from z to z since they are invisible. To be able to construct a process Xt with a minimum of fuss, we will suppose that A = supxgs Q(x,S) < oo. Define a transition probability P by P(x,A) = jQ(x,A) iixgA P(z,{z» = i(A-Q(z,S)) Let ii,^2,..- be i.i.d. exponential with parameter A, that is, P(t,- > t) = e~Xt. Let Tn = ti + ■ ■ ■ + tn for n > 1 and let To = 0. Let Yn be a Markov chain with transition probability P(x,dy) (see e.g., Section 5.1 of Durrett (1995) for a defintion), and let Xt=Yn fort€[rn>rn+1)
Section 7.2 Examples 251 Xt defines our pure jump Markov process. To compute its generator we note that Nt = sup{n : Tn < t} has a Poisson distribution with mean Xt so if / is bounded and measurable Ttf(x) = Exf(Xt) = e-xtf(x) + Xte~xt J P(x, dy)f(y) + 0(t2) Doing a little arithmetic it follows that -XJp(x,dy)(f(y)-f(x)) = t-\e-Xi-l + Xt)f(x) + X(e-Xi-l)Jp(x,dy)f(y) + 0(t) Tif(x)-f(x) t = t-\e-Xi-l + Xt)f{x) Since t~1(e Xi — 1 + Xt) —»-0, e xt —* 1, and P(x, dy) is a transition probability it follows that Ttf(x) - f(x) t -XJp(x,dy)(f(y)-f(x)) as t —» 0. Recalling the definition of P(x, dy) it follows that (2.1) Theorem. V(A) = V(T) = L°° and Af(x)= JQ(x, dy)(f (y)-f(x)) ' We are now ready for the main event. Example 2.2. Diffusion processes. Suppose we have a family of measures Px on (C, C) so that under Px the coordinate maps Xt(<j) = w(t) give the unique solution to the MP(6, a) starting from x, and T% is the filtration generated by the Xt. Let C\ be the C2 functions with compact support. (2.2) Theorem. Suppose a and 6 are continuous and MP(6,a) is well posed. Then C\ C V(A) and for all / £ C\ WV) = \ X>i(*)A//(*) + X>(*)A/(z)
252 Chapter 7 Diffusions as Markov Processes Proof If / G C2 then Ito's formula implies f(Xt) - f(X0) = J2[ Dif(Xs)bi(Xs) ds + local mart. + \j2JQDijf(Xs)ai]iXs)ds If we let Tn be a sequence of times that reduce the local martingale, stopping at time t A Tn and taking expected value gives /•MT„ Exf(XtATa) - f(x) = VEx / A/(X,)6i(X.) <h Jo 1 fMT„ + 2^E'J0 DiSf[Xt)ai}{Xt)da If / G C^- and the coefficients are continuous then the integrands are bounded and the bounded convergence theorem implies Exf{Xi)-f(x) = YjEx f Dif(X.)bi(Xt)da Jo (2.3) + I TEx f Dtj/(X.)oy (X,) ds = EX f Af{Xs)ds Jo To prove the conclusion now we have to show that Exf{Xt)-f{x) -Af The first step is to bound the movement of the diffusion process. (2.4) Lemma. Let r > 0, K C Rd be compact, and Kr = {y : \x — y\ < r for some x G K}. Suppose |&(z)| < B and J2i an(x) < -A for all x G i^r- If we let Sr = inf{f : \Xt - X0\ > r} then sup Px(Sr < t) < t ■ 2 (- + 4)
Section 7.2 Examples 253 Proof Let h(y) = J2i(yi ~ xi)2- Using Ito's formula we have h(Xt) - h(X0) = J2[ 2(X> ~ xi)hi(x>)ds +local mart- + \YJJ^aii{Xs)ds Letting Tn be a sequence of times that reduce the local martingale, stopping at t A Sr A Tn, taking expected value, then letting n —* oo and invoking the bounded convergence theorem we have /•MSr Exh(XiASr) = EX J2 i2(Xi ~ *iM**) + oo{Xt)} ds < 2t(rB + A) Jo Since h > 0 and h(Xsr) = r2 it follows that r2Px(Sr< t)<2t(rB + A) which is the desired result. D Proof of (2.2) Pick R so that H = {z : \x\ < R — 1} contains the support of / and let K = {x : \x\ < R}. Since / £ C\% and the coefficients are continuous, Af is continuous and has compact support. This implies Af is uniformly continuous, that is, given e > 0 we can pick 6 £ (0,1] so that if \x-y\<6 then \Af(x) - Af(y)\ < e. If we let C = supz \Af(z)\ and use (2.3) it follows that for x G K E*f{Xt)-f(*)_Am <lExJ*\Af(Xs)-Af(x)\ds <e + 2C sup PX(SS < t) X£K For x ¢ K, f(x) = 0 and Af = 0. So using (2.3) again gives that for x ¢ K Exf(Xt)-f(x) -Af(x) Exf(Xt) < CPX{TH < t) < C SUP Py(Sl < t) y£dK where the second equation comes from using the strong Markov property at time Tk- (2.4) shows that sup Px(Si <t)< sup PX(SS <t)-*0 as t -* 0
254 Chapter 7 Diffusions as Markov Processes Since e > 0 is arbitrary, the desired result follows. D For general 6 and a it is a hopelessly difficult problem to compute V(A). To make this point we will now use (1.7) to compute the exact domains of the semi-group and of the generator for one dimensional Brownian motion. Let Cu be the functions / on R that are bounded and uniformly continuous. Let C„ be the functions / on R so that /, /', /" G Cu. (2.5) Theorem. For Brownian motion in d = 1, V(T) = Cu, V(A) = Cl, and Af{x) = f"{x)/2 for all / G V{A). Remark. For Brownian motion in d > 1, V(A) is larger than C„ and consists of the functions so that A/ exists in the sense of distribution and lies in Cu (see Revuz and Yor (1991), p. 226). Proof If t > 0 and / is bounded then writing Pt(x, z) for the Brownian transition probability and using the triangle inequality we have \Ttf(x) - Ttf(y)\ < ll/H J \pt(x, z) - pt(y, z)\ dz The right-hand side only depends on |a; — y\ and converges to 0 as |a; — y\ —» 0 so Ttf e Cu. If / G V(T) then \\Ttf - /|| -* 0. We claim that this implies / G Cu. To check this, pick e > 0, let t > 0 so that ||Ti/-/|| < e/3 then pick 6 so that ii\x — y\<6 then \Ttf(x) — Ttf(y)\ < e/3. Using the triangle inequality now it follows that \f\x — y\<6 then |/(*) - f(y)\ < \f(x) - Ttf(x)\ + \Ttf(x) - Ttf(y)\ + \Ttf(y) - f(y)\ < e To prove that V(A) = C„ we begin by observing that (3.10) in Chapter 3 implies Uxf(x) = (2A)-1/2 J e-l*-«l^/(y) dy Our first task is to show that if / G Cu then U\f € C„. Letting g(x) = U\f(x) we have Jx J—oo Differentiating once with respect to x and noting that the terms from differentiating the limits cancel, we have y»0O fX g'(x)= e-(y-*^f(y)dy- e<'-yW^f{y) dy Jx J—CO
Section 7.3 Transition Probabilities 255 Differentiating again gives g"(x) = V2X / e-<*-*>S*f(y) dy + V2A / e'^^^fiy) dy - 2f(x) Jx J—CO It is easy to use the dominated convergence theorem to justify differentiating the integrals. Prom the formulas it is easy to see that g, g', g" £ Cu so ge Cl and g"(x) = 2Xg(x)-2f(x) Since Ag(x) = Xg(x) — f(x) by the proof of (1.7), it follows that Ag(x) = g"(x)/2. To complete the proof now we need to show V(A) D C2. Let h E C„ and define a function / £ Cu by f(x) = \h(x)-±h"(x) The function y = h — U\f satisfies \y"{x)-\y{x) = Q All solutions of this differential equation have the form Aexy^ + f?e-xv^\ Since h — U\f is bounded it follows that h = U\f and the proof is complete. D 7.3. Transition Probabilities In the last section we saw that if a and 6 are continuous and MP (6, a) is well posed then jTtf{x) = ATtf{x) for / £ Cjf, or changing notation v(t,x) = Tif(x), we have (3.1) ^=Av(t,x) where A acts in the x variable. In the previous discussion we have gone from the process to the p.d.e. We will now turn around and go the other way. Consider (a) u e C1-2 and ut = Lu in (0, oo) x Rd (b) u is continuous on [0, oo) x Rd and w(0, x) = f(x)
256 Chapter 7 Diffusions as Markov Processes Here we have returned to our old notation L=\ 5>i(*)Ai/(*) + £ b{(x) Dtf{x) >i ' To explain the dual notation note that (3.1) says that v(t, -) G ~D{A) and satisfies the equality, while (a) of (3.2) asks for u £ C1,2. As for the boundary condition (b) note that f £ V(A) C V(T) implies that \\Ttf - f\\ -* 0 as t -* 0. To prove the existence of solutions to (3.2) we turn to Friedman (1975), Section 6.4. (3.3) Theorem. Suppose a and 6 are bounded and Holder continuous. That is, there are constants 0 < 6, C < oo so that (HC) laijW-aijtolKClx-yl' Hx)-bi(y)\<C\x-y\s Suppose also that a is uniformly elliptic, that is, there is a constant A so that for all x,y (UE) £y.-a1-y(a;)yi>A|y|2 hi Then there is a function Pt(x,y) > 0 jointly continuous in t > 0, x, y, and C2 as a function of x which satisfies dp/dt = Lp (with L acting on the x variable) and so .that if / is bounded and continuous «(*.*)= Pi(x,y)f(y)dy satisfies (3.2). The maximum principle implies |w(£,z)| < ||/||. To make the connection with the SDE we prove (3.4) Theorem. Suppose Xt is a nonexplosive solution of MP(6, a). If u is a bounded solution of (3.2) then u(t, x) = Exf{Xt) Proof Ito's formula implies u(t - s, Xs) - u(0,X„) = f ~(t - r,Xr)dr + ^2/ Diu(Xr)bi(Xr) dr + local mart. + \j2fDiju(Xr)aij(Xr)dr
Section 7.3 Transition Probabilities 257 so (a) tells us that u(t — s,Xs) is a bounded martingale on [0,t). (b) implies that u(t — s,Xs) —* f{Xt) as s —* t, so the martingale convergence theorem implies u(t, x) = Exf(Xt) a Pt(x, y) is called a fundamental solution of the parabolic equation Ut = Lit since it can be used to produce solutions for any bounded continuous initial data /. Its importance for us is that (3.4) implies that Px(XteB)= f Pt(x,y)dy Jb i.e., Pt(x,y) is the transition density for our diffusion process. Let D = {x : \x\ < R}. To obtain result for unbounded coefficients, we will consider (a) u G C1-2 and du/dt = Lu in (0, oo) x D (3.5) (b) u is continuous on [0,oo) x D with u(0,z) = f(x), and u(t, y) = 0 when t > 0, y £ dD To prove existence of solutions to (3.5) we turn to Dynkin (1965), Vol. II, pages 230-231. (3.6) Theorem. Let D = {x : \x\ < R} and suppose that (HC) and (UE) hold in D. Then there is a function pf{x,y) > 0 jointly continuous in t > 0, x, y G D, C2 as a function of x which satisfies dpR/dt = Lp (with L acting on the x variable) and so that if / is bounded and continuous u(t, x)= p?(x, y)f{y) dy satisfies (3.5). To make the connection with the SDE we prove (3.7) Theorem. Suppose Xt is any solution of MP(6, a) and let r = inf{t : Xt ¢ D}. If u is any solution of (3.5) then u(t,x) = Ex(f{Xt);r>t)
258 Chapter 7 Diffusions as Markov Processes Proof The fact that u is continuous in [0,t] x D implies it is bounded there. Ito's formula implies that for 0 < s < t A r u(t-s,Xs)-u(0,XQ)= --^{t-r,Xr)dr + J2 I Diu{Xr)bi{Xr)dr+ local mart. + \j2fDiju(Xr)aij(Xr)dr so (a) tells us that u(t — s,Xs) is a bounded local martingale on [0,t A r). (b) implies that u(t — s,X,) -^Oassjr and u(t — s,X3) —* f(Xt) as s —* t on {r > t}. Using (2.7) from Chapter 2 now, we have u(t,x) = Ex{f{Xi)-r>t) a Combining (3.6) and (3.7) we see that Ex{f{Xt); r>t)= f pf{x, y)f(y) dy Jd Letting R j oo and Pt(x, y) = limij_»oopf (z,y), which exists since (3.7) implies i2 —>• p^(x,y) is increasing, we have (3.8) Theorem. Suppose that the martingale problem for MP(6, a) is well posed and that a and 6 satisfy (HC) and (UE) hold locally. That is, they hold in {z : \x\ < R} for any R < oo. Then for each t > 0 there is a lower semicontinuous function p(t,x,y) > 0 so that if / is bounded and continuous then Exf(Xi) = JPi(x,y)dy Remark. The energetic reader can probably show that pt{x,y) is continuous. However, lower semicontinuity implies that p<(z, y) is bounded away from 0 on compact sets and this will be enough for our results in Section 7.5. 7.4. Harris Chains In this section we will give a quick treatment of the theory of Harris chains to prepare for applications to diffusions in the next section. In this section and
Section 7.4 Harris Chains 259 the next, we will assume that the reader is familiar with the basic theory of Markov chains on a countable state space as explained for example in the first five sections of Chapter 5 of Durrett (1995). This section is a close relative of Section 5.6 there. We will formulate the results here for a transition probability defined on a measurable space (S,S). Intuitively, P(Xn+\ £ A\Xn = x) = p(x,A). Formally, it is a function j):Sx5-»[0,l] that satisfies for fixed A £ S, x —* p(x, A) is measurable for fixed x £ S, A —» p(x, A) is a probability measure In our applications to diffusions we will take S = Rd, and let p(x,A) = / pi(x,y)dy Ja where pt(x, y) is the transition probability introduced in the previous section. By taking t = 1 we will be able to investigate the asymptotic behavior of Xn as n —* oo through the integers but this and the Markov property will allow us to get results for t —* oo through the real numbers. We say that a Markov chain Xn with transition probability p is a Harris chain if we can find sets A, B £ S, a function q with q(x, y) > e > 0 for x £ A, y £ B, and a probability measure p concentrated on B so that: (i) If rA = inf{n >0:XneA}, then Pz(rA < oo) > 0 for all z £ S (ii) If x £ A and C C B then p(x,C) > Jc q(x,y)p(dy) In the diffusions we consider we can take A = B to be a ball with radius r, p to be a constant c times Lebesgue measure restricted to B, and q(x, y) = p\(x,y)/c. See (5.2) below. It is interesting to note that the new theory still contains most of the old one as a special case. Example 4.1. Countable State Space. Suppose Xn is a Markov chain on a countable state space S. In order for Xn to be a Harris chain it is necessary and sufficient that there be a state u with Px(Xn = u for some n > 0) > 0 for all x £ S. Proof To prove sufficiency, pick v so that p(u, v) > 0. If we let A = {u} and B = {v} then (i) and (ii) hold. To prove necessity, let {u} be a point with p({u}) > 0 and note that for all x . Px(Xn = u for some n > 0) > Ex{q(XTA)p({u})} > 0 D The developments in this section are based on two simple ideas, (i) To make the theory of Markov chains on a countable state space work, all we need
260 Chapter 7 Diffusions as Markov Processes is to have one point in the state space which is hit. (ii) In a Harris chain we can manufacture such a point (called a below) which corresponds to "being in B with distribution p." The notation needed to carry out this plan is occasionally obnoxious but we hope these words of wisdom will help the reader stay focused on the simple ideas that underlie these developments. Given a Harris chain on^S,_«S), we will construct a Markov chain Xn with transition probability p on (S, S) where S = S U {a} and S = {B, B U {a} : B G S}. Thinking of a as corresponding to being on B with distribution p, we define the modified transition probability as follows: IfxeS-A, p(x,C)=p(x}C)iorCeS IfxeA, p(x,{a}) = e p(x,C) = p(x,C) - ep(C) for C G S If x = a, p(a, D) = fp(dx)p(x, D) for D G S Here and in what follows, we will reserve A and B for the special sets that occur in the definition and use C and D for generic elements of S. We will often simplify notation by writing p(x,a) instead of p(x,{a}), fj.(a) instead of fi({a})} etc. Our first step is to prove three technical lemmas that will help us carry out the proofs below. Define a transition probability v by v(x,{x}) = 1 if xeS v(a,C) = p(C) In words, v leaves mass in S alone but returns the mass at a to S and distributes it according to p. (4.1) Lemma, (a) vp = p and (b) pv = p. Proof Before giving the proof we would like to remind the reader that measures multiply the transition probability on the left, i.e., in the first case we want to show \ivp~ = fip. If we first make a transition according to v and then one according to p, this amounts to one transition according to p, since only mass at a is affected by v and p(a,D) = / p(dx)p(x,D). The second equality also follows easily from the definition. In words, if p acts first and then v, then this is the same as one transition according to p since v returns the mass at a to where it came from. D From (4.1) it follows easily that we have:
Section 7.4 Harris Chajns 261 (4.2) Lemma. Let Yn be an inhomogeneous Markov chain with pih = v and P2k+i — V- Then Xn = Y2„ is a Markov chain with transition probability p and Xn = Y2n+i is a Markov chain with transition probability p. (4.2) shows that there is an intimate relationship between the asymptotic behavior of Xn and of Xn. To quantify this we need a definition. If / is a bounded measurable function on S, let / = vf, i.e., f(x) = f(x) forxeS f(a) = Jfdp (4.3) Lemma. If p. is a probability measure on (S, S) then E,f(Xn) = Ej{Xn) Proof Observe that if Xn and Xn are constructed as in (4.2), and P(Xq g S) = 1 then X0 = .¾ and -^n is obtained from Xn by making a transition according to v. D Before developing the theory we give one example to explain why some of the statements to come will be messy. Example 4.2. Perverted Brownian motion. For x that is not an integer > 2, let p(x,-) be the transition probability of a one dimensional Brownian motion. When x > 2 is an integer let p(x,{x + 1}) = 1-z-2 p(x,C) = x-2\CD[0,l]\ ifx + lgC p is the transition probability of a Harris chain (take A = B = (0,1), p = Lebesgue measure on B) but P2(Xn = n + 2 for all n) > 0 I can sympathize with the reader who thinks that such crazy chains will not arise "in applications," but it seems easier (and better) to adapt the theory to include them than to modify the assumptions to exclude them. a. Recurrence and transience We begin with the dichotomy between recurrence and transience. Let R = inf{n > 1 : Xn = a}. If Pa(R < oo) = 1 then we call the chain
262 Chapter 7 Diffusions as Markov Processes recurrent, otherwise we call it transient. Let R\ = R and for fc > 2, let Rk = inf{n > Rk-i : Xn = a) be the time of the fcth return to a. The strong Markov property implies Pa(Rk < oo) = Pa(R < oo)fc so Pa(X„ = a i.o.) = 1 in the recurrent case and is 0 in the transient case. From this the next result follows easily. (4.4) Theorem. Let A(C) = £„ 2-"p"(a, C). In the recurrent case if A(C) > 0 then Pa{Xn G C i.o.) = 1. For A a.e. x, PX(R < oo) = 1. Remark. Here and in what follows p™(x,C) = Px(Xn G C) is the nth iterate of the transition probability denned inductively by p1 = p and for n > 1 Pn+1(x,C) = Jp(x,dy)pn(y,C) Proof The first conclusion follows from the following fact (see e.g., (2.3) in Chapter 5 of Durrett (1995)). Let Xn be a Markov chain and suppose P (U~=n+1{Xm G Bm}\ Xn) > 6 on {Xn G An} Then P({Xn G An i.o.}) - P({Xn G Bn i.o.}) = 0. Taking An = {a} and Bn = C gives the desired result. To prove the second conclusion, let D = {x : PX{R < oo) < 1} and observe that if pn(a, D) > 0 for some n then Pa{Xm = a i.o.) < fpn(a, dx)Px(R < oo) < 1 D Remark. Example 4.2 shows that we cannot expect to have PX(R < oo) = 1 for all x. To see that this can occur even when the state space is countable, consider a branching process in which the offspring distribution has po = 0 and ^fcpi > 1- If we take A = B = {0} this is a Harris chain by Example 4.1. Since pn(0,0) = 1 it is recurrent. b. Stationary measures (4.5) Theorem. Let R = inf{n > 1 : Xn = a}. In the recurrent case, (R-l \ oo E hx.ec] = £ Wn eC,R>n) n=0 / n=0 defines a cr-finite stationary measure for p, with /i« A, the measure defined in (4.4). fj. = fiv is a cr-finite stationary measure for p.
Section 7.4 Harris Chains 263 Proof If A(C) = 0 then the definition of A implies Pa(Xn G C) = 0, so Pa(Xn G C, R > n) = 0 for all n, and /2(C) =.0. We next check that p. is cr-finite. Let GM = {x : pk(x,a) > 6}. Let T0 = 0 and let T„ = inf{m > T„-\ + fc : Xm G Gi,*}. The definition of Gk,s implies p(r„<ra|r„_i<ra)<(i-<5) so if we let N = inf{n : Tn > Ta} then EN < 1/6. Since we can only have Xm G Gk,s with R > m when T„ < m < T„ + fc for some 0 < n < N it follows that /2(Gi,4) < fc/<5. Part (i) of the definition of a Harris chain implies S C Uj^m^iGj^x/m and cr-finiteness follows. Next we show that Jlp = Ji. Case 1. Let C be a set that does not contain a. Using the definition p. and Fubini's theorem. f fi(dy)p(y} C)=J2 f Pa(Xn edy,R> n)p(y, C) •* n = 0 ^ oo = J2 P°(Xn+i G C, R > n + 1) = /2(C) n=0 since a ¢0 and Pa{Xo = a) = 1. Case 2. To complete the proof now it suffices to consider C = {a}. Hdy)P(y> a) = J2 I P°(*n &dy,R> n)p{y, a) J n=QJ oo = £>„(# = n+l) = l = /2(a) n=0 where in the last two equalities we have used recurrence and the fact that when C = {a} only the n = 0 term contributes in the definition. Turning to the properties of fj., we note that p = p,v and /2(a) = 1 so /x is cr-finite. To check fj.p = fj., we note that (i) using the definition \i = /2t) and (b) in (3.1), (ii) using (a) in (4.1), (iii) using p.p = /2, and (iv) using the definition H = p,v: fip = (p,v)(pv) = fipv = p,v = [M O To investigate uniqueness of the stationary measure we begin with:
264 Chapter 7 Diffusions as Markov Processes (4.6) Lemma. If v is a cr-finite stationary measure for p, then v(A) < oo and v = vp is a cr-finite stationary measure for p with v(a) < oo. Proof We will first show that v{A) < oo. If v{A) = oo then part (ii) of the definition of a Harris chain implies v(C) = oo for all sets C with p(C) > 0. If B = UiB; with 1/(2¾) < oo then p(B{) = 0 by the last observation and p(B) = 0 by countable subadditivity, a contradiction. So v(A) < oo and v(a) = vp(a) = ev(A) < oo. Using the fact that vp = v, we find v(C) = vp{C) = v(C) - e v(A)p{B n C) the last subtraction being well defined since v(A) < oo. From the last equality it is clear that v is cr-finite. Since P(a) = ei/(j4), it also follows that vv = v. To check vp = v, we observe that (a) of (4.1), the last result, and the definition of v imply vp = vvp = vp = v O (4.7) Theorem. Suppose p is recurrent. If v is a cr-finite stationary measure then v = v(a)n where p. is the measure constructed in the proof of (4.5). Proof By (4.6) it suffices to prove that if v is a cr-finite stationary measure for p with v(cc) < oo then v = v{oc)Jx. Our first step is to observe that v(C) = v(a)p(a, C)+ f v{dy)p{y, C) Js-{a} Using the last identity to replace v(dy) on the right-hand side we have v{C) = v{a)p(a, C)+ f v{a)p{a, dy)p(y, C) JS-{a] + / Hdx) / P(x,dy)p(y,C) JS-{a] JS-{a] = v(a)Pa(Xi eC) + v(a)Pa(X! # a, X2 € C) + PP(X0 # a.Xx # a,X2 6 C) Continuing in the obvious way we have n v(C) = v{a) J2 Pa(Xk # a for 1 < Jb < m, Xm G C) m=\ + Pp(Xk # a for 0 < k < n + l,Xn+i G C)
Section 7.4 Harris Chains 265 The last term is nonnegative, so letting n-+oowe have 0(C) > i>(a)ii(C) To turn the > into an = we observe that v(a) = I i/(dx)pn(x, a) > v(a) I p.(dx)pn(x, a) = v(a)ji(a) = v(a) since /2(a) = 1. Let Sn = {x : pn(x,a) > 0}. By assumption U„S„ = S. If v(D) > v(a)ji(D) for some D, then v(DP[Sn) > i>(a)ji(Dr\Sn) for some n and it follows that 9(a) > v(cc), a contradiction. D (4.5) and (4.7) show that a recurrent Harris chain has a cr-finite stationary measure that is unique up to constant multiples. The next result, which goes in the other direction, will be useful in the proof of the convergence theorem and in checking recurrence of concrete examples. (4.8) Theorem. If there is a stationary probability distribution then the chain is recurrent. Proof Let f = np, which is a stationary distribution by (4.6), and note that part (i) of the definition of a Harris chain implies f(a) > 0. Suppose that the chain is transient, i.e., Pa(R < oo) = g < 1. If 2¾ is the time of the fcth return then Pa(Rk < oo) = 1k so if x ^ a (CO \ CO CO 1 E1(Jf—})=Ep»(^< ^)^^^ = 137 n = 0 / k=l fc=l * The last upper bound is also valid when x = a since (CO \ CO CO i n=0 / i=l i=0 " Integrating with respect to f and using the fact that f is a stationary distribution we have a contradiction that proves the result 1 / CO \ CO ¾ \n=0 / n=0 D
266 Chapter 7 Diffusions as Markov Processes c. Convergence theorem If J is a set of positive integers, we let g.c.d.(J) be the greatest common divisor of the elements of I. We say that a recurrent Harris chain Xn is aperiodic if g.c.d.({n > 1 : pn(a,a) > 0}) = 1. This occurs, for example, if we can take A = B in the definition for then p(a, a) > 0. A well known consequence of aperiodicity is (4.9) Lemma. There is an mo < oo so that pm(a, a) > 0 for m > mo- For a proof see (5.4) of Chapter 5 in Durrett (1995). We are now ready to prove (4.10) Theorem. Let Xn be an aperiodic recurrent Harris chain with stationary distribution tt. If PX(R < oo) = 1 then as n —* oo, II^.O-tOII-o Remark. Here || -1| denotes the total variation distance between the measures. (4.4) guarantees that A a.e. x satisfies the hypothesis, while (4.5), (4.7), and (4.8) imply tt is absolutely continuous with respect to A. Proof In view of (4.3) and (4.6) it suffices to show that if f = np then IIp"(*, 0-*OII-o Our second reduction is to note that if x ^ a then n ^(z, 0 = X>x(# = m)p"-m(«, 0 m=l so we have iipn(*, •) - *oii <J2P*(R=™)ir"m(a. •) - *(-)ii m=l and it suffices to prove the result when x = a. Let S2 = S x S. Define a transition probability p on S x S by P{{X1,X2),C1 X C2) = PiXi, 0^(10,02) That is, each coordinate moves independently. We will call this the "product chain" (think of product measure) and denote it by Zn.
Section 7.4 Harris Chains 267 We claim that the product chain is a Harris chain with A = {(a, a)} and B = B x B. Clearly (ii) is satisfied. To check (i) we need to show that for all (zi, X2) there is an N so that pN((x\, z2), (a, a)) > 0. To prove this let K and L be such that pK(x\,a) > 0 and pL(x2,a) > 0, let M >m0} the constant in (4.9), and take N = K + L + M. From the definitions it follows that pK+L+M{xucc) > pK{xua)pL+M{a,a) > ° pK+L+M(x2,a) > pL(x1,a)pK+M(a,a) > 0 and hence pN((xi, xi), (oc, a)) > 0. The next step is show that the product chain is recurrent. To do this, note that (4.6) implies f = wp is a stationary distribution for p, and the two coordinates move independently, so f(Ci x C2) = f(Ci)f(C2) defines a stationary probability distribution for p, and the desired conclusion follows from (4.8). To prove the convergence theorem now we will write Zn = (Xn,Yn) and consider Zn with initial distribution 5a x f. That is X0 = a with probability 1 and Y0 has the stationary distribution ft. Let T = inf{n : Zn = (a, a)} = R. (4.8), (4.7), and (4.5) imply ft x ft <5C A so (4.4) implies PSaMT< °°)=1 Dropping the subscript 6a x ft from P and considering the time T, n P(Xn eC,T<n)=J2P(T= m)pn-m(a,C) = P(Yn £C,T<n) since on {T = m}, Xm = Ym = a. Now P(I„6C) = P(X„6C,r<n) + P(X„6C,T>n) = P(y„ £C,T<n) + P(Xn eC,T>n) <P(YneC) + P(T>n) Interchanging the roles of X and Y we have ^n ec)< P{xn ec) + p(t > n) and it follows that
268 Chapter 7 Diffusions as Markov Processes ll^(M-*(OII = lW.€-)-P(y„€-)|| = sup \P(Xn g C) - P(Yn € C)\ <P(T>n)^0 c as n —» oo. D 7.5. Convergence Theorems In this section we will apply the theory of Harris chains developed in the previous section to diffusion processes. To be able to use that theory we will suppose throughout this section that (5.1) Assumption. MP(6, a) is well posed and the coefficients a and 6 satisfy (HC) and (UE) locally. (5.2) Theorem. If A = B = {x : \x\ < 7'} and p is Lebesgue measure on B normalized to be a probability measure then Xn is an aperiodic Harris chain. Proof By (3.8) there is a lower semicontinuous function Pt{x,y) > 0 so that Px(XteA)= [ Pt(x,y)dy Ja From this we see that PX(R = 1) > 0 for each x so (i) holds. Lower semi- continuity implies Pi(x,y) > e > 0 when x,y € A so (ii) holds and we have p(a, a) > 0. D To finish checking the hypotheses of the convergence theorem (4.10), we have to show that a stationary distribution 7r exists. Our approach will be to show that EaR < oo, so the construction in (4.5) produces a stationary measure with finite total mass. To check EaR < oo We will use a slight generalization of Lemma 5.1 of Khasminskii (1960). (5.3) Theorem. Suppose u > 0 is C2 and has Lu < —1 for all x ¢ K where K is a compact set. Let Tk = inf{f > 0 : Xt G it'}. Then u(x) > EXTK- Proof Ito's formula implies that Vt = u(Xt) + t is a local supermartingale on [0,¾). Letting Tn j Tk be a sequence of times that reduce Vt, we have u(x) > Exu{n A r„) + Ex{n A T„) > Ex(n A Tn)
Section 7.5 Convergence Theorems 269 Letting n —» oo and using the monotone convergence theorem the desired result follows. D Taking u(x) = |z|2/e gives the following (5.4) Corollary. Suppose tr a(z) + 2x ■ b(x) < —e for x £ Kc = {x : \x\ > r} then EXTK < \x\2/e. Though (5.4) was easy to prove, it is remarkably sharp. Example 5.1. Consider d = 1. Suppose a(x) = 1 and b(x) = —c/x for \x\ > 1 where c > 0 and set K = [-1,1]. If c > 1/2 then (5.4) holds with e = 1c-1. To compute ExTk, suppose c ^ 1/2, let e = 2c— 1, and let u(x) = x2/e. Lu = —1 in (1,6) so u(Xt) + t is a local martingale until time T(itb) = inf{t: Xt ¢(1,6)}. Using the optional stopping theorem at time T(itb) A t we have x2 EXXT M — = f-1 Ex{r{1>b) A t) Rearranging, then letting t —* oo and using the monotone and bounded convergence theorems, we have p .. _ x* ExXni.v Mi,s) -~ " To evaluate the right-hand side, we need the natural scale and doing this it is convenient to let start from 1 rather than 0: ^)=/exp(/¥"2) = C j*y2*dy=C'{y2«l-l) Using (3.2) in Chapter 6 with a = 1 it follows that _ x2 62c+l _ ^20+1 J g2c+l _ t ^2 ^«^1,») - "7 - 62c+l _ ! ■ 7 - 62c+l _ 1 " c The second term on the right always converges to — 1/e as 6 —» oo. The third term on the right converges to 0 when c > 1/2 and to oo when c < 1/2 (recall that e < 0 in this case). Exercise 5.1. Use (4.8) in Chapter 6 to show that ExTk = oo when c < 1/2.
Bt= I £ri-dW, Jo 270 Chapter 7 Diffusions as Markov Processes Exercise 5.2. Consider d = 1. Suppose a(x) = 1 and b(x) < —e/xs for x > 1 where e > 0 and 0 < 5 < 1. Let 7\ = inf{f : Xt = 1} and show that for x > 1, £xTi < (z1+* - l)/e. Example 5.2. When d > 1 and a(s) = J the condition in (5.4) becomes x-b(x)<-(d + e)/2 To see this is sharp, let Yt = \Xt\, and use Ito's formula with f(x) = \x\ which has Dif = xi/\x\ Diif=l/\x\-x?/\x\3 to get We have written W here for a d dimensional Brownian motion, so that we can let <*2k I*. I be a one dimensional Brownian motion B (it is a local martingale with {B)t = t) and write dYt = r±-:(xf 6(¾) + ^) dt + dBt If x • b(x) = — {d— 1 + c)/2 this reduces to the previous example and shows that the condition x • b(x) < —(d + e)/2 is sharp. The last detail is to make the transition from ExTk < oo to EaR < oo. (5.5) Theorem. Let K = {x : \x\ < r) and H = {x : \x\ < r + 1}. If suPx6H E*Tk < oo then EaR < oo. Proof Let U0 = 0, and for m > 1 let Vm = inf{f > Um-i : Xt G K}, Um = inf{i > Vm : \Xt\ £ H or t G Z}, and M = inf{m > 1 : Um G Z}. Since ^(f/m) G j4, -R < f/M- To estimate .Ea-R, we note that X(f7m_i) G A so £ (Kn - I7m-i |^m_!) < Co = sup £,¾ x6j4 To estimate M, we observe that if r# is the exit time from H then (3.6) implies tofP.fa > 1) > \K\ inf p[+1 (x,y) = e0 > 0 xgJT x,y£K where p£+ (a;, y) is the transition probability for the process killed when it leaves the ball of radius r+1. Prom this it follows that P(M > m) < (1 — eo)m. Since Um — Vm < 1 we have EUm < (1 + Co)/ea and the proof is complete. D
8 Weak Convergence 8.1. In Metric Spaces We begin with a treatment of weak convergence on a general space S with a metric p, i.e., a function with (i) p(x,x) = 0, (ii) p(x,y) = p(y,x), and (iii) p(x, y)+p(y, z) > p(x, z). Open balls are denned by {y : p(x} y) < r} with r > 0 and the Borel sets S are the cr-field generated by the open balls. Throughout this section we will suppose S is a metric space and S is the collection of Borel sets, even though several results are true in much greater generality. For later use, recall that S is said to be separable if there is a countable dense set, and complete if every Cauchy sequence converges. A sequence of probability measures \in on (S,S) is said to converge weakly if for each bounded continuous function f f(x)fj.n(dx) —» f f(x)fj.(dx). In this case we write fMn =$■ fM. In many situations it will be convenient to deal directly with random variables Xn rather than with the associated distributions fin(A) = P(Xn G A). We say that Xn converges weakly to X and write Xn =>• X if for any bounded continuous function /, Ef(Xn) —» Ef(X). Our first result, sometimes called the Portmanteau Theorem, gives five equivalent definitions of weak convergence. (1.1) Theorem. The following statements are equivalent. (i) Ef(Xn) —* Ef(X) for any bounded continuous function /. (ii) For all closed sets K, limsup,,^^ P(Xn € K) < P(X G K). (iii) For all open sets G, liminfn-.oo P(Xn G G) > P(X € G). (iv) For all sets A with P(X £ dA) = 0, limsup,,.,^ P(Xn G A) = P(X G A). (v) Let Dj = the set of discontinuities of /. For all bounded functions with P(X G Dj) = 0, Ef(Xn) - Ef(X).
272 Chapter 8 Weak Convergence Remark. To help remember (ii) and (iii) think about what can happen when P(Xn = xn) = 1, xn —» x, and x lies on the boundary of the set. If K is closed we can have xn ¢ K for all n but x G K. If G is open we can have xn £ G for all n but ¢¢¢. Proof We will go around the loop in roughly the order indicated. The last step, (v) implies (i), is trivial so we have four things to show. (i) implies (ii) Let p(x,K) = inf{p(x,y) : y € K}, ij)j{r) = (1 — jr)+ where z+ = max{z, 0} is the positive part, and let fj(x) = i/>y(/>(z, K)). fj is bounded, continuous, and > 1# so limsup P(X„ G K) < lim Efj{Xn) = Efj(X) n—*oo n—*oo Now fj I 1K as j 1 oo, so letting j —* oo gives (ii). (ii) is equivalent to (iii) This follows immediately from (a) P(A)+P(AC) = 1, and (b) A is open if and only if Ac is closed. (ii) and (iii) imply (iv) Let K = A = the closure of A, G = A" = the interior of A, and note that dA = K — G, so our assumptions imply P(X G 10 = P(X EA) = P{X G G) Using (ii) and (iii) now with the last equality we have limsup P(Xn eA)< limsup P{Xn G K) < P(X G K) = P[X G A) n—+oo n—*oo liminf P(Xn e A) > liminf P(X„ EG)< P(X £G) = P(X G A) n—*oo n—*oo At this point we have shown liminf P(Xn &A)> P(X G A) > limsup P(Xn G A) n-*oo n-.oo which with the triviality liminf < limsup gives the desired result. (iv) implies (v) Suppose |/(z)| < K and pick ao < a\ < ... < at with aQ < —K and at > K so that P(f(X) = a;) = 0 for 0 < i < I and ay—ay_i < e for 1 < j < L This is always possible since {a : P(X = a) > 0} is a countable set. Let Ai = {x : aj_i < f(x) < a,-}, 5^4,- C {x : f(x) G {aj-i, a,-}} U D/, so P{X G 5.4,-) = 0, and it follows from (iv) that i i J2<*ip(Xn G At) ^J2a'P(X G **)
Section 8.1 In Metric Spaces 273 The definition of the a; implies t 0 < J2<*ip(xn € At) - Ef(Xn) < € 1=1 and this inequality holds with X in place of Xn. Combining our conclusions we have limsup \Ef(Xn) - Ef(X)\ < It n—oo but e is arbitrary so the proof is complete. Q An important consequence of the (1.1) is (1.2) Continuous mapping theorem. Suppose fxn on (S,S) converge weakly to fj., and let <p : S —* S' have a discontinuity set Dv with n(Dv) = 0. Then fin o(p~l =4> /jo <p~x._ Proof Let / : S' —* R be bounded and continuous. Since Djaip C Dv, it follows from (v) in (1.1) that j Hn{dx)f{<p{x)) - J H{dx)f{<p{x)) Changing variables gives J fi„ o ^_1((fa;)/(a;) -> Uo ^(cte)/^) Since this holds for any bounded continuous /, the desired result follows. D In checking convergence in distribution the following lemma is sometimes useful. (1.3) Converging together lemma. Suppose Xn =r- X and p(Xn,Yn) —» 0 in probability then Yn=t? X. Proof Let K be a closed set and Ks = {y ■ p{y,K) < <?}, where p(y,K) = inf{p(y, z) : z £ if}- ^ is easy to see that 2 ¢ -K* if and only if there is a 7 > 5 so that 5(2,7) H If = 0, so (¾)0 is open and Ks is closed. With this little detail out of the way the rest is easy. P(Yn € K) < P(Xn G Ks) + P(p(Xn,Yn) > S)
274 Chapter 8 WeaA' Convergence Letting n —* oo, noticing the second term on the right goes to 0 by assumption, and using (ii) in (1.1) we have limsup P(Yn g K) < limsup P(Xn g Kg) < P{X g Ks) n—*oo n—*co Measure theory tells us that P(X g Ks) I P(X g K) as 8 J. 0 so we have shown limsup P(Yn g K) < P(X g K) n—oo The desired conclusion now follows from (1.1). D The next result, due to Skorokhod, is useful for proving results about a sequence that converges weakly, because it converts weak convergence into almost sure convergence. (1.4) Skorokhod's representation theorem. Suppose S is a complete separable metric space. If fMn =>• /*oo then we can define random variables Yn on [0,1] (equipped with the Borel sets and Lebesgue measure A) with distribution Hn so that Yn —* Yoo almost surely. Remark. When S = R this is easy. Let Fn(x) = /*„(—oo, x] be the distribution function, let F-\y) = inf{z : Fn(x) > y} let U(w) = w for w g [0,1] be a random variable that is uniform on (0,1), and let Yn = i?^"1(f7). Then Yn -+ Yoo almost surely. (For a proof, see (2.1) of Chapter 2 of Durrett (1995).) We invite the reader to contemplate the question: "Is there an easy way to do this when S = R2 (or even S = [0, l]2)?" while we give the general proof. The details are somewhat messy and not important for the developments that follow, so if the reader finds herself asking "Who cares?", she can skip to the beginning of the next section. Proof We will warm up for the real proof by treating the special case fMn = fj.. (1.5) Theorem. For each probability measure \i there is a random variable Y defined on [0,1] with P(Y g A) = n(A). Proof For each k construct a decomposition Ak = {Ak,i,Akl2,...} of S into disjoint sets of diameter less than 1/fc and so that Ak refines Ak-i, i.e., each set -Afc-i^- is a union of sets in Ak- For each k construct a corresponding decomposition Ik of the unit interval so that X(Ik,j) = M-^fcj) an(l arrange things so that Ak-i,i D Akj if and only if Ik-i,i D Ikj-
Section 8.1 In Metric Spaces 275 Let Xkj be some point in A^j and let Xk,]{u) = ztj for w G hj. Since .¾, Xk+i,... is contained in some element of Ak, which by definition has diameter < 1/fc, it is a Cauchy sequence. So X(w) = lim* Xk(u) exists and satisfies p(Xk,X) < 1/fc. To show that X has distribution fi we let K be a closed set, let S(fc, K) = {j : Ak,j D K £ 0}, and let Ks = {x : p(x,K) < 6} as in the proof of (1.3). X(Xk G K) < \{Xk{w) e Uj€S{k,K)Ak,j) < E MXt(W)€Atj) = E A(4j)= E AiU*j)<Ai(^i/0 j6S(i,A-) j6S(i,A-) Letting fc —» oo and noting -Ki/t J. If it follows that limsup X(Xk G If) < n(K) k So (1.1) implies that Xk =>• M an(i it follows that X has distribution /*. D Proof of (1.4) Construct a decomposition Ak of the space S as in the preceding proof but this time require that each Akj has Hoo{dAkj) = 0. For each n construct decompositions Zj?. of [0,1] so that A(J£ •) = fin(Akj). To arrange the intervals in an appropriate way we introduce an ordering in which [a, 6] < [c, d\ if 6 < c and demand that Ij?.. < .¾ if and only if J£>{ < 1¾. Let Zij G -dj-j- and define X%(w) = zfciJ- for w G i^j. As before, X" = limfc_oo Xg exists and has distribution \in. Since J\ /Xoo(-Afcj) - Mn(^i,j) = 1 - 1 = 0, we have E lA(J%) " A(4"j)l = E IM^j) " Mn(^j)| j i = 2^Uu)-^i,i))+ where y+ = max{y, 0} is the positive part of y. Since the ^(dAkj) = 0, (1.1) implies fin(Akj) —* fioa{Ak,j) and the dominated convergence theorem implies (1.6) !>('£)-a(j?j)I-o ^-°° i Fix fc and jQ, let a„ be the left endpoint of I£jV and let S(j0) = {j : 1^j < -¾..} which by our construction does not depend on n. (1.6) implies that «co= E I4°jl=lim E 14^-1= lirna„ J*65(io) j'6S(jo)
276 Chapter 8 Weak Convergence Similarly if we let /?„ be the right endpoint of IJJj then /?„ —* /?«, asn->oo. Hence if w is in the interior of If. it lies in the interior of I£ • for large n, and it follows from the definition of the Xn that limsup p{Xn, Xqo) < 2/fc n—*oo Letting fc —* oo we see that if w is not one of the countable set of points that is the end point of some If. we have Xn —* X^, which proves (1.4). D 8.2. Prokhorov's Theorems Let II be a family of probability measures on a metric space (S,p). We call II relatively compact if every sequence \in of elements of II contains a subsequence [Mnk that converges weakly to a limit n, which may be ^ II. We call II tight if for each e > 0 there is a compact set K so that fi(K) > 1 — e for all /igll. This section is devoted to the proofs of (2.1) Theorem. If II is tight then it is relatively compact. (2.2) Theorem. Suppose S is complete and separable. If II is relatively compact, it is tight. The first result is the one we want for applications, but the second one is comforting since it says that, in nice spaces, the two notions are equivalent. Proof of (2.1) We shall prove the result successively for Rd, R°°, a countable union of compact sets, and finally, a general S. This approach is somewhat lengthy but to inspire the reader for the effort, we note that the first two special cases are important examples, and then the last two steps and the converse are remarkably easy. Before we begin we will prove a lemma that will be useful for the first two examples. (2.3) Lemma. Let A be a subclass of S that is closed under intersection, and so that each open set is a countable union of elements of A. If ^„(j4) —» p{A) for all A G A then fMn =>• fi. Proof The inclusion-exclusion formula says m P(uT=1Ai) = X>(^) -J2p(A' riA]) + :. + (-i)m+1P(nT=1Ai) t=l i<j
Section 8.2 Prokhorov's Theorems 277 Combining this with the fact that A is closed under intersection it is easy to see that fMn(\Ji=1Ai) —» M(Uf=iA{). Given an open set G and an e > 0, choose sets Ai,... Ak so that fj.(U{=1Ai) > fi(G) — e. Using the convergence just proved it follows that fi(G) - € < n(uf=1A{) < lim iin(U$=1Ai) < liminf fin(G) n —>oo n—>oo Since e is arbitrary, (iii) in (1.1) follows and the proof is'complete. D Proof for Rd This is a fairly straightforward generalization of the argument for d = 1. (See e.g., (2.5) in Chapter 2 of Durrett (1995).) In this case we can rely on the distribution functions F(x) = fj.({y < z}), where for two vectors y < x means y,- < z,- for each i. Let Qd be the points in Rd with rational coordinates. By enumerating the points in Qd and then using a diagonal argument, we can find a subsequence F„k so that Fnk(q) converges for each q G Qd. Call the limit G(q) and define the proposed limiting distribution function by F(x) = inf{G(g) : q G Qd and q > x) where q > x means g,- > z,- for each i. To check that F is a distribution function we note that it is an immediate consequence of the definition that (i) if z < y then F(x) < F(y) {n)limylx F(y) = F(x) where y J. z means y,- J. z,- for each i. The third condition for being a distribution function is easier to prove than it is to state: (iii) let A = (ai, 6J x • • ■ x (a,j, 6,j] be a rectangle with vertices V = {ai, 61} x ■■■ X {ad,bd}, and let sgn(t)) = (—l)n(a>u) where n(a,v) is the number of a's that appear in v. The inclusion-exclusion formula implies that the probability ofA\sJ2v€Vsgn(v)F(v)>Q. The fact that (iii) holds for the Fn 's implies that it holds for G and by taking limits we see that it holds for F. Conditions (1)-(iii) imply that F is the distribution function of a measure on Rd, which in our case will always have total mass < 1. Define the ith marginal distribution function of F by F'(x) = limi?(y) where the limit is taken through vectors y with y,- = z and the other coordinates y,- —>• 00. F' is a one dimensional distribution function and hence its discontinuities D' are a countable set. Call z a good point if z,- ¢. Dl for each i. We claim that F is continuous at good points. To see this, let w < z < y, let z,- be the vector which uses the first
278 Chapter 8 Weak Convergence i components from y and the last d—i from w and note that the monotonicity properties that come from the interpretation of F as a measure imply d d F(y) - F(w) = ^F(zi) - J-(z,_x) < X>(y,-) - F^w,) :=1 isl The continuity of F now follows from the monotonicity and the continuity of the F{. Our next claim is that for good x, Fnk(x) —* F(x). To prove this pick P, 1, v € Qd with p < q < x < r and F(x) - e < F(p) < F(q) < F(x) < F(r) < F(x) + e Since u < v implies F(u) < G(v) < F(v) it follows that F(x) - e < G(q) < F(x) < G(r) < F{x) + e Using F„(q) < Fn(x) < Fn(r) and the convergence Fnk(q) -* G(q) for q G Qd the claim follows easily. Up to this point we have not used the tightness assumption. It enters into the proof only to guarantee that F is the distribution function of a probability measure. Given e choose a compact set K so that Fn(K) > 1 — e for all n, where Fn(K) is short for the probability of K under the measure associated with the distribution function Fn. Now pick a good rectangle A, i.e., one with a,-,6,- ¢ Di for all i, so that K C A. The formula in (iii) for the probability of A and the previous claim about convergence at good points imply F(A) = lim Fnk(A) > liminf Fn(K) > 1 - e k—*oo n—>oo Finally, we have to prove that Fnk =r- F. For this we use the following result which is of interest in its own right. (2.4) Theorem. In Rd probability measure fin =$■ n if and only if the associated distribution functions have Fn(x) —» F(x) for all x that are good for F. Proof If x is good for F, then A = {y : y < x} has n(dA) = 0, so if fj.n =$■ [i then (iv) of (1.1) implies Fn(x) —* F(x). To prove the converse we will apply (2.3) to the collection A of finite disjoint unions of good rectangles, (i.e., rectangles, all of whose vertices are good). We have already observed that fin(A) —* fi(A) for all A e A. To complete the proof we note that A is closed under intersection, and any open set is a countable disjoint union of good rectangles. D
Section 8.2 Prokhorov's Theorems 279 Proof for R°° Let R°° be the collection of all sequences (zi, z2,...) of real numbers. To define a metric on R°° we introduce the bounded metric p0(x, y) = \x — y|/(l + \x — y\) on R and let oo : = 1 It is easy to see that the induced topology is that of coordinatewise convergence, (i.e., z" —» x if and only if z" —» z,-) for each i and hence this metric makes R°° complete and separable. A countable dense set is the collection of points with finitely many nonzero coordinates, all of which are rational numbers. (2.5) Lemma. The Borel sets S = 1Z°°, the product cr-field gnerated by the finite dimensional sets {z : z,- G Ai for 1 < i < fc}. Proof To argue that S D Tt°° observe that if G,- are open then {z : z,- G Gi for 1 < i < fc} is open. For the other inclusion note that {y ■ p{x,y)<s} = n~al \y-J2 ^mPo{xm,ym) < 6 \ e nx so {y : p(x, y) < t> = U~=1{y : p(x, y) < T - 1/n} G ft°° • □ Let Wd : R°° —» Rd be the projection Wd(x) = (zi,... ,Xd). Our first claim is that if II is a tight family on (R°° ,¾0°) then {/101J1 : /x e n} is a tight family on (Rd,7£d). This follows from the following general result. (2.6) Lemma. If II is a tight family on (S, S) and if h is a continuous map from S to S' then {p o /i-1 : p. £ 11} is a tight family on (S',S'). Proof Given e, choose in S a compact set K so that p(K) > 1—e for all p G II. If K' = h(K) then K' is compact and h'^K') D K so p o ft-^Jf') > 1 - e for all /x G II. D Using (2.6), the result for Rd, and a diagonal argument we can, for a given sequence pn, pick a subsequence pnk so that for all d, pnk° ltd converges to a limit Vd- Since the measures Vd obviously satisfy the consistency conditions of Kolmogorov's existence theorem, there is a probability measure v on (R°°, 7£°°) so that v o fl-J = Vd- We now want to check that pnk => v. To do this we will apply (2.3) with A = the finite dimensional sets A with v(dA) = 0. Convergence of finite dimensional distributions and (1.1) imply that pnk(A) —» v(A) for all A G A.
280 Chapter 8 Weak Convergence The intersection of two finite dimensional sets is a finite dimensional set and d(A n B) C dA U dB so A is closed under intersection. To prove that any open set G is a countable union of sets in A, we observe that if y G G there is a 8 > 0 so that B(y, 8) = {x : p(x, y) < 5} C G. Pick 8 so large that B(y, 28) n Gc ¢. 0. (G = R°° G A so we can suppose Gc ^ 0.) If we pick N so that 2~w < 8/2 then it follows from the definition of the metric that for r < 8/2 the finite dimensional set A(y,r,N) = {y : |x, - w| < r for 1 < i < iV} C 5(2/,5) C G The boundaries of these sets are disjoint for different values of r so we can pick r > 5/4 so that the boundary of A(y, r, N) has v measure 0. As noted above R°° is separable. Let 1/1,1/21--- be an enumeration of the members of the countable dense set that lie in G and let A{ = A(y{,n,Ni) be the sets chosen in the last paragraph. To prove that U,-j4,' = G suppose x G G — U,-j4,-, let 7 > 0 so that B(x,j) C G, and pick a point y,- so that B(x,yi) < 7/9. We claim that x G -4,-. To see this, note that the triangle inequality implies B(yi, 87/9) C G, so 5 > 47/9 and r > 5/4 = 7/9, which completes the proof of /*„fc =$• 1/. D Before passing to the next case we would like to observe that we have shown (2.7) Theorem. In R°°, weak convergence is equivalent to convergence of finite dimensional distributions. Proof for the cr-compact case We will prove the result in this case by reducing it to the previous one. We start by observing (2.8) Lemma. If S is <r-compact then S is separable. Proof Since a countable union of countable sets is countable, it suffices to prove this if S is compact. To do this cover S by balls of radius 1/n, let zj},, m < mn be the centers of the balls of a finite sub cover, and check that the x^ are a countable dense set. D (2.9) Lemma. If S is a separable metric space then it can be embedded home- omorphically into R°°. Proof Let gi, q2, ■ ■ ■ be a sequence of points dense in S and define a mapping from S into R°° by h(x) = (p(x,qi),p(x,q2),...)
Section 8.2 Prokhorov's Theorems 281 If the points x„ —* x in S then limp(zn,g{) = p{x,qi) and hence /i(zn) —* /i(z). Suppose x ^ x' and let e = p(x,x'). If p(x,qi) < e/2, which must be true for some i, then p(x',qi) > e/2 or the triangle inequality would lead to the contradiction e < p(x,x'). This shows that /i is 1-1. Our final task is to show that /i_1 is continuous. If xn does not converge to x then limsupp(xn,x) = e > 0. If p{x,q{) < e/2 which must be true for some q{ then limsup p(xn, g,«) > e/2, or again the triangle inequality would give a contradiction, and hence h(xn) does not converge to h(x). This shows that if h(xn) —* /i(z) then zn —* a; so /i_1 is continuous. D It follows from (2.6) that if II is tight on S then {fi o h~l : fi G II} is a tight family of measures on R°°. Using the result for R°° we see that if \in is a sequence of measures in II and we let vn — \in o /i_1 then there is a convergent subsequence i/„fc. Applying the continuous mapping theorem, (1.2), now to the function <p = /i_1 it follows that fMnk = i/„fc o h converges weakly. The general case Whatever S is, if II is tight, and we let Kt be so that fi(Ki) > 1 — 1/i for all n G II then all the measures are supported on the cr-compact set So = Ufiff and this case reduces to the previous one. D Proof of (2.2) We begin by introducing the intermediate statement: (H) For each e,6 > 0 there is a finite collection Ai,...,An of balls of radius 5 so that fi(Ui<„Ai) > 1 — e for all fi G II. We will show that (i) if (H) holds then II is tight and (ii) if (H) fails then II is not relativley compact. Combining (ii) and (i) gives (2.2). Proof of (i) Fix e and choose for each k finitely many balls A\,..., A*k of radius 1/fc so that /*(U;<„fcj4*) > 1 — e/2k for all /x € II. Let K be the closure of r\^Li U,-<nfc Af. Clearly n(K) > 1 — e. K is totally bounded since if Bj is a ball with the same center as Aj and radius 2/fc then Bj, 1 < j < n^ covers K. Being a closed and totally bounded subset of a complete space, K is compact and the proof is complete. D Proof of (ii) Suppose (H) fails for some e and 6. Enumerate the members of the countable dense set gi, g2,..., let A{ = B(qi, 6), and Gn = U"=1 A{. For each n there is a measure \in G II so that ii„(Gn) < 1 — e. We claim that \in has no convergent subsequence. To prove this, suppose ii„fc =>• v. (iii) of (1.1) implies that for each m v(Gm) < limsup fi„k{Gm) k—*oo < limsup fink(Gnk) < 1 — e k—*oo
282 Chapter 8 Weak Convergence However, Gm j S as m j oo so we have u(S) < 1 — e, a contradiction. D 8.3. The Space C Let C = C([0, l],Rd) be the space of continuous functions from [0,1] to Rd equipped with the norm |M| = supK*)l i<l Let B be the collection of Borel subsets of C. Introduce the coordinate random variables Xt(w) = w(i) and define the finite dimensional sets by {w : Xti((j) € A{ for 1 < i < k) where 0 <ix < t2 < ... < tk < 1 and A{ £ Kd, the Borel subsets of Rd. (3.1) Lemma. B is the same as the cr-field C generated by the finite dimensional sets. Proof Observe that if £ is a given continuous function {w : ||u - e|| < € - 1/n} = Dq {w : |w(g) - ¢(¢) | < e - 1/n} where the intersection is over all rationals q £ [0,1]. Letting n —» oo shows {w : ||w — £|| < e} G C and B C.C. To prove the reverse inclusion observe that if the A{ are open the finite dimensional set {w : w(i,-) £ A;} is open so the tt — X theorem implies C C.B. D Let 0 < *! < t2 < ... < t„ < 1 and wt : C -* (Rd)n be defined by Trt{u) = (u(ti),...,u(tn)) Given a measure \i on (C,C), the measures fi o w^ , which give the distribution of the vectors (Xilt.. .,Xin) under n, are called the finite dimensional distributions or f.d.d.'s for short. On the basis of (3.1) one might hope that convergence of the f.d.d.'s might be enough for weak convergence of the \in. However, a simple example shows this is false. Example 3.1. Let an = 1/2— l/2n, 6„ = 1/2— l/4n, and let \in be the point mass on the function {0 ze[0,an] 4n(z - a„) x £ [a„, 6„] 471(1/2-«) xe[bn,l/2] 0 «€[1/2,1]
Section 8.3 The Space C 283 As n —» oo, fn{t) —» /oo = 0 but not uniformly. To see that fin does not converge weakly to fi^, note that /i(w) = sup0<t<i w(i) is a continuous function but Jh{w)nn{du) = 1 j* 0 = Jh^fiooidw) (3.2) Theorem. Let /*„, 1 < n < oo be probability measures on C. If the finite dimensional distributions of \in converge to those of fi^ and if the \in are tight then fin => fioa- Proof If fin is tight then by (2.1) it is relatively compact and hence each subsequence fi„m has a further subsequence finim that converges to a limit v. If / : Rfc —» R is bounded and continuous then f{Xtlt ■ ■ -Xtk) is bounded and continuous from C to R and hence J f(Xu,... Xtk)fin,m(da,) - y f(Xu,...X,Ji/OkO The last conclusion implies that fini o ir^1 =>• i/ o ir^*1, so from the assumed convergence of the f.d.d.'s we see that the f.d.d's of v are determined and hence there is only one subsequential limit. To see that this implies that the whole sequence converges to i/, we use the following result, which as the proof shows, is valid for weak convergence on any space. To prepare for the proof we ask the reader to do Exercise 3.1. Let rn be a sequence of real numbers. If each subsequence of rn has a further subsequence that converges to r then r„ —» r. (3.3) Lemma. If each subsequence of fin has a further subsequence that converges to v then fin ■=$■ v. Proof Note that if / is a bounded continuous function, the sequence of real numbers f f(u)fin(du) have the property that every subsequence has a further subsequence that converges to f f(u)v(du). Exercise 3.1 implies that the whole sequence of real numbers converges to the indicated limit. Since this holds for any bounded continuous / the desired result follows. D As Example 3.1 suggests, fin will not be tight if it concentrates on paths that oscillate too much. To find conditions that guarantee tightness, we introduce the modulus of continuity ws(u) = sup{ |w(s) — w(i)| : \s — t\ < 6}
284 Chapter 8 Weak Convergence (3.4) Theorem. The sequence \in is tight if and only if for each e > 0 there are no, M and 8 so that (i) Mn(|w(0)| > M) < e for all n > n0 (ii) fin (ws > e) < e for all n > n0 Remark. Of course by increasing M and decreasing 6 we can always check the condition with no = 1 but the formulation in (3.4) eliminates the need for that final adjustment. Also by taking e = 77 A C it follows that if fin is tight then there is a 8 and an no so that y.n{^s > *]) < C f°r n > no- Proof We begin by recalling (see e.g., Royden (1988), page 169) (3.5) The Arzela-Ascoli Theorem. A subset A of C has compact closure if and only if supw6j4 |w(0)| < 00 and limf_osupa,6j4 ws(u) = 0. To prove the necessity of (i) and (ii), we note that if \in is tight and e > 0 we can choose a compact set K so that Hn(K) > 1 — e for all n. By (3.5), K C {X(0) < M} for large M and if e > 0 then K C {w« < e} for small 6. To prove the sufficiency of (i) and (ii), choose M so that /in(|X(0)| > M) < e/2 for all n and choose 5u so that ^n(w«fc > 1/&) < e/2i+1 for all n. If we let K be the closure of {^(0) < M,wsk < 1/k for all k} then (3.5) implies K is compact" and fin(K) > 1 — e for all n. D Condition (i) is usually easy to check. For example it is trivial when Xn(0) = xn and xn —» x as n —» 00. The next result, (3.6), will be useful in checking condition (ii). Here, we formulate the result in terms of random variables Xn taking values in C, that is, to say random processes {Xn(t),0 < t < 1}. However, one can easily rewrite the result in terms of their distributions fin(A) = P(Xn e A). Here, A EC. (3.6) Theorem. Suppose for some a,/? > 0 E\Xn(t2) - Xnit^f < K\U - i!|1+a then (ii) in (3.4) holds. Remark. The condition should remind the reader of Kolmogorov's continuity criterion, (1.6) in Chapter 1. The key to our proof is the observation that the proof of (1.6) gives quantitative estimates on the modulus of continuity. For a much different approach to a slightly more general result, see Theorem 12.3 in Billingsley (1968).
Section 8.4 Skorokhod's Existence Theorem for SDE 285 Proof Let 7 < a//3, and pick 77 > 0 small enough so that A = (l-77)(l + a-/?T)-(l + 77)>0 From (1.7) in Chapter 1 it follows that if A = 3 • 2^-^/(1 - 2^) then with probability > 1 - K2~NX/(1 - 2~A) we have \Xn{q) - Xn(r)\ < A\q - r\y for q,r £ Q2 fl [0,1] with 2^-7^ Pick N so that K2-NX/(1 - 2~A) < e then 8 < 2^-^ so that ASn < e and it follows that P(ws > e) < e. D 8.4. Skorokhod's Existence Theorem for SDE In this section, we will describe Skorokhod's approach to constructing solutions of stochastic differential equations. We will consider the special case (*) dXt = (r(Xt)dBt where a is bounded and continuous, since we can introduce the term b(Xt) dt by change of measure. The new feature here is that a is only assumed to be continuous, not Lipschitz or even Holder continuous. Examples at the end of Section 5.3 show that we cannot hope to have uniqueness in this generality. Skorokhod's idea for solving stochastic differential equations was to dis- cretize time to get an equation that is trivial to solve, and then pass to the limit and extract subsequential limits to solve the original equation. For each n, define Xn(t) by setting X„(0) = x and, for m2~" < t < (m + l)2-n, Xn(t) = Xn(m2-n) + cr(Xn(m2-n))(Bi - B{m2-n)) Since Xn is a stochastic integral with respect to Brownian motion, the formula for covariance of stochastic integrals implies (X'n,Xi)t = £ f\aikaik){Xn{[2ns]l2n))ds = faij(Xn([2ns]/2n))ds Jo where as usual a = aaT. If we suppose that a{j(x) < M for all i,j and 2, it follows that if s <t, then \(X'n,Xi)t-(X'n,Xi),\<M(t-s)
286 Chapter 8 Weak Convergence so (5.1) in Chapter 3 implies E sup |Xi(ti)-Xi(5)|"<Cp^|(X')t-(X;)I|"/'<Cp{M(i--)K/2 «e[M] Taking p = 4, we see that E sup |X< (u) - X^(s)|4 < CM2(t - s)2 ti6[3,i] Using (3.6) to check (ii) in (3.4) and then noting that Xn(0) = x so (i) in (3.4) is trivial, it follows that the sequence Xn is tight. Invoking Prohorov's Theorem (2.1) now, we can conclude that there is a subsequence Xn(M that converges weakly to a limit X. We claim that X satisfies MP(0,a). To prove this, we will show (4.1) Lemma. If / £ C2 and Lf = \ £y atJ Di}f then f(Xt) — /(X0) — / Lf(X,)ds is a local martingale Jo Once this is done the desired conclusion follows by applying the result to f(x) = H and f(x) = x{Xj. Proof It suffices to show that if /, A'/, and JDy/ are bounded, then the process above is a martingale. Ito's formula implies that /(*»(*)) " /tM»)) = E jf Dif(Xn(r))dX'n(r) + \y£jyiif{Xn{r))d{Xin,Xi)r So it follows from the definition of Xn that <X;,x£)r= f a{j(Xn([2nu]2-"))du Jo so if we let Lnf(r) = i ^^([2^2^))^/(^) 2 then /(X„(i)) — /(Xn(s)) — /3 Lnf(r) dr is a local martingale.
Section 8.5 Donsker's Theorem 287 Skorokhod's representation theorem, (1.4), implies that we can construct processes Yk with the same distributions as the Xn(k) on some probability space in such a way that with probability 1 as k —* oo, Yk(t) converges to Y(t) uniformly on [0,T] for any T < oo. If s < t and g : C —* R is a bounded continuous function that is measurable with respect to T,, then E[g{Y). j/(Y0 - f(Y.) - J* Lf(Yr) dr}) = \imE (g(Yk) ■ {/(Y/=) - f(Y,k) - jf ^/(Y^rir}) = 0 Since this holds for any continuous g, an application of the monotone class theorem shows E (/(¾) - f{Y.) - £ Lf(Yr)dr T^j = 0 which proves (4.1). D 8.5. Donsker's Theorem Let &, 6. • ■ • be i.i.d. with E£i = 0 and Eg = 1. Let Sn = 6 + • • • + £„ be the nth partial sum. The most natural way to turn Sm, 0 < m < n into a process indexed by 0 < t < 1 let 5" = S[ni]/y/n where [z] is the largest integer < x. To have a continuous trajectory we will instead let Bn _ f Sm/-A if t = m/n i \ linear if t £ [m/n, (m + l)/n] (5.1) Donsker's Theorem. As n —* oo, Bn =>• B, where B is a standard Brownian motion. Proof We will prove this result using (3.2). Thus there are two things to do: Convergence of finite dimensional distributions. Since Srnli is the sum of [nt] independent random variables, it follows from the central limit theorem that if 0 < ii < ... i„ < 1 then To extend the last conclusion from Bn to Bn, we begin by observing that if rn = nt — [nt] then .B? = 5?+ »7, £[„*]+!/V"
288 Chapter 8 Weak Convergence Since 0 < r„ < 1, we have r„£[nl]+i/^/n —» 0 in probability and the converging together lemma, (1.3), implies that the individual random variables B" =>• Bt. To treat the vector we observe that (i% - 5?._x) - (SI\ - B»_J = (*<"• " B?.) - (5?._x - 5?._x) -+ 0 in probability, so it follows from (1.3) that (Btl . Bt2 ~ Btl ' ■ ■ ■ ' Btm ~ B"m-1 ) =** (-8^ yBt2~ Btl,..., Btm - Btm_! ) Using the fact that (xi, z2,.. ■, zn) —* (zi> ^1 + x2, ■ ■ • > a;i + • ■ • + xm) is a continuous mapping and invoking (1.2) it follows that the finite dimensional distributions of Bn converge to those of B. Tightness. The L2 maximal inequality for martingales (see e.g., (4.3) in Chapter 4 of Durrett (1995)) implies E (max Sj] < AESj \°<i<t J ~ Taking £ = n/m and using Chebyshev's inequality it follows that P ( max IS,-1 > ey/n) < 4/me2 \0<j<n/m ' Jl J - ' or writing things in terms of J3" P ( max 15" - 5" I > e ) < 4/me2 V3<i<3 + l/m' * ' ' J ~ ' Since we need m intervals of length 1/m to cover [0,1] the last estimate is not good enough to prove tightness. To improve this, we will truncate and compute fourth moments. To isolate these details we will formulate a general result. (5.2) Lemma. Let £1,^2,... be i.i.d. with mean \i. Let £,• = &l(|f|<Af)> let Si = 6 + - • •+&, let a(M) = E(\ti\2; |&| > M), let pM = E&, and vM = E$). Then we have (i) P(6 # 6) < M-*E{\ti\3; |-e| >M) = a(M)/M2 (ii) \n - fiM\ < E(\Zi\; 1^1 > M) < cc{M)/M (iii) If M > 1 and a(M) < 1 then E(St - IJxmY < CXIM2 + C2t2
Section 8.5 Donsker's Theorem 289 where Ci = 32 {pAM + vM} and C2 = 24 {p.2M + vM}. Remark. We give the explicit values of C\ and C2 only to make it clear that they only depend on jxm and T>m■ Proof Only (iii) needs to be proved. Since E(£j — Hm) = 0 and the |y are independent =^(¾ - ^)4 + Q Q) ^(¾ - um? If p is an even integer (we only care about p = 2,4) then E(ij-pMY<2"(fipM + E^) Prom this, (ii), and our assumptions it follows that E{ij - w)2 < 4(/¾ + vm) This gives the term C^2- To estimate the fourth moment we notice that rM E(tf)= / 4x3P(\i]-\>x)dx Jo rM < 2M2 / 2xP(\£j | > x) dx = 1M2vM Jo So using our inequality for the pth power again, we have E(ij - Mf)4 < 16(/22, + 1MHm) Since we have supposed M > 1, this gives the term C\£M2 and the proof is complete. D We will apply (5.2) with M = 6y/n. Part (i) implies (5.3) nP(6#£)<n.^Uo by the dominated convergence theorem. Since \i — 0 part (ii) implies (5.4) n\(lM\<n.^p- = o(V^)
290 Chapter 8 Weak Convergence as n —* oo. Using the L4 maximal inequality for martingales (again see e.g., (4.3) in Chapter 4 of Durrett (1995)) and part (iii), it follows that if £ = n/m and n is large then (here and in what follows C will change from line to line) E sup \Sk - fc/2M|4 < CE\St - £j!m\4 0<k<l < C{£M2 +e)<c(- + -^-) n2 — \m m -/ Using Chebyshev's inequality now it follows that (5.5) P f sup ^ \Sk -kJiM\> eVn'/o) < Ce~4 (^ + ^) Let B" = S[nl]/V" and Jfc,m = [k/m,(k + l)/m]. Adding up m of the probabilities in (5.5), it follows that limsup P (max max \Bns - Sty J > e/l) < Cr* id2 + 1) To turn this into an estimate of the modulus of continuity, we note that *»i/m(/)<3 max max \f(s) - f(k/m)\ 0<*<m J6ik,m since for example if k/m <s<(k+ l)/m <t < (k + 2)/m \f(t)-m\<\f(t)-f((k+l)/m)\ + |/((t + l)/m) - f(k/m)\ + \f(k/m) - f(s)\ If we pick m and 6 so that Ce~4(<52 + 1/m) < e it follows that (5.6) limsup P(w1/m(Bn) > e/3) < e n—>oo To convert this into an estimate the modulus of continuity of Bn we begin by observing that (5.3) and (5.4) imply pf sup \B?-B?\>e/3) ^0 \0<1<1 J as n —» oo so the triangle inequality implies limsup P(w1/m(Bn) > e) < e
Section 8.5 Donsker's Theorem 291 Now the maximum oscillation of Bn over [s,t] is smaller than the maximum oscillation of Bn over [[ns]/n,([nf] + l)/n] so (5.7) ws(Bn) < ws+2/n(Bn) and it follows that limsupP(iu1/2m(5n) > e) < e n—»00 This verifies (ii) in (3.4) and completes the proof of Donsker's theorem. D The main motivation for proving Donsker's theorem is that it gives as corollaries a number of interesting facts about random walks. The key to the vault is the continuous mapping theorem, (1.2), which with (5.1) implies: (5.8) Theorem. If t/> : C[0,1] —» R has the property that it is continuous Po-a.s. then 4>{Bn) =» *I>{B). Example 5.1. Let i>(u) = max{w(f) : 0 < t < 1}. It is easy to see that |t/>(w) — ip(0\ < llw — V"l| so V1 : C[0,1] -» R is continuous, and (5.8) implies max Sm/y/n=> M\ = max Bt 0<m<n 0<i<l To complete the picture, we observe that by (3.8) in Chapter 1 the distribution of the right-hand side is Po(Mi > a) = P0(Ta < 1) = 2P0(Bi > a) Example 5.2. Let V(w) = sup{i < 1 : w(i) = 0}. This time ij) is not continuous, for if w£ has w£(0) = 0, w£(l/3) = 1, w£(2/3) = e, w(l) = 2, and linear on each interval \j,(j + 1)/3] then i>(u0) = 2/3 but ip(uc) = 0 for e > 0. It is easy to see that if ip(w) < 1 and w(t) has positive and negative values in each interval (V"(w) — ^ilKw)) tnen i> is continuous at w. By arguments in Example 3.1 of Chapter 1, the last set has Pq measure 1. (If the zero at ip(<j) was isolated on the left, it would not be isolated on the right.) Using (5.8) now sup{m < n : 5m_i • Sm < 0}/n =» L = sup{i < 1 : Bt = 0} The distribution of L, given in (4.2) in Chapter 1, is an arcsine law. Example 5.3. Let ij>(w) = \{t G [0,1] : u(t) > a}\. The point w = a shows that t/1 is not continuous but it is easy to see that tp is continuous at paths w with \{t G [0,1] : w(i) = a}\ = 0. Fubini's theorem implies that EQ\{t G [0,1] : Bt = a}\ = f P0(Bt = a)dt = 0 Jo
292 Chapter 8 Weak Convergence so i[> is continuous P0-a..s. With a little work (5.8) implies \{m < n : Sm > ay/n}\/n => \{t g [0,1] : Bt > a}\ Before doing that work we would like to observe that (9.2) in Chapter 4 shows that \{t £ [0,1] : Bt > 0}| has an arcsine law under Pq. Proof Application of (5.8) gives that for any a, \{t e [0,1] : B? > aJZ}\ =»► \{t e [0,1] : Bt > a}\ To convert this into a result about \{m <n : Sm > ay/n}\ we note that if e > 0 then P(max|Xm| > ey/n) < nP(\Xm\ > e^/n) (5.9) m^n <e-2E(Xl;\Xm\>e^l)^0 by dominated convergence, and on {maxm<„ \Xm\ < ey/n}, we have \{t e [0,1] : B? > (a + e)V^}\ < -\{m < n : Sm > ay/K)\ n <|{i€[0,l]:B?>(a-OVn}| Combining this with the first conclusion of the proof and using the fact that b —» \{t £ [0,1] : J3t > 6}| is continuous at 6 = a with probability one, we arrive easily at the desired conclusion. D Example 5.4. Let i>(u) = frQ1,u(t)kdt where k > 0 is an integer. i[> is continuous so applying (5.8) gives / {B?/y/n)k dt => f Bk dt Jo Jo To convert this into a result about the original sequence, we begin by observing that if x < y with \x — y\ < e and \x\, \y\ < M then y \z\k+l , eMk+i , k k, f l*f+1 j eM*+ From this it follows that on Gn(M) = I max \Xm \ < M ^^y/n, max lm<n m5-n
Section 8.6 The Space D 293 we have 3k X (S(ni)/Vn)fc di - n-1-(^2) £ S* m = l < (fc + 1)M For fixed M we have P(maxm<„ \Xm\ < M~^k+2^-^/n) -» 0 by (5.9) so using Example 5.1, and (1.1) it follows that liminf P(Gn(M)) > P ( max \BA < M) n-~oo v // — \j)<i<l J The right-hand side is close to 1 if M is large so Jo m = l in probability and it follows from the converging together lemma (1.3) that m=l •'O It is remarkable that the last result holds under the assumption that EXi = 0 and EXf = 1, i.e., we do not need to assume that E\Xk\ < oo. 8.6. The Space D In the previous section, forming the piecewise linear approximation was an annoying bookkeeping detail. In the next section when we consider processes that jump at random times, making them piecewise linear will be a genuine nuisance. To deal with that problem and to educate the reader, we will introduce the space JD([0,1], Rd) of functions from [0,1] into Rd that are right continuous and have left limits. Since we only need two simple results, (6.4) and (6.5) below, and one can find this material in Chapter 3 of Billingsley (1968), Chapter 3 of Ethier and Kurtz (1986), or Chapter VI of Jacod and Shiryaev (1987), we will content ourselves to simply state the results. We begin by defining the Skorokhod topology on D. To motivate this consider Example 6.1. For 1 < n < oo let f m-/° *€ [0,(n+l)/2n) Mj"\l *€ [(n+l)/2n,l]
294 Chapter 8 Weak Convergence where (n+l)/2n = l/2fom = oo. We certainly want/„ —» /«, but ||/n— /oo|| = 1 for all n. Let A be the class of strictly increasing continuous mappings of [0,1] onto itself. Such functions necessarily have A(0) = 0 and A(l) = 1. For f,g £ D define d(f, g) to be the infimum of those positive e for which there is a A € A so that sup |A(i) -1\ < e and sup \f(t) - g(X(t))\ < e t t It is easy to see that d is a metric. If we consider / = fn and g = fm in Example 6.1 then for e < 1 we must take A((n + l)/2n) = (m + l)/2m so d(fn,fm) = n+1 In m+1 2m = 1 2n ~ 1 " 2m When m = oo we have d(fn,foa) = l/2n so /„ —» /«, in the metric d. We will see in (6.2) that d defines the correct topology on D. However, in view of (2.2), it is unfortunate that the metric d is not complete. Example 6.2. For 1 < n < oo let fo te [0,1/2) gn{t)={ 1 iG[l/2,(n+l)/2n) [0 t £ [{n + l)/2n, 1] In order to have e < 1 in the definition of d(gn, gm) we must have A(l/2) = 1/2) and A((n + l)/2n) = (m + l)/2m so d(gn,gm) = 2n ~ 2m The pointwise limit of gn is g^ = 0 but d(gn,g00) = 1. We leave it to the reader to show Exercise 6.1. If h £ D then liminfn-.oo d(gn,h) > 0. To fix the problem with completeness we require that A be close to the identity in a more stringent sense: the slopes of all of its chords are close to 1. If A G A let log f^MfT = sup t-s
Section 8.6 The Space D 295 For f,g £ D define d0(x, y) to be the iniimum of those positive e for which there is a A G A so that ||A||<e and sup \f(t) - g(\(t))\ < e t It is easy to see that do is a metric. The functions gn in Example 6.2 have do(gn,gm) = min{l, |log(n/m)|} so they no longer form a Cauchy sequence. In fact, there are no more problems. (6.1) Theorem. The space D is complete under the metric do. For the reader who is curious why we discussed the simpler metric d we note: (6.2) Theorem. The metrics d and do are equivalent, i.e., they give rise to the same topology on D. Our first step in studying weak convergence in D is to characterize tightness. Generalizing a definition from Section 7.3 we let ws(f) = sup{|/(i) - /(s).| :s,t e S} for each S C [0,1]. For 0 < 6 < 1 put w's(f) = inf maxw[titU+l)(f) where the iniimum extends over the finite sets {i,-} with 0 = t0 < ti <■■■ <tr and U - *,-_! > 6 for 1 < i < r The analogue of (3.4) in the current setting is (6.3) Theorem. The sequence fin is tight if and only if for each e > 0 there are no, M, and S so that (i) n„(\u(0)\ > M) < e for all n > n0 (ii) finis's ■> ^) < e f°r all n > n0 Note that the only difference from (3.4) is the ' in (ii). If we remove the ' we get the main result we need.
296 Chapter 8 Weak Convergence (6.4) Theorem. If for each e > 0 there are no, M, and 8 so that (i) Mn(|w(0)| > M) < e for all n > nQ (ii) fMn(w's > e) < e for all n > no then nn is tight and every subsequential limit fj. has fi(C) = 1. (6.1)-(6.4) are Theorems 14.2, 14.1, 15.2, and 15.5 in Billingsley (1968). We will also need the following, which is a consequence of the proof of (3.3). (6.5) Theorem. If each subsequence of \in has a further subsequence that converges to \i then \in =r- fi. 8.7. Convergence to Diffusions In this section we will prove a result, due to Stroock and Varadhan, about the convergence of Markov chains to diffusion processes. Our treatment here follows that in Chapter 11 of Stroock and Varadhan (1979) but we will discuss both discrete and continuous time in parallel. Our duplicity lengthens the proof somewhat, but is preferable we believe to the usual (somewhat dangerous) cop- out: "the same argument with minor changes handles the other case." In discrete time, the basic data is a sequence of transition probabilities Iih(x, dy) for a Markov chain Y^h, m = 0,1,2,..., taking values in Sh C Rd, i.e., P(Ytm+1)h € A\Y*h = x) = nA(x, A) iotxeSh,Ae Tld In this case we define X^ = Y£rt/h-,, i.e., we make X% constant on intervals [mh, (m + l)/i). In continuous time, the basic data is a sequence of transition rates Qh{x, dy) for a Markov chain X^, t > 0, taking values in Sh C Iid, i.e., ^P(X* e A\X$ = x) = Qh(x, A) foix£Sh,Ae 1ld, with x £ A at Note that although x £ A is excluded from the interpretation, we will allow Q(x, {x}) > 0 in which case jumps from z to z (which are invisible) occur at a positive rate. To have a well behaved Markov chain we will suppose (B) For any compact set K, supx6Jf Qh(x,Hd) < oo. In words, our convergence theorem states that if the infinitesimal mean and covariance of the Markov chain converge to those of a diffusion for which the martingale problem is well posed, and we have a condition that rules out jumps
Section 8.7 Convergence to Diffusions 297 in the limit then we have weak convergence. To make a precise statement we need some notation. To incorporate both cases we introduce a kernel vi j \ — /Rh(x,dy)/h in discrete time ^ ' — \ Qh(x,dy) in continuous time and define aij(x) = / (2/. - xi)(yj ~ xi)Kh{x,dy) J\y-x\<\ &*(*)=/ {Vi-*i)Kh(x,dy) J\y-x\<l /|y-x|<l Ahc(x) = Kh(x,B(x,£y) where B(x, e) = {y : \y — x\ < e}. Suppose (A) a11 and 6' are continuous coefficients for which the martingale problem is well posed, i.e., for each x there is a unique measure Px on {C,C) so that the coordinate maps Xt(u) = w(i) satisfy Px(Xq = x) = 1 and X\- j bi{X,)ds and X\X{ - f a^iX^ds Jo Jo are local martingales. (7.1) Theorem. Suppose in continuous time that (B) holds, and in either case that (A) holds and for each z, j, R < oo, and e > 0 (i) limfciosupi^B \a^(x) - a0(a;)| = 0 (ii) limAiosup|x|<fl \b*(x) - &,-(z)| = 0 (iii) limhi0sup|x|<fi A*(z) = 0 If Xq = Xh —» x then we have Xp =>• Xt the solution of the martingale problem with Xq = x. Remarks. Here =>• denotes convergence in JD([0,l],Rd) but the result can be trivially generalized to give convergence D([0,T],Hd) for all T < oo. In (i), (ii), and (iii) the sup is taken only over points x £ Sh- Condition (iii) cannot hold unless the points in Sh are getting closer together. However, there is no implicit assumption that Sh is becoming dense in all of Rd. Proof The rest of the section is devoted to the proof of this result. We will first prove the result under the stronger assumptions that for all i, j and e > 0 we have
298 Chapter 8 WeaA' Convergence (i') limhi0supx |a£(z) - dij(x)\ = 0 (iv) supAx |o*-(a:)| < oo (ii') limH0suPx \b*(x) - &,(z)| = 0 (v) supAiX \$(x)\ < oo (iii') limAxosupx A*(a;) = 0 (vi) supAx A*(a;) < oo and in continuous time that (B') supx Q(x, Rd) < oo. a. Tightness If / is bounded and measurable, let Lhf(x) = JKh(x,dy)(f(y)-f(x)) As the notation may suggest, we expect this to converge to To explain the reason for this definition we note that if / is bounded and measurable then (7.2a) In discrete time f(X%h) — J2jZo hLhf(Xjh) is a martingale. (7.2b) In continuous time f(X^) — fQ Lhf(X%) is a martingale. Proof (7.2a) follows easily from the definition of Lh and induction. Since we have (B'), (7.2b) follows from (2.1) and (1.6) in Chapter 7. D The first step in the tightness proof is to see what happens when we apply Lh to fCiy(x) = fc(x - y) where fc{x) = g(\x\2/e2), and a(x)=((l-x*)(l-xf 0<x<l ^; \0 z>l The exact formula for g is not important. All we care about is that 0 < g(x) < 1 for x > 0, (/(0) = 1, g(x) = 0 for x > 1, and g G C2 (since g'(0) = g"(0) = 0). (7.3) Lemma. There is a Cc which only depends on e so that \LhfyiC\ < C£. Proof Applying Taylor's theorem to the function h(t) = f(x + t(y — x)) for 0 < t < 1 we have for some cSiV G [0,1] f(y) - /(*) = Mi) - MO) = h'(o) + h"(cXty)/2\ (7-4) = £(w _ Xi)Dif(x) + J2(yi - xi)(yj - *j)Aj/(*x,»)
Section 8.7 Convergence to Diffusions 299 where zx>y = x + cXiV(y — x). Integrating with respect to It'/,(z, dy) over the set \y — x\ < 1 and using the triangle inequality we have Lhf(x) < IV/(z) ■ f (y-x) Kh (x, dy) I J\y-x\<l + / Yl(yi- *•')(yj - xi)Dii f(z*,y)Kh(x,dy) + 2\\f\\aoKh{x,B{x,\Y) Let Ac = supx |V/(z)|, and Bt = sup2 ||Aj/(z)|| where |my 11 = sup < ^w,m0«j l«l = 1 Noticing that the Cauchy-Schwarz inequality implies (7.5) |I>t*|<MM i and the definition of 11 rrz^y 11 gives (7-6) |X> - *••)(%• - *j)Dijm\ <\y- x|2||Ai/W|| it follows that Lhf(x) < A / (y-x)Kh{x,dy) J\y-x\<l + BC f \y-x\2Kh(x,dy)+2\\f\\oaKh(x,B(x,ir) J\y-x\<l Now (7.7) £4(*) = / \y-x\2Kh(x,dy) so recalling the definitions of &f and Af, then using (iv)-(vi) the desired result follows. □ To estimate the oscillations of the sample paths, we will let tq = 0, r„ = inf{i > rn_! : |X* - X^ \ > e/4} N = inf{n > 0 : r„ > 1} cr = min{r„ — rn_i : 1 < n < N} 0 = max{\Xh(t) - Xh(t-)\ : 0 < t < 1}
300 Chapter 8 Weak Convergence To relate these definitions to tightness, we will now prove (7.8) Lemma. If a > 5 and 6 < e/4 then ws(Xh) < e. Proof To check this we note that if a > S and 0 < t — s < S there are two cases to consider Case 1. r„ < s <t < rn+\ 1/(0 - /001 < 1/(0 - /(r„)| + |/(r„) - f{s)\ < 2e/4 Case 2. r„_i < s < r„ < i < rn+i 1/(0 - /(01 < 1/(0 - /(t„)| + |/(r„) - /(rn-)| + |/(r„-) - /(t„_i)| + |/(t„_i) - /(S)| < e D Tightness proof, discrete time. One of the probabilities in (7.8) is trivial to estimate (7.9a) Py{6 > e/4) < -J-supEh(x,B(x, e/4)c) < sup Ahc/i(x) -* 0 ft X X by (iii'). The first step in estimating Py(a > 6) is to estimate Py(r\ < 5). To do this we begin by observing that (7.2a) and (7.3) imply fy,tli{Xkh) + Ce/ikh, k = 0,1,2,... is a submartingale Using the optional stopping theorem at time t AS where S = [S/h]h < S and noticing XrA$ = XrAs it follows that Ey{fy,c/4(XrAs) + Cc/4(rAS)}>l or rearranging and using r A S < S (7.10) CC/4S > Ey{l - fytC/i(XrAS)} > Py(n < S) Our next task is to estimate Py(N > k). To do this, we begin by observing Ey{e~Tl) < Py(n <S) + e-sPy{n > 5) < Ce/4S + e-s(l - Cc/4S) = A < 1
Section 8.7 Convergence to Diffusions 301 Iterating and using the strong Markov property it follows that Ey(e~Tn) < A" and (7.11) Py(N > k) = Py(rk < 1) < eEy{e-r>) < e\k Combining (7.10) and (7.11) we have (7.12) Ps(<r <S)< fcsup Px(r < 6) + Py(N > Jfc) < C£,4k6 + e\k X If we pick k so that e\k < e/3 and then pick 6 so that CckS < e/3 then (7.12), (7.9a), and (7.8) imply that for small h, supy Py{ws(Xh) > e) < e. This verifies (ii) in (6.4). Since X§ = xh, (i) is trivial and the proof of tightness is complete in discrete time. □ Tightness proof, continuous time. One of the probabilities in (7.8) is trivial to estimate. Since jumps of size > e/4 occur at a rate smaller than supx IIA(z, B(x, e/4)c) and e~z > 1 - z (7.9b) Py{6 > e/4) < l-exp(-supQh(x,B(x,e/4)c)) <supA^/4(z) ^0 X X by (iii')- The first step in estimating Py(cr > 6) is to estimate Py(ri < 6). To do this we begin by observing that (7.2b) and (7.3) imply fy,c/i(Xt) + Cc/it, t > 0 is a submartingale Using the optional stopping theorem at time r A 6 it follows that Eyify.cMXrAs) + Ce/4(r A 6)} > 1 The remainder of the proof is identical to the one in discrete time. D b. Convergence Since we have assumed that the martingale problem has a unique solution, it suffices by (6.5) to prove that the limit of any convergent subsequence solves the martingale problem to conclude that the whole sequence converges. The first step is to show (7.13) Lemma. If / € C& then \\Lhf - Lf \\oo — 0.
302 Chapter 8 Weak Convergence Proof Using (7.4) then integrating with respect to Kh(x,dy) over the set \y — x\ < 1 we have Lhf{x) = Yjbhi{x)Dif{x) i (7 U) + / J2(Vi - xi)(yj - xi)Dijf(zx,y)Kh(x, dy) K } -/|y-x|<i V + [ {f(y)-f(x)}Kh(x,dy) /|t/-x|> Recalling the definition of A J and using (vi) it follows that the third term goes to 0 uniformly in z. To treat the first term we note that (7.5) and (v) imply X>*(z)A-/(*) - X>(z)A/(z) < sup \bf(x) - bj(x)\ J2 |A/(*)| - 0 uniformly in x. Now the difference between the second term in (7.14) and J2ij aijDijf(x) is smaller than Eotf (aOUy/to - X>£(*)A-;/(z) (7.15) +1 / £(W ~ *<)(%■ ~ ^) {PaH***) ~ Duf{*)) Kh (*, dy) J|«-il<l TT By (7.5) the first term is smaller than sup M*)-ak(s)| X)Ptf/(*)l which goes to 0 uniformly in x by (iv). To estimate the second term in (7.15) we note that for any e the integral over {y : e < \y — x\ < 1} goes to 0 uniformly in x by (vi) and the fact that the integrand is uniformly bounded. Using (7.6), the last piece (7.16) ( X> - *••)(»; - *j) {Dijf(zx,y) - Dijfix)} Kh{x, dy) <r(0 / \y-x\2Kh(x,dy) J\y-x\<c
Section 8.7 Convergence to Diffusions 303 where r(c)= sup \\Dijf(zXiy) - Dijfix) \y-x\<e Now zx>y is on the line segment connecting x and y, D{jf is continuous and has compact support, so T(e) -^Oas£->0. Using (7.7) and (iv) now the quantity in (7.16) converges to 0 and the proof of (7.13) is complete. D Convergence proof, discrete time. Let hn —* 0 so that Xhn converges weakly to a limit X°. (7.2a) implies i-l f(Xtin)-J2hnLh"f(Xfrj, 4 = 0,1,2,... is a martingale i=o To pass to the limit, we rewrite the martingale property by introducing a bounded continuous function g : D —» R which is measurable with respect to T, and observe that if kn = [s/hn] + 1 and £„ = [t/hn] + 1 the martingale property implies E ( g{Xh~) J /(X*;fcJ - /(X&J - £ KL^HXfcj \)=0 Let hoo = 0. Skorokhod's representation theorem (1.4) implies that we can construct processes Y", 1 < n < oo with the same distribution as the Xhn so that Yn —* Y°° almost surely. Combining this with (7.13) and using the bounded convergence theorem it follows that if / g C\ then E (g(X°) j/(X,°) - /(X,°) - J* i/(Xr°) dr}) = 0 Since this holds for all continuous g : C —» R measurable with respect to F3 it follows that if / G C\ then f(X°) - f Lf{X°) ds is a martingale Jo Applying the last result to smoothly truncated versions of f(x) = x{ and f(x) = X{Xj it follows that X° is a solution of the martingale problem. Since the martingale problem has a unique solution, the desired conclusion now follows from (6.5). D
304 Chapter 8 Weak Convergence Convergence proof, continuous time. Let hn —* 0 so that Xhn converges weakly to a limit X°. (7.2b) implies /(X«*n) - f Lh»f{X^)ds t > 0... is a martingale Jo and repeating the discrete time proof shows that X° is a solution of the martingale problem and the desired result follows. □ c. Localization To extend the result to the original assumptions, we let <pk be a C°° function that is 1 on 5(0,k) and 0 on 5(0, k + 1) and define Kh<k{x,dy) = ipk{x)Kh{x,dy) + (1- ifk(x))6x(dy) where 8X is a point mass at z. It is easy to see that if Kh satisfies (i)-(iii) then Kh,k satisfies (i')-(iii') and (iv)-(vi). From the first two parts of the proof it follows that for each k, Xh,k is tight, and any subsequential limit solves the martingale problem with coefficients ak(x) = <pk{x)a(x) bk(x) = <pk(x)b(x) We have supposed that the martingale problem for a and 6 has a unique nonexplosive solution X°. In view of (3.3), to prove (6.1) it suffices to show that given any sequence hn —* 0 there is a subsequence h'n so that Xh" =>• X°. To prove this we note that since each sequence Xhn,k is tight, a diagonal argument shows that we can select a subsequence h'n so that for each k, Xhn,k =r- X0,k. Let Gk = {w : w(i) € 5(0,fc),0 < t < 1}. Gk is an open set so if Xh'»-k =» X°<k and H C C is open then (2.1) implies that liminf P(Xh'»-k £ Gk DH)> P(X°-k £GkDH) = P(X° g Gk D H) n—*oo since up until the exit from 5(0, k), X°-k is a solution of the martingale problem for 6 and a and hence its distribution agrees with that of X°. Now up to the first exit from 5(0, k) Xh'» has the same distribution as that of X »' so we have liminf P(Xh'» £H)> liminf P(Xh'» £ Gk D H) > P(X° eGkDH) n—*oo n—*oo Since there is no explosion the right-hand side increases to P(X° £ H) as k j oo and the proof is complete.
Section 8.8 Examples 305 8.8. Examples In this section we will give applications of (7.1) beginning with the following very simple situation. Example 8.1. Ehrenfest Chain. Here, the physical system is a box filled with air and divided in half by a plane with a small hole in it. We model this mathematically by two urns that contain a total of In balls, which we think of as the air molecules. At each time we pick a ball from the In in the two urns at random and move it to the other urn, which we think of as a molecule going through the hole. We expect the number of balls in the left urn to be about n + Cy/n, so we let Z^ be the number of balls at time m in the left urn minus n, and let Y^ = Z^/^/n. The state space S1/n = {k/y/n : -n < k < n} while the transition probability is „ , _i/2\ n — xy/n „ -i/2\ n + xy/n n1/n(z, x + n ^2) = ^- II1/n(z, x - n x'2) = ^- When e > n-1/2, A*'"(a:) = 0 for all x so (iii) holds. To check conditions (i) and (ii) we note that ,1/ / 1 n-xy/n~ 1 n-xy/n\ b i (x) = n < —j= ■ ■= • > = — x K ' {y/n In y/n In J -{i-- Tl/n/ ) _ J ± . " ~ ZV" , I . n-xy/^. K J ' - In n In The limiting coefficients 6(z) = — x and a(x) = 1 are Lipschitz continuous, so the martingale problem is well posed and (A) holds. Letting Xt'" = Yrtyn and applying (7.1) now it follows that (8.1) Theorem. As n —* oo, Xt'" converges weakly to an Ornstein-Uhlenbeck process Xt, i.e., the solution of dXt = -Xt dt + dBt Next we state a lemma that will help us check the hypotheses in the next two examples. The point here is to replace the truncated moments by ordinary moments that are easier to compute. a&te) = / (Vi ~ xi)(yj ~ xi)Kh(x, dy) "#(*) = J{yi-Xi)Kh{x,dy) 7p(x)= / \y-x\pKh(x,dy)
306 Chapter 8 Weak Convergence (8.2) Lemma. If p > 2 and for all R < oo we have (a) limftiosupi^B \a*}.(x) - a{j(x)\ = 0 (b) limAxosup^^B \bf(x) - &;(z)| = 0 (c) limhioSup^i^T^a;) = 0 then (i), (ii), and (iii) of (7.1) hold. Proof Taking the conditions in reverse order, we note A£ (z) < e~pj„ (x) \bH*)-bH*)\< [ \y-*\i<h(x,dy)<7Z(x) J\y-x\>l Using the Cauchy-Schwarz inequality with the triviality (yt — zjt)2 < \y — x\2 \%(x) - a1j(x)\ < f |(y; - x{)(yj - xj)\Kh(x, dy) J\y-x\>l < f \y-x\2Kh(x,dy)<7*(x) J\y-*\>1 The desired conclusion follows easily from the last three observations. D Our first two examples are the last two in Section 5.1, and date back to the beginnings of the subject, see Feller (1951). Example 8.2. Branching Processes. Consider a sequence of branching processes {Z^,m > 0} in which the probability of k children p£ has mean 1 + (/?„ /n) and variance a2. Suppose that (Al) /?„ -* /? G (-oo, oo), (A2) cr„ -► cr G (0, oo), (A3) for any 6 > 0, £*>«„ fc2p£ - 0 Following the motivation in Example 1.6 of Chapter 5, we let Xt'" = ZPtJn. Our goal is to show (8.3) Theorem. If (Al), (A2), and (A3) hold then x]ln converges weakly to Feller's branching diffusion Xt, i.e., the solution of dXi = pXi dt + cr^fXtdBt
Section 8.8 Examples 307 References. For a classical generating function approach and the history of the result, see Sections 3.3-3.4 of Jagers (1975). For an approach based on semigroups and generators see Section 9.1 of Ethier and Kurtz (1986). Proof We first prove the result under the assumption (A3') for any 6 > 0, £*>«„ k2p% = 0 for large n Then we will argue that if n is large then with high probability we never see any families of size larger than nS. The limiting coefficients satisfy (3.3) in Chapter 5, so using (4.1) there, we see that the martingale problem is well posed. Calculations in Example 1.6 of Chapter 5 show that (a) and (b) hold. To check (c) with p = 4, we begin by noting that when Zq = £, Z" is the sum of £ independent random variables £" with mean 1 + /?„ /n > 0 and variance a2. Let 5 > 0. Under (A3') we have in addition \g\<n6 for large n. Using (a + 6)4 < 24a4 + 2464 we have f\'n{x) = nE{n~*{Zl - nx}4\Z% = nx) < 16n-3(z/?„)4 + 16n-3E({Z? - nx{l + /?„/n)}4|Z£ = nx) To bound the second term we note that E({Z? - nx{\ + /?„/n)}4|Z0" = nx) = nxE(tf -(1 + /?„/n))4 To bound £(£" - (1 + /W«))4, we note that if 1 + /Wn < nS £(ff-(l+/Wn))4= /" 4x3P(\&-(l + f3n/n)\>x)dx Jo < 2(n8)2 f" 2x P(|3* -(1 + pn/n)\ > x) dx < 262^2 Jo Combining the last three estimates we see that if \x\ <R then j\/n{x) < 3252o^J2 + 96o^J22n-1 + 16(z/?„/n)4 Since 6 > 0 is arbitrary we have established (c) and the desired conclusion follows from (8.2) and (7.1). To replace (A3') by (A3) now, we observe that the convergence for the special case and use of the continuous mapping theorem as in Example 5.4 imply n~2 £>£=* / X,ds m=0 J0
308 Chapter 8 Weak Convergence (A3) implies that the probability of a family of size larger than nS is o(n~2). Thus if we truncate the original sequence by not allowing more than Snn children where <5n —» 0 slowly, then (Al), (A2) and (A3') will hold and the probability of a difference between the two systems will converge to 0. D Example 8.3. Wright Fisher Diffusion. Recall the setup of Example 1.7 in Chapter 5. We have an urn with n letters in it which may be A or a. To build up the urn at time m + lwe sample with replacement from the urn at time n but with probability a/n we ignore the draw and place an a in, and with probability /3/n we ignore the draw and place an A in. Let Z^ be the number of A's in the urn at time n, and let Xt'" = Z?niJn. Our goal is to show (8.4) Theorem. As n -+ oo, Xt converges weakly to the Wright-Fisher diffusion Xit i.e., the solution of dXt = {-aXt + /?(1 - Xt)) dt + y/Xt{l-Xt)dBt References. Again the first results are due to Feller (1951). Chapter 10 of Ethier and Kurtz (1986) treats this problem and generalizations allowing more than 2 alleles. Proof The limiting coefficients satisfy (3.3) in Chapter 5 so using (4.1) there we see that the martingale problem is well posed. Calculations in Example 1.7 in Chapter 5 show that (a) and (b) hold. To check condition (c) with p = 4 now, we note that if Zq = nx then the distribution of Z" is the same as that of Sn = 6 + • • • + £„ where &,... ,£„ g {0,1} are i.i.d. with P($ = 1) = z(l - a/n) + (1- x)/3/n. E(Sn - np) ^ = e(±^-p)4 nE(tj - p)4 + 6 r) E& - pf < Cv? since E(£j — p)m < 1 for all m > 0. From this and the inequality (a + 6)4 < 24a4 + 2464 it follows that nE({Z?/n - x¥\Z% = x) < n(-ax + /?(1 - x))/n4 + nCn2/n4 -* 0 uniformly for x £ [0,1] and the proof is complete. D Our final example has a deterministic limit. In such situations the following lemma is useful.
Section 8.8 Examples 309 (8.5) Lemma. If for all R < oo we have (a) limftiosupi^B \&^(x)\ = 0 (b) limftiosupi^B \b*(x) - 6,-(1)| = 0 then (i), (ii), and (iii) of (7.1) hold with 0,7(2:) = 0. Proof The new (a) and (b) imply the ones in (8.2). To check (c) of (8.2) with p = 2 we note that J2 4(=) = f\y~ = 1¾ (=, dy) = T2A(z) □ Example 8.4. Epidemics. Consider a population with a fixed size of n individuals and think of a disease like measles so that recovered individuals are immune to further infection. Let S" and I" be the number of susceptible and infected individuals at time n, so that n — (S" +1") is the number of recovered individuals. We suppose that (S", /") makes transitions at the following rates: (s, z) —» (s + 1, z) a(n — s — i) (s, i) —* (s — 1, z" + 1) /?sz"/n (s, z") -* (s, z - 1) 71 In words, each immune individual dies at rate a and is replaced by a new susceptible. Each susceptible individual becomes infected at rate /? times the fraction of the population that is infected, i/n. Finally, each infected individual recovers at rate 7 and enters the immune class. Let X]'n = (S?/n, I?/n). To check (a) in (8.5) we note that a}{"(zi,z2) = — ■{an(l-a;1 -22) + /^=1=2} n- 1/"/ N 1 a «12 (=11=2) = — -pnxix2 n- a^"(2:1,2:2) = —n • {Pnxix2 + jnxix2} n and all three terms converge to 0 uniformly in T = {(2:1,2:2) : 2:1,2:2 > 0,2:1 + =2 < 1}- To check (b) in (8.5) we observe 6^(2:1,2:2) = an(l -2:1.- =2) (3nxix2 ■ - n n = a(l — x\ — x2) — (ix\x2 = 61(2:) 1.1/"/ \ a * * 6, (2:1, 2:2) = pnxix2 ■ 7ri2:2 ■ — n n = f3x\x2 — jx2 = 62(2:)
310 Chapter 8 Weak Convergence The limiting coefficients are Lipschitz continuous on T. Their values outside T are irrelevant so we extend them to be Lipschitz continuous. It follows that the martingale problem is well posed and we have (8.6) Theorem. Asn-^oo, Xt'" converges weakly to Xt, the solution of the ordinary differential equation dXt = b(Xt)dt Remark. If we set a = 0 then our epidemic model reduces to one considered in Sections 11.1-11.3 of Ethier and Kurtz (1986). In addition to (8.6) they prove a central limit theorem for ^/n{Xt'" — Xt).
Solutions to Exercises Chapter 1 1.1. Let A = {A = {w: (w(*i),w(*2), • • •) € B} : B G ft*1'2-"}}. Clearly, any A G .4 is in the cr-field generated by the finite dimensional sets. To complete the proof, we only have to check that A is a cr-field. The first and easier step is to note if A = {w : (w(*i),w(i2),...) € B} then Ae = {w : (w(*i),w(i2),...) G Bc} G .4. To check that A is closed under countable unions, let An = {w : (w(i"),w(i2),-. ■) G £„}, let ii, i2,... be an ordering of {t^ \ n,m > 1} and note that we can write An = {w : (w(2i),w(f2),...) G ^n} so U„j4„ = {w : (w(ii),w(i2),...)GU„£„}G A. 1.2. Let j4n = {w : there is an s G [0,1] so that \Bt - B3\ < C\t - s\^ when \t ~ s\ < k/n}- For 1 < i < n - k + 1 let yiin=max{|5(i±i) -5 (i±^i)|:j = 0,1,...fc-l} Bn = { at least one Yl-itl is < (2fc - l)C/n7} Again .An C Bn but this time if j > 1/2 + 1/fc, i.e., £(1/2 - 7) < -1, then P(Bn) < nP(\B(l/n)\ < (2k - l)C/n*)k < nP(|5(l)| < (2k - yCn1'2^)* < c'n^1'2-^1 — 0 1.3. The first step is to observe that the scaling relationship (1.3) implies (*) Am,„ = 2-"/2Ali0 while the definition of Brownian motion shows EA\ 0 = t, and £'(Af0 — 02 = C < 00. Using (*) and the definition of Brownian motion, it follows that if k ^ m then Af n — t2~n and A^ „ — t2~n are independent and have mean 0 so E[ £ (Allin-t2-n)) = £ E(All>n-t2-n)2 = 2nC2-2" \l<m<2" / Km<2"
312 Solutions to Exercises where in the last equality we have used (*) again. The last result and Cheby- shev's inequality imply £ <--< > l/n < n2 ■ CTn l<m<2n The right-hand side is summable so the Borel-Cantelli lemma implies E<„-* m<2» > l/n infinitely often J = 0 2.1. Take Y(w) = /((*_,). 2.2. When f(x) = z,-, 2.1 becomes EX(B\\F3) = EB(,)B\_3 = B\ since under Px, B\_3 is normal with mean z,-. When f(x) = X{Xj with i ^ j 2.1 becomes EX{B\B{\TS) = EB.(B)_,Bi_,) = B[B{ since under Px, B\_a and 5j_5 are independent normals with means z,- and Xj. 2.3. Let Y = l(T0>i) and note that Tb°#i = R—lsoYodi = l(ij>i+1). Using the Markov property gives PX(R > 1 + t\?i) = PBl(To > t) Taking expected value now and recalling PX{B\ = y) = P\(x,y) gives Px(# > 1 +0 = [Pi(x,v)Py(TQ > t)dy 2.4. Let Y = l(T0>i_t) and note that Y o 9t = !(£<<)■ Using the Markov property gives Po(L<t\rt)=PBt(T0>l-t) Taking expected value now and recalling Po(Bi = y) = p<(0,y) gives Po(i < 0 = JPt(0,y)Py(To > 1 -0 c/y 2.5. We will prove the result by induction on n. By assumption it holds for n = 1. Suppose now it is true for n. Let y(w) = 1(t>i,w16«')- Applying the Markov property at time n gives £*(y o 6n\Tn) =PBn(T>l,B1e K)
Answers for Chapter 1 313 Integrating both sides over An = {T > n, Bn G K}, recalling the definition of conditional expectation, and using our assumption shows Ex(Yo6n;An)>aP(An) The right-hand side is smaller than P{An+i) and the desired result follows. 2.6. Since £ h(r, Br) dr G T„ we have : f h(r,Br)dr Jo + EX(J h(r,Br)dr ?A To evaluate the second term we let Y(w) = fQ h(s + u,wu) du and note Yo63=[ h(s + u,Bs+u)du Jo so the Markov property implies Ex (I Mr' Br)dr T* E> u: h(r,Br)dr T* --EbAI h(s + u,Bu)du 2.7. Since fi c(Br) dr E T„ we have Ex (/(5,) exp(Cl)|^) = exp(c,)£x (/(£<) exp (7 c(Br)dr ^ To evaluate the second term we let Y(w) = f(Bt-3)exp(fQ ' c(uu)du) and note Y o63 = /(5*) exp( fQ~3 c(Bs+u) du) so the Markov property implies Ex ff(Bt)exp(j c(Br)dr) T,\ = EB(l) (f(Bt-3)exp(J c(wu)du)\ 2.8. (2.8), (2.9), and the Markov property imply that with probability one there are times si > ti > s2 > t2 . ■. that converge to a so that B(sn) = B(a) and B(tn) > B(a). In each interval (sn+i,sn) there is a local maximum. 2.9. Z = limsupno B(t)/f(t) G T$ so (2.7) implies P0(Z > c) G {0,1} for each c, which implies that Z is constant almost surely.
314 Solutions to Exercises 2.10. Let C < oo, tn i 0, AN = {B(tn) > CyfQ for some n > N} and A = DffAff. A trivial inequality and the scaling relation (1.3) imply Po(AN) > Po(B(tN) > C^) = Po(B(l) > C) > 0 Letting N —* oo and noting An J. A we have Pq(A) > Pq(Bi > C) > 0. Since A e T$ it follows from (2.7) that P0(A) = 1, that is, limsup^o B{i)/y/i > C with probability one. Since C is arbitrary the proof is complete. 2.11. Since coordinates are independent it suffices to prove the result in one dimension. Continuity implies that if 8 is small P0 sup 15,1 <e/2 =2a>0 ^0<3<6 so it follows from symmetry that Po ( sup \B,\<e/2,Bs <0 \0<s<6 ao( sup |B,| \0<3<5 < e/2, Bs > 0 = a > 0 Using the top result for x > 0 and the bottom one for x < 0 shows that if xe [-e/2, e/2] Px ( sup |5t| < e,Bs G [-e/2, e/2] ) > a > 0 \0<i<« J Iterating and using Exercise 2.5 gives that if x £ [—e/2, e/2] Px ( sup |5(| < e,BnS e [-e/2, e/2] ) > a" > 0 \0<i<n« J 3.1. Suppose j4 = U^-iKn where J<T„ are closed. Let Hn = UD,=1J<rm. if„ is closed so (3.4) implies T„ = inf{i : Bt € Hn} is a stopping time. In view of (3.2) we can complete the proof now by showing that Tn J. Ta as n | oo. Clearly, T^ < limT„ for all n. To see that < cannot hold let t be a time at which Bt € A. Then Bt € Kn for some n and hence limT„ < i. Taking the inf over t with Bt & A v/e have limTn < Ta- 3.2. If m2"" < t < (m + 1)2-" then {Sn <t} = {S < m2-"> G ?m2-n C ^i.
Answers for Chapter 1 315 3.3. Since constant times are stopping times the last three statements follow from the first three. {S AT <t} = {S <t}U{T <t} e Tt. {SVT <t} = {S <t}n{r < 4 G Tt {S + T<t} = UqireQ:q+r<t{S < q) D {T < r} G Tt 3.4. Define Rn by Rx = 7\, Rn = #„_i VT„. Repeated use of Exercise 3.3 shows that Rn is a stopping time. As n f oo Rn | supn Tn so the first conclusion follows from (3.3). Define Sn by Si = Ti, Sn = S„-i AT„. Repeated use of Exercise 3.3 shows that Sn is a stopping time. As n j oo, Sn J. infn Tn so the second conclusion follows from (3.2). The last two conclusions follow from the first two since limsupT„ = inf sup Tm and liminf Tn = sup inf Tm n " m>n " n "»>n 3.5. First if A G Ts then A n {S < t} = U„ (A n {S < t - l/n» G ^ On the other hand if A D {S < t} G T% and the filtration is right continuous then AD{S <t} = Dn(An{S <t + 1/n}) G n„ jF1+1/„ = ^. 3.6. {# < t} = {S < t) n A G ^ since AeTs 3.7. (i) Let r = sAi. {S<t}n{S <s} = {S<r}£FrC T, {S <t} n{S < s} = {S < r} £ Fr C T, This shows {S < t} and {S < t} are in Ts- Taking complements and intersections we get {S > t}, {S > t}, and {S = t} are in Ts- (ii) {S < T}n{S < t} = Uq<i{S <q}n{T>q}eFt by (i), so {S < T} G Ts. {S < T}H{T <t}= Uq<t{S <q}n{q<T<t}e Tt by (i), so {S < T} G TT. Here the unions were taken over rational q. Interchanging the roles of S and T we have {S > T} in ^n^ Taking complements and intersections we get {S > T}, {S < T}, and {S = T} are in Ts n JFT. 3.8. If .4 G ft then {B(5„) G ,4} n {S„ < 0 = U0<m<2-«{Sn = m/2n} n {B(m/2n) G ,4} G ^,
316 Solutions to Exercises by (i) in Exercise 3.7. This shows {B(Sn) E A} e TSn so B(Sn) G TSn ■ Letting n -* oo we have Bs = lirn„ B(Sn) G DnTsn = Ts- 3.9. (3.5) implies Tsat„ C Ts- To argue the other inclusion let A G Ts- Since A = u„(A n {Tn > S}) it suffices to show that A n {Tn > S} G -?\sat„ ■ To do this we observe A n {T„ > s} n {Tn a s < 0 = A n {r„ > s} n {S < *} = Ug<((^ n {S < q}) n {Tn >q}ETt since A & Ts implies An{S < q} & Tq and the fact that T„ is a stopping time implies {Tn > q} eTq. 3.10. Since /Q5 g{B3) ds G Ts we have ^%(^) Ts [ 9(B,)ds Jo %(j g(B3)ds Ts) Applying the strong Markov property to Y(w) = f (w' g(w3) ds we have + EX EX(J g(B3) ds Ts u(Bs) 3.11. Let Y3 (w) = 1 if s < t and u < u(t — s) < v, 0 otherwise. Let y / \ _ f 1 if s < i, 2a — u < w(i — s) < 2a — u >• 0 otherwise Symmetry of the normal distribution implies EaY3 = EaY3, so if we let S : inf{s < t : B3 = a} and apply the strong Markov property then on {S < oo} EX(YS o 6s\Ts) = EaYs = EaYs = EX(YS ° 6S\TS) Taking expected values now gives the desired result.
Answers for Chapter 1 317 4.1. We begin by noting symmetry and (2.5) imply Po(R<l + t) = 2 / Pl(0,y) / Py(TQ = s)dsdy Jo Jo ft fOO ft = /2/ Pl(0,y) / Py(TQ = s)dyds Jo Jo Jo by Fubini's theorem, so the integrand gives the density Pq(R = 1 + t). Since Py(TQ = t) = PQ{Ty = t), (4.1) gives fOO I -I 2irt3/*JQ ye dV~2id^{l + t) 4.2. Translation invariance implies that if r < s then E(x,r)f(Br, - Bo) = E(0,o)f(Br,_r) Applying the strong Markov property to Y(u) = /(w(r,) — w0) at time rr, and using the last identity we have E{o,o)(f(Br. - BTr)\TTr) = £(Ol0)/(BT._r) The rest of the argument is identical to that for (4.4). 4.3. As s —* 0, <p(s) —* 1, so for some 5 > 0 we have y»(s) > 0 for all s < 8. Using the equation <p(s)<p(t) = <p(s +1) we can now conclude by induction that for all n, <p(s) > 0 for s < nS, and we can take logarithms without feeling guilty. The rest is easy: ^(2"") + ^(2"") = ^(2-(n-1)) and induction tells us that V>(2-") = 2-"t/>(1), while tf((m - 1)2-") + tf(2~") = tf(m2-") and induction implies ij){m.2~n) = m2~nij){l). Since <p(s) is continuous and does not vanish, t/>(s) = <p(s) is continuous and the desired result follows. 4.4. Let ipx(a) = EQ(exp(-\Ta)). (4.4) and (4.5) imply <px(a)<px(b) = ipx(a+b) <px(a) = <pXa2(l) As before the first equation implies <p\(a) = exp(c\d) while the second with A = 1 and a = Vo gives c& = ybci and this implies c\ = —kV\. 4.5. Let Mi = max0<j<i5j. Exercise 3.10 implies that with probability 1, Mi > Bi and hence a —» Ta is discontinuous at a = Mi. This shows Po(a —* ^a is discontinuous in [0,n])->lasn->oo and the result follows from the hint.
318 Solutions to Exercises 4.6. Let Mx = maxo<,<i B,2 and T = inf{i > 1 : 52 > Mi}. Since B\ < Mi the strong Markov property and (4.3) imply that with probability one B% £ B\[x so s —» C, is discontinuous at s = Mi. This shows P0(s —* C, is discontinuous in [0,n])-^lasn->oo and the result follows from the hint. 4.7. If we let f(x) = /(£), then it follows from the strong Markov property and symmetry of Brownian motion that Ex{f(Bt);r<t) = Ex[E0f(Bt-T);T <t] = Ex[E0f(Bt-r);r <t] = Ex[f(Bt);r<t]=Ex(f(Bt)) The last equality holds since f(y) = 0 when yd > 0, and the desired formula follows as in (4.8). Chapter 2 1.1. If A G -Fa and a < b then H„(s,u) = l[a+i/n,b+i/n)^A is optional and converges to l(atb](s)lA(u) asn->oo. 2.1. The martingale property w.r.t. a nitration Qt is that E(Xis;A) = E(Xf;A) for all A € Qs. (3.5) in Chapter 1 implies that Tsi\> C T, so if X? is a martingale with respect to Ts it is a martingale with respect to ^"saj- To argue the other direction let A £ Ts. We claim that A n {S > s} £ Tsas- To prove this we have to check that for all r (A n {S > s}) n {S A s < r} G Tr but this is 0 for r < s and true for r > s. If X/ is a martingale with respect to •Fsaj it follows that if A £ T, then E(Xf;A !1{S> s}) = E(Xf; An{S> s}) On the other hand E(X?;An{S<s}) = E(Xsl(s>0y,An{S < s}) = E(Xf;An{S < s}) Adding the last two equations gives the desired conclusion.
Answers for Chapter 2 319 2.2. To check integrability we note that Ex\Xt}f = J(27d)-1e-\y-x\3/2t\\og\y\\Pdy < oo since | log |y|| is integrable near 0 and e-ly-*l2/2< takes care of the behavior for large y. To show that ExXt —* oo we observe that for any R < oo ExXt > (27rf)-d/2 / log \y\ dy + (log R)Ps(\Bt \ > R) Ay\<i The integral is convergent and as we showed in Example 2.1, PX(|S<| > R) —* 1 as t —* oo so the desired result follows from the fact that R is arbitrary. 2.3. Let Tn = inf{i : \Xt\ > n}. By (2.3) this sequence will reduce Xt. Note that Xt " <n. Jensen's inequality for conditional expectation (see e.g., (1.1.d) in Chapter 4 of Durrett (1995)) implies E(V{X?")\rTnAt) > ¥>(£(X,T»|*TnA,)) = <P(XJ") 2.4. Let T„ < n be a sequence which reduces X, and Sn = (Tn — R)+. If s < t then definitions involved (note S„ = Tn — R on {Sn > 0} = {Tn > R}), the optional stopping theorem, and the fact 1(t„>r) G Fr C ^a+j imply ■E'(^ASnl(Sn>0)|^) = ■E'(X(ii+i)ATnl(Tn>ii)I^R+i) = l(Tn>ii)-E'(X"(ii+i)ATnl-^ii+i) = X(ii+i)ATnl(Tn>H) = y5ASnl(Sn>0) 2.5. Let Tn be a sequence that reduces X. The martingale property shows that if A £ To then E(XiATnl(Tn>oy,A) = E(X0l(rn>0);A) Letting n —* oo and using Fatou's lemma and the dominated convergence theorem gives E(Xt;A) < \immfE(XtATnhTn>oy,A) n—+oo v < lim E(X0hTn>oy,A) = E(X0;A) Taking A = fl we see that .EX* < 51¾ < oo. Since this holds for all A € To we get .E(X"t|.Fo) < XQ. To replace 0 by s apply the last conclusion to Yt = Xs+t, which by Exercise 2.4 is a local martingale. 3.1. Since t —* Vt is increasing, we only have to rule out jump discontinuities. Suppose there is a t and an e > 0 so that s < t < u then V(u) - V(s) > 3e for
320 Solutions to Exercises all n. It is easy to see that V(u) — V(s) is the variation of X over [s,u]. Let s0 <t < u0. We will now derive a contradiction that the variation over [so,«o] is infinite. Pick 5 > 0 so that if \r —1\ < 5 then \Xt — Xr\ < e. Assuming sn and un have been defined, pick a partition of [s„, un] not containing t with mesh < S and variation > It. Let sn+i be the largest point in the partition < t and «n+i be the smallest point in the partition > t. Our construction shows that the variation of X over [sn, un] — [sn+i,tn+i] is always > e, so the total variation over [so,«o] is infinite, a contradiction which implies t —* Vt is continuous. 3.2. (X + Y)tZi - (X, Z)i - (Y, Z)i is a local martingale. 3.3. — X0Zt is a local martingale, so if Y* = — Xo then (Y, Z)t = 0 and the desired result follows from Exercise 3.2. 3.4. (aXt)(bYt) - a6{X,y)t = ab(XtYt - (X,Y)t) is a local martingale. 3.5. If T is such that XT is a bounded martingale it follows from (3.7) and the definition of (X) that (XT) = (X)T. For a general stopping time T, let Tn = inf{i : \Xt\ > n}, note that the last result implies <XTaT») = (X)TaT" and then let n —» oo. The result for the covariance process follows immediately from the definition and the result for the variance process. 3.6. Since X? — (X)t is a martingale and \Xt\ < M for all t E{X)t = EX? - EX% < M2 so letting i -+ oo we have E(X)oo < M2. This shows that X2 — (X)t is dominated by an integrable random variable and hence is uniformly integrable. 3.7. The optional stopping theorem implies E(Xs+t\Fs+3) = Xs+s and Xs eTsC Fs+, so E(Xs+t — XslFs+s) = Xs+s — Xs i.e., Y, is a martingale. To prepare for the proof of the second result we note that imitating the proof of (2.4) E((Xs+i -Xs)2\Fs+3) = E((Xs+i-Xs+3)2\Fs+,)+(Xs+3 -X,)2 = E(X2s+t - X2S+S 1^5+,) + (XS+, - ^)2
Answers for Chapter 2 321 Let Zt = (Xs+t — Xs)2 — {(X)s+t — (X)s)- Using the last equality we get E(Zt\fs+3) = E(X2s+i - Xj+S - {(X)s+t - (X)s}\Fs+,) + (Xs+3 - X,)2 = E(X2s+t - (X)s+i\fs+3) - X2S+, + (X)s + (Xs+3 - X,)2 = -(X)s+3 + (X)s + (Xs+3 - X,)2 = Z, where the last equality follows from the optional stopping theorem and Exercise 3.6. This shows that Zt is a martingale, so (Y)t = (X)s+t — (X)s- 3.8. By stopping at Tn = inf{i : \Xt\ > n} and using Exercise 3.5 it suffices to prove the result when X is a bounded martingale. Using Exercise 3.7, we can suppose without loss of generality that S = 0. Using the L2 maximal inequality on the martingale XiAx, and the optional stopping theorem on the martingale X? - (X)t it follows that E (jup^2AT) < *E(Xhn) = E((X)tAT = 0 Letting n —* oo now gives the desired conclusion. 3.9. This is an immediate consequence of (3.8). 4.1. It is simply a matter of patiently checking all the cases. 1. s <t < a. Yt =Y0 e fo so , E<yt\T,) = E{Y0\T,) = Y* = Y, 2. s < a < t < b E(Yt \T,) = E(E(Yi \?a)F3) = E(Ya \Tt) = E{YQ\T,) = Y0 = Y, 3. s < b, t > 6. Yt = Yh so this reduces to case 2 if s < a or to our assumption if a < s < b. 4. 6 < s < t. E{Yt\T,) = E(Yb\Fs) =Yb=Ys. 4.2. If we fix w then t —* (X)t defines a measure. The triangle inequality for L2 of that measure implies (JHU(X)^ 2+(/lC2W) 2< (/(#* + K,)*d(X)?j Now take expected value.
322 Solutions to Exercises 4.3. X G M2 implies Esupt X? < oo and hence EXq < oo. To check the other condition let Tn be a sequence of stopping times | oo so that XTn is bounded and (X)r„ < n. The optional stopping theorem implies E iXTn ~ POtJ l(Tn>0) = £'^l(Tn>o) Letting n —* oo it follows that E(X20 — (X)oo = EXq. To prove the converse rearrange the displayed equation to conclude £(*T„1(T„>0))<£*02+ £<*}«, so X is L2 bounded. 4.4. The triangle inequality implies ||z|| < ||zn|| + ||z — xn\\ so M^ll < liminf ||a=„|| For the other inequality note that ||zn|| < ||z|| + \\xn — x\\ so \\x\\ > limsup \\xn\\ 4.5. Let Hn e bill with \\Hn - H\\x -* 0. Since \\(Hn ■ X) - (H ■ X)\\2 -+ 0, using Exercise 4.4, the isometry property for 6IIi, and Exercise 4.4 again \\H -X\\2 = lim||fP -X\\2 = lim||if"||x = \\H\\X 5.1. If L, V G M2 have the desired properties then taking N = L — L' we have (L,L- V) = {L',L- L') so (L - L') = 0. Using Exercise 3.8 now it follows that L — V is constant and hence must be = 0. 5.2. If we let H, = 1 and K, = 1(,<T) then X = H ■ X and YT = K ■ Y. (It is for this reason we have assumed Xo = Y0 = 0.) Using (5.4) now it follows that (X,YT)t= [ l(,<T)d(X,Y)s = (X,Y)J Jo 5.3. If we let H, = l(i<T) and K, = l(j>T) then XT = H ■ X and Y - YT = K ■ Y. Using (5.4) now it follows that (XT,Y - YT)t = 0 so the desired result follows from (3.11). 6.1. Write H, = H} + H2 and K, = K] + K2 where H, = K, = -ffjl(s<j<T) = KA(s<3<T)
Answers for Chapter 2 323 Clearly, (H1 ■ X)t = (K1 ■ Y)t for all t. Since H2 = K2 = 0 for S < s < T, (5.4) implies (H2 ■ X)3 = (K2 -X)3 = 0 for S < s < T and it follows from Exercise 3.8 that (H2 ■ X)t and (K2 ■ X)t are constant on [S,T\. Combining this with the first result and using (4.3.b) gives the desired conclusion. 6.2. Stop X at Tn = ini{t : \Xt\ > not /J H2 d(X),} to get a bounded martingale and an integrand in U2(X). Exercise 4.5 implies \\HT--XT-\\2 = E I " H2d{X)t Jo Using the L2 maximal inequality it follows that E (sup (H ■ X)2] <4E f " H2 d{X)t \t<Tn J Jo Letting n —» oo now we can conclude E (sup(if • X)2 J < oo With this in hand we can start with E(H -X)2Tn=E r H2 d(X)t Jo and let n —» oo to get the desired result. 6.3. By stopping we can reduce to the case in which X is a bounded continuous martingale and (X)t < N for all t, which implies H € n2(X). If we replace S and T by S„ and Tn which stop at the next dyadic rational and let H" = C for S„ < s <Tn then H" £ Hi and it follows easily from the definition of the integral in Step 2 in Section 2.4 that jH?dX, = C{XTn-XsJ To complete the proof now we observe that if |C(w)| < K then \\Hn - H\\x < K2{E((X)T„ - (X)T) + E({X)S„ - (X)s)} - 0
324 Solutions to Exercises as n —» oo by the bounded convergence theorem. Using Exercise 4.4 and (4.3.b) it follows that \\{Hn ■ X) - (H ■ X)||2 -» 0 and hence f H? dX3 -* f H, dX3. Clearly, C{Xt„ — Xsa) —* C{Xt — Xs) and the desired result follows. 6.4. X? - Xl = £>2(*?+1) - X\t?) i = E^W+i)-^'?)}2 + £2X(i?){X(i?+1) -X(i?)} (3.8) implies that the first term converges in probability to (X)i. The previous exercise shows the second term converges in probability to fQ 2X3 dX3. 6.5. The difference between evaluating at the right and the left end points is 2X){X(t?+1)-X(t?)}2-*2(X), I 6.6. Let Cki„ = B((k + l/2)2~nt) - B(k2~nt) and £>,..,„ = B({k + l)2~nt) - B((k + 1/2)2-"*). In view of (6.7) it suffices to show that as n -* oo . k in probability. ECl = 2~nt/2 and ECk,nDk,n = 0 so ES„ = t/2. To complete the proof now we note that the terms in the sum are independent so E(Sn -1/2)2 = E (j2Cln ~ 2_n</2 + CkinDkA = E £(C* „ - 2-"t/2)2 + ClnD\tn < 2»Ct2-*» -+ 0 k and then use Chebyshev's inequality. 6.7. (6.7) implies that if t" is a sequence of partitions of [0,t] with mesh —* 0 as n —* oo we have y^ht^Btn -Bt-)-» f h3dB3 ; ' '+X ' JO in probability. The left-hand side has a normal distribution with mean 0 and variance
Answers for Chapter 3 325 as n —* oo by the convergence of the Riemann approximating sums for the integral. Convergence of the means and variances for a sequence of normal random variables is sufRcent for them to converge in distribution (consider the characteristic functions) and the desired result follows. 7.1. suPl \((Kn - K) -A)t\< Msup, \K? - IC| — 0 7.2. By stopping it suffices to prove the result when \A\t < M. If we let Gf = f'(c(Aii}Aii+l)) then as in (7.4) we get f(At)-f(AQ)= f G\dAs Jo As 6 —» 0, Gf —» f'{A3) uniformly on [0,t] so the desired conclusion from Exercise 7.1. 8.1. Clearly Mt +M{- is a continuous local martingale and At +A't is continuous adapted, locally of bounded variation and has A0 + A'0 = 0. Chapter 3 1.1. The optional stopping theorem implies that x = EXBT = aPx{BT = a) + 6(1 - PX(BT = a)) Solving gives Px(Bt = a) = (6 — x)/{b — a). 1.2. From (1.6) it follows that for any s and n P(supt>i Bt > n) = 1. This implies P(supt>i Bt = oo) = 1 for all s and hence limsup^^, B3 = oo a.s. 1.3. If Sr(w) | t(w) which is < oo with positive probability then Bt(u)(u) = 0 and we have a contradiction. 3.1. Using the optional stopping theorem at time TAf we have E0B%.Ai = E0(T A t). Letting t —* oo and using the bounded and monotone convergence theorems with (1.4) we have E0T = E0B% = a2 ■ —— + b2 ■ -r^— = ab b + a b + a 3.2. Let f(x,t) = x6- axH + bx2t2 - ct3. Differentiating gives Dtf = -ax4 + 2bx2t - Zct2 (1/2)JW = 15a;4 - 6ax2t + bt2
326 Solutions to Exercises Setting a = 15, 26 = 6a, and 6 = 3c, that is a = 15, b = 45, c = 15, we have Dtf + (1/2)JDXX/ = 0, so f(Bi,t) is a local martingale. Using the optional stopping theorem at time ra A t we have E0{B°aM - 15B«aA,(r0 At) +45B?aA1(ra At)2} = 15EQ(ra At)3 Letting t —» oo and using the bounded and monotone convergence theorems with the results in (3.3) we have 15E0t% = a6 - 15a4Era + 4ba2Er2 = a6(l - 15 + 75) so E0t* = 61/15. 3.3. Using the optional stopping theorem at time TLa A n we have 1 = Eoexp(-(2fM/(r2)ZT_aAn) Letting n —* oo and noting that on {TLa = oo} we have Zt/t = aBt/t + n —* n almost surely and hence Zt —» oo a.s., the desired result follows. 4.1. Let 6 : [0,oo) —» Rd be piecewise constant and have \63\ = 1 for all s. Let y< = Ei /o *i dXi ■ The formula for the covariance of stochastic integrals shows Yt is a local martingale with (Y)t = t, so (4.1) implies that Yt is a Brownian motion. Letting 0 = t0 < ti < ... < tn and taking 04 = vj for s g (*j-ii*j] f°r 1 < j < n shows that Xtt — Xt0,..., Xtn —Xtn_t are independent multivariate normals with mean 0 and covariance matrices (ti — to)I,..., (tn — tn_i)J, and it follows that Xt is a d-dimensional Brownian motion. 4.2. X, is a local martingale with (X)3 = /q h2 dr. By modifying h3 after time t we can suppose without loss of generality that (X)oo = oo. Using (4.4) now we see that if j(u) = inf {s : (X)3 > u} then X^u) is a Brownian motion. Since (X)3 and hence the time change j(u) are deterministic, the desired result follows. Chapter 4 4.1. Let r = inf{f > 0 : Bt £ G}, let y e dG, and let Ty = M{t >0:Bt=y}. (2.9) in Chapter 1 implies Py(Ty = 0) = 1. Since r < Ty it follows that y is regular. 4.2. Py(r = 0) < 1 and Blumenthal's 0-1 law imply Py(r = 0) = 0. Since we are in d > 2, it follows that Py{Br = y) = 0 and hence e = 1 - Eyf{Br) > 0. Let cr„ = inf{t > 0 : B% ¢ D(y, 1/n)}. cr„ J. 0 as n j oo so ^^(crn < r) -» 1. 1 - € = Eyf(Br) = Ey(f(Br);r < an) + Ey(v{BaJ;r > an)
Answers for Chapter 4 327 As n —* oo the first term on the right —» 0, so Ey(v(B°*):r> <r„)-*l-e and it follows that for large n, infx6sD(y,i/n) ^(^) < 1 — f ■ 4.3. Let y £ dG and consider f/ = V(y,'Vg(y)} 1). Calculus gives us g{z)-g{y)= I yg(y + o{z-y))-e(z-y) Jo d6 Continuity of Vg implies that if \z — y\ < r and z £ U then V<7(y + 9{z — y)) ' <?(2 — 2/) > 0 so <?(z) > 0. This shows that for small r the truncated cone U n D(y, r) C Gc so the regularity of y follows from (4.5c). 4.4. Let r = inf{t > 0 : Bt € V(y,v,a)}. By translation and rotation we can suppose y = 0, v = (1,0,..., 0), and the d— 1 dimensional hyperplane is z<j = 0. Let r0 = inf{* > 0 : Bf = 0}. (2.9) in Chapter 1 implies that P0(T0 = 0) = 1. If a > 0 then for some k < oo the hyperplane can be covered by k rotations of V so Pq{t = 0) > 1/fc and it follows from the 0-1 law that P0(t = 0) = 1. 7.1. (a) Ignoring Cd and differentiating gives 2 (|a;_6l|2 + y2)(d+2)/2 n h - A y i «*(<* +2)(*<-0)2» (|z-0|2 + y2)(d+2)/2 (|x - 0|2 + y2)(d+4)/2 A/A» = 77-—„.,'■ ,w/, - d ■ (\x-6\2 + y2)d/2 (|a: - 6»|2 + y2)(<*+2)/2 P + 2p c/(i+2)t/3 (|a: - 6»|2 + y2)(d+2)/2 (|x - 0|2 + y2)(<*+4)/2 Dyy hB - -d- — -^-t o\(JJ.f\IO + Adding up we see that V^n /, , n /, {(-d)(d-l)+'(-d)^}y 2^UXiXiHg + Dyy tlB - ^ _ fl|2 + y2)(d+2)/2 d{d + 2)y{\x-0\* + y*) (|z-0|2 + t/2)(d+4)/2 The fact that Ati(i) = 0 follows from (1.7)'as in the proof of (7.1).
328 Solutions to Exercises (b) Clearly J dO hg(x, y) is independent of z. To show that it is independent of 2/, let x — 0 and change variables 0,- = yipi for 1 < i < d — 1 to get /^^0.^) = /^^^^^=/^^(0,1) = 1 (c) Changing variables 0{ = z,- + r,-y and using dominated convergence f d6hg(x,y)= f drhr(0,l)^0 JD(x,t)e JD(0,t/y)e (d) Since Px(r < oo) = 1 for all x £ H, this follows from (4.3). 9.1. c(x) = -/? < 0 so io(z) = j5xe""r < 1 and it follows from (6.3) that v(x) = Exe~PT is the unique solution of -u" -/3u = 0 u(-a) = u(a) = 1 Guessing u{x) = B cosh(6z) with 6 > 0 we find Iu" -pu= f^~ - /3b\ cosh(6z) = 0 if 6 = -\/2/?- Then we take 5 = 1/ cosh(ay2/?) to satisfy the boundary condition. Chapter 5 3.1. We first prove that h is Lipschitz continuous with constant C2 = 1C\ + U-1|A(0)| on D2 = {R< \x\ < 2R}. Let f(x) = (2R- \x\)/R and g(x) = h(Rx/\x\). If x,y e D2 then l/(*)-/(,)1 = ^^ <>-,! Since h is Lipschitz continuous on Di = {\x\ < R}, and Rx/\x\ is the projection of x onto Di we have \g(x)-g(y)\ = \h(Rx/\x\)-h(Ry/\y\)\ Rx Ry <d M \y\ <Ci\x-y\
Answers for Chapter 5 329 Combining the last two results, introducing ||fc||oo,:' = sup{|fc(z)| : z £ £),}> and using the triangle inequality: \h(x) - h(y)\ < ||/||oo,2|<7(z) " 9(y)\ + |/(*) " /(y)lll<7||co,2 ^l-CxIx-yl + ^IAlU,! <(2C1 + \h(0)\/R)\x-y\ = C2\x-y\ since for ¢6¾ \h(x)\ < |/i(0)| + CiR. To extend the last result to Lipschitz continuity on Rd we begin by observing that if x G D\, y G D2, and 2 is the point of dD\ on the line segment between x and y then \h{x) - h(y)\ < \h(x) - h(z)\ + \h(z) - h(y)\ < d\x -z\ + C2\z -y\< C2\x - 2/| since C\ < C2 and \x — z\ + \z — y\ = \x — y\. This shows that h is Lipschitz continuous with constant C2 on \z\ < 2R. Repeating the last argument taking |as| < 2jF2 and y > 2R completes the proof. 3.2. (a) Differentiating we have k(x) = —Cz log z k'{x) = -C\ogx-C > 0 iftXaKe-1 K"(x) = -C/x < 0 if x > 0 so k is strictly increasing and concave on [0, e-2]. Since n(e~2) = 2Ce-2 and K'(e-2) = C, k is strictly increasing and concave on [0,00). When e < e-2 I! dx = —C l loglogz|0 = 00 /0 Cz log z (b) If g(t) = exp(-l/<p) with p > 0 then 9'(t) = ^T exp(-l/<p) = P9(t){log(l/g(t))}(p+1)/p 5.1. Under Q the coordinate maps Xt{u) satisfy MP(/?+ 6, <r). Let Xt = Xt - f /3(XS) + b(X3) ds Jo = Xt- [ b{X,)ds Jo
330 Solutions to Exercises Since we are interested in adding drift —6 we let c = a_16 as before, but let Yt = - I c(X3) ■ dXs Jo = -Yt+ f c(X,)-b(X,)ds Jo = -Yt+ f ba-H(Xt)da: Jo Since Yt and Yt differ by a process of bounded variation we have (Y)t = (Y)t= J ba-H{X,)ds Jo The change of measure in this case is given by d1=exp(Y1--(Y>1 = exp (-Yt + |<Y)<) = l/at 6.1. It suffices to show that P0 a.s. f limsup / |5i|_1l(|Bi|<2-„)c/s > 0 n—»oo Jo To prove the last result, let RQ = 0, Sn = inf{f > -R„ : \Bt\ = 2-"}, Rn+1 = inf{f > S„ : Bt = 0} for n > 0, and N = sup{n : i?„ < 7\}. Now f l IB,]'1 1(|B,|<2-") da>T, nSm - Rm) J° m^O where the number of terms, N+1, has a geometric distribution with E(N + 1) = 2", and the Sm-Rm are i.i.d. with E(Sm - Rm) = 2~2n and E{Sm - Rm)2 = (5/3)2_4n. Using the formula for the variance of a random sum or noticing that P(N > 2") —* e_1 as n —* oo one concludes easily that limsup > C > 0 a.s. Chapter 6 3.1. (i) If 6(2) < 0 then exp (- J 2b{z)/a{z) dz\ > 1
Answers for Chapter 6 331 so <p(x) > x, y?(oo) = oo and recurrence follows from (3.3). (ii) If b{z)ja{z) > e then exp (- f 2b(z)/a(z) dz) < e~2ey so <p(oo) < oo and transience follows from (3.3). 3.2. The condition is /,,1 -2b(y)/a(y) dy = 0. Let I = /,1 -2b(y)/a(y) dy. <p'(n + y) = enI<p'(y) so if I > 0 <p(—oo) > —oo and if I < 0 then <p(oo) < oo. If I = 0 then y'(y) is periodic so 0 < c < ^'(y) < C < oo, and it follows that y?(oo) = oo and <p(—oo) = —oo. 5.1. (a) If /? = 0 then <p(x) = z and m(z) = l/a2x so / m(z)(^(z) — <p(Q)) dx = / a~2dx< Jo Jo 00 M(0) = —oo so J = oo and it follows that 0 is absorbing. To deal with oo note that <p(oo) = M(oo) = oo so I = J = oo. (b) When p < 0, <p'(x) = e2^x/"2, so 2|/J| m(z) = c-2""'/"'/^* ^) = 33(^-1) To evaluate J for the boundary at 0, we note •i i _ e-2|/J|*/<r2 Jm(x)(ip(x) — <p(0)) dx = o Jo 2|/?|a dx < oo since the integrand converges to 1/cr2 asi-^O. Again M(0) = —oo so J = oo and it follows that 0 is absorbing. To deal with oo we note that <p(oo) = oo so I = oo. To evaluate J we begin by noting that r°° ... . „ ^-2 Jx -my/«ady= ^—e-2^1!"2 2|/?| Since most of the integral comes from x < y < x + ^Jx it follows easily that 2 M(oo) - Mix) ~ -^re-2^'"2 \P\X
332 Solutions to Exercises Since ip'(x) = e~2^xl" it follows that J = oo and oo is a natural boundary. 5.2. The computations in Example 5.3 show that (a) I < oo when j < 1 and (b) for 7 < —1 we have M(0) = —oo and hence J = oo. Consulting the table of definitions we see that 0 is an absorbing boundary for j < — 1. 5.3. (a) H'f < To < oo. (b) (2.9) in Chapter 3 implies EiHs = f 2yy-2S dy < oo JO since 5 < 1 implies 1 — 25 > —1. (c) If 5 > 1 then Hg > Hi so it suffices to prove the result when 5 = 1. To do this, we note that if 50 = x then x~iB3Xi> is a Brownian motion starting at 1, so the distribution of ' B~2 ds under Px o does not depend on x. Letting Sx = (Tx/2 A T2x) ° ®tx and using the strong Markov property it follows that the distribution of rs* Ix= B~2 ds under Px does not depend on x. Pick 6 > 0 so that P\(IX > 5) > 0. The /2-» are independent so the second Borel-Cantelli lemma implies that Pi(I2-^ > & i.o.) = 1 but this is inconsistent with P\(H\ < 00) > 0. 5.4. ip(x) = x so <p(oo) = 00 and I = 00. When 5 < 1/2, M(oo) = 00 and hence J = 00. When 5 > 1/2 z1"2*-! , f < 00 if 5 > 1 25-1 2\=oo if5<l Combining the results for I and J and consulting the table we have the indicated result. 5.5. When a = 0 <p'(x) = exp(j^-^-dz)=Cy-2^ '-r
Answers for Chapter 8 333 Here and in what follows C is a constant which will change from line to line. If /? > 1/2 then <p{0) = -oo and I = oo. If /? < 1/2 then <p(z) - <p(Q) = Czl~2P and f'ivMy) y -p-y{l-y) as y -* 0. From this we see that I < oo if /? < 1/2. When /? = 0, M(0) = -oo so J = oo. When /3 > 0, M(z) - M(0) ~ Cz2^ asz-^0soJ<ooif/?>0. Comparing the possibilities for I and J gives the desired result. Chapter 7 5.1. Letting 6 —» oo in (4.7) of Chapter 6 we have £xT(li0o) = 2(p(a:) - p(l)) J m{z) dz + 2J (<p(z) - <p(l)) m(z) dz The result now follows from m{z) = l/<p'(z)a(z) = z~2c. 5.2. We apply (5.3) to v(x) = x1+s - 1 and note that Lu= (11^. £^ + 6(.)(1 + 5)^<-l 2 e e — since b(x) < -e/xs, (1 + 5)/2 < 1, and a;4"1 < 1. Chapter 8 3.1. Suppose r„ 7^ r. Then there is an e > 0 and a subsequence r„fc so that \rn'k — r\> e for all k. However it is impossible for a subsequence of r„fc, so we have a contradiction which proves rn —* r. 6.1. If d(gn, h) —* 0 then h can only take the values 0 and 1. In Example 6.2 we showed that h cannot be = 0 so it must take the value 1 at some point a. Since h is right continuous at a there must be some point 6 7^ 1/2 where h(b) = 1 but in this case it is impossible for d(gnt h) —» 0.
References M. Aizenman and B. Simon (1982) Brownian motion and a Harnack inequality for Schrodinger operators. Comm. Pure Appl. Math. 35, 209-273 P. Billingsley (1968) Convergence of Probability Measures. John Wiley and Sons, New York R.M. Blumenthal and R.K. Getoor (1968) Markov Processes and their Potential Theory. Academic Press, New York D. Burkholder (1973) Distribution function inequalities for martingales. Ann. Probab. 1, 19-42 D. Burkholder, R. Gundy, and M. Silverstein (1971) A maximal function characterization of the class fP>. Trans A.M.S. 157, 137-153 K.L. Chung and R.J. Williams (1990) Introduction to Stochastic Integration. Second edition. Birkhauser, Boston K.L. Chung and Z. Zhao (1995) From Brownian Motion to Schrodinger's Equation. Springer, New York Z. Ciesielski and S.J. Taylor (1962) First passage times and sojurn times for Brownian motion in space and exact HausdorfF measure. Trans. A.M.S. 103, 434-450 E. Cinlar, J. Jacod, P. Protter, and M. Sharpe (1980) Semimartingales and Markov processes. Z. fur. Wahr. 54, 161-220 B. Davis (1983) On Brownian slow points. Z. fur Wahr. 64, 359-367 C. Dellacherie and P.A. Meyer (1978) Probabilities and Potentials. North Holland, Amsterdam C. Dellacherie and P.A. Meyer (1982) Probabilities and Potentials B. Theory of Martingales. North Holland, Amsterdam C. Doleans-Dade (1970) Quelques applications de laformule de changement de variables pour les semimartingales. Z. fur Wahr. 16, 181-194 R. Durrett (1995) Probability: Theory and Examples. Second edition. Duxbury Press, Belmont, CA
336 References A. Dvoretsky and P. Erdos (1951) Some problems on random walk in space. Pages 353-367 in Proc. 2nd Berkeley Syrup., U. of California Press E.B. Dynkin (1965) Markov Processes. Springer, New York P. Echeverria (1982) A criterion for invariant measures of Markov processes. Z. fur Wahr. 61, 1-16 S. Ethier and T. Kurtz (1986) Markov Processes: Characterization and Convergence. John Wiley and Sons, New York W. Feller (1951) Diffusion processes in genetics. Pages 227-246 in Proc. 2nd Berkeley Symp., U. of California Press G.B. Folland (1976) An Introduction to Partial Differential Equations. Princeton U. Press, Princeton, NJ D. Freedman (1970) Brownian Motion and Diffusion. Holden-Day, San Francisco, CA A. Friedman (1964) Partial Differential Equations of Parabolic Type. Prentice- Hall, Englewood Cliffs, NJ A. Friedman (1975) Stochastic Differential Equations and Applications, Volume 1. Academic Press, New York R.K. Getoor and M. Sharpe (1979) Excursions of Brownian motion and Bessel processes. Z. fur. Wahr. 47, 83-106 D. Gilbarg and J. Serrin (1956) On isolated singularities of second order elliptic differential equations. J. d'Analyse Math. 4, 309-340 D. Gilbarg and N.S. Trudinger (1977) Elliptic Partial Differential Equations of Second Order. Springer, New York E. Hewitt and K. Stromberg (1969) Real and Abstract Analysis. Springer, New York N. Ikeda and S. Watanabe (1981) Stochastic Differential Equations and Diffusion Processes. North Holland, Amsterdam J. Jacod and A.N. Shiryaev (1987) Limit Theorems for Stochastic Processes. Springer, New York P. Jagers (1975) Branching Processes with Biological Applications. John Wiley and Sons, New York F. John (1982) Partial Differential Equations. Fourth edition. Springer-Verlag, New York
References 337 M. Kac (1951) On some connections between probability theory and differential and integral equations. Pages 189-215 in Proc. 2nd Berkeley Symposium, U. of California Press I. Karatzas and S. Shreve (1991) Brownian Motion and Stochastic Calculus. Second edition. Springer, New York S. Karlin and H. Taylor (1981) A Second Course in Stochastic Processes. Academic Press, New York T. Kato (1973) Schrodinger operators with singular potentials. Israel J. Math. 13, 135-148 R.Z. Khasminskii (1960) Ergodic properties of recurrent diffusion processes and stabilization of the solution to the Cauchy problem for parabolic equations. Theor. Prob. Appl. 5, 179-196 F. Knight (1981) Essentials of Brownian Motion and Diffusion. American Math. Society, Providence, RI H.P. McKean (1969) Stochastic Integrals. Academic Press, New York P.A. Meyer (1976) Un cours sur les integrals stochastiques. Pages 246-400 in Lecture Notes in Math. 511, Springer, New York N. Meyers and J. Serrin (1960) The exterior Dirichlet problem for second order elliptic partial differential equations. J. Math. Mech. 9, 513-538 B. 0ksendal (1992) Stochastic Differential Equations. Third edition. Springer, New York S. Port and C. Stone (1978) Brownian Motion and Classical Potential Theory. Academic Press, New York P: Protter (1990) Stochastic Integration and Differential Equations. Springer, New York D. Revuz and M. Yor (1991) Continuous Martingales and Brownian Motion. Springer, New York L.C.G. Rogers and D. Williams (1987) Diffusions, Markov Processes, and Martingales. Volume 2: ltd Calculus. John Wiley and Sons, New York H. Royden (1988) Real Analysis. Third edition. MacMillan, New York B. Simon (1982) Schrodinger semigroups. Bulletin A.M.S. 7, 447-526 D.W. Stroock and S.R.S. Varadhan (1979) Multidimensional Diffusion Processes. Springer, New York
338 References H. Tanaka (1963) Note on continuous additive functionals of one-dimensional Brownian motion. Z. fur Wahr. 1, 251-257 J. Walsh (1978) A diffusion with a discontinuous local time. Asterique 52—53, 37-45 T. Yamada and S. Watanabe (1971) On the uniqueness of solutions to stochastic differential equations. J. Math. Kyoto 11,1, 155-167; II, 553-563
Index adapted 36 arcsine law 173, 291 Arzela-Ascoli theorem 284 associative law 76 averaging property 148 basic predictable process 52 Bessel process 179, 232 Bichteler-Dellacherie theorem 71 BlumenthaPs 0-1 law 14 boundary behavior of diffusions 231 Brownian motion continuity of paths 4 definition 1 Holder continuity 6 Levy's characterization of 111 nondifferentiability 6 quadratic variation 7 writes your name 15 zeros of 23 Burkholder Davis Gundy theorem 116 C, the space 4, 282 Cauchy process 29 change of measure in SDE 202 change of time in SDE 207 Ciesielski-Taylor theorem 171 cone condition 147 continuous mapping theorem 273 contraction semigroup 245 converging together lemma 273 covariance process 50 D, the space 293 differentiating under an integral 129 diffusion process 177 Dini function 239 Dirichlet problem 143 distributive laws 72 Doleans measure 56 Donsker's theorem 287 Dvoretsky-Erdos theorem 99 effective dimension 239 exit distributions ball 164 half space 166 exponential Brownian motion 178 exponential martingale 108 Feller semigroup 246 Feller's branching diffusion 181, 231 Feller's test 214, 227 Feynman-Kac formula 137 filtration 8 finite dimensional sets 3, 10 Fubini's theorem 86 fundamental solution 251 generator of a semigroup 246 Girsanov's formula 91 Green's functions Brownian motion 104 one dimensional diffusions 222 Gronwall's inequality 188
340 Index harmonic function 95 averaging property 148 Dirichlet problem 143 Harris chains 259 heat equation 126 inhomogeneous 130 hitting times 19, 26 infinitesimal drift 178 infinitesimal generator 246 infinitesimal variance 179 integration by parts 76 isometry property 56 Ito's formula 68 multivariate 77 Kalman-Bucy filter 180 Khasminskii's lemma 141 Khasminskii's test 237 Kolmogorov's continuity theorem 5 Kunita-Watanabe inequality 59 law of the iterated logarithm 115 Lebesgue's thorn 146 Levy's theorem 111 local martingale 37, 113 local time 83 continuity of 88 locally equivalent measures 90 Markov property 9 martingale problem 198 well posed 210 Meyer-Tanaka formula 84 modulus of continuity 283 monotone class theorem 12 natural scale 212 nonnegative definite matrices 198 occupation times ball 167 half space 31 optional process 36 in discrete time 34 optional stopping theorem 39 Ornstein-Uhlenbeck process 180 pathwise uniqueness for SDE 184 ■k — A theorem 11 Poisson's equation 151 potential kernel 102 predictable process 36 basic 52 in discrete time 34 simple 53 predictable cr-field 35 progressivley measurable process 36 Prokhorov's theorems 276 punctured disc 146 quadratic variation 50 recurrence and transience Brownian motion 17, 98 one dimensional diffusions 220 Harris chains 262 reflecting boundary 229 reflection principle 25 regular point 144 resolvent operator 249 relatively compact 276 right continuous filtration scaling relation 2 Schrodinger equation 156 semigroup property 245 semimartingale 70 shift transformations 9 simple predictable process 53 skew Brownian motion 89 Skorokhod representation 274 Skorokhod topology on D 293 speed measure 224
stopping time 18 strong Markov property for Brownian motion 21 for diffusions 201 strong solution to SDE 183 tail cr-field 17 tight sequence of measures 276 transience, see recurrence transition density 257 transition kernels 250 uniqueness in distribution for SDE 197 variance process 42 weak convergence 271 Wright-Fisher diffusion 182, 234, 308 Yamada-Watanabe theorem 193