Author: Durrett R.
Tags: mathematical analysis applied mathematics probability theory ito stochastic calculus
ISBN: 0-8493-8071-5
Year: 1996
Stochastic Calculus: A Practical Introduction
PROBABILITY AND STOCHASTICS
SERIES
Edited by Richard Durrett and Mark Pinsky
Linear Stochastic Control Systems, Guanrong Chen, Goong Chen, and
Shia-Hsun Hsu
Advances in Queueing: Theory, Methods, and Open Problems,
Jewgeni H. Dshalalow
Stochastic Calculus: A Practical Introduction,
Richard Durrett
Chaos Expansion, Multiple Wiener-Ito Integrals and Applications, Christian Houdre
and Victor Perez-Abreu
White Noise Distribution Theory, Hui-Hsiung Kuo
Topics in Contemporary Probability, J. Laurie Snell
Probability
and Stochastics Series
Stochastic Calculus: A Practical Introduction
Richard Durrett
CRC Press
Boca Raton New York London Tokyo
Tim Pletscher: Acquiring Editor
Dawn Boyd: Cover Designer
Susie Carlisle: Marketing Manager
Arline Massey: Associate Marketing Manager
Paul Gottehrer: Assistant Managing Editor, EDP
Kevin Luong: Pre-Press
Sheri Schwartz: Manufacturing Assistant
Library of Congress Cataloging-in-Publication Data
Catalog record is available from the Library of Congress
This book contains information obtained from authentic and highly regarded sources. Reprinted
material is quoted with permission, and sources are indicated. A wide variety of references are listed.
Reasonable efforts have been made to publish reliable data and information, but the author and the
publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.
Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic
or mechanical, including photocopying, microfilming, and recording, or by any information storage or
retrieval system, without prior permission in writing from the publisher.
CRC Press, Inc.'s consent does not extend to copying for general distribution, for promotion, for
creating new works, or for resale. Specific permission must be obtained in writing from CRC Press for such
copying.
Direct all inquiries to CRC Press, Inc., 2000 Corporate Blvd., N.W., Boca Raton, Florida 33431.
© 1996 by CRC Press, Inc.
No claim to original U.S. Government works
International Standard Book Number 0-8493-8071-5
Printed in the United States of America 1234567890
Printed on acid-free paper
Preface
This book is the re-incarnation of my first book Brownian Motion and
Martingales in Analysis, which was published by Wadsworth in 1984. For more
than a decade I have used Chapters 1, 2, 8, and 9 of that book to give "reading
courses" to graduate students who have completed the first year graduate
probability course and were interested in learning more about processes that move
continuously in space and time. Taking the advice from biology that "form
follows function" I have taken that material on stochastic integration, stochastic
differential equations, Brownian motion and its relation to partial differential
equations to be the core of this book (Chapters 1-5). To this I have added
other practically important topics: one dimensional diffusions, semigroups and
generators, Harris chains, and weak convergence. I have struggled with this
material for almost twenty years. I now think that I understand most of it,
so to help you master it in less time, I have tried to explain it as simply and
clearly as I can.
My students' motivations for learning this material have been diverse: some
have wanted to apply ideas from probability to analysis or differential geometry,
others have gone on to do research on diffusion processes or stochastic partial
differential equations, some have been interested in applications of these ideas
to finance, or to problems in operations research. My motivation for writing
this book, like that for Probability Theory and Examples, was to simplify my life
as a teacher by bringing together in one place useful material that is scattered
in a variety of sources.
An old joke says that "if you copy from one book that is plagiarism, but
if you copy from ten books that is scholarship." From that viewpoint this is a
scholarly book. Its main contributors for the various subjects are (a) stochastic
integration and differential equations: Chung and Williams (1990), Ikeda and
Watanabe (1981), Karatzas and Shreve (1991), Protter (1990), Revuz and Yor
(1991), Rogers and Williams (1987), Stroock and Varadhan (1979); (b) partial
differential equations: Folland (1976), Friedman (1964), (1975), Port and Stone
(1978), Chung and Zhao (1995); (c) one dimensional diffusions: Karlin and
Taylor (1981); (d) semi-groups and generators: Dynkin (1965), Revuz and Yor
(1991); (e) weak convergence: Billingsley (1968), Ethier and Kurtz (1986),
Stroock and Varadhan (1979). If you bought all those books you would spend
more than $1000 but for a fraction of that cost you can have this book, the
intellectual equivalent of the ginzu knife.
Shutting off the laugh-track and turning on the violins, the road from this
book's first publication in 1984 to its rebirth in 1996 has been a long and winding
one. In the second half of the 80's I accumulated an embarrassingly long list
of typos from the first edition. Some time at the beginning of the 90's I talked
to the editor who brought my first three books into the world, John Kimmel,
about preparing a second edition. However, after the work was done, the second
edition was personally killed by Bill Roberts, the President of Brooks/Cole. At
the end of 1992 I entered into a contract with Wayne Yuhasz at CRC Press
to produce this book. In the first few months of 1993, June Meyerman typed
most of the book into TeX. In the Fall Semester of 1993 I taught a course from
this material and began to organize it into the current form. By the summer
of 1994 I thought I was almost done. At this point I had the good (and bad)
fortune of having Nora Guertler, a student from Lyon, visit for two months.
When she was through making an average of six corrections per page, it was
clear that the book was far from finished.
During the 1994-95 academic year most of my time was devoted to
preparing the second edition of my first year graduate textbook Probability: Theory
and Examples. After that experience my brain cells could not bear to work
on another book for another several months, but toward the end of 1995 they
decided "it is now or never." The delightful Cornell tradition of a long winter
break, which for me stretched from early December to late January, provided
just enough time to finally finish the book.
I am grateful to my students who have read various versions of this book
and also made numerous comments: Don Allers, Hassan Allouba, Robert Battig,
Marty Hill, Min-jeong Kang, Susan Lee, Gang Ma, and Nikhil Shah. Earlier
in the process, before I started writing, Heike Dengler, David Lando and I spent
a semester reading Protter (1990) and Jacod and Shiryaev (1987), an enterprise
which contributed greatly to my education.
The ancient history of the revision process has unfortunately been lost. At
the time of the proposed second edition, I transferred a number of lists of typos
to my copy of the book, but I have no record of the people who supplied the
lists. I remember getting a number of corrections from Mike Brennan and Ruth
Williams, and it is impossible to forget the story of Robin Pemantle who took
Brownian Motion, Martingales and Analysis as his only math book for a
yearlong trek through the South Seas and later showed me his fully annotated copy.
However, I must apologize to others whose contributions were recorded but
whose names were lost. Flame me at rtdl@cornell.edu and I'll have something
to say about you in the next edition.
Rick Durrett
About the Author
Rick Durrett received his Ph.D. in Operations research from Stanford
University in 1976. He taught in the Mathematics Department at UCLA for
nine years before becoming a Professor of Mathematics at Cornell University.
He was a Sloan Fellow 1981-83, Guggenheim Fellow 1988-89, and spoke at the
International Congress of Math in Kyoto 1990.
Durrett is the author of a graduate textbook, Probability: Theory and
Examples, and an undergraduate one, The Essentials of Probability. He has
written almost 100 papers with a total of 38 co-authors and seen 19 students
complete their Ph.D.'s under his direction. His recent research focuses on the
applications of stochastic spatial models to various problems in biology.
1. Brownian Motion
1.1 Definition and Construction
1.2 Markov Property, Blumenthal's 0-1 Law
1.3 Stopping Times, Strong Markov Property
1.4 First Formulas
2. Stochastic Integration
2.1 Integrands: Predictable Processes
2.2 Integrators: Continuous Local Martingales
2.3 Variance and Covariance Processes
2.4 Integration w.r.t. Bounded Martingales
2.5 The Kunita-Watanabe Inequality
2.6 Integration w.r.t. Local Martingales
2.7 Change of Variables, Ito's Formula
2.8 Integration w.r.t. Semimartingales
2.9 Associative Law
2.10 Functions of Several Semimartingales
Chapter Summary
2.11 Meyer-Tanaka Formula, Local Time
2.12 Girsanov's Formula
3. Brownian Motion, II
3.1 Recurrence and Transience
3.2 Occupation Times
3.3 Exit Times
3.4 Change of Time, Levy's Theorem
3.5 Burkholder Davis Gundy Inequalities
3.6 Martingales Adapted to Brownian Filtrations
4. Partial Differential Equations
A. Parabolic Equations
4.1 The Heat Equation
4.2 The Inhomogeneous Equation
4.3 The Feynman-Kac Formula
B. Elliptic Equations
4.4 The Dirichlet Problem
4.5 Poisson's Equation
4.6 The Schrodinger Equation
C. Applications to Brownian Motion
4.7 Exit Distributions for the Ball
4.8 Occupation Times for the Ball
4.9 Laplace Transforms, Arcsine Law
5. Stochastic Differential Equations
5.1 Examples
5.2 Ito's Approach
5.3 Extension
5.4 Weak Solutions
5.5 Change of Measure
5.6 Change of Time
6. One Dimensional Diffusions
6.1 Construction
6.2 Feller's Test
6.3 Recurrence and Transience
6.4 Green's Functions
6.5 Boundary Behavior
6.6 Applications to Higher Dimensions
7. Diffusions as Markov Processes
7.1 Semigroups and Generators
7.2 Examples
7.3 Transition Probabilities
7.4 Harris Chains
7.5 Convergence Theorems
8. Weak Convergence
8.1 In Metric Spaces
8.2 Prokhorov's Theorems
8.3 The Space C
8.4 Skorohod's Existence Theorem for SDE
8.5 Donsker's Theorem
8.6 The Space D
8.7 Convergence to Diffusions
8.8 Examples
Solutions to Exercises
References
Index
1 Brownian Motion
1.1. Definition and Construction
In this section we will define Brownian motion and construct it. This event,
like the birth of a child, is messy and painful, but after a while we will be able
to have fun with our new arrival. We begin by reducing the definition of a
d-dimensional Brownian motion with a general starting point to that of a one
dimensional Brownian motion starting at 0. The first two statements, (1.1) and
(1.2), are part of our definition.
(1.1) Translation invariance. $\{B_t - B_0, t \ge 0\}$ is independent of $B_0$ and has
the same distribution as a Brownian motion with $B_0 = 0$.
(1.2) Independence of coordinates. If $B_0 = 0$ then $\{B_t^1, t \ge 0\}, \ldots, \{B_t^d, t \ge 0\}$
are independent one dimensional Brownian motions starting at 0.
Now we define a one dimensional Brownian motion starting at 0 to be a
process $B_t$, $t \ge 0$, taking values in $\mathbf{R}$ that has the following properties:
(a) If $t_0 < t_1 < \ldots < t_n$ then $B(t_0), B(t_1) - B(t_0), \ldots, B(t_n) - B(t_{n-1})$ are
independent.
(b) If $s, t \ge 0$ then
$$P(B(s+t) - B(s) \in A) = \int_A (2\pi t)^{-1/2} \exp(-x^2/2t)\, dx$$
(c) With probability one, $B_0 = 0$ and $t \to B_t$ is continuous.
(a) says that $B_t$ has independent increments, (b) says that the increment
$B(s+t) - B(s)$ has a normal distribution with mean 0 and variance $t$, and
(c) is self-explanatory. The reader should note that above we have sometimes
written $B_s$ and sometimes written $B(s)$, a practice we will continue in what
follows.
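Properties (a) and (b) translate directly into a simulation on a finite time grid. The following is a minimal sketch, assuming a uniform grid and Python's standard `random` module; the function names are illustrative and do not come from the text, and a discrete grid of course says nothing about the continuity required in (c).

```python
import math
import random


def simulate_brownian_path(n_steps, t_max, rng):
    """Return B at times k * t_max / n_steps, k = 0..n_steps, with B_0 = 0."""
    dt = t_max / n_steps
    path = [0.0]
    for _ in range(n_steps):
        # Property (b): each increment is normal with mean 0 and variance dt.
        # Property (a): increments over disjoint intervals are drawn independently.
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def sample_variance_at(t_index, n_paths, n_steps, t_max, seed=0):
    """Estimate Var(B_t) at grid index t_index from n_paths simulated paths."""
    rng = random.Random(seed)
    values = [simulate_brownian_path(n_steps, t_max, rng)[t_index]
              for _ in range(n_paths)]
    mean = sum(values) / n_paths
    return sum((v - mean) ** 2 for v in values) / (n_paths - 1)
```

With a few thousand paths the estimate returned by `sample_variance_at` at the grid point for time $t$ should be close to $t$, in line with (b).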
An immediate consequence of the definition that will be useful many times is:
(1.3) Scaling relation. If $B_0 = 0$ then for any $t > 0$,
$$\{B_{st}, s \ge 0\} \stackrel{d}{=} \{t^{1/2} B_s, s \ge 0\}$$
To be precise, the two families of random variables have the same finite
dimensional distributions, i.e., if $s_1 < \ldots < s_n$ then
$$(B_{s_1 t}, \ldots, B_{s_n t}) \stackrel{d}{=} (t^{1/2} B_{s_1}, \ldots, t^{1/2} B_{s_n})$$
In view of (1.2) it suffices to prove this for a one dimensional Brownian motion.
To check this when $n = 1$, we note that $t^{1/2}$ times a normal with mean 0 and
variance $s$ is a normal with mean 0 and variance $st$. The result for $n > 1$ follows
from independent increments.
A second equivalent definition of one dimensional Brownian motion starting
from $B_0 = 0$, which we will occasionally find useful, is that $B_t$, $t \ge 0$, is a real
valued process satisfying
(a') $B_t$ is a Gaussian process (i.e., all its finite dimensional distributions are
multivariate normal),
(b') $EB_s = 0$, $EB_sB_t = s \wedge t = \min\{s,t\}$,
(c') With probability one, $t \to B_t$ is continuous.
It is easy to see that (a) and (b) imply (a'). To get (b') from (a) and (b) suppose
$s < t$ and write
$$EB_sB_t = E(B_s^2) + E(B_s(B_t - B_s)) = s + EB_s E(B_t - B_s) = s$$
The converse is even easier: (a') and (b') specify the finite dimensional
distributions of $B_t$, which by the last calculation must agree with the ones defined
in (a) and (b).
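The covariance computation above, and the scaling relation (1.3), can both be checked by plain arithmetic. The sketch below is only an illustration with hypothetical function names: it builds the covariance of $(B_{t_1}, \ldots, B_{t_n})$ from the variances of the independent increments and compares it with $\min\{t_i, t_j\}$.

```python
import math


def covariance_from_increments(times):
    """Cov(B_{t_i}, B_{t_j}) for increasing times, built from independent increments.

    B_{t_i} is the sum of the first i increments, whose variances are
    t_1 - 0, t_2 - t_1, ...; only the increments shared by B_{t_i} and B_{t_j}
    contribute to their covariance.
    """
    increments = [b - a for a, b in zip([0.0] + times[:-1], times)]
    n = len(times)
    return [[sum(increments[: min(i, j) + 1]) for j in range(n)]
            for i in range(n)]


def min_covariance(times):
    """The covariance s ^ t = min{s, t} stated in (b')."""
    return [[min(s, t) for t in times] for s in times]
```

Multiplying every time by a constant $t$ multiplies each entry by $t$, which is the covariance of $t^{1/2}B_s$; for Gaussian vectors with mean 0 this is what (1.3) asserts.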
The first question that must be addressed in any treatment of Brownian
motion is, "Is there a process with these properties?" The answer is "Yes," of
course, or this book would not exist. For pedagogical reasons we will pursue
an approach that leads to a dead end and then retreat a little to rectify the
difficulty. In view of our definition we can restrict our attention to a one
dimensional Brownian motion starting from a fixed $x \in \mathbf{R}$. We could take $x = 0$
but will not for reasons that become clear in the remark after (1.5).
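For each $0 < t_1 < \ldots < t_n$ define a measure on $\mathbf{R}^n$ by
$$\mu_{x,t_1,\ldots,t_n}(A_1 \times \cdots \times A_n) = \int_{A_1} dx_1 \cdots \int_{A_n} dx_n \prod_{m=1}^n p_{t_m - t_{m-1}}(x_{m-1}, x_m)$$
where $x_0 = x$, $t_0 = 0$,
$$p_t(a, b) = (2\pi t)^{-1/2} \exp(-(b-a)^2/2t)$$
and $A_i \in \mathcal{R}$, the Borel subsets of $\mathbf{R}$. From the formula above it is easy to see
that for fixed $x$ the family $\mu$ is a consistent set of finite dimensional distributions
(f.d.d.'s), that is, if $\{s_1, \ldots, s_{n-1}\} \subset \{t_1, \ldots, t_n\}$ and $t_j \notin \{s_1, \ldots, s_{n-1}\}$ then
$$\mu_{x,s_1,\ldots,s_{n-1}}(A_1 \times \cdots \times A_{n-1}) = \mu_{x,t_1,\ldots,t_n}(A_1 \times \cdots \times A_{j-1} \times \mathbf{R} \times A_j \times \cdots \times A_{n-1})$$
This is clear when $j = n$. To check the equality when $1 \le j < n$, it is enough
to show that
$$\int p_{t_j - t_{j-1}}(x, y)\, p_{t_{j+1} - t_j}(y, z)\, dy = p_{t_{j+1} - t_{j-1}}(x, z)$$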
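Before turning to the argument, the identity can be checked numerically. The sketch below is an illustration only: it assumes a simple trapezoid rule on a truncated interval, and the helper names `p` and `convolve_densities` are hypothetical.

```python
import math


def p(t, a, b):
    """Gaussian transition density p_t(a, b) = (2 pi t)^{-1/2} exp(-(b - a)^2 / 2t)."""
    return math.exp(-(b - a) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def convolve_densities(s, t, x, z, half_width=12.0, n_grid=4000):
    """Trapezoid-rule approximation of the integral over y of p_s(x, y) p_t(y, z)."""
    lo, hi = min(x, z) - half_width, max(x, z) + half_width
    h = (hi - lo) / n_grid
    total = 0.0
    for k in range(n_grid + 1):
        y = lo + k * h
        weight = 0.5 if k in (0, n_grid) else 1.0  # endpoints get half weight
        total += weight * p(s, x, y) * p(t, y, z)
    return total * h
```

Here `s` plays the role of $t_j - t_{j-1}$ and `t` the role of $t_{j+1} - t_j$, so the value returned should agree with `p(s + t, x, z)` up to quadrature error.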
By translation invariance, we can without loss of generality assume $x = 0$, but
all this says in that case is that the sum of independent normals with mean 0 and
variances $t_j - t_{j-1}$ and $t_{j+1} - t_j$ has a normal distribution with mean 0 and
variance $t_{j+1} - t_{j-1}$. With the consistency of f.d.d.'s verified we get our first
construction of Brownian motion:
(1.4) Theorem. Let $\Omega_o = \{\text{functions } \omega : [0,\infty) \to \mathbf{R}\}$ and $\mathcal{F}_o$ be the $\sigma$-field
generated by the finite dimensional sets $\{\omega : \omega(t_i) \in A_i \text{ for } 1 \le i \le n\}$ where
$A_i \in \mathcal{R}$. For each $x \in \mathbf{R}$, there is a unique probability measure $\nu_x$ on $(\Omega_o, \mathcal{F}_o)$
so that $\nu_x\{\omega : \omega(0) = x\} = 1$ and when $0 < t_1 < \cdots < t_n$
$$(*) \qquad \nu_x\{\omega : \omega(t_i) \in A_i\} = \mu_{x,t_1,\ldots,t_n}(A_1 \times \cdots \times A_n)$$
This follows from a generalization of Kolmogorov's extension theorem. We will
not bother with the details since at this point we are at the dead end referred
to above. If $C = \{\omega : t \to \omega(t) \text{ is continuous}\}$ then $C \notin \mathcal{F}_o$, that is, $C$ is not a
measurable set. The easiest way of proving $C \notin \mathcal{F}_o$ is to do
Exercise 1.1. $A \in \mathcal{F}_o$ if and only if there is a sequence of times $t_1, t_2, \ldots$
in $[0,\infty)$ and a $B \in \mathcal{R}^{\{1,2,\ldots\}}$ (the infinite product $\sigma$-field $\mathcal{R} \times \mathcal{R} \times \cdots$) so that
$A = \{\omega : (\omega(t_1), \omega(t_2), \ldots) \in B\}$. In words, all events in $\mathcal{F}_o$ depend on only
countably many coordinates.
The above problem is easy to solve. Let $\mathbf{Q}_2 = \{m2^{-n} : m, n \ge 0\}$ be the
dyadic rationals. If $\Omega_q = \{\omega : \mathbf{Q}_2 \to \mathbf{R}\}$ and $\mathcal{F}_q$ is the $\sigma$-field generated by the
finite dimensional sets, then enumerating the rationals $q_1, q_2, \ldots$ and applying
Kolmogorov's extension theorem shows that we can construct a probability $\nu_x$
on $(\Omega_q, \mathcal{F}_q)$ so that $\nu_x\{\omega : \omega(0) = x\} = 1$ and (*) in (1.4) holds when the
$t_i \in \mathbf{Q}_2$. To extend the definition from $\mathbf{Q}_2$ to $[0,\infty)$ we will show:
(1.5) Theorem. Let $T < \infty$ and $x \in \mathbf{R}$. $\nu_x$ assigns probability one to paths
$\omega : \mathbf{Q}_2 \to \mathbf{R}$ that are uniformly continuous on $\mathbf{Q}_2 \cap [0,T]$.
Remark. It will take quite a bit of work to prove (1.5). Before taking on that
task, we will attend to the last measure theoretic detail: we tidy things up by
moving our probability measures to $(C, \mathcal{C})$ where $C = \{\text{continuous } \omega : [0,\infty) \to \mathbf{R}\}$
and $\mathcal{C}$ is the $\sigma$-field generated by the coordinate maps $t \to \omega(t)$. To do this,
we observe that the map $\psi$ that takes a uniformly continuous point in $\Omega_q$ to its
extension in $C$ is measurable, and we set
$$P_x = \nu_x \circ \psi^{-1}$$
Our construction guarantees that $B_t(\omega) = \omega_t$ has the right finite dimensional
distributions for $t \in \mathbf{Q}_2$. Continuity of paths and a simple limiting argument
show that this is true when $t \in [0,\infty)$.
As mentioned earlier the generalization to $d > 1$ is straightforward since the
coordinates are independent. In this generality $C = \{\text{continuous } \omega : [0,\infty) \to \mathbf{R}^d\}$
and $\mathcal{C}$ is the $\sigma$-field generated by the coordinate maps $t \to \omega(t)$. The reader
should note that the result of our construction is one set of random variables
$B_t(\omega) = \omega(t)$, and a family of probability measures $P_x$, $x \in \mathbf{R}^d$, so that under
$P_x$, $B_t$ is a Brownian motion with $P_x(B_0 = x) = 1$. It is enough to construct the
Brownian motion starting from an initial point $x$, since if we want a Brownian
motion starting from an initial measure $\mu$ (i.e., have $P_\mu(B_0 \in A) = \mu(A)$) we
simply set
$$P_\mu(A) = \int \mu(dx) P_x(A)$$
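In sampling terms, the mixture formula says: first draw the starting point from the initial measure, then run a Brownian motion from that point. The following one dimensional sketch illustrates this; the function name and the two-point initial law used in the note below are assumptions made for the example.

```python
import math
import random


def brownian_value_from_initial_law(sample_initial, t, rng):
    """Draw B_t under P_mu: draw x from mu, then add a normal(0, t) displacement.

    sample_initial is any function taking the random generator and returning
    a draw from the initial measure mu.
    """
    x = sample_initial(rng)
    return x + rng.gauss(0.0, math.sqrt(t))
```

For instance, with `sample_initial = lambda rng: rng.choice([-1.0, 3.0])` the value at $t = 0$ is always one of the two atoms, and the sample mean of $B_1$ over many draws should be close to 1, the mean of that initial law.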
Proof of (1.5) By (1.1) and (1.3), we can without loss of generality suppose
$B_0 = 0$ and prove the result for $T = 1$. In this case, part (b) of the definition
and the scaling relation (1.3) imply
$$E_0|B_t - B_s|^4 = E_0|B_{t-s}|^4 = C(t-s)^2$$
where $C = E|B_1|^4 < \infty$. From the last observation we get the desired uniform
continuity by using a result due to Kolmogorov. In this proof, we do not use
the independent increments property of Brownian motion; the only thing we
use is the moment condition. In Section 2.11 we will need this result when the
$X_t$ take values in a space $S$ with metric $\rho$; so, we will go ahead and prove it in
that generality.
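(1.6) Theorem. Suppose that $E\rho(X_s, X_t)^\beta \le K|t - s|^{1+\alpha}$ where $\alpha, \beta > 0$. If
$\gamma < \alpha/\beta$ then with probability one there is a constant $C(\omega)$ so that
$$\rho(X_q, X_r) \le C|q - r|^\gamma \quad \text{for all } q, r \in \mathbf{Q}_2 \cap [0,1]$$
Proof Let $\gamma < \alpha/\beta$, $\eta > 0$, $I_n = \{(i,j) : 0 \le i \le j \le 2^n, 0 < j - i \le 2^{n\eta}\}$
and $G_n = \{\rho(X(j2^{-n}), X(i2^{-n})) \le ((j-i)2^{-n})^\gamma \text{ for all } (i,j) \in I_n\}$. Since
$a^\beta P(|Y| > a) \le E|Y|^\beta$ we have
$$P(G_n^c) \le \sum_{(i,j) \in I_n} ((j-i)2^{-n})^{-\beta\gamma} E\rho(X(j2^{-n}), X(i2^{-n}))^\beta \le K \sum_{(i,j) \in I_n} ((j-i)2^{-n})^{-\beta\gamma+1+\alpha}$$
by our assumption. Now the number of $(i,j) \in I_n$ is $\le 2^n 2^{n\eta}$, so
$$P(G_n^c) \le K \cdot 2^n 2^{n\eta} \cdot (2^{n\eta} 2^{-n})^{-\beta\gamma+1+\alpha} = K 2^{-n\lambda}$$
where $\lambda = (1-\eta)(1+\alpha-\beta\gamma) - (1+\eta)$. Since $\gamma < \alpha/\beta$, we can pick $\eta$ small
enough so that $\lambda > 0$. To complete the proof now we will show
(1.7) Lemma. Let $A = 3 \cdot 2^{(1-\eta)\gamma}/(1 - 2^{-\gamma})$. On $H_N = \cap_{n=N}^\infty G_n$ we have
$$\rho(X_q, X_r) \le A|q - r|^\gamma \quad \text{for } q, r \in \mathbf{Q}_2 \cap [0,1] \text{ with } |q - r| < 2^{-(1-\eta)N}$$
(1.6) follows easily from (1.7):
$$P(H_N^c) \le \sum_{n=N}^\infty P(G_n^c) \le K \sum_{n=N}^\infty 2^{-n\lambda} = \frac{K 2^{-N\lambda}}{1 - 2^{-\lambda}}$$
This shows $\rho(X_q, X_r) \le A|q - r|^\gamma$ for $|q - r| < \delta(\omega)$ and implies that we have
$\rho(X_q, X_r) \le C(\omega)|q - r|^\gamma$ for $q, r \in [0,1]$.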
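The choice of the parameter written $\eta$ in the proof of (1.6) can be made explicit, since the exponent $\lambda = (1-\eta)(1+\alpha-\beta\gamma) - (1+\eta)$ is linear in $\eta$. The helper below is a small sketch with hypothetical names; it is a reading aid and not part of the proof.

```python
def kolmogorov_lambda(alpha, beta, gamma, eta):
    """lambda = (1 - eta)(1 + alpha - beta*gamma) - (1 + eta) from the proof of (1.6)."""
    return (1.0 - eta) * (1.0 + alpha - beta * gamma) - (1.0 + eta)


def largest_admissible_eta(alpha, beta, gamma):
    """Value of eta at which lambda vanishes; lambda > 0 for every smaller eta > 0.

    Requires gamma < alpha / beta, the hypothesis of (1.6).
    """
    if not gamma < alpha / beta:
        raise ValueError("need gamma < alpha / beta")
    slack = alpha - beta * gamma  # this is lambda at eta = 0
    # Expanding gives lambda = slack - eta * (2 + slack), which is zero here.
    return slack / (2.0 + slack)
```

For the fourth moment bound used for Brownian motion ($\beta = 4$, $\alpha = 1$) and, say, $\gamma = 0.2$, any $\eta$ below $0.2/2.2$ gives $\lambda > 0$.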
Proof of (1.7) Let $q, r \in \mathbf{Q}_2 \cap [0,1]$ with $0 < r - q < 2^{-(1-\eta)N}$. Pick $m \ge N$
so that
$$2^{-(m+1)(1-\eta)} \le r - q < 2^{-m(1-\eta)}$$
and write
$$r = j2^{-m} + 2^{-r(1)} + \cdots + 2^{-r(\ell)}$$
$$q = i2^{-m} - 2^{-q(1)} - \cdots - 2^{-q(k)}$$
where $m < r(1) < \cdots < r(\ell)$ and $m < q(1) < \cdots < q(k)$. Now $0 < r - q < 2^{-m(1-\eta)}$,
so $(j - i) < 2^{m\eta}$ and it follows that on $H_N$
$$\text{(a)} \qquad \rho(X(i2^{-m}), X(j2^{-m})) \le ((2^{m\eta})2^{-m})^\gamma$$
On $H_N$ it follows from the triangle inequality that
$$\text{(b)} \qquad \rho(X_q, X(i2^{-m})) \le \sum_{h=1}^k (2^{-q(h)})^\gamma \le \sum_{h=m}^\infty (2^{-\gamma})^h \le C_\gamma 2^{-\gamma m}$$
where $C_\gamma = 1/(1 - 2^{-\gamma}) > 1$. Repeating the last computation shows
$$\text{(c)} \qquad \rho(X_r, X(j2^{-m})) \le C_\gamma 2^{-\gamma m}$$
Combining (a)-(c) gives
$$\rho(X_q, X_r) \le 3C_\gamma 2^{-\gamma m(1-\eta)} \le 3C_\gamma 2^{(1-\eta)\gamma}|r - q|^\gamma$$
since $2^{-m(1-\eta)} \le 2^{1-\eta}|r - q|$. This completes the proof of (1.7) and hence of
(1.6) and (1.5). □
The scaling relation (1.3) implies
$$E|B_t - B_s|^{2m} = C_m|t - s|^m \quad \text{where } C_m = E|B_1|^{2m}$$
So using (1.6) with $\beta = 2m$ and $\alpha = m - 1$ and then letting $m \to \infty$ gives:
(1.8) Theorem. Brownian paths are Holder continuous with exponent $\gamma$ for
any $\gamma < 1/2$.
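To see how letting $m \to \infty$ produces every exponent below 1/2, one can tabulate the ratio $\alpha/\beta = (m-1)/(2m)$ allowed by (1.6). The sketch below uses exact rational arithmetic; the function names are illustrative, and the closed form $(2m-1)!!$ for the even moments of a standard normal is a standard fact that is not stated in the text.

```python
from fractions import Fraction


def holder_exponent_bound(m):
    """alpha / beta for beta = 2m and alpha = m - 1, the choice made with (1.6)."""
    return Fraction(m - 1, 2 * m)


def even_moment_of_standard_normal(m):
    """C_m = E|B_1|^{2m} = (2m - 1)!! = 1 * 3 * 5 * ... * (2m - 1)."""
    result = 1
    for odd in range(1, 2 * m, 2):
        result *= odd
    return result
```

With $m = 2$ the bound is 1/4 and $C_2 = 3$, which is the fourth moment case used in the proof of (1.5); the bounds increase with $m$ and stay strictly below 1/2.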
It is easy to show:
(1.9) Theorem. With probability one, Brownian paths are not Lipschitz
continuous (and hence not differentiable) at any point.
Proof Let $A_n = \{\omega : \text{there is an } s \in [0,1] \text{ so that } |B_t - B_s| \le C|t - s| \text{ when } |t - s| \le 3/n\}$.
For $1 \le k \le n - 2$ let
$$Y_{k,n} = \max\left\{ \left| B\left(\frac{k+j}{n}\right) - B\left(\frac{k+j-1}{n}\right) \right| : j = 0, 1, 2 \right\}$$
$$G_n = \{\text{at least one } Y_{k,n} \text{ is } \le 5C/n\}$$
We claim that $A_n \subset G_n$. To prove this we consider $s = 1$, which we claim is
the worst possibility. In this case we want to conclude that $Y_{n-2,n} \le 5C/n$.
For this we observe that $j = 0$ is the worst case, but even then
$$|B_{(n-3)/n} - B_{(n-2)/n}| \le |B_{(n-3)/n} - B_1| + |B_1 - B_{(n-2)/n}| \le C\left(\frac{3}{n} + \frac{2}{n}\right)$$
Using $A_n \subset G_n$ and the scaling relation (1.3) gives
$$P(A_n) \le P(G_n) \le nP\left(|B_{1/n}| \le \frac{5C}{n}\right)^3 = nP\left(|B_1| \le \frac{5C}{n^{1/2}}\right)^3 \le n\left\{\frac{10C}{n^{1/2}} \cdot (2\pi)^{-1/2}\right\}^3$$
since $\exp(-x^2/2) \le 1$. Letting $n \to \infty$ shows $P(A_n) \to 0$. Noticing $n \to A_n$ is
increasing shows $P(A_n) = 0$ for all $n$ and completes the proof. □
Exercise 1.2. Show by considering $k$ increments instead of 3 that if $\gamma >
1/2 + 1/k$ then with probability 1, Brownian paths are not Holder continuous
with exponent $\gamma$ at any point of $[0,1]$.
The next result is more evidence that $B_t - B_s \approx \sqrt{t - s}$.
Exercise 1.3. Let $\Delta_{m,n} = B(tm2^{-n}) - B(t(m-1)2^{-n})$. Compute
$$E\left(\sum_{m \le 2^n} \Delta_{m,n}^2 - t\right)^2$$
and use the Borel-Cantelli lemma to conclude that $\sum_{m \le 2^n} \Delta_{m,n}^2 \to t$ a.s. as
$n \to \infty$.
Remark. The last result is true if we consider a sequence of partitions $\Pi_1 \subset
\Pi_2 \subset \ldots$ with mesh $\to 0$. See Freedman (1970) p.42-46. The true quadratic
variation, defined as the sup over all partitions, is $\infty$ for Brownian motion.
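The convergence asserted in Exercise 1.3 is easy to observe numerically. The sketch below is an illustration, not a solution: it draws the $2^n$ increments of one dyadic level directly as independent normals (rather than refining a single path across levels), and the function name is hypothetical.

```python
import math
import random


def dyadic_quadratic_variation(t, n, rng):
    """Sum of squared Brownian increments over the 2**n intervals of length t * 2**-n."""
    dt = t / 2 ** n
    sd = math.sqrt(dt)
    # Each increment B(t m 2^-n) - B(t (m-1) 2^-n) is normal with mean 0, variance dt.
    return sum(rng.gauss(0.0, sd) ** 2 for _ in range(2 ** n))
```

For moderately large `n` the returned value should be close to `t`, and the spread around `t` shrinks as `n` grows.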
1.2. Markov Property, Blumenthal's 0-1 Law
Intuitively the Markov property says
"given the present state, Bs, any other information about what
happened before time s is irrelevant for predicting what happens after
time s."
Since Brownian motion is translation invariant, the Markov property can be
more simply stated as:
"if $s \ge 0$ then $B_{t+s} - B_s$, $t \ge 0$ is a Brownian motion that is
independent of what happened before time $s$."
This should not be surprising: the independent increments property ((a) in the
definition) implies that if $s_1 < s_2 \cdots < s_m = s$ and $0 < t_1 \ldots < t_n$ then
$$(B_{t_1+s} - B_s, \ldots, B_{t_n+s} - B_s) \text{ is independent of } (B_{s_1}, \ldots, B_{s_m})$$
The major obstacle in proving the Markov property for Brownian motion
then is to introduce the measure theory necessary to state and prove the result.
The first step in doing this is to explain what we mean by "what happened
before time s." The technical name for what we are about to define is a
filtration, a fancy term for an increasing collection of $\sigma$-fields, $\mathcal{F}_s$, i.e., if $s \le t$ then
$\mathcal{F}_s \subset \mathcal{F}_t$. Since we want $B_s \in \mathcal{F}_s$, i.e., $B_s$ is measurable with respect to $\mathcal{F}_s$, the
first thing that comes to mind is
$$\mathcal{F}_s^o = \sigma(B_r : r \le s)$$
For technical reasons, it is convenient to replace $\mathcal{F}_s^o$ by
$$\mathcal{F}_s^+ = \cap_{t>s} \mathcal{F}_t^o$$
The fields $\mathcal{F}_s^+$ are nicer because they are right continuous. That is,
$$\cap_{t>s} \mathcal{F}_t^+ = \cap_{t>s} (\cap_{u>t} \mathcal{F}_u^o) = \cap_{u>s} \mathcal{F}_u^o = \mathcal{F}_s^+$$
In words the $\mathcal{F}_s^+$ allow us an "infinitesimal peek at the future," i.e., $A \in \mathcal{F}_s^+$ if
it is in $\mathcal{F}_{s+\epsilon}^o$ for any $\epsilon > 0$. If $B_t$ is a Brownian motion and $f$ is any measurable
function with $f(u) > 0$ when $u > 0$ the random variable
$$\limsup_{t \downarrow s} (B_t - B_s)/f(t-s)$$
is measurable with respect to $\mathcal{F}_s^+$ but not $\mathcal{F}_s^o$. Exercises 2.9 and 2.10 consider
what happens when we take $f(u) = \sqrt{u}$ and $f(u) = \sqrt{u \log\log(1/u)}$. However,
as we will see in (2.6), there are no interesting examples of sets that are in $\mathcal{F}_s^+$
but not in $\mathcal{F}_s^o$. The two $\sigma$-fields are the same (up to null sets).
To state the Markov property we need some notation. First, recall that we
have a family of measures $P_x$, $x \in \mathbf{R}^d$, on $(C, \mathcal{C})$ so that under $P_x$, $B_t(\omega) = \omega(t)$
is a Brownian motion with $B_0 = x$. In order to define "what happens after time
$s$," it is convenient to define the shift transformations $\theta_s : C \to C$, for $s \ge 0$
by
$$(\theta_s\omega)(t) = \omega(s+t) \quad \text{for } t \ge 0$$
In words, we cut off the part of the path before time $s$ and then shift the path
so that time $s$ becomes time 0. To prepare for the next result, we note that if
$Y : C \to \mathbf{R}$ is $\mathcal{C}$ measurable then $Y \circ \theta_s$ is a function of the future after time $s$.
To see this, consider the simple example $Y(\omega) = f(\omega(t))$. In this case
$$Y \circ \theta_s = f(\theta_s\omega(t)) = f(\omega(s+t)) = f(B_{s+t})$$
Likewise, if $Y(\omega) = f(\omega_{t_1}, \ldots, \omega_{t_n})$ then
$$Y \circ \theta_s = f(B_{s+t_1}, \ldots, B_{s+t_n})$$
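The shift is just composition with a time translation, which is easy to mirror in code. In the sketch below a path is modeled as an ordinary Python function of time; this representation and the names `shift` and `compose_with_shift` are assumptions made for illustration.

```python
from typing import Callable

Path = Callable[[float], float]


def shift(path: Path, s: float) -> Path:
    """Return theta_s(path), i.e. the path t -> path(s + t)."""
    if s < 0:
        raise ValueError("s must be nonnegative")
    return lambda t: path(s + t)


def compose_with_shift(Y: Callable[[Path], float], s: float) -> Callable[[Path], float]:
    """Return the functional Y o theta_s, which only looks at the path after time s."""
    return lambda path: Y(shift(path, s))
```

If `Y` reads the path at times 1 and 2, then `compose_with_shift(Y, 3.0)` reads it at times 4 and 5, matching the formula for $Y \circ \theta_s$ above.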
(2.1) The Markov property. If $s \ge 0$ and $Y$ is bounded and $\mathcal{C}$ measurable
then for all $x \in \mathbf{R}^d$
$$E_x(Y \circ \theta_s | \mathcal{F}_s^+) = E_{B_s} Y$$
where the right hand side is $\varphi(y) = E_y Y$ evaluated at $y = B(s)$.
Explanation. In words, this says that the conditional expectation of $Y \circ \theta_s$
given $\mathcal{F}_s^+$ is just the expected value of $Y$ for a Brownian motion starting at $B_s$.
To explain why this implies "given the present state, $B_s$, any other information
about what happened before time $s$ is irrelevant for predicting what happens
after time $s$," we begin by recalling (see (1.1) in Chapter 5 of Durrett (1995))
that if $\mathcal{G} \subset \mathcal{F}$ and $E(Z|\mathcal{F}) \in \mathcal{G}$ then $E(Z|\mathcal{F}) = E(Z|\mathcal{G})$. Applying this with
$\mathcal{F} = \mathcal{F}_s^+$ and $\mathcal{G} = \sigma(B_s)$ we have
$$E_x(Y \circ \theta_s | \mathcal{F}_s^+) = E_x(Y \circ \theta_s | B_s)$$
If we recall that $X = E(Z|\mathcal{F})$ is our best guess at $Z$ given the information in $\mathcal{F}$
(in the sense of minimizing $E(Z - X)^2$ over $X \in \mathcal{F}$, see (1.4) in Chapter 4 of
Durrett (1995)) then we see that our best guess at $Y \circ \theta_s$ given $B_s$ is the same
as our best guess given $\mathcal{F}_s^+$, i.e., any other information in $\mathcal{F}_s^+$ is irrelevant for
predicting $Y \circ \theta_s$.
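Proof By the definition of conditional expectation, what we need to show is
$$\text{(MP)} \qquad E_x(Y \circ \theta_s; A) = E_x(E_{B_s}Y; A) \quad \text{for all } A \in \mathcal{F}_s^+$$
We will begin by proving the result for a special class of $Y$'s and a special class
of $A$'s. Then we will use the $\pi - \lambda$ theorem and the monotone class theorem to
extend to the general case. Suppose
$$Y(\omega) = \prod_{1 \le m \le n} f_m(\omega(t_m))$$
where $0 < t_1 < \ldots < t_n$ and the $f_m : \mathbf{R}^d \to \mathbf{R}$ are bounded and measurable. Let
$0 < h < t_1$, let $0 < s_1 \ldots < s_k \le s + h$, and let $A = \{\omega : \omega(s_j) \in A_j, 1 \le j \le k\}$
where $A_j \in \mathcal{R}^d$ for $1 \le j \le k$. We will call these $A$'s the finite dimensional
or f.d. sets in $\mathcal{F}_{s+h}^o$.
From the definition of Brownian motion it follows that if $0 = u_0 < u_1 <
\ldots < u_\ell$ then the joint density of $(B_{u_1}, \ldots, B_{u_\ell})$ is given by
$$P_x(B_{u_1} = y_1, \ldots, B_{u_\ell} = y_\ell) = \prod_{i=1}^\ell p_{u_i - u_{i-1}}(y_{i-1}, y_i)$$
where $y_0 = x$. From this it follows that
$$E_x \prod_{i=1}^\ell g_i(B_{u_i}) = \int dy_1\, p_{u_1 - u_0}(y_0, y_1) g_1(y_1) \cdots \int dy_\ell\, p_{u_\ell - u_{\ell-1}}(y_{\ell-1}, y_\ell) g_\ell(y_\ell)$$
Applying this result with the $u_i$ given by $s_1, \ldots, s_k, s + h, s + t_1, \ldots, s + t_n$ and
the obvious choices for the $g_i$ we have
$$E_x(Y \circ \theta_s; A) = E_x\left(\prod_{j=1}^k 1_{A_j}(B_{s_j}) \cdot 1_{\mathbf{R}^d}(B_{s+h}) \cdot \prod_{m=1}^n f_m(B_{s+t_m})\right)$$
$$= \int_{A_1} dx_1\, p_{s_1}(x, x_1) \cdots \int_{A_k} dx_k\, p_{s_k - s_{k-1}}(x_{k-1}, x_k) \cdot \int_{\mathbf{R}^d} dy\, p_{s+h-s_k}(x_k, y) \varphi(y, h)$$
where
$$\varphi(y, h) = \int dy_1\, p_{t_1 - h}(y, y_1) f_1(y_1) \cdots \int dy_n\, p_{t_n - t_{n-1}}(y_{n-1}, y_n) f_n(y_n)$$
Using the formula for $E_x \prod_{i=1}^\ell g_i(B_{u_i})$ again we have
$$\text{(MP-1)} \qquad E_x(Y \circ \theta_s; A) = E_x(\varphi(B_{s+h}, h); A) \quad \text{for all f.d. sets } A \text{ in } \mathcal{F}_{s+h}^o$$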
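To extend the class of $A$'s to all of $\mathcal{F}_{s+h}^o$, we use
(2.2) The $\pi - \lambda$ theorem. Let $\mathcal{A}$ be a collection of subsets of $\Omega$ that contains
$\Omega$ and is closed under intersection (i.e., if $A, B \in \mathcal{A}$ then $A \cap B \in \mathcal{A}$). Let $\mathcal{G}$ be
a collection of sets that satisfy
(i) if $A, B \in \mathcal{G}$ and $A \supset B$ then $A - B \in \mathcal{G}$
(ii) If $A_n \in \mathcal{G}$ and $A_n \uparrow A$, then $A \in \mathcal{G}$.
If $\mathcal{A} \subset \mathcal{G}$ then the $\sigma$-field generated by $\mathcal{A}$, $\sigma(\mathcal{A}) \subset \mathcal{G}$.
Proof See (2.1) in the Appendix of Durrett (1995).
$$\text{(MP-2)} \qquad E_x(Y \circ \theta_s; A) = E_x(\varphi(B_{s+h}, h); A) \quad \text{for all } A \in \mathcal{F}_{s+h}^o$$
Proof Fix $Y$ and let $\mathcal{G}$ be the collection of sets $A$ for which the desired equality
is true. A simple subtraction shows that (i) in (2.2) holds, while the monotone
convergence theorem shows that (ii) holds. If we let $\mathcal{A}$ be the collection of finite
dimensional sets in $\mathcal{F}_{s+h}^o$ then we have shown $\mathcal{A} \subset \mathcal{G}$ so $\mathcal{F}_{s+h}^o = \sigma(\mathcal{A}) \subset \mathcal{G}$
which is the desired conclusion. □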
Our next step is
$$\text{(MP-3)} \qquad E_x(Y \circ \theta_s; A) = E_x(\varphi(B_s, 0); A) \quad \text{for all } A \in \mathcal{F}_s^+$$
Proof Since $\mathcal{F}_s^+ \subset \mathcal{F}_{s+h}^o$ for any $h > 0$, all we need to do is let $h \to 0$ in
(MP-2). It is easy to see that
$$\psi(y_1) = f_1(y_1) \int dy_2\, p_{t_2 - t_1}(y_1, y_2) f_2(y_2) \cdots \int dy_n\, p_{t_n - t_{n-1}}(y_{n-1}, y_n) f_n(y_n)$$
is bounded and measurable. Using the dominated convergence theorem shows
that if $h \to 0$ and $x_h \to x$ then
$$\varphi(x_h, h) = \int dy_1\, p_{t_1 - h}(x_h, y_1) \psi(y_1) \to \varphi(x, 0)$$
Using (MP-2) and the bounded convergence theorem now gives the desired
result. □
(MP-3) shows that (MP) holds for $Y = \prod_{1 \le m \le n} f_m(\omega(t_m))$ when the $f_m$
are bounded and measurable. To extend to general bounded measurable $Y$ we
will use
(2.3) Monotone class theorem. Let $\mathcal{A}$ be a collection of subsets of $\Omega$ that
contains $\Omega$ and is closed under intersection (i.e., if $A, B \in \mathcal{A}$ then $A \cap B \in \mathcal{A}$).
Let $\mathcal{H}$ be a vector space of real valued functions on $\Omega$ satisfying
(i) If $A \in \mathcal{A}$, $1_A \in \mathcal{H}$.
(ii) If $0 \le f_n \in \mathcal{H}$ and $f_n \uparrow f$, a bounded function, then $f \in \mathcal{H}$.
Then $\mathcal{H}$ contains all the bounded functions on $\Omega$ that are measurable with
respect to $\sigma(\mathcal{A})$.
Proof See (1.4) of Chapter 5 of Durrett (1995).
To get (2.1) from (2.3), fix an $A \in \mathcal{F}_s^+$ and let $\mathcal{H}$ = the collection of bounded
functions $Y$ for which (MP) holds. $\mathcal{H}$ clearly is a vector space satisfying (ii).
Let $\mathcal{A}$ be the collection of sets of the form $\{\omega : \omega(t_j) \in A_j, 1 \le j \le n\}$ where
$A_j \in \mathcal{R}^d$. The special case treated above shows that if $A \in \mathcal{A}$ then $1_A \in \mathcal{H}$.
This shows (i) holds and the desired conclusion follows from (2.3). □
The next seven exercises give typical applications of the Markov property.
Perhaps the simplest example is
Exercise 2.1. Let $0 < s < t$. If $f : \mathbf{R}^d \to \mathbf{R}$ is bounded and measurable
$$E_x(f(B_t)|\mathcal{F}_s) = E_{B_s} f(B_{t-s})$$
Exercise 2.2. Take $f(x) = x^i$ and $f(x) = x^i x^j$ with $i \ne j$ in Exercise 2.1 to
conclude that $B_t^i$ and $B_t^i B_t^j$ are martingales if $i \ne j$.
The next two exercises prepare for calculations in Section 1.4.
Exercise 2.3. Let $T_0 = \inf\{s > 0 : B_s = 0\}$ and let $R = \inf\{t > 1 : B_t = 0\}$.
$R$ is for right or return. Use the Markov property at time 1 to get
$$\text{(2.4)} \qquad P_x(R > 1 + t) = \int p_1(x, y) P_y(T_0 > t)\, dy$$
Exercise 2.4. Let $T_0 = \inf\{s > 0 : B_s = 0\}$ and let $L = \sup\{t \le 1 : B_t = 0\}$.
$L$ is for left or last. Use the Markov property at time $0 < t < 1$ to conclude
$$\text{(2.5)} \qquad P_0(L \le t) = \int p_t(0, y) P_y(T_0 > 1 - t)\, dy$$
Exercise 2.5. Let $G$ be an open set and let $T = \inf\{t : B_t \notin G\}$. Let $K$ be a
closed subset of $G$ and suppose that for all $x \in K$ we have $P_x(T > 1, B_1 \in K) \ge
\alpha$, then for all integers $n \ge 1$ and $x \in K$ we have $P_x(T > n, B_n \in K) \ge \alpha^n$.
The next two exercises prepare for calculations in Chapter 4.
Exercise 2.6. Let $0 < s < t$. If $h : \mathbf{R} \times \mathbf{R}^d \to \mathbf{R}$ is bounded and measurable
$$E_x\left(\int_0^t h(r, B_r)\, dr \,\Big|\, \mathcal{F}_s\right) = \int_0^s h(r, B_r)\, dr + E_{B(s)} \int_0^{t-s} h(s + u, B_u)\, du$$
Exercise 2.7. Let $0 < s < t$. If $f : \mathbf{R}^d \to \mathbf{R}$ and $h : \mathbf{R} \times \mathbf{R}^d \to \mathbf{R}$ are bounded
and measurable then
$$E_x\left(f(B_t) \exp\left(\int_0^t h(r, B_r)\, dr\right) \Big|\, \mathcal{F}_s\right) = \exp\left(\int_0^s h(r, B_r)\, dr\right) E_{B_s}\left(f(B_{t-s}) \exp\left(\int_0^{t-s} h(s + u, B_u)\, du\right)\right)$$
The reader will see many other applications of the Markov property below,
so we turn our attention now to a "triviality" that has surprising consequences.
Since
$$E_x(Y \circ \theta_s | \mathcal{F}_s^+) = E_{B_s} Y \in \mathcal{F}_s^o$$
it follows (see, e.g., (1.1) in Chapter 5 of Durrett (1995)) that
$$E_x(Y \circ \theta_s | \mathcal{F}_s^+) = E_x(Y \circ \theta_s | \mathcal{F}_s^o)$$
From the last equation it is a short step to
(2.6) Theorem. If $Z \in \mathcal{C}$ is bounded then for all $s \ge 0$ and $x \in \mathbf{R}^d$,
$$E_x(Z|\mathcal{F}_s^+) = E_x(Z|\mathcal{F}_s^o)$$
Proof By the monotone class theorem, (2.3), it suffices to prove the result
when
$$Z = \prod_{m=1}^n f_m(B(t_m))$$
and the $f_m$ are bounded and measurable. In this case $Z = X(Y \circ \theta_s)$ where
$X \in \mathcal{F}_s^o$ and $Y \in \mathcal{C}$, so using a property of conditional expectation (see, e.g.,
(1.3) in Chapter 4 of Durrett (1995)) and the Markov property (2.1) gives
$$E_x(Z|\mathcal{F}_s^+) = X E_x(Y \circ \theta_s | \mathcal{F}_s^+) = X E_{B_s} Y \in \mathcal{F}_s^o$$
and the proof is complete. □
If we let $Z \in \mathcal{F}_s^+$ then (2.6) implies $Z = E_x(Z|\mathcal{F}_s^o) \in \mathcal{F}_s^o$, so the two $\sigma$-fields
are the same up to null sets. At first glance, this conclusion is not exciting.
The fun starts when we take $s = 0$ in (2.6) to get
(2.7) Blumenthal's 0-1 law. If $A \in \mathcal{F}_0^+$ then for all $x \in \mathbf{R}^d$,
$$P_x(A) \in \{0, 1\}$$
Proof Using (i) the fact that $A \in \mathcal{F}_0^+$, (ii) (2.6), (iii) $\mathcal{F}_0^o = \sigma(B_0)$ is trivial
under $P_x$, and (iv) if $\mathcal{G}$ is trivial $E(X|\mathcal{G}) = EX$ gives
$$1_A = E_x(1_A|\mathcal{F}_0^+) = E_x(1_A|\mathcal{F}_0^o) = P_x(A) \quad P_x \text{ a.s.}$$
This shows that the indicator function $1_A$ is a.s. equal to the number $P_x(A)$
and it follows that $P_x(A) \in \{0, 1\}$. □
In words, the last result says that the germ field, J7^, is trivial. This result
is very useful in studying the local behavior of Brownian paths. Until further
notice we will restrict our attention to one dimensional Brownian motion.
(2.8) Theorem. If $\tau = \inf\{t \ge 0 : B_t > 0\}$ then $P_0(\tau = 0) = 1$.
Proof $P_0(\tau \le t) \ge P_0(B_t > 0) = 1/2$ since the normal distribution is
symmetric about 0. Letting $t \downarrow 0$ we conclude
$$P_0(\tau = 0) = \lim_{t \downarrow 0} P_0(\tau \le t) \ge 1/2$$
so it follows from (2.7) that $P_0(\tau = 0) = 1$. □
Once Brownian motion must hit $(0,\infty)$ immediately starting from 0, it
must also hit $(-\infty,0)$ immediately. Since $t \to B_t$ is continuous, this forces:
(2.9) Theorem. If $T_0 = \inf\{t > 0 : B_t = 0\}$ then $P_0(T_0 = 0) = 1$.
Combining (2.8) and (2.9) with the Markov property you can prove
Exercise 2.8. If $a < b$ then with probability one there is a local maximum
of $B_t$ in $(a,b)$. Since with probability one this holds for all rational $a < b$, the
local maxima of Brownian motion are dense.
Another typical application of (2.7) is
Exercise 2.9. Let $f(t)$ be a function with $f(t) > 0$ for all $t > 0$. Use (2.7) to
conclude that $\limsup_{t \downarrow 0} B(t)/f(t) = c$ $P_0$ a.s. where $c \in [0,\infty]$ is a constant.
In the next exercise we will see that $c = \infty$ when $f(t) = t^{1/2}$. The law of the
iterated logarithm (see Section 7.9 of Durrett (1995)) shows that $c = 2^{1/2}$ when
$f(t) = (t \log\log(1/t))^{1/2}$.
Exercise 2.10. Show that $\limsup_{t \downarrow 0} B(t)/t^{1/2} = \infty$ $P_0$ a.s., so with probability
one Brownian paths are not Hölder continuous of order 1/2 at 0.
Remark. Let $\mathcal{H}_\gamma(\omega)$ be the set of times at which the path $\omega \in C$ is Hölder
continuous of order $\gamma$. (1.6) shows that $P(\mathcal{H}_\gamma = [0,\infty)) = 1$ for $\gamma < 1/2$.
Exercise 1.2 shows that $P(\mathcal{H}_\gamma = \emptyset) = 1$ for $\gamma > 1/2$. The last exercise shows
$P(t \in \mathcal{H}_{1/2}) = 0$ for each $t$, but B. Davis (1983) has shown $P(\mathcal{H}_{1/2} \ne \emptyset) = 1$.
Comic Relief. There is a wonderful way of expressing the complexity of
Brownian paths that I learned from Wilfrid Kendall.
"If you run Brownian motion in two dimensions for a positive amount
of time, it will write your name."
Of course, on top of your name it will write everybody else's name, as well as all
the works of Shakespeare, several pornographic novels, and a lot of nonsense.
Thinking of the function g as our signature we can make a precise statement
as follows:
(2.10) Theorem. Let $g : [0,1] \to \mathbf{R}^d$ be a continuous function with $g(0) = 0$,
let $\epsilon > 0$ and let $t_n \downarrow 0$. Then $P_0$ almost surely,
$$\sup_{0 \le \theta \le 1} \left| \frac{B(\theta t_n)}{\sqrt{t_n}} - g(\theta) \right| < \epsilon \quad \text{for infinitely many } n$$
Proof In view of Blumenthal's 0-1 law, (2.7), and the scaling relation (1.3),
we can prove this if we can show that
$$(*) \qquad P_0\left( \sup_{0 \le \theta \le 1} |B(\theta) - g(\theta)| < \epsilon \right) > 0 \quad \text{for any } \epsilon > 0$$
This was easy for me to believe, but not so easy for me to prove when students
asked me to fill in the details. The first step is to treat the dead man's signature
g(x) = 0.
Exercise 2.11. Show that if $\epsilon > 0$ and $t < \infty$ then
$$P_0\left( \sup_j \sup_{0 \le s \le t} |B_s^j| < \epsilon \right) > 0$$
In doing this you may find Exercise 2.5 helpful. In (5.4) of Chapter 5 we will
get the general result from this one by change of measure. Can the reader find
a simple direct proof of (*)?
With our discussion of Blumenthal's 0-1 law complete, the distinction
between $\mathcal{F}_s^+$ and $\mathcal{F}_s^o$ is no longer important, so we will make one final improvement
in our σ-fields and remove the superscripts. Let
$$\mathcal{N}_x = \{A : A \subset B \text{ with } P_x(B) = 0\}$$
$$\mathcal{F}_s^x = \sigma(\mathcal{F}_s^+ \cup \mathcal{N}_x)$$
$$\mathcal{F}_s = \cap_x \mathcal{F}_s^x$$
$\mathcal{N}_x$ are the null sets and $\mathcal{F}_s^x$ are the completed σ-fields for $P_x$. Since we do
not want the filtration to depend on the initial state we take the intersection of
all the completed σ-fields. This technicality will be mentioned at one point in
the next section but can otherwise be ignored.
(2.7) concerns the behavior of $B_t$ as $t \to 0$. By using a trick we can use
this result to get information about the behavior as $t \to \infty$.
(2.11) Theorem. If $B_t$ is a Brownian motion starting at 0 then so is the process
defined by $X_0 = 0$ and $X_t = t B(1/t)$ for $t > 0$.
Proof By (1.2) it suffices to prove the result in one dimension. We begin by
observing that the strong law of large numbers implies $X_t \to 0$ as $t \to 0$, so $X$
has continuous paths and we only have to check that $X$ has the right f.d.d.'s.
By the second definition of Brownian motion, it suffices to show that (i) if
$0 < t_1 < \ldots < t_n$ then $(X(t_1), \ldots, X(t_n))$ has a multivariate normal distribution
with mean 0 (which is obvious) and (ii) if $s < t$ then, since $E(B(1/s)B(1/t)) = 1/t$,
$$E(X_s X_t) = st\,E(B(1/s)B(1/t)) = s$$
□
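The covariance identity in step (ii) can be checked with exact arithmetic. The following is a minimal sketch, assuming only that $E(B_u B_v) = \min(u,v)$ for a one dimensional Brownian motion started at 0; the function names are illustrative and do not come from the text.

```python
from fractions import Fraction

def brownian_cov(u, v):
    """Covariance E(B_u B_v) = min(u, v) for Brownian motion started at 0."""
    return min(u, v)

def inverted_cov(s, t):
    """Covariance of X_s = s B(1/s) and X_t = t B(1/t) for s, t > 0."""
    return s * t * brownian_cov(1 / s, 1 / t)

# Exact rational times, so there is no floating point error.
times = [Fraction(1, 7), Fraction(1, 2), Fraction(3, 4), Fraction(5, 2)]
checks = [inverted_cov(s, t) == min(s, t) for s in times for t in times]
print(all(checks))
```

For every pair of times the time-inverted process has covariance $\min(s,t)$, which is the covariance required in the second definition of Brownian motion.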
(2.11) allows us to relate the behavior of $B_t$ as $t \to \infty$ to the behavior as
$t \to 0$. Combining this idea with Blumenthal's 0-1 law leads to a very useful
result. Let
$$\mathcal{F}'_t = \sigma(B_s : s \ge t) = \text{the future after time } t$$
$$\mathcal{T} = \cap_{t \ge 0} \mathcal{F}'_t = \text{the tail σ-field}$$
(2.12) Theorem. If $A \in \mathcal{T}$ then either $P_x(A) \equiv 0$ or $P_x(A) \equiv 1$.
Remark. Notice that this is stronger than the conclusion of Blumenthal's 0-1
law (2.7). The examples $A = \{\omega : \omega(0) \in B\}$ show that for $A$ in the germ
σ-field $\mathcal{F}_0^+$ the value of $P_x(A)$ may depend on $x$.
Proof Since the tail σ-field of $B$ is the same as the germ σ-field for $X$, it
follows that $P_0(A) \in \{0,1\}$. To improve this to the conclusion given observe
that $A \in \mathcal{F}'_1$, so $1_A$ can be written as $1_B \circ \theta_1$. Applying the Markov property,
(2.1), gives
$$P_x(A) = E_x(1_B \circ \theta_1) = E_x(E_x(1_B \circ \theta_1 \mid \mathcal{F}_1)) = E_x(E_{B_1} 1_B)
= \int (2\pi)^{-d/2} \exp(-|y - x|^2/2)\, P_y(B)\, dy$$
Taking $x = 0$ we see that if $P_0(A) = 0$ then $P_y(B) = 0$ for a.e. $y$ with respect
to Lebesgue measure, and using the formula again shows $P_x(A) = 0$ for all $x$.
To handle the case $P_0(A) = 1$ observe that $A^c \in \mathcal{T}$ and $P_0(A^c) = 0$, so the last
result implies $P_x(A^c) = 0$ for all $x$. □
The next result is a typical application of (2.12). The argument here is a
close relative of the one for (2.8).
(2.13) Theorem. Let $B_t$ be a one dimensional Brownian motion and let $A =
\cap_n \{B_t = 0 \text{ for some } t \ge n\}$. Then $P_x(A) = 1$ for all $x$.
In words, one dimensional Brownian motion is recurrent. It will return to 0
"infinitely often," i.e., there is a sequence of times $t_n \uparrow \infty$ so that $B_{t_n} = 0$. We
have to be careful with the interpretation of the phrase in quotes since starting
from 0, $B_t$ will return to 0 infinitely many times by time $\epsilon > 0$.
Proof We begin by noting that under $P_x$, $B_t/\sqrt{t}$ has a normal distribution
with mean $x/\sqrt{t}$ and variance 1, so if we use $\chi$ to denote a standard normal,
$$P_x(B_t < 0) = P_x(B_t/\sqrt{t} < 0) = P(\chi < -x/\sqrt{t})$$
and $\lim_{t \to \infty} P_x(B_t < 0) = 1/2$. If we let $T_0 = \inf\{t : B_t = 0\}$ then the last
result and the fact that Brownian paths are continuous implies that for all $x > 0$
$$P_x(T_0 < \infty) \ge 1/2$$
Symmetry implies that the last result holds for $x < 0$, while (2.9) (or the Markov
property) covers the last case $x = 0$.
Combining the fact that $P_x(T_0 < \infty) \ge 1/2$ for all $x$ with the Markov
property shows that
$$P_x(B_t = 0 \text{ for some } t \ge n) = E_x(P_{B_n}(T_0 < \infty)) \ge 1/2$$
Letting $n \to \infty$ it follows that $P_x(A) \ge 1/2$ but $A \in \mathcal{T}$ so (2.12) implies
$P_x(B_t = 0 \text{ i.o.}) = 1$. □
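The first step of the proof, that $P_x(B_t < 0)$ increases to 1/2 as $t \to \infty$, is easy to see numerically. This is only an illustrative sketch: it writes the standard normal distribution function with the complementary error function, and the starting point `x = 3.0` and the function name are assumptions chosen for the example.

```python
import math

def prob_below_zero(x, t):
    """P_x(B_t < 0) = P(chi < -x / sqrt(t)) for a standard normal chi."""
    return 0.5 * math.erfc(x / math.sqrt(2.0 * t))

x = 3.0
values = [prob_below_zero(x, t) for t in (1.0, 1e2, 1e4, 1e6)]
print(values)
```

The printed probabilities increase toward 1/2 as $t$ grows, which is the limit used to get $P_x(T_0 < \infty) \ge 1/2$ for $x > 0$.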
1.3. Stopping Times, Strong Markov Property
We call a random variable $S$ taking values in $[0,\infty]$ a stopping time if for all
$t \ge 0$, $\{S < t\} \in \mathcal{F}_t$. To bring this definition to life think of $B_t$ as giving the
price of a stock and S as the time we choose to sell it. Then the decision to sell
before time t should be measurable with respect to the information known at
time t.
In the last definition we have made a choice between $\{S < t\}$ and $\{S \le t\}$.
This makes a big difference in discrete time but none in continuous time (for a
right continuous filtration $\mathcal{F}_t$):
If $\{S \le t\} \in \mathcal{F}_t$ then $\{S < t\} = \cup_n \{S \le t - 1/n\} \in \mathcal{F}_t$.
If $\{S < t\} \in \mathcal{F}_t$ then $\{S \le t\} = \cap_n \{S < t + 1/n\} \in \mathcal{F}_t$.
The first conclusion requires only that $t \to \mathcal{F}_t$ is increasing. The second relies
on the fact that $t \to \mathcal{F}_t$ is right continuous. (3.2) and (3.3) below show that
when checking something is a stopping time it is nice to know that the two
definitions are equivalent.
(3.1) Theorem. If $G$ is an open set and $T = \inf\{t \ge 0 : B_t \in G\}$ then $T$ is a
stopping time.
Proof Since $G$ is open and $t \to B_t$ is continuous $\{T < t\} = \cup_{q < t} \{B_q \in G\}$
where the union is over all rational $q$, so $\{T < t\} \in \mathcal{F}_t$. Here, we need to use
the rationals so we end up with a countable union. □
(3.2) Theorem. If $T_n$ is a sequence of stopping times and $T_n \downarrow T$ then $T$ is a
stopping time.
Proof $\{T < t\} = \cup_n \{T_n < t\}$. □
(3.3) Theorem. If $T_n$ is a sequence of stopping times and $T_n \uparrow T$ then $T$ is a
stopping time.
Proof $\{T \le t\} = \cap_n \{T_n \le t\}$. □
(3.4) Theorem. If $K$ is a closed set and $T = \inf\{t \ge 0 : B_t \in K\}$ then $T$ is a
stopping time.
Proof Let $D(x,r) = \{y : |x - y| < r\}$, let $G_n = \cup\{D(x,1/n) : x \in K\}$, and
let $T_n = \inf\{t \ge 0 : B_t \in G_n\}$. Since $G_n$ is open, it follows from (3.1) that $T_n$
is a stopping time. I claim that as $n \uparrow \infty$, $T_n \uparrow T$. To prove this notice that
$T \ge T_n$ for all $n$, so $\lim T_n \le T$. To prove $T \le \lim T_n$ we can suppose that
$T_n \uparrow t < \infty$. Since $B(T_n) \in \bar{G}_n$ for all $n$ and $B(T_n) \to B(t)$, it follows that
$B(t) \in K$ and $T \le t$. □
Remark. As the reader might guess the hitting time of a Borel set $A$, $T_A =
\inf\{t : B_t \in A\}$, is a stopping time. However, this turns out to be a difficult
result to prove and is not true unless the filtration is completed as we did at
the end of the last section. Hunt was the first to prove this. The reader can
find a discussion of this result in Section 10 of Chapter 1 of Blumenthal and
Getoor (1968) or in Chapter 3 of Dellacherie and Meyer (1978). We will not
worry about that result here since (3.1) and (3.4), or in a pinch the next result,
will be adequate for all the hitting times we will consider.
Exercise 3.1. Suppose $A$ is an $F_\sigma$, i.e., a countable union of closed sets. Show
that $T_A = \inf\{t : B_t \in A\}$ is a stopping time.
Exercise 3.2. Let $S$ be a stopping time and let $S_n = ([2^n S] + 1)/2^n$ where
$[x]$ = the largest integer $\le x$. That is,
$$S_n = (m+1)2^{-n} \quad \text{if } m2^{-n} \le S < (m+1)2^{-n}$$
In words, we stop at the first time of the form $k2^{-n}$ after $S$ (i.e., $> S$). From
the verbal description it should be clear that $S_n$ is a stopping time. Prove that
it is.
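To get a feel for the dyadic approximation $S_n = ([2^n S] + 1)/2^n$, it helps to evaluate it for a fixed value of $S$. The sketch below treats $S$ as a deterministic number, which is an assumption made purely for illustration; it says nothing about measurability, which is the actual content of the exercise. The helper name `dyadic_upper` is hypothetical.

```python
import math
from fractions import Fraction

def dyadic_upper(s, n):
    """Return S_n = ([2^n s] + 1) / 2^n, the first point k 2^-n strictly after s."""
    return Fraction(math.floor(s * 2**n) + 1, 2**n)

s = Fraction(5, 8)  # a dyadic value, where "strictly after" matters
approximations = [dyadic_upper(s, n) for n in range(1, 8)]
for n, value in enumerate(approximations, start=1):
    print(n, value, value - s)
```

Each $S_n$ is strictly larger than $S$, the sequence is nonincreasing in $n$, and $S_n - S \le 2^{-n}$, so $S_n \downarrow S$. This is the form in which the approximation is used in the proof of the strong Markov property.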
Exercise 3.3. If $S$ and $T$ are stopping times, then $S \wedge T = \min\{S,T\}$, $S \vee T =
\max\{S,T\}$, and $S + T$ are also stopping times. In particular, if $t \ge 0$, then
$S \wedge t$, $S \vee t$, and $S + t$ are stopping times.
Exercise 3.4. Let $T_n$ be a sequence of stopping times. Show that
$$\sup_n T_n, \quad \inf_n T_n, \quad \limsup_n T_n, \quad \liminf_n T_n \quad \text{are stopping times}$$
Our next goal is to state and prove the strong Markov property. To do
this, we need to generalize two definitions from Section 1.2. Given a nonnegative
random variable $S(\omega)$ we define the random shift $\theta_S$ which "cuts off the part
of $\omega$ before $S(\omega)$ and then shifts the path so that time $S(\omega)$ becomes time 0."
$$(\theta_S \omega)(t) = \begin{cases} \omega(S(\omega) + t) & \text{on } \{S < \infty\} \\ \Delta & \text{on } \{S = \infty\} \end{cases}$$
Here $\Delta$ is an extra point we add to $C$ to cover the case in which the whole
path gets shifted away. Some authors like to adopt the convention that all
functions have $f(\Delta) = 0$ to take care of the second case. However, we will
usually explicitly restrict our attention to $\{S < \infty\}$ so that the second half of
the definition will not come into play.
The second quantity $\mathcal{F}_S$, "the information known at time $S$," is a little
more subtle. We could have defined
$$\mathcal{F}_s = \cap_{\epsilon > 0}\, \sigma(B_{t \wedge (s+\epsilon)}, t \ge 0)$$
so by analogy we could set
$$\mathcal{F}_S = \cap_{\epsilon > 0}\, \sigma(B_{t \wedge (S+\epsilon)}, t \ge 0)$$
The definition we will now give is less transparent but easier to work with.
$$\mathcal{F}_S = \{A : A \cap \{S \le t\} \in \mathcal{F}_t \text{ for all } t \ge 0\}$$
In words, this makes the reasonable demand that the part of $A$ that lies in
$\{S \le t\}$ should be measurable with respect to the information available at time
$t$. Again we have made a choice between $\le t$ and $< t$ but as in the case of
stopping times, this makes no difference and it is useful to know that the two
definitions are equivalent.
Exercise 3.5. When $\mathcal{F}_t$ is right continuous, the definition of $\mathcal{F}_S$ is unchanged
if we replace $\{S \le t\}$ by $\{S < t\}$.
For practice with the definition of $\mathcal{F}_S$ do
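Exercise 3.6. Let $S$ be a stopping time and let $A \in \mathcal{F}_S$. Show that
$$R = \begin{cases} S & \text{on } A \\ \infty & \text{on } A^c \end{cases} \quad \text{is a stopping time}$$
Exercise 3.7. Let $S$ and $T$ be stopping times.
(i) $\{S < t\}$, $\{S > t\}$, $\{S = t\}$ are in $\mathcal{F}_S$.
(ii) $\{S < T\}$, $\{S > T\}$, and $\{S = T\}$ are in $\mathcal{F}_S$ (and in $\mathcal{F}_T$).
Two properties of $\mathcal{F}_S$ that will be useful below are:
(3.5) Theorem. If $S \le T$ are stopping times then $\mathcal{F}_S \subset \mathcal{F}_T$.
Proof If $A \in \mathcal{F}_S$ then $A \cap \{T \le t\} = (A \cap \{S \le t\}) \cap \{T \le t\} \in \mathcal{F}_t$. □
(3.6) Theorem. If $T_n \downarrow T$ are stopping times then $\mathcal{F}_T = \cap_n \mathcal{F}(T_n)$.
Proof (3.5) implies $\mathcal{F}(T_n) \supset \mathcal{F}_T$ for all $n$. To prove the other inclusion, let
$A \in \cap \mathcal{F}(T_n)$. Since $A \cap \{T_n < t\} \in \mathcal{F}_t$ and $T_n \downarrow T$, it follows that $A \cap \{T <
t\} \in \mathcal{F}_t$, so $A \in \mathcal{F}_T$. □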
The last result and Exercises 3.2 and 3.7 allow us to prove something that
is obvious from the verbal definition.
Exercise 3.8. $B_S \in \mathcal{F}_S$, i.e., the value of $B_S$ is measurable with respect to
the information known at time $S$! To prove this let $S_n = ([2^n S] + 1)/2^n$ be the
stopping times defined in Exercise 3.2. Show $B(S_n) \in \mathcal{F}_{S_n}$ then let $n \to \infty$ and
use (3.6).
The next result goes in the opposite direction from (3.6). Here $\mathcal{G}_n \uparrow \mathcal{G}$ means
$n \to \mathcal{G}_n$ is increasing and $\mathcal{G} = \sigma(\cup \mathcal{G}_n)$.
Exercise 3.9. Let $S < \infty$ and $T_n$ be stopping times and suppose that $T_n \uparrow \infty$
as $n \uparrow \infty$. Show that $\mathcal{F}_{S \wedge T_n} \uparrow \mathcal{F}_S$ as $n \uparrow \infty$.
We are now ready to state the strong Markov property, which says that
the Markov property holds at stopping times.
(3.7) Strong Markov property. Let $(s,\omega) \to Y(s,\omega)$ be bounded and $\mathcal{R} \times \mathcal{C}$
measurable. If $S$ is a stopping time then for all $x \in \mathbf{R}^d$
$$E_x(Y_S \circ \theta_S \mid \mathcal{F}_S) = E_{B(S)} Y_S \quad \text{on } \{S < \infty\}$$
where the right-hand side is $\varphi(y,t) = E_y Y_t$ evaluated at $y = B(S)$, $t = S$.
Remark. In most applications the function that we apply to the shifted path
will not depend on s but this flexibility is important in Example 3.3. The verbal
description of this equation is much like that of the ordinary Markov property:
"the conditional expectation of $Y_S \circ \theta_S$ given $\mathcal{F}_S$ is just the expected
value of $Y_S$ for a Brownian motion starting at $B_S$."
Proof We first prove the result under the assumption that there is a sequence
of times $t_n \uparrow \infty$, so that $P_x(S < \infty) = \sum_n P_x(S = t_n)$. In this case we simply
break things down according to the value of $S$, apply the Markov property and
put the pieces back together. If we let $Z_n = Y_{t_n}(\omega)$ and $A \in \mathcal{F}_S$ then
$$E_x(Y_S \circ \theta_S; A \cap \{S < \infty\}) = \sum_{n=1}^\infty E_x(Z_n \circ \theta_{t_n}; A \cap \{S = t_n\})$$
Now if $A \in \mathcal{F}_S$, $A \cap \{S = t_n\} = (A \cap \{S \le t_n\}) - (A \cap \{S \le t_{n-1}\}) \in \mathcal{F}(t_n)$,
so it follows from the Markov property that the above sum is
$$= \sum_{n=1}^\infty E_x(E_{B(t_n)} Z_n; A \cap \{S = t_n\}) = E_x(E_{B(S)} Y_S; A \cap \{S < \infty\})$$
To prove the result in general we let $S_n = ([2^n S] + 1)/2^n$ where $[x]$ = the
largest integer $\le x$. In Exercise 3.2 you showed that $S_n$ is a stopping time. To
be able to let $n \to \infty$ we restrict our attention to $Y$'s of the form
$$(*) \qquad Y_s(\omega) = f_0(s) \prod_{m=1}^n f_m(\omega(t_m))$$
where $0 < t_1 < \ldots < t_n$ and $f_0, \ldots, f_n$ are real valued, bounded and continuous.
If $f$ is bounded and continuous then the dominated convergence theorem implies
that
$$x \to \int dy\, p_t(x,y) f(y)$$
is continuous. From this and induction it follows that
$$\varphi(x,s) = E_x Y_s = f_0(s) \int dy_1\, p_{t_1}(x,y_1) f_1(y_1) \cdots \int dy_n\, p_{t_n - t_{n-1}}(y_{n-1},y_n) f_n(y_n)$$
is bounded and continuous.
Having assembled the necessary ingredients we can now complete the proof.
Let $A \in \mathcal{F}_S$. Since $S \le S_n$, (3.5) implies $A \in \mathcal{F}(S_n)$. Applying the special case
of (3.7) proved above to $S_n$ and observing that $\{S_n < \infty\} = \{S < \infty\}$ gives
$$E_x(Y_{S_n} \circ \theta_{S_n}; A \cap \{S < \infty\}) = E_x(\varphi(B(S_n), S_n); A \cap \{S < \infty\})$$
Now as $n \to \infty$, $S_n \downarrow S$, $B(S_n) \to B(S)$, $\varphi(B(S_n),S_n) \to \varphi(B(S),S)$ and
$$Y_{S_n} \circ \theta_{S_n} \to Y_S \circ \theta_S$$
so the bounded convergence theorem implies that (3.7) holds when $Y$ has the
form given in (*).
To complete the proof now we use the monotone class theorem, (2.3). Let
$\mathcal{H}$ be the collection of bounded functions for which (3.7) holds. Clearly $\mathcal{H}$ is
a vector space that satisfies (ii) if $Y_n \in \mathcal{H}$ are nonnegative and increase to a
bounded $Y$, then $Y \in \mathcal{H}$. To check (i) now, let $\mathcal{A}$ be the collection of sets of
the form $\{\omega : \omega(t_j) \in G_j, 1 \le j \le n\}$ where each $G_j$ is a closed set. If $G$ is closed the function
$1_G$ is a decreasing limit of the continuous functions $f_k(x) = (1 - k\,\mathrm{dist}(x,G))^+$,
where $\mathrm{dist}(x,G)$ is the distance from $x$ to $G$, so if $A \in \mathcal{A}$ then $1_A \in \mathcal{H}$. This
shows (i) holds and the desired conclusion follows from (2.3). □
Example 3.1. Zeros of Brownian motion. Consider one dimensional Brownian
motion, let $R_t = \inf\{u > t : B_u = 0\}$ and let $T_0 = \inf\{u > 0 : B_u = 0\}$.
Now (2.13) implies $P_x(R_t < \infty) = 1$, so $B(R_t) = 0$ and the strong Markov
property and (2.9) imply
$$P_x(T_0 \circ \theta_{R_t} > 0 \mid \mathcal{F}_{R_t}) = P_0(T_0 > 0) = 0$$
Taking the expected value of the last equation we see that
$$P_x(T_0 \circ \theta_{R_t} > 0 \text{ for some rational } t) = 0$$
From this it follows that with probability one, if a point $u \in Z(\omega) = \{t :
B_t(\omega) = 0\}$ is isolated on the left (i.e., there is a rational $t < u$ so that $(t,u) \cap
Z(\omega) = \emptyset$) then it is a decreasing limit of points in $Z(\omega)$. This shows that the
closed set $Z(\omega)$ has no isolated points and hence must be uncountable. For the
last step see Hewitt and Stromberg (1969), page 72.
If we let $|Z(\omega)|$ denote the Lebesgue measure of $Z(\omega)$ then Fubini's theorem
implies
$$E_x(|Z(\omega) \cap [0,T]|) = \int_0^T P_x(B_t = 0)\, dt = 0$$
So $Z(\omega)$ is a set of measure zero. □
Example 3.2. Let $G$ be an open set, $x \in G$, let $T = \inf\{t : B_t \notin G\}$,
and suppose $P_x(T < \infty) = 1$. Let $A \subset \partial G$, the boundary of $G$, and let
$u(x) = P_x(B_T \in A)$. I claim that if we let $\delta > 0$ be chosen so that $D(x,\delta) =
\{y : |y - x| < \delta\} \subset G$ and let $S = \inf\{t \ge 0 : B_t \notin D(x,\delta)\}$, then
$$u(x) = E_x u(B_S)$$
Intuition Since $D(x,\delta) \subset G$, $B_t$ cannot exit $G$ without first exiting $D(x,\delta)$
at $B_S$. When $B_S = y$, the probability of exiting $G$ in $A$ is $u(y)$ independent of
how $B_t$ got to $y$.
Proof To prove the desired formula, we will apply the strong Markov
property, (3.7), to
$$Y = 1_{(B_T \in A)}$$
To check that this leads to the right formula, we observe that since $D(x,\delta) \subset G$,
we have $B_T \circ \theta_S = B_T$ and $1_{(B_T \in A)} \circ \theta_S = 1_{(B_T \in A)}$. In words, $\omega$ and the shifted
path $\theta_S \omega$ must exit $G$ at the same place.
Since $S \le T$ and we have supposed $P_x(T < \infty) = 1$, it follows that
$P_x(S < \infty) = 1$. Using the strong Markov property now gives
$$E_x(1_{(B_T \in A)} \circ \theta_S \mid \mathcal{F}_S) = E_{B(S)} 1_{(B_T \in A)} = u(B_S)$$
Using the definition of $u$, $1_{(B_T \in A)} \circ \theta_S = 1_{(B_T \in A)}$, and taking expected value of
the previous display, we have
$$u(x) = E_x 1_{(B_T \in A)} = E_x\left( 1_{(B_T \in A)} \circ \theta_S \right)
= E_x E_x\left( 1_{(B_T \in A)} \circ \theta_S \mid \mathcal{F}_S \right) = E_x u(B_S)$$
□
Exercise 3.10. Let $G$, $T$, $D(x,\delta)$ and $S$ be as above, but now suppose
that $E_y T < \infty$ for all $y \in G$. Let $g$ be a bounded function and let $u(y) =
E_y\left( \int_0^T g(B_s)\, ds \right)$. Show that for $x \in G$
$$u(x) = E_x\left( \int_0^S g(B_s)\, ds + u(B_S) \right)$$
Our third application shows why we want to allow the function Y that we
apply to the shifted path to depend on the stopping time S.
Example 3.3. Reflection principle. Let $B_t$ be a one dimensional Brownian
motion, let $a > 0$ and let $T_a = \inf\{t : B_t = a\}$. Then
$$(3.8) \qquad P_0(T_a < t) = 2P_0(B_t > a)$$
Intuitive proof We observe that if $B_s$ hits $a$ at some time $s < t$ then the
strong Markov property implies that $B_t - B(T_a)$ is independent of what
happened before time $T_a$. The symmetry of the normal distribution and $P_a(B_t =
a) = 0$ then imply
$$(3.9) \qquad P_0(T_a < t, B_t > a) = \tfrac{1}{2} P_0(T_a < t)$$
Multiplying by 2, then using $\{B_t > a\} \subset \{T_a < t\}$ we have
$$P_0(T_a < t) = 2P_0(T_a < t, B_t > a) = 2P_0(B_t > a)$$
Proof To make the intuitive proof rigorous we only have to prove (3.9). To
extract this from the strong Markov property (3.7), we let
$$Y_s(\omega) = \begin{cases} 1 & \text{if } s < t,\ \omega(t - s) > a \\ 0 & \text{otherwise} \end{cases}$$
We do this so that if we let $S = \inf\{s < t : B_s = a\}$ with $\inf \emptyset = \infty$ then
$$Y_S(\theta_S \omega) = 1_{(B_t > a)} \quad \text{on } \{S < \infty\} = \{T_a < t\}$$
If we let $\varphi(x,s) = E_x Y_s$ the strong Markov property implies
$$E_0(Y_S \circ \theta_S \mid \mathcal{F}_S) = \varphi(B_S, S) \quad \text{on } \{S < \infty\} = \{T_a < t\}$$
Now $B_S = a$ on $\{S < \infty\}$ and $\varphi(a,s) = 1/2$ if $s < t$, so
$$P_0(T_a < t, B_t > a) = E_0(1/2\,; T_a < t)$$
which proves (3.9).
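The reflection argument has an exact discrete counterpart that can be verified by brute force. The sketch below is not from the text: it replaces Brownian motion by a simple $\pm 1$ random walk $S_n$, for which reflecting the path after its first visit to level $a$ gives $\#\{\max_{k \le n} S_k \ge a\} = 2\,\#\{S_n > a\} + \#\{S_n = a\}$. The extra term appears because the walk, unlike $B_t$, ends exactly at $a$ with positive probability. All function names are illustrative.

```python
from itertools import product
from math import comb

def count_paths_reaching(n, a):
    """Count n-step +/-1 paths whose running maximum reaches level a."""
    count = 0
    for steps in product((1, -1), repeat=n):
        position = 0
        for step in steps:
            position += step
            if position >= a:
                count += 1
                break
    return count

def count_paths_ending_at(n, k):
    """Count n-step +/-1 paths that end at k (zero if the parity is wrong)."""
    if abs(k) > n or (n + k) % 2:
        return 0
    return comb(n, (n + k) // 2)

def reflection_count(n, a):
    """2 * #{S_n > a} + #{S_n = a}, the reflection-principle count."""
    above = sum(count_paths_ending_at(n, k) for k in range(a + 1, n + 1))
    return 2 * above + count_paths_ending_at(n, a)

print(count_paths_reaching(10, 3), reflection_count(10, 3))
```

The enumeration over all $2^{10}$ paths and the binomial count agree, which is the combinatorial form of the identity $P_0(T_a < t) = 2P_0(B_t > a)$.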
Exercise 3.11. Generalize the proof of (3.9) to conclude that if $u < v \le a$
then
$$(3.10) \qquad P_0(T_a < t, u < B_t < v) = P_0(2a - v < B_t < 2a - u)$$
To explain our interest in (3.10), let $M_t = \max_{0 \le s \le t} B_s$ to rewrite it as
$$P_0(M_t > a, u < B_t < v) = P_0(2a - v < B_t < 2a - u)$$
Letting the interval $(u,v)$ shrink to $x$ we see that
$$P_0(M_t > a, B_t = x) = P_0(B_t = 2a - x) = \frac{1}{\sqrt{2\pi t}}\, e^{-(2a-x)^2/2t}$$
Differentiating with respect to $a$ now we get the joint density
$$(3.11) \qquad P_0(M_t = a, B_t = x) = \frac{2(2a - x)}{\sqrt{2\pi t^3}}\, e^{-(2a-x)^2/2t}$$
1.4. First Formulas
We have two interrelated aims in this section: the first to understand the
behavior of the hitting times $T_a = \inf\{t : B_t = a\}$ for a one dimensional Brownian
motion $B_t$; the second to study the behavior of Brownian motion in the upper
half space $H = \{x \in \mathbf{R}^d : x_d > 0\}$. We begin with $T_a$ and observe that the
reflection principle (3.8) implies
$$P_0(T_a < t) = 2P_0(B_t > a) = 2 \int_a^\infty (2\pi t)^{-1/2} \exp(-x^2/2t)\, dx$$
Here, and until further notice, we are dealing with a one dimensional
Brownian motion. To find the probability density of $T_a$, we change variables $x =
t^{1/2}a/s^{1/2}$, $dx = -t^{1/2}a/2s^{3/2}\, ds$ to get
$$(4.1) \qquad P_0(T_a < t) = 2 \int_t^0 (2\pi t)^{-1/2} \exp(-a^2/2s) \left( -t^{1/2}a/2s^{3/2} \right) ds
= \int_0^t (2\pi s^3)^{-1/2}\, a \exp(-a^2/2s)\, ds$$
Using (4.1) we can compute the distribution of $L = \sup\{t \le 1 :
B_t = 0\}$ and $R = \inf\{t \ge 1 : B_t = 0\}$, completing work we started in Exercises
2.3 and 2.4. By (2.5) if $0 < s < 1$ then
$$P_0(L \le s) = \int_{-\infty}^\infty p_s(0,x) P_x(T_0 > 1 - s)\, dx$$
$$= 2 \int_0^\infty (2\pi s)^{-1/2} \exp(-x^2/2s) \int_{1-s}^\infty (2\pi r^3)^{-1/2}\, x \exp(-x^2/2r)\, dr\, dx$$
$$= \frac{1}{\pi} \int_{1-s}^\infty (sr^3)^{-1/2} \int_0^\infty x \exp(-x^2(r+s)/2rs)\, dx\, dr$$
$$= \frac{1}{\pi} \int_{1-s}^\infty (sr^3)^{-1/2}\, rs/(r+s)\, dr$$
Our next step is to let $t = s/(r+s)$ to convert the integral over $r \in [1-s, \infty)$
into one over $t \in [0,s]$. $dt = -s/(r+s)^2\, dr$, so to make the calculations easier
we first rewrite the integral as
$$\frac{1}{\pi} \int_{1-s}^\infty \left( \frac{(r+s)^2}{rs} \right)^{1/2} \frac{s}{(r+s)^2}\, dr$$
Changing variables as indicated above and then again with $t = u^2$ to get
$$(4.2) \qquad P_0(L \le s) = \frac{1}{\pi} \int_0^s (t(1-t))^{-1/2}\, dt
= \frac{2}{\pi} \int_0^{\sqrt{s}} (1 - u^2)^{-1/2}\, du = \frac{2}{\pi} \arcsin(\sqrt{s})$$
The reader should note that, contrary to intuition, the density function of $L$ =
the last 0 before time 1,
$$P_0(L = t) = \frac{1}{\pi} (t(1-t))^{-1/2} \quad \text{for } 0 < t < 1$$
is symmetric about 1/2 and blows up near 0 and 1. This is one of two arcsine
laws associated with Brownian motion. We will encounter the other one in
Section 4.9.
The computation for R is much easier and is left to the reader.
Exercise 4.1. Show that the probability density for $R$ is given by
$$P_0(R = 1 + t) = 1/(\pi t^{1/2}(1 + t)) \quad \text{for } t \ge 0$$
Notation. In the last two displays and in what follows we will often write
P(T = t) = f(t) as short hand for T has density function f(t).
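The arcsine law (4.2) can be checked in the same spirit. The sketch below follows the substitution $t = u^2$ used in the text, which removes the singularity at 0 and makes a simple midpoint rule accurate; the function names and the sample points are assumptions made for illustration.

```python
import math

def arcsine_cdf(s):
    """P_0(L <= s) = (2 / pi) arcsin(sqrt(s)) from (4.2)."""
    return 2.0 / math.pi * math.asin(math.sqrt(s))

def arcsine_cdf_numeric(s, steps=20000):
    """Midpoint rule for (2 / pi) * integral_0^sqrt(s) (1 - u^2)^(-1/2) du."""
    upper = math.sqrt(s)
    h = upper / steps
    total = sum(1.0 / math.sqrt(1.0 - ((i + 0.5) * h) ** 2) for i in range(steps))
    return 2.0 / math.pi * h * total

def arcsine_density(t):
    """Density (1 / pi) (t (1 - t))^(-1/2) of L on (0, 1)."""
    return 1.0 / (math.pi * math.sqrt(t * (1.0 - t)))

print(arcsine_cdf(0.25), arcsine_cdf_numeric(0.25))
print(arcsine_density(0.01), arcsine_density(0.5), arcsine_density(0.99))
```

The second line of output shows the shape described in the text: the density is smallest at $t = 1/2$ and grows without bound near 0 and 1, so the last zero before time 1 is more likely to be near the endpoints than near the middle.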
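As our next application of (4.1) we will compute the distribution of $B_\tau$
where $\tau = \inf\{t : B_t \notin H\}$ and $H = \{z : z_d > 0\}$ and, of course, $B_t$ is a $d$
dimensional Brownian motion. Since the exit time depends only on the last
coordinate, it is independent of the first $d - 1$ coordinates and we can compute
the distribution of $B_\tau$ by writing for $x, \theta \in \mathbf{R}^{d-1}$, $y \in \mathbf{R}$
$$P_{(x,y)}(B_\tau = (\theta, 0)) = \int_0^\infty ds\, P_{(x,y)}(\tau = s)\, (2\pi s)^{-(d-1)/2} e^{-|x-\theta|^2/2s}
= \frac{y}{(2\pi)^{d/2}} \int_0^\infty ds\, s^{-(d+2)/2} e^{-(|x-\theta|^2 + y^2)/2s}$$
by (4.1) with $a = y$. Changing variables $s = (|x - \theta|^2 + y^2)/2t$ gives
$$= \frac{y}{(2\pi)^{d/2}} \int_\infty^0 \frac{-(|x-\theta|^2 + y^2)}{2t^2}\, dt\; e^{-t} \left( \frac{2t}{|x-\theta|^2 + y^2} \right)^{(d+2)/2}$$
so we have
$$(4.3) \qquad P_{(x,y)}(B_\tau = (\theta, 0)) = \frac{y}{(|x-\theta|^2 + y^2)^{d/2}} \cdot \frac{\Gamma(d/2)}{\pi^{d/2}}$$
where $\Gamma(\alpha) = \int_0^\infty y^{\alpha-1} e^{-y}\, dy$ is the usual gamma function.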
When $d = 2$ and $x = 0$, probabilists should recognize this as a Cauchy
distribution. At first glance, the fact that $B_\tau$ has a Cauchy distribution might
be surprising, but a moment's thought reveals that this must be true. To explain
this, we begin by looking at the behavior of the hitting times $\{T_a, a \ge 0\}$ as
the level $a$ varies.
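The identification of (4.3) with a Cauchy density when $d = 2$ is a one-line computation, and it can also be confirmed numerically. In the sketch below, `exit_density` transcribes (4.3) and `cauchy_density` is the standard Cauchy density with scale parameter $y$; both names, and the test points in `grid`, are assumptions made for this example.

```python
import math

def exit_density(x, y, theta, d):
    """Density (4.3) of the exit point (theta, 0), starting from (x, y) with y > 0.

    x and theta are sequences of length d - 1.
    """
    dist2 = sum((xi - ti) ** 2 for xi, ti in zip(x, theta)) + y * y
    return math.gamma(d / 2.0) * y / (math.pi ** (d / 2.0) * dist2 ** (d / 2.0))

def cauchy_density(theta, scale):
    """Density of a Cauchy distribution centered at 0 with the given scale."""
    return scale / (math.pi * (theta * theta + scale * scale))

y = 1.5
grid = [-4.0, -1.0, 0.0, 0.5, 3.0]
gaps = [abs(exit_density([0.0], y, [th], 2) - cauchy_density(th, y)) for th in grid]
print(max(gaps))
```

For $d = 2$ the constant $\Gamma(d/2)/\pi^{d/2}$ reduces to $1/\pi$, so the exit density from height $y$ is exactly the Cauchy density with scale $y$.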
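(4.4) Theorem. Under $P_0$, $\{T_a, a \ge 0\}$ has stationary independent increments.
Proof The first step is to notice that if $0 < a < b$ then
$$T_b \circ \theta_{T_a} = T_b - T_a,$$
so if $f$ is bounded and measurable, the strong Markov property (3.7) and
translation invariance imply
$$E_0\left( f(T_b - T_a) \mid \mathcal{F}_{T_a} \right) = E_0\left( f(T_b) \circ \theta_{T_a} \mid \mathcal{F}_{T_a} \right)
= E_a f(T_b) = E_0 f(T_{b-a})$$
The desired result now follows from the next lemma which will be useful later.
(4.5) Lemma. Suppose $Z_t$ is adapted to $\mathcal{G}_t$ and that for each bounded
measurable function $f$
$$E(f(Z_t - Z_s) \mid \mathcal{G}_s) = E f(Z_{t-s})$$
Then $Z_t$ has stationary independent increments.
Proof Let $t_0 < t_1 < \ldots < t_n$, and let $f_i$, $1 \le i \le n$ be bounded and measurable.
Conditioning on $\mathcal{G}_{t_{n-1}}$ and using the hypothesis we have
$$E\left( \prod_{i=1}^n f_i(Z_{t_i} - Z_{t_{i-1}}) \right)
= E\left( \prod_{i=1}^{n-1} f_i(Z_{t_i} - Z_{t_{i-1}}) \cdot E(f_n(Z_{t_n} - Z_{t_{n-1}}) \mid \mathcal{G}_{t_{n-1}}) \right)$$
$$= E\left( \prod_{i=1}^{n-1} f_i(Z_{t_i} - Z_{t_{i-1}}) \right) E f_n(Z_{t_n - t_{n-1}})$$
By induction it follows that
$$E\left( \prod_{i=1}^n f_i(Z_{t_i} - Z_{t_{i-1}}) \right) = \prod_{i=1}^n E f_i(Z_{t_i - t_{i-1}})$$
which implies the desired conclusion. □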
The scaling relation (1.2) implies
$$(4.6) \qquad T_a \overset{d}{=} a^2 T_1$$
So consulting Section 2.7 of Durrett (1995) we see that $T_a$ has a stable law with
index $\alpha = 1/2$ and skewness parameter $\kappa = 1$ (since $T_a \ge 0$).
To explain the appearance of the Cauchy distribution in the hitting
locations, let $\tau_a = \inf\{t : B_t^2 = a\}$ (where $B_t^2$ is the second component of a
two dimensional Brownian motion) and observe that another application of the
strong Markov property implies
(4.7) Theorem. Under $P_0$, $\{C_s = B^1(\tau_s), s \ge 0\}$ has stationary independent
increments.
Exercise 4.2. Prove (4.7).
The scaling relation (1.3) and an obvious symmetry imply
$$(4.8) \qquad C_s \overset{d}{=} s\,C_1, \qquad C_s \overset{d}{=} -C_s$$
so again consulting Section 2.7 of Durrett (1995) we can conclude that $C_s$ has
the symmetric stable distribution with index $\alpha = 1$ and hence must be Cauchy.
To give a direct derivation of the last fact let $\varphi_\theta(s) = E(\exp(i\theta C_s))$. Then
(4.7) and (4.8) imply
$$\varphi_\theta(s)\varphi_\theta(t) = \varphi_\theta(s+t), \qquad \varphi_\theta(s) = \varphi_{\theta s}(1), \qquad \varphi_\theta(s) = \varphi_{-\theta}(s)$$
The fact that $C_s \overset{d}{=} -C_s$ implies that $\varphi_\theta(s)$ is real. Since $\theta \to \varphi_\theta(1)$ is
continuous, the second equation implies $s \to \varphi_\theta(s) = \varphi_{\theta s}(1)$ is continuous, and
a simple argument (see Exercise 4.3) shows that for each $\theta$, $\varphi_\theta(s) = \exp(c_\theta s)$.
The last two equations imply that $c_s = s c_1$ and $c_{-\theta} = c_\theta$ so $c_\theta = -\kappa|\theta|$ for some
$\kappa$, so $C_s$ has a Cauchy distribution. Since the arguments above apply equally
well to $(B_t^1, cB_t^2)$, where $c$ is any constant, we cannot determine $\kappa$ with this
argument.
Exercise 4.3. Suppose $\varphi(s)$ is real valued continuous and satisfies $\varphi(s)\varphi(t) =
\varphi(s+t)$, $\varphi(0) = 1$. Let $\psi(s) = \ln(\varphi(s))$ to get the additive equation: $\psi(s) +
\psi(t) = \psi(s+t)$. Use the equation to conclude that $\psi(m2^{-n}) = m2^{-n}\psi(1)$ for
all integers $m, n \ge 0$, and then use continuity to extend this to $\psi(t) = t\psi(1)$.
Exercise 4.4. Adapt the argument for $\varphi_\theta(s) = E(\exp(i\theta C_s))$ to show
$$E_0(\exp(-\lambda T_a)) = \exp(-a\kappa\sqrt{\lambda})$$
The representation of the Cauchy process given above allows us to see that
its sample paths $a \to T_a$ and $s \to C_s$ are very bad.
Exercise 4.5. If $u < v$ then $P_0(a \to T_a \text{ is discontinuous in } (u,v)) = 1$.
Exercise 4.6. If $u < v$ then $P_0(s \to C_s \text{ is discontinuous in } (u,v)) = 1$.
Hint. By independent increments the probabilities in Exercises 4.5 and 4.6
only depend on v — u but then scaling implies that they do not depend on the
size of the interval.
The discussion above has focused on how Bt leaves H. The rest of the
section is devoted to studying where Bt goes before it leaves H. We begin with
the case d = 1.
(4.9) Theorem. If $x, y > 0$, then $P_x(B_t = y, T_0 > t) = p_t(x,y) - p_t(x,-y)$
where
$$p_t(x,y) = (2\pi t)^{-1/2} e^{-(y-x)^2/2t}$$
Proof The proof is a simple extension of the argument we used in Section
1.3 to prove that $P_0(T_a < t) = 2P_0(B_t > a)$. Let $f \ge 0$ with $f(x) = 0$ when
$x \le 0$. Clearly
$$E_x(f(B_t); T_0 > t) = E_x f(B_t) - E_x(f(B_t); T_0 \le t)$$
If we let $\bar{f}(x) = f(-x)$, then it follows from the strong Markov property and
symmetry of Brownian motion that
$$E_x(f(B_t); T_0 \le t) = E_x[E_0 f(B_{t - T_0}); T_0 \le t]
= E_x[E_0 \bar{f}(B_{t - T_0}); T_0 \le t]
= E_x[\bar{f}(B_t); T_0 \le t] = E_x(\bar{f}(B_t))$$
since $\bar{f}(y) = 0$ for $y \ge 0$. Combining this with the first equality shows
$$E_x(f(B_t); T_0 > t) = E_x f(B_t) - E_x \bar{f}(B_t)
= \int (p_t(x,y) - p_t(x,-y)) f(y)\, dy$$
□
The last formula generalizes easily to $d \ge 2$.
(4.10) Theorem. Let $\tau = \inf\{t : B_t^d = 0\}$. If $x, y \in H$,
$$P_x(B_t = y, \tau > t) = p_t(x,y) - p_t(x,\bar{y})$$
where $\bar{y} = (y_1, \ldots, y_{d-1}, -y_d)$.
Exercise 4.7. Prove (4.10).
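A quick consistency check on (4.9): integrating the killed density over $y > 0$ should give $P_x(T_0 > t)$, which by symmetry and the reflection principle (3.8) equals $1 - 2P_0(B_t > x)$. The sketch below does this with a midpoint rule in one dimension. The truncation point `upper = 40.0`, the step count, and the function names are assumptions made for the example, and the survival probability is written with `math.erf`.

```python
import math

def p(t, x, y):
    """Brownian transition density p_t(x, y) in one dimension."""
    return math.exp(-(y - x) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def killed_density(t, x, y):
    """Density from (4.9) of B_t on {T_0 > t}, for x, y > 0."""
    return p(t, x, y) - p(t, x, -y)

def survival_numeric(t, x, upper=40.0, steps=40000):
    """Midpoint rule for the integral of the killed density over (0, upper)."""
    h = upper / steps
    return h * sum(killed_density(t, x, (i + 0.5) * h) for i in range(steps))

def survival_reflection(t, x):
    """P_x(T_0 > t) = 1 - 2 P_0(B_t > x), written with the error function."""
    return math.erf(x / math.sqrt(2.0 * t))

print(survival_numeric(1.0, 0.7), survival_reflection(1.0, 0.7))
```

The killed density vanishes at $y = 0$ and its total mass is strictly less than 1; the missing mass is the probability that the path has already hit 0 by time $t$.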
2 Stochastic Integration
In this chapter we will define our stochastic integral $I_t = \int_0^t H_s\, dX_s$. To motivate
the developments, think of $X_s$ as being the price of a stock at time $s$ and $H_s$
as the number of shares we hold, which may be negative (selling short). The
integral $I_t$ then represents the net profits at time $t$, relative to our wealth at
time 0. To check this note that the infinitesimal rate of change of the integral
$dI_t = H_t\, dX_t$ = the rate of change of the stock times the number of shares we
hold.
In the first section we will introduce the integrands $H_s$, the "predictable
processes," a mathematical version of the notion that the number of shares
held must be based on the past behavior of the stock and not on the future
performance. In the second section, we will introduce our integrators $X_s$, the
"continuous local martingales." Intuitively, martingales are fair games, while
the "local" refers to the fact that we reduce the integrability requirements
to admit a wider class of examples. We restrict our attention to the case of
martingales with continuous paths $t \to X_t$ to have a simpler theory.
2.1. Integrands: Predictable Processes
To motivate the class of integrands we consider, we will discuss integration
w.r.t. discrete time martingales. Here, we will assume that the reader is familiar
with the basics of martingale theory, as taught for example in Chapter 4 of
Durrett (1995). However, we will occasionally present results whose proofs can
be found there.
Let $X_n$, $n \ge 0$, be a martingale w.r.t. $\mathcal{F}_n$. If $H_n$, $n \ge 1$, is any process, we
can define
$$(H \cdot X)_n = \sum_{m=1}^n H_m (X_m - X_{m-1})$$
To motivate the last formula and the restriction we are about to place on the
$H_m$, we will consider a concrete example. Let $\xi_1, \xi_2, \ldots$ be independent with
$P(\xi_i = 1) = P(\xi_i = -1) = 1/2$, and let $X_n = \xi_1 + \cdots + \xi_n$. $X_n$ is the symmetric
simple random walk and is a martingale with respect to $\mathcal{F}_n = \sigma(\xi_1, \ldots, \xi_n)$.
If we consider a person flipping a fair coin and betting $1 on heads each time
then $X_n$ gives their net winnings at time $n$. Suppose now that the person bets
an amount $H_m$ on heads at time $m$ (with $H_m < 0$ interpreted as a bet of $-H_m$
on tails). I claim that $(H \cdot X)_n$ gives her net winnings at time $n$. To check
this note that if $H_m > 0$ our gambler wins her bet at time $m$ and increases her
fortune by $H_m$ if and only if $X_m - X_{m-1} = 1$.
The gambling interpretation of the stochastic integral suggests that it is
natural to let the amount bet at time $n$ depend on the outcomes of the first
$n - 1$ flips but not on the flip we are betting on, or on later flips. A process $H_n$
that has $H_n \in \mathcal{F}_{n-1}$ for all $n \ge 1$ (here $\mathcal{F}_0 = \{\emptyset, \Omega\}$, the trivial σ-field) is said
to be predictable since its value at time $n$ can be predicted (with certainty)
at time $n - 1$. The next result shows that we cannot make money by gambling
on a fair game.
(1.1) Theorem. Let $X_n$ be a martingale. If $H_n$ is predictable and each $H_n$ is
bounded, then $(H \cdot X)_n$ is a martingale.
Proof It is easy to check that $(H \cdot X)_n \in \mathcal{F}_n$. The boundedness of the $H_n$
implies $E|(H \cdot X)_n| < \infty$ for each $n$. With this established, we can compute
conditional expectations to conclude
$$E((H \cdot X)_{n+1} \mid \mathcal{F}_n) = (H \cdot X)_n + E(H_{n+1}(X_{n+1} - X_n) \mid \mathcal{F}_n)
= (H \cdot X)_n + H_{n+1} E(X_{n+1} - X_n \mid \mathcal{F}_n) = (H \cdot X)_n$$
since $H_{n+1} \in \mathcal{F}_n$ and $E(X_{n+1} - X_n \mid \mathcal{F}_n) = 0$. □
The last theorem can be interpreted as: you can't make money by gambling
on a fair game. This conclusion does not hold if we only assume that $H_n$ is
optional, that is, $H_n \in \mathcal{F}_n$, since then we can base our bet on the outcome of
the coin we are betting on.
Example 1.1. If $X_n$ is the symmetric simple random walk considered above
and $H_n = \xi_n$ then
$$(H \cdot X)_n = \sum_{m=1}^n \xi_m \cdot \xi_m = n$$
since $\xi_m^2 = 1$. □
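The contrast between predictable and optional integrands can be made concrete by enumerating all coin-flip sequences of a fixed length. The sketch below is illustrative only: `predictable_bets` is a hypothetical doubling strategy chosen because each bet depends only on earlier flips, while `optional_bets` is the choice $H_m = \xi_m$ from Example 1.1.

```python
from fractions import Fraction
from itertools import product

def transform(h, xi):
    """(H . X)_n = sum_m H_m (X_m - X_{m-1}), where X_m - X_{m-1} = xi_m."""
    return sum(hm * x for hm, x in zip(h, xi))

def predictable_bets(xi):
    """Hypothetical strategy: bet 1 first, then double after each loss.

    H_m depends only on xi_1, ..., xi_{m-1}, so it is predictable.
    """
    bets, stake = [], 1
    for x in xi:
        bets.append(stake)
        stake = 1 if x == 1 else 2 * stake
    return bets

def optional_bets(xi):
    """Example 1.1: H_m = xi_m peeks at the flip being bet on."""
    return list(xi)

n = 6
paths = list(product((1, -1), repeat=n))
weight = Fraction(1, len(paths))  # each path of fair flips is equally likely
mean_predictable = sum(weight * transform(predictable_bets(xi), xi) for xi in paths)
mean_optional = sum(weight * transform(optional_bets(xi), xi) for xi in paths)
print(mean_predictable, mean_optional)
```

With the predictable strategy the expected winnings after $n$ flips are exactly 0, in line with Theorem (1.1). With $H_m = \xi_m$ the winnings equal $n$ on every path, so $(H \cdot X)_n$ is not a martingale.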
In continuous time, we still want the metatheorem "you can't make money
gambling on a fair game" to hold, i.e., we want our integrals to be martingales.
However, since the present (t) and past (< t) are not separated, the definition
of the class of allowable integrands is more subtle. We will begin by considering
a simple example that indicates one problem that must be dealt with.
Example 1.2. Let $(\Omega, \mathcal{F}, P)$ be a probability space on which there is defined
a random variable $T$ with $P(T \le t) = t$ for $0 \le t \le 1$ and an independent
random variable $\xi$ with $P(\xi = 1) = P(\xi = -1) = 1/2$. Let
$$X_t = \begin{cases} 0 & t < T \\ \xi & t \ge T \end{cases}$$
and let $\mathcal{F}_t = \sigma(X_s : s \le t)$. In words we wait until time $T$ and then flip a coin.
$X_t$ is a martingale with respect to $\mathcal{F}_t$. However, if we define the stochastic
integral $I_t = \int_0^t X_s\, dX_s$ to be the ordinary Lebesgue-Stieltjes integral then
$$I_1 = \int_0^1 X_s\, dX_s = X_T \cdot \xi = \xi^2 = 1$$
To check this note that the measure $dX_s$ corresponds to a mass of size $\xi$ at $T$
and hence the integral is $\xi$ times the value there. Noting now that $I_0 = 0$ while
$I_1 = 1$ we see that $I_t$ is not a martingale. □
The problem with the last example is the same as the problem with the
one in discrete time — our bet can depend on the outcome of the event we
are betting on. Again there is a gambling interpretation that illustrates what
is wrong. Consider the game of roulette. After the wheel is spun and the ball
is rolled, people can bet at any time before (<) the ball comes to rest but not
after (≥). One way of guaranteeing that our bet be made strictly before T,
i.e., a sufficient condition, is to require that the amount of money we have bet
at time t is left continuous. This implies, for instance, that we cannot react
instantaneously to take advantage of a jump in the process we are betting on.
The simplest left continuous integrand we can imagine is made by picking $a < b$, $C \in \mathcal{F}_a$, and setting
$$(*) \qquad H(s,\omega) = C(\omega) \, 1_{(a,b]}(s)$$
In words, we buy $C(\omega)$ shares of stock at time a based on our knowledge then, i.e., $C \in \mathcal{F}_a$. We hold them to time b and then sell them all. Clearly, the examples in (*) should be allowable integrands; one should be able to add two or more of these, and take limits. To encompass the possibilities in the previous sentence, we let $\Pi$ be the smallest σ-field containing all sets of the form $(a,b] \times A$ where $A \in \mathcal{F}_a$. The $\Pi$ we have just defined is called the predictable σ-field. We will demand that our integrands H are measurable w.r.t. $\Pi$.
In the previous paragraph, we took the "bottom-up" approach to the definition of $\Pi$. That is, we started with some simple examples and then extended the class of integrands by taking sums and limits. For the rest of the section, we will take a "top-down" approach. We will start with some natural requirements and then add more until we arrive at $\Pi$. The descending path is more confusing since it involves four definitions that only differ in subtle ways. However, the reader need not study this material in detail. We will only use the predictable σ-field. The other definitions are included only to allow the reader to make the connection with other treatments and to indirectly make the point that the measure theoretic questions associated with continuous time processes can be quite difficult.
Let $\mathcal{F}_t$ be a right continuous filtration. The first and most intuitive
concept of "depending on the past behavior of the stock and not on the future
performance" is the following:
$H(s,\omega)$ is said to be adapted if for each t we have $H_t \in \mathcal{F}_t$.
We encountered this notion in our discussion of the Markov property in Chapter 1. In words, it says that the value at time t can be determined from the information we have at time t. The definition above, while intuitive, is not strong enough. We are dealing with a function defined on a product space $[0,\infty) \times \Omega$, so we need to worry about measurability as a function of the two variables.
H is said to be progressively measurable if for each t the mapping $(s,\omega) \to H(s,\omega)$ from $[0,t] \times \Omega$ to $\mathbf{R}$ is $\mathcal{R}_t \times \mathcal{F}_t$ measurable, where $\mathcal{R}_t$ denotes the Borel subsets of $[0,t]$.
This is a reasonable definition. However, the "modern" approach is to use a slightly different definition that gives us a slightly smaller σ-field.
Let $\Lambda$ be the σ-field of subsets of $[0,\infty) \times \Omega$ that is generated by the adapted processes that are right continuous and have left limits, i.e., the smallest σ-field which makes all of these processes measurable. A process H is said to be optional if $H(s,\omega)$ is measurable w.r.t. $\Lambda$.
According to Dellacherie and Meyer (1978) page 122, the optional σ-field is contained in the progressive σ-field and in the case of the natural filtration of Brownian motion the inclusion is strict. The subtle distinction between the last two σ-fields is not important for us. The only purpose here for the last two definitions is to prepare for the next one.
Let $\Pi'$ be the σ-field of subsets of $[0,\infty) \times \Omega$ that is generated by the left continuous adapted processes. A process H is said to be predictable if $H(s,\omega) \in \Pi'$.
As we will now show, the new definition of predictable is the same as the old one. We have added the ' only to make the next statement and proof possible.
(1.2) Theorem. $\Pi = \Pi'$.
Proof Since all the processes used to define $\Pi$ are left continuous, we have $\Pi \subset \Pi'$. To argue the other inclusion, let $H(s,\omega)$ be adapted and left continuous and let $H^n(s,\omega) = H(m2^{-n},\omega)$ for $m2^{-n} < s \le (m+1)2^{-n}$. Clearly $H^n \in \Pi$. Further, since H is left continuous, $H^n(s,\omega) \to H(s,\omega)$ as $n \to \infty$. □
The distinction between the optional and predictable σ-fields is not important for Brownian motion since in that case the two σ-fields coincide. Our last fact is that, in general, $\Pi \subset \Lambda$.
Exercise 1.1. Show that if $H(s,\omega) = 1_{(a,b]}(s) 1_A(\omega)$ where $A \in \mathcal{F}_a$, then H is the limit of a sequence of optional processes; therefore, H is optional and $\Pi \subset \Lambda$.
2.2. Integrators: Continuous Local Martingales
In Section 2.1 we described the class of integrands that we will consider: the predictable processes. In this section, we will describe our integrators: continuous local martingales. Continuous, of course, means that for almost every $\omega$, the sample path $s \to X_s(\omega)$ is continuous. To define local martingale we need some notation. If T is a nonnegative random variable and $Y_t$ is any process we define
$$Y_t^T = \begin{cases} Y_{T \wedge t} & \text{on } \{T > 0\} \\ 0 & \text{on } \{T = 0\} \end{cases}$$
(2.1) Definition. $X_t$ is said to be a local martingale (w.r.t. $\{\mathcal{F}_t, t \ge 0\}$) if there are stopping times $T_n \uparrow \infty$ so that $X_t^{T_n}$ is a martingale (w.r.t. $\{\mathcal{F}_{t \wedge T_n} : t \ge 0\}$). The stopping times $T_n$ are said to reduce X.
We need to set $X_t^T \equiv 0$ on $\{T = 0\}$ to deal with the fact that $X_0$ need not be integrable. In most of our concrete examples $X_0$ is a constant and we can take $T_1 > 0$ a.s. However, the more general definition is convenient in a number of situations: for example (i) below and the definition of the variance process in (3.1).
In the same way we can define local submartingale, locally bounded, locally of bounded variation, etc. In general, we say that a process Y is locally A if there is a sequence of stopping times $T_n \uparrow \infty$ so that the stopped process $Y_t^{T_n}$ has property A. (Of course, strictly speaking this means we should say locally a submartingale but we will continue to use the other term.)
Why local martingales?
There are several reasons for working with local martingales rather than
with martingales.
(i) This frees us from worrying about integrability. For example, let $X_t$ be a martingale with continuous paths, and let $\varphi$ be a convex function. Then $\varphi(X_t)$ is always a local submartingale (see Exercise 2.3). However, we can conclude that $\varphi(X_t)$ is a submartingale only if we know $E|\varphi(X_t)| < \infty$ for each t, a fact that may be either difficult to check or false in some cases.
(ii) Often we will deal with processes defined on a random time interval $[0,\tau)$. If $\tau < \infty$, then the concept of martingale is meaningless, since for large t $X_t$ is not defined on the whole space. However, it is trivial to define a local martingale: there are stopping times $T_n \uparrow \tau$ so that ...
(iii) Since most of our theorems will be proved by introducing stopping times $T_n$ to reduce the problem to a question about nice martingales, the proofs are no harder for local martingales defined on a random time interval than for ordinary martingales.
Reason (iii) is more than just a feeling. There is a construction that makes it almost a theorem. Let $X_t$ be a local martingale defined on $[0,\tau)$ and let $T_n \uparrow \tau$ be a sequence of stopping times that reduces X. Let $T_0 = 0$, suppose $T_1 > 0$ a.s., and for $k \ge 1$ let
$$\gamma(t) = \begin{cases} t - (k-1) & T_{k-1} + (k-1) \le t \le T_k + (k-1) \\ T_k & T_k + (k-1) \le t \le T_k + k \end{cases}$$
To understand this definition it is useful to write it out for k = 1, 2, 3:
$$\gamma(t) = \begin{cases} t & [0, T_1] \\ T_1 & [T_1, T_1 + 1] \\ t - 1 & [T_1 + 1, T_2 + 1] \\ T_2 & [T_2 + 1, T_2 + 2] \\ t - 2 & [T_2 + 2, T_3 + 2] \\ T_3 & [T_3 + 2, T_3 + 3] \end{cases}$$
In words, the time change expands $[0,\tau)$ onto $[0,\infty)$ by waiting one unit of time each time a $T_n$ is encountered. Of course, strictly speaking $\gamma$ compresses $[0,\infty)$ onto $[0,\tau)$ and this is what allows $X_{\gamma(t)}$ to be defined for all $t \ge 0$. The reason for our fascination with the time change can be explained by:
(2.2) Theorem. $\{X_{\gamma(t)}, \mathcal{F}_{\gamma(t)}, t \ge 0\}$ is a martingale.
First we need to show:
(2.3) The Optional Stopping Theorem. Let X be a continuous local martingale. If $S \le T$ are stopping times and $X_{T \wedge t}$ is a uniformly integrable martingale then $E(X_T \mid \mathcal{F}_S) = X_S$.
Proof (7.4) in Chapter 4 of Durrett (1995) shows that if $L \le M$ are stopping times and $Y_{M \wedge n}$ is a uniformly integrable martingale w.r.t. $\mathcal{G}_n$ then
$$E(Y_M \mid \mathcal{G}_L) = Y_L$$
To extend the result from discrete to continuous time let $S_n = ([2^n S] + 1)/2^n$. Applying the discrete time result to the uniformly integrable martingale $Y_m = X_{T \wedge m2^{-n}}$ with $L = 2^n S_n$ and $M = \infty$ we see that
$$E(X_T \mid \mathcal{F}_{S_n}) = X_{T \wedge S_n}$$
Letting $n \to \infty$ and using the dominated convergence theorem for conditional expectations ((5.9) in Chapter 4 of Durrett (1995)) the result follows. □
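Proof of (2.2) Let $n = [t] + 1$. Since $\gamma(t) \le T_n \wedge n$, using the optional stopping theorem, (2.3), gives $X_{\gamma(t)} = E(X_{T_n \wedge n} \mid \mathcal{F}_{\gamma(t)})$. Taking conditional expectation with respect to $\mathcal{F}_{\gamma(s)}$ we get
$$E(X_{\gamma(t)} \mid \mathcal{F}_{\gamma(s)}) = E(X_{T_n \wedge n} \mid \mathcal{F}_{\gamma(s)}) = X_{\gamma(s)}$$
proving the desired result. □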
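The time change $\gamma$ used in (2.2) is easy to evaluate on a single sample point once the values of the stopping times are known. The sketch below assumes a finite list of hypothetical values $T_1 < T_2 < T_3$ (the name `gamma` and the numbers are illustrative only) and prints $\gamma(t)$ next to $T_n \wedge n$ with $n = [t] + 1$, the bound used in the proof.

```python
import math

def gamma(t, stops):
    """Time change gamma(t) for stopping-time values stops = [T_1, T_2, ...].

    With T_0 = 0: gamma(t) = t - (k-1) on [T_{k-1} + (k-1), T_k + (k-1)]
    and gamma(t) = T_k on [T_k + (k-1), T_k + k].
    """
    for k, t_k in enumerate(stops, start=1):
        if t <= t_k + (k - 1):
            return t - (k - 1)   # clock runs at unit speed
        if t <= t_k + k:
            return t_k           # clock waits one unit of time at T_k
    raise ValueError("t lies beyond the last stopping time supplied")

stops = [0.5, 1.25, 4.0]         # hypothetical values of T_1 < T_2 < T_3
for t in (0.3, 1.0, 1.6, 2.9):
    n = math.floor(t) + 1
    print(t, gamma(t, stops), min(stops[n - 1], n))
```

Only finitely many stopping times are supplied here, so the function raises an error for t beyond $T_3 + 3$ rather than guessing.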
The next result is an example of the simplifications that come from
assuming local martingales are continuous.
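(2.4) Theorem. If X is a continuous local martingale, we can always take the sequence which reduces X to be $T_n = \inf\{t : |X_t| > n\}$ or any other sequence $T'_n \le T_n$ that has $T'_n \uparrow \infty$ as $n \uparrow \infty$.
Proof Let $S_n$ be a sequence that reduces X. If $s < t$, then applying the optional stopping theorem to $X^{S_n}$ at times $r = s \wedge T'_m$ and $t \wedge T'_m$ gives
$$E(X(t \wedge T'_m \wedge S_n) 1_{(S_n > 0)} \mid \mathcal{F}(s \wedge T'_m \wedge S_n)) = X(s \wedge T'_m \wedge S_n) 1_{(S_n > 0)}$$
Multiplying by $1_{(T'_m > 0)} \in \mathcal{F}_0 \subset \mathcal{F}(s \wedge T'_m \wedge S_n)$ we get
$$E(X(t \wedge T'_m \wedge S_n) 1_{(S_n > 0, T'_m > 0)} \mid \mathcal{F}(s \wedge T'_m \wedge S_n)) = X(s \wedge T'_m \wedge S_n) 1_{(S_n > 0, T'_m > 0)}$$
As $n \uparrow \infty$, $\mathcal{F}(s \wedge T'_m \wedge S_n) \uparrow \mathcal{F}(s \wedge T'_m)$, and
$$X(r \wedge T'_m \wedge S_n) 1_{(S_n > 0, T'_m > 0)} \to X(r \wedge T'_m) 1_{(T'_m > 0)}$$
for all $r \ge 0$, and $|X(r \wedge T'_m \wedge S_n)| 1_{(S_n > 0, T'_m > 0)} \le m$, so it follows from the dominated convergence theorem for conditional expectations (see (5.9) in Chapter 4 of Durrett (1995)) that
$$E(X(t \wedge T'_m) 1_{(T'_m > 0)} \mid \mathcal{F}(s \wedge T'_m)) = X(s \wedge T'_m) 1_{(T'_m > 0)}$$
proving the desired result. □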
In our definition of local martingale in (2.1), we assumed that $X_t^{T_n}$ is a martingale (w.r.t. $\{\mathcal{F}_{t \wedge T_n}, t \ge 0\}$). We did this with the proof of (2.4) in mind. The next exercise shows that we get the same definition if we assume $X_t^{T_n}$ is a martingale (w.r.t. $\{\mathcal{F}_t, t \ge 0\}$).
Exercise 2.1. Let S be a stopping time. Then $X_t^S$ is a martingale w.r.t. $\mathcal{F}_{t \wedge S}$, $t \ge 0$, if and only if it is a martingale w.r.t. $\mathcal{F}_t$, $t \ge 0$.
In the first version of this book I said
"You should think of a local martingale as something
that would be a martingale if it had $E|X_t| < \infty$."
As many people have pointed out to me, there is a famous example which shows
that this is WRONG.
Example 2.1. Let $B_t$ be a Brownian motion in dimension $d \ge 3$. In Section 3.1 we will see that $X_t = 1/|B_t|^{d-2}$ is a local martingale. Now
$$E_x |X_t|^p = \int (2\pi t)^{-d/2} e^{-|y-x|^2/2t} \, |y|^{-(d-2)p} \, dy < \infty$$
if and only if $p < d/(d-2)$ since (i) there is no trouble with the convergence for large y and (ii) ignoring $(2\pi t)^{-d/2} e^{-|y-x|^2/2t}$ which is bounded near 0 and changing to polar coordinates we have
$$\int_0^1 r^{-(d-2)p} \, r^{d-1} \, dr < \infty$$
if and only if $-(d-2)p + (d-1) > -1$, i.e., $p < d/(d-2)$.
To see that $X_t$ is not a martingale we begin by noting that $(B_t - x) =_d t^{1/2}(B_1 - x)$ under $P_x$ so as $t \to \infty$, $|B_t| \to \infty$ and $X_t \to 0$ in probability. To show that $E_x X_t \to 0$ we observe that for any $R < \infty$
$$E_x X_t \le (2\pi t)^{-d/2} \int_{|y| \le R} |y|^{-(d-2)} \, dy + R^{-(d-2)}$$
The integral is convergent so $\limsup_{t \to \infty} E_x X_t \le R^{-(d-2)}$. Since R is arbitrary $E_x X_t \to 0$. Since $E_x X_0 > 0$ it follows that $E_x X_t$ is not constant and hence $X_t$ is not a martingale. □
An even worse counterexample is provided by
Exercise 2.2. In Section 3.1 we will learn that if $B_t$ is a two dimensional Brownian motion then $X_t = \log |B_t|$ is a local martingale. Show $E|X_t|^p < \infty$ for any $p < \infty$ but $\lim_{t \to \infty} EX_t = \infty$ so X is not a martingale.
The last example may leave the reader with the impression that no amount
of integrability will guarantee that a local martingale is an honest martingale,
but as the next very useful result shows, this is a false impression.
(2.5) Theorem. If $X_t$ is a local submartingale and
$$E\left( \sup_{0 \le s \le t} |X_s| \right) < \infty$$
for each t then $X_t$ is a submartingale.
Proof Clearly our assumption implies $E|X_t| < \infty$ for each t so all we have to check is that $E(X_t \mid \mathcal{F}_s) \ge X_s$. To do this, we note that if $T_n$ is a sequence of stopping times that reduces X then
$$E\left( X_t^{T_n} \mid \mathcal{F}_{s \wedge T_n} \right) \ge X_s^{T_n}$$
We then let $n \to \infty$ and apply the dominated convergence theorem for conditional expectations. See (5.9) in Chapter 4 of Durrett (1995). □
As usual, multiplying by −1 shows that the last result holds for supermartingales, and once it is true for super- and sub- it is true for martingales (by applying the two other results). The following special case will be important:
(2.6) Corollary. A bounded local martingale is a martingale.
Here and in what follows when we say that a process $X_t$ is bounded we mean there is a constant M so that with probability 1, $|X_t| \le M$ for all $t \ge 0$.
In some circumstances we will need a convergence theorem for local
martingales. The following simple result is sufficient for our needs.
(2.7) Theorem. Let $X_t$ be a local martingale on $[0,\tau)$. If
$$E\left( \sup_{0 \le s < \tau} |X_s| \right) < \infty$$
then $X_\tau = \lim_{t \uparrow \tau} X_t$ exists a.s. and $EX_0 = EX_\tau$.
Proof Let $\gamma$ be the time change of (2.2) and let $Y_t = X_{\gamma(t)}$. $Y_t$ is a martingale. The upcrossing inequality and the proof of the martingale convergence theorem given in Section 4.2 of Durrett (1995) generalize in a straightforward way to continuous time and allow us to conclude that $Y_\infty = \lim_{t \uparrow \infty} Y_t$ exists a.s. and $X_\tau = \lim_{t \uparrow \tau} X_t$ exists a.s. To prove the last conclusion we note that $EY_0 = EY_t$ then use the a.s. convergence and the dominated convergence theorem to conclude $EX_0 = EY_0 = EY_\infty = EX_\tau$. □
Exercise 2.3. Let $X_t$ be a continuous local martingale and let $\varphi$ be a convex function. Then $\varphi(X_t)$ is a local submartingale.
Exercise 2.4. Let X be a continuous local martingale and let $R < \infty$ be a stopping time. Then $Y_s = X_{R+s}$ is a local martingale with respect to $\mathcal{G}_s =$
Exercise 2.5. Show that if X is a continuous local martingale with $X_t \ge 0$ and $EX_0 < \infty$ then $X_t$ is a supermartingale.
2.3. Variance and Covariance Processes
The next definition may look mysterious but it is very important.
(3.1) Theorem. If $X_t$ is a continuous local martingale, then we define the variance process $\langle X \rangle_t$ to be the unique continuous predictable increasing process $A_t$ that has $A_0 \equiv 0$ and makes $X_t^2 - A_t$ a local martingale.
To warm up for the proof of (3.1) we will prove a result in discrete time.
(3.2) Theorem. Suppose $X_n$ is a martingale with $EX_n^2 < \infty$ for all n. Then there is a unique predictable process $A_n$ with $A_0 \equiv 0$ so that $X_n^2 - A_n$ is a martingale. Furthermore, $n \to A_n$ is increasing.
Proof Let $A_0 = 0$ and define for $n \ge 1$
$$A_n = A_{n-1} + E(X_n^2 \mid \mathcal{F}_{n-1}) - X_{n-1}^2$$
From the definition, it is immediate that $A_n \in \mathcal{F}_{n-1}$, $A_n$ is increasing (since $X_n^2$ is a submartingale), and
$$E(X_n^2 - A_n \mid \mathcal{F}_{n-1}) = E(X_n^2 \mid \mathcal{F}_{n-1}) - A_n = X_{n-1}^2 - A_{n-1}$$
so A has the desired properties. To see that A is unique, observe that if B is another process with desired properties, then $A_n - B_n$ is a martingale and $A_n - B_n \in \mathcal{F}_{n-1}$. Therefore
$$A_n - B_n = E(A_n - B_n \mid \mathcal{F}_{n-1}) = A_{n-1} - B_{n-1}$$
and it follows by induction that $A_n - B_n = A_0 - B_0 = 0$. □
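The recursion in the proof of (3.2) can be run exactly on a small coin-flip tree. The sketch below is only an illustration under assumptions that are not in the text: the martingale is built from fair ±1 flips with the predictable bet size $H_m = 1 +$ (number of +1 flips before time m), and the names `x_value` and `variance_process` are hypothetical. Conditional expectations given $\mathcal{F}_{n-1}$ are computed by averaging over the two possible next flips.

```python
from itertools import product

def x_value(path):
    """Martingale X_n along a path of fair +/-1 flips.

    The bet size H_m = 1 + (number of +1 flips before time m) is predictable.
    """
    x, heads = 0, 0
    for xi in path:
        x += (1 + heads) * xi
        if xi == 1:
            heads += 1
    return x

def variance_process(path):
    """[A_0, ..., A_n] from A_n = A_{n-1} + E(X_n^2 | F_{n-1}) - X_{n-1}^2."""
    a = [0]
    for n in range(1, len(path) + 1):
        prefix = path[:n - 1]
        # E(X_n^2 | F_{n-1}): average over the two equally likely next flips
        cond_mean_sq = sum(x_value(prefix + (xi,)) ** 2 for xi in (1, -1)) / 2
        a.append(a[-1] + cond_mean_sq - x_value(prefix) ** 2)
    return a

for path in product((1, -1), repeat=3):
    print(path, x_value(path), variance_process(path))
```

In this example $A_n$ depends only on the first n − 1 flips, which is the predictability asserted in (3.2).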
The key to the uniqueness may be summarized as "any predictable
discrete time martingale is constant." Since Brownian motion is a predictable
martingale, the last statement fails in continuous time and we need another
assumption.
(3.3) Theorem. Any continuous local martingale Xt that is predictable and
locally of bounded variation is constant (in time).
Proof By subtracting $X_0$, we may suppose that $X_0 \equiv 0$ and prove that X is identically 0. Let $V_t$ be the variation of X on $[0,t]$. We leave it to the reader to check
Exercise 3.1. If X is continuous and locally of bounded variation then $t \to V_t$ is continuous.
Define $S = \inf\{s : V_s \ge K\}$. Since $t \le S$ implies $|X_t| \le K$, (2.4) implies that the stopped process $M_t = X(t \wedge S)$ is a bounded martingale. Now if $s < t$ using well known properties of conditional expectation (see e.g., (1.3) and other results in Chapter 4 of Durrett (1995)) we have
$$(3.4) \qquad \begin{aligned} E((M_t - M_s)^2 \mid \mathcal{F}_s) &= E(M_t^2 \mid \mathcal{F}_s) - 2M_s E(M_t \mid \mathcal{F}_s) + M_s^2 \\ &= E(M_t^2 \mid \mathcal{F}_s) - M_s^2 = E(M_t^2 - M_s^2 \mid \mathcal{F}_s) \end{aligned}$$
an equation that we will refer to as the orthogonality of martingale increments since the key to its validity is the fact that $E(M_s(M_t - M_s) \mid \mathcal{F}_s) = 0$ (and hence $E(M_s(M_t - M_s)) = 0$).
Letting $0 = t_0 < t_1 \ldots < t_n = t$ be a subdivision of $[0,t]$, the orthogonality of martingale increments, the inequality $\sum_i a_i^2 \le (\sum_i |a_i|) \sup_j |a_j|$, and the fact that $V_{t \wedge S} \le K$ imply
$$\begin{aligned} EM_t^2 &= E \sum_{m=1}^{n} (M_{t_m} - M_{t_{m-1}})^2 \\ &\le E\left( V_{t \wedge S} \sup_m |M_{t_m} - M_{t_{m-1}}| \right) \\ &\le K E \sup_m |M_{t_m} - M_{t_{m-1}}| \end{aligned}$$
If we take a sequence of partitions $\Delta_n = \{0 = t_0^n < t_1^n \ldots < t_{k(n)}^n = t\}$ in which the mesh $|\Delta_n| = \sup_m |t_m^n - t_{m-1}^n| \to 0$ then continuity implies $\sup_m |M_{t_m^n} - M_{t_{m-1}^n}| \to 0$. Since the $\sup_m \le 2K$ the bounded convergence theorem implies $E \sup_m |M_{t_m^n} - M_{t_{m-1}^n}| \to 0$. This shows $EM_t^2 = 0$ so $M_t = 0$ a.s. In the last conclusion t is arbitrary, so with probability one we have $M_t = 0$ for all rational t. The desired result now follows from continuity. □
With (3.3) established we can do the
Proof of uniqueness in (3.1) Suppose $A_t$ and $A'_t$ are two processes with the desired properties. Then $A_t - A'_t$ is a continuous local martingale that is locally of bounded variation and hence must be constant. Since $A_0 - A'_0 = 0$ it follows that $A_t = A'_t$ for all t. □
Proof of existence in (3.1) when Xt is a bounded martingale The
existence of A is considerably more complicated than its uniqueness. To inspire
the reader for the battle to come, we would like to point out that (i) we are
about to prove the special case that we need of the celebrated Doob-Meyer
decomposition and (ii) the construction will provide some useful information,
e.g., (3.6). Though (3.1) and (3.8) are quite important, the details of their
proofs are not, so the squeamish reader can skip ahead to the boldfaced "End
of Existence Proof" five pages ahead.
Given a partition $\Delta = \{0 = t_0 < t_1 < t_2 \ldots\}$ with $\lim_n t_n = \infty$, we let $k(t) = \sup\{k : t_k < t\}$ be the index of the last point before time t. (To avoid confusion later, we emphasize now that k(t) is a number, not a random variable.) We next define for a process X, an approximate quadratic variation by
$$Q_t^\Delta(X) = \sum_{k=1}^{k(t)} (X_{t_k} - X_{t_{k-1}})^2 + (X_t - X_{t_{k(t)}})^2$$
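As a rough illustration of the definition, the approximate quadratic variation can be evaluated on a deterministic path. The sketch below (the function name is hypothetical, and the path is an ordinary function of time rather than a martingale) shows that for a smooth path the sum shrinks with the mesh, which is the phenomenon behind (3.3): a continuous path of bounded variation cannot carry any quadratic variation.

```python
import math

def approx_quadratic_variation(path, partition, t):
    """Q_t^Delta(X) for a path given as a function s -> X_s.

    partition: increasing list t_0 = 0 < t_1 < ...; k(t) = sup{k : t_k < t}.
    """
    q = 0.0
    last = partition[0]
    for t_k in partition[1:]:
        if t_k >= t:              # only partition points strictly before t
            break
        q += (path(t_k) - path(last)) ** 2
        last = t_k
    # final term (X_t - X_{t_{k(t)}})^2
    return q + (path(t) - path(last)) ** 2

for n in (10, 100, 1000):
    grid = [k / n for k in range(n + 1)]
    print(n, approx_quadratic_variation(math.sin, grid, 1.0))
```

For the path sin each increment is at most the mesh in absolute value, so the sum over [0,1] is bounded by the mesh itself.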
If the reader recalls what we are trying to define (see (3.1)) then the reason for the definition is explained by
(a) Lemma. If $X_t$ is a bounded continuous martingale then $X_t^2 - Q_t^\Delta(X)$ is a martingale.
Proof To prove this, we first note that reasoning as in the proof of (3.4) one can conclude that if $r < s < t$ then
$$\begin{aligned} E((X_t - X_r)^2 \mid \mathcal{F}_s) &= E((X_t - X_s)^2 \mid \mathcal{F}_s) + 2(X_s - X_r) E(X_t - X_s \mid \mathcal{F}_s) + (X_s - X_r)^2 \\ &= E((X_t - X_s)^2 \mid \mathcal{F}_s) + (X_s - X_r)^2 \end{aligned}$$
since $E(X_t - X_s \mid \mathcal{F}_s) = 0$.
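Writing $Q_t^\Delta = Q_s^\Delta + (Q_t^\Delta - Q_s^\Delta)$ and working out the difference,
$$\begin{aligned} Q_t^\Delta - Q_s^\Delta &= \sum_{k=1}^{k(t)} (X_{t_k} - X_{t_{k-1}})^2 + (X_t - X_{t_{k(t)}})^2 - \sum_{k=1}^{k(s)} (X_{t_k} - X_{t_{k-1}})^2 - (X_s - X_{t_{k(s)}})^2 \\ &= (X_{t_{k(s)+1}} - X_{t_{k(s)}})^2 - (X_s - X_{t_{k(s)}})^2 + \sum_{k=k(s)+2}^{k(t)} (X_{t_k} - X_{t_{k-1}})^2 + (X_t - X_{t_{k(t)}})^2 \end{aligned}$$
The next step is to take conditional expectation with respect to $\mathcal{F}_s$ and use the first formula with $r = t_{k(s)}$ and $t = t_{k(s)+1}$. To neaten up the result, define a sequence $u_n$, $k(s) \le n \le k(t) + 1$, by letting $u_{k(s)} = s$, $u_i = t_i$ for $k(s) < i \le k(t)$, and $u_{k(t)+1} = t$. Then
$$\begin{aligned} E(X_t^2 - Q_t^\Delta(X) \mid \mathcal{F}_s) &= E(X_t^2 \mid \mathcal{F}_s) - Q_s^\Delta(X) - E\left( \sum_{i=k(s)+1}^{k(t)+1} (X_{u_i} - X_{u_{i-1}})^2 \,\Big|\, \mathcal{F}_s \right) \\ &= X_s^2 - Q_s^\Delta(X) \end{aligned}$$
where the second equality follows from (3.4). □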
Looking at (a) and noticing that $Q_t^\Delta(X)$ is increasing except for the last term, the reader can probably leap to the conclusion that we will construct the process $A_t$ desired in (3.1) by taking a sequence of partitions with mesh going to 0 and proving that the limit exists. Lemma (c) is the heart (of darkness) of the proof. To prepare for its proof we note that using (a), taking expected value and using (3.4) gives
$$(b) \qquad E(Q_t^\Delta(X) - Q_s^\Delta(X) \mid \mathcal{F}_s) = E(X_t^2 - X_s^2 \mid \mathcal{F}_s) = E((X_t - X_s)^2 \mid \mathcal{F}_s)$$
To inspire you for the proof of (c), we promise that once (c) is established the rest of the proof of (3.1) is routine.
(c) Lemma. Let $X_t$ be a bounded continuous martingale. Fix $r > 0$ and let $\Delta_n$ be a sequence of partitions $0 = t_0^n < t_1^n \ldots < t_{k_n}^n = r$ of $[0,r]$ with mesh $|\Delta_n| = \sup_k |t_k^n - t_{k-1}^n| \to 0$. Then $Q_r^{\Delta_n}(X)$ converges to a limit in $L^2$.
Proof If $\Delta$ and $\Delta'$ are two partitions, we call $\Delta\Delta'$ the partition obtained by taking all the points in $\Delta$ and in $\Delta'$. If we apply (a) twice and take differences we see that $Y_t = Q_t^\Delta(X) - Q_t^{\Delta'}(X)$ is a martingale. By (a), $Y_t^2 - Q_t^{\Delta\Delta'}(Y)$ is a martingale, so
$$E(Q_r^\Delta(X) - Q_r^{\Delta'}(X))^2 = EY_r^2 = EQ_r^{\Delta\Delta'}(Y)$$
For reasons that will become clear in a moment, we will drop the argument from $Q^\Delta$ when it is X and we will drop the r when we want to refer to the process $t \to Q_t^\Delta$. Since
$$(3.5) \qquad (a + b)^2 \le 2a^2 + 2b^2 \quad \text{for any real numbers } a \text{ and } b$$
(since $2a^2 + 2b^2 - (a + b)^2 = (a - b)^2 \ge 0$) we have
$$Q_r^{\Delta\Delta'}(Y) \le 2Q_r^{\Delta\Delta'}(Q^\Delta) + 2Q_r^{\Delta\Delta'}(Q^{\Delta'})$$
Combining the last two results about Q we see that it is enough to prove that
(d) if $|\Delta| + |\Delta'| \to 0$ then $EQ_r^{\Delta\Delta'}(Q^\Delta) \to 0$.
To do this let $s_k \in \Delta\Delta'$ and $t_j \in \Delta$ so that $t_j \le s_k < s_{k+1} \le t_{j+1}$. Recalling the definition of $Q^\Delta(X)$ and doing a little arithmetic we have
$$\begin{aligned} Q_{s_{k+1}}^\Delta - Q_{s_k}^\Delta &= (X_{s_{k+1}} - X_{t_j})^2 - (X_{s_k} - X_{t_j})^2 \\ &= (X_{s_{k+1}} - X_{s_k})(X_{s_{k+1}} + X_{s_k} - 2X_{t_j}) \end{aligned}$$
Summing the squares gives
$$Q_r^{\Delta\Delta'}(Q^\Delta) \le Q_r^{\Delta\Delta'}(X) \, \sup_k (X_{s_{k+1}} + X_{s_k} - 2X_{t_{j(k)}})^2$$
where $j(k) = \sup\{j : t_j \le s_k\}$. By the Cauchy-Schwarz inequality
$$EQ_r^{\Delta\Delta'}(Q^\Delta) \le \left\{ E Q_r^{\Delta\Delta'}(X)^2 \right\}^{1/2} \left\{ E \sup_k (X_{s_{k+1}} + X_{s_k} - 2X_{t_{j(k)}})^4 \right\}^{1/2}$$
Whenever $|\Delta| + |\Delta'| \to 0$ the second factor goes to 0, since X is bounded and continuous, so it is enough to prove
(e) If $|X_t| \le M$ for all t then $EQ_r^{\Delta\Delta'}(X)^2 \le 12M^4$.
The value 12 is not important but it does make it clear that the bound does not depend on r, $\Delta$, or $\Delta'$.
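Proof of (e) If $\Gamma = \Delta\Delta'$ is the partition $0 = s_0 < s_1 \ldots < s_n = r$ then
$$\begin{aligned} Q_r^\Gamma(X)^2 &= \left( \sum_{m=1}^{n} (X_{s_m} - X_{s_{m-1}})^2 \right)^2 \\ (e.1) \qquad &= \sum_{m=1}^{n} (X_{s_m} - X_{s_{m-1}})^4 + 2 \sum_{m=1}^{n} (X_{s_m} - X_{s_{m-1}})^2 (Q_r^\Gamma(X) - Q_{s_m}^\Gamma(X)) \end{aligned}$$
To bound the first term on the right-hand side we note that if $|X_t| \le M$ for all t then some arithmetic and (3.4) imply
$$(e.2) \qquad \begin{aligned} E \sum_{m=1}^{n} (X_{s_m} - X_{s_{m-1}})^4 &\le (2M)^2 E \sum_{m=1}^{n} (X_{s_m} - X_{s_{m-1}})^2 \\ &= 4M^2 E \sum_{m=1}^{n} (X_{s_m}^2 - X_{s_{m-1}}^2) \\ &\le 4M^2 EX_r^2 \le 4M^4 \end{aligned}$$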
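To bound the second term we note that $(X_{s_m} - X_{s_{m-1}})^2 \in \mathcal{F}_{s_m}$ then use (b) and $|X_r - X_{s_m}| \le 2M$ to get
$$(e.3) \qquad \begin{aligned} E\left( (X_{s_m} - X_{s_{m-1}})^2 \{Q_r^\Gamma(X) - Q_{s_m}^\Gamma(X)\} \mid \mathcal{F}_{s_m} \right) &= (X_{s_m} - X_{s_{m-1}})^2 E\left( \{Q_r^\Gamma(X) - Q_{s_m}^\Gamma(X)\} \mid \mathcal{F}_{s_m} \right) \\ &= (X_{s_m} - X_{s_{m-1}})^2 E\left( (X_r - X_{s_m})^2 \mid \mathcal{F}_{s_m} \right) \\ &\le (2M)^2 (X_{s_m} - X_{s_{m-1}})^2 \end{aligned}$$
Taking expected value in (e.3), summing over m, and using (3.4) as before, we get
$$(e.4) \qquad \begin{aligned} E \sum_{m=1}^{n} (X_{s_m} - X_{s_{m-1}})^2 (Q_r^\Gamma(X) - Q_{s_m}^\Gamma(X)) &\le 4M^2 E \sum_{m=1}^{n} (X_{s_m} - X_{s_{m-1}})^2 \\ &\le 4M^2 EX_r^2 \le 4M^4 \end{aligned}$$
Plugging this bound and (e.2) into (e.1) proves (e) which completes the proof of (d) and hence of (c). □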
It only remains to put the pieces together. Let $\Delta_n$ be the partition with points $k2^{-n}r$ with $0 \le k \le 2^n$. Since $Q_t^{\Delta_m} - Q_t^{\Delta_n}$ is a martingale (by (a)), using the $L^2$ maximal inequality (see e.g., (4.3) in Chapter 4 of Durrett (1995)) gives
$$(f) \qquad E\left( \sup_{t \le r} |Q_t^{\Delta_m} - Q_t^{\Delta_n}|^2 \right) \le 4E|Q_r^{\Delta_m} - Q_r^{\Delta_n}|^2$$
From this and (c) it follows that
(g) There is a subsequence so that $Q_t^{\Delta_{n(k)}} \to$ a limit $A_t$ uniformly on $[0,r]$.
Proof Since $Q_r^{\Delta_n}$ converges in $L^2$, it is a Cauchy sequence in $L^2$ and we can pick an increasing sequence n(k) so that for $m \ge n(k)$
$$E|Q_r^{\Delta_m} - Q_r^{\Delta_{n(k)}}|^2 \le 2^{-k}$$
Using (f) and Chebyshev's inequality it follows that
$$P\left( \sup_{t \le r} |Q_t^{\Delta_{n(k+1)}} - Q_t^{\Delta_{n(k)}}| \ge 1/k^2 \right) \le 4k^4 2^{-k}$$
Since the right-hand side is summable, the Borel-Cantelli lemma implies that the event inside the probability occurs only finitely many times and the desired result follows. □
By taking subsequences of subsequences and using a diagonal argument we can get uniform convergence on $[0,N]$ for all N. Since each $Q_t^\Delta$ is continuous, $A_t$ is as well. Because of the last term $(X_t - X_{t_{k(t)}})^2$, $Q_t^\Delta$ is not increasing. However, if $m \ge n$ then $k \to Q_{k2^{-n}r}^{\Delta_m}$ is increasing, so $k \to A_{k2^{-n}r}$ is increasing. Since n is arbitrary and $t \to A_t$ is continuous it follows that $t \to A_t$ is increasing.
The last detail for the case in which X is a bounded martingale is to check that $X_t^2 - A_t$ is a martingale. This follows from the next result (with p = 2).
(3.6) Lemma. Suppose that for each n, $Z_t^n$ is a martingale with respect to $\mathcal{F}_t$, and that for each t, $Z_t^n \to Z_t$ in $L^p$ where $p \ge 1$. Then $Z_t$ is a martingale.
Proof The martingale property implies that if $s < t$ then $E(Z_t^n \mid \mathcal{F}_s) = Z_s^n$. The right-hand side converges to $Z_s$ in $L^p$. To check that the left-hand side converges to $E(Z_t \mid \mathcal{F}_s)$ we note that linearity properties of and Jensen's inequality for conditional expectation (see e.g., (1.1a) and (1.1d) in Chapter 4 of Durrett (1995)) imply
$$\begin{aligned} E|E(Z_t^n \mid \mathcal{F}_s) - E(Z_t \mid \mathcal{F}_s)|^p &= E|E(Z_t^n - Z_t \mid \mathcal{F}_s)|^p \\ &\le E E(|Z_t^n - Z_t|^p \mid \mathcal{F}_s) = E|Z_t^n - Z_t|^p \to 0 \end{aligned}$$
Taking limits in $L^p$ it follows that $E(Z_t \mid \mathcal{F}_s) = Z_s$ and the proof is complete. □
Proof of existence in (3.1) when X is a local martingale Our first step in extending the result to local martingales is to prove a result about the quadratic variation of a stopped martingale. Once one remembers the definition of $Y_t^T$ given at the beginning of Section 2.2, the next result should be obvious: the quadratic variation does not increase after the process stops.
(3.7) Lemma. Let X be a bounded martingale, and T be a stopping time. Then $\langle X^T \rangle = \langle X \rangle^T$.
Proof By the optional stopping theorem $(X^T)^2 - \langle X \rangle^T$ is a local martingale so the result follows from uniqueness. □
Let $T_n$ be a sequence of stopping times increasing to $\infty$ so that
$$Y_t^n = X_{t \wedge T_n} \cdot 1_{(T_n > 0)} \text{ is a bounded martingale}$$
By the previous result there is a unique continuous predictable increasing process $A_t^n$ so that $(Y_t^n)^2 - A_t^n$ is a martingale. By (3.7) $A_t^n = A_t^{n+1}$ for $t \le T_n$ so we can unambiguously define $\langle X \rangle_t = A_t^n$ for $t \le T_n$. Clearly, $\langle X \rangle_t$ is continuous, predictable, and increasing. The definition implies $X_{T_n \wedge t}^2 \cdot 1_{(T_n > 0)} - \langle X \rangle_{T_n \wedge t}$ is a martingale so $X_t^2 - \langle X \rangle_t$ is a local martingale. □
End of Existence Proof
Finally, we have the following extension of (c) that will be useful later.
(3.8) Theorem. Let $X_t$ be a continuous local martingale. For every t and sequence of subdivisions $\Delta_n$ of $[0,t]$ with mesh $|\Delta_n| \to 0$ we have
$$\sup_{s \le t} |Q_s^{\Delta_n}(X) - \langle X \rangle_s| \to 0 \quad \text{in probability}$$
Proof Let $\delta, \epsilon > 0$. We can find a stopping time S so that $X_t^S$ is a bounded martingale and $P(S \le t) \le \delta$. It is clear that $Q^\Delta(X)$ and $Q^\Delta(X^S)$ coincide on $[0,S]$. From the definition and (3.7) it follows that $\langle X \rangle$ and $\langle X^S \rangle$ are equal on $[0,S]$. So we have
$$P\left( \sup_{s \le t} |Q_s^\Delta(X) - \langle X \rangle_s| > \epsilon \right) \le \delta + P\left( \sup_{s \le t} |Q_s^\Delta(X^S) - \langle X^S \rangle_s| > \epsilon \right)$$
Since $X^S$ is a bounded martingale (c) implies that the last term goes to 0 as $|\Delta| \to 0$. □
(3.9) Definition. If X and Y are two continuous local martingales, we let
$$\langle X, Y \rangle_t = \tfrac{1}{4} \left( \langle X + Y \rangle_t - \langle X - Y \rangle_t \right)$$
Remark. If X and Y are random variables with mean zero,
$$\begin{aligned} \mathrm{cov}(X, Y) = EXY &= \tfrac{1}{4} \left( E(X + Y)^2 - E(X - Y)^2 \right) \\ &= \tfrac{1}{4} \left( \mathrm{var}(X + Y) - \mathrm{var}(X - Y) \right) \end{aligned}$$
so it is natural, I think, to call $\langle X, Y \rangle_t$ the covariance of X and Y.
Given the definitions of $\langle X \rangle_t$ and $\langle X, Y \rangle_t$ the reader should not be surprised that if for a given partition $\Delta = \{0 = t_0 < t_1 < t_2 \ldots\}$ with $\lim_n t_n = \infty$, we let $k(t) = \sup\{k : t_k < t\}$ and define
$$Q_t^\Delta(X, Y) = \sum_{k=1}^{k(t)} (X_{t_k} - X_{t_{k-1}})(Y_{t_k} - Y_{t_{k-1}}) + (X_t - X_{t_{k(t)}})(Y_t - Y_{t_{k(t)}})$$
then we have
(3.10) Theorem. Let X and Y be continuous local martingales. For every t and sequence of partitions $\Delta_n$ of $[0,t]$ with mesh $|\Delta_n| \to 0$ we have
$$\sup_{s \le t} |Q_s^{\Delta_n}(X, Y) - \langle X, Y \rangle_s| \to 0 \quad \text{in probability}$$
Proof Since
$$Q_s^{\Delta_n}(X, Y) = \tfrac{1}{4} \left( Q_s^{\Delta_n}(X + Y) - Q_s^{\Delta_n}(X - Y) \right)$$
this follows immediately from the definition of $\langle X, Y \rangle$ and (3.8). □
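The identity used in the proof of (3.10) is purely algebraic, so it can be checked numerically on any pair of paths. The following sketch uses two arbitrary deterministic functions as stand-ins for sample paths; the function names and the choice of paths are assumptions made for illustration only.

```python
import math

def approx_covariation(x, y, partition, t):
    """Q_t^Delta(X, Y): sum of products of increments of two paths up to time t."""
    q, last = 0.0, partition[0]
    for t_k in partition[1:]:
        if t_k >= t:              # k(t) = sup{k : t_k < t}
            break
        q += (x(t_k) - x(last)) * (y(t_k) - y(last))
        last = t_k
    return q + (x(t) - x(last)) * (y(t) - y(last))

def polarized(x, y, partition, t):
    """(Q_t(X + Y) - Q_t(X - Y)) / 4, using Q_t(Z) = Q_t(Z, Z)."""
    plus = lambda s: x(s) + y(s)
    minus = lambda s: x(s) - y(s)
    return (approx_covariation(plus, plus, partition, t)
            - approx_covariation(minus, minus, partition, t)) / 4

x = lambda s: s * s - s           # illustrative stand-ins for sample paths
y = lambda s: math.sin(3 * s)
grid = [k / 8 for k in range(9)]
for t in (0.3, 0.9, 1.0):
    print(t, approx_covariation(x, y, grid, t), polarized(x, y, grid, t))
```

The two columns agree up to floating point rounding for every partition and every t, which is all the proof of (3.10) needs before invoking (3.8).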
Our next result is useful for computing (X, Y)t.
(3.11) Theorem. Suppose $X_t$ and $Y_t$ are continuous local martingales. $\langle X, Y \rangle_t$ is the unique continuous predictable process $A_t$ that is locally of bounded variation, has $A_0 = 0$, and makes $X_t Y_t - A_t$ a local martingale.
Proof From the definition, it is easy to see that
$$X_t Y_t - \langle X, Y \rangle_t = \tfrac{1}{4} \left[ (X_t + Y_t)^2 - \langle X + Y \rangle_t - \{ (X_t - Y_t)^2 - \langle X - Y \rangle_t \} \right]$$
is a local martingale. To prove uniqueness, observe that if $A_t$ and $A'_t$ are two processes with the desired properties, then $A_t - A'_t = (X_t Y_t - A'_t) - (X_t Y_t - A_t)$ is a continuous local martingale that is locally of bounded variation and hence $\equiv 0$ by (3.3). □
From (3.11) we get several useful properties of the covariance. In what follows, X, Y and Z are continuous local martingales. First since $X_t Y_t = Y_t X_t$,
$$\langle X, Y \rangle_t = \langle Y, X \rangle_t$$
Exercise 3.2. $\langle X + Y, Z \rangle_t = \langle X, Z \rangle_t + \langle Y, Z \rangle_t$.
Exercise 3.3. $\langle X - X_0, Z \rangle_t = \langle X, Z \rangle_t$.
Exercise 3.4. If a, b are real numbers then $\langle aX, bY \rangle_t = ab \langle X, Y \rangle_t$. Taking a = b, X = Y, and noting $\langle X, X \rangle_t = \langle X \rangle_t$ it follows that $\langle aX \rangle_t = a^2 \langle X \rangle_t$.
Exercise 3.5. $\langle X^T, Y^T \rangle = \langle X, Y \rangle^T$
It is also true that
$$\langle X, Y^T \rangle = \langle X, Y \rangle^T$$
Since we are lazy we will wait until Section 2.5 to prove this. We invite the reader to derive this directly from the definitions. The next two exercises prepare for the third which will be important in what follows:
Exercise 3.6. If $X_t$ is a bounded martingale then $X_t^2 - \langle X \rangle_t$ is a uniformly integrable martingale.
Exercise 3.7. If X is a bounded martingale and S is a stopping time then $Y_t = X_{S+t} - X_S$ is a martingale with respect to $\mathcal{F}_{S+t}$ and $\langle Y \rangle_t = \langle X \rangle_{S+t} - \langle X \rangle_S$.
Exercise 3.8. If $S \le T$ are stopping times and $\langle X \rangle_S = \langle X \rangle_T$, then X is constant on $[S,T]$.
Exercise 3.9. Conversely, if $S \le T$ are stopping times and X is constant on $[S,T]$ then $\langle X \rangle_S = \langle X \rangle_T$.
2.4. Integration w.r.t. Bounded Martingales
In this section we will explain how to integrate predictable processes w.r.t.
bounded continuous martingales. Once we establish the Kunita-Watanabe
inequality in Section 2.5, we will be able to treat a general continuous local
martingale in Section 2.6.
As in the case of the Lebesgue integral (i) we will begin with the simplest
functions and then take limits to get the general case, and (ii) the sequence of
steps used in defining the integral will also be useful in proving results later on.
Step 1: Basic Integrands
We say $H(s,\omega)$ is a basic predictable process if $H(s,\omega) = 1_{(a,b]}(s) C(\omega)$ where $C \in \mathcal{F}_a$. Let $\Pi_0$ = the set of basic predictable processes. If $H = 1_{(a,b]} C$ and X is continuous, then it is clear that we should define
$$\int H_s \, dX_s = C(\omega)(X_b(\omega) - X_a(\omega))$$
Here we restrict our attention to integrating over $[0,\infty)$, since once we know how to do that then we can define the integral over $[0,t]$ by
$$(H \cdot X)_t = \int_0^t H_s \, dX_s = \int H_s 1_{[0,t]}(s) \, dX_s$$
To extend our integral we will need to keep proving versions of the next three results. Under suitable assumptions on H, K, X and Y
(a) $(H \cdot X)_t$ is a continuous local martingale
(b) $((H + K) \cdot X)_t = (H \cdot X)_t + (K \cdot X)_t$, $(H \cdot (X + Y))_t = (H \cdot X)_t + (H \cdot Y)_t$
(c) $\langle H \cdot X, K \cdot Y \rangle_t = \int_0^t H_s K_s \, d\langle X, Y \rangle_s$
In this step we will only prove a version of (a). The other two results will make their first appearance in Step 2. Before plunging into the details recall that at the end of Section 2.2 we defined $H_t$ to be bounded if there was an M so that with probability one $|H_t| \le M$ for all $t \ge 0$.
(4.1.a) Theorem. If X is a continuous martingale and $H \in b\Pi_0 = \{H \in \Pi_0 : H \text{ is bounded}\}$, then $(H \cdot X)_t$ is a continuous martingale.
Proof
$$(H \cdot X)_t = \begin{cases} 0 & 0 \le t \le a \\ C(X_t - X_a) & a \le t \le b \\ C(X_b - X_a) & b \le t < \infty \end{cases}$$
From this it is clear that $(H \cdot X)_t$ is continuous, $(H \cdot X)_t \in \mathcal{F}_t$ and $E|(H \cdot X)_t| < \infty$. Since $(H \cdot X)_t$ is constant for $t \notin [a,b]$ we can check the martingale property by considering only $a \le s < t \le b$. (See Exercise 4.1 below.) In this case,
$$\begin{aligned} E((H \cdot X)_t \mid \mathcal{F}_s) - (H \cdot X)_s &= E((H \cdot X)_t - (H \cdot X)_s \mid \mathcal{F}_s) \\ &= E(C(X_t - X_s) \mid \mathcal{F}_s) \\ &= C E(X_t - X_s \mid \mathcal{F}_s) = 0 \end{aligned}$$
where the last two equalities follow from $C \in \mathcal{F}_s$ and $E(X_t \mid \mathcal{F}_s) = X_s$. □
Exercise 4.1. If $Y_t$ is constant for $t \notin [a,b]$ and $E(Y_t \mid \mathcal{F}_s) = Y_s$ when $a \le s < t \le b$ then $Y_t$ is a martingale.
Step 2: Simple Integrands
We say $H(s,\omega)$ is a simple predictable process and write $H \in \Pi_1$ if H can be written as the sum of a finite number of basic predictable processes. It is easy to see that if $H \in \Pi_1$ then H can be written as
$$H(s,\omega) = \sum_{i=1}^{m} 1_{(t_{i-1},t_i]}(s) \, C_i(\omega)$$
where $t_0 < t_1 < \ldots < t_m$ and $C_i \in \mathcal{F}_{t_{i-1}}$. In this case we let
$$\int H_s \, dX_s = \sum_{i=1}^{m} C_i (X_{t_i} - X_{t_{i-1}})$$
The representation in the definition of H is not unique, since one may subdivide the intervals into two or more pieces, but it is easy to see that the right-hand side does not depend on the representation of H chosen.
(4.2.b) Theorem. Suppose X and Y are continuous martingales. If $H, K \in \Pi_1$
then
$$((H+K)\cdot X)_t = (H\cdot X)_t + (K\cdot X)_t$$
$$(H\cdot(X+Y))_t = (H\cdot X)_t + (H\cdot Y)_t$$
Proof Let $H^1 = H$ and $H^2 = K$. By subdividing the intervals in the definitions
we can suppose $H^j_s = \sum_{i=1}^m 1_{(t_{i-1},t_i]}(s)\,C^j_i$ for j = 1, 2. In this case it is
easy to see that each side of the first identity is equal to
$$\sum_{i=1}^m (C^1_i + C^2_i)(X_{t_i} - X_{t_{i-1}})$$
If $H_s = \sum_{i=1}^m 1_{(t_{i-1},t_i]}(s)\,C_i$ then each side of the second identity is
$$\sum_{i=1}^m C_i\,\{(X_{t_i} - X_{t_{i-1}}) + (Y_{t_i} - Y_{t_{i-1}})\}$$
Since the sum of a finite number of continuous martingales is a continuous
martingale, it follows from (4.1.a) and (4.2.b) that we have
(4.2.a) Theorem. If X is a continuous martingale and $H \in b\Pi_1 = \{H \in \Pi_1 : H \text{ is bounded}\}$,
then $(H\cdot X)_t$ is a continuous martingale.
Turning to our third result
(4.2.c) Theorem. If X and Y are bounded continuous martingales and $H, K \in b\Pi_1$,
then
$$\langle H\cdot X, K\cdot Y\rangle_t = \int_0^t H_s K_s\,d\langle X,Y\rangle_s$$
Consequently, $E((H\cdot X)_t (K\cdot Y)_t) = E\int_0^t H_s K_s\,d\langle X,Y\rangle_s$ and
$$E(H\cdot X)_t^2 = E\int_0^t H_s^2\,d\langle X\rangle_s$$
Remark. Here and in what follows, integrals with respect to $\langle X,Y\rangle_s$ are
Lebesgue-Stieltjes integrals. That is, since $s \to \langle X,Y\rangle_s$ is locally of bounded
variation it defines a σ-finite signed measure, so we fix an ω and then integrate
with respect to that measure. In the case under consideration, the integrand is
piecewise constant so the integral is trivial.
Proof To prove these results it suffices to show that
$$Z_t = (H\cdot X)_t (K\cdot Y)_t - \int_0^t H_s K_s\,d\langle X,Y\rangle_s$$
is a martingale, for then the first result follows from (3.11), the second by taking
expected values, and the third formula follows from the second by taking H = K
and X = Y. To prove $Z_t$ is a martingale we begin by noting that
$$((H^1+H^2)\cdot X)_t (K\cdot Y)_t = (H^1\cdot X)_t (K\cdot Y)_t + (H^2\cdot X)_t (K\cdot Y)_t$$
$$\int_0^t (H^1_s + H^2_s) K_s\,d\langle X,Y\rangle_s = \int_0^t H^1_s K_s\,d\langle X,Y\rangle_s + \int_0^t H^2_s K_s\,d\langle X,Y\rangle_s$$
The last two observations imply that if the result holds for $(H^1, K)$ and $(H^2, K)$,
it holds for $(H^1 + H^2, K)$. Similarly, if the result holds for $(H, K^1)$ and $(H, K^2)$,
it holds for $(H, K^1 + K^2)$.
In view of the results in the last paragraph, we can now prove the result
by establishing it in the case $H = 1_{(a,b]}C$, $K = 1_{(c,d]}D$, and we can furthermore
assume that (i) $b \le c$ or (ii) a = c, b = d.
Case i. In this case $\int_0^t H_s K_s\,d\langle X,Y\rangle_s = 0$, so we need to show that $(H\cdot X)_t (K\cdot Y)_t$
is a martingale. To prove this we observe that if $J = C(X_b - X_a) D 1_{(c,d]}$
then $(H\cdot X)_t (K\cdot Y)_t = (J\cdot Y)_t$ is a martingale by (4.1.a).
Case ii. In this case
$$Z_s = \begin{cases} 0 & s \le a \\ CD\,\{(X_s - X_a)(Y_s - Y_a) - (\langle X,Y\rangle_s - \langle X,Y\rangle_a)\} & a \le s \le b \\ CD\,\{(X_b - X_a)(Y_b - Y_a) - (\langle X,Y\rangle_b - \langle X,Y\rangle_a)\} & s \ge b \end{cases}$$
so it suffices to check the martingale property for $a \le s < t \le b$. To do this we
note
$$Z_t - Z_s = CD\,\{X_t Y_t - X_s Y_s - X_a(Y_t - Y_s) - Y_a(X_t - X_s) - (\langle X,Y\rangle_t - \langle X,Y\rangle_s)\}$$
Taking conditional expectations and noting $X_a, Y_a \in \mathcal{F}_s$, $E(Y_t - Y_s \mid \mathcal{F}_s) = 0$ and $E(X_t - X_s \mid \mathcal{F}_s) = 0$ we have
$$E(Z_t - Z_s \mid \mathcal{F}_s) = CD\,E(X_t Y_t - \langle X,Y\rangle_t - \{X_s Y_s - \langle X,Y\rangle_s\} \mid \mathcal{F}_s) = 0$$
This completes the proof of Case ii and finishes the proof of (4.2.c). □
Step 3: Square Integrable Integrands, Bounded Martingales
Let $\Pi_2(X)$ be the set of all predictable processes H that have
$$\|H\|_X = \left(E\int_0^\infty H_s^2\,d\langle X\rangle_s\right)^{1/2} < \infty$$
Exercise 4.2. $\|H\|_X$ is a norm.
Remark. If we define a measure on the predictable σ-field by
$$\mu(A \times (s,t]) = E(\langle X\rangle_t - \langle X\rangle_s ; A)$$
when $A \in \mathcal{F}_s$ then $\Pi_2(X) = L^2(\Pi,\mu)$. μ is called the Doléans measure after
C. Doléans-Dade. For a proof that this is a measure see Chung and Williams
(1990), pages 52-53.
Let $\mathcal{M}^2$ be the set of all martingales adapted to $\{\mathcal{F}_t, t \ge 0\}$ that have
$$\|X\|_2 = \left(\sup_t E X_t^2\right)^{1/2} < \infty$$
In the proof of (4.6) we will see that $\mathcal{M}^2$ is isomorphic to $L^2(\mathcal{F}_\infty)$.
Exercise 4.3. $X \in \mathcal{M}^2$ if and only if $E X_0^2 < \infty$ and $E\langle X\rangle_\infty < \infty$.
The following is the key to our next extension
(4.4) Isometry Property. If X is a bounded martingale and $H \in b\Pi_1$, then
$\|H\cdot X\|_2 = \|H\|_X$.
Remark. We skipped over the number (4.3) since (4.3.a) and (4.3.b) will be
proved later in this step.
Proof Recalling the definitions and using the last formula in (4.2.c) gives
$$\|H\|_X^2 = E\int_0^\infty H_s^2\,d\langle X\rangle_s = \sup_t E\int_0^t H_s^2\,d\langle X\rangle_s = \sup_t E(H\cdot X)_t^2 = \|H\cdot X\|_2^2$$
□
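The isometry (4.4) has an exact discrete-time counterpart that can be verified by enumeration. The following sketch is an illustration under stated assumptions, not the book's construction: X is taken to be a simple ±1 random walk (so $\langle X\rangle_i = i$), the integrand is any function of the past of the path, and both sides are computed exactly with rational arithmetic. The names `isometry_sides` and `capped_position` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def isometry_sides(n, integrand):
    """Return (E (H.X)_n^2, E sum_i H_i^2) for a simple +-1 random walk X.

    integrand(history) produces H_i from X_0, ..., X_{i-1} only, i.e. H is predictable.
    """
    lhs = Fraction(0)
    rhs = Fraction(0)
    weight = Fraction(1, 2 ** n)  # each of the 2**n paths is equally likely
    for steps in product((-1, 1), repeat=n):
        path = [0]
        integral = 0
        energy = 0
        for step in steps:
            h = integrand(path)      # uses information strictly before the step
            integral += h * step     # H_i * (X_i - X_{i-1})
            energy += h * h          # H_i^2 * (<X>_i - <X>_{i-1}), with <X>_i = i
            path.append(path[-1] + step)
        lhs += weight * integral ** 2
        rhs += weight * energy
    return lhs, rhs


def capped_position(history):
    # Hypothetical bounded integrand: the current position truncated to [-2, 2].
    return max(-2, min(2, history[-1]))


lhs, rhs = isometry_sides(8, capped_position)
print(lhs, rhs)
```

The cross terms $E(H_i \xi_i H_j \xi_j)$, $i < j$, vanish because $\xi_j$ is independent of everything before it, so the two expectations agree exactly.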
The first step in extending the integral to $\Pi_2(X)$ is to prove that $b\Pi_1$ is dense
in $\Pi_2(X)$. Later we need a slight strengthening of this result so we go ahead
and prove the stronger form.
(4.5) Lemma. If for $1 \le i \le k$ we have $X^i \in \mathcal{M}^2$ and $H \in \Pi_2(X^i)$, then there
is a sequence $H^n \in b\Pi_1$ with $\|H^n - H\|_{X^i} \to 0$ for $1 \le i \le k$.
Proof Since $X^i \in \mathcal{M}^2$, $E\langle X^i\rangle_\infty < \infty$ and it follows that $b\Pi_1 \subset \Pi_2(X^i)$.
Let $\mathcal{H}_t$ be the collection of predictable G that vanish on $(t,\infty)$ for which the
conclusion holds. Clearly, if $r < s \le t$ and $A \in \mathcal{F}_r$ then $G = 1_{(r,s]} 1_A \in \mathcal{H}_t$.
Suppose now that $0 \le G^n \in \mathcal{H}_t$ and $G^n \uparrow G$ with G bounded. The dominated
convergence theorem implies
$$\|G - G^n\|_{X^i}^2 = E\int (G_s - G^n_s)^2\,d\langle X^i\rangle_s \to 0$$
as $n \to \infty$, so we can pick n so that the last difference is $< \epsilon^2$ for $1 \le i \le k$.
Since $G^n \in \mathcal{H}_t$, we can find a sequence $H^{n,m} \in b\Pi_1$ so that $\|H^{n,m} - G^n\|_{X^i} \to 0$
for $1 \le i \le k$ and then pick m so that $\|H^{n,m} - G^n\|_{X^i} < \epsilon$ for $1 \le i \le k$. The
triangle inequality implies that $\|H^{n,m} - G\|_{X^i} < 2\epsilon$ for $1 \le i \le k$. Since ε is
arbitrary it follows that $G \in \mathcal{H}_t$.
Using the monotone class theorem now ((2.3) in Chapter 1) it follows that
$\mathcal{H}_t$ contains all bounded predictable processes that vanish on $(t,\infty)$. Now if
$K \in \Pi_2(X^i)$ for $1 \le i \le k$ and we define $K^n = K 1_{\{|K| \le n\}} 1_{[0,n]}$ then the dominated
convergence theorem implies that $\|K^n - K\|_{X^i} \to 0$. Since $K^n$ is bounded and
vanishes on $(n,\infty)$, another use of the triangle inequality implies K is a limit
of $H^n \in b\Pi_1$. □
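(4.6) Theorem. $\mathcal{M}^2$ is complete.
Proof Standard martingale convergence theorems imply that if $X \in \mathcal{M}^2$,
then as $t \to \infty$, $X_t$ converges almost surely and in $L^2$ to a limit $X_\infty$ with
$E X_\infty^2 = \sup_t E X_t^2$, and the martingale can be recovered from $X_\infty$ by
$X_t = E(X_\infty \mid \mathcal{F}_t)$. Let $\mathcal{F}_\infty = \sigma(\mathcal{F}_t, t \ge 0)$. Since $X_\infty = \lim X_t \in \mathcal{F}_\infty$, the observation
above shows that $X \to X_\infty$ maps $\mathcal{M}^2$ one-to-one into $L^2(\mathcal{F}_\infty)$. On the other
hand, if $Y \in L^2(\mathcal{F}_\infty)$, then $Y_t = E(Y \mid \mathcal{F}_t)$ is a martingale with $Y_t \to Y$ as
$t \to \infty$, and Jensen's inequality shows that
$$E Y_t^2 = E(E(Y \mid \mathcal{F}_t)^2) \le E(E(Y^2 \mid \mathcal{F}_t)) = E Y^2$$
so $Y_t \in \mathcal{M}^2$. Combining this with the previous observation shows that $X \to X_\infty$
is an isometry from $\mathcal{M}^2$ onto $L^2(\mathcal{F}_\infty)$ and proves (4.6). □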
With (4.5) and (4.6) established, we can give the
Definition of the integral for $\Pi_2(X)$. To define $H\cdot X$ when X is a bounded
continuous martingale and $H \in \Pi_2(X)$, let $H^n \in b\Pi_1$ so that $\|H^n - H\|_X \to 0$.
Since $\|H^n - H^m\|_X \to 0$ as $m, n \to \infty$, the isometry property (4.4) implies that
$(H^n\cdot X)$ is a Cauchy sequence in $\mathcal{M}^2$
and by (4.6) must converge to a limit in $\mathcal{M}^2$, which we define to be the integral
$(H\cdot X)$. To see that the limit is independent of the sequence of approximations
chosen suppose $\|H^n - H\|_X \to 0$ and $\|\bar H^n - H\|_X \to 0$ with $H^n\cdot X \to Y$ and
$\bar H^n\cdot X \to \bar Y$. Form a third approximating sequence by setting $\tilde H^n = H^n$ if n is
odd, $\tilde H^n = \bar H^n$ if n is even. $\|\tilde H^n - H\|_X \to 0$ so we have $\tilde H^n\cdot X \to Z$. Looking
at the even and odd subsequences of $\tilde H^n$ it follows that $Y = Z = \bar Y$ so the limit
is unique. In words, the last argument shows that if the limit exists for ANY
sequence of approximations, it must be unique. Note that as a corollary of the
construction we get
(4.3.a) Theorem. If X is a bounded continuous martingale and $H \in \Pi_2(X)$
then $H\cdot X \in \mathcal{M}^2$ and is continuous.
Proof The fact that $H\cdot X \in \mathcal{M}^2$ is automatic from the definition. If $H^n \in b\Pi_1$
have $\|H^n - H\|_X \to 0$ then (4.2.a) implies $(H^n\cdot X)_t$ is continuous. Since
$\|(H^n\cdot X) - (H\cdot X)\|_2 \to 0$, using Chebyshev and the $L^2$ maximal inequality
we get that
$$\begin{aligned} P\left(\sup_t |(H^n\cdot X)_t - (H\cdot X)_t| > \epsilon\right) &\le \epsilon^{-2} E\sup_t |(H^n\cdot X)_t - (H\cdot X)_t|^2 \\ &\le \epsilon^{-2}\cdot 4\,\|(H^n\cdot X) - (H\cdot X)\|_2^2 \to 0 \end{aligned}$$
Since this holds for any ε > 0 we say "$(H^n\cdot X)_t$ converges uniformly to $(H\cdot X)_t$
in probability." By passing to a subsequence we can have
$$\sup_t |(H^n\cdot X)_t - (H\cdot X)_t| \to 0 \quad \text{a.s.}$$
and it follows that $t \to (H\cdot X)_t$ is continuous. □
(4.3.b) Theorem. If X is a bounded continuous martingale, and $H, K \in \Pi_2(X)$
then $H + K \in \Pi_2(X)$ and
$$((H+K)\cdot X)_t = (H\cdot X)_t + (K\cdot X)_t$$
Proof The triangle inequality for the norm $\|\cdot\|_X$ implies that $H + K \in \Pi_2(X)$.
Let $H^n, K^n \in b\Pi_1$ with $\|H^n - H\|_X \to 0$ and $\|K^n - K\|_X \to 0$. The triangle
inequality implies $\|(H^n + K^n) - (H + K)\|_X \to 0$. (4.2.b) implies
$$((H^n + K^n)\cdot X)_t = (H^n\cdot X)_t + (K^n\cdot X)_t$$
Now let $n \to \infty$ and use the fact that $(G^n\cdot X)_t \to (G\cdot X)_t$ in $\mathcal{M}^2$ when
G = H, K, H + K. □
The proofs of the remaining two equalities
$$(H\cdot(X+Y))_t = (H\cdot X)_t + (H\cdot Y)_t$$
$$\langle H\cdot X, K\cdot Y\rangle_t = \int_0^t H_s K_s\,d\langle X,Y\rangle_s$$
will be given in the next section. To develop the results there we will need to
extend the isometry property from $b\Pi_1$ to $\Pi_2(X)$. To do this it is useful to
recall some of your undergraduate analysis.
Exercise 4.4. If $\|\cdot\|$ is a norm and $\|x_n - x\| \to 0$ then $\|x_n\| \to \|x\|$.
Exercise 4.5. If X is a bounded martingale and $H \in \Pi_2(X)$ then $\|H\cdot X\|_2 = \|H\|_X$.
2.5. The Kunita-Watanabe Inequality
In this section we will prove an inequality due to Kunita and Watanabe and
apply this result to extend the formula given in Section 2.4 for the covariance
of two stochastic integrals.
(5.1) Theorem. If X and Y are local martingales and H and K are two
measurable processes, then almost surely
$$\int_0^\infty |H_s K_s|\,d|\langle X,Y\rangle|_s \le \left(\int_0^\infty H_s^2\,d\langle X\rangle_s\right)^{1/2} \left(\int_0^\infty K_s^2\,d\langle Y\rangle_s\right)^{1/2}$$
where $d|\langle X,Y\rangle|_s$ stands for $dV_s$, where $V_s$ is the total variation of $r \to \langle X,Y\rangle_r$
on [0,s].
Remarks. (i) If X = Y and $d\langle X\rangle_s = ds$ then $d\langle Y\rangle_s = d|\langle X,Y\rangle|_s = ds$ and
(5.1) reduces to the Cauchy-Schwarz inequality. (ii) Notice that H and K are
not assumed to be predictable. We assume only that $H(s,\omega)$ and $K(s,\omega)$ are
measurable with respect to $\mathcal{R} \times \mathcal{F}$ where $\mathcal{R}$ is the Borel subsets of R. The
reason we can attain this level of generality is that the notion of martingale
does not enter into the proof after the first step.
Proof Step 1: Observe that if s < t, $\langle X+\lambda Y, X+\lambda Y\rangle_t \ge \langle X+\lambda Y, X+\lambda Y\rangle_s$.
If we let $\langle M,N\rangle_s^t = \langle M,N\rangle_t - \langle M,N\rangle_s$, then
$$0 \le \langle X+\lambda Y, X+\lambda Y\rangle_t - \langle X+\lambda Y, X+\lambda Y\rangle_s = \langle X,X\rangle_s^t + 2\lambda\langle X,Y\rangle_s^t + \lambda^2\langle Y,Y\rangle_s^t$$
for all s, t, and λ. If we fix s and t and throw away a countable number of
sets of measure 0 then with probability one the last inequality will hold for all
rational λ. Now a quadratic $ax^2 + bx + c$ that is nonnegative at all the rationals
and not identically 0 has at most one real root (i.e., $b^2 - 4ac \le 0$), so we have
that
$$(\langle X,Y\rangle_s^t)^2 \le \langle X\rangle_s^t\,\langle Y\rangle_s^t$$
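Step 2: Let $0 = t_0 < t_1 < \cdots < t_n$ be an increasing sequence of times, let
$h_i, k_i$, $1 \le i \le n$, be random variables, and define simple measurable processes
$$H(s,\omega) = \sum_{i=1}^n h_i(\omega) 1_{(t_{i-1},t_i]}(s)$$
$$K(s,\omega) = \sum_{i=1}^n k_i(\omega) 1_{(t_{i-1},t_i]}(s)$$
From the definition of the integral, the result of Step 1, and the Cauchy-Schwarz
inequality, it follows that
$$\begin{aligned} \left|\int_0^\infty H_s K_s\,d\langle X,Y\rangle_s\right| &\le \sum_{i=1}^n |h_i k_i|\,|\langle X,Y\rangle_{t_{i-1}}^{t_i}| \\ &\le \sum_{i=1}^n |h_i|\,|k_i|\,\left(\langle X\rangle_{t_{i-1}}^{t_i}\,\langle Y\rangle_{t_{i-1}}^{t_i}\right)^{1/2} \\ &\le \left(\sum_{i=1}^n h_i^2\,\langle X\rangle_{t_{i-1}}^{t_i}\right)^{1/2} \left(\sum_{i=1}^n k_i^2\,\langle Y\rangle_{t_{i-1}}^{t_i}\right)^{1/2} \end{aligned}$$
proving that for simple measurable processes:
(5.2) $$\left|\int_0^\infty H_s K_s\,d\langle X,Y\rangle_s\right| \le \left(\int_0^\infty H_s^2\,d\langle X\rangle_s\right)^{1/2} \left(\int_0^\infty K_s^2\,d\langle Y\rangle_s\right)^{1/2}$$
which is (5.1) with the absolute values outside the integral.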
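Steps 1 and 2 can be illustrated numerically with discrete brackets. The sketch below is not from the text and makes two assumptions: the brackets $\langle X,Y\rangle_s^t$ are replaced by sums of products of increments of two simulated paths, and the block coefficients $h_i, k_i$ are arbitrary random numbers. In this setting both the Step 1 bound and (5.2) are instances of the Cauchy-Schwarz inequality. The names `bracket` and `kw_sides` are hypothetical.

```python
import math
import random


def bracket(dx, dy, lo, hi):
    """Discrete analogue of <X,Y>_s^t: sum of products of increments over steps lo..hi-1."""
    return sum(dx[i] * dy[i] for i in range(lo, hi))


def kw_sides(dx, dy, h, k, cuts):
    """Both sides of (5.2) for H, K constant on the blocks (cuts[i-1], cuts[i]]."""
    lhs = 0.0
    x_energy = 0.0
    y_energy = 0.0
    for i in range(1, len(cuts)):
        lo, hi = cuts[i - 1], cuts[i]
        lhs += h[i - 1] * k[i - 1] * bracket(dx, dy, lo, hi)
        x_energy += h[i - 1] ** 2 * bracket(dx, dx, lo, hi)
        y_energy += k[i - 1] ** 2 * bracket(dy, dy, lo, hi)
    return abs(lhs), math.sqrt(x_energy) * math.sqrt(y_energy)


rng = random.Random(0)
n = 400
dx = [rng.gauss(0.0, 0.05) for _ in range(n)]
# Correlated second path: dy shares part of its noise with dx.
dy = [0.6 * dx[i] + 0.8 * rng.gauss(0.0, 0.05) for i in range(n)]
cuts = list(range(0, n + 1, 50))
h = [rng.uniform(-1.0, 1.0) for _ in range(len(cuts) - 1)]
k = [rng.uniform(-1.0, 1.0) for _ in range(len(cuts) - 1)]

lhs, rhs = kw_sides(dx, dy, h, k, cuts)
print(lhs <= rhs)
```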
Step 3: Let M be a large number and let $T = \inf\{t : \langle X\rangle_t \text{ or } \langle Y\rangle_t > M\}$. By
the monotone convergence theorem, it suffices to prove (5.1) when $H_s = K_s = 0$
for s > T and $|H_s|, |K_s| \le M$ for $s \le T$. Having restricted our attention
to [0,T], $\langle X\rangle$ and $\langle Y\rangle$ are finite measures; so using the bounded convergence
theorem, we see that (5.2) holds for bounded measurable processes. To improve
(5.2) to (5.1) (and complete the proof), let $J_s$ be a measurable process taking
values in {-1, 1} such that
$$\int_0^t |d\langle X,Y\rangle_s| = \int_0^t J_s\,d\langle X,Y\rangle_s$$
To see that this exists, note that $J_s$ is the Radon-Nikodym derivative of $d\langle X,Y\rangle_s$
with respect to $d|\langle X,Y\rangle|_s$. Now apply (5.2) to $|H_s|$ and $J_s|K_s|$. □
With (5.1) established, we are now ready to take care of unfinished business
from Section 2.4.
(5.3) Theorem. If $H \in \Pi_2(X) \cap \Pi_2(Y)$ then $H \in \Pi_2(X+Y)$ and
$$(H\cdot(X+Y))_t = (H\cdot X)_t + (H\cdot Y)_t$$
Proof First note that Exercise 3.2 implies
$$\langle X+Y\rangle_t = \langle X\rangle_t + \langle Y\rangle_t + 2\langle X,Y\rangle_t$$
and the Kunita-Watanabe inequality, (5.1), implies
$$|\langle X,Y\rangle_t| \le (\langle X\rangle_t \langle Y\rangle_t)^{1/2} \le (\langle X\rangle_t + \langle Y\rangle_t)/2$$
since $2ab \le a^2 + b^2$, i.e., $0 \le (a-b)^2$. So we have
$$\langle X+Y\rangle_t \le 2(\langle X\rangle_t + \langle Y\rangle_t)$$
(which should remind the reader of $(x+y)^2 \le 2(x^2 + y^2)$) and it follows that
$H \in \Pi_2(X+Y)$. To prove the formula now we note that by (4.5) we can find
$H^n$ in $b\Pi_1$ so that $\|H^n - H\|_Z \to 0$ for Z = X, Y, X + Y. (4.2.b) implies that
$$(H^n\cdot(X+Y))_t = (H^n\cdot X)_t + (H^n\cdot Y)_t$$
Now let $n \to \infty$ and use $(H^n\cdot Z)_t \to (H\cdot Z)_t$ for Z = X, Y, X + Y. □
(5.4) Theorem. If X, Y are bounded continuous martingales, $H \in \Pi_2(X)$,
$K \in \Pi_2(Y)$ then
$$\langle H\cdot X, K\cdot Y\rangle_t = \int_0^t H_s K_s\,d\langle X,Y\rangle_s$$
Proof By (3.11), it suffices to show that
(†) $$Z_t = (H\cdot X)_t (K\cdot Y)_t - \int_0^t H_s K_s\,d\langle X,Y\rangle_s$$
is a martingale. Let $H^n$ and $K^n$ be sequences of elements of $b\Pi_1$ that converge
to H and K in $\Pi_2(X)$ and $\Pi_2(Y)$, respectively, and let $Z^n_t$ be the quantity
that results when $H^n$ and $K^n$ replace H and K in (†). By (4.2.c), $Z^n_t$ is a
martingale. By (3.6), we can complete the proof by showing $Z^n_t \to Z_t$ in $L^1$.
The triangle inequality and (4.3.b) imply that
$$\begin{aligned} E\sup_t |(H^n\cdot X)_t (K^n\cdot Y)_t - (H\cdot X)_t (K\cdot Y)_t| &\le E\sup_t |((H^n - H)\cdot X)_t (K^n\cdot Y)_t| \\ &\quad + E\sup_t |(H\cdot X)_t ((K^n - K)\cdot Y)_t| \end{aligned}$$
To bound the first term, we use (i) the sup of the product is smaller than the
product of the sup's, (ii) the Cauchy-Schwarz inequality, (iii) the $L^2$ maximal
inequality and (iv) the isometry property in Exercise 4.5:
$$\begin{aligned} &\le E\left\{\sup_t |((H^n - H)\cdot X)_t|\,\sup_t |(K^n\cdot Y)_t|\right\} \\ &\le \left\{E\left(\sup_t |((H^n - H)\cdot X)_t|^2\right) E\left(\sup_t |(K^n\cdot Y)_t|^2\right)\right\}^{1/2} \\ &\le 4\,\|(H^n - H)\cdot X\|_2\,\|K^n\cdot Y\|_2 = 4\,\|H^n - H\|_X\,\|K^n\|_Y \to 0 \end{aligned}$$
as $n \to \infty$, since $\|H^n - H\|_X \to 0$ and $\|K^n\|_Y \to \|K\|_Y < \infty$. A similar
estimate shows
$$E\left(\sup_t |(H\cdot X)_t ((K^n - K)\cdot Y)_t|\right) \to 0$$
so we have shown $(H^n\cdot X)_t (K^n\cdot Y)_t \to (H\cdot X)_t (K\cdot Y)_t$ in $L^1$.
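To estimate the second term in $Z^n_t - Z_t$, we again begin by putting the
absolute values inside, replacing the measure by its variation, and then using
the triangle inequality
$$\begin{aligned} \left|\int_0^t H^n_s K^n_s\,d\langle X,Y\rangle_s - \int_0^t H_s K_s\,d\langle X,Y\rangle_s\right| &\le \int_0^t |H^n_s K^n_s - H_s K_s|\,d|\langle X,Y\rangle|_s \\ &\le \int_0^t |H^n_s - H_s|\,|K^n_s|\,d|\langle X,Y\rangle|_s + \int_0^t |H_s|\,|K^n_s - K_s|\,d|\langle X,Y\rangle|_s \end{aligned}$$
Using the Kunita-Watanabe inequality, (5.1), the first term is
$$\le \left(\int_0^t |H^n_s - H_s|^2\,d\langle X\rangle_s\right)^{1/2} \left(\int_0^t |K^n_s|^2\,d\langle Y\rangle_s\right)^{1/2}$$
Taking expected values and using the Cauchy-Schwarz inequality
$$E\int_0^t |H^n_s - H_s|\,|K^n_s|\,d|\langle X,Y\rangle|_s \le \left(E\int_0^t |H^n_s - H_s|^2\,d\langle X\rangle_s\right)^{1/2} \left(E\int_0^t |K^n_s|^2\,d\langle Y\rangle_s\right)^{1/2} \to 0$$
as $n \to \infty$, since $\|H^n - H\|_X \to 0$ and $\|K^n\|_Y \to \|K\|_Y < \infty$. Combining this
with a similar estimate on $\int_0^t |H_s|\,|K^n_s - K_s|\,d|\langle X,Y\rangle|_s$ it follows that
$$\int_0^t H^n_s K^n_s\,d\langle X,Y\rangle_s \to \int_0^t H_s K_s\,d\langle X,Y\rangle_s$$
in $L^1$. We have now shown $Z^n_t \to Z_t$ in $L^1$ and the desired result follows from
(3.6). □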
Remark. In some treatments (see e.g., Revuz and Yor (1991) p. 130), if $M \in \mathcal{M}^2$
and $H \in \Pi_2(M)$, $H\cdot M$ is DEFINED to be the unique $L \in \mathcal{M}^2$ with $L_0 = 0$
so that
$$\langle L,N\rangle = H\cdot\langle M,N\rangle$$
for all $N \in \mathcal{M}^2$.
Exercise 5.1. Prove the uniqueness in the last claim.
(5.4) allows us to do easily some things that we would have had to work to
prove in Section 2.3. Let X and Y be continuous local martingales, and suppose
$X_0 = Y_0 = 0$. The last assumption entails no loss of generality. See Exercise
3.3.
Exercise 5.2. $\langle X, Y^T\rangle = \langle X,Y\rangle^T$.
Exercise 5.3. $X^T_t (Y_t - Y^T_t)$ is a local martingale.
2.6. Integration w.r.t. Local Martingales
In this section we will extend our integral so that the integrators can be
continuous local martingales and so that the integrands are in
$$\Pi_3(X) = \left\{H : \int_0^t H_s^2\,d\langle X\rangle_s < \infty \text{ a.s. for all } t \ge 0\right\}$$
We will argue in Section 3.4 that this is the largest possible class of integrands.
To extend the integral from bounded martingales to local martingales we begin
by proving an obvious fact.
(6.1) Theorem. Suppose X is a bounded continuous martingale, $H, K \in \Pi_2(X)$
and $H_s = K_s$ for $s \le T$ where T is a stopping time. Then $(H\cdot X)_s = (K\cdot X)_s$
for $s \le T$.
Proof Write $H_s = H^1_s + H^2_s$ and $K_s = K^1_s + K^2_s$ where
$$H^1_s = K^1_s = H_s 1_{(s \le T)} = K_s 1_{(s \le T)}$$
Clearly, $(H^1\cdot X)_t = (K^1\cdot X)_t$ for all t. Since $H^2_s = K^2_s = 0$ for $s \le T$, (5.4)
implies
$$\langle H^2\cdot X\rangle_s = \langle K^2\cdot X\rangle_s = 0 \quad \text{for } s \le T$$
and it follows from Exercise 3.8 that $(H^2\cdot X)_t = (K^2\cdot X)_t = 0$ for $t \le T$.
Combining this with the first result and using (4.3.b) gives the desired
conclusion. □
Exercise 6.1. Extend the proof of (6.1) to show that if X is a bounded
continuous martingale, $H, K \in \Pi_2(X)$ and $H_s = K_s$ for $S \le s \le T$ where S, T
are stopping times then $(H\cdot X)_s - (H\cdot X)_S = (K\cdot X)_s - (K\cdot X)_S$ for $S \le s \le T$.
To extend the integral now, let $S_n$ be a sequence of stopping times with
$$S_n \le T_n = \inf\left\{t : |X_t| > n \text{ or } \int_0^t H_s^2\,d\langle X\rangle_s > n\right\}$$
and $S_n \uparrow \infty$. Let $H^n_s = H_s 1_{(s \le S_n)}$, and observe that if m < n, (6.1) implies
that $(H^m\cdot X)_t = (H^n\cdot X)_t$ for $t \le S_m$, so we can define $H\cdot X$ by setting
$(H\cdot X)_t = (H^n\cdot X)_t$ for $t \le S_n$.
To complete the definition we have to show that if $R_n$ and $S_n$ are two
sequences of stopping times $\le T_n$ with $R_n \uparrow \infty$ and $S_n \uparrow \infty$ then we end up
with the same $(H\cdot X)_t$. Let $H^T_s$ be the stopped version of the process defined
at the beginning of Section 2.2, let $Q_n = R_n \wedge S_n$ and note that (6.1) implies
$$(H^{R_n}\cdot X)_s = (H^{S_n}\cdot X)_s \quad \text{for } s \le Q_n$$
Since $Q_n \uparrow \infty$, it follows that $(H\cdot X)_t$ is independent of the sequence of stopping
times chosen. The uniqueness result and (6.1) imply
(6.2) Theorem. If X is a continuous local martingale and $H \in \Pi_3(X)$ then
$$H^T\cdot X = (H\cdot X)^T = H\cdot X^T = H^T\cdot X^T$$
In words if we set the integrand = 0 after time T or stop the martingale at time
T or do both, then this just stops the integral at T.
Our next task is to generalize our abc's to integrands in $\Pi_3$.
(6.3) Theorem. If X is a continuous local martingale and $H \in \Pi_3(X)$, then
$(H\cdot X)_t$ is a continuous local martingale.
Proof By stopping at $T_n = \inf\{t : \int_0^t H_s^2\,d\langle X\rangle_s > n \text{ or } |X_t| > n\}$, it suffices
to show that if X is a bounded martingale and $H \in \Pi_2(X)$, then $(H\cdot X)_t$ is a
continuous martingale but this follows from (4.3.a). □
(6.4) Theorem. Let X and Y be continuous local martingales. If $H, K \in \Pi_3(X)$
then $H + K \in \Pi_3(X)$ and
$$((H+K)\cdot X)_t = (H\cdot X)_t + (K\cdot X)_t$$
If $H \in \Pi_3(X) \cap \Pi_3(Y)$ then $H \in \Pi_3(X+Y)$ and
$$(H\cdot(X+Y))_t = (H\cdot X)_t + (H\cdot Y)_t$$
Proof To prove the first formula we note that the triangle inequality for the
norm $\|\cdot\|_X$ implies $H + K \in \Pi_3(X)$. Stopping at
$$T_n = \inf\left\{t : |X_t|,\ \int_0^t H_s^2\,d\langle X\rangle_s,\ \text{or } \int_0^t K_s^2\,d\langle X\rangle_s > n\right\}$$
reduces the result to (4.3.b).
For the second formula we note that the argument in (5.3) shows that
$H \in \Pi_3(X+Y)$ and by stopping we can reduce the result to (5.3). □
After seeing the last two proofs the reader can undoubtedly improve (5.4) to
(6.5) Theorem. If X, Y are continuous local martingales, $H \in \Pi_3(X)$ and
$K \in \Pi_3(Y)$ then
$$\langle H\cdot X, K\cdot Y\rangle_t = \int_0^t H_s K_s\,d\langle X,Y\rangle_s$$
Using Exercise 3.2, we can generalize (6.5) to sums of stochastic integrals.
(6.6) Theorem. Let $X = \sum_{i=1}^m H^i\cdot X^i$ and $Y = \sum_{j=1}^n K^j\cdot Y^j$ where the $X^i$
and $Y^j$ are continuous local martingales. If $H^i \in \Pi_3(X^i)$ and $K^j \in \Pi_3(Y^j)$,
then
$$\langle X,Y\rangle_t = \sum_{i,j} \int_0^t H^i_s K^j_s\,d\langle X^i,Y^j\rangle_s$$
We leave it to the reader to make the final extension of the isometry property:
Exercise 6.2. If X is a continuous local martingale and $H \in \Pi_2(X)$ then
$H\cdot X \in \mathcal{M}^2$ and $\|H\cdot X\|_2 = \|H\|_X$.
We will conclude this section by computing some stochastic integrals. The
first formula should be obvious, but for completeness must be derived from the
definitions.
Exercise 6.3. Let X be a continuous local martingale. Let $S \le T < \infty$ be
stopping times, let $C(\omega)$ be bounded with $C(\omega) \in \mathcal{F}_S$, and define $H_s = C$ for
$S < s \le T$ and 0 otherwise. Then $H \in \Pi_3(X)$ and
$$\int H_s\,dX_s = C(X_T - X_S)$$
For the rest of this section, $\Delta_n = \{0 = t^n_0 < t^n_1 < \cdots < t^n_{k(n)} = t\}$ is
a sequence of partitions of [0,t] with mesh $|\Delta_n| = \sup_i |t^n_i - t^n_{i-1}| \to 0$. Our
theme here is that if the integrand Ht is continuous then the integral is a limit
of Riemann sums, provided we evaluate the integrand at the left endpoint of the
intervals.
(6.7) Theorem. If X is a continuous local martingale and $t \to H_t$ is continuous
then as $n \to \infty$
$$\sum_i H_{t^n_i}\{X(t^n_{i+1}) - X(t^n_i)\} \to \int_0^t H_s\,dX_s \quad \text{in probability}$$
Proof First note that $H \in \Pi_3(X)$ so the integral exists. By stopping at
$T = \inf\{s : s, \langle X\rangle_s, \text{ or } |H_s| > M\}$ and making H constant after time T to
preserve continuity, we can suppose $\langle X\rangle_t \le M$ and $|H_s| \le M$ for all s. Let
$H^n_s = H_{t^n_i}$ for $s \in (t^n_i, t^n_{i+1}]$, $H^n_s = H_t$ for s > t. Since $s \to H_s$ is continuous on
[0,t] we have for each ω that
$$\sup_s |H^n_s - H_s| \to 0$$
The desired result now follows from the following lemma which will be useful
later.
(6.8) Lemma. Let $X_t$ be a continuous local martingale with $\langle X\rangle_t \le M$ for all
t. If $H^n$ is a sequence of predictable processes with $|H^n_s| \le M$ for all s, ω and
with $\sup_s |H^n_s - H_s| \to 0$ in probability then $(H^n\cdot X) \to (H\cdot X)$ in $\mathcal{M}^2$.
Proof Exercise 6.2 (or 4.4) implies
$$\|(H^n - H)\cdot X\|_2^2 = \|H^n - H\|_X^2 \le M\,E\sup_s |H^n_s - H_s|^2 \to 0$$
as $n \to \infty$ by the bounded convergence theorem. □
The rest of the section is devoted to investigating what happens when we
evaluate at the right end point or the center of each interval. For simplicity we
will only consider the special case in which $H_s = 2X_s$.
Exercise 6.4. If X is a continuous local martingale then
$$\int_0^t 2X_s\,dX_s = X_t^2 - X_0^2 - \langle X\rangle_t$$
Exercise 6.5. Show that if X is a continuous local martingale and we evaluate
at the right end point then
$$\sum_i 2X_{t^n_{i+1}}\{X(t^n_{i+1}) - X(t^n_i)\} \to \int_0^t 2X_s\,dX_s + 2\langle X\rangle_t = X_t^2 - X_0^2 + \langle X\rangle_t$$
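The gap between left and right endpoint sums in Exercises 6.4 and 6.5 already shows up as an exact algebraic identity on any discrete path. The sketch below is illustrative only: it assumes a Brownian path simulated on a uniform grid (so $\langle X\rangle_t = t$), and the helper names are hypothetical. For every step, $2x_0(x_1 - x_0) = x_1^2 - x_0^2 - (x_1 - x_0)^2$ and $2x_1(x_1 - x_0) = x_1^2 - x_0^2 + (x_1 - x_0)^2$, so the two sums differ by twice the sum of squared increments.

```python
import math
import random


def brownian_path(n, t, seed):
    """Values B(0), B(t/n), ..., B(t) of a simulated Brownian path started at 0."""
    rng = random.Random(seed)
    sd = math.sqrt(t / n)
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, sd))
    return path


def endpoint_sums(path):
    """Left and right endpoint Riemann sums for the integrand 2X, and the sum of squared increments."""
    left = right = quad_var = 0.0
    for x0, x1 in zip(path, path[1:]):
        dx = x1 - x0
        left += 2.0 * x0 * dx
        right += 2.0 * x1 * dx
        quad_var += dx * dx
    return left, right, quad_var


path = brownian_path(n=20000, t=1.0, seed=1)
left, right, quad_var = endpoint_sums(path)
end_sq = path[-1] ** 2 - path[0] ** 2
print(left - (end_sq - quad_var), right - (end_sq + quad_var), quad_var)
```

The first two printed numbers are zero up to rounding for any path; the third is close to t = 1 only because the simulated path is (approximately) Brownian.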
Comparing the last two exercises you might jump to the conclusion that
if we evaluate at the midpoint we get an integral that performs like the one in
calculus. However, we will not ask you to prove this in general.
Exercise 6.6. Show that if $B_t$ is a one dimensional Brownian motion starting
at 0 then
$$\sum_{k=0}^{2^n-1} 2B((k + 1/2)2^{-n}t)\,\{B((k+1)2^{-n}t) - B(k2^{-n}t)\}$$
converges in probability to $\int_0^t 2B_s\,dB_s + t = B_t^2$.
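A quick simulation makes the midpoint statement in Exercise 6.6 plausible. This is a sketch under assumptions, not a proof: the path is generated from two independent Gaussian half-steps per interval, the number of intervals n is a free parameter rather than $2^n$, and `midpoint_sum` is a hypothetical helper. The difference between the midpoint sum and $B_t^2$ equals $\sum_k (a_k^2 - b_k^2)$, where $a_k$ and $b_k$ are the two half-step increments, which has mean 0 and variance $t^2/n$.

```python
import math
import random


def midpoint_sum(n, t, seed):
    """Midpoint-rule sum for the integrand 2B on n intervals of [0, t], and B_t^2 for comparison."""
    rng = random.Random(seed)
    sd = math.sqrt(t / (2 * n))  # each interval is simulated as two half-steps
    b_left = 0.0
    total = 0.0
    for _ in range(n):
        b_mid = b_left + rng.gauss(0.0, sd)
        b_right = b_mid + rng.gauss(0.0, sd)
        total += 2.0 * b_mid * (b_right - b_left)
        b_left = b_right
    return total, b_left ** 2


total, target = midpoint_sum(n=10000, t=1.0, seed=2)
print(total - target)
```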
This approach to integration, called the Stratonovich integral, is more
convenient than Ito's approach in certain circumstances (e.g., diffusion processes
on manifolds) but we will not discuss it further here. Changing topics we have
one more consequence of (6.7) that we will need in Chapter 5.
Exercise 6.7. Suppose $h : [0,\infty) \to R$ is continuous. Show that $\int_0^t h_s\,dB_s$ has
a normal distribution with mean 0 and variance $\int_0^t h_s^2\,ds$.
2.7. Change of Variables, Ito's Formula
This section is devoted to a proof of our first version of Ito's formula, (7.1). In
Section 2.10, we will prove a bigger and better version, (10.2), with a slicker
proof, but the mundane proof given here has the advantage of explaining where
the term with $f''$ comes from.
(7.1) Theorem. Suppose X is a continuous local martingale and f has two
continuous derivatives. Then with probability 1, for all $t \ge 0$
$$f(X_t) - f(X_0) = \int_0^t f'(X_s)\,dX_s + \frac{1}{2}\int_0^t f''(X_s)\,d\langle X\rangle_s$$
Remark. If $A_t$ is a continuous function that is locally of bounded variation,
and f has a continuous derivative then (see Exercise 7.2 below)
(7.2) $$f(A_t) - f(A_0) = \int_0^t f'(A_s)\,dA_s$$
As the reader will see in the proof of (7.1), the second term comes from the
fact that local martingale paths have quadratic variation $\langle X\rangle_t$, while the 1/2
in front of it comes from expanding f in a Taylor series.
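As a quick sanity check on (7.1), take $f(x) = x^2$. The worked instance below is an illustration added for orientation; it uses only the statement of (7.1) and recovers the formula of Exercise 6.4.

```latex
% Worked instance of (7.1) with f(x) = x^2, so f'(x) = 2x and f''(x) = 2.
\[
\begin{aligned}
X_t^2 - X_0^2
  &= \int_0^t 2X_s \, dX_s + \frac{1}{2}\int_0^t 2 \, d\langle X\rangle_s \\
  &= \int_0^t 2X_s \, dX_s + \langle X\rangle_t .
\end{aligned}
\]
% Rearranged: \int_0^t 2X_s \, dX_s = X_t^2 - X_0^2 - \langle X\rangle_t  (Exercise 6.4).
```

By contrast, (7.2) with the same f gives $A_t^2 - A_0^2 = \int_0^t 2A_s\,dA_s$ with no correction term, since a continuous path of bounded variation has zero quadratic variation.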
Proof By stopping at $T_M = \inf\{t : |X_t| \text{ or } \langle X\rangle_t > M\}$, it suffices to prove
the result when $|X_t|$ and $\langle X\rangle_t \le M$. From calculus we know that for any a and
b, there is a c(a,b) in between a and b such that
(7.3) $$f(b) - f(a) = (b-a)f'(a) + \frac{1}{2}(b-a)^2 f''(c(a,b))$$
Let t be a fixed positive number. Consider a sequence $\Delta_n$ of partitions
$0 = t^n_0 < t^n_1 < \cdots < t^n_{k(n)} = t$ with mesh $|\Delta_n| \to 0$. From (7.3) it follows that
$$f(X_t) - f(X_0) = \sum_i f(X_{t^n_{i+1}}) - f(X_{t^n_i})$$
(7.4) $$= \sum_i f'(X_{t^n_i})(X_{t^n_{i+1}} - X_{t^n_i}) + \frac{1}{2}\sum_i g^n_i(\omega)(X_{t^n_{i+1}} - X_{t^n_i})^2$$
where $g^n_i(\omega) = f''(c(X_{t^n_i}, X_{t^n_{i+1}}))$.
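Comparing (7.4) with (7.1), it becomes clear that we want to show
(7.5) $$\sum_i f'(X_{t^n_i})(X_{t^n_{i+1}} - X_{t^n_i}) \to \int_0^t f'(X_s)\,dX_s$$
(7.6) $$\frac{1}{2}\sum_i g^n_i(\omega)(X_{t^n_{i+1}} - X_{t^n_i})^2 \to \frac{1}{2}\int_0^t f''(X_s)\,d\langle X\rangle_s$$
in probability as $n \to \infty$. (7.5) follows from (6.8).
To prove (7.6), we let $G^n_s = g^n_i(\omega) = f''(c(X_{t^n_i}, X_{t^n_{i+1}}))$ when $s \in (t^n_i, t^n_{i+1}]$,
$G^n_s = f''(X_t)$ for s > t and let
$$A^n_s = \sum_{t^n_{i+1} \le s} (X_{t^n_{i+1}} - X_{t^n_i})^2$$
so that
$$\sum_i g^n_i(\omega)(X_{t^n_{i+1}} - X_{t^n_i})^2 = \int_0^t G^n_s\,dA^n_s$$
and what we want to show is
(7.7) $$\int_0^t G^n_s\,dA^n_s \to \int_0^t f''(X_s)\,d\langle X\rangle_s$$
To do this we begin by observing that the uniform continuity of $f''$ implies that
as $n \to \infty$ we have $G^n_s \to f''(X_{s\wedge t})$ uniformly in s, while (3.8) implies that $A^n_s$
converges in probability to $\langle X\rangle_s$. Now by taking subsequences we can suppose
that with probability 1, $A^n_{s\wedge t}$ converges weakly to $\langle X\rangle_{s\wedge t}$. In other words, if
we fix ω and regard $s \to A^n_{s\wedge t}$ and $s \to \langle X\rangle_{s\wedge t}$ as distribution functions, then
the associated measures converge weakly. Having done this, we can fix ω and
deduce (7.7) from the following simple result: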
(7.8) Lemma. If (i) measures $\mu_n$ on [0,t] converge weakly to $\mu_\infty$, a finite
measure, and (ii) $g_n$ is a sequence of functions with $|g_n| \le M$ that have the
property that whenever $s_n \in [0,t] \to s$ we have $g_n(s_n) \to g(s)$, then as $n \to \infty$
$$\int g_n\,d\mu_n \to \int g\,d\mu_\infty$$
Proof By letting $\mu'_n(A) = \mu_n(A)/\mu_n([0,t])$, we can assume that all the $\mu_n$
are probability measures. A standard construction (see (2.1) in Chapter 2 of
Durrett (1995)) shows that there is a sequence of random variables $X_n$ with
distribution $\mu_n$ so that $X_n \to X_\infty$ a.s. as $n \to \infty$. The convergence of $g_n$ to g
implies $g_n(X_n) \to g(X_\infty)$, so the result follows from the bounded convergence
theorem. □
(7.8) is the last piece in the proof of (7.1). Tracing back through the proof,
we see that (7.8) implies (7.7), which in turn completes the proof of (7.6). So
adding (7.5) and using (7.4) gives that for each t,
$$f(X_t) - f(X_0) = \int_0^t f'(X_s)\,dX_s + \frac{1}{2}\int_0^t f''(X_s)\,d\langle X\rangle_s \quad \text{a.s.}$$
Since each side of the formula is a continuous function of t, it follows that with
probability 1 the equality holds for all $t \ge 0$, the statement made in (7.1). □
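The decomposition (7.4) into a stochastic-integral sum and a second-order sum can be tried out numerically. The sketch below is an illustration under assumptions that are not in the text: X is a Brownian path simulated on a uniform grid (so $d\langle X\rangle_s = ds$), $f(x) = x^3$, and the second-order sum is replaced by its limit form $\frac{1}{2}\sum f''(X_{t_i})\,\Delta t$. The function name `ito_check` is hypothetical.

```python
import math
import random


def ito_check(n, t, seed):
    """Compare f(B_t) - f(B_0) with a discretized right side of (7.1) for f(x) = x**3."""
    rng = random.Random(seed)
    dt = t / n
    sd = math.sqrt(dt)
    b = 0.0
    stochastic_part = 0.0  # sum of f'(B_{t_i}) (B_{t_{i+1}} - B_{t_i}), with f'(x) = 3x^2
    correction = 0.0       # (1/2) sum of f''(B_{t_i}) dt, with f''(x) = 6x
    for _ in range(n):
        db = rng.gauss(0.0, sd)
        stochastic_part += 3.0 * b * b * db
        correction += 0.5 * 6.0 * b * dt
        b += db
    return b ** 3, stochastic_part, correction


lhs, stochastic_part, correction = ito_check(n=20000, t=1.0, seed=3)
print(lhs, stochastic_part + correction)
```

The two printed numbers should be close; the residual is of order $n^{-1/2}$ for this choice of f. Dropping `correction` generally leaves a visible discrepancy, which is the $f''$ term that ordinary calculus misses.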
(7.9) Remark. In Chapter 3 we will need to use Ito's formula for complex
valued f. The extension is trivial. Write f = u + iv, apply Ito's formula to u
and v, multiply the v formula by i and add the two.
Exercise 7.1. Let $A_t$ be a continuous process with total variation $|A|_t \le M$
for all t. If $K^n$ is a sequence of measurable processes with $|K^n(s,\omega)| \le M$ for
all s, ω and $\sup_s |K^n_s - K_s| \to 0$ then $\sup_t |((K^n - K)\cdot A)_t| \to 0$.
Exercise 7.2. Use the fact that $f(b) - f(a) = f'(c(a,b))(b-a)$ for some c
between a and b, and Exercise 7.1 to prove (7.2).
2.8. Integration w.r.t. Semimartingales
X is said to be a continuous semimartingale if Xt can be written as Mt + At,
where Mt is a continuous local martingale and At is a continuous adapted
process that is locally of bounded variation. A nice feature of continuous
semimartingales that is unheard of for their more general counterparts is
(8.1) Theorem. Let Xt be a continuous semimartingale. If the (continuous)
processes $M_t$ and $A_t$ are chosen so that $A_0 = 0$, then the decomposition
$X_t = M_t + A_t$ is unique.
Proof If $M'_t + A'_t$ is another decomposition, then $A_t - A'_t = M'_t - M_t$ is a
continuous local martingale and locally of bounded variation, so by (3.3) $A_t - A'_t$
is constant and hence $\equiv 0$. □
Remark. In what follows we will sometimes write "Let Xt = Mt + At be a
continuous semimartingale" to specify that the decomposition of Xt consists of
the local martingale Mt and the process At that is locally of bounded variation.
Exercise 8.1. Show that if $X_t = M_t + A_t$ and $X'_t = M'_t + A'_t$ are
continuous semimartingales then $X_t + X'_t = (M_t + M'_t) + (A_t + A'_t)$ is a continuous
semimartingale.
In this section we will extend the class of integrators for our stochastic
integral from continuous local martingales to continuous semimartingales. There
are three reasons for doing this.
(i) If X is a continuous local martingale and f is $C^2$ then Ito's formula shows
us that $f(X_t)$ is always a semimartingale but it is not a local martingale unless
$f''(x) = 0$ for all x. In the next section we will prove a generalization of Ito's
formula which implies that if X is a continuous semimartingale and f is $C^2$
then $f(X_t)$ is again a semimartingale.
(ii) It can be argued that any "reasonable integrator" is a semimartingale. To
explain this, we begin by defining an easy integrand to be a process of the
form
$$H = \sum_{i=0}^n H_i 1_{(T_i,T_{i+1}]}$$
where $0 = T_0 \le T_1 \le \cdots \le T_{n+1}$ are stopping times, and the $H_i \in \mathcal{F}_{T_i}$ have
$|H_i| < \infty$ a.s. Let $b\Pi_{e,t}$ be the collection of bounded easy predictable processes
that vanish on $(t,\infty)$ equipped with the uniform norm
$$\|H\|_u = \sup_{s,\omega} |H_s(\omega)|$$
For easy integrands we define the integral as
$$(H\cdot X) = \sum_{i=0}^n H_i (X_{T_{i+1}} - X_{T_i})$$
Finally, let $L^0$ be the collection of all random variables topologized by
convergence in probability, which comes from the metric $\|X\|_0 = E(|X|/(1 + |X|))$.
See Exercise 6.4 in Chapter 1 of Durrett (1995).
A result proved independently by Bichteler and Dellacherie states
(8.2) Theorem. If $H \to (H\cdot X)$ is continuous from $b\Pi_{e,t} \to L^0$ for all t then
X is a semimartingale.
Since any extension of the integral from $b\Pi_{e,t}$ will involve taking limits, we
need our integral to be a continuous function on the simple integrands (in some
sense). We have placed a very strong topology on the domain and a weak
topology on the range, so this is a fairly weak requirement. Protter (1990)
takes (8.2) as the definition of semimartingale. In his approach one gets some
very deep results without much effort but then one must sweat to prove that a
semimartingale (defined as a good integrator) is a semimartingale (as defined
above).
(iii) Last but not least, it is easy to extend the integral from local martingales
to semimartingales if we replace $\Pi_3(X)$ by a slightly smaller class of integrands
that does not depend on X.
Getting back to business, let $X_t = M_t + A_t$ be a continuous semimartingale.
We say $H \in lb\Pi$ = the set of locally bounded predictable processes, if
there is a sequence of stopping times $T_n \uparrow \infty$ so that $|H(s,\omega)| \le n$ for $s \le T_n$.
If $H \in lb\Pi$ we can define
$$(H\cdot A)_t(\omega) = \int_0^t H_s(\omega)\,dA_s(\omega)$$
as a Lebesgue-Stieltjes integral (which exists for a.e. ω). To integrate with
respect to the local martingale $M_t$ we note that
$$\int_0^{T_n} H_s^2\,d\langle M\rangle_s \le n^2 \langle M\rangle_{T_n} < \infty$$
and $T_n \to \infty$. So $lb\Pi \subset \Pi_3(M)$, we can define $(H\cdot M)_t$, and let
$$(H\cdot X)_t = (H\cdot M)_t + (H\cdot A)_t$$
since by the uniqueness of the decomposition this is an unambiguous definition.
From the definition above, it follows immediately that we have
(8.3) Theorem. If X is a continuous semimartingale and $H \in lb\Pi$, then $(H\cdot X)_t$
is a continuous semimartingale.
The second of our abc's is also easy. We name it in honor of the similar
relationships between addition and multiplication.
(8.4) Distributive Laws. Suppose X and Y are continuous semimartingales,
and $H, K \in lb\Pi$. Then
$$((H+K)\cdot X)_t = (H\cdot X)_t + (K\cdot X)_t$$
$$(H\cdot(X+Y))_t = (H\cdot X)_t + (H\cdot Y)_t$$
Proof This follows easily from the definition of the integral with respect to
a semimartingale, the result for local martingales given in (6.4), and the fact
that the results are true when X and Y are locally of bounded variation. In
proving the second result we also need Exercise 8.1. D
The rest of this section is devoted to defining the covariance for semimartingales
and proving the third of our abc's.
(8.5) Definition. If X = M + A and X' = M' + A' are continuous semimartingales
we define the covariance $\langle X,X'\rangle_t = \langle M,M'\rangle_t$.
To explain this we recall the approximate quadratic variation $Q^\Delta_t(X)$ defined
in Section 2.3, and note
(8.6) Theorem. Suppose X = M + A and X' = M' + A' are continuous
semimartingales. If $\Delta_n$ is a sequence of partitions of [0,t] with mesh $|\Delta_n| \to 0$
then
$$Q^{\Delta_n}_t(X,X') \to \langle M,M'\rangle_t \quad \text{in probability}$$
Proof Since
$$Q^{\Delta_n}_t(X,X') = Q^{\Delta_n}_t(M,M') + Q^{\Delta_n}_t(A,M') + Q^{\Delta_n}_t(X,A')$$
it suffices to show that if $Y_t$ is a continuous process and $V_t$ is continuous and
locally of bounded variation then $Q^{\Delta_n}_t(Y,V) \to 0$ almost surely. To do this we
note that
$$|Q^{\Delta_n}_t(Y,V)| \le |V|_t \sup_i |Y_{t^n_{i+1}} - Y_{t^n_i}|$$
As $n \to \infty$ the second term → 0 almost surely, while the first one has $|V|_t < \infty$
and the desired result follows. □
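The content of (8.6) with X = X' is that the bounded variation part does not contribute to the approximate quadratic variation. The sketch below illustrates this on a grid; it is an assumption-laden example, not the book's argument: M is a simulated Brownian path, A is the linear drift $A_s = \text{drift}\cdot s$, and the helper names are hypothetical. On a uniform grid the difference $Q(X) - Q(M)$ equals $2\,\text{drift}\,\Delta t\,(M_t - M_0) + \text{drift}^2\,t\,\Delta t$, which vanishes as the mesh shrinks.

```python
import math
import random


def approx_quadratic_variation(path):
    """Q(X): the sum of squared increments of the path along its grid."""
    return sum((x1 - x0) ** 2 for x0, x1 in zip(path, path[1:]))


def martingale_and_semimartingale(n, t, drift, seed):
    """Simulate M (Brownian path) and X = M + A with A_s = drift * s on n uniform steps."""
    rng = random.Random(seed)
    dt = t / n
    sd = math.sqrt(dt)
    m_path = [0.0]
    x_path = [0.0]
    for i in range(1, n + 1):
        m_path.append(m_path[-1] + rng.gauss(0.0, sd))
        x_path.append(m_path[-1] + drift * i * dt)
    return m_path, x_path


m_path, x_path = martingale_and_semimartingale(n=10000, t=1.0, drift=3.0, seed=4)
q_m = approx_quadratic_variation(m_path)
q_x = approx_quadratic_variation(x_path)
print(q_m, q_x)
```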
(8.7) Theorem. Suppose $X^i$ and $Y^j$ are continuous semimartingales, $H^i, K^j \in lb\Pi$,
$X = \sum_{i=1}^m H^i\cdot X^i$, $Y = \sum_{j=1}^n K^j\cdot Y^j$, then
$$\langle X,Y\rangle_t = \sum_{i,j} \int_0^t H^i_s K^j_s\,d\langle X^i,Y^j\rangle_s$$
Proof Let $M^i$ and $N^j$ be the local martingale parts of $X^i$ and $Y^j$, let
$M = \sum_{i=1}^m H^i\cdot M^i$, and $N = \sum_{j=1}^n K^j\cdot N^j$. It follows from our definition that
$\langle X,Y\rangle_t = \langle M,N\rangle_t$, so using (6.6) and then $\langle M^i,N^j\rangle_t = \langle X^i,Y^j\rangle_t$ gives the
desired result. □
2.9. Associative Law
Let $X_t$ be a continuous semimartingale. In some computations, it is useful to
write the integral relationship
$$Y_t = \int_0^t K_s \, dX_s$$
as the formal equation
$$dY_t = K_t \, dX_t$$
where the $dY_t$ and $dX_t$ are fictitious objects known as "stochastic differentials."
A good example is the derivation of the following formula, which (for obvious
reasons) we call the associative law:
$$(*) \qquad H \cdot (K \cdot X) = (HK) \cdot X$$
Proof using stochastic differentials. $d(H \cdot Y)_t = H_t \, dY_t$. Letting $Y_t =
(K \cdot X)_t$ and observing $dY_t = K_t \, dX_t$ gives
$$d(H \cdot (K \cdot X))_t = H_t \, d(K \cdot X)_t = H_t K_t \, dX_t = d((HK) \cdot X)_t$$ □
The above proof is not rigorous, but the computation is useful because it
tells us what the answer should be. Once we know the answer, it is routine
to verify by checking that it holds for basic predictable processes and then
following the extension process we used for defining the integral to conclude
that it holds in general.
(9.1) Lemma. Let X be any process. If $H, K \in \Pi_1$, then (*) holds.
Proof Since the formula is linear in H and in K separately, we can without
loss of generality suppose $H = 1_{(a,b]} C$ and $K = 1_{(c,d]} D$ and further that either
(i) $b \le c$ or (ii) $a = c$, $b = d$. In case (i), both sides of the equation are 0 and
hence equal. In case (ii),
$$(K \cdot X)_t = \begin{cases} 0 & 0 \le t \le a \\ D(X_t - X_a) & a \le t \le b \\ D(X_b - X_a) & b \le t < \infty \end{cases}$$
so
$$(H \cdot (K \cdot X))_t = \begin{cases} 0 & 0 \le t \le a \\ CD(X_t - X_a) & a \le t \le b \\ CD(X_b - X_a) & b \le t < \infty \end{cases}$$
and it follows that $(H \cdot (K \cdot X))_t = ((HK) \cdot X)_t$. □
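For simple integrands the associative law is pure algebra, which a discrete-time check makes visible. This is a sketch under stated assumptions: the integral is replaced by the left-endpoint sum $\sum_i H_i (X_{i+1} - X_i)$, the path and integrands are randomly generated stand-ins, and the helper `integrate` is a hypothetical name, not something from the text.

```python
import random

def integrate(h, x):
    """Discrete (H . X): running sums of h[i] * (x[i+1] - x[i])."""
    out = [0.0]
    for i in range(len(x) - 1):
        out.append(out[-1] + h[i] * (x[i + 1] - x[i]))
    return out

rng = random.Random(1)
n = 1000
x = [0.0]
for _ in range(n):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
h = [rng.uniform(-1.0, 1.0) for _ in range(n)]
k = [rng.uniform(-1.0, 1.0) for _ in range(n)]

lhs = integrate(h, integrate(k, x))                     # H . (K . X)
rhs = integrate([hi * ki for hi, ki in zip(h, k)], x)   # (HK) . X
max_gap = max(abs(u - v) for u, v in zip(lhs, rhs))
print(max_gap)   # zero up to floating point rounding
```

The two sides agree path by path because the increment of $K \cdot X$ over a step is $K_i$ times the increment of X, which is the discrete form of $d(K \cdot X)_t = K_t \, dX_t$.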
To extend to more general integrands, we will take limits. First, however,
we need a technical result.
(9.2) Lemma. Suppose $\mu$ is a signed measure with finite total variation, h and
k are bounded, and let $\nu([0,t]) = \int_0^t k_s \, \mu(ds)$. Then
$$\int_0^t h_s \, \nu(ds) = \int_0^t h_s k_s \, \mu(ds)$$
Proof of (9.2) The Radon-Nikodym derivative $d\nu/d\mu = k_s$, so the desired
identity is just the well known fact that
$$\int h_s \, \nu(ds) = \int h_s \frac{d\nu}{d\mu} \, \mu(ds)$$
See for example Exercise 8.7 in the Appendix of Durrett (1995). □
To simplify the extension to $\Pi_2(X)$, we will take the limit for K and then
for H.
(9.3) Lemma. Let X be a continuous local martingale. If $H \in b\Pi_1$ and
$K \in \Pi_2(X)$, then (*) holds.
Proof Let $K^n \in b\Pi_1$ such that $K^n \to K$ in $\Pi_2(X)$. Since H is bounded,
$HK^n \to HK$ in $\Pi_2(X)$, and it follows from Exercise 6.2 that $(HK^n) \cdot X \to (HK) \cdot X$
in $\mathcal{M}^2$. To deal with the left side of (*), we observe that using (6.4)
twice, Exercise 6.2, (6.5) with (9.2), and the fact that $H_s$ is bounded we have
$$\begin{aligned}
\| H \cdot (K \cdot X) - H \cdot (K^n \cdot X) \|_2^2 &= \| H \cdot ((K - K^n) \cdot X) \|_2^2 \\
&= E \int_0^\infty H_s^2 \, d\langle (K - K^n) \cdot X \rangle_s \\
&= E \int_0^\infty H_s^2 (K_s - K^n_s)^2 \, d\langle X \rangle_s \\
&\le C \| K - K^n \|_X^2
\end{aligned}$$
so as $n \to \infty$, $H \cdot (K^n \cdot X) \to H \cdot (K \cdot X)$ in $\mathcal{M}^2$ and the result follows. □
(9.4) Lemma. Let X be a continuous local martingale. If $H \in \Pi_2(K \cdot X)$ and
$K \in \Pi_2(X)$, then (*) holds.
Proof Let $H^n \in b\Pi_1$ such that $H^n \to H$ in $\Pi_2(K \cdot X)$. (6.4) and Exercise
6.2 imply
$$\| H^n \cdot (K \cdot X) - H \cdot (K \cdot X) \|_2 = \| (H^n - H) \cdot (K \cdot X) \|_2 = \| H^n - H \|_{K \cdot X}$$
so $H^n \cdot (K \cdot X) \to H \cdot (K \cdot X)$ in $\mathcal{M}^2$. To deal with the right side of (*), we
observe that using the definition of the norm then (6.5) with (9.2)
$$\| H^n K - HK \|_X^2 = E \int_0^\infty (H^n_s - H_s)^2 K_s^2 \, d\langle X \rangle_s = \| H^n - H \|_{K \cdot X}^2 \to 0$$
so applying Exercise 6.2 we have $H^n K \cdot X \to HK \cdot X$ in $\mathcal{M}^2$. □
(9.5) Lemma. Let X be a continuous local martingale. If $K \in \Pi_3(X)$ and
$H \in \Pi_3(K \cdot X)$ then (*) holds.
Proof Let $T_n = \inf\{t : \int_0^t K_s^2 \, d\langle X \rangle_s \text{ or } \int_0^t H_s^2 \, d\langle K \cdot X \rangle_s > n\}$. Let
$H^{T_n}_s = H_s 1_{(s \le T_n)}$ and $K^{T_n}_s = K_s 1_{(s \le T_n)}$. (9.4) implies that
$$H^{T_n} \cdot (K^{T_n} \cdot X) = (H^{T_n} K^{T_n}) \cdot X$$
Using (6.2) now it follows that $(H \cdot (K \cdot X))_t = (HK \cdot X)_t$ for $t \le T_n$, and
letting $n \to \infty$ gives the desired result. □
(9.6) Associative Law. Suppose X is a continuous semimartingale. If $H, K \in \ell b\Pi$
then $H \cdot (K \cdot X) = HK \cdot X$.
Proof In view of (9.5) and the definition of the integral it suffices to prove the
result when $X_t = A_t$ is continuous adapted and locally of bounded variation.
However, by stopping the first time $|H_s|$, $|K_s|$ or the variation of A on $[0,s]$
exceeds n the statement reduces to something covered by (9.2). □
2.10. Functions of Several Semimartingales
In this section, we will prove a version of Ito's formula for functions of several
semimartingales. The key to the proof is the following
(10.1) Integration by Parts. If X and Y are continuous semimartingales
then
$$X_t Y_t - X_0 Y_0 = \int_0^t Y_s \, dX_s + \int_0^t X_s \, dY_s + \langle X, Y \rangle_t$$
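Proof Let $\Delta_n = \{t^n_i\}$ be a sequence of subdivisions of $[0,t]$ with mesh $|\Delta_n|$
going to 0. We can write
$$\begin{aligned}
X_t Y_t - X_0 Y_0 = {} & \sum_i (X_{t^n_{i+1}} - X_{t^n_i})(Y_{t^n_{i+1}} - Y_{t^n_i}) \\
& + \sum_i (X_{t^n_{i+1}} - X_{t^n_i}) Y_{t^n_i} + \sum_i X_{t^n_i} (Y_{t^n_{i+1}} - Y_{t^n_i})
\end{aligned}$$
(8.6) implies that
$$\sum_i (X_{t^n_{i+1}} - X_{t^n_i})(Y_{t^n_{i+1}} - Y_{t^n_i}) \to \langle X, Y \rangle_t$$
in probability, while application of (6.7) to each of the last two sums implies
that they converge to $\int_0^t Y_s \, dX_s$ and to $\int_0^t X_s \, dY_s$. □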
When $Y_t$ is locally of bounded variation then $\langle X, Y \rangle_t = 0$ and this reduces
to the ordinary integration by parts formula of Lebesgue-Stieltjes integration:
$$\int_0^t Y_s \, dX_s = X_t Y_t - X_0 Y_0 - \int_0^t X_s \, dY_s$$
Note that the right hand side makes sense in a path-by-path sense.
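The decomposition used in the proof of (10.1) is an exact identity on any partition, before any limit is taken. The sketch below checks it on two simulated paths; the Gaussian random-walk paths and all variable names are illustrative assumptions, not objects from the text.

```python
import random

rng = random.Random(2)
n = 1000
x, y = [0.5], [-1.0]
for _ in range(n):
    x.append(x[-1] + rng.gauss(0.0, 0.1))
    y.append(y[-1] + rng.gauss(0.0, 0.1))

dx = [x[i + 1] - x[i] for i in range(n)]
dy = [y[i + 1] - y[i] for i in range(n)]

int_y_dx = sum(y[i] * dx[i] for i in range(n))    # left-endpoint sum for int Y dX
int_x_dy = sum(x[i] * dy[i] for i in range(n))    # left-endpoint sum for int X dY
cov_term = sum(dx[i] * dy[i] for i in range(n))   # approximates <X, Y>_t

lhs = x[-1] * y[-1] - x[0] * y[0]
rhs = int_y_dx + int_x_dy + cov_term
print(lhs, rhs)
```

Dropping `cov_term` breaks the equality, which is the discrete counterpart of the extra term $\langle X, Y \rangle_t$ that distinguishes (10.1) from the Lebesgue-Stieltjes formula.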
We are now ready to prove the following generalization of (7.1):
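(10.2) Ito's formula. Let $f : \mathbf{R}^d \to \mathbf{R}$ and $0 \le c \le d$. If $X^1_t, \ldots, X^c_t$ are
continuous semimartingales and $X^{c+1}_t, \ldots, X^d_t$ are locally of bounded variation
then
$$f(X_t) - f(X_0) = \sum_{i=1}^d \int_0^t D_i f(X_s) \, dX^i_s + \frac12 \sum_{1 \le i,j \le c} \int_0^t D_{ij} f(X_s) \, d\langle X^i, X^j \rangle_s$$
provided the derivatives $D_j f$, $1 \le j \le d$ and $D_{ij} f$, $1 \le i, j \le c$ exist and are
continuous.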
Remark. At this point saving a derivative or two may seem like pinching
pennies but when we get to Chapter 4 and apply this result to $u(t, B_t)$ we will
be very happy to not have to check that the partial derivatives $u_{tt}$ and $u_{t x_i}$
exist.
Proof By stopping, it suffices to prove the result when $|X^i_t|, \langle X^i \rangle_t \le M$ for
all i, t. Since any function f satisfying the hypotheses can be approximated by
polynomials $g_n$ in such a way that $g_n$, $D_i g_n$, $1 \le i \le d$ and $D_{ij} g_n$, $1 \le i, j \le c$
converge to f, $D_i f$, and $D_{ij} f$ uniformly on $[-M, M]^d$, it suffices to prove the
result when f is a polynomial. For then (6.8) and Exercise 7.2 allow us to pass
from the result for $g_n$ to that for f.
To prove the result for a polynomial, it suffices by linearity to prove the
result when f is a monomial $x^{k_1} x^{k_2} \cdots x^{k_n}$ where $k_1, k_2, \ldots, k_n \in \{1, \ldots, d\}$.
(The reader should note that $k_1, \ldots, k_n$ are superscripts, not powers, e.g., our
monomial might be $x^1 x^1 x^1 x^1 x^2$.)
If n = 1 and $k_1 = k$, then (10.2) says that
$$X^k_t - X^k_0 = \int_0^t 1 \, dX^k_s$$
which is trivially true. To prove the result for a general monomial, we use
induction. Let $Y_t = \prod_{m=1}^n X^{k(m)}_t$ be a monomial for which (10.2) holds, and
let $Z_t = X^{k(n+1)}_t$. Applying (10.1) gives
$$Y_t Z_t - Y_0 Z_0 = \int_0^t Z_s \, dY_s + \int_0^t Y_s \, dZ_s + \langle Y, Z \rangle_t$$
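Applying (10.2) to Y gives
$$(10.3) \qquad Y_t - Y_0 = \sum_{i=1}^n \int_0^t \Big( \prod_{m \le n,\, m \ne i} X^{k(m)}_s \Big) dX^{k(i)}_s + \frac12 \sum_{1 \le i \ne j \le n} \int_0^t \Big( \prod_{m \le n,\, m \ne i,j} X^{k(m)}_s \Big) d\langle X^{k(i)}, X^{k(j)} \rangle_s$$
so using the associative law (9.6),
$$\int_0^t Z_s \, dY_s = \sum_{i=1}^n \int_0^t \Big( \prod_{m \le n+1,\, m \ne i} X^{k(m)}_s \Big) dX^{k(i)}_s + \frac12 \sum_{1 \le i \ne j \le n} \int_0^t \Big( \prod_{m \le n+1,\, m \ne i,j} X^{k(m)}_s \Big) d\langle X^{k(i)}, X^{k(j)} \rangle_s$$
By definition,
$$\int_0^t Y_s \, dZ_s = \int_0^t \Big( \prod_{m=1}^n X^{k(m)}_s \Big) dX^{k(n+1)}_s$$
To evaluate the third term $\langle Y, Z \rangle_t$, we observe that by (10.3) and the formula
for the covariance of stochastic integrals, (8.7),
$$\langle Y, Z \rangle_t = \sum_{i=1}^n \int_0^t \Big( \prod_{m \le n,\, m \ne i} X^{k(m)}_s \Big) d\langle X^{k(i)}, X^{k(n+1)} \rangle_s$$
Adding the last three equalities gives
$$Y_t Z_t - Y_0 Z_0 = \sum_{i=1}^{n+1} \int_0^t \Big( \prod_{m \le n+1,\, m \ne i} X^{k(m)}_s \Big) dX^{k(i)}_s + \frac12 \sum_{1 \le i \ne j \le n+1} \int_0^t \Big( \prod_{m \le n+1,\, m \ne i,j} X^{k(m)}_s \Big) d\langle X^{k(i)}, X^{k(j)} \rangle_s$$
(Notice that for each i in the sum for $\langle Y, Z \rangle_t$, there are two terms, $(i, n+1)$
and $(n+1, i)$, in the last sum.) The proof is complete. □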
Chapter Summary
With the completion of the proof of the multivariate form of Ito's formula, we
have enough stochastic calculus to carry us through most of the applications in
the book, so we pause for a moment to recall what we've just done.
In Section 2.1, we introduced the integrands Hs, the predictable processes
and in Section 2.2 the (first) integrators Xt, the continuous local martingales.
In Section 2.3 we defined the variance process associated with a local
martingale, and more generally the covariance $\langle X, Y \rangle_t$ of two local martingales.
Theorem (3.11) tells us that the covariance could have been defined as the
unique continuous predictable process $A_t$ that is locally of bounded variation,
has $A_0 = 0$, and makes $X_t Y_t - A_t$ a martingale. The next result gives a
pathwise interpretation of $\langle X, Y \rangle_t$:
(3.10) Quadratic Variation. Suppose X and Y are continuous local
martingales. If $\Delta_n$ is a sequence of partitions of $[0, t]$ with mesh $|\Delta_n| \to 0$ then
$$\sum_k (X_{t_k} - X_{t_{k-1}})(Y_{t_k} - Y_{t_{k-1}}) \to \langle X, Y \rangle_t \quad \text{in probability}$$
Taking X = Y we see that $\langle X \rangle_t$ is the "quadratic variation" of the path up to
time t.
In Section 2.4-2.5, we defined the integral, beginning with bounded
continuous martingales and considering a sequence of progressively more general
integrands.
$\Pi_0$       $1_{(a,b]}(s) \, C(\omega)$ with $C \in \mathcal{F}_a$
$\Pi_1$       finite sum of elements of $\Pi_0$
$\Pi_2(X)$    H with $E \int_0^\infty H_s^2 \, d\langle X \rangle_s < \infty$
$\Pi_3(X)$    H with $\int_0^t H_s^2 \, d\langle X \rangle_s < \infty$ a.s. for each t
A key role in this development was played by the
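(5.1) Kunita-Watanabe Inequality. If X and Y are local martingales and
H and K are two measurable processes, then almost surely
$$\int_0^\infty |H_s K_s| \, d|\langle X, Y \rangle|_s \le \Big( \int_0^\infty H_s^2 \, d\langle X \rangle_s \Big)^{1/2} \Big( \int_0^\infty K_s^2 \, d\langle Y \rangle_s \Big)^{1/2}$$
where $d|\langle X, Y \rangle|_s$ stands for $dV_s$, where $V_s$ is the total variation of $r \to \langle X, Y \rangle_r$
on $[0,s]$.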
In Section 2.6, we generalized the integrators $X_t$ to continuous local
martingales and, in Section 2.8, to continuous semimartingales, which are the
sum (in a unique way) of a continuous local martingale $M_t$ and a continuous
process $A_t$ that is locally of bounded variation. Further, we defined the
covariance of two semimartingales $X_t = M_t + A_t$ and $X'_t = M'_t + A'_t$ to be that of
their martingale parts, i.e., $\langle M, M' \rangle_t$, and generalized the quadratic variation
interpretation in (3.10), see (8.6).
To be able to integrate with respect to semimartingales we had to restrict
the integrands to the class of locally bounded predictable processes, $\ell b\Pi$, but
in this generality the integral has many nice properties.
(8.3) Closure under Integration. If X is a continuous semimartingale and
$H \in \ell b\Pi$, then $(H \cdot X)_t$ is a continuous semimartingale.
(8.4) Distributive Laws. Let X and Y be continuous semimartingales, and
$H, K \in \ell b\Pi$.
$$((H + K) \cdot X)_t = (H \cdot X)_t + (K \cdot X)_t$$
$$(H \cdot (X + Y))_t = (H \cdot X)_t + (H \cdot Y)_t$$
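(8.7) Covariance of Stochastic Integrals. Suppose $X^i$ and $Y^j$ are
continuous semimartingales, $H^i, K^j \in \ell b\Pi$, $X = \sum_{i=1}^m H^i \cdot X^i$, and $Y = \sum_{j=1}^n K^j \cdot Y^j$;
then
$$\langle X, Y \rangle_t = \sum_{i,j} \int_0^t H^i_s K^j_s \, d\langle X^i, Y^j \rangle_s$$
(9.6) Associative Law. Let X be a continuous semimartingale. If $H, K \in \ell b\Pi$
then $H \cdot (K \cdot X) = HK \cdot X$.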
As the reader probably remembers, the first three results were only obtained
after several improvements
Integrands in            $\Pi_1$    $\Pi_2$           $\Pi_3$         $\ell b\Pi$
a. Closure               (4.2.a)    (4.3.a)           (6.3)           (8.3)
b. Distributive Laws     (4.2.b)    (4.3.b), (5.3)    (6.4)           (8.4)
c. Covariance Formula    (4.2.c)    (5.4)             (6.5), (6.6)    (8.6), (8.7)
In Section 2.7, we proved a change of variables formula. This led to:
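(10.1) Integration by Parts. Let X and Y be continuous semimartingales.
$$X_t Y_t - X_0 Y_0 = \int_0^t Y_s \, dX_s + \int_0^t X_s \, dY_s + \langle X, Y \rangle_t$$
which in turn led to a bigger and better change of variables formula.
(10.2) Ito's formula. Let $f : \mathbf{R}^d \to \mathbf{R}$ and $0 \le c \le d$. If $X^1_t, \ldots, X^c_t$ are
continuous semimartingales and $X^{c+1}_t, \ldots, X^d_t$ are locally of bounded variation
then
$$f(X_t) - f(X_0) = \sum_{i=1}^d \int_0^t D_i f(X_s) \, dX^i_s + \frac12 \sum_{1 \le i,j \le c} \int_0^t D_{ij} f(X_s) \, d\langle X^i, X^j \rangle_s$$
provided the derivatives $D_j f$, $1 \le j \le d$ and $D_{ij} f$, $1 \le i, j \le c$ exist and are
continuous.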
Coming Attractions
Section 2.11 is devoted to a proof of the Meyer-Tanaka formula, which
extends the version of Ito's formula given in (7.1) to f that are a difference of
two convex functions. This leads in a nice way to the definition of an occupation
time density (or "local time") $L^a_t$ for a continuous semimartingale that is jointly
continuous in t and a.
The developments described in the previous paragraph are used only in
(9.3) of Chapter 4 and in Example 2.1 of Chapter 5. The material in Section
2.12 concerning Girsanov's formula is not used until Section 5.5. We would like
to suggest that the reader proceed now to the beginning of Chapter 3.
2.11. The Meyer-Tanaka Formula, Local Time
In this section we will prove an extension of Ito's formula due to Meyer (1976)
that has its roots in work of Tanaka (1963). See (11.4). Loosely speaking
it says that Ito's formula is valid if f is a difference of two convex functions.
Since such functions are differentiable except at a countable set and have a
second derivative that is a nice signed measure, this is a modest generalization.
However, this is the best one can do in general: if $B_t$ is a standard Brownian
motion and $X_t = f(B_t)$ is a semimartingale then f must be the difference of
two convex functions. See Cinlar, Jacod, Protter, and Sharpe (1980). The
developments in this section follow those in Sections 4 and 5 of Chapter IV of
Protter (1990) but things are simpler here since we have no jumps. Our first
step is
(11.1) Theorem. Let $f : \mathbf{R} \to \mathbf{R}$ be convex and let X be a continuous
semimartingale. Then f(X) is a semimartingale and
$$f(X_t) - f(X_0) = \int_0^t f'(X_s) \, dX_s + K_t$$
Here $f'(x) = \lim_{h \downarrow 0} (f(x) - f(x - h))/h$ is the left derivative, which exists at all
x by convexity, and $K_t$ is a continuous adapted increasing process.
Proof Let $X_t = M_t + A_t$ be the decomposition of the semimartingale. By
stopping we can suppose without loss of generality that $|X_t| \le N$, $|A|_t \le N$ and
$\langle M \rangle_t \le N$ for all t. Let g be a $C^\infty$ function with compact support in $(-\infty, 0]$
and having $\int g(s) \, ds = 1$. Let
$$(a) \qquad f_n(x) = \int f(x + y/n) \, g(y) \, dy$$
Then $f_n$ is convex and $C^\infty$. Using Ito's formula we have
$$(b) \qquad f_n(X_t) - f_n(X_0) = \int_0^t f_n'(X_s) \, dX_s + \frac12 \int_0^t f_n''(X_s) \, d\langle X \rangle_s$$
Our strategy for the proof will be to let $n \to \infty$ in (b) to get the desired
formula. Since a convex function is Lipschitz continuous on each bounded
interval, it is easy to see that $f_n(x) \to f(x)$ uniformly on compact sets. Since
$|X_t| \le N$, it follows that
$$(c) \qquad f_n(X_t) \to f(X_t) \quad \text{uniformly in } t$$
To deal with the first term on the right, we differentiate (a) to get
$$(d) \qquad f_n'(x) = \int f'(x + y/n) \, g(y) \, dy \uparrow f'(x)$$
as $n \to \infty$. Now if $I^n_t = \int_0^t f_n'(X_s) \, dM_s$ and $I_t = \int_0^t f'(X_s) \, dM_s$ then the
$L^2$ maximal inequality for martingales and the isometry property of stochastic
integrals (Exercise 6.2), imply
$$E \Big( \sup_t (I^n_t - I_t)^2 \Big) \le 4 \sup_t E (I^n_t - I_t)^2 = 4 E \int_0^\infty (f_n'(X_s) - f'(X_s))^2 \, d\langle M \rangle_s \to 0$$
by the bounded convergence theorem. (Recall $\langle M \rangle_\infty \le N$.) By passing to a
subsequence we can improve the last conclusion to
$$(e) \qquad \sup_t (I^{n_k}_t - I_t)^2 \to 0 \quad \text{almost surely}$$
Let $J^n_t = \int_0^t f_n'(X_s) \, dA_s$ and $J_t = \int_0^t f'(X_s) \, dA_s$. In this case it follows
from the bounded convergence theorem that for almost every $\omega$
$$(f) \qquad \sup_t |J^n_t - J_t| \le \int_0^\infty |f_n'(X_s) - f'(X_s)| \, d|A|_s \to 0$$
To take the limit of the third and final term $K^n_t = \frac12 \int_0^t f_n''(X_s) \, d\langle X \rangle_s$ we
note that (b) implies
$$(g) \qquad K^{n_k}_t = f_{n_k}(X_t) - f_{n_k}(X_0) - \int_0^t f_{n_k}'(X_s) \, dX_s$$
Combining (c), (e), and (f) we see that almost surely the right hand side of (g)
converges to a limit uniformly in t, so the left hand side does also. Since $K^{n_k}_t$
is continuous adapted and increasing the limit is also. □
If we fix X then for a given f we call K the increasing process
associated with f. The increasing process associated with $|x - a|$ is called the local
time at a, and is denoted by $L^a_t$. To prove the first property that justifies this
name, (11.3), we need the following preliminary
(11.2) Lemma. The increasing process associated with $(x - a)^+$ or $(x - a)^-$ is
$(1/2) L^a_t$.
Proof Let $f_1(x) = (x - a)^+$, $f_2(x) = (x - a)^-$, and let $K^i_t$ be the increasing
process associated with $f_i$. We begin by observing $f_1 + f_2 = |x - a|$, so $K^1_t + K^2_t = L^a_t$.
Our second claim is that since $f_1 - f_2 = x - a$ we have $K^1_t - K^2_t = 0$. To
see this let $g(x) = x - a$ which has $g'(x) = 1$ and note $X_t - X_0 = \int_0^t 1 \, dX_s$ so
the associated $K_t$ must be $\equiv 0$. □
(11.3) Theorem. $L^a_t$ only increases when $X_t = a$ or to be precise, if we let
$\ell^a$ be the measure with distribution function $t \to L^a_t$, then $\ell^a$ is supported by
$\{t : X_t = a\}$.
Proof Intuitively, $|X_t - a|$ is a local martingale except when $X_t = a$ so $L^a_t$
is constant when $X_t \ne a$. To prove (11.3), however, it is easier to use the
alternative definitions in (11.2). Let $S \le T$ be stopping times so that
$[S, T] \subset \{t : X_t < a\}$. Applying (11.1) to $f(x) = (x - a)^+$ we have
$$(X_T - a)^+ - (X_S - a)^+ = \int_S^T 1_{(X_s > a)} \, dX_s + \frac12 (L^a_T - L^a_S)$$
The left-hand side and the integral on the right-hand side vanish so $L^a_T = L^a_S$.
Since this holds when S = q with q any rational and $T = \inf\{t \ge S : X_t \ge a - 1/n\}$
for any n, it follows that $\ell^a(\{t : X_t < a\}) = 0$. A similar argument
using $(x - a)^-$ instead of $(x - a)^+$ shows $\ell^a(\{t : X_t > a\}) = 0$ and the proof is
complete. □
We are now ready for our main result.
(11.4) Meyer-Tanaka formula. Let X be a continuous semimartingale. Let
f be the difference of two convex functions, f' be the left derivative of f and
suppose $f'' = \mu$ in the sense of distributions (i.e., $\mu$ has distribution function
f'). Then
$$f(X_t) - f(X_0) = \int_0^t f'(X_s) \, dX_s + \frac12 \int \mu(da) \, L^a_t$$
Proof Since f is the difference of two convex functions and the formula is
linear in f, we can suppose without loss of generality that f is convex. By
stopping we can suppose $|X_t| \le N$ and $\langle X \rangle_t \le N$ for all t. Having done this
we can let
$$g(x) = \frac12 \int_{-N}^N \mu(da) \, |x - a|$$
Since $(f - g)'' = 0$ on $[-N, N]$ it follows that $f(x) - g(x) = a + bx$ for $|x| \le N$.
Since the result is trivial for linear functions, it suffices now to prove the result
for g, which is almost trivial. Differentiating the definition of g we have
$$g'(x) = \frac12 \int_{-N}^N \mu(da) \, \mathrm{sign}(x - a)$$
Starting with the definition of the local time
$$|X_t - a| - |X_0 - a| = \int_0^t \mathrm{sign}(X_s - a) \, dX_s + L^a_t$$
and then integrating $\frac12 \int_{-N}^N \mu(da)$ gives
$$g(X_t) - g(X_0) = \frac12 \int_{-N}^N \mu(da) \int_0^t \mathrm{sign}(X_s - a) \, dX_s + \frac12 \int_{-N}^N \mu(da) \, L^a_t$$
(11.5) and (11.6) below will justify interchanging the two integrals in the first
term on the right-hand side to give
$$g(X_t) - g(X_0) = \int_0^t g'(X_s) \, dX_s + \frac12 \int_{-N}^N \mu(da) \, L^a_t$$
Since $L^a_t = 0$ for $|a| > N$ (recall $|X_t| \le N$ and use (11.3)) this proves the result
for g and completes the proof of (11.4). □
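Two special cases may help in reading (11.4). The following worked sketch is an illustration rather than part of the original argument; it assumes the convention $\mathrm{sign}(0) = -1$, so that $\mathrm{sign}(x - a)$ is the left derivative of $|x - a|$, and it writes $\delta_a$ for the unit point mass at a.

```latex
% Case 1: f(x) = |x - a|. Then f'(x) = sign(x - a) and f'' = 2 \delta_a, so
\[
|X_t - a| - |X_0 - a|
  = \int_0^t \operatorname{sign}(X_s - a)\, dX_s + \tfrac12 \int 2\,\delta_a(db)\, L^b_t
  = \int_0^t \operatorname{sign}(X_s - a)\, dX_s + L^a_t ,
\]
% which is the definition of the local time at a.

% Case 2: f(x) = x^2. Then f'(x) = 2x and \mu(da) = 2\, da, so (11.4) reads
\[
X_t^2 - X_0^2 = 2 \int_0^t X_s\, dX_s + \int_{-\infty}^{\infty} L^a_t\, da .
\]
% Ito's formula for the same f gives
\[
X_t^2 - X_0^2 = 2 \int_0^t X_s\, dX_s + \langle X \rangle_t ,
\qquad\text{hence}\qquad
\int_{-\infty}^{\infty} L^a_t\, da = \langle X \rangle_t .
\]
```

The last identity says that the local times, integrated over all levels, account for the whole quadratic variation of X up to time t.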
For the next two results let S be a set, let $\mathcal{S}$ be a $\sigma$-field of subsets of S, and
let $\mu$ be a finite measure on $(S, \mathcal{S})$. For our purposes it would be enough to take
$S = \mathbf{R}$, but it is just as easy to treat a general set. The first step in justifying
the interchange of the order of integration is to deal with a measurability issue:
if we have a family $H^a_t(\omega)$ of integrands for $a \in S$ then we can define the
integrals $\int_0^t H^a_s \, dX_s$ to be a measurable function of $(a, t, \omega)$.
(11.5) Lemma. Let X be a continuous semimartingale with $X_0 = 0$ and
let $H^a_t(\omega) = H(a, t, \omega)$ be bounded and $\mathcal{S} \times \Pi$ measurable. Then there is a
$Z(a, t, \omega) \in \mathcal{S} \times \Pi$ such that for $\mu$-almost every a, $Z(a, t, \omega)$ is a continuous
version of $\int_0^t H^a_s \, dX_s$.
Proof Let $X_t = M_t + A_t$ be the decomposition of the semimartingale. By
stopping we can suppose that $|A|_t \le N$, $|M_t| \le N$ and $\langle X \rangle_t = \langle M \rangle_t \le N$ for
all t.
Let $\mathcal{H}$ be the collection of bounded $H \in \mathcal{S} \times \Pi$ for which the conclusion
holds. We will check the assumptions of the Monotone Class Theorem, (2.3)
in Chapter 1. Clearly $\mathcal{H}$ is a vector space. To check (i) of the MCT suppose
$H(a, t, \omega) = f(a) K(t, \omega)$ where $K(t, \omega) \in b\Pi$ and $f(a) \in b\mathcal{S}$. In this case
$$\int_0^t H^a_s \, dX_s = f(a) \int_0^t K_s \, dX_s$$
so $fK \in \mathcal{H}$. Taking f and K to be indicator functions of sets in $\mathcal{S}$ and $\Pi$
respectively, we have checked (i) of the MCT with $\mathcal{A} = \{A \times B : A \in \mathcal{S}, B \in \Pi\}$.
To check (ii) of the MCT let $0 \le H^n \in \mathcal{H}$ with $H^n \uparrow H$ and H bounded.
The $L^2$ maximal inequality for martingales and the isometry property of the
stochastic integral imply
$$E \Big( \sup_t |(H^{n,a} \cdot M)_t - (H^a \cdot M)_t|^2 \Big) \le 4 \sup_t E |(H^{n,a} \cdot M)_t - (H^a \cdot M)_t|^2 = 4 E \int_0^\infty (H^{n,a}_s - H^a_s)^2 \, d\langle M \rangle_s \to 0$$
By passing to a subsequence we have for $\mu$-almost every a
$$\sup_t |(H^{n_k,a} \cdot M)_t - (H^a \cdot M)_t| \to 0$$
To deal with the bounded variation part we note that
$$E \sup_t |(H^{n,a} \cdot A)_t - (H^a \cdot A)_t| \le E \int_0^\infty |H^{n,a}_s - H^a_s| \, d|A|_s \to 0$$
as $n \to \infty$ by the bounded convergence theorem. Combining the last two results
completes the proof. □
(11.6) Fubini's Theorem. Let X be a continuous semimartingale, let
$H^a_t \in b(\mathcal{S} \times \Pi)$, and let $Z^a_t \in \mathcal{S} \times \Pi$ be a continuous version of $\int_0^t H^a_s \, dX_s$ for $\mu$-almost
every a. Then $Y_t = \int_S Z^a_t \, \mu(da)$ is a version of $H \cdot X$ where $H_t = \int_S H^a_t \, \mu(da)$.
Less formally,
$$\int_S \int_0^t H^a_s \, dX_s \, \mu(da) = \int_0^t \int_S H^a_s \, \mu(da) \, dX_s$$
Proof Let $X_t = M_t + A_t$ be the decomposition of the semimartingale. By
stopping we can suppose that $|A|_t \le N$, $|M_t| \le N$ and $\langle X \rangle_t = \langle M \rangle_t \le N$ for
all t. As noted in the proof of (11.5), the result in the special case X = A is
just Fubini's theorem from measure theory, so we will suppose for the rest of
the proof that X = M.
Let $\mathcal{H}$ be the collection of bounded $H \in \mathcal{S} \times \Pi$ for which the conclusion
holds. Again, we will check the assumptions of the Monotone Class Theorem,
(2.3) in Chapter 1. Clearly $\mathcal{H}$ is a vector space. To check (i) of the MCT,
suppose $H(a, t, \omega) = f(a) K(t, \omega)$ where $K(t, \omega) \in b\Pi$ and $f(a) \in b\mathcal{S}$. In this
case $Z^a_t = f(a) (K \cdot X)_t$ so
$$\int_S Z^a_t \, \mu(da) = (K \cdot X)_t \int_S f(a) \, \mu(da) = \Big( \Big\{ \int_S f(a) \, \mu(da) \, K \Big\} \cdot X \Big)_t = (H \cdot X)_t$$
so $fK \in \mathcal{H}$. Taking f and K to be indicator functions of sets in $\mathcal{S}$ and $\Pi$
respectively, we have checked (i) of the MCT with $\mathcal{A} = \{A \times B : A \in \mathcal{S}, B \in \Pi\}$.
To check (ii) of the MCT let $0 \le H^n \in \mathcal{H}$ with $H^n \uparrow H$ and H bounded.
Letting $\|\mu\|$ be the total mass of $\mu$, applying Jensen's inequality to the
probability measure $\mu(\cdot)/\|\mu\|$, then multiplying each side by $\|\mu\|$ we have
$$\frac{1}{\|\mu\|} E \Big( \int_S \sup_t |Z^{n,a}_t - Z^a_t| \, \mu(da) \Big)^2 \le E \int_S \sup_t |Z^{n,a}_t - Z^a_t|^2 \, \mu(da)$$
Using Fubini's theorem, the $L^2$ maximal inequality, and the isometry property
the right-hand side
$$\begin{aligned}
&= \int_S E \sup_t |Z^{n,a}_t - Z^a_t|^2 \, \mu(da) \\
&\le 4 \int_S \sup_t E |Z^{n,a}_t - Z^a_t|^2 \, \mu(da) \\
&= 4 \int_S E \Big( \int_0^\infty (H^{n,a}_s - H^a_s)^2 \, d\langle X \rangle_s \Big) \mu(da) \to 0
\end{aligned}$$
by using the bounded convergence theorem three times. Putting the absolute
values inside the integral we have
$$E \Big( \sup_t \Big| \int_S Z^{n,a}_t \, \mu(da) - \int_S Z^a_t \, \mu(da) \Big|^2 \Big) \le E \Big( \int_S \sup_t |Z^{n,a}_t - Z^a_t| \, \mu(da) \Big)^2 \to 0$$
Taking $H^n_t = \int_S H^{n,a}_t \, \mu(da)$ and passing to a subsequence $n_k$ we have that with
probability one $(H^{n_k} \cdot X)_t = \int_S Z^{n_k,a}_t \, \mu(da)$ converges uniformly to $\int_S Z^a_t \, \mu(da)$.
The last detail is to check that $H^n \cdot X$ converges to $H \cdot X$ in $\mathcal{M}^2$, for then it
follows that $\int_S Z^a_t \, \mu(da) = (H \cdot X)_t$. In view of the isometry property, we can
complete the last detail by showing that $\|H^n - H\|_X \to 0$ but this is routine.
$$\frac{1}{\|\mu\|} E \int_0^\infty \Big( \int_S H^{n,a}_s \, \mu(da) - \int_S H^a_s \, \mu(da) \Big)^2 d\langle X \rangle_s \le E \int_0^\infty \int_S (H^{n,a}_s - H^a_s)^2 \, \mu(da) \, d\langle X \rangle_s \to 0$$
by using the bounded convergence theorem three times. □
We can now complete the identification of $L^a_t$ as the local time at a.
(11.7) Theorem. Let X be a continuous semimartingale with local time $L^a_t$.
If g is a bounded Borel measurable function then
$$\int_{-\infty}^\infty L^a_t \, g(a) \, da = \int_0^t g(X_s) \, d\langle X \rangle_s$$
Proof Suppose first that g is continuous and let $f \in C^2$ with $f'' = g$. In
this case comparing (11.4) with Ito's formula proves the identity in question.
Since the identity holds for any continuous function, using the Monotone Class
Theorem proves the result for a bounded measurable g. □
When X is a Brownian motion the identity in (11.7) becomes
$$\int_{-\infty}^\infty L^a_t \, g(a) \, da = \int_0^t g(X_s) \, ds$$
From this, we see that $L^a_t$ can be thought of as the time spent at a up to time
t, or to be precise it is a density function for the occupation time measure
$\nu_t(A) = \int_0^t 1_A(X_s) \, ds$.
There are many ways of defining local time for Brownian motion. The
one we have given above is famous for making it easy to prove that there is a
version in which $L^a_t$ is a jointly continuous function of a and t. It is somewhat
remarkable that this property is true for a continuous local martingale and it
is not any more difficult to prove the result in this generality.
(11.8) Theorem. Let X be a continuous local martingale. There is a version
of the $L^a_t$ in which $(a, t) \to L^a_t$ is continuous.
Example 11.1. (11.8) is not true for $X_t = |B_t|$, which is a semimartingale by
(11.1). Introducing subscripts to indicate the process we are referring to, we
clearly have $L^a_X(t) = L^a_B(t) + L^{-a}_B(t)$ for $a > 0$ and $L^a_X(t) = 0$ for $a < 0$. Since
$L^0_B(t) \not\equiv 0$ it follows that $a \to L^a_X(t)$ is discontinuous at a = 0.
One can complain that this example is trivial since 0 is an end point of
the set of possible values. A related example with 0 in the middle is provided
by skew Brownian motion. Start with $|B_t|$ and flip coins with probability
$p > 1/2$ of + and $1 - p$ of - to assign signs to excursions of $B_t$, i.e., maximal
time intervals $(a, b)$ on which $|B_t| > 0$. If we call the resulting process $Y_t$ then
$L^a_Y(t) \to 2p L^0_B(t)$ as $a \downarrow 0$ and $L^a_Y(t) \to 2(1 - p) L^0_B(t)$ as $a \uparrow 0$. For details see
Walsh (1978).
Proof As usual, by stopping we can suppose that $|X_t| \le N$ and $\langle X \rangle_t \le N$
for all t. By (11.2)
$$\frac12 L^a_t = (X_t - a)^+ - (X_0 - a)^+ - \int_0^t 1_{(X_s > a)} \, dX_s$$
It is clear that $(a, t) \to (X_t - a)^+ - (X_0 - a)^+$ is continuous, so we only have
to investigate the joint continuity of the stochastic integral
$$I^a_t = \int_0^t 1_{(X_s > a)} \, dX_s$$
Fix a time $T < \infty$ and regard $a \to I^a$ as a mapping from $\mathbf{R}$ to $C([0, T], \mathbf{R})$, the
real valued functions continuous on $[0,T]$, which we equip with the sup norm
$\|f\| = \sup_{0 \le s \le T} |f(s)|$. In view of Kolmogorov's continuity theorem ((1.6) in
Chapter 1) it suffices to show that for some $\alpha, \beta > 0$ we have
$$E \|I^a - I^b\|^\beta \le C |a - b|^{1+\alpha}$$
Suppose without loss of generality that $a < b$. One of the Burkholder Davis
Gundy inequalities ((5.1) in Chapter 3) implies that
$$E \|I^a - I^b\|^4 \le C E \Big( \int_0^T 1_{(a < X_s \le b)} \, d\langle X \rangle_s \Big)^2$$
To bound the right-hand side we note that using (11.7) and then the Cauchy-Schwarz
inequality and Fubini's theorem:
$$\begin{aligned}
E \Big( \int_0^T 1_{(a < X_s \le b)} \, d\langle X \rangle_s \Big)^2 &= E \Big( \int_a^b L^x_T \, dx \Big)^2 \le (b - a) E \int_a^b (L^x_T)^2 \, dx \\
&= (b - a) \int_a^b E (L^x_T)^2 \, dx \le (b - a)^2 \sup_{x \in (a,b)} E (L^x_T)^2
\end{aligned}$$
To bound the sup, we recall the definition of $L^x_T$:
$$L^x_T = |X_T - x| - |X_0 - x| - \int_0^T \mathrm{sign}(X_s - x) \, dX_s$$
which with the trivial inequalities $(a + b)^2 \le 2a^2 + 2b^2$ and $|y - x| - |z - x| \le |y - z|$
implies that
$$E (L^x_T)^2 \le 2 E (X_T - X_0)^2 + 2 E \Big( \int_0^T \mathrm{sign}(X_s - x) \, dX_s \Big)^2 \le 2 (2N)^2 + 2 E \langle X \rangle_T \le 8 (N^2 + N)$$
by the isometry property and the bounds we have imposed on $|X_t|$ and $\langle X \rangle_t$ by
stopping. This gives us what we need to check the inequality in Kolmogorov's
continuity theorem and completes the proof. □
Remark. The proof above almost works for semimartingales $X_t = M_t + A_t$.
If we define $I^a_t = \int_0^t 1_{(X_s > a)} \, dM_s$ and $J^a_t = \int_0^t 1_{(X_s > a)} \, dA_s$ then the argument
above shows that $(a, t) \to I^a_t$ is continuous, so we will get the desired conclusion
if we adopt assumptions that imply $(a, t) \to J^a_t$ is continuous.
2.12. Girsanov's Formula
In this section, we will show that the collection of semimartingales and the
definition of the stochastic integral are not affected by a locally equivalent
change of measure. For concreteness, we will work on the canonical probability
space $(C, \mathcal{C})$, with $\mathcal{F}_t$ the filtration generated by the coordinate maps $X_t(\omega) = \omega_t$.
Two measures Q and P defined on a filtration $\mathcal{F}_t$ are said to be locally
equivalent if for each t their restrictions to $\mathcal{F}_t$, $Q_t$ and $P_t$, are equivalent, i.e.,
mutually absolutely continuous. In this case we let $\alpha_t = dQ_t/dP_t$. The reasons
for our interest in this quantity will become clear as the story unfolds.
(12.1) Lemma. $Y_t$ is a (local) martingale/Q if and only if $\alpha_t Y_t$ is a (local)
martingale/P.
Proof The parentheses are meant to indicate that the statement is true if
the two locals are removed. Let $Y_t$ be a martingale/Q, $s \le t$ and $A \in \mathcal{F}_s$. Now
if $Z \in \mathcal{F}_t$ then (i) $\int Z \, dP = \int Z \, dP_t$ and (ii) $\int Z \alpha_t \, dP_t = \int Z \, dQ_t$. So using
(i), (ii), the fact that Y is a martingale/Q, (ii), and (i) we have
$$\int_A \alpha_t Y_t \, dP = \int_A \alpha_t Y_t \, dP_t = \int_A Y_t \, dQ_t = \int_A Y_s \, dQ_s = \int_A \alpha_s Y_s \, dP_s = \int_A \alpha_s Y_s \, dP$$
This shows $\alpha_t Y_t$ is a martingale/P. If Y is a local martingale/Q then there
is a sequence of stopping times $T_n \uparrow \infty$ so that $Y_{t \wedge T_n}$ is a martingale/Q and
hence $\alpha_t Y_{t \wedge T_n}$ is a martingale/P. The optional stopping theorem implies that
$\alpha_{t \wedge T_n} Y_{t \wedge T_n}$ is a martingale/P and it follows that $\alpha_t Y_t$ is a local martingale/P.
To prove the converse, observe that (a) interchanging the roles of P and
Q and applying the last result shows that if $\beta_t = dP_t/dQ_t$, and $Z_t$ is a (local)
martingale/P, then $\beta_t Z_t$ is a (local) martingale/Q and (b) $\beta_t = \alpha_t^{-1}$, so letting
$Z_t = \alpha_t Y_t$ we have the desired result. □
Since 1 is a martingale/Q we have
(12.2) Corollary. $\alpha_t = dQ_t/dP_t$ is a martingale/P.
There is a converse of (12.2) which will be useful in constructing examples.
(12.3) Lemma. Given $\alpha_t$ a nonnegative martingale/P there is a unique locally
equivalent probability measure Q so that $dQ_t/dP_t = \alpha_t$.
Proof The last equation in the lemma defines the restriction of Q to $\mathcal{F}_t$ for
any t. To see that this defines a unique measure on $\mathcal{C}$, let $t_1 < t_2 < \cdots < t_n$, $A_i$
be Borel subsets of $\mathbf{R}^d$, and let $B = \{\omega : \omega(t_i) \in A_i \text{ for } 1 \le i \le n\}$. Define the
finite dimensional distributions of a measure Q by setting
$$Q(B) = \int_B \alpha_t \, dP$$
whenever $t \ge t_n$. The martingale property of $\alpha_t$ implies that the finite
dimensional distributions are consistent so we have defined a unique measure on
$(C, \mathcal{C})$. □
We are ready to prove the main result of the section.
(12.4) Girsanov's formula. If X is a local martingale/P and we let
$A_t = \int_0^t \alpha_s^{-1} \, d\langle \alpha, X \rangle_s$, then $X_t - A_t$ is a local martingale/Q.
Proof Although the formula for A looks a little strange, it is easy to see that
it must be the right answer. If we suppose that A is locally of b.v. and has
$A_0 = 0$, then integrating by parts, i.e., using (10.1), and noting $\langle \alpha, A \rangle_t = 0$
since A has bounded variation gives
$$(12.5) \qquad \alpha_t (X_t - A_t) - \alpha_0 X_0 = \int_0^t (X_s - A_s) \, d\alpha_s + \int_0^t \alpha_s \, dX_s - \int_0^t \alpha_s \, dA_s + \langle \alpha, X \rangle_t$$
At this point, we need the assumption made in Section 2.2 that our filtration
only admits continuous martingales, so there is a continuous version of $\alpha$ to
which we can apply our integration-by-parts formula.
Since $\alpha_s$ and $X_s$ are local martingales/P, the first two terms on the right
in (12.5) are local martingales/P. In view of (12.1), if we want $X_t - A_t$ to be
a local martingale/Q, we need to choose A so that the sum of the third and
fourth terms = 0, that is,
$$\int_0^t \alpha_s \, dA_s = \langle \alpha, X \rangle_t$$
From the last equation and the associative law (9.6), it is clear that it is
necessary and sufficient that
$$A_t = \int_0^t \alpha_s^{-1} \, d\langle \alpha, X \rangle_s$$
The last detail remaining is to prove that the integral that defines $A_t$ exists.
Let $T_n = \inf\{t : \alpha_t \le n^{-1}\}$. If $t \le T_n$, then
$$\int_0^t \alpha_s^{-1} \, d|\langle \alpha, X \rangle|_s \le n \int_0^t d|\langle \alpha, X \rangle|_s < \infty$$
by the Kunita-Watanabe inequality. So if $T = \lim_{n \to \infty} T_n$ then $A_t$ is well
defined for $t < T$. The optional stopping theorem implies that $E \alpha_{t \wedge T_n} = E \alpha_t$,
so noting $\alpha_{t \wedge T_n} = \alpha_t$ on $\{T_n > t\}$ we have
$$E(\alpha_t ; T_n \le t) = E(\alpha_{T_n} ; T_n \le t)$$
Since $\alpha_t \ge 0$ is continuous and $T_n \le T$, it follows that
$$E(\alpha_t ; T \le t) \le E(\alpha_t ; T_n \le t) = E(\alpha_{T_n} ; T_n \le t) \le 1/n$$
Letting $n \to \infty$, we see that $\alpha_t = 0$ a.s. on $\{T \le t\}$, so
$$Q_t(T \le t) = E(\alpha_t ; T \le t) = 0$$
But $P_t$ is equivalent to $Q_t$, so $0 = P_t(T \le t) = P(T \le t)$. Since t is arbitrary
$P(T < \infty) = 0$, and the proof is complete. □
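The standard special case of a constant drift shows how the formula for $A_t$ is used. This worked sketch is an illustration, not part of the original proof; it assumes that under P the process $B_t$ is a one dimensional standard Brownian motion, that $\theta$ is a fixed real number, and that Q is the measure given by (12.3) for the exponential martingale below.

```latex
% Assumed setup: B is a standard Brownian motion under P, \theta is a constant,
\[
\alpha_t = \exp\!\big(\theta B_t - \tfrac12 \theta^2 t\big), \qquad X_t = B_t .
\]
% Ito's formula (10.2) applied to f(x, t) = \exp(\theta x - \theta^2 t / 2):
% D_x f = \theta f,  D_t f = -\tfrac12 \theta^2 f,  D_{xx} f = \theta^2 f, so the dt terms cancel and
\[
\alpha_t = 1 + \int_0^t \theta\, \alpha_s\, dB_s .
\]
% By (8.7) and \langle B \rangle_s = s,
\[
\langle \alpha, B \rangle_t = \int_0^t \theta\, \alpha_s\, ds ,
\qquad
A_t = \int_0^t \alpha_s^{-1}\, d\langle \alpha, B \rangle_s = \int_0^t \theta\, ds = \theta t .
\]
% Conclusion from (12.4): B_t - \theta t is a local martingale/Q.
```

In words, under Q the coordinate process picks up a drift $\theta t$, and subtracting that drift restores the local martingale property.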
(12.4) shows that the collection of semimartingales is not affected by change
of measure. Our next goal is to show that if X is a semimartingale/P, Q is
locally equivalent to P, and $H \in \ell b\Pi$, then the integral $(H \cdot X)_t$ is the same
under P and Q. The first step is
(12.6) Theorem. The quadratic variation $\langle X \rangle_t$ and hence the covariance
$\langle X, Y \rangle_t$ is the same under P and Q.
Proof The second conclusion follows from the first and (3.9). To prove the
first we recall (8.6) shows that if $\Delta_n = \{0 = t^n_0 < t^n_1 < \cdots < t^n_k = t\}$ is a sequence
of partitions of $[0,t]$ with mesh $|\Delta_n| \to 0$ then
$$\sum_i (X_{t^n_{i+1}} - X_{t^n_i})^2 \to \langle X \rangle_t$$
in probability for any semimartingale/P. Since convergence in probability is
not affected by locally equivalent change of measure, the desired conclusion
follows. □
(12.7) Theorem. If $H \in \ell b\Pi$ then $(H \cdot X)_t$ is the same under P and Q.
Proof The value is clearly the same for simple integrands. Let M and N be
the local martingale parts and A and B be the locally bounded variation parts
of X under P and Q respectively. Note that (12.6) implies $\langle M \rangle_t = \langle N \rangle_t$. Let
$$T_n = \inf\{t : \langle M \rangle_t, |A|_t \text{ or } |B|_t \ge n\}$$
If $H \in \ell b\Pi$ and $H_t = 0$ for $t > T_n$ then by (4.5) we can find simple $H^m$ so that
$$\|H^m - H\|_M, \ \|H^m - H\|_N \to 0 \quad \text{and} \quad \int |H^m_s - H_s| \, d(|A|_s + |B|_s) \to 0$$
These conditions allow us to pass to the limit in the equality
$$(H^m \cdot (M + A))_t = (H^m \cdot (N + B))_t$$
(the left-hand side being computed under P and the right under Q) to conclude
that for any $H \in \ell b\Pi$ the integrals under P and Q agree up to time $T_n$. Since
n is arbitrary and $T_n \to \infty$ the proof is complete. □
3 Brownian Motion, II
In this chapter we will use Ito's formula to deepen our understanding of
Brownian motion or, more generally, continuous local martingales.
3.1. Recurrence and Transience
If $B_t$ is a d-dimensional Brownian motion and $B^i_t$ is the ith component then $B^i_t$
is a martingale with $\langle B^i \rangle_t = t$. If $i \ne j$ Exercise 2.2 in Chapter 1 tells us that
$B^i_t B^j_t$ is a martingale, so (3.11) in Chapter 2 implies $\langle B^i, B^j \rangle_t = 0$. Using this
information in Ito's formula we see that if $f : \mathbf{R}^d \to \mathbf{R}$ is $C^2$ then
$$f(B_t) - f(B_0) = \sum_{i=1}^d \int_0^t D_i f(B_s) \, dB^i_s + \frac12 \sum_{i=1}^d \int_0^t D_{ii} f(B_s) \, ds$$
Writing $\nabla f = (D_1 f, \ldots, D_d f)$ for the gradient of f, and $\Delta f = \sum_{i=1}^d D_{ii} f$ for
the Laplacian of f, we can write the last equation more neatly as
$$(1.1) \qquad f(B_t) - f(B_0) = \int_0^t \nabla f(B_s) \cdot dB_s + \frac12 \int_0^t \Delta f(B_s) \, ds$$
Here, the dot in the first term stands for the inner product of two vectors and
the precise meaning of that is given in the previous equation.
Functions with $\Delta f = 0$ are called harmonic. (1.1) shows that if we
compose a harmonic function with Brownian motion the result is a local martingale.
The next result (and judicious choices of harmonic functions) is the key to
deriving properties of Brownian motion from Ito's formula.
(1.2) Theorem. Let G be a bounded open set and $\tau = \inf\{t : B_t \notin G\}$. If
$f \in C^2$ and $\Delta f = 0$ in G, and f is continuous on the closure of G, $\bar{G}$, then for
$x \in G$ we have $f(x) = E_x f(B_\tau)$.
Proof Our first step is to prove that $P_x(\tau < \infty) = 1$. Let $K = \sup\{|x - y| :
x, y \in G\}$ be the diameter of G. If $x \in G$ and $|B_1 - x| > K$ then $B_1 \notin G$ and
$\tau < 1$. Thus
$$P_x(\tau < 1) \ge P_x(|B_1 - x| > K) = P_0(|B_1| > K) = \epsilon_K > 0$$
This shows $\sup_x P_x(\tau \ge k) \le (1 - \epsilon_K)^k$ holds when k = 1. To prove the last
result by induction on k we observe that if $p_t(x,y)$ is the transition probability
for Brownian motion, the Markov property implies
$$P_x(\tau \ge k) \le \int_G p_1(x,y) P_y(\tau \ge k-1) \, dy \le (1 - \epsilon_K)^{k-1} P_x(B_1 \in G) \le (1 - \epsilon_K)^k$$
where the last two inequalities follow from the induction assumption and the
fact that $|B_1 - x| > K$ implies $B_1 \notin G$. The last result implies that $P_x(\tau <
\infty) = 1$ for all $x \in G$ and moreover that
(1.3) $$\sup_x E_x \tau^p < \infty \quad\text{for all } 0 < p < \infty$$
To get the last conclusion recall that (see e.g., (5.7) in Chapter 1 of Durrett
(1995))
$$E_x \tau^p = \int_0^\infty p t^{p-1} P_x(\tau > t) \, dt$$
When $\Delta f = 0$ in G, (1.1) implies that $f(B_t)$ is a local martingale on $[0, \tau)$.
We have assumed that f is continuous on $\bar{G}$ and G is bounded, so f is bounded
and if we apply the time change $\gamma$ defined in Section 2.2, $X_t = f(B_{\gamma(t)})$ is a
bounded martingale (with respect to $\mathcal{G}_t = \mathcal{F}_{\gamma(t)}$). Being a bounded martingale
$X_t$ converges almost surely to a limit $X_\infty$ which has $X_t = E_x(X_\infty | \mathcal{G}_t)$ and
hence $E_x X_t = E_x X_\infty$. Since $\tau < \infty$ and f is continuous on $\bar{G}$, $X_\infty = f(B_\tau)$.
Taking t = 0 it follows that $f(x) = E_x X_0 = E_x X_\infty = E_x f(B_\tau)$. □
In the rest of this section we will use (1.2) to prove some results concerning
the range of Brownian motion {Bt : t > 0}. We start with the one-dimensional
case.
(1.4) Theorem. Let a < x < b and $T = \inf\{t : B_t \notin (a, b)\}$.
$$P_x(B_T = a) = \frac{b - x}{b - a} \qquad P_x(B_T = b) = \frac{x - a}{b - a}$$
Proof $f(x) = (b - x)/(b - a)$ has $f'' = 0$ in (a, b), is continuous on [a, b], and
has f(a) = 1, f(b) = 0, so (1.2) implies $f(x) = E_x f(B_T) = P_x(B_T = a)$. □
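As a rough numerical illustration of (1.4), the sketch below estimates the exit probability for a simple symmetric random walk on the integers, which is a discrete stand-in for Brownian motion and not the process treated in the text. The walk is also a martingale, so the same optional stopping argument gives the same formula $(b - x)/(b - a)$. The function name, trial count, and seed are illustrative choices.

```python
import random

def exit_left_probability(a, b, x, trials=20000, seed=0):
    """Monte Carlo estimate of P_x(walk hits a before b) for a simple
    symmetric random walk on the integers (discrete analog of (1.4))."""
    rng = random.Random(seed)
    hits_a = 0
    for _ in range(trials):
        pos = x
        # Run the walk until it leaves the open interval (a, b).
        while a < pos < b:
            pos += 1 if rng.random() < 0.5 else -1
        if pos == a:
            hits_a += 1
    return hits_a / trials

a, b, x = 0, 10, 3
estimate = exit_left_probability(a, b, x)
exact = (b - x) / (b - a)  # formula from (1.4)
print(estimate, exact)
```

The estimate should agree with the formula up to Monte Carlo error, which is of order $1/\sqrt{\text{trials}}$.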
Exercise 1.1 Deduce the last result by noting that $B_t$ is a martingale and
using the optional stopping theorem at time T.
Let $T_x = \inf\{t : B_t = x\}$. From (1.4), it follows immediately that
(1.5) Theorem. For all x and y, $P_x(T_y < \infty) = 1$.
Proof Since $P_x(T_y < \infty) = P_{x-y}(T_0 < \infty)$, it suffices to prove the result
when y = 0. A little reflection (pun intended) shows we can also suppose
x > 0. Now using (1.4) $P_x(T_0 < T_{Mx}) = (M - 1)/M$, and the right-hand side
approaches 1 as $M \to \infty$. □
It is trivial to improve (1.5) to conclude that
(1.6) Theorem. For any $s < \infty$, $P_x(B_t = y \text{ for some } t \ge s) = 1$.
Proof By the Markov property,
$$P_x(B_t = y \text{ for some } t \ge s) = E_x(P_{B(s)}(T_y < \infty)) = 1 \qquad □$$
The conclusion of (1.6) implies (argue by contradiction) that for any y
with probability 1 there is a sequence of times $t_n \uparrow \infty$ (which will depend on
the outcome $\omega$) so that $B_{t_n} = y$, a conclusion we will hereafter abbreviate as
"$B_t = y$ infinitely often" or "$B_t = y$ i.o." In the terminology of the theory
of Markov processes, what we have shown is that one-dimensional Brownian
motion is recurrent.
Exercise 1.2 Use (1.6) to conclude
$$\limsup_{t \to \infty} B_t = \infty \qquad \liminf_{t \to \infty} B_t = -\infty$$
In order to study Brownian motion in $d \ge 2$, we need to find some
appropriate harmonic functions. In view of the spherical symmetry of Brownian
motion, an obvious way to do this is to let $\varphi(x) = f(|x|^2)$ and try to pick
$f : R \to R$ so that $\Delta\varphi = 0$. We use $|x|^2 = x_1^2 + \cdots + x_d^2$ rather than $|x|$ since it
is easier to differentiate:
$$D_i f(|x|^2) = f'(|x|^2) \, 2x_i$$
$$D_{ii} f(|x|^2) = f''(|x|^2) \, 4x_i^2 + 2f'(|x|^2)$$
Therefore, for $\Delta\varphi = 0$ we need
$$0 = \sum_i \{ f''(|x|^2) \, 4x_i^2 + 2f'(|x|^2) \} = 4|x|^2 f''(|x|^2) + 2d f'(|x|^2)$$
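Letting $y = |x|^2$, we can write the above as $4y f''(y) + 2d f'(y) = 0$ or, if y > 0,
$$f''(y) = -\frac{d}{2y} f'(y)$$
Taking $f'(y) = C y^{-d/2}$ guarantees $\Delta\varphi = 0$ for $x \ne 0$, so by choosing C
appropriately we can let
$$\varphi(x) = \begin{cases} \log|x| & d = 2 \\ |x|^{2-d} & d \ge 3 \end{cases}$$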
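We are now ready to use (1.2) in $d \ge 2$. Let $S_r = \inf\{t : |B_t| = r\}$ and
r < R. Since $\varphi$ has $\Delta\varphi = 0$ in $G = \{x : r < |x| < R\}$, and is continuous on $\bar{G}$,
(1.2) implies
$$\varphi(x) = E_x \varphi(B_\tau) = \varphi(r) P_x(S_r < S_R) + \varphi(R)(1 - P_x(S_r < S_R))$$
where $\varphi(r)$ is short for the value of $\varphi(x)$ on $\{x : |x| = r\}$. Solving now gives
(1.7) $$P_x(S_r < S_R) = \frac{\varphi(R) - \varphi(x)}{\varphi(R) - \varphi(r)}$$
In d = 2, the last formula says
(1.8) $$P_x(S_r < S_R) = \frac{\log R - \log|x|}{\log R - \log r}$$
If we fix r and let $R \to \infty$ in (1.8), the right-hand side goes to 1. So
$$P_x(S_r < \infty) = 1 \quad\text{for any } x \text{ and any } r > 0$$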
and repeating the proof of (1.6) shows that
(1.9) Theorem. Two-dimensional Brownian motion is recurrent in the sense
that if G is any open set, then $P_x(B_t \in G \text{ i.o.}) \equiv 1$.
If we fix R, let $r \to 0$ in (1.8), and let $S_0 = \inf\{t > 0 : B_t = 0\}$, then for
$x \ne 0$
$$P_x(S_0 < S_R) \le \lim_{r \to 0} P_x(S_r < S_R) = 0$$
Since this holds for all R and since the continuity of Brownian paths implies
$S_R \uparrow \infty$ as $R \uparrow \infty$, we have $P_x(S_0 < \infty) = 0$ for all $x \ne 0$. To extend the last
result to x = 0 we note that the Markov property implies
$$P_0(B_t = 0 \text{ for some } t \ge \epsilon) = E_0[P_{B_\epsilon}(T_0 < \infty)] = 0$$
for all $\epsilon > 0$, so $P_0(B_t = 0 \text{ for some } t > 0) = 0$, and thanks to our definition of
$S_0 = \inf\{t > 0 : B_t = 0\}$, we have
(1.10) $$P_x(S_0 < \infty) = 0 \quad\text{for all } x$$
Thus, in $d \ge 2$ Brownian motion will not hit 0 at a positive time even if it starts
there.
Exercise 1.3 Use the continuity of the Brownian path and $P_x(S_0 = \infty) = 1$
to conclude that if $x \ne 0$ then $P_x(S_r \uparrow \infty \text{ as } r \downarrow 0) = 1$.
For $d \ge 3$, formula (1.7) says
(1.11) $$P_x(S_r < S_R) = \frac{R^{2-d} - |x|^{2-d}}{R^{2-d} - r^{2-d}}$$
There is no point in fixing R and letting $r \to 0$, here. The fact that two
dimensional Brownian motion does not hit points implies that three dimensional
Brownian motion does not hit points and indeed will not hit the line $\{x : x_1 =
x_2 = 0\}$. If we fix r and let $R \to \infty$ in (1.11) we get
(1.12) $$P_x(S_r < \infty) = (r/|x|)^{d-2} < 1 \quad\text{if } |x| > r$$
From the last result it follows easily that for $d \ge 3$, Brownian motion is
transient, i.e. it does not return infinitely often to any bounded set.
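The dichotomy between d = 2 and d ≥ 3 can be seen by evaluating (1.7) directly. The sketch below is only a calculator for that formula, with $\varphi(x) = \log|x|$ in d = 2 and $|x|^{2-d}$ in d ≥ 3 as in the text; the function names are illustrative.

```python
import math

def phi(radius, d):
    """Radial harmonic function: log|x| in d = 2, |x|^(2-d) in d >= 3."""
    if d == 2:
        return math.log(radius)
    return radius ** (2 - d)

def hit_inner_first(x_norm, r, R, d):
    """P_x(S_r < S_R) from (1.7), for r <= |x| <= R and d >= 2."""
    return (phi(R, d) - phi(x_norm, d)) / (phi(R, d) - phi(r, d))

# Start at |x| = 2, inner sphere r = 1, and push the outer radius R out.
for R in (1e3, 1e6, 1e12):
    print(R, hit_inner_first(2.0, 1.0, R, 2), hit_inner_first(2.0, 1.0, R, 3))
```

As R grows, the d = 2 column creeps up toward 1 (only logarithmically fast), while the d = 3 column settles at $(r/|x|)^{d-2} = 1/2$, in line with (1.12).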
(1.13) Theorem. As $t \to \infty$, $|B_t| \to \infty$ a.s.
Proof Let $A_n = \{|B_t| > n^{1/2} \text{ for all } t \ge S_n\}$ and note that $S_n < \infty$ by (1.3).
The strong Markov property implies
$$P_x(A_n^c) = E_x(P_{B(S_n)}(S_{n^{1/2}} < \infty)) = (n^{1/2}/n)^{d-2} \to 0$$
as $n \to \infty$. Now $\limsup A_n = \cap_{M=1}^\infty \cup_{n=M}^\infty A_n$ has
$$P(\limsup A_n) \ge \limsup P(A_n) = 1$$
So infinitely often the Brownian path never returns to $\{x : |x| \le n^{1/2}\}$ after
time $S_n$ and this implies the desired result. □
Dvoretzky and Erdős (1951) have proved the following result about how
fast Brownian motion goes to $\infty$ in $d \ge 3$.
(1.14) Theorem. Suppose g(t) is positive and decreasing. Then
$$P_0(|B_t| \le g(t)\sqrt{t} \text{ i.o. as } t \uparrow \infty) = 1 \text{ or } 0$$
according as $\int^\infty g(t)^{d-2}/t \, dt = \infty$ or $< \infty$.
Here the absence of the lower limit implies that we are only concerned with the
behavior of the integral "near $\infty$." A little calculus shows that
$$\int^\infty t^{-1} \log^{-\alpha} t \, dt = \infty \text{ or } < \infty$$
according as $\alpha \le 1$ or $\alpha > 1$, so $B_t$ goes to $\infty$ faster than $\sqrt{t}/(\log t)^{\alpha/(d-2)}$ for
any $\alpha > 1$. Note that in view of the Brownian scaling relationship $B_t =_d t^{1/2} B_1$
we could not sensibly expect escape at a faster rate than $\sqrt{t}$. The last result
shows that the escape rate is not much slower.
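The "little calculus" behind the integral test can be made concrete. Substituting $v = \log t$ gives a closed form for the partial integrals of $t^{-1}\log^{-\alpha} t$ starting from $t = e$ (the lower limit is an arbitrary choice made here, since only the behavior near $\infty$ matters). The sketch below evaluates that closed form; the function name is illustrative.

```python
import math

def partial_integral(alpha, upper):
    """Integral of 1 / (t * log(t)**alpha) over [e, upper], in closed form.

    With v = log t this is the integral of v**(-alpha) over [1, log(upper)].
    """
    log_upper = math.log(upper)
    if alpha == 1:
        return math.log(log_upper)
    return (log_upper ** (1 - alpha) - 1) / (1 - alpha)

for alpha in (0.5, 1, 2):
    print(alpha, [partial_integral(alpha, T) for T in (1e10, 1e100, 1e300)])
```

For $\alpha \le 1$ the values keep growing with the upper limit (very slowly when $\alpha = 1$), while for $\alpha > 1$ they stay below $1/(\alpha - 1)$.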
Review. At this point, we have derived the basic facts about the
recurrence and transience of Brownian motion. What we have found is that
(i) $P_x(|B_t| < 1 \text{ for some } t \ge 0) \equiv 1$ if and only if $d \le 2$
(ii) $P_x(B_t = 0 \text{ for some } t > 0) \equiv 0$ in $d \ge 2$.
The reader should observe that these facts can be traced to properties of what
we have called $\varphi$, the (unique up to linear transformations) spherically
symmetric function that has $\Delta\varphi(x) = 0$ for all $x \ne 0$, that is:
$$\varphi(x) = \begin{cases} |x| & d = 1 \\ \log|x| & d = 2 \\ |x|^{2-d} & d \ge 3 \end{cases}$$
and the features relevant for (i) and (ii) above are
(i) $\varphi(x) \to \infty$ as $|x| \to \infty$ if and only if $d \le 2$
(ii) $|\varphi(x)| \to \infty$ as $x \to 0$ in $d \ge 2$.
3.2. Occupation Times
Let $D = B(0, r) = \{y : |y| < r\}$ the ball of radius r centered at 0. In Section
3.1, we learned that $B_t$ will return to D i.o. in $d \le 2$ but not in $d \ge 3$. In this
section we will investigate the occupation time $\int_0^\infty 1_D(B_t) \, dt$ and show that for
any x
(2.1) $$P_x\left( \int_0^\infty 1_D(B_t) \, dt = \infty \right) = 1 \quad\text{in } d \le 2$$
(2.2) $$E_x \int_0^\infty 1_D(B_t) \, dt < \infty \quad\text{in } d \ge 3$$
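Proof of (2.1) Let $T_0 = 0$ and $G = B(0, 2r)$. For $k \ge 1$, let
$$S_k = \inf\{t > T_{k-1} : B_t \in D\}$$
$$T_k = \inf\{t > S_k : B_t \notin G\}$$
Writing $\tau$ for $T_1$ and using the strong Markov property, we get for $k \ge 1$
$$P_x\left( \int_{S_k}^{T_k} 1_D(B_t) \, dt \ge s \,\Big|\, \mathcal{F}_{S_k} \right) = P_{B(S_k)}\left( \int_0^\tau 1_D(B_t) \, dt \ge s \right) = H(s)$$
From this and (4.5) in Chapter 1 it follows that
$$\int_{S_k}^{T_k} 1_D(B_t) \, dt \quad\text{are i.i.d.}$$
Since these random variables have positive mean it follows from the strong law
of large numbers that
$$\int_0^\infty 1_D(B_t) \, dt \ge \lim_{n \to \infty} \sum_{k=1}^n \int_{S_k}^{T_k} 1_D(B_t) \, dt = \infty \quad\text{a.s.}$$
proving the desired result. □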
Proof of (2.2) If f is a nonnegative function, then Fubini's theorem implies
$$E_x \int_0^\infty f(B_t) \, dt = \int_0^\infty E_x f(B_t) \, dt = \int_0^\infty \int p_t(x,y) f(y) \, dy \, dt = \int \int_0^\infty p_t(x,y) \, dt \, f(y) \, dy$$
where $p_t(x,y) = (2\pi t)^{-d/2} e^{-|x-y|^2/2t}$ is the transition density for Brownian
motion. As $t \to \infty$, $p_t(x,y) \sim (2\pi t)^{-d/2}$, so if $d \le 2$ then $\int p_t(x,y) \, dt = \infty$.
When $d \ge 3$, changing variables $t = |x - y|^2/2s$ gives
(2.3)
$$\int_0^\infty p_t(x,y) \, dt = \int_0^\infty \frac{1}{(2\pi t)^{d/2}} e^{-|x-y|^2/2t} \, dt$$
$$= \int_0^\infty \left( \frac{s}{\pi |x-y|^2} \right)^{d/2} e^{-s} \left( \frac{|x-y|^2}{2s^2} \right) ds$$
$$= \frac{|x-y|^{2-d}}{2\pi^{d/2}} \int_0^\infty s^{(d/2)-2} e^{-s} \, ds = \frac{\Gamma(\frac{d}{2} - 1)}{2\pi^{d/2}} |x-y|^{2-d}$$
where $\Gamma(\alpha) = \int_0^\infty s^{\alpha-1} e^{-s} \, ds$ is the usual gamma function. If we define
$$G(x,y) = \int_0^\infty p_t(x,y) \, dt$$
then in $d \ge 3$, $G(x,y) < \infty$ for $x \ne y$, and
(2.4) $$E_x \int_0^\infty f(B_t) \, dt = \int G(x,y) f(y) \, dy$$
To complete the proof of (2.2) now we observe that taking $f = 1_D$ with $D =
B(0, r)$ and changing to polar coordinates
$$\int_D G(0,y) \, dy = \int_0^r s^{d-1} C_d s^{2-d} \, ds = \frac{C_d}{2} r^2 < \infty$$
To extend the last conclusion to $x \ne 0$ observe that applying the strong Markov
property at the exit time $\tau$ from $B(0, |x|)$ for a Brownian motion starting at 0
and using the rotational symmetry we have
$$E_x \int_0^\infty 1_D(B_s) \, ds = E_0 \int_\tau^\infty 1_D(B_s) \, ds \le E_0 \int_0^\infty 1_D(B_s) \, ds \qquad □$$
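The closed form in (2.3), $\int_0^\infty p_t(x,y)\,dt = \frac{\Gamma(d/2 - 1)}{2\pi^{d/2}} |x-y|^{2-d}$ for d ≥ 3, can be checked numerically. The sketch below integrates the transition density with a trapezoid rule after the substitution $t = e^u$; the integration window and step size are ad hoc choices that are adequate for the cases shown, and all names are illustrative.

```python
import math

def transition_density(t, r, d):
    """Brownian transition density p_t(x, y) as a function of r = |x - y|."""
    return (2 * math.pi * t) ** (-d / 2) * math.exp(-r * r / (2 * t))

def green_numeric(r, d, lo=-40.0, hi=80.0, step=0.01):
    """Trapezoid approximation of the integral of p_t over t in (0, inf),
    computed in the variable u = log t (so dt = t du)."""
    n = int(round((hi - lo) / step))
    total = 0.0
    for k in range(n + 1):
        t = math.exp(lo + k * step)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * transition_density(t, r, d) * t
    return total * step

def green_closed_form(r, d):
    """Right-hand side of (2.3); valid for d >= 3."""
    return math.gamma(d / 2 - 1) / (2 * math.pi ** (d / 2)) * r ** (2 - d)

print(green_numeric(1.0, 3), green_closed_form(1.0, 3))
print(green_numeric(2.0, 4), green_closed_form(2.0, 4))
```

In d = 3 the closed form reduces to $1/(2\pi|x-y|)$, since $\Gamma(1/2) = \sqrt{\pi}$.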
We call G(x,y) the potential kernel, because $G(\cdot, y)$ is the electrostatic
potential of a unit charge at y. See Chapter 3 of Port and Stone (1978) for more
on this. In $d \le 2$, $\int_0^\infty p_t(x,y) \, dt \equiv \infty$ so we have to take another approach to
define a useful G:
$$G(x,y) = \int_0^\infty (p_t(x,y) - a_t) \, dt$$
where the $a_t$ are constants we will choose to make the integral converge (at
least when $x \ne y$). To see why this modified definition might be useful note
that if $\int f(y) \, dy = 0$ then (assuming we can use Fubini's theorem)
$$\int G(x,y) f(y) \, dy = \int_0^\infty E_x f(B_t) \, dt$$
When d = 1, we let $a_t = p_t(0,0)$. With this choice,
$$G(x,y) = \frac{1}{\sqrt{2\pi}} \int_0^\infty \left( e^{-(y-x)^2/2t} - 1 \right) t^{-1/2} \, dt$$
and the integral converges, since the integrand is $\le 0$ and $\sim -(y-x)^2/2t^{3/2}$ as
$t \to \infty$. Changing variables $t = (y - x)^2/2u$ gives
(2.5) $$G(x,y) = \frac{|y-x|}{2\sqrt{\pi}} \int_0^\infty (e^{-u} - 1) \, u^{-3/2} \, du = -|y - x|$$
since integrating by parts gives
$$\int_0^\infty (e^{-u} - 1) \, u^{-3/2} \, du = -2 \int_0^\infty e^{-u} u^{-1/2} \, du = -2\Gamma(1/2) = -2\sqrt{\pi}$$
The computation is almost the same for d = 2. The only thing that changes
is the choice of $a_t$. If we try $a_t = p_t(0,0)$ again, then for $x \ne y$ the integrand
$\sim -t^{-1}$ as $t \to 0$ and the integral diverges, so we let $a_t = p_t(0, e_1)$ where
$e_1 = (1, 0)$. With this choice of $a_t$, we get
(2.6)
$$G(x,y) = \frac{1}{2\pi} \int_0^\infty \left( e^{-|x-y|^2/2t} - e^{-1/2t} \right) t^{-1} \, dt$$
$$= \frac{1}{2\pi} \int_0^\infty \left( \int_{|x-y|^2/2t}^{1/2t} e^{-s} \, ds \right) t^{-1} \, dt$$
$$= \frac{1}{2\pi} \int_0^\infty ds \, e^{-s} \int_{|x-y|^2/2s}^{1/2s} t^{-1} \, dt$$
$$= \frac{1}{2\pi} \left( \int_0^\infty ds \, e^{-s} \right) (-\log(|x-y|^2)) = \frac{-1}{\pi} \log(|x-y|)$$
To sum up, the potential kernels are given by
(2.7) $$G(x,y) = \begin{cases} (\Gamma(d/2 - 1)/2\pi^{d/2}) \cdot |x-y|^{2-d} & d \ge 3 \\ (-1/\pi) \cdot \log(|x-y|) & d = 2 \\ -1 \cdot |x-y| & d = 1 \end{cases}$$
The reader should note that in each case $G(x,y) = C\varphi(|x - y|)$ where $\varphi$ is
the harmonic function we used in Section 3.1. This is, of course, no accident.
$x \to G(x,0)$ is obviously spherically symmetric and, as we will see, satisfies
$\Delta G(x,0) = 0$ for $x \ne 0$, so the results above imply $G(x,0) = A + B\varphi(|x|)$.
The formulas above correspond to A = 0, which is nice and simple. But
what about the weird looking B's? What is special about them? The answer is
simple: They are chosen to make $\frac{1}{2}\Delta G(x,0) = -\delta_0$ (a point mass at 0) in the
distributional sense. It is easy to see that this happens in d = 1. In that case
$\varphi(x) = |x|$ then
$$\varphi'(x) = \begin{cases} 1 & x > 0 \\ -1 & x < 0 \end{cases}$$
so if B = −1 then $B\varphi''(x) = -2\delta_0$. More sophisticated readers can check that
this is also true in $d \ge 2$. (See F. John (1982), pages 96-97.)
To explain why we want to define the potential kernels, we return to Brownian
motion in the half space, first considered in Section 1.4. Let $H = \{y : y_d >
0\}$, $\tau = \inf\{t : B_t \notin H\}$, and let
$$\bar{y} = (y_1, \ldots, y_{d-1}, -y_d)$$
be the reflection of y through the plane $\{y \in R^d : y_d = 0\}$.
(2.8) Theorem. If $x \in H$, $f \ge 0$ has compact support, and $\{x : f(x) > 0\} \subset H$
then
$$E_x \left( \int_0^\tau f(B_t) \, dt \right) = \int G(x,y) f(y) \, dy - \int G(x,\bar{y}) f(y) \, dy$$
Proof Using Fubini's theorem which is justified since $f \ge 0$, then using (4.9)
from Chapter 1 and the fact that $\{x : f(x) > 0\} \subset H$ we have
$$E_x \int_0^\tau f(B_t) \, dt = \int_0^\infty E_x(f(B_t); \tau > t) \, dt$$
$$= \int_0^\infty \int_H (p_t(x,y) - p_t(x,\bar{y})) f(y) \, dy \, dt$$
$$= \int_0^\infty \int_H (p_t(x,y) - a_t) f(y) \, dy \, dt - \int_0^\infty \int_H (p_t(x,\bar{y}) - a_t) f(y) \, dy \, dt$$
$$= \int G(x,y) f(y) \, dy - \int G(x,\bar{y}) f(y) \, dy$$
The compact support of f and the formulas for G imply that $\int |G(x,y) f(y)| \, dy$
and $\int |G(x,\bar{y}) f(y)| \, dy$ are finite, so the last two equalities are valid. □
The proof given above simplifies considerably in the case d > 3; however,
part of the point of the proof above is that, with the definition we have chosen
for G in the recurrent case, the formulas and proofs can be the same for all d.
Let $G_H(x,y) = G(x,y) - G(x,\bar{y})$. We think of $G_H(x,y)$ as the "expected
occupation time (density) at y for a Brownian motion starting at x and killed
when it leaves H." The rationale for this interpretation is that (for suitable f)
$$E_x \left( \int_0^\tau f(B_t) \, dt \right) = \int G_H(x,y) f(y) \, dy$$
With this interpretation for $G_H$ introduced, we invite the reader to pause
for a minute and imagine what $y \to G_H(x,y)$ looks like in one dimension. If
you don't already know the answer, you will probably not guess the behavior
as $y \to \infty$. So much for small talk. The computation is easier than guessing
the answer:
$$G(x,y) = -|x - y|$$
so $G_H(x,y) = -|x - y| + |x + y|$. Separating things into cases, we see that
$$G_H(x,y) = \begin{cases} -(x - y) + (x + y) = 2y & \text{when } 0 < y < x \\ -(y - x) + (x + y) = 2x & \text{when } x < y \end{cases}$$
so we can write
(2.9) $$G_H(x,y) = 2(x \wedge y) \quad\text{for all } x, y > 0$$
It is somewhat surprising that $y \to G_H(x,y)$ is constant = 2x for $y \ge x$, that
is, all points y > x have the same expected occupation time!
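Formula (2.9) can also be checked without going through the compensated kernels: in d = 1 the killed transition density is $p_t(x,y) - p_t(x,-y)$, and its integral over t should equal $2(x \wedge y)$. The sketch below does this numerically with a trapezoid rule in the variable $u = \log t$. The window, step size, and function names are illustrative choices, and `math.expm1` is used only to avoid cancellation when t is large.

```python
import math

def killed_density(t, x, y):
    """p_t(x, y) - p_t(x, -y) in d = 1, for x, y > 0.

    Uses e^{-A} - e^{-(A + gap)} = -e^{-A} * expm1(-gap) with
    A = (x - y)^2 / 2t and gap = 2xy / t.
    """
    a_term = (x - y) ** 2 / (2 * t)
    gap = 2 * x * y / t
    return -math.exp(-a_term) * math.expm1(-gap) / math.sqrt(2 * math.pi * t)

def occupation_density(x, y, lo=-40.0, hi=80.0, step=0.01):
    """Trapezoid approximation of the integral over t in (0, inf) of the
    killed density, computed in u = log t (so dt = t du)."""
    n = int(round((hi - lo) / step))
    total = 0.0
    for k in range(n + 1):
        t = math.exp(lo + k * step)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * killed_density(t, x, y) * t
    return total * step

for y in (0.5, 1.0, 2.0, 5.0):
    print(y, occupation_density(1.0, y), 2 * min(1.0, y))
```

With x = 1 fixed, the numerical values flatten out at 2 once y passes x, which is the constancy remarked on in the text.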
3.3. Exit Times
In this section we investigate the moments of the exit times $\tau = \inf\{t : B_t \notin G\}$
for various open sets. We begin with $G = \{x : |x| < r\}$ in which case $\tau = S_r$ in
the notation of Section 3.1.
(3.1) Theorem. If $|x| < r$ then $E_x S_r = (r^2 - |x|^2)/d$
Proof The key is the observation that
$$|B_t|^2 - dt = \sum_{i=1}^d \{ (B_t^i)^2 - t \}$$
being the sum of d martingales is a martingale. Using the optional stopping
theorem at the bounded stopping time $S_r \wedge t$ we have
$$|x|^2 = E_x \{ |B_{S_r \wedge t}|^2 - (S_r \wedge t) d \}$$
(1.3) tells us that $E_x S_r < \infty$, and we have $|B_{S_r \wedge t}|^2 \le r^2$, so letting $t \to \infty$
and using the dominated convergence theorem gives $|x|^2 = E_x(r^2 - S_r d)$ which
implies the desired result. □
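A discrete analog gives an exact check of the shape of (3.1) in d = 1. For a simple symmetric random walk on the integers (an assumption made here for illustration, not the process of the theorem), the mean exit time m(x) from (−r, r) solves $m(x) = 1 + \frac{1}{2}(m(x-1) + m(x+1))$ with $m(\pm r) = 0$, and the solution is again $r^2 - x^2$. The sketch solves this tridiagonal system exactly with rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction

def solve_tridiagonal(sub, diag, sup, rhs):
    """Thomas algorithm for a tridiagonal system.

    sub[i], diag[i], sup[i] are the entries of row i to the left of, on,
    and to the right of the diagonal (sub[0] and sup[-1] are unused).
    """
    n = len(diag)
    c_prime = [Fraction(0)] * n
    d_prime = [Fraction(0)] * n
    c_prime[0] = sup[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - sub[i] * c_prime[i - 1]
        c_prime[i] = sup[i] / denom
        d_prime[i] = (rhs[i] - sub[i] * d_prime[i - 1]) / denom
    solution = [Fraction(0)] * n
    solution[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        solution[i] = d_prime[i] - c_prime[i] * solution[i + 1]
    return solution

def mean_exit_times(r):
    """Mean exit time from (-r, r) for simple random walk, for each start x."""
    n = 2 * r - 1                      # interior points -r+1, ..., r-1
    half = Fraction(1, 2)
    sol = solve_tridiagonal([-half] * n, [Fraction(1)] * n,
                            [-half] * n, [Fraction(1)] * n)
    return {x: sol[x + r - 1] for x in range(-r + 1, r)}

times = mean_exit_times(5)
print([(x, times[x], 25 - x * x) for x in range(-4, 5)])
```

The exact rational solution matches $r^2 - x^2$ at every interior point.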
Exercise 3.1. Let a, b > 0 and $T = \inf\{t : B_t \notin (-a, b)\}$. Show $E_0 T = ab$.
To get more formulas like (3.1) we need more martingales. Applying Ito's
formula, (10.2) in Chapter 2, with $X_t^1 = X_t$, a continuous local martingale, and
$X_t^2 = \langle X \rangle_t$ we obtain
(3.2)
$$f(X_t, \langle X \rangle_t) - f(X_0, 0) = \int_0^t D_1 f(X_s, \langle X \rangle_s) \, dX_s$$
$$+ \int_0^t D_2 f(X_s, \langle X \rangle_s) \, d\langle X \rangle_s + \frac{1}{2} \int_0^t D_{11} f(X_s, \langle X \rangle_s) \, d\langle X \rangle_s$$
From (3.2) we see that if $(\frac{1}{2} D_{11} + D_2) f = 0$, then $f(X_t, \langle X \rangle_t)$ is a local
martingale. Examples of such functions are
$$f(x,y) = x, \quad x^2 - y, \quad x^3 - 3xy, \quad x^4 - 6x^2 y + 3y^2, \ldots$$
or to expose the pattern
$$f_n(x,y) = \sum_{0 \le m \le [n/2]} c_{n,m} x^{n-2m} y^m$$
where [n/2] denotes the largest integer $\le n/2$, $c_{n,0} = 1$ and for $0 \le m < [n/2]$
we pick
$$\frac{1}{2} c_{n,m} (n - 2m)(n - 2m - 1) = -(m + 1) c_{n,m+1}$$
so that $D_{xx}/2$ of the mth term is cancelled by $D_y$ of the (m + 1)th. ($D_{xx}$ of
the [n/2]th term is 0.)
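The recursion for the coefficients $c_{n,m}$ is easy to mechanize. The sketch below generates the coefficients with exact rational arithmetic and applies the operator $\frac{1}{2}D_{xx} + D_y$ to the resulting polynomial, stored as a dictionary from exponent pairs (i, j) to the coefficient of $x^i y^j$. The representation and function names are choices made here for illustration.

```python
from fractions import Fraction

def coefficients(n):
    """c_{n,0}, ..., c_{n,[n/2]} from c_{n,0} = 1 and
    (1/2) c_{n,m} (n-2m)(n-2m-1) = -(m+1) c_{n,m+1}."""
    c = [Fraction(1)]
    for m in range(n // 2):
        c.append(-Fraction((n - 2 * m) * (n - 2 * m - 1), 2 * (m + 1)) * c[m])
    return c

def polynomial(n):
    """f_n as a dict mapping (power of x, power of y) -> coefficient."""
    return {(n - 2 * m, m): c for m, c in enumerate(coefficients(n))}

def heat_operator(poly):
    """Apply (1/2) D_xx + D_y term by term; zero coefficients are dropped."""
    out = {}
    for (i, j), c in poly.items():
        if i >= 2:
            out[(i - 2, j)] = out.get((i - 2, j), 0) + c * Fraction(i * (i - 1), 2)
        if j >= 1:
            out[(i, j - 1)] = out.get((i, j - 1), 0) + c * j
    return {key: value for key, value in out.items() if value != 0}

print(coefficients(3), coefficients(4))
print(heat_operator(polynomial(4)))  # empty dict: every term cancels
```

For n = 3 and n = 4 this reproduces the coefficients of $x^3 - 3xy$ and $x^4 - 6x^2y + 3y^2$ listed above.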
The first two of our functions give us nothing new ($X_t$ and $X_t^2 - \langle X \rangle_t$ are
local martingales), but after that we get some new local martingales:
$$X_t^3 - 3X_t \langle X \rangle_t, \quad X_t^4 - 6X_t^2 \langle X \rangle_t + 3\langle X \rangle_t^2, \ldots$$
These local martingales are useful for computing expectations for one
dimensional Brownian motion.
(3.3) Theorem. Let $\tau_a = \inf\{t : |B_t| \ge a\}$. Then
(i) $E_0 \tau_a = a^2$, (ii) $E_0 \tau_a^2 = 5a^4/3$.
The dependence of the moments on a is easy to explain: the Brownian scaling
relationship $B_{ct} =_d c^{1/2} B_t$ implies that $\tau_a/a^2 =_d \tau_1$.
Proof (i) follows from (3.1), so we will only prove (ii). To do this let $X_t =
B_t^4 - 6B_t^2 t + 3t^2$ and $T_n \le n$ be stopping times so that $T_n \uparrow \infty$ and $X_{t \wedge T_n}$ is a
martingale. Since $T_n \le n$ the optional stopping theorem implies
$$0 = E_0 \{ B_{\tau_a \wedge T_n}^4 - 6B_{\tau_a \wedge T_n}^2 (\tau_a \wedge T_n) + 3(\tau_a \wedge T_n)^2 \}$$
Now $|B_{\tau_a \wedge T_n}| \le a$, so using (1.3) and the dominated convergence theorem we
can let $n \to \infty$ to conclude $0 = a^4 - 6a^2 E_0 \tau_a + 3E_0 \tau_a^2$. Using (i) and rearranging
gives (ii). □
Exercise 3.2 Find a, b, c so that $B_t^6 - aB_t^4 t + bB_t^2 t^2 - ct^3$ is a (local) martingale
and use this to compute $E_0 \tau_a^3$.
Our next result is a special case of the Burkholder Davis Gundy
inequalities, (5.1), but is needed to prove (4.4) which is a key step in our proof of
(5.1).
(3.4) Theorem. If $X_t$ is a continuous local martingale with $X_0 = 0$ then
$$E\left( \sup_{s \le t} X_s^4 \right) \le 381 \, E\langle X \rangle_t^2$$
Proof First suppose that $|X_t|$ and $\langle X \rangle_t$ are $\le M$ for all t. In this case (2.5)
in Chapter 2 implies $X_t^4 - 6X_t^2 \langle X \rangle_t + 3\langle X \rangle_t^2$ is a martingale, so its expectation
is 0. Rearranging and using the Cauchy-Schwarz inequality
$$EX_t^4 + 3E\langle X \rangle_t^2 = 6E(X_t^2 \langle X \rangle_t) \le 6(EX_t^4)^{1/2} (E\langle X \rangle_t^2)^{1/2}$$
Using the $L^4$ maximal inequality ((4.3) in Chapter 4 of Durrett (1995)) and the
fact that $(4/3)^4 \le 3.1605 < 19/6$ we have
$$E \sup_{s \le t} X_s^4 \le (4/3)^4 EX_t^4 \le 19 \left( E \sup_{s \le t} X_s^4 \right)^{1/2} (E\langle X \rangle_t^2)^{1/2}$$
Since $|X_s| \le M$ for all s we can divide each side by $E(\sup_{s \le t} X_s^4)^{1/2}$ then square
to get
$$E\left( \sup_{s \le t} X_s^4 \right) \le 381 \, E\langle X \rangle_t^2$$
The last inequality holds for a general martingale if we replace t by $T_n = \inf\{t :
t, |X_t|, \text{ or } \langle X \rangle_t \ge n\}$. Using that conclusion, letting $n \to \infty$ and using the
monotone convergence theorem, we have the desired result. □
If we notice that $f(x,y) = \exp(x - y/2)$ satisfies $(\frac{1}{2} D_{11} + D_2) f = 0$, then
we get another useful result.
(3.5) The Exponential Local Martingale. If X is a continuous local
martingale, then $\mathcal{E}(X)_t = \exp(X_t - \frac{1}{2}\langle X \rangle_t)$ is a local martingale.
If we let $Y_t = \exp(X_t - \frac{1}{2}\langle X \rangle_t)$, then (3.2) says that
(3.6) $$Y_t - Y_0 = \int_0^t Y_s \, dX_s$$
or, in stochastic differential notation, that $dY_t = Y_t \, dX_t$. This property gives
$Y_t$ the right to be called the martingale exponential of $X_t$. As in the case of the
ordinary differential equation
$$f'(t) = f(t) a(t) \qquad f(0) = 1$$
which for a given continuous function a(t) has unique solution $f(t) = \exp(A_t)$,
where $A_t = \int_0^t a(s) \, ds$, it is possible to prove (under suitable assumptions) that
Y is the only solution of (3.6). See Doleans-Dade (1970) for details.
The exponential local martingale will play an important role in Section 5.3.
Then (and now) it will be useful to know when the exponential local martingale
is a martingale. The next result is not very sophisticated (see (1.14) and (1.15)
in Chapter VIII of Revuz and Yor (1991) for better results) but is enough for
our purposes.
(3.7) Theorem. Suppose $X_t$ is a continuous local martingale with $\langle X \rangle_t \le Mt$
and $X_0 = 0$. Then $Y_t = \exp(X_t - \frac{1}{2}\langle X \rangle_t)$ is a martingale.
Proof Let $Z_t = \exp(2X_t - \frac{1}{2}\langle 2X \rangle_t)$, which is a local martingale by (3.5).
Now,
$$Y_t^2 = \exp(2X_t - \langle X \rangle_t) = Z_t \exp(\langle X \rangle_t)$$
So if $T_n$ is a sequence of times that reduces $Z_t$, the $L^2$ maximal inequality
applied to $Y_{s \wedge T_n}$ gives
$$E\left( \sup_{s \le t} Y_{s \wedge T_n}^2 \right) \le 4EY_{t \wedge T_n}^2 \le 4e^{Mt} E(Z_{t \wedge T_n}) = 4e^{Mt}$$
Letting $n \uparrow \infty$ and using the monotone convergence theorem we have
$$4e^{Mt} \ge E\left( \sup_{s \le t} Y_s^2 \right) \ge \left( E \sup_{s \le t} |Y_s| \right)^2$$
by Jensen's inequality. Using (2.5) in Chapter 2 now, we see that $Y_t$ is a
martingale. □
Remark. It follows from the last proof that if $X_t$ is a continuous local
martingale with $X_0 = 0$ and $\langle X \rangle_t \le M$ for all t then $Y_t = \exp(X_t - \frac{1}{2}\langle X \rangle_t)$ is a
martingale in $\mathcal{M}^2$.
Letting $\theta \in R$ and setting $X_t = \theta B_t$ in (3.6), where $B_t$ is a one dimensional
Brownian motion, gives us a family of martingales $\exp(\theta B_t - \theta^2 t/2)$. These
martingales are useful for computing the distribution of hitting times associated
with Brownian motion.
(3.8) Theorem. Let $T_a = \inf\{t : B_t = a\}$. Then for a > 0 and $\lambda \ge 0$
$$E_0 \exp(-\lambda T_a) = e^{-a\sqrt{2\lambda}}$$
Remark. If you are good at inverting Laplace transforms, you can use (3.8)
to prove (4.1) in Chapter 1:
$$P_0(T_a \le t) = \int_0^t (2\pi s^3)^{-1/2} a e^{-a^2/2s} \, ds$$
Proof $P_0(T_a < \infty) = 1$ by (1.5). Let $X_t = \exp(\theta B_t - \theta^2 t/2)$ and $S_n \le n$ be
stopping times so that $S_n \uparrow \infty$ and $X_{t \wedge S_n}$ is a martingale. Since $S_n \le n$, the
optional stopping theorem implies
$$1 = E_0 \exp(\theta B_{T_a \wedge S_n} - \theta^2 (T_a \wedge S_n)/2)$$
If $\theta \ge 0$ the right-hand side is $\le \exp(\theta a)$ so letting $n \to \infty$ and using the
bounded convergence theorem we have $1 = E_0 \exp(\theta a - \theta^2 T_a/2)$. Taking $\theta =
\sqrt{2\lambda}$ now gives the desired result. □
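The two formulas in (3.8) and the remark after it can be tested against each other numerically: integrating $e^{-\lambda s}$ against the density $(2\pi s^3)^{-1/2} a e^{-a^2/2s}$ should return $e^{-a\sqrt{2\lambda}}$. The sketch below does this with a trapezoid rule in the variable $u = \log s$. It goes in the easy direction (density to transform), not the inversion mentioned in the remark, and the window and step size are ad hoc choices suited to $\lambda$ of moderate size.

```python
import math

def hitting_density(s, a):
    """Density of T_a under P_0, as stated in the remark after (3.8)."""
    return a / math.sqrt(2 * math.pi * s ** 3) * math.exp(-a * a / (2 * s))

def laplace_numeric(lam, a, lo=-40.0, hi=12.0, step=0.005):
    """Trapezoid approximation of E_0 exp(-lam * T_a), integrating in
    u = log s (so ds = s du)."""
    n = int(round((hi - lo) / step))
    total = 0.0
    for k in range(n + 1):
        s = math.exp(lo + k * step)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(-lam * s) * hitting_density(s, a) * s
    return total * step

def laplace_closed_form(lam, a):
    """Right-hand side of (3.8)."""
    return math.exp(-a * math.sqrt(2 * lam))

print(laplace_numeric(1.0, 1.0), laplace_closed_form(1.0, 1.0))
print(laplace_numeric(0.5, 2.0), laplace_closed_form(0.5, 2.0))
```

For $\lambda = 1/2$ and a = 2 the closed form is $e^{-2}$, which makes the agreement easy to eyeball.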
(3.9) Theorem. Let $\tau_a = \inf\{t : |B_t| \ge a\}$. Then for a > 0 and $\lambda \ge 0$,
$$E_0 \exp(-\lambda \tau_a) = 2e^{-a\sqrt{2\lambda}} / (1 + e^{-2a\sqrt{2\lambda}})$$
Proof Let $\psi_a(\lambda) = E_0 \exp(-\lambda T_a)$. Applying the strong Markov property at
time $\tau_a$ (and dropping the subscript a to make the formula easier to typeset)
gives
$$E_0 \exp(-\lambda T_a) = E_0(\exp(-\lambda\tau); B_\tau = a) + E_0(\exp(-\lambda\tau) \psi_{2a}(\lambda); B_\tau = -a)$$
Symmetry dictates that $(\tau, B_\tau) =_d (\tau, -B_\tau)$. Since $B_\tau \in \{-a, a\}$ it follows that
$\tau$ and $B_\tau$ are independent, and we have
$$\psi_a(\lambda) = \frac{1}{2}(1 + \psi_{2a}(\lambda)) E_0 \exp(-\lambda\tau)$$
Using the expression for $\psi_a(\lambda)$ given in (3.8) now gives the desired result. □
Another consequence of (3.8) is a bit of calculus that will come in handy
in Section 7.2.
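(3.10) Theorem. If $\theta > 0$ then
$$\int_0^\infty \frac{1}{\sqrt{2\pi t}} \exp(-z^2/2t) \exp(-\theta t) \, dt = \frac{1}{\sqrt{2\theta}} \exp(-|z|\sqrt{2\theta})$$
Proof Changing variables $t = 1/2s$, $dt = -ds/2s^2$ the integral above
$$= \int_0^\infty \sqrt{\frac{s}{\pi}} \, e^{-z^2 s} e^{-\theta/2s} \, \frac{ds}{2s^2} = \frac{1}{\sqrt{2\theta}} \int_0^\infty e^{-z^2 s} \frac{\sqrt{\theta}}{\sqrt{2\pi s^3}} e^{-\theta/2s} \, ds$$
Using (3.8) now with $a = \sqrt{\theta}$ and $\lambda = z^2$ and consulting the remark after (3.8)
for the density function of $T_a$ we see that the last expression is equal to the
right-hand side of (3.10). □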
The exponential martingale can also be used to study a one dimensional
Brownian motion $B_t$ plus drift. Let $Z_t = \sigma B_t + \mu t$ where $\sigma > 0$ and $\mu$ is real.
$X_t = Z_t - \mu t$ is a martingale with $\langle X \rangle_t = \sigma^2 t$ so (3.7) implies that
$$\exp(\theta(Z_t - \mu t) - \theta^2 \sigma^2 t/2)$$
is a martingale. If $\theta = -2\mu/\sigma^2$ then $-\theta\mu - \theta^2\sigma^2/2 = 0$ and $\exp(-(2\mu/\sigma^2) Z_t)$
is a local martingale. Repeating the proof of (3.8) one gets
Exercise 3.3. Let $T_{-a} = \inf\{t : Z_t = -a\}$. If $a, \mu > 0$ then
$$P_0(T_{-a} < \infty) = \exp(-2a\mu/\sigma^2)$$
3.4. Change of Time, Levy's Theorem
In this section we will prove Levy's characterization of Brownian motion (4.1)
and use it to show that every continuous local martingale is a time change of
Brownian motion.
(4.1) Theorem. If $X_t$ is a continuous local martingale with $X_0 = 0$ and
$\langle X \rangle_t = t$, then $X_t$ is a one dimensional Brownian motion.
Proof By (4.5) in Chapter 1 it suffices to show
(4.2) Lemma. For any s and t, $X_{s+t} - X_s$ is independent of $\mathcal{F}_s$ and has a
normal distribution with mean 0 and variance t.
Proof of (4.2) Applying the complex version of Ito's formula, (7.9) in
Chapter 2, to $X'_r = X_{s+r} - X_s$ and $f(x) = e^{i\theta x}$, we get
$$e^{i\theta X'_t} - 1 = i\theta \int_0^t e^{i\theta X'_u} \, dX'_u - \frac{\theta^2}{2} \int_0^t e^{i\theta X'_u} \, du$$
Let $\mathcal{F}'_r = \mathcal{F}_{s+r}$ and let $A \in \mathcal{F}_s = \mathcal{F}'_0$. The first term on the right, which we
will call $Y_t$, is a local martingale with respect to $\mathcal{F}'_t$. To get rid of that term let
$T_n \uparrow \infty$ be a sequence of stopping times that reduces $Y_t$, replace t by $t \wedge T_n$,
and integrate over A. The definition of conditional expectation implies
$$E(Y_{t \wedge T_n}; A) = E(E(Y_{t \wedge T_n} | \mathcal{F}'_0); A) = E(Y_0; A) = 0$$
since $Y_0 = 0$. So we have
$$E(e^{i\theta X'_{t \wedge T_n}}; A) - P(A) = 0 - \frac{\theta^2}{2} E\left( \int_0^{t \wedge T_n} e^{i\theta X'_u} \, du; A \right)$$
Since $|e^{i\theta x}| = 1$, letting $n \to \infty$ and using the bounded convergence theorem
gives
$$E(e^{i\theta X'_t}; A) - P(A) = -\frac{\theta^2}{2} E\left( \int_0^t e^{i\theta X'_u} \, du; A \right) = -\frac{\theta^2}{2} \int_0^t E(e^{i\theta X'_u}; A) \, du$$
by Fubini's theorem. (The integrand is bounded and the two measures are
finite.) Writing $j(t) = E(e^{i\theta X'_t}; A)$, the last equality says
$$j(t) - P(A) = -\frac{\theta^2}{2} \int_0^t j(u) \, du$$
Since we know that $|j(s)| \le 1$, it follows that $|j(t) - j(u)| \le |t - u| \theta^2/2$,
so j is continuous and we can differentiate the last equation to conclude j is
differentiable with
$$j'(t) = -\frac{\theta^2}{2} j(t)$$
Together with $j(0) = P(A)$, this shows that $j(t) = P(A) e^{-\theta^2 t/2}$, or
$$E(e^{i\theta X'_t}; A) = \int_A e^{-\theta^2 t/2} \, dP$$
Since this holds for all $A \in \mathcal{F}'_0$ it follows that
(4.3) $$E(e^{i\theta X'_t} | \mathcal{F}'_0) = e^{-\theta^2 t/2}$$
or in words, the conditional characteristic function of $X'_t$ is that of the normal
distribution with mean 0 and variance t.
To get from this to (4.2) we first take expected values of both sides to
conclude that $X'_t$ has a normal distribution with mean 0 and variance t. The
fact that the conditional characteristic function is a constant suggests that $X'_t$
is independent of $\mathcal{F}'_0$. To turn this intuition into a proof let g be a $C^1$ function
with compact support, and let
$$\varphi(\theta) = \int e^{i\theta x} g(x) \, dx$$
be its Fourier transform. We have assumed more than enough to conclude that
$\varphi$ is integrable and hence
$$g(x) = \frac{1}{2\pi} \int e^{i\theta x} \varphi(-\theta) \, d\theta$$
Multiplying each side of (4.3) by $\varphi(-\theta)$ and integrating we see that $E(g(X'_t) | \mathcal{F}'_0)$
is a constant and hence
$$E(g(X'_t) | \mathcal{F}'_0) = Eg(X'_t)$$
A monotone class argument now shows that the last conclusion is true for
any bounded measurable g. Taking $g = 1_B$ and integrating the last equality
over $A \in \mathcal{F}'_0$ we have
$$P(A) P(X'_t \in B) = \int_A E(1_B(X'_t) | \mathcal{F}'_0) \, dP = P(A \cap \{X'_t \in B\})$$
by the definition of conditional expectation, and we have proved the desired
independence. □
Exercise 4.1. Suppose $X_t^i$, $1 \le i \le d$ are continuous local martingales with
$X_0 = 0$ and
$$\langle X^i, X^j \rangle_t = \begin{cases} t & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$
then $X_t = (X_t^1, \ldots, X_t^d)$ is a d-dimensional Brownian motion.
An immediate consequence of (4.1) is:
(4.4) Theorem. Every continuous local martingale with $X_0 = 0$ and having
$\langle X \rangle_\infty \equiv \infty$ is a time change of Brownian motion. To be precise if we let $\gamma(u) =
\inf\{t : \langle X \rangle_t > u\}$ then $B_u = X_{\gamma(u)}$ is a Brownian motion and $X_t = B_{\langle X \rangle_t}$.
Proof Since $\gamma(\langle X \rangle_t) = t$ the second equality is an immediate consequence of
the first. To prove that we note Exercise 3.8 of Chapter 2 implies that $u \to B_u$
is continuous, so it suffices to show that $B_u$ and $B_u^2 - u$ are local martingales.
(4.5) Lemma. $B_u$, $\mathcal{F}_{\gamma(u)}$, $u \ge 0$ is a local martingale.
Proof of (4.5) Let $T_n = \inf\{t : |X_t| > n\}$. The optional stopping theorem
implies that if u < v then
$$E(X_{\gamma(v) \wedge T_n} | \mathcal{F}_{\gamma(u)}) = X_{\gamma(u) \wedge T_n}$$
where we have used Exercise 2.1 in Chapter 2 to replace $\mathcal{F}_{\gamma(u) \wedge T_n}$ by $\mathcal{F}_{\gamma(u)}$. To
let $n \to \infty$ we observe that the $L^2$ maximal inequality, the fact that $X_{\gamma(v) \wedge T_n}^2 -
\langle X \rangle_{\gamma(v) \wedge T_n}$ is a martingale, and the definition of $\gamma(v)$ imply
$$E \sup_n X_{\gamma(v) \wedge T_n}^2 \le 4 \sup_n EX_{\gamma(v) \wedge T_n}^2 = 4 \sup_n E\langle X \rangle_{\gamma(v) \wedge T_n} \le 4v$$
The last result and the dominated convergence theorem imply that as $n \to
\infty$, $X_{\gamma(t) \wedge T_n} \to X_{\gamma(t)}$ in $L^2$ for t = u, v. Since conditional expectation is a
contraction in $L^2$ it follows that $E(X_{\gamma(v) \wedge T_n} | \mathcal{F}_{\gamma(u)}) \to E(X_{\gamma(v)} | \mathcal{F}_{\gamma(u)})$ in $L^2$
and the proof is complete. □
To complete the proof of (4.4) now it remains to show
(4.6) Lemma. $B_u^2 - u$, $\mathcal{F}_{\gamma(u)}$, $u \ge 0$ is a local martingale.
Proof of (4.6) As in the proof of (4.5), the optional stopping theorem implies
that if u < v then
$$E(X_{\gamma(v) \wedge T_n}^2 - \langle X \rangle_{\gamma(v) \wedge T_n} | \mathcal{F}_{\gamma(u)}) = X_{\gamma(u) \wedge T_n}^2 - \langle X \rangle_{\gamma(u) \wedge T_n}$$
To let $n \to \infty$ we observe that using $(a + b)^2 \le 2a^2 + 2b^2$, (3.4), and the
definition of $\gamma(v)$ then
$$E \sup_n \left( X_{\gamma(v) \wedge T_n}^2 - \langle X \rangle_{\gamma(v) \wedge T_n} \right)^2 \le 2E \sup_n X_{\gamma(v) \wedge T_n}^4 + 2E\langle X \rangle_{\gamma(v)}^2 \le CE\langle X \rangle_{\gamma(v)}^2 \le Cv^2$$
The proof can now be completed as in (4.5) by using the dominated convergence
theorem, and the fact that conditional expectation is a contraction in $L^2$. □
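Our next goal is to extend (4.4) to $X_t$ with $P(\langle X \rangle_\infty < \infty) > 0$. In this
case $X_{\gamma(u)}$ is a Brownian motion run for an amount of time $\langle X \rangle_\infty$. The first
step in making this precise is to prove
(4.7) Lemma. $\lim_{t \uparrow \infty} X_t$ exists almost surely on $\{\langle X \rangle_\infty < \infty\}$.
Proof Let $T_n = \inf\{t : \langle X \rangle_t \ge n\}$. (3.7) in Chapter 2 implies that
$$\langle X^{T_n} \rangle_t = \langle X \rangle_{t \wedge T_n} \le n$$
Using this with Exercise 4.3 in Chapter 2 we get $X_{t \wedge T_n} \in \mathcal{M}^2$ so $\lim_{t \to \infty} X_{t \wedge T_n}$
exists almost surely and in $L^2$. The last statement shows $\lim_{t \to \infty} X_t$ exists
almost surely on $\{T_n = \infty\} \supset \{\langle X \rangle_\infty < n\}$. Letting $n \to \infty$ now gives (4.7). □
To prove the promised extension of (4.4) now, let $\gamma(u) = \inf\{t : \langle X \rangle_t > u\}$
when $u < \langle X \rangle_\infty$, let $X_\infty = \lim_{t \to \infty} X_t$ on $\{\langle X \rangle_\infty < \infty\}$, let $B_t$ be a Brownian
motion which is independent of $\{X_t, t \ge 0\}$, and let
$$Y_u = \begin{cases} X_{\gamma(u)} & u < \langle X \rangle_\infty \\ X_\infty + B(u - \langle X \rangle_\infty) & u \ge \langle X \rangle_\infty \end{cases}$$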
(4.8) Theorem. Y is a Brownian motion.
Proof By (4.1) it suffices to show that $Y_u$ and $Y_u^2 - u$ are local martingales
with respect to the filtration $\sigma(Y_t : t \le u)$. This holds on $[0, \langle X \rangle_\infty]$ for reasons
indicated in the proof of (4.4). It holds on $[\langle X \rangle_\infty, \infty)$ because B is a Brownian
motion independent of X. □
The reason for our interest in (4.8) is that it leads to a converse of (4.7).
(4.9) Theorem. The following sets are equal almost surely:
$C = \{\lim_{t \to \infty} X_t \text{ exists}\}$    $B = \{\sup_t |X_t| < \infty\}$
$A = \{\langle X \rangle_\infty < \infty\}$    $B_+ = \{\sup_t X_t < \infty\}$
Proof Clearly, $C \subset B \subset B_+$. In Section 3.1 we showed that Brownian motion
has $\limsup_{t \to \infty} B_t = \infty$. This and (4.8) implies that $A^c \subset B_+^c$, or $B_+ \subset A$.
Finally (4.7) shows that $A \subset C$. □
The result in (4.8) can be used to justify the assertion we made at the
beginning of Section 2.6 that $\Pi_3(X)$ is the largest possible class of integrands.
Suppose $H \in \Pi$ and let $T = \sup\{t : \int_0^t H_s^2 \, d\langle X \rangle_s < \infty\}$. $(H \cdot X)_t$ can be defined
for t < T and has
$$\langle H \cdot X \rangle_t = \int_0^t H_s^2 \, d\langle X \rangle_s$$
so on $\{\langle H \cdot X \rangle_T = \infty\} \cap \{T < \infty\}$ we have
$$\limsup_{t \uparrow T} (H \cdot X)_t = \infty \qquad \liminf_{t \uparrow T} (H \cdot X)_t = -\infty$$
and there is no reasonable way to continue to define $(H \cdot X)_t$ for $t \ge T$.
Convergence is not the only property of local martingales that can be
studied using (4.9). Almost any almost-sure property concerning the Brownian
path can be translated into a corresponding result for local martingales. This
immediately gives us a number of theorems about the behavior of paths of local
martingales. We will state only:
(4.10) Law of the Iterated Logarithm. Let $L(t) = \sqrt{2t \log\log t}$ for $t \ge e$.
Then on $\{\langle X \rangle_\infty = \infty\}$,
$$\limsup_{t \to \infty} X_t / L(\langle X \rangle_t) = 1 \quad\text{a.s.}$$
Proof This follows from (4.9) and the result for Brownian motion proved in
Section 7.9 of Durrett (1995). □
Finally we have a distributional result that can be derived by time change.
Exercise 4.2. Suppose $h : [0, \infty) \to R$ is measurable and locally bounded.
Use (4.4) to generalize Exercise 6.7 in Chapter 2 and conclude that
$$X_t = \int_0^t h_s \, dB_s \quad\text{is normal with mean 0 and variance}\quad \int_0^t h_s^2 \, ds$$
3.5. Burkholder Davis Gundy Inequalities
Let $X_t$ be a local martingale with $X_0 = 0$ and let $X_t^* = \sup_{s \le t} |X_s|$. This
section is devoted to a proof of the following inequalities.
(5.1) Theorem. For any $0 < p < \infty$ there are constants $0 < c, C < \infty$ so that
$$cE\langle X \rangle_t^{p/2} \le E(X_t^*)^p \le CE\langle X \rangle_t^{p/2}$$
Remark. This result should be contrasted with the $L^p$ maximal inequality for
martingales that only holds for $1 < p < \infty$
$$E(X_t^*)^p \le \left( \frac{p}{p-1} \right)^p E|X_t|^p$$
Levy's Theorem, (4.4), tells us that any continuous local martingale is a time
change of Brownian motion $B_t$, so it suffices to let $B_t^* = \sup_{s \le t} |B_s|$ and show
that
(5.2) Theorem. For any $0 < p < \infty$ there are constants $0 < c, C < \infty$ so that
for any stopping time $\tau$
$$cE\tau^{p/2} \le E(B_\tau^*)^p \le CE\tau^{p/2}$$
Proof of (5.2) The key is the following pair of odd looking inequalities.
(5.3) Lemma. Let $\beta > 1$ and $\delta > 0$. Then for any $\lambda > 0$
(a) $$P(B_\tau^* > \beta\lambda, \ \tau^{1/2} \le \delta\lambda) \le \frac{\delta^2}{(\beta - 1)^2} P(B_\tau^* > \lambda)$$
(b) $$P(\tau^{1/2} > \beta\lambda, \ B_\tau^* \le \delta\lambda) \le \frac{\delta^2}{\beta^2 - 1} P(\tau^{1/2} > \lambda)$$
Remark. The inequalities above are called "good $\lambda$" inequalities, although the
reason for the name is obscured by our formulation (which is from Burkholder
(1973)). The name "good $\lambda$" comes from the fact that early versions of this
and similar inequalities (see Theorems 3.1 and 4.1 in Burkholder, Gundy, and
Silverstein (1971)) were formulated as $P(f > \lambda) \le C_{\beta,K} P(g > \lambda)$ for all $\lambda$ that
satisfy $P(g > \lambda) \le KP(g > \beta\lambda)$. Here $\beta, K > 1$.
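Proof. It is enough to prove the result for bounded $\tau$, for if the result holds
for $\tau \wedge n$ for all $n$, it also holds for $\tau$. Let
$$\begin{aligned}
S_1 &= \inf\{t : |B(t \wedge \tau)| > \lambda\} \\
S_2 &= \inf\{t : |B(t \wedge \tau)| > \beta\lambda\} \\
T &= \inf\{t : (t \wedge \tau)^{1/2} > \delta\lambda\}
\end{aligned}$$
Since $B_\tau^* > \beta\lambda$ implies $S_1 < S_2 < \tau$ and $\tau^{1/2} \le \delta\lambda$ implies $T = \infty$ we have
$$\begin{aligned}
P(B_\tau^* > \beta\lambda,\ \tau^{1/2} \le \delta\lambda)
&\le P\big(|B(\tau \wedge S_2 \wedge T) - B(\tau \wedge S_1 \wedge T)| \ge (\beta - 1)\lambda\big) \\
&\le (\beta - 1)^{-2}\lambda^{-2} E\{(B(\tau \wedge S_2 \wedge T) - B(\tau \wedge S_1 \wedge T))^2\}
\end{aligned}$$
where the second inequality is due to Chebyshev. Now if $R_1 \le R_2$ are bounded
stopping times then
$$E\{B(R_1)B(R_2)\} = E\{B(R_1)E(B(R_2) \mid \mathcal{F}_{R_1})\} = E\,B(R_1)^2$$
So we have
$$\begin{aligned}
E(B(R_2) - B(R_1))^2 &= EB(R_2)^2 - 2EB(R_1)B(R_2) + EB(R_1)^2 \\
&= EB(R_2)^2 - EB(R_1)^2 = E(R_2 - R_1)
\end{aligned}$$
since $B_t^2 - t$ is a martingale. Resuming our first computation we find
$$\begin{aligned}
(\beta - 1)^{-2}\lambda^{-2} E\{(B(\tau \wedge S_2 \wedge T) - B(\tau \wedge S_1 \wedge T))^2\}
&= (\beta - 1)^{-2}\lambda^{-2} E\{(\tau \wedge S_2 \wedge T) - (\tau \wedge S_1 \wedge T)\} \\
&\le (\beta - 1)^{-2}\lambda^{-2}(\delta\lambda)^2 P(S_1 < \infty) \\
&= (\beta - 1)^{-2}\delta^2 P(B_\tau^* > \lambda)
\end{aligned}$$
since $\tau \wedge T \le (\delta\lambda)^2$, proving (a).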
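To prove (b) we interchange the roles of $B(t \wedge \tau)$ and $(t \wedge \tau)^{1/2}$ in the first
set of definitions and let
$$\begin{aligned}
S_1 &= \inf\{t : (\tau \wedge t)^{1/2} > \lambda\} \\
S_2 &= \inf\{t : (\tau \wedge t)^{1/2} > \beta\lambda\} \\
T &= \inf\{t : |B(\tau \wedge t)| > \delta\lambda\}
\end{aligned}$$
Reversing the roles of $B_s$ and $s^{1/2}$ in the proof, it is easy to check that
$$\begin{aligned}
P(\tau^{1/2} > \beta\lambda,\ B_\tau^* \le \delta\lambda)
&\le P\big((\tau \wedge S_2 \wedge T) - (\tau \wedge S_1 \wedge T) \ge (\beta^2 - 1)\lambda^2\big) \\
&\le (\beta^2 - 1)^{-1}\lambda^{-2} E\{(\tau \wedge S_2 \wedge T) - (\tau \wedge S_1 \wedge T)\}
\end{aligned}$$
and using the stopping time result mentioned in the proof of (a) it follows that
the above is
$$\begin{aligned}
&= (\beta^2 - 1)^{-1}\lambda^{-2} E\{B(\tau \wedge S_2 \wedge T)^2 - B(\tau \wedge S_1 \wedge T)^2\} \\
&\le (\beta^2 - 1)^{-1}\lambda^{-2}(\delta\lambda)^2 P(S_1 < \infty) \\
&\le (\beta^2 - 1)^{-1}\delta^2 P(\tau^{1/2} > \lambda)
\end{aligned}$$
since $|B(\tau \wedge T)| \le \delta\lambda$, proving (b). □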
It will take one more lemma to extract (5.2) from (5.3). First, we need
a definition. A function $\varphi$ is said to be moderately increasing if $\varphi$ is a
nondecreasing function with $\varphi(0) = 0$ and if there is a constant $K$ so that
$\varphi(2\lambda) \le K\varphi(\lambda)$. It is easy to see that $\varphi(x) = x^p$ is moderately increasing
($K = 2^p$) but $\varphi(x) = e^{ax} - 1$ is not for any $a > 0$. To complete the proof of
(5.2) now it suffices to show (take $\beta = 2$ in (5.3) and note $1/3 < 1$)
(5.4) Lemma. If $X, Y \ge 0$ satisfy $P(X > 2\lambda,\ Y \le \delta\lambda) \le \delta^2 P(X > \lambda)$ for all
$\delta > 0$ and $\varphi$ is a moderately increasing function, then there is a constant $C$
that only depends on the growth rate $K$ so that
$$E\varphi(X) \le C\,E\varphi(Y)$$
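The definition of a moderately increasing function can be explored numerically. The following is only a small sketch: the helper name `doubling_ratios` and the sample values of $p$, $a$, and $x$ are our own choices for illustration. It computes the ratios $\varphi(2x)/\varphi(x)$, which stay equal to $2^p$ for $\varphi(x) = x^p$ but grow without bound for $\varphi(x) = e^{ax} - 1$.

```python
import math

def doubling_ratios(phi, xs):
    """Ratios phi(2x)/phi(x); a uniform bound K on these ratios
    is the growth condition in the definition."""
    return [phi(2 * x) / phi(x) for x in xs]

xs = [1.0, 10.0, 100.0]
p, a = 3.0, 0.5  # illustrative values, not taken from the text

power = doubling_ratios(lambda x: x ** p, xs)
expo = doubling_ratios(lambda x: math.expm1(a * x), xs)

print(power)  # every ratio equals 2**p
print(expo)   # ratios increase rapidly with x
```

For the power function one may take $K = 2^p$; for the exponential no finite $K$ works, since the ratio behaves like $e^{ax}$ for large $x$.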
Proof. It is enough to prove the result for bounded $\varphi$, for if the result holds
for $\varphi \wedge n$ for all $n \ge 1$, it also holds for $\varphi$. Now $\varphi$ is the distribution function
of a measure on $[0, \infty)$ that has
$$\varphi(h) = \int_0^h d\varphi(\lambda) = \int_0^\infty 1_{(h > \lambda)} \, d\varphi(\lambda)$$
Replacing $h$ by a nonnegative random variable $Z$, taking expectations and using
Fubini's theorem gives
$$(5.5)\qquad E\varphi(Z) = \int_0^\infty P(Z > \lambda) \, d\varphi(\lambda)$$
From our assumption it follows that
$$\begin{aligned}
P(X > 2\lambda) &= P(X > 2\lambda,\ Y \le \delta\lambda) + P(X > 2\lambda,\ Y > \delta\lambda) \\
&\le \delta^2 P(X > \lambda) + P(Y > \delta\lambda)
\end{aligned}$$
Integrating $d\varphi(\lambda)$ and using (5.5) with $Z = X/2$, $X$, $Y/\delta$
$$E\varphi(X/2) \le \delta^2 E\varphi(X) + E\varphi(Y/\delta)$$
Pick $\delta$ so that $K\delta^2 < 1$ and then pick $N \ge 0$ so that $2^N \ge \delta^{-1}$. From the
growth condition and the monotonicity of $\varphi$, it follows that
$$E\varphi(Y/\delta) \le K^N E\varphi(Y)$$
Combining this with the previous inequality and using the growth condition
again gives
$$E\varphi(X) \le K E\varphi(X/2) \le K\delta^2 E\varphi(X) + K^{N+1} E\varphi(Y)$$
Solving for $E\varphi(X)$ now gives
$$E\varphi(X) \le \frac{K^{N+1}}{1 - K\delta^2}\, E\varphi(Y) \qquad □$$
381 Revisited. To compare with (3.4), suppose $p = 4$. In this case $K = 2^4 =
16$. Taking $\delta = 1/8$ to make $1 - K\delta^2 = 3/4$ and $N = 3$, we get a constant which
is
$$16^4 \cdot 4/3 = 87{,}381.333\ldots$$
Of course we really don't care about the value, just that positive finite constants
$0 < c, C < \infty$ exist in (5.1).
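The arithmetic can be confirmed with exact rational numbers. This is a minimal sketch; the function name `bdg_constant` is hypothetical and simply evaluates $K^{N+1}/(1 - K\delta^2)$ with $K = 2^p$, after checking the two requirements $K\delta^2 < 1$ and $2^N \ge \delta^{-1}$ from the proof of (5.4).

```python
from fractions import Fraction

def bdg_constant(p, delta, N):
    """Evaluate K^(N+1) / (1 - K*delta^2) with K = 2^p, exactly."""
    K = Fraction(2) ** p
    delta = Fraction(delta)
    # Conditions used in the proof of (5.4).
    if not K * delta ** 2 < 1:
        raise ValueError("need K * delta^2 < 1")
    if not 2 ** N >= 1 / delta:
        raise ValueError("need 2^N >= 1/delta")
    return K ** (N + 1) / (1 - K * delta ** 2)

C = bdg_constant(p=4, delta=Fraction(1, 8), N=3)
print(C, float(C))
```

Using `Fraction` avoids rounding, so the result is the exact value $16^4 \cdot 4/3 = 262144/3$.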
3.6. Martingales Adapted to Brownian Filtrations
Let $\{\mathcal{B}_t, t \ge 0\}$ be the filtration generated by a $d$-dimensional Brownian motion
$B_t$ with $B_0 = 0$, defined on some probability space $(\Omega, \mathcal{F}, P)$. In this section
we will show (i) all local martingales adapted to $\{\mathcal{B}_t, t \ge 0\}$ are continuous and
(ii) every random variable $X \in L^2(\Omega, \mathcal{B}_\infty, P)$ can be written as a stochastic
integral.
(6.1) Theorem. All local martingales adapted to $\{\mathcal{B}_t, t \ge 0\}$ are continuous.
Proof. Let $X_t$ be a local martingale adapted to $\{\mathcal{B}_t, t \ge 0\}$ and let $T_n \le n$
be a sequence of stopping times that reduces $X$. It suffices to show that for
each $n$, $X(t \wedge T_n)$ is continuous, or in other words, it is enough to show that
the result holds for martingales of the form $Y_t = E(Y \mid \mathcal{B}_t)$ where $Y \in \mathcal{B}_n$. We
build up to this level of generality in three steps.
Step 1. Let $Y = f(B_n)$ where $f$ is a bounded continuous function. If $t \ge n$,
then $Y_t = f(B_n)$, so $t \to Y_t$ is trivially continuous for $t > n$. If $t < n$, the
Markov property implies $Y_t = E(Y \mid \mathcal{B}_t) = h(n - t, B_t)$ where
$$h(s, x) = \int \frac{1}{\sqrt{2\pi s}}\, e^{-(y-x)^2/2s} f(y) \, dy$$
It is easy to see that $h(s, x)$ is a continuous function on $(0, \infty) \times \mathbf{R}$, so $Y_t$ is
continuous for $t < n$. To check continuity at $t = n$, observe that changing
variables $y = x + z\sqrt{s}$, $dy = s^{1/2} dz$ gives
$$h(s, x) = \int \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} f(x + z\sqrt{s}) \, dz$$
so the dominated convergence theorem implies that as $t \uparrow n$, $h(n - t, B_t) \to
f(B_n)$.
Step 2. Let $Y = f_1(B_{t_1}) f_2(B_{t_2})$ where $t_1 < t_2 \le n$ and $f_1, f_2$ are bounded and
continuous. If $t \ge t_1$, then
$$Y_t = f_1(B_{t_1}) E(f_2(B_{t_2}) \mid \mathcal{B}_t)$$
so the argument from step 1 implies that $Y_t$ is continuous on $[t_1, \infty)$. On the
other hand, if $t \le t_1$, then $Y_t = E(Y_{t_1} \mid \mathcal{B}_t)$ and
$$Y_{t_1} = f_1(B_{t_1}) E(f_2(B_{t_2}) \mid \mathcal{B}_{t_1}) = f_1(B_{t_1}) g(B_{t_1})$$
where
$$g(x) = E_x f_2(B_{t_2 - t_1})$$
is a bounded continuous function, so
$$Y_t = E(f_1(B_{t_1}) g(B_{t_1}) \mid \mathcal{B}_t) \quad\text{for } t \le t_1$$
and it follows from step 1 that $Y_t$ is continuous on $[0, t_1]$. Repeating the
argument above and using induction, it follows that the result holds if $Y =
f_1(B_{t_1}) \cdots f_k(B_{t_k})$ where $t_1 < t_2 < \cdots < t_k \le n$ and $f_1, \ldots, f_k$ are bounded
continuous functions.
Step 3. Let $Y \in \mathcal{B}_n$ with $E|Y| < \infty$. It follows from a standard application of
the monotone class theorem that for any $\epsilon > 0$, there is a random variable $X^\epsilon$
of the form considered in step 2 that has $E|X^\epsilon - Y| < \epsilon$. Now
$$|E(X^\epsilon \mid \mathcal{B}_t) - E(Y \mid \mathcal{B}_t)| \le E(|X^\epsilon - Y| \mid \mathcal{B}_t)$$
and if we let $Z_t = E(|X^\epsilon - Y| \mid \mathcal{B}_t)$, it follows from Doob's inequality (see e.g.,
(4.2) in Chapter 4 of Durrett (1995)) that
$$\lambda P\Big(\sup_{t \le n} Z_t > \lambda\Big) \le EZ_n = E|X^\epsilon - Y| < \epsilon$$
Now $X^\epsilon(t) = E(X^\epsilon \mid \mathcal{B}_t)$ is continuous, so letting $\epsilon \to 0$ we see that for a.e.
$\omega$, $Y_t(\omega)$ is a uniform limit of continuous functions, so $Y_t$ is continuous. □
Remark. I would like to thank Michael Sharpe for telling me about the proof
given above.
We now turn to our second goal. Let $B_t = (B_t^1, \ldots, B_t^d)$ with $B_0 = 0$ and
let $\{\mathcal{B}_t^i, t \ge 0\}$ be the filtrations generated by the coordinates.
(6.2) Theorem. For any $X \in L^2(\Omega, \mathcal{B}_\infty, P)$ there are unique $H^i \in \Pi_2(B^i)$
with
$$X = EX + \sum_{i=1}^d \int_0^\infty H_s^i \, dB_s^i$$
Proof. We follow Section V.3 of Revuz and Yor (1991). The uniqueness is
immediate since the isometry property, Exercise 6.2 of Chapter 2, implies there
is only one way to write the zero random variable, i.e., using $H^i(s, \omega) \equiv 0$.
To prove the existence of $H^i$, the first step is to reduce to one dimension by
proving:
(6.3) Lemma. Let $X \in L^2(\Omega, \mathcal{B}_\infty, P)$ with $EX = 0$, and let $X^i = E(X \mid \mathcal{B}_\infty^i)$.
Then $X = X^1 + \cdots + X^d$.
Proof. Geometrically, see (1.4) in Chapter 4 of Durrett (1995), $X^i$ is the
projection of $X$ onto $L_i^2$, the subspace of mean zero random variables measurable
with respect to $\mathcal{B}_\infty^i$. If $Y \in L_i^2$ and $Z \in L_j^2$ with $i \ne j$ then $Y$ and $Z$ are independent so
$EYZ = EY \cdot EZ = 0$. The last result shows that the spaces $L_i^2$ and $L_j^2$ are
orthogonal. Since $L_1^2, \ldots, L_d^2$ together span all the elements in $L^2$ with mean 0,
(6.3) follows. □
Proof in one dimension. Let $\mathcal{I}$ be the set of integrands which can be written
as $\sum_{j=1}^n \lambda_j 1_{(s_{j-1}, s_j]}$ where the $\lambda_j$ and $s_j$ are (nonrandom!) real numbers and
$0 = s_0 < s_1 < \cdots < s_n$. For any integrand $H \in \mathcal{I} \subset \Pi_2(B)$, we can define the
stochastic integral $Y_t = \int_0^t H_s \, dB_s$, and use the isometry property, Exercise 6.2
of Chapter 2, to conclude $Y_t \in \mathcal{M}^2$. Using (3.7) and the remark after its proof,
we can further define the martingale exponential of the integral, $Z_t = \mathcal{E}(Y)_t$,
and conclude that $Z_t \in \mathcal{M}^2$. Ito's formula, see (3.6), implies that $Z_t - Z_0 =
\int_0^t Z_s \, dY_s$. Recalling $Y_t = (H \cdot B)_t$ and using the associative law, (9.6) in
Chapter 2, we have
$$Z_t - Z_0 = \int_0^t Z_s H_s \, dB_s$$
$Y_0 = 0$ and $\langle Y \rangle_0 = 0$, so $Z_0 = 1$. Since $Z_t \in \mathcal{M}^2$ we have $EZ_t = 1 = Z_0$ and it
follows that
$$Z_t = EZ_t + \int_0^t Z_s H_s \, dB_s$$
i.e., the desired representation holds for each of the random variables in the set
$\mathcal{J} = \{\mathcal{E}(H \cdot B)_t : H \in \mathcal{I}, t \ge 0\}$.
To complete the proof of (6.2) now, it suffices to show
(6.4) Lemma. If $W \in L^2$ has $E(ZW) = 0$ for all $Z \in \mathcal{J}$ then $W \equiv 0$.
(6.5) Lemma. Let $\mathcal{G}$ be the collection of $L^2$ random variables $X$ that can be
written as $EX + \int_0^\infty H_s \, dB_s$. Then $\mathcal{G}$ is a closed subspace of $L^2$.
Proof of (6.4). We will show that the measure $W \, dP$ (i.e., the measure $\mu$
with $d\mu/dP = W$) is the zero measure. To do this, it suffices to show that
$W \, dP$ is the zero measure on the $\sigma$-field $\sigma(B_{t_1}, \ldots, B_{t_n})$ for any finite sequence
$0 = t_0 < t_1 < t_2 < \cdots < t_n$. Let $\lambda_j$, $1 \le j \le n$ be real numbers and $z$ be a
complex number. The function
$$\varphi(z) = E\Big\{ W \exp\Big( z \sum_{j=1}^n \lambda_j (B_{t_j} - B_{t_{j-1}}) \Big) \Big\}$$
is easily seen to be analytic in $\mathbf{C}$, i.e., $\varphi(z)$ is represented by an absolutely
convergent power series. By the assumed orthogonality, we have $\varphi(x) = 0$ for
all real $x$, so complex variable theory tells us that $\varphi$ must vanish identically. In
particular $\varphi(i) = 0$, where $i = \sqrt{-1}$. The last equality implies that the image
of $W \, dP$ under the map
$$\omega \to (B_{t_1}(\omega) - B_{t_0}(\omega), \ldots, B_{t_n}(\omega) - B_{t_{n-1}}(\omega))$$
is the zero measure since its Fourier transform vanishes. This shows that $W \, dP$
vanishes on $\sigma(B_{t_1} - B_{t_0}, \ldots, B_{t_n} - B_{t_{n-1}}) = \sigma(B_{t_1}, \ldots, B_{t_n})$ and the proof of
(6.4) is complete. □
Proof of (6.5). It is clear that $\mathcal{G}$ is a subspace. Let $X_n \in \mathcal{G}$ with
$$(6.6)\qquad X_n - EX_n = \int_0^\infty H_s^n \, dB_s$$
and suppose that $X_n \to X$ in $L^2$. Using an elementary fact about the variance,
then (6.6), and the isometry property of the stochastic integral, Exercise 6.2 in
Chapter 2, we have
$$\begin{aligned}
E(X_n - X_m)^2 - (EX_n - EX_m)^2 &= E\{(X_n - X_m) - (EX_n - EX_m)\}^2 \\
&= E\int_0^\infty (H_s^n - H_s^m)^2 \, ds
\end{aligned}$$
Since $X_n \to X$ in $L^2$ implies $E(X_n - X_m)^2 \to 0$ and $EX_n \to EX$, it follows
that $\|H^m - H^n\|_B \to 0$.
The completeness of $\Pi_2(B)$, see the remark at the beginning of Step 3 in
Section 2.4, implies that there is a predictable $H$ so that $\|H^n - H\|_B \to 0$.
Another use of the isometry property implies that $H^n \cdot B$ converges to $H \cdot B$
in $\mathcal{M}^2$. Taking limits in (6.6) gives the representation for $X$. This completes
the proof of (6.5) and thus of (6.2). □
4 Partial Differential Equations
A. Parabolic Equations
In the first third of this chapter, we will show how Brownian motion can be
used to construct (classical) solutions of the following equations:
$$\begin{aligned}
u_t &= \tfrac{1}{2}\Delta u \\
u_t &= \tfrac{1}{2}\Delta u + g \\
u_t &= \tfrac{1}{2}\Delta u + cu
\end{aligned}$$
in $(0, \infty) \times \mathbf{R}^d$ subject to the boundary condition: $u$ is continuous at each point
of $\{0\} \times \mathbf{R}^d$ and $u(0, x) = f(x)$ for $x \in \mathbf{R}^d$. Here,
$$\Delta u = \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_d^2}$$
and by a classical solution, we mean one that has enough derivatives for the
equation to make sense. That is, $u \in C^{1,2}$, the functions that have one
continuous derivative with respect to $t$ and two with respect to each of the $x_i$. The
continuity in $u$ in the boundary condition is needed to establish a connection
between the equation which holds in $(0, \infty) \times \mathbf{R}^d$ and $u(0, x) = f(x)$ which
holds on $\{0\} \times \mathbf{R}^d$. Note that the boundary condition cannot possibly hold
unless $f : \mathbf{R}^d \to \mathbf{R}$ is continuous.
We will see that the solutions to these equations are (under suitable
assumptions) given by
$$\begin{aligned}
&E_x f(B_t) \\
&E_x\Big( f(B_t) + \int_0^t g(t - s, B_s) \, ds \Big) \\
&E_x\Big( f(B_t) \exp\Big( \int_0^t c(t - s, B_s) \, ds \Big) \Big)
\end{aligned}$$
In words, the solutions may be described as follows:
(i) To solve the heat equation, run a Brownian motion and let $u(t, x) = E_x f(B_t)$.
(ii) To introduce a term $g(x)$ add the integral of $g$ along the path.
(iii) To introduce $cu$, multiply $f(B_t)$ by $m_t = \exp(\int_0^t c(t - s, B_s) \, ds)$ before
taking expected values. Here, we think of the Brownian particle as having mass
1 at time 0 and changing mass according to $m_s' = c(t - s, B_s) m_s$, and when we
take expected values, we take the particle's mass into account. An alternative
interpretation when $c \le 0$ is that $\exp(\int_0^t c(t - s, B_s) \, ds)$ is the probability the
particle survives until time $t$, or $-c(r, x)$ gives the killing rate when the particle
is at $x$ at time $r$.
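Recipe (i) can be tried out numerically in $d = 1$. The sketch below is illustrative only: the helper names `gauss_expect` and `heat_solution`, the choice $f = \cos$, and the quadrature parameters are our own assumptions. It evaluates $E_x f(B_t)$ by a trapezoid rule against the standard normal density, using $B_t = x + \sqrt{t}\,N$, compares the result with the closed form $e^{-t/2}\cos x$, and estimates the residual $u_t - \frac{1}{2}u_{xx}$ by finite differences.

```python
import math

def gauss_expect(func, lo=-10.0, hi=10.0, n=4000):
    """Trapezoid approximation of E[func(N)] for a standard normal N."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        z = lo + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * func(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def heat_solution(f, t, x):
    """v(t, x) = E_x f(B_t) in d = 1, using B_t = x + sqrt(t) * N."""
    return gauss_expect(lambda z: f(x + math.sqrt(t) * z))

t, x = 1.0, 0.3
v = heat_solution(math.cos, t, x)
exact = math.exp(-t / 2.0) * math.cos(x)  # closed form when f = cos

# Finite-difference estimate of u_t - (1/2) u_xx at (t, x).
e = 1e-3
v_t = (heat_solution(math.cos, t + e, x)
       - heat_solution(math.cos, t - e, x)) / (2 * e)
v_xx = (heat_solution(math.cos, t, x + e) - 2 * v
        + heat_solution(math.cos, t, x - e)) / e ** 2
residual = v_t - 0.5 * v_xx
print(v, exact, residual)
```

Deterministic quadrature is used instead of simulation so that the comparison with the closed form is not blurred by sampling error.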
In the first three sections of this chapter, we will say more about why the
expressions we have written above solve the indicated equations. In order to
bring out the similarities and differences between these equations and their
elliptic counterparts discussed in Sections 4.4 to 4.6, we have adopted a rather
robotic style. Formulas (m.2) through (m.6) and their proofs have been
developed in parallel in the first six sections, and at the end of most sections we
discuss what happens when something becomes unbounded.
4.1. The Heat Equation
In this section, we will consider the following equation:
(1.1a) $u \in C^{1,2}$ and $u_t = \frac{1}{2}\Delta u$ in $(0, \infty) \times \mathbf{R}^d$.
(1.1b) $u$ is continuous at each point of $\{0\} \times \mathbf{R}^d$ and $u(0, x) = f(x)$.
This equation derives its name, the heat equation, from the fact that if the
units of measurement are chosen suitably then the solution $u(t, x)$ gives the
temperature at the point $x \in \mathbf{R}^d$ at time $t$ when the temperature profile at
time 0 is given by $f(x)$.
The first step in solving (1.1), as it will be six times below, is to find a
local martingale.
(1.2) Theorem. If $u$ satisfies (1.1a), then $M_s = u(t - s, B_s)$ is a local martingale
on $[0, t)$.
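Proof. Applying Ito's formula, (10.2) in Chapter 2, to $u(x_0, \ldots, x_d)$ with
$X_s^0 = t - s$ and $X_s^i = B_s^i$ for $1 \le i \le d$ gives
$$\begin{aligned}
u(t - s, B_s) - u(t, B_0) &= \int_0^s -u_t(t - r, B_r) \, dr \\
&\quad + \int_0^s \nabla u(t - r, B_r) \cdot dB_r \\
&\quad + \frac{1}{2} \int_0^s \Delta u(t - r, B_r) \, dr
\end{aligned}$$
To check this note that $dX_r^0 = -dr$ and $X^0$ has bounded variation, while the
$X^i$ with $1 \le i \le d$ are independent Brownian motions, so
$$\langle X^i, X^j \rangle_r = \begin{cases} r & \text{if } 1 \le i = j \le d \\ 0 & \text{otherwise} \end{cases}$$
(1.2) follows easily from the Ito's formula equation since $-u_t + \frac{1}{2}\Delta u = 0$ and
the second term on the right-hand side is a local martingale. □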
Our next step is to prove a uniqueness theorem.
(1.3) Theorem. If there is a solution of (1.1) that is bounded then it must be
$$v(t, x) \equiv E_x f(B_t)$$
Here $\equiv$ means that the last equation defines $v$. We will always use $u$ for a
generic solution of the equation and $v$ for our special solution.
Proof. If we now assume that $u$ is bounded, then $M_s$, $0 \le s < t$, is a bounded
martingale. The martingale convergence theorem implies that
$$M_t \equiv \lim_{s \uparrow t} M_s \quad\text{exists a.s.}$$
If $u$ satisfies (1.1b), this limit must be $f(B_t)$. Since $M_s$ is uniformly integrable,
it follows that
$$u(t, x) = E_x M_0 = E_x M_t = v(t, x) \qquad □$$
Now that (1.3) has told us what the solution must be, the next logical step
is to find conditions under which v is a solution. It is (and always will be) easy
to show that if v is smooth enough then it is a classical solution.
(1.4) Theorem. Suppose $f$ is bounded. If $v \in C^{1,2}$ then it satisfies (1.1a).
Proof. The Markov property implies, see Exercise 2.1 in Chapter 1, that
$$E_x(f(B_t) \mid \mathcal{F}_s) = E_{B_s}(f(B_{t-s})) = v(t - s, B_s)$$
The left-hand side is a martingale, so the right-hand side is also. If $v \in C^{1,2}$,
then repeating the calculation in the proof of (1.2) shows that
$$v(t - s, B_s) - v(t, B_0) = \int_0^s \big({-v_t} + \tfrac{1}{2}\Delta v\big)(t - r, B_r) \, dr + \text{a local martingale}$$
The left-hand side is a local martingale, so the integral on the right-hand side
is also. However, the integral is continuous and locally of bounded variation,
so by (3.3) in Chapter 2 it must be $\equiv 0$ almost surely. Since $v_t$ and $\Delta v$ are
continuous, it follows that $-v_t + \frac{1}{2}\Delta v \equiv 0$. For if it were $\ne 0$ at some point
$(t, x)$, then it would be $\ne 0$ on an open neighborhood of that point, and, hence,
with positive probability the integral would not be $\equiv 0$, a contradiction. □
It is easy to give conditions that imply that v satisfies (1.1b). In order to
keep the exposition simple, we first consider the situation when / is bounded.
(1.5) Theorem. If $f$ is bounded and continuous, then $v$ satisfies (1.1b).
Proof. $(B_t - B_0) \stackrel{d}{=} t^{1/2} N$, where $N$ has a normal distribution with mean 0
and variance 1, so if $t_n \to 0$ and $x_n \to x$, the bounded convergence theorem
implies that
$$v(t_n, x_n) = Ef(x_n + t_n^{1/2} N) \to f(x) \qquad □$$
The final step in showing that v is a solution is to find conditions that
guarantee that it is smooth. In this case, the computations are not very difficult.
(1.6) Theorem. If $f$ is bounded, then $v \in C^{1,2}$ and hence satisfies (1.1a).
Proof. By definition,
$$v(t, x) = E_x f(B_t) = \int p_t(x, y) f(y) \, dy$$
where $p_t(x, y) = (2\pi t)^{-d/2} e^{-|x - y|^2/2t}$. Writing $D_i = \partial/\partial x_i$ and $D_t = \partial/\partial t$, a
little calculus gives
$$\begin{aligned}
D_i p_t(x, y) &= -\frac{x_i - y_i}{t}\, p_t(x, y) \\
D_{ii} p_t(x, y) &= \frac{(x_i - y_i)^2 - t}{t^2}\, p_t(x, y) \\
D_{ij} p_t(x, y) &= \frac{(x_i - y_i)(x_j - y_j)}{t^2}\, p_t(x, y) \qquad i \ne j \\
D_t p_t(x, y) &= \Big( {-\frac{d}{2t}} + \frac{|x - y|^2}{2t^2} \Big) p_t(x, y)
\end{aligned}$$
If $f$ is bounded, then it is easy to see that for $\alpha = i$, $ij$, or $t$
$$\int |D_\alpha p_t(x, y) f(y)| \, dy < \infty$$
and is continuous in $\mathbf{R}^d$, so (1.6) follows from the next result on differentiating
under the integral sign. This result is lengthy to state but short to prove since
we assume everything we need for the proof to work. Nonetheless we will see
that this result is useful.
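The derivative formulas for $p_t$ quoted in the proof of (1.6) are easy to check numerically. The following sketch is our own illustration (the function names and the sample point are hypothetical): it compares the closed form for $D_t p_t$ with a central difference in $t$, and confirms that the kernel itself satisfies $D_t p_t = \frac{1}{2}\sum_i D_{ii} p_t$.

```python
import math

def p(t, x, y):
    """Brownian transition density p_t(x, y) in dimension d = len(x)."""
    d = len(x)
    r2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return (2.0 * math.pi * t) ** (-d / 2.0) * math.exp(-r2 / (2.0 * t))

def D_t(t, x, y):
    """Closed form for the t-derivative of p_t(x, y)."""
    d = len(x)
    r2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return (-d / (2.0 * t) + r2 / (2.0 * t * t)) * p(t, x, y)

def D_ii(t, x, y, i):
    """Closed form for the second derivative of p_t in x_i."""
    return ((x[i] - y[i]) ** 2 - t) / (t * t) * p(t, x, y)

# Arbitrary sample point in d = 2.
t, x, y = 0.7, (0.2, -0.4), (1.0, 0.5)
e = 1e-4

# Central difference in t, to compare with the closed form.
fd_t = (p(t + e, x, y) - p(t - e, x, y)) / (2 * e)
# Heat equation for the kernel: D_t p = (1/2) * sum_i D_ii p.
half_laplacian = 0.5 * sum(D_ii(t, x, y, i) for i in range(len(x)))
print(fd_t, D_t(t, x, y), half_laplacian)
```

The second comparison is the reason $v$ inherits the heat equation from $p_t$ once differentiation under the integral sign is justified.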
(1.7) Lemma. Let $(S, \mathcal{S}, m)$ be a $\sigma$-finite measure space, and $g : S \to \mathbf{R}$ be
measurable. Suppose that for $x \in G$ an open subset of $\mathbf{R}^d$ and some $h_0 > 0$ we
have:
(a) $u(x) = \int_S K(x, y) g(y) \, m(dy)$
where $K$ and $\partial K/\partial x_i : G \times S \to \mathbf{R}$ are measurable functions with
(b) $K(x + he_i, y) - K(x, y) = \int_0^h \frac{\partial K}{\partial x_i}(x + \theta e_i, y) \, d\theta$ for $|h| \le h_0$ and $y \in S$
(c) $u_i(x) = \int_S \frac{\partial K}{\partial x_i}(x, y) g(y) \, m(dy)$ is continuous at $x$
and (d) $\int_S \int_{-h_0}^{h_0} \big| \frac{\partial K}{\partial x_i}(x + \theta e_i, y) g(y) \big| \, d\theta \, m(dy) < \infty$.
Then $\partial u/\partial x_i$ exists at $x$ and equals $u_i(x)$.
Proof. Using the definition of $u$ in (a), then (b) and Fubini's theorem, which
is justified for $|h| \le h_0$ by (d), we have
$$\begin{aligned}
u(x + he_i) - u(x) &= \int_S (K(x + he_i, y) - K(x, y)) g(y) \, m(dy) \\
&= \int_0^h \int_S \frac{\partial K}{\partial x_i}(x + \theta e_i, y) g(y) \, m(dy) \, d\theta
\end{aligned}$$
Dividing by $h$ and letting $h \to 0$ the desired result follows from (c). □
Later we will need a result about differentiating sums. Taking $S = \mathbf{Z}$ with
$\mathcal{S}$ = all subsets of $S$, and $m$ = counting measure in (1.7), then setting $g \equiv 1$
and $f_n(x) = K(x, n)$ gives the following. Note that in (a) and (c) of (1.7) it is
implicit that the integrals exist, so here in (a) and (c) we assume that the sums
converge absolutely.
(1.8) Lemma. Suppose that for $x \in G$ an open subset of $\mathbf{R}^d$ and some $h_0 > 0$
we have:
(a) $u(x) = \sum_n f_n(x)$
where $f_n$ and $\partial f_n/\partial x_i : G \to \mathbf{R}$, $n \in \mathbf{Z}$ are measurable functions with
(b) $f_n(x + he_i) - f_n(x) = \int_0^h \frac{\partial f_n}{\partial x_i}(x + \theta e_i) \, d\theta$ for $|h| \le h_0$ and $n \in \mathbf{Z}$
(c) $u_i(x) = \sum_n \frac{\partial f_n}{\partial x_i}(x)$ is continuous at $x$
and (d) $\sum_n \int_{-h_0}^{h_0} \big| \frac{\partial f_n}{\partial x_i}(x + \theta e_i) \big| \, d\theta < \infty$.
Then $\partial u/\partial x_i$ exists at $x$ and equals $u_i(x)$.
Unbounded f. For some applications, the assumption that $f$ is bounded
is too restrictive. To see what type of unbounded $f$ we can allow, we observe
that, at the bare minimum, we need $E_x|f(B_t)| < \infty$ for all $t$. Since
$$E_x|f(B_t)| = \int (2\pi t)^{-d/2} e^{-|x - y|^2/2t} |f(y)| \, dy$$
a condition that guarantees this for locally bounded $f$ is
$$(*)\qquad |x|^{-2} \log^+ |f(x)| \to 0 \quad\text{as } x \to \infty$$
Replacing the bounded convergence theorem in (1.5) and (1.6) by the dominated
convergence theorem, it is not hard to show:
(1.9) Theorem. If $f$ is continuous and satisfies (*), then $v$ satisfies (1.1).
4.2. The Inhomogeneous Equation
In this section, we will consider what happens when we add a function g(t, x)
to the equation we considered in the last section. That is, we will study
(2.1a) $u \in C^{1,2}$ and $u_t = \frac{1}{2}\Delta u + g$ in $(0, \infty) \times \mathbf{R}^d$
(2.1b) $u$ is continuous at each point of $\{0\} \times \mathbf{R}^d$ and $u(0, x) = f(x)$.
We observed in Section 4.1 that (2.1b) cannot hold unless $f$ is continuous. Here
$g = u_t - \frac{1}{2}\Delta u$ so the equation in (2.1a) cannot hold with $u \in C^{1,2}$ unless $g(t, x)$
is continuous.
Our first step in treating the new equation is to observe that if $u_1$ is a
solution of the equation with $f = f_0$ and $g \equiv 0$ which we studied in the last
section, and $u_2$ is a solution of the equation with $f \equiv 0$ and $g = g_0$ then $u_1 + u_2$
is a solution of the equation with $f = f_0$ and $g = g_0$ so we can restrict our
attention to the case $f \equiv 0$.
Having made this simplification, we will now study the equation above by
following the procedure used in the last section. The first step is to find an
associated local martingale.
(2.2) Theorem. If $u$ satisfies (2.1a), then
$$M_s = u(t - s, B_s) + \int_0^s g(t - r, B_r) \, dr$$
is a local martingale on $[0, t)$.
Proof. Applying Ito's formula as in the proof of (1.2) gives
$$\begin{aligned}
u(t - s, B_s) - u(t, B_0) &= \int_0^s \big({-u_t} + \tfrac{1}{2}\Delta u\big)(t - r, B_r) \, dr \\
&\quad + \int_0^s \nabla u(t - r, B_r) \cdot dB_r
\end{aligned}$$
which proves (2.2), since $-u_t + \frac{1}{2}\Delta u = -g$ and the second term on the
right-hand side is a local martingale. □
Again the next step is a uniqueness result.
(2.3) Theorem. Suppose $g$ is bounded. If there is a solution of (2.1) that is
bounded on $[0, T] \times \mathbf{R}^d$ for any $T < \infty$, it must be
$$v(t, x) \equiv E_x\Big( \int_0^t g(t - s, B_s) \, ds \Big)$$
Proof. Under the assumptions on $g$ and $u$, $M_s$, $0 \le s < t$, defined in (2.2) is a
bounded martingale and $u(0, x) \equiv 0$ so
$$M_t \equiv \lim_{s \uparrow t} M_s = \int_0^t g(t - s, B_s) \, ds$$
Since $M_s$ is uniformly integrable,
$$u(t, x) = E_x M_0 = E_x M_t = v(t, x) \qquad □$$
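For a concrete instance of the formula in (2.3), take $d = 1$ and $g(t, x) = \cos x$. Since $E_x \cos(B_s) = e^{-s/2}\cos x$, the formula gives $v(t, x) = 2(1 - e^{-t/2})\cos x$, and one checks directly that $v_t = \frac{1}{2}v_{xx} + g$. The sketch below is only an illustration under these assumptions; the helper names and the quadrature sizes are our own. It evaluates the double integral numerically and compares it with the closed form.

```python
import math

def gauss_expect(func, lo=-8.0, hi=8.0, n=400):
    """Trapezoid approximation of E[func(N)] for a standard normal N."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        z = lo + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * func(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def volume_potential(g, t, x, n_s=200):
    """v(t, x) = E_x int_0^t g(t - s, B_s) ds in d = 1,
    with a trapezoid rule in s and B_s = x + sqrt(s) * N."""
    h = t / n_s
    total = 0.0
    for k in range(n_s + 1):
        s = k * h
        w = 0.5 if k in (0, n_s) else 1.0
        inner = gauss_expect(lambda z: g(t - s, x + math.sqrt(s) * z))
        total += w * inner
    return total * h

g = lambda r, x: math.cos(x)  # illustrative source term
t, x = 1.0, 0.3
v = volume_potential(g, t, x)
exact = 2.0 * (1.0 - math.exp(-t / 2.0)) * math.cos(x)
print(v, exact)
```

The remaining discrepancy comes from the trapezoid rule in $s$ and shrinks like the square of the step size.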
Again, it is easy to show that if $v$ is smooth enough it is a solution.
(2.4) Theorem. Suppose $g$ is bounded and continuous. If $v \in C^{1,2}$, then it
satisfies (2.1a) in $(0, \infty) \times \mathbf{R}^d$.
Proof. Using the Markov property, see Exercise 2.6 in Chapter 1, gives
$$\begin{aligned}
E_x\Big( \int_0^t g(t - r, B_r) \, dr \,\Big|\, \mathcal{F}_s \Big)
&= \int_0^s g(t - r, B_r) \, dr + E_{B_s}\Big( \int_0^{t-s} g(t - s - u, B_u) \, du \Big) \\
&= \int_0^s g(t - r, B_r) \, dr + v(t - s, B_s)
\end{aligned}$$
The left-hand side is a martingale, so the right-hand side is also. If $v \in C^{1,2}$,
then repeating the calculation in the proof of (2.2) shows that
$$\begin{aligned}
v(t - s, B_s) - v(t, B_0) + \int_0^s g(t - r, B_r) \, dr
&= \int_0^s \big({-v_t} + \tfrac{1}{2}\Delta v + g\big)(t - r, B_r) \, dr \\
&\quad + \text{a local martingale}
\end{aligned}$$
The left-hand side is a local martingale, so the integral on the right-hand side
is also. Since the integral is continuous and locally of bounded variation, (3.3)
in Chapter 2 implies it must be $\equiv 0$ almost surely. Again this implies that
$(-v_t + \frac{1}{2}\Delta v + g) \equiv 0$, for our assumptions imply that this quantity is continuous
and if it were $\ne 0$ at some point $(t, x)$ we would have a contradiction. □
The next step is to give a condition that guarantees that v satisfies (2.1b).
As in the last section, we will begin by considering what happens when
everything is bounded.
(2.5) Theorem. If $g$ is bounded, then $v$ satisfies (2.1b).
Proof. If $|g| \le M$, then as $t \to 0$
$$|v(t, x)| \le E_x \int_0^t |g(t - s, B_s)| \, ds \le Mt \to 0 \qquad □$$
The final step in showing that $v$ is a solution is to check that $v \in C^{1,2}$.
Since the calculations necessary to establish these properties are quite tedious,
we will content ourselves to state what the results are and do enough of the
proofs to indicate why they are true. If you get dazed and confused and bail
out before the end, the
Take home message is: it is not enough to assume $g$ is continuous to have
$v \in C^{1,2}$. We must assume $g$ to be Hölder continuous locally in $t$. That
is, for any $N < \infty$ there are constants $C, \alpha \in (0, \infty)$, which may depend on $N$,
such that $|g(t, x) - g(t, y)| \le C|x - y|^\alpha$ whenever $t \le N$.
The reason for this assumption can be found in the proof of (2.6c) and again
in (2.6d).
The first step in showing $v \in C^{1,2}$ is to assume $g$ is bounded and use
Fubini's theorem to conclude
$$v(t, x) = \int_0^t ds \int p_s(x, y) g(t - s, y) \, dy$$
where $p_s(x, y) = (2\pi s)^{-d/2} e^{-|y - x|^2/2s}$. The expression we have just written for
$v$ above is what Friedman (1964) would call a volume potential and would write
as
$$V(x, t) = \int_{T_0}^t \int_D Z(x, t; \xi, \tau) g(\xi, \tau) \, d\xi \, d\tau$$
To translate between notations, set $T_0 = 0$, $D = \mathbf{R}^d$,
$$Z(x, t; \xi, \tau) = p_{t - \tau}(x, \xi)$$
and change variables $s = t - \tau$, $y = \xi$. Because of their importance for the
parametrix method, the differentiability properties of volume potentials are
well known. The results we will state are just Theorems 2 to 5 in Chapter 1
of Friedman (1964), so the reader who is interested in knowing the whole story
can find the missing details there.
(2.6a) Theorem. If $g$ is a bounded measurable function, then $v(t, x)$ is
continuous on $(0, \infty) \times \mathbf{R}^d$.
Proof. Use the bounded convergence theorem. □
(2.6b) Theorem. There is a constant $C$ so that if $|g| \le M$ is measurable, then
the partial derivatives $D_i v = \partial v/\partial x_i$ have $|D_i v| \le CMt^{1/2}$, are continuous,
and are given by
$$D_i v = \int_0^t \int D_i p_s(x, y) g(t - s, y) \, dy \, ds$$
Proof. Using the formula for $D_i p_s$ from (1.6), the right-hand side is
$$\begin{aligned}
&-\int_0^t \int (2\pi s)^{-d/2}\, \frac{x_i - y_i}{s}\, e^{-|x - y|^2/2s} g(t - s, y) \, dy \, ds \\
&\qquad = -\int_0^t \frac{ds}{s}\, E_x\big[(x_i - B_s^i) g(t - s, B_s)\big]
\end{aligned}$$
Although the last formula looks suspicious because we are integrating $s^{-1}$ near
0, everything is really all right. If $|g| \le M$, then
$$E_x|(x_i - B_s^i) g(t - s, B_s)| \le M E_x|x_i - B_s^i| = CMs^{1/2}$$
so we have
$$\int_0^t \frac{ds}{s}\, E_x|(x_i - B_s^i) g(t - s, B_s)| \le 2CMt^{1/2} < \infty$$
Using our result on differentiating under the integral sign, (1.7), it follows that
the partial derivatives $D_i v$ exist, are continuous, and have the indicated form.
□
Things get even worse when we take second derivatives.
(2.6c) Theorem. Suppose that $g$ is bounded and Hölder continuous locally in
$t$. Then the partial derivatives $D_{ij} v = \partial^2 v/\partial x_i \partial x_j$ are continuous, and
$$D_{ij} v = \int_0^t \int D_{ij} p_s(x, y) g(t - s, y) \, dy \, ds$$
Proof. Suppose for simplicity that $i = j$. Consulting (1.6) for the formula for
$D_{ii} p_s$, the right-hand side is
$$\begin{aligned}
&\int_0^t \int (2\pi s)^{-d/2}\, \frac{(x_i - y_i)^2 - s}{s^2}\, e^{-|x - y|^2/2s} g(t - s, y) \, dy \, ds \\
&\qquad = \int_0^t E_x\Big[ \frac{(x_i - B_s^i)^2 - s}{s^2}\, g(t - s, B_s) \Big] \, ds
\end{aligned}$$
This time, however, $E_x|(x_i - B_s^i)^2 - s| = s E_0|(B_1^i)^2 - 1|$ so
$$\int_0^t ds\, E_x\Big| \frac{(x_i - B_s^i)^2 - s}{s^2} \Big| = \infty$$
We can overcome this problem if $g$ is Hölder continuous at $x$, because the fact
that $E_x(x_i - B_s^i)^2 = s$ allows us to write
$$E_x\Big[ \frac{(x_i - B_s^i)^2 - s}{s^2}\, g(t - s, B_s) \Big]
= E_x\Big[ \frac{(x_i - B_s^i)^2 - s}{s^2}\, \{g(t - s, B_s) - g(t - s, x)\} \Big]$$
Using the Hölder continuity of $g$ one can show the above is $\le Cs^{-1 + \alpha/2}$, so
its integral from $s = 0$ to $t$ converges absolutely, and with a little work (2.6c)
follows. (See Friedman (1964), pages 10-12, for more details.) □
The last detail now is:
(2.6d) Theorem. Let $g$ be as in (2.6c). Then $\partial v/\partial t$ exists, and
$$\frac{\partial v}{\partial t}(t, x) = g(t, x) + \int_0^t dr \int \frac{\partial}{\partial t} p_{t - r}(x, y) g(r, y) \, dy$$
Proof. To take the derivative w.r.t. $t$, we rewrite $v$ as
$$v(t, x) = \int_0^t \int p_{t - r}(x, y) g(r, y) \, dy \, dr$$
Differentiating the right-hand side w.r.t. $t$ gives two terms. Differentiating the
upper limit of the integral gives $g(t, x)$. Differentiating the integrand and using
a formula from (1.6) gives
$$\begin{aligned}
&\int_0^t \int \frac{-d/2}{(2\pi)^{d/2}(t - r)^{(d+2)/2}} \exp\Big( {-\frac{|x - y|^2}{2(t - r)}} \Big) g(r, y) \, dy \, dr \\
&\quad + \int_0^t \int (2\pi(t - r))^{-d/2}\, \frac{|x - y|^2}{2(t - r)^2} \exp\Big( {-\frac{|x - y|^2}{2(t - r)}} \Big) g(r, y) \, dy \, dr \\
&= \int_0^t \frac{-d}{2(t - r)}\, E_x g(r, B_{t - r}) \, dr + \int_0^t \frac{1}{2(t - r)^2}\, E_x\big( |x - B_{t - r}|^2 g(r, B_{t - r}) \big) \, dr
\end{aligned}$$
In the second integral, we can use the fact that $E_x(|x - B_{t - r}|^2) = C(t - r)$ to
cancel one of the $t - r$'s and make the second expression like the first, but even
if we do this,
$$\int_0^t \frac{dr}{t - r} = \infty$$
This is the difficulty that we experienced in the proof of (2.6c), and the remedy
is the same: We can save the day if $g$ is Hölder continuous locally in $t$. For
further details, see pages 12-13 of Friedman (1964). □
Unbounded g. To see what type of unbounded $g$'s can be allowed, we
will restrict our attention to the temporally homogeneous case $g(t, x) = h(x)$.
At the bare minimum, we need
$$E_x \int_0^t |h(B_s)| \, ds < \infty$$
and if we want (2.1b) to hold, we need to know that if $t_n \downarrow 0$ and $x_n \to x$, then
$$E_{x_n} \int_0^{t_n} h(B_s) \, ds \to 0$$
If we put absolute values inside the integral and strengthen the last result to
uniform convergence for $x \in \mathbf{R}^d$, then we get a definition that is essentially due
to Kato (1973). A function $h$ is said to be in $K_d$ if
$$\lim_{t \downarrow 0} \sup_x E_x\Big( \int_0^t |h(B_s)| \, ds \Big) = 0$$
By Fubini's theorem, we can write the above as
$$(*)\qquad \lim_{t \downarrow 0} \sup_x \int k_t(x, y) |h(y)| \, dy = 0$$
where
$$k_t(x, y) = \int_0^t (2\pi s)^{-d/2} e^{-|x - y|^2/2s} \, ds$$
By considering the asymptotic behavior of $k_t(x, y)$ as $t \to 0$ and $|x - y|^2/t \to c$,
we can cast this condition in a more analytical form as
$$(**)\qquad \lim_{\alpha \downarrow 0} \sup_x \int_{|x - y| \le \alpha} \varphi(|x - y|) |h(y)| \, dy = 0$$
where
$$\varphi(r) = \begin{cases} r^{-(d-2)} & d \ge 3 \\ -\log r & d = 2 \\ 1 & d = 1 \end{cases}$$
The equivalence of (*) and (**) is Theorem 4.5 of Aizenman and Simon
(1982) or Theorem 3.6 of Chung and Zhao (1995). Section 3.1 of the latter
source can be consulted for a wealth of information about these spaces. We will
content ourselves here with an
Example 2.1. Let $h(x) = k(|x|)$ where $k(r) = r^{-p}$ for $r \le 1$ and 0 for $r > 1$.
We will now show that
(2.7) Theorem. $h \in K_d$ if $p < 2$.
The special case $p = 1$ is the Coulomb potential which is important in physics.
Proof. If $\alpha < 1$ then changing to polar coordinates and noticing the integral
is largest when $x = 0$ we have
$$\sup_x \int_{|x - y| \le \alpha} \varphi(|x - y|) |h(y)| \, dy \le C_d \int_0^\alpha \varphi(r) k(r) r^{d-1} \, dr$$
If we replace $k(r)$ by $r^{-p}$ and $\varphi(r)$ by $r^{-(d-2)}$, which holds in $d \ne 2$, then the
above
$$= C_d \int_0^\alpha r^{1-p} \, dr \to 0$$
when $p < 2$. In $d = 2$ when $p < 2$ we get
$$= -C_d \int_0^\alpha r^{1-p} \log r \, dr \to 0 \qquad □$$
Remark. We will have more to say about these spaces at the end of the next
section. For the developments there, we will also need the space $K_d^{loc}$, which is
defined in the obvious way: $f \in K_d^{loc}$ if for every $R < \infty$, the spatially truncated
function $f 1_{(|x| \le R)} \in K_d$.
4.3. The Feynman-Kac Formula
In this section, we will consider what happens when we add cu to the right-hand
side of the heat equation. That is, we will study
(3.1a) $u \in C^{1,2}$ and $u_t = \frac{1}{2}\Delta u + cu$ in $(0, \infty) \times \mathbf{R}^d$.
(3.1b) $u$ is continuous at each point of $\{0\} \times \mathbf{R}^d$ and $u(0, x) = f(x)$.
If $c(t, x) \le 0$, then this equation describes heat flow with cooling. That is,
$u(t, x)$ gives the temperature at the point $x \in \mathbf{R}^d$ at time $t$, when the heat at
$x$ at time $t$ dissipates at the rate $-c(t, x)$.
The first step, as usual, is to find a local martingale.
(3.2) Theorem. Let $c_s^t = \int_0^s c(t - r, B_r) \, dr$. If $u$ satisfies (3.1a), then
$$M_s = u(t - s, B_s) \exp(c_s^t)$$
is a local martingale on $[0, t)$.
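Proof. Applying Ito's formula with $X_s^0 = t - s$, $X_s^i = B_s^i$ for $1 \le i \le d$, and
$X_s^{d+1} = c_s^t$ gives that
$$\begin{aligned}
&u(t - s, B_s) \exp(c_s^t) - u(t, B_0) \\
&\quad = \int_0^s -u_t(t - r, B_r) \exp(c_r^t) \, dr + \int_0^s \exp(c_r^t) \nabla u(t - r, B_r) \cdot dB_r \\
&\qquad + \int_0^s u(t - r, B_r) \exp(c_r^t) \, dc_r^t + \frac{1}{2} \int_0^s \Delta u(t - r, B_r) \exp(c_r^t) \, dr
\end{aligned}$$
since we have
$$\langle X^i, X^j \rangle_r = \begin{cases} r & \text{if } 1 \le i = j \le d \\ 0 & \text{otherwise} \end{cases}$$
Using $dc_r^t = c(t - r, B_r) \, dr$ and rearranging, the right-hand side is
$$\begin{aligned}
&= \int_0^s \big({-u_t} + cu + \tfrac{1}{2}\Delta u\big)(t - r, B_r) \exp(c_r^t) \, dr \\
&\quad + \int_0^s \exp(c_r^t) \nabla u(t - r, B_r) \cdot dB_r
\end{aligned}$$
which proves (3.2), since $-u_t + cu + \frac{1}{2}\Delta u = 0$ and the second term is a local
martingale. □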
The next step, again, is a uniqueness result.
(3.3) Theorem. Suppose that $c$ is bounded. If there is a solution of (3.1) that
is bounded on $[0, T] \times \mathbf{R}^d$ for any $T < \infty$, it must be
$$v(t, x) \equiv E_x\{f(B_t) \exp(c_t^t)\}$$
Proof. Under our assumptions on $c$ and $u$, $M_s$, $0 \le s < t$, is a bounded
martingale and $M_t \equiv \lim_{s \to t} M_s = f(B_t) \exp(c_t^t)$. Since $M_s$ is uniformly integrable
it follows that
$$u(t, x) = E_x M_0 = E_x M_t = v(t, x) \qquad □$$
As before, it is easy to show that if $v$ is smooth enough it is a solution.
(3.4) Theorem. Suppose $f$ is bounded and that $c$ is bounded and continuous.
If $v \in C^{1,2}$, then it satisfies (3.1a).
Proof. The Markov property implies, see Exercise 2.7 in Chapter 1 and take
$h(r, x) = c(t - r, x)$, that
$$\begin{aligned}
E_x(f(B_t) \exp(c_t^t) \mid \mathcal{F}_s) &= \exp(c_s^t) E_{B_s}\big(f(B_{t-s}) \exp(c_{t-s}^{t-s})\big) \\
&= \exp(c_s^t) v(t - s, B_s)
\end{aligned}$$
The left-hand side is a martingale, so the right-hand side is also. If $v \in C^{1,2}$,
then repeating the calculation in the proof of (3.2) shows that
$$\begin{aligned}
v(t - s, B_s) \exp(c_s^t) - v(t, B_0)
&= \int_0^s \big({-v_t} + cv + \tfrac{1}{2}\Delta v\big)(t - r, B_r) \exp(c_r^t) \, dr \\
&\quad + \text{a local martingale}
\end{aligned}$$
The left-hand side is a local martingale, so the integral on the right-hand side
is also. Since the integral is continuous and locally of bounded variation, (3.3)
in Chapter 2 implies that it must be $\equiv 0$ almost surely. Again this implies that
$(-v_t + cv + \frac{1}{2}\Delta v)(t - r, B_r) \exp(c_r^t) \equiv 0$ for our assumptions imply this quantity
is continuous and if it were $\ne 0$ at some point we would have a contradiction.
□
The next step is to give a condition that guarantees that v satisfies (3.1b).
As before, we begin by considering what happens when everything is bounded.
(3.5) Theorem. If $c$ is bounded and $f$ is bounded and continuous, then $v$
satisfies (3.1b).
Proof. If $|c| \le M$, then $e^{-Mt} \le \exp(c_t^t) \le e^{Mt}$, so $\exp(c_t^t) \to 1$ as $t \to 0$.
Letting $\|f\|_\infty = \sup_x |f(x)|$ this result implies
$$|E_x \exp(c_t^t) f(B_t) - E_x f(B_t)| \le \|f\|_\infty E_x|\exp(c_t^t) - 1| \to 0$$
(1.5) implies that $(t, x) \to E_x f(B_t)$ is continuous at each point of $\{0\} \times \mathbf{R}^d$ and
the desired result follows. □
This brings us to the problem of determining when $v$ is smooth enough to
be a solution.
(3.6) Theorem. Suppose that $f$ is bounded and Hölder continuous. If $c$ is
bounded and Hölder continuous locally in $t$, then $v \in C^{1,2}$ and, hence, satisfies
(3.1a).
Proof To solve the problem in this case, we use a trick to reduce our result
to the previous case. We begin by observing that
$$c_s^t = \int_0^s c(t - r, B_r)\,dr$$
is continuous and locally of bounded variation. So Itô's formula, (10.2) in
Chapter 2, implies that if $h \in C^1$ then
$$h(c_t^t) - h(c_0^t) = \int_0^t h'(c_s^t)\,dc_s^t$$
Taking $h(x) = e^{-x}$ we have
$$\exp(-c_t^t) - 1 = -\int_0^t \exp(-c_s^t)\,c(t - s, B_s)\,ds$$
Multiplying by $-\exp(c_t^t)$ gives
$$\exp(c_t^t) - 1 = \int_0^t c(t - s, B_s)\exp(c_t^t - c_s^t)\,ds$$
Plugging in the definitions of $c_t^t$ and $c_s^t$ we have
$$\exp\left(\int_0^t c(t - r, B_r)\,dr\right) = 1 + \int_0^t c(t - s, B_s)\exp\left(\int_s^t c(t - r, B_r)\,dr\right) ds$$
Multiplying by $f(B_t)$, taking expected values, and using Fubini's theorem,
which is justified since everything is bounded, gives
$$v(t, x) = E_x f(B_t) + \int_0^t E_x\left[c(t - s, B_s)\exp\left(\int_s^t c(t - r, B_r)\,dr\right) f(B_t)\right] ds$$
Conditioning on $\mathcal{F}_s$, noticing $c(t - s, B_s) \in \mathcal{F}_s$, and using the Markov property
as in Exercise 2.7 of Chapter 1,
$$E_x\left[c(t - s, B_s)\exp\left(\int_s^t c(t - r, B_r)\,dr\right) f(B_t) \,\middle|\, \mathcal{F}_s\right] = c(t - s, B_s)\,v(t - s, B_s)$$
Taking the expected value of the last equation and plugging into the previous
one, we have
$$\begin{aligned}
(*)\qquad v(t, x) &= E_x f(B_t) + \int_0^t E_x\{c(t - s, B_s)\,v(t - s, B_s)\}\,ds \\
&= v_1(t, x) + v_2(t, x)
\end{aligned}$$
The first term on the right, $v_1(t, x)$, is $C^{1,2}$ by (1.6). The second term,
$v_2(t, x)$, is of the form considered in the last section with $g(r, x) = c(r, x)v(r, x)$.
If we start with the trivial observation that if $c$ and $f$ are bounded, then $v$ is
bounded on $[0, N] \times \mathbf{R}^d$, and apply (2.6b), we see that
$$|v_2(t, x) - v_2(t, y)| \le C_N|x - y| \quad\text{whenever } t \le N$$
To get a similar estimate for $v_1$, let $B_t$ be a Brownian motion starting at 0 and
observe that since $f$ is Hölder continuous
$$|v_1(t, x) - v_1(t, y)| = |E\{f(x + B_t) - f(y + B_t)\}| \le E|f(x + B_t) - f(y + B_t)| \le C|x - y|^\alpha$$
Combining the last two estimates we see that $v(r, x)$ is Hölder continuous locally
in $t$. Since $c$ and $v$ are bounded and we have supposed that $c(r, x)$ is Hölder
continuous locally in $t$, the triangle inequality implies $g(r, x)$ is Hölder continuous
locally in $t$. Using (2.6c) and (2.6d) now, we have $v_2(t, x) \in C^{1,2}$. □
Unbounded c. As in the last section, we can generalize the results above
to unbounded c's, but for simplicity we restrict our attention to the temporally
homogeneous case c(t, x) = h(x). Given the formula (*) above, which expresses
v as a volume potential, it is perhaps not too surprising that the appropriate
assumption is $c \in K_d$. The key to working in this generality is what Simon
(1982) calls
(3.7) Khasminskii's lemma. Let $g \ge 0$ be a function on $\mathbf{R}^d$ with
$$\alpha = \sup_x E_x\left(\int_0^t g(B_s)\,ds\right) < 1$$
Then
$$\sup_x E_x \exp\left(\int_0^t g(B_s)\,ds\right) \le (1 - \alpha)^{-1}$$
Proof The Markov property and the nonnegativity of $g$ imply that
$$\sup_x E_x \int\cdots\int_{0 < s_1 < \cdots < s_n < t} ds_1 \cdots ds_n\, g(B_{s_1}) \cdots g(B_{s_n}) \le \alpha^n$$
Since the integrand is symmetric in $(s_1, \ldots, s_n)$ we have
$$\int\cdots\int_{0 < s_1 < \cdots < s_n < t} ds_1 \cdots ds_n\, g(B_{s_1}) \cdots g(B_{s_n}) = \frac{1}{n!}\int_0^t \cdots \int_0^t ds_1 \cdots ds_n\, g(B_{s_1}) \cdots g(B_{s_n})$$
Summing on $n$ now gives the desired formula. □
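The proof above is terse, so here is a worked sketch of the case $n = 2$ and of the final summation. It is an illustration of the argument rather than a quotation from the text. The inner bound uses $g \ge 0$, which gives $E_y \int_0^{t-s_1} g(B_r)\,dr \le E_y \int_0^t g(B_r)\,dr \le \alpha$ for every $y$.

```latex
\begin{aligned}
E_x \int_0^t ds_1\, g(B_{s_1}) \int_{s_1}^t ds_2\, g(B_{s_2})
  &= E_x \int_0^t ds_1\, g(B_{s_1})\,
     E_{B_{s_1}}\!\left( \int_0^{t - s_1} g(B_r)\,dr \right)
     && \text{(Markov property at time } s_1\text{)} \\
  &\le \alpha\, E_x \int_0^t ds_1\, g(B_{s_1}) \le \alpha^2 \\[6pt]
E_x \exp\!\left( \int_0^t g(B_s)\,ds \right)
  &= \sum_{n=0}^\infty \frac{1}{n!}\, E_x \left( \int_0^t g(B_s)\,ds \right)^{\!n}
   \le \sum_{n=0}^\infty \alpha^n = (1 - \alpha)^{-1}
\end{aligned}
```

The general case follows by induction on $n$, conditioning on the smallest time $s_1$ in the same way.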
From the last result, it should be clear why assuming $h \in K_d$ is natural in
this context. This condition guarantees that
$$\sup_x E_x\left(\int_0^t |h(B_s)|\,ds\right) \to 0 \quad\text{as } t \to 0$$
and, hence, that
$$\sup_x E_x \exp\left(\int_0^t |h(B_s)|\,ds\right) \to 1 \quad\text{as } t \to 0$$
With these two results in hand, we can proceed with developing the theory
much as we did in the case of bounded coefficients.
B. Elliptic Equations
In the next three sections of this chapter, we show how Brownian motion can
be used to construct classical solutions of the following equations:
$$\begin{aligned}
0 &= \tfrac12\Delta u \\
0 &= \tfrac12\Delta u + g \\
0 &= \tfrac12\Delta u + cu
\end{aligned}$$
in an open set $G$ subject to the boundary condition: at each point of $\partial G$, $u$ is
continuous and $u = f$.
We will see that the solutions to these equations are (under suitable
assumptions) given by
$$\begin{aligned}
&E_x f(B_\tau) \\
&E_x\left(f(B_\tau) + \int_0^\tau g(B_s)\,ds\right) \\
&E_x\left(f(B_\tau)\exp\left(\int_0^\tau c(B_s)\,ds\right)\right)
\end{aligned}$$
where $\tau = \inf\{t > 0 : B_t \notin G\}$. To see the similarities to the solutions given
for the equations in Part A of this chapter think of those solutions in terms of
space-time Brownian motions $\bar B_s = (t - s, B_s)$ run until time
$\bar\tau = \inf\{s : \bar B_s \notin (0, \infty) \times \mathbf{R}^d\}$. Of course, $\bar\tau \equiv t$.
4.4. The Dirichlet Problem
In this section, we will consider the Dirichlet problem. That is, we will study
(4.1a) $u \in C^2$ and $\Delta u = 0$ in $G$.
(4.1b) At each point of $\partial G$, $u$ is continuous and $u = f$.
To see what this means, note that if we let $h(t, x) = u(x)$, then $h$ satisfies the
heat equation
$$\frac{\partial h}{\partial t} = \frac12\Delta h$$
Thus, $u$ is an equilibrium temperature distribution when $\partial G$ is held at a fixed
temperature profile $f$.
As in the first three sections, the first step in solving (4.1) is to find a local
martingale.
(4.2) Theorem. Let $\tau = \inf\{t > 0 : B_t \notin G\}$. If $u$ satisfies (4.1a), then
$M_t = u(B_t)$ is a local martingale on $[0, \tau)$.
Proof Applying Itô's formula gives
$$u(B_t) - u(B_0) = \int_0^t \nabla u(B_s) \cdot dB_s + \frac12\int_0^t \Delta u(B_s)\,ds$$
for $t < \tau$. This proves (4.2), since $\Delta u = 0$ in $G$ and the first term is a local
martingale on $[0, \tau)$. □
The second step is a uniqueness theorem. For this we introduce
(†) $P_x(\tau < \infty) = 1$ for all $x$
which (4.7b) below will show is necessary for uniqueness. Note that (1.3) in
Chapter 3 implies bounded sets $G$ satisfy (†).
(4.3) Theorem. Suppose $G$ satisfies (†). If there is a solution of (4.1) that is
bounded, it must be
$$v(x) \equiv E_x f(B_\tau)$$
Proof If $u$ is bounded and satisfies (4.1) then $M_s$, $0 \le s < \tau$, is a bounded
local martingale. Using (2.7) in Chapter 2, (†), and (4.1b), we have
$$M_\tau \equiv \lim_{s \uparrow \tau} M_s = f(B_\tau)$$
$$u(x) = E_x M_0 = E_x M_\tau = v(x)$$
□
(†) is not needed for the existence of solutions, but to drop (†) we need to
modify the definition to take the possibility of $\tau = \infty$ into account. Let
$$v(x) = E_x\left(f(B_\tau) 1_{(\tau < \infty)}\right)$$
As in the first three sections, it is easy to show:
(4.4) Theorem. Let $G$ be any open set and suppose $f$ is bounded. If $v \in C^2$,
then it satisfies (4.1a).
Proof The Markov property implies that on $\{\tau > s\}$
$$E_x\left(f(B_\tau) 1_{(\tau < \infty)} \mid \mathcal{F}_s\right) = v(B_s)$$
The left-hand side is a local martingale on $[0, \tau)$, so $v(B_s)$ is also. If $v \in C^2$,
then repeating the calculation in the proof of (4.2) shows that, for $s \in [0, \tau)$,
$$v(B_s) - v(B_0) = \frac12\int_0^s \Delta v(B_r)\,dr + \text{a local martingale}$$
The left-hand side is a local martingale on $[0, \tau)$, so the integral on the right-hand
side is also. However, the integral is continuous and locally of bounded
variation, so by (3.3) in Chapter 2 it must be $\equiv 0$. Since $\Delta v$ is continuous in
$G$, it follows that $\Delta v \equiv 0$ in $G$. For if $\Delta v \neq 0$ at some point we would have a
contradiction. □
Up to this point, everything has been the same as in Section 4.1. Differences
appear when we consider the boundary condition (4.1b), since it is no longer
sufficient for $f$ to be bounded and continuous. The open set $G$ must satisfy a
regularity condition.
Definition. A point $y \in \partial G$ is said to be a regular point if $P_y(\tau = 0) = 1$.
(4.5) Theorem. Let $G$ be any open set. Suppose $f$ is bounded and continuous
and $y$ is a regular point of $\partial G$. If $x_n \in G$ and $x_n \to y$, then $v(x_n) \to f(y)$.
Proof The first step is to show
(4.5a) Lemma. If $t > 0$, then $x \to P_x(\tau \le t)$ is lower semicontinuous. That is,
if $x_n \to x$, then
$$\liminf_{n \to \infty} P_{x_n}(\tau \le t) \ge P_x(\tau \le t)$$
Proof By the Markov property
$$P_x(B_s \in G^c \text{ for some } s \in (\epsilon, t]) = \int p_\epsilon(x, y)\,P_y(\tau \le t - \epsilon)\,dy$$
Since $y \to P_y(\tau \le t - \epsilon)$ is bounded and measurable and
$$p_\epsilon(x, y) = (2\pi\epsilon)^{-d/2} e^{-|x - y|^2/2\epsilon}$$
it follows from the dominated convergence theorem that
$$x \to P_x(B_s \in G^c \text{ for some } s \in (\epsilon, t])$$
is continuous for each $\epsilon > 0$. Letting $\epsilon \downarrow 0$ shows that $x \to P_x(\tau \le t)$ is an
increasing limit of continuous functions and, hence, lower semicontinuous. □
If $y$ is regular for $G$ and $t > 0$, then $P_y(\tau \le t) = 1$, so it follows from (4.5a)
that if $x_n \to y$, then
$$\liminf_{n \to \infty} P_{x_n}(\tau \le t) \ge 1 \quad\text{for all } t > 0$$
With this established, it is easy to complete the proof. Since $f$ is bounded and
continuous, it suffices to show that if $D(y, \delta) = \{x : |x - y| < \delta\}$, then
(4.5b) Lemma. If $y$ is regular for $G$ and $x_n \to y$, then for all $\delta > 0$
$$P_{x_n}(\tau < \infty,\ B_\tau \in D(y, \delta)) \to 1$$
Proof Let $\epsilon > 0$ and pick $t$ so small that
$$P_0\left(\sup_{0 \le s \le t} |B_s| > \frac{\delta}{2}\right) < \epsilon$$
Since $P_{x_n}(\tau \le t) \to 1$ as $x_n \to y$, it follows from the choices above that
$$\begin{aligned}
\liminf_{n \to \infty} P_{x_n}(\tau < \infty,\ B_\tau \in D(y, \delta))
&\ge \liminf_{n \to \infty} P_{x_n}\left(\tau \le t,\ \sup_{0 \le s \le t} |B_s - x_n| \le \frac{\delta}{2}\right) \\
&\ge \liminf_{n \to \infty} P_{x_n}(\tau \le t) - P_0\left(\sup_{0 \le s \le t} |B_s| > \frac{\delta}{2}\right) > 1 - \epsilon
\end{aligned}$$
Since $\epsilon$ was arbitrary, this proves (4.5b) and, hence, (4.5). □
(4.5) shows that if every point of $\partial G$ is regular, then $v(x)$ will satisfy
the boundary condition (4.1b) for any bounded continuous $f$. The next two
exercises develop a converse to this result. The first identifies a trivial case.
Exercise 4.1. If $G \subset \mathbf{R}$ is open then each point of $\partial G$ is regular.
Exercise 4.2. Let $G$ be an open set in $\mathbf{R}^d$ with $d \ge 2$ and let $y \in \partial G$ have
$P_y(\tau = 0) < 1$. Let $f$ be a continuous function with $f(y) = 1$ and $f(z) < 1$
for all $z \neq y$. Show that there is a sequence of points $x_n \to y$ such that
$\liminf_{n \to \infty} v(x_n) < 1$.
From the discussion above, we see that for $v$ to satisfy (4.1b) it is sufficient
(and almost necessary) that each point of $\partial G$ is a regular point. This situation
raises two questions:
Do irregular points exist?
What are sufficient conditions for a point to be regular?
In order to answer the first question we will give two examples.
Example 4.1. Punctured Disc. Let $d \ge 2$ and let $G = D - \{0\}$, where
$D = \{x : |x| < 1\}$. If we let $T_0 = \inf\{t > 0 : B_t = 0\}$, then $P_0(T_0 = \infty) = 1$
by (1.10) in Chapter 3, so 0 is not a regular point of $\partial G$. One can get an
example with a bigger boundary in $d \ge 3$ by looking at $G = D - K$ where
$K = \{x : x_1 = x_2 = 0\}$. In $d = 3$ this is a ball minus a line through the origin.
Example 4.2. Lebesgue's Thorn. Let $d \ge 3$ and let
$$G = (-1, 1)^d - \bigcup_{1 \le n \le \infty} \left\{[2^{-n}, 2^{-n+1}] \times [-a_n, a_n]^{d-1}\right\}$$
where the $n = \infty$ term is the single point $\{0\}$.
Claim. If $a_n \downarrow 0$ sufficiently fast, then 0 is not a regular point of $\partial G$.
Proof $P_0((B_t^2, B_t^3) = (0, 0) \text{ for some } t > 0) = 0$, so with probability 1, a
Brownian motion $B_t$ starting at 0 will not hit
$$I_n = \{x : x_1 \in [2^{-n}, 2^{-n+1}],\ x_2 = x_3 = \cdots = x_d = 0\}$$
Since $B_t$ is transient in $d \ge 3$ it follows that for a.e. $\omega$ the distance between
$\{B_s : 0 \le s < \infty\}$ and $I_n$ is positive. From the last observation, it follows
immediately that if we let $T_n = \inf\{t : B_t \in [2^{-n}, 2^{-n+1}] \times [-a_n, a_n]^{d-1}\}$ and pick
$a_n$ small enough, then $P_0(T_n < \infty) \le 3^{-n}$. Now $\sum_{n=1}^\infty 3^{-n} = 3^{-1}(3/2) = 1/2$,
so if we let $\tau = \inf\{t > 0 : B_t \notin G\}$ and $\sigma = \inf\{t > 0 : B_t \notin (-1, 1)^d\}$, then
we have
$$P_0(\tau < \sigma) \le \sum_{n=1}^\infty P_0(T_n < \infty) \le \frac12$$
Thus $P_0(\tau > 0) \ge P_0(\tau = \sigma) \ge 1/2$ and 0 is an irregular point. □
The last two examples show that if $G^c$ is too small near $y$, then $y$ may be
irregular. The next result shows that if $G^c$ is not too small near $y$, then $y$ is
regular.
(4.5c) Cone Condition. If there is a cone $V$ having vertex $y$ and an $r > 0$
such that $V \cap D(y, r) \subset G^c$, then $y$ is a regular point.
Proof The first thing to do is to define a cone with vertex $y$, pointing
in direction $v$, with opening $a$ as follows:
$$V(y, v, a) = \{x : x = y + \theta(v + z) \text{ where } \theta \in (0, \infty),\ z \perp v, \text{ and } |z| < a\}$$
Now that we have defined a cone, the rest is easy. Since the normal distribution
is spherically symmetric,
$$P_y(B_t \in V(y, v, a)) = \epsilon_a > 0$$
where $\epsilon_a$ is a constant that depends only on the opening $a$. Let $r > 0$ be such
that $V(y, v, a) \cap D(y, r) \subset G^c$. The continuity of Brownian paths implies
$$\lim_{t \to 0} P_y\left(\sup_{s \le t} |B_s - y| > r\right) = 0$$
Combining the last two results with a trivial inequality we have
$$\epsilon_a \le \liminf_{t \downarrow 0} P_y(B_t \in G^c) \le \lim_{t \downarrow 0} P_y(\tau \le t) \le P_y(\tau = 0)$$
and it follows from Blumenthal's zero-one law that $P_y(\tau = 0) = 1$. □
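The key fact in the proof is that the probability of finding $B_t$ in the cone does not depend on $t$. The sketch below checks this numerically in the plane, under illustrative assumptions that are not in the text: $d = 2$, vertex $y = 0$, $v = e_1$ a unit vector, and opening $a = 1$. In that case the cone is the sector $\{(p, q) : p > 0, |q| < ap\}$ and rotational symmetry gives probability $\arctan(a)/\pi$.

```python
import math
import random

def cone_probability(t, a, n_samples=20000, seed=0):
    """Estimate P_0(B_t in V(0, e_1, a)) for planar Brownian motion started at 0."""
    rng = random.Random(seed)
    scale = math.sqrt(t)
    hits = 0
    for _ in range(n_samples):
        p = scale * rng.gauss(0.0, 1.0)   # component of B_t along v = e_1
        q = scale * rng.gauss(0.0, 1.0)   # component of B_t orthogonal to v
        # B_t = theta * (v + z) with theta > 0, z orthogonal to v, |z| < a
        if p > 0.0 and abs(q) < a * p:
            hits += 1
    return hits / n_samples

exact = math.atan(1.0) / math.pi          # angular fraction of the sector for a = 1
estimates = [cone_probability(t, a=1.0) for t in (0.01, 1.0, 100.0)]
print(exact, estimates)
```

The estimates stay near the same value for every $t$, which reflects the scaling invariance of the cone.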
(4.5c) is sufficient for most cases.
Exercise 4.3. Let $G = \{x : g(x) < 0\}$ where $g$ is a $C^1$ function with $\nabla g(y) \neq 0$
for each $y \in \partial G$. Show that each point of $\partial G$ is regular.
However, in a pinch the following generalization can be useful.
Exercise 4.4. Define a flat cone $\tilde V(y, v, a)$ to be the intersection of $V(y, v, a)$
with a $d - 1$ dimensional hyperplane that contains the line $\{y + \theta v : \theta \in \mathbf{R}\}$.
Show that (4.5c) remains true if "cone" is replaced by "flat cone."
When $d = 2$ this says that if we can find a line segment ending at $z$ that lies in
the complement then $z$ is regular. For example this implies that each point of
the boundary of the slit disc $G = D - \{x : x_1 \ge 0, x_2 = 0\}$ is regular.
Having completed our discussion of the boundary condition, we now turn
our attention to determining when v is smooth. As in Section 4.1, this is true
under minimal assumptions on $f$.
(4.6) Theorem. Let $G$ be any open set. If $f$ is bounded, then $v \in C^\infty$ and,
hence, satisfies (4.1a).
Proof Let $x \in G$ and pick $\delta > 0$ so that $D(x, \delta) \subset G$. If we let
$\sigma = \inf\{t : B_t \notin D(x, \delta)\}$, then the strong Markov property implies that (for more details
see Example 3.2 in Chapter 1)
$$v(x) = E_x\left(f(B_\tau) 1_{(\tau < \infty)}\right) = E_x(v(B_\sigma)) = \int_{\partial D(x, \delta)} v(y)\,\pi(dy)$$
where $\pi$ is surface measure on $\partial D(x, \delta)$ normalized to be a probability measure.
The last result is the "averaging property" of harmonic functions. Now a simple
analytical argument takes over to show $v \in C^\infty$.
(4.6a) Lemma. Let $G$ be an open set and $h$ be a bounded function which has
the averaging property
$$h(x) = \int_{\partial D(x, \delta)} h(y)\,\pi(dy)$$
when $x \in G$ and $\delta < \delta_x$ where $\delta_x > 0$. Then $h \in C^\infty$ and $\Delta h = 0$ in $G$.
Proof Let $\psi$ be a nonnegative infinitely differentiable function that vanishes
on $[\delta^2, \infty)$ but is not $\equiv 0$. By repeatedly applying (1.7) it is routine to show
that
$$g(x) = \int_{D(x, \delta)} \psi(|y - x|^2)\,h(y)\,dy \in C^\infty$$
Note that we use $|y - x|^2 = \sum_i (y_i - x_i)^2$ rather than $|y - x|$, which is not
smooth at 0. Making a simple change of variables, then looking at things in
polar coordinates, and using the averaging property we have
$$\begin{aligned}
g(x) &= \int_{D(0, \delta)} \psi(|z|^2)\,h(x + z)\,dz \\
&= C\int_0^\delta dr\, r^{d-1}\psi(r^2)\left(\int_{\partial D(0, r)} h(x + z)\,\pi(dz)\right) \\
&= C\left\{\int_0^\delta dr\, r^{d-1}\psi(r^2)\right\} h(x)
\end{aligned}$$
So $h$ is a constant times $g$ and hence $C^\infty$.
To conclude now that $\Delta h = 0$ we note that the multivariate version of
Taylor's theorem implies that if $|y - x| \le r$ then
$$h(y) = h(x) + \sum_i (y_i - x_i)D_i h(x) + \frac12\sum_{i,j} (y_i - x_i)(y_j - x_j)D_{ij}h(x) + \epsilon(y, x)$$
where $|\epsilon(y, x)| \le C_3 r^3$. Integrating over $D(x, r)$ w.r.t. $dy/|D(0, r)|$ and using
the averaging property we have
$$h(x) = h(x) + 0 + C_2\Delta h(x)r^2 + O(r^3)$$
Subtracting $h(x)$ from each side, dividing by $r^2$, and letting $r \to 0$ it follows
that $\Delta h(x) = 0$. □
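The averaging property also suggests a simulation scheme, usually called "walk on spheres", for estimating $v(x) = E_x f(B_\tau)$: since the exit point of Brownian motion from a ball centered at its starting point is uniformly distributed on the sphere, one can jump from sphere to sphere instead of simulating the path. The sketch below is an illustration and not a method from the text. It assumes $G$ is the unit disc in $d = 2$, stops within a tolerance `eps` of the boundary (which introduces a small bias), and uses the harmonic boundary data $f(x, y) = x^2 - y^2$, for which the solution is known to be $x^2 - y^2$ itself.

```python
import math
import random

def walk_on_spheres(x, y, f, eps=1e-4, rng=random):
    """One sample of f(B_tau) for the unit disc, up to an eps boundary layer."""
    while True:
        dist = 1.0 - math.hypot(x, y)       # distance from (x, y) to the unit circle
        if dist < eps:
            r = math.hypot(x, y)
            return f(x / r, y / r)          # project onto the boundary and evaluate f
        # The exit point from the largest disc around (x, y) inside G is uniform
        # on its boundary circle, by rotational symmetry of Brownian motion.
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x += dist * math.cos(angle)
        y += dist * math.sin(angle)

def dirichlet_mc(x, y, f, n_paths=20000, seed=0):
    """Monte Carlo estimate of v(x, y) = E f(B_tau) by averaging walk-on-spheres samples."""
    rng = random.Random(seed)
    return sum(walk_on_spheres(x, y, f, rng=rng) for _ in range(n_paths)) / n_paths

boundary_data = lambda x, y: x * x - y * y  # restriction of a harmonic polynomial
print(dirichlet_mc(0.3, 0.2, boundary_data), 0.3 ** 2 - 0.2 ** 2)
```

Each step moves the point to a uniformly chosen location on the largest circle that fits inside the disc, so the number of steps per sample grows only logarithmically in `1/eps`.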
Unbounded G. As in the previous three sections, our last topic is to
discuss what happens when something becomes unbounded. This time we will
focus on $G$ and ignore $f$. Combining (4.3), (4.5), and (4.6) we have:
(4.7a) Theorem. Suppose that $f$ is bounded and continuous and that each
point of $\partial G$ is regular. If for all $x \in G$, $P_x(\tau < \infty) = 1$, then $v$ is the unique
bounded solution of (4.1).
To see that there might be other unbounded solutions consider $G = (0, \infty)$,
$f(0) = 0$, and note that $u(x) = cx$ is a solution. Conversely, we have
(4.7b) Theorem. Suppose that $f$ is bounded and continuous and that each
point of $\partial G$ is regular. If for some $x \in G$, $P_x(\tau < \infty) < 1$, then the solution of
(4.1) is not unique.
Proof Since $h(x) = P_x(\tau = \infty)$ has the averaging property given in (4.6a),
it is $C^\infty$ and has $\Delta h = 0$ in $G$. Since each point $y \in \partial G$ is regular, a trivial
comparison and (4.5a) implies
$$\limsup_{x \to y} P_x(\tau = \infty) \le \limsup_{x \to y} P_x(\tau > 1) \le P_y(\tau > 1) = 0$$
The last two observations show that $h$ is a solution of (4.1) with $f \equiv 0$, which
completes the proof. □
By working a little harder, we can show that adding $aP_x(\tau = \infty)$ to $v(x)$
is the only way to produce new bounded solutions.
(4.7c) Theorem. Suppose that $f$ is bounded and continuous and that each
point of $\partial G$ is regular. If $u$ is bounded and satisfies (4.1) in $G$, then there is a
constant $a$ such that
$$u(x) = E_x(f(B_\tau);\ \tau < \infty) + aP_x(\tau = \infty)$$
We will warm up for this by proving the following special case in which $G = \mathbf{R}^d$.
(4.7d) Theorem. If $u$ is bounded and harmonic in $\mathbf{R}^d$ then $u$ is constant.
Proof (4.2) above and (2.6) in Chapter 2 imply that $u(B_t)$ is a bounded
martingale. So the martingale convergence theorem implies that as $t \to \infty$,
$u(B_t) \to U_\infty$. Since $U_\infty$ is measurable with respect to the tail $\sigma$-field, it follows
from (2.12) in Chapter 1 that $P_x(a < U_\infty < b)$ is either $\equiv 0$ or $\equiv 1$ for any
$a < b$. The last result implies that there is a constant $c$ independent of $x$ so that
$P_x(U_\infty = c) = 1$. Taking expected values it follows that $u(x) = E_x U_\infty = c$. □
Proof of (4.7c) From the proof of (4.7d) we see that $u(B_t)$ is a bounded local
martingale on $[0, \tau)$ so $U_\tau = \lim_{t \uparrow \tau} u(B_t)$ exists. On $\{\tau < \infty\}$, we have $U_\tau = f(B_\tau)$ so
what we need to show is that there is a constant $a$ independent of the starting
point $B_0$ so that $U_\tau = a$ on $\{\tau = \infty\}$. Intuitively, this is a consequence of the
triviality of the asymptotic $\sigma$-field, but the fact that $0 < P_x(\tau = \infty) < 1$ makes
it difficult to extract this from (2.12) in Chapter 1.
To get around the difficulty identified in the previous paragraph, we will
extend $u$ to the whole space. The two steps in doing this are to
(a) Let $h(x) = u(x) - E_x(f(B_\tau);\ \tau < \infty)$. (4.6) and (4.5) imply that $h$ is
bounded and satisfies (4.1) with boundary function $f \equiv 0$.
(b) Let $M = \|h\|_\infty$ and look at
$$w(x) = \begin{cases} h(x) + MP_x(\tau = \infty) & x \in G \\ 0 & x \notin G \end{cases}$$
To complete the proof now, we will show in four steps that $w(x) = \beta P_x(\tau = \infty)$,
from which the desired result follows immediately.
(i) When restricted to $G$, $w$ satisfies (4.1) with boundary function $f \equiv 0$.
The proof of (4.7b) implies that $MP_x(\tau = \infty)$ satisfies (4.1) with boundary
function $f \equiv 0$. Combining this with (a) the desired result follows.
(ii) $w \ge 0$.
To do this we use the optional stopping theorem on the martingale $h(B_t)$ at
time $\tau \wedge t$ and note $h(B_\tau) = 0$ to get
$$h(x) = E_x(h(B_t);\ \tau > t) \ge -MP_x(\tau > t)$$
Letting $t \to \infty$ proves $h(x) \ge -MP_x(\tau = \infty)$.
(iii) $w(B_t)$ is a submartingale.
Because of the Markov property it is enough to show that for all $x$ and $t$ we
have $w(x) \le E_x w(B_t)$. Since $w \ge 0$ this is trivial if $x \notin G$. To prove this for
$x \in G$, we note that (i) implies $W_t = w(B_t)$ is a bounded local martingale on
$[0, \tau)$ and $W_\tau = 0$ on $\{\tau < \infty\}$, so using the optional stopping theorem
$$w(x) = E_x(W_{\tau \wedge t}) = E_x(w(B_t);\ \tau > t) \le E_x(w(B_t))$$
(iv) There is a constant $\beta$ so that $w(x) = \beta P_x(\tau = \infty)$.
Since $w$ is a bounded submartingale it follows that as $t \to \infty$, $w(B_t)$ converges
to a limit $W_\infty$. The argument in (4.7d) implies there is a constant $\beta$ so that
$P_x(W_\infty = \beta) = 1$ for all $x$. Letting $t \to \infty$ in
$$w(x) = E_x(w(B_t);\ \tau > t)$$
and using the bounded convergence theorem gives (iv). □
4.5. Poisson's Equation
In this section, we will see what happens when we add a function of x to the
equation considered in the last section. That is, we will study:
(5.1a) $u \in C^2$ and $\tfrac12\Delta u = -g$ in $G$.
(5.1b) At each point of $\partial G$, $u$ is continuous and $u = 0$.
As in Section 4.2, we can add a solution of (4.1) to replace $u = 0$ in (5.1b) by
$u = f$. As always, the first step in solving (5.1) is to find a local martingale.
(5.2) Theorem. Let $\tau = \inf\{t > 0 : B_t \notin G\}$. If $u$ satisfies (5.1a), then
$$M_t = u(B_t) + \int_0^t g(B_s)\,ds$$
is a local martingale on $[0, \tau)$.
Proof Applying Itô's formula as we did in the last section gives
$$u(B_t) - u(B_0) = \int_0^t \nabla u(B_s) \cdot dB_s + \frac12\int_0^t \Delta u(B_s)\,ds$$
for $t < \tau$. This proves (5.2), since $\tfrac12\Delta u = -g$ and the first term on the right-hand
side is a local martingale on $[0, \tau)$. □
The next step is to prove a uniqueness result.
(5.3) Theorem. Suppose that $G$ and $g$ are bounded. If there is a solution of
(5.1) that is bounded, it must be
$$v(x) = E_x\left(\int_0^\tau g(B_t)\,dt\right)$$
Proof If $u$ satisfies (5.1a) then $M_t$ defined in (5.2) is a local martingale on
$[0, \tau)$. If $G$ is bounded, then (1.3) in Chapter 3 implies $E_x\tau < \infty$ for all $x \in G$.
If $u$ and $g$ are bounded then for $t < \tau$
$$|M_t| \le \|u\|_\infty + \tau\|g\|_\infty$$
Since the right-hand side is integrable, (2.7) in Chapter 2 and (5.1b) imply
$$M_\tau \equiv \lim_{t \uparrow \tau} M_t = \int_0^\tau g(B_t)\,dt$$
$$u(x) = E_x M_0 = E_x(M_\tau) = v(x)$$
□
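A concrete instance may help. Take $d = 1$, $G = (-a, a)$ and $g \equiv 1$, so that $v(x) = E_x\tau$. The function $a^2 - x^2$ satisfies $\tfrac12 u'' = -1$ and vanishes at $\pm a$, so by (5.3) it must equal $E_x\tau$. The sketch below checks this by simulation. It is an illustration under stated assumptions, not a method from the text: Brownian motion is replaced by a simple random walk with space step `h` and time step `h*h`, for which the expected exit time from a grid point is also exactly $a^2 - x^2$.

```python
import random

def mean_exit_time(x, a=1.0, h=0.1, n_paths=2000, seed=0):
    """Estimate E_x tau for (-a, a) with a simple random walk: space step h, time step h^2."""
    rng = random.Random(seed)
    n = round(a / h)            # the boundary sits at grid indices -n and +n
    start = round(x / h)        # starting grid index (x is assumed to lie on the grid)
    total_steps = 0
    for _ in range(n_paths):
        k = start
        while abs(k) < n:
            k += 1 if rng.random() < 0.5 else -1
            total_steps += 1
    # Each step of the walk accounts for h^2 units of time.
    return h * h * total_steps / n_paths

print(mean_exit_time(0.5), 1.0 - 0.5 ** 2)
```

The estimate should be close to $0.75$, up to Monte Carlo error.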
As usual, it is easy to show
(5.4) Theorem. Suppose that $G$ is bounded and $g$ is continuous. If $v \in C^2$,
then it satisfies (5.1a).
Proof The Markov property implies that on $\{\tau > s\}$,
$$E_x\left(\int_0^\tau g(B_t)\,dt \,\middle|\, \mathcal{F}_s\right) = \int_0^s g(B_t)\,dt + v(B_s)$$
The left-hand side is a local martingale on $[0, \tau)$, so the right-hand side is also.
If $v \in C^2$, then repeating the calculation in the proof of (5.2) shows that for
$s \in [0, \tau)$,
$$v(B_s) - v(B_0) + \int_0^s g(B_r)\,dr = \int_0^s \left(\tfrac12\Delta v + g\right)(B_r)\,dr + \text{a local martingale}$$
The left-hand side is a local martingale on $[0, \tau)$, so the integral on the right-hand
side is also. However, the integral is continuous and locally of bounded
variation, so by (3.3) in Chapter 2 it must be $\equiv 0$. Since $\tfrac12\Delta v + g$ is continuous
in $G$, it follows that it is $\equiv 0$ in $G$, for if it were $\neq 0$ at some point then we
would have a contradiction. □
After the extensive discussion in the last section, the conditions needed to
guarantee that the boundary conditions hold should come as no surprise.
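(5.5) Theorem. Suppose that $G$ and $g$ are bounded. Let $y$ be a regular point
of $\partial G$. If $x_n \in G$ and $x_n \to y$, then $v(x_n) \to 0$.
Proof We begin by observing:
(i) It follows from (4.5a) that if $\epsilon > 0$, then $P_{x_n}(\tau > \epsilon) \to 0$.
(ii) If $G$ is bounded, then (1.3) in Chapter 3 implies $C = \sup_x E_x\tau < \infty$ and,
hence, $\|v\|_\infty \le C\|g\|_\infty < \infty$.
Let $\epsilon > 0$. Beginning with some elementary inequalities then using the
Markov property we have
$$\begin{aligned}
|v(x_n)| &\le E_{x_n}\left(\int_0^{\tau \wedge \epsilon} |g(B_s)|\,ds\right) + E_{x_n}\left(\left|\int_\epsilon^\tau g(B_s)\,ds\right|;\ \tau > \epsilon\right) \\
&\le \epsilon\|g\|_\infty + E_{x_n}(|v(B_\epsilon)|;\ \tau > \epsilon) \le \epsilon\|g\|_\infty + \|v\|_\infty P_{x_n}(\tau > \epsilon)
\end{aligned}$$
Letting $n \to \infty$, and using (i) and (ii) proves (5.5) since $\epsilon$ is arbitrary. □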
Last, but not least, we come to the question of smoothness. For these
developments we will assume that $g$ is defined on $\mathbf{R}^d$ not just in $G$ and has
compact support. Recall we are supposing $G$ is bounded and notice that the
values of $g$ on $G^c$ are irrelevant for (5.1), so there will be no loss of generality
if we later want to suppose that $\int g(x)\,dx = 0$. We begin with the case $d \ge 3$,
because in this case (2.2) in Chapter 3 implies
$$\bar w(x) = E_x\int_0^\infty |g(B_t)|\,dt < \infty$$
and, moreover, is a bounded function of $x$. This means we can define
$$w(x) = E_x\int_0^\infty g(B_t)\,dt$$
use the strong Markov property to conclude
$$w(x) = E_x\int_0^\tau g(B_t)\,dt + E_x w(B_\tau)$$
and change notation to get
$$(*)\qquad v(x) = w(x) - E_x w(B_\tau)$$
(4.6) tells us that the second term is $C^\infty$ in $G$, so to verify that $v \in C^2$ we
need only prove that $w$ is, a task that is made simple by the fact that (2.4) and
(2.3) in Chapter 3 give the following explicit formula
$$w(x) = C_d\int |x - y|^{2-d} g(y)\,dy$$
The first derivative is easy.
(5.6a) Theorem. If $g$ is bounded and has compact support, then $w$ is $C^1$ and
there is a constant $C$ which only depends on $d$ so that
$$|D_i w(x)| \le C\|g\|_\infty\int_{\{g \neq 0\}} \frac{dy}{|x - y|^{d-1}} < \infty$$
Proof We will content ourselves to show that the expression we get by
differentiating under the integral sign converges and leave it to the reader to apply
(1.7) to make the argument rigorous. Now
$$D_i|x - y|^{2-d} = \left(\frac{2 - d}{2}\right)\left(\sum_j (x_j - y_j)^2\right)^{-d/2} 2(x_i - y_i)$$
So differentiating under the integral sign
$$D_i w(x) = C_d(2 - d)\int \frac{x_i - y_i}{|x - y|^d}\,g(y)\,dy$$
The integral on the right-hand side is convergent since
$$\int \frac{|x_i - y_i|}{|x - y|^d}\,|g(y)|\,dy \le \|g\|_\infty\int_{\{g \neq 0\}} \frac{dy}{|x - y|^{d-1}} < \infty$$
As in Section 4.2, trouble starts when we consider second derivatives. If
$i \neq j$, then
$$D_{ij}|x - y|^{2-d} = (2 - d)(-d)|x - y|^{-d-2}(x_i - y_i)(x_j - y_j)$$
In this case, the estimate used above leads to
$$\left|D_{ij}|x - y|^{2-d}\right| \le C|x - y|^{-d}$$
which is (just barely) not locally integrable. As in Section 4.2, if $g$ is Hölder
continuous of order $\alpha$, we can get an extra $|x - y|^\alpha$ to save the day. The details
are tedious, so we will content ourselves to state the result.
(5.6b) Theorem. If $g$ is Hölder continuous, then $w$ is $C^2$.
The reader can find a proof either in Port and Stone (1978), pages 116-117, or
in Gilbarg and Trudinger (1977), pages 53-55. Combining (*) with (5.6b) gives
(5.6) Theorem. Suppose that $G$ is bounded. If $g$ is Hölder continuous, then
$v \in C^2$ and hence satisfies (5.1a).
The last result settles the question of smoothness in $d \ge 3$. To extend the
result to $d \le 2$, we need to find a substitute for (*). To do this, we let
$$w(x) = \int G(x, y)\,g(y)\,dy$$
where $G$ is the potential kernel defined in (2.7) of Chapter 3, that is,
$$G(x, y) = \begin{cases} -\dfrac{1}{\pi}\log(|x - y|) & d = 2 \\ -|x - y| & d = 1 \end{cases}$$
$G$ was defined as
$$\int_0^\infty \{p_t(x, y) - a_t\}\,dt$$
where the $a_t$ were chosen to make the integral converge. So if $\int g\,dx = 0$, we
see that
$$\int G(x, y)\,g(y)\,dy = \lim_{T \to \infty} E_x\int_0^T g(B_t)\,dt$$
Using this interpretation of $w$, we can easily show that (*) holds, so again our
problem is reduced to proving that $w$ is $C^2$, which is a problem in calculus.
Once all the computations are done, we find that (5.6) holds in $d \le 2$ and that
in $d = 1$, it is sufficient to assume that $g$ is continuous. The reader can find
details for $d = 1$ in (4.5) of Chapter 6, and for $d \ge 2$ in either of the sources
cited above.
4.6. The Schrodinger Equation
In this section, we will consider what happens when we add cu to the left-hand
side of the equation considered in Section 4.4. That is, we will study
(6.1a) $u \in C^2$ and $\tfrac12\Delta u + cu = 0$ in $G$.
(6.1b) At each point of $\partial G$, $u$ is continuous and $u = f$.
As always, the first step in solving (6.1) is to find a local martingale.
(6.2) Theorem. Let $\tau = \inf\{t > 0 : B_t \notin G\}$. If $u$ satisfies (6.1a) then
$$M_t = u(B_t)\exp\left(\int_0^t c(B_s)\,ds\right)$$
is a local martingale on $[0, \tau)$.
Proof Let $c_t = \int_0^t c(B_s)\,ds$. Applying Itô's formula gives
$$u(B_t)\exp(c_t) - u(B_0) = \int_0^t \exp(c_s)\nabla u(B_s) \cdot dB_s + \int_0^t u(B_s)\exp(c_s)\,dc_s + \frac12\int_0^t \Delta u(B_s)\exp(c_s)\,ds$$
for $t < \tau$. This proves (6.2), since $dc_s = c(B_s)\,ds$, $\tfrac12\Delta u + cu = 0$, and the first
term on the right-hand side is a local martingale on $[0, \tau)$. □
At this point, the reader might expect that the next step, as it has been
five times before, is to assume that everything is bounded and conclude that if
there is a solution of (6.1) that is bounded, it must be
$$v(x) = E_x(f(B_\tau)\exp(c_\tau))$$
We will not do this, however, because the following simple example shows that
this result is false.
Example 6.1. Let $d = 1$, $G = (-a, a)$, $c \equiv \gamma$, and $f \equiv 1$. The equation we are
considering is
$$\tfrac12 u'' + \gamma u = 0 \qquad u(a) = u(-a) = 1$$
The general solution is $A\cos bx + B\sin bx$, where $b = \sqrt{2\gamma}$. So if we want the
boundary condition to be satisfied we must have
$$\begin{aligned}
1 &= A\cos ba + B\sin ba \\
1 &= A\cos(-ba) + B\sin(-ba) = A\cos ba - B\sin ba
\end{aligned}$$
Adding the two equations and then subtracting them it follows that
$$2 = 2A\cos ba \qquad 0 = 2B\sin ba$$
From this we see that $B = 0$ always works and we may or may not be able to
solve for $A$.
If $\cos ba = 0$ then there is no solution.
If $\cos ba \neq 0$ then $u(x) = \cos bx/\cos ba$ is a solution.
We will see later (in Example 9.1) that if $ab < \pi/2$ then
$$v(x) = \cos bx/\cos ba$$
However this cannot possibly hold for $ab > \pi/2$ since $v(x) \ge 0$ while the right-hand
side is $< 0$ for some values of $x$. □
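The claims in Example 6.1 are easy to check numerically. The sketch below is an illustration with assumed parameter values ($a = 1$, and $\gamma = 0.5$ or $\gamma = 2$, none of which appear in the text). It evaluates $u(x) = \cos bx/\cos ba$ with $b = \sqrt{2\gamma}$, approximates $\tfrac12 u'' + \gamma u$ by central differences, and shows that $u$ takes negative values once $ab > \pi/2$, so it cannot equal an expectation of a positive quantity.

```python
import math

def u(x, a, gamma):
    """Candidate solution cos(bx)/cos(ba) with b = sqrt(2*gamma)."""
    b = math.sqrt(2.0 * gamma)
    return math.cos(b * x) / math.cos(b * a)

def residual(x, a, gamma, h=1e-4):
    """Central-difference approximation of (1/2) u'' + gamma * u at x."""
    second = (u(x + h, a, gamma) - 2.0 * u(x, a, gamma) + u(x - h, a, gamma)) / (h * h)
    return 0.5 * second + gamma * u(x, a, gamma)

a = 1.0
small_gamma = 0.5   # b = 1, so ab = 1 < pi/2
large_gamma = 2.0   # b = 2, so ab = 2 > pi/2

print(residual(0.3, a, small_gamma))             # close to 0: the ODE holds
print(u(a, a, small_gamma), u(-a, a, small_gamma))  # boundary values equal 1
print(u(0.0, a, large_gamma))                    # negative when ab > pi/2
```

In the second regime $u$ still solves the boundary value problem, which is why a bounded solution of (6.1) need not coincide with $v$ without a further assumption.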
We will see below (again in Example 9.1) that the trouble with the last
example is that if $ab > \pi/2$ then $c \equiv \gamma$ is too large, or to be precise, if we let
$$w(x) = E_x\exp\left(\int_0^\tau c(B_s)\,ds\right)$$
then $w \equiv \infty$ in $(-a, a)$. The rest of this section is devoted to showing that if
$w \not\equiv \infty$, then "everything is fine." The development will require several stages.
The first step is to show
(6.3a) Lemma. Let $\theta > 0$. There is a $\mu > 0$ so that if $H$ is an open set with
Lebesgue measure $|H| \le \mu$ and $\tau_H = \inf\{t > 0 : B_t \notin H\}$ then
$$\sup_x E_x(\exp(\theta\tau_H)) \le 2$$
Proof Pick $\gamma > 0$ so that $e^{\theta\gamma} \le 4/3$. Clearly,
$$P_x(\tau_H > \gamma) \le \int_H \frac{1}{(2\pi\gamma)^{d/2}}\,e^{-|x - y|^2/2\gamma}\,dy \le \frac{|H|}{(2\pi\gamma)^{d/2}} \le \frac14$$
if we pick $\mu$ so that $\mu/(2\pi\gamma)^{d/2} \le 1/4$. Using the Markov property as in the
proof of (1.2) in Chapter 3 we can conclude that
$$\begin{aligned}
P_x(\tau_H > k\gamma) &= E_x\left(P_{B_\gamma}(\tau_H > (k - 1)\gamma);\ \tau_H > \gamma\right) \\
&\le \frac14\sup_y P_y(\tau_H > (k - 1)\gamma)
\end{aligned}$$
So it follows by induction that for all integers $k \ge 0$ we have
$$\sup_x P_x(\tau_H > k\gamma) \le \frac{1}{4^k}$$
Since $e^x$ is increasing, and $e^{\theta\gamma} \le 4/3$ by assumption, we have
$$\begin{aligned}
E_x\exp(\theta\tau_H) &\le \sum_{k=1}^\infty \exp(\theta\gamma k)\,P_x((k - 1)\gamma < \tau_H \le k\gamma) \\
&\le \sum_{k=1}^\infty \left(\frac43\right)^k \frac{1}{4^{k-1}} = \frac43\sum_{k=1}^\infty \frac{1}{3^{k-1}} = \frac43 \cdot \frac{1}{1 - \frac13} = 2
\end{aligned}$$
Careful readers may have noticed that we left $\tau_H = 0$ out of the expected value.
However, by the Blumenthal 0-1 law either $P_x(\tau_H = 0) = 0$ in which case our
computation is valid, or $P_x(\tau_H = 0) = 1$ in which case $E_x\exp(\theta\tau_H) = 1$. □
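The constant 2 in (6.3a) comes from a geometric series, and the arithmetic can be confirmed in exact rational arithmetic. The helper name below is hypothetical and the snippet is only a check of the summation step: each term $(4/3)^k 4^{-(k-1)}$ equals $4 \cdot 3^{-k}$, so the partial sums are $2(1 - 3^{-n})$ and increase to 2.

```python
from fractions import Fraction

def partial_sum(n_terms):
    """Sum of (4/3)^k * 4^-(k-1) for k = 1..n_terms, in exact arithmetic."""
    return sum(Fraction(4, 3) ** k / Fraction(4) ** (k - 1)
               for k in range(1, n_terms + 1))

for n in (1, 5, 10):
    # Compare with the closed form 2 * (1 - 3^-n).
    print(n, partial_sum(n), 2 - Fraction(2, 3 ** n))
```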
Let $c^* = \sup_x |c(x)|$. By (6.3a), we can pick $r_0$ so small that if
$T_r = \inf\{t : |B_t - B_0| > r\}$ and $r \le r_0$, then $E_x\exp(c^* T_r) \le 2$ for all $x$.
(6.3b) Lemma. Let $2\delta \le r_0$. If $D(x, 2\delta) \subset G$ and $y \in D(x, \delta)$, then
$$w(y) \le 2^{d+2}w(x)$$
The reason for our interest in this is that it shows $w(x) < \infty$ implies $w(y) < \infty$
for $y \in D(x, \delta)$.
Proof If $D(y, r) \subset G$, and $r \le r_0$ then the strong Markov property implies
$$\begin{aligned}
w(y) &= E_y[\exp(c_{T_r})\,w(B(T_r))] \le E_y[\exp(c^* T_r)\,w(B(T_r))] \\
&= E_y[\exp(c^* T_r)]\int_{\partial D(y, r)} w(z)\,\pi(dz) \le 2\int_{\partial D(y, r)} w(z)\,\pi(dz)
\end{aligned}$$
where $\pi$ is surface measure on $\partial D(y, r)$ normalized to be a probability measure,
since the exit time, $T_r$, and exit location, $B(T_r)$, are independent.
If $\delta \le r_0$ and $D(y, \delta) \subset G$, multiplying the last inequality by $r^{d-1}$ and
integrating from 0 to $\delta$ gives
$$\frac{\delta^d}{d}\,w(y) \le 2 \cdot \frac{1}{\sigma_d}\int_{D(y, \delta)} w(z)\,dz$$
where $\sigma_d$ is the surface area of $\{x : |x| = 1\}$. Rearranging we have
$$(*)\qquad \int_{D(y, \delta)} w(z)\,dz \ge 2^{-1}\,\frac{\delta^d}{C_0}\,w(y)$$
where $C_0 = d/\sigma_d$ is a constant that depends only on $d$.
Repeating the first argument in the proof with $y = x$ and using the fact
that $c_{T_r} \ge -c^* T_r$ gives
$$\begin{aligned}
w(x) &= E_x[\exp(c_{T_r})\,w(B(T_r))] \ge E_x[\exp(-c^* T_r)\,w(B(T_r))] \\
&= E_x[\exp(-c^* T_r)]\int_{\partial D(x, r)} w(z)\,\pi(dz)
\end{aligned}$$
Since $1/x$ is convex, Jensen's inequality implies
$$E_x[\exp(-c^* T_r)] \ge 1/E_x[\exp(c^* T_r)] \ge 1/2$$
Combining the last two displays, multiplying by $r^{d-1}$, and integrating from 0
to $2\delta$ we get the lower bound
$$\frac{(2\delta)^d}{d}\,w(x) \ge 2^{-1}\,\frac{1}{\sigma_d}\int_{D(x, 2\delta)} w(z)\,dz$$
Rearranging and using $D(x, 2\delta) \supset D(y, \delta)$, $w \ge 0$ we have
$$(**)\qquad w(x) \ge 2^{-1}\,\frac{C_0}{(2\delta)^d}\int_{D(y, \delta)} w(z)\,dz$$
where again $C_0 = d/\sigma_d$. Combining (**) and (*) it follows that
$$\begin{aligned}
w(x) &\ge 2^{-1}\,\frac{C_0}{(2\delta)^d}\int_{D(y, \delta)} w(z)\,dz \\
&\ge 2^{-1}\,\frac{C_0}{(2\delta)^d} \cdot 2^{-1}\,\frac{\delta^d}{C_0}\,w(y) = 2^{-(d+2)}w(y)
\end{aligned}$$
□
(6.3b) and a simple covering argument lead to
(6.3c) Theorem. Let $G$ be a connected open set. If $w \not\equiv \infty$ then
$$w(x) < \infty \quad\text{for all } x \in G$$
Proof From (6.3b), we see that if $w(x) < \infty$, $2\delta \le r_0$, and $D(x, 2\delta) \subset G$, then
$w < \infty$ on $D(x, \delta)$. From this result, it follows that $G_0 = \{x : w(x) < \infty\}$ is
an open subset of $G$. To argue now that $G_0$ is also closed (when considered as
a subset of $G$) we observe that if $2\delta \le r_0$, $D(y, 3\delta) \subset G$, and we have $x_n \in G_0$
with $x_n \to y \in G$ then for $n$ sufficiently large, $y \in D(x_n, \delta)$ and $D(x_n, 2\delta) \subset G$,
so $w(y) < \infty$. □
Before we proceed to the uniqueness result, we want to strengthen the last
conclusion.
(6.3d) Theorem. Let $G$ be a connected open set with finite Lebesgue measure,
$|G| < \infty$. If $w \not\equiv \infty$ then
$$\sup_x w(x) < \infty$$
Proof Let $K \subset G$ be compact so that $|G - K| < \mu$, the constant in (6.3a) for
$\theta = c^*$. For each $x \in K$ we can pick a $\delta_x$ so that $2\delta_x \le r_0$ and $D(x, 2\delta_x) \subset G$.
The open sets $D(x, \delta_x)$ cover $K$ so there is a finite subcover $D(x_i, \delta_{x_i})$, $1 \le i \le I$.
Clearly,
$$\sup_{1 \le i \le I} w(x_i) < \infty$$
(6.3b) implies $w(y) \le 2^{d+2}w(x_i)$ for $y \in D(x_i, \delta_{x_i})$, so
$$M = \sup_{y \in K} w(y) < \infty$$
If $y \in H = G - K$, then $E_y(\exp(c^*\tau_H)) \le 2$ by (6.3a) so using the strong
Markov property
$$\begin{aligned}
w(y) &= E_y(\exp(c_{\tau_H});\ \tau_H = \tau) + E_y(\exp(c_{\tau_H})\,w(B_{\tau_H});\ B_{\tau_H} \in K) \\
&\le 2 + M E_y(\exp(c_{\tau_H});\ B_{\tau_H} \in K) \le 2 + 2M
\end{aligned}$$
□
With (6.3d) established, we are now more than ready to prove our
uniqueness result. To simplify the statements of the results that follow we will now
list the assumptions that we will make for the rest of the section.
(A1) $G$ is a bounded connected open set.
(A2) $f$ and $c$ are bounded and continuous.
(A3) $w \not\equiv \infty$.
(6.3) Theorem. If there is a solution of (6.1) that is bounded, it must be
$$v(x) = E_x(f(B_\tau)\exp(c_\tau))$$
Proof (6.2) implies that $M_s = u(B_s)\exp(c_s)$ is a local martingale on $[0, \tau)$.
Since $f$, $c$, and $u$ are bounded, letting $s \uparrow \tau \wedge t$ and using the bounded
convergence theorem gives
$$u(x) = E_x(f(B_\tau)\exp(c_\tau);\ \tau \le t) + E_x(u(B_t)\exp(c_t);\ \tau > t)$$
Since $f$ is bounded and $w(x) = E_x\exp(c_\tau) < \infty$, the dominated convergence
theorem implies that as $t \to \infty$, the first term converges to $E_x(f(B_\tau)\exp(c_\tau))$.
To show that the second term $\to 0$, we begin with the observation that since
$\{\tau > t\} \in \mathcal{F}_t$, the definition of conditional expectation and the Markov property
imply
$$\begin{aligned}
E_x(u(B_t)\exp(c_\tau);\ \tau > t) &= E_x(E_x(u(B_t)\exp(c_\tau) \mid \mathcal{F}_t);\ \tau > t) \\
&= E_x(u(B_t)\exp(c_t)\,w(B_t);\ \tau > t)
\end{aligned}$$
Now we claim that for all $y \in G$
$$w(y) \ge \exp(-c^*)P_y(\tau \le 1) \ge \epsilon > 0$$
The first inequality is trivial. The last two follow easily from (A1). See the first
display in the proof of (1.2) in Chapter 3.
Replacing $w(B_t)$ by $\epsilon$,
$$\begin{aligned}
E_x(|u(B_t)|\exp(c_t);\ \tau > t) &\le \epsilon^{-1}E_x(|u(B_t)|\exp(c_\tau);\ \tau > t) \\
&\le \epsilon^{-1}\|u\|_\infty E_x(\exp(c_\tau);\ \tau > t) \to 0
\end{aligned}$$
as $t \to \infty$, by the dominated convergence theorem since $w(x) = E_x\exp(c_\tau) < \infty$
and $P_x(\tau < \infty) = 1$. Going back to the first equation in the proof, we have
shown $u(x) = v(x)$ and the proof is complete. □
This completes our consideration of uniqueness. The next stage in our
program, fortunately, is as easy as it always has been. Recall that here and in
what follows we are assuming (A1)-(A3).
(6.4) Theorem. If $v \in C^2$, then it satisfies (6.1a) in $G$.
Proof The Markov property implies that on $\{\tau > s\}$,
$$E_x(\exp(c_\tau)f(B_\tau) \mid \mathcal{F}_s) = \exp(c_s)E_{B_s}(\exp(c_\tau)f(B_\tau)) = \exp(c_s)\,v(B_s)$$
The left-hand side is a local martingale on $[0, \tau)$, so the right-hand side is also.
If $v \in C^2$, then repeating the calculation in the proof of (6.2) shows that for
$s \in [0, \tau)$,
$$v(B_s)\exp(c_s) - v(B_0) = \int_0^s \left(\tfrac12\Delta v + cv\right)(B_r)\exp(c_r)\,dr + \text{a local martingale}$$
The left-hand side is a local martingale on $[0, \tau)$, so the integral on the right-hand
side is also. However, the integral is continuous and locally of bounded
variation, so by (3.3) in Chapter 2 it must be $\equiv 0$. Since $v \in C^2$ and $c$ is
continuous, it follows that $\tfrac12\Delta v + cv \equiv 0$, for if it were $\neq 0$ at some point then
we would have a contradiction. □
Having proved (6.4), the next step is to consider the boundary condition.
As in the last two sections, we need the boundary to be regular.
(6.5) Theorem. $v$ satisfies (6.1b) at each regular point of $\partial G$.
Proof Let $y$ be a regular point of $\partial G$. We showed in (4.5a) and (4.5b) that
if $x_n \to y$, then $P_{x_n}(\tau < \delta) \to 1$ and $P_{x_n}(B_\tau \in D(y,\delta)) \to 1$ for all $\delta > 0$.
Since $c$ is bounded and $f$ is bounded and continuous, the bounded convergence
theorem implies that
$$E_{x_n}(\exp(c_\tau)f(B_\tau);\ \tau \le 1) \to f(y)$$
To control the contribution from the rest of the space, we observe that if $|c| \le M$
then using the Markov property and the boundedness of $w$ established in (6.3d)
we have
$$E_{x_n}(\exp(c_\tau)f(B_\tau);\ \tau > 1) \le e^M\|f\|_\infty E_{x_n}(w(B_1);\ \tau > 1) \le e^M\|f\|_\infty\|w\|_\infty P_{x_n}(\tau > 1) \to 0 \qquad \Box$$
This brings us finally to the problem of determining when $v$ is smooth
enough to be a solution. We use the same trick used in Section 4.3 to reduce
to the previous case. We begin with the identity established there, which holds
for all $t$ and Brownian paths $\omega$ and, hence, holds when $t = \tau(\omega)$
$$\exp\left(\int_0^\tau c(B_s)\,ds\right) = 1 + \int_0^\tau c(B_s)\exp\left(\int_s^\tau c(B_r)\,dr\right)ds$$
Multiplying by $f(B_\tau)$ and taking expected values gives
$$v(x) = E_x f(B_\tau) + \int_0^\infty E_x\left(c(B_s)\exp\left(\int_s^\tau c(B_r)\,dr\right)f(B_\tau)\,1_{(s<\tau)}\right)ds$$
Conditioning on $\mathcal{F}_s$ and using the Markov property, we can write the above as
$$v(x) = E_x f(B_\tau) + \int_0^\infty E_x(c(B_s)v(B_s);\ \tau > s)\,ds = v_1(x) + v_2(x)$$
The first term, $v_1(x)$, is $C^\infty$ by (4.6). The second term is
$$v_2(x) = E_x\left(\int_0^\tau c(B_s)v(B_s)\,ds\right)$$
so if we let $g(x) = c(x)v(x)$ then we can apply results from the last section. If $c$
and $f$ are bounded and $w \not\equiv \infty$, then $v$ is bounded by (6.3d), so it follows from
results in the last section that $v_2$ is $C^1$ and has a bounded derivative.
Since $v_1 \in C^\infty$ and $G$ is bounded, it follows that $v$ is $C^1$ and has a bounded
derivative. If $c$ is Holder continuous, then $g(x) = c(x)v(x)$ is Holder continuous,
and we can use (5.6b) from the last section to conclude $v_2 \in C^2$ and hence
(6.6) Theorem. If in addition to (A1)-(A3), $c$ is Holder continuous, then
$v \in C^2$ and, hence, satisfies (6.1a).
C. Applications to Brownian Motion
In the next three sections we will use the p.d.e. results proved in the last
three to derive some formulas for Brownian motion. The first two sections are
closely related but the third can be read independently.
4.7. Exit Distributions for the Ball
In this section, we will use results for the Dirichlet problem proved in Section
4.4 to find the exit distributions for D = {x : \x\ < 1}. Our main result is
(7.1) Theorem. If $f$ is bounded and measurable, then
(*)  $$E_x f(B_\tau) = \int_{\partial D} \frac{1-|x|^2}{|x-y|^d}\,f(y)\,\pi(dy)$$
where $\pi$ is surface measure on $\partial D$ normalized to be a probability measure.
Proof An application of the monotone class theorem shows that if (*) holds
for $f \in C^\infty$ it is valid for bounded measurable $f$. In view of (4.3), we can prove
(*) for $f \in C^\infty$ by showing that if $k_y(x) = (1-|x|^2)/|x-y|^d$ and
$$v(x) = \begin{cases} \int_{\partial D} k_y(x)f(y)\,\pi(dy) & x \in D \\ f(x) & x \in \partial D \end{cases}$$
then $v$ solves the Dirichlet problem (4.1):
(7.2a) In $D$, $v \in C^2$ and $\Delta v = 0$.
(7.2b) At each point of $\partial D$, $v$ is continuous and $v(x) = f(x)$.
The first, somewhat painful, step in doing this is to show
(7.3) Lemma. If $y \in \partial D$ then $\Delta k_y = 0$ in $D$.
Proof To warm up for this, we observe
$$D_i|x-y|^p = D_i\Big(\sum_j (x_j-y_j)^2\Big)^{p/2} = p\,|x-y|^{p-2}(x_i-y_i)$$
so taking $p = -d$ we have
$$D_i k_y(x) = \frac{-2x_i}{|x-y|^d} + (1-|x|^2)\cdot\frac{-d\,(x_i-y_i)}{|x-y|^{d+2}}$$
Differentiating again and using our fact with $p = -(d+2)$ gives
$$D_{ii}k_y(x) = \frac{-2}{|x-y|^d} + 2\cdot(-2x_i)\cdot\frac{-d\,(x_i-y_i)}{|x-y|^{d+2}} + (1-|x|^2)\left\{\frac{d(d+2)(x_i-y_i)^2}{|x-y|^{d+4}} - \frac{d}{|x-y|^{d+2}}\right\}$$
Summing the last expression on $i$ gives
$$\Delta k_y(x) = \frac{-2d}{|x-y|^d} + 4d\cdot\frac{|x|^2 - x\cdot y}{|x-y|^{d+2}} + (1-|x|^2)\left\{\frac{d(d+2)}{|x-y|^{d+2}} - \frac{d^2}{|x-y|^{d+2}}\right\}$$
$$= \frac{-2d\,|x-y|^2 + 4d\,(|x|^2 - x\cdot y) + (1-|x|^2)\cdot 2d}{|x-y|^{d+2}}$$
If we replace the 1 by $|y|^2$, the expression collapses
$$\Delta k_y(x) = \frac{2d}{|x-y|^{d+2}}\left(-|x-y|^2 + 2|x|^2 - 2x\cdot y + |y|^2 - |x|^2\right) = 0$$
since $|x-y|^2 = (x-y)\cdot(x-y) = |x|^2 - 2x\cdot y + |y|^2$. □
Inside the open set $D$, $v$ is a linear combination of the $k_y$. So bringing the
differentiation under the integral (and leaving it to the reader to justify this
using (1.7)) gives
$$\Delta v(x) = \int_{\partial D} \pi(dy)\,f(y)\,\Delta k_y(x) = 0$$
Thus, $v$ satisfies (7.2a). To check (7.2b), the first step is to show
$$I(x) \equiv \int_{\partial D} \frac{1-|x|^2}{|x-y|^d}\,\pi(dy) = 1$$
This is just "calculus," but we prefer to use a soft noncomputational approach
instead. We begin by observing that $I(0) = 1$, $I$ is invariant under rotations,
and $\Delta I = 0$ in $D$. (For the last conclusion apply the result for $\Delta v$ with $f \equiv 1$.)
To conclude $I \equiv 1$ now, let $x \in D$ with $|x| = r < 1$, and let $\tau = \inf\{t : |B_t| > r\}$.
Applying (1.2) of Chapter 3 with $G = D(0,r)$ now shows
$$I(0) = E_0 I(B_\tau) = I(x)$$
where the second equality follows from invariance under rotations.
To show that $v(x) \to f(y)$ as $x \to y \in \partial D$, we observe that if $z \ne y$,
$$k_z(x) = \frac{1-|x|^2}{|x-z|^d} \to 0$$
From the last calculation it is clear that if $\delta > 0$, then the convergence is uniform
for $z \in B_0(\delta) = \partial D - D(y,\delta)$. Thus, if we let $B_1(\delta) = \partial D - B_0(\delta)$, then
$$\int_{B_0(\delta)} k_z(x)\,\pi(dz) \to 0 \quad\text{and}\quad \int_{B_1(\delta)} k_z(x)\,\pi(dz) \le 1$$
since $I(x) = 1$. To prove that $v(x) \to f(y)$ now we observe
$$|v(x) - f(y)| = \left|\int k_z(x)f(z)\,\pi(dz) - f(y)\int k_z(x)\,\pi(dz)\right| \le 2\|f\|_\infty \int_{B_0(\delta)} k_z(x)\,\pi(dz) + \sup_{z \in B_1(\delta)} |f(z) - f(y)|$$
The first term $\to 0$ as $x \to y$, while the sup in the second is small if $\delta$ is. This
shows that (7.2b) holds and completes the proof of (7.1). □
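The identity $I(x) = 1$ and formula (*) can be sanity-checked numerically in $d = 2$, where the normalized surface measure on the unit circle is $d\theta/2\pi$. The following is a minimal sketch, not part of the original text; the function names `poisson_kernel` and `exit_expectation` and the choice of test point are illustrative assumptions.

```python
import math

def poisson_kernel(x, y, d=2):
    """k_y(x) = (1 - |x|^2) / |x - y|^d for x in the unit ball and y on the sphere."""
    norm_x_sq = sum(c * c for c in x)
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return (1.0 - norm_x_sq) / dist ** d

def exit_expectation(x, f, n=20000):
    """Approximate the right-hand side of (*) in d = 2 by averaging
    k_y(x) f(y) over n equally spaced points y on the unit circle."""
    total = 0.0
    for j in range(n):
        theta = 2.0 * math.pi * (j + 0.5) / n
        y = (math.cos(theta), math.sin(theta))
        total += poisson_kernel(x, y) * f(y)
    return total / n

x = (0.3, -0.5)
print(exit_expectation(x, lambda y: 1.0))   # should be close to I(x) = 1
print(exit_expectation(x, lambda y: y[0]))  # should be close to x[0], since y -> y_1 is harmonic
```

The second call uses the fact that a coordinate function is harmonic, so its value at $x$ must equal the average of its boundary values against the exit distribution.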
The derivation given above is a little unsatisfying, since it starts with the
answer and then verifies it, but it is simpler than messing around with Kelvin's
transformations (see pages 100-103 in Port and Stone (1978)). It also has the
merit of explaining why $k_y(x)$ is the probability density of exiting at $y$: $k_y$
is a nonnegative harmonic function that has $k_y(0) = 1$ and $k_y(x) \to 0$ when
$x \to z \in \partial D$ and $z \ne y$.
Exercise 7.1. The point of this exercise is to apply the reasoning of this section
to $\tau = \inf\{t : B_t \notin H\}$ where $H = \{(x,y) : x \in \mathbf{R}^{d-1},\ y > 0\}$. For $\theta \in \mathbf{R}^{d-1}$ let
$$h_\theta(x,y) = \frac{C_d\,y}{(|x-\theta|^2 + y^2)^{d/2}}$$
where $C_d$ is chosen so that $\int d\theta\,h_\theta(0,1) = 1$ and let
$$u(x,y) = \int d\theta\,h_\theta(x,y)\,f(\theta,0)$$
where $f$ is bounded and continuous.
(a) Show that $\Delta h_\theta = 0$ in $H$ and use (1.7) to conclude $\Delta u = 0$ in $H$.
(b) Show $I(x,y) = \int d\theta\,h_\theta(x,y) = 1$.
(c) Show that if $x_n \to x$, $y_n \to 0$ then $u(x_n,y_n) \to f(x,0)$.
(d) Conclude that $E_{(x,y)}f(B_\tau) = \int d\theta\,h_\theta(x,y)\,f(\theta,0)$.
4.8. Occupation Times for the Ball
In the last section we considered how $B_t$ leaves $D = \{x : |x| < 1\}$. In this section
we will investigate how it spends its time before it leaves. Let $\tau = \inf\{t : B_t \notin D\}$
and let $G(x,y)$ be the potential kernel defined in (2.7) of Chapter 3. That
is,
$$G(x,y) = \begin{cases} -|x-y| & d = 1 \\ -\frac{1}{\pi}\log|x-y| & d = 2 \\ C_d\,|x-y|^{2-d} & d \ge 3 \end{cases}$$
where $C_d = \Gamma(d/2 - 1)/2\pi^{d/2}$.
(8.1) Theorem. If $g$ is bounded and measurable then
$$E_x \int_0^\tau g(B_t)\,dt = \int G_D(x,y)\,g(y)\,dy$$
where
$$G_D(x,y) = G(x,y) - \int_{\partial D} \frac{1-|x|^2}{|x-z|^d}\,G(z,y)\,\pi(dz)$$
Proof Combine (*) in Section 4.5 with (7.1). □
We think of $G_D(x,y)$ as the expected amount of time a Brownian motion
starting at $x$ spends at $y$ before exiting $D$. To be precise, if $A \subset D$ then the
expected amount of time $B_t$ spends in $A$ before exiting $D$ is $\int_A G_D(x,y)\,dy$.
Our task in this section is to compute $G_D(x,y)$. In $d = 1$, where $D = (-1,1)$,
(8.1) tells us that
$$G_D(x,y) = G(x,y) - \frac{1+x}{2}\,G(1,y) - \frac{1-x}{2}\,G(-1,y)$$
$$= -|x-y| + \frac{1+x}{2}(1-y) + \frac{1-x}{2}(y+1)$$
$$= -|x-y| + 1 - xy$$
Considering the two cases $x \ge y$ and $x \le y$ leads to
(8.2)  $$G_D(x,y) = \begin{cases} (1-y)(1+x) & -1 \le x \le y \le 1 \\ (1-x)(1+y) & -1 \le y \le x \le 1 \end{cases}$$
Geometrically, if we fix $y$ then $x \to G_D(x,y)$ is determined by the conditions
that it is linear on $[-1,y]$ and on $[y,1]$ with $G_D(-1,y) = G_D(1,y) = 0$, and
$G_D(y,y) = 1 - y^2$.
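As a quick check of the one-dimensional formulas, the sketch below compares the expression $-|x-y| + 1 - xy$ with the product form in (8.2), and integrates $G_D(x,\cdot)$ over $(-1,1)$. By (8.1) with $g \equiv 1$ that integral is $E_x\tau$, which for the interval $(-1,1)$ is known to equal $1 - x^2$. The function names are illustrative assumptions, not notation from the text.

```python
def green_interval(x, y):
    """G_D(x, y) = -|x - y| + 1 - x*y on D = (-1, 1)."""
    return -abs(x - y) + 1.0 - x * y

def green_interval_cases(x, y):
    """Product form of (8.2), split according to x <= y or y <= x."""
    return (1 - y) * (1 + x) if x <= y else (1 - x) * (1 + y)

def expected_exit_time(x, n=100000):
    """Midpoint-rule approximation of the integral of G_D(x, y) over y in (-1, 1)."""
    h = 2.0 / n
    return sum(green_interval(x, -1.0 + (j + 0.5) * h) for j in range(n)) * h

print(green_interval(0.2, 0.7), green_interval_cases(0.2, 0.7))  # the two forms agree
print(expected_exit_time(0.4), 1 - 0.4 ** 2)                     # both approximate E_x(tau)
```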
In $d \ge 2$, (8.1) works well when $y = 0$ for then $G(z,0)$ is constant on
$\{|z| = 1\}$ and the expression in (8.1) reduces to
(8.3)  $$G_D(x,0) = G(x,0) - G(z,0)\big|_{|z|=1} = \begin{cases} -\frac{1}{\pi}\log|x| & d = 2 \\ C_d\,(|x|^{2-d} - 1) & d \ge 3 \end{cases}$$
To get an explicit formula for $G_D(x,y)$ in $d \ge 2$ for $y \ne 0$ we will cheat and
look up the answer. This weakness of character has the advantage of making
the point that $G_D(x,y)$ is nothing more than the Green's function for $D$ with
Dirichlet boundary conditions. Folland (1976), page 109, defines the Green's
function for $D$ to be the function $K(x,y)$ on $D \times D$ determined by the following
properties:
(A) For each $y \in D$, $K(\cdot,y) - G(\cdot,y)$ is $C^2$ and harmonic in $D$.
(B) For each $y \in D$, if $x_n \to x \in \partial D$, $K(x_n,y) \to 0$.
Remark. For convenience, we have changed Folland's notation to conform
to ours and interchanged the roles of x and y. This interchange makes no
difference, since the Green's function is symmetric, that is, K(x,y) = K(y,x).
(See Folland (1976), page 110.)
(8.4) Lemma. The occupation time density $G_D$ is equal to the Green's function
$K$ defined above.
Proof Using (8.1) and (7.1) we have
$$G_D(x,y) - G(x,y) = -\int_{\partial D} \frac{1-|x|^2}{|x-z|^d}\,G(z,y)\,\pi(dz) = -E_x G(B_\tau,y)$$
This is harmonic in $D$ by (4.6). (4.5) implies if $x_n \to x \in \partial D$, then
$$E_{x_n}G(B_\tau,y) \to G(x,y) \qquad \Box$$
Having made the connection in (8.4), we can now find $G_D$ by "guessing
and verifying" functions with properties (A) and (B). To "guess," we turn to
page 123 in Folland (1976) and find
(8.5) Theorem. In $d \ge 3$, if $0 < |y| < 1$ then
$$G_D(x,y) = G(x,y) - |y|^{2-d}\,G(x, y/|y|^2)$$
Proof To check (A) and (B), we observe that if we let $\Delta_x$ denote the Laplacian
acting in the $x$ variable for fixed $y$ then
(*)  $$\Delta_x G(x,y) = 0 \quad\text{for } x \ne y$$
(a) Since $y/|y|^2 \notin D$, (*) implies that the second term is harmonic for $x \in D$.
(b) Let $x \in \partial D$. Clearly, the right-hand side is continuous at $x$. To see that it
vanishes, we note
$$G(x,y) - |y|^{2-d}\,G(x, y/|y|^2) = \frac{C_d}{|x-y|^{d-2}} - \frac{C_d}{|y|^{d-2}\,\big|x - y/|y|^2\big|^{d-2}} = \frac{C_d}{|x-y|^{d-2}} - \frac{C_d}{\big|\,x|y| - y|y|^{-1}\big|^{d-2}} = 0$$
The last equality follows from a fact useful for the next proof:
(8.6) Lemma. If $|x| = 1$ then $\big|\,x|y| - y|y|^{-1}\big| = |x-y|$.
Proof Using $|z|^2 = z\cdot z$ and then $|x|^2 = 1$ we have
$$\big|\,x|y| - y|y|^{-1}\big|^2 = |x|^2|y|^2 - 2x\cdot y + 1 = |y|^2 - 2x\cdot y + |x|^2 = |x-y|^2 \qquad \Box$$
This completes the proof of (8.5). □
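Property (B) for the formula in (8.5) is easy to test numerically. The sketch below, which assumes $d = 3$ by default and uses the constant $C_d = \Gamma(d/2-1)/2\pi^{d/2}$ quoted earlier in this section, evaluates $G_D$ at a boundary point and at an interior point. The names `G` and `green_ball` are hypothetical helpers introduced only for this illustration.

```python
import math

def G(x, y, d=3):
    """Potential kernel C_d |x - y|^(2 - d) for d >= 3."""
    c_d = math.gamma(d / 2 - 1) / (2 * math.pi ** (d / 2))
    return c_d * math.dist(x, y) ** (2 - d)

def green_ball(x, y, d=3):
    """Formula (8.5): G(x, y) - |y|^(2-d) G(x, y/|y|^2), for 0 < |y| < 1."""
    ny = math.sqrt(sum(c * c for c in y))
    y_star = tuple(c / ny ** 2 for c in y)  # reflection of y through the unit sphere
    return G(x, y, d) - ny ** (2 - d) * G(x, y_star, d)

y = (0.2, -0.3, 0.4)
print(green_ball((0.6, 0.0, 0.8), y))  # x on the unit sphere: value should be ~0
print(green_ball((0.1, 0.2, 0.0), y))  # x inside the ball: value should be positive
```

The same function also lets one observe the symmetry $K(x,y) = K(y,x)$ mentioned in the Remark above by swapping the two interior arguments.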
Again turning to page 123 of Folland (1976) we "guess"
(8.7) Theorem. In $d = 2$ if $0 < |y| < 1$ then
$$G_D(x,y) = -\frac{1}{\pi}\left(\ln|x-y| - \ln\big|\,x|y| - y|y|^{-1}\big|\right)$$
Proof Again we need to check (A) and (B).
(a)  $$G_D(x,y) - G(x,y) = \frac{1}{\pi}\left(\ln\Big|x - \frac{y}{|y|^2}\Big| + \ln|y|\right)$$
Again the first term is harmonic by (*) since $y/|y|^2 \notin D$. The second term does
not depend on $x$ and hence is trivially harmonic. (b) Let $x \in \partial D$. Clearly, the
right-hand side is continuous at $x$. The fact that it vanishes follows immediately
from (8.6). □
4.9. Laplace Transforms, Arcsine Law
In this section we will apply results from Section 4.6 to do three things:
(a) Complete the discussion of Example 6.1.
(b) Prove a remarkable observation of Ciesielski and Taylor (1962) that for
Brownian motion starting at 0, the distribution of the exit time from D = {x :
\x\ < 1} in dimension d is the same as that of the total occupation time of D
in dimension d + 2.
(c) Prove Levy's arcsine law for the occupation time of [0, oo).
The third topic is independent of the first two.
Example 9.1. Take $d = 1$, $G = (-a,a)$, $c(x) \equiv \gamma > 0$, and $f \equiv 1$ in the
problem considered in Section 4.6.
(9.1) Theorem. Let $\tau = \inf\{t : B_t \notin (-a,a)\}$. If $0 < \gamma < \pi^2/8a^2$, then
$$E_x e^{\gamma\tau} = \frac{\cos(x\sqrt{2\gamma})}{\cos(a\sqrt{2\gamma})}$$
If $\gamma \ge \pi^2/8a^2$ then $E_x e^{\gamma\tau} \equiv \infty$.
Proof By Example 6.1,
$$u(x) = \cos(x\sqrt{2\gamma})/\cos(a\sqrt{2\gamma})$$
is a nonnegative solution of
$$\tfrac12 u'' + \gamma u = 0 \qquad u(-a) = u(a) = 1$$
To check that $w \not\equiv \infty$, we observe that (6.2) implies that $M_t = u(B_t)e^{\gamma t}$ is a
local martingale on $[0,\tau)$. If we let $T_n$ be a sequence of stopping times that
reduces $M_t$, then
$$u(x) = E_x\left(u(B_{T_n\wedge n})\,e^{\gamma(T_n\wedge n)}\right)$$
Letting $n \to \infty$ and noting $T_n \wedge n \uparrow \tau$, it follows from Fatou's lemma that
$$u(x) \ge E_x e^{\gamma\tau}$$
The last equation implies $w \not\equiv \infty$, so (6.3) implies the result for $\gamma < \pi^2/8a^2$.
To see that $w(x) \equiv \infty$ when $\gamma \ge \pi^2/8a^2$, suppose not. In this case (6.3)
implies $v(x) = E_x(f(B_\tau)\exp(c_\tau))$ is the unique solution of our equation but
computations in Example 6.1 imply that in this case there is no nonnegative
solution. □
Exercise 9.1. Show that if $\beta > 0$ then
$$E_x\exp(-\beta\tau) = \frac{\cosh(x\sqrt{2\beta})}{\cosh(a\sqrt{2\beta})}$$
Since $\cos(z) = (e^{iz} + e^{-iz})/2$ this is what we get if we set $\gamma = -\beta$ in (9.1).
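The two formulas can be checked against the differential equation they are supposed to solve. The sketch below evaluates the cosine form for $0 \le \gamma < \pi^2/8a^2$ and the hyperbolic cosine form for $\gamma = -\beta < 0$, then estimates $\tfrac12 u'' + \gamma u$ by a central difference. The function names and the step size `h` are assumptions made for this illustration only.

```python
import math

def exit_mgf(x, a, gamma):
    """E_x exp(gamma * tau) for tau the exit time from (-a, a).

    Uses (9.1) for gamma >= 0 and the formula of Exercise 9.1 for gamma = -beta < 0;
    returns infinity when gamma >= pi^2 / (8 a^2).
    """
    if gamma >= math.pi ** 2 / (8 * a ** 2):
        return math.inf
    if gamma >= 0:
        s = math.sqrt(2 * gamma)
        return math.cos(x * s) / math.cos(a * s)
    s = math.sqrt(-2 * gamma)
    return math.cosh(x * s) / math.cosh(a * s)

def ode_residual(x, a, gamma, h=1e-4):
    """Central-difference estimate of (1/2) u'' + gamma * u at x; should be near 0."""
    u = lambda z: exit_mgf(z, a, gamma)
    second = (u(x + h) - 2 * u(x) + u(x - h)) / h ** 2
    return 0.5 * second + gamma * u(x)

print(ode_residual(0.3, 1.0, 0.5))   # small: the cosine formula solves the equation
print(ode_residual(0.3, 1.0, -2.0))  # small: so does the cosh formula
```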
Example 9.2. Our second topic is the observation of Ciesielski and Taylor
(1962). The proof of the general result, see Getoor and Sharpe (1979), sections
5 and 8, or Knight (1981), pages 88-89, requires more than a little familiarity
with Bessel functions, so we will only show that the distribution of the exit
time from (—1,1) in one dimension starting from 0 is the same as that of the
total occupation time of D = {x : \x\ < 1} in three dimensions starting from
the origin.
Exercise 9.1 gives the Laplace transform of the exit time from (—1,1) so
we need only compute
$$v(x) = E_x\exp\left(-\beta\int_0^\infty 1_D(B_t)\,dt\right)$$
Of course, we only have to compute $v(0)$ but as in Example 9.1, and Exercise
9.1, we will do this by using a differential equation to compute $v(x)$ for all $x$.
Several properties of $v$ are immediately obvious (in any dimension $d \ge 3$):
(i) Spherical symmetry implies $v(x) = f(|x|)$.
(ii) Using the strong Markov property at the hitting time of $D$ and (1.12) from
Chapter 3 gives for $r > 1$
(a)  $$f(r) = r^{2-d}f(1) + (1 - r^{2-d})\cdot 1 = 1 + r^{2-d}\cdot(f(1) - 1)$$
(iii) The strong Markov property and (6.3) imply that $v$ is the unique solution
of
$$\tfrac12\Delta v - \beta v = 0 \quad\text{for } x \in D \qquad v(x) = f(1) \quad\text{for } x \in \partial D$$
To express the last equation in terms of $f$ we note that if $v(x) = f(|x|)$ then
$$D_i v(x) = f'(|x|)\,\frac{x_i}{|x|}$$
$$D_{ii}v(x) = f'(|x|)\left(\frac{1}{|x|} - \frac{x_i^2}{|x|^3}\right) + f''(|x|)\,\frac{x_i^2}{|x|^2}$$
Summing over $i$ gives
$$\tfrac12\Delta v = \tfrac12 f''(|x|) + \frac{d-1}{2|x|}\,f'(|x|)$$
Multiplying by 2 we see that $f$ satisfies
(b)  $$f''(r) + \frac{d-1}{r}\,f'(r) - 2\beta f(r) = 0 \quad\text{for } r < 1$$
To tie equations (a) and (b) together, we note that by applying the reasoning
in (iii) to $D(0,\rho)$ with $\rho > 1$ and using the proof of (6.6) (which refers us to
(5.6a)) we can conclude
(iv) $v$ is $C^1$ and hence $f'(r)$ is continuous at $r = 1$.
Facts (ii)-(iv) give us the information we need to solve for $f$ at least in
principle: (b) is a second order ordinary differential equation, so if we specify
$f(0) = C$ and $f'(0) = 0$, the latter from (i), there is a unique solution $f_C$ on
$[0,1]$. Given $f_C(1)$, (a) gives us the solution on $[1,\infty)$. We then pick $C$ so that
(iv) holds.
To carry out our plan, we begin with the "well known" fact that the only
solutions to (b) which stay bounded near 0 have the form $c\,r^{1-(d/2)}I_{(d/2)-1}(r\sqrt{2\beta})$
where
$$I_\nu(z) = \sum_{m=0}^\infty \frac{(z/2)^{\nu+2m}}{m!\,\Gamma(\nu+m+1)}$$
is one of the happy families of Bessel functions. Letting $a = \sqrt{2\beta}$ and recalling
$\Gamma(x) = (x-1)\Gamma(x-1)$, we see that when $d = 3$ and hence $\nu = 1/2$ the solution
has a simple form
$$f(r) = C(a)\sum_{m=0}^\infty \frac{(ar)^{2m}}{(2m+1)!} = \frac{C(a)}{ar}\,\sinh(ar)$$
Having "found" the solution we want, we can now forget about its origins
and simply check that it works. Differentiating we have
$$f'(r) = \frac{C(a)}{r}\cosh(ar) - \frac{C(a)}{ar^2}\sinh(ar)$$
$$f''(r) = \frac{C(a)\,a}{r}\sinh(ar) - \frac{2C(a)}{r^2}\cosh(ar) + \frac{2C(a)}{ar^3}\sinh(ar)$$
$$f''(r) + \frac{2}{r}f'(r) = \frac{C(a)\,a}{r}\sinh(ar) = a^2 f(r)$$
which shows that (b) holds.
To complete the solution, we have to pick $C(a)$ to make $f \in C^1$. Using
formulas for $f'(r)$ for $r < 1$ from the previous display, for $r > 1$ from (a), and
recalling $d = 3$, we have
$$f'(1-) = C(a)\cosh(a) - C(a)\,a^{-1}\sinh(a)$$
$$f'(1+) = -\{f(1) - 1\} = 1 - C(a)\,a^{-1}\sinh(a)$$
Solving gives $C(a) = 1/\cosh(a)$. Since $a = \sqrt{2\beta}$ this matches the result in
Exercise 9.1 when $x = 0$. □
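The matching argument can be replayed numerically. The sketch below assumes $d = 3$ and takes $f(r) = \sinh(ar)/(ar\cosh(a))$ inside the ball, with $a = \sqrt{2\beta}$, and the formula from (a) outside. It then compares the two one-sided derivatives at $r = 1$ and the value near $r = 0$ with $1/\cosh(a)$. The helper names and the finite-difference step are assumptions for this illustration.

```python
import math

def f_inside(r, beta):
    """f(r) = C(a) sinh(a r) / (a r) with a = sqrt(2 beta), C(a) = 1 / cosh(a); 0 < r <= 1."""
    a = math.sqrt(2 * beta)
    return math.sinh(a * r) / (a * r * math.cosh(a))

def f_outside(r, beta):
    """Equation (a) with d = 3: f(r) = 1 + (f(1) - 1) / r for r >= 1."""
    return 1.0 + (f_inside(1.0, beta) - 1.0) / r

def derivative(g, r, h=1e-6):
    """Central-difference derivative of the formula g at r."""
    return (g(r + h) - g(r - h)) / (2 * h)

beta = 1.5
left = derivative(lambda r: f_inside(r, beta), 1.0)
right = derivative(lambda r: f_outside(r, beta), 1.0)
print(left, right)                                              # should agree: f is C^1 at r = 1
print(f_inside(1e-6, beta), 1 / math.cosh(math.sqrt(2 * beta)))  # v(0) = 1 / cosh(a)
```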
Example 9.3. Our third and final topic is Kac's (1951) derivation of
(9.2) Levy's arcsine law. Let $H_t = |\{s \in [0,t] : B_s \ge 0\}|$. If $0 < \theta < 1$ then
$$P_0(H_t \le \theta t) = \frac{1}{\pi}\int_0^\theta \frac{dr}{\sqrt{r(1-r)}} = \frac{2}{\pi}\arcsin(\sqrt{\theta})$$
Remark. The reader should note that by scaling the distribution of $H_t/t$ does
not depend on $t$, so the fraction of time in $[0,t]$ that a Brownian gambler is
ahead does not converge to 1/2 in probability as $t \to \infty$. Indeed $r = 1/2$ is the
value for which the probability density $1/\pi\sqrt{r(1-r)}$ is smallest.
Proof For the last equality see the other arcsine law, (4.2) in Chapter 1. To
prove the first we start with
(9.3) Lemma. Let $c(x) = -\alpha - \beta 1_{[0,\infty)}(x)$ with $\alpha, \beta > 0$. Suppose $v$ is
bounded, $C^1$, and satisfies
$$\tfrac12 v'' + cv = -1$$
for all $x \ne 0$. Then
$$v(x) = \int_0^\infty dt\,e^{-\alpha t}\,E_x e^{-\beta H_t}$$
Proof Our assumptions about $v$ imply that $v''(x) = -2(1 + c(x)v(x))\,dx$ in
the sense of distribution. So two results from Chapter 2, the Meyer-Tanaka
formula, (11.4), and (11.7) imply
$$v(B_t) - v(B_0) = \int_0^t v'(B_s)\,dB_s - \int_0^t (1 + c(B_s)v(B_s))\,ds$$
Letting $c_t = \int_0^t c(B_s)\,ds$ and using the integration by parts formula, (10.1) in
Chapter 2, with $X_t = v(B_t)$ and $Y_t = \exp(c_t)$, which is locally of bounded
variation, we have
$$v(B_t)\exp(c_t) - v(B_0) = \int_0^t \exp(c_s)v'(B_s)\,dB_s - \int_0^t \exp(c_s)\{1 + c(B_s)v(B_s)\}\,ds + \int_0^t v(B_s)\exp(c_s)\,dc_s$$
So $M_t = v(B_t)\exp(c_t) + \int_0^t \exp(c_s)\,ds$ is a local martingale. Since $v$ is bounded
and $c_t \le 0$, $M_t$ is bounded. As $t \to \infty$, $\exp(c_t) \le e^{-\alpha t} \to 0$, so using the
martingale and bounded convergence theorems gives
$$v(x) = E_x M_0 = E_x M_\infty = E_x\int_0^\infty \exp(c_s)\,ds$$
Plugging in the definition of $c(x)$ now and using Fubini's theorem leads to the
formula given above. □
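(9.3) tells us that we want to find a bounded $C^1$ function $v$ with
$$(\alpha + \beta)v = \tfrac12 v'' + 1 \qquad x > 0$$
$$\alpha v = \tfrac12 v'' + 1 \qquad x < 0$$
To solve the equation $\gamma v = \tfrac12 v'' + 1$ we write $v = v_0 + v_1$ where $v_1(x) = 1/\gamma$
and note that
$$\gamma v_0 = \tfrac12 v_0''$$
so we have $v_0(x) = Ce^{\pm x\sqrt{2\gamma}}$. Picking the signs to keep $v$ bounded, we have
$$v(x) = \begin{cases} A\,e^{-x\sqrt{2(\alpha+\beta)}} + \dfrac{1}{\alpha+\beta} & x > 0 \\ B\,e^{x\sqrt{2\alpha}} + \dfrac{1}{\alpha} & x < 0 \end{cases}$$
To find $A$ and $B$ we note that we want $v$ to be $C^1$ so
$$A + \frac{1}{\alpha+\beta} = B + \frac{1}{\alpha}$$
$$-A\sqrt{2(\alpha+\beta)} = B\sqrt{2\alpha}$$
It is a little tedious to solve these equations but it is easy to check that
$$A = \frac{\sqrt{\alpha+\beta} - \sqrt{\alpha}}{(\alpha+\beta)\sqrt{\alpha}} \qquad B = \frac{\sqrt{\alpha} - \sqrt{\alpha+\beta}}{\alpha\sqrt{\alpha+\beta}}$$
gives a solution. What we are interested in is
(9.4)  $$v(0) = A + \frac{1}{\alpha+\beta} = \frac{1}{\sqrt{\alpha(\alpha+\beta)}}$$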
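The two matching conditions and the value in (9.4) can be verified directly. The sketch below takes $A = (\sqrt{\alpha+\beta} - \sqrt{\alpha})/((\alpha+\beta)\sqrt{\alpha})$ and $B = (\sqrt{\alpha} - \sqrt{\alpha+\beta})/(\alpha\sqrt{\alpha+\beta})$, which is a reconstruction that the code itself tests against the $C^1$ conditions rather than something to take on trust. The function names are illustrative assumptions.

```python
import math

def arcsine_constants(alpha, beta):
    """Candidate constants (A, B) for the bounded C^1 solution, with alpha > 0, beta >= 0."""
    ra, rab = math.sqrt(alpha), math.sqrt(alpha + beta)
    A = (rab - ra) / ((alpha + beta) * ra)
    B = (ra - rab) / (alpha * rab)
    return A, B

def v(x, alpha, beta):
    """Decaying exponential plus the constant particular solution on each half line."""
    A, B = arcsine_constants(alpha, beta)
    if x > 0:
        return A * math.exp(-x * math.sqrt(2 * (alpha + beta))) + 1 / (alpha + beta)
    return B * math.exp(x * math.sqrt(2 * alpha)) + 1 / alpha

alpha, beta = 0.7, 1.3
A, B = arcsine_constants(alpha, beta)
print(A + 1 / (alpha + beta), B + 1 / alpha)                       # continuity at 0
print(-A * math.sqrt(2 * (alpha + beta)), B * math.sqrt(2 * alpha))  # matching derivatives at 0
print(v(0.0, alpha, beta), 1 / math.sqrt(alpha * (alpha + beta)))  # the value in (9.4)
```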
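To go backwards from here to the answer written in (9.2), we warm up by
changing variables $x = \sqrt{2\gamma t}$, $dx = (1/2)\sqrt{2\gamma/t}\,dt$ to get
$$\int_0^\infty \frac{e^{-\gamma t}}{\sqrt{t}}\,dt = \sqrt{\frac{2}{\gamma}}\int_0^\infty e^{-x^2/2}\,dx = \sqrt{\frac{\pi}{\gamma}}$$
This identity and Fubini's theorem imply
$$\frac{1}{\sqrt{\alpha+\beta}}\cdot\frac{1}{\sqrt{\alpha}} = \frac{1}{\pi}\int_0^\infty \frac{e^{-(\alpha+\beta)s}}{\sqrt{s}}\int_s^\infty \frac{e^{-\alpha(t-s)}}{\sqrt{t-s}}\,dt\,ds = \frac{1}{\pi}\int_0^\infty e^{-\alpha t}\int_0^t \frac{e^{-\beta s}}{\sqrt{s(t-s)}}\,ds\,dt$$
From (9.3), (9.4), and the uniqueness of Laplace transforms it follows that
$$E_0\exp(-\beta H_t) = \frac{1}{\pi}\int_0^t \frac{e^{-\beta s}}{\sqrt{s(t-s)}}\,ds$$
and invoking the uniqueness again we have the desired result. □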
5 Stochastic Differential Equations
5.1. Examples
In this chapter we will be interested in solving stochastic differential equations
(or SDE's) that are written informally as
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dB_t$$
or more formally, as an integral equation:
(*)  $$X_t = X_0 + \int_0^t b(X_s)\,ds + \int_0^t \sigma(X_s)\,dB_s$$
We will give a precise formulation of (*) at the beginning of Section 5.2.
In (*), $B_t$ is an $m$ dimensional Brownian motion, $\sigma$ is a $d \times m$ matrix,
and to make the matrix operations work out right we will think of $X$, $b$ and $B$
as column vectors. That is, they are matrices of size $d \times 1$, $d \times 1$, and $m \times 1$
respectively. Writing out the coordinates (*) says that for $1 \le i \le d$
$$X_t^i = X_0^i + \int_0^t b_i(X_s)\,ds + \sum_{j=1}^m \int_0^t \sigma_{ij}(X_s)\,dB_s^j$$
Note that the number $m$ of Brownian motions used to "drive" the equation
may be more or less than $d$. This generalization does not cause any additional
difficulty, is useful in some situations (see Example 1.5), and will help us
distinguish between $\sigma\sigma^T$ which is $d \times d$ and $\sigma^T\sigma$ which is $m \times m$. Here $\sigma^T$ denotes
the transpose of the matrix $\sigma$.
To explain what we have in mind when we write down (*) and why we want
to solve it, we will now consider some examples. In some of the later examples
we will use the term diffusion process. This is customarily defined to be
a "strong Markov process with continuous paths." However, in this section,
the term will be used to mean "a family of solutions of the SDE, one for each
starting point." (4.6) will show that under suitable conditions the family of
solutions defines a diffusion process.
Example 1.1. Exponential Brownian Motion. Let $X_t = X_0\exp(\mu t + \sigma B_t)$
where $X_0$ is a real number and $B_t$ is a standard one dimensional Brownian
motion. Using Ito's formula
(1.1)  $$X_t = X_0 + \int_0^t \mu X_s\,ds + \int_0^t \sigma X_s\,dB_s + \frac12\int_0^t \sigma^2 X_s\,ds$$
so $X_t$ solves (*) with $b(x) = (\mu + (\sigma^2/2))x$ and $\sigma(x) = \sigma x$. Exponential
Brownian motion is often used as a simple model for stock prices because it
stays nonnegative and fluctuations in the price are proportional to the price of
the stock.
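One way to see that the exponential formula and the SDE describe the same process is to drive both with the same simulated Brownian increments. The sketch below is an illustration under stated assumptions: it uses the Euler-Maruyama discretization (a standard scheme, not something introduced in the text) for $dX = (\mu + \sigma^2/2)X\,dt + \sigma X\,dB$, and the function name, step count, and seed are arbitrary choices.

```python
import math
import random

def simulate_gbm(x0, mu, sigma, T=1.0, n=10000, seed=0):
    """Return (exact, euler) values of X_T along one simulated Brownian path.

    exact uses X_T = x0 * exp(mu*T + sigma*B_T); euler applies the Euler-Maruyama
    scheme to dX = (mu + sigma^2/2) X dt + sigma X dB with the same increments.
    """
    rng = random.Random(seed)
    dt = T / n
    drift = mu + 0.5 * sigma ** 2
    x, b = x0, 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        x += drift * x * dt + sigma * x * db
        b += db
    return x0 * math.exp(mu * T + sigma * b), x

exact, euler = simulate_gbm(1.0, 0.1, 0.5)
print(exact, euler)  # the two values should be close for a fine grid
```

Dropping the $\sigma^2/2$ term from `drift` makes the two values drift apart, which is a concrete way to see the extra drift contributed by the Brownian motion.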
To explain the last phrase, we return to the general equation (*) and
suppose that $b$ and $\sigma$ are continuous. If $X_0 = x$ then stopping at $t \wedge T_n$ for suitable
stopping times $T_n \uparrow \infty$, taking expected values in (*) and letting $n \to \infty$ we
have
$$EX_t^i = x^i + E\int_0^t b_i(X_s)\,ds$$
So if $b_i$ is continuous
$$\frac{d}{dt}\,EX_t^i\Big|_{t=0} = b_i(x)$$
Because of the last equation, $b$ is called the infinitesimal drift.
To give the meaning of the other coefficient we note that if
$$a(x) = \sigma(x)\sigma^T(x)$$
then the formula for the covariance of two stochastic integrals, (8.7) in Chapter
2, implies
$$\langle X^i, X^j\rangle_t = \sum_k \int_0^t \sigma_{ik}(X_s)\sigma_{jk}(X_s)\,ds = \int_0^t a_{ij}(X_s)\,ds$$
Using the integration by parts formula
$$X_t^i X_t^j = x^i x^j + \int_0^t X_s^i\,dX_s^j + \int_0^t X_s^j\,dX_s^i + \langle X^i, X^j\rangle_t$$
Substituting
$$dX_s^k = b_k(X_s)\,ds + \sum_j \sigma_{kj}(X_s)\,dB_s^j$$
using the associative law, and taking expected values
$$E(X_t^i X_t^j) = x^i x^j + E\int_0^t X_s^i\,b_j(X_s)\,ds + E\int_0^t X_s^j\,b_i(X_s)\,ds + E\int_0^t a_{ij}(X_s)\,ds$$
If $a_{ij}$ is also continuous, differentiating gives
$$\frac{d}{dt}\,E(X_t^i X_t^j)\Big|_{t=0} = x^i b_j(x) + x^j b_i(x) + a_{ij}(x)$$
Using $EX_t^i = x^i + E\int_0^t b_i(X_s)\,ds$ and differentiating again we get
$$\frac{d}{dt}\left(EX_t^i\,EX_t^j\right)\Big|_{t=0} = x^i b_j(x) + x^j b_i(x)$$
Subtracting the last two equations gives
$$\frac{d}{dt}\left[E(X_t^i X_t^j) - EX_t^i\,EX_t^j\right]_{t=0} = a_{ij}(x)$$
justifying the name infinitesimal covariance.
When $d = 1$, we call $a(x)$ the infinitesimal variance and $\sigma(x) = \sqrt{a(x)}$
the infinitesimal standard deviation. In Example 1.1, $\sigma(x) = \sigma x$ is
proportional to the stock price $x$. The infinitesimal drift $b(x) = (\mu + (\sigma^2/2))x$ is
also proportional to $x$. Note, however, that in addition to the drift $\mu$ built in to
the exponential there is a drift $\sigma^2 x/2$ that comes from the Brownian motion.
More precisely it comes from the fact that $e^x$ is convex and hence $\exp(\sigma B_t)$ is
a submartingale.
Example 1.2. The Bessel Processes. Let $W_t$ be a $d$-dimensional Brownian
motion with $d > 1$, and $X_t = |W_t|$. Differentiating gives
$$D_i|x| = \frac{x_i}{|x|} \qquad D_{ii}|x| = \frac{1}{|x|} - \frac{x_i^2}{|x|^3} \qquad \sum_i D_{ii}|x| = \frac{d-1}{|x|}$$
So using Ito's formula
$$X_t - X_0 = \sum_i \int_0^t \frac{W_s^i}{|W_s|}\,dW_s^i + \frac12\int_0^t \frac{d-1}{|W_s|}\,ds$$
We will use $B_t$ to denote the first term on the right-hand side, since $B_t$ is a
local martingale and $\langle B\rangle_t = t$, i.e., $B_t$ is a Brownian motion by Levy's theorem,
(4.1) in Chapter 3. Changing notation now we have
(1.2)  $$X_t - X_0 = B_t + \int_0^t \frac{d-1}{2X_s}\,ds$$
so (*) holds with $b(x) = (d-1)/2|x|$ and $\sigma(x) = 1$.
Example 1.3. The Ornstein Uhlenbeck Process. Let Bt be a one
dimensional Brownian motion and consider
(1.3)  $$dX_t = -\alpha X_t\,dt + \sigma\,dB_t$$
which describes one component of the velocity of a Brownian particle which is
slowed by friction (i.e., experiences an acceleration $-\alpha$ times its velocity $X_t$).
This is one case where we can find the solution explicitly.
(1.4)  $$X_t = e^{-\alpha t}\left(X_0 + \int_0^t e^{\alpha s}\sigma\,dB_s\right)$$
To check this informally, we note that differentiating the first term in the
product gives $-\alpha X_t\,dt$ and differentiating the second gives $\sigma\,dB_t$. To check this
formally, we use Ito's formula with
$$f(u,v) = e^{-\alpha u}v \qquad U_t = t \qquad V_t = X_0 + \int_0^t e^{\alpha s}\sigma\,dB_s$$
to conclude
$$X_t - X_0 = \int_0^t -\alpha X_s\,ds + \int_0^t e^{-\alpha s}(e^{\alpha s}\sigma)\,dB_s$$
which is (1.3) in integral form.
From the representation in (1.4) we get the following useful fact
(1.5) Theorem. When $X_0 = x$, the distribution of $X_t$ is normal with mean
$xe^{-\alpha t}$ and variance $\sigma^2\int_0^t e^{-2\alpha r}\,dr$.
Proof To see that the integral has a normal distribution with mean 0 and the
indicated variance we use Exercise 6.7 in Chapter 2. Adding $e^{-\alpha t}X_0 = e^{-\alpha t}x$
now we have the desired result. □
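For $\alpha \ne 0$ the variance integral in (1.5) has the closed form $\sigma^2(1 - e^{-2\alpha t})/2\alpha$, which tends to $\sigma^2/2\alpha$ as $t \to \infty$ when $\alpha > 0$. The sketch below compares the closed form with a midpoint-rule evaluation of the integral; the function names and grid size are assumptions made for this illustration.

```python
import math

def ou_mean_var(x, alpha, sigma, t):
    """Mean and variance of X_t from (1.5), using the closed form of the integral (alpha != 0)."""
    mean = x * math.exp(-alpha * t)
    var = sigma ** 2 * (1 - math.exp(-2 * alpha * t)) / (2 * alpha)
    return mean, var

def variance_by_quadrature(alpha, sigma, t, n=100000):
    """Midpoint rule for sigma^2 * integral_0^t exp(-2 alpha r) dr."""
    h = t / n
    return sigma ** 2 * h * sum(math.exp(-2 * alpha * (j + 0.5) * h) for j in range(n))

mean, var = ou_mean_var(1.0, 0.8, 0.5, 2.0)
print(mean, var)
print(variance_by_quadrature(0.8, 0.5, 2.0))  # should agree with var
```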
Example 1.4. The Kalman-Bucy Filter. Consider
(1.6)  $$dY_t = V_t\,dt + a\,dW_t \qquad dV_t = -cV_t\,dt + \sigma\,dB_t$$
Here $V_t$ is an Ornstein-Uhlenbeck velocity process, and $Y_t$ is an observation
of the position process subject to observation error described by $a\,dW_t$, where
$W$ is a Brownian motion independent of $B$. The fundamental problem here
is: how does one best estimate the present velocity $V_t$ from the observation
$\{Y_s : s \le t\}$? This is a problem of important practical significance but not
one that we will treat here. See for example, Rogers and Williams (1987),
p. 327-329, or Øksendal (1992), Chapter VI.
Example 1.5. Brownian Motion on the Circle. Let $B_t$ be a one dimensional
Brownian motion, let $X_t = \cos B_t$ and $Y_t = \sin B_t$. Since $X_t^2 + Y_t^2 = 1$,
$(X_t,Y_t)$ always stays on the unit circle. Ito's formula implies that
(1.7)  $$dX_t = -Y_t\,dB_t - \tfrac12 X_t\,dt \qquad dY_t = X_t\,dB_t - \tfrac12 Y_t\,dt$$
To write this in the form (*) we take
$$b(x,y) = \begin{pmatrix} -x/2 \\ -y/2 \end{pmatrix} \qquad \sigma(x,y) = \begin{pmatrix} -y \\ x \end{pmatrix}$$
Note that $\sigma$ is always a unit vector tangent to the circle but we need the drift
$b(x,y)$, which is perpendicular to the circle and points inward, to keep $(X_t,Y_t)$
from flying off.
Example 1.6. Feller's Branching Diffusion. This $X_t \ge 0$ and satisfies
(1.8)  $$dX_t = \beta X_t\,dt + \sigma\sqrt{X_t}\,dB_t$$
To explain where this equation comes from, we recall that in a branching
process, each individual in generation $m$ has an independent and identically
distributed number of offspring in generation $m + 1$. Consider a sequence of
branching processes $\{Z_m^n, m \ge 0\}$ in which the probability of $k$ children is $p_k^n$
and suppose
(A1) the mean of $p_k^n$ is $1 + (\beta_n/n)$ with $\beta_n \to \beta$
(A2) the variance of $p_k^n$ is $\sigma_n^2$ with $\sigma_n^2 \to \sigma^2 > 0$
(A3) for any $\delta > 0$, $\sum_{k > \delta n} k^2 p_k^n \to 0$.
Since the individuals in generation 0 have an independent and identically
distributed number of children
$$E(Z_1^n \mid Z_0^n = nx) = nx\cdot\left(1 + \frac{\beta_n}{n}\right)$$
$$\mathrm{var}(Z_1^n \mid Z_0^n = nx) = nx\cdot\sigma_n^2$$
So if we let $X_t^n = Z_{[nt]}^n/n$ then
$$E(X_{1/n}^n - x \mid X_0^n = x) = x\beta_n\cdot\frac1n$$
$$\mathrm{var}(X_{1/n}^n \mid X_0^n = x) = x\sigma_n^2\cdot\frac1n$$
The last two equations say that the infinitesimal mean and variance of the
rescaled process $X^n$ are $x\beta_n$ and $x\sigma_n^2$. These quantities obviously converge to
those of the process $X_t$ in (1.8). In Example 8.2 of Chapter 8, we will show that
under (A1)-(A3) the processes $\{X_t^n, t \ge 0\}$ converge weakly to $\{X_t, t \ge 0\}$.
Example 1.7. Wright-Fisher Diffusion. This $X_t \in [0,1]$ and satisfies
(1.9)  $$dX_t = (-\alpha X_t + \beta(1 - X_t))\,dt + \sqrt{X_t(1 - X_t)}\,dB_t$$
The motivation for this model comes from genetics, but to avoid the details
of that subject we will suppose instead we have an urn with $n$ balls in it that
may be labelled $A$ or $a$. To build up the urn at time $m + 1$ we sample with
replacement from the urn at time $m$. However, with probability $\alpha/n$ we ignore
the draw and place an $a$ in, and with probability $\beta/n$ we ignore the draw and
place an $A$ in. In genetics terms the last two events correspond to mutations.
Let $Z_m^n$ be the number of $A$'s in the urn at time $m$. Now when the fraction of
$A$'s in the urn at time 0 is $x$, the probability of drawing an $A$ on a given trial is
$$p_n = x\cdot\left(1 - \frac{\alpha}{n}\right) + (1 - x)\cdot\frac{\beta}{n}$$
Since we are sampling with replacement,
$$E(Z_1^n \mid Z_0^n = nx) = np_n$$
$$\mathrm{var}(Z_1^n \mid Z_0^n = nx) = np_n(1 - p_n)$$
If we let $X_t^n = Z_{[nt]}^n/n$ then
$$E(X_{1/n}^n - x \mid X_0^n = x) = \{-\alpha x + \beta(1 - x)\}\cdot\frac1n$$
$$\mathrm{var}(X_{1/n}^n \mid X_0^n = x) = p_n(1 - p_n)\cdot\frac1n$$
Now $p_n \to x$ as $n \to \infty$ so again the infinitesimal mean and variance of $X^n$
converge to those of $X_t$ in (1.9) but the rigorous connection has to wait until
Example 8.3 of Chapter 8. This is just one of many of the diffusions that can
arise from genetics. See Karlin and Taylor (1981), vol. II, pages 176-191 for
some of the others.
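The urn computation is simple enough to evaluate directly. The sketch below computes $p_n$ and then $n$ times the one-step mean change and variance of $X^n = Z^n/n$, which should approach the drift $-\alpha x + \beta(1-x)$ and the variance coefficient $x(1-x)$ of (1.9) as $n$ grows. The function names are illustrative assumptions.

```python
def draw_probability(x, alpha, beta, n):
    """p_n = x (1 - alpha/n) + (1 - x) beta/n: chance that a ball placed in the new urn is an A."""
    return x * (1 - alpha / n) + (1 - x) * beta / n

def scaled_moments(x, alpha, beta, n):
    """n times the one-step mean change and variance of X^n = Z^n / n, started at x."""
    p = draw_probability(x, alpha, beta, n)
    mean_change = p - x          # E(Z_1 / n) - x, since E Z_1 = n p
    variance = p * (1 - p) / n   # var(Z_1 / n) = n p (1 - p) / n^2
    return n * mean_change, n * variance

x, alpha, beta = 0.3, 1.0, 2.0
for n in (10, 1000, 10 ** 6):
    print(n, scaled_moments(x, alpha, beta, n))
print(-alpha * x + beta * (1 - x), x * (1 - x))  # limiting drift and variance
```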
5.2. Ito's Approach
We begin this section by giving some essential definitions and introducing a
counterexample that will explain the need for some of the formalities. We will
then proceed to our main business: the first existence and uniqueness result for
SDE, which was proved by K. Ito long before the subject became entangled in
the complications of the general theory of processes.
To finally become precise, a solution to our SDE (*) is a triple $(X_t, B_t, \mathcal{F}_t)$
where $X_t$ and $B_t$ are continuous processes adapted to a filtration $\mathcal{F}_t$ so that
(i) $B_t$ is a Brownian motion with respect to $\mathcal{F}_t$, i.e., for each $s$ and $t$, $B_{t+s} - B_t$
is independent of $\mathcal{F}_t$ and is normally distributed with mean 0 and variance $s$.
(ii) $X_t$ satisfies
(*)  $$X_t = X_0 + \int_0^t b(X_s)\,ds + \int_0^t \sigma(X_s)\,dB_s$$
We will always assume that $\sigma$ and $b$ are measurable and locally bounded so the
integral exists.
It is easy to see that if the SDE holds for some $\mathcal{F}_t$ it will hold for $\mathcal{F}_t^{X,B}$,
the filtration generated by $(X_t,B_t)$. When $X_t$ is adapted to $\mathcal{F}_t^B$, the filtration
generated by the Brownian motion, then we can take $\mathcal{F}_t = \mathcal{F}_t^B$ and $X_t$ is called
a strong solution. It is difficult to imagine how $X_t$ could satisfy (*) without
being adapted to $\mathcal{F}_t^B$ but there is a famous example due to H. Tanaka showing
that this can happen.
Example 2.1. Let $W_t$ be a one dimensional Brownian motion with $W_0 = 0$
and let
$$B_t = \int_0^t \mathrm{sgn}(W_s)\,dW_s$$
where $\mathrm{sgn}(x) = 1$ if $x \ge 0$, $\mathrm{sgn}(x) = -1$ if $x < 0$. Since $B_t$ is a local martingale
with $\langle B\rangle_t = t$, (4.2) in Chapter 3 implies that $B_t$ is a Brownian motion with
respect to $\mathcal{F}_t^W$, the filtration generated by $W_t$. Since $\mathrm{sgn}(x)^2 = 1$, the associative
law implies that
$$\int_0^t \mathrm{sgn}(W_s)\,dB_s = \int_0^t dW_s = W_t$$
So $W$ is a solution of (*) with $b = 0$, $\sigma(x) = \mathrm{sgn}(x)$, and $\mathcal{F}_t = \mathcal{F}_t^W = \mathcal{F}_t^{W,B}$.
To see that $W$ is not a strong solution we apply the Meyer-Tanaka
formula, (11.4) in Chapter 2, with $X_t = W_t$ and $f(x) = |x|$ to get
$$|W_t| - L_t^0 = \int_0^t \mathrm{sgn}(W_s)\,dW_s = B_t$$
where $L_t^0$ is the local time of $|W_t|$ at 0. The occupation time density formula in
(11.7) of Chapter 2, and the continuity of the local time established in (11.8)
of Chapter 2, imply that
$$\frac{1}{2\epsilon}\int_0^t 1_{(|W_s| \le \epsilon)}\,ds = \frac{1}{2\epsilon}\int_{-\epsilon}^{\epsilon} L_t^a\,da \to L_t^0 \quad\text{as } \epsilon \to 0$$
So $L_t^0$ and hence $B_t = |W_t| - L_t^0$ is measurable with respect to the filtration
generated by $|W_t|$, $t \ge 0$. However, $W_t$ is clearly not adapted to the filtration
generated by $|W_t|$. □
As usual when we have an equation, we are interested in having a unique
solution. For SDE there are several notions of uniqueness. Our first, pathwise
uniqueness holds if whenever $X_t$ and $X'_t$ are solutions of (*) with $X_0 = X'_0 = x$
driven by the same Brownian motion $B_t$, then with probability one $X_t = X'_t$
for all $t \ge 0$. Here, the filtrations $\mathcal{F}_t$ and $\mathcal{F}'_t$ are allowed to be different.
(2.1) Theorem. Pathwise uniqueness does not hold in Example 2.1.
Proof We observed in Example 2.1 that $W$ was a solution. To show that
$-W$ is also a solution, we note that $\mathrm{sgn}(-x) = -\mathrm{sgn}(x)$ except when $x = 0$, in
which case the left side is 1 and the right side is $-1$. So
$$\int_0^t \mathrm{sgn}(-W_s)\,dB_s = -\int_0^t \mathrm{sgn}(W_s)\,dB_s + 2\int_0^t 1_{(W_s = 0)}\,dB_s$$
To prove that the second integral is 0 (and hence the right-hand side is $= -W_t$)
recall that $W_t$ is a Brownian motion. Example 3.1 in Chapter 2 implies that
$\{s : W_s = 0\}$ has measure 0, and hence
$$E\left(\int_0^t 1_{(W_s = 0)}\,dB_s\right)^2 = E\int_0^t 1_{(W_s = 0)}\,ds = E\,|\{s \le t : W_s = 0\}| = 0 \qquad \Box$$
Turning to positive results:
(2.2) Theorem. If for all $i$, $j$, $x$, and $y$ we have $|\sigma_{ij}(x) - \sigma_{ij}(y)| \le K|x - y|$
and $|b_i(x) - b_i(y)| \le K|x - y|$, then the stochastic differential equation
(*)  $$X_t = x + \int_0^t \sigma(X_s)\,dB_s + \int_0^t b(X_s)\,ds$$
has a strong solution and pathwise uniqueness holds.
Proof We construct the solution by a successive approximation scheme that
reduces to the classical Picard iteration method for solving ordinary differential
equations in the special case $\sigma \equiv 0$. We begin by introducing the notation
$$\tilde X_t = X_0 + \int_0^t \sigma(X_s)\,dB_s + \int_0^t b(X_s)\,ds$$
to describe one iteration. In words, we use the process $\{X_t, t \ge 0\}$ as input for
the coefficients $\sigma$ and $b$, and the output is the process $\{\tilde X_t, t \ge 0\}$. To solve
(*) we begin with the initial guess $X_t^0 \equiv x$ and define the successive guess by
iteration:
$$X_t^{n+1} = \tilde X_t^n \quad\text{for } n \ge 0$$
The rest of this section is devoted to proving that this procedure constructs a
strong solution and that pathwise uniqueness holds. To help the reader digest
the argument we have divided it into sections. One of the claims is easy. It is
clear by induction that $X_t^n$ is adapted to $\mathcal{F}_t^B$.
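In the special case $\sigma \equiv 0$ the scheme is ordinary Picard iteration, and for a linear drift it can be carried out exactly on polynomials. The sketch below assumes $b(x) = cx$ (a Lipschitz drift chosen only for illustration), represents each iterate by its list of coefficients in $t$, and applies the map $X \mapsto x_0 + \int_0^t cX_s\,ds$ repeatedly. The iterates are the Taylor partial sums of $x_0e^{ct}$. All function names are hypothetical.

```python
import math

def picard_step(coeffs, x0, c):
    """One iteration X -> x0 + integral_0^t c * X_s ds, where X_t = sum coeffs[k] * t^k."""
    integrated = [c * a / (k + 1) for k, a in enumerate(coeffs)]
    return [x0] + integrated

def picard_iterates(x0, c, n):
    """Start from the constant guess X^0 = x0 and apply picard_step n times."""
    coeffs = [x0]
    for _ in range(n):
        coeffs = picard_step(coeffs, x0, c)
    return coeffs

def evaluate(coeffs, t):
    """Evaluate the polynomial with the given coefficients at time t."""
    return sum(a * t ** k for k, a in enumerate(coeffs))

for n in (1, 3, 10, 25):
    print(n, evaluate(picard_iterates(2.0, 0.5, n), 1.0))
print(2.0 * math.exp(0.5))  # limit of the iterates at t = 1
```

The factorial decay of the successive differences seen here is the deterministic counterpart of the bound that the convergence argument for the SDE relies on.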
The basic estimate which is the key to the proof is:
(2.3) Lemma. Let r be a stopping time with respect to T%, let T < oo and let
5 = (47^+16^2)^2. Then
E ( sup 1¾ - Zt A < 2E\Yo - Zo\2 + BE f \Yt - Zs \2ds
\0<t<TAr J Jo
Proof The inequality is trivial if the right-hand side is infinite so we can
suppose that each term on the right is finite. To begin we recall that (a+b)2 <
2a2 + 262. Using this twice
(a + 6 + c)2 < 2a2 + 2(6 + c)2 < 2a2 + 462 + 4c2
So the left-hand side of (2.3) is
< 2E\YQ - Zo |2 + 4£ sup / <r{Ys) - <r{Zs) dBa
, s 0<i<T I Jo
(a)
I rtAT
+ AE sup / b{Y,)-b{Za)ds
0<t<T \Jo
186 Chapter 5 Stochastic Differential Equations
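To bound the third term in (a), we observe that the Cauchy-Schwarz inequality
implies that
$$\Big| \int_0^{t \wedge \tau} b_i(Y_s) - b_i(Z_s)\,ds \Big|^2 \le \int_0^{t \wedge \tau} (b_i(Y_s) - b_i(Z_s))^2\,ds \cdot \int_0^{t \wedge \tau} 1\,ds$$
Taking $\sup_{0 \le t \le T}$ and using
$$\sup_{0 \le t \le T} |w(t)|^2 \le \sum_{i=1}^d \sup_{0 \le t \le T} w_i(t)^2 \tag{b}$$
with the Lipschitz continuity assumption gives
$$\begin{aligned}
4E \sup_{0 \le t \le T} \Big| \int_0^{t \wedge \tau} b(Y_s) - b(Z_s)\,ds \Big|^2
&\le 4E \sum_{i=1}^d \sup_{0 \le t \le T} \Big| \int_0^{t \wedge \tau} b_i(Y_s) - b_i(Z_s)\,ds \Big|^2 \\
&\le 4T\,E \sum_{i=1}^d \int_0^{T \wedge \tau} (b_i(Y_s) - b_i(Z_s))^2\,ds \\
&\le 4dTK^2\,E \int_0^{T \wedge \tau} |Y_s - Z_s|^2\,ds
\end{aligned} \tag{c}$$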
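To bound the second term in (a), let $\sigma_i$ be the $i$th row of $\sigma$, let
$$M^i_t = \int_0^{t \wedge \tau} (\sigma_i(Y_s) - \sigma_i(Z_s)) \cdot dB_s$$
and let $\tau_n = \inf\{t : |M^i_t| \ge n\} \wedge \tau$. The $L^2$ maximal inequality and the formula
for the variance of a stochastic integral imply that
$$E \sup_{0 \le t \le T \wedge \tau_n} (M^i_t)^2 \le 4E (M^i_{T \wedge \tau_n})^2 \le 4E \int_0^{T \wedge \tau} |\sigma_i(Y_s) - \sigma_i(Z_s)|^2\,ds$$
Letting $n \to \infty$, then using (b) and Lipschitz continuity we have
$$\begin{aligned}
4E \sup_{0 \le t \le T} \Big| \int_0^{t \wedge \tau} (\sigma(Y_s) - \sigma(Z_s))\,dB_s \Big|^2
&\le 4E \sum_{i=1}^d \sup_{0 \le t \le T} \Big| \int_0^{t \wedge \tau} (\sigma_i(Y_s) - \sigma_i(Z_s)) \cdot dB_s \Big|^2 \\
&\le 16E \sum_{i=1}^d \int_0^{T \wedge \tau} |\sigma_i(Y_s) - \sigma_i(Z_s)|^2\,ds \\
&\le 16d^2K^2\,E \int_0^{T \wedge \tau} |Y_s - Z_s|^2\,ds
\end{aligned} \tag{d}$$
Combining inequalities (a), (c), and (d) proves (2.3). □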
Convergence of the sequence of approximations. Let
$$\Delta_n(t) = E\Big( \sup_{0 \le s \le t} |X^n_s - X^{n-1}_s|^2 \Big)$$
and observe that since $X^n_0 = X^{n-1}_0$, (2.3) implies that for $n \ge 1$
$$\Delta_{n+1}(t) \le B \int_0^t \Delta_n(s)\,ds \tag{2.4}$$
To get started we have to estimate $\Delta_1(t)$. Our earlier inequality for squares
implies that for norms $|a + b|^2 \le 2|a|^2 + 2|b|^2$. So we have
$$|X^1_s - X^0_s|^2 \le 2\big(|\sigma(x)B_s|^2 + |b(x)s|^2\big)$$
Using the fact that
$$\sup_{0 \le s \le t} |\sigma(x)B_s| \overset{d}{=} t^{1/2} \sup_{0 \le s \le 1} |\sigma(x)B_s|$$
and that the square of the right-hand side has finite expected value we have
$$\Delta_1(t) \le C(t + t^2)$$
Combining this with (2.4) and using induction we have
$$\Delta_n(t) \le B^{n-1} C \Big( \frac{t^n}{n!} + \frac{2t^{n+1}}{(n+1)!} \Big) \tag{2.5}$$
From (2.5) we easily get the existence of a limit. Chebyshev's inequality
shows that
$$P\Big( \sup_{0 \le s \le T} |X^n_s - X^{n-1}_s| > 2^{-n} \Big) \le 2^{2n} \Delta_n(T)$$
(2.5) implies the right-hand side is summable, so the Borel-Cantelli lemma gives
$$P\Big( \sup_{0 \le s \le T} |X^n_s - X^{n-1}_s| > 2^{-n} \text{ i.o.} \Big) = 0$$
Since $\sum_n 2^{-n} < \infty$ it follows that with probability 1, $X^n_s \to$ a limit $X^\infty_s$
uniformly on $[0, T]$.
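The summability claim can be checked numerically. The sketch below (an illustration added here, with arbitrary sample values for the constants $B$, $C$, $T$ and hypothetical function names) evaluates the Chebyshev bound $2^{2n}$ times the right-hand side of (2.5), working with logarithms to avoid overflow, and compares the partial sum with the closed-form bound $(C/B)(1+T)e^{4BT}$ that follows from the exponential series.

```python
import math

def chebyshev_term(n, B, C, T):
    """2^(2n) * B^(n-1) * C * (T^n/n! + 2 T^(n+1)/(n+1)!), computed via logarithms."""
    log_delta = ((n - 1) * math.log(B) + math.log(C) + n * math.log(T)
                 - math.lgamma(n + 1)                 # log(n!)
                 + math.log1p(2.0 * T / (n + 1)))     # log(1 + 2T/(n+1))
    return math.exp(2 * n * math.log(2.0) + log_delta)

# Sample constants (assumptions chosen only for illustration).
B, C, T = 5.0, 1.0, 1.0
terms = [chebyshev_term(n, B, C, T) for n in range(1, 200)]
total = sum(terms)

# Each term is at most (C/B) (1+T) (4BT)^n / n!, so the series is dominated by this value.
series_bound = (C / B) * (1.0 + T) * math.exp(4.0 * B * T)
```

The terms grow until $n$ is about $4BT$ and then decay factorially, which is what makes the Borel-Cantelli step work.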
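The limit is a solution. We begin by showing that convergence also
occurs uniformly in $L^2$.
(2.6) Lemma. For all $0 \le m < n \le \infty$,
$$E\Big( \sup_{0 \le s \le T} |X^n_s - X^m_s|^2 \Big) \le \Big( \sum_{k=m+1}^n \Delta_k(T)^{1/2} \Big)^2$$
Proof Let $\|Z\|_2 = (EZ^2)^{1/2}$. If $n < \infty$, then it follows from monotonicity of
expected value and the triangle inequality for the norm $\|\cdot\|_2$ that
$$\Big\| \sup_{0 \le s \le T} |X^n_s - X^m_s| \Big\|_2 \le \Big\| \sum_{k=m+1}^n \sup_{0 \le s \le T} |X^k_s - X^{k-1}_s| \Big\|_2 \le \sum_{k=m+1}^n \Big\| \sup_{0 \le s \le T} |X^k_s - X^{k-1}_s| \Big\|_2 = \sum_{k=m+1}^n \Delta_k(T)^{1/2}$$
Letting $n \to \infty$ and using Fatou's lemma proves the result for $n = \infty$. □
To see that $X^\infty_t$ is a solution, we let $Y_t = X^n_t$ and $Z_t = X^\infty_t$ in (2.3) to get
$$E\Big( \sup_{0 \le t \le T} |X^{n+1}_t - \overline{X^\infty}_t|^2 \Big) \le B\,E \int_0^T |X^n_s - X^\infty_s|^2\,ds \le BT \cdot E\Big( \sup_{0 \le s \le T} |X^n_s - X^\infty_s|^2 \Big) \to 0$$
by (2.6), so $\overline{X^\infty} = \lim X^{n+1} = X^\infty$.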
Uniqueness. The key to the proof is
(2.7) Gronwall's Inequality. Suppose $\varphi(t) \le A + B \int_0^t \varphi(s)\,ds$ for all $t \ge 0$
and $\varphi(t)$ is continuous. Then $\varphi(t) \le Ae^{Bt}$.
Proof If we let $\psi(t) = (A + \epsilon)e^{Bt}$ then $\psi'(t) = B\psi(t)$ so
$$\psi(t) = A + \epsilon + B \int_0^t \psi(s)\,ds$$
Let $\tau = \inf\{t : \varphi(t) \ge \psi(t)\}$. Since $\psi$ and $\varphi$ are continuous, it follows that if
$\tau < \infty$ then $\varphi(\tau) = \psi(\tau)$, but this is impossible since
$$\psi(\tau) = A + \epsilon + B \int_0^\tau \psi(s)\,ds > A + B \int_0^\tau \varphi(s)\,ds \ge \varphi(\tau)$$
The last contradiction implies that $\varphi(t) \le (A + \epsilon)e^{Bt}$, but $\epsilon > 0$ is arbitrary, so
the desired result follows. □
To prove pathwise uniqueness now, let $(Y_t, B_t, \mathcal{F}^1_t)$ and $(Z_t, B_t, \mathcal{F}^2_t)$ be two
solutions of ($*$). Replacing the $\mathcal{F}^i_t$ by $\mathcal{F}_t = \mathcal{F}^1_t \vee \mathcal{F}^2_t$ we can suppose $\mathcal{F}^1_t = \mathcal{F}^2_t$.
To prepare for developments in the next section, we will use a more general
formulation than we need now.
(2.8) Lemma. Let $Y_t$ and $Z_t$ be continuous processes adapted to $\mathcal{F}_t$ with
$Y_0 = Z_0 = x$ and let $\tau(R) = \inf\{t : |Y_t| \text{ or } |Z_t| \ge R\}$. Suppose that $B_t$ is a
Brownian motion w.r.t. $\mathcal{F}_t$ and that $Y_t$ and $Z_t$ satisfy ($*$) on $[0, \tau(R)]$. Then
$Y_t = Z_t$ on $[0, \tau(R)]$.
Proof Let
$$\varphi(t) = E\Big( \sup_{0 \le s \le t \wedge \tau(R)} |Y_s - Z_s|^2 \Big)$$
Since $Y_t$ and $Z_t$ are solutions, (2.3) implies
$$\varphi(t) \le B\,E \int_0^{t \wedge \tau(R)} |Y_s - Z_s|^2\,ds \le B\,E \int_0^t |Y_{s \wedge \tau} - Z_{s \wedge \tau}|^2\,ds \le B \int_0^t \varphi(s)\,ds$$
so (2.7) holds with $A = 0$ and it follows that $\varphi \equiv 0$. □
The last result implies that two solutions must agree until the first time
one of them leaves the ball of radius $R$. (Of course this implies that the two
solutions must exit at the same time.) Since solutions are by definition
continuous functions from $[0, \infty) \to \mathbf{R}^d$, we must have $\tau(R) \uparrow \infty$ as $R \uparrow \infty$ and the
desired result follows. □
Temporal Inhomogeneity. In some applications it is important to allow the
coefficients $b$ and $\sigma$ to depend on time, that is, to consider
$$X_t = X_0 + \int_0^t b(s, X_s)\,ds + \int_0^t \sigma(s, X_s)\,dB_s \tag{$**$}$$
In many cases it is straightforward to generalize proofs from (*) to (**). For
example, by repeating the argument above with some minor modifications one
can show
(2.9) Theorem. If for all $i$, $j$, $x$, $y$, and $T < \infty$ there is a constant $K_T$ so that
for $t \le T$, $|\sigma_{ij}(t, 0)| \le K_T$, $|b_i(t, 0)| \le K_T$,
$$|\sigma_{ij}(t, x) - \sigma_{ij}(t, y)| \le K_T|x - y| \quad \text{and} \quad |b_i(t, x) - b_i(t, y)| \le K_T|x - y|$$
then the stochastic differential equation ($**$) has a strong solution and pathwise
uniqueness holds.
Comparing with (2.2) one sees that the new result simply assumes Lipschitz
continuity locally uniformly in $t$ but needs new conditions: $|\sigma_{ij}(t, 0)| \le K_T$
and $|b_i(t, 0)| \le K_T$ that are automatic in the temporally homogeneous case.
The dependence of the coefficients on t usually does not introduce any new
difficulties but it does make the statements and proofs uglier. (Here it would
be impolite to point to Karatzas and Shreve (1991) and Stroock and Varadhan
(1979).) So with the exception of (5.1) below, where we need to prove the result
for time-dependent coefficients, we will restrict our attention to the temporally
homogeneous case.
5.3. Extensions
In this section we will (a) extend Ito's result to coefficients that are only locally
Lipschitz, (b) prove a result for one dimensional SDE due to Yamada and
Watanabe, and (c) introduce examples to show that the results in (2.2) and
(3.3) are fairly sharp.
a. Locally Lipschitz Coefficients
In this subsection we will extend (2.2) in the following way.
(3.1) Theorem. Suppose (i) for any $n < \infty$ we have
$$|\sigma_{ij}(x) - \sigma_{ij}(y)| \le K_n|x - y| \qquad |b_i(x) - b_i(y)| \le K_n|x - y|$$
when $|x|, |y| \le n$ and (ii) there is a constant $A < \infty$ and a function $\varphi(x) \ge 0$ so
that if $X_t$ is a solution of ($*$) then $e^{-At}\varphi(X_t)$ is a local supermartingale. Then
($*$) has a strong solution and pathwise uniqueness holds.
(3.2) Theorem. Let $a = \sigma\sigma^T$ and suppose
$$\sum_{i=1}^d \{2x_i b_i(x) + a_{ii}(x)\} \le B(1 + |x|^2)$$
then (ii) in (3.1) holds with $A = B$ and $\varphi(x) = 1 + |x|^2$.
We begin with the easier proof.
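Proof of (3.2) Using Ito's formula with $X^0_t = t$ and
$$f(x_0, \ldots, x_d) = e^{-Bx_0}\varphi(x_1, \ldots, x_d)$$
we have
$$\begin{aligned}
e^{-Bt}\varphi(X_t) - \varphi(X_0) &= -B \int_0^t e^{-Bs}\varphi(X_s)\,ds + \sum_{i=1}^d \int_0^t e^{-Bs}\,2X^i_s\,dX^i_s + \frac{1}{2} \sum_{i=1}^d \int_0^t e^{-Bs}\,2\,d\langle X^i \rangle_s \\
&= \text{local martingale} + \int_0^t e^{-Bs} \Big\{ -B\varphi(X_s) + \sum_{i=1}^d \big[ 2X^i_s b_i(X_s) + a_{ii}(X_s) \big] \Big\}\,ds
\end{aligned}$$
Our assumption implies that the last term is $\le 0$ so $e^{-At}\varphi(X_t)$, with $A = B$, is a local
supermartingale. □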
Proof of (3.1) Let $R < \infty$, and introduce $\sigma^R$ and $b^R$ with
(a) $\sigma^R(x) = \sigma(x)$ and $b^R(x) = b(x)$ for $|x| \le R$
(b) $\sigma^R(x) = 0$ and $b^R(x) = 0$ for $|x| \ge 2R$
(c) $|\sigma^R_{ij}(x) - \sigma^R_{ij}(y)| \le K'|x - y|$, $|b^R_i(x) - b^R_i(y)| \le K'|x - y|$
For an explicit construction do
Exercise 3.1. Suppose $|h(x) - h(y)| \le C|x - y|$ when $|x|, |y| \le R$. Let
$$h(x) = \frac{2R - |x|}{R} \cdot h(Rx/|x|) \quad \text{for } R \le |x| \le 2R$$
and $h(x) = 0$ for $|x| \ge 2R$. Show that $h$ is Lipschitz continuous on $\mathbf{R}^d$ with
constant $C' = 2C + R^{-1}|h(0)|$.
Fix a Brownian motion and let $X^n$ be the unique strong solution of
$$X_t = X_0 + \int_0^t b^n(X_s)\,ds + \int_0^t \sigma^n(X_s)\,dB_s \tag{$*_n$}$$
Let $T_n = \inf\{t : |X^n_t| \ge n\}$. (2.8) implies that if $m < n$ then $X^m_t = X^n_t$ for
$t \le T_m$. From this it follows that we can define a process $X^\infty_t$ so that
$$X^\infty_t = X^n_t \quad \text{for } t \le T_n$$
and $X^\infty_t$ will be a solution of ($*$) for $t < T_\infty = \lim_{n \to \infty} T_n$. To complete the proof
now it suffices to show that $T_\infty = \infty$ a.s. If $X_0 = x$ then using the optional
stopping theorem at time $t \wedge T_n$, which is valid since the local supermartingale
is bounded before that time, we have
$$\varphi(x) \ge E\big( e^{-A(t \wedge T_n)} \varphi(X_{t \wedge T_n}) \big) \ge e^{-At} P(T_n \le t) \inf_{y : |y| = n} \varphi(y)$$
Rearranging gives
$$P(T_n \le t) \le e^{At} \varphi(x) \Big/ \inf_{y : |y| = n} \varphi(y)$$
Since we have supposed $\varphi(y) \to \infty$ as $|y| \to \infty$ it follows that $P(T_n \le t) \to 0$ as
$n \to \infty$ which proves the desired result. □
In words, what we have shown is that for locally Lipschitz coefficients the
solution is unique up to $T_n$, the exit time from the ball of radius $n$. If $T_n \uparrow \infty$
we get a solution defined for all time. If not then the solution reaches infinity
in finite time and we say that an explosion has occurred.
Intuitively (3.2) says there is no explosion if, when $|x|$ is large, the part of
the drift that points out, $b(x) \cdot x/|x|$, is smaller than $C|x|$ and the variance of
each component is smaller than $C|x|^2$. We do not need conditions on the off
diagonal elements of $a_{ij}$ since the Kunita-Watanabe inequality implies
$$|\langle X^i, X^j \rangle_t|^2 \le \langle X^i \rangle_t \langle X^j \rangle_t$$
Note that since our condition concerns a sum, a drift toward the origin can
compensate for a larger variance.
Example 3.1. Consider $d = 1$ and suppose $b(x) = -x^3$. Then (3.2) holds if
$a(x) \le B(1 + x^2) + 2x^4$.
The next two examples show that when considered separately the conditions
on $b$ and $a$ are close to the best possible.
Example 3.2. Consider $d = 1$ and let
$$\sigma(x) = 0, \quad b(x) = (1 + |x|)^\delta \text{ with } \delta > 1$$
Suppose $X_0 = 0$. $b > 0$ so $t \to X_t$ is strictly increasing. Let $T_n = \inf\{t : X_t = n\}$.
While $X_t \in (n - 1, n)$ its velocity $b(X_t) \ge n^\delta$ so $T_n - T_{n-1} \le n^{-\delta}$. Letting
$T_\infty = \lim T_n$ and summing we have $T_\infty \le \sum_{n=1}^\infty n^{-\delta} < \infty$ if $\delta > 1$.
We like the last proof since it only depends on the asymptotic behavior
of the coefficients. One can also solve the equation explicitly, which has the
advantage of giving the exact explosion time. Let $Y_t = 1 + X_t$. $dY_t = Y_t^\delta\,dt$ so
if we guess $Y_t = (1 - at)^{-p}$ then
$$dY_t/dt = \frac{ap}{(1 - at)^{p+1}}$$
Setting $a = 1/p$ and then $p = 1/(\delta - 1)$ to have $p + 1 = \delta p$ we get a solution
that explodes at time $1/a = p = 1/(\delta - 1)$.
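As a sanity check on Example 3.2, the following short Python sketch (function names are hypothetical, and $\delta = 1.5$ is an arbitrary sample value) uses the explicit solution $Y_t = (1 - at)^{-p}$ to compute the hitting times $T_n$ exactly, then compares the increments $T_n - T_{n-1}$ with the bound $n^{-\delta}$ and the limit with the explosion time $1/(\delta - 1)$.

```python
def hitting_time(n, delta):
    """Exact time at which the solution of X' = (1 + X)^delta, X_0 = 0, reaches level n.

    From 1 + X_t = (1 - a t)^(-p) with a = 1/p and p = 1/(delta - 1).
    """
    p = 1.0 / (delta - 1.0)
    return p * (1.0 - (1.0 + n) ** (-(delta - 1.0)))

def explosion_time(delta):
    """Limit of the hitting times as the level tends to infinity."""
    return 1.0 / (delta - 1.0)

delta = 1.5  # sample exponent, any delta > 1 works
increments = [hitting_time(n, delta) - hitting_time(n - 1, delta) for n in range(1, 51)]
bounds = [n ** (-delta) for n in range(1, 51)]
```

For this choice every increment sits below its bound, and `hitting_time(n, delta)` increases to `explosion_time(delta)` as `n` grows.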
Example 3.3. Suppose $b = 0$, and $\sigma = (1 + |x|)^\delta I$. (2.2) implies there is no
explosion when $\delta \le 1$. Later we will show (see Example 6.2 and (6.3)) that
in $d \ge 3$ explosion occurs for $\delta > 1$
in $d \le 2$ no explosion for any $\delta < \infty$
b. Yamada and Watanabe in d = 1
In one dimension one can get by with less smoothness in $\sigma$.
(3.3) Theorem. Suppose that (i) there is a strictly increasing continuous
function $\rho$ with $|\sigma(x) - \sigma(y)| \le \rho(|x - y|)$ where $\rho(0) = 0$ and for all $\epsilon > 0$
$$\int_0^\epsilon \rho^{-2}(u)\,du = \infty$$
(ii) there is a strictly increasing and concave function $\kappa$ with $|b(x) - b(y)| \le \kappa(|x - y|)$
where $\kappa(0) = 0$ and for all $\epsilon > 0$
$$\int_0^\epsilon \kappa^{-1}(u)\,du = \infty$$
Then pathwise uniqueness holds.
Remark. Note that there is no mention here of existence of solutions. That is
taken care of by a result in Section 8.4.
Proof The first step is to define a sequence $\varphi_n$ of smooth approximations of
$|x|$. Let $a_n \downarrow 0$ be defined by $a_0 = 1$ and
$$\int_{a_n}^{a_{n-1}} \rho^{-2}(u)\,du = n$$
Let $0 \le \psi_n(u) \le 2\rho^{-2}(u)/n$ be a continuous function with support in $(a_n, a_{n-1})$
and
$$\int_{a_n}^{a_{n-1}} \psi_n(u)\,du = 1$$
Since the upper bound in the definition of $\psi_n$ integrates to 2 over $(a_n, a_{n-1})$
such a function exists. Let
$$\varphi_n(x) = \int_0^{|x|} dy \int_0^y dz\,\psi_n(z)$$
By definition $\varphi_n(x) = \varphi_n(-x)$, and $\varphi_n'(x) = \int_0^x \psi_n(u)\,du$ for $x > 0$, so
$$|\varphi_n'(x)| \begin{cases} = 0 & \text{for } |x| \le a_n \\ \le 1 & \text{for } a_n \le |x| \le a_{n-1} \\ = 1 & \text{for } a_{n-1} \le |x| \end{cases}$$
and it follows that $\varphi_n(x) \uparrow |x|$ as $n \uparrow \infty$.
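For a concrete feel for the sequence $a_n$, consider the power moduli $\rho(u) = u^\gamma$ with $\gamma \ge 1/2$, the range in which the integral condition in (i) holds. The sketch below is an illustration added here (the function name and the choice of $\rho$ are assumptions): it solves the defining equation $\int_{a_n}^{a_{n-1}} \rho^{-2}(u)\,du = n$ in closed form, step by step.

```python
import math

def a_sequence(gamma, n_max):
    """Return [a_0, ..., a_{n_max}] for rho(u) = u**gamma with gamma >= 1/2."""
    seq = [1.0]                                    # a_0 = 1
    for n in range(1, n_max + 1):
        prev = seq[-1]
        if gamma == 0.5:
            # The integral of 1/u over (a_n, a_{n-1}) is log(a_{n-1}/a_n) = n.
            seq.append(prev * math.exp(-n))
        else:
            # With e = 1 - 2*gamma < 0 the integral equals
            # (a_n**e - a_{n-1}**e) / (2*gamma - 1) = n.
            e = 1.0 - 2.0 * gamma
            seq.append((prev ** e + (2.0 * gamma - 1.0) * n) ** (1.0 / e))
    return seq
```

For $\gamma = 1/2$ this gives $a_n = e^{-n(n+1)/2}$, and for $\gamma = 1$ it gives $a_n = 1/(1 + n(n+1)/2)$, so in both cases $a_n \downarrow 0$ as the proof requires.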
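Suppose now that $X^1_t$ and $X^2_t$ are two solutions with $X^1_0 = X^2_0 = x$ driven
by the same Brownian motion, and let $\Delta_t = X^1_t - X^2_t$.
$$\Delta_t = \int_0^t \{\sigma(X^1_s) - \sigma(X^2_s)\}\,dB_s + \int_0^t \{b(X^1_s) - b(X^2_s)\}\,ds$$
so Ito's formula gives
$$\begin{aligned}
\varphi_n(\Delta_t) &= \int_0^t \varphi_n'(\Delta_s)\,d\Delta_s + \frac{1}{2} \int_0^t \varphi_n''(\Delta_s)\,d\langle \Delta \rangle_s \\
&= \int_0^t \varphi_n'(\Delta_s)\{\sigma(X^1_s) - \sigma(X^2_s)\}\,dB_s + \int_0^t \varphi_n'(\Delta_s)\{b(X^1_s) - b(X^2_s)\}\,ds \\
&\qquad + \frac{1}{2} \int_0^t \varphi_n''(\Delta_s)\{\sigma(X^1_s) - \sigma(X^2_s)\}^2\,ds \\
&= I_0(t) + I_1(t) + I_2(t)
\end{aligned} \tag{a}$$
$I_0(t)$ is a local martingale so there are stopping times $T_m \uparrow \infty$ with
$$E I_0(t \wedge T_m) = 0 \tag{b}$$
To deal with $I_2$ in (a) we note that $\varphi_n''(x) = \psi_n(|x|) \le 2\rho^{-2}(|x|)/n$ when
$|x| \in (a_n, a_{n-1})$ and is 0 otherwise. So using (i)
$$|I_2(t)| \le \frac{1}{2} \int_0^t \frac{2\rho^{-2}(|\Delta_s|)}{n}\,\rho^2(|\Delta_s|)\,ds \le \frac{t}{n} \tag{c}$$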
For $I_1$ in (a) we observe $|\varphi_n'(x)| \le 1$, so using (ii) and the concavity of $\kappa$ we
have
$$E|I_1(t)| \le \int_0^t E|b(X^1_s) - b(X^2_s)|\,ds \le \int_0^t E\kappa(|\Delta_s|)\,ds \le \int_0^t \kappa(E|\Delta_s|)\,ds \tag{d}$$
Combining (a)-(d) we have
$$E\varphi_n(\Delta_{t \wedge T_m}) \le \frac{t}{n} + \int_0^t \kappa(E|\Delta_s|)\,ds$$
Letting $m \to \infty$ and using Fatou's lemma, then letting $n \to \infty$ and using the
monotone convergence theorem (recall $\varphi_n(x) \ge 0$ and $\varphi_n(x) \uparrow |x|$) we have
$$E|\Delta_t| \le \int_0^t \kappa(E|\Delta_s|)\,ds$$
To finish the proof now we prove a slight improvement of Gronwall's inequality.
(3.4) Lemma. Suppose $f(t) \le \int_0^t \kappa(f(s))\,ds$ where $f$ is continuous, $\kappa(x) > 0$ is
increasing for $x > 0$, and has $\int_0^\epsilon \kappa^{-1}(x)\,dx = \infty$ for any $\epsilon > 0$. Then $f(t) \equiv 0$.
Proof Let $g(t)$ be the solution of $g'(t) = \kappa(g(t))$ with $g(0) = \epsilon$. It follows
from the proof of (2.7) that $f(t) \le g(t)$. To show that $g$ is small when $\epsilon$ is small
note that $g$ is strictly increasing and let $h$ be its inverse. Since $g(h(x)) = x$, we
have $h'(x) = 1/g'(h(x)) = 1/\kappa(g(h(x))) = 1/\kappa(x)$ and hence if $a < b$
$$h(b) - h(a) = \int_a^b 1/\kappa(x)\,dx$$
In words the right-hand side gives the amount of time it takes $g$ to climb from
$a$ to $b$. Taking $a = \epsilon$ and $b = \delta > 0$, the assumed divergence of the integral
implies that the amount of time to climb to $\delta$ approaches $\infty$ as $\epsilon \to 0$ and the
desired result follows. □
c. Examples
We will now give examples to show that for bounded coefficients the
combination of (2.2) and (3.3) is fairly sharp. In the first case the counterexamples
will have to wait until Section 5.6.
Example 3.4. Consider $b(x) = 0$ and $\sigma(x) = |x|^\delta \wedge 1$. (2.2) and (3.3) imply
that
$$\text{pathwise uniqueness holds for } \begin{cases} \delta \ge 1/2 & \text{in } d = 1 \\ \delta \ge 1 & \text{in } d > 1 \end{cases}$$
Example 6.1 will show
$$\text{pathwise uniqueness fails if } \begin{cases} \delta < 1/2 & \text{in } d = 1 \\ \delta < 1 & \text{in } d > 1 \end{cases}$$
Example 3.5. Consider $\sigma(x) = 0$, $b(x) = |x|^\delta \wedge 1$, and $X_0 = 0$. If $\delta \ge 1$ then
(2.2) implies that there is a unique solution: $X_t \equiv 0$. However, if $\delta < 1$ there
are others. To find one, we guess $X_t = Ct^p$. Then
$$dX_s = Cps^{p-1}\,ds \qquad b(X_s) = C^\delta s^{p\delta}$$
so to make $dX_s = b(X_s)\,ds$ we set
$$(p - 1) = p\delta \quad \text{i.e., } p = 1/(1 - \delta)$$
$$Cp = C^\delta \quad \text{i.e., } C = p^{-1/(1-\delta)}$$
If $\delta < 1$ then $p > 0$ and we have created a second solution starting at 0. Once
we have two solutions we can construct infinitely many. Let $a > 0$ and let
$$X^a_t = \begin{cases} 0 & t \le a \\ C(t - a)^p & t \ge a \end{cases}$$
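The computation in Example 3.5 is easy to check numerically. The sketch below (hypothetical helper names; the step `h` and the sample values of $\delta$ and $a$ are arbitrary choices) builds the delayed solutions $X^a_t$ and compares a central-difference derivative with the drift $b(X_t) = |X_t|^\delta \wedge 1$. It is only meaningful while $X^a_t \le 1$, where the cap in $b$ is inactive.

```python
def power_solution(delta):
    """Return (C, p) such that X_t = C * t**p solves dX = |X|**delta dt, for 0 < delta < 1."""
    p = 1.0 / (1.0 - delta)
    C = p ** (-1.0 / (1.0 - delta))
    return C, p

def delayed_solution(t, a, delta):
    """The solution that sits at 0 until time a and then follows C * (t - a)**p."""
    C, p = power_solution(delta)
    return 0.0 if t <= a else C * (t - a) ** p

def residual(t, a, delta, h=1e-6):
    """Central-difference derivative of the delayed solution minus the drift at X_t."""
    deriv = (delayed_solution(t + h, a, delta) - delayed_solution(t - h, a, delta)) / (2.0 * h)
    drift = min(abs(delayed_solution(t, a, delta)) ** delta, 1.0)
    return deriv - drift
```

For $\delta = 1/2$ one gets $p = 2$ and $C = 1/4$, so $X_t = t^2/4$ with $dX_t/dt = t/2 = \sqrt{X_t}$, and the residual vanishes up to rounding for every delay $a$.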
Exercise 3.2. The main reason for interest in (3.3) is the weakening of Lipschitz
continuity of $\sigma$. However, the condition on $b$ is also an improvement, and
further sharpens the line between theorems and counterexamples. (a) Show
that (ii) in (3.3) holds when $C > 0$ and
$$\kappa(x) = \begin{cases} Cx \log(1/x) & \text{for } x \le e^{-2} \\ C(e^{-2} + x) & \text{for } x \ge e^{-2} \end{cases}$$
(b) Consider $g(t) = \exp(-1/t^p)$ with $p > 0$ and show that $g'(t) = \psi_p(g(t))$
where $\psi_p(y) = py\{\log(1/y)\}^{(p+1)/p}$.
5.4. Weak Solutions
Intuitively, a strong solution corresponds to solving the SDE for a given Brownian
motion, while in producing a weak solution we are allowed to construct
the Brownian motion and the solution at the same time. In signal processing
applications one must deal with the noisy signal that one is given, so strong
solutions are required. However, if the aim is to construct diffusion processes
then there is nothing wrong with a weak solution.
Now if we do not insist on staying on the original space $(\Omega, \mathcal{F}, P)$ then we
can use the map $\omega \to (X_t(\omega), B_t(\omega))$ to move from the original space $(\Omega, \mathcal{F})$
to $(C \times C, \mathcal{C} \times \mathcal{C})$ where we can use the filtration generated by the coordinates
$\omega_1(s), \omega_2(s)$, $s \le t$. Thus a weak solution is completely specified by giving the
joint distribution $(X_t, B_t)$.
Reflecting our interest in constructing the process $X_t$ and sneaking in some
notation we will need in a minute, we say that there is uniqueness in
distribution if whenever $(X, B)$ and $(X', B')$ are solutions of SDE($b$, $\sigma$) with $X_0 = X'_0 = x$
then $X$ and $X'$ have the same distribution, i.e., $P(X \in A) = P(X' \in A)$
for all $A \in \mathcal{C}$. Here a solution of SDE($b$, $\sigma$) is something that satisfies ($*$) in the
sense defined in Section 5.2.
Example 4.1. Since $\mathrm{sgn}(x)^2 = 1$, any solution to the equation in Example
2.1 is a local martingale with $\langle X \rangle_t = t$. So Levy's theorem, (4.1) in Chapter
3, implies that $X_t$ is a Brownian motion. The last result asserts that
uniqueness in distribution holds, but (2.1) shows that pathwise uniqueness fails. The
next result, also due to Yamada and Watanabe (1971), shows that the other
implication is true.
(4.1) Theorem. If pathwise uniqueness holds then there is uniqueness in
distribution.
Remark. Pathwise uniqueness also implies that every solution is a strong
solution. However that proof is more complicated and the conclusion is not
important for our purposes so we refer the reader to Revuz and Yor (1991),
p. 340-341.
Proof Suppose $(X^1, B^1)$ and $(X^2, B^2)$ are two solutions with $X^1_0 = X^2_0 = x$.
(As remarked above the solutions are specified by giving the joint distributions
$(X^i, B^i)$.) Since $(C, \mathcal{C})$ is a complete separable metric space, we can find regular
conditional distributions $Q^i(\omega, A)$ defined for $\omega \in C$ and $A \in \mathcal{C}$ (see Section 4.1
in Durrett (1995)) so that
$$P(X^i \in A \mid B^i) = Q^i(B^i, A)$$
Let $P_0$ be the measure on $(C, \mathcal{C})$ for the standard Brownian motion and define
a measure $\pi$ on $(C^3, \mathcal{C}^3)$ by
$$\pi(A_0 \times A_1 \times A_2) = \int_{A_0} dP_0(\omega_0)\,Q^1(\omega_0, A_1)\,Q^2(\omega_0, A_2)$$
If we let $Y^i_t(\omega_0, \omega_1, \omega_2) = \omega_i(t)$ then it is clear that
$$(X^1, B^1) \overset{d}{=} (Y^1, Y^0) \quad \text{and} \quad (X^2, B^2) \overset{d}{=} (Y^2, Y^0)$$
Take $A_2 = C$ or $A_1 = C$ respectively in the definition. Thus, we have two
solutions of the equation driven by the same Brownian motion. Pathwise uniqueness
implies $Y^1 = Y^2$ with probability one. (4.1) follows since
$$X^1 \overset{d}{=} Y^1 = Y^2 \overset{d}{=} X^2 \qquad \Box$$
Let $a(x) = \sigma\sigma^T(x)$ and recall from Section 5.1 that
$$\langle X^i, X^j \rangle_t = \int_0^t a_{ij}(X_s)\,ds$$
We say that $X$ is a solution to the martingale problem for $b$ and $a$, or
simply $X$ solves MP($b$, $a$), if for each $i$ and $j$,
$$X^i_t - \int_0^t b_i(X_s)\,ds \quad \text{and} \quad X^i_t X^j_t - \int_0^t a_{ij}(X_s)\,ds$$
are local martingales. The second condition is, of course, equivalent to
$$\langle X^i, X^j \rangle_t = \int_0^t a_{ij}(X_s)\,ds$$
To be precise in the definition above we need to specify a filtration $\mathcal{F}_t$ for the
local martingales. In posing the martingale problem we are only interested in
the process $X_t$, so we can assume without loss of generality that the underlying
space is $(C, \mathcal{C})$ with $X_t(\omega) = \omega_t$ and the filtration is the one generated by $X_t$.
In the formulation above $b$ and $a = \sigma\sigma^T$ are the basic data for the
martingale problem. This presents us with the problem of obtaining $\sigma$ from $a$.
To solve this problem we begin by introducing some properties of $a$. Since
$\langle X^i, X^j \rangle_t = \langle X^j, X^i \rangle_t$, $a$ is symmetric, that is, $a_{ij} = a_{ji}$. Also, if $z \in \mathbf{R}^d$ then
$$\sum_{i,j} z_i a_{ij} z_j = \sum_{i,j,k} z_i \sigma_{ik} \sigma_{jk} z_j = \sum_k \Big( \sum_i z_i \sigma_{ik} \Big)^2 \ge 0$$
So $a$ is nonnegative definite and linear algebra provides us with the following
useful information.
(4.2) Lemma. For any symmetric nonnegative definite matrix $a$, we can find
an orthogonal matrix $U$ (i.e., its rows are perpendicular vectors of length 1)
and a diagonal matrix $\Lambda$ with diagonal entries $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_d \ge 0$ so that
$a = U^T \Lambda U$. $a$ is invertible if and only if $\lambda_d > 0$.
(4.2) tells us $a = U^T \Lambda U$, so if we want $a = \sigma\sigma^T$ we can take $\sigma = U^T\sqrt{\Lambda}$,
where $\sqrt{\Lambda}$ is the diagonal matrix with entries $\sqrt{\lambda_i}$. (4.2) tells us how to find
the square root of a given $a$. The usual algorithms for finding $U$ are such that
if we apply them to a measurable $a(x)$ then the resulting square root $\sigma(x)$ will
also be measurable. It takes more sophistication to start with a smooth $a(x)$
and construct a smooth $\sigma(x)$. As in the case of (4.2), we will content ourselves
to just state the result. For detailed proofs of (4.3) and (4.4) see Stroock and
Varadhan (1979), Section 5.2.
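The recipe $\sigma = U^T\sqrt{\Lambda}$ can be carried out by hand when $d = 2$. The sketch below is an illustration added here, restricted to symmetric nonnegative definite $2 \times 2$ matrices and using hypothetical function names; it diagonalizes $a$ with a rotation angle and returns a $\sigma$ with $\sigma\sigma^T = a$. For larger $d$ one would normally call a numerical eigenvalue routine instead.

```python
import math

def sqrt_psd_2x2(a):
    """Return sigma = U^T sqrt(Lambda) for a symmetric nonnegative definite 2x2 matrix a."""
    (p, q), (_, r) = a
    mean = 0.5 * (p + r)
    radius = math.hypot(0.5 * (p - r), q)
    lam1 = mean + radius
    lam2 = max(mean - radius, 0.0)            # guard against tiny negative rounding
    # Unit eigenvector for lam1 is (cos theta, sin theta); the other one is orthogonal.
    theta = 0.5 * math.atan2(2.0 * q, p - r)
    u1 = (math.cos(theta), math.sin(theta))
    u2 = (-math.sin(theta), math.cos(theta))
    s1, s2 = math.sqrt(lam1), math.sqrt(lam2)
    # Rows of U are u1 and u2, so the columns of U^T are u1 and u2, scaled by sqrt(lambda).
    return [[u1[0] * s1, u2[0] * s2],
            [u1[1] * s1, u2[1] * s2]]

def times_transpose(s):
    """Compute s s^T for a 2x2 matrix given as nested lists."""
    return [[sum(s[i][k] * s[j][k] for k in range(2)) for j in range(2)] for i in range(2)]
```

For example, `times_transpose(sqrt_psd_2x2([[2.0, 1.0], [1.0, 2.0]]))` reproduces the input up to rounding, and the same holds for the degenerate matrix `[[1.0, 1.0], [1.0, 1.0]]`, whose smaller eigenvalue is 0.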
(4.3) Theorem. If $\theta^T a(x) \theta \ge \alpha|\theta|^2$ and $\|a(x) - a(y)\| \le C|x - y|$ for all $x, y$
then $\|a^{1/2}(x) - a^{1/2}(y)\| \le (C/2\alpha^{1/2})|x - y|$ for all $x, y$.
When $a$ can degenerate we need to suppose more smoothness to get a Lipschitz
continuous square root.
(4.4) Theorem. Suppose $a_{ij} \in C^2$ and
$$\max_i |\theta^T D_{ii} a(x) \theta| \le C|\theta|^2$$
then $\|a^{1/2}(x) - a^{1/2}(y)\| \le d(2C)^{1/2}|x - y|$ for all $x, y$.
To see why we need $a \in C^2$ to get a Lipschitz $a^{1/2}$, consider $d = 1$ and $a(x) = |x|^\lambda$.
In this case $\sigma(x) = |x|^{\lambda/2}$ is Lipschitz continuous at 0 if and only if $\lambda \ge 2$.
Returning to probability theory, our first step is to make the connection
between solutions to the martingale problem and solutions of the SDE.
(4.5) Theorem. Let $X$ be a solution to MP($b$, $a$) and $\sigma$ a measurable square
root of $a$. Then there is a Brownian motion $B_t$, possibly defined on an
enlargement of the original probability space, so that $(X, B)$ solves SDE($b$, $\sigma$).
Proof If $\sigma$ is invertible at each $x$ this is easy. Let $Y^i_t = X^i_t - \int_0^t b_i(X_s)\,ds$
and let
$$B^i_t = \sum_j \int_0^t \sigma^{-1}_{ij}(X_s)\,dY^j_s \tag{a}$$
From the last two definitions and the associative law it is immediate that
$$\int_0^t \sigma(X_s)\,dB_s = Y_t - Y_0 = X_t - X_0 - \int_0^t b(X_s)\,ds \tag{b}$$
so ($*$) holds. The $B^i_t$ are local martingales with $\langle B^i, B^j \rangle_t = \delta_{ij}t$ so Exercise 4.1
in Chapter 3 implies that $B$ is a Brownian motion.
To deal with the general case in one dimension, we let
$$I_s = \begin{cases} 1 & \text{if } \sigma(X_s) \ne 0 \\ 0 & \text{otherwise} \end{cases}$$
let $J_s = 1 - I_s$, let $W_s$ be an independent Brownian motion (to define this we
may need to enlarge the probability space) and let
$$B_t = \int_0^t \frac{I_s}{\sigma(X_s)}\,dY_s + \int_0^t J_s\,dW_s \tag{a'}$$
The two integrals are well defined since their variance processes are $\le t$. $B_t$
is a local martingale with $\langle B \rangle_t = t$ so (4.1) in Chapter 3 implies that $B_t$ is a
Brownian motion.
To extract the SDE now, we observe that the associative law and the fact
that $J_s\sigma(X_s) = 0$ imply
$$\int_0^t \sigma(X_s)\,dB_s = \int_0^t I_s\,dY_s = Y_t - Y_0 - \int_0^t J_s\,dY_s$$
In view of (b) the proof will be complete when we show $\int_0^t J_s\,dY_s = 0$. To do
this we note that
$$E\Big( \int_0^t J_s\,dY_s \Big)^2 = E \int_0^t J_s^2\,d\langle Y \rangle_s = E \int_0^t J_s^2\,a(X_s)\,ds = 0$$
To handle higher dimensions we let $U(x)$ be the orthogonal matrices
constructed in (4.2) and let
$$Z_t = \int_0^t U(X_s)\,dY_s$$
Introducing Kronecker's $\delta_{ij} = 1$ if $i = j$ and 0 otherwise, we have
$$\langle Z^i, Z^j \rangle_t = \int_0^t \sum_{k,l} U_{ik}(X_s)\,a_{kl}(X_s)\,U^T_{lj}(X_s)\,ds = \int_0^t \delta_{ij}\lambda_i(X_s)\,ds$$
Our result for one dimension implies that the $Z^i_t$ may be realized as stochastic
integrals with respect to independent Brownian motions. Tracing back through
the definitions gives the desired representation. For more details see Ikeda and
Watanabe (1981), p. 89-91, Revuz and Yor (1991), p. 190-191, or Karatzas and
Shreve (1991), p. 315-317. □
(4.5) shows that there is a 1-1 correspondence between distributions of
solutions of the SDE and solutions of the martingale problem, which by definition
are distributions on (C, C). Thus there is uniqueness in distribution for the SDE
if and only if the martingale problem has a unique solution. Our final
topic is an important reason to be interested in uniqueness.
(4.6) Theorem. Suppose $a$ and $b$ are locally bounded functions and MP($b$, $a$)
has a unique solution. Then the strong Markov property holds.
Proof For each $x \in \mathbf{R}^d$ let $P_x$ be the probability measure on $(C, \mathcal{C})$ that gives
the distribution of the unique solution starting from $X_0 = x$. If you talk fast
the proof is easy. If $T$ is a stopping time then the conditional distribution of
$\{X_{T+s}, s \ge 0\}$ given $\mathcal{F}_T$ is a solution of the martingale problem starting from
$X_T$ and so by uniqueness it must be $P_{X(T)}$. Thus, if we let $\theta_T$ be the random
shift defined in Section 1.3, then for any bounded measurable $Y : C \to \mathbf{R}$ we
have the strong Markov property
$$E_x(Y \circ \theta_T \mid \mathcal{F}_T) = E_{X(T)} Y \tag{4.7}$$
To begin to turn the last paragraph into a proof, fix a starting point $X_0 = x_0$.
To simplify this rather complicated argument, we will follow the standard
practice (see Stroock and Varadhan (1979), p. 145-146, Karatzas and Shreve
(1991), p. 321-322, or Rogers and Williams (1987), p. 162-163) of giving the
details only for the case in which the coefficients are bounded and hence
$$M^{i0}_t = X^i_t - \int_0^t b_i(X_s)\,ds$$
$$M^{ij}_t = X^i_t X^j_t - \int_0^t a_{ij}(X_s)\,ds$$
are martingales. We have added an extra index in the first case so we can refer
to all the martingales at once by saying "the $M^{ij}_t$."
Consider a bounded stopping time $T$ and let $Q(\omega, A)$ be a regular
conditional distribution for $\{X_{T+t}, t \ge 0\}$ given $\mathcal{F}_T$. We want to claim that
(4.8) Lemma. For $P_{x_0}$ a.e. $\omega$, all the
$$M^{ij}_{T+t} - M^{ij}_T = M^{ij}_t \circ \theta_T$$
are martingales under $Q(\omega, \cdot)$.
To prove this we consider rational $s < t$ and $B \in \mathcal{F}_s$, and note that the optional
stopping theorem implies that if $A \in \mathcal{F}_T$ then
$$E_{x_0}\big( [(M^{ij}_t - M^{ij}_s) 1_B] \circ \theta_T \cdot 1_A \big) = 0$$
Letting $B$ run through a countable collection that is closed under intersection
and generates $\mathcal{F}_s$, it follows that $P_{x_0}$ a.s., the expected value of $(M^{ij}_t - M^{ij}_s)1_B$
under $Q(\omega, \cdot)$ is 0, which implies (4.8). Using uniqueness now gives (4.7) and completes the
proof. □
5.5. Change of Measure
Our next step is to use Girsanov's formula from Section 2.12 to solve martingale
problems or more precisely to change the drift in an existing solution. To
accommodate Example 5.2 below we must consider the case in which the added
drift $b$ depends on time as well as space.
(5.1) Theorem. Suppose $X_t$ is a solution of MP($\beta$, $a$), which, for concreteness
and without loss of generality, we suppose is defined on the canonical probability
space $(C, \mathcal{C}, P)$ with $\mathcal{F}_t$ the filtration generated by $X_t(\omega) = \omega_t$. Suppose $a^{-1}(x)$
exists for all $x$, and that $b(s, x)$ is measurable and has $|b^T a^{-1} b(s, x)| \le M$. We
can define a probability measure $Q$ locally equivalent to $P$, so that under $Q$,
$X_t$ is a solution to MP($\beta + b$, $a$).
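Proof Let $\bar X_t = X_t - \int_0^t \beta(X_s)\,ds$, let $c(s, x) = a^{-1}(x)b(s, x)$ and let
$$Y_t = \int_0^t c(s, X_s) \cdot d\bar X_s$$
which exists since
$$\langle Y \rangle_t = \sum_{i,j} \int_0^t c_i(s, X_s)\,c_j(s, X_s)\,d\langle \bar X^i, \bar X^j \rangle_s = \int_0^t c^T a c(s, X_s)\,ds = \int_0^t b^T a^{-1} b(s, X_s)\,ds \le Mt$$
by our assumption that $|b^T a^{-1} b(s, x)| \le M$. Letting $Q_t$ and $P_t$ be the
restrictions of $Q$ and $P$ to $\mathcal{F}_t$ we define our Radon-Nikodym derivative to be
$$\frac{dQ_t}{dP_t} = \alpha_t = \exp\big( Y_t - \tfrac{1}{2}\langle Y \rangle_t \big)$$
Since $\langle Y \rangle_t \le Mt$, (3.7) in Chapter 3 implies that $\alpha_t$ is a martingale, and
invoking (12.3) in Chapter 2 we have defined $Q$.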
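To apply Girsanov's formula ((12.4) from Chapter 2), we have to compute
$A^j_t = \int_0^t \alpha_s^{-1}\,d\langle \alpha, \bar X^j \rangle_s$. Ito's formula and the associative law imply that
$$\alpha_t - 1 = \int_0^t \alpha_s\,dY_s = \int_0^t \alpha_s c(s, X_s) \cdot d\bar X_s$$
So by the formula for the covariance of stochastic integrals
$$\langle \alpha, \bar X^j \rangle_t = \sum_i \int_0^t \alpha_s c_i(s, X_s)\,d\langle \bar X^i, \bar X^j \rangle_s = \int_0^t \alpha_s (c^T a)_j(s, X_s)\,ds = \int_0^t \alpha_s b_j(s, X_s)\,ds$$
since $c(s, x) = a^{-1}(x)b(s, x)$. It follows that
$$A^j_t = \int_0^t \alpha_s^{-1}\,d\langle \alpha, \bar X^j \rangle_s = \int_0^t b_j(s, X_s)\,ds$$
Using Girsanov's formula now, it follows that
$$\bar X^j_t - \int_0^t b_j(s, X_s)\,ds = X^j_t - \int_0^t \beta_j(X_s) + b_j(s, X_s)\,ds$$
is a local martingale/$Q$.
This is half of the desired conclusion but the other half is easy. We note
that under $P$
$$\langle X^i, X^j \rangle_t = \int_0^t a_{ij}(X_s)\,ds$$
and (12.6) in Chapter 2 implies that the covariance of $X^i$ and $X^j$ is the same
under $Q$. □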
Exercise 5.1. To check your understanding of the formulas, consider the case
in which $b$ does not depend on $s$, start with the $Q$ constructed above, change
to a measure $R$ to remove the added drift $b$, and show that $dR_t/dQ_t = 1/\alpha_t$ so
$R = P$.
Example 5.1. Suppose $X_t$ is a solution of MP(0, $I$), i.e., a standard Brownian
motion and consider the special situation in which $b(x) = \nabla U(x)$. In this case
recalling the definition of $Y_t$ in the proof of (5.1) and applying Ito's formula to
$U(X_t)$ we have
$$Y_t = \int_0^t \nabla U(X_s) \cdot dX_s = U(X_t) - U(X_0) - \frac{1}{2} \int_0^t \Delta U(X_s)\,ds$$
Thus, we can get rid of the stochastic integral in the definition of the Radon-Nikodym
derivative:
$$\frac{dQ_t}{dP_t} = \exp\Big( U(X_t) - U(X_0) - \frac{1}{2} \int_0^t \Delta U(X_s) + |\nabla U(X_s)|^2\,ds \Big)$$
In the special case of the Ornstein-Uhlenbeck process, $b(x) = -2\alpha x$, where $\alpha$
is a positive real number, we have $U(x) = -\alpha|x|^2$, $\Delta U(x) = -2d\alpha$, and the
expression above simplifies to
$$\frac{dQ_t}{dP_t} = \exp\Big( -\alpha|X_t|^2 + \alpha|X_0|^2 + d\alpha t - \int_0^t 2\alpha^2|X_s|^2\,ds \Big)$$
In the trivial case of constant drift, i.e., $b(x) = \mu = \nabla U(x)$ we have $U(x) = \mu \cdot x$
and $\Delta U(x) = 0$ so
$$\frac{dQ_t}{dP_t} = \exp\big( \mu \cdot X_t - \mu \cdot X_0 - \tfrac{1}{2}|\mu|^2 t \big)$$
It is comforting to note that when $X_0 = x$, the one dimensional density functions
satisfy
$$Q_t(X_t = y) = \exp\big( \mu \cdot y - \mu \cdot x - \tfrac{1}{2}|\mu|^2 t \big) P_t(X_t = y) = (2\pi t)^{-d/2} \exp\big( -|y - x - \mu t|^2/2t \big)$$
as it should be since under $Q$, $X_t$ has the same distribution as $B_t + \mu t$. □
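The constant-drift identity can be confirmed numerically in $d = 1$. The sketch below (an added illustration with hypothetical function names and arbitrary sample values) multiplies the Brownian transition density by the Radon-Nikodym factor $\exp(\mu y - \mu x - \mu^2 t/2)$ and compares the result with the density of a normal with mean $x + \mu t$ and variance $t$.

```python
import math

def normal_density(y, mean, var):
    """Density at y of a one dimensional normal with the given mean and variance."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def tilted_density(y, x, mu, t):
    """Radon-Nikodym factor for constant drift mu times the Brownian transition density."""
    return math.exp(mu * y - mu * x - 0.5 * mu * mu * t) * normal_density(y, x, t)

def max_discrepancy(x, mu, t, ys):
    """Largest gap over the grid ys between the tilted density and the drifted normal density."""
    return max(abs(tilted_density(y, x, mu, t) - normal_density(y, x + mu * t, t)) for y in ys)

grid = [-5.0 + 0.1 * k for k in range(101)]      # sample points in [-5, 5]
gap = max_discrepancy(0.3, 1.2, 2.0, grid)
```

The two densities agree up to floating-point rounding because completing the square gives $-(y-x)^2/2t + \mu(y - x) - \mu^2 t/2 = -(y - x - \mu t)^2/2t$.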
(5.1) deals with existence of solutions, our next result with uniqueness.
(5.2) Theorem. Under the hypotheses of (5.1) there is a 1-1 correspondence
between solutions of MP($\beta$, $a$) and MP($\beta + b$, $a$).
Proof Let $P_1$ and $P_2$ be solutions of MP($\beta$, $a$). By change of measure we can
produce solutions $Q_1$ and $Q_2$ of MP($\beta + b$, $a$). We claim that if $P_1 \ne P_2$ then
$Q_1 \ne Q_2$. To prove this we consider two cases:
Case 1. Suppose $P_1$ and $P_2$ are not locally equivalent measures. Then since
$Q_i$ is locally equivalent to $P_i$, $Q_1$ and $Q_2$ are not locally equivalent and cannot
be equal.
Case 2. Suppose $P_1$ and $P_2$ are locally equivalent. In the definition of $\alpha_t$, $\langle Y \rangle_t$
and $Y_t$ are independent of $P_i$ by (12.6) and (12.7) in Chapter 2, so $dQ_1/dP_1 = dQ_2/dP_2$
and $Q_1 \ne Q_2$. □
One can considerably relax the boundedness condition in (5.1). The next
result will cover many examples. If it does not cover yours, note that the main
ingredients of the proof are that the conditions of (5.1) hold locally, and we
know that solutions of MP($b$, $a$) do not explode.
(5.3) Theorem. Suppose $X$ is a solution of MP(0, $a$) constructed on $(C, \mathcal{C})$.
Suppose that $a^{-1}(x)$ exists, $b^T a^{-1} b(x)$ is locally bounded and measurable, and
the quadratic growth condition from (3.2) holds. That is,
$$\sum_{i=1}^d \{2x_i b_i(x) + a_{ii}(x)\} \le A(1 + |x|^2)$$
Then there is a 1-1 correspondence between solutions to MP($b$, $a$) and MP(0, $a$).
Remark. Combining (5.3) with (3.1) and (4.1) we see that if in addition to the
conditions in (5.3) we have $a = \sigma\sigma^T$ where $\sigma$ is locally Lipschitz, then MP($b$, $a$)
has a unique solution. By using (3.3) instead of (3.1) in the previous sentence
we can get a better result in one dimension.
Proof Let $Y_t$ and $\alpha_t$ be as in the proof of (5.1) and let $T_n = \inf\{t : |X_t| \ge n\}$.
From the proof of (5.1) it follows that $\alpha(t \wedge T_n)$ is a martingale. So if we let
$P^n_t$ be the restriction of $P$ to $\mathcal{F}_{t \wedge T_n}$ and let $dQ^n_t/dP^n_t = \alpha(t \wedge T_n)$ then under
$Q^n$, $X_t$ is a solution of MP($b$, $a$) up to time $T_n$.
The quadratic growth condition implies that solutions of MP($b$, $a$) do not
explode. We will now show that in the absence of explosions, $\alpha_t$ is a martingale.
First, to show $\alpha_t$ is integrable, we note that Fatou's lemma implies
$$E\alpha_t \le \liminf_{n \to \infty} E(\alpha(t \wedge T_n)) = 1 \tag{a}$$
where $E$ denotes expected value with respect to $P$. Next we note that
$$E|\alpha_t - \alpha_{t \wedge T_n}| = E(|\alpha_t - \alpha(T_n)|; T_n \le t) \le E(\alpha(T_n); T_n \le t) + E\alpha_t - E(\alpha_t; T_n > t) \tag{b}$$
The absence of explosions implies that as $n \to \infty$
$$E(\alpha(T_n); T_n \le t) = Q^n(T_n \le t) \to 0 \tag{c}$$
This and the fact that $\alpha(t \wedge T_n)$ is a martingale implies
$$E(\alpha_t; T_n > t) = 1 - E(\alpha(T_n); T_n \le t) \to 1 \tag{d}$$
as $n \to \infty$. (b), (c), and (d) imply that $\alpha(t \wedge T_n) \to \alpha_t$ in $L^1$. It follows (see
e.g., (5.6) in Chapter 4 of Durrett (1995)) that
$$\alpha(t \wedge T_n) = E(\alpha_t \mid \mathcal{F}_{t \wedge T_n})$$
and from this that $\alpha_s$, $0 \le s \le t$ is a martingale. The rest of the proof is the
same as that of (5.1) and (5.2). □
Our last chore in this section is to complete the proof of (2.10) in Chapter
1 by showing
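(5.4) Theorem. Let $B_t$ be a Brownian motion starting at 0, let $g$ be a
continuous function with $g(0) = 0$, and let $\epsilon > 0$. Then
$$P\Big( \sup_{0 \le t \le 1} |B_t - g(t)| < \epsilon \Big) > 0$$
Proof We can find a $C^1$ function $h$ with $h(0) = 0$ and $|h(s) - g(s)| < \epsilon/2$
for $s \le t$, so we can suppose without loss of generality that $g \in C^1$ and let
$b_s = g'(s)$. If we let $Y_t = \int_0^t b_s \cdot dB_s$ and define $Q_t$ by
$$\frac{dQ_t}{dP_t} = \exp\Big( \int_0^t b_s \cdot dB_s - \frac{1}{2} \int_0^t |b_s|^2\,ds \Big)$$
then it follows from (5.1) that under $Q$, $B_t - g(t)$ is a standard Brownian motion.
Let $G = \{|B_s - g(s)| < \epsilon \text{ for } s \le t\}$. Exercise 2.11 in Chapter 1 implies that
$Q_t(G) > 0$. It follows from the Cauchy-Schwarz inequality that
$$Q_t(G) = \int \frac{dQ_t}{dP_t} 1_G\,dP_t \le \Big( \int \Big( \frac{dQ_t}{dP_t} \Big)^2 dP_t \Big)^{1/2} P_t(G)^{1/2}$$
To complete the proof now we let $E$ denote expectation with respect to $P_t$ and
observe that if $|b_s| \le M$ for $s \le t$ then
$$E\Big( \frac{dQ_t}{dP_t} \Big)^2 = E \exp\Big( \int_0^t 2b_s \cdot dB_s - \int_0^t |b_s|^2\,ds \Big) = \exp\Big( \int_0^t |b_s|^2\,ds \Big) \le e^{tM^2} < \infty$$
since $\exp\big( \int_0^t 2b_s \cdot dB_s - (1/2) \int_0^t |2b_s|^2\,ds \big)$ is a martingale and hence has
expected value 1. For this step it is important to remember that $b_s = g'(s)$ is a
non-random vector. □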
5.6. Change of Time
In the last section we learned that we could alter the $b$ in the SDE by changing
the measure. However, we could not alter $\sigma$ with this technique because the
quadratic variation is not affected by absolutely continuous change of measure.
In this section we will introduce a new technique, change of time, that will allow
us to alter the diffusion coefficient. To prepare for developments in Section 6.1
we will allow for the possibility that our processes are defined on a random time
interval.
(6.1) Theorem. Let Xt be a solution of MP(6, a) for i < ¢. That is, for each i
and j,
X\- f bi(Xs)ds and X\X{ - j aij(Xs)ds
Jo Jo
are local martingales on [0,C). Let g be a positive function and suppose that
cr* = / g(Xs) ds < oo for alii < C
Jo
Define the inverse of a by js = inf{i : 0* > s or i > ¢) and let Y, = X(j,) for
s < £ = (T£. Then Y, is a solution of MP(b/g, a/g) for s < £.
Proof We begin by noting that $X_t = Y(\sigma_t)$, then in the second step change variables $r = \sigma_s$, $dr = g(X_s)\,ds$, $X_s = Y(\sigma_s) = Y_r$, to get
$$X_t^i - \int_0^t b_i(X_s)\,ds = Y^i_{\sigma_t} - \int_0^t b_i(Y_{\sigma_s})\,ds = Y^i_{\sigma_t} - \int_0^{\sigma_t} b_i(Y_r)/g(Y_r)\,dr$$
Changing variables $t = \gamma_u$ gives
$$Y_u^i - \int_0^u b_i(Y_r)/g(Y_r)\,dr = X^i_{\gamma_u} - \int_0^{\gamma_u} b_i(X_s)\,ds$$
Since the $\gamma_u$ are stopping times, the right-hand side is a local martingale on $[0,\xi)$ and we have checked one of the two conditions to be a solution of $MP(b/g, a/g)$.
Then MP(0,a) is well posed, i.e., uniqueness in distribution holds and there
is no explosion.
Proof We start with $X_t = B_t$, a Brownian motion, which solves $MP(0,1)$ and take $g(x) = a(x)^{-1}$. Since $g(B_t) \le \epsilon_R^{-1}$ for $t \le T_R = \inf\{t : |B_t| > R\}$ we have $\int_0^t g(B_s)\,ds < \infty$ for any $t$. On the other hand
$$\int_0^\infty g(B_s)\,ds = \infty$$
since Brownian motion is recurrent, so applying (6.1) we have a solution of $MP(0,a)$. Uniqueness follows from (6.2) so the proof is complete. □
Combining (6.3) with (5.2) we can construct solutions of $MP(b,a)$ in one dimension for very general $b$ and $a$.
(6.4) Theorem. Consider dimension $d = 1$ and suppose
(i) $0 < \epsilon_R \le a(y) \le C_R < \infty$ when $|y| \le R$
(ii) $b(x)$ is locally bounded
(iii) $x \cdot b(x) + a(x) \le A(1 + x^2)$
Then $MP(b,a)$ is well posed.
6 One Dimensional Diffusions
6.1. Construction
In this section we will take an approach to constructing solutions of
$$(*)\qquad dX_t = b(X_t)\,dt + \sigma(X_t)\,dB_t$$
that is special to one dimension, but that will allow us to obtain a very detailed understanding of this case. To obtain results with a minimum of fuss and in a generality that encompasses most applications we will assume:
(ID) $b$ and $\sigma$ are continuous and $a(x) = \sigma^2(x) > 0$ for all $x$
The purpose of this section is to answer the following questions: Does a solution exist when (ID) holds? Is it unique? Does it explode? Our approach may be outlined as follows:
(i) We define a function $\varphi$ so that if $X_t$ is a solution of the SDE on $[0,\zeta)$ then $Y_t = \varphi(X_t)$ is a local martingale on $[0,\zeta)$.
(ii) Our $Y_t$ has $\langle Y\rangle_t = \int_0^t h(Y_s)\,ds$ so construct $Y_t$ to be a solution of $MP(0,h)$ by time changing a Brownian motion.
(iii) We define $X_t = \varphi^{-1}(Y_t)$ and check that $X_t$ is a solution of the SDE.
To begin to carry out our plan, suppose that $X_t$ is a solution of $MP(b,a)$ for $t < \zeta$. If $f \in C^2$, Ito's formula implies that for $t < \zeta$
$$(1.1)\qquad f(X_t) - f(X_0) = \int_0^t f'(X_s)\,dX_s + \frac12\int_0^t f''(X_s)\,d\langle X\rangle_s = \text{local mart.} + \int_0^t Lf(X_s)\,ds$$
where
$$(1.2)\qquad Lf(x) = \frac12 a(x)f''(x) + b(x)f'(x)$$
and $a(x) = \sigma^2(x)$. From this we see that $f(X_t)$ is a local martingale on $[0,\zeta)$ if and only if $Lf(x) = 0$. Setting $Lf = 0$ and noticing (ID) implies $a(x) > 0$ gives a first order differential equation for $f'$
$$(1.3)\qquad (f')' = -\frac{2b}{a}\, f'$$
Solving this equation we find,
$$f'(y) = B\exp\Big(-\int_0^y \frac{2b(z)}{a(z)}\,dz\Big)$$
and it follows that
$$f(x) = A + \int_0^x B\exp\Big(-\int_0^y \frac{2b(z)}{a(z)}\,dz\Big)\,dy$$
Any of these functions can be called the natural scale. We will usually take $A = 0$ and $B = 1$ to get
$$(1.4)\qquad \varphi(x) = \int_0^x \exp\Big(-\int_0^y \frac{2b(z)}{a(z)}\,dz\Big)\,dy$$
However, in some situations it will be convenient to replace the 0's at the lower limits by some other point. Note that our assumption (ID) implies $\varphi$ is $C^2$ so our use of Ito's formula in (1.1) is justified.
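The natural scale in (1.4) is a nested integral, so it is easy to approximate numerically. The following is a minimal sketch, not part of the original text: the function name `natural_scale`, the step count `n`, and the use of the trapezoid rule are all illustrative assumptions. It is checked against the closed form $\varphi(x) = (1 - e^{-2\mu x})/(2\mu)$ that (1.4) gives for constant drift $b = \mu$ and $a = 1$.

```python
import math

def natural_scale(b, a, x, n=2000):
    """Approximate phi(x) = int_0^x exp(-int_0^y 2 b(z)/a(z) dz) dy (trapezoid rule).

    b and a are callables for the drift and the squared diffusion coefficient.
    A negative x is allowed: the step is then negative and both integrals are signed.
    """
    step = x / n
    inner = 0.0                              # running value of int_0^y 2b/a dz
    prev_ratio = 2.0 * b(0.0) / a(0.0)
    prev_density = 1.0                       # phi'(0) = exp(0) = 1
    total = 0.0
    for k in range(1, n + 1):
        y = k * step
        ratio = 2.0 * b(y) / a(y)
        inner += 0.5 * (prev_ratio + ratio) * step
        density = math.exp(-inner)           # phi'(y)
        total += 0.5 * (prev_density + density) * step
        prev_ratio, prev_density = ratio, density
    return total

# Constant drift mu = 0.5 with a = 1: compare with the closed form.
mu = 0.5
approx = natural_scale(lambda z: mu, lambda z: 1.0, 1.0)
exact = (1.0 - math.exp(-2.0 * mu)) / (2.0 * mu)
print(approx, exact)
```

With no drift the routine returns $x$ itself, in line with the remark that a driftless diffusion is already on its natural scale.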
Since $Y_t = \varphi(X_t)$ is a local martingale, results in Section 3.4 imply that it is a time change of Brownian motion. To find the time change function we note that (1.1) and the formula for the variance of a stochastic integral imply
$$(1.5)\qquad \langle Y\rangle_t = \int_0^t \varphi'(X_s)^2\,d\langle X\rangle_s = \int_0^t \varphi'(X_s)^2 a(X_s)\,ds = \int_0^t h(Y_s)\,ds$$
where
$$h(y) = \{\varphi'(\varphi^{-1}(y))\}^2\, a(\varphi^{-1}(y)) > 0 \text{ is continuous}$$
So if we let $\tau_t = \inf\{s : \langle Y\rangle_s > t\}$ then $W_t = Y_{\tau_t}$ is a Brownian motion run for an amount of time $\langle Y\rangle_\zeta$.
To construct solutions of $MP(b,a)$ we will reverse the calculations above: we will use a time change of Brownian motion to construct a $Y_t$ which solves $MP(0,h)$ and then let $X_t = \varphi^{-1}(Y_t)$. With Examples 1.6 and 1.7 of Chapter 5 in mind we generalize our set-up to allow the coefficients $b$ and $\sigma$ to be defined on an open interval $(\alpha,\beta)$ with $\alpha < 0 < \beta$. Since $\varphi'(x) > 0$ for all $x$, the image of $(\alpha,\beta)$ under $\varphi$ is an open interval $(\ell, r)$ with $-\infty \le \ell < 0 < r \le \infty$. Let $W_t$ be a Brownian motion, $\zeta = \inf\{t : W_t \notin (\ell,r)\}$, $g = 1/h$,
$$\sigma_t = \int_0^t g(W_s)\,ds \text{ for } t < \zeta \quad\text{and}\quad \gamma_s = \inf\{t : \sigma_t > s \text{ or } t \ge \zeta\}$$
Using (6.1) in Chapter 5, we see that $Y_s = W(\gamma_s)$ is a solution of $MP(0,h)$ for $s < \xi = \sigma_\zeta$.
To define $X_t$ now, we let $\psi$ be the inverse of $\varphi$ and let $X_t = \psi(Y_t)$. To check that $X$ solves $MP(b,a)$ until it exits from $(\alpha,\beta)$ at time $\xi$, we differentiate $\varphi(\psi(x)) = x$ and rearrange, then differentiate again to get
$$\psi'(x) = \frac{1}{\varphi'(\psi(x))} \qquad \psi''(x) = -\frac{1}{\varphi'(\psi(x))^2}\,\varphi''(\psi(x))\,\psi'(x)$$
Using the first equality and $\varphi''(y) = -(2b(y)/a(y))\,\varphi'(y)$, which follows from (1.3), we have
$$\psi''(x) = \frac{1}{\varphi'(\psi(x))^2}\cdot\frac{2b(\psi(x))}{a(\psi(x))}$$
These calculations show $\psi \in C^2$ so Ito's formula, (1.1), and (1.5) imply that for $t < \xi$
$$\psi(Y_t) - \psi(Y_0) = \int_0^t \psi'(Y_s)\,dY_s + \frac12\int_0^t \psi''(Y_s)h(Y_s)\,ds$$
For the second term we note that combining the formulas for $\psi''$ and $h$ gives (recall $\psi = \varphi^{-1}$)
$$\frac12\,\psi''(y)h(y) = b(\psi(y))$$
To deal with the first term observe that $Y_t$ is a solution of $MP(0,h)$ up to time $\xi$ so by (4.5) in Chapter 5 there is a Brownian motion $B_s$ with $dY_s = \sqrt{h(Y_s)}\,dB_s$. Since $\psi'(y)\sqrt{h(y)} = \sigma(\psi(y))$, letting $X_t = \psi(Y_t)$ we have for $t < \xi$
$$X_t - X_0 = \int_0^t \sigma(X_s)\,dB_s + \int_0^t b(X_s)\,ds$$
The last computation shows that if $Y_t$ is a solution of $MP(0,h)$ then $X_t = \psi(Y_t)$ is a solution of $MP(b,a)$, while (1.1) shows that if $X_t$ is a solution of $MP(b,a)$ then $Y_t = \varphi(X_t)$ is a solution of $MP(0,h)$. This establishes there is a 1-1 correspondence between solutions of $MP(0,h)$ and $MP(b,a)$. Using (6.2) of Chapter 5 now, we see that
(1.6) Theorem. Consider $(C, \mathcal{C})$ and let $X_t(\omega) = \omega_t$. Let $\alpha < \beta$ and $T_{(\alpha,\beta)} = \inf\{t : X_t(\omega) \notin (\alpha,\beta)\}$. Under (ID), uniqueness in distribution holds for $MP(b,a)$ on $[0, T_{(\alpha,\beta)})$.
Using the uniqueness result with (4.6) of Chapter 5 we get
(1.7) Theorem. For each $x \in (\alpha,\beta)$ let $P_x$ be the law of $X_t$, $t < T_{(\alpha,\beta)}$ from (1.6). If we set $X_t = \Delta$ for $t \ge T_{(\alpha,\beta)}$, where $\Delta$ is the cemetery state of Section 1.3, then the resulting process has the strong Markov property.
6.2. Feller's Test
In the previous section we showed that under (ID) there were unique solutions to $MP(b,a)$ on $[0, T_{(\alpha,\beta)})$ where $T_{(\alpha,\beta)} = \inf\{t : X_t \notin (\alpha,\beta)\}$. The next result gives necessary and sufficient conditions for no explosions, i.e., $T_{(\alpha,\beta)} = \infty$ a.s. Let
$$T_y = \inf\{t : X_t = y\} \text{ for } y \in (\alpha,\beta)$$
$$T_\alpha = \lim_{y\downarrow\alpha} T_y \quad\text{and}\quad T_\beta = \lim_{y\uparrow\beta} T_y$$
In stating (2.1) we have without loss of generality supposed $0 \in (\alpha,\beta)$. If you are confronted by an $(\alpha,\beta)$ that does not have this property pick your favorite $\gamma$ in the interval and translate the system by $-\gamma$. Changing variables in the integrals we see that (2.1) holds when all of the 0's are replaced by $\gamma$'s.
(2.1) Feller's test. Let $\varphi(x)$ be the natural scale defined by (1.4) and let $m(x) = 1/(\varphi'(x)a(x))$.
(a) $P_x(T_\beta < T_0)$ is positive for some (all) $x \in (0,\beta)$ if and only if
$$\int_0^\beta dx\, m(x)\,(\varphi(\beta) - \varphi(x)) < \infty$$
(b) $P_x(T_\alpha < T_0)$ is positive for some (all) $x \in (\alpha,0)$ if and only if
$$\int_\alpha^0 dx\, m(x)\,(\varphi(x) - \varphi(\alpha)) < \infty$$
(c) If both integrals are infinite then $P_x(T_{(\alpha,\beta)} = \infty) = 1$ for all $x \in (\alpha,\beta)$.
Remarks. If $\varphi(\beta) = \infty$ then the first integrand is $\equiv \infty$ and the integral is $\infty$. Similarly the second integral is $\infty$ if $\varphi(\alpha) = -\infty$. The statement in (a) means that the following are equivalent
(a1) $P_x(T_\beta < T_0) > 0$ for some $x \in (0,\beta)$
(a2) $\int_0^\beta dx\, m(x)\,(\varphi(\beta) - \varphi(x)) < \infty$
(a3) $P_x(T_\beta < T_0) > 0$ for all $x \in (0,\beta)$
Proof The key to the proof of our sufficient condition for no explosions given in (3.1) and (3.2) of Chapter 5 was the fact that if $A$ is sufficiently large
$$S_t = (1 + |X_t|^2)e^{-At} \text{ is a supermartingale}$$
To get the optimal result on explosions we have to replace $1 + |x|^2$ by a function that is tailor-made for the process. A natural choice that gives up nothing is a function $g \ge 0$ so that
$$e^{-t}g(X_t) \text{ is a local martingale}$$
To find such a $g$ it is convenient to look at things on the natural scale, i.e., let $Y_t = \varphi(X_t)$. Let $\ell = \lim_{x\downarrow\alpha}\varphi(x)$ and let $r = \lim_{x\uparrow\beta}\varphi(x)$. We will find a $C^2$ function $f(x)$ so that $f$ is decreasing on $(\ell,0)$, $f$ is increasing on $(0,r)$ and
$$e^{-t}f(Y_t) \text{ is a local martingale}$$
Denouement. Once we have $f$ the conclusion of the argument is easy, so to explain our motivations we begin with the end. Let
$$f(\ell) = \lim_{y\downarrow\ell} f(y) \qquad f(r) = \lim_{y\uparrow r} f(y)$$
(2.6) and the calculations at the end of the proof will show that
(2.2a) The integral in (a) is finite if and only if $f(r) < \infty$.
(2.2b) The integral in (b) is finite if and only if $f(\ell) < \infty$.
Once these are established the conclusions of (2.1) follow easily. Let
$$\bar T_y = \inf\{t : Y_t = y\} \text{ for } y \in (\ell,r)$$
$$\bar T_r = \lim_{y\uparrow r}\bar T_y \quad\text{and}\quad \bar T_\ell = \lim_{y\downarrow\ell}\bar T_y$$
$$\bar T_{(\ell,r)} = \bar T_\ell \wedge \bar T_r$$
Proof of (c) Suppose $f(\ell) = f(r) = \infty$. Let $a_n < 0 < b_n$ be chosen so that $f(a_n) = f(b_n) = n$ and let $\tau_n = \bar T_{a_n} \wedge \bar T_{b_n}$. Since $e^{-s}f(Y_s)$ is bounded before time $\tau_n \wedge t$ we can apply the optional stopping theorem at that time to conclude that if $y \in (a_n, b_n)$, then
$$P_y(\tau_n < t) \le \frac{e^t f(y)}{n} \to 0 \text{ as } n \to \infty$$
Letting $t \to \infty$ we conclude $0 = P_y(\bar T_{(\ell,r)} < \infty) = P_{\varphi^{-1}(y)}(T_{(\alpha,\beta)} < \infty)$.
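Proof of (a) Let $0 < y < r$, let $b_n \uparrow r$ with $b_1 > y$, and let $\tau_n = \bar T_0 \wedge \bar T_{b_n}$. If $f(r) = \infty$, applying the optional stopping theorem at time $\tau_n \wedge t$, we conclude that
$$P_y(\bar T_{b_n} < \bar T_0 \wedge t) \le \frac{e^t f(y)}{f(b_n)} \to 0 \text{ as } n \to \infty$$
Letting $t \to \infty$ we conclude $0 = P_y(\bar T_r < \bar T_0) = P_{\varphi^{-1}(y)}(T_\beta < T_0)$. This shows that if (a2) is false then (a1) is false, i.e., (a1) implies (a2).
If $f(r) < \infty$, applying the optional stopping theorem at time $\tau_n$ (which is justified since $e^{-t}f(Y_t) \le f(b_n)$ for $t \le \tau_n$) we conclude that
$$1 < f(y) = E_y\big(e^{-\tau_n} f(Y_{\tau_n})\big) \le 1 + f(r)\,E_y\big(e^{-\bar T_{b_n}};\ \bar T_{b_n} < \bar T_0\big)$$
Rearranging gives
$$E_y\big(e^{-\bar T_{b_n}};\ \bar T_{b_n} < \bar T_0\big) \ge \frac{f(y) - 1}{f(r)} > 0$$
Noting $\bar T_r = \bar T_0 < \infty$ is impossible, and letting $n \to \infty$ we have
$$E_y\big(e^{-\bar T_{b_n}};\ \bar T_{b_n} < \bar T_0\big) \downarrow E_y\big(e^{-\bar T_r};\ \bar T_r < \bar T_0\big)$$
which is a contradiction, unless $0 < P_y(\bar T_r < \bar T_0) = P_{\varphi^{-1}(y)}(T_\beta < T_0)$. This shows that (a2) implies (a3). (a3) implies (a1) is trivial so the proof of (a) is complete.
Proof of (b) is identical to that of (a).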
The search for f. To find a function $f$ so that $e^{-t}f(Y_t)$ is a local martingale, we apply Ito's formula to $f(x_1,x_2) = e^{-x_1}f(x_2)$ with $X_t^1 = t$, $X_t^2 = Y_t$, and recall $Y_t$ is a local martingale with variance process given by (1.5), so
$$e^{-t}f(Y_t) - f(Y_0) = \int_0^t \Big\{ -e^{-s}f(Y_s) + \frac12 e^{-s}f''(Y_s)h(Y_s) \Big\}\,ds + \text{local mart.}$$
so we want
$$(2.3)\qquad \frac12 h(x)f''(x) - f(x) = 0$$
To solve this equation we let $f_0 = 1$ and define for $n \ge 1$
$$(2.4)\qquad f_n(x) = \int_0^x dy \int_0^y dz\, \frac{2f_{n-1}(z)}{h(z)}$$
(1.8) in Chapter 4 implies that if we set $f(x) = \sum_{n=0}^\infty f_n(x)$ and
$$(2.5)\qquad \sum_{n=0}^\infty \sup_{|x| \le R} |f_n(x)| < \infty \text{ for any } R < \infty$$
then $f''(x) = \sum_{n=0}^\infty f_n''(x)$. Using $f_n''(x) = 2f_{n-1}(x)/h(x)$ for $n \ge 1$ and $f_0''(x) = 0$ it follows that
$$f''(x) = \sum_{n=0}^\infty f_n''(x) = \sum_{n=1}^\infty \frac{2f_{n-1}(x)}{h(x)} = \frac{2f(x)}{h(x)}$$
i.e., $f$ satisfies (2.3).
To prove (2.5), we begin with some simple properties of the $f_n$.
(2.6a) $f_n \ge 0$
(2.6b) $f_n$ is convex with $f_n'(0) = 0$
(2.6c) $f_n$ is increasing on $(0,r)$ and decreasing on $(\ell,0)$
Proof Since $h(x) > 0$, (2.6a) follows easily by induction from the definition in (2.4). Using (2.6a) in the definition, (2.6b) follows. (2.6c) is an immediate consequence of (2.6b). □
Our next step is to use induction to show
(2.7) Lemma. $f_n(x) \le (f_1(x))^n/n!$, (2.5) holds, and
$$1 + f_1 \le f \le \exp(f_1)$$
Proof Using (2.6c) and (2.6a) the second and third conclusions are immediate consequences of the first one. The inequality $f_n(x) \le (f_1(x))^n/n!$ is obvious if $n = 1$. Using (i) the definition of $f_n$ in (2.4), (ii) (2.6c), (iii) the definition of $f_1$ and $f_0$, (iv) the result for $n - 1$, and (v) doing a little calculus, we have
$$f_n(x) = \int_0^x dy \int_0^y dz\, \frac{2f_{n-1}(z)}{h(z)} \le \int_0^x dy\, f_{n-1}(y) \int_0^y dz\, \frac{2}{h(z)}$$
$$= \int_0^x dy\, f_{n-1}(y) f_1'(y) \le \int_0^x dy\, \frac{(f_1(y))^{n-1}}{(n-1)!} f_1'(y) = \frac{(f_1(x))^n}{n!}$$
To complete the proof now we only have to prove (2.2a) and (2.2b). In view of (2.7) it suffices to prove these results with $f$ replaced by $f_1$. Changing variables $z = \varphi(v)$, $dz = \varphi'(v)\,dv$ then $y = \varphi(u)$, $dy = \varphi'(u)\,du$ we have
$$f_1(x) = \int_0^x dy \int_0^y dz\, \frac{2}{\varphi'(\varphi^{-1}(z))^2\, a(\varphi^{-1}(z))}$$
$$= \int_0^x dy \int_0^{\varphi^{-1}(y)} dv\, \frac{2}{\varphi'(v)a(v)}$$
$$= \int_0^{\varphi^{-1}(x)} du\, \varphi'(u) \int_0^u dv\, \frac{2}{\varphi'(v)a(v)}$$
Letting $x \uparrow r$ and using Fubini's theorem
$$f_1(r) = 2\int_0^\beta dv\, \frac{1}{\varphi'(v)a(v)} \int_v^\beta du\, \varphi'(u) = 2\int_0^\beta dv\, m(v)\,(\varphi(\beta) - \varphi(v))$$
A similar argument shows that the second expression in (2.1) is (except for a factor of 2) $f_1(\ell)$ and the proof is complete. □
To see what (2.1) says in a concrete case we consider
Example 2.1. Let $a(x) = 1$, $b(x) = (1 + |x|)^\delta/2$ where $\delta > 0$. When $\delta \le 1$ the coefficients are Lipschitz continuous, so there is no explosion. We will now use (2.1) to reprove that result and show that explosion occurs when $\delta > 1$. When $y \le 0$, $\varphi'(y) \ge 1$, so $\varphi(-\infty) = -\infty$ and the second integral in (2.1) is $\infty$. To evaluate the first integral we note that if $y \ge 0$
$$\varphi'(y) = \exp\Big( -\int_0^y (1+z)^\delta\,dz \Big) = \exp\Big( \frac{-(1+y)^{1+\delta} + 1}{1+\delta} \Big)$$
so $\varphi(\infty) < \infty$. To use (2.1) now, we note that $a(y) = 1$ so $m(y) = 1/\varphi'(y)$ so there is no explosion if and only if
$$\infty = \int_0^\infty dv\, \exp\Big( \frac{(1+v)^{1+\delta} - 1}{1+\delta} \Big) \int_v^\infty du\, \exp\Big( \frac{-(1+u)^{1+\delta} + 1}{1+\delta} \Big)$$
$$= \int_1^\infty dy \int_y^\infty dx\, \exp\Big( \frac{y^{1+\delta} - x^{1+\delta}}{1+\delta} \Big)$$
where in the last step we have changed variables $x = u + 1$, $y = v + 1$ to get rid of the 1's. The next lemma estimates the last expression and shows that it is
$< \infty$ if $\delta > 1$
$= \infty$ if $0 < \delta \le 1$
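(2.8) Lemma. If $\delta > 0$,
$$y^\delta \int_y^\infty dx\, \exp\Big( \frac{y^{1+\delta} - x^{1+\delta}}{1+\delta} \Big) \to 1 \text{ as } y \to \infty$$
Proof Changing variables $x = y + zy^{-\delta}$ we have
$$\int_y^\infty dx\, \exp\Big( \frac{y^{1+\delta} - x^{1+\delta}}{1+\delta} \Big) = y^{-\delta} \int_0^\infty dz\, \exp\Big( -\int_y^{y+zy^{-\delta}} w^\delta\,dw \Big)$$
Since $z \le \int_y^{y+zy^{-\delta}} w^\delta\,dw \le zy^{-\delta}(y + zy^{-\delta})^\delta \to z$ as $y \to \infty$, the desired result follows from the dominated convergence theorem. □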
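The limit in (2.8) can be checked numerically with the same change of variables $x = y + zy^{-\delta}$ used in the proof. The sketch below is illustrative only: the function name `scaled_tail`, the truncation point `z_max`, and the trapezoid rule are assumptions made here, not part of the text. Truncating at `z_max = 40` discards at most about $e^{-40}$, because the integrand is bounded by $e^{-z}$.

```python
import math

def scaled_tail(y, delta, z_max=40.0, n=40000):
    """y**delta * int_y^inf exp((y**(1+delta) - x**(1+delta)) / (1+delta)) dx.

    After substituting x = y + z * y**(-delta), the factor y**delta cancels and the
    quantity becomes int_0^inf exp(-(x**p - y**p)/p) dz with p = 1 + delta.
    """
    p = 1.0 + delta

    def integrand(z):
        x = y + z * y ** (-delta)
        return math.exp((y ** p - x ** p) / p)

    step = z_max / n
    total = 0.5 * (integrand(0.0) + integrand(z_max))
    for k in range(1, n):
        total += integrand(k * step)
    return total * step

r_small = scaled_tail(2.0, 1.0)     # y = 2: still visibly below the limit
r_large = scaled_tail(100.0, 1.0)   # y = 100: close to the limit 1
print(r_small, r_large)
```

Since the scaled integral is close to 1 for large $y$, the inner integral in Example 2.1 behaves like $y^{-\delta}$, which is integrable at infinity exactly when $\delta > 1$.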
6.3. Recurrence and Transience
The natural scale, which was used in Section 6.1 to construct one dimensional diffusions, can also be used to study their recurrence and transience. Let $X_t$ be a solution of $MP(b,a)$ and suppose
(ID) $b$ and $\sigma$ are continuous with $\sigma(x) > 0$ for all $x$
Let $a(x) = \sigma^2(x)$, and let $\varphi$ be the natural scale defined by
$$\varphi(x) = \int_0^x \exp\Big(-\int_0^y \frac{2b(z)}{a(z)}\,dz\Big)\,dy$$
Let $T_y = \inf\{t > 0 : X_t = y\}$ and let $\tau = T_a \wedge T_b$. The first detail is to show
(3.1) Lemma. If $a < x < b$ then $P_x(\tau < \infty) = 1$.
Proof We saw in Section 6.1 that $Y_t = \varphi(X_t)$ is a solution of $MP(0,h)$ where $h$ given by (1.5) is positive and continuous. Thus (6.1) in Chapter 5 implies that $Y$ can be constructed as a time change of Brownian motion, $B_t$:
$$Y_s = B(\gamma_s) \text{ where } \gamma_s = \inf\{t : \sigma_t > s\} \text{ and } \sigma_t = \int_0^t ds/h(B_s)$$
Since Brownian motion will exit $(\varphi(a), \varphi(b))$ with probability one, $Y_t$ will exit $(\varphi(a), \varphi(b))$ with probability one, and $X_t$ will exit $(a,b)$ with probability one. □
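With (3.1) established we can study the recurrence and transience as we did for Brownian motion. $\varphi(X_{t\wedge\tau})$ is a uniformly bounded martingale, so the optional stopping theorem implies
$$\varphi(x) = E_x\varphi(X_\tau) = \varphi(a)P_x(T_a < T_b) + \varphi(b)\{1 - P_x(T_a < T_b)\}$$
and solving we have
$$(3.2)\qquad P_x(T_a < T_b) = \frac{\varphi(b) - \varphi(x)}{\varphi(b) - \varphi(a)} \qquad P_x(T_b < T_a) = \frac{\varphi(x) - \varphi(a)}{\varphi(b) - \varphi(a)}$$
Letting $\varphi(\infty) = \lim_{b\to\infty}\varphi(b)$ and $\varphi(-\infty) = \lim_{a\to-\infty}\varphi(a)$ (the limits exist since $\varphi$ is strictly increasing) we have
(3.3) Theorem. Suppose $a < x < b$.
$P_x(T_a < \infty) = 1$ if and only if $\varphi(\infty) = \infty$.
$P_x(T_b < \infty) = 1$ if and only if $\varphi(-\infty) = -\infty$.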
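To see the exit formula and (3.3) at work, here is a small sketch for Brownian motion with constant drift, i.e., $b = \mu$ and $a = 1$, where (1.4) gives $\varphi(x) = (1 - e^{-2\mu x})/(2\mu)$. The function names and the choice of example are illustrative assumptions, not taken from the text.

```python
import math

def scale_constant_drift(x, mu):
    """Natural scale (1.4) for b = mu, a = 1: phi(x) = int_0^x exp(-2*mu*y) dy."""
    if mu == 0.0:
        return x
    return (1.0 - math.exp(-2.0 * mu * x)) / (2.0 * mu)

def prob_hit_a_first(x, a, b, mu):
    """P_x(T_a < T_b) = (phi(b) - phi(x)) / (phi(b) - phi(a))."""
    phi_a = scale_constant_drift(a, mu)
    phi_b = scale_constant_drift(b, mu)
    phi_x = scale_constant_drift(x, mu)
    return (phi_b - phi_x) / (phi_b - phi_a)

p_no_drift = prob_hit_a_first(0.25, 0.0, 1.0, 0.0)   # driftless: (b - x)/(b - a)
p_drift = prob_hit_a_first(0.0, -1.0, 1.0, 1.0)      # upward drift makes hitting a less likely
print(p_no_drift, p_drift)
```

For $\mu > 0$ we have $\varphi(\infty) = 1/(2\mu) < \infty$, so by (3.3) the process started above $a$ fails to hit $a$ with positive probability, which matches the intuition that an upward drift carries the process to $+\infty$.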
In one dimension we say that $X$ is recurrent if $P_x(T_y < \infty) = 1$ for all $y$. From (3.3) it follows that
(3.4) Corollary. $X$ is recurrent if and only if $\varphi(\mathbf{R}) = \mathbf{R}$.
This conclusion should not be surprising. $Y_t = \varphi(X_t)$ is a local martingale so it is a time change of a Brownian motion run for a random amount of time and if $\varphi(\mathbf{R}) \ne \mathbf{R}$ that random amount of time must be finite.
To understand the dividing line between recurrence and transience we consider some examples. We begin by noting that the natural scale only depends on $b$ and $a$ through the ratio $2b/a$.
Exercise 3.1. (i) Show that if $b(y) \le 0$ for $y \ge y_0$ then $P_x(T_0 < \infty) = 1$ for all $x > 0$. (ii) Show that if $b(y)/a(y) \ge \epsilon > 0$ for $y \ge y_0$ then $P_x(T_0 < \infty) < 1$ for all $x > 0$.
Example 3.1. The exercise identifies the interesting case as being $b(y) \ge 0$ with $b(y)/a(y) \to 0$ as $y \to \infty$. Suppose that for $x \ge 0$
$$2b(x)/a(x) = C(1 + x)^{-r} \text{ where } C, r > 0$$
If $r > 1$ then
$$\int_0^\infty -C(1+z)^{-r}\,dz = -K > -\infty$$
so $\varphi'(y) \ge e^{-K}$ for all $y \ge 0$ and $\varphi(\infty) = \infty$. If $r < 1$
$$\int_0^y -C(1+z)^{-r}\,dz = -\frac{C}{1-r}\{(1+y)^{1-r} - 1\}$$
and it follows that
$$\varphi(\infty) = \int_0^\infty \exp\Big( -\frac{C}{1-r}\{(1+y)^{1-r} - 1\} \Big)\,dy < \infty$$
In the borderline case $r = 1$ the outcome depends on the value of $C$.
$$\int_0^y -C(1+z)^{-1}\,dz = -C\ln(1+y)$$
so
$$\varphi(\infty) = \int_0^\infty (1+y)^{-C}\,dy \quad \begin{cases} < \infty & \text{if } C > 1 \\ = \infty & \text{if } C \le 1 \end{cases}$$
□
To check the last calculation note that Example 1.2 in Chapter 5 shows that the radial part of $d$ dimensional Brownian motion has $b(r)/a(r) = (d-1)/2r$, while Section 3.1 shows that Brownian motion is recurrent in $d \le 2$ and transient in $d > 2$, which corresponds to $C \le 1$ and to $C > 1$ respectively. Sticklers for detail may complain that we do not have any Brownian motions defined for dimensions $2 < d < 3$. However, this analysis suggests that if we did then they would be transient.
Exercise 3.2. Suppose $a$ and $b$ satisfy (ID) and are periodic with period 1, i.e., $a(y+1) = a(y)$ and $b(y+1) = b(y)$ for all $y$. Find a necessary and sufficient condition for $X_t$ to be recurrent.
6.4. Green's Functions
Suppose $X_t$ is a solution of $MP(b,a)$, where $b$ and $a$ satisfy (ID). Let $a < b$ be real numbers, which you should not confuse with the coefficients, let $D = (a,b)$, and let $\tau = \inf\{t : X_t \notin (a,b)\}$. Our goals in this section, which are accomplished in (4.5) and (4.4) below, are to show that (i) if $g$ is bounded and measurable then
$$E_x \int_0^\tau g(X_s)\,ds = \int G_D(x,y)g(y)\,dy$$
and (ii) to give a formula for the Green's function $G_D(x,y)$. As in the discussion of (8.1) of Chapter 4, we think of $G_D(x,y)$ as the expected amount of time spent at $y$ before the process exits $D$ when it starts at $x$. The rigorous meaning of the last sentence is given by (i).
The analysis here is similar to that in Section 4.5 but instead of the Laplacian, $\Delta$, we are concerned with
$$Lf(x) = \frac12 a(x)f''(x) + b(x)f'(x)$$
The first step is to generalize (1.3) from Chapter 3.
(4.1) Lemma. $\sup_{x\in(a,b)} E_x\tau < \infty$.
Proof Let $\varphi$ be the natural scale defined in (1.4) and consider $Y_t = \varphi(X_t)$. By (1.5) $Y_t$ is a local martingale with $\langle Y\rangle_t = \int_0^t h(Y_s)\,ds$ where $h(y)$ is positive and continuous. To estimate $\tau = \inf\{t : Y_t \notin (\varphi(a), \varphi(b))\}$ we let
$\nu = (\varphi(a) + \varphi(b))/2$ be the midpoint of the interval,
$\ell = (\varphi(b) - \varphi(a))/2$ be half the length of the interval,
$\rho = \inf\{h(y) : y \in (\varphi(a), \varphi(b))\} > 0$,
$f(x) = (\ell^2 - (x - \nu)^2)/\rho$,
and note that $f \ge 0$ in $(\varphi(a), \varphi(b))$ with $f''(x) = -2/\rho$. Ito's formula implies
$$f(Y_t) - f(Y_0) = \text{local mart.} - \int_0^t \frac{h(Y_s)}{\rho}\,ds$$
Our choice of $\rho$ implies $h(y) \ge \rho$, so $f(Y_t) + t$ is a local supermartingale on $[0,\tau)$. Using the optional stopping theorem at time $\tau_n \wedge n$ where $\tau_n = \inf\{t : Y_t \notin (\varphi(a) + 1/n, \varphi(b) - 1/n)\}$ we have
$$f(x) \ge E_x f(Y_{\tau_n \wedge n}) + E_x(\tau_n \wedge n)$$
Since $0 \le f(x) \le \ell^2/\rho$ for $x \in [\varphi(a), \varphi(b)]$ we have $\ell^2/\rho \ge E_x(\tau_n \wedge n)$. Now, let $n \to \infty$ and use the monotone convergence theorem. □
(4.2) Theorem. Suppose $g$ is bounded. If there is a function $v$ with
(i) $v \in C^2$, $Lv = -g$ in $(a,b)$
(ii) $v$ is continuous at $a$ and $b$ with $v(a) = v(b) = 0$ then
$$v(x) = E_x \int_0^\tau g(X_s)\,ds$$
Proof Let $M_t = v(X_t) + \int_0^t g(X_s)\,ds$. (i) and (1.1) imply that for $t < \tau$
$$v(X_t) - v(X_0) = \text{local mart.} - \int_0^t g(X_s)\,ds$$
so $M_t$ is a local martingale on $[0,\tau)$. If $v$ and $g$ are bounded then for $t < \tau$
$$|M_t| \le \|v\|_\infty + \tau\|g\|_\infty$$
(4.1) implies that the right-hand side is an integrable random variable so (2.7) in Chapter 2 and (ii) imply
$$M_\tau = \lim_{t\uparrow\tau} M_t = \int_0^\tau g(X_t)\,dt$$
$$v(x) = E_x M_0 = E_x M_\tau = E_x \int_0^\tau g(X_t)\,dt$$
□
Solving (4.2). The next two pages will be devoted to deriving the solution to the equation in (4.2). Readers who are content to guess and verify the answer can skip to (4.4) now. To solve the equation it is convenient to introduce
$$m(x) = \frac{1}{\varphi'(x)a(x)}$$
where $\varphi(x)$ is the natural scale and note that
$$(4.3)\qquad \frac{1}{2m(x)}\frac{d}{dx}\Big( \frac{1}{\varphi'(x)}\frac{df}{dx} \Big) = \frac12 a(x)\frac{d^2f}{dx^2} - \frac12 a(x)\frac{\varphi''(x)}{\varphi'(x)}\frac{df}{dx} = Lf(x)$$
since the definition of the natural scale implies $\varphi''(x)/\varphi'(x) = -2b(x)/a(x)$. $m$ is called the (density function of the) speed measure, though as we will see in Example 4.1 the rate of movement of the process near $x$ is inversely proportional to $m(x)$! To simplify notation and to facilitate comparison with our source for this material, Karlin and Taylor (1981), Vol. II, we let $s(x) = \varphi'(x)$ be the derivative of the natural scale.
To solve equation $Lv = -g$ now, we use (4.3) to write
$$\frac{d}{dx}\Big( \frac{1}{s(x)}\frac{dv}{dx} \Big) = -2m(x)g(x)$$
and integrate to conclude that for some constant $\beta$
$$\frac{1}{s(y)}\frac{dv}{dy} = \beta - 2\int_a^y dz\, m(z)g(z)$$
Multiplying by $s(y)$ on each side, integrating $y$ from $a$ to $x$, and recalling that $v(a) = 0$ and $s = \varphi'$ we have
$$\text{(a)}\qquad v(x) = \beta(\varphi(x) - \varphi(a)) - 2\int_a^x dy\, s(y) \int_a^y dz\, m(z)g(z)$$
In order to have $v(b) = 0$ we must have
$$\beta = \frac{2}{\varphi(b) - \varphi(a)} \int_a^b dy\, s(y) \int_a^y dz\, m(z)g(z)$$
Plugging the formula for $\beta$ into (a) and writing
$$u(x) = \frac{\varphi(x) - \varphi(a)}{\varphi(b) - \varphi(a)} = P_x(T_b < T_a)$$
we have
$$\text{(b)}\qquad v(x) = 2u(x)\int_a^b dy\, s(y) \int_a^y dz\, m(z)g(z) - 2\int_a^x dy\, s(y) \int_a^y dz\, m(z)g(z)$$
Breaking the first $\int dy$ into an integral over $[a,x]$ and one over $[x,b]$ we have
$$\text{(c)}\qquad v(x) = 2(u(x) - 1)\int_a^x dy\, s(y) \int_a^y dz\, m(z)g(z) + 2u(x)\int_x^b dy\, s(y) \int_a^y dz\, m(z)g(z)$$
Recalling $s(y) = \varphi'(y)$ we have
$$u(x)\int_x^b dy\, s(y) = \frac{\varphi(x) - \varphi(a)}{\varphi(b) - \varphi(a)}\,(\varphi(b) - \varphi(x)) = (1 - u(x))\int_a^x dy\, s(y)$$
Multiplying the last identity by $2\int_a^x dz\, m(z)g(z)$ we have
$$2u(x)\int_x^b dy\, s(y) \int_a^x dz\, m(z)g(z) = 2(1 - u(x))\int_a^x dy\, s(y) \int_a^x dz\, m(z)g(z)$$
Using this in (c) we can break off part of the second term and cancel with the first one to end up with
$$\text{(d)}\qquad v(x) = 2(1 - u(x))\int_a^x dy\, s(y) \int_y^x dz\, m(z)g(z) + 2u(x)\int_x^b dy\, s(y) \int_x^y dz\, m(z)g(z)$$
Using Fubini's theorem now gives
$$\text{(e)}\qquad v(x) = 2(1 - u(x))\int_a^x dz\, m(z)g(z) \int_a^z dy\, s(y) + 2u(x)\int_x^b dz\, m(z)g(z) \int_z^b dy\, s(y)$$
If we define $G(x,z)$ to be
$$(4.4)\qquad G(x,z) = \begin{cases} 2\,\dfrac{\varphi(x) - \varphi(a)}{\varphi(b) - \varphi(a)}\,(\varphi(b) - \varphi(z))\, m(z) & \text{when } z \ge x \\[2ex] 2\,\dfrac{\varphi(b) - \varphi(x)}{\varphi(b) - \varphi(a)}\,(\varphi(z) - \varphi(a))\, m(z) & \text{when } z \le x \end{cases}$$
then we have
$$\text{(f)}\qquad v(x) = \int_a^b G(x,z)g(z)\,dz$$
Combining the calculations above with (4.2) shows
(4.5) Theorem. If $g$ is bounded and measurable then
$$E_x \int_0^\tau g(X_s)\,ds = \int_a^b G(x,z)g(z)\,dz$$
Proof By the monotone class theorem it suffices to prove the result when $g$ is continuous. To do this it suffices to show that the $v$ defined in (f) satisfies the hypotheses of (4.2). To check (ii), we note that as $x \to a$ or $x \to b$, $G(x,z) \to 0$, and $G$ is bounded so the bounded convergence theorem implies $v(x) \to 0$. To check that $v \in C^2$ we note that
$$\frac{\varphi(b) - \varphi(a)}{2}\, v(x) = (\varphi(b) - \varphi(x))\int_a^x (\varphi(z) - \varphi(a))\, m(z)g(z)\,dz + (\varphi(x) - \varphi(a))\int_x^b (\varphi(b) - \varphi(z))\, m(z)g(z)\,dz$$
Differentiating and noting that the terms from the limits of the integrals cancel we have
$$\frac{\varphi(b) - \varphi(a)}{2}\, v'(x) = -\varphi'(x)\int_a^x (\varphi(z) - \varphi(a))\, m(z)g(z)\,dz + \varphi'(x)\int_x^b (\varphi(b) - \varphi(z))\, m(z)g(z)\,dz$$
Differentiating again we have
$$\frac{\varphi(b) - \varphi(a)}{2}\, v''(x) = -\varphi''(x)\int_a^x (\varphi(z) - \varphi(a))\, m(z)g(z)\,dz + \varphi''(x)\int_x^b (\varphi(b) - \varphi(z))\, m(z)g(z)\,dz - \varphi'(x)(\varphi(b) - \varphi(a))\, m(x)g(x)$$
This shows $v \in C^2$. Multiplying the last two equations by $b(x)$ and $a(x)/2$, adding, then using $L\varphi = 0$ and the definition of $m(x)$ we have
$$\frac{\varphi(b) - \varphi(a)}{2}\, Lv(x) = -\frac{a(x)}{2}\,\varphi'(x)(\varphi(b) - \varphi(a))\, m(x)g(x) = -\frac{\varphi(b) - \varphi(a)}{2}\, g(x)$$
so $Lv = -g$. This shows that $v$ satisfies (i) in (4.2) and the result follows. □
Example 4.1. Speed measure? If $X$ has no drift (e.g., Brownian motion or any diffusion on its natural scale) then $\varphi(x) = x$ and (4.4) becomes
$$G(x,z) = \begin{cases} 2m(z)(b - z)(x - a)/(b - a) & \text{when } z \ge x \\ 2m(z)(b - x)(z - a)/(b - a) & \text{when } z \le x \end{cases}$$
Taking $g \equiv 1$, $a = x - h$, and $b = x + h$ the last expression becomes
$$G(x,z) = \begin{cases} m(z)(x + h - z) & \text{when } z \ge x \\ m(z)(z - x + h) & \text{when } z \le x \end{cases}$$
Letting $T_{(a,b)} = \inf\{t : X_t \notin (a,b)\}$, and using (4.5) with $g \equiv 1$ we have
$$(4.6)\qquad E_x T_{(x-h,x+h)} = \int_x^{x+h} (x + h - z)\, m(z)\,dz + \int_{x-h}^x (z - x + h)\, m(z)\,dz \sim m(x)h^2$$
as $h \to 0$. Thus $m(x)$ gives us the time that $X_t$ takes to exit a small interval centered at $x$, or to be precise, the ratio of the time for $X_t$ to that for Brownian motion.
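The driftless Green's function above is easy to integrate numerically, which gives a quick check on (4.5) and (4.6). The following is a sketch under stated assumptions: the function names, the midpoint rule, and the number of cells `n` are choices made here for illustration. For Brownian motion ($m \equiv 1$) the integral should reproduce the familiar exit time $E_x\tau = (x-a)(b-x)$.

```python
def green_driftless(x, z, a, b, m):
    """G(x, z) for a diffusion on its natural scale (phi(x) = x) with speed density m."""
    if z >= x:
        return 2.0 * m(z) * (b - z) * (x - a) / (b - a)
    return 2.0 * m(z) * (b - x) * (z - a) / (b - a)

def expected_exit_time(x, a, b, m, n=20000):
    """E_x tau = int_a^b G(x, z) dz, i.e. (4.5) with g = 1, by the midpoint rule."""
    dz = (b - a) / n
    total = 0.0
    for k in range(n):
        z = a + (k + 0.5) * dz
        total += green_driftless(x, z, a, b, m)
    return total * dz

brownian_speed = lambda z: 1.0
t_unit = expected_exit_time(0.3, 0.0, 1.0, brownian_speed)     # (x - a)(b - x) = 0.21
t_small = expected_exit_time(0.0, -0.1, 0.1, brownian_speed)   # m(x) h^2 = 0.01
print(t_unit, t_small)
```

Replacing `brownian_speed` by another positive function `m` gives the corresponding exit time for a time-changed Brownian motion, so a larger $m$ near $x$ means a longer stay near $x$.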
For another interpretation of $m$ note that (11.7) in Chapter 2 implies that if $L_t^x$ is the local time at $x$ up to time $t$ then for a bounded measurable $g$
$$\int L_t^x\, g(x)\,dx = \int_0^t g(X_s)\,d\langle X\rangle_s = \int_0^t g(X_s)a(X_s)\,ds$$
When $X_t$ has no drift $\varphi'(x) = 1$ so $m(x) = 1/a(x)$ and taking $g(x) = f(x)m(x)$ we have
$$\int m(x)L_t^x\, f(x)\,dx = \int_0^t f(X_s)\,ds$$
Thus multiplying the local times by $m(x)$ converts them into occupation times. This and the previous interpretation suggest that it would be natural to call $m(x)\,dx$ the occupation measure. However, it is too late to try to change the original name, speed measure. □
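Example 4.2. Second Proof of Feller's test. Let $T_x = \inf\{t : X_t = x\}$, $T_\alpha = \lim_{x\downarrow\alpha} T_x$ and $T_{(a,b)} = T_a \wedge T_b$. We content ourselves to establish a variant of (b) in (2.1). The first of two steps is
(4.7) Lemma. Let $0 < b < \beta$. Then
$$\int_\alpha^0 (\varphi(z) - \varphi(\alpha))\, m(z)\,dz < \infty$$
if and only if $\inf_{\alpha<a<0} P_0(T_a < T_b) > 0$ and $\sup_{\alpha<a<0} E_0(T_a \wedge T_b) < \infty$.
Proof (3.2) implies
$$P_0(T_a < T_b) = \frac{\varphi(b) - \varphi(0)}{\varphi(b) - \varphi(a)}$$
so $\inf_{\alpha<a<0} P_0(T_a < T_b) > 0$ if and only if $\varphi(\alpha) > -\infty$. Taking $g \equiv 1$ in (4.5) and using (4.4) we have
$$(4.8)\qquad E_0 T_{(a,b)} = 2\,\frac{\varphi(0) - \varphi(a)}{\varphi(b) - \varphi(a)}\int_0^b (\varphi(b) - \varphi(z))\, m(z)\,dz + 2\,\frac{\varphi(b) - \varphi(0)}{\varphi(b) - \varphi(a)}\int_a^0 (\varphi(z) - \varphi(a))\, m(z)\,dz$$
The first integral always stays bounded as $a \downarrow \alpha$. So $E_0 T_{(a,b)}$ stays bounded as $a \to \alpha$ if and only if $\varphi(\alpha) > -\infty$ and
$$\int_\alpha^0 (\varphi(z) - \varphi(\alpha))\, m(z)\,dz < \infty$$
□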
The two conditions in (4.7) are equivalent to $P_0(T_\alpha < T_b) > 0$ and $E_0(T_\alpha \wedge T_b) < \infty$. To complete the proof now we will show
(4.9) Lemma. If $P_0(T_\alpha < T_b) > 0$ then $E_0(T_\alpha \wedge T_b) < \infty$.
Proof $P_0(T_\alpha < T_b) > 0$ implies $P_0(T_\alpha < \infty) > 0$. Pick $M$ large enough so that $P_0(T_\alpha \le M) \ge \epsilon > 0$ and that $P_0(T_b \le M) \ge \epsilon > 0$. By considering the first time the process starting from 0 hits $x$ and using the strong Markov property, it follows that if $0 < x < b$ then
$$P_0(T_b \le t) = E_0(P_x(T_b \le t - T_x);\ T_x \le t) \le P_x(T_b \le t)P_0(T_x \le t) \le P_x(T_b \le t)$$
A similar argument shows
$$P_x(T_\alpha \le t) \ge P_0(T_\alpha \le t) \text{ for } \alpha < x < 0$$
Combining the last two results shows that
$$P_x(T_\alpha \wedge T_b \le M) \ge \epsilon > 0 \text{ for all } \alpha < x < b$$
Letting $\tau = T_\alpha \wedge T_b$ and repeating the proof of (1.2) in Chapter 3 now, we have
$$P_x(\tau > kM) = E_x(P_{X_M}(\tau > (k-1)M);\ \tau > M) \le (1 - \epsilon)\sup_{\alpha<y<b} P_y(\tau > (k-1)M)$$
It follows by induction that for all integers $k \ge 1$
$$\sup_{x\in(\alpha,b)} P_x(\tau > kM) \le (1 - \epsilon)^k$$
Using (5.7) in Chapter 1 of Durrett (1995) now, we get an upper bound on all the moments of the exit time $\tau$. □
6.5. Boundary Behavior
In Section 6.1 when we constructed a diffusion on a half line or a bounded
interval, we gave up when the process reached the boundary and we said that
it had exploded. In this section, we will identify situations in which we can
extend the life of the process. To explain how we might continue we begin with
a trivial example.
Example 5.1. Reflecting boundary at 0. Suppose $b(x) = 0$, and $\sigma(x)$ is positive and continuous on $[0,\infty)$. To define a solution to $dX_t = \sigma(X_t)\,dB_t$ with a "reflecting boundary at 0," we extend $\sigma$ to $\mathbf{R}$ by setting $\sigma(-x) = \sigma(x)$, let $Y_t$ be a solution of $dY_t = \sigma(Y_t)\,dB_t$ on $\mathbf{R}$ and then let $X_t = |Y_t|$.
To use the last recipe in general we start with $X_t$ a solution of $MP(b,a)$ and let $\varphi$ be its natural scale. $Y_t = \varphi(X_t)$ solves $MP(0,h)$ where
$$h(y) = \{\varphi'(\varphi^{-1}(y))\}^2\, a(\varphi^{-1}(y))$$
To see if we can start the process $Y_t$ at 0 we extend $h$ to the negative half-line by setting $h(-y) = h(y)$ and let $Z_t$ be a solution of $MP(0,h)$ on $\mathbf{R}$. If we let $\bar m(y) = 1/h(|y|)$ be the speed measure for $Z_t$, which is on its natural scale, we can use (4.6) and the symmetry $\bar m(-y) = \bar m(y)$ to conclude
$$\frac12 E_0\tau_{(-\epsilon,\epsilon)} = \int_0^\epsilon (\epsilon - y)\,\bar m(y)\,dy$$
Changing variables $y = \varphi(x)$, $dy = \varphi'(x)\,dx$, $\epsilon = \varphi(\delta)$ the above
$$= \int_0^\delta (\varphi(\delta) - \varphi(x))\,\frac{1}{\varphi'(x)a(x)}\,dx$$
Now, (i) recalling that the speed measure for $X_t$ is $m(x) = 1/\varphi'(x)a(x)$, and changing notation $\varphi'(z) = s(z)$, and (ii) using Fubini's theorem, the above
$$= \int_0^\delta \int_x^\delta s(z)\,dz\, m(x)\,dx = \int_0^\delta \int_0^z m(x)\,dx\, s(z)\,dz$$
Introducing $M$ as the antiderivative of $m$ to bring out the analogy with the condition in Feller's test (2.1), we have
$$(5.1)\qquad E_0\tau_{(-\epsilon,\epsilon)} = 2\int_0^\delta (M(z) - M(0))\,s(z)\,dz$$
To see that an infinite value in (5.1) means that the process cannot escape from 0, we note that repeating the proof of (4.9) shows
(5.2) Lemma. If $P_0(\tau_{(-\epsilon,\epsilon)} < \infty) > 0$ then $E_0\tau_{(-\epsilon,\epsilon)} < \infty$.
Proof Pick $M$ so that $P_0(\tau_{(-\epsilon,\epsilon)} \le M) = \delta > 0$. Using the strong Markov property as in the proof of (4.9) we first get
$$\sup_{x\in(-\epsilon,\epsilon)} P_x(\tau_{(-\epsilon,\epsilon)} > M) \le 1 - \delta$$
and then conclude that for each integer $k \ge 1$
$$\sup_{x\in(-\epsilon,\epsilon)} P_x(\tau_{(-\epsilon,\epsilon)} > kM) \le (1 - \delta)^k$$
from which the desired result follows. □
Consider a diffusion on $(0,r)$ where $r \le \infty$, let $q \in (0,r)$, and let
$$I = \int_0^q (\varphi(z) - \varphi(0))\, m(z)\,dz$$
$$J = \int_0^q (M(z) - M(0))\, s(z)\,dz$$
Feller's test implies that
when $I < \infty$ we can get IN to the boundary point.
The analysis above shows that
when $J < \infty$ we can get OUT from the boundary point.
This leads to four possible combinations, which were named by Feller as follows

    I             J             name
    $< \infty$    $< \infty$    regular
    $< \infty$    $= \infty$    absorbing
    $= \infty$    $< \infty$    entrance
    $= \infty$    $= \infty$    natural

The second case is called absorbing because we can get in to 0 but cannot get
out. The third is called an entrance boundary because we cannot get to 0 but we
can start the process there. Finally, in the fourth case the process can neither
get to nor start at 0, so it is reasonable to exclude 0 from the state space. We
will now give examples of the various possibilities.
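Feller's classification is a lookup on two yes/no questions: is $I$ finite, and is $J$ finite. A minimal sketch of that lookup follows; the function name `classify_boundary` and its boolean arguments are illustrative choices, not notation from the text.

```python
def classify_boundary(i_finite, j_finite):
    """Feller's name for a boundary point, given whether I and J are finite.

    i_finite: True when I < infinity (the process can get IN to the boundary).
    j_finite: True when J < infinity (the process can get OUT from the boundary).
    """
    names = {
        (True, True): "regular",
        (True, False): "absorbing",
        (False, True): "entrance",
        (False, False): "natural",
    }
    return names[(bool(i_finite), bool(j_finite))]

for i_finite in (True, False):
    for j_finite in (True, False):
        print(i_finite, j_finite, classify_boundary(i_finite, j_finite))
```

In each example the work lies in deciding the finiteness of $I$ and $J$ from $\varphi$, $m$, $M$, and $s$; once that is done the name follows mechanically.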
Example 5.2. Feller's branching diffusion. Let
$$dX_t = \beta X_t\,dt + \sigma\sqrt{X_t}\,dB_t$$
Of course we want to suppose $\sigma > 0$ but we will also suppose $\beta > 0$ since the calculations are somewhat different in the cases $\beta = 0$ and $\beta < 0$. See Exercise 5.1 below. Using the formula in (1.4), the natural scale is
$$\varphi(x) = \int_0^x \exp\Big( -\int_0^y \frac{2\beta z}{\sigma^2 z}\,dz \Big)\,dy = \int_0^x e^{-2\beta y/\sigma^2}\,dy = \frac{\sigma^2}{2\beta}\big(1 - e^{-2\beta x/\sigma^2}\big)$$
which maps $[0,\infty)$ onto $[0, \sigma^2/2\beta)$. The speed measure is
$$m(x) = \frac{1}{\varphi'(x)a(x)} = e^{2\beta x/\sigma^2}/(\sigma^2 x)$$
To investigate the boundary at 0, we note that
$$I = \int_0^q m(x)(\varphi(x) - \varphi(0))\,dx = \int_0^q \frac{1}{2\beta x}\big(e^{2\beta x/\sigma^2} - 1\big)\,dx < \infty$$
since the integrand converges to $1/\sigma^2$ as $x \to 0$. To calculate $J$ we note that $m(x) \sim 1/(\sigma^2 x)$ as $x \to 0$ so $M(0) = -\infty$ and $J = \infty$. The combination $I < \infty$ and $J = \infty$ says that the process can get into 0 but not get out, so
0 is an absorbing boundary.
To investigate the boundary at $\infty$, we note that
$$I = \int_q^\infty m(x)(\varphi(\infty) - \varphi(x))\,dx = \int_q^\infty \frac{1}{2\beta x}\,dx = \infty$$
To calculate $J$ we note that when $\beta > 0$, $m(x) \ge 1/(\sigma^2 x)$ so $M(\infty) = \infty$ and $J = \infty$. The combination $I = \infty$ and $J = \infty$ says that the process cannot get into or out of $\infty$, so
$\infty$ is a natural boundary.
Exercise 5.1. Show that 0 is an absorbing boundary and $\infty$ is a natural boundary when (a) $\beta = 0$, (b) $\beta < 0$.
Example 5.3. Bessel process. Consider
$$dX_t = \frac{\gamma}{2X_t}\,dt + dB_t$$
Here $\gamma > -1$ is the index of the Bessel process. To explain the restriction on $\gamma$ and to prepare for the results we will derive, note that Example 1.2 in Chapter 5 shows that the radial part of $d$ dimensional Brownian motion is a Bessel process with $\gamma = (d - 1)$. The natural scale is
$$\varphi(x) = \int_1^x \exp\Big( -\int_1^y \gamma/z\,dz \Big)\,dy = \int_1^x y^{-\gamma}\,dy = \begin{cases} \ln x & \text{if } \gamma = 1 \\ (x^{1-\gamma} - 1)/(1 - \gamma) & \text{if } \gamma \ne 1 \end{cases}$$
From the last computation we see that if $\gamma \ge 1$ then $\varphi(0) = -\infty$ and $I = \infty$. To handle $-1 < \gamma < 1$ we observe that the speed measure
$$m(z) = \frac{1}{\varphi'(z)a(z)} = z^\gamma$$
So taking $q = 1$ in the definition of $I$
$$I = \int_0^1 \frac{z^{1-\gamma}}{1 - \gamma}\, z^\gamma\,dz < \infty$$
To compute $J$ we observe that for any $\gamma > -1$, $M(z) = z^{\gamma+1}/(\gamma + 1)$ and
$$J = \int_0^1 \frac{z^{\gamma+1}}{\gamma + 1}\, z^{-\gamma}\,dz < \infty$$
Combining the last two conclusions we see that 0 is an
entrance boundary if $\gamma \in [1,\infty)$
regular boundary if $\gamma \in (-1,1)$
The first conclusion is reasonable since in $d \ge 2$ we can start Brownian motion at 0 but then it does not hit 0 at positive times. The second conclusion can be thought of as saying that in dimensions $d < 2$ Brownian motion will hit 0.
Exercise 5.2. Show that 0 is an absorbing boundary if $\gamma \le -1$. We leave it to the reader to ponder the meaning of Brownian motion in dimension $d \le 0$.
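Example 5.4. Power noise. Consider
$$dX_t = X_t^\delta\,dB_t$$
on $(0,\infty)$. The natural scale is $\varphi(x) = x$ and the speed measure is $m(x) = 1/(\varphi'(x)a(x)) = x^{-2\delta}$ so
$$I = \int_0^1 x^{1-2\delta}\,dx \quad \begin{cases} < \infty & \text{if } \delta < 1 \\ = \infty & \text{if } \delta \ge 1 \end{cases}$$
When $\delta \ge 1/2$, $M(0) = -\infty$ and hence $J = \infty$. When $\delta < 1/2$
$$J = \int_0^1 \frac{z^{1-2\delta}}{1 - 2\delta}\,dz < \infty$$
Combining the last two conclusions we see that the boundary point 0 is
natural if $\delta \in [1,\infty)$
absorbing if $\delta \in [1/2,1)$
regular if $\delta \in (0,1/2)$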
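The case analysis for the power noise example reduces to two thresholds on $\delta$. The sketch below encodes them; the function name and the error handling are illustrative assumptions, and the thresholds are simply the convergence conditions for $I$ and $J$ computed in the example.

```python
def power_noise_boundary_at_zero(delta):
    """Classify the boundary 0 for dX = X**delta dB on (0, infinity).

    I = int_0^1 x**(1 - 2*delta) dx is finite exactly when delta < 1.
    J is finite exactly when delta < 1/2 (otherwise M(0) = -infinity).
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    i_finite = delta < 1.0
    j_finite = delta < 0.5
    if i_finite and j_finite:
        return "regular"
    if i_finite:
        return "absorbing"
    if j_finite:
        return "entrance"   # cannot occur here: j_finite implies i_finite
    return "natural"

for delta in (0.25, 0.5, 0.75, 1.0, 2.0):
    print(delta, power_noise_boundary_at_zero(delta))
```

Because $J < \infty$ forces $\delta < 1/2 < 1$, the entrance case never arises at 0 for this family, in line with the three cases listed in the example.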
The fact that we can start at 0 if and only if $\delta < 1/2$ is suggested by (3.3) and Example 6.1 in Chapter 5. The new information here is that the solution can reach 0 if and only if $\delta < 1$. For another proof of that result, recall the time change recipe in Example 6.1 of Chapter 5 for constructing solutions of the equation above and do
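Exercise 5.3. Consider
$$H_\delta = \int_0^{T_0} B_s^{-2\delta}\, 1_{(B_s \le 1)}\,ds \quad\text{and}\quad H'_\delta = \int_0^{T_0} B_s^{-2\delta}\, 1_{(B_s > 1)}\,ds$$
Show that (a) for any $\delta > 0$, $P_1(H'_\delta < \infty) = 1$. (b) $E_1 H_\delta < \infty$ when $\delta < 1$. (c) $P_1(H_\delta = \infty) = 1$ when $\delta \ge 1$.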
Exercise 5.4. Show that the boundary at $\infty$ in Example 5.4 is natural if $\delta \le 1$
and entrance if $\delta > 1$.
Remark. The fact that we cannot reach $\infty$ is expected. (5.3) in Chapter 5
implies that these processes do not explode for any $\delta$. The second conclusion
is surprising at first glance since the process starting at $\infty$ is a time change of
a Brownian motion starting from $\infty$. However the function we use in (5.1) of
Chapter 5 is $g(x) = x^{-2\delta}$ so (2.9) in Chapter 3 implies
EM I g(Bs)l(B.<M)ds= I x~2&-Ixdx
which stays bounded as $M \to \infty$ if and only if $\delta > 1$.
Exercise 5.5. Wright-Fisher diffusion. Recall the definition given in
Example 1.7 in Chapter 5. Show that the boundary point 0 is
    absorbing if $\beta = 0$
    regular if $\beta \in (0,1/2)$
    entrance if $\beta \ge 1/2$
Hint: simplify calculations by first considering the case $\alpha = 0$ and then arguing
that the value of $\alpha$ is not important.
6.6. Applications to Higher Dimensions
In this section we will use the technique of comparing a multidimensional
diffusion with a one dimensional one to obtain sufficient conditions (a) for no
explosion, (b) for recurrence and transience, and (c) for diffusions to hit points
or not. Let $S_r = \inf\{t : |X_t| = r\}$ and $S_\infty = \lim_{r\to\infty} S_r$. Throughout this
section we will suppose that $X_t$ is a solution to MP($b$, $a$) on $[0, S_\infty)$, and that
the coefficients satisfy
(i) $b$ is measurable and locally bounded
(ii) $a$ is continuous and nondegenerate for each $x$, i.e.,
$$\sum_{i,j} y_i a_{ij}(x) y_j > 0 \quad\text{when } y \ne 0$$
The first detail is to show that Xt will eventually exit from any ball.
(6.1) Theorem. Let $S_r = \inf\{t : |X_t| = r\}$. Then there is a constant $C < \infty$
so that $E_x S_r \le C$ for all $x$ with $|x| \le r$.
Proof Assumptions (i) and (ii) imply that we can pick an $\alpha$ large so that
$$\frac{\alpha}{2}\,a_{ii}(x) \ge 1 + |b_i(x)|$$
for all $x$ with $|x| \le r$. Let $h(x) = \sum_{i=1}^d \cosh(\alpha x_i)$. Using Ito's formula we have
$$h(X_t) - h(X_0) = \int_0^t \sum_{i=1}^d \alpha\sinh(\alpha X_s^i)\,b_i(X_s)\,ds + \text{local mart.} + \frac{1}{2}\int_0^t \sum_{i=1}^d \alpha^2\cosh(\alpha X_s^i)\,a_{ii}(X_s)\,ds$$
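Noting that $\cosh(y) \ge 1$ and
$$\cosh(y) \ge \max\left(\frac{e^y}{2}, \frac{e^{-y}}{2}\right) \ge |\sinh(y)|$$
our choice of $\alpha$ implies
$$\frac{\alpha^2}{2}\cosh(\alpha X_s^i)\,a_{ii}(X_s) + \alpha\sinh(\alpha X_s^i)\,b_i(X_s) \ge \alpha\cosh(\alpha X_s^i)\left(\frac{\alpha}{2}a_{ii}(X_s) - |b_i(X_s)|\right) \ge \alpha\cosh(\alpha X_s^i) \ge \alpha$$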
so $h(X_t) - t\alpha d$ is a local submartingale on $[0, S_r)$. Using the optional stopping
theorem at time $S_r \wedge t$ we have
$$0 \le h(x) \le E_x\{h(X_{S_r \wedge t}) - \alpha d\,(S_r \wedge t)\}$$
Since $h(X_{S_r \wedge t}) \le d\cosh(\alpha r)$ it follows that
$$E_x(S_r \wedge t) \le \cosh(\alpha r)/\alpha$$
Letting $t \to \infty$ gives the desired result. □
Remark. Tracing back through the proof shows $C = \cosh(\alpha r)/\alpha$ where $\alpha$ is
chosen so that
$$\frac{\alpha}{2}\,a_{ii}(x) \ge 1 + |b_i(x)|$$
To see that the last estimate is fairly sharp consider a special case
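Example 6.1. Suppose $d = 1$, $a(x) = 1$, and $b(x) = -\beta\,\mathrm{sgn}(x)$ with $\beta > 0$.
The natural scale has
$$\varphi(x) = \int_0^x e^{2\beta y}\,dy = \frac{1}{2\beta}\left(e^{2\beta x} - 1\right)$$
for $x \ge 0$ and $\varphi(-x) = -\varphi(x)$, so the speed measure
$$m(x) = \frac{1}{\varphi'(x)} = e^{-2\beta|x|}$$
Using (4.8) now with $b = r$, $a = -r$, $x = 0$ and noting that
$$2\,\frac{\varphi(r) - \varphi(0)}{\varphi(r) - \varphi(-r)} = 2\,\frac{\varphi(0) - \varphi(-r)}{\varphi(r) - \varphi(-r)} = 1$$
we have
$$E_0\tau_{(-r,r)} = \int_0^r (\varphi(r) - \varphi(z))\,m(z)\,dz + \int_{-r}^0 (\varphi(z) - \varphi(-r))\,m(z)\,dz$$
The two integrals are equal so
$$E_0\tau_{(-r,r)} = \frac{1}{\beta}\int_0^r \left(e^{2\beta(r-z)} - 1\right)dz$$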
To compare with (6.1) now, note that when $a(x) = 1$ and $b(x) = -\beta\,\mathrm{sgn}(x)$, the
remark after the proof says we can pick $\alpha = 2\beta + 2$ so the inequality in (6.1) is
$$E_0\tau_{(-r,r)} \le \frac{\cosh((2\beta+2)r)}{2\beta+2}$$
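The comparison in this example can be checked numerically. The following Python sketch is an illustration only: the function names and the sample values of $\beta$ and $r$ are assumptions made here, not part of the text. It evaluates the mean exit time in closed form, cross-checks it against a midpoint-rule value of the integral $\frac{1}{\beta}\int_0^r (e^{2\beta(r-z)}-1)\,dz$, and compares both with the bound $\cosh((2\beta+2)r)/(2\beta+2)$.

```python
import math

def exact_mean_exit_time(beta: float, r: float) -> float:
    """Closed form of (1/beta) * integral_0^r (exp(2*beta*(r - z)) - 1) dz."""
    return (math.exp(2 * beta * r) - 1) / (2 * beta**2) - r / beta

def numeric_mean_exit_time(beta: float, r: float, n: int = 20000) -> float:
    """Midpoint-rule evaluation of the same integral, used as a cross-check."""
    dz = r / n
    total = sum(math.exp(2 * beta * (r - (k + 0.5) * dz)) - 1 for k in range(n))
    return total * dz / beta

def theorem_bound(beta: float, r: float) -> float:
    """cosh(alpha * r) / alpha with alpha = 2*beta + 2, the constant from (6.1)."""
    alpha = 2 * beta + 2
    return math.cosh(alpha * r) / alpha

# Illustrative parameter choices (assumptions, not taken from the text).
SAMPLES = [(0.5, 1.0), (1.0, 1.0), (2.0, 0.5)]
for beta, r in SAMPLES:
    print(beta, r, exact_mean_exit_time(beta, r), theorem_bound(beta, r))
```

For these sample values the exact mean exit time is smaller than the bound by a factor of roughly two to three, which gives a concrete sense of how sharp the estimate is for moderate $r$.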
a. Explosions
With the little detail of the finiteness of $E_xS_r$ out of the way we now
introduce a comparison between a multidimensional diffusion and a one dimensional
one, which is due to Khasminskii (1960). We begin by introducing three pairs
of definitions that may look strange but are tailor made for computations in
(d), (e), and (a) of the proof of (6.2).
$$\alpha(x) = 2\sum_{i,j} x_i a_{ij}(x) x_j \quad\text{and}\quad \beta(x) = \sum_{i=1}^d \{2x_ib_i(x) + a_{ii}(x)\}$$
$$\nu(r) = \sup\left\{\frac{\beta(x)}{\alpha(x)} : |x|^2 = r\right\} \quad\text{and}\quad \rho(r) = \sup\{\alpha(x) : |x|^2 = r\}$$
$$\mu(r) = \rho(r)\nu(r) \quad\text{and}\quad \sigma(r) = \sqrt{2\rho(r)}$$
Our plan is to compare $R_t = |X_t|^2$ with
$$dZ_t = \mu(Z_t)\,dt + \sigma(Z_t)\,dB_t \quad\text{on } (0, \infty)$$
So we introduce the natural scale $\varphi(x)$ and speed measure $m(x)$ for $Z_t$:
$$\varphi(x) = \int_1^x s(y)\,dy \qquad m(x) = 1/(s(x)\sigma^2(x))$$
In view of Feller's test, (2.1), the next result says that if $Z_t$ can't reach $\infty$ in
finite time then $X_t$ does not explode.
(6.2) Theorem. There is no explosion if
$$\int_1^\infty dx\; m(x)\,(\varphi(\infty) - \varphi(x)) = \infty$$
Proof To prove the result we will let $R_t = |X_t|^2$ and find a function $g$ with
$g(r) \to \infty$ as $r \to \infty$ so that $e^{-t}g(R_t)$ is a supermartingale while $R_t \ge 1$. To
see that this is enough, let $S_r = \inf\{t : R_t = r\}$ and use the optional stopping
theorem at time $S_1 \wedge S_n \wedge t$ to conclude that for $|x| > 1$
$$g(|x|^2) \ge e^{-t} g(n^2)\, P_x(S_n < S_1 \wedge t)$$
Letting $S_\infty = \lim_{r\to\infty} S_r$, then $n \to \infty$ and $t \to \infty$ we have $P_x(S_\infty < S_1) = 0$.
To define the function $g$, we follow the proof of (2.1). Let $Y_t = \varphi(Z_t)$,
use the construction there to produce a function $f$ so that $e^{-t}f(Y_t)$ is a local
martingale, then take $g = f \circ \varphi$ where $\varphi$ is the natural scale of $Z_t$. It follows
from our assumption and (2.2a) that $g(r) \to \infty$ as $r \to \infty$.
Since $g \in C^2$ it follows from Ito's formula that
$$e^{-t}g(Z_t) - g(Z_0) = -\int_0^t e^{-s}g(Z_s)\,ds + \int_0^t e^{-s}g'(Z_s)\mu(Z_s)\,ds + \text{local mart.} + \frac{1}{2}\int_0^t e^{-s}g''(Z_s)\sigma^2(Z_s)\,ds$$
Since $e^{-t}g(Z_t)$ is a local martingale it follows that
$$\frac{1}{2}\sigma^2(z)g''(z) + \mu(z)g'(z) = g(z)$$
Using the definition of $\sigma$ and $\mu$ now we can rewrite this as
(a) $$\rho(z)g''(z) + \nu(z)\rho(z)g'(z) = g(z)$$
Before plunging into the proof we need one more property which follows from
(2.6c) and the fact that our natural scale is an increasing function with $\varphi(1) = 0$:
(b) $g'(r) \ge 0$ for $r \ge 1$
A little calculus gives
(c) $$D_i g(|x|^2) = g'(|x|^2)\,2x_i$$
$$D_{ij} g(|x|^2) = g''(|x|^2)\,4x_ix_j \quad i \ne j$$
$$D_{ii} g(|x|^2) = g''(|x|^2)\,4x_i^2 + g'(|x|^2)\,2$$
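Using Ito's formula now we have
$$e^{-t}g(R_t) - g(R_0) = \int_0^t -e^{-s}g(R_s)\,ds + \sum_i \int_0^t e^{-s}g'(R_s)\,2X_s^i b_i(X_s)\,ds + \text{local mart.}$$
$$\qquad + \frac{1}{2}\sum_{i,j}\int_0^t e^{-s}g''(R_s)\,4X_s^iX_s^j a_{ij}(X_s)\,ds + \frac{1}{2}\sum_i \int_0^t e^{-s}g'(R_s)\,2a_{ii}(X_s)\,ds$$
Using the functions $\alpha(x)$ and $\beta(x)$ introduced above we can write the last
equation compactly as
(d) $$e^{-t}g(R_t) - g(R_0) = \text{local mart.} + \int_0^t e^{-s}\{-g(R_s) + \beta(X_s)g'(R_s) + \alpha(X_s)g''(R_s)\}\,ds$$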
To bound the integral when $R_s \ge 1$ we use (i) the definition of $\nu$ and (b), then
(ii) the equation in (a), the fact that $g \ge 0$, and the definition of $\rho$
(e) $$-g(R_s) + \left\{\frac{\beta(X_s)}{\alpha(X_s)}g'(R_s) + g''(R_s)\right\}\alpha(X_s) \le -g(R_s) + \{\nu(R_s)g'(R_s) + g''(R_s)\}\,\alpha(X_s) \le -g(R_s) + \left\{\frac{g(R_s)}{\rho(R_s)}\right\}\alpha(X_s) \le 0$$
Combining (d) and (e) shows that $e^{-t}g(R_t)$ is a local supermartingale while
$R_t \ge 1$ and completes the proof. □
b. Recurrence and Transience
The next two results give sufficient conditions for transience and recurrence
in $d > 1$. In this part we use the formulation of Meyers and Serrin (1960). Let
$$\alpha(x) = \frac{x}{|x|}\cdot a(x)\frac{x}{|x|} \qquad d_e(x) = \frac{2x\cdot b(x) + \mathrm{tr}(a(x))}{\alpha(x)}$$
where $\mathrm{tr}(a(x)) = \sum_i a_{ii}(x)$ is the trace of $a$. We call $d_e(x)$ the effective
dimension at $x$ because (6.3) and (6.4) will show that there will be recurrence
or transience if the dimension is $< 2$ or $> 2$ in a neighborhood of infinity.
To formulate a result that allows $d_e(x)$ to approach 2 as $|x| \to \infty$ we need a
definition: $\delta(t)$ is said to be a Dini function if and only if
$$\int^\infty \frac{\delta(t)}{t}\,dt < \infty$$
In the next two results we will apply the definition to
$$\delta(t) = \exp\left(-\int_1^t \frac{\varepsilon(s)}{s}\,ds\right)$$
To see what the conditions say note that $\delta(t) = (\log t)^{-p}$ is a Dini function if
$p > 1$ but not if $p \le 1$ and this corresponds to $\varepsilon(s) = p/\log s$.
(6.3) Theorem. Suppose $d_e(x) \ge 2(1 + \varepsilon(|x|))$ for $|x| \ge R$ and $\delta(t)$ is a Dini
function, then $X_t$ is transient. To be precise, if $|x| > R$ then $P_x(S_r < \infty) < 1$.
Here $S_r = \inf\{t : |X_t| = r\}$, and being transient includes the possibility of
explosion.
(6.4) Theorem. Suppose $d_e(x) \le 2(1 + \varepsilon(|x|))$ for $|x| \ge R$ and $\delta(t)$ is not a Dini
function, then $X_t$ is recurrent. To be precise, if $|x| > R$ then $P_x(S_r < \infty) = 1$.
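Proof of (6.3) The first step is to let
$$\varphi(r) = \int_r^\infty \exp\left(-\int_1^s \frac{\varepsilon(t)}{t}\,dt\right)\frac{ds}{s}$$
which has
$$\varphi'(r) = -\frac{1}{r}\exp\left(-\int_1^r \frac{\varepsilon(t)}{t}\,dt\right) < 0$$
and hence satisfies
$$r\varphi''(r) + (1 + \varepsilon(r))\varphi'(r) = 0$$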
Letting $R_t = |X_t|^2$ and using Ito's formula we get the following which we will
use four times below:
(6.5) Lemma. If $rg''(r) + \gamma(r)g'(r) = 0$ for $r \in (u^2, v^2)$ and
$g'(|x|^2)\cdot(d_e(x) - 2\gamma(|x|^2)) \le 0$ when $|x| \in (u,v)$, then $g(R_t)$ is a local supermartingale while
$u < |X_t| < v$.
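Proof Using Ito's formula with (c) from the proof of (6.2)
$$g(R_t) - g(R_0) = \sum_i \int_0^t g'(R_s)\,2X_s^ib_i(X_s)\,ds + \text{local mart.} + \frac{1}{2}\sum_{i,j}\int_0^t g''(R_s)\,4X_s^iX_s^ja_{ij}(X_s)\,ds + \frac{1}{2}\sum_i\int_0^t g'(R_s)\,2a_{ii}(X_s)\,ds$$
$$= \text{local mart.} + \int_0^t 2\alpha(X_s)\left(R_sg''(R_s) + g'(R_s)\frac{d_e(X_s)}{2}\right)ds$$
Using our two assumptions on $g$, we get for $u < |X_s| < v$
$$R_sg''(R_s) + g'(R_s)\frac{d_e(X_s)}{2} = g'(R_s)\left(\frac{d_e(X_s)}{2} - \gamma(R_s)\right) \le 0$$
which proves the result. □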
(6.5) implies that $\varphi(R_t)$ is a local supermartingale while $R_t > R^2$. If
$R < r < s < \infty$ then using the optional stopping theorem at time $\tau = S_r \wedge S_s$
((6.1) implies that $P_x(\tau < \infty) = 1$) we see that
$$\varphi(|x|^2) \ge E_x\varphi(R_\tau) \ge \varphi(r^2)\,P_x(S_r < S_s)$$
Rearranging gives
$$P_x(S_r < S_s) \le \frac{\varphi(|x|^2)}{\varphi(r^2)} < 1$$
The upper bound is independent of $s$ and (6.3) follows. □
Proof of (6.4) This time we let
$$\psi(r) = \int_1^r \exp\left(-\int_1^s \frac{\varepsilon(t)}{t}\,dt\right)\frac{ds}{s}$$
Since $\psi'(r) = -\varphi'(r)$ we again have
$$r\psi''(r) + (1 + \varepsilon(r))\psi'(r) = 0$$
but this time $\psi'(r) \ge 0$. Since we have assumed $d_e(x) \le 2(1 + \varepsilon(|x|))$ for $|x| \ge R$
it follows from (6.5) that $\psi(R_t)$ is a local supermartingale while $R_t \ge R^2$. If
$R < r < s < \infty$ using the optional stopping theorem at time $\tau = S_r \wedge S_s$ ((6.1)
implies that $P_x(\tau < \infty) = 1$) we see that
$$\psi(|x|^2) \ge E_x\psi(R_\tau) = \psi(r^2)P_x(S_r < S_s) + \psi(s^2)\{1 - P_x(S_r < S_s)\}$$
Rearranging gives
$$P_x(S_r < S_s) \ge \frac{\psi(s^2) - \psi(|x|^2)}{\psi(s^2) - \psi(r^2)}$$
Letting $s \to \infty$ and noting that $\psi(s^2) \to \infty$ since $\delta(t)$ is not a Dini function
gives the desired result. □
c. Hitting Points
The proofs of (6.3) and (6.4) generalize easily to investigate whether
multidimensional diffusions hit 0. Let $T_0 = \inf\{t > 0 : X_t = 0\}$.
(6.6) Theorem. If $d_e(x) \ge 2(1 - \varepsilon(|x|))$ for $|x| \le \eta$ and
$$\int_0^1 \exp\left(-\int_y^1 \frac{\varepsilon(z)}{z}\,dz\right)\frac{dy}{y} = \infty$$
then $P_x(T_0 < \infty) = 0$.
(6.7) Theorem. If $d_e(x) \le 2(1 - \varepsilon(|x|))$ for $|x| \le \eta$ and
$$\int_0^1 \exp\left(-\int_y^1 \frac{\varepsilon(z)}{z}\,dz\right)\frac{dy}{y} < \infty$$
then $P_x(T_0 < \infty) > 0$.
We have changed the sign of $\varepsilon$ to make the proofs more closely parallel the
previous pair. Another explanation of the change is that Brownian motion
    is recurrent in $d \le 2$        hits points in $d < 2$
    is transient in $d > 2$        doesn't hit points in $d \ge 2$
so the boundary is now a little below 2 dimensions rather than a little above.
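Proof of (6.6) Let
$$\varphi(r) = \int_r^1 \exp\left(-\int_s^1 \frac{\varepsilon(t)}{t}\,dt\right)\frac{ds}{s}$$
which has
$$\varphi'(r) = -\frac{1}{r}\exp\left(-\int_r^1 \frac{\varepsilon(t)}{t}\,dt\right) < 0$$
$$\varphi''(r) = \frac{1 - \varepsilon(r)}{r^2}\exp\left(-\int_r^1 \frac{\varepsilon(t)}{t}\,dt\right)$$
and hence satisfies
$$r\varphi''(r) + (1 - \varepsilon(r))\varphi'(r) = 0$$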
Letting $R_t = |X_t|^2$ and using (6.5) shows $\varphi(R_t)$ is a local supermartingale while
$R_t \in (0, \eta^2)$. If $0 < r < s < \eta$ using the optional stopping theorem at time
$\tau = S_r \wedge S_s$ we see that
$$\varphi(|x|^2) \ge E_x\varphi(R_\tau) \ge \varphi(r^2)\,P_x(S_r < S_s)$$
Rearranging gives
$$P_x(S_r < S_s) \le \varphi(|x|^2)/\varphi(r^2)$$
Letting $r \to 0$ and noting $\varphi(r^2) \to \infty$ gives
$$P_x(T_0 < S_\eta) = 0 \quad\text{for } 0 < |x| < \eta$$
To replace $S_\eta$ by $\infty$ in the last equation let $\sigma_0 = 0$ and for $n \ge 1$ let
$$\tau_n = \inf\{t > \sigma_{n-1} : |X_t| = \eta\}$$
$$\sigma_n = \inf\{t > \tau_n : |X_t| = \eta/2\}$$
The strong Markov property implies that the $\sigma_n - \tau_n$ are i.i.d. so $\sigma_n \to \infty$ as
$n \to \infty$. It is impossible for the process to hit 0 in $[\tau_n, \sigma_n]$ and hitting 0 in
$[\sigma_n, \tau_{n+1}]$ has probability 0 so we have
$$P_x(T_0 < \infty) = 0 \quad\text{for } 0 < |x| < \eta$$
Using (3.2) and the Markov property now extends the conclusion to $|x| \ge \eta$. □
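Proof of (6.7) This time we let
$$\psi(r) = \int_0^r \exp\left(-\int_s^1 \frac{\varepsilon(t)}{t}\,dt\right)\frac{ds}{s}$$
Since $\psi'(r) = -\varphi'(r)$ we have $\psi'(r) \ge 0$ and
$$r\psi''(r) + (1 - \varepsilon(r))\psi'(r) = 0$$
Letting $R_t = |X_t|^2$ and using (6.5) shows $\psi(R_t)$ is a local supermartingale while
$R_t \in (0, \eta^2)$. If $0 < r < s < \eta$ using the optional stopping theorem at time
$\tau = S_r \wedge S_s$ we see that
$$\psi(|x|^2) \ge E_x\psi(R_\tau) = \psi(r^2)P_x(S_r < S_s) + \psi(s^2)\{1 - P_x(S_r < S_s)\}$$
Rearranging gives
$$P_x(S_r < S_s) \ge \frac{\psi(s^2) - \psi(|x|^2)}{\psi(s^2) - \psi(r^2)}$$
Letting $r \to 0$ and noting $\psi(r^2) \to 0$ gives
$$P_x(T_0 < \infty) > 0 \quad\text{for } |x| < \eta$$
Using (3.2) and the Markov property now extends the conclusion to $|x| \ge \eta$. □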
(6.8) Corollary. If (i) $d \ge 3$ or (ii) $d = 2$ and $a$ is Hölder continuous then
$P_x(T_0 < \infty) = 0$ for all $x$.
Proof Referring to (4.2) in Chapter 5, we can write $a(0) = U^T\Lambda U$ where $\Lambda$
is a diagonal matrix with entries $\lambda_i$. Let $\Gamma$ be the diagonal matrix with entries
$1/\sqrt{\lambda_i}$ and let $V = U^T\Gamma$. The formula for the covariance of stochastic integrals
implies that $\bar X_t = VX_t$ has
$$\bar a(0) = V^Ta(0)V = I$$
So we can without loss of generality suppose that $a(0) = I$. In $d \ge 3$, the
continuity of $a$ implies $d_e(x) \to d$ as $x \to 0$ so we can take $\varepsilon(r) = 0$ and the
desired result follows from (6.6). If $d = 2$ and $a$ is Hölder continuous with
exponent $\delta$ we can take $\varepsilon(r) = Cr^\delta$. To extract the desired result from (6.6) we
note that
$$\int_0^1 \exp\left(-\int_y^1 Cz^{\delta-1}\,dz\right)\frac{dy}{y} = \infty$$
since the exponential in the integrand converges to a positive limit as $y \to 0$. □
It follows from (6.7) that a two dimensional diffusion can hit 0.
Example 6.1. Suppose $b(x) = 0$, $a(0) = I$, and for $0 < |x| < 1/e$ let $a(x)$ be
the matrix with eigenvectors
$$v_1 = \frac{1}{|x|}(x_1, x_2) \qquad v_2 = \frac{1}{|x|}(-x_2, x_1)$$
and associated eigenvalues
$$\lambda_1 = 1 \qquad \lambda_2 = 1 - 2p/|\log|x||$$
These definitions are chosen so that $d_e(x) = 2 - 2p/|\log|x||$ and hence we can
take $\varepsilon(r) = p/|\log r|$. Plugging into the integral
$$\int_0^{1/e} \exp\left(-\int_y^{1/e} \frac{p\,dz}{z|\log z|}\right)\frac{dy}{y} = \int_0^{1/e} \exp\left(p\log|\log y|^{-1}\right)\frac{dy}{y} = \int_0^{1/e} \frac{dy}{y|\log y|^p} < \infty \quad\text{if } p > 1$$
Remark. The problem of whether or not diffusions can hit points has been
investigated in a different guise by Gilbarg and Serrin (1956), see page 315.
Diffusions as Markov Processes
In this chapter we will assume that the martingale problem MP($b$, $a$) has a
unique solution and hence gives rise to a strong Markov process. As the title
says, we will be interested in diffusions as Markov processes, in particular, in
their asymptotic behavior as $t \to \infty$.
7.1. Semigroups and Generators
The results in this section hold for any Markov process, $X_t$. For any bounded
measurable $f$, and $t \ge 0$, let
$$T_tf(x) = E_xf(X_t)$$
The Markov property implies that
$$E_x(f(X_{t+s})\,|\,\mathcal{F}_s) = E_{X_s}f(X_t) = T_tf(X_s)$$
So taking expected values gives the semigroup property
(1.1) $$T_{s+t}f(x) = T_s(T_tf)(x)$$
for $s, t \ge 0$. Introducing the norm $\|f\| = \sup_x |f(x)|$ and $L^\infty = \{f : \|f\| < \infty\}$
we see that $T_t$ is a contraction semigroup on $L^\infty$, i.e.,
(1.2) $$\|T_tf\| \le \|f\|$$
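Properties (1.1) and (1.2) can be seen concretely on the simplest Markov process. The Python sketch below is an illustration under assumptions made here, not an example from the text: it takes a two-state chain on $\{0, 1\}$ with hypothetical jump rates `A_RATE` (from 0 to 1) and `B_RATE` (from 1 to 0), uses the standard closed form of its transition probabilities, and checks the semigroup property, the contraction property, and the difference quotient $(T_hf - f)/h$ for small $h$.

```python
import math

A_RATE, B_RATE = 2.0, 0.5   # hypothetical jump rates 0 -> 1 and 1 -> 0

def T(t, f):
    """Apply T_t to f = (f(0), f(1)) using the closed-form transition matrix."""
    total = A_RATE + B_RATE
    decay = math.exp(-total * t)
    p00 = (B_RATE + A_RATE * decay) / total   # P_0(X_t = 0)
    p11 = (A_RATE + B_RATE * decay) / total   # P_1(X_t = 1)
    return (p00 * f[0] + (1 - p00) * f[1], (1 - p11) * f[0] + p11 * f[1])

def sup_norm(f):
    """The norm ||f|| = sup_x |f(x)| on the two-point state space."""
    return max(abs(v) for v in f)

f = (3.0, -1.0)
s, t = 0.3, 0.7
lhs = T(s + t, f)          # T_{s+t} f
rhs = T(s, T(t, f))        # T_s (T_t f)
h = 1e-6
quotient = tuple((a - b) / h for a, b in zip(T(h, f), f))
rate_form = (A_RATE * (f[1] - f[0]), B_RATE * (f[0] - f[1]))
print(lhs, rhs, sup_norm(T(t, f)), quotient, rate_form)
```

The last two printed tuples agree to several digits: for this chain the difference quotient approaches the rate of each jump times the change in $f$ across that jump.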
Following Dynkin (1965), but using slightly different notation, we will
define the domain of $T$ to be
$$\mathcal{D}(T) = \{f \in L^\infty : \|T_tf - f\| \to 0 \text{ as } t \to 0\}$$
As the next result shows, this is a reasonable choice.
(1.3) Theorem. $\mathcal{D}(T)$ is a vector space, i.e., if $f_1, f_2 \in \mathcal{D}(T)$ and $c_1, c_2 \in \mathbf{R}$,
then $c_1f_1 + c_2f_2 \in \mathcal{D}(T)$. $\mathcal{D}(T)$ is closed. If $f \in \mathcal{D}(T)$ then $T_sf \in \mathcal{D}(T)$ and $s \to T_sf$
is continuous.
Remark. Here and in what follows, limits are always taken in, and hence
continuity is defined for, the uniform norm $\|\cdot\|$.
Proof Since $T_t(c_1f_1 + c_2f_2) = c_1T_tf_1 + c_2T_tf_2$, the first conclusion follows
from the triangle inequality. To prove $\mathcal{D}(T)$ is closed, let $f_n \in \mathcal{D}(T)$ with
$\|f_n - f\| \to 0$. Let $\varepsilon > 0$. If we pick $n$ large $\|f_n - f\| < \varepsilon/3$. $f_n \in \mathcal{D}(T)$ so
$\|T_tf_n - f_n\| < \varepsilon/3$ for $t < t_0$. Using the triangle inequality and the contraction
property (1.2) now, it follows that for $t < t_0$
$$\|T_tf - f\| \le \|T_tf - T_tf_n\| + \|T_tf_n - f_n\| + \|f_n - f\| \le \|f - f_n\| + \varepsilon/3 + \varepsilon/3 < \varepsilon$$
To prove the last claim in the theorem we note that the semigroup and
contraction properties, (1.1) and (1.2), imply that if $s < t$
$$\|T_tf - T_sf\| = \|T_s(T_{t-s}f - f)\| \le \|T_{t-s}f - f\| \qquad □$$
Remark. While $\mathcal{D}(T)$ is a reasonable choice it is certainly not the only one.
A common alternative is to define the domain to be $C_0$, the continuous functions
which converge to 0 at $\infty$ equipped with the sup norm. When the semi-group
$T_t$ maps $C_0$ into itself, it is called a Feller semi-group. We will not follow
this approach since this assumption does not hold, for example, in $d = 1$ when
$\infty$ is an entrance boundary.
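The infinitesimal generator of a semigroup is defined by
$$Af = \lim_{h \downarrow 0} \frac{T_hf - f}{h}$$
Its domain $\mathcal{D}(A)$ is the set of $f$ for which the limit exists, that is, the $f$ for
which there is a $g$ so that
$$\left\|\frac{T_hf - f}{h} - g\right\| \to 0 \quad\text{as } h \to 0$$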
Before delving into properties of generators, we pause to identify some operators
that cannot be generators.
(1.4) Theorem. If $f \in \mathcal{D}(A)$ and $f(x_0) \ge f(y)$ for all $y$ then $Af(x_0) \le 0$.
Proof Since $x_0$ is a maximum, $T_tf(x_0) - f(x_0) \le 0$ so $Af(x_0) \le 0$. □
Example 1.1. The property in (1.4) looks innocent but it eliminates a number
of operators. For example, when $k \ge 3$ is an integer we cannot have $Af = f^{(k)}$,
the $k$th derivative. To prove this, we begin by noting that $g_k(x) = 1 - x^2 + x^k$
has a local maximum at 0. From this it follows that we can pick a small $\delta > 0$
and define a $C^\infty$ function $f_k \ge 0$ that agrees with $g_k$ on $(-\delta, \delta)$ and has a strict
global maximum there. Since $Af_k(0) = k! > 0$, this contradicts (1.4). □
Returning to properties of actual generators, we have
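(1.5) Theorem. $\mathcal{D}(A)$ is dense in $\mathcal{D}(T)$. If $f \in \mathcal{D}(A)$ then $Af \in \mathcal{D}(T)$,
$T_tf \in \mathcal{D}(A)$, and
$$\frac{d}{dt}T_tf = AT_tf = T_tAf$$
$$T_tf - f = \int_0^t T_sAf\,ds$$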
Remark. Here we are differentiating and integrating functions that take values
in a Banach space, i.e., that map $s \in [0,\infty)$ into $L^\infty$. Most of the proofs from
calculus apply if you replace the absolute value by the norm. Readers who want
help justifying our computations can consult Dynkin (1965), Volume 1, pages
19-22, or Ethier and Kurtz (1986), pages 8-10.
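Proof Let $f \in \mathcal{D}(T)$ and $g_a = \int_0^a T_sf\,ds$. To explain the motivation for the
last definition, we note that
$$\left\|\frac{g_a}{a} - f\right\| \le \frac{1}{a}\int_0^a \|T_sf - f\|\,ds \to 0$$
as $a \to 0$ since $f \in \mathcal{D}(T)$. Now
$$T_hg_a = \int_0^a T_{s+h}f\,ds = g_a + \int_a^{a+h} T_sf\,ds - \int_0^h T_sf\,ds$$
so we have
$$\left\|\frac{T_hg_a - g_a}{h} - (T_af - f)\right\| \le \frac{1}{h}\int_a^{a+h} \|T_sf - T_af\|\,ds + \frac{1}{h}\int_0^h \|T_sf - f\|\,ds \to 0$$
This shows that $g_a \in \mathcal{D}(A)$ and $Ag_a = T_af - f$. Since $g_a/a \in \mathcal{D}(A)$ and the first
calculation shows that it converges to $f$ as $a \to 0$ the first claim follows.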
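The fact that $Af \in \mathcal{D}(T)$ follows from (1.3) which implies $(T_hf - f)/h \in \mathcal{D}(T)$
and $\mathcal{D}(T)$ is closed. To prove that $T_tf \in \mathcal{D}(A)$ we note that the
contraction property and the fact that $f \in \mathcal{D}(A)$ imply
$$\left\|\frac{T_hT_tf - T_tf}{h} - T_tAf\right\| = \left\|T_t\left(\frac{T_hf - f}{h} - Af\right)\right\| \le \left\|\frac{T_hf - f}{h} - Af\right\| \to 0$$
The last equality shows that $AT_tf = T_tAf$ and that the right derivative of $T_tf$
is $T_tAf$. To see that the left derivative exists and has the same value we note
that
$$\left\|\frac{T_tf - T_{t-h}f}{h} - T_tAf\right\| \le \left\|T_{t-h}\left(\frac{T_hf - f}{h} - Af\right)\right\| + \|T_{t-h}(Af - T_hAf)\| \le \left\|\frac{T_hf - f}{h} - Af\right\| + \|Af - T_hAf\| \to 0$$
as $h \to 0$ since $f \in \mathcal{D}(A)$ and $Af \in \mathcal{D}(T)$. The final equation in (1.5) is
obtained by integrating the one above it. □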
To make the connection between generators and martingale problems we
will now prove
(1.6) Theorem. If $f \in \mathcal{D}(A)$ then for any probability measure $\mu$
$$M_t^f = f(X_t) - f(X_0) - \int_0^t Af(X_s)\,ds$$
is a $P_\mu$ martingale with respect to the filtration $\mathcal{F}_t$ generated by $X_t$.
Proof Since $f$ and $Af$ are bounded, $M_t^f$ is integrable for each $t$. Since $M_s^f$ is
$\mathcal{F}_s$ measurable
$$E_\mu(M_t^f\,|\,\mathcal{F}_s) = M_s^f + E_\mu\left(f(X_t) - f(X_s) - \int_s^t Af(X_r)\,dr\,\Big|\,\mathcal{F}_s\right)$$
By the Markov property the second term on the right-hand side is
$$= E_{X_s}\left(f(X_{t-s}) - f(X_0) - \int_0^{t-s} Af(X_r)\,dr\right)$$
But for any $y$, it follows from the definition of $T_r$ and Fubini's theorem that
$$E_y\left(f(X_{t-s}) - f(X_0) - \int_0^{t-s} Af(X_r)\,dr\right) = T_{t-s}f(y) - f(y) - \int_0^{t-s} T_rAf(y)\,dr$$
which is 0 by (1.5). □
Our final topic is an important concept for some developments but will
play a minor role here. For each $\lambda > 0$ define the resolvent operator for
bounded measurable $f$ by
$$U_\lambda f(x) = \int_0^\infty e^{-\lambda t}T_tf(x)\,dt = E_x\int_0^\infty e^{-\lambda t}f(X_t)\,dt$$
Clearly $\|U_\lambda f\| \le \|f\|/\lambda$. A more subtle fact is that
(1.7) Theorem. $U_\lambda$ maps $\mathcal{D}(T)$ 1-1 onto $\mathcal{D}(A)$. Its inverse is $(\lambda - A)$.
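Proof Let $A_hf = (T_hf - f)/h$. Plugging in the definition of $U_\lambda$ and changing
variables $s = t + h$ and $s = t$ we have
$$A_hU_\lambda f = \frac{1}{h}\int_0^\infty e^{-\lambda t}(T_{t+h}f - T_tf)\,dt = \frac{e^{\lambda h}}{h}\int_h^\infty e^{-\lambda s}T_sf\,ds - \frac{1}{h}\int_0^\infty e^{-\lambda s}T_sf\,ds$$
To conclude that $U_\lambda f \in \mathcal{D}(A)$ and $AU_\lambda f = \lambda U_\lambda f - f$ we note that
$$\lambda U_\lambda f - f = \lambda\int_0^\infty e^{-\lambda s}T_sf\,ds - \frac{1}{h}\int_0^h f\,ds$$
and compare the last two displays to conclude (using $\|T_sf\| \le \|f\|$) that
$$\|A_hU_\lambda f - (\lambda U_\lambda f - f)\| \le \left|\frac{e^{\lambda h} - 1}{h} - \lambda\right|\int_h^\infty e^{-\lambda s}\|f\|\,ds + \lambda\int_0^h e^{-\lambda s}\|f\|\,ds$$
$$\qquad + \frac{1}{h}\int_0^h (1 - e^{-\lambda s})\|f\|\,ds + \frac{1}{h}\int_0^h \|T_sf - f\|\,ds \to 0$$
as $h \to 0$. The formula for $AU_\lambda f$ implies that $(\lambda - A)U_\lambda f = f$. Thus $U_\lambda$ is 1-1
and its inverse is $\lambda - A$.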
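To prove that $U_\lambda$ is onto, let $f \in \mathcal{D}(A)$ and note (1.5) implies
$$U_\lambda(\lambda f - Af) = \int_0^\infty e^{-\lambda t}T_t(\lambda f - Af)\,dt = \int_0^\infty \lambda e^{-\lambda t}T_tf\,dt - \int_0^\infty e^{-\lambda t}\frac{d}{dt}T_tf\,dt$$
Integrating by parts
$$\int_0^\infty e^{-\lambda t}\frac{d}{dt}T_tf\,dt = \left(e^{-\lambda t}T_tf\right)\Big|_0^\infty + \int_0^\infty \lambda e^{-\lambda t}T_tf\,dt$$
Combining the last two displays shows $U_\lambda(\lambda f - Af) = f$ and the proof is
complete. □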
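The identity $(\lambda - A)U_\lambda f = f$ can be verified by hand on a two-state chain. The sketch below is illustrative and rests on assumptions introduced here: the chain lives on $\{0, 1\}$ with hypothetical rates `A_RATE` and `B_RATE`, its generator is taken to be the usual rate form (rate times the change in $g$ across a jump), and the resolvent is computed from the Laplace transform of the closed-form transition probabilities.

```python
A_RATE, B_RATE = 2.0, 0.5   # hypothetical jump rates 0 -> 1 and 1 -> 0

def resolvent(lam, f):
    """U_lam f for the two-state chain, from the Laplace transform of T_t."""
    s = A_RATE + B_RATE
    u00 = B_RATE / (s * lam) + A_RATE / (s * (lam + s))
    u11 = A_RATE / (s * lam) + B_RATE / (s * (lam + s))
    u01 = 1.0 / lam - u00    # each row of U_lam has total mass 1/lam
    u10 = 1.0 / lam - u11
    return (u00 * f[0] + u01 * f[1], u10 * f[0] + u11 * f[1])

def generator(g):
    """Ag for the two-state chain: jump rate times the change in g."""
    return (A_RATE * (g[1] - g[0]), B_RATE * (g[0] - g[1]))

lam = 1.5
f = (3.0, -1.0)
g = resolvent(lam, f)
Ag = generator(g)
recovered = (lam * g[0] - Ag[0], lam * g[1] - Ag[1])   # (lam - A) U_lam f
print(g, recovered)
```

Up to rounding, `recovered` reproduces `f`, and the sup norm of `g` is at most $\|f\|/\lambda$, matching the bound stated before (1.7).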
7.2. Examples
In this section we will consider two families of Markov processes and compute
their generators.
Example 2.1. Pure jump Markov processes. Let $(S, \mathcal{S})$ be a measurable
space and let $Q(x, A) : S \times \mathcal{S} \to [0, \infty)$ be a transition kernel. That is,
(i) for each $A \in \mathcal{S}$, $x \to Q(x, A)$ is measurable
(ii) for each $x \in S$, $A \to Q(x, A)$ is a finite measure
(iii) $Q(x, \{x\}) = 0$
Intuitively $Q(x, A)$ gives the rate at which our Markov chain makes jumps from
$x$ into $A$ and we exclude jumps from $x$ to $x$ since they are invisible.
To be able to construct a process $X_t$ with a minimum of fuss, we will
suppose that $\lambda = \sup_{x \in S} Q(x, S) < \infty$. Define a transition probability $P$ by
$$P(x, A) = \frac{1}{\lambda}Q(x, A) \quad\text{if } x \notin A$$
$$P(x, \{x\}) = \frac{1}{\lambda}(\lambda - Q(x, S))$$
Let $t_1, t_2, \ldots$ be i.i.d. exponential with parameter $\lambda$, that is, $P(t_i > t) = e^{-\lambda t}$.
Let $T_n = t_1 + \cdots + t_n$ for $n \ge 1$ and let $T_0 = 0$. Let $Y_n$ be a Markov chain
with transition probability $P(x, dy)$ (see e.g., Section 5.1 of Durrett (1995) for
a definition), and let
$$X_t = Y_n \quad\text{for } t \in [T_n, T_{n+1})$$
$X_t$ defines our pure jump Markov process. To compute its generator we
note that $N_t = \sup\{n : T_n \le t\}$ has a Poisson distribution with mean $\lambda t$ so if $f$
is bounded and measurable
$$T_tf(x) = E_xf(X_t) = e^{-\lambda t}f(x) + \lambda te^{-\lambda t}\int P(x, dy)f(y) + O(t^2)$$
Doing a little arithmetic it follows that
$$\frac{T_tf(x) - f(x)}{t} - \lambda\int P(x, dy)(f(y) - f(x)) = t^{-1}(e^{-\lambda t} - 1 + \lambda t)f(x) + \lambda(e^{-\lambda t} - 1)\int P(x, dy)f(y) + O(t)$$
Since $t^{-1}(e^{-\lambda t} - 1 + \lambda t) \to 0$, $e^{-\lambda t} \to 1$, and $P(x, dy)$ is a transition probability
it follows that
$$\left\|\frac{T_tf(x) - f(x)}{t} - \lambda\int P(x, dy)(f(y) - f(x))\right\| \to 0$$
as $t \to 0$. Recalling the definition of $P(x, dy)$ it follows that
(2.1) Theorem. $\mathcal{D}(A) = \mathcal{D}(T) = L^\infty$ and
$$Af(x) = \int Q(x, dy)(f(y) - f(x))$$
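The construction of $P$ from $Q$ and the two forms of the generator can be illustrated on a finite state space. In the Python sketch below the rate matrix `Q`, the test function `f`, and all function names are hypothetical choices made for illustration. It builds $\lambda$ and $P$ as in the example and checks that $\lambda\int P(x,dy)(f(y)-f(x))$ agrees with $\int Q(x,dy)(f(y)-f(x))$.

```python
# Hypothetical jump rates Q[x][y] on the state space {0, 1, 2}; Q[x][x] = 0.
Q = [
    [0.0, 2.0, 1.0],
    [0.5, 0.0, 0.5],
    [3.0, 0.0, 0.0],
]

def uniformize(Q):
    """Return (lam, P) with lam = sup_x Q(x, S) and the transition matrix P."""
    n = len(Q)
    lam = max(sum(row) for row in Q)
    P = [[Q[x][y] / lam for y in range(n)] for x in range(n)]
    for x in range(n):
        P[x][x] = (lam - sum(Q[x])) / lam   # the "invisible" jumps from x to x
    return lam, P

def generator_from_Q(Q, f):
    """Af(x) = sum_y Q(x, y) (f(y) - f(x)), the formula in (2.1)."""
    n = len(Q)
    return [sum(Q[x][y] * (f[y] - f[x]) for y in range(n)) for x in range(n)]

def generator_from_P(lam, P, f):
    """lam * sum_y P(x, y) (f(y) - f(x)), the limit obtained before (2.1)."""
    n = len(P)
    return [lam * sum(P[x][y] * (f[y] - f[x]) for y in range(n)) for x in range(n)]

lam, P = uniformize(Q)
f = [1.0, -2.0, 4.0]
print(lam, generator_from_Q(Q, f), generator_from_P(lam, P, f))
```

The diagonal entries of `P` drop out of the second form because $f(y) - f(x) = 0$ when $y = x$, which is why the two expressions coincide.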
We are now ready for the main event.
Example 2.2. Diffusion processes. Suppose we have a family of measures
$P_x$ on $(C, \mathcal{C})$ so that under $P_x$ the coordinate maps $X_t(\omega) = \omega(t)$ give the unique
solution to MP($b$, $a$) starting from $x$, and $\mathcal{F}_t$ is the filtration generated by
the $X_t$. Let $C_K^2$ be the $C^2$ functions with compact support.
(2.2) Theorem. Suppose $a$ and $b$ are continuous and MP($b$, $a$) is well posed.
Then $C_K^2 \subset \mathcal{D}(A)$ and for all $f \in C_K^2$
$$Af(x) = \frac{1}{2}\sum_{i,j} a_{ij}(x)D_{ij}f(x) + \sum_i b_i(x)D_if(x)$$
Proof If $f \in C^2$ then Ito's formula implies
$$f(X_t) - f(X_0) = \sum_i \int_0^t D_if(X_s)b_i(X_s)\,ds + \text{local mart.} + \frac{1}{2}\sum_{i,j}\int_0^t D_{ij}f(X_s)a_{ij}(X_s)\,ds$$
If we let $T_n$ be a sequence of times that reduce the local martingale, stopping
at time $t \wedge T_n$ and taking expected value gives
$$E_xf(X_{t \wedge T_n}) - f(x) = \sum_i E_x\int_0^{t \wedge T_n} D_if(X_s)b_i(X_s)\,ds + \frac{1}{2}\sum_{i,j} E_x\int_0^{t \wedge T_n} D_{ij}f(X_s)a_{ij}(X_s)\,ds$$
If $f \in C_K^2$ and the coefficients are continuous then the integrands are bounded
and the bounded convergence theorem implies
(2.3) $$E_xf(X_t) - f(x) = \sum_i E_x\int_0^t D_if(X_s)b_i(X_s)\,ds + \frac{1}{2}\sum_{i,j} E_x\int_0^t D_{ij}f(X_s)a_{ij}(X_s)\,ds = E_x\int_0^t Af(X_s)\,ds$$
To prove the conclusion now we have to show that
$$\left\|\frac{E_xf(X_t) - f(x)}{t} - Af(x)\right\| \to 0 \quad\text{as } t \to 0$$
The first step is to bound the movement of the diffusion process.
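(2.4) Lemma. Let $r > 0$, $K \subset \mathbf{R}^d$ be compact, and $K_r = \{y : |x - y| \le
r \text{ for some } x \in K\}$. Suppose $|b(x)| \le B$ and $\sum_i a_{ii}(x) \le A$ for all $x \in K_r$. If
we let $S_r = \inf\{t : |X_t - X_0| \ge r\}$ then
$$\sup_{x \in K} P_x(S_r \le t) \le t \cdot 2\left(\frac{B}{r} + \frac{A}{r^2}\right)$$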
Proof Let $h(y) = \sum_i (y_i - x_i)^2$. Using Ito's formula we have
$$h(X_t) - h(X_0) = \sum_i \int_0^t 2(X_s^i - x_i)b_i(X_s)\,ds + \text{local mart.} + \frac{1}{2}\sum_i \int_0^t 2a_{ii}(X_s)\,ds$$
Letting $T_n$ be a sequence of times that reduce the local martingale, stopping
at $t \wedge S_r \wedge T_n$, taking expected value, then letting $n \to \infty$ and invoking the
bounded convergence theorem we have
$$E_xh(X_{t \wedge S_r}) = E_x\int_0^{t \wedge S_r} \sum_i \{2(X_s^i - x_i)b_i(X_s) + a_{ii}(X_s)\}\,ds \le 2t(rB + A)$$
Since $h \ge 0$ and $h(X_{S_r}) = r^2$ it follows that
$$r^2P_x(S_r \le t) \le 2t(rB + A)$$
which is the desired result. □
Proof of (2.2) Pick $R$ so that $H = \{x : |x| \le R - 1\}$ contains the support
of $f$ and let $K = \{x : |x| \le R\}$. Since $f \in C_K^2$ and the coefficients are
continuous, $Af$ is continuous and has compact support. This implies $Af$ is
uniformly continuous, that is, given $\varepsilon > 0$ we can pick $\delta \in (0,1]$ so that if
$|x - y| < \delta$ then $|Af(x) - Af(y)| < \varepsilon$. If we let $C = \sup_z |Af(z)|$ and use (2.3)
it follows that for $x \in K$
$$\left|\frac{E_xf(X_t) - f(x)}{t} - Af(x)\right| \le \frac{1}{t}E_x\int_0^t |Af(X_s) - Af(x)|\,ds \le \varepsilon + 2C\sup_{x \in K} P_x(S_\delta \le t)$$
For $x \notin K$, $f(x) = 0$ and $Af(x) = 0$. So using (2.3) again gives that for $x \notin K$
$$\left|\frac{E_xf(X_t) - f(x)}{t} - Af(x)\right| = \left|\frac{E_xf(X_t)}{t}\right| \le C\,P_x(T_H \le t) \le C\sup_{y \in \partial K} P_y(S_1 \le t)$$
where the second inequality comes from using the strong Markov property at
time $T_K$. (2.4) shows that
$$\sup_{y \in \partial K} P_y(S_1 \le t) \le \sup_{x \in K} P_x(S_\delta \le t) \to 0 \quad\text{as } t \to 0$$
Since $\varepsilon > 0$ is arbitrary, the desired result follows. □
For general $b$ and $a$ it is a hopelessly difficult problem to compute $\mathcal{D}(A)$.
To make this point we will now use (1.7) to compute the exact domains of the
semi-group and of the generator for one dimensional Brownian motion. Let $C_u$
be the functions $f$ on $\mathbf{R}$ that are bounded and uniformly continuous. Let $C_u^2$
be the functions $f$ on $\mathbf{R}$ so that $f, f', f'' \in C_u$.
(2.5) Theorem. For Brownian motion in $d = 1$, $\mathcal{D}(T) = C_u$, $\mathcal{D}(A) = C_u^2$, and
$Af(x) = f''(x)/2$ for all $f \in \mathcal{D}(A)$.
Remark. For Brownian motion in $d > 1$, $\mathcal{D}(A)$ is larger than $C_u^2$ and consists
of the functions so that $\Delta f$ exists in the sense of distribution and lies in $C_u$
(see Revuz and Yor (1991), p. 226).
Proof If $t > 0$ and $f$ is bounded then writing $p_t(x, z)$ for the Brownian
transition probability and using the triangle inequality we have
$$|T_tf(x) - T_tf(y)| \le \|f\|\int |p_t(x, z) - p_t(y, z)|\,dz$$
The right-hand side only depends on $|x - y|$ and converges to 0 as $|x - y| \to 0$
so $T_tf \in C_u$. If $f \in \mathcal{D}(T)$ then $\|T_tf - f\| \to 0$. We claim that this implies
$f \in C_u$. To check this, pick $\varepsilon > 0$, let $t > 0$ so that $\|T_tf - f\| < \varepsilon/3$ then pick $\delta$
so that if $|x - y| < \delta$ then $|T_tf(x) - T_tf(y)| < \varepsilon/3$. Using the triangle inequality
now it follows that if $|x - y| < \delta$ then
$$|f(x) - f(y)| \le |f(x) - T_tf(x)| + |T_tf(x) - T_tf(y)| + |T_tf(y) - f(y)| < \varepsilon$$
To prove that $\mathcal{D}(A) = C_u^2$ we begin by observing that (3.10) in Chapter 3
implies
$$U_\lambda f(x) = (2\lambda)^{-1/2}\int e^{-|x-y|\sqrt{2\lambda}}f(y)\,dy$$
Our first task is to show that if $f \in C_u$ then $U_\lambda f \in C_u^2$. Letting $g(x) = U_\lambda f(x)$
we have
$$g(x) = (2\lambda)^{-1/2}\int_x^\infty e^{-(y-x)\sqrt{2\lambda}}f(y)\,dy + (2\lambda)^{-1/2}\int_{-\infty}^x e^{-(x-y)\sqrt{2\lambda}}f(y)\,dy$$
Differentiating once with respect to $x$ and noting that the terms from
differentiating the limits cancel, we have
$$g'(x) = \int_x^\infty e^{-(y-x)\sqrt{2\lambda}}f(y)\,dy - \int_{-\infty}^x e^{-(x-y)\sqrt{2\lambda}}f(y)\,dy$$
Differentiating again gives
$$g''(x) = \sqrt{2\lambda}\int_x^\infty e^{-(y-x)\sqrt{2\lambda}}f(y)\,dy + \sqrt{2\lambda}\int_{-\infty}^x e^{-(x-y)\sqrt{2\lambda}}f(y)\,dy - 2f(x)$$
It is easy to use the dominated convergence theorem to justify
differentiating the integrals. From the formulas it is easy to see that $g, g', g'' \in C_u$ so
$g \in C_u^2$ and
$$g''(x) = 2\lambda g(x) - 2f(x)$$
Since $Ag(x) = \lambda g(x) - f(x)$ by the proof of (1.7), it follows that $Ag(x) =
g''(x)/2$.
To complete the proof now we need to show $\mathcal{D}(A) \supset C_u^2$. Let $h \in C_u^2$ and
define a function $f \in C_u$ by
$$f(x) = \lambda h(x) - \frac{1}{2}h''(x)$$
The function $y = h - U_\lambda f$ satisfies
$$\frac{1}{2}y''(x) - \lambda y(x) = 0$$
All solutions of this differential equation have the form $Ae^{x\sqrt{2\lambda}} + Be^{-x\sqrt{2\lambda}}$.
Since $h - U_\lambda f$ is bounded it follows that $h = U_\lambda f$ and the proof is complete. □
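The resolvent formula for Brownian motion can be tested numerically on a function where the answer is known. The Python sketch below is illustrative only: the choice $f = \cos$, the truncation width, and the grid size are assumptions made here. Since $\cos \in C_u^2$ and $(\lambda - A)\cos = (\lambda + 1/2)\cos$ when $Af = f''/2$, the formula should return $\cos(x)/(\lambda + 1/2)$.

```python
import math

def resolvent_bm(f, x, lam, half_width=30.0, n=60000):
    """Midpoint-rule value of (2*lam)**-0.5 * int exp(-|x-y|*sqrt(2*lam)) f(y) dy."""
    c = math.sqrt(2.0 * lam)
    du = 2.0 * half_width / n      # n is even, so the kink at u = 0 is a cell edge
    total = 0.0
    for k in range(n):
        u = -half_width + (k + 0.5) * du   # u = y - x
        total += math.exp(-c * abs(u)) * f(x + u)
    return total * du / c

lam = 1.0
POINTS = (0.0, 0.7, -2.0)          # illustrative evaluation points
for x in POINTS:
    numeric = resolvent_bm(math.cos, x, lam)
    exact = math.cos(x) / (lam + 0.5)   # because (lam - A) cos = (lam + 1/2) cos
    print(x, numeric, exact)
```

The truncation to a finite window is harmless here because the kernel decays like $e^{-|u|\sqrt{2\lambda}}$.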
7.3. Transition Probabilities
In the last section we saw that if $a$ and $b$ are continuous and MP($b$, $a$) is well
posed then
$$\frac{d}{dt}T_tf(x) = AT_tf(x)$$
for $f \in C_K^2$, or changing notation $v(t, x) = T_tf(x)$, we have
(3.1) $$\frac{dv}{dt} = Av(t, x)$$
where $A$ acts in the $x$ variable. In the previous discussion we have gone from the
process to the p.d.e. We will now turn around and go the other way. Consider
(3.2) (a) $u \in C^{1,2}$ and $u_t = Lu$ in $(0, \infty) \times \mathbf{R}^d$
      (b) $u$ is continuous on $[0, \infty) \times \mathbf{R}^d$ and $u(0, x) = f(x)$
Here we have returned to our old notation
$$Lf(x) = \frac{1}{2}\sum_{i,j} a_{ij}(x)D_{ij}f(x) + \sum_i b_i(x)D_if(x)$$
To explain the dual notation note that (3.1) says that $v(t, \cdot) \in \mathcal{D}(A)$ and satisfies
the equality, while (a) of (3.2) asks for $u \in C^{1,2}$. As for the boundary condition
(b) note that $f \in \mathcal{D}(A) \subset \mathcal{D}(T)$ implies that $\|T_tf - f\| \to 0$ as $t \to 0$.
To prove the existence of solutions to (3.2) we turn to Friedman (1975),
Section 6.4.
(3.3) Theorem. Suppose $a$ and $b$ are bounded and Hölder continuous. That
is, there are constants $0 < \delta, C < \infty$ so that
(HC) $$|a_{ij}(x) - a_{ij}(y)| \le C|x - y|^\delta \qquad |b_i(x) - b_i(y)| \le C|x - y|^\delta$$
Suppose also that $a$ is uniformly elliptic, that is, there is a constant $\lambda$ so that
for all $x, y$
(UE) $$\sum_{i,j} y_ia_{ij}(x)y_j \ge \lambda|y|^2$$
Then there is a function $p_t(x, y) > 0$ jointly continuous in $t > 0$, $x$, $y$, and $C^2$
as a function of $x$ which satisfies $dp/dt = Lp$ (with $L$ acting on the $x$ variable)
and so that if $f$ is bounded and continuous
$$u(t, x) = \int p_t(x, y)f(y)\,dy$$
satisfies (3.2). The maximum principle implies $|u(t, x)| \le \|f\|$.
To make the connection with the SDE we prove
(3.4) Theorem. Suppose $X_t$ is a nonexplosive solution of MP($b$, $a$). If $u$ is a
bounded solution of (3.2) then
$$u(t, x) = E_xf(X_t)$$
Proof Ito's formula implies
$$u(t - s, X_s) - u(t, X_0) = \int_0^s -\frac{\partial u}{\partial t}(t - r, X_r)\,dr + \sum_i \int_0^s D_iu(t - r, X_r)b_i(X_r)\,dr + \text{local mart.} + \frac{1}{2}\sum_{i,j}\int_0^s D_{ij}u(t - r, X_r)a_{ij}(X_r)\,dr$$
so (a) tells us that $u(t - s, X_s)$ is a bounded martingale on $[0, t)$. (b) implies
that $u(t - s, X_s) \to f(X_t)$ as $s \to t$, so the martingale convergence theorem
implies
$$u(t, x) = E_xf(X_t) \qquad □$$
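For one dimensional Brownian motion, where $L = \frac{1}{2}\,d^2/dx^2$ and $p_t(x,y)$ is the Gaussian density, the representation in (3.4) can be checked directly. The Python sketch below is illustrative: the initial condition $f = \cos$, the integration window, and the sample points are assumptions made here. The function $u(t,x) = e^{-t/2}\cos x$ is bounded, satisfies $u_t = \frac{1}{2}u_{xx}$ and $u(0,x) = \cos x$, so it should agree with $\int p_t(x,y)\cos(y)\,dy$.

```python
import math

def heat_semigroup(f, t, x, n=4000, width=10.0):
    """Midpoint-rule value of int p_t(x, y) f(y) dy for Brownian motion."""
    sd = math.sqrt(t)
    du = 2.0 * width * sd / n
    total = 0.0
    for k in range(n):
        u = -width * sd + (k + 0.5) * du   # u = y - x
        density = math.exp(-u * u / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
        total += density * f(x + u)
    return total * du

CASES = [(0.5, 0.0), (1.0, 0.7), (2.0, -1.3)]   # illustrative (t, x) pairs
for t, x in CASES:
    numeric = heat_semigroup(math.cos, t, x)
    closed_form = math.exp(-t / 2.0) * math.cos(x)   # solves u_t = u_xx / 2
    print(t, x, numeric, closed_form)
```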
$p_t(x, y)$ is called a fundamental solution of the parabolic equation $u_t =
Lu$ since it can be used to produce solutions for any bounded continuous initial
data $f$. Its importance for us is that (3.4) implies that
$$P_x(X_t \in B) = \int_B p_t(x, y)\,dy$$
i.e., $p_t(x, y)$ is the transition density for our diffusion process. Let $D = \{x :
|x| < R\}$. To obtain results for unbounded coefficients, we will consider
(3.5) (a) $u \in C^{1,2}$ and $du/dt = Lu$ in $(0, \infty) \times D$
      (b) $u$ is continuous on $[0, \infty) \times \bar D$ with $u(0, x) = f(x)$,
          and $u(t, y) = 0$ when $t > 0$, $y \in \partial D$
To prove existence of solutions to (3.5) we turn to Dynkin (1965), Vol. II,
pages 230-231.
(3.6) Theorem. Let $D = \{x : |x| < R\}$ and suppose that (HC) and (UE)
hold in $D$. Then there is a function $p_t^R(x, y) > 0$ jointly continuous in $t > 0$,
$x, y \in D$, $C^2$ as a function of $x$ which satisfies $dp^R/dt = Lp^R$ (with $L$ acting on
the $x$ variable) and so that if $f$ is bounded and continuous
$$u(t, x) = \int_D p_t^R(x, y)f(y)\,dy$$
satisfies (3.5).
To make the connection with the SDE we prove
(3.7) Theorem. Suppose $X_t$ is any solution of MP($b$, $a$) and let $\tau = \inf\{t :
X_t \notin D\}$. If $u$ is any solution of (3.5) then
$$u(t, x) = E_x(f(X_t); \tau > t)$$
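Proof The fact that $u$ is continuous in $[0, t] \times \bar D$ implies it is bounded there.
Ito's formula implies that for $0 \le s < t \wedge \tau$
$$u(t - s, X_s) - u(t, X_0) = \int_0^s -\frac{\partial u}{\partial t}(t - r, X_r)\,dr + \sum_i \int_0^s D_iu(t - r, X_r)b_i(X_r)\,dr + \text{local mart.} + \frac{1}{2}\sum_{i,j}\int_0^s D_{ij}u(t - r, X_r)a_{ij}(X_r)\,dr$$
so (a) tells us that $u(t - s, X_s)$ is a bounded local martingale on $[0, t \wedge \tau)$. (b)
implies that $u(t - s, X_s) \to 0$ as $s \uparrow \tau$ and $u(t - s, X_s) \to f(X_t)$ as $s \to t$ on
$\{\tau > t\}$. Using (2.7) from Chapter 2 now, we have
$$u(t, x) = E_x(f(X_t); \tau > t) \qquad □$$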
Combining (3.6) and (3.7) we see that
$$E_x(f(X_t); \tau > t) = \int_D p_t^R(x, y)f(y)\,dy$$
Letting $R \uparrow \infty$ and $p_t(x, y) = \lim_{R \to \infty} p_t^R(x, y)$, which exists since (3.7) implies
$R \to p_t^R(x, y)$ is increasing, we have
(3.8) Theorem. Suppose that the martingale problem MP($b$, $a$) is well
posed and that $a$ and $b$ satisfy (HC) and (UE) locally. That is, they
hold in $\{x : |x| \le R\}$ for any $R < \infty$. Then for each $t > 0$ there is a lower
semicontinuous function $p_t(x, y) > 0$ so that if $f$ is bounded and continuous
then
$$E_xf(X_t) = \int p_t(x, y)f(y)\,dy$$
Remark. The energetic reader can probably show that $p_t(x, y)$ is continuous.
However, lower semicontinuity implies that $p_t(x, y)$ is bounded away from 0 on
compact sets and this will be enough for our results in Section 7.5.
7.4. Harris Chains
In this section we will give a quick treatment of the theory of Harris chains to
prepare for applications to diffusions in the next section. In this section and
the next, we will assume that the reader is familiar with the basic theory of
Markov chains on a countable state space as explained for example in the first
five sections of Chapter 5 of Durrett (1995). This section is a close relative of
Section 5.6 there.
We will formulate the results here for a transition probability defined
on a measurable space $(S, \mathcal{S})$. Intuitively, $P(X_{n+1} \in A \,|\, X_n = x) = p(x, A)$.
Formally, it is a function $p : S \times \mathcal{S} \to [0, 1]$ that satisfies
    for fixed $A \in \mathcal{S}$, $x \to p(x, A)$ is measurable
    for fixed $x \in S$, $A \to p(x, A)$ is a probability measure
In our applications to diffusions we will take $S = \mathbf{R}^d$, and let
$$p(x, A) = \int_A p_1(x, y)\,dy$$
where $p_t(x, y)$ is the transition probability introduced in the previous section.
By taking $t = 1$ we will be able to investigate the asymptotic behavior of $X_n$
as $n \to \infty$ through the integers but this and the Markov property will allow us
to get results for $t \to \infty$ through the real numbers.
We say that a Markov chain $X_n$ with transition probability $p$ is a Harris
chain if we can find sets $A, B \in \mathcal{S}$, a function $q$ with $q(x, y) \ge \varepsilon > 0$ for $x \in A$,
$y \in B$, and a probability measure $\rho$ concentrated on $B$ so that:
(i) If $\tau_A = \inf\{n \ge 0 : X_n \in A\}$, then $P_z(\tau_A < \infty) > 0$ for all $z \in S$
(ii) If $x \in A$ and $C \subset B$ then $p(x, C) \ge \int_C q(x, y)\rho(dy)$
In the diffusions we consider we can take $A = B$ to be a ball with radius $r$, $\rho$ to be
a constant $c$ times Lebesgue measure restricted to $B$, and $q(x, y) = p_1(x, y)/c$.
See (5.2) below. It is interesting to note that the new theory still contains most
of the old one as a special case.
Example 4.1. Countable State Space. Suppose $X_n$ is a Markov chain on
a countable state space $S$. In order for $X_n$ to be a Harris chain it is necessary
and sufficient that there be a state $u$ with $P_x(X_n = u \text{ for some } n \ge 0) > 0$ for
all $x \in S$.
Proof To prove sufficiency, pick $v$ so that $p(u,v) > 0$. If we let $A = \{u\}$ and
$B = \{v\}$ then (i) and (ii) hold. To prove necessity, let $\{u\}$ be a point with
$\rho(\{u\}) > 0$ and note that for all $x$
$$P_x(X_n = u \text{ for some } n \ge 0) \ge E_x\{q(X_{\tau_A}, u)\,\rho(\{u\})\} > 0 \qquad \Box$$
The developments in this section are based on two simple ideas, (i) To
make the theory of Markov chains on a countable state space work, all we need
260 Chapter 7 Diffusions as Markov Processes
is to have one point in the state space which is hit. (ii) In a Harris chain we can
manufacture such a point (called $\alpha$ below) which corresponds to "being in $B$
with distribution $\rho$." The notation needed to carry out this plan is occasionally
obnoxious but we hope these words of wisdom will help the reader stay focused
on the simple ideas that underlie these developments.
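Given a Harris chain on $(S,\mathcal{S})$, we will construct a Markov chain $\bar X_n$ with
transition probability $\bar p$ on $(\bar S, \bar{\mathcal{S}})$ where $\bar S = S \cup \{\alpha\}$ and $\bar{\mathcal{S}} = \{B, B \cup \{\alpha\} :
B \in \mathcal{S}\}$. Thinking of $\alpha$ as corresponding to being on $B$ with distribution $\rho$, we
define the modified transition probability as follows:
If $x \in S - A$, $\bar p(x,C) = p(x,C)$ for $C \in \mathcal{S}$
If $x \in A$, $\bar p(x,\{\alpha\}) = \epsilon$
$\bar p(x,C) = p(x,C) - \epsilon\rho(C)$ for $C \in \mathcal{S}$
If $x = \alpha$, $\bar p(\alpha, D) = \int \rho(dx)\,\bar p(x,D)$ for $D \in \bar{\mathcal{S}}$
Here and in what follows, we will reserve $A$ and $B$ for the special sets that
occur in the definition and use $C$ and $D$ for generic elements of $\mathcal{S}$. We will
often simplify notation by writing $\bar p(x,\alpha)$ instead of $\bar p(x,\{\alpha\})$, $\mu(\alpha)$ instead of
$\mu(\{\alpha\})$, etc.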
Our first step is to prove three technical lemmas that will help us carry out
the proofs below. Define a transition probability $v$ by
$v(x,\{x\}) = 1$ if $x \in S$
$v(\alpha,C) = \rho(C)$
In words, $v$ leaves mass in $S$ alone but returns the mass at $\alpha$ to $S$ and distributes
it according to $\rho$.
(4.1) Lemma. (a) $v\bar p = \bar p$ and (b) $\bar p v = p$.
Proof Before giving the proof we would like to remind the reader that
measures multiply the transition probability on the left, i.e., in the first case we
want to show $\mu v \bar p = \mu \bar p$. If we first make a transition according to $v$ and then
one according to $\bar p$, this amounts to one transition according to $\bar p$, since only
mass at $\alpha$ is affected by $v$ and
$$\bar p(\alpha,D) = \int \rho(dx)\,\bar p(x,D).$$
The second equality also follows easily from the definition. In words, if $\bar p$ acts
first and then $v$, then this is the same as one transition according to $p$ since $v$
returns the mass at $\alpha$ to where it came from. $\Box$
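The construction of the modified transition probability and the two identities in (4.1) can be checked by hand on a finite state space. The following is a minimal sketch, not part of the text: the 3-state matrix `p`, the set `A`, the measure `rho`, and the constant `eps` are hypothetical values chosen so that $p(x,y) \ge \epsilon\rho(y)$ for $x \in A$, and the helper names `matmul` and `split_chain` are made up for this illustration. The extra point $\alpha$ is stored as the last index.

```python
from fractions import Fraction as Fr

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def split_chain(p, A, rho, eps):
    """Return (p_bar, v) on S plus one extra point alpha (index len(p))."""
    n = len(p)
    alpha = n
    p_bar = [[Fr(0)] * (n + 1) for _ in range(n + 1)]
    for x in range(n):
        for y in range(n):
            # For x in A, the mass eps * rho is moved from S to alpha.
            p_bar[x][y] = p[x][y] - (eps * rho[y] if x in A else 0)
        p_bar[x][alpha] = eps if x in A else Fr(0)
    # Row alpha: start with distribution rho, then take one p_bar step.
    for d in range(n + 1):
        p_bar[alpha][d] = sum(rho[x] * p_bar[x][d] for x in range(n))
    # v fixes points of S and sends alpha back to S with distribution rho.
    v = [[Fr(int(i == j)) for j in range(n + 1)] for i in range(n)]
    v.append(list(rho) + [Fr(0)])
    return p_bar, v

# Hypothetical example: S = {0, 1, 2}, A = {0, 1}, rho concentrated on B = {1, 2}.
p = [[Fr(1, 2), Fr(1, 4), Fr(1, 4)],
     [Fr(1, 3), Fr(1, 3), Fr(1, 3)],
     [Fr(0), Fr(1, 2), Fr(1, 2)]]
A = {0, 1}
rho = [Fr(0), Fr(1, 2), Fr(1, 2)]
eps = Fr(1, 2)

p_bar, v = split_chain(p, A, rho, eps)
lemma_a = matmul(v, p_bar) == p_bar                      # (a): v p_bar = p_bar
p_bar_v = matmul(p_bar, v)
lemma_b = all(p_bar_v[x][:3] == p[x] and p_bar_v[x][3] == 0
              for x in range(3))                         # (b): p_bar v = p on S
print(lemma_a, lemma_b)
```

Exact rational arithmetic is used so that the two identities can be compared with `==` rather than up to rounding.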
From (4.1) it follows easily that we have:
(4.2) Lemma. Let $Y_n$ be an inhomogeneous Markov chain with $p_{2k} = v$ and
$p_{2k+1} = \bar p$. Then $\bar X_n = Y_{2n}$ is a Markov chain with transition probability $\bar p$ and
$X_n = Y_{2n+1}$ is a Markov chain with transition probability $p$.
(4.2) shows that there is an intimate relationship between the asymptotic
behavior of $X_n$ and of $\bar X_n$. To quantify this we need a definition. If $f$ is a
bounded measurable function on $S$, let $\bar f = vf$, i.e.,
$\bar f(x) = f(x)$ for $x \in S$
$\bar f(\alpha) = \int f\,d\rho$
(4.3) Lemma. If $\mu$ is a probability measure on $(S,\mathcal{S})$ then
$$E_\mu f(X_n) = E_\mu \bar f(\bar X_n)$$
Proof Observe that if $X_n$ and $\bar X_n$ are constructed as in (4.2), and $P(\bar X_0 \in
S) = 1$ then $X_0 = \bar X_0$ and $X_n$ is obtained from $\bar X_n$ by making a transition
according to $v$. $\Box$
Before developing the theory we give one example to explain why some of
the statements to come will be messy.
Example 4.2. Perverted Brownian motion. For $x$ that is not an integer
$\ge 2$, let $p(x,\cdot)$ be the transition probability of a one dimensional Brownian
motion. When $x \ge 2$ is an integer let
$p(x,\{x+1\}) = 1 - x^{-2}$
$p(x,C) = x^{-2}\,|C \cap [0,1]|$ if $x+1 \notin C$
$p$ is the transition probability of a Harris chain (take $A = B = (0,1)$, $\rho$ =
Lebesgue measure on $B$) but
$$P_2(X_n = n+2 \text{ for all } n) > 0$$
I can sympathize with the reader who thinks that such crazy chains will not
arise "in applications," but it seems easier (and better) to adapt the theory to
include them than to modify the assumptions to exclude them.
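The positivity claim in Example 4.2 comes down to the product $\prod_{k \ge 2}(1 - k^{-2})$, whose partial products telescope to $(M+1)/(2M)$ and so decrease to $1/2$. A small sketch of that computation, with a hypothetical helper name `stay_on_integers`:

```python
from fractions import Fraction

def stay_on_integers(N):
    """P_2(X_1 = 3, ..., X_N = N + 2): the chain moves k -> k + 1 for k = 2, ..., N + 1."""
    prob = Fraction(1)
    for k in range(2, N + 2):
        prob *= 1 - Fraction(1, k * k)
    return prob

for N in (1, 10, 100):
    print(N, stay_on_integers(N), float(stay_on_integers(N)))
```

Since the partial products stay above $1/2$, the chain started at 2 marches off through the integers forever with probability $1/2$, and on that event it never visits $A$.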
a. Recurrence and transience
We begin with the dichotomy between recurrence and transience. Let
$R = \inf\{n \ge 1 : \bar X_n = \alpha\}$. If $P_\alpha(R < \infty) = 1$ then we call the chain
recurrent, otherwise we call it transient. Let $R_1 = R$ and for $k \ge 2$, let
$R_k = \inf\{n > R_{k-1} : \bar X_n = \alpha\}$ be the time of the $k$th return to $\alpha$. The strong
Markov property implies $P_\alpha(R_k < \infty) = P_\alpha(R < \infty)^k$ so $P_\alpha(\bar X_n = \alpha \text{ i.o.}) = 1$
in the recurrent case and is 0 in the transient case. From this the next result
follows easily.
(4.4) Theorem. Let $\lambda(C) = \sum_n 2^{-n}\,\bar p^n(\alpha, C)$. In the recurrent case if $\lambda(C) > 0$
then $P_\alpha(\bar X_n \in C \text{ i.o.}) = 1$. For $\lambda$ a.e. $x$, $P_x(R < \infty) = 1$.
Remark. Here and in what follows $\bar p^n(x,C) = P_x(\bar X_n \in C)$ is the $n$th iterate
of the transition probability defined inductively by $\bar p^1 = \bar p$ and for $n \ge 1$
$$\bar p^{n+1}(x,C) = \int \bar p(x,dy)\,\bar p^n(y,C)$$
Proof The first conclusion follows from the following fact (see e.g., (2.3) in
Chapter 5 of Durrett (1995)). Let $X_n$ be a Markov chain and suppose
$$P\left(\cup_{m=n+1}^\infty \{X_m \in B_m\} \mid X_n\right) \ge \delta > 0 \text{ on } \{X_n \in A_n\}$$
Then $P(\{X_n \in A_n \text{ i.o.}\} - \{X_n \in B_n \text{ i.o.}\}) = 0$. Taking $A_n = \{\alpha\}$ and
$B_n = C$ gives the desired result. To prove the second conclusion, let $D = \{x :
P_x(R < \infty) < 1\}$ and observe that if $\bar p^n(\alpha, D) > 0$ for some $n$ then
$$P_\alpha(\bar X_m = \alpha \text{ i.o.}) \le \int \bar p^n(\alpha, dx)\,P_x(R < \infty) < 1 \qquad \Box$$
Remark. Example 4.2 shows that we cannot expect to have $P_x(R < \infty) = 1$
for all $x$. To see that this can occur even when the state space is countable,
consider a branching process in which the offspring distribution has $p_0 > 0$ and
$\sum_k k p_k > 1$. If we take $A = B = \{0\}$ this is a Harris chain by Example 4.1.
Since $p^n(0,0) = 1$ it is recurrent.
b. Stationary measures
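(4.5) Theorem. Let $R = \inf\{n \ge 1 : \bar X_n = \alpha\}$. In the recurrent case,
$$\bar\mu(C) = E_\alpha\left(\sum_{n=0}^{R-1} 1_{\{\bar X_n \in C\}}\right) = \sum_{n=0}^\infty P_\alpha(\bar X_n \in C, R > n)$$
defines a $\sigma$-finite stationary measure for $\bar p$, with $\bar\mu \ll \lambda$, the measure defined in
(4.4). $\mu = \bar\mu v$ is a $\sigma$-finite stationary measure for $p$.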
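Proof If $\lambda(C) = 0$ then the definition of $\lambda$ implies $P_\alpha(\bar X_n \in C) = 0$, so
$P_\alpha(\bar X_n \in C, R > n) = 0$ for all $n$, and $\bar\mu(C) = 0$. We next check that $\bar\mu$ is
$\sigma$-finite. Let $G_{k,\delta} = \{x : \bar p^k(x,\alpha) \ge \delta\}$. Let $T_0 = 0$ and let $T_n = \inf\{m \ge
T_{n-1} + k : \bar X_m \in G_{k,\delta}\}$. The definition of $G_{k,\delta}$ implies
$$P(T_n < T_\alpha \mid T_{n-1} < T_\alpha) \le (1 - \delta)$$
so if we let $N = \inf\{n : T_n \ge T_\alpha\}$ then $EN \le 1/\delta$. Since we can only have
$\bar X_m \in G_{k,\delta}$ with $R > m$ when $T_n \le m < T_n + k$ for some $0 \le n < N$ it
follows that $\bar\mu(G_{k,\delta}) \le k/\delta$. Part (i) of the definition of a Harris chain implies
$\bar S \subset \cup_{k,m \ge 1} G_{k,1/m}$ and $\sigma$-finiteness follows.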
Next we show that $\bar\mu \bar p = \bar\mu$.
Case 1. Let $C$ be a set that does not contain $\alpha$. Using the definition of $\bar\mu$ and
Fubini's theorem,
$$\int \bar\mu(dy)\,\bar p(y,C) = \sum_{n=0}^\infty \int P_\alpha(\bar X_n \in dy, R > n)\,\bar p(y,C)
= \sum_{n=0}^\infty P_\alpha(\bar X_{n+1} \in C, R > n+1) = \bar\mu(C)$$
since $\alpha \notin C$ and $P_\alpha(\bar X_0 = \alpha) = 1$.
Case 2. To complete the proof now it suffices to consider $C = \{\alpha\}$.
$$\int \bar\mu(dy)\,\bar p(y,\alpha) = \sum_{n=0}^\infty \int P_\alpha(\bar X_n \in dy, R > n)\,\bar p(y,\alpha)
= \sum_{n=0}^\infty P_\alpha(R = n+1) = 1 = \bar\mu(\alpha)$$
where in the last two equalities we have used recurrence and the fact that when
$C = \{\alpha\}$ only the $n = 0$ term contributes in the definition.
Turning to the properties of $\mu$, we note that $\mu = \bar\mu v$ and $\bar\mu(\alpha) = 1$ so $\mu$ is
$\sigma$-finite. To check $\mu p = \mu$, we note that (i) using the definition $\mu = \bar\mu v$ and (b)
in (4.1), (ii) using (a) in (4.1), (iii) using $\bar\mu \bar p = \bar\mu$, and (iv) using the definition
$\mu = \bar\mu v$:
$$\mu p = (\bar\mu v)(\bar p v) = \bar\mu \bar p v = \bar\mu v = \mu \qquad \Box$$
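In the countable state space setting of Example 4.1, where a genuine state can play the role of $\alpha$, the formula $\sum_n P_\alpha(X_n \in C, R > n)$ can be evaluated numerically and its stationarity checked directly. The sketch below makes that assumption; the 3-state matrix `p`, the base state `a = 0`, and the helper name `excursion_measure` are hypothetical, and the infinite sum is truncated after `n_terms` steps.

```python
def excursion_measure(p, a, n_terms=2000):
    """mu(y) = sum over n of P_a(X_n = y, R > n), R = first return time to a."""
    n = len(p)
    w = [0.0] * n          # w[y] = P_a(X_n = y, R > n); at n = 0 this is the point mass at a
    w[a] = 1.0
    mu = w[:]
    for _ in range(n_terms):
        # One more step while avoiding a: paths that reach a are dropped.
        w = [0.0 if y == a else sum(w[x] * p[x][y] for x in range(n))
             for y in range(n)]
        mu = [m + wy for m, wy in zip(mu, w)]
    return mu

# Hypothetical irreducible chain on {0, 1, 2}.
p = [[0.5, 0.25, 0.25],
     [0.2, 0.3, 0.5],
     [0.4, 0.4, 0.2]]
mu = excursion_measure(p, a=0)
mu_p = [sum(mu[x] * p[x][y] for x in range(3)) for y in range(3)]
print(mu)
print(mu_p)
```

Only the $n = 0$ term contributes at the base state, so `mu[0]` is 1, matching the normalization used in the proof.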
To investigate uniqueness of the stationary measure we begin with:
(4.6) Lemma. If $\nu$ is a $\sigma$-finite stationary measure for $p$, then $\nu(A) < \infty$ and
$\bar\nu = \nu\bar p$ is a $\sigma$-finite stationary measure for $\bar p$ with $\bar\nu(\alpha) < \infty$.
Proof We will first show that $\nu(A) < \infty$. If $\nu(A) = \infty$ then part (ii) of the
definition of a Harris chain implies $\nu(C) = \infty$ for all sets $C$ with $\rho(C) > 0$.
If $B = \cup_i B_i$ with $\nu(B_i) < \infty$ then $\rho(B_i) = 0$ by the last observation and
$\rho(B) = 0$ by countable subadditivity, a contradiction. So $\nu(A) < \infty$ and
$\bar\nu(\alpha) = \nu\bar p(\alpha) = \epsilon\nu(A) < \infty$. Using the fact that $\nu p = \nu$, we find
$$\bar\nu(C) = \nu\bar p(C) = \nu(C) - \epsilon\,\nu(A)\,\rho(B \cap C)$$
the last subtraction being well defined since $\nu(A) < \infty$. From the last equality
it is clear that $\bar\nu$ is $\sigma$-finite. Since $\bar\nu(\alpha) = \epsilon\nu(A)$, it also follows that $\bar\nu v = \nu$. To
check $\bar\nu\bar p = \bar\nu$, we observe that (a) of (4.1), the last result, and the definition of
$\bar\nu$ imply
$$\bar\nu\bar p = \bar\nu v\bar p = \nu\bar p = \bar\nu \qquad \Box$$
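(4.7) Theorem. Suppose $p$ is recurrent. If $\nu$ is a $\sigma$-finite stationary measure
then $\nu = \bar\nu(\alpha)\mu$ where $\mu$ is the measure constructed in the proof of (4.5).
Proof By (4.6) it suffices to prove that if $\bar\nu$ is a $\sigma$-finite stationary measure
for $\bar p$ with $\bar\nu(\alpha) < \infty$ then $\bar\nu = \bar\nu(\alpha)\bar\mu$. Our first step is to observe that
$$\bar\nu(C) = \bar\nu(\alpha)\bar p(\alpha, C) + \int_{\bar S - \{\alpha\}} \bar\nu(dy)\,\bar p(y, C)$$
Using the last identity to replace $\bar\nu(dy)$ on the right-hand side we have
$$\bar\nu(C) = \bar\nu(\alpha)\bar p(\alpha, C) + \int_{\bar S - \{\alpha\}} \bar\nu(\alpha)\bar p(\alpha, dy)\,\bar p(y, C)
+ \int_{\bar S - \{\alpha\}} \bar\nu(dx) \int_{\bar S - \{\alpha\}} \bar p(x,dy)\,\bar p(y,C)$$
$$= \bar\nu(\alpha)P_\alpha(\bar X_1 \in C) + \bar\nu(\alpha)P_\alpha(\bar X_1 \ne \alpha, \bar X_2 \in C)
+ P_{\bar\nu}(\bar X_0 \ne \alpha, \bar X_1 \ne \alpha, \bar X_2 \in C)$$
Continuing in the obvious way we have
$$\bar\nu(C) = \bar\nu(\alpha) \sum_{m=1}^n P_\alpha(\bar X_k \ne \alpha \text{ for } 1 \le k < m, \bar X_m \in C)
+ P_{\bar\nu}(\bar X_k \ne \alpha \text{ for } 0 \le k < n+1, \bar X_{n+1} \in C)$$
The last term is nonnegative, so letting $n \to \infty$ we have
$$\bar\nu(C) \ge \bar\nu(\alpha)\bar\mu(C)$$
To turn the $\ge$ into an $=$ we observe that
$$\bar\nu(\alpha) = \int \bar\nu(dx)\,\bar p^n(x,\alpha) \ge \bar\nu(\alpha) \int \bar\mu(dx)\,\bar p^n(x,\alpha) = \bar\nu(\alpha)\bar\mu(\alpha) = \bar\nu(\alpha)$$
since $\bar\mu(\alpha) = 1$. Let $S_n = \{x : \bar p^n(x,\alpha) > 0\}$. By assumption $\cup_n S_n = \bar S$. If
$\bar\nu(D) > \bar\nu(\alpha)\bar\mu(D)$ for some $D$, then $\bar\nu(D \cap S_n) > \bar\nu(\alpha)\bar\mu(D \cap S_n)$ for some $n$ and
it follows that $\bar\nu(\alpha) > \bar\nu(\alpha)$, a contradiction. $\Box$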
(4.5) and (4.7) show that a recurrent Harris chain has a $\sigma$-finite stationary
measure that is unique up to constant multiples. The next result, which goes
in the other direction, will be useful in the proof of the convergence theorem
and in checking recurrence of concrete examples.
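(4.8) Theorem. If there is a stationary probability distribution then the chain
is recurrent.
Proof Let $\bar\pi = \pi\bar p$, which is a stationary distribution by (4.6), and note that
part (i) of the definition of a Harris chain implies $\bar\pi(\alpha) > 0$. Suppose that the
chain is transient, i.e., $P_\alpha(R < \infty) = q < 1$. If $R_k$ is the time of the $k$th return
then $P_\alpha(R_k < \infty) = q^k$ so if $x \ne \alpha$
$$E_x\left(\sum_{n=0}^\infty 1_{\{\bar X_n = \alpha\}}\right) = \sum_{k=1}^\infty P_x(R_k < \infty) \le \sum_{k=1}^\infty q^{k-1} = \frac{1}{1-q}$$
The last upper bound is also valid when $x = \alpha$ since
$$E_\alpha\left(\sum_{n=0}^\infty 1_{\{\bar X_n = \alpha\}}\right) = 1 + \sum_{k=1}^\infty P_\alpha(R_k < \infty) = \sum_{k=0}^\infty q^k = \frac{1}{1-q}$$
Integrating with respect to $\bar\pi$ and using the fact that $\bar\pi$ is a stationary
distribution we have a contradiction that proves the result
$$\frac{1}{1-q} \ge E_{\bar\pi}\left(\sum_{n=0}^\infty 1_{\{\bar X_n = \alpha\}}\right) = \sum_{n=0}^\infty \bar\pi(\alpha) = \infty \qquad \Box$$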
c. Convergence theorem
If $I$ is a set of positive integers, we let g.c.d.$(I)$ be the greatest common
divisor of the elements of $I$. We say that a recurrent Harris chain $X_n$ is
aperiodic if g.c.d.$(\{n \ge 1 : \bar p^n(\alpha,\alpha) > 0\}) = 1$. This occurs, for example, if we can
take $A = B$ in the definition for then $\bar p(\alpha,\alpha) > 0$. A well known consequence of
aperiodicity is
(4.9) Lemma. There is an $m_0 < \infty$ so that $\bar p^m(\alpha,\alpha) > 0$ for $m \ge m_0$.
For a proof see (5.4) of Chapter 5 in Durrett (1995). We are now ready to prove
(4.10) Theorem. Let $X_n$ be an aperiodic recurrent Harris chain with stationary
distribution $\pi$. If $P_x(R < \infty) = 1$ then as $n \to \infty$,
$$\|p^n(x,\cdot) - \pi(\cdot)\| \to 0$$
Remark. Here $\|\cdot\|$ denotes the total variation distance between the measures.
(4.4) guarantees that $\lambda$ a.e. $x$ satisfies the hypothesis, while (4.5), (4.7), and
(4.8) imply $\pi$ is absolutely continuous with respect to $\lambda$.
Proof In view of (4.3) and (4.6) it suffices to show that if $\bar\pi = \pi\bar p$ then
$$\|\bar p^n(x,\cdot) - \bar\pi(\cdot)\| \to 0$$
Our second reduction is to note that if $x \ne \alpha$ then
$$\bar p^n(x,\cdot) = \sum_{m=1}^n P_x(R = m)\,\bar p^{n-m}(\alpha,\cdot)$$
so we have
$$\|\bar p^n(x,\cdot) - \bar\pi(\cdot)\| \le \sum_{m=1}^n P_x(R = m)\,\|\bar p^{n-m}(\alpha,\cdot) - \bar\pi(\cdot)\|$$
and it suffices to prove the result when $x = \alpha$.
Let $\bar S^2 = \bar S \times \bar S$. Define a transition probability $\bar p$ on $\bar S \times \bar S$ by
$$\bar p((x_1,x_2), C_1 \times C_2) = \bar p(x_1, C_1)\,\bar p(x_2, C_2)$$
That is, each coordinate moves independently. We will call this the "product
chain" (think of product measure) and denote it by $Z_n$.
We claim that the product chain is a Harris chain with $A = \{(\alpha,\alpha)\}$ and
$B = B \times B$. Clearly (ii) is satisfied. To check (i) we need to show that for all
$(x_1, x_2)$ there is an $N$ so that $\bar p^N((x_1,x_2),(\alpha,\alpha)) > 0$. To prove this let $K$ and
$L$ be such that $\bar p^K(x_1,\alpha) > 0$ and $\bar p^L(x_2,\alpha) > 0$, let $M \ge m_0$, the constant in
(4.9), and take $N = K + L + M$. From the definitions it follows that
$$\bar p^{K+L+M}(x_1,\alpha) \ge \bar p^K(x_1,\alpha)\,\bar p^{L+M}(\alpha,\alpha) > 0$$
$$\bar p^{K+L+M}(x_2,\alpha) \ge \bar p^L(x_2,\alpha)\,\bar p^{K+M}(\alpha,\alpha) > 0$$
and hence $\bar p^N((x_1,x_2),(\alpha,\alpha)) > 0$.
The next step is to show that the product chain is recurrent. To do this,
note that (4.6) implies $\bar\pi = \pi\bar p$ is a stationary distribution for $\bar p$, and the two
coordinates move independently, so
$$\bar\pi^2(C_1 \times C_2) = \bar\pi(C_1)\,\bar\pi(C_2)$$
defines a stationary probability distribution for the product chain, and the desired conclusion
follows from (4.8).
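To prove the convergence theorem now we will write $Z_n = (\bar X_n, \bar Y_n)$ and
consider $Z_n$ with initial distribution $\delta_\alpha \times \bar\pi$. That is $\bar X_0 = \alpha$ with probability
1 and $\bar Y_0$ has the stationary distribution $\bar\pi$. Let $T = \inf\{n : Z_n = (\alpha,\alpha)\} = R$.
(4.8), (4.7), and (4.5) imply $\bar\pi \times \bar\pi \ll \lambda$ so (4.4) implies
$$P_{\delta_\alpha \times \bar\pi}(T < \infty) = 1$$
Dropping the subscript $\delta_\alpha \times \bar\pi$ from $P$ and considering the time $T$,
$$P(\bar X_n \in C, T \le n) = \sum_{m=1}^n P(T = m)\,\bar p^{n-m}(\alpha, C) = P(\bar Y_n \in C, T \le n)$$
since on $\{T = m\}$, $\bar X_m = \bar Y_m = \alpha$. Now
$$P(\bar X_n \in C) = P(\bar X_n \in C, T \le n) + P(\bar X_n \in C, T > n)$$
$$= P(\bar Y_n \in C, T \le n) + P(\bar X_n \in C, T > n)$$
$$\le P(\bar Y_n \in C) + P(T > n)$$
Interchanging the roles of $\bar X$ and $\bar Y$ we have
$$P(\bar Y_n \in C) \le P(\bar X_n \in C) + P(T > n)$$
and it follows that
$$\|\bar p^n(\alpha,\cdot) - \bar\pi(\cdot)\| = \|P(\bar X_n \in \cdot) - P(\bar Y_n \in \cdot)\|
= \sup_C |P(\bar X_n \in C) - P(\bar Y_n \in C)| \le P(T > n) \to 0$$
as $n \to \infty$. $\Box$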
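For a finite chain the conclusion of (4.10) can be watched numerically. The sketch below is only an illustration on a hypothetical aperiodic 3-state matrix `p`; it does not use the coupling from the proof. It approximates $\pi$ by iterating the chain many times and then evaluates $\sup_C |p^n(x,C) - \pi(C)|$, which for a finite state space equals half the $\ell^1$ distance between the two probability vectors.

```python
def step(dist, p):
    """One step of the chain: the row vector dist times the matrix p."""
    n = len(p)
    return [sum(dist[x] * p[x][y] for x in range(n)) for y in range(n)]

def total_variation(mu, nu):
    """sup over sets C of |mu(C) - nu(C)| for two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

# Hypothetical aperiodic, irreducible chain on {0, 1, 2}.
p = [[0.5, 0.25, 0.25],
     [0.2, 0.3, 0.5],
     [0.4, 0.4, 0.2]]

pi = [1.0 / 3] * 3
for _ in range(5000):          # power iteration as a stand-in for solving pi p = pi
    pi = step(pi, p)

dist = [1.0, 0.0, 0.0]         # start at state 0
distances = []
for n in range(1, 11):
    dist = step(dist, p)
    distances.append(total_variation(dist, pi))
print(distances)
```

The printed distances shrink geometrically here; the theorem itself gives convergence without a rate.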
7.5. Convergence Theorems
In this section we will apply the theory of Harris chains developed in the
previous section to diffusion processes. To be able to use that theory we will suppose
throughout this section that
(5.1) Assumption. MP$(b,a)$ is well posed and the coefficients $a$ and $b$ satisfy
(HC) and (UE) locally.
(5.2) Theorem. If $A = B = \{x : |x| \le r\}$ and $\rho$ is Lebesgue measure on $B$
normalized to be a probability measure then $X_n$ is an aperiodic Harris chain.
Proof By (3.8) there is a lower semicontinuous function $p_t(x,y) > 0$ so that
$$P_x(X_t \in A) = \int_A p_t(x,y)\,dy$$
From this we see that $P_x(R = 1) > 0$ for each $x$ so (i) holds. Lower
semicontinuity implies $p_1(x,y) \ge \epsilon > 0$ when $x,y \in A$ so (ii) holds and we have
$\bar p(\alpha,\alpha) > 0$. $\Box$
To finish checking the hypotheses of the convergence theorem (4.10), we
have to show that a stationary distribution $\pi$ exists. Our approach will be to
show that $E_\alpha R < \infty$, so the construction in (4.5) produces a stationary measure
with finite total mass. To check $E_\alpha R < \infty$ we will use a slight generalization
of Lemma 5.1 of Khasminskii (1960).
(5.3) Theorem. Suppose $u \ge 0$ is $C^2$ and has $Lu \le -1$ for all $x \notin K$ where $K$
is a compact set. Let $T_K = \inf\{t \ge 0 : X_t \in K\}$. Then $u(x) \ge E_x T_K$.
Proof Ito's formula implies that $V_t = u(X_t) + t$ is a local supermartingale on
$[0, T_K)$. Letting $T_n \uparrow T_K$ be a sequence of times that reduce $V_t$, we have
$$u(x) \ge E_x u(X_{n \wedge T_n}) + E_x(n \wedge T_n) \ge E_x(n \wedge T_n)$$
Letting $n \to \infty$ and using the monotone convergence theorem the desired result
follows. $\Box$
Taking $u(x) = |x|^2/\epsilon$ gives the following
(5.4) Corollary. Suppose $\mathrm{tr}\,a(x) + 2x \cdot b(x) \le -\epsilon$ for $x \in K^c = \{x : |x| > r\}$
then $E_x T_K \le |x|^2/\epsilon$.
Though (5.4) was easy to prove, it is remarkably sharp.
Example 5.1. Consider $d = 1$. Suppose $a(x) = 1$ and $b(x) = -c/x$ for $|x| \ge 1$
where $c > 0$ and set $K = [-1,1]$. If $c > 1/2$ then (5.4) holds with $\epsilon = 2c - 1$. To
compute $E_x T_K$, suppose $c \ne 1/2$, let $\epsilon = 2c - 1$, and let $u(x) = x^2/\epsilon$. $Lu = -1$
in $(1,b)$ so $u(X_t) + t$ is a local martingale until time $\tau_{(1,b)} = \inf\{t : X_t \notin (1,b)\}$.
Using the optional stopping theorem at time $\tau_{(1,b)} \wedge t$ we have
$$\frac{x^2}{\epsilon} = \frac{E_x X^2_{\tau_{(1,b)} \wedge t}}{\epsilon} + E_x(\tau_{(1,b)} \wedge t)$$
Rearranging, then letting $t \to \infty$ and using the monotone and bounded
convergence theorems, we have
$$E_x \tau_{(1,b)} = \frac{x^2}{\epsilon} - \frac{E_x X^2_{\tau_{(1,b)}}}{\epsilon}$$
To evaluate the right-hand side, we need the natural scale and in doing this it is
convenient to start from 1 rather than 0:
$$\varphi(x) = \int_1^x \exp\left(\int_1^y \frac{2c}{z}\,dz\right) dy
= C \int_1^x y^{2c}\,dy = C'(x^{2c+1} - 1)$$
Using (3.2) in Chapter 6 with $a = 1$ it follows that
$$E_x \tau_{(1,b)} = \frac{x^2}{\epsilon} - \frac{b^{2c+1} - x^{2c+1}}{b^{2c+1} - 1} \cdot \frac{1}{\epsilon} - \frac{x^{2c+1} - 1}{b^{2c+1} - 1} \cdot \frac{b^2}{\epsilon}$$
The second term on the right always converges to $-1/\epsilon$ as $b \to \infty$. The third
term on the right converges to 0 when $c > 1/2$ and to $\infty$ when $c < 1/2$ (recall
that $\epsilon < 0$ in this case).
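The closed form for the expected exit time in Example 5.1 can be sanity-checked numerically: it should vanish at both endpoints of $(1,b)$ and satisfy $\frac12 u'' - (c/x)u' = -1$ in between. The sketch below does this with central finite differences; the function names `exit_time` and `generator`, the step size `h`, and the sample parameter values are choices made for this illustration only.

```python
def exit_time(x, b, c):
    """Closed form for E_x tau_(1,b) from Example 5.1 (requires c != 1/2)."""
    eps = 2 * c - 1
    k = 2 * c + 1
    hit_one = (b**k - x**k) / (b**k - 1)   # probability of leaving (1, b) at 1
    hit_b = (x**k - 1) / (b**k - 1)        # probability of leaving (1, b) at b
    return x * x / eps - hit_one / eps - hit_b * b * b / eps

def generator(u, x, c, h=1e-3):
    """Finite-difference value of Lu = u''/2 - (c/x) u' at x."""
    d1 = (u(x + h) - u(x - h)) / (2 * h)
    d2 = (u(x + h) - 2 * u(x) + u(x - h)) / (h * h)
    return 0.5 * d2 - (c / x) * d1

c, b = 1.0, 5.0
u = lambda x: exit_time(x, b, c)
print(u(1.0), u(b))                        # boundary values
print(generator(u, 2.0, c))                # should be close to -1
print(exit_time(2.0, 1e6, c))              # close to (x^2 - 1)/eps = 3 when c > 1/2
```

With $c = 1$ the limit as $b \to \infty$ is $(x^2 - 1)/\epsilon$, which sits just below the bound $x^2/\epsilon$ from (5.4).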
Exercise 5.1. Use (4.8) in Chapter 6 to show that $E_x T_K = \infty$ when $c \le 1/2$.
Exercise 5.2. Consider $d = 1$. Suppose $a(x) = 1$ and $b(x) \le -\epsilon/x^\delta$ for $x \ge 1$
where $\epsilon > 0$ and $0 < \delta < 1$. Let $T_1 = \inf\{t : X_t = 1\}$ and show that for $x \ge 1$,
$E_x T_1 \le (x^{1+\delta} - 1)/\epsilon$.
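Example 5.2. When $d > 1$ and $a(x) = I$ the condition in (5.4) becomes
$$x \cdot b(x) \le -(d + \epsilon)/2$$
To see this is sharp, let $Y_t = |X_t|$, and use Ito's formula with $f(x) = |x|$ which
has
$$D_i f = x_i/|x| \qquad D_{ii} f = 1/|x| - x_i^2/|x|^3$$
to get
$$Y_t - Y_0 = \int_0^t \frac{X_s}{|X_s|} \cdot b(X_s)\,ds + \int_0^t \frac{X_s}{|X_s|} \cdot dW_s + \frac12 \int_0^t \frac{d-1}{|X_s|}\,ds$$
We have written $W$ here for a $d$ dimensional Brownian motion, so that we can
let
$$B_t = \int_0^t \frac{X_s}{|X_s|} \cdot dW_s$$
be a one dimensional Brownian motion $B$ (it is a local martingale with $\langle B \rangle_t = t$)
and write
$$dY_t = \frac{1}{|X_t|}\left(X_t \cdot b(X_t) + \frac{d-1}{2}\right) dt + dB_t$$
If $x \cdot b(x) = -(d - 1 + 2c)/2$ this reduces to the previous example, since the drift is then $-c/Y_t$ and $\epsilon = 2c - 1$, and shows that
the condition $x \cdot b(x) \le -(d + \epsilon)/2$ is sharp.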
The last detail is to make the transition from $E_x T_K < \infty$ to $E_\alpha R < \infty$.
(5.5) Theorem. Let $K = \{x : |x| \le r\}$ and $H = \{x : |x| \le r + 1\}$. If
$\sup_{x \in H} E_x T_K < \infty$ then $E_\alpha R < \infty$.
Proof Let $U_0 = 0$, and for $m \ge 1$ let $V_m = \inf\{t \ge U_{m-1} : X_t \in K\}$,
$U_m = \inf\{t > V_m : X_t \notin H \text{ or } t \in \mathbf{Z}\}$, and $M = \inf\{m \ge 1 : U_m \in \mathbf{Z}\}$. Since
$X(U_M) \in A$, $R \le U_M$. To estimate $E_\alpha R$, we note that $X(U_{m-1}) \in A$ so
$$E(V_m - U_{m-1} \mid \mathcal{F}_{U_{m-1}}) \le C_0 = \sup_{x \in A} E_x T_K$$
To estimate $M$, we observe that if $\tau_H$ is the exit time from $H$ then (3.6) implies
$$\inf_{x \in K} P_x(\tau_H > 1) \ge |K| \inf_{x,y \in K} p_1^{r+1}(x,y) = \epsilon_0 > 0$$
where $p_t^{r+1}(x,y)$ is the transition probability for the process killed when it leaves
the ball of radius $r+1$. From this it follows that $P(M > m) \le (1 - \epsilon_0)^m$. Since
$U_m - V_m \le 1$ we have $EU_M \le (1 + C_0)/\epsilon_0$ and the proof is complete. $\Box$
8 Weak Convergence
8.1. In Metric Spaces
We begin with a treatment of weak convergence on a general space $S$ with a
metric $\rho$, i.e., a function with (i) $\rho(x,x) = 0$, (ii) $\rho(x,y) = \rho(y,x)$, and (iii)
$\rho(x,y) + \rho(y,z) \ge \rho(x,z)$. Open balls are defined by $\{y : \rho(x,y) < r\}$ with $r > 0$
and the Borel sets $\mathcal{S}$ are the $\sigma$-field generated by the open balls. Throughout
this section we will suppose $S$ is a metric space and $\mathcal{S}$ is the collection of Borel
sets, even though several results are true in much greater generality. For later
use, recall that $S$ is said to be separable if there is a countable dense set, and
complete if every Cauchy sequence converges.
A sequence of probability measures $\mu_n$ on $(S,\mathcal{S})$ is said to converge
weakly if for each bounded continuous function $f$, $\int f(x)\,\mu_n(dx) \to \int f(x)\,\mu(dx)$.
In this case we write $\mu_n \Rightarrow \mu$. In many situations it will be convenient to deal
directly with random variables $X_n$ rather than with the associated
distributions $\mu_n(A) = P(X_n \in A)$. We say that $X_n$ converges weakly to $X$ and
write $X_n \Rightarrow X$ if for any bounded continuous function $f$, $Ef(X_n) \to Ef(X)$.
Our first result, sometimes called the Portmanteau Theorem, gives five
equivalent definitions of weak convergence.
(1.1) Theorem. The following statements are equivalent.
(i) $Ef(X_n) \to Ef(X)$ for any bounded continuous function $f$.
(ii) For all closed sets $K$, $\limsup_{n\to\infty} P(X_n \in K) \le P(X \in K)$.
(iii) For all open sets $G$, $\liminf_{n\to\infty} P(X_n \in G) \ge P(X \in G)$.
(iv) For all sets $A$ with $P(X \in \partial A) = 0$, $\lim_{n\to\infty} P(X_n \in A) = P(X \in A)$.
(v) Let $D_f$ = the set of discontinuities of $f$. For all bounded functions $f$ with
$P(X \in D_f) = 0$, $Ef(X_n) \to Ef(X)$.
Remark. To help remember (ii) and (iii) think about what can happen when
$P(X_n = x_n) = 1$, $x_n \to x$, and $x$ lies on the boundary of the set. If $K$ is closed
we can have $x_n \notin K$ for all $n$ but $x \in K$. If $G$ is open we can have $x_n \in G$ for
all $n$ but $x \notin G$.
Proof We will go around the loop in roughly the order indicated. The last
step, (v) implies (i), is trivial so we have four things to show.
(i) implies (ii) Let $\rho(x,K) = \inf\{\rho(x,y) : y \in K\}$, $\psi_j(r) = (1 - jr)^+$ where
$z^+ = \max\{z, 0\}$ is the positive part, and let $f_j(x) = \psi_j(\rho(x,K))$. $f_j$ is bounded,
continuous, and $\ge 1_K$ so
$$\limsup_{n\to\infty} P(X_n \in K) \le \lim_{n\to\infty} Ef_j(X_n) = Ef_j(X)$$
Now $f_j \downarrow 1_K$ as $j \uparrow \infty$, so letting $j \to \infty$ gives (ii).
(ii) is equivalent to (iii) This follows immediately from (a) $P(A) + P(A^c) =
1$, and (b) $A$ is open if and only if $A^c$ is closed.
(ii) and (iii) imply (iv) Let $K = \bar A$ = the closure of $A$, $G = A^o$ = the
interior of $A$, and note that $\partial A = K - G$, so our assumptions imply
$$P(X \in K) = P(X \in A) = P(X \in G)$$
Using (ii) and (iii) now with the last equality we have
$$\limsup_{n\to\infty} P(X_n \in A) \le \limsup_{n\to\infty} P(X_n \in K) \le P(X \in K) = P(X \in A)$$
$$\liminf_{n\to\infty} P(X_n \in A) \ge \liminf_{n\to\infty} P(X_n \in G) \ge P(X \in G) = P(X \in A)$$
At this point we have shown
$$\liminf_{n\to\infty} P(X_n \in A) \ge P(X \in A) \ge \limsup_{n\to\infty} P(X_n \in A)$$
which with the triviality $\liminf \le \limsup$ gives the desired result.
(iv) implies (v) Suppose $|f(x)| \le K$ and pick $\alpha_0 < \alpha_1 < \ldots < \alpha_\ell$ with
$\alpha_0 < -K$ and $\alpha_\ell > K$ so that $P(f(X) = \alpha_i) = 0$ for $0 \le i \le \ell$ and $\alpha_j - \alpha_{j-1} < \epsilon$
for $1 \le j \le \ell$. This is always possible since $\{\alpha : P(f(X) = \alpha) > 0\}$ is a countable
set. Let $A_i = \{x : \alpha_{i-1} < f(x) \le \alpha_i\}$. Then $\partial A_i \subset \{x : f(x) \in \{\alpha_{i-1}, \alpha_i\}\} \cup D_f$, so
$P(X \in \partial A_i) = 0$, and it follows from (iv) that
$$\sum_{i=1}^\ell \alpha_i P(X_n \in A_i) \to \sum_{i=1}^\ell \alpha_i P(X \in A_i)$$
The definition of the $\alpha_i$ implies
$$0 \le \sum_{i=1}^\ell \alpha_i P(X_n \in A_i) - Ef(X_n) \le \epsilon$$
and this inequality holds with $X$ in place of $X_n$. Combining our conclusions we
have
$$\limsup_{n\to\infty} |Ef(X_n) - Ef(X)| \le 2\epsilon$$
but $\epsilon$ is arbitrary so the proof is complete. $\Box$
An important consequence of (1.1) is
(1.2) Continuous mapping theorem. Suppose $\mu_n$ on $(S,\mathcal{S})$ converge weakly
to $\mu$, and let $\varphi : S \to S'$ have a discontinuity set $D_\varphi$ with $\mu(D_\varphi) = 0$. Then
$\mu_n \circ \varphi^{-1} \Rightarrow \mu \circ \varphi^{-1}$.
Proof Let $f : S' \to \mathbf{R}$ be bounded and continuous. Since $D_{f \circ \varphi} \subset D_\varphi$, it
follows from (v) in (1.1) that
$$\int \mu_n(dx)\,f(\varphi(x)) \to \int \mu(dx)\,f(\varphi(x))$$
Changing variables gives
$$\int \mu_n \circ \varphi^{-1}(dy)\,f(y) \to \int \mu \circ \varphi^{-1}(dy)\,f(y)$$
Since this holds for any bounded continuous $f$, the desired result follows. $\Box$
In checking convergence in distribution the following lemma is sometimes
useful.
(1.3) Converging together lemma. Suppose $X_n \Rightarrow X$ and $\rho(X_n,Y_n) \to 0$
in probability then $Y_n \Rightarrow X$.
Proof Let $K$ be a closed set and $K_\delta = \{y : \rho(y,K) \le \delta\}$, where $\rho(y,K) =
\inf\{\rho(y,z) : z \in K\}$. It is easy to see that $y \notin K_\delta$ if and only if there is a $\gamma > \delta$
so that $B(y,\gamma) \cap K = \emptyset$, so $(K_\delta)^c$ is open and $K_\delta$ is closed. With this little
detail out of the way the rest is easy.
$$P(Y_n \in K) \le P(X_n \in K_\delta) + P(\rho(X_n,Y_n) > \delta)$$
Letting $n \to \infty$, noticing the second term on the right goes to 0 by assumption,
and using (ii) in (1.1) we have
$$\limsup_{n\to\infty} P(Y_n \in K) \le \limsup_{n\to\infty} P(X_n \in K_\delta) \le P(X \in K_\delta)$$
Measure theory tells us that $P(X \in K_\delta) \downarrow P(X \in K)$ as $\delta \downarrow 0$ so we have shown
$$\limsup_{n\to\infty} P(Y_n \in K) \le P(X \in K)$$
The desired conclusion now follows from (1.1). $\Box$
The next result, due to Skorokhod, is useful for proving results about
a sequence that converges weakly, because it converts weak convergence into
almost sure convergence.
(1.4) Skorokhod's representation theorem. Suppose $S$ is a complete
separable metric space. If $\mu_n \Rightarrow \mu_\infty$ then we can define random variables $Y_n$ on
$[0,1]$ (equipped with the Borel sets and Lebesgue measure $\lambda$) with distribution
$\mu_n$ so that $Y_n \to Y_\infty$ almost surely.
Remark. When $S = \mathbf{R}$ this is easy. Let $F_n(x) = \mu_n(-\infty, x]$ be the distribution
function, let
$$F_n^{-1}(y) = \inf\{x : F_n(x) \ge y\}$$
let $U(\omega) = \omega$ for $\omega \in [0,1]$ be a random variable that is uniform on $(0,1)$,
and let $Y_n = F_n^{-1}(U)$. Then $Y_n \to Y_\infty$ almost surely. (For a proof, see (2.1) of
Chapter 2 of Durrett (1995).) We invite the reader to contemplate the question:
"Is there an easy way to do this when $S = \mathbf{R}^2$ (or even $S = [0,1]^2$)?" while we
give the general proof. The details are somewhat messy and not important for
the developments that follow, so if the reader finds herself asking "Who cares?",
she can skip to the beginning of the next section.
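The one dimensional construction in the Remark is easy to try out. In the sketch below the measures $\mu_n$ are assumed, purely for illustration, to be exponential with rate $1 + 1/n$, so that $\mu_n \Rightarrow \mu_\infty$ = exponential(1) and the inverse distribution functions are available in closed form; the names `quantile`, `Y` and `Y_inf` are hypothetical.

```python
import math

def quantile(u, rate):
    """Inverse of F(x) = 1 - exp(-rate * x) for 0 < u < 1."""
    return -math.log(1.0 - u) / rate

def Y(n, omega):
    """Y_n = F_n^{-1}(U) with U(omega) = omega and mu_n = exponential(1 + 1/n)."""
    return quantile(omega, 1.0 + 1.0 / n)

def Y_inf(omega):
    """The limit variable, built from the same omega."""
    return quantile(omega, 1.0)

for omega in (0.1, 0.5, 0.9):
    gaps = [abs(Y(n, omega) - Y_inf(omega)) for n in (1, 10, 100, 1000)]
    print(omega, gaps)
```

All the $Y_n$ are functions of the same point $\omega$, which is what turns convergence of distributions into pointwise convergence.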
Proof We will warm up for the real proof by treating the special case $\mu_n = \mu$.
(1.5) Theorem. For each probability measure $\mu$ there is a random variable $Y$
defined on $[0,1]$ with $P(Y \in A) = \mu(A)$.
Proof For each $k$ construct a decomposition $\mathcal{A}_k = \{A_{k,1}, A_{k,2}, \ldots\}$ of $S$ into
disjoint sets of diameter less than $1/k$ and so that $\mathcal{A}_k$ refines $\mathcal{A}_{k-1}$, i.e., each
set $A_{k-1,j}$ is a union of sets in $\mathcal{A}_k$. For each $k$ construct a corresponding
decomposition $\mathcal{I}_k$ of the unit interval so that $\lambda(I_{k,j}) = \mu(A_{k,j})$ and arrange
things so that $A_{k-1,i} \supset A_{k,j}$ if and only if $I_{k-1,i} \supset I_{k,j}$.
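Let $x_{k,j}$ be some point in $A_{k,j}$ and let $X_k(\omega) = x_{k,j}$ for $\omega \in I_{k,j}$. Since
$X_k(\omega), X_{k+1}(\omega), \ldots$ is contained in some element of $\mathcal{A}_k$, which by definition has
diameter $< 1/k$, it is a Cauchy sequence. So $X(\omega) = \lim_k X_k(\omega)$ exists and satisfies
$\rho(X_k, X) \le 1/k$. To show that $X$ has distribution $\mu$ we let $K$ be a closed set,
let $S(k,K) = \{j : A_{k,j} \cap K \ne \emptyset\}$, and let $K_\delta = \{x : \rho(x,K) \le \delta\}$ as in the
proof of (1.3).
$$\lambda(X_k \in K) \le \lambda(X_k(\omega) \in \cup_{j \in S(k,K)} A_{k,j})
\le \sum_{j \in S(k,K)} \lambda(X_k(\omega) \in A_{k,j})$$
$$= \sum_{j \in S(k,K)} \lambda(I_{k,j}) = \sum_{j \in S(k,K)} \mu(A_{k,j}) \le \mu(K_{1/k})$$
Letting $k \to \infty$ and noting $K_{1/k} \downarrow K$ it follows that
$$\limsup_k \lambda(X_k \in K) \le \mu(K)$$
So (1.1) implies that $X_k \Rightarrow \mu$ and it follows that $X$ has distribution $\mu$. $\Box$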
Proof of (1.4) Construct a decomposition $\mathcal{A}_k$ of the space $S$ as in the
preceding proof but this time require that each $A_{k,j}$ has $\mu_\infty(\partial A_{k,j}) = 0$. For
each $n$ construct decompositions $\mathcal{I}_k^n$ of $[0,1]$ so that $\lambda(I_{k,j}^n) = \mu_n(A_{k,j})$. To
arrange the intervals in an appropriate way we introduce an ordering in which
$[a,b] < [c,d]$ if $b \le c$ and demand that $I_{k,i}^n < I_{k,j}^n$ if and only if $I_{k,i}^\infty < I_{k,j}^\infty$. Let
$x_{k,j} \in A_{k,j}$ and define $X_k^n(\omega) = x_{k,j}$ for $\omega \in I_{k,j}^n$. As before, $X^n = \lim_{k\to\infty} X_k^n$
exists and has distribution $\mu_n$.
Since $\sum_j \mu_\infty(A_{k,j}) - \mu_n(A_{k,j}) = 1 - 1 = 0$, we have
$$\sum_j |\lambda(I_{k,j}^\infty) - \lambda(I_{k,j}^n)| = \sum_j |\mu_\infty(A_{k,j}) - \mu_n(A_{k,j})|
= 2 \sum_j (\mu_\infty(A_{k,j}) - \mu_n(A_{k,j}))^+$$
where $y^+ = \max\{y, 0\}$ is the positive part of $y$. Since the $\mu_\infty(\partial A_{k,j}) = 0$, (1.1)
implies $\mu_n(A_{k,j}) \to \mu_\infty(A_{k,j})$ and the dominated convergence theorem implies
$$(1.6) \qquad \sum_j |\lambda(I_{k,j}^\infty) - \lambda(I_{k,j}^n)| \to 0 \quad \text{as } n \to \infty$$
Fix $k$ and $j_0$, let $\alpha_n$ be the left endpoint of $I_{k,j_0}^n$ and let $S(j_0) = \{j : I_{k,j}^n < I_{k,j_0}^n\}$
which by our construction does not depend on $n$. (1.6) implies that
$$\alpha_\infty = \sum_{j \in S(j_0)} |I_{k,j}^\infty| = \lim_{n\to\infty} \sum_{j \in S(j_0)} |I_{k,j}^n| = \lim_{n\to\infty} \alpha_n$$
Similarly if we let $\beta_n$ be the right endpoint of $I_{k,j_0}^n$ then $\beta_n \to \beta_\infty$ as $n \to \infty$.
Hence if $\omega$ is in the interior of $I_{k,j_0}^\infty$ it lies in the interior of $I_{k,j_0}^n$ for large $n$, and
it follows from the definition of the $X^n$ that
$$\limsup_{n\to\infty} \rho(X^n, X^\infty) \le 2/k$$
Letting $k \to \infty$ we see that if $\omega$ is not one of the countable set of points that is
the end point of some $I_{k,j}^\infty$ we have $X^n \to X^\infty$, which proves (1.4). $\Box$
8.2. Prokhorov's Theorems
Let $\Pi$ be a family of probability measures on a metric space $(S,\rho)$. We call $\Pi$
relatively compact if every sequence $\mu_n$ of elements of $\Pi$ contains a
subsequence $\mu_{n_k}$ that converges weakly to a limit $\mu$, which may be $\notin \Pi$. We call $\Pi$
tight if for each $\epsilon > 0$ there is a compact set $K$ so that $\mu(K) \ge 1 - \epsilon$ for all
$\mu \in \Pi$. This section is devoted to the proofs of
(2.1) Theorem. If $\Pi$ is tight then it is relatively compact.
(2.2) Theorem. Suppose $S$ is complete and separable. If $\Pi$ is relatively
compact, it is tight.
The first result is the one we want for applications, but the second one is
comforting since it says that, in nice spaces, the two notions are equivalent.
Proof of (2.1) We shall prove the result successively for $\mathbf{R}^d$, $\mathbf{R}^\infty$, a countable
union of compact sets, and finally, a general S. This approach is somewhat
lengthy but to inspire the reader for the effort, we note that the first two special
cases are important examples, and then the last two steps and the converse are
remarkably easy.
Before we begin we will prove a lemma that will be useful for the first two
examples.
(2.3) Lemma. Let $\mathcal{A}$ be a subclass of $\mathcal{S}$ that is closed under intersection, and
so that each open set is a countable union of elements of $\mathcal{A}$. If $\mu_n(A) \to \mu(A)$
for all $A \in \mathcal{A}$ then $\mu_n \Rightarrow \mu$.
Proof The inclusion-exclusion formula says
$$P(\cup_{i=1}^m A_i) = \sum_{i=1}^m P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \ldots + (-1)^{m+1} P(\cap_{i=1}^m A_i)$$
Combining this with the fact that $\mathcal{A}$ is closed under intersection it is easy to see
that $\mu_n(\cup_{i=1}^k A_i) \to \mu(\cup_{i=1}^k A_i)$. Given an open set $G$ and an $\epsilon > 0$, choose sets
$A_1, \ldots, A_k$ so that $\mu(\cup_{i=1}^k A_i) > \mu(G) - \epsilon$. Using the convergence just proved it
follows that
$$\mu(G) - \epsilon < \mu(\cup_{i=1}^k A_i) = \lim_{n\to\infty} \mu_n(\cup_{i=1}^k A_i) \le \liminf_{n\to\infty} \mu_n(G)$$
Since $\epsilon$ is arbitrary, (iii) in (1.1) follows and the proof is complete. $\Box$
Proof for $\mathbf{R}^d$ This is a fairly straightforward generalization of the argument
for $d = 1$. (See e.g., (2.5) in Chapter 2 of Durrett (1995).) In this case we can
rely on the distribution functions $F(x) = \mu(\{y \le x\})$, where for two vectors $y \le
x$ means $y_i \le x_i$ for each $i$. Let $\mathbf{Q}^d$ be the points in $\mathbf{R}^d$ with rational coordinates.
By enumerating the points in $\mathbf{Q}^d$ and then using a diagonal argument, we can
find a subsequence $F_{n_k}$ so that $F_{n_k}(q)$ converges for each $q \in \mathbf{Q}^d$. Call the limit
$G(q)$ and define the proposed limiting distribution function by
$$F(x) = \inf\{G(q) : q \in \mathbf{Q}^d \text{ and } q > x\}$$
where $q > x$ means $q_i > x_i$ for each $i$.
To check that $F$ is a distribution function we note that it is an immediate
consequence of the definition that
(i) if $x \le y$ then $F(x) \le F(y)$
(ii) $\lim_{y \downarrow x} F(y) = F(x)$
where $y \downarrow x$ means $y_i \downarrow x_i$ for each $i$. The third condition for being a distribution
function is easier to prove than it is to state:
(iii) let $A = (a_1, b_1] \times \cdots \times (a_d, b_d]$ be a rectangle with vertices $V = \{a_1, b_1\} \times
\cdots \times \{a_d, b_d\}$, and let $\mathrm{sgn}(v) = (-1)^{n(a,v)}$ where $n(a,v)$ is the number of $a$'s
that appear in $v$. The inclusion-exclusion formula implies that the probability
of $A$ is $\sum_{v \in V} \mathrm{sgn}(v) F(v) \ge 0$.
The fact that (iii) holds for the $F_n$'s implies that it holds for $G$ and by taking
limits we see that it holds for $F$.
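Condition (iii) is the usual alternating sum of the distribution function over the $2^d$ vertices of the rectangle. The following sketch evaluates that sum; the helper names `rectangle_probability` and `F_uniform`, and the choice of the uniform distribution on $[0,1]^d$ as a test case, are assumptions made for the illustration.

```python
from itertools import product

def rectangle_probability(F, a, b):
    """Sum of sgn(v) F(v) over the vertices v of (a_1, b_1] x ... x (a_d, b_d]."""
    total = 0.0
    for choice in product((0, 1), repeat=len(a)):
        # choice[i] == 0 picks a_i, choice[i] == 1 picks b_i for coordinate i.
        v = [b[i] if c else a[i] for i, c in enumerate(choice)]
        n_a = choice.count(0)              # number of a's appearing in v
        total += (-1) ** n_a * F(v)
    return total

def F_uniform(x):
    """Distribution function of the uniform distribution on the unit cube."""
    value = 1.0
    for xi in x:
        value *= min(max(xi, 0.0), 1.0)
    return value

a = (0.1, 0.2, 0.3)
b = (0.5, 0.6, 0.9)
print(rectangle_probability(F_uniform, a, b))   # volume 0.4 * 0.4 * 0.6
```

For $d = 1$ the sum reduces to $F(b_1) - F(a_1)$, and for $d = 2$ to $F(b_1,b_2) - F(a_1,b_2) - F(b_1,a_2) + F(a_1,a_2)$.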
Conditions (i)-(iii) imply that $F$ is the distribution function of a measure
on $\mathbf{R}^d$, which in our case will always have total mass $\le 1$. Define the $i$th
marginal distribution function of $F$ by $F^i(x) = \lim F(y)$ where the limit is
taken through vectors $y$ with $y_i = x$ and the other coordinates $y_j \to \infty$. $F^i$ is
a one dimensional distribution function and hence its discontinuities $D^i$ are a
countable set.
Call $x$ a good point if $x_i \notin D^i$ for each $i$. We claim that $F$ is continuous at
good points. To see this, let $w < x < y$, let $z_i$ be the vector which uses the first
$i$ components from $y$ and the last $d - i$ from $w$ and note that the monotonicity
properties that come from the interpretation of $F$ as a measure imply
$$F(y) - F(w) = \sum_{i=1}^d F(z_i) - F(z_{i-1}) \le \sum_{i=1}^d F^i(y_i) - F^i(w_i)$$
The continuity of $F$ now follows from the monotonicity and the continuity of
the $F^i$.
Our next claim is that for good $x$, $F_{n_k}(x) \to F(x)$. To prove this pick
$p, q, r \in \mathbf{Q}^d$ with $p < q < x < r$ and
$$F(x) - \epsilon < F(p) \le F(q) \le F(x) \le F(r) < F(x) + \epsilon$$
Since $u < v$ implies $F(u) \le G(v) \le F(v)$ it follows that
$$F(x) - \epsilon < G(q) \le F(x) \le G(r) < F(x) + \epsilon$$
Using $F_n(q) \le F_n(x) \le F_n(r)$ and the convergence $F_{n_k}(q) \to G(q)$ for $q \in \mathbf{Q}^d$
the claim follows easily.
Up to this point we have not used the tightness assumption. It enters into
the proof only to guarantee that $F$ is the distribution function of a probability
measure. Given $\epsilon$ choose a compact set $K$ so that $F_n(K) \ge 1 - \epsilon$ for all $n$,
where $F_n(K)$ is short for the probability of $K$ under the measure associated
with the distribution function $F_n$. Now pick a good rectangle $A$, i.e., one with
$a_i, b_i \notin D^i$ for all $i$, so that $K \subset A$. The formula in (iii) for the probability of
$A$ and the previous claim about convergence at good points imply
$$F(A) = \lim_{k\to\infty} F_{n_k}(A) \ge \liminf_{n\to\infty} F_n(K) \ge 1 - \epsilon$$
Finally, we have to prove that $F_{n_k} \Rightarrow F$. For this we use the following
result which is of interest in its own right.
(2.4) Theorem. In $\mathbf{R}^d$ probability measures $\mu_n \Rightarrow \mu$ if and only if the associated
distribution functions have $F_n(x) \to F(x)$ for all $x$ that are good for $F$.
Proof If $x$ is good for $F$, then $A = \{y : y \le x\}$ has $\mu(\partial A) = 0$, so if
$\mu_n \Rightarrow \mu$ then (iv) of (1.1) implies $F_n(x) \to F(x)$. To prove the converse we
will apply (2.3) to the collection $\mathcal{A}$ of finite disjoint unions of good rectangles
(i.e., rectangles, all of whose vertices are good). We have already observed
that $\mu_n(A) \to \mu(A)$ for all $A \in \mathcal{A}$. To complete the proof we note that $\mathcal{A}$ is
closed under intersection, and any open set is a countable disjoint union of good
rectangles. $\Box$
Proof for $\mathbf{R}^\infty$ Let $\mathbf{R}^\infty$ be the collection of all sequences $(x_1, x_2, \ldots)$ of real
numbers. To define a metric on $\mathbf{R}^\infty$ we introduce the bounded metric $\rho_0(x, y) =
|x - y|/(1 + |x - y|)$ on $\mathbf{R}$ and let
$$\rho(x, y) = \sum_{i=1}^\infty 2^{-i} \rho_0(x_i, y_i)$$
It is easy to see that the induced topology is that of coordinatewise convergence
(i.e., $x^n \to x$ if and only if $x^n_i \to x_i$ for each $i$) and hence this metric makes
$\mathbf{R}^\infty$ complete and separable. A countable dense set is the collection of points
with finitely many nonzero coordinates, all of which are rational numbers.
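The metric can be explored numerically. The following is a minimal sketch, assuming the weighted form $\rho(x,y) = \sum_i 2^{-i}\rho_0(x_i,y_i)$ and truncating each sequence to finitely many coordinates; the function names `rho0` and `rho` are illustrative only.

```python
def rho0(a, b):
    """Bounded metric on R: |a - b| / (1 + |a - b|)."""
    d = abs(a - b)
    return d / (1.0 + d)


def rho(x, y):
    """Truncated version of sum_{i>=1} 2^{-i} rho0(x_i, y_i).

    x and y are finite lists holding the first coordinates of two
    sequences; coordinates beyond the lists are treated as equal.
    """
    return sum(rho0(a, b) / 2.0 ** i
               for i, (a, b) in enumerate(zip(x, y), start=1))


zero = [0.0] * 20
far_tail = [0.0] * 19 + [1000.0]   # huge difference, but only in coordinate 20
near = [1e-3] * 20                 # small difference in every coordinate

print(rho(zero, far_tail))
print(rho(zero, near))
```

A large discrepancy in a far-out coordinate contributes very little, which is why convergence in $\rho$ amounts to coordinatewise convergence.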
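(2.5) Lemma. The Borel sets $\mathcal{S} = \mathcal{R}^\infty$, the product $\sigma$-field generated by the
finite dimensional sets $\{x : x_i \in A_i \text{ for } 1 \le i \le k\}$.
Proof To argue that $\mathcal{S} \supset \mathcal{R}^\infty$ observe that if the $G_i$ are open then $\{x : x_i \in
G_i \text{ for } 1 \le i \le k\}$ is open. For the other inclusion note that
$$\{y : \rho(x,y) \le \delta\} = \bigcap_{n=1}^\infty \Big\{y : \sum_{m=1}^n 2^{-m}\rho_0(x_m,y_m) \le \delta\Big\} \in \mathcal{R}^\infty$$
so $\{y : \rho(x, y) < \gamma\} = \bigcup_{n=1}^\infty \{y : \rho(x, y) \le \gamma - 1/n\} \in \mathcal{R}^\infty$. □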
Let $\pi_d : \mathbf{R}^\infty \to \mathbf{R}^d$ be the projection $\pi_d(x) = (x_1, \ldots, x_d)$. Our first claim
is that if $\Pi$ is a tight family on $(\mathbf{R}^\infty, \mathcal{R}^\infty)$ then $\{\mu \circ \pi_d^{-1} : \mu \in \Pi\}$ is a tight
family on $(\mathbf{R}^d, \mathcal{R}^d)$. This follows from the following general result.
(2.6) Lemma. If $\Pi$ is a tight family on $(S, \mathcal{S})$ and if $h$ is a continuous map
from $S$ to $S'$ then $\{\mu \circ h^{-1} : \mu \in \Pi\}$ is a tight family on $(S', \mathcal{S}')$.
Proof Given $\epsilon$, choose in $S$ a compact set $K$ so that $\mu(K) \ge 1 - \epsilon$ for all $\mu \in \Pi$.
If $K' = h(K)$ then $K'$ is compact and $h^{-1}(K') \supset K$ so $\mu \circ h^{-1}(K') \ge 1 - \epsilon$ for
all $\mu \in \Pi$. □
Using (2.6), the result for $\mathbf{R}^d$, and a diagonal argument we can, for a given
sequence $\mu_n$, pick a subsequence $\mu_{n_k}$ so that for all $d$, $\mu_{n_k} \circ \pi_d^{-1}$ converges to a
limit $\nu_d$. Since the measures $\nu_d$ obviously satisfy the consistency conditions of
Kolmogorov's existence theorem, there is a probability measure $\nu$ on $(\mathbf{R}^\infty, \mathcal{R}^\infty)$
so that $\nu \circ \pi_d^{-1} = \nu_d$.
We now want to check that $\mu_{n_k} \Rightarrow \nu$. To do this we will apply (2.3)
with $\mathcal{A}$ = the finite dimensional sets $A$ with $\nu(\partial A) = 0$. Convergence of finite
dimensional distributions and (1.1) imply that $\mu_{n_k}(A) \to \nu(A)$ for all $A \in \mathcal{A}$.
The intersection of two finite dimensional sets is a finite dimensional set and
$\partial(A \cap B) \subset \partial A \cup \partial B$ so $\mathcal{A}$ is closed under intersection.
To prove that any open set $G$ is a countable union of sets in $\mathcal{A}$, we observe
that if $y \in G$ there is a $\delta > 0$ so that $B(y, \delta) = \{x : \rho(x, y) < \delta\} \subset G$. Pick $\delta$
so large that $B(y, 2\delta) \cap G^c \ne \emptyset$. ($G = \mathbf{R}^\infty \in \mathcal{A}$ so we can suppose $G^c \ne \emptyset$.) If
we pick $N$ so that $2^{-N} < \delta/2$ then it follows from the definition of the metric
that for $r < \delta/2$ the finite dimensional set
$$A(y,r,N) = \{x : |x_i - y_i| < r \text{ for } 1 \le i \le N\} \subset B(y,\delta) \subset G$$
The boundaries of these sets are disjoint for different values of $r$ so we can pick
$r > \delta/4$ so that the boundary of $A(y, r, N)$ has $\nu$ measure 0.
As noted above $\mathbf{R}^\infty$ is separable. Let $y_1, y_2, \ldots$ be an enumeration of the
members of the countable dense set that lie in $G$ and let $A_i = A(y_i, r_i, N_i)$
be the sets chosen in the last paragraph. To prove that $\cup_i A_i = G$ suppose
$x \in G - \cup_i A_i$, let $\gamma > 0$ so that $B(x, \gamma) \subset G$, and pick a point $y_i$ so that
$\rho(x, y_i) < \gamma/9$. We claim that $x \in A_i$. To see this, note that the triangle
inequality implies $B(y_i, 8\gamma/9) \subset G$, so $\delta \ge 4\gamma/9$ and $r > \delta/4 \ge \gamma/9$, which
completes the proof of $\mu_{n_k} \Rightarrow \nu$. □
Before passing to the next case we would like to observe that we have
shown
(2.7) Theorem. In $\mathbf{R}^\infty$, weak convergence is equivalent to convergence of finite
dimensional distributions.
Proof for the $\sigma$-compact case We will prove the result in this case by
reducing it to the previous one. We start by observing
(2.8) Lemma. If $S$ is $\sigma$-compact then $S$ is separable.
Proof Since a countable union of countable sets is countable, it suffices to
prove this if $S$ is compact. To do this cover $S$ by balls of radius $1/n$, let $x^n_m$,
$m \le m_n$ be the centers of the balls of a finite subcover, and check that the $x^n_m$
are a countable dense set. □
(2.9) Lemma. If $S$ is a separable metric space then it can be embedded
homeomorphically into $\mathbf{R}^\infty$.
Proof Let $q_1, q_2, \ldots$ be a sequence of points dense in $S$ and define a mapping
from $S$ into $\mathbf{R}^\infty$ by
$$h(x) = (\rho(x,q_1), \rho(x,q_2), \ldots)$$
If the points $x_n \to x$ in $S$ then $\lim \rho(x_n, q_i) = \rho(x, q_i)$ and hence $h(x_n) \to h(x)$.
Suppose $x \ne x'$ and let $\epsilon = \rho(x, x')$. If $\rho(x, q_i) < \epsilon/2$, which must be true
for some $i$, then $\rho(x', q_i) > \epsilon/2$ or the triangle inequality would lead to the
contradiction $\epsilon < \rho(x, x')$. This shows that $h$ is 1-1.
Our final task is to show that $h^{-1}$ is continuous. If $x_n$ does not converge
to $x$ then $\limsup \rho(x_n, x) = \epsilon > 0$. If $\rho(x, q_i) < \epsilon/2$, which must be true for
some $q_i$, then $\limsup \rho(x_n, q_i) > \epsilon/2$, or again the triangle inequality would give
a contradiction, and hence $h(x_n)$ does not converge to $h(x)$. This shows that
if $h(x_n) \to h(x)$ then $x_n \to x$ so $h^{-1}$ is continuous. □
It follows from (2.6) that if $\Pi$ is tight on $S$ then $\{\mu \circ h^{-1} : \mu \in \Pi\}$ is a
tight family of measures on $\mathbf{R}^\infty$. Using the result for $\mathbf{R}^\infty$ we see that if $\mu_n$ is a
sequence of measures in $\Pi$ and we let $\nu_n = \mu_n \circ h^{-1}$ then there is a convergent
subsequence $\nu_{n_k}$. Applying the continuous mapping theorem, (1.2), now to the
function $\varphi = h^{-1}$ it follows that $\mu_{n_k} = \nu_{n_k} \circ h$ converges weakly.
The general case Whatever $S$ is, if $\Pi$ is tight, and we let $K_i$ be so that
$\mu(K_i) \ge 1 - 1/i$ for all $\mu \in \Pi$ then all the measures are supported on the
$\sigma$-compact set $S_0 = \cup_i K_i$ and this case reduces to the previous one. □
Proof of (2.2) We begin by introducing the intermediate statement:
(H) For each $\epsilon, \delta > 0$ there is a finite collection $A_1, \ldots, A_n$ of balls of radius $\delta$
so that $\mu(\cup_{i \le n} A_i) \ge 1 - \epsilon$ for all $\mu \in \Pi$.
We will show that (i) if (H) holds then $\Pi$ is tight and (ii) if (H) fails then $\Pi$ is
not relatively compact. Combining (ii) and (i) gives (2.2).
Proof of (i) Fix $\epsilon$ and choose for each $k$ finitely many balls $A^k_1, \ldots, A^k_{n_k}$ of
radius $1/k$ so that $\mu(\cup_{i \le n_k} A^k_i) \ge 1 - \epsilon/2^k$ for all $\mu \in \Pi$. Let $K$ be the closure
of $\cap_{k=1}^\infty \cup_{i \le n_k} A^k_i$. Clearly $\mu(K) \ge 1 - \epsilon$. $K$ is totally bounded since if $B^k_j$ is a
ball with the same center as $A^k_j$ and radius $2/k$ then $B^k_j$, $1 \le j \le n_k$, covers $K$.
Being a closed and totally bounded subset of a complete space, $K$ is compact
and the proof is complete. □
Proof of (ii) Suppose (H) fails for some $\epsilon$ and $\delta$. Enumerate the members
of the countable dense set $q_1, q_2, \ldots$, let $A_i = B(q_i, \delta)$, and $G_n = \cup_{i=1}^n A_i$. For
each $n$ there is a measure $\mu_n \in \Pi$ so that $\mu_n(G_n) < 1 - \epsilon$. We claim that $\mu_n$
has no convergent subsequence. To prove this, suppose $\mu_{n_k} \Rightarrow \nu$. (iii) of (1.1)
implies that for each $m$
$$\nu(G_m) \le \limsup_{k\to\infty} \mu_{n_k}(G_m) \le \limsup_{k\to\infty} \mu_{n_k}(G_{n_k}) \le 1 - \epsilon$$
However, $G_m \uparrow S$ as $m \uparrow \infty$ so we have $\nu(S) \le 1 - \epsilon$, a contradiction. □
8.3. The Space C
Let $C = C([0,1], \mathbf{R}^d)$ be the space of continuous functions from $[0,1]$ to $\mathbf{R}^d$
equipped with the norm
$$\|\omega\| = \sup_{t \le 1} |\omega(t)|$$
Let $\mathcal{B}$ be the collection of Borel subsets of $C$. Introduce the coordinate random
variables $X_t(\omega) = \omega(t)$ and define the finite dimensional sets by
$$\{\omega : X_{t_i}(\omega) \in A_i \text{ for } 1 \le i \le k\}$$
where $0 \le t_1 < t_2 < \ldots < t_k \le 1$ and $A_i \in \mathcal{R}^d$, the Borel subsets of $\mathbf{R}^d$.
(3.1) Lemma. $\mathcal{B}$ is the same as the $\sigma$-field $\mathcal{C}$ generated by the finite dimensional
sets.
Proof Observe that if $\xi$ is a given continuous function
$$\{\omega : \|\omega - \xi\| \le \epsilon - 1/n\} = \cap_q \{\omega : |\omega(q) - \xi(q)| \le \epsilon - 1/n\}$$
where the intersection is over all rationals $q \in [0,1]$. Letting $n \to \infty$ shows
$\{\omega : \|\omega - \xi\| < \epsilon\} \in \mathcal{C}$ and $\mathcal{B} \subset \mathcal{C}$. To prove the reverse inclusion observe that if
the $A_i$ are open the finite dimensional set $\{\omega : \omega(t_i) \in A_i\}$ is open so the $\pi$-$\lambda$
theorem implies $\mathcal{C} \subset \mathcal{B}$. □
Let $0 \le t_1 < t_2 < \ldots < t_n \le 1$ and $\pi_t : C \to (\mathbf{R}^d)^n$ be defined by
$$\pi_t(\omega) = (\omega(t_1), \ldots, \omega(t_n))$$
Given a measure $\mu$ on $(C, \mathcal{C})$, the measures $\mu \circ \pi_t^{-1}$, which give the
distribution of the vectors $(X_{t_1}, \ldots, X_{t_n})$ under $\mu$, are called the finite dimensional
distributions or f.d.d.'s for short. On the basis of (3.1) one might hope that
convergence of the f.d.d.'s might be enough for weak convergence of the $\mu_n$.
However, a simple example shows this is false.
Example 3.1. Let $a_n = 1/2 - 1/2n$, $b_n = 1/2 - 1/4n$, and let $\mu_n$ be the point
mass on the function
$$f_n(x) = \begin{cases} 0 & x \in [0, a_n] \\ 4n(x - a_n) & x \in [a_n, b_n] \\ 4n(1/2 - x) & x \in [b_n, 1/2] \\ 0 & x \in [1/2, 1] \end{cases}$$
As $n \to \infty$, $f_n(t) \to f_\infty \equiv 0$ but not uniformly. To see that $\mu_n$ does not converge
weakly to $\mu_\infty$, note that $h(\omega) = \sup_{0 \le t \le 1} \omega(t)$ is a continuous function but
$$\int h(\omega)\,\mu_n(d\omega) = 1 \not\to 0 = \int h(\omega)\,\mu_\infty(d\omega)$$
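The behavior in Example 3.1 is easy to check numerically. The sketch below assumes the tent function $f_n$ with base $[a_n, 1/2]$ and peak at $b_n$ as described in the example; the helper names `tent` and `peak` are illustrative.

```python
def tent(n, x):
    """f_n from Example 3.1: a tent of height 1 supported on [a_n, 1/2]."""
    a = 0.5 - 1.0 / (2 * n)
    b = 0.5 - 1.0 / (4 * n)
    if a <= x <= b:
        return 4 * n * (x - a)
    if b < x <= 0.5:
        return 4 * n * (0.5 - x)
    return 0.0


def peak(n):
    """Value of f_n at b_n, i.e. h(f_n) = sup_t f_n(t)."""
    return tent(n, 0.5 - 1.0 / (4 * n))


for n in (1, 10, 100, 1000):
    # f_n(0.4) is eventually 0, while the supremum stays at 1
    print(n, tent(n, 0.4), peak(n))
```

At any fixed point the values are eventually 0, yet the supremum never decreases, so the integrals of $h$ cannot converge to 0.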
(3.2) Theorem. Let $\mu_n$, $1 \le n \le \infty$ be probability measures on $C$. If the finite
dimensional distributions of $\mu_n$ converge to those of $\mu_\infty$ and if the $\mu_n$ are tight
then $\mu_n \Rightarrow \mu_\infty$.
Proof If $\mu_n$ is tight then by (2.1) it is relatively compact and hence each
subsequence $\mu_{n_m}$ has a further subsequence $\mu_{n'_m}$ that converges to a limit $\nu$.
If $f : \mathbf{R}^k \to \mathbf{R}$ is bounded and continuous then $f(X_{t_1}, \ldots, X_{t_k})$ is bounded and
continuous from $C$ to $\mathbf{R}$ and hence
$$\int f(X_{t_1}, \ldots, X_{t_k})\,\mu_{n'_m}(d\omega) \to \int f(X_{t_1}, \ldots, X_{t_k})\,\nu(d\omega)$$
The last conclusion implies that $\mu_{n'_m} \circ \pi_t^{-1} \Rightarrow \nu \circ \pi_t^{-1}$, so from the assumed
convergence of the f.d.d.'s we see that the f.d.d.'s of $\nu$ are determined and hence
there is only one subsequential limit.
To see that this implies that the whole sequence converges to $\nu$, we use the
following result, which as the proof shows, is valid for weak convergence on any
space. To prepare for the proof we ask the reader to do
Exercise 3.1. Let $r_n$ be a sequence of real numbers. If each subsequence of $r_n$
has a further subsequence that converges to $r$ then $r_n \to r$.
(3.3) Lemma. If each subsequence of $\mu_n$ has a further subsequence that
converges to $\nu$ then $\mu_n \Rightarrow \nu$.
Proof Note that if $f$ is a bounded continuous function, the sequence of real
numbers $\int f(\omega)\,\mu_n(d\omega)$ has the property that every subsequence has a further
subsequence that converges to $\int f(\omega)\,\nu(d\omega)$. Exercise 3.1 implies that the whole
sequence of real numbers converges to the indicated limit. Since this holds for
any bounded continuous $f$ the desired result follows. □
As Example 3.1 suggests, $\mu_n$ will not be tight if it concentrates on paths
that oscillate too much. To find conditions that guarantee tightness, we
introduce the modulus of continuity
$$w_\delta(\omega) = \sup\{ |\omega(s) - \omega(t)| : |s - t| \le \delta \}$$
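The modulus of continuity can be approximated on a grid. This is a minimal sketch, assuming paths sampled at equally spaced times $i/n$; `modulus` and `tent` are illustrative names, and `tent` restates the function $f_n$ of Example 3.1 so that the block is self-contained.

```python
def modulus(path, delta):
    """Grid version of w_delta for a path sampled at times i/n, i = 0..n."""
    n = len(path) - 1
    k = int(delta * n + 1e-9)  # largest index gap with |s - t| <= delta
    return max(
        abs(path[j] - path[i])
        for i in range(n + 1)
        for j in range(i, min(i + k, n) + 1)
    )


def tent(n, x):
    """f_n from Example 3.1."""
    a, b = 0.5 - 1.0 / (2 * n), 0.5 - 1.0 / (4 * n)
    if a <= x <= b:
        return 4 * n * (x - a)
    if b < x <= 0.5:
        return 4 * n * (0.5 - x)
    return 0.0


grid = 64
line = [i / grid for i in range(grid + 1)]            # the path w(t) = t
spike = [tent(8, i / grid) for i in range(grid + 1)]  # f_8 on the same grid

print(modulus(line, 1 / 32))
print(modulus(spike, 1 / 32))
```

For the straight line the modulus equals $\delta$, while for the tent $f_8$ it is already 1 at $\delta = 1/32$, which is the kind of oscillation that rules out tightness.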
(3.4) Theorem. The sequence $\mu_n$ is tight if and only if for each $\epsilon > 0$ there
are $n_0$, $M$ and $\delta$ so that
(i) $\mu_n(|\omega(0)| > M) \le \epsilon$ for all $n \ge n_0$
(ii) $\mu_n(w_\delta > \epsilon) \le \epsilon$ for all $n \ge n_0$
Remark. Of course by increasing $M$ and decreasing $\delta$ we can always check the
condition with $n_0 = 1$ but the formulation in (3.4) eliminates the need for that
final adjustment. Also by taking $\epsilon = \eta \wedge \zeta$ it follows that if $\mu_n$ is tight then
there is a $\delta$ and an $n_0$ so that $\mu_n(w_\delta > \eta) \le \zeta$ for $n \ge n_0$.
Proof We begin by recalling (see e.g., Royden (1988), page 169)
(3.5) The Arzela-Ascoli Theorem. A subset $A$ of $C$ has compact closure if
and only if $\sup_{\omega \in A} |\omega(0)| < \infty$ and $\lim_{\delta \to 0} \sup_{\omega \in A} w_\delta(\omega) = 0$.
To prove the necessity of (i) and (ii), we note that if $\mu_n$ is tight and $\epsilon > 0$
we can choose a compact set $K$ so that $\mu_n(K) \ge 1 - \epsilon$ for all $n$. By (3.5),
$K \subset \{|X(0)| \le M\}$ for large $M$ and if $\epsilon > 0$ then $K \subset \{w_\delta \le \epsilon\}$ for small $\delta$.
To prove the sufficiency of (i) and (ii), choose $M$ so that $\mu_n(|X(0)| > M) \le
\epsilon/2$ for all $n$ and choose $\delta_k$ so that $\mu_n(w_{\delta_k} > 1/k) \le \epsilon/2^{k+1}$ for all $n$. If we let
$K$ be the closure of $\{|X(0)| \le M, w_{\delta_k} \le 1/k \text{ for all } k\}$ then (3.5) implies $K$ is
compact and $\mu_n(K) \ge 1 - \epsilon$ for all $n$. □
Condition (i) is usually easy to check. For example it is trivial when
$X_n(0) = x_n$ and $x_n \to x$ as $n \to \infty$. The next result, (3.6), will be useful
in checking condition (ii). Here, we formulate the result in terms of random
variables $X_n$ taking values in $C$, that is to say, random processes $\{X_n(t), 0 \le
t \le 1\}$. However, one can easily rewrite the result in terms of their distributions
$\mu_n(A) = P(X_n \in A)$. Here, $A \in \mathcal{C}$.
(3.6) Theorem. Suppose for some $\alpha, \beta > 0$
$$E|X_n(t_2) - X_n(t_1)|^\beta \le K|t_2 - t_1|^{1+\alpha}$$
then (ii) in (3.4) holds.
Remark. The condition should remind the reader of Kolmogorov's continuity
criterion, (1.6) in Chapter 1. The key to our proof is the observation that the
proof of (1.6) gives quantitative estimates on the modulus of continuity. For a
much different approach to a slightly more general result, see Theorem 12.3 in
Billingsley (1968).
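Proof Let $\gamma < \alpha/\beta$, and pick $\eta > 0$ small enough so that
$$\lambda = (1-\eta)(1 + \alpha - \beta\gamma) - (1+\eta) > 0$$
From (1.7) in Chapter 1 it follows that if $A = 3 \cdot 2^{(1-\eta)\gamma}/(1 - 2^{-\gamma})$ then with
probability $\ge 1 - K2^{-N\lambda}/(1 - 2^{-\lambda})$ we have
$$|X_n(q) - X_n(r)| \le A|q - r|^\gamma \quad \text{for } q, r \in \mathbf{Q}_2 \cap [0,1] \text{ with } |q - r| < 2^{-(1-\eta)N}$$
Pick $N$ so that $K2^{-N\lambda}/(1 - 2^{-\lambda}) \le \epsilon$, then $\delta \le 2^{-(1-\eta)N}$ so that $A\delta^\gamma \le \epsilon$, and
it follows that $P(w_\delta > \epsilon) \le \epsilon$. □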
8.4. Skorokhod's Existence Theorem for SDE
In this section, we will describe Skorokhod's approach to constructing solutions
of stochastic differential equations. We will consider the special case
$$(*) \qquad dX_t = \sigma(X_t)\,dB_t$$
where $\sigma$ is bounded and continuous, since we can introduce the term $b(X_t)\,dt$
by change of measure. The new feature here is that $\sigma$ is only assumed to be
continuous, not Lipschitz or even Holder continuous. Examples at the end of
Section 5.3 show that we cannot hope to have uniqueness in this generality.
Skorokhod's idea for solving stochastic differential equations was to
discretize time to get an equation that is trivial to solve, and then pass to the
limit and extract subsequential limits to solve the original equation. For each
$n$, define $X_n(t)$ by setting $X_n(0) = x$ and, for $m2^{-n} < t \le (m+1)2^{-n}$,
$$X_n(t) = X_n(m2^{-n}) + \sigma(X_n(m2^{-n}))(B_t - B(m2^{-n}))$$
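The discretized equation can be simulated directly at the grid times $m2^{-n}$. The following is a minimal one-dimensional sketch; the function name `skorokhod_scheme` and the particular coefficient `sigma_example` (bounded and continuous, but not Lipschitz at 0) are assumptions chosen for illustration, not taken from the text.

```python
import math
import random


def skorokhod_scheme(sigma, x0, n, rng):
    """Values X_n(m 2^{-n}), m = 0..2^n, of the discretized equation on [0, 1].

    On each dyadic interval the coefficient is frozen at the left endpoint,
    so the update only needs the Brownian increment over that interval.
    """
    steps = 2 ** n
    dt = 1.0 / steps
    path = [x0]
    for _ in range(steps):
        dB = rng.gauss(0.0, math.sqrt(dt))  # B((m+1)2^-n) - B(m 2^-n)
        path.append(path[-1] + sigma(path[-1]) * dB)
    return path


def sigma_example(x):
    """Hypothetical coefficient: bounded, continuous, not Lipschitz at 0."""
    return min(1.0, math.sqrt(abs(x)))


rng = random.Random(1)
path = skorokhod_scheme(sigma_example, 0.5, 8, rng)
print(len(path), path[0], path[-1])
```

Between grid times the text's $X_n$ follows the Brownian path scaled by the frozen coefficient; the sketch records only the grid values.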
Since $X_n$ is a stochastic integral with respect to Brownian motion, the formula
for the covariance of stochastic integrals implies
$$\langle X_n^i, X_n^j \rangle_t = \sum_k \int_0^t (\sigma_{ik}\sigma_{jk})(X_n([2^n s]/2^n))\,ds = \int_0^t a_{ij}(X_n([2^n s]/2^n))\,ds$$
where as usual $a = \sigma\sigma^T$. If we suppose that $|a_{ij}(x)| \le M$ for all $i, j$ and $x$, it
follows that if $s < t$, then
$$|\langle X_n^i, X_n^j \rangle_t - \langle X_n^i, X_n^j \rangle_s| \le M(t - s)$$
so (5.1) in Chapter 3 implies
$$E \sup_{u \in [s,t]} |X_n^i(u) - X_n^i(s)|^p \le C_p E|\langle X_n^i \rangle_t - \langle X_n^i \rangle_s|^{p/2} \le C_p \{M(t-s)\}^{p/2}$$
Taking $p = 4$, we see that
$$E \sup_{u \in [s,t]} |X_n^i(u) - X_n^i(s)|^4 \le CM^2(t - s)^2$$
Using (3.6) to check (ii) in (3.4) and then noting that $X_n(0) = x$ so (i) in
(3.4) is trivial, it follows that the sequence $X_n$ is tight. Invoking Prokhorov's
Theorem (2.1) now, we can conclude that there is a subsequence $X_{n(k)}$ that
converges weakly to a limit $X$. We claim that $X$ satisfies MP$(0, a)$. To prove
this, we will show
(4.1) Lemma. If $f \in C^2$ and $Lf = \frac12 \sum_{i,j} a_{ij} D_{ij} f$ then
$$f(X_t) - f(X_0) - \int_0^t Lf(X_s)\,ds \quad \text{is a local martingale}$$
Once this is done the desired conclusion follows by applying the result to $f(x) =
x_i$ and $f(x) = x_i x_j$.
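Proof It suffices to show that if $f$, $D_i f$, and $D_{ij} f$ are bounded, then the
process above is a martingale. Ito's formula implies that
$$f(X_n(t)) - f(X_n(s)) = \sum_i \int_s^t D_i f(X_n(r))\,dX_n^i(r) + \frac12 \sum_{i,j} \int_s^t D_{ij} f(X_n(r))\,d\langle X_n^i, X_n^j \rangle_r$$
It follows from the definition of $X_n$ that
$$\langle X_n^i, X_n^j \rangle_r = \int_0^r a_{ij}(X_n([2^n u]2^{-n}))\,du$$
so if we let
$$L_n f(r) = \frac12 \sum_{i,j} a_{ij}(X_n([2^n r]2^{-n}))\,D_{ij} f(X_n(r))$$
then $f(X_n(t)) - f(X_n(s)) - \int_s^t L_n f(r)\,dr$ is a local martingale.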
Skorokhod's representation theorem, (1.4), implies that we can construct
processes $Y^k$ with the same distributions as the $X_{n(k)}$ on some probability space
in such a way that with probability 1 as $k \to \infty$, $Y^k(t)$ converges to $Y(t)$
uniformly on $[0, T]$ for any $T < \infty$. If $s < t$ and $g : C \to \mathbf{R}$ is a bounded
continuous function that is measurable with respect to $\mathcal{F}_s$, then
$$E\Big[g(Y) \cdot \Big\{f(Y_t) - f(Y_s) - \int_s^t Lf(Y_r)\,dr\Big\}\Big] = \lim_{k\to\infty} E\Big(g(Y^k) \cdot \Big\{f(Y^k_t) - f(Y^k_s) - \int_s^t L_{n(k)} f(r)\,dr\Big\}\Big) = 0$$
(with $L_{n(k)} f$ evaluated along $Y^k$). Since this holds for any continuous $g$, an
application of the monotone class theorem shows
$$E\Big(f(Y_t) - f(Y_s) - \int_s^t Lf(Y_r)\,dr \,\Big|\, \mathcal{F}_s\Big) = 0$$
which proves (4.1). □
8.5. Donsker's Theorem
Let $\xi_1, \xi_2, \ldots$ be i.i.d. with $E\xi_i = 0$ and $E\xi_i^2 = 1$. Let $S_n = \xi_1 + \cdots + \xi_n$ be the
$n$th partial sum. The most natural way to turn $S_m$, $0 \le m \le n$ into a process
indexed by $0 \le t \le 1$ is to let $\hat B^n_t = S_{[nt]}/\sqrt{n}$ where $[x]$ is the largest integer $\le x$.
To have a continuous trajectory we will instead let
$$B^n_t = \begin{cases} S_m/\sqrt{n} & \text{if } t = m/n \\ \text{linear} & \text{if } t \in [m/n, (m+1)/n] \end{cases}$$
(5.1) Donsker's Theorem. As $n \to \infty$, $B^n \Rightarrow B$, where $B$ is a standard
Brownian motion.
Proof We will prove this result using (3.2). Thus there are two things to do:
Convergence of finite dimensional distributions. Since $S_{[nt]}$ is the
sum of $[nt]$ independent random variables, it follows from the central limit
theorem that if $0 < t_1 < \ldots < t_m \le 1$ then
$$(\hat B^n_{t_1}, \hat B^n_{t_2} - \hat B^n_{t_1}, \ldots, \hat B^n_{t_m} - \hat B^n_{t_{m-1}}) \Rightarrow (B_{t_1}, B_{t_2} - B_{t_1}, \ldots, B_{t_m} - B_{t_{m-1}})$$
To extend the last conclusion from $\hat B^n$ to $B^n$, we begin by observing that if
$r_n = nt - [nt]$ then
$$B^n_t = \hat B^n_t + r_n \xi_{[nt]+1}/\sqrt{n}$$
Since $0 \le r_n < 1$, we have $r_n \xi_{[nt]+1}/\sqrt{n} \to 0$ in probability and the converging
together lemma, (1.3), implies that the individual random variables $B^n_t \Rightarrow B_t$.
To treat the vector we observe that
$$(\hat B^n_{t_i} - \hat B^n_{t_{i-1}}) - (B^n_{t_i} - B^n_{t_{i-1}}) = (\hat B^n_{t_i} - B^n_{t_i}) - (\hat B^n_{t_{i-1}} - B^n_{t_{i-1}}) \to 0$$
in probability, so it follows from (1.3) that
$$(B^n_{t_1}, B^n_{t_2} - B^n_{t_1}, \ldots, B^n_{t_m} - B^n_{t_{m-1}}) \Rightarrow (B_{t_1}, B_{t_2} - B_{t_1}, \ldots, B_{t_m} - B_{t_{m-1}})$$
Using the fact that $(x_1, x_2, \ldots, x_m) \to (x_1, x_1 + x_2, \ldots, x_1 + \cdots + x_m)$ is a
continuous mapping and invoking (1.2) it follows that the finite dimensional
distributions of $B^n$ converge to those of $B$.
Tightness. The $L^2$ maximal inequality for martingales (see e.g., (4.3) in
Chapter 4 of Durrett (1995)) implies
$$E\Big(\max_{0 \le j \le \ell} S_j^2\Big) \le 4ES_\ell^2$$
Taking $\ell = n/m$ and using Chebyshev's inequality it follows that
$$P\Big(\max_{0 \le j \le n/m} |S_j| > \epsilon\sqrt{n}\Big) \le 4/m\epsilon^2$$
or writing things in terms of $\hat B^n$
$$P\Big(\max_{s \le t \le s+1/m} |\hat B^n_t - \hat B^n_s| > \epsilon\Big) \le 4/m\epsilon^2$$
Since we need $m$ intervals of length $1/m$ to cover $[0,1]$ the last estimate is not
good enough to prove tightness. To improve this, we will truncate and compute
fourth moments. To isolate these details we will formulate a general result.
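(5.2) Lemma. Let $\xi_1, \xi_2, \ldots$ be i.i.d. with mean $\mu$. Let $\bar\xi_i = \xi_i 1_{(|\xi_i| \le M)}$, let
$\bar S_\ell = \bar\xi_1 + \cdots + \bar\xi_\ell$, let $\alpha(M) = E(|\xi_i|^2; |\xi_i| > M)$, let $\bar\mu_M = E\bar\xi_i$, and $\bar\nu_M = E(\bar\xi_i^2)$.
Then we have
(i) $P(\bar\xi_i \ne \xi_i) \le M^{-2}E(|\xi_i|^2; |\xi_i| > M) = \alpha(M)/M^2$
(ii) $|\mu - \bar\mu_M| \le E(|\xi_i|; |\xi_i| > M) \le \alpha(M)/M$
(iii) If $M \ge 1$ and $\alpha(M) \le 1$ then
$$E(\bar S_\ell - \ell\bar\mu_M)^4 \le C_1 \ell M^2 + C_2 \ell^2$$
where $C_1 = 32\{\bar\mu_M^4 + \bar\nu_M\}$ and $C_2 = 24\{\bar\mu_M^2 + \bar\nu_M\}$.
Remark. We give the explicit values of $C_1$ and $C_2$ only to make it clear that
they only depend on $\bar\mu_M$ and $\bar\nu_M$.
Proof Only (iii) needs to be proved. Since $E(\bar\xi_j - \bar\mu_M) = 0$ and the $\bar\xi_j$ are
independent
$$E(\bar S_\ell - \ell\bar\mu_M)^4 = \ell E(\bar\xi_j - \bar\mu_M)^4 + 6\binom{\ell}{2}\{E(\bar\xi_j - \bar\mu_M)^2\}^2$$
If $p$ is an even integer (we only care about $p = 2, 4$) then
$$E(\bar\xi_j - \bar\mu_M)^p \le 2^p(\bar\mu_M^p + E\bar\xi_j^p)$$
From this, (ii), and our assumptions it follows that
$$E(\bar\xi_j - \bar\mu_M)^2 \le 4(\bar\mu_M^2 + \bar\nu_M)$$
This gives the term $C_2\ell^2$. To estimate the fourth moment we notice that
$$E(\bar\xi_j^4) = \int_0^M 4x^3 P(|\bar\xi_j| > x)\,dx \le 2M^2 \int_0^M 2x P(|\bar\xi_j| > x)\,dx = 2M^2\bar\nu_M$$
So using our inequality for the $p$th power again, we have
$$E(\bar\xi_j - \bar\mu_M)^4 \le 16(\bar\mu_M^4 + 2M^2\bar\nu_M)$$
Since we have supposed $M \ge 1$, this gives the term $C_1\ell M^2$ and the proof is
complete. □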
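We will apply (5.2) with $M = \delta\sqrt{n}$. Part (i) implies
$$(5.3) \qquad nP(\bar\xi_i \ne \xi_i) \le n \cdot \frac{\alpha(\delta\sqrt{n})}{\delta^2 n} \to 0$$
by the dominated convergence theorem. Since $\mu = 0$ part (ii) implies
$$(5.4) \qquad n|\bar\mu_M| \le n \cdot \frac{\alpha(\delta\sqrt{n})}{\delta\sqrt{n}} = o(\sqrt{n})$$
as $n \to \infty$. Using the $L^4$ maximal inequality for martingales (again see e.g.,
(4.3) in Chapter 4 of Durrett (1995)) and part (iii), it follows that if $\ell = n/m$
and $n$ is large then (here and in what follows $C$ will change from line to line)
$$E \sup_{0 \le k \le \ell} |\bar S_k - k\bar\mu_M|^4 \le CE|\bar S_\ell - \ell\bar\mu_M|^4 \le C(\ell M^2 + \ell^2) \le C\Big(\frac{\delta^2}{m} + \frac{1}{m^2}\Big) n^2$$
Using Chebyshev's inequality now it follows that
$$(5.5) \qquad P\Big(\sup_{0 \le k \le \ell} |\bar S_k - k\bar\mu_M| > \epsilon\sqrt{n}/9\Big) \le C\epsilon^{-4}\Big(\frac{\delta^2}{m} + \frac{1}{m^2}\Big)$$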
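Let $\bar B^n_t = \bar S_{[nt]}/\sqrt{n}$ and $I_{k,m} = [k/m, (k+1)/m]$. Adding up $m$ of the
probabilities in (5.5), it follows that
$$\limsup_{n\to\infty} P\Big(\max_{0 \le k < m} \max_{s \in I_{k,m}} |\bar B^n_s - \bar B^n_{k/m}| > \epsilon/9\Big) \le C\epsilon^{-4}(\delta^2 + 1/m)$$
To turn this into an estimate of the modulus of continuity, we note that
$$w_{1/m}(f) \le 3 \max_{0 \le k < m} \max_{s \in I_{k,m}} |f(s) - f(k/m)|$$
since for example if $k/m \le s \le (k+1)/m \le t \le (k+2)/m$
$$|f(t) - f(s)| \le |f(t) - f((k+1)/m)| + |f((k+1)/m) - f(k/m)| + |f(k/m) - f(s)|$$
If we pick $m$ and $\delta$ so that $C\epsilon^{-4}(\delta^2 + 1/m) \le \epsilon$ it follows that
$$(5.6) \qquad \limsup_{n\to\infty} P(w_{1/m}(\bar B^n) > \epsilon/3) \le \epsilon$$
To convert this into an estimate of the modulus of continuity of $\hat B^n$ we begin
by observing that (5.3) and (5.4) imply
$$P\Big(\sup_{0 \le t \le 1} |\bar B^n_t - \hat B^n_t| > \epsilon/3\Big) \to 0$$
as $n \to \infty$ so the triangle inequality implies
$$\limsup_{n\to\infty} P(w_{1/m}(\hat B^n) > \epsilon) \le \epsilon$$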
Now the maximum oscillation of $B^n$ over $[s,t]$ is smaller than the maximum
oscillation of $\hat B^n$ over $[[ns]/n, ([nt]+1)/n]$ so
$$(5.7) \qquad w_\delta(B^n) \le w_{\delta+2/n}(\hat B^n)$$
and it follows that
$$\limsup_{n\to\infty} P(w_{1/2m}(B^n) > \epsilon) \le \epsilon$$
This verifies (ii) in (3.4) and completes the proof of Donsker's theorem. □
The main motivation for proving Donsker's theorem is that it gives as
corollaries a number of interesting facts about random walks. The key to the
vault is the continuous mapping theorem, (1.2), which with (5.1) implies:
(5.8) Theorem. If $\psi : C[0,1] \to \mathbf{R}$ has the property that it is continuous
$P_0$-a.s. then $\psi(B^n) \Rightarrow \psi(B)$.
Example 5.1. Let $\psi(\omega) = \max\{\omega(t) : 0 \le t \le 1\}$. It is easy to see that
$|\psi(\omega) - \psi(\xi)| \le \|\omega - \xi\|$ so $\psi : C[0,1] \to \mathbf{R}$ is continuous, and (5.8) implies
$$\max_{0 \le m \le n} S_m/\sqrt{n} \Rightarrow M_1 = \max_{0 \le t \le 1} B_t$$
To complete the picture, we observe that by (3.8) in Chapter 1 the distribution
of the right-hand side is
$$P_0(M_1 \ge a) = P_0(T_a \le 1) = 2P_0(B_1 \ge a)$$
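The limit in Example 5.1 can be checked by simulation. The following is a minimal Monte Carlo sketch, assuming steps $\pm 1$ with equal probability (one admissible choice of mean 0, variance 1 increments); the function names, the sample sizes and the seed are illustrative choices only.

```python
import math
import random


def max_walk_tail(a, n, trials, rng):
    """Monte Carlo estimate of P(max_{m <= n} S_m >= a sqrt(n)) for +-1 steps."""
    level = a * math.sqrt(n)
    hits = 0
    for _ in range(trials):
        s = 0
        running_max = 0
        for _ in range(n):
            s += 1 if rng.random() < 0.5 else -1
            if s > running_max:
                running_max = s
        if running_max >= level:
            hits += 1
    return hits / trials


def brownian_max_tail(a):
    """2 P(B_1 >= a), written with the complementary error function."""
    return math.erfc(a / math.sqrt(2.0))


rng = random.Random(0)
estimate = max_walk_tail(1.0, 400, 4000, rng)
print(estimate, brownian_max_tail(1.0))
```

The two printed numbers should be close, with the gap reflecting both Monte Carlo error and the finite value of $n$.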
Example 5.2. Let $\psi(\omega) = \sup\{t \le 1 : \omega(t) = 0\}$. This time $\psi$ is not
continuous, for if $\omega_\epsilon$ has $\omega_\epsilon(0) = 0$, $\omega_\epsilon(1/3) = 1$, $\omega_\epsilon(2/3) = \epsilon$, $\omega_\epsilon(1) = 2$, and
linear on each interval $[j/3, (j+1)/3]$ then $\psi(\omega_0) = 2/3$ but $\psi(\omega_\epsilon) = 0$ for $\epsilon > 0$.
It is easy to see that if $\psi(\omega) < 1$ and $\omega(t)$ has positive and negative values
in each interval $(\psi(\omega) - \delta, \psi(\omega))$ then $\psi$ is continuous at $\omega$. By arguments in
Example 3.1 of Chapter 1, the last set has $P_0$ measure 1. (If the zero at $\psi(\omega)$
was isolated on the left, it would not be isolated on the right.) Using (5.8) now
$$\sup\{m \le n : S_{m-1} \cdot S_m \le 0\}/n \Rightarrow L = \sup\{t \le 1 : B_t = 0\}$$
The distribution of $L$, given in (4.2) in Chapter 1, is an arcsine law.
Example 5.3. Let $\psi(\omega) = |\{t \in [0,1] : \omega(t) > a\}|$. The point $\omega \equiv a$ shows
that $\psi$ is not continuous but it is easy to see that $\psi$ is continuous at paths $\omega$
with $|\{t \in [0,1] : \omega(t) = a\}| = 0$. Fubini's theorem implies that
$$E_0|\{t \in [0,1] : B_t = a\}| = \int_0^1 P_0(B_t = a)\,dt = 0$$
so $\psi$ is continuous $P_0$-a.s. With a little work (5.8) implies
$$|\{m \le n : S_m > a\sqrt{n}\}|/n \Rightarrow |\{t \in [0,1] : B_t > a\}|$$
Before doing that work we would like to observe that (9.2) in Chapter 4 shows
that $|\{t \in [0,1] : B_t > 0\}|$ has an arcsine law under $P_0$.
Proof Application of (5.8) gives that for any $a$,
$$|\{t \in [0,1] : B^n_t > a\}| \Rightarrow |\{t \in [0,1] : B_t > a\}|$$
To convert this into a result about $|\{m \le n : S_m > a\sqrt{n}\}|$ we note that if $\epsilon > 0$
then, writing $X_m$ for the increments of the walk,
$$(5.9) \qquad P\Big(\max_{m \le n} |X_m| > \epsilon\sqrt{n}\Big) \le nP(|X_m| > \epsilon\sqrt{n}) \le \epsilon^{-2}E(X_m^2; |X_m| > \epsilon\sqrt{n}) \to 0$$
by dominated convergence, and on $\{\max_{m \le n} |X_m| \le \epsilon\sqrt{n}\}$, we have
$$|\{t \in [0,1] : B^n_t > a + \epsilon\}| \le \frac{1}{n}|\{m \le n : S_m > a\sqrt{n}\}| \le |\{t \in [0,1] : B^n_t > a - \epsilon\}|$$
Combining this with the first conclusion of the proof and using the fact that
$b \to |\{t \in [0,1] : B_t > b\}|$ is continuous at $b = a$ with probability one, we arrive
easily at the desired conclusion. □
Example 5.4. Let $\psi(\omega) = \int_{[0,1]} \omega(t)^k\,dt$ where $k > 0$ is an integer. $\psi$ is
continuous so applying (5.8) gives
$$\int_0^1 (B^n_t)^k\,dt \Rightarrow \int_0^1 B_t^k\,dt$$
To convert this into a result about the original sequence, we begin by observing
that if $x < y$ with $|x - y| \le \epsilon$ and $|x|, |y| \le M$ then
$$|x^k - y^k| \le \int_x^y k|z|^{k-1}\,dz \le \epsilon k M^{k-1}$$
From this it follows that on
$$G_n(M) = \Big\{\max_{m \le n} |X_m| \le M^{-(k+1)}\sqrt{n},\ \max_{m \le n} |S_m| \le M\sqrt{n}\Big\}$$
we have
$$\Big|\int_0^1 (B^n_t)^k\,dt - n^{-1-(k/2)} \sum_{m=1}^n S_m^k\Big| \le \frac{k}{M^2}$$
For fixed $M$ we have $P(\max_{m \le n} |X_m| > M^{-(k+1)}\sqrt{n}) \to 0$ by (5.9) so using
Example 5.1, and (1.1) it follows that
$$\liminf_{n\to\infty} P(G_n(M)) \ge P\Big(\max_{0 \le t \le 1} |B_t| < M\Big)$$
The right-hand side is close to 1 if $M$ is large so
$$\int_0^1 (B^n_t)^k\,dt - n^{-1-(k/2)} \sum_{m=1}^n S_m^k \to 0$$
in probability and it follows from the converging together lemma (1.3) that
$$n^{-1-(k/2)} \sum_{m=1}^n S_m^k \Rightarrow \int_0^1 B_t^k\,dt$$
It is remarkable that the last result holds under the assumption that $EX_i = 0$
and $EX_i^2 = 1$, i.e., we do not need to assume that $E|X_i|^k < \infty$.
8.6. The Space D
In the previous section, forming the piecewise linear approximation was an
annoying bookkeeping detail. In the next section when we consider processes that
jump at random times, making them piecewise linear will be a genuine
nuisance. To deal with that problem and to educate the reader, we will introduce
the space $D([0,1], \mathbf{R}^d)$ of functions from $[0,1]$ into $\mathbf{R}^d$ that are right continuous
and have left limits. Since we only need two simple results, (6.4) and (6.5)
below, and one can find this material in Chapter 3 of Billingsley (1968), Chapter
3 of Ethier and Kurtz (1986), or Chapter VI of Jacod and Shiryaev (1987), we
will content ourselves to simply state the results.
We begin by defining the Skorokhod topology on $D$. To motivate this
consider
Example 6.1. For $1 \le n \le \infty$ let
$$f_n(t) = \begin{cases} 0 & t \in [0, (n+1)/2n) \\ 1 & t \in [(n+1)/2n, 1] \end{cases}$$
where $(n+1)/2n = 1/2$ for $n = \infty$. We certainly want $f_n \to f_\infty$ but $\|f_n - f_\infty\| =
1$ for all $n$.
Let $\Lambda$ be the class of strictly increasing continuous mappings of $[0,1]$ onto
itself. Such functions necessarily have $\lambda(0) = 0$ and $\lambda(1) = 1$. For $f, g \in D$
define $d(f, g)$ to be the infimum of those positive $\epsilon$ for which there is a $\lambda \in \Lambda$
so that
$$\sup_t |\lambda(t) - t| \le \epsilon \quad \text{and} \quad \sup_t |f(t) - g(\lambda(t))| \le \epsilon$$
It is easy to see that $d$ is a metric. If we consider $f = f_n$ and $g = f_m$ in Example
6.1 then for $\epsilon < 1$ we must take $\lambda((n+1)/2n) = (m+1)/2m$ so
$$d(f_n, f_m) = \Big|\frac{n+1}{2n} - \frac{m+1}{2m}\Big| = \Big|\frac{1}{2n} - \frac{1}{2m}\Big|$$
When $m = \infty$ we have $d(f_n, f_\infty) = 1/2n$ so $f_n \to f_\infty$ in the metric $d$.
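The computation in Example 6.1 can be illustrated on a grid. The sketch below assumes one particular time change, the piecewise linear $\lambda$ that sends the jump time of $f_n$ to the jump time of $f_m$; it therefore exhibits a single admissible $\lambda$ rather than computing the infimum. All function names are illustrative.

```python
def step(jump):
    """Right continuous step function: 0 before `jump`, 1 from `jump` on."""
    return lambda t: 1.0 if t >= jump else 0.0


def time_change(jn, jm):
    """Piecewise linear increasing map of [0, 1] onto itself with lam(jn) = jm."""
    def lam(t):
        if t <= jn:
            return t * jm / jn
        return jm + (t - jn) * (1.0 - jm) / (1.0 - jn)
    return lam


def deviations(n, m, grid=1000):
    """Return (sup |lam(t) - t|, sup |f_n(t) - f_m(lam(t))|, sup |f_n(t) - f_m(t)|)."""
    jn, jm = (n + 1) / (2 * n), (m + 1) / (2 * m)
    f_n, f_m, lam = step(jn), step(jm), time_change(jn, jm)
    ts = sorted(set([i / grid for i in range(grid + 1)] + [jn]))
    time_dev = max(abs(lam(t) - t) for t in ts)
    space_dev = max(abs(f_n(t) - f_m(lam(t))) for t in ts)
    sup_dist = max(abs(f_n(t) - f_m(t)) for t in ts)
    return time_dev, space_dev, sup_dist


print(deviations(2, 4))
```

With the time change the two step functions match exactly and the only cost is the displacement of the jump time, whereas without it the uniform distance stays at 1.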
We will see in (6.2) that $d$ defines the correct topology on $D$. However, in
view of (2.2), it is unfortunate that the metric $d$ is not complete.
Example 6.2. For $1 \le n < \infty$ let
$$g_n(t) = \begin{cases} 0 & t \in [0, 1/2) \\ 1 & t \in [1/2, (n+1)/2n) \\ 0 & t \in [(n+1)/2n, 1] \end{cases}$$
In order to have $\epsilon < 1$ in the definition of $d(g_n, g_m)$ we must have $\lambda(1/2) = 1/2$
and $\lambda((n+1)/2n) = (m+1)/2m$ so
$$d(g_n, g_m) = \Big|\frac{1}{2n} - \frac{1}{2m}\Big|$$
The pointwise limit of $g_n$ is $g_\infty \equiv 0$ but $d(g_n, g_\infty) = 1$. We leave it to the reader
to show
Exercise 6.1. If $h \in D$ then $\liminf_{n\to\infty} d(g_n, h) > 0$.
To fix the problem with completeness we require that $\lambda$ be close to the
identity in a more stringent sense: the slopes of all of its chords are close to 1.
If $\lambda \in \Lambda$ let
$$\|\lambda\| = \sup_{s \ne t} \Big|\log \frac{\lambda(t) - \lambda(s)}{t - s}\Big|$$
For $f, g \in D$ define $d_0(f, g)$ to be the infimum of those positive $\epsilon$ for which there
is a $\lambda \in \Lambda$ so that
$$\|\lambda\| \le \epsilon \quad \text{and} \quad \sup_t |f(t) - g(\lambda(t))| \le \epsilon$$
It is easy to see that $d_0$ is a metric. The functions $g_n$ in Example 6.2 have
$$d_0(g_n, g_m) = \min\{1, |\log(n/m)|\}$$
so they no longer form a Cauchy sequence. In fact, there are no more problems.
(6.1) Theorem. The space $D$ is complete under the metric $d_0$.
For the reader who is curious why we discussed the simpler metric $d$ we note:
(6.2) Theorem. The metrics $d$ and $d_0$ are equivalent, i.e., they give rise to the
same topology on $D$.
Our first step in studying weak convergence in $D$ is to characterize
tightness. Generalizing a definition from Section 8.3 we let
$$w_S(f) = \sup\{|f(t) - f(s)| : s, t \in S\}$$
for each $S \subset [0,1]$. For $0 < \delta < 1$ put
$$w'_\delta(f) = \inf_{\{t_i\}} \max_{0 \le i < r} w_{[t_i, t_{i+1})}(f)$$
where the infimum extends over the finite sets $\{t_i\}$ with
$$0 = t_0 < t_1 < \cdots < t_r = 1 \quad \text{and} \quad t_i - t_{i-1} > \delta \text{ for } 1 \le i \le r$$
The analogue of (3.4) in the current setting is
(6.3) Theorem. The sequence $\mu_n$ is tight if and only if for each $\epsilon > 0$ there
are $n_0$, $M$, and $\delta$ so that
(i) $\mu_n(|\omega(0)| > M) \le \epsilon$ for all $n \ge n_0$
(ii) $\mu_n(w'_\delta > \epsilon) \le \epsilon$ for all $n \ge n_0$
Note that the only difference from (3.4) is the ' in (ii). If we remove the ' we
get the main result we need.
(6.4) Theorem. If for each $\epsilon > 0$ there are $n_0$, $M$, and $\delta$ so that
(i) $\mu_n(|\omega(0)| > M) \le \epsilon$ for all $n \ge n_0$
(ii) $\mu_n(w_\delta > \epsilon) \le \epsilon$ for all $n \ge n_0$
then $\mu_n$ is tight and every subsequential limit $\mu$ has $\mu(C) = 1$.
(6.1)-(6.4) are Theorems 14.2, 14.1, 15.2, and 15.5 in Billingsley (1968).
We will also need the following, which is a consequence of the proof of (3.3).
(6.5) Theorem. If each subsequence of $\mu_n$ has a further subsequence that
converges to $\mu$ then $\mu_n \Rightarrow \mu$.
8.7. Convergence to Diffusions
In this section we will prove a result, due to Stroock and Varadhan, about
the convergence of Markov chains to diffusion processes. Our treatment here
follows that in Chapter 11 of Stroock and Varadhan (1979) but we will discuss
both discrete and continuous time in parallel. Our duplicity lengthens the proof
somewhat, but is preferable we believe to the usual (somewhat dangerous) cop-
out: "the same argument with minor changes handles the other case."
In discrete time, the basic data is a sequence of transition probabilities
$\Pi_h(x, dy)$ for a Markov chain $Y_{mh}$, $m = 0, 1, 2, \ldots$, taking values in $S_h \subset \mathbf{R}^d$,
i.e.,
$$P(Y_{(m+1)h} \in A \mid Y_{mh} = x) = \Pi_h(x, A) \quad \text{for } x \in S_h,\ A \in \mathcal{R}^d$$
In this case we define $X^h_t = Y_{h[t/h]}$, i.e., we make $X^h_t$ constant on intervals
$[mh, (m+1)h)$.
In continuous time, the basic data is a sequence of transition rates $Q_h(x, dy)$
for a Markov chain $X^h_t$, $t \ge 0$, taking values in $S_h \subset \mathbf{R}^d$, i.e.,
$$\frac{d}{dt} P(X^h_t \in A \mid X^h_0 = x)\Big|_{t=0} = Q_h(x, A) \quad \text{for } x \in S_h,\ A \in \mathcal{R}^d, \text{ with } x \notin A$$
Note that although $x \in A$ is excluded from the interpretation, we will allow
$Q_h(x, \{x\}) > 0$ in which case jumps from $x$ to $x$ (which are invisible) occur at a
positive rate. To have a well behaved Markov chain we will suppose
(B) For any compact set $K$, $\sup_{x \in K} Q_h(x, \mathbf{R}^d) < \infty$.
In words, our convergence theorem states that if the infinitesimal mean and
covariance of the Markov chain converge to those of a diffusion for which the
martingale problem is well posed, and we have a condition that rules out jumps
in the limit then we have weak convergence. To make a precise statement we
need some notation. To incorporate both cases we introduce a kernel
$$K_h(x, dy) = \begin{cases} \Pi_h(x, dy)/h & \text{in discrete time} \\ Q_h(x, dy) & \text{in continuous time} \end{cases}$$
and define
$$a^h_{ij}(x) = \int_{|y-x| \le 1} (y_i - x_i)(y_j - x_j)\,K_h(x, dy)$$
$$b^h_i(x) = \int_{|y-x| \le 1} (y_i - x_i)\,K_h(x, dy)$$
$$\Delta^h_\epsilon(x) = K_h(x, B(x, \epsilon)^c)$$
where $B(x, \epsilon) = \{y : |y - x| < \epsilon\}$. Suppose
(A) $a_{ij}$ and $b_i$ are continuous coefficients for which the martingale problem is
well posed, i.e., for each $x$ there is a unique measure $P_x$ on $(C, \mathcal{C})$ so that the
coordinate maps $X_t(\omega) = \omega(t)$ satisfy $P_x(X_0 = x) = 1$ and
$$X^i_t - \int_0^t b_i(X_s)\,ds \quad \text{and} \quad X^i_t X^j_t - \int_0^t a_{ij}(X_s)\,ds$$
are local martingales.
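To make the quantities $a^h$, $b^h$ and $\Delta^h_\epsilon$ concrete, here is a minimal one-dimensional sketch. It assumes a hypothetical discrete-time chain that jumps by $\pm\sqrt{h}$ with probabilities $(1 \pm b\sqrt{h})/2$; this chain and the function names are illustrative assumptions, not an example from the text. For it one expects $a^h = 1$, $b^h = b$ and $\Delta^h_\epsilon = 0$ once $\sqrt{h} < \epsilon$.

```python
import math


def walk_kernel(h, drift=0.0):
    """K_h = Pi_h / h for the hypothetical chain, as (jump, mass) pairs.

    The chain moves by +sqrt(h) or -sqrt(h) with probabilities
    (1 + drift*sqrt(h))/2 and (1 - drift*sqrt(h))/2, independently of x.
    """
    s = math.sqrt(h)
    p_up = 0.5 * (1.0 + drift * s)
    return [(s, p_up / h), (-s, (1.0 - p_up) / h)]


def coefficients(kernel, eps):
    """Return (a^h, b^h, Delta^h_eps) for a kernel given as (jump, mass) pairs."""
    a = sum(j * j * w for j, w in kernel if abs(j) <= 1.0)
    b = sum(j * w for j, w in kernel if abs(j) <= 1.0)
    delta = sum(w for j, w in kernel if abs(j) >= eps)  # mass outside B(x, eps)
    return a, b, delta


print(coefficients(walk_kernel(0.01, drift=0.5), eps=0.5))
print(coefficients(walk_kernel(0.01), eps=0.05))
```

The second call shows that $\Delta^h_\epsilon$ is of order $1/h$ as long as the jump size $\sqrt{h}$ is at least $\epsilon$, and vanishes only after $h$ is small enough.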
(7.1) Theorem. Suppose in continuous time that (B) holds, and in either case
that (A) holds and for each $i$, $j$, $R < \infty$, and $\epsilon > 0$
(i) $\lim_{h \downarrow 0} \sup_{|x| \le R} |a^h_{ij}(x) - a_{ij}(x)| = 0$
(ii) $\lim_{h \downarrow 0} \sup_{|x| \le R} |b^h_i(x) - b_i(x)| = 0$
(iii) $\lim_{h \downarrow 0} \sup_{|x| \le R} \Delta^h_\epsilon(x) = 0$
If $X^h_0 = x_h \to x$ then we have $X^h_t \Rightarrow X_t$, the solution of the martingale problem
with $X_0 = x$.
Remarks. Here $\Rightarrow$ denotes convergence in $D([0,1], \mathbf{R}^d)$ but the result can be
trivially generalized to give convergence in $D([0,T], \mathbf{R}^d)$ for all $T < \infty$.
In (i), (ii), and (iii) the sup is taken only over points $x \in S_h$. Condition
(iii) cannot hold unless the points in $S_h$ are getting closer together. However,
there is no implicit assumption that $S_h$ is becoming dense in all of $\mathbf{R}^d$.
Proof The rest of the section is devoted to the proof of this result. We will
first prove the result under the stronger assumptions that for all $i$, $j$ and $\epsilon > 0$
we have
(i') $\lim_{h \downarrow 0} \sup_x |a^h_{ij}(x) - a_{ij}(x)| = 0$
(ii') $\lim_{h \downarrow 0} \sup_x |b^h_i(x) - b_i(x)| = 0$
(iii') $\lim_{h \downarrow 0} \sup_x \Delta^h_\epsilon(x) = 0$
(iv) $\sup_{h,x} |a^h_{ij}(x)| < \infty$
(v) $\sup_{h,x} |b^h_i(x)| < \infty$
(vi) $\sup_{h,x} \Delta^h_\epsilon(x) < \infty$
and in continuous time that (B') $\sup_x Q_h(x, \mathbf{R}^d) < \infty$.
a. Tightness
If $f$ is bounded and measurable, let
$$L^h f(x) = \int K_h(x, dy)(f(y) - f(x))$$
As the notation may suggest, we expect this to converge to
$$Lf(x) = \frac12 \sum_{i,j} a_{ij}(x) D_{ij} f(x) + \sum_i b_i(x) D_i f(x)$$
To explain the reason for this definition we note that if $f$ is bounded and
measurable then
(7.2a) In discrete time $f(X^h_{kh}) - \sum_{j=0}^{k-1} hL^h f(X^h_{jh})$ is a martingale.
(7.2b) In continuous time $f(X^h_t) - \int_0^t L^h f(X^h_s)\,ds$ is a martingale.
Proof (7.2a) follows easily from the definition of $L^h$ and induction. Since we
have (B'), (7.2b) follows from (2.1) and (1.6) in Chapter 7. □
The first step in the tightness proof is to see what happens when we apply
$L^h$ to $f_{y,\epsilon}(x) = f_\epsilon(x - y)$ where $f_\epsilon(x) = g(|x|^2/\epsilon^2)$, and
a(x)=((l-x*)(l-xf 0<x<l
^; \0 z>l
The exact formula for $g$ is not important. All we care about is that $0 \le g(x) \le 1$
for $x \ge 0$, $g(0) = 1$, $g(x) = 0$ for $x \ge 1$, and $g \in C^2$ (since $g'(1) = g''(1) = 0$).
(7.3) Lemma. There is a $C_\epsilon$ which only depends on $\epsilon$ so that $|L^h f_{y,\epsilon}| \le C_\epsilon$.
Proof Applying Taylor's theorem to the function $h(t) = f(x + t(y - x))$ for
$0 \le t \le 1$ we have for some $c_{x,y} \in [0,1]$
$$f(y) - f(x) = h(1) - h(0) = h'(0) + h''(c_{x,y})/2$$
(7.4) $$= \sum_i (y_i - x_i) D_i f(x) + \frac{1}{2}\sum_{i,j} (y_i - x_i)(y_j - x_j) D_{ij}f(z_{x,y})$$
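where $z_{x,y} = x + c_{x,y}(y - x)$. Integrating with respect to $K_h(x,dy)$ over the set
$|y - x| \le 1$ and using the triangle inequality we have
$$|L^h f(x)| \le \left|\nabla f(x)\cdot\int_{|y-x|\le 1}(y - x)\,K_h(x,dy)\right|
+ \left|\int_{|y-x|\le 1}\sum_{i,j}(y_i - x_i)(y_j - x_j)D_{ij}f(z_{x,y})\,K_h(x,dy)\right|
+ 2\|f\|_\infty K_h(x,B(x,1)^c)$$
Let $A_\epsilon = \sup_x |\nabla f(x)|$, and $B_\epsilon = \sup_z \|D_{ij}f(z)\|$ where
$$\|m_{ij}\| = \sup\Big\{ \sum_{i,j} u_i m_{ij} u_j : |u| = 1 \Big\}$$
Noticing that the Cauchy-Schwarz inequality implies
(7.5) $$\Big|\sum_i a_i b_i\Big| \le |a|\,|b|$$
and the definition of $\|m_{ij}\|$ gives
(7.6) $$\Big|\sum_{i,j}(y_i - x_i)(y_j - x_j)D_{ij}f(z)\Big| \le |y - x|^2\,\|D_{ij}f(z)\|$$
it follows that
$$|L^h f(x)| \le A_\epsilon\left|\int_{|y-x|\le 1}(y - x)\,K_h(x,dy)\right|
+ B_\epsilon\int_{|y-x|\le 1}|y - x|^2\,K_h(x,dy) + 2\|f\|_\infty K_h(x,B(x,1)^c)$$
Now
(7.7) $$\sum_i a^h_{ii}(x) = \int_{|y-x|\le 1}|y - x|^2\,K_h(x,dy)$$
so recalling the definitions of $b^h_i$ and $\Delta^h_1$, then using (iv)-(vi) the desired result
follows. □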
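To estimate the oscillations of the sample paths, we will let $\tau_0 = 0$,
$$\tau_n = \inf\{t > \tau_{n-1} : |X^h_t - X^h_{\tau_{n-1}}| \ge \epsilon/4\}$$
$$N = \inf\{n \ge 0 : \tau_n > 1\}$$
$$\sigma = \min\{\tau_n - \tau_{n-1} : 1 \le n \le N\}$$
$$\theta = \max\{|X^h(t) - X^h(t-)| : 0 < t \le 1\}$$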
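To relate these definitions to tightness, we will now prove
(7.8) Lemma. If $\sigma > \delta$ and $\theta \le \epsilon/4$ then $w_\delta(X^h) \le \epsilon$.
Proof To check this we note that if $\sigma > \delta$ and $0 < t - s < \delta$ there are two
cases to consider. Here $f$ denotes the sample path.
Case 1. $\tau_n \le s < t < \tau_{n+1}$
$$|f(t) - f(s)| \le |f(t) - f(\tau_n)| + |f(\tau_n) - f(s)| \le 2\epsilon/4$$
Case 2. $\tau_{n-1} \le s < \tau_n \le t < \tau_{n+1}$
$$|f(t) - f(s)| \le |f(t) - f(\tau_n)| + |f(\tau_n) - f(\tau_n-)|
+ |f(\tau_n-) - f(\tau_{n-1})| + |f(\tau_{n-1}) - f(s)| \le \epsilon$$ □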
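Tightness proof, discrete time. One of the probabilities in (7.8) is
trivial to estimate
(7.9a) $$P_y(\theta > \epsilon/4) \le \frac{1}{h}\sup_x \Pi_h(x,B(x,\epsilon/4)^c) \le \sup_x \Delta^h_{\epsilon/4}(x) \to 0$$
by (iii'). The first step in estimating $P_y(\sigma > \delta)$ is to estimate $P_y(\tau_1 \le \delta)$. To
do this we begin by observing that (7.2a) and (7.3) imply
$$f_{y,\epsilon/4}(X^h_{kh}) + C_{\epsilon/4}kh, \quad k = 0,1,2,\ldots \text{ is a submartingale}$$
Using the optional stopping theorem at time $\tau_1 \wedge \delta'$ where $\delta' = [\delta/h]h \le \delta$ and
noticing $X_{\tau_1\wedge\delta'} = X_{\tau_1\wedge\delta}$ it follows that
$$E_y\{f_{y,\epsilon/4}(X_{\tau_1\wedge\delta'}) + C_{\epsilon/4}(\tau_1\wedge\delta')\} \ge 1$$
or rearranging and using $\tau_1 \wedge \delta' \le \delta$
(7.10) $$C_{\epsilon/4}\,\delta \ge E_y\{1 - f_{y,\epsilon/4}(X_{\tau_1\wedge\delta'})\} \ge P_y(\tau_1 \le \delta)$$
Our next task is to estimate $P_y(N > k)$. To do this, we begin by observing
$$E_y(e^{-\tau_1}) \le P_y(\tau_1 \le \delta) + e^{-\delta}P_y(\tau_1 > \delta)
\le C_{\epsilon/4}\delta + e^{-\delta}(1 - C_{\epsilon/4}\delta) \equiv \lambda < 1$$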
Iterating and using the strong Markov property it follows that $E_y(e^{-\tau_n}) \le \lambda^n$
and
(7.11) $$P_y(N > k) = P_y(\tau_k \le 1) \le e\,E_y(e^{-\tau_k}) \le e\lambda^k$$
Combining (7.10) and (7.11) we have
(7.12) $$P_y(\sigma \le \delta) \le k\sup_x P_x(\tau_1 \le \delta) + P_y(N > k) \le C_{\epsilon/4}k\delta + e\lambda^k$$
If we pick $k$ so that $e\lambda^k \le \epsilon/3$ and then pick $\delta$ so that $C_{\epsilon/4}k\delta \le \epsilon/3$ then (7.12),
(7.9a), and (7.8) imply that for small $h$, $\sup_y P_y(w_\delta(X^h) > \epsilon) \le \epsilon$. This verifies
(ii) in (6.4). Since $X^h_0 = x_h$, (i) is trivial and the proof of tightness is complete
in discrete time. □
Tightness proof, continuous time. One of the probabilities in (7.8)
is trivial to estimate. Since jumps of size $> \epsilon/4$ occur at a rate smaller than
$\sup_x Q_h(x,B(x,\epsilon/4)^c)$ and $e^{-z} \ge 1 - z$
(7.9b) $$P_y(\theta > \epsilon/4) \le 1 - \exp\Big(-\sup_x Q_h(x,B(x,\epsilon/4)^c)\Big) \le \sup_x \Delta^h_{\epsilon/4}(x) \to 0$$
by (iii'). The first step in estimating $P_y(\sigma > \delta)$ is to estimate $P_y(\tau_1 \le \delta)$. To
do this we begin by observing that (7.2b) and (7.3) imply
$$f_{y,\epsilon/4}(X_t) + C_{\epsilon/4}t, \quad t \ge 0 \text{ is a submartingale}$$
Using the optional stopping theorem at time $\tau_1 \wedge \delta$ it follows that
$$E_y\{f_{y,\epsilon/4}(X_{\tau_1\wedge\delta}) + C_{\epsilon/4}(\tau_1\wedge\delta)\} \ge 1$$
The remainder of the proof is identical to the one in discrete time. □
b. Convergence
Since we have assumed that the martingale problem has a unique solution,
it suffices by (6.5) to prove that the limit of any convergent subsequence solves
the martingale problem to conclude that the whole sequence converges. The
first step is to show
(7.13) Lemma. If $f \in C^2_K$ then $\|L^h f - Lf\|_\infty \to 0$.
Proof Using (7.4) then integrating with respect to $K_h(x,dy)$ over the set
$|y - x| \le 1$ we have
(7.14) $$L^h f(x) = \sum_i b^h_i(x)D_i f(x)
+ \frac{1}{2}\int_{|y-x|\le 1}\sum_{i,j}(y_i - x_i)(y_j - x_j)D_{ij}f(z_{x,y})\,K_h(x,dy)
+ \int_{|y-x|>1}\{f(y) - f(x)\}\,K_h(x,dy)$$
Recalling the definition of $\Delta^h_1$ and using (iii') it follows that the third term goes
to 0 uniformly in $x$. To treat the first term we note that (7.5) and (ii') imply
$$\Big|\sum_i b^h_i(x)D_i f(x) - \sum_i b_i(x)D_i f(x)\Big| \le \sup_i |b^h_i(x) - b_i(x)|\sum_i |D_i f(x)| \to 0$$
uniformly in $x$.
Now the difference between the second term in (7.14) and $\frac{1}{2}\sum_{i,j} a_{ij}(x)D_{ij}f(x)$
is smaller than
(7.15) $$\frac{1}{2}\Big|\sum_{i,j} a^h_{ij}(x)D_{ij}f(x) - \sum_{i,j} a_{ij}(x)D_{ij}f(x)\Big|
+ \frac{1}{2}\Big|\int_{|y-x|\le 1}\sum_{i,j}(y_i - x_i)(y_j - x_j)\{D_{ij}f(z_{x,y}) - D_{ij}f(x)\}\,K_h(x,dy)\Big|$$
By (7.5) the first term is smaller than
$$\sup_{i,j} |a^h_{ij}(x) - a_{ij}(x)|\sum_{i,j}|D_{ij}f(x)|$$
which goes to 0 uniformly in $x$ by (i'). To estimate the second term in (7.15)
we note that for any $\epsilon$ the integral over $\{y : \epsilon < |y - x| \le 1\}$ goes to 0 uniformly
in $x$ by (iii') and the fact that the integrand is uniformly bounded. Using (7.6),
the last piece
(7.16) $$\Big|\int_{|y-x|\le\epsilon}\sum_{i,j}(y_i - x_i)(y_j - x_j)\{D_{ij}f(z_{x,y}) - D_{ij}f(x)\}\,K_h(x,dy)\Big|
\le \Gamma(\epsilon)\int_{|y-x|\le\epsilon}|y - x|^2\,K_h(x,dy)$$
where
$$\Gamma(\epsilon) = \sup_{|y-x|\le\epsilon}\|D_{ij}f(z_{x,y}) - D_{ij}f(x)\|$$
Now $z_{x,y}$ is on the line segment connecting $x$ and $y$, $D_{ij}f$ is continuous and has
compact support, so $\Gamma(\epsilon) \to 0$ as $\epsilon \to 0$. Using (7.7) and (iv) now the quantity
in (7.16) converges to 0 and the proof of (7.13) is complete. □
Convergence proof, discrete time. Let $h_n \to 0$ so that $X^{h_n}$ converges
weakly to a limit $X^0$. (7.2a) implies
$$f(X^{h_n}_{kh_n}) - \sum_{j=0}^{k-1} h_n L^{h_n}f(X^{h_n}_{jh_n}), \quad k = 0,1,2,\ldots \text{ is a martingale}$$
To pass to the limit, we rewrite the martingale property by introducing a
bounded continuous function $g : D \to \mathbf{R}$ which is measurable with respect
to $\mathcal{F}_s$, and observe that if $k_n = [s/h_n] + 1$ and $\ell_n = [t/h_n] + 1$ the martingale
property implies
$$E\Big( g(X^{h_n})\Big\{ f(X^{h_n}_{\ell_n h_n}) - f(X^{h_n}_{k_n h_n}) - \sum_{j=k_n}^{\ell_n - 1} h_n L^{h_n}f(X^{h_n}_{jh_n}) \Big\}\Big) = 0$$
Let $h_\infty = 0$. Skorokhod's representation theorem (1.4) implies that we can
construct processes $Y^n$, $1 \le n \le \infty$ with the same distribution as the $X^{h_n}$
so that $Y^n \to Y^\infty$ almost surely. Combining this with (7.13) and using the
bounded convergence theorem it follows that if $f \in C^2_K$ then
$$E\Big( g(X^0)\Big\{ f(X^0_t) - f(X^0_s) - \int_s^t Lf(X^0_r)\,dr \Big\}\Big) = 0$$
Since this holds for all continuous $g : C \to \mathbf{R}$ measurable with respect to $\mathcal{F}_s$ it
follows that if $f \in C^2_K$ then
$$f(X^0_t) - \int_0^t Lf(X^0_s)\,ds \text{ is a martingale}$$
Applying the last result to smoothly truncated versions of $f(x) = x_i$ and $f(x) =
x_i x_j$ it follows that $X^0$ is a solution of the martingale problem. Since the
martingale problem has a unique solution, the desired conclusion now follows
from (6.5). □
Convergence proof, continuous time. Let $h_n \to 0$ so that $X^{h_n}$
converges weakly to a limit $X^0$. (7.2b) implies
$$f(X^{h_n}_t) - \int_0^t L^{h_n}f(X^{h_n}_s)\,ds, \quad t \ge 0 \text{ is a martingale}$$
and repeating the discrete time proof shows that $X^0$ is a solution of the
martingale problem and the desired result follows. □
c. Localization
To extend the result to the original assumptions, we let $\varphi_k$ be a $C^\infty$
function that is 1 on $B(0,k)$ and 0 on $B(0,k+1)^c$ and define
$$K_{h,k}(x,dy) = \varphi_k(x)K_h(x,dy) + (1 - \varphi_k(x))\,\delta_x(dy)$$
where $\delta_x$ is a point mass at $x$. It is easy to see that if $K_h$ satisfies (i)-(iii) then
$K_{h,k}$ satisfies (i')-(iii') and (iv)-(vi). From the first two parts of the proof it
follows that for each $k$, $X^{h,k}$ is tight, and any subsequential limit solves the
martingale problem with coefficients
$$a^k(x) = \varphi_k(x)a(x) \qquad b^k(x) = \varphi_k(x)b(x)$$
We have supposed that the martingale problem for $a$ and $b$ has a unique
nonexplosive solution $X^0$. In view of (3.3), to prove (6.1) it suffices to show that
given any sequence $h_n \to 0$ there is a subsequence $h'_n$ so that $X^{h'_n} \Rightarrow X^0$. To
prove this we note that since each sequence $X^{h_n,k}$ is tight, a diagonal argument
shows that we can select a subsequence $h'_n$ so that for each $k$, $X^{h'_n,k} \Rightarrow X^{0,k}$.
Let $G_k = \{\omega : \omega(t) \in B(0,k), 0 \le t \le 1\}$. $G_k$ is an open set so if
$X^{h'_n,k} \Rightarrow X^{0,k}$ and $H \subset C$ is open then (2.1) implies that
$$\liminf_{n\to\infty} P(X^{h'_n,k} \in G_k \cap H) \ge P(X^{0,k} \in G_k \cap H) = P(X^0 \in G_k \cap H)$$
since up until the exit from $B(0,k)$, $X^{0,k}$ is a solution of the martingale problem
for $b$ and $a$ and hence its distribution agrees with that of $X^0$. Now up to the
first exit from $B(0,k)$, $X^{h'_n}$ has the same distribution as $X^{h'_n,k}$ so we
have
$$\liminf_{n\to\infty} P(X^{h'_n} \in H) \ge \liminf_{n\to\infty} P(X^{h'_n} \in G_k \cap H) \ge P(X^0 \in G_k \cap H)$$
Since there is no explosion the right-hand side increases to $P(X^0 \in H)$ as $k \uparrow \infty$
and the proof is complete.
8.8. Examples
In this section we will give applications of (7.1) beginning with the following
very simple situation.
Example 8.1. Ehrenfest Chain. Here, the physical system is a box filled
with air and divided in half by a plane with a small hole in it. We model this
mathematically by two urns that contain a total of $2n$ balls, which we think of
as the air molecules. At each time we pick a ball from the $2n$ in the two urns
at random and move it to the other urn, which we think of as a molecule going
through the hole. We expect the number of balls in the left urn to be about
$n + C\sqrt{n}$, so we let $Z^n_m$ be the number of balls at time $m$ in the left urn minus
$n$, and let $Y^n_m = Z^n_m/\sqrt{n}$. The state space is $S_{1/n} = \{k/\sqrt{n} : -n \le k \le n\}$
while the transition probability is
$$\Pi_{1/n}(x, x + n^{-1/2}) = \frac{n - x\sqrt{n}}{2n} \qquad \Pi_{1/n}(x, x - n^{-1/2}) = \frac{n + x\sqrt{n}}{2n}$$
When $\epsilon > n^{-1/2}$, $\Delta^{1/n}_\epsilon(x) = 0$ for all $x$ so (iii) holds. To check conditions (i)
and (ii) we note that
$$b^{1/n}(x) = n\left\{\frac{1}{\sqrt{n}}\cdot\frac{n - x\sqrt{n}}{2n} - \frac{1}{\sqrt{n}}\cdot\frac{n + x\sqrt{n}}{2n}\right\} = -x$$
$$a^{1/n}(x) = n\left\{\frac{1}{n}\cdot\frac{n - x\sqrt{n}}{2n} + \frac{1}{n}\cdot\frac{n + x\sqrt{n}}{2n}\right\} = 1$$
The limiting coefficients $b(x) = -x$ and $a(x) = 1$ are Lipschitz continuous, so
the martingale problem is well posed and (A) holds. Letting $X^{1/n}_t = Y^n_{[nt]}$
and applying (7.1) now it follows that
(8.1) Theorem. As $n \to \infty$, $X^{1/n}_t$ converges weakly to an Ornstein-Uhlenbeck
process $X_t$, i.e., the solution of
$$dX_t = -X_t\,dt + dB_t$$
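The coefficient computation in Example 8.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the state is stored as $k = Z^n_m$ (left-urn count minus $n$), and the simulator uses Python's standard `random` module with a fixed seed.

```python
import math
import random

def ehrenfest_coefficients(n, x):
    """Exact b^{1/n}(x) and a^{1/n}(x) for the rescaled chain at x = k / sqrt(n)."""
    step = 1.0 / math.sqrt(n)               # jump size n^{-1/2}
    p_up = (n - x * math.sqrt(n)) / (2 * n)
    p_down = (n + x * math.sqrt(n)) / (2 * n)
    h = 1.0 / n                             # time step, so K_h = Pi_h / h
    b = (step * p_up - step * p_down) / h
    a = (step ** 2 * p_up + step ** 2 * p_down) / h
    return b, a

def simulate_ehrenfest(n, k0, steps, seed=0):
    """Simulate Y^n_m = Z^n_m / sqrt(n) for m = 0..steps, starting from Z^n_0 = k0."""
    rng = random.Random(seed)
    k = k0                                  # left-urn count minus n, -n <= k <= n
    path = [k / math.sqrt(n)]
    for _ in range(steps):
        # The chosen ball lies in the left urn with probability (n + k) / (2n).
        if rng.random() < (n + k) / (2 * n):
            k -= 1
        else:
            k += 1
        path.append(k / math.sqrt(n))
    return path

b, a = ehrenfest_coefficients(100, 0.7)
print(b, a)
```

Evaluating `ehrenfest_coefficients` at several states shows the drift pulling the chain back toward 0 at rate $x$, which is the mean-reverting behavior of the limiting process in (8.1).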
Next we state a lemma that will help us check the hypotheses in the next
two examples. The point here is to replace the truncated moments by ordinary
moments that are easier to compute.
$$\hat a^h_{ij}(x) = \int (y_i - x_i)(y_j - x_j)\,K_h(x,dy)$$
$$\hat b^h_i(x) = \int (y_i - x_i)\,K_h(x,dy)$$
$$\gamma^h_p(x) = \int |y - x|^p\,K_h(x,dy)$$
(8.2) Lemma. If $p \ge 2$ and for all $R < \infty$ we have
(a) $\lim_{h\downarrow 0}\sup_{|x|\le R} |\hat a^h_{ij}(x) - a_{ij}(x)| = 0$
(b) $\lim_{h\downarrow 0}\sup_{|x|\le R} |\hat b^h_i(x) - b_i(x)| = 0$
(c) $\lim_{h\downarrow 0}\sup_{|x|\le R} \gamma^h_p(x) = 0$
then (i), (ii), and (iii) of (7.1) hold.
Proof Taking the conditions in reverse order, we note $\Delta^h_\epsilon(x) \le \epsilon^{-p}\gamma^h_p(x)$
$$|\hat b^h_i(x) - b^h_i(x)| \le \int_{|y-x|>1}|y - x|\,K_h(x,dy) \le \gamma^h_p(x)$$
Using the Cauchy-Schwarz inequality with the triviality $(y_i - x_i)^2 \le |y - x|^2$
$$|\hat a^h_{ij}(x) - a^h_{ij}(x)| \le \int_{|y-x|>1}|(y_i - x_i)(y_j - x_j)|\,K_h(x,dy)
\le \int_{|y-x|>1}|y - x|^2\,K_h(x,dy) \le \gamma^h_p(x)$$
The desired conclusion follows easily from the last three observations. □
Our first two examples are the last two in Section 5.1, and date back to
the beginnings of the subject, see Feller (1951).
Example 8.2. Branching Processes. Consider a sequence of branching
processes $\{Z^n_m, m \ge 0\}$ in which the probability of $k$ children $p^n_k$ has mean
$1 + (\beta_n/n)$ and variance $\sigma_n^2$. Suppose that
(A1) $\beta_n \to \beta \in (-\infty,\infty)$,
(A2) $\sigma_n \to \sigma \in (0,\infty)$,
(A3) for any $\delta > 0$, $\sum_{k>\delta n} k^2 p^n_k \to 0$
Following the motivation in Example 1.6 of Chapter 5, we let $X^{1/n}_t = Z^n_{[nt]}/n$.
Our goal is to show
(8.3) Theorem. If (A1), (A2), and (A3) hold then $X^{1/n}_t$ converges weakly to
Feller's branching diffusion $X_t$, i.e., the solution of
$$dX_t = \beta X_t\,dt + \sigma\sqrt{X_t}\,dB_t$$
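To see where the drift $\beta x$ and diffusion coefficient $\sigma^2 x$ come from, one can compute the untruncated coefficients $\hat b^{1/n}$ and $\hat a^{1/n}$ directly from the mean and variance of a sum of $nx$ i.i.d. offspring counts. The following Python sketch does this; the function name and argument names are illustrative assumptions, and only the first two moments of the offspring law are used.

```python
def branching_coefficients(n, x, beta_n, sigma2_n):
    """Untruncated drift and second-moment coefficients of X^{1/n} at state x.

    Given Z_0 = n x, Z_1 is a sum of n x i.i.d. offspring counts with
    mean 1 + beta_n / n and variance sigma2_n (assumed inputs).
    """
    mean_offspring = 1.0 + beta_n / n
    z0 = n * x
    mean_z1 = z0 * mean_offspring        # E(Z_1 | Z_0 = n x)
    var_z1 = z0 * sigma2_n               # Var(Z_1 | Z_0 = n x)
    h = 1.0 / n                          # time step of the rescaled chain
    shift = mean_z1 / n - x              # E(Z_1 / n - x), equal to x beta_n / n
    b_hat = shift / h
    a_hat = (var_z1 / n ** 2 + shift ** 2) / h
    return b_hat, a_hat

for n in (10, 1000, 100000):
    print(n, branching_coefficients(n, 2.0, 0.5, 3.0))
```

With $\beta_n = 0.5$ and $\sigma_n^2 = 3$ held fixed, the printed values approach $\beta x$ and $\sigma^2 x$ as $n$ grows, which is the content of conditions (a) and (b) of (8.2) for this example.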
References. For a classical generating function approach and the history of
the result, see Sections 3.3-3.4 of Jagers (1975). For an approach based on
semigroups and generators see Section 9.1 of Ethier and Kurtz (1986).
Proof We first prove the result under the assumption
(A3') for any $\delta > 0$, $\sum_{k>\delta n} k^2 p^n_k = 0$ for large $n$
Then we will argue that if $n$ is large then with high probability we never see
any families of size larger than $n\delta$.
The limiting coefficients satisfy (3.3) in Chapter 5, so using (4.1) there, we
see that the martingale problem is well posed. Calculations in Example 1.6 of
Chapter 5 show that (a) and (b) hold. To check (c) with $p = 4$, we begin by
noting that when $Z^n_0 = \ell$, $Z^n_1$ is the sum of $\ell$ independent random variables $\xi^n_i$
with mean $1 + \beta_n/n \ge 0$ and variance $\sigma_n^2$. Let $\delta > 0$. Under (A3') we have in
addition $|\xi^n_i| \le n\delta$ for large $n$. Using $(a + b)^4 \le 2^4a^4 + 2^4b^4$ we have
$$\gamma^{1/n}_4(x) = nE(n^{-4}\{Z^n_1 - nx\}^4 \mid Z^n_0 = nx)
\le 16n^{-3}(x\beta_n)^4 + 16n^{-3}E(\{Z^n_1 - nx(1 + \beta_n/n)\}^4 \mid Z^n_0 = nx)$$
To bound the second term we note that
$$E(\{Z^n_1 - nx(1 + \beta_n/n)\}^4 \mid Z^n_0 = nx) = nx\,E(\xi^n_i - (1 + \beta_n/n))^4 + 6\binom{nx}{2}\sigma_n^4$$
To bound $E(\xi^n_i - (1 + \beta_n/n))^4$, we note that if $1 + \beta_n/n \le n\delta$
$$E(\xi^n_i - (1 + \beta_n/n))^4 = \int_0^{n\delta} 4x^3 P(|\xi^n_i - (1 + \beta_n/n)| > x)\,dx
\le 2(n\delta)^2\int_0^{n\delta} 2x\,P(|\xi^n_i - (1 + \beta_n/n)| > x)\,dx \le 2\delta^2n^2\sigma_n^2$$
Combining the last three estimates we see that if $|x| \le R$ then
$$\gamma^{1/n}_4(x) \le 32\delta^2\sigma_n^2R + 96\sigma_n^4R^2n^{-1} + 16n^{-3}(R\beta_n)^4$$
Since $\delta > 0$ is arbitrary we have established (c) and the desired conclusion
follows from (8.2) and (7.1).
To replace (A3') by (A3) now, we observe that the convergence for the
special case and use of the continuous mapping theorem as in Example 5.4
imply
$$n^{-2}\sum_{m=0}^{n-1} Z^n_m \Rightarrow \int_0^1 X_s\,ds$$
(A3) implies that the probability of a family of size larger than $n\delta$ is $o(n^{-2})$.
Thus if we truncate the original sequence by not allowing more than $\delta_n n$ children
where $\delta_n \to 0$ slowly, then (A1), (A2) and (A3') will hold and the probability
of a difference between the two systems will converge to 0. □
Example 8.3. Wright-Fisher Diffusion. Recall the setup of Example 1.7
in Chapter 5. We have an urn with $n$ letters in it which may be $A$ or $a$. To
build up the urn at time $m + 1$ we sample with replacement from the urn at
time $m$ but with probability $\alpha/n$ we ignore the draw and place an $a$ in, and with
probability $\beta/n$ we ignore the draw and place an $A$ in. Let $Z^n_m$ be the number
of $A$'s in the urn at time $m$, and let $X^{1/n}_t = Z^n_{[nt]}/n$. Our goal is to show
(8.4) Theorem. As $n \to \infty$, $X^{1/n}_t$ converges weakly to the Wright-Fisher
diffusion $X_t$, i.e., the solution of
$$dX_t = (-\alpha X_t + \beta(1 - X_t))\,dt + \sqrt{X_t(1 - X_t)}\,dB_t$$
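Since $Z^n_1$ given $Z^n_0 = nx$ is binomial, the quantities $\hat b^{1/n}$, $\hat a^{1/n}$ and $\gamma^{1/n}_4$ can be evaluated in closed form. The Python sketch below does so using the standard binomial central moments $\mu_2 = npq$, $\mu_3 = npq(1 - 2p)$ and $\mu_4 = npq(1 + 3(n-2)pq)$; the function name is a hypothetical choice, and the success probability $p$ is the one used in the proof of (8.4).

```python
def wright_fisher_moments(n, x, alpha, beta):
    """Return (b_hat, a_hat, gamma4) for the rescaled Wright-Fisher chain at x.

    Z_1 given Z_0 = n x is Binomial(n, p) with
    p = x (1 - alpha / n) + (1 - x) beta / n.
    """
    p = x * (1 - alpha / n) + (1 - x) * beta / n
    q = 1 - p
    c = p - x                                   # mean displacement of Z_1 / n
    mu2 = n * p * q                             # central moments of Binomial(n, p)
    mu3 = n * p * q * (1 - 2 * p)
    mu4 = n * p * q * (1 + 3 * (n - 2) * p * q)
    h = 1.0 / n
    b_hat = c / h
    a_hat = (mu2 / n ** 2 + c ** 2) / h
    # E(Z_1 - n x)^4, expanding (U + n c)^4 with U = Z_1 - n p and E U = 0.
    fourth = mu4 + 4 * (n * c) * mu3 + 6 * (n * c) ** 2 * mu2 + (n * c) ** 4
    gamma4 = (fourth / n ** 4) / h
    return b_hat, a_hat, gamma4

for n in (100, 10000):
    print(n, wright_fisher_moments(n, 0.3, 1.0, 2.0))
```

As $n$ grows the first two outputs approach $-\alpha x + \beta(1 - x)$ and $x(1 - x)$, while the third tends to 0, matching conditions (b), (a) and (c) of (8.2) respectively.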
References. Again the first results are due to Feller (1951). Chapter 10 of
Ethier and Kurtz (1986) treats this problem and generalizations allowing more
than 2 alleles.
Proof The limiting coefficients satisfy (3.3) in Chapter 5 so using (4.1) there
we see that the martingale problem is well posed. Calculations in Example 1.7
in Chapter 5 show that (a) and (b) hold. To check condition (c) with $p = 4$
now, we note that if $Z^n_0 = nx$ then the distribution of $Z^n_1$ is the same as that
of $S_n = \xi_1 + \cdots + \xi_n$ where $\xi_1,\ldots,\xi_n \in \{0,1\}$ are i.i.d. with $P(\xi_i = 1) = p =
x(1 - \alpha/n) + (1 - x)\beta/n$.
$$E(S_n - np)^4 = E\Big(\sum_{i=1}^n (\xi_i - p)\Big)^4
= nE(\xi_1 - p)^4 + 6\binom{n}{2}\big(E(\xi_1 - p)^2\big)^2 \le Cn^2$$
since $E(\xi_i - p)^m \le 1$ for all $m \ge 0$. From this and the inequality $(a + b)^4 \le
2^4a^4 + 2^4b^4$ it follows that
$$nE(\{Z^n_1/n - x\}^4 \mid Z^n_0 = nx) \le 16n(-\alpha x + \beta(1 - x))^4/n^4 + 16nCn^2/n^4 \to 0$$
uniformly for $x \in [0,1]$ and the proof is complete. □
Our final example has a deterministic limit. In such situations the following
lemma is useful.
(8.5) Lemma. If for all $R < \infty$ we have
(a) $\lim_{h\downarrow 0}\sup_{|x|\le R} |\hat a^h_{ii}(x)| = 0$
(b) $\lim_{h\downarrow 0}\sup_{|x|\le R} |\hat b^h_i(x) - b_i(x)| = 0$
then (i), (ii), and (iii) of (7.1) hold with $a_{ij}(x) \equiv 0$.
Proof The new (a) and (b) imply the ones in (8.2). To check (c) of (8.2) with
$p = 2$ we note that
$$\sum_i \hat a^h_{ii}(x) = \int |y - x|^2\,K_h(x,dy) = \gamma^h_2(x)$$ □
Example 8.4. Epidemics. Consider a population with a fixed size of $n$
individuals and think of a disease like measles so that recovered individuals are
immune to further infection. Let $S^n_t$ and $I^n_t$ be the number of susceptible and
infected individuals at time $t$, so that $n - (S^n_t + I^n_t)$ is the number of recovered
individuals. We suppose that $(S^n_t, I^n_t)$ makes transitions at the following rates:
$(s, i) \to (s + 1, i)$        $\alpha(n - s - i)$
$(s, i) \to (s - 1, i + 1)$    $\beta si/n$
$(s, i) \to (s, i - 1)$        $\gamma i$
In words, each immune individual dies at rate $\alpha$ and is replaced by a new
susceptible. Each susceptible individual becomes infected at rate $\beta$ times the
fraction of the population that is infected, $i/n$. Finally, each infected individual
recovers at rate $\gamma$ and enters the immune class.
Let $X^{1/n}_t = (S^n_t/n, I^n_t/n)$. To check (a) in (8.5) we note that
$$\hat a^{1/n}_{11}(x_1,x_2) = \frac{1}{n^2}\{\alpha n(1 - x_1 - x_2) + \beta nx_1x_2\}$$
$$\hat a^{1/n}_{12}(x_1,x_2) = -\frac{1}{n^2}\,\beta nx_1x_2$$
$$\hat a^{1/n}_{22}(x_1,x_2) = \frac{1}{n^2}\{\beta nx_1x_2 + \gamma nx_2\}$$
and all three terms converge to 0 uniformly in $\Gamma = \{(x_1,x_2) : x_1,x_2 \ge 0, x_1 +
x_2 \le 1\}$. To check (b) in (8.5) we observe
$$\hat b^{1/n}_1(x_1,x_2) = \alpha n(1 - x_1 - x_2)\cdot\frac{1}{n} - \beta nx_1x_2\cdot\frac{1}{n}
= \alpha(1 - x_1 - x_2) - \beta x_1x_2 \equiv b_1(x)$$
$$\hat b^{1/n}_2(x_1,x_2) = \beta nx_1x_2\cdot\frac{1}{n} - \gamma nx_2\cdot\frac{1}{n}
= \beta x_1x_2 - \gamma x_2 \equiv b_2(x)$$
The limiting coefficients are Lipschitz continuous on $\Gamma$. Their values outside $\Gamma$
are irrelevant so we extend them to be Lipschitz continuous. It follows that the
martingale problem is well posed and we have
(8.6) Theorem. As $n \to \infty$, $X^{1/n}_t$ converges weakly to $X_t$, the solution of the
ordinary differential equation
$$dX_t = b(X_t)\,dt$$
Remark. If we set $\alpha = 0$ then our epidemic model reduces to one considered in
Sections 11.1-11.3 of Ethier and Kurtz (1986). In addition to (8.6) they prove
a central limit theorem for $\sqrt{n}(X^{1/n}_t - X_t)$.
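The limiting differential equation in (8.6) is easy to integrate numerically. The following Python sketch uses the explicit Euler method, which is a choice made here for brevity and not something prescribed by the text; the function names, step size, and parameter values are illustrative assumptions.

```python
def epidemic_drift(x1, x2, alpha, beta, gamma):
    """Drift b(x) = (b_1(x), b_2(x)) of the limiting ODE in Example 8.4."""
    b1 = alpha * (1 - x1 - x2) - beta * x1 * x2
    b2 = beta * x1 * x2 - gamma * x2
    return b1, b2

def solve_epidemic(x0, alpha, beta, gamma, t_max, dt=1e-3):
    """Explicit Euler approximation of dX_t = b(X_t) dt on [0, t_max].

    Returns a list of (t, x1, x2) with x1 the susceptible fraction and
    x2 the infected fraction.
    """
    x1, x2 = x0
    path = [(0.0, x1, x2)]
    steps = int(round(t_max / dt))
    for k in range(1, steps + 1):
        b1, b2 = epidemic_drift(x1, x2, alpha, beta, gamma)
        x1, x2 = x1 + dt * b1, x2 + dt * b2
        path.append((k * dt, x1, x2))
    return path

# Assumed parameters: alpha = 0.1, beta = 2, gamma = 1, 10% initially infected.
path = solve_epidemic((0.9, 0.1), 0.1, 2.0, 1.0, t_max=50.0)
print(path[-1])
```

A path of the rescaled Markov chain $X^{1/n}_t$ for large $n$ could be plotted against this curve to visualize the convergence asserted in (8.6).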
Solutions to Exercises
Chapter 1
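1.1. Let $\mathcal{A} = \{A = \{\omega : (\omega(t_1),\omega(t_2),\ldots) \in B\} : B \in \mathcal{R}^{\{1,2,\ldots\}}\}$. Clearly, any
$A \in \mathcal{A}$ is in the $\sigma$-field generated by the finite dimensional sets. To complete
the proof, we only have to check that $\mathcal{A}$ is a $\sigma$-field. The first and easier step is
to note if $A = \{\omega : (\omega(t_1),\omega(t_2),\ldots) \in B\}$ then $A^c = \{\omega : (\omega(t_1),\omega(t_2),\ldots) \in
B^c\} \in \mathcal{A}$. To check that $\mathcal{A}$ is closed under countable unions, let $A_n = \{\omega :
(\omega(t^n_1),\omega(t^n_2),\ldots) \in B_n\}$, let $t_1, t_2, \ldots$ be an ordering of $\{t^n_m : n,m \ge 1\}$ and
note that we can write $A_n = \{\omega : (\omega(t_1),\omega(t_2),\ldots) \in E_n\}$ so $\cup_n A_n = \{\omega :
(\omega(t_1),\omega(t_2),\ldots) \in \cup_n E_n\} \in \mathcal{A}$.
1.2. Let $A_n = \{\omega$ : there is an $s \in [0,1]$ so that $|B_t - B_s| \le C|t - s|^\gamma$ when
$|t - s| \le k/n\}$. For $1 \le i \le n - k + 1$ let
$$Y_{i,n} = \max\Big\{\Big|B\Big(\frac{i+j}{n}\Big) - B\Big(\frac{i+j-1}{n}\Big)\Big| : j = 0,1,\ldots,k-1\Big\}$$
$$B_n = \{\text{at least one } Y_{i,n} \text{ is} \le (2k - 1)C/n^\gamma\}$$
Again $A_n \subset B_n$ but this time if $\gamma > 1/2 + 1/k$, i.e., $k(1/2 - \gamma) < -1$, then
$$P(B_n) \le nP(|B(1/n)| \le (2k - 1)C/n^\gamma)^k
\le nP(|B(1)| \le (2k - 1)Cn^{1/2-\gamma})^k
\le C'n^{k(1/2-\gamma)+1} \to 0$$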
1.3. The first step is to observe that the scaling relationship (1.3) implies
(*) $$\Delta_{m,n} \stackrel{d}{=} 2^{-n/2}\Delta_{1,0}$$
while the definition of Brownian motion shows $E\Delta^2_{1,0} = t$, and $E(\Delta^2_{1,0} - t)^2 =
C < \infty$. Using (*) and the definition of Brownian motion, it follows that if
$k \ne m$ then $\Delta^2_{k,n} - t2^{-n}$ and $\Delta^2_{m,n} - t2^{-n}$ are independent and have mean 0 so
$$E\Big(\sum_{1\le m\le 2^n}(\Delta^2_{m,n} - t2^{-n})\Big)^2 = \sum_{1\le m\le 2^n}E(\Delta^2_{m,n} - t2^{-n})^2 = 2^nC2^{-2n}$$
where in the last equality we have used (*) again. The last result and Chebyshev's
inequality imply
$$P\Big(\Big|\sum_{1\le m\le 2^n}\Delta^2_{m,n} - t\Big| \ge 1/n\Big) \le n^2\cdot C2^{-n}$$
The right-hand side is summable so the Borel-Cantelli lemma implies
$$P\Big(\Big|\sum_{m\le 2^n}\Delta^2_{m,n} - t\Big| \ge 1/n \text{ infinitely often}\Big) = 0$$
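The conclusion of Exercise 1.3, that the sum of squared increments over the dyadic partition of $[0,t]$ converges to $t$, can be observed in simulation. The sketch below is an illustration only: it draws independent normal increments with variance $t2^{-n}$ using Python's standard `random.Random.gauss`, and the function name and seed are assumptions.

```python
import math
import random

def dyadic_quadratic_variation(t, n, seed=0):
    """Sum of squared Brownian increments over 2^n equal subintervals of [0, t]."""
    rng = random.Random(seed)
    m = 2 ** n
    sd = math.sqrt(t / m)        # each increment is normal with variance t 2^{-n}
    return sum(rng.gauss(0.0, sd) ** 2 for _ in range(m))

for n in (4, 8, 12, 14):
    print(n, dyadic_quadratic_variation(1.0, n))
```

For Gaussian increments the constant in the solution is $C = 2t^2$, so the standard deviation of the sum is $t\sqrt{2}\,2^{-n/2}$, which explains why the printed values cluster more tightly around $t$ as $n$ increases.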
2.1. Take $Y(\omega) = f(\omega_{t-s})$.
2.2. When $f(x) = x_i$, 2.1 becomes $E_x(B^i_t|\mathcal{F}_s) = E_{B(s)}B^i_{t-s} = B^i_s$ since under
$P_x$, $B^i_{t-s}$ is normal with mean $x_i$. When $f(x) = x_ix_j$ with $i \ne j$ 2.1 becomes
$$E_x(B^i_tB^j_t|\mathcal{F}_s) = E_{B(s)}(B^i_{t-s}B^j_{t-s}) = B^i_sB^j_s$$
since under $P_x$, $B^i_{t-s}$ and $B^j_{t-s}$ are independent normals with means $x_i$ and $x_j$.
2.3. Let $Y = 1_{(T_0 > t)}$ and note that $T_0\circ\theta_1 = R - 1$ so $Y\circ\theta_1 = 1_{(R > 1+t)}$. Using
the Markov property gives
$$P_x(R > 1 + t|\mathcal{F}_1) = P_{B_1}(T_0 > t)$$
Taking expected value now and recalling $P_x(B_1 = y) = p_1(x,y)$ gives
$$P_x(R > 1 + t) = \int p_1(x,y)P_y(T_0 > t)\,dy$$
2.4. Let $Y = 1_{(T_0 > 1-t)}$ and note that $Y\circ\theta_t = 1_{(L \le t)}$. Using the Markov
property gives
$$P_0(L \le t|\mathcal{F}_t) = P_{B_t}(T_0 > 1 - t)$$
Taking expected value now and recalling $P_0(B_t = y) = p_t(0,y)$ gives
$$P_0(L \le t) = \int p_t(0,y)P_y(T_0 > 1 - t)\,dy$$
2.5. We will prove the result by induction on $n$. By assumption it holds for
$n = 1$. Suppose now it is true for $n$. Let $Y(\omega) = 1_{(T > 1,\ \omega_1 \in K)}$. Applying the
Markov property at time $n$ gives
$$E_x(Y\circ\theta_n|\mathcal{F}_n) = P_{B_n}(T > 1, B_1 \in K)$$
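Integrating both sides over $A_n = \{T > n, B_n \in K\}$, recalling the definition of
conditional expectation, and using our assumption shows
$$E_x(Y\circ\theta_n; A_n) \ge \alpha P_x(A_n)$$
The right-hand side is smaller than $P_x(A_{n+1})$ and the desired result follows.
2.6. Since $\int_0^s h(r,B_r)\,dr \in \mathcal{F}_s$ we have
$$E_x\Big(\int_0^t h(r,B_r)\,dr\,\Big|\,\mathcal{F}_s\Big) = \int_0^s h(r,B_r)\,dr + E_x\Big(\int_s^t h(r,B_r)\,dr\,\Big|\,\mathcal{F}_s\Big)$$
To evaluate the second term we let $Y(\omega) = \int_0^{t-s} h(s + u,\omega_u)\,du$ and note
$$Y\circ\theta_s = \int_0^{t-s} h(s + u,B_{s+u})\,du$$
so the Markov property implies
$$E_x\Big(\int_s^t h(r,B_r)\,dr\,\Big|\,\mathcal{F}_s\Big) = E_{B(s)}\int_0^{t-s} h(s + u,B_u)\,du$$
2.7. Since $c_s = \int_0^s c(B_r)\,dr \in \mathcal{F}_s$ we have
$$E_x(f(B_t)\exp(c_t)|\mathcal{F}_s) = \exp(c_s)\,E_x\Big(f(B_t)\exp\Big(\int_s^t c(B_r)\,dr\Big)\,\Big|\,\mathcal{F}_s\Big)$$
To evaluate the second term we let $Y(\omega) = f(\omega_{t-s})\exp(\int_0^{t-s} c(\omega_u)\,du)$ and
note $Y\circ\theta_s = f(B_t)\exp(\int_0^{t-s} c(B_{s+u})\,du)$ so the Markov property implies
$$E_x\Big(f(B_t)\exp\Big(\int_s^t c(B_r)\,dr\Big)\,\Big|\,\mathcal{F}_s\Big) = E_{B(s)}\Big(f(B_{t-s})\exp\Big(\int_0^{t-s} c(B_u)\,du\Big)\Big)$$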
2.8. (2.8), (2.9), and the Markov property imply that with probability one
there are times $s_1 > t_1 > s_2 > t_2 \ldots$ that converge to $a$ so that $B(s_n) = B(a)$
and $B(t_n) > B(a)$. In each interval $(s_{n+1},s_n)$ there is a local maximum.
2.9. $Z \equiv \limsup_{t\downarrow 0} B(t)/f(t) \in \mathcal{F}^+_0$ so (2.7) implies $P_0(Z > c) \in \{0,1\}$ for
each $c$, which implies that $Z$ is constant almost surely.
2.10. Let $C < \infty$, $t_n \downarrow 0$, $A_N = \{B(t_n) \ge C\sqrt{t_n}$ for some $n \ge N\}$ and
$A = \cap_N A_N$. A trivial inequality and the scaling relation (1.3) imply
$$P_0(A_N) \ge P_0(B(t_N) \ge C\sqrt{t_N}) = P_0(B(1) \ge C) > 0$$
Letting $N \to \infty$ and noting $A_N \downarrow A$ we have $P_0(A) \ge P_0(B_1 \ge C) > 0$. Since
$A \in \mathcal{F}^+_0$ it follows from (2.7) that $P_0(A) = 1$, that is, $\limsup_{t\to 0} B(t)/\sqrt{t} \ge C$
with probability one. Since $C$ is arbitrary the proof is complete.
2.11. Since coordinates are independent it suffices to prove the result in one
dimension. Continuity implies that if $\delta$ is small
$$P_0\Big(\sup_{0\le s\le\delta}|B_s| < \epsilon/2\Big) = 2a > 0$$
so it follows from symmetry that
$$P_0\Big(\sup_{0\le s\le\delta}|B_s| < \epsilon/2,\ B_\delta \le 0\Big)
= P_0\Big(\sup_{0\le s\le\delta}|B_s| < \epsilon/2,\ B_\delta \ge 0\Big) = a > 0$$
Using the top result for $x \ge 0$ and the bottom one for $x \le 0$ shows that if
$x \in [-\epsilon/2,\epsilon/2]$
$$P_x\Big(\sup_{0\le s\le\delta}|B_s| < \epsilon,\ B_\delta \in [-\epsilon/2,\epsilon/2]\Big) \ge a > 0$$
Iterating and using Exercise 2.5 gives that if $x \in [-\epsilon/2,\epsilon/2]$
$$P_x\Big(\sup_{0\le s\le n\delta}|B_s| < \epsilon,\ B_{n\delta} \in [-\epsilon/2,\epsilon/2]\Big) \ge a^n > 0$$
3.1. Suppose $A = \cup_{n=1}^\infty K_n$ where $K_n$ are closed. Let $H_n = \cup_{m=1}^n K_m$. $H_n$
is closed so (3.4) implies $T_n = \inf\{t : B_t \in H_n\}$ is a stopping time. In view
of (3.2) we can complete the proof now by showing that $T_n \downarrow T_A$ as $n \uparrow \infty$.
Clearly, $T_A \le T_n$ for all $n$, so $T_A \le \lim T_n$. To see that $<$ cannot hold let $t$ be a time at
which $B_t \in A$. Then $B_t \in K_n$ for some $n$ and hence $\lim T_n \le t$. Taking the inf
over $t$ with $B_t \in A$ we have $\lim T_n \le T_A$.
3.2. If $m2^{-n} < t \le (m + 1)2^{-n}$ then $\{S_n < t\} = \{S < m2^{-n}\} \in \mathcal{F}_{m2^{-n}} \subset \mathcal{F}_t$.
3.3. Since constant times are stopping times the last three statements follow
from the first three.
$\{S \wedge T < t\} = \{S < t\} \cup \{T < t\} \in \mathcal{F}_t$.
$\{S \vee T < t\} = \{S < t\} \cap \{T < t\} \in \mathcal{F}_t$
$\{S + T < t\} = \cup_{q,r\in\mathbf{Q}:\,q+r<t}\{S < q\} \cap \{T < r\} \in \mathcal{F}_t$
3.4. Define $R_n$ by $R_1 = T_1$, $R_n = R_{n-1} \vee T_n$. Repeated use of Exercise 3.3
shows that $R_n$ is a stopping time. As $n \uparrow \infty$, $R_n \uparrow \sup_n T_n$ so the first conclusion
follows from (3.3). Define $S_n$ by $S_1 = T_1$, $S_n = S_{n-1} \wedge T_n$. Repeated use of
Exercise 3.3 shows that $S_n$ is a stopping time. As $n \uparrow \infty$, $S_n \downarrow \inf_n T_n$ so the
second conclusion follows from (3.2). The last two conclusions follow from the
first two since
$$\limsup_n T_n = \inf_n \sup_{m\ge n} T_m \quad\text{and}\quad \liminf_n T_n = \sup_n \inf_{m\ge n} T_m$$
3.5. First if $A \in \mathcal{F}_S$ then
$$A \cap \{S < t\} = \cup_n (A \cap \{S \le t - 1/n\}) \in \mathcal{F}_t$$
On the other hand if $A \cap \{S < t\} \in \mathcal{F}_t$ and the filtration is right continuous
then $A \cap \{S \le t\} = \cap_n (A \cap \{S < t + 1/n\}) \in \cap_n \mathcal{F}_{t+1/n} = \mathcal{F}_t$.
3.6. $\{R < t\} = \{S < t\} \cap A \in \mathcal{F}_t$ since $A \in \mathcal{F}_S$.
3.7. (i) Let r = sAi.
{S<t}n{S <s} = {S<r}£FrC T,
{S <t} n{S < s} = {S < r} £ Fr C T,
This shows {S < t} and {S < t} are in Ts- Taking complements and
intersections we get {S > t}, {S > t}, and {S = t} are in Ts-
(ii) {S < T}n{S < t} = Uq<i{S <q}n{T>q}eFt by (i), so {S < T} G Ts.
{S < T}H{T <t}= Uq<t{S <q}n{q<T<t}e Tt by (i), so {S < T} G TT.
Here the unions were taken over rational q. Interchanging the roles of S and
T we have {S > T} in ^n^ Taking complements and intersections we get
{S > T}, {S < T}, and {S = T} are in Ts n JFT.
3.8. If .4 G ft then
{B(5„) G ,4} n {S„ < 0 = U0<m<2-«{Sn = m/2n} n {B(m/2n) G ,4} G ^,
by (i) in Exercise 3.7. This shows {B(Sn) E A} e TSn so B(Sn) G TSn ■ Letting
n -* oo we have Bs = lirn„ B(Sn) G DnTsn = Ts-
3.9. (3.5) implies Tsat„ C Ts- To argue the other inclusion let A G Ts- Since
A = u„(A n {Tn > S})
it suffices to show that A n {Tn > S} G -?\sat„ ■ To do this we observe
A n {T„ > s} n {Tn a s < 0 = A n {r„ > s} n {S < *}
= Ug<((^ n {S < q}) n {Tn >q}ETt
since A & Ts implies An{S < q} & Tq and the fact that T„ is a stopping time
implies {Tn > q} eTq.
3.10. Since /Q5 g{B3) ds G Ts we have
^%(^)
Ts
[ 9(B,)ds
Jo
%(j g(B3)ds Ts)
Applying the strong Markov property to Y(w) = f (w' g(w3) ds we have
+ EX
EX(J g(B3)
ds
Ts
u(Bs)
3.11. Let Y3 (w) = 1 if s < t and u < u(t — s) < v, 0 otherwise. Let
y / \ _ f 1 if s < i, 2a — u < w(i — s) < 2a — u
>• 0 otherwise
Symmetry of the normal distribution implies EaY3 = EaY3, so if we let S :
inf{s < t : B3 = a} and apply the strong Markov property then on {S < oo}
EX(YS o 6s\Ts) = EaYs = EaYs = EX(YS ° 6S\TS)
Taking expected values now gives the desired result.
4.1. We begin by noting symmetry and (2.5) imply
$$P_0(R \le 1 + t) = 2\int_0^\infty p_1(0,y)\int_0^t P_y(T_0 = s)\,ds\,dy
= 2\int_0^t\int_0^\infty p_1(0,y)P_y(T_0 = s)\,dy\,ds$$
by Fubini's theorem, so the integrand gives the density $P_0(R = 1 + t)$. Since
$P_y(T_0 = t) = P_0(T_y = t)$, (4.1) gives
$$P_0(R = 1 + t) = 2\int_0^\infty \frac{1}{\sqrt{2\pi}}e^{-y^2/2}\,\frac{1}{\sqrt{2\pi t^3}}\,y\,e^{-y^2/2t}\,dy
= \frac{1}{\pi t^{3/2}}\int_0^\infty y\,e^{-y^2(1+t)/2t}\,dy = \frac{1}{\pi t^{1/2}(1 + t)}$$
E(x,r)f(Br, - Bo) = E(0,o)f(Br,_r)
Applying the strong Markov property to Y(u) = /(w(r,) — w0) at time rr, and
using the last identity we have
E{o,o)(f(Br. - BTr)\TTr) = £(Ol0)/(BT._r)
The rest of the argument is identical to that for (4.4).
4.3. As $s \to 0$, $\varphi(s) \to 1$, so for some $\delta > 0$ we have $\varphi(s) > 0$ for all $s \le \delta$.
Using the equation $\varphi(s)\varphi(t) = \varphi(s + t)$ we can now conclude by induction that
for all $n$, $\varphi(s) > 0$ for $s \le n\delta$, and we can take logarithms without feeling guilty.
The rest is easy: $\psi(2^{-n}) + \psi(2^{-n}) = \psi(2^{-(n-1)})$ and induction tells us that
$\psi(2^{-n}) = 2^{-n}\psi(1)$, while $\psi((m - 1)2^{-n}) + \psi(2^{-n}) = \psi(m2^{-n})$ and induction
implies $\psi(m2^{-n}) = m2^{-n}\psi(1)$. Since $\varphi(s)$ is continuous and does not vanish,
$\psi(s) = \log\varphi(s)$ is continuous and the desired result follows.
4.4. Let $\varphi_\lambda(a) = E_0(\exp(-\lambda T_a))$. (4.4) and (4.5) imply
$$\varphi_\lambda(a)\varphi_\lambda(b) = \varphi_\lambda(a + b) \qquad \varphi_\lambda(a) = \varphi_{\lambda a^2}(1)$$
As before the first equation implies $\varphi_\lambda(a) = \exp(c_\lambda a)$ while the second with
$\lambda = 1$ and $a = \sqrt{b}$ gives $c_b = \sqrt{b}\,c_1$ and this implies $c_\lambda = -\kappa\sqrt{\lambda}$.
4.5. Let $M_1 = \max_{0\le s\le 1} B_s$. Exercise 3.10 implies that with probability 1,
$M_1 > B_1$ and hence $a \to T_a$ is discontinuous at $a = M_1$. This shows $P_0(a \to T_a$
is discontinuous in $[0,n]) \to 1$ as $n \to \infty$ and the result follows from the hint.
4.6. Let Mx = maxo<,<i B,2 and T = inf{i > 1 : 52 > Mi}. Since
B\ < Mi the strong Markov property and (4.3) imply that with probability
one B% £ B\[x so s —» C, is discontinuous at s = Mi. This shows P0(s —* C,
is discontinuous in [0,n])-^lasn->oo and the result follows from the hint.
4.7. If we let f(x) = /(£), then it follows from the strong Markov property
and symmetry of Brownian motion that
Ex{f(Bt);r<t) = Ex[E0f(Bt-T);T <t]
= Ex[E0f(Bt-r);r <t]
= Ex[f(Bt);r<t]=Ex(f(Bt))
The last equality holds since f(y) = 0 when yd > 0, and the desired formula
follows as in (4.8).
Chapter 2
1.1. If $A \in \mathcal{F}_a$ and $a < b$ then $H_n(s,\omega) = 1_{[a+1/n,\,b+1/n)}(s)1_A(\omega)$ is optional and
converges to $1_{(a,b]}(s)1_A(\omega)$ as $n \to \infty$.
2.1. The martingale property w.r.t. a nitration Qt is that
E(Xis;A) = E(Xf;A)
for all A € Qs. (3.5) in Chapter 1 implies that Tsi\> C T, so if X? is a
martingale with respect to Ts it is a martingale with respect to ^"saj- To argue
the other direction let A £ Ts. We claim that A n {S > s} £ Tsas- To prove
this we have to check that for all r
(A n {S > s}) n {S A s < r} G Tr
but this is 0 for r < s and true for r > s. If X/ is a martingale with respect to
•Fsaj it follows that if A £ T, then
E(Xf;A !1{S> s}) = E(Xf; An{S> s})
On the other hand
E(X?;An{S<s}) = E(Xsl(s>0y,An{S < s}) = E(Xf;An{S < s})
Adding the last two equations gives the desired conclusion.
2.2. To check integrability we note that
$$E_x|X_t|^p = \int (2\pi t)^{-1}e^{-|y-x|^2/2t}\,|\log|y||^p\,dy < \infty$$
since $|\log|y||^p$ is integrable near 0 and $e^{-|y-x|^2/2t}$ takes care of the behavior for
large $y$. To show that $E_xX_t \to \infty$ we observe that for any $R < \infty$
$$E_xX_t \ge (2\pi t)^{-d/2}\int_{|y|\le 1}\log|y|\,dy + (\log R)P_x(|B_t| > R)$$
The integral is convergent and as we showed in Example 2.1, $P_x(|B_t| > R) \to 1$
as $t \to \infty$ so the desired result follows from the fact that $R$ is arbitrary.
2.3. Let Tn = inf{i : \Xt\ > n}. By (2.3) this sequence will reduce Xt. Note
that Xt " <n. Jensen's inequality for conditional expectation (see e.g., (1.1.d)
in Chapter 4 of Durrett (1995)) implies
E(V{X?")\rTnAt) > ¥>(£(X,T»|*TnA,)) = <P(XJ")
2.4. Let T„ < n be a sequence which reduces X, and Sn = (Tn — R)+. If s < t
then definitions involved (note S„ = Tn — R on {Sn > 0} = {Tn > R}), the
optional stopping theorem, and the fact 1(t„>r) G Fr C ^a+j imply
■E'(^ASnl(Sn>0)|^) = ■E'(X(ii+i)ATnl(Tn>ii)I^R+i)
= l(Tn>ii)-E'(X"(ii+i)ATnl-^ii+i)
= X(ii+i)ATnl(Tn>H) = y5ASnl(Sn>0)
2.5. Let $T_n$ be a sequence that reduces $X$. The martingale property shows that
if $A \in \mathcal{F}_0$ then
$$E(X_{t\wedge T_n}1_{(T_n>0)}; A) = E(X_01_{(T_n>0)}; A)$$
Letting $n \to \infty$ and using Fatou's lemma and the dominated convergence
theorem gives
$$E(X_t; A) \le \liminf_{n\to\infty} E(X_{t\wedge T_n}1_{(T_n>0)}; A)
\le \lim_{n\to\infty} E(X_01_{(T_n>0)}; A) = E(X_0; A)$$
Taking $A = \Omega$ we see that $EX_t \le EX_0 < \infty$. Since this holds for all $A \in \mathcal{F}_0$ we
get $E(X_t|\mathcal{F}_0) \le X_0$. To replace 0 by $s$ apply the last conclusion to $Y_t = X_{s+t}$,
which by Exercise 2.4 is a local martingale.
3.1. Since t —* Vt is increasing, we only have to rule out jump discontinuities.
Suppose there is a t and an e > 0 so that s < t < u then V(u) - V(s) > 3e for
all n. It is easy to see that V(u) — V(s) is the variation of X over [s,u]. Let
s0 <t < u0. We will now derive a contradiction that the variation over [so,«o]
is infinite. Pick 5 > 0 so that if \r —1\ < 5 then \Xt — Xr\ < e. Assuming sn and
un have been defined, pick a partition of [s„, un] not containing t with mesh < S
and variation > It. Let sn+i be the largest point in the partition < t and «n+i
be the smallest point in the partition > t. Our construction shows that the
variation of X over [sn, un] — [sn+i,tn+i] is always > e, so the total variation
over [so,«o] is infinite, a contradiction which implies t —* Vt is continuous.
3.2. $(X + Y)_tZ_t - \langle X,Z\rangle_t - \langle Y,Z\rangle_t$ is a local martingale.
3.3. $-X_0Z_t$ is a local martingale, so if $Y_t = -X_0$ then $\langle Y,Z\rangle_t = 0$ and the
desired result follows from Exercise 3.2.
3.4. $(aX_t)(bY_t) - ab\langle X,Y\rangle_t = ab(X_tY_t - \langle X,Y\rangle_t)$ is a local martingale.
3.5. If $T$ is such that $X^T$ is a bounded martingale it follows from (3.7) and
the definition of $\langle X\rangle$ that $\langle X^T\rangle = \langle X\rangle^T$. For a general stopping time $T$, let
$T_n = \inf\{t : |X_t| > n\}$, note that the last result implies $\langle X^{T\wedge T_n}\rangle = \langle X\rangle^{T\wedge T_n}$
and then let $n \to \infty$. The result for the covariance process follows immediately
from the definition and the result for the variance process.
3.6. Since $X_t^2 - \langle X\rangle_t$ is a martingale and $|X_t| \le M$ for all $t$
$$E\langle X\rangle_t = EX_t^2 - EX_0^2 \le M^2$$
so letting $t \to \infty$ we have $E\langle X\rangle_\infty \le M^2$. This shows that $X_t^2 - \langle X\rangle_t$ is
dominated by an integrable random variable and hence is uniformly integrable.
3.7. The optional stopping theorem implies
E(Xs+t\Fs+3) = Xs+s
and Xs eTsC Fs+, so
E(Xs+t — XslFs+s) = Xs+s — Xs
i.e., Y, is a martingale. To prepare for the proof of the second result we note
that imitating the proof of (2.4)
E((Xs+i -Xs)2\Fs+3) = E((Xs+i-Xs+3)2\Fs+,)+(Xs+3 -X,)2
= E(X2s+t - X2S+S 1^5+,) + (XS+, - ^)2
Let Zt = (Xs+t — Xs)2 — {(X)s+t — (X)s)- Using the last equality we get
E(Zt\fs+3) = E(X2s+i - Xj+S - {(X)s+t - (X)s}\Fs+,) + (Xs+3 - X,)2
= E(X2s+t - (X)s+i\fs+3) - X2S+, + (X)s + (Xs+3 - X,)2
= -(X)s+3 + (X)s + (Xs+3 - X,)2 = Z,
where the last equality follows from the optional stopping theorem and Exercise
3.6. This shows that Zt is a martingale, so (Y)t = (X)s+t — (X)s-
3.8. By stopping at $T_n = \inf\{t : |X_t| > n\}$ and using Exercise 3.5 it suffices to
prove the result when $X$ is a bounded martingale. Using Exercise 3.7, we can
suppose without loss of generality that $S = 0$. Using the $L^2$ maximal inequality
on the martingale $X_{t\wedge T}$, and the optional stopping theorem on the martingale
$X_t^2 - \langle X\rangle_t$ it follows that
$$E\Big(\sup_{t \le n} X^2_{t\wedge T}\Big) \le 4E(X^2_{T\wedge n}) = 4E\langle X\rangle_{T\wedge n} = 0$$
Letting $n \to \infty$ now gives the desired conclusion.
3.9. This is an immediate consequence of (3.8).
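4.1. It is simply a matter of patiently checking all the cases.
1. $s < t \le a$. $Y_t = Y_0 \in \mathcal{F}_0$ so
$$E(Y_t \mid \mathcal{F}_s) = E(Y_0 \mid \mathcal{F}_s) = Y_0 = Y_s$$
2. $s < a < t \le b$.
$$E(Y_t \mid \mathcal{F}_s) = E(E(Y_t \mid \mathcal{F}_a) \mid \mathcal{F}_s) = E(Y_a \mid \mathcal{F}_s) = E(Y_0 \mid \mathcal{F}_s) = Y_0 = Y_s$$
3. $s < b$, $t > b$. $Y_t = Y_b$ so this reduces to case 2 if $s < a$ or to our assumption
if $a \le s < b$.
4. $b \le s < t$. $E(Y_t \mid \mathcal{F}_s) = E(Y_b \mid \mathcal{F}_s) = Y_b = Y_s$.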
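4.2. If we fix $\omega$ then $t \to \langle X\rangle_t$ defines a measure. The triangle inequality for
$L^2$ of that measure implies
$$\Big(\int (H_s + K_s)^2\,d\langle X\rangle_s\Big)^{1/2} \le \Big(\int H_s^2\,d\langle X\rangle_s\Big)^{1/2} + \Big(\int K_s^2\,d\langle X\rangle_s\Big)^{1/2}$$
Now take expected value.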
322 Solutions to Exercises
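4.3. $X \in \mathcal{M}^2$ implies $E\sup_t X_t^2 < \infty$ and hence $EX_0^2 < \infty$. To check the other
condition let $T_n$ be a sequence of stopping times $\uparrow \infty$ so that $X^{T_n}$ is bounded
and $\langle X\rangle_{T_n} \le n$. The optional stopping theorem implies
$$E\big((X^2_{T_n} - \langle X\rangle_{T_n})\,1_{(T_n > 0)}\big) = E\big(X_0^2\,1_{(T_n > 0)}\big)$$
Letting $n \to \infty$ it follows that $E(X^2_\infty - \langle X\rangle_\infty) = EX_0^2$. To prove the converse
rearrange the displayed equation to conclude
$$E\big(X^2_{T_n} 1_{(T_n > 0)}\big) \le EX_0^2 + E\langle X\rangle_\infty$$
so $X$ is $L^2$ bounded.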
4.4. The triangle inequality implies $\|x\| \le \|x_n\| + \|x - x_n\|$ so
$$\|x\| \le \liminf \|x_n\|$$
For the other inequality note that $\|x_n\| \le \|x\| + \|x_n - x\|$ so
$$\|x\| \ge \limsup \|x_n\|$$
4.5. Let $H^n \in b\Pi_1$ with $\|H^n - H\|_X \to 0$. Since $\|(H^n \cdot X) - (H \cdot X)\|_2 \to 0$,
using Exercise 4.4, the isometry property for $b\Pi_1$, and Exercise 4.4 again
$$\|H \cdot X\|_2 = \lim \|H^n \cdot X\|_2 = \lim \|H^n\|_X = \|H\|_X$$
5.1. If $L, L' \in \mathcal{M}^2$ have the desired properties then taking $N = L - L'$ we have
$\langle L, L - L'\rangle = \langle L', L - L'\rangle$ so $\langle L - L'\rangle = 0$. Using Exercise 3.8 now it follows
that $L - L'$ is constant and hence must be $\equiv 0$.
5.2. If we let $H_s = 1$ and $K_s = 1_{(s \le T)}$ then $X = H \cdot X$ and $Y^T = K \cdot Y$. (It is
for this reason we have assumed $X_0 = Y_0 = 0$.) Using (5.4) now it follows that
$$\langle X, Y^T\rangle_t = \int_0^t 1_{(s \le T)}\,d\langle X,Y\rangle_s = \langle X,Y\rangle^T_t$$
5.3. If we let $H_s = 1_{(s \le T)}$ and $K_s = 1_{(s > T)}$ then $X^T = H \cdot X$ and $Y - Y^T =
K \cdot Y$. Using (5.4) now it follows that $\langle X^T, Y - Y^T\rangle_t = 0$ so the desired result
follows from (3.11).
6.1. Write $H_s = H_s^1 + H_s^2$ and $K_s = K_s^1 + K_s^2$ where
$$H_s^1 = K_s^1 = H_s 1_{(S \le s \le T)} = K_s 1_{(S \le s \le T)}$$
Clearly, $(H^1 \cdot X)_t = (K^1 \cdot X)_t$ for all $t$. Since $H_s^2 = K_s^2 = 0$ for $S \le s \le T$,
(5.4) implies
$$\langle H^2 \cdot X\rangle_s - \langle H^2 \cdot X\rangle_S = \langle K^2 \cdot X\rangle_s - \langle K^2 \cdot X\rangle_S = 0 \quad\text{for } S \le s \le T$$
and it follows from Exercise 3.8 that $(H^2 \cdot X)_t$ and $(K^2 \cdot X)_t$ are constant on
$[S,T]$. Combining this with the first result and using (4.3.b) gives the desired
conclusion.
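6.2. Stop $X$ at $T_n = \inf\{t : |X_t| > n \text{ or } \int_0^t H_s^2\,d\langle X\rangle_s > n\}$ to get a bounded
martingale and an integrand in $\Pi_2(X)$. Exercise 4.5 implies
$$\|H^{T_n} \cdot X^{T_n}\|_2^2 = E\int_0^{T_n} H_s^2\,d\langle X\rangle_s$$
Using the $L^2$ maximal inequality it follows that
$$E\Big(\sup_{t \le T_n} (H \cdot X)_t^2\Big) \le 4E\int_0^{T_n} H_s^2\,d\langle X\rangle_s$$
Letting $n \to \infty$ now we can conclude
$$E\Big(\sup_t (H \cdot X)_t^2\Big) < \infty$$
With this in hand we can start with
$$E(H \cdot X)^2_{T_n} = E\int_0^{T_n} H_s^2\,d\langle X\rangle_s$$
and let $n \to \infty$ to get the desired result.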
6.3. By stopping we can reduce to the case in which $X$ is a bounded continuous
martingale and $\langle X\rangle_t \le N$ for all $t$, which implies $H \in \Pi_2(X)$. If we replace $S$
and $T$ by $S_n$ and $T_n$ which stop at the next dyadic rational and let $H^n_s = C$
for $S_n < s \le T_n$ then $H^n \in \Pi_1$ and it follows easily from the definition of the
integral in Step 2 in Section 2.4 that
$$\int H^n_s\,dX_s = C(X_{T_n} - X_{S_n})$$
To complete the proof now we observe that if $|C(\omega)| \le K$ then
$$\|H^n - H\|_X^2 \le K^2\{E(\langle X\rangle_{T_n} - \langle X\rangle_T) + E(\langle X\rangle_{S_n} - \langle X\rangle_S)\} \to 0$$
as $n \to \infty$ by the bounded convergence theorem. Using Exercise 4.4 and (4.3.b)
it follows that $\|(H^n \cdot X) - (H \cdot X)\|_2 \to 0$ and hence $\int H^n_s\,dX_s \to \int H_s\,dX_s$.
Clearly, $C(X_{T_n} - X_{S_n}) \to C(X_T - X_S)$ and the desired result follows.
6.4. $$X_t^2 - X_0^2 = \sum_i X^2(t^n_{i+1}) - X^2(t^n_i)$$
$$= \sum_i \{X(t^n_{i+1}) - X(t^n_i)\}^2 + \sum_i 2X(t^n_i)\{X(t^n_{i+1}) - X(t^n_i)\}$$
(3.8) implies that the first term converges in probability to $\langle X\rangle_t$. The previous
exercise shows the second term converges in probability to $\int_0^t 2X_s\,dX_s$.
6.5. The difference between evaluating at the right and the left end points is
$$2\sum_i \{X(t^n_{i+1}) - X(t^n_i)\}^2 \to 2\langle X\rangle_t$$
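The decomposition in 6.4 is a purely algebraic identity for any finite sequence of values, and the right/left end-point difference in 6.5 is exactly twice the sum of squared increments. A minimal Python sketch, assuming an arbitrary integer-valued path `xs` (a hypothetical stand-in for the values $X(t^n_i)$, here a simple random walk), checks both identities exactly:

```python
import random

def decompose(xs):
    """Return (sum of squared increments, left-endpoint sum, right-endpoint sum)."""
    incs = [b - a for a, b in zip(xs, xs[1:])]
    sq_sum = sum(d * d for d in incs)                         # sum {X(t_{i+1}) - X(t_i)}^2
    left_sum = sum(2 * a * d for a, d in zip(xs, incs))       # sum 2 X(t_i) {increment}
    right_sum = sum(2 * b * d for b, d in zip(xs[1:], incs))  # sum 2 X(t_{i+1}) {increment}
    return sq_sum, left_sum, right_sum

random.seed(0)
xs = [0]
for _ in range(1000):
    xs.append(xs[-1] + random.choice((-1, 1)))

sq_sum, left_sum, right_sum = decompose(xs)
assert xs[-1] ** 2 - xs[0] ** 2 == sq_sum + left_sum   # identity in 6.4
assert right_sum - left_sum == 2 * sq_sum              # difference in 6.5
```

The identities hold path by path; only the limits of the two sums require the probabilistic results cited in the solutions.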
6.6. Let $C_{k,n} = B((k + 1/2)2^{-n}t) - B(k2^{-n}t)$ and $D_{k,n} = B((k + 1)2^{-n}t) -
B((k + 1/2)2^{-n}t)$. In view of (6.7) it suffices to show that as $n \to \infty$
$$S_n \equiv \sum_k C^2_{k,n} + C_{k,n}D_{k,n} \to t/2$$
in probability. $EC^2_{k,n} = 2^{-n}t/2$ and $EC_{k,n}D_{k,n} = 0$ so $ES_n = t/2$. To complete
the proof now we note that the terms in the sum are independent so
$$E(S_n - t/2)^2 = E\Big(\sum_k C^2_{k,n} - 2^{-n}t/2 + C_{k,n}D_{k,n}\Big)^2$$
$$= \sum_k E(C^2_{k,n} - 2^{-n}t/2)^2 + E\,C^2_{k,n}D^2_{k,n} \le 2^n\,C t^2 2^{-2n} \to 0$$
and then use Chebyshev's inequality.
6.7. (6.7) implies that if $t^n_i$ is a sequence of partitions of $[0,t]$ with mesh $\to 0$
as $n \to \infty$ we have
$$\sum_i h_{t^n_i}(B_{t^n_{i+1}} - B_{t^n_i}) \to \int_0^t h_s\,dB_s$$
in probability. The left-hand side has a normal distribution with mean 0 and
variance
$$\sum_i h^2_{t^n_i}(t^n_{i+1} - t^n_i) \to \int_0^t h_s^2\,ds$$
as $n \to \infty$ by the convergence of the Riemann approximating sums for the
integral. Convergence of the means and variances for a sequence of normal
random variables is sufficient for them to converge in distribution (consider the
characteristic functions) and the desired result follows.
7.1. $\sup_t |((K^n - K) \cdot A)_t| \le M \sup_s |K^n_s - K_s| \to 0$
7.2. By stopping it suffices to prove the result when $|A|_t \le M$. If we let
$G^\delta_s = f'(c(A_{t_i}, A_{t_{i+1}}))$ then as in (7.4) we get
$$f(A_t) - f(A_0) = \int_0^t G^\delta_s\,dA_s$$
As $\delta \to 0$, $G^\delta_s \to f'(A_s)$ uniformly on $[0,t]$ so the desired conclusion follows from
Exercise 7.1.
8.1. Clearly $M_t + M'_t$ is a continuous local martingale and $A_t + A'_t$ is continuous,
adapted, locally of bounded variation and has $A_0 + A'_0 = 0$.
Chapter 3
1.1. The optional stopping theorem implies that
$$x = E_x B_T = aP_x(B_T = a) + b(1 - P_x(B_T = a))$$
Solving gives $P_x(B_T = a) = (b - x)/(b - a)$.
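The same linear formula appears for simple random walk, which gives a way to check the answer by exact arithmetic. The sketch below is an illustration only: the function name `solve_exit_prob` and the integer grid $0, 1, \ldots, n$ (playing the roles of $a = 0$ and $b = n$) are assumptions, not part of the text. It solves the discrete harmonic equation $h(i) = (h(i-1) + h(i+1))/2$ with $h(0) = 1$, $h(n) = 0$ by tridiagonal elimination.

```python
from fractions import Fraction

def solve_exit_prob(n):
    """Return [h(0), ..., h(n)], h(i) = P_i(walk hits 0 before n), for n >= 2."""
    m = n - 1                          # number of interior sites
    rhs = [Fraction(0)] * m
    rhs[0] = Fraction(1)               # boundary value h(0) = 1 moved to the right side
    diag = [Fraction(2)] * m
    # forward elimination for -h(i-1) + 2 h(i) - h(i+1) = 0
    for i in range(1, m):
        w = Fraction(-1) / diag[i - 1]
        diag[i] -= w * Fraction(-1)
        rhs[i] -= w * rhs[i - 1]
    # back substitution
    h = [Fraction(0)] * m
    h[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        h[i] = (rhs[i] + h[i + 1]) / diag[i]
    return [Fraction(1)] + h + [Fraction(0)]

n = 10
h = solve_exit_prob(n)
# with a = 0, b = n, x = i the formula (b - x)/(b - a) reads (n - i)/n
assert all(h[i] == Fraction(n - i, n) for i in range(n + 1))
```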
1.2. From (1.6) it follows that for any $s$ and $n$, $P(\sup_{t \ge s} B_t > n) = 1$. This
implies $P(\sup_{t \ge s} B_t = \infty) = 1$ for all $s$ and hence $\limsup_{t\to\infty} B_t = \infty$ a.s.
1.3. If $S_r(\omega) \uparrow t(\omega)$ which is $< \infty$ with positive probability then $B_{t(\omega)}(\omega) = 0$
and we have a contradiction.
3.1. Using the optional stopping theorem at time $T \wedge t$ we have $E_0 B^2_{T\wedge t} =
E_0(T \wedge t)$. Letting $t \to \infty$ and using the bounded and monotone convergence
theorems with (1.4) we have
$$E_0 T = E_0 B_T^2 = a^2 \cdot \frac{b}{b + a} + b^2 \cdot \frac{a}{b + a} = ab$$
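The arithmetic in the last display can be checked exactly, together with its random walk analogue. The sketch below assumes, as the weights in the display suggest, that $T$ is the exit time from $(-a, b)$; the helper names and the integer choices of $a$ and $b$ are hypothetical. For simple random walk, $g(i) = (i + a)(b - i)$ solves $g(i) = 1 + (g(i-1) + g(i+1))/2$ with zero boundary values, so $g(0) = ab$.

```python
from fractions import Fraction

def second_moment_at_exit(a, b):
    """E_0 B_T^2 when exiting (-a, b): mass b/(a+b) at -a and a/(a+b) at b."""
    p_lower = Fraction(b, a + b)   # P_0(B_T = -a)
    p_upper = Fraction(a, a + b)   # P_0(B_T = b)
    return a * a * p_lower + b * b * p_upper

def walk_exit_time(a, b):
    """Candidate mean exit time g(i) = (i + a)(b - i) on the sites -a..b."""
    return {i: (i + a) * (b - i) for i in range(-a, b + 1)}

for a, b in [(1, 1), (2, 5), (3, 7)]:
    assert second_moment_at_exit(a, b) == a * b
    g = walk_exit_time(a, b)
    assert g[-a] == 0 and g[b] == 0
    # discrete analogue of (1/2) g'' = -1
    assert all(2 * g[i] == 2 + g[i - 1] + g[i + 1] for i in range(-a + 1, b))
    assert g[0] == a * b
```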
3.2. Let $f(x,t) = x^6 - ax^4t + bx^2t^2 - ct^3$. Differentiating gives
$$D_t f = -ax^4 + 2bx^2t - 3ct^2$$
$$(1/2)D_{xx} f = 15x^4 - 6ax^2t + bt^2$$
Setting $a = 15$, $2b = 6a$, and $b = 3c$, that is $a = 15$, $b = 45$, $c = 15$, we have
$D_t f + (1/2)D_{xx} f = 0$, so $f(B_t,t)$ is a local martingale. Using the optional
stopping theorem at time $\tau_a \wedge t$ we have
$$E_0\{B^6_{\tau_a\wedge t} - 15B^4_{\tau_a\wedge t}(\tau_a \wedge t) + 45B^2_{\tau_a\wedge t}(\tau_a \wedge t)^2\} = 15E_0(\tau_a \wedge t)^3$$
Letting $t \to \infty$ and using the bounded and monotone convergence theorems
with the results in (3.3) we have
$$15E_0\tau_a^3 = a^6 - 15a^4E\tau_a + 45a^2E\tau_a^2 = a^6(1 - 15 + 75)$$
so $E_0\tau_a^3 = 61a^6/15$.
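The coefficient matching in 3.2 can be verified mechanically. The following sketch stores a polynomial in $(x,t)$ as a dictionary of coefficients (a representation chosen here purely for illustration) and checks that $D_t f + (1/2)D_{xx}f$ vanishes identically for $a = 15$, $b = 45$, $c = 15$. The moment computation assumes $E\tau_a = a^2$ and $E\tau_a^2 = 5a^4/3$, which are the values consistent with the factor $75 = 45 \cdot 5/3$ in the last display.

```python
from fractions import Fraction

def make_f(a, b, c):
    """x^6 - a x^4 t + b x^2 t^2 - c t^3 as {(power of x, power of t): coefficient}."""
    return {(6, 0): 1, (4, 1): -a, (2, 2): b, (0, 3): -c}

def d_t(poly):
    return {(i, j - 1): j * k for (i, j), k in poly.items() if j > 0}

def d_xx(poly):
    return {(i - 2, j): i * (i - 1) * k for (i, j), k in poly.items() if i > 1}

def heat_operator(poly):
    """Nonzero coefficients of D_t f + (1/2) D_xx f."""
    out = {}
    for key, k in d_t(poly).items():
        out[key] = out.get(key, 0) + Fraction(k)
    for key, k in d_xx(poly).items():
        out[key] = out.get(key, 0) + Fraction(k, 2)
    return {key: k for key, k in out.items() if k != 0}

def third_moment(level):
    """E tau^3 from 15 E tau^3 = a^6 - 15 a^4 E tau + 45 a^2 E tau^2."""
    a = Fraction(level)
    e1 = a ** 2                    # assumed value of E tau
    e2 = Fraction(5, 3) * a ** 4   # assumed value of E tau^2
    return (a ** 6 - 15 * a ** 4 * e1 + 45 * a ** 2 * e2) / 15

assert heat_operator(make_f(15, 45, 15)) == {}
assert third_moment(1) == Fraction(61, 15)
```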
3.3. Using the optional stopping theorem at time $T_{-a} \wedge n$ we have
$$1 = E_0\exp(-(2\mu/\sigma^2)Z_{T_{-a}\wedge n})$$
Letting $n \to \infty$ and noting that on $\{T_{-a} = \infty\}$ we have $Z_t/t = \sigma B_t/t + \mu \to \mu$
almost surely and hence $Z_t \to \infty$ a.s., the desired result follows.
4.1. Let $\theta : [0,\infty) \to \mathbf{R}^d$ be piecewise constant and have $|\theta_s| = 1$ for all $s$. Let
$Y_t = \sum_i \int_0^t \theta^i_s\,dX^i_s$. The formula for the covariance of stochastic integrals shows
$Y_t$ is a local martingale with $\langle Y\rangle_t = t$, so (4.1) implies that $Y_t$ is a Brownian
motion. Letting $0 = t_0 < t_1 < \ldots < t_n$ and taking $\theta_s = v_j$ for $s \in (t_{j-1}, t_j]$ for
$1 \le j \le n$ shows that $X_{t_1} - X_{t_0}, \ldots, X_{t_n} - X_{t_{n-1}}$ are independent multivariate
normals with mean 0 and covariance matrices $(t_1 - t_0)I, \ldots, (t_n - t_{n-1})I$, and
it follows that $X_t$ is a $d$-dimensional Brownian motion.
4.2. $X_s$ is a local martingale with $\langle X\rangle_s = \int_0^s h_r^2\,dr$. By modifying $h_s$ after
time $t$ we can suppose without loss of generality that $\langle X\rangle_\infty = \infty$. Using (4.4)
now we see that if $\gamma(u) = \inf\{s : \langle X\rangle_s > u\}$ then $X_{\gamma(u)}$ is a Brownian motion.
Since $\langle X\rangle_s$ and hence the time change $\gamma(u)$ are deterministic, the desired result
follows.
Chapter 4
4.1. Let $\tau = \inf\{t > 0 : B_t \notin G\}$, let $y \in \partial G$, and let $T_y = \inf\{t > 0 : B_t = y\}$.
(2.9) in Chapter 1 implies $P_y(T_y = 0) = 1$. Since $\tau \le T_y$ it follows that $y$ is
regular.
4.2. $P_y(\tau = 0) < 1$ and Blumenthal's 0-1 law imply $P_y(\tau = 0) = 0$. Since we
are in $d \ge 2$, it follows that $P_y(B_\tau = y) = 0$ and hence $\epsilon = 1 - E_y f(B_\tau) > 0$.
Let $\sigma_n = \inf\{t > 0 : B_t \notin D(y, 1/n)\}$. $\sigma_n \downarrow 0$ as $n \uparrow \infty$ so $P_y(\sigma_n < \tau) \to 1$.
$$1 - \epsilon = E_y f(B_\tau) = E_y(f(B_\tau); \tau \le \sigma_n) + E_y(v(B_{\sigma_n}); \tau > \sigma_n)$$
As $n \to \infty$ the first term on the right $\to 0$, so
$$E_y(v(B_{\sigma_n}); \tau > \sigma_n) \to 1 - \epsilon$$
and it follows that for large $n$, $\inf_{x \in \partial D(y,1/n)} v(x) \le 1 - \epsilon$.
4.3. Let $y \in \partial G$ and consider $U = V(y, \nabla g(y), 1)$. Calculus gives us
$$g(z) - g(y) = \int_0^1 \nabla g(y + \theta(z - y)) \cdot (z - y)\,d\theta$$
Continuity of $\nabla g$ implies that if $|z - y| < r$ and $z \in U$ then $\nabla g(y + \theta(z -
y)) \cdot (z - y) > 0$ so $g(z) > 0$. This shows that for small $r$ the truncated cone
$U \cap D(y, r) \subset G^c$ so the regularity of $y$ follows from (4.5c).
4.4. Let $\tau = \inf\{t > 0 : B_t \in V(y,v,a)\}$. By translation and rotation we can
suppose $y = 0$, $v = (1,0,\ldots,0)$, and the $d-1$ dimensional hyperplane is $z_d = 0$.
Let $T_0 = \inf\{t > 0 : B^d_t = 0\}$. (2.9) in Chapter 1 implies that $P_0(T_0 = 0) = 1$.
If $a > 0$ then for some $k < \infty$ the hyperplane can be covered by $k$ rotations of
$V$ so $P_0(\tau = 0) \ge 1/k$ and it follows from the 0-1 law that $P_0(\tau = 0) = 1$.
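7.1. (a) Ignoring $C_d$ and differentiating gives
$$D_{x_i} h_\theta = -d \cdot \frac{(x_i - \theta_i)\,y}{(|x-\theta|^2 + y^2)^{(d+2)/2}}$$
$$D_{x_i x_i} h_\theta = \frac{-d \cdot y}{(|x-\theta|^2 + y^2)^{(d+2)/2}} + \frac{d(d+2)(x_i - \theta_i)^2 y}{(|x-\theta|^2 + y^2)^{(d+4)/2}}$$
$$D_y h_\theta = \frac{1}{(|x-\theta|^2 + y^2)^{d/2}} - d \cdot \frac{y^2}{(|x-\theta|^2 + y^2)^{(d+2)/2}}$$
$$D_{yy} h_\theta = \frac{-d \cdot y - 2d \cdot y}{(|x-\theta|^2 + y^2)^{(d+2)/2}} + \frac{d(d+2)y^3}{(|x-\theta|^2 + y^2)^{(d+4)/2}}$$
Adding up we see that
$$\sum_{i=1}^{d-1} D_{x_i x_i} h_\theta + D_{yy} h_\theta = \frac{\{(-d)(d-1) + (-d) \cdot 3\}\,y}{(|x-\theta|^2 + y^2)^{(d+2)/2}} + \frac{d(d+2)\,y\,(|x-\theta|^2 + y^2)}{(|x-\theta|^2 + y^2)^{(d+4)/2}} = 0$$
The fact that $\Delta u(x) = 0$ follows from (1.7) as in the proof of (7.1).
(b) Clearly $\int d\theta\, h_\theta(x, y)$ is independent of $x$. To show that it is independent of
$y$, let $x = 0$ and change variables $\theta_i = y\varphi_i$ for $1 \le i \le d - 1$ to get
$$\int d\theta\, h_\theta(0, y) = \int d\varphi\, y^{d-1} h_{y\varphi}(0, y) = \int d\varphi\, h_\varphi(0, 1) = 1$$
(c) Changing variables $\theta_i = x_i + r_i y$ and using dominated convergence
$$\int_{D(x,\epsilon)^c} d\theta\, h_\theta(x, y) = \int_{D(0,\epsilon/y)^c} dr\, h_r(0, 1) \to 0$$
(d) Since $P_x(\tau < \infty) = 1$ for all $x \in H$, this follows from (4.3).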
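The cancellation in part (a) can be sanity-checked numerically. The sketch below is an illustration under stated assumptions: it takes the kernel to be $y/(|x-\theta|^2 + y^2)^{d/2}$ with the constant $C_d$ ignored, and approximates the Laplacian by central differences at an arbitrarily chosen point. The function names and the step size are hypothetical choices, and a small finite-difference residual is evidence for, not a proof of, harmonicity.

```python
def h(x, y, theta, d):
    """Half-space kernel y / (|x - theta|^2 + y^2)^(d/2), constant C_d ignored."""
    r2 = sum((xi - ti) ** 2 for xi, ti in zip(x, theta)) + y * y
    return y / r2 ** (d / 2)

def laplacian(x, y, theta, d, step=1e-4):
    """Central-difference approximation of sum_i D_{x_i x_i} h + D_{yy} h."""
    centre = h(x, y, theta, d)
    total = (h(x, y + step, theta, d) - 2 * centre + h(x, y - step, theta, d)) / step ** 2
    for i in range(d - 1):
        up = list(x)
        up[i] += step
        dn = list(x)
        dn[i] -= step
        total += (h(up, y, theta, d) - 2 * centre + h(dn, y, theta, d)) / step ** 2
    return total

# d = 3: x and theta live in R^2, y > 0
assert abs(laplacian([0.3, -0.2], 0.7, [0.0, 0.0], 3)) < 1e-4
# d = 2: the classical half-plane kernel
assert abs(laplacian([0.5], 1.0, [0.0], 2)) < 1e-4
```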
9.1. $c(x) \equiv -\beta < 0$ so $w(x) = E_x e^{-\beta\tau} \le 1$ and it follows from (6.3) that
$v(x) = E_x e^{-\beta\tau}$ is the unique solution of
$$\tfrac{1}{2}u'' - \beta u = 0 \qquad u(-a) = u(a) = 1$$
Guessing $u(x) = B\cosh(bx)$ with $b > 0$ we find
$$\tfrac{1}{2}u'' - \beta u = \Big(\frac{b^2 B}{2} - \beta B\Big)\cosh(bx) = 0$$
if $b = \sqrt{2\beta}$. Then we take $B = 1/\cosh(a\sqrt{2\beta})$ to satisfy the boundary condition.
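As a quick numerical check of the guess in 9.1, the following sketch evaluates the candidate $v(x) = \cosh(x\sqrt{2\beta})/\cosh(a\sqrt{2\beta})$ and a central-difference approximation of $\tfrac{1}{2}v'' - \beta v$. The function names, the step size, and the sample values of $a$ and $\beta$ are illustrative assumptions.

```python
import math

def v(x, a, beta):
    """Candidate solution cosh(x sqrt(2 beta)) / cosh(a sqrt(2 beta))."""
    b = math.sqrt(2 * beta)
    return math.cosh(b * x) / math.cosh(b * a)

def residual(x, a, beta, step=1e-4):
    """Central-difference value of (1/2) v'' - beta v at x."""
    second = (v(x + step, a, beta) - 2 * v(x, a, beta) + v(x - step, a, beta)) / step ** 2
    return 0.5 * second - beta * v(x, a, beta)

a, beta = 1.5, 2.0
assert abs(v(a, a, beta) - 1.0) < 1e-12      # boundary condition at a
assert abs(v(-a, a, beta) - 1.0) < 1e-12     # boundary condition at -a
assert abs(residual(-0.4, a, beta)) < 1e-5   # ODE holds in the interior
```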
Chapter 5
3.1. We first prove that $h$ is Lipschitz continuous with constant $C_2 = 2C_1 +
R^{-1}|h(0)|$ on $D_2 = \{R \le |x| \le 2R\}$. Let $f(x) = (2R - |x|)/R$ and $g(x) =
h(Rx/|x|)$. If $x, y \in D_2$ then
$$|f(x) - f(y)| = \frac{\big||x| - |y|\big|}{R} \le \frac{|x - y|}{R}$$
Since $h$ is Lipschitz continuous on $D_1 = \{|x| \le R\}$, and $Rx/|x|$ is the projection
of $x$ onto $D_1$ we have
$$|g(x) - g(y)| = |h(Rx/|x|) - h(Ry/|y|)| \le C_1\left|\frac{Rx}{|x|} - \frac{Ry}{|y|}\right| \le C_1|x - y|$$
Combining the last two results, introducing $\|k\|_{\infty,i} = \sup\{|k(x)| : x \in D_i\}$, and
using the triangle inequality:
$$|h(x) - h(y)| \le \|f\|_{\infty,2}\,|g(x) - g(y)| + |f(x) - f(y)|\,\|g\|_{\infty,2}$$
$$\le 1 \cdot C_1|x - y| + \frac{|x - y|}{R}\,\|h\|_{\infty,1}$$
$$\le (2C_1 + |h(0)|/R)\,|x - y| = C_2|x - y|$$
since for $x \in D_1$, $|h(x)| \le |h(0)| + C_1 R$.
To extend the last result to Lipschitz continuity on $\mathbf{R}^d$ we begin by
observing that if $x \in D_1$, $y \in D_2$, and $z$ is the point of $\partial D_1$ on the line segment
between $x$ and $y$ then
$$|h(x) - h(y)| \le |h(x) - h(z)| + |h(z) - h(y)|$$
$$\le C_1|x - z| + C_2|z - y| \le C_2|x - y|$$
since $C_1 \le C_2$ and $|x - z| + |z - y| = |x - y|$. This shows that $h$ is Lipschitz
continuous with constant $C_2$ on $|x| \le 2R$. Repeating the last argument taking
$|x| \le 2R$ and $|y| > 2R$ completes the proof.
3.2. (a) Differentiating we have
$$\kappa(x) = -Cx\log x$$
$$\kappa'(x) = -C\log x - C > 0 \quad\text{if } 0 < x < e^{-1}$$
$$\kappa''(x) = -C/x < 0 \quad\text{if } x > 0$$
so $\kappa$ is strictly increasing and concave on $[0, e^{-2}]$. Since $\kappa(e^{-2}) = 2Ce^{-2}$ and
$\kappa'(e^{-2}) = C$, $\kappa$ is strictly increasing and concave on $[0,\infty)$. When $\epsilon < e^{-2}$
$$\int_0^\epsilon \frac{dx}{-Cx\log x} = -C^{-1}\log(-\log x)\Big|_0^\epsilon = \infty$$
(b) If $g(t) = \exp(-1/t^p)$ with $p > 0$ then
$$g'(t) = \frac{p}{t^{p+1}}\exp(-1/t^p) = p\,g(t)\{\log(1/g(t))\}^{(p+1)/p}$$
Xt = Xt - f /3(XS) + b(X3) ds
Jo
= Xt- [ b{X,)ds
Jo
330 Solutions to Exercises
Since we are interested in adding drift —6 we let c = a_16 as before, but let
Yt = - I c(X3) ■ dXs
Jo
= -Yt+ f c(X,)-b(X,)ds
Jo
= -Yt+ f ba-H(Xt)da:
Jo
Since Yt and Yt differ by a process of bounded variation we have
(Y)t = (Y)t= J ba-H{X,)ds
Jo
The change of measure in this case is given by
d1=exp(Y1--(Y>1
= exp (-Yt + |<Y)<) = l/at
6.1. It suffices to show that $P_0$ a.s.
$$\limsup_{n\to\infty} \int_0^{T_1} |B_s|^{-1} 1_{(|B_s| \le 2^{-n})}\,ds > 0$$
To prove the last result, fix $n$, let $R_0 = 0$, $S_m = \inf\{t > R_m : |B_t| = 2^{-n}\}$, $R_{m+1} =
\inf\{t > S_m : B_t = 0\}$ for $m \ge 0$, and $N = \sup\{m : R_m < T_1\}$. Now
$$\int_0^{T_1} |B_s|^{-1} 1_{(|B_s| \le 2^{-n})}\,ds \ge 2^n \sum_{m=0}^{N} (S_m - R_m)$$
where the number of terms, $N + 1$, has a geometric distribution with $E(N + 1) =
2^n$, and the $S_m - R_m$ are i.i.d. with $E(S_m - R_m) = 2^{-2n}$ and $E(S_m - R_m)^2 =
(5/3)2^{-4n}$. Using the formula for the variance of a random sum or noticing that
$P(N \ge 2^n) \to e^{-1}$ as $n \to \infty$ one concludes easily that $\limsup > C > 0$ a.s.
Chapter 6
3.1. (i) If $b(z) \le 0$ then
$$\exp\Big(-\int_0^y 2b(z)/a(z)\,dz\Big) \ge 1$$
so $\varphi(x) \ge x$, $\varphi(\infty) = \infty$ and recurrence follows from (3.3).
(ii) If $b(z)/a(z) \ge \epsilon$ then
$$\exp\Big(-\int_0^y 2b(z)/a(z)\,dz\Big) \le e^{-2\epsilon y}$$
so $\varphi(\infty) < \infty$ and transience follows from (3.3).
3.2. The condition is $\int_0^1 -2b(y)/a(y)\,dy = 0$. Let $I = \int_0^1 -2b(y)/a(y)\,dy$.
$\varphi'(n + y) = e^{nI}\varphi'(y)$ so if $I > 0$ then $\varphi(-\infty) > -\infty$ and if $I < 0$ then $\varphi(\infty) < \infty$.
If $I = 0$ then $\varphi'(y)$ is periodic so $0 < c \le \varphi'(y) \le C < \infty$, and it follows that
$\varphi(\infty) = \infty$ and $\varphi(-\infty) = -\infty$.
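5.1. (a) If $\beta = 0$ then $\varphi(x) = x$ and $m(x) = 1/\sigma^2 x$ so
$$\int_0^1 m(x)(\varphi(x) - \varphi(0))\,dx = \int_0^1 \sigma^{-2}\,dx < \infty$$
$M(0) = -\infty$ so $J = \infty$ and it follows that 0 is absorbing. To deal with $\infty$ note
that $\varphi(\infty) = M(\infty) = \infty$ so $I = J = \infty$.
(b) When $\beta < 0$, $\varphi'(x) = e^{2|\beta|x/\sigma^2}$, so
$$\varphi(x) = \frac{\sigma^2}{2|\beta|}\big(e^{2|\beta|x/\sigma^2} - 1\big) \qquad m(x) = e^{-2|\beta|x/\sigma^2}/\sigma^2 x$$
To evaluate $I$ for the boundary at 0, we note
$$\int_0^1 m(x)(\varphi(x) - \varphi(0))\,dx = \int_0^1 \frac{1 - e^{-2|\beta|x/\sigma^2}}{2|\beta|x}\,dx < \infty$$
since the integrand converges to $1/\sigma^2$ as $x \to 0$. Again $M(0) = -\infty$ so $J = \infty$
and it follows that 0 is absorbing. To deal with $\infty$ we note that $\varphi(\infty) = \infty$ so
$I = \infty$. To evaluate $J$ we begin by noting that
$$\int_x^\infty e^{-2|\beta|y/\sigma^2}\,dy = \frac{\sigma^2}{2|\beta|}\,e^{-2|\beta|x/\sigma^2}$$
Since most of the integral comes from $x \le y \le x + \sqrt{x}$ it follows easily that
$$M(\infty) - M(x) \sim \frac{1}{2|\beta|x}\,e^{-2|\beta|x/\sigma^2}$$
Since $\varphi'(x) = e^{2|\beta|x/\sigma^2}$ it follows that $J = \infty$ and $\infty$ is a natural boundary.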
5.2. The computations in Example 5.3 show that (a) $I < \infty$ when $\gamma < 1$ and
(b) for $\gamma \le -1$ we have $M(0) = -\infty$ and hence $J = \infty$. Consulting the table
of definitions we see that 0 is an absorbing boundary for $\gamma \le -1$.
5.3. (a) H'f < To < oo.
(b) (2.9) in Chapter 3 implies
EiHs = f 2yy-2S dy < oo
JO
since 5 < 1 implies 1 — 25 > —1.
(c) If 5 > 1 then Hg > Hi so it suffices to prove the result when 5 = 1. To do
this, we note that if 50 = x then x~iB3Xi> is a Brownian motion starting at 1,
so the distribution of
' B~2 ds under Px
o
does not depend on x. Letting Sx = (Tx/2 A T2x) ° ®tx and using the strong
Markov property it follows that the distribution of
rs*
Ix= B~2 ds under Px
does not depend on x. Pick 6 > 0 so that P\(IX > 5) > 0. The /2-» are
independent so the second Borel-Cantelli lemma implies that Pi(I2-^ > & i.o.) = 1
but this is inconsistent with P\(H\ < 00) > 0.
5.4. ip(x) = x so <p(oo) = 00 and I = 00. When 5 < 1/2, M(oo) = 00 and
hence J = 00. When 5 > 1/2
z1"2*-! , f < 00 if 5 > 1
25-1 2\=oo if5<l
Combining the results for I and J and consulting the table we have the indicated
result.
5.5. When a = 0
<p'(x) = exp(j^-^-dz)=Cy-2^
'-r
Here and in what follows C is a constant which will change from line to line. If
/? > 1/2 then <p{0) = -oo and I = oo. If /? < 1/2 then <p(z) - <p(Q) = Czl~2P
and
f'ivMy) y -p-y{l-y)
as y -* 0. From this we see that I < oo if /? < 1/2. When /? = 0, M(0) = -oo
so J = oo. When /3 > 0, M(z) - M(0) ~ Cz2^ asz-^0soJ<ooif/?>0.
Comparing the possibilities for I and J gives the desired result.
Chapter 7
5.1. Letting $b \to \infty$ in (4.7) of Chapter 6 we have
£xT(li0o) = 2(p(a:) - p(l)) J m{z) dz + 2J (<p(z) - <p(l)) m(z) dz
The result now follows from m{z) = l/<p'(z)a(z) = z~2c.
5.2. We apply (5.3) to v(x) = x1+s - 1 and note that
Lu= (11^. £^ + 6(.)(1 + 5)^<-l
2 e e —
since b(x) < -e/xs, (1 + 5)/2 < 1, and a;4"1 < 1.
Chapter 8
3.1. Suppose $r_n \not\to r$. Then there is an $\epsilon > 0$ and a subsequence $r_{n_k}$ so that
$|r_{n_k} - r| > \epsilon$ for all $k$. However it is impossible for a subsequence of $r_{n_k}$ to
converge to $r$, so we have a contradiction which proves $r_n \to r$.
6.1. If $d(g_n, h) \to 0$ then $h$ can only take the values 0 and 1. In Example 6.2 we
showed that $h$ cannot be $\equiv 0$ so it must take the value 1 at some point $a$. Since
$h$ is right continuous at $a$ there must be some point $b \ne 1/2$ where $h(b) = 1$ but
in this case it is impossible for $d(g_n, h) \to 0$.
References
M. Aizenman and B. Simon (1982) Brownian motion and a Harnack inequality
for Schrodinger operators. Comm. Pure Appl. Math. 35, 209-273
P. Billingsley (1968) Convergence of Probability Measures. John Wiley and Sons,
New York
R.M. Blumenthal and R.K. Getoor (1968) Markov Processes and their Potential
Theory. Academic Press, New York
D. Burkholder (1973) Distribution function inequalities for martingales. Ann.
Probab. 1, 19-42
D. Burkholder, R. Gundy, and M. Silverstein (1971) A maximal function
characterization of the class $H^p$. Trans. A.M.S. 157, 137-153
K.L. Chung and R.J. Williams (1990) Introduction to Stochastic Integration.
Second edition. Birkhauser, Boston
K.L. Chung and Z. Zhao (1995) From Brownian Motion to Schrodinger's
Equation. Springer, New York
Z. Ciesielski and S.J. Taylor (1962) First passage times and sojourn times for
Brownian motion in space and exact Hausdorff measure. Trans. A.M.S.
103, 434-450
E. Cinlar, J. Jacod, P. Protter, and M. Sharpe (1980) Semimartingales and
Markov processes. Z. fur. Wahr. 54, 161-220
B. Davis (1983) On Brownian slow points. Z. fur Wahr. 64, 359-367
C. Dellacherie and P.A. Meyer (1978) Probabilities and Potentials. North
Holland, Amsterdam
C. Dellacherie and P.A. Meyer (1982) Probabilities and Potentials B. Theory of
Martingales. North Holland, Amsterdam
C. Doleans-Dade (1970) Quelques applications de la formule de changement de
variables pour les semimartingales. Z. fur Wahr. 16, 181-194
R. Durrett (1995) Probability: Theory and Examples. Second edition. Duxbury
Press, Belmont, CA
A. Dvoretsky and P. Erdos (1951) Some problems on random walk in space.
Pages 353-367 in Proc. 2nd Berkeley Symp., U. of California Press
E.B. Dynkin (1965) Markov Processes. Springer, New York
P. Echeverria (1982) A criterion for invariant measures of Markov processes. Z.
fur Wahr. 61, 1-16
S. Ethier and T. Kurtz (1986) Markov Processes: Characterization and
Convergence. John Wiley and Sons, New York
W. Feller (1951) Diffusion processes in genetics. Pages 227-246 in Proc. 2nd
Berkeley Symp., U. of California Press
G.B. Folland (1976) An Introduction to Partial Differential Equations.
Princeton U. Press, Princeton, NJ
D. Freedman (1970) Brownian Motion and Diffusion. Holden-Day, San
Francisco, CA
A. Friedman (1964) Partial Differential Equations of Parabolic Type. Prentice-
Hall, Englewood Cliffs, NJ
A. Friedman (1975) Stochastic Differential Equations and Applications, Volume
1. Academic Press, New York
R.K. Getoor and M. Sharpe (1979) Excursions of Brownian motion and Bessel
processes. Z. fur. Wahr. 47, 83-106
D. Gilbarg and J. Serrin (1956) On isolated singularities of second order elliptic
differential equations. J. d'Analyse Math. 4, 309-340
D. Gilbarg and N.S. Trudinger (1977) Elliptic Partial Differential Equations of
Second Order. Springer, New York
E. Hewitt and K. Stromberg (1969) Real and Abstract Analysis. Springer, New
York
N. Ikeda and S. Watanabe (1981) Stochastic Differential Equations and
Diffusion Processes. North Holland, Amsterdam
J. Jacod and A.N. Shiryaev (1987) Limit Theorems for Stochastic Processes.
Springer, New York
P. Jagers (1975) Branching Processes with Biological Applications. John Wiley
and Sons, New York
F. John (1982) Partial Differential Equations. Fourth edition. Springer-Verlag,
New York
M. Kac (1951) On some connections between probability theory and differential
and integral equations. Pages 189-215 in Proc. 2nd Berkeley Symposium,
U. of California Press
I. Karatzas and S. Shreve (1991) Brownian Motion and Stochastic Calculus.
Second edition. Springer, New York
S. Karlin and H. Taylor (1981) A Second Course in Stochastic Processes.
Academic Press, New York
T. Kato (1973) Schrodinger operators with singular potentials. Israel J. Math.
13, 135-148
R.Z. Khasminskii (1960) Ergodic properties of recurrent diffusion processes and
stabilization of the solution to the Cauchy problem for parabolic equations.
Theor. Prob. Appl. 5, 179-196
F. Knight (1981) Essentials of Brownian Motion and Diffusion. American Math.
Society, Providence, RI
H.P. McKean (1969) Stochastic Integrals. Academic Press, New York
P.A. Meyer (1976) Un cours sur les intégrales stochastiques. Pages 246-400 in
Lecture Notes in Math. 511, Springer, New York
N. Meyers and J. Serrin (1960) The exterior Dirichlet problem for second order
elliptic partial differential equations. J. Math. Mech. 9, 513-538
B. Øksendal (1992) Stochastic Differential Equations. Third edition. Springer,
New York
S. Port and C. Stone (1978) Brownian Motion and Classical Potential Theory.
Academic Press, New York
P. Protter (1990) Stochastic Integration and Differential Equations. Springer,
New York
D. Revuz and M. Yor (1991) Continuous Martingales and Brownian Motion.
Springer, New York
L.C.G. Rogers and D. Williams (1987) Diffusions, Markov Processes, and
Martingales. Volume 2: Itô Calculus. John Wiley and Sons, New York
H. Royden (1988) Real Analysis. Third edition. MacMillan, New York
B. Simon (1982) Schrodinger semigroups. Bulletin A.M.S. 7, 447-526
D.W. Stroock and S.R.S. Varadhan (1979) Multidimensional Diffusion
Processes. Springer, New York
H. Tanaka (1963) Note on continuous additive functionals of one-dimensional
Brownian motion. Z. fur Wahr. 1, 251-257
J. Walsh (1978) A diffusion with a discontinuous local time. Astérisque 52-53,
37-45
T. Yamada and S. Watanabe (1971) On the uniqueness of solutions to stochastic
differential equations. J. Math. Kyoto 11, I, 155-167; II, 553-563
Index
adapted 36
arcsine law 173, 291
Arzela-Ascoli theorem 284
associative law 76
averaging property 148
basic predictable process 52
Bessel process 179, 232
Bichteler-Dellacherie theorem 71
Blumenthal's 0-1 law 14
boundary behavior of diffusions 231
Brownian motion
continuity of paths 4
definition 1
Holder continuity 6
Levy's characterization of 111
nondifferentiability 6
quadratic variation 7
writes your name 15
zeros of 23
Burkholder Davis Gundy theorem 116
C, the space 4, 282
Cauchy process 29
change of measure in SDE 202
change of time in SDE 207
Ciesielski-Taylor theorem 171
cone condition 147
continuous mapping theorem 273
contraction semigroup 245
converging together lemma 273
covariance process 50
D, the space 293
differentiating under an integral 129
diffusion process 177
Dini function 239
Dirichlet problem 143
distributive laws 72
Doleans measure 56
Donsker's theorem 287
Dvoretsky-Erdos theorem 99
effective dimension 239
exit distributions
ball 164
half space 166
exponential Brownian motion 178
exponential martingale 108
Feller semigroup 246
Feller's branching diffusion 181, 231
Feller's test 214, 227
Feynman-Kac formula 137
filtration 8
finite dimensional sets 3, 10
Fubini's theorem 86
fundamental solution 251
generator of a semigroup 246
Girsanov's formula 91
Green's functions
Brownian motion 104
one dimensional diffusions 222
Gronwall's inequality 188
harmonic function 95
averaging property 148
Dirichlet problem 143
Harris chains 259
heat equation 126
inhomogeneous 130
hitting times 19, 26
infinitesimal drift 178
infinitesimal generator 246
infinitesimal variance 179
integration by parts 76
isometry property 56
Ito's formula 68
multivariate 77
Kalman-Bucy filter 180
Khasminskii's lemma 141
Khasminskii's test 237
Kolmogorov's continuity theorem 5
Kunita-Watanabe inequality 59
law of the iterated logarithm 115
Lebesgue's thorn 146
Levy's theorem 111
local martingale 37, 113
local time 83
continuity of 88
locally equivalent measures 90
Markov property 9
martingale problem 198
well posed 210
Meyer-Tanaka formula 84
modulus of continuity 283
monotone class theorem 12
natural scale 212
nonnegative definite matrices 198
occupation times
ball 167
half space 31
optional process 36
in discrete time 34
optional stopping theorem 39
Ornstein-Uhlenbeck process 180
pathwise uniqueness for SDE 184
π-λ theorem 11
Poisson's equation 151
potential kernel 102
predictable process 36
basic 52
in discrete time 34
simple 53
predictable σ-field 35
progressively measurable process 36
Prokhorov's theorems 276
punctured disc 146
quadratic variation 50
recurrence and transience
Brownian motion 17, 98
one dimensional diffusions 220
Harris chains 262
reflecting boundary 229
reflection principle 25
regular point 144
resolvent operator 249
relatively compact 276
right continuous filtration
scaling relation 2
Schrodinger equation 156
semigroup property 245
semimartingale 70
shift transformations 9
simple predictable process 53
skew Brownian motion 89
Skorokhod representation 274
Skorokhod topology on D 293
speed measure 224
stopping time 18
strong Markov property
for Brownian motion 21
for diffusions 201
strong solution to SDE 183
tail σ-field 17
tight sequence of measures 276
transience, see recurrence
transition density 257
transition kernels 250
uniqueness in distribution for SDE 197
variance process 42
weak convergence 271
Wright-Fisher diffusion 182, 234, 308
Yamada-Watanabe theorem 193